Serverless Chats

Episode #99: You already have a Multi-Cloud Strategy with Rob Sutter

May 3 '21

About Rob Sutter

Rob Sutter, a Principal Developer Advocate at Fauna, has woven application development into his entire career, from time in the U.S. Army and U.S. Government to stints with the Big Four and Amazon Web Services. He has started his own company – twice – once providing consulting services and most recently with WorkFone, a software as a service startup that provided virtual digital identities to government clients. Rob loves to build in public with cloud architectures, Node.js or Go, and all things serverless!

Twitter: @rts_rob
Personal email: rob@fauna.com
Personal website: robsutter.com
Fauna Homepage
Learn more about Fauna
Supported Languages and Frameworks
Try Fauna for Free
The Calvin Paper

This episode sponsored by CBT Nuggets: https://www.cbtnuggets.com/

Watch this video on YouTube: https://youtu.be/CUx1KMJCbvk

Transcript
Jeremy: Hi, everyone. I'm Jeremy Daly, and this is Serverless Chats. Today, I'm joined by Rob Sutter. Hey, Rob. Thanks for joining me.

Rob: Hey, Jeremy. Thanks for having me.

Jeremy: So you are now the or a Principal Developer Advocate at Fauna. So I'd love it if you could tell the listeners a little bit about your background and what Fauna is all about.

Rob: Right. So as you've said, I'm a DA at Fauna. I've been a serverless user in production since 2017, started the Serverless User Group in Dubai. So that's how I got into serverless in general. Previously, I was a DA on the Serverless Team at AWS, and I've been a SaaS startup co-founder, a government employee, an IT auditor, and like a lot of serverless people I find, I have a lot of Ops in my background, which is why I don't want to do it anymore. There's a lot of us that end up here that way, I think. Fauna is the data API for modern applications. So it's a database that you access as an API just as you would access Stripe for payments, Twilio for messaging. You just put your data into Fauna and access it that way. It's flexible, serverless. It's transactional. So it's a distributed database with asset transactions, and again, it's as simple as accessing any other API that you're already accessing as a developer so that you can simplify your code and ship faster.

Jeremy: Awesome. All right. Well, so I want to talk more about Fauna, but I want to talk about it actually in a broader ... I think in the broader ecosystem of what's happening in the cloud right now, and we hear this term "multicloud" all the time. By the way, I'm super excited to have you on. I wanted to have you on for the longest time, and then just schedules, and it's like ...

Rob: Yeah.

Jeremy: You know how it is, but anyways.

Rob: Thank you.

Jeremy: No. But seriously, I'm super excited because your tweets, and everything that you've written, and the things that you were doing at AWS and things like that I think just all reinforced this idea that we are living in this multicloud world, right, and that when people think of multicloud ... and this is something I try to be very clear on. Multicloud is not cloud-agnostic, right?

Rob: Right.

Jeremy: It's a very different thing, right? We're not talking about running the same work load in parallel on multiple service providers or whatever.

Rob: Right.

Jeremy: We're talking about this idea of using the best services that are available to you across the spectrum of providers, whether those are cloud service providers, whether those are SaaS companies, or even to some degree, some open-source projects that are out there that make up this strategy. So let's start there right from the beginning. Just give me your thoughts on this idea of what multicloud is.

Rob: Right. Well, it's sort of a dirty word today, and people like to rail against it. I think rightly so because it's that multicloud 1.0, the idea of, as you said, cloud-agnostic that "write once, run everywhere." All that is, is a race to the bottom, right? It's the lowest common denominator. It's, "What do I have available on every cloud service provider? And then let me write for that as a risk management strategy." That's a cost center when you want to put it in business terms.

Jeremy: Right.

Rob: Right? You're not generating any value there. You're managing risk by investing against that. In contrast, what you and I are talking about today is this idea of, "Let me use best in class everywhere," and that's a value generation strategy. This cloud service provider offers something that this team understands, and wants to build with, and creates value for the customer more quickly. So they're going to write on that cloud service provider. This team over here has different needs, different customers. Let them write over there. Quite frankly, a lot of this is already happening today at medium businesses and enterprises. It's just not called multicloud, right?

Jeremy: Right.

Rob: So it's this bottom-up approach that individual teams are consuming according to their needs to create the greatest value for customers, and that's what I like to see, and that's what I like to promote.

Jeremy: Yeah, yeah. I love that idea of bottom-up because I think that is absolutely true, and I don't think you've seen this as aggressively as you have in the last probably five years as more SaaS companies have become or SaaS has become a household name. I mean, probably 10 years ago, I think Salesforce was around, and some of these other things were around, right?

Rob: Yeah.

Jeremy: But they just weren't ... They weren't the household names that they are now. Now, you watch any sport, any professional sport, and you see advertisements for all these SaaS companies now because that seems to be the modern economy. But the idea of the bottom-up approach, that is something where you basically give a developer or maybe you don't give them, but the developer takes the liberties, I would say, to maybe try and experiment with something new without having to do years of research, go through procurement, get approval to use some platform. Even companies trying to move to AWS, or on to Azure, or something like that, they have to go through hoops. Basically, jump through hoops in order to get them there. So this idea of the bottom-up approach, the developers are the ones who are experimenting. Very low-risk experiments, by the way, with some of these other services. That approach, that seems like the right marketing approach for companies that are building these services, right?

Rob: Yeah. It seems like a powerful approach for them. Maybe not necessarily the only one, but it is a good one. I mean, there's a historical lesson here as well, right? I want to come back to your point about the developers after, but I think of this as shadow cloud. Right? You saw this with the early days of SaaS where people would go out and sign up for accounts for their business and use them. They weren't necessarily regulated, but we saw even before that with shadow IT, right, where people were bringing their own software in?

Jeremy: Right.

Rob: So for enterprises that are afraid of this that are heavily risk-focused or control-focused top-down, I would say don't be so afraid because there's an entire set of lessons you can learn about this as you bring it, as you come forward to it. Then, with the developers, I think it's even more powerful than the way you put it because a lot of times, it's not an experiment. I mean, you've seen the same things on Twitter I've seen, the great tech turnover of 2021, right? That's normal for tech. There's such a turnover that a lot of times, people are coming in already having the skills that they know will enhance delivery and add customer value more quickly. So it's not even an experiment. They already have the evidence, and they're able to get their team skilled up and building quickly. If you hire someone who's coming from an AWS shop, you hire someone who's coming from an Azure shop on to two different teams, they're likely going to evolve that excellence or that capability independently, and I don't necessarily think there's a reason to stop that as long as you have the right controls around it.

Jeremy: Right. I mean, and you mentioned controls, and I think that if I'm the CTO of some company or whatever, or I'm the CIO because we're dealing in a super enterprise-y world, that you have developers that are starting to use tools ... Maybe not Stripe, but maybe like a Twilio or maybe they're using, I don't know, ChaosSearch or something, something where data that is from within their corporate walls are going out somewhere or being stored somewhere else, like the security risk around that. I mean, there's something there though, right?

Rob: Yeah, there absolutely is. I think it's incumbent on the organizations to understand what's going on and adapt. I think it's also imcu,bent on the cloud service providers to understand those organizational concerns and evolve their product to address them, right?

Jeremy: Right.

Rob: This is one thing. My classic example of this is data exfiltration in a Lambda function. Some companies get ... I want to be able to inspect every packet that leaves, and they have that hard requirement for reasons, right?

Jeremy: Right.

Rob: You can't argue with them that they're right or wrong. They made that decision as a company. But then, they have to understand the impact of that is, "Okay. Well, every single Lambda function that you ever create is going to run inside of VPC or is going to run connected to a VPC." Now, you have the added complexity of managing a VPC, managing your firewall rules, your NACLs, your security groups. All of this stuff that ... Maybe you still have to do it. Maybe it really is a requirement. But if you examine your requirements from a business perspective and say, "Okay. There's another way we can address this with tightly-scoped IAM permissions that only allow me to read certain records or from certain tables, or access certain keys, or whatever." Then, we assume all that traffic goes out and that's okay. Then, you get to throw all of that complexity away and get back to delivering value more quickly. So they have to meet together, right? They have to meet.

Jeremy: Right.

Rob: This led to a lot of the work that AWS did with VPC networking for Lambda functions or removing the cold start because a lot ... Those enterprises weren't ready to let go of that requirement, and AWS can't tell them, "You're wrong." It's their business. It's their requirement. So AWS built that for them to reduce the cold start so that Lambda became a viable platform for them to build against.

Jeremy: Right, and so if you're a developer and you're looking at some solution because ... By the way, I mean, like you said, choosing the best of breed. There are just a lot of good services out there. There are thousands and thousands of SaaS companies, and I think ... I don't know if we made this point, but I certainly consider SaaS companies themselves to be part of the cloud. I would think you would probably agree with that, right?

Rob: Yeah.

Jeremy: It might as well be cloud providers themselves. Most of them run on top of the cloud providers anyways, but they found ...

Rob: But they don't have to, and that's interesting to me and another truth that you could be consuming services from somebody else's data center and that's still multicloud.

Jeremy: Right, right. So, anyway. So my thought here or I guess the question I have is if you're a developer and you're trying to choose something best in breed, right? Whatever that is. Let's say I'm trying to send text messages, and I just think Twilio is ... It's got all the features that I need. I want to go with Twilio. If you're a developer, what are the things that you need to look for in some of these companies that maybe don't have ... I mean, I would say Twilio does, but like don't necessarily have the trust or the years of experience or I guess years under their belts of providing these services, and keeping data secure, and things like that. What's the advice that you give to developers looking to choose something like that to be aware of?

Rob: To developers in particular I think is a different answer ...

Jeremy: Well, I mean, yeah. Well, answer it both ways then.

Rob: Yeah, because there's the builder and the buyer.

Jeremy: Right.

Rob: Right?

Jeremy: Right.

Rob: Whoever the buyer is, and a lot of times, that could just be the software development manager who's the buyer, and they still would approach it different ways. I think the developer is first going to be concerned with, "Does it solve my problem?" Right? "Overall, does it allow me to ship faster?" The next thing has to be stability. You have to expect that this company will be around, which means there is a certain level of evidence that you need of, "Okay. This company has been around and has serviced," and that's a bit of a chicken and an egg problem.

Jeremy: Right.

Rob: I think the developer is going to be a lot less concerned with that and more concerned with the immediacy of the problem that they're facing. The buyer, whether that's a manager, or CIO, or anywhere in between, is going to need to be concerned with other things according to their size, right? You even get the weird multicloud corner cases of, "Well, we're a major supplier of Walmart, and this tool only runs on a certain cloud service provider that they don't want us to use. So we're not going to use it." Again, that's a business decision, like would I build my software that way? No, but I'm not subject to that constraint. So that means nothing in that equation.

Jeremy: Right. So you mentioned a little bit earlier this idea of bringing people in from different organizations like somebody comes in and they can pick up where somebody else left off. One of the things that I've noticed quite a bit in some of the companies that I've worked with is that they like to build custom tools. They build custom tools to solve a job, right? That's great. But as soon as Fred or Sarah leave, right, then all of a sudden, it's like, "Well, who takes over this project?" That's one of the things where I mentioned ... I said "experiments," and I said "low-risk." I think something that's probably more low-risk than building your own thing is choosing an API or a service that solves your problem because then, there's likely someone else who knows that API or that service that can come in, and can replace it, and then can have that seamless transition.

And as you said, with all the turnover that's been happening lately, it's probably a good thing if you have some backup, and even if you don't necessarily have that person, if you have a custom system built in-house, there is no one that can support that. But if you have a custom ... or if you have a system you've used, you're interfacing with Twilio, or Stripe, or whatever it is, you can find a lot of developers who could come in even as consultants and continue to maintain and solve your problems for you.

Rob: Yeah, and it's not just those external providers. It's the internal tooling as well.

Jeremy: Right.

Rob: Right?

Jeremy: Right.

Rob: We're guilty of this in my company. We wrote a lot of stuff. Everybody is, right, like you like to do it?

Jeremy: Right.

Rob: It's a problem that you recognized. It feels good to solve it. It's a quick win, and it's almost always the wrong answer. But when you get into things like ... a lot of cases it doesn't matter what specific tool you use. 10 years ago, if you had chosen Puppet, or Chef, or Ansible, it wouldn't be as important which one as the fact that you chose one of those so that you could then go out and find someone who knew it. Today, of course, we've got Pulumi, Terraform, and all these other things that you could choose from, and it's just better than writing a bunch of Bash scripts to tile the stuff together. I believe Bash should more or less be banned in the cloud, but that's another ... That's my hot take for another time. Come at me on Twitter if you don't like that one.

Jeremy: So, yeah. So I think just bringing up this idea of tooling is important because the other thing that you potentially run into is with the variety of tooling that's out there, and you mentioned the original IAC. I guess they would... Right? We call those like Ansible and those sort of things, right?

Rob: Right.

Jeremy: All of those things, the Chefs and the Puppets. Those were great because you could have repeatable deployments and things like that. I mean, there's still work to be done there, but that was great because you made the choice to not building something yourself, right?

Rob: Right.

Jeremy: Something that somebody else could jump in on. But now with Terraform and with ... You mentioned Pulumi. We've got CloudFormation and even Microsoft has their own ... I think it's called ARM or something like that that is infrastructure as code. We've got the Serverless Framework. We've got SAM. We've got Begin. You've got ... or Architect, right? You've got all of these choices, and I think what happens too is that if teams don't necessarily ... If they don't rally around a tool, then you end up with a bunch of people using a bunch of different tools. Maybe not all these tools are going to be compatible, but I've seen really interesting mixes of people using Terraform, and CloudFormation, and SAM, and Serverless Framework, like binding it all together, and I think that just becomes ... I think that becomes a huge mess.

Rob: It does, and I get back to my favorite quote about complexity, right? "Simplicity before complexity is worthless. Simplicity beyond complexity is priceless." I find it hard to get to one tool that's like artificial, premature optimization, fake simplicity.

Jeremy: Yeah.

Rob: If you force yourself into one tool all the time, then you're going to find it doing what it wasn't built to do. A good example of this, you talked about Terraform and the Serverless Framework. My opinion, they weren't great together, but your Terraform comes for your persistent infrastructure, and your Serverless Framework comes in and consumes the outputs of those Terraform stacks, but then builds the constantly churning infrastructure pieces of it. Right? There's a blast radius issue there as well where you can't take down your database, or S3 bucket, or all of this from a bad deploy when all of that is done in Terraform either by your team, or by another team, or by another process. Right? So there's a certain irreducible complexity that we get to, but you don't want to have duplication of effort with multiple tools, right?

Jeremy: Right.

Rob: You don't want to use CloudFormation to manage your persistent data over here and Terraform to manage your persistent data over here because then you're not ... That's like that agnostic model where you're not benefiting from the excellent features in each. You're only using whatever is common between them.

Jeremy: Right, right, and I totally agree with you. I do like the idea of consuming. I mean, I have been using AWS for a very, very long time like 2007, 2008.

Rob: Yeah, same. Oh, yeah.

Jeremy: Right when EC2 instances were a thing. I guess 2008. But the biggest thing for me looking at using Terraform or something like that, I always felt like keeping it in, keeping it in the family. That's the wrong way to say it, but like using CloudFormation made a lot of sense, right, because I knew that CloudFormation ... or I thought I knew that CloudFormation would always support the services that needed to be built, and that was one of my big complaints about it. It was like you had this delay between ... They would release some service, and you had to either do it through the CLI or through the console. But then, CloudFormation support came months later. The problem that you have with some of that was then again other tools that were generating CloudFormation, like a Serverless Framework, that they would have to wait to get CloudFormation support before they could support it, and that would be another delay or they'd have to build something custom, which is not always the cleanest way to do it.

Rob: Right.

Jeremy: So anyways, I've always felt like the CloudFormation route was great if you could get to that CloudFormation, but things have happened with CDK. We didn't even mention CDK, but CDK, and Pulumi, and Terraform, and all of these other things, they've all provided these different ways to do things. But the thing that I always thought was funny was, and this is ... and maybe you have some insight into this if you can share it, but with SAM, for example, SAM wasn't extensible, right? You would just run into issues where you're like, "Oh, I can't do that with SAM." Whereas the Serverless Framework had this really great third-party plug-in system that allowed you to do some of these other things. Now, granted not all third-party plug-ins were super stable and were the best way to do something, right, because they'd either interact with APIs directly or whatever, but at least it gave you ... It unblocked you. Whereas I felt like with SAM and even CloudFormation when it didn't support something would block you.

Rob: Yeah. Yeah, and those are just two different implementation philosophies from two different companies at two different stages of their existence, right? Like AWS ... and let's separate the reality from the theory here. The theory is that a large company can exert control over release cycles and limit what it delivers, but deliver it with a bar of excellence. A small company can open things up, and it depends on its community members for contributions to solve problems. It's very much like this is the cathedral and the Bazaar of cloud tooling, right?

AWS has that CloudFormation architecture that they're working around with its own goals and approach, the one way to do it. Serverless Framework is, "Look, you need to ... You want to set up a stall here and insert IAM policies per function. Set up a stall. It will be great. Maybe people come and maybe they don't," and the system inherently sorts or bubbles up the value, right? So you see things like the Step Functions plug-in for Serverless Framework. It was one of the early ones that became very popular very quickly, whereas Step Functions supporting SAM trailed, but eventually came in. I think that team, by the way, deserves a lot of credit for really being focused on developers, but that's not the point of the difference between the two.

A small young company like Serverless Framework that is moving very quickly can't have that cathedral approach to it, and both are valid, right? They're both just different strategies and good for the marketplace, quite frankly. I have my preferred approach, which is not about AWS or SAM vs Serverless Framework. It's the extensibility of plug-in frameworks to me are a key component of tooling that adapts as quickly as the clouds change, and you see this. Like Terraform was the first place that I really learned about plug-ins, and their plug-in framework is fantastic, the way they do providers. Serverless Framework as well is another good example, but you can't know how developers are going to build with your services. You just can't.

You do customer development. You talk to them ahead of time. You get all this research. You talk to a thousand customers, and then you release it to 14 million customers, right? You're never going to guess, so let them. Let them build it, and if people ... They put the work in. People find there's value in it. Sometimes you can bring it in. Sometimes you leave it up to the community to maintain, but you just ... You have to be willing to accept that customers are going to use your product in different ways than you envisioned, and that's a good thing because it means customers are using your product.

Jeremy: Right, right. Yeah. So I mean, from your perspective though ... because let's talk about SAM for a minute because I was excited when SAM came out. I was thinking to myself. I'm like, "All right. A simplified tooling that is focused on serverless. Right? Like gives me all the things that I think I'm going to need." And then I did ... from a developer experience standpoint, and let's call out the elephant in the room. AWS and developer experience are not always the same. They don't always give you that developer experience that you would want. They give you tons of tools, right, but they don't always give you that ...

Rob: Funny enough, you can spell "developer experience" without AWS.

Jeremy: Right. So I mean, that's my ... I was disappointed when I started using SAM, and I immediately reverted back to the Serverless Framework. Not because I thought that it wasn't good or that it wasn't well-thought-out. Like you said, there was a level of excellence there, which certainly you cannot diminish, It just didn't do the things I needed it to do, and I'm just curious if that was a consistent feedback that you got as being someone on the dev advocate team there. Was that something that you felt as well?

Rob: I need to give two answers to this to be fair, to be honest. That was something that I felt as well. I never got as comfortable with SAM as I am with the Serverless Framework, but there's another side to this coin, and that's that enterprise uptake of SAM CLI has been really strong.

Jeremy: Right.

Rob: Enterprise it, it does what they need it to do, and it addresses their concerns, and they liked getting tooling from AWS. It just goes back to there being a place for both, right?

Jeremy: Right.

Rob: Enterprises are much more likely to build cathedrals. They want that top-down, "Okay, everybody. This is how you define something. In fact, we've created a module for you. Consume it here. Thou shalt not write new S3 to web server configuration in your SAM templates. Thou shalt consume this." That's not wrong, and the usage numbers don't lie with SAM. It's got a lot of fans, and it's got a lot of uptake, but that's an entirely different answer from how I feel about it. I think it also goes back to I'm not running an enterprise. I've never run an enterprise. The biggest I've got in terms of responsibility is at best a small company, right? So I think it's natural for me to feel that way when I try to use a tool that has such popularity amongst enterprise. Now, of course, you have the switch, right? You have enterprises using Serverless Framework, and you have small builders using SAM. But in general, I think the success there was with the enterprise, and it's a validation of their strategy.

Jeremy: Right, right. So let's talk about enterprises for a second because this is where we look at tools like the CDK and SAM, Serverless Framework, things like that. You look at all those different tools, and like you said, there's adoption across some of those. But at the end of the day, most of those tools are compiling down to a CloudFormation or they're compiling down to ... What's it called? The Azure Resource Manager Language or whatever the heck it is, right?

Rob: ARM templates.

Jeremy: ARM templates. What's the value now in CloudFormation and those sort of things that the final product that you get to ... I mean, certainly, it's so much easier to build those using these frameworks, but do we need CloudFormation in those things anymore? Do we need to know those? Does an individual developer need to be able to understand those, or can they just basically take a step back and say, "Look, CDK does it for me," or, "Pulumi does it for me. So why do I need to know what's baked into those templates?"

Rob: Yeah. So let's set Terraform aside and talk about it after because it's different. I think the choice of JSON and YAML as implementation languages for most of this tooling, most of this tooling is very ... It was a very effective choice because you don't necessarily have to know CloudFormation to look at a template and define what it's doing.

Jeremy: Right.

Rob: Right? You don't have to understand transforms. You don't have to understand parameter replacement and all of this stuff to look at the final transformed template in CloudFormation in the console and get a very quick reasoning about what's happening. That's good. Do I think there's value in learning to create multi-thousand-line CloudFormation templates by hand? I don't. It's the assembly language of the cloud, right?

Jeremy: Right.

Rob: It's there when you need it, and just like with procedural languages, you might want to look underneath at the instructions, how it unrolled certain loops, how it decided not to unroll others so that you can make changes at the next level. But I think that's rare, and that's optimization. In terms of getting things done and getting things shipped and delivered, to start, I wouldn't start with plain CloudFormation for any of these, especially of anything for any meaningful production size. That's not a criticism of CloudFormation. It's just like you said. It's all these other tooling is there to help us generate what we want consistently.

The other benefit of it is once you have that underlying lingua franca of the cloud, you can build visualization, and debugging, and monitoring, and like ... I mean, all of these other evaluatory forensic. "Evaluatory?" Is that a word? It's a word now. You heard it here first on this podcast. Like forensic, cloud forensic type tooling that lets you see what's going on because it is a universal language among all of the tools.

Jeremy: Yeah, and I want to get back to Terraform because I know you mentioned that, but I also want to be clear. I don't suggest you write CloudFormation. I think it is horribly verbose, but probably needs to be, right?

Rob: Yeah.

Jeremy: It probably needs to have that level of fidelity there or that just has all that descriptive information. Yeah. I would not suggest. I'm with you, don't suggest that people choose that as their way to go. I'm just wondering if it's one of those things where we don't need to be able to look at ones and zeroes anymore, and understand what they do, right?

Rob: Right.

Jeremy: We've got higher-level constructs that do that for us. I wouldn't quite put ... I get the assembly language comparison. I think that's a good comparison, but it's just that if you are an enterprise, right, is that ... Do you trust? Do you trust something like CDK to do everything you need it to do and make sure that it's covering all its bases because, again, you're writing layers of abstraction on top of a layer of abstraction that then abstracts it even more. So yeah, I'm just wondering. You had mentioned forensic tools. I think there's value there in being able to understand exactly what gets put into the cloud.

Rob: Yeah, and I'd agree with that. It takes 15 seconds to run into the limit of my knowledge on CDK, by the way. But that knowledge includes the fact that CDK Synth is there, which generates the CloudFormation template for you, and that's actually the thing, which is uploaded to the CloudFormation service and executed. You'd have to bring in somebody like Richard Boyd or someone to talk about the guard rails that are there around it. I know they exist. I don't know what they are. It's wildly popular, and adoption is through the roof. So again, whether I philosophically think it's a good idea or not is irrelevant. Developers want it and want to build with it, right?

Jeremy: Yeah.

Rob: It's a Bazaar-type tool where they give you some basic constructs, and you can write your own constructs around it and get whatever you need. But ultimately, that comes back to CloudFormation, which is then subject to all the controls that your organization puts around CloudFormation, so it is ... There's value there. It can't be denied.

Jeremy: Right. No, and the thing that I like about the CDK is the idea of being able to create those constructs because I think, especially from a ... What's the right word? Compliance standpoint or something like that, that you can write in these constructs that you say, "You need to use these constructs when you deploy a microservice or you deploy this," or whatever it is, and then you have those guard rails as you mentioned or whatever, but those ... All of those checkboxes are ticked because you can put that all into one construct. So I totally think that's great. All right. So let's talk about Terraform.

Rob: Yeah, so there's ... First, it's a completely different model, right?

Jeremy: Right.

Rob: This is an interesting discussion to have because it's API calls. You write your provider, whatever your infrastructure is, and anything that can be an API call can now be a Terraform declarative resource. So it's that mapping between declarative and imperative that I find fascinating. Well, also building the dependency graph for you. So it handles all of those aspects of it, which is a really powerful tool. The thing that they did so well ... Terraform is equally verbose as CloudFormation. You've got to configure all the same options. You get the same defaults, et cetera. It can be terribly verbose, but it's modular. Every Terraform file that you have in one directory is concatenated, and that is a huge distinction between how CloudFormation wants everything in one template, or well, you can refer to something in an S3 bucket, but that's not actually useful to me as a developer.

Jeremy: Right, right.

Rob: I can't mount an S3 bucket as a drive on my workstation and compose all of these independent files at once and do them that way. Sidebar here. Maybe I can. Maybe it supports that, and I haven't been able to discover it, right? Whereas Terraform by default, out of the box put everything in its own file according to function. It's very easy to look in your databases.tf and understand what's in there, look in your VPC.tf and understand what's in there, and not have to go through thousands of lines of code at once. Yes, we have find and replace. Yes, we have search, and you ... Anybody who's ever built any of this stuff knows that's not the same thing. It's not the same thing as being able to open a hundred lines in your text editor, and look at everything all at once, and gain an understanding of it, and then dive into the next level of detail a hundred lines at a time and understand that.

Jeremy: Right. But now, just a question here, without... because the API thing, I love that idea, and actually, Serverless Components used an API thing to do it and bypass CloudFormation. Actually, I believe Architect originally was using APIs, and then switched to CloudFormation, but the question I have about that is, if you don't have a CloudFormation template, if you don't have that assembly language of the web, and that's not sitting in your CloudFormation tool built into the dashboard, you don't get the drift protection, right, or the detection, and you don't get ... You don't have that resource map necessarily up there, right?

Rob: First, I don't think CloudFormation is the assembly language of the web. I think it's the assembly language of AWS.

Jeremy: I'm sorry. Right. Yes. Yeah. Yeah.

Rob: That leads into my point here, which is, "Okay. AWS gives you the CloudFormation dashboard, but what if you're now consuming things from Datadog, or from Fauna, or from other places that don't map this the same way?

Jeremy: Right.

Rob: Terraform actually does manage that. You can do a plan against your existing file, and it will go out and check the actual existing state of all of your resources, and compare them to what you've asked for declaratively, and show you what the changeset will be. If it's zero, there's no drift. If there is something, then there's either drift or you've added new functionality. Now, with Terraform Cloud, which I've only used at a basic level, I'm not sure how automatic that is or whether it provides that for you. If you're from HashiCorp and listening to this, I would love to learn more. Get in touch with me. Please tell me. But the tooling is there to do that, but it's there to do it across anything that can be treated as an API that has ... really just create and retrieve. You don't even necessarily need the update and delete functionality there.

Jeremy: Right, right. Yeah, and I certainly ... I am a fan of Terraform and all of these services that make it easier for you to build clouds more easily, but let's talk about APIs because you mentioned anything with an API. Well, everything has APIs now, right? I mean, essentially, we are living in a ... I mentioned SaaS, but now we're sort of ... This whole idea of the API economy. So just overall, what are your thoughts on this idea of basically anything you want to do is available now via an API?

Rob: It's not anything you want to do. It's everything you want to do short of fulfilling your customer's needs, short of solving your customer's specific problem is available as an API. That means that you get to choose best-in-class for everything, right? Your customer's need isn't, "I want to spend $25 on my credit card." Your customer's need is, "I need a book."

Jeremy: Right.

Rob: So it's not, "I want to store information about books in a database." It's, "I need a book." So everything and every step of the way there can now be consumed from an API. Again, it's like serverless in general, right? It allows you to focus purely or as close to purely as we can right now on generating customer value and addressing customer problems so that you can ship faster, and so that you have it as a competitive advantage. I can write a payment processing program. I know I can because I've done it back in 2004, and it was horrible, and it was awful. It wasn't a very good one, and it worked. It took your money, but this was like pre-PCIDSS.

If I had to comply with all of those things, why would I do that? I'm not a credit card payment processor. Stripe is, and they have specialists in all of the areas related to the problem of, "I need to take and process payments." That's the customer problem that they're solving. The specialization of labor that comes along with the API economy is fantastic. Ops never went away. All the ops people work at the cloud service providers now.

Jeremy: Right.

Rob: Right? Audit never went away. All the auditors have disappeared from view and gone into internal roles in payments companies. All of this continues to happen where the specialists are taking their deep, deep knowledge and bringing it inside companies that specialize in that domain.

Jeremy: Right, and I think the domain expertise value that you get from whatever it is, whether it's running a database company or whether it's running a payment company, the number of people that you would need to hire to have a level of specialization for what you're paying for two cents per transaction or whatever, $50 a month for some service, you couldn't even begin. The total cost of ownership on those things are ... It's not even a conversation you would want to have, but I also built a payment processing system, and I did have to pass PCI, which we did pass, but it was ...

Rob: Oh, good for you.

Jeremy: Let's put it this way. It was for a customer, and we lost money on that customer because we had to go through PCI compliance, but it was good. It was a good experience to have, and it's a good experience to have because now I know I never want to do it again.

Rob: Yeah, yeah. Back to my earlier point on Ops and serverless.

Jeremy: Right. Exactly.

Rob: These things are hard, right?

Jeremy: Right, right.

Rob: Sorry.

Jeremy: No, no. Go ahead.

Rob: Not to interrupt, but these are all really hard problems that people with graduate degrees and post-graduate research who have ... They're 30 years old when they start working on the problem are solving. There's a supply question there as well, right? There's just not enough people, and so you and I can like ... Well, I'm not going to project this on to you. I can stumble through an implementation and check off the requirements just like I worked in an optical microscopy lab in college, and I could create computer programs that modeled those concepts, but I was not an optical microscopist. I was not a PhD-level generating understandings of these things, and all of these, they're just so hard. Why would you do that when customer problems are equally hard and this set is solved?

Jeremy: Right, right.

Rob: This set of problems over here is solved, and you can't differentiate yourself by solving it better, and you're not likely to solve it better either. But even if you did, it wouldn't matter. This set of problems are completely unsolved. Why not just assemble the pieces from best-in-class so that you can attack those problems yourself?

Jeremy: Again, I think that makes a ton of sense. So speaking about expertise, let's talk about what you might have to pay say a database administrator. If you were to hire a database administrator to maintain all the databases for you and keep all that uptime, and maybe you have to hire six database administrators in order for them to ... Well, I'm thinking multi-region and all that kind of stuff. Maybe you have to hire a hundred, depending on what it is. I mean, I'm getting a little ahead of myself, but ... So if I can buy a service like Fauna, so tell me a little bit more about just how that works.

Rob: Right. Well, I mean, six database engineers in the US, you're over a million dollars a year easily, right?

Jeremy: Right.

Rob: I don't know what the exact number is, but when you consider benefits, and total cost, and all of that, it's a million dollars a year for six database engineers. Then, there are some very difficult problems in especially distributed databases and database scaling that Fauna solves. A number of other products or services solve some of them. I'm biased, of course, but I happen to think Fauna solves all of them in a way that no other product does, but you're looking ... You mentioned distributed transactions. Fauna is built atop the Calvin paper, which came out of Yale. It's a very brief, but dense academic research paper. It's a PHC research paper, and it talks about a model for distributed transactions and databases. It's a layer, a serialization layer, that sits atop your database.

So let's say you wanted to replicate something like Fauna. So not only do you need to get six database engineers who understand the underlying database, but you need to find engineers that understand this paper, understand the limitations of the theory in the paper, and how to overcome them in operations. In reality, what happens when you actually start running regions around the world, replicating transactions between those regions? Quite frankly, there's a level of sophistication there that most of the set of people who satisfy that criteria already work at Fauna. So there's not much of a supply there. Now, there are other database competitors that solve this problem in different ways, and most of the specialists already work at those companies as well, right? I'm not saying that they aren't equally competent database engineers. I'm just saying there's not a lot of them.

Jeremy: Right.

Rob: So if you're thing is to sell books at a certain scale like Amazon, then that makes sense to have them because you are a database creator as well. But if your thing is to sell books at some level below that, why would you compete for that talent rather than just consuming it?

Jeremy: Right. Yeah, and I would say unless you're a horse with a horn on your head, it's probably not worth maintaining your own database and things like that. So let's talk a little bit more, though, about that. I guess just this idea of maybe a shortage of people, like if you're ... You're right. There's a limited number of resources, right? I'm sure there's brilliant database engineers all around the world, and they have the experience where, right, they could come in and they could probably really help you maintain your database. Even if you could afford six of them and you wanted to do that, I think the problem is it's got to be the interestingness of the problem. I don't think "interestingness" is a word either, but like if I'm a database engineer, wouldn't I want to be working on something like Fauna that I could help millions and millions of people as opposed to helping some trucking company maintain their internal database or something like that?

Rob: Yeah, and I would hope so. I hope it's okay that I mention we're hiring. So come to Fauna.com and look at our roles database engineers.

Jeremy: You just read that Calvin paper first. Go ahead.

Rob: But read the Calvin paper first. I think it's only like 12 pages, and even just the first page is enough. I'm happy to talk about that at any length because I find it fascinating and it's public. It is an interesting problem and the ... It's the reification or the implementation of theory. It's bringing that theory to the real world and ... Okay. First off, the theory is brilliant. This is not to take away from it, but the theory is conceived inside someone's mind. They do some tests, they prove it, and there's a world of difference between that point, which is foundational, and deploying it to production where people are trusting their workloads on it around the world. You're actually replicating across multiple cloud providers, and you're actually replicating across multiple regions, and you're actually delivering on the promise of the paper.

What's described in the paper is not what we run at Fauna other than as a kernel, as a nugget, right, as the starting point or the first principle. That I think is wildly interesting for that class of talent like you talked about, the really world-class engineers who want to do something that can't be done anywhere else. I think one thing that Fauna did smartly early was be a remote-first company, which means that they can take advantage of those world-class engineers and their thirst for innovation regardless of wherever Fauna finds them. So that's a big deal, right? Would you rather work on a world-class or global problem or would you rather work on a local problem? Now, look, we need people working on local problems too. It's not to disparage that, but if this is your wheelhouse, if innovation is the thing that you want to do, if you want to be doing things with databases that nobody else is doing, this is where you want to be. So I think there's a strong argument there for coming to work in a place like Fauna.

Jeremy: Yeah, and I want to make sure I apologize to any database engineer working at a trucking company because I'm sure there are actually probably really interesting problems with logistics and things like that that they are solving. So maybe not the best example. Maybe. I don't know. I can't think of another example. I don't want to offend anybody who's chosen a more local problem because you're right. I mean, there are local problems that need to be solved, but I do think that there are people ... I mean, even someone like me. I want to work on a bigger problem. You know what I mean? I owned a web development company for 12 years, and I was solving other people's problems by building them a website or whatever, and it just got to a point where I'm like, "I'm not making enough of an impact here." You're not solving a big enough problem. You want to work on something more interesting.

Rob: Yeah. Humans crave challenge, right? Challenge is a necessary precondition for growth, and at least most of us, we want to grow. We want to be better at whatever it is we're doing or just however we think of ourselves next year that we aren't today, and you can't do that with challenge. If you build other people's websites for 12 years, eventually, you get to a point where maybe you're too good at it. Maybe that's great from a business perspective, but it's not so great from a personal fulfillment perspective.

Jeremy: Right.

Rob: If it's, "Oh, look, another brochure website. Okay. Here you go. Oh, you need a contact form?" Again, it's not to disparage this. It's the fact that if you do anything for 12 years, sometimes mastery is stasis. Not always.

Jeremy: Right, and I have nightmares of contact forms, of building contact forms, by the way, but...

Rob: It makes sense. Yeah. You know what you should do is just put all of those directly into Fauna and don't worry about it.

Jeremy: Easy enough. Easy enough.

Rob: Yeah, but it's not necessarily stasis, but I think about craftsmen and people who actually make things with their hands, physical builders, and I think a lot of that ... Like if you're making furniture, you're a cabinet maker. I think a lot of that is every time, it's just a little bit wrong, right? Not wrong, but just a little bit off from your optimum no matter how long you do it, and so everything has a chance to evolve. That's there with software to a certain extent, but the problem is never changing.

Jeremy: Right.

Rob: So, yeah. I can see both sides of it, but for me, I ... You can see it when I was on serverless four years ago and now that I'm on a serverless database now. I like to be out at the edge, pushing that edge out for everyone who's coming behind. It can be challenging because sometimes there's just no way forward, and sometimes everybody is not ready to come with you. In a lot of ways, being early is the same as being wrong.

Jeremy: Right. Well, I've been ...

Rob: Not an original statement, but ...

Jeremy: No, but I've been early on many things as well where like five years after we tried to do something, like then, all of a sudden, it was like this magical thing where everybody is doing it, but you mentioned the edge. That would be something ... or you said on the edge. I know you mean this way, but the edge in terms of the actual edge. That's going to be an interesting data problem to solve.

Rob: Oh, that's a fascinating data problem, especially for us at Fauna. Yeah, compute, and Andy Jassy, when he was at AWS, talked about how compute was bifurcating, right? It's either moving all the way out to the edge or it's moving all the way into the cloud, and that's true. But I think at Fauna, we take that a step further, right? The edge part is true, and a lot of the work that we've done recently, announcements with CloudFlare workers. We're ready for that. We believe in it, and we like pushing that out as close to the user as possible. One thing that we do uniquely is we have this concept of user-defined functions, and anybody who's written T-SQL back in the day, who wrote store procedures is going to be familiar with this, but it's ... You bring that business logic and that code to your data. Not near your data, to your data.

Jeremy: Right.

Rob: So you bring the compute not just to the cloud where it still needs to pass through top-of-rack and all of this. You bring it literally on to the same instance as your data where these functions execute against them there. So you get not just the database, but you get a compute layer in there, and this helps for things like filtering for things like the equivalent of joins, stuff that just ... If you've got to load gigabytes of data and move it somewhere, compute against it, reduce it to something, and then store that back, the speed of light still matters. Even if it's the speed of light across a couple switches, it still matters, and so there are some really interesting things that you can begin to do as you pull more and more of that logic into your data layer, and you also protect that logic from other components of your application.

So I like that because things like GraphQL that endpoints already speak and already understand, just send it over, and again, they don't care about the architectural, quite frankly, genius--I can say that because I didn't create it--the genius behind all of this stuff. They just care that, "Look, I send this request over and I get it back," and entire workflows, and complex processes, and everything are executing behind the scenes just so that the endpoints can send and retrieve what they need more effectively and more quickly. The edge is fascinating. The thing I regret the most about the edge is I have no hardware skills, right? So I can't make fun things to do fun things in my house. I have to buy them, but you can't do everything.

Jeremy: Yeah. Well, no. I think you make a good point though about bringing the compute to the data side, and other people have said there's no ... Ben Kehoe has been talking about this for a while too where like it just makes sense. Run the compute where the data is, and then send that data somewhere else, right, because there's more things that can be done with data after that initial bit of compute. But certainly, like you said, filtering it down or getting the bits that are relevant and moving a small amount of data as opposed to a large amount of data I think is hugely important.

Now, the other thing I just want to mention before I let you go or I want to talk about quickly is this idea of going back to the API economy aspect of things and buying versus building. If you think about what you've had to do at Fauna, and I know you're relatively recent there, but you know what they've done and the work that had to go in in order to build this distributed system. I mean, I think about most systems now, and I think like anything I'm going to build now, I got to think about scale, right?

I don't necessarily have to build to scale right away, especially if I'm doing an MVP or something like that. But if I was going to build a service that did something, I need to think about multi-region, and I need to think about failover, and I need to think about potentially providing it at the edge, and all these other things. So you come down to this thing, and I will just use the database example. But even if you were say using like MySQL, or Postgres, or something like that, that's going to scale. That's going to scale pretty well to get to a certain point, and then you're going to have to start sharding, right? When data gets hard, it's time to shard, right? You just have to start sharding everything.

Rob: Yeah.

Jeremy: Essentially, what you end up doing is rebuilding DynamoDB, or trying to rebuild Fauna, or something like that. So just thinking about that, anything you're building ... Maybe you have some advice for developers who ... I know we've talked about this a little bit, but I just go back to this idea of like, if you think about how complex some of these SaaS companies and these services that are being built out right now, why would you ever want to take that complexity on yourself?

Rob: Pride. Hubris. I mean, the correct answer is you wouldn't. You shouldn't.

Jeremy: People do.

Rob: Yeah, they do. I would beg and plead with them like, "Look, we did take a lot of that on. Fauna scales. You don't need to plan for sharding. You don't need to plan for global replication. All of these things are happening." I raise that as an example of understanding the customer's problem. The customer didn't want to think about, "Okay, past a thousand TPS, I need to create a new read replica. Past a million TPS, I need to have another region with active-active." The customer wanted to store some data and get that data, knowing that they had the ASA guarantees around it, right, and that's what the customer has.

So get that good understanding of what your customer really wants. If you can buy that, then you don't have a product yet. This is even out of software development and into product ideation at startups, right? If you can go ... Your customer's problem isn't they can't send text messages programmatically. They can do that through Twilio. They can do that through Amazon. They can do that through a number of different services, right? Your customer's problem is something else. So really get a good understanding of it. This is where knowing a little ... Like Joe Emison loves to rage against senior developers for knowing not quite enough. This is where knowing like, "Oh, yeah, Postgres. You can just shard it." "Just," the worst word in computer science, right?

Jeremy: Right.

Rob: "You can just shard it." Okay. Now, we're back to those database engineers that you talked about, and your customer doesn't want to shard a database. Your customer wants to store and retrieve data.

Jeremy: Right.

Rob: So any time that you can really focus in, and I guess I really got this one, this customer obsession beaten into me from my time at AWS. Really focus in on what the customer is asking you to do or what the customer needs even if they don't know how to express it, and build for that.

Jeremy: Right, right. Like the saying. I forgot who said it. Somebody from Harvard Business Review, but, "Your customers don't want a quarter-inch drill. They want a quarter-inch hole."

Rob: Right, right.

Jeremy: That's basically true. I mean, the complexity that goes behind the scenes are something that I think a vast majority of customers don't necessarily want, and you're right. If you focus on that product ideation thing, I think that's a big piece of that. All right. Well, anyway. So I have one more question for you just to ...

Rob: Please.

Jeremy: We've been talking for a while here. Hopefully, we haven't been boring people to death with our talk about APIs and stuff like that, but I would like to get a little bit academic here and go into that Calvin paper just a tiny bit because I think most people probably will not want to read it. Not because they don't want to, but because people are busy, right, and so they're listening to the podcast.

Rob: Yeah.

Jeremy: Just give us a quick summary of this because I think this is actually really fascinating and has benefits beyond just I think solving data problems.

Rob: Yeah. So what I would say first. I actually have this paper on my desk at all times. I would say read Section 1. It's one page, front and back. So if you're interested in it, you don't have to read the whole paper. Read that, and then listeners to this podcast will probably understand when I say this. Previously, for distributed databases and distributed transactions, you had what was called a two-phase commit. The first was you'd go out to all of your replicas, and you say, "Hey, I need lock." When everybody comes back, and acknowledges, and says, "Okay. You have the lock," then you do your transaction, and then you replicate it, and then you say, "Hey, everybody. I'm done. Release the lock." So it's a two-phase commit. If something went wrong, you rolled it all the way back and you said, "Hey, everybody. Forget it."

Calvin is event-sourcing for databases. So if I could distill the entire paper down into one concept, that's it. Right? Instead of saying, "Hey, everybody. Give me a lock, I'm going to do something," you say, "Hey, everybody. Here's what we're going to do." It's a deterministic application of the transaction so that you can ... You both create the lock and execute the transaction at the same time. So rather than having this outbound round trip, and then doing the thing in an outbound round trip, you have the outbound round trip, and you're done.

They all apply it independently, and then it gets into how you structure the guarantees around that, which again is very similar to event-sourcing in that you use snapshotting or checkpointing. "So, hey. At this point, we all agree. So we can forget all of our old transactions, and we roll forward from here." If somebody leaves the cluster, they come back in at that checkpoint. They reapply all of the events that occurred prior to that. Those events are transactions. They'd get up to working speed, and then off they go. The paper I think uses Paxos. That's an implementation detail, but the really interesting thing about it is you're not having a double round trip.

Jeremy: Yeah.

Rob: Again, I love the idea of event-sourcing. I think Amazon EventBridge is the best service that they've released in the past couple years.

Jeremy: Totally.

Rob: If you understand all of that and are already building serverless applications that way, well, that's what we're doing, just event-sourcing for database. That's it. That's it.

Jeremy: Just event-sourcing. It's easy. Simple. All right. All the words you never want to hear. Simple, easy, just. Right. Yeah. Perfect.

Rob: Yeah, but we do the hard work, so you don't have to. You don't care about all of that. You want to write your data somewhere, and you want to retrieve your data from an API, and that's what Fauna gives you.

Jeremy: Which I think is the main point here. So awesome. All right. Well, Rob, listen. This was great, and I'm super happy that I finally got you on the show. Congratulations for the new role at Fauna and on what's happening over there because it is pretty exciting.

Rob: Thank you.

Jeremy: I love companies that are innovating, right? It's not just another hosted database. You're actually building something here that is innovative, which is pretty amazing. So if people want to find out more about you, follow you on Twitter, or find out more about Fauna, how do they do that?

Rob: Right. Twitter, rts_rob. Probably the easiest way, go to my website, robsutter.com, and you will link to me from there. From there, of course, you'll get to fauna.com and all of our resources there. Always open to answer questions on Twitter. Yeah. Oh, rob@fauna.com. If you're old-school like me and you prefer the email, there you go.

Jeremy: All right. Awesome. Well, I will get all that into the show notes. Thanks again, Rob.

Rob: Thank you, Jeremy. Thanks for having me.

Episode source

Serverless Chats Follow

Episode #99: You already have a Multi-Cloud Strategy with Rob Sutter

Serverless Chats