Is Serverless Ready for Mission-Critical Apps?


Read and watch part 1 and part 2 of our latest Serverless Show featuring Erica Windisch, CTO and Co-Founder, IOpipe.

In part 3, Hillel stated, “Let’s talk a little bit about serverless and mission-critical systems. I saw an article on MeriTalk about a project that eGlobalTech is doing with FEMA in the U.S. on moving them to AWS and then moving some of their applications to serverless. I thought that would be apropos of some of the things that we talked about before, and it led me to ask myself a question: ‘Is serverless ready for more mission-critical applications, disaster-recovery applications, or power-critical infrastructure, things like that?’ Are these platforms reliable enough? Are the tools out there to help us do these things, or do you think this is fine for mobile apps and gaming, but we should wait a few more years before we start deploying critical things on it?”

Erica said, “If you’re trying to build an application like that for, say, nuclear power or something, I think there are a lot of things you have to consider about your application design and architecture that are beyond just if it’s serverless or not. I don’t think that serverless itself is a limiting factor for really any kind of application, necessarily. It’s more about how you use it. If everything you’re doing is going through a Kinesis stream or maybe even let’s say you’re super paranoid and you want to have full control over it, you can put it on a ‘not serverless’ thing like Kafka, and then have that go into Lambda.

“You build in resiliency at multiple layers and have multiple layers of redundancy. I think if you do that appropriately, it could be safe. I think it definitely depends on, again, how you build your application more so than what the technology stack looks like underneath.”
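As a rough illustration of one of those layers, here is a minimal sketch of a Lambda handler fed by a Kinesis stream that tolerates redelivered records. It assumes JSON payloads, and the `process` function and the in-memory `_seen_sequence_numbers` set are hypothetical stand-ins; a real system would need a durable deduplication store, because Lambda containers are recycled.

```python
import base64
import json

# Hypothetical dedupe store for illustration only. It lives in memory, so it
# is lost whenever the container is recycled; a durable store would be needed.
_seen_sequence_numbers = set()


def process(payload):
    """Placeholder for the application logic (hypothetical)."""
    return payload


def handler(event, context=None):
    """Process a batch of Kinesis records, skipping ones already handled."""
    processed = 0
    for record in event.get("Records", []):
        kinesis = record["kinesis"]
        sequence_number = kinesis["sequenceNumber"]
        if sequence_number in _seen_sequence_numbers:
            continue  # record was redelivered after a retry; skip it
        # Kinesis delivers the record payload base64-encoded.
        payload = json.loads(base64.b64decode(kinesis["data"]))
        process(payload)
        _seen_sequence_numbers.add(sequence_number)
        processed += 1
    return {"processed": processed}
```

The stream provides one layer of durability, and making the handler safe to retry adds another.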

Hillel said, “I get how designing for resiliency is obviously critical and can be done on these platforms. I guess I would still avoid a self-driving car system that was using a Lambda function.”

Erica said, “I think trying to build a highly resilient application that is real-time is very difficult.”

Hillel stated, “Obviously, for us, from my perspective, we focus on security. I mean, certainly when you build something that’s more mission critical, your security concerns are elevated as well, and so you have to ask yourself, in that sense, ‘Is this the right platform?’ I actually am a firm believer of the idea that serverless applications tend to be more secure, not less secure than non-serverless or serverfull applications, so in that sense, I’m happy to hear about more mission-critical applications perhaps adopting this environment where I think security can be hard and better. We’ll see how it plays out. But if the next big meltdown is caused by Lambda, we’ll have talked about it here first.”

Favorite Tweets

Hillel selected a tweet of a screenshot from a talk given by recent Serverless Show guest Forrest Brazeal. Hillel explained, “When you look at it, you think, ‘I’m just paying for server costs and they’re lowered somehow from running all the time.’ But then when you factor in all the engineering hours and operations costs and infrastructure management and all that, you have all this heavy cost. I think it’s, perhaps, a slight exaggeration, but I think it’s a good point about how people often miss the point about where costs really are and where you can save costs by changing how you do things.”

The Pain of Cold Starts


Erica selected a tweet from James Ward and explained, “I picked this tweet because I felt that it was really related to our previous conversation about the SLAs and about building for resiliency. I think that we all want serverless to have zero cold starts, but we also recognize physics exists.

“I feel like this is less of a problem than it was before partially because of the services themselves getting better about cold starts and also developers becoming more familiar with building resilient systems where it’s not as much of an issue.

“I think this is going to continue. I feel that, at some point somewhere, the code that was running your function has to exist on a file system. All those functions are stored on S3, for instance. Let’s say they put Lambda compute nodes adjacent to S3 as part of the S3 service or deeply integrated with S3 service. Somehow suddenly those cold start costs become very low.

“You could do things like when you upload the function, it spins up a container and it snapshots it. Instead of the thing that is shipped and shared across the machines being, ‘Here’s a container file system,’ which we then have to bring up and cold start, what if instead it’s prewarmed? The service has already been started and launched, and now we’re just going to take this memory blob and basically execute it. That’s a thing that suddenly becomes very, very fast. I think there are architectures that could eliminate cold starts. They might not be easy and they might require even further changes to the way developers build their serverless applications, but I do think that it is achievable. The question is if and when we’ll ever see that actually happen.”

Hillel replied, “The way I read James’ tweet was that there’s kind of this contention of mindset where on the one hand we’re being told, ‘Don’t think about those things. You write your code and we’ll just magically sprinkle dust on it and make it work.’ But on the other hand, we’re told to understand, ‘Look, there are some scaling rules and there are some cold start penalties. You need to be aware of those things,’ because, hey, like you said, physics.

“Part of me wants to say, ‘Hey, you guys sold us the fairytale of the magical scaling dust. Deliver on that. Figure out, to your point earlier, the technology solution to eradicate these problems, or stop selling us the dream of “don’t think about it, don’t worry about it,” and start maybe saying, “Yeah, it scales better, but you still have to prewarm things and you still have to handle certain cases, etc.”’”
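The prewarming Hillel mentions is often done by pinging a function on a schedule so that a container stays resident. A minimal sketch of the handler side, assuming our own scheduled ping sends an event with a `warmup` key (a hypothetical marker, not part of any AWS event format):

```python
# Module scope runs once per container, so this flag marks a cold start.
_COLD = True


def handler(event, context=None):
    """Return early on keep-warm pings; otherwise do the real work."""
    global _COLD
    was_cold = _COLD
    _COLD = False

    # "warmup" is a hypothetical key set by our own scheduled ping.
    if event.get("warmup"):
        return {"warmed": True, "cold_start": was_cold}

    # Placeholder for the real request handling.
    return {"statusCode": 200, "body": "hello", "cold_start": was_cold}
```

This only keeps one container warm per ping, and it is exactly the kind of extra plumbing the “don’t think about it” pitch leaves out.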

Erica said, “This is a thing I’ve run into: there are apps which I think are really great fits for serverless, where 3000 milliseconds from request to response isn’t fast enough. Lambda has proven that the cold start cost can be over that, but the typical response is definitely much, much smaller than that.

“For example, Slackbots are an amazing application of using Lambda because they’re very infrequent. A webhook handler might not get called more than a dozen times a day. It maybe doesn’t get called more than once a week. That’s a case where it’s so infrequently used it’s always going to cold start. But what happens is, because it has to cold start, the first time you request that slash command handler on Slack, it fails. Yes, you can go deploy to a server. But I think it’s one of these things where Lambda scales really well to zero, but also doesn’t scale really well to zero because of the cold starts.

“For applications where you have a decent baseline and you have lots of bursts, Lambda is really, really great at that, but if you have applications where the baseline is literally zero and you just have periodic bursts, that’s where I think you run into more of these edge cases, things like the Slackbots that very infrequently get used, but are also very important because then they have to warm up that very first time. The Slack API says, ‘You have to respond within three seconds,’ and there are some workarounds for that within the Slack API. You have to architect around it and it makes your architecture more complicated.”
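One such workaround is to acknowledge the slash command immediately and send the real answer later to the `response_url` that Slack includes in the request. The sketch below assumes a Lambda proxy-style response; `enqueue` is a hypothetical callback (for example, something that puts the job on a queue for a second function), and `post` is injectable so the example can run without a network.

```python
import json
import urllib.request
from urllib.parse import parse_qs


def ack_slash_command(body, enqueue):
    """Acknowledge right away and hand the slow work to `enqueue` (hypothetical)."""
    form = parse_qs(body)  # Slack sends slash commands as form-encoded data
    job = {
        "response_url": form["response_url"][0],
        "text": form.get("text", [""])[0],
    }
    enqueue(job)
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"response_type": "ephemeral", "text": "Working on it..."}),
    }


def send_delayed_response(job, result_text, post=None):
    """Post the final answer to the response_url once the work is done."""
    payload = json.dumps({"text": result_text}).encode("utf-8")
    if post is None:
        def post(url, data):
            request = urllib.request.Request(
                url, data=data, headers={"Content-Type": "application/json"}
            )
            with urllib.request.urlopen(request) as response:
                return response.status
    return post(job["response_url"], payload)
```

Note that the acknowledging function can still cold start and miss the 3000-millisecond window, so this splits the work across two functions without removing the problem Erica describes.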

Hillel said, “You’re just selling me on the idea that Amazon has got to get this figured out and make it all go away for me.”

Erica replied, “I guess so. Yeah, and you can architect around it. But sometimes I don’t want to have that burden fall on me to make my application more complicated because Amazon has the cold start time too high. That seems like an anti-pattern for me to have a much more complicated application. This is something that I’ve noticed with serverless applications overall, especially small applications that are infrequently used: a serverless design is going to cost you a lot less in runtime costs, but the design of the application can sometimes be more complicated because you’re building a distributed application, and you have to be concerned about things like cold start costs and everything.”

Sometimes Serverless is Too Complex

“It’s an interesting thing. There are gaps here, which means that building certain types of applications becomes very complex. Whereas instead, I could have gone and downloaded an AMI that’s completely preconfigured to do all of these things. Then it would just be a single server, and it would cost me a bit of money to run that, with a decent-sized baseline cost because it’s going to be an always-running EC2 server. At some point, you’re going to run out of the capacity to handle more clients and it’ll get slow.

“Sometimes even the complexity of the architecture of a serverless application actually causes me to say, ‘You know what? It’s going to be so much easier for this application. I don’t need scalability, so it’s going to be easier for me to put up an EC2 instance.’

“Serverless is super scalable and really costs a lot less when you need that scalability, but when you don’t need that scalability, the complexity is more than I’m willing to deal with.”
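The runtime side of that trade-off can be sketched as simple arithmetic. The function below is an illustration only: every parameter is a placeholder to be filled in from current pricing, and it deliberately ignores the engineering and complexity costs Erica is describing.

```python
def cost_per_request(avg_seconds, memory_gb, price_per_gb_second, price_per_request):
    """Runtime cost of one invocation under a pay-per-use pricing model."""
    return avg_seconds * memory_gb * price_per_gb_second + price_per_request


def break_even_requests(server_monthly_cost, avg_seconds, memory_gb,
                        price_per_gb_second, price_per_request):
    """Monthly request count at which pay-per-use matches an always-on server."""
    return server_monthly_cost / cost_per_request(
        avg_seconds, memory_gb, price_per_gb_second, price_per_request
    )
```

Below the break-even volume the function is cheaper to run; above it, the always-on server is, provided it has the capacity to keep up.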

Speaking Opps

Erica will be speaking at the Philadelphia Serverless Meetup, as well as the AWS Summits in New York and D.C., and possibly re:Invent. If you want her to speak at your event, reach out to her.



*** This is a Security Bloggers Network syndicated blog from Blog – Protego authored by Megan Bozman. Read the original post at: https://www.protego.io/is-serverless-ready-for-mission-critical-apps/