What Are AWS Lambda Extensions and How Will They Foster Serverless?

by Serkan Özal on October 8, 2020

Today, we are happy to announce a beta of the Thundra extension for AWS Lambda. This extension moves telemetry reporting off the invocation path, so the overhead that sending data adds to a Lambda invocation drops to zero. We are proud to let our customers use Thundra with zero overhead. Before diving into the details, let’s look at what the AWS Lambda Extensions API brings to Lambda and to serverless tooling in general.

Lambda extensions provide a way to hook into the Lambda execution environment as a companion process running outside the Lambda function runtime process, or even to configure the execution environment itself. For diagnostic tools like Thundra in particular, they make it possible to plug into the execution cycle before, during, and after an invocation, and even when the environment is being spun down. AWS Lambda extensions fall into two main categories based on how they interact with the runtime process, named straightforwardly: internal and external Lambda extensions.

In the rest of the blog post, we’ll explain the core concepts around the new feature and introduce you to Thundra’s Lambda extension that will alleviate any concern around overhead forever.

Internal Extensions

Internal extensions make it possible to configure the runtime environment and modify the startup options/arguments of the runtime process. Unlike external extensions, internal extensions run as part of the main runtime process. Internal extensions are supported in two ways: environment variables and wrapper scripts.

Environment Variables

The first way to configure an internal extension is through language-specific environment variables, which the runtime itself picks up and applies automatically. Let’s have a look at some of those:

- JAVA_TOOL_OPTIONS [Java8.al2 and Java 11]: The JAVA_TOOL_OPTIONS environment variable is used to start a Java application with custom command-line options in environments where the command line is not accessible. Typically, it lets you specify the initialization of tools, specifically the launching of native or Java programming language agents using the -agentlib or -javaagent options. Besides Java agents, it can also be used to tweak JVM startup options, for example to tune the JVM for fast startup or to size memory pools (eden space, survivor spaces, tenured generation, or metaspace) according to the application’s object allocation pattern. One possible use case is logging loaded classes with their locations to troubleshoot NoSuchMethodError exceptions, which can occur when multiple versions of the same library exist in the classpath and the first one found is not the version the application expects. By setting JAVA_TOOL_OPTIONS to -verbose:class, we can track the loaded classes and their locations, and if a class turns out to be loaded from an unexpected jar file, we have identified the problem. This is valuable because such classpath issues are often hard to reproduce locally.
- NODE_OPTIONS [Node.js 10.x+]: The NODE_OPTIONS environment variable can be thought of as the Node.js equivalent of JAVA_TOOL_OPTIONS and provides similar functionality. It allows adding custom command-line options to the Lambda runtime process on the Node.js runtime.
- DOTNET_STARTUP_HOOKS [.NET Core 3.1+]: With the DOTNET_STARTUP_HOOKS environment variable, a list of managed assemblies can be specified so that each of them is called, in the order provided, before the Main entry point. A typical use case is enabling agents automatically at startup without needing to activate them programmatically.
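As a rough illustration of how such a variable might be applied, here is a minimal Python sketch that assumes the boto3 SDK and valid AWS credentials are available. The helper names (merge_env, set_runtime_options) and the function name in the usage note are hypothetical. The update call replaces the whole variable map, so the sketch merges the new value into the existing variables first.

```python
from typing import Dict


def merge_env(existing: Dict[str, str], overrides: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the existing variables with the overrides applied."""
    merged = dict(existing)
    merged.update(overrides)
    return merged


def set_runtime_options(function_name: str, overrides: Dict[str, str]) -> None:
    """Apply the overrides to a function's environment (needs boto3 and credentials)."""
    import boto3  # imported lazily so merge_env stays usable without boto3

    client = boto3.client("lambda")
    current = client.get_function_configuration(FunctionName=function_name)
    existing = current.get("Environment", {}).get("Variables", {})
    # Environment replaces the whole variable map, so send the merged result.
    client.update_function_configuration(
        FunctionName=function_name,
        Environment={"Variables": merge_env(existing, overrides)},
    )


# Example value: log every loaded class and its source jar on a Java function.
JAVA_VERBOSE_CLASS = {"JAVA_TOOL_OPTIONS": "-verbose:class"}
```

Calling set_runtime_options("my-java-function", JAVA_VERBOSE_CLASS) would then enable class-loading logs for that (hypothetical) function on its next cold start.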

Wrapper Script

A wrapper script allows you to start the Lambda runtime process through your own script. The original startup arguments are passed to the wrapper script, so before starting the actual runtime process however you want, the script can

- add new arguments or environment variables
- modify or remove original arguments or environment variables

The wrapper script is set by specifying its path in the AWS_LAMBDA_EXEC_WRAPPER environment variable. The following Lambda runtimes support wrapper scripts:
- Node.js 10.x+
- Python 3.8
- Java8.al2 and Java 11
- .NET Core 3.1
- Ruby 2.7

For example, with a wrapper script we can enable verbose mode for module loading and initialization in the Python runtime, so we can track which modules are resolved and from which directory they are loaded. The following example script, wrapper_script.sh, achieves this:

#!/bin/bash
args=("$@")
args=("${args[0]}" "-v" "${args[@]:1}")
exec "${args[@]}"

After deploying the script above as a Lambda layer, it can be activated by setting the AWS_LAMBDA_EXEC_WRAPPER environment variable to /opt/wrapper_script.sh, which starts the Python runtime process with the additional verbose mode (-v) option.

You can even put wrapper_script.sh into your function bundle and reference it from the AWS Lambda task root directory (/var/task) by setting the AWS_LAMBDA_EXEC_WRAPPER environment variable to /var/task/wrapper_script.sh, so no layer deployment is required.
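To see what wrapper_script.sh does to the startup command without deploying anything, the same argument rewrite can be mirrored in a few lines of Python. This is only an offline sketch: the helper name add_interpreter_flag is hypothetical, and the paths are placeholders, since the real command line is chosen by the Lambda runtime.

```python
from typing import List


def add_interpreter_flag(argv: List[str], flag: str = "-v") -> List[str]:
    """Mirror wrapper_script.sh: keep argv[0], insert the flag, keep the rest."""
    if not argv:
        raise ValueError("expected at least the interpreter path")
    return [argv[0], flag] + argv[1:]


# Placeholder startup command; the real paths come from the Lambda runtime.
original = ["/path/to/python", "/path/to/bootstrap"]
rewritten = add_interpreter_flag(original)
print(rewritten)  # ['/path/to/python', '-v', '/path/to/bootstrap']
```

The flag lands directly after the interpreter path, which is where the interpreter expects its own options, while the remaining arguments pass through unchanged.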

External Extensions

External extensions make it possible to hook into the phases of the Lambda execution environment lifecycle: INIT, INVOKE, and SHUTDOWN.

The new Extensions API is HTTP-based, like the Runtime API, so an extension can register for lifecycle events externally as a standalone process. Extensions are then notified of those lifecycle events through the same HTTP-based API.
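The register-then-poll flow can be sketched with nothing but the Python standard library. This is a minimal illustration, not Thundra’s implementation: the function names are hypothetical, error handling is omitted, and it assumes the 2020-01-01 version of the Extensions API, with the API host read from the AWS_LAMBDA_RUNTIME_API environment variable that Lambda sets inside the execution environment.

```python
import json
import os
import urllib.request
from typing import Dict, List

API_VERSION = "2020-01-01"


def extension_api_url(path: str) -> str:
    """Build an Extensions API URL from the host that Lambda provides."""
    host = os.environ["AWS_LAMBDA_RUNTIME_API"]
    return f"http://{host}/{API_VERSION}/extension/{path}"


def build_register_request(name: str, events: List[str]) -> urllib.request.Request:
    """Prepare the registration call; the name should match the extension's file name."""
    body = json.dumps({"events": events}).encode("utf-8")
    return urllib.request.Request(
        extension_api_url("register"),
        data=body,
        headers={"Lambda-Extension-Name": name, "Content-Type": "application/json"},
        method="POST",
    )


def register(name: str, events: List[str]) -> str:
    """Register and return the identifier that later calls must send back."""
    with urllib.request.urlopen(build_register_request(name, events)) as response:
        return response.headers["Lambda-Extension-Identifier"]


def next_event(extension_id: str) -> Dict:
    """Block until Lambda delivers the next lifecycle event."""
    request = urllib.request.Request(
        extension_api_url("event/next"),
        headers={"Lambda-Extension-Identifier": extension_id},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def run(name: str) -> None:
    """Register for INVOKE and SHUTDOWN, then loop until shutdown."""
    extension_id = register(name, ["INVOKE", "SHUTDOWN"])
    while True:
        event = next_event(extension_id)
        if event["eventType"] == "SHUTDOWN":
            break  # last chance to flush state before the environment goes away
        # INVOKE: per-invocation work would go here.
```

Inside a real extension, the entry point would call run() with the extension’s own file name. Asking for the next event is also how an extension signals that it has finished handling the current one.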

Figure: Lambda Extension Flow

The Extensions API enables tooling providers like Thundra to integrate deeply with the Lambda environment from the outside, without requiring any source code, dependency, or configuration change.

External extensions run as a separate process from the runtime process, but in the same Lambda execution environment. They don’t need to be implemented in the same language as the Lambda function and can be written in any language. Because they run in the same execution environment as the function, they share its CPU, memory, IO throughput, and disk storage under the /tmp directory. Extensions also have direct access to the Lambda function’s environment variables and use the same AWS IAM role as the function.

External extensions are published as layers, so they can be added to Lambda functions like any regular layer and are extracted under the /opt directory like other layers. Extension bundles are expected to have an extensions/${extension-name} structure, so once extracted they are located at /opt/extensions/${extension-name}.

The other important point is that extensions can have a negative effect on your Lambda function’s performance, because they share the same execution environment. For example, if an extension consumes too much CPU, the runtime process may be starved of CPU and the function’s execution duration may increase. Additionally, if an extension is registered for the INVOKE event, Lambda waits for the extension to complete before returning the response of the invocation, so delays in the extension can delay the Lambda function execution.
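For the layer layout, a small Python sketch using only the standard library may help. It assumes the executable belongs under an extensions/ directory inside the layer archive (the directory Lambda scans under /opt/extensions) and that it must keep its executable permission bits; the helper name build_extension_layer and the extension name in the usage note are hypothetical.

```python
import io
import stat
import zipfile


def build_extension_layer(extension_name: str, executable: bytes) -> bytes:
    """Return the bytes of a layer zip holding extensions/<extension_name>."""
    entry = zipfile.ZipInfo(f"extensions/{extension_name}")
    entry.create_system = 3  # mark the entry as Unix so the mode bits are honored
    # Store rwxr-xr-x so the file is still executable after extraction.
    entry.external_attr = (stat.S_IFREG | 0o755) << 16
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(entry, executable)
    return buffer.getvalue()
```

Writing the returned bytes to a file such as my-extension-layer.zip gives an archive that can be published as a layer with whatever deployment tooling you already use.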

What Does the Thundra Lambda Extension Provide?

By default, the Thundra agent collects telemetry data during the invocation and sends it to the Thundra collector endpoint at the end of the invocation. (Async monitoring through Amazon CloudWatch Logs is also supported; see the doc.) To minimize network round-trip time, the agent communicates with the regional collector endpoint ${region}.collector.thundra.io. Naturally, this operation adds extra latency to the invocation duration: around 3-4 milliseconds on average (1-2 ms of collector API latency plus 1-2 ms of network delay).

Figure: Thundra Collector Latency

The Thundra Lambda extension is a companion process that provides asynchronous reporting of telemetry data (traces, metrics, logs) to the Thundra agents. To do that, the extension starts a local collector, and the Thundra agent sends the collected data to this local collector endpoint without any network delay. The extension then sends the buffered data to the real collector asynchronously, in parallel with the execution of the invocation, so the Thundra agent adds zero network delay. Any remaining buffered telemetry data is flushed when the Lambda container shuts down.
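The buffering idea can be illustrated with a small, self-contained Python sketch. This is not Thundra’s implementation: the class name LocalCollector and its methods are hypothetical, and the send_batch callable stands in for the network call to the remote collector. The point is only that the reporting call returns immediately, while a background worker forwards data and drains the rest on shutdown.

```python
import queue
import threading
from typing import Callable, List

_STOP = object()  # sentinel that tells the worker to flush and exit


class LocalCollector:
    """Hypothetical sketch of a buffering collector."""

    def __init__(self, send_batch: Callable[[List[dict]], None]) -> None:
        self._send_batch = send_batch  # stands in for the remote collector call
        self._buffer: queue.Queue = queue.Queue()
        self._worker = threading.Thread(target=self._forward, daemon=True)
        self._worker.start()

    def report(self, item: dict) -> None:
        """Invocation path: enqueue and return without waiting on the network."""
        self._buffer.put(item)

    def shutdown(self) -> None:
        """Forward whatever is still buffered, then stop the worker."""
        self._buffer.put(_STOP)
        self._worker.join()

    def _forward(self) -> None:
        stopping = False
        while not stopping:
            first = self._buffer.get()  # block until there is something to send
            if first is _STOP:
                break
            batch = [first]
            while True:  # pick up anything else that is already waiting
                try:
                    item = self._buffer.get_nowait()
                except queue.Empty:
                    break
                if item is _STOP:
                    stopping = True
                    break
                batch.append(item)
            self._send_batch(batch)
```

In this sketch, report() plays the role of the agent handing data to the local endpoint, and shutdown() corresponds to the flush that happens when the environment is being shut down.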

How to Set Up the Thundra Lambda Extension?

The Thundra Lambda extension is an extension to the Thundra Lambda agent running along with your application. It is supported on every runtime that Thundra has an agent for: Node.js, Python, Java, Go, and .NET. In order to use the extension, the Thundra agent must first be integrated into your Lambda application.

The Thundra Lambda extension is published as an AWS Lambda layer that includes the extension implementation, so it can be added with the following layer ARN:

arn:aws:lambda:${region}:269863060030:layer:thundra-lambda-extension:2

Then, the Thundra agent must be configured to use the local collector provided by the Thundra Lambda extension, instead of the remote collector, as its data transmission point. This is done through environment variables by setting THUNDRA_AGENT_REPORT_REST_LOCAL to true.

And that’s all. From then on, even the single-digit-millisecond telemetry reporting delay (network delay to the Thundra collector endpoint + Thundra collector API delay) will no longer be added to your Lambda invocation duration.
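Put together, the two settings might be assembled as in the following Python sketch. The helper name with_thundra_extension is hypothetical. The returned dictionary uses the Layers and Environment parameter names of the Lambda UpdateFunctionConfiguration call, and existing layers (such as the Thundra agent layer) and variables are kept, on the assumption that the update replaces both lists wholesale.

```python
from typing import Dict, List

# Layer ARN pattern and version as given in the text above.
THUNDRA_EXTENSION_LAYER = (
    "arn:aws:lambda:{region}:269863060030:layer:thundra-lambda-extension:{version}"
)


def with_thundra_extension(
    region: str,
    layers: List[str],
    variables: Dict[str, str],
    version: int = 2,
) -> Dict:
    """Return configuration values with the extension layer and local reporting enabled."""
    layer_arn = THUNDRA_EXTENSION_LAYER.format(region=region, version=version)
    new_layers = list(layers)  # keep existing layers, e.g. the Thundra agent layer
    if layer_arn not in new_layers:
        new_layers.append(layer_arn)
    new_variables = dict(variables)
    new_variables["THUNDRA_AGENT_REPORT_REST_LOCAL"] = "true"
    return {"Layers": new_layers, "Environment": {"Variables": new_variables}}
```

The result can be passed to whichever tool manages your function configuration; the region, current layers, and current variables are inputs you would read from your own deployment.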

Wrapping Up

Since the inception of AWS Lambda and the serverless paradigm, AWS has worked very hard to remove the barriers preventing the adoption of serverless. The Extensions API will help companies that complain about the limitations of serverless overcome those challenges. It will also help observability and security vendors like Thundra implement more effective solutions that make developers’ lives easier. We are proud to be part of the preview launch of Lambda extensions with our external extension, which makes zero-overhead telemetry data ingestion possible with ease. If you want to try Thundra’s extension, released in beta, and help us improve it, ping us at [email protected].