AWS Lambda runs your code on an highly available and scalable compute infrastructure so that you can focus on what you want to build. Do you want to get the advantages of Lambda for workloads that are memory or computationally intensive? Wait no more!
Starting today, you can allocate up to 10 GB of memory to a Lambda function. This is more than a 3x increase compared to previous limits. Lambda allocates CPU and other resources linearly in proportion to the amount of memory configured. That means you can now have access to up to 6 vCPUs in each execution environment. In this way, your multithreaded and multiprocess applications run faster. Since Lambda charges are proportional to memory configured and function duration (GB-seconds), the additional costs for using more memory may be offset by lower duration. I have more on this in the example below.
With more memory and CPU power, and support for the AVX2 instruction set, new use cases — such as machine learning applications; batch and extract, transform, load (ETL) jobs; modelling; genomics; gaming; high-performance computing (HPC); and media processing — become easier to implement and scale with Lambda functions.
Let’s see how this works in practice!
Lambda Function Performance as Memory Increases
When I first wrote about the capability of mounting a shared Amazon Elastic File System (EFS) for Lambda functions, one of the examples I used was a function doing machine learning inference to classify images of birds. The function is using PyTorch to run the inference, applying a pre-trained machine learning model.
Now, I can execute the same function in the updated Lambda execution environment. Let’s see how increasing memory affects the duration of the function. Here are the results of using memory configurations between 1 and 10 GB. To get these numbers, I ran 20 invocations for each memory configuration. Then, I computed the average duration, discarding function initializations. To avoid possible outliers, I also excluded from the average the top and bottom 10% of reported durations. Based on the results, I estimated the charges I would have for 1 million invocations with each configuration.
As you can see, the function is able to use the additional CPU power that comes with more memory, decreasing the duration of the invocations. What is interesting is the impact of increasing memory on my costs.
Lambda charges are related to memory and duration, so if I increase memory and this is reducing duration by the same proportion, the overall charges are about the same. For example, looking at the graph above, when I configure 5 GB of memory, I have the same costs as when I have 1 GB of memory (about $61 for one million invocations), but the function is 5x faster. If I need lower latency, I can increase memory up to 10 GB, where the function is 7.6x faster and I pay a little more ($80 for one million invocations).
Depending on your code and business case, you can find out which memory configuration gives the optimal trade-off between cost and performance. To help you with that, my colleague and friend Alex Casalboni started the AWS Lambda Power Tuning project to help you optimize your Lambda functions in a data-driven way. This open source tool is really useful and has been improved by the support of many contributors. Give it a try!
In my tests, PyTorch is also using the optimizations of the Advanced Vector Extensions 2 (AVX2) instruction set, now available in the Lambda execution environment. With the AVX2 instruction set, the processor allows running a certain set of operations simultaneously. This is extremely beneficial for applications with operations that can run in parallel such as matrix multiplication. As a result, using AVX2 can improve performance by increasing CPU throughput per cycle. This typically helps compute intensive workloads such as machine learning inference, multimedia processing, scientific simulations, and financial modeling applications.
AWS Lambda support for larger functions is available in Africa (Cape Town), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), EU (Frankfurt), EU (Ireland), EU (London), EU (Milano), EU (Paris), EU (Stockholm), South America (Sao Paulo), US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon).
Here’s a snapshot of the new console experience. We replaced the slider with a field, and you can now configure memory in 1 MB increments (it was 64MB increments before). In this way, the console works similarly to the Lambda API that always accepted memory configurations with 1MB granularity.