Cold Start in Serverless: The Overcome Problem That Still Generates Prejudice

Until about four years ago, cold start was a genuine problem in serverless computing, especially for applications that couldn't tolerate extra latency during initialization.

Today, that shortcoming has been overcome, and cold start is practically insignificant. Yet it is still cited as a "serverless problem," mainly by people who have only read about the subject and don't use the technology daily.

In this article, we will introduce the concept of cold start in serverless services, explore its main causes, examine how cold start works in AWS Lambda, Microsoft Azure, and Google Cloud Platform (GCP), identify the most efficient programming languages, and discuss the technologies used to overcome the problem.

What is cold start in serverless?

Serverless computing is one of today's main technology trends. One of its most significant benefits is automatic scalability, which lets businesses and developers build highly scalable services without worrying about the underlying infrastructure.

However, until about four years ago, serverless developers faced a problem called cold start: when a serverless function is triggered for the first time, there is noticeable latency before it finishes executing.

Cold start is a term used to describe the time a serverless service takes to start after a period of inactivity. When a service is in "standby mode," its environment is shut down to save resources and reduce costs. When a request is received after a period of inactivity, the environment needs to be restarted, which can lead to an initial delay.

Cold start directly affected a service's performance and, therefore, the user experience, forcing users to wait a few seconds until the service finished executing - a critical issue in some situations, such as financial transactions.
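To make the idea concrete: in AWS Lambda's Node.js runtime, module-scope state survives across warm invocations of the same execution environment, so it can be used to detect whether an invocation hit a cold start. A minimal sketch in TypeScript (the log output is purely illustrative):

```typescript
// Module-scope code runs once, during the environment's init phase
// (the cold start); the handler runs on every invocation.
let coldStart = true;

export const handler = async (event: unknown) => {
  const wasCold = coldStart;
  coldStart = false; // later invocations in this environment are "warm"

  console.log(wasCold ? "cold start" : "warm invocation");
  return { statusCode: 200, body: JSON.stringify({ coldStart: wasCold }) };
};
```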

What is AWS Lambda cold start?

AWS Lambda cold start happens when a serverless function is activated for the first time or after a period of inactivity and needs to be initialized by the AWS computing service. This can lead to a noticeable delay in the function's execution time, as the infrastructure needs to be prepared to execute the function fully.

The main causes of cold start in an AWS Lambda include:

  • Lack of running instances: when a function is invoked and there are no running instances, the service must create a new instance and prepare it to execute the function. This can take a few seconds and results in a noticeable delay.

  • Instance lifespan: instances that run serverless functions on AWS have a limited lifespan and can be deactivated after a period of inactivity. When a function is invoked after this period, a new instance may need to be created, resulting in a cold start.

  • Function size: large or complex serverless functions take longer to initialize and therefore cause a more noticeable cold start (see the sketch after this list).

  • Programming language: some runtimes take longer to initialize than others. We will explore this next.
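
A common way to soften the "function size" cause is to load heavy dependencies lazily, on first use, instead of at module load time, so the init phase stays short. A sketch of the pattern; `heavy-sdk` and `HeavyClient` are placeholder names for any large dependency:

```typescript
// Lazily load a heavy dependency on first use rather than at module load
// time, keeping the environment's init phase (the cold start) short.
let client: any; // created by the first invocation that needs it

export const handler = async (event: unknown) => {
  if (!client) {
    const { HeavyClient } = await import("heavy-sdk"); // placeholder dependency
    client = new HeavyClient();
  }
  // ...use `client` to serve the request...
  return { statusCode: 200 };
};
```

The trade-off: the first request that needs the dependency still pays the loading cost, but it is moved out of the init phase, so every cold start gets cheaper.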

What is the best programming language for Lambda cold start?

The choice of programming language can be an important factor when dealing with cold start in serverless services. Some languages tend to have faster startup times than others, which can influence a function's efficiency.

The most common programming languages in serverless services are Node.js, Python, Java, Go, and C#. Each of these languages has its advantages and disadvantages regarding cold start.

🖳 Node.js is known for having faster startup times than other languages. This is partly because Node.js runs on a single thread, which simplifies initialization and reduces the time needed to load modules.

🖳 Python is also a good choice for Lambda cold start, as it is an interpreted language that doesn't require code compilation before execution. However, its startup time can be affected by code size and the complexity of the runtime environment.

🖳 Java is a compiled language that may take longer to initialize, but it is known for its efficient performance once initialization completes. Additionally, the JVM (Java Virtual Machine) has a wide range of optimization and debugging tools available.

🖳 Go is a language designed for fast startup times and known for its performance efficiency. It's a good choice for serverless services that require high performance and low latency.

🖳 C# relies on just-in-time (JIT) compilation and may have longer startup times. However, C# is known for being a powerful and flexible programming language.

💡 In summary, there isn't a definitive answer to which is the best language for handling cold start in serverless services. Each language has its advantages and disadvantages, and the choice will depend on the specific context and project requirements.

What is cold start in Azure Functions?

First, it is essential to distinguish between the two main hosting plans for Microsoft Azure Functions: the Consumption plan and the Dedicated plan.

The Consumption plan is the serverless model: code reacts to events, scales out to meet any load, scales back down when it is not running - and you are charged only for what you use. It is in this plan that cold starts occur.

In the Dedicated plan, you rent a virtual machine that is available 24/7, so there are no cold starts - but you pay for that availability.

As with AWS Lambda, cold start in Azure Functions is the total time a user must wait from the moment an event triggers a function until the function finishes responding to that event.

What happens during a cold start in Azure Functions?

  1. Azure allocates a pre-configured server from the "warm" worker pool to your application. This server already has the Functions runtime running, but it is not yet specialized.

  2. The worker becomes specialized by configuring the Functions runtime specifically for your application. Two things happen during this specialization:

     • The Azure Functions infrastructure mounts your application's file content onto the assigned worker.

     • Application-specific settings for your Function app are applied to the worker.

  3. The Functions runtime resets, and all necessary extensions are loaded onto the worker. To determine which extensions to load, the runtime reads the function.json files of the functions in the app - for example, if you are using Durable Functions or input and output bindings.

  4. The functions themselves are loaded into memory by the language providers. This takes a variable amount of time depending on the size of your application.

  5. Your code is executed.

If you have run your function recently, steps 1-4 have already happened, resources are already allocated, and your app is warm. As you can imagine, things are considerably faster in this scenario.

Azure deallocates resources after about 20 minutes of inactivity, after which your next call will be a cold start, and this entire process will happen again.
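
The practical consequence of steps 1-4 is that anything created at module scope is set up while the worker is being specialized and then reused by every warm invocation. A minimal sketch, assuming the v4 Node.js programming model of the `@azure/functions` package (the function name and log message are illustrative):

```typescript
import { app, HttpRequest, HttpResponseInit, InvocationContext } from "@azure/functions";

// Module-scope code runs once, while the worker is specialized;
// warm invocations reuse it.
const workerStartedAt = Date.now();

app.http("hello", {
  methods: ["GET"],
  authLevel: "anonymous",
  handler: async (req: HttpRequest, ctx: InvocationContext): Promise<HttpResponseInit> => {
    ctx.log(`worker has been warm for ${Date.now() - workerStartedAt} ms`);
    return { status: 200, body: "hello" };
  },
});
```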

What are the best programming languages for Azure cold start?

Microsoft recommends the generally available languages: C#, F#, and JavaScript. (Several experimental languages are not fully supported or optimized; they spawn a new process on each execution, which greatly increases latency.)

How to completely avoid cold start in Microsoft Azure?

By subscribing to the Dedicated plan, you control what happens on your virtual machine. It is a bit more expensive and not serverless, but if your solution has a strict low-latency requirement for every individual call, it is the best option.

If your application tolerates a small latency, be aware that Azure Functions have evolved enough for cold start to no longer be a problem.

What is the cold start in Google Cloud Functions?

Just as in AWS Lambda and Azure Functions, cold start in Google Cloud Functions occurs when the first request arrives after deployment.

After processing this request, the instance remains active for 15 minutes; after that, it is recycled.

Among the main programming languages for Google Cloud Functions, the cold start ranking (from best to worst, though the differences are minimal) is: JavaScript, Go, and Python.

Package size is another factor that influences cold start in Google Cloud Functions: adding dependencies increases the size of the deployed package, and with it the cold start duration.

Functions with many dependencies can be 5 times slower to start.
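
The same lazy-loading idea applies here: keep module scope light and pull heavy dependencies in only when they are first needed. A sketch using the Functions Framework for Node.js (`@google-cloud/functions-framework`); `heavy-sdk` is again a placeholder name:

```typescript
import { http } from "@google-cloud/functions-framework";

let heavyClient: any; // created lazily, then reused while the instance is warm

http("handler", async (req, res) => {
  if (!heavyClient) {
    const { HeavyClient } = await import("heavy-sdk"); // placeholder dependency
    heavyClient = new HeavyClient();
  }
  res.send("ok");
});
```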

How was the cold start problem in serverless overcome?

In recent years, several technologies and approaches have been developed to overcome the cold start problem in serverless services. Here are the main ones:

  • Node.js and the V8 engine: Node.js, built on the V8 engine, starts quickly, which helps reduce startup time and improve the overall performance of a Lambda function.

  • Runtime Customization: customizing a Lambda function's runtime environment makes it possible to pre-load the specific libraries and modules the function requires.

  • Caching Strategies: caching can reduce startup time and the impact of cold starts in Lambda functions, especially for network requests and data storage (see the sketch after this list).

  • AWS Lambda Provisioned Concurrency: this feature lets you pre-allocate a specific number of instances of a Lambda function, reducing startup time and the risk of cold start (it deserves its own section).
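
As an illustration of the caching strategy above, a handler can memoize an expensive lookup at module scope so that warm invocations skip the network round trip entirely. A sketch; the URL and TTL are placeholders:

```typescript
// Cache an expensive lookup across warm invocations of this environment.
let cached: { value: string; fetchedAt: number } | undefined;
const TTL_MS = 60_000; // placeholder time-to-live

export const handler = async () => {
  if (!cached || Date.now() - cached.fetchedAt > TTL_MS) {
    const res = await fetch("https://config.example.com/settings"); // placeholder URL
    cached = { value: await res.text(), fetchedAt: Date.now() };
  }
  return { statusCode: 200, body: cached.value };
};
```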

AWS Provisioned Concurrency and the end of cold start

Provisioned Concurrency is an AWS feature that offers more control over the performance of serverless applications. With it, you gain the ability to avoid cold starts in Lambda functions.

Provisioned Concurrency allows you to build scalable serverless applications with predictable latency. You can configure the desired concurrency on specific versions or aliases of a function.
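
For example, provisioned concurrency can be configured on an alias with the AWS SDK for JavaScript v3. A sketch; the function name, alias, and instance count are placeholders:

```typescript
import {
  LambdaClient,
  PutProvisionedConcurrencyConfigCommand,
} from "@aws-sdk/client-lambda";

const lambda = new LambdaClient({});

// Pre-allocate 10 warm execution environments for the "prod" alias.
await lambda.send(
  new PutProvisionedConcurrencyConfigCommand({
    FunctionName: "my-function", // placeholder function name
    Qualifier: "prod",           // version number or alias
    ProvisionedConcurrentExecutions: 10,
  })
);
```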

AWS Lambda prepares containers for your functions in advance, ensuring that they can be invoked with double-digit millisecond latency. This means serverless functions can absorb sudden traffic spikes or significant scaling events without added latency.

⚠️ However, provisioned concurrency has a cost: you are charged from the moment you enable it, rounded up to the nearest five minutes. The price is calculated from the amount of concurrency (the number of simultaneous function calls that can be served without latency) and the amount of allocated memory.

This means that you should set provisioned concurrency carefully, specifying only enough for your workloads to avoid incurring unnecessary costs.

Conclusion

Throughout this article, we have introduced the concept of cold start in serverless services and how it can affect an application's performance. We have also looked at the main causes of cold start in AWS Lambda, Azure Functions, and Google Cloud Functions, along with strategies for dealing with the problem, such as warming up the environment, using hot functions, choosing more efficient programming languages, and using performance monitoring and analysis tools.

Additionally, we discussed the possibility of minimizing latency and addressed recent technologies that have been used to overcome the cold start problem.

In summary, the days when cold start in serverless services was a significant problem for developers are gone. Technology has evolved, not only in software but also in hardware - SSDs and high-speed storage devices, for example - all of which has benefited Lambdas and Functions. Personal computers themselves have evolved, resulting in better application performance.

Moreover, all the mentioned platforms are continually working to mitigate the effects of cold start, each in its own way.

Thus, with hardware and software evolving together, we have managed to reduce application latency, resulting in an improved user experience.

Although it is (still) not possible to completely avoid cold start in serverless services, its latency is now so small that citing cold start as a reason not to adopt the technology no longer holds up.

So, if that was your concern, feel free to embrace the benefits of serverless computing without fear! 🚀