6 Considerations Before Using AWS Lambda SnapStart
AWS Lambda’s new “SnapStart” feature was announced at re:Invent 2022.
What is it?
When you enable this feature as part of deploying your lambda function, a snapshot of the initialised execution environment (including disk and RAM) is created. When a new instance of your function is required, it is restored from this snapshot (powered by MicroVM snapshot technology within Firecracker). This skips the usual flow of a cold start:
- Initialising an environment
- Downloading your lambda code/image
- Running any pre-invocation code
This will greatly reduce the impact of cold starts!
Great! Let’s enable this feature for all our existing lambda functions, right? Well… there are six things to be aware of:
SnapStart Is Only Available on the Java 11 Runtime (for now)
This fantastic article written by Aleksandr Filichkin shows that the Java 11 (and .NET to an extent) runtimes have a real problem with cold starts. A Java 11 cold start is roughly 10x longer than the other runtimes!
So it is no surprise that AWS prioritised Java 11 for SnapStart support initially. There is a good chance that support for other runtimes will be announced in the not-so-distant future.
Common Code Patterns to Handle Database Connections Will Not Work
A common code pattern for a lambda that reads from or writes to a database is to create the database connection outside the lambda handler function. The connection is then made once, as the lambda container initialises, instead of on every invocation (which is what happens if the code lives inside the handler function). The result is fewer connections to the database and faster lambda executions.
Below is an example lambda function doing just that (although the database credentials could be handled in a more secure way):
import sys
import logging
import rds_config
import pymysql

# rds settings
rds_host = "rds-instance-endpoint"
name = rds_config.db_username
password = rds_config.db_password
db_name = rds_config.db_name

logger = logging.getLogger()
logger.setLevel(logging.INFO)

# The connection is created at initialisation time, outside the handler
try:
    conn = pymysql.connect(host=rds_host, user=name, passwd=password, db=db_name, connect_timeout=5)
except pymysql.MySQLError as e:
    logger.error("ERROR: Unexpected error: Could not connect to MySQL instance.")
    logger.error(e)
    sys.exit()

logger.info("SUCCESS: Connection to RDS MySQL instance succeeded")

def handler(event, context):
    """
    This function fetches content from MySQL RDS instance
    """
    item_count = 0
    with conn.cursor() as cur:
        cur.execute("create table Employee ( EmpID int NOT NULL, Name varchar(255) NOT NULL, PRIMARY KEY (EmpID))")
        cur.execute('insert into Employee (EmpID, Name) values(1, "Joe")')
        cur.execute('insert into Employee (EmpID, Name) values(2, "Bob")')
        cur.execute('insert into Employee (EmpID, Name) values(3, "Mary")')
        conn.commit()
        cur.execute("select * from Employee")
        for row in cur:
            item_count += 1
            logger.info(row)
            #print(row)
    conn.commit()
    return "Added %d items from RDS MySQL table" %(item_count)
But with Lambda’s SnapStart feature, the container is initialised at deployment time instead of on a cold start invocation. The database connection is created then, and its state is stored in memory as part of the snapshot. When you later invoke your lambda from this snapshot, the code assumes there is a live database connection, but in reality that connection has most likely been closed already.
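One way to keep the module-level pattern while tolerating a stale connection is to check the connection before each use and reconnect when it is dead. The sketch below is only an illustration under assumptions: `ConnectionHolder`, `connect_fn`, `is_alive_fn` and `FakeConnection` are hypothetical names, and the fake connection stands in for a real database so the example runs anywhere. With pymysql, `connect_fn` would wrap `pymysql.connect(...)` and the liveness check could be built on `conn.ping(reconnect=False)`.

```python
from typing import Any, Callable, Optional


class ConnectionHolder:
    """Creates a connection lazily and replaces it when it is no longer alive."""

    def __init__(self, connect_fn: Callable[[], Any], is_alive_fn: Callable[[Any], bool]):
        self._connect_fn = connect_fn
        self._is_alive_fn = is_alive_fn
        self._conn: Optional[Any] = None

    def get(self) -> Any:
        # Connect on first use, or reconnect if the stored connection is stale
        # (for example, one captured in a snapshot that has since been closed).
        if self._conn is None or not self._is_alive_fn(self._conn):
            self._conn = self._connect_fn()
        return self._conn


class FakeConnection:
    """Stand-in for a real database connection (hypothetical)."""

    def __init__(self):
        self.open = True


created = []  # records every connection the fake factory hands out


def fake_connect():
    conn = FakeConnection()
    created.append(conn)
    return conn


holder = ConnectionHolder(fake_connect, lambda c: c.open)


def handler(event, context):
    conn = holder.get()  # validated on every invocation
    return "open" if conn.open else "closed"
```

The check adds a small cost to every invocation, so it trades a little warm-path latency for correctness after a restore.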
Your Long-Lived/Environment Variables Might Be Wrong
This is a similar theme to the point above on database connections. Values such as environment variables, expiring tokens or database credentials will be captured at deployment time.
Perhaps you rotate your RDS credentials? Thanks to the nice integration with AWS Secrets Manager, the value of your DB credentials is also updated in Secrets Manager automatically. When a new lambda container is initialised on a cold start, it pulls the new value of your DB secret automatically.
But with SnapStart, the values of the secrets are pulled at deployment time and stored in memory, instead of at the initialisation time of a new container. So your lambda will not be aware that the value of your DB secret has changed.
The solution?
You must read your secrets within the handler function on every invocation. However, this slows your lambda and increases your Secrets Manager cost/throughput.
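A possible middle ground, which is my suggestion rather than something from the AWS announcement, is to fetch the secret inside the handler path but cache it for a short time. The sketch below assumes a hypothetical `CachedSecret` helper; `fake_fetch` stands in for a real Secrets Manager call (for example boto3's `get_secret_value`), and the injected clock exists only so the example can be run without waiting. It also assumes the wall clock is correct after a snapshot is restored.

```python
import time
from typing import Callable, Optional


class CachedSecret:
    """Caches a secret for ttl_seconds, then fetches it again on the next read."""

    def __init__(self, fetch_fn: Callable[[], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.time):
        self._fetch_fn = fetch_fn
        self._ttl = ttl_seconds
        self._clock = clock  # wall clock by default; injectable for testing
        self._value: Optional[str] = None
        self._fetched_at = 0.0

    def get(self) -> str:
        now = self._clock()
        # Nothing is fetched at import time: the first read happens inside an invocation.
        if self._value is None or now - self._fetched_at >= self._ttl:
            self._value = self._fetch_fn()
            self._fetched_at = now
        return self._value


calls = []  # counts how often the (fake) secret store is hit


def fake_fetch() -> str:
    calls.append(1)
    return f"secret-v{len(calls)}"


fake_now = [1000.0]  # mutable fake clock
secret = CachedSecret(fake_fetch, ttl_seconds=300, clock=lambda: fake_now[0])


def handler(event, context):
    return secret.get()
```

A shorter TTL picks up rotated credentials sooner at the price of more Secrets Manager calls.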
Uniqueness/Randomness Is A Concern
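There is wide concern about the impact on randomness when a single Firecracker snapshot is reused across multiple execution environments: any random state or unique value generated during initialisation is copied into every environment restored from that snapshot.
The solution is to use cryptographically secure pseudo-random number generators (CSPRNGs) instead of ordinary PRNGs, and to generate unique values inside the handler rather than at initialisation (credit to this article).
AWS were quick to release documentation on how to mitigate this issue.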
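The effect can be imitated locally. In the sketch below, `copy.deepcopy` is only a stand-in for restoring a snapshot (an assumption for illustration, not how Firecracker works), and `simulate_restore` and `new_request_id` are hypothetical names. It shows two "restored" copies of a seeded PRNG producing the same sequence, while values drawn from the operating system through Python's `secrets` module are generated fresh on each call.

```python
import copy
import random
import secrets
import uuid

# State created during initialisation is what a snapshot would capture.
init_rng = random.Random(12345)  # ordinary seeded PRNG held in memory


def simulate_restore(rng: random.Random) -> random.Random:
    """Stand-in for a snapshot restore: each environment gets a copy of the same state."""
    return copy.deepcopy(rng)


env_a = simulate_restore(init_rng)
env_b = simulate_restore(init_rng)

# Both "environments" yield the same supposedly random numbers.
same_sequence = [env_a.random() for _ in range(3)] == [env_b.random() for _ in range(3)]


def new_request_id() -> str:
    # Drawn from the OS entropy source at call time, not from copied in-memory PRNG state.
    return secrets.token_hex(16)


def new_uuid() -> str:
    # uuid4 is also based on os.urandom.
    return str(uuid.uuid4())


def handler(event, context):
    # Unique values are generated per invocation, never at module level.
    return {"request_id": new_request_id(), "uuid": new_uuid()}
```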
Only Fully Qualified Lambda ARNs Will Use SnapStart
It appears that SnapStart snapshots are associated with a specific version of your lambda function, meaning you need to specify the exact version of your function to invoke in order to use SnapStart.
A fully qualified lambda ARN looks like:
arn:aws:lambda:aws-region:acct-id:function:helloworld:42
instead of:
arn:aws:lambda:aws-region:acct-id:function:helloworld
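Code that invokes the function can guard against accidentally using the unqualified form. The helpers below are a small sketch with hypothetical names (`is_qualified_lambda_arn`, `qualify`); they only inspect the colon-separated shape of the ARN shown above and do not call AWS.

```python
def is_qualified_lambda_arn(arn: str) -> bool:
    """Return True if the ARN ends with a qualifier after the function name."""
    parts = arn.split(":")
    # Unqualified: arn:aws:lambda:<region>:<account>:function:<name>   (7 parts)
    # Qualified:   the same, followed by :<qualifier>                  (8 parts)
    return len(parts) == 8 and parts[5] == "function" and parts[7] != ""


def qualify(arn: str, version: int) -> str:
    """Append a version number to an unqualified ARN; leave qualified ARNs unchanged."""
    if is_qualified_lambda_arn(arn):
        return arn
    return f"{arn}:{version}"


unqualified = "arn:aws:lambda:aws-region:acct-id:function:helloworld"
qualified = qualify(unqualified, 42)
```

When invoking through boto3, the same effect can be had by passing the version as the `Qualifier` argument of the Lambda client's `invoke` call.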
Your Lambda Performance Will Only Be Improved On Cold Starts
From the above points, there is a good chance that code changes are required, either inside your lambda itself or in the code invoking it.
Is this worth it?
Well, if you’ve used either of these strategies:
- Keeping lambda functions “warm”
- Utilising caching to reduce your reliance on lambda and its performance
… then cold starts may not be an issue in your architecture at the moment!
Remember, SnapStart will only improve the performance of a cold start. Regular invocations from an existing lambda container are not impacted, and your code changes may even slow these invocations down slightly.
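Before changing any code, it can help to measure how often cold starts actually happen. A minimal sketch, assuming a module-level flag (the name `_cold_start` is hypothetical): module-level code runs once per execution environment, so the flag is true only for the first invocation that environment serves.

```python
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

# Set once when the module is loaded, i.e. when the environment is initialised.
_cold_start = True


def handler(event, context):
    global _cold_start
    was_cold = _cold_start
    _cold_start = False  # every later invocation in this environment is warm
    logger.info("cold_start=%s", was_cold)
    return {"cold_start": was_cold}
```

Counting the `cold_start=True` log lines against total invocations gives a rough idea of whether cold starts are frequent enough to justify the changes above.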