What is this post about?
Today started like any other workday. I’m at the point in a project where we need yet another micro-service, so I started writing a Python script to be deployed in AWS Lambda. This usually takes a couple of hours at most. But once it was written, I realized half of that time had gone into debugging an issue I didn’t even understand.
After finding the why and how, I thought it’d be good to share it here – hopefully someone finds it useful along the way.
About the script
For someone unfamiliar with micro-services: it’s just a piece of code sitting outside your main app, doing some specific task – collecting data from an API, populating a database, and so on.
This piece of code could sit anywhere, not only in AWS Lambda. We use Lambda because we’re lazy enough to skip Dockerizing the scripts, maintaining and updating them, and layering Kubernetes on top – which would make things even more complex. That said, none of that should discourage you from using those tools :)
Define the problem!
Here’s why half of the time went into debugging: I had a list defined in global (module-level) scope that some functions were updating.
myresult = []

def foo() -> None:
    # collect some info from api, append to 'myresult'
    newdata = "item1"  # placeholder for the API response
    myresult.append(newdata)

def bar() -> None:
    # do something with myresult
    pass

def main() -> None:
    foo()
    bar()
Generally speaking, this code is perfectly valid and not an issue whatsoever – unless you’re running it in Lambda.
Why is this a problem in AWS Lambda?
To understand the problem, you need to know how Lambda handles invocations.
When Lambda receives its first invocation, it spins up a new execution environment (a cold start): imports your module, runs all module-level code – including myresult = [] – and then calls your handler. So far so good.
But here’s the catch: Lambda intentionally keeps that execution environment alive after your handler returns, so the next invocation can skip the cold start overhead. This is called a warm start, and it’s a deliberate performance feature.
The problem? Module-level code like myresult = [] only runs once – on cold start. On every warm invocation, the same Python interpreter is reused, and myresult still holds whatever was left in it from the previous run.
So what actually happens with our code across invocations:
- Invocation 1 (cold): myresult starts as [], foo() appends → ["item1"]
- Invocation 2 (warm): myresult is still ["item1"], foo() appends → ["item1", "item1"]
- Invocation 3 (warm): myresult is ["item1", "item1"] → keeps growing
bar(), without knowing any of this, uses myresult assuming it only contains fresh data. The result is silently wrong – no errors, just bad output.
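You can reproduce this locally without deploying anything. The sketch below assumes that calling the handler several times in one Python process is a fair stand-in for warm invocations; the hard-coded "item1" is a hypothetical placeholder for whatever the API would return.

```python
# Module-level state: created once when the module is imported (the "cold start").
myresult = []


def foo() -> None:
    # Hypothetical stand-in for "collect some info from an API".
    myresult.append("item1")


def lambda_handler(event, context) -> list:
    foo()
    # Return a snapshot so each recorded result stays independent of later calls.
    return list(myresult)


# Repeated calls in the same process mimic warm invocations:
# nothing re-runs `myresult = []` between them.
results = [lambda_handler({}, None) for _ in range(3)]

for number, result in enumerate(results, start=1):
    print(f"invocation {number}: {len(result)} item(s)")
```

Each call reports one more item than the previous one, even though foo() only ever appends a single item per invocation.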
Fixing it
The right fix – keep it local:
The cleanest solution is to declare myresult inside main() and pass it as an argument where needed. This way it’s re-initialized on every invocation, warm or cold:
def foo(myresult: list) -> None:
    # collect some info from api, append to 'myresult'
    newdata = "item1"  # placeholder for the API response
    myresult.append(newdata)

def bar(myresult: list) -> None:
    # do something with myresult
    pass

def main() -> list:
    myresult = []  # fresh every invocation
    foo(myresult)
    bar(myresult)
    return myresult

def lambda_handler(event, context) -> list:  # <-- Entrypoint of Lambda
    return main()
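A quick way to convince yourself that the local version behaves is to call the handler twice in the same process and compare the results. This is only a sketch: "item1" is again a hypothetical placeholder for the API data, and bar() returns the list length here purely so it has something to do.

```python
def foo(myresult: list) -> None:
    # Hypothetical stand-in for the API call.
    myresult.append("item1")


def bar(myresult: list) -> int:
    # Placeholder for "do something with myresult".
    return len(myresult)


def main() -> list:
    myresult = []  # a new list object on every call
    foo(myresult)
    bar(myresult)
    return myresult


def lambda_handler(event, context) -> list:
    return main()


# Two back-to-back calls, as a warm environment would make them.
first = lambda_handler({}, None)
second = lambda_handler({}, None)

print(first == second)   # same contents each time
print(first is second)   # but not the same list object
```

Both calls return a one-item list, and the two lists are distinct objects, so nothing carries over from one invocation to the next.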
The dirty fix – .clear() after use:
If restructuring isn’t an option, you can clear the global list at the end of each invocation. One important thing to watch out for here: in Python, assigning the list to another name (final_result = myresult) doesn’t copy it; both names refer to the same list object. If you call myresult.clear() after such an assignment, you empty the list behind both names and end up returning an empty list. Make a copy first:
myresult = []

def foo() -> None:
    newdata = "item1"  # placeholder for the API response
    myresult.append(newdata)

def bar() -> None:
    # do something with myresult
    pass

def main() -> None:
    foo()
    bar()

def lambda_handler(event, context) -> list:  # <-- Entrypoint of Lambda
    main()
    final_result = list(myresult)  # <-- copy before clearing
    myresult.clear()  # <-- safe to clear now
    return final_result
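The copy-before-clear detail is easy to get wrong, so here is a small standalone illustration of the difference between a second name for the same list and an actual copy. The names alias and snapshot are made up for this example.

```python
myresult = ["item1"]

alias = myresult            # a second name for the SAME list object
snapshot = list(myresult)   # a new list holding the same items (shallow copy)

myresult.clear()            # empties the one shared list in place

print(alias)     # the alias sees the cleared list
print(snapshot)  # the copy still holds the original items
```

Keep in mind that list(...) makes a shallow copy: the outer list is new, but any mutable items inside it (dicts, nested lists) are still shared with the original.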
Takeaway
Only define constants in global scope inside a Lambda function. Anything mutable that gets written to by your functions should live locally, scoped to the handler or the function that owns it.
I could have easily passed myresult as a parameter to bar() from the start and this would never have happened. But since it did – and since this is the second time I’ve hit the same issue a year apart – I figured writing it down would make it stick. Hope it saves you the debugging session.
I’d also like to share a more technical video covering this whole topic below. Have a good week!
AWS Lambda Global Variables - The Good the Bad & the Ugly - AWS Service Deep Dive