AWS Python Lambda "Hello World" + psycopg2 dependency = 3.2 MB. Can I shrink?
I have a trivial Python Lambda function defined in index.py:
def handler(event, context):
    return {"msg": "hello world. this is hello handler"}
Deployed via CDK (typescript):
const stack = new Stack(app, "PythonHelloStack", {env})
new PythonFunction(stack, `HelloFunction`, {
  runtime: Runtime.PYTHON_3_9,
  entry: path.join(__dirname, `../../../lambdas/hello`),
})
This works and is 4.8 kB. Great. If I add a single dependency on psycopg2-binary, without changing the Python code, the AWS Lambda code size jumps from 4.8 kB to 3.2 MB. Is that inevitable, or is there a fix? Can I do anything to reduce the code size? Should I? Is creating a layer necessary or helpful for this? Is there a simpler fix? Thank you :)
My project with the psycopg2-binary dependency has the following pyproject.toml:
[tool.poetry]
name = "hello"
version = "0.1.0"
description = ""
authors = []
[tool.poetry.dependencies]
python = "~3.9"
psycopg2-binary = "~2.9"
[tool.poetry.dev-dependencies]
[build-system]
requires = ["poetry-core>=1.1.0"]
build-backend = "poetry.core.masonry.api"
Solution 1:
First, what psycopg2-binary is and what "binary" means:
The binary packages come with their own versions of a few C libraries, among which libpq and libssl, which will be used regardless of other libraries available on the client
So psycopg2-binary bundles its native dependencies out of the box, which is why the resulting Lambda package is relatively large.
As the documentation quoted above goes on to say, the recommendation is to build the library yourself from the psycopg2 package:
For production use you are advised to use the source distribution.
That will allow you to use newer versions of the dependency libraries (libpq, libssl, etc.). The psycopg2-binary package may have been built a long time ago, so the libraries it bundles may be outdated or vulnerable.
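To see how old the bundled libpq actually is, you can ask psycopg2 itself. This is a minimal sketch, assuming psycopg2 2.7 or later, where `psycopg2.__libpq_version__` and `psycopg2.extensions.libpq_version()` are available. The helper names `format_libpq_version` and `bundled_libpq_versions` are made up for this example.

```python
from typing import Optional


def format_libpq_version(number: int) -> str:
    """Turn libpq's integer version (e.g. 140005) into a dotted string."""
    major, rest = divmod(number, 10000)
    if major >= 10:
        # PostgreSQL 10 and later use two-part versions.
        return f"{major}.{rest}"
    # Before PostgreSQL 10, versions had three parts.
    minor, patch = divmod(rest, 100)
    return f"{major}.{minor}.{patch}"


def bundled_libpq_versions() -> Optional[dict]:
    """Return the libpq versions psycopg2 reports, or None if it is not installed."""
    try:
        import psycopg2
        from psycopg2 import extensions
    except ImportError:
        return None
    return {
        # Version psycopg2 was compiled against.
        "compiled_against": format_libpq_version(psycopg2.__libpq_version__),
        # Version of the libpq library actually loaded at runtime.
        "loaded_at_runtime": format_libpq_version(extensions.libpq_version()),
    }


if __name__ == "__main__":
    print(bundled_libpq_versions())
```

Running this inside the environment you deploy (or in the Lambda itself) shows which libpq release you are really shipping, which you can then compare against current PostgreSQL releases.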
Regarding your question about library size: even if you build your own version of psycopg2, it will include the same number of libraries as the pre-built binary one, so I'm not sure you would save a meaningful amount of space.
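If you want to measure rather than guess, you could install each variant into its own directory and compare the compressed sizes. The sketch below assumes the size Lambda reports is that of the zipped deployment package and that the bundler uses ordinary deflate compression, so treat the result as an estimate. `zipped_size` is a hypothetical helper, and the directory names in the usage note are placeholders.

```python
import io
import zipfile
from pathlib import Path


def zipped_size(directory: str) -> int:
    """Return the size in bytes of `directory` once deflate-compressed into a zip."""
    root = Path(directory)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Store paths relative to the root, as a deployment zip would.
                archive.write(path, arcname=path.relative_to(root).as_posix())
    # The archive is closed here, so the buffer holds the complete zip file.
    return len(buffer.getvalue())
```

For example, after `pip install --target build_binary psycopg2-binary` and a source build installed into `build_source`, calling `zipped_size("build_binary")` and `zipped_size("build_source")` gives two numbers you can compare directly.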
This documentation also advises against using the binary package:
The binary package is a practical choice for development and testing but in production it is advised to use the package built from sources.
This answer may also be helpful if you decide to build the library yourself.
Solution 2:
If you create a virtualenv with psycopg2 installed, you'll see that this is pretty much the minimum you can get away with, given the size of the wheel's components and its dependencies.
While it's not 100% the same as 2.9, here's mine for 2.9.1:
~/v-3.9/lib/python3.9/site-packages/psycopg2$ du -sh *
8.0K __init__.py
140K __pycache__
4.0K _ipaddress.py
8.0K _json.py
1.5M _psycopg.cpython-39-x86_64-linux-gnu.so
20K _range.py
16K errorcodes.py
4.0K errors.py
8.0K extensions.py
44K extras.py
8.0K pool.py
16K sql.py
8.0K tz.py
Note the size of the shared object. You might also want to check the import statements: they pull in a few other pieces as well, which adds to your code size.
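To find those other pieces, you can run the same kind of breakdown over the whole site-packages directory instead of just psycopg2. This is a rough sketch: `directory_sizes` and `largest` are names invented for this example, and the function sums file sizes rather than disk blocks, so the numbers will differ slightly from `du`.

```python
from pathlib import Path


def directory_sizes(site_packages: str) -> dict:
    """Map each top-level entry in `site_packages` to its total size in bytes."""
    sizes = {}
    for entry in Path(site_packages).iterdir():
        if entry.is_dir():
            # Sum every regular file below this package directory.
            sizes[entry.name] = sum(
                p.stat().st_size for p in entry.rglob("*") if p.is_file()
            )
        else:
            sizes[entry.name] = entry.stat().st_size
    return sizes


def largest(sizes: dict, count: int = 10) -> list:
    """Return the `count` largest (name, bytes) pairs, biggest first."""
    return sorted(sizes.items(), key=lambda item: item[1], reverse=True)[:count]
```

Calling `largest(directory_sizes(path_to_site_packages))` on the environment from the listing above should put psycopg2 near the top, along with any bundled shared libraries that sit alongside it.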