AWS Python Lambda "Hello World" + psycopg2 dependency = 3.2 MB. Can I shrink it?

I have a trivial Python Lambda function defined in index.py:

def handler(event, context):
    return {"msg": "hello world. this is hello handler"}

Deployed via CDK (TypeScript):

const stack = new Stack(app, "PythonHelloStack", {env})
new PythonFunction(stack, `HelloFunction`, {
    runtime: Runtime.PYTHON_3_9,
    entry: path.join(__dirname, `../../../lambdas/hello`),
})

This works and is 4.8 kB. Great. If I add a single dependency on psycopg2-binary, without changing the Python code, the AWS Lambda code size jumps from 4.8 kB to 3.2 MB. Is that inevitable, or is there a fix? Can I do anything to reduce the code size, and should I? Is creating a layer necessary or helpful for this, or is there a simpler fix? Thank you :)

My project with the psycopg2-binary dependency has the following pyproject.toml:

[tool.poetry]
name = "hello"
version = "0.1.0"
description = ""
authors = []

[tool.poetry.dependencies]
python = "~3.9"
psycopg2-binary = "~2.9"

[tool.poetry.dev-dependencies]

[build-system]
requires = ["poetry-core>=1.1.0"]
build-backend = "poetry.core.masonry.api"

Solution 1:

First, what psycopg2-binary is and what "binary" means:

The binary packages come with their own versions of a few C libraries, among which libpq and libssl, which will be used regardless of other libraries available on the client

So psycopg2-binary bundles its native dependencies out of the box, which is why the resulting Lambda package is relatively large.

As the linked documentation notes, it is recommended to build your own version of the library from the psycopg2 package:

For production use you are advised to use the source distribution.

That will allow you to use newer versions of the dependency libraries (libpq, libssl, etc.). The psycopg2-binary wheel may have been built a long time ago, so its bundled libraries could be outdated or vulnerable.
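If you want to see which libpq version your installed psycopg2 actually uses, here is a minimal sketch. It relies on psycopg2.__libpq_version__ and psycopg2.extensions.libpq_version(), both available since psycopg2 2.7; the helper format_libpq_version is my own name, not part of the library, and it assumes the usual PostgreSQL integer version encoding.

```python
def format_libpq_version(number):
    """Convert libpq's integer version (e.g. 130004) into a dotted string."""
    major, rest = divmod(number, 10000)
    if major >= 10:
        # PostgreSQL 10+ encodes versions as major * 10000 + minor
        return f"{major}.{rest}"
    # Older releases encode versions as major * 10000 + minor * 100 + patch
    minor, patch = divmod(rest, 100)
    return f"{major}.{minor}.{patch}"


if __name__ == "__main__":
    try:
        import psycopg2
        import psycopg2.extensions
    except ImportError:
        print("psycopg2 is not installed in this environment")
    else:
        print("psycopg2:", psycopg2.__version__)
        # Version of libpq that the module was compiled against
        print("libpq (compiled):", format_libpq_version(psycopg2.__libpq_version__))
        # Version of libpq that is actually loaded at runtime
        print("libpq (runtime):", format_libpq_version(psycopg2.extensions.libpq_version()))
```

Running this once in a venv with psycopg2-binary and once with your own build of psycopg2 should show whether building from source really gets you a newer libpq.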

Regarding your question about library size: even if you build your own version of psycopg2, it will still need the same set of libraries as the pre-built binary one, so I'm not sure you can save a meaningful amount of space this way.

You can also check this documentation, which again recommends against using the binary package:

The binary package is a practical choice for development and testing but in production it is advised to use the package built from sources.

This answer may also be helpful if you decide to build the library yourself.

Solution 2:

If you create a virtualenv with psycopg2 installed, you'll see that this is pretty much the minimum you can get away with, given the size of the components of the wheel and its dependencies.

While it's not 100% the same as 2.9, here's mine for 2.9.1:

~/v-3.9/lib/python3.9/site-packages/psycopg2$ du -sh *
8.0K    __init__.py
140K    __pycache__
4.0K    _ipaddress.py
8.0K    _json.py
1.5M    _psycopg.cpython-39-x86_64-linux-gnu.so
20K     _range.py
16K     errorcodes.py
4.0K    errors.py
8.0K    extensions.py
44K     extras.py
8.0K    pool.py
16K     sql.py
8.0K    tz.py

Note the size of the shared object (the 1.5M .so file). You might also want to check the import statements: the package pulls in a few other pieces as well, which adds to your code size increase.
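If you'd rather do the same check from Python (for example against the unpacked Lambda bundle), here is a rough sketch of a du-style report. The function name entry_sizes is hypothetical, and it sums apparent file sizes rather than disk blocks, so the numbers will differ slightly from du.

```python
import sys
from pathlib import Path


def entry_sizes(package_dir):
    """Return (name, total_bytes) for each top-level entry, largest first."""
    sizes = []
    for entry in Path(package_dir).iterdir():
        if entry.is_dir():
            # Sum every regular file below the directory
            total = sum(f.stat().st_size for f in entry.rglob("*") if f.is_file())
        else:
            total = entry.stat().st_size
        sizes.append((entry.name, total))
    return sorted(sizes, key=lambda item: item[1], reverse=True)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the directory to inspect as the first command-line argument
    for name, size in entry_sizes(sys.argv[1]):
        print(f"{size / 1024:10.1f} KiB  {name}")
```

Pointing it at the psycopg2 directory in site-packages should put the compiled extension at the top. Pointing it at site-packages itself also shows any sibling directories the wheel installed.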