Python dependency management 💥
All hell broke loose
Python dependency management has historically been chaotic. Several tools try to solve the same problems, and there is no common standard. In this post, I will talk about some of them and explain my take on the topic.
Coming from a web development background, I am used to having robust and versatile dependency managers such as npm and yarn. Both are feature-rich and make it easy to get reproducible development environments. For example, it suffices to run npm i to create an isolated environment containing all the dependencies needed to run the project. Furthermore, this environment is reproducible if the lockfile is present.
You can spot the inconsistent state of the Python developer world easily by taking a look at some popular repositories. You will find requirements.txt, requirements.in, Pipfile, setup.py, pyproject.toml, meta.yaml or even Dockerfile depending on the project you visit.
pip
The most common tool used when dealing with Python dependencies is pip. This tool reads from the most famous Python package repository, PyPI, and installs packages into the current Python environment.
Many Python projects include a file called requirements.txt. This file lists the packages the project needs and the version of each to install, such as:
appnope==0.1.0
backcall==0.1.0
beautifulsoup4==4.6.3
bleach==2.1.4
pip is able to read this kind of file and install the packages by running pip install -r requirements.txt. But using pip alone leads to a problem: soon your Python environment will be cluttered with a huge number of packages, and it can reach an inconsistent state. This is where virtual environments are useful, but pip does not offer any way of creating them.
pip-tools is an add-on built on top of pip which aims to ensure reproducible installs, but the problem of clutter remains.
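To make the requirements.txt format a bit more concrete, here is a minimal sketch of how exact pins like the ones listed earlier could be read into a dictionary. The function name parse_pinned_requirements is made up for this post, and the sketch only understands name==version lines; it is not how pip itself parses the file, and it ignores options, version ranges and URLs.

```python
from typing import Dict


def parse_pinned_requirements(text: str) -> Dict[str, str]:
    """Map package name -> version for lines of the form name==version.

    Assumption: only exact pins matter here. Options (-r, -e), version
    ranges and URLs are skipped rather than interpreted.
    """
    pins: Dict[str, str] = {}
    for raw_line in text.splitlines():
        # Drop inline comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        # Drop environment markers such as "; python_version < '3.8'".
        line = line.split(";", 1)[0].strip()
        if not line or line.startswith("-") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


# The same pins used as an example earlier in this post.
EXAMPLE = """\
appnope==0.1.0
backcall==0.1.0
beautifulsoup4==4.6.3
bleach==2.1.4
"""

if __name__ == "__main__":
    for name, version in parse_pinned_requirements(EXAMPLE).items():
        print(f"{name} is pinned to {version}")
```

The point is simply that every line pins one package to one version, and nothing in the file says which environment those packages end up in.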
Virtual environments and Python installations
In order to have a lightweight, reproducible and independent environment, we should create a virtual environment for each of our Python projects. There are many ways of doing this, and it’s part of the chaotic state of affairs:
- virtualenv: a tool to create isolated Python environments.
- venv: a subset of virtualenv included in the standard library. Lacks some functionality and speed.
- pyenv: lets you easily switch between multiple versions of Python.
Unfortunately, none of these tools include dependency management, so they would need to be used in addition to pip or pip-tools.
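As a rough illustration of the venv option from the list above, the standard library module can also be driven from Python. This is only a sketch: the helper name create_project_env and the .venv folder name are my own choices (the latter is a common convention, not a requirement), and pip is deliberately left out of the environment to keep the example fast.

```python
import tempfile
import venv
from pathlib import Path


def create_project_env(project_dir: Path, with_pip: bool = False) -> Path:
    """Create a virtual environment in <project_dir>/.venv and return its path."""
    env_dir = project_dir / ".venv"  # assumed location, chosen for this example
    # venv.create is the standard library helper behind "python -m venv".
    venv.create(env_dir, with_pip=with_pip)
    return env_dir


if __name__ == "__main__":
    # Use a throwaway directory so the demo leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = create_project_env(Path(tmp))
        # Environments created by venv contain a pyvenv.cfg marker file.
        print((env_dir / "pyvenv.cfg").read_text())
```

Note that this only creates the isolated environment. Installing the right packages into it is still a separate step, which is exactly the gap described above.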
Fully-fledged package managers
Now let’s talk about real dependency managers. These are the tools which I would call comparable to the likes of npm. I will start with the one I like the least: conda.
conda
Conda is a package + environment manager which was created with the scientific community in mind. It comes in different flavours such as Anaconda or Miniconda. It has a (really ugly and slow) user interface to manage your conda environments and dependencies.
Pros:
- Easy to install and update dependencies
- Manages virtual environments
Cons:
- Does not use PyPI as its package repository
- Normally not compatible with the latest Python version
- Slow and bloated
- Convoluted and not versatile
- Suffers from random stability problems
pipenv
pipenv has been around since 2017 and it is the first tool that I would consider a real, production-ready package manager.
It deals with virtual environments, dependency compatibility and reproducibility. It has terminal commands to easily add, remove and upgrade packages. It writes the added packages to a Pipfile and Pipfile.lock. It is really lightweight and functional.
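To give an idea of what the lockfile buys you, here is a small sketch that reads exact versions out of Pipfile.lock-style JSON. It assumes the layout I have seen pipenv write, with "default" and "develop" sections whose entries carry a "version" string such as "==2.1.4". The sample content is trimmed down and invented for illustration, the hashes are placeholders, and locked_versions is a hypothetical helper, not part of pipenv.

```python
import json
from typing import Dict

# Hypothetical, trimmed-down lockfile content for illustration only.
# The hash values are placeholders, not real digests.
EXAMPLE_LOCK = """
{
    "_meta": {"requires": {"python_version": "3.8"}},
    "default": {
        "bleach": {"hashes": ["sha256:<placeholder>"], "version": "==2.1.4"}
    },
    "develop": {
        "backcall": {"hashes": ["sha256:<placeholder>"], "version": "==0.1.0"}
    }
}
"""


def locked_versions(lock_text: str, section: str = "default") -> Dict[str, str]:
    """Return package name -> exact version for one section of the lockfile."""
    lock = json.loads(lock_text)
    versions: Dict[str, str] = {}
    for name, entry in lock.get(section, {}).items():
        # Entries without a "version" key (assumed: e.g. VCS dependencies)
        # are skipped instead of guessed at.
        version = entry.get("version")
        if version is not None:
            versions[name] = version.lstrip("=")
    return versions


if __name__ == "__main__":
    print(locked_versions(EXAMPLE_LOCK))
    print(locked_versions(EXAMPLE_LOCK, section="develop"))
```

Because every package is recorded with an exact version (and hashes), two people installing from the same lockfile should end up with the same set of packages.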
poetry
Finally, let’s talk about my personal favourite: poetry.
Poetry comes with all the tools you might need to manage your projects in a deterministic way. It is a modern alternative to conda and pipenv. It comes with a much better dependency resolver, which translates to faster install speeds and better stability.
Poetry makes it really easy to initialise a project and to add, remove or update dependencies while keeping everything in a consistent state.
In addition, Poetry streamlines the process of deploying your project to PyPI, which can be done by running just two commands: poetry build and then poetry publish.
This tool is by far my favourite and it’s under active development, so more features will be coming in the future.
A good guide on what Poetry has to offer can be found here: Upgrade your Python project with poetry.
Conclusion
If you’re just starting a new Python project, I strongly recommend using Poetry. It will make everything easier for you, and your project will be more solid.
I hope this post made things a bit clearer for you 😁
See you around!