A package manager makes it easier to manage a project's dependencies: the libraries, functions, and packages it needs in order to work. Pip, a widely adopted classic, is the preferred choice of many developers, as it allows for the seamless installation of Python packages from the Python Package Index (PyPI). Conda, recognized not only as a package manager but also as an environment manager, can manage dependencies from both Python and other systems, making it a versatile tool. For our purposes, we will focus on using Conda primarily for Python-only environments.
Pip and Conda stand out as reliable tools, widely used and trusted by the developer community. However, as projects expand, staying organized amidst an increasing number of dependencies becomes a challenge. In this context, Poetry emerges as a modern and organized solution for dependency management.
Poetry, built on top of Pip, takes a contemporary approach to dependency management. It is more than a simple fusion of Pip and a virtual environment: it is a comprehensive tool covering dependency management, project packaging, and build processes. The comparison to Conda is nuanced; Poetry aims to simplify the packaging and distribution of Python projects and offers a distinct feature set.
Pip and Conda remain valuable options for managing dependencies, and Conda's ability to handle non-Python dependencies makes it especially versatile. Poetry, on the other hand, offers a modern, comprehensive solution that simplifies managing Python projects and their dependencies. Choosing the right tool depends on the specific project requirements and developer preferences.
Package Management
Poetry uses a pyproject.toml file to specify your project settings, accompanied by an automatically generated lock file. The pyproject.toml file looks like this:
[tool.poetry.dependencies]
python = "^3.8"
pandas = "^1.5"
numpy = "^1.24.3"
[tool.poetry.dev-dependencies]
pytest = "^7.3.2"
pre-commit = "^3.3.3"
Like other dependency managers, Poetry keeps close track of package versions in the current environment via a lock file. This lock file contains project metadata, the exact resolved package versions, and more, ensuring consistency across different environments. Developers can separate dependencies into development and production groups within the pyproject.toml file, streamlining deployment environments and reducing the risk of conflicts, especially across different operating systems.
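To make the version constraints in the example file concrete, here is a minimal sketch of how a caret constraint such as "^1.5" maps to a range of allowed versions. It assumes plain numeric versions with one to three parts; the function name caret_range is ours, not part of Poetry, and Poetry's own parser handles many more cases.

```python
def caret_range(constraint: str) -> tuple[str, str]:
    """Expand a caret constraint into (inclusive lower, exclusive upper) bounds.

    Illustrative helper only; it is not part of Poetry.
    """
    if not constraint.startswith("^"):
        raise ValueError(f"not a caret constraint: {constraint!r}")

    given = [int(part) for part in constraint[1:].split(".")]
    if not 1 <= len(given) <= 3:
        raise ValueError(f"expected one to three numeric parts: {constraint!r}")

    # Pad to major.minor.patch for the lower bound.
    lower = given + [0] * (3 - len(given))

    # Bump the leftmost non-zero part; if every given part is zero,
    # bump the last part that was written.
    bump_at = next((i for i, n in enumerate(given) if n != 0), len(given) - 1)
    upper = lower[:bump_at] + [lower[bump_at] + 1] + [0] * (2 - bump_at)

    def fmt(parts):
        return ".".join(str(n) for n in parts)

    return fmt(lower), fmt(upper)


if __name__ == "__main__":
    for spec in ("^3.8", "^1.5", "^1.24.3", "^0.2.3"):
        low, high = caret_range(spec)
        print(f"{spec:>8} allows >={low},<{high}")
```

Under these assumptions, pandas = "^1.5" accepts any 1.x release from 1.5.0 onward but never 2.0.0, which is why an update stays within the major version you asked for.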
Poetry's pyproject.toml file is designed to address certain limitations found in Pip's requirements.txt and Conda's environment.yaml files. Unlike Pip and Conda, which typically produce long lists of dependencies without metadata in a separate file, Poetry aims for a more organized and concise representation.
While it is true that Pip and Conda lack a locking feature by default, separate tools such as pip-tools and conda-lock can generate lock files for them. With pinned versions in the requirements.txt file, different users install the same library versions, which promotes reproducibility.
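As a rough illustration of what pinning buys you, the sketch below compares the exact pins in a requirements file against what is installed. It assumes simple "name==version" lines (no hashes or line continuations); parse_pins and find_mismatches are hypothetical helper names, and the lookup defaults to the standard library's importlib.metadata.version.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Collect exact pins ("name==version") from requirements-style text."""
    pins = {}
    for raw in requirements_text.splitlines():
        # Drop comments and environment markers, then surrounding whitespace.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" not in line:
            continue  # not an exact pin, so nothing to check
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins, installed_version=metadata.version):
    """Return {name: (pinned, installed)} for every package that differs.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    example = "numpy==1.22.3\npandas==1.4.2\n"
    for name, (wanted, found) in find_mismatches(parse_pins(example)).items():
        print(f"{name}: pinned {wanted}, installed {found}")
```

An empty result means the environment matches the pins, which is the guarantee a lock file is meant to provide.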
Poetry emerges as a modern, organized solution for Python dependency management, offering better organization, version control, and flexibility compared to traditional tools like Pip and Conda.
Updating, installing and removing dependencies
With Poetry, updating libraries is simple and takes other dependencies into account. The poetry update command updates your dependencies in bulk, within the version constraints in your toml file, while keeping all dependencies compatible with each other. It updates your lock file at the same time.
As for installation, it couldn't be simpler. To install dependencies with Poetry, use the poetry add command. You can specify an exact version, use comparison operators (greater than, less than) to specify a version range, or append @latest to install the most recent version of the package from PyPI. You can even pass multiple packages to the same add command. Poetry resolves any newly added packages against the existing dependencies to keep them compatible.
$ poetry add requests pandas@latest
As for classic dependency managers, let's run a test to see what happens when we try to install an older, incompatible version. Pip reports the conflict as an error but still installs the package, which can leave the environment in a broken state. Conda has a compatibility checker and will notify the user, but it immediately goes into search mode to resolve the compatibility issue and throws a secondary error when it can't find a solution.
(test-env) user:~$ pip install "numpy<1.18.5"
Collecting numpy<1.18.5
Downloading numpy-1.18.4-cp38-cp38-manylinux1_x86_64.whl (20.7 MB)
|████████████████████████████████| 20.7 MB 10.9 MB/s
Installing collected packages: numpy
Attempting uninstall: numpy
Found existing installation: numpy 1.22.3
Uninstalling numpy-1.22.3:
Successfully uninstalled numpy-1.22.3
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
pandas 1.4.2 requires numpy>=1.18.5; platform_machine != "aarch64" and platform_machine != "arm64" and python_version < "3.10", but you have numpy 1.18.4 which is incompatible.
Successfully installed numpy-1.18.4
(test-env) user:~$ pip list
Package Version
--------------- -------
numpy 1.18.4
pandas 1.4.2
pip 21.1.1
python-dateutil 2.8.2
pytz 2022.1
six 1.16.0
Poetry responds to dependency compatibility errors immediately, reporting conflicts early. It refuses to continue with the installation, leaving the user to choose a different version of either the new package or the existing one. We believe this allows for more control compared to Conda's immediate action.
user:~$ poetry add "numpy<1.18.5"
Updating dependencies
Resolving dependencies... (53.1s)
SolverProblemError
Because pandas (1.4.2) depends on numpy (>=1.18.5)
and no versions of pandas match >1.4.2,<2.0.0, pandas (>=1.4.2,<2.0.0) requires numpy (>=1.18.5).
So, because dependency-manager-test depends on both pandas (^1.4.2) and numpy (<1.18.5), version solving failed.
...
user:~$ poetry show
numpy 1.22.3 NumPy is the fundamental package for array computing with Python.
pandas 1.4.2 Powerful data structures for data analysis, time series, and statistics
python-dateutil 2.8.2 Extensions to the standard Python datetime module
pytz 2022.1 World timezone definitions, modern and historical
six 1.16.0 Python 2 and 3 compatibility utilities
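The conflict in the transcripts above can be reproduced with a small sketch: a version is acceptable only if it satisfies every constraint placed on it. This is not Poetry's actual resolver, just the underlying idea; it assumes three-part numeric versions, and the helper names are illustrative.

```python
import operator

# Comparison operators supported by this sketch.
OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text):
    """Turn "1.18.5" into (1, 18, 5) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version, constraint):
    """Check a single constraint such as ">=1.18.5" against a version."""
    # Try two-character operators first so ">=" is not read as ">".
    for symbol in (">=", "<=", "==", ">", "<"):
        if constraint.startswith(symbol):
            bound = parse_version(constraint[len(symbol):])
            return OPS[symbol](parse_version(version), bound)
    raise ValueError(f"unsupported constraint: {constraint!r}")


def compatible_versions(candidates, constraints):
    """Return the candidate versions that satisfy every constraint."""
    return [v for v in candidates if all(satisfies(v, c) for c in constraints)]


if __name__ == "__main__":
    numpy_versions = ["1.18.4", "1.18.5", "1.22.3"]
    from_pandas = ">=1.18.5"   # what pandas 1.4.2 requires
    requested = "<1.18.5"      # what we asked to install

    print(compatible_versions(numpy_versions, [from_pandas]))
    # -> ['1.18.5', '1.22.3']
    print(compatible_versions(numpy_versions, [from_pandas, requested]))
    # -> [] (no version satisfies both, so solving fails)
```

An empty list corresponds to the point where Poetry stops with "version solving failed", while Pip in the earlier transcript goes ahead and installs numpy 1.18.4 anyway.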
Last but not least is package uninstallation. Many packages pull in additional dependencies when they are installed. In the case of Pip, removing a package will only uninstall the named package and nothing else. Conda will remove some packages, but not all dependencies. Poetry, on the other hand, will remove the package and the dependencies that were installed only for it, keeping the dependency list tidy.
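One way to picture this cleanup is as a graph problem: after removing a package, anything no longer reachable from the remaining top-level packages can go too. The sketch below is our own illustration of that idea, not Poetry's implementation; the dependency map mirrors the packages listed in the earlier output, and removable_packages is a hypothetical name.

```python
def removable_packages(dependencies, top_level, target):
    """Return the set of packages that become unnecessary once `target` is removed.

    dependencies: maps each package name to the packages it requires.
    top_level:    packages the project asks for directly.
    """
    def reachable(roots):
        # Depth-first walk over the dependency graph.
        seen = set()
        stack = list(roots)
        while stack:
            name = stack.pop()
            if name in seen:
                continue
            seen.add(name)
            stack.extend(dependencies.get(name, ()))
        return seen

    before = reachable(top_level)
    after = reachable(name for name in top_level if name != target)
    return before - after


if __name__ == "__main__":
    deps = {
        "pandas": ["numpy", "python-dateutil", "pytz"],
        "python-dateutil": ["six"],
        "numpy": [],
        "pytz": [],
        "six": [],
    }
    # numpy is also requested directly, so it survives removing pandas.
    print(sorted(removable_packages(deps, ["pandas", "numpy"], "pandas")))
    # -> ['pandas', 'python-dateutil', 'pytz', 'six']
```

In this example numpy stays installed because the project still depends on it directly, while the packages that only pandas needed are cleaned up.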
Is Poetry compatible with existing Pip or Conda projects?
Yes, Poetry is compatible with existing projects managed by Pip or Conda. Simply initialize your project with Poetry's pyproject.toml format and run the install to pull in the packages and their dependencies, allowing for a seamless transition.
If you have an existing project that uses Pip or Conda, you can migrate it to Poetry without too much difficulty. Poetry uses its own pyproject.toml file to manage project dependencies and configurations. To start using Poetry in your project, you can follow these steps:
1. Install Poetry, using either curl or Pip:
curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python -
2. Navigate to the root directory of your existing project.
3. Initialize Poetry in your project directory:
poetry init
This command will guide you through a series of prompts to configure the initial settings for your project.
4. Once initialization is complete, Poetry will create the pyproject.toml file in your project directory. Open the toml file to add or modify your project's dependencies.
5. Install the existing dependencies in your project:
poetry install
This will create a virtual environment and install the project dependencies inside it.
6. You can now use Poetry's run command to run your project's scripts, similar to how you would use Python or Conda commands.
poetry run python my_script.py
Poetry manages your project's virtual environment and dependency resolution, making it compatible with existing Pip or Conda projects. It simplifies dependency management and allows for consistent package installations across different environments.
Note: It is always good practice to back up your project before making significant changes to your configuration or dependency management tools.
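When migrating, you will usually want to carry over the packages already listed in requirements.txt. The following sketch builds a single poetry add command from such a file. It assumes simple specifiers like "pandas==1.4.2" or "numpy>=1.18.5" and skips pip options; poetry_add_command is a hypothetical helper, and you should review the printed command before running it.

```python
import shlex


def poetry_add_command(requirements_text):
    """Build a `poetry add ...` command line from requirements-style text."""
    specs = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):
            continue  # skip blanks and pip options such as "-r other.txt"
        specs.append(line)

    if not specs:
        return ""

    # Quote each specifier so characters like ">" and "<" survive the shell.
    return "poetry add " + " ".join(shlex.quote(spec) for spec in specs)


if __name__ == "__main__":
    example = """\
# runtime dependencies
pandas==1.4.2
numpy>=1.18.5
requests
"""
    print(poetry_add_command(example))
    # -> poetry add pandas==1.4.2 'numpy>=1.18.5' requests
```

Running the generated command after poetry init lets Poetry resolve the packages together and record them in both pyproject.toml and the lock file.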
Final thoughts
Ensuring that the correct versions of packages are in your code environment is essential to getting the right results every time. Small changes in the backend of your code can alter the result. But it's also important to keep those packages and libraries up to date to take advantage of the innovations that each patch brings.
To manage these dependencies in your code, Poetry is a great tool for those working with more complex and diverse projects with a larger number of dependencies. While Pip and Conda are still viable options, they are better suited for smaller, less complex environments. Poetry is not for everyone: Pip has been around for a long time, and for simple projects it may be worth sticking with Pip alone for its ease of use.
But if you value organization and are willing to explore new tools to improve your process, Poetry is a tool you should consider. The expanded functionality from Pip to Poetry really makes a difference. We encourage you to try Poetry for yourself.
Original. Republished with permission.
Kevin Vu manages Exxact Corp Blog and works with many of its talented authors who write about different aspects of deep learning.