Poetry is a dependency management and packaging tool for Python. It makes it easy to create and share data science work without worrying about version conflicts.
Poetry creates a separate virtual environment for each project, lets you add and track packages, and can even publish your work to PyPI. It also makes it easy for fellow data scientists to reproduce a Jupyter notebook: running poetry install recreates the same virtual environment.
1. Why use Poetry?
You might wonder why you should use Poetry at all. For me, there were several reasons.
Great dependency resolver
With conda I sometimes ran into odd dependency issues that forced me to spend time on Stack Overflow digging through posts to resolve them. Poetry has a dependency resolver that finds a compatible set of package versions automatically, if one exists.
Built for both dependency management and packaging
Poetry not only tracks packages and resolves dependency conflicts, it also helps you publish packages to PyPI. So it's well worth learning if you plan to release your own package at some point.
Lots of people are using it
Many great data science notebooks are packaged with Poetry, so it is useful to understand how it works and to have it installed.
2. Installing poetry
The recommended way of installing poetry is via a custom installer, which will isolate the package from the rest of the system.
On macOS, Linux, or WSL:
curl -sSL https://install.python-poetry.org | python3 -
On Windows:
(Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | python -
To make sure it's installed correctly, let's check the poetry version.
$ poetry --version
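If you'd rather run the same check from Python (for example from a notebook cell), a minimal sketch could look like this. The helper name poetry_version is my own and not part of Poetry; it only uses the standard library and returns None when no poetry executable is found on the PATH.

```python
import shutil
import subprocess


def poetry_version():
    """Return the text printed by `poetry --version`, or None if poetry is not on PATH.

    Hypothetical helper for illustration; it is not part of Poetry itself.
    """
    executable = shutil.which("poetry")
    if executable is None:
        return None
    result = subprocess.run(
        [executable, "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    version = poetry_version()
    if version is None:
        print("poetry was not found on PATH; you may need to restart your shell")
    else:
        print(version)
```

If this reports that poetry was not found right after installation, the installer's bin directory is probably not on your PATH yet.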
3. Creating a new project
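The poetry new command scaffolds a new project folder. Poetry then creates a dedicated virtual environment for the project the first time you add or install packages.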
$ poetry new jupyter-demo
The project consists of several files; the most important one is pyproject.toml, which contains a list of all project dependencies.
4. Installing Jupyter
To add packages to a Poetry project, move into the project folder (cd jupyter-demo) and use the poetry add command.
$ poetry add jupyter ipykernel
Once Jupyter is installed, let's add a couple of packages often used in data science.
$ poetry add pandas tensorflow
To uninstall a package, we can use the poetry remove command.
$ poetry remove tensorflow
Poetry lets us see the details of a particular package installed in our virtual environment by running poetry show with the package name.
$ poetry show pandas
5. Track your packages
While we could always look up the packages in the pyproject.toml file, we can also list them on the command line: running poetry show --tree prints all packages together with their dependencies.
To check whether we're using the latest versions available on PyPI, just run poetry show --latest.
For compatibility, Poetry can also export the dependencies to a requirements.txt file. (In recent Poetry releases the export command may require installing the poetry-plugin-export plugin first.)
$ poetry export -f requirements.txt --output requirements.txt
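If you want to inspect the exported file programmatically, the following is a rough sketch of how the pinned versions could be pulled out of it. The sample text only imitates the general shape of an exported file (pinned versions, optional environment markers after a semicolon, optional --hash lines); the package versions and the hash are placeholders, and pinned_versions is a name I made up.

```python
# Placeholder content imitating an exported requirements.txt; values are not real.
SAMPLE_REQUIREMENTS = '''\
numpy==1.26.4 ; python_version >= "3.10" \\
    --hash=sha256:aaaa
pandas==2.2.2 ; python_version >= "3.10"
'''


def pinned_versions(requirements_text):
    """Map package names to pinned versions, ignoring markers, hashes, and comments."""
    pins = {}
    for raw in requirements_text.splitlines():
        # Drop surrounding whitespace and a trailing line-continuation backslash.
        line = raw.strip().rstrip("\\").strip()
        if not line or line.startswith("#") or line.startswith("--"):
            continue
        # Everything after ';' is an environment marker, not part of the pin.
        requirement = line.split(";", 1)[0].strip()
        if "==" in requirement:
            name, version = requirement.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


if __name__ == "__main__":
    for name, version in pinned_versions(SAMPLE_REQUIREMENTS).items():
        print(name, version)
```

This is deliberately simple and does not handle every requirements.txt feature (URLs, extras, editable installs); for anything serious a dedicated parser is the safer choice.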
6. Run the Jupyter notebook
To run any executable from our newly created environment, we need to use the poetry run command. Let's spin up the Jupyter server.
$ poetry run jupyter notebook
Copy one of the URLs printed in your terminal and paste it into the browser.
So now we have a Jupyter environment running and can create a new notebook. I've just created Test_notebook.ipynb in the jupyter_demo folder.
7. Publish Notebook on GitHub
Once the notebook is ready and the environment with all packages is set up, it's time to share your work on GitHub!
First, we need to lock the package versions we use with the poetry lock command, so that people who work with our code can create the same virtual environment. The exact versions are written to the poetry.lock file, which should be committed along with pyproject.toml.
$ poetry lock
Once the packages are locked, you can initialise a local git repository, add all files, and make the initial commit.
$ git init -b main
$ git add .
$ git commit -m "My first commit"
Then create a repository in the GitHub UI without initialising it, and copy the repo URL.
$ git remote add origin <YOUR_REPO_URL>
$ git push -u origin main
This pushes all your files to GitHub, so other people can easily access your work and set up the same environment.
8. Run notebook from existing poetry project
To reproduce a notebook published by somebody else, we just need to clone the repo and set up the virtual environment by running poetry install.
$ git clone <YOUR_REPO_URL>
$ cd <your_repo>
$ poetry install
Once the files are copied and the virtual environment is created, we can spin up the notebook, just like before.
$ poetry run jupyter notebook
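If you reproduce other people's notebooks often, the three commands above could be wrapped in a small script. This is only a sketch under a few assumptions: git and poetry are both on your PATH, and the function names reproduce_steps and reproduce are my own. The repo URL and folder name are whatever you pass in.

```python
import subprocess


def reproduce_steps(repo_url, target_dir):
    """Return (command, working_directory) pairs mirroring the manual steps."""
    return [
        (["git", "clone", repo_url, target_dir], "."),
        (["poetry", "install"], target_dir),
        (["poetry", "run", "jupyter", "notebook"], target_dir),
    ]


def reproduce(repo_url, target_dir):
    """Run each step in order and stop at the first command that fails."""
    for command, cwd in reproduce_steps(repo_url, target_dir):
        subprocess.run(command, cwd=cwd, check=True)


# Example call (not executed here); substitute your own repo URL and folder:
# reproduce("<YOUR_REPO_URL>", "<your_repo>")
```

Keeping the list of steps separate from the code that runs them makes it easy to print or review the commands before anything is executed.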
This is how, in a few easy steps, we can share and reproduce Jupyter notebooks with Poetry. I hope this inspires you to give Poetry a try!