Recently I wrote a post on using Docker for Jupyter and Python with the Intel Python2 distribution for data science. Some of the feedback I got suggested using virtualenv instead of Docker.
I started using Docker out of a desire to learn how to make portable environments that can be moved from one machine to another with ease. Docker also comes with the benefit of adding other services to an application through the use of multiple containers. I am currently learning to use docker-compose to create applications with multiple containers, which allows adding services such as PostgreSQL, MongoDB, Jupyter notebooks, or Apache Hadoop.
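A minimal sketch of what such a docker-compose.yml could look like, pairing a Jupyter notebook with PostgreSQL (image names and settings are illustrative):

```yaml
version: "3"
services:
  notebook:
    image: jupyter/scipy-notebook      # illustrative Jupyter image
    ports:
      - "8888:8888"
    volumes:
      - ./notebooks:/home/jovyan/work  # keep notebooks on the host
    depends_on:
      - db
  db:
    image: postgres:11
    environment:
      POSTGRES_PASSWORD: example       # illustrative; use real secrets in practice
    volumes:
      - pgdata:/var/lib/postgresql/data
volumes:
  pgdata:
```

A single docker-compose up then starts both services together.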
What do you typically use and why?
References
Cover image sourced from Docker Wallpapers
Top comments (20)
Good question! I use Pipenv when developing Python applications locally.

pipenv is now the official Python recommendation for managing isolated environments. I use pipenv because it makes isolated environment management incredibly simple (a quick sketch of the workflow follows this exchange). It's important to note that you aren't facing a mutually exclusive choice here: there are Dockerized applications that use multiple isolated environments within their architecture. The boundary for me when deciding to use Docker is the scale at which the application will be used. If we're talking about a production-ready web application that needs to scale to thousands of users, then containerize away! However, if we're talking about an internal tool that a team of data scientists distributes locally to use on their personal machines, then you may not need Docker. Honestly, I don't see a problem with Dockerizing the second use case either, as long as it doesn't create an unnecessary barrier to entry for the team.

Oh interesting! Thanks for the link, I will definitely look into it more. I appreciate you responding; the idea of containerizing large-scale production versus small-scale in-house applications makes sense. Some applications I have seen for in-house use don't necessarily need the container, while others seem more suited to it.
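The day-to-day pipenv workflow is roughly this (package names are just examples):

```shell
pip install --user pipenv    # install pipenv itself

cd my-project
pipenv install requests      # creates the virtualenv plus Pipfile/Pipfile.lock
pipenv install --dev pytest  # a development-only dependency

pipenv shell                 # enter the isolated environment
pipenv run python app.py     # or run a single command inside it
```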
Anything you are going to deploy to a server, you should use docker for unless you have a really good reason not to. It's a better alternative to having a lot of custom scripts that install whatever onto a server, and it will just run on most server platforms. Something a lot of javascript, ruby, and python hackers have always struggled with is the notion that system administrators generally don't like dealing with software that craps all over the filesystem: requiring root permissions, dumping files all over /usr, /lib, etc. Things like virtualenv really have no business being on a server: they are development tools intended for developers on their workstations. Having any kind of custom install procedure for software on servers is just not a thing these days.
If you are using docker anyway, using docker build to build, package, and run your application locally also makes a lot of sense. All it requires is adding a Dockerfile to your repo. It's really easy to set up, and as a side effect you document exactly what your application needs to run. You can pick whatever python version you need, whatever libraries and binaries need to be there, and so on. And if you then still want to run virtualenv or similar locally, you still can.
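A minimal sketch of such a Dockerfile for a Python app (app.py and requirements.txt are illustrative names):

```dockerfile
# Pin the exact interpreter version the application needs
FROM python:3.7-slim

WORKDIR /app

# Install dependencies first so this layer is cached across code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy in the rest of the source
COPY . .

CMD ["python", "app.py"]
```

With that in the repo, docker build -t myapp . and docker run --rm myapp are the entire build-and-run story.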
How do you square this with text editors that expect packages to be available locally, and that give linting errors (not to mention no hints on how to use the API) when your packages are all sitting in a running docker container?
That makes sense! I have worked on servers in the past and I can see how it would be a benefit to use Docker in that case. Plus, like you mentioned, it helps in picking the exact version of Python along with all the libraries and such that are needed.
I use both docker and virtualenv for local development. I do this so I can containerize my editor, shell customizations, tmux, etc. Virtualenv is used to isolate the dependencies of my different projects.
The following command runs my development environment:
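Roughly like this (a hypothetical sketch; the image name and mount points are illustrative, the real command is in the repo linked below):

```shell
# Mount the current project and ssh keys into the dev container, start tmux
docker run -it --rm \
  -v "$PWD":/home/dev/workspace \
  -v "$HOME/.ssh":/home/dev/.ssh:ro \
  aghost7/dev-base tmux
```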
You can find all my images here: github.com/AGhost-7/docker-dev
Nice.
I've thought about using docker or singularity in a similar way but haven't taken the time to implement anything yet. It's great that you shared your environment, good stuff.
You're seriously awesome for posting your dev environment. Such a helpful boost.
Thank you.
Cool thanks! I will take a look.
If the goal is only to run your software against a specific version of Python, I use pyenv.
I found it very easy to set up and run.
As soon as Python is on my machine, I install pyenv.
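The basic pyenv workflow, for reference (version numbers are just examples):

```shell
pyenv install 3.7.4    # download and build that interpreter
pyenv versions         # list the versions pyenv manages
pyenv global 3.7.4     # set the default for your user
pyenv local 2.7.16     # pin a version for the current project directory
```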
If you are looking for something similar for Java (I had to some time ago, and it was kind of hard to find 😄), you can use jenv (see below).
pyenv / pyenv
Simple Python version management
Simple Python Version Management: pyenv
pyenv lets you easily switch between multiple versions of Python. It's simple, unobtrusive, and follows the UNIX tradition of single-purpose tools that do one thing well.
This project was forked from rbenv and ruby-build, and modified for Python.
pyenv does...

- Let you change the global Python version on a per-user basis.
- Provide support for per-project Python versions.
- Allow you to override the Python version with an environment variable.

In contrast with pythonbrew and pythonz, pyenv does not...

- Depend on Python itself. pyenv was made from pure shell scripts.
- Need to be loaded into your shell. Instead, pyenv's shim approach works by adding a directory to your $PATH.
jenv / jenv
Manage your Java environment
Master your Java Environment with jenv
Website: jenv.be
Maintainers: @gcuisinier
Future maintainer in discussion:
As he has done incredible work taking the time to merge pull requests on his fork, I (@gcuisinier) am in discussion with him about joining jEnv directly, if he wants to. Whatever his decision, I thank him for his work, and for convincing me to think about the future of jEnv and to accept a new maintainer for the good of the project.
What's jEnv?

This is an updated fork of jenv, a beloved Java environment manager adapted from rbenv. jenv gives you a few critical affordances for using java on development machines:

- Switching between java versions. This is useful when developing Android applications, which generally require Java 8 for its tools, versus server applications, which use later versions like Java 11.

Oh cool! Thank you so much for sharing!
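For reference, day-to-day jenv usage looks roughly like this (paths and version names are illustrative):

```shell
jenv add /usr/lib/jvm/java-11-openjdk   # register an installed JDK with jenv
jenv versions                           # list the JDKs jenv knows about
jenv global 11.0                        # set the user-wide default
jenv local 1.8                          # pin Java 8 in the current project
```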
If you use docker for mac, take note of its performance when syncing files between host and containers. Recently my app suffered horribly slow responses. Luckily, there is the docker-sync gem, which can solve this problem. It's great :)
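If I remember its setup correctly, docker-sync is configured through a small docker-sync.yml beside your compose file, something like this (the sync name is illustrative):

```yaml
# docker-sync.yml: declare a synced volume for docker-sync to maintain
version: "2"
syncs:
  app-sync:                      # referenced as an external volume in compose
    src: './'                    # host directory to keep in sync
    sync_strategy: 'native_osx'  # strategy suited to Docker for Mac
```

After that, docker-sync start runs the sync daemon alongside your containers.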
I use both: we build ubuntu/windows containers with the driver/software requirements, like database drivers, middleware, and tools; then we apply python via pyenv (linux) or venv (windows) as another container layer, with the python version as a parameter; then we apply the git repo with its requirements.txt, and then we run.

The toolchain is Nexus3 (docker image repo) and Jenkins container build and deployment via a Jenkinsfile. All code is in gitlab. We build, unit test, deploy, and integration test in containers.
I have yet to start using Docker at all, and it's entirely my bad. Most of my clients' apps are deployed on Heroku, which already has a container system that builds on git push.

Locally I use pyenv + pyenv-virtualenv to manage multiple Python versions and each project's isolated virtualenv. I also use pipenv to manage a project's dependencies. Once you push a project with a Pipfile to Heroku it automatically does the build, that's why I'm spoiled :-D
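(A Pipfile, for anyone who hasn't seen one, is just TOML; the packages here are examples:)

```toml
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
flask = "*"
requests = ">=2.20"

[dev-packages]
pytest = "*"

[requires]
python_version = "3.7"
```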
Still, I definitely need to learn Docker and Docker Compose at some point, for the same reasons you mentioned. In addition: staying as provider-agnostic as possible, easing scalability, and making things easier for new developers.
Thanks for the reminder :-)
I use both conda env and Docker on a daily basis. Because I deploy applications as containers on the cloud, I usually don't trust running the app on my laptop, even in a freshly spawned env; instead, I run it with docker-compose and mount the file system from my laptop to make sure that everything will be exactly the same way when it's deployed.
Interesting, I just started learning to use docker-compose instead of just having a Dockerfile.
I'm not a pythonista by trade, but I use it in some side projects and often find myself fumbling around. I'm comfortable using bundler for ruby and yarn/npm with node, and have even used some sbt with scala, but for some reason I've always found the dev workflow with virtualenv (and now pipenv) painful...
I've actually tried to work around some of this by falling back on docker to manage dependencies in one place and not worry about it, but that's not a solution: modern text editors like vscode expect to be able to resolve packages locally, and the API hints they give are valuable.
I use docker on a daily basis, and the great advantage I can see is its transparency. Whenever I look at a Dockerfile, I can easily tell what was done to the raw machine to prepare it.
Good point about the Dockerfile.