Nowadays, there is no way around Docker. It's a great technology that ensures that your software works in production the same way as on your laptop.
If you have a medium-sized Node.js or Ruby on Rails project with a handful of dependencies, this command can take a couple of minutes:
docker build -t orga/project:1.0.0 .
Especially if you have dependencies with native extensions like libxml or sassc, the build will take a long time.
The build time can be reduced dramatically if you use a Docker base image which already includes the majority of the needed dependencies. That's why, in most of my projects, I have a directory called docker_base, which contains a Dockerfile and the package manager files for my base image. For a typical Ruby on Rails project the Dockerfile would look like this:
FROM ruby:2.5-alpine
ENV RAILS_ENV=production
WORKDIR /usr/src/app_base
COPY docker_base/Gemfile .
COPY package.json .
RUN apk update && \
    apk add build-base && \
    apk add libxml2-dev && \
    apk add libxslt-dev && \
    apk add ruby-nokogiri && \
    apk add yarn && \
    apk add tzdata && \
    yarn install --production=true && \
    bundle config build.nokogiri --use-system-libraries && \
    bundle install --without development test
The first COPY command adds a Gemfile with the most important dependencies. This Gemfile does not include private (closed source) dependencies. That way the Docker base image can later be hosted in a public Docker Hub repository. The second COPY command adds the package.json file for the frontend dependencies.
The RUN command does all the hard work which we want to avoid on our daily builds. It installs all the native system libraries which are required for the Ruby and Node dependencies. In the last three lines we finally install all the Node and Ruby dependencies.
Let's build the base image from the project root, so that the docker_base/ path in the COPY command resolves:
docker build -t versioneye/base-web:1.0.0 -f docker_base/Dockerfile .
Now the regular Dockerfile in the root directory of the project can inherit from this base image. The first line of that Dockerfile would look like this:
FROM versioneye/base-web:1.0.0
The dependencies in your Gemfile and package.json might change every day. That's why it's important that you run the install steps again in your main Dockerfile! That way new dependencies will be installed as well. The build time still shrinks dramatically, because the base image already contains the majority of the dependencies. In my case I could reduce the build time from 5 minutes to under 1 minute!
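For completeness, here is a minimal sketch of what such a main Dockerfile could look like (the working directory, the lock files and the CMD are assumptions for illustration, not from my actual project):
FROM versioneye/base-web:1.0.0
WORKDIR /usr/src/app
# the real dependency files, including the private dependencies
COPY Gemfile Gemfile.lock package.json yarn.lock ./
# run the install steps again; gems already present in the base image
# and packages in the yarn cache are reused, so this is fast
RUN yarn install --production=true && \
    bundle install --without development test
COPY . .
CMD ["bundle", "exec", "rails", "server", "-b", "0.0.0.0"]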
What do you think about this method? Do you have another trick to reduce the build time of your daily Docker build?
Top comments (9)
You don't need separate Dockerfiles. Just use a multistage build: docs.docker.com/develop/develop-im...
Once you get past the initial cognitive load, multistage builds are quite easy to use and very powerful. In addition to using one stage for a toolchain and another for active development, similar to what you suggest here, copying just what's needed in production into a minimal runtime can make for some very compact images.
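For example, a compact sketch of the pattern (stage names and paths are just illustrative):
# build stage: full toolchain for compiling native extensions
FROM ruby:2.5-alpine AS builder
WORKDIR /usr/src/app
COPY Gemfile Gemfile.lock ./
RUN apk add --no-cache build-base libxml2-dev libxslt-dev && \
    bundle install --without development test

# runtime stage: only the runtime libraries, the installed gems and the app
FROM ruby:2.5-alpine
WORKDIR /usr/src/app
RUN apk add --no-cache libxml2 libxslt tzdata
COPY --from=builder /usr/local/bundle /usr/local/bundle
COPY . .
CMD ["bundle", "exec", "rails", "server", "-b", "0.0.0.0"]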
I know multi-stage builds and I'm even using them to avoid secrets in Docker images. But a multi-stage build doesn't improve your build time ;-) With a multi-stage build you can improve the size of a Docker image, but not the build time. And all the packages I'm adding to the base image I need in production. I cannot leave out libxml, for example, because then the HTML rendering would not work.
The method I described is really useful for improving the daily Docker build time. Of course, you can combine it with Docker multi-stage builds. That's something I'm doing as well. I will write another blog post about that topic.
Sorry, I don't follow. How does it not improve your build time? Are you not caching layers?
If you run your builds on a CI server, all the dependencies, in all Docker stages, have to be installed on each build. Right? But if you refer to a base image that already has all the dependencies, it saves time because that image only needs to be downloaded. Correct me if I'm wrong. Always willing to improve my knowledge :)
Ah, I guess I misunderstood the post. I thought you were talking about speeding up builds in a local dev context, not CI.
My company uses codefresh.io which provides Docker layer caching in builds, but that's not typical. With most CI (CircleCI, TravisCI, the dreaded Jenkins) you wouldn't get this out of the box. You would if you had your own permanent CI server I guess, but who the heck does that anymore? :D
I have a little experience in rolling my own Docker caching in CI. You can try using the --cache-from flag (docs.docker.com/engine/reference/c...), but it ends up being a bunch of extra scripting with probably a low ROI vs. doing as you suggest and building your own base image.
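Roughly, the extra scripting looks like this (the image name is just an example):
# pull the last successful image so its layers can seed the cache
docker pull orga/project:latest || true
# reuse matching layers from the pulled image during the build
docker build --cache-from orga/project:latest -t orga/project:latest .
docker push orga/project:latest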
That's pretty amazing! Would you use this to distribute applications inside Docker images as well, or only as runtimes for the code on your local disk? I'm only using containers as runtime environments at the moment, using a lot of volumes, but I'm debating whether bundling the whole app inside of an image is better in any way.
Yes! I would use it to distribute applications inside of Docker images. Actually, that's what I'm doing. But if you use Docker images to distribute your applications to customers, then be careful not to leave secrets inside of the Docker image. Sometimes you need secret credentials at build time, to fetch a package from a private repository for example. Even if you delete the secret during the build, it will stick around in the layers of the Docker image. To solve this problem you need Docker multi-stage builds, as described here: docs.docker.com/develop/develop-im...
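A small sketch of that idea (the build argument and the gem source are made up for illustration):
# build stage: the secret only exists here
FROM ruby:2.5-alpine AS builder
ARG PRIVATE_GEM_TOKEN
WORKDIR /usr/src/app
COPY Gemfile Gemfile.lock ./
# pass the credentials via an environment variable for this one step,
# so nothing is written to a config file inside the image
RUN apk add --no-cache build-base && \
    BUNDLE_GEMS__EXAMPLE__COM=$PRIVATE_GEM_TOKEN \
    bundle install --without development test

# final stage: contains the installed gems, but none of the builder layers
FROM ruby:2.5-alpine
COPY --from=builder /usr/local/bundle /usr/local/bundle
The final image does not include the builder stage's layers, so the token does not end up in the image you ship.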
Hey Robert,
Thank you for your post. I was just wondering how you ensure that your base image is up to date, without having it rebuilt every time you push or merge a branch on your CI. Is there a way to programmatically detect if the dependencies have changed, and only then rebuild the base image?
Hey Eduardo,
I'm using VersionEye on every git push to get notified about outdated dependencies, security vulnerabilities and license violations. The base images get updated once a month. That's not automated yet. You can do it at the end of each Scrum sprint, for example, or on the first working day of the month. That depends on your software development cycle.
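If you want to automate it, one possible approach (just a sketch, not something I have in place) is to tag the base image with a checksum of the dependency files and only build when that tag does not exist yet:
# derive the tag from the dependency files of the base image
TAG=$(cat docker_base/Gemfile package.json | sha256sum | cut -c1-12)
# if no image exists for this checksum, the dependencies changed
# and the base image has to be rebuilt
docker pull versioneye/base-web:$TAG || \
  ( docker build -t versioneye/base-web:$TAG -f docker_base/Dockerfile . && \
    docker push versioneye/base-web:$TAG )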