There is no doubt about the fact that Docker makes it very easy to deploy multiple applications on a single box. Be it different versions of the same tool, different applications with different version dependencies - Docker has you covered. But then nothing comes free. This flexibility comes with some problems - like high disk usage and large images. With Docker, you have to be careful about writing your Dockerfile efficiently in order to reduce the image size and also improve the build times.
Docker provides a set of standard practices to follow in order to keep your image size small - also covers multi-stage builds in brief.
Multi-stage builds are specifically useful for use cases where we build an artifact, binary or executable. Usually, there are lots of dependencies required for building the binary - for example - GCC, Maven, build-essentials, etc., but once you have the executable, you don’t need those dependencies to run the executable. Multi-stage builds use this to skim the image size. They let you build the executable in a separate environment and then build the final image only with the executable and minimal dependencies required to run the executable.
For example, here’s a simple application written in Go. All it does is to print “Hello World!!” as output. Let’s start without using multi-stage builds.
Dockerfile
FROM golang
ADD . /app
WORKDIR /app
RUN go build # This will create a binary file named app
ENTRYPOINT /app/app
- Build and run the image
docker build -t goapp .
~/g/helloworld ❯❯❯ docker run -it --rm goapp
Hello World!!
- Now let us check the image size
~/g/helloworld ❯❯❯ docker images | grep goapp
goapp latest b4221e45dfa0 18 seconds ago 805MB
- New Dockerfile
# Build executable stage
FROM golang
ADD . /app
WORKDIR /app
RUN go build
ENTRYPOINT /app/app
# Build final image
FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=0 /app/app .
CMD ["./app"]
- Re-build and run the image
docker build -t goapp .
~/g/helloworld ❯❯❯ docker run -it --rm goapp
Hello World!!
- Let us check the image again
~/g/helloworld ❯❯❯ docker images | grep goapp
goapp latest 100f92d756da 8 seconds ago 8.15MB
~/g/helloworld ❯❯❯
We can see a massive reduction in the image size -> From 805 MB ** we are down to **8.15MB. This is mostly because the Golang image has lots of dependencies which our final executable doesn’t even require for running.
What’s happening here?
We are building the image in two stages. First, we are using a Golang base image, copying our code inside it and building our executable file App. Now in the next stage, we are using a new Alpine base image and copying the binary which we built earlier to our new stage. Important point to note here is that the image built at each stage is entirely independent.
- Stage 0
# Build executable stage
FROM golang
ADD . /app
WORKDIR /app
RUN go build
ENTRYPOINT /app/app
* Stage 1
# Build final image
FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=0 /app/app .
CMD ["./app”]
Note the line COPY —from=0 /app/app
-> this lets you access data from inside the image built in previous image.
How multi-stage builds work?
If you look at the process carefully multi-stage builds are not much different from actual Docker builds. The only major difference is that you build multiple independent images (1 per stage) and you get the capability to copy artifacts/files from one image to another easily. The feature which multi-stage build is providing right now was earlier archived from through scripts. People used to create the build image - copy artifact from it manually - and then copy it to a new image with no additional dependencies. In the above example, we build one image on Stage 0 and then in Stage 1 we build another image, to which we copy files from the older image - nothing complicated.
NOTE: We are copying /app
from one image to another - not one container to another.
This can speed up deployments and save cost in multiple ways
- You build efficient lightweight images - hence you ship lesser data during deployment which saves both cost and time.
- You can stop a multi-stage build at any stage - so you can use it to avoid the builder pattern with multi-stage builds. Have a single Dockerfile for both dev, staging and deployment.
The above was just a small example, multi-stage builds can be used to improve Docker images of applications written in other languages as well. Also, multi-stage builds can help to avoid writing multiple Dockerfiles (builder pattern) - instead a single Dockerfile with multiple stages can be adapted to streamline the development process. If you haven't explored it already - go ahead and do it.
Top comments (0)