Series Introduction
Welcome to Part 2 of this blog series, which starts from the most basic example of a .NET 5 Web API in C# and follows the journey from development to production with a shift-left mindset. We will use Azure, Docker, GitHub, GitHub Actions for CI/CD, and Infrastructure as Code using Pulumi.
In this post we will be looking at:
- Optimising the Docker image size
TL;DR
We optimised the resulting Docker image from 210MB down to 109MB. It could be even smaller, about 69MB, if you are happy not to use AOT (ahead-of-time) compilation with .NET. We did this by using Alpine Linux, a small, lightweight Linux distribution: the .NET 5 SDK image restores, builds and publishes the app, and the final image is based on the minimal .NET runtime dependencies image (which is only 9MB!). We took advantage of the self-contained, PublishReadyToRun and PublishTrimmed publish options:
--self-contained=true \
-p:PublishReadyToRun=true \
-p:PublishTrimmed=true
GitHub Repository
peteking / Samples.WeatherForecast-Part-2
This repository is part of the blog post series, API's from Dev to Production - Part 2 on dev.to. Based on the standard .NET Weather API sample.
Requirements
We will be picking up where we left off in Part 1, which means you’ll need the end result from GitHub Repo - Part 1 to start with.
We have the same requirements as last time, but to save you some time, I've put them below anyway :)
Visual Studio Code - https://code.visualstudio.com/download
WSL 2 (Windows 10 only) - https://docs.microsoft.com/en-us/windows/wsl/install-win10
I’m using Windows 10. For other operating systems such as macOS and Linux, there will be small differences that are not covered here.
VS Code Extensions
C# - https://marketplace.visualstudio.com/items?itemName=ms-dotnettools.csharp
.gitignore Generator - https://marketplace.visualstudio.com/items?itemName=piotrpalarz.vscode-gitignore-generator
Docker - https://marketplace.visualstudio.com/items?itemName=ms-azuretools.vscode-docker
Dockerfile optimisation
A Dockerfile is built in distinct layers, and each of the following instructions creates a new layer. They refer to the Dockerfile from Part 1, which is shown below.
- FROM creates a layer from the mcr.microsoft.com/dotnet/aspnet:5.0 Docker image.
- COPY adds files from your source to the destination directory.
- RUN in our case will restore, build and publish using dotnet.
FROM mcr.microsoft.com/dotnet/aspnet:5.0 AS base
WORKDIR /app
EXPOSE 80
FROM mcr.microsoft.com/dotnet/sdk:5.0 AS build
WORKDIR /app
COPY . .
WORKDIR /app/src/Samples.WeatherForecast.Api
RUN dotnet restore "Samples.WeatherForecast.Api.csproj"
RUN dotnet build "Samples.WeatherForecast.Api.csproj" -c Release -o /app/build --no-restore
FROM build AS publish
RUN dotnet publish "Samples.WeatherForecast.Api.csproj" -c Release -o /app/publish
FROM base AS final
WORKDIR /app
COPY --from=publish /app/publish .
ENTRYPOINT ["dotnet", "Samples.WeatherForecast.Api.dll"]
Docker build cache
When building an image, Docker steps through the instructions in your Dockerfile, executing each in the order specified. As each instruction is examined, Docker looks for an existing image in its cache that it can reuse, rather than creating a new (duplicate) image.
If you do not want to use the cache at all, you can use the --no-cache=true option on the docker build command. However, if you do let Docker use its cache, it is important to understand when it can, and cannot, find a matching image. The basic rules that Docker follows are outlined below:
Starting with a parent image that is already in the cache, the next instruction is compared against all child images derived from that base image to see if one of them was built using the exact same instruction. If not, the cache is invalidated.
In most cases, simply comparing the instruction in the Dockerfile with one of the child images is sufficient. However, certain instructions require more examination and explanation.
For the ADD and COPY instructions, the contents of the file(s) in the image are examined and a checksum is calculated for each file. The last-modified and last-accessed times of the file(s) are not considered in these checksums. During the cache lookup, the checksum is compared against the checksum in the existing images. If anything has changed in the file(s), such as the contents and metadata, then the cache is invalidated.
Aside from the ADD and COPY commands, cache checking does not look at the files in the container to determine a cache match. For example, when processing a RUN apt-get -y update command the files updated in the container are not examined to determine if a cache hit exists. In that case just the command string itself is used to find a match.
Once the cache is invalidated, all subsequent Dockerfile commands generate new images and the cache is not used.
It’s preferable to use COPY instead of ADD.
This is simply because the term COPY is more transparent in terms of what it is doing, whereas ADD supports features like local-only tar extraction and remote URL support that are not immediately obvious.
See references [1]
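The cache rules above suggest one common way to order a Dockerfile, sketched here against the Part 1 build stage. This is an illustration rather than the Dockerfile used in this series: it assumes the project has no other project references, and that bin/ and obj/ are excluded through a .dockerignore file so the second COPY does not overwrite the restore output.

```dockerfile
FROM mcr.microsoft.com/dotnet/sdk:5.0 AS build
WORKDIR /app/src/Samples.WeatherForecast.Api

# Copy only the project file first. Its checksum changes only when the
# .csproj changes, so the restore layer below is reused on most rebuilds.
COPY src/Samples.WeatherForecast.Api/Samples.WeatherForecast.Api.csproj .
RUN dotnet restore "Samples.WeatherForecast.Api.csproj"

# Copy the remaining source. Editing any of these files invalidates the
# cache from this instruction onwards, but not the restore above.
# Assumption: .dockerignore excludes bin/ and obj/.
COPY src/Samples.WeatherForecast.Api/ .
RUN dotnet build "Samples.WeatherForecast.Api.csproj" -c Release -o /app/build --no-restore
```

With COPY . . placed before the restore, any source change invalidates the restore layer as well; splitting the copy trades an extra instruction for faster rebuilds.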
How can we optimise?
For starters, we can use a different base image with a different OS (operating system); one of the smallest is Alpine Linux. Alpine is great: it’s small and not bloated with anything extra. Even curl isn’t installed!
With .NET, the question becomes: how do I know what’s available?
Well, Docker Hub is your best friend. It has everything you need to know about the images that companies and individuals have published.
Let’s take a look at .NET 5 from Microsoft.
You can see there are over 1 billion downloads overall!
The build
What do we actually need to build? When you think about it, you need the SDK (Software Development Kit) in order to build your projects.
If you navigate to ‘dotnet/sdk’ you’ll see a page with all the relevant information you need about the SDK.
From there you can see the various versions (tags) that are available; we want Alpine. Several tags are listed for Alpine, but they all equate to the same Dockerfile and therefore the same image. For us, the tag ‘5.0-alpine’ is enough.
Always specify a version
It’s best practice to pin a version so that what you build on is a known quantity. You don’t want to be caught out by the ‘latest’ tag: if you build later on and the image behind ‘latest’ has changed but you haven’t tested it, you may run into issues.
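As a small illustration of the difference, the two lines below contrast a pinned tag with an unpinned one. Note that 5.0-alpine is itself a moving tag within the 5.0 line (it follows patch releases), so treat it as pinning the major and minor version only.

```dockerfile
# Pinned: later builds keep using a .NET 5.0 SDK on Alpine.
FROM mcr.microsoft.com/dotnet/sdk:5.0-alpine AS build

# Unpinned (avoid): the image behind this tag can change between builds.
# FROM mcr.microsoft.com/dotnet/sdk:latest AS build
```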
Rewrite the Dockerfile
Let’s start by rewriting our Dockerfile. Don’t worry, it will still only be about 20 lines of code.
ARG VERSION=5.0-alpine
FROM mcr.microsoft.com/dotnet/sdk:${VERSION} AS build
WORKDIR /app
Here you can see a couple of things: we are using ARG to hold 5.0-alpine, and we use that variable in the FROM statement with the dollar sign and curly braces, ${ }.
If you’re wondering what this FROM is and why we have more than one (even in our generated Dockerfile from Part 1), this is known as a multi-stage build. We essentially take advantage of different base images to do certain things, and use another final image to ensure it is as small as it can be. For more information about multi-stage builds, please see Docker Docs - Multistage Build.
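One detail about ARG that the snippet above relies on, sketched here as an aside and not taken from the sample repository: an ARG declared before the first FROM sits outside every build stage. It can be used in any FROM line, but it has to be redeclared inside a stage before instructions in that stage can read it. The RUN echo line is purely illustrative.

```dockerfile
ARG VERSION=5.0-alpine
FROM mcr.microsoft.com/dotnet/sdk:${VERSION} AS build

# Redeclare without a value to bring the global ARG into this stage.
ARG VERSION
RUN echo "Building with SDK tag ${VERSION}"
```

The default can also be overridden at build time with the --build-arg option of docker build, which is handy for trying a different tag without editing the Dockerfile.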
DotNet restore
Next up is dotnet restore. We restore first because if that fails, there is certainly no point in going any further with the build. We will also restore for a specific runtime: because we are targeting x64 Linux, we can specify it with the -r or --runtime option. For more information about the options available for dotnet restore, please see Microsoft Docs - dotnet restore.
Our Dockerfile now looks like:
# Copy and restore as distinct layers
COPY . .
WORKDIR /app/src/Samples.WeatherForecast.Api
RUN dotnet restore Samples.WeatherForecast.Api.csproj -r linux-musl-x64
Here you can see two more things: the COPY command and the WORKDIR command. COPY copies everything from the source to our destination, hence the period . for the source and the period . for the destination. Next, WORKDIR changes the current directory.
DotNet build (publish)
Now we move on to the build, or rather publish. We go straight to publish because publish also builds, but what we don’t want it to do is restore again, so we use the --no-restore option. It is down to personal preference: you may want to run dotnet build before dotnet publish and then also specify --no-build, but I’ll leave it up to you to decide. Here, I’m going to go straight to publish.
FROM build AS publish
RUN dotnet publish \
-c Release \
-o /out \
-r linux-musl-x64 \
--self-contained=true \
--no-restore \
-p:PublishReadyToRun=true \
-p:PublishTrimmed=true
There are a few more dotnet publish options happening here.
- -c defines the build configuration, in our case Release.
- -o specifies the path for the output directory.
- -r or --runtime specifies the target runtime, just like in dotnet restore. Please see the RID catalogue for more information: https://docs.microsoft.com/en-us/dotnet/core/rid-catalog
- --self-contained publishes the .NET runtime with the application so the runtime doesn't need to be installed in the image.
- --no-restore doesn't execute an implicit restore when running the command.
- -p:PublishReadyToRun=true compiles the app's assemblies to the ReadyToRun (R2R) format. R2R is a form of ahead-of-time (AOT) compilation. This will make our start-up time a great deal faster, but it comes at the cost of size.
- -p:PublishTrimmed=true trims unused libraries to reduce the deployment size of an app when publishing a self-contained executable. Take great care in using this option: the trimmer relies on static analysis, so code that is only reached through reflection can be removed and then fail at runtime. For more information, please see Microsoft Docs - dotnet trim-self-contained.
What is linux-musl-x64?
It is the runtime identifier for lightweight distributions using musl, like Alpine Linux.
See references [2]
The final stage
Each FROM starts a new stage, so we can now use FROM to switch to a different base image. This is the most important part to take note of with Docker multi-stage builds.
We don’t want the SDK in the final runtime image: for one reason, it’s too large, but most importantly we don’t actually need it!
We have compiled ahead of time (AOT) and included the .NET runtime as part of our app.
We go to Docker Hub again, where we can find the runtime dependency images. The image name is mcr.microsoft.com/dotnet/runtime-deps and it has a bunch of tags like we’ve seen previously, so we’ll attach ${VERSION} to it.
For more information, please see Microsoft Docs - dotnet Runtime Dependencies
# Final stage/image
FROM mcr.microsoft.com/dotnet/runtime-deps:${VERSION}
WORKDIR /app
COPY --from=publish /out .
We set the WORKDIR again and COPY the output from the publish stage. If you noted the directory given with the -o option, /out, this is where the dotnet publish command wrote its output.
Finally, we can expose the port and specify an entry point.
EXPOSE 8080
ENTRYPOINT ["./Samples.WeatherForecast.Api"]
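You may have clocked it already, but if you haven’t, our ENTRYPOINT looks a little different compared to what it was in Part 1. This is because we publish with --self-contained=true: the output includes a native executable for the app, and the runtime-deps image has no dotnet command to launch a .dll with.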
Full Dockerfile
ARG VERSION=5.0-alpine
FROM mcr.microsoft.com/dotnet/sdk:${VERSION} AS build
WORKDIR /app
# Copy and restore as distinct layers
COPY . .
WORKDIR /app/src/Samples.WeatherForecast.Api
RUN dotnet restore Samples.WeatherForecast.Api.csproj -r linux-musl-x64
FROM build AS publish
RUN dotnet publish \
-c Release \
-o /out \
-r linux-musl-x64 \
--self-contained=true \
--no-restore \
-p:PublishReadyToRun=true \
-p:PublishTrimmed=true
# Final stage/image
FROM mcr.microsoft.com/dotnet/runtime-deps:${VERSION}
WORKDIR /app
COPY --from=publish /out .
EXPOSE 8080
ENTRYPOINT ["./Samples.WeatherForecast.Api"]
Let’s build it
Navigate to your repo root directory and execute the docker build command.
docker build -t samples-weatherforecast:v2 .
We have changed the tag so we can compare the result with the image from Part 1.
Now, execute docker image ls to see your images.
Notice the size difference: we’ve been able to reduce the size from 210MB down to 109MB, and this includes AOT compilation.
With AOT, you do pay a price in size. You need to decide whether that extra size is worth paying for faster start-up, or whether you’d rather have an even smaller image and let the JIT compile at runtime.
If image size is more important to you, simply remove the -p:PublishReadyToRun=true option. The ENTRYPOINT can stay as it is, because the app is still published as a self-contained executable. Your end image size will be about 69MB!
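A sketch of that smaller variant: the publish stage from the full Dockerfile with only the ReadyToRun option removed. It assumes the rest of the Dockerfile stays as shown above; because the publish is still self-contained, the Samples.WeatherForecast.Api executable should still be produced.

```dockerfile
FROM build AS publish
# Same as before, minus -p:PublishReadyToRun=true: the assemblies are
# JIT-compiled at runtime instead of being precompiled.
RUN dotnet publish \
    -c Release \
    -o /out \
    -r linux-musl-x64 \
    --self-contained=true \
    --no-restore \
    -p:PublishTrimmed=true
```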
Image size is not everything, and a smaller image does not by itself mean a more secure one. However, it does help: the less tooling a base image includes, the less there is for an attacker to exploit.
You’ll also notice the TAG column, and hopefully seeing this will really hit home what a tag is, and why the concept of latest can be misleading...
You can see latest is 25 hours old, whereas v2 is about 1 hour old. A tag can be anything; just because I’ve used the term latest, it doesn’t actually mean anything. In our case, the latest version is really the image with the v2 tag.
It comes down to good practice around image tags. By convention latest should point to the latest production image, but relying on latest is no longer considered good practice; prefer explicit version tags.
Let’s test it
We should test the image to make sure it serves requests as we expect.
docker run -it --rm -p 8080:80 samples-weatherforecast:v2
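Don’t forget the tag change to v2. The mapping publishes host port 8080 to container port 80, which is where the app listens by default because the .NET 5 base images set ASPNETCORE_URLS to http://+:80; EXPOSE only documents a port and does not change where the app listens.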
Let’s use Postman again... It works!
What have we learned?
We have learned about multi-stage builds in Docker, how the build cache works, and how to really optimise your final image. There are many different base images supporting different operating systems: Alpine, Debian (Buster), Ubuntu and more. This gives engineers the flexibility to choose what is right for their solution. For this example, we’ve opted for something very minimal and lightweight.
If you're new to Docker, I recommend spending more time getting to know the other instructions that are available in a Dockerfile.
Up next
Part 3 in this series will be about:
GitHub Actions - Build the docker image
GitHub Actions - Publish the docker image to the GitHub Container Registry
More information
- https://www.azure.com
- https://dotnet.microsoft.com/
- https://www.github.com
- https://www.docker.com
- https://www.pulumi.com
References
[1] - https://docs.docker.com/develop/develop-images/dockerfile_best-practices/
[2] - https://docs.microsoft.com/en-us/dotnet/core/tools/dotnet-publish