Nathan Cook

Posted on May 17, 2023 • Edited on Oct 14, 2023

Creating a development Dockerfile and docker-compose.yml for yarn 1.22 monorepos using Turborepo.

#turborepo #docker #node #monorepo

If you've ever worked in a yarn monorepo project with inter-dependent workspaces, you know that creating docker containers for development in a way that avoids unnecessary rebuilds can be a challenge.

Turborepo, "...an intelligent build system optimized for JavaScript and TypeScript codebases", provides tools that make this task much easier.

Let's assume a project that looks something like this:

- /project-directory
    - /apps
        - /frontend
            - package.json
            - other files...
        - /backend
            - package.json
            - other files...
    - /packages
        - /shared-stuff
            - package.json
            - other files...
    - .dockerignore
    - package.json
    - turbo.json
    - Dockerfile.dev
    - docker-compose.yml

.dockerignore

**/node_modules
**/.next
**/dist

This ensures that node_modules and build artifacts are excluded when copying directories to the container from the host in our Dockerfile.

package.json

{
  "name": "turbo-docker-monorepo",
  "version": "1.0.0",
  "main": "index.js",
  "license": "MIT",
  "private": true,
  "workspaces": ["apps/*", "packages/*"],
  "packageManager": "yarn@1.22.19",
  "devDependencies": {
    "turbo": "^1.9.6"
  }
}

This is the minimal package.json you'll need for a yarn monorepo that makes use of turborepo with workspaces in the 'apps' and 'packages' directories. Not much else to say!

turbo.json

{
  "$schema": "https://turbo.build/schema.json",
  "pipeline": {
    "build": {
      "dependsOn": ["^build"],
      "outputs": ["dist/**", ".next/**", "!.next/cache/**"]
    },
    "lint": {},
    "dev": {
      "cache": false,
      "persistent": true
    }
  }
}

This turbo.json file is taken from the basic turborepo example.

Let's assume that the backend app has the packages/shared-stuff workspace as a dependency (meaning the backend package.json has "shared-stuff": "*" in its list of dependencies).

We can use turborepo to build backend in a way that respects its dependence on the shared-stuff package with the command turbo run build --filter=backend, which will build (in order) the shared-stuff package and then the backend app. Cool!
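To make the workspace relationship concrete, here is a minimal sketch that scaffolds a throwaway copy of the two package.json files. Only the names backend and shared-stuff and the "shared-stuff": "*" entry come from the setup described above; the temp directory, versions, main field and scripts are illustrative assumptions.

```shell
# Scaffold a throwaway workspace layout (hypothetical values, for illustration only).
DEMO_DIR="$(mktemp -d)"
mkdir -p "$DEMO_DIR/apps/backend" "$DEMO_DIR/packages/shared-stuff"

# The shared package: its "name" is what other workspaces refer to.
cat > "$DEMO_DIR/packages/shared-stuff/package.json" <<'EOF'
{
  "name": "shared-stuff",
  "version": "1.0.0",
  "main": "dist/index.js",
  "scripts": { "build": "tsc" }
}
EOF

# The backend app: the "*" range matches whatever version the local workspace has,
# so yarn 1.x links packages/shared-stuff instead of fetching from the registry.
cat > "$DEMO_DIR/apps/backend/package.json" <<'EOF'
{
  "name": "backend",
  "version": "1.0.0",
  "scripts": { "dev": "node src/index.js" },
  "dependencies": { "shared-stuff": "*" }
}
EOF

# Show the line that creates the workspace dependency.
grep '"shared-stuff"' "$DEMO_DIR/apps/backend/package.json"
```

The name fields matter here: turbo's --filter and yarn workspace both address a workspace by its package name, not by its directory.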

Dockerfile.dev

# syntax=docker/dockerfile:1.5.2
# based on: https://github.com/vercel/turbo/blob/main/examples/with-docker/apps/api/Dockerfile

FROM node:20.2-alpine3.17 AS base

# installs apk build deps under the virtual name .gyp and adds turborepo globally.
# note: `apk del .gyp` then removes those apk packages again, so later stages do not have them
RUN apk add -f --update --no-cache --virtual .gyp nano bash libc6-compat python3 make g++ \
      && yarn global add turbo \
      && apk del .gyp

#############################################
FROM base AS pruned
WORKDIR /app
ARG APP

COPY . .

# see https://turbo.build/repo/docs/reference/command-line-reference#turbo-prune---scopetarget
RUN turbo prune --scope=$APP --docker

#############################################
FROM base AS installer
WORKDIR /app
ARG APP

COPY --from=pruned /app/out/json/ .
COPY --from=pruned /app/out/yarn.lock /app/yarn.lock

# Forces the layer to recreate if the app's package.json changes
COPY apps/${APP}/package.json /app/apps/${APP}/package.json

# see https://github.com/moby/buildkit/blob/master/frontend/dockerfile/docs/reference.md#run---mounttypecache
RUN \
      --mount=type=cache,target=/usr/local/share/.cache/yarn/v6,sharing=locked \
      yarn --prefer-offline --frozen-lockfile

COPY --from=pruned /app/out/full/ .
COPY turbo.json turbo.json

# For example: `--filter=frontend^...` means all of frontend's dependencies will be built, but not the frontend app itself (which we don't need to do for dev environment)
RUN turbo run build --no-cache --filter=${APP}^...

# re-running yarn ensures that dependencies between workspaces are linked correctly
RUN \
      --mount=type=cache,target=/usr/local/share/.cache/yarn/v6,sharing=locked \
      yarn --prefer-offline --frozen-lockfile

#############################################
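FROM base AS runner
WORKDIR /app
ARG APP
ARG START_COMMAND=dev
# ARG values exist only at build time; ENV makes them visible to the shell-form CMD at runtime
ENV APP=${APP} START_COMMAND=${START_COMMAND}

COPY --from=installer /app .

CMD yarn workspace ${APP} ${START_COMMAND}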

Let's go over what happens in each build stage. For the following discussion, assume that we're building the backend app service, which has shared-stuff as a dependency.

base

FROM node:20.2-alpine3.17 AS base

RUN apk add -f --update --no-cache --virtual .gyp nano bash libc6-compat python3 make g++ \
      && yarn global add turbo \
      && apk del .gyp

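This stage installs a set of apk packages and adds turborepo globally. Note that the packages are installed under the virtual name .gyp and removed again by apk del .gyp at the end of the same RUN, so they are only present while turbo is being installed. If your NPM modules need node-gyp during yarn install in a later stage, leave out the apk del step. The remaining stages are built from 'base'.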

pruned

FROM base AS pruned
WORKDIR /app
ARG APP

COPY . .

RUN turbo prune --scope=$APP --docker

This stage copies over the project files and runs turbo prune for the service in question. The $APP argument will be either 'frontend' or 'backend' (we're assuming the latter for this example) and is set in docker-compose.yml, which is covered last. Regarding the --docker flag, the Turborepo docs say:

With the docker flag, the prune command will generate a folder called out with the following inside of it:

- A folder json with the pruned workspace's package.jsons
- A folder full with the pruned workspace's full source code, but only including the internal packages that are needed to build the target.
- A new pruned lockfile that only contains the pruned subset of the original root lockfile with the dependencies that are actually used by the packages in the pruned workspace.

Assuming we're building the backend service container, here is what the 'out' directory in the 'pruned' stage would look like at this point:

- /out
    - /full
        - /apps
            - /backend (all files)
        - /packages
            - /shared-stuff (all files)
        - .gitignore
        - package.json
        - turbo.json
    - /json
        - /apps
            - /backend
                - package.json
        - /packages
            - /shared-stuff
                - package.json
        - package.json
    - yarn.lock

Note that the 'frontend' app is not present and (according to the turborepo docs) its dependencies were excluded from yarn.lock, which is what we want!

installer

FROM base AS installer
WORKDIR /app
ARG APP

# COPY 1
COPY --from=pruned /app/out/json/ .
COPY --from=pruned /app/out/yarn.lock /app/yarn.lock
COPY apps/${APP}/package.json /app/apps/${APP}/package.json

# RUN 1
RUN \
      --mount=type=cache,target=/usr/local/share/.cache/yarn/v6,sharing=locked \
      yarn --prefer-offline --frozen-lockfile

# COPY 2
COPY --from=pruned /app/out/full/ .
COPY turbo.json turbo.json

# RUN 2
RUN turbo run build --no-cache --filter=${APP}^...

# RUN 3
RUN \
      --mount=type=cache,target=/usr/local/share/.cache/yarn/v6,sharing=locked \
      yarn --prefer-offline --frozen-lockfile

The COPY 1 commands begin by copying the json directory in out from the pruned stage, as well as the pruned yarn.lock file. They also redundantly copy the app's package.json from the host, which I've found necessary for the container to recreate correctly when shared workspace dependencies change.

The first run command (RUN 1) runs yarn and uses a cache mount targeting the yarn cache directory, which will help speed up subsequent re-builds.

The next two statements (COPY 2) copy the contents of out/full from the pruned stage as well as turbo.json from the host.

Since this container runs in a dev environment, we can safely skip building the backend app itself. We do, however, need to build packages/shared-stuff, which in the current example is a dependency of backend. RUN 2 takes care of that with turbo run build --no-cache --filter=${APP}^... (the three trailing dots are part of the filter syntax). Because we're building the backend service container, the filter resolves to --filter=backend^..., which builds every workspace that backend depends on but not backend itself. That is exactly what we want.

This is a huge time saver for certain frameworks (looking at you, Next.js).
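As a quick illustration of how the APP build arg turns into a filter expression, here is a small shell sketch. It only assembles and prints the commands (turbo is not invoked); the variable names DEPS_ONLY_FILTER and WITH_APP_FILTER are made up for this example.

```shell
# APP mirrors the build arg that docker-compose.yml passes to Dockerfile.dev.
APP=backend

# "<name>^..." selects the workspaces <name> depends on, excluding <name> itself.
DEPS_ONLY_FILTER="${APP}^..."

# "<name>..." selects <name> together with the workspaces it depends on.
WITH_APP_FILTER="${APP}..."

echo "dev image (deps only): turbo run build --filter=${DEPS_ONLY_FILTER}"
echo "full build (app too):  turbo run build --filter=${WITH_APP_FILTER}"
```

The only difference between the two is the caret, so it is worth double-checking it when adapting the Dockerfile for a production build where the app itself has to be built as well.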

The final RUN command (RUN 3) is just a copy of the first (RUN 1). This final yarn call ensures that dependencies between workspaces are linked correctly.

runner

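FROM base AS runner
WORKDIR /app
ARG APP
ARG START_COMMAND=dev
# ARG values exist only at build time; ENV makes them visible to the shell-form CMD at runtime
ENV APP=${APP} START_COMMAND=${START_COMMAND}

COPY --from=installer /app .

CMD yarn workspace ${APP} ${START_COMMAND}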

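The final stage copies /app from the installer stage and sets the default command that starts the app. Keep in mind that ARG values exist only at build time, so the shell-form CMD sees APP and START_COMMAND at runtime only if they are also set as environment variables. The docker-compose.yml below sidesteps this by setting command explicitly.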
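A shell-form CMD is expanded by /bin/sh when the container starts, while ARG values are only available while the image is being built. The sketch below mimics that expansion in a plain shell, with no Docker involved, to show what the start command looks like with and without the variables in the environment. render_cmd is a hypothetical helper written just for this illustration.

```shell
# Expands the same string as the CMD, using whatever is set when it is called,
# just as /bin/sh would at container start.
render_cmd() {
  echo "yarn workspace ${APP} ${START_COMMAND}"
}

# Case 1: the values were only ever build args, so nothing is set at runtime.
WITHOUT_ENV="$(unset APP START_COMMAND; render_cmd)"

# Case 2: the values are present in the container's environment.
WITH_ENV="$(APP=backend; START_COMMAND=dev; render_cmd)"

echo "without env vars: '${WITHOUT_ENV}'"
echo "with env vars:    '${WITH_ENV}'"
```

In the first case the workspace name and script are simply missing from the command, which is why the values need to reach the container either through ENV in the Dockerfile or through an explicit command in docker-compose.yml.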

Now let's look at docker-compose.yml, specifically the config for the backend service (almost done!)

version: '3.8'

x-defaults:
  &defaults
  init: true
  tty: true
  networks:
    - my_monorepo_network

networks:
  my_monorepo_network:

services:
    backend:
        <<: *defaults
        ports:
          - "3333:3333"
        command: yarn workspace backend dev
        environment:
          - PORT=3333
        build:
          args:
            APP: backend
            START_COMMAND: dev
          context: .
          dockerfile: ./Dockerfile.dev
        volumes:
          - ./apps/backend:/app/apps/backend
          - /app/apps/backend/node_modules
    ...other services...

This is a pretty typical definition for a Node service. Here you can see the build args, which are used in Dockerfile.dev. Note that the command entry overrides the CMD defined in the image.

Changing dependencies

tl;dr: docker compose up -d -V --build <service>

When you change dependencies for your node services on the host, you need to rebuild the container for it to reflect those changes.

You don't need to use docker compose build --no-cache <service> for this. There are better methods which don't require completely busting cache and starting from scratch.

The complicating factor for node services, as commonly configured in docker compose projects, is anonymous volumes:

...

volumes:
   - ./apps/backend:/app/apps/backend
   - /app/apps/backend/node_modules # <-- this guy

...

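In the context of yarn 1.22 monorepos, most dependencies are hoisted to the root node_modules. apps/backend/node_modules holds the dependencies that are specific to that workspace, meaning those yarn could not hoist (for example, because of a version conflict with another workspace).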

The point of the anonymous volume /app/apps/backend/node_modules is to make it so node_modules in the backend workspace within the container is excluded from the bind mount ./apps/backend:/app/apps/backend.

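Compose reuses that anonymous volume when it recreates the container, so the node_modules it holds from the previous container masks the freshly installed one in the new image. As a result, simply running docker compose build <service> or docker compose up -d --build <service> won't work like you expect it to when development dependencies for backend change.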

There are two ways I've found to reliably recreate containers when deps change, while dealing with anonymous volumes:

1. The -V, --renew-anon-volumes flag

docker compose up -d -V --build <service>

You don't need to stop or restart the service.

2. Remove the container and rebuild

docker compose rm -f -s <service> && docker compose up -d --build <service>

Either of these methods will do the trick.

Top comments (3)

Magnus • Feb 7 • Edited

Thank you for the solid article! Quick question, regarding this:

This final yarn call ensures that dependencies between workspaces are linked correctly.

I had to do the same to make pnpm dev work (I am using pnpm, but I don't think it makes any difference). Any idea why it is necessary to run the command again? I looked around a bit in my node_modules folders, and could not find any difference before and after.

Thanks!

Nathan Cook • Feb 8

I'm not entirely sure! I was primarily having issues around changes to (external) dependencies in a given shared workspace not getting recognized by other workspaces. Running the second install command fixed this without needing to completely rebuild the entire container from scratch.

Something to keep in mind is that you need to take into account the volumes set up in docker-compose.yml when looking at files in a running container, as you're seeing those volumes overlaid on the files resulting from the build process.

(This is why, in my case, I was getting errors about missing external dependencies: the source files in the container were the latest version, due to bind mounted volumes, but the node_modules directory, created during the build process, was out of date)

Magnus • Feb 9

That's actually a good point, but I am not sure the node_modules volumes work that way. According to the Docker docs, the bind mounted folders will obscure (hide, not delete) the folders in the container. The volumes, on the other hand, are just persistent storage of container files. When you make new volumes on top of container folders that already have files and folders within, those files and folders are copied into the new volumes.

In other words, I see how the bind mount will obscure (hide) any files/folders with the same name that are already in the container (there should not be a mismatch in this case though, since we copied all source files), but not why your node_modules should be out of date.

As a side note: I had a look at vscode's devcontainer solution, and they actually bind mount the entire project root, including node_modules and all. They do not create any anonymous volumes, and they do not copy over any source files during container build (only use the bind mount).

Isn't that better for development than what you/we did above?

Thanks again!
