Unit tests are great. They are quick to run. They can help validate your work. They make future changes easier. They can serve as a specification for complicated bits of logic.
Things usually break at the interfaces, though. When my code talks to the database, the file system, or another service, that is where the problems occur. For these reasons, integration tests are useful: you can test writing to and reading from the filesystem, the database, or the remote system. In terms of code coverage, integration tests are hard to beat.
They do have problems, though. Running them is harder, especially if you are on a team and want to run them for each branch of parallel development.
An Example
Here is a simple example. I have an application that prints out the top 5 countries, alphabetically. The countries come from Postgres. The example is in Scala, but my point is, hopefully, more universal.
> sbt run
The first 5 countries are Afghanistan, Albania, Algeria, American Samoa, Andorra
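The application itself isn't the interesting part, but for context, here is a minimal sketch of what it might look like using doobie with cats-effect. The connection details and the country table and name column are assumptions, not necessarily what the linked example uses.

import cats.effect.{ContextShift, IO}
import doobie._
import doobie.implicits._
import scala.concurrent.ExecutionContext

object CountryApp extends App {
  implicit val cs: ContextShift[IO] = IO.contextShift(ExecutionContext.global)

  // Transactor pointing at the Postgres container started by docker-compose.
  // Database name, user, and password here are assumptions.
  val xa = Transactor.fromDriverManager[IO](
    "org.postgresql.Driver",
    "jdbc:postgresql://localhost:5432/postgres",
    "postgres",
    "postgres"
  )

  // Assumed schema: a country table with a name column.
  val firstFive: List[String] =
    sql"select name from country order by name limit 5"
      .query[String]
      .to[List]
      .transact(xa)
      .unsafeRunSync()

  println(s"The first 5 countries are ${firstFive.mkString(", ")}")
}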
I also have a simple integration test verifying that I can successfully get 5 countries out of my database. The full code for this example is here.
class DatabaseIntegrationTest extends FlatSpec {
  implicit val cs = IO.contextShift(ExecutionContext.global)
  ...

  "A table" should "have country data" in {
    val dal = new DataAccessLayer()
    assert(dal.countries(5).transact(xa).unsafeRunSync.size == 5)
  }
}
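The xa in that assertion is a doobie Transactor (elided above), and DataAccessLayer is the class under test. Here is a sketch of what its countries query might look like, again assuming a country table with a name column:

import doobie._
import doobie.implicits._

class DataAccessLayer {
  // Returns a query for the first `limit` country names; the caller decides
  // which Transactor to run it against via .transact(xa).
  def countries(limit: Int): ConnectionIO[List[String]] =
    sql"select name from country order by name limit $limit"
      .query[String]
      .to[List]
}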
Output:
> sbt it:test
[info] DatabaseIntegrationTest:
[info] A table
[info] - should have country data
[info] Run completed in 2 seconds, 954 milliseconds.
[info] Total number of tests run: 1
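One detail the output glosses over: for sbt it:test to find this test, the build needs sbt's IntegrationTest configuration enabled, with the test living under src/it/scala. A minimal build.sbt sketch (the library versions here are placeholders, not the ones the example actually pins):

lazy val root = (project in file("."))
  .configs(IntegrationTest)
  .settings(
    Defaults.itSettings,
    libraryDependencies ++= Seq(
      "org.tpolecat"  %% "doobie-core"     % "0.8.8",
      "org.tpolecat"  %% "doobie-postgres" % "0.8.8",
      "org.scalatest" %% "scalatest"       % "3.0.8" % "it,test"
    )
  )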
Locally, while developing this, I use a docker-compose file to start up my database and other dependencies.
version: "3"
services:
  postgres:
    container_name: local-postgres
    image: aa8y/postgres-dataset:iso3166
    ports:
      - 5432:5432
    hostname: postgres
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
  postgres-ui:
    container_name: local-postgres-ui
    image: adminer:latest
    depends_on:
      - postgres
    ports:
      - 8080:8080
    hostname: postgres-ui
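With this file in the project root, one command brings up Postgres (with the ISO 3166 country data baked into the image) on port 5432, plus Adminer on port 8080 for poking at the data in a browser:

> docker-compose up -d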
But I want to run this test and other tests as part of a build pipeline. I want GitHub Actions or Jenkins to let me know in the future if any changes break this test. But what database do I point it at?
The Test Environment Solution
One solution to this is to use a test environment. Whatever your real-deal production world looks like, make a copy of it for testing and have the tests run against it.
This solution brings problems of its own. One problem is reproducibility. With just one person working on a single service with one dependency, a test environment will probably be fine. But as more people start working on parallel branches, and as more services appear, things start to break down.
Here are some issues based on real experiences:
- Modified data: my test run fails because a parallel test run was modifying state I depended on.
- Missing data: my test run fails because the sample data in the database was removed.
- New data: my test run fails because an integration test on another branch didn't properly clean up its data.
The heart of the issue is we want to test in parallel, but we only have one test environment. As the number of services grows, the problem will only get worse.
A Solution
There are specific solutions to each of these problems. However, as your service depends on more and more other services, the shared environment gets harder and harder to keep in a clean state. I have a potential solution.
The heart of the issue here is that we don't have true isolation between the runs of our integration tests.
The solution is to use our docker-compose file of dependencies, locally and in the build, for running integration tests.
This can be done with a makefile or with bash scripting, but I'm going to show how it can be done with Earthly.
I create an Earthfile, which is kind of like a combination Dockerfile and makefile. In it, I create an integration-test target where I copy in my source, bring up my docker-compose dependencies, and run my tests.
integration-test:
    FROM +project-files
    COPY src src
    COPY docker-compose.yml ./
    WITH DOCKER --compose docker-compose.yml
        RUN sbt it:test
    END
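The +project-files target referenced by FROM isn't shown here; it's the target that sets up the build environment and copies in the sbt build definition. A rough sketch, where the base image is a stand-in for whatever sbt-equipped image the project actually uses:

project-files:
    # Stand-in base image: anything with a JDK and sbt on the PATH works.
    FROM hseeberger/scala-sbt:8u222_1.3.5_2.13.1
    WORKDIR /scala-example
    # Copy the build definition first so dependency resolution is cached
    # separately from source changes.
    COPY build.sbt ./
    COPY project project
    RUN sbt update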
I can then run it locally or in the build pipeline. Wherever it runs, containerization ensures that every test run is isolated from every other.
> earth -P +integration-test
+integration-test | Creating local-postgres ... done
+integration-test | Creating local-postgres-ui ... done
+integration-test | [info] Loading settings for project scala-example-build from plugins.sbt ...
+integration-test | [info] DatabaseIntegrationTest:
+integration-test | [info] A table
+integration-test | [info] - should have country data
+integration-test | [info] Run completed in 2 seconds, 923 milliseconds.
+integration-test | [info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
+integration-test | Stopping local-postgres-ui ... done
+integration-test | Stopping local-postgres ... done
+integration-test | Removing local-postgres-ui ... done
+integration-test | Removing local-postgres ... done
+integration-test | Removing network scala-example_default
+integration-test | Target github.com/earthly/earthly-example-scala/integration:master+integration-test built successfully
...
Using this pattern, services declare their dependencies in a docker-compose file, and integration tests become less flaky.
I have a more full-featured example here, and a longer guide version here.
If you are interested in learning about the differences between unit and integration tests, I wrote about that as well here.
This is how I solve this problem. What solutions have you seen?