In this post, I describe a few of the amazing concepts of GitLab CI/CD pipelines. They have made my daily work on CI/CD pipelines significantly easier.
All examples can be found here.
1. YAML Anchors
This is probably a no-brainer for many of you. When I got into CI/CD, I was also pretty new to working with YAML files, and I did not start from scratch: I had to deal with a huge CI/CD file that was over 1000 lines long. I could shrink it down to a couple of hundred lines while also adding more functionality to the pipelines. I achieved this by utilizing the concepts of parallelization and YAML Anchors.
What are YAML Anchors?
YAML Anchors are reusable code blocks that you can easily insert at a later point. You can define entire jobs this way and change their behavior based on the variables you set. Let me make an example.
Let's say we have two deployments to perform in our pipeline, one for development and one for production, but production needs a different .env file than development. We could just create two jobs like the ones below, which will result in this kind of pipeline:
stages:
  - deploy

dev_deploy:
  variables:
    ENV_FILE: .env.test
  stage: deploy
  script:
    - source $ENV_FILE
    - echo "DEPLOY APPLICATION"

master_deploy:
  variables:
    ENV_FILE: .env.prod
  stage: deploy
  script:
    - source $ENV_FILE
    - echo "DEPLOY APPLICATION"
These two jobs are fairly easy to read, but imagine a more complex build/bundle script plus different rules about when to run which job. We can do better with YAML Anchors, because most parts of the two jobs are the same. So we can transform the code block above into the following, which will result in this kind of pipeline:
stages:
  - deploy

.deploy: &deploy
  stage: deploy
  before_script:
    - if [[ "$ENV_FILE" == '' ]] ; then echo "ENV_FILE is not set" ; exit 1 ; fi
  script:
    - source $ENV_FILE
    - echo "BUILDING APPLICATION"

dev_deploy:
  <<: *deploy
  variables:
    ENV_FILE: .env.test

staging_deploy:
  <<: *deploy

master_deploy:
  <<: *deploy
  variables:
    ENV_FILE: .env.prod
As you can see, we now share the code across the different jobs. I also added a staging job into the mix to show that we can fail a job early if its required variables are not set. When it comes to overriding, keys defined directly in a job take precedence over the ones merged in from the anchor; that is why we "spread" the deploy anchor at the top of the job and put the job-specific variables below it.
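To make the overriding concrete, here is a small sketch (the job name and the extra script line are made up) where a job reuses the anchor but replaces the script with its own:

hotfix_deploy:
  <<: *deploy
  variables:
    ENV_FILE: .env.prod
  script:
    - source $ENV_FILE
    - echo "DEPLOY ONLY THE HOTFIX"

Because script is defined directly in hotfix_deploy, it takes precedence over the script merged in from the .deploy anchor, while stage and before_script are still inherited.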
2. Parallelization
Parallelization has similar use cases to YAML Anchors, but it is somewhat different. Sometimes a job is exactly the same except for a single variable, and it therefore needs to run again and again. Going back to the first example, we could also improve on it in the following manner, which results in this kind of pipeline:
stages:
  - deploy

dev_deploy:
  stage: deploy
  parallel:
    matrix:
      - ENV:
          - test
          - prod
  script:
    - echo $ENV
    - echo .env.$ENV
    - source .env.$ENV
    - echo "DEPLOY TO $ENV"
So instead of defining multiple jobs, we define one job with a parallel matrix. This will spin up the job multiple times and inject the ENV variable. This is very useful if you, for instance, need to build or test your app against different environment files, because then only one CI variable differs. The downside is that you can only spin up 50 parallel jobs this way.
On the other hand, parallel jobs are often used to split up a big job into smaller parts and then bring everything together in the next job, or to split your test files across parallel jobs, as sketched below.
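As a sketch of that second pattern (run_tests.sh is a made-up placeholder for whatever splits your test suite), GitLab exposes CI_NODE_INDEX and CI_NODE_TOTAL to every copy of a job that uses parallel, so each copy can pick its own slice of the test files:

test:
  stage: test
  parallel: 4
  script:
    - echo "Running slice $CI_NODE_INDEX of $CI_NODE_TOTAL"
    # run_tests.sh is a placeholder for your own test splitter
    - ./run_tests.sh --slice $CI_NODE_INDEX --total $CI_NODE_TOTAL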
3. CI_JOB_TOKEN
The CI_JOB_TOKEN is a predefined variable which allows you to access or trigger other resources within a group. So if you need to trigger a multi-project pipeline, for instance triggering the frontend deployment after the backend is deployed, the CI_JOB_TOKEN comes in very handy. But there is more! If you use the CI_JOB_TOKEN, GitLab will actually know about it and make a connection between these pipelines: you can jump from one project's pipeline to the other project's pipeline. The call would look like this:
stages:
  - trigger_pipeline_in_other_project

trigger:
  stage: trigger_pipeline_in_other_project
  script:
    - curl --request POST --form token=${CI_JOB_TOKEN} --form ref=<REF> "https://gitlab.com/api/v4/projects/<PROJECT_ID>/trigger/pipeline"
A resulting pipeline could look like this:
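If the downstream project lives on the same GitLab instance, you can alternatively use the trigger keyword instead of curl; GitLab then passes the CI_JOB_TOKEN for you behind the scenes. A minimal sketch (the project path and branch are placeholders):

trigger_frontend:
  stage: trigger_pipeline_in_other_project
  trigger:
    project: my-group/frontend   # placeholder path of the downstream project
    branch: main                 # placeholder branch to run the pipeline on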
4. Clean Up Jobs
Clean up jobs are jobs which run after another job, and whether they execute depends on the pipeline status. So you can basically run a different job depending on whether the pipeline succeeded or failed; for instance, you can clear the cache on failure or invalidate a CloudFront distribution. To utilize this concept you can do something like the following, which results in a pipeline like this:
stages:
  - build
  - deploy

build:
  stage: build
  script:
    - echo "BUILD APPLICATION"

deploy_on_failure:
  stage: deploy
  when: on_failure
  script:
    - echo "CLEAR ARTIFACTS"

deploy_on_success:
  stage: deploy
  when: on_success
  script:
    - echo "DEPLOY APPLICATION"
deploy_on_failure runs only if the build has failed, while deploy_on_success runs only when the build has succeeded. This can come in very handy but has limitations, which is why I really like the next concept.
5. Child Pipelines & Dynamic Child Pipelines
Child pipelines are pipelines which are started using the combination of the trigger and include keywords. They are detached from the parent pipeline and, by default, start running immediately when triggered. So if one stage in your CI/CD file is a trigger job for a child pipeline, it will kick off the child pipeline and the next job of the parent pipeline will start right after. Child pipelines are defined in a second CI/CD file, which is included from the main file. Let me make an example, which would result in this kind of pipeline:
The main file would look like this:
stages:
  - test
  - build
  - deploy

test:
  stage: test
  trigger:
    include: test.yml

build:
  stage: build
  script:
    - echo "BUILD APPLICATION"

deploy:
  stage: deploy
  script:
    - echo "DEPLOY APPLICATION"
As you can see, the test stage includes a second YAML file, which is triggered as a detached (child) pipeline. That file could look like this:
stages:
  - test

test:
  stage: test
  script:
    - echo "TEST APPLICATION"
So, child pipelines allow you to split your YAML files into multiple files. They do have a constraint, though: they can only be nested up to two levels deep. That means the first child pipeline can trigger another child pipeline, but that pipeline cannot trigger a third one. But why is this exciting? We could use other tools to split YAML files.
This is exciting because the triggered YAML file does not have to exist before the pipeline starts!
The above statement leads us right into Dynamic Child Pipelines. This concept is really powerful and deserves an article on its own (Let me know if I should write more about it).
Most programming languages have some sort of package to convert a JSON-like structure into a YAML file. So what you can do is have a pre-job which computes the YAML file for you and then passes it as an artifact to the trigger job. This way, you can decide on the fly what the child pipeline should look like.
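As a minimal sketch of such a pre-job (all names are made up, and I simply write the YAML with a heredoc instead of using a real JSON-to-YAML converter):

generate_pipeline:
  stage: build
  script:
    - |
      cat > dynamicChildPipeline.yml << 'EOF'
      dynamic_job:
        script:
          - echo "I was generated while the pipeline was already running"
      EOF
  artifacts:
    paths:
      - dynamicChildPipeline.yml

A trigger job can then include dynamicChildPipeline.yml from the artifact, exactly like in the example further down.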
What I am going to show next is not the most elegant or the most dynamic way, but it is the easiest way to grasp the concept. I call this setup a pipeline switch. Let's say we have a job which computes something for us.
Example conditions:
- For instance, gas prices on a blockchain: if the gas prices are low, we want to deploy the new contracts (basically, deploy when deployment costs are low)
- Every Sunday we want to deploy our frontend in a random color 😆
You get the gist, so we have some condition on which we want to alter the pipeline.
In the example below, the pipeline depends on the outcome of the condition (the deployment fees):
stages:
  - build
  - check_deployment_costs
  - trigger_dynamic_child

build:
  stage: build
  script:
    - echo "BUILD APPLICATION"

check_deployment_costs:
  stage: check_deployment_costs
  script:
    - echo "RUNS SCRIPT TO CHECK DEPLOYMENT COSTS"
    - echo "query computed costs per contract are 50 Finney"
    - bash pipelineSwitch.sh 50
  artifacts:
    paths:
      - './dynamicChildPipeline.yml'

trigger_dynamic_child:
  stage: trigger_dynamic_child
  trigger:
    include:
      - artifact: dynamicChildPipeline.yml
        job: check_deployment_costs
So in the check_deployment_costs step we check the deployment costs and plug the result into our bash script. The bash script performs a simple check and then copies the matching template from the templates folder to the location from which we upload the artifact:
echo "input value: $1"
if [[ $1 < 51 ]]; then
echo "should deployment"
cp ./CICDTemplates/deployment.yml ./dynamicChildPipeline.yml
else
echo "should wait deployment"
cp ./CICDTemplates/sendNotification.yml ./dynamicChildPipeline.yml
fi
As stated earlier, this solution might not be as elegant as others, but it is still pretty viable as a quick approach. The resulting pipelines look like this if the price is too high, or like this if the price is okay.
LET ME KNOW 🚀
- Do you need help with anything written above?
- What would be at the top of your list? 😄
- Do you think I can improve? Then let me know!
- Did you like the article? 🔥