Running cypress on CI always comes with its headaches, and in this article I’ll share a non-standard approach to a problem we faced that might come in handy for you.

The dilemma

Working on a Rails project, we used Gitlab as our repository and CI provider. With cypress as our main testing framework, the test suite grew along with the functionality, and at some point waiting for it to finish became too painful, so we had to start running the tests in parallel.

Enabling parallel runs for cypress was an easy job, but that is where the real dilemma began. We used a single-node Kubernetes cluster (via the Kubernetes Gitlab integration) as the engine behind our CI, which means one pod is created per CI job in a pipeline, and the default number of pods was limited to 4 (or 6). That became insufficient the minute we enabled parallel cypress runs, because each cypress run meant one CI job (and hence one pod behind the scenes). Some pipelines had to wait for others running cypress to finish, which blocked higher-priority jobs such as deploying hotfixes, so we had to find another way out.

Now two other constraints come into the picture: our K8s costs and knowledge. At that point our K8s bill was relatively fine, but we still felt it was hefty and that we could and should reduce it. The obvious solution to the pod limit, namely adding more nodes and using autoscaling to control costs, meant a heftier bill overall.

Although concerning, a bigger bill wasn’t a dealbreaker. However, increasing the number of nodes and using autoscaling meant managing the K8s cluster ourselves. Up to that time the cluster had been working seamlessly without us worrying about anything, since it was set up via the Gitlab integration, which was as easy as filling in a few inputs in a UI. Despite having some K8s knowledge, that route still felt like a big leap that would’ve exposed us to another world of problems we would’ve had to deal with.

Apart from the Kubernetes executor, we were also exploring the other executor options that Gitlab provides. While researching and weighing the pros and cons, we decided to find a temporary solution in the meantime.

Then, googling for workarounds, we found that other developers had attempted running cypress on Cloud Build. Although it didn’t look promising at first, we liked that possibility because it felt relatively simple, if strange, and it works as a pay-per-use solution: Cloud Build billing is per build minute.

What is Cloud Build serverless?

Cloud Build is a service that executes your builds on Google Cloud Platform’s infrastructure. It can serve as your CI provider where you create pipelines of steps like build, test, deploy for your newest app version. You simply add your cloudbuild.yaml file in your app directory and define the steps there, similarly to gitlab-ci.yml for Gitlab.

One of Cloud Build’s core functionalities is building container images. It’s usually used to create an image of your latest app version and then deploy/run a container using that image. The image creation itself is performed by a VM with resources of your choice. In our case we’ll use the image creation process just to make use of the VM and its resources; occupying the VM is charged per minute.

More specifically, we’ll use only our cypress-related files (cypress specs and configuration files) to build the image, and building the image will essentially come down to executing the cypress tests.

We’ll go over 3 files:

  1. Dockerfile.cypress_build - a Dockerfile that serves as a template
  2. gitlab-ci.yml - a few snippets from the yml file showing how a CI job starts cypress runs in parallel
  3. cypress_parallel.sh - bash script that triggers the building of container images using a Dockerfile based off Dockerfile.cypress_build template

Dockerfile.cypress_build

Let’s look at the Dockerfile.cypress_build first:

FROM cypress/browsers:node12.19.0-chrome86-ff82

WORKDIR /cypress

COPY package.json /cypress
COPY yarn.lock /cypress
RUN yarn install

ADD . /cypress

ENV CYPRESS_MODE {MODE}
ENV CYPRESS_API_TOKEN {API_TOKEN}
ENV CYPRESS_RECORD_KEY {CYPRESS_RECORD_KEY}
ENV CI_PIPELINE_ID {CI_PIPELINE_ID}

ENV COMMIT_INFO_BRANCH {COMMIT_INFO_BRANCH}
ENV COMMIT_INFO_MESSAGE {COMMIT_INFO_MESSAGE}
ENV COMMIT_INFO_AUTHOR {COMMIT_INFO_AUTHOR}
ENV COMMIT_INFO_SHA {COMMIT_INFO_SHA}

RUN ["yarn", "cypress", "run", "--record", "--parallel"]

RUN exit 1

It serves as a template that will be edited before each Gitlab job responsible for running cypress starts. {MODE}, {API_TOKEN} and the other {...} serve as placeholders that will be replaced with real values.

COMMIT_INFO_* - used by cypress to visualize the respective piece of info in case you use Cypress Dashboard, so those are optional.

CYPRESS_RECORD_KEY - used by cypress to authenticate with Cypress Dashboard and record the run (results and video) when the --record option is present

CI_PIPELINE_ID - used by cypress to indicate which pipeline a cypress run is related to. This is needed because we’ll spin up 5 cypress runs in parallel from each test CI job for the same pipeline.

CYPRESS_MODE & CYPRESS_API_TOKEN are examples of environment-dependent variables which you might need depending on what target environment you’re running cypress against.

For example, if the possible target environments are development, staging and production, your CYPRESS_MODE could hold any of those, and before executing the tests cypress will load a different configuration based on the MODE from env.development.json, env.staging.json or env.production.json. The token can also differ per environment. Hence, you can remove those variables or add new ones depending on your cypress configuration.

⚠️ Any OS-level environment variable on your machine that starts with either CYPRESS_ or cypress_ will automatically be added to Cypress’ environment variables and made available to you.

RUN exit 1 - the last instruction of the dockerfile which triggers a failure of the cloud build process to avoid saving the actual image and then later having to remove it.

gitlab-ci.yml

Now, let’s look at our Gitlab CI job responsible for running cypress tests:

.cypress:
  image: google/cloud-sdk
  stage: test
  before_script:
    - echo $SERVICE_ACCOUNT_GITLAB > /tmp/$CI_JOB_ID.json
    - gcloud auth activate-service-account --key-file /tmp/$CI_JOB_ID.json
    - gcloud config set project $PROJECT

    - cd cypress
    - echo $CYPRESS_JSON > cypress.env.json

    - sed -i -e "s|{MODE}|$MODE|g" Dockerfile.cypress_build
    - sed -i -e "s|{API_TOKEN}|$API_TOKEN|g" Dockerfile.cypress_build
    - sed -i -e "s|{CYPRESS_RECORD_KEY}|$CYPRESS_RECORD_KEY|g" Dockerfile.cypress_build
    - sed -i -e "s|{CI_PIPELINE_ID}|$CI_PIPELINE_ID|g" Dockerfile.cypress_build

    - sed -i -e "s|{COMMIT_INFO_BRANCH}|$CI_COMMIT_REF_NAME|g" Dockerfile.cypress_build
    - sed -i -e "s|{COMMIT_INFO_MESSAGE}|$CI_JOB_ID|g" Dockerfile.cypress_build
    - sed -i -e "s|{COMMIT_INFO_AUTHOR}|$GITLAB_USER_EMAIL|g" Dockerfile.cypress_build
    - sed -i -e "s|{COMMIT_INFO_SHA}|$CI_COMMIT_SHORT_SHA|g" Dockerfile.cypress_build

    - cp Dockerfile.cypress_build Dockerfile
  script:
    - bash cypress_parallel.sh
    - rm /tmp/$CI_JOB_ID.json

...

staging_cypress:
  variables:
    MODE: 'staging'
    API_TOKEN: $STAGING_API_TOKEN
    ...
  extends: .cypress

...

development_cypress:
  variables:
    MODE: 'development'
    API_TOKEN: $DEVELOPMENT_API_TOKEN
    ...
  extends: .cypress

The first few lines of before_script set up the Google Cloud service account and project. Then we go to the cypress directory, where the test files reside, and fill cypress.env.json with any environment variables your cypress setup may require. They come from the CYPRESS_JSON Gitlab CI environment variable, which should be set.

With the rest of the lines within before_script we replace the placeholders in Dockerfile.cypress_build with values coming from other CI environment variables and the very last line creates the actual Dockerfile off of our template that will be used later by cypress_parallel.sh script.
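The placeholder substitution can also be wrapped in a small helper. The following is only a sketch under a few assumptions: fill_template and its NAME=value arguments are hypothetical names that are not part of our original setup, and values are assumed to be single-line. It escapes the characters that are special in a sed replacement (backslash, & and the | delimiter), so a token containing one of them does not break the substitution.

```shell
#!/usr/bin/env bash
# Sketch: copy a template and replace each {NAME} placeholder with a value.
# fill_template is a hypothetical helper, not part of the original pipeline.
# Usage: fill_template TEMPLATE OUTPUT NAME=value [NAME=value ...]
fill_template() {
  local template=$1 output=$2 pair name value
  shift 2
  cp "$template" "$output"
  for pair in "$@"; do
    name=${pair%%=*}   # text before the first "="
    value=${pair#*=}   # text after the first "="
    # Escape backslash, "&" and the "|" delimiter for use in a sed replacement.
    value=$(printf '%s' "$value" | sed -e 's/[\\&|]/\\&/g')
    # Write to a temporary file instead of relying on "sed -i".
    sed -e "s|{$name}|$value|g" "$output" > "$output.tmp" && mv "$output.tmp" "$output"
  done
}
```

In the CI job this could be called as fill_template Dockerfile.cypress_build Dockerfile "MODE=$MODE" "API_TOKEN=$API_TOKEN", which also leaves the template itself untouched.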

cypress_parallel.sh

#!/usr/bin/env bash

build_ids=()
i=0
while [ $i -ne 5 ]
do
  build_id=$(gcloud builds submit \
    --tag eu.gcr.io/ziggu-engage-review/test_cypress \
    --timeout="30m" \
    --async \
    --machine-type=n1-highcpu-8 . \
    --format="value(id)")
  build_ids+=("$build_id")
  ((i=i+1))
done

seconds=0
timeout=1200

while [[ -n $build_ids && $seconds -lt $timeout ]]; do
  echo "Checking for cypress results"
  ((seconds=seconds+5))
  sleep 5

  all_finished="true"
  for id in "${build_ids[@]}"; do
    status=$(gcloud builds describe "$id" --format="value(status)")
    if [ "$status" != "FAILURE" ]; then
      all_finished="false"
    fi
  done

  if [ $all_finished == "true" ]; then
    break
  fi
done

touch logs.txt
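for id in "${build_ids[@]}"; do
  build_log=$(gcloud builds log "$id" | sed -n '1,/Run Starting/d; /Recorded Run/q; p')
  # Quote the log and use an explicit format so the whole text is written, not just its first word
  printf '%s\n' "$build_log" >> logs.txt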
done

cat logs.txt

if grep -q "failed (" logs.txt; then
  rm logs.txt
  exit 1
fi

rm logs.txt

With the first while loop we trigger 5 gcloud builds submit commands and collect their ids. The command expects a Dockerfile to be present in the same directory. After the loop is done we’ll have the following situation: the Gitlab CI job has spun up 5 other jobs asynchronously.

Then with the second while loop we start polling for updates every 5 seconds using the Cloud Build processes’ ids. Once all builds have finished (we check for failures as dictated by the Dockerfile, where we always force an exit with status 1), we proceed with logging the results of each build in the Gitlab job: the last for loop.

There’s also a timeout mechanism set to 20 minutes in case something goes wrong and the Cloud Build process takes a long time to finish.
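The polling loop only treats FAILURE as "finished", so a build that ends in another state keeps the job waiting until the 20-minute timeout. As a hedged sketch, the check could accept every terminal status Cloud Build reports. The helper names is_terminal_status and all_finished are hypothetical, and the status list reflects my understanding of the Cloud Build API rather than anything we ran in the original pipeline.

```shell
#!/usr/bin/env bash
# Sketch: decide whether Cloud Build statuses mean the builds have stopped running.
# is_terminal_status and all_finished are hypothetical helpers.

# Returns 0 if the given status is terminal, 1 otherwise.
is_terminal_status() {
  case "$1" in
    SUCCESS|FAILURE|INTERNAL_ERROR|TIMEOUT|CANCELLED|EXPIRED) return 0 ;;
    *) return 1 ;;  # e.g. QUEUED, WORKING, or an empty string if the lookup failed
  esac
}

# Returns 0 only if every status passed as an argument is terminal.
all_finished() {
  local s
  for s in "$@"; do
    is_terminal_status "$s" || return 1
  done
  return 0
}
```

Inside the polling loop, the comparison against "FAILURE" could then become a call such as is_terminal_status "$status".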

And if there’s a failing test among all builds’ tests, we exit with 1 to trigger a failure for the Gitlab job in the end.

⚠️ Caveats

As you can probably sense this is a quick and hacky way of running cypress in parallel, but as a temporary solution it bought us time until we figured out how to proceed with making our CI more scalable.

Sometimes we had false positives for the Gitlab job. If you add a preparatory step before launching cypress and that step fails, the Cloud Build process fails before any test runs. Since this approach expects failures by design, the script still treats the build as finished, finds no failing tests in the logs, and the whole Gitlab job passes.
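One way to reduce the risk of such false positives is to confirm that cypress actually started before trusting a log. The sketch below reuses the two markers the script already relies on, "Run Starting" and "failed (", and assumes they appear in the raw build log. The function name verdict_for_log and its return codes are hypothetical and were not part of our setup.

```shell
#!/usr/bin/env bash
# Sketch: classify one raw build log read from stdin.
# verdict_for_log is a hypothetical helper. Return codes:
#   0 = cypress ran and no failure marker was found
#   1 = cypress ran and at least one "failed (" marker was found
#   2 = cypress never started (no "Run Starting" marker)
verdict_for_log() {
  local log
  log=$(cat)
  case "$log" in
    *"Run Starting"*) ;;   # cypress started, keep checking
    *) return 2 ;;
  esac
  case "$log" in
    *"failed ("*) return 1 ;;
  esac
  return 0
}
```

It could be used per build as gcloud builds log "$id" | verdict_for_log || exit 1, so that a build which never reached cypress fails the Gitlab job instead of passing silently.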

⚠️ This approach merely illustrates a template so you may have to adapt certain pieces of the code to fit your needs.

I really hope this experience will be useful to you and gives you ideas for using Cloud Build in other situations 😉