Samba CI on gitlab/Debugging CI failures
- 1 Using Docker to debug GitLab CI failures
- 2 make test
- 3 Getting patches back out of a Docker session
- 4 Points to note
Using Docker to debug GitLab CI failures
GitLab CI uses Docker, which provides a way to run applications securely isolated in a container, packaged with all their dependencies and libraries.
To install docker on Ubuntu, follow the instructions on this page:
Install docker and run:
docker run -ti $IMAGE_URL /bin/bash
You can find the value for $IMAGE_URL at the top of the CI output as:
Using Docker executor with image $IMAGE_URL
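If you have saved the job log to a file, the image name can be pulled out of that line with a short shell helper. This is only a sketch: the function name image_url_from_log and the file name job.log are hypothetical, and it assumes the log line has exactly the form shown above.

```shell
# Hypothetical helper: print the image named on the first
# "Using Docker executor with image ..." line of a saved CI job log.
image_url_from_log() {
    # $1: path to a saved copy of the job log (assumed file name)
    sed -n 's/.*Using Docker executor with image \([^[:space:]]*\).*/\1/p' "$1" | head -n 1
}

# Example (not run here, needs Docker):
#   IMAGE_URL=$(image_url_from_log job.log)
#   docker run -ti "$IMAGE_URL" /bin/bash
```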
Then you need to clone the samba git repository with your changes and run the test.
You can find how to run it in the log of the failed pipeline.
A more complete example, which makes your local checkout available for cloning inside the container, is to run the following from your samba source directory:
docker run -ti --mount type=bind,source="$(pwd)",target=/src,ro $IMAGE_URL /bin/bash
And then from within the docker session:
git clone /src
cd src
... you know the drill
Finding the right autobuild command to run in docker
The .gitlab-ci.yml file lists the autobuild command string that is run, matching the split of jobs in the GitLab pipeline GUI, where the command is also printed (in green) at the top of the log. This makes it fairly easy to copy the command into the docker shell. For example:
script/autobuild.py samba-none-env --verbose --nocleanup --keeplogs --tail --testbase /tmp/samba-testbase
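When working from a saved copy of the job log rather than the web GUI, the command can also be extracted mechanically. The sketch below assumes the command appears in the log as a line containing script/autobuild.py, possibly wrapped in terminal colour codes; the function name autobuild_cmd_from_log is hypothetical.

```shell
# Hypothetical helper: print the first autobuild command found in a saved job log.
autobuild_cmd_from_log() {
    # $1: path to a saved copy of the job log.
    # The match stops at the first control character, so any colour
    # escape sequence that follows the command is dropped.
    grep -o 'script/autobuild\.py[^[:cntrl:]]*' "$1" | head -n 1
}

# Example (not run here):
#   autobuild_cmd_from_log job.log
```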
Other container tooling
The container images stored in our registry and used for CI can also be pulled and run with other tools such as podman, but to replicate the environment on the runners closely, use Docker.
make test
Many issues that show up in CI can be reproduced easily by running the individual test.
- Most of these issues will reproduce locally on your normal development system
- Otherwise you may need to use the docker container described above (which has the reference set of packages)
Build Samba with
./configure --enable-developer
make -j
And run the test with
make test TESTS=mytest
Getting patches back out of a Docker session
If you have made changes inside a docker runtime container:
Tell git who you are
git config --global user.name "Fred Nurk"
git config --global user.email "firstname.lastname@example.org"
Make a proper commit within the container runtime
git add --patch
git commit -s -m 'My commit message'
Export the patch back to your host
docker exec [CONTAINER_ID] sh -c 'cd samba;git format-patch -1 --stdout' > /tmp/patch.txt
You can typically find the [CONTAINER_ID] as the part after the @ in the shell prompt inside the container.
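By default Docker sets the hostname of a container to its short container ID, which is why the ID shows up in the prompt. As a small illustration, the helper below pulls the ID out of a prompt string. The function name container_id_from_prompt and the sample ID are made up, and the user@host:path prompt layout is an assumption about the image's shell configuration.

```shell
# Hypothetical helper: extract the host part of a prompt such as
# "root@3f2a9c1b7d4e:/src#" (the ID here is invented for illustration).
container_id_from_prompt() {
    id=${1#*@}      # drop everything up to and including the first '@'
    id=${id%%:*}    # drop everything from the first ':' onwards
    printf '%s\n' "$id"
}

# On the host, "docker ps" lists the IDs of running containers as well.
```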
Points to note
Notable Pipeline error conditions
fatal: reference is not a tree
If a branch is pushed to twice in quick succession, the already started CI pipeline may fail with errors like:
fatal: reference is not a tree: f27116a9a0d047629d074bc14c18caf6139731e2
This just means that the runner lost the race with your new push and could not get the old git hash. Your new CI run is in another pipeline.
The 'private' runners are 4 CPU virtual machines with 8GB of RAM. These run in Rackspace's cloud and are paid for by the Samba Team from a credit with Rackspace.
The 'shared' runners are 1 CPU virtual machines with 4GB of RAM. The name is a misnomer: they are not shared VMs, but access to the newly booted VMs is provided to us (and paid for) by gitlab.com.
Some tests fail or flap on GitLab CI due to resource limitations. This can cause:
- Docker failure code 137 (likely a kill -9 due to the out of memory killer running)
- Test failures because the tests do not run fast enough (timeouts or failures due to timing)
- Race conditions (AD schema and DRS replication are particularly prone to this)
Tests should be re-worked to be more memory efficient, more robust to poor CPU scheduling and race-free, but in the meantime this is worth being aware of.
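The link between code 137 and kill -9 comes from the usual convention that an exit status above 128 means the process was terminated by signal (status - 128), so 137 corresponds to signal 9 (SIGKILL). A minimal sketch of that arithmetic, with a hypothetical function name:

```shell
# Describe an exit status, following the convention that statuses
# above 128 mean "terminated by signal (status - 128)".
describe_exit_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        printf 'killed by signal %d\n' "$((status - 128))"
    else
        printf 'exited with status %d\n' "$status"
    fi
}

# Example:
#   describe_exit_status 137
```

Whether the out of memory killer was really responsible still has to be confirmed separately; the status only tells you which signal ended the process.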
sn-devel is a nice short hostname, and so is laptop. Specifically, such names are less than 14 characters long, so they do not need to be truncated.
Because of the way the GitLab CI instances are booted under Docker, they get long hostnames like runner-191a8437-project-6378020-concurrent-0, which can cause difficult-to-diagnose issues unless the hostname is always overridden in the test.
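To see quickly whether a given hostname falls into the problematic range, its length can be checked in the shell. This is a sketch only: the function names are hypothetical, and the limit of 14 is simply taken from the "less than 14 characters" remark above rather than from the Samba code.

```shell
# Assumed limit, taken from the "less than 14 characters" remark above.
MAX_HOSTNAME_LEN=14

# Succeeds when the name is short enough not to need truncation.
hostname_is_short() {
    [ "${#1}" -lt "$MAX_HOSTNAME_LEN" ]
}

# Print a one-line verdict for a hostname, including its length.
check_hostname() {
    if hostname_is_short "$1"; then
        printf '%s: ok (%d chars)\n' "$1" "${#1}"
    else
        printf '%s: too long (%d chars)\n' "$1" "${#1}"
    fi
}

# Example:
#   check_hostname "$(hostname)"
```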