Samba CI on gitlab/Debugging CI failures

From SambaWiki
Revision as of 21:46, 18 February 2019 by Abartlet (talk | contribs) (Add note about other possible tools)

Using Docker to debug GitLab CI failures

GitLab CI uses Docker, which provides a way to run applications securely isolated in a container, packaged with all their dependencies and libraries.

To install docker on Ubuntu, follow the instructions on this page:

Install docker and run:

docker run -ti /bin/bash

Then you need to clone the samba git repository with your changes and run the test.

You can find how to run it in the log of the failed pipeline.

A more complete example for cloning your Samba git repository is to run the following from your Samba source directory:

docker run -ti --mount type=bind,source="$(pwd)",target=/src,ro /bin/bash

And then from within the docker session:

git clone /src
cd src

... you know the drill

Finding the right autobuild command to run in docker

The .gitlab-ci.yml file lists the autobuild command string that is run, matching the split of jobs in the GitLab pipeline GUI, where the command is also printed (in green) at the top of the log. This makes it fairly easy to copy the command into the docker shell. For example:

script/ samba-none-env    --verbose --nocleanup --keeplogs --tail --testbase /tmp/samba-testbase

Other container tooling

The container images stored in our registry and used for CI can also be pulled and started with other tools such as podman. To closely replicate the environment on the runners, however, use Docker.

make test

Many issues that show up in CI can be reproduced without difficulty by running the individual test.

  • Most of these issues will reproduce locally on your normal development system
  • Otherwise you may need to use the docker container described above (which has the reference set of packages)

Build Samba with

./configure --enable-developer
make -j

And run the test with

make test TESTS=mytest
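A failure that only shows up intermittently may need several runs before it reproduces. The following is a minimal sketch of a repeat loop, not part of the Samba tree: run_until_failure is a hypothetical helper name, and mytest stands in for the real test name as above.

```shell
# Run a command up to N times, stopping at the first failure.
# run_until_failure is a hypothetical helper, not a Samba script.
run_until_failure() {
    max_runs=$1
    shift
    run=1
    while [ "$run" -le "$max_runs" ]; do
        if ! "$@"; then
            echo "failed on run $run of $max_runs"
            return 1
        fi
        run=$((run + 1))
    done
    echo "passed $max_runs consecutive runs"
}

# Usage, from a configured and built Samba tree:
#   run_until_failure 10 make test TESTS=mytest
```

A clean series of runs does not prove the test is race-free; it only makes a frequent flap less likely.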

Points to note

Resource limitations

The 'private' runners are 4 CPU virtual machines with 8GB of RAM. These run in Rackspace's cloud and are paid for from a credit with Rackspace by the Samba Team.

The 'shared' runners are 1 CPU virtual machines with 4GB of RAM. The name is a misnomer: they are not shared VMs, but access to the newly booted VMs is shared with us (and paid for) by

Some tests fail or flap on GitLab CI due to resource limitations. This can cause:

  • Docker failure code 137 (likely a kill -9 due to the out of memory killer running)
  • Test failures because the tests do not run fast enough (timeouts or failures due to timing)
  • Race conditions (AD schema and DRS replication are particularly prone to this)
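The 137 in the first item follows the usual shell convention that an exit status above 128 means the process was terminated by signal (status - 128), so 137 corresponds to signal 9 (SIGKILL). A small sketch of that arithmetic, using a hypothetical helper name:

```shell
# Describe a process exit status using the shell convention that
# values above 128 mean "terminated by signal (status - 128)".
# describe_exit_status is a hypothetical helper for illustration.
describe_exit_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "success"
    elif [ "$status" -gt 128 ]; then
        echo "killed by signal $((status - 128))"
    else
        echo "exited with error $status"
    fi
}

describe_exit_status 137   # prints: killed by signal 9
```

The status alone does not say who sent the signal; the out of memory killer is the likely cause here, but that is an inference rather than something the code can confirm.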

Tests should be re-worked to be more memory efficient, more robust to poor CPU scheduling and race-free, but in the meantime this is worth being aware of.
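To approximate these limits locally, docker run accepts --cpus and --memory options. The sketch below only prints a command line rather than running it. IMAGE is a placeholder for the image your CI job used, limited_docker_cmd is a hypothetical helper name, and the CPU and memory figures are the runner sizes quoted above.

```shell
# Print (do not run) a docker command line with CPU and memory caps.
# IMAGE is a placeholder: substitute the image your CI job used.
# limited_docker_cmd is a hypothetical helper for illustration.
limited_docker_cmd() {
    cpus=$1
    memory=$2
    image=$3
    echo "docker run -ti --cpus=$cpus --memory=$memory $image /bin/bash"
}

limited_docker_cmd 1 4g IMAGE   # roughly a 'shared' runner
limited_docker_cmd 4 8g IMAGE   # roughly a 'private' runner
```

Caps like these only approximate a small VM; disk and network speed on the runners will still differ from a development machine.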

Long hostnames

sn-devel is a nice short hostname, as is laptop. Specifically, such names are less than 14 characters long, so they do not need to be truncated.

Because of the way the GitLab CI instances are booted under docker, they get long hostnames like runner-191a8437-project-6378020-concurrent-0, which sometimes cause difficult-to-diagnose issues if the hostname is not overridden in every test.
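A quick way to see whether a given hostname falls into the long category is to compare its length against the threshold mentioned above. This is a sketch only: check_hostname is a hypothetical helper, and the default limit of 13 simply encodes "less than 14 characters" from the text.

```shell
# Report whether a hostname is short enough to avoid truncation.
# The default limit of 13 encodes "less than 14 characters" from the
# text above; pass a second argument to use a different limit.
# check_hostname is a hypothetical helper for illustration.
check_hostname() {
    name=$1
    max_len=${2:-13}
    if [ "${#name}" -le "$max_len" ]; then
        echo "ok"
    else
        echo "too long"
    fi
}

for name in sn-devel laptop runner-191a8437-project-6378020-concurrent-0; do
    echo "$name: $(check_hostname "$name")"
done
```

Running check_hostname "$(hostname)" inside the container shows whether your local reproduction has the same kind of hostname as the CI runner.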