Difference between revisions of "Samba CI on gitlab/Under the hood"

(.gitlab-ci-private.yml vs .gitlab-ci.yml)
(Add note about Kubernetes)
 
(23 intermediate revisions by 4 users not shown)
Line 1: Line 1:
=How GitLab CI works at a High Level=
+
=How GitLab CI works in Samba=
  
 
==Running remote scripts, displaying the output==
 
  
Like the Samba build farm of old, GitLab CI is best seen as a system for running scripts on remote hosts against a git checkout.  
+
Like the Samba [https://git.samba.org/samba.git/?p=build-farm.git;a=blob;f=README build farm of old], GitLab CI is a system '''for running scripts on remote hosts''' against a git checkout.  
  
In Samba's case, the remote script is '''[[autobuild.py]]''' plus some housekeeping before and after.  The details is recorded in the '''.gitlab-ci-private'''
+
===Pipelines===
 +
 
 +
Samba uses a feature called [https://docs.gitlab.com/ee/ci/pipelines.html GitLab Pipelines] to orchestrate our CI.
 +
 
 +
===In-repo configuration===
 +
 
 +
In Samba's case, the remote script is '''[[autobuild|script/autobuild.py]]''' plus some housekeeping before and after.  The details are recorded in the '''.gitlab-ci*.yml''' files in the Samba tree (so they are maintained with the code).
 +
 
 +
* See also [https://docs.gitlab.com/ee/ci/quick_start/README.html an introduction to setting up GitLab CI]
  
 
===.gitlab-ci-private.yml vs .gitlab-ci.yml===
 
Line 15: Line 23:
 
The motivation here is to use the [https://about.gitlab.com/2016/04/05/shared-runners/ shared runners] where possible as these are provided by gitlab.com at no cost to Samba Team.
 
  
==Wrapping docker==
+
==Wrapping containers==
 +
 
 +
To get a consistent build environment container images are used, so the scripts described above all run inside a container. 
 +
 
 +
The image used is defined in the .gitlab-ci.yml file.
 +
 
 +
'''GitLab CI is best thought of as a fancy way to run commands in containers and report their results.'''
 +
 
 +
===Docker===
 +
 
 +
GitLab CI uses [https://www.docker.com/ Docker] as the container runtime. 
 +
 
 +
''While the container image format can be consumed by and the containers started using other tools, to closely replicate the environment on the runners, use Docker.
 +
''
 +
===A bit like running in a chroot===
  
To get a consistent build environment docker images are used, so the scripts runs in a docker container.  The image used is defined in the .gitlab-ci.yml file.  
+
The way containers are used by GitLab CI is very much akin to downloading a tarball (the image), unpacking it and calling ''chroot'' into it (entering the container). Modern container concepts like namespaces etc. are used to make it more seamless, but this conceptualization may assist those struggling with the concepts.
  
 
==On a private VM==
 
  
To allow us to accept and test code from a broader range of contributors, and to enable scaling at times of peak load, the docker container is started in a private VM using [https://docs.docker.com/machine/overview/ Docker Machine]].  This applies for '''both''' the private and shared (provided by gitlab.com) runners.
+
To allow us to accept and test code from a broader range of contributors, and to enable scaling at times of peak load, the docker container is started in a private VM using [https://docs.docker.com/machine/overview/ Docker Machine].  This applies for '''both''' the private and shared (provided by gitlab.com) runners.
 +
 
 +
==Multiple VMs in parallel==
 +
 
 +
Each section in the '''.gitlab-ci*.yml''' file is a [https://docs.gitlab.com/ee/ci/yaml job], and each job is distributed to an independent VM, allowing execution in parallel.
 +
 
 +
=Providing the private VMs=
 +
 
 +
* The Samba team provides the ''private'' VMs in the [https://www.rackspace.com/cloud/public Rackspace cloud] paid for by the team using [https://www.samba.org/samba/donations.html donations]. 
 +
 
 +
* A single host running [https://docs.gitlab.com/runner/install/ gitlab-runner] is registered to the [[Samba CI on gitlab#Where is the Samba CI repo on GitLab?|shared development repo]]. 
 +
 
 +
* That host is configured to [https://docs.gitlab.com/runner/configuration/autoscale.html autoscale] [https://docs.gitlab.com/runner/executors/docker_machine.html using docker-machine].
 +
 
 +
=Ansible management scripts=
 +
 
 +
* The [https://gitlab.com/catalyst-samba/samba-cloud-autobuild/tree/master/gitlab-ci scripts used to configure and operate this service] are available.
 +
 
 +
* These scripts allow a new '''bastion host''' to be fully built by just running single script invocation:
 +
[https://gitlab.com/catalyst-samba/samba-cloud-autobuild/-/blob/master/gitlab-ci/one-step-rebuild-rackspace.sh gitlab-ci/one-step-rebuild-rackspace.sh]
 +
 
 +
=Future CI services=
 +
 
 +
As all the complex parts of Samba's build and test system are still below [[autobuild]], migration to a different CI service in the future or in parallel is quite practical. 
 +
 
 +
For example, in the past there was parallel operation with [https://travis-ci.org/ Travis CI] before the team abandoned [https://github.com/samba-team/samba GitHub].
 +
 
 +
==Not tied to gitlab.com==
 +
 
 +
If needed, private GitLab hosts running the Open Source GitLab CE can interpret the same configuration and operate against the same runners (just without the free shared runners, naturally).
 +
 
 +
This gives the Samba Team options if gitlab.com hosting becomes a problem for any reason.
 +
 
 +
==CI Cloud Requirements==
 +
 
 +
'''To aid in the selection of any future cloud provider'''
 +
 
 +
To be a suitable provider for Samba's CI, a cloud must be able to provide:
 +
 
 +
* On the basis of at least 40 parallel jobs ''(the current limit is 40, this is often reached when doing security work as all jobs are run on the private runners)''
 +
** 160 CPUs at peak
 +
** 160 GB RAM at peak
 +
* S3 or Google Compute Engine compatible object store is desirable (for caching, not currently available with Rackspace)
 +
* Provide the openstack API to launch hosts (current scripts are built around this and Rackspace, each new cloud is non-trivial to set up)
 +
** Docker-machine compatible driver to launch the runners from gitlab-runner
 +
** Ansible compatible drivers to launch the bastion host
 +
** Command-line ability to upload SSH keys to launch the bastion host
 +
** API access available from arbitrary networks.
 +
* Billing to an AMEX to allow the SFC to pay for services
 +
** Billing console so we can confirm current level of billing
 +
* Maintained host images for (currently) Ubuntu 18.04 to boot from
 +
** Ideally these would be under a stable name or ID but updated with any security updates
 +
 
 +
===Future Cloud: Kubernetes?===
 +
 
 +
If we are willing to put in more effort than just a like-for-like port of the existing rig, we should consider whether the native [https://docs.gitlab.com/ee/user/project/clusters/ GitLab Kubernetes integration] would reduce maintenance of the script infrastructure.

Latest revision as of 08:45, 19 May 2020

How GitLab CI works in Samba

Running remote scripts, displaying the output

Like the Samba build farm of old, GitLab CI is a system for running scripts on remote hosts against a git checkout.

Pipelines

Samba uses a feature called GitLab Pipelines to orchestrate our CI.

In-repo configuration

In Samba's case, the remote script is script/autobuild.py plus some housekeeping before and after. The details are recorded in the .gitlab-ci*.yml files in the Samba tree (so they are maintained with the code).

.gitlab-ci-private.yml vs .gitlab-ci.yml

We have two different CI configurations: one using the default name .gitlab-ci.yml (so it is picked up by default by forks of our repo) and one that we specify in the Common development repo (.gitlab-ci-private.yml).

The .gitlab-ci-private.yml file includes .gitlab-ci.yml so as to avoid duplication.

The motivation here is to use the shared runners where possible as these are provided by gitlab.com at no cost to Samba Team.
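
The include relationship described above can be pictured as one configuration layered over another. The sketch below is only an illustration in Python, not how GitLab implements includes: the job names, tags and image name are hypothetical, and it assumes that a key set in the including file wins over the same key in the included file.

```python
from copy import deepcopy


def deep_merge(base, override):
    """Return a new dict in which values from `override` win over `base`."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists are replaced wholesale.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical stand-in for .gitlab-ci.yml (usable on the shared runners).
shared_config = {
    "image": "example/samba-ci:latest",  # hypothetical image name
    "build": {"script": ["script/autobuild.py"], "tags": ["shared"]},
}

# Hypothetical stand-in for what .gitlab-ci-private.yml adds or changes.
private_overrides = {
    "build": {"tags": ["private"]},
    "extra-tests": {"script": ["script/autobuild.py"], "tags": ["private"]},
}

# The configuration the private pipeline would effectively see.
effective_private_config = deep_merge(shared_config, private_overrides)
```

Only the differences live in the private file in this model, which is the duplication-avoiding property the text describes.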

Wrapping containers

To get a consistent build environment, container images are used, so the scripts described above all run inside a container.

The image used is defined in the .gitlab-ci.yml file.

GitLab CI is best thought of as a fancy way to run commands in containers and report their results.

Docker

GitLab CI uses Docker as the container runtime.

While the container image format can be consumed by and the containers started using other tools, to closely replicate the environment on the runners, use Docker.
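
As a rough sketch of replicating the runner environment locally, the Python below only builds a docker run command line and prints it; it does not start a container. The image name and checkout path are hypothetical placeholders (the real image is whatever .gitlab-ci.yml names), and mounting the checkout at /src is an assumption made for this example.

```python
import shlex


def docker_run_command(image, checkout_dir, script):
    """Build (but do not execute) a `docker run` command line."""
    return [
        "docker", "run", "--rm",              # remove the container afterwards
        "--volume", f"{checkout_dir}:/src",   # mount the git checkout
        "--workdir", "/src",                  # run from inside the checkout
        image,
        *script,
    ]


command = docker_run_command(
    image="example/samba-ci:latest",     # hypothetical image name
    checkout_dir="/home/user/samba",     # hypothetical checkout path
    script=["script/autobuild.py"],
)
print(shlex.join(command))
```

Printing the command rather than running it keeps the sketch safe to try on a machine without Docker installed.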

A bit like running in a chroot

The way GitLab CI uses containers is much like downloading a tarball (the image), unpacking it and calling chroot into it (entering the container). Modern container features such as namespaces make this more seamless, but the chroot analogy may help those struggling with the concepts.

On a private VM

To allow us to accept and test code from a broader range of contributors, and to enable scaling at times of peak load, the Docker container is started in a private VM using Docker Machine. This applies to both the private runners and the shared runners provided by gitlab.com.

Multiple VMs in parallel

Each section in the .gitlab-ci*.yml file is a job, and each job is distributed to an independent VM, allowing execution in parallel.
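
The fan-out of jobs to independent VMs can be modelled loosely as a pool of workers, one per job. This is a toy model in Python rather than anything gitlab-runner does internally; the job names are hypothetical and run_job is a stand-in for "boot a VM, start the container, run the script".

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(name):
    """Stand-in for running one CI job on its own VM."""
    return name, "passed"


# Hypothetical job names; the real ones are the sections of .gitlab-ci*.yml.
job_names = ["build", "unit-tests", "integration-tests"]

# One worker per job, so no job waits for another to finish.
with ThreadPoolExecutor(max_workers=len(job_names)) as pool:
    results = dict(pool.map(run_job, job_names))
```

Because no job shares state with another in this model, the wall-clock time is bounded by the slowest job rather than by the sum of all jobs.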

Providing the private VMs

  • The Samba team provides the private VMs in the Rackspace cloud, paid for by the team using donations.
  • A single host running gitlab-runner is registered to the shared development repo.
  • That host is configured to autoscale using docker-machine.

Ansible management scripts

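  • The scripts used to configure and operate this service are available at https://gitlab.com/catalyst-samba/samba-cloud-autobuild/tree/master/gitlab-ci
  • These scripts allow a new bastion host to be fully built with a single script invocation: gitlab-ci/one-step-rebuild-rackspace.sh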

Future CI services

As all the complex parts of Samba's build and test system are still below autobuild, migration to a different CI service in the future or in parallel is quite practical.

For example, in the past there was parallel operation with Travis CI before the team abandoned GitHub.

Not tied to gitlab.com

If needed, private GitLab hosts running the Open Source GitLab CE can interpret the same configuration and operate against the same runners (just without the free shared runners, naturally).

This gives the Samba Team options if gitlab.com hosting becomes a problem for any reason.

CI Cloud Requirements

To aid in the selection of any future cloud provider

To be a suitable provider for Samba's CI, a cloud must be able to provide:

  • Capacity for at least 40 parallel jobs (the current limit is 40, which is often reached during security work because all jobs then run on the private runners):
    • 160 CPUs at peak
    • 160 GB RAM at peak
  • An S3 or Google Cloud Storage compatible object store is desirable (for caching; not currently available with Rackspace)
  • The OpenStack API to launch hosts (the current scripts are built around this and Rackspace; each new cloud is non-trivial to set up)
    • Docker-machine compatible driver to launch the runners from gitlab-runner
    • Ansible compatible drivers to launch the bastion host
    • Command-line ability to upload SSH keys to launch the bastion host
    • API access available from arbitrary networks.
  • Billing to an AMEX card so that the SFC can pay for services
    • Billing console so we can confirm current level of billing
  • Maintained host images for (currently) Ubuntu 18.04 to boot from
    • Ideally these would be under a stable name or ID but updated with any security updates
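
The peak figures in the list above imply a per-job allowance, which makes it easy to re-scale the requirement if the job limit changes. The arithmetic below uses only the numbers from the list; the assumption that resources scale linearly with the number of parallel jobs is ours.

```python
PARALLEL_JOBS = 40   # current job limit
PEAK_CPUS = 160      # CPUs at peak
PEAK_RAM_GB = 160    # GB of RAM at peak

# Implied allowance for a single job.
cpus_per_job = PEAK_CPUS // PARALLEL_JOBS
ram_gb_per_job = PEAK_RAM_GB // PARALLEL_JOBS


def peak_requirements(parallel_jobs):
    """Return (CPUs, GB RAM) needed at peak, assuming linear scaling."""
    return parallel_jobs * cpus_per_job, parallel_jobs * ram_gb_per_job
```

For example, raising the limit to 60 parallel jobs would, under the same linear assumption, need peak_requirements(60) worth of capacity from the provider.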

Future Cloud: Kubernetes?

If we are willing to put in more effort than just a like-for-like port of the existing rig, we should consider whether the native GitLab Kubernetes integration would reduce maintenance of the script infrastructure.