Replace cwrap with namespaces

Revision as of 02:28, 29 January 2019

As an experiment, I tried to remove selftest's reliance on cwrap and use linux namespaces instead. Mostly I played around with network namespaces, which removes the need for socket-wrapper.

What are namespaces?

Namespaces allow the kernel to segregate its system resources (files, CPU, etc), so different processes only see the set of resources they are allowed to use. There are several different types of namespace: network, user, process, file, IPC, etc.

Some key points are:

  • Each type of namespace gets managed separately by the kernel, i.e. process namespaces are managed separately to network namespaces, which are separate to user namespaces. This prototype gave each testenv its own network namespace, but otherwise they all still share the same user/process/etc namespaces. (In future, we may want to give each testenv its own process and user namespace, to better mimic a production DC, but this was beyond the scope of what I had time to do.)
  • Namespaces are created using the 'unshare' utility. The new selftest namespaces are anonymous/nameless, and so the different namespaces are identified by the PID of the processes running within the namespace (typically samba).
  • Linux supports nesting namespaces within namespaces. In this case, each testenv DC has its own network namespace, which is a child of the overarching selftest namespace (which itself is a child of whatever namespace you run 'make test' from - usually this would be the root namespace).
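As a sketch of the unshare mechanics described above (assuming a kernel with unprivileged user namespaces enabled; selftest creates its anonymous network namespaces in much the same way):

```shell
# Create an anonymous user+network namespace and look inside it.
# An unprivileged user can do this when user namespaces are enabled;
# the fresh network namespace contains only a loopback interface.
unshare --user --map-root-user --net ip link show

# The namespace has no name. A long-running process inside it is how you
# refer to it later, via its PID, e.g. through /proc/<pid>/ns/net.
```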

How does it work?

Currently, every testenv uses a 127.0.0.x IP address and socket-wrapper passes the packets between them.

With namespaces, we can use real IP addresses and have the packets pass through the kernel's IP stack normally, as it forwards them between namespaces.

It works like this: we create a new veth interface pair, which we use to connect the namespaces. All packets sent on one veth interface simply end up as received packets on the other. So we add one end of the veth pair to the main selftest namespace, and the other end to a separate namespace in which we'll run samba. E.g.

selftest.pl  veth21-br ------------------------ veth21 samba (ad_dc_ntvfs)
             10.0.0.11                          10.0.0.21
 Namespace 1                                       Namespace 2
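The setup in the diagram could be sketched roughly like this (hypothetical commands, not the actual selftest scripts; the interface names and addresses come from the diagram, $SAMBA_PID stands for a process already running in the samba namespace, and CAP_NET_ADMIN is needed in both namespaces):

```shell
# Create a veth pair: packets sent on one end are received on the other.
ip link add veth21-br type veth peer name veth21

# Leave veth21-br in the main selftest namespace; push veth21 into the
# namespace that samba runs in, identified by a PID inside it.
ip link set veth21 netns "$SAMBA_PID"

# Address and bring up each end in its own namespace.
ip addr add 10.0.0.11/24 dev veth21-br
ip link set veth21-br up
nsenter --net --target "$SAMBA_PID" ip addr add 10.0.0.21/24 dev veth21
nsenter --net --target "$SAMBA_PID" ip link set veth21 up
```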

However, we need to run multiple different testenvs and have them talk to each other. To do this, we use a bridge interface ('selftest0') to connect up the namespaces, which essentially just acts as a hub. Connecting together multiple testenvs then looks more like this:

selftest.pl     +-- veth21-br ------------------------ veth21 samba (ad_dc_ntvfs)
                |                                      10.0.0.21
    selftest0 --+                                        Namespace 2
    10.0.0.11   |
                +-- veth22-br ------------------------ veth22 samba (vampire_dc)
                                                       10.0.0.22
 Namespace 1                                             Namespace 3      

The veth interfaces are named vethX and vethX-br, where X is the SOCKET_WRAPPER_DEFAULT_IFACE for the testenv. The vethX-br interface is always added to the selftest0 bridge interface.
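Again as a rough sketch (hypothetical commands run in the main selftest namespace, after each veth pair has been created as above; 21 and 22 are the SOCKET_WRAPPER_DEFAULT_IFACE values from the diagram):

```shell
# Create the selftest0 bridge; it acts as a hub joining the testenvs.
ip link add selftest0 type bridge
ip addr add 10.0.0.11/24 dev selftest0
ip link set selftest0 up

# Attach each testenv's vethX-br end to the bridge.
ip link set veth21-br master selftest0
ip link set veth22-br master selftest0
ip link set veth21-br up
ip link set veth22-br up
```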

Why would we want to replace cwrap?

The main benefits are:

  • We can do real testing of DNS, which has historically been hard to write decent automated tests for. There may be some overlap between this namespace work and dns_hub, as the two were developed in parallel.
  • We can do more realistic testing of the samba codebase. E.g. we noticed that the LMDB backend was 20% slower when run inside a testenv compared to when run outside. The difference was purely due to socket-wrapper.
  • It allows developers to do more realistic testing - we're no longer limited to what we can do inside a testenv. You can easily connect to the selftest DC from outside the testenv, which potentially lets you do things like join a Windows DC to your testenv DC, or point the Windows RSAT GUI at the testenv DC.
  • It allows you to do quite powerful things with the Customdc testenv, as the testenv essentially becomes like a VM that's ridiculously easy to spin up.
  • Initially we thought that using user namespaces would allow better testing of UIDs and root vs non-root permissions. This may be the case, but it ended up falling outside of what I had time to prototype.
  • Also, cwrap is getting pretty old, and was added long before namespaces even existed. We thought it'd be interesting to see what we could do with more modern kernel containerization.

Can we just ditch cwrap?

Cwrap will still be needed for the foreseeable future. There are a few limitations that mean we can't just drop cwrap completely:

  • We can't use namespaces on older systems like Ubuntu 14.04 (i.e. sn-devel).
  • Using namespaces on GitLab CI still needs more work. Selftest needs to call unshare, and this syscall is not permitted by Docker's default seccomp profile. You can get around this locally by using --privileged, but that's probably not something we want to do as part of CI. One way forward may be a custom seccomp profile that whitelists the unshare/clone syscalls.
  • Other systems like FreeBSD won't support it.
  • Currently we've only prototyped removal of socket-wrapper and resolv-wrapper. Cwrap is still used for a bunch more things.
  • Removing just socket-wrapper highlighted a bunch of tests that rely on socket-wrapper behaviour (either explicitly or implicitly), and so won't run correctly when it's removed. Some testenvs can pass all their test-cases successfully (e.g. restoredc), but other tests will need to be fixed and made more generic before we can run an entire autobuild using namespaces.
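For the GitLab/Docker limitation above, one possible shape for a custom seccomp profile is a fragment like the following added to the syscalls list of Docker's default profile (a sketch only, not a tested profile):

```json
{
  "names": ["unshare", "setns", "clone"],
  "action": "SCMP_ACT_ALLOW"
}
```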

So what was the point of this work?

We wanted to see if it was feasible to replace cwrap. The answer is yes, but not easily and not quickly.

The current plan is to tidy up what we've done, integrate it with master (so that it's disabled by default), and gradually chip away at extending it, i.e. a similar approach to what we took with the python3 work.

We'll get some benefit from having the option of running tests differently. And it'll be insurance in case we really do need to drop cwrap in the future.

Further down the track, once we've addressed the above issues, we could decide to switch over so that namespaces are the CI default and cwrap is the fallback.

So why didn't we just use docker?

The Gitlab CI runs selftest inside a docker container. Running docker inside another docker container isn't really a practical solution. We want to keep using gitlab CI, therefore the testenvs themselves cannot use docker directly.

Docker is essentially just a convenient wrapper for the underlying kernel namespaces, which is doing the bulk of the containerization work anyway.