SAMBA4 CLEANUPS NEEDED FOR CURRENT API USERS AND A 4.0 RELEASE
As some of you have noticed, a number of refactoring commits that try to add support for the MIT Kerberos library have been pushed recently. We would like to present the larger picture behind these changes.
Red Hat's FreeIPA/Samba team is working on making the Samba 3 and Samba 4 code bases a solution for the following use cases:
- Domain member file server
- Simple winbindd-based client
- Samba 3 and Samba 4 client libraries for use in OpenChange and FreeIPA, as well as various desktop projects (GNOME, KDE virtual file system support, for example)
- NT4 Domain Controller, a classical role used by Samba 3 and needed for FreeIPA cross-forest trusts with Active Directory
Currently Samba 4 is developed from the perspective of building an Active Directory-compatible Domain Controller where the Samba 4 daemon functions as an integrated CIFS, LDAP, and Kerberos (among others) server. To achieve this, the Heimdal Kerberos implementation has been embedded in our code, with its KDC code provided in a librarized form to allow an in-process KDC implementation.
The Samba development tree represents a so-called merged build where both the Samba 3 and Samba 4 code bases are located, with the ability to build both components of a traditional Samba 3 setup (smbd, nmbd, winbindd, various libraries) and Samba 4 parts (samba daemon, various client libraries). In order to build a full package from the merged build tree, one needs to use the WAF-based build system.
There is also still an option to build the Samba 3 code base only, when using the classical autoconf build infrastructure located in source3/.
Both the Samba 4 and Samba 3 code bases rely on Kerberos functionality to implement authentication features for various protocols. While the majority of requirements are related to the use of client-side Kerberos APIs, a tight integration with the KDC server implementation is needed for the Active Directory Domain Controller case.
The four use cases identified above are currently not properly supported by the merged Samba build and need to be supported if we want to release a unified project and retire the 3.0 series.
These are the issues we identified:
Fedora, openSUSE, Red Hat Enterprise Linux, SUSE Linux Enterprise, as well as many other GNU/Linux distributions use MIT Kerberos as their system-wide Kerberos library. There are different uses of it, but an important one is the ability to share a user's credentials cache between different applications running on the same host. Server applications have few in-host interoperability issues when using the Kerberos protocol. Applications running on behalf of users on the same host and sharing access to the same credentials cache, by contrast, need to collaborate more closely and ultimately need to share knowledge about the ccache structure.
Unfortunately, there are differences in how MIT and Heimdal implementations store certain features in the ccache. There are also differences in how ccaches are represented on disk. More on this below.
Samba 3 is usable with MIT Kerberos already. As a first step, what we are looking for is to make Samba 4 client code usable with MIT Kerberos as well so that we can achieve proper integration and interoperability on the same host for user applications.
The main issue here, which is well known, is the unconditional use of the Heimdal API. Some APIs are unique to Heimdal, and their usage breaks compilation against MIT Kerberos.
These cases, such as the Heimdal-specific configuration setup in source4/auth/kerberos/kerberos.c or ticket decoding, need to be solved by wrapping the code in common helpers whose bodies are implementation dependent.
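The idea of a common helper with an implementation-dependent body can be pictured with the following minimal sketch. It deliberately uses no real Kerberos headers: the SKETCH_USE_HEIMDAL switch, the sketch_krb5_state structure and all function names are hypothetical and exist only to show how callers can stay unaware of which implementation was selected at build time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical build-time switch; a real build would set it from configure checks. */
#ifndef SKETCH_USE_HEIMDAL
#define SKETCH_USE_HEIMDAL 0
#endif

/* Hypothetical state seen by callers; it carries no implementation-specific types. */
struct sketch_krb5_state {
	char config_path[256];
	const char *backend;
};

#if SKETCH_USE_HEIMDAL
/* Heimdal-only configuration calls would be confined to this function. */
static int backend_apply_config(struct sketch_krb5_state *st)
{
	st->backend = "heimdal";
	return 0;
}
#else
/* The MIT-specific equivalent would be confined to this function. */
static int backend_apply_config(struct sketch_krb5_state *st)
{
	st->backend = "mit";
	return 0;
}
#endif

/* Common helper: the only entry point the rest of the code would call. */
int sketch_krb5_set_config(struct sketch_krb5_state *st, const char *path)
{
	assert(st != NULL);
	if (path == NULL || strlen(path) >= sizeof(st->config_path)) {
		return -1;
	}
	strcpy(st->config_path, path);
	return backend_apply_config(st);
}
```

Callers only ever see sketch_krb5_set_config(); the #if block is the single place where the two implementations would differ.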
At the same time a more complex issue is ABI conflict. In the merged build the build system attempts to hide Heimdal symbols with use of various linker tricks. The merged build also uses system-supplied libraries which are dynamically linked against the system-provided Kerberos implementation, in our case MIT Kerberos. The behavior of the system and the embedded Heimdal libraries is not always consistent and breaks down in some cases.
As an example, in FreeIPA we use the Samba 4 python bindings to talk to the Samba 3 smbd server from an unprivileged process. This process is actually an Apache child running with mod_auth_kerb and receiving the ticket from a web client. When the S4U2Proxy Kerberos feature is used to allow a service to do constrained delegation (http://www.samba.org/~idra/blog/id_011.html), the MIT Kerberos implementation stores information about it in the ccache in the form of hints that Heimdal does not seem to understand. The Samba 4 python bindings use Samba 4 client code that is linked against Heimdal, so they try to use Heimdal-specific symbols. As a result, constrained delegation does not work because it is not properly recognized by Heimdal. Furthermore, depending on the order in which the libraries are loaded, some parts of the code still see symbols bleeding from one implementation into the other. This causes obvious issues when a symbol that is not shared by both implementations is used with data generated by the other library, whether that data is a file-based cache or a structure allocated in memory.
There are features in both the Heimdal and MIT implementations that are not implemented in the counterpart. One of those is the DIR: ccache format in MIT Kerberos, which allows tickets from different realms to be used at the same time. In such a case Heimdal code will not be able to see the ccache content at all. (Note that we are trying to switch to the DIR: ccache format by default in Fedora 18.) Other differences are functions like (Heimdal-only) krb5_get_init_cred_ops_set_win2k(), gss_import_creds(), gss_export_creds() or (MIT-only) krb5_get_init_creds_opt_set_out_ccache() and others.
It is impossible to confidently segregate two libraries with conflicting symbols unless static linking or full symbol renaming techniques are employed. Neither is done in current Samba code.
Another Kerberos issue is over-reliance on raw krb5 interfaces and, as a result, re-implementation of functionality available in GSSAPI 'for free', like the aforementioned S4U2Proxy or S4U2Self. Ideally we would reduce raw Kerberos use to the minimum, that is, to the places where GSSAPI does not give enough means. The current code is a result of evolutionary development from times when GSSAPI implementations were not so advanced. However, as the current code relies on the Heimdal implementation anyway, Samba 4 seems to be in a position to make better use of GSSAPI where possible.
Currently an assumption is made in the code that if Kerberos is used, we always have Kerberos support with GSSAPI, not only a raw Kerberos API. Thus a single HAVE_KRB5 define is used.
However, merging the GSSAPI requirement into HAVE_KRB5 and always assuming it is available isn't going to solve the issues described below.
One consequence of the WAF build system is the ability to clearly separate dependencies. This means include directories, as well as library paths and link-time options, can be made specific to one subsystem without affecting unrelated subsystems. This is a welcome improvement over the traditional approach in Samba.
In theory this would make it possible to hide compile-time dependencies specific to a subsystem from other subsystems that are using it. It would also allow reducing the amount of information that goes into a binary to the absolute minimum. The latter is important to improve startup speed, because the dynamic linker avoids spending time resolving dependencies and loading unrelated libraries.
Linking unrelated libraries is called overlinking. While it is not strictly a sin, it gives a fair load of headaches to distribution maintainers in case of ABI changes, as more packages are affected on rebuilds. It also gives security teams more work than really required when certain APIs are compromised: an increased amount of code to review for affected cases does not help to get better security releases.
How is Samba affected? There are a few areas.
API bleeding through common headers
There are headers in Samba that are used to abstract out differences in system-provided libraries. These headers are relying on some defines set out during the configuration step to decide which actual headers/libraries need to be provided and linked as they may vary from system to system.
These headers are included in the relevant subsystems to get access to the APIs. In some places they are also needed to get access to common structures referenced in a subsystem's own exported API.
The problem here arises in cases when a subsystem's prime purpose is to provide different functionality and those specific structures are used only in one or two exported function prototypes. As a result, all subsystems depending on this one will have to include common headers even if they are not directly using any of the provided functions or structures.
This is the case with Kerberos support. Most of the code in the merged build does not need to access Kerberos, but a few subsystems follow a style of providing Kerberos-dependent functions in the same headers as the rest of their functionality. Even if they are guarded with
    #ifdef HAVE_KRB5
    ... definitions
    #endif
the issue is that HAVE_KRB5 is *always* defined, whether or not you have 'krb5' as a dependency of your subsystem, because all HAVE_* defines are provided by the global config.h. So protection like this does not help at all in subsystems that never use Kerberos.
In addition, WAF will not provide the include paths on which the common headers rely unless the subsystem in question lists the subsystem that actually provides the underlying library as its direct dependency. We therefore end up declaring krb5 or gssapi as dependencies of things like LDAP_PRINTER (in libads) that do not really use krb5 or gssapi at all.
As a result, we are overlinking our libraries.
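One general C technique for limiting this kind of header bleeding, shown here only as a sketch and not as what the Samba tree currently does, is to expose an opaque forward declaration in the shared header and keep the Kerberos-dependent structure body in a single source file. All names below (sketch_auth_ctx and its functions) are hypothetical, and the two halves that would normally be a header and a .c file are shown in one listing so that it compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* ---- Part that would live in the shared header ---- */
/* Opaque handle: includers need no Kerberos types or Kerberos include paths. */
struct sketch_auth_ctx;

struct sketch_auth_ctx *sketch_auth_ctx_new(const char *principal);
const char *sketch_auth_ctx_principal(const struct sketch_auth_ctx *ctx);
void sketch_auth_ctx_free(struct sketch_auth_ctx *ctx);

/* ---- Part that would live in the one .c file that includes Kerberos headers ---- */
struct sketch_auth_ctx {
	/* A real implementation would keep its Kerberos handles here. */
	char *principal;
};

struct sketch_auth_ctx *sketch_auth_ctx_new(const char *principal)
{
	struct sketch_auth_ctx *ctx;
	size_t len;

	if (principal == NULL) {
		return NULL;
	}
	ctx = malloc(sizeof(*ctx));
	if (ctx == NULL) {
		return NULL;
	}
	len = strlen(principal) + 1;
	ctx->principal = malloc(len);
	if (ctx->principal == NULL) {
		free(ctx);
		return NULL;
	}
	memcpy(ctx->principal, principal, len);
	return ctx;
}

const char *sketch_auth_ctx_principal(const struct sketch_auth_ctx *ctx)
{
	assert(ctx != NULL);
	return ctx->principal;
}

void sketch_auth_ctx_free(struct sketch_auth_ctx *ctx)
{
	if (ctx != NULL) {
		free(ctx->principal);
		free(ctx);
	}
}
```

With this layout, a subsystem that merely passes the handle around needs neither the HAVE_KRB5 guard nor a krb5 entry in its dependency list.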
Now, someone may object this is not an issue because we pass in the as-needed flag to the linker. But the fact we ask the linker to trim out stuff is not really a good reason to get loose and overlink everywhere.
The as-needed flag has other consequences. For example, it can easily lead to underlinking depending on the order in which libraries are passed in, resulting in difficult-to-debug issues in some corner cases.
The main issue with this approach is that it exacerbates the problems we describe in the next section.
Dependency issues in client libraries
Due to the problems explained above and a certain liberal tendency to not clearly separate subsystems, another problem arises. Many of the useful client libraries that Samba 4 provides have an excess of dependencies that basically force them to get linked with unrelated and unused code. This has been slightly concealed in the waf build because every time a dependency conflict arises a new 'private' library is built. But when packagers try to package the minimum set of libraries needed for external users, it becomes immediately evident that a huge part of the project code becomes a dependency of these libraries.
Most of the code is not needed and shouldn't be linked into a client library, but some subsystems do not have interfaces abstract enough to attain the goal. Besides the evident issue of loading tons of unnecessary code into a client application, there are also confinement and security issues that derive from this situation. Client libraries often end up trying, unconditionally, to access server-side code and even server-side databases like secrets.tdb/ldb and samdb. As an example, this makes writing confinement policies quite difficult on systems with MAC-based security. It also raises questions about the ability to control and confine code that needs to be run as a client by the root user.
This situation is solvable, by changing APIs, and cutting dependencies where possible, or making them opaque so that the dependency is not formed at link time but rather at run time by passing in callbacks or vtables.
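The callback/vtable approach can be sketched as follows. This is an illustration under stated assumptions, not existing Samba code: the sketch_secrets_ops table, the sketch_client structure and the in-memory provider are all hypothetical. The point is that the client-side function never opens a database itself; it only calls whatever provider the caller registered at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical operations table supplied by the caller at run time. */
struct sketch_secrets_ops {
	/* Returns 0 and fills buf on success, -1 if no password is available. */
	int (*fetch_machine_password)(void *private_data, char *buf, size_t len);
};

struct sketch_client {
	const struct sketch_secrets_ops *ops; /* may be NULL in a pure client */
	void *private_data;
};

/* Client-side code: it has no link-time dependency on any secrets store. */
int sketch_client_get_password(const struct sketch_client *c, char *buf, size_t len)
{
	assert(c != NULL);
	if (c->ops == NULL || c->ops->fetch_machine_password == NULL) {
		/* No provider registered: fail cleanly, touch no server files. */
		return -1;
	}
	return c->ops->fetch_machine_password(c->private_data, buf, len);
}

/* In-memory stand-in for a provider that a server-side component could register. */
static int sketch_static_fetch(void *private_data, char *buf, size_t len)
{
	const char *secret = private_data;

	if (secret == NULL || strlen(secret) >= len) {
		return -1;
	}
	strcpy(buf, secret);
	return 0;
}

const struct sketch_secrets_ops sketch_static_ops = {
	.fetch_machine_password = sketch_static_fetch,
};
```

A plain client would leave ops set to NULL, while a domain member service would plug in a provider backed by its own storage; only the latter would then need to link against the database code.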
The resolving code in source4/libcli/resolve currently relies on Heimdal's libroken library. libroken provides a simple wrapper on top of libresolv to query DNS, and source4/libcli/resolve implements an asynchronous version of it by using the composite API and forking a child to do the resolution. In the merged build we have *three* different DNS resolving libraries in use, and unifying them would make sense from a general perspective, not only for the purpose of the MIT build.
Simo has started resolve refactoring in his tree but quickly hit the use of rk_dns_lookup(). rk_dns_lookup() implements a simple lookup of A or AAAA type queries against the system-provided resolver. This code can be rewritten directly with res_search() or dns_search() without relying on libroken. In fact, it is reasonable to extend the existing lib/addns/ library to allow using the system-provided resolver. To do so, one needs to implement support for a NULL nameserver in dns_open_connection() and then treat that case properly in the dns_transaction() code by using the system resolver instead of talking directly to the name server. This way we can keep the same resolving interface.
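The NULL-nameserver idea can be pictured with the sketch below. It does not reproduce the real lib/addns signatures: sketch_dns_open(), sketch_dns_transaction(), the connection structure and the two fake back ends are hypothetical, and the back ends only return fixed strings so that the example stays free of network access. In a real implementation the system path would be where a call such as res_search() is made.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum sketch_dns_mode { SKETCH_DNS_DIRECT, SKETCH_DNS_SYSTEM };

struct sketch_dns_conn {
	enum sketch_dns_mode mode;
	char nameserver[64]; /* only meaningful in DIRECT mode */
};

/* Back ends are passed in as callbacks so the sketch needs no network. */
typedef int (*sketch_query_fn)(const char *name, char *answer, size_t len);

/* A NULL nameserver selects the system resolver instead of being an error. */
int sketch_dns_open(const char *nameserver, struct sketch_dns_conn *conn)
{
	assert(conn != NULL);
	memset(conn, 0, sizeof(*conn));
	if (nameserver == NULL) {
		conn->mode = SKETCH_DNS_SYSTEM;
		return 0;
	}
	if (strlen(nameserver) >= sizeof(conn->nameserver)) {
		return -1;
	}
	strcpy(conn->nameserver, nameserver);
	conn->mode = SKETCH_DNS_DIRECT;
	return 0;
}

/* Same interface for both paths; only the dispatch differs. */
int sketch_dns_transaction(const struct sketch_dns_conn *conn, const char *name,
			   char *answer, size_t len,
			   sketch_query_fn direct_query, sketch_query_fn system_query)
{
	sketch_query_fn fn;

	assert(conn != NULL);
	fn = (conn->mode == SKETCH_DNS_SYSTEM) ? system_query : direct_query;
	if (fn == NULL || name == NULL) {
		return -1;
	}
	return fn(name, answer, len);
}

/* Fake back ends: they record which path was taken rather than resolving. */
int sketch_fake_direct_query(const char *name, char *answer, size_t len)
{
	(void)name;
	if (len < sizeof("direct")) {
		return -1;
	}
	strcpy(answer, "direct");
	return 0;
}

int sketch_fake_system_query(const char *name, char *answer, size_t len)
{
	(void)name;
	if (len < sizeof("system")) {
		return -1;
	}
	strcpy(answer, "system");
	return 0;
}
```

Because the choice is made once when the connection is opened, callers keep using a single transaction function regardless of which resolver answers the query.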