Samba codebase organization
Broadly speaking, the Samba source-code tree can be organized into the following major groups:
- Top-level libraries, which contain common code shared amongst the Samba processes.
- Source3, which is code primarily used by the file server and domain member.
- Source4, which is code primarily used by the Active Directory Domain Controller.
- Infrastructure components, which provide the build and test framework for Samba.
The following sections break down the codebase layout in more detail. This is not intended to be a comprehensive directory, and just covers the major components.
At the time of the merge, all code was located in either the
source4 directory. Over time, as duplicate code between the two branches is merged or shared, it is moved out to the top level of the source-code tree.
The major library components at the top level are:
- Third-party libraries: Samba needs some specific libraries to build. Some of these are included in the Samba source tree to aid in building on older and non-Linux platforms.
- General purpose libraries: Like any large program written in C, Samba has a number of internal helper functions that do not implement the protocols but are required to share code and make the rest of Samba possible.
- The sub-projects of
ldband live here, in the
- Common RPC client library (librpc) is the common (between and
source4) parts of Samba’s RPC client implementation.
- Common client library (libcli) is the common (between and
source4) parts of Samba’s client implementation for our protocols.
- Common authentication library (auth) is the common (between and
source4) parts of Samba’s authentication implementation.
- PIDL is Samba’s code auto-generation system for generating C code and C-Python bindings from IDL.
- python contains Samba’s Python library. It is not generally used in the file server, but is critical for the AD DC.
- CTDB is Samba’s clustered database (which enables the clustered file server).
The source3 directory is home to code primarily used by the file server and domain member.
source3 contains the following major components:
- The SMB file server (smbd) is the file server that most people think of when they think of Samba.
- The NBT name server (nmbd) provides NetBIOS over TCP/IP (NBT) for those who want it.
- Winbindd provides the connection between Samba and the AD Domain to which it is joined, for authentication and name lookup. It also manages the IDMAP, the mapping between Unix UID/GID values and Windows SID values. winbindd is used in both Domain member and AD Domain Controller modes.
- RPC client library (librpc) contains the parts of Samba’s RPC client implementation that are specific to the source3 subtree.
- SMB client library (libsmb) contains the parts of Samba’s SMB client implementation used in the source3 subtree.
- Authentication server (auth) contains the parts of Samba’s NTLM authentication server used in the source3 subtree. A shim module connects this to the source4 authentication code when Samba is an AD DC.
- Password database (passdb) contains the NTLM password database used in the source3 subtree. A shim module connects this to the sam.ldb data store when Samba is an AD DC.
- RPC server contains the source3 RPC server. However, most parts of this are not used in the AD DC; they are instead redirected to the equivalent parts of source4/rpc_server. When used, it either provides the classic (NT4-like) DC or services the SAM on each member or standalone server (each Windows machine has a database under its own name, and Samba does too).
- Print server functionality is located in the printing directory. Also relevant is the source3/rpc_server/spoolss code.
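The IDMAP role that winbindd plays (described in the list above) can be pictured as a pair of inverse functions between Windows SIDs and Unix IDs. The following is a minimal sketch of one possible algorithmic scheme, which offsets the RID (the last component of the SID) into a reserved Unix ID range. The domain SID, the range values and the function names are all assumptions made for illustration; this is not Samba's implementation.

```python
# Hypothetical values, chosen only for illustration.
DOMAIN_SID = "S-1-5-21-1111111111-2222222222-3333333333"
LOW_ID, HIGH_ID = 100000, 199999  # Unix ID range reserved for this domain
BASE_RID = 0                      # RID that maps onto LOW_ID


def sid_to_unix_id(sid: str) -> int:
    """Map a SID from DOMAIN_SID to a Unix ID by offsetting its RID."""
    domain, _, rid_text = sid.rpartition("-")
    if domain != DOMAIN_SID:
        raise ValueError(f"{sid} is not in domain {DOMAIN_SID}")
    unix_id = int(rid_text) - BASE_RID + LOW_ID
    if not LOW_ID <= unix_id <= HIGH_ID:
        raise ValueError(f"RID {rid_text} falls outside the configured range")
    return unix_id


def unix_id_to_sid(unix_id: int) -> str:
    """Inverse mapping: rebuild the SID from a Unix ID in the range."""
    if not LOW_ID <= unix_id <= HIGH_ID:
        raise ValueError(f"{unix_id} is outside the configured range")
    return f"{DOMAIN_SID}-{unix_id - LOW_ID + BASE_RID}"
```

Because the mapping is a pure function of the RID, every machine using the same range computes the same Unix ID without needing a shared table. Schemes that allocate IDs on demand need persistent storage instead.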
The source4 directory is home to code primarily used by the Active Directory Domain Controller.
source4 contains the following major components:
- Active Directory Database templates are located in setup. These templates fill out the basic structure of an Active Directory DC in the sam.ldb. This includes the full schema definition.
- Heimdal is an (old) branch/fork of Heimdal with some changes. An attempt is made to sync this Samba fork with a tree called lorikeet-heimdal (which is a true branch/fork of Heimdal). Patches applied here should first be incorporated upstream; however, this has not always happened.
- General purpose libraries (lib) that have not yet been migrated to the top level.
- Client library (libcli) contains the parts of Samba’s client implementation for our protocols specific to the
- RPC client library (librpc) contains the parts of Samba’s RPC client implementation specific to the codebase.
- libgpo contains Group Policy Object support.
- smbtorture binary, used for testing Samba and Windows. For historical reasons there are two
source4 framework is the one being extended at this time, but some tests will remain in source3/torture.
- Old NTVFS file server and VFS layer. The attempt at a new file server architecture is preserved in the following directories. These demonstrated a new VFS layer organised around SMB and NTFS semantics rather than the POSIX semantics that Samba used in smbd at the time (smbd now uses a hybrid approach).
- AD Services. The core AD DC is implemented in the named folders for each component:
- Authentication server (auth) contains parts of Samba’s authentication server used in the AD DC. A shim module connects smbd to this authentication code when Samba is an AD DC.
- The Directory Services DB (DSDB), which provides the main implementation behind the sam.ldb database (covered in more detail below).
Directory Services DB (DSDB)
The code that implements the main AD database is located in source4/dsdb. The
dsdb directory contains the following notable components:
- LDB modules: The LDB library provides a generic framework in which custom plug-in modules can be added to modify the database’s behaviour. DSDB uses the LDB library framework and defines its own set of plug-in modules (located in dsdb/samdb/ldb_modules) that are specific to Active Directory. The result is a database that provides the full AD semantics.
- Schema handling: The sam.ldb database conforms to the AD schema. The handling for loading and using the full AD schema is located in
- Replication handling (part): Some of the code related to handling AD’s DRS replication is located in
- KCC: The Knowledge Consistency Checker (KCC) is a process that ensures that a valid replication graph is maintained and that other periodic cleanup work is done. Parts of the implementation are located in source4/dsdb/kcc, mostly for historical reasons. Other KCC handling is located in python/samba/kcc.
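The plug-in idea behind the LDB modules described above can be sketched as a chain in which each module sees a request, may reject or alter it, and then hands it to the next module until it reaches the storage backend. The sketch below is a conceptual illustration only: the class names, the `add` method and the fixed timestamp are hypothetical and do not reflect the LDB API or any actual DSDB module.

```python
class Module:
    """One plug-in in the chain; by default it passes requests through."""

    def __init__(self, next_module=None):
        self.next = next_module

    def add(self, record: dict) -> dict:
        return self.next.add(record) if self.next else record


class Backend(Module):
    """End of the chain: stores records in a plain dict keyed by DN."""

    def __init__(self):
        super().__init__()
        self.store = {}

    def add(self, record: dict) -> dict:
        self.store[record["dn"]] = record
        return record


class RequireObjectClass(Module):
    """Rejects records that lack an objectClass attribute."""

    def add(self, record: dict) -> dict:
        if "objectClass" not in record:
            raise ValueError("objectClass is required")
        return super().add(record)


class StampWhenCreated(Module):
    """Adds a whenCreated attribute (fixed value, for the example only)."""

    def add(self, record: dict) -> dict:
        stamped = dict(record, whenCreated="20240101000000.0Z")
        return super().add(stamped)


# Modules are stacked outermost first; the backend sits at the bottom.
backend = Backend()
database = RequireObjectClass(StampWhenCreated(backend))
database.add({"dn": "CN=example", "objectClass": "user"})
```

The caller only ever talks to the top of the stack, so the semantics of the database are defined by which modules are loaded and in what order.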
The source-code tree contains the following components that are used to build and test Samba.
- Selftest is a bespoke framework for unit and integration testing. The tests themselves are located in many different parts of the source tree.
- python/samba/tests contains many of the Python unit tests.
- selftest/selftest.pl is the runner for
- selftest/target is Perl code that constructs Samba test environments.
- selftest/tests.py declares all the Samba tests (both unit and integration) and the test environment they should run in. Note that make test is also spread across source3 and source4.
- Wintest is a system that sits outside Samba’s selftest. Wintest builds and installs Samba and runs some limited testing against Windows automatically. Note that this system is not currently maintained or in use.
- Build system. the code in
wscript files in each directory of the source tree in order to build Samba.
- Documentation. Samba’s manpages are constructed from XML and are located in the docs-xml directory. In particular, the smb.conf manpage is constructed from a whole sub-directory of files in here.
- Note that the internal list of valid parameters in Samba is created from the XML documentation of each configuration parameter, ensuring the code and documentation are always consistent. Documented defaults are also checked for consistency in the automated test-suites.
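The idea of deriving the list of valid parameters from their XML documentation, so that the two cannot drift apart, can be sketched as follows. This is only an illustration of the approach: the XML layout, the element and attribute names, and the two example parameters are simplified assumptions and are not the actual docs-xml schema or Samba's generator.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified documentation source (not the real docs-xml format).
DOC = """
<parameters>
  <parameter name="example flag" type="boolean">
    <default>no</default>
  </parameter>
  <parameter name="example path" type="string">
    <default>/tmp</default>
  </parameter>
</parameters>
"""


def load_parameter_table(xml_text: str) -> dict:
    """Build a name -> {type, default} table from the documentation XML."""
    table = {}
    for node in ET.fromstring(xml_text).iter("parameter"):
        table[node.get("name")] = {
            "type": node.get("type"),
            "default": node.findtext("default", default=""),
        }
    return table


# The table used for validation is derived from the documentation itself.
PARAMETERS = load_parameter_table(DOC)


def is_valid_parameter(name: str) -> bool:
    """A parameter is valid only if it has a documentation entry."""
    return name in PARAMETERS
```

With this arrangement an undocumented parameter cannot exist, and a test can compare each documented default against the value the running code reports.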
Note that a significant amount of Samba’s codebase is autogenerated from IDL (Interface Definition Language) files. This code is spread across the source-code tree (i.e.
source4, and top-level libraries).
PIDL generates push (serialize, or pack) and pull (deserialize, or unpack) functions for all the structures described using IDL, and structures marked [public] are exposed in public functions in C and Python. This is very helpful for parsing not just DCE/RPC packets but any other regularly structured buffer. The IDL files are located:
For complex structures that don’t quite fit into IDL, a marker [noprint] can be specified. Hand-written parsers can then be written to handle these structures. These manual parsers are located in:
See the PIDL page for specifics on PIDL syntax and examples.
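To make the push/pull pairing concrete, the following sketch hand-writes the kind of function pair that a generator would emit for one structure. It follows the naming used by Samba's NDR functions, where push packs a structure into a buffer and pull unpacks a buffer into a structure. The record layout and function names are hypothetical, and the sketch uses Python's standard `struct` module with a simple little-endian layout; it does not reproduce NDR alignment rules or PIDL's generated code.

```python
import struct

# Hypothetical record layout: uint16 version, uint32 flags, then a uint16
# length followed by that many bytes of UTF-8 encoded name.
_HEADER = struct.Struct("<HIH")


def push_record(version: int, flags: int, name: str) -> bytes:
    """Push (pack) the fields into a byte buffer."""
    raw = name.encode("utf-8")
    return _HEADER.pack(version, flags, len(raw)) + raw


def pull_record(buf: bytes) -> tuple:
    """Pull (unpack) the fields back out of a byte buffer."""
    version, flags, length = _HEADER.unpack_from(buf, 0)
    end = _HEADER.size + length
    if len(buf) < end:
        raise ValueError("buffer is truncated")
    name = buf[_HEADER.size:end].decode("utf-8")
    return version, flags, name
```

Writing such pairs by hand for every structure is tedious and error-prone, which is the motivation for generating them from a single IDL description.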