Samba Security Documentation

Revision as of 17:27, 19 August 2019 by Mmuehlfeld (talk | contribs) (Migrated the content from the Catalyst Samba repository on https://gitlab.com/catalyst-samba/samba-docs/wikis/home)

This Document

Samba code overview prepared by Catalyst. The emphasis is on aspects of the AD DC relevant for security.


Overview of Samba functionality

Samba is the standard Windows interoperability suite of programs for Linux and Unix.

Samba is an open-source software project that dates back to 1992. It takes the protocols that are essential to the operation of a Windows network and provides support for them on Linux, Unix, and Mac OS systems. This allows the clients and servers in a network to be either Windows- or Samba-based, and to integrate seamlessly.

Samba gives network administrators freedom in how they structure their networks.

Samba contains many features. In general the Samba server operates as either:

  • A File server, which can also provide other network services, such as printing or NetBIOS name resolution.
  • An Active Directory Domain Controller, which provides directory-based network authentication (as well as all the Samba file server functionality).

Besides its server functionality, Samba provides tools for Linux-based clients to access Windows-based file shares or Active Directory services. Samba also provides Domain Member and NT4-like Domain Controller functionality, which allows it to integrate with other network servers within a particular domain.


File server

Samba is best known as a File Server, sharing POSIX file systems to Microsoft’s Windows clients. Samba translates between the NTFS file system semantics expected by modern Windows clients and the POSIX file system on which it runs, including locking, Access Control Lists, and case insensitivity.

Core to the file server operations is the SMB (Server Message Block) protocol, which in the past has been known as CIFS (Common Internet File System).

As well as being a file server, Samba can also function as:

  • Clustered file server (CTDB). A clustered version of Samba is available using the ctdb binary to link multiple Samba servers that share a common file system, so that they appear as a single SMB file server.
  • Print Server. As well as sharing files, Samba can share printers, which are either locally attached or are remote printers connected to the local CUPS (Common UNIX Printing System) server.
Samba can also provide automatic driver download to allow clients to access and install the correct driver for available printers. This can be used to create a central print server.
  • Name server. Samba can announce its name and accept name resolution requests via NetBIOS broadcasts and maintain the database of names in the Network Neighbourhood (the browse list). It also supports the centralised WINS protocol, allowing a single server to maintain the registrations.


Active Directory Domain Controller

Active Directory (AD) is a set of network services that run on a Domain Controller (DC). The AD DC administers a domain of users and computers. The AD DC is responsible for verifying the identity of hosts in the network, using a common database (or directory).

Active Directory provides secure centralised authentication, authorization to allow access to different networked resources, as well as address-book services. A range of different network protocols are involved, and Samba (specifically the samba binary) acts as server for each protocol.

The server responsibilities include:

  • File and NetBIOS Server. The AD DC must always provide file server and NetBIOS functionality. The file server always runs as a separate binary, called smbd. Note that when run as an AD DC, Samba uses different NetBIOS server code (rather than the nmbd binary), which also includes multi-master WINS replication support.
  • LDAP Server. LDAP (Lightweight Directory Access Protocol) is one way AD clients look up user information or perform administration. LDAP is the primary administrative interface to Active Directory and is generally the most comprehensive view of the database. It is, however, also the least structured way to manipulate data stored in Active Directory, and so must often be used with care.
  • Kerberos KDC. An extended Kerberos version 5 is core to Active Directory, and the AD DC contains a Kerberos Key Distribution Center (KDC), the central authentication server for this protocol.
  • Database consistency. The common database is distributed across multiple Domain Controllers, whilst preserving database consistency. This feature is called DRS (Directory Replication Service).
  • DNS Server. Samba provides both an internal DNS (Domain Name System) server and a shared-library plugin for BIND 9.8 and above.
  • DCE/RPC Server for Microsoft protocols. Key network services (e.g. LSA, SAMR, NETLOGON) actually operate over a common transport called DCE/RPC. These services, along with the DCE/RPC transport, will be explained in more detail later in the document.
  • Group Policy server. Samba acts as a Group Policy server, although this simply consists of providing files that the clients download and parse, so this functionality is actually provided by the file server (via the [netlogon] share). Note that it is critical for client security that access to this share only be made over an SMB-signed connection, and clients need to enforce this.


Domain member

A domain member is essentially a machine that forwards authentication requests to an AD DC. The domain member joins an AD domain and uses that domain as the source of authentication and authorization for connecting users. This allows transparent access to the resources on that server, without the server maintaining a distinct password list.

The domain member is often used when Samba is run solely as a file server (rather than an AD DC). The domain member plumbs the authentication required by the file server through to an AD DC in the network. The domain member can also query domain information on the AD DC. The domain member functionality uses winbindd.

A Linux-based workstation can also use the domain member functionality to authenticate itself (i.e. allow desktop login).

A domain member holds a Kerberos principal in the domain, and has a corresponding machine account in the directory that can be used to make or accept Kerberised network requests.


Client utilities

Samba provides a wide range of client utilities. For example, these client tools allow a Linux-based client to talk to a Windows file server. These utilities are documented in more detail in Appendix I. Samba utilities.


samba-tool

One of the most important Samba command-line tools is samba-tool, which is primarily used to administer the Samba server. samba-tool provides an extensive set of functionality. For example, creating a new AD domain and adding a new DC to an existing domain are both done using different samba-tool commands.

samba-tool is based on a set of Python APIs in the Samba codebase. This set of Python APIs could potentially be re-used to build custom tools.
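
As a rough illustration of the two tasks mentioned above, the sketch below only builds the argument lists for a provision and a join; it does not run them. The realm, domain and user names are placeholders, and the helper functions are hypothetical wrappers rather than part of Samba's Python APIs.

```python
import shlex


def provision_cmd(realm, domain):
    """Argument list for creating a new AD domain with the internal DNS server."""
    return ["samba-tool", "domain", "provision",
            "--realm=" + realm, "--domain=" + domain,
            "--server-role=dc", "--dns-backend=SAMBA_INTERNAL"]


def join_cmd(dns_domain, admin_user):
    """Argument list for adding this host as a new DC to an existing domain."""
    return ["samba-tool", "domain", "join", dns_domain, "DC", "-U", admin_user]


# Placeholder names; substitute the values for a real deployment.
for argv in (provision_cmd("SAMDOM.EXAMPLE.COM", "SAMDOM"),
             join_cmd("samdom.example.com", "Administrator")):
    print(shlex.join(argv))
```

Passing either list to subprocess.run() on a host with Samba installed would execute the command; both need to be run with sufficient privileges.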


Legacy NT4-like Domain Controller

Samba can provide a Classic Domain Controller using technologies similar to NT4. Prior to supporting AD DC (i.e. on Samba 3 releases), the solution was to back Samba on to an external LDAP server such as OpenLDAP. This solution was very popular for being able to emulate an NT4 domain, scale very well, and leverage OpenLDAP for multi-master replication. This solution is still supported on Samba 4 releases, although it’s recommended to run Samba as an AD DC instead.

This domain is also not entirely NT4-like because Windows clients will use modern cryptography against such a Samba domain that NT4 never supported.

Sometimes users refer to this solution as a Samba 3 domain, although that name is not really correct. Samba tried to refer to this as a Classic Domain, however that name never really caught on.


Development history

Before trying to understand Samba in more detail, it’s helpful to know a little about the development history behind Samba. Of particular interest is:

  • The nature of the initial Samba development, which led to the comparative testing principle behind Samba development.
  • The Samba4 fork, which is key to understanding Samba’s code layout.



Initial Samba development

Samba was started by Andrew Tridgell as an SMB client and server to connect between DEC Pathworks and a Sun Workstation. However, between 1992 and 2007, Samba developers did not have access to an authoritative source for Windows protocol definitions. Instead, developers relied on network protocol analysis using packet sniffing and other tools to glean insights into the behaviour of the Windows protocols. Andrew Tridgell wrote an elegant analogy of what this process actually involved in his document How Samba was written.

In 2007, the EU Judgement on software competition and the creation of the Protocol Freedom Information Foundation led Microsoft to publish comprehensive documentation on all Microsoft protocols relevant to Samba. Since then, the Microsoft documentation has formed the starting point for any new Samba project or investigation. These Microsoft specifications, and how they relate to Samba, are covered in more detail in the next chapter.

Although Microsoft has made its protocols public, many documents contain errors and omissions. Therefore manual and automatic comparative testing is still required to verify specific Windows Server behaviour. Samba development generally involves writing tests for a specific feature that run against a Windows DC. Once the developer has a good set of tests that pass against Windows, they then run the tests against a Samba DC and begin implementing the feature on Samba.

Once the Samba feature is complete, the resulting set of automated tests will run against both Windows and Samba DCs, and produce consistent results. The new automated tests are then integrated into Samba’s self-test suite, which means the tests get run as part of the Continuous Integration (CI) that happens every time new Samba code is delivered. This ensures that Samba remains permanently compliant with Windows behaviour.


Samba4 fork

In 2003, Samba branched for Samba 4.0, leaving Samba 3.0 as the maintenance branch. However, due to a variety of mis-steps, the development effort for Samba 4.0 did not make an orderly and timely progression to stability and subsequent release. Instead, it took on an ever larger set of features, in particular the Active Directory Domain Controller.

In the meantime, the Samba 3.x branch continued to be developed (beyond basic bug fixes and maintenance). Releases Samba 3.2 and beyond were made at the same time as Samba4 (as it was known) continued in development.


Samba reunification

As it became clear that Samba 4.x would never cover all the features and behaviors of the Samba 3.x effort, it was decided to merge the code trees, which had significantly diverged by this point.

To merge the source-code trees, a rewrite of the git tree (using git filter-branch) was done in 2008, with the former source/ directory of each tree renamed as source3 and source4 in the merged tree.

Besides complicating the codebase with separate source3 and source4 directories, the re-unification had other side-effects:

  • A Legacy NTVFS file server. The NTVFS file server that was written as part of the Samba4 effort is abandoned. The code still exists because it’s used in the self-test environment, however, it’s not used in production Samba releases. A number of test-suites rely on the NTVFS server for testing client-side tools.
  • Extra plumbing. The AD DC uses the source3 file server (smbd), but it is a completely separate daemon. The protocol code used by the AD DC is now spread between the source3 and source4 parts of the source-code tree.
This requires extra plumbing that hooks the smbd file server into the rest of the AD DC for the purposes of authentication. The samba binary handles starting smbd, as well as winbindd.

Further Reading


Reference Documents

There are numerous reference documents that describe how Active Directory, and the related network protocols, should behave. This chapter describes these specifications in more detail, organized into documents that:

  • Provide a high-level overview of how the Active Directory protocols tie together.
  • Describe the operation of the underlying network protocols (i.e. DNS, LDAP, etc).
  • Describe the major RPC interfaces that provide specific sub-sets of AD functionality.

Where possible, this section describes how complete the Samba implementation is, makes note of any significant deviations from the specification, and links to the relevant Windows Protocol technical specification.

Note that if a link becomes out-of-date, the documents can be downloaded from the Microsoft website as either Overview Documents or Technical Documents. Alternatively, all the Windows protocol documents can be downloaded as a single Windows Protocol.zip file.


Windows overview documents

The Windows technical overview documents that are most useful for gaining a better understanding of the protocols involved in Active Directory are the following:

  • MS-REF Windows Protocol Master Reference.
This contains a summary of the external references and RFCs across a number of different protocols.
  • MS-ADTS Active Directory Technical Specification.
This documents many of the non-protocol specific technical internals of Active Directory. In particular, it documents the extensions made to the standard LDAP protocol. This includes the introduction of the rootDSE, a top level object designed to join together separate distinguished name namespaces. It includes references to schema class and attribute specifications, and also lists critical objects required by a domain. Samba supports the core of this document up to a Windows 2008 R2 Server level. Some experimental preparations have been made to bring support up to the Windows 2012 R2 Server level on Samba.
Other areas of note, along with the corresponding level of implemented support in Samba:
      • Trusted domain support (partly implemented).
      • Deletion handling of LDAP objects (partly implemented, no recycling bin feature).
      • Claims-based authentication (not implemented).
      • Knowledge Consistency Checker (KCC) (no trusted domain support and with limited failover).
      • Extended rights in access control lists (many rights are not implemented).
      • Connection-less LDAP (CLDAP) (implemented, with some possible minor exceptions).
  • MS-ADOD Active Directory Protocols Overview.
This provides a high-level overview of the protocol interactions within Active Directory. It contains a number of diagrams and user-flows which describe how a user might interact with the system. Samba implements most of these protocols; however, Samba does not implement services over HTTP (which appear to use SOAP/XML). This document also describes transport and message-level security features, indicating for each protocol how traffic is signed or encrypted.


Protocols of Interest

The operation of the Samba AD DC involves extensive use of the following network protocols:

  • DNS Domain Name System. Both Samba and Windows implement three core RFCs: RFC1034, RFC1035, RFC2136. Samba notably does not implement DNS load balancing or DNS round robin (RFC1794). There are also some notable deviations between the implementation-specific behaviour of Samba and Windows. In some cases, DNS queries will fail to return any response from Windows if no such name exists, where Samba might return the NXDOMAIN error code.
  • LDAP Lightweight Directory Access Protocol. LDAP has a large number of associated specifications, including: RFC4510, RFC4511, RFC4512, RFC4516. MS-REF contains some additional specifications associated with this protocol. Of the core specifications, implementation in Samba should be complete; however, there are a large number of controls for triggering different behaviour in the LDAP server which are likely missing in Samba (either from the original LDAP specifications or from the Microsoft documentation). The LDAP server was implemented by inspecting real-world traffic, so the most commonly used controls and extensions should be implemented.
  • KRB5 Kerberos 5. This is documented in MS-KILE, which has listings to Kerberos specifications. Samba implements the core Kerberos functionality as described in RFC4120, but notably lacks FAST (Flexible Authentication Secure Tunneling) support which is described in RFC6113.
  • SMB Server Message Block Protocol. The different versions of the protocol are documented across different Microsoft specifications: MS-CIFS, MS-SMB, MS-SMB2. The latest version of the protocol (SMB3) is currently defined in MS-SMB2, which has only a partial implementation in Samba (but note that certain features of SMB3 are still available for use).
  • GPO Group Policy Objects. An overview of the Group Policy management system is described in MS-GPOD, which also describes the different GPO extensions implemented by Windows. In a number of cases, these extensions are client-side only and so Samba does not need to implement any behaviour; however, replication of group policies (MS-FRS2) is still incomplete and so manual replication between domain controllers must be done. Server-side extension behaviour is not supported.


DCE/RPC Endpoints

A large amount of the Active Directory functionality is implemented using a DCE/RPC (Distributed Computing Environment/Remote Procedure Calls) framework. The functionality is not a separate protocol per se, but can achieve protocol-like functionality using a common RPC transport. Different sub-sets of AD functionality are handled by different DCE/RPC ‘endpoints’, which are essentially a set of related RPC APIs. Windows implements extensions (described in MS-RPCE) on top of the original DCE 1.1: Remote Procedure Call (RPC) Specification, which Samba largely implements in its server either directly or in a compatible way. One notable feature which is missing from Samba is DCE/RPC pipe support, which specifically allows a more efficient mode of operation (but can mostly be ignored otherwise).
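
The idea of an endpoint as a set of related RPC calls on a shared transport can be sketched as a dispatch table. This is only an illustration: the endpoint names, operations and the toy wire format (a two-byte operation number followed by the arguments) are invented here and are not the real DCE/RPC or NDR encoding.

```python
import struct


def op_add(payload):
    """Unmarshal two unsigned 32-bit integers, add them, marshal the result."""
    a, b = struct.unpack("<II", payload)
    return struct.pack("<I", a + b)


def op_upper(payload):
    """Return the payload upper-cased; no unmarshalling needed for raw bytes."""
    return payload.upper()


# Hypothetical endpoints: each maps an operation number to a handler function.
ENDPOINTS = {
    "demo_math": {0: op_add},
    "demo_text": {0: op_upper},
}


def handle_request(endpoint, packet):
    """Read the operation number, then call the matching handler with the rest."""
    (opnum,) = struct.unpack_from("<H", packet)
    try:
        handler = ENDPOINTS[endpoint][opnum]
    except KeyError:
        raise LookupError("unknown call %s/%d" % (endpoint, opnum))
    return handler(packet[2:])


reply = handle_request("demo_math", struct.pack("<HII", 0, 2, 3))
print(struct.unpack("<I", reply)[0])  # prints 5
```

In Samba the marshalling and dispatch code is generated from IDL files rather than written by hand as it is here.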

There are some high level differences between Samba and Windows in regards to connecting to the DCE/RPC server. Different DCE/RPC endpoints have different encryption or signing requirements, but generally speaking, Windows demands higher levels of protection (i.e. encryption). Currently, Samba allows connecting to the DCE/RPC server anonymously (without authentication), whereas Windows does not.

Listed below are some of the main endpoints which can be connected to over the DCE/RPC protocol:

  • DNSSERVER. Described in MS-DNSP, DNSSERVER provides an interface for administrating a DNS server. Samba implements the standard query, update and delete methods but does not implement many of the more complex administration methods which allow modification of different zone or DNS server settings. There are also some slight name normalization differences between Samba and Windows.
  • LSARPC (Local Security Authority RPC). Described in MS-LSAD, LSARPC provides an interface for managing machine and domain security policies. The protocol can change the rights and privileges given to different users (and security principals), securely store secrets on the server, manage trusted domain objects and allows manipulation of some other security settings. The most common operations in LSARPC have been implemented, but there are a number of lesser used operations which have not been implemented.
  • NETLOGON. Described in MS-NRPC, NETLOGON allows authentication of users and machines onto a domain. Samba appears to implement the specification sufficiently (some unused or rarely used functions are not implemented).
  • SAMR (Security Account Manager Remote Protocol). Described in MS-SAMR, SAMR allows remote management of users and groups which are managed by the Security Account Manager (SAM). Samba appears to implement the specification sufficiently (some unused or rarely used functions are not implemented).
  • DRSUAPI (Directory Replication Service API). Described in MS-DRSR, DRSUAPI allows for domain data replication between two Active Directory servers. The CrackNames function, which converts directory names of one type into a different type, is notably incomplete as many input and output formats are not handled in every case. There are also some implementation-specific deviations (i.e. not described in the specification) in the order that replication data is chunked (and the chunk size limits) between Samba and Windows. Also, Samba does not implement one of the extended operations to the GetNCChanges function. The core replication functions are implemented, but there are a number of query calls not implemented in Samba.
  • SRVSVC (Server Service). Described in MS-SRVS, SRVSVC allows for remote administration of file and print shares (via SMB). The pipe allows querying of diagnostic information about existing shares, as well as the ability to add, modify or delete them. Samba only implements the basic querying, and any modifications to shares or to the list of shares is not enabled by default (although some experimental code exists to do so).
  • WINREG (Remote Registry Protocol). Described in MS-RRP, WINREG allows for a remote client to manipulate a hierarchical data store, specifically the Windows registry. Basic querying, insertion, modification, deletion are implemented but more complex queries and registry key types are unsupported both by the Samba implementation in the RPC server and the underlying library that is used to implement the registry. Registry support in Samba is still in an incomplete state, and there is also a notable lack of testing of all the pieces.
Samba also does not support remote manipulation of registry objects via DCOM (Distributed Component Object Model), which has not been implemented in the DCE/RPC server. This group of protocols is described in MS-COM and MS-DCOM.
  • WKSSVC (Work Station Service Remote Protocol). Described in MS-WKST, this protocol allows remote tasks to be performed on a computer in a network. This appears mostly unimplemented in Samba, apart from a basic query information call.
  • PROTECTED_STORAGE (Backup Key Remote Protocol). Described in MS-BKRP, this protocol encrypts a client’s secret values using the help of a remote server (to be decrypted by the server later). This appears to be completely implemented in Samba.
  • EVENTLOG/EVENTLOG6 (Event-Log Remoting Protocol). Described in MS-EVEN and MS-EVEN6, these endpoints allow reading of event logs stored on a remote computer. In Samba, EVENTLOG is implemented to some degree but remains unused and untested. EVENTLOG6 has no implementation (only an empty stub call).


Samba architecture

Samba’s architecture is very complex. However, taking a very simplified view, the Samba AD DC can be thought of in terms of the main roles it performs:

  • A Directory Services Database, that provides the Active Directory objects and semantics.
  • A set of server processes that respond to network protocol requests.

This chapter will also cover some of Samba’s architectural infrastructure, which can be thought of as the ‘glue’ that binds the major components together.


Directory Services Database

At its core, the AD DC uses information contained in a primary database to provide network services. This primary database is known as the Directory Services Database (DSDB). On disk, the DSDB roughly corresponds to the sam.ldb file.

The AD DC’s basic operation involves taking a network protocol request, using the information in the packet to build a database request (usually a query), and responding accordingly to the client based on the result received back from the database. A very straightforward example of this (database-to-network-protocol mapping) would be LDAP. LDAP add, modify and delete operations correspond to add, modify and delete of objects in the underlying database. LDAP searches allow for a simple read over the entire database, but a client can also build a more complex structured query expression in order to select a subset of the data (e.g. all users). To do this, Samba has to implement a mapping between an LDAP search expression and a database search (or multiple chained searches).
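
To make the mapping from a search expression to a database search more concrete, here is a minimal sketch that parses a small subset of the LDAP filter syntax and evaluates it against an in-memory dictionary. It is not Samba's implementation: the toy records, the function names and the full scan (with no use of indexes) are all assumptions made for illustration.

```python
def parse(expr, pos=0):
    """Parse one parenthesised filter at expr[pos]; return (tree, next position)."""
    assert expr[pos] == "("
    pos += 1
    if expr[pos] in "&|!":
        op = expr[pos]
        pos += 1
        children = []
        while expr[pos] == "(":
            child, pos = parse(expr, pos)
            children.append(child)
        assert expr[pos] == ")"
        return (op, children), pos + 1
    end = expr.index(")", pos)
    attr, value = expr[pos:end].split("=", 1)
    return ("=", attr.lower(), value), end + 1


def matches(tree, record):
    """Evaluate a parsed filter against one record (attribute -> list of values)."""
    op = tree[0]
    if op == "&":
        return all(matches(child, record) for child in tree[1])
    if op == "|":
        return any(matches(child, record) for child in tree[1])
    if op == "!":
        return not matches(tree[1][0], record)
    _, attr, value = tree
    values = record.get(attr, [])
    if value == "*":                      # presence test, e.g. (mail=*)
        return bool(values)
    return any(v.lower() == value.lower() for v in values)


def search(db, expr):
    """Scan every record and return the DNs that satisfy the filter."""
    tree, _ = parse(expr)
    return sorted(dn for dn, record in db.items() if matches(tree, record))


# Toy database keyed by DN; attribute names are stored in lower case.
DB = {
    "CN=alice,CN=Users,DC=example,DC=com": {"objectclass": ["top", "person", "user"], "cn": ["alice"]},
    "CN=bob,CN=Users,DC=example,DC=com": {"objectclass": ["top", "person", "user"], "cn": ["bob"]},
    "CN=staff,CN=Users,DC=example,DC=com": {"objectclass": ["top", "group"], "cn": ["staff"]},
}

print(search(DB, "(objectClass=user)"))
print(search(DB, "(&(objectClass=user)(!(cn=bob)))"))
```

The first search selects all users; the second shows how a nested expression narrows the result to a subset.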

The DSDB is primarily implemented through shared libraries. Each separate Samba server process loads the shared libraries in order to connect to the DSDB. The DSDB is locked appropriately across read and write operations, and the database contents are constantly mirrored to disk (offering transactional behaviour).


The underlying database

The overall DSDB is made up of the following components:

  • A generic database framework, called LDB (LDAP-like Database).
  • A set of DSDB plugin modules that provide the Active Directory semantics.
  • A database backend implementation: either TDB (Trivial Database) or LMDB (Lightning Memory-Mapped Database).

The LDB is generic library code that provides a framework that can be used for any LDAP database. It gets used for several smaller databases within Samba (e.g. secrets.ldb, idmap.ldb). Taken on its own, the LDB code just provides some simple APIs for an LDAP database, e.g. ‘add’, ‘delete’, ‘modify’, ‘search’, etc. However, the LDB supports plugin-modules that allow the simple database to be transformed into a more complex Active Directory database.

The DSDB plugin-modules are a set of shared libraries that each provide a sub-set of Active Directory functionality. For example, the operational module provides support for Operational LDAP Attributes in Active Directory, which don’t exist directly in the underlying database but get constructed dynamically in response to specific queries. The DSDB modules are explained in more detail in the next section.
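
A constructed attribute can be sketched as a value that is computed when a search asks for it rather than read from storage. The example below uses canonicalName, which Active Directory derives from an object's distinguished name. The stored record, the function names and the simplified DN parsing (no escaped commas) are assumptions for illustration and do not reflect the code of Samba's operational module.

```python
def canonical_name(dn):
    """Turn 'CN=Bob,OU=Staff,DC=example,DC=com' into 'example.com/Staff/Bob'."""
    parts = [rdn.split("=", 1) for rdn in dn.split(",")]
    domain = ".".join(value for key, value in parts if key.upper() == "DC")
    path = [value for key, value in parts if key.upper() != "DC"]
    return "/".join([domain] + path[::-1])


# What is actually stored: note there is no canonicalName attribute.
STORED = {"CN=Bob,OU=Staff,DC=example,DC=com": {"cn": ["Bob"]}}


def search(dn, attrs):
    """Return the requested attributes, constructing canonicalName on demand."""
    record = dict(STORED[dn])
    if "canonicalName" in attrs:
        record["canonicalName"] = [canonical_name(dn)]
    return {key: value for key, value in record.items() if key in attrs}


print(search("CN=Bob,OU=Staff,DC=example,DC=com", ["cn", "canonicalName"]))
```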

Finally, the DSDB has a backend implementation that handles storing the actual database information in terms of key-value pairs. The backend integrates into LDB as a key-value abstraction layer with transactional semantics. Typically the key stored is the objectGUID and the value is the entire object record. However, attributes that are indexed for performance also get stored as separate records. Historically Samba has used TDB as its database backend. In Samba v4.9, initial support was also added for an LMDB backend.
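
The key-value layout described above can be sketched with a plain dictionary standing in for TDB or LMDB. The key formats ("GUID=..." and "INDEX:attr:value") and the class below are hypothetical, chosen only to show whole records stored under their GUID and indexed attributes stored as separate records; transactions are omitted.

```python
import uuid


class KeyValueBackend:
    """Toy backend: one record per object plus one record per index entry."""

    def __init__(self, indexed_attrs):
        self.store = {}                       # stands in for the on-disk database
        self.indexed_attrs = set(indexed_attrs)

    def add(self, record):
        guid = str(uuid.uuid4())
        # The whole object is the value; its GUID is the key.
        self.store["GUID=" + guid] = dict(record, objectGUID=guid)
        # Each indexed attribute gets a separate record listing matching GUIDs.
        for attr in self.indexed_attrs & record.keys():
            index_key = "INDEX:%s:%s" % (attr, record[attr])
            self.store.setdefault(index_key, []).append(guid)
        return guid

    def lookup(self, attr, value):
        """Find objects through the index, without scanning every record."""
        guids = self.store.get("INDEX:%s:%s" % (attr, value), [])
        return [self.store["GUID=" + guid] for guid in guids]


backend = KeyValueBackend(indexed_attrs=["sAMAccountName"])
guid = backend.add({"sAMAccountName": "alice", "description": "test user"})
print(backend.lookup("sAMAccountName", "alice")[0]["objectGUID"] == guid)  # prints True
```

Only attributes named in indexed_attrs can be found through lookup(); any other attribute would need a scan over all object records.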

To recap, the LDB provides APIs to access or modify LDAP records. The DSDB modules plug-in to the LDB framework, and can adjust the LDAP records on the way through, in order to conform to the Active Directory semantics. Finally, LDB takes the resulting LDAP record and stores it as a key-value pair using the backend database implementation.


DSDB modules

Each DSDB module generally focuses on one small sub-set of Active Directory functionality. Some examples of the more note-worthy DSDB modules are:

  • RootDSE: dynamically constructs the rootDSE object, which contains a top-level view of the database, all the partitions it contains, and the LDAP extended controls that the server supports.
  • Schema: several DSDB modules combine to provide the underlying Schema functionality that turns sam.ldb into an Active Directory database:
      • schema_load, which loads the Schema objects initially, and also handles Schema updates.
      • schema_data, which verifies that only the FSMO (Flexible Single Master Operation) master can update the schema.
      • objectclass and objectclass_attr, which actually enforce that database objects and attributes conform to the Schema semantics. For example, verifying that the child-class is appropriate, or that an attribute’s value is the correct type.
  • ACL: checks that the user’s access rights allow it to read or write a particular object or attribute. Unprivileged requests are either rejected or, for search results, the unprivileged information may simply be suppressed. The ACL functionality is really split over two modules:
      • acl_read, which handles LDB search operations.
      • acl, which handles the remaining LDB operations (‘modify’, ‘add’, ‘delete’, etc).
  • Replication Meta-Data (repl_meta_data): this implements DRS (Directory Replication Service) at the database-level, and is one of the most complicated DSDB modules. repl_meta_data takes inbound replication data and applies it to the database, checking the data’s integrity in the process (for example, that linked attributes can successfully be resolved). It is also responsible for maintaining local database information required for outbound replication, for example updating the usnChanged attribute whenever an object is modified.
  • Partitions: directs an LDB operation to the correct partition. Behind the scenes, there is actually a separate database for each partition (which are visible in the sam.ldb.d/ directory). The partition module provides an abstraction, so that Samba appears to have just a single sam.ldb database file.

This is not a comprehensive list of all the DSDB modules Samba uses. To find more details about the DSDB modules, look at the samba_dsdb.c code, which is where the modules get loaded.
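
As an illustration of what the Partitions module has to decide, the sketch below routes a distinguished name to the partition whose base DN is its longest matching suffix. The example.com naming contexts and the function name are placeholders, and the real module does considerably more than this lookup.

```python
# Base DNs of the partitions in a hypothetical example.com domain.
PARTITIONS = [
    "DC=example,DC=com",
    "CN=Configuration,DC=example,DC=com",
    "CN=Schema,CN=Configuration,DC=example,DC=com",
]


def partition_for(dn):
    """Return the most specific partition whose base DN is a suffix of dn."""
    lowered = dn.lower()
    candidates = [base for base in PARTITIONS
                  if lowered == base.lower() or lowered.endswith("," + base.lower())]
    if not candidates:
        raise LookupError("no partition holds " + dn)
    return max(candidates, key=len)       # longest suffix wins


print(partition_for("CN=alice,CN=Users,DC=example,DC=com"))
print(partition_for("CN=User,CN=Schema,CN=Configuration,DC=example,DC=com"))
```

The second lookup matches all three base DNs, which is why the longest match has to be preferred.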

Generally, the communication between DSDB modules is generic, with LDB requests flowing from one module to another. The modules are layered, so that one module will take the LDB operation (e.g. ‘search’, ‘add’, ‘delete’, etc), perform its own specific processing, and then pass the request on to the next module. If one module determines the LDB operation is invalid, then the request is rejected and is not processed further.
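
The layering can be sketched as a chain in which each module does its own processing and then hands the request to the next one, with a rejection stopping the request part-way. The module names and the request dictionary below are invented for illustration and are not Samba's module API, which is written in C.

```python
class Rejected(Exception):
    """Raised when a module decides the operation is invalid."""


class Module:
    name = "base"

    def __init__(self, next_module=None):
        self.next_module = next_module

    def process(self, request):
        """Module-specific work; the base class does nothing."""

    def handle(self, request):
        request["trace"].append(self.name)
        self.process(request)                  # may raise Rejected
        if self.next_module is not None:
            self.next_module.handle(request)   # pass the request down the stack


class AccessCheck(Module):
    name = "access_check"

    def process(self, request):
        if request["op"] != "search" and not request["privileged"]:
            raise Rejected("insufficient access rights")


class UsnStamp(Module):
    name = "usn_stamp"

    def __init__(self, next_module=None):
        super().__init__(next_module)
        self.highest_usn = 0

    def process(self, request):
        if request["op"] in ("add", "modify"):
            self.highest_usn += 1
            request["record"]["usnChanged"] = self.highest_usn


class Store(Module):
    name = "store"

    def __init__(self):
        super().__init__()
        self.records = {}

    def process(self, request):
        if request["op"] in ("add", "modify"):
            self.records.setdefault(request["dn"], {}).update(request["record"])
        elif request["op"] == "delete":
            self.records.pop(request["dn"], None)


store = Store()
stack = AccessCheck(UsnStamp(store))

ok = {"op": "add", "dn": "CN=alice", "record": {"cn": "alice"},
      "privileged": True, "trace": []}
stack.handle(ok)
print(ok["trace"], store.records)

bad = {"op": "add", "dn": "CN=bob", "record": {"cn": "bob"},
       "privileged": False, "trace": []}
try:
    stack.handle(bad)
except Rejected as err:
    print("rejected after", bad["trace"], "-", err)
```

The rejected request never reaches the later modules, so nothing is stored for it.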

In certain cases, a Samba server process may need to communicate directly with a specific DSDB module. For example, when receiving DRS replication data from another DC, only the repl_meta_data module needs to process the replicated data. This is done via LDB ‘extended’ operations. Each special-case operation is identified by an Extended OID (Object Identifier), and only the module(s) interested in that particular operation will process the LDB request. Note that Samba has its own registered OID space.
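
A minimal sketch of this OID-based targeting follows. The OID shown is a placeholder and is not taken from Samba's registered space; the class and function names are likewise hypothetical.

```python
# Placeholder OID for illustration only; it is not taken from Samba's OID space.
DEMO_REPLICATION_OID = "1.3.6.1.4.1.0.1"


class Module:
    """A module that ignores extended operations it is not interested in."""
    handled_oids = frozenset()

    def __init__(self, name):
        self.name = name
        self.received = []

    def extended(self, oid, payload):
        if oid in self.handled_oids:
            self.received.append(payload)
            return True
        return False


class ReplicationModule(Module):
    handled_oids = frozenset({DEMO_REPLICATION_OID})


def send_extended(stack, oid, payload):
    """Offer the operation to each module in order; return the names that processed it."""
    handlers = [module.name for module in stack if module.extended(oid, payload)]
    if not handlers:
        raise LookupError("no module handles extended operation " + oid)
    return handlers


stack = [Module("access_check"), ReplicationModule("replication"), Module("store")]
print(send_extended(stack, DEMO_REPLICATION_OID, {"objects": 3}))  # prints ['replication']
```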

The DSDB modules can also generate notifications for a Samba server process. For example, the dns_notify module checks for dnsZone records being modified, and notifies the DNS server process whenever a change occurs. The RPC (Remote Procedure Call) mechanics behind this notification will be covered more in the next section.



Samba server processes

The other way to think of the Samba architecture is as a set of server processes, each of which specializes in responding to requests for a specific network protocol. Samba consists of the following processes:

  • samba Root Process. Responsible for starting the other processes and monitoring them. Note that smbd and winbindd get started via exec() as separate processes, whereas the others are started via fork(), as child processes.
  • smbd File Server. Provides the network file server functionality that Samba is best-known for. This runs as a separate daemon to the rest of samba, and is built from the source3 code.
  • winbindd. Provides Windows-like bindings, by maintaining connections between network clients and the DC. winbindd doesn’t respond to a specific network protocol like the other Samba processes do, but is instead more of an implementation-specific component needed to manage the many DC connections. winbindd runs as a separate daemon to the rest of samba, and is built from the source3 code.
  • KDC Server. Provides a Key Distribution Center (KDC) for Kerberos authentication. This process grants session tickets to clients and later validates those claims.
  • LDAP Server. Responds to network LDAP requests. Note there is also a separate process that handles CLDAP (Connectionless LDAP).
  • RPC Server. Handles the various DCE/RPC APIs defined in the Microsoft specifications (e.g. LSA, SAMR and NETLOGON). The autogenerated IDL (Interface Definition Language) code plugs in at this point to handle marshalling and unmarshalling the network packet data, and calling the corresponding C function.
  • DNS Server. Responds to network DNS requests. Note that the DNS server only runs when Samba’s Internal DNS is configured. If BIND is used, then this process doesn’t exist.
  • Replication Server. DRS (Directory Replication Service) maintains database consistency between DCs across the domain. The replication server’s responsibilities are:
      • Outbound replication: notifying peer DCs when the local database changes.
      • Inbound replication: pulling in remote changes when a peer’s database changes, and applying them locally.
    Note that this communication actually involves talking to the peer DC’s RPC server process, using the DRSUAPI interface.
  • KCC Server. The Knowledge Consistency Checker (KCC) process is responsible for maintaining DRS connections. Not all DCs replicate with each other, otherwise with a large number of DCs the network overhead would quickly become too great. It’s the KCC’s job to work out which neighbouring DCs it should replicate with.
    The KCC process is also responsible for the periodic cleanup of tombstoned objects.
  • NBT Server. Provides NetBIOS over TCP/IP (NBT) services. This process is the source4 replacement of the nmbd process in the source3 code.

This Samba process model is known as the standard-process mode. The samba executable also supports a single-process mode, where a single server process is responsible for responding to all protocols. However, the single-process mode is only really used for developer debugging.

The server processes that accept connections from network clients generally fork() a separate process for each new connection. Samba is moving towards a pre-fork mode that will manage these client connections more efficiently.


Server process interaction

The main ways the Samba processes interact are:

  • IRPC (Internal RPC) between the samba processes, which uses the DCE/RPC framework. This is how the AD DC (or source4) Samba components talk to each other.
  • ncacn_np pipes between the smbd/winbindd processes and samba. This is how the file server (or source3 components) talks to the AD DC (or source4) Samba components.
  • TDB files used to share state information between the processes.

The AD DC processes generally use asynchronous IRPC messaging to communicate. These messages use the DCE/RPC framework and are autogenerated via an IDL file for convenience and simplicity. The message APIs are Samba-specific extensions to the DCE/RPC framework defined by Microsoft.

For example, the dns_notify DSDB module uses an asynchronous IRPC socket to notify the DNS server of a DNS zone change. It uses a dnsserv_reload_dns_zones DCE/RPC API, which is not part of the Microsoft standards but is an API that Samba developers have added. The dnsserv_reload_dns_zones API is defined in the IDL file, which means most of the marshalling/unmarshalling RPC code is autogenerated. Whichever server process makes the dnsZone modification to the database will generate the RPC call, and the notification will then be received and handled by the DNS server process.

The smbd and winbindd processes use different messaging (ncacn_np sockets) to talk to the samba AD DC processes. ncacn_np stands for Network Computing Architecture Connection-oriented protocol (NCACN) Named Pipes. These are Unix-based, root-only sockets that the file server uses to authenticate clients via the AD DC. A separate socket exists for each file server client.

Finally, the Samba processes can also share state via TDB files. The TDB files are memory-mapped and constantly mirrored to disk. Some examples of the types of information TDB files share are:

  • Locking, such as brlock.tdb, which handles byte-range locking.
  • State, such as secrets.tdb, which stores private information like the DC’s machine account information.
  • Cached Information, such as gencache.tdb, which is a generic caching database.


Architectural Infrastructure

Samba sub-projects

The following components not only form an important part of Samba’s architecture, they are also standalone sub-projects that can be freely reused by other open-source projects.

  • LDB, as well as providing the framework for the DSDB, is also a standalone library for wider use. LDB aims to be an LDAP-like serverless database backed on to a memory-mapped database for simplicity of operation.
  • TDB was NoSQL before NoSQL was hip: TDB is a transactional key-value store database with fcntl() locking for concurrent access.
  • talloc is Samba’s tree memory allocator and is the primary memory abstraction in Samba.
  • tevent provides Samba’s event loop management.
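The fcntl() locking mentioned for TDB is the standard POSIX advisory byte-range lock. The sketch below is not TDB code: lock_range() and locked_write() are hypothetical helpers that only show how a process could take an exclusive lock on part of a shared file, update it, and release the lock so that other processes can proceed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Take (F_RDLCK or F_WRLCK) or release (F_UNLCK) an advisory lock on
 * bytes [offset, offset + len) of an open file. F_SETLKW blocks until
 * the lock can be granted.
 */
static int lock_range(int fd, short type, off_t offset, off_t len)
{
    struct flock fl = {
        .l_type = type,
        .l_whence = SEEK_SET,
        .l_start = offset,
        .l_len = len,
    };

    assert(fd >= 0);
    return fcntl(fd, F_SETLKW, &fl);
}

/* Update one byte range under an exclusive lock, then release it. */
static int locked_write(int fd, off_t offset, const void *buf, size_t len)
{
    ssize_t n;

    if (lock_range(fd, F_WRLCK, offset, (off_t)len) != 0) {
        return -1;
    }
    n = pwrite(fd, buf, len, offset);
    if (lock_range(fd, F_UNLCK, offset, (off_t)len) != 0) {
        return -1;
    }
    return (n == (ssize_t)len) ? 0 : -1;
}
```

Because these locks are held per process, two calls from the same process never block each other; the protection applies between separate processes that open the same file.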


Autogenerated code

Autogenerated code is the glue between the DCE/RPC APIs defined by Microsoft (as well as Samba’s own IRPC extensions) and the server process code that executes the RPC. The set of RPCs that Samba supports is defined in IDL files. PIDL is Samba’s Perl IDL compiler, which takes these IDL files and converts them into autogenerated code.

The autogenerated code handles the marshalling and unmarshalling of network packet data, commonly known as NDR (Network Data Representation). PIDL autogenerates the following code:

  • Client-side C bindings.
  • Server-side C bindings.
  • Client-side Python bindings.

The C bindings allow programmers to easily call/implement individual RPC functions. This guarantees that the function signatures and returned structures are all correct.
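To give a feel for what marshalling and unmarshalling code has to do, here is a hand-written sketch for one hypothetical two-field structure. It is not PIDL output and does not use Samba's NDR library. It assumes little-endian encoding and the NDR rule that each primitive is aligned to its own size, and it ignores pointers, arrays and everything else real NDR must handle. All names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure, as if it had been described in an IDL file. */
struct example {
    uint16_t flags;
    uint32_t count;
};

struct buf {
    uint8_t data[16]; /* assumed zero-initialised, so padding is zero */
    size_t offset;    /* current read or write position */
    size_t length;    /* capacity (push) or number of valid bytes (pull) */
};

/* Advance the offset to the next multiple of n. */
static int buf_align(struct buf *b, size_t n)
{
    size_t pad = (n - (b->offset % n)) % n;

    if (b->offset + pad > b->length) {
        return -1;
    }
    b->offset += pad;
    return 0;
}

/* Marshall ("push") an unsigned integer of the given size in bytes. */
static int push_uint(struct buf *b, uint32_t v, size_t size)
{
    size_t i;

    if (buf_align(b, size) != 0 || b->length - b->offset < size) {
        return -1;
    }
    for (i = 0; i < size; i++) {
        b->data[b->offset++] = (uint8_t)(v >> (8 * i));
    }
    return 0;
}

/* Unmarshall ("pull") an unsigned integer of the given size in bytes. */
static int pull_uint(struct buf *b, uint32_t *v, size_t size)
{
    size_t i;

    if (buf_align(b, size) != 0 || b->length - b->offset < size) {
        return -1;
    }
    *v = 0;
    for (i = 0; i < size; i++) {
        *v |= (uint32_t)b->data[b->offset++] << (8 * i);
    }
    return 0;
}

static int push_example(struct buf *b, const struct example *ex)
{
    if (push_uint(b, ex->flags, 2) != 0) {
        return -1;
    }
    return push_uint(b, ex->count, 4);
}

static int pull_example(struct buf *b, struct example *ex)
{
    uint32_t tmp = 0;

    if (pull_uint(b, &tmp, 2) != 0) {
        return -1;
    }
    ex->flags = (uint16_t)tmp;
    return pull_uint(b, &ex->count, 4);
}
```

Every bounds and alignment check in code like this is a place where a hand-written parser can go wrong, which is the practical argument for generating it from IDL.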

PIDL also generates Python bindings that turn the DCE/RPC protocols (and other IDL structures) into native Python structures (objects). These can then either be sent over DCE/RPC to a server or packed into a binary buffer in-place (using the __ndr_pack__ method on the object). This makes it easy to write Python code that acts as a DCE/RPC client, which is particularly useful for testing the Samba server.

Note that PIDL is also used externally by the Wireshark project in order to generate packet dissectors to inspect a number of Windows network protocols.


IDL compatibility

Samba has implemented a number of extensions on the official IDL specification in order to allow PIDL to parse a larger variety of network packets and linearisations. This allows Samba to leverage the auto-generated code widely, particularly in Python.

Microsoft implements their own set of extensions on top of IDL, in the form of MIDL (Microsoft Interface Definition Language). A number of these are not supported by PIDL and so there are notable differences between the two.

Not everything can be easily encoded in IDL, or easily converted to autogenerated code, so some of the RPC-handling code relies on hand-written parsers. Typically the hand-written parsers begin as generated code and are then modified by hand. Note that this is not ideal, as hand-maintained C code is at a higher risk of having flaws (e.g. a buffer overflow).

If PIDL fully supported MIDL, it might avoid the need for hand-written parsers.




Samba codebase organization

Broadly speaking, the Samba source-code tree can be organized into the following major groups:

  • Top-level libraries, which contains common code shared amongst the Samba processes.
  • Source3, which is code primarily used by the file server and domain member.
  • Source4, which is code primarily used by the Active Directory Domain Controller.
  • Infrastructure components, which provide the build and test framework for Samba.

The following sections break down the codebase layout in more detail. This is not intended to be a comprehensive directory, and just covers the major components.


Top-Level libraries

At the time of the merge, all code was located in either the source3 or source4 directory. Over time, as duplicate code between the two branches is merged or shared, the code is moved out into the top level of the source-code tree.

The major library components at the top level are:

  • Third-party libraries. Samba needs some specific libraries to build. Some of these are included in the Samba source tree to aid in building on older and non-Linux platforms.
  • General purpose libraries. Like any large program written in C, Samba has a number of internal helper functions that do not implement the protocols but are required to share code and make the rest of Samba possible.
    The sub-projects of talloc, tdb, tevent and ldb live here, in the lib directory.
  • PIDL is Samba’s code auto-generation system for generating C code and C-Python bindings from IDL.
  • python contains Samba’s Python library. It is not generally used in the file server, but is critical for the AD DC.
  • CTDB is Samba’s clustered database (which enables the clustered file server).


Source3

The source3 directory is home to code primarily used by the file server and domain member. source3 contains the following major components:

  • The SMB file server (smbd) is the file server that most people think of when they think of Samba.
  • The NBT name server (nmbd) provides NetBIOS over TCP/IP (NBT) for those who want it.
  • Winbindd provides the connection between Samba and the AD Domain to which it is joined, for authentication and name lookup. It also manages the IDMAP, which is the mapping between Unix UID/GID values and Windows SID values. winbindd is used in both Domain member and AD Domain Controller modes.
  • RPC client library (librpc) contains the parts of Samba’s RPC client implementation that are specific to the source3 subtree.
  • SMB client library (libsmb) contains the parts of Samba’s SMB client implementation used in the source3 subtree.
  • Authentication server (auth) contains the parts of Samba’s NTLM authentication server used in the source3 subtree. A shim module connects this to the source4 authentication code when Samba is an AD DC.
  • Password database (passdb) contains the NTLM password database used in the source3 subtree. A shim module connects this to the sam.ldb data store when Samba is an AD DC.
  • RPC server contains the source3 RPC server. Most parts of this are not used in the AD DC; requests are instead redirected to the equivalent parts of source4/rpc_server. When used, it either provides the classic (NT4-like) DC, or services the local SAM on each member or standalone server (each Windows machine has a database under its own name, and Samba does too).
  • Print server functionality is located in the printing directory. Also relevant is the source3/rpc_server/spoolss code.



Source4

The source4 directory is home to code primarily used by the Active Directory Domain Controller. source4 contains the following major components:

  • Active Directory Database templates located in setup. These templates fill out the basic structure of an Active Directory DC in the sam.ldb. This includes the full schema definition.
  • Heimdal is an (old) branch/fork of Heimdal with some changes. An attempt is made to sync this Samba fork with a tree called lorikeet-heimdal (which is a true branch/fork of Heimdal). Patches applied here should first be incorporated upstream, however this has not always happened.
  • General purpose libraries (lib) that have not yet been migrated to the top level.
  • Client library (libcli) contains the parts of Samba’s client implementation for our protocols specific to the source4 codebase.
  • RPC client library (librpc) contains the parts of Samba’s RPC client implementation specific to the source4 codebase.
  • libgpo contains Group Policy Object support.
  • smbtorture binary, used for testing Samba and Windows. For historical reasons there are two smbtorture frameworks. The source4 framework is the one being extended at this time, but some tests will remain in source3/torture.
  • Old NTVFS file server and VFS layer. The attempt at a new file server architecture is preserved in the following directories. These demonstrated a new VFS layer that is organised around the SMB and NTFS semantics rather than the POSIX semantics that Samba used in smbd at the time (smbd now uses a hybrid approach).
  • AD Services. The core AD DC is implemented in the named folders for each component:
      • Authentication server (auth) contains parts of Samba’s authentication server used in the AD DC. A shim module connects smbd to this authentication code when Samba is an AD DC.
      • The Directory Services DB (DSDB), which provides the main implementation behind the sam.ldb database (covered in more detail below).


Directory Services DB (DSDB)

The code that implements the main AD database is located in source4/dsdb. The dsdb directory contains the following notable components:

  • LDB modules. The LDB library provides a generic framework where custom plug-in modules can be added to modify the database’s behaviour. DSDB uses the LDB library framework and defines its own set of plug-in modules (located in dsdb/samdb/ldb_modules) that are specific to Active Directory. The result is a database that provides the full AD semantics.
  • Schema handling. The sam.ldb database follows and conforms to the AD schema. The handling for loading and using the full AD schema is located in source4/dsdb/schema.
  • Replication handling (part). Some of the code related to handling AD’s DRS replication is located in source4/dsdb/repl.
  • KCC. The Knowledge Consistency Checker (KCC) is a process that ensures that a valid replication graph is maintained and that other periodic cleanup work is done. Parts of the implementation are located in source4/dsdb/kcc, mostly for historical reasons. Other KCC handling is also located in python/samba/kcc.


Infrastructure components

The source-code tree contains the following components that are used to build and test Samba.

  • Selftest is a bespoke framework for unit and integration testing. The tests themselves are located in many different parts of the source tree.
  • Wintest is a system that sits outside Samba’s selftest. Wintest builds and installs Samba and runs some limited testing against Windows automatically. Note that this system is not currently maintained or in use.
  • Build system. The code in buildtools uses the wscript files in each directory of the source tree in order to build Samba.
  • Documentation. Samba’s manpages are constructed from XML and are located in the docs-xml directory. In particular the smb.conf manpage is constructed from a whole sub-directory of files in here.
    Note that the internal list of valid parameters in Samba is created from the XML documentation of each configuration parameter, ensuring the code and documentation are always consistent. Documented defaults are also checked for consistency in the automated test-suites.


Autogenerated code

Note that a significant amount of Samba’s codebase is autogenerated from IDL (Interface Definition Language) files. This code is spread across the source-code tree (i.e. source3, source4, and top-level libraries).

PIDL generates push (serialize, or pack) and pull (deserialize, or unpack) functions for all the structures described using IDL, and structures marked [public] are exposed in public functions in C and Python. This is very helpful for parsing not just DCE/RPC packets but any other regularly structured buffer. The IDL files are located:

For complex structures that don’t quite fit into IDL, a marker [nopull], [nopush], or [noprint] can be specified. Hand-written parsers can then be written to handle these structures. These manual parsers are located in:



Development Practices

Typical development process

The typical development process on Samba looks like this:

  • A developer has a problem to solve. This might be fixing a bug, or implementing some previously unsupported Windows Server functionality.
  • The developer would then write test cases that demonstrate the problem. For server-side behaviour, these test cases would pass when run against a Windows DC, but fail against a Samba DC.
  • The finished tests are integrated into Samba’s self-test and marked as known failures initially. There are a couple of benefits to this approach:
      • It’s standard practice in Test-Driven Development (TDD) to help prove that the new test-case works correctly.
      • It means git bisect can be run over the codebase, which can help to identify any degradations introduced to Samba. After any given commit, the Samba code will always compile, and will always pass all tests.
  • The developer then writes the code to fix the bug or implement the desired functionality. The known failure status for the new tests is removed, the new tests are re-run, and this time they should all pass.
  • The developer should then run the full Continuous Integration (CI) test suite over their changes, to verify they haven’t broken any existing functionality. The Gitlab CI provides a convenient way to do this, although there are several other approaches.
  • The developer should end up with a coherent set of patches that add the new functionality, along with tests that prove the new functionality works correctly. They then send the patch-set to the samba-technical mailing-list for review.
  • The code is reviewed by Samba Team members. While any developer can potentially contribute changes to the Samba codebase, only Samba Team members have the access rights to actually deliver code changes to the master code branch. Usually the reviewers provide some feedback on how the patches could be further improved.
  • Once the reviewer is happy, the code must then pass a final CI test run before it’s incorporated into the main Samba codebase.

The following sections cover the Continuous Integration and Code Review process in more detail, as these steps are particularly important to maintaining the quality of the Samba codebase.


Continuous Integration

Autobuild

Samba’s CI process is called Autobuild. autobuild.py is a Python script that verifies Samba builds, installs, and passes all automated tests successfully. Autobuild is generally invoked for CI automatically (e.g. via git hooks) rather than run manually.

Autobuild is more than just a wrapper for ‘make test’. It runs additional tests that make test doesn’t cover, such as the CTDB tests. Autobuild also builds various parts of the Samba project in different ways, for example using the LDB system library as well as building the in-tree LDB source code. Running tests against these different build permutations ensures the main supported Samba configurations always work.

A successful autobuild is a gating step in delivering code to the master Samba branch. Autobuild is run on a host called sn-devel that only Samba Team members have access to. The script automatically rebases the commits under test to be on current master, so that if another change is merged into master first, testing restarts. Therefore, the Samba master branch always passes all tests.


Methods of running Autobuild

While autobuild.py can run on any Linux machine, the official results are only accepted on the sn-devel host. However, only Samba Team members can access sn-devel, and it’s only a single host with limited CPU resources. So there are other methods that allow developers to get CI results in a similar way:

  • Github and Travis. Users who make a pull request against our github repository trigger a Travis CI job that runs some of autobuild. .travis.yml implements Samba CI on Travis.
  • Gitlab. Gitlab provides a CI service. Some tests can run against public ‘runners’, which gitlab provides for free. However, other tests need a specially configured CI box, due to memory or ext4 filesystem requirements. .gitlab-ci.yml implements Samba CI on Gitlab.
  • Openstack. Catalyst developers often run autobuild.py on a virtual machine in the Catalyst Cloud. The scripts to do this should operate against any OpenStack cloud.
  • Manually. Autobuild can also be run manually. script/autobuild.py --testbase=/tmp is a typical invocation of autobuild and can run on a Linux system with the right packages installed. The current reference system is Ubuntu 14.04.


Potential improvements to Autobuild

Autobuild tests everything in the Samba codebase, which means it takes several hours to run. The autobuild.py script defines a series of tasks, which means some tests can be run in parallel. More work could be done to further parallelize the tests, resulting in quicker test runs, and improved Samba productivity.

The other problem is sometimes tests fail for unknown reasons at random times, i.e. the tests ‘flap’. A bot runs on sn-devel four times per day and e-mails the samba-cvs mailing-list if any failures are detected. This allows developers to see which tests, perhaps newly introduced, caused the failure.

Tests can be marked as ‘flapping’ (either in selftest/flapping or selftest/flapping.d), which means the test case still gets run, but the test result is essentially ignored. Obviously, this practice is discouraged as it reduces the usefulness of the test.

More work could be done to investigate these intermittent test failures and fix them.


Code Review

All Samba code changes written by a Samba Team member must be reviewed by another Samba Team member before they can be delivered. Code from someone outside the team must be reviewed by two Samba Team members. Once the reviewer is happy, a Reviewed-by: tag is added to each git commit.

The Samba Code Review policy outlines the review process in more detail. Samba has a guide for new developers, which includes things like:

  • Each patch should be as small as possible, i.e. changes only one thing. This makes review easier.
  • The patch should have an appropriate commit description.
  • Patches that fix a bug should contain an appropriate BUG tag.

Samba has formal coding guidelines, as well as informal practices that Samba Team Members tend to enforce during review.


Samba coding guidelines

Samba’s coding style (for C) is documented in the README.Coding. Adherence to this is encouraged by peer-review and git hooks. The coding guidelines include:

  • Purely stylistic conventions, such as how to format code comments.
  • Sensible programming strategies, such as always initializing pointer variables to NULL.
  • A mixture of both, such as 80-character line lengths, which discourages writing complex, deeply-nested code.

The coding standards are applied to new code that is delivered. However, because the coding standards have evolved over time, existing code that was written a long time ago may not conform to the guidelines.

As well as the C guidelines, the following are used for other languages:

  • Python. The Python coding standard follows PEP8. This is partially enforced by make test. Python code should be compatible with both Python 2 and Python 3.
  • Shell. Samba aims to operate on a POSIX shell as /bin/sh and must not assume /bin/sh is BASH. Specifically Debian and Ubuntu use dash as /bin/sh. If BASH is required for some reason, then #!/bin/bash must be used.

Samba also has a Copyright Policy that requires mandatory tags on every git commit.



Informal coding guidelines

As well as the formal coding standard, Samba Team Members tend to enforce informal, common sense practices during review. For example, the coding guidelines don’t dictate specific naming conventions, but most Samba Team reviewers would suggest existing naming conventions are followed. The general convention is that commonly-used variables just use an abbreviated form of the structure name, such as:

struct ldb_result *res = NULL;
struct ldb_message *msg = NULL;
TALLOC_CTX *mem_ctx = NULL;
struct loadparm_context *lp_ctx = NULL;

Another example would be a patch-set where new code is introduced, and then the code is further reworked in a subsequent patch. Reviewers would generally request that the patch-set is redone to avoid unnecessary code churn.

Samba also has an informal (but enforced) policy that code changes and new features should have automated tests. Once the new feature, along with its tests, is integrated into the main codebase, Samba’s autobuild process ensures that future changes cannot break these tests and degrade the new feature’s functionality.



Testing Practices

Background

Samba has been using automated tests for some time, but due to the age of the Samba project, the testing infrastructure has sometimes developed in ways that the rest of the industry did not follow. Combined with the unusual testing requirements of Samba, this means the Samba testing framework has an idiosyncratic feature set and uses approaches that are not shared by many other projects. Samba developers make efforts to adopt the standard practice, such as using Python’s unittest module in preference to a pre-existing bespoke Samba implementation. However, a significant amount of Samba-specific testing infrastructure remains.

To connect the variety of test systems into a common reporting format Samba uses the Subunit testing protocol (version 1). This protocol is both human- and machine-readable and allows us to create knownfail files listing expected test-case failures.
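The role of a knownfail list can be sketched as follows. This is a deliberately simplified model and not Samba's selftest code: it assumes result lines of the form "success: NAME" and "failure: NAME" with no attached details, it matches knownfail entries by exact name, and the test names are invented. The point is only that a failure listed as known is tallied separately, so it does not fail the overall run.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct tally {
    int passed;
    int failed;            /* unexpected failures */
    int expected_failures; /* failures listed as known */
};

static int in_list(const char *name, const char *const *list, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(list[i], name) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Classify one line of (simplified) text-protocol test output. */
static void tally_line(struct tally *t, const char *line,
                       const char *const *knownfail, size_t n_knownfail)
{
    static const char ok[] = "success: ";
    static const char bad[] = "failure: ";

    assert(t != NULL && line != NULL);

    if (strncmp(line, ok, sizeof(ok) - 1) == 0) {
        t->passed++;
    } else if (strncmp(line, bad, sizeof(bad) - 1) == 0) {
        const char *name = line + sizeof(bad) - 1;

        if (in_list(name, knownfail, n_knownfail)) {
            t->expected_failures++;
        } else {
            t->failed++;
        }
    }
    /* "test: NAME" start markers and any other lines are ignored. */
}
```

A run would then be judged successful when the failed counter is zero, regardless of how many expected failures were seen.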

Note that the sub-projects of TDB, CTDB, talloc, tevent and LDB have some of their own tests declared in their own build systems. These sub-projects use a much less advanced test framework compared to Samba.


Test environments

Samba’s integration testing heavily relies on the automatic creation of a Samba network. This specialized test environment is generally referred to as a Samba ‘testenv’.

A testenv involves starting the Samba server listening on a fake network, which is established using the socket_wrapper library from cwrap. All testing is also done as a non-root user using the uid_wrapper library, also from cwrap. This allows testing without installation on a developer workstation.

Samba’s test framework uses many different types of testenv. Each testenv is customized to test a particular Samba feature or configuration.


Integration testing

A significant number of Samba tests are integration tests. This is due to the comparative-testing nature of Samba development. Integration-style tests allow developers to probe a Windows server to examine its behaviour. The Windows behaviour is then locked into an automated test that gets run against Samba as part of make test.

There are several different ways integration tests can be written:

  • smbtorture is the traditional C-based testing framework for Samba, created with the dual purpose of testing the protocol against Windows and ensuring Samba has the same behavior. smbtorture tests SMB and SMB2 in particular.
  • Python test scripts are used to implement integration tests in a similar way to smbtorture. The Python bindings that PIDL autogenerates for RPC, combined with additional LDB Python bindings for LDAP access, make it easy to write Python client code that exercises the server-side Active Directory behaviour.
  • shell tests can be a simple way to test the behaviour of Samba command-line tools. However, generally Python tests are preferred over shell, for better maintainability.


Unit tests

Samba unit tests are generally written either in:

  • Python (as typical Python unit tests), usually used to test Samba Python libraries.
  • The cmocka framework.
  • Part of smbtorture (or smbtorture3) as a ‘local’ test (where a server is specified but is not actually contacted).

Note that most Samba library code does not have a specific unit test, and is instead tested as part of the overall testing of Samba’s protocol implementation. There is no specific boundary to indicate where unit testing (compared with integration testing) is required, but unit testing is strongly encouraged for new and modified library code.


Hardening practices

Samba developers take security seriously. There are several approaches used to harden Samba in order to minimize its vulnerability to attack. These include:

  • Utilizing the compiler’s security options.
  • Using static analysis and other tools to proactively detect potential security vulnerabilities.
  • Restricting access permissions of sensitive files.
  • Mitigating the potential for buffer overflow attacks.


Compiler hardening options

Samba is a large, important code base written in C, run as root and exposed to the network. Therefore it is important to gain as much from the compiler as possible to protect against buffer overflow and similar attacks. The full list of compiler options Samba uses is detailed in Appendix II. Samba compiler options used.

As well as Samba’s default compiler options, the primary Linux distributions enable additional hardening options when compiling Samba:

Note that Samba never needs an executable stack or heap. When we compile assembler for the AES-NI acceleration we also mark this as such, so a modern compiler and linker will likewise mark the resulting binaries.


Potential further improvements

Potential further improvements to Samba’s compiler options include:

  1. Set the hardening compiler options currently used by the primary Linux distributions as default Samba options, so that they get used for all distributions. For example, change Samba from using -fstack-protector to -fstack-protector-strong. The main work involved would be ensuring the compiler options are compatible with the numerous distributions that support Samba. Note that in some cases, there may also be a run-time performance impact from using particular compiler options.
  2. Enabling the -Wl,-z,defs option, which is a standard hardening option on Fedora, but is currently disabled on Samba. This is due to dependency loops when building Samba, and has proven to be very difficult to resolve in the past. The subsystems affected are marked with allow_undefined_symbols=True in their wscript_build file. Fixing these dependency loops is hard, but if finally resolved then full link-time dependencies could be calculated.
  3. Enabling -Werror for all sub-systems. Some warnings in Samba are difficult to remove, which has prevented the use of -Werror in all subsystems. The subsystems affected are marked with allow_warnings=True in their wscript_build file. For source4 code, this includes some third-party (heimdal, zlib, pop5) and lib/com code. Fixing these sub-systems to enable -Werror would be hard.


Vulnerability detection tools

Samba developers utilize a range of tools to proactively detect vulnerabilities in the codebase, including:

  • Coverity, a cloud-based static analysis tool used to detect potential security vulnerabilities. Coverity periodically scans the latest Samba codebase. Samba team members get notified whenever a new error is introduced, and these tend to be rectified quickly.
  • Address Sanitizer uses the -fsanitize=address compiler option to detect memory overruns and use-after-free errors. Address Sanitizer has been integrated into the Samba build system as an option that can be enabled. However, the issues that it reports still need to be investigated in more detail. Currently the full set of Samba tests (i.e. ‘make test’) doesn’t pass with the Address Sanitizer enabled. Note that the Address Sanitizer is only intended for use during development testing (it’s not a compiler option that’s suitable for production Samba deployments).
  • Valgrind is a useful tool for debugging memory leaks and corruption issues. Samba can run in valgrind, but the performance is very slow (and Samba may only work in single-process mode). The slow performance results in network protocol requests timing out, which limits the usefulness of running Samba in valgrind. However, valgrind can be very useful for detecting memory issues when developing cmocka tests (e.g. debugging an LDB module or a specific shared library).
  • Undefined Behaviour Sanitizer (UBSan). Samba developers have tried running UBSan over the Samba codebase using clang. However, it has not been properly integrated into the Samba build system. Initial investigations showed that the warnings in the UBSan output contained a lot of ‘noise’ and not many actual bugs. For UBSan to be useful, further work would be needed to suppress the UBSan warnings that are of less concern.
  • Fuzzing has so far been underutilized on Samba, but this is something Samba developers are keen to rectify. A few small experiments have been carried out with the LDAP server using AFL (American Fuzzy Lop). These experiments showed promising results, and Samba developers have a plan to use fuzzing in more depth. However, progress is currently slow due to lack of funding.


Restrictive file-access permissions

All Samba files containing keys (either inside a large database like sam.ldb or single files like a keytab) are protected with restrictive file permissions.

Currently Samba runs as root for all AD operations so these files are mode 0600 with owner root.

Samba’s private files containing keys are all in a subdirectory called private/, which is owned by root and has mode 0700.
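
The policy above can be checked with a small audit script. This is a minimal sketch and not part of Samba: the function name, the expected_uid parameter and the list of key file names are hypothetical, and the location of the private directory depends on how Samba was built or packaged.

```python
import os
import stat


def find_permission_problems(private_dir, key_files, expected_uid=0):
    """Return (path, problem) tuples for anything looser than the stated policy.

    Assumed policy (taken from the text above): the directory is 0700,
    key files are 0600, and both are owned by root (uid 0).
    """
    problems = []
    targets = [(private_dir, 0o700)]
    targets += [(os.path.join(private_dir, name), 0o600) for name in key_files]
    for path, limit in targets:
        st = os.stat(path)
        mode = stat.S_IMODE(st.st_mode)
        # Any permission bit outside the limit (e.g. group or other access)
        # counts as a problem.
        if mode & ~limit:
            problems.append((path, f"mode {oct(mode)} is looser than {oct(limit)}"))
        if st.st_uid != expected_uid:
            problems.append((path, f"owner uid {st.st_uid} is not {expected_uid}"))
    return problems
```

An administrator would call this with the real private directory and the names of the key-bearing files found there; an empty list means nothing looser than the policy was found.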


Mitigating buffer overflow attacks

In-memory key hygiene

One goal of a buffer overflow attack is to read sensitive information sitting in memory, such as encryption keys. Unlike other projects such as OpenSSH, Samba does not pro-actively attempt a defence-in-depth approach to the wiping of keys from memory.

For example, no specific attempt is made to zero memory that keys were read into, nor is any attempt made to avoid functions like realloc() that might duplicate memory containing keys. The reason is that it is very difficult to secure all sensitive in-memory information, as highlighted by Things I learned from OpenSSH about reading very sensitive files by Chris Siebenmann.

Currently Samba assumes that any arbitrary memory read is already a full compromise. However, recent work towards a more nuanced position can be seen in the encrypted storage of secret attributes, as this also ensures these values are encrypted in the memory-mapped area.


Potential further improvements

Samba uses a common memory allocation scheme (Talloc), which may make it easier to secure sensitive information that gets read into memory. A potential improvement would be to:

  • Ensure any sensitive information is stored in a talloc_chunk of memory, rather than on the stack.
  • When the sensitive information is allocated, mark the talloc_chunk appropriately.
  • Ensure realloc() never gets used for these sensitive blocks.
  • When the sensitive talloc block is freed, a talloc destructor is called to zero the memory.

However, note that using talloc destructors has its own security implications.
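
The "zero on free" idea in the list above can be illustrated with a small Python analogy. This is only a conceptual sketch: SensitiveBuffer is a hypothetical class, not a talloc API, and Python gives no guarantee that other copies of the secret do not exist elsewhere in the interpreter. In C the same role would be played by a talloc destructor attached to the marked chunk.

```python
class SensitiveBuffer:
    """Holds secret bytes in a mutable buffer and wipes them on release."""

    def __init__(self, secret):
        # A bytearray is mutable, so it can be overwritten in place later.
        self._buf = bytearray(secret)
        self._freed = False

    def view(self):
        """Return a view of the secret without copying it."""
        if self._freed:
            raise ValueError("buffer has already been wiped")
        return memoryview(self._buf)

    def free(self):
        # The 'destructor' step: overwrite every byte before letting go.
        for i in range(len(self._buf)):
            self._buf[i] = 0
        self._freed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.free()
        return False
```

Using the buffer in a with block ensures the wipe happens even if an exception is raised while the secret is in use.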


Talloc security

Another goal of buffer overflow attacks is to execute arbitrary code. talloc supports a custom ‘destructor’ function and, although destructor functions are not widely used, this feature has been exploited in the past. For example, a buffer overrun attack could in theory craft its own talloc_chunk that uses system() as the destructor function, thus getting shell commands to execute when the memory is freed.

To make it harder for an attacker to create an apparently valid talloc_chunk, the talloc magic (which must be correct in the chunk) is randomized at library load time. The source of this random number on Linux is the getauxval() function returning a value filled in by the Linux kernel. This allows talloc to obtain a random number without opening files such as /dev/urandom and is the same mechanism that is used by the ASLR (Address Space Layout Randomisation) code in the library linker.

Additionally, with talloc 2.1.11 (used in Samba 4.8), when a talloc pointer is freed the magic is reset to a fixed value, to avoid the random magic being visible later in uninitialised memory.
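
The randomized magic and the reset on free can be modelled as follows. This is a conceptual sketch rather than talloc's real chunk layout: the class names and the FREED_MAGIC value are hypothetical, and Python's secrets module stands in for the kernel-supplied random value that talloc reads via getauxval().

```python
import secrets

FREED_MAGIC = 0x0BADF00D  # hypothetical fixed value written into freed chunks


class Chunk:
    def __init__(self, magic, payload, destructor=None):
        self.magic = magic
        self.payload = payload
        self.destructor = destructor


class Allocator:
    def __init__(self):
        # Chosen once per "library load"; an attacker who cannot read
        # memory cannot predict it when forging a chunk header.
        magic = secrets.randbits(32)
        while magic == FREED_MAGIC:
            magic = secrets.randbits(32)
        self._magic = magic

    def alloc(self, payload, destructor=None):
        return Chunk(self._magic, payload, destructor)

    def free(self, chunk):
        # Validate the header before trusting anything else in it,
        # in particular the destructor pointer.
        if chunk.magic != self._magic:
            raise ValueError("bad magic: not a live chunk from this allocator")
        if chunk.destructor is not None:
            chunk.destructor(chunk.payload)
        # Overwrite the random magic so it does not linger in freed memory.
        chunk.magic = FREED_MAGIC
```

A forged chunk, or a second free of the same chunk, fails the magic check before any destructor is called.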


Potential further improvements

Samba could further mitigate potential overrun attacks by checking that the talloc destructor function is valid before executing it. For example, adding an extra sanity-check that the destructor function being called is one that had been previously registered.
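
One way to picture such a sanity-check is an allow-list of known destructors. This is a hypothetical sketch of the proposed improvement, not existing talloc behaviour, and all names in it are invented for illustration.

```python
_registered_destructors = set()


def register_destructor(func):
    """Record func as a legitimate destructor; usable as a decorator."""
    _registered_destructors.add(func)
    return func


def run_destructor(func, payload):
    """Call func(payload) only if it was registered beforehand."""
    if func not in _registered_destructors:
        raise PermissionError("refusing to call an unregistered destructor")
    return func(payload)


@register_destructor
def wipe(buf):
    # Example destructor: zero a mutable buffer in place.
    for i in range(len(buf)):
        buf[i] = 0
```

With this check in place, a crafted chunk that names an arbitrary function as its destructor is rejected instead of executed.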



Overview of the Samba Security Layers

In order to provide signing and encryption to protect data coming in and leaving an Active Directory domain controller, there are a number of network protocol abstractions and API abstractions to simplify the overall flow.

Samba implements a full range of security layers and extensions in order to interoperate with Windows clients, but in some cases it does not implement exactly the same mechanisms as Windows, or make the same choices at each layer. Such choices can be either more secure or less secure, depending on what kind of cryptography is used or the integrity of a specific mechanism (or chain of mechanisms).


Simplified overview of Samba encryption

To start with, let’s take an extremely simplified walk-through of the concepts involved in Samba encryption:

  1. There is a network protocol connection between the client and the server that needs securing. The network protocol here could be LDAP, DNS, DCE/RPC, or SMB.
  2. The connection data is secured as an ‘opaque blob’ nested within each network packet. This opaque blob normally contains some form of header to indicate how the encrypted (or signed) data should be interpreted. Note that the blob’s data might also contain another such blob for a lower security layer.
  3. Note that each network protocol (LDAP, DNS, etc) has its own standards-based specification (i.e. RFC). So how the opaque blob is incorporated into each protocol packet differs. It requires specific extensions that must be made to the protocol, which are generally only implemented for Active Directory support.
  4. The next concept to grasp is how each opaque blob is actually secured. The key that gets used and the type of encryption algorithm used varies depending on the specific client and server involved, and the cryptographic algorithms that they support. The encryption key and algorithm to use for the connection is determined by one of the following methods:
     • Kerberos. When using Kerberos, the session keys and cryptographic algorithms used are determined by the Kerberos Key Distribution Center (KDC). The KDC knows from previous exchanges with the server and client exactly what algorithms each supports, and finds the most-secure algorithm in common. The KDC also has the master keys for the server and client, and uses these to create a unique session key for the two to authenticate with each other.
       With Kerberos, connections can be made more or less secure based on how the network servers are configured (i.e. the cryptographic algorithms they support). This makes Kerberos the preferred and more secure way to protect the connection.
     • NTLM (NT LAN Manager) authentication. When NTLM authentication is used, the session key is derived from the shared secret, which is the user or machine account password in the database. The choice of cryptographic algorithms is essentially hardwired into the NTLM protocol, and cannot really be configured.
     • SCHANNEL (Secure channel). This is only used for a subset of the DCE/RPC pipes, namely NETLOGON and LSARPC. It gets negotiated in a unique way, so is described in more detail in the next chapter.
  5. The next step is working out whether to use Kerberos or NTLM during the connection setup. Note that Kerberos and NTLM each have their own further abstraction layer:
     • For Kerberos, GSSAPI (Generic Security Services Application Program Interface) acts as a wrapper layer for dealing with a Kerberos backend.
     • For NTLM, NTLMSSP (NTLM Security Support Provider) acts as a wrapper layer around the NTLMv1 and NTLMv2 protocols.
     The negotiation over whether to use Kerberos-based or NTLM-based security for the connection is determined by two more security layers: SASL (Simple Authentication and Security Layer), and SPNEGO (Simple and Protected GSSAPI Negotiation Mechanism). How these layers work is covered in more detail in the next section.

Note that this walk-through is extremely simplified and focuses on the concepts involved. As such, the security layers are not covered in the order they would actually appear in the network-stack. Next, we will look at a more concrete example of the security layers involved in an LDAP connection.


An expanded view of the security layers (LDAP)

In order to understand the order and structure of these layers, a simple explanation of how secure LDAP traffic can be negotiated and protected is appropriate. LDAP uses the full range of these layers and many of the interactions are described in MS-ADTS, under the SASL and GSSAPI sections. The following lists the security layers from highest to lowest.

  • TCP - Transmission Control Protocol
This first stage is to create a connection (stream) between the client and server.
  • LDAP - Lightweight Directory Access Protocol
Once a client is connected to a server and port of interest, there is continuous (application) protocol handling which normally consists of an initial setup (normally establishing who is connecting and the security level they desire) and then ongoing traffic handling which consists of wrapped (or joined) data and buffers (which are wrapped and unwrapped on either end as desired). In LDAP, the initial setup is done on an anonymous connection without credentials. On this connection, the client triggers a search on the rootDSE object of the LDAP directory to discover what SASL mechanisms the server supports. Once all the initial setup work on the lower layers is done, important LDAP entries can be sent encrypted or signed based on the negotiated setup (and reusing the chosen mechanisms at the lower layers).
  • SASL - Simple Authentication and Security Layer
This layer separates out the authentication and data security from an application level protocol. SASL provides data integrity and confidentiality in a way that can be incorporated into a number of different application protocols and allows for a non-static set of mechanisms to provide integrity and confidentiality services. Once a client has chosen a particular SASL mechanism and sent it to the server, the next standard stage is SPNEGO.
Alternatives to attempting SPNEGO (and having fewer layers) include: using NTLMSSP directly and NTLM (NT LAN Manager) authentication; and skipping a layer to directly reach GSSAPI.
This protocol is originally described in RFC2222.
  • SPNEGO - Simple and Protected GSSAPI Negotiation Mechanism
This layer is an accompanying layer to GSSAPI, designed to negotiate which GSSAPI mechanisms should be used to secure the client-server traffic. Available GSSAPI mechanisms are given to the negotiator, and once a specific mechanism is chosen, all subsequent traffic is sent via that mechanism.
Alternatives to attempting GSSAPI (and having fewer layers) include: using NTLMSSP directly and NTLM authentication.
This protocol is described in RFC4178 and MS-SPNG. Note that the Microsoft documentation often refers to this layer simply as Negotiate, rather than SPNEGO.
  • GSSAPI - Generic Security Services Application Program Interface
GSSAPI does not implement any security directly - it is only an abstraction for dealing with different GSSAPI backends, such as Kerberos. This is the most clear and central example of the usage of an opaque blob, as described in the introduction of this chapter, in order to hide lower level information from a high-level application. GSSAPI defines a number of procedure calls, error codes and overall flow (and states), in order to abstract a common interface to deal with arbitrary GSSAPI backends.
This protocol is described in RFC2078.
Microsoft uses SSPI (Security Support Provider Interface) as its own implementation-specific variant of GSSAPI, with its own custom extensions.
  • KRB5 - Kerberos 5
This is normally the last complete layer in the security layers. The Kerberos protocol consists of multiple exchanges with a Key Distribution Center (KDC), as well as forwarding of authentication information and mutually shared keys to allow for secure connections between different hosts and services. During the initial exchanges of Kerberos, the KDC will offer different encryption types (e.g. AES, RC4-HMAC) corresponding to the different ways the server stores secrets and how the server is configured. These encryption types determine the security level and some are now considered obsolete due to cryptographic weakness (for example DES).
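
The mechanism choice made at the SPNEGO layer can be sketched as follows. This is a simplified model, assuming the client offers mechanisms in preference order and the first one the server also supports is chosen. The function name is hypothetical, and real SPNEGO exchanges ASN.1-encoded tokens rather than Python lists. The two OIDs are the registered identifiers for Kerberos 5 and NTLMSSP.

```python
KRB5_OID = "1.2.840.113554.1.2.2"
NTLMSSP_OID = "1.3.6.1.4.1.311.2.2.10"


def select_mechanism(client_mechs, server_mechs):
    """Return the first client-preferred mechanism the server supports.

    client_mechs is ordered by client preference; server_mechs is the set
    of mechanisms the server is willing to accept. Returns None when the
    two sides have nothing in common.
    """
    supported = set(server_mechs)
    for oid in client_mechs:
        if oid in supported:
            return oid
    return None
```

For example, a client offering Kerberos first and NTLMSSP second ends up with NTLMSSP when talking to a server that only accepts NTLMSSP, which is the fallback behaviour described for clients that cannot use Kerberos.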


GENSEC Abstraction

The Generic Security subsystem (GENSEC) is an abstraction created for use in Samba to combine most of the security layers together in order to provide a simplified, unified (programming) interface. GENSEC accomplishes much of what was hoped with GSSAPI: a generic interface to authentication of a remote user over the network, abstracting all the specific details of the protocol into opaque buffers.

GENSEC works on layers 3-6, or more generically, anything below SASL. Where some layers are skipped or the standard flow changed, GENSEC handles all the different combinations (of which there are many). GENSEC provides signing and encryption at a high level, hiding away most of the implementation details from the developer.

NTLMSSP, GSSAPI(KRB5) and SCHANNEL (Secure Channel) are concrete crypto-systems which can be accessed via the GENSEC interface in Samba.

GENSEC is specified in the GENSEC - Designing a security subsystem whitepaper by Andrew Bartlett. The Samba implementation is in auth/gensec.


Usage of security layers in other protocols

The Server Message Block (SMB) protocol does not use SASL and begins from the SPNEGO security layer. Since SMB can be used as a transport layer (e.g. for DCE/RPC), other protocols can benefit from the security protections and guarantees created by these layers, and can specify security layers as requirements.

The Domain Name System (DNS) protocol has extensions for authenticated DNS documented in RFC2845 - Secret Key Transaction Authentication for DNS (TSIG). In order to support GSSAPI, there is another specification, RFC3645, which describes how GSSAPI and TSIG can be combined together (in order to support Kerberos-secured DNS transactions that are required by Active Directory).


Additional information

Further details on some of the abstractions and source tree locations are located in Appendix III. Additional security layer information.



Major Cryptographic Subsystems in Samba

NTLMSSP

NTLMSSP (NTLM Security Support Provider) is a wrapper around the NTLM (NT LAN Manager) protocols. NTLMSSP provides a generic wrapper around NTLMv1 and NTLMv2 as well as optional signing and encryption of the protocol stream. As described in the previous chapter, NTLMSSP can be accessed via a number of different layers and can happen at various points during a network conversation. When Kerberos is unavailable (which might be for a number of different reasons, such as clock skew), clients often fall back to these protocols so that service can be maintained.


Cryptography used

  • DES
  • MD4
  • MD5
  • HMAC-MD5
  • RC4


Features and Operation

NTLMSSP appears on the network as a three-way handshake of NtLmNegotiate, NtLmChallenge (containing the random challenge from the server) and NtLmAuthenticate (containing the client password proof: a hash of the password combined with the challenge and other data).
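
The password proof carried in NtLmAuthenticate can be sketched for the NTLMv2 case. This is a minimal sketch of the construction as it is commonly documented, not Samba's code: the function names are hypothetical, the NT hash (MD4 of the UTF-16LE password) is taken as an input because MD4 is often unavailable in current hashlib builds, and client_blob is treated as opaque bytes standing in for the timestamp, client nonce and target information.

```python
import hashlib
import hmac


def ntowf_v2(nt_hash, user, domain):
    """Derive the NTLMv2 response key from the stored NT hash."""
    identity = (user.upper() + domain).encode("utf-16-le")
    return hmac.new(nt_hash, identity, hashlib.md5).digest()


def nt_proof(response_key, server_challenge, client_blob):
    """Proof sent by the client: HMAC-MD5 over the challenge and client blob."""
    message = server_challenge + client_blob
    return hmac.new(response_key, message, hashlib.md5).digest()


def server_accepts(nt_hash, user, domain, server_challenge, client_blob, proof):
    """Recompute the proof from the stored hash and compare in constant time."""
    key = ntowf_v2(nt_hash, user, domain)
    expected = nt_proof(key, server_challenge, client_blob)
    return hmac.compare_digest(expected, proof)
```

Because the server recomputes the proof from the stored hash and its own challenge, the password itself never crosses the network, and a proof captured for one challenge is not valid for another.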

Others have written really good documentation on NTLMSSP, listed in the following references:


Optional negotiated features

NTLMSSP allows negotiation of NTLM features and this negotiation can be secured in the NTLMv2 handshake. In particular, use of NTLMv2 or the NTLMSSP_NEGOTIATE_EXTENDED_SESSIONSECURITY flag triggers per-direction keys, a client-supplied nonce and HMAC-MD5 checksums of the data stream.

The NTLMSSP_NEGOTIATE_KEY_EXCH flag allows the client to propose a new per-session key encrypted with the long-term session key. Sadly, however, no Diffie-Hellman key exchange is done in this protocol, so there is no forward secrecy.


Samba implementation

Passwords sent via NTLMSSP are checked via Samba’s NTLM authentication subsystem.


NETLOGON Secure Channel (Schannel)

Schannel is a mechanism to allow computer accounts secure, encrypted access to the NETLOGON and LSARPC DCE/RPC pipes. The cryptography can be negotiated, including DES, RC4 (similar to arcfour-hmac-md5 in Kerberos) and AES. It uses the shared machine account password as the secret between the client and server. Schannel is a concrete crypto-system that can be accessed and selected through the GENSEC interface.

The NETLOGON Secure Channel is created by the client first connecting to the NETLOGON service without authentication and establishing a session key via the ServerAuthenticate3 call. This then allows the client to reconnect and secure the NETLOGON pipe with schannel as a DCE/RPC transport security.

Importantly, the NETLOGON service (which the secure channel primarily exists to protect) is critically responsible for forwarding NTLM authentication to a DC in the domain. It also handles key rotation for the member server via the ServerPasswordSet2 call and, historically in NT4, was responsible for replication.


Cryptographic details

Schannel has evolved over a number of years, including from the days of US export control. For that reason key lengths as short as 64 bits and algorithms as weak as RC4 can be negotiated (but not by default).

The key length is 128 bits otherwise.


RC4 Mode

RC4 with an 8 byte random confounder and an HMAC-MD5 checksum (of an MD5 checksum of the data and header).


AES

AES in 8-bit CFB Mode with an 8 byte random confounder and a SHA256 checksum of the data and header.


Cryptographic configuration

A number of smb.conf options control which of the above protocols are permitted by the server:

  • server schannel allows Schannel to be disabled or required.
  • reject md5 clients controls the use of the RC4/MD5 scheme.
  • allow nt4 crypto controls the use of 64-bit keys in the RC4/MD5 scheme.

These smb.conf options control which of the above protocols are permitted by the client:

  • client schannel allows Schannel to be disabled or required.
  • reject md5 servers controls the use of the RC4/MD5 scheme.
  • require strong key controls the client’s requirement for use of 128-bit RC4 cryptography.

The defaults for these options have been slowly increased. Samba 4.8 has set server schannel and client schannel to yes by default and a future Samba version will not allow it to be disabled.
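
The server-side options above can be audited with a short script. This is a minimal sketch: the HARDENED profile is an illustrative assumption and does not represent Samba's defaults, the function name is hypothetical, and Python's configparser only approximates smb.conf syntax (for example, it matches the section name case-sensitively and does not know Samba's compiled-in defaults, so an unset option is reported rather than resolved).

```python
import configparser

# Illustrative target values only; these are not Samba's defaults.
HARDENED = {
    "server schannel": "yes",
    "reject md5 clients": "yes",
    "allow nt4 crypto": "no",
}

_TRUE = {"yes", "true", "1"}
_FALSE = {"no", "false", "0"}


def _normalise(value):
    """Map smb.conf boolean spellings onto 'yes' or 'no'."""
    v = value.strip().lower()
    if v in _TRUE:
        return "yes"
    if v in _FALSE:
        return "no"
    return v


def audit_schannel(smb_conf_text):
    """Return {option: configured value or None} for options off the profile."""
    # Interpolation is disabled because smb.conf uses % for its own macros.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(smb_conf_text)
    findings = {}
    for option, wanted in HARDENED.items():
        actual = parser.get("global", option, fallback=None)
        if actual is None or _normalise(actual) != wanted:
            findings[option] = actual
    return findings
```

An empty result means every option in the profile is set explicitly to the wanted value in the [global] section.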


Samba implementation


Documentation and specifications


TLS

GnuTLS is used to protect the LDAP server when ldaps:// is used or the StartTLS extended operation is invoked. From the view of the security layers, TLS (Transport Layer Security) occurs above all the layers of GENSEC. This currently means that there is no way to tie the TLS connection to any of the other cryptographic mechanisms (GSSAPI, Kerberos, etc) and so there are weaknesses in how TLS interacts with other cryptographic mechanisms.

Because of these weaknesses, AD clients prefer to secure the session with the SASL sealing (encryption) provided by NTLMSSP or GSSAPI (KRB5).

The wrapping of GnuTLS is implemented in source4/lib/tls.


Self-signed temporary certificate

By default, Samba will generate a 4096 bit RSA self-signed certificate in the name of the host and with a 700 day lifetime.

The lifetime is determined by the LIFETIME constant defined in the source code and the size of the RSA key is determined by the RSA_BITS constant in source4/lib/tls/tlscert.c.

The certificate is not automatically renewed or rotated.


TLS configuration

The following smb.conf options control TLS:

  • Client-side certificate validation is controlled by the tls verify peer smb.conf option in conjunction with the tls cafile and tls crlfile options.
  • Server-side certificate and key are controlled by the tls cafile, tls certfile and tls keyfile smb.conf options.
  • Restrictions on which TLS protocols are used (mostly for the AD DC) are controlled by the tls priority smb.conf option.

The default setting for tls priority currently disables SSLv3. As new attacks on TLS are found, this parameter can be used to update and configure which protocols and algorithms GnuTLS will use.


Usages of TLS which do not use the GnuTLS library

Some parts of Samba, such as winbindd, can use the OpenLDAP client libraries and invoke TLS if ldap ssl ads is set. These do not use the above configuration. However, this is not the default.


Special dangers of NTLMSSP and Kerberos over TLS

By avoiding the mix of NTLMSSP and Kerberos with TLS, Samba avoids needing to implement channel bindings to the TLS layer, as described in the Microsoft LdapEnforceChannelBinding documentation.

Samba uses the ldap server require strong auth smb.conf option to control this problematic configuration, as well as simple binds over unprotected links. However, the administrator can override it. In future, Samba could implement the appropriate channel bindings in order to correctly link TLS into the rest of the GENSEC stack.


Kerberos

For completeness, a mention of Kerberos here is warranted. Many of the details regarding the use of Kerberos in Samba (and also Active Directory) are documented elsewhere in this document or are documented in the relevant specifications.



What protocols does cryptography protect in Samba

In Samba, cryptography is generally tied to authentication of user or computer accounts and the subsequent data streams.

Compromise of any of the below channels, including SMB, can lead to total takeover of the domain, therefore all connections must be cryptographically signed to prevent session takeover. Specifically the concern is that connections from a Domain Administrator must be integrity protected.


LDAP

LDAP is protected by SASL authentication and signing or encryption of the subsequent data stream. Protections include:

  • None (disabled by default)
  • GSSAPI(Kerberos)
  • NTLMSSP
  • TLS


Data protected

  • Add/delete/modify/search of all AD Directory objects
  • Password set
  • User password change

Secret attributes can never be read over LDAP.


Default protection level

All connections must be cryptographically signed to prevent session takeover. Connections used to change and reset passwords should be encrypted, and the client should enforce this. This is not enforced by Samba’s LDAP server.

The ldap server require strong auth smb.conf setting controls this behaviour.


DCE/RPC

DCE/RPC is protected by authentication and signing or encryption of the subsequent data stream. Protections include:

  • None (e.g. anonymous access or no protection negotiated)
  • GSSAPI(Kerberos)
  • NTLMSSP
  • NETLOGON Secure Channel (Schannel)

Note however that the protections on the DCE/RPC protocol are poorly designed and incomplete. Some messages (DCE/RPC faults in particular) are not protected and the headers (including the operation number) are never encrypted.


Data protected

  • Returned session keys over NETLOGON from SamLogon (used for NTLMSSP authentication between a client and member server)
  • DRS Replicated data including domain secrets
  • Administrative operations (add/delete/update accounts) over SAMR
  • Administrative operations (add/delete/update trusts) over LSA
  • Add/delete/update ‘secrets’ over LSA


Default protection level

All connections must be cryptographically signed to prevent session takeover. Connections used to change and reset passwords should be encrypted as the bespoke cryptography for SAMR is outdated, and the client should enforce this. This is not enforced by Samba’s DCE/RPC server.

The allow dcerpc auth level connect smb.conf setting controls this behaviour.


Documentation and specifications


DCE/RPC (BackupKey)

In the DCE/RPC server, there is a special protocol which is designed for dealing with secrets (and is required by a number of Windows clients). BackupKey is a protocol for unlocking a client-side password safe using a key stored on the domain controller.

There are two sub-protocols: ServerWrap and ClientWrap.


ServerWrap

ServerWrap uses a symmetric cryptography key stored as an LSA Secret in the AD directory to encrypt and decrypt the passwords. The encryption mode is RC4, a SHA-1 checksum is used and a salt is used as a nonce.

For this mode of operation to work a server must be online at the time the password safe is encrypted.


ClientWrap

ClientWrap uses public key cryptography (2048-bit RSA), with the public and private key also stored as an LSA Secret in the AD directory. The client does the encryption, so the server only needs to be online for the key fetch and decryption.

When the server is asked to unwrap the secret, the RSA key is used to decrypt an AES or 3DES key that encrypts the actual secret value.


References


Samba implementation


SMB

Data protected

  • Group Policy Object upload and download
  • DCE/RPC traffic not otherwise protected

Note that DCE/RPC can be tunneled over SMB, and inherits the credentials of the SMB connection over which it is carried.


Default protection level

All connections must be cryptographically signed to prevent session takeover, and the client should enforce this.

The server signing smb.conf setting controls this behaviour.


Kerberos KDC

Kerberos relies totally on cryptography to secure its own operations, to provide the session tickets to clients and to validate those claims.



SMB Signing and Encryption

Unlike LDAP, where the SASL framing is used for signing and encryption, and unlike DCE/RPC where the framing is custom but the algorithms are not, SMB signing and SMB encryption use a unique cryptosystem, taking only the session key from the original authentication.


Use in securing DCE/RPC

Because DCE/RPC can be carried over SMB (known as ncacn_np), SMB signing can be a way to secure the DCE/RPC bind and the subsequent data stream.


SMBv1 Signing

SMBv1 uses MD5 over the session key and the packet (including a sequence number in place of the signature).
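
The signing step can be sketched as follows. This is a simplified model under stated assumptions: the signature occupies the 8-byte field at offset 14 of the 32-byte SMB1 header, the sequence number is written there as a 32-bit little-endian value followed by zero padding, and the key passed in is whatever key material the session uses (in the real protocol this can be more than the bare session key). The function names are hypothetical.

```python
import hashlib
import hmac
import struct

SIG_OFFSET = 14  # assumed position of the signature field in the SMB1 header
SIG_LEN = 8


def _with_signature_field(packet, field):
    """Return packet with the 8-byte signature field replaced by field."""
    return packet[:SIG_OFFSET] + field + packet[SIG_OFFSET + SIG_LEN:]


def smb1_sign(key, packet, seq):
    """Return a copy of packet carrying its MD5-based signature."""
    # The sequence number takes the place of the signature while hashing.
    seq_field = struct.pack("<I", seq) + b"\x00" * 4
    stamped = _with_signature_field(packet, seq_field)
    digest = hashlib.md5(key + stamped).digest()
    return _with_signature_field(packet, digest[:SIG_LEN])


def smb1_verify(key, packet, seq):
    """Check the signature in packet against the expected sequence number."""
    received = packet[SIG_OFFSET:SIG_OFFSET + SIG_LEN]
    expected = smb1_sign(key, packet, seq)[SIG_OFFSET:SIG_OFFSET + SIG_LEN]
    return hmac.compare_digest(received, expected)
```

Since the sequence number is mixed into the hash but never transmitted, a replayed or reordered packet fails verification even though its bytes are unchanged.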

Use of SMB signing is negotiated via flags in the NegProt and SessionSetupAndX. Either party can insist on SMB Signing.

SMB Signing ensures that information downloaded over SMB (such as group policies) has not been altered in transit.


References


Samba implementation


SMBv1 Encryption (Samba-only)

SMBv1 Encryption applies the normal encryption modes of NTLMSSP or GSSAPI(krb5) in a similar wrapping to SASL.


References


Samba implementation


SMBv2 Signing

SMB2 Signing uses HMAC-SHA256.
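
A minimal sketch of this signing step, assuming the 16-byte signature field sits at offset 48 of the 64-byte SMB2 header, is zeroed while the HMAC is computed, and receives the first 16 bytes of the HMAC-SHA256 output. The function names are hypothetical.

```python
import hashlib
import hmac

SMB2_SIG_OFFSET = 48  # assumed position of the signature in the SMB2 header
SMB2_SIG_LEN = 16


def smb2_signature(key, message):
    """Compute the truncated HMAC-SHA256 over message with a zeroed signature."""
    zeroed = (message[:SMB2_SIG_OFFSET]
              + bytes(SMB2_SIG_LEN)
              + message[SMB2_SIG_OFFSET + SMB2_SIG_LEN:])
    return hmac.new(key, zeroed, hashlib.sha256).digest()[:SMB2_SIG_LEN]


def smb2_sign(key, message):
    """Return a copy of message with the signature field filled in."""
    sig = smb2_signature(key, message)
    return (message[:SMB2_SIG_OFFSET]
            + sig
            + message[SMB2_SIG_OFFSET + SMB2_SIG_LEN:])


def smb2_verify(key, message):
    """Check the signature carried in message."""
    received = message[SMB2_SIG_OFFSET:SMB2_SIG_OFFSET + SMB2_SIG_LEN]
    return hmac.compare_digest(received, smb2_signature(key, message))
```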


References


Samba implementation


SMBv3 Encryption

AES-128-CCM and AES-128-GCM are used to encrypt the session using a derived session key.


Samba implementation


References




Application level encryption

For historical reasons, some password/secret get/set operations on LSA and SAMR are protected with application level encryption. This is in contrast to simply requiring that the optional DCE/RPC encryption be used.


Limitations on operation

The session key used for this encryption / obfuscation restricts the transport used:

For the session key to be available (and therefore for the operations to succeed), the DCE/RPC layer must be over ncacn_np (that is when the \pipe\lsarpc named pipe is accessed over SMB) and must not be additionally authenticated at the DCE/RPC layer.

In this ncacn_np case, the session key is derived from the SMB session key, which is in turn from the user’s NTLMSSP or Kerberos authentication.

For ncacn_ip_tcp (access over a direct TCP port) or if any DCE/RPC authentication is used, the session key was the fixed string SystemLibraryDTC. However, modern Samba and Windows versions refuse to use this key.

Sadly, the implication is that DCE/RPC encryption (which is generally stronger) also cannot be used!


References


LSARPC

LSA SetSecret and QuerySecret

The session key is used to encrypt an administrator-supplied secret stored in the AD directory. The transport of this secret is protected by single DES using the session key (without any further salt for a nonce). There is no checksum.


References


Samba implementation


LSA CreateTrustedDomain and SetInformationTrustedDomain

The session key is used to encrypt an administrator-supplied inter-domain trust account secret to be stored in the AD directory. The transport of this secret is protected by RC4 using the session key (without any further salt for a nonce). There is no checksum.


References


Samba implementation


SAMR

SAMR SetUserInfo level 18 and 21

The session key is used to encrypt an administrator-supplied password hash of a user for storage in the AD directory. The transport of this secret is protected by single DES using the session key (without any further salt for a nonce). There is no checksum.


SAMR SetUserInfo level 23 (samr_CryptPassword)

The session key is used to encrypt an administrator-supplied password of a user for storage in the AD directory. The transport of this secret is protected by RC4 using the session key (without any further salt for a nonce). There is no checksum but the length is stored at the tail of the buffer, so an incorrect decryption usually gives an implausible length.


SAMR SetUserInfo level 25 (samr_CryptPasswordEx)

The session key is used to encrypt an administrator-supplied password of a user for storage in the AD directory. The transport of this password is protected by RC4 using the session key (with a 16 byte salt as a nonce). There is no checksum but the length is stored at the tail of the buffer, so an incorrect decryption usually gives an implausible length.
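
The salted RC4 scheme and the role of the trailing length can be sketched as follows. This is an illustration under stated assumptions rather than a wire-accurate implementation: the 516-byte buffer layout (random padding, then the UTF-16LE password, then a 32-bit little-endian length), the key derivation as MD5 over the salt followed by the session key, and the salt being appended after the ciphertext are all assumptions to be checked against the MS-SAMR specification. The function names are hypothetical.

```python
import hashlib
import os
import struct


def rc4(key, data):
    """Plain RC4; encryption and decryption are the same operation."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


def encrypt_password_ex(password, session_key):
    """Return ciphertext (516 bytes) followed by the 16-byte salt."""
    pw = password.encode("utf-16-le")
    if len(pw) > 512:
        raise ValueError("password too long for the buffer")
    # Password is right-aligned; the length sits at the tail of the buffer.
    buf = os.urandom(512 - len(pw)) + pw + struct.pack("<I", len(pw))
    salt = os.urandom(16)
    key = hashlib.md5(salt + session_key).digest()
    return rc4(key, buf) + salt


def decrypt_password_ex(blob, session_key):
    """Return the password, or None if the decrypted length is implausible."""
    enc, salt = blob[:516], blob[516:532]
    key = hashlib.md5(salt + session_key).digest()
    buf = rc4(key, enc)
    (length,) = struct.unpack("<I", buf[512:516])
    if length > 512:
        return None  # wrong key almost always lands here
    return buf[512 - length:512].decode("utf-16-le", errors="replace")
```

With the wrong session key, the decrypted length field is effectively a random 32-bit number, so it almost always exceeds 512 and the buffer is rejected. This is a weak plausibility test, not an integrity check.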


References


Samba implementation




Database secrets

This chapter covers the sensitive information stored in the Samba database, and the related security implications. In particular, it covers:

  • What secret information is stored by Samba.
  • Which secrets and which protocols are particularly critical, if they happened to be compromised.
  • How Samba stores secrets, and mitigates potential disclosure.
  • The importance of access rights in Samba’s database.


Secret attributes

User and Group objects are the fundamental reason an AD domain exists. These objects control who can access what network resources. The full privilege model is stored in the AD database, and so modification of these database objects is security-relevant.

As explained in the Architecture section, Samba stores all the users and groups in a consolidated AD database called the DSDB. The user’s privileges in the domain are stored in a record in the database. In particular the attributes objectSID, primaryGroupID and member (technically the memberOf backlink) implicitly control the user’s privileges.

A user’s password information is also stored along with the rest of the user record. The password attributes, such as supplementalCredentials and unicodePwd, are known as secret attributes. Secret attributes also include inter-domain trust tokens and encryption keys for password vaults. They also include the user’s password history (maintained by the password_hash.c DSDB module). The full list of secret attributes is listed in Appendix V. Secret Attributes.

While all of the user information is sensitive, secret attributes are subject to additional protection.


Particularly critical secret attributes

The following user accounts are particularly critical. A compromise of one of these account passwords can lead to a total domain compromise.

  • krbtgt password (key). At the centre of any Kerberos realm is the krbtgt (Kerberos Ticket Granting Ticket) principal, represented in Samba as a (disabled) user account. The password to this account is randomly generated at the time the domain is provisioned. Access to this secret can result in undetectable total domain compromise as new Kerberos tickets can be printed without auditing.
  • Administrator password. The Administrator account is essentially the most privileged normal account in the domain, and members of the Domain Admins group are similarly privileged. Access to these secrets can also lead to total compromise, in particular through the ability to join new domain controllers, but also to reset any other user’s password (including krbtgt).
  • Domain controller machine account passwords. The account passwords for the Domain Controllers are even more important than the administrator’s account. A Domain controller can read all the domain secrets in clear-text, allowing undetected impersonation. An attacker with access to even just a single domain controller’s password can impersonate any user in that domain.


Protocol access of database

The following is a short summary of which protocols allow different forms of access to the underlying database.

  • Read Access. These protocols have the ability to read database objects (with some restrictions on what objects might be read).
  • LSA
  • SAMR
  • LDAP
  • RAP (forwards to SAMR)
  • DRS Replication
  • Write Access. These protocols have the ability to write new objects to the database (or modify attributes on existing objects).
  • SAMR
  • LDAP
  • Authentication access. These protocols provide authentication access by using database objects (or attributes) for checking passwords.
  • NETLOGON
  • NTLMSSP
  • winbindd / ntlm_auth
  • Kerberos KDC
  • Password changes. A user’s password can be changed in the database by the following protocols.

Note that password changes on a Read-Only Domain Controller (RODC) are simply rejected on Samba. MS-SAMS specifies that a RODC should proxy the password change to an appropriate read-write DC (since the RODC does not have write-access to the AD distributed database). However, in Samba this proxying behaviour is not implemented.


Particularly critical protocols

Directory Replication Service (DRS) is used to replicate AD data between domain controllers. An authorized DRS client can read all the internal secrets, and an authorized DRS server can similarly write secrets and change privileges such as group memberships. As such, the security and integrity of the DRS protocol are critical to the security of the domain.

To protect the secret attributes during replication, the DCE/RPC transport payload is encrypted, which is enforced at connection time. In addition to this, the GetNCChanges RPC requires that the secret attributes are individually encrypted with RC4 and checksummed with CRC32. The session key is salted with a 16-byte nonce using MD5. However, because this second layer of encryption uses the same session key as the DCE/RPC transport encryption, it is essentially an obfuscation layer. It is an extra layer of protection to prevent secret attributes from ever being sent across the network unencrypted.
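The per-attribute layer described above can be sketched as follows. This is a minimal illustration, not Samba's repl_decrypt.c: the function names are hypothetical, and the blob layout (nonce, then the RC4-encrypted CRC32 and value) and the little-endian CRC32 are assumptions made for the sketch. It shows why the layer adds no new key material: the RC4 key is derived only from the session key and a nonce that travels with the data.

```python
import hashlib
import os
import struct
import zlib
from typing import Optional


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4; encryption and decryption are the same operation."""
    s = list(range(256))
    j = 0
    for i in range(256):  # key scheduling
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:  # keystream generation and XOR
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


def wrap_secret(session_key: bytes, value: bytes,
                nonce: Optional[bytes] = None) -> bytes:
    """Hypothetical helper: salt the session key, checksum, then RC4."""
    if nonce is None:
        nonce = os.urandom(16)
    rc4_key = hashlib.md5(session_key + nonce).digest()
    crc = struct.pack("<I", zlib.crc32(value) & 0xFFFFFFFF)
    # Assumed layout: 16-byte nonce in the clear, then RC4(crc32 || value).
    return nonce + rc4(rc4_key, crc + value)


def unwrap_secret(session_key: bytes, blob: bytes) -> bytes:
    """Hypothetical helper: reverse wrap_secret() and verify the CRC32."""
    nonce, body = blob[:16], blob[16:]
    rc4_key = hashlib.md5(session_key + nonce).digest()
    plain = rc4(rc4_key, body)
    (crc,) = struct.unpack("<I", plain[:4])
    value = plain[4:]
    if crc != zlib.crc32(value) & 0xFFFFFFFF:
        raise ValueError("CRC32 mismatch on secret attribute")
    return value
```

Anyone who holds the session key can undo this layer, which is why the text treats it as obfuscation on top of the DCE/RPC transport encryption.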

The DRSUAPI DCE/RPC defines the server-side engine for DRS replication. The main DRS API is GetNCChanges. As mentioned in the Architecture section, Samba has a dedicated Replication server process (known as drepl), as well as a dedicated DSDB module. Attribute encryption is handled by the repl_decrypt.c code.

For more detail on AD replication, see also the Active Directory Replication Bachelor thesis by Stefan Metzmacher.


Storing secret attributes

Samba stores the secret attributes in its DSDB, both in memory and constantly mirrored to sam.ldb on disk. (In reality, the sam.ldb file is actually somewhat of an abstraction, and a separate database file exists in the sam.ldb.d directory for every partition in the domain.) The secret attributes are logically connected with the other attributes on the database record, and so they are co-mingled with less-sensitive information, such as the address-book and group memberships, for storage.

Samba protects the secret attributes using encryption. Samba encrypts secret attributes with AES-128-GCM (an AEAD mode) and a 16-byte nonce.

The encryption is handled by an encrypted_secrets.c DSDB module. The module ensures that the lower database layers always see the secret attributes as encrypted. The higher layers of the DSDB can see the unencrypted attributes, if they specifically request them.

The encrypted-secrets module ensures the secret attributes are always encrypted on disk. When a database record is loaded into memory, the secret attributes are still encrypted at that point. This reduces the unencrypted data in memory, i.e. a read of the memory-mapped database, via a security-hole, would not expose the secret attributes.

Note that, by default, the encrypted_secrets.key has the same directory location and the same file permissions as the database itself. If the domain controller’s disk were compromised, then the secret attributes would still be plainly accessible. Mitigating this weakness would require storing the encrypted_secrets.key in a separate network location, for example using a TPM (Trusted Platform Module) when Samba is started.


Secret attribute disclosure

The values of secret attributes are not routinely disclosed. Effort is taken to avoid secrets being exposed, even by authorized users. The following mitigation strategies are used:

  • Search value restriction. Samba will refuse to allow a search filter on secret attributes. Any search filter on a secret attribute is transformed such that it will simply not match anything in the database.
  • LDAP. Samba will refuse to disclose secret attributes over LDAP. Secret values are stripped from the output before being returned over LDAP.
  • Hidden by direct file access. By default, when an authorized user (i.e. Administrator or root-user) searches directly against the DSDB on disk (i.e. the sam.ldb file), Samba will hide the values for secret attributes. Secret values will only be returned if specifically requested. This avoids the values being shown during a casual ldbsearch on a user (for example).
  • During Replication. As mentioned previously, an additional layer of encryption (i.e. double encryption) is applied to secret attributes during DRS replication.
  • RODC Filtering. AD Domain controllers do not routinely pass the value to an RODC (except by special permission on a special operation).
  • To obtain a secret attribute, the RODC must ask for each object individually using the REPL_SECRET extended operation in the GetNCChanges API. The disclosure of the attribute is recorded in the directory in the msDS-RevealedUsers attribute on the RODC DSA object.
  • Logging. When logging internal LDB operations, Samba will redact the value of secret attributes during printing of LDIF information to avoid secret values being included in syslog messages or log files.
  • This is particularly important if a Samba administrator were to post debug to public mailing-lists, when reporting a problem or bug. Note that private values like staff names are still included in the logs, so care must always be taken when posting logs to public forums.
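The logging mitigation in the list above can be illustrated with a small sketch. It is not Samba's LDIF writer: the function name and the placeholder text are hypothetical, and the attribute names are the ones listed in Appendix V. The point it shows is that redaction is keyed on the attribute name, so other private values pass through untouched.

```python
# Attribute names taken from Appendix V; compared case-insensitively.
SECRET_ATTRIBUTES = {
    name.lower()
    for name in (
        "pekList", "msDS-ExecuteScriptPassword", "currentValue", "dBCSPwd",
        "initialAuthIncoming", "initialAuthOutgoing", "lmPwdHistory",
        "ntPwdHistory", "priorValue", "supplementalCredentials",
        "trustAuthIncoming", "trustAuthOutgoing", "unicodePwd",
        "clearTextPassword", "secret",
    )
}


def redact_ldif(ldif_text: str) -> str:
    """Hypothetical helper: hide secret attribute values before logging."""
    out = []
    skipping = False
    for line in ldif_text.splitlines():
        # LDIF continuation lines start with a space; drop them if they
        # belong to a value that has just been redacted.
        if skipping and line.startswith(" "):
            continue
        skipping = False
        name, sep, _value = line.partition(":")
        if sep and name.lower() in SECRET_ATTRIBUTES:
            out.append(f"# {name}: REDACTED SECRET ATTRIBUTE")
            skipping = True
        else:
            out.append(line)
    return "\n".join(out)
```

Note that a line such as sAMAccountName is left as-is, which matches the warning above that staff names and similar values still appear in logs.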


Database access rights

As well as secret attributes, the access rights for the database are extremely important. The behaviour of Active Directory is controlled by the database itself, so compromising the database objects can compromise the secrets that the database contains. For example:

  • Group membership: if a user were able to add itself as a member of Domain Admins, it would have the same security implications as compromising the Administrator’s password.
  • DRS Replication: if any user were able to replicate in full with a DC, it would have the same security implications as compromising the DC’s machine account.

The permissions for AD database operations are controlled by Access Control Lists (ACLs) stored in the database itself. In particular, the top-level domain object has a number of Access Control Entries (ACEs) that relate to extended rights, while other ACEs are attached directly to the object they relate to. The ACEs define which users have the rights to read or modify a specific database object or its attributes.

For example, the GUID_DRS_GET_ALL_CHANGES right, which allows DRS replication of the domain, is stored on the root object of each partition, whereas the GUID_DRS_USER_CHANGE_PASSWORD right, which allows a user to change their own password, is stored on the user object itself.

A user obtains access rights if the SID in their authorization token matches an allowed SID in the ACE. Typically the ACEs use well-known groups, such as Domain Admins or Authenticated Users. Therefore both the ACL entry and the groups referenced are security-sensitive.
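The SID-matching rule can be sketched in a few lines. This is a deliberately reduced model, not the acl.c module: the Ace type and has_right() are hypothetical, only allow entries are modelled, and the SIDs use a made-up domain prefix. It illustrates why group membership is as sensitive as the ACL itself: adding a group SID to the token is enough to satisfy the check.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class Ace:
    """Hypothetical allow-only ACE: one trustee SID granted one right."""
    trustee_sid: str
    right: str


def has_right(token_sids: Set[str], acl: Iterable[Ace], right: str) -> bool:
    """Grant access if any SID in the token matches an ACE for the right."""
    return any(
        ace.right == right and ace.trustee_sid in token_sids
        for ace in acl
    )


# Example data: S-1-5-21-1-2-3 is an invented domain SID; RID 512 is the
# well-known Domain Admins group.
DOMAIN_ADMINS = "S-1-5-21-1-2-3-512"
partition_root_acl = [Ace(DOMAIN_ADMINS, "GUID_DRS_GET_ALL_CHANGES")]

ordinary_user = {"S-1-5-21-1-2-3-1104"}
admin_user = {"S-1-5-21-1-2-3-1105", DOMAIN_ADMINS}
```

With this data, has_right(admin_user, partition_root_acl, "GUID_DRS_GET_ALL_CHANGES") succeeds while the same call for ordinary_user does not.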

For database operations (like ‘search’ or ‘modify’) on a specific object, access is controlled via a DSDB module (acl.c). For other RPC operations, for example GetNCChanges, the access rights are checked explicitly in the server-side RPC handler code.

Access rights are documented further in the MS-ADTS specification, or in the Active Directory Technical Specification Control Access Rights Concordance.



Key rollover

Keys and secrets in AD

In Active Directory there are a number of encryption and signing keys that should be rolled over at regular intervals.


Machine account passwords

Machine (Domain Controller) account passwords can be rolled over safely. The old password is valid for NTLMSSP login for an hour. Likewise when the Kerberos libraries check an incoming ticket, they will accept any key left in the keytab, which allows tickets to be accepted despite being encrypted by the KDC with the old password.

Additionally, a Key Version Number (KVNO) can indicate which old password (key) to use in the keytab.
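As a rough illustration of how a KVNO helps during roll-over, the sketch below picks candidate keys from an in-memory keytab. The real selection happens inside the Kerberos libraries; KeytabEntry and candidate_keys() are hypothetical names, and the fallback order (newest first) is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class KeytabEntry:
    """Hypothetical in-memory keytab entry."""
    principal: str
    kvno: int
    key: bytes


def candidate_keys(keytab: List[KeytabEntry], principal: str,
                   ticket_kvno: Optional[int] = None) -> List[KeytabEntry]:
    """Return the keys worth trying for a ticket, best guess first."""
    entries = [e for e in keytab if e.principal == principal]
    if ticket_kvno is not None:
        exact = [e for e in entries if e.kvno == ticket_kvno]
        if exact:
            # The KVNO names the password generation directly.
            return exact
    # Without a usable KVNO, try every key left in the keytab.
    return sorted(entries, key=lambda e: e.kvno, reverse=True)
```

A ticket issued just before a password change still carries the old KVNO, so the old entry is selected as long as it remains in the keytab.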


Krbtgt key (the core domain-wide secret in the KDC)

The krbtgt key operates like the machine account password above. The protocol includes a KVNO and at a protocol level tickets encrypted with old keys can be accepted.


TLS certificates for LDAPS and LDAP+StartTLS

TLS certificates, if issued by a CA that the clients trust, can be rolled over in the same way as on web servers.


The BackupKey ClientWrap certificate and key

Multiple certificates and keys can be stored in the directory, with the current preferred key (named G$BCKUPKEY_keyGuidString) pointed at by a LSA special secret called .


The BackupKey ServerWrap key

Multiple keys can be stored in the directory, with the current preferred key (named ) pointed at by a LSA special secret called G$BCKUPKEY_P.


The key used for the encrypted_secrets LDB module to encrypt other secrets at rest

No capacity for multiple keys or key roll-over is included in this module. To create a new key the DC can be re-joined to the domain and re-synced with new data.


Samba implementation

Samba does not currently implement automated key roll-over for any of these keys as an AD DC. However some of the host-specific keys can be forced to roll over:


Machine account password roll-over

When Samba acts as a domain member server, the machine account passwords are rolled over automatically by winbindd, but this is inhibited on the AD DC as of Samba 4.8.


Manual roll-over

The scripts source4/scripting/chgkrbtgtpass and source4/scripting/devel/chgtdcpass provide manual roll-over for the krbtgt and machine (Domain Controller) account passwords. They could be used in an emergency situation. (Run them twice in an emergency to remove the old key from the previous password slot.)

For the TLS certificates, if the files in private/tls/ are removed they will be re-generated. Likewise they can be replaced by new CA-signed certificates and Samba restarted.


Re-join to the domain

As all host-specific keys are generated fresh on a new domain join, wiping the private directory and re-joining the domain is the most comprehensive way to re-key a server.



What parts of Samba use cryptography and what algorithms are used

The most accurate and up-to-date reference on which cryptographic algorithms Samba uses, and where they are used, is Samba’s own document on crypto requirements, as this is updated regularly as the code changes.


Modern crypto

AES

AES is now used extensively in new protocols and, as a general statement, the cryptosystems appear to have been designed with the input of a cryptographer.


Kerberos

Modern Kerberos clients will use Kerberos encryption types based on AES preferentially.

The following Kerberos encryption types are available in Samba:

  • aes256-cts-hmac-sha1-96
  • aes128-cts-hmac-sha1-96
  • arcfour-hmac-md5
  • des-cbc-md5 (disabled by default)

des-cbc-md5 can be enabled by enabling DES at a user-level, but this is generally not considered secure, which is why it is disabled by default.


Outdated and insecure algorithms

It is worth bringing special attention to the following algorithms:


RC4 without a confounder

Not only is RC4 used extensively in the protocols, a number of use cases are implemented without the use of a confounder/nonce or checksum, specifically:

  • Password encryption on SAMR for password set.
  • Protection of NTLM session keys in the SamLogon() family of calls, except for the latest variant, SamLogonEx(). Use of the earlier calls is deprecated and the full DCE/RPC response is protected by Schannel in default configurations.
  • Protection of inter-domain trust password during Establishment of Trusted Domains over LSA.
  • Encryption of session keys in NTLMSSP.


RC4 with a random confounder

  • Encryption of password values using the same session key as the outer (DCE/RPC) encryption during DRS Replication. As no new cryptographic material is used (compared to the wrapping), this could be considered to be obfuscation rather than encryption.
  • The NETLOGON Secure Channel (for older clients).


Disabling RC4

RC4 cannot be universally disabled in Samba and many of the protocols do not support another cipher. The best that can be controlled at this time is to set in smb.conf:

reject md5 clients = true
reject md5 servers = true
ntlm auth = disabled

Additionally, there are a few use cases that appear to follow sound (for the time) cryptographic practices. These two are very similar in design:

  • Protection of the NETLOGON Secure Channel
  • The encryption type


Samba implementation

In Samba the arcfour_crypt_*() functions indicate the use of RC4.

The Samba implementation is in lib/crypto/arcfour.c.


Triple DES

  • BackupKey can use 3DES via GnuTLS to decrypt a client stored password safe.


Single DES

The most notable use of single DES is in NTLMv1. By default NTLMv1 support is disabled.

While RC4 seems to have been the ‘go to’ crypto function in Windows, some aspects of the protocols are old enough that single DES was used instead:

  • Password encryption on SAMR for password hash set.
  • Protection of NTLM session keys in the SamLogon() family of calls, except for the latest variant, SamLogonEx(). Fallback to DES requires special configuration (setting allow nt4 crypto in the smb.conf file). Use of this mode, and the calls that allow it, is deprecated and the full DCE/RPC response is protected by Schannel in default configurations.
  • LSA GetSecret and SetSecret.
  • Obfuscation of password values during DRS replication, using the RID (user number) as the key.


Samba implementation

In Samba, the functions sess_crypt_blob() and sess_encrypt_blob() indicate the use of Single DES outside NTLM. The obfuscation with the RID is done using sam_rid_crypt().

This wrapper is in libcli/auth/session.c.



Where is the raw crypto implemented

Samba has raw (as compared to use of a library) cryptography implemented in the following locations.


Common cryptography functions

This is the common location for Samba’s implementation of cryptography. The implementations here come from various places historically, most particularly Heimdal.


Path

lib/crypto


Samba-written cryptographic primitives

lib/crypto/arcfour.c

lib/crypto/arcfour.c - Samba’s own implementation of RC4.


lib/crypto/md4.c

lib/crypto/md4.c - Samba’s own implementation of MD4.


AES modes

Samba has implemented these AES modes on top of the imported AES.


Imported primitives from elsewhere

lib/crypto/hmacmd5.c

lib/crypto/hmacmd5.c - Samba’s import of RFC 2104 HMAC-MD5.


lib/crypto/hmacsha256.c

lib/crypto/hmacsha256.c - Samba’s import of rfc2202 HMAC-SHA256.


Imported primitives from Heimdal

The following are from Heimdal:


SMBDES

This is an implementation of DES originally for NTLM authentication. Despite the comments, full forward and reverse DES is provided.


Path

libcli/auth/smbdes.c


AES-NI

Intel’s AES NI instructions provide faster access to AES on supported CPUs.

This is taken directly from the Linux Kernel and then wrapped for Samba’s use in lib/crypto/aes.c.


Path

third_party/aesni-intel


Heimdal

Samba has an old copy of Heimdal, a Kerberos implementation, vendored into our tree for use in building the Samba AD DC.

Heimdal provides a cryptographic library for its own use as a Kerberos library, and Samba uses this directly for RSA and DES operations in the BackupKey implementation if a recent GnuTLS is not available.


Path

source4/heimdal/lib/hx509


Other (essentially) unused cryptography

lib/replace

There is a DES implementation for crypt() in lib/replace/crypt.c. However, all modern systems not only have crypt(), they have a version with many more features. This implementation is only used for plaintext authentication (encrypt passwords = no) on systems without PAM. It is mentioned here for completeness.


zLib

There is a ZIP encryption implementation in third_party/zlib/contrib/minizip/crypt.h due to the inclusion of the whole zlib release tree. It is not used in Samba.



What third party crypto is used

For historic reasons Samba has generally relied on in-tree cryptography. However, we also rely, particularly in the AD DC, on these external libraries:


GnuTLS

GnuTLS supplies Samba with TLS support for LDAP. It also supplies generic cryptographic operations in BackupKey and the on-disk encryption of secret attributes.


Nettle

Nettle supplies generic cryptographic operations in BackupKey and the on-disk encryption of secret attributes.


MIT Kerberos (optional)

As an alternative to the Samba fork of Heimdal, Samba can be built against the system version of MIT Kerberos.



Random number generation

/dev/urandom

Samba uses /dev/urandom as the sole source of random numbers for cryptographic purposes. If this file can not be opened the program will abort().

No internal pool is maintained as this requires work to ensure safe operation across a fork() etc. We choose to trust the kernel’s security promises over any performance gains that might be possible by optimisation.

Samba also uses /dev/urandom for other types of random numbers, by policy, to avoid mis-selection. The only exception is in test code.
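The policy described in this section can be sketched as below. This is an illustration only, not Samba's C implementation: generate_random_buffer() is a hypothetical name here, and opening the device on every call is simply the easiest way to show that no user-space pool is kept. The sketch assumes a system that provides /dev/urandom.

```python
import os


def generate_random_buffer(length: int) -> bytes:
    """Read exactly `length` bytes from /dev/urandom, or abort."""
    try:
        fd = os.open("/dev/urandom", os.O_RDONLY)
    except OSError:
        # No fallback source: failing hard is safer than weak randomness.
        os.abort()
    try:
        chunks = []
        remaining = length
        while remaining > 0:
            chunk = os.read(fd, remaining)
            if not chunk:
                os.abort()  # unexpected end of data from the kernel
            chunks.append(chunk)
            remaining -= len(chunk)
        # Nothing is cached between calls, so there is no pool state that
        # could be duplicated across a fork().
        return b"".join(chunks)
    finally:
        os.close(fd)
```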


GnuTLS, Kerberos, etc.

Samba does not use the random number APIs from GnuTLS or Kerberos, but when these libraries use random numbers internally, the source is decided inside that library. Our understanding, however, is that this is /dev/urandom.


Other References

Samba 4 - Active Directory, by Andrew Bartlett

While quite old, this document still describes well how Samba’s AD DC is built.


Core Infrastructure Initiative (CII) Badge Application for Samba

The CII Best practices badge is obtained by documenting Samba’s practices and processes as an Open Source project, with a particular focus on security practices. Many assurances a security auditor would like to know regarding our internal processes are catalogued there, with references back to Samba as evidence.

Samba does not yet have the CII Best practices badge.



License

See License.



Appendix I. Samba utilities

  • libsmbclient allows client applications like Gnome to browse and view files on a remote SMB server. Its source code is source3/libsmb.
  • libwbclient: modern FreeRadius versions actually link directly to the winbind client library libwbclient and so avoid the fork()/exec() cost of calling ntlm_auth. Its source code is nsswitch.
  • net is a command line tool that provides a more extensive set of administrative functionality for Samba. The most notable function is net ads join, used to join new member servers to an AD domain. The focus of this tool is on aspects of the file server. Its source code is source3/utils.
  • ndrdump is a testing utility used to validate the parsing of DCE/RPC requests and replies seen (for example) over the network. It can also parse some other arbitrary blobs (typically defined as fake network calls) which are described using IDL.
  • ntlm_auth is a tool that allows external projects like FreeRadius (for 802.1x authentication), Squid and Apache (for NTLM over HTTP) to authenticate users against the joined domain. Its source code is source3/utils/ntlm_auth.c.
  • nss_winbind is a key part of running Samba as a domain member as it provides local user and group entries to the Name Service Switch (NSS) subsystem. It can also be used on systems that provide local desktop logins. Its source code is nsswitch.
  • pam_winbind allows local logins to be authenticated against the joined domain using UNIX Pluggable Authentication Modules (PAM).
  • smbclient is a command line tool. It’s described as ftp-like, in reference to the early command-line ftp client. It allows get, put, mkdir and many similar commands against an SMB server. Its source code is in source3/client.
  • smbcontrol is a command line tool that gives (diagnostic) information on and allows control of the Unix processes that Samba creates. Its source code is source3/utils/smbcontrol.c.
  • smbstatus is a command line tool that gives information on which files are open and which locks are held by Samba clients. Its source code is source3/utils/status.c.



Appendix II. Samba compiler options used

The compiler options used by Samba are defined in buildtools/wafsamba/samba_autoconf.py: SAMBA_CONFIG_H(). These are:

  • -fstack-protector
  • -Wall
  • -Wshadow
  • -Wmissing-prototypes
  • -Wcast-align
  • -Wcast-qual
  • -fno-common
  • -Wformat=2 -Wno-format-y2k
  • -Wno-format-zero-length
  • -Werror=format
  • -Werror=format-security -Wformat-security
  • -Werror=address
  • -Werror=strict-prototypes -Wstrict-prototypes
  • -Werror=write-strings -Wwrite-strings
  • -Werror-implicit-function-declaration
  • -Werror=pointer-arith -Wpointer-arith
  • -Werror=declaration-after-statement -Wdeclaration-after-statement
  • -Werror=return-type -Wreturn-type
  • -Werror=uninitialized -Wuninitialized

In addition to warning during all builds, we compile some subsystems with -Werror so all the above warnings become errors. We only do this if -Wno-error=tautological-compare is supported as we do rely on this idiom.


Additional Fedora hardening options

By comparison, the standard Fedora options also include the following relevant options that are not enabled by default in Samba:

  • -fstack-protector-strong Broadens the scope of the stack-protection checks (compared with -fstack-protector), without the overhead and performance impact of -fstack-protector-all.
  • -fPIE Position-Independent Executable (PIE) compilation, needed for full ASLR (Address Space Layout Randomization). Currently supported by the Samba build framework, but only enabled if the compiler supports it.
  • -Wp,-D_FORTIFY_SOURCE=2 Activates glibc hardening features.
  • -Wl,-z,relro,-z,now Full RELRO (Read-Only Relocation). Currently supported by the Samba build framework, but only enabled if the compiler supports it.



Appendix III. Additional security layering information

GSSAPI / Kerberos

This section adds a little more information in regards to how GSSAPI is combined with Kerberos in Samba. GSSAPI(KRB5) is the standard GSSAPI mechanism that is offered.


Samba implementation

While Samba itself has avoided implementing Kerberos and GSSAPI directly, the wrapper code is important to locate:

Plus in Heimdal (when selected):


Additional detail on the NTLM authentication subsystem

Differences between Kerberos and NTLM

In Kerberos, the ticket and PAC (Privilege Attribute Certificate), described in MS-PAC, provide both authentication and authorization information in the authentication assertion (which ensures only the correct user has access to decrypted information and therefore access to specific resources). NTLM does not have these mechanisms, and so each authentication must be checked against a DC in real-time.

Since NTLM must be done in real-time, offline domain controllers may have more of an effect on users accessing network resources.


Differences between Samba and Windows

In Windows, certain authentication requests and login failures are recorded at the primary domain controller (PDC, or PDC emulator to be more precise). This is meant to ensure that a domain-wide lockout of an account, or a password change, is more reliable and consistent (by prioritising changes to occur on the PDC and forwarding changes to the PDC as a high priority).

In Samba, the behaviour to redirect such traffic to the PDC has not been implemented. Where some form of forwarding is required, a neighbouring domain controller is chosen rather than the primary domain controller.


Authentication on the RODC

Where passwords are not replicated to the RODC (because it has not been able to, or because it is configured not to do so), authentication is forwarded to a read-write domain controller. This is normally done to the PDC, but in Samba, we only fall back to a neighbouring DC.

As a related note, Kerberos forwarding of such login failures has also not been implemented in Samba, but the maintenance of the number of login failures has been implemented using a dummy NTLM request with no password (to trigger lockouts).


cli_credentials interface and abstraction

The cli_credentials abstraction is a key part of making authentication work in Samba (when operating as a client). By abstracting the various details needed for authentication into one opaque object, cli_credentials simplifies a number of the authentication APIs. One useful feature that cli_credentials has is that it can maintain multiple passwords, where one password might be a newly-set password, while the other is the previous password. In this manner, retrying with the correct credentials is handled behind the abstraction rather than becoming a caller issue.
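The multiple-password idea can be shown with a toy model. It does not reproduce the cli_credentials C API: the Credentials class, authenticate() and the try_login callback are all hypothetical names for this sketch. It only demonstrates the pattern of an opaque object that offers the current password first and the previous one as a retry.

```python
from typing import Callable, Iterator, Optional


class Credentials:
    """Hypothetical opaque credentials object holding two passwords."""

    def __init__(self, username: str, password: str,
                 old_password: Optional[str] = None) -> None:
        self.username = username
        self._password = password
        self._old_password = old_password

    def passwords(self) -> Iterator[str]:
        """Yield the newly-set password first, then the previous one."""
        yield self._password
        if self._old_password is not None:
            yield self._old_password


def authenticate(creds: Credentials,
                 try_login: Callable[[str, str], bool]) -> bool:
    """Try each stored password in turn; the caller never sees the retry."""
    for password in creds.passwords():
        if try_login(creds.username, password):
            return True
    return False
```

This is useful straight after a machine password change, when the server being contacted may not yet have seen the new password.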


Samba implementation



Appendix IV. Change password routines in SAMR

The SAMR protocol, which is available over DCE/RPC, provides user and group enumeration, as well as other critical operations such as password changes (or resets).

There are three routines of interest. The first two change the user’s password given the old password which is protected by RC4, while the last routine is protected with DES (this is not enabled by default).


Samba implementation


samr_ChangePasswordUser2 and samr_ChangePasswordUser3

Passwords can be changed over SAMR by providing the new password cross-encrypted with the old password. The new unicode password is packed as UTF16 into the end of a random-filled buffer and encrypted using RC4 with the old password. The old NT hash (and LM hash if supplied) values are encrypted with the new password (again with RC4) to prove the password is known.
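The buffer packing described above can be sketched as follows. This is an illustration under stated assumptions, not Samba's code: the function names are hypothetical, the 512-byte buffer followed by a 4-byte little-endian length is assumed from the usual SAMR encrypted-password structure, and the 16-byte RC4 key is passed in directly (in the protocol it is derived from the old password) so that the sketch does not depend on an MD4 implementation.

```python
import os
import struct

BUFFER_LEN = 512  # assumed size of the password area, in bytes


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4; encryption and decryption are the same operation."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


def pack_new_password(new_password: str, old_key: bytes) -> bytes:
    """Place the UTF-16 password at the end of a random-filled buffer."""
    encoded = new_password.encode("utf-16-le")
    if len(encoded) > BUFFER_LEN:
        raise ValueError("password too long for the buffer")
    padding = os.urandom(BUFFER_LEN - len(encoded))
    plain = padding + encoded + struct.pack("<I", len(encoded))
    # No nonce or checksum is involved: this is bare RC4 under old_key.
    return rc4(old_key, plain)


def unpack_new_password(blob: bytes, old_key: bytes) -> str:
    """Server-side view: decrypt and read the password back from the tail."""
    plain = rc4(old_key, blob)
    (length,) = struct.unpack("<I", plain[BUFFER_LEN:BUFFER_LEN + 4])
    if length > BUFFER_LEN:
        raise ValueError("invalid password length")
    return plain[BUFFER_LEN - length:BUFFER_LEN].decode("utf-16-le")
```

Because there is no integrity check in this layer, a wrong key is only noticed when the decrypted length or text fails to make sense, which is one reason this use of RC4 is listed among the outdated constructions earlier in the document.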


References


Samba implementation


samr_OemChangePasswordUser2

Passwords can be changed over SAMR by providing the new password cross-encrypted with the old password. The new unicode password is packed as UTF16 into the end of a random-filled buffer and encrypted using RC4 with the old password. The old NT hash (and LM hash if supplied) values are encrypted with the new password (using DES in ECB mode) to prove the password is known.


References


Samba implementation

This password change routine is disabled unless an smb.conf option is set: lanman auth = yes



Appendix V. Secret Attributes

The list of secret attributes is currently:

  • pekList
  • msDS-ExecuteScriptPassword
  • currentValue
  • dBCSPwd
  • initialAuthIncoming
  • initialAuthOutgoing
  • lmPwdHistory
  • ntPwdHistory
  • priorValue
  • supplementalCredentials
  • trustAuthIncoming
  • trustAuthOutgoing
  • unicodePwd

While never exposed over the network, we treat this internal attribute in the same way:

  • clearTextPassword

This attribute is not in the schema, but is used in secrets.ldb, so it is also redacted when printed in LDIF during debugging.

  • secret

The list of secret attributes is hard-coded in AD, rather than being based on the current schema. They correspond to the following reference documents:

In the Samba implementation, the full list of secret attributes is maintained as the DSDB_SECRET_ATTRIBUTES_EX macro in source4/dsdb/common/util.h

Note that Samba uses the above MS-ADTS attribute list, which is a superset of the MS-DRSR list.


Extended access rights

The list of extended rights known to Samba for GUI display purposes is here:

The list of those GUIDs for which we have a C constant allowing implementation is here:

However only some of these are actually evaluated by Samba at this time.