Code Coverage: Difference between revisions

From SambaWiki
Line 119: Line 119:
These are our third party sources, which are pulled in as tarballs or full copies of their respective code bases. In some cases, they might have their own build files, tests or other artifacts which might normally be triggered in upstream development. Like Heimdal, we may not use much of the codebase and only use certain functions.
These are our third party sources, which are pulled in as tarballs or full copies of their respective code bases. In some cases, they might have their own build files, tests or other artifacts which might normally be triggered in upstream development. Like Heimdal, we may not use much of the codebase and only use certain functions.


Part of the reason there is a third party directory is so that we do not have to reimplement functionality (and we use the most well known implementations) and their respective testing.
Part of the reason there is a third party directory in our source tree is so that we do not have to reimplement functionality (and we use the most well known implementations) and worry about their respective testing or maintenance.


== Areas of code to consider removing or consolidating ==
== Areas of code to consider removing or consolidating ==

Revision as of 07:06, 16 May 2019

Code coverage is generated via a periodic Gitlab pipeline and is aggregated from merging the result of each individual job (which together runs all the available tests). This includes tests for LDB and TDB, which have their own individual build systems (and currently excludes CTDB). This page only describes C code coverage currently.

LCOV coverage for C code is available here:

https://samba-team.gitlab.io/devel/samba/

Total coverage

Last updated: 2019-05-14 17:41:12

Line coverage: 39.8% (519470 / 1304847)
Function coverage: 42.0% (30933 / 73567)

Function coverage is generally better than line coverage across the code base. In a number of cases where function coverage is not 100%, the calls are stubbed out and return errors such as 'Not Implemented'. Much of the test code has high levels of code coverage indicating that these tests are actually being run correctly. From looking at what is being covered, it does not seem like there is much trivial testing that has been obviously missed and much of it requires a more in-depth investigation.

Line coverage exceeding 75% normally indicates reasonably good coverage in a file. Reaching higher than this (and 100%) is often impossible without heavy instrumentation because of the nature of C code. When inspecting these files, you can notice that many of the allocation errors and general error cases are not well covered. Under normal operation, and even in extreme error conditions, many of these cases are practically impossible to hit. In some cases, as a result of defensive coding, these cases are in fact impossible to hit (but allow refactoring and other unexpected external changes to preserve the correct behaviour).

Line coverage below 60% may be in need of some further testing, although do note that in some cases, these files have more error cases or difficult to test scenarios and so 60% (or below) may already indicate reasonable coverage.

Notable areas

bin/default/librpc/gen_ndr (Generated Code)

Line coverage: 18.1% (104224 / 574711)

Generated code contributes to almost half of all detected C code. This code is generated using PIDL which reads IDL files and generates (network) marshalling code. In particular, it writes binary serialization and deserialization functions for structures sent over network protocols, formatted print routines for structures and in some cases generates remote procedure call (RPC) interfaces. These RPC interfaces are generally for DCE/RPC protocols implemented by Active Directory and internal messaging. They are often stubbed and allow manual implementation elsewhere in the code.

In many cases, there are no callers of our formatted print routines for many structures. Many structures remain unused, and many structures are deprecated (e.g. v3, v5 are from earlier versions of the protocol and v8 is commonly in use), as well as many functions defined in various RPC protocol specifications. In fact, many exposed functions which are at the latest version may still be superseded by non-RPC mechanisms, or clients generally perform these operations over a different protocol e.g. LDAP.

One other caveat with the generated code is that there is a significant amount of dead-code generated because PIDL is quite simple and may duplicate a number of different checks that LCOV might be detecting. Regardless, the coverage statistics for this set of code is very low and any ways we can improve this warrant investigation.

source4/heimdal (Kerberos Implementation Forked from Heimdal)

This code implements our Kerberos Key Distribution Center (KDC) and includes a number of client libraries and crypto. This is a fork of Heimdal (known as Lorikeet Heimdal) from some time ago with security fixes and some of our own patches. Although this forms a critical component of the Samba Active Directory implementation, a number of the components in the Heimdal code base which we have copied into our source tree are not used (source4/heimdal/lib/ntlm). Similarly, any tests that come with Heimdal and could be exercising different functionality are not being run within our code. Due to the way we wrap the KDC and our own independent implementation of different security layers (GENSEC and GSSAPI), many paths in this code are never used and should never be used.

source4/dsdb/samdb/ldb_modules

Line coverage: 73.0% (16673 / 22843)

In terms of the Active Directory code, the LDB modules implement much of the underlying logic as database layers. Aggregating the entire LDB module stack gives an LDAP-like database with AD compatibility, which is then wrapped with our LDAP server. The core LDB implements a LDAP-like database (without any constraints), backed by a key value store (TDB, a Samba-based project or LMDB). The modules code and core LDB code (and TDB) in the lib/ folder have similar levels of code coverage as the modules. Much of the LDAP server seems to be similarly tested (or slightly better).

Some notable exceptions in the code coverage are modules which are not run frequently or at all. There is an example skel.c in lib/ldb/modules for describing how to build an LDB module. In the Samba AD specific modules, in source4/dsdb/samdb/ldb_modules, there is count_attrs which is a statistics gathering module for development purposes and some historical attempts at OpenLDAP backends (parts which exists in a number of places in the code base).

source4/torture and source3/torture (Test Code)

These describe a number of tests written in C code (as opposed to our Python tests), our primary testing method in the past but sometimes written due to the difficulty of creating external bindings. The coverage of the source4 code is notably better than the source3. This likely warrants more investigation, although it seems to be that benchmarking code (for performance reasons) is not triggered by our CI.

source3/torture

Line coverage: 47.1% (6519 / 13830)
Function coverage: 58.1% (293 / 504)

source4/torture

Varies by directory, generally 80%+ line coverage besides nbench.

source3/rpc_server, source4/rpc_server and source3/rpcclient (DCE/RPC)

source3/rpcclient

Line coverage: 6.2% (494 / 8025)


dnsserver (browser has stub functions)

source4/ntvfs (File Server Not Running in Production)

Not well tested, with less than 50% coverage in parts.

source3/modules (Samba File Server VFS Modules)

Line coverage: 37.7% (9698 / 25692)
Function coverage: 47.5% (868 / 1828)

In order to implement SMB network file semantics and Windows ACL semantics, there are virtual file system (VFS) modules to intercept and transform file operations into operations on a POSIX / Linux based system (and file system). A number of these modules are optional and there are a few in the source tree due to vendors (like anti-virus scanning) to simplify maintenance burdens. Since they have performance implications or other external dependencies (like clustered environments like Ceph), many are not turned on by default and are not tested in our default testing environments.

Normally our testing is done on the EXT4 file system, but by swapping out our VFS modules we support other file systems which are not as well tested if you look at code coverage, as they are not run in our normal CI and build pipelines. Some of our tests depend heavily on EXT4 semantics and are hard-coded, so simply switching the backend file system is not necessarily appropriate. Furthermore, in situations like clustered environments, the semantics and guarantees may not be at the same level.

source3/winbindd

Line coverage: 33.8% (6677 / 19765)
Function coverage: 51.2% (511 / 998)

The AD winbind

source3/smbd

Line coverage: 63.4% (26729 / 42164)
Function coverage: 82.3% (1346 / 1636)

source3/nmbd

Line coverage: 38.0% (2175 / 5728)
Function coverage: 53.4% (164 / 307)

Provides Netbios name services and implements a number of protocols and functions (some of which may be considered obsolete with new versions of Windows). In Active Directory, nbtserver is what is instead to provide this functionality and nmbd is not normally used (or can be used). DNS often replaces the use of Netbios, with DNS names being preferred than the short Netbios name. In some of our code, Netbios would only be used as the fallback and so much of this code may not be hit unless there are resolution issues.

third_party

These are our third party sources, which are pulled in as tarballs or full copies of their respective code bases. In some cases, they might have their own build files, tests or other artifacts which might normally be triggered in upstream development. Like Heimdal, we may not use much of the codebase and only use certain functions.

Part of the reason there is a third party directory in our source tree is so that we do not have to reimplement functionality (and we use the most well known implementations) and worry about their respective testing or maintenance.

Areas of code to consider removing or consolidating

Group Policy Code

Registry Code

NTVFS File server