CTDB Performance

From SambaWiki
Revision as of 02:45, 8 October 2021 by MartinSchwenke (→ Record contention: Fix broken heading)

Record contention

CTDB's distributed volatile databases are subject to contention for database records. This can result in performance issues. Contention is most often seen in locking.tdb (and sometimes brlock.tdb). Records in these databases are directly associated with files, so when several nodes contend for access to metadata for a particular file or directory, the associated record(s) are contended and bounce between nodes.

In this situation it is important to understand that CTDB is only involved in creating records and moving them between nodes. smbd looks for a record in the desired TDB and, if it determines that the latest version of that record is present on the current node, it uses that record. There are two other cases:

  • The record is present but the current node does not have the latest copy
  • The record is not present

In both cases smbd will ask ctdbd to fetch the record. So when multiple nodes need to access the same record, that record bounces between them. This can be expensive.
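The cost of bouncing can be illustrated with a toy model. The sketch below is not CTDB's implementation; it only applies the access rule described above (use the record if the latest copy is local, otherwise have it fetched) to a single record. The function name count_migrations and its parameters are hypothetical.

```python
# Toy model only: this is NOT how CTDB is implemented. It just applies the
# access rule described above to one record and counts the fetches.

def count_migrations(accesses, start_node=None):
    """Count how often a single record has to be fetched or created.

    accesses:   node numbers, in the order the nodes touch the record.
    start_node: node holding the latest copy initially (None = no record yet).
    """
    holder = start_node
    migrations = 0
    for node in accesses:
        if holder != node:
            # Record is missing or stale on this node, so it must be fetched.
            migrations += 1
            holder = node
        # Otherwise the latest copy is already local and is used directly.
    return migrations


if __name__ == "__main__":
    # One node does all the work: one initial fetch, then local access only.
    print(count_migrations([0, 0, 0, 0, 0, 0]))  # prints 1
    # Three nodes interleave access to the same file: every access bounces.
    print(count_migrations([0, 1, 2, 0, 1, 2]))  # prints 6
```

The same number of accesses costs one fetch when they all come from one node and six when three nodes take turns, which is why contended files are the ones that show up in the logs.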

Log messages indicating poor performance

Log messages like the following are an indicator of performance problems:

 db_ctdb_fetch_locked for /var/cache/dbdir/volatile/locking.tdb.N key ABCDEFBC2A66F9AD1C55142C290000000000000000000000, chain 62588 needed 1 attempts, X milliseconds, chainlock: Y ms, CTDB Z ms

If Z is large (multiple seconds, particularly tens of seconds) then CTDB took a long time to fetch the record from another node.
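As a rough aid for scanning logs, the following sketch extracts Z from such messages and reports the slow ones. The regular expression is modelled only on the example message above and assumes the timings are logged as integers or decimals; the function name slow_fetches and the 2000 ms default threshold are hypothetical choices, not anything defined by Samba.

```python
import re

# Pattern modelled on the example message above. Assumption: the three
# timings may appear as integers or as decimal numbers.
FETCH_LOCKED_RE = re.compile(
    r"db_ctdb_fetch_locked for (?P<db>\S+) key (?P<key>[0-9A-Fa-f]+), "
    r"chain (?P<chain>\d+) needed (?P<attempts>\d+) attempts, "
    r"(?P<total_ms>\d+(?:\.\d+)?) milliseconds, "
    r"chainlock: (?P<chainlock_ms>\d+(?:\.\d+)?) ms, "
    r"CTDB (?P<ctdb_ms>\d+(?:\.\d+)?) ms"
)


def slow_fetches(lines, threshold_ms=2000.0):
    """Yield (db, key, ctdb_ms) for fetches where CTDB took >= threshold_ms.

    threshold_ms is a hypothetical cut-off for "multiple seconds".
    """
    for line in lines:
        match = FETCH_LOCKED_RE.search(line)
        if match is None:
            continue  # not a db_ctdb_fetch_locked message
        ctdb_ms = float(match.group("ctdb_ms"))
        if ctdb_ms >= threshold_ms:
            yield match.group("db"), match.group("key"), ctdb_ms
```

Passing an open log file to slow_fetches() yields one tuple per slow fetch. Grouping the results by key shows whether a small number of records account for most of the delay.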

Aside: Stuck smbd processes

If Z is even larger (hundreds or thousands of seconds) then this can indicate that an smbd process on a node is stuck in D state, probably in a cluster filesystem system call, while holding a TDB lock. In this case the above db_ctdb_fetch_locked messages may not be seen at all, because the record is never successfully fetched. Instead, one or more repeated messages like the following may be seen:

 Unable to get RECORD lock on database locking.tdb for X seconds

A very large value of X (hundreds or thousands of seconds) indicates a serious problem.

This can be confirmed by finding a long-running smbd process in D state and obtaining a kernel stack trace (on Linux, from /proc/<pid>/stack). The lock debug script option in the [database] section of ctdb.conf(5) provides an automated way of debugging this. When robust mutexes are in use, which is the modern Samba default, this automated method only works with versions >= 4.15.
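The manual check can be sketched as follows. This is a minimal Linux-only example, not the lock debug script shipped with CTDB. It reads the process state from /proc/<pid>/stat and the kernel stack from /proc/<pid>/stack, which normally requires root. The function names and the proc_root parameter are hypothetical.

```python
import os


def parse_stat(stat_text):
    """Return (comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the last ')' instead of splitting on whitespace.
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    comm = stat_text[open_paren + 1:close_paren]
    state = stat_text[close_paren + 1:].split()[0]
    return comm, state


def d_state_processes(name="smbd", proc_root="/proc"):
    """Yield (pid, kernel_stack) for processes called `name` in D state."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return  # no procfs available here
    for entry in entries:
        if not entry.isdigit():
            continue  # not a process directory
        pid_dir = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(pid_dir, "stat")) as f:
                comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited, or stat had an unexpected format
        if comm != name or state != "D":
            continue
        try:
            with open(os.path.join(pid_dir, "stack")) as f:
                stack = f.read()
        except OSError as err:
            stack = f"(could not read kernel stack: {err})"
        yield int(entry), stack


if __name__ == "__main__":
    for pid, stack in d_state_processes():
        print(f"smbd pid {pid} is in D state:\n{stack}")
```

A process can be in D state briefly during normal I/O, so a single snapshot proves little. A pid that is reported on every run over a long period, with the same cluster filesystem functions in its stack, is the one to investigate.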

As hinted at above, the usual reason for this type of problem is a cluster filesystem issue.

Hot records

Hot records in a database can be identified from the output of:

 ctdb dbstatistics locking.tdb

Workarounds

Deliberately breaking lock coherency

Contention can be worked around by deliberately but carefully breaking lock coherency using:

 fileid:algorithm = fsname_norootdir

or even:

 fileid:algorithm = fsname_nodirs

See vfs_fileid(8). This needs to be carefully considered and understood to avoid filesystem corruption.