CTDB 2 Release Notes
From SambaWiki
ctdb 2.5.4
- Release Notes for ctdb 2.5.4
- September 26, 2014
Changes
User-visible changes
- New command "ctdb detach" to detach a database.
- Support for TDB robust mutexes. To enable set TDBMutexEnabled=1. The setting is per node.
- New manual page ctdb-statistics.7.
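The two new features above might be used as follows; this is a sketch only, and the database name locking.tdb is a hypothetical example:

```sh
# Detach a database from the cluster (database name is hypothetical)
ctdb detach locking.tdb

# Enable TDB robust mutexes on this node, using the usual CTDB_SET_X=Y
# convention for tunables in the CTDB configuration file
CTDB_SET_TDBMutexEnabled=1
```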
Important bug fixes
- Verify policy routing configuration when starting up to make sure that policy routing tables do not override default routing tables.
- "ctdb scriptstatus" should correctly list the number of scripts executed.
- Do not run eventscripts at real-time priority.
- Make sure "ctdb restoredb" and "ctdb wipedb" cannot affect an ongoing recovery.
- If a readonly record revocation fails, CTDB no longer aborts; it retries the revocation.
- pending_calls statistic now gets updated correctly.
Important internal changes
- Vacuuming performance has been improved.
- Fix the order of setting recovery mode and freezing databases.
- Remove NAT gateway "monitor" event.
- Add per database queue for lock requests. This improves the lock scheduling performance.
- When processing dmaster packets (DMASTER_REQUEST and DMASTER_REPLY), defer all call processing for that record. This avoids a temporary inconsistency in dmaster information that caused call requests to bounce rapidly between two nodes.
- Correctly capture the output from lock helper processes, so it can be logged.
- Many test improvements and additions.
ctdb 2.5.3
- Release Notes for ctdb 2.5.3
- April 1, 2014
Changes
User-visible changes
- New configuration variable CTDB_NATGW_STATIC_ROUTES allows NAT gateway feature to create static host/network routes instead of default routes. See the documentation. Use with care.
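A minimal sketch of a NAT gateway configuration using the new variable; the surrounding CTDB_NATGW_* variables belong to the existing NAT gateway feature, and all addresses are illustrative:

```sh
CTDB_NATGW_NODES=/etc/ctdb/natgw_nodes
CTDB_NATGW_PUBLIC_IP=10.1.1.121/24
CTDB_NATGW_PUBLIC_IFACE=eth0
CTDB_NATGW_DEFAULT_GATEWAY=10.1.1.254
# New in 2.5.3: create static routes to these networks instead of a
# default route; a network may carry its own gateway after "@"
CTDB_NATGW_STATIC_ROUTES="10.2.1.0/24 10.2.2.0/24@10.1.1.253"
```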
Important bug fixes
- ctdbd no longer crashes when tickles are processed after reloading the nodes file.
- "ctdb reloadips" works as expected because the DEL_PUBLIC_IP control now waits until public IP addresses are released before returning.
Important internal changes
- Vacuuming performance has been improved.
- Record locking now compares records based on their hashes to avoid scheduling multiple requests for records on the same hashchain.
- An internal timeout for revoking read-only record delegations has been changed from a hard-coded 5 seconds to the value of the ControlTimeout tunable. This makes it less likely that ctdbd will abort.
- Many test improvements and additions.
ctdb 2.5.2
- Release Notes for ctdb 2.5.2
- February 3, 2014
Changes
User-visible changes
- Much improved manpages from CTDB 2.5 are now installed and packaged.
Important bug fixes
- "ctdb reloadips" now waits for replies to addip/delip controls before returning.
Important internal changes
- The event scripts are now executed using vfork(2) and a helper binary instead of fork(2), providing a performance improvement.
- "ctdb reloadips" now works if some nodes are inactive. This means that public IP addresses can be reconfigured even if nodes are stopped.
ctdb 2.5.1
- Release Notes for ctdb 2.5.1
- November 27, 2013
Changes
Important bug fixes
- The locking code now correctly implements a per-database active locks limit. Whole database lock requests can no longer be denied because there are too many active locks - this is particularly important for freezing databases during recovery.
- The debug_locks.sh script now locks against itself: if it is already running, subsequent invocations exit immediately.
- ctdb tool commands that operate on databases now work correctly when a database ID is given.
- Various code fixes for issues found by Coverity.
Important internal changes
- statd-callout has been updated so that statd client information is always up-to-date across the cluster. This is implemented by storing the client information in a persistent database using a new "ctdb ptrans" command.
- The transaction code for persistent databases now retries until it is able to take the transaction lock. This makes the transaction semantics compatible with Samba's implementation.
- Locking helpers are created with vfork(2) instead of fork(2), providing a performance improvement.
- config.guess has been updated to the latest upstream version so CTDB should build on more platforms.
ctdb 2.5
- Release Notes for ctdb 2.5
- October 30, 2013
Changes
User-visible changes
- The default location of the ctdbd socket is now:
/var/run/ctdb/ctdbd.socket
- If you currently set CTDB_SOCKET in configuration then unsetting it will probably do what you want.
- The default location of CTDB TDB databases is now:
/var/lib/ctdb
- If you only set CTDB_DBDIR (to the old default of /var/ctdb) then you probably want to move your databases to /var/lib/ctdb, drop your setting of CTDB_DBDIR and just use the default.
- To maintain the database files in /var/ctdb you will need to set CTDB_DBDIR, CTDB_DBDIR_PERSISTENT and CTDB_DBDIR_STATE, since all of these have moved.
- Use of CTDB_OPTIONS to set ctdbd command-line options is no longer supported. Please use individual configuration variables instead.
- Obsolete tunables VacuumDefaultInterval, VacuumMinInterval and VacuumMaxInterval have been removed. Setting them had no effect, but if you now try to set them in a configuration file via CTDB_SET_X=Y then CTDB will not start.
- Much improved manual pages. New manpages ctdb(7), ctdbd.conf(5) and ctdb-tunables(7) have been added. There is still some work to do.
- Most CTDB-specific configuration can now be set in
/etc/ctdb/ctdbd.conf.
- This avoids cluttering distribution-specific configuration files, such as /etc/sysconfig/ctdb. It also means that we can say: see ctdbd.conf(5) for more details. :-)
- Configuration variable NFS_SERVER_MODE is deprecated and has been replaced by CTDB_NFS_SERVER_MODE. See ctdbd.conf(5) for more details.
- "ctdb reloadips" is much improved and should be used for reloading the public IP configuration.
- This command attempts to yield much more predictable IP allocations than using sequences of delip and addip commands. See ctdb(1) for details.
- Ability to pass comma-separated string to ctdb(1) tool commands via the -n option is now documented and works for most commands. See ctdb(1) for details.
- "ctdb rebalancenode" is now a debugging command and should not be used in normal operation. See ctdb(1) for details.
- "ctdb ban 0" is now invalid.
- This was documented as causing a permanent ban. However, this was not implemented and caused an "unban" instead. To avoid confusion, 0 is now an invalid ban duration. To administratively "ban" a node use "ctdb stop" instead.
- The systemd configuration now puts the PID file in /run/ctdb (rather than /run/ctdbd) for consistency with the initscript and other uses of /var/run/ctdb.
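The directory changes and improved commands above can be sketched as follows; the persistent and state subdirectory names are assumptions based on the old defaults, and the node numbers are examples:

```sh
# To keep databases in the old /var/ctdb location, set all three
# directory variables, since all of them have moved:
CTDB_DBDIR=/var/ctdb
CTDB_DBDIR_PERSISTENT=/var/ctdb/persistent
CTDB_DBDIR_STATE=/var/ctdb/state

# Reload the public IP configuration (preferred over delip/addip):
ctdb reloadips

# Comma-separated node lists now work with -n for most commands:
ctdb -n 0,2 uptime
```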
Important bug fixes
- Traverse regression fixed.
- The default recovery method for persistent databases has been changed to use database sequence numbers instead of doing record-by-record recovery (using record sequence numbers). This fixes issues including registry corruption.
- Banned nodes are no longer told to run the "ipreallocated" event during a takeover run, when in fallback mode with nodes that don't support the IPREALLOCATED control.
Important internal changes
- Persistent transactions are now compatible with Samba and work reliably.
- The recovery master role has been made more stable by resetting the priority time each time a node becomes inactive. This means that nodes that are active for a long time are more likely to retain the recovery master role.
- The incomplete libctdb library has been removed.
- Test suite now starts ctdbd with the --sloppy-start option to speed up startup. However, this should not be done in production.
ctdb 2.4
- Release Notes for ctdb 2.4
- August 22, 2013
Changes
User-visible changes
- A missing network interface now causes monitoring to fail and the node to become unhealthy.
- Changed ctdb command's default control timeout from 3s to 10s.
- debug-hung-script.sh now includes the output of "ctdb scriptstatus" to provide more information.
Important bug fixes
- Starting the CTDB daemon by running ctdbd directly no longer unconditionally removes an existing Unix socket.
- ctdbd once again successfully kills client processes on releasing public IPs. It was checking for them as tracked child processes and not finding them, so wasn't killing them.
- ctdbd_wrapper now exports CTDB_SOCKET so that child processes of ctdbd (such as uses of ctdb in eventscripts) use the correct socket.
- Always use Jenkins hash when creating volatile databases. There were a few places where TDBs would be attached with the wrong flags.
- Vacuuming changes in CTDB 2.2 introduced bugs that led to header corruption for empty records. This resulted in inconsistent headers on two nodes, with requests for such records bouncing between nodes indefinitely and logging "High hopcount" messages. It also caused performance degradation.
- ctdbd was losing log messages at shutdown because they weren't being given time to flush. ctdbd now sleeps for a second during shutdown to allow time to flush log messages.
- Improved socket handling introduced in CTDB 2.2 caused ctdbd to process a large number of packets available on a single FD before polling other FDs. Fixed-size queue buffers are now used to allow fair scheduling across multiple FDs.
Important internal changes
- A node that fails to take/release multiple IPs now incurs only a single banning credit. This makes a brief failure less likely to cause a node to be banned.
- ctdb killtcp has been changed to read connections from stdin and 10.interface now uses this feature to improve the time taken to kill connections.
- Improvements to hot records statistics in ctdb dbstatistics.
- Recovery daemon now assembles up-to-date node flags information from remote nodes before checking if any flags are inconsistent and forcing a recovery.
- ctdbd no longer creates multiple lock sub-processes for the same key. This reduces the number of lock sub-processes substantially.
- Changed the nfsd RPC check failure policy to failover quickly instead of trying to repair a node first by restarting NFS. Such restarts would often hang if the cause of the RPC check failure was the cluster filesystem or storage.
- Logging improvements relating to high hopcounts and sticky records.
- Make sure lower level tdb messages are logged correctly.
- CTDB commands disable/enable/stop/continue are now resilient to individual control failures and retry in case of failures.
ctdb 2.3
- Release Notes for ctdb 2.3
- July 11, 2013
Changes
User-visible changes
- Two new configuration variables for the 60.nfs eventscript:
- CTDB_MONITOR_NFS_THREAD_COUNT
- CTDB_NFS_DUMP_STUCK_THREADS
- See ctdb.sysconfig for details.
- Removed DeadlockTimeout tunable. To enable debug of locking issues set
CTDB_DEBUG_LOCKS=/etc/ctdb/debug_locks.sh
- In overall statistics and database statistics, lock buckets have been updated to use the following timings:
< 1ms, < 10ms, < 100ms, < 1s, < 2s, < 4s, < 8s, < 16s, < 32s, < 64s, >=64s
- Initscript is now simplified with most CTDB-specific functionality split out to ctdbd_wrapper, which is used to start and stop ctdbd.
- Add systemd support.
- CTDB subprocesses are now given informative names to allow them to be easily distinguished when using programs like "top" or "perf".
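A sketch of the new and changed settings above, as they might appear in the CTDB configuration file; the values shown are illustrative, see ctdb.sysconfig for the exact semantics:

```sh
# 60.nfs: monitor nfsd thread counts and dump stuck threads (example values)
CTDB_MONITOR_NFS_THREAD_COUNT=yes
CTDB_NFS_DUMP_STUCK_THREADS=5

# Replaces the removed DeadlockTimeout tunable: enable lock debugging
CTDB_DEBUG_LOCKS=/etc/ctdb/debug_locks.sh
```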
Important bug fixes
- The ctdb tool no longer exits from a retry loop if a control times out (e.g. under high load). This simple fix stops an exit from the retry loop on any error.
- When updating flags on all nodes, use the correct updated flags. This should avoid wrong flag change messages in the logs.
- The recovery daemon will not ban other nodes if the current node is banned.
- ctdb dbstatistics command now correctly outputs database statistics.
- Fixed a panic with overlapping shutdowns (regression in 2.2).
- Fixed 60.ganesha "monitor" event (regression in 2.2).
- Fixed a buffer overflow in the "reloadips" implementation.
- Fixed segmentation faults in ping_pong (called with incorrect argument) and test binaries (called when ctdbd not running).
Important internal changes
- The recovery daemon on a stopped or banned node now stops participating in any cluster activity.
- Improved cluster-wide database traverse by sending records directly from the traverse child process to the requesting node.
- TDB checking and dropping of all IPs moved from initscript to "init" event in 00.ctdb.
- To avoid "rogue IPs" the release IP callback now fails if the released IP is still present on an interface.
ctdb 2.2
- Release Notes for ctdb 2.2
- May 29, 2013
Changes
User-visible changes
- The "stopped" event has been removed.
- The "ipreallocated" event is now run when a node is stopped. Use this instead of "stopped".
- New --pidfile option for ctdbd, used by initscript
- The 60.nfs eventscript now uses configuration files in /etc/ctdb/nfs-rpc-checks.d/ for timeouts and actions instead of hardcoding them into the script.
- Notification handler scripts can now be dropped into /etc/ctdb/notify.d/.
- The NoIPTakeoverOnDisabled tunable has been renamed to NoIPHostOnAllDisabled and now works properly when set on individual nodes.
- New ctdb subcommand "runstate" prints the current internal runstate. Runstates are used for serialising startup.
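The new runstate command and the notification drop-in directory might be used like this; the handler name 50.my-handler is a hypothetical example:

```sh
# Print the daemon's current internal runstate (e.g. RUNNING)
ctdb runstate

# Install a custom notification handler by dropping it into notify.d
install -m 755 my-handler.sh /etc/ctdb/notify.d/50.my-handler
```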
Important bug fixes
- The Unix domain socket is now set to non-blocking after the connection succeeds. This avoids connections failing with EAGAIN and not being retried.
- Fetching from the log ringbuffer now succeeds if the buffer is full.
- Fix a severe recovery bug that can lead to data corruption for SMB clients.
- The statd-callout script now runs as root via sudo.
- "ctdb delip" no longer fails if it is unable to move the IP.
- A race in the ctdb tool's ipreallocate code was fixed. This fixes potential bugs in the "disable", "enable", "stop", "continue", "ban", "unban", "ipreallocate" and "sync" commands.
- The monitor cancellation code could sometimes hang indefinitely. This could cause "ctdb stop" and "ctdb shutdown" to fail.
Important internal changes
- The socket I/O handling has been optimised to improve performance.
- IPs will not be assigned to nodes during CTDB initialisation. They will only be assigned to nodes that are in the "running" runstate.
- Improved database locking code. One improvement is the use of a standalone locking helper executable; this avoids creating many forked copies of ctdbd and potentially running a node out of memory.
- New control CTDB_CONTROL_IPREALLOCATED is now used to generate "ipreallocated" events.
- Message handlers are now indexed, providing a significant performance improvement.
ctdb 2.1
- Release Notes for ctdb 2.1
- January 8, 2013
Highlights
- Support for Samba 4.0.0
- To use CTDB 2.1 with Samba 3.x, enable Samba3AvoidDeadlocks tunable
- Set CTDB_BASE in eventscripts, so they can be run easily
- Clean up orphaned interfaces
- Do not restart NFS on reconfigure event
- Fix RSN based recovery of persistent databases to avoid corruption
- Re-factor and separate IP allocation algorithms
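For the Samba 3.x case above, the tunable would be enabled with the usual CTDB_SET_X=Y convention; this is a sketch only:

```sh
# Per-node setting in the CTDB configuration file
CTDB_SET_Samba3AvoidDeadlocks=1
```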
ctdb 2.0
- Release Notes for ctdb 2.0
- October 30, 2012
This is a long-overdue CTDB release. There have been numerous code enhancements and bug fixes since the previous release of CTDB.
Highlights
- Support for readonly records (http://ctdb.samba.org/doc/readonlyrecords.txt)
- Locking API to detect deadlocks between ctdb and samba
- Fetch-lock optimization to rate-limit concurrent requests for same record
- Support for policy routing
- Modified IP allocation algorithm
- Improved database vacuuming
- New test infrastructure
Reporting bugs & Development Discussion
Please discuss this release on the samba-technical mailing list or by joining the #ctdb IRC channel on irc.freenode.net.
All bug reports should be filed under CTDB product in the project's Bugzilla database (https://bugzilla.samba.org/).
Download Details
The source code can be downloaded from:
http://ftp.samba.org/pub/ctdb/
Git repository
git://git.samba.org/ctdb.git
http://git.samba.org/?p=ctdb.git;a=summary (Git via web)
CTDB documentation
https://ctdb.samba.org/