Configuring clustered Samba
Setting up pCIFS using Samba and CTDB
As of April 2007 you can set up a simple Samba3 or Samba4 CTDB cluster, running either on loopback (with simulated nodes) or on a real cluster with TCP. This page explains how to get started.
Next you need to initialise the Samba password database, e.g.
smbpasswd -a root
Samba with clustering must use the tdbsam or ldap SAM passdb backends (it must not use the default smbpasswd backend), or must be configured to be a member of a domain. The rest of the configuration of Samba is exactly as it is done on a normal system. See the docs on http://samba.org/ for details.
Critical smb.conf parameters
A clustered Samba install must set some specific configuration parameters:
netbios name = something
clustering = yes
idmap config * : backend = autorid
idmap config * : range = 1000000-1999999
- See idmap(8) for more information about the idmap configuration
- netbios name should be the same on all nodes
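The parameters above can be sanity-checked with a small script. The following is only a sketch: the helper names get_param and check_clustered_conf are hypothetical, the parser handles plain "key = value" lines without regard to sections or includes, and it only flags an explicitly configured smbpasswd backend (it does not detect an unset backend or domain membership).

```shell
#!/bin/sh
# get_param FILE KEY: print the value of the last "KEY = value" line in FILE.
# Simplified: ignores section boundaries, includes and registry settings.
get_param() {
    awk -F= -v key="$2" '
        /^[ \t]*[#;]/ { next }                  # skip comment lines
        {
            k = $1
            gsub(/^[ \t]+|[ \t]+$/, "", k)      # trim whitespace around the key
            if (tolower(k) == tolower(key)) {
                v = substr($0, index($0, "=") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", v)  # trim whitespace around the value
                val = v
            }
        }
        END { if (val != "") print val }
    ' "$1"
}

# check_clustered_conf FILE: return non-zero if clustering is not enabled
# or if the passdb backend is explicitly set to smbpasswd.
check_clustered_conf() {
    conf="$1"
    status=0
    case "$(get_param "$conf" clustering | tr 'A-Z' 'a-z')" in
        yes|true|1) ;;
        *) echo "clustering is not enabled in $conf"; status=1 ;;
    esac
    case "$(get_param "$conf" 'passdb backend')" in
        smbpasswd*) echo "passdb backend is smbpasswd in $conf"; status=1 ;;
    esac
    return "$status"
}
```

For example, check_clustered_conf /etc/samba/smb.conf could be run on each node, assuming that is where smb.conf lives on your system.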
If using the Samba registry then these must be set in smb.conf:
There are several things to configure in CTDB to support clustered Samba.
CTDB configuration file
The CTDB_MANAGES_SAMBA configuration variable must be set to yes in the ctdbd configuration file.
This causes CTDB to start and stop Samba at startup and shutdown. It also tells CTDB to monitor Samba.
The recovery lock, configured via CTDB_RECOVERY_LOCK, provides important split-brain prevention and is usually configured to point to a lock file in the cluster filesystem. See the RECOVERY LOCK section in ctdb(7) for more details.
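As a rough illustration, the two settings described above might appear in the ctdbd configuration file as follows. The file uses shell variable syntax; the lock file path is a hypothetical location on the cluster filesystem and must be replaced with one that exists on your cluster.

```shell
# Let CTDB start, stop and monitor Samba.
CTDB_MANAGES_SAMBA=yes

# Recovery lock file on the shared cluster filesystem.
# The path below is a placeholder, not a recommended location.
CTDB_RECOVERY_LOCK=/clusterfs/.ctdb/reclock
```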
This directory contains event scripts that are called out to by CTDB when certain events occur. Event scripts support health monitoring, service management, IP failover, internal CTDB operations and features. They handle events such as
Please see the service scripts that are installed by CTDB in /etc/ctdb/events.d for examples of how to configure other services to be aware of the HA features of CTDB.
Also see /etc/ctdb/events.d/README for additional documentation on how to write and modify event scripts.
CTDB uses the IANA-assigned TCP port 4379 for its traffic by default. To use a different port for CTDB traffic, add a ctdb entry to the /etc/services file.
Example: to change CTDB to use port 9999, add the following line to /etc/services:
ctdb 9999/tcp
Note: all nodes in the cluster MUST use the same port or else CTDB will not start correctly.
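Because a port mismatch between nodes stops CTDB from starting correctly, it can help to read back the configured port on every node and compare the results. The sketch below uses a hypothetical helper, ctdb_port, that extracts the TCP port of the ctdb entry from a file in /etc/services format.

```shell
#!/bin/sh
# ctdb_port FILE: print the TCP port of the "ctdb" entry in a services-format
# file. Prints nothing if there is no such entry.
ctdb_port() {
    awk '$1 == "ctdb" && $2 ~ /\/tcp$/ {
        sub(/\/tcp$/, "", $2)   # strip the protocol suffix, leaving the port
        print $2
        exit
    }' "$1"
}
```

Running ctdb_port /etc/services on each node should print the same number everywhere. An empty result means there is no ctdb TCP entry in that file.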
You need to set up some method for your Windows and NFS clients to find the nodes of the cluster, and automatically balance the load between the nodes. We recommend that you set up a round-robin DNS entry for your cluster, listing all the public IP addresses that CTDB will be managing as a single DNS A record.
You may also wish to set up a static WINS server entry listing all of your cluster nodes' IP addresses.
Managing Network Interfaces
The default install of CTDB is able to add/remove IP addresses from your network interfaces using the CTDB_PUBLIC_ADDRESSES option shown above.
For more sophisticated interface management you will need to add a new event script in /etc/ctdb/events.d/.
For example, say you wanted CTDB to add a default route when it brings an interface up. You could have an event script called /etc/ctdb/events.d/11.route that looks like this:
#!/bin/sh

. /etc/ctdb/functions
loadconfig ctdb

cmd="$1"
shift

case $cmd in
    takeip)
        # we ignore errors from this, as the route might be up already when we're grabbing
        # a 2nd IP on this interface
        /sbin/ip route add $CTDB_PUBLIC_NETWORK via $CTDB_PUBLIC_GATEWAY dev $1 2> /dev/null
        ;;
esac

exit 0
Then you would put CTDB_PUBLIC_NETWORK and CTDB_PUBLIC_GATEWAY in /etc/sysconfig/ctdb like this:
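A sketch of what such entries could look like is given here. The addresses are hypothetical placeholders taken from the documentation address range and must be replaced with your own public network and gateway.

```shell
# Hypothetical example values; substitute your own network and gateway.
CTDB_PUBLIC_NETWORK="192.0.2.0/24"
CTDB_PUBLIC_GATEWAY="192.0.2.1"
```

With these values, the takeip branch of the event script would run the equivalent of "ip route add 192.0.2.0/24 via 192.0.2.1 dev <interface>".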
Filesystem specific configuration
The cluster filesystem you use with CTDB plays a critical role in ensuring that CTDB works seamlessly. Here are some filesystem-specific tips.
If you are interested in testing a new cluster filesystem with CTDB then we strongly recommend looking at the page on testing filesystems using ping_pong to ensure that the cluster filesystem supports correct POSIX locking semantics.
IBM's GPFS filesystem
The GPFS filesystem (see http://www-03.ibm.com/systems/clusters/software/gpfs.html) is a proprietary cluster filesystem that has been extensively tested with CTDB/Samba. When using GPFS, the following smb.conf settings are recommended
clustering = yes
idmap backend = tdb2
fileid:mapping = fsname
vfs objects = gpfs fileid
gpfs:sharemodes = No
force unknown acl user = yes
nfs4: mode = special
nfs4: chown = yes
nfs4: acedup = merge
The ACL-related options should only be enabled if you have NFSv4 ACLs enabled on your filesystem.
The most important of these options is "fileid:mapping". You risk data corruption if you use a different mapping backend with Samba and GPFS, because locking will break across nodes. NOTE: You must also load "fileid" as a vfs object in order for this to take effect.
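Since "fileid:mapping" has no effect unless the fileid module is also loaded, a quick check of the configuration file can catch that mistake. The sketch below uses hypothetical helper names and scans the whole file without regard to sections, so it does not account for per-share "vfs objects" lines that override the global one.

```shell
#!/bin/sh
# has_fileid_mapping FILE: succeed if FILE sets fileid:mapping.
has_fileid_mapping() {
    grep -Eq '^[[:space:]]*fileid:mapping[[:space:]]*=' "$1"
}

# has_fileid_module FILE: succeed if some "vfs objects" line lists fileid.
has_fileid_module() {
    awk -F= '
        tolower($1) ~ /^[ \t]*vfs objects[ \t]*$/ {
            n = split($2, mods, /[ \t,]+/)
            for (i = 1; i <= n; i++) if (mods[i] == "fileid") found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# check_fileid FILE: fail if fileid:mapping is set but the module is missing.
check_fileid() {
    if has_fileid_mapping "$1" && ! has_fileid_module "$1"; then
        echo "fileid:mapping is set but fileid is not in vfs objects"
        return 1
    fi
    return 0
}
```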
A guide to configuring Samba with CTDB and GPFS can be found at Samba CTDB GPFS Cluster HowTo
Red Hat GFS filesystem
Red Hat GFS is a native file system that interfaces directly with the Linux kernel file system interface (VFS layer).
The gfs_controld daemon manages mounting, unmounting, recovery and POSIX locks. Edit /etc/init.d/cman (if using Red Hat Cluster Suite) to start gfs_controld with the '-l 0 -o 1' flags to optimize POSIX locking performance. You'll notice the difference this makes by running the ping_pong test with and without these options.
A complete HowTo document for setting up clustered Samba with CTDB and GFS2 is here: GFS CTDB HowTo
Lustre® is a scalable, secure, robust, highly-available cluster file system. It is designed, developed and maintained by a number of companies (Intel, Seagate) and OpenSFS, which is a not-for-profit organisation.
Tests have been done on Lustre releases 1.4.x and 1.6.x with CTDB/Samba. The current Lustre release is 2.5.2. When mounting Lustre, the option "-o flock" should be specified to enable cluster-wide byte-range locking among all Lustre clients.
These two versions have different mechanisms of configuration and startup. More information is available at http://wiki.lustre.org.
Although the Lustre configuration differs, setting up CTDB/Samba is the same on both versions. The following settings are recommended:
clustering = yes
idmap backend = tdb2
fileid:mapping = fsname
use mmap = no
nt acl support = yes
ea support = yes
The "fileid:mapping" and "use mmap" options must be specified to avoid possible data corruption. The "nt acl support" option maps POSIX ACLs to the Windows NT format. At the moment, Lustre only supports POSIX ACLs.
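The "-o flock" mount option mentioned above is easy to forget, so it may be worth verifying on each node. The following sketch assumes that the option is reported as "flock" in the mount table (/proc/mounts format). The helper name lustre_has_flock and the mount point in the usage note are hypothetical.

```shell
#!/bin/sh
# lustre_has_flock MOUNTPOINT [MOUNTS_FILE]
# Succeed if MOUNTPOINT is a lustre mount whose option list contains "flock".
# MOUNTS_FILE defaults to /proc/mounts.
lustre_has_flock() {
    awk -v mp="$1" '
        $2 == mp && $3 == "lustre" {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == "flock") ok = 1
        }
        END { exit(ok ? 0 : 1) }
    ' "${2:-/proc/mounts}"
}
```

For example: lustre_has_flock /mnt/lustre || echo "remount with -o flock".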
GlusterFS is a cluster filesystem capable of scaling to several petabytes that is easy to configure. It aggregates various storage bricks over InfiniBand RDMA or TCP/IP interconnect into one large parallel network file system. GlusterFS is based on a stackable user space design without compromising performance. It uses the Linux File System in Userspace (FUSE) to achieve all this.
NOTE: GlusterFS has not yet had extensive testing but this is currently underway.
GlusterFS versions 2.0 to 2.0.4 must be patched with:
This is to ensure GlusterFS passes the ping_pong test. This issue is being tracked at:
Update: As of GlusterFS 2.0.6 this has been fixed.
- OCFS2 - see http://oss.oracle.com/projects/ocfs2/
fileid:mapping = fsid
vfs objects = fileid
OCFS2 1.4 offers cluster-wide byte-range locking.
Testing clustered Samba
Once your cluster is up and running, you may wish to test that it is functioning correctly. The following tests may help with that.
You can check for connectivity to the smbd daemons on each node using smbcontrol:
- smbcontrol smbd ping
Using Samba4 smbtorture
The Samba4 version of smbtorture has several tests that can be used to benchmark a CIFS cluster. You can download Samba4 like this:
git clone git://git.samba.org/samba.git
cd samba/source4
Then configure and compile it as usual. The particular tests that are helpful for cluster benchmarking are the RAW-BENCH-OPEN, RAW-BENCH-LOCK and BENCH-NBENCH tests. These tests take a unclist that allows you to spread the workload out over more than one node. For example:
smbtorture //localhost/data -Uuser%password RAW-BENCH-LOCK --unclist=unclist.txt --num-progs=32 -t60
The file unclist.txt should contain a list of shares in your cluster (UNC format: //server/share). For example:
//node1/data
//node2/data
//node3/data
//node4/data
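For larger clusters the unclist can be generated rather than typed by hand. This is a minimal sketch; make_unclist is a hypothetical helper, and the node and share names are placeholders.

```shell
#!/bin/sh
# make_unclist SHARE NODE...: print one //NODE/SHARE line per node.
make_unclist() {
    share="$1"
    shift
    for node in "$@"; do
        printf '//%s/%s\n' "$node" "$share"
    done
}
```

For example, make_unclist data node1 node2 node3 node4 > unclist.txt produces a file with one line per node in the format shown above.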
For NBENCH testing you need a client.txt file. A suitable file can be found in the dbench distribution at http://samba.org/ftp/tridge/dbench/