Configuring clustered Samba
= Goal =
Configure clustered Samba using a CTDB cluster
= Note =
This page still contains some details not directly relevant to clustering Samba. The documentation is being cleaned up and restructured.
= Prerequisites =
* [[Basic CTDB configuration]]
* [[Setting up a cluster filesystem]]
* [[Configuring the CTDB recovery lock]] (recommended)
* [[Adding public IP addresses]] (or some other failover/load balancing scheme)
= Samba Configuration =
Next you need to initialise the Samba password database, e.g.

 smbpasswd -a root

Samba with clustering must use the tdbsam or ldap SAM passdb backends (it must not use the default smbpasswd backend), or must be configured to be a member of a domain. The rest of the configuration of Samba is exactly as it is done on a normal system. See the docs on http://samba.org/ for details.
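For a standalone setup, a minimal sketch is to select <code>tdbsam</code> explicitly and then add a user (the user name <code>alice</code> is a placeholder):

 # in the [global] section of smb.conf:
 passdb backend = tdbsam

 # then, on any one node:
 smbpasswd -a alice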
== Critical smb.conf parameters ==
A clustered Samba install must set some specific configuration parameters:

 netbios name = something
 clustering = yes
 idmap config * : backend = autorid
 idmap config * : range = 1000000-1999999

NB:
* See [https://www.samba.org/samba/docs/man/manpages/idmap_autorid.8.html idmap_autorid(8)] for more information about the idmap configuration
* <code>netbios name</code> should be the same on all nodes
Note that <code>bind interfaces only = yes</code> should not be used when configuring clustered Samba with [[Adding public IP addresses|CTDB public IP addresses]]. CTDB will start <code>smbd</code> before public IP addresses are hosted, so <code>smbd</code> will not listen on any of the public IP addresses. When public IP addresses are eventually hosted, <code>smbd</code> will not bind to the new addresses.
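To verify that <code>smbd</code> is listening on the expected addresses once public IPs are hosted, the listening sockets on the SMB ports can be inspected on each node, e.g.:

 ss -ltn '( sport = :445 or sport = :139 )'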
== Using the Samba registry ==
A recommended way of ensuring that all Samba nodes have the same configuration is to put most configuration into the registry.

This means that <code>smb.conf</code> can be as simple as:

 [global]
 clustering = yes
 include = registry

The initial contents of the registry can then be placed into a file (say <code>tmp.conf</code>):

 [global]
 security = ADS
 logging = syslog
 log level = 1
 netbios name = test
 workgroup = SAMBA
 realm = samba.example.com
 idmap config * : backend = autorid
 idmap config * : range = 1000000-1999999
and loaded from one of the nodes:

 net conf import tmp.conf

Further <code>net conf</code> commands such as <code>net conf addshare</code> can then be used to continue configuration.
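For example, a share can be added to the registry configuration and the result inspected (the share name <code>data</code> and the path <code>/clusterfs/data</code> are placeholders):

 net conf addshare data /clusterfs/data writeable=y guest_ok=n "Clustered data share"
 net conf list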
= Configure CTDB to manage Samba =
For CTDB to manage Samba, the <code>50.samba</code> event script must be enabled:

 ctdb event script enable legacy 50.samba

This causes CTDB to start and stop Samba at startup and shutdown. It also tells CTDB to monitor Samba.
Similarly, if using <code>winbind</code>, CTDB should also be configured to manage it:

 ctdb event script enable legacy 49.winbind

Please see the <code>event</code> command in [http://ctdb.samba.org/manpages/ctdb.1.html ctdb(1)] for more details.
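For example, to confirm which legacy event scripts are enabled and to see the result of the most recent monitor event:

 ctdb event script list legacy
 ctdb event status legacy monitor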
CTDB will manage and start/stop/restart the Samba services, so the operating system should be configured so these are not started/stopped automatically.
== Red Hat Linux variants ==
If using a Red Hat Linux variant, the Samba services are <code>smb</code> and <code>winbind</code>. Starting them at boot time is not recommended and this can be disabled using <code>chkconfig</code>:

 chkconfig smb off
 chkconfig winbind off
The service names and the mechanism for disabling them vary across operating systems.
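For example, on systemd-based distributions the equivalent is typically:

 systemctl disable smb winbind

(The unit names vary between distributions; some use <code>smbd</code> instead of <code>smb</code>.)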
= Event scripts =
CTDB clustering for Samba involves the <code>50.samba</code> and <code>49.winbind</code> event scripts. These are provided as part of CTDB and do not usually need to be changed.
There are several configuration variables that affect the operation of these scripts. Please see [http://ctdb.samba.org/manpages/ctdb-script.options.5.html ctdb-script.options(5)] for details.
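As a sketch, these variables go in CTDB's script options file (commonly <code>/etc/ctdb/script.options</code>, though the path depends on packaging); for example, share path checking by the <code>50.samba</code> monitor event can be switched off with:

 # /etc/ctdb/script.options
 CTDB_SAMBA_SKIP_SHARE_CHECK=yes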
= Filesystem specific configuration =
The cluster filesystem you use with CTDB plays a critical role in ensuring that CTDB works seamlessly. Here are some filesystem-specific tips.

If you are interested in testing a new cluster filesystem with CTDB then we strongly recommend looking at the page on testing filesystems using [[ping_pong|ping_pong]] to ensure that the cluster filesystem supports correct POSIX locking semantics.
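For example, on a 4-node cluster you could run the following simultaneously on every node, against the same file on the cluster filesystem (the path is a placeholder), with the lock count set to the number of nodes plus one:

 ping_pong /clusterfs/test.dat 5

See the [[ping_pong|ping_pong]] page for how to interpret the reported lock rates.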
== IBM GPFS filesystem ==
The [https://www.ibm.com/support/knowledgecenter/SSFKCN/gpfs_welcome.html GPFS] filesystem (now known as [https://www-03.ibm.com/systems/storage/spectrum/scale/ Spectrum Scale]) is a proprietary cluster filesystem that has been extensively tested with CTDB/Samba. When using GPFS, the following smb.conf settings are recommended:

 vfs objects = gpfs fileid
 gpfs:sharemodes = yes
 fileid:algorithm = fsname
 force unknown acl user = yes
 nfs4: mode = special
 nfs4: chown = yes
 nfs4: acedup = merge

The ACL-related options should only be enabled if you have NFSv4 ACLs enabled on your filesystem.

The most important of these options is <code>fileid:algorithm</code>. You risk data corruption if you use a different mapping backend with Samba and GPFS, because locking will break across nodes. NOTE: You must also load "fileid" as a vfs object in order for this to take effect.

A guide to configuring Samba with CTDB and GPFS can be found at [[Samba CTDB GPFS Cluster HowTo]]
== Red Hat GFS filesystem ==
[http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Global_File_System/index.html Red Hat GFS] is a native file system that interfaces directly with the Linux kernel file system interface (VFS layer).

The gfs_controld daemon manages mounting, unmounting, recovery and POSIX locks. Edit /etc/init.d/cman (if using Red Hat Cluster Suite) to start gfs_controld with the '-l 0 -o 1' flags to optimize POSIX locking performance. You'll notice the difference this makes by running the [[ping_pong|ping_pong]] test with and without these options.

A complete HowTo document for setting up clustered Samba with CTDB and GFS2 is here: [[GFS CTDB HowTo]]
== Lustre filesystem ==
Lustre® is a scalable, secure, robust, highly-available cluster file system. It is designed, developed and maintained by a number of companies ([http://www.intel.com/content/www/us/en/software/intel-solutions-for-lustre-software.html Intel], [http://www.seagate.com/products/enterprise-servers-storage/enterprise-storage-systems/clustered-file-systems/ Seagate]) and by [http://opensfs.org/ OpenSFS], a not-for-profit organisation.

CTDB/Samba has been tested with Lustre releases 1.4.x and 1.6.x; the current Lustre release is 2.5.2. When mounting Lustre, the "-o flock" option should be specified to enable cluster-wide byte-range locking among all Lustre clients.

These two versions have different mechanisms for configuration and startup. More information is available at http://wiki.lustre.org.

Although Lustre configuration differs between the two versions, the CTDB/Samba setup is the same for both. The following settings are recommended:

 vfs objects = fileid
 fileid:algorithm = fsname

The <code>fileid:algorithm</code> option must be specified to avoid possible data corruption.
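A typical client mount with the flock option mentioned above, using a hypothetical MGS node <code>mgs@tcp</code> and filesystem name <code>lustrefs</code>, might look like:

 mount -t lustre -o flock mgs@tcp:/lustrefs /mnt/lustre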
== GlusterFS filesystem ==
[http://www.gluster.org/ GlusterFS] is a cluster file-system capable of scaling to several petabytes that is easy to configure. It aggregates various storage bricks over Infiniband RDMA or TCP/IP interconnect into one large parallel network file system. GlusterFS is based on a stackable user space design without compromising performance. It uses Linux File System in Userspace (FUSE) to achieve all this.

NOTE: GlusterFS has not yet had extensive testing but this is currently underway.

GlusterFS versions 2.0 to 2.0.4 must be patched with http://patches.gluster.com/patch/813/ to ensure that GlusterFS passes the [[ping_pong|ping_pong]] test. The issue is tracked at http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=159 and is fixed as of GlusterFS 2.0.6.
== OCFS2 ==
* OCFS2 - see http://oss.oracle.com/projects/ocfs2/

Recommended settings:

 vfs objects = fileid
 fileid:algorithm = fsid

OCFS2 1.4 offers cluster-wide byte-range locking.
== Other cluster filesystems ==
If you can't find documentation about your choice of cluster filesystem and clustered Samba then you might need to work around some limitations.

=== Inconsistent device numbers ===
Locking will not work if a cluster filesystem does not provide uniform device numbers across nodes. If testing shows locking problems then you should check the [[Setting_up_a_cluster_filesystem#Checking_uniformity_of_device_and_inode_numbering|device number uniformity]] of your cluster filesystem.

To work around a lack of device number uniformity, the following settings should be used in the global section of the Samba configuration:

 vfs objects = fileid
 fileid:algorithm = fsname

See [https://www.samba.org/samba/docs/man/manpages/vfs_fileid.8.html vfs_fileid(8)] for more information.
= Testing clustered Samba =
Once your cluster is up and running, you may wish to know how to test that it is functioning correctly. The following tests may help with that.

== Using smbcontrol ==
You can check for connectivity to the smbd daemons on each node using smbcontrol:

 smbcontrol smbd ping
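To run this check against every node in one go, CTDB's <code>onnode</code> utility can be used, e.g.:

 onnode all smbcontrol smbd ping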
== Using Samba4 smbtorture ==
The Samba4 version of smbtorture has several tests that can be used to benchmark a CIFS cluster. You can download Samba4 like this:

 git clone git://git.samba.org/samba.git
 cd samba/source4

Then configure and compile it as usual. The particular tests that are helpful for cluster benchmarking are the RAW-BENCH-OPEN, RAW-BENCH-LOCK and BENCH-NBENCH tests. These tests take an unclist that allows you to spread the workload out over more than one node. For example:

 smbtorture //localhost/data -Uuser%password RAW-BENCH-LOCK --unclist=unclist.txt --num-progs=32 -t60

The file unclist.txt should contain a list of shares in your cluster (UNC format: <code>//server/share</code>). For example:

 //node1/data
 //node2/data
 //node3/data
 //node4/data

For NBENCH testing you need a client.txt file. A suitable file can be found in the dbench distribution at http://samba.org/ftp/tridge/dbench/
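A hypothetical NBENCH invocation might then look like the following (the <code>--loadfile</code> option name is an assumption; check <code>smbtorture --help</code> for the exact option in your version):

 # --loadfile is assumed; verify against your smbtorture build
 smbtorture //localhost/data -Uuser%password BENCH-NBENCH --unclist=unclist.txt --num-progs=32 -t60 --loadfile=client.txt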