6.0: DRBD


Replicated Failover Domain Controller and file server using LDAP




6.1. Requirements

High availability and data replication are not a substitute for traditional backups to tape or external media, especially if you are running this configuration without being familiar with how it works.

DRBD Configuration

Primary/Secondary

Primary/Primary <-- to do

DRBD is a kernel module that networks two machines together to provide RAID 1 over the LAN. It is assumed that each machine has an identical spare drive; all data on these devices will be destroyed.

If you are updating your kernel or your version of DRBD, make sure DRBD is stopped on both machines. Never run different versions of DRBD against each other; because the module is built against the kernel, this also means both machines need the same kernel.
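Before upgrading, a quick way to confirm both nodes match is to compare the kernel release and the installed DRBD packages (the prompts assume the node1/node2 hosts used throughout this guide):

[root@node1 ~]# uname -r
[root@node1 ~]# rpm -qa | grep drbd
[root@node2 ~]# uname -r
[root@node2 ~]# rpm -qa | grep drbd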

You will need to install the DRBD kernel module. We will build our own RPM kernel modules so they are optimized for our architecture.

I have tested many different kernels with DRBD and some are not stable, so check (Google is your friend) that your kernel is compatible with the particular DRBD release; most of the time this is not an issue.

Please browse http://www.linbit.com/support/drbd-current/ and look for the packages available.

It is best to compile DRBD and its kernel modules from source to suit your kernel; however, if you are having problems compiling the software and getting make errors, things can become complicated.

In that case you should have no trouble finding prebuilt packages for CentOS, RHEL and all Fedora Core versions that work just fine.

Packages for Fedora Core 6 (x86 and x86-64) are available at http://atrpms.net/dist/fc6/drbd/

6.2. Installation

Step1.

Extract the latest stable version of DRBD.

[root@node1 stable]# tar zxvf drbd-0.7.20.tar.gz
[root@node1 stable]# cd drbd-0.7.20
[root@node1 drbd-0.7.20]#


Step2.

It is nice to build your own RPM for your distribution; it makes upgrades seamless.

This will give us an RPM built specifically for our kernel; it may take some time.

[root@node1 drbd-0.7.20]# make
[root@node1 drbd-0.7.20]# make rpm

If you get make errors, try and find an RPM for your distribution.

Step3.

[root@node1 drbd-0.7.20]# cd dist/RPMS/i386/

[root@node1 i386]# ls
drbd-0.7.20-1.i386.rpm
drbd-debuginfo-0.7.20-1.i386.rpm
drbd-km-2.6.14_1.1656_FC4smp-0.7.20-1.i386.rpm

Step4.

We will now install DRBD and the kernel module we built earlier.

[root@node1 i386]# rpm -Uvh drbd-0.7.20-1.i386.rpm drbd-debuginfo-0.7.20-1.i386.rpm 
 drbd-km-2.6.14_1.1656_FC4smp-0.7.20-1.i386.rpm

Step5.

Log in to node2, the backup domain controller, and do the same.
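Either rebuild the RPMs on node2 as above or, since both machines run the same kernel, copy the packages you just built across and install them; for example:

[root@node1 i386]# scp drbd-*.rpm root@node2:/root/
[root@node1 i386]# ssh root@node2 "rpm -Uvh /root/drbd-*.rpm"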

6.3. Configuration

In the examples throughout this document we have linked /dev/hdd1 to /dev/drbd0; yours, however, may be a different device, for example a SCSI disk.

All data on the device /dev/hdd will be destroyed.


Step1.

We are going to create a partition on /dev/hdd using fdisk. Your actual device will most likely differ from /dev/hdd.

[root@node1]# fdisk /dev/hdd

Command (m for help): m
Command action

  a   toggle a bootable flag
  b   edit bsd disklabel
  c   toggle the dos compatibility flag
  d   delete a partition
  l   list known partition types
  m   print this menu
  n   add a new partition
  o   create a new empty DOS partition table
  p   print the partition table
  q   quit without saving changes
  s   create a new empty Sun disklabel
  t   change a partition's system id
  u   change display/entry units
  v   verify the partition table
  w   write table to disk and exit
  x   extra functionality (experts only)

Command (m for help): d

No partition is defined yet!

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-8677, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-8677, default 8677):
Using default value 8677
Command (m for help): w

Step2.

Now log in to node2, the backup domain controller, and partition /dev/hdd (or your chosen device) as above.
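To double-check that both nodes now have matching partition tables, compare the fdisk listing on each machine (using the example /dev/hdd device):

[root@node1 ~]# fdisk -l /dev/hdd
[root@node2 ~]# fdisk -l /dev/hdd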

6.3.1. drbd.conf

Create this file on both your master and slave servers. It should be identical on both, though that is not a strict requirement; as long as the partition size is the same, any mount point can be used.


Step1.

The file below is fairly self-explanatory; you can see the real disk (/dev/hdd1) linked to the DRBD device (/dev/drbd0).

Make sure the host names in the "on" sections match each node's actual hostname, otherwise DRBD will not start (a quick check is shown after the file).

[root@node1]# vi /etc/drbd.conf

# Datadrive (/data) /dev/hdd1 80GB

resource drbd1 {
 protocol C;
 disk {
   on-io-error panic;
 }
 net {
   max-buffers 2048;
   ko-count 4;
   on-disconnect reconnect;
 }
 syncer {
   rate 10000;
 }
 on node1.differentialdesign.org {
   device    /dev/drbd0;
   disk      /dev/hdd1;
   address   10.0.0.1:7789;
   meta-disk internal;
 }
 on node2.differentialdesign.org {
   device    /dev/drbd0;
   disk      /dev/hdd1;
   address   10.0.0.2:7789;
   meta-disk internal;
 }
}
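A quick way to verify that the hostnames match the "on" sections above (the FQDNs shown are the example ones used in this guide):

[root@node1 ~]# uname -n
node1.differentialdesign.org
[root@node2 ~]# uname -n
node2.differentialdesign.org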

Step2.

[root@node1]# scp /etc/drbd.conf root@node2:/etc/

6.3.2. Initialization

In the following steps we will configure the disks to synchronize and choose a master node.

Step1.

On the Primary Domain Controller

[root@node1]# service drbd start

On the Backup Domain Controller

[root@node2]# service drbd start


Step2.

You can see both devices are ready and waiting for one of them to be promoted to primary, which will perform an initial synchronization to the secondary device.

[root@node1]# service drbd status
drbd driver loaded OK; device status:
version: 0.7.17 (api:77/proto:74)
SVN Revision: 2093 build by root@node1, 2006-04-23 14:40:20
0: cs:Connected st:Secondary/Secondary ld:Inconsistent
   ns:25127936 nr:3416 dw:23988760 dr:4936449 al:19624 bm:1038 lo:0 pe:0 ua:0 ap:0


Step3.

Stop the heartbeat service on both nodes.
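On each node:

[root@node1 ~]# service heartbeat stop
[root@node2 ~]# service heartbeat stop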


Step4.

We are now telling DRBD to make node1 the primary drive; this will overwrite all data on the secondary device.

[root@node1]#  drbdadm -- --do-what-I-say primary all
[root@node1 ~]# service drbd status
drbd driver loaded OK; device status:
version: 0.7.23 (api:79/proto:74)
SVN Revision: 2686 build by root@node1, 2007-01-23 20:26:13
0: cs:SyncSource st:Primary/Secondary ld:Consistent
   ns:67080 nr:85492 dw:91804 dr:72139 al:9 bm:268 lo:0 pe:30 ua:2019 ap:0
       [==>.................] sync'ed: 12.5% (458848/520196)K
       finish: 0:01:44 speed: 4,356 (4,088) K/sec

Step5.

Create a filesystem on our replicated DRBD device.

[root@node1]# mkfs.ext3 /dev/drbd0
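If you would like to confirm the new filesystem before heartbeat takes over, you can mount it by hand on the primary and unmount it again (optional, and it assumes the /data mount point already exists):

[root@node1 ~]# mount /dev/drbd0 /data
[root@node1 ~]# df -h /data
[root@node1 ~]# umount /data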

6.4. Testing

We have a two-node cluster replicating drive data; it's time to test a failover.


Step1.

Start the heartbeat service on both nodes.
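On each node:

[root@node1 ~]# service heartbeat start
[root@node2 ~]# service heartbeat start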


Step2.

On node1 we can see the status of DRBD.

[root@node1 ~]# service drbd status
drbd driver loaded OK; device status:
version: 0.7.23 (api:79/proto:74)
0: cs:Connected st:Primary/Secondary ld:Consistent
   ns:1536 nr:0 dw:1372 dr:801 al:4 bm:6 lo:0 pe:0 ua:0 ap:0
[root@node1 ~]#

On node2 we can see the status of DRBD.

[root@node2 ~]# service drbd status
drbd driver loaded OK; device status:
version: 0.7.23 (api:79/proto:74)
SVN Revision: 2686 build by root@node2, 2007-01-23 20:26:03
0: cs:Connected st:Secondary/Primary ld:Consistent
   ns:0 nr:1484 dw:1484 dr:0 al:0 bm:6 lo:0 pe:0 ua:0 ap:0
[root@node2 ~]#

That all looks good; we can see the devices are consistent and ready for use.


Step3.

Now let's check the mount point we created in the heartbeat haresources file.

We can see heartbeat has successfully mounted /dev/drbd0 on the /data directory; of course, your device will not have any data on it yet.

[root@node1 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      35G   14G   20G  41% /
/dev/hdc1              99M   21M   74M  22% /boot
/dev/shm              506M     0  506M   0% /dev/shm
/dev/drbd0             74G   37G   33G  53% /data
[root@node1 ~]#


Step4.

Log in to node1 and execute the following command; once heartbeat is stopped, it should only take a few seconds to migrate the services to node2.


[root@node1 ~]# service heartbeat stop
Stopping High-Availability services:
                                         [  OK  ]

We can see DRBD change state to secondary on node1.

[root@node1 ~]# service drbd status
drbd driver loaded OK; device status:
version: 0.7.23 (api:79/proto:74)
SVN Revision: 2686 build by root@node1, 2007-01-23 20:26:13
0: cs:Connected st:Secondary/Primary ld:Consistent
   ns:5616 nr:85492 dw:90944 dr:2162 al:9 bm:260 lo:0 pe:0 ua:0 ap:0


Step5.

Now let's check the status of DRBD on node2; we can see it has changed state and become the primary.

[root@node2 ~]# service drbd status
drbd driver loaded OK; device status:
version: 0.7.23 (api:79/proto:74)
 SVN Revision: 2686 build by root@node2, 2007-01-23 20:26:03
0: cs:Connected st:Primary/Secondary ld:Consistent
   ns:4 nr:518132 dw:518136 dr:17 al:0 bm:220 lo:0 pe:0 ua:0 ap:0
1: cs:Connected st:Primary/Secondary ld:Consistent
   ns:28 nr:520252 dw:520280 dr:85 al:0 bm:199 lo:0 pe:0 ua:0 ap:0

Check that node2 has mounted the device.

[root@node2 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      35G   12G   22G  35% /
/dev/hdc1              99M   17M   78M  18% /boot
/dev/shm              506M     0  506M   0% /dev/shm
/dev/hdh1             111G   97G  7.6G  93% /storage
/dev/drbd0             74G   37G   33G  53% /data
[root@node2 ~]#


Step6.

Finally, start the heartbeat service on node1 and make sure that all resources migrate back.
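A quick way to confirm the failback, reusing the commands from the earlier steps:

[root@node1 ~]# service heartbeat start
[root@node1 ~]# service drbd status
[root@node1 ~]# df -h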


6.5. DRBD 8.0 GFS2 Primary/Primary Clustered Filesystem

Tested with Fedora 8

The following section is experimental and not intended for production use.

- A cluster-aware filesystem such as GFS must be used for DRBD 8.0 primary/primary

Using DRBD we can create a clustered filesystem and avoid expensive SAN and filer devices. This also opens the door for those of us who wish to run CTDB clustered Samba on a two-node cluster.


Step1.

Install GFS2 on both nodes. On x86-64, never install the i386 packages for GFS or you will receive the error "/usr/sbin/cman_tool: aisexec daemon didn't start".

[root@core-01 ~]# yum install gfs2-utils.x86_64
[root@core-01 ~]# yum install cman.x86_64
[root@core-01 ~]# yum install openais-0.80.1-6.x86_64
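To confirm that only the x86-64 builds were installed (a quick check, since mixing in i386 packages leads to the aisexec error mentioned above):

[root@core-01 ~]# rpm -q --qf '%{NAME}-%{VERSION}.%{ARCH}\n' cman gfs2-utils openais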

Step2.

In the example configuration file below we have called our two nodes core-01 and core-02 respectively; the cluster name is "hardcore".

Edit the GFS2 cluster configuration file; this file must be identical on both nodes.

"Ordinarily, the loss of quorum after one out of two nodes fails will prevent the remaining node from continuing (if both nodes have one vote.) Special configuration options can be set to allow the one remaining node to continue operating if the other fails. To do this only two nodes, each with one vote, can be defined in cluster.conf. The two_node and expected_votes values must then be set to 1 in the cman section as follows."

[root@core-01 ~]# vi /etc/cluster/cluster.conf 
[root@core-01 ~]# scp /etc/cluster/cluster.conf root@core-02:/etc/cluster/
<?xml version="1.0"?>
<cluster name="hardcore" config_version="2">
  <cman two_node="1" expected_votes="1">
  </cman>
  <clusternodes>
    <clusternode name="core-01" votes="1" nodeid="1">
      <fence>
        <method name="single">
          <device name="human" ipaddr="192.168.0.2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="core-02" votes="1" nodeid="2">
      <fence>
        <method name="single">
          <device name="human" ipaddr="192.168.0.3"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="human" agent="fence_manual"/>
  </fencedevices>
</cluster>
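cman identifies the machines by the clusternode names, so core-01 and core-02 must resolve to the correct addresses on both nodes. If you are not relying on DNS, a minimal /etc/hosts sketch (using the addresses from the fence entries above) looks like this:

192.168.0.2    core-01
192.168.0.3    core-02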


Step3.

On the primary node, edit /etc/drbd.conf; the drbd.conf file must be identical on both nodes.

[root@core-01 ~]# vi /etc/drbd.conf
[root@core-01 ~]# scp /etc/drbd.conf root@core-02:/etc/
# Datadrive (/data) /dev/sdb1 500GB

resource r0 {
  protocol C;
  startup {
    become-primary-on both;
  }
  net {
    allow-two-primaries;
    cram-hmac-alg "sha1";
    shared-secret "123456";
    after-sb-0pri discard-least-changes;
#   after-sb-0pri discard-younger-primary;
#   after-sb-0pri discard-zero-changes;
    after-sb-1pri violently-as0p;
    after-sb-2pri violently-as0p;
    rr-conflict violently;
  }
  syncer {
    rate 100M;
  }
  on core-01 {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   10.0.0.1:7789;
    meta-disk internal;
  }
  on core-02 {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   10.0.0.2:7789;
    meta-disk internal;
  }
}

# /dev/sdc1

resource r1 {
  protocol C;
  startup {
    become-primary-on both;
  }
  net {
    allow-two-primaries;
    cram-hmac-alg "sha1";
    shared-secret "123456";
    after-sb-0pri discard-least-changes;
#   after-sb-0pri discard-younger-primary;
#   after-sb-0pri discard-zero-changes;
    after-sb-1pri violently-as0p;
    after-sb-2pri violently-as0p;
    rr-conflict violently;
  }
  syncer {
    rate 100M;
  }
  on core-01 {
    device    /dev/drbd1;
    disk      /dev/sdc1;
    address   10.0.1.1:7789;
    meta-disk internal;
  }
  on core-02 {
    device    /dev/drbd1;
    disk      /dev/sdc1;
    address   10.0.1.2:7789;
    meta-disk internal;
  }
}
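With DRBD 8, the on-disk metadata for each resource normally needs to be created before the devices can be brought up for the first time. A minimal sketch, run on both nodes:

[root@core-01 ~]# drbdadm create-md r0
[root@core-01 ~]# drbdadm create-md r1
[root@core-02 ~]# drbdadm create-md r0
[root@core-02 ~]# drbdadm create-md r1

If the devices later report Inconsistent/Inconsistent, one node can be forced to primary with drbdadm -- --overwrite-data-of-peer primary r0 (and likewise for r1) to trigger the initial synchronization.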


Step4.

Now let's start up the cluster services for GFS2 (cman).

[root@core-01 ~]# cman_tool nodes
cman_tool: Cannot open connection to cman, is it running ?
[root@core-01 ~]# service cman start
Starting cluster: 
  Loading modules... done
  Mounting configfs... done
  Starting ccsd... done
  Starting cman... done
  Starting daemons... done
  Starting fencing... 

At this point fencing will not start because it is waiting for core-02 to join.

[root@core-01 ~]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M  34944   2008-02-16 02:08:14  core-01
   2   X      0                        core-02
[root@core-01 ~]# cman_tool status
Version: 6.0.1
Config Version: 2
Cluster Name: hardcore
Cluster Id: 26333
Cluster Member: Yes
Cluster Generation: 34944
Membership state: Cluster-Member
Nodes: 1
Expected votes: 1
Total votes: 2
Quorum: 1  
Active subsystems: 6
Flags: 2node 
Ports Bound: 0  
Node name: core-01
Node ID: 1
Multicast addresses: 239.192.102.68 
Node addresses: 192.168.0.2 

Time to start cman on core-02.

[root@core-02 ~]# service cman start
Starting cluster: 
  Loading modules... done
  Mounting configfs... done
  Starting ccsd... done
  Starting cman... done
  Starting daemons... done
  Starting fencing... done
                                                          [  OK  ]

Now let's check the status of the cluster.

[root@core-01 ~]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M  34944   2008-02-16 02:08:14  core-01
   2   M  34948   2008-02-16 02:10:09  core-02
[root@core-01 ~]# cman_tool status
Version: 6.0.1
Config Version: 2
Cluster Name: hardcore
Cluster Id: 26333
Cluster Member: Yes
Cluster Generation: 34948
Membership state: Cluster-Member
Nodes: 2
Expected votes: 1
Total votes: 2
Quorum: 1  
Active subsystems: 6
Flags: 2node 
Ports Bound: 0  
Node name: core-01
Node ID: 1
Multicast addresses: 239.192.102.68 
Node addresses: 192.168.0.2 


Step5.

Start DRBD on both nodes.

DRBD will wait for core-02 to connect.

[root@core-01 ~]# service drbd start
Starting DRBD resources:    [ d0 d1 s0 s1 n0 n1 ].
......

[root@core-01 ~]# service drbd status
drbd driver loaded OK; device status:
version: 8.2.4 (api:88/proto:86-88)
GIT-hash: fc00c6e00a1b6039bfcebe37afa3e7e28dbd92fa build by root@core-01, 2008-02-13 22:22:18
0: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---
   ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
       resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
       act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0
1: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---
   ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
       resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
       act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0

Start DRBD on core-02

[root@core-02 ~]# service drbd start
Starting DRBD resources:    [ d0 d1 s0 s1 n0 n1 ].

Now we can see both nodes are connected and the replicated devices are synchronized in primary/primary mode.

[root@core-01 ~]# service drbd status
drbd driver loaded OK; device status:
version: 8.2.4 (api:88/proto:86-88)
GIT-hash: fc00c6e00a1b6039bfcebe37afa3e7e28dbd92fa build by root@core-01, 2008-02-13 22:22:18
0: cs:Connected st:Primary/Primary ds:UpToDate/UpToDate C r---
   ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
       resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
       act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0
1: cs:Connected st:Primary/Primary ds:UpToDate/UpToDate C r---
   ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
       resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
       act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0

Step6.

We need to specify two journals, as each cluster node requires its own.

Referring back to our cluster.conf file, we chose hardcore as our cluster name. We will call the first clustered filesystem gfs2-00.

[root@core-01 ~]# mkfs.gfs2 -t hardcore:gfs2-00 -p lock_dlm -j 2 /dev/drbd0

Are you sure you want to proceed? [y/n] y

Device:                    /dev/drbd0
Blocksize:                 4096
Device Size                465.76 GB (122096000 blocks)
Filesystem Size:           465.76 GB (122095999 blocks)
Journals:                  2
Resource Groups:           1864
Locking Protocol:          "lock_dlm"
Lock Table:                "hardcore:gfs2-00"

Now do the same for the second disk we defined in drbd.conf.

[root@core-01 ~]# mkfs.gfs2 -t hardcore:gfs2-01 -p lock_dlm -j 2 /dev/drbd1

Are you sure you want to proceed? [y/n] y

Device:                    /dev/drbd1
Blocksize:                 4096
Device Size                465.76 GB (122096000 blocks)
Filesystem Size:           465.76 GB (122095999 blocks)
Journals:                  2
Resource Groups:           1864
Locking Protocol:          "lock_dlm"
Lock Table:                "hardcore:gfs2-01"


Step7.

Now that we have created the filesystems, we can go ahead and mount them.

If you are unable to mount the filesystem and see the error below, check that fencing is in a running state:

/sbin/mount.gfs2: lock_dlm_join: gfs_controld join error: -22
/sbin/mount.gfs2: error mounting lockproto lock_dlm
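If you hit that error, it is worth confirming that cman is running and that the fence domain has been joined before trying the mount again; for example:

[root@core-01 ~]# service cman status
[root@core-01 ~]# cman_tool services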


[root@core-01 ~]# mount -t gfs2 /dev/drbd0 /gfs2-00 -v
/sbin/mount.gfs2: mount /dev/drbd0 /gfs2-00
/sbin/mount.gfs2: parse_opts: opts = "rw"
/sbin/mount.gfs2:   clear flag 1 for "rw", flags = 0
/sbin/mount.gfs2: parse_opts: flags = 0
/sbin/mount.gfs2: parse_opts: extra = ""
/sbin/mount.gfs2: parse_opts: hostdata = ""
/sbin/mount.gfs2: parse_opts: lockproto = ""
/sbin/mount.gfs2: parse_opts: locktable = ""
/sbin/mount.gfs2: message to gfs_controld: asking to join mountgroup:
/sbin/mount.gfs2: write "join /gfs2-00 gfs2 lock_dlm hardcore:gfs2-00 rw /dev/drbd0"
/sbin/mount.gfs2: message from gfs_controld: response to join request:
/sbin/mount.gfs2: lock_dlm_join: read "0"
/sbin/mount.gfs2: message from gfs_controld: mount options:
/sbin/mount.gfs2: lock_dlm_join: read "hostdata=jid=0:id=131073:first=0"
/sbin/mount.gfs2: lock_dlm_join: hostdata: "hostdata=jid=0:id=131073:first=0"
/sbin/mount.gfs2: lock_dlm_join: extra_plus: "hostdata=jid=0:id=131073:first=0"
/sbin/mount.gfs2: mount(2) ok
/sbin/mount.gfs2: lock_dlm_mount_result: write "mount_result /gfs2-00 gfs2 0"
/sbin/mount.gfs2: read_proc_mounts: device = "/dev/drbd0"
/sbin/mount.gfs2: read_proc_mounts: opts = "rw,relatime,hostdata=jid=0:id=131073:first=0"


Now let's add the mounts to fstab so they are mounted when the system boots.

[root@core-01 ~]# vi /etc/fstab 
#GFS DRBD MOUNT POINTS
/dev/drbd0              /gfs2-00                gfs2    defaults        1 1
/dev/drbd1              /gfs2-01                gfs2    defaults        1 1
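
For these fstab entries to mount cleanly at boot, DRBD and the cluster services have to come up first. Assuming the stock init scripts shipped with the packages above, enabling them on both nodes looks like this:

[root@core-01 ~]# chkconfig drbd on
[root@core-01 ~]# chkconfig cman on
[root@core-01 ~]# chkconfig gfs2 on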