Configuring Software RAID




Configuring RAID on Fedora Linux requires a number of steps that you need to follow carefully. In this tutorial example, you'll configure RAID 5 on a system with three pre-partitioned hard disks. The partitions to be used are:

  • /dev/hde1

  • /dev/hdf2

  • /dev/hdg1

Be sure to adapt the various stages outlined below to your particular environment.

RAID Partitioning

You first need to identify two or more partitions, each on a separate disk. If you are doing RAID 0 or RAID 5, the partitions should be of approximately the same size, as in this scenario. RAID limits the extent of data access on each partition to an area no larger than that of the smallest partition in the RAID set.
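
To get a rough feel for how the smallest partition limits a RAID 5 set, here is a minimal sketch that estimates usable capacity as the smallest partition size multiplied by the number of partitions minus one. The function name raid5_usable_kb and the sizes passed to it are illustrative assumptions, and the estimate ignores the space taken by the RAID superblock.

```shell
# Rough RAID 5 capacity estimate: (members - 1) * smallest member.
# Sizes are in 1 KB blocks. Hypothetical helper, not part of raidtools.
raid5_usable_kb() {
    smallest=$1
    count=0
    for size in "$@"; do
        if [ "$size" -lt "$smallest" ]; then
            smallest=$size
        fi
        count=$((count + 1))
    done
    echo $(( (count - 1) * smallest ))
}

# Three made-up partition sizes; the smallest one sets the limit.
raid5_usable_kb 2060320 2100000 2500000
```

With these made-up sizes the space beyond 2060320 KB on the two larger partitions is simply unused, which is why similarly sized partitions are the better choice.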

Determining Available Partitions

First use the fdisk -l command to view all the partitions on your system, whether or not they are mounted. You may then also want to use the df -k command, which shows only mounted filesystems but has the big advantage of giving you the mount points too.

These two commands should help you to easily identify the partitions you want to use. Here is some sample output of these commands:

     [root@bigboy tmp]# fdisk -l

     Disk /dev/hda: 12.0 GB, 12072517632 bytes
     255 heads, 63 sectors/track, 1467 cylinders
     Units = cylinders of 16065 * 512 = 8225280 bytes

        Device Boot    Start       End    Blocks   Id  System
     /dev/hda1   *         1        13    104391   83  Linux
     /dev/hda2            14       144   1052257+  83  Linux
     /dev/hda3           145       209    522112+  82  Linux swap
     /dev/hda4           210      1467  10104885    5  Extended
     /dev/hda5           210       655   3582463+  83  Linux
     ...
     ...
     /dev/hda15         1455      1467    104391   83  Linux
     [root@bigboy tmp]#
     [root@bigboy tmp]# df -k
     Filesystem           1K-blocks      Used Available Use% Mounted on
     /dev/hda2           1035692   163916   819164 17% /
     /dev/hda1            101086     8357    87510  9% /boot
     /dev/hda15           101086     4127    91740  5% /data1
     ...
     ...
     ...
     /dev/hda7           5336664   464228  4601344 10% /var
     [root@bigboy tmp]#

Unmount the Partitions

You don't want anyone else accessing these partitions while you are creating the RAID set, so you need to make sure they are unmounted:

     [root@bigboy tmp]# umount /dev/hde1
     [root@bigboy tmp]# umount /dev/hdf2
     [root@bigboy tmp]# umount /dev/hdg1
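
If you want to double-check that nothing is still mounted before you go on, one possible approach is to look for each partition in the kernel's mount table. The sketch below assumes a Linux system that exposes /proc/mounts; the is_mounted function is a hypothetical helper, not a standard command.

```shell
# is_mounted DEVICE [MOUNTS_FILE]
# Succeeds if DEVICE appears in the first column of the mount table.
# MOUNTS_FILE defaults to /proc/mounts (an assumption about your system).
is_mounted() {
    dev=$1
    mounts=${2:-/proc/mounts}
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"
}

# Report on the three partitions used in this tutorial.
for part in /dev/hde1 /dev/hdf2 /dev/hdg1; do
    if is_mounted "$part"; then
        echo "$part is still mounted"
    else
        echo "$part is not mounted"
    fi
done
```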

Prepare the Partitions with fdisk

You have to change each partition in the RAID set to be of type FD (Linux raid autodetect), and you can do this with fdisk. Here is an example using /dev/hde1:

     [root@bigboy tmp]# fdisk /dev/hde

     The number of cylinders for this disk is set to 8355.
     There is nothing wrong with that, but this is larger than 1024,
     and could in certain setups cause problems with:
     1) software that runs at boot time (e.g., old versions of LILO)
     2) booting and partitioning software from other OSs
        (e.g., DOS FDISK, OS/2 FDISK)

     Command (m for help):

Use fdisk Help

Now use the fdisk m command to get some help:

     Command (m for help): m
        ...
        ...
        p print the partition table
        q quit without saving changes
        s create a new empty Sun disklabel
        t change a partition's system id
        ...
        ...

     Command (m for help):

Set the ID Type to FD

Partition /dev/hde1 is the first partition on disk /dev/hde. Modify its type using the t command, and specify the partition number and type code. You also should use the L command to get a full listing of ID types in case you forget:

     Command (m for help): t
     Partition number (1-5): 1
     Hex code (type L to list codes): L

     ...
     ...
     ...
     16  Hidden FAT16    61 SpeedStor       a9 NetBSD      f2 DOS secondary
     17  Hidden HPFS/NTF 63 GNU HURD or Sys ab Darwin boot fd Linux raid auto
     18  AST SmartSleep  64 Novell Netware  b7 BSDI fs     fe LANstep
     1b  Hidden Win95 FA 65 Novell Netware  b8 BSDI swap   ff BBT
     Hex code (type L to list codes): fd
     Changed system type of partition 1 to fd (Linux raid autodetect)

     Command (m for help):

Make Sure the Change Occurred

Use the p command to get the new proposed partition table:

     Command (m for help): p

     Disk /dev/hde: 4311 MB, 4311982080 bytes
     16 heads, 63 sectors/track, 8355 cylinders
     Units = cylinders of 1008 * 512 = 516096 bytes

        Device Boot    Start       End    Blocks   Id  System
     /dev/hde1             1      4088   2060320+  fd  Linux raid autodetect
     /dev/hde2          4089      5713    819000   83  Linux
     /dev/hde4          6608      8355    880992    5  Extended
     /dev/hde5          6608      7500    450040+  83  Linux
     /dev/hde6          7501      8355    430888+  83  Linux

     Command (m for help):

Save the Changes

Use the w command to permanently save the changes to disk /dev/hde:

     Command (m for help): w
     The partition table has been altered!

     Calling ioctl() to re-read partition table.
     WARNING: Re-reading the partition table failed with error 16: Device
     or resource busy.
     The kernel still uses the old table.
     The new table will be used at the next reboot.
     Syncing disks.
     [root@bigboy tmp]#

The warning above appears if any of the other partitions on the disk is still mounted. In that case the kernel keeps using the old partition table until the next reboot.

Repeat for the Other Partitions

For the sake of brevity, I won't show the process for the other partitions. It's enough to know that the steps for changing the IDs for /dev/hdf2 and /dev/hdg1 are very similar.
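
Once all three partitions have been changed, you may want to confirm their type IDs without reading each table by eye. The following is a minimal sketch that pulls the Id column out of fdisk -l style output; partition_id is a hypothetical helper, and it assumes the column layout shown in this section.

```shell
# partition_id DEVICE
# Reads `fdisk -l` style output on stdin and prints the Id column for DEVICE.
# Hypothetical helper; assumes the column layout shown in this section.
partition_id() {
    awk -v dev="$1" '
        $1 == dev {
            # A "*" in the Boot column shifts the remaining fields by one.
            if ($2 == "*") print $6; else print $5
        }
    '
}

# Demo against sample lines in the same format as the fdisk listing.
partition_id /dev/hde1 <<'EOF'
/dev/hde1             1      4088   2060320+  fd  Linux raid autodetect
/dev/hde2          4089      5713    819000   83  Linux
EOF
```

On a live system you would pipe the output of fdisk -l /dev/hde into the function instead of using a here-document, and expect to see fd for each partition in the RAID set.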

Edit the RAID Configuration File

The Linux RAID configuration file is /etc/raidtab. You can find templates for this file in the /usr/share/doc/raidtools* directory. For an explanation of the various parameters, issue the man raidtab command.

To ensure success, remember these general guidelines:

  • When configuring RAID 5, you must use a parity-algorithm setting.

  • The raid-disk parameters for each partition in the /etc/raidtab file are numbered starting at 0. For example, if you have four partitions for RAID, they would be numbered 0, 1, 2, and 3.

  • For RAID levels 1, 4, and 5, the /etc/raidtab persistent-superblock parameter must be set to 1 for the RAID autodetect feature (partition type FD) to work. For all other RAID levels, persistent-superblock must be set to 0.

Consider an example. Here, RAID 5 is configured to use each of the desired partitions on the three disks, and the set of three is called /dev/md0. The data is distributed across the drives in 32 KB chunks (the chunk-size value is given in kilobytes):

     #
     # sample raiddev configuration file
     # RAID 5 array built from three partitions
     #
     raiddev /dev/md0
        raid-level              5
        nr-raid-disks           3
        persistent-superblock   1
        chunk-size              32
        parity-algorithm        left-symmetric
        device                  /dev/hde1
        raid-disk               0
        device                  /dev/hdf2
        raid-disk               1
        device                  /dev/hdg1
        raid-disk               2
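
Because a miscounted or misnumbered raid-disk entry is an easy mistake to make, a quick sanity check can help before you create the RAID set. The sketch below is one possible check, not a raidtools feature: check_raidtab is a hypothetical helper that handles only a file with a single raiddev stanza, and the demo runs against a temporary copy of the example rather than your real /etc/raidtab.

```shell
# check_raidtab FILE
# Sanity-checks a raidtab containing a single raiddev stanza: raid-disk
# entries must be numbered 0, 1, 2, ... and their count must equal
# nr-raid-disks. Hypothetical helper, not part of raidtools.
check_raidtab() {
    awk '
        BEGIN { seen = 0; expected = -1; bad = 0 }
        $1 == "nr-raid-disks" { expected = $2 + 0 }
        $1 == "raid-disk" {
            if ($2 + 0 != seen) bad = 1
            seen++
        }
        END {
            if (bad || seen != expected) {
                print "raid-disk numbering or count does not match nr-raid-disks"
                exit 1
            }
            print "ok: " seen " raid-disk entries"
        }
    ' "$1"
}

# Demo on a temporary copy of the example configuration.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
raiddev /dev/md0
   raid-level              5
   nr-raid-disks           3
   persistent-superblock   1
   chunk-size              32
   parity-algorithm        left-symmetric
   device                  /dev/hde1
   raid-disk               0
   device                  /dev/hdf2
   raid-disk               1
   device                  /dev/hdg1
   raid-disk               2
EOF
check_raidtab "$tmp"
rm -f "$tmp"
```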

Create the RAID Set

The mkraid command creates the RAID set by reading the /etc/raidtab file. The example creates the logical RAID device /dev/md0:

     [root@bigboy tmp]# mkraid /dev/md0
     analyzing super-block
     disk 0: /dev/hde1, 104391kB, raid superblock at 104320kB
     disk 1: /dev/hdf2, 104391kB, raid superblock at 104320kB
     disk 2: /dev/hdg1, 104391kB, raid superblock at 104320kB
     [root@bigboy tmp]#

Confirm RAID Is Correctly Initialized

The /proc/mdstat file provides the current status of all RAID devices. Confirm that the initialization is finished by inspecting the file and making sure that there are no initialization-related messages:

     [root@bigboy tmp]# cat /proc/mdstat
     Personalities : [raid5]
     read_ahead 1024 sectors
     md0 : active raid5 hdg1[2] hde1[1] hdf2[0]
           4120448 blocks level 5, 32k chunk, algorithm 3 [3/3] [UUU]

     unused devices: <none>
     [root@bigboy tmp]#
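
If you would rather script this check, the bracketed counter on the status line (such as [3/3]) can be compared automatically. The following is a minimal sketch under the assumption that your /proc/mdstat uses the same layout as the sample above; md_members_up is a hypothetical helper, and the demo reads a temporary copy of the sample instead of the live file.

```shell
# md_members_up MD_NAME [MDSTAT_FILE]
# Prints "<active> of <total> members up" for the named RAID device and
# succeeds only if all members are active. Exit status 2 means the device
# was not found. MDSTAT_FILE defaults to /proc/mdstat.
md_members_up() {
    md=$1
    file=${2:-/proc/mdstat}
    awk -v md="$md" '
        $1 == md && $2 == ":" { in_dev = 1; next }
        in_dev && match($0, /\[[0-9]+\/[0-9]+\]/) {
            # The counter has the form [total/active].
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            print n[2] " of " n[1] " members up"
            status = (n[1] == n[2]) ? 0 : 1
            found = 1
            exit
        }
        END { exit (found ? status : 2) }
    ' "$file"
}

# Demo on a temporary copy of the sample output.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
Personalities : [raid5]
read_ahead 1024 sectors
md0 : active raid5 hdg1[2] hde1[1] hdf2[0]
      4120448 blocks level 5, 32k chunk, algorithm 3 [3/3] [UUU]

unused devices: <none>
EOF
md_members_up md0 "$tmp"
rm -f "$tmp"
```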

Format the New RAID Set

Your new RAID device now has to be formatted. The next example uses the -j option to ensure that a journaling filesystem is created. It uses a block size of 4 KB (4096 bytes) and a stride of 8, so each chunk is made up of 8 blocks. It is very important that the chunk-size parameter in the /etc/raidtab file match the block size multiplied by the stride value in the command below; here, 4 KB times 8 gives the 32 KB chunk size. If the values don't match, you will get parity errors.

     [root@bigboy tmp]# mke2fs -j -b 4096 -R stride=8 /dev/md0
     mke2fs 1.32 (09-Nov-2002)
     Filesystem label=
     OS type: Linux
     Block size=4096 (log=2)
     Fragment size=4096 (log=2)
     516096 inodes, 1030160 blocks
     51508 blocks (5.00%) reserved for the super user
     First data block=0
     32 block groups
     32768 blocks per group, 32768 fragments per group
     16128 inodes per group
     Superblock backups stored on blocks:
             32768, 98304, 163840, 229376, 294912, 819200, 884736

     Writing inode tables: done
     Creating journal (8192 blocks): done
     Writing superblocks and filesystem accounting information: done

     This filesystem will be automatically checked every 26 mounts or
     180 days, whichever comes first. Use tune2fs -c or -i to override.
     [root@bigboy tmp]#
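
The relationship between chunk size, block size, and stride is simple arithmetic, and you can let the shell do it for you. This sketch derives the stride from the other two values and prints the resulting mke2fs command without running it; stride_for is a hypothetical helper, and the variable names are illustrative.

```shell
# stride_for CHUNK_KB BLOCK_BYTES
# stride = chunk size in bytes / filesystem block size in bytes.
# Hypothetical helper for illustration only.
stride_for() {
    echo $(( $1 * 1024 / $2 ))
}

chunk_kb=32         # chunk-size from /etc/raidtab, in KB
block_bytes=4096    # filesystem block size passed to mke2fs -b

stride=$(stride_for "$chunk_kb" "$block_bytes")

# Print the command for review; nothing is formatted here.
echo "mke2fs -j -b $block_bytes -R stride=$stride /dev/md0"
```

With the values from this tutorial the computed stride is 8, matching the command shown above.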

Load the RAID Driver for the New RAID Set

Next, make the Linux operating system fully aware of the RAID set by loading the driver for the new RAID set using the raidstart command:

     [root@bigboy tmp]# raidstart /dev/md0
     [root@bigboy tmp]#

Create a Mount Point for the RAID Set

After the driver loads, create a mount point for /dev/md0, such as this one called /mnt/raid:

     [root@bigboy mnt]# mkdir /mnt/raid

Edit the /etc/fstab File

The /etc/fstab file lists all the partitions that need to mount when the system boots. Add an entry for the RAID set, the /dev/md0 device:

     /dev/md0      /mnt/raid     ext3    defaults    1 2

Do not use labels in the /etc/fstab file for RAID devices; just use the real device name, such as /dev/md0. On startup, the /etc/rc.d/rc.sysinit script checks the /etc/fstab file for device entries that match RAID set names in the /etc/raidtab file. The script will not automatically start the RAID set driver for the RAID set if it doesn't find a match. Device mounting then occurs later on in the boot process. Mounting a RAID device that doesn't have a loaded driver can corrupt your data and produce this error:

     Starting up RAID devices: md0(skipped)
     Checking filesystems
     /raiddata: Superblock has a bad ext3 journal(inode8)
     CLEARED.
     ***journal has been deleted - file system is now ext 2 only***

     /raiddata: The filesystem size (according to the superblock) is
     2688072 blocks.
     The physical size of the device is 8960245 blocks.
     Either the superblock or the partition table is likely to be corrupt!
     /boot: clean, 41/26104 files, 12755/104391 blocks

     /raiddata: UNEXPECTED INCONSISTENCY; Run fsck manually (ie without -a
     or -p options).

If you are not familiar with the /etc/fstab file, use the man fstab command to get a comprehensive explanation of each data column it contains.
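
Since the startup script depends on RAID set names in /etc/raidtab matching device entries in /etc/fstab, it can be worth checking for a mismatch yourself. The sketch below is one way to do that and is not what rc.sysinit itself runs; missing_fstab_entries is a hypothetical helper, and the demo uses temporary files in place of the real configuration files.

```shell
# missing_fstab_entries RAIDTAB FSTAB
# Prints every raiddev named in RAIDTAB that has no line in FSTAB whose
# first column is that device. Hypothetical helper for illustration.
missing_fstab_entries() {
    raidtab=$1
    fstab=$2
    awk '$1 == "raiddev" { print $2 }' "$raidtab" |
    while read -r md; do
        awk -v d="$md" '$1 == d { found = 1 } END { exit !found }' "$fstab" ||
            echo "$md"
    done
}

# Demo with temporary stand-ins for /etc/raidtab and /etc/fstab.
raidtab=$(mktemp)
fstab=$(mktemp)
printf 'raiddev /dev/md0\n   raid-level 5\n' > "$raidtab"
printf '/dev/md0      /mnt/raid     ext3    defaults    1 2\n' > "$fstab"

missing=$(missing_fstab_entries "$raidtab" "$fstab")
if [ -z "$missing" ]; then
    echo "every raiddev has an fstab entry"
else
    echo "no fstab entry for: $missing"
fi
rm -f "$raidtab" "$fstab"
```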

The /dev/hde1, /dev/hdf2, and /dev/hdg1 partitions were replaced by the combined /dev/md0 device. You therefore don't want the old partitions to be mounted again. Make sure that all references to them in this file are either commented out with a # at the beginning of the line or deleted entirely:

     #/dev/hde1      /data1         ext3    defaults        1 2
     #/dev/hdf2      /data2         ext3    defaults        1 2
     #/dev/hdg1      /data3         ext3    defaults        1 2
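
If you prefer not to edit the file by hand, sed can add the # for you. The following is a minimal sketch that writes the edited text to standard output so you can review it before replacing anything; comment_out_device is a hypothetical helper, and the demo works on a temporary file rather than the real /etc/fstab.

```shell
# comment_out_device DEVICE FILE
# Prints FILE with a "#" added in front of any line that starts with DEVICE
# followed by whitespace. The file itself is not modified.
# Hypothetical helper; assumes DEVICE contains no regex metacharacters.
comment_out_device() {
    sed "s|^${1}[[:space:]]|#&|" "$2"
}

# Demo on a temporary stand-in for /etc/fstab.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
/dev/hde1      /data1         ext3    defaults        1 2
/dev/md0       /mnt/raid      ext3    defaults        1 2
EOF
comment_out_device /dev/hde1 "$tmp"
rm -f "$tmp"
```

Only the /dev/hde1 line gains a leading #; the /dev/md0 entry is left alone.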

Start the New RAID Set's Driver

You now can start the new RAID set's driver with the raidstart command. This command is run automatically at boot time, so you'll only have to do this once.

     [root@bigboy tmp]# raidstart /dev/md0

Mount the New RAID Set

Use the mount command to mount the RAID set. You have your choice of methods:

  • The mount command's -a flag causes Linux to mount all the devices in the /etc/fstab file that have automounting enabled (default) and that are also not already mounted.

    [root@bigboy tmp]# mount -a
    

  • You can also mount the device manually.

    [root@bigboy tmp]# mount /dev/md0 /mnt/raid
    

Check the Status of the New RAID

The /proc/mdstat file provides the current status of all the devices. When the RAID driver is stopped, the file has very little information, as seen here:

     [root@bigboy tmp]# raidstop /dev/md0
     [root@bigboy tmp]# cat /proc/mdstat
     Personalities : [raid5]
     read_ahead 1024 sectors
     unused devices: <none>
     [root@bigboy tmp]#

More information, including the partitions of the RAID set, is provided after you load the driver using the raidstart command.

     [root@bigboy tmp]# raidstart /dev/md0
     [root@bigboy tmp]# cat /proc/mdstat
     Personalities : [raid5]
     read_ahead 1024 sectors
     md0 : active raid5 hdg1[2] hde1[1] hdf2[0]
           4120448 blocks level 5, 32k chunk, algorithm 3 [3/3] [UUU]

     unused devices: <none>
     [root@bigboy tmp]#