Managing RAID and LVM with Linux
I hope to turn this into a general, easy-to-follow guide to setting up RAID-5 and LVM on a modern Linux system. For now, though, it's basically a collection of my notes as I experimented on my own systems. Please note that my experimentation was based on the RAID and LVM implementations under Fedora Core 3 and 4, as well as Red Hat Enterprise Linux 4, all of which are based on the 2.6 series of kernels. These instructions may or may not work with other versions or distros. I'm not an expert (yet) in either software RAID or LVM, so please use the comment section below for corrections and comments.
There are two RAID levels I'll deal with here. The first is RAID-1, also known as mirroring, which uses two essentially identical drives, each holding a complete copy of the data. The second, and the one I will mostly refer to in this guide, is RAID-5, which uses three or more drives with the data spread across them in such a way that the failure of any one drive will not result in data loss. The Red Hat website has a great overview of the RAID levels.
One limitation of Linux software RAID is that a /boot partition can only reside on a RAID-1 array.
Linux supports several hardware RAID devices as well as software RAID, which allows you to use any IDE or SCSI drives as the physical devices. In all cases I'll be referring to software RAID.
LVM stands for Logical Volume Manager. It is a way of grouping drives and/or partitions so that, instead of dealing with hard and fast physical partitions, the data is managed on a virtual basis where the virtual partitions can be resized. The Red Hat website has a great overview of the Logical Volume Manager.
One limitation is that LVM cannot be used for the /boot partition.
In my case I have two systems with RAID arrays: one with two 73G SCSI drives running RAID-1 (mirroring), and a test system with three 120G IDE drives running RAID-5. In most cases I will refer to the RAID-5 configuration, as that is more typical.
I have an extra IDE controller in my system so I can support more than 4 IDE devices, which results in a rather odd drive assignment. The order doesn't seem to bother the Linux kernel, so it doesn't bother me. My basic configuration is as follows:
hda - 120G drive
hdb - 120G drive
hde - 60G boot drive, not on the RAID array
hdf - 120G drive
hdg - CD-ROM drive

The first step is to create the physical partitions on each drive that will be part of the RAID array. In my case I want to use each 120G drive in its entirety. All the drives are partitioned identically, so, for example, this is how hda is partitioned:

Disk /dev/hda: 120.0 GB, 120034123776 bytes
16 heads, 63 sectors/track, 232581 cylinders
Units = cylinders of 1008 * 512 = 516096 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hda1   *           1      232581   117220792+  fd  Linux raid autodetect

So now, with all three drives partitioned with id fd (Linux raid autodetect), you can go ahead and combine the partitions into a RAID array:
# /sbin/mdadm --create --verbose /dev/md0 --level=5 --raid-devices=3 \
    /dev/hdb1 /dev/hda1 /dev/hdf1

Wow, that was easy. That created a special device, /dev/md0, which can be used instead of a physical partition. You can check on the status of the RAID array with the mdadm command:
# /sbin/mdadm --detail /dev/md0
Version : 00.90.01
Creation Time : Wed May 11 20:00:18 2005
Raid Level : raid5
Array Size : 234436352 (223.58 GiB 240.06 GB)
Device Size : 117218176 (111.79 GiB 120.03 GB)
Raid Devices : 3
Total Devices : 3
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Fri Jun 10 04:13:11 2005
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
UUID : 36161bdd:a9018a79:60e0757a:e27bb7ca
Events : 0.10670
Number Major Minor RaidDevice State
0 3 1 0 active sync /dev/hda1
1 3 65 1 active sync /dev/hdb1
2 33 65 2 active sync /dev/hdf1
The important line to check is the State line, which should say clean; otherwise there might be a problem. In the device table at the bottom, make sure the State column says active sync for each device, which means the device is actively in the array. You could also have a spare device on hand in case any drive fails; if you have a spare, it will be listed as such here.
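As an aside, if you do want a hot spare, adding one is just a matter of adding another partition to the array once all the active slots are filled. Here's a quick sketch, assuming a hypothetical fourth partition /dev/hdh1 prepared the same way as the others:

# /sbin/mdadm /dev/md0 -a /dev/hdh1

Because the array already has its three active devices, mdadm keeps the new partition as a spare, and mdadm --detail will list it with a state of spare.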
One thing you'll notice above, if you're paying attention, is that the array size is about 240G even though it is built from three 120G drives. That's because the equivalent of one drive's worth of space holds the parity data needed to survive the failure of any one drive: with three 120G drives the usable space is (3 - 1) x 120G, or roughly 240G.
Anyway, here are the steps to add this RAID array to the LVM system. The first command, pvcreate, will "initialize a disk or partition for use by LVM". The second command, vgcreate, will then create the Volume Group; in my case I called it lvm-raid:
# pvcreate /dev/md0
# vgcreate lvm-raid /dev/md0

The default physical extent size can be too small for a large RAID array. In those cases you'll need to specify the -s option with a larger-than-default physical extent size; the default is only 4MB as of the version in Fedora Core 5. The maximum number of physical extents is approximately 65k (at least with the LVM 1 metadata format, as a commenter below points out), so take your maximum volume size, divide it by 65k, and round up to the next nice round number. For example, to create a 550G volume, figure that's approximately 550,000 megabytes and divide by 65,000, which gives you roughly 8.46. Round that up to the next nice round number and use 16M (for 16 megabytes) as the physical extent size and you'll be fine:
# vgcreate -s 16M <volume group name>

OK, you've created a blank receptacle, but now you have to tell it how many Physical Extents from the physical device (/dev/md0 in this case) will be allocated to a Logical Volume. In my case I wanted all of the space on /dev/md0 allocated to it. If I later wanted to add more space, I would create a new RAID array and add that physical device to the Volume Group.
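For reference, combining the volume group creation with an explicit extent size for this particular array would look something like the following. This is just a sketch reusing the lvm-raid name and /dev/md0 from above; strictly speaking this ~224G array fits under the 65k extent limit even at the default 4MB size, as the vgdisplay output below shows with its roughly 57k extents:

# vgcreate -s 16M lvm-raid /dev/md0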
Use the vgdisplay command to find out how many PEs are available, and then create a Logical Volume using all (or some) of the space in the Volume Group. In my case I call the Logical Volume lvm0.
# vgdisplay lvm-raid
  .
  .
  Free  PE / Size       57235 / 223.57 GB

# lvcreate -l 57235 lvm-raid -n lvm0

In the end you will have a device called /dev/lvm-raid/lvm0 that you can use very much like a plain ol' partition. You can check on the status of the Logical Volume with the lvdisplay command, and then create a filesystem on the device.
# lvdisplay /dev/lvm-raid/lvm0
--- Logical volume ---
LV Name /dev/lvm-raid/lvm0
VG Name lvm-raid
LV UUID FFX673-dGlX-tsEL-6UXl-1hLs-6b3Y-rkO9O2
LV Write Access read/write
LV Status available
# open 1
LV Size 223.57 GB
Current LE 57235
Segments 1
Allocation inherit
Read ahead sectors 0
Block device 253:2
Now create a filesystem on the Logical Volume and mount it:

# mkfs.ext3 /dev/lvm-raid/lvm0
.
.
# mount /dev/lvm-raid/lvm0 /mnt
# df -h /mnt
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/lvm--raid-lvm0
224G 93M 224G 1% /mnt
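To have the filesystem mounted automatically at boot you would also add a line to /etc/fstab. A minimal sketch, assuming a hypothetical permanent mount point of /raid rather than /mnt:

/dev/lvm-raid/lvm0    /raid    ext3    defaults    1 2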
To test the array you can manually mark a drive as failed with mdadm:

# /sbin/mdadm /dev/md0 -f /dev/hdb1
mdadm: set /dev/hdb1 faulty in /dev/md0
Once the system has determined a drive has failed or is otherwise missing (you can shut down, pull out a drive, and reboot to simulate a drive failure, or use the command above to manually fail a drive), mdadm will show something like this:
# /sbin/mdadm --detail /dev/md0
Update Time : Wed Jun 15 11:30:59 2005
State : clean, degraded
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0
.
.
Number Major Minor RaidDevice State
0 3 1 0 active sync /dev/hda1
1 0 0 - removed
2 33 65 2 active sync /dev/hdf1
You'll notice that in this case /dev/hdb failed. I replaced it with a new drive of the same capacity and was able to add it back to the array. The first step is to partition the new drive just as when first creating the array. Then you can simply add the partition back to the array and watch the status as the data is rebuilt onto the newly replaced drive.
# /sbin/mdadm /dev/md0 -a /dev/hdb1
# /sbin/mdadm --detail /dev/md0
Update Time : Wed Jun 15 12:11:23 2005
State : clean, degraded, recovering
Active Devices : 2
Working Devices : 3
Failed Devices : 0
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 64K
Rebuild Status : 2% complete
.
.
During the rebuild process the system performance may be somewhat impacted, but the data should remain intact.
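If you just want a quick view of the rebuild progress rather than the full mdadm --detail output, the kernel's own status file shows a progress bar for the resync; for example:

# cat /proc/mdstat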
I'm told it's now possible to expand the size of a RAID array, much as you could on a commercial array such as a NetApp. The link below describes the procedure; I have yet to try it, but it looks promising: Growing a RAID5 array - http://scotgate.org/?p=107
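Based on that link and the comments below, with a new enough kernel (2.6.17 or later) and mdadm (2.5.x or later) the procedure boils down to adding a partition, reshaping the array, and then growing the LVM layer on top of it. This is only a rough, untested sketch, assuming a hypothetical new partition /dev/hdh1 partitioned like the others:

# /sbin/mdadm /dev/md0 -a /dev/hdh1
# /sbin/mdadm --grow /dev/md0 --raid-devices=4
# pvresize /dev/md0
# lvextend -l +100%FREE /dev/lvm-raid/lvm0
# resize2fs /dev/lvm-raid/lvm0

Wait for the reshape to finish (watch /proc/mdstat) before resizing the LVM and filesystem layers.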
Encrypting /home and swap over RAID with dm-crypt - Do you have important company files on your PC at home that you can neither afford to lose nor let fall into the wrong hands? This page explains how to set up encrypted RAID1 ext3 filesystems with dm-crypt, along with an encrypted RAID0 swap, on Red Hat / Fedora Core 5, using the twofish encryption algorithm and dm-crypt's new ESSIV mode.
The comments section below is only for comments, suggestions, or corrections for this guide. Please do not use it for general Fedora/Linux support. If you require support for something other than what's described here, I recommend using Fedora Forums.
Comments From People Like You!
10-Oct-2010 10:19
Hi all,

24-Mar-2010 18:14
This is very interesting. I like reading pertinent information on topics similar to what I've experienced. If I may, I'd also suggest some reading that I wrote about a similar issue.

26-Jul-2009 13:59
This is a great guide. I've shown it to a couple of friends who were interested in software RAID, but they complained about not having a GUI. After a couple of days of searching I found a great tutorial on how to set up a RAID array using Fedora's installer; if anybody is interested, the tutorial is at http://www.optimiz3.com/installing-fedora-11-and-setting-up-a-raid-0-1-5-6-or-10-array/ . It's a nice tutorial for the Linux n00bs out there.
15-Sep-2008 21:48
With regard to /etc/mdadm.conf -- you can regenerate it by running:

04-Jul-2008 00:18
@Joe - regarding /etc/mdadm.conf I find it much easier in the long run to specify "DEVICE partitions" and dynamically scan for the array based on the UUID.
06-Jun-2008 15:34
This is a wonderful tutorial. It saved me hours of work.

03-Jun-2008 11:12
I've been using WinXP and mounting drives in folders to accommodate my storage requirements, but admittedly, I've been looking for alternatives.

28-May-2008 21:58
Thanks for this great how-to article. Finally, someone who can write an easy-to-understand guide for doing something on Linux. What a breath of fresh air!
29-Feb-2008 14:35
The guide suggests increasing the physical extent size (-s) because "The maximum number of physical extents is approximately 65k". That is only true if you are still using the LVM 1 metadata format.

17-Oct-2007 02:23
You should also point out that when replacing a RAID array by swapping in bigger drives and using grow, you should make the partitions on the bigger drives the size you want in the final RAID array. I.e., if you are replacing 300 GB drives with 750 GB drives and want the whole drive used, make the partition on the 750 GB drive 750 GB with a tiny bit of space at the end. I just wasted 10 hours by making the partitions 300 GB... lame.
29-Sep-2007 18:20
From what I read elsewhere, it is now possible to add a disk to a RAID array!

09-Sep-2007 02:10
Very useful guide. I have run it on my Fedora 6 server and found some errors(?).
15-Aug-2007 13:21
Tried the pure LVM method as mentioned earlier. It's not as robust under Linux. As far as I know it has no ability to rebuild the array, and if a drive fails, the system will not come back up cleanly because a PV backing the LV is missing. I like mdadm.

24-Jul-2007 04:41
I have a problem. I set up a RAID-1 using Fedora Core 4. Then I tried to update the installation to Fedora 7, but it failed. I ended up installing Fedora 7 on my main disk and just left the RAID disks alone during the installation. Is there a way I can get the RAID up and running again with the data that is on them?
15-May-2007 00:36
Excellent and superb tutorial.

20-Apr-2007 07:33
Oh, I should also mention that when doing any steps that involve modifying the RAID array (mdadm --create, mdadm --grow, or mdadm -a), you must WAIT even though the command returned right away. Rest assured, the RAID subsystem is churning in the background.
19-Apr-2007 16:26
Just to recap the info thus far:

23-Mar-2007 01:12
I agree with Nevyn, sort of, but technically I think that Splinter's/Slashdot's method below would work and keep data fault-tolerant. This is because he is putting each of the 10 RAID arrays across 3 disks. If one disk goes down, all 10 RAIDs go down, but they are each recoverable if you did them right.
28-Feb-2007 07:10
I'm a tad concerned by people talking about partitioning off their hard drives. This would make the redundancy pointless, wouldn't it? I.e., if one drive goes down, suddenly you've lost several partitions, when RAID 5 only has tolerance for one drive (or partition) to go down.

02-Dec-2006 02:39
Having a lot of experience with mdadm (even RAID 10 and 50 over 16 disks), I have thought about doing this but am going to experiment with a straight LVM "raid" and not use mdadm underneath, as LVM supports multiple disks and striping, although what it may lack is the ability to rebuild (see: http://tldp.org/HOWTO/LVM-HOWTO/recipethreescsistripe.html).
07-Aug-2006 09:25
For those now reading this, you can actually expand a RAID 5 array with mdadm for software-based RAID in Linux. At least as of kernel 2.6.17 and mdadm 2.5.2 it is possible. One easy way to test this is to create 4 files with dd; I created 4 100MB files named raidfile[1-4]. Next, use losetup to set them up as loopback devices; I used loop[0-3]. To create the initial array do:

07-Aug-2006 01:23
You can have mixed drive sizes with RAID 5. The trick is to partition the drives into smaller partitions that can be matched in threes (or pairs for RAID-1). When adding a drive to expand space, it may be necessary to move one or more partitions to the new drive. Simply add a same-size partition on the new drive as a hot spare to the md device, then 'fail' the partition to be moved. When the sync is done, remove the "failed" partition from that md - it is now free to use with another md.
11-Jun-2006 01:39
I have a 3ware 9550SX controller that has online capacity expansion capabilities. I am also using it in a RAID-5 configuration. I stumbled upon your page looking for answers on how to make Linux see the new hard drive I just added to the array, and you said it can't be done :(

08-Jun-2006 12:53
Thanks Fake Rake, so to truly take advantage of RAID-5 you have to have separate drives for each part, but the limiting part is expansion, correct? If I start with a 300GB HD I can only add 300GB HDs, and the way to add one to the RAID array without rebuilding it is to use EVMS, correct?
08-Jun-2006 01:15
Michael,

07-Jun-2006 12:11
This is in regards to Splinter's suggestion of splitting the HDs up into smaller bits and then putting them together to expand the RAID.
04-Jun-2006 17:47
Adding a small comment for the pvmove info. Kind of for my own benefit, as I come back to the page to remind myself how to do things with raid/lvm. Anyhow...

30-May-2006 13:58
Excellent guide, excellent explanations of each step. Thank you for providing this resource.
27-May-2006 02:19
Actually, there is a cool trick to be able to extend a raid/lvm scheme (I got this from Slashdot):

24-May-2006 01:28
To expand on what an earlier commenter said, you can actually expand your RAID-5 provided you use EVMS to do it. Midway down the EVMS FAQ page, they have a somewhat hackish way of doing so. The relevant Q&A is:
09-Apr-2006 12:50
Excellent tutorial! *REALLY* helped. But I must agree with the previous comment that keeping the Logical Volume and Volume Group terms straight is important, and in building LVs on my test system (using your tutorial) I have learned to name Volume Groups vg01 or vg02 and Logical Volumes lv01 or lv02 so that there is no confusion with the commands. But THANK YOU for taking the time to build this page! WONDERFUL stuff!!!

26-Jan-2006 09:02
@author:
15-Jan-2006 17:47
> The answer to how to expand a RAID-5 array is very simple: You can't.

03-Jan-2006 03:50
Everything has gone fine for me until the creation of the file system. I have set up a RAID 5 array with four 400GB disks: sda, sdb, sdc, sdd. FC4 is running off sdf. The logical volume gets created and looks fine. When I try mkfs.ext3, it runs until it's about 25% finished, and then the whole system crashes. After I hit the reset button and reboot, the system spends about 3 hours resyncing md0. I have tried this about 5 times now. How do I go about debugging? Can I make a smaller logical volume first and then expand it?
13-Dec-2005 22:43
Everything went great. Killer notes. Suggestions:

23-Oct-2005 15:51
Great tutorial, simple and effective. Thanks.

22-Oct-2005 07:20
Thanks again! Your info is very readable and all the steps are to the point. It's probably exactly the amount of information you need to set it up the first time, and it has the instructions for how to handle disk failures. The man pages I've seen are missing that.