During the Boxing Day sales event of 2017, I purchased a couple of Seagate Barracuda ST4000DM004 (4TB) hard drives. The intention was to expand our main home network storage, a network-attached storage (NAS) setup managed by our Linux server using mdadm and the Logical Volume Manager (LVM).
However, I procrastinated until this weekend, when I finally performed the upgrade. The task went smoothly without any hiccups, and I have to give due credit to the following site: https://raid.wiki.kernel.org/index.php/Growing. It provided very detailed information for me to follow and was a great help.
My major concerns were whether I could do this without any data loss and how much downtime the upgrade would require.
I had a logical volume, named /dev/airvideovg2/airvideo, occupying 100% of a volume group that was made up of three RAID-1 multiple devices (md). Since I had run out of physical drive bays, performing the upgrade meant replacing two older 2TB drives, Western Digital WDC WD20EZRX-00D8PB0, with the newer 4TB drives. The old drives can be reused for other needs.
First I had to find the md device containing the 2TB pair in a RAID-1 (mirror) configuration. I did this with a combination of the lsblk and mdadm commands. For example:
$ lsblk
NAME                       MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
sdb                          8:16   0  1.8T  0 disk
└─sdb1                       8:17   0  1.8T  0 part
  └─md2                      9:2    0  1.8T  0 raid1
    └─airvideovg2-airvideo 252:0    0   10T  0 lvm   /mnt/airvideo
$ sudo mdadm --detail /dev/md2
/dev/md2:
        Version : 1.2
  Creation Time : Sat Nov 12 18:01:36 2016
     Raid Level : raid1
     Array Size : 1906885632 (1725.90 GiB 2000.65 GB)
  Used Dev Size : 1906885632 (1725.90 GiB 2000.65 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Sun Mar 11 09:12:05 2018
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           Name : avs:2  (local to host avs)
           UUID : 3d1afb64:878574e6:f9beb686:00eebdd5
         Events : 55191

    Number   Major   Minor   RaidDevice State
       2       8       81        0      active sync   /dev/sdf1
       3       8       17        1      active sync   /dev/sdb1
I found that I needed to replace /dev/md2, which consisted of the /dev/sdf1 and /dev/sdb1 partitions, both residing on the old WD drives. There are 6 hard drives in the server chassis, so I needed the serial numbers of the drives to ensure that I swapped the right ones. I used the hdparm command to get the serial numbers of /dev/sdb and /dev/sdf. For example:
$ sudo hdparm -i /dev/sdb
/dev/sdb:
Model=WDC WD20EZRX-00D8PB0, FwRev=0001, SerialNo=WD-WCC4M4UDRZLD
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=0
BuffType=unknown, BuffSize=unknown, MaxMultSect=16, MultSect=off
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=7814037168
IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
AdvancedPM=no WriteCache=enabled
Drive conforms to: unknown: ATA/ATAPI-4,5,6,7
* signifies the current active mode
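As an aside, the model and serial numbers of all drives can be listed in one shot with a reasonably recent lsblk (the -d flag skips the partitions):
sudo lsblk -d -o NAME,MODEL,SERIAL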
Before physically replacing a drive, I first had to remove it from the md device by doing the following.
sudo mdadm -f /dev/md2 /dev/sdb1    # mark the partition as faulty
sudo mdadm -r /dev/md2 /dev/sdb1    # remove it from the array
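At this point the array was running degraded on the remaining drive, which /proc/mdstat can confirm before powering down (a raid1 with one missing member shows something like [2/1] [U_]):
cat /proc/mdstat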
After swapping out only one of the old drives and replacing it with a new one, I rebooted the machine. I first had to partition the new drive using the parted command.
sudo parted /dev/sdb
Once within parted, I executed the following commands to create a single partition using the whole drive.
mklabel gpt
mkpart primary 2048s 100%
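Alternatively, the same partitioning can be scripted from the shell, and parted can verify the alignment afterwards (the -s flag simply runs the commands non-interactively):
sudo parted -s /dev/sdb mklabel gpt mkpart primary 2048s 100%
sudo parted /dev/sdb align-check opt 1    # should report "1 aligned" when partition 1 is optimally aligned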
I learned that the partition should start at sector 2048 to achieve optimal alignment. Once I had created the /dev/sdb1 partition, I added it back into the RAID.
sudo mdadm --add /dev/md2 /dev/sdb1
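As soon as the new partition was added into the RAID, a drive resynchronization began automatically. The rebuild progress can be followed from /proc/mdstat, for example:
cat /proc/mdstat
watch -n 60 cat /proc/mdstat    # or refresh the view every minute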
The resync took more than 3 hours. Once it completed, I did the same thing with the remaining old drive in the RAID and performed another resync, this time copying from the first new drive to the second new drive. After another 3+ hours, it was finally time to grow the RAID device.
sudo mdadm --grow /dev/md2 --bitmap none        # remove the internal write-intent bitmap first
sudo mdadm --grow /dev/md2 --size max           # grow the array to the maximum available size
sudo mdadm --wait /dev/md2                      # wait for the resulting resync to finish
sudo mdadm --grow /dev/md2 --bitmap internal    # restore the internal bitmap
The third command (--wait) took another 3+ hours to complete. After a full day of RAID resyncs, I now had the same /dev/md2, but 4TB instead of 2TB in size. However, the corresponding LVM physical volume still needed to be resized. I did this with:
sudo pvresize /dev/md2
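The resize takes effect immediately, and the new physical volume and volume group sizes can be double-checked with LVM's reporting commands:
sudo pvs /dev/md2
sudo vgs airvideovg2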
Once the physical volume was resized, I could extend the logical volume to use up the free space that was just added.
sudo lvresize -l +100%FREE /dev/airvideovg2/airvideo
This took a few minutes, but at least it was not another 3+ hours.
Aside from the downtime needed to physically swap the hard drives, the logical volume remained usable throughout the resynchronization and resizing process, which was impressive. However, I then had to take the volume offline by first unmounting it and changing it to inactive status.
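The unmount itself is just the following (the mount point is the /mnt/airvideo shown in the lsblk output earlier):
sudo umount /mnt/airvideo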
sudo lvchange -an /dev/airvideovg2/airvideo
Note that I had to stop smbd and the other services that were using the volume before I could unmount it.
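On a systemd-based system that looks something like the following; fuser is handy for spotting anything still holding the mount open (the exact list of services to stop is specific to my setup):
sudo systemctl stop smbd
sudo fuser -vm /mnt/airvideo    # list any processes still using the mount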
The last step was to resize the file system of the logical volume, but before I could do that, I was forced to perform a file system check.
sudo e2fsck -f /dev/airvideovg2/airvideo       # forced file system check, required before resizing
sudo resize2fs -p /dev/airvideovg2/airvideo    # grow the file system to fill the logical volume
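Once the resize finished, the volume can be reactivated and remounted; a minimal sketch, assuming /mnt/airvideo has an entry in /etc/fstab:
sudo lvchange -ay /dev/airvideovg2/airvideo
sudo mount /mnt/airvideo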
I rebooted the machine to ensure all the new configuration held, and voila, I had upgraded my 8TB network attached storage to 10TB! Okay, it was not exactly trivial, but the upgrade process was smooth and painless, and the downtime was minimal. The LVM and mdadm folks did a really good job here.