3ware raid resize via pivot_root

Posted on September 13, 2010.

How `pivot_root` won against a stupid RAID controller

This post is brought to you by a very stupid 3ware 9650SE RAID controller, which doesn’t understand the concept of growing a RAID array after all its disks have been replaced with bigger ones. Or, well, it can, but the procedure, which involves sending array dumps to technical support and then receiving a program to run under DOS (!), doesn’t inspire me with confidence. Furthermore, this stupid RAID card doesn’t allow easy reordering of the RAID arrays at runtime (if it’s possible and I just missed it, please let me know), only from the BIOS screen, and I didn’t have access to that (only a poor man’s serial console under Linux).

So, after buying new hard drives with double the capacity and swapping them in for the original ones in the RAID1 array, I realised I had to do the conversion manually. Basically: go into single-user mode, remount everything read-only, split the RAID1 array into its (identical) component drives, change the root directory to the second disk, re-partition the first disk, copy the data from the second disk back to the first, then re-add the second disk to the (now repartitioned) first one to re-establish the RAID1.

All this on a headless machine (no keyboard, no VGA), with just a serial console. Of course, if this were a local machine, you could simply reboot from external media and do all of it from a “full” user-space. In any case, I’m writing this down, as there seem to be no easily findable guides on how to do such a resize on a 3ware with its restrictions.

Procedure

Disclaimer: you might lose data doing this; there are no guarantees. Have up-to-date backups.

First, make sure you have power control (you can power-cycle the box if needed) and the serial console works.

Also, make absolutely sure that the RAID array is fully consistent, and that you don’t have any bad sectors on the drives, as you will lose redundancy during the operation.

If /usr is not on your root file-system, you need to copy /usr/sbin/chroot to somewhere on it. Likewise, if you have tools for managing the array that are not on the root file-system, copy them over now; in my case, that was tw_cli.
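
For example (the destination is illustrative; any directory on the root file-system will do):

# cp /usr/sbin/chroot /sbin/
# cp /path/to/tw_cli /sbin/      # adjust to wherever tw_cli is installed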

First reboot

Reboot with init=/bin/sh on the kernel command line.
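
With lilo (which this machine used) on a serial console, that means appending the parameter to your image label at the boot prompt; the label here is illustrative:

boot: Linux init=/bin/sh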

Note: I’ve tried simply going into single-user mode, or rebooting with emergency, but neither works: in both modes, besides the init process with PID 1 and your shell, there’s another init process (the parent of the shell), and I didn’t know how to tell it to re-execute itself. For the main init process it’s easy (you just run telinit u), but that child keeps the old root directory in use. If it’s possible to re-exec that child too, then this step can be skipped; please let me know how!

Changing the root mount

Make sure everything is mounted read-only (it should be), then split the RAID array into its component drives, using whatever tool the card needs. For me, it was:

# tw_cli /c0/u0 migrate type=single
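
As for the read-only part, a quick way to verify (and, if needed, enforce) it before splitting:

# cat /proc/mounts               # every local file-system should show "ro"
# mount -o remount,ro /          # if the root is not, remount it read-only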

The second drive should now be visible, e.g. as /dev/sdc; mount it somewhere and pivot_root to it:

# mount -o ro /dev/sdcX /mnt
# cd /mnt/
# pivot_root . mnt/
# exec chroot .

At this point, the new root (some partition on /dev/sdc) is mounted at /, and the exec call has replaced the current shell with another one that has the right root directory.

So nothing should be keeping the old root directory (on /dev/sdaX) in use, and umount /mnt (or umount /dev/sdaX) should now succeed.
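
If the umount fails, something still has the old root open; fuser (assuming it is available on the root file-system) can show what:

# fuser -vm /mnt                 # list any processes still using the old root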

Recreate the first raid array

Now, since it’s no longer in use, you can delete and recreate the first RAID array; recreating it is what finally exposes the full capacity of the new, bigger drive.
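
With tw_cli, that should look roughly like this (unit and port numbers are illustrative; double-check the exact syntax against the card’s CLI guide): delete the now-unused unit on the first drive, then recreate it as a single-disk unit spanning the whole disk.

# tw_cli
/> /c0/u0 delete
/> /c0 add type=single disk=0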

Once it’s available, you have two choices:

  • either do a bit-for-bit copy from /dev/sdc to /dev/sda using dd, then run sfdisk -R /dev/sda to make the kernel re-read the changed partition table on the target drive (see the sketch below the list)
  • or repartition /dev/sda if you also want to change the partition layout
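
For the dd route, a minimal sketch (device names as in this example; triple-check which disk is the source and which the target):

# dd if=/dev/sdc of=/dev/sda bs=1M   # whole-disk, bit-for-bit copy
# sfdisk -R /dev/sda                 # make the kernel re-read sda's partition table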

I did the latter, since I wanted to switch to the new alignment standards (the initial installation was still using sector 63 for the boot partition, etc.) in preparation for future SSDs or 4K-sector drives. So I used:

  • fdisk -cu /dev/sda, and created partitions accordingly
  • then re-did my physical volume for LVM, but with --dataalignment 1024k, to have the same megabyte alignment (yes, I went a little overboard)

In any case, either copy the entire disk, or each partition in turn, or possibly pvmove from the old to the new disk, etc.
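
For the LVM variant, the usual sequence is something like this (the volume group name and partition numbers are made up for illustration):

# pvcreate --dataalignment 1024k /dev/sda2   # new, MiB-aligned physical volume
# vgextend vg0 /dev/sda2                     # add it to the volume group
# pvmove /dev/sdc2 /dev/sda2                 # migrate all extents to the new PV
# vgreduce vg0 /dev/sdc2                     # then drop the old PV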

Switch back the root mount

At this point, /dev/sdc is no longer needed, and you can change the root directory back. This is similar to the first pivot_root, but now with /dev/sda as the target. All the instructions are the same.
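
In other words (same device letters as in this example):

# mount -o ro /dev/sdaX /mnt
# cd /mnt/
# pivot_root . mnt/
# exec chroot .
# umount /mnt                    # this time releasing /dev/sdcX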

Convert back to RAID1

Now, since (in our example) /dev/sdc is no longer in use, you can delete its unit and join the drive back to the first one, recreating the RAID1 array. In my case:

# tw_cli
/> /c0/u2 delete
blah blah [y/n] y
/> /c0/u0 migrate type=raid1 disk=X

At this point, the machine is again redundant (or will be once the resync finishes), but it still needs one more step or it won’t boot correctly.

Reinstall the boot-loader

Now, to reinstall the boot loader, you’ll probably need to mount /boot (read-write) and do whatever your setup requires. In my case, I re-ran lilo, but grub-install /dev/sda should work just as well.
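
For example (the /boot device is illustrative):

# mount -o rw /dev/sda1 /boot    # or remount,rw if it is already mounted
# lilo                           # or: grub-install /dev/sda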

Final reboot

And now, you only need to reboot into the normal, resized system. Since you can’t easily use shutdown when booted with /bin/sh as init, just use /proc/sysrq-trigger to sync/unmount/reboot.
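
A minimal sequence via magic SysRq:

# mount -t proc proc /proc       # only if /proc is not mounted yet
# echo s > /proc/sysrq-trigger   # sync all file-systems
# echo u > /proc/sysrq-trigger   # remount everything read-only
# echo b > /proc/sysrq-trigger   # reboot immediately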

Thoughts

While I use a hardware RAID controller for the battery backup it offers, I would never have thought it couldn’t do such a simple operation (compare with mdadm --grow /dev/mdX --size=max).

Second, I knew about pivot_root but had never used it before. It’s a very, very nice thing; the only downside is that you can’t really use it with more than a few processes around, since it’s hard (or impossible) to change an arbitrary process’s root directory. But it works, and it’s damn handy for doing things to the root file-system, or to the root drive in general (e.g. repartitioning it), when you can’t boot from other media (e.g. on a remote machine).