Hi all,
Yesterday one of the HDDs in an F19 (yes, I know) server died. I found this out when the server would not reboot (45 miles away from where I was, too :( )
The only way the server would boot was with the failed drive disconnected; it was the one connected to ATA0, /dev/sdb.
I have now replaced the drive with another of the same model and am about to set about rebuilding the RAID1 setup. However, every page that Google gives me regarding this shows using fdisk to sort out the partition table prior to starting.
The niggle I've got is that because these are all 3TB drives I can't use fdisk. If I use fdisk on one of the existing drives I get:
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1  4294967295  2147483647+  ee  GPT
Partition 1 does not start on physical sector boundary.
My question is, what do I need to do to set up the new drive, without using fdisk, so that I can then start rebuilding the 3 RAID1 devices on it?
If anyone has a link to a full set of instructions for restoring a large HDD I'd appreciate it
On Thu, Mar 15, 2018 at 12:30:14PM +0000, Gary Stainburn wrote:
Hi all,
Yesterday one of the HDDs in an F19 (yes, I know) server died. I found this out when the server would not reboot (45 miles away from where I was, too :( )
The only way the server would boot was with the failed drive disconnected; it was the one connected to ATA0, /dev/sdb.
I have now replaced the drive with another of the same model and am about to set about rebuilding the RAID1 setup. However, every page that Google gives me regarding this shows using fdisk to sort out the partition table prior to starting.
The niggle I've got is that because these are all 3TB drives I can't use fdisk. If I use fdisk on one of the existing drives I get:
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1  4294967295  2147483647+  ee  GPT
Partition 1 does not start on physical sector boundary.
My question is, what do I need to do to set up the new drive, without using fdisk, so that I can then start rebuilding the 3 RAID1 devices on it?
If anyone has a link to a full set of instructions for restoring a large HDD I'd appreciate it
Hi Gary,
You should use parted instead of fdisk to create the GPT and its partitions. It should be able to handle larger partitions than fdisk.
Are you using mdadm for the RAID1 array?
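A minimal sketch of what that looks like with parted (the single data partition below is only an illustration, not the actual layout, and the new disk is assumed to be /dev/sdb):

parted --script /dev/sdb mklabel gpt                 # new, empty GPT
parted --script /dev/sdb mkpart primary 1MiB 100%
parted --script /dev/sdb set 1 raid on               # mark it for Linux software RAID
parted --script /dev/sdb print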
On Thursday 15 March 2018 13:59:39 Juan Martinez wrote:
Hi Gary,
You should use parted instead of fdisk to create the GPT and its partitions. It should be able to handle larger partitions than fdisk.
Are you using mdadm for the RAID1 array?
I ended up using the following to clone the GPT table, and checked using parted; it looked correct (sda and sdb match).
Yes I am using mdadm to control the RAID1 array. From all of the pages that I have found, it should now be a case of going through each of my 'md' devices and removing the dead partition and adding a new one. What I'm not sure about is how to remove the 'removed' entries, or whether they'll just disappear once I add the replacement.
For example I have:
[root@lou log]# mdadm --detail /dev/md124
/dev/md124:
        Version : 1.2
  Creation Time : Thu Jun 5 11:16:44 2014
     Raid Level : raid1
     Array Size : 2770227008 (2641.89 GiB 2836.71 GB)
  Used Dev Size : 2770227008 (2641.89 GiB 2836.71 GB)
   Raid Devices : 2
  Total Devices : 1
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Wed Mar 14 21:37:24 2018
          State : active, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           Name : var_bacula
           UUID : 2d9ba248:b6d1236a:cf9ebd49:918bad94
         Events : 1275274

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       0        0        1      removed
[root@lou log]#
From everything I know, I think that to add the replacement device I should enter
mdadm /dev/md124 --manage --add /dev/sdb2
since the two drives have identical partition tables, so sdb2 would match the existing /dev/sda2.
Am I correct, and will that replace the 'removed' line?
Presumably, I then simply follow the same rule for my remaining md devices?
/dev/md125 /dev/md126 /dev/md127
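For what it's worth, a minimal sketch of that step (assuming the sda2/sdb2 pairing above is right; the 'removed' slot is normally just taken over once the new member is added):

mdadm /dev/md124 --manage --add /dev/sdb2    # new member takes the empty slot and resyncs
mdadm --detail /dev/md124                    # should now list sdb2 as 'spare rebuilding'
# If a stale 'removed' entry ever lingers, 'mdadm /dev/md124 --remove detached' is
# reported to clear it, but check the mdadm man page before relying on that.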
Would have been good if I'd actually pasted the commands:
sgdisk /dev/sda -R /dev/sdb
sgdisk -G /dev/sdb
On 3/15/2018 10:28 AM, Gary Stainburn wrote:
On Thursday 15 March 2018 13:59:39 Juan Martinez wrote:
Hi Gary,
You should use parted instead of fdisk to create the GPT and its partitions. It should be able to handle larger partitions than fdisk.
Are you using mdadm for the RAID1 array?
I ended up using the following to clone the GPT table, and checked using parted; it looked correct (sda and sdb match).
Yes I am using mdadm to control the RAID1 array. From all of the pages that I have found, it should now be a case of going through each of my 'md' devices and removing the dead partition and adding a new one. What I'm not sure about is how to remove the 'removed' entries, or whether they'll just disappear once I add the replacement.
For example I have:
[root@lou log]# mdadm --detail /dev/md124
/dev/md124:
        Version : 1.2
  Creation Time : Thu Jun 5 11:16:44 2014
     Raid Level : raid1
     Array Size : 2770227008 (2641.89 GiB 2836.71 GB)
  Used Dev Size : 2770227008 (2641.89 GiB 2836.71 GB)
   Raid Devices : 2
  Total Devices : 1
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Wed Mar 14 21:37:24 2018
          State : active, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           Name : var_bacula
           UUID : 2d9ba248:b6d1236a:cf9ebd49:918bad94
         Events : 1275274

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       0        0        1      removed
[root@lou log]#
From everything I know, I think that to add the replacement device I should enter
mdadm /dev/md124 --manage --add /dev/sdb2
since the two drives have identical partition tables, so sdb2 would match the existing /dev/sda2.
Am I correct, and will that replace the 'removed' line?
Presumably, I then simply follow the same rule for my remaining md devices?
/dev/md125 /dev/md126 /dev/md127
If that doesn't work try: 'mdadm /dev/md124 --re-add /dev/sdb2'.
Bill
On Thursday 15 March 2018 14:50:07 Bill Shirley wrote:
If that doesn't work try: 'mdadm /dev/md124 --re-add /dev/sdb2'.
Bill
Thanks for this, Bill, but I did
sgdisk /dev/sda -R /dev/sdb
sgdisk -G /dev/sdb
followed by
mdadm /dev/md124 --manage --add /dev/sdb2
mdadm /dev/md125 --manage --add /dev/sdb3
mdadm /dev/md126 --manage --add /dev/sdb4
mdadm /dev/md127 --manage --add /dev/sdb5
I now have /dev/md124 syncing, with the others showing resync=DELAYED.
Hopefully that will see me with a fully working system again, in 6 hours.
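For monitoring the rebuilds, the standard md tools apply (nothing below is specific to this box):

cat /proc/mdstat                                       # progress, speed, and which arrays are DELAYED
watch -n 60 cat /proc/mdstat                           # refresh every minute
mdadm --detail /dev/md124 | grep -iE 'state|rebuild'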
Part of the reason that this recovery seems to have gone well is that the system booted up and gave me access to everything.
How would I have been able to complete the recovery if it had been the boot device that had failed?
From what I understand, and from past experience when software RAID1 setups have failed, it isn't possible to boot using the second drive.
Ideally, in that situation I would want to make the second drive the first drive, add a new blank drive, boot, and then complete the exercise above.
What do I need to do to make the second drive bootable?
Both sda and sdb are currently identical.
/ is /dev/md125 which is sda3 and sdb3
/boot is /dev/md126 which is sda4 and sdb4
/boot/efi is only on /dev/sda1
Would copying /dev/sda1 to /dev/sdb1 using dd be enough to make the second drive bootable if the first drive fails?
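If a straight copy does turn out to be enough, a minimal sketch might look like this; the efibootmgr step and the '\EFI\fedora\shimx64.efi' loader path are assumptions about this install (check /boot/efi/EFI/fedora/ for the real file name), not something confirmed here:

dd if=/dev/sda1 of=/dev/sdb1 bs=1M conv=fsync    # clone the small vfat ESP onto the new drive

# Optionally register a second UEFI boot entry pointing at sdb's copy of the ESP
efibootmgr -c -d /dev/sdb -p 1 -L "Fedora (sdb)" -l '\EFI\fedora\shimx64.efi'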
Allegedly, on or about 15 March 2018, Gary Stainburn sent:
From what I understand, and from past experience when software RAID1 setups have failed, it isn't possible to boot using the second drive.
Ideally, in that situation I would want to make the second drive the first drive, add a new blank drive, boot, and then complete the exercise above.
I thought that with RAID1 being "all drives identical," and unless you were using yet another drive to boot from (separate from your RAID), each drive would have a boot partition on it. Following that train of thought, if your controller didn't let you boot from a different drive (which seems a serious shortcoming, to me), wouldn't it be possible to just unplug the drives and put your still-working one into the first slot?
Just a brute force and ignorance approach to the situation...
Seems to me that the idea of mirrored drives is to give an easy way of dealing with drive failures; surely it shouldn't impose complex routines to get past the first drive going bad.
On Thursday 15 March 2018 15:43:49 Tim wrote:
I thought that with RAID1 being "all drives identical," and unless you were using yet another drive to boot from (separate from your RAID), each drive would have a boot partition on it. Following that train of thought, if your controller didn't let you boot from a different drive (which seems a serious shortcoming, to me), wouldn't it be possible to just unplug the drives and put your still-working one into the first slot?
Just a brute force and ignorance approach to the situation...
Seems to me that the idea of mirrored drives is to give an easy way of dealing with drive failures; surely it shouldn't impose complex routines to get past the first drive going bad.
Tim,
That is exactly what I thought until I was in that situation. sda failed so I disconnected it, leaving just sdb connected. It refused to boot. I tried swapping the cable to put it in the same SATA port on the board but it still didn't boot.
Last time was some time ago, and /boot could not be a RAID device. This time /boot is a RAID device, but /boot/efi isn't and is just a vfat partition.
Hence my question. If I just dd this partition from sda to sdb would that then make sdb bootable?
/boot is on /:
[0:root@elmo raid]$ df
Filesystem      Size  Used Avail Use% Mounted on
/dev/md2        1.9T  834G 1021G  45% /
/dev/sdc2       3.6T  2.3T  1.4T  63% /bacula

dos partition table (/dev/sdb is the same):
[0:root@elmo raid]$ fdisk -l /dev/sda
Disk /dev/sda: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x7eb4f1d4

Device     Boot    Start        End    Sectors  Size Id Type
/dev/sda1           2048   16779263   16777216    8G 82 Linux swap / Solaris
/dev/sda2  *    16779264 3907029167 3890249904  1.8T fd Linux raid autodetect
Run:
grub2-install /dev/sda
grub2-install /dev/sdb
Now, you can boot from either drive.
Bill
On 3/15/2018 11:52 AM, Gary Stainburn wrote:
On Thursday 15 March 2018 15:43:49 Tim wrote:
I thought that with RAID1 being "all drives identical," and unless you were using yet another drive to boot from (separate from your RAID), each drive would have a boot partition on it. Following that train of thought, if your controller didn't let you boot from a different drive (which seems a serious shortcoming, to me), wouldn't it be possible to just unplug the drives and put your still-working one into the first slot?
Just a brute force and ignorance approach to the situation...
Seems to me that the idea of mirrored drives is to give an easy way of dealing with drive failures; surely it shouldn't impose complex routines to get past the first drive going bad.
Tim,
That is exactly what I thought until I was in that situation. sda failed so I disconnected it, leaving just sdb connected. It refused to boot. I tried swapping the cable to put it in the same SATA port on the board but it still didn't boot.
Last time was some time ago, and /boot could not be a RAID device. This time /boot is a RAID device, but /boot/efi isn't and is just a vfat partition.
Hence my question. If I just dd this partition from sda to sdb would that then make sdb bootable?
On Thursday 15 March 2018 16:34:26 Bill Shirley wrote:
Run:
grub2-install /dev/sda
grub2-install /dev/sdb
Now, you can boot from either drive.
Bill
Hi Bill, Thanks for this. However, when I try this I get:
[root@lou ~]# grub2-install /dev/sda
/usr/lib/grub/x86_64-efi doesn't exist. Please specify --target or --directory
[root@lou ~]#
A quick google gives the indication that this is for MBR systems, whereas mine is UEFI.
A slower google hasn't given me instructions on how to do this on a UEFI system in a way that I can understand.
Can anyone give the equivalent of Bill's commands above?
To review the situation: I have sda, which I boot from, and a brand new sdb which I use for RAID1. Should sda fail at some point, I want to be able to unplug sda and boot from (what is currently) sdb.
This is what the two drives both look like at the moment:
[root@lou ~]# parted
GNU Parted 3.1
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) print
Model: ATA ST3000DM001-1CH1 (scsi)
Disk /dev/sda: 3001GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system  Name                  Flags
 1      1049kB  53.5MB  52.4MB  fat16        EFI System Partition  boot
 2      53.5MB  2837GB  2837GB                                     raid
 3      2837GB  2942GB  105GB                                      raid
 4      2942GB  2994GB  52.5GB  ext4                               raid
 5      2994GB  3001GB  6296MB                                     raid
 6      3001GB  3001GB  1049kB                                     bios_grub

(parted)
Yeah, I'm not a fan of EFI because:
Both sda and sdb are currently identical.
/ is /dev/md125 which is sda3 and sdb3
/boot is /dev/md126 which is sda4 and sdb4
*/boot/efi is only on /dev/sda1*
and IIUC, can't be raided, which means you have to remember to sync /dev/sda1 to /dev/sdb1 after anything that updates /dev/sda1.
And if you could raid /boot, I don't see a need for it to be a separate partition. My /boot is on /.
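One way to do that manual sync, as a sketch (the /mnt/esp2 mount point is made up for the example):

mkdir -p /mnt/esp2
mount /dev/sdb1 /mnt/esp2
rsync -rt --delete /boot/efi/ /mnt/esp2/    # vfat can't store ownership/permissions, hence -rt rather than -a
umount /mnt/esp2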
I just buy a couple of 2TB (or less) disks where / and swap sit and boot legacy BIOS. If I need more storage, I'll get a couple of 4TB (or larger) disks and partition them:
[0:root@elmo vhosts 2]$ fdisk -l /dev/sdc
Disk /dev/sdc: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 72FCBBCF-6AA6-47F4-8A99-965F773B35CB

Device        Start        End    Sectors  Size Type
/dev/sdc1      2048   31250431   31248384 14.9G Microsoft basic data
/dev/sdc2  31250432 7814035455 7782785024  3.6T Microsoft basic data
... and what's with this 'Microsoft basic data' stuff? /dev/sdc1 is swap and /dev/sdc2 is ext4.
I guess I woke up grumpy this morning. :-)
Bill
On 3/16/2018 5:55 AM, Gary Stainburn wrote:
On Thursday 15 March 2018 16:34:26 Bill Shirley wrote:
Run:
grub2-install /dev/sda
grub2-install /dev/sdb
Now, you can boot from either drive.
Bill
Hi Bill, Thanks for this. However, when I try this I get:
[root@lou ~]# grub2-install /dev/sda
/usr/lib/grub/x86_64-efi doesn't exist. Please specify --target or --directory
[root@lou ~]#
A quick google gives the indication that this is for MBR systems, whereas mine is UEFI.
A slower google hasn't given me instructions on how to do this on a UEFI system in a way that I can understand.
Can anyone give the equivalent of Bill's commands above?
To review the situation: I have sda, which I boot from, and a brand new sdb which I use for RAID1. Should sda fail at some point, I want to be able to unplug sda and boot from (what is currently) sdb.
This is what the two drives both look like at the moment:
[root@lou ~]# parted
GNU Parted 3.1
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) print
Model: ATA ST3000DM001-1CH1 (scsi)
Disk /dev/sda: 3001GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system  Name                  Flags
 1      1049kB  53.5MB  52.4MB  fat16        EFI System Partition  boot
 2      53.5MB  2837GB  2837GB                                     raid
 3      2837GB  2942GB  105GB                                      raid
 4      2942GB  2994GB  52.5GB  ext4                               raid
 5      2994GB  3001GB  6296MB                                     raid
 6      3001GB  3001GB  1049kB                                     bios_grub

(parted)
On 03/16/2018 02:55 AM, Gary Stainburn wrote:
Hi Bill, Thanks for this. However, when I try this I get:
[root@lou ~]# grub2-install /dev/sda
/usr/lib/grub/x86_64-efi doesn't exist. Please specify --target or --directory
[root@lou ~]#
A quick google gives the indication that this is for MBR systems, whereas mine is UEFI.
Right, don't do that on an EFI system.
A slower google hasn't given me instructions on how to do this on a UEFI system in a way that I can understand.
Can anyone give the equivalent of Bill's commands above?
I think it should be possible. You will need to use RAID metadata version 1.0 so that the superblock is not at the front. This should allow the BIOS to read the partition as if it was only fat32 and not raid. I'm assuming you don't have windows on this system so nothing should modify the partition while it's not in raid mode.
Make sure you have a live boot available in case something goes wrong. Make a copy of the files on the EFI partition. Create the raid array using "-e 1.0". Format it as fat32. Copy the files back. Adjust your fstab. Make sure you don't change the partition type or flags in the partition table.
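A rough sketch of those steps (the /dev/md10 array name is a placeholder, and as above, keep a live boot handy in case something goes wrong):

mkdir -p /root/esp-backup
cp -a /boot/efi/. /root/esp-backup/          # save the existing ESP contents
umount /boot/efi
mdadm --create /dev/md10 --level=1 --raid-devices=2 \
      --metadata=1.0 /dev/sda1 /dev/sdb1     # -e 1.0 keeps the superblock at the end
mkfs.vfat -F 32 /dev/md10                    # recreate the fat32 filesystem on the array
mount /dev/md10 /boot/efi
cp -a /root/esp-backup/. /boot/efi/          # restore the files
# Then point the /boot/efi entry in /etc/fstab at the new array (by UUID).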
This is what the two drives both look like at the moment:
Number  Start   End     Size    File system  Name                  Flags
 1      1049kB  53.5MB  52.4MB  fat16        EFI System Partition  boot
 2      53.5MB  2837GB  2837GB                                     raid
 3      2837GB  2942GB  105GB                                      raid
 4      2942GB  2994GB  52.5GB  ext4                               raid
 5      2994GB  3001GB  6296MB                                     raid
 6      3001GB  3001GB  1049kB                                     bios_grub
I thought this partition was only needed when booting a GPT partitioned drive in legacy (non-EFI) mode.