valence

the capacity of one person or thing to react with or affect another in some special way, as by attraction or the facilitation of a function or activity.

Mirrored Drive Failure #2 – Win Server 2003

Posted on July 19, 2011

I am on a roll with failed mirrored drives lately. I am currently fixing a friend's failed mirror set on a Windows Server 2003 box, after last week's in-house Ubuntu software RAID 1 failure.

 

The phone call

The system volume on the primary drive failed due to read errors. After that they could not get the OS to load, even after restarting the system. Selecting the default ‘windows 2003’ boot option just put them into a boot loop. This is when they involved me by way of a phone call.

The system is several years old, running Server 2003 Standard on a single 3 GHz P4 with 2 GB of RAM and an ASUS motherboard in an Antec case. The mirrored drives are 80 GB SATA drives. Good drives in the day. Software RAID 1 mirroring.

Talking with them on the phone, I asked them to choose the ‘mirror – secondary plex’ boot option, but this just locked up the system partway into the boot. I was afraid that whatever had messed up the primary system dynamic volume had been copied to the mirror drive, so I made arrangements to stop by after finishing the job I was currently at.

 

First Look

Looking at the server ‘in situ’ I noticed that the box was infested with dust bunnies – but didn’t notice any unusual noises – though it is located adjacent to several other pieces of equipment that are fairly loud.

So I shut it down and took it out to give it a quick cleaning. Just enough to remove the bunnies and visually inspect the interior of the box for stuck fans, loose cables, etc.

Reassembled and attempted a ‘default’, don’t touch anything, boot. No luck: the BIOS failed to recognize the primary boot drive.

Shut down again and checked all the drive cables – removed and reinstalled.

 

Backups?

Took a minute to check with my friend to see if he still had the disk image that we had made of his system volume for insurance – and he did. We also took the time to check the backups from the night before of all the data. Looked good as well. It always feels good at a time like this to know that if all else fails we can restore the system volume from the image file and then restore all of the data from the backups.

 

Boot the system

This time the BIOS recognized the drive, but the system would not boot from the default option. Rebooted and chose the ‘secondary plex’ option.

Booted into Windows Server, logged in and ran compmgmt.msc /s from the Run dialog.

In disk management I took a look at the drives. The data volume was re-syncing and the system volume was online with errors – failed redundancy status. Hmm.
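The same status check can be done from the command line: Server 2003 ships with diskpart, and its `list volume` command prints a fixed-width table of volumes with a Status column. Here is a rough Python sketch that parses a saved copy of that output and picks out mirrors that are not healthy. The sample text is made up for illustration (only its layout follows diskpart), the function names are my own, and as far as I know "Failed Rd" is diskpart's abbreviation for Failed Redundancy and "Rebuild" is what it shows while a mirror is resyncing.

```python
import re

# Illustrative sample only: the layout follows diskpart's "list volume" table,
# but the volumes, labels and sizes are invented for this example.
SAMPLE = """\
  Volume ###  Ltr  Label        Fs     Type        Size     Status     Info
  ----------  ---  -----------  -----  ----------  -------  ---------  --------
  Volume 0     C   System       NTFS   Mirror        20 GB  Failed Rd  System
  Volume 1     D   Data         NTFS   Mirror        54 GB  Rebuild
  Volume 2     E                       DVD-ROM         0 B  No Media
"""


def parse_volumes(text):
    """Split a fixed-width 'list volume' table into one dict per volume row."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # The row of dashes under the header tells us where each column starts and ends.
    ruler_index = None
    for i, ln in enumerate(lines):
        if i > 0 and set(ln.strip()) <= {"-", " "}:
            ruler_index = i
            break
    if ruler_index is None:
        return []
    header, ruler = lines[ruler_index - 1], lines[ruler_index]
    spans = [m.span() for m in re.finditer(r"-+", ruler)]
    names = [header[a:b].strip() for a, b in spans]
    rows = []
    for ln in lines[ruler_index + 1:]:
        # diskpart marks the selected volume with a leading "*"; ignore it.
        if not ln.lstrip("* ").startswith("Volume"):
            continue
        rows.append({name: ln[a:b].strip() for name, (a, b) in zip(names, spans)})
    return rows


def degraded_mirrors(rows):
    """Return mirrored volumes whose status is anything other than Healthy."""
    return [r for r in rows if r["Type"] == "Mirror" and r["Status"] != "Healthy"]


if __name__ == "__main__":
    for vol in degraded_mirrors(parse_volumes(SAMPLE)):
        print(vol["Ltr"], vol["Label"], vol["Status"])
```

Reading the column positions off the row of dashes, instead of splitting on whitespace, keeps empty Label and Fs cells and two-word statuses like "Failed Rd" in the right columns.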

I waited for the re-syncing volume to finish (because I am paranoid) and took the opportunity to take a look at the event viewer – run->eventvwr.msc

 

Check Event Viewer

I read through the errors and decided that the drive should probably be replaced just to be on the safe side, even if we could bring it back online and repair it.

 

Don’t remove the Mirror! or even break it – yet.

Do not remove the mirror. That will wipe out the shadow drive. This is bad.

Do not break the mirror either. If you break the mirror now (while both drives are in the computer), the dynamic volumes on the second (shadow) drive will be assigned new drive letters. This will mess with the ability to boot off of that drive at a later date, and possibly even with rebuilding the RAID. I suspect this has something to do with the LDM (Logical Disk Manager) database used by dynamic disks to track volume types, drive letters, etc. If anyone knows the answer to this, let me know.

It is also related to the fact that the paging file, as far as the registry on that drive is concerned, is located on a drive letter that no longer exists…ouch. This can cause a vicious cycle of ‘enter your login name and password’ because there is no virtual memory.

For some more info on this check out http://support.microsoft.com/kb/249321

Another support doc you might want to look at if you inadvertently break your mirror before you remove the bad drive – http://support.microsoft.com/kb/223188

 

Why is it so complicated? I know, stop whining and get back to work.

For more information on Dynamic disks you can check out http://support.microsoft.com/kb/816307 

There is an interesting paragraph there (well, more than one, but this one is relevant to our conversation):

 

Missing dynamic disks

If Disk Management shows a missing dynamic disk, this means that a dynamic disk that was attached to the system cannot be located. Because every dynamic disk in the system knows about every other dynamic disk, this “missing” disk is shown in Disk Management. Do not delete the missing disk’s volumes or select the Remove Disk option in Disk Management unless you intentionally removed the physical disk from the system and you do not intend to ever reattach it. This is important because after you delete the disk and volume records from the remaining dynamic disk’s LDM database, you may not be able to import the missing disk and bring it back online on the same system after you reattach it.

 

Remove problem drive

After the data volume finished its job syncing I shut down the server and removed the problem hard drive. I then installed the replacement hard drive and rebooted.

After logging in I returned to disk management and deleted the failed drive, then converted the newly installed drive to dynamic by right-clicking on the disk and selecting Convert to Dynamic Disk.

 

Re-enable the mirror

Once the new drive is dynamic as opposed to basic (a fast process), right-click each volume on the old drive and choose Add Mirror, picking the new drive as the target.

Now the syncing will take a while. Be patient. Go have lunch, dinner, a cup of coffee or if you prefer, a beer. You deserve it.
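If you would rather script the last two steps than click through Disk Management, diskpart can do both: `convert dynamic` on the selected disk, then `add disk=N` on each selected volume to mirror it onto disk N. Below is a minimal Python sketch that only builds the script text; it does not run anything. The disk number 1 and the drive letters C and D in the usage example are assumptions for illustration (check yours with `list disk` and `list volume` first), and the function name is my own.

```python
def build_remirror_script(new_disk, volume_letters):
    """Build diskpart script text that converts the new disk to dynamic and
    adds a mirror of each listed volume onto it. Nothing is executed here."""
    if not isinstance(new_disk, int) or new_disk < 0:
        raise ValueError("new_disk must be a non-negative disk number")
    lines = [
        "select disk {0}".format(new_disk),
        "convert dynamic",
    ]
    for letter in volume_letters:
        letter = letter.strip().rstrip(":").upper()
        if len(letter) != 1 or not letter.isalpha():
            raise ValueError("bad drive letter: {0!r}".format(letter))
        # Give the existing volume focus, then mirror it onto the new disk.
        lines.append("select volume {0}".format(letter))
        lines.append("add disk={0}".format(new_disk))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical layout: replacement drive is disk 1, mirrored volumes are C and D.
    print(build_remirror_script(1, ["C", "D"]))
```

Save the output to a text file, read it over, and only then feed it to `diskpart /s` with that file name. The resync still takes just as long either way.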


  • About

    This website is supported by Ken Lombardi @ analogman consulting.
    phone: 253.two.two.two-7626
    email: ken@analogman'dot'org
    tweet: analogmanorg