RAID: Benefit or Hazard?


Replicants are like any other machine. They’re either a benefit or a hazard. If they’re a benefit, it’s not my problem.

Deckard, Blade Runner

A Software Update

The Software Update window popped up on my Mac Pro today, with iTunes and QuickTime updates. I installed and rebooted, then… nothing. It just sat there with the grey spinner going. When this happens (which is a lot), I tell myself:

Apple sucks. They make the hardware and the software and they can’t even reboot without a hang.

Then I cycle power, and it boots. Except this time.

Welcome to Hell

This time, cycling power did nothing. Rebooting in safe mode (Shift key) did nothing. Rebooting in choose-startup-disk mode (Option key) did nothing. Repeating the above with a wired Apple keyboard did nothing. Repeating the above with all external devices disconnected did nothing. Oh shit.

Getting a Clue

After about 10 minutes of fooling around, I got lucky and stumbled onto this series of steps:

  1. Plug in a wired mouse.
  2. Reboot in eject-cd mode (hold mouse button down).
  3. Stick in the OS X install disk.
  4. Reboot in choose-startup-disk mode (Option key).
  5. It works. Notice that your boot choices include, as usual, (a) the RAID mirror, (b) the second slice of the RAID mirror, but not (c) the first slice of the RAID mirror. Grunt knowingly.
  6. Select the RAID mirror, boot. Cheer.
  7. Open Disk Utility.app, and note that the RAID mirror is “Degraded,” and that “disk0s2” has “Failed.”
  8. Select the (degraded) RAID mirror and do a “Verify Disk.” It passes. Whew.

Well, now we’re getting somewhere. I’ve got a bad drive. I wonder if my Mac Pro’s history of dodgy reboot performance could be related. But I digress.

SMART Isn’t

I forgot to mention Step 9: Check SMARTReporter’s log file. You see, campers, modern hard drives support a “standard” called SMART, which stands for “Self-Monitoring, Analysis, and Reporting Technology,” but as we’ll soon see, means “Worse Than Worthless False Confidence Generator.”

Anyway, SMARTReporter’s menu bar display says everything is grrrrrreat. And its log file seconds the motion:

2008-01-20 12:29:17.721 SMARTReporter[466:20b] Drive: 'Hitachi HDS725050KLA360 ( | KRVN27ZAK612ZF | disk0)' Status: SMARTOK (S.M.A.R.T. condition not exceeded, drive OK)
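If you want to keep an eye on that log yourself, here is a minimal Python sketch that pulls the drive and status out of a line shaped like the one above. The regular expression is my guess based on this single sample line, not a documented SMARTReporter format, and the function name is made up for illustration.

```python
import re

# Pattern inferred from one sample log line; the real format may vary.
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"SMARTReporter\[[^\]]*\] "
    r"Drive: '(?P<drive>[^']*)' "
    r"Status: (?P<status>\S+)"
)

SAMPLE = (
    "2008-01-20 12:29:17.721 SMARTReporter[466:20b] "
    "Drive: 'Hitachi HDS725050KLA360 ( | KRVN27ZAK612ZF | disk0)' "
    "Status: SMARTOK (S.M.A.R.T. condition not exceeded, drive OK)"
)


def parse_smart_line(line):
    """Return a dict with timestamp, drive, and status, or None if no match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    entry = parse_smart_line(SAMPLE)
    print(entry["drive"], "->", entry["status"])
```

Of course, all this tells you is what the drive claims about itself.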

Mind you, I’m not slamming SMARTReporter. I’m slamming the apparently-pathetic implementation of SMART on my Hitachi Deskstar HDS725050KLA360. A drive for which I paid too much money because, you see, it’s genuine Apple factory equipment. Which leads us to the happy part of this story, my call to AppleCare.

AppleCare Saves the Day

At this stage of the game, I’m feeling pretty good. I have a bad drive, but I’ve also got AppleCare, so my drives are covered (which is why I paid too much for Apple drives in the first place).

I call 800-275-2273, listen to hip music for 10 minutes, then get some exotic sounding guy. Despite the accent, I’m pretty sure he’s in California rather than Bangalore.

I spew my sad story. He counters with “have you tried reinstalling the operating system?” OK, at this point, I’m priming myself for some serious idiot-destroying. I’m gonna make him wish he’d never been born. Same for the five other guys I’m gonna have to talk to before they do the right thing.

But I’m cool. I repeat that this is a drive failure. That the array is “degraded,” and that the drive has “failed.” Immediately, he gets it, and three minutes later, a new drive is on its way from Apple to yours truly.

Wow. I am happy, again, that I bought an Apple.

OS X Blew It

I’m using OS X’s software implementation of RAID. It seems to work just fine. But if you’ve ever been burned by a RAID setup, you know that the rubber meets the road when something goes wrong. In this case, OS X’s software RAID totally blew it.

Imagine that you’re a software implementation of RAID. You notice that one of the slices of a RAID-1 array refuses to come online. How would you handle this error? If you answered “I’d hang the whole freaking operating system” you may have a future at Apple writing device drivers. If you answered “I’d boot anyway, and display an error dialog,” go to the head of the class.

Extra credit if you write a description of the problem to a log file. Which OS X didn’t. The logic here is that, if the computer won’t boot, there’s no way to read a log file anyway. Joke!
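For the record, the head-of-the-class answer fits in a few lines. This is a toy Python sketch of the policy only (log the failure, keep going on the surviving slice), not how OS X or any real RAID driver is written. The function name and the second slice name, disk1s2, are hypothetical.

```python
import logging

log = logging.getLogger("raid")


def assemble_mirror(slices):
    """Decide the state of a mirror from which slices came online.

    slices maps a slice name to True (online) or False (failed).
    Returns (state, online_names, failed_names).
    """
    online = [name for name, ok in slices.items() if ok]
    failed = [name for name, ok in slices.items() if not ok]

    # Extra credit: write the problem down somewhere.
    for name in failed:
        log.error("mirror slice %s failed to come online", name)

    if not online:
        return ("unbootable", online, failed)
    if failed:
        # One good copy is enough to boot; warn the user instead of hanging.
        return ("degraded", online, failed)
    return ("online", online, failed)


if __name__ == "__main__":
    logging.basicConfig(level=logging.ERROR)
    # disk0s2 is the slice that failed here; disk1s2 is a made-up name.
    print(assemble_mirror({"disk0s2": False, "disk1s2": True}))
```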

A Benefit or a Hazard?

By using a RAID mirror, I basically asked for this problem. A two-drive mirror is roughly twice as likely to lose a drive as a single-drive setup. On the bright side, assuming the drives fail independently, the odds of losing both drives at once are the square of the odds of losing a single drive. And losing both drives at once is the only scenario that matters.

Unless you count the losing-your-last-drive-before-fedex-delivers-the-replacement scenario. That’s a bad one :-)
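For the curious, the mirror arithmetic can be sketched in a few lines of Python. It assumes the two drives fail independently, and the 3% failure probability is a made-up number for illustration, not a measured rate for any real drive.

```python
def mirror_odds(p):
    """Return (P(at least one drive fails), P(both drives fail)).

    p is the chance that a single drive fails in some period.
    Assumes the two drives fail independently of each other.
    """
    at_least_one = 1 - (1 - p) ** 2  # close to 2 * p when p is small
    both = p ** 2
    return at_least_one, both


if __name__ == "__main__":
    p = 0.03  # hypothetical single-drive failure probability
    at_least_one, both = mirror_odds(p)
    print(f"single drive fails:      {p:.4f}")
    print(f"some drive in mirror:    {at_least_one:.4f}")  # about 0.0591
    print(f"both drives in mirror:   {both:.4f}")  # about 0.0009
```

In practice, drives from the same batch in the same box probably don't fail independently, so treat the squared number as the optimistic case.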


About this Entry

This page contains a single entry published on January 20, 2008 5:30 PM.
