Guide feedback needed; hard drive testing

Nick @nick

Posted: Apr 28, 2018

https://www.ifixit.com/Guide/How+to+test...

This guide will be used in other guides I may make in the future where the health of the hard drive matters, but it can also be used on its own if the reader has a used system with a hard drive they suspect is failing and wants to verify its health. Because of this I did not label it as a prerequisite-only guide, although that is mainly what it is meant for. I will keep this drive for a while to avoid inconsistency, so if I need to correct anything it isn't a big deal for me to do so. Once it has been used in everything I need it for and the content created with it is good, I will drill my holes and get rid of it.

I did not cover all of the attributes I check, since the rest are extra information and not really something the average person needs to check to determine how healthy their hard drive is. The goal of this guide is not to replicate what I check; it is meant to quickly and reliably verify the health of a hard drive. That being said, the attributes this guide covers are everything I check before I go any further, so I don't waste my time on a junk drive. I check these areas since most problems develop here and I can fail out a drive quickly with this information. For the guide I've selected this data since it's what end users need to focus on to determine the health of a hard drive in nearly all circumstances.

Just to give you an idea of what I do vs. what this guide highlights: I check every piece of SMART data available. End users don't need to do this, but I stock these drives as live spares to quickly replace failed drives when I do not have a new one available. Since my stock relies on excellent SMART data, it makes sense for me to be picky about what I accept and reject as live spares. It does not make sense to do this for a drive you are only inspecting for pass/fail.

Added a picture of what the error log will output
Added pictures to show what self-tests look like in Parted Magic and Ubuntu
Added a note about drives with SMART errors being too far gone
Warned that manufacturer provided tools patch the problem and do not fix it

(4/29) Rewrote guide intro
(4/29) Refined Steps 1 and 2
(4/29) Merged Steps 3 and 4
(4/29) Merged steps 5 and 7
(4/29) Revised Step 7 title

(4/30) Reduced the scare language on the mention of manufacturer provided utilities to remap the drive in the intro
(4/30) Revised Steps 6 and 7
(5/3) Revised Steps 2 and 3
(5/3) Revised Steps 4, 5, 6 and 7
(5/3) Intro revised

(5/11) Intro heavily rewritten
(5/11) Step 2 revised
(5/11) Step 3 revised
(5/11) Step 4 revised (shared line from Step 2 copied+step specific revisions)
(5/11) Step 5 revised (shared lines from Step 3 copied+step specific revisions)
(5/11) Step 7 revised
(5/11) Removed default conclusion

(5/12) Step 3 revised (shared line with Step 5)
(5/12) Step 4 revised
(5/12) Step 5 revised
(5/12) Step 7 revised

(5/15) Minor intro modification (why it's important to test used drives/WD Blue/Green lower quality)
(5/15) Steps 3 and 5 received minor revision (shared line)
(5/15) Step 6 minor revision

It looks like all of the major problems that were initially there are now more or less ironed out. There's probably more but for the most part these issues are cleared up.

Name changed to Diagnosing and Erasing Hard Drives
Intro revised (character reduction)
(5/21) Markups revised in Step 2
(5/21) Minor revision for Step 2
(5/21) Step 3 revised
(5/21) Step 4 markers revised
(5/21) Step 5 markers revised
(5/21) Step 5 revised
(5/21) Step 6 revised
(5/21) Step 7 revised

Added a warning to use ATA Secure Erase (Step 7, line 2)

1 Reply

S W @avanteguarde · Answer 1 · 2018-04-28T17:07:35-07:00

Most Helpful Answer

@nick, most bad hard drives I have encountered do not trip SMART monitoring.

When SMART errors are triggered, it is often too late and much damage has already occurred.

The big 3 manufacturers (WD, Seagate, and Hitachi) have DOS bootable utilities customized for their particular drives. These perform extensive surface tests and can try to remap bad sectors and/or move data from a bad sector to a known good sector.

On a hard drive, there are often "good" sectors that are used as reserved and not mapped as usable so that these utilities can try to "fix" your drive before data is ultimately lost.

You should add this to your guide, as what you have included, although informative, does not really repair or recover your drive.

5 Comments:

For me, they do get the drive to log these problems, but I've never seen SMART work as intended otherwise. This drive should be labeled as bad, yet it is reported as good or caution (depending on the utility).

The guide is meant more for diagnostics, spotting these issues, than for trying to repair the drive by remapping it. Usually when a drive develops bad sectors it gets worse until the drive dies and takes the user's data with it. Nonetheless, I added this to the guide notes, along with links in case someone wants to try these tools. My concern is that making it more prominent will cause more harm than good. I've explained why below this comment. The other issue is I'd need to source bad WD and Seagate drives.

Apr 28, 2018 by Nick

I don't feel comfortable making it more prominent because remapping the drive (which these tools do) patches it by moving the data to reserve sectors. It's an option if you're remapping a reasonable amount, but there's a point the drive is beyond repair and needs to be replaced.

As an example, assume a 100 sector reserve. If the drive has 10 bad sectors (10% of the reserve), this isn't a big deal since 90% is still available. Where I get sketched out is when you have to remap 20 bad sectors (20% of the reserve). While this is high, you still have 80 reserve sectors left. However, it is beginning to become a problem.

I draw a hard line at 30 sectors (30% of the reserve). At this point, you only have 70% of the reserve left. Usually the drive is not good for much longer and you can just about guarantee it is going to keep getting worse. These really should just be retired and replaced.
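To put numbers on the rule of thumb above, here is a rough Python sketch. The function name `reserve_status`, the three labels, and the 100 sector default are made up for illustration; real drives do not report their reserve this simply, and the 20% and 30% cutoffs are only the personal thresholds described above, not anything a manufacturer publishes.

```python
def reserve_status(remapped: int, reserve_size: int = 100) -> str:
    """Classify a drive by how much of its spare-sector reserve is used.

    Hypothetical helper: the 20% and 30% cutoffs are a personal rule of
    thumb, and reserve_size=100 is an assumed example value.
    """
    if reserve_size <= 0:
        raise ValueError("reserve_size must be positive")
    if remapped < 0:
        raise ValueError("remapped must not be negative")

    # Integer arithmetic avoids floating-point surprises at the cutoffs.
    percent_times_reserve = remapped * 100

    if percent_times_reserve >= 30 * reserve_size:
        return "retire"   # 30% or more of the reserve used: replace the drive
    if percent_times_reserve >= 20 * reserve_size:
        return "caution"  # 20% up to 30%: starting to become a problem
    return "ok"           # under 20%: remapping is still reasonable


if __name__ == "__main__":
    for bad in (10, 20, 30):
        print(bad, "remapped sectors:", reserve_status(bad))
```

With the 100 sector example, 10 remapped sectors is fine, 20 is the point to start worrying, and 30 means the drive should be retired.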

Many end users will think it's a be-all and end-all solution and try to repair dead drives. That only works on drives with a small number of bad sectors. Yes, the self-tests these drives incorporate DO remap the drive, so if the user chooses to self-test the drive, these bad sectors get remapped in the process.

Apr 29, 2018 by Nick

@nick so it sounds like you've never used these remapping utilities. They do agree with you: they have a built-in threshold, and they will literally tell you there are too many bad sectors and refuse to repair the drive. I've been able to successfully use these for minor repairs and the drives worked great for another 5-7 years. However, the caveat is that you should always include a disclaimer to back up your data before and after, to prevent or recover from one of these disasters.

Also, I've pretty much seen it all... including a seagate barracuda literally burst into flames on the logic board.

Apr 29, 2018 by S W

That's good to know they have an internal threshold where they will refuse a repair if the drive is too far gone. That eases my concerns a lot and was one of the primary reasons I made the tools known but limited acknowledging the option to the intro. I've toned down my scare language to the point of "this only works if the problem is minor/make a backup before repairing the drive". For now since I can't really show all 3 it will stay in the intro but if someone mentions it I'm going to be less hesitant to suggest trying it.

Generally when a drive fails for me it's a total degradation problem or the drive is too far gone to remap. Many of them come from used systems, so I often find the previous owner let it get to the point where it's beyond remapping (this is why I recommend AGAINST using these drives without diagnostics and SMART data inspection). The Deskstar is an extreme case, but it's what I typically run into: worn out and beyond any attempt to reuse it. Since I generally find them in used systems with some sort of problem and I (usually) don't know the history, I'm hesitant to invest time in an unknown drive that could very well die in a month.

Apr 29, 2018 by Nick

@nick The Deskstar is actually not extremely rare to find failed. There is a reason why IBM sold it to Hitachi. Google "IBM deathstar" and you will see hatred beyond imagination. It was voted one of the top 25 worst tech products of all time... LOL

Apr 29, 2018 by S W

Changes made: Initial public release; 4/29/18; 4/30/18; 5/11/18; 5/12/18; 5/15/18; 5/21/18; 5/22/18
