Hack 59 Rescue Files from Damaged Hard Drives 
When your hard drive is damaged or is on its
last legs, use Knoppix to recover what's left on the
drive and attempt to restore it.
Hard drives continue to get larger and
more complicated, and at least in the desktop IDE market, hard drives
seem to be getting less and less reliable. If you
don't believe me, search the Internet for
"IBM Deathstar" (referring to
problems in the 60GXP and 75GXP series of hard drives). While a
three-year warranty guarantees you a replacement drive, if your drive
fails, there is no way to receive replacement data. When your hard
drive starts to fail, you might notice that it becomes much louder
than it used to be and makes a loud clicking noise that sounds a bit
like your hard drive is crushing ice. Your drive has the click of
death. In addition to general file-access failures, the click of
death is the main indicator that your hard drive is dying and should
be backed up immediately.
Unfortunately, most backup and imaging utilities operate on the
assumption that they are running on fully functioning hardware. When
a hard drive is dying, many backup utilities won't
be able to handle the different access errors. If your drive has
gotten so bad that you can't even boot from it, your
best chance of creating a backup is to image the drive [Hack #48]. But
even the faithful dd program, by default, exits with an
error as soon as it hits a bad block, so if you try to image a
failing hard drive, you end up with an incomplete image.
Knoppix comes with a tool called
dd_rescue
(http://www.garloff.de/kurt/linux/ddrescue)
that aims to pick up where dd leaves off when
reading from questionable drives. When dd_rescue
comes across a bad block, it simply skips it and moves on by default,
or it can be set to give up after a certain number of errors. On a
failing drive, this means you can create an image of a full partition
with some holes here and there, and then use
fsck to try to repair some of the damage on the
filesystem. By using Knoppix for this recovery, you access the drives
as little as possible, so you are only putting strain on the bad
drive long enough to make a single copy, and then you can browse
around the image from a fully functioning drive.
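If you want to try dd_rescue on its own before bringing in a frontend, a minimal sketch looks like the following. The wrapper name rescue_image and the DRY_RUN switch are made up for illustration. The -v (verbose), -e (maximum number of errors, 0 meaning no limit), and -l (logfile) flags are the ones I believe dd_rescue 1.x accepts, so check dd_rescue -h on your copy before relying on them.

```shell
# Hypothetical wrapper: image SRC to IMG with dd_rescue, logging to IMG.log.
# MAXERR=0 is assumed to mean "never give up"; any other value stops the
# copy after that many read errors.
# Set DRY_RUN=1 to print the command instead of running it.
rescue_image() {
    src=$1 img=$2 maxerr=${3:-0}
    # Build the command line once so the printed and executed forms match.
    set -- dd_rescue -v -e "$maxerr" -l "$img.log" "$src" "$img"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        sudo "$@"
    fi
}
```

Running DRY_RUN=1 rescue_image /dev/hda1 /mnt/hdb1/hda1_rescue.img first lets you read the exact command before anything touches the failing drive.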
While you can do the complete drive rescue with the
dd_rescue tool, there is a helper frontend tool
called
dd_rhelp
that automates and speeds up much of the process.
Dd_rescue doesn't stop when it
hits bad sectors, but it does slow down significantly. If your drive
has a number of bad blocks in a row, it can take
dd_rescue a long time to move past them into
recoverable data. If the drive is going to fail quickly, this means
your drive can fail while dd_rescue is waiting
on bad blocks. Dd_rhelp speeds up this process
by assuming that bad blocks are generally in groups. When
dd_rhelp sees that
dd_rescue has hit a bad block, it skips ahead a
number of blocks and reads from that point in reverse until it hits
another bad block. It uses this method to map out sections of bad
blocks on the drive and attempts to recover the good blocks first.
Then, when it has recovered the good blocks, it goes back and tries
to recover from the group of bad blocks.
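To make the jump-and-read-backward idea concrete, here is a toy model of it. This is not dd_rhelp's actual code, just a sketch of the strategy described above under simplifying assumptions: the "drive" is a string in which G is a readable block and B is a bad one, the jump distance is fixed (4 blocks by default), and the names block_at and simulate are invented for the example.

```shell
# block_at DISK INDEX -> prints the block (G or B) at a 0-based index.
block_at() { printf '%s' "$1" | cut -c"$(( $2 + 1 ))"; }

# simulate DISK [JUMP] -> prints the order in which good blocks are read,
# then the blocks set aside for a later retry. Uses global variables to
# stay portable across /bin/sh implementations.
simulate() {
    disk=$1 jump=${2:-4}
    n=${#disk} pos=0 order="" deferred=""
    while [ "$pos" -lt "$n" ]; do
        if [ "$(block_at "$disk" "$pos")" = G ]; then
            order="$order $pos"; pos=$((pos + 1)); continue
        fi
        # Bad block at $pos: jump ahead, clamped to the end of the disk.
        target=$((pos + jump))
        [ "$target" -ge "$n" ] && target=$((n - 1))
        # Read backward from the jump target until another bad block.
        back=$target
        while [ "$back" -gt "$pos" ] && [ "$(block_at "$disk" "$back")" = G ]; do
            order="$order $back"; back=$((back - 1))
        done
        # Everything between the two bad blocks is left for the retry pass.
        i=$pos
        while [ "$i" -le "$back" ]; do deferred="$deferred $i"; i=$((i + 1)); done
        pos=$((target + 1))
    done
    echo "read first:$order"
    echo "retry later:$deferred"
}
```

For the disk GGBBGGGG, the model reads blocks 0 and 1, hits the bad block at 2, jumps to block 6, reads 6, 5, and 4 in reverse, stops at the bad block at 3, and then carries on with block 7, leaving blocks 2 and 3 for the retry pass.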
Time is precious when a drive is failing, so
dd_rhelp tries to spend more time recovering
good data, and then goes back to recover questionable data if it can.
dd_rhelp has other benefits as well. It
can use the logs that dd_rescue generates to
resume a rescue operation that you have stopped with Ctrl-C. Also,
dd_rhelp generates nice ASCII output that shows
you where it is on your drive and which bad blocks it has discovered.
So your drive has the click of death, and some files are missing.
Don't panic. You should still be able to recover
most or all of your data. First, you need something to store the disk
image on. You are using Knoppix, so you can save the image to any
drive that Knoppix supports, including locally mounted drives, USB
drives, and remote file servers. This drive must be large enough to
hold a complete image of the failing disk partition, so even if the
failing 10-GB partition still has 7 GB free, you need a full 10 GB of
space on a second drive to hold the image.
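Before you start, it is worth checking that the image will actually fit. The sketch below compares the size of the partition with the free space on the destination. The function names image_fits and free_kb are invented for this example, and it assumes your Knoppix has a df that accepts -P and -k; a blockdev that supports --getsize64 is one way to get the partition size in bytes, if your version provides it.

```shell
# image_fits PARTITION_BYTES FREE_KB -> succeeds if the image will fit.
image_fits() {
    # Round the partition size up to whole kilobytes before comparing.
    need_kb=$(( ($1 + 1023) / 1024 ))
    [ "$need_kb" -le "$2" ]
}

# free_kb MOUNTPOINT -> free space on that filesystem, in kilobytes.
free_kb() {
    df -P -k "$1" | awk 'NR == 2 { print $4 }'
}

# Example use (assumes blockdev --getsize64 is available):
#   size=$(sudo blockdev --getsize64 /dev/hda1)
#   image_fits "$size" "$(free_kb /mnt/hdb1)" || echo "Not enough room"
```

The comparison uses the whole partition size, not the space the files occupy, because the image is a block-for-block copy.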
Boot Knoppix. Open a browser and go to http://www.garloff.de/kurt/linux/ddrescue/.
Knoppix includes dd_rescue v1.02, but
dd_rhelp requires v1.03. Download Version 1.03
or greater to your home directory, create a local
bin directory to hold the binaries (so the new
dd_rescue is run instead of the one shipped with
Knoppix), and extract dd_rescue to that
directory:
knoppix@ttyp0[knoppix]$ mkdir -p ~/.dist/bin
knoppix@ttyp0[knoppix]$ tar xzf dd_rescue-1.03.tar.gz dd_rescue
knoppix@ttyp0[knoppix]$ mv dd_rescue ~/.dist/bin
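Because two copies of dd_rescue now exist, you may want to confirm which one the shell will pick up. The helper below is a sketch, not part of either tool: prefer_dir is a made-up name, and it simply puts a directory at the front of PATH if it is not there already.

```shell
# prefer_dir DIR -> make DIR the first entry in PATH (no-op if it already is).
prefer_dir() {
    case ":$PATH:" in
        ":$1:"*) ;;                # DIR is already the first entry
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

# Example use:
#   prefer_dir "$HOME/.dist/bin"
#   command -v dd_rescue     # should print the copy under ~/.dist/bin
```

If command -v still reports the system copy, the new binary is either missing from the directory or not executable.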
Now browse to http://www.kalysto.ath.cx/utilities/dd_rhelp/index.en.html
and download the latest version of the dd_rhelp
tool to your home directory. Open a terminal, extract the
files from the dd_rhelp-version.tar.gz file that
you have downloaded, and change to the directory it creates. Then
compile the program and copy the new dd_rhelp
binary to your local bin directory with
dd_rescue:
knoppix@ttyp0[knoppix]$ tar xzf dd_rhelp-0.0.5.tar.gz
knoppix@ttyp0[knoppix]$ cd dd_rhelp-0.0.5/
knoppix@ttyp0[dd_rhelp-0.0.5]$ ./configure && make
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking for a BSD-compatible install... /usr/bin/install -c
checking for bash... /bin/sh
configure: creating ./config.status
config.status: creating Makefile
config.status: creating src/include/begin-sh
config.status: creating src/include/copyright-sh
config.status: creating src/include/end-sh
config.status: creating src/include/vars-sh
rm -f dd_rhelp
echo "#!/bin/sh" > dd_rhelp
cat ./src/include/begin-sh >> dd_rhelp
cat ./src/include/copyright-sh >> dd_rhelp
cat ./src/include/GPL-sh >> dd_rhelp
echo "# TODO : " >> dd_rhelp
cat ./TODO | sed 's/^/# /g' >> dd_rhelp
cat ./src/include/vars-sh >> dd_rhelp
echo "# Including 'libcolor.sh'" >> dd_rhelp
cat ./src/include/libcolor.sh >> dd_rhelp
echo "# Including 'libcommon.sh'" >> dd_rhelp
cat ./src/include/libcommon.sh >> dd_rhelp
cat ./src/dd_rhelp-sh >> dd_rhelp
cat ./src/include/end-sh >> dd_rhelp
chmod ugo+x dd_rhelp
knoppix@ttyp0[dd_rhelp-0.0.5]$ cp dd_rhelp ~/.dist/bin/
Mount the drive to which you are saving the image with read/write
access. You don't need to mount the problem drive
(if the drive is far enough gone, you won't be able to
mount it anyway). Then run dd_rhelp:
knoppix@ttyp0[knoppix]$ sudo mount -o rw /dev/hdb1 /mnt/hdb1
knoppix@ttyp0[knoppix]$ sudo dd_rhelp /dev/hda1 /mnt/hdb1/hda1_rescue.img
=== launched via 'dd_rhelp' at 0k, 0 >>> ===
dd_rescue: (info): ipos: 1048444.0k, opos: 1048444.0k, xferd: 1048444.0k
* errs: 0, errxfer: 0.0k, succxfer: 1048444.0k
+curr.rate: 8339kB/s, avg.rate: 7564kB/s, avg.load: 7.9%
dd_rescue: (warning): /dev/hda1 (1048444.0k): Input/output error!
dd_rescue: (info): ipos: 1048444.5k, opos: 1048444.5k, xferd: 1048444.5k
* errs: 1, errxfer: 0.5k, succxfer: 1048444.0k
+curr.rate: 812kB/s, avg.rate: 7564kB/s, avg.load: 7.9%
dd_rescue: (warning): /dev/hda1 (1048444.5k): Input/output error!
dd_rescue: (info): ipos: 1048445.0k, opos: 1048445.0k, xferd: 1048445.0k
* errs: 2, errxfer: 1.0k, succxfer: 1048444.0k
+curr.rate: 1057kB/s, avg.rate: 7564kB/s, avg.load: 7.9%
dd_rescue: (warning): /dev/hda1 (1048445.0k): Input/output error!
dd_rescue: (info): ipos: 1048445.5k, opos: 1048445.5k, xferd: 1048445.5k
* errs: 3, errxfer: 1.5k, succxfer: 1048444.0k
+curr.rate: 994kB/s, avg.rate: 7564kB/s, avg.load: 7.9%
dd_rescue: (warning): /dev/hda1 (1048445.5k): Input/output error!
dd_rescue: (info): /dev/hda1 (1048446.0k): EOF
Summary for /dev/hda1 -> /mnt/hdb1/hda1_rescue.img:
dd_rescue: (info): ipos: 1048446.0k, opos: 1048446.0k, xferd: 1048446.0k
errs: 4, errxfer: 2.0k, succxfer: 1048444.0k
+curr.rate: 1042kB/s, avg.rate: 7564kB/s, avg.load: 7.9%
knoppix@ttyp0[knoppix]$
Replace /dev/hda1 with the partition that
you are recovering, and /mnt/hdb1 with the
mount point where you are saving the image. As
dd_rhelp scans the drive, it prints out all of
its progress, including any errors it finds. When it finishes, you
should have two files on your recovery drive: the image and a log
from dd_rescue, in case you want to audit its
progress.
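If you saved the output, or if the log contains the same warning and summary lines as the output shown above (an assumption; the log format may differ between versions), a couple of small helpers can total up the damage. The function names are invented for this sketch.

```shell
# count_read_errors FILE -> number of "Input/output error" warnings in FILE.
count_read_errors() {
    # grep -c prints 0 but returns non-zero when nothing matches,
    # so the exit status is deliberately ignored.
    grep -c 'Input/output error' "$1" || true
}

# last_summary FILE -> the final "errs: ..." line, with any leading
# spaces or asterisks stripped.
last_summary() {
    grep 'errs:' "$1" | tail -n 1 | sed 's/^[ *]*//'
}
```

For the run shown above, the count would be 4, and the errxfer figure in the last summary line reports how much data could not be read (2.0k out of roughly 1 GB).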
Now, run fsck on the image to attempt to repair
any filesystem errors that might have occurred [Hack #57]
by typing this command:
knoppix@ttyp0[knoppix]$ sudo fsck -y /mnt/hdb1/hda1_rescue.img
fsck 1.35 (28-Feb-2004)
e2fsck 1.35 (28-Feb-2004)
/mnt/hdb1/hda1_rescue.img: clean, 12/131072 files, 187767/262111 blocks
The -y option tells
fsck to automatically repair any filesystem
errors it finds. Mount the image with the -o loop
option, and you should be able to access your files at that mount
point as if it were a hard drive:
knoppix@ttyp0[knoppix]$ sudo mount -o loop /mnt/hdb1/hda1_rescue.img /mnt/hda1