Message ID: 20140706213710.GA19847@amd.pavel.ucw.cz
State: New, archived
On Sun, Jul 06, 2014 at 11:37:11PM +0200, Pavel Machek wrote:
>
> Well, when I got report about hw problems, badblocks -c was my first
> instinct. On the usb hdd, the most errors were due to 3.16-rc1 kernel
> bug, not real problems.

The problem is that with modern disk drives, this is the *wrong*
instinct. That's my point. In general, trying to mess with the bad
blocks list in the ext2/3/4 file system is just not the right thing to
do with modern disk drives, because modern hard drives do bad block
remapping themselves.

Basically, with modern disks, if the HDD hits a hard ECC error, it will
return an error --- but if you then write to the sector, the drive will
either rewrite the data onto that location on the platter, or, if that
part of the platter is truly gone, remap the sector to the bad block
spare pool. So telling the file system to never use that block again
isn't going to be the right answer.

The badblocks approach to dealing with hardware problems made sense
back when we had IDE disks. But that was over a decade ago; these
days, it's horribly obsolete.

					- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
On Sun 2014-07-06 21:00:02, Theodore Ts'o wrote:
> On Sun, Jul 06, 2014 at 11:37:11PM +0200, Pavel Machek wrote:
> >
> > Well, when I got report about hw problems, badblocks -c was my first
> > instinct. On the usb hdd, the most errors were due to 3.16-rc1 kernel
> > bug, not real problems.
>
> The problem is with modern disk drives, this is a *wrong* instinct.
> That's my point. In general, trying to mess with the bad blocks list
> in the ext2/3/4 file system is just not the right thing to do with
> modern disk drives. That's because with modern disk drives, the hard
> drives will do bad block remapping.

Actually... I believe it was the right instinct.

If I just wanted to recover the data, remount -r would be the way to
go, then back it up using dd_rescue. But that way I'd turn bad sectors
into silent data corruption.

If I wanted to keep using data from that partition, fsck -c (or
badblocks, but that's trickier) and then dd_rescue would be the way to
go.

> Basically, with modern disks, if the HDD has a hard ECC error, it will
> return an error --- but if you write to the sector, it will either
> rewrite onto that location on the platter, or if that part of the
> platter is truly gone, it will remap to the bad block spare pool. So
> telling the disk to never use that block again isn't going to be the
> right answer.

Actually -- a tool to do the relocations would be nice. It is not
exactly easy to do it right by hand. I know the theory, but I had 5
read-error incidents this year:

#1: Seagate refuses to reallocate sectors. Not sure why; I tried
    pretty much everything.

#2: 3.16-rc1 produces incorrect errors every 4GB, leading to "bad
    sectors" that disappear with other kernels.

#3: Some more bad sectors appear on the Seagate.

#4: Kernel on thinkpad reports errors in the daily check, which is
    strange because there's nothing in SMART.

#5: Some old IDE hdd has bad sectors in unused or unimportant areas.

Only in #5 might the theory match the reality (I did not check; I
trashed the disks).

> The badblocks approach to dealing with hardware problems made sense
> back when we had IDE disks. But that's been over a decade ago. These
> days, it's horribly obsolete.

Forcing reallocation is hard & tricky. You may want to simply mark the
block bad and lose a tiny bit of disk space... And even if you do want
to force reallocation, you want to do fsck -c first, and restore
affected files from backup.

									Pavel
Hi!

With 3.16-rc3, I did a deliberate powerdown by holding down the power
key (not a clean shutdown). On the next boot, I got some scary messages
about data corruption, "filesystem has errors, check forced", "reboot
linux". Unfortunately, the reboot made the scary messages gone forever
(I tried ^S, but was not fast enough).

But it seems I have more of the bad stuff coming: mounting local
filesystems threw an oops, and then mount was killed due to
out-of-memory. I lost the sda2 (or /data) filesystem. Then both sda3
(root) and sda2 gave me errors. But there's no disk error either in
SMART or in syslog.

Jul  8 01:03:18 duo kernel: EXT4-fs (sda3): error count: 2
Jul  8 01:03:18 duo kernel: EXT4-fs (sda3): initial error at 1404773782: ext4_mb_generate_buddy:757
Jul  8 01:03:18 duo kernel: EXT4-fs (sda3): last error at 1404773782: ext4_mb_generate_buddy:757
Jul  8 01:05:44 duo kernel: EXT4-fs (sda2): error count: 12
Jul  8 01:05:44 duo kernel: EXT4-fs (sda2): initial error at 1404773906: ext4_mb_generate_buddy:757
Jul  8 01:05:44 duo kernel: EXT4-fs (sda2): last error at 1404774058: ext4_journal_check_start:56

(Thinkpad x60 with Hitachi HTS... SATA disk.)

I attach the complete syslog from the boot; it should have everything
relevant. I'm running fsck -f on sda3 now; I'd like to repair sda2
tomorrow.

Best regards,
									Pavel
On Mon, Jul 07, 2014 at 08:55:43PM +0200, Pavel Machek wrote:
> If I wanted to recover the data... remount-r would be the way to
> go. Then back it up using dd_rescue. ... But that way I'd turn bad
> sectors into silent data corruption.
>
> If I wanted to recover data from that partition, fsck -c (or
> badblocks, but that's trickier) and then dd_rescue would be the way
> to go.

Ah, if that's what you're worried about, just do the following:

   badblocks -b 4096 -o /tmp/badblocks.sdXX /dev/sdXX
   debugfs -R "icheck $(cat /tmp/badblocks.sdXX)" /dev/sdXX > /tmp/bad-inodes
   debugfs -R "ncheck $(sed -e 1d /tmp/bad-inodes | awk '{print $2}' | sort -nu)" /dev/sdXX > /tmp/bad-files

This will give you a list of the files that contain blocks that had
I/O errors. So now you know which files have contents which have
probably been corrupted. No more silent data corruption. :-)

> Actually -- tool to do relocations would be nice. It is not exactly
> easy to do it right by hand.

It's not *that* hard. All you really need to do is:

   for i in $(cat /tmp/badblocks.sdXX) ; do
      dd if=/dev/zero of=/dev/sdXX bs=4k seek=$i count=1
   done
   e2fsck -f /dev/sdXX

For bonus points, you could write a C program which tries to read the
block one final time before doing the forced write of all zeros.

It's a bit harder if you are trying to interpret the device-driver
dependent error messages, and translate the absolute sector number
into a partition-relative block number. (Except sometimes, depending
on the block device, the number which is given is either a relative
sector number, or a relative block number.)

For disks that do bad block remapping, an even simpler thing to do is
to just delete the corrupted files. When the blocks get reallocated
for some other purpose, the HDD should automatically remap the block
on write, and if the write fails, such that you are getting an I/O
error on the write, it's time to replace the disk.

> Forcing reallocation is hard & tricky. You may want to simply mark it
> bad and lose a tiny bit of disk space... And even if you want to force
> reallocation, you want to do fsck -c, first, and restore affected
> files from backup.

Trying to force reallocation isn't that hard, so long as you have
resigned yourself to having lost the data in the blocks in question.
And if it doesn't work, for whatever reason, I would simply not trust
the disk any longer.

For me at least, it's all about the value of the disk versus the value
of my time and the data on the disk. When I take my hourly rate into
account ($annual comp divided by 2000), the value of trying to save a
particular hard drive almost never works out in my favor.

So these days, my bias is to do what I can to save the data, but not
to fool around with fancy games like e2fsck -c. I'll just save what I
can -- hopefully, with regular backups, that won't require heroic
measures -- and then trash and replace the HDD.

Cheers,

					- Ted

P.S. I'm not sure why you consider running badblocks to be tricky.
The only thing you need to be careful about is passing the file system
block size to badblocks. And since the block size is almost always 4k
for any non-trivial file system, all you really need to do is
"badblocks -b 4096". Or, if you really like:

   badblocks -b $(dumpe2fs -h /dev/sdXX | awk -F: '/^Block size:/ {print $2}') /dev/sdXX

See? Easy peasy! :-)
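Ted's "bonus points" program -- give each suspect block one last read,
and only overwrite it with zeros if that read fails -- might look
roughly like the sketch below. This is an illustration, not anything
shipped with e2fsprogs; the function name is made up, the block size
must match what you passed to badblocks -b, and you should try it on a
scratch file before pointing it at a real device.

```c
/*
 * Sketch: last-chance read of a block, then overwrite with zeros on
 * failure so the drive can remap the sector on write.  Only run this
 * against /dev/sdXX once you have given up on the data in the block.
 */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Returns 0 if the block read back fine (leave it alone), 1 if it was
 * unreadable and has been rewritten with zeros, -1 if even the write
 * failed (time to replace the disk).
 */
int rezero_if_unreadable(int fd, off_t blk, size_t blocksize)
{
	char *buf = malloc(blocksize);
	int ret = 0;

	if (!buf)
		return -1;
	if (pread(fd, buf, blocksize, blk * blocksize) != (ssize_t) blocksize) {
		memset(buf, 0, blocksize);
		if (pwrite(fd, buf, blocksize, blk * blocksize) ==
		    (ssize_t) blocksize)
			ret = 1;	/* zeroed; drive remaps on write */
		else
			ret = -1;	/* write failed too */
	}
	free(buf);
	return ret;
}

/*
 * Intended use, with the list produced by "badblocks -b 4096 -o ...":
 * open /dev/sdXX with O_RDWR, call rezero_if_unreadable() for each
 * block number in /tmp/badblocks.sdXX, then run e2fsck -f /dev/sdXX.
 */
```

Note that on a regular file a read past EOF counts as "unreadable" here
(short read), so testing on a small scratch file will append zeros.
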
diff --git a/e2fsck/badblocks.c b/e2fsck/badblocks.c
index 32e08bf..7ae7a61 100644
--- a/e2fsck/badblocks.c
+++ b/e2fsck/badblocks.c
@@ -60,8 +60,10 @@ void read_bad_blocks_file(e2fsck_t ctx, const char *bad_blocks_file,
 	}
 	old_bb_count = ext2fs_u32_list_count(bb_list);
 	printf("%s: Currently %d bad blocks.\n", ctx->device_name, old_bb_count);
-	if (replace_bad_blocks)
+	if (replace_bad_blocks) {
+		ext2fs_badblocks_list_free(bb_list);
 		bb_list = 0;
+	}
 
 	/*
 	 * Now read in the bad blocks from the file; if
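For readers wondering what the hunk above fixes: when
replace_bad_blocks is set, the old code dropped the only pointer to the
existing in-memory list without freeing it, leaking the list; the fix
frees it before zeroing the pointer. The shape of the bug, reduced to a
self-contained sketch (the struct and function names below are made up
for illustration and are not the real libext2fs API):

```c
/*
 * Minimal illustration of the leak fixed in the patch.  "u32_list"
 * stands in for the libext2fs badblocks list; none of these names are
 * the real e2fsprogs API.
 */
#include <stdlib.h>

struct u32_list {
	unsigned int *vals;
	int count;
};

/* Plays the role of ext2fs_badblocks_list_free() in the patch. */
void u32_list_free(struct u32_list *list)
{
	if (!list)
		return;
	free(list->vals);
	free(list);
}

/*
 * Buggy version did:   if (replace) old = 0;   -- the old list leaks.
 * Fixed version frees the old list first, then forgets it.
 */
struct u32_list *start_new_list(struct u32_list *old, int replace)
{
	if (replace) {
		u32_list_free(old);
		old = 0;
	}
	return old;
}
```
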