Friday, March 29, 2013

Hitting (and fixing) the ext4 corruption bug

Remember the data corruption bug that ext4 had a while back? No? Well, I experienced it firsthand yesterday while upgrading my motherboard and RAM. It's quite scary when the hard disk that holds your data files is suddenly inaccessible.

I bought this 2TB hard disk about a year and a half ago. It was formatted with ext4 right from the start. I'm not one to experiment with filesystem options so I simply formatted with defaults. Whether that means it's using extents or traditional block allocation, I have no idea. I'll assume extents since that's the way forward and ext4 was already around as an option for some time before my hard disk upgrade.
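For what it's worth, the extents question doesn't have to stay a guess. Here is a minimal sketch, assuming the e2fsprogs tools are installed: `tune2fs -l` prints a "Filesystem features:" line, and an ext4 filesystem that uses extents lists `extent` there. The helper name `has_extent_feature` is made up for illustration.

```shell
# Hypothetical helper: reads `tune2fs -l` output on stdin and succeeds
# if the "Filesystem features:" line lists the "extent" feature.
has_extent_feature() {
    grep '^Filesystem features:' | grep -qw 'extent'
}

# Usage on the real disk (needs root; /dev/sdc1 is the partition from this post):
#   sudo tune2fs -l /dev/sdc1 | has_extent_feature && echo "extents enabled"
```

Checking the feature list only reads the superblock, so it should be safe to run whether or not the partition is mounted.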

The hard disk is mounted with default options, too.

So once I got USB working on Ubuntu with my new motherboard, it was the ext4 partition's turn to get a fix. Fortunately, a simple fsck solved my problems.

~$ sudo fsck /dev/sdc1

I have three hard disks at the moment and the 2TB drive is the third one, which is why it shows up as /dev/sdc. I simply answered yes to everything fsck asked. It's kinda scary not knowing if that's the right choice, but once it was done my hard disk was back in action.
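In hindsight, a less nerve-wracking way to do this would be a read-only pass first, then the real repair, then a look at fsck's exit status. This is only a sketch of that idea, not what I actually ran. The status bits are the ones documented in the fsck man page; the helper name `describe_fsck_status` is made up for illustration.

```shell
# Hypothetical helper: turns an fsck exit status into readable text.
# The status is a sum of bit flags, so several lines may be printed.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    if [ $((status & 1)) -ne 0 ]; then echo "errors corrected"; fi
    if [ $((status & 2)) -ne 0 ]; then echo "system should be rebooted"; fi
    if [ $((status & 4)) -ne 0 ]; then echo "errors left uncorrected"; fi
    if [ $((status & 8)) -ne 0 ]; then echo "operational error"; fi
    return 0
}

# Usage (needs root; the partition must not be mounted):
#   sudo fsck -n /dev/sdc1    # read-only pass, answers "no" to every prompt
#   sudo fsck /dev/sdc1       # the real repair, prompts as usual
#   describe_fsck_status $?   # run right after fsck to decode its status
```

The read-only pass at least shows how much fsck wants to change before you commit to answering yes to everything.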

It seems this bug causes quite a lot of damage. It's not as simple as letting fsck discard the corrupt journal and being done with it. fsck had to update quite a large number of inodes, too. That's what's scary. Is it gonna render some files inaccessible? Will I even be able to mount the partition after fsck's done? Will all my files be there? Or will I lose some? Or will they be there but corrupted?

There are too many files to be sure, but it seems all my files are back. Randomly checking some files turned out fine, too, so I'll assume everything's fine, for now.
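A more thorough check than random sampling would be to read every file once and list the ones that fail. Here is a rough sketch of that, assuming a POSIX shell and find; the helper name `list_unreadable` and the mount point in the usage line are made up. It only catches files that cannot be read at all. A file that reads fine but has wrong contents would slip through, and that needs checksums or a backup to compare against.

```shell
# Hypothetical helper: prints every regular file under the given
# directory whose contents cannot be read from start to end.
list_unreadable() {
    find "$1" -type f ! -exec sh -c 'cat -- "$1" > /dev/null 2>&1' sh {} \; -print
}

# Usage (the mount point is an assumption, substitute your own):
#   list_unreadable /mnt/data
# It is also worth looking in lost+found at the top of the filesystem,
# where fsck puts files it could not reconnect to a directory.
```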