2 Hard drives... dead.
I recently had a hard drive in my NAS start to fail; the end-to-end SMART report lit up. No biggie - I had a warm spare ready for this occasion. I use unRaid as my NAS, which is great at rebuilding drives from parity. So I unassigned the failing drive, zeroed it out to get it ready for its RMA, and started rebuilding the drive. But ouch! A second drive started failing too! A reallocated sector explosion killed a second drive whilst the first was being rebuilt. After the rebuild finished, I ran reiserfsck --rebuild-tree on both drives to repair their filesystems (while the unRaid array was mounted, to maintain parity). But now I had a small bunch of missing/corrupted files on two drives. Before rebuilding the second drive, I needed my files back...
CrashPlan to the... rescue??
I use CrashPlan to back up the most important things - pictures and such. I thought it would be a simple matter to "restore missing/corrupted" - but it turns out CrashPlan has no such feature. I can't imagine why they think people using a backup service would not need that vital feature. But my choices were:
- 1) Download the ENTIRE backup, renaming existing local files.
- 2) Go through the backup and my local NAS filesystem file-by-file, folder-by-folder, looking for missing files.
After about 5 minutes of pursuing option number 2, I nope'd out and started looking for better solutions. Looking around my local CrashPlan config directory, I found some backup log files that looked interesting - here was a complete record of every file uploaded! Hundreds of thousands of lines that look kind of like this:
I 09/02/15 11:38PM 42 5ca1bc63fd0a96a963486de8d8dce91e 0 /data/Pictures/Latest pics from both phones/Big Nerd Ranch.jpg (37987) [2,0,37987,0,0,20,20]
I read the tiny bit of documentation CrashPlan provides:
Backupfiles.log contains a list of files that CrashPlan has attempted to back up. Information found in backupfiles.log includes:
- Backup success or failure (an "I" means that the file backed up successfully; a "W" means that the file failed to back up)
- Date and time
- Destination GUID
- File name and path
- File size
I decided that missing files was one problem - time to make it two problems and use a regex:
/(?<status>[IW]) (?<date>\d\d\/\d\d\/\d\d) (?<time>\d\d:\d\d[AP]M) (?<some_num>\d+) (?<hash>\w+) (?<other_num>\d+) (?<filename>.*) \((?<size>\d+)\) \[(?<flags>[\d,]+)\]/
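To show what that pattern pulls out, here is a minimal sketch that runs it against the sample log line above. The `\A` and `\s*\z` anchors, the constant names, and the `parse_log_line` helper are my additions for illustration, not part of the original script, and the meaning of the two unnamed numbers and the flags is still a guess.

```ruby
# The pattern from the post, anchored to the whole line (the anchors are an assumption).
LOG_LINE = /\A(?<status>[IW]) (?<date>\d\d\/\d\d\/\d\d) (?<time>\d\d:\d\d[AP]M) (?<some_num>\d+) (?<hash>\w+) (?<other_num>\d+) (?<filename>.*) \((?<size>\d+)\) \[(?<flags>[\d,]+)\]\s*\z/

# The sample line quoted earlier in the post.
SAMPLE = "I 09/02/15 11:38PM 42 5ca1bc63fd0a96a963486de8d8dce91e 0 " \
         "/data/Pictures/Latest pics from both phones/Big Nerd Ranch.jpg (37987) [2,0,37987,0,0,20,20]"

# Hypothetical helper: returns a hash of the interesting fields, or nil if the line does not match.
def parse_log_line(line)
  m = LOG_LINE.match(line)
  return nil unless m

  { status: m[:status], filename: m[:filename], size: m[:size].to_i }
end

entry = parse_log_line(SAMPLE)
puts "#{entry[:status]} #{entry[:filename]} (#{entry[:size]} bytes)" if entry
```

Because `.*` is greedy, a filename containing spaces or parentheses still works: the match backtracks to the last ` (size) [flags]` pair on the line.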
Ruby to the rescue
I wrote a simple program that parsed the entire log looking for file uploads, stashed them away in OpenStruct objects, then checked my NAS's filesystem for the files I had in my CrashPlan backup. It flagged files that were either missing locally or had a size mismatch. The result wasn't perfect, and I'd imagine I could spend some time improving it: find and duplicate the hashing algo, more robust error handling, etc. But for a 20 minute script it worked perfectly. It took 5 minutes to run - I'm sure a properly optimized Elixir version could crunch that stuff in seconds 😛 - but in the end it gave me an exact list of files to check manually, with only a small number of false positives.
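For the curious, the approach looks roughly like the sketch below. This is a reconstruction of the idea, not the original script: the method names, the optional `root` prefix, and the "last log entry wins" rule for files that appear more than once are all assumptions. It only compares existence and size, and it does not attempt to reproduce CrashPlan's hash.

```ruby
require "ostruct"

# Same pattern as above, anchored to the whole line (the anchors are an assumption).
BACKUP_LINE = /\A(?<status>[IW]) (?<date>\d\d\/\d\d\/\d\d) (?<time>\d\d:\d\d[AP]M) (?<some_num>\d+) (?<hash>\w+) (?<other_num>\d+) (?<filename>.*) \((?<size>\d+)\) \[(?<flags>[\d,]+)\]\s*\z/

# Turn log lines into OpenStruct entries, keeping only successful ("I") uploads.
# If a path shows up more than once, the later line replaces the earlier one
# (assumed here to be the most recent upload).
def backed_up_files(log_lines)
  entries = {}
  log_lines.each do |line|
    m = BACKUP_LINE.match(line.chomp)
    next unless m && m[:status] == "I"

    entries[m[:filename]] = OpenStruct.new(path: m[:filename], size: m[:size].to_i)
  end
  entries.values
end

# Return [path, reason] pairs for entries that are missing locally or whose
# local size differs from the size recorded in the log.
# `root` is a hypothetical prefix for when the backed-up paths are mounted elsewhere.
def suspect_files(entries, root: nil)
  entries.each_with_object([]) do |entry, suspects|
    local = root ? File.join(root, entry.path) : entry.path
    if !File.file?(local)
      suspects << [entry.path, :missing]
    elsif File.size(local) != entry.size
      suspects << [entry.path, :size_mismatch]
    end
  end
end
```

Feeding it the log is then something like `suspect_files(backed_up_files(File.foreach(log_path)))`, where `log_path` points at wherever your CrashPlan install keeps its backup log. A size match does not prove the contents are intact, which is one reason the resulting list still needs a manual check.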
^^ Restoring ~18.5 GB instead of hundreds ^^
Now if only CrashPlan had a "restore missing" button, I could have just hit a single button and spent my Sunday afternoon doing something more enjoyable.