Thursday, October 25, 2007

Super cool file parity tool

I've used par2 quite a bit, and have always found it amazing. Its typical use is repairing corrupt files you've downloaded from Usenet, but you can also use it on your own system to guard against data loss. Basically, it creates extra files that can be used to reconstruct your files if they've been corrupted or, in most cases, even deleted.

I'm going to show you how to use PyPar2, a GUI front end for par2, in Ubuntu 7.10. The interface makes creating file redundancy a breeze. Here's their home page.

So, the first thing you need to do is go to Add/Remove Applications and search for par2. Two programs come up. It looks like the first one only verifies par files and doesn't actually support creating them. (I didn't test it though, so correct me if you know differently.) We're going to select the second one and install it.

Once it's installed, it will show up under "Other" in the Applications menu. Go ahead and run it. Here's a screenshot in its default state. As you can see, it's very plain and simple.

Since we'll be creating parity files, click on the "Create" tab. You'll see an empty window there. At this point all you have to do is drag and drop files, or in my case a folder, into the window. If you drag in a folder, it will automatically list all the files from that folder.

I'm going to click "use advanced settings" so I can get some more options. Let's click on the "redundancy" tab and set it to 15%. I'm also going to click on the "parity files" tab and set the number of files to 5, with uniform parity file size.

When you have that set, click on the Go button and it will ask you where you want to store the files. I just used the default location, but changed the file name to suit my needs better. Then, when you click save, a console will pop up and show you all the files it's working on, and show a percentage of completion for the redundancy files. On a music album on my system, this process took about 30 seconds.
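If you'd rather work from a terminal, the same settings can probably be reproduced with the par2 command-line tool directly. Here's a rough Python sketch that builds the command. The function names and the file names are made up for illustration. The `-r`, `-n`, and `-u` flags are the par2 command-line options for redundancy percentage, number of recovery files, and uniform recovery file size, but check `par2 --help` on your own system before relying on them.

```python
import shutil
import subprocess


def build_create_command(par2_name, files, redundancy=15, recovery_files=5, uniform=True):
    """Build a par2 'create' command that roughly mirrors the GUI settings."""
    cmd = ["par2", "create", f"-r{redundancy}", f"-n{recovery_files}"]
    if uniform:
        cmd.append("-u")  # make every recovery file the same size
    cmd.append(par2_name)  # name of the main par2 file to write
    cmd.extend(files)      # the files to protect
    return cmd


def create_parity(par2_name, files):
    """Run par2 if it is installed; raises if it is missing or fails."""
    if shutil.which("par2") is None:
        raise RuntimeError("par2 was not found on PATH")
    subprocess.run(build_create_command(par2_name, files), check=True)


if __name__ == "__main__":
    # Hypothetical file names; substitute your own.
    print(" ".join(build_create_command("album.par2", ["01.mp3", "02.mp3"])))
```

This only prints the command so you can look it over first. Call `create_parity` once you're happy with it.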

Now if you look in the folder where your files are stored, you'll see all your regular files plus 6 par files: 1 main par file and 5 par files that hold the actual recovery data. And here comes the super cool part. I have 11 music files in my folder. I'm going to go ahead and delete (YES, DELETE) lucky song number 11. If you don't feel comfortable deleting your file, move it to another location so we can test this stuff out.

Once the file is deleted, go back to the PyPar2 program and select the other tab, "Check". Then select your par2 file. Any of the files you created will work, but it's standard to work from the first one (the one with the shortest name). And, because we already know our file is completely missing, we're going to choose "Repair".
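If you ever want to script that "pick the one with the shortest name" step, here's a small Python sketch. The function name and the example file names are just for illustration, and the exact names of the recovery files may differ on your system.

```python
import os


def pick_index_file(names):
    """Return the .par2 name with the shortest length, or None if there is none."""
    par2_names = [n for n in names if n.lower().endswith(".par2")]
    if not par2_names:
        return None
    # Shortest name first; ties are broken alphabetically.
    return min(par2_names, key=lambda n: (len(n), n))


if __name__ == "__main__":
    # Looks in the current folder; change "." to wherever your files live.
    print(pick_index_file(os.listdir(".")))
```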

Click on Go, and a console will pop up again. It will scan through your files, see what needs to be repaired, and, if you have enough parity blocks (we should have more than enough), repair your file or, in our case, completely rebuild it.
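The check and repair steps can also be done with the par2 command-line tool, using its `verify` and `repair` commands. Here's a minimal Python sketch. The function names and `album.par2` are hypothetical, and treating an exit status of 0 as "everything is fine" is my assumption, so confirm it against your version of par2.

```python
import shutil
import subprocess


def build_check_command(par2_file, repair=False):
    """Build a par2 'verify' or 'repair' command for the given par2 file."""
    action = "repair" if repair else "verify"
    return ["par2", action, par2_file]


def run_check(par2_file, repair=False):
    """Run par2 and report whether it exited with status 0."""
    if shutil.which("par2") is None:
        raise RuntimeError("par2 was not found on PATH")
    result = subprocess.run(build_check_command(par2_file, repair))
    # Assumption: status 0 means the files are intact (or were repaired).
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical file name; substitute your own main par2 file.
    print(" ".join(build_check_command("album.par2", repair=True)))
```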

By now, I'm sure some of you have examined the size of the parity files. My parity files for one album were 10 MB, and one MP3 was 6 MB. So, of course there's enough data in the parity files to recover an MP3. What's the big deal? Why not just back up the MP3s to another folder?

Well, here's why this program is amazing. We could have deleted any one of those MP3s (not all at once, please) and it would have been able to recover it. How does it do it? I have no idea, but after reading a little about it, I discovered it uses Reed-Solomon error correction, which works on the same kind of mathematical principles used for redundancy in a RAID array. As long as it has enough parity blocks to repair your files, it will have no problems. In the advanced settings, you can adjust the amount of parity according to your needs. And of course this can be used on any file type, not just MP3.
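To get a feel for how one set of parity data can rebuild any one missing file, here's a toy Python sketch using plain XOR parity, the simplest RAID-style scheme. This is NOT what par2 actually does: par2 uses Reed-Solomon codes, which are more general and can recover more than one missing block. The function names and sample data are made up for illustration.

```python
def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def recover_missing(surviving_blocks, parity):
    """Rebuild the single missing block from the survivors plus the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


if __name__ == "__main__":
    data = [b"song", b"file", b"data"]  # three equal-length "files"
    parity = xor_blocks(data)           # one parity block covers all three

    # Pretend the middle block was deleted, then rebuild it.
    rebuilt = recover_missing([data[0], data[2]], parity)
    print(rebuilt)  # b'file'
```

The parity block is the same size as a single data block, yet it can stand in for whichever one goes missing. That's the same trick as with the album, where the parity files are much smaller than a full copy.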

If you would like to read more about par and par2, here's a link for the wiki.

Thank you and enjoy.
