PapaScott I like big blogs and I cannot lie! 🐘


The past couple of weeks I've been evaluating BackupPC, an open source backup to hard disk program. It uses standard client programs like rsync and tar or can read Windows smb shares, so there is nothing to install on the clients. It has a nice web interface (mod perl) for administering full and incremental backups and for retrieving restored files (either save to client or retrieve as tar or zip archive via http), Best of all, it has a intelligent pool system so that identical files from various clients are stored only once. So if you have several dozen identical workstations, your disk space requirement for the backup is the size of a single workstation, or less if you turn on compression.

We want to use it for something different, which is why it's taken me so long to get it going. We want to take nightly backups of our development servers, specifically the working directories of our developers with their CVS playgrounds. The servers are under load literally both day (development) and night (daily builds). A typical partition is 20 GB with 1.5 million files, with lots of duplicates . Our tape drives are maxxed out, so we'd like to put the daily backups on disk and archive to tape just once a week.

My problem was not setting up BackupPC itself, but setting up the clients. My test machines all had good performance, but under real conditions rsync was simply too slow, taking 20 hours for a full and 12 hours for an incremental, putting load on both server and client. The testing process was somewhat tedious, since I had to wait overnight each time for results. But I've finally found settings that work... using tar over ssh (rsh is not an option) reniced in a wrapper script on the backup client to -19. The full backups still take 20 hours (that's OK, we'll do full backups on weekends), but the incrementals are done in 3 hours. Bingo. Now we can get everything done overnight.

We'll try things our for a few more days, but it looks like we'll be able to put this system into production.

comments powered by Disqus