X-Git-Url: http://git.rot13.org/?p=BackupPC.git;a=blobdiff_plain;f=doc-src%2FBackupPC.pod;h=cc4175e99d07e41e542be8b123c498d2d46b1526;hp=ea463086fd8ea44bb4e2d7e8555818a2a715bd91;hb=617af75f7419e95a9c3ea05b05cf21957acc331c;hpb=7dee89bfce659051d486cc66515bb7f22bbc4f09 diff --git a/doc-src/BackupPC.pod b/doc-src/BackupPC.pod index ea46308..cc4175e 100644 --- a/doc-src/BackupPC.pod +++ b/doc-src/BackupPC.pod @@ -6,12 +6,12 @@ released on __RELEASEDATE__. =head2 Overview BackupPC is a high-performance, enterprise-grade system for backing up -Linux and WinXX PCs, desktops and laptops to a server's disk. BackupPC -is highly configurable and easy to install and maintain. +Unix, Linux and WinXX PCs, desktops and laptops to a server's disk. +BackupPC is highly configurable and easy to install and maintain. Given the ever decreasing cost of disks and raid systems, it is now -practical and cost effective to backup a large number of machines onto a -server's local disk or network storage. For some sites this might be +practical and cost effective to backup a large number of machines onto +a server's local disk or network storage. For some sites this might be the complete backup solution. For other sites additional permanent archives could be created by periodically backing up the server to tape. @@ -41,7 +41,8 @@ cancel backups and browse and restore files from backups. =item * The http/cgi user interface has internationalization (i18n) support, -currently prodiving English, French and Spanish. +currently providing English, French, German, Spanish, Italian +and Dutch. =item * @@ -96,31 +97,53 @@ BackupPC is Open Source software hosted by SourceForge. =item Full Backup -A full backup is a complete backup of a share. BackupPC can be configured to -do a full backup at a regular interval (often weekly). BackupPC can also -be configured to keep a certain number of full backups, and to keep -a smaller number of very old full backups. +A full backup is a complete backup of a share. 
BackupPC can be +configured to do a full backup at a regular interval (typically +weekly). BackupPC can be configured to keep a certain number +of full backups. Exponential expiry is also supported, allowing +full backups with various vintages to be kept (for example, a +settable number of most recent weekly fulls, plus a settable +number of older fulls that are 2, 4, 8, or 16 weeks apart). =item Incremental Backup -An incremental backup is a backup of files that have changed (based on their -modification time) since the last successful full backup. For SMB and -tar, BackupPC backups all files that have changed since one hour prior to the -start of the last successful full backup. Rsync is more clever: any files -who attributes have changed (ie: uid, gid, mtime, modes, size) since the -last full are backed up. Deleted and new files are also detected by -Rsync incrementals (SMB and tar are not able to detect deleted files or -new files whose modification time is prior to the last full dump. +An incremental backup is a backup of files that have changed (based on +their modification time) since the last successful full backup. For +SMB and tar, BackupPC backs up all files that have changed since one +hour prior to the start of the last successful full backup. Rsync is +more clever: any files whose attributes have changed (ie: uid, gid, +mtime, modes, size) since the last full are backed up. Deleted, new +files and renamed files are detected by Rsync incrementals. +In contrast, SMB and tar incrementals are not able to detect deleted +files, renamed files or new files whose modification time is prior to +the last full dump. BackupPC can also be configured to keep a certain number of incremental backups, and to keep a smaller number of very old incremental backups. (BackupPC does not support multi-level incremental backups, although it -would be easy to do so.) +will in a future version.) 
BackupPC's CGI interface "fills-in" incremental backups based on the last full backup, giving every backup a "full" appearance. This makes browsing and restoring backups easier. +=item Partial Backup + +When a full backup fails or is canceled, and some files have already +been backed up, BackupPC keeps a partial backup containing just the +files that were backed up successfully. The partial backup is removed +when the next successful backup completes, or if another full backup +fails resulting in a newer partial backup. A failed full backup +that has not backed up any files, or any failed incremental backup, +is removed; no partial backup is saved in these cases. + +The partial backup may be browsed or used to restore files just like +a successful full or incremental backup. + +With the rsync transfer method the partial backup is used to resume +the next full backup, avoiding the need to retransfer the file data +already in the partial backup. + =item Identical Files BackupPC pools identical files using hardlinks. By "identical @@ -141,9 +164,7 @@ full support for special file types and unix attributes in v1.4.0 likely means an exact image of a linux/unix file system can be made. BackupPC saves backups onto disk. Because of pooling you can relatively -economically keep several weeks of old backups. But BackupPC does not -provide permanent storage to tape. Other Open Source applications can do -this by backing up BackupPC's pool directories to tape. +economically keep several weeks of old backups. At some sites the disk-based backup will be adequate, without a secondary tape backup. This system is robust to any single failure: if a @@ -151,10 +172,18 @@ client disk fails or loses files, the BackupPC server can be used to restore files. If the server disk fails, BackupPC can be restarted on a fresh file system, and create new backups from the clients. 
The chance of the server disk failing can be made very small by spending more money -on increasingly better RAID systems. +on increasingly better RAID systems. However, there is still the risk +of catastrophic events like fires or earthquakes that can destroy +both the BackupPC server and the clients it is backing up if they +are physically nearby. -At other sites a secondary tape backup will be required. This tape -backup can be done perhaps weekly from the BackupPC pool file system. +Some sites might choose to do periodic backups to tape or cd/dvd. +This backup can be done perhaps weekly using the archive function of +BackupPC. + +Other users have reported success with removable disks to rotate the +BackupPC data drives, or using rsync to mirror the BackupPC data pool +offsite. =back @@ -180,12 +209,23 @@ The SourceForge project page is at: This page has links to the current releases of BackupPC. +=item BackupPC FAQ + +BackupPC has a FAQ at L. + =item Mail lists Three BackupPC mailing lists exist for announcements (backuppc-announce), developers (backuppc-devel), and a general user list for support, asking questions or any other topic relevant to BackupPC (backuppc-users). +The lists are archived on SourceForge and Gmane. The SourceForge lists +are not always up to date and the searching is limited, so Gmane is +a good alternative. See: + + http://news.gmane.org/index.php?prefix=gmane.comp.sysutils.backup.backuppc + http://sourceforge.net/mailarchive/forum.php?forum_id=503 + You can subscribe to these lists by visiting: http://lists.sourceforge.net/lists/listinfo/backuppc-announce @@ -211,144 +251,41 @@ Do not send subscription requests to this address! =item Other Programs of Interest If you want to mirror linux or unix files or directories to a remote server -you should consider rsync, L. BackupPC now uses +you should use rsync, L. 
BackupPC now uses rsync as a transport mechanism; if you are already an rsync user you can think of BackupPC as adding efficient storage (compression and pooling) and a convenient user interface to rsync. Unison is a utility that can do two-way, interactive, synchronization. -See L. +See L. An external wrapper around +rsync that maintains transfer data to enable two-way synchronization is +drsync; see L. -Three popular open source packages that do tape backup are -Amanda (L), -afbackup (L), and -Bacula (L). +Two popular open source packages that do tape backup are +Amanda (L) +and Bacula (L). Amanda can also backup WinXX machines to tape using samba. These packages can be used as back ends to BackupPC to backup the BackupPC server data to tape. Various programs and scripts use rsync to provide hardlinked backups. -See, for example, Mike Rubel's site (L), -J. W. Schultz's dirvish (L), +See, for example, Mike Rubel's site (L), +JW Schultz's dirvish (L), +Ben Escoto's rdiff-backup (L), and John Bowman's rlbackup (L). + BackupPC provides many additional features, such as compressed storage, hardlinking any matching files (rather than just files with the same name), and storing special files without root privileges. But these other scripts -provide simple and effective solutions and are worthy of consideration. +provide simple and effective solutions and are definitely worthy of +consideration. =back =head2 Road map -Here are some ideas for new features that might appear in future -releases of BackupPC: - -=over 4 - -=item * - -Adding hardlink support to rsync. - -=item * - -Adding block and file checksum caching to rsync. This will significantly -increase performance since the server doesn't have to read each file -(twice) to compute the block and file checksums. - -=item * - -Adding a trip wire feature for notification when files below certain -directories change. 
For example, if you are backing up a DMZ machine, -you could request that you get sent email if any files below /bin, -/sbin or /usr change. - -=item * - -Allow editing of config parameters via the CGI interface. Users should -have permission to edit a subset of the parameters for their clients. -Additionally, allow an optional self-service capability so that users -can sign up and setup their own clients with no need for IT support. - -=item * - -Add backend SQL support for various BackupPC metadata, including -configuration parameters, client lists, and backup and restore -information. At installation time the backend data engine will -be specified (eg: MySQL, ascii text etc). - -=item * - -Disconnect the notion of a physical host and a backup client. -Currently there is a one-to-one match between physical hosts -and backup clients. Instead, the current notion of a host -should be replaced by a backup client. Each backup client -corresponds to a physical host. A physical host could have -several backup clients. This is useful for backing up -different types of data, or backing up different portions -of a machine with different frequencies or settings. - -(Note: this has already been implemented in 2.0.0.) - -=item * - -Resuming incomplete full backups. Useful if a machine -(eg: laptop) is disconnected from the network during a backup, -or if the user manually stops a backup. This would be supported -initially for rsync. The partial dump would be kept, and be -browsable. When the next dump starts, an incremental against -the partial dump would be done to make sure it was up to date, -and then the rest of the full dump would be done. - -=item * - -Replacing smbclient with the perl module FileSys::SmbClient. This -gives much more direct control of the smb transfer, allowing -incrementals to depend on any attribute change (eg: exist, mtime, -file size, uid, gid), and better support for include and exclude. 
-Currently smbclient incrementals only depend upon mtime, so -deleted files or renamed files are not detected. FileSys::SmbClient -would also allow resuming of incomplete full backups in the -same manner as rsync will. - -=item * - -Support --listed-incremental or --incremental for tar, -so that incrementals will depend upon any attribute change (eg: exist, -mtime, file size, uid, gid), rather than just mtime. This will allow -tar to be to as capable as FileSys::SmbClient and rsync. - -=item * - -For rysnc (and smb when FileSys::SmbClient is supported, and tar when ---listed-incremental is supported) support multi-level incrementals. -In fact, since incrementals will now be more "accurate", you could -choose to never to full dumps (except the first time), or at a -minimum do them infrequently: each incremental would depend upon -the last, giving a continuous chain of differential dumps. - -=item * - -Add a backup browsing feature that shows backup history by file. -So rather than a single directory view, it would be a table showing -the files (down) and the backups (across). The internal hardlinks -encode which files are identical across backups. You could immediately -see which files changed on which backups. - -=item * - -More speculative: Storing binary file deltas (in fact, reverse deltas) -for files that have the same name as a previous backup, but that aren't -already in the pool. This will save storage for things like mailbox -files or documents that change slightly between backups. Running some -benchmarks on a large pool suggests that the potential savings are -around 15-20%, which isn't spectacular, and likely not worth the -implementation effort. The program xdelta (v1) on SourceForge (see -L) uses an rsync algorithm for -doing efficient binary file deltas. Rather than using an external -program, File::RsyncP will eventually get the necessary delta -generataion code from rsync. - -=back +The new features planned for future releases of BackupPC +are at L. 
Comments and suggestions are welcome. @@ -360,19 +297,19 @@ to contribute to the open source community. BackupPC already has more than enough features for my own needs. The main compensation for continuing to work on BackupPC is knowing that more and more people find it useful. So feedback is certainly -appreciated. Even negative feedback is helpful, for example "We -evaluated BackupPC but didn't use it because it doesn't ...". +appreciated, both positive and negative. Beyond being a satisfied user and telling other people about it, everyone -is encouraged to add links to L (I'll -see then via Google) or otherwise publicize BackupPC. Unlike the -commercial products in this space, I have a zero budget (in both +is encouraged to add links to L +(I'll see them via Google) or otherwise publicize BackupPC. Unlike +the commercial products in this space, I have a zero budget (in both time and money) for marketing, PR and advertising, so it's up to -all of you! +all of you! Feel free to vote for BackupPC at +L. Also, everyone is encouraged to contribute patches, bug reports, feature -and design suggestions, code, and documentation corrections or -improvements. +and design suggestions, new code, FAQs, and documentation corrections or +improvements. Answering questions on the mail list is a big help too. =head1 Installing BackupPC @@ -390,6 +327,12 @@ performance on this server will determine how many simultaneous backups you can run. You should be able to run 4-8 simultaneous backups on a moderately configured server. +Several users have reported significantly better performance using +reiser compared to ext3 for the BackupPC data file system. It is +also recommended you consider either an LVM or raid setup (either +in HW or SW; eg: 3Ware RAID5) so that you can expand the +file system as necessary. + When BackupPC starts with an empty pool, all the backup data will be written to the pool on disk. 
After more backups are done, a higher percentage of incoming files will already be in the pool. BackupPC is @@ -407,8 +350,8 @@ compression is on. =item * Perl version 5.6.0 or later. BackupPC has been tested with -version 5.6.0 and 5.6.1. If you don't have perl, please see -L. +version 5.6.x, and 5.8.x. If you don't have perl, please +see L. =item * @@ -426,9 +369,8 @@ Rsync as a transport method. If you are using smb to backup WinXX machines you need smbclient and nmblookup from the samba package. You will also need nmblookup if you are backing up linux/unix DHCP machines. See L. -Version 2.2.0 or later of Samba is required (smbclient's tar feature in -2.0.8 and prior has bugs for file path lengths around 100 characters -and generates bad output when file lengths change during the backup). +Version 2.2.0 or later of Samba is required. +Samba versions 3.x are stable and now recommended instead of 2.x. See L for source and binaries. It's pretty easy to fetch and compile samba, and just grab smbclient and nmblookup, without @@ -441,17 +383,18 @@ If you are using tar to backup linux/unix machines you should have version 1.13.7 at a minimum, with version 1.13.20 or higher recommended. Use "tar --version" to check your version. Various GNU mirrors have the newest versions of tar, see for example L. -As of February 2003 the latest version is 1.13.25. +As of June 2003 the latest version is 1.13.25. =item * If you are using rsync to backup linux/unix machines you should have -version 2.5.5 on each client machine. See L. -Use "rsync --version" to check your version. +version 2.5.5 or higher on each client machine. See +L. Use "rsync --version" to check your version. For BackupPC to use Rsync you will also need to install the perl File::RsyncP module, which is available from -L. Version 0.31 or later is required. +L. +Version 0.52 or later is required. =item * @@ -475,7 +418,7 @@ Keeping three weekly full backups, and six incrementals is around 1200GB of raw data. 
Because of pooling and compression, only 150GB is needed. -Here's a rule of thumb. Add up the C drive usage of all the machines you +Here's a rule of thumb. Add up the disk usage of all the machines you want to backup (210GB in the first example above). This is a rough minimum space estimate that should allow a couple of full backups and at least half a dozen incremental backups per machine. If compression is on @@ -503,9 +446,35 @@ more troublesome, since it keeps this file locked all the time, so it cannot be read by smbclient whenever Outlook is running. See the L section for more discussion of this problem. +In addition to total disk space, you should make sure you have +plenty of inodes on your BackupPC data partition. Some users have +reported running out of inodes on their BackupPC data partition. +So even if you have plenty of disk space, BackupPC will report +failures when the inodes are exhausted. This is a particular +problem with ext2/ext3 file systems that have a fixed number of +inodes when the file system is built. Use "df -i" to see your +inode usage. + =head2 Step 1: Getting BackupPC -Download the latest version from L. +Some linux distributions now include BackupPC. The Debian +distribution, supported by Ludovic Drolez, can be found at +L; it should be included +in the next stable Debian release. On Debian, BackupPC can +be installed with the command: + + apt-get install backuppc + +In the future there might be packages for Gentoo and other +linux flavors. If the packaged version is older than the +released version then you will probably want to install the +latest version as described below. + +Otherwise, manually fetching and installing BackupPC is easy. +Start by downloading the latest version from +L. Hit the "Code" button, +then select the "backuppc" or "backuppc-beta" package and +download the latest version. =head2 Step 2: Installing the distribution @@ -531,7 +500,7 @@ You can run "perldoc Archive::Zip" to see if this module is installed. 
To use rsync and rsyncd with BackupPC you will need to install File::RsyncP. You can run "perldoc File::RsyncP" to see if this module is installed. File::RsyncP is available from L. -Version 0.31 or later is required. +Version 0.52 or later is required. =back @@ -554,8 +523,38 @@ BackupPC-__VERSION__.tar.gz, run these commands as root: cd BackupPC-__VERSION__ perl configure.pl -You will be prompted for the full paths of various executables, and -you will be prompted for the following information: +In the future this release might also have patches available on the +SourceForge site. These patch files are text files, with a name of +the form + + BackupPC-__VERSION__plN.diff + +where N is the patch level, eg: pl5 is patch-level 5. These +patch files are cumulative: you only need apply the last patch +file, not all the earlier patch files. If a patch file is +available, eg: BackupPC-__VERSION__pl5.diff, you should apply +the patch after extracting the tar file: + + # fetch BackupPC-__VERSION__.tar.gz + # fetch BackupPC-__VERSION__pl5.diff + tar zxf BackupPC-__VERSION__.tar.gz + cd BackupPC-__VERSION__ + patch -p0 < ../BackupPC-__VERSION__pl5.diff + perl configure.pl + +A patch file includes comments that describe the bug fixes +and changes. Feel free to review it before you apply the patch. + +The configure.pl script also accepts command-line options if you +wish to run it in a non-interactive manner. It has self-contained +documentation for all the command-line options, which you can +read with perldoc: + + perldoc configure.pl + +When you run configure.pl you will be prompted for the full paths +of various executables, and you will be prompted for the following +information: =over 4 @@ -570,6 +569,17 @@ sure the BackupPC user's group is chosen restrictively. On this installation, this is __BACKUPPCUSER__. +For security purposes you might choose to configure the BackupPC +user with the shell set to /bin/false. 
Since you might need to +run some BackupPC programs as the BackupPC user for testing +purposes, you can use the -s option to su to explicitly run +a shell, eg: + + su -s /bin/bash __BACKUPPCUSER__ + +Depending upon your configuration you might also need +the -l option. + =item Data Directory You need to decide where to put the data directory, below which @@ -602,55 +612,11 @@ directory. =head2 Step 3: Setting up config.pl After running configure.pl, browse through the config file, -__INSTALLDIR__/conf/config.pl, and make sure all the default settings +__TOPDIR__/conf/config.pl, and make sure all the default settings are correct. In particular, you will need to decide whether to use -smb or tar transport (or whether to set it on a per-PC basis), -set the smb share password (if using smb), set the backup policies -and modify the email message headers and bodies. - -BackupPC needs to know the smb share user name and password for each PC -that uses smb (ie: all the WinXX clients). The user name is specified -in $Conf{SmbShareUserName}. There are four ways to tell BackupPC the smb -share password: - -=over 4 - -=item * - -As an environment variable BPC_SMB_PASSWD set before BackupPC starts. -If you start BackupPC manually the BPC_SMB_PASSWD variable must be set -manually first. For backward compatibility for v1.5.0 and prior, the -environment variable PASSWD can be used if BPC_SMB_PASSWD is not set. -Warning: on some systems it is possible to see environment variables of -running processes. - -=item * - -Alternatively the BPC_SMB_PASSWD setting can be included in -/etc/init.d/backuppc, in which case you must make sure this file -is not world (other) readable. - -=item * - -As a configuration variable $Conf{SmbSharePasswd} in -__TOPDIR__/conf/config.pl. If you put the password -here you must make sure this file is not world (other) readable. - -=item * - -As a configuration variable $Conf{SmbSharePasswd} in the per-PC -configuration file, __TOPDIR__/pc/$host/config.pl. 
You will have to -use this option if the smb share password is different for each host. -If you put the password here you must make sure this file is not -world (other) readable. - -=back - -Placement and protection of the smb share password is a possible -security risk, so please double-check the file and directory -permissions. In a future version there might be support for -encryption of this password, but a private key will still have to -be stored in a protected place. Suggestions are welcome. +smb, tar or rsync transport (or whether to set it on a per-PC basis) +and set the relevant parameters for that transport method. +See the section L for more details. =head2 Step 4: Setting up the hosts file @@ -724,7 +690,7 @@ DHCP addresses to search is specified in $Conf{DHCPAddressRanges}. Note also that the $Conf{ClientNameAlias} feature does not work for clients with DHCP set to 1. - + =item User name This should be the unix login/email name of the user who "owns" or uses @@ -754,9 +720,9 @@ Here's a simple example of a hosts file: =head2 Step 5: Client Setup -Two methods for getting backup data from a client are -supported: smb and tar. Smb is the preferred method for WinXX clients -and tar is preferred method for linux/unix clients. +Two methods for getting backup data from a client are supported: smb and +tar. Smb or rsync are the preferred methods for WinXX clients and rsync or +tar are the preferred methods for linux/unix clients. The transfer method is set using the $Conf{XferMethod} configuration setting. If you have a mixed environment (ie: you will use smb for some @@ -779,11 +745,38 @@ The preferred setup for WinXX clients is to set $Conf{XferMethod} to "smb". prepared to run rsync/cygwin on your WinXX client. More information about this will be provided via the FAQ.) -You need to create shares for the data you want to backup. -Open "My Computer", right click on the drive (eg: C), and -select "Sharing..." 
(or select "Properties" and select the -"Sharing" tab). In this dialog box you can enable sharing, -select the share name and permissions. +If you want to use rsyncd for WinXX clients you can find a pre-packaged +zip file on L. The package is called +cygwin-rsync. It contains rsync.exe, template setup files and the +minimal set of cygwin libraries for everything to run. The README file +contains instructions for running rsync as a service, so it starts +automatically every time you boot your machine. + +If you build your own rsync, for rsync 2.6.2 it is strongly +recommended you apply the patch in the cygwin-rsync package on +L. This patch adds the --checksum-seed +option for checksum caching, and also sends all errors to the client, +which is important so BackupPC can log all file access errors. + + +Otherwise, to use SMB, you can either create shares for the data you want +to backup or you can use the existing C$ share. To create a new +share, open "My Computer", right click on the drive (eg: C), and +select "Sharing..." (or select "Properties" and select the "Sharing" +tab). In this dialog box you can enable sharing, select the share name +and permissions. + +All Windows NT based OS (NT, 2000, XP Pro), are configured by default +to share the entire C drive as C$. This is a special share used for +various administration functions, one of which is to grant access to backup +operators. All you need to do is create a new domain user, specifically +for backup. Then add the new backup user to the built in "Backup +Operators" group. You now have backup capability for any directory on +any computer in the domain in one easy step. This avoids using +administrator accounts and only grants permission to do exactly what you +want for the given user, i.e.: backup. +Also, for additional security, you may wish to deny the ability for this +user to logon to computers in the default domain policy. If this machine uses DHCP you will also need to make sure the NetBios name is set. 
Go to Control Panel|System|Network Identification @@ -792,6 +785,55 @@ Also, you should go to Control Panel|Network Connections|Local Area Connection|Properties|Internet Protocol (TCP/IP)|Properties|Advanced|WINS and verify that NetBios is not disabled. +The relevant configuration settings are $Conf{SmbShareName}, +$Conf{SmbShareUserName}, $Conf{SmbSharePasswd}, $Conf{SmbClientPath}, +$Conf{SmbClientFullCmd}, $Conf{SmbClientIncrCmd} and +$Conf{SmbClientRestoreCmd}. + +BackupPC needs to know the smb share user name and password for a +client machine that uses smb. The user name is specified in +$Conf{SmbShareUserName}. There are four ways to tell BackupPC the +smb share password: + +=over 4 + +=item * + +As an environment variable BPC_SMB_PASSWD set before BackupPC starts. +If you start BackupPC manually the BPC_SMB_PASSWD variable must be set +manually first. For backward compatibility for v1.5.0 and prior, the +environment variable PASSWD can be used if BPC_SMB_PASSWD is not set. +Warning: on some systems it is possible to see environment variables of +running processes. + +=item * + +Alternatively the BPC_SMB_PASSWD setting can be included in +/etc/init.d/backuppc, in which case you must make sure this file +is not world (other) readable. + +=item * + +As a configuration variable $Conf{SmbSharePasswd} in +__TOPDIR__/conf/config.pl. If you put the password +here you must make sure this file is not world (other) readable. + +=item * + +As a configuration variable $Conf{SmbSharePasswd} in the per-PC +configuration file, __TOPDIR__/pc/$host/config.pl. You will have to +use this option if the smb share password is different for each host. +If you put the password here you must make sure this file is not +world (other) readable. + +=back + +Placement and protection of the smb share password is a possible +security risk, so please double-check the file and directory +permissions. 
In a future version there might be support for +encryption of this password, but a private key will still have to +be stored in a protected place. Suggestions are welcome. + As an alternative to setting $Conf{XferMethod} to "smb" (using smbclient) for WinXX clients, you can use an smb network filesystem (eg: ksmbfs or similar) on your linux/unix server to mount the share, @@ -799,13 +841,19 @@ and then set $Conf{XferMethod} to "tar" (use tar on the network mounted file system). Also, to make sure that file names with 8-bit characters are correctly -transferred by smbclient you should add this to samba's smb.conf file: +transferred by smbclient you should add this to samba's smb.conf file +for samba 2.x: [global] # Accept the windows charset client code page = 850 character set = ISO8859-1 +For samba 3.x this should instead be: + + [global] + unix charset = ISO8859-1 + This setting should work for western europe. See L for more information about settings for other languages. @@ -847,20 +895,24 @@ is recommended. Rsync is run on the remote client via rsh or ssh. The relevant configuration settings are $Conf{RsyncClientPath}, $Conf{RsyncClientCmd}, $Conf{RsyncClientRestoreCmd}, $Conf{RsyncShareName}, -$Conf{RsyncArgs}, $Conf{RsyncRestoreArgs} and $Conf{RsyncLogLevel}. +$Conf{RsyncArgs}, and $Conf{RsyncRestoreArgs}. =item rsyncd -You should have at least rsync 2.5.5, and the latest version 2.5.6 +You should have at least rsync 2.5.5, and the latest version 2.6.2 is recommended. In this case the rsync daemon should be running on the client machine and BackupPC connects directly to it. The relevant configuration settings are $Conf{RsyncdClientPort}, $Conf{RsyncdUserName}, $Conf{RsyncdPasswd}, $Conf{RsyncdAuthRequired}, -$Conf{RsyncShareName}, $Conf{RsyncArgs}, $Conf{RsyncRestoreArgs} -and $Conf{RsyncLogLevel}. 
In the case of rsyncd, $Conf{RsyncShareName} -is the name of an rsync module (ie: the thing in square brackets in -rsyncd's conf file -- see rsyncd.conf), not a file system path. +$Conf{RsyncShareName}, $Conf{RsyncArgs}, and $Conf{RsyncRestoreArgs}. +$Conf{RsyncShareName} is the name of an rsync module (ie: the thing +in square brackets in rsyncd's conf file -- see rsyncd.conf), not a +file system path. + +Be aware that rsyncd will remove the leading '/' from path names in +symbolic links if you specify "use chroot = no" in the rsyncd.conf file. +See the rsyncd.conf manual page for more information. =back @@ -896,199 +948,48 @@ for a password. There are two common versions of ssh: v1 and v2. Here are some instructions for one way to setup ssh. (Check which version of SSH you have by typing "ssh" or "man ssh".) -=over 4 - -=item OpenSSH Instructions - -Depending upon your OpenSSH installation, many of these steps can be -replaced by running the scripts ssh-user-config and ssh-host-config -included with OpenSSH. You still need to manually exchange the keys. - -=over 4 - -=item Key generation - -As root on the client machine, use ssh-keygen to generate a -public/private key pair, without a pass-phrase: - - ssh-keygen -t rsa -N '' +=item Mac OS X -This will save the public key in ~/.ssh/id_rsa.pub and the private -key in ~/.ssh/id_rsa. - -=item BackupPC setup - -Repeat the above steps for the BackupPC user (__BACKUPPCUSER__) on the server. -Make a copy of the public key to make it recognizable, eg: - - ssh-keygen -t rsa -N '' - cp ~/.ssh/id_rsa.pub ~/.ssh/BackupPC_id_rsa.pub - -See the ssh and sshd manual pages for extra configuration information. - -=item Key exchange - -To allow BackupPC to ssh to the client as root, you need to place -BackupPC's public key into root's authorized list on the client. 
-Append BackupPC's public key (BackupPC_id_rsa.pub) to root's -~/.ssh/authorized_keys2 file on the client: - - touch ~/.ssh/authorized_keys2 - cat BackupPC_id_rsa.pub >> ~/.ssh/authorized_keys2 - -You should edit ~/.ssh/authorized_keys2 and add further specifiers, -eg: from, to limit which hosts can login using this key. For example, -if your BackupPC host is called backuppc.my.com, there should be -one line in ~/.ssh/authorized_keys2 that looks like: - - from="backuppc.my.com" ssh-rsa [base64 key, eg: ABwBCEAIIALyoqa8....] - -=item Fix permissions - -You will probably need to make sure that all the files -in ~/.ssh have no group or other read/write permission: - - chmod -R go-rwx ~/.ssh - -You should do the same thing for the BackupPC user on the server. - -=item Testing - -As the BackupPC user on the server, verify that this command: - - ssh -l root clientHostName whoami - -prints - - root - -You might be prompted the first time to accept the client's host key and -you might be prompted for root's password on the client. Make sure that -this command runs cleanly with no prompts after the first time. You -might need to check /etc/hosts.equiv on the client. Look at the -man pages for more information. The "-v" option to ssh is a good way -to get detailed information about what fails. - -=back - -=item SSH2 Instructions - -=over 4 - -=item Key generation - -As root on the client machine, use ssh-keygen2 to generate a -public/private key pair, without a pass-phrase: - - ssh-keygen2 -t rsa -P - -or: - - ssh-keygen -t rsa -N '' - -(This command might just be called ssh-keygen on your machine.) - -This will save the public key in /.ssh2/id_rsa_1024_a.pub and the private -key in /.ssh2/id_rsa_1024_a. - -=item Identification - -Create the identification file /.ssh2/identification: - - echo "IdKey id_rsa_1024_a" > /.ssh2/identification - -=item BackupPC setup - -Repeat the above steps for the BackupPC user (__BACKUPPCUSER__) on the server. 
-Rename the key files to recognizable names, eg: - - ssh-keygen2 -t rsa -P - mv ~/.ssh2/id_rsa_1024_a.pub ~/.ssh2/BackupPC_id_rsa_1024_a.pub - mv ~/.ssh2/id_rsa_1024_a ~/.ssh2/BackupPC_id_rsa_1024_a - echo "IdKey BackupPC_id_rsa_1024_a" > ~/.ssh2/identification - -Based on your ssh2 configuration, you might also need to turn off -StrictHostKeyChecking and PasswordAuthentication: - - touch ~/.ssh2/ssh2_config - echo "StrictHostKeyChecking ask" >> ~/.ssh2/ssh2_config - echo "PasswordAuthentication no" >> ~/.ssh2/ssh2_config - -=item Key exchange - -To allow BackupPC to ssh to the client as root, you need to place -BackupPC's public key into root's authorized list on the client. -Copy BackupPC's public key (BackupPC_id_rsa_1024_a.pub) to the -/.ssh2 directory on the client. Add the following line to the -/.ssh2/authorization file on the client (as root): - - touch /.ssh2/authorization - echo "Key BackupPC_id_rsa_1024_a.pub" >> /.ssh2/authorization - -=item Fix permissions - -You will probably need to make sure that all the files -in /.ssh2 have no group or other read/write permission: - - chmod -R go-rwx /.ssh2 - -You should do the same thing for the BackupPC user on the server. - -=item Testing - -As the BackupPC user on the server, verify that this command: - - ssh2 -l root clientHostName whoami - -prints - - root - -You might be prompted the first time to accept the client's host key and -you might be prompted for root's password on the client. Make sure that -this command runs cleanly with no prompts after the first time. You -might need to check /etc/hosts.equiv on the client. Look at the -man pages for more information. The "-v" option to ssh2 is a good way -to get detailed information about what fails. - -=back +In general this should be similar to Linux/Unix machines. +Mark Stosberg reports that you can also use hfstar. +See L. 
-=item SSH version 1 Instructions +=item SSH Setup -The concept is identical and the steps are similar, but the specific -commands and file names are slightly different. +SSH is a secure way to run tar or rsync on a backup client to extract +the data. SSH provides strong authentication and encryption of +the network data. -First, run ssh-keygen on the client (as root) and server (as the BackupPC -user) and simply hit enter when prompted for the pass-phrase: +Note that if you run rsyncd (rsync daemon), ssh is not used. +In this case, rsyncd provides its own authentication, but there +is no encryption of network data. If you want encryption of +network data you can use ssh to create a tunnel, or use a +program like stunnel. If someone submits instructions I - ssh-keygen +Setup instructions for ssh are at +L. -This will save the public key in /.ssh/identity.pub and the private -key in /.ssh/identity. +=item Clients that use DHCP -Next, append BackupPC's ~/.ssh/identity.pub (from the server) to root's -/.ssh/authorized_keys file on the client. It's a single long line that -you can cut-and-paste with an editor (make sure it remains a single line). +If a client machine uses DHCP BackupPC needs some way to find the +IP address given the host name. One alternative is to set dhcp +to 1 in the hosts file, and BackupPC will search a pool of IP +addresses looking for hosts. More efficiently, it is better to +set dhcp = 0 and provide a mechanism for BackupPC to find the +IP address given the host name. -Next, force protocol version 1 by adding: +For WinXX machines BackupPC uses the NetBios name server to determine +the IP address given the host name. +For unix machines you can run nmbd (the NetBios name server) from +the Samba distribution so that the machine responds to a NetBios +name request. See the manual page and Samba documentation for more +information. 
- Protocol 1 +Alternatively, you can set $Conf{NmbLookupFindHostCmd} to any command +that returns the IP address given the host name. -to BackupPC's ~/.ssh/config on the server. - -Next, run "chmod -R go-rwx ~/.ssh" on the server and "chmod -R go-rwx /.ssh" -on the client. - -Finally, test using: - - ssh -l root clientHostName whoami - -=back - -Finally, if this machine uses DHCP you will need to run nmbd (the -NetBios name server) from the Samba distribution so that the machine -responds to a NetBios name request. See the manual page and Samba -documentation for more information. +Please read the section L +for more details. =back @@ -1119,19 +1020,22 @@ it has started and all is ok. =head2 Step 7: Talking to BackupPC -Note: as of version 1.5.0, BackupPC no longer supports telnet -to its TCP port. First off, a unix domain socket is used -instead of a TCP port. (The TCP port can still be re-enabled -if your installation has apache and BackupPC running on different -machines.) Secondly, even if you still use the TCP port, the -messages exchanged over this interface are now protected by -an MD5 digest based on a shared secret (see $Conf{ServerMesgSecret}) -as well as sequence numbers and per-session unique keys, preventing -forgery and replay attacks. - You should verify that BackupPC is running by using BackupPC_serverMesg. This sends a message to BackupPC via the unix (or TCP) socket and prints -the response. +the response. Like all BackupPC programs, BackupPC_serverMesg +should be run as the BackupPC user (__BACKUPPCUSER__), so you +should + + su __BACKUPPCUSER__ + +before running BackupPC_serverMesg. If the BackupPC user is +configured with /bin/false as the shell, you can use the -s +option to su to explicitly run a shell, eg: + + su -s /bin/bash __BACKUPPCUSER__ + +Depending upon your configuration you might also need +the -l option. You can request status information and start and stop backups using this interface. 
This socket interface is mainly provided for the CGI interface @@ -1213,12 +1117,13 @@ This is because setuid scripts are disabled by the kernel in most flavors of unix and linux. To see if your perl has setuid emulation, see if there is a program -called sperl5.6.0 or sperl5.6.1 in the place where perl is installed. -If you can't find this program, then you have two options: rebuild -and reinstall perl with the setuid emulation turned on (answer "y" to -the question "Do you want to do setuid/setgid emulation?" when you -run perl's configure script), or switch to the mod_perl alternative -for the CGI script (which doesn't need setuid to work). +called sperl5.6.0 (or sperl5.8.2 etc, based on your perl version) +in the place where perl is installed. If you can't find this program, +then you have two options: rebuild and reinstall perl with the setuid +emulation turned on (answer "y" to the question "Do you want to do +setuid/setgid emulation?" when you run perl's configure script), or +switch to the mod_perl alternative for the CGI script (which doesn't +need setuid to work). =item Mod_perl Setup @@ -1261,20 +1166,17 @@ to Apache's 1.x httpd.conf file: -For Apache 2.x and perl 5.8.x - Apache 2.0.44 with Perl 5.8.0 on RedHat 7.1, Don Silvia reports that -this works: +this works (with tweaks from Michael Tuzi): LoadModule perl_module modules/mod_perl.so PerlModule Apache2 - + SetHandler perl-script PerlResponseHandler ModPerl::Registry PerlOptions +ParseHeaders Options +ExecCGI - Order deny,allow Deny from all Allow from 192.168.0 @@ -1282,7 +1184,7 @@ this works: AuthType Basic AuthUserFile /path/to/user_file Require valid-user - + There are other optimizations and options with mod_perl. For example, you can tell mod_perl to preload various perl modules, @@ -1323,9 +1225,14 @@ One alternative is to use LDAP. 
In Apache's http.conf add these lines: require valid-user -If you want to defeat the user authentication you can force a -particular user name by getting Apache to set REMOTE_USER, eg, -to hardcode the user to www you could add this to httpd.conf: +If you want to disable the user authentication you can set +$Conf{CgiAdminUsers} to '*', which allows any user to have +full access to all hosts and backups. In this case the REMOTE_USER +environment variable does not have to be set by Apache. + +Alternatively, you can force a particular user name by getting Apache +to set REMOTE_USER, eg, to hardcode the user to www you could add +this to Apache's httpd.conf: # <--- change path as needed Setenv REMOTE_USER www @@ -1340,6 +1247,8 @@ images into $Conf{CgiImageDir} that BackupPC_Admin needs to serve up. You should make sure that $Conf{CgiImageDirURL} is the correct URL for the image directory. +See the section L for suggestions on debugging the Apache authentication setup. + =head2 How BackupPC Finds Hosts Starting with v2.0.0 the way hosts are discovered has changed. In most @@ -1383,7 +1292,7 @@ Depending on your netmask you might need to specify the -B option to nmblookup. For example: nmblookup -B 10.10.1.255 myhost - + If necessary, experiment on the nmblookup command that will return the IP address of the client given its name. Then update $Conf{NmbLookupFindHostCmd} with any necessary options to nmblookup. @@ -1425,6 +1334,46 @@ but does respond to a request directed to its IP address: =over 4 +=item Removing a client + +If there is a machine that no longer needs to be backed up (eg: a retired +machine) you have two choices. First, you can keep the backups accessible +and browsable, but disable all new backups. Alternatively, you can +completely remove the client and all its backups. 
+ +To disable backups for a client there are two special values for +$Conf{FullPeriod} in that client's per-PC config.pl file: + +=over 4 + +=item -1 + +Don't do any regular backups on this machine. Manually +requested backups (via the CGI interface) will still occur. + +=item -2 + +Don't do any backups on this machine. Manually requested +backups (via the CGI interface) will be ignored. + +=back + +This will still allow that client's old backups to be browsable +and restorable. + +To completely remove a client and all its backups, you should remove its +entry in the conf/hosts file, and then delete the __TOPDIR__/pc/$host +directory. Whenever you change the hosts file, you should send +BackupPC a HUP (-1) signal so that it re-reads the hosts file. +If you don't do this, BackupPC will automatically re-read the +hosts file at the next regular wakeup. + +Note that when you remove a client's backups you won't initially recover +a lot of disk space. That's because the client's files are still in +the pool. Overnight, when BackupPC_nightly next runs, all the unused +pool files will be deleted and this will recover the disk space used +by the client's backups. + =item Copying the pool If the pool disk requirements grow you might need to copy the entire @@ -1555,144 +1504,10 @@ can now re-start BackupPC. =back -=head2 Debugging installation problems - -This section will probably grow based on the types of questions on -the BackupPC mail list. Eventually the FAQ at -L will include more details -than this section. - -=over 4 - -=item Check log files - -Assuming BackupPC can start correctly you should inspect __TOPDIR__/log/LOG -for any errors. Assuming backups for a particular host start, you -should be able to look in __TOPDIR__/pc/$host/LOG for error messages -specific to that host. Always check both log files. - -=item CGI script doesn't run - -Perhaps the most common program with the installation is getting the -CGI script to run. 
Often the setuid isn't configured correctly, or -doesn't work on your system. - -First, try running BackupPC_Admin manually as the BackupPC user, eg: - - su __BACKUPPCUSER__ - __CGIDIR__/BackupPC_Admin - -Now try running it as the httpd user (which ever user apache runs as); - - su httpd - __CGIDIR__/BackupPC_Admin - -In both cases do you get normal html output? - -If the first case works but the second case fails with an error that -the wrong user is running the script then you have a setuid problem. -(This assumes you are running BackupPC_Admin without mod_perl, and -you therefore need seduid to work. If you are using mod_perl then -apache should run as user __BACKUPPCUSER__.) - -First you should make sure the cgi-bin directory is on a file system -that doesn't have the "nosuid" mount option. - -Next, experiment by creating this script: - - #!/bin/perl - - printf("My userid is $> (%s)\n", (getpwuid($>))[0]); - -then chown it to backuppc and chmod u+s: - - root# chown backuppc testsetuid - root# chmod u+s testsetuid - root# chmod a+x testsetuid - root# ls -l testsetuid - -rwsr-xr-x 1 backuppc wheel 76 Aug 26 09:46 testsetuid* +=head2 Fixing installation problems -Now run this program as a normal user. What uid does it print? -Try changing the first line of the script to directly call sperl: - - #!/usr/bin/sperl5.8.0 - -(modify according to your version and path). Does this work -instead? - -Finally, you should invoke the CGI script from a browser, using -a URL like: - - http://myHost/cgi-bin/BackupPC/BackupPC_Admin - -You should make sure REMOTE_USER is being set by apache (see the -earlier section) so that user authentication works. Make sure -the config settings $Conf{CgiAdminUserGroup} and $Conf{CgiAdminUsers} -correctly specify the privileged administrator users. - -=item Can't ping or find host - -Please read the section L. - -You should also verify that nmblookup correctly returns the netbios name. 
-This is essential for DHCP hosts, and depending upon the setting of -$Conf{FixedIPNetBiosNameCheck} might also be required for fixed IP -address hosts too. Run this command: - - nmblookup -A hostName - -Verify that the host name is printed. The output might look like: - - received 7 names - DELLLS13 <00> - P - DOMAINNAME <00> - P - DELLLS13 <20> - P - DOMAINNAME <1e> - P - DELLLS13 <03> - P - DELLLS13$ <03> - P - CRAIG <03> - P - -The first name, converted to lower case, is used for the host name. - -=item Transport method doesn't work - -The BackupPC_dump command now has a -v option, so the easiest way to -debug backup problems on a specific host is to run BackupPC_dump -manually as the BackupPC user: - - su __BACKUPPCUSER__ - __INSTALLDIR__/bin/BackupPC_zcat - -The most likely problems will relate to connecting to the smb shares on -each host. On each failed backup, a file __TOPDIR__/pc/$host/XferLOG.bad.z -will be created. This is the stderr output from the transport program. -You can view this file via the CGI interface, or manually uncompress it -with; - - __INSTALLDIR__/bin/BackupPC_zcat __TOPDIR__/pc/$host/XferLOG.bad.z | more - -The first line will show the full command that was run (eg: rsync, tar -or smbclient). Based on the error messages you should figure out what -is wrong. Possible errors on the server side are invalid host, invalid -share name, bad username or password. Possible errors on the client -side are misconfiguration of the share, username or password. - -You should try running the command manually to see what happens. -For example, for smbclient you should it manually and verify that -you can connect to the host in interactive mode, eg: - - smbclient '\\hostName\shareName' -U userName - -shareName should match the $Conf{SmbShareName} setting and userName -should match the the $Conf{SmbShareUserName} setting. - -You will be prompted for the password. 
You should then see this prompt: - - smb: \> - -Verify that "ls" works and then type "quit" to exit. - -=back +Please see the FAQ at L for +debugging suggestions. =head1 Restore functions @@ -1743,15 +1558,25 @@ with a summary of the exact source and target files and directories before you commit. When you give the final go ahead the restore operation will be queued like a normal backup job, meaning that it will be deferred if there is a backup currently running for that host. -When the restore job is run, smbclient or tar is used (depending upon -$Conf{XferMethod}) to actually restore the files. Sorry, there is -currently no option to cancel a restore that has been started. +When the restore job is run, smbclient, tar, rsync or rsyncd is used +(depending upon $Conf{XferMethod}) to actually restore the files. +Sorry, there is currently no option to cancel a restore that has been +started. A record of the restore request, including the result and list of files and directories, is kept. It can be browsed from the host's home page. $Conf{RestoreInfoKeepCnt} specifies how many old restore status files to keep. +Note that for direct restore to work, the $Conf{XferMethod} must +be able to write to the client. For example, that means an SMB +share for smbclient needs to be writable, and the rsyncd module +needs "read only" set to "false". This creates additional security +risks. If you only create read-only SMB shares (which is a good +idea), then the direct restore will fail. You can disable the +direct restore option by setting $Conf{SmbClientRestoreCmd}, +$Conf{TarClientRestoreCmd} and $Conf{RsyncRestoreArgs} to undef. + =item Option 2: Download Zip archive With this option a zip file containing the selected files and directories @@ -1793,6 +1618,8 @@ full file name, eg: It's your responsibility to make sure the file is really compressed: BackupPC_zcat doesn't check which backup the requested file is from. 
+BackupPC_zcat returns a non-zero status if it fails to uncompress +a file. =item BackupPC_tarCreate @@ -1804,7 +1631,7 @@ incremental or full backup. The usage is: BackupPC_tarCreate [-t] [-h host] [-n dumpNum] [-s shareName] - [-r pathRemove] [-p pathAdd] + [-r pathRemove] [-p pathAdd] [-b BLOCKS] [-w writeBufSz] files/directories... The command-line files and directories are relative to the specified @@ -1844,6 +1671,16 @@ path prefix that will be replaced with pathAdd new path prefix +=item -b BLOCKS + +the tar block size, default is 20, meaning tar writes data in 20 * 512 +bytes chunks. + +=item -w writeBufSz + +write buffer size, default 1048576 (1MB). You can increase this if +you are trying to stream to a fast tape device. + =back The -h, -n and -s options specify which dump is used to generate @@ -1916,6 +1753,59 @@ in a location different from their original location. Each of these programs reside in __INSTALLDIR__/bin. +=head1 Archive functions + +BackupPC supports archiving to removable media. For users that require +offsite backups, BackupPC can create archives that stream to tape +devices, or create files of specified sizes to fit onto cd or dvd media. + +Each archive type is specified by a BackupPC host with its XferMethod +set to 'archive'. This allows for multiple configurations at sites where +there might be a combination of tape and cd/dvd backups being made. + +BackupPC provides a menu that allows one or more hosts to be archived. +The most recent backup of each host is archived using BackupPC_tarCreate, +and the output is optionally compressed and split into fixed-sized +files (eg: 650MB). + +The archive for each host is done by default using +__INSTALLDIR__/BackupPC_archiveHost. This script can be copied +and customized as needed. + +=head2 Configuring an Archive Host + +To create an Archive Host, add it to the hosts file just as any other host +and call it a name that best describes the type of archive, e.g. 
ArchiveDLT + +To tell BackupPC that the Host is for Archives, create a config.pl file in +the Archive Host's pc directory, adding the following line: + +$Conf{XferMethod} = 'archive'; + +To further customise the archive's parameters you can add the changed +parameters in the host's config.pl file. The parameters are explained in +the config.pl file. Parameters may be fixed or the user can be allowed +to change them (eg: output device). + +The per-host archive command is $Conf{ArchiveClientCmd}. By default +this invokes + + __INSTALLDIR__/BackupPC_archiveHost + +which you can copy and customize as necessary. + +=head2 Starting an Archive + +In the web interface, click on the Archive Host you wish to use. You will see a +list of previous archives and a summary on each. By clicking the "Start Archive" +button you are presented with the list of hosts and the approximate backup size +(note this is raw size, not projected compressed size). Select the hosts you wish +to archive and press the "Archive Selected Hosts" button. + +The next screen allows you to adjust the parameters for this archive run. +Press the "Start the Archive" button to start archiving the selected hosts with the +parameters displayed. + =head1 BackupPC Design =head2 Some design issues @@ -2013,29 +1903,66 @@ background command queue. =item 2 For each PC, BackupPC_dump is forked. Several of these may be run in -parallel, based on the configuration. First a ping is done to see if the -machine is alive. If this is a DHCP address, nmblookup is run to get -the netbios name, which is used as the host name. The file -__TOPDIR__/pc/$host/backups is read to decide whether a full or -incremental backup needs to be run. If no backup is scheduled, or the ping -to $host fails, then BackupPC_dump exits. - -The backup is done using samba's smbclient or tar over ssh/rsh/nfs piped -into BackupPC_tarExtract, extracting the backup into __TOPDIR__/pc/$host/new. 
-The smbclient or tar output is put into __TOPDIR__/pc/$host/XferLOG. - -As BackupPC_tarExtract extracts the files from smbclient, it checks each -file in the backup to see if it is identical to an existing file from -any previous backup of any PC. It does this without needed to write the -file to disk. If the file matches an existing file, a hardlink is -created to the existing file in the pool. If the file does not match any -existing files, the file is written to disk and the file name is saved -in __TOPDIR__/pc/$host/NewFileList for later processing by -BackupPC_link. BackupPC_tarExtract can handle arbitrarily large -files and multiple candidate matching files without needing to -write the file to disk in the case of a match. This significantly -reduces disk writes (and also reads, since the pool file comparison -is done disk to memory, rather than disk to disk). +parallel, based on the configuration. First a ping is done to see if +the machine is alive. If this is a DHCP address, nmblookup is run to +get the netbios name, which is used as the host name. If DNS lookup +fails, $Conf{NmbLookupFindHostCmd} is run to find the IP address from +the host name. The file __TOPDIR__/pc/$host/backups is read to decide +whether a full or incremental backup needs to be run. If no backup is +scheduled, or the ping to $host fails, then BackupPC_dump exits. + +The backup is done using the specified XferMethod. Either samba's smbclient +or tar over ssh/rsh/nfs piped into BackupPC_tarExtract, or rsync over ssh/rsh +is run, or rsyncd is connected to, with the incoming data +extracted to __TOPDIR__/pc/$host/new. The XferMethod output is put +into __TOPDIR__/pc/$host/XferLOG. 
+ +The letter in the XferLOG file shows the type of object, similar to the +first letter of the modes displayed by ls -l: + + d -> directory + l -> symbolic link + b -> block special file + c -> character special file + p -> pipe file (fifo) + nothing -> regular file + +The words mean: + +=over 4 + +=item create + +new for this backup (ie: directory or file not in pool) + +=item pool + +found a match in the pool + +=item same + +file is identical to previous backup (contents were +checksummed and verified during full dump). + +=item skip + +file skipped in incremental because attributes are the +same (only displayed if $Conf{XferLogLevel} >= 2). + +=back + +As BackupPC_tarExtract extracts the files from smbclient or tar, or as +rsync runs, it checks each file in the backup to see if it is identical +to an existing file from any previous backup of any PC. It does this +without needing to write the file to disk. If the file matches an +existing file, a hardlink is created to the existing file in the pool. +If the file does not match any existing files, the file is written to +disk and the file name is saved in __TOPDIR__/pc/$host/NewFileList for +later processing by BackupPC_link. BackupPC_tarExtract and rsync can handle +arbitrarily large files and multiple candidate matching files without +needing to write the file to disk in the case of a match. This +significantly reduces disk writes (and also reads, since the pool file +comparison is done disk to memory, rather than disk to disk). Based on the configuration settings, BackupPC_dump checks each old backup to see if any should be removed. Any expired backups @@ -2056,20 +1983,20 @@ is removed and replaced by a hard link to the existing file. If the file is new, a hard link to the file is made in the pool area, so that this file is available for checking against each new file and new backup. 
-Then, assuming $Conf{IncrFill} is set, for each incremental backup, -hard links are made in the new backup to all files that were not extracted -during the incremental backups. The means the incremental backup looks -like a complete image of the PC (with the exception that files -that were removed on the PC since the last full backup will still -appear in the backup directory tree). - -As of v1.03, the CGI interface knows how to merge unfilled -incremental backups will the most recent prior filled (full) -backup, giving the incremental backups a filled appearance. The -default for $Conf{IncrFill} is off, since there is now no need to -fill incremental backups. This saves some level of disk activity, -since lots of extra hardlinks are no longer needed (and don't have -to be deleted when the backup expires). +Then, if $Conf{IncrFill} is set (note that the default setting is +off), for each incremental backup, hard links are made in the new +backup to all files that were not extracted during the incremental +backups. This means the incremental backup looks like a complete +image of the PC (with the exception that files that were removed on +the PC since the last full backup will still appear in the backup +directory tree). + +The CGI interface knows how to merge unfilled incremental backups with +the most recent prior filled (full) backup, giving the incremental +backups a filled appearance. The default for $Conf{IncrFill} is off, +since there is no need to fill incremental backups. This saves +some level of disk activity, since lots of extra hardlinks are no +longer needed (and don't have to be deleted when the backup expires). =item 4 @@ -2082,7 +2009,13 @@ administrative tasks, such as cleaning the pool. This involves removing any files in the pool that only have a single hard link (meaning no backups are using that file). 
Again, to avoid race conditions, BackupPC_nightly is only run when there are no BackupPC_dump or BackupPC_link processes -running. 
Therefore, when it is time to run BackupPC_nightly, no new +backups are started and BackupPC waits until all backups have finished. +Then BackupPC_nightly is run, and until it finishes no new backups are +started. If BackupPC_nightly takes too long to run, the settings +$Conf{MaxBackupPCNightlyJobs} and $Conf{BackupPCNightlyPeriod} can +be used to run several BackupPC_nightly processes in parallel, and +to split its job over several nights. =back @@ -2251,7 +2184,7 @@ Last month's log file. Log files are aged monthly and compressed =item XferERR or XferERR.z -Output from the transport program (ie: smbclient or tar) +Output from the transport program (ie: smbclient, tar or rsync) for the most recent failed backup. =item new @@ -2261,7 +2194,7 @@ directory is renamed if the backup succeeds. =item XferLOG or XferLOG.z -Output from the transport program (ie: smbclient or tar) +Output from the transport program (ie: smbclient, tar or rsync) for the current backup. =item nnn (an integer) @@ -2270,7 +2203,7 @@ Successful backups are in directories numbered sequentially starting at 0. =item XferLOG.nnn or XferLOG.nnn.z -Output from the transport program (ie: smbclient or tar) +Output from the transport program (ie: smbclient, tar or rsync) corresponding to backup number nnn. =item RestoreInfo.nnn @@ -2281,9 +2214,20 @@ numbers are not related to the backup number.) =item RestoreLOG.nnn.z -Output from smbclient or tar during restore #nnn. (Note that the restore +Output from smbclient, tar or rsync during restore #nnn. (Note that the restore numbers are not related to the backup number.) +=item ArchiveInfo.nnn + +Information about archive request #nnn including who, what, when, and +why. This file is in Data::Dumper format. (Note that the archive +numbers are not related to the restore or backup number.) + +=item ArchiveLOG.nnn.z + +Output from archive #nnn. (Note that the archive numbers are not related +to the backup or restore number.) 
+ =item config.pl Optional configuration settings specific to this host. Settings in this @@ -2317,11 +2261,11 @@ Stop time of the backup in unix seconds. =item nFiles -Number of files backed up (as reported by smbclient or tar). +Number of files backed up (as reported by smbclient, tar or rsync). =item size -Total file size backed up (as reported by smbclient or tar). +Total file size backed up (as reported by smbclient, tar or rsync). =item nFilesExist @@ -2345,15 +2289,15 @@ Total size of files that were not in the pool =item xferErrs -Number of errors or warnings from smbclient (zero for tar). +Number of errors or warnings from smbclient, tar or rsync. =item xferBadFile -Number of errors from smbclient that were bad file errors (zero for tar). +Number of errors from smbclient that were bad file errors (zero otherwise). =item xferBadShare -Number of errors from smbclient that were bad share errors (zero for tar). +Number of errors from smbclient that were bad share errors (zero otherwise). =item tarErrs @@ -2444,7 +2388,37 @@ Number of errors from BackupPC_tarCreate during restore. =item xferErrs -Number of errors from smbclient or tar during restore. +Number of errors from smbclient, tar or rsync during restore. + +=back + +=item archives + +A tab-delimited ascii table listing information about each requested +archive, one per row. The columns are: + +=over 4 + +=item num + +Archive number (matches the suffix of the ArchiveInfo.nnn and +ArchiveLOG.nnn.z file), unrelated to the backup or restore number. + +=item startTime + +Start time of the archive in unix seconds. + +=item endTime + +End time of the archive in unix seconds. + +=item result + +Result (ok or failed). + +=item errorMsg + +Error message if archive failed. =back @@ -2479,6 +2453,72 @@ To easily decompress a BackupPC compressed file, the script BackupPC_zcat can be found in __INSTALLDIR__/bin. For each file name argument it inflates the file and writes it to stdout. 
+=head2 Rsync checksum caching + +An incremental backup with rsync compares attributes on the client +with the last full backup. Any files with identical attributes +are skipped. A full backup with rsync sets the --ignore-times +option, which causes every file to be examined independent of +attributes. + +Each file is examined by generating block checksums (default 2K +blocks) on the receiving side (that's the BackupPC side), sending +those checksums to the client, where the remote rsync matches those +checksums with the corresponding file. The matching blocks and new +data are sent back, allowing the client file to be reassembled. +A checksum for the entire file is sent as an extra check that +the reconstructed file is correct. + +This results in significant disk IO and computation for BackupPC: +every file in a full backup, or any file with non-matching attributes +in an incremental backup, needs to be uncompressed, block checksums +computed and sent. Then the receiving side reassembles the file and +has to verify the whole-file checksum. Even if the file is identical, +prior to 2.1.0, BackupPC had to read and uncompress the file twice, +once to compute the block checksums and later to verify the whole-file +checksum. + +Starting in 2.1.0, BackupPC supports optional checksum caching, +which means the block and file checksums only need to be computed +once for each file. This results in a significant performance +improvement. This only works for compressed pool files. +It is enabled by adding + + '--checksum-seed=32761', + +to $Conf{RsyncArgs} and $Conf{RsyncRestoreArgs}. + +Rsync versions prior to and including rsync-2.6.2 need a small patch to +add support for the --checksum-seed option. This patch is available in +the cygwin-rsyncd package at L. +This patch is already included in rsync CVS, so it will be standard +in future versions of rsync.
+ +When this option is present, BackupPC will add block and file checksums +to the compressed pool file the next time a pool file is used and it +doesn't already have cached checksums. The first time a new file is +written to the pool, the checksums are not appended. The next time +checksums are needed for a file, they are computed and added. So the +full performance benefit of checksum caching won't be noticed until the +third time a pool file is used (eg: the third full backup). + +With checksum caching enabled, there is a risk that should a file's contents +in the pool be corrupted due to a disk problem, but the cached checksums +are still correct, the corruption will not be detected by a full backup, +since the file contents are no longer read and compared. To reduce the +chance that this remains undetected, BackupPC can recheck cached checksums +for a fraction of the files. This fraction is set with the +$Conf{RsyncCsumCacheVerifyProb} setting. The default value of 0.01 means +that 1% of the time a file's checksums are read, the checksums are verified. +This reduces performance slightly, but, over time, ensures that file +contents are in sync with the cached checksums. + +The format of the cached checksum data can be discovered by looking at +the code. Basically, the first byte of the compressed file is changed +to denote that checksums are appended. The block and file checksum +data, plus some other information and a magic word, are appended to the +compressed file. This allows the cache update to be done in-place. + =head2 File name mangling Backup file names are stored in "mangled" form. Each node of @@ -2591,295 +2631,14 @@ BackupPC's data directory with the noatime attribute (see mount(1)). =head2 Limitations -BackupPC isn't perfect (but it is getting better).
Here are some -limitations of BackupPC: - -=over 4 - -=item Non-unix file attributes not backed up - -smbclient doesn't extract the WinXX ACLs, so file attributes other than -the equivalent (as provided by smbclient) unix attributes are not -backed up. - -=item Locked files are not backed up - -Under WinXX a locked file cannot be read by smbclient. Such files will -not be backed up. This includes the WinXX system registry files. - -This is especially troublesome for Outlook, which stores all its data -in a single large file and keeps it locked whenever it is running. -Since many users keep Outlook running all the time their machine -is up their Outlook file will not be backed up. Sadly, this file -is the most important file to backup. As one workaround, Microsoft has -a user-level application that periodically asks the user if they want to -make a copy of their outlook.pst file. This copy can then be backed up -by BackupPC. See L. - -Similarly, all of the data for WinXX services like SQL databases, -Exchange etc won't be backed up. If these applications support -some kind of export or utility to save their data to disk then this -can =used to create files that BackupPC can backup. - -So far, the best that BackupPC can do is send warning emails to -the user saying that their outlook files haven't been backed up in -X days. (X is configurable.) The message invites the user to -exit Outlook and gives a URL to manually start a backup. - -I suspect there is a way of mirroring the outlook.pst file so -that at least the mirror copy can be backed up. Or perhaps a -manual copy can be started at login. Does some WinXX expert -know how to do this? - -Comment: two users have noted that there are commercial OFM (open file -manager) products that are designed to solve this problem, for example -from St. Bernard or Columbia Data Products. Apparently Veritas and -Legato bundle this product with their commercial products. See for -example L. 
-If anyone tries these programs with BackupPC please tell us whether or -not they work. - -=item Don't expect to reconstruct a complete WinXX drive - -The conclusion from the last few items is that BackupPC is not intended -to allow a complete WinXX disk to be re-imaged from the backup. Our -approach to system restore in the event of catastrophic failure is to -re-image a new disk from a generic master, and then use the BackupPC -archive to restore user files. - -It is likely that linux/unix backups done using tar (rather than -smb) can be used to reconstruct a complete file system, although -I haven't tried it. - -=item Maximum Backup File Sizes - -BackupPC can backup and manage very large file sizes, probably as large -as 2^51 bytes (when a double-precision number's mantissa can no longer -represent an integer exactly). In practice, several things outside -BackupPC limit the maximum individual file size. Any one of the -following items will limit the maximum individual file size: - -=over 4 - -=item Perl - -Perl needs to be compiled with uselargefiles defined. Check your -installation with: - - perl -V | egrep largefiles - -Without this, the maximum file size will be 2GB. - -=item File system - -The BackupPC pool and data directories must be on a file system that -supports large files. - -Without this, the maximum file size will be 2GB. - -=item Transport - -The transport mechanism also limits the maximum individual file size. - -GNU tar maximum file size is limited by the tar header format. The tar -header uses 11 octal digits to represent the file size, which is 33 bits -or 8GB. I vaguely recall (but I haven't recently checked) that GNU tar -uses an extra octal digit (replacing a trailing delimiter) if necessary, -allowing 64GB files. So tar transport limits the maximum file size to -8GB or perhaps 64GB. It is possible that files >= 8GB don't work; this -needs to be looked into. - -Smbclient is limited to 4GB file sizes. 
Moreover, a bug in smbclient -(mixing signed and unsigned 32 bit values) causes it to incorrectly -do the tar octal conversion for file sizes from 2GB-4GB. BackupPC_tarExtract -knows about this bug and can recover the correct file size. So smbclient -transport works up to 4GB file sizes. - -=back - -=item Some tape backup systems aren't smart about hard links - -If you backup the BackupPC pool to tape you need to make sure that the -tape backup system is smart about hard links. For example, if you -simply try to tar the BackupPC pool to tape you will backup a lot more -data than is necessary. - -Using the example at the start of the installation section, 65 hosts are -backed up with each full backup averaging 3.2GB. Storing one full backup -and two incremental backups per laptop is around 240GB of raw data. But -because of the pooling of identical files, only 87GB is used (with -compression the total is lower). If you run du or tar on the data -directory, there will appear to be 240GB of data, plus the size of the -pool (around 87GB), or 327GB total. - -If your tape backup system is not smart about hard links an alternative -is to periodically backup just the last successful backup for each host -to tape. Another alternative is to do a low-level dump of the pool -file system (ie: /dev/hda1 or similar) using dump(1). - -Supporting more efficient tape backup is an area for further -development. - -=item Incremental backups might included deleted files - -To make browsing and restoring backups easier, incremental backups -are "filled-in" from the last complete backup when the backup is -browsed or restored. - -However, if a file was deleted by a user after the last full backup, that -file will still appear in the "filled-in" incremental backup. This is not -really a specific problem with BackupPC, rather it is a general issue -with the full/incremental backup paradigm. 
This minor problem could be -solved by having smbclient list all files when it does the incremental -backup. Volunteers anyone? - -=back - -Comments or suggestions on these issues are welcome. +BackupPC isn't perfect (but it is getting better). Please see +L for a +discussion of some of BackupPC's limitations. =head2 Security issues -Please read this section and consider each of the issues carefully. - -=over 4 - -=item Smb share password - -An important security risk is the manner in which the smb share -passwords are stored. They are in plain text. As described in -L there are four -ways to tell BackupPC the smb share password (manually setting an environment -variable, setting the environment variable in /etc/init.d/backuppc, -putting the password in __TOPDIR__/conf/config.pl, or putting the -password in __TOPDIR__/pc/$host/config.pl). In the latter 3 cases the -smb share password appears in plain text in a file. - -If you use any of the latter three methods please make sure that the file's -permission is appropriately restricted. If you also use RCS or CVS, double -check the file permissions of the config.pl,v file. - -In future versions there will probably be support for encryption of the -smb share password, but a private key will still have to be stored in a -protected place. Comments and suggestions are welcome. - -=item BackupPC socket server - -In v1.5.0 the primary method for communication between the CGI program -(BackupPC_Admin) and the server (BackupPC) is via a unix-domain socket. -Since this socket has restricted permissions, no local user should be -able to connect to this port. No backup or restore data passes through -this interface, but an attacker can start or stop backups and get status -through this port. - -If the Apache server and BackupPC_Admin run on a different host to -BackupPC then a TCP port must be enabled by setting $Conf{ServerPort}. -Anyone can connect to this port. 
To avoid possible attacks via the TCP -socket interface, every client message is protected by an MD5 digest. -The MD5 digest includes four items: - -=over 4 - -=item * - -a seed that is sent to the client when the connection opens - -=item * - -a sequence number that increments for each message - -=item * - -a shared secret that is stored in $Conf{ServerMesgSecret} - -=item * - -the message itself. - -=back - -The message is sent in plain text preceded by the MD5 digest. A -snooper can see the plain-text seed sent by BackupPC and plain-text -message from the client, but cannot construct a valid MD5 digest since -the secret in $Conf{ServerMesgSecret} is unknown. A replay attack is -not possible since the seed changes on a per-connection and -per-message basis. - -So if you do enable the TCP port, please set $Conf{ServerMesgSecret} -to some hard-to-guess string. A denial-of-service attack is possible -with the TCP port enabled. Someone could simply connect many times -to this port, until BackupPC had exhausted all its file descriptors, -and this would cause new backups and the CGI interface to fail. The -most secure solution is to run BackupPC and Apache on the same machine -and disable the TCP port. - -By the way, if you have upgraded from a version of BackupPC prior to -v1.5.0 you should set $Conf{ServerPort} to -1 to disable the TCP port. - -=item Installation permissions - -It is important to check that the BackupPC scripts in __INSTALLDIR__/bin -and __INSTALLDIR__/lib cannot be edited by normal users. Check the -directory permissions too. - -=item Pool permissions - -It is important to check that the data files in __TOPDIR__/pool, -__TOPDIR__/pc and __TOPDIR__/trash cannot be read by normal users. -Normal users should not be able to see anything below __TOPDIR__. - -=item Host shares - -Enabling shares on hosts carries security risks. If you are on a private -network and you generally trust your users then there should not be a -problem. 
But if you have a laptop that is sometimes on public networks -(eg: broadband or even dialup) you should be concerned. A conservative -approach is to use firewall software, and only enable the netbios and -smb ports (137 and 139) on connections from the host running BackupPC. - -=item SSH key security - -Using ssh for linux/unix clients is quite secure, but the security is -only as good as the protection of ssh's private keys. If an attacker can -devise a way to run a shell as the BackupPC user then they will have -access to BackupPC's private ssh keys. They can then, in turn, ssh to -any client machine as root (or whichever user you have configured -BackupPC to use). This represents a serious compromise of your entire -network. So in vulnerable networks, think carefully about how to protect -the machine running BackupPC and how to prevent attackers from gaining -shell access (as the BackupPC user) to the machine. - -=item CGI interface - -The CGI interface, __CGIDIR__/BackupPC_Admin, needs access to the pool -files so it is installed setuid to __BACKUPPCUSER__. The permissions of -this file need to checked carefully. It should be owned by -__BACKUPPCUSER__ and have user and group (but not other) execute -permission. To allow apache/httpd to execute it, the group ownership -should be something that apache/httpd belongs to. - -The Apache configuration should be setup for AuthConfig style, -using a .htaccess file so that the user's name is passed into -the script as $ENV{REMOTE_USER}. - -If normal users could directly run BackupPC_Admin then there is a serious -security hole: since it is setuid to __BACKUPPCUSER__ any user can -browse and restore any backups. Be aware that anyone who is allowed to -edit or create cgi scripts on your server can execute BackupPC_Admin as -any user! They simply write a cgi script that sets $ENV{REMOTE_USER} and -then execs BackupPC_Admin. 
The exec succeeds since httpd runs the first -script as user httpd/apache, which in turn has group permission to -execute BackupPC_Admin. - -While this setup should be safe, a more conservative approach is to -run a dedicated Apache as user __BACKUPPCUSER__ on a different port. -Then BackupPC_Admin no longer needs to be setuid, and the cgi -directories can be locked down from normal users. Moreover, this -setup is exactly the one used to support mod_perl, so this provides -both the highest performance and the lowest security risk. - -=back - -Comments and suggestions are welcome. +Please see L for a +discussion of various security issues. =head1 Configuration File @@ -2982,12 +2741,16 @@ __CONFIGPOD__ =head1 Version Numbers -Starting with v1.4.0 BackupPC switched to a X.Y.Z version numbering -system, instead of X.0Y. The first digit is for major new releases, the -middle digit is for significant feature releases and improvements (most -of the releases have been in this category), and the last digit is for +Starting with v1.4.0 BackupPC uses a X.Y.Z version numbering system, +instead of X.0Y. The first digit is for major new releases, the middle +digit is for significant feature releases and improvements (most of +the releases have been in this category), and the last digit is for bug fixes. You should think of the old 1.00, 1.01, 1.02 and 1.03 as -1.0.0, 1.1.0, 1.2.0 and 1.3.0. +1.0.0, 1.1.0, 1.2.0 and 1.3.0. + +Additionally, patches might be made available. A patched version +number is of the form X.Y.ZplN (eg: 2.1.0pl2), where N is the +patch level. =head1 Author @@ -2997,24 +2760,41 @@ See L. =head1 Copyright -Copyright (C) 2001-2003 Craig Barratt +Copyright (C) 2001-2005 Craig Barratt =head1 Credits +Ryan Kucera contributed the directory navigation code and images +for v1.5.0. He contributed the first skeleton of BackupPC_restore. +He also added a significant revision to the CGI interface, including +CSS tags, in v2.1.0, and designed the BackupPC logo.
+ Xavier Nicollet, with additions from Guillaume Filion, added the internationalization (i18n) support to the CGI interface for v2.0.0. Xavier provided the French translation fr.pm, with additions from Guillaume. -Ryan Kucera contributed the directory navigation code and images -for v1.5.0. He also contributed the first skeleton of BackupPC_restore. - Guillaume Filion wrote BackupPC_zipCreate and added the CGI support for zip download, in addition to some CGI cleanup, for v1.5.0. +Guillaume continues to support fr.pm updates for each new version. + +Josh Marshall implemented the Archive feature in v2.1.0. + +Ludovic Drolez supports the BackupPC Debian package. + +Javier Gonzalez provided the Spanish translation, es.pm, for v2.0.0. + +Manfred Herrmann provided the German translation, de.pm, for v2.0.0. +Manfred continues to support de.pm updates for each new version, +together with some help from Ralph Paßgang. + +Lorenzo Cappelletti provided the Italian translation, it.pm, for v2.1.0. -Javier Gonzalez provided the Spanish translation, es.pm. +Lieven Bridts provided the Dutch translation, nl.pm, for v2.1.0, +with some tweaks from Guus Houtzager. -Manfred provided the German translation, de.pm. +Reginaldo Ferreira provided the Brazilian Portuguese translation, +pt_br.pm, for v2.2.0. Many people have reported bugs, made useful suggestions and helped with testing; see the ChangeLog and the mail lists.