It stores its backups on tape.
It keeps a database of what's on which tape.
It supports a variety of underlying backup programs (tar, dump, etc).
Current status: abandoned
Last modification: 1996
Download: here
FROGBAK(1)                                                    FROGBAK(1)

NAME
       frogbak - schedule and execute backups for a small network

SYNOPSIS
       frogbak [-dryrun] [-summary] [-future #h|#d|#w|#m]
               [-control control_file]
       mkblank tapename
       recycle tapename [tapename...]
       sum_covered control_file [control_file...]
       send_offsite

DESCRIPTION
       These programs provide frogbak services for small networks. The
       algorithms they use were designed for the following environment:
       one tape drive, 20 GB of disk space, several kinds of computers,
       and a lazy programmer who was paranoid about dumps. The closer
       your environment is to that, the better the system will work for
       you.

       The basic design is that the system chooses when to do what kind
       of dump. It does two kinds of dumps: incrementals and fulls.
       Unlike a differential, an incremental dump saves the files
       modified from the time of the last dump at the same dump level to
       the present, rather than the files modified from the time of the
       last dump at a lower level to the present.

       These dumps are organized into three separate tracks. The tracks
       are independent of one another, such that an incremental in one
       track does not affect the coverage of an incremental in another
       track. Two of the tracks have only incrementals and the last has
       only full dumps. This setup means that each file gets saved onto
       two tapes by the incremental tracks.

       This notion of tracks is a convenient way to think about the
       behavior of frogbak, but it is not how the behavior is
       implemented: it is implemented by making the incrementals save
       the files modified between now and the incremental before the
       last one.

       To restore a filesystem, the last good full dump must be read
       first. If a full dump is bad, just ignore it and skip to the
       previous one. After the full dump, every other incremental from
       the time of the full dump until the present must be read.
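       As a rough illustration of the restore rule above (this is not
       code from frogbak itself), the following Python sketch picks
       which incrementals to read after a full dump. The function name
       and the day-number timeline are made up for the example, and it
       assumes that every incremental covers back to the incremental
       before the previous one, as described.

```python
def restore_chain(full_time, incremental_times):
    """Return the incremental dump times to read after a full dump, oldest first.

    Assumption: each incremental covers back to the incremental before the
    previous one, so walking back from the newest dump and taking every
    other one gives unbroken coverage back to the full dump.
    """
    later = sorted(t for t in incremental_times if t > full_time)
    newest_first = later[::-1]
    return newest_first[::2][::-1]


# Hypothetical timeline: a full dump on day 0, incrementals on days 1 to 5.
chain = restore_chain(0, [1, 2, 3, 4, 5])
print(chain)  # [1, 3, 5]
```

       The incrementals left out of the list (days 2 and 4 here) belong
       to the other track.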
       If an incremental is bad, switch to the other track.

       It is not expected that dumps will be bad. Exabyte and DAT media
       seem to be very reliable in comparison to the 9-track tapes used
       last decade. However, frogbak performs the dumps on live
       filesystems, and thus sometimes the dump data will be bad even
       though the media is okay.

       The choice of dumps is specified with a control file. For each
       filesystem, the control file specifies:

              The frequency of incremental dumps. This places a limit on
              how often a dump is performed. Dumps will not occur more
              often than the specified rate.

              The importance of incremental dumps. The importance is
              combined with the length of time since the last
              incremental dump, compared to the specified frequency of
              dumps, to produce a rating number. The formula is
              something like

                   rating = time since last dump / frequency * importance

              Thus if the frequency is two days, the importance is 50,
              and it has been ten days since an incremental dump, then
              the rating would be 10/2*50 = 250.

              The frequency and importance of full dumps.

       The ratings for both incremental and full dumps are compared on a
       filesystem by filesystem basis. For each filesystem, either the
       full dump or the incremental will be discarded from consideration
       at this point. The ratings for all of the filesystems are then
       sorted and dumps are performed in order, based on these
       priorities.

CONTROL FILES
       There are four types of lines that can be in the control files.
       They are: comments, variable assignments, filesystem control
       lines, and average statements.

       Any line beginning with a hash (#) symbol is a comment.

       Any line beginning with a legal (C-style) identifier followed by
       an equals sign (=) is a variable assignment. Variable references
       may be made in variable assignments and in the dump valuation
       columns.
       Variables are recognized because the only other symbols that may
       occur in those locations are numbers and math operators like
       times (*) and plus (+).

       Filesystem control lines are made up of seven
       whitespace-separated fields:

       filesystem
              Names the filesystem to be dumped.

       host   Names the system that the filesystem is on.

       os     Names the dump program to be used to dump the filesystem.
              The current legal values for the os field are: solaris,
              sunos, freebsd, netbsd, hpux, hp-ux, mach, domain, ultrix,
              sony, linux, gtar, dostar, targtar, and xenix. Dumps can
              be done with the GNU tar program. On linux it is the
              default and is called tar; on other systems it is called
              gtar. DOS filesystems can be dumped with gtar if they are
              mounted on a unix system. The current dostar setup assumes
              that the tar program is really GNU tar.

       ifreq  Specifies the frequency of incremental dumps. The format
              is <number><unit>, where <number> is a decimal number and
              <unit> is one of h, d, w, or m, corresponding to hours,
              days, weeks, and months.

       ivalue Specifies the relative value of doing an incremental dump
              on this filesystem after a duration equal to the ifreq.

       ffreq  Specifies the frequency of doing a full dump. Same format
              as ifreq.

       fvalue Specifies the value of doing a full dump of this
              filesystem after a duration equal to the ffreq.

       I recommend setting all of the frequencies at one day. That way
       you can tell if everything is getting dumped or not. I further
       recommend setting a variable to be the relative importance of
       doing full dumps. Then when the ivalue is set to x, and the
       fvalue is set to x/fd, the number of full dumps per incremental
       can be varied by changing the value of fd. This allows the dump
       system to be tuned easily.

       The last sort of line is an average statement. The syntax of an
       average statement is

              average system-name filesystem-1 filesystem-2 etc...
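       Returning to the filesystem control lines, here is a small Python
       sketch (not taken from frogbak, which is written in perl) of how
       such a line and its frequency fields might be read, and how the
       rating formula from DESCRIPTION applies to them. It assumes the
       seven fields appear on the line in the order listed above, treats
       a month as 30 days, and does not handle variables or expressions
       in the value columns. All function names and the sample line are
       hypothetical.

```python
# Hours per unit; treating a month as 30 days is an assumption.
HOURS_PER_UNIT = {"h": 1, "d": 24, "w": 7 * 24, "m": 30 * 24}

# Assumed field order: the order in which the fields are documented.
FIELD_NAMES = ("filesystem", "host", "os", "ifreq", "ivalue", "ffreq", "fvalue")


def parse_frequency(text):
    """Convert '<number><unit>' (for example '2d') into hours."""
    number, unit = text[:-1], text[-1:]
    if unit not in HOURS_PER_UNIT or not number:
        raise ValueError(f"bad frequency: {text!r}")
    return float(number) * HOURS_PER_UNIT[unit]


def parse_control_line(line):
    """Split a filesystem control line into its seven named fields."""
    fields = line.split()
    if len(fields) != len(FIELD_NAMES):
        raise ValueError(f"expected 7 fields, got {len(fields)}")
    return dict(zip(FIELD_NAMES, fields))


def rating(hours_since_last_dump, frequency_hours, importance):
    """rating = time since last dump / frequency * importance."""
    return hours_since_last_dump / frequency_hours * importance


# Hypothetical control line; the incremental numbers mirror the worked
# example in DESCRIPTION (frequency two days, importance 50, ten days late).
entry = parse_control_line("/usr/local troy sunos 2d 50 1d 10")
incremental = rating(10 * 24, parse_frequency(entry["ifreq"]), float(entry["ivalue"]))
print(entry["filesystem"], incremental)  # /usr/local 250.0
```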
       Normally, each filesystem for a given system is considered
       independently. This means that they may not be near each other on
       the tape and, further, they may not all make it onto the same
       tape if the tape runs out. It is easier to restore if everything
       you need is on the same tape, and still easier if it is grouped
       together. The average statement causes the averaged dumps to be
       placed sequentially on the tape. Their ratings are averaged.

RECORDS
       Every time a dump is performed, a record of the dump is stored in
       a file that lists dumps done for that filesystem. The records for
       full dumps and incremental dumps are stored separately. Full
       dumps are named by transforming all the slashes (/) in the
       filesystem name to dots (.). Thus /usr/local becomes
       usr.local.full. As a special case, the root filesystem becomes
       simply .full. Make sure you use the -a option to ls(1) when
       listing the directories. Incremental dumps are similarly named,
       but with .incr instead of .full.

       All of the records for a given system are stored in
       /master_path/records/hostname. The master_path is the path to the
       top of the frogbak system's directory. At Berkeley Research and
       Trading, that is /y/adm/dump. Thus to find out when the last
       incremental of /y was performed, look in
       /y/adm/dump/records/troy/y.incr.

       The filesystem dump record files have the following format:

              FROM-DATE TO-DATE TAPE-NAME FILE-NUMBER # COMMENT

       The FROM-DATE is the beginning of the time covered by that
       particular dump. On full dumps, the FROM-DATE is simply 0.

       The TAPE-NAME is the symbolic name of the tape that the dump is
       on. It is the name that was given as an argument to mkblank, and
       is, hopefully, written on the side of the tape.
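       The record file naming rule described above can be sketched in a
       few lines of Python. This is only an illustration of the rule as
       documented, not the code frogbak uses; the function names are
       invented for the example.

```python
def record_file_name(filesystem, kind):
    """Name of the record file for a filesystem; kind is 'full' or 'incr'."""
    if kind not in ("full", "incr"):
        raise ValueError(f"unknown dump kind: {kind!r}")
    # Drop the leading slash, turn the remaining slashes into dots.
    return filesystem.strip("/").replace("/", ".") + "." + kind


def record_path(master_path, hostname, filesystem, kind):
    """Full path of a record file under master_path/records/hostname."""
    return f"{master_path}/records/{hostname}/{record_file_name(filesystem, kind)}"


print(record_file_name("/usr/local", "full"))            # usr.local.full
print(record_file_name("/", "full"))                     # .full
print(record_path("/y/adm/dump", "troy", "/y", "incr"))  # /y/adm/dump/records/troy/y.incr
```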
       The FILE-NUMBER field specifies how many files must be skipped
       over on that tape to get to that dump. Thus if FILE-NUMBER is 17
       and you wanted to restore that dump, you would need to use
       mt -f device fsf 17 to get to that dump.

       Although logically the incrementals can be divided into two
       tracks, they are not stored that way in the records database. In
       fact, the logical division is just an artifact of the fact that
       incremental dumps cover backwards to the incremental dump prior
       to the previous one.

       To find out when something has been backed up, both the .full and
       .incr records files must be examined. They give the times and
       coverages for the filesystems. To find out if a particular file
       was backed up, the dump tape must be read. No index of files
       saved is kept.

TAPES
       The information about each dump performed is also stored grouped
       by what tape it is on. In the directory /master_path/tapes,
       information about each dump tape is stored. This information
       includes tape write speed performance figures and other tidbits.
       It substantially duplicates the information in the records
       directory.

RESTORES
       Each different kind of system uses a different dump program and
       thus a different restore program. The basic idea is that on the
       system that was dumped, you give a command that pipes the dump
       output from the tape into the restore program.

       It is usually easiest to forward the tape to the correct file
       before logging onto the system to be restored. The number of
       files to forward over is listed as the fourth field in the system
       dump records database. On hp-ux systems, the command is
       mt -t /dev/rmt/0mn fsf number_of_files_to_skip. On BSD-based
       systems, the command is usually
       mt -f /dev/nrst0 fsf number_of_files_to_skip.

       The blocksize used to write the tapes is specified in the
       beginning of the frogbak program file. The value that I use is
       112 blocks, or 56k. This size is not arbitrary.
       On Suns, sizes above 127 blocks are not reliable. Exabytes
       physically write data in 8k chunks. Larger block sizes have less
       system overhead and are generally faster. 56k is the largest
       multiple of 8k smaller than 128 blocks.

       Dumps can be written in several different formats depending on
       the type of system being dumped. In general the dump(8) command
       is used, but on Apollos the wbak(1) command is used, and on Xenix
       cpio(1) is used. The command needed to restore depends on what
       was used. On some servers, compression is possible, in which case
       the dump must be uncompressed to restore. At Berkeley Research
       and Trading the command needed to restore most systems is:

              remsh server -n dd if=/dev/rmt/0mn ibs=112b | /etc/restore -ivf -

       Each of the different programs used to do the dumps handles
       restores in a different way. With wbak(1) and cpio(1), the set of
       files to be restored must be specified on the command line. With
       restore(8), the set of files to be restored can be chosen
       interactively (-i flag).

       Obviously, you must load the right tape before trying to restore
       from it. Hopefully, each tape will have a paper label that
       identifies it. If it doesn't, or if the label is incorrect, you
       can identify a dump tape by copying off the first file. The first
       file on each dump tape specifies the tape name and lists which
       dumps are going to be attempted. If you lose your dump tape
       database, you may need to use this method to restore it.

UTILITIES
       There are several utilities that are part of the frogbak package.
       They are sum_covered, which adds up how much disk space is backed
       up by a control file; recycle, which marks the tape as erased;
       send_offsite, which figures out which tapes are not needed to do
       a full restore; and mkblank, which names a tape.

       The sum_covered command is useful for partitioning the clients
       among several servers because frogbak doesn't do it for you. As
       arguments, you must provide the names of control files.
       The mkblank command must be run to initialize blank tapes. Tapes
       must be initialized before frogbak is run. The argument to
       mkblank is the name for the tape. Each tape should have a unique
       name. I recommend that the name be a short string followed by a
       three digit sequence number. In case it isn't obvious, the tape
       must be in the drive when you run mkblank.

       Although it is possible to just keep buying new tapes, it is not
       necessary. The recycle program lets frogbak know that the dumps
       on the recycled tape no longer exist and that it is okay to
       overwrite the tape. The arguments to recycle are the list of
       tapes (by name) that should be marked as recycled. Nothing is
       done to the actual tape when it is marked recycled; only the
       database is updated.

       It can be difficult to figure out which tapes are potentially
       required to do a restore. The send_offsite program will figure
       out what tapes are not required to do a full restore of
       everything (assuming, of course, that all the tapes are good).
       Using send_offsite, it is easy to pick which tapes can be sent
       away. It also shows you how many tapes it has been since every
       system was covered by a full dump. Only the few most recent
       un-needed tapes are shown.

DAILY TASKS
       It is possible to run frogbak from cron(1). However, a labeled
       blank or recycled tape must be put in the drive prior to running
       frogbak. Tapes which are not either labeled blank or recycled
       will be rejected. Blank tapes are made with the mkblank utility.
       Recycled tapes are made with the recycle program.

       It is important that the output from frogbak be examined each
       day. If all the dumps run at somewhat standard priorities, then
       you can tell if something has not been dumped recently because
       its priority will be off. If priorities are not standardized,
       every failure must be checked.

       There is no warning system built into frogbak. You have to be
       very careful to watch what it does to make sure that nothing gets
       neglected.
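       Since frogbak has no warning system of its own, the daily check
       has to happen outside it. The Python sketch below shows one way
       such a check could look; it is not part of the frogbak package.
       The function name, the dictionary layout, the day-number times,
       and the threshold of three missed periods are all assumptions
       made for the example.

```python
def overdue(last_dumps, now, limit=3.0):
    """Return (name, ratio) pairs, worst first, for neglected filesystems.

    last_dumps maps a name to (time of last dump, wanted frequency), both
    in the same unit as now. ratio is elapsed time divided by frequency;
    anything above limit is reported.
    """
    flagged = []
    for name, (last_dump, frequency) in last_dumps.items():
        ratio = (now - last_dump) / frequency
        if ratio > limit:
            flagged.append((name, ratio))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical status on day 10, times in days.
status = {
    "troy:/y": (9.0, 1.0),          # dumped yesterday, wanted daily
    "troy:/usr/local": (4.0, 1.0),  # six days late
    "troy:/": (2.0, 2.0),           # four periods late
}
print(overdue(status, now=10.0))
# [('troy:/usr/local', 6.0), ('troy:/', 4.0)]
```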
EXAMPLES
       Initialize a new tape and dump to it:

              # mkblank SEQ-037
              # frogbak

       Recycle an old tape and dump to it:

              # recycle SEQ-016
              # frogbak

       Check to see how much disk space is being backed up:

              # sum_covered control.*

       Restore a single file from a dump(8) full dump:

              % rlogin system_to_be_restored -l root
              # rsh system_with_tape -n mt -t /dev/rmt/0mn fsf 8
              # rsh system_with_tape -n dd if=/dev/rmt/0mn | restore -ivf -
              Verify and Initialize tape.
              Dumped from: Sun May 2 20:02:00 1993
              Extract directories from tape
              Initialize symbol table.
              restore > ls
                  2 *./     2 *../   16384 dev/   10240 etc/   18433 tmp/
              restore > cd tmp
              restore > ls
              18433 ./      18610 backup.ddout5679   18641 dump.remote
                  2 *../    18643 backup.list5679    18644 rou5688
              18434 5176    18608 bkup.log
              restore > add bkup.log
              Make node ./tmp
              restore > add dump.remote
              restore > extract
              Extract requested files
              extract file ./tmp/bkup.log
              extract file ./tmp/dump.remote
              Add links
              Set directory mode, owner, and times.
              set owner/mode for '.'? [yn] n
              restore > quit

OPTIONS
       Frogbak supports a few options:

       -dryrun
              Specifies that dumps should not be performed. Instead,
              frogbak looks at its control file and at the records files
              and figures out what dumps it would do. All of its
              figuring is sent to standard output for debugging
              purposes.

       -summary
              Like the -dryrun option except that just the proposed set
              of dumps is printed. Please note that the summary you get
              is a summary of what would happen if you ran frogbak right
              now. If frogbak is invoked from cron(8), then it is likely
              that the actions that are reported now will not match the
              actions that will actually occur.

       -future amount_of_time
              Specifies that frogbak should pretend that the time is
              really sometime in the future.
              This is for use with the -summary option. The
              amount_of_time string is in the same format as the dump
              periods in the control file: a number followed by the
              units: h, d, w, or m for hours, days, weeks, or months.

       -control control_file
              Specifies that control.control_file should be used instead
              of control.hostname.

CONFIGURATION
       The real options are the configuration variables. Settings like
       compression must be specified by changing the frogbak program
       file itself (frogbak is written in perl(1)).

       $do_compress turns compression on and off. Compression is very
       handy and I recommend using it when you can. Using it requires a
       device driver that allows odd-sized blocks to be written to tape
       at the end of the dump. Also, the compress(1) program that comes
       with most operating systems is annoyingly slow. The latest
       versions of compress are much faster and should be used.

       The $eject option controls whether the tape is ejected after a
       successful dump.

       If you have installed a version of rsh(1) that allows you to
       specify a timeout, turn on $timeout_rsh.

ENVIRONMENT
       There are no ENVIRONMENT variables that are used by the frogbak
       system.

PORTS
       The frogbak system can be thought of as having a server and
       clients. It is not really a client-server system, but since tape
       drives are often on servers and clients are often what is being
       backed up, the analogy holds some water.

       The server currently works with SunOS 4.*, Mach 2.6, and
       HP-UX 8.*.

       The client side currently supports:

       sunos   Sun-3, and Sun-4 running SunOS 4.*. The dump(8) program
               is used.

       mach    Mach 2.6 running on i386 systems. The dump(8) program is
               used.

       hp-ux   HP-UX 8.* on HP9000/400, HP9000/700, and HP9000/800
               systems. The dump(8) program is used.

       ultrix  Ultrix 3.* and 4.* running on MIPS-based systems. The
               dump(8) program is used.

       sony    Sony's BSD4.3 OS running on their NEWS systems. The
               dump(8) program is used.

       xenix   SCO Xenix running on an i386.
               The cpio(1) program is used.

       domain  Apollo Domain OS version 9.6 and above. The wbak(1)
               program is used.

PORTING
       The frogbak system is kind of a pain to move around. Each of the
       files must be customized for each site. Most, if not all, of the
       portability switches are in the first few lines of each file.
       When modifying the frogbak file itself, search for uses of the
       various strings like SunOS and sunos.

       Please send any portability changes back for incorporation.

OFFSITE
       It is critically important that dumps be stored off-site.
       Unfortunately, frogbak does not provide any help in choosing
       which tapes should go off-site. In fact, it makes it difficult
       because each tape is a grab-bag of what was highest priority at
       the time the tape was written.

BUGS
       This system is not very well designed or implemented. It is very
       cranky. However, it does work reliably.

       The major bugs have to do with the design. The dump sequence,
       although pretty good, is not optimal. A better sequence would be
       a replicated towers-of-hanoi.

       The dump sequence does not start off smoothly: until every system
       has had both a full and an incremental dump, frogbak does things
       in a somewhat odd order.

       When using frogbak, nothing prevents systems from being
       overlooked.

       Using the default rsh(1) program (remsh(1) on HP-UX), it is easy
       for a system to hang the dumps. Rsh does not have a timeout on
       input, and if the remote system being dumped crashes, frogbak
       will hang. The solution for this is to replace rsh(1) with a
       special version that has timeouts.

       The frogbak system is only as good as the dump program that is
       used. The BSD dump(8) program can write bogus dumps when used on
       a live filesystem. This usually is not a problem because
       everything is dumped so many times.

       The /etc/dumpdates file is faked when using dump(8). Sometimes
       the original /etc/dumpdates file is not restored and annoying
       email is sent by frogbak.

FILES
       /y/adm/dump
              The top of the frogbak commands and records tree at
              Berkeley Research and Trading.
       records/hostname
              The directory of information about dumps of hostname.

       tapes/ The directory of information about each dump tape.

       recycled/
              The directory of old information about tapes that have
              been recycled.

       logs/  The directory of dump output logs. This should be cleaned
              occasionally because the logs can be fairly large.

       dump.remote
              A script that runs on the system to be dumped. Its
              standard output must be a dump and nothing else.

       dump.local
              A shell script that copies dump.remote to the system that
              is going to be dumped and then runs it.

       control.hostname
              The control file for hostname.

       backup.log.NNNN
              Dump log files for invocations of frogbak that did not
              complete cleanly.

       /dev/rmt/0mn
              Tape device on HP-UX.

       /dev/nrst0
              Tape device on SunOS.

CREDITS
       Thanks are due to Bruce Markey for figuring out how to tune
       frogbak. Thanks are due to Larry Hubble for allowing a generous
       copyright notice to be applied to frogbak.

AVAILABILITY
       The copyright on this system is a bit murky. Some work was done
       on it on behalf of TRW Financial Systems and they did not give me
       permission to take the changes with me. I would be most surprised
       if they objected. Berkeley Research and Trading has disclaimed
       any rights to frogbak that they might have.

AUTHOR
       David Muir Sharnoff

SEE ALSO
       dump(8), restore(8), dd(1), rsh(1), mt(1).

Edition May 17, 1995                                          FROGBAK(1)