3 Last modified $Date: 1998/10/06 17:17:00 $
5 by Alexandre Oliva <oliva@dcc.unicamp.br>
9 This is a proposal for a mechanism that lets Amanda support arbitrary
10 backup programs. It relies on a generic backup driver and scripts or
11 programs that interface with backup programs such as dump, tar,
12 smbclient, and others. It can also be used to introduce pre- and
15 The interface is simple, but supports everything that is currently
16 supported by Amanda, and it can be consistently extended to support
17 new abstractions that may be introduced in the backup driver in the
20 This proposal does not imply any modification in the Amanda protocol
21 or in Amanda servers; only Amanda clients have to be modified. By
22 Amanda clients, we refer to hosts whose disks are to be backed up;
23 an Amanda server is a host connected to a tape unit.
25 Currently (as of release 2.4.1 of Amanda), Amanda clients support
26 three operations: selfcheck, estimate and backup.
28 Selfcheck is used by the server program amcheck, to check whether a
29 client is responding or if there are configuration or permission
30 problems in the client that might prevent the backup from taking
33 Estimates are requested by the Amanda planner, which runs on the server
34 and collects information about the expected sizes of backups of each
35 disk at several levels. Given this information and the amount of
36 available tape space, the planner can select which disks and which
37 levels it should tell dumper to run.
39 Dumper is yet another server-side program; it requests clients to
40 perform dumps, as determined by planner, and stores these dumps in
41 holding disks or sends them directly to the taper program. The
42 interaction between dumper and taper is beyond the scope of this text.
44 We are going to focus on the interaction between the Amanda client
45 program and wrappers of dump programs. These wrappers must implement
46 the DUMPER API. The dumptype option `program' should name the wrapper
47 that will be used to back up filesystems of that dumptype. One
48 wrapper may call another, so as to extend its functionality.
52 Different backup programs present distinct requirements; some must be
53 run as super-user, whereas others can be run under other user-ids.
54 Some require a directory name, the root of the tree to be backed up;
55 others prefer a raw device name; some don't even refer to local disks
56 (SAMBA). Some wrappers may need to know a filesystem type in order to
57 decide which particular backup program to use (dump, vdump, vxdump,
60 Some provide special options for estimates, whereas others must be
61 started as if a complete dump were to be performed, and must be killed
62 as soon as they print an estimate.
64 Furthermore, the output formats of these backup programs vary wildly.
65 Some will print estimates and total sizes in bytes, in 512-byte tape
66 block units, in Kbytes, Mbytes, Gbytes, and possibly Tbytes in the
67 near future. Some will print a timestamp for the backup; some won't.
69 There are also restrictions related to possible scheduling policies.
70 For example, some backup programs only support full backups or
71 incrementals based on the last full backup (0-1). Some support full
72 backups or incrementals based on the last backup, be it a full or an
73 incremental backup (0-inf++). Some support incrementals based on a
74 timestamp (incr/date); whereas others are based on a limited number of
75 incremental levels, but incrementals of the same level can be
76 repeated, such as dump (0-9).
78 Amanda was originally built upon DUMP incremental levels, so this is
79 the only model it currently supports. Backup programs that use other
80 incremental management mechanisms had to be adapted to this policy.
81 Wrapper scripts are responsible for this adaptation.
83 Another important issue has to do with index generation. Some backup
84 programs can generate indexes, but each one lists files in its own
85 particular format, whereas indexes must be stored in a common format so that
86 the Amanda server can manipulate them.
88 The DUMPER API must accommodate all these variations.
90 3. OVERVIEW OF THE API
92 We are going to define a standard format of argument lists that the
93 backup driver will provide to wrapper programs, and the expected
94 result of the execution of these wrappers.
96 The first argument to a wrapper should always be a command name. If
97 no arguments are given, or an unsupported command is requested, an
98 error message should be printed to stderr, and the program should
99 terminate with exit status 1.
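
The command-name convention above can be sketched in Python as follows.
This is only an illustration, not part of the proposal: the names
dispatch, main and handlers are hypothetical, and the wording of the
error messages is an assumption (the text only requires that some
message go to stderr and that the exit status be 1).

```python
import sys


def dispatch(argv, handlers):
    """Return (exit_status, error_message) for a wrapper invocation.

    `handlers` is a hypothetical table mapping a command name to a
    callable that takes the remaining arguments and returns an exit
    status.  error_message is None when a handler was found and run.
    """
    args = argv[1:]
    if not args:
        # No arguments at all: error, exit status 1.
        return 1, "%s: no command given" % argv[0]
    command, rest = args[0], args[1:]
    handler = handlers.get(command)
    if handler is None:
        # Unsupported command: error, exit status 1.
        return 1, "%s: unsupported command `%s'" % (argv[0], command)
    return handler(rest), None


def main(argv, handlers):
    """Print any error message to stderr and return the exit status."""
    status, message = dispatch(argv, handlers)
    if message is not None:
        print(message, file=sys.stderr)
    return status
```

A real wrapper would end with something like
sys.exit(main(sys.argv, handlers)), where handlers holds its `support',
`selfcheck', `estimate' and `backup' implementations.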
101 3.1. The `support' command
103 As a general mechanism for Amanda to probe for features provided by a
104 backup program, a wrapper script must support at least the `support'
105 command. Some features must be supported, and Amanda won't ever ask
106 about them. Others will be considered as extensions, and Amanda will
107 ask the wrapper whether they are supported before issuing the
108 corresponding commands.
110 3.1.1. The `level-incrementals' subcommand
112 For example, before requesting an incremental backup of a given
113 level, Amanda should ask the wrapper whether the backup program
114 supports level-based incrementals. We don't currently support backup
115 programs that don't, but we may in the future, so it would be nice if
116 wrappers already implemented the command `support level-incrementals',
117 by returning a 0 exit status, printing, say, the maximum incremental
118 level it supports, i.e., 9. A sample session would be:
120 % /usr/local/amanda/libexec/wrappers/DUMP support level-incrementals hda0
123 Note that the result of this support command may depend on filesystem
124 information, so the disklist filesystem entry should be specified as a
125 command line argument. In the next examples, we are not going to use
126 full pathnames to wrapper scripts any more.
128 We could have defined a `support' command for full backups, but I
129 can't think of a backup program that does not support full backups...
131 3.1.2. The `index' subcommand
133 The ability to produce index files is also subject to an invocation of
134 the `support' command. When the support sub-command is `index', as in
135 the invocation below, the wrapper must print a list of valid indexing
136 mechanisms, one per line, most preferred first. If indexing is not
137 supported, nothing should be printed, and the exit status should be 1.
139 DUMP support index hda0
141 The currently known indexing mechanisms are:
143 output: implies that the command `index-from-output' generates an
144 index file from the output produced by the backup program (for
145 example, from `tar -cv').
147 image: implies that the command `index-from-image' generates an index
148 file from a backup image (for example, `tar -t').
150 direct: implies that the `backup' command can produce an index file as
151 it generates the backup image.
153 parse: implies that the `backup-parse' command can produce an index
154 file as it generates the backup formatted output.
156 The indexing mechanisms will be explicitly requested with the additional
157 option `index-<mode>' in the `backup' and `backup-parse' command invocations.
159 `index-from-image' should be supported, if possible, even if other
160 index commands are not, since it can be used in the future to create
161 index files from previously backed up filesystems.
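
As a rough sketch of how a wrapper might answer `support index', the
Python function below returns the text to print and the exit status.
The function name and the idea of passing in the list of implemented
mechanisms are hypothetical. The exit status 0 on success is an
assumption borrowed from the `level-incrementals' example; the text
only fixes the status (1) for the unsupported case.

```python
# Mechanism names as listed in the text above.
KNOWN_INDEX_MECHANISMS = ("output", "image", "direct", "parse")


def support_index(preferred):
    """Return (stdout_text, exit_status) for `support index'.

    `preferred` lists the mechanisms this hypothetical wrapper
    implements, most preferred first.
    """
    for mechanism in preferred:
        if mechanism not in KNOWN_INDEX_MECHANISMS:
            raise ValueError("unknown indexing mechanism: %s" % mechanism)
    if not preferred:
        # Indexing not supported: print nothing, exit status 1.
        return "", 1
    # One mechanism per line, most preferred first (status 0 is assumed).
    return "".join(mechanism + "\n" for mechanism in preferred), 0
```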
163 3.1.3. The `parse-estimate' subcommand
165 The `parse-estimate' support subcommand prints a list of valid mechanisms to
166 parse the estimate output and write the estimate size to its output; the
167 two mechanisms are:
169 direct: implies that the `estimate' command can produce the estimate output.
171 parse: implies that the `estimate-parse' command can produce the estimate
172 output when fed with the `estimate' output.
174 The estimate parsing mechanisms will be explicitly requested with the
175 additional option `estimate-<mode>' in the `estimate' and
176 `estimate-parse' command invocations.
178 3.1.4. The `parse-backup' subcommand
180 The `parse-backup' support subcommand prints a list of valid mechanisms to
181 parse the backup stderr; the two mechanisms are:
183 direct: implies that the `backup' command can produce the
184 backup-formatted-output.
186 parse: implies that the `backup-parse' command can produce the
187 backup-formatted-output when fed with the `backup' stderr.
189 The backup parsing mechanisms will be explicitly requested with the
190 additional option `backup-<mode>' in the `backup' and `backup-parse'
193 3.1.5. Other subcommands
195 Some other standard `support' sub-commands are `exclude' and
200 One may think (and several people did :-) that there should be only
201 one support command, that would print information about all supported
202 commands. The main arguments against this proposal have to do with
205 1) the availability of commands might vary from filesystem to
206 filesystem. No, I don't have an example, I just want to keep it as
209 2) one support subcommand may require command line arguments that
210 others don't, and we can't know in advance what these command line
211 arguments are going to be
213 3) the output format and exit status conventions of a support command
214 may vary from command to command; the only pre-defined convention is
215 that, if a wrapper does not know about a support subcommand, it should
216 return exit status 1, implying that the inquired feature is not
219 3.2. The `selfcheck' command
221 We should support commands to perform self-checks, run estimates,
222 backups and restores (for future extensions of the Amanda protocol
223 so as to support restores).
225 A selfcheck request would go like this:
227 DUMP selfcheck hda0 option option=value ...
229 The options specified as command-line arguments are dumptype options
230 enabled for that disk, such as `index', `norecord', etc. Unknown
231 options should be ignored. For each successful check, a message such
234 OK [/dev/hda0 is readable]
235 OK [/usr/sbin/dump is executable]
237 Errors should be printed as:
239 ERROR [/etc/dumpdates is not writable]
241 If selfcheck needs super-user (or some other user, for that matter)
242 access to perform some tests, it should print to the standard output
248 The backup driver should then arrange to re-run the script as the
249 specified user/group. Security concerns may impose restrictions on
250 privileges that can be given to wrapper scripts. For example, we may
251 require that, in order to run a wrapper script as any other user or
252 group, the wrapper script must be in a separate directory, say
253 /usr/local/amanda/libexec/wrappers-protected, and that the script, its
254 containing directory and all its parents must only be writable by
257 The need for starting programs as other users requires amandad (that
258 will incorporate all the functionality from selfcheck, sendsize and
259 sendbackup) to be setuid-root. However, it will fork a child process
260 and drop to the amanda user privileges as soon as possible. This
261 child process will be driven through a pipe, and it will be able to
262 start services as other users, in a way that no other user, not even
263 the backup operator, will be able to run arbitrary commands.
266 A wrapper script will certainly have to figure out either the disk
267 device name or its mount point, given a filesystem name such as
268 `hda0', as specified in the disklist. In order to help these scripts,
269 Amanda provides a helper program that can guess device names, mount
270 points and filesystem types, when given disklist entries.
272 The filesystem type can be useful on some operating systems, in which
273 more than one dump program is available; this information can help
274 automatically selecting the appropriate dump program.
277 The exit status of selfcheck and of this alternate script are probably
278 going to be disregarded. Anyway, for consistency, selfcheck should
279 return exit status 0 for complete success, 1 if any failures have
280 occurred and 2 if it needs additional permissions (USER/GROUP). Note
281 that, if the wrapper needs a special permission to perform a test, it
282 should not report a failure for that test.
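
The selfcheck message format and exit status convention might be
combined as in the Python sketch below. The function name and its
arguments are hypothetical. Two points are assumptions rather than
statements of the proposal: that a failure (status 1) takes precedence
when privileges are also needed, and that the caller simply leaves out
any test it could not run for lack of permissions. The line that
requests other privileges is not generated here, since its format is
not reproduced in this text.

```python
def selfcheck_report(checks, needs_privileges=False):
    """Return (output_lines, exit_status) for a selfcheck run.

    `checks` is a list of (ok, description) pairs for the tests that
    were actually performed.  Tests skipped for lack of permissions
    must not appear in it; set needs_privileges instead.
    """
    lines = []
    failed = False
    for ok, description in checks:
        if ok:
            lines.append("OK [%s]" % description)
        else:
            lines.append("ERROR [%s]" % description)
            failed = True
    if failed:
        status = 1      # at least one failure (precedence is assumed)
    elif needs_privileges:
        status = 2      # additional permissions are needed
    else:
        status = 0      # complete success
    return lines, status
```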
284 3.3. The `estimate' and `estimate-parse' commands
286 Estimate requests can take several different forms. An estimate of a
287 full backup may be requested, or estimates for level- or
288 timestamp-based incrementals:
290 DUMP estimate full hda0 option ...
291 DUMP estimate level 1 hda0 option ...
292 DUMP estimate diff 1998:09:24:01:02:03 hda0 option ...
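
A wrapper could decode the three request forms above along these
lines. This is a sketch only: parse_request and the layout of the
returned dictionary are hypothetical, and reading a bare option as
True and `option=value' as a string is an assumption about how a
wrapper might want to store dumptype options.

```python
from datetime import datetime


def parse_request(args):
    """Parse the arguments that follow the `estimate' command name.

    Returns a dict with the keys kind, base, disk and options, or None
    if the request form is not recognized.
    """
    if not args:
        return None
    kind = args[0]
    if kind == "full":
        base, rest = None, args[1:]
    elif kind == "level" and len(args) >= 2 and args[1].isdigit():
        base, rest = int(args[1]), args[2:]
    elif kind == "diff" and len(args) >= 2:
        try:
            # Timestamp format as in the example: YYYY:MM:DD:hh:mm:ss
            base = datetime.strptime(args[1], "%Y:%m:%d:%H:%M:%S")
        except ValueError:
            return None
        rest = args[2:]
    else:
        return None
    if not rest:
        return None     # the disklist entry is missing
    options = {}
    for option in rest[1:]:
        name, sep, value = option.partition("=")
        options[name] = value if sep else True
    return {"kind": kind, "base": base, "disk": rest[0], "options": options}
```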
295 If the backup program needs privileged access to obtain estimates, it
301 and exit, with exit status 2. If requested estimate type is not
302 supported, exit status 3 should be returned.
304 If the option `estimate-direct' is set, then the `estimate' command
305 should write to stdout the estimated size in bytes, or a pair of numbers
306 that, multiplied by one another, yield the estimated size in bytes.
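
On the receiving side, either form (a byte count, or a pair of numbers
such as a block count and a block size) can be reduced to a size in
bytes. The Python sketch below assumes the two numbers are separated
by whitespace; the proposal does not say how they are delimited, and
the function name is hypothetical.

```python
def parse_size(text):
    """Return a size in bytes from "N" or from "A B" (meaning A * B)."""
    fields = text.split()
    if len(fields) == 1:
        return int(fields[0])
    if len(fields) == 2:
        # Whitespace as the separator is an assumption.
        return int(fields[0]) * int(fields[1])
    raise ValueError("expected one or two numbers, got %r" % text)
```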
308 If the option `estimate-parse' is set, then the `estimate' command
309 should write to stdout the information needed by the
310 `estimate-parse' command, which should extract from its input the
313 The syntax of `estimate-parse' is identical to that of `estimate'.
315 Both `estimate' and `estimate-parse' can output the word `KILL', after
316 printing the estimate. In this case, Amanda will send a SIGTERM
317 signal to the process group of the `estimate' process. If it does not
318 die within a few seconds, a SIGKILL will be issued.
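
The reaction to `KILL' could look like the Python sketch below on the
driver side. It assumes a POSIX system and that the driver started
the `estimate' process in its own process group (for example with
start_new_session=True); otherwise the signal would also reach the
driver itself. The function name and the five-second default for
"a few seconds" are assumptions.

```python
import os
import signal
import subprocess


def terminate_group(proc, grace=5.0):
    """SIGTERM the process group of `proc', SIGKILL it after `grace' seconds.

    `proc' is a subprocess.Popen object that leads its own process
    group.  Returns the exit status of `proc'.
    """
    pgid = os.getpgid(proc.pid)
    os.killpg(pgid, signal.SIGTERM)
    try:
        return proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        # The group ignored SIGTERM: escalate to SIGKILL.
        os.killpg(pgid, signal.SIGKILL)
        return proc.wait()
```

Signalling the whole group rather than the single process matters
because a wrapper typically runs the real backup program as a child,
and that child is the process that must stop.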
320 If `estimate' or `estimate-parse' succeeds, it should exit 0,
321 otherwise exit 1, except for the already listed cases of exit status 2
324 3.4. The `backup' and `backup-parse' commands
326 The syntax of `backup' is the same as that of `estimate'. The backup
327 image should be written to standard output, whereas stderr should be
328 used for the user-oriented output of the backup program and other
331 If the option `backup-direct' is set, then the `backup' command should
332 write to stderr a formatted-output-backup.
334 If the option `backup-parse' is set, then the `backup' command
335 should write to stderr the information needed by the `backup-parse'
336 command, which should edit its input so that it prints to standard
337 output a formatted-output-backup.
339 If the option `no-record' is set, then the `backup' command should
340 not modify its state file (e.g., dump should not modify /etc/dumpdates).
342 The syntax of `backup-parse' is identical to that of `backup'.
344 The syntax of the formatted-output-backup is as follows.
345 All lines should start with either `| ' for normal output, `? ' for
346 strange output or `& ' for error output. If the wrapper can determine
347 the total backup size from the output of the backup program, it should
348 print a line starting with `# ', followed by the total backup size in
349 bytes or by a pair of numbers that, multiplied, yield the total backup
350 size; this number will be used as a consistency check.
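
A consumer of the formatted-output-backup might classify lines as in
the Python sketch below. The function name and the labels "normal",
"strange" and "error" used in the result are hypothetical, as is the
choice to reject unprefixed lines; a whitespace-separated pair on the
`# ' line is likewise an assumption.

```python
PREFIXES = {"| ": "normal", "? ": "strange", "& ": "error"}


def parse_formatted_output(lines):
    """Return (classified, total_bytes) for formatted-output-backup lines.

    classified is a list of (kind, text) pairs; total_bytes is None if
    no `# ' line was seen.
    """
    classified = []
    total = None
    for line in lines:
        line = line.rstrip("\n")
        prefix, text = line[:2], line[2:]
        if prefix in PREFIXES:
            classified.append((PREFIXES[prefix], text))
        elif prefix == "# ":
            fields = text.split()
            if len(fields) == 1:
                total = int(fields[0])
            elif len(fields) == 2:
                total = int(fields[0]) * int(fields[1])
            else:
                raise ValueError("bad size line: %r" % line)
        else:
            raise ValueError("line without a known prefix: %r" % line)
    return classified, total
```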
352 The option `index-direct' should cause the `backup' command to output
353 the index directly to file descriptor 3. The option `index-parse'
354 should cause the `backup-parse' command to output the index directly to
355 file descriptor 3. The syntax of the index file is described in the
358 3.5. The `index-from-output' and `index-from-image' commands
360 The syntax of the `index-from-output' and `index-from-image' commands
361 is identical to that of `backup'. They are fed the backup output
362 or image, and they must produce a list of files and directories, one
363 per line, to the standard output. Directories must be identified by
366 After the file name and a blank space, any additional information
367 about the file or directory, such as permission data, size, etc., can
368 be added. For this reason, blanks and backslashes within filenames
369 should be quoted with backslashes. Linefeeds should be represented as
370 `\n', although it is not always possible to distinguish linefeeds in
371 the middle of filenames from ones that separate one file from another,
372 in the output of, say `restore -t'. It is not clear whether we should
373 also support quoting mechanisms such as `\xHH', `\OOO' or `\uXXXX'.
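
The quoting rule for blanks, backslashes and linefeeds can be sketched
in Python as below. Both function names are hypothetical, only the
space character is treated as a blank (an assumption), and none of the
undecided `\xHH', `\OOO' or `\uXXXX' forms is handled. The example of
additional information used in the test (`mode=0644') is made up.

```python
def quote_name(name):
    """Quote a file name for an index line."""
    out = []
    for ch in name:
        if ch == "\n":
            out.append("\\n")           # linefeed is written as \n
        elif ch in (" ", "\\"):
            out.append("\\" + ch)       # blank or backslash gets a backslash
        else:
            out.append(ch)
    return "".join(out)


def split_entry(line):
    """Split an index line into (file_name, additional_information)."""
    name = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            nxt = line[i + 1]
            name.append("\n" if nxt == "n" else nxt)
            i += 2
        elif ch == " ":
            # First unquoted blank ends the file name.
            return "".join(name), line[i + 1:]
        else:
            name.append(ch)
            i += 1
    return "".join(name), ""
```

Because a literal backslash is always doubled, a name containing a
backslash followed by `n' is written as `\\n' and cannot be confused
with a quoted linefeed.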
375 3.6. The `restore' command
379 3.7. The `print-command' command
381 This command must be followed by a valid backup or restore command,
382 and it should print a shell-command that would produce an equivalent
383 result, i.e., that would perform the backup to standard output, or
384 that would restore the whole filesystem reading from standard input.
385 This command is to be included in the header of backup images, to ease
390 Well, that's all. Drop us a note at the amanda-hackers mailing list
391 if you have suggestions to improve this document and/or the API. Some
392 help on its implementation would be welcome too.