+File: tar.info, Node: Reliability and security, Next: Changes, Prev: Media, Up: Top
+
+10 Reliability and Security
+***************************
+
+The `tar' command reads and writes files as any other application does,
+and is subject to the usual caveats about reliability and security.
+This section contains some commonsense advice on the topic.
+
+* Menu:
+
+* Reliability::
+* Security::
+
+\1f
+File: tar.info, Node: Reliability, Next: Security, Up: Reliability and security
+
+10.1 Reliability
+================
+
+Ideally, when `tar' is creating an archive, it reads from a file system
+that is not being modified, and encounters no errors or inconsistencies
+while reading and writing. If this is the case, the archive should
+faithfully reflect what was read. Similarly, when extracting from an
+archive, `tar' ideally encounters no errors and the extracted
+files faithfully reflect what was in the archive.
+
+ However, when reading or writing real-world file systems, several
+things can go wrong; these include permissions problems, corruption of
+data, and race conditions.
+
+* Menu:
+
+* Permissions problems::
+* Data corruption and repair::
+* Race conditions::
+
+\1f
+File: tar.info, Node: Permissions problems, Next: Data corruption and repair, Up: Reliability
+
+10.1.1 Permissions Problems
+---------------------------
+
+If `tar' encounters errors while reading or writing files, it normally
+reports an error and exits with nonzero status. The work it does may
+therefore be incomplete. For example, when creating an archive, if
+`tar' cannot read a file then it cannot copy the file into the archive.
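+
+ As a minimal sketch of acting on the exit status, the following shell
+fragment archives a scratch directory together with a file name that
+does not exist, then checks whether `tar' reported a problem. The
+paths and file names (`src', `missing.txt', `backup.tar') are
+hypothetical and serve only as illustration.
+
```shell
# Hypothetical scratch layout; none of these names come from tar itself.
work=$(mktemp -d)
mkdir "$work/src"
echo "hello" > "$work/src/readable.txt"

# Ask tar to archive one real directory and one file that does not exist.
# Capture the exit status instead of ignoring it.
status=0
tar -cf "$work/backup.tar" -C "$work" src missing.txt 2>"$work/tar.err" || status=$?

if [ "$status" -eq 0 ]; then
    echo "archive complete"
else
    # Nonzero status: at least one requested file is not in the archive.
    echo "tar exited with status $status; archive may be incomplete" >&2
fi
```
+
+ A backup script would typically stop or raise an alert at this point
+rather than treat the archive as complete.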
+
+\1f
+File: tar.info, Node: Data corruption and repair, Next: Race conditions, Prev: Permissions problems, Up: Reliability
+
+10.1.2 Data Corruption and Repair
+---------------------------------
+
+If an archive becomes corrupted by an I/O error, this may corrupt the
+data in an extracted file. Worse, it may corrupt the file's metadata,
+which may cause later parts of the archive to become misinterpreted.
+A tar-format archive contains a checksum that most likely will detect
+errors in the metadata, but it will not detect errors in the data.
+
+ If data corruption is a concern, you can compute and check your own
+checksums of an archive by using other programs, such as `cksum'.
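+
+ One way to do this, sketched below under the assumption that the
+checksum is recorded when the archive is created and compared before
+the archive is used, relies only on `cksum' reading standard input.
+The scratch paths and the variable names (`saved', `current',
+`damaged') are hypothetical.
+
```shell
# Hypothetical scratch layout for illustration.
work=$(mktemp -d)
mkdir "$work/src"
echo "payload" > "$work/src/data.txt"
tar -cf "$work/backup.tar" -C "$work" src

# Record the checksum right after creating the archive.
saved=$(cksum < "$work/backup.tar")

# Later, before extracting, recompute the checksum and compare.
current=$(cksum < "$work/backup.tar")
if [ "$saved" = "$current" ]; then
    verdict="unchanged"
else
    verdict="corrupted"
fi
echo "archive is $verdict"

# Simulate corruption on a copy by appending one byte; its checksum differs.
cp "$work/backup.tar" "$work/damaged.tar"
printf 'x' >> "$work/damaged.tar"
damaged=$(cksum < "$work/damaged.tar")
```
+
+ In practice the recorded checksum should be stored separately from
+the archive, so that one I/O error is less likely to damage both.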
+
+ When attempting to recover from a read error or data corruption in an
+archive, you may need to skip past the questionable data and read the
+rest of the archive. This requires some expertise in the archive
+format and in other software tools.
+
+\1f
+File: tar.info, Node: Race conditions, Prev: Data corruption and repair, Up: Reliability
+
+10.1.3 Race Conditions
+----------------------
+
+If some other process is modifying the file system while `tar' is
+reading or writing files, the result may well be inconsistent due to
+race conditions. For example, if another process creates some files in
+a directory while `tar' is creating an archive containing the
+directory's files, `tar' may see some of the files but not others, or
+it may see a file that is in the process of being created. The
+resulting archive may not be a snapshot of the file system at any point
+in time. If an application such as a database system depends on an
+accurate snapshot, restoring from the `tar' archive of a live file
+system may therefore break that consistency and may break the
+application. The simplest way to avoid the consistency issues is to
+avoid making other changes to the file system while `tar' is reading it
+or writing it.
+
+ When creating an archive, several options are available to avoid race
+conditions. Some hosts have a way of snapshotting a file system, or of
+temporarily suspending all changes to a file system, by (say)
+suspending the only virtual machine that can modify a file system; if
+you use these facilities and have `tar -c' read from a snapshot when
+creating an archive, you can avoid inconsistency problems. More
+drastically, before starting `tar' you could suspend or shut down all
+processes other than `tar' that have access to the file system, or you
+could unmount the file system and then mount it read-only.
+
+ When extracting from an archive, one approach to avoid race
+conditions is to create a directory that no other process can write to,
+and extract into that.
+
+\1f
+File: tar.info, Node: Security, Prev: Reliability, Up: Reliability and security
+
+10.2 Security
+=============
+
+In some cases `tar' may be used in an adversarial situation, where an
+untrusted user is attempting to gain information about or modify
+otherwise-inaccessible files. Dealing with untrusted data (that is,
+data generated by an untrusted user) typically requires extra care,
+because even the smallest mistake in the use of `tar' is more likely to
+be exploited by an adversary than by a race condition.
+
+* Menu:
+
+* Privacy::
+* Integrity::
+* Live untrusted data::
+* Security rules of thumb::
+
+\1f
+File: tar.info, Node: Privacy, Next: Integrity, Up: Security
+
+10.2.1 Privacy
+--------------
+
+Standard privacy concerns apply when using `tar'. For example, suppose
+you are archiving your home directory into a file
+`/archive/myhome.tar'. Any secret information in your home directory,
+such as your SSH secret keys, is copied faithfully into the archive.
+Therefore, if your home directory contains any file that should not be
+read by some other user, the archive itself should not be readable
+by that user. And even if the archive's data are inaccessible to
+untrusted users, its metadata (such as size or last-modified date) may
+reveal some information about your home directory; if the metadata are
+intended to be private, the archive's parent directory should also be
+inaccessible to untrusted users.
+
+ One precaution is to create `/archive' so that it is not accessible
+to any user, unless that user also has permission to access all the
+files in your home directory.
+
+ Similarly, when extracting from an archive, take care that the
+permissions of the extracted files are not more generous than what you
+want. Even if the archive itself is readable only to you, files
+extracted from it have their own permissions that may differ.
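+
+ The following sketch shows one way to keep extracted permissions
+tight: it extracts under a restrictive umask and then lists any file
+that group or others can still read. It assumes a `tar' that accepts
+`--no-same-permissions', which makes the umask apply even when the
+command is run by the superuser. The paths and the variable `loose'
+are hypothetical.
+
```shell
# Hypothetical scratch layout for illustration.
work=$(mktemp -d)
mkdir "$work/src" "$work/dest"
echo "secret" > "$work/src/key.txt"
chmod 644 "$work/src/key.txt"
tar -cf "$work/a.tar" -C "$work/src" key.txt

# Extract in a subshell so the restrictive umask does not leak into the caller.
(
    umask 077
    tar -x --no-same-permissions -f "$work/a.tar" -C "$work/dest"
)

# List any extracted file that group or others can still read.
loose=$(find "$work/dest" -type f \( -perm -040 -o -perm -004 \))
if [ -z "$loose" ]; then
    echo "no group- or world-readable files were extracted"
else
    printf 'review permissions of:\n%s\n' "$loose"
fi
```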
+
+\1f
+File: tar.info, Node: Integrity, Next: Live untrusted data, Prev: Privacy, Up: Security
+
+10.2.2 Integrity
+----------------
+
+When creating archives, take care that they are not writable by an
+untrusted user; otherwise, that user could modify the archive, and when
+you later extract from the archive you will get incorrect data.
+
+ When `tar' extracts from an archive, by default it writes into files
+relative to the working directory. If the archive was generated by an
+untrusted user, that user therefore can write into any file under the
+working directory. If the working directory contains a symbolic link
+to another directory, the untrusted user can also write into any file
+under the referenced directory. When extracting from an untrusted
+archive, it is therefore good practice to create an empty directory and
+run `tar' in that directory.
+
+ When extracting from two or more untrusted archives, each one should
+be extracted independently, into different empty directories.
+Otherwise, the first archive could create a symbolic link into an area
+outside the working directory, and the second one could follow the link
+and overwrite data that is not under the working directory. For
+example, when restoring from a series of incremental dumps, the
+archives should have been created by a trusted process, as otherwise
+the incremental restores might alter data outside the working directory.
+
+ If you use the `--absolute-names' (`-P') option when extracting,
+`tar' respects any file names in the archive, even file names that
+begin with `/' or contain `..'. As this lets the archive overwrite any
+file in your system that you can write, the `--absolute-names' (`-P')
+option should be used only for trusted archives.
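+
+ Before extracting an archive of uncertain origin, its member names
+can be listed with `tar -t' and screened for names that begin with `/'
+or contain a `..' component. The helper below is a sketch, not part
+of `tar'; the function name `flag_suspicious_names' is hypothetical,
+and it assumes that member names contain no newline characters.
+
```shell
# Hypothetical helper: read member names, one per line, on standard input
# and print those that are absolute or contain a ".." component.
flag_suspicious_names() {
    grep -E '^/|(^|/)\.\.(/|$)' || true
}

# Screen a fixed list of sample names.
suspicious=$(printf '%s\n' 'docs/readme.txt' '/etc/passwd' \
    'a/../../b' 'notes..txt' | flag_suspicious_names)
printf 'flagged:\n%s\n' "$suspicious"

# Screen a real archive built from a scratch directory.
work=$(mktemp -d)
mkdir "$work/src"
echo "hi" > "$work/src/file.txt"
tar -cf "$work/a.tar" -C "$work" src
found=$(tar -tf "$work/a.tar" | flag_suspicious_names)
```
+
+ An empty result does not make an archive trustworthy; it only rules
+out the two name patterns that the helper checks for.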
+
+ Conversely, with the `--keep-old-files' (`-k') option, `tar' refuses
+to replace existing files when extracting; and with the
+`--no-overwrite-dir' option, `tar' refuses to replace the permissions
+or ownership of already-existing directories. These options may help
+when extracting from untrusted archives.
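+
+ The effect of `--keep-old-files' (`-k') can be seen in the following
+sketch, which extracts an archive into a directory that already holds
+a file of the same name. The paths and file names are hypothetical.
+The exit status is captured rather than assumed, since whether an
+existing file counts as an error can depend on the `tar' version.
+
```shell
# Hypothetical scratch layout for illustration.
work=$(mktemp -d)
mkdir "$work/src" "$work/dest"
echo "from archive" > "$work/src/config.txt"
tar -cf "$work/a.tar" -C "$work/src" config.txt

# A file of the same name already exists in the destination.
echo "local copy" > "$work/dest/config.txt"

# With -k, tar refuses to replace the existing file.
status=0
tar -xkf "$work/a.tar" -C "$work/dest" 2>"$work/tar.err" || status=$?

kept=$(cat "$work/dest/config.txt")
echo "tar status: $status; file still contains: $kept"
```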
+
+\1f
+File: tar.info, Node: Live untrusted data, Next: Security rules of thumb, Prev: Integrity, Up: Security
+
+10.2.3 Dealing with Live Untrusted Data
+---------------------------------------
+
+Extra care is required when creating an archive from, or extracting into, a
+file system that is accessible to untrusted users. For example, superusers
+who invoke `tar' must be wary about its actions being hijacked by an
+adversary who is reading or writing the file system at the same time
+that `tar' is operating.
+
+ When creating an archive from a live file system, `tar' is
+vulnerable to denial-of-service attacks. For example, an adversarial
+user could create the illusion of an indefinitely-deep directory
+hierarchy `d/e/f/g/...' by creating directories one step ahead of
+`tar', or the illusion of an indefinitely-long file by creating a
+sparse file but arranging for blocks to be allocated just before `tar'
+reads them. There is no easy way for `tar' to distinguish these
+scenarios from legitimate uses, so you may need to monitor `tar', just
+as you'd need to monitor any other system service, to detect such
+attacks.
+
+ While a superuser is extracting from an archive into a live file
+system, an untrusted user might replace a directory with a symbolic
+link, in hopes that `tar' will follow the symbolic link and extract
+data into files that the untrusted user does not have access to. Even
+if the archive was generated by the superuser, it may contain a file
+such as `d/etc/passwd' that the untrusted user earlier created in order
+to break in; if the untrusted user replaces the directory `d/etc' with
+a symbolic link to `/etc' while `tar' is running, `tar' will overwrite
+`/etc/passwd'. This attack can be prevented by extracting into a
+directory that is inaccessible to untrusted users.
+
+ Similar attacks via symbolic links are also possible when creating an
+archive, if the untrusted user can modify an ancestor of a top-level
+argument of `tar'. For example, an untrusted user who can modify
+`/home/eve' can hijack a running instance of `tar -cf -
+/home/eve/Documents/yesterday' by replacing `/home/eve/Documents' with
+a symbolic link to some other location. Attacks like these can be
+prevented by making sure that untrusted users cannot modify any files
+that are top-level arguments to `tar', or any ancestor directories of
+these files.
+
+\1f
+File: tar.info, Node: Security rules of thumb, Prev: Live untrusted data, Up: Security
+
+10.2.4 Security Rules of Thumb
+------------------------------
+
+This section briefly summarizes rules of thumb for avoiding security
+pitfalls.
+
+ * Protect archives at least as much as you protect any of the files
+ being archived.
+
+ * Extract from an untrusted archive only into an otherwise-empty
+ directory. This directory and its parent should be accessible
+ only to trusted users. For example:
+
+ $ chmod go-rwx .
+ $ mkdir -m go-rwx dir
+ $ cd dir
+ $ tar -xvf /archives/got-it-off-the-net.tar.gz
+
+ As a corollary, do not do an incremental restore from an untrusted
+ archive.
+
+ * Do not let untrusted users access files extracted from untrusted
+ archives without checking first for problems such as setuid
+ programs.
+
+ * Do not let untrusted users modify directories that are ancestors of
+ top-level arguments of `tar'. For example, while you are
+ executing `tar -cf /archive/u-home.tar /u/home', do not let an
+ untrusted user modify `/', `/archive', or `/u'.
+
+ * Pay attention to the diagnostics and exit status of `tar'.
+
+ * When archiving live file systems, monitor running instances of
+ `tar' to detect denial-of-service attacks.
+
+ * Avoid unusual options such as `--absolute-names' (`-P'),
+ `--dereference' (`-h'), `--overwrite', `--recursive-unlink', and
+ `--remove-files' unless you understand their security implications.
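+
+ The check for setuid programs mentioned above can be sketched with
+`find'. The example builds a stand-in for an extracted tree, marks
+one file setuid, and lists every regular file that has the setuid or
+setgid bit set. The directory `extracted' and the file names are
+hypothetical.
+
```shell
# Hypothetical stand-in for a directory populated from an untrusted archive.
work=$(mktemp -d)
mkdir "$work/extracted"
echo '#!/bin/sh' > "$work/extracted/tool"
echo 'plain' > "$work/extracted/notes.txt"
chmod 4755 "$work/extracted/tool"    # simulate a setuid program

# List regular files with the setuid (4000) or setgid (2000) bit set.
flagged=$(find "$work/extracted" -type f \( -perm -4000 -o -perm -2000 \))
if [ -n "$flagged" ]; then
    printf 'inspect before sharing:\n%s\n' "$flagged"
fi
```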
+
+
+\1f
+File: tar.info, Node: Changes, Next: Configuring Help Summary, Prev: Reliability and security, Up: Top