
Tape Archive (tar) File Format Family

Description

A tar (tape archive) file is an archive created by tar, a UNIX-based utility used to package files together for backup or distribution purposes. It contains multiple files stored in an uncompressed format along with metadata about the archive, and is also known as a tarball. Tar files are not compressed archive files, but they are often compressed with file compression utilities such as gzip or bzip2.

Each file object includes any file data, and is preceded by a 512-byte header record. The file data is written unaltered except that its length is rounded up to a multiple of 512 bytes. At the end of the archive file there are two 512-byte blocks filled with binary zeros as an end-of-file marker. The file header record contains metadata about a file. To ensure portability across different architectures with different byte orderings, the information in the header record is encoded in ASCII. Tar archives are fully compatible between UNIX and Windows systems because all header information is represented in ASCII. See Notes for more information about the capitalization of tar and Unix.
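The layout described above can be checked against a small archive. The following is a minimal sketch, not a reference implementation: it uses Python's standard tarfile module to write a one-file ustar archive in memory and then reads the raw bytes back by hand. The field offsets used (name at byte 0, size at byte 124, magic at byte 257) are those of the ustar header layout; the helper names build_archive and padded_length and the sample file name are illustrative assumptions.

```python
import io
import tarfile

BLOCK = 512  # tar header and data block size in bytes


def build_archive(name: str, payload: bytes) -> bytes:
    """Write a one-member ustar archive into memory and return its bytes."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def padded_length(size: int) -> int:
    """Round a file size up to the next multiple of the 512-byte block."""
    return (size + BLOCK - 1) // BLOCK * BLOCK


raw = build_archive("hello.txt", b"hello, tar\n")

# The first 512 bytes are the header record; its fields are ASCII text.
header = raw[:BLOCK]
name = header[0:100].rstrip(b"\0").decode("ascii")
size = int(header[124:136].rstrip(b"\0 ").decode("ascii"), 8)  # octal digits
magic = header[257:262]

# File data follows the header unaltered, padded with zeros to a full block.
data_start = BLOCK
data_end = data_start + padded_length(size)
payload = raw[data_start:data_start + size]

# Two all-zero blocks mark the end of the archive.
trailer = raw[data_end:data_end + 2 * BLOCK]

print(name, size, magic, trailer == bytes(2 * BLOCK))
```

Python's tarfile module additionally pads the finished archive with zeros up to a 10240-byte record boundary, so the file on disk is usually longer than the header, data, and two-block end marker alone.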

The tar file format has changed over time as additional functionality has been developed for the tar UNIX utility. Beginning in the 1980s, this led to format extensions that carry the additional information needed by particular implementations. Early versions of the tar format were inconsistent in how numeric fields were constructed; later implementations corrected this to improve the portability of the format, beginning with the first POSIX standard for tar file formats in 1988.

The POSIX.1 2001 standard introduced the "extended tar", tar.h, or pax format, which added vendor-tagged or vendor-specific functionality. It is the most flexible of the tar archive specifications and has the richest feature set. As stated in gnu.org's documentation about various iterations of tar file formats, "This format is quite recent, so not all tar implementations are able to handle it properly. However, this format is designed in such a way that any tar implementation able to read 'ustar' archives will be able to read most 'posix' archives as well." The POSIX.1 2001 specification removed the 8 GB file size limit of previous tar formats. The new tags as described in freebsd.org's tar documentation are as follows:

The POSIX.1 2001 standard also changes the applicable typeflag values. The extended tar or tar.h archive format stores new data in ustar-compatible archive entries that use the "x" or "g" typeflags. FreeBSD, an open source Unix-like operating system, provides documentation of tar file format versions and stresses the compatibility between extended tar formats and the ustar tar archives defined in the POSIX.1 1988 standard: "older implementations that do not fully support these extensions will extract the metadata into regular files, where the metadata can be examined as necessary." The POSIX.1 2001 standard defined the pax utility and the pax format, which serves as an extension of the tar format. The pax utility's "-x" option selects the output archive format, for example ustar. Opengroup.org's pax documentation clarifies that the pax utility supports the ustar format, defined as, "The tar interchange format; see the EXTENDED DESCRIPTION section. The default blocksize for this format for character special archive files shall be 10240. Implementations shall support all blocksize values less than or equal to 32256 that are multiples of 512."
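To illustrate how an "x" entry sits in front of an ordinary ustar entry, the sketch below writes a pax-format archive with Python's standard tarfile module, using a path longer than the 100-byte ustar name field, and then inspects the raw typeflag bytes. It is an illustration under stated assumptions rather than a description of any particular tar implementation: the long file name is invented for the example, and the offsets (size at byte 124, typeflag at byte 156) follow the ustar header layout.

```python
import io
import tarfile

BLOCK = 512
SIZE_FIELD = slice(124, 136)   # octal ASCII size field in a ustar header
TYPEFLAG_OFFSET = 156          # single-byte typeflag in a ustar header

# Hypothetical path that does not fit the 100-byte ustar name field.
long_name = "dir/" + "a" * 120 + ".txt"
payload = b"data"

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
    info = tarfile.TarInfo(name=long_name)
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
raw = buf.getvalue()

# The first header is the pax extended header entry (typeflag "x").
first_typeflag = raw[TYPEFLAG_OFFSET:TYPEFLAG_OFFSET + 1]
ext_size = int(raw[SIZE_FIELD].rstrip(b"\0 ").decode("ascii"), 8)
ext_records = raw[BLOCK:BLOCK + ext_size]  # "<length> key=value\n" records

# The regular file entry follows the padded extended-header data.
second_header = BLOCK + (ext_size + BLOCK - 1) // BLOCK * BLOCK
second_typeflag = raw[second_header + TYPEFLAG_OFFSET:
                      second_header + TYPEFLAG_OFFSET + 1]

# A pax-aware reader merges the extended record back into the member.
with tarfile.open(fileobj=io.BytesIO(raw)) as tar:
    member = tar.getmembers()[0]
    restored_name = member.name
    pax_path = member.pax_headers.get("path")

print(first_typeflag, second_typeflag, restored_name == long_name)
```

A reader that only understands ustar would see the "x" entry as an unknown member and, as the FreeBSD documentation quoted above notes, extract its key=value records as a regular file.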

The tar file format doesn't feature native data compression, so tar archives are often compressed with an external utility such as gzip, bzip2, XZ (using 7-Zip / p7zip LZMA / LZMA2 compression algorithms), Brotli, Zstandard, and similar tools to reduce the archive's size for portability and data backup. The resulting compressed files may be named with a single extension, e.g. tgz, tbz, txz, tzst, or with a double extension, e.g. tar.gz, tar.br, tar.bz2, tar.xz, tar.zst.
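As a rough illustration of compression being layered on top of tar, the sketch below uses Python's standard tarfile module, which can wrap an archive in gzip, bzip2, or xz through its "w:gz", "w:bz2", and "w:xz" modes. The helper name make_tar, the member name, and the repetitive sample payload are assumptions made for the example; Brotli and Zstandard are left out because this sketch sticks to the modes listed.

```python
import io
import tarfile


def make_tar(mode: str, name: str, payload: bytes) -> bytes:
    """Write a one-member archive in memory using the given tarfile mode."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode=mode) as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


# Highly repetitive sample data, so the compressed sizes are clearly smaller.
payload = b"tar does not compress on its own\n" * 2000

plain = make_tar("w", "notes.txt", payload)      # .tar
gz = make_tar("w:gz", "notes.txt", payload)      # .tar.gz / .tgz
bz2 = make_tar("w:bz2", "notes.txt", payload)    # .tar.bz2 / .tbz
xz = make_tar("w:xz", "notes.txt", payload)      # .tar.xz / .txz

for label, blob in (("tar", plain), ("tar.gz", gz),
                    ("tar.bz2", bz2), ("tar.xz", xz)):
    print(f"{label:8} {len(blob)} bytes")

# "r:*" lets tarfile detect the compression wrapper when reading back.
with tarfile.open(fileobj=io.BytesIO(gz), mode="r:*") as tar:
    restored = tar.extractfile("notes.txt").read()
```

The uncompressed archive is slightly larger than the payload itself because of the header, block padding, and end-of-archive marker, which is why the compression step matters for distribution.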

For an overview of tar version history, see Notes.