Compressing and extracting data with the tar command is one of the most basic skills for Linux administrators. If you’re new to it, or keep forgetting which options to use, this tutorial is for you.
.tar.gz vs .tar.bz2 vs .tar.xz
These are the most popular archive types for bundling a group of files with their directory structure preserved. They differ in the compression algorithm used.
When compressing, gzip (.tar.gz) takes the least time but produces the largest files, while xz (.tar.xz) takes the longest but produces the smallest. bzip2 (.tar.bz2) sits in the middle for both speed and compressed size.
When decompressing, gzip is the fastest and xz comes second.
- Compression speed (fastest first): gzip > bzip2 > xz
- Compressed size (largest first): gzip > bzip2 > xz
- Decompression speed (fastest first): gzip > xz > bzip2
In short, use gzip for speed and xz if you care more about output size. bzip2 is available as a middle ground.
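To see how the three formats behave on your own data, a rough sketch like the one below may help. The sample directory and file names are made up for illustration, and the numbers depend heavily on what you compress, so treat the output as a quick check, not a benchmark.

```shell
# Build a small, compressible sample directory (hypothetical test data).
workdir=$(mktemp -d)
mkdir "$workdir/sample"
seq 1 50000 > "$workdir/sample/numbers.txt"
cd "$workdir"

# Create one archive per compressor, skipping any tool that is not installed.
tar -czf sample.tar.gz sample
if command -v bzip2 >/dev/null 2>&1; then tar -cjf sample.tar.bz2 sample; fi
if command -v xz >/dev/null 2>&1; then tar -cJf sample.tar.xz sample; fi

# Print the size in bytes of each archive that was created.
for archive in sample.tar.*; do
    printf '%s\t%s bytes\n' "$archive" "$(wc -c < "$archive")"
done
```

Prefix each tar command with `time` if you also want a rough speed comparison.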
Compress your files via tar
The following options are the ones most often used when compressing files:
- f – specify archive filename (required)
- c – create new archive (required)
- z – use gzip (.tar.gz) (optional, pick one)
- j – use bzip2 (.tar.bz2) (optional, pick one)
- J – use xz (.tar.xz) (optional, pick one)
Example 1: To compress all non-hidden files in the current directory into a .tar.gz archive, run:
tar -zcf output.tar.gz *
Example 2: Compress all files under ‘/var/www/wordpress’ into a .tar.xz file saved in the user’s home folder:
tar cfJ ~/output.tar.xz /var/www/wordpress/
This command prints “tar: Removing leading `/' from member names” during the process, which is a feature, not an error. The archive still keeps the full directory chain (var/www/wordpress/...), only without the leading slash.
If you don’t want to see this message, add the -P (--absolute-names) option, which stores member names with the leading slash intact. The command becomes:
tar cfJP ~/output.tar.xz /var/www/wordpress/
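To check which paths an archive actually stores, you can list it with the t option. The sketch below uses a temporary directory as a stand-in for /var/www/wordpress and gzip instead of xz; the file names are invented for illustration.

```shell
# A temporary directory stands in for /var/www/wordpress (hypothetical data).
src=$(mktemp -d)
out=$(mktemp -d)
echo "hello" > "$src/index.php"

# Archive by absolute path; tar prints its "Removing leading" notice on stderr.
tar -czf "$out/site.tar.gz" "$src"

# List the stored member names: none of them starts with a slash.
tar -tzf "$out/site.tar.gz"
```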
Example 3: A better choice in this case is to change into the ‘wordpress’ directory first and then compress:
cd /var/www/wordpress && tar jcf ~/output.tar.bz2 ./
This command compresses all files under ‘wordpress’ into a bzip2 archive in the user’s home folder, without the leading path (/var/www/wordpress). Using * (see Example 1) instead of ./ in the command also leaves the ‘.’ directory entry out of the archive, but because the shell expands *, hidden files at the top level (names starting with a dot) are skipped as well.
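The difference between archiving ./ and * can be seen in a small experiment. This is a sketch with an invented site directory; it uses gzip for brevity, and the -C option as an alternative to running cd first.

```shell
# Hypothetical site directory with one regular and one hidden file.
site=$(mktemp -d)
out=$(mktemp -d)
echo "<?php" > "$site/index.php"
echo "deny from all" > "$site/.htaccess"

# -C switches into the directory before adding ".", so no cd is needed.
tar -czf "$out/with-dot.tar.gz" -C "$site" .

# With *, the shell expands the pattern and skips names starting with a dot.
(cd "$site" && tar -czf "$out/with-star.tar.gz" *)

echo "== archived as . =="
tar -tzf "$out/with-dot.tar.gz"
echo "== archived as * =="
tar -tzf "$out/with-star.tar.gz"
```

The first listing shows every entry prefixed with ./ and includes .htaccess; the second shows index.php only.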
Example 4: To use the lzip format, compress the ‘wordpress’ folder into a .tar.lz file saved in the current directory with:
tar cf output.tar.lz --lzip /var/www/wordpress/
To create an lzma archive, use:
tar cf output.tar.lzma --lzma /var/www/wordpress/
Or, to create an lzop archive:
tar cf output.tar.lzop --lzop /var/www/wordpress/
Change Compression Level
The compressors that tar calls support compression levels from 1 to 9 (xz also accepts 0). Level 1 is the fastest with the lowest compression, while level 9 is the slowest with the smallest output. gzip and xz use level 6 by default; bzip2 uses level 9.
To specify compression level for gzip archive (change number accordingly):
GZIP=-9 tar -zcf output.tar.gz /var/www/wordpress/
To specify compression level for xz archive:
XZ_OPT=-9 tar cfJ ~/output.tar.xz /var/www/wordpress/
And to set the compression level for bzip2 (9 is already its default, so a lower number trades size for speed):
BZIP2=-9 tar jcf output.tar.bz2 /var/www/wordpress
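Another way to choose the level, which does not rely on environment variables, is to let tar write the archive to standard output and pipe it through the compressor. This can be useful because newer gzip releases treat the GZIP variable as obsolescent and may print a warning. The sketch below uses invented sample data and file names.

```shell
# Hypothetical sample data; replace with a real directory.
workdir=$(mktemp -d)
mkdir "$workdir/sample"
seq 1 50000 > "$workdir/sample/numbers.txt"
cd "$workdir"

# Write the uncompressed archive to stdout (-f -) and set the level on gzip itself.
tar -cf - sample | gzip -1 > fast.tar.gz
tar -cf - sample | gzip -9 > small.tar.gz

# Compare the resulting sizes in bytes.
wc -c fast.tar.gz small.tar.gz
```

The same pattern works with `xz -9` or `bzip2 -1` in place of gzip.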
Decompress gzip, xz, bzip2 archives
To decompress files, use the x option (extract) instead of c in the tar command.
Example 1: Extract a .tar.gz archive to the current directory:
tar zxf output.tar.gz
If a file to be extracted already exists in the destination, tar replaces it by removing the existing file before extracting. A non-empty directory cannot be removed, so tar normally overwrites its metadata (ownership, permissions, and so on) instead.
Example 2: To preserve the metadata of such a directory, use --no-overwrite-dir. For example, extract a .tar.xz archive to the current directory and preserve the metadata of existing folders:
tar --no-overwrite-dir -Jxf output.tar.xz
Example 3: To extract a .tar.bz2 archive to the current directory and leave existing files untouched, add the --skip-old-files option:
tar --skip-old-files -jxf output.tar.bz2
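The effect of --skip-old-files can be verified with a throwaway archive. This is a minimal sketch assuming GNU tar (other tar implementations may not have this option); the directory and file names are hypothetical, and gzip is used for brevity.

```shell
# Hypothetical data: one file, archived and then changed locally.
src=$(mktemp -d)
dest=$(mktemp -d)
echo "from archive" > "$src/config.txt"
tar -czf "$dest/backup.tar.gz" -C "$src" config.txt

# Simulate a file that already exists in the destination with local changes.
echo "local edit" > "$dest/config.txt"

# --skip-old-files (GNU tar) leaves files that already exist alone.
tar --skip-old-files -xzf "$dest/backup.tar.gz" -C "$dest"

# The local version should still be in place.
cat "$dest/config.txt"
```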
Example 4: To extract a .tar.xz archive located in ‘~/Documents’ into a specific directory (/var/www/wordpress in this case) while leaving existing files untouched, use the -C option:
tar --skip-old-files -Jxf ~/Documents/output.tar.xz -C /var/www/wordpress
Example 5: If the archive contains the absolute locations, such as /var/www/wordpress/all-the-files, the last command creates a nested copy of the file structure on your system: /var/www/wordpress/var/www/wordpress/all-the-files
In this case, use the --strip-components option, so the command becomes:
tar --skip-old-files --strip-components=3 -Jxf ~/Documents/output.tar.xz -C /var/www/wordpress
Using a different number, such as --strip-components=1, extracts www/wordpress/all-the-files under the target directory, skipping only the leading ‘var’.
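To try --strip-components without touching a real web root, a sketch along these lines may be useful. The staging and target directories are temporary and the file names are invented; gzip is used for brevity.

```shell
# Hypothetical layout: the archive stores var/www/wordpress/index.php.
stage=$(mktemp -d)
target=$(mktemp -d)
mkdir -p "$stage/var/www/wordpress"
echo "<?php" > "$stage/var/www/wordpress/index.php"
tar -czf "$stage/site.tar.gz" -C "$stage" var

# Drop the three leading components (var/www/wordpress) while extracting.
tar --strip-components=3 -xzf "$stage/site.tar.gz" -C "$target"

# index.php now sits directly in the target directory.
ls "$target"
```

Listing the archive first with `tar -tzf` is a simple way to count how many leading components need to be stripped.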