Thursday, June 23, 2005

Unix File Compression Utilities

File Utilities

Archiving and Compressing Files

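On UNIX and Linux systems, gzip is probably the most common compression program. The newer but less common bzip2 usually produces noticeably smaller files, at the cost of slower compression. To compress a single file (each command replaces the original with filename.bz2 or filename.gz),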

[any UNIX machine]$ bzip2 filename
[any UNIX machine]$ gzip filename
To archive and compress a directory on a machine with a newer version of GNU tar,
[machine with GNU tar]$ tar cfj dirname.tar.bz2 dirname
[machine with GNU tar]$ tar cfz dirname.tar.gz dirname
The “j” option tells tar to compress with the more powerful bzip2 and the “z” option tells it to compress with the more commonplace gzip. If GNU tar is not available,
[any UNIX machine]$ tar cf - dirname | bzip2 -c >dirname.tar.bz2
[any UNIX machine]$ tar cf - dirname | gzip -c >dirname.tar.gz
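
The pipeline form can be tried out safely in a scratch directory. The following is only a minimal sketch: the scratch location and the sample file names are made up for illustration, and it uses gzip alone because bzip2 is not installed everywhere. It builds a small directory, archives it through the pipe, and uses "gzip -t" to check that the compressed stream is intact.

```shell
#!/bin/sh
# Sketch: archive a directory through a pipe, then check the result.
set -e

work=$(mktemp -d)            # scratch area (illustrative, not from the post)
cd "$work"

mkdir dirname                # sample directory with two made-up files
echo "hello" > dirname/a.txt
echo "world" > dirname/b.txt

# Same pipeline as in the text: tar writes the archive to stdout,
# gzip compresses it, and the shell redirects it into the .tar.gz file.
tar cf - dirname | gzip -c > dirname.tar.gz

# gzip -t tests the integrity of the compressed file without extracting it.
gzip -t dirname.tar.gz && echo "archive OK"
```

The same check works for bzip2 archives with "bzip2 -t dirname.tar.bz2".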

You can give several file and directory names to the tar command. It is good practice, though, to keep everything in one top-level directory, to avoid “polluting” the current directory when the archive is later extracted. (Imagine that you extract an archive with hundreds of files into your home directory. There is a real chance of overwriting existing files, and it may take some time to move the files to a more suitable place.)
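
Before extracting an archive from someone else, it helps to list its contents first; tar's "t" option prints the member names without writing any files. The sketch below is one possible way to check whether everything sits under a single top-level directory. The archive name "project.tar.gz" and the sample files are hypothetical, and the check assumes member names are not prefixed with "./".

```shell
#!/bin/sh
# Sketch: inspect an archive before extracting it.
set -e

work=$(mktemp -d)            # scratch area (illustrative)
cd "$work"

mkdir project                # build a small sample archive to inspect
echo one > project/one.txt
echo two > project/two.txt
tar cf - project | gzip -c > project.tar.gz

# List the members without extracting anything.
gunzip -c project.tar.gz | tar tf -

# Count the distinct first path components; tr strips the padding
# that some wc implementations put in front of the number.
toplevel=$(gunzip -c project.tar.gz | tar tf - | cut -d/ -f1 | sort -u | wc -l | tr -d ' ')

if [ "$toplevel" -eq 1 ]; then
    echo "one top-level directory: safe to extract here"
else
    echo "several top-level entries: extract into a fresh directory"
fi
```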

To uncompress the “.tar.gz” and “.tar.bz2” files,

[any UNIX machine]$ bunzip2 -c dirname.tar.bz2 | tar xf -
[any UNIX machine]$ gunzip -c dirname.tar.gz | tar xf -

[machine with GNU tar]$ tar xfj dirname.tar.bz2
[machine with GNU tar]$ tar xfz dirname.tar.gz
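
To gain some confidence that an archive really contains what was put into it, one option is to extract it into a separate directory and compare the result with the original. This is a rough sketch with made-up file names and a hypothetical "restore" directory; it uses the portable pipeline form so that it does not depend on GNU tar.

```shell
#!/bin/sh
# Sketch: round trip (archive, extract elsewhere, compare).
set -e

work=$(mktemp -d)            # scratch area (illustrative)
cd "$work"

mkdir dirname
echo "sample data" > dirname/file.txt
tar cf - dirname | gzip -c > dirname.tar.gz

# Extract inside a separate directory so the original is not overwritten.
mkdir restore
( cd restore && gunzip -c ../dirname.tar.gz | tar xf - )

# diff -r compares the two trees recursively and is silent when they match.
diff -r dirname restore/dirname && echo "round trip OK"
```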
