Make & Manage Archive Files in Linux


Guide on Archive Files in Linux

File and folder management is an essential task for every Linux system administrator. System admins often need to archive and compress older files and store them away to free up space for active project files. To do so, we need to know how to create an archive file and how to work with it: opening it, exploring its contents, and adding files to it or deleting files from it.
Put concisely, an archive is a single file that contains a collection of other files and/or directories. Archives are mostly used to transfer (locally or over the internet) or back up a collection of files and directories, which lets you work with one file instead of many. Archives are also used for packaging software applications. This single file can easily be compressed for ease of transfer, while the files in the archive retain the structure and permissions of the originals.

This tutorial shows how to use tar to create an archive, list the contents of an archive, and extract the files from an archive. Two common options used with all three of these operations are ‘-f’ and ‘-v’: to specify the name of the archive file, use ‘-f’ followed by the file name; use the ‘-v’ (“verbose”) option to have tar output the names of files as they are processed. While the ‘-v’ option is not necessary, it lets you observe the progress of your tar operation.

We cover the following 3 topics in this tutorial: 1- Make an archive file, 2- List contents of an archive file, and 3- Extract contents from an archive file. We conclude this tutorial by reviewing the 9 Frequently Asked Questions or FAQs related to archive file management. What you take away from this tutorial is essential for performing tasks related to cybersecurity and cloud technology.

 

1- Making an Archive File

To make an archive with tar, use the ‘-c’ (“create”) option, and specify the name of the archive file to create with the ‘-f’ option. It’s common practice to use a name with a ‘.tar’ extension, such as ‘my-backup.tar’.

To create an archive called ‘asset.tar’ from the contents of the ‘asset’ directory, type:

  $ tar -cvf asset.tar asset

This command creates an archive file called ‘asset.tar’ containing the ‘asset’ directory and all of its contents. The original ‘asset’ directory remains unchanged.

Use the ‘-z’ option to compress the archive with gzip as it is being written; by convention such an archive is given a ‘.tar.gz’ or ‘.tgz’ extension. This yields the same result as creating an uncompressed archive and then compressing it with gzip, but it eliminates the extra step. Question 4 of the FAQ section covers archive compression in more detail.
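
The two ways of creating an archive can be tried side by side. The following is a minimal sketch that works in a temporary directory; the sample files under ‘asset’ are made up for illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Build a small sample tree named "asset" (hypothetical contents).
mkdir -p asset/docs
echo "hello" > asset/readme.txt
echo "notes" > asset/docs/notes.txt

# Uncompressed archive, then a gzip-compressed one in a single step.
tar -cf asset.tar asset
tar -czf asset.tar.gz asset

# Both archives hold the same member list.
tar -tf asset.tar | sort > plain.list
tar -tzf asset.tar.gz | sort > gzip.list
cmp plain.list gzip.list && echo "same members"
```

The ‘-v’ option is left out here only to keep the output short.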

 

2- Listing Contents of an Archive File

To list the contents of a tar archive without extracting them, use tar with the ‘-t’ option.
To list the contents of an archive called ‘asset.tar’, type:

  $ tar -tvf asset.tar 

This command lists the contents of the ‘asset.tar’ archive. Using the ‘-v’ option along with the ‘-t’ option causes tar to output the permissions and modification time of each file, along with its file name—the same format used by the ls command with the ‘-l’ option.
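
As a small sketch of the listing options, the script below builds a throwaway archive (the file names are assumptions for illustration) and lists it with and without ‘-v’. It also shows that naming a member after the archive limits the listing to that member.

```shell
# Work in a throwaway directory with a tiny sample archive.
workdir=$(mktemp -d)
cd "$workdir"
mkdir asset
echo "hello" > asset/readme.txt
tar -cf asset.tar asset

# Names only: one member path per line.
tar -tf asset.tar

# Long listing: permissions, owner, size, and modification time as well.
tar -tvf asset.tar

# Listing can be limited to specific members by naming them.
names=$(tar -tf asset.tar asset/readme.txt)
echo "$names"
```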

3- Extracting Contents from an Archive File

To extract (or unpack) the contents of a tar archive, use tar with the ‘-x’ (“extract”) option.
To extract the contents of an archive called ‘asset.tar’, type:

  $ tar -xvf asset.tar

This command extracts the contents of the ‘asset.tar’ archive into the current directory.
If an archive is compressed, which usually means it will have a ‘.tar.gz’ or ‘.tgz’ extension, include the ‘-z’ option.

To extract the contents of a compressed archive called ‘asset.tar.gz’, type:

  $ tar -zxvf asset.tar.gz
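
Extraction does not have to land in the current directory. The sketch below uses tar's ‘-C’ option, which is not covered above, to unpack into a different directory; the ‘restore’ directory name and the sample file are assumptions for illustration.

```shell
# Work in a throwaway directory with a small compressed archive.
workdir=$(mktemp -d)
cd "$workdir"
mkdir asset
echo "hello" > asset/readme.txt
tar -czf asset.tar.gz asset

# Extract somewhere other than the current directory with -C.
# The target directory must already exist.
mkdir restore
tar -xzf asset.tar.gz -C restore

# The extracted copy matches the original tree.
diff -r asset restore/asset && echo "trees match"
```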

FAQs for Archiving in Linux

Now that we have learned how to create an archive file and list or extract its contents, we can move on to the following 9 frequently asked questions that you may run into while working with Linux archives.

  • Can we add content to an archive file without unpacking it?

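That depends on whether the archive is compressed. Files can be added to an uncompressed tar archive with the ‘-r’ (“append”) option. Once an archive has been compressed, however, tar cannot add content to it in place. You would have to decompress it (or extract the contents), add the content, and then compress it again.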
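
For an archive that has not been compressed, tar's ‘-r’ (“append”) option can add members without unpacking anything. The sketch below shows that case and the decompress, append, recompress round trip needed for a gzip-compressed archive; all file names are made up for illustration.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"
echo "one" > file1
echo "two" > file2
tar -cf file.tar file1

# -r (--append) adds members to the end of an existing uncompressed archive.
tar -rf file.tar file2
tar -tf file.tar

# A compressed archive has to be decompressed before appending.
gzip file.tar            # produces file.tar.gz
echo "three" > file3
gunzip file.tar.gz       # back to file.tar
tar -rf file.tar file3
gzip file.tar            # compress again
members=$(tar -tzf file.tar.gz)
echo "$members"
```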

  • Can we remove content from an archive file without unpacking it?

This depends on the version of tar being used. GNU tar supports a ‘--delete’ option.
For example, let’s say the archive file.tar contains the files file1 and file2. They can be removed from file.tar with the following:

  $ tar -vf file.tar --delete file1 file2

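To remove a directory dir1 together with everything under it, name the directory itself. An unquoted dir1/* would be expanded by the shell against the current directory rather than against the archive:

  $ tar -f file.tar --delete dir1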
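
The deletion commands can be checked on a scratch archive first. This is a sketch that assumes GNU tar (other tar implementations may not offer ‘--delete’) and an uncompressed archive; the member names are hypothetical.

```shell
# Work in a throwaway directory with a small sample archive.
workdir=$(mktemp -d)
cd "$workdir"
mkdir dir1
echo "a" > file1
echo "b" > file2
echo "c" > dir1/inner
tar -cf file.tar file1 file2 dir1

# Remove one member; the archive is rewritten in place.
tar -f file.tar --delete file1

# Naming a directory removes it together with everything under it.
tar -f file.tar --delete dir1

remaining=$(tar -tf file.tar)
echo "$remaining"
```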

 

  • What are the differences between compressing a folder and archiving it?

The simplest way to see the difference between archiving and compressing is to look at the end result. When you archive files, you combine multiple files into one. So if we archive ten 100 KB files, we end up with one file of roughly 1000 KB (plus a small amount of tar metadata). If we also compress the archive, the result could be anywhere from a few KB to close to 1000 KB, depending on how compressible the contents are.
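
The size difference is easy to measure. The sketch below creates ten 100 KB files filled with zero bytes, which is an assumption chosen because such data compresses extremely well; real files will usually shrink far less.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"
mkdir data

# Ten files of 100 KB each, filled with zero bytes (highly compressible).
for i in 1 2 3 4 5 6 7 8 9 10; do
  head -c 102400 /dev/zero > "data/file$i"
done

tar -cf data.tar data        # archive only
tar -czf data.tar.gz data    # archive and compress

# Compare the two sizes in bytes.
plain_size=$(wc -c < data.tar | tr -d ' ')
gzip_size=$(wc -c < data.tar.gz | tr -d ' ')
echo "data.tar:    $plain_size bytes"
echo "data.tar.gz: $gzip_size bytes"
```

The uncompressed archive comes out slightly larger than the sum of its files because tar adds a header for each member.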

  • How to compress archive files?

As we saw above, you can create an archive file using the tar command with the cvf options. There are two ways to compress the archive: run the finished archive file through a compression tool such as gzip, or use a compression flag with the tar command itself. The most common compression flags are z for gzip, j for bzip2 and J for xz. We can see the first method below; it replaces file.tar with file.tar.gz:

  $ gzip file.tar

Or we can just use a compression flag when using the tar command, here we’ll see the gzip flag “z”:

  $ tar -cvzf file.tar.gz /some/directory

  • How to create archives of multiple directories and/or files simultaneously?

As a system admin, you run into many situations where you should archive multiple files or directories simultaneously. To achieve this using the tar command, you just simply supply which files or directories you want to archive as arguments to the tar command as shown below:

  $ tar -cvzf file.tar.gz file1 file2 file3

or

  $ tar -cvzf file.tar.gz /some/directory1 /some/directory2
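
Files and directories can also be mixed in one command. A minimal sketch, with made-up names, that archives one file and two directories and then counts the members:

```shell
# Work in a throwaway directory with sample content.
workdir=$(mktemp -d)
cd "$workdir"
mkdir directory1 directory2
echo "a" > directory1/a.txt
echo "b" > directory2/b.txt
echo "c" > file1

# Any mix of files and directories can be listed after the archive name.
tar -czf file.tar.gz file1 directory1 directory2

# Count the members: one file, two directories, and one file inside each.
count=$(tar -tzf file.tar.gz | grep -c .)
echo "$count members"
```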

  • How to skip directories and/or files when creating an archive?

You may run into a situation where you want to archive a directory but don’t need certain files or subdirectories in the archive. To leave them out, use the ‘--exclude’ option with tar:

  $ tar --exclude='some/directory' -cvf file.tar /home/user

In this example /home/user is archived, but the some/directory directory beneath it is left out. Put the ‘--exclude’ option before the paths to archive, and enclose the pattern in plain single quotation marks so that the shell passes any wildcards through to tar unchanged.
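
The exclude option can be repeated and accepts wildcard patterns. The sketch below is an illustration with a hypothetical ‘user’ directory; it skips one subdirectory and every file ending in .log.

```shell
# Work in a throwaway directory with sample content.
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p user/cache user/docs
echo "tmp" > user/cache/blob
echo "keep" > user/docs/report.txt
echo "log" > user/docs/debug.log

# Skip one directory and every *.log file.
# Quoting the glob stops the shell from expanding it before tar sees it.
tar --exclude='user/cache' --exclude='*.log' -cf file.tar user

listing=$(tar -tf file.tar)
echo "$listing"
```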

  • How is shar different from tar?

The biggest difference between tar and shar is that a shar archive is a shell script that recreates the files when it is executed. A shar archive is plain text, which can be an advantage, but because it is executable it can pose a security risk. Note that shar is found mainly on older Linux systems, so you may still need it when maintaining or patching such systems.

How to use shar:

  $ shar file.extension > file.shar

How to unshar:

  $ unshar file.shar

  • How is ar different from tar?

ar is mainly used for binary object files. ar creates a flat set of files, whereas tar maintains directory structure, so tar is much more suitable for distributing directories and files. How to use ar (file1.o and file2.o stand in for your own object files):

  $ ar cr libmath.a file1.o file2.o

where c creates the archive and r inserts the named files as members, replacing any existing members of the same name.
To extract an ar file:

  $ ar x libmath.a

  • How is cpio different from tar?

cpio stands for copy in and out. The functions of cpio and tar are fairly similar. However, tar is more widely used and much simpler. The file format is also different between the two. As you’ll see in the example below, cpio is a little more painful to use compared to tar:

  $ ls | cpio -ov > /path/to/output/folder/obj.cpio

where -o is copy-out mode, which reads a list of file names from standard input and writes the archive to standard output, and -v is verbose.
To extract cpio files:

  $ cpio -idv < /path/to/output/folder/obj.cpio

where -i is extract, -d is make directories and -v is verbose.

Summary

The tar command is very handy for creating backups or compressing files you no longer need. It’s good practice to back up files before changing them: if something fails to work as intended, you can always revert to the old file. Compressing files that are no longer in use helps keep systems clean and lowers disk space usage. There are other utilities available, but tar has reigned supreme for its versatility, ease of use and popularity.

 
