Study Guide: CompTIA Linux+ Certification: Protecting Files
Source: https://www.fatskills.com/sat/chapter/comptia-linux-certification-protecting-files

CompTIA Linux+ Certification: Protecting Files

By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.


Protecting data includes creating and managing backups. A backup, often called an archive, is a copy of data that can be restored sometime in the future should the data be destroyed or become corrupted.
Backing up your data is a critical activity, but even more important is planning your backups. These plans include choosing backup types, determining the right compression methods to employ, and identifying which utilities will serve your organization's data needs best. You may also need to transfer your backup files over the network. In this case, it is critical both to keep the archive secure during transit and to validate its integrity once it arrives at its destination. All of these topics concerning protecting your data files are covered in this guide.

Key Topics:
Understanding Backup Types
Looking at Compression Methods
Comparing Archive and Restore Utilities
Securing Offsite/Off-System Backups
Checking Backup Integrity

Understanding Backup Types
There are different classifications for data backups. Understanding these various categories is vital for developing your backup plan. The following backup types are the most common types:
System image
Full
Incremental
Differential
Snapshot
Snapshot clone

Each of these backup types is explored in this section. Their advantages and disadvantages are included.

System Image- A system image is a copy of the operating system binaries, configuration files, and anything else you need to boot the Linux system. Its purpose is to quickly restore your system to a bootable state. Sometimes called a clone, these backups are not normally used to recover individual files or directories, and in the case of some backup utilities, you cannot do so.
Full- A full backup is a copy of all the data, ignoring its modification date. This backup type's primary advantage is that it takes a lot less time than other types to restore a system's data. However, not only does it take longer to create a full backup compared to the other types, it also requires more storage. It needs no other backup types to restore a system fully.
Incremental- An incremental backup only makes a copy of data that has been modified since the last backup operation (any backup operation type). Typically, a file's modified timestamp is compared to the timestamp of the last backup. It takes a lot less time to create this backup type than the other types, and it requires a lot less storage space. However, the data restoration time for this backup type can be significant. Imagine that you performed a full backup copy on Monday and incremental backups on Tuesday through Friday. On Saturday the disk crashes and must be replaced. After the disk is replaced, you will have to restore the data using Monday's backup and then continue to restore data using the incremental backups created on Tuesday through Friday, in order. This is very time-consuming and will cause significant delays in getting your system back in operation. Therefore, to keep restore times manageable, a full backup should still be completed periodically.
Differential- A differential backup makes a copy of all data that has changed since the last full backup. It could be considered a good balance between full and incremental backups. This backup type takes less time than a full backup but potentially more time than an incremental backup. It requires less storage space than a full backup but more space than a plain incremental backup. Also, it takes a lot less time to restore using differential backups than incremental backups, because only the full backup and the latest differential backup are needed. For optimization purposes, it requires a full backup to be completed periodically.
Snapshot- A snapshot backup is considered a hybrid approach, and it is a slightly different flavor of backups. First a full (typically read-only) copy of the data is made to backup media. Then pointers, such as hard links, are employed to create a reference table linking the backup data with the original data. The next time a backup is made, instead of a full backup, an incremental backup is made (only modified or new files are copied to the backup media), and the pointer reference table is copied and updated. This saves space because only modified files and the updated pointer reference table need to be stored for each additional backup.

Note: A snapshot backup can also take the form of a split-mirror snapshot, where the data is kept on a mirrored storage device. When a backup is run, a copy of all the data is created, not just new or modified data.

With a snapshot backup, you can go back to any point in time and do a full system restore from that point. It also uses a lot less space than the other backup types. In essence, snapshots simulate multiple full backups per day without taking up the same space or requiring the same processing power as a full backup type would. The rsync utility (described later in this guide) uses this method.
Snapshot Clone- Another variation of a snapshot backup is a snapshot clone. Once a snapshot is created, such as an LVM snapshot, it is copied, or cloned. Snapshot clones are useful in high data I/O environments. When performing the cloning, you minimize any adverse performance impacts to production data I/O because the clone backup takes place on the snapshot and not on the original data.

While not all snapshots are writable, snapshot clones are typically modifiable. If you are using LVM, you can mount these snapshot clones on a different system. Thus, a snapshot clone is useful in disaster recovery scenarios.
Your particular server environment as well as data protection needs will dictate which backup method to employ. Most likely you need a combination of the preceding types to properly protect your data.
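
The difference between incremental and differential selection can be sketched with timestamp marker files and the find command. This is only an illustration under stated assumptions: the directory, file names, dates, and marker files (full.marker and last.marker) are hypothetical, and GNU touch is assumed for the -d date strings. Real backup tools track this state in their own way.

```shell
#!/bin/sh
# Sketch: which files an incremental vs. a differential backup would select.
# All paths, dates, and marker-file names are hypothetical.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir data
echo "alpha" > data/a.txt
echo "beta"  > data/b.txt

# Monday: the files exist, then a full backup runs (the marker records when).
touch -d '2024-01-01 08:00:00' data/a.txt data/b.txt
touch -d '2024-01-01 12:00:00' full.marker

# Tuesday: a.txt changes, then an incremental backup runs.
touch -d '2024-01-02 08:00:00' data/a.txt
touch -d '2024-01-02 12:00:00' last.marker

# Wednesday: b.txt changes; no backup has run yet.
touch -d '2024-01-03 08:00:00' data/b.txt

# Incremental: files changed since the last backup of ANY type.
incremental=$(find data -type f -newer last.marker | sort)
# Differential: files changed since the last FULL backup.
differential=$(find data -type f -newer full.marker | sort)

echo "Incremental would copy:";  echo "$incremental"
echo "Differential would copy:"; echo "$differential"
```

The incremental list only needs Wednesday's change, while the differential list also repeats Tuesday's change. That repetition is why a differential restore needs just the full backup plus the latest differential.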

Looking at Compression Methods
Backing up data can potentially consume large amounts of additional disk or media space. Depending on the backup types you employ, you can reduce this consumption via data compression utilities.

The following popular utilities are available on Linux:
gzip
bzip2
xz
zip

The advantages and disadvantages of each of these data compression methods are explored in this section.

gzip- The gzip utility was developed in 1992 as a replacement for the old compress program. Using the Lempel-Ziv (LZ77) algorithm to achieve text-based file compression rates of 60–70 percent, gzip has long been a popular data compression utility. To compress a file, simply type gzip followed by the file's name. The original file is replaced by a compressed version with a .gz file extension. To reverse the operation, type gunzip followed by the compressed file's name.
bzip2- Developed in 1996, the bzip2 utility offers higher compression rates than gzip but takes slightly longer to perform the data compression. The bzip2 utility employs multiple layers of compression techniques and algorithms. Until 2013, this data compression utility was used to compress the Linux kernel for distribution. To compress a file, simply type bzip2 followed by the file's name. The original file is replaced by a compressed version with a .bz2 file extension. To reverse the operation, type bunzip2 followed by the compressed file's name, which decompresses (inflates) the data.

Note: Originally, there was a bzip utility program. However, in its layered approach, a patented data compression algorithm was employed. Thus, bzip2 was created to replace it and uses the Huffman coding algorithm instead, which is patent free.

xz- Developed in 2009, the xz data compression utility quickly became very popular among Linux administrators. It boasts a higher default compression rate than bzip2 and gzip via the LZMA2 compression algorithm. However, with certain xz command options, you can employ the legacy LZMA compression algorithm, if needed or desired. The xz compression utility in 2013 replaced bzip2 for compressing the Linux kernel for distribution. To compress a file, simply type xz followed by the file's name. The original file is replaced by a compressed version with an .xz file extension. To reverse the operation, type unxz followed by the compressed file's name.
zip- The zip utility has the ability to operate on multiple files. If you have ever created a zip file on a Windows operating system, then you've used this file format. Multiple files are packed together in a single file, often called a folder or an archive file, and then compressed. Another difference from the other Linux compression utilities is that zip does not replace the original file(s). Instead, it places a copy of the file(s) into the archive file.
To archive and compress files with zip, type zip followed by the final archive file's name, which traditionally ends in a .zip extension. After the archive file, type one or more files you desire to place into the compressed archive, separating them with a space. The original files remain intact, but a copy of them is placed into the compressed zip archive file. To reverse the operation, type unzip followed by the compressed archive file's name.
It's helpful to see a side-by-side comparison of the various compression utilities using their defaults.

List: Comparing the various Linux compression utilities
# cp /var/log/wtmp wtmp
# cp wtmp wtmp1
# cp wtmp wtmp2
# cp wtmp wtmp3
# cp wtmp wtmp4
# ls -lh wtmp?
-rw-r--r--. 1 root root 210K Oct 9 19:54 wtmp1
-rw-r--r--. 1 root root 210K Oct 9 19:54 wtmp2
-rw-r--r--. 1 root root 210K Oct 9 19:54 wtmp3
-rw-r--r--. 1 root root 210K Oct 9 19:54 wtmp4
# gzip wtmp1
# bzip2 wtmp2
# xz wtmp3
# zip wtmp4.zip wtmp4
adding: wtmp4 (deflated 96%)
#
# ls -lh wtmp?.*
-rw-r--r--. 1 root root 7.7K Oct 9 19:54 wtmp1.gz
-rw-r--r--. 1 root root 6.2K Oct 9 19:54 wtmp2.bz2
-rw-r--r--. 1 root root 5.2K Oct 9 19:54 wtmp3.xz
-rw-r--r--. 1 root root 7.9K Oct 9 19:55 wtmp4.zip
# ls wtmp?
wtmp4

In the above List, first the /var/log/wtmp file is copied to the local directory using super user privileges. Four copies of this file are then made. Using the ls -lh command, you can see in human-readable format that the wtmp files are 210K in size. Next, the various compression utilities are employed. Notice that when using the zip command, you must give it the name of the archive file, wtmp4.zip, and follow it with any file names. In this case, only wtmp4 is put into the zip archive. After the files are compressed with the various utilities, another ls -lh command is issued in the List above. Notice the various file extension names as well as the files' compressed sizes. You can see that the xz program produces the highest compression of this file, because its file is the smallest in size.

Note: Each of these utilities accepts a compression level via the -# option. The # is a number from 1 to 9, where 1 is the fastest but lowest compression and 9 is the slowest but highest compression. The xz and zip utilities also accept -0; for zip, -0 stores files without compressing them. The gzip, xz, and zip utilities use -6 as the default compression level, while bzip2 defaults to -9. It is a good idea to review these level specifications in each utility's man page, since useful but subtle differences exist.

There are many compression methods. However, when you use a compression utility along with an archive and restore program for data backups, it is vital that you use a lossless compression method. A lossless compression is just as it sounds: no data is lost. The gzip, bzip2, xz, and zip utilities provide lossless compression. Obviously it is important not to lose data when doing backups.
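
One way to convince yourself that a compression method is lossless is to compare checksums before and after a round trip. The following is a minimal sketch using gzip at two compression levels. The sample file and its contents are hypothetical, and sha256sum is assumed to be available.

```shell
#!/bin/sh
# Sketch: compress at two levels, decompress, and confirm nothing was lost.
set -eu
work=$(mktemp -d)
cd "$work"

# Build a repetitive, log-like sample file (hypothetical content).
i=0
while [ "$i" -lt 2000 ]; do
    echo "session $i: user logged in from console"
    i=$((i + 1))
done > sample.log

original_sum=$(sha256sum sample.log | cut -d' ' -f1)
original_size=$(wc -c < sample.log)

# -c writes to STDOUT, so sample.log stays in place for comparison.
gzip -1 -c sample.log > fast.gz
gzip -9 -c sample.log > best.gz
best_size=$(wc -c < best.gz)

# Decompress and checksum the result.
gunzip -c best.gz > restored.log
restored_sum=$(sha256sum restored.log | cut -d' ' -f1)

ls -l sample.log fast.gz best.gz
if [ "$original_sum" = "$restored_sum" ]; then
    echo "Round trip is lossless: checksums match"
fi
```

The same check works for bzip2 and xz by swapping in bzip2 -c / bunzip2 -c or xz -c / unxz -c.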

Comparing Archive and Restore Utilities
There are several programs you can employ for managing backups. Some of the more popular products are Amanda, Bacula, Bareos, Duplicity, and BackupPC. Yet often these GUI and/or web-based programs have command-line utilities at their core. Our focus here is on those command-line utilities:
cpio
dd
tar

Copying with cpio
The cpio utility's name stands for “copy in and out.” It gathers together file copies and stores them in an archive file. The program has several useful options.

TABLE: The cpio command's commonly used options

Short	Long	Description
-I	N/A	Designates an archive file to use.
-i	--extract	Copies files from an archive or displays the files within the archive, depending on the other options employed. Called copy-in mode.
N/A	--no-absolute-filenames	Designates that only relative path names are to be used. (The default is to use absolute path names.)
-o	--create	Creates an archive by copying files into it. Called copy-out mode.
-t	--list	Displays a list of files within the archive. This list is called a table of contents.
-v	--verbose	Displays each file's name as each file is processed.

To create an archive using the cpio utility, you have to generate a list of files and then pipe them into the command.

List: Employing cpio to create an archive
$ ls Project4?.txt
Project42.txt Project43.txt Project44.txt
Project45.txt Project46.txt
$ ls Project4?.txt | cpio -ov > Project4x.cpio
Project42.txt
Project43.txt
Project44.txt
Project45.txt
Project46.txt
59 blocks
$ ls Project4?.*
Project42.txt Project44.txt Project46.txt
Project43.txt Project45.txt Project4x.cpio

Using the ? wildcard and the ls command, various text files within the present working directory are displayed first in the List above. The same ls command is then run again, and its STDOUT is piped as STDIN to the cpio utility. (Read “Searching and Analyzing Text” if you need a refresher on STDOUT and STDIN.) The options used with the cpio command are -ov, which create an archive containing copies of the listed files and display each file's name as it is copied into the archive. The archive file used is named Project4x.cpio. Though not necessary, it is considered good form to use the .cpio extension on cpio archive files.

Note: You can also back up files based on their metadata, rather than their location, by pairing the find command with the cpio utility. For example, suppose you want to create a cpio archive for any files within the virtual directory system owned by the JKirk user account. You can use the find / -user JKirk command and pipe it into the cpio utility in order to create the archive file. This is a handy feature.

You can view the files stored within a cpio archive fairly easily. Just employ the cpio command again, and use its -itv options and the -I option to designate the archive file.

List: Using cpio to list an archive's contents
$ cpio -itvI Project4x.cpio
-rw-r--r-- 1 Christin Christin 29900 Aug 19 17:37 Project42.txt
-rw-rw-r-- 1 Christin Christin 0 Aug 19 18:07 Project43.txt
-rw-rw-r-- 1 Christin Christin 0 Aug 19 18:07 Project44.txt
-rw-rw-r-- 1 Christin Christin 0 Aug 19 18:07 Project45.txt
-rw-rw-r-- 1 Christin Christin 0 Aug 19 18:07 Project46.txt

Though not displayed in the List above, the cpio utility stores each file under the path name it was given, so an archive built from absolute path names keeps each file's absolute directory reference. Thus, it is often used to create system image and full backups.
To restore files from an archive, employ just the -ivI options. However, because cpio maintains the files' absolute paths, this can be tricky if you need to restore the files to another directory location. To do this, you need to use the --no-absolute-filenames option.

List: Using cpio to restore files to a different directory location
$ ls -dF Projects
Projects/
$ mv Project4x.cpio Projects/
$ cd Projects
$ pwd
/home/Christine/Answers/Projects
$ ls
Project4x.cpio
$ cpio -iv --no-absolute-filenames -I Project4x.cpio

In the above list, the Project4x.cpio archive file is moved into a preexisting subdirectory, Projects. By stripping the absolute path names from the archived files via the --no-absolute-filenames option, you restore the files to a new directory location. If you wanted to restore the files to their original location, simply leave that option off and just use the other cpio switches.
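
The whole cpio workflow (create, list, and restore to a different location) can be sketched in one short script. This assumes GNU cpio is installed. The directory and file names are hypothetical, and find -print0 with cpio --null is used here instead of ls so that unusual filenames survive the pipe.

```shell
#!/bin/sh
# Sketch: archive files by absolute path, then restore them under ./restore.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir -p data/reports restore
echo "q1" > data/reports/q1.txt
echo "q2" > data/reports/q2.txt

if command -v cpio >/dev/null 2>&1; then
    # Copy-out mode: absolute names in, NUL-separated.
    find "$work/data" -type f -print0 | cpio -o --null --quiet > data.cpio

    # Table of contents: the members are listed with their absolute paths.
    members=$(cpio -it --quiet -I data.cpio)
    echo "$members"

    # Copy-in mode: strip the leading / so files land below ./restore.
    # -d creates any directories that are missing.
    (cd restore && cpio -id --quiet --no-absolute-filenames -I ../data.cpio)
    restored=$(find restore -type f | wc -l)
    echo "restored $restored files under $work/restore"
else
    echo "cpio is not installed; skipping the sketch" >&2
    members=""
    restored=0
fi
```

Without --no-absolute-filenames, the copy-in step would try to write the files back to their original absolute locations.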

Archiving with tar
The tar utility's name stands for tape archiver, and it is popular for creating data backups. As with cpio, with the tar command, the selected files are copied and stored in a single file. This file is called a tar archive file. If this archive file is compressed using a data compression utility, the compressed archive file is called a tarball.
The tar program has several useful options.

TABLE: The tar command's commonly used tarball creation options

Short	Long	Description
-c	--create	Creates a tar archive file. The backup can be a full or incremental backup, depending on the other selected options.
-u	--update	Appends files to an existing tar archive file, but only copies those files that were modified since the original archive file was created.
-g	--listed-incremental	Creates an incremental or full archive based on metadata stored in the provided file.
-z	--gzip	Compresses tar archive file into a tarball using gzip.
-j	--bzip2	Compresses tar archive file into a tarball using bzip2.
-J	--xz	Compresses tar archive file into a tarball using xz.
-v	--verbose	Displays each file's name as each file is processed.

To create an archive using the tar utility, you have to add a few arguments to the options and the command.

List: Using tar to create an archive file
$ tar -cvf Project4x.tar Project4?.txt
Project43.txt

In the above List, three options are used. The -c option creates the tar archive. The -v option displays the filenames as they are placed into the archive file. Finally, the -f option designates the archive filename, which is Project4x.tar. Though not required, it is considered good form to use the .tar extension on tar archive files. The command's last argument designates the files to copy into this archive.

Tip: You can also use old-style tar command options. For this style, you remove the single dash from the beginning of the tar option. For example, -c becomes c. Keep in mind that additional old-style tar command options must not have spaces between them. Thus, tar cvf is valid, but tar c v f is not.

If you are backing up lots of files or large amounts of data, it is a good idea to employ a compression utility. This is easily accomplished by adding an additional switch to your tar command options.

List: Using tar to create a tarball
$ tar -zcvf Project4x.tar.gz Project4?.txt
$ ls Project4x.tar.gz
Project4x.tar.gz

Notice that the tarball filename has the .tar.gz file extension. It is considered good form to use the .tar extension and tack on an indicator showing the compression method that was used. However, you can shorten it to .tgz if desired.
There is a useful variation of this command to create both full and incremental backups. A simple example helps to explain this concept.

List: Using tar to create a full backup
$ tar -g FullArchive.snar -Jcvf Project42.txz Project4?.txt
$ ls FullArchive.snar Project42.txz
FullArchive.snar Project42.txz

Notice the -g option. The -g option creates a file, called a snapshot file, FullArchive.snar. The .snar file extension indicates that the file is a tarball snapshot file. The snapshot file contains metadata used in association with tar commands for creating full and incremental backups. The snapshot file contains file timestamps, so the tar utility can determine if a file has been modified since it was last backed up. The snapshot file is also used to identify any files that are new or determine if files have been deleted since the last backup.
The previous example created a full backup of the designated files along with the metadata snapshot file, FullArchive.snar. Now the same snapshot file will be used to help determine if any files have been modified, are new, or have been deleted to create an incremental backup.

List: Using tar to create an incremental backup
$ echo "Answer to everything" >> Project42.txt
$ tar -g FullArchive.snar -Jcvf Project42_Inc.txz Project4?.txt
$ ls Project42_Inc.txz
Project42_Inc.txz

In the above List, the file Project42.txt is modified. Again, the tar command uses the -g option and points to the previously created FullArchive.snar snapshot file. This time, the metadata within FullArchive.snar shows the tar command that the Project42.txt file has been modified since the previous backup. Therefore, the new tarball only contains the Project42.txt file, and it is effectively an incremental backup. You can continue to create additional incremental backups using the same snapshot file as needed.

Note: The tar command views full and incremental backups in levels. A full backup is one that includes all the files indicated, and it is considered a level 0 backup. The first tar incremental backup after a full backup is considered a level 1 backup. The second tar incremental backup is considered a level 2 backup, and so on.

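
The level 0 and level 1 backups described above, plus the restore that follows them, can be sketched end to end. This assumes GNU tar. The directory, file, and archive names are hypothetical, and gzip is used in place of xz only to keep the sketch portable. Passing -g /dev/null on extraction follows the GNU tar manual's guidance for restoring incremental archives.

```shell
#!/bin/sh
# Sketch: level 0 and level 1 backups with one snapshot file, then a restore.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir data restore
echo "draft 1" > data/a.txt
echo "notes"   > data/b.txt

# Level 0: backup.snar does not exist yet, so everything is archived.
tar -g backup.snar -czf level0.tar.gz data

sleep 1   # make sure the next changes carry a later timestamp
echo "draft 2" >> data/a.txt
echo "new file" > data/c.txt

# Level 1: same snapshot file, so only new or changed files are archived.
tar -g backup.snar -czf level1.tar.gz data
level1_members=$(tar -tzf level1.tar.gz)
echo "$level1_members"

# Restore in order: level 0 first, then each incremental on top of it.
tar -xzf level0.tar.gz -g /dev/null -C restore
tar -xzf level1.tar.gz -g /dev/null -C restore
cat restore/data/a.txt
```

If you want every incremental to be relative to the full backup instead (a differential style), one common approach is to copy the level 0 snapshot file and pass the fresh copy to each later run.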
Whenever you create data backups, it is a good practice to verify them.

TABLE: The tar command's commonly used archive verification options

Short	Long	Description
-d	--compare --diff	Compares a tar archive file's members with external files and lists the differences.
-t	--list	Displays a tar archive file's contents.
-W	--verify	Verifies each file as the file is processed. This option cannot be used with the compression options.

Backup verification can take several different forms. You might ensure that the desired files (sometimes called members) are included in your backup by using the -v option on the tar command in order to watch the files being listed as they are included in the archive file. You can also verify that desired files are included in your backup after the fact. Use the -t option to list tarball or archive file contents.

List: Using tar to list a tarball's contents
$ tar -tf Project4x.tar.gz

You can verify files within an archive file by comparing them against the current files. The option to accomplish this task is the -d option.

List: Using tar to compare tarball members to external files
$ tar -df Project4x.tar.gz
Project42.txt: Mod time differs
Project42.txt: Size differs

Another good practice is to verify your backup automatically immediately after the tar archive is created. This is easily accomplished by tacking on the -W option.

List: Using tar to verify backed-up files automatically
$ tar -Wcvf ProjectVerify.tar Project4?.txt
Verify Project42.txt
Verify Project43.txt
Verify Project44.txt
Verify Project45.txt
Verify Project46.txt

You cannot use the -W option if you employ compression to create a tarball. However, you could create and verify the archive first and then compress it in a separate step. You can also use the -W option when you extract files from a tar archive. This is handy for instantly verifying files restored from archives.
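
Verification is easier to automate when you check the exit status rather than read the output. The sketch below assumes GNU tar, whose manual documents exit status 1 for a --compare run that finds differences. The file and archive names are hypothetical.

```shell
#!/bin/sh
# Sketch: use tar -d and its exit status to detect files changed after a backup.
set -u
work=$(mktemp -d)
cd "$work" || exit 1
echo "quarterly totals" > report.txt
tar -cf backup.tar report.txt

# Archive and disk still agree at this point.
clean_status=0
tar -df backup.tar || clean_status=$?

# Change the file after the backup, then compare again.
echo "late correction" >> report.txt
diff_status=0
diff_report=$(tar -df backup.tar 2>&1) || diff_status=$?

echo "exit status right after the backup: $clean_status"
echo "exit status after the edit:         $diff_status"
echo "$diff_report"
```

A backup script could treat a nonzero status here as a signal to investigate, or to schedule another backup run.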

Be aware that several options used to create the backup, such as -g and -W, can also be used when restoring data.

TABLE: The tar command's commonly used file restore options

Short	Long	Description
-x	--extract --get	Extracts files from a tarball or archive file and places them in the current working directory
-z	--gunzip	Decompresses files in a tarball using gunzip
-j	--bunzip2	Decompresses files in a tarball using bunzip2
-J	--unxz	Decompresses files in a tarball using unxz

Extracting files from an archive or tarball is fairly simple using the tar utility.

List: Using tar to extract files from a tarball
$ mkdir Extract
$ mv Project4x.tar.gz Extract/
$ cd Extract
$ tar -zxvf Project4x.tar.gz
$ ls
Project43.txt Project45.txt Project4x.tar.gz

In the above List, a new subdirectory, Extract, is created. The tarball created earlier in “Using tar to create a tarball” is moved to the new subdirectory, and then the files are restored from the tarball. If you compare the tar command used in this listing to the one used to create the tarball, you'll notice that here the -x option was substituted for the -c option. Also notice that the tarball is not removed after a file extraction, so you can use it again and again, as needed.

Note: The tar command has many additional capabilities, such as using tar backup parameters and/or the ability to create backup and restore shell scripts. Take a look at the GNU tar website, www.gnu.org/software/tar/manual, to learn more about this popular command-line backup utility.

Since the tar utility is the tape archiver, you can also place your tarballs or archive files on tape, if desired. After mounting and properly positioning your tape, simply substitute your SCSI tape device filename, such as /dev/st0 or /dev/nst0, in place of the archive or tarball filename within your tar command.

Duplicating with dd
The dd utility allows you to back up nearly everything on a disk, including the old Master Boot Record (MBR) partitions some older Linux distributions still employ. It's primarily used to create low-level copies of an entire hard drive or partition. It is often used in digital forensics for creating system images, for copying damaged disks, and for wiping partitions.
The command itself is fairly straightforward. The basic syntax structure for the dd utility is as follows:
dd if=input-device of=output-device [OPERANDS]
The output-device is either an entire drive or a partition. The input-device is the same. Just make sure that you assign the right device to of and the right one to if; otherwise you may unintentionally wipe data.
Besides the of and if, there are a few other arguments (called operands) that can assist in dd operations.

TABLE: The dd command's commonly used operands

Operand	Description
bs=BYTES	Sets the maximum block size (number of BYTES) to read and write at a time. The default is 512 bytes.
count=N	Sets the number (N) of input blocks to copy.
status=LEVEL	Sets the amount (LEVEL) of information to display to STDERR.

The status=LEVEL operand needs a little more explanation. LEVEL can be set to one of the following:
none only displays error messages.
noxfer does not display final transfer statistics.
progress displays periodic transfer statistics.

It is usually easier to understand the dd utility through examples.

List: Using dd to copy an entire disk
# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdb 8:16 0 4M 0 disk
⌙sdb1 8:17 0 4M 0 part
sdc 8:32 0 1G 0 disk
⌙sdc1 8:33 0 1023M 0 part
# dd if=/dev/sdb of=/dev/sdc status=progress
8192+0 records in
8192+0 records out
4194304 bytes (4.2 MB) copied, 0.232975 s, 18.0 MB/s

In the above List, the lsblk command is used first. When copying disks via the dd utility, make sure the drives are not mounted anywhere in the virtual directory structure. The two drives involved in this operation, /dev/sdb and /dev/sdc, are not mounted. With the dd command, the if operand is used to indicate the disk we wish to copy, which is the /dev/sdb drive. The of operand indicates that the /dev/sdc disk will hold the copied data. Also, status=progress will display periodic transfer statistics.
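
Because a mistyped of operand can destroy data, it can help to rehearse dd on ordinary files first. The following sketch uses image files as stand-ins for /dev/sdb and /dev/sdc. The file names and sizes are hypothetical, and GNU dd is assumed for the status operand.

```shell
#!/bin/sh
# Sketch: practice a dd copy on image files instead of real disks.
set -eu
work=$(mktemp -d)
cd "$work"

# A 4 MiB file of random bytes stands in for the source disk.
dd if=/dev/urandom of=source.img bs=4096 count=1024 status=none

# Copy it block for block, as dd would with if=/dev/sdb of=/dev/sdc.
dd if=source.img of=clone.img bs=4096 status=none

# cmp -s is silent and exits 0 only if the two files are identical.
clone_size=$(wc -c < clone.img)
if cmp -s source.img clone.img; then
    echo "clone.img matches source.img ($clone_size bytes)"
fi
```

The same if/of pattern applies to real devices once you have confirmed the device names with lsblk.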

You can also create a system image backup using a dd command similar to the one shown in the List above, with a few needed modifications. The basic steps are as follows:

Shut down your Linux system.
Attach the necessary spare drives. You'll need one drive the same size or larger for each system drive.
Boot the system using a live CD, DVD, or USB so that you can either keep the system's drives unmounted or unmount them prior to the backup operation.
For each system drive, issue a dd command, specifying the drive to back up with the if operand and the spare drive with the of operand.
Shut down the system, and remove the spare drives containing the system image.
Reboot your Linux system.

If you have a disk you are getting rid of, you can also use the dd command to zero out the disk.

List: Using dd to zero an entire disk
# dd if=/dev/zero of=/dev/sdc status=progress
1061724672 bytes (1.1 GB) copied, 33.196299 s, 32.0 MB/s
dd: writing to '/dev/sdc': No space left on device
2097153+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 34.6304 s, 31.0 MB/s

The if=/dev/zero operand uses the zero device file to write zeros to the disk. You need to perform this operation at least 10 times to thoroughly wipe the disk. You can also employ the /dev/random and/or the /dev/urandom device files to put random data onto the disk. This particular task can take a long time to run for large disks. It is still better to physically shred any disks that will no longer be used by your company.
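
The zeroing operation can also be rehearsed safely on an image file, along with a check that no non-zero bytes remain. This is a sketch under stated assumptions: old-disk.img is a hypothetical stand-in for a real device, GNU dd and tr are assumed, and a single pass is shown only to illustrate the check.

```shell
#!/bin/sh
# Sketch: zero out an image file and confirm that only zero bytes remain.
set -eu
work=$(mktemp -d)
cd "$work"

# A 2 MiB image filled with random data stands in for the old disk.
dd if=/dev/urandom of=old-disk.img bs=4096 count=512 status=none

# Overwrite it in place. A regular file has no fixed end, so count= bounds
# the write; conv=notrunc keeps dd from truncating the file first.
dd if=/dev/zero of=old-disk.img bs=4096 count=512 conv=notrunc status=none

# Delete every NUL byte from the stream and count what is left over.
leftover=$(tr -d '\0' < old-disk.img | wc -c)
size=$(wc -c < old-disk.img)
echo "$size bytes in image, $leftover non-zero bytes left"
```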

Replicating with rsync
The rsync utility, also covered in “Managing Files, Directories, and Text,” is known for speed. With this program, you can copy files locally or remotely, and it is wonderful for creating backups.
Before exploring the rsync program, it is a good idea to review a few of the commonly used options.

This guide contains the more commonly used rsync options. There are a few additional switches that help with secure data transfers via the rsync utility:
The -e, or --rsh, option changes the program to use for communication between a local and remote connection. The default is OpenSSH.
The -z, or --compress, option compresses the file data during the transfer.

The -a, or --archive, option is the equivalent of using the -rlptgoD options and does the following:
Directs rsync to copy files from the directory's contents and for any subdirectory within the original directory tree, consecutively copying their contents as well (recursively).
Preserves the following items:
Device files (only if run with super user privileges)
File group
File modification time
File ownership (only if run with super user privileges)
File permissions
Special files
Symbolic links

It's fairly simple to conduct an rsync backup locally. The most popular options, -ahv, allow you to back up files to a local location quickly.

List: Using rsync to back up files locally
$ ls -sh *.tar
40K Project4x.tar 40K ProjectVerify.tar
$ mkdir TarStorage
$ rsync -avh *.tar TarStorage/
sending incremental file list
Project4x.tar
ProjectVerify.tar
sent 82.12K bytes received 54 bytes 164.35K bytes/sec
total size is 81.92K speedup is 1.00
$ ls TarStorage
Project4x.tar ProjectVerify.tar
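
The hard-link snapshot method mentioned earlier in this guide can be sketched with rsync's --link-dest option. This is one possible approach rather than a complete backup scheme. The directory names are hypothetical, and the script skips itself if rsync is not installed.

```shell
#!/bin/sh
# Sketch: two local snapshots where unchanged files are shared via hard links.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir src
echo "stable"  > src/a.txt
echo "changes" > src/b.txt

# Helper: print a file's inode number.
inode() { ls -i "$1" | awk '{print $1}'; }

if command -v rsync >/dev/null 2>&1; then
    # First snapshot: a full copy of src.
    rsync -a src/ snap1/

    echo "edited after first snapshot" >> src/b.txt

    # Second snapshot: files identical to snap1 become hard links to it,
    # and only changed files are copied.
    rsync -a --link-dest="$work/snap1" src/ snap2/

    a_shared=no
    b_shared=no
    if [ "$(inode snap1/a.txt)" = "$(inode snap2/a.txt)" ]; then a_shared=yes; fi
    if [ "$(inode snap1/b.txt)" = "$(inode snap2/b.txt)" ]; then b_shared=yes; fi
    echo "a.txt shared between snapshots: $a_shared"
    echo "b.txt shared between snapshots: $b_shared"
else
    echo "rsync is not installed; skipping the sketch" >&2
    a_shared=unknown
    b_shared=unknown
fi
```

Each snapshot directory looks like a full backup, yet only the changed file takes up additional space.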

Where the rsync utility really shines is with protecting files as they are backed up over a network.
For a secure remote copy to work, you need the OpenSSH service up and running on the remote system. In addition, the rsync utility must be installed on both the local and remote machines.

List: Using rsync to back up files remotely
$ rsync -avP -e ssh *.tar user1@REMOTE_IP:~
user1@REMOTE_IP's password:
40,960 100% 7.81MB/s 0:00:00 (xfr#1, to-chk=1/2)
ProjectVerify.tar
40,960 100% 39.06MB/s 0:00:00 (xfr#2, to-chk=0/2)
sent 82,121 bytes received 54 bytes 18,261.11 bytes/sec
total size is 81,920 speedup is 1.00

Notice that the -avP options are used with the rsync utility. These options not only set the copy mode to archive but will provide detailed information as the file transfers take place. The important switch to notice in this listing is the -e option. This option determines that OpenSSH is used for the transfer and effectively creates an encrypted tunnel so that anyone sniffing the network cannot see the data flowing by. The *.tar in the command simply selects what local files are to be copied to the remote machine. The last argument in the rsync command specifies the following:
The user account (user1) located at the remote system to use for the transfer.
The remote system's IPv4 address, but a hostname can be used instead.
Where the files are to be placed. In this case, it is the home directory, indicated by the ~ symbol.

Notice also in that last argument that there is a needed colon (:) between the IPv4 address and the directory symbol. If you do not include this colon, rsync treats the whole argument as a local name and copies the files to it in the local directory instead of to the remote system.

Note: The rsync utility uses OpenSSH by default. However, it's good practice to use the -e option. This is especially true if you are using any ssh command options, such as designating an OpenSSH key to employ or using a different port than the default port of 22.

The rsync utility can be handy for copying large files to remote media. If you have a fast CPU but a slow network connection, you can speed things up even more by employing the rsync -z option to compress the data for transfer. This is not using gzip compression but instead applying compression via the zlib compression library. You can find more out about zlib at https://zlib.net.
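
When ssh options such as a nondefault port or a specific key are needed, they travel inside the -e argument. The sketch below only assembles and prints such a command so that it can be reviewed before being run. Every connection detail in it (the user, host, port, and key path) is a hypothetical placeholder.

```shell
#!/bin/sh
# Sketch: build an rsync-over-ssh command with a custom port and key.
# All connection details below are hypothetical placeholders.
remote_user="user1"
remote_host="backup.example.com"
ssh_port=2222
ssh_key="$HOME/.ssh/backup_key"

# -e takes the whole ssh command line as a single argument.
remote_shell="ssh -p $ssh_port -i $ssh_key"

# The colon marks the destination as remote; ~ is the remote home directory.
destination="$remote_user@$remote_host:~"

# Printed rather than executed, so the command can be checked first.
# -z adds compression during the transfer for slow network links.
printf 'rsync -avz -e "%s" *.tar %s\n' "$remote_shell" "$destination"
```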

Securing Offsite/Off-System Backups
In business, data is money. Thus it is critical not only to create data archives but also to protect them. There are a few additional ways to secure your backups when they are being transferred to remote locations.
Besides rsync, you can use the scp utility, which is based on the Secure Copy Protocol (SCP). Also, the sftp program, which is based on the SSH File Transfer Protocol (SFTP), is a means for securely transferring archives. We'll cover both utilities in the following sections.

Copying Securely via scp
The scp utility is geared for quickly transferring files in a noninteractive manner between two systems on a network. This program employs OpenSSH.
It is best used for small files that you need to securely copy on the fly, because if it gets interrupted during its operation, it cannot pick back up where it left off. For larger files or more extensive numbers of files, it is better to employ either the rsync or the sftp utility.
There are some rather useful scp options.

TABLE: The scp command's commonly used copy options

Short	Description
-C	Compresses the file data during transfer
-p	Preserves file access and modification times as well as file permissions
-r	Recursively copies a directory's contents, including the contents of every subdirectory in the original directory tree
-v	Displays verbose information concerning the command's execution

Performing a secure copy of files from a local system to a remote system is rather simple. You do need the OpenSSH service up and running on the remote system.

List: Using scp to copy files securely to a remote system
$ scp Project42.txt [email protected]:~
Project42.txt 100% 29KB 20.5MB/s 00:00
Notice that to accomplish this task, no scp command options are employed. The -v option gives a great deal of information that is not needed in this case.
The scp utility will overwrite any remote files with the same name as the one being transferred without asking or even displaying a message stating that fact. You need to be careful when copying files using scp that you don't tromp on any existing files.
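
One way to guard against that silent overwrite is to ask the remote system whether the target already exists before copying. The following is a minimal sketch, not a feature of scp itself: safe_scp is a hypothetical helper name, and the SSH and SCP variables are an assumption added so the programs can be swapped out for testing. It relies on ssh returning the exit status of the remote test command.

```shell
#!/bin/sh
# SSH and SCP name the programs to run; override them to test without a network.
SSH=${SSH:-ssh}
SCP=${SCP:-scp}

# safe_scp LOCAL_FILE USER@HOST REMOTE_PATH   (hypothetical helper)
safe_scp() {
    status=0
    # "test -e" runs on the remote system: 0 = path exists, 1 = it does not.
    "$SSH" "$2" test -e "$3" || status=$?
    case $status in
        0)  echo "Refusing to overwrite $2:$3" >&2
            return 1 ;;
        1)  # -p preserves times and permissions during the copy.
            "$SCP" -p "$1" "$2:$3" ;;
        *)  echo "Could not check $2:$3 (ssh exit status $status)" >&2
            return "$status" ;;
    esac
}
```

A call would look like safe_scp Project42.txt user@host Project42.txt. The check and the copy are two separate connections, so another process could still create the file in between; treat this as a convenience, not a lock.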
A handy way to use scp is to copy files from one remote machine to another remote machine.

List: Using scp to copy files securely from/to a remote system
$ ip addr show | grep 192 | cut -d" " -f6

192.168.0.101/24
$ scp [email protected]:Project42.txt [email protected]:~
[email protected]'s password:
Project42.txt 100% 29KB 4.8MB/s 00:00
Connection to 192.168.0.104 closed.

First in the above List, the current machine's IPv4 address is checked using the ip addr show command. Next the scp utility is employed to copy the Project42.txt file from one remote machine to another. Of course, you must have OpenSSH running on these machines and have a user account you can log into as well.

Transferring Securely via sftp
The sftp utility will also allow you to transfer files securely across the network. However, it is designed for a more interactive experience. With sftp, you can create directories as needed, immediately check on transferred files, determine the remote system's present working directory, and so on. In addition, this program employs OpenSSH.
To get a feel for how this interactive utility works, it's good to see a simple example.

List: Using sftp to access a remote system
$ sftp Christine@192.168.0.104
Christine@192.168.0.104's password:
Connected to 192.168.0.104.
sftp>
sftp> bye

In the above List, the sftp utility is used with a username and a remote host's IPv4 address. Once the user account's correct password is entered, the sftp utility's prompt is shown. At this point, you are connected to the remote system. At the prompt you can enter any sftp command, including help to see a display of all the possible commands and, as shown in the listing, bye to exit the utility. Once you have exited the utility, you are no longer connected to the remote system.
Before using the sftp interactive utility, it's helpful to know some of the more common commands.

TABLE: The sftp command's commonly used commands

Command	Description
bye	Exits the remote system and quits the utility.
exit	Exits the remote system and quits the utility.
get	Gets a file (or files) from the remote system and stores it (them) on the local system. Called downloading.
reget	Resumes an interrupted get operation.
put	Sends a file (or files) from the local system and stores it (them) on the remote system. Called uploading.
reput	Resumes an interrupted put operation.
ls	Displays files in the remote system's present working directory.
lls	Displays files in the local system's present working directory.
mkdir	Creates a directory on the remote system.
lmkdir	Creates a directory on the local system.
progress	Toggles on/off the progress display. (Default is on.)

It can be a little tricky the first few times you use the sftp utility if you have never used an FTP interactive program in the past.

List: Using sftp to copy a file to a remote system
sftp> ls
Desktop Documents Downloads Music Pictures
Public Templates
Videos
sftp> lls
AccountAudit.txt Grades.txt Project43.txt ProjectVerify.tar
err.txt Life Project44.txt TarStorage
Everything NologinAccts.txt Project45.txt Universe
Extract Project42_Inc.txz Project46.txt
FullArchive.snar Project42.txt Project4x.tar
Galaxy Project42.txz Projects
sftp> put Project4x.tar
Uploading Project4x.tar to /home/Christine/Project4x.tar
Project4x.tar 100% 40KB 15.8MB/s 00:00
sftp> ls
Desktop Documents Downloads Music Pictures
Project4x.tar Public Templates Videos
sftp> exit
In the above List, after the connection to the remote system is made, the ls command is used in the sftp utility to see the files in the remote user's directory. The lls command is used to see the files within the local user's directory. Next the put command is employed to send the Project4x.tar archive file to the remote system. There is no need to issue the progress command because by default progress reports are already turned on. Once the upload is completed, another ls command is used to see if the file is now on the remote system, and it is.
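
The same upload can also be scripted, because sftp accepts a file of commands through its -b (batch) option. The sketch below writes such a file and prints the command that would use it. The account and address are the ones appearing in the listings above (the account name is inferred from the /home/Christine upload path), and the sketch assumes key-based authentication is already set up, since batch mode cannot prompt for a password.

```shell
#!/bin/sh
# Write the sftp commands to a temporary batch file.
batch_file=$(mktemp)
cat > "$batch_file" <<'EOF'
put Project4x.tar
ls -l Project4x.tar
bye
EOF

# Show the batch file and the command that would run it.
cat "$batch_file"
echo "sftp -b $batch_file Christine@192.168.0.104"
```

In batch mode sftp stops at the first command that fails, so a failed put is not followed by a misleading ls.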

Real World Scenario
Backup Rule of Three
Businesses need to have several archives in order to properly protect their data. The Backup Rule of Three is typically good for most organizations, and it dictates that you should have three archives of all your data. One archive is stored remotely to prevent natural disasters or other catastrophic occurrences from destroying all your backups. The other two archives are stored locally, but each is on a different media type. You hear about the various statistics concerning companies that go out of business after a significant data loss. A scarier statistic would be the number of system administrators who lose their jobs after such a data loss because they did not have proper archival and restoration procedures in place.
The rsync, scp, and sftp utilities all provide a means to securely copy files. However, when determining what utilities to employ for your various archival and retrieval plans, keep in mind that one utility will not work effectively in every backup case. For example, generally speaking, rsync is better to use than scp in backups because it provides more options. However, if you just have a few files that need secure copying, scp works well. The sftp utility works well for any interactive copying, yet scp has traditionally been faster because sftp is designed to acknowledge every packet sent across the network. Be aware that OpenSSH releases from 9.0 onward run scp over the SFTP protocol by default, which narrows that difference. It's most likely you will need to employ all of these various utilities in some way throughout your company's backup plans.

Checking Backup Integrity
Securely transferring your archives is not enough. You need to consider the possibility that the archives could become corrupted during transfer.
Ensuring a backup file's integrity is fairly easy. A few simple utilities can help.

Digesting an MD5 Algorithm
The md5sum utility is based on the MD5 message digest algorithm. It was originally created to be used in cryptography. It is no longer used in such capacities due to various known vulnerabilities. However, it is still excellent for checking a file's integrity.

List: Using md5sum to check the original file
$ md5sum Project4x.tar
efbb0804083196e58613b6274c69d88c Project4x.tar

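List: Using md5sum to check the uploaded file
$ ip addr show | grep 192 | cut -d" " -f6
192.168.0.104/24
$ md5sum Project4x.tar
efbb0804083196e58613b6274c69d88c Project4x.tar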
md5sum produces a 128-bit hash value. You can see from the results in the two listings that the hash values match. This indicates no file corruption occurred during its transfer.
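
Rather than comparing the two hash values by eye, you can save the hash to a file and let md5sum do the comparison with its -c option. This is a minimal local sketch: the remote directory merely stands in for the remote system, and the file is a small stand-in rather than a real archive.

```shell
#!/bin/sh
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
printf 'sample archive contents\n' > "$workdir/Project4x.tar"

# Record the hash next to the original file.
(cd "$workdir" && md5sum Project4x.tar > Project4x.tar.md5)

# Stand-in for the transfer: copy the file and its hash file together.
mkdir "$workdir/remote"
cp "$workdir/Project4x.tar" "$workdir/Project4x.tar.md5" "$workdir/remote/"

# On the receiving side, -c recomputes the hash and compares it.
check_output=$(cd "$workdir/remote" && md5sum -c Project4x.tar.md5)
echo "$check_output"    # prints "Project4x.tar: OK"

rm -rf "$workdir"
```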
Warning: A malicious attacker can create two files that have the same MD5 hash value. However, at this point in time, no practical attack is known that lets an attacker craft a file matching the MD5 hash of an existing file that is not under the attacker's control. Therefore, it is imperative that you have checks in place to ensure that your original backup file was not created by a third-party malicious user. An even better solution is to use a stronger hash algorithm.

Securing Hash Algorithms
The Secure Hash Algorithms (SHA) are a family of various hash functions. Though typically used for cryptography purposes, they can also be used to verify an archive file's integrity.
Several utilities implement these various algorithms on Linux. The quickest way to find them is using the method shown below. Keep in mind that your particular distribution may store them in the /bin directory instead.

List: Looking at the SHA utility names
$ ls -1 /usr/bin/sha???sum
/usr/bin/sha224sum
/usr/bin/sha256sum
/usr/bin/sha384sum
/usr/bin/sha512sum

Each utility includes the SHA message digest it employs within its name. Therefore, sha384sum uses the SHA-384 algorithm. These utilities are used in a similar manner to the md5sum command.

List: Using sha224sum and sha512sum to check the original file
$ sha224sum Project4x.tar
c36f1632cd4966967a6daa787cdf1a2d6b4ee55924e3993c69d9e9d0 Project4x.tar
$ sha512sum Project4x.tar
6d2cf04ddb20c369c2bcc77db294eb60d401fb443d3277d76a17b477000efe46c00478cdaf25ec6fc09833d2f8c8d5ab910534ff4b0f5bccc63f88a992fa9eb3 Project4x.tar

Notice in the above List the different hash value lengths produced by the different commands. The sha512sum utility uses the SHA-512 algorithm, which is the strongest of these to use for security purposes and has commonly been employed to hash salted passwords in the /etc/shadow file on Linux.
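
The length difference follows directly from the digest size: each hexadecimal character encodes 4 bits, so an N-bit digest prints as N/4 characters. A short sketch that measures this for the four utilities, assuming all of them are installed and using an arbitrary sample string as input:

```shell
#!/bin/sh
# Hash the same sample input with each utility and count the hex characters.
lengths=""
for bits in 224 256 384 512; do
    digest=$(printf 'sample data' | "sha${bits}sum" | cut -d' ' -f1)
    echo "sha${bits}sum: ${#digest} hex characters"
    lengths="$lengths ${#digest}"
done
```

The counts come out as 56, 64, 96, and 128, matching 224/4, 256/4, 384/4, and 512/4.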
You can use these SHA utilities, just as the md5sum program was used in the two md5sum Lists above, to ensure archive files' integrity. That way, both backup corruption and any malicious modification of the file can be detected.
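
To see what a failed check looks like, the sketch below records a SHA-512 hash, deliberately alters the file, and then verifies it with the -c option. The file is a small stand-in created in a scratch directory, and the single appended byte is only a simulation of corruption or tampering.

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf 'sample archive contents\n' > "$workdir/Project4x.tar"

# Record the hash while the file is known to be good.
(cd "$workdir" && sha512sum Project4x.tar > Project4x.tar.sha512)

# Simulate corruption by appending a single byte.
printf 'X' >> "$workdir/Project4x.tar"

# -c exits nonzero when the recomputed hash does not match the recorded one.
if (cd "$workdir" && sha512sum -c Project4x.tar.sha512 > /dev/null 2>&1); then
    verdict="intact"
else
    verdict="corrupted or modified"
fi
echo "Project4x.tar is $verdict"

rm -rf "$workdir"
```

Because the result is carried in the exit status, the same test drops easily into a backup script that should stop when an archive fails verification.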
Providing appropriate archival and retrieval of files is critical. Understanding your business and data needs is part of the backup planning process. As you develop your plans, look at integrity issues, archive space availability, privacy needs, and so on. Once rigorous plans are in place, you can rest assured that your data is protected.

Important Exam Questions:

1. Describe the different backup types.
- A system image backup takes a complete copy of files the operating system needs to operate. This allows a restore to take place, which will get the system back up and running. The full, incremental, and differential backups are tied together in how data is backed up and restored. Snapshots and snapshot clones are also closely related and provide the opportunity to achieve rigorous backups in high I/O environments.

2. Summarize compression methods.
- The different utilities, gzip, bzip2, xz, and zip, provide different levels of lossless data compression. Each one's compression level is tied to how fast it operates. Reducing the size of archive data files is needed not only for backup storage but also for increasing transfer speeds across the network.

3. Compare the various archive/restore utilities.
- The assorted command-line utilities each have their own strengths in creating data backups and restoring files. While cpio is one of the oldest, it allows for various files throughout the system to be gathered and put into an archive. The tar utility has long been used with tape media but provides rigorous and flexible archiving and restoring features, which make it still very useful in today's environment. The dd utility shines when it comes to making system images of an entire disk. Finally, not only is rsync very fast, but it also allows encrypted transfers of data across a network for remote backup storage.

4. Explain the needs when storing backups on other systems.
- To move an archive across the network to another system, it is important to provide data security. Thus, often OpenSSH is employed. In addition, once an archive file arrives at its final destination, it is critical to ensure that no data corruption has occurred during the transfer. Therefore, tools such as md5sum and sha512sum are used.


Without work one finishes nothing. - Ralph Waldo Emerson
© 2026 Fatskills.com

All trademarks, logos and brand names are the property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, trademarks and brands does not imply endorsement.
