Fatskills
Practice. Master. Repeat.
Study Guide: CompTIA Cloud+ CV0-003 Exam: Basics of Storage in Cloud Environments
Source: https://www.fatskills.com/cloud-computing/chapter/comptia-cloud-cv0-003-exam-basics-of-storage-in-cloud-environments

CompTIA Cloud+ CV0-003 Exam: Basics of Storage in Cloud Environments

By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.


Objective: Given a scenario, provision storage in cloud environments.
Storage is one of the most frequently used resources in the cloud. In fact, many organizations that steadfastly stick with on-premises solutions for computing, networking, and applications will use a massive amount of cloud storage.
The focus of this guide is the many features of cloud storage that you need to know for the CompTIA Cloud+ exam. You will learn more about the different types of cloud storage: block, file, and object. The guide will also cover the different storage tiers and protocols.
You will be introduced to RAID storage as well as different cloud storage features related to reducing costs and limiting data loss. Lastly, you will learn about user quotas and software-defined storage.

Topics:
- Types
- Tiers
- Input/Output Operations per Second (IOPS) and Read/Write
- Protocols
- Redundant Array of Inexpensive Disks (RAID)
- Storage System Features
- User Quotas
- Hyperconverged
- Software-Defined Storage (SDS)


1. A(n) _____ is a network-accessible storage device designed for high-speed access to block storage.

2. Flash storage is any storage that is placed on a(n) _____.

3. Which version of NFS is the most recent?

4. Data _____ is the process of ensuring there is no redundant data within a storage resource (or between storage resources).

Answers:

1. Storage-area network (SAN)

2. Solid-state drive (SSD)

3. NFSv4

4. Deduplication

Types
In a cloud environment, there are three types of storage resources: block, file, and object. These storage types were introduced in “Integrating Components into a Cloud Solution.” However, the current exam objective adds some topics that you should be aware of. In this section, you’ll learn about these additional topics.

Block
See the “Storage” section in this guide for a description of block storage.

Storage-Area Network (SAN)
A storage-area network is a network-accessible storage device designed for high-speed access to block storage. In a sense it isn’t really a cloud topic because traditionally SAN devices are physical devices that you use in your local-area network (LAN). However, this topic is a CompTIA Cloud+ objective because a SAN is the on-premises solution that is most like cloud block storage.
Understanding what a SAN is used for is important because you may be asked a question like this: “You are an administrator who is migrating some on-premises systems to a cloud infrastructure. You have been asked which cloud solution would be used to perform the function of your SAN devices. What type of storage would be used in this case?” And, of course, the answer would be a block storage device.
Note that a subobjective of SAN for the CompTIA exam is SAN zoning. With SAN zoning, traffic between the storage device (called the target) and the client system (called the initiator) is segmented, resulting in better security and performance. See “Network Segmentation” in “Secure a Network in a Cloud Environment,” for more details regarding network segmentation.
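The idea behind zoning can be sketched in a few lines of Python. This is only an illustration of the concept, not how a real SAN switch is configured; the zone names, host names, and the can_access helper are all hypothetical.

```python
# Hypothetical zone table: each zone lists the initiators (clients) and
# targets (storage devices) that are allowed to see one another.
ZONES = {
    "zone_db": {"initiators": {"db-host-1", "db-host-2"}, "targets": {"array-a"}},
    "zone_backup": {"initiators": {"backup-host"}, "targets": {"array-b"}},
}


def can_access(initiator: str, target: str, zones: dict = ZONES) -> bool:
    """Return True if at least one zone contains both the initiator and the target."""
    return any(
        initiator in zone["initiators"] and target in zone["targets"]
        for zone in zones.values()
    )


print(can_access("db-host-1", "array-a"))    # True
print(can_access("backup-host", "array-a"))  # False
```

Because the backup host shares no zone with array-a, its traffic never reaches that target, which is the segmentation benefit described above.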

File
See the “Storage” section of this guide for a description of file storage.

Network-Attached Storage (NAS)
Network-attached storage is, as its name states, storage that is accessible via a network. However, NAS is different from the SAN device discussed earlier in this guide in that it is a file-based storage device, not a block-based storage device.

Object
Object storage is a data storage architecture for storing unstructured data, which sections data into units—objects—and stores them in a structurally flat data environment. Each object includes the data, metadata, and a unique identifier that applications can use for easy access and retrieval.

Tenants

There are two types of tenants in cloud computing:
- Single-tenant: This is a solution in which a resource or an infrastructure serves only a single customer. In small- to mid-sized companies, this is the standard type of tenant. Users within the organization all share resources, but those resources are not shared outside the organization.
- Multitenant: This is a solution in which a resource or an infrastructure serves multiple customers. These customers could be business units within a large organization or even separate organizations.
The biggest advantages of a single-tenant solution are that it is more secure and the organization typically has more control over the cloud environment. However, typically a multitenant solution will be more cost effective (due to volume discounts), and there will be more flexibility to integrate between the different business units or organizations.

Buckets
When you are dealing with a file-based storage solution, files are organized into folders (sometimes called directories). For object-based storage solutions, the term used to organize objects is buckets. In other words, a bucket is the container that is used to “hold” your object data.
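To make the relationship between buckets and objects concrete, here is a minimal in-memory sketch. It is not any vendor's API; the Bucket class and its put and get methods are hypothetical names chosen for illustration. It shows a flat container in which each object is stored with its data, its metadata, and a unique identifier.

```python
import uuid


class Bucket:
    """A flat container of objects; there is no folder hierarchy inside it."""

    def __init__(self, name: str):
        self.name = name
        self._objects = {}  # unique object ID -> (data, metadata)

    def put(self, data: bytes, metadata: dict) -> str:
        """Store an object and return the unique identifier assigned to it."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = (data, dict(metadata))
        return object_id

    def get(self, object_id: str):
        """Retrieve the data and metadata for a given identifier."""
        return self._objects[object_id]


bucket = Bucket("reports")
oid = bucket.put(b"quarterly numbers", {"content-type": "text/plain"})
data, meta = bucket.get(oid)
print(data, meta)
```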

Tiers
For this exam topic, the term tiers refers to different levels of storage that provide different features. Typically, these levels are based on the hardware that the storage resource uses to store the data.
For the CompTIA Cloud+ certification, you should be aware of four tiers of storage: flash, hybrid, spinning disks, and long-term.

Flash
Flash storage is any storage that is placed on a solid-state drive (SSD). These hardware devices are known for being faster than other storage devices, like spinning disks. This faster speed comes at a higher cost, however.

Hybrid
A hybrid tier is one in which you use your own on-premises storage devices, but if they start to become full, the available space is supplemented with cloud-based storage. This solution limits how much you have to spend on storage devices because you can buy just what you think you need. If you miscalculate or there is a sudden high demand for storage space, cloud-based storage is available at an “only pay for what you use” price.

Spinning Disks
Spinning disks, also called magnetic drives, are traditional hard drives in which data is stored on rapidly spinning platters that are coated with a magnetic material. Spinning disks are slower than SSD drives, but if used in a cloud solution, they are typically more cost effective because spinning disks are less expensive for the cloud vendor to purchase. Note that you will also see spinning disks referred to as hard disk drives (HDDs).

Long-Term
Long-term storage typically refers to tape storage or cold storage. You may have some data that you don’t need on a daily basis, but you still need it stored somewhere safely, perhaps in case your organization ever faces an audit or security analysis. Most cloud vendors provide a long-term storage solution that is much more affordable than using SSD or spinning disks. However, access to this data is likely going to be slow and may also require advance notice before the data is directly available.

Input/Output Operations per Second (IOPS) and Read/Write
The primary speed calculation of a storage device is the input/output operations per second, or IOPS. This is a value that you must take into consideration when choosing which underlying storage type you want to use for your storage resource.
For example, consider Figure 12.1, which demonstrates some differences between two different HDD options that AWS provides for EBS resources.

[Figure 12.1: AWS HDD Storage Options]

Note that the durability and the volume size values are identical. The difference between these HDD types is the Max IOPS per Volume (which, in turn, affects the Max Throughput per Volume). The greater speed of the Throughput Optimized HDD makes it more suitable for cases in which large amounts of data need to be transferred (big data, data warehouses, log processing, and so on).
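The link between IOPS and throughput is simple arithmetic: throughput is roughly the number of operations per second multiplied by the size of each operation. The sketch below assumes a fixed I/O size, and the input values are illustrative rather than quoted from any vendor's table.

```python
def throughput_mib_per_s(iops: float, io_size_kib: float) -> float:
    """Approximate throughput in MiB/s for a given IOPS and I/O size in KiB."""
    return iops * io_size_kib / 1024


# Illustrative values: the same 1 MiB (1024 KiB) I/O size at two IOPS limits.
print(throughput_mib_per_s(500, 1024))  # 500.0
print(throughput_mib_per_s(250, 1024))  # 250.0
```

Halving the IOPS limit at the same I/O size halves the maximum throughput, which is why the two values move together.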
As you might expect, the IOPS for HDD devices is lower than for SSD devices, as you can see from the following figure, which shows the different SSD options that AWS provides for EBS resources.

[Figure: AWS SSD Storage Options]

Note that the term input refers to writing data to the disk and output refers to reading data from the disk.

Protocols
Several protocols can be used with storage devices and resources. Some of these protocols, like NFS, CIFS, and iSCSI, are designed to allow you to access storage across the network. Others, like FC and NVMe-oF, are designed to provide high-speed access to drives. In this section you will learn the essentials of each of these protocols.

Network File System (NFS)
Network File System is a distributed file system protocol that has been used since 1984. It is a very popular way of sharing file systems across the network on UNIX and Linux systems.
NFS works by sharing a local file system (or part of a file system) using a collection of NFS server daemons (a daemon is a program that normally functions without needing to interact with or be controlled by a human). On the NFS client system, a system administrator uses a process called mounting to make the NFS shared file system available via a directory in the local file system.
As an older software protocol, NFS has gone through many revisions. The most current major version is NFSv4; its latest minor version, NFSv4.2, was published in 2016. While it is still very popular and has had many modern features added recently, NFS does have some drawbacks when compared to CIFS. For example, operations to share an NFS resource and to access the NFS resource on the client side require administrative rights. With CIFS, regular users can typically share a file system resource.

Common Internet File System (CIFS)
Common Internet File System is a distributed file system protocol created by Microsoft. This protocol allows a user or administrator to share part of a file system. Regular users can also access the shares (after providing authentication credentials and assuming they have the correct permissions) by performing operations like mounting network drives.
This type of distributed file system is popular because software programs on Windows, Linux, Mac OS, and UNIX can share file systems and also access the file systems. This makes the protocol more flexible. On Linux and UNIX systems the Samba software is used to share and access shares.
Note that CIFS is often used interchangeably with another protocol: Server Message Block (SMB). SMB was created by IBM and, like NFS, has been around since the mid-1980s. CIFS is based on SMB, but it was more of an implementation of SMB and not the exact same protocol as the original SMB. With all that said, CIFS, while widely referred to in documentation, isn’t really a protocol that is in use on modern systems. Microsoft has been using the native SMB protocol (either SMB 2.0 or SMB 3.0) since the introduction of Windows Vista in 2006. If you see mention of CIFS in modern documentation, it really refers to SMB.

Internet Small Computer System Interface (iSCSI)
Internet Small Computer System Interface is a protocol that allows communication to block devices across the network. With this protocol you can share a block device (SSD drive, HDD drive, SAN device, and so on) across the network for a client system to use. The block device can then be formatted with a local file system and then used like a local file system would be used (mounted on a Linux or UNIX system or assigned a drive letter on a Microsoft Windows system).
There are some key terms that you should be aware of for iSCSI. The device being shared is referred to as the iSCSI target, and the system that is accessing the share is referred to as an iSCSI initiator. You should also be aware that iSCSI itself carries SCSI commands over a TCP/IP network, while the same SCSI commands can alternatively be carried over Fibre Channel.

Fibre Channel (FC)
Fibre Channel is a protocol that is designed to provide high-speed data transfer to and from storage devices. It is most commonly used to connect systems to SANs, and it can also connect systems to other shared or local data storage.

FC is a pretty robust and flexible protocol, so much so that you can find entire books devoted to this protocol. For the CompTIA Cloud+ exam, you should at least know the following basics of FC:
- Normally, optical fiber cables are used as the media to transfer data because these cables have very fast throughput. However, the protocol can be implemented on copper cabling as well.
- If SCSI commands are carried over FC (rather than over TCP/IP, as with iSCSI), the actual protocol used is called Fibre Channel Protocol (FCP).
- FC can also be used with another protocol: Non-Volatile Memory Express (NVMe). NVMe carried over a network fabric such as FC is called NVMe-oF.
- You may be asked an exam question related to the data rates (speed) of FC. Currently, the following data rates are supported: 1, 2, 4, 8, 16, 32, 64, and 128 gigabits per second.

Non-Volatile Memory Express over Fabrics (NVMe-oF)
When SSD devices first appeared on the market, they were faster than the spinning disks, but they were connected to the motherboard, which used older protocols (SAS and SATA) to transfer data. Initially, this wasn’t a problem, but the faster the SSD devices became, the more often these older protocols created bottlenecks. There was a need for a faster transfer protocol, and the solution was NVMe.
Why is NVMe faster than the older protocols? In a nutshell, it provides more “lanes” of input/output (I/O), which allows for a higher overall throughput. This was a great protocol for local SSD devices, but there was also a need for a faster protocol for communicating with network devices. The result of this is NVMe-oF, or Non-Volatile Memory Express over Fabrics.

Redundant Array of Inexpensive Disks (RAID)
When a hard disk fails, typically all of the data on the disk is lost. Certain professionals may be able to recover some of the data, but that can be time-consuming and expensive, and ultimately the professionals may not be able to recover the data that you really need.
Redundant Array of Inexpensive Disks is a technology that was originally designed to provide a solution to the problem of hard disk failure. RAID is a technology that has been actively used for many decades. In fact, the technology for RAID actually predates the term RAID. The original concept, now referred to as RAID 1, was to mitigate the loss of a hard disk by having a second, completely redundant disk.
In addition to RAID 1, there are other types of RAID. The RAID types that are listed on the CompTIA Cloud+ exam are described next.

0
RAID 0 provides no redundancy but rather increases available storage by merging multiple hard disks (or partitions) into a single device. For example, three hard disks of 30 GB each can produce a RAID 0 device with 90 GB of storage space. Data is written to each physical device (hard disk/partition) in stripes, which results in the requirement of each physical device needing to be the same size to avoid wasting storage space. RAID 0 can improve the performance of reading data from the devices. This graphic illustrates RAID 0.

[Figure: RAID 0]
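Striping can be illustrated with a short sketch. This is a simplified model that assumes fixed-size stripes written round-robin across the devices; the stripe function and its parameters are hypothetical names, and real RAID implementations work on disk blocks rather than Python byte strings.

```python
def stripe(data: bytes, num_disks: int, stripe_size: int) -> list:
    """Distribute data across num_disks in round-robin stripes (RAID 0 style)."""
    disks = [bytearray() for _ in range(num_disks)]
    for offset in range(0, len(data), stripe_size):
        # Stripe 0 goes to disk 0, stripe 1 to disk 1, and so on, wrapping around.
        disk_index = (offset // stripe_size) % num_disks
        disks[disk_index] += data[offset:offset + stripe_size]
    return [bytes(d) for d in disks]


disks = stripe(b"ABCDEFGHIJKL", num_disks=3, stripe_size=2)
print(disks)  # [b'ABGH', b'CDIJ', b'EFKL']
```

Each disk holds only part of the data, so losing any one disk makes the whole data set unreadable. That is why RAID 0 provides no redundancy.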

1
In RAID 1, also called mirroring, two or more disk drives appear to be one single storage device. Data that is written to one disk drive is also written to all of the others. If one drive fails, the data is still available on the other drives. See Figure 12.4 for a graphic that illustrates RAID 1.

[Figure 12.4: RAID 1]

5
RAID 5 provides more efficient use of the physical storage devices. Unlike RAID 1, which completely mirrors all data to all physical storage devices, RAID 5 stripes data across the devices together with parity data. The parity takes up the equivalent of one device's capacity and is distributed across all of the devices rather than kept on a single dedicated device. In the event that a physical storage device is lost or damaged, the data on that device can be restored by using the parity data and the real data on the other storage devices. RAID 5 requires at least three storage devices. Because calculating parity can have a negative impact on system performance, software RAID 5 (RAID performed by the kernel) is not commonly used. However, hardware RAID (RAID performed by a separate processing chip) is fairly common on high-end servers.

[Figure: RAID 5]
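Parity-based recovery can be demonstrated with XOR, the operation commonly used for RAID 5 parity. The sketch below is a simplified model with two data blocks and one parity block; the xor_blocks helper and the sample bytes are made up for illustration.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


d1, d2 = b"\x0f\xa0", b"\x33\x55"   # data blocks on two devices
parity = xor_blocks(d1, d2)          # parity block on a third device

# Simulate losing the device that held d2: rebuild it from d1 and the parity.
rebuilt_d2 = xor_blocks(d1, parity)
print(rebuilt_d2 == d2)  # True
```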

6
RAID 6 is much like RAID 5, except that two sets of parity data are kept instead of one, so the array can survive the loss of two devices. This provides better redundancy but also increases the cost involved because an additional storage device is needed.

10
Also called RAID 1+0, RAID 10 combines the advantages of both RAID 1 and RAID 0. First, two or more sets of two devices are placed into multiple RAID 1 devices. This provides redundancy. Then they are merged together into a RAID 0 device to create a much larger storage container. This graphic illustrates RAID 10.

[Figure: RAID 10]
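The trade-off between capacity and redundancy across the RAID levels can be summarized with some arithmetic. The function below is a rough sketch that assumes every disk is the same size and that RAID 10 uses two-way mirrors; usable_capacity_gb is a hypothetical helper, not part of any RAID tool.

```python
def usable_capacity_gb(level: str, num_disks: int, disk_gb: float) -> float:
    """Approximate usable capacity for equal-sized disks at a given RAID level."""
    if level == "0":                       # striping only, no redundancy
        return num_disks * disk_gb
    if level == "1":                       # every disk holds the same data
        return disk_gb
    if level == "5":                       # one disk's worth of parity
        if num_disks < 3:
            raise ValueError("RAID 5 needs at least three disks")
        return (num_disks - 1) * disk_gb
    if level == "6":                       # two disks' worth of parity
        if num_disks < 4:
            raise ValueError("RAID 6 needs at least four disks")
        return (num_disks - 2) * disk_gb
    if level == "10":                      # striped set of two-way mirrors
        if num_disks < 4 or num_disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, four or more")
        return (num_disks // 2) * disk_gb
    raise ValueError(f"unsupported RAID level: {level}")


print(usable_capacity_gb("0", 3, 30))   # 90
print(usable_capacity_gb("5", 3, 30))   # 60
print(usable_capacity_gb("10", 4, 30))  # 60
```

The first result matches the RAID 0 example earlier in this section: three 30 GB disks yield 90 GB.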


Storage System Features
There are several different storage system features that you should be aware of for the CompTIA certification exam. They include compression, deduplication, thin provisioning, thick provisioning, and replication. These features typically fall into one of two primary categories:
- Cost savings: With cloud storage, you pay for the space that you use (in some cases, typically object storage, you also pay for the process of transferring the data, such as when you download data). The less storage space that you use, the less you pay.
- Data loss prevention: One of the great features of cloud storage is that there are methods to best prevent data loss.
Next, you’ll learn the commonly used storage features and the benefit of using them.

Compression
Compression is the process of reducing the size of data using a mathematical algorithm. Data is run through the algorithm, resulting in compressed data that is stored in the cloud (or on-premises). When the data is needed, another algorithm will convert the data back into the original uncompressed format.
Compression reduces the space you use in the cloud storage environment, reducing your costs. In some cases, particularly object storage, the compression is handled by the cloud vendor. In other cases, like data stored in file storage, the compression is handled by the customer.
Compression is often coupled with another feature called encryption. See “Storage” in “OS and Application Security Controls” for additional details.
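When compression is the customer's responsibility, it can be as simple as running data through a standard library before uploading it. The sketch below uses Python's built-in zlib module on a made-up, highly repetitive payload; real savings depend entirely on the data.

```python
import zlib

original = b"cloud storage " * 1000         # repetitive sample data
compressed = zlib.compress(original)         # what would be stored
restored = zlib.decompress(compressed)       # what is read back when needed

print(len(original), len(compressed))
print(restored == original)  # True
```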

Deduplication
Data deduplication is the process of ensuring there is no redundant data within a storage resource (or between storage resources). In some situations, a cloud vendor may have tools to perform this process, but in many cases, you may need to create your own methods to perform deduplication tasks. By eliminating redundant data, you end up using less cloud storage, resulting in lower costs.
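If you need to build your own deduplication method, one common approach is to identify identical content by its hash. The following is a minimal sketch of that idea using SHA-256 from Python's hashlib module; the deduplicate function and sample data are hypothetical, and a production tool would also need to handle chunking and reference counting.

```python
import hashlib


def deduplicate(blobs):
    """Store each unique blob once, keyed by its SHA-256 digest."""
    store = {}   # digest -> blob (one physical copy per unique content)
    refs = []    # one digest per logical item, in the original order
    for blob in blobs:
        digest = hashlib.sha256(blob).hexdigest()
        store.setdefault(digest, blob)
        refs.append(digest)
    return store, refs


store, refs = deduplicate([b"report", b"logo", b"report"])
print(len(refs), len(store))  # 3 2
```

Three logical items are tracked, but only two copies are physically stored, which is where the cost savings come from.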

Thin Provisioning
Recall that you pay for the amount of storage that you use. In some cases, like object storage, this is very straightforward. A 10 MB object results in a charge for 10 MB of storage.
However, in other cases, such as block storage, what you pay for isn’t exactly what you use, but rather what you reserve. For example, if you create a 10 GB block storage device, you are charged 10 GB, even if you don’t use all of the space that you asked for. This makes sense because that space can’t be used for any of the cloud vendor’s other customers. But you don’t want to pay for space that you aren’t actually using.
The process of reserving the entire storage space of a block-based storage resource is called thick provisioning. An alternative, called thin provisioning, allows you to reserve part of the overall size of the block resource—essentially what you currently need, plus a bit more in case more data is written to the block storage.
For example, if you provision a 10 GB block store and then use only 2 GB of the space, a little over 2 GB is reserved for your use (say, 2.5 GB for this example). You pay only for 2.5 GB, unless you start adding more data to the block store. In that case, more space is reserved for your use.
Thin provisioning results in lower costs; however, it does come with a potential drawback. Reserving additional space can take some time, which means if large amounts of data are written to the block storage, the write process may fail because the required space wasn’t reserved quickly enough.
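The billing difference between the two approaches can be expressed as a small calculation. This sketch reuses the numbers from the example above and assumes a fixed 0.5 GB of headroom; the billed_gb function and the headroom value are illustrative, not a vendor's actual pricing model.

```python
def billed_gb(provisioned_gb: float, used_gb: float, thin: bool,
              headroom_gb: float = 0.5) -> float:
    """Return the space you are charged for under thick or thin provisioning."""
    if not thin:
        # Thick provisioning: the full provisioned size is reserved and billed.
        return provisioned_gb
    # Thin provisioning: used space plus some headroom, capped at the full size.
    return min(provisioned_gb, used_gb + headroom_gb)


print(billed_gb(10, 2, thin=False))  # 10
print(billed_gb(10, 2, thin=True))   # 2.5
```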

Thick Provisioning
See the preceding “Thin Provisioning” section.

Replication
Data replication is the process of ensuring data is backed up to another storage resource. In the event that your data is lost or becomes unavailable on the original data resource, it will then be available in the backed-up resource. Data replication can be done in the cloud at a zonal or regional level.
As with many of the features discussed in this section, there are cases in which cloud vendors will offer replication as an automatic feature and some cases where it will be the responsibility of the cloud customer. In either case, replication will result in higher costs, but it will also serve to prevent data loss.
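As a rough illustration of how replication protects data, the sketch below writes every item to each location and reads from whichever location is still available. The ReplicatedStore class and the zone names are hypothetical, and the model is greatly simplified because it ignores replication lag and consistency.

```python
class ReplicatedStore:
    """Keeps a full copy of the data in every configured location."""

    def __init__(self, locations):
        self.copies = {location: {} for location in locations}

    def write(self, key, value):
        # Every write is applied to all copies.
        for copy in self.copies.values():
            copy[key] = value

    def read(self, key, unavailable=()):
        # Read from the first location that is reachable and holds the key.
        for location, copy in self.copies.items():
            if location not in unavailable and key in copy:
                return copy[key]
        raise KeyError(key)


replicated = ReplicatedStore(["zone-a", "zone-b"])
replicated.write("invoice.pdf", b"data")
print(replicated.read("invoice.pdf", unavailable={"zone-a"}))  # b'data'
```

Storing two copies doubles the space consumed, which is the higher cost mentioned above.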

User Quotas
A user quota is a limitation placed on a user when using storage. This limit may include how much space the user can use or how many files or objects the user can store. The goal of user quotas is either to reduce costs (remember, space used equals higher costs) or prevent a user from using up all of the allocated space for a storage resource (typically a block storage resource).
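A quota check boils down to comparing a user's current usage against the limits before accepting new data. Here is a minimal sketch that enforces both kinds of limit mentioned above; the UserQuota class and the limit values are hypothetical.

```python
class UserQuota:
    """Tracks one user's usage against a space limit and an object-count limit."""

    def __init__(self, max_bytes: int, max_objects: int):
        self.max_bytes = max_bytes
        self.max_objects = max_objects
        self.used_bytes = 0
        self.object_count = 0

    def store(self, size_bytes: int) -> bool:
        """Accept the object only if neither limit would be exceeded."""
        if self.used_bytes + size_bytes > self.max_bytes:
            return False
        if self.object_count + 1 > self.max_objects:
            return False
        self.used_bytes += size_bytes
        self.object_count += 1
        return True


quota = UserQuota(max_bytes=1000, max_objects=2)
print(quota.store(600))  # True
print(quota.store(600))  # False (would exceed the space limit)
print(quota.store(100))  # True
print(quota.store(100))  # False (would exceed the object limit)
```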

Hyperconverged
Hyperconverged infrastructure provides a unified environment that makes upgrading hardware and software much faster and easier. It can streamline and simplify upgrades, eliminating the need to balance independent systems. It provides a flexible and adaptable environment for restructuring systems or hardware.

Software-Defined Storage (SDS)
Software-defined storage is a technology that is designed to simplify using storage. Many cloud vendors have massive numbers of storage devices, and they are very difficult to manage individually. With SDS, these storage devices are “collected” into one massive storage collection (or more). When space is needed to provision a block storage resource, a file storage resource, or an object storage resource, the SDS program is asked for space. The SDS program manages where the actual data is stored.
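The pooling idea can be sketched as follows. In this simplified model, the caller asks the pool for space and the pool decides which devices hold it. The StoragePool class, the device names, and the fill-in-order placement policy are all assumptions made for illustration; real SDS products use far more sophisticated placement logic.

```python
class StoragePool:
    """Presents many devices as one pool and decides where space is allocated."""

    def __init__(self, devices: dict):
        self.free_gb = dict(devices)  # device name -> free space in GB

    def allocate(self, size_gb: int) -> dict:
        """Reserve size_gb from the pool and return the chosen placement."""
        if size_gb > sum(self.free_gb.values()):
            raise ValueError("not enough free space in the pool")
        placement = {}
        remaining = size_gb
        for device, free in self.free_gb.items():
            if remaining == 0:
                break
            take = min(free, remaining)
            if take:
                placement[device] = take
                self.free_gb[device] -= take
                remaining -= take
        return placement


pool = StoragePool({"disk-a": 100, "disk-b": 50})
print(pool.allocate(120))  # {'disk-a': 100, 'disk-b': 20}
```

The caller never names a device; it only states how much space it needs, which is the simplification SDS is meant to provide.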

Quiz:

1. Which of the following is a solution in which a resource or an infrastructure serves only a single customer? A. Single-tenant B. Prime-tenant C. One-tenant D. Only-tenant

2. Which of the following types of storage is the slowest? A. Spinning disks B. Long-term C. Flash D. Hybrid

3. _____ is a protocol that allows communication to block devices across the network. A. CIFS B. SMB C. NFS D. iSCSI

4. Which RAID level provides no redundancy? A. 0 B. 1 C. 5 D. 10

5. Which feature allocates only some of the provisioned space? A. Small provisioning B. Minimal provisioning C. Thick provisioning D. None of these answers are correct

Answers:

1. Single-tenant

2. Long-term

3. iSCSI

4. 0

5. None of these answers are correct. (Thin provisioning is correct.)