From the course: AWS Certified SysOps Administrator - Associate (SOA-C02) Cert Prep

Amazon EBS overview

- [Instructor] Amazon EBS is a persistent block-level storage service that you usually attach to your Amazon EC2 instances. Amazon EBS stands for Amazon Elastic Block Store. Just like the EC2 instance store, this service is also a block storage type. However, the data in EBS is more persistent, as it doesn't get lost easily. The files on your block store won't be affected even if your EC2 instance is stopped, restarted, or terminated, as long as the volume isn't configured to be deleted on termination. A block store is also called an EBS volume, which you can mount or attach to your EC2 instances. An Amazon EBS volume is zonal in scope, since the underlying physical components of the volume only exist in a single Availability Zone. You can attach an EBS volume only to EC2 instances in the same Availability Zone. This means that you can't attach an EBS volume to an EC2 instance that is in a different AZ or region. Your data can also be encrypted at rest using AWS Key Management Service. You can attach one or more EBS volumes to a single EC2 instance. Each instance has a root device volume that contains the system image used to boot the instance. The root EBS volume of a running EC2 instance can also be replaced. This allows you to restore your instance to its launch state or to a specific snapshot without having to stop your instance. Amazon EBS is suitable for a variety of workloads, such as databases, enterprise applications, big data analytics engines, file systems, media workloads, and many other use cases. It allows you to store and retrieve your data with high throughput and low latency. Unlike Amazon S3, you don't have to send an API request over the public internet or via a VPC endpoint to fetch your data from Amazon EBS. This is because your EC2 instance and your EBS volume are logically attached together and are both located within a single Availability Zone. The underlying data centers that host your EC2 instance and EBS volume may well be located within the same city or geographic area.
Due to this proximity, Amazon EBS can provide low-latency read or write access to your data. Block storage is an integrated technology that mainly operates on the hardware level. It's important to know the core concepts of block-level storage in Amazon EBS and how it is different from file-level storage or object storage. This will help you better understand the differences between the (indistinct) AWS storage services available. Let's first talk about the meaning of the word block in block storage. Basically, a block is a sequence of bytes or bits that represents your data. When you store a file, your file is split into multiple data blocks, each with the same maximum length. This length is also known as the block size, so if you have a block size of four kilobytes and your file is eight kilobytes in size, then you'll have two blocks with four kilobytes each. This storage technology is actually present in almost all computers today, including the one that you're using right now. You can do a little experiment on your laptop or desktop to see how block storage works. Let's do a simple demo here. First, let's check the block size of your machine. If you're using Windows, open up PowerShell and enter this Get-CimInstance command to get the block size. So for this one, it has a block size of 4,096 bytes, or four kilobytes. If you're using a Mac, you can open up your terminal and use the diskutil command to verify the device block size. As shown here, the block size on my MacBook is actually the same as on my Windows machine. In most computers, the default block size is usually four kilobytes, so if you create a new file, your system will split it into multiple blocks where each one doesn't exceed four kilobytes. Let's try to create an empty file using the Terminal app. Just type touch tutorialsdojo.txt to create the file. Locate the file in your Finder app, right-click it, and select the Get Info option.
Since our file has no content at all, its file size is shown as zero bytes. Besides that, you'll also see text that says zero bytes on disk. If you open the tutorialsdojo.txt file and add one character, say zero, or any character that you want, you'll notice that the file size goes up to one byte. This is because one character is usually around one byte in size. However, its size on disk is quite different. It is showing four kilobytes, even though the actual size of our file is just one byte. This is the block size that we mentioned earlier. At this point, you might get curious about what will happen if your file exceeds the allocated block size of 4,096 bytes. Since one character is roughly equivalent to one byte, we can keep appending characters until we go over the block size. So just add a bunch of numbers here and then copy them again and again and again until we reach a total of 4,096 characters. As you can see, the size on disk is still four kilobytes since you have exactly 4,096 characters, but if you add one more character, for a total of 4,097, you will see that the size on disk has incremented to eight kilobytes. That's two times the value of the block size. The same process applies if you add more text and exceed the eight-kilobyte mark: the size on disk will increment based on the block size of your computer, so it will go from four kilobytes to eight kilobytes to twelve kilobytes, and so on and so forth. This simple demo showcases how block storage works. Your file is split into smaller four-kilobyte blocks. Block storage operates on the hardware components of your storage device, so if you're using hard disk drives, the data blocks are scattered across the physical disk platters and sectors of your HDD. If you're using a solid state drive, the blocks of data are stored in the underlying flash memory chips. That's basically the underlying data structure of a block-level storage type.
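The arithmetic behind the demo above can be sketched in a few lines of Python. This is a simplified model, not how any particular file system is implemented: it assumes space is always handed out in whole blocks of 4,096 bytes, and the function name size_on_disk is made up for illustration. Real file systems may use other block sizes or store very small files differently.

```python
def size_on_disk(file_size_bytes: int, block_size: int = 4096) -> int:
    """Return the bytes allocated for a file when space is given out in whole blocks.

    Simplified model: assumes every file occupies a whole number of blocks.
    """
    if file_size_bytes < 0 or block_size <= 0:
        raise ValueError("file size must be >= 0 and block size must be > 0")
    # Round up to the next whole block using integer arithmetic.
    blocks_needed = (file_size_bytes + block_size - 1) // block_size
    return blocks_needed * block_size


# Mirrors the steps of the demo: empty file, one character,
# exactly one block, one character over, and just past two blocks.
for size in (0, 1, 4096, 4097, 8193):
    print(size, "bytes ->", size_on_disk(size), "bytes on disk")
# 0 bytes -> 0 bytes on disk
# 1 bytes -> 4096 bytes on disk
# 4096 bytes -> 4096 bytes on disk
# 4097 bytes -> 8192 bytes on disk
# 8193 bytes -> 12288 bytes on disk
```

The jump from 4,096 to 8,192 bytes at the 4,097th character is the same step you see in the Get Info window.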
This is the core technology that enables your Amazon EBS volumes to provide high throughput and low latency. An Amazon EBS volume is logically attached to the host computer that powers your EC2 instance and is located in close proximity to it. Since these two components are quite close to each other, the latency of transferring data is significantly lower than with a network file server or an internet-based object storage service. You can further improve the performance of your block storage by using multiple volumes and joining them in a RAID configuration. RAID stands for redundant array of independent disks, and it is a data storage virtualization technology that allows you to improve the performance or the availability of your storage. There are two popular types of RAID configuration, which are RAID 0 and RAID 1. RAID 0 stripes multiple volumes together to provide greater I/O performance than you can achieve with a single EBS volume. Striping is just the process of dividing a body of data into blocks and then spreading the data blocks across multiple storage devices. On the other hand, RAID 1 mirrors two volumes together to provide on-instance redundancy. This configuration essentially mirrors or duplicates your data to provide more durability and availability. You can also set up a RAID configuration on your laptop. If you're using a Mac, you can open up your Disk Utility app, navigate to the File menu, and click RAID Assistant. From there, you can select the RAID configuration that you want to implement on your computer. You can select RAID 0 or RAID 1. Amazon EBS provides a selection of different volume types, which differ in performance characteristics and storage price. You can choose an EBS volume type that meets the storage performance and cost requirements of your applications. There are two main categories of storage types that you can use. They are solid state drives, or SSDs, and hard disk drives, or HDDs.
For the first one, SSDs are optimized for transactional workloads that involve frequent read and write operations with small I/O sizes. In this category, the dominant performance attribute is the number of I/O operations per second, or IOPS. HDDs, on the other hand, are optimized for large streaming workloads that involve large sequential I/O operations. Their dominant performance attribute is throughput, which is measured in megabytes per second. In Amazon EBS, you can back up your data by taking point-in-time snapshots. Basically, a snapshot is a form of incremental backup that internally uses Amazon S3 to persist your data. A snapshot is incremental in nature in the sense that it only saves the data blocks that have changed since your most recent snapshot. With this feature, you can restore the state of your EBS volume in the event of data loss. You can also use your EBS snapshots to copy your EBS volume to another AWS region. This is quite helpful for data migration, disaster recovery, or for encrypting an unencrypted EBS volume. The Amazon Data Lifecycle Manager service, or Amazon DLM, can also be used to automate the creation, retention, and deletion of your EBS snapshots and EBS-backed AMIs. Amazon EBS also offers data encryption for your EBS volumes. The data encryption occurs on the physical servers that host your EC2 instances. This ensures the security of both your data at rest and your data in transit between your EC2 instance and its attached EBS volume. Amazon EBS encryption uses AWS Key Management Service keys when creating your encrypted volumes and snapshots. There's also a feature called Amazon EBS encryption by default, which is an AWS Region-specific setting that you can enable on your account.
This lets you automatically enforce encryption on all of your new Amazon EBS volumes, as well as on the copies of your EBS snapshots, thereby eliminating the burden of encrypting your EBS volumes manually and helping ensure data security compliance at all times.
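To make the incremental snapshot idea from earlier more concrete, here is a toy Python model. It is only a sketch under stated assumptions, not how Amazon EBS implements snapshots internally: the volume is modeled as a fixed-length list of blocks, and the names take_snapshot and restore are hypothetical helpers invented for this example.

```python
from typing import Dict, List

Block = bytes


def take_snapshot(volume: List[Block], previous_state: List[Block]) -> Dict[int, Block]:
    """Save only the blocks that differ from what earlier snapshots captured.

    Toy model: previous_state is the volume as reconstructed from all
    earlier snapshots (an empty list if no snapshot exists yet).
    """
    return {
        index: block
        for index, block in enumerate(volume)
        if index >= len(previous_state) or previous_state[index] != block
    }


def restore(snapshots: List[Dict[int, Block]], num_blocks: int) -> List[Block]:
    """Rebuild the volume by applying snapshots from oldest to newest."""
    state: List[Block] = [b""] * num_blocks
    for snapshot in snapshots:
        for index, block in snapshot.items():
            state[index] = block
    return state


# The first snapshot has nothing to compare against, so it stores every block.
volume = [b"aaaa", b"bbbb", b"cccc"]
snap1 = take_snapshot(volume, [])

# Change one block; the second snapshot stores only that block.
volume[1] = b"BBBB"
snap2 = take_snapshot(volume, restore([snap1], len(volume)))

print(len(snap1), len(snap2))                           # 3 1
print(restore([snap1, snap2], len(volume)) == volume)   # True
```

The second snapshot holds a single block, yet applying both snapshots in order still reproduces the full, current volume.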
