From the course: ISC2 Certified Cloud Security Professional (CCSP) Cert Prep

Business requirements

- [Instructor] Welcome to this lesson on business requirements. This lesson is a continuation of the previous lesson, and we're going to dive into some key terms when it comes to business continuity and disaster recovery. We're first going to talk about what an RTO is, or recovery time objective, then we'll talk about what an RPO is, or recovery point objective, and then finally, we'll talk about a recovery service level, or RSL.
All right, an RTO is the maximum tolerable duration of a service interruption. So what that means is that this is a measurement of how long a service that your organization provides can be down before it is considered a critical loss. It's important to also understand that this is not an IT decision. Your RTO should not be set by how quickly we are capable of recovering the system. It should be a business decision based on when the loss of that service would be considered a critical loss. Ultimately, the RTO for a service should be a goal that the IT infrastructure strives to achieve, rather than something the infrastructure defines.
An RTO is also not a one-size-fits-all number for an organization. It should vary between the different systems within your organization. For instance, a mission-critical system that directly affects revenue might have an RTO of only a few minutes, which makes it necessary to have significant investment in high-availability solutions. In contrast, a nonessential system might have an RTO of 24, 48, or 72 hours, which allows for more standard recovery procedures, where we get more significant cost savings in exchange for a longer time to recover the service. Of course, the first step in defining an RTO is defining, within your organization, the difference between a critical and a non-critical system. This definition can then inform the allocation of resources, like funds, in designing the recovery infrastructure.
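To make the per-system idea concrete, here is a minimal Python sketch of how RTOs might be recorded and checked against an outage. The service names, the `ServiceRto` class, and the specific RTO values are all hypothetical and only for illustration; in practice the numbers would come from the business, not from this code.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class ServiceRto:
    name: str
    critical: bool
    rto: timedelta  # set by the business, not by what IT can currently achieve


# Hypothetical services and RTO values, chosen only for illustration.
SERVICES = [
    ServiceRto("payments-api", critical=True, rto=timedelta(minutes=5)),
    ServiceRto("internal-wiki", critical=False, rto=timedelta(hours=48)),
]


def exceeds_rto(service: ServiceRto, outage: timedelta) -> bool:
    """Return True when an outage has lasted longer than the service's RTO."""
    return outage > service.rto


if __name__ == "__main__":
    outage = timedelta(minutes=30)
    for svc in SERVICES:
        status = "RTO exceeded" if exceeds_rto(svc, outage) else "within RTO"
        print(f"{svc.name}: {status}")
```

The same 30-minute outage is a breach for the critical service and well within tolerance for the nonessential one, which is why a single organization-wide number does not work.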
All right, next, let's talk about recovery point objective, or RPO. In this case, an RPO is the measurement of maximum acceptable data loss in case of disruption. So rather than a measurement of how long you're down, this is a measurement of how much data you can lose in the event of an outage, usually expressed as a window of time. Similar to the RTO, however, this should be a business decision that the IT infrastructure then establishes as a goal or even a requirement to achieve. If we visualize this a little bit in this graphic down here, we imagine that this is a chronological timeline, and at this point right here, we can see that a disaster has occurred. It doesn't matter exactly how much time has passed between each of these units, just that there's a measurement between the occurrence of a disaster and the point in time to which we should be able to restore data. So a specific example of this might be, say, a cyber attack has affected an application and we lost a critical database. If we have an RPO of no more than 24 hours of data loss, that means we should be keeping backup copies of that database that are never more than 24 hours old. Once again, similar to RTO, when it comes to defining an RPO, there's a balancing act between the significance of that data and storage costs. Typically, the more often we're keeping backups of our systems, including databases, the more potential expense we're incurring for collecting those backups. But in return, we can minimize the RPO and maximize the amount of data that we can recover in the event of a disaster.
All right, finally, let's talk about recovery service level, or RSL. Once again, an RSL assumes that a disaster has occurred, and it is a measurement of the minimum amount of computing capacity that we need to sustain during a disaster.
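Before going further with RSL, the 24-hour RPO example can be sketched as a quick check in Python. This is only an illustration under simple assumptions: the function names and timestamps are hypothetical, and the data-loss window is taken to be the time between the last good backup and the disaster.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, disaster: datetime) -> timedelta:
    """Anything written after the last good backup is what would be lost."""
    if last_backup > disaster:
        raise ValueError("last backup cannot be later than the disaster")
    return disaster - last_backup


def meets_rpo(last_backup: datetime, disaster: datetime, rpo: timedelta) -> bool:
    """Return True when the data-loss window is within the RPO."""
    return data_loss_window(last_backup, disaster) <= rpo


if __name__ == "__main__":
    rpo = timedelta(hours=24)                  # hypothetical business target
    disaster = datetime(2024, 1, 2, 20, 0)     # hypothetical incident time
    nightly = datetime(2024, 1, 2, 2, 0)       # backup taken 18 hours earlier
    stale = datetime(2024, 1, 1, 2, 0)         # backup taken 42 hours earlier
    print("nightly backup meets RPO:", meets_rpo(nightly, disaster, rpo))
    print("stale backup meets RPO:", meets_rpo(stale, disaster, rpo))
```

Backing up more often shrinks the window returned by `data_loss_window`, at the cost of storing and managing more backups.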
The RSL is going to be a critical metric to define when coming up with your business continuity plan, and, once again, the business continuity plan is all about keeping essential functions running. By first establishing which applications are running in your environment, and then defining which of those applications are critical for your business to continue operating in a disaster, you can define what operational capacity you need from a compute perspective to maintain the operation of those critical systems during a disaster. Generally speaking, this computing capacity should focus on maintaining production and especially critical systems. This also probably means we're going to be excluding nonessential systems so that we can conserve resources.
All right, in summary, in this lesson, we first talked about RTO, which is the maximum tolerable downtime of a service. And then we talked about RPO, which is the maximum acceptable amount of data loss in the event of a disaster. And then finally, we talked about recovery service level, which is the minimum amount of compute capacity that we want to be able to sustain during a disaster. Thanks for joining. I'll see you in the next lesson.
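As a rough illustration of the RSL steps from this lesson (inventory the applications, mark which are critical, then work out the capacity they need), here is a small Python sketch. It assumes RSL is expressed as a percentage of normal production compute, which is a common convention, and it uses vCPU counts as a stand-in for capacity. The application names and sizes are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class App:
    name: str
    vcpus: int       # normal production compute, used here as a capacity proxy
    critical: bool   # must this application keep running during a disaster?


# Hypothetical application inventory, chosen only for illustration.
APPS = [
    App("order-processing", vcpus=16, critical=True),
    App("customer-db", vcpus=8, critical=True),
    App("reporting", vcpus=12, critical=False),
    App("dev-sandbox", vcpus=4, critical=False),
]


def recovery_service_level(apps: list[App]) -> float:
    """Percent of normal compute needed to keep only critical apps running."""
    total = sum(app.vcpus for app in apps)
    if total == 0:
        raise ValueError("inventory has no compute capacity")
    needed = sum(app.vcpus for app in apps if app.critical)
    return 100.0 * needed / total


if __name__ == "__main__":
    print(f"RSL: {recovery_service_level(APPS):.0f}% of normal capacity")
```

Excluding the nonessential applications is what brings the figure below 100 percent, which is the resource saving described above.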
