From the course: Data Resilience with Spring and RabbitMQ Event Streaming
Introducing RabbitMQ for data resilience
- [Instructor] In this video, I'll introduce you to RabbitMQ and how it can be used for resilient data delivery. Remember the scenario I introduced where you have a payment app and an account balance app? We really need something that helps us receive those critical payments even if the account balance app is not up and running. A broker can improve our ability to receive payments when errors occur. It specializes in resilient and reliable delivery of messages, and it acts as a mediator between the payment app and the account balance app. The broker is the key to data resiliency. It can even send a payment to multiple account balance apps running across different sites or different geographical locations.
I will use my favorite broker, RabbitMQ. It has been providing reliable delivery for mission-critical applications since 2007. All the information you need to get started can be found at rabbitmq.com. The communication between the applications and Rabbit uses a network protocol. Rabbit supports many different protocols, but the protocols I'll use in this course are the Advanced Message Queuing Protocol, in other words AMQP, and streams. Both support resiliency. Check out rabbitmq.com for more information about other protocols.
Reliable delivery of messages is not the responsibility of RabbitMQ alone. The AMQP and streams protocols both support sending a publisher confirm to publisher applications. The confirm indicates that the published message has been safely delivered to Rabbit. The publisher application should resend the message if it does not receive a successful confirmation. The consumer application should also send an acknowledgement back to Rabbit. This acknowledgement tells Rabbit that the consumer application has processed the message, so Rabbit can safely mark it as processed. Note that a message will be redelivered to consumer applications until it is safely acknowledged.
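
The confirm-and-resend and acknowledge-or-redeliver rules described above can be sketched in plain Java. This is only an in-memory model, not the RabbitMQ client API: `FakeBroker`, `publishWithRetry`, and `consumeOne` are hypothetical names made up for illustration, and the "broker" simply drops a configurable number of publishes to simulate failures.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// In-memory model of publisher confirms and consumer acknowledgements.
// Every type and method here is hypothetical; none of it is RabbitMQ API.
class ReliableDeliverySketch {

    // Stand-in broker that drops the first N publishes to simulate failures.
    static class FakeBroker {
        private final Deque<String> queue = new ArrayDeque<>();
        private int publishesToDrop;

        FakeBroker(int publishesToDrop) {
            this.publishesToDrop = publishesToDrop;
        }

        // Returns true (a "confirm") only when the message was actually stored.
        boolean publish(String message) {
            if (publishesToDrop > 0) {
                publishesToDrop--;
                return false;
            }
            queue.addLast(message);
            return true;
        }

        // Hands out the head message without removing it; it stays queued until acked.
        String deliver() {
            return queue.peekFirst();
        }

        // Consumer acknowledgement: the head message is now safe to remove.
        void ack() {
            queue.pollFirst();
        }

        int depth() {
            return queue.size();
        }
    }

    // Publisher side: resend until a confirm arrives or attempts run out.
    // Returns the number of attempts that were needed.
    static int publishWithRetry(FakeBroker broker, String message, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (broker.publish(message)) {
                return attempt;
            }
        }
        throw new IllegalStateException("no confirm after " + maxAttempts + " attempts");
    }

    // Consumer side: acknowledge only after processing succeeds. The first
    // failuresBeforeSuccess deliveries are treated as failed, so the broker
    // redelivers the same message. Returns the number of deliveries seen.
    static int consumeOne(FakeBroker broker, int failuresBeforeSuccess, List<String> processed) {
        int deliveries = 0;
        String message;
        while ((message = broker.deliver()) != null) {
            deliveries++;
            if (deliveries > failuresBeforeSuccess) {
                processed.add(message);
                broker.ack();
                break;
            }
            // No ack was sent, so the message remains queued and is delivered again.
        }
        return deliveries;
    }

    public static void main(String[] args) {
        FakeBroker broker = new FakeBroker(2);
        int attempts = publishWithRetry(broker, "payment-42", 5);
        List<String> processed = new ArrayList<>();
        int deliveries = consumeOne(broker, 1, processed);
        System.out.println("publish attempts: " + attempts);
        System.out.println("deliveries: " + deliveries);
        System.out.println("processed: " + processed);
    }
}
```

In this run the publisher needs three attempts before it sees a confirm, and the consumer sees the payment twice before acknowledging it. A real consumer therefore has to tolerate receiving the same payment more than once.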
There are several aspects of resiliency, such as high availability and fault tolerance, along with techniques like disaster recovery, backup and restore, and multi-site replication. A highly available system means Rabbit stays up and running for producer and consumer applications. If a broker goes down, a highly available system should be able to restart Rabbit as soon as possible. The more brokers you have in a cluster, the higher the availability you have. This allows you to overcome partial failures. Clustering also supports fault tolerance. Fault tolerance means there's no data loss when part of the cluster, such as a single broker, goes down. This is typically accomplished by keeping multiple copies of the messages within a RabbitMQ cluster.
With disaster recovery, you typically have one active RabbitMQ cluster that is handling messages for producer and consumer applications, plus one or more additional clusters that are passive. Passive means they're not actively involved in the message processing. A passive cluster should be kept in sync so that it can step in and become active for client applications. If there is an error with the primary cluster, the applications can switch from the active to the passive cluster and continue operations as normal.
But you may be wondering, how do you keep multiple clusters in sync? Well, one solution is to back up RabbitMQ by copying data that will be restored later. But you should know that the current state of messages is managed on disk by each RabbitMQ broker, so you should never copy messages from disk while the brokers are running. This is because only the RabbitMQ brokers know when it's safe to copy messages. So to safely take a backup, you must stop the RabbitMQ brokers first. But stopping Rabbit also interrupts the publisher and consumer applications, so this is not the best option. The best approach for backup and restore is to use RabbitMQ multi-site replication.
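
The switch from an active to a passive cluster described above can be sketched from the application's point of view. This is a minimal in-memory model under stated assumptions: `Cluster` and `publishWithFailover` are hypothetical names, a cluster is just a flag plus a list, and keeping the passive cluster in sync is not modeled at all.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory model of an application failing over from an active cluster to a
// passive one. All names are hypothetical; this is not RabbitMQ client code.
class FailoverSketch {

    // Stand-in for a cluster: a name, a reachability flag, and accepted messages.
    static class Cluster {
        final String name;
        boolean up = true;
        final List<String> messages = new ArrayList<>();

        Cluster(String name) {
            this.name = name;
        }

        void publish(String message) {
            if (!up) {
                throw new IllegalStateException(name + " is down");
            }
            messages.add(message);
        }
    }

    // Tries the clusters in order (active first, then passive ones) and
    // returns the name of the first cluster that accepted the message.
    static String publishWithFailover(List<Cluster> clusters, String message) {
        for (Cluster cluster : clusters) {
            try {
                cluster.publish(message);
                return cluster.name;
            } catch (IllegalStateException unavailable) {
                // This cluster is unavailable, so fall through to the next one.
            }
        }
        throw new IllegalStateException("no cluster available");
    }

    public static void main(String[] args) {
        Cluster active = new Cluster("site-a");
        Cluster passive = new Cluster("site-b");
        List<Cluster> clusters = List.of(active, passive);

        System.out.println(publishWithFailover(clusters, "payment-1"));
        active.up = false; // simulate an outage of the primary site
        System.out.println(publishWithFailover(clusters, "payment-2"));
    }
}
```

The ordering of the list encodes which cluster is considered primary. Once the active site is marked down, the second payment lands on the passive site without any change to the calling code.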
With multi-site replication, messages can be replicated from the active cluster to one or more additional clusters. These additional clusters might be passive, or they can be involved in the active processing of messages. The second arrangement is referred to as an active-active cluster. Unlike an active-passive setup, an active-active cluster has consumer applications actively consuming messages in multiple clusters. In this course, I'll show you how payment processing can be done across multiple sites for improved availability and fault tolerance. I will use an active-active RabbitMQ cluster. Next, let's talk about how we can start building applications that use RabbitMQ with Spring.