Zookeeper is a coordination service, typically associated with distributed systems (think Hadoop or Kafka). Managing Zookeeper, especially in cloud environments, can be difficult. In this blog post, I will address the challenge of deploying Zookeeper as a Service on Amazon Web Services (AWS); but first, let’s see what Zookeeper is, and why we need it!
Zookeeper is a centralized service. It maintains configuration information and naming, and it provides distributed synchronization and group services. All of these are requirements for distributed applications, which are difficult to implement because of the bugs and race conditions they introduce. Implementing services to manage them is challenging, and application designers often skimp on them, leading to brittle architectures that are difficult to manage. Even when done correctly, different approaches to implementation can lead to unnecessary complexity.
Zookeeper can help address these challenges by providing a very simple interface to a centralized coordination service. The service itself is distributed and highly reliable. Consensus, group management, and presence protocols are implemented by Zookeeper so that the applications do not need to implement them on their own. However, some work is needed to integrate applications with Zookeeper effectively. I’ll describe how to use it with Kafka.
The controller is one of the most important broker entities in a Kafka ecosystem. It maintains the leader-follower relationship across all the partitions. If a node is shutting down, the controller tells the replicas on the other nodes to become partition leaders, taking over from the partition leaders on the node that is going away. If the controller node itself fails, a new controller is elected. At any given time, there is only one controller, and all the other nodes agree on its identity.
Zookeeper stores the configuration of all the topics and their current state, including the list of existing topics, the number of partitions for each topic, the location of all the replicas, the list of configuration overrides, and which node is the preferred leader.
Zookeeper will maintain access control lists (ACLs) for all the topics.
Zookeeper will maintain a list of all the brokers that are functioning at any given moment within the cluster.
Zookeeper also tracks quotas: how much data each client is allowed to read and write.
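Concretely, Kafka keeps this state under a handful of well-known znodes in Zookeeper (the exact paths can vary slightly between Kafka versions):

```
/controller      the broker ID of the current controller
/brokers/ids     one ephemeral node per live broker
/brokers/topics  topic metadata: partitions and replica assignments
/config          topic and client configuration overrides
/kafka-acls      access control lists
```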
Please note that Zookeeper is a mandatory service for running Apache Kafka.
Zookeeper can be provisioned on AWS with a single command. However, taking full advantage of its capabilities requires much more work. In a Zookeeper cluster there are a number of machines or servers; each one is called a node. Each node needs to know the network information (IP or hostname) of the other nodes, and the other services Zookeeper is managing (like Kafka) need to know Zookeeper’s network information.
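To make this concrete, here is what a minimal configuration for a three-node ensemble looks like; each node lists every member of the ensemble in its zoo.cfg (the IP addresses are placeholders):

```
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=10
syncLimit=5
server.1=10.0.1.10:2888:3888
server.2=10.0.2.10:2888:3888
server.3=10.0.3.10:2888:3888
```

Ports 2888 and 3888 are the defaults for follower connections and leader election, and each node also writes its own ID (1, 2, or 3) into a myid file under dataDir. This is why a replacement node must come back with the same network identity.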
If we have a Zookeeper cluster on AWS, we’ll likely deploy it on EC2 instances with an Auto Scaling Group (ASG). In that dynamic environment, what happens if a node is replaced? How do the other nodes and other services learn the new node’s information?
There are several options:
I will focus on the last of these: Zookeeper as a Service, or Stateless Zookeeper.
Stateless Zookeeper is a Zookeeper cluster, configured so that if a node is terminated, the replacement node will get the same node configuration and no data will be lost.
In this section, I will describe step-by-step how to deploy a stateless Zookeeper cluster in AWS. First, I will establish some assumptions:
We need to create an environment with static internal IP addresses for the nodes, so that if a node is replaced, the new one will get the same IP address. With Elastic Network Interface (ENI), we can manage this in a straightforward manner.
In our ASG launch configuration, we need to include a script that looks for an available ENI in the same Availability Zone (AZ) and attaches it. Here is a Ruby example:
require 'aws-sdk-ec2'
require 'net/http'

# `region` is assumed to be set elsewhere in the launch script.
@ec2 = Aws::EC2::Client.new(region: region)

# The instance metadata service tells us which AZ we are running in.
metadata_endpoint = 'http://169.254.169.254/latest/meta-data/'
instance_az = Net::HTTP.get(URI.parse(metadata_endpoint + 'placement/availability-zone'))

# Find the first unattached Zookeeper ENI in the same AZ.
eni = @ec2.describe_network_interfaces(
  filters: [
    { name: 'tag:Name',          values: ['ZOOKEEPER-*'] },
    { name: 'availability-zone', values: [instance_az] },
    { name: 'status',            values: ['available'] }
  ]
).network_interfaces[0]
Now, we can attach the network interface: