42 Ways Removing ZooKeeper Improves Apache Kafka

Apache Kafka will soon no longer need ZooKeeper! With KIP-500, Kafka will include its own built-in consensus layer, completely eliminating the dependency on ZooKeeper. The next major milestone in this effort is upon us: Apache Kafka 2.8.0, where you get early access to the new code, the ability to spin up a development version of Kafka without ZooKeeper, and the chance to play around with the Raft implementation that serves as the distributed consensus algorithm.

The blog post Apache Kafka doesn't need a keeper: Removing the Apache ZooKeeper dependency discusses the problems with external metadata management, the major architectural changes, and how removing ZooKeeper improves Kafka. Ultimately, removing ZooKeeper simplifies the overall infrastructure design and operational workflows for your Kafka deployments. We've compiled a list of specific benefits that result from this simplification, focusing on the things you can STOP doing. It turns out there are plenty of things to stop doing, and we don't think you'll miss them.

Once ZooKeeper is removed as a Kafka dependency, your life will become easier in a few different areas:

  • Administration
  • Capacity and disk planning
  • Performance
  • Monitoring
  • Troubleshooting


Administration

ZooKeeper is a completely separate system from Kafka, with its own deployment patterns, configuration file syntax, and management tools. Once ZooKeeper is removed from Kafka, you no longer need to manage a separate service. Additionally, KIP-500 allows you to optionally deploy the controller and broker in the same JVM, further simplifying management. Now you can stop:

  • #1: Learning and operating another distributed system
  • #2: Managing additional servers, VMs, or containers for the ZooKeeper servers
  • #3: Maintaining a separate security configuration for ZooKeeper, different from the rest of the Kafka cluster
  • #4: Wondering whether it is spelled ZooKeeper or Zookeeper
  • #5: Sighing every time you see it spelled Zookeeper
  • #6: Working with systemctl for another Linux service (in contrast, with KIP-500, a controller and a broker can optionally run in the same JVM)
  • #7: Maintaining and versioning another properties file (same condition as above)
  • #8: Sharing the ZooKeeper ensemble between Kafka and non-Kafka services
  • #9: Redesigning topics and key patterns to work around Kafka cluster partition limitations (in contrast, with KIP-500, Kafka clusters support millions of partitions)
  • #10: Tuning broker timeouts for ZooKeeper; you can now forget about zookeeper.connection.timeout.ms and zookeeper.session.timeout.ms
  • #11: Asking why you can't run a lightweight Kafka
  • #12: Reading the ZooKeeper release notes to learn about the availability of new features when upgrading
  • #13: Updating the ZooKeeper configuration when feature behavior changes, e.g., having to explicitly allow four-letter words
  • #14: Performing quarterly rolling restarts of the ZooKeeper ensemble as part of patch management best practices
  • #15: Talking about how much you can't wait for KIP-500

Capacity and disk planning

Storage is an important consideration in ZooKeeper deployments, and without ZooKeeper you don't have to deal with capacity planning, disk issues, and ZooKeeper snapshots. Now you can stop:

  • #16: Going through the sizing exercise for each ZooKeeper server
  • #17: Putting the ZooKeeper service on the same server that runs a Kafka broker. This is generally not recommended unless the load is very light, but we know some people still try!
  • #18: Determining the number of servers that should be in a ZooKeeper ensemble to balance read capacity against write capacity (some Kafka clusters have as many ZooKeeper nodes as Kafka nodes)
  • #19: Purchasing solid-state drives (SSDs), which are recommended for ZooKeeper servers because of their latency sensitivity
  • #20: Discovering during initial installation that the ZooKeeper servers do not have the required volume mounts
  • #21: Sharing a directory path between the ZooKeeper transaction log and snapshot directories
  • #22: Setting the policy for purging old ZooKeeper data; you can now forget about autopurge.purgeInterval and autopurge.snapRetainCount
  • #23: Migrating ZooKeeper snapshots to newer, higher-capacity drives


Performance

One of the most important changes in KIP-500 is the improved handling of control plane traffic. Without KIP-500, broker operations require reading the metadata for all topics and partitions from ZooKeeper, which can take a long time on a large cluster. With KIP-500, brokers store metadata locally in a log and read only the most recent changes from the controller (similar to how Kafka consumers can read the end of a log rather than the entire log), turning an O(N) operation into an O(1) operation. These control plane operations perform significantly better, so now you can stop:

  • #24: Twiddling your thumbs after a controller fails, waiting for a new controller to be elected and to load its state from ZooKeeper (in contrast, with KIP-500, a standby controller can be elected and already has the state)
  • #25: Waiting when brokers need to be restarted and then have to read the full state (in contrast, with KIP-500, brokers persist their metadata caches across process restarts)
  • #26: Paying an O(N) cost for topic creation (in contrast, with KIP-500, topic creation no longer requires retrieving the full list of topics from ZooKeeper; it takes only O(1) time to add an entry to the metadata event log)
  • #27: Paying an O(N) cost for topic deletion
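
The difference between reloading all metadata and reading only recent changes can be illustrated with a toy model. The sketch below is not Kafka code: `MetadataLog`, `Broker`, and `records_read` are hypothetical names, and the log is just an in-memory list. It only shows why catching up from a locally remembered offset costs work proportional to the number of new records rather than to the total number of topics.

```python
class MetadataLog:
    """Toy append-only metadata log (hypothetical, not Kafka's implementation)."""

    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)

    def read_from(self, offset):
        # Return only the records at or after the given offset.
        return self.records[offset:]


class Broker:
    """Toy broker that remembers how far it has read in the log."""

    def __init__(self, log):
        self.log = log
        self.next_offset = 0   # first record not yet applied locally
        self.topics = set()
        self.records_read = 0  # counter used to compare the two strategies

    def full_reload(self):
        # Full-reload style: forget local state and re-read every record.
        self.topics.clear()
        self.next_offset = 0
        self.catch_up()

    def catch_up(self):
        # Incremental style: apply only what was appended since the last read.
        new_records = self.log.read_from(self.next_offset)
        for kind, topic in new_records:
            if kind == "create":
                self.topics.add(topic)
            elif kind == "delete":
                self.topics.discard(topic)
        self.next_offset += len(new_records)
        self.records_read += len(new_records)


log = MetadataLog()
for i in range(1000):
    log.append(("create", f"topic-{i}"))

broker = Broker(log)
broker.catch_up()                 # initial load reads all existing records
log.append(("create", "orders"))  # one new topic is created

before = broker.records_read
broker.catch_up()
incremental_cost = broker.records_read - before

before = broker.records_read
broker.full_reload()
reload_cost = broker.records_read - before
```

In this model, `incremental_cost` counts the records touched when the broker only catches up, while `reload_cost` counts the records touched when it starts over from the beginning of the log.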


Monitoring

Every service in your mission-critical deployment should be monitored, and if you're using ZooKeeper, it should be monitored like any other service in your Kafka deployment. So once ZooKeeper is removed, you can stop:

  • #28: Getting lost in busy Grafana/Kibana/Datadog monitoring dashboards that display various ZooKeeper metrics
  • #29: Figuring out what alert levels to set for which ZooKeeper JMX metrics; you can now forget about NumAliveConnections, OutstandingRequests, AvgRequestLatency, MaxRequestLatency, HeapMemoryUsage, etc.
  • #30: Figuring out what alert levels to set for which ZooKeeper-related Kafka JMX metrics; you can now forget about ZooKeeperDisconnectsPerSec, ZooKeeperExpiresPerSec, ZooKeeperReadOnlyConnectsPerSec, ZooKeeperSyncConnectsPerSec, ZooKeeperAuthFailuresPerSec, ZooKeeperSaslAuthenticationsPerSec, etc.
  • #31: Responding to middle-of-the-night pages when the ZooKeeper servers go down
  • #32: Receiving alerts when disk usage on ZooKeeper servers exceeds a configured threshold
  • #33: Configuring log management for an additional log file
  • #34: Reading ZooKeeper transaction logs and snapshots with their own special formatters to investigate problems; you can now forget about org.apache.zookeeper.server.LogFormatter and org.apache.zookeeper.server.SnapshotFormatter


Troubleshooting

If there are problems in a Kafka deployment, ZooKeeper is one more component to investigate. Without ZooKeeper, troubleshooting can focus on the core components, so now you can stop:

  • #35: Having /var/log/messages fill up with verbose logging during network outages
  • #36: Troubleshooting IP connectivity between the ZooKeeper ensemble and Kafka clients, if you haven't already migrated your clients and tools away from ZooKeeper
  • #37: Dealing with divergent state between the controller and ZooKeeper (in contrast, with KIP-500, brokers consume all events in order from the metadata event log instead of having notifications sent to them)
  • #38: Troubleshooting a Kafka cluster that is configured to connect to the wrong ZooKeeper cluster (in contrast, with KIP-500, this situation is less likely to occur when the broker and controller are co-located)
  • #39: Finding and fixing surprising production-impacting issues when upgrading ZooKeeper
  • #40: Sweating when ZooKeeper doesn't start after an upgrade, as with missing snapshot files
  • #41: Trying to remember where in the hierarchical tree structure a particular znode lives
  • #42: Searching the runbooks for the command that tells you which ZooKeeper server is the leader when your host doesn't have nc installed, perhaps due to company policy (hint: echo srvr | (exec 3<>/dev/tcp/zk-host/2181; cat >&3; cat <&3; exec 3<&-) | grep -i mode)
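
For hosts where Python is available, the shell hint in #42 can be approximated as follows. This is a minimal sketch, assuming the ZooKeeper server accepts the `srvr` four-letter command on its client port; the function names and the sample reply are made up for illustration, and the sample is abbreviated rather than real server output.

```python
import socket


def query_four_letter_word(host, port=2181, command="srvr", timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(reply):
    """Extract the value of the 'Mode:' line, or None if it is absent."""
    for line in reply.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "mode":
            return value.strip()
    return None


# Abbreviated, made-up reply used only to exercise the parser.
sample_reply = "Connections: 1\nOutstanding: 0\nMode: leader\nNode count: 5\n"
mode = parse_mode(sample_reply)
```

Against a live server, the call would look like `parse_mode(query_four_letter_word("zk-host"))`, where `zk-host` is the placeholder hostname from the hint.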

Next Steps

Although KIP-500 is not yet fully implemented, you can already switch your tools from retrieving metadata from ZooKeeper to retrieving it from the brokers, as described in the blog post Prepare Your Clients and Tools for KIP-500: Remove ZooKeeper from Apache Kafka.

| | With ZooKeeper | Without ZooKeeper |
|---|---|---|
| Configuring clients and services | zookeeper.connect=zookeeper:2181 | bootstrap.servers=broker:9092 |
| Configuring Schema Registry | kafkastore.connection.url=zookeeper:2181 | kafkastore.bootstrap.servers=broker:9092 |
| Kafka administration tools | kafka-topics --zookeeper zookeeper:2181 ... | kafka-topics --bootstrap-server broker:9092 ... --command-config <properties for connecting to brokers> |
| REST Proxy API | v1 | v2 or v3 |
| Getting the Kafka cluster ID | zookeeper-shell zookeeper:2181 get /cluster/id | kafka-metadata-quorum, or view metadata.properties, or confluent cluster describe --url http://broker:8090 --output json |
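
As an illustration of the first two rows of the ZooKeeper-based and broker-based settings listed above, the following sketch rewrites the connection keys in a properties file for a client or service. It is a hypothetical helper, not part of any Kafka tooling: `KEY_RENAMES` and `rewrite_properties` are assumed names, only the two key pairs shown in this article are handled, and the broker address must be supplied by you. Treat the output as a starting point to review, not as a complete migration.

```python
# Key renames taken from the comparison above; values must be supplied by you.
KEY_RENAMES = {
    "zookeeper.connect": "bootstrap.servers",
    "kafkastore.connection.url": "kafkastore.bootstrap.servers",
}


def rewrite_properties(text, broker_address):
    """Replace ZooKeeper connection keys with their broker-based equivalents."""
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        # Leave blank lines, comments, and lines without '=' untouched.
        if stripped and not stripped.startswith("#") and "=" in stripped:
            key = stripped.partition("=")[0].strip()
            if key in KEY_RENAMES:
                out.append(f"{KEY_RENAMES[key]}={broker_address}")
                continue
        out.append(line)
    return "\n".join(out)


legacy = "# client settings\nzookeeper.connect=zookeeper:2181\nclient.id=demo\n"
migrated = rewrite_properties(legacy, "broker:9092")
```

Here `client.id=demo` is only a filler line showing that unrelated settings pass through unchanged.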

Then try out the early access code that will be included in the next major release of Kafka. See the Apache Kafka 2.8.0 release blog post for more details.


Why is Kafka removing ZooKeeper?

The motivation for giving Kafka a “brain transplant” (replacing ZooKeeper with KRaft) was to fix scalability and performance issues, enable more topics and partitions, and eliminate the need to run an Apache ZooKeeper cluster alongside every Kafka cluster.

Can Apache Kafka work without ZooKeeper?

Yes, you can install and run Kafka without ZooKeeper. In this case, instead of being stored in ZooKeeper, Kafka's metadata is stored in a separate partition within Kafka itself.

What happens if ZooKeeper goes down in Kafka?

If one of the ZooKeeper nodes fails, the other ZooKeeper nodes detect the failure to respond, and a new ZooKeeper leader is elected if the failed node was the current leader. If multiple nodes fail and ZooKeeper loses its quorum, it will drop into read-only mode and reject requests for changes.

What replaces ZooKeeper?

Apache Kafka 3.3 replaces ZooKeeper with the new KRaft consensus protocol.

Why do we need ZooKeeper for Kafka?

At a detailed level, ZooKeeper handles the leadership election of Kafka brokers and manages service discovery as well as cluster topology so each broker knows when brokers have entered or exited the cluster, when a broker dies and who the preferred leader node is for a given topic/partition pair.

What is a ZooKeeper alternative for Kafka?

Alternatives to ZooKeeper include:
  • HashiCorp Consul.
  • Eureka.
  • Docker.
  • Traefik.
  • etcd.
  • gRPC.
  • AWS Cloud Map.
  • Hystrix.

Is Kafka without ZooKeeper production ready?

Since Apache Kafka 2.8.0, users have had early access to KIP-500, which removes the ZooKeeper dependency from Kafka. The release of Apache Kafka 3.3.0 saw KIP-500 become production-ready, meaning Kafka now relies on an internal Raft quorum that uses a Kafka topic to store metadata, with some Kafka servers acting as controllers.

How many ZooKeeper nodes are needed for Kafka?

Three ZooKeeper nodes should be enough, although it's good to understand the trade-offs: ZooKeeper uses majority quorums, which means that every vote that happens in one of its protocols requires a majority. In a production environment, the ZooKeeper servers are deployed on multiple nodes.

Which version of Kafka does not need ZooKeeper?

KIP-500 outlines a better way of handling metadata in Kafka. You can think of this as “Kafka on Kafka,” since it involves storing Kafka's metadata in Kafka itself rather than in an external system such as ZooKeeper. In the post-KIP-500 world, metadata will be stored in a partition inside Kafka rather than in ZooKeeper.

What problems does ZooKeeper solve?

Coordinating and managing a service in a distributed environment is a complicated process. ZooKeeper solves this issue with its simple architecture and API. ZooKeeper allows developers to focus on core application logic without worrying about the distributed nature of the application.

Can we have multiple ZooKeeper nodes in Kafka?

In typical architectures, a Kafka cluster is served by three ZooKeeper nodes. If the deployment is very large, this can be increased to five ZooKeeper nodes, but that in turn adds load, because all nodes try to stay in sync and all metadata-related activity is handled by ZooKeeper.

What type of failure does ZooKeeper assume?

The reliability of ZooKeeper rests on two basic assumptions. First, only a minority of servers in a deployment will fail; failure in this context means a machine crash, or some error in the network that partitions a server off from the majority. Second, deployed machines operate correctly.

Does Amazon use ZooKeeper?

Amazon EMR installs ZooKeeper as one of its components. For the versions of the components installed with ZooKeeper in a given release, see that release's component versions page (for example, Release 6.9.0 Component Versions) in the Amazon EMR documentation.

Does Facebook use ZooKeeper?

At Facebook, ZooKeeper has become the standard low-dependency metadata store of choice for foundational infrastructure, such as Twine schedulers (container life cycle management and placement) and service discovery (the production service endpoint catalog).

What is the relationship between ZooKeeper and Kafka?

ZooKeeper is used by Kafka brokers to determine which broker is the leader of a given partition and topic and to perform leader elections. ZooKeeper stores configurations for topics and permissions. ZooKeeper sends notifications to Kafka in case of changes (e.g., a new topic, a broker dying, a broker coming up, or topics being deleted).

What is the difference between ZooKeeper and Kafka?

ZooKeeper keeps track of the status of the Kafka cluster nodes, and it also keeps track of Kafka topics, partitions, and so on. ZooKeeper itself allows multiple clients to perform simultaneous reads and writes and acts as a shared configuration service within the system.

What is the importance of ZooKeeper?

The ZooKeeper utility provides configuration and state management and distributed coordination services to Dgraph nodes of the Big Data Discovery cluster. It ensures high availability of the query processing by the Dgraph nodes in the cluster. ZooKeeper is part of the Hadoop package.

What will replace Kafka?

Alternatives to Apache Kafka include:
  • Google Cloud Pub/Sub.
  • MuleSoft Anypoint Platform.
  • Confluent.
  • IBM MQ.
  • RabbitMQ.
  • Amazon MQ.
  • Azure Event Hubs.
  • KubeMQ.

Does Kafka store data in ZooKeeper?

In Kafka's 0.9.1 release, ZooKeeper is the default storage engine for consumer offsets. Information about how many messages each consumer has consumed is stored in ZooKeeper. Consumers in Kafka also have their own registry, as is the case for Kafka brokers.

What happens when Kafka is full?

The log cleanup policy discards old segments when their retention time or size limit is reached. By default there is no size limit, only a time limit.

Does Kafka 3.2 need ZooKeeper?

KIP-801 introduces a built-in authorizer, StandardAuthorizer, that does not depend on ZooKeeper. This means you can now run a secure Kafka cluster without ZooKeeper!

Why do we need three ZooKeeper nodes?

In case of a network failure, DC1 nodes cannot form a quorum and hence cannot accept any write requests. Why is an odd number of nodes configured? Let's say our ZooKeeper ensemble has 5 nodes; in this case we need a minimum of 3 nodes for a quorum and for ZooKeeper to keep serving client requests.

How many connections can ZooKeeper handle?

This property limits the number of active connections from a host, specified by IP address, to a single ZooKeeper server. By default, this limit is 60 active connections; one host is not allowed to have more than 60 active connections open to one ZooKeeper server.

How many Kafka topics is too many?

The rule of thumb is that the number of Kafka topics can be in the thousands. Jun Rao (Kafka committer; now at Confluent but formerly on LinkedIn's Kafka team) wrote: At LinkedIn, our largest cluster has more than 2K topics. 5K topics should be fine.

When should we not use Apache Kafka?

It's best to avoid using Kafka as the processing engine for ETL jobs, especially where real-time processing is needed. That said, there are third-party tools you can use that work with Kafka to give you additional robust capabilities, for example, to optimize tables for real-time analytics.

Which applications use ZooKeeper?

Some of the most notable applications of ZooKeeper are:
  • Yahoo! The ZooKeeper framework was originally built at Yahoo!. ...
  • Apache Hadoop. Apache Hadoop is the driving force behind the growth of the Big Data industry. ...
  • Apache HBase. ...
  • Apache Solr.

What are the use cases of ZooKeeper?

ZooKeeper is used as the core cloud component for node membership and management, coordination of jobs executing among workers, a lock service, a simple queue service, and a lot more.

Is ZooKeeper a key value store?

ZooKeeper is not only a key-value store; it can also be used for service discovery and as a centralized service for maintaining configuration information in a distributed application. The way ZooKeeper stores its key-value pairs is a bit different from other key-value stores: ZooKeeper uses a znode as the key.

How to run Kafka without ZooKeeper?

To run Kafka without ZooKeeper, use Kafka Raft metadata mode, or KRaft for short. In this mode, Kafka's metadata is stored in a partition within Kafka itself.

Why does ZooKeeper need 3 nodes?

The quorum size of (n/2 + 1) ensures that we do not have the split-brain problem and can always achieve a majority consensus. Hence, for a five-node cluster to form a quorum, the minimum number of nodes required is 5/2 + 1 = 3 (using integer division).
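
The majority rule quoted above can be checked with a few lines of arithmetic. This is a small sketch with hypothetical function names; it assumes integer division for n/2, as the 5/2 + 1 = 3 example implies.

```python
def quorum_size(ensemble_size):
    """Smallest majority of an ensemble: floor(n / 2) + 1."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can fail while a majority is still available."""
    return ensemble_size - quorum_size(ensemble_size)


# Maps ensemble size -> (quorum size, tolerated failures) for sizes 3 to 6.
summary = {n: (quorum_size(n), tolerated_failures(n)) for n in range(3, 7)}
```

Comparing the entries for 3 and 4, or for 5 and 6, shows why odd ensemble sizes are the usual choice: adding one server to an odd-sized ensemble raises the quorum size without increasing the number of failures that can be tolerated.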

How many ZooKeeper nodes do I need?

Keep ZooKeeper nodes to five or fewer (unless you have a strong case for surpassing that limit). In dev environments, a single ZooKeeper node is all you need. In staging environments, it makes sense to mirror the number of nodes that will be in place in production.

What will happen if the leader goes down in ZooKeeper?

If the current leader loses its connection (you can configure the behavior to trigger only on LOST, not SUSPENDED), the leadership node associated with it will be automatically deleted by the server.

Where does ZooKeeper keep its data?

ZooKeeper stores its data in a data directory and its transaction log in a transaction log directory. By default these two directories are the same. The server can (and should) be configured to store the transaction log files in a separate directory from the data files.

Can ZooKeeper fail?

ZooKeeper might fail to start if the ZooKeeper ensemble is not configured correctly, or there are problems with file permissions, port conflicts, or disk corruption. In a shared file system environment, embedded ZooKeeper can fail to start if it is already running on another resource.

Is ZooKeeper using log4j?

Logging. ZooKeeper uses log4j version 1.2 as its logging infrastructure.

Does ZooKeeper have a database?

ZooKeeper Components shows the high-level components of the ZooKeeper service. With the exception of the request processor, each of the servers that make up the ZooKeeper service replicates its own copy of each of the components. The replicated database is an in-memory database containing the entire data tree.

Is ZooKeeper a load balancer?

ZooKeeper is used for high availability, but not exactly as a load balancer. High availability means you don't want to lose your single point of contact, i.e., your master node.

Is ZooKeeper like Kubernetes?

Kubernetes and ZooKeeper are primarily classified as "Container" and "Open Source Service Discovery" tools respectively.

Who owns ZooKeeper?

It is a project of the Apache Software Foundation.

Does ZooKeeper use TCP?

More specifically, a ZooKeeper server uses this port to connect followers to the leader. When a new leader arises, a follower opens a TCP connection to the leader using this port. Because the default leader election also uses TCP, we currently require another port for leader election.

Why was ZooKeeper replaced with KRaft?

To begin with, it allows the controller to fail over much faster. The ZooKeeper-based metadata management has been a bottleneck for cluster-wide partition limits. The new quorum controller is designed to handle a much larger number of partitions per cluster.

Is Kafka 3 production ready?

KIP-833 marks KRaft as production-ready for new clusters in the Apache Kafka 3.3 release.

Does Cassandra use ZooKeeper?

Cassandra and ZooKeeper can be installed together or separately. It is suggested to install them together on the same machine. In the Global Mailbox system, ZooKeeper is called the coordination node. ZooKeeper installation is supported on the Linux platform only.

Why do we need ZooKeeper?

ZooKeeper is an open source Apache project that provides a centralized service for providing configuration information, naming, synchronization and group services over large clusters in distributed systems. The goal is to make these systems easier to manage with improved, more reliable propagation of changes.

Is ZooKeeper still relevant?

Apache ZooKeeper, Kafka's metadata management tool, will soon be phased out in favor of internal technology. Colin McCabe, a member of the Apache Kafka project management committee and an engineer at Confluent, which leverages Kafka, explained the reason for the change.

Is Kafka FIFO or LIFO?

Kafka supports a publish-subscribe model that handles multiple message streams. These message streams are stored as a first-in-first-out (FIFO) queue in a fault-tolerant manner.


Article information

Author: Tuan Roob DDS

Last Updated: 04/02/2023
