A Deep Dive into Apache ZooKeeper Security
Figure 1: Apache ZooKeeper Service Architecture
Apache ZooKeeper is a cornerstone of many distributed systems, providing essential services like configuration management, naming, and synchronization. Its reliability and scalability have made it a popular choice for coordinating complex, distributed applications. However, like any powerful tool, ZooKeeper's effectiveness is contingent on its proper implementation and security. This article provides a comprehensive analysis of Apache ZooKeeper's security landscape, exploring its inherent vulnerabilities, notable CVEs, real-world exploitation examples, and a detailed guide to hardening your ZooKeeper deployments.
Understanding ZooKeeper's Security Model
ZooKeeper's security model is built on the premise of a trusted environment. By default, it does not implement any authentication, meaning any user with network access to the ZooKeeper ensemble can interact with it. This design choice prioritizes ease of use and performance, but it also places the onus of security squarely on the shoulders of the administrators. The primary mechanism for securing ZooKeeper is through Access Control Lists (ACLs), which define permissions for specific znodes (the data nodes in ZooKeeper's hierarchical data store). However, without proper configuration, these ACLs are often left wide open, creating a significant security risk.
Figure 2: ZooKeeper Ensemble with Leader and Follower Nodes
Common Vulnerabilities and Attack Vectors
An unsecured ZooKeeper instance is a treasure trove for attackers. The lack of default authentication, coupled with the power of its administrative commands, creates a perfect storm for exploitation. Here are some of the most common vulnerabilities and attack vectors:
Unauthenticated Access
The most significant and fundamental vulnerability in ZooKeeper is its default open-door policy. Without any authentication, anyone who can connect to the ZooKeeper port (typically 2181) can issue commands, read sensitive data, and disrupt the service. This was the root cause of the Shopify vulnerability discussed later in this article.
The Four-Letter Words: Administrative Commands
ZooKeeper exposes a set of powerful administrative commands known as the "four-letter words." These commands, sent via a TCP socket, provide deep insights into the ZooKeeper ensemble and can even be used to shut it down. While useful for administration, they become a dangerous weapon in the hands of an attacker when left unprotected.
Figure 3: Common Security Attack Vectors
Here is a table of some of the most critical four-letter word commands and their potential for misuse:
| Command | Description | Attacker's Advantage |
|---|---|---|
| dump | Lists outstanding sessions and ephemeral nodes. | Can reveal sensitive session information and internal IP addresses. |
| envi | Prints details about the serving environment. | Exposes software versions, Java environment, and other system details for fingerprinting. |
| kill | Shuts down the server. | A simple and effective Denial of Service (DoS) attack. |
| ruok | Tests if the server is in a non-error state. | A basic health check that can be used to confirm the server is a viable target. |
| stat | Lists statistics about performance and clients. | Provides a wealth of information about the cluster's activity and connected clients. |
| srst | Resets server statistics. | Can be used to erase traces of an attack. |
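To check how your own servers respond to these commands, the exchange can be reproduced with a plain TCP socket. The following is a minimal sketch, not an official client: the function name send_four_letter_word is hypothetical, and it assumes the server closes the connection after replying, which is how the nc examples in this article behave.

```python
import socket


def send_four_letter_word(host: str, port: int, command: str, timeout: float = 3.0) -> str:
    """Send one four-letter word to a ZooKeeper server and return the raw reply."""
    if len(command) != 4 or not (command.isascii() and command.isalpha()):
        raise ValueError("a four-letter word must be exactly four ASCII letters")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server is assumed to close the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Calling send_four_letter_word("localhost", 2181, "ruok") against a server you operate should return the same text that echo ruok | nc localhost 2181 prints. Only run it against systems you are authorized to test.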
Notable CVEs: A History of ZooKeeper Vulnerabilities
Over the years, several critical vulnerabilities have been discovered in Apache ZooKeeper. Understanding these CVEs is crucial for comprehending the evolving threat landscape and the importance of timely patching.
Figure 4: Understanding Common Vulnerabilities and Exposures (CVE)
Here is a summary of some of the most significant CVEs:
| CVE ID | Severity | Description |
|---|---|---|
| CVE-2025-58457 | Moderate | Insufficient Permission Check in AdminServer Snapshot/Restore Commands: Improper permission check in ZooKeeper AdminServer lets authorized clients run snapshot and restore commands with insufficient permissions. Affects versions 3.9.0 before 3.9.4. |
| CVE-2024-51504 | Important | Authentication Bypass with IP-based Authentication: When using IPAuthenticationProvider in ZooKeeper Admin Server, attackers can bypass authentication by spoofing the X-Forwarded-For HTTP header. This allows arbitrary execution of Admin Server commands. Affects versions 3.9.0 before 3.9.3. |
| CVE-2024-23944 | Critical | Information Disclosure in Persistent Watcher Handling: This critical vulnerability allows an attacker to monitor child znodes by attaching a persistent watcher to a parent znode they already have access to. The server doesn't perform an ACL check when the watcher is triggered, exposing the full path of the child znodes. Affects versions 3.6.0-3.7.2, 3.8.0-3.8.3, 3.9.0-3.9.1. |
| CVE-2023-44981 | Critical | Authorization Bypass in SASL Quorum Peer Authentication: This vulnerability allows an attacker to bypass authorization and join the cluster, gaining complete read-write access to the data tree. This is possible when SASL Quorum Peer authentication is enabled, and the attacker provides a SASL authentication ID without an instance part. Affects all versions before 3.7.0, as well as 3.7.0-3.7.1, 3.8.0-3.8.2, and 3.9.0. |
| CVE-2019-0201 | Critical | Information Disclosure Vulnerability: This well-known vulnerability allows unauthenticated or unprivileged users to retrieve the unsalted hash value of user credentials when Digest Authentication is in use. The getACL() command, which doesn't require any permissions, returns all information in the ACL Id field. Affects versions prior to 3.4.14 and 3.5.0-alpha through 3.5.4-beta. |
| CVE-2018-8012 | Critical | Quorum Peer Mutual Authentication: This vulnerability allows an arbitrary endpoint to join the ZooKeeper quorum without any authentication or authorization. This could lead to a complete compromise of the cluster. Affects versions prior to 3.4.10 and 3.5.0-alpha through 3.5.3-beta. |
| CVE-2017-5637 | Moderate | DOS Attack on wchp/wchc Four Letter Words: The wchp/wchc commands are CPU intensive and can cause CPU spike leading to DoS. Affects versions 3.4.0-3.4.9 and 3.5.0-3.5.2. |
| CVE-2016-5017 | Moderate | Buffer Overflow in C CLI Shell: Buffer overflow in C client shells when command exceeds 1024 characters. Affects versions 3.4.0-3.4.8 and 3.5.0-3.5.2. |
These are just a few examples of the vulnerabilities that have affected Apache ZooKeeper. A comprehensive list can be found on the official Apache ZooKeeper security page [1].
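To see why the hash disclosure in CVE-2019-0201 matters, it helps to look at how a digest ACL id is derived. The sketch below assumes the default form used by ZooKeeper's digest scheme, user:base64(sha1(user:password)); the function name digest_acl_id is hypothetical. Because there is no salt, the same credentials always produce the same id, so a leaked id can be checked against password guesses offline.

```python
import base64
import hashlib


def digest_acl_id(username: str, password: str) -> str:
    """Return 'user:base64(sha1(user:password))', the assumed digest ACL id format."""
    raw = f"{username}:{password}".encode("utf-8")
    digest = base64.b64encode(hashlib.sha1(raw).digest()).decode("ascii")
    return f"{username}:{digest}"


def guess_matches(leaked_id: str, username: str, candidate_password: str) -> bool:
    """Show the offline check an attacker could run against a leaked id."""
    return digest_acl_id(username, candidate_password) == leaked_id
```

This is an illustration of the weakness, not a cracking tool: it only shows that an unsalted hash gives no protection once a password is guessable.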
Real-World Exploitation: The Shopify Incident
In 2016, a security researcher discovered a publicly accessible ZooKeeper instance belonging to Shopify [2]. This incident, detailed in a HackerOne report, serves as a stark reminder of the dangers of exposing ZooKeeper to the internet without proper security measures. The researcher was able to connect to the ZooKeeper instance on port 2181 without any authentication and execute a series of administrative commands.
The Attack
The researcher used simple tools like ncat to send four-letter commands to the ZooKeeper server. Here's a glimpse of the commands used and the information they revealed:
- dump: This command revealed a list of active sessions and ephemeral nodes, including internal IP addresses of Shopify's infrastructure.
- envi: This command exposed detailed information about the server's environment, including the Java version, classpath, and operating system.
- stat: This command provided a wealth of statistics about the ZooKeeper ensemble, including the number of connections, latency, and server mode (follower).
The researcher also noted that the kill command was available, which could have been used to shut down the ZooKeeper server, potentially causing a significant outage for Shopify's services. The root cause of this vulnerability was a misconfigured firewall rule that left the ZooKeeper port exposed to the public internet.
This incident highlights the critical importance of network security and the principle of least privilege. ZooKeeper should never be directly exposed to the internet. Access should be restricted to a trusted network, and authentication should always be enabled.
Example: Exploiting ZooKeeper with Four-Letter Commands
Here are some practical examples of how an attacker might exploit an unsecured ZooKeeper instance:
Example 1: Gathering Server Information
$ echo stat | nc zookeeper.target.com 2181
Zookeeper version: 3.5.1-alpha-1693007, built on 07/28/2015 07:19 GMT
Clients:
/10.92.1.120:35986[1](queued=0,recved=2238053,sent=2238053)
/10.92.1.10:48851[1](queued=0,recved=2235979,sent=2235979)
Latency min/avg/max: 0/0/981
Received: 25813570
Sent: 25813622
Connections: 7
Outstanding: 0
Zxid: 0xc2000016ad
Mode: follower
Node count: 192
Example 2: Listing Sessions and Ephemeral Nodes
$ echo dump | nc zookeeper.target.com 2181
SessionTracker dump:
Global Sessions(7):
0x1053c5850800023 4000ms
0x1053c5850800024 4000ms
ephemeral nodes dump:
Sessions with Ephemerals (5):
0x1053c5850800024:
/borg/locutus/agents/061e4b6/10.92.1.192:9257
Example 3: Environment Details
$ echo envi | nc zookeeper.target.com 2181
Environment:
zookeeper.version=3.5.1-alpha-1693007, built on 07/28/2015 07:19 GMT
host.name=locutus-zk3.ec2.shopify.com
java.version=1.7.0_79
java.vendor=Oracle Corporation
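The stat output in Example 1 is easy to mine automatically, which is part of what makes it valuable for reconnaissance. The following sketch parses that text format into the version, server mode, and client addresses. The function name summarize_stat and the returned dictionary layout are hypothetical, and the parser assumes the line layout shown in Example 1.

```python
import re

# Client lines look like "/10.92.1.120:35986[1](queued=0,...)".
CLIENT_LINE = re.compile(r"^/(?P<ip>\S+):(?P<port>\d+)\[")


def summarize_stat(output: str) -> dict:
    """Extract the version, server mode, and client IPs from `stat` output."""
    summary = {"version": None, "mode": None, "clients": []}
    for raw_line in output.splitlines():
        line = raw_line.strip()
        if line.lower().startswith("zookeeper version:"):
            # Keep only the version token, dropping the ", built on ..." suffix.
            summary["version"] = line.split(":", 1)[1].split(",")[0].strip()
        elif line.startswith("Mode:"):
            summary["mode"] = line.split(":", 1)[1].strip()
        else:
            match = CLIENT_LINE.match(line)
            if match:
                summary["clients"].append(match.group("ip"))
    return summary
```

Defenders can use the same parsing to confirm which internal addresses a server would disclose if the stat command were reachable.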
Hardening Your ZooKeeper Deployment: A Best Practices Guide
Securing your ZooKeeper deployment is not just a recommendation; it's a necessity. A compromised ZooKeeper ensemble can lead to a catastrophic failure of your entire distributed system. Here is a comprehensive guide to hardening your ZooKeeper deployment, incorporating best practices and lessons learned from past vulnerabilities.
1. Network Security: Firewalls and Network Segmentation
Figure 5: Network Security Components and Defense Layers
The first line of defense for your ZooKeeper ensemble is network security. As demonstrated by the Shopify incident, exposing ZooKeeper to the public internet is a recipe for disaster. Here's how to lock down your network:
- Firewall Rules: Implement strict firewall rules that only allow traffic from trusted clients and other ZooKeeper nodes on the required ports (2181 for clients, 2888 for quorum, and 3888 for leader election).
- Network Segmentation: Isolate your ZooKeeper ensemble in a private network or VPC. This adds an extra layer of security by preventing direct access from the public internet.
Example: Firewall Configuration
# Allow only trusted clients to access ZooKeeper
sudo iptables -A INPUT -p tcp --dport 2181 -s 10.0.1.0/24 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 2181 -j DROP
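# Allow only other ZooKeeper nodes to reach the quorum and leader election ports
sudo iptables -A INPUT -p tcp --dport 2888 -s 10.0.2.0/24 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 2888 -j DROP
sudo iptables -A INPUT -p tcp --dport 3888 -s 10.0.2.0/24 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 3888 -j DROP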
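After applying firewall rules, it is worth confirming from an untrusted network that the ports no longer answer. The sketch below is one way to do that with a plain TCP connection attempt. The function names and the port-to-role mapping are hypothetical helpers, and a refused or timed-out connection is treated as "not reachable".

```python
import socket

# Default ZooKeeper ports as described in this article.
ZOOKEEPER_PORTS = {2181: "client", 2888: "quorum", 3888: "leader election"}


def is_port_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def exposed_ports(host: str, ports=None) -> dict:
    """Map each reachable port to its role; an empty dict means nothing answered."""
    ports = ZOOKEEPER_PORTS if ports is None else ports
    return {port: role for port, role in ports.items() if is_port_reachable(host, port)}
```

Run exposed_ports against your own ZooKeeper hosts from a machine outside the trusted subnets; any non-empty result means a rule is missing or is being bypassed.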
2. Authentication: Kerberos and SASL
Never run ZooKeeper without authentication. ZooKeeper authenticates clients through SASL, and the two mechanisms most commonly used with it are Kerberos (GSSAPI) and DIGEST-MD5. Kerberos is the more robust and recommended option for production environments.
- Kerberos: Provides strong, mutual authentication for clients and servers. It's the gold standard for securing ZooKeeper and is highly recommended for production deployments.
- SASL (Simple Authentication and Security Layer): A framework that allows for pluggable authentication mechanisms. DIGEST-MD5 is an option, but it is a weak, password-based mechanism and should be avoided; the related digest ACL scheme also exposed unsalted credential hashes in CVE-2019-0201. If you use SASL, choose a stronger mechanism like GSSAPI (Kerberos).
Example: Enabling SASL Authentication
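In your ZooKeeper configuration file (zoo.cfg). The server also needs a JAAS login configuration, supplied through the java.security.auth.login.config system property:
authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider
requireClientAuthScheme=sasl
# On ZooKeeper 3.6.0 and later, reject clients that have not authenticated via SASL
sessionRequireClientSASLAuth=true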
3. Authorization: Access Control Lists (ACLs)
Figure 6: ZooKeeper Coordination in Distributed Systems (Kafka Example)
ACLs are the primary mechanism for controlling access to znodes. By default, znodes are created with open ACLs, which is a major security risk. Here are some best practices for using ACLs:
- Principle of Least Privilege: Grant only the permissions that are absolutely necessary for a client to perform its function. Avoid using the world scheme, which grants access to anyone.
- Use Specific Schemes: Use the auth or sasl schemes to grant permissions to specific authenticated users.
- Secure Critical Znodes: Pay special attention to znodes that store sensitive configuration data or are critical for the operation of your distributed system.
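Example: Setting ACLs with zkCli.sh
# Connect to ZooKeeper with the bundled CLI (the commands below use zkCli.sh syntax)
bin/zkCli.sh -server localhost:2181
# Authenticate first; the auth scheme grants permissions to the session's own identity
zk> addauth digest admin:password
# Create a znode with restricted access
zk> create /secure-config "sensitive data"
# Set ACL to allow only the authenticated user 'admin' to read and write
zk> setAcl /secure-config auth:admin:password:rw
# Verify ACL
zk> getAcl /secure-config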
4. Encryption: TLS/SSL for Data in Transit
ZooKeeper communication is unencrypted by default, which makes it vulnerable to man-in-the-middle attacks. To protect your data in transit, you should enable TLS/SSL encryption for both client-server and server-server communication.
Example: Enabling TLS in zoo.cfg
secureClientPort=2281
serverCnxnFactory=org.apache.zookeeper.server.NettyServerCnxnFactory
ssl.keyStore.location=/path/to/keystore.jks
ssl.keyStore.password=your_password
ssl.trustStore.location=/path/to/truststore.jks
ssl.trustStore.password=your_password
5. Disable Unnecessary Four-Letter Commands
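Many of the four-letter commands are useful for debugging but can be dangerous in production. Since ZooKeeper 3.5.3, a command is only enabled if it appears in the 4lw.commands.whitelist setting, and by default the whitelist contains only srvr. Whitelist only the commands you need.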
Example: Disabling Four-Letter Commands
# In zoo.cfg, whitelist only safe commands
4lw.commands.whitelist=stat,ruok,conf,isro
# Or disable all four-letter commands
4lw.commands.whitelist=
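A whitelist is only useful if it stays tight, so it can help to check zoo.cfg automatically. The sketch below reads the configuration text and reports whitelisted commands that this article flags as risky. The RISKY_COMMANDS set is an assumption drawn from the table and CVE list above, not an official classification, and the function names are hypothetical. The * wildcard, which enables every command, is treated as whitelisting all of them.

```python
# Commands this article describes as dangerous or abusable (an assumption, not an official list).
RISKY_COMMANDS = {"dump", "envi", "stat", "srst", "kill", "wchp", "wchc"}


def parse_whitelist(cfg_text: str):
    """Return the set of whitelisted four-letter words, or None if the key is absent."""
    whitelist = None
    for raw_line in cfg_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "4lw.commands.whitelist":
            # A later occurrence of the key overrides an earlier one.
            whitelist = {cmd.strip() for cmd in value.split(",") if cmd.strip()}
    return whitelist


def risky_whitelisted(cfg_text: str) -> set:
    """Return the risky commands that the configuration explicitly enables."""
    whitelist = parse_whitelist(cfg_text)
    if whitelist is None:
        return set()
    if "*" in whitelist:
        return set(RISKY_COMMANDS)
    return whitelist & RISKY_COMMANDS
```

For the first whitelist shown above (stat,ruok,conf,isro), this check flags stat, which is a reminder that even a short whitelist can still disclose client addresses.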
6. Auditing and Monitoring
Regularly audit your ZooKeeper configuration and monitor its activity for any signs of suspicious behavior. Here are some key areas to focus on:
- ACLs: Regularly review your ACLs to ensure they are still appropriate and haven't been inadvertently relaxed.
- Logs: Monitor ZooKeeper logs for any unusual activity, such as a large number of failed authentication attempts or unexpected commands.
- Four-Letter Words: Monitor the use of the four-letter word commands. Any unexpected usage should be investigated immediately.
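The ACL review described above can be partly automated. The following sketch assumes you have already exported ACLs into simple records; the AclEntry type and find_open_acls function are hypothetical and do not come from any ZooKeeper client library. It flags every znode where world:anyone holds more permissions than you have decided to allow.

```python
from typing import Iterable, List, NamedTuple


class AclEntry(NamedTuple):
    path: str    # znode path, e.g. "/myapp/config"
    scheme: str  # e.g. "world", "auth", "digest", "sasl"
    id: str      # e.g. "anyone" for the world scheme
    perms: str   # any subset of "cdrwa"


def find_open_acls(entries: Iterable[AclEntry], allowed_world_perms: str = "") -> List[AclEntry]:
    """Return entries granting world:anyone more than the allowed permissions."""
    findings = []
    for entry in entries:
        if entry.scheme == "world" and entry.id == "anyone":
            extra = set(entry.perms) - set(allowed_world_perms)
            if extra:
                findings.append(entry)
    return findings
```

Passing allowed_world_perms="r" tolerates znodes that are intentionally world-readable while still reporting any that are world-writable.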
7. Keep ZooKeeper Updated
Always run the latest stable version of ZooKeeper. Security patches are regularly released to address newly discovered vulnerabilities. Subscribe to the ZooKeeper mailing lists and security advisories to stay informed about the latest threats.
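A simple version check can be added to monitoring so that outdated servers are noticed. The sketch below extracts the numeric version from a string such as the "Zookeeper version:" line shown earlier and compares it with a minimum you choose. The function names are hypothetical, pre-release suffixes like -alpha are ignored, and the minimum of 3.9.4 used in the usage note is only an assumption based on the most recent entry in the CVE table above.

```python
import re


def parse_zookeeper_version(text: str) -> tuple:
    """Return (major, minor, patch) from the first x.y.z number found in the text."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_older_than(version_text: str, minimum: tuple) -> bool:
    """Return True if the reported version is below the required minimum."""
    return parse_zookeeper_version(version_text) < minimum
```

For example, is_older_than("Zookeeper version: 3.5.1-alpha-1693007", (3, 9, 4)) returns True. Check the official security page for the fixed versions that apply to your release line before choosing the minimum.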
Additional Security Considerations
Using zk_shell for Administration
The zk_shell tool is a Python-based command-line interface for ZooKeeper that provides more features than the default CLI. However, it should be used with caution and only from trusted networks.
Example: Installing and Using zk_shell
# Install zk_shell
pip install zk-shell
# Connect to ZooKeeper
zk-shell localhost:2181
# List znodes
zk> ls /
# Create a new znode
zk> create /myapp/config "configuration data"
# Get znode data
zk> get /myapp/config
# Copy files between filesystem and ZooKeeper
zk> cp file:///etc/config.txt zk://localhost:2181/myapp/config
Securing Sensitive Data in ZooKeeper
If you must store sensitive data like passwords or API keys in ZooKeeper, always encrypt the data before storing it. ZooKeeper does not provide native encryption at rest, so this must be handled at the application level.
Example: Encrypting Data Before Storage
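# Python example using the cryptography library; the kazoo client is assumed for the zk object
from cryptography.fernet import Fernet
from kazoo.client import KazooClient
# Connect to ZooKeeper (the address is an example)
zk = KazooClient(hosts="localhost:2181")
zk.start()
# Generate encryption key; in practice, load it from a secrets manager and never store it in ZooKeeper
key = Fernet.generate_key()
cipher = Fernet(key)
# Encrypt sensitive data
password = "my_secret_password"
encrypted_password = cipher.encrypt(password.encode())
# Store encrypted data in ZooKeeper, creating parent znodes if needed
zk.create("/myapp/credentials", encrypted_password, makepath=True)
# Later, retrieve and decrypt (get() returns a (data, stat) tuple)
encrypted_data = zk.get("/myapp/credentials")[0]
decrypted_password = cipher.decrypt(encrypted_data).decode()
zk.stop()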
Summary of Best Practices
| Security Measure | Implementation | Priority |
|---|---|---|
| Network Isolation | Use firewalls and VPCs to restrict access | Critical |
| Authentication | Enable SASL authentication, preferably Kerberos (GSSAPI) | Critical |
| Authorization | Implement strict ACLs on all znodes | Critical |
| Encryption | Enable TLS/SSL for all communications | High |
| Command Restrictions | Disable or whitelist four-letter commands | High |
| Data Encryption | Encrypt sensitive data before storage | High |
| Regular Updates | Keep ZooKeeper updated with latest patches | High |
| Monitoring | Implement logging and alerting for suspicious activity | Medium |
| Auditing | Regular security audits and ACL reviews | Medium |
Figure 7: ZooKeeper Service Components and Client Interaction
Conclusion
Apache ZooKeeper is a powerful and reliable tool for distributed coordination, but its security cannot be taken for granted. The default open-door policy, the power of its administrative commands, and a history of critical vulnerabilities make it a prime target for attackers. By understanding the inherent risks, staying up-to-date on the latest CVEs, and implementing a multi-layered security strategy that includes network security, authentication, authorization, encryption, and monitoring, you can protect your ZooKeeper deployments and ensure the stability and integrity of your distributed systems.
Remember that security is not a one-time effort but an ongoing process. Regular audits, timely patching, and continuous monitoring are essential to maintaining a secure ZooKeeper environment. As the threat landscape evolves, so too must your security posture.
References
[1] Apache ZooKeeper Security. (n.d.). Retrieved from https://zookeeper.apache.org/security.html
[2] Unauthorized access to Zookeeper on http://locutus-zk3.ec2.shopify.com:2181. (2016, July 27). HackerOne. Retrieved from https://hackerone.com/reports/154369
Article by Pentester | Published October 2025