Ozone is an Apache Software Foundation project to build a distributed storage platform that caters to the demanding performance needs of analytical workloads, content distribution, and object storage use cases.
The Ozone Manager is a critical component of Ozone. It is a replicated, highly-available service that is responsible for managing the metadata for all objects stored in Ozone. As Ozone scales to exabytes of data, it is important to ensure that Ozone Manager can perform at scale. In this blog post, we will highlight the work done recently to improve the performance of Ozone Manager to scale to exabytes of data.
The hardware specifications are included at the end of this blog. The hardware was provided by Cisco as an open source partnership with Cloudera. Cisco has multiple reference architectures for running Ozone. The hardware certification includes high density nodes with close to 500 TB per node optimized for performance and TCO.
Relevance of Operations per Second to Scale
Ozone Manager hosts the metadata for the Objects stored within Ozone and consists of a cluster of Ozone Manager instances replicated via Ratis (a raft implementation). Data processing workloads tend to be more sensitive to the performance of transferring data between Datanodes and the various applications that process it. As long as the metadata for objects is served within a reasonable low latency, the impact of optimizations to Ozone Manager does not show up in stand-alone analytical benchmarks that are popular.
Ozone is designed to scale to 10s of billions of objects and exabytes in capacity. OM’s rate of serving operations becomes critical at scale, supporting the workloads spanning the entire dataset stored. Most of the work covered in this blog is crucial for scaling the total data under management and supporting multiple high-performance workloads concurrently.
With performance in mind, we narrowed our focus on OM and over the past year and developed a number of improvements that significantly boost performance and scale.. These changes will be part of the upcoming CDP release 7.1.9 and the upcoming Apache Ozone release 1.4.0.
We broke down the improvements to a few key areas listed below:
- Improve the number of operations per second S3 Gateway can support by improving adding connection persistence between S3 Gateway and Ozone Manager HDDS-5881.
- Optimize the Ozone Client to Ozone Manager protocols for reduced network round trips. HDDS-6996 HDDS-7059
- Split the load between foreground and background to isolate scaling of foreground and background traffic independently HDDS-7223
- Simulate exabyte in capacity HDDS-7489
- Improved metric collection for detailed latency breakdown HDDS-7203
- Improving performance for secure block access by using symmetric algorithms for signing token HDDS-7733
Ozone can now support around 105k read operations per second post the improvements mentioned above. This represents around a 7x increase in Ozone Manager IOPS over CDP 7.1.8. For S3 Gateway, the performance per S3 Gateway has increased over 30x since the start of the various performance-related projects.
The following load pattern was generated using Ozone’s built-in CLI load generator. The tool reads only the metadata for objects in a cluster with around 100 million keys. The peak operations per second measured is right around 100k.
The plot before shows the rate of key reads served by Ozone Manager.
Freon is an extension of the Ozone CLI that allows for generating load and benchmarking various Ozone APIs. We use Freon to generate a large dataset of over 400 million keys and read the keys back to generate load on the Ozone Manager. Ozone Freon generated the load from 16 physical client nodes, with each instance spinning up to 90 threads.
The following plot is the rate of reads as seen by a single instance of Freon with an increasing number of threads to generate the load.
One of the many metrics tracked by Ozone is the time taken to process the request internally by Ozone Manager. The work done to improve the block token generation for secure reads, helped reduce the latency down to sub millisecond. The work done for redesigning the block token generation shaved around 6 ms from each read operation.
Overall the various projects listed above helped Ozone’s read key performance to go from around ~15k to over 100k IOPS.
Going forward, we expect another round of performance improvements from planned projects.
The hardware setup was donated by Cisco, and it consisted of three master nodes and 16 datanodes
For the Ozone Manger read operations per second the relevant configurations updated are as follows
Ozone was configured to integrate with Ranger and secured via Kerberos.
With a growing number of customers and scale requirements for Ozone we are constantly innovating and working to push its boundaries for better performance, scale and operational excellence. These improvements will help customers of all sizes ranging from just a few nodes to 1000’s of nodes. Apache Ozone with its performance characteristics and improvements is the foundation for the Modern Data Architecture that allows customers to seamlessly build a Hybrid Cloud Native Architecture for their data applications. Download Apache Ozone at the Apache Download Site or a CDP Trial to get started.