Improving RocksDB’s Write Scalability & Counting Things at Smyte
Ted Carstensen
Heavybit member company RainforestQA recently hosted the RocksDB meetup in our San Francisco Clubhouse, where Nathan Bronson from Facebook and Yunjing Xu from Smyte gave two great talks on their experiences with RocksDB. Sign up here to attend the next RocksDB meetup, and check out the Heavybit Events calendar for all of our upcoming developer-focused events.
Improving RocksDB’s Write Scalability, Nathan Bronson
Nathan Bronson has been an engineer at Facebook for 5 years, most notably on the TAO cache. He received a PhD from Stanford for work on better programming models for single-machine concurrency. He’s not a database expert, but the code he helped write ran a billion database queries while you were reading this bio. Currently he’s working in Facebook’s Boston office on a new interface to the social graph with stronger consistency primitives.
RocksDB’s architecture is highly concurrent for reads, but not for writes. When there are concurrent writers, their work is grouped together and applied by a single thread. This makes it easy to batch log writes, keeps the write path simple and reliable, and is sufficient for many workloads. Unfortunately, it also severely limits write scalability. In this talk Nathan digs into a series of changes he made to RocksDB to tackle the scalability problem with minimal impact on the core write logic.
These changes allow writing threads to join a write group without waiting for the main DB mutex, reduce the cost of waiting for the write group leader, and allow many threads to simultaneously update the memtable’s lock-free skip list. The end result is useful (not perfect) write scalability: a 3X improvement in peak insert rate with sync on commit disabled, and a 2X improvement with it enabled.
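To make the write-group idea concrete, here is a minimal Python sketch of the baseline pattern described above: concurrent writers queue their records, and whichever writer finds no leader active becomes the leader and applies the whole group in one batch. This is an illustration only, not RocksDB code. The class name GroupCommitLog, its fields, and the use of a plain mutex and events are all assumptions made for the example, and it deliberately does not model the lock-free group joining or concurrent memtable inserts from the talk.

```python
import threading


class GroupCommitLog:
    """Toy write path (hypothetical, not RocksDB's implementation):
    writers enqueue records, and a single leader thread applies each group."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = []           # (record, done_event) pairs awaiting commit
        self._leader_active = False  # True while some thread is acting as leader
        self.log = []                # stands in for the write-ahead log
        self.batches = []            # size of every group that was committed

    def write(self, record):
        done = threading.Event()
        with self._lock:
            self._pending.append((record, done))
            become_leader = not self._leader_active
            if become_leader:
                self._leader_active = True
        if not become_leader:
            # Follower: block until a leader has committed our record.
            done.wait()
            return
        # Leader: keep draining groups until the queue is empty.
        while True:
            with self._lock:
                group, self._pending = self._pending, []
                if not group:
                    # Step down under the lock so no late writer is stranded.
                    self._leader_active = False
                    return
            # One "log write" covers every record in the group.
            self.log.extend(rec for rec, _ in group)
            self.batches.append(len(group))
            for _, event in group:
                event.set()


if __name__ == "__main__":
    db = GroupCommitLog()

    def writer(writer_id):
        for seq in range(100):
            db.write((writer_id, seq))

    threads = [threading.Thread(target=writer, args=(i,)) for i in range(8)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(len(db.log), "records committed in", len(db.batches), "groups")
```

In this sketch every writer has to take the same mutex just to join a group, and only the leader touches the log. Those are the kinds of serialization points the changes described in the talk are aimed at.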
Counting with Domain Specific Databases, Yunjing Xu
Smyte is building a platform to analyze all of the traffic running through busy consumer websites and mobile apps. In this talk Yunjing describes how Smyte uses RocksDB, Kafka, and Kubernetes to build various domain-specific databases. By leveraging prefix scanning and multiple column families in RocksDB, they achieved a significant cost reduction at comparable performance relative to their previous Redis-/MySQL-based solution.
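As a rough illustration of why prefix scanning suits counting workloads, the sketch below emulates an ordered key-value store in plain Python and sums counters that share a key prefix. It does not use RocksDB and does not reflect Smyte's actual schema. The OrderedKV class, the key layout (entity, event type, and day joined by "|"), and the two dictionary entries standing in for column families are all hypothetical.

```python
import bisect


class OrderedKV:
    """Toy stand-in for a sorted key-value store (hypothetical helper)."""

    def __init__(self):
        self._keys = []   # keys kept in sorted order, as in an LSM or B-tree
        self._vals = {}   # key -> integer counter

    def incr(self, key, delta=1):
        if key not in self._vals:
            bisect.insort(self._keys, key)
            self._vals[key] = 0
        self._vals[key] += delta

    def prefix_scan(self, prefix):
        # Seek to the first key >= prefix, then walk forward while it matches.
        i = bisect.bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            key = self._keys[i]
            yield key, self._vals[key]
            i += 1


# Two separate stores play the role of column families (assumed layout).
column_families = {"by_ip": OrderedKV(), "by_user": OrderedKV()}

by_ip = column_families["by_ip"]
by_ip.incr(b"1.2.3.4|login|2016-05-01")
by_ip.incr(b"1.2.3.4|login|2016-05-01")
by_ip.incr(b"1.2.3.4|login|2016-05-02")
by_ip.incr(b"1.2.3.4|signup|2016-05-02")
by_ip.incr(b"1.2.3.40|login|2016-05-02")

# All login counts for one IP across days, via a single contiguous scan.
logins = sum(count for _, count in by_ip.prefix_scan(b"1.2.3.4|login|"))
print(logins)
```

Because related keys sort next to each other, one seek plus a short forward walk answers the query. The trailing delimiter in the prefix matters here: without it, a scan for 1.2.3.4 would also pick up keys for 1.2.3.40.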
Yunjing is an engineer at Smyte. Smyte builds trust and safety tools to fight spam, scams, credit card fraud, and online harassment on peer-to-peer marketplaces and social apps. Before Smyte, Yunjing worked on the data science and infrastructure team at Square and received a Ph.D. from the University of Michigan for research on performance and security problems of public cloud infrastructure.