- A RocksDB-Cloud instance is durable. Database data and metadata are continuously and automatically replicated to cloud storage (e.g. AWS S3). In the event that the machine running RocksDB-Cloud dies, another process on any other EC2 machine can reopen the same RocksDB-Cloud database.
- A RocksDB-Cloud instance is cloneable. RocksDB-Cloud supports a primitive called zero-copy-clone that allows another instance of RocksDB-Cloud on another machine to clone an existing db. Both the master and slave RocksDB-Cloud instances can run in parallel, and they share a common set of database files.
- A RocksDB-Cloud instance automatically places hot data on SSD and cold data in cloud storage. The entire database storage footprint need not be resident on costly SSD. The cloud storage contains the entire database, and the local storage contains only the files that are in the working set.
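The zero-copy-clone primitive described above can be sketched in a few lines. This is an illustrative model, not the rocksdb-cloud API: the key idea is that a clone copies only the metadata (the list of SST file names), while the immutable SST files themselves stay in cloud storage and are shared by master and clone until their contents diverge.

```cpp
#include <set>
#include <string>

// Hypothetical sketch of zero-copy-clone (names are illustrative, not the
// rocksdb-cloud API). Cloning copies only the manifest; no file data moves.
struct CloudDb {
  // Names of immutable SST files that live in cloud storage.
  std::set<std::string> manifest;

  // Zero-copy clone: duplicate metadata only.
  CloudDb Clone() const { return CloudDb{manifest}; }

  // Writes made after the clone diverge: each instance adds its own files,
  // while the files created before the clone remain shared.
  void Flush(const std::string& new_sst) { manifest.insert(new_sst); }
};
```

Because SST files are immutable once written, neither instance can invalidate the other's view of the shared files, which is what makes the clone safe without any data copy.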
Monday, May 1, 2017
Wednesday, April 26, 2017
I have been asked by various students and professors about research-and-development ideas related to RocksDB. I remember the first time I was asked about this was by a student at UC Berkeley when I presented at the AMP Lab seminar in 2013. I have been accumulating a list of these topics in my mind since then, and here it is:
- Time Series: RocksDB stores keys and values in the database. RocksDB uses delta-encoding of keys to reduce the storage footprint. Can we get better compressibility if the key prefix is of a specific type? If the key prefix is a timestamp, can we use more sophisticated methods to achieve better storage efficiency? Algorithms along the lines of the delta-of-delta encoding schemes described in the Gorilla paper are possible candidates to consider. That paper also describes how an XOR scheme can be used to compress floating-point values.
- Byte Addressable Storage: There are storage devices that are byte-addressable, and RocksDB's PlainTable format is designed for these types of storage media. However, supporting delta-encoding of keys and supporting compression (e.g. gzip, zlib, etc.) is a challenge in this data format. Design and implement delta-encoding and compression for the PlainTable format.
- Read-triggered-compactions: New writes to the db cause new files to be created, which in turn trigger the compaction process. In the current implementation, all compactions are triggered by these writes. However, there are scenarios that can benefit from read/scan requests triggering compaction. For example, if a majority of read requests have to check multiple files for a key, it might be better to compact those files so that subsequent read requests can be served with fewer reads from storage. Design and implement read-triggered compactions.
- Tiered storage: RocksDB has a feature that allows caching data in a persistent storage cache: sst files that reside on slow remote storage can be brought in on demand and cached on the specified persistent storage cache. The current implementation treats the persistent storage cache as a single homogeneous tier. However, there are scenarios where multiple storage systems (with different access latencies) need to be configured for caching. For example, suppose all your sst files are on network storage, and you want to use locally attached spinning disks as well as locally attached SSDs as part of your persistent storage cache. Enhance RocksDB to support more than one tier of storage for its persistent storage cache.
- Versioned store: RocksDB always keeps only the latest value for a key. Enhance RocksDB so that it can store, and allow access to, the n most recent updates to a key.
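For the time-series idea above, here is a minimal sketch of delta-of-delta encoding for timestamps, in the spirit of the Gorilla paper. A real implementation would bit-pack the residuals; this version just materializes them to show why the scheme compresses well: for regularly spaced timestamps almost every encoded value is zero.

```cpp
#include <cstdint>
#include <vector>

// Encode timestamps as: first timestamp, first delta, then delta-of-deltas.
// For evenly spaced series the delta-of-deltas are mostly 0, which a
// bit-packing stage (omitted here) would store in very few bits.
std::vector<int64_t> EncodeDoD(const std::vector<int64_t>& ts) {
  std::vector<int64_t> out;
  if (ts.empty()) return out;
  out.push_back(ts[0]);                 // first timestamp verbatim
  if (ts.size() == 1) return out;
  out.push_back(ts[1] - ts[0]);         // first delta
  for (size_t i = 2; i < ts.size(); ++i) {
    int64_t delta = ts[i] - ts[i - 1];
    int64_t prev_delta = ts[i - 1] - ts[i - 2];
    out.push_back(delta - prev_delta);  // delta-of-delta, usually 0
  }
  return out;
}

std::vector<int64_t> DecodeDoD(const std::vector<int64_t>& enc) {
  std::vector<int64_t> ts;
  if (enc.empty()) return ts;
  ts.push_back(enc[0]);
  if (enc.size() == 1) return ts;
  int64_t delta = enc[1];
  ts.push_back(ts[0] + delta);
  for (size_t i = 2; i < enc.size(); ++i) {
    delta += enc[i];                    // reconstruct the running delta
    ts.push_back(ts.back() + delta);
  }
  return ts;
}
```

For the series {1000, 1010, 1020, 1031} this produces {1000, 10, 0, 1}: one large value, then small residuals.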
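The key delta-encoding that the byte-addressable-storage bullet asks for in PlainTable can be sketched independently of the file format. This is the same idea RocksDB's block-based format already uses: store each key in a sorted run as the length of the prefix it shares with the previous key, plus the unshared suffix. The types and names below are illustrative, not PlainTable internals.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// (shared_prefix_len, unshared_suffix) for each key in a sorted run.
using DeltaKey = std::pair<size_t, std::string>;

std::vector<DeltaKey> EncodeKeys(const std::vector<std::string>& keys) {
  std::vector<DeltaKey> out;
  std::string prev;
  for (const std::string& k : keys) {
    size_t shared = 0;
    while (shared < prev.size() && shared < k.size() &&
           prev[shared] == k[shared]) {
      ++shared;  // length of the common prefix with the previous key
    }
    out.emplace_back(shared, k.substr(shared));
    prev = k;
  }
  return out;
}

std::vector<std::string> DecodeKeys(const std::vector<DeltaKey>& enc) {
  std::vector<std::string> keys;
  std::string prev;
  for (const DeltaKey& d : enc) {
    prev = prev.substr(0, d.first) + d.second;  // rebuild the full key
    keys.push_back(prev);
  }
  return keys;
}
```

The research challenge in the bullet is that decoding key i requires key i-1, which conflicts with the random, byte-addressable access pattern PlainTable is built for; restart points every few keys are one common compromise.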
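One possible trigger policy for the read-triggered-compactions idea is sketched below. The class name, threshold, and reset behavior are all illustrative assumptions, not RocksDB internals: the point is simply to accumulate evidence that reads are probing too many files before paying for a compaction.

```cpp
#include <cstdint>

// Hypothetical heuristic: count reads that had to consult more than one
// file, and signal a compaction once that happens often enough.
class ReadTriggeredCompaction {
 public:
  explicit ReadTriggeredCompaction(uint64_t threshold)
      : threshold_(threshold) {}

  // Called after each read with the number of files probed to find the key.
  // Returns true when enough multi-file reads have accumulated that
  // compacting the overlapping files is likely to pay for itself.
  bool RecordRead(uint64_t files_probed) {
    if (files_probed > 1) ++multi_file_reads_;
    if (multi_file_reads_ >= threshold_) {
      multi_file_reads_ = 0;  // reset after scheduling a compaction
      return true;
    }
    return false;
  }

 private:
  uint64_t threshold_;
  uint64_t multi_file_reads_ = 0;
};
```

A fuller design would also track which files are repeatedly probed together, so the triggered compaction targets exactly that overlapping set.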
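The tiered-storage bullet can also be made concrete with a small sketch. This is not RocksDB's persistent cache code: it just shows one plausible lookup path where tiers are ordered from fastest (local SSD) to slowest (spinning disk), and a block found in a slower tier is promoted into the faster ones.

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// Illustrative multi-tier persistent cache: tier 0 is the fastest.
class TieredCache {
 public:
  explicit TieredCache(size_t num_tiers) : tiers_(num_tiers) {}

  void Insert(size_t tier, const std::string& block_id,
              const std::string& data) {
    tiers_[tier][block_id] = data;
  }

  std::optional<std::string> Lookup(const std::string& block_id) {
    for (size_t t = 0; t < tiers_.size(); ++t) {
      auto it = tiers_[t].find(block_id);
      if (it == tiers_[t].end()) continue;
      // Hit in tier t: promote the block into all faster tiers.
      for (size_t f = 0; f < t; ++f) tiers_[f][block_id] = it->second;
      return it->second;
    }
    return std::nullopt;  // miss: caller must fetch from network storage
  }

  bool InTier(size_t tier, const std::string& block_id) const {
    return tiers_[tier].count(block_id) > 0;
  }

 private:
  std::vector<std::map<std::string, std::string>> tiers_;
};
```

The interesting design questions hide behind this sketch: per-tier capacity and eviction, and whether promotion should be on first hit or only for blocks that prove to be hot.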
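Finally, the intended semantics of the versioned-store idea can be sketched as follows. Inside RocksDB this would more likely be built on sequence numbers plus a compaction filter that preserves the last n versions; this map-based model is only meant to pin down the behavior being asked for.

```cpp
#include <cstddef>
#include <deque>
#include <map>
#include <string>

// Illustrative versioned store: keep up to n most recent values per key.
class VersionedStore {
 public:
  explicit VersionedStore(size_t n) : n_(n) {}

  void Put(const std::string& key, const std::string& value) {
    std::deque<std::string>& versions = data_[key];
    versions.push_front(value);                     // newest at the front
    if (versions.size() > n_) versions.pop_back();  // drop the oldest
  }

  // version 0 is the latest value, 1 the one before it, and so on.
  // Returns an empty string for unknown keys or evicted versions.
  std::string Get(const std::string& key, size_t version = 0) const {
    auto it = data_.find(key);
    if (it == data_.end() || version >= it->second.size()) return "";
    return it->second[version];
  }

 private:
  size_t n_;
  std::map<std::string, std::deque<std::string>> data_;
};
```

With n = 2, writing v1, v2, v3 to a key leaves v3 and v2 readable while v1 has been discarded, which is exactly the retention contract the bullet describes.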
In a future blog post, I will describe some of the advantages of running RocksDB on the cloud (e.g. AWS, Azure, etc). Stay tuned.