I have been working on an open source project called RocksDB-Cloud for the last few months. What does it do and why did we build it?
You might be using RocksDB on your application servers or database servers on a public cloud service like AWS or Azure. RocksDB is a high-performance embedded storage engine. But when your server machine instance dies and restarts on a different machine in the cloud, you lose all your data. This is painful, and many engineering teams have built their own custom replication code around RocksDB to protect against this scenario. RocksDB-Cloud provides a ready-made solution to this problem so that you do not have to write any custom code to make RocksDB data durable and available.
RocksDB-Cloud provides three main advantages for Cloud environments:
- A RocksDB-Cloud instance is durable. It continuously and automatically replicates db data and metadata to cloud storage (e.g. AWS S3). In the event that the RocksDB-Cloud machine dies, another process on any other EC2 machine can reopen the same RocksDB-Cloud database.
- A RocksDB-Cloud instance is cloneable. RocksDB-Cloud supports a primitive called zero-copy-clone that allows another instance of RocksDB-Cloud on another machine to clone an existing db. Both master and slave RocksDB-Cloud instances can run in parallel, and they share a common set of database files.
- A RocksDB-Cloud instance automatically places hot data on SSD and cold data in cloud storage. The entire database storage footprint need not be resident on costly SSD: cloud storage holds the entire database, while local storage holds only the files that are in the working set.
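The zero-copy-clone idea above can be illustrated with a toy model. This is not the RocksDB-Cloud API — all names here are hypothetical — but it shows why immutable sst files make cloning cheap: a clone only has to record which files its parent referenced, and new writes on either instance create fresh files without disturbing the shared ones.

```python
# Toy model of zero-copy-clone. All names are illustrative, not the
# actual RocksDB-Cloud API. Because sst files are immutable, a clone
# can share its parent's files by reference; no data is copied.

class ToyCloudDb:
    def __init__(self, name, files=None):
        self.name = name
        self.files = set(files or [])   # sst files referenced in cloud storage

    def flush(self, filename):
        # A flush/compaction creates a brand-new immutable file;
        # existing files are never modified in place.
        self.files.add(filename)

    def clone(self, name):
        # Zero-copy: the clone starts out referencing the parent's
        # current set of files; nothing is copied in cloud storage.
        return ToyCloudDb(name, self.files)

master = ToyCloudDb("master")
master.flush("000001.sst")
master.flush("000002.sst")

slave = master.clone("slave")   # shares both files, copies nothing
master.flush("000003.sst")      # diverges: new file only on master
slave.flush("000004.sst")       # diverges: new file only on slave

shared = master.files & slave.files
```

After the clone point, the two instances diverge independently while continuing to share the files that existed when the clone was taken.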
Compatibility with RocksDB
RocksDB-Cloud is API-compatible, data-format-compatible and license-compatible with RocksDB, which means that your applications do not have to change if you move from RocksDB to RocksDB-Cloud.
Workload categories
There is a category of workload where RocksDB is used to tail data from a distributed log storage system. In this case, the RocksDB write-ahead-log is switched off and the application log sits in front of the database. The RocksDB-Cloud library persists every new sst file to cloud storage. Reads occur by demand-paging relevant data blocks from cloud storage into a locally attached SSD-based persistent cache. This is shown in the following picture.
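The read path in this workload amounts to a demand-paged block cache in front of cloud storage. Here is a minimal sketch of that behavior — the names and the dict-backed "cloud" are assumptions for illustration, not the actual persistent-cache implementation:

```python
# Sketch of demand-paging reads: serve a block from the local SSD cache
# if present, otherwise fetch it from cloud storage and cache it.
# "cloud" is modeled as a plain dict; all names are illustrative only.

class DemandPagedCache:
    def __init__(self, cloud, capacity):
        self.cloud = cloud          # block_id -> bytes: the full database
        self.capacity = capacity    # max blocks resident on local SSD
        self.local = {}             # the working set (insertion-ordered)
        self.misses = 0

    def read(self, block_id):
        if block_id in self.local:              # hot: served from local SSD
            return self.local[block_id]
        self.misses += 1
        data = self.cloud[block_id]             # cold: page in from cloud storage
        if len(self.local) >= self.capacity:    # evict oldest to make room
            self.local.pop(next(iter(self.local)))
        self.local[block_id] = data
        return data

cloud = {i: ("block-%d" % i).encode() for i in range(100)}  # full db in cloud
cache = DemandPagedCache(cloud, capacity=10)

cache.read(7)   # miss: paged in from cloud storage
cache.read(7)   # hit: served from the local working set
```

Only the working set ever occupies local SSD; everything else stays in cloud storage until a read demands it.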
There is another category of workload where RocksDB is used as a read/write datastore. Applications can issue gets/puts to the datastore. In this case, RocksDB-Cloud persists the write-ahead-log to a cloud-based logging system like AWS Kinesis. A slave RocksDB-Cloud instance can be configured to tail this write-ahead-log and keep itself updated. The following picture shows how a clone instance brings itself up to date by cloning a base image from cloud storage and then keeping current by tailing the write-ahead-logs.
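The tailing mechanism can be sketched as follows. The cloud log service (e.g. AWS Kinesis) is modeled here as an in-memory list of records, and all class and function names are hypothetical — this is a sketch of the replication flow, not rocksdb-cloud code:

```python
# Sketch of a slave keeping itself up to date by tailing the
# write-ahead-log. The log service is modeled as a list of (key, value)
# records; names are illustrative only.

class WriteAheadLog:
    def __init__(self):
        self.records = []

    def append(self, key, value):
        self.records.append((key, value))

class Replica:
    def __init__(self, log, base_image=None):
        self.log = log
        self.data = dict(base_image or {})   # start from a cloned base image
        self.cursor = len(log.records)       # base image reflects the log so far

    def tail(self):
        # Apply every record written since the last call.
        while self.cursor < len(self.log.records):
            key, value = self.log.records[self.cursor]
            self.data[key] = value
            self.cursor += 1

log = WriteAheadLog()
master = {}

def put(key, value):        # master write path: apply locally, then log
    master[key] = value
    log.append(key, value)

put("a", 1)
slave = Replica(log, base_image=master)   # clone a base image
put("b", 2)                               # master moves ahead
slave.tail()                              # slave catches up via the WAL
```

The slave never talks to the master directly: it bootstraps from a base image and then replays the shared log, which is what lets clones come and go independently.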
These two workloads are described in greater detail in a set of architecture slides that I recently delivered at PerconaLive 2017.
The current implementation of RocksDB-Cloud supports only AWS S3, but wouldn't it be cool if we could have precisely the same API on Microsoft Azure? That is precisely what Min Wei, an engineer from Microsoft, is working on. We are working together to build something useful. Here is a set of slides that I recently delivered at PerconaLive 2017 that describes the high-level architecture of RocksDB-Cloud.
If you plan to tinker with RocksDB-Cloud, please start with this example. Come join us in our Google group and we can all hack together!
Simply superb! Interesting: rapid recovery on another machine without losing data.
Dear Dhruba,
Thank you for sharing this.
May I ask how I can build my career in MySQL? I am a MySQL expert, but I don't know as much as you do. Can you please guide me?
Essentially replication is offloaded to cloud storage such as S3. How is this faster than native replication built on top of the storage engine (such as in Cassandra)?
The reason you might want to use rocksdb-cloud is not because replication is quicker in rocksdb-cloud. You would use rocksdb-cloud because of its storage efficiency and reduced cost.
Storage efficiency comes from better packing of data in rocksdb. Reduced cost comes from the fact that you need to keep only one copy of your data in SSD/RAM, whereas you need to keep two or three copies of your data in SSD/RAM while using Cassandra.
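This cost argument can be made concrete with back-of-the-envelope arithmetic. The per-GB prices and the working-set fraction below are illustrative assumptions for the sketch, not actual AWS quotes:

```python
# Illustrative cost comparison between keeping full replicas on SSD
# versus one copy in object storage plus a working set on SSD.
# All prices and ratios here are assumptions, not real quotes.

DB_SIZE_GB = 1000        # total database size
HOT_FRACTION = 0.2       # assumed fraction of the db in the working set
SSD_PER_GB = 0.10        # assumed monthly $/GB for SSD-backed storage
S3_PER_GB = 0.023        # assumed monthly $/GB for object storage

# Cassandra-style: three full replicas, all resident on SSD.
cassandra_cost = 3 * DB_SIZE_GB * SSD_PER_GB

# RocksDB-Cloud-style: one full copy in cloud storage,
# only the working set resident on SSD.
rocksdb_cloud_cost = (DB_SIZE_GB * S3_PER_GB
                      + HOT_FRACTION * DB_SIZE_GB * SSD_PER_GB)
```

Under these assumed numbers the SSD-replica approach costs several times more per month, and the gap widens as the cold fraction of the database grows.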
I applied for membership in the Google group a few weeks ago, but nobody has accepted my request.
Thanks for sharing. This is an awesome feature.
Does this leverage the persistent read cache work done upstream in RocksDB or is the read cache here new?
Thanks Mark. It is the same persistent read-cache available in RocksDB.
Hi Dhruba,
Nice work! Thanks. RocksDB supports HDFS today. This means it can also support all Hadoop-supported cloud file systems (AWS, Azure, Google Cloud Storage...) with appropriate configuration or URI settings. Of course, the code in the Hadoop libraries is written in Java, not C++ as in rocksdb-cloud.
How does rocksdb-cloud compare to that in terms of features and performance?
Thanks for your answer.
Hi Dhruba, we are using RocksDB as a storage engine in the Voldemort NoSQL database. The database runs on our own servers. Can we use rocksdb-cloud to back up the data on our servers to AWS S3?
Yes, you can run rocksdb-cloud instead of rocksdb. Most of the APIs of the two systems are the same, so you would need minimal changes in your code. Please email me at @rockset.com with more questions.
Hey Dhruba, is there any plan to have Google Cloud Storage support for rocksdb-cloud? I saw some traction in #76 on GitHub, but it never moved forward. I would like to experiment with this on Google Cloud Platform. Do you have any idea when GCS will be supported?
It is likely that Azure support will arrive before GCS support, but I am unable to predict when GCS support via #76 will finally become available.
Hey Dhruba, is there any tech doc for implementing scalable rocksdb-cloud with replication covered?