Although I’m working on a new version of my benchmark that will scale to many more servers than the current one (you can expect a post about its new architecture soon), I wanted to publish the results of the old benchmark for Voldemort, as those tests were already nearly done. The data set is exactly the same and I ran the benchmark on the same cluster as before, so the results should be comparable with the others.

But first, a word of warning about directly comparing these results with those of the previous four databases. The four other databases are what I call elastic databases, meaning that you can add nodes to the cluster on the fly. With Voldemort the cluster size is static: you have to configure it for a certain size, declaring all the nodes that will participate in the cluster and which partitions of the data each of them will store. The direct consequence is that if you want to add a new node, you have to shut the cluster down, reconfigure it and then reinsert the data. To be sure to start with a working cluster, I started from an empty cluster for each cluster size. As the cluster was reinitialized for each size, the results cannot be directly compared, at least for the elastic part. In my opinion the raw performance comparison can still be interesting if you accept the assumption that 5 minutes is enough for the cluster to stabilize.

But enough talking, here are the results:

[Figure: Read/Update performances for Riak, Voldemort, mongoDB and Cassandra]

Looking at those results, Voldemort seems to be doing a relatively good job. It’s faster than Riak, but the standard deviation is sometimes quite large. It also seems to scale a little less well than the others, with “only” a 128% increase in performance, while the average for the others was close to a 150% increase, even though they are truly elastic and had to cope with nodes being added on the fly. I thought that starting with a clean cluster for each cluster size would be an advantage, as Voldemort did not have to reorganize the data by itself when a new node was added, but it looks like elastic systems can still do better.
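To make the two figures discussed above concrete, here is a minimal Python sketch of how a percentage increase and a standard deviation can be computed from throughput samples. It assumes the percentage compares throughput at two cluster sizes, which is one reading of the numbers above. The function names and all sample values are hypothetical placeholders, not measurements from this benchmark.

```python
from statistics import mean, stdev


def percent_increase(before, after):
    """Relative gain of `after` over `before`, expressed in percent."""
    return (after - before) * 100.0 / before


def summarize(samples):
    """Return (mean, sample standard deviation) of a list of measurements."""
    return mean(samples), stdev(samples)


if __name__ == "__main__":
    # Hypothetical placeholder values (operations per second), NOT benchmark data.
    small_cluster_runs = [950.0, 1000.0, 1050.0]
    large_cluster_runs = [2100.0, 2280.0, 2460.0]

    small_avg, small_sd = summarize(small_cluster_runs)
    large_avg, large_sd = summarize(large_cluster_runs)

    print("small cluster: mean=%.1f sd=%.1f" % (small_avg, small_sd))
    print("large cluster: mean=%.1f sd=%.1f" % (large_avg, large_sd))
    print("increase: %.1f%%" % percent_increase(small_avg, large_avg))
```

With this definition, a 128% increase means the larger cluster delivers 2.28 times the throughput of the smaller one, not 1.28 times.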

Finally, here is the configuration I used for my Voldemort cluster. Please note that I used the Python script “generate_partitions.py” with two partitions per node at each cluster size. The implementation of the Voldemort client is available here.
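For readers who have not seen a static partition layout, the following is a minimal Python sketch of the idea: every node is declared up front with a fixed list of partition ids, two per node in this case. It is not the actual “generate_partitions.py” script. The function names, host names and port numbers are hypothetical, and the XML element names only follow my understanding of Voldemort's cluster.xml format, so check them against the official documentation before using them.

```python
import random


def assign_partitions(num_nodes, partitions_per_node=2, seed=None):
    """Shuffle all partition ids and hand a fixed number to each node.

    Returns a dict {node_id: sorted list of partition ids}.
    """
    total = num_nodes * partitions_per_node
    ids = list(range(total))
    random.Random(seed).shuffle(ids)
    return {
        node: sorted(ids[node * partitions_per_node:(node + 1) * partitions_per_node])
        for node in range(num_nodes)
    }


def render_cluster_xml(assignment, name="benchmark", host_pattern="node{}.example.com"):
    """Render the assignment as cluster.xml-style text.

    Element names, hosts and ports are assumptions for illustration only.
    """
    lines = ["<cluster>", "  <name>{}</name>".format(name)]
    for node_id, parts in sorted(assignment.items()):
        lines += [
            "  <server>",
            "    <id>{}</id>".format(node_id),
            "    <host>{}</host>".format(host_pattern.format(node_id)),
            "    <http-port>8081</http-port>",
            "    <socket-port>6666</socket-port>",
            "    <partitions>{}</partitions>".format(", ".join(map(str, parts))),
            "  </server>",
        ]
    lines.append("</cluster>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Three nodes with two partitions each, i.e. six partitions in total.
    print(render_cluster_xml(assign_partitions(3, partitions_per_node=2, seed=42)))
```

Because the layout is fixed in the configuration, changing `num_nodes` produces a different file, which is why each cluster size in the benchmark had to start from an empty, freshly configured cluster.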

Do not hesitate to give me your feedback and comments if you think anything was done incorrectly or if there is something you don’t understand.