NoSQL benchmarking and analysis

Paper for Cloud Computing 2011

The paper I wrote with a few other people for Cloud Computing 2011, titled “Measuring Elasticity for Cloud databases”, has been accepted, and I will present it in Rome on September 27. The complete paper can be downloaded here. Update: This paper has been awarded as one of […]

Study and Comparison of Elastic Cloud Databases: Myth or Reality?

During my last year in computer engineering, I wrote my Master’s thesis on the subject “Study and Comparison of Elastic Cloud Databases: Myth or Reality?”. It is the culmination of my whole year’s work on cloud databases; a few parts have also been posted and discussed on this blog. The whole […]

Paper on elasticity and scalability for ACM SOCC 2011

In parallel with my master’s thesis, I have, with the help of a few other people, written a paper for the ACM Symposium on Cloud Computing. The paper describes the methodology, infrastructure and configuration used, as well as the results obtained for Cassandra, HBase and mongoDB. It can be downloaded here. The goal of this […]

Updated benchmark methodology

As my work on the benchmark progressed, I needed to update my methodology to fit my new needs and to take into account the feedback I received. First, here is a brief reminder of the basis of my benchmark. It was inspired by Wikipedia because it can provide me with a lot of real […]

New results for Cassandra 0.7.2

As I updated my benchmark to work with a more up-to-date HBase version, I thought that I had to do the same for the other databases if I wanted to be fair. Moreover, I had some problems with the Cassandra implementation of MapReduce in version 0.6.10 (you can read this post to […]

HBase 0.90.0 configuration and MapReduce

As I promised in a previous post, this one explains how I configured HBase 0.90.0 for the latest tests and shares a few observations about my experience with MapReduce on this HBase version. First, on the configuration side, there are a few modifications worth noting: I have increased the memory allocated to HBase to […]

New results for HBase 0.90.0

The first results I published were not in favor of HBase: performance for both read/update and MapReduce decreased with the size of the cluster and was very unstable. I spent a lot of time trying to figure out what the problem was, and I think I have finally found what could be the […]

A scalable benchmark architecture

The first version of my benchmark simply consisted of a single client that started as many threads as there were nodes in the cluster. It is easy to see that this approach cannot scale to a large number of nodes: the client would be overloaded very quickly. There were two ways to use […]

Voldemort benchmark results and configuration

Even though I’m working on a new version of my benchmark that will scale to many more servers than the current one (you can expect a post about its new architecture soon), I wanted to publish the results of the old benchmark for Voldemort, as those tests were already nearly done. The data set used […]

mongoDB configuration

This post explains how I configured mongoDB as well as how I store data inside it. Please also read this post. For mongoDB, I don’t have any text configuration files, so I will explain how I set it up. First, for my cluster of size 3, I start a shard on each server with […]