NoSQLBenchmarking.com

NoSQL benchmarking and analysis

Entries for February, 2011

New results for Cassandra 0.7.2

As I updated my benchmark to work with a more up-to-date HBase version, I thought that I had to do the same for the other databases if I wanted to be fair. Moreover, I had some problems with the Cassandra implementation of MapReduce on the 0.6.10 version (you can read this post to [...]

HBase 0.90.0 configuration and MapReduce

As I promised in a previous post, this one explains how I configured HBase 0.90.0 for the last tests and shares a few observations about my experience with MapReduce on this HBase version. First, on the configuration side, there are a few modifications worth noting: I have increased the memory allowed to HBase to [...]

New results for HBase 0.90.0

The first results I published were not in favor of HBase: performance for both read/update and MapReduce decreased as the cluster grew and was very unstable. I spent a lot of time trying to figure out what the problem was, and I think I have finally found what could be the [...]

A scalable benchmark architecture

The first version of my benchmark simply consisted of a single client that started as many threads as there were nodes in the cluster. It is easy to see that this approach cannot scale to a large number of nodes: the client would be overloaded very quickly. There were two ways to use [...]

Voldemort benchmark results and configuration

Even though I’m working on a new version of my benchmark that will scale to many more servers than the current one (you can expect a post about its new architecture soon), I wanted to publish the results of the old benchmark for Voldemort, as those tests were already nearly done. The data set used [...]

mongoDB configuration

This post explains how I have configured mongoDB as well as how I store data inside it. Please also read this post. For mongoDB, I don’t have any text configuration files, so I will explain how I set it up. First, for my cluster of size 3, I start a shard on each server with [...]

Riak configuration

This post explains how I have configured Riak as well as how I store data inside it. Please also read this post. The Riak configuration files can be downloaded here. They are very close to the default settings, except for the mandatory information needed to set up the cluster. I’m storing data inside Riak in the most [...]

HBase and Hadoop configuration

This post explains how I have configured Hadoop and HBase as well as how I store data inside HBase. Please also read this post. To run an HBase cluster, you first have to configure the underlying Hadoop HDFS that will store the data. The configuration files can be downloaded here. Those are the configuration file [...]

Cassandra configuration

This post explains how I store data inside Cassandra and how it was configured for my tests. Please also read this post. As with the other databases, I have mostly used the default configuration. In this case I have used the default “Keyspace1” keyspace with the default column family “Standard1”. The column path is always the [...]

About the cluster infrastructure and databases configurations

Benchmarks are the kind of test that always creates discussion, because there is often at least one of the tested systems that could have been configured better or run in a better environment. Another thing that creates a lot of discussion is the fact that a benchmark (especially very simple ones like [...]