A scalable benchmark architecture
The first version of my benchmark simply consisted in a single client that started as many thread as there was node in the cluster. It is kind of trivial to see that this approach cannot scale to a large number of nodes, the client would be overloaded very fast.
There was two way to use the benchmark, with systems that does not include a load balancer like Cassandra or Riak on the left and for systems with a load balancer like HBase on the right :
The solution to this problem is quite simple, I must multiply the number of clients. Of course I would need some kind of controller to send the commands to all the clients and then get back the results from all the clients. I have implemented this idea using sockets to implements simple client/server communications between the clients and the benchmark controller. With this new architecture, the two previous ways of using the benchmark become :
This new architecture coupled to a way bigger data set (all the 10 millions article from Wikipedia in English) should be enough to bring my benchmark to a more realistic level. I plan to to test it on the Amazon EC2 infrastructure, on hundreds of machines.
The code of this new architecture is already available on github in the “scalable” branch.


I don’t mean to disprove your work (you most likely have tested it and the problems you encountered are real), but I’m not sure this new architecture is really necessary.
On the YCSB paper it is stated that the client was run with up to 500 threads and this wasn’t a bottleneck, as most of the time the client was idle waiting for the database to reply.
Still, if you want to have more concurrent requests, multiple clients might be necessary.
I just want you to shed some light on this because I’m really interested in your work (I’m also an MSc student researching NoSQL
Hello André! In fact the biggest cluster that I have used for now is made of only 8 servers, so I did not run into those problem for now. But with only 8 clients, I have already seen that a 100Mb link is not enough for the client. So it is mainly for bandwidth that I wanted to multiply the number of clients. I’m also planning to do a LOT of concurrent requests, so with this new architecture, I know I can scale if I need to. For the moment I don’t know how many clients I will use for big cluster, this is still an open question.
Thank you for your input, this is very appreciated! The goal of this website is also to get the more feedback I can get