Cassandra and Voldemort Benchmarked
kreide33 writes "Key/Value storage systems are gaining in popularity, largely because of features such as easy scalability and automatic replication. However, there are several to choose from, and performance is an important deciding factor. This article compares the performance of two of the most well-known projects, Cassandra and Voldemort, using several different mixes of access types, and compares both throughput and latency."
No Winner (Score:5, Informative)
I'd like to have seen them run MySQL, PostgreSQL or SQLite through the same tests so we could see how these NoSQL solutions compared.
Re: (Score:2)
Digg is going to no-sql, for example. They released some of their mysql schema/code and it was poorly designed (bad indexing, manual joins, braindead queries). They chose to go with no-sql because they're clueless retards.
Too lazy to rephrase, so here it is straight from RFC 1925:
Some things in life can never be fully appreciated nor understood unless experienced firsthand. Some things in networking can never be fully understood by someone who neither builds commercial networking equipment nor runs an operational network.
In general I'm very, very cautious when criticizing production code. After all, it works.
Re: (Score:2, Informative)
I wouldn't have mentioned it if it wasn't pure shit. 1.5 seconds for a query that should be 3-4 disk blocks at most?
Re: (Score:2)
The guys at Digg fully admit that they could spend their days tuning MySQL to achieve the performance they need. What is important to realize is that it costs real money and time to perform that tuning: time and money that could be better spent improving the user experience of the website.
Cassandra, on the other hand, performs optimally no matter what the developers throw at it, without the need to tune every last detail to squeeze every last bit of performance out of it. As the site grows, if the cluster i
Re: (Score:3, Informative)
Production code works ... until it doesn't.
I've seen a situation where half of the bug reports in our system were down to one badly conceived and shittily implemented module. But when I suggested binning it and doing it again properly, the answer was "but it works!".
Re: (Score:2)
> After all it works.
Since they're abandoning MySQL, apparently their schema didn't work so great...
Re: (Score:2)
Since they are alternative approaches to implementing a backend store for an information system, and the decision between key/value and relational technology is in many cases a bigger decision, with greater risk involved in making the wrong choice, than the decision between particular key/value or particular relational options (since the conversion between different systems using the same basic information model is cheaper than the conv
Re: (Score:2)
I read the brief descriptions of each system, and if there is any text outside of legalese that is as cotton-mouthed, fuzzy and unclear, I've not seen it.
Re: (Score:2)
> I'd like to have seen them run MySQL, PostgreSQL or SQLite through the same tests so we could see how these NoSQL solutions compared.
That wouldn't have made any sense given the replication scheme used: "N=3 (replicas for each entry), R=2 (nodes to wait for on each read), W=2 (nodes to block for on each write)". It's hard to translate that into the sort of replication features available in the other databases you mentioned.
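For anyone wondering what the N/R/W numbers buy you, here is a rough Python sketch. It is a toy model, not code from Cassandra or Voldemort: the function names and the in-memory "replicas" are made up for illustration, and I'm assuming each value carries a simple version number. The point is that when R + W > N, every read set overlaps every write set, so a read always sees the newest acknowledged write.

```python
from itertools import combinations

N, R, W = 3, 2, 2  # the settings quoted from the benchmark


def quorums_always_overlap(n, r, w):
    """True if every possible read set shares a node with every write set."""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


def write(replicas, node_ids, key, value, version):
    """Store (version, value) on the nodes that acknowledged the write."""
    for i in node_ids:
        replicas[i][key] = (version, value)


def read(replicas, node_ids, key):
    """Ask the given nodes and return the value with the highest version."""
    answers = [replicas[i][key] for i in node_ids if key in replicas[i]]
    if not answers:
        return None
    return max(answers, key=lambda a: a[0])[1]


# Toy cluster: three empty in-memory replicas (hypothetical stand-ins).
replicas = [{} for _ in range(N)]
write(replicas, (0, 1), "k", "old", version=1)  # W=2 nodes acknowledged
write(replicas, (1, 2), "k", "new", version=2)  # a different W=2 nodes

# Any R=2 nodes include at least one holder of version 2.
results = [read(replicas, rs, "k") for rs in combinations(range(N), R)]
```

With R=1 or W=1 on three replicas the overlap guarantee goes away, which is why a single-master SQL setup doesn't map cleanly onto these settings.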
Also, these tests focused on individual put/get operations, where a standard database is going to get creamed no matter what. You'd need to include something that had a higher-level query component to it than that to
Re: (Score:3, Informative)
And what about memcached [memcached.org]? It's a simple key/value object database. What about an "associative array", isn't that basically a key/value database? I don't see what the hype is about.
Drat (Score:2)
Re: (Score:1, Funny)
You mean "He-Who-Must-Not-Be-Named"
Re: (Score:2)
http://www.democraticunderground.com/discuss/duboard.php?az=view_all&address=433x266988 [democratic...ground.com]
Re:Drat (Score:5, Funny)
Did anyone else read this as comparing Cassandra from King's Quest and Voldemort from Harry Potter?
I was expecting something about Cassandra producing a bunch of warnings in log files that no one ever bothers to read, and Voldemort having various problems managing the child processes in the cluster (mostly being unable to kill or reap them).
Re: (Score:2)
Wouldn't the problem be rather that Voldemort would keep killing child processes randomly?
Re:Drat (Score:4, Funny)
Re: (Score:1)
Maybe so, but Cassandra is sexier, and Voldemort is just plain evil.
Re: (Score:2)
Re: (Score:1)
I kinda wonder why it's not possible to use these projects as backends for mysql and postgres. Seems to me that shouldn't be that hard an exercise.
Or even having these as mountable volumes.
Re: (Score:2)
You could, but as soon as you try to implement the features of SQL that they lack on top of them, you'll end up making them perform far worse than existing backends that are designed from the ground up to provide these features, so what would be the point?
Key compression... (Score:2)
Are there ANY open source key/value stores that support prefix compression?
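To be clear about what I mean by prefix compression, here is a minimal Python sketch of front coding over sorted keys. It is not the on-disk format of any particular store; the function names and the (shared length, suffix) layout are my own assumptions for illustration. Each key is stored as the number of leading characters it shares with the previous key, plus the remaining suffix.

```python
import os


def prefix_compress(sorted_keys):
    """Front-code a sorted list of string keys as (shared_len, suffix) pairs."""
    entries = []
    prev = ""
    for key in sorted_keys:
        # os.path.commonprefix compares character by character.
        shared = len(os.path.commonprefix([prev, key]))
        entries.append((shared, key[shared:]))
        prev = key
    return entries


def prefix_decompress(entries):
    """Rebuild the original keys from (shared_len, suffix) pairs."""
    keys = []
    prev = ""
    for shared, suffix in entries:
        key = prev[:shared] + suffix
        keys.append(key)
        prev = key
    return keys


keys = ["user:1000", "user:1001", "user:1002:name"]
packed = prefix_compress(keys)
```

The savings depend entirely on how much adjacent keys have in common, so it only pays off when keys are kept in sorted order.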
Silly question (Score:4, Interesting)
Is a key/value system a database with just one table that has one key field and one non-key field?
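As far as the data model goes, something like this rough Python sketch is what I have in mind. It uses the standard sqlite3 module; the class name, table name and column names are made up for illustration. It only mimics the put/get interface and says nothing about the scalability and replication features the summary mentions.

```python
import sqlite3


class OneTableKV:
    """A key/value interface over one table with a key column and a value column."""

    def __init__(self):
        # In-memory database, purely for illustration.
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v BLOB)")

    def put(self, key, value):
        # INSERT OR REPLACE overwrites any existing row with the same key.
        self.db.execute(
            "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
        )

    def get(self, key):
        row = self.db.execute(
            "SELECT v FROM kv WHERE k = ?", (key,)
        ).fetchone()
        return row[0] if row else None


store = OneTableKV()
store.put("player:1", b"name=Gandalf")
store.put("player:1", b"name=Merlin")  # same key, value replaced
```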
Re: (Score:2)
mArray[name]
mArray[stats]
They could also be multi-tiered mappings:
mPlayer[data][name], mPlayer[data][stats]
DGD and LPMuds have done mappings/arrays for ~20 years. The underlying DGD core is C++ and the interpreted language is C-like. The underlying core of most other LPMuds is C, with a C-like interpreted language.
Mappings and Compiled (Data) Objects were extremely useful in DGD. Named arrays with dec
Re: (Score:2)
Re: (Score:2)
That sounds like a tree. Like LDAP, for instance, which has been doing this with extremely high performance, with replication, etc. for decades ;) These are all solved problems; new copies of the same thing come out every 10 years in a cycle, and all the new kids don't realize that it already existed, came to full maturity and was bought by IBM long ago. IBM has a product that will solve everyone's problems if you'd just call them. But the kids like to go it alone, as if the problem of indexing a few million w
Re: (Score:2)
It’s not a tree. It’s a graph. A tree is a graph’s retarded incomplete brother.
I wish people would stop using trees, and use full ontologies instead. Trees only create problems: in file systems, in OO class hierarchies, in categories and tags, etc.
Re: (Score:2)
Re:Silly question - Couchdb (Score:1)
Try couchdb if you want to select ranges.
Its keys are stored in a B-tree, so selecting ranges of values is a core use case.
The view system also uses the same mechanism, so by having a cached view you can emit any key you like per record, and grab individual or ranges of values.
Nifty. :-)
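To show why ordered key storage makes range selection cheap, here is a small Python sketch. It is not CouchDB's API: the class and method names are made up, and a sorted in-memory list stands in for whatever ordered index a real store keeps on disk. Finding both ends of the range is a binary search, and everything in between is one contiguous slice.

```python
from bisect import bisect_left, bisect_right


class SortedKV:
    """Toy key/value store that keeps its keys in sorted order."""

    def __init__(self):
        self._keys = []    # sorted list of keys
        self._values = {}  # key -> value

    def put(self, key, value):
        if key not in self._values:
            # Insert the new key at its sorted position.
            self._keys.insert(bisect_left(self._keys, key), key)
        self._values[key] = value

    def get(self, key):
        return self._values.get(key)

    def range(self, start, end):
        """Return (key, value) pairs with start <= key <= end, in key order."""
        lo = bisect_left(self._keys, start)
        hi = bisect_right(self._keys, end)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]


store = SortedKV()
for k in ["2010-05-03", "2010-05-01", "2010-05-08", "2010-05-05"]:
    store.put(k, "post from " + k)

first_week = store.range("2010-05-02", "2010-05-06")
```

A plain hash-based store can't do this without scanning every key, which is the practical difference.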
Well, Cassandra does better. (Score:3, Funny)
Re: (Score:2)
fawkes.hogwarts.edu # su - voldemort
Password:
2010-05-08 16:08:45 have you hugged your death eater today? alias avada_kedavra
kill -9
2010-05-08 16:08:45 have you hugged your death eater today?
Oracle/MySQL - Voldemort (Score:2)
File system? (Score:2)
Key-value storage? That sounds like the ordinary file system to me.
Re: (Score:2)
Not a particularly useful use of the inode table. The filesystem is great for a few hundred or even a few thousand records, but when you're dealing with billions of records, that adds up to a lot of wasted space.
Comparison Against Established Systems? (Score:2)
I am in a bit of a rush, so I can't netgrep for it myself right now, but I am curious how these new contenders stack up against more established key-value stores such as Berkeley DB and GDBM. Has anyone run the benchmarks?