Databases

Cassandra and Voldemort Benchmarked 45

Posted by timothy
from the rifling-the-file-cabinet dept.
kreide33 writes "Key/Value storage systems are gaining in popularity, largely because of features such as easy scalability and automatic replication. However, there are several to choose from, and performance is an important deciding factor. This article compares the performance of two of the most well-known projects, Cassandra and Voldemort, using several different mixes of access types, and compares both throughput and latency."

  • No Winner (Score:5, Informative)

    by WrongSizeGlass (838941) on Saturday May 08, 2010 @02:11PM (#32140228)
    Their conclusion was that there was "no clear winner". Not surprising. Both of these products are in their early stages of development (Voldemort v0.80.1, Cassandra 0.6.0-beta3), and their developers will certainly work on optimization and performance issues once the products are stable.

    I'd like to have seen them run MySQL, PostgreSQL or SQLite through the same tests so we could see how these NoSQL solutions compared.
    • by mrmeval (662166)

      I read the brief descriptions of each system, and if there is any text outside of legalese that is as cotton-mouthed, fuzzy and unclear, I've not seen it.

    • by greg1104 (461138)

      I'd like to have seen them run MySQL, PostgreSQL or SQLite through the same tests so we could see how these NoSQL solutions compared.

      That wouldn't have made any sense given the replication scheme used: "N=3 (replicas for each entry), R=2 (nodes to wait for on each read), W=2 (nodes to block for on each write)". It's hard to translate that into the sort of replication features available in the other databases you mentioned.
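      To make the quoted N/R/W setting concrete, here is a minimal sketch (not taken from the article or from either project) that brute-forces why R=2, W=2 on N=3 replicas means every read set shares at least one node with every write set. The function name is hypothetical; only the three numbers come from the benchmark description.

```python
from itertools import combinations

# Values quoted from the benchmark setup; everything else is illustrative.
N, R, W = 3, 2, 2


def quorums_always_overlap(n: int, r: int, w: int) -> bool:
    """Brute-force check that every r-node read set intersects every w-node write set."""
    nodes = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(nodes, r)
        for write_set in combinations(nodes, w)
    )


# The brute-force result should agree with the usual rule of thumb R + W > N.
print(quorums_always_overlap(N, R, W), R + W > N)
```

      With R=1, W=1 the same check fails, which is the kind of tunable trade-off that has no direct equivalent in a single-node SQL setup.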

      Also, these tests focused on individual put/get operations, where a standard database is going to get creamed no matter what. You'd need to include something that had a higher-level query component to it than that to

    • Re: (Score:3, Informative)

      by inKubus (199753)

      And what about memcached [memcached.org]? It's a simple key/value object database. What about an "associative array", isn't that basically a key/value database? I don't see what the hype is about.

  • Did anyone else read this as comparing Cassandra from King's Quest and Voldemort from Harry Potter?
  • Are there ANY open source key/value stores that support prefix compression?

  • Silly question (Score:4, Interesting)

    by Hognoxious (631665) on Saturday May 08, 2010 @03:28PM (#32140822) Homepage Journal

    Is a key/value system a database with just one table that has one key field and one non-key field?

    • AFAIK it's akin to a Mapping/Hash (array), ie:
      mArray[name] := ({ "Crash" }) or
      mArray[stats] := ({ ({ "STR", 10 }), ({ "DEX", 12 }) })

      They could also be multi-tiered mappings:
      mPlayer[data][name], mPlayer[data][stats]
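      For readers who don't know LPC, roughly the same idea can be sketched in Python with nested dictionaries. This is only an analogy: the variable `m_player` and the helper `lookup` are made up for illustration, and the data simply mirrors the mapping shown above.

```python
# Hypothetical data mirroring the LPC-style mappings above.
m_player = {
    "data": {
        "name": ["Crash"],
        "stats": [["STR", 10], ["DEX", 12]],
    }
}


def lookup(mapping, *keys):
    """Follow a chain of keys through nested mappings, one level per key."""
    value = mapping
    for key in keys:
        value = value[key]
    return value


print(lookup(m_player, "data", "name"))
print(lookup(m_player, "data", "stats"))
```

      Each level is just another key/value lookup, which is what "multi-tiered" amounts to here.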

      DGD and LPmuds have done mappings/arrays for ~20 years. The underlying DGD core is C++ and the interpreted language is C-like. The underlying core of most other LPMuds is C, and the interpreted language is C-like as well.

      Mappings and Compiled (Data) Objects were extremely useful in DGD. Named arrays with dec
    • At the simplest level yes, but Cassandra (for example) is more like a multi-dimensional hashmap, e.g. key-value where the value points to another key-value and so on, so you can reference values such as: SomeApp.Users[UserID][username]=bob. The advantage of this is being able to sort by time, alphabetically, etc., and therefore handle sorted pagination from the key/value listings. The main advantage, though, is that you can literally just plug in more systems and have it scale horizontally without any extra work, unlike d
      • by inKubus (199753)

        That sounds like a tree. Like LDAP, for instance, which has been doing this with extremely high performance, with replication, etc. for decades ;) These are all solved problems; new copies of the same thing come out every 10 years in a cycle, and all the new kids don't realize that it already existed, came to full maturity and was bought by IBM long ago. IBM has a product that will solve everyone's problems if you'd just call them. But the kids like to go it alone, as if the problem of indexing a few million w

        • It’s not a tree. It’s a graph. A tree is a graph’s retarded incomplete brother.

          I wish people would stop using trees, and use full ontologies instead. It only creates problems. In file systems. In OO class hierarchies, in categories and tags, etc.

    • by adamchou (993073)
      When you say "database", I imagine you're referring to the traditional relational database. I've never used Cassandra or Voldemort, but I have used memcachedb and tokyodb, and the one major difference is that you can't select on ranges in a key/value system. You can't select all keys > 100 or keys 100 - 500, etc.
      • Try couchdb if you want to select ranges.

        Its keys are stored in a B-tree, so selecting ranges of values is a core use case.
        The view system also uses the same mechanism, so by having a cached view you can emit any key you like per record, and grab individual or ranges of values.

        Nifty. :-)
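        The difference between "can't select ranges" and "ranges are a core use case" comes down to whether the keys are kept in order. Below is a toy sketch, assuming nothing about how any of the stores named in this thread are actually implemented: `SortedKV` is a made-up class that keeps a sorted key list next to a plain dict, so a range like "keys 100 - 500" becomes two binary searches instead of a full scan.

```python
import bisect


class SortedKV:
    """Toy key/value store (hypothetical) that keeps its keys sorted."""

    def __init__(self):
        self._keys = []    # sorted list of keys, used for range lookups
        self._values = {}  # plain hash map, used for point lookups

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key):
        return self._values[key]

    def range(self, low, high):
        """Return (key, value) pairs with low <= key <= high, in key order."""
        start = bisect.bisect_left(self._keys, low)
        end = bisect.bisect_right(self._keys, high)
        return [(k, self._values[k]) for k in self._keys[start:end]]


store = SortedKV()
for k in (50, 100, 250, 500, 900):
    store.put(k, f"row-{k}")

print(store.range(100, 500))
```

        A pure hash-based store only has the `_values` half of this, which is why it can answer `get` quickly but has to look at every key to answer a range query.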

  • by Chas (5144) on Saturday May 08, 2010 @05:14PM (#32141560) Homepage Journal
    Until Voldy pulls that whole Avada Kedavra thing...
    • by blair1q (305137)

      fawkes.hogwarts.edu # su - voldemort
      Password:
      2010-05-08 16:08:45 have you hugged your death eater today? alias avada_kedavra
      kill -9
      2010-05-08 16:08:45 have you hugged your death eater today?

  • A very large company I know is moving from Oracle/MySQL to Voldemort for certain parts of their system. The two they evaluated were Cassandra and Voldemort.
  • Key-value storage? That sounds like the ordinary file system to me.

    • by Xeriar (456730)

      Not a particularly useful use of the inode table. The filesystem is great for a few hundred or even a few thousand records, but when you're dealing with billions of records, that adds up to a lot of wasted space.

  • I am in a bit of a rush, so I can't netgrep for it myself right now, but I am curious how these new contenders stack up against more established key-value stores such as Berkeley DB and GDBM. Has anyone run the benchmarks?
