A Tale of Two Databases, Revisited: DynamoDB and MongoDB

Posted by Unknown Lamer
from the there-can-be-only-fifty-or-so dept.
Questioning his belief in relational database dogma, new submitter Travis Brown happened to evaluate Amazon's DynamoDB and MongoDB. His situation was the opposite of Jeff Cogswell's: he started off wanting to prefer DynamoDB, but came to the conclusion that the benefits of Amazon managing the database for him didn't outweigh the features Mongo offers. From the article: "DynamoDB technically isn't a database, it's a database service. Amazon is responsible for the availability, durability, performance, configuration, optimization and all other manner of minutia that I didn't want occupying my mind. I've never been a big fan of managing the day-to-day operations of a database, so I liked the idea of taking that task off my plate. ... DynamoDB only allows you to query against the primary key, or the primary key and range. There are ways to periodically index your data using a separate service like CloudSearch, but we are quickly losing the initial simplicity of it being a database service. ... However, it turns out MongoDB isn't quite as difficult as the nerds had me believe, at least not at our scale. MongoDB works as advertised and auto-shards and provides a very simple way to get up and running with replica sets." His weblog entry has a few code snippets illustrating how he came to his conclusions.
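
The query restriction quoted in the summary can be pictured with a toy in-memory table. This is only a sketch under stated assumptions: `KeyValueTable`, `put`, `query`, and `scan` are hypothetical names invented for illustration, not the DynamoDB or MongoDB API, and the sample items are made up. It shows why a lookup by hash key (optionally narrowed by range key) is cheap, while a question about any other attribute has to examine every item.

```python
from bisect import bisect_left, bisect_right


class KeyValueTable:
    """Toy table addressable only by (hash_key, range_key). Hypothetical, not a real API."""

    def __init__(self):
        # hash_key -> list of (range_key, item), kept sorted by range_key
        self._partitions = {}

    def put(self, hash_key, range_key, item):
        rows = self._partitions.setdefault(hash_key, [])
        keys = [rk for rk, _ in rows]
        i = bisect_left(keys, range_key)
        if i < len(rows) and rows[i][0] == range_key:
            rows[i] = (range_key, item)      # overwrite an existing key
        else:
            rows.insert(i, (range_key, item))

    def query(self, hash_key, range_from=None, range_to=None):
        """Key-based access: one partition, then a slice of its sorted range keys."""
        rows = self._partitions.get(hash_key, [])
        keys = [rk for rk, _ in rows]
        lo = 0 if range_from is None else bisect_left(keys, range_from)
        hi = len(rows) if range_to is None else bisect_right(keys, range_to)
        return [item for _, item in rows[lo:hi]]

    def scan(self, predicate):
        """Any non-key question: every item in every partition is examined."""
        return [
            item
            for rows in self._partitions.values()
            for _, item in rows
            if predicate(item)
        ]


table = KeyValueTable()
table.put("alice", "2013-02-01", {"user": "alice", "date": "2013-02-01", "tag": "db"})
table.put("alice", "2013-02-20", {"user": "alice", "date": "2013-02-20", "tag": "ops"})
table.put("bob", "2013-02-10", {"user": "bob", "date": "2013-02-10", "tag": "db"})

by_key = table.query("alice", range_from="2013-02-15")   # touches one partition
by_tag = table.scan(lambda item: item["tag"] == "db")    # touches everything
```

A store with ad hoc queries and secondary indexes avoids the full walk for the `tag` question, which is roughly the feature gap the submitter describes.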

  • by zmooc (33175) <zmooc AT zmooc DOT net> on Friday February 22, 2013 @08:26PM (#42986375) Homepage

    "in some strange way my brain had been conditioned to think of modeled data in a relational way"

    The relational model is not much more or less than the mathematically sound way of dealing with sets and relations between their items in ways that enforce and maintain consistency. There is no alternative to that. It's not merely the status quo, as the article states. Even when designing a datamodel for storage in a NoSQL database, the rules of the relational model are best taken into account.

    The only sound reason for deviating from the relational model and its rules is that your (reasonably priced) relational database server has shortcomings, typically related to dealing with large datasets in clusters, situations in which relational database solutions typically don't scale well and a compromise is needed.

    Note that NoSQL has its place and I have encountered and worked on projects in which there was just no alternative, but I wouldn't trust my precious data to any developer that chooses NoSQL over a proper datamodel for arguments other than those mentioned above, because they're bound to be wrong.

    I don't get how anybody educated in computer science fails to understand this.

    All hail Edgar F. Codd!

  • by Capt.Albatross (1301561) on Friday February 22, 2013 @09:55PM (#42986945)

    But the base problem remains (which is probably why he finds it so difficult to model his data): hierarchical datasets and the relational model are not good friends.

    Data modeling should be performed at a level of abstraction higher than the access methods of a DBMS. The relational model is at a higher level and handles hierarchical models very easily, while not being limited to them. If, on the other hand, you are trying to think about the semantic structure of your data in SQL terms, you are doing it wrong.
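
One common way to hold a hierarchy in relations is an adjacency list: each row names its parent, and a recursive query walks the tree. The sketch below is an assumption-laden illustration, not the commenter's own method: the `category` table, its sample rows, and the `descendants` helper are all hypothetical, and it uses Python's built-in `sqlite3` module with a `WITH RECURSIVE` query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE category (
        id        INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        parent_id INTEGER REFERENCES category(id)  -- NULL marks a root
    )
    """
)
# Hypothetical sample hierarchy: databases -> {relational -> postgres, document}
conn.executemany(
    "INSERT INTO category VALUES (?, ?, ?)",
    [
        (1, "databases", None),
        (2, "relational", 1),
        (3, "document", 1),
        (4, "postgres", 2),
    ],
)


def descendants(conn, root_id):
    """Return (name, depth) for root_id and everything beneath it."""
    rows = conn.execute(
        """
        WITH RECURSIVE subtree(id, name, depth) AS (
            SELECT id, name, 0 FROM category WHERE id = ?
            UNION ALL
            SELECT c.id, c.name, s.depth + 1
            FROM category AS c
            JOIN subtree AS s ON c.parent_id = s.id
        )
        SELECT name, depth FROM subtree ORDER BY depth, name
        """,
        (root_id,),
    )
    return rows.fetchall()


tree = descendants(conn, 1)
branch = descendants(conn, 2)
```

The same table also answers non-hierarchical questions (all categories by name, for instance) without restructuring, which is one reading of "while not being limited to them".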

  • by martin-boundary (547041) on Saturday February 23, 2013 @03:32AM (#42987955)

    The only sound reason for deviating from the relational model and its rules is that your (reasonably priced) relational database server has shortcomings, typically related to dealing with large datasets in clusters, situations in which relational database solutions typically don't scale well and a compromise is needed.

    That's unfortunately incorrect. The Codd model is not as fundamental as you imply. It is a finite-dimensional model, suitable for when your data is naturally representable as a finite number of attributes such as name, address, etc. If there are N attributes, then each record is representable as a point in an N-dimensional Cartesian product.

    Perhaps the simplest example where that assumption fails is when representing a free text document as a bag of words, which is a standard representation for information retrieval applications (e.g. the Google web index). In this case, the natural data representation is infinite dimensional, i.e. there can be arbitrarily many attributes in a document. In such applications, even defining meaningful schemas as done in RDBMSs is impossible.
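
To make the bag-of-words representation concrete, here is a minimal sketch using Python's `collections.Counter`. The `bag_of_words` helper, its tokenizing regex, and the two sample sentences are assumptions made for illustration only. Each document becomes a sparse mapping from word to count, and the set of "attributes" is whatever words happen to occur, so it grows with every new document rather than being fixed up front.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split it into word tokens, and count each token."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


doc_a = bag_of_words("The relational model is the model of sets.")
doc_b = bag_of_words("MongoDB auto-shards; DynamoDB is a database service.")

# The attribute set is not declared anywhere; it is the union of all words seen so far.
vocabulary = set(doc_a) | set(doc_b)
shared = set(doc_a) & set(doc_b)

# A word a document never used simply counts as zero, with no column reserved for it.
missing = doc_a["mongodb"]
```

Viewed as rows with one column per word, the two documents above would already need thirteen columns while sharing only one of them, and a third document could add more.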

    Google would not have amounted to anything had they tried to work with relational models.
