
Researchers Create Database-Hadoop Hybrid (122 comments)

Posted by kdawson on Tuesday July 21, @02:02PM
from the try-this-at-home dept.
database
software
programming
it
technology
ericatcw writes "'NoSQL' alternatives such as Hadoop and MapReduce may be uber-cheap and scalable, but they remain slower and clumsier to use than relational databases, say some. Now, researchers at Yale University have created a database-Hadoop hybrid that they say offers the best of both worlds: fast performance and the ability to scale out near-indefinitely. HadoopDB was built using PostGreSQL, though MySQL has also successfully been swapped in, according to Yale computer science professor Daniel Abadi, whose students built this prototype."

This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • Please stop (Score:3, Interesting)

    by Anonymous Coward on Tuesday July 21, @02:06PM (#28773339)

    Uber-cheap is not a word, and it doesn't even make sense because you're saying it's "above cheap". Stop making up stupid shit.

    • German prepositions do not have direct English equivalents. I suppose being an "Ubermensch" would be talking about the HATS that people wear, since that's what's Over the Mensch (person). Stop getting your panties in a twist over things you're wrong about.
        • Re:Please stop (Score:5, Insightful)

          by CorporateSuit (1319461) on Tuesday July 21, @02:35PM (#28773659)
          Considering that "Ubermensch" was translatable to "Superman" then "Ubercheap" would be "Supercheap"

          It's called a prefix. We use them in the English language. This one has recently been adopted into our language. Pick up the pace or shut up about things you don't know.
          • I am a metaphor for war.

            You are for metaphor war.

          • Considering that "Ubermensch" was translatable to "Superman" then "Ubercheap" would be "Supercheap"

            No, it wouldn't. It would be word soup that any German would find to be awkward. To say something is "super cheap" they would say something like "superpreiswertes" which would literally translate as "super inexpensive". They wouldn't use über in such a situation.

            • Word soup? Have you *HEARD* German lately? Most of their speech is made up of huge conglomerations of words and prefixes and suffixes.

              For example, the word for CPR in German? Herzkreislaufwiederbelebung (heart-circle-run-again-enlivenment).

              • Oh sure, German is an extremely overly verbose language at times, but the fact of the matter is that CorporateSuit, despite all his blusterings, is about as clueless in German as he tries to claim others are.

                • the fact of the matter is that CorporateSuit, despite all his blusterings, is about as clueless in German as he tries to claim others are.

                  I suppose that all comes from living in Germany for several years, speaking nothing but German with around 1,000 people a week, face to face. I suppose anyone who had gone through such rigors would end up being "clueless" in German as well. All sarcasm aside, perhaps you are more right than you think. Some Germans don't consider Koelsch or Hessisch (the dialects I ended up speaking) to be real German at all (Although they are more understandable than Bayerisch or Frankfurterisch - which is like Hessisch o

              • Re: (Score:3, Insightful)

                Compounds parse easier with correct parentheses: (herz)(kreislauf)(wiederbelebung) or (heart)(circle-run)(re-activation), where each of the bracketed words is itself a common compound. FWIW, Cardiopulmonary resuscitation has more characters than the German term. German and English aren't very different, in fact, in terms of compounds; English also has a huge number of compound words, even though they are often not spelled as a single word: circuit breaker, for instance. As English compounds get increasingly

            • We've been co-opting other languages' words into English for a long, long time now. To a growing number of US citizens, prefixing anything with "uber" is the same as saying "ultra" or "super". You know the saying "it's all over except for the shouting"? Yeah, that's pretty much where this is.

              Feel free to mod this entire thread, including the parent, uber off-topic.

          • Pushing a translation into colloquial English does not make it a model for translation. When I'd first come across ubermensch reading Nietzsche it was described to mean 'overman' [wikipedia.org].
              • If I could tag comments I'd tag 'lol'. We should be able to tag comments, and those comments could be called /twits, in memory of some other service.
          • by Hoi Polloi (522990) on Tuesday July 21, @04:10PM (#28774971) Journal

            This thread is uber-dumb.

            Cartman would say it was "hella-stupid".

            • Uber and Super both mean "above", knucklehead. Same Proto-Indo-European root, in fact.

              Today may just be the day that you learn that a word may have more than one definition. In fact, the word you use "root" refers not just to a word's origin, but it can also refer to a very important part of plants. Do not squander this opportunity. It will open an entire new world of linguistics. I have nothing but hope for the grand future that awaits you and your once-tunneled view of the English language.

    • Since a high price is above a low price, "above cheap" means "expensive".

      • Uber doesn't mean 'above' in popular parlance, it's an absolute measure of greatness. Therefore 'uber' cheap would refer to more cheapness.

        • The way I see it, the real question should be "does it increase the ambiguity of the language or decrease its expressive power?". As long as someone understands what is being said (with slang like "ain't" that has been in use long enough so it is widely known) then I don't see a problem with it. We may become somewhat Balkanized in the short term, but, hopefully, this will serve to get those conservatives used to living in a pluralistic society and will wear down some of their xenophobia. I see the rea

  • PostGreSQL (Score:5, Informative)

    by tcopeland (32225) <tom@@@infoether...com> on Tuesday July 21, @02:08PM (#28773349) Homepage

    It's PostgreSQL [postgresql.org]... but I sympathize with the mixed case confusion and refer you to this Postgres vs PostgreSQL permathread [postgresql.org].

  • If both the performance and scalability are as good as described I can safely say that this is the most important thing of the decade and not only for DBMS.
    Handling large portions of data would get cheaper by an order of magnitude at least and scaling out would be way cheaper than now as well. I do hope it's true.

  • I thought Essbase was supposed to be one of the best databases for managing too much information. Is this supposed to be an alternative, or act as something in-between using Essbase and a mysql server?

    • Transaction speed has never been a high point of Essbase, nor has storing anything but numerical data.

      Changes to the data are not reflected immediately except in the lowest level members until it is re-calculated. It is not unusual to find calc scripts that run for 8+ hours.

  • by chrysalis (50680) on Tuesday July 21, @02:19PM (#28773465) Homepage

    Scalability is one thing, but what we appreciate in SQL-free databases is also that they don't require SQL.

    When what we want is just to retrieve a record, calling get(id) is way easier and more secure than building an SQL statement, and way cheaper than using an ORM.

    The Tokyo Cabinet API is absolutely excellent in this regard. And there's no need to learn yet another domain-specific language like SQL, just use the language you use for the rest of the app.

    Now, SQL-zealots would troll "but how would you do with ?".
    And yes, for complex requests as in data mining, SQL and XPath make sense. For people who aren't developers, SQL makes sense as well. For interoperability with 3rd-party apps, SQL is also useful, just as FAT is still useful today in order to share filesystems between operating systems.

    But for the rest of us, SQL is cumbersome. Databases like MongoDB let you achieve similar results in a more natural way instead of forcing you to learn SQL and to rethink everything in a tabular way.

    • I would argue that all solutions that currently exist for databases are ideal for some specific set of problems AND some specific set of users for each problem within that initial set.

      There is no "perfect" solution that will work for all types of data, be it a flatfile structure, a hierarchical structure, a relational structure, object-oriented or some combination of those. (The star-structure of OLAP databases is a hybrid, for example.)

      What would be good is if there was a suitable metalanguage in which you

    • Re: (Score:2, Insightful)

      When what we want is just to retrieve a record, calling get(id) is way easier and more secure than building an SQL statement, and way cheaper than using an ORM.

      Yes, I'm an SQL troll, but ... if using SQL to get a row by a unique ID is too hard for you and too insecure, there is no amount of code which is going to fix your problem, which is that you are a shitty developer who is far too lazy to make a function or macro to wrap around the simple sql request.

      There are PLENTY of reasons to not like SQL, but you

    • "a record, calling get(id)"

      So you're relating "id" to "a record." I assume that the record in question is a blob of potentially binary data that your program parses however it wants. So you want to relate unique identifiers to blobs. You can do that quite easily with SQL. Looking up a given unique identifier quickly is something your average relational database is very good at. And writing the wrapper function to implement your hypothetical get() function is trivial in most languages. I'm completely at a loss for what your SQL-free database is offering me in this case. It's saving you from the horror of writing 10 lines of code, once, to implement get(id)? 60 minutes with a good SQL tutorial will teach you everything you need to know. Sure, there is a lot more you can learn, but for the simple case you're describing you can understand SQL at only the most simple level.

      Or are you handwaving that "a record" is actually automatically squeezed into one or more variables or objects in your code? You say get("ChaosDiscord") and out pops the UserObject populated with the relevant information. Of course, at this point you need to start teaching your database, or at least your database wrapper, how your objects are structured, and how to serialize them. This is admittedly a bit of a nuisance, but an SQL-free database doesn't magically make the problem go away. Sure, an SQL-free database can provide a layer to simplify or automate it, but so can a layer on an SQL database (Ruby on Rails is perhaps the best known). Sure, you'll need to tell it that username is a string, userid is an integer, and so on, but you only have to say it once in SQL instead of in your program. The total work hasn't gone up.

      Ultimately, you appear to be complaining that SQL is too powerful (and thus complex) for your needs. But you can easily learn and use a subset of SQL that corresponds to what you claim you're looking for in an SQL-free database! You might as well complain that Java is too powerful because it has thousands of classes you don't need. The time to learn the relatively minor amount of SQL you need is insignificant compared to the time to develop any non-trivial application. If even that hour is too much, you can outsource the work to a geeky college student for some pizza and soda.

      There are some compelling reasons to look at SQL-free databases, but "SQL is too powerful" isn't one.
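A minimal sketch of the wrapper the comment above describes, assuming Python's standard sqlite3 module and a hypothetical `records` table that maps an id to a blob. The names `open_store`, `put`, and `get` are illustrative only; they are not taken from any product mentioned in this thread.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open a SQLite database and make sure the id-to-blob table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records (id TEXT PRIMARY KEY, data BLOB)"
    )
    return conn


def put(conn, record_id, data):
    """Insert or overwrite the blob stored under record_id."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO records (id, data) VALUES (?, ?)",
            (record_id, data),
        )


def get(conn, record_id):
    """Return the blob for record_id, or None if there is no such record."""
    row = conn.execute(
        "SELECT data FROM records WHERE id = ?", (record_id,)
    ).fetchone()
    return row[0] if row else None


conn = open_store()
put(conn, "ChaosDiscord", b"\x00binary profile data")
print(get(conn, "ChaosDiscord"))
print(get(conn, "nobody"))
```

The caller never sees SQL here, which is roughly the point being argued: the key-value interface can sit on top of a relational store. Whether that is preferable to a dedicated key-value database is a separate question this sketch does not settle.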

    • Re: (Score:3, Insightful)

      I understand putting another API above SQL to make it simpler to use, but avoiding using SQL because it's powerful makes no sense.

      • Not to be a troll, but this sure sounds a lot like IMS. Write a program to analyze the data.

        Some mainframers would be laughing their asses off.

    • And there's no need to learn yet another domain-specific language like SQL,

      SQL, "domain specific"? Wow. I am taken aback. Over 30 years of coding, I think SQL is singlehandedly the most productive addition to the development environment I can think of since the compiler. There are a lot of reasons that using a SQL database might not make sense (small platform, single user, low cost, small required footprint, etc) but domain specificity isn't on my list. I can't think of a less domain specific development

        • > WTF! I think that ranks as one of the stupidest statements I have ever read on slashdot!

          Tons of people aren't exactly writing PHP websites, but are still able to install vbulletin, phpbb, phpnuke, joomla, wordpress on shared hosting. And then they fire up phpmyadmin in order to remove bogus users, to count the number of posts or visits, etc. SQL perfectly makes sense for this.

    • I talked about "records" and "id" because people familiar with SQL might not be familiar with other kinds of databases, but you took it the wrong way.

      Now, ask Google. How many critical vulnerabilities were due to SQL injections? How many similar vulnerabilities were found in SQL-free databases?

      I agree with you that workarounds do exist, and that developers are to blame instead of SQL, but in the real world, SQL is how a lot of services are compromised by kiddies.

      Why do we need to invent wrappers?
      Why do tools

      • SQL injection vulnerabilities don't exist because of the database; they exist because of the crappy programmer who doesn't know how to use the database being let loose writing production code. It's a bit like saying "Let's blame the internet for Cross-site scripting vulnerabilities!".

        And there's nothing wrong with SQL. There's a lot wrong with people who think SQL will solve every single problem under the sun. Unfortunately, those people seem to be employed writing 3rd-party abstraction layers and ORMs.
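To make the injection argument concrete, here is a small sketch using Python's standard sqlite3 module. The `users` table, the function names, and the hostile input are all made up for illustration. It contrasts building the statement by string concatenation with passing the value as a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)", [("alice", 1), ("bob", 0)]
)


def find_user_unsafe(name):
    # Vulnerable: the input is pasted into the SQL text itself,
    # so quotes in the input can change the meaning of the query.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(name):
    # The driver binds the value separately; it is never parsed as SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()


hostile = "nobody' OR '1'='1"
unsafe_rows = find_user_unsafe(hostile)  # matches every row in the table
safe_rows = find_user_safe(hostile)      # matches nothing
print(len(unsafe_rows), len(safe_rows))
```

The database is the same in both cases; only the way the application hands over the value differs, which is the distinction both sides of this sub-thread are arguing about.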
    • And yes, for complex requests as in data mining, SQL and XPath make sense. For people who aren't developers, SQL makes sense as well. For interoperability with 3rd-party apps, SQL is also useful, just as FAT is still useful today in order to share filesystems between operating systems.

      But for the rest of us....

      Sorry, but I could not help but think of this line from "Life of Brian":

      But apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, the fresh-water system, a

      • Actually it's quite the opposite. For any complex task, writing a script for MongoDB, CouchDB or TC/TT is way easier and faster than an unbearable 100-line SQL statement that even you are unable to understand the day after. Plus it's able to get things that just can't be written as an SQL query.
        And your "we'll have to run it over the weekend because it'll kill the server" is also why when you need to extract stuff out of a large dataset, you write a script to process data in chunks, not a single SQL state
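A rough sketch of the "process data in chunks" approach mentioned above, assuming Python's standard sqlite3 module and a hypothetical `visits` table. It uses the cursor's `fetchmany` so that only one chunk of rows is held in memory at a time; the table layout and chunk size are illustrative, not taken from the comment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (id INTEGER PRIMARY KEY, bytes_sent INTEGER)")
conn.executemany(
    "INSERT INTO visits (bytes_sent) VALUES (?)", [(n,) for n in range(1, 1001)]
)


def total_bytes(conn, chunk_size=100):
    """Sum bytes_sent while holding at most chunk_size rows in memory."""
    cursor = conn.execute("SELECT bytes_sent FROM visits ORDER BY id")
    total = 0
    chunks = 0
    while True:
        rows = cursor.fetchmany(chunk_size)
        if not rows:  # an empty list means the result set is exhausted
            break
        total += sum(value for (value,) in rows)
        chunks += 1
    return total, chunks


total, chunks = total_bytes(conn)
print(total, chunks)
```

The same loop shape applies whether the rows come from an SQL cursor or from a script walking a document store; the sketch only shows the chunking, not which backend is faster.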

      • Marshaling is still required, but you don't need to think about being restricted to a schema, columns, types, to define identifiers for everything, to do explicit joins, etc. Just store your objects as they are in memory.
        Look at MongoDB: http://www.mongodb.org/ [mongodb.org]

      • "id" is MP3 data. How do I find the title of the song? Will your 3 liners do the trick?
        In a document-oriented database id wouldn't be an issue. No need for any wrapper.
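As a toy illustration of the document-oriented idea in the replies above, here is a sketch of an in-memory collection written in plain Python. The `DocumentStore` class and its methods are hypothetical and are not MongoDB's API; the sketch only shows schema-less documents being stored and then looked up by field instead of by an opaque id.

```python
import json


class DocumentStore:
    """Toy in-memory document collection (hypothetical, not a real product's API)."""

    def __init__(self):
        self._docs = {}
        self._next_id = 1

    def insert(self, doc):
        """Store a schema-less dict and return its generated id."""
        doc_id = self._next_id
        self._next_id += 1
        # Round-trip through JSON to mimic serialization to storage.
        self._docs[doc_id] = json.dumps(doc)
        return doc_id

    def get(self, doc_id):
        """Return the document stored under doc_id, or None."""
        raw = self._docs.get(doc_id)
        return json.loads(raw) if raw is not None else None

    def find(self, **criteria):
        """Return every document whose fields equal all given criteria."""
        matches = []
        for raw in self._docs.values():
            doc = json.loads(raw)
            if all(doc.get(key) == value for key, value in criteria.items()):
                matches.append(doc)
        return matches


songs = DocumentStore()
first = songs.insert({"title": "Song A", "artist": "X", "tags": ["live"]})
songs.insert({"title": "Song B", "artist": "Y", "bitrate": 192})
print(songs.get(first)["title"])
print(songs.find(artist="Y"))
```

Note that the two documents do not share the same fields, which is the flexibility being claimed. The linear scan in `find` is a simplification; a real store would need indexes to make such lookups fast.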

  • MySQL? (Score:4, Funny)

    by trisweb (690296) on Tuesday July 21, @02:38PM (#28773711) Journal
    No offense to the creators (well, maybe some offense) but why the heck would you want to put MySQL in where PostgreSQL already was? That's like taking out your star quarterback and putting in, well, me!
    • Re:MySQL? (Score:4, Informative)

      by scorp1us (235526) on Tuesday July 21, @03:31PM (#28774363) Journal

      MySQL has its fan boys from circa 1994-2001. During this period, the MySQL license was much more permissive, and MySQL gained a certain momentum from PHP that carries it through to this day. At the same time, PostgreSQL was still using Cygwin on Windows, the INSTALL file had a table of contents, and it was lacking performance enhancements (particularly on Windows).

      Eventually Cygwin was dropped and the threading was happy on Windows, and the performance enhancements were good. Along with this came a much shorter INSTALL file and all reason to use MySQL had disappeared. But once you know something, people like to keep on using it. Then MySQL got things like triggers, foreign key constraints and full ACID compliance. So in the end it ended up being a wash.

      However, and not to start a flame war, it seems that PostgreSQL, having been feature-complete (ACID, foreign keys, etc), maintained a performance edge. But also to this day MySQL has a very fast table implementation, provided you don't need things like ACID compliance. For a variety of applications, this is "good enough" and the trade-offs of feature completeness vs performance are worth it.

      Disclaimer: I have used both extensively in the past. I prefer PostgreSQL, but now use neither. Now I only do SQLite (embedded tables) or Oracle (for hot replication).

    • We might create the software intending it to do and be used in one way, but how it will actually be used is determined by the users. Postgre and MySQL don't carry any intrinsic values, only the values which their users discover and, well, use. Without users they have no good or bad features.

      So why is it that people feel the need to rally around or defend them? After all, only the developers who have done the work are capable of understanding the snips and criticism leveled against them, and these are the
      • I've used both pretty extensively in a wide variety of environments, and I don't take such a balanced view at all. IMHO, the best answer to most database-related problems is to use PostgreSQL or SQLite. MySQL sits somewhere between them in terms of reliability, scalability, ACIDity, etc, and kinda fails at being good at anything in particular. For that matter, even if you *like* where MySQL lies on those tradeoffs, compared to either of the other two mentioned products (especially Pg), the quality of the

    • Well, the name is certainly a lot catchier...

  • (In my best Special Ed impersonation)

    Yaaaaaay, now we can scale out Hadoop! Yaaaaay! Yaaaay Hadoop! Yaaaaay!

  • There are also two Hadoop subprojects that either support SQL or will shortly. They both translate SQL queries into map/reduce programs. They are:

    http://hadoop.apache.org/pig/ [apache.org]
    http://hadoop.apache.org/hive/ [apache.org]
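To give a feel for what "translating a query into map/reduce" means, here is a toy sketch in plain Python of a query like `SELECT author, SUM(score) FROM posts GROUP BY author` expressed as a map phase, a shuffle, and a reduce phase. It does not show how Pig or Hive actually compile queries; the `posts` rows and all function names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical rows of a "posts" table: (author, score)
rows = [("alice", 3), ("bob", 5), ("alice", 4), ("carol", 1), ("bob", 2)]


def map_phase(row):
    """Emit (key, value) pairs; the key plays the role of GROUP BY."""
    author, score = row
    yield author, score


def shuffle(pairs):
    """Group all values by key, as the framework does between the phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Aggregate one group; this corresponds to SUM(score)."""
    return key, sum(values)


def run_job(rows):
    """Run map, shuffle, and reduce in a single process."""
    pairs = (pair for row in rows for pair in map_phase(row))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


totals = run_job(rows)
print(totals)
```

In a real cluster the map and reduce calls would run on different machines and the shuffle would move data over the network; the single-process version only shows how the pieces of the query line up with the phases.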

    • I hate to disagree with you ... but ....

      That anyone can come up with ideas is true; HOWEVER, not all ideas are GOOD ones. The problem with coming up with GOOD ideas is that often people don't have a basic understanding of the problem or the implications of various ways of implementing an idea.

      Getting people to do the work is often not quite as easy as it seems. First you have to have qualified people. They have to be motivated to actually complete the work given.

      As for degree programs at schools and such, a MS is noth

        • Can't spell worth shit, doesn't negate my intelligence. I know idiots who spell perfectly. I know very intelligent people who spell well, and idiots who can't spell like me. Spelling is NOT a sign of intelligence nor education level.

          And I didn't know I was going to be "graded" on a spur of the moment post to a web log. Had I known you were a lurking grammar nazi, I would have proofread my post more carefully. Perhaps even hiring someone to draft(write) it for me to post as my own.

          Knowing my weaknesses (spel

    • And how much more than "Free" does it cost?
        • Not sure what you were talking about, but hadoop and postgres are open source. Unless they're stupid, they wouldn't make the resulting product closed source.

          I'm not going to make the whole free software pitch here, but let's just say I believe in the superiority of the development process and the end product through my experiences developing and using software.

          I have no confidence in Intersystems Cache's long term survival.
