
Is the One-Size-Fits-All Database Dead?

jlbrown writes "In a new benchmarking paper, MIT professor Mike Stonebraker and colleagues demonstrate that specialized databases can have dramatic performance advantages over traditional databases (PDF) in four areas: text processing, data warehousing, stream processing, and scientific and intelligence applications. The advantage can be a factor of 10 or higher. The paper includes some interesting 'apples to apples' performance comparisons between commercial implementations of specialized architectures and relational databases in two areas: data warehousing and stream processing." From the paper: "A single code line will succeed whenever the intended customer base is reasonably uniform in their feature and query requirements. One can easily argue this uniformity for business data processing. However, in the last quarter century, a collection of new markets with new requirements has arisen. In addition, the relentless advance of technology has a tendency to change the optimization tactics from time to time."
  • Well it's about time we had some change around here!
  • The closest thing I can think of that fits that description is Postgres.
    • Re: (Score:2, Funny)

      There's a difference between fitting and being forced to fit into something ;)
    • Languages, OSs, file systems, databases, microprocessors, cars, VCRs, disk drives, pizzas... none of these are one-size-fits-all.

      There never has been, and probably never will be. A small embedded database will never be replaced by a fat-assed SQL database, any more than Linux will ever find a place in really bottom-end microcontroller systems.

    • Re: (Score:3, Informative)

      by egghat ( 73643 )
      Btw, Postgres was a project from Stonebraker meant to deal with the limitations of Ingres (POST inGRES).

      See the history of PostgreSQL [postgresql.org].

      When the community picked up the old, dormant Postgres source code (no problem, thanks to the BSD licensing), the first thing added (after some debate) was SQL syntax, hence the name change to PostgreSQL.

      Bye egghat.
  • Have you noticed when you code your own routines for manipulating data (in effect, your own application specific database) you can produce stuff that is very, very fast? In the good old days of the Internet Bubble 1.0 I took an application specific database like this (originally for a record store) and generalized it into a generic database capable of handling all sorts of data. But every change I made to make the code more general also made it less efficient. The end result wasn't bad by any means: we sol
    • by smilindog2000 ( 907665 ) <bill@billrocks.org> on Tuesday January 09, 2007 @11:16PM (#17534244) Homepage
      I write all my databases with the fairly generic DataDraw database generator. The resulting C code is faster than if you wrote it manually using pointers to C structures (really). http:datadraw.sourceforge.net [sourceforge.net]. It's generic, and faster than anything EVER.
      • by Anonymous Coward on Tuesday January 09, 2007 @11:38PM (#17534430)
        Looks interesting, will check it out. Working URL for the lazy: http://datadraw.sourceforge.net/ [sourceforge.net]
      • It's hard to take any project seriously (professional or not) when its web page has such glaring mistakes as random letter b's in its source (clearly visible in all the browsers I've tried), more white space than anyone can reasonably shake a stick at, and poor graphics (I'm looking at the rounded corners of the main content).

        As interesting as it sounds, it makes me wonder what could be wrong with the code...

      • Is there any documentation for it (didn't see a link on the webpage)? How do I use it in my program?

        Are there any benchmark results that prove the claims about it being faster? How much faster (than what?) is it, really?
        • ``Is there any documentation for it (didn't see a link on the webpage)? How do I use it in my program?''

          Never mind, I found the link. Must have skipped past it the first time. Perhaps it would be a good idea to add it to one of the edges of the page?
        • There is a manual in OpenOffice format [sourceforge.net] (yeah, I really AM a PITA). It was benchmarked heavily in the late 90's, internally at an EDA company before deciding to use the integer-based object references. All programs (including a placer and a router) sped up, and the range was from 10% to 20%, depending on the tool. The average was about 15%. Improvements were more pronounced in tools with larger amounts of data, which we felt was due to cache effects. It would be nice to redo the benchmarks, with open-so
    • In our company, we use the database mostly as a warehouse. Our daily processing is done via flat files and Java code. It's just much, much, much faster that way and easier to maintain. I think we're kind of a special case though.
      • by suggsjc ( 726146 )

        I think we're kind of a special case though.

        Yep, you are special...just like everyone else.


        On a side note. I know the term flat files can mean different things to different people, but I find that they are almost always a bad idea (to some degree and depending on your definition). You always run the risk of whatever you are using as delimiters coming up in the data you are parsing, giving you those "bugs." You always think "we sanitize our data..." and it will never happen to me, but more times than not, i

    • Re: (Score:3, Interesting)

      Back in the late 90s, I worked on a data warehouse project. We tried Oracle, and had an Oracle tuning expert work with us. However, we couldn't get the performance we needed. We wound up developing a custom "database" system, where data was extracted from the source databases (billing, CDRs, etc.) and de-normalized into several large tables in parallel. The de-normalization performed global transformations and corrections. Those tables were then loaded into shared memory (64bit HP multi-CPU system with a hu
  • Prediction... (Score:5, Insightful)

    by Ingolfke ( 515826 ) on Tuesday January 09, 2007 @11:00PM (#17534140) Journal
    1) More and more specialized databases will begin cropping up.
    2) Mainstream database systems will modularize their engines so they can be optimized for different applications and they can incorporate the benefits of the specialized databases while still maintaining a single uniform database management system.
    3) Someone will write a paper about how we've gone from specialized to monolithic...
    4) Something else will trigger specialization... (repeat)

    Dvorak if you steal this one from me I'm going to stop reading your writing... oh wait.
    • Re:Prediction... (Score:4, Interesting)

      by Tablizer ( 95088 ) on Tuesday January 09, 2007 @11:47PM (#17534506) Journal
      2) Mainstream database systems will modularize their engines so they can be optimized for different applications and they can incorporate the benefits of the specialized databases while still maintaining a single uniform database management system.

      I agree with this prediction. Database interfaces (such as SQL) do not dictate implementation. Ideally, query languages only ask for what you want, not tell the computer how to do it. As long as it returns the expected results, it does not matter if the database engine uses pointers, hashes, or gerbils to get the answer. It may, however, require "hints" in the schema about what to optimize. Of course, you will sacrifice general-purpose performance to speed up a specific usage pattern. But at least they will give you the option.

      It is somewhat similar to what "clustered indexes" do in some RDBMSes. A clustered index speeds up access by one chosen key, at the expense of the other keys and certain write patterns, by physically grouping the data in that *one* chosen key order. The other keys still work, just not as fast.
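
      A minimal sketch of that tradeoff (SQL Server-style syntax; other engines spell the same idea differently, and the table and column names here are invented for illustration):

          -- Hypothetical orders table that is queried mostly by date range.
          CREATE TABLE orders (
              order_id    INT           NOT NULL,
              customer_id INT           NOT NULL,
              order_date  DATE          NOT NULL,
              total       DECIMAL(10,2)
          );

          -- Physically order the rows by order_date: date-range scans get
          -- faster; lookups by order_id or customer_id still work, just not
          -- as fast, and inserts in random date order get pricier.
          CREATE CLUSTERED INDEX ix_orders_date ON orders (order_date);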
             
      • Re: (Score:3, Interesting)

        by Pseudonym ( 62607 )

        Interfaces like SQL don't dictate the implementation, but they do dictate the model. Sometimes the model you want is so far from the interface language that you need to either extend or replace the interface language for the problem to be tractable.

        SQL's approach has been to evolve. It isn't quite "there" for a lot of modern applications. I can foresee a day when SQL can efficiently model all the capabilities of, say, Z39.50, but we're not there now.

      • Re: (Score:3, Informative)

        by Decaff ( 42676 )
        I agree with this prediction. Database interfaces (such as SQL) do not dictate implementation. Ideally, query languages only ask for what you want, not tell the computer how to do it.

        This can be taken a stage further, with general persistence APIs. The idea is that you don't even require SQL or relational stores: you express queries in a more abstract way and let a persistence engine generate highly optimised SQL, or some other persistence process. I use the Java JDO 2.0 API like this: I can persist and
    • The reasons for this "one size fits all" (OSFA) strategy include the following:
      Engineering costs...
      Sales costs...
      Marketing costs...

      What about the cost of maintenance for the customer?

      Maybe people will keep buying 'one size fits all' DBMSs if they meet enough of their requirements and they don't have to hire specialists for each type of database they might have for each type of application. That is, it is easier and cheaper to maintain a smaller number of *standard* architectures (e.g. one) for a comp

  • one size fits 90% (Score:5, Insightful)

    by JanneM ( 7445 ) on Tuesday January 09, 2007 @11:03PM (#17534158) Homepage
    It's natural to look at the edges of any feature or performance envelope: people who want to store petabytes of particle accelerator data, run complex queries to serve a million web pages a second, or have hundreds of thousands of employees doing concurrent things to the backend.

    But for most uses of databases - or any back-end processing - performance just isn't a factor and hasn't been for years. Enron may have needed a huge data warehouse system; "Icepick Johnny's Bail Bonds and Securities Management" does not. Amazon needs the cutting edge in customer management; "Betty's Healing Crystals Online Shop (Now With 30% More Karma!)" not so much.

    For the large majority of uses - whether you measure in aggregate volume or number of users - one size really fits all.
    • This is more true all the time. I work in the EDA industry, in chip design. The database sizes I work with are naturally well correlated with Moore's Law. In effect, I'm a permanent power user, but my circle of peers is shrinking into oblivion...
    • Re: (Score:2, Insightful)

      by TubeSteak ( 669689 )

      For the large majority of uses - whether you measure in aggregate volume or number of users - one size really fits all.

      I'm willing to concede that...
      But IMO it is not 100% relevant.

      Large corporate customers usually have a large effect on what features show up in the next version of [software]. Software companies put a lot of time & effort into pleasing their large accounts.

      And since performance isn't a factor for the majority of users, they won't really be affected by any performance losses resulting fr

    • The same argument is what gave rise to re-programmable generic processing components: CPUs. You'll note that the processor industry today (AMD in particular) is now also moving towards this kind of diversification. Gaming systems have been using dedicated GPUs for ages (today they're more powerful than entire PCs from 5 years ago) and I'm sure we remember back when math co-processors (i387) were introduced. You'll note that math co-processors were just absorbed back into the generic model.

      It's another pe
    • by Bozdune ( 68800 )
      Then why do we need specialized OLAP systems like Essbase, Kx Systems, etc.? So much for OSFA (one size fits all). Any transaction-oriented database of sufficient size, requiring multi-way joins between tables, and requiring sub-second response times to queries, is way out of range of OSFA. Furthermore, it doesn't require petabytes to take a relational database system to its knees. Just a few million transactions, and your DBMS will be on its back waving its arms feebly, along with your server.

      Performan
  • Imagine that.... (Score:5, Insightful)

    by NerveGas ( 168686 ) on Tuesday January 09, 2007 @11:09PM (#17534210)
    ... a database mechanism particularly written for the task at hand will beat a generic one. Who would have thought?

    steve

    (+1 Sarcastic)
    • There is an article, and it has many references. How is a 'Captain Obvious' sort of comment labeled Insightful? The insightful part is in the article. The first author, Michael Stonebraker, architected Ingres and Postgres. He looked at OLAP databases, which is a market that is much larger than a special case. He proposed storing the data in columns rather than in rows. He tested this, and it works. In fact, it works so well that he can clobber a $300,000 server cluster with an $800 PC. I know that
  • Dammit (Score:5, Insightful)

    by AKAImBatman ( 238306 ) * <akaimbatman AT gmail DOT com> on Tuesday January 09, 2007 @11:15PM (#17534238) Homepage Journal
    I was just thinking about writing an article on the same issue.

    The problem I've noticed is that too many applications are becoming specialized in ways that are not handled well by traditional databases. The key example of this is forum software. Truly hierarchical in nature, the data is also of varying sizes, full of binary blobs, and generally unsuitable for your average SQL system. Yet we keep trying to cram them into SQL databases, then get surprised when we're hit with performance problems and security issues. It's simply the wrong way to go about solving the problem.

    As anyone with a compsci degree or equivalent experience can tell you, creating a custom database is not that hard. In the past it made sense to go with off-the-shelf databases because they were more flexible and robust. But now that modern technology is causing us to fight with the databases just to get the job done, the time saved from generic databases is starting to look like a wash. We might as well go back to custom databases (or database platforms like BerkeleyDB) for these specialized needs.
    • Re: (Score:3, Funny)

      by Jason Earl ( 1894 )

      Eventually the folks working on web forums will realize that they are just recreating NNTP and move on to something else.

    • by Jerf ( 17166 )

      Truly hierarchical in nature, the data is also of varying sizes, full of binary blobs, and generally unsuitable for your average SQL system.

      Actually, I was bitching about this very problem [jerf.org] (and some others) recently, when I came upon this article about recursive queries [teradata.com] on the programming reddit [reddit.com].

      Recursive queries would totally, completely solve the "hierarchy" part of the problem, and halfway decent database design would handle the rest.
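
      For the skeptical, a minimal sketch of what that looks like, assuming a hypothetical comments(comment_id, parent_id, author) table. WITH RECURSIVE is the SQL:1999 spelling; support and exact syntax vary by vendor:

          WITH RECURSIVE thread (comment_id, parent_id, author, depth) AS (
              -- anchor: the top-level comments
              SELECT comment_id, parent_id, author, 0
              FROM   comments
              WHERE  parent_id IS NULL
              UNION ALL
              -- recursive step: pull in each comment's children
              SELECT c.comment_id, c.parent_id, c.author, t.depth + 1
              FROM   comments c
              JOIN   thread   t ON c.parent_id = t.comment_id
          )
          SELECT * FROM thread;

      One round trip fetches the whole tree, instead of one query per level of nesting.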

      My theory is that nobody realizes that recursive queries would solve th

      • Or just use hierarchical queries - like START WITH / CONNECT BY clauses in Oracle. Probably other vendors have something similar too - not sure about that.
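
        Something like this, again assuming a hypothetical comments(comment_id, parent_id, author) table (Oracle-style syntax):

            SELECT LEVEL AS depth, comment_id, parent_id, author
            FROM   comments
            START WITH parent_id IS NULL          -- thread roots
            CONNECT BY PRIOR comment_id = parent_id
            ORDER SIBLINGS BY comment_id;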
      • by Imsdal ( 930595 )

        My theory is that nobody realizes that recursive queries would solve their problems, so nobody asks for them, so nobody ever discovers them, so nobody ever realizes that recursive queries would solve their problem.

        It used to be that execution plans in Oracle were retrieved from the plan table via a recursive query. Since even the tiniest application will need a minimum amount of tuning, and since all db tuning should start by looking at the execution plans, everyone should have run into recursive queries s

    • The key example of this is forum software. Truly hierarchical in nature, the data is also of varying sizes, full of binary blobs, and generally unsuitable for your average SQL system.

      Hierarchical? Yes, but I don't see any problem using SQL to access hierarchical information. It's easy to have parent/child relationships.
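
      For what it's worth, the usual adjacency-list sketch (plain ANSI-ish DDL; the table and column names are invented for illustration):

          CREATE TABLE forum_post (
              post_id   INTEGER PRIMARY KEY,
              parent_id INTEGER REFERENCES forum_post (post_id),  -- NULL = top-level post
              author    VARCHAR(80),
              body      VARCHAR(8000)
          );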

      Data of varying sizes? I thought this problem was solved 20 years ago when ANSI adopted a SQL standard including a VARCHAR datatype.

      Full of binary blobs? Why? What in the hell for? So
  • Duh (Score:5, Insightful)

    by Reality Master 101 ( 179095 ) <RealityMaster101@gmail. c o m> on Tuesday January 09, 2007 @11:18PM (#17534258) Homepage Journal

    Who thinks that a specialized application (or algorithm) won't beat a generalized one in just about every case?

    The reason people use general databases is not because they think it's the ultimate in performance, it's because it's already written, already debugged, and -- most importantly -- programmer time is expensive, and hardware is cheap.

    See also: high level compiled languages versus assembly language*.

    (*and no, please don't quote the "magic compiler" myth... "modern compilers are so good nowadays that they can beat human written assembly code in just about every case". Only people who have never programmed extensively in assembly believe that.)

    • Re:Duh (Score:5, Informative)

      by Waffle Iron ( 339739 ) on Tuesday January 09, 2007 @11:43PM (#17534468)
      *and no, please don't quote the "magic compiler" myth... "modern compilers are so good nowadays that they can beat human written assembly code in just about every case". Only people who have never programmed extensively in assembly believe that.

      I've programmed extensively in assembly. Your statement may be true up to a couple of thousand lines of code. Past that, to avoid going insane, you'll start using things like assembler macros and your own prefab libraries of general-purpose assembler functions. Once that happens, a compiler that can tirelessly do global optimizations is probably going to beat you hands down.

      • Re:Duh (Score:5, Insightful)

        by wcbarksdale ( 621327 ) on Wednesday January 10, 2007 @12:07AM (#17534670)
        Also, to successfully hand-optimize you need to remember a lot of details about instruction pipelines, caches, and so on, which is fairly detrimental to remembering what your program is supposed to do.
      • The reason why assembly programmers can beat high-level programmers is they can write their code in a high-level language first, then profile to see where the hotspots are, and then rewrite a 100 line subroutine or two in assembly language, using the compiler output as a first draft.

        In other words, assembly programmers beat high-level programmers because they can also use modern compilers.

      • Re: (Score:3, Insightful)

        by RAMMS+EIN ( 578166 )
        Also, the compiler may know more CPUs than you do. For example, do you know the pairing rules for instructions on an original Pentium? The differences one must pay attention to when optimizing for a Thoroughbred Athlon vs. a Prescott P4 vs. a Yonah Pentium-M vs. a VIA Nehemiah? GCC does a pretty good job of generating optimized assembly code for each of these from the same C source code. If you were to do the same in assembly, you would have to write separate code for each CPU, and know the subtle differen
    • I've never heard the "magic compiler myth" phrase, but I'll help educate others about it. It's refreshing to hear someone who understands reality. Of course, a factor of 2 to 4 improvement in speed is less and less important every day...
    • Re: (Score:3, Interesting)

      by suv4x4 ( 956391 )
      "modern compilers are so good nowadays that they can beat human written assembly code in just about every case". Only people who have never programmed extensively in assembly believe that.

      Only people who haven't seen recent advancements in CPU design and compiler architecture will say what you just said.

      Modern compilers apply optimizations at such a sophisticated level that it would be a nightmare for a human to maintain an equally optimized solution by hand.

      As an example, modern Intel processors can process certain "simple"
      • Re: (Score:2, Insightful)

        by mparker762 ( 315146 )
        Only someone who hasn't recently replaced some critical C code with assembler and gotten substantial improvement would say that. This was MSVC 2003 which isn't the smartest C compiler out there, but not a bad one for the architecture. Still, a few hours with the assembler and a few more hours doing some timings to help fine-tune things improved the CPU performance of this particular service by about 8%.

        Humans have been writing optimized assembler for decades, the compilers are still trying to catch up. M
        • Re: (Score:2, Insightful)

          by suv4x4 ( 956391 )
          This was MSVC 2003 which isn't the smartest C compiler out there, but not a bad one for the architecture. Still, a few hours with the assembler and a few more hours doing some timings to help fine-tune things improved the CPU performance of this particular service by about 8%... These sophisticated CPU's don't know who wrote the machine code, they do parallel execution and branch prediction and so forth on hand-optimized assembly just like they do on compiler-generated code. Which is one reason (along with
      • Re:Duh (Score:4, Insightful)

        by try_anything ( 880404 ) on Wednesday January 10, 2007 @02:59AM (#17535854)
        Modern compilers apply optimizations at such a sophisticated level that it would be a nightmare for a human to maintain an equally optimized solution by hand.

        There are three quite simple things that humans can do that aren't commonly available in compilers.

        First, a human gets to start with the compiler output and work from there :-) He can even compare the output of several compilers.

        Second, a human can experiment and discover things accidentally. I recently compiled some trivial for loops to demonstrate that array bounds checking doesn't have a catastrophic effect on performance. With the optimizer cranked up, the loop containing a bounds check was faster than the loop with the bounds check removed. That did not inspire confidence.

        Third, a human can concentrate his effort for hours or days on a single section of code that profiling revealed to be critical and test it using real data. Now, I know JIT compilers and some specialized compilers can do this stuff, but as far as I know I can't tell gcc, "Compile this object file, and make the foo function as fast as possible. Here's some data to test it with. Let me know on Friday how far you got, and don't throw away your notes, because we might need further improvements."

        I hope I'm wrong about my third point (please please please) so feel free to post links proving me wrong. You'll make me dance for joy, because I do NOT have time to write assembly, but I have a nice fast machine here that is usually idle overnight.

        • With the optimizer cranked up, the loop containing a bounds check was faster than the loop with the bounds check removed.

          That actually makes sense to me. If your bounds check was very simple and the only loop outcome was breaking out (throw an exception, exit the loop, exit the function, etc., without altering the loop index), the optimizer could move it out of the loop entirely and alter the loop index check to incorporate the effect of the bounds check. Result is a one-time bounds check before entering
    • Re: (Score:2, Insightful)

      by kfg ( 145172 )
      The reason people use general databases is not because they think it's the ultimate in performance, it's because it's already written, already debugged, and -- most importantly. . .

      . . .has some level of definable and guaranteed data integrity.

      KFG
    • I had a "simple" optimization project. It came down to one critical function (ISO JBIG compression). I coded the thing by hand in assembler, carefully manually scheduling instructions. It took me days. Managed to beat GNU gcc 2 and 3 by a reasonable margin. The latest Microsoft C compiler? Blew me away. I looked at the assembler it produced -- and I don't get where the gain is coming from. The compiler understands the machine better than I do.

      Go figure -- I hung up my assembler badge. Still a useful skill f
      • I looked at the assembler it produced -- and I don't get where the gain is coming from. The compiler understands the machine better than I do.

        All that proves is that the compiler knew a trick you didn't (probably it understood which instructions will go into which pipelines and will parallelize). I bet if you took the time to learn more about the architecture, you could find ways to be even more clever.

        I'm not arguing for a return to assembly... it's definitely too much of a hassle these days, and aga

      • by TheLink ( 130905 )
        "The compiler understands the machine better than I do."

        Actually the people paid lots of money to write Microsoft's C compiler understand the machine better than you do. I doubt you should be surprised.

        And the compiler will hopefully be able to keep all the tricks in mind (a human might forget to use one in some cases).

        I'm just waiting/hoping for the really smart people to make stuff like perl and python faster.

        Java has improved in speed a lot and already is quite fast in some cases, but I don't consider it
  • This reminds me of the parallel databases class I took in college. Sure, specialized parallel databases (not distributed, mind you, parallel) using specialized hardware were definitely faster than the standard SQL-type relational databases...but so what? The costs were so much higher they were not feasible for most applications.

    Specialized software and hardware outperforms generic implementations! Film at 11!
  • SW platform development always features a tradeoff between general purpose APIs and optimized performance engines. Databases are like this. The economic advantages for everyone in using an API even as awkward and somewhat inconsistent as SQL are more valuable than the lost performance in the fundamental relational/query model.

    But it doesn't have to be that way. SQL can be retained as an API, but different storage/query engines can be run under the hood to better fit different storage/query models for differ
    • Databases already have the ability to change storage engines as long as they support SQL. The reason my company shuns the database for many specific tasks is that SQL is ill-suited to perform many types of transformations, calculations, and aggregations on data. What may take many pages of SQL (and many temp tables) in a stored proc can be written in a simple Java class and will perform much better, as well as being easier to maintain. A lot of our processing goes like this: raw data from database (simple
  • by TVmisGuided ( 151197 ) <alan...jump@@@gmail...com> on Tuesday January 09, 2007 @11:56PM (#17534566) Homepage
    Sheesh...and it took someone from MIT to point this out? Look at a prime example of a high-end, heavily-scaled, specialized database: American Airlines' SABRE. The reservations and ticket-sales database system alone is arguably one of the most complex databases ever devised, is constantly (and I do mean constantly) being updated, is routinely accessed by hundreds of thousands of separate clients a day...and in its purest form, is completely command-line driven. (Ever see a command line for SABRE? People just THINK the APL symbol set looked arcane!) And yet this one system is expected to maintain carrier-grade uptime or better, and respond to any command or request within eight seconds of input. I've seen desktop (read: non-networked) Oracle databases that couldn't accomplish that!
    • by sqlgeek ( 168433 ) on Wednesday January 10, 2007 @03:12AM (#17535930)
      I don't think that you know Oracle very well. Let's say you want to scale, so you want clustering or grid functionality -- built into Oracle. Let's say that you want to partition your enormous table into one physical table per month or quarter -- built in. Oh, and if you query the whole giant table you'd like parallel processes to run against each partition, balanced across your cluster or grid -- yeah, that's built in too. Let's say you almost always get a group of data together rather than piece by piece, so you want it physically colocated to reduce disk I/O -- built in.
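
      As a rough sketch of the partitioning/parallel piece (Oracle-style DDL; the table, column, and partition names are invented for illustration):

          CREATE TABLE call_fact (
              call_id     NUMBER,
              call_date   DATE,
              region_id   NUMBER,
              customer_id NUMBER,
              duration    NUMBER
          )
          PARTITION BY RANGE (call_date) (
              PARTITION p_2006_q4 VALUES LESS THAN (DATE '2007-01-01'),
              PARTITION p_2007_q1 VALUES LESS THAN (DATE '2007-04-01')
          )
          PARALLEL;  -- full-table queries can fan out across the partitions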

      This is why you pay a good wage for your Oracle data architect & DBA -- so that you can get people who know how to do these sort of things when needed. And honestly I'm not even scratching the surface.

      Consider a data warehouse for a giant telecom in South Africa (with a DBA named Billy, in case you wondered). You have over a billion rows in your main fact table, but you're only interested in a few thousand of those rows. You have an index on dates, another index on geographic region, and another index on customer. Any one of those indexes will reduce the 1.1 billion rows to tens of millions of rows, but all three restrictions together will reduce it to a few thousand. What if you could read three indexes, perform bitmap comparisons on the results to get only the rows that match all three indexes, and then fetch only those few thousand rows from the 1.1 billion row table? Yup, that's built in, and Oracle does it for you behind the scenes.
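
      A sketch of that, continuing the invented call_fact table from above (Oracle-style syntax; the values in the query are made up):

          CREATE BITMAP INDEX call_fact_date_bix     ON call_fact (call_date);
          CREATE BITMAP INDEX call_fact_region_bix   ON call_fact (region_id);
          CREATE BITMAP INDEX call_fact_customer_bix ON call_fact (customer_id);

          -- Each predicate alone matches tens of millions of rows; the
          -- optimizer can AND the three bitmaps and visit only the few
          -- thousand rows that satisfy all three.
          SELECT *
          FROM   call_fact
          WHERE  call_date  BETWEEN DATE '2007-01-01' AND DATE '2007-01-07'
          AND    region_id   = 7
          AND    customer_id = 12345;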

      Now yeah, you can build a faster single-purpose db. But you better have a god damn'd lot of dev hours allocated to the task. My bet is that you'll probably come out way ahead in cash & time to market with Oracle, a good data architect and a good DBA. Any time you want to put your money on the line, you let me know.
      • Nevertheless - anyone doing serious data warehousing who cares about read performance has been using Teradata (older apps) or column-oriented Sybase-IQ (newer apps). Oracle can store over a billion rows, sure; a terabyte's a lot of data, and people have had multi-terabyte databases for the better part of a decade, for some projects.

        Why? Despite all the tuning, Sybase-IQ can still run through a general purpose query into its data around ten times faster than tuned Oracle.

        It may not matter in the telephone
      • Now yeah, you can build a faster single-purpose db. But you better have a god damn'd lot of dev hours allocated to the task. My bet is that you'll probably come out way ahead in cash & time to market with Oracle, a good data architect and a good DBA. Any time you want to put your money on the line, you let me know.

        Seems to me this describes AA perfectly...SABRE has been around since what, the mid- to late-70s? And it's still actively developed and maintained. At a fairly hefty annual price tag. An

  • by suv4x4 ( 956391 ) on Wednesday January 10, 2007 @12:06AM (#17534660)
    We're all sick of the "new fad: X is dead?" articles. Please reduce lameness to an acceptable level!
    Can't we get used to the fact that specialized & new solutions don't magically kill an existing popular solution to a problem?

    And it's not a recent phenomenon, either. I bet it goes back to when the first proto-journalistic phenomena formed in early human societies, and it haunts us to this very day...

    "Letters! Spoken speech dead?"

    "Bicycles! Walking on foot dead?"

    "Trains! Bicycles dead?"

    "Cars! Trains dead?"

    "Aeroplanes! Trains maybe dead again this time?"

    "Computers! Brains dead?"

    "Monitors! Printing dead yet?"

    "Databases! File systems dead?"

    "Specialized databases! Generic databases dead?"

    In a nutshell. Don't forget that a database is a very specialized form of a storage system, you can think of it as a very special sort of file system. It didn't kill file systems (as noted above), so specialized systems will thrive just as well without killing anything.
    • Re: (Score:2, Funny)

      by msormune ( 808119 )
      I'll chip in: Public forums! Intelligence dead? Slashdot confirms!
    • Death to Trees! (Score:3, Interesting)

      by Tablizer ( 95088 )
      Don't forget that a database is a very specialized form of a storage system, you can think of it as a very special sort of file system. It didn't kill file systems

      Very specialized? Please explain. Anyhow, I *wish* file systems were dead. They have grown into messy trees that are unfixable because trees can only handle about 3 or 4 factors and then you either have to duplicate information (repeat factors), or play messy games, or both. They were okay in 1984 when you only had a few hundred files. But they
      • Re: (Score:3, Insightful)

        by suv4x4 ( 956391 )
        Anyhow, I *wish* file systems were dead. They have grown into messy trees that are unfixable because trees can only handle about 3 or 4 factors and then you either have to duplicate information (repeat factors), or play messy games, or both.

        You know, I've seen enough RDBMS designs to know the "messiness" is not the fault of the file systems (or databases, for that matter).

        Sets have more issues than you describe, and you know very well Vista had lots of set based features that were later downscaled, hidde
  • by Dekortage ( 697532 ) on Wednesday January 10, 2007 @12:24AM (#17534804) Homepage

    I've made some similar discoveries myself!

    • Transporting 1500 pounds of bricks from the store to my house is much faster if I use a big truck rather than making dozens (if not hundreds) of trips with my Honda Civic.
    • Wearing dress pants with a nice shirt and tie often makes an interview more likely to succeed, even if I wear jeans every other day after I get the job.
    • Carving pumpkins into "jack-o-lanterns" always turns out better if I use a small, extremely sharp knife instead of a chainsaw.

    Who woulda thought that specific-use items might improve the outcome of specific situations?

    • What self-respecting geek would carve pumpkins with anything other than a Dremel? Turn your card in at the door, sir...
    • Transporting 1500 pounds of bricks from the store to my house is much faster if I use a big truck rather than making dozens (if not hundreds) of trips with my Honda Civic.

      Nah, just strap it all on top [snopes.com].

  • I've seen drop dead performance on flat file databases. I've seen molasses slow performance on mainframe relational databases. And I've seen about everything in between.

    What I see as a HUGE factor is less the database chosen (though that is obviously important) and more how interactions with the database (updates, queries, etc) are constructed and managed.

    For example, we one time had a relational database cycle application that was running for over eight hours every night, longer than the allotted time f

  • It's just not called SQL driven RDBMS. It's called Sleepycat.
  • 23 years ago I wrote a custom DB to maintain the status of millions of "universal" gift cards. It ran 3-5 orders of magnitude faster (on a 6 MHz IBM AT) than a commercial database running on a big IBM mainframe.

    I reduced the key operations (what is the value of this gift card, when was it sold, has it been redeemed previously? etc) to just one operation:

    Check and clear a single bit in a bitmap.

    My program used 1 second to update 10K semi-randomly-ordered (i.e. in the order we got them back from the shops tha
  • by pfafrich ( 647460 ) <rich AT singsurf DOT org> on Wednesday January 10, 2007 @07:27AM (#17537278) Homepage
    Has anyone noticed the "This article is published under a Creative Commons License Agreement" notice? It's the first time I've seen this applied to an academic paper. Another small step for the open-content movement.
  • This is titled "OSFA? - Part 2: Benchmarking Results." Has anyone found Part 1?
  • This is, of course, what MUMPS [wikipedia.org] advocates have been saying for years.

    MUMPS is a very peculiar language that is very "politically incorrect" in terms of current language fashion. Its development has been entirely governed by pragmatic real-world requirements. It is one of the purest examples of an "application programming" language. It gets no respect from academics or theoreticians.

    Its biggest strength is its built-in "globals," which are multidimensional sparse arrays. These arrays and the elements in them
