MemSQL Makers Say They've Created the Fastest Database On the Planet 377

Posted by timothy on Sunday June 24, 2012 @07:23PM from the before-you-even-think-it dept.

mikejuk writes "Two former Facebook developers have created a new database that they say is the world's fastest and it is MySQL compatible. According to Eric Frenkiel and Nikita Shamgunov, MemSQL, the database they have developed over the past year, is thirty times faster than conventional disk-based databases. MemSQL has put together a video showing MySQL versus MemSQL carrying out a sequence of queries, in which MySQL performs at around 3,500 queries per second, while MemSQL achieves around 80,000 queries per second. The documentation says that MemSQL writes back to disk/SSD as soon as the transaction is acknowledged in memory, and that using a combination of write-ahead logging and snapshotting ensures your data is secure. There is a free version but so far how much a full version will cost isn't given." (See also this article at SlashBI.)

This discussion has been archived. No new comments can be posted.

A nice approach perhaps... (Score:5, Interesting)

by alphatel ( 1450715 ) * writes: on Sunday June 24, 2012 @07:27PM (#40433251)

It sounds cool, but we can get 200k iops on Raid10 SSD without degradation.

- Re:A nice approach perhaps... (Score:5, Insightful)
  
  by tomhath ( 637240 ) writes: on Sunday June 24, 2012 @07:44PM (#40433427)
  
  Price/performance is a better question. If it's fast enough that you don't need the Raid 10 SSD then it could be a good choice. Throw hardware at any DBMS and you'll get good throughput.
  
- Re: (Score:3)
  
  by hcs_$reboot ( 1536101 ) writes:
  
  So, two guys meet and churn out some code that caches most of the DB work in memory. Great. But MySQL has a MEMORY engine and is pretty well optimized (e.g. it keeps indexes in memory and does some caching as well)... the hardest part is probably its setup: setting the right options in MySQL to achieve top performance is not easy.
  Besides, the "caching" or equivalent work is not the most difficult part of a DBMS, by far: What about the algorithms to "compile" queries in order to use indexes and perform the JOINS opti
  - Re: (Score:3)
    
    by Tough Love ( 215404 ) writes:
    
    All of computer science is just an exercise in caching, don't you know.
- Re: (Score:3)
  
  by account_deleted ( 4530225 ) writes:
  
  Comment removed based on user account deletion
Ya Don't Say! (Score:5, Insightful)

by Rary ( 566291 ) writes: on Sunday June 24, 2012 @07:28PM (#40433271)

Really? Accessing RAM is faster than accessing a disk? What a novel discovery!
It seems to me that MySQL can also be run in memory. Apparently that's how the clustered database works (or used to work). I've never tried it, but let's see some benchmarks between MemSQL and an entirely memory-based MySQL.

- Re: (Score:2)
  
  by JimboFBX ( 1097277 ) writes:
  
  I was going to say, how does this perform on large queries in a large database.
  - Re: (Score:3)
    
    by Alex Belits ( 437 ) * writes:
    
    Didn't you get the memo? There are no large databases anymore, all database servers are supposed to have more RAM than the size of their database.
    *** BARF!!! ***
    - Re: (Score:3)
      
      by drsmithy ( 35869 ) writes:
      
      Given how trivial and relatively cheap it is to put 192GB+ RAM into a server these days, there's a lot of truth in that statement, whether you like it or not.
      - Re: (Score:3)
        
        by QuantumRiff ( 120817 ) writes:
        
        I find it funny how easy it is to order an AMD system with 256GB of ram (or even 512GB, just much more expensive) yet the Intel ones all seem to max out at 192 or really, really expensive 384GB.. I know it has to do with the memory controllers, but our loads are very, very memory dependent..
        
        - Re: (Score:3)
          
          by Zak3056 ( 69287 ) writes:
          
          I find it funny how easy it is to order an AMD system with 256GB of ram (or even 512GB, just much more expensive) yet the Intel ones all seem to max out at 192 or really, really expensive 384GB.. I know it has to do with the memory controllers, but our loads are very, very memory dependent..
          The Dell PE820 (a 4-socket intel server) supports up to 1.5TB of RAM. With 2-CPUs, though, it's only 768GB...
        - Re: (Score:3)
          
          by Junta ( 36770 ) writes:
          
          IBM x3950 x5 can do 3TB of ram with Intel processors.
      - Re:Ya Don't Say! (Score:4, Insightful)
        
        by drsmithy ( 35869 ) writes: <drsmithy@gmail . c om> on Monday June 25, 2012 @05:50AM (#40436575)
        
        Wake me up when that is 800GB to a few TB. Then you can say. This might shock you to learn but some business uses their databases to drive more than just web forums.
        You can already put a TB of RAM into a server if you want. If you really need to have that amount of data with next to zero latency, then the cost (which is still relatively low) is unlikely to be much of a stumbling block.
        It clearly will shock you to learn that most databases are well under a couple of hundred GB in size.
        
      - Re: (Score:3)
        
        by SQL Error ( 16383 ) writes:
        
        Our database is 300TB. So.... Yeah.
- Re: (Score:3)
  
  by lucifuge31337 ( 529072 ) writes:
  
  It seems to me that MySQL can also be run in memory. Apparently that's how the clustered database works (or used to work).
  Absolutely correct. NDB Cluster. It's quite fast, even on older hardware providing you have enough RAM to hold your database.
  - - Re: (Score:3)
      
      by lucifuge31337 ( 529072 ) writes:
      
      No foreign keys with NDB
      Also correct. Your point? Other than that specific tools were made for specific jobs.
- Re:Ya Don't Say! (Score:4, Insightful)
  
  by errandum ( 2014454 ) writes: on Sunday June 24, 2012 @08:41PM (#40433813)
  
  That and memcached (I think that's the name).
  This comparison is far from fair... Is it ACID? Or does it eventually sync up? How does it compare with other memory-based DBs?
  Comparing it with a slow relational DB will not give you any kind of credibility.
  
  - Re:Ya Don't Say! (Score:5, Insightful)
    
    by mwvdlee ( 775178 ) writes: on Monday June 25, 2012 @03:00AM (#40435955) Homepage
    
    TFS states that transactions are written to disk after being "acknowledged" in memory.
    I assume that means transactions are written to disk only after the database reports back a successful commit.
    So it fails to meet the D of ACID compliance.
    
- Re: (Score:2)
  
  by MobileTatsu-NJG ( 946591 ) writes:
  
  Isn't the implication that in this case there's a lot less time between the transaction getting made and that data being committed to non-volatile memory?
  - Re:Ya Don't Say! (Score:5, Insightful)
    
    by xelah ( 176252 ) writes: on Monday June 25, 2012 @05:48AM (#40436571)
    
    I don't think that's something which can be changed, except by changing the hardware. The starting point is this: when a COMMIT is made, all changes have to be written to the write-ahead log (WAL) before a success response can be returned to the client. The WAL is written sequentially, so if you're using ordinary disks and are sensible you give it its own set of spindles (RAID1, say). That means that between each write you have to wait for one disk rotation: you append to the log, you process the next transaction, then you have to wait for the disk to rotate to just after where you finished writing before you can write the next one. So you can do roughly one transaction per rotation, or about 15k transactions per minute on a 15k rpm disk, with this basic setup.
    You can do things to make this faster. You can write several transactions at once, and you can put slight delays into transaction commits to wait for others to bundle them with (PostgreSQL, I believe, will do the first and can be configured to do the second). You can use battery-backed caches in your RAID system, which will have much the same effect (and leave you limited by disk bandwidth and cache size). You can use SSDs that don't need to seek.
    I can't see anything in TFA that MemSQL is supposed to be doing differently here, or anything it CAN do differently. From TFA: 'The key ideas are that SQL code is translated into C++, so avoiding the need to use a slow SQL interpreter, and that the data is kept in memory, with disk read/writes taking place in the background.'. The first I'm not too sure I understand (presumably they're not turning it into C++ and then passing it through a C++ compiler....) but maybe we can blame the journalist for that. Or maybe they've just reinvented prepared statements. The second is what databases do anyway, except, of course, for the WAL and when you're reading data which isn't in memory. Perhaps what they're doing is flushing the WAL after the commit has returned to the client, which makes the database very much not ACID, and is also something that other databases can be configured to do if you don't care about your data.
    Potentially what they could do, though, is to have designed all of their data structures, algorithms, locking and so on around the assumption that everything is in memory. There are big differences in the best query plan to use when data is in memory vs on disk, and traditional databases don't necessarily make the right choices. They try, but may for instance use table scans for queries which return a large proportion of the rows in a table because sequential IO is faster, when they should be using indexes if the data is in memory. And BTrees and the way data everywhere is split into pages is something traditional DBs do because that works well even when most of your data is on disk. So maybe that's what they've done differently that other DBs haven't already been doing.
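
A back-of-the-envelope sketch of the rotation argument above, in Python. The 15k rpm figure and the batch sizes are illustrative assumptions, not measurements, and the model deliberately ignores seek and transfer time; `commits_per_second` is a hypothetical helper, not part of any database.

```python
def commits_per_second(rpm: float, batch_size: int = 1) -> float:
    """Upper bound on commits/sec when each WAL flush waits one full rotation.

    Assumes exactly one log flush per disk rotation, with `batch_size`
    transactions bundled into each flush (group commit). Toy model only.
    """
    rotations_per_second = rpm / 60.0
    return rotations_per_second * batch_size


if __name__ == "__main__":
    # Compare one commit per rotation against bundled (group) commits.
    for batch in (1, 10, 100):
        rate = commits_per_second(15_000, batch)
        print(f"batch={batch:>3}: ~{rate:,.0f} commits/s ({rate * 60:,.0f}/min)")
```

With one commit per flush the bound scales only with rotation speed, which is why bundling commits, battery-backed caches, or SSDs are the usual ways around it.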
    
- Re:Ya Don't Say! (Score:4, Interesting)
  
  by bill_mcgonigle ( 4333 ) * writes: on Sunday June 24, 2012 @09:23PM (#40434063) Homepage Journal
  
  Not just that - you can get a FusionIO ramdisk device for really big databases and get performance that's somewhere between SSD and memory. Those are all battery backed and such, so no monkeying around with whether the ACID was done right or not.
  
- Re:Ya Don't Say! (Score:5, Interesting)
  
  by gman003 ( 1693318 ) writes: on Sunday June 24, 2012 @10:21PM (#40434449)
  
  It's a bit more complex. There are four main ways to do MySQL storage in RAM (which I know of because my current work project is a MySQL application).
  First, the NDB Cluster system is there, which is what you've mentioned. That's basically just a MySQL frontend to a distributed, memory-based NoSQL database, though. Convenient, but not truly "MySQL".
  The second is using the "Memory" storage engine, where it actually stores a normal MyISAM table in memory. However, this is a surprisingly crappy option, because it uses table-level locks for writing, so parallel write performance is only marginally faster than disk.
  The third is to store regular InnoDB tables on a ramdisk. This can be crazy fast, but it also means that if your server crashes or loses power, you're *fucked*
  The fourth is to use Memcached, which isn't really a MySQL thing at all. You're basically just caching data in a memory-only NoSQL database, at the application level. This is actually what we ended up doing, because all the others are pretty crappy options - Cluster is the best one, but the hardware requirements are higher than we could justify spending given our performance requirements. Shoving memcached onto the web server (which has RAM to spare) and setting certain queries to cache their results there sped things up significantly, at minimal cost.
  As far as I can tell from the summary (I refuse to read the articles for such a blatant slashvertisement), this "MemSQL" doesn't do anything you can't do by configuring MySQL properly, although they likely optimized some rarely-used modules to make them faster.
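
A minimal sketch of the fourth option above (caching query results at the application level), in Python. To keep it runnable with only the standard library, a plain dict stands in for memcached and an in-memory SQLite database stands in for MySQL; `cached_query`, `invalidate`, `stats`, and the `users` table are hypothetical names for illustration, not anything from the poster's application.

```python
import sqlite3

# Stand-ins so the sketch runs with only the standard library:
# a dict plays the role of memcached, in-memory SQLite plays MySQL.
cache = {}
stats = {"db_hits": 0}  # counts how often a query actually reaches the database

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])


def _key(sql, params):
    """Build a cache key from the query text and its parameters."""
    return f"{sql}|{params!r}"


def cached_query(sql, params=()):
    """Cache-aside read: try the cache first, fall back to the database."""
    key = _key(sql, params)
    if key in cache:
        return cache[key]
    stats["db_hits"] += 1
    rows = db.execute(sql, params).fetchall()
    cache[key] = rows
    return rows


def invalidate(sql, params=()):
    """Drop a cached result after a write that makes it stale."""
    cache.pop(_key(sql, params), None)
```

The trade-off is the usual one for cache-aside: reads get cheap, but the application has to decide when a cached result is stale and invalidate it itself.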
  
  - Re:Ya Don't Say! (Score:5, Informative)
    
    by arth1 ( 260657 ) writes: on Sunday June 24, 2012 @11:08PM (#40434729) Homepage Journal
    
    The third is to store regular InnoDB tables on a ramdisk. This can be crazy fast, but it also means that if your server crashes or loses power, you're *fucked*
    Not necessarily. There are battery-backed volatile RAM devices that can last for days, and also non-volatile RAM devices like F-RAM and MRAM.
    Battery-backed volatile RAM can even be considered "cheap": if the bottlenecks are in tables small enough to fit on these, or the amount of overall writes is so high that placing the InnoDB logs there makes sense, it can be cheaper than a RAID10 or 50 of high-speed SAS drives.
    The HyperDrive / ACARD drives, for example, are only $300 plus RAM. And if the worst happens, you can even dump the RAM to a CF card before the battery runs out.
    
    - Re: (Score:2)
      
      by gman003 ( 1693318 ) writes:
      
      I was referring to software-based ramdisks, not RAM-based SSDs. Although I suppose there's not much of a performance difference - the only difference is durability.
      - Re:Ya Don't Say! (Score:5, Interesting)
        
        by arth1 ( 260657 ) writes: on Monday June 25, 2012 @01:06AM (#40435405) Homepage Journal
        
        The biggest issue with RAM drives are their cost.
        Yes and no. If you can fit the InnoDB write-ahead logs and a few of the worst bottleneck tables on, say, an 8 GB RAM drive, it's a bargain.
        HyperDrive: $300
        2 * 4GB 240-Pin DDR2-800 SDRAM ECC: $234
        16 GB CF card for backup: $30
        Total: $564
        That's downright cheap compared to what a RAID 10 or 50 of SSDs or short-stroked 10k/15k rpm drives would cost.
        If it solves a bottleneck, it could be a big money saver.
        
- Re:Ya Don't Say! (Score:5, Funny)
  
  by Hognoxious ( 631665 ) writes: on Monday June 25, 2012 @02:17AM (#40435743) Homepage Journal
  
  MySQL is not webscale because it uses joins.
  
  - Re:Ya Don't Say! (Score:4, Informative)
    
    by Anonymous Coward writes: on Monday June 25, 2012 @02:40AM (#40435871)
    
    For those who don't get the reference:
    http://www.xtranormal.com/watch/6995033/mongo-db-is-web-scale
    
- - Re:Ya Don't Say! (Score:5, Funny)
    
    by Paradise Pete ( 33184 ) writes: on Monday June 25, 2012 @01:01AM (#40435381) Journal
    
    I am a ramdisc fan since Mac+.
    I'm gonna call BS on this one. Why would a ram disc need a fan?
    
okay...? (Score:5, Funny)

by bhcompy ( 1877290 ) writes: on Sunday June 24, 2012 @07:38PM (#40433359)

When I think of fast databases to compare to, the first thing I think of is MySQL.

/Actually, I'd rather see a comparison to Pick or other lightning fast MV dbs

- Re:okay...? (Score:5, Insightful)
  
  by Kergan ( 780543 ) writes: on Sunday June 24, 2012 @07:51PM (#40433489)
  
  MySQL is the last thing I think of, personally. It sucks as soon as you make it ACID compliant and start hitting it with thousands of concurrent requests. You're much better off with PostgreSQL.
  
  - Re:okay...? (Score:5, Funny)
    
    by LurkerXXX ( 667952 ) writes: on Sunday June 24, 2012 @08:03PM (#40433573)
    
    But with MySQL you can get a wrong answer REAL FAST!!!
    
    - Re: (Score:2)
      
      by sortadan ( 786274 ) writes:
      
      But MySQL can only return NULL 3,500 times in one second, this MemSQL thing can return NULL 80,000 times per second. That's 2285.71% faster!
  - Re:okay...? (Score:4, Informative)
    
    by Ziekheid ( 1427027 ) writes: on Sunday June 24, 2012 @08:07PM (#40433589)
    
    He was being sarcastic..
    
    - Re:okay...? (Score:5, Informative)
      
      by Areyoukiddingme ( 1289470 ) writes: on Monday June 25, 2012 @01:49AM (#40435631)
      
      Here on Slashdot, we have a convention for saying that with less typing. It's spelled like this:
      
      *woosh*
      
- Re:okay...? (Score:4, Insightful)
  
  by evilviper ( 135110 ) writes: on Sunday June 24, 2012 @08:15PM (#40433643) Journal
  
  When I think of fast databases to compare to, the first thing I think of is MySQL.
  MySQL is actually very fast under light loads / one-off queries, and if you choose to leave it at the non-ACID compliant default settings and similar, e.g. "innodb_flush_log_at_trx_commit".
  That's probably the only reason why it got popular... There weren't any open source NoSQL DBs at the time, and MySQL seems fast when tested with a basic, shallow benchmark. Of course others like PostgreSQL completely leave it in the dust once there's some real load, or complex queries, or you WANT to be absolutely sure transactions were committed to disk before returning.
  As a single point of evidence, I give you Zabbix... It supports the use of all the major databases (Postgresql, DB2, Oracle, SQLite, etc.) as backends, yet MySQL is recommended as it performs the fastest.
  http://www.zabbix.com/documentation/1.8/manual/performance_tuning [zabbix.com]
  /Actually, I'd rather see a comparison to Pick or other lightning fast MV dbs
  Level-2 overflow! Resize analysis! Change the modulo! Ahhhh!
  I've done the PICK-OS thing for a few years, and I'm not a big fan. I'm infinitely happier administering PostgreSQL DBs.
  Besides, you don't have to go to something as exotic as PICK to get away from SQL. Try ages-old Berkeley DB (db4), or any of the newer NoSQL options.
  
  - Re: (Score:2)
    
    by julesh ( 229690 ) writes:
    
    That's probably the only reason why it got popular... There weren't any open source NoSQL DBs at the time
    Zope? BDB? Both of these were available at the time MySQL became popular.
  - Re: (Score:3)
    
    by CadentOrange ( 2429626 ) writes:
    
    As a single point of evidence, I give you Zabbix... It supports the use of all the major databases (Postgresql, DB2, Oracle, SQLite, etc.) as backends, yet MySQL is recommended as it performs the fastest. http://www.zabbix.com/documentation/1.8/manual/performance_tuning [zabbix.com]
    From the linked document:
    rebuild MySQL or PostgreSQL from sources to get maximum performance
    2003 just called. They want their Gentoo Ricers [funroll-loops.info] back.
- Ahhhh, Pick! (Score:5, Interesting)
  
  by hedronist ( 233240 ) writes: on Sunday June 24, 2012 @08:56PM (#40433891)
  
  The most over-the-top DB God I know started in Pick-land (ca 1972?). Although he does (is forced to?) use SQL nowadays, he thinks in ways that do not come out of any SQL DBA handbook. As a result he gets DBMSs to do things that are ... unnatural.
  He is currently doing some data-cubing stuff for us that I didn't think could be done with something less than a DOD budget. He says his touchstone is thinking in Pick and then 'translating' to SQL.
  I still think that the 2 missing courses from any CS degree program are 1) how to debug, and 2) history of computing.
  
  - Re: (Score:2)
    
    by Samantha Wright ( 1324923 ) writes:
    
    I am intrigued by your ideas. Let us start a newsletter.
  - Re:Ahhhh, Pick! (Score:5, Insightful)
    
    by Zenin ( 266666 ) writes: on Sunday June 24, 2012 @11:05PM (#40434713) Homepage
    
    I still think that the 2 missing courses from any CS degree program are 1) how to debug, and 2) history of computing.
    Practical software engineering is mostly about debugging. An actual course in debugging would imply that the Computer Science curriculum had something to do with practical software engineering, which we're all painfully aware it hasn't in the slightest.
    
  - Re: (Score:3)
    
    by SixDimensionalArray ( 604334 ) writes:
    
    As someone who still programs in Pick/D3 every day (a skill I picked up working for a company with a legacy product), as well as having worked in pretty much every SQL product that exists, I am both startled and amazed to see it mentioned on Slashdot. I think this is the first time I've ever seen anyone mention it!!!
    And I am in agreement - Pick was something truly different which could have been as big as SQL - multi-value, "NoSQL"-ish which still had a query engine, fast, little to no maintenance, loo
Show me vs a real DB engine (Score:5, Interesting)

by Kergan ( 780543 ) writes: on Sunday June 24, 2012 @07:42PM (#40433395)

Show me benchmarks vs Oracle, PostgreSQL or SQLServer. Spare me the comparison with MySQL or some other toy.

- vs DB2 (Score:2)
  
  by Hyperhaplo ( 575219 ) writes:
  
  I would like to see the comparison against DB2. Midrange DB2 if you really want like for like, mainframe if you have guts :)
- Re:Show me vs a real DB engine (Score:5, Informative)
  
  by BitterOak ( 537666 ) writes: on Sunday June 24, 2012 @08:44PM (#40433821)
  
  Show me benchmarks vs Oracle, PostgreSQL or SQLServer. Spare me the comparison with MySQL or some other toy.
  I think the reason the comparison to MySQL is appropriate is that this database is supposed to be MySQL compatible.
  
- Re:Show me vs a real DB engine (Score:5, Informative)
  
  by symbolset ( 646467 ) * writes: on Sunday June 24, 2012 @09:18PM (#40434027) Journal
  
  This is supposed to be funny. Oracle prohibits private benchmark publication in their license.
  
  - Re: (Score:2)
    
    by rubycodez ( 864176 ) writes:
    
    but anyone can download and run Oracle & benchmark it and publish it. This is the internet, information wants to be free.
- Re: (Score:2)
  
  by account_deleted ( 4530225 ) writes:
  
  Comment removed based on user account deletion
- Re: (Score:2, Funny)
  
  by Anonymous Coward writes:
  
  PostgreSQL is definitely the best database software out there.
  - Re: (Score:2)
    
    by jmactacular ( 1755734 ) writes:
    
    Just wish we could easily re-order columns with FKs. =)
    - Re: (Score:3)
      
      by mwvdlee ( 775178 ) writes:
      
      I haven't yet had a need to re-order columns with FKs, despite having built, maintained and used hundreds of different tables in a variety of database products.
      Is there any good reason to do so, besides a desire to make old database tables look slightly prettier?
- Re: (Score:3)
  
  by gweihir ( 88907 ) writes:
  
  MySQL is not a toy. Oracle is a bloated monster that only survives by locking-in their customers. I know a lot of high-end customers that would ditch Oracle immediately if that would not mean rewriting a lot of software.
- - Re:Show me vs a real DB engine (Score:4, Funny)
    
    by gman003 ( 1693318 ) writes: on Sunday June 24, 2012 @10:22PM (#40434461)
    
    Ah, but it's an "enterprise-grade" toy.
    
Err... what? (Score:4, Interesting)

by Splab ( 574204 ) writes: on Sunday June 24, 2012 @07:42PM (#40433399)

Ok, so both article and video are extremely thin on details, the explanation for the massive performance is pretty much gibberish and their argument for ACID compliance is bullshit.
Just leaves me with the question, what are they trying to get out of this BS?

- Re:Err... what? (Score:5, Informative)
  
  by viperidaenz ( 2515578 ) writes: on Sunday June 24, 2012 @07:52PM (#40433503)
  
  Just leaves me with the question, what are they trying to get out of this BS?
  Your money, it's not a free piece of software.
  
- Re: (Score:3)
  
  by gweihir ( 88907 ) writes:
  
  Self-aggrandizement and money. When somebody claims they are better than everybody else, they are usually lying and knowing it.
- - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    Quite true. Of course this is something the "enterprise" DB vendors are desperate to hide and there are still enough people that do not have enough of a clue about database-theory. The problem is not SQL though, but the relational database model in general.
Meh. (Score:5, Insightful)

by hey! ( 33014 ) writes: on Sunday June 24, 2012 @07:45PM (#40433435) Homepage Journal

Give me fast enough, robust, easy to administer and standards compliant. Maybe a little less fast means you throw more hardware at a problem, but it doesn't matter if the overall cost and risk are inflated. A platform decision boils down to three things: (1) is it good enough; (2) is it economical; (3) if we decide later this doesn't work for us, are we totally screwed.
In any case, there's no meaningful way you can make a claim that a database management system is the fastest on the planet. All you have is benchmarks, and different benchmarks apply to different use-cases.

Pedant alert! (Score:3)

by PPH ( 736903 ) writes: on Sunday June 24, 2012 @07:52PM (#40433497)

What you have there is (or may be) the fastest database management system.
I have the world's fastest database. One table, one record, and one field (NULL).

Facebook engineers? Gah! (Score:4, Funny)

by TheMiddleRoad ( 1153113 ) writes: on Sunday June 24, 2012 @07:56PM (#40433531)

I wouldn't run my toaster on software engineered by someone from Facebook, let alone a database. I'd have to spend ten minutes searching for my toast, and it would show up the following week.

- Re:Facebook engineers? Gah! (Score:5, Funny)
  
  by duk242 ( 1412949 ) writes: on Sunday June 24, 2012 @08:02PM (#40433571) Homepage
  
  And then the next week, your toast would have changed from white bread to wholegrain and you're just going to have to get used to it.
  
- Re: (Score:2)
  
  by kiore ( 734594 ) writes:
  
  Not only would it tell all your "friends" and relatives what you are eating and when but the control for turning notifications off would be deeply buried next to the mains power wire and mysteriously switch itself back on at random intervals.
- Re: (Score:3)
  
  by Rary ( 566291 ) writes:
  
  Oh but come on. Their engineers are super leet! To work at Facebook, you have to win a drunken speed-hacking contest just to be a PHP coder!
- - But not dislike the toast. (Score:2)
    
    by TheMiddleRoad ( 1153113 ) writes:
    
    But not dislike the toast.
Faster than MUMPS? (Score:3)

by stanlyb ( 1839382 ) writes: on Sunday June 24, 2012 @08:12PM (#40433617)

Or, as it's known nowadays, CACHE? The best, the fastest, and the most reliable commercial database on the planet? Come on, guys, get real.

Nothing to see here, move along, folks. (Score:2)

by 140Mandak262Jamuna ( 970587 ) writes:

Some clever tricks and cache management. All the speed improvement seems to be coming via read/write speeds rather than any fundamental breakthrough, parallel implementation, massively parallel database, or any such thing. And the test was not a standard test but some hand-picked database and their own queries. Probably the original funders are planning to sell it down to the next set of chumps.
How do they write to disk faster? (Score:3)

by mounthood ( 993037 ) writes: on Sunday June 24, 2012 @08:36PM (#40433777)

They're durable and synchronously log all changes to disk, so what makes them faster? They do say this, from: http://developers.memsql.com/docs/1b/durability.html [memsql.com]
Reconfigure the server to use a faster disk. MemSQL exclusively relies on sequential (not random) disk writes, so using an SSD will dramatically improve durability write performance.
Are SSDs better at sequential writes? I thought their advantage was random reads, and they weren't any faster at writes than HDDs. Also, the data would become hopelessly out of order by only doing sequential writes, unless they're periodically re-writing all the data in order, which would mean lots more I/O than a typical DB.
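
One common way to reconcile sequential-only writes with ordered data is an append-only log plus periodic snapshots, which is the combination the summary attributes to MemSQL. The Python sketch below is a toy illustration of that general technique under stated assumptions: a list and a string stand in for the on-disk log and snapshot files, and `TinyStore` is a hypothetical class, not MemSQL's actual design or file format.

```python
import json


class TinyStore:
    """In-memory key/value store with an append-only log and snapshots."""

    def __init__(self):
        self.data = {}        # live, in-memory state
        self.log = []         # stand-in for the sequential on-disk log
        self.snapshot = "{}"  # stand-in for the last on-disk snapshot

    def put(self, key, value):
        # Append to the log first (a sequential write), then apply in memory.
        self.log.append(json.dumps({"k": key, "v": value}))
        self.data[key] = value

    def take_snapshot(self):
        # Write the whole state once, then drop the log entries it covers.
        self.snapshot = json.dumps(self.data)
        self.log.clear()

    def recover(self):
        # Rebuild state: load the snapshot, then replay newer log entries in order.
        state = json.loads(self.snapshot)
        for line in self.log:
            entry = json.loads(line)
            state[entry["k"]] = entry["v"]
        return state
```

The log never needs reordering because recovery replays it in write order on top of the snapshot; the extra I/O is the cost of rewriting the snapshot periodically.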

- Re: (Score:2)
  
  by hawguy ( 1600213 ) writes:
  
  Are SSDs better at sequential writes? I thought their advantage was random reads, and they weren't any faster at writes than HDDs. Also, the data would become hopelessly out of order by only doing sequential writes, unless they're periodically re-writing all the data in order, which would mean lots more I/O than a typical DB.
  They say they rely on snapshots and logging. I'm assuming that it periodically writes a snapshot of RAM to disk, then logs transactions in the log for recovery. Hopefully it snapshots different portions of RAM at different times so there's not one huge snapshot being written to disk every time.
  Though if I had a database where I needed 80,000 query/second performance, I'd probably want a cluster of these so if one machine goes down, the other machine can take over so I don't have to wait for the service to
- Re:How do they write to disk faster? (Score:5, Informative)
  
  by Surt ( 22457 ) writes: on Sunday June 24, 2012 @11:09PM (#40434731) Homepage Journal
  
  SSD is significantly faster than HDD at both sequential and random writes. Top 15K SAS drives write ~250MB/s sequential. Top SSDs write 550MB/s sequential. Write random and it gets much worse for the SAS drive. Try to even find an enterprise HDD benchmark done in the last year. No one bothers because enterprise buys SSD if they care about performance.
  
Speed vs. speed (Score:5, Interesting)

by Todd Knarr ( 15451 ) writes: on Sunday June 24, 2012 @08:58PM (#40433903) Homepage

Speed's fine, but what kind? Or more specifically, over what timeframe? High transaction rates are fine, but they don't do any good if you can only sustain them for a few seconds or minutes before the whole thing collapses. I want to know the transaction rate the thing can sustain over 24 hours of continuous operation. In the real world you have to be able to keep processing transactions continuously.
That long-time-period test also shows up another potential problem area: disk bottleneck. In-memory's fine, but few serious databases are small enough to fit completely in memory. And even if it will fit, you can't lose your database when you shut down to upgrade the software so eventually the data has to be written to disk. And that becomes a bottleneck. If your system can't flush to disk at least as rapidly as you're handling transactions, your disk writes start to lag behind. Sooner or later that'll cause a collapse as the buffers needed to hold data waiting to be written to disk compete for memory with the actual data. You can play algorithmic games to minimize the competition, but sooner or later you run up against the hard wall of disk throughput. And the higher your transaction rates are, the harder you're going to hit that wall.
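
The backlog argument above reduces to simple arithmetic, sketched here in Python. The rates and buffer size in the example are made-up illustrative numbers, not figures from MemSQL or any benchmark, and `seconds_until_full` is a hypothetical helper.

```python
def seconds_until_full(txn_bytes_per_s: float,
                       flush_bytes_per_s: float,
                       buffer_bytes: float) -> float:
    """Time until the write buffer fills, or infinity if the disk keeps up.

    Assumes constant rates: data arrives at `txn_bytes_per_s`, the disk
    drains it at `flush_bytes_per_s`, and `buffer_bytes` of memory is
    available to hold data that is waiting to be written.
    """
    backlog_growth = txn_bytes_per_s - flush_bytes_per_s
    if backlog_growth <= 0:
        return float("inf")  # the disk keeps up, so the backlog never grows
    return buffer_bytes / backlog_growth


if __name__ == "__main__":
    # Example: 300 MB/s of changes, 250 MB/s of disk throughput, 16 GB buffer.
    print(seconds_until_full(300e6, 250e6, 16e9))
```

Even a modest gap between ingest and flush rates exhausts a large buffer in minutes, which is why a sustained 24-hour run says more than a short burst.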

- Re: (Score:2)
  
  by symbolset ( 646467 ) * writes:
  
  FusionIO
- - Re:Speed vs. speed (Score:4, Insightful)
    
    by hawguy ( 1600213 ) writes: on Monday June 25, 2012 @12:17AM (#40435155)
    
    I can buy servers with over a Terabyte of ram, multiple power supplies and 4 x 10G interfaces for FCOE.
    What is a disk again other than to boot from.
    The disk is something to hold your data when a backhoe cuts your datacenter power, and cuts the network connections that you use to replicate data to your remote site.... then your UPS runs out of battery after an hour of transactions have been applied to the database with no replication to the remote site.
    Sometimes sh*t happens in ways you haven't planned for... when you have N degrees of redundancy, you'll get bit by the rare N+1 event. It's better to have your data stored somewhere that doesn't disappear after the power goes away (or the machine reboots).
    (if you're using your FCoE network to connect to the SAN to store your data, you're still using disks but there's no reason to use a local disk to boot from)
    
    - Re: (Score:3)
      
      by dkf ( 304284 ) writes:
      
      The disk is something to hold your data when a backhoe cuts your datacenter power, and cuts the network connections that you use to replicate data to your remote site.... then your UPS runs out of battery after an hour of transactions have been applied to the database with no replication to the remote site.
      We once had a "backhoe event" that cut the power cables between the point where the grid power and the UPS cables came together (our main UPS at the time was a 10MW diesel generator) and the point where they entered the datacenter building. There were only about 2 feet of cabling where they could have done this, but that's where someone put a jackhammer through. Aside from shutting us down in a great hurry, they also put themselves in the hospital, and at the same time blew the breakers on the grid substatio
  - Re:Speed vs. speed (Score:5, Interesting)
    
    by Todd Knarr ( 15451 ) writes: on Monday June 25, 2012 @02:36AM (#40435849) Homepage
    
    A terabyte of RAM costs quite a lot of money, far more than a terabyte of hard drive does. And it's not as big as it sounds; I've dealt with bigger databases. Usually the ones that demand the highest performance are also the ones that eat the most space once you start taking indexes and such into account.
    And multiple power supplies? Won't help you when the data center rack loses all power. I recall at least 2, maybe more, reports of total loss at data centers in the last 12 months, so it's not like it's that rare an event. That's not counting partial losses, or cases where someone simply fumble-fingered and powered down or rebooted the wrong server. And it certainly doesn't count maintenance outages when the server or the database software had to be restarted to upgrade software. Redundant power supplies won't help against that, and while it's no big deal normally it's a really big deal when it means losing 100% of the contents of the database when memory gets cleared. Sooner or later you need the data on persistent storage, disk or an equivalent. You can handwave that need over the short term, minutes to maybe hours, but when you start talking about maintaining the database for months to years it's a different story. And if you want to say you don't need that kind of up-time, well, the business people where I work would probably boot you out the door so hard you'd bounce twice for suggesting they could just live with losing all our data a couple of times a year. Having it happen even once would probably be the end of the company.
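
The "sooner or later you need the data on persistent storage" argument is what write-ahead logging addresses. The sketch below is a toy illustration of the idea only: `TinyLog` is a hypothetical class, not MemSQL's implementation, and it leaves out snapshotting, log truncation, and handling of a partially written final record.

```python
import json
import os


class TinyLog:
    """Toy in-memory key/value store that survives restarts via an append-only log."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        # Recovery: replay every logged record to rebuild the in-memory state.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                for line in f:
                    record = json.loads(line)
                    self.data[record["k"]] = record["v"]
        self._log = open(path, "a", encoding="utf-8")

    def put(self, key, value):
        # Append to the log and force it to stable storage *before*
        # updating memory and acknowledging the write to the caller.
        self._log.write(json.dumps({"k": key, "v": value}) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())
        self.data[key] = value

    def close(self):
        self._log.close()
```

Reads are served from `self.data` at memory speed, but every acknowledged write costs an `fsync`, so the sustained write rate is still bounded by the log device, which is the bottleneck described earlier in the thread.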
    
TimesTen Database (Score:2, Interesting)

by Anonymous Coward writes:

So what is the difference between MemSQL and TimesTen [wikipedia.org]?
Other than the 16 years TimesTen has been out longer, the fact that Oracle now owns TimesTen, that it runs on both 32bit and 64bit Linux and Windows, that it can run in front of another database engine to give it a boost, and that it has customer installations up to the Terabyte range.
Just another lame attempt to reinvent the wheel.
Filesystem anyone? (Score:2, Informative)

by Anonymous Coward writes:

Remember the good old days, when XYZ-db wasn't always available (or even desirable)? We used to use files.
Yeah, files. Novel concept. These days, mention ISAM to someone and they don't know what you're talking about!
If you really need speed, maybe a database isn't your best bet. Maybe, just maybe, you should consider structuring the data in a way that makes sense for your application using files.
- Re: (Score:3, Interesting)
  
  by Anonymous Coward writes:
  
  I work on a system like that right now in a really big company. Let me tell you something: it's shit. If you need concurrent access to the files/directories by several processes, you'll have a heap of issues. Consumers pick up files before they are completely written by the producers (now fixed by file renaming, but that required work). Some directories now hold 300k files, and any file operations are extremely slow; filesystems aren't designed for this (in the process of being fixed by splitting directories squid s
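
The two fixes mentioned in that reply, publishing files by rename and splitting large directories, can be sketched roughly as follows. The helper names, the `.tmp-` prefix, and the hash-based two-level layout are assumptions made for illustration; the comment is cut off before it describes the layout that system actually uses.

```python
import hashlib
import os
import tempfile


def shard_dir(root, name, levels=2):
    """Map a file name to a nested subdirectory such as root/ab/cd, so that
    no single directory has to hold hundreds of thousands of entries."""
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    parts = [digest[2 * i:2 * i + 2] for i in range(levels)]
    return os.path.join(root, *parts)


def publish(root, name, data):
    """Write data under a temporary name, then rename it into place, so a
    consumer never sees a half-written file under its final name."""
    target_dir = shard_dir(root, name)
    os.makedirs(target_dir, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=target_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        final_path = os.path.join(target_dir, name)
        os.replace(tmp_path, final_path)  # atomic within one filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path
```

Consumers would need to skip names starting with `.tmp-`. The temporary file is created in the destination directory on purpose, because a rename is only atomic when source and target are on the same filesystem.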
Looks like that old Prevayler "database" (Score:2)

by Lisias ( 447563 ) writes:

"No more porridge". Right.
This thing is ACID at least?
memSQL fully hubris acid trip compliant (Score:2)

by WaffleMonster ( 969671 ) writes:

MySQL the world's most popular open source database
memSQL the world's fastest database
PostgreSQL the world's most advanced open source database
SQLite most widely deployed SQL database engine in the world
I just wish people would dispense with their childish marketing bullshit already.
The Devil Is In The Detail (Score:4, Informative)

by Anonymous Coward writes: on Monday June 25, 2012 @02:10AM (#40435717)

I've had a love-hate relationship with MySQL for over ten years now, and have as much cause to hate it as anyone, but I have to point this out. Read the MemSQL docs carefully, and here's the killer - they only support single-query transactions, and only at isolation level READ COMMITTED.
Until those two facts change, it's hardly a fair comparison.
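
To see why the single-query limitation matters, consider a transfer that must update two rows together or not at all. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in engine; the `accounts` table and `transfer` helper are hypothetical, and nothing here reflects MemSQL's or MySQL's actual API.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates commit together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()
```

With transactions limited to a single query, each `UPDATE` would commit on its own, so a failure between the two statements would leave money debited from one account and never credited to the other.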

Qualifications? (Score:3)

by Fnord666 ( 889225 ) writes: on Monday June 25, 2012 @09:58AM (#40438131) Journal

Shamgunov has excellent credentials in the database world, in spite of having worked at Microsoft on SQL Server for six years.
FTFY

- Re:Looks good for testing (Score:5, Insightful)
  
  by realityimpaired ( 1668397 ) writes: on Sunday June 24, 2012 @08:10PM (#40433611)
  
  As a long time SysAd/webmaster/developer, I'm certainly interested
  At the risk of sounding incredibly condescending....
  If you were really a sysadmin who could benefit from that kind of speed improvement, you'd know that it's possible to achieve that level of performance with MySQL already, by either running it from memory or by using a fast hard drive array. The simplest/cheapest option to drastically improve MySQL performance is to throw a large amount of RAM at a system and point MySQL at the memory. MySQL can be configured to keep the database in active memory and sync to the disk on a regular basis, which is almost exactly the kind of behaviour described for MemSQL... for an exceptionally large database that can't be stored in system memory, I imagine that the advantage that MemSQL is boasting would evaporate. There are other ways to go about doing it, such as running a fast disk array or a cluster, in order to get around the limitations of using RAM, but ultimately the prime determining factor for speed in MySQL is speed of access to the database file itself.
  
  - Re: (Score:3)
    
    by errandum ( 2014454 ) writes:
    
    I'd love to see their tests when this DB needs to go into swap / pagefile. It's double the slowdown, needs to write into the swap (disk I/O) and then sync the DB (disk I/O again).
    I can't, for the life of me, understand where this will be better than the already available options.
    - Re: (Score:2)
      
      by hawguy ( 1600213 ) writes:
      
      I'd love to see their tests when this DB needs to go into swap / pagefile. It's double the slowdown, needs to write into the swap (disk I/O) and then sync the DB (disk I/O again).
      I can't, for the life of me, understand where this will be better than the already available options.
      I think the point of an in-memory database is that you size your machine so it does *not* need to swap in normal use. Otherwise, as you said, you lose all of the speed, and worse, because the operating system decides what to swap out and may not make the most efficient choice. (Though they probably just mlock() the memory buffers into RAM and prevent any of the database RAM from being swapped out at all.)
      But if the architect did expect the machine to swap at times, he probably wouldn't put the swap and
      - Re: (Score:3)
        
        by errandum ( 2014454 ) writes:
        
        I meant two disk accesses, one way or another. From what I read they would never be simultaneous anyways.
        Either way, this would be useful (actually IS, some solutions do this) in the Business Intelligence field. But the whole point of keeping everything in memory is moot when you have petabytes of information that you need to process during your ETL. What matters in this database is, how well does it behave in a cluster and how would it handle concurrency (ACID? Eventually synchronized?).
        I doubt this is all that u
        
        Re: (Score:2)
        
        by hawguy ( 1600213 ) writes:
        
        I meant two disk accesses, one way or another. From what I read they would never be simultaneous anyways.
        Either way, this would be useful (actually IS, some solutions do this) in the Business Intelligence field. But the whole point of keeping everything in memory is moot when you have petabytes of information that you need to process during your ETL. What matters in this database is, how well does it behave in a cluster and how would it handle concurrency (ACID? Eventually synchronized?).
        I doubt this is all that useful for common DB applications like websites and the like. Relational DB's have been proving to be enough for everything (ex: Youtube uses mysql shards - or used to) purely web related for a while now, I doubt this is a gamechanger at all.
        Actually, I thought this would be less useful with large databases (like a large data warehouse), and more useful with webservers. If you have a busy website and your core database is measured in gigabytes and not terabytes, it's probably cheaper and easier to run it in-memory than to build out a distributed cluster of SQL nodes to handle the transaction load. $15K buys you a server with 16 cores of CPU and 384GB of RAM.
      - Re: (Score:2)
        
        by errandum ( 2014454 ) writes:
        
        I see your point, but I disagree. I consider it better to have the webserver running slowly than having it crash because it ran out of memory. To do that might be a choice, but you could just go here ( http://unixfoo.blogspot.pt/2007/11/linux-performance-tuning.html [blogspot.pt] ).
        
        Re: (Score:3)
        
        by hawguy ( 1600213 ) writes:
        
        Just say no to swap. It's pointless, except as a crutch for broken software. And it's dangerous on a server. If an application wants disk-backed VM, it can use mmap.
        Swap isn't just a crutch for broken software (though it can be), sufficient RAM is not always available. In a perfect world, all servers would have more RAM than their applications ever need, more cores than the processes can take advantage of, and all disks would be RAID-10 arrays of SSD's.
        But back in the real world where most of us have to live, swap does come into use at times to let a server accommodate loads that it otherwise couldn't handle due to the memory footprint of the software it's running. Swap
  - Re: (Score:2, Insightful)
    
    by Anonymous Coward writes:
    
    If you were really a sysadmin who could benefit from that kind of speed improvement, you'd know that it's possible to achieve that level of performance with MySQL already, by either running it from memory or by using a fast hard drive array.
    
    The guys that wrote it are former Facebook employees. So I have to assume they know how to get the best performance out of MySQL, and that it doesn't suit their needs for whatever reason.
    The article doesn't really go into much detail about why, but my point is really ab
  - Re: (Score:2)
    
    by frosty_tsm ( 933163 ) writes:
    
    Sysadmins benefit from tools that offer significant speed improvements while not sacrificing reliability or ease of recovery. There are some serious questions about the data loss from a system crash (which is more common these days due to cloud stuff) if a transaction is not committed to disk. It comes down to how valuable the lost data is when a crash occurs.
    
    Getting a speed boost while setting a time bomb for yourself isn't really a smart decision.
- - Re: (Score:3)
    
    by icebraining ( 1313345 ) writes:
    
    History
    The ARPANET, the predecessor of the Internet, had no distributed host name database. Each network node maintained its own map of the network nodes as needed and assigned them names that were memorable to the users of the system. There was no method for ensuring that all references to a given node in a network were using the same name, nor was there a way to read the hosts file of another computer to automatically obtain a copy.
    The small size of the ARPANET kept the administrative overhead small to maintain an accurate hosts file. Network nodes typically had one address and could have many names. As local area TCP/IP computer networks gained popularity, however, the maintenance of hosts files became a larger burden on system administrators as networks and network nodes were being added to the system with increasing frequency.
    http://en.wikipedia.org/wiki/Hosts_(file) [wikipedia.org]
- Top coder (Score:5, Interesting)
  
  by Taco Cowboy ( 5327 ) writes: on Sunday June 24, 2012 @09:21PM (#40434045) Journal
  
  They did have an ad to lure in "Top Coders" at http://developers.memsql.com/blog/ [memsql.com]
  Apart from their ad, what they said about Top Coders was interesting, with the exception of top coders memorizing whole books filled with algorithms, because top coders do not memorize anything. Top coders do not get to be top coders by memorizing.
  Instead, top coders have that instinct to _know_ which algorithm to adapt and apply, and top coders know where (and how) to look for the algorithm (either from their own archive, from books, from old magazines, or from some strange corners on the Web).
  
  - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    Quite true. That is also what competent computer scientists do: Learn the rough border conditions of a problem and its solutions and look up details when needed. Be able to construct something reasonable when no solution can be found in the literature. Committing details to memory is only for those weak of mind. Of which there are many.
    - Re:Top coder (Score:5, Interesting)
      
      by Surt ( 22457 ) writes: on Sunday June 24, 2012 @10:55PM (#40434653) Homepage Journal
      
      All of the best developers I've met had phenomenal memories. I think both a natural reasoning ability and great memory are assets. If you are missing one, you aren't going to be as strong as someone who has both.
      
      - Re:Top coder (Score:5, Interesting)
        
        by gweihir ( 88907 ) writes: on Sunday June 24, 2012 @11:26PM (#40434851)
        
        I have met quite a few people that could fake being good coders using really good memory. They were in fact at best mediocre coders and sometimes really bad ones. While these people can code solutions to simpler things really fast, they usually do not notice when they are out of their depth and would need to look up things or think about them for a while. Then they screw up royally. That most people mistake them for really good coders (and no, memory does not help reasoning ability, it hinders it) makes things worse. One of the hallmarks of a great coder is a very keen sense for when he/she needs to be careful because something is more difficult than it appears to be. Those with really good memories regularly fail that test. Bad memory is an asset here.
        
        
        Re: (Score:3)
        
        by Surt ( 22457 ) writes:
        
        I definitely disagree. People with great memories can bring context to a problem that a lousy memory just can't. If you can't hold all 20 factors to consider in your mind at once, meandering from one to another will leave you with a solution that effectively considers only a couple.
        I have reasonably solid anecdotal evidence on this. I've seen top coders with great memories produce software that dominates their industry in three different industries now, and some of that software is now in the mid third d
        
        Re:Top coder (Score:5, Insightful)
        
        by mwvdlee ( 775178 ) writes: on Monday June 25, 2012 @01:25AM (#40435517) Homepage
        
        Juggling 20 factors in your brain (short term memory) is not the same as having a good memory (long term memory).
        In fact they literally use different parts of the brain.
        
  - Re:Top coder (Score:5, Insightful)
    
    by zig007 ( 1097227 ) writes: on Monday June 25, 2012 @05:15AM (#40436441)
    
    Except that so very little of programming these days is about algorithms.
    Rather, it is about elegantly solving business problems and knowing one's way around huge frameworks.
    Being a "top coder" is in itself a very good thing of course, but there are very few companies that actually work with technical details like implementing a better hash algorithm and so forth.
    Rather, in most developers' jobs, it is very valuable to:
    * be good at understanding, handling and especially changing large systems.
    * be good at producing solutions that reasonably balance cost and customer demands against simplicity, performance, structure and other technical values.
    * be able to foresee how the solutions will be used over different time frames, and through this make systems cheaper and easier to evolve. Sometimes a super quick and butt-ugly solution is a really good thing to get the customer going while it figures out what it really wants, as long as all parties are aware of the situation and know that a complete rewrite will have to be paid for next.
    * not act like a stubborn child when one's pet solution or technology gets scrapped or rejected, or when the rest of the company thinks it is risky to invest time in going down that road. Just keep pushing on.
    * be professional and keep on working even though the current thing is really boring.
    
- Re: (Score:3)
  
  by symbolset ( 646467 ) * writes:
  
  Newsflash: servers come with up to 2 TB RAM now.
- Re: (Score:3)
  
  by pipatron ( 966506 ) writes:
  
  I'm not so sure that this database would fly that fast if it was running on a beowulf cluster of Raspberry Pi with OSX.
- Re: (Score:2)
  
  by unixisc ( 2429386 ) writes:
  
  Actually, which license does this use in the first place? CDDL? GPL2? GPL3? Any other?
