
 



Data Storage Upgrades

MySQL Clustering Software Launched 48

Posted by timothy
from the englobulate-positively dept.
lawrencekhoo writes "MySQL AB announced yesterday that software for building a MySQL Cluster will be available for download by the end of April. Articles available from Computerworld, Internetnews, Linux Electrons, and PHP Architect. Great! Now my website can finally have 99.99% availability ..."
This discussion has been archived. No new comments can be posted.

  • More info... (Score:5, Informative)

    by Apiakun (589521) <tikora AT gmail DOT com> on Wednesday April 14, 2004 @02:13PM (#8862824)

    Here are some direct links to more information:

    Oh, and they say availability is 99.999%, not just 99.99% :)
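
    To put the two figures side by side, here is a quick back-of-the-envelope sketch in Python of how much downtime per year each one allows. The function name is only for illustration, and the sketch assumes a 365-day year and counts every kind of downtime, planned or not:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the minutes of downtime per year permitted at a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for pct in (99.99, 99.999):
    print(f"{pct}% -> {allowed_downtime_minutes(pct):.2f} minutes/year")
```

    Under those assumptions, "four nines" allows a bit under an hour of downtime per year, while "five nines" allows only a little over five minutes.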
  • MySQL Cluster combines the world's most popular open source database with a fault tolerant database

    It's nice to start out a press release with a lie, isn't it? As far as I know, the title of the world's most popular open source database (meaning it has the most installs around the world) belongs to Berkeley DB [sleepycat.com].
    • Berkeley DB is a relational database? Does it have anything to do with MySQL apart from the fact that they both have something to do with data?
      • The quoted text says nothing about relational databases.
        • Since it is blindingly obvious that the article meant the kind of database that MySQL is (client/server, SQL syntax to store/retrieve data, relational, etc.), there is no need to mention that.

          However, if you believe that by not mentioning this it is open to any interpretation possible, I would suggest that neither Berkeley DB nor MySQL is the most popular. I am sure ext2fs is installed on more machines than MySQL, so the article is lying. It's not MySQL, but ext2fs, that is the most popular database.
          He

    • by dacarr (562277) on Wednesday April 14, 2004 @02:33PM (#8863008) Homepage Journal
      It's PR. Remember, The SCO Group is "a leading provider of UNIX-based solutions", per many of their press releases. It doesn't make it any more acceptable, it's just a tactic. Chill.
    • by Apiakun (589521)
      It really depends on what the meaning of is is. Does popularity mean that it is the most used, or the most liked? I would think that popularity and usage are different metrics.
    • by jtheory (626492) on Wednesday April 14, 2004 @02:48PM (#8863146) Homepage Journal
      Apples to oranges. The press release should have been more specific than just "database", but still... Berkeley DB is not a "database" as most developers think of the term (relational, accessible using SQL, etc.).

      Berkeley DB is code that manages a data store, and you access the data using method calls within your app (you compile their code with your project), NOT using SQL, and NOT connecting to an independent application. Remote access n/a, no ODBC or JDBC, etc. Great product, but a completely different animal from MySql and other relational databases.

      In fact, MySql used to offer Berkeley DB (as opposed to InnoDB, etc.) as a data storage option WITHIN the MySql product.
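
      To illustrate the difference in access style, here is a rough Python sketch. It uses neither Berkeley DB nor MySQL: the standard library's `dbm.dumb` and `sqlite3` modules are stand-ins chosen only because they ship with Python, and the function names and sample data are made up for the example. Note that `sqlite3` is itself an embedded library rather than a client/server product, so the sketch shows only the "method calls versus SQL" half of the contrast:

```python
import dbm.dumb
import os
import sqlite3
import tempfile


def key_value_lookup(directory: str) -> bytes:
    """Store and fetch a record through plain method calls, with no SQL involved."""
    path = os.path.join(directory, "users")
    with dbm.dumb.open(path, "c") as store:
        store[b"user:1"] = b"alice"  # write: a dictionary-style assignment
        return store[b"user:1"]      # read: a lookup by exact key


def sql_lookup() -> str:
    """Store and fetch the same record by handing SQL statements to an engine."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (1, "alice"))
        row = conn.execute("SELECT name FROM users WHERE id = ?", (1,)).fetchone()
        return row[0]
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as tmp:
    print(key_value_lookup(tmp))
print(sql_lookup())
```

      In the first function the application asks for an exact key and gets back raw bytes; in the second it describes what it wants in SQL and the engine works out how to find it.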
    • by jonadab (583620)
      I think they're using "database" here to mean RDBMS. Technically a database is
      just anything that organises data, so a filesystem would count, but that's not
      how the term is generally used. Usually these days when people say database
      they mean RDBMS.

      The other thing is, most installs is not the only reasonable measure of
      popularity. I'm pretty sure more people have daily interaction with MySQL
      than with Berkeley DB directly. Berkeley DB is installed so widely because
      it's been around longer and because certai
    • I don't think it means the most installs. For example, if MySQL had scantily-clad babes advertising it, then it could be really popular even if it wasn't installed a lot.
  • What about PG? (Score:5, Interesting)

    by Anonymous Coward on Wednesday April 14, 2004 @02:35PM (#8863025)
    I remember someone developing a rather advanced multi-master replication and clustering system for PostgreSQL. Does anyone know how far along that project is? Has it entered the testing phase yet?
    From what I've read it looked very, very promising, but it doesn't do much good if it's on paper only...
  • In memory only? (Score:4, Insightful)

    by diegomontoya (712934) on Wednesday April 14, 2004 @03:01PM (#8863247)
    If this is the required deployment, then for people like us, where the db size is over 20GB (and yes, the big blogs are already stored compressed), this would not be economically practical to use. Factoring in OS and caching, I need to get 22GB of memory for each node? Last I checked, the 2GB chips are still nasty expensive.
    • Wow...I have to apologize for my atrocious spelling and grammar in the post. I'm usually not this bad. =)
    • Re:In memory only? (Score:5, Informative)

      by Unknown Relic (544714) on Wednesday April 14, 2004 @06:21PM (#8864647) Homepage
      I was wondering this as well. Also the FAQ mentions that "Data that needs to be highly-available must reside in the MySQL Cluster storage engine. If existing MyISAM and/or InnoDB data needs to be made highly-available then it has to be migrated to the MySQL Cluster storage engine." I'd assume that the clustered table types have support for transactions like InnoDB tables do, but there's nothing here to confirm this.

      From the way I'm reading it, this type of cluster would be most economically used in conjunction with a traditional replicated MySQL database. You would use the clustered engine for transactional tables where a large number of inserts or updates occur, and for tables where you have a lot of historical or read-only data, you would use standard replication, where you could tolerate a few minutes without the ability to insert or update should the master fail. In order to reduce the memory requirements for the cluster you could also move old transactions from the transactional tables to historical tables which use InnoDB/MyISAM.

      That being said, there must be SOME use of the disk on the cluster, because their recommended node system has raid + four 73GB SCSI hard drives... major overkill if everything except for OS/Software is stored in memory!
      • by diegomontoya (712934) on Wednesday April 14, 2004 @07:22PM (#8865182)
        Nowhere did they mention battery backed-up RAM modules as a recommended config, so I believe you're correct to assume that disk not only has to be used, but MUST be used.

        Without RamSan-style battery-backed RAM, there is no way any enterprise would trust clusters of any kind to RAM-only storage for write commits.

        Looks like each write transaction will be synchronized across all nodes, which would explain the gigabit and lower-latency interconnects. Still, this is crazy complex to make fast and reliable.

        So to make it truly synchronized, they have to write to disk, for backup/log, before committing the data to RAM. So regardless, writes are slow, and I'm waiting to see how they bypass this disk write commit latency. Add to that the fact that they have to do this for all nodes before responding to the app, and writes are crazy slow, relatively, since they can influence indices, force flushes of cached/in-RAM data, etc. Would be interesting to see how they handle this.

        Also, I'm interested to see what type of check code/algorithm they use to determine which nodes are healthy and which are corrupt (not dead, since dead servers are the easiest to detect). From their diagrams, it looks like N-type replication, so each node is an exact synchronized duplicate of all the others. But how do you know for sure which one is the "safe" one when corruption happens?

        Also, I wonder how they tackle gigantic inserts/updates like "replace into table2 select * from gigantic_table1". They can't assume or dictate that we only stick to small write transactions, right?

        Cheap N-way synchronized replication is my, and probably most DBMS managers', holy grail, so I'm crossing my fingers for MySQL to get this right.
        • Actually, according to MySQL, diskless servers will be supported very soon. I'm at the MySQL con in Orlando and this was one of the first questions that came up.
        • MySQL Cluster will only write to the disk in an asynchronous manner. The disk is only needed if there is a total cluster failure (i.e., all machines go down at once).

          However the data is written synchronously to more than a single node. So when you insert data, it is inserted into two places (or more, it is configurable) at the same time. That way even if one server goes down, you will still query from the other place.

          The result of this is that it will still scale linearly for writes as well. Keep in min
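
          The write path described in this comment (copy the data into two or more nodes' memory before acknowledging, write to disk later) can be mocked up as a toy Python simulation. This is only a sketch of the idea as stated here, not MySQL Cluster's actual implementation; the `Node` and `ReplicatedStore` classes and every name in them are hypothetical:

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical in-memory data node; 'disk' stands in for a checkpoint or log."""
    name: str
    alive: bool = True
    memory: dict = field(default_factory=dict)
    disk: dict = field(default_factory=dict)

    def flush_to_disk(self) -> None:
        # In this sketch the flush runs later, outside the write path.
        self.disk = dict(self.memory)


class ReplicatedStore:
    def __init__(self, nodes, replicas=2):
        self.nodes = nodes
        self.replicas = replicas  # configurable number of copies

    def write(self, key, value) -> int:
        """Copy the value into `replicas` live nodes before acknowledging."""
        targets = [n for n in self.nodes if n.alive][: self.replicas]
        if len(targets) < self.replicas:
            raise RuntimeError("not enough live nodes to satisfy the replica count")
        for node in targets:
            node.memory[key] = value  # memory only; no disk write here
        return len(targets)

    def read(self, key):
        """Answer from any live node that holds the key."""
        for node in self.nodes:
            if node.alive and key in node.memory:
                return node.memory[key]
        raise KeyError(key)


a, b = Node("a"), Node("b")
store = ReplicatedStore([a, b], replicas=2)
store.write("order:1", "paid")
a.alive = False               # one server goes down
print(store.read("order:1"))  # still served by the surviving node
b.flush_to_disk()             # the disk copy is made later, off the write path
```

          The point of the sketch is only that the acknowledgement waits on the in-memory copies, not on any disk write, so losing a single node does not lose the data.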
  • by denubis (105145) <brian@NOspAM.technicraft.com> on Wednesday April 14, 2004 @06:10PM (#8864538)
    You know you're a Database geek when you see the headline and immediately think: "Ah hah! Clustered indexes! That'll save some time during joins! Oh. Wait. They're talking about boxes. Drat."
  • Node requirements (Score:2, Insightful)

    by Anonymous Coward
    The standard requirements for the node surprised me.

    It states that you need 16GB of RAM!! Why do they say that? Doesn't the amount of RAM depend on the size of your database? If my InnoDB database file is only 3GB, why would I need more than 4GB of RAM?

    Also, why the hell would you need SCSI drives for an in-memory database?

    • Not to mention that they don't support the Pentium line of processors. From FAQ - MINIMUM requirements:
      1x Intel Xeon, Intel Itanium, AMD Opteron, Sun SPARC, IBM PowerPC
      I guess they think you won't bother clustering on anything less than a hefty server. I won't be testing this at home.
    • 16 GB is the "preferred" requirement; the minimum is 512 MB. Quite a difference.
  • by Anonymous Coward on Thursday April 15, 2004 @05:50AM (#8867498)
    I mean, this is an enterprise-scale storage engine from the same engineering team that used to deride ACID transaction isolation and rollback as unimportant, and whose parser still silently ignores any attempt to use integrity constraints that aren't supported. Are these the right people to achieve the robustness that needs to accompany "five nines"?
  • by Vexware (720793) on Thursday April 15, 2004 @01:36PM (#8872641) Homepage

    For the lazy among you (and lazy you have to be to find the task of entering a few fields in a form exhilarating), I have uploaded the MySQL Cluster white paper to another FTP site, as a mirror of the file, which you may access here: mysql-cluster-whitepaper.pdf [zemelon.net] (the document is a PDF file, so fear the Adobe Acrobat Reader loading time).
