Facebook Trapped In MySQL a 'Fate Worse Than Death'
wasimkadak writes with this excerpt from GigaOM: "According to database pioneer Michael Stonebraker, Facebook is operating a huge, complex MySQL implementation equivalent to 'a fate worse than death,' and the only way out is 'bite the bullet and rewrite everything.' Not that it's necessarily Facebook's fault, though. Stonebraker says the social network's predicament is all too common among web startups that start small and grow to epic proportions."
Commercial databases (Score:3, Interesting)
Well, then they convert from one DB to another. So what? It's not like that would be a completely new thing to happen, and I am sure that Oracle or any other big DB provider will send experts to help with the task.
Nah, this is standard practice (Score:2)
Delegate scalability downwards. Throw hardware at the problem.
Re: (Score:2)
It may not be a hardware problem, it may be a problem that actually has more to do with the fact that Oracle owns MySQL.
Oracle vs Facebook? (Score:3)
It may not be a hardware problem, it may be a problem that actually has more to do with the fact that Oracle owns MySQL.
It's not unreasonable to suppose Oracle might "nudge" Facebook into the deeper end of Oracle's trough of slimy swill. But who to root for? This is a bit of a conundrum. Seeing Facebook's delicate bits getting squeezed is not an unattractive proposition, but seeing Oracle benefit therefrom would be appalling.
Maybe... (Score:3)
Maybe this guy's problem is that Facebook HAS created such a large and successful business without paying Oracle millions of dollars or his company millions of dollars.
Kind of sounds like that commercial for Scottrade where the Fat Cat broker is trying to keep his clients so he gets his fat commissions.
Re:Maybe... (Score:4, Insightful)
Um, RTFA? It's not a pitch for Oracle. In fact, it's a rant against SQL in general. Quote:
In Stonebraker’s opinion, “old SQL (as he calls it) is good for nothing” and needs to be “sent to the home for retired software.” After all, he explained, SQL was created decades ago before the web, mobile devices and sensors forever changed how and how often databases are accessed.
Sounds like the usual NoSQL FUD, right? But wait, there's more here:
Stonebraker thinks sacrificing ACID is a “terrible idea,” and, he noted, NoSQL databases end up only being marginally faster because they require writing certain consistency and other functions into the application’s business logic.
Right... so what then? More magic buzzwords to the rescue!
But Stonebraker — an entrepreneur as much as a computer scientist — has an answer for the shortcoming of both “old SQL” and NoSQL. It’s called NewSQL or scalable SQL ... Pushed by companies such as Xeround, Clustrix, NimbusDB, GenieDB and Stonebraker’s own VoltDB, NewSQL products maintain ACID properties while eliminating most of the other functions that slow legacy SQL performance. VoltDB, an online-transaction processing (OLTP) database, utilizes a number of methods to improve speed, including by running entirely in-memory instead of on disk.
Now the article is pretty light on details regarding what that "new" SQL is, and Googling around doesn't really help. So far, to be honest, it sounds more like a bunch of DB makers have ganged together and come up with a nifty word to market their products against Oracle, DB2, MSSQL, Postgres etc - if it's "new" it must be good, right?
Re: (Score:3, Funny)
Maybe Facebook should just put all our data in the cloud...it's not like security or privacy is a big concern for Facebook...
Re:Oracle vs Facebook? (Score:4, Informative)
Use Postgres.
It costs the same as MySQL, $0, and is 100 times the DB.
It offers far better data integrity, it supports transactions out of the box, it will handle DBs in the TB range, and is about as standards compliant as DBs get.
The company I work for uses it for our service that we sell.
Re: (Score:3)
I'll second h4rr4r. Use Postgres. Until you are very big, a DBMS will only make any difference if you choose one of those that trade reliability and compatibility for speed (like MySQL). Your best strategy at the beginning is to start with a cheap one, and there is none cheaper than 0.
When you get big enough that the DBMS will make any difference, you'll discover it is cheaper to add hardware than going with a proprietary one anyway. The best strategy here is to go with a cheap one, and there is none cheape
Re: (Score:3)
If you are worth your weight as a developer, you've already built a model isolation layer where all your queries live, so it's not that hard to rewrite the queries. If this was to be expected, you've made it far simpler already.
In any case, I don't see the anti-MySQL points. I've tried Postgres once - that was enough, I'm not going back to it. It was weird as shit, required some weird conundrums for permissions and DBs changes, didn't seem to be properly isolating but more like hacked together to support
Re: (Score:2)
SQL is a standard, but every provider implements it differently, with their own additions. So, for any non-trivial uses of SQL, you need to do at least some changes.
In some cases, the changes could be really big. Especially when using some of the more complex features, like the support for recursive queries.
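As an illustration of the recursive-query point, here is a minimal sketch using Python's bundled sqlite3 module and the WITH RECURSIVE form of a common table expression. The table name, column names, and data are made up for the example. Other products have historically offered their own syntax for the same job (Oracle's CONNECT BY, for instance), which is the kind of difference that turns a port into a rewrite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
    INSERT INTO category VALUES (1, NULL, 'root');
    INSERT INTO category VALUES (2, 1, 'databases');
    INSERT INTO category VALUES (3, 2, 'relational');
    INSERT INTO category VALUES (4, 1, 'languages');
""")

# Recursive common table expression: start at 'databases' (id 2)
# and keep following parent_id links downward.
rows = conn.execute("""
    WITH RECURSIVE subtree(id, name) AS (
        SELECT id, name FROM category WHERE id = 2
        UNION ALL
        SELECT c.id, c.name
        FROM category c JOIN subtree s ON c.parent_id = s.id
    )
    SELECT name FROM subtree ORDER BY id
""").fetchall()

descendants = [name for (name,) in rows]
print(descendants)
```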
Re: (Score:3)
I would guess that instead of using PDO or similar abstraction layer, their PHP code is littered with "mysql_*" function calls, so they'll necessarily need to modify everything to handle any other database.
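To make the abstraction-layer idea concrete, here is a rough sketch in Python (not PHP, and not anything Facebook actually runs). The class name UserStore and its methods are hypothetical; the point is only that every SQL statement lives in one class, so the rest of the code never calls a driver-specific function directly.

```python
import sqlite3


class UserStore:
    """Hypothetical access layer: the only place that contains SQL."""

    def __init__(self, conn):
        self._conn = conn

    def create_schema(self):
        self._conn.execute(
            "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def add_user(self, name):
        # Parameter binding instead of string concatenation.
        cur = self._conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
        return cur.lastrowid

    def find_name(self, user_id):
        row = self._conn.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return row[0] if row else None


# Callers only ever see UserStore, never the driver.
store = UserStore(sqlite3.connect(":memory:"))
store.create_schema()
uid = store.add_user("alice")
print(store.find_name(uid))
```

Even with a layer like this, a port is not free: placeholder style and some SQL details differ between drivers, but the changes are confined to one class instead of being scattered through the code.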
Or just wait for enough people to move to Google+ instead so that their database load is reduced...
Re: (Score:3)
SQL is a standard, but no, "SQL" isn't standard. There are syntax differences between databases, and if you get into stored procedures (or equivalent) and triggers (or equivalent), or rely on referential integrity (which is implemented on some RDBMS systems, but not others, and doesn't always work the same), it won't be a matter of dumping the database from one RDBMS and then importing it to another RDBMS. Things are going to break.
I'd hate to have to deal with a(-) Facebook dump file(s); I'm sure everythi
Re: (Score:3)
And, to add to that, Facebook is insane if they didn't implement what is commonly called an "access layer" for abstraction, so that the system can be rapidly ported from one RDBMS to another. However, even if they did implement that in their architecture, some issues come up: is it implemented throughout the project, or did some developers bypass it for performance, and is it intermingled with presentation code? Can they re-implement the access layer without performance suffering? Does the new RDBMS provide
Re: (Score:2)
> And, to add to that, Facebook is insane if they didn't implement
> what is commonly called an "access layer" for abstraction, so
> that the system can be rapidly ported from one RDBMS to another.
"Access layer" aka "database independence". Otherwise known as "absolute death to any hope of performance and scalability". The reason one pays large sums of money for an Oracle, DB2, or even SQL Server implementation and programming is _exactly_ to get the performance and scalability potential and improv
Re: (Score:3)
> You write the code that actually does the queries as stored procedures
> in the database, then write a DAL that essentially works as a database
> driver. Your code does nothing to the DB other than requesting that it
> execute an SP, and the SPs can be tuned for the specific database server.
True, but that is not writing "database independent code" - it is writing separate versions of the code for each database and building a good UI that can be configured to be compatible with all of the versions
Re: (Score:3)
> If you're actually suggesting writing X versions of stored procedures
> just to be able to run on X different DBs that's not solving the
> migration problem. That's continuously living in the migration problem.
Yes, that is exactly what the parent is suggesting, since experience shows it is the only way to get correct, performant, scalable systems. Organizations that actually have a need to support multiple databases (few truly do) generally find that this method is in the end less labor-intensive
Re: (Score:3)
...how the heck do you write performant code that works against both databases?
Erm... writing _two_ code baselines that provide the same high level interface, perhaps. It's not as if that same problem hasn't been dealt with several times before (compiling to different architectures, for example).
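A small sketch of the "two baselines, one interface" approach, in Python. The class and function names are hypothetical and the posts table is invented; the only real-world detail relied on is that MySQL limits rows with LIMIT while SQL Server uses SELECT TOP.

```python
from abc import ABC, abstractmethod


class Dialect(ABC):
    """Common high-level interface; one subclass per database product."""

    @abstractmethod
    def newest_posts_sql(self, limit: int) -> str:
        ...


class MySQLDialect(Dialect):
    def newest_posts_sql(self, limit: int) -> str:
        # MySQL limits rows with a trailing LIMIT clause.
        return f"SELECT id, body FROM posts ORDER BY created_at DESC LIMIT {int(limit)}"


class SQLServerDialect(Dialect):
    def newest_posts_sql(self, limit: int) -> str:
        # SQL Server puts TOP right after SELECT instead.
        return f"SELECT TOP {int(limit)} id, body FROM posts ORDER BY created_at DESC"


def newest_posts_query(dialect: Dialect, limit: int = 10) -> str:
    # Application code depends only on the interface, not the product.
    return dialect.newest_posts_sql(limit)


print(newest_posts_query(MySQLDialect(), 5))
print(newest_posts_query(SQLServerDialect(), 5))
```

Each subclass can be tuned for its own server without touching the callers, at the cost of maintaining one implementation per supported database.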
Sometimes. I once dealt with a database called UNIFY which had a piss-poor query planner. It tended to overwhelmingly favor certain types of indexes (which were built implicitly for you whenever you had a foreign key relationship) over any other kind. We had a frequently used query against a sku database on style, color, and size. There were indexes on any combination of those fields, but color was also a foreign key to the color table. Which meant that it ALWAYS used the color index. Problems rose up
Re: (Score:2, Insightful)
Yes and no. There is ANSI SQL. PostgreSQL is probably one of the more compliant databases and is by far one of the more portable solutions. But even that is iffy.
MySQL is on the other end. MySQL is well known for being non-compliant, teaching very poor SQL code, offering minimal SQL compatibility and lowest common denominator features to achieve the same goal. That's also why, contrary to the lies and marketing hype, MySQL is almost always one of the slowest and least scalable solutions of any generally ava
Re: (Score:2)
> Far too often, vast ignorance, huge ego, and massive pride
> prevent people from considering alternative database solutions
> and their ignorance of the domain allows them to quickly
> become self assured they've picked a winner.
I won't say that those factors don't come into play, because they do. But I think a more fundamental problem is that there has never been a good source of education on how relational databases really work in the corporeal world (as opposed to the theoretical world of CS
Re: (Score:3)
Oh give MySQL a break. They finally fixed it so it no longer recognized February 31st as a real date, so they are making progress.
And MySQL hates babies and kittens! (Score:4, Insightful)
Geez, GooberToo, did a MySQL developer kill your father or something? You've posted two giant rants about how MySQL is so unsuitable for anything that it can't possibly work for any serious project. You make it sound like simply installing MySQL causes a server to immediately explode.
You *are* aware that Facebook, Slashdot, Wikipedia, and many other sites use MySQL, yes? Maybe there are better choices (more likely, there are different tradeoffs, but whatever), but MySQL works well enough to power some of the most popular websites in the world. Proof by existence that what you claim is inaccurate.
Re:Commercial databases (Score:4, Insightful)
A minor difference that exists in 4,000 instances and who knows how many places in the code that's also distributed across multiple servers, isn't minor, especially when there are hundreds or even thousands of minor differences.
And no, the differences in SQL between Oracle and MySQL aren't minor. It's not just syntax, and it's not MySQL-can-Oracle-can't. It's the performance characteristics of various queries, the logic of how they're implemented, and the incredible investment in configuring a large cluster to work smoothly (which MySQL and Oracle do extremely differently). Large scale systems add a layer of complexity all their own that's a totally separate engineering challenge.
Short version: Switching from MySQL to anything else would be the equivalent to a ground-up rewrite, though this is largely true of any database system. MySQL hasn't somehow uniquely trapped them here.
Re:Commercial databases (Score:5, Informative)
This isn't true. I just migrated an application from MySQL 4.1 to PostgreSQL 9.0 at work. It took me about two weeks, but it was certainly not a complete rewrite from scratch. It depends greatly on the application, the language it's written in, the frameworks in use, and the number of product-specific features in use. This was a Perl / Mason app.
If an application was making extensive use of stored procedures, then it would require a lot of effort to rewrite those, but not the whole application. If the application were written in C, it would be a lot of work to change. I think Facebook uses PHP and that's not too hard to change out, especially if they were sane and used an abstraction layer like PDO.
If the app were written in Java or .NET and using an ORM, it would be TRIVIAL to change to another database.
With my experience, the biggest problems were date functions and the fact that MySQL embeds index creation in the create table syntax whereas postgres requires it be separate and the names of indexes are global. This meant that I had some work cut out for me changing index names. There were also a few quirks with some join queries as MySQL is not picky about ordering in the from clause.
You are correct that they'll have to tune queries and things, but it's not a total rewrite if they wrote their app in a reasonable way.
For the record, PostgreSQL 9 is faster for many of our queries but seems slower doing INSERT. YMMV.
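The index-renaming chore described above can be sketched roughly as follows. This is Python with SQLite standing in for the target database so the example runs anywhere; the helper create_index_sql, the tables, and the index name by_date are all hypothetical. The idea is simply to emit a standalone CREATE INDEX statement per index and prefix each name with its table so names no longer collide.

```python
import sqlite3


def create_index_sql(table, index_name, columns):
    """Build a standalone CREATE INDEX whose name is prefixed with the table name."""
    return f"CREATE INDEX {table}_{index_name} ON {table} ({', '.join(columns)})"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY, created_at TEXT)")

# In the imagined source schema both tables had an index called "by_date".
statements = [
    create_index_sql("orders", "by_date", ["created_at"]),
    create_index_sql("invoices", "by_date", ["created_at"]),
]
for stmt in statements:
    conn.execute(stmt)

# List the indexes that now exist, to confirm the names are distinct.
index_names = sorted(
    name
    for (name,) in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
)
print(index_names)
```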
Re: (Score:2)
Don't forget that the names for column types are substantially different, that even when they are the same, the maximum d
Re:Commercial databases (Score:4, Interesting)
Ability to convert depends completely on the application. If the MySQL app is written using simple or at least standard SQL, it will be easy to migrate. However, MySQL has some very problematic areas (e.g.: select foo from table1 where id in (select id from table2 where criteria='something')) that make people do some very nonstandard and MySQL-only style fixes to address performance. The query shown, with 5000 rows in table1, 50000 in table2, and only 50 rows in table2 meeting the criteria, took ~10ms on PostgreSQL 8.3, and 52 minutes on MySQL 5.1 on the same hardware. The only way I could find to get the ~10ms performance on MySQL was so goofy that MySQL itself refused to allow me to create a view from that select statement.
Converting from PostgreSQL to Oracle has always seemed much easier and smoother, but PostgreSQL isn't as popular as MySQL because it hasn't been as easy to throw hardware at problems with scaling PostgreSQL, whereas MySQL has always made that option easier.
Each database has its own pros and cons, but most times you don't discover how hard it is to migrate until it's too late.
Re: (Score:3)
(i.e.: select foo from table1 where id in (select id from table2 where criteria='something'))
Have you tried using a join, and if so, how well does it work?
e.g. select table1.foo from table1 inner join table2 on table1.id=table2.id where table2.criteria='something'
Or this might be better, depending on the quality of the query optimizer:
select table1.foo from table1 inner join table2 on table1.id=table2.id AND table2.criteria='something'
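A quick way to check that the subquery form and the join form return the same rows is to run both against a toy data set. The sketch below uses Python's sqlite3 module with invented rows; it says nothing about MySQL's or PostgreSQL's timings, only about the equivalence of the results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, foo TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, criteria TEXT);
    INSERT INTO table1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO table2 VALUES (1, 'something'), (2, 'other'), (3, 'something');
""")

# Original form: IN with a subquery.
subquery_rows = sorted(conn.execute(
    "select foo from table1 "
    "where id in (select id from table2 where criteria='something')"
))

# Suggested rewrite: inner join with the filter in the WHERE clause.
join_rows = sorted(conn.execute(
    "select table1.foo from table1 "
    "inner join table2 on table1.id=table2.id "
    "where table2.criteria='something'"
))

print(subquery_rows, join_rows)
```

The two forms agree here because table2.id is unique. If table2 could hold several matching rows per id, the join would return duplicates that the IN form does not, so the rewrite would need a DISTINCT or similar.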
Re: (Score:3)
This is exactly why anyone in their right minds puts some sort of ORM/query layer in front of their database so that their mid-tier/front-end code has no knowledge of what the sql looks like.
Re: (Score:3)
This is exactly why anyone in their right minds puts some sort of ORM/query layer in front of their database so that their mid-tier/front-end code has no knowledge of what the sql looks like.
LOL because we all know that's how you improve performance.
Re: (Score:3, Interesting)
I went for a job interview a few years ago which was very SQL intense. I looked at some SQL code in C# for both ODBC as well as direct SQL Server code, and it was the most complex thing I have ever seen and frankly hair-pullingly ugly. It was no simple UPDATE INTO TABLE like plain MySQL with PHP.
Rather, it was weird ASYNC VSYNC Data.adaptor,x and weird esoteric lines consisting of 35 to 40 lines of code for each insert doing God knows what! Maybe a SQL programmer can explain what a Vsync was and what a dat
Re: (Score:2)
If more hosts offered PostgreSQL, I would use it, but my clients' options get limited otherwise.
Re: (Score:3)
In my experience, you don't really have to worry about the types of SQL that MySQL won't accept or the specialized syntax that MySQL will accept. The biggest pain is the SQL that MySQL will accept and then run with the stupidest execution plan possible. This leads to unintuitive queries designed around MySQL's shortcomings.
For example, one hack we had to employ regularly was to select only the columns from a table that were part of the WHERE or ORDER BY clauses and then join back against the origina
Re: (Score:2)
They should have gone with Oracle? Why? I work with that expensive cr*p, and it can't perform its way out of an open box. They can't have that much db dependent software anyway. Just plug in a compatibility layer and use something fast under it. I guess this is only news because it's facebook and facebook is worth so much money.
Re: (Score:2)
Yeah, Oracle. Who owns MySQL again?
http://www.oracle.com/us/products/mysql/index.html [oracle.com]
Re: (Score:2)
Do you ever find when someone says Oracle it's hard to tell from the context if they mean Oracle the corporation or Oracle the flagship database product from that corporation?
Re: (Score:2)
You have no idea what it's like under the hood so 'just' plugging in a compatibility layer may be a real headache.
Re:Commercial databases (Score:5, Insightful)
If you want to start a fun/interesting project that you don't expect any revenue from, it makes more sense to use free software. MySQL is a popular choice for web applications and there is a lot of freely available documentation and examples. Many people have been successful doing it, so it's a proven path that works.
Oracle is expensive. It would have cost a fortune to start Facebook with Oracle, and I can't imagine what it would cost them now. But even if they have to hire a ton of experts to convert to Oracle (assuming that is the best thing to do...), they can probably fund it with the money saved by not using Oracle over the past couple of years.
Maybe Oracle would have been a mistake; there are companies migrating from Oracle to DB2/DB2 to Oracle/Oracle to Sybase/Sybase to MySQL/Mainframe to AIX/AIX to Solaris/Solaris to Linux/etc. It seems like nobody can agree on the best hardware/OS/database solution, but there are plenty of people who swear that the solution they know is the best one.
Re: (Score:3)
The fact is: there is no single best solution. Specific bottlenecks will require specific solutions.
Success sometimes makes fools of us and our plans (Score:3)
Very true: mod parent +Insightful.
We see the same principle when some individual acquires Sudden Wealth, as for example by winning the lottery. Sudden Wealth -- it's every man's dream, right?
On closer inspection, Sudden Wealth is not a miracle cure for unhappiness or any other problem. Quite the contrary: Sudden Wealth brings new problems, new diseases of the so
Still their fault (Score:4, Informative)
Once they started the trend to grow beyond being a toy, they should have redone things right then.
Waiting until you are painted in a corner is irresponsible.
Re: (Score:2)
3) They're not made of money.
Doesn't apply to Facebook either, but would apply to startups in general.
Re: (Score:3)
PostgreSQL can be had at a similar price point to MySQL, only it is better at pretty much everything else.
Re: (Score:2)
But I thought the entire web was supposed to run off the almighty LAMP stack? You just can't make a neat acronym with Oracle in there.
Obligatory Clue (Score:2)
Professor Plum: What are you afraid of, a fate worse than death?
Mrs. Peacock: No, just death, isn't that enough?
Subject line should read... (Score:2)
PostgreSQL is far better than MySQL (Score:2)
Re: (Score:2)
How ten years ago of you. You do realize that MySQL and Postgres are getting rather close on feature parity right? Both of them have been adding features the other lacked. MySQL 5.5 has stored procedures, views, triggers, etc.
The biggest selling point with Postgres for me is the schema handling/support. One database can have many schemas.
It's fine if you like Postgres, but saying that it is technologically more advanced in every way is wrong. There are benefits to both systems.
Regarding the marketshare comments,
Re: (Score:2)
Maybe feature-wise they're on par; however, the MySQL syntax for nested queries is incredibly weird.
Re: (Score:2)
How ten years ago of you. You do realize that MySQL and Postgres are getting rather close on feature parity right? Both of them have been adding features the other lacked. MySQL 5.5 has stored procedures, views, triggers, etc.
Not sure which way to take that, to be honest. Postgres has got faster without sacrificing useful and sometimes essential features while MySQL has had to scramble to catch up. It's an ass-backwards way of going about things and it's one of the reasons why there are so many problems with it. Facebook, Google and God knows who else have goodness knows how many forks of MySQL now. That's the complaint being levelled by the article.
Re: (Score:3)
MySQL is 'fast' mainly because of its lack of features and robustness.
I've read the same thing about C.
Abstraction layer (Score:2)
Re: (Score:2)
> They would probably need to build an abstraction layer on top of MySQL.
> Something like "FBSQL" :) . This would create the ability to move to
> whatever database system they want.
And get equal performance on all of them. Equally poor, that is.
sPh
PHP too (Score:2)
"We're so new" (Score:5, Insightful)
I love the snippets "After all, he explained, SQL was created decades ago before the web, mobile devices and sensors forever changed how and how often databases are accessed" from the article and "We’ve been using stone age technology to solve problems that didn’t exist 30 years ago." Yes, the problems existed 30 years ago, such as (land-line) telephone billing. I don't know how those problems were solved -- probably with a mainframe and a custom non-SQL database and not a PC running a SQL-based server -- but they were solved.
And this opinion has nothing to do with the fact . (Score:2)
And this opinion has nothing to do with the fact that this is the guy who wrote PostgreSQL and he has been bitching for years about how MySQL has too big a market share??
MySQL has been faster than PostgreSQL for years. It doesn't have as many features, but it is **fast** !!
Re: (Score:2)
Actually, if I understand it correctly he worked on Postgres, not PostgreSQL which came later.
Re:And this opinion has nothing to do with the fac (Score:4, Informative)
Not at all. But it does have something to do with the fact that he is plugging his new product, which implements something he calls "NewSQL."
dZ.
Re: (Score:3)
MySQL has been faster than PostgreSQL for years. It doesn't have as many features, but it is **fast** !!
/dev/null is even faster, but I wouldn't use that for data storage, either.
Oh dear (Score:2)
Successful Troll is Successful (Score:5, Insightful)
Academic purist discovers that one of the most prolific and successful database users in the world is using a system he doesn't approve of. He decides, with no insider knowledge at all, and despite all evidence to the contrary, that they should throw everything away and start over from scratch using a system that he thinks would allow them to see the performance and scalability that they've already achieved.
Presumably he's tired of Facebook being used as a counter-example to everything he's been preaching.
Re:Successful Troll is Successful (Score:5, Interesting)
Re:Successful Troll is Successful (Score:5, Insightful)
No, he's not an academic purist; he's a businessman who's selling a product that competes with MySQL. So he's trying to convince web startups to pay a bunch of money for his product rather than rely on free MySQL because he claims it will help them scale better than Facebook. IOW, businessman trashes competitor's product, claims you should buy from him instead. Nothing to see here.
That's not Facebook's problem (Score:5, Informative)
Academic purist discovers that one of the most prolific and successful database users in the world is using a system he doesn't approve of. He decides, with no insider knowledge at all, and despite all evidence to the contrary, that they should throw everything away and start over from scratch using a system that he thinks would allow them to see the performance and scalability that they've already achieved.
Right.
Some of the key architects of Facebook have spoken at Stanford about how the system is put together, and I went to that presentation and had a chance to talk to them. They didn't consider MySQL to be a bottleneck. Their big problem was PHP performance. They were writing a PHP compiler to fix that.
Internally, the user-facing side of Facebook is in PHP. But the front end machines don't talk directly to the databases. They use an RPC system to talk to other machines that do the "business logic" parts of the system. Building a Facebook reply page may involve a hundred machines. There's heavy caching all over the system, of course, so the databases aren't hit for most read requests.
The RPC system isn't HTML, JSON, or SOAP. It's a binary system that doesn't require text parsing. Otherwise, RPC would be the bottleneck.
This makes for a flexible, easy to enhance system. New services go in new machines, which talk to existing machines.
Re:That's not Facebook's problem (Score:4, Informative)
The RPC system they're using is Thrift (http://thrift.apache.org/), which they developed because JSON was becoming a bottleneck. And yeah, there's a metric crapload of memcached in their data centers as well. The multi-hour outage Facebook had late last year was due to a near-complete failure of the memcached layer, resulting in an overload of requests to the main MySQL farms.
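The caching arrangement described in this sub-thread is commonly called cache-aside. Below is a minimal Python sketch of the pattern, assuming a plain dict as a stand-in for memcached and SQLite as a stand-in for the database; the function and key names are hypothetical and this is not Facebook's code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profiles (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO profiles VALUES (1, 'alice')")

cache = {}                   # stand-in for a memcached cluster
stats = {"db_reads": 0}      # counts how often the database is actually queried


def get_profile_name(user_id):
    key = f"profile:{user_id}"
    if key in cache:
        # Cache hit: the database is not touched.
        return cache[key]
    # Cache miss: read from the database, then populate the cache.
    stats["db_reads"] += 1
    row = conn.execute(
        "SELECT name FROM profiles WHERE id = ?", (user_id,)
    ).fetchone()
    value = row[0] if row else None
    cache[key] = value
    return value


first = get_profile_name(1)
second = get_profile_name(1)
print(first, second, stats["db_reads"])
```

The second call never reaches the database. The flip side is what the outage above illustrates: if the cache layer empties or fails, every read becomes a miss and lands on the database at once.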
Re:Successful Troll is Successful (Score:5, Interesting)
Academic purist discovers that one of the most prolific and successful database users in the world is using a system he doesn't approve of. He decides, with no insider knowledge at all, and despite all evidence to the contrary, that they should throw everything away and start over from scratch using a system that he thinks would allow them to see the performance and scalability that they've already achieved.
Presumably he's tired of Facebook being used as a counter-example to everything he's been preaching.
He's no academic purist. He's pushing his product, and he's either an outright liar or, worse, doesn't know what he's talking about:
Stonebraker said the problem with MySQL and other SQL databases is that they consume too many resources for overhead tasks (e.g., maintaining ACID compliance and handling multithreading)
Is that so? MySQL, as with virtually all SQL DBMSs, defaults to "repeatable read" [mysql.com] transactional guarantees, and it doesn't even spend time guaranteeing foreign key relationships [mysql.com] by default. About the only thing MySQL really guarantees out of the box is durability.
It's just nonsense to talk about all the "wasted resources" when, if they don't need them, it's a few lines in a config file to turn them off.
Easy - Just Migrate to Oracle (Score:2)
That should do the trick, eh?
Oops, time to buy Oracle stock and short FaceBook, I guess!
looks like facebook is doing just fine... (Score:5, Insightful)
Re: (Score:3)
You're right. Computers were invented to steer bombs to people and kill them. Through WWII and the 1960s missiles that brought us the IC and beyond.
As mountainous as advertising suck is, bomb suck is worse. I like where computers have gone from their purely murderous beginnings.
migration.... (Score:2)
Facebook user data is relatively similar across profiles. Their export application lends credence to this. Correct me if I'm wrong, but wouldn't it be simple to write a routine to port data from one product to the other? Test this thoroughly to make sure it's bulletproof. Then run it until the job is done. To make this simple purge all the bullshit like wall posts, messages, notes and other use
Re: (Score:2)
Exporting the data isn't the issue. The issue is going through the codebase and altering all the MySQL-specific code and then making sure it all works.
Re: (Score:2)
So, what you are saying is to purge everything which makes Facebook worthwhile to the masses? What else could possibly give Google+ an opportunity to instantly overtake Facebook? :)
Re: (Score:2)
Correct me if I'm wrong, but wouldn't it be simple to write a routine to port data from one product to the other?
Depends on your definition of "simple":
1. You build a second system.
2. Your "porting routine" has to keep both systems constantly in sync, bidirectionally.
3. You rewrite every client for the first system, or add bridges, to talk to the new system.
4. Finally, once all the clients are updated, you can switch off the old system.
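The steps above can be sketched, in heavily simplified form, as a dual-write wrapper. This Python example uses plain dicts as stand-ins for the old and new systems, and the class DualWriteStore is hypothetical. It only copies data in one direction (old to new) plus mirrored writes, so it sidesteps the genuinely hard part of step 2, keeping both systems in sync bidirectionally.

```python
class DualWriteStore:
    """Hypothetical wrapper used while migrating from old_store to new_store."""

    def __init__(self, old_store, new_store):
        self.old_store = old_store
        self.new_store = new_store
        self.read_from_new = False  # flipped at cutover (step 4)

    def put(self, key, value):
        # During the migration every write goes to both systems.
        self.old_store[key] = value
        self.new_store[key] = value

    def get(self, key):
        source = self.new_store if self.read_from_new else self.old_store
        return source.get(key)

    def backfill(self):
        # Copy rows that existed before dual writes began,
        # without overwriting anything newer already in new_store.
        for key, value in self.old_store.items():
            self.new_store.setdefault(key, value)


old, new = {"u1": "alice"}, {}
store = DualWriteStore(old, new)
store.put("u2", "bob")        # mirrored write
store.backfill()              # bring over pre-existing data
store.read_from_new = True    # cutover: reads now come from the new system
print(store.get("u1"), store.get("u2"))
```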
Michael Stonebraker & VoltDB (Score:5, Informative)
The guy in the article [wikipedia.org] does have some cred. He was a professor at UC Berkeley for 29 years where he was project leader on Ingres and led the creation of its follow up, Postgres.
His new database, VoltDB [wikipedia.org], based on the 'NewSQL' ideas touched on in the article, is Free Software licensed under the GPLv3.
MySQL is facebook's issue? (Score:3)
I don't think so. Facebook isn't known for having a lot of down time. It is known for opening up information to the public. If anything, that would be considered too much up time. I've used MySQL and PostgreSQL. I found MySQL to be limited but most limitations were easily worked around in code. PostgreSQL wasn't as limited. However, the options that it provided forced the need to vacuum the database. I would rather write code but to each his own.
Total rewrite is always bad... mkay? (Score:3)
Over and over we hear about this "scrap and start over" concept. It sounds like a great idea but you are assuming you can do a better job than the guys before and more often than not you will be wrong.
I used to suggest it but now I know better. I have seen new devs with little experience passionately suggest so called "total refactoring". It has never ended well.
Or you could.... (Score:3)
The underlying problem according to Stonebraker:
During an interview this week, Stonebraker explained to me that Facebook has split its MySQL database into 4,000 shards in order to handle the site’s massive data volume, and is running 9,000 instances of memcached in order to keep up with the number of transactions the database must serve.
Or you could put MySQL on an IBM Power Systems LPAR and use a commercial MySQL plug-in [krook.net] to store the data in a DB2 database. Then you can get away with maybe a dozen database machines instead of thousands. I have to imagine, btw, that Oracle has a similar offering in the works.
Lesson: academic credentials are no match for real world experience.
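For readers unfamiliar with the sharding mentioned in the quoted passage, the basic routing idea can be sketched in a few lines of Python. The 4,000 figure comes from the article quote; the hashing scheme and the function name shard_for are assumptions made for illustration, not a description of how Facebook actually maps users to shards.

```python
import hashlib

NUM_SHARDS = 4000  # shard count quoted from the article


def shard_for(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a user id to a shard number with a stable hash."""
    # A stable digest gives the same answer across processes and restarts.
    digest = hashlib.md5(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


print(shard_for(42), shard_for(43))
```

Every query for a given user goes to the one database that shard_for selects, which is why queries spanning many users (joins, aggregates) become the application's problem once data is split this way.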
Stonebraker trapped in Stonebraker (Score:5, Informative)
(reposting as a logged in user) I wrote a bit longer response to this:
stonebraker trapped in stonebraker 'fate worse than death' [dom.as]
I think I know a bit more about the database situation inside FB than Mr. Stonebraker. Go figure.
Reply from a former FB engineer (Score:4, Interesting)
Reply from an engineer who worked at Facebook 2007-2011 [quora.com].
The O in NOSQL stands for "only" (Score:2)
Keep on backpedalling, you silly NoSQLers. (Score:5, Interesting)
It's hilarious how the NoSQL fools are now constantly backpedalling these days.
It turns out that writing database queries in JavaScript is a stupid idea! Imagine that! All of their attempts to invent a better query language end up being almost identical to, guess what, SQL!
Then they realize that trying to maintain data consistency using logic written in JavaScript, Ruby or PHP doesn't work so well. Values go unconstrained, and the referential integrity gets fucked up. Soon the data is nearly worthless.
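To show what having the database enforce this looks like, here is a small sketch using SQLite through Python's standard sqlite3 module. The table and column names are invented for the example. Note that SQLite only enforces foreign keys after the PRAGMA is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE posts ("
    " id INTEGER PRIMARY KEY,"
    " user_id INTEGER NOT NULL REFERENCES users(id),"
    " body TEXT NOT NULL CHECK (length(body) > 0))"
)
conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
conn.execute("INSERT INTO posts (user_id, body) VALUES (1, 'hello')")


def try_insert(user_id, body):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO posts (user_id, body) VALUES (?, ?)", (user_id, body)
        )
        return True
    except sqlite3.IntegrityError:
        return False


orphan_ok = try_insert(999, "orphan")  # no such user: foreign key rejects it
empty_ok = try_insert(1, "")           # empty body: CHECK constraint rejects it
```

Both bad rows are refused by the database itself, no matter which application language sent them.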
The smarter/less-ignorant ones then think that they'll just use transactions. But wait, their NoSQL database of choice doesn't support that, or doesn't support it properly. So they tell themselves that their data will become "eventually consistent", or worse, they try to implement some shitty ass "transaction" support using Ruby. Regardless of the path chosen, failure is the result.
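For comparison, this is roughly what a real transaction buys you, again sketched with SQLite via Python's sqlite3 module. The accounts table and the transfer function are hypothetical. The credit is deliberately done before the debit so that a failed debit has something to undo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()


def transfer(src, dst, amount):
    """Move amount from src to dst atomically; return False if it was rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    """Return the current balances as a dict keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

If the debit would push a balance below zero, the CHECK constraint fires and the already-applied credit is rolled back with it, so the two accounts never disagree.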
Now they're realizing that it's mandatory to use a real relational database when working on anything remotely serious. So we see this bullshit about "no" now meaning "not only". That's funny, last month it meant "no", as in, "we will never write a SQL query again, and we will never use a relational database again."
I'm going to make a prediction: Next month, we'll get to read articles and comments from them about these amazing new database systems that they've just discovered. These new systems avoid all of the problems associated with NoSQL databases! What are their names? Oracle, DB/2, SQL Server, PostgreSQL and SQLite.
Re: (Score:2)
Trollishly stated, but oh so true.
Re: (Score:3)
Meh,
If NOSQL really means Not Only SQL, then it's a smart idea.
If it means re-writing relational database code to behave like SQL, then you should have been using SQL in the first place.
If it means that your live object database that doesn't follow normal relationships has a much more efficient system, then you should not have been using SQL in the first place.
Simply, use the right tool for the job.
[I prefer a well defined stored procedure interface to my data, the slight amount of extra design time makes u
Re: (Score:2)
Not mistaken, but the issue the NoSQL people face when trying to replicate something like a Facebook-sized cluster of relational databases is that they have to build the ACID features back in, which tends to negate the performance advantage.
Facebook isn't really using a relational database system either: they have a gigantic memcached layer on top of a gigantic MySQL layer. They have, effectively, a massive in-memory database that's continually being written back to MySQL for permanence. That's the only w
Re: (Score:2)
And if you need to use memcached, then it doesn't matter what database you're using. You can scale out read-only nodes with most larger database systems, but it's not always a good idea.
Usually the issue is write performance. You can only write to one MySQL server in a cluster (unless you use the newer MySQL Cluster storage engine). Some other commercial products let you partition data across multiple servers and write to different servers.
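A rough sketch of the usual single-writer setup, assuming one writable primary and a pool of read-only replicas. The Router class and the server names are hypothetical, and the sketch ignores replication lag entirely.

```python
import itertools


class Router:
    """Send SELECTs to replicas in round-robin order, everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # cycle() over an empty list would never yield, so keep None in that case
        self._replicas = itertools.cycle(replicas) if replicas else None

    def pick(self, sql):
        """Return the server that should run this statement."""
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        return self.primary
```

Adding replicas spreads the reads, but every INSERT and UPDATE still lands on the one primary, which is why write load is the bottleneck.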
If you have a lot of read only nodes then all that data has to get r
Re: (Score:2)
I was under the impression that there was no feasible way, performance wise, to run something as big as Facebook without using a non-relational database system.
Am I mistaken?
Plenty of financial institutions are working on a similar scale to Facebook, and they use SQL-based DBMSs. Facebook doesn't need transactional integrity for a lot of what they do. They don't have an elaborate set of regulations that they need to be in compliance with, and a large set of accounts that must all be constantly balanced. And Facebook can't charge a fee for every transaction that takes place.
Your standard SQL DBMS is doing OLTP, or online transactional processing. That usually means that it's run
Re: (Score:2)
It's probably better to give all investors and employees their money and kill the whole Facebook project. No one likes their privacy policies anyway.
Re: (Score:2)
No but there will be plenty of vendors promising such a thing.
NoSQL? NewSQL? (Score:2)
Re:Delusional editorialism! (Score:4, Insightful)
IMHO it was a bargain - MySQL has worked up until now, it is still working, so as far as I am concerned that's a big success story for such a low-end free/free database - and it was a choice they made based on what they already had skills in and it enabled them to earn billions, so it was a very smart, inexpensive way for them to get started. Now for Facebook, spending a few million to get on to big iron is cheap money, whereas back in the day spending a week or two to really learn the ins and outs of Postgres or spending thousands on Oracle could have prevented them from surviving in the first place.
Re: (Score:2)
I don't really understand what is bad about this. Facebook is this big of a site and it seems to work great.
Have you actually seen Facebook?
Re: (Score:2)