Moving From CouchDB To MySQL
itwbennett writes "Sauce Labs had outgrown CouchDB and too much unplanned downtime made them switch to MySQL. With 20-20 hindsight they wrote about their CouchDB experience. But Sauce certainly isn't the first organization to switch databases. Back in 2009, Till Klampaeckel wrote a series of blog posts about moving in the opposite direction — from MySQL to CouchDB. Klampaeckel said the decision was about 'using the right tool for the job.' But the real story may be that programmers are never satisfied with the tool they have."
Of course, then they say things like: "We have a TEXT column on all our tables that holds JSON, which our model layer silently treats the same as real columns for most purposes. The idea is the same as Rails' ActiveRecord::Store. It’s not super well integrated with MySQL's feature set — MySQL can’t really operate on those JSON fields at all — but it’s still a great idea that gets us close to the joy of schemaless DBs."
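For anyone wondering what "a TEXT column that holds JSON, treated the same as real columns" looks like in practice, here is a minimal sketch. It is not Sauce Labs' code: the Job class, the jobs table and the field names are all made up, and Python's built-in sqlite3 stands in for MySQL.

```python
import json
import sqlite3


class Job:
    """Hypothetical model: 'name' is a real column, all other fields live in a JSON TEXT column."""

    def __init__(self, name, **extra):
        self.name = name
        self.extra = extra

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails, so real columns win.
        try:
            return self.__dict__["extra"][key]
        except KeyError:
            raise AttributeError(key)

    def save(self, conn):
        cur = conn.execute(
            "INSERT INTO jobs (name, extra) VALUES (?, ?)",
            (self.name, json.dumps(self.extra)),
        )
        return cur.lastrowid

    @classmethod
    def load(cls, conn, job_id):
        name, extra = conn.execute(
            "SELECT name, extra FROM jobs WHERE id = ?", (job_id,)
        ).fetchone()
        return cls(name, **json.loads(extra))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (id INTEGER PRIMARY KEY, name TEXT NOT NULL, extra TEXT NOT NULL)"
)
job_id = Job("smoke-test", browser="firefox", retries=3).save(conn)
loaded = Job.load(conn, job_id)
print(loaded.name, loaded.browser, loaded.retries)
```

The catch is the one the quote itself admits: to the database the extra column is an opaque string, so filtering or indexing on browser has to happen in application code.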
Not getting RDMS (Score:5, Insightful)
And in another three years they will switch to whatever is the coolest up-and-coming storage solution. Incompetent developers will always be incompetent developers.
Re:Not getting RDMS (Score:5, Insightful)
true, just reading their blog
Things like SQL injection attacks simply should not exist.
HTTP API. Being able to query the DB from anything that could speak HTTP (or run curl) was handy.
so sql injection is real bad, bad design of SQL... yet allowing any old HTTP javascript queries is somehow ok. Yes, incompetent developers indeed.
They also say
Why are we still querying our databases by constructing strings of code in a language most closely related to freaking COBOL, which after being constructed have to be parsed for every single query?
apart from the concepts of query caches - and stored procedures - so what if the language is related to COBOL, javascript is closely related to C which is almost as old. And that has plenty of relations to Algol which is even older.
So yes, it sounds like they haven't really got a clue. Great advert for their business!
Re: (Score:3)
so sql injection is real bad, bad design of SQL... yet allowing any old HTTP javascript queries is somehow ok.
HTTP isn't a subset of javascript - no javascript queries are needed for HTTP. Even for JSON and other javascript objects.
That said, yes, the developers don't seem to "get it". An object/method based database query language, which they seem to want, has already been tried. Look where Informix is right now.
Yes, parsing can be a bitch, which is why using a structured database isn't always the right choice to start with. If you're just using it for data storage, it rarely makes sense.
Re:Not getting RDMS (Score:5, Insightful)
"Why are we still querying our databases by constructing strings of code in a language most closely related to freaking COBOL, which after being constructed have to be parsed for every single query?"
I couldn't agree with you more, this quote makes me want to vomit. Is this really how low the average competence of today's web developer has stooped? Between PHP developers not getting why PHP is a pretty shittily designed and developed language and stuff like this, I barely get how the web even runs anymore.
To answer the original quote, the reason we're "still querying our databases by constructing strings of code in a language most closely related to freaking COBOL, which after being constructed have to be parsed for every single query?" is because SQL is a language based on mathematically sound principles, and which is supported widely, and known widely, and is processed by database engines across the globe that have literally decades of stability behind them, data in them and so forth.
There's absolutely no reason to change SQL, because if you build a new query language that is based on the same mathematically sound principles of relational algebra then it will er... look just like SQL. The fact the kiddie (I can only assume he's a kiddie due to his blatant lack of knowledge and/or experience in the field) who wrote that blog post doesn't get this suggests he should absolutely not be trusted with your data as he'll only lose it.
This is a classic example of someone bitching about something not because it's bad, but because they simply don't understand it and believe that rather than learn about it properly, it's better to bitch and hope you can somehow effect change by bitching.
The advantage of most SQL/RDBMS is that they do adhere to the ACID principles, and for people who want to be able to have some degree of trust in their data source that's pretty fucking important. It's no surprise that they've moved over to MySQL though as it's one of the few RDBMS that is completely shit at adhering to the ACID principles and keeping up to date with solid, stable implementations of modern database functionality.
Re:Not getting RDMS (Score:5, Interesting)
There's absolutely no reason to change SQL, because if you build a new query language that is based on the same mathematically sound principles of relational algebra then it will er... look just like SQL.
False. First of all, SQL is NOT based on mathematically sound principles of relational algebra. SQL took the mathematically sound principles of relational algebra and fucked them up. There should be no NULLs, there should be no natural ordering of "columns", there should be no possibility of having duplicate rows, there should be no possibility of inconsistent intermediate states in transactions (no deferred checking) etc. SQL has them all, and then some. Why? Because SQL simply ignores the relational model and "does what IBM and Oracle always did". That's not the same thing as "implementing the relational model".
Second, there is a separation between the surface structures of a language and its foundations. I really don't think that a language based on relational algebra has to look like SQL. That's like saying that a language with nouns having singular and plural and verbs having tenses has to look like English. Nope, it doesn't have to at all. Just look at VB.NET and C#. Basically two front-ends to virtually identical language semantics, only one of them does not avoid non-alphabetic structural delimiters like the plague (and is so much more pleasant for it).
Re:Not getting RDMS (Score:5, Interesting)
There should be no NULLs
Then how do I, say, indicate the date of death for someone who hasn't died? An IsDead field? Really? (Yes, a NULL in a field is a shortcut for proper relationship, but a lack of relationship when using a linking table will still be represented by NULL)
there should be no natural ordering of "columns"
Does it really matter? The natural ordering of columns is the order in which you added them to the table. Ignore it. It isn't important, and not in need of a "solution"
there should be no possibility of having duplicate rows
Firstly, get to know your DISTINCT SQL keyword. Secondly, data in real life sometimes IS duplicate. What the hell should people do? Have a DuplicatedThisManyTimes field? Ugh.
possibility of inconsistent intermediate states in transactions
That is a property of the database engine, not SQL.
Because SQL simply ignores the relational model and "does what IBM and Oracle always did". That's not the same thing as "implementing the relational model".
Where do you get this shit? Are you telling me the function of foreign key constraints and referential integrity, and the good ol INNER/RIGHT/LEFT join keywords are just smoke and mirrors and everything is really just a chaotic bowl of soup? References please.
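The NULL and DISTINCT points are easy to try out. A minimal sketch, assuming Python's built-in sqlite3 and a made-up people table with invented sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT NOT NULL, date_of_death TEXT)")
# Invented sample data: 'bob' appears twice and is still alive.
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("alice", "2001-05-17"), ("bob", None), ("bob", None)],
)

# NULL marks "no date of death"; it is tested with IS NULL, not with '='.
living = conn.execute(
    "SELECT name FROM people WHERE date_of_death IS NULL"
).fetchall()

# DISTINCT collapses the duplicate rows in the result.
distinct_living = conn.execute(
    "SELECT DISTINCT name FROM people WHERE date_of_death IS NULL"
).fetchall()

# Comparing with '= NULL' never evaluates to true, so this counts nothing.
equals_null = conn.execute(
    "SELECT COUNT(*) FROM people WHERE date_of_death = NULL"
).fetchone()[0]

print(living, distinct_living, equals_null)
```

The last query is the kind of three-valued-logic surprise the relational purists complain about: the table happily stores the duplicates and the NULLs, and it is up to the query to deal with them.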
Re:Not getting RDMS (Score:4, Insightful)
That's not how debate works. If you can't take a position and defend it against questioning, without resorting to "go away and learn more", then you have no position and shouldn't have posted in the first place.
Re: (Score:3)
"False. First of all, SQL is NOT based on mathematically sound principles of relational algebra."
No, you've completely missed the point - I'm not saying SQL is an implementation of, and only of, the relational model and nothing more, and nothing less, merely that those are its foundations. SQL absolutely IS based on the principles of relational algebra - it's still ultimately based on much of the important set theory that underlies that when it comes down to it. The point being that sure, whilst SQL is far
Re: (Score:2)
...I barely get how the web even runs anymore.
In the cloud, obviously.
What, you didn't get the memo?
Re: (Score:3)
I've worked on quite a few large-ish database applications (eg 800 - 2000 tables, some with multi-million rows), and I'd say I'm fluent with SQL. But the thing that annoys me most about SQL, from a maintenance perspective, is how much of the database structure ends up strewn around in your code base. SQL is *not* good at encapsulation.
When a new requirement comes in that should cause you to change some of the primary relationships in your database, you have a look at how much code you'd need to change to d
Re:Not getting RDMS (Score:5, Insightful)
so sql injection is real bad, bad design of SQL...
SQL injection actually has nothing to do with SQL.
Exactly the same attacks happen in any system where you build up a string from user data and pass it off to an interpreter. SQL has nothing to do with it.
Exactly the same thing used to happen with sudo shell scripts.
Exactly the same thing happened with javascript injection in very early webmail systems.
There are plenty of opportunities for code injection on poorly written PHP, too.
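To make that concrete, here is a minimal sketch, assuming Python's built-in sqlite3 and a made-up users table. The first query splices user data into the string handed to the interpreter; the second passes it as a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", 1), ("bob", 0)])

# Hostile input: closes the string literal and adds a condition that is always true.
user_input = "nobody' OR '1'='1"

# Unsafe: the user's text becomes part of the code the interpreter parses.
unsafe_sql = "SELECT name FROM users WHERE name = '" + user_input + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the statement is fixed and the input is passed separately as data.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The unsafe version returns every row in the table, the parameterized one returns none. The same string-splicing mistake works just as well against a shell, an eval() call or an HTML page, which is the parent's point.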
Re: (Score:2)
And yet these are the same developers who are being *highly* paid in these Web 2.0 times.
Seriously. I was one of them - but got kicked out because I made the huge mistake of pointing out the obvious: you must be a skilled programmer to do programs right. Ruby On Rails will not make a good coder out of a dumb ass.
The dumb asses joined up and kicked me out. =D
Re: (Score:3)
That is a common reason for firing. A couple of years ago some programmers wanted me to support them with the boss on switching a project written in python to Java. Their justification? The python programmer called them a bunch of monkeys. No technical arguments at all.
Unfortunately the boss sided with the monkeys and I was next on the chopping block for pointing out that a 200 Bingo player max using 3 machines (1 web 1 db, 1 backup db) was a design flaw.
Re: (Score:3)
SQL is nothing like COBOL. Once again they show how they are clueless rookies.
Not quite true (Score:5, Informative)
If all your application is ever going to do is read and write to fixed sized record structured data with little relational (or any) attributes then COBOL will suit you fine as that's what it was designed for. Unfortunately those sorts of apps are few and far between these days, but in its ever decreasing niche COBOL is still good.
COBOL is cool! (Score:4, Funny)
PERFORM makemoney UNTIL rich.
(Note the full stop at the end)
Re: (Score:3)
COBOL can be a bad language, but the best paid jobs around here are for COBOL programmers.
It's hard to find a position (someone must die in order to open up a position), but once you get it, it's for life. =]
Re:Not getting RDMS (Score:4, Insightful)
COBOL can be a bad language, but the best paid jobs around here are for COBOL programmers.
It's hard to find a position (someone must die in order to open up a position), but once you get it, it's for life. =]
In the end, there can be only one.
Re:Not getting RDMS (Score:5, Insightful)
I think the main problem is application developers not understanding anything about database theory. The vast majority of databases I encounter are not normalized at all, and it's almost always because they were designed by a developer with no database background.
Granted, I didn't come into this field with that background, either, but I made a point to learn it, and now I'm very cognizant of implementing sound database designs. This whole idea of throwing random strings of structured text into a database column, and then relying entirely on the program code to parse and use it... well, why the hell even use a relational database, then?
Relational databases aren't suitable for every application, nor are "bigtable" and other NoSQL implementations. The problem is that developers use a particular kind of database without really understanding how to use it properly. If they can get data in, and get data out, that's basically all they care about. Never mind if they make it a maintenance nightmare in the process.
Normalisation isn't a panacea (Score:2)
Yes it makes sense up to a point, but it starts to suffer from the law of diminishing returns and at some point having to do complicated multi-table joins actually slows down your queries so much that it becomes simpler and faster to suffer duplicate data than normalise to the Nth degree.
Re: (Score:3)
It depends on the task though, I'd wager 90% of SQL work that is done by developers day to day isn't in such a performance sensitive environment that it needs to favour performance over normalisation, and I agree with the GP, there are far too many developers out there that just don't do it and hence simply don't have the performance excuse. It really is just bad database design as a result of incompetence most of the time.
Re: (Score:3)
I can definitely see the value in making an informed tradeoff, but like you said, a lot of the time it's not an informed decision--they just do it to make it work and don't really have the expertise to know which is the right way to go. I've definitely seen enough bad database designs to know that most developers just have no clue how to design them. The worst I've seen had bad designs and poor performance, and were built in a completely ad hoc manner without any eye toward maintainability, performance, or
Re: (Score:3)
And in many databases, there'd be more performance gains from proper normalization than premature optimization. I'm working with a legacy database that has this problem. Proper normalization would probably make it lightning fast, but instead it's slow as fuck because too many concerns are put in one table when they should be put in several tables. Also, it uses functions to retrieve values, which is just...so wrong.
Re:Normalisation isn't a panacea (Score:5, Insightful)
Yeah, it really depends on what you are doing. But any time you break normalization there should be a good reason. Performance is certainly a valid reason. "I'm too lazy to make a well-designed database," however, is not.
If you find yourself breaking normalization all the time, then you've probably found a use case where a relational database isn't the best tool for the job.
While there is a "right" way to use a given tool, there is no one tool that is right for every situation. People who get this backwards are zealots and will often make poor decisions.
Re: (Score:2)
Yes it makes sense up to a point, but it starts to suffer from the law of diminishing returns and at some point having to do complicated multi-table joins actually slows down your queries so much that it becomes simpler and faster to suffer duplicate data than normalise to the Nth degree.
The question is whether this should be solved at the conceptual model level. As a developer, I don't care whether the database cheats and duplicates something to speed things up, as long as I don't have to do it in the data model and as long as the implementation is correct. The same logic applies to CPU caches and compiler optimizations. The computer is allowed to "cheat" if it can prove that the shortcut is correct. But you shouldn't be forced to do it manually, since it only makes your code (and data str
Re:Not getting RDMS (Score:5, Insightful)
I completely agree. A lot of non-DB centric people think that they can do more in the app tier, effectively using their databases as glorified file stores. Why even have a database server in those instances? I'm not saying that everything should be done in the database, either, but take advantage of every tool you have.
NoSQL has a place, so does relational. Learn their strengths and determine which is the best fit for your project. Then, learn how to use the tool to its fullest.
Re: (Score:2)
Unfortunately the developers of these "NoSQL" databases seem to have the same idea. I'm working with one that shall remain nameless but sounds oddly like a piece of fruit right now. The generally accepted best practice for scaling is to pull as much of the logic out of the database layer as possible. While there are fancy aggregation pieces, they're all impossibly slow (and hamper concurrency). Argh.
Re: (Score:2)
A lot of non-DB centric people think that they can do more in the app tier, effectively using their databases as glorified file stores. Why even have a database server in those instances?
This is pretty easy to answer, I think: because databases offer ACID attributes. Reimplementing those on your own is a big project and likely to create bugs; it's a lot easier to just grab an existing database and use it.
For instance, what if you need a "glorified file store" that multiple processes on multiple systems can
Re: (Score:3)
I think the main problem is application developers not understanding anything about database theory. The vast majority of databases I encounter are not normalized at all, and it's almost always because they were designed by a developer with no database background.
Or a developer who is experienced enough to know how bad an idea an overly normalized database is for most applications.
Re: (Score:3)
You've got it backwards. The highly normalized database is connected to transaction processing. Highly normalized databases have few lock issues and are optimized for transaction processing. Also TPS is narrow so you have good coders dealing with the relatively little code that bangs on it hard.
The read-only database denormalized for simplicity and query performance is the data warehouse. That's where the report monkeys work.
Why not PostgreSQL? (Score:5, Interesting)
PostgreSQL 9.2 beta improves scalability, adds JSON
http://www.h-online.com/open/news/item/PostgreSQL-9-2-beta-improves-scalability-adds-JSON-1573815.html [h-online.com]
MySQL's MyISAM is much faster with reads than PostgreSQL. I think for the things people use NoSQL for, MyISAM is perfect. And when you want better ACID support, you can effortlessly switch to InnoDB.
Re: (Score:3)
That's all fine until you need to actually write to that table. With MyISAM any write needs a table lock, and that makes performance drop like a rock.
Re: (Score:2)
And how is it worse than NoSQL?
Anyway, that's when you move to InnoDB.
Re: (Score:2)
And with InnoDB you still get table locks every time you update an autoincrement field. Performance then drops like a fucking rock. Anyway, that's when you move to Postgres.
Re: (Score:2)
Not really.
MySQL is good enough for 80% of current web sites. Damn it, even PHP is good enough for a small site.
As someone below states, if you need to be serious about ACID, you can switch to InnoDB.
Are you pissed off with Oracle? Go for MariaDB.
I like Postgres, and acknowledge its technical superiority over MySQL in every single aspect.
But I'm using MariaDB on my site: I (still) don't need the Postgres superiority, and MariaDB is easier to maintain (not to mention its smaller memory footprint!).
Re: (Score:2)
I use mysql because it is the best database supported by my cheap ass webhost (access? hells no) for free (mssql for $20 a month? no thanks).
Nosql in Postgres (Score:4, Interesting)
or alternatively you can use the hstore data type.
programmers don't know how to store data (Score:3)
But the real story may be that programmers are never satisfied with the tool they have.
Ah typo
But the real story may be that programmers don't know how to store data
They may not know because no one knows the business needs, but more often because they have no idea what they're doing WRT data storage.
IT training tends to cover data manipulation pretty well: "how to add two numbers"
IT training gets shaky on data structures: "So, in a junior-level class we will talk about data structures, which is too bad because you've already developed at least two years of bad habits first"
IT training tends to pretty much skip data storage: "In a senior-level class, you might talk about scalability, maybe in an optional class. Or maybe you'll take a semester of COBOL instead"
Re: (Score:3)
Possibly, but given how quick many programmers are to get into a fruitless pissing match over their favourite language it's quite apropos, no?
Re: (Score:2)
1) I know my business needs.
In some industries you can pretty well predict the future. In others.. no.
One app I built years ago would have literally required geographic changes to expand. Then "surprise" it gets rolled out to 5 additional bigger cities. Well, that was unexpected... I had a O(n**2) algorithm in there that did pretty well for values of N around 7 where N can never increase beyond 7, but not so good for values of N around 57. whoops.
Is a DB even needed sometimes? (Score:2, Interesting)
It seems to be a knee jerk reaction amongst a lot of developers and designers that as soon as your app starts requiring persistent data beyond ini values a database is needed. Why? For large but simply structured data something like json or XML or even a flat csv file is perfectly adequate. Performance can be an issue during searches but if for example you have a fixed record size with key sorted data then finding a given key is simple (binary chop or similar).
It seems to me that reaching for a DB is the ea
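The fixed-record "binary chop" mentioned above can be sketched roughly like this. It assumes a made-up 16-byte record layout (4-byte integer key plus 12-byte text payload) and uses an in-memory buffer in place of a real file; an ordinary file opened in binary mode would work the same way.

```python
import io
import struct

# Hypothetical layout: little-endian 4-byte unsigned key + 12-byte payload = 16 bytes.
RECORD = struct.Struct("<I12s")


def build_file(rows):
    """Write (key, text) rows as fixed-size records, sorted by key."""
    buf = io.BytesIO()
    for key, text in sorted(rows):
        buf.write(RECORD.pack(key, text.encode("ascii")))
    buf.seek(0)
    return buf


def find(f, key):
    """Binary search the record file; return the payload text or None."""
    f.seek(0, io.SEEK_END)
    lo, hi = 0, f.tell() // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        f.seek(mid * RECORD.size)  # fixed size makes the offset trivial
        k, payload = RECORD.unpack(f.read(RECORD.size))
        if k == key:
            return payload.rstrip(b"\0").decode("ascii")
        if k < key:
            lo = mid + 1
        else:
            hi = mid
    return None


data = build_file([(30, "carol"), (10, "alice"), (20, "bob")])
print(find(data, 20), find(data, 25))
```

Lookups are O(log n) seeks, which is fine for read-mostly data. Inserts that keep the file sorted, concurrent writers and crash safety are left out of this sketch, and those are the parts a database would be doing for you.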
Re: (Score:2)
Starting with a database avoids the pain of migrating flat files to a database later when the database is needed (and if your app gets at all popular, it will be).
Sure, if you're only ever expecting 10k rows of data with very little concurrent access, go nuts with your flat files.
Re:Is a DB even needed sometimes? (Score:5, Informative)
A CSV or XML or JSON file is a db (a DB is just structured data).
Are relational DBs always required? Certainly not.
The big benefit to a relational DB with lots of enforcement at the data layer is that you can have one or more applications reading/writing to it with minimal concern of data corruption.
What isn't obvious is that second application is often aggregate reporting for management. "How many customers are using $foo and where do they live geographically". With a relational DB, I might knock that query out in a few minutes across millions of customers.
With a flat XML file per customer spread across a number of servers, this could take days to assemble, particularly if $foo is nested deep in the structure.
Having spent far too much time writing one-off scripts to gather customer data because the middleware didn't support that type of query, I've actually gone the other way and started shoving some business logic into the DB.
Functions such as isCustomerPaymentOverdue are now in the relational DB with a very thin model in the middleware to allow for much easier and faster reporting.
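As a rough illustration of the reporting case, here is the "how many customers use $foo and where do they live" question as a single query. The customers and subscriptions tables and their rows are invented for the example, and Python's built-in sqlite3 stands in for whatever relational DB is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT NOT NULL);
CREATE TABLE subscriptions (
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    product TEXT NOT NULL
);
INSERT INTO customers VALUES (1, 'north'), (2, 'north'), (3, 'south');
INSERT INTO subscriptions VALUES (1, 'foo'), (2, 'foo'), (2, 'bar'), (3, 'bar');
""")

# One declarative query replaces a script that would open every customer file.
report = conn.execute(
    """
    SELECT c.region, COUNT(DISTINCT c.id) AS customers
    FROM customers c
    JOIN subscriptions s ON s.customer_id = c.id
    WHERE s.product = ?
    GROUP BY c.region
    ORDER BY c.region
    """,
    ("foo",),
).fetchall()

print(report)
```

With one XML or JSON file per customer, the equivalent is a script that opens and parses every file, which is the days-versus-minutes difference described above.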
Re:Is a DB even needed sometimes? (Score:5, Insightful)
The big benefit to a relational DB with lots of enforcement at the data layer is that you can have one or more applications reading/writing to it with minimal concern of data corruption.
Not just that, but good use of relations and normalization makes whole classes of bug impossible.
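A small sketch of what "whole classes of bug impossible" can mean, assuming Python's built-in sqlite3 and made-up customers and orders tables. The database itself refuses an order pointing at a customer that doesn't exist, and a second account with the same email, no matter which application tries it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)"
)
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER NOT NULL REFERENCES customers(id))"
)
conn.execute("INSERT INTO customers VALUES (1, 'a@example.com')")


def rejected(sql):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


orphan_rejected = rejected("INSERT INTO orders VALUES (1, 42)")  # no customer 42
duplicate_rejected = rejected("INSERT INTO customers VALUES (2, 'a@example.com')")
valid_rejected = rejected("INSERT INTO orders VALUES (2, 1)")  # customer 1 exists

print(orphan_rejected, duplicate_rejected, valid_rejected)
```

The checks live in one place, so a second application or a one-off script writing to the same tables gets them for free.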
Re: (Score:2)
Not just that, but good use of relations and normalization makes whole classes of bug impossible.
That's precisely the reason the current crop of "developers" avoids it like the Devil.
They NEED bugs in order to justify the overdue payments and overpaid weekend death marches.
Software development *must* be an arcane practice, not scientific knowledge - or they will be measured by rational arguments and end up losing their jobs.
These kids think they are artists, and behave as if they are.
Native JSON fields (Score:3)
PostgreSQL 9.2 (now in beta) includes native JSON fields:
http://www.h-online.com/open/news/item/PostgreSQL-9-2-beta-improves-scalability-adds-JSON-1573815.html [h-online.com]
It's also available as an extension for the current 9.1 release:
http://people.planetpostgresql.org/andrew/index.php?/archives/255-JSON-for-PG-9.2-...-and-now-for-9.1!.html [planetpostgresql.org]
PICK (Score:2)
Hop into the wayback machine and fire up any flavor of PICK. The database where schema is applied on use, not on storage. No length limits on fields and very fast on old hardware (really fast on new). Storing bits of XML and code is no problem. And for those users who simply must have SQL, many versions will support that too (UniData and UniVerse are two examples). It's not cool, not new, but it does work.
Urban Airship (Score:4, Interesting)
Urban Airship went PostgreSQL to MongoDB to Cassandra to PostgreSQL. http://wiki.postgresql.org/images/7/7f/Adam-lowry-postgresopen2011.pdf [postgresql.org]
It's a good presentation because they're in love with none of them and are moving for specific reasons each time, handling different issues. It's not coders chasing the new hotness.
Oh, the joy! (Score:2)
... the joy of schemaless DBs.
You mean working with a file system and not using a DB at all, not needing to pay a DBA, not dealing with corrupted databases, not using arcane tools, etc.?
I jest, but not entirely. Clearly there are purposes for which databases are the right tool for the job. I'm most definitely not convinced that big blob storage is one of them.
Re: (Score:3)
not using arcane tools
I know database concepts are difficult for some people, but it's by no means magic.
Re: (Score:2)
not using arcane tools
I know database concepts are difficult for some people, but it's by no means magic.
Sorry, I beg to differ. You select a DB. Turns out that's just the interface, and you have to *then* select the actual DB engine. Some engines / databases allow checking for and repair of corruption on-line, some don't. There's locking. Line level, table level, database level. Oh, wait, you didn't know about tables vs databases? What do you do when your query takes too long? Didn't you know about connecting before making a query, persistent connections, and how to interpret obscure error messages?
CouchDB just didn't work (Score:3)
a majority of our unplanned downtime was due to CouchDB issues
Nowhere on the CouchDB home page [apache.org] is reliability even mentioned. And that's the real issue. Developing a reliable database system is a difficult design and programming task. It requires real software engineering. The hacks who write PHP and use JSON aren't up to a job like that. The "aw, we'll fix it in the next release" attitude doesn't cut it in databases.
Re:The decision the simple (Score:5, Interesting)
That's actually a rather insightful point...
If your application fits well with the methodologies of a traditional RDBMS, use a traditional RDBMS, and hire people who are trained and experienced in using those methodologies to their full potential.
If you're dealing with the latest Big Data paradigms and designs, where you can sacrifice some of the rigidity of a RDBMS to gain some flexibility and cheaper scalability, use a NoSQL database, and hire people who aren't stuck in their old RDBMS ways.
Re:The decision the simple (Score:5, Insightful)
That's actually a rather insightful point...
If your application fits well with the methodologies of a traditional RDBMS, use a traditional RDBMS, and hire people who are trained and experienced in using those methodologies to their full potential.
If you're dealing with the latest Big Data paradigms and designs, where you can sacrifice some of the rigidity of a RDBMS to gain some flexibility and cheaper scalability, use a NoSQL database, and hire people who aren't stuck in their old RDBMS ways.
The real key is for the person doing the hiring to understand which of those methodologies fits their application.
Re:The decision the simple (Score:5, Informative)
The real key is for the person doing the hiring to understand which of those methodologies fits their application.
This is insightful. I've worked extensively with RDBMS solutions and now quite a bit with NoSQL technologies. They each have their place. An entire article could be written on where each fits most naturally, but in general if you don't need to join between tables, need to throw data to your store at a high velocity (e.g. logging), and/or need a loose schema, a NoSQL solution works best. If what you're doing can be naturally modeled (i.e. users HAVE AND BELONG TO stations, stations HAVE MANY playlists, etc. etc.), use an RDBMS.
One can see in the subtext of the GP that they may not get this, with their comment that people using RDBMS solutions are "stuck in old ways". It seems like they are saying that NoSQL is effectively always best. I'm curious why they think that. Nail, hammer, etc...
Re: (Score:2)
people using RDBMS solutions are "stuck in old ways". It seems like they are saying that NoSQL is effectively always best.
No, no, no, no, no, no, no, no, no, and hell no.
I'm referring somewhat-sarcastically to the RDBMS proponents who reject NoSQL out of hand. The ones who see "database" and think it must have a rigid structure, where all connections are made with JOINs. The ones who don't accept that NoSQL databases are inherently different and must be designed differently. If a programmer is actually stuck thinking in terms of an RDBMS, they should not be working in a NoSQL database. If the programmer is flexible enough to d
Re: (Score:2)
Fair enough, sorry if I misunderstood.
Re: (Score:2)
So, I'm actually curious about this part.
I've worked in RDB's, and I've worked in things that are more based on Berkeley DB ... but I am actually having a hard time thinking of specific examples of where I'd want something database-ish and not have the need for JOINs.
Berkeley gives you key value pairs, but the product I worked on which was based on it allowed us to do searching on multiple of tho
Re: (Score:2)
We were excited to try a NoSQL db, having spent too many years using MySQL in ways that the designers of relational databases never imagined. CouchDB seemed well suited to our needs.
wow. and i thought i had some mean hubris.
Re: (Score:3)
And most importantly, make sure you know the difference.
Because I should think someone who thinks you should ditch your RDBMS when it's the thing you need to keep using is going to cause you more problems than they're worth. Of course, the opposite is true ... I remember someone who insisted on writing ER diagrams to describe our system, despite it not being an RDB, and not being accurately described by ER diagrams -- but to him everything was an ER diagram.
It's not uncommon for geeks to push to use the la
Re: (Score:2)
Of course, the opposite is true ... I remember someone who insisted on writing ER diagrams to describe our system, despite it not being an RDB, and not being accurately described by ER diagrams -- but to him everything was an ER diagram.
I can't say whether entity relationship diagrams were appropriate in the situation you describe but there is nothing wrong in principle in using ER diagrams to describe non-RDB systems. ER diagrams describe the logical or semantic model, not the physical implementation, and are therefore DB agnostic. Yes, they are often used to help design an RDB schema but their real value is to understand your data at the semantic level.
Unfortunately, many don't grasp this distinction and you'll see many RDB systems where
Re:The decision the simple (Score:4, Insightful)
RDBMS systems can be flexible also. It just takes a bit of planning, a good understanding of your data and a well designed application...which you should do/have regardless of your storage solution.
Call me set in my old RDBMS ways, but if I'm supporting it then I want to know what the hell is gong on with the data.
Re: (Score:2)
going, going, gong!
Re: (Score:2)
If you're dealing with the latest Big Data paradigms and designs, where you can sacrifice some of the rigidity of a RDBMS to gain some flexibility and cheaper scalability, use a NoSQL database, and hire people who aren't stuck in their old RDBMS ways.
Well, no. If you're dealing with "big data", you still need to evaluate which tool is appropriate for the task. If you're calling it "NoSQL", you're probably referring to a rather immature set of products designed to pander to people looking for teh new hotness. If you're looking for a key-value store, mature solutions like Berkeley DB have been around for ages.
It's not that key-value stores don't have their place, it's that people running around chanting the NoSQL mantra are really just reinventing the
The Uncle Larry feature (Score:2)
Re:The decision the simple (Score:4, Insightful)
Or use a better DB like Postgres. How MySQL is still popular I will never know. I think it is a conspiracy to prove FREE DBs suck.
Re: (Score:2)
MySQL has a version 5 now, you should check it out.
Re: (Score:2)
Does it have transactions yet? In the default engine, not by going to someone else and getting InnoDB.
Does it have booleans yet?
Re: (Score:2)
The anonymous coward comment was by me. I forgot to log in on this account.
Re: (Score:2)
Re: (Score:3)
Also, I know what I'm doing.
This line really doesn't count for anything. How many people are really going to say "I don't know what I'm doing", or "I'm incompetent"? Everyone thinks they know what they're doing.
You may or may not really know what you're doing, we have little way to know for sure, but you saying it about yourself is meaningless.
Re:Has to be said (Score:5, Funny)
MongoDB is Webscale. MySQL is not Webscale, because it uses joins. SQL also has impetus mismatch.
Wikipedia and Slashdot use MySQL (Score:4, Insightful)
MySQL is not Webscale, because it uses joins.
Then how does a non-webscale database power popular web sites such as Wikipedia and Slashdot? If you don't do joins in the database, you'll probably end up doing the equivalent of joins (using one value as the key in another table) in your application.
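To make "doing the equivalent of joins in your application" concrete, here is a minimal sketch. The `users` and `comments` data are made up for illustration and do not come from any real site; the point is only that looking up one value as the key into another collection is a join by another name.

```python
# Hypothetical data: two "tables" fetched separately from a store without joins.
users = {1: "alice", 2: "bob"}  # user_id -> name
comments = [
    {"user_id": 1, "text": "first post"},
    {"user_id": 2, "text": "MySQL is fine"},
    {"user_id": 1, "text": "so is Postgres"},
]

# The application-side "join": use comment["user_id"] as the key into users.
joined = [(users[c["user_id"]], c["text"]) for c in comments]

for name, text in joined:
    print(f"{name}: {text}")
```

This does by hand what `SELECT u.name, c.text FROM comments c JOIN users u ON u.id = c.user_id` would do in the database.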
Re:Wikipedia and Slashdot use MySQL (Score:5, Informative)
Then what's it called instead of a join? (Score:5, Insightful)
Re: (Score:3, Interesting)
Not sure about MongoDB or CouchDB, but I have experience with RavenDB, which is absolutely fantastic. Instead of "joins" you have "includes" or "live projections". See http://ravendb.net/docs/client-api/querying/handling-document-relationships [ravendb.net]
Re:Then what's it called instead of a join? (Score:5, Funny)
Witchcraft.
Re: (Score:2)
If you have a lot of related data, then you should probably be using a relational db, no?
If all you have is a hammer (Score:2)
Re:Wikipedia and Slashdot use MySQL (Score:5, Funny)
MongoDB can write its data to /dev/null for extra performance.
Re: (Score:3, Funny)
If /dev/null is webscale then I will use it.
Re: (Score:2)
I hadn't read the article this was based on before, thanks for the laugh. I encourage others to Google "webscale" :^)
Re: (Score:2)
I read this entire thread wondering if "Webscale" is really some kind of valid term or not. I'm not convinced that it has any valid meaning whatsoever.
Re: (Score:2)
I see this comment a lot and I don't get it.
Here you go. [youtube.com] Now you'll get why it's funny, and not serious.
Re: (Score:3)
So the thing is, traditional joins (on, say, Postgres or MySQL) aren't blocking operations. You can run more than one at a time. MapReduce (as well as writes, any aggregation, and any use of JavaScript) are blocking operations on Mongo. They block the entire mongo process. The MapReduce case gets around this with a bit of cooperative multitasking (yielding every few hundred or thousand rows), but writes, aggregation, and other use of javascript do not. So there's already a much bigger need to distribut
How is XML indexed? (Score:2)
do they have an XML field type? MS SQL Server does [...] which allows you to essentially keep the table schema-less but still allows you to perform complex queries on the contained data.
But how does it index the data in the XML or JSON fields? How does it, say, tell an element containing a number from an element containing text? Does it act like SQLite, which is dynamically typed (and thus can store text in any field) but can be told to prefer to compare and index certain columns as numbers, dates, text with Unicode collation, or binary data?
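For anyone who hasn't seen the SQLite behavior being referred to, here is a small sketch using Python's built-in `sqlite3` module. The table and column names are made up for illustration. It shows only SQLite's column type affinity, not how SQL Server or any other database handles XML or JSON fields.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: n prefers integers, s prefers text.
conn.execute("CREATE TABLE t (n INTEGER, s TEXT)")

# The text '123' looks like a number, so the INTEGER column stores it as an integer;
# the number 456 going into the TEXT column is stored as text.
conn.execute("INSERT INTO t VALUES ('123', 456)")
# 'abc' cannot be converted, so the INTEGER column simply stores it as text.
conn.execute("INSERT INTO t VALUES ('abc', 'def')")

rows = conn.execute("SELECT typeof(n), typeof(s) FROM t ORDER BY rowid").fetchall()
print(rows)
```

The declared type is a preference rather than a constraint, which is why the second row goes in without complaint.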
Re: (Score:2)
Does it act like SQLite, which is dynamically typed
That's one of the annoying things to me about SQLite, and the justification is a marvellous case of double standards. It goes something like this:
Static typing is bad because it limits flexibility and you can check elsewhere if you need to. We will never add it ever because we are right about this.
Then, in release foo.bar of SQLite
We've added foreign keys!
Either constraints are good or they are bad. Typing is just another constraint, like foreign keys, and
SQLite static typing (Score:2)
Typing is just another constraint, like foreign keys, and various other domain constraints. I cannot see any valid argument for having foreign keys but not type constraints. It just seems bizarre to me. It's not like they could be optional or anything.
They are optional. It appears you can enforce static typing for a column with constraints like CHECK(typeof(x)='integer'). I'd give more details, but the document that Wikipedia cites about such constraints is a printed publication of which I happen not to own a copy.
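A quick sketch of that constraint, using Python's built-in `sqlite3` module. The table name `t` and column `x` are just placeholders, and this is one way to try it out rather than anything taken from the cited document.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The declared type alone is not enforced; the CHECK is what rejects non-integers.
conn.execute("CREATE TABLE t (x INTEGER CHECK (typeof(x) = 'integer'))")

conn.execute("INSERT INTO t VALUES (42)")  # accepted

try:
    conn.execute("INSERT INTO t VALUES ('not a number')")
    rejected = False
except sqlite3.IntegrityError:
    # SQLite reports a failed CHECK constraint as an integrity error.
    rejected = True

rows = conn.execute("SELECT x, typeof(x) FROM t").fetchall()
print(rejected, rows)
```

The type does end up written twice (once as the declared type, once inside the CHECK), but the second insert is refused instead of being stored silently as text.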
Re: (Score:2)
That's interesting. I'll have to look into that.
It's more than mildly annoying that when you specify the type, it doesn't check it; instead you have to specify it twice. Anyhow, if it can be beaten into shape, then that makes it more acceptable.
Re: (Score:2)
Does it act like SQLite, which is dynamically typed
That's one of the annoying things to me about SQLite,
Either constraints are good or they are bad. Typing is just another constraint, like foreign keys, and various other domain constraints. I cannot see any valid argument for having foreign keys but not type constraints. It just seems bizarre to me. It's not like they could be optional or anything.
Foreign keys are good, but type independent. You just want to check that some foreign key actually references valid data in another table. If they match, they match.
However, typing in SQL is almost certainly a case of implementation details leaking through the abstraction layer. The data type is defined by the data and how it is used. Traditional SQL databases require that type data up front so they can organise the data on disk. But you shouldn't care how data is organised on disk.
When comparing dates, the
Re: (Score:2)
SQL Server really does have good online documentation. Just search for "SQL Server [feature]" and pick the MSDN link. In this case, I searched for XML INDEX. http://msdn.microsoft.com/en-us/library/bb934097.aspx [microsoft.com]
Oracle supports XML Types with XPath style queries. I don't remember any XML Indexes, but you can always use a function based index against the XPath.
Re: (Score:2)
SELECT * and ye shall find.
Re: (Score:2)
What did they find wrong with MongoDB, apart from rumour?
I doubt RavenDB is any different really, it looks and feels the exact same (except written in .NET so you have a lot of garbage and resource issues to deal with), same as Cassandra (written in java).
Maybe erlang isn't as good as they say, or maybe CouchDB isn't as well written as they say.
Re: (Score:2)