Slashdot Log In
Brian Aker On the Future of Databases
Posted by
kdawson
on Tue Jun 03, 2008 05:52 PM
from the everything-you-know-changed dept.
from the everything-you-know-changed dept.
blackbearnh recommends
an interview with MySQL Director of Technology Brian Aker that O'Reilly Media is running. Aker talks about the merger of MySQL with Sun, the challenges of designing databases for a SOA world, and what the next decade will bring as far as changes to traditional database architecture. Audio is also available. From the interview: "I think there's two things right now that are pushing the changes... The first thing that's going to push the basic old OLCP transactional database world, which... really hasn't [changed] in some time now — is really a change in the number of cores and the move to solid state disks because a lot of the... concept around database is the idea that you don't have access to enough memory. Your disk is slow, can't do random reads very well, and you maybe have one, maybe eight processors but... you look at some of the upper-end hardware and the mini-core stuff,... and you're almost looking at kind of an array of processing that you're doing; you've got access to so many processors. And well the whole story of trying to optimize... around the problem of random I/O being expensive, well that's not that big of a deal when you actually have solid state disks. So that's one whole area I think that will... cause a rethinking in... the standard Jim Gray relational database design."
Related Stories
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Full
Abbreviated
Hidden
Loading... please wait.
Well (Score:5, Funny)
Re: (Score:2, Funny)
Re: (Score:3, Informative)
Come on, he's talking about the future of databases. He was just trying to set the mood by doing his best Kirk impression.
Re: (Score:3, Informative)
Re: (Score:3, Funny)
Re:Well (Score:4, Funny)
Parent
Dear Slashot (Score:5, Funny)
Thank you.
Re: (Score:2)
Re: (Score:3, Informative)
Re:Dear Slashot (Score:5, Interesting)
We've been running MySQL using MyISAM since 2002. It's delivered acceptable performance until recently as we've expanded our application and the data volumes have increased. Now, we have to reorganize it on a frequent basis (we just backup and restore).
But, we really need to move to a transactional model so I've done some benchmarking between InnoDB and Postgresql. In almost all cases, Postgresql was significantly faster. Our application is very transactional with alot of writes.
And from what I've read, Postgresql scales well to multiprocessors and multiple cores where as MySQL does not. I know Falcon is coming but it was still very Alpha at the time I compared - I couldn't get it to run long enough to perform the tests.
Has anyone else compared Postgres to MySQL/Innodb?
Parent
Leaky abstractions (Score:5, Interesting)
I think TFA is pretty right on the money that parallellization and massive use of SSD could cause some pretty fundamental changes in how we approach database optimization - if I were to imagine that rack that I'm staring at being filled with SSD drives and processors instead of with nothing but hard drives... locality of data takes on a whole new meaning if you don't require data to be on the same sector of the HD, but rather want certain sets of data to be stored on storage chips located around the same processor chips to avoid having to overload your busses.
Then again, I haven't been in this game for so long, so maybe I'm overestimating the impact. Oldtimer opinion would be very welcome.
Re: (Score:2)
One example I know is in Python.
>>> a=[[5]]
>>> b=a
>>> a[0][0]=3
>>> b
[[3]]
Should changing one variable also change another? It makes sense if you think of a Python list as linked list of pointers. (It acts that way -- I know nothing about the internals of Python.) The behavio
Re: (Score:2)
This IS news! (Score:5, Funny)
Re: (Score:2)
Re: (Score:3, Informative)
Relational algebra has nothing to do with random IO however building a relational database system has everything to do with random IO because it is by and large the worst bottleneck in the system. The best performing RDBMSs are the ones completely designed around avoiding random IO. That's why TFA says a new RDBMS could be created from scratch and blow the existing players out of the water in the new SSD world.
Admittedly.... (Score:2, Interesting)
Databases need to scale to disgusting large numbers. Memory and disk resources should always be treated as expensive, precious commodities, because they might be plentiful on a simple database on robust hardware, but there are plenty of people out there with massive friggin' databases.
In corporate America, Oracle
Re: (Score:3)
Your statement of "Databases need to scale to disgusting large numbers." is spot on.
Re: (Score:3)
Our slots database didn't just track totals for rewards credits and such, we kept every transaction ever. We can see how long you played at each machine, for how long, when you put money it, when you hit, when you took money out, etc. We know when you want to come in and play, which machines, etc. Consider that for every customer in a casino. Now consider than on a scale of 60 major casinos.
I'm not a DBA, so I was
Re: (Score:3, Funny)
COMMIT;
try {
machine.collectMoney();
player.run();
} catch(SecurityException e) {
player.resist();
if(player.isFast())
casino.exit(0);
else
player.yield();
}
Re: (Score:2)
I imagine a schema like OrderItemId1, OrderItemId2
Re: (Score:2)
Seriously though, I'm curious what type of application would need and actually benefit from such a design.
Re: (Score:3, Insightful)
I don't know much about SPSS, but if you need to present the data in such a flat format, you can always store the data the proper way and create a view when you need to import it into SPSS.
I'm not sure if there is some sort of limit on the number of columns in Oracle views or not.
Re: (Score:3, Informative)
Re: (Score:2)
1000? That's "One Thousand" ?? As in ten times more than 100?
You might want to take some database design courses if you consider a 1000 column limit to be unreasonable or unworkable. The widest table I'd ever seen was one where the original programmer did a poor job of normalization (that is, he didn't do it), and even that only hit about 150 to 200 columns (times about 60 million rows. Ouch.) Not only is it difficult to maintain wide tables like that, the database will probably perform poorly as well whe
Re: (Score:2, Interesting)
BigTable, HBase and SimpleDB are the future (Score:3, Informative)
Can I ask a stupid question... (Score:3, Interesting)
Again, I'm sure this is a stupid question, but perhaps someone could clue me in.
Re:Can I ask a stupid question... (Score:4, Informative)
MySQL only does table locks, which are much simpler and much faster for light workloads, but as I'm sure you can imagine when you have many CPUs trying to update the table at once in the end each thread has to wait their turn to grab the lock and perform their updates sequentially.
In SQL Server, Oracle, or any other "enterprisey" db multiple threads can update the same table at exactly the same time, as long as its not the same row.
Stuff like this is exactly why people who use MS-SQL and oracle look down their nose at people who use MySQL and claim it is capable of playing with the big boys.
Once again, despite what MySQL are saying there is nothing innovative here. All this stuff has existed in the mainstream database engines for many, many years and they are still playing catchup.
Parent
Re: (Score:3, Informative)
Re: (Score:2)
I'm wondering if the traditional lock approach used in MySQL (and most SQL databases as far as I know) somewhat kills parallel operations
I'm not sure what you mean, fine-grained locking is up to the engine, and MySQL can't really do it.
InnoDB is the classic example; you can get MySQL to lock entire InnoDB (or MyISAM) tables, but if you want to lock a certain row in an InnoDB table MySQL can't help, because it doesn't know about InnoDB.
If you're saying "the MyISAM/MySQL table-based locking won't work in a parallel environment", then you're right, but that's nothing new that relates SSDs and micro-cores.
Fully transactional engines have
Re: (Score:2)
I wonder which one has better performance for multiple operations at once.
Re: (Score:3, Informative)
Re: (Score:2)
Both interbase and MySQL's InnoDB engine use the concept of Multi Version Concurrency Control (MVCC) to address the issue of simultaneous access -- although their implemenations differ: InnoDB does make use of row-level write locks. The upcoming Falcon engine is said to do the same with even fewer loc
Cores? (Score:2)
Re:Cores? (Score:5, Insightful)
"pseudo-" is that verbal prefix which means "false".
I'm seeing language devolve in front of my eyes...
Parent
MonetDB (Score:2)
Brian's Blog (Score:2)
Sup Brian!
-Bryan (from apachetoolbox)
Locality is the key (Score:5, Interesting)
Given the amount of computing power on hand today, it may surprise many how difficult it is to engineer a system capable of executing more than a few thousand transactions per second per thread.
Why? Latency. Consider your average SOA application which reaches out to 4-5 remote services or dataserver calls to execute its task. Each network/rpc/soap/whatever call has a latency cost of anything between one and at worst several hundred milliseconds. Lets say for example that the total latency for all the calls necessary is 10 milliseconds. 1000/10=100 transactions per thread per second. Oh dear.
The amount of memory an "average" server ships with today is in the 32-64GB range. Next year it will be in the 64-128GB range. The average size of an OLTP database is 60-80GB.
So, the amount of memory available to the application tier will very soon be greater than the size of the database, warehouses excluded. Moore's law is quickly going to give the application tier far more memory than it needs to solve the average business state, exceptions noted.
The final fact in the puzzle is that for transaction processing, read operations outnumber write operations by roughly 20 to 1. (This will of course vary on the system, but that *is* the average.)
This situation is strongly in favor in migrating read only data caches back into the application tier, and only paying for the network hop when writes are done in the interests of safety. (There is a lot of research into how writes can be done safely asynchronously at the moment, but its not ready yet IMHO.)
Challenges exist in terms of efficient data access and manipulation when caches are large, performant garbage collection and upset recovery - but they are all solvable with care.
Its my opinion that in the near future large data caches in the application tier will become the norm. What has to be worked out is the most effective way of accessing, manipulating and administering that data tier and dealing with all the inevitable caveats of asynchronous data flow.
Some (not complete) examples of implementing this:
Relational Caches (there are many more):
http://www.oracle.com/technology/products/coherence/coherencedatagrid/coherence_for_java.html
http://www.alachisoft.com/ncache/index.html
Object Caches:
http://www.ogf.org/OGF21/materials/970/GigaSpaces_DataGrid_OGF_Oct07.ppt
http://jakarta.apache.org/jcs/
Re:Locality is the key (Score:5, Interesting)
For instance, from the JCS link you provided:
JCS is not a transactional distribution mechanism. Transactional distributed caches are not scalable. JCS is a cache not a database. The distribution mechanisms provided by JCS can scale into the tens of servers. In a well-designed service oriented architecture, JCS can be used in a high demand service with numerous nodes. This would not be possible if the distribution mechanism were transactional.
So if you're having to give up transactional integrity to have your distributed cache, I think it's going to have limited applications because it doesn't solve that 1000 transactions per thread problem you indicated. Sure you can work your way around it a little by treating it as extremely optimistic locking to maintain transactional integrity on writes, but it also does limit the accuracy of the cache and for some applications (financial for starters, I would expect) that's going to be an issue.
Parent
Re: (Score:3, Interesting)
Not to bash Oracle, but the ultimate scalability of their multi-host database partitioning solution (RAC) is indeed limited by the amount of communication the distributed lock manager needs to make to ensure transactional isolation as the number of partitions/hosts increase. (Caveat to Oracle fans - 80% of requirements are beneath this threshold - so I understand Oracle's strategy.) (An alternative solution is the "shared nothing" partitioning approach (example - db2's DPF) but this
Solid state storage devices are more than disks. (Score:5, Interesting)
Until recently, solid state storage devices have been treated as "disks". But they're not disks. They have orders of magnitude less latency.
For files, this doesn't matter all that much. For databases, it changes everything. Solid state devices need new access mechanisms; I/O based seek/read/write is oriented towards big blocks and long latencies. The optimal access size for solid state storage devices is much smaller, more like the size of a cache line. With smaller, lower latency accesses, you can do more of them, instead of wasting channel bandwidth reading big blocks to get some small index item. It's not RAM, though; these devices usually aren't truly random access.
It starts to make sense to put more lookup-type functions out in the disk, since getting the data into the CPU has become the bottleneck. Search functions in the disk controller went out of fashion decades ago, but it may be time to bring them back. It may make sense to put some of the lower-level database functions in the disk controller, at the level of "query with key, get back record". Cacheing at the disk controller definitely makes sense, and it will be more effective if it's for units smaller than traditional "disk blocks"
This could be the beginning of the end of the UNIX "everything is a stream of bytes" model of data storage. We may see the "database is below the file system" model, which has appeared a few times in mainframes, make it to ordinary servers.
Cores, cpus, nodes, .. (Score:4, Informative)
Re: (Score:2, Funny)
In 2025, this will be grouped with "640K RAM will always be more than enough." and you'll feel like such a Bill G.
Re: (Score:2, Informative)
I'm sure he'll feel lots worse. While Gates gets hounded for something he never said, at least he has mountains and mountains of cash to console him.
Re: (Score:3, Interesting)
In 20 years, I do not know what form slow, or cheap, or permanent storage may take. It may not be spinning magnetized platters. But I do know that in 20 years, every well-written database will have algorithms and data structures to deal with slow
Re: (Score:2)
Point: RAID5 doesn't statistically increase reliability. I know someone who worked for a data recovery firm, lots of people swapped the drives the moment they failed and had a second go while the
Re: (Score:3, Insightful)
Worse than that. Part of that is the fact that RAID vendors tell people to go with drives from the same manufacturer to optimize performance, when in reality, they should be doing with the most diverse pool of drives possible, dramatically reducing the probability of multiple catastrophic failures at the same time (except through human error, some sort of physical impact, electrical surge, etc.). If a drive has a design defect, it isn't at all uncommon for them all to fail at n+/-k hours for some relative
Re: (Score:2)
Re:The furture of databases you say? (Score:4, Interesting)
Oracle's RAC seems to be a better solution than MSSQL's approach. PostgreSQL (and EnterpriseDB) are working on a more RAC-like approach.
This is a good story about a company that successfully moved from Oracle to Postgresql [arnnet.com.au]. Basically, they had 2 database systems running Oracle, a data warehouse and an OLTP system. They moved their data warehouse over to Postgresql running on Solaris 10, then they used the licenses they no longer need for the data warehouse to boost the computing power of the OLTP system.
Parent