Open Source Solution Breaks World Sorting Records
Posted by Soulskill
from the out-of-sorts dept.
allenw writes "In a recent blog post, Yahoo's grid computing team announced that Apache Hadoop was used to break the current world sorting records in the annual GraySort contest. It topped the 'Gray' and 'Minute' sorts in the general purpose (Daytona) category. They sorted 1TB in 62 seconds, and 1PB in 16.25 hours. Apache Hadoop is the only open source software to ever win the competition. It also won the Terasort competition last year."
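For a sense of scale, the two results in the summary can be turned into rough aggregate throughput figures. This is only a back-of-the-envelope sketch: it assumes decimal units (1 TB = 10**12 bytes, 1 PB = 10**15 bytes), and the function name is made up for illustration.

```python
# Rough throughput arithmetic for the two results quoted in the summary.
# Assumption: decimal units (1 TB = 10**12 bytes, 1 PB = 10**15 bytes).

def throughput_gb_per_s(total_bytes, seconds):
    """Average sorted bytes per second, expressed in GB/s (10**9 bytes)."""
    return total_bytes / seconds / 1e9

# 1 TB in 62 seconds
terabyte_run = throughput_gb_per_s(10**12, 62)

# 1 PB in 16.25 hours (converted to seconds)
petabyte_run = throughput_gb_per_s(10**15, 16.25 * 3600)

print(f"1 TB run: {terabyte_run:.2f} GB/s")
print(f"1 PB run: {petabyte_run:.2f} GB/s")
```

Under those assumptions both runs work out to somewhere around 16 to 17 GB/s across the whole cluster.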
Overlords (Score:3, Funny)
I'm sure that I can rock their scores (Score:1, Funny)
Just give me a few minutes to patch together a bubblesort from my high school Pascal class. I'll show them record speed!
Re:Overlords (Score:5, Funny)
I for one welcome our new datasorting overlords!
With a name like Apache Hadoop, I wouldn't be surprised if they came from Star Wars.
Re:I'm sure that I can rock their scores (Score:5, Funny)
My sort [wikipedia.org] will totally beat yours!
Re:I'm sure that I can rock their scores (Score:3, Funny)
Bogosort: for when you are paid by the hour, but aren't penalised for being late.
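For anyone who hasn't met it, bogosort is usually described as "shuffle until sorted". A minimal sketch of that idea follows; the function names and the optional `rng` parameter are my own, not from any library, and this should only ever be run on tiny inputs.

```python
import random

def is_sorted(items):
    """True if every element is <= the one after it."""
    return all(a <= b for a, b in zip(items, items[1:]))

def bogosort(items, rng=None):
    """Shuffle a copy of items until it happens to be sorted.

    Returns the sorted list and the number of shuffles it took.
    Expected work grows factorially, so keep the input tiny.
    """
    rng = rng or random.Random()
    result = list(items)
    shuffles = 0
    while not is_sorted(result):
        rng.shuffle(result)
        shuffles += 1
    return result, shuffles

sorted_items, shuffles = bogosort([3, 1, 4, 1, 5], rng=random.Random(0))
print(sorted_items, "after", shuffles, "shuffles")
```

Passing a seeded `random.Random` just makes the run repeatable; the shuffle count varies wildly from seed to seed.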
Is it settled? (Score:5, Funny)
So, it appears they have finally sorted out whether open source beats proprietary.
Re:I'm sure that I can rock their scores (Score:4, Funny)
I've asked lots of interview candidates to implement randomSort. They've never heard of it, so then I describe the algorithm.
Watching their eyes go wide is the highlight of the interview, typically.
Occasionally some person who has overcome their interview nervousness will, with eager honesty, try to convince me that this is not a very good sort algorithm, and that much better ones are taught in universities these days.
Good Times.
Re:What data? (Score:5, Funny)
This doesn't say anything if we don't know what kind of records were supposed to be sorted.
It's amazing what you can learn if you actually RTFA.
All of the sort benchmarks measure the time to sort different numbers of 100 byte records.
If that's not good enough for you, post your email address and maybe someone will be kind enough to send you the 100TB and 1PB data files they used.
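To make "100 byte records" concrete, here is a small in-memory sketch of sorting fixed-width records. The record size comes from the comment above; treating the first 10 bytes of each record as the sort key is an assumption made for illustration, and the helper names are hypothetical. It says nothing about how Hadoop does it at scale.

```python
import os

RECORD_BYTES = 100  # record size mentioned in the comment above
KEY_BYTES = 10      # assumption for illustration: key is the first 10 bytes

def split_records(blob):
    """Cut a byte string into fixed-width records."""
    if len(blob) % RECORD_BYTES != 0:
        raise ValueError("input is not a whole number of records")
    return [blob[i:i + RECORD_BYTES] for i in range(0, len(blob), RECORD_BYTES)]

def sort_records(blob):
    """Return the same records, ordered by their key prefix."""
    records = split_records(blob)
    records.sort(key=lambda record: record[:KEY_BYTES])
    return b"".join(records)

# 1,000 random records stand in for the real benchmark input.
data = os.urandom(RECORD_BYTES * 1000)
out = sort_records(data)
print(len(split_records(out)), "records sorted")
```

At 100 bytes per record, 1 TB (taken as 10**12 bytes) is 10 billion records, which is why the real runs need a cluster rather than a list in memory.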
Re:I'm sure that I can rock their scores (Score:3, Funny)
Bogosort: for when you are paid by the hour, but aren't penalised for being late.
with my luck, bogosort would get it right the first time.
Re:I'm sure that I can rock their scores (Score:4, Funny)
No, he clearly changed roles from developer to Evil HR. He's probably directly subservient to Catbert.
Re:Use C++ and save 10x the hardware (Score:1, Funny)
Use C++ and save 10x the hardware
You tell em brutha! I'm so tired of carrying 10 cell phones to play java games.
Re:Overlords (Score:3, Funny)
Fastest implementation of BubbleSort EVER!
Re:Overlords (Score:5, Funny)
Re:Great! It's open source! (Score:4, Funny)
"Why isn't this illegal"
Because they made it legal by passing it on a Totally Unrelated Bill.
Re:Overlords (Score:1, Funny)
Actually, it came from Google. Sorta.
i actually like the name 'Google Sorta' better than Apache Hadoop