Microsoft Research Introduces Record-Beating MinuteSort Tech
mikejuk writes "A team from Microsoft Research has taken the lead in the MinuteSort data sorting test using a specially-devised technology: Flat Datacenter Storage. The figures are impressive: 1401 gigabytes in 60 seconds, using 1033 disks across 250 machines. Not only is this three times as much as the previous record, but also, it uses only one sixth of the hardware resources, according to a blog post about the test from Microsoft. One thing that's interesting about the success is the technology used. While solutions such as Hadoop and MapReduce are traditionally used for working with large data sets, Microsoft Research created its own technology called 'Flat Datacenter Storage,' or FDS for short. This isn't just academic research, of course. The team from Microsoft Research has already been working with the Bing team to help Bing accelerate its search results, and there are plans to use it in other Microsoft technologies."
This is what I like about Microsoft (Score:1, Troll)
Re: (Score:3, Insightful)
"They are pretty much the only one of the large companies that fund this kind of research"
Bullshit alert.
"Their work does lots of good for the world."
For the world? Or for Microsoft?
Citations needed.
Re:This is what I like about Microsoft (Score:5, Informative)
From the Wiki (http://en.wikipedia.org/wiki/Microsoft_Research#Laboratories [wikipedia.org]), all of the following have come from MS Research
C#
Comic Chat (IRC Client)
F#
Sideshow (Became Desktop Gadgets)
Surface (TouchLight)
SenseCam
ClearType
Group Shot
Allegiance (Game)
Songsmith
I'd say C#, F#, and ClearType are pretty big contributions
Re: (Score:1, Interesting)
And these are exact, high-profile products that have come out of Microsoft Research. You have to remember that they work on many smaller things that will be then integrated into other Microsoft products, or do work 'just for science' (which is pretty amazing from Microsoft).
Re: (Score:3)
I don't doubt that Microsoft Research has made important contributions (even though from the list you posted only C# is something I can put my finger on). Obviously, it's the "They are pretty much the only one" part that is complete nonsense.
Re:This is what I like about Microsoft (Score:4)
For one moment there, I read "Comic Sans" instead of "Comic Chat".
Re:This is what I like about Microsoft (Score:4, Informative)
ClearType invented nothing apart from the name itself.
Sub-pixel rendering was used two decades ago by Apple [grc.com].
"Back in 1976, my design of the Apple II's high resolution graphics system utilized a characteristic of the NTSC color video signal (called the 'color subcarrier') that creates a left to right horizontal distribution of available colors. By coincidence, this is exactly analogous to the R-G-B distribution of colored sub-pixels used by modern LCD display panels. So more than twenty years ago, Apple II graphics programmers were using this 'sub-pixel' technology to effectively increase the horizontal resolution of their Apple II displays." - Steve Wozniak
Re:This is what I like about Microsoft (Score:5, Informative)
You immediately lose credibility by citing Steve Gibson.
The type of subpixel rendering done on old Apple IIs essentially treats the color display as a monochrome display of triple the resolution. This is clever and useful, but causes color fringing.
ClearType takes the concept substantially further by applying perceptual modelling to determine how the subpixels can be used. It's similar to MP3 audio, in that the process adds artifacts, but some artifacts will be invisible (or inaudible in MP3's case) to a human. The trick is minimizing the visible artifacts.
For example, if you have a one pixel wide line, it is always safe to shift it one third of a pixel to the left. RGB becomes BRG, which still appears the same.
However, if you have a one third pixel width line, you cannot just use one third of the subpixels. A "white" vertical line would be all red, all green, or all blue, depending on which subpixel it fell on. ClearType would render it using all three subpixels but in the correct color.
There's quite a bit more to it - sometimes you can use a single subpixel depending on what neighbors it, and/or you can adjust adjacent subpixels to mask fringing artifacts.
So yes, sub-pixel rendering isn't a wholly new concept, but saying ClearType isn't novel is willfully ignorant.
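To make the difference concrete, here is a rough Python sketch. It is an illustration only, not ClearType's actual algorithm: the function names are made up, and the 1-2-3-2-1 smoothing weights are an assumed stand-in for the perceptual modelling described above. It treats one row of an RGB-striped display as a list of subpixel coverage values, then compares naive on/off subpixels with a filtered version that spreads each sample over its neighbors.

```python
def naive_subpixels(coverage):
    """Treat the row as a monochrome display of triple resolution (on/off only)."""
    return [1.0 if c >= 0.5 else 0.0 for c in coverage]


def filtered_subpixels(coverage, weights=(1, 2, 3, 2, 1)):
    """Spread each sample over neighboring subpixels to reduce color fringing.

    The weights are an assumption chosen for illustration.
    """
    total = sum(weights)
    half = len(weights) // 2
    out = [0.0] * len(coverage)
    for i, c in enumerate(coverage):
        for k, w in enumerate(weights):
            j = i + k - half
            if 0 <= j < len(out):
                out[j] += c * w / total
    return out


def to_pixels(subpixels):
    """Group a row of subpixel intensities into (R, G, B) triples."""
    return [tuple(subpixels[i:i + 3]) for i in range(0, len(subpixels), 3)]


# A line one third of a pixel wide, landing on the green subpixel of the middle pixel.
thin_line = [0, 0, 0, 0, 1, 0, 0, 0, 0]
print(to_pixels(naive_subpixels(thin_line)))     # middle pixel lights green only
print(to_pixels(filtered_subpixels(thin_line)))  # middle pixel lights all three channels
```

With the naive version the thin "white" line comes out pure green. With the filter, the middle pixel gets some red, green, and blue, at the cost of a little blur into the neighboring pixels.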
Re: (Score:2)
You immediately lose credibility by citing ClearType.
http://en.wikipedia.org/wiki/Subpixel_rendering [wikipedia.org]
Originally invented and patented by IBM in 1988 [patentstorm.us].
Re: (Score:2)
For example, if you have a one pixel wide line, it is always safe to shift it one third of a pixel to the left. RGB becomes BRG, which still appears the same.
I have a pentile display in portrait mode you insensitive clod!
impressive papers at 2012 SIGGRAPH (Score:2)
Re: (Score:2)
it just means royalties and licenses go to ms instead of others. this contributed to microsoft, not the world. your citation is invalid, try again. :)
.NET, C#, VB.NET, and F# are all free... Download the .NET framework, fire up a text editor, and use the command-line compiler.
Visual Studio Express versions are free... Wrap a GUI around your development.
Parts of ASP.NET are even open source now... And they're accepting contributions from the public.
What exactly hasn't been contributed?
Re: (Score:1)
.NET, C#, VB.NET, and F# are all free...
you mean they are given away "for free" so you can "buy" windows licenses. i'm not saying this is bad, but it is still very far from "Their work does lots of good for the world."
What exactly hasn't been contributed?
so what has been contributed? with respect to society, this could be debatable. but in the sense of r&d (which is the topic, incidentally) ms has contributed practically nothing. could you just cite one single innovation from ms? quite the contrary, they have a long record of profiting from other's contributions. they may have
Re: (Score:1)
..and Redhat gives away their OS "for free" so you can buy a support contract from them.
so? when did i state rh's work "does lots of good for the world"?
You are a stereotypical slashdot poster - always holding MS to some double-standard so that you can always paint MS as evil.
you are a stereotypical moron. i said nothing about double-standards or about ms being evil.
Re:This is what I like about Microsoft (Score:5, Funny)
"Their work does lots of good for the world."
For the world? Or for Microsoft?
Dude, seriously! You do realise this algorithm has been developed to help Microsoft sort through all of the outstanding 'serious security flaw found in IE6' tickets? Why else do you think they'd need 1033 hard drives, and 250 machines?
Re:This is what I like about Microsoft (Score:5, Informative)
Citations needed.
Here you go [microsoft.com]. About 14,000 peer reviewed publications for the computer science community, about 10,000 of which were published completely in house by Microsoft Research, and about 4,000 of which were done in collaboration with Universities.
Re: (Score:2)
The issue isn't whether they fund research - clearly they do. The GP was taking issue with "the only one" part.
Re: (Score:2)
For the world? Or for Microsoft?
Publishing research is beneficial for the world. Not just Microsoft.
Re: (Score:2)
Okay, I totally agree with that.
Re: (Score:3)
The dispute has nothing to do with Microsoft Research. It is the claim that they are the only big company that researches. You even refute this yourself with your mention of IBM. Almost everything Google does is in the name of R&D (hence the high risk/reward business model). Then you extend that outside of the tech industry and look at pharmaceutical research. Look at Monsanto R&D. Look at Boeing R&D.
The GGP was complete troll material.
Re:This is what I like about Microsoft (Score:5, Informative)
Google doesn't really innovate or do any research. The closest you get is the 20% time they give to engineers (note that not for other personnel). In fact, the only real products Google has made in-house are their search engine and gmail. Everything else (YouTube, Google Earth, Maps, Android) have been buy-outs of startups or copied, like Google+.
What about their self-driving cars? What about their glasses and stuff? They have a lot of secret research projects that they are allegedly spending billions on. Are you trolling, or am I misunderstanding you?
Re: (Score:3)
What about their self-driving cars?
They bought the talent: aka Sebastian Thrun, who worked on many successful self driving cars before being hired by Google.
Re: (Score:2)
Did you expect Sergey Brin to develop the self-driving car?
Re: (Score:3, Insightful)
Don't all research houses always 'buy' talent by recruiting qualified candidates? Isn't a university the place where people develop skills and then 'sell' themselves to employers?
Re: (Score:2)
What's the functional difference between buying a scientist and paying him to do science and buying a researcher and paying him to do research?
Google hires people who are good fits for their goals and pays them to achieve those goals. I see no problem here.
Yes, Microsoft Research harkens back to a time of the failed Xerox PARC labs, but while funding generic research is really cool from a geeky perspective, it's not so different from paying people to achieve specific goals if you're innovative enough.
Re:This is what I like about Microsoft (Score:4, Funny)
Exactly right. Functional self-driving cars aren't really innovations like a fancy coffee table [wikipedia.org] is!
People don't really need silly things like augmented reality glasses or street-level pictures of their mapped destinations - they need internally-inconsistent UIs that change at every major OS version! Thank God we have Microsoft to innovate for us!
Re: (Score:3)
Personally I wouldn't knock any research, funded for "humanity" or profit. If it advances technology, eventually it becomes accessible... plus it's kinda cool.
Re: (Score:2)
What the fuck is it with everybody saying "shill!" every second word out of their worthless mouths on Slashdot these days?
Is that some new thing, like "noob" or "fag", or do you actually believe *anyone* who *ever* says something remotely positive about Microsoft, Google, Apple, or Facebook (etc.) is actually paid by them?
Why not simply pay attention to what they're actually saying? If you find fault with it, why not refute it?
If you can't refute it, saying "shill" just makes that even more clear, and if yo
Re: (Score:2)
LOL! I don't even know what to say... It's not that I doubt that corporate entities try to game the social web and deploy sockpuppets, or "just" encourage their workers to share how much they love the products they work on, etc... but why is Google exempt as some kind of good guy? That seems more bizarro than plausible to me.
Oh, and my name actually *is* Johann Lau and I never post AC, even when I'm just foaming at the mouth while in
Re: (Score:2)
Google hasn't been the subject of obviously-fake praise on here. Microsoft is the most obvious shilling, such as Asksa here. The pro-Apple camp is pretty prevalent, though with the Reality Distortion Field still holding strong, it's more difficult to say whether they're paid shills or just annoying (but genuine) Slashdotters. I personally can't recall any blatant pro-Google or pro-Facebook posts matching the pattern (high UID, short post history, quick-posted long praising rants), but that may just mean tha
Re: (Score:1)
The one thing about Microsoft that I respect is their seriousness about R&D - MS has the highest R&D budget ($9 billion) of all the companies. And they have Turing award winners working with them (C.A.R. Hoare and Charles Thacker come to mind). I once had the honour of listening to C.A.R. Hoare at a conference where he said that the most difficult job for MS R&D people is to make the rest of the organization use what they create.
Re: (Score:2)
I don't know about troll, but certainly a brand new UID that posted to one or two other articles while waiting for this to come out of the firehose. My guess is an MSR employee that found out "hey, our stuff's going to be on /.!"
But that's neither here nor there, the post is inaccurate garbage.
Re: (Score:2)
Shooter McGavin: You're in big trouble though, pal. I eat pieces of shit like you for breakfast!
Happy Gilmore: [laughing] You eat pieces of shit for breakfast?
Shooter McGavin: [long pause] No!
- Happy Gilmore 1996
Re:Dude, it's a sort (Score:5, Interesting)
Not only is this three times as much as the previous record, but also, it uses only one sixth of the hardware resources, according to a blog post about the test from Microsoft.
The important part is not that this is a new approach, but that they beat the previous record using less hardware.
Re: (Score:3, Informative)
http://www.research.ibm.com/ [ibm.com]
They used to have one of the most amazing IT geek magazines.
Re: (Score:2)
Other companies also spend a lot on R&D, but they just don't publicize it. Do you think Apple pulled the iPhone out of a hat or something? Hell, here's a recent blog post [arcfn.com] where the author tore down Apple's power adapter for the phone and found some interesting design work. Google probably does a lot of stuff to improve their search algorithm
Re:This is what I like about Microsoft (Score:5, Insightful)
The big difference is that Microsoft Research is one of the last large corporate research labs focused on pure research. That is, research done for the sake of the research, not to drive product development. Research done at MSR doesn't have to be product driven (it has to be in the general space of software and computers, but that's about the only requirement). MSR is well funded by Microsoft and an integral part of the company's culture.
Sure, IBM, HP, and Intel all have research labs, but their charters have been re-written over the last ten years to focus more on product-centric research. Most research projects at these companies must start with a business plan that shows how the work will be commercialized within 5 years before being approved. This is not the pure research these labs were once known for.
Google, Facebook, Yahoo, and many other internet companies have some interesting projects (self driving cars, for instance), but these tend to be one-off projects and aren't part of a larger, long lived research organization.
Another interesting aspect of MSR is that they encourage all MS developers to take a stint in the organization, not just specially recruited Ph.D.s. It's not uncommon for someone to go from working on a product for a few years, take some time in MSR, then go back to product work.
I've worked directly with many of the research groups mentioned in this post over the last 20 years. Based on my experiences, MSR is truly the last real corporate research group (in the spirit of 20th century PARC/Watson/et al). The others are just part of the product funnels or whims of the founders.
-Chris
Re: (Score:3, Interesting)
First of all tons of companies fund research. Lots of papers come out of them of all kinds and plenty more that is never published.
Second of all Microsoft is actually known for being a black hole of research. Researchers go in and almost nothing comes out. They hire people just so their competitors can't hire them. They may do a few demos but nothing commercial comes from them.
Re: (Score:2)
Second of all Microsoft is actually known for being a black hole of research. Researchers go in and almost nothing comes out.
Except for all the published academic research papers with stuff like what's described in TFA?
Re: (Score:2)
All of which have patents attached to them ensuring that they never become too useful to anyone.
Re: (Score:2)
Microsoft is actually known for being a black hole of research. Researchers go in and almost nothing comes out. They hire people just so their competitors can't hire them.
Citation?
As for nothing coming out, you're apparently not including published papers [microsoft.com] (lots published by respectable bodies like IEEE, ACM, Oxford Publishing, etc.), or downloads [microsoft.com] such as Excel plug-ins to simplify working with genomic sequences, Differentially Private Network-Trace-Analysis Tools [microsoft.com], an e-mail loss detection add-in [microsoft.com], etc., etc.
Sure, not as sexy as self-driving cars. But serious, hard research usually isn't that sexy or appealing to the general public. I thought this was a geek web site...?
Re: (Score:2)
Most research never leaves Microsoft. Only the developers of the project leave. (with their research they cannot use anymore)
You know... except the 10k+ peer reviewed publications [microsoft.com] available to the CS community, nothing ever comes out of MS R&D.
First post (Score:5, Funny)
Sorted by Microsoft
doesn't resolve confusion (Score:1)
faster than ever... (Score:3)
...yet MinuteSort still takes a minute!
Re: (Score:2)
...or is it minute sort, as in tiny. Minute Maid or Minute Maid? Am I going mad, yes I've gone mad. The article is slashdotted already, and my mad mind will never know.
Other technologies... (Score:5, Funny)
The team from Microsoft Research has already been working with the Bing team to help Bing accelerate its search results, and there are plans to use it in other Microsoft technologies.
So Bing is going to scrape their search results from Google *and* other search engines? :-)
Did they do anything? (Score:2)
Re: (Score:1)
"they developed a different way of referencing the massive amount"
Different doesn't mean much. I could do something different, and that doesn't mean it is any better or any worse.
They say less hardware, and sure, maybe quantity-wise, but not by much:
microsoft - 1033 disks
yahoo - 1406
difference 373 - that alone I would say would just be from advancements made in hard drives
yahoo nodes
2x quad core xeons 8GB of assuming ddr2 ram which was current at
Re: (Score:1)
Downside (Score:2, Redundant)
Oh Look (Score:3, Insightful)
Good to see that a nerd site is inundated with droves of empty-headed group-think religious fanatics!
When you're done masturbating to your imaginary universe, maybe you'd like to sit down with the likes of Simon Peyton-Jones and discuss some of the finer points of the terrible work he and his peers have been doing.
Baa-hahahaha. Right.
Re: (Score:1)
The initial post heaps unwarranted praise on Microsoft, and that post was made using an account that is only getting its first post today and will not have any further posts. So yes, it is a shill post, and people bitch about shill posts, as they should.
Second, it is well known that yes, Microsoft spends tons of money on research; it is also well known that almost none of that research makes its way into their products.
Yes, the individuals who did this deserve praise, but no one will benefit from this resea
Re: (Score:2)
Why will it not have further posts? Are you some kind of prophet? It certainly has previous posts [slashdot.org], among them stuff like
http://slashdot.org/submission/2061021/mozilla---ms-is-blocking-browser-choice---again [slashdot.org]
http://slashdot.org/submission/2055677/why-do-we-tole [slashdot.org]
Re: (Score:1)
microsoft - 1033 disks
yahoo - 1406
difference 373 - that alone I would say would just be from advancements made in hard drives
yahoo nodes
2x quad core xeons
8GB of assuming ddr2 ram which was current at the time
1gb ethernet port on each node
40 nodes per rack
microsoft
2 - 12 cores a cluster
24GB - 96GB assuming ddr3 ram
10gb ethernet ports
78% were 10,000
Re: (Score:1)
Microsoft spends lots of money on many things like you say does this actually help anyone other than microsoft? Of course not so why would anyone praise a marketing scheme?
Re: (Score:1)
Microsoft spends lots of money on many things like you say does this actually help anyone other than microsoft?
Go to Google Scholar and search for papers published by MSR folk.
Re: (Score:3)
Pretty much all stuff from MSR ends up published, so presumably whatever is new & special here would be published as well, for others to use and build upon.
Re: (Score:2)
Yes, and patented, so that we can avoid building on it in the FOSS world and wait twenty odd years to be able to make use of this research.
Re: (Score:3)
Do me a favor and look at the previous record holder in close detail and tell me that microsoft actually did anything other than buy the record...
Microsoft actually did something other than buy the record.
I'm just disgusted at Microsoft buying the top; then everyone is like "wow, they are doing something" when in reality they are doing very little.
Sorting at that scale is fundamentally an i/o bound problem; and distributed sorting is bound by communications between the nodes. Scaling the problem up to more comput
Re: (Score:3)
Sorting on many disks does become harder but hey they had less disks than the last record holder so they actually had it easie
They sorted 3 times as much data. Hard drives didn't get 3x faster since 2009. And how many hard drives were involved is nearly irrelevant.
If they invented interconnects then they broke the rules of the competition, because it's supposed to be 100% off the shelf hardware, so it's all about the algorithm
They used off the shelf hardware to build the network, but they connected and used it
Website, PDF and excerpts (Score:2)
Website: http://sortbenchmark.org/ [sortbenchmark.org]
PDF: MinuteSort with Flat Datacenter Storage [sortbenchmark.org]
The sorts were accomplished using a heterogeneous cluster consisting of 256 computers and 1,033 disks, divided broadly into two classes: storage nodes and compute nodes. Notably, no compute node in our system uses local storage for data; we believe FDS is the first system with competitive sort performance that uses remote storage. Because files are all remote, our 1,470 GB runs actually transmitted 4.4 TB over the network in under a minute. No strong assumptions are made around key or record lengths; keys and records of other lengths can be handled with only a performance-neutral configuration change.
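For a sense of scale, here is a back-of-the-envelope calculation in Python using only the figures quoted in the excerpt. It makes several assumptions: decimal units (1 GB = 10^9 bytes), a full 60 seconds (so the rates are lower bounds, since the runs finished in under a minute), traffic averaged evenly over all 256 computers even though storage and compute nodes differ, and each byte being read from disk once and written once.

```python
GB = 10**9
TB = 10**12

data_sorted_bytes = 1470 * GB  # size of one sort run, from the excerpt
network_bytes = 4.4 * TB       # bytes sent over the network, from the excerpt
disks = 1033
machines = 256
seconds = 60                   # upper bound on run time, so rates are lower bounds

# How many times each sorted byte crossed the network, on average.
crossings = network_bytes / data_sorted_bytes

# Aggregate and per-machine network rates in gigabits per second.
aggregate_gbit_s = network_bytes * 8 / seconds / 1e9
per_machine_gbit_s = aggregate_gbit_s / machines

# Per-disk rate in MB/s, assuming one read plus one write of every byte.
per_disk_mb_s = 2 * data_sorted_bytes / disks / seconds / 1e6

print(f"network crossings per byte: {crossings:.2f}")
print(f"aggregate network rate:     {aggregate_gbit_s:.0f} Gbit/s")
print(f"average per machine:        {per_machine_gbit_s:.2f} Gbit/s")
print(f"average per disk:           {per_disk_mb_s:.1f} MB/s")
```

Under those assumptions each byte crosses the network about three times, the cluster as a whole moves a bit under 600 Gbit/s, and the per-machine average comes to a little over 2 Gbit/s.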
Summary
FDS is a general-purpose scalable parallel blob store that exploits a full-bandwidth interconnect to expose the entire cluster’s disk bandwidth to remote clients. The sort performance results in this paper demonstrate the power of the architecture: in both Daytona and Indy sorts, the system reads the data remotely to the sort machines, sorts the data across the network, and writes it remotely back to storage.
Performant remote file access imparts a flexibility absent in contemporary distributed storage systems. Beyond sort, FDS supports a broad variety of scalable large-data applications. It does so without demanding that cluster nodes balance compute and disk performance; more importantly, it does so without demanding that applications observe locality constraints.
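The read, shuffle, sort, write pattern in the summary can be illustrated with a toy single-process sketch in Python. This is a generic range-partitioned sort, not the FDS implementation: the function names, the sampling step, and the bucket count are all assumptions made for illustration. In a real cluster the partition step would be the network shuffle and each bucket would live on a different machine.

```python
import bisect
import random


def choose_splitters(sample, num_buckets):
    """Pick num_buckets - 1 key boundaries from a sample of the keys."""
    ordered = sorted(sample)
    return [ordered[len(ordered) * i // num_buckets] for i in range(1, num_buckets)]


def partition(records, splitters):
    """Route every record to the bucket whose key range contains it."""
    buckets = [[] for _ in range(len(splitters) + 1)]
    for rec in records:
        buckets[bisect.bisect_right(splitters, rec)].append(rec)
    return buckets


def partitioned_sort(records, num_buckets=4, sample_size=64, seed=0):
    """Sort by sampling keys, bucketing by key range, and sorting each bucket."""
    if not records:
        return []
    rng = random.Random(seed)
    sample = rng.sample(records, min(sample_size, len(records)))
    splitters = choose_splitters(sample, num_buckets)
    buckets = partition(records, splitters)  # stands in for the network shuffle
    result = []
    for bucket in buckets:                   # each bucket would be sorted on its own node
        result.extend(sorted(bucket))
    return result


data = [random.Random(1).randrange(10**6) for _ in range(1000)]
print(partitioned_sort(data) == sorted(data))
```

Because the buckets cover disjoint, ordered key ranges, concatenating the sorted buckets gives a fully sorted result without any final merge step.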
tract locator table (metadata) and P2P? (Score:2)
Could someone knowledgeable comment on their "tract locator table" (or TLT) metadata system and its possible relation to P2P protocols? If Bittorrent didn't focus on peer-speed as measured by reads and writes, couldn't it gain an advantage using this? TLT is expected to have consistent membership, but if it was updated once a minute (say), wouldn't that be enough to get the advantages without it taking too long to join a group?
Microsoft is not the great satan (Score:1)
Rare (Score:1)
only 10 gigabit Ethernet? Too slow (Score:2)
They used 10 GigE with a very advanced set of switches that support OpenFlow so that they could get the full bisection bandwidth. They could have used InfiniBand and probably done much better with FDR adapters capable of 56 gigabits per second. Even "old" IB adapters were faster. Most of the IB switches supported full bisection bandwidth right out of the box. MS should look at the High Performance Computing world. They need to handle large amounts of data with low latency.