Researcher's Wikipedia Big Data Project Shows Globalization Rate

Posted by Soulskill
from the abstracted-webs-of-connectedness dept.
Nerval's Lobster writes "Wikipedia, which features nearly 4 million articles in English alone, is widely considered a godsend for high school students on a tight paper deadline. But for University of Illinois researcher Kalev Leetaru, Wikipedia's volumes of crowd-sourced articles are also an enormous dataset, one he mined for insights into the history of globalization. He made use of Wikipedia's 37GB of English-language data — in particular, the evolving connections between various locations across the globe over a period of years. 'I put every coordinate on a map with a date stamp,' Leetaru told The New York Times. 'It gave me a map of how the world is connected.' You can view the time lapse/data visualization on YouTube."
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward on Friday June 15, 2012 @06:12PM (#40340033)

    Just like stars. If you consult a star map, it's much denser near Earth than farther away. So looking at a star catalogue, we'd be correct to surmise we're the center of the universe, since all the stars cluster around us, right? Wrong.

    Sampling bias. Star maps cluster stars around us because the stars in our vicinity are better sampled than those farther away.

    The movie looks exponential because the density of articles dealing with the present is higher than the density of articles dealing with events long past. That's not surprising: in 1900 nobody was editing Wikipedia, so all the entries that did make it there came from secondary sources rather than being edited in from primary sources.
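
The sampling-bias argument above can be illustrated with a toy simulation. This is only a sketch under stated assumptions: stars are spread uniformly along a single distance axis (not real 3D space), and the chance of a star being catalogued is assumed to fall off linearly with distance. The function name `simulate_catalog`, its parameters, and the detection model are all hypothetical and chosen purely for illustration.

```python
import random


def simulate_catalog(n_stars=100_000, max_dist=100.0, seed=0, n_bins=10):
    """Return (true_counts, seen_counts) per distance bin.

    Assumptions (illustrative only):
      - stars are uniformly distributed over distance 0..max_dist
      - the probability of a star being catalogued drops linearly
        from 1 at distance 0 to 0 at max_dist
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    true_counts = [0] * n_bins
    seen_counts = [0] * n_bins
    bin_width = max_dist / n_bins

    for _ in range(n_stars):
        d = rng.uniform(0.0, max_dist)
        b = min(int(d / bin_width), n_bins - 1)  # clamp the upper edge
        true_counts[b] += 1

        # Hypothetical detection model: nearby stars are easier to record.
        p_detect = 1.0 - d / max_dist
        if rng.random() < p_detect:
            seen_counts[b] += 1

    return true_counts, seen_counts


if __name__ == "__main__":
    true_counts, seen_counts = simulate_catalog()
    for i, (t, s) in enumerate(zip(true_counts, seen_counts)):
        print(f"bin {i}: actual={t:6d}  catalogued={s:6d}")
```

The actual counts come out roughly flat across bins, while the catalogued counts pile up in the nearest bins. The same effect would make a map built from Wikipedia entries look denser for recent years even if the underlying activity were evenly spread.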
