
Six Degrees of Wikipedia 296

Posted by kdawson
from the finding-the-center dept.
An anonymous reader notes that someone has applied the game Six Degrees of Kevin Bacon to the articles in Wikipedia. Instead of the relation being "in the same film," he used "is linked to by." From the blog post: "We'll call the 'Kevin Bacon number' from one article to another the 'distance' between them. It's then possible to work out the 'closeness' of an article in Wikipedia as its average distance to any other article. I wanted to find the centre of Wikipedia, that is, the article that is closest to all other articles (has minimum [distance])."
This discussion has been archived. No new comments can be posted.

  • by plover (150551) * on Tuesday May 27 2008, @04:45PM (#23562791) Homepage Journal
    It's pretty obvious, and has a Bacon number of 1.0: http://en.wikipedia.org/wiki/Main_Page [wikipedia.org]
  • And now (Score:5, Funny)

    by Anonymous Coward on Tuesday May 27 2008, @04:48PM (#23562829)
    I know that Kurt Vonnegut is apparently the only link between Douglas Adams and Adolf Hitler.

    Cool stats though.
  • by OMNIpotusCOM (1230884) * on Tuesday May 27 2008, @04:53PM (#23562887) Homepage Journal
    I'd be more impressed if we could find the center of Slashdot... except that it's probably somewhere near CowboyNeal's taint. So, on second thought... maybe not.
  • by Palmyst (1065142) on Tuesday May 27 2008, @04:54PM (#23562913)
    Ignoring obvious stuff like the main page, index, etc., is it not possible that there could be two articles that are not in the same transitive closure at all?
  • Where All... (Score:4, Interesting)

    by TheLazySci-FiAuthor (1089561) <thelazyscifiauthor@gmail.com> on Tuesday May 27 2008, @04:54PM (#23562919) Homepage Journal
    It's sometimes eerie to think of an idea and then see that someone has done it over the weekend and posted it on slashdot.

    Last Friday at work I was researching different chemicals on Wikipedia (a favorite pastime of mine) and thought it would be pretty neat if there was a way to find how related two articles were - or to have some way to query the links between two articles to find similarities.

    What I really wanted was a very simple query. My SQL is very rusty, so a plain English version might be, 'show links where link exists in article_a and article_b'

    Is there a way to execute SQL queries on Wikipedia without having to actually download the entire database? I asked Google, but was presented with the SQL page on Wikipedia....
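
A minimal sketch of the plain-English query above, written against a local SQLite table rather than Wikipedia itself. The `pagelinks(src, dst)` table, the `shared_links` helper, and the sample rows are all made up for illustration; this is not the real MediaWiki schema, and it does not answer the question of querying without a download.

```python
import sqlite3

# Hypothetical schema: one row per link, pagelinks(src, dst).
# The rows below are invented sample data, not a real Wikipedia dump.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pagelinks (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO pagelinks VALUES (?, ?)",
    [
        ("Benzene", "Carbon"),
        ("Benzene", "Hydrogen"),
        ("Benzene", "Michael Faraday"),
        ("Methane", "Carbon"),
        ("Methane", "Hydrogen"),
        ("Methane", "Natural gas"),
    ],
)


def shared_links(conn, article_a, article_b):
    """Return the link targets that appear in both articles, sorted."""
    rows = conn.execute(
        """
        SELECT DISTINCT a.dst
        FROM pagelinks AS a
        JOIN pagelinks AS b ON a.dst = b.dst
        WHERE a.src = ? AND b.src = ?
        ORDER BY a.dst
        """,
        (article_a, article_b),
    ).fetchall()
    return [dst for (dst,) in rows]


print(shared_links(conn, "Benzene", "Methane"))
```

The self-join keeps only targets that both articles link to, which is the "link exists in article_a and article_b" condition.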

  • by Charles Dodgeson (248492) * <jeffrey@goldmark.org> on Tuesday May 27 2008, @04:55PM (#23562925) Homepage Journal
    This is News for Nerds. Surely the analogy should be to Erdos numbers [oakland.edu], not Kevin Bacon.
    • by mrbluze (1034940) on Tuesday May 27 2008, @05:12PM (#23563143) Journal

      Surely the analogy should be to Erdos numbers [oakland.edu], not Kevin Bacon. --
      Erdos numbers just don't have the same crackling sound to them.
    • I think this is more interesting than either Erdos number or Kevin Bacon number - those are both social network proximities. This is about the proximity of general information. And IMHO it's somewhat believable - if I, John Doe, linked to an article about myself from the article on the UK, it would be removed very quickly. I do wonder if the results are very different than if you did latent semantic analysis on a big corpus of text from more varied sources though. I think links would be somewhat more ec
    • Re: (Score:3, Informative)

      by grizdog (1224414)
      Also, I'm sure Erdos has priority. I remember people talking about Erdos numbers in the early 80s. I don't think Bacon number goes back before 1990.
  • Legacy of the colonial era, no doubt.
  • Almost every network exhibits small world phenomena. Neural networks, human networks, the WWW, etc. EVERY actor is connected to every other actor by 6 or fewer degrees, not just Kevin Bacon. And every human is connected to every other human by 6 or fewer degrees. And while understanding small world phenomena helps us understand human networks, I fail to see how this is even slightly interesting applied to Wikipedia.
    • by timeOday (582209) on Tuesday May 27 2008, @05:15PM (#23563163)
      Small world phenomena in general aren't very interesting, but the specific results are. Your comment is like having an election and saying "big deal, I knew somebody would win!"
      • I disagree. To me, the opposite is true. I work in the intelligence community, and small world phenomena mean a great deal to me. When looking at terrorist networks, for instance, it comes as no surprise to me that every low-level member is only a few links away from the leader, though it never fails to amaze most of my coworkers. Knowing centrality or closeness out to three decimal places usually doesn't mean a whole lot, but knowing about small world phenomena allows a greater understanding of how th
    • Did you bother to read the notes on how he parsed the data, or how he used group theory, or how he used distributed computing? While not particularly newsworthy in their own right, I thought those were pretty interesting even if the result should be expected. Also the little shortest path search is pretty fun to play with.
    • Well, that depends. (Score:3, Interesting)

      by jd (1658)
      The six degrees of separation is an easily-misunderstood concept, making it important that what it is people are looking for is also what people think they are looking for.

      The next thing to consider is that Wikipedia is produced by self-selecting contributors who are (necessarily) selective as to what facts (and what references) are to be used, making this a definitely non-random sample using incomplete data out of a population that may have unexpected biases.

      What matters, then, is that even under heavil

  • Link distance (Score:5, Interesting)

    by ninjapiratemonkey (968710) on Tuesday May 27 2008, @05:01PM (#23563021)
    The distance going from Article A to Article B is not necessarily the same as from Article B to Article A. For example, the Slashdot [wikipedia.org] page links to the HTTP [wikipedia.org] page, but not vice versa. It would be interesting to know if he took that into consideration when counting links, or whether he would have counted it as one in either direction.
    • Re:Link distance (Score:5, Informative)

      by stedo (855834) on Tuesday May 27 2008, @05:27PM (#23563311) Homepage
      I just took it as distance outwards. The "center" I came up with is the article from which it is easiest to get to all others.
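
A minimal sketch of "distance outwards" and the resulting centre, assuming a plain breadth-first search over a directed link graph. The `links` graph and the function names are made up for illustration, and averaging only over reachable articles is an assumption; the post does not say how unreachable articles were handled.

```python
from collections import deque

# Toy directed link graph (invented, not real Wikipedia data).
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["D"],
    "D": ["A"],
}


def distances_from(graph, start):
    """Breadth-first search: clicks needed to reach each article from start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in graph.get(page, []):
            if target not in dist:
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist


def closeness(graph, start):
    """Average outward distance from start to every other reachable article."""
    dist = distances_from(graph, start)
    others = [d for page, d in dist.items() if page != start]
    return sum(others) / len(others) if others else float("inf")


# The "centre" is the article with the smallest average outward distance.
centre = min(links, key=lambda page: closeness(links, page))

print(distances_from(links, "A")["B"])  # clicks from A to B
print(distances_from(links, "B")["A"])  # clicks from B to A (differs)
print(centre)
```

Because links are followed only in their own direction, the distance from A to B and from B to A come out different in this toy graph, which is the asymmetry raised in the parent comment.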
    • Re:Link distance (Score:4, Interesting)

      by jd (1658) <imipak AT yahoo DOT com> on Tuesday May 27 2008, @05:34PM (#23563405) Homepage Journal
      In mathematical terms, this makes Wikipedia a non-simply-connected space. This has two consequences. Firstly, it makes the topology much harder to describe. Secondly, it means that topologists should have enough research material to write books and papers on the dynamics of Wikispace for years to come.
      • In mathematical terms, this makes Wikipedia a non-simply-connected space.
        No, it doesn't. In a simply-connected space, any path between two points is fundamentally the same. Non-simply-connected refers to a shape like a torus, in which getting somewhere by going clockwise and by going counter-clockwise are fundamentally different.

        What it does mean, however, is that "Wikipedia distance" is not a metric.
    • I think there is a simple to prove theorem:

      If the distance from article A to B is n, then the distance from B to A is at most 2n.

      Proof:
      Because every page has a "What links here" page we can walk backwards in twice as many steps. There could be a shorter path which is why the bound is not strict.
    • Re: (Score:3, Funny)

      by ArsonSmith (13997)
      I found this one interesting:

      Shortest path from george bush to satan

      George Bush
      George H. W. Bush
      Andover, Massachusetts
      Satan

      3 clicks needed
  • by LionKimbro (200000) on Tuesday May 27 2008, @05:02PM (#23563023) Homepage
    ...the sun never sets, on the British Empire.
  • I thought of this years ago! I've got blog posts as prior art! SOMEBODY GET ME A MARSHALL TEXAS JUDGE ON THE LINE!
  • by smellsofbikes (890263) on Tuesday May 27 2008, @05:03PM (#23563045) Journal
    In case anyone is interested, the original research that created the idea of 'six degrees of separation' is summarized and analyzed by Malcolm Gladwell in his essay Six Degrees Of Lois Weisberg [gladwell.com]. The original research was done by Stanley Milgram (of greater fame for the (in)famous Milgram Experiment [wikipedia.org] in which people were led to believe that they were shocking other people to death, but continued to do so anyway because they were Just Following Orders.) Milgram's six-degrees research, to sum up, involved handing out a large number of letters to random people, and asking them to give the letters to other people they knew who they thought would be most likely to know a (given, random, unknown-to-everyone-involved) person, and then tracking how those letters actually moved through society to their intended recipients.
    The result was a map that showed large groups of closely-connected people, linked by small numbers of people who were linked into many, disparate, closely-linked groups. These people are unusual and their behavior is unusually influential on others, precisely because they serve to transfer information from homogenous groups to other homogenous groups.
    It's not that people, or wikipedia articles, are all evenly linked by an average of six links that's important. The idea of 'six degrees of separation' is precisely about the nodes which interlink groups of nodes to each other.
  • by certron (57841) on Tuesday May 27 2008, @05:04PM (#23563049)
    While the results are interesting (I won't spoil it by posting the answers, although I'm sure someone else has already cut to the chase and done it), the way they arrived at their results is more interesting. I'm sure this could be extended to some pretty maps of what links where, or deep/shallow topics in different fields. I had tried to find the number of links between Kevin Bacon and Nuclear Physics, but it didn't like my input. Instead, I discovered that it takes 3 clicks to go from Bacon to Physics, passing through Columbia University and BDSM on the way.

    Off-topic, but this is as good a place as any: There was a project hosted on some academic server a few years ago that linked song lyrics together. Clicking on the lyric 'creep' in the lyrics of the Radiohead song of the same title would bring up links to the TLC and Stone Temple Pilots songs of the same title, as well as any other song that used that word in their lyrics. Two songs that shared certain words would be linked by at most 2 clicks. I'm sure it has been buried in Google-cruft in the years since someone figured out that lyrics pages could be slurped up and turned into banner ad farms, but I had been thinking about how this could be re-implemented using a Wiki that would turn every word into a link and then link to a 'what links here' page. Does anyone know where this original project is or what happened to it? Any hints on re-implementing the behavior with a wiki?
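
One possible way to re-implement the "every word is a link" behavior is an inverted index from words to song titles, which plays the role of the 'what links here' page. This is only a sketch under that assumption: the song texts are placeholders invented here (not real lyrics), and `build_index` and `what_links_here` are hypothetical names, not part of any wiki software.

```python
import re
from collections import defaultdict

# Placeholder song texts, invented for illustration (not real lyrics).
songs = {
    "Song One": "the river runs cold tonight",
    "Song Two": "cold wind over the hills",
    "Song Three": "sunrise on empty streets",
}


def build_index(songs):
    """Map each lowercase word to the set of song titles containing it."""
    index = defaultdict(set)
    for title, text in songs.items():
        for word in re.findall(r"[a-z']+", text.lower()):
            index[word].add(title)
    return index


def what_links_here(index, word):
    """Titles of all songs containing the word, sorted for stable output."""
    return sorted(index.get(word.lower(), set()))


index = build_index(songs)
print(what_links_here(index, "cold"))
```

With this structure, two songs that share a word are two clicks apart: song to word page, word page to the other song.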
    • by certron (57841)
      Incidentally, going from Physics to Bacon is also 3 clicks, passing through 1968 and Lisa Loeb in the process.

      I remember when searching for certain terms on the Internet could bring you from one article to another and suddenly you'd gone from Seaquest DSV to magnetic monopoles to a South African shaman who talks with UFO aliens and suggests curing diseases with sonic frequencies. Now you can do all that solely through Wikipedia.

      Obligatory XKCD: http://xkcd.com/214/ [xkcd.com]
  • the idea is to find redundant connections between sir francis bacon and kevin bacon: socially, in film, genetically, and via wikipedia links

    this sort of alternate connection generation is known as a double bacon whopper with cheese
  • by snuf23 (182335) on Tuesday May 27 2008, @05:05PM (#23563071)
    Our personal favorite for Wikipedia is "Six degrees of anal sex". You'd be amazed how few steps it takes to go from Rush Limbaugh to butt piracy.
  • by Escogido (884359) on Tuesday May 27 2008, @05:06PM (#23563081)
    Shortest path from Microsoft to Evil

    Microsoft
    ASCII
    2 (number)
    Evil

    3 clicks needed

    Too bored to make a good pun out of this so please someone else do.
  • I'm pretty sure I'm not the only one who wants to know about the most remote articles, or who even wants to see distribution graphs, am I? The article is a teaser, not completely satisfactory. :-(

  • They once asked him, why would he do that (as that was dangerous and quite pointless from any practical point of view because there's nothing on top of Mt. Everest and the guy wasn't even a scientist). His answer? "Because it's there."

    The more I think about it, the more I come to the conclusion that it's what defines a true geek - doing things even if they're pointless by themselves, for the sake of doing them and proving that this or that completely crazy idea is actually doable. And, of course, because th
  • I haven't been able to get two words to be more than 4 links apart so far... can anyone come up with words that can beat 4 degrees?
    • by stedo (855834) on Tuesday May 27 2008, @06:22PM (#23564093) Homepage
      Unfortunately, yes. The original project was to find the diameter of wikipedia, i.e. the biggest such number of links. That approach was abandoned when I found giant "tails" in wikipedia, almost linear linked lists of articles that stretch out for 70 links. The worst offenders were the subpages of List of named asteroids as each is only linked from the previous one, and it takes about 70 links to get from anywhere to the last one.

      Stephen Dolan, aka mu
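
A rough sketch of why such a "tail" dominates the diameter, assuming the diameter is taken as the largest shortest-path distance over reachable pairs. The graph below is invented: a small cluster plus a chain of 70 list pages, each linked only from the previous one. The page names and helper functions are hypothetical.

```python
from collections import deque


def distances_from(graph, start):
    """Breadth-first search: clicks needed to reach each page from start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in graph.get(page, []):
            if target not in dist:
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist


def eccentricity(graph, start):
    """Largest distance from start to any page it can reach."""
    return max(distances_from(graph, start).values())


def diameter(graph):
    """Largest eccentricity over all pages (reachable pairs only)."""
    return max(eccentricity(graph, page) for page in graph)


# Invented graph: a tight cluster plus a linear tail of list pages.
graph = {
    "Hub": ["X", "Y", "List 1"],
    "X": ["Hub", "Y"],
    "Y": ["Hub", "X"],
}
tail_length = 70
for i in range(1, tail_length):
    graph[f"List {i}"] = [f"List {i + 1}"]
graph[f"List {tail_length}"] = []

print(eccentricity(graph, "Hub"))
print(diameter(graph))
```

Everything in the cluster is one click apart, yet the single chain sets the maximum, so the diameter says more about the longest tail than about the graph as a whole.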
  • by STrinity (723872) on Tuesday May 27 2008, @05:20PM (#23563225) Homepage
    [[There]] are [[some]] [[Cmdr Taco|idiots]] who [[bracket]] [[every]] other [[word]].
  • This looks like a job for Google App Engine [google.com]!
  • A variant game to do with the connected nature of Wikipedia involves a group of people choosing a start page and an end page and seeing who can get there in the fewest hops. Posting the route allows for interesting analysis of the logic players used to try to get places. The "find shortest path" in the article would kill that tho :P
  • FTA:

    If you skip past all of the articles that are just lists, years or days of the year, the "real article" closest to the centre is: United Kingdom at an average of 3.67 clicks to anywhere else. Following it are Billie Jean King and United States (in that order, strangely) with averages of 3.68 and 3.69 clicks respectively.

    A quick look at her article [wikipedia.org] - along with keeping in mind the previous results of year-and-date type pages being ranked very highly - it seems that her main advantage is that her article

    • Re: (Score:2, Informative)

      by stedo (855834)
      Yeah, that kind of thing does bias the results a bit. If you go to the bother of downloading the full results (I think the server may be a bit slashdotted atm, so don't do this immediately), then it turns out that a lot of music groups' tours place unusually highly because they have a lot of sentences like "In [[2007]], they toured the [[United Kingdom]]".

      Stephen Dolan, aka mu
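
To illustrate how a sentence like that turns into graph edges, here is a simplified sketch that pulls link targets out of wikitext with a regular expression. It is an assumption for illustration, not the parser used in the project: it handles only `[[Target]]`, `[[Target|label]]` and `[[Target#Section]]`, and ignores templates, namespaces, and redirects. `WIKILINK` and `link_targets` are made-up names.

```python
import re

# Matches [[Target]], [[Target|label]] and [[Target#Section|label]];
# only the target part before any '#' or '|' is captured.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def link_targets(wikitext):
    """Return the article titles linked from a piece of wikitext, in order."""
    return [target.strip() for target in WIKILINK.findall(wikitext)]


sentence = "In [[2007]], they toured the [[United Kingdom]]."
print(link_targets(sentence))
```

A tour article made of many such sentences picks up direct links to year pages and country pages, which are themselves linked to a large share of the encyclopedia.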
  • I wrote something kind of similar as a proof of concept (in common lisp) a little while back: http://icarus.maneks.net:4242/ [maneks.net]

    There's a few technical details at http://icarus.maneks.net:4242/static/readme.txt [maneks.net]

    I've been meaning to clean it up and release the source (maybe a screencast intro to Lisp?) for a while now. The main problem with mine is that the DB server and Web server are far apart, so it takes forever to get any data
  • If you want to look something up on IMDB, it can be fun to see if you can use the links on the front page to reach the article you want within seven degrees. I normally count actors but not movies as a degree, but you could try per-click.
  • I prefer the Disneyporn game, where you go to Disney.com and see how many left clicks it takes to reach porn.

    Closest I found (a few years ago) was from Disney to ABC, to ABC Sports, to HP (server provider), to Yahoo index, to massage providers, then a few ad links.

    Only fair to ensure your PC is free of extra popup software first.
  • What about language? (Score:5, Interesting)

    by kylehase (982334) on Tuesday May 27 2008, @05:38PM (#23563473)
    The 6 degrees theory claims that everyone in the world is connected. That means you'd have to include every Wikipedia page in other languages as well, not just English.

    I tested some random Japanese Wikipages and the test failed. I then tried some very common English pages and those failed as well ("Unknown article..."). So I think their server might be having the /. effect.

    In any case it doesn't look like they included other languages in their setup.
    • Re: (Score:3, Insightful)

      by stedo (855834)
      No, it's just the English Wikipedia. There aren't that many links between the English and Japanese Wikipedias anyway, so it wouldn't make much difference. I might do it again later with other Wikipedias.

      Stephen Dolan (aka mu)
  • by joelpt (21056) <slashdot@@@joelpt...net> on Tuesday May 27 2008, @05:43PM (#23563541)
    Shortest path from disney to fuck

    The Walt Disney Company
    Motion Picture Association of America film rating system
    Fuck

    2 clicks needed
  • by $random_var (919061) on Tuesday May 27 2008, @06:03PM (#23563823)
    The paths it generates from Article A to Article B would be more interesting if they excluded list pages... so far, most of the interesting searches I've tried have been short-circuited by some kind of date page.
  • by xPsi (851544) * on Tuesday May 27 2008, @06:07PM (#23563879)
    Wikipedia articles actually linking to Kevin Bacon should be made "time-like" and given a negative sign in the metric tensor when calculating article "distances" in this exercise.


    No, I don't know why I'm advocating this.

  • There's so much promotional material for bands on Wikipedia that this must lower the number of steps between pages.

    Basically every other page at least has some sort of "band X wrote a song about .... ". And then every band page has further spam in the form of "band Y covered the song of Band X on their ABC album".

    Is this a good time to remind everyone that the Music Industry is the Original Evil raised to the power of evil -- and yet, something that's supposed to be neutral (and I guess I emphasi
  • by foxtrot (14140) on Tuesday May 27 2008, @06:33PM (#23564235)
    It must've stuck in this guy's craw a little, given that he's at Trinity College in Dublin, Ireland, to find out that the Center of the Known Wikiverse is the United Kingdom...
  • In an ironic twist, most articles on Wikipedia are also within 6 clicks of Kevin Bacon's article!

    Enjoy a nice game of WikiBacon [blogspot.com]!

    Link is NOT to my blog
  • Shortest path from Zeuxis and Parrhasius to Jeruzal
    No path found

    I beat the Man

  • by guruevi (827432) <evi@NOspam.smokingcube.be> on Tuesday May 27 2008, @09:50PM (#23566011) Homepage
    Scary:

    From Slashdot to Girl, 3 clicks
    From Slashdot to Sex, 2 clicks
    From Slashdot to Microsoft, 1 click

    Interesting, from Slashdot to your basement (4 clicks), you actually go through Apple, Inc.
  • by Titoxd (1116095) on Wednesday May 28 2008, @01:54AM (#23567429) Homepage
    This article talks about a tool that was first available to Wikipedians in 2004 [wikipedia.org]. Heck, there's an entire page to try to find long chains at Wikipedia:Six degrees of Wikipedia [wikipedia.org], and it even mentions a chain of seven articles...
  • by ConfusedVorlon (657247) on Wednesday May 28 2008, @03:43AM (#23567895) Homepage
    Good to see confirmation of what we Brits already knew.

    The UK is the centre of the known universe.
