Visualizing Complex Data Sets?

Posted by kdawson
from the see-it-to-believe-it dept.
markmcb writes "A year ago my company began using SAP as its ERP system, and there is still a great deal of focus on cleaning up the 'master data' that ultimately drives everything the system does. The issue we face is that the master data set is gigantic and not easy to wrap one's mind around. As powerful as SAP is, I find it does little to aid with useful visualization of data. I recently employed a custom solution using Ruby and Graphviz to help build graphs of master data flow from manual extracts, but I'm wondering what other people are doing to get similar results. Have you found good out-of-the-box solutions in things like data warehouses, or is this just one of those situations where customization has to fill a gap?"
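
For a feel of the kind of custom solution described above, here is a minimal sketch of turning a manual extract into Graphviz DOT text. It uses Python rather than the submitter's Ruby, and the CSV layout (`source`, `target`, `field` columns) and every record name in it are hypothetical, not taken from SAP or from the submitter's scripts.

```python
import csv
import io

# Hypothetical extract: each row says one master-data object feeds another.
SAMPLE_EXTRACT = """source,target,field
Material Master,Bill of Materials,material_number
Material Master,Purchasing Info Record,material_number
Vendor Master,Purchasing Info Record,vendor_number
Bill of Materials,Production Order,bom_id
"""


def quote(name):
    """Quote a string as a DOT ID, escaping embedded double quotes."""
    return '"' + name.replace('"', '\\"') + '"'


def extract_to_dot(csv_text, graph_name="master_data_flow"):
    """Convert extract rows into a DOT digraph, one labelled edge per row."""
    lines = [f"digraph {graph_name} {{", "  rankdir=LR;"]
    for row in csv.DictReader(io.StringIO(csv_text)):
        lines.append(
            f"  {quote(row['source'])} -> {quote(row['target'])} "
            f"[label={quote(row['field'])}];"
        )
    lines.append("}")
    return "\n".join(lines)


dot_text = extract_to_dot(SAMPLE_EXTRACT)
print(dot_text)
```

Saving the output as `flow.dot` and running `dot -Tsvg flow.dot -o flow.svg` should render it, assuming Graphviz is installed.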

  • by Anonymous Coward on Monday January 19, 2009 @10:53PM (#26524337)
    Portraits of complex networks [arxiv.org]

    Abstract: We propose a method for characterizing large complex networks by introducing a new matrix structure, unique for a given network, which encodes structural information; provides useful visualization, even for very large networks; and allows for rigorous statistical comparison between networks. Dynamic processes such as percolation can be visualized using animations. Applications to graph theory are discussed, as are generalizations to weighted networks, real-world network similarity testing, and applicability to the graph isomorphism problem.

  • PtolemyPlot (Score:2, Informative)

    by technofix (588726) on Monday January 19, 2009 @10:58PM (#26524387)
    PtolemyPlot and Java.
  • R language (Score:5, Informative)

    by QuietLagoon (813062) on Monday January 19, 2009 @11:03PM (#26524423)
    There was a thread about the R language [r-project.org] a couple of weeks ago. Look it up and read it....
  • Re:R language (Score:5, Informative)

    by koutbo6 (1134545) on Monday January 19, 2009 @11:11PM (#26524503)
    I second that. If you are visualizing graphs, be sure to get the igraph package, which can be used with R, Python, C, or Ruby.
    http://cneurocvs.rmki.kfki.hu/igraph/ [rmki.kfki.hu]
    Processing is another package geared towards data visualization, which Java developers might find easier to use.
    http://www.processing.org/ [processing.org]
  • by Mithrandir (3459) on Monday January 19, 2009 @11:12PM (#26524509) Homepage

    The infovis community has been dealing with these subjects for years. There are many different visualisation techniques around. Here's a list of the past conferences and the papers:

    http://conferences.computer.org/Infovis/ [computer.org]

    Plenty of good products out there, but the one that I like most is from Tableau Software (http://www.tableausoftware.com/).

  • Spotfire (Score:3, Informative)

    by DebateG (1001165) on Monday January 19, 2009 @11:24PM (#26524583)
    I work in biology, and we use Spotfire DecisionSite [tibco.com] to visualize and analyze a lot of our massive genetic data. It's a very powerful program that I barely know how to use. It seems to have packages able to analyze pretty much anything you want, and you can even write your own scripts to help things along.
  • by Shados (741919) on Monday January 19, 2009 @11:28PM (#26524607)

    Wouldn't any everyday cube browser, along with any tool to detect base dimensions in a data warehouse schema, do the trick? You may have to add a few custom dimensions on your own depending on how shitty the master data is (I don't think that can be helped, no matter the solution: if a dimension is "these two fields multiplied together times a magic number appended to the value of another table", you need to know that, and no tool will guess it), but aside from that?

    That's usually what I do anyway. I dump my data in a data warehouse, use whatever built-in wizard can auto-generate dimensions, then play with them in a cube browser. It has worked even for the pretty archaic home-made, multi-thousand-table, non-normalized ERP systems I had to work with in the past, anyhow.

  • by Anonymous Coward on Monday January 19, 2009 @11:35PM (#26524649)

    Your ERP isn't supposed to directly analyze the data. You're supposed to use a Business Intelligence software package for that. This being SAP, I believe they'll try to sell you Hyperion.

  • Re:get rich slow (Score:1, Informative)

    by Anonymous Coward on Monday January 19, 2009 @11:50PM (#26524787)

    American. But I did not intend to impugn German anything.

    SAP's "engineering"

    There, fixed that for me.

  • by morton2002 (200597) on Tuesday January 20, 2009 @01:02AM (#26525259)
    Tableau Desktop is an interactive analysis and visualization product that connects to relational and cube data sources to help people see and understand their data. There was a webinar [tableausoftware.com] (slides - PDF [tableausoftware.com]) back in November 2008 covering Blastrac Global's success in using Tableau with their ERP system.

    Disclaimer: I work at Tableau Software, so I encourage you to see for yourself with a free trial: http://www.tableausoftware.com/products/tour [tableausoftware.com]
  • Cytoscape (Score:5, Informative)

    by adamkennedy (121032) <adamk@c[ ].org ['pan' in gap]> on Tuesday January 20, 2009 @02:24AM (#26525705) Homepage

    I had a similar situation to yours recently, except I was trying to detangle a horridly complex product substitution graph for a logistics company.

    I used a bunch of Perl to crunch the raw databases into various abstract graph structures, but instead of Graphviz or something created by/for developers, I found that the best software for graph visualisation is the stuff that the genetics and bio people use.

    The standout for me was a program called Cytoscape [cytoscape.org], which can import enormous graph datasets and then gives you literally dozens of different automated layout algorithms to play with (most of which I'd never heard of, but it's easy to just go through them one at a time till something works).

    It's got lots of plugins for talking to genetics databases and such, but if you ignore all that and use Perl/Ruby/whatever for the data production part of the problem, it's a great way to visualise it.
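
The data-production step described above can be sketched in a few lines. This example uses Python rather than the commenter's Perl, and emits the Simple Interaction Format (SIF), a plain-text edge list that Cytoscape can import. The product codes and the `substituted_by` relation name are made up for illustration and are not from the original project.

```python
# Hypothetical substitution data: (product, relation, substitute).
SUBSTITUTIONS = [
    ("WIDGET-100", "substituted_by", "WIDGET-200"),
    ("WIDGET-200", "substituted_by", "WIDGET-300"),
    ("GADGET-7", "substituted_by", "WIDGET-300"),
]


def to_sif(edges):
    """Render edges as tab-delimited SIF lines: source, interaction, target."""
    lines = []
    for edge in edges:
        # Tabs or newlines inside a field would corrupt the tab-delimited file.
        if any("\t" in part or "\n" in part for part in edge):
            raise ValueError(f"field contains a tab or newline: {edge!r}")
        lines.append("\t".join(edge))
    return "\n".join(lines) + "\n"


sif_text = to_sif(SUBSTITUTIONS)
print(sif_text, end="")
```

Writing `sif_text` to a file such as `substitutions.sif` and importing it as a network gives Cytoscape the graph to lay out; the tab delimiter is used here so that node names may contain spaces.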
