The Power of the R Programming Language

Posted by samzenpus
from the much-better-than-Q dept.
BartlebyScrivener writes "The New York Times has an article on the R programming language. The Times describes it as: 'a popular programming language used by a growing number of data analysts inside corporations and academia. It is becoming their lingua franca partly because data mining has entered a golden age, whether being used to set ad prices, find new drugs more quickly or fine-tune financial models. Companies as diverse as Google, Pfizer, Merck, Bank of America, the InterContinental Hotels Group and Shell use it.'"

  • by Anonymous Coward on Wednesday January 07, 2009 @08:36PM (#26366243)

    ... most others keep thinking that M$ Excel is the silver bullet.

    Sad, but f****** true.

  • popular? no (Score:5, Insightful)

    by geekoid (135745) <dadinportland@ya ... m minus math_god> on Wednesday January 07, 2009 @08:39PM (#26366283) Homepage Journal

    Growing in use? sure.

  • by Samschnooks (1415697) on Wednesday January 07, 2009 @08:44PM (#26366339)

    ... most others keep thinking that M$ Excel is the silver bullet.

    The folks I know who use Excel for analysis use it because it's the package that everyone gets in their organization, there's a shitload of material on the web that uses Excel, there are plenty of add-ons for it (no need to reinvent the wheel), and when sharing data and analysis, everyone is familiar with it. An engineer I know who uses Excel chose it because it was the fastest way to connect to his testing equipment. R is relatively new, and as more folks who know it come into the workforce, we'll see it replace Excel for the functions it is better suited to.

  • by bogaboga (793279) on Wednesday January 07, 2009 @08:47PM (#26366369)

    My request to those in the know: show me some example code that does something useful. Then compare that code with code in other languages that accomplishes the same task.

    Include reasons to support the notion that the R language is [necessarily] better at what it does.

  • by transonic_shock (1024205) on Wednesday January 07, 2009 @08:58PM (#26366501) Homepage

    FTA
    "I think it addresses a niche market for high-end data analysts that want free, readily available code," said Anne H. Milley, director of technology product marketing at SAS. She adds, "We have customers who build engines for aircraft. I am happy they are not using freeware when I get on a jet."

    Seriously, does this person know what she is talking about?

    1. Yes, CFD and Structural Analysis software is increasingly written using open source tools and run on open source OS (Linux running on clusters)

    2. SAS is not used to design any part of the aircraft.

    I have noticed SAS uses the same kind of FUD to counter R as M$ uses to counter Linux.

  • Free as in beer (Score:3, Insightful)

    by visible.frylock (965768) on Wednesday January 07, 2009 @09:00PM (#26366521) Homepage Journal

    "R is a real demonstration of the power of collaboration, and I don't think you could construct something like this any other way," Mr. Ihaka said. "We could have chosen to be commercial, and we would have sold five copies of the software."

    Very true. This is what I try to explain to people when they can't understand why some software is given away gratis. Because if they charged for it, given the current attitudes of the market, they wouldn't stand a chance and wouldn't ever get any market share to begin with.

  • by visible.frylock (965768) on Wednesday January 07, 2009 @09:04PM (#26366555) Homepage Journal

    Seriously, does this person know what she is talking about?

    Let's see, Director of technology product marketing. I'm gonna go with a big NO.

  • FUD from SAS (Score:4, Insightful)

    by idiot900 (166952) * on Wednesday January 07, 2009 @09:13PM (#26366647)

    "I think it addresses a niche market for high-end data analysts that want free, readily available code," said Anne H. Milley, director of technology product marketing at SAS. She adds, "We have customers who build engines for aircraft. I am happy they are not using freeware when I get on a jet."

    Wow...talk about FUD. Does SAS indemnify against plane crashes?

  • by Anonymous Coward on Wednesday January 07, 2009 @09:23PM (#26366759)

    Calling R a programming language is like calling Mathematica or Matlab a language. R is a system for statistical tasks that has a language and syntax, but it is not capable of producing stand-alone executables that do not require the entire R environment.

    So, you're saying java, js, python, perl, and ruby aren't programming languages?

  • by Hobbes_2100 (171980) on Wednesday January 07, 2009 @09:59PM (#26367059)

    Are you kidding me? Are you really *(*$@#ing, Grade A kidding me?

    Python/Perl/Ruby require interpreters. Scheme and Lisp are frequently run within interpreters. "Stand-alone executables" require HARDWARE. Any programming system requires *something* underneath it unless you are programming in a purely physical system like an automated abacus with mechanical gears that buzz and whirr.

    Programming languages are defined by their Turing completeness: can they do things repeatedly, can they assign values to memory locations and perform some basic set of operations (nand works nicely), can they make decisions. Everything else is fluff.

    Perl has "fluff" that handles regular expressions very well.

    Python (and others) have "fluff" that makes networking and database ops easy.

    R has "fluff" that makes it terribly convenient to work with data.

    Matlab has "fluff" that makes it very easy to do numerical methods programming.

    Mathematica has "fluff" that makes it very easy to do symbolic computation.

    Each and every one of these, and most well-known languages, with all their warts and beauty marks, are Turing complete and deserve the term "programming language".

    Regards,
    Mark

  • by stephentyrone (664894) on Wednesday January 07, 2009 @10:16PM (#26367173)

    I have no idea how I would start to code that in C, python, etc. in a way that's remotely efficient ;)

    How about:

    #include <clapack.h>
    dgesdd( argument list );

    This sort of thing is a feature of libraries, not an inherent advantage of one language.
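A hedged illustration of the library point: R's built-in `svd()` is documented as using the same LAPACK routine family (DGESDD for real matrices), so the one-line convenience comes from the bundled library rather than from the language itself. The matrix here is made-up data, and the parent post's actual task is not shown in this thread, so treat this as a sketch only.

```r
# Made-up 2 x 3 matrix, purely for illustration.
m <- matrix(c( 3, 1, 1,
              -1, 3, 1), nrow = 2, byrow = TRUE)

# Singular value decomposition: m = u %*% diag(d) %*% t(v)
s <- svd(m)

# Rebuild the matrix from its factors to check the decomposition.
reconstructed <- s$u %*% diag(s$d) %*% t(s$v)
print(all.equal(reconstructed, m))
```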

  • by Hatta (162192) on Wednesday January 07, 2009 @10:21PM (#26367209) Journal

    Do analysts who use R get better returns than those who use Excel?

  • by zippthorne (748122) on Wednesday January 07, 2009 @10:33PM (#26367315) Journal

    But we already have a language that does vectors correctly. It's called Matlab and it's based on Fortran, which I guess technically also does vectors correctly, if you want to bother to learn it.

  • by slashdotmsiriv (922939) on Wednesday January 07, 2009 @10:36PM (#26367347)

    Your comment is absolutely wrong.
    http://en.wikipedia.org/wiki/Programming_language [wikipedia.org]

    R is a Turing complete programming language. The fact that it requires an interpreter is completely irrelevant.

  • by Daniel Dvorkin (106857) * on Wednesday January 07, 2009 @10:37PM (#26367359) Homepage Journal

    One big advantage R has over Matlab (er, besides the fact that R is OSS, but of course there's Octave for those who want an OSS Matlab alternative) is that R handles non-matrix data structures much, much better than Matlab does. Trying to work with anything that isn't a vector or a matrix in Matlab is an exercise in pain.

  • by TapeCutter (624760) on Wednesday January 07, 2009 @10:45PM (#26367417) Journal
    'R' is not a general programming language, but that hardly means it's not a language. Producing a stand-alone executable is not a feature of any language; it's a feature of the tool set.
  • by SanityInAnarchy (655584) <ninja@slaphack.com> on Wednesday January 07, 2009 @11:02PM (#26367577) Journal

    I would argue that GP is confusing "programming language" with "general-purpose programming language".

    I bet even SQL is Turing-complete, but I wouldn't want to do more than database operations with it.

  • by gringer (252588) on Wednesday January 07, 2009 @11:10PM (#26367639)

    Okay, I'll take you up on that... here's some code that takes in a vector of genotypes (as a factor with levels AA,AC,CC,XX), and a matrix of columns to be used for different bootstraps, and spits out a list of genotype counts for those bootstraps:


    ## matmap -- maps a vector onto a matrix of indexes to the vector
    ## (a hack to get round something that R doesn't seem to do by default)
    matmap <- function(vector.in, matrix.indices){
        res <- vector.in[matrix.indices];
        if(is.null(dim(matrix.indices))){
            dim(res) <- c(length(matrix.indices),1);
        } else {
            dim(res) <- dim(matrix.indices);
        }
        return(res);
    }
    ## generate table based on genotype frequencies
    GTcounts <- function(in.genotypes, columns.pop){
        gt.table <- apply(matmap(in.genotypes,columns.pop),2,tabulate, nbins = 4);
        rownames(gt.table) <- levels(in.genotypes);
        return(gt.table);
    }

    Out of the [imperative] languages I know, only Octave/Matlab have a chance of doing better than that in terms of lines of code. And when you're writing code, being able to avoid duplication and mindless for loops is a really useful feature.
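A minimal usage sketch of the same index-then-tabulate idea, with made-up genotypes and bootstrap indices. The names `genotypes`, `boot.idx`, `codes`, and `counts` are illustrative and not from the post. It inlines the two steps rather than calling `matmap` and `GTcounts`, and it works on the factor's integer codes, which is an assumption about what is acceptable for the original use case.

```r
set.seed(42)  # reproducible resampling

# Made-up genotype calls, using the four levels named in the post.
genotypes <- factor(c("AA", "AC", "CC", "XX", "AC", "AA", "CC", "AC"),
                    levels = c("AA", "AC", "CC", "XX"))
n <- length(genotypes)

# Each column holds the indices of one bootstrap resample.
boot.idx <- matrix(sample(n, n * 3, replace = TRUE), ncol = 3)

# Map indices to genotype codes, keeping the matrix shape.
codes <- matrix(as.integer(genotypes)[boot.idx], nrow = nrow(boot.idx))

# One column of genotype counts per bootstrap, with no explicit loop.
counts <- apply(codes, 2, tabulate, nbins = nlevels(genotypes))
rownames(counts) <- levels(genotypes)
print(counts)
```

Each column of `counts` sums to the number of samples, which is a quick sanity check on any resampling scheme of this kind.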

  • by Anonymous Coward on Wednesday January 07, 2009 @11:46PM (#26367897)

    That sounds extremely weird: if a program has a stack, then it has a state - the location on the stack is still state. Thus, if you use recursion, you still have state. I mean, you can try to hide the fact that you have state, but I don't see how you can have a program without state.

    Even the wizard book appears to have a chapter on state: http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-19.html#%25_chap_3 [mit.edu] , but, unlike your description, instead of talking about a program without state, it considers two kinds of state: the state of objects, or the state of streams of data.

    Do you happen to have a link to what you mean by "a program should not have state"? Because, I mean, that seems antithetic to the nature of a program.

  • by Peaquod (1200623) on Wednesday January 07, 2009 @11:49PM (#26367921)
    Yeah that's a poorly informed comment. C is a freeware language. And it is used in virtually every embedded system on earth... like the control system for the laser that cuts your cornea at the neighborhood lasik shop. No doubt R is staggeringly less mature than C, but the fact that it is free has no bearing on its quality.
  • by Kyle3om (1421333) on Thursday January 08, 2009 @12:08AM (#26368061)
    The flowchart programming of LabVIEW is a pain in the butt for many looped programs and programs with complicated timings. Matlab is easier for most things (and more powerful) if you can get your external equipment to work with it without jumping through hoops.
  • Nothing really (Score:4, Insightful)

    by ArchieBunker (132337) on Thursday January 08, 2009 @12:33AM (#26368225) Homepage

    Labview is well designed for its intent, so someone with minimal programming skills can sit down and get something done in a short amount of time. Would I use it for crunching numbers or collecting terabytes of data? Probably not. But it's sure damn handy if you want to interface test equipment and get results. It's all about the best tool for the job.

  • by thethibs (882667) on Thursday January 08, 2009 @12:49AM (#26368323) Homepage

    You have to play with it. As with APL you'll either love it or hate it.

    If you like the idea of a language that includes relational tables as a primitive data type, that extends most operators to do the right thing when you feed them vectors and matrices, that has linear regression and equation solving built-in, you'll probably like R.
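A short sketch of the features listed above, using only base R and made-up numbers. The data frame stands in for the "relational table" idea, which is a loose analogy rather than a claim that data frames are a true primitive type.

```r
# A data frame is a built-in table type; the values are made up and
# lie exactly on y = 2x + 1 so the fit is easy to check by eye.
df <- data.frame(x = c(1, 2, 3, 4, 5),
                 y = c(3, 5, 7, 9, 11))

# Operators are vectorized: this squares every element of the column.
df$x.squared <- df$x^2

# Linear regression is built in; the coefficients come out near 1 and 2.
fit <- lm(y ~ x, data = df)
print(coef(fit))

# So is solving a linear system A %*% z = b.
A <- matrix(c(2, 1,
              1, 3), nrow = 2, byrow = TRUE)
b <- c(3, 5)
z <- solve(A, b)
print(z)
```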

  • by PeterBrett (780946) on Thursday January 08, 2009 @07:11AM (#26370033) Homepage

    Pfft. Matlab is the fastest way to connect to his testing equipment.

    One of MATLAB's few redeeming features is the Instrument Control Toolbox, especially since it works well with most of the top-end Agilent/Tektronix kit. It's nice to be able to automate acquisition and analysis of instrument data from a single environment.

  • by pavon (30274) on Thursday January 08, 2009 @11:38AM (#26372401)

    Since there are no functions and the only way to reuse code is to put it in a different file people tend not to do this.

    Oh yeah, and when you do so, you have to draw an icon to represent the function instead of just giving it a name, and many people don't do this (even though the icon could just be text in a box). And since the data-flow nature of the language also eliminates most intermediate variables, you end up with code that is nothing but unlabeled lines drawn between generic-looking boxes. In other words, the semi-self-documenting nature of function and variable names is lost because those names don't exist, or are not shown.

    Just another example of how LabView makes it easy for people to write bad code while being far more time-consuming to write good code than other languages.

  • by Anonymous Coward on Thursday January 08, 2009 @01:30PM (#26374019)
    R is based on S, which was developed at Bell Labs in 1975; C was developed at Bell Labs in 1972. Perhaps you've overstated the "staggering" part. A three-year head start doesn't seem like that much to me.
