R Throwdown Challenge

Posted by timothy
from the if-you-pirate-it-so-much-the-better dept.
theodp (442580) writes "'R beats Python!' screams the headline at Prof. Norm Matloff's Mad (Data) Scientist blog. 'R beats Julia! Anyone else wanna challenge R?' Not that he has anything against Python, Matloff adds, but he just doesn't believe that Python or Julia will become 'the new R' anytime soon, or ever. Why? 'R is written by statisticians, for statisticians,' explains Matloff. 'It matters. An Argentinian chef, say, who wants to make Japanese sushi may get all the ingredients right, but likely it just won't work out quite the same. Similarly, a Pythonista could certainly cook up some code for some statistical procedure by reading a statistics book, but it wouldn't be quite the same. It would likely be missing some things of interest to the practicing statistician. And R is Statistically Correct.'"

  • Meh (Score:5, Informative)

    by hyfe (641811) on Sunday May 25, 2014 @08:49AM (#47086827)
    Statistics major who programmed Python professionally for a few years (and has an MSc in Comp. Sci.) ...

    ... this is all posturing and drama, but good on Prof. Norm Matloff for getting some attention. R is rather useful and has quite a few extremely useful features as a language, including some of the best list/index handling I've seen anywhere. It has excellent libraries for statistical work, but it also has quite a few of the most downright abhorrent language decisions I've seen anywhere, ever, with the amazingly poor string handling (for a scripting language) topping that list ( http://www.burns-stat.com/page... [burns-stat.com] )

    Python, C, Mathematica and R all have different strengths for mathematical work / numerical calculations though, and using the best tool for the job is what it's about. As always, which tool is best is rather subjective, since how well a tool solves a specific task depends on your skill with it. I do agree with the professor though: even though there's quite a bit of Python hype (python + scipy/matplotlib is amazing), R is not being replaced anytime soon. It's too good at what it's good at.

  • Re:Bad analogy (Score:4, Informative)

    by KingOfBLASH (620432) on Sunday May 25, 2014 @10:29AM (#47087107) Journal

    You're just getting a plot. I'm talking about output that looks like this:


    Call:
    lm(formula = new_day_return ~ prior_day_return + rsi_under_10 +
            rsi_under_20 + rsi_under_30 + rsi_over_70 + rsi_over_80 +
            rsi_over_90 + fourteen_day_rsi, data = mydata5)

    Residuals:
       Min     1Q Median     3Q    Max
      -100     -1      0      1 205700

    Coefficients:
                       Estimate Std. Error t value Pr(>|t|)
    (Intercept)      -9.845e+01  3.742e+02  -0.263    0.792
    prior_day_return -4.143e-04  3.434e-03  -0.121    0.904
    rsi_under_10     -1.916e-01  3.798e+00  -0.050    0.960
    rsi_under_20      2.195e-02  1.447e+00   0.015    0.988
    rsi_under_30     -2.291e-01  6.915e-01  -0.331    0.740
    rsi_over_70      -2.364e-01  3.348e-01  -0.706    0.480
    rsi_over_80       5.135e-03  4.820e-01   0.011    0.991
    rsi_over_90       7.162e-03  8.650e-01   0.008    0.993
    fourteen_day_rsi  4.193e-04  3.434e-03   0.122    0.903

    Residual standard error: 163.7 on 1581663 degrees of freedom
        (137 observations deleted due to missingness)
    Multiple R-squared: 5.397e-07, Adjusted R-squared: -4.518e-06
    F-statistic: 0.1067 on 8 and 1581663 DF, p-value: 0.999
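
For readers wondering where the Estimate, Std. Error and t value columns in a summary like the one above come from, here is a minimal sketch in pure Python for the one-predictor case. The toy data and the name `simple_ols` are made up for illustration and are not the poster's model; the Pr(>|t|) column is left out because it needs the t distribution's CDF, which the standard library does not provide.

```python
import math

def simple_ols(x, y):
    """Ordinary least squares for y = intercept + slope * x (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    # Coefficient estimates
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual standard error on n - 2 degrees of freedom
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - 2
    sigma = math.sqrt(rss / df)

    # Standard errors of the two coefficients
    se_slope = sigma / math.sqrt(sxx)
    se_intercept = sigma * math.sqrt(1.0 / n + mean_x ** 2 / sxx)

    # Multiple R-squared
    tss = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - rss / tss

    return {
        "intercept": (intercept, se_intercept, intercept / se_intercept),
        "slope": (slope, se_slope, slope / se_slope),
        "sigma": sigma,
        "df": df,
        "r_squared": r_squared,
    }

# Toy data, invented for this example only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = simple_ols(x, y)

print(f"{'':<12}{'Estimate':>12}{'Std. Error':>12}{'t value':>10}")
for name in ("intercept", "slope"):
    est, se, t = fit[name]
    print(f"{name:<12}{est:>12.4f}{se:>12.4f}{t:>10.3f}")
print(f"Residual standard error: {fit['sigma']:.4f} on {fit['df']} degrees of freedom")
print(f"Multiple R-squared: {fit['r_squared']:.4f}")
```

Each t value is just the estimate divided by its standard error, which is why the coefficients in the output above, all with |t| well below 1, come nowhere near significance.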

  • by Johnny Loves Linux (1147635) on Sunday May 25, 2014 @10:43AM (#47087167)
    Be sure to use RStudio as the front end: http://www.rstudio.com/ [rstudio.com]. Using R in a terminal is OK, but having the beautiful GUI front end RStudio makes working with R sooooooo much better! The help system, plots, R markdown (knitr), and inspecting variables are all so much easier in RStudio. As far as comparisons go,
    1. R is no competitor to Python for writing generic scripts.
    2. Python (numpy, scipy, statsmodels, pandas, sklearn, matplotlib, ipython and ipython notebooks) is not yet ready to compete with R for doing statistical analysis, but give Python a couple more years and then Slashdot should do a review of how it compares.
    3. You can always call R from Python using the rpy2 module. This is really easy within an ipython notebook using the %load_ext rmagic command.

    For a nice video on using ipython notebook in data analysis: https://www.youtube.com/watch?... [youtube.com]

    For a nice selection of ipython notebooks for doing various types of data analysis: https://github.com/ipython/ipy... [github.com]
