
R Throwdown Challenge

Posted by timothy on Sunday May 25, 2014 @08:26AM from the if-you-pirate-it-so-much-the-better dept.

theodp (442580) writes "'R beats Python!' screams the headline at Prof. Norm Matloff's Mad (Data) Scientist blog. 'R beats Julia! Anyone else wanna challenge R?' Not that he has anything against Python, Matloff adds, but he just doesn't believe that Python or Julia will become 'the new R' anytime soon, or ever. Why? 'R is written by statisticians, for statisticians,' explains Matloff. 'It matters. An Argentinian chef, say, who wants to make Japanese sushi may get all the ingredients right, but likely it just won't work out quite the same. Similarly, a Pythonista could certainly cook up some code for some statistical procedure by reading a statistics book, but it wouldn't be quite the same. It would likely be missing some things of interest to the practicing statistician. And R is Statistically Correct.'"

This discussion has been archived. No new comments can be posted.


Can't use it (Score:5, Funny)

by smittyoneeach ( 243267 ) * writes: on Sunday May 25, 2014 @08:30AM (#47086779) Homepage Journal

Nothing with a name that verbose can possibly be any good.

- Re: (Score:2)
  
  by dmbasso ( 1052166 ) writes:
  
  Like... hmm... C?
  - - Re: (Score:3)
      
      by rudy_wayne ( 414635 ) writes:
      
      R#
      - Re: (Score:2)
        
        by Mitchell314 ( 1576581 ) writes:
        
        R.net?
- Re:Can't use it (Score:4, Funny)
  
  by FatdogHaiku ( 978357 ) writes: on Sunday May 25, 2014 @11:07AM (#47087263)
  
  Is this the programming language of Pirates?
  Is this the programming language for Pirates?
  Is this the language for programming Pirates?
  Arrr...
  
  - Can't spell warez without R (Score:3)
    
    by tepples ( 727027 ) writes:
    
    And to what extent are statisticians willing to use warez?
    - Re: (Score:2)
      
      by Mitchell314 ( 1576581 ) writes:
      
      Many of them are or were college students. What do you think? :P
  - Re: (Score:2)
    
    by FatLittleMonkey ( 1341387 ) writes:
    
    Posting to undo stupid.
- - Re: (Score:2)
    
    by the_B0fh ( 208483 ) writes:
    
    I would wait for the next version...
    R2D2...
Comment removed (Score:5, Funny)

by account_deleted ( 4530225 ) writes: on Sunday May 25, 2014 @08:36AM (#47086789)

Comment removed based on user account deletion

- - Re: (Score:3)
    
    by I'm New Around Here ( 1154723 ) writes:
    
    It dices. It chops. It purees. It makes my food taste better, to a not insignificant amount.
    Any other claims you want to hear from a chef*?
    .
    *Note: Worked in several restaurants during and after high school. Now I occasionally cook or make deserts at home.
    - Re: (Score:2)
      
      by arglebargle_xiv ( 2212710 ) writes:
      
      *Note: Worked in several restaurants during and after high school.
      Saying "would you like fries with that" doesn't really count as working in a restaurant though...
      - Re: (Score:2)
        
        by I'm New Around Here ( 1154723 ) writes:
        
        No, I worked in actual restaurants, with menus and tablecloths and everything. The real "doesn't count" part is that generally I was a dishwasher or busboy, and only did the prep-cook work as it was needed. But I didn't want to confuse the issue at the expense of a fun post.
    - Re: (Score:2)
      
      by cellocgw ( 617879 ) writes:
      
      *Note: Worked in several restaurants during and after high school. Now I occasionally cook or make deserts at home.
      We have enough arid lands already, you insensitive clod!
      - Re: (Score:2)
        
        by I'm New Around Here ( 1154723 ) writes:
        
        D'oh!
        I forgot to check that. "Just remember that 'dessert' has two s's because you want to have two servings."
        Thanks for catching that. :^)
  - Re: (Score:2)
    
    by fuzzyfuzzyfungus ( 1223518 ) writes:
    
    I'd like to see a double-blind study on the chef claim, too.
    I'm pretty sure that food, like music, has a fringe of...enthusiasts...who would tell you that double-blind studies just ineffably blunt the terroir in some more or less mystical way (which is of course the real reason why they have trouble performing above chance), rather than let base materialism and the plebian theory that functionally identical outcomes can be produced by a variety of means sully the transcendent subtlety of their experience.
    - Re: (Score:2)
      
      by Culture20 ( 968837 ) writes:
      
      I enjoy having my steak prepared the same way every time, but I would balk at eating a 100% reproduced steak.
    - Re: (Score:2)
      
      by gordo3000 ( 785698 ) writes:
      
      Like wine, or food, or music, if you do a double blind study you may very well end up with very different preference rankings. It doesn't mean the outcomes are equivalent though. Even dishes that are usually simple are made with very different spices and ingredient balances by different chefs, and frankly, very different ingredients. Even sushi, one of the simplest foods available (literally, slice raw fish, put it on rice which has some sugar and vinegar added to it) can be markedly different. And usua
Bad analogy (Score:5, Insightful)

by Florian Weimer ( 88405 ) writes: <fw@deneb.enyo.de> on Sunday May 25, 2014 @08:39AM (#47086797) Homepage

An Argentinian chef is more likely to make great sushi than a Japanese automotive engineer.
You generally want to use programming languages designed by experienced programmers (even better, experienced language designers) who work closely with subject matter experts. Left to their own devices, experts are likely to get a lot of things wrong, and if the language is sufficiently popular, you are stuck with their mistakes for a long time to come.

- Re:Bad analogy (Score:5, Interesting)
  
  by Glock27 ( 446276 ) writes: on Sunday May 25, 2014 @09:08AM (#47086871)
  
  Exactly. Julia will eat R for lunch soon enough, I think. It's an elegant, well designed and efficient language. It's only been around for a couple of years, and has a very vibrant and rapidly growing community.
  Check it out for yourself: The Julia Language Homepage [julialang.org]. It's got a lot to offer anyone with an interest in mathematics, including statisticians. It's based on LLVM, and interfaces trivially with C libraries - plus it's a very fast language in its own right, unlike R or Python.
  
  - Re:Bad analogy (Score:5, Interesting)
    
    by retchdog ( 1319261 ) writes: on Sunday May 25, 2014 @09:48AM (#47086959) Journal
    
    my friend uses julia, and every few weeks complains about some bug. the other day he mentioned that the latest release broke Bernoulli sampling (wtf?). the others have been pretty fundamental too.
    this is a serious problem, of course. the other one is lack of libraries. R is an abysmal pile of shit, but at least it's a standard; pretty much 95%+ of applied stats is at least partially supported by someone's hacked-up library/package. julia is far, far short of that, and it appears that much of its community is more interested in pretty graphics, meta-wankery, and interface methodology than actual working statistics (not that there's anything wrong with that per se).
    yeah, yeah, "fix it yourself," and it's on my list to write at least a basic survival analysis package for it. but i wouldn't blame anyone for not using it, and i wouldn't recommend it for doing stats as it is now.
    
    - Comment removed (Score:4, Funny)
      
      by account_deleted ( 4530225 ) writes: on Sunday May 25, 2014 @09:59AM (#47086995)
      
      Comment removed based on user account deletion
      
    - Re: (Score:2)
      
      by K. S. Kyosuke ( 729550 ) writes:
      
      How much R package code is written in R? Would it be such a problem to take an R parser and generate Julia code out of it as a first iteration? Then, people could refactor it - if necessary - while keeping the first version around for regression testing. Even if the original R APIs are horrible, at least they have the benefit of people being familiar with them, as you rightly point out.
      - Re: (Score:2)
        
        by Antique Geekmeister ( 740220 ) writes:
        
        Like the f2c toolkit, for converting Fortran to C?
        I don't think you could write the parser in R, or in Julia.
        
        Re: (Score:2)
        
        by K. S. Kyosuke ( 729550 ) writes:
        
        Or f2cl? ;-) I don't see a reason why one shouldn't be able to write the parser in Julia. It seems perfectly equipped even for such tasks. It even has macros, come to think of it.
        
        Slashdot (Score:2)
        
        by DrYak ( 748999 ) writes:
        
        You know you're on /. when you need to check half of the words on Wikipedia, just to be able to understand a 10-word sentence. :-)
      - Re: (Score:2)
        
        by fuzzyfuzzyfungus ( 1223518 ) writes:
        
        Given that these languages are (primarily, obviously anything Turing-complete can be turned to the same purposes as anything else, if somebody feels like it) used for statistics work, I'd be inclined to wonder whether that is the easiest or best way to go about it:
        
        If something is already implemented in R, and you want to more or less blindly feed it a new target, or re-run it to see how it works, R was apparently not broken enough to stop it, because it's already done.
        
        If you want to implement some, cu
        
        Re: (Score:2)
        
        by K. S. Kyosuke ( 729550 ) writes:
        
        Julia isn't strictly numerical. It sure as hell isn't "primarily for statistics work". It has a numerical bent, but so far I haven't seen any limitation in the sense that something general and non-numeric in it would be possible (in the sense of Turing completeness) but impractical. Indeed, the very fact that Julia has been designed with support for Lisp-like macros in mind should be a hint to you that perhaps expecting it to have at least generous facilities for manipulating and transforming syntactic tree
      - Re: (Score:2)
        
        by Jmstuckman ( 561420 ) writes:
        
        Although much R package code is written in R, many of the important bits are living in FORTRAN libraries (many of which date back to the 1980s) which are linked into the packages.
  - Re: (Score:2)
    
    by HiThere ( 15173 ) writes:
    
    Julia is an excellent design for a specific range of problems. I was considering using it for a couple of days, so I looked over the design. It is good for handling matrices of identical types of element doing the same thing on each entry. This is a pretty broad class of problem, but it's far from descriptive of all problems, and even within that class I'm skeptical that they will ever be able to optimise some of the operations.
    OTOH, I must admit that I didn't even consider using R. I wasn't considerin
    - Re: (Score:2)
      
      by Beck_Neard ( 3612467 ) writes:
      
      > It is good for handling matrices of identical types of element doing the same thing on each entry.
      Actually, that's MATLAB. Julia does not give matrices any special treatment - it has a type system that is rich enough that you can define an entire matrix domain-specific-language inside it (which is exactly what they did - Julia's matrix operations are defined entirely in Julia itself, yet are still blazing fast because they call external libraries). http://julia.readthedocs.org/e... [readthedocs.org]
      Plus, whereas MATLA
  - Re: (Score:2)
    
    by StripedCow ( 776465 ) writes:
    
    How can a non-functional language be _the_ platform for mathematical computing?
- Re: (Score:2)
  
  by KingOfBLASH ( 620432 ) writes:
  
  Using three lines of code I can do a regression in R and get the output, including loading the data.
  Python? Fuhgeddaboutit. Can do, but with a lot more code.
  Of course, if you're looking to do stuff you'd expect of a normal scripting language, R falls flat on its face.
  The solution? R + Python. They talk to each other quite nicely, and you can get the best of both worlds.
  - Re: (Score:3)
    
    by tomhath ( 637240 ) writes:
    
    Python? Fuhgeddaboutit. Can do, but with a lot more code.
    Yea, with Python it takes up to nine lines of code [blogspot.com] to calculate the regression and generate a plot
    - Re:Bad analogy (Score:4, Informative)
      
      by KingOfBLASH ( 620432 ) writes: on Sunday May 25, 2014 @10:29AM (#47087107) Journal
      
      You're just getting a plot. I'm talking about output that looks like this:
      Call:
      lm(formula = new_day_return ~ prior_day_return + rsi_under_10 + rsi_under_20 + rsi_under_30 + rsi_over_70 + rsi_over_80 + rsi_over_90 + fourteen_day_rsi, data = mydata5)
      Residuals:
           Min      1Q  Median      3Q     Max
          -100      -1       0       1  205700
      Coefficients:
                          Estimate  Std. Error  t value  Pr(>|t|)
      (Intercept)       -9.845e+01   3.742e+02   -0.263     0.792
      prior_day_return  -4.143e-04   3.434e-03   -0.121     0.904
      rsi_under_10      -1.916e-01   3.798e+00   -0.050     0.960
      rsi_under_20       2.195e-02   1.447e+00    0.015     0.988
      rsi_under_30      -2.291e-01   6.915e-01   -0.331     0.740
      rsi_over_70       -2.364e-01   3.348e-01   -0.706     0.480
      rsi_over_80        5.135e-03   4.820e-01    0.011     0.991
      rsi_over_90        7.162e-03   8.650e-01    0.008     0.993
      fourteen_day_rsi   4.193e-04   3.434e-03    0.122     0.903
      Residual standard error: 163.7 on 1581663 degrees of freedom
        (137 observations deleted due to missingness)
      Multiple R-squared: 5.397e-07, Adjusted R-squared: -4.518e-06
      F-statistic: 0.1067 on 8 and 1581663 DF, p-value: 0.999
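
For comparison, the core numbers in a summary like this (estimates, standard errors, t values, R-squared) can be computed in plain Python for the one-predictor case. What follows is only a minimal sketch using the standard library: the `simple_ols` helper and the toy data are invented for illustration and have nothing to do with the RSI dataset in the post, and in practice one would more likely reach for a package such as statsmodels than hand-roll this.

```python
import math


def simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    rss = sum(r * r for r in residuals)        # residual sum of squares
    tss = sum((yi - mean_y) ** 2 for yi in y)  # total sum of squares
    dof = n - 2                                # two fitted parameters
    sigma = math.sqrt(rss / dof)               # residual standard error

    se_slope = sigma / math.sqrt(sxx)
    se_intercept = sigma * math.sqrt(1.0 / n + mean_x ** 2 / sxx)

    return {
        "intercept": intercept,
        "slope": slope,
        "se_intercept": se_intercept,
        "se_slope": se_slope,
        "t_intercept": intercept / se_intercept,
        "t_slope": slope / se_slope,
        "sigma": sigma,
        "dof": dof,
        "r_squared": 1.0 - rss / tss,
    }


# Toy data, invented for this example.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

fit = simple_ols(x, y)
print(f"{'':<12}{'Estimate':>10}{'Std. Error':>12}{'t value':>10}")
for name in ("intercept", "slope"):
    print(f"{name:<12}{fit[name]:>10.4f}{fit['se_' + name]:>12.4f}{fit['t_' + name]:>10.3f}")
print(f"Residual standard error: {fit['sigma']:.4f} on {fit['dof']} degrees of freedom")
print(f"Multiple R-squared: {fit['r_squared']:.4f}")
```

The sketch stops at a single predictor on purpose. The multi-predictor model in the post, the p-values and the handling of missing observations are exactly the parts where a maintained statistics package earns its keep.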
      
      - Re: (Score:2)
        
        by KingOfBLASH ( 620432 ) writes:
        
        OK. How about loading of data?
        In R I just type mydata <- read.table("./foo", header=TRUE, sep=",")
        What about messing around with models? Python you're either executing your script over and over again, or you're using it in interactive mode, and it gets a bit messy.
        I use R because it seems easiest to me. If you can make my life easier with Python, I'm all ears...
        What would you recommend?
        
        Re: (Score:2)
        
        by LetterRip ( 30937 ) writes:
        
        import pandas as pd
        # header=0 takes the column names from the first row; read_csv rejects header=True
        df = pd.read_csv("./foo", header=0, sep=",")
        
        Re: (Score:2)
        
        by LetterRip ( 30937 ) writes:
        
        Look at ipython notebook,
        it is a lot like the workflow for mathematica or R.
        Look at pandas + scipy stack.
        pandas replicates the functionality of R dataframes but integrates many features found in external R packages in a beautiful and intuitive way.
        notebook + scipy stack (scipy, numpy, sklearn, matplotlib w seaborn or use ggplot if you prefer) + pandas is enough to largely eliminate the need for R for most people doing machine learning or statistical analysis (there are still occasional times when I need som
    - Re:Bad analogy (Score:5, Insightful)
      
      by professionalfurryele ( 877225 ) writes: on Sunday May 25, 2014 @11:14AM (#47087297)
      
      Sorry but I use both R and python in my work as a biomechanist and while I love working with python and hate working in R, R is not only less verbose for this task, but it is more consistent, intuitive and better documented. Very few languages beat python for simple, easy to read code, but it is not up to the task of doing general purpose statistics. To see why this is the case consider a problem with that blog post. All the diagnostic plots I need to do to check the regression are missing, no qq, no cook's, not even something simple like fitted vs. residual. Now consider what happens when I notice that while the fit is decent the residuals depend on what subject I'm looking at and I need to vary the error term. Or need to switch to a mixed effects model because there is clearly a dependence on the intercept by subject.
      Seriously when I say I hate R, I mean it. The code is ugly, it can be hard to read and woe betide the poor git who makes the mistake of needing a plot more complicated than something lattice can do. It is still better than python for statistics.
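
To make the point about diagnostics concrete, here is a rough Python sketch of the numbers behind two of the checks mentioned above (fitted vs. residual, and Cook's distance) for a one-predictor regression. It uses only the standard library; the `regression_diagnostics` helper and the data are hypothetical, plotting is left out, and nothing here touches the mixed-effects case.

```python
def regression_diagnostics(x, y):
    """Return fitted values, residuals, leverages and Cook's distances
    for the simple regression y = intercept + slope * x."""
    n = len(x)
    p = 2  # number of fitted parameters (intercept and slope)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x

    fitted = [intercept + slope * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    s2 = sum(e * e for e in residuals) / (n - p)  # residual variance estimate

    # Leverage of each point: diagonal of the hat matrix for one predictor.
    leverages = [1.0 / n + (xi - mean_x) ** 2 / sxx for xi in x]
    # Cook's distance: how far the fit would move if point i were dropped.
    cooks = [
        (e * e / (p * s2)) * (h / (1.0 - h) ** 2)
        for e, h in zip(residuals, leverages)
    ]
    return fitted, residuals, leverages, cooks


# Toy data, invented for this example; the last point is a deliberate outlier.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 9.0]

fitted, residuals, leverages, cooks = regression_diagnostics(x, y)
for xi, fi, ei, di in zip(x, fitted, residuals, cooks):
    print(f"x={xi:.0f}  fitted={fi:7.3f}  residual={ei:7.3f}  cooks_d={di:6.3f}")
```

Even this small amount of bookkeeping has to be written or found separately, whereas R produces the corresponding plots directly from a fitted model object, which is the asymmetry the comment is complaining about.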
      
      - Re: (Score:2)
        
        by professionalfurryele ( 877225 ) writes:
        
        You can. For me the primitives are a pain to work with compared with matplotlib. Not that anything I've used has good 2D primitives for plotting, just gradations of less crappy.
    - Re: (Score:2)
      
      by CadentOrange ( 2429626 ) writes:
      
      With judicious use of semicolons, you could fit all that into a single line.
      You might have to scroll horizontally a lot, but it's still a single line!
- Re: (Score:2)
  
  by arglebargle_xiv ( 2212710 ) writes:
  
  An Argentinian chef is more likely to make great sushi than a Japanese automotive engineer.
  There's an even closer-to-food analogy for this: If you want a good Italian pizza, get a Greek to make it. I have no idea why this works, but the best Italian pizzas always tend to be made by someone called Nikos or Costas.
  - Re: (Score:2)
    
    by StripedCow ( 776465 ) writes:
    
    (While you may be right, following slashdot conventions the analogy was intended as a car-analogy, not a food-analogy.)
- - Re: (Score:2)
    
    by I'm New Around Here ( 1154723 ) writes:
    
    Considering Florian Weimer didn't make an analogy in his post, your post is what happens when a /. geek tries to make an argument based on his own skills in reading comprehension.
    - Re: (Score:2)
      
      by DexterIsADog ( 2954149 ) writes:
      
      Yeah? Go back and read it again. But sober this time. Put the cherry schnapps back in mom's liquor cabinet.
      - Re: (Score:3)
        
        by I'm New Around Here ( 1154723 ) writes:
        
        Yeah? Go back and read it again.
        
        OK. FW said:
        An Argentinian chef is more likely to make great sushi than a Japanese automotive engineer.
        You generally want to use programming languages designed by experienced programmers (even better, experienced language designers) who work closely with subject matter experts. Left to their own devices, experts are likely to get a lot of things wrong, and if the language is sufficiently popular, you are stuck with their mistakes for a long time to come.
        Upon rereading it, I still don't see an analogy. So let's break it down and verify.
        An Argentinian chef is more likely to make great sushi than a Japanese automotive engineer.
        Not an analogy. A statement of fact, with some supposition. It is possible Japanese auto engineers are all required to be master sushi artists, but unlikely. Still not analogy.
        You generally want to use programming languages designed by experienced programmers
        Again, not an analogy. A statement of personal opinion, which may or may not be factually accurate.
        (even better, experienced language designers)
        More personal opinion. But it certainly makes sense.
        who work closely with subject matter experts.
        Conclusion of personal opinion.
        Still not an analogy.
        Left to their own devices, experts are likely to get a lot of things wrong,
        Again, supposition used to bolster an
        
        Re: (Score:2)
        
        by I'm New Around Here ( 1154723 ) writes:
        
        I won't dispute that, but I still know what an analogy is.
true, but not really because of R itself (Score:5, Insightful)

by Trepidity ( 597 ) writes: <.delirium-slashdot. .at. .hackish.org.> on Sunday May 25, 2014 @08:42AM (#47086807)

R itself is okay, but even as a long-time user I don't think the language or environment itself is all that much to brag about. What makes it great for statistics is just that statisticians use it, which means that a lot of the packages are written by statisticians. That makes a big difference: recent papers often have R implementations, standard problems have well-maintained R packages for them with all the bells and whistles, etc. As Matloff notes, this means they often have everything that statisticians are looking for, while straightforward textbook implementations you often find in other languages often aren't nearly as thorough in how they handle the statistical models, or only handle some special cases (though there are some really good packages in other languages, just not as many).
But I don't think that has much to do with R itself being uniquely suited to statisticians. It's used for historical reasons: Bell Labs S was influential in the field way back when nothing like Python or Julia existed, and statisticians started using it because it was a lot nicer than Fortran, which is what other areas of science mostly used back then. GNU R is essentially a free-software workalike for Bell's S, and it's kept most of the community on board through a mixture of existing packages, familiarity, and inertia.

- Re:true, but not really because of R itself (Score:4, Interesting)
  
  by jythie ( 914043 ) writes: on Sunday May 25, 2014 @09:44AM (#47086947)
  
  *nods* who uses a language has more impact on its usefulness than anything inherent to the language. Libraries, support community, ease of hiring people who both know the language and have domain-specific skills are all much more important than what kind of sugar the language has.
  
- Re:true, but not really because of R itself (Score:4, Interesting)
  
  by HuguesT ( 84078 ) writes: on Sunday May 25, 2014 @10:13AM (#47087061)
  
  R has some pretty unique graphing packages. Nothing that I know of matches the way you can do 2D and 3D plots in R. Not Python, not Gnuplot, not Julia, not Matlab, not Excel, not Mathematica, nothing.
  
  - Re: (Score:3)
    
    by Trepidity ( 597 ) writes:
    
    Around here Python's matplotlib has been making some inroads in the plotting category, even among people who use R for the actual data analysis, but it's admittedly not as featureful as the whole suite of R plotting packages.
- true, but not really because of R itself (Score:4, Insightful)
  
  by jonnyj ( 1011131 ) writes: on Sunday May 25, 2014 @04:46PM (#47088985)
  
  Completely right.
  We use R extensively in work. Programmers talk about R's libraries, but that's not the real reason we use it. The killer blow is that the _documentation_ is written by statisticians. That means that it's reliable, easy to understand, and honestly tells you the pitfalls of the techniques you're using.
  We're financial guys who are doing stuff in consumer finance that has rarely, if ever, been done in our field. The statistics aren't particularly advanced, but it's impossible to hire someone who understands the industry and knows the statistics already. Statistics text books tend to either be so basic that you already know what they say, or so advanced that you need a PhD to understand them. On the other hand, much of the R documentation is beautifully simple to read, and comes with brilliant worked examples - albeit from fields that are very different from our own. Whenever we're researching potential new statistical approaches, we find blogs stuffed full of examples written in R.
  In short, the R ecosystem makes you a better statistician. Julia and Python can't offer that.
  
  - Re: (Score:2)
    
    by StripedCow ( 776465 ) writes:
    
    A true "statistical" programming language allows the user to define statistical processes in the language and then compute its statistical properties.
    For example:
    x = random()   /* a random number between 0 and 1, uniformly distributed */
    y = x*x
    print(E(y))    /* print the expected value of y */
    R is nowhere near that.
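
A general-purpose language can at least approximate that quantity by simulation, though that is not the symbolic treatment the pseudocode asks for. A minimal sketch in Python, assuming x is uniform on [0, 1), for which E(x*x) is exactly 1/3; `expected_value` is a made-up helper name, not a function from R or any library.

```python
import random


def expected_value(f, draws=100_000, seed=0):
    """Monte Carlo estimate of E[f(X)] for X uniform on [0, 1)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return sum(f(rng.random()) for _ in range(draws)) / draws


estimate = expected_value(lambda x: x * x)
print(f"estimated E(y) = {estimate:.4f}, exact value = {1 / 3:.4f}")
```

The estimate only converges at the usual one-over-root-n rate, so this is a numerical stand-in rather than a language that reasons about distributions.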
- - Re: (Score:2, Interesting)
    
    by Anonymous Coward writes:
    
    You got the title wrong.
    _Numerical Recipes in C_, by Press, W. et al
    http://www.amazon.com/Numerical-Recipes-Scientific-Computing-Edition/dp/0521431085
    IIRC there was also a _Numerical Recipes in FORTRAN_ as well.
    Also see http://www.nr.com/ . I think they only have a single book now called _Numerical Recipes_ and it is in its third edition.
  - Re: (Score:2)
    
    by plopez ( 54068 ) writes:
    
    The numerical recipes series was much more than algorithms and code. It told you more about the how and *why* of an algorithm. And when as in when it should be used. The commentary alone is enough reason to buy them even if you never actually use any code from them.
Meh (Score:5, Informative)

by hyfe ( 641811 ) writes: on Sunday May 25, 2014 @08:49AM (#47086827)

Statistics major who programmed Python professionally for a few years (and have an MSc in Comp.Sci) ...
... this is all posturing and drama, but good on Prof. Norm Matloff for getting some attention. R is rather useful, has quite a few extremely useful features as a language, including some of the best list/indices handling I've seen anywhere. Excellent libraries for statistical work, but it also has quite a few of the most downright abhorrent language decisions I've seen anywhere ever, with the amazingly poor string handling (for a scripted language) topping that list ( http://www.burns-stat.com/page... [burns-stat.com] )
Python, C, Mathematica and R all have different strengths for mathematical work / numerical calculations though, and using the best tool for the job is what it's about. As always, what the best tool actually is, is also rather subjective, as which tool will best solve a specific task is always dependent on your skill with the different tools. I do agree with the professor though, even though there's quite a bit of Python hype (python + scipy/matplotlib is amazing) R is not being replaced anytime soon. It's too good at what it's good at.

- Re: (Score:2)
  
  by Pseudonym ( 62607 ) writes:
  
  [...] using the best tool for the job is what it's about.
  Ah, but from the point of view of a computer scientist, the "best tool for the job" isn't necessarily the best tool that currently exists. R is a fabulous set of well-documented algorithms, linked together with one of the most bizarre, poorly-specified and inadequately-documented languages, with a flaky, abstraction-leaking, poorly-performing implementation [purdue.edu].
  I think it's great that R is written by statisticians for statisticians, and that statisticians fin
- Re: (Score:2)
  
  by PerlPunk ( 548551 ) writes:
  
  I agree with the above about R. But as regards to reliability, I would prefer SAS to R, even though I hate SAS even more than R. Yes, R has lots and lots of features, good documentation, better libraries than any other out there. But sometimes I find discrepancies between R and SAS in performing the same operations, and when I test which is right SAS always seems to win. That is to say that R as an open source platform has the same problems open source platforms tend to have -- buggy code, sometimes inconsi
A joke on the subject (Score:5, Funny)

by kav2k ( 1545689 ) writes: on Sunday May 25, 2014 @09:03AM (#47086861)

A joke I've read recently [twitter.com]:
I'm not sure if "R is written by statisticians, for statisticians" is a good thing e.g. "stadiums are built by footballers, for footballers"

Who really f-ing cares? (Score:4, Insightful)

by nurb432 ( 527695 ) writes: on Sunday May 25, 2014 @09:25AM (#47086907) Homepage Journal

Use the right tool for the job and stop bashing other tools that were designed for different jobs .

With R... every day is Talk Like A Pirate Day! (Score:4, Funny)

by TheRealHocusLocus ( 2319802 ) writes: on Sunday May 25, 2014 @10:27AM (#47087105)

"Arrrr.... fix yar name 'R' while you may, maties!!"
I may not have the belly for Deep Statistics but I do know about Internet Search noise levels. I remember trying to do research on WebDAV (believe me, there is such a thing) only to discover that folks discussing it invariably refer to it as 'dav'. Because saying "Distributed Authoring [and] Versioning" out loud makes you spit out your toothpick. Any attempt to search 'webdav' yielded only the sterile official pages, and attempts to search on 'dav' with other keywords brought up conversations from the community of Disabled American Veterans who also use the term in casual conversation, and have said an awful lot over the years. They occupied 'dav' first.
Now you may think you can pull off a 'C' where Google seems to pick off relevant results if you combine it with any computery term, but it was not always so. It has taken an incredible saturation of C, and perhaps some special coded cases on Google's part, for this to come about.
The success of Perl is due in some part to the ability of confused people to obtain help and advice about it merely by searching on its unique spelling.
So the best way to push this R language is with a refit of the name. Go with the pirate theme, it will sell many more T-shirts than those of silly camels and pearls. But stake out a bit of Keyword Real Estate that presently has a relatively low population density.
Google search result estimate counts, descending order,
r --- 2,730,000,000
ar --- 656,000,000
arr --- 24,400,000
arrrrrrrr --- 3,060,000
arrrr --- 876,000
aarr --- 638,000
arrr --- 536,000
arrrrr --- 405,000
aaarrrrr --- 267,000
arrrrrr --- 205,000
arrrrrrr --- 129,000
aarrr --- 107,000
aarrrr --- 107,000
aaarrr --- 56,600
arrrrrrrrr --- 52,400
Adding arrrs is not enough since talking like a pirate is typically accomplished with a single 'a', so ar+ space is pretty well populated up to ar{5}, it looks like best ratio is around a{3}r{3}. But even choosing the less-optimum and easier to type a{2}r{3} by using 'aarrr' instead of 'r' you have improved the signal to noise ratio by a factor of twenty-five thousand.
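
The factor quoted here can be checked directly from the list. A small sketch in Python, assuming (as the comment does) that the raw result count is a fair proxy for search noise; the `noise_reduction` helper is made up for this check and the counts are copied from the list above.

```python
# Google result-count estimates copied from the list above.
hits = {
    "r": 2_730_000_000,
    "aarrr": 107_000,
    "aaarrr": 56_600,
}


def noise_reduction(old_name, new_name, counts=hits):
    """How many times fewer results the new name returns than the old one."""
    return counts[old_name] / counts[new_name]


for candidate in ("aarrr", "aaarrr"):
    factor = noise_reduction("r", candidate)
    print(f"r -> {candidate}: about {factor:,.0f} times fewer results")
```

With these counts, 'aarrr' comes out at roughly twenty-five thousand times fewer results than 'r', which matches the figure in the comment, and the a{3}r{3} spelling roughly doubles that.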
Push the name change firmly and decisively. This means that if anyone mentions 'R' there should be immediate responses that ask, "What AARRR you talking about?" This will inject the proper searchable term into the discussion while it reminds the poster of the name change.
For an interesting 9 minute lecture that might help sell you on this idea, listen here [upenn.edu].

- Re: (Score:2)
  
  by TheRealHocusLocus ( 2319802 ) writes:
  
  For an interesting 9 minute lecture that might help sell you on this idea, listen here [upenn.edu].
  Certificate warnings freak you out? Try this link instead [upenn.edu], now with matching wildcard, calmer seas and less mogul.
- Re: (Score:3)
  
  by wisnoskij ( 1206448 ) writes:
  
  It is scary sometimes how much control the limitations of Google Search has over our lives.
  For example, the best anti pirating system you can use for any game or film is to name it with less than 3 characters. It then becomes very hard to search for it.
  It took me days to find "9" (and I know others who had similar problems), and I think I never did end up seeing "B".
  - Re: (Score:2)
    
    by Bite The Pillow ( 3087109 ) writes:
    
    If google search is limiting your pirating, you may want to investigate something a little more specialized. I assume you're talking about the 2009 film with Jennifer Connelly, not the 2005 short nor the video game - either would be two clicks away after less than a minute.
    And if Google Search is really impacting your life in any meaningful way, you should step away from the keyboard for a weekend.
    I think this is more a case where you detected a pattern from two events, and extrapolated to assume that ever
    - Re: (Score:2)
      
      by wisnoskij ( 1206448 ) writes:
      
      Well I specifically mean the specialised searchers.
      Go to The Pirate Bay. Search "9", Search "9 2009".
      Neither of those return any useful results.
      And I guarantee you that that would have affected the number of people who torrented it.
- With R... every day is Talk Like A Pirate Day! (Score:2)
  
  by iggymanz ( 596061 ) writes:
  
  I'm afraid your research neglects a huge subset of the Talk-Like-A-Pirate word space, 'yarr' has 523,000 results
- - Re: (Score:2)
    
    by TheRealHocusLocus ( 2319802 ) writes:
    
    This joke was tired and lazy a decade ago. You're not just beating a dead horse, you've moved past that to sodomizing it.
    And you've been everywhere and seen it all -- and have come back to tell us how you've been everywhere and seen it all -- and have come back to tell us how you've been everywhere and seen it all -- and have come back to tell us how you've -- been.
    Sorry to hear it. Get a leg up [youtube.com] into the world of wonder and whimsy. Join us!
If you're going to use R (Score:5, Informative)

by Johnny Loves Linux ( 1147635 ) writes: on Sunday May 25, 2014 @10:43AM (#47087167)
Be sure to use RStudio as the front end: http://www.rstudio.com/ [rstudio.com]. Using R in a terminal is ok, but having the beautiful GUI frontend RStudio makes working with R sooooooo much better! The help system, plots, R markdown (knitr), and inspecting variables in RStudio are so much easier. As far as comparisons go,
1. R is no competitor to python for writing generic scripts.
2. Python (numpy, scipy, statsmodels, pandas, sklearn, matplotlib, ipython and ipython notebooks) is not yet ready to compete with R for doing statistical analysis, but give Python a couple more years and then slashdot should do a review of how it compares.
3. You can always call R from python using the rpy2 module. This is really easy within an ipython notebook using the %load_ext rmagic command.
For a nice video on using ipython notebook in data analysis: https://www.youtube.com/watch?... [youtube.com]
For a nice selection of ipython notebooks for doing various types of data analysis: https://github.com/ipython/ipy... [github.com]
State of Programming in the Sciences (Score:3)

by wisnoskij ( 1206448 ) writes: on Sunday May 25, 2014 @11:08AM (#47087271) Homepage

Having seen the state of programming in the Sciences, I really do not think that "built by statisticians" is something you would want to advertise.

Beats python at what? (Score:4, Interesting)

by umafuckit ( 2980809 ) writes: on Sunday May 25, 2014 @11:09AM (#47087277)

A few examples are provided in TFA but it's all rather vague as to why R "beats" Python. I've been using R for years for fitting mixed effects linear models. It does this really well, it makes it easy to compare models, it's got all the cutting-edge stuff in it. The problem with R, however, is that it's shitty and unintuitive as a programming language. I do all my pre-processing in MATLAB and I only ever export to R when I have a final data frame that needs a moderately complicated statistical analysis.

Fortran throwdown challenge! (Score:2)

by Theovon ( 109752 ) writes:

This guy must have been reading the recent stuff on Fortran and decided to jump on the bandwagon.
Fortran was written by engineers and scientists for engineers and scientists.
R is written by statisticians for statisticians.
Well, there you have it. If a language or other kind of tool was developed by practitioners of X for other practitioners of X, it’s likely that it will be better than some other tool that was designed for a different purpose.
Who would have thunk it.
DSLs (Score:4, Insightful)

by jbolden ( 176878 ) writes: on Sunday May 25, 2014 @01:35PM (#47088051) Homepage

He's probably right. All other things being equal a good Domain Specific Language will crush a General Purpose Language in its domain. If Julia is much faster than R and that were unfixable it would still be far easier to write a library in Julia accessible by R than to train R users in all of Julia's concepts.
General purpose languages can sometimes get close to DSLs in effectiveness and then the greater diversity of users creates an economy of scale and deep entrenchment which drives DSLs away. But then with a large and highly diverse user base the General Purpose language isn't able to rapidly adapt so DSLs spring up to fill niches. Some of those DSLs become incredibly successful and start to move into other domains diversifying their purpose and user base to become General Purpose Languages and the cycle repeats.

- - Re: (Score:2)
    
    by jbolden ( 176878 ) writes:
    
    That for the general purpose language creates the two language problem.
    Library X has a syntax Y but also a syntax from language Z keeps bleeding through in practice vs. in a DSL where Y is clean.
shocked (Score:2)

by Spazmania ( 174582 ) writes:

I'm shocked to learn that a purpose-built programming language might be better at its specific purpose than a general purpose programming language. Shocked I say.
I'd be even more shocked if a bunch of mathematicians had the good sense to pick a Google searchable name for their language. One PIA thing with C is how hard it is to search Google for documentation when you don't remember the exact function name.
- Re: (Score:2)
  
  by cellocgw ( 617879 ) writes:
  
  I'd be even more shocked if a bunch of mathematicians had the good sense to pick a Google searchable name for their language
  You young punks have any idea by how many years R precedes the existence of google (or even alta vista)? Same goes for the c language, FWIW.
R Julia beats both R and Julia (Score:2)

by PaddyM ( 45763 ) writes:

We all know Raul Julia as M Bison beats them both. And Raul Julia's reading of "Mystery on the Docks" on Double "R" (Reading Rainbow) lives on in my mind as one of the great renditions.
Flaky (Score:3)

by StripedCow ( 776465 ) writes: on Monday May 26, 2014 @06:07AM (#47091329)

From the summary:
And R is Statistically Correct
But Python is correct all the time.

- Re: (Score:2)
  
  by Trepidity ( 597 ) writes:
  
  You can't talk SAS unless you've got a big bank account, though. A one-year, individual (single-desktop) license costs upwards of $5,000, which makes it a non-starter for a lot of people. Also, it's not open source.
- Re: (Score:2)
  
  by retchdog ( 1319261 ) writes:
  
  yes, R is written for people who know what they are doing.
  - Re: (Score:2)
    
    by Pseudonym ( 62607 ) writes:
    
    On the contrary, R is written for people who don't know they're programming. That's why it's such a pain to write maintainable programs in.
- Re:I dislike Python (Score:4, Insightful)
  
  by jythie ( 914043 ) writes: on Sunday May 25, 2014 @09:46AM (#47086955)
  
  Hrm. I never thought about the whitespace requirements in python from an accessibility perspective.
  
  - Re: (Score:2)
    
    by fuzzyfuzzyfungus ( 1223518 ) writes:
    
    Hrm. I never thought about the whitespace requirements in python from an accessibility perspective.
    I know that Python's approach to whitespace is very...polarizing; but I've always wondered how much it would cause trouble either for people who really loathe it, or for specialized situations that tend to crop up under 'accessibility' (where the path from text file to user is likely going through one or more atypical transformations, anywhere from simple contrast bumps up through text to speech or the like).
    
    Given that the whitespace has to have an unambiguous meaning to the python interpreter, your edito
- Re: (Score:2)
  
  by Pinky's Brain ( 1158667 ) writes:
  
  In the end most people will still use anything but LISP.
- Re:I dislike Python (Score:4, Interesting)
  
  by KingOfBLASH ( 620432 ) writes: on Sunday May 25, 2014 @10:48AM (#47087183) Journal
  
  Believe it or not, most statisticians are not programming wizards.
  Most stats guys use R, matlab, mathematica, or something similar. Even if it takes days to run a program that would take 20 minutes in C. Sort of like how the business guys will use VBA when they need anything, because that's what they know.
  Languages like R are used because they are accessible. And once they reach a critical mass, everyone learns them in a field.
  Sort of like how Fortran just won't die.
  
- Re: (Score:2)
  
  by pla ( 258480 ) writes:
  
  because it is an inferior mish-mash for an up-start generation which was never taught the, "In the end, everything looks like LISP," maxim.
  
  I have to suspect you as trolling here, because although I do indeed know Lisp (and Scheme, and Tcl) - Very, very little of my code ends up looking anything like Lisp.
  
  And its requirement for particular whitespace offends me as someone who has spent the last decade working with accessibility groups.
  
  I will fully agree with you that required whitespace offends me,
  - Re: (Score:2)
    
    by phantomfive ( 622387 ) writes:
    
    I have to suspect you as trolling here, because although I do indeed know Lisp (and Scheme, and Tcl) - Very, very little of my code ends up looking anything like Lisp.
    You should try making your code look functional sometime (that is, write it as functions with no side-effects), you might find you have fewer bugs.
    
    I'm really interested why you combined Tcl with Lisp and Scheme though, those languages don't seem to have much together
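As a rough illustration of the "functions with no side-effects" advice above, here is a small Python sketch. All of the names in it are hypothetical and chosen only for the example; it contrasts a function that mutates shared state with one whose result depends only on its arguments.

```python
running_total = 0  # module-level state, touched only by the impure version


def add_to_total_impure(x):
    """Impure: mutates running_total, so the result depends on call history."""
    global running_total
    running_total += x
    return running_total


def add_pure(total, x):
    """Pure: the same inputs always give the same output; nothing is mutated."""
    return total + x


def scaled(values, factor):
    """Pure: builds a new list instead of modifying the caller's list."""
    return [v * factor for v in values]


# The same call gives different answers for the impure function...
first = add_to_total_impure(5)
second = add_to_total_impure(5)

# ...but not for the pure one.
pure_first = add_pure(0, 5)
pure_second = add_pure(0, 5)

data = [1, 2, 3]
doubled = scaled(data, 2)
print(first, second, pure_first, pure_second, data, doubled)
```

The pure versions can be tested in isolation and called in any order, which is the usual argument for why this style tends to produce fewer bugs.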
    - Re: (Score:2)
      
      by pla ( 258480 ) writes:
      
      I'm really interested why you combined Tcl with Lisp and Scheme though, those languages don't seem to have much together
      
      Although you can force it to behave imperatively, Tcl primarily counts as a functional language (though I have to agree, a bit of an oddball due to its object oriented side).
    - - Re: (Score:2)
        
        by phantomfive ( 622387 ) writes:
        
        Good luck convincing any of them it's a good idea to start focusing on making their code look functional with no side-effects.
        Thanks.
        If you're a good programmer and are writing the whole thing all by yourself then I'm sure Lisp is great for that.
        If you're a great programmer, then you will write great code in any language. The language is less important than the skill of the person using it.
        Top programmers can pick languages that are great for the code they write. Average programmers should pick languages that are great for all the code they DON'T have to write, document and support/debug ;). If all those libraries, frameworks are written by above average programmers, it means less code written by crappy programmers like me, which means fewer bugs.
        If you are writing new code in an existing project, it is up to you whether you want to write functions without side-effects or not. Sometimes it's hard because you have to call a function that has side effects, but in that case you can communicate to anyone who might call your code what the side effects might be (either use comments or name the function i
        
        Re: (Score:2)
        
        by countach ( 534280 ) writes:
        
        "If you're a great programmer, then you will write great code in any language. The language is less important than the skill of the person using it."
        This is commonly claimed, but I'm damned if I could ever do anything elegant in perl. I've done lovely stuff in Java, but none of it is as great as what I did in scheme. The right language can make a good programmer better and a better programmer great.
        
        Re: (Score:2)
        
        by phantomfive ( 622387 ) writes:
        
        This is commonly claimed, but I'm damned if I could ever do anything elegant in perl.
        You're not a great programmer.
        
        Re: (Score:2)
        
        by Pseudonym ( 62607 ) writes:
        
        If you're a great programmer, then you will write great code in any language.
        If you're a truly great programmer, then you will refuse to write any code in certain languages.
        
        Re: (Score:2)
        
        by phantomfive ( 622387 ) writes:
        
        Like Python?
        
        Re: (Score:2)
        
        by Pseudonym ( 62607 ) writes:
        
        You said it, not me.
  - Re: (Score:2)
    
    by Pseudonym ( 62607 ) writes:
    
    Statisticians != Programmers.
    Yes, Dr Statistician, I know you're not a programmer, but that thing you're writing is a program, and you will use revision control or you are not working on my project.
    Woah, sorry, had a flashback there.
- Re: (Score:2)
  
  by Taxman415a ( 863020 ) writes:
  
  What's the accessibility problem with Python's whitespace? I don't code, but my screen reader reads space, tab, and newline to me just fine. I use VoiceOver.
- Re: (Score:2)
  
  by BitterOak ( 537666 ) writes:
  
  I'm not really sure I see where R fits, though. For basic statistical work, SPSS is good.
  It's good if you have the money. R is free, while SPSS is fairly expensive, as is its main competitor SAS. I see R as competing not with general purpose languages like Python, but rather with commercial statistics packages like SPSS and SAS. While it may have more of a learning curve than these packages, it is free software, which makes it very attractive for many users.
- Re: (Score:3)
  
  by gnupun ( 752725 ) writes:
  
  "R is written by statisticians, for statisticians"
  Does R invent new syntactic constructs that make it useful for handling/generating statistical data? So far I've not seen any new syntax in R that warrants creating a new programming language -- it's just a rehash of various scripting languages already available.
  From a programmer's perspective, R should just be an easy to use library that you can use in various languages like Python, Julia, Ruby, etc. There's no need to learn new syntax if it's not that new
  - Re: (Score:3)
    
    by HuguesT ( 84078 ) writes:
    
    How about the syntax for specifying a model? [princeton.edu].
    lmfit = lm( change ~ setting + effort )
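For readers who don't use R: the formula `change ~ setting + effort` asks for a linear model of `change` on `setting` and `effort`, with an intercept added implicitly. Below is a minimal Python sketch of the equivalent ordinary least squares fit using only the standard library. The function name `fit_linear_model` and the data values are made up for illustration (they are not the data from the linked page), and this is not how `lm` is implemented internally.

```python
def fit_linear_model(y, *predictors):
    """Ordinary least squares with an intercept, via the normal equations."""
    n = len(y)
    # Design matrix: the leading 1.0 plays the role of R's implicit intercept.
    rows = [[1.0] + [float(p[i]) for p in predictors] for i in range(n)]
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * p for x, p in zip(a[r], a[col])]
    # Coefficients in order: intercept, then one per predictor.
    return [a[i][k] / a[i][i] for i in range(k)]


# Made-up data constructed so that change = 1 + 2*setting + 3*effort exactly.
setting = [0, 1, 2, 3, 4]
effort = [1, 0, 2, 1, 3]
change = [4, 3, 11, 10, 18]

coefs = fit_linear_model(change, setting, effort)
print(coefs)
```

The point of the comparison is how much of this bookkeeping the one-line R formula hides, including the intercept column.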
  - Re: (Score:3)
    
    by fuzzyfuzzyfungus ( 1223518 ) writes:
    
    Don't forget the influence of history: R wasn't designed for superiority to Python, Julia, and Ruby; but in large part to be a GNU-acceptable implementation of S, which may well have been designed for superiority to APL and FORTRAN; and which has existed since somewhere between the-before-time-when-the-gods-were-young and the start of the Second Trilobite War.
- Re: (Score:2)
  
  by Pseudonym ( 62607 ) writes:
  
  It is perfect for what it is meant to do, namely, load data, do statistical analysis on it, and produce graphics summarizing the results.
  And that's also its biggest problem. It's perfect for what it's meant to do, but it's distinctly imperfect for many of the uses that it does do, and poorly designed for the uses that it could do.
  An example of the former is maintaining large codebases. People do maintain large R codebases, but this is despite the language, not because of it.
  An example of the latter is paral
