Python Programming

Van Rossum: Python Not Too Slow 510

Posted by Soulskill
from the prefers-terms-like-stately-or-majestic dept.
snydeq writes "Python creator Guido van Rossum discusses the prospects and criticisms of Python, noting that critics of Python performance should supplement with C/C++ rather than re-engineering Python apps into a faster language. 'At some point, you end up with one little piece of your system, as a whole, where you end up spending all your time. If you write that just as a sort of simple-minded Python loop, at some point you will see that that is the bottleneck in your system. It is usually much more effective to take that one piece and replace that one function or module with a little bit of code you wrote in C or C++ rather than rewriting your entire system in a faster language, because for most of what you're doing, the speed of the language is irrelevant.'"
This discussion has been archived. No new comments can be posted.

  • Re:Agreed. (Score:5, Informative)

    by randallman (605329) on Friday March 16, 2012 @04:56PM (#39382801)

    Python is strongly typed. Maybe you mean statically typed.

  • Personally (Score:5, Informative)

    by ciascu (1345429) on Friday March 16, 2012 @04:58PM (#39382835) Journal

    As someone simulating fluid-structure interaction with a number of constituent models and a lot of finite element (i.e. big matrix problems; using FEniCS - fenicsproject.org), using Python makes my overall quite-long algorithm much easier to flick through. Invaluable for debugging the theory as well as the implementation. FEniCS' Python interface ties into the standard C/C++ libraries using SWIG and, in simple cases, saves me working in C++. Very clear, well-written C++ is great for this application but I find it takes considerably longer to write than clear Python.

    When I hit a more intricate problem, I realized I was going to have to solve a series of FE matrices by hand (with PETSc, written in C). It turned out to be pretty straightforward to pick up SWIG, write a short module in C and a Python interface. Done! Particularly useful as I believe getting FEniCS and petsc4py to play well is tricky.

    So, I'd agree - having written a C++ version of my (simpler) problem and a Python/C version of the complicated one, the latter was definitely easier, and all the rate-limiting stuff is in C anyhow.

    Doubt it would be true for every situation but +1 from an FE perspective.

  • Python's problem (Score:5, Informative)

    by spiffmastercow (1001386) on Friday March 16, 2012 @05:04PM (#39382957)
    The problem with Python isn't the speed -- he's right about optimizing with bits of C. The problem is the GIL. Without good multithreading support, I have to give up on Python for a large number of application domains.
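
One common workaround for the GIL limitation described above is to use processes instead of threads for CPU-bound work. This is a minimal sketch, assuming the work splits into independent chunks; `count_primes`, `count_primes_parallel`, and the chunk layout are hypothetical names chosen for illustration, not anything from the comment.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(bounds):
    """Count primes in [lo, hi) by trial division (deliberately CPU-bound)."""
    lo, hi = bounds
    count = 0
    for n in range(max(lo, 2), hi):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(chunks, workers=2):
    """Run count_primes on each chunk in its own process and sum the results."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(count_primes, chunks))
```

Each worker process has its own interpreter and its own GIL, so the chunks can run on separate cores. Call `count_primes_parallel([(0, 5000), (5000, 10000)])` from under an `if __name__ == "__main__":` guard, which platforms that spawn worker processes require. The cost is that arguments and results are pickled between processes, so this suits coarse-grained work better than shared-state threading.
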
  • by Animats (122034) on Friday March 16, 2012 @05:09PM (#39383017) Homepage

    Says the guy whose whole life is tied up in the language, and whose project, at Google, to speed it up, crashed and burned. [wikipedia.org]

    Python is slow because van Rossum refuses to cut loose the boat-anchor of "anything can change anything at any time". The straightforward implementation of Python, CPython, boxes all numbers (everything is a PyObject, including an int or a float) and looks up functions, attributes, and such in a dictionary for every reference. And only one thread is allowed to run at a time. This allows one thread to dynamically patch the objects and code of another thread. Which is cool, but useless. 99.99+% of the time, there's no need for a dynamic lookup. Most program dynamism is shortly after program startup - once things are running, they don't change much. If, sometime shortly after startup, the program said "OK, done with self-modification", at which point a JIT compiler did its thing, the language would be much faster. But no. That's "un-Pythonic".
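
The "anything can change anything at any time" behaviour can be illustrated with a small sketch. The `Account` class and `total_fees` function are hypothetical; the point is only that a method can be rebound while the program runs, so each call has to look the name up again.

```python
class Account:
    def fee(self):
        return 1.0


def total_fees(accounts):
    # Each a.fee() is looked up by name at call time.
    return sum(a.fee() for a in accounts)


accounts = [Account(), Account()]
before = total_fees(accounts)

# Any code, anywhere, may rebind the method while the program runs,
# so the interpreter cannot assume an earlier lookup is still valid.
Account.fee = lambda self: 2.5
after = total_fees(accounts)

print(before, after)
```

The same `total_fees` call returns a different total after the patch, without `total_fees` or the instances being touched.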

    PyPy, the newer Python implementation, uses two interpreters and a JIT compiler to try to handle the dynamism with less overhead. They're making progress, but they need a very complex implementation to do it, and they're many years behind schedule.

    Python, as a language, is very usable. But it's too slow for volume production. That's not inherent in the basic language. Python could remain declaration-free if there were just a few more restrictions on unexpected dynamism. By this is meant ways the program can change itself that aren't obvious from looking at the source code. For example, if a module or class could only be modified from outside itself if it contained explicit self-modification code (like a relevant "setattr" call) most modules and classes could be nailed down as fixed, "slotted" objects at compile time. The other big win is using enough type inference to decide if a variable can always be represented as a machine type (int, float, char, bool, etc.). That's a huge performance win.
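
For a rough idea of what a "slotted" object means, Python already has an opt-in `__slots__` declaration. The sketch below shows only that existing per-class mechanism, not the compile-time inference the comment proposes; the `Point` class is a made-up example.

```python
class Point:
    __slots__ = ("x", "y")  # fixed attribute layout, no per-instance __dict__

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
p.x = 10                      # declared attribute: allowed

try:
    p.z = 3                   # undeclared attribute: rejected
except AttributeError:
    rejected = True
else:
    rejected = False
```

With the layout fixed, an implementation can store attributes at known offsets instead of in a per-instance dictionary.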

    Claiming that the "slow parts" should be rewritten in C is a cop-out. It makes the program more fragile, since C code can break Python's memory safety. Except for number-crunching, or glue code for existing libraries, it's seldom done.

    (I have a Python program running right now which will run for over a week, parsing the street address of every business in the US into a standard format. The parser is complex enough that rewriting it in C would be a big job. There's no "inner loop".)

  • Re:007087 (Score:5, Informative)

    by buchner.johannes (1139593) on Friday March 16, 2012 @05:10PM (#39383027) Homepage Journal

    As the GP pointed out, if you're skilled enough to write optimized code in C/C++, why fuck around with Python at all?

    Because we don't want to spend our time thinking about pointers and how to iterate over things? Because functional programming is actually really nice? Because in Python, you can download some data from the web, analyse it using a machine learning algorithm, plot the results, and install another package on the fly, combining 4 independent packages, and many ideas, in just 50 lines of code.

    ctypes is really easy to use and to interface with C or Fortran. I use it a lot, namely for the 1% of the code that takes 99% of the time. The rest is nice OOP and functional.
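
As a minimal sketch of the ctypes approach mentioned above, the following calls `strlen` from the C standard library. It assumes a POSIX-like system where `ctypes.util.find_library("c")` can locate libc, or where the symbol is already loaded into the running process; the choice of `strlen` is only for illustration.

```python
import ctypes
import ctypes.util

# Locate the C standard library; fall back to the symbols already loaded
# into the running process if no library path is found (POSIX only).
libc_path = ctypes.util.find_library("c")
libc = ctypes.CDLL(libc_path) if libc_path else ctypes.CDLL(None)

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"hello, ctypes")
print(length)
```

Declaring `argtypes` and `restype` is optional for simple cases, but it lets ctypes check the arguments and convert the result instead of assuming a C `int`.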

  • by Frnknstn (663642) on Friday March 16, 2012 @05:13PM (#39383075) Homepage

    Yes, that is correct. You should write your apps in Python.

    Your libraries, you should write in Python first, because it is also a great prototyping language. If they work fine (which they will in most cases) you have saved yourself a bunch of time. If they are too slow, you have saved yourself a bunch of time by fixing algorithmic bugs in a flexible language like Python. It is now trivial to convert it to bug-free C or C++.

  • Donald Knuth (Score:4, Informative)

    by JohnWiney (656829) on Friday March 16, 2012 @05:15PM (#39383121)
    Donald Knuth made this point in 1971, in his "An Empirical Study of FORTRAN Programs": most of a typical program's code has virtually no significant effect on performance.
  • Re:007087 (Score:3, Informative)

    by Anonymous Coward on Friday March 16, 2012 @05:21PM (#39383195)

    ... because given two equally talented developers, the one working in python will literally run circles around the guy working in C/C++... and in the real world developer time = money.

    The fact is that interfacing with C libraries in Python is already trivial. Furthermore modern tools like Cython make it EVEN EASIER!

    So you take your code, profile it, decorate the most time-intensive portions to be compatible with Cython (trivial for most applications) -or- interface it with your C library. This way, you get your code up and running in a fraction of the developer time, with near identical performance to the C implementation.
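
The "profile it" step can be sketched with the standard library's `cProfile` and `pstats` modules. The functions `slow_sum_of_squares` and `main` are hypothetical stand-ins for real application code.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    total = 0
    for i in range(n):
        total += i * i
    return total


def main():
    return [slow_sum_of_squares(20000) for _ in range(20)]


profiler = cProfile.Profile()
profiler.enable()
results = main()
profiler.disable()

# Sort by cumulative time and keep the top few entries.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

The functions at the top of the report are the candidates for Cython or a C rewrite; everything else can usually stay in Python.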

  • Re:007087 (Score:4, Informative)

    by vgerclover (1186893) on Friday March 16, 2012 @05:26PM (#39383255)

    As most things in life do, code usually follows the Pareto principle: 80% of the time is spent in 20% of the code. If Python is fast enough for, let's say, 90% of your code, and you are much more productive in Python than in C, then writing almost all the code in Python first, and replacing the bits and pieces that are too slow for you with C functions, is a much more efficient use of your time than writing everything in C.

    Also consider that sometimes you have to go on fishing expeditions for the correct algorithm for what you are doing. Do that in Python, with the faster design iteration it brings, and even if you implement the most efficient algorithm in C once you have found it, you will have spent no more time than before, while knowing all the ways that don't work and having arrived at an at least nearly optimal solution.

    Most of the time you don't need that much speed. When you do, you have to have the right algorithm and the right language. I also put forth that Python has a lot of modules that are already written in C, so you take advantage of existing optimized code that you don't have to write.
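
The point about modules already written in C can be seen with the built-in `sum`, which in CPython runs its loop in C. This is a small sketch using `timeit`; `python_loop_sum` is a hypothetical hand-written equivalent, and the timings depend entirely on the machine and interpreter.

```python
import timeit


def python_loop_sum(values):
    total = 0
    for v in values:
        total += v
    return total


values = list(range(100000))

# Both compute the same result; the built-in runs its loop in C.
assert python_loop_sum(values) == sum(values)

loop_time = timeit.timeit(lambda: python_loop_sum(values), number=20)
builtin_time = timeit.timeit(lambda: sum(values), number=20)
print("loop: %.4fs  built-in: %.4fs" % (loop_time, builtin_time))
```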

  • by Teckla (630646) on Friday March 16, 2012 @05:42PM (#39383481)

    Strictly speaking, the language itself shouldn't have any effect on how fast it executes, it's the implementation that really matters.

    This is nonsense.

    Language syntax has a huge impact on how hard or easy it is for a compiler (ahead-of-time, just-in-time, or hybrid) to produce fast native code.

    If that effort is too large, in terms of development effort and/or compiler analysis effort, you will simply never see a compiler written for those kinds of languages that produces fast executables. This is the reality.

    So, in pragmatic terms...yes, some languages are slow.

  • by vrt3 (62368) on Friday March 16, 2012 @05:53PM (#39383647) Homepage

    If you do need to know the index, you should write

    for i, element in enumerate(sequence):
        print(i, element)

    It pays off to spend some time learning not only the syntax when you learn a new language, but also often used idioms in that language.
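
A short sketch of the idiom, assuming the parent comment was contrasting it with index-based loops (the parent is not shown here). The `sequence` contents are made up for illustration.

```python
sequence = ["spam", "eggs", "ham"]

# Unidiomatic: index first, then look the element up again.
by_index = [(i, sequence[i]) for i in range(len(sequence))]

# Idiomatic: enumerate yields (index, element) pairs directly.
by_enumerate = list(enumerate(sequence))

# The optional start argument changes the first index, e.g. for 1-based output.
numbered = list(enumerate(sequence, start=1))
```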

  • by SQLGuru (980662) on Friday March 16, 2012 @05:59PM (#39383711) Journal

    return (a>0)?a+1:a-1;
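
For comparison, Python spells the same thing as a conditional expression. A minimal sketch; the function name `bump` is hypothetical.

```python
def bump(a):
    # Equivalent of the C expression (a > 0) ? a + 1 : a - 1
    return a + 1 if a > 0 else a - 1
```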

    Ternary operator FTW!

  • Re:007087 (Score:4, Informative)

    by shutdown -p now (807394) on Friday March 16, 2012 @06:39PM (#39384123) Journal

    Because we don't want to spend our time thinking about pointers and how to iterate over things? Because functional programming is actually really nice?

    You can do all that in C++11 these days, if a tad more verbose.

    vector<int> xs { 1, 2, 3 };
    transform(xs.begin(), xs.end(), xs.begin(), [](int x) { return x * x; });
    for (int x : xs) {
      cout << x << '\n';
    }

    Note the lack of pointers, and the lambda passed to std::transform.
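
For reference, a sketch of the Python counterpart of the C++11 snippet above. It builds a new list instead of transforming in place, which is the more usual Python style.

```python
xs = [1, 2, 3]

# List comprehension: the usual Python spelling.
squares = [x * x for x in xs]

# map with a lambda: the closest analogue of std::transform.
squares_via_map = list(map(lambda x: x * x, xs))

for x in squares:
    print(x)
```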

  • Re:007087 (Score:4, Informative)

    by shutdown -p now (807394) on Friday March 16, 2012 @06:48PM (#39384223) Journal

    Python is compiled to bytecode, and said bytecode is then interpreted by Python VM. It's just that this compile cycle happens transparently, and is not explicit as with .java & .class files. It also caches the resulting bytecode in form of .pyc files, and this process can be triggered manually - indeed, many Linux distros have Python packages do that when they are installed. So there's no re-parsing there, and no direct interpretation of AST.
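
The compile step described above can be inspected from Python itself in CPython. This sketch uses the standard `dis` module and the built-in `compile`; `add_one` is a hypothetical example function, and the exact instruction names vary between CPython versions.

```python
import dis


def add_one(x):
    return x + 1


# The function object already carries compiled bytecode.
raw_bytecode = add_one.__code__.co_code      # a bytes object

# dis renders that bytecode as readable instructions.
listing = dis.Bytecode(add_one).dis()
print(listing)

# compile() exposes the same source -> code object step explicitly.
code_obj = compile("y = 2 * 21", "<example>", "exec")
namespace = {}
exec(code_obj, namespace)
```

The standard `py_compile` and `compileall` modules cover the manual `.pyc` generation mentioned in the comment.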

    Java is compiled to bytecode, and said bytecode is then JIT-compiled to native code by the VM. The VM does not typically run bytecode directly - or rather, it interprets it for the first few calls, but once it's clear that the function is going to be called a lot, so that the overhead of compiling bytecode to native code is worth the bother, it compiles it.

    Also note that the above description of how Python works only applies to CPython, which is the most popular implementation, but by far not the only one. Python language spec explicitly recognizes the existence of alternative implementations, and the language is intentionally designed to cover a wide range of implementation techniques (e.g. it does not mandate reference counting and predictable implicit release of orphaned objects, for the benefit of GC-driven implementations, and provides a "with" statement to explicitly release things RAII-style). Hence why we have Jython, IronPython, PyPy etc. PyPy, in particular, implements the full source code -> bytecode -> JIT-compiled native code, similar to Java, and is much faster than CPython.
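
The "with" statement mentioned above can be sketched with a small context manager. The `Resource` class is hypothetical; it only records whether it has been released.

```python
class Resource:
    """Hypothetical resource that records whether it has been released."""

    def __init__(self):
        self.released = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs when the block exits, normally or via an exception,
        # regardless of how the implementation collects garbage.
        self.released = True
        return False  # do not swallow exceptions


with Resource() as res:
    inside = res.released   # False while the block is running

after = res.released        # True once the block has exited
```

Because release is tied to leaving the block and not to the object being collected, the same code behaves identically on reference-counting and GC-driven implementations.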

    The real reason why Python will still be slower than Java is because it's a much more dynamic language. JIT-compiler can try to infer types and other constraints, or at least predict the most likely path and optimize for that, but even a single extra check is still slower than no checks.

  • by steveha (103154) on Friday March 16, 2012 @06:57PM (#39384349) Homepage

    Will users be thinking, "Gosh, this sucks, but I'm sure glad the programmer used a dynamic language, because it made it easier on him (the programmer)."? No, they'll be thinking, "Damn buggy programs! I just lost X (hours,minutes,seconds) of work, and now I'm frustrated!" Programming languages are a means to an end, not an end in itself. Don't be a self centered developer: the fruits of your labor are for users, not so you can write the code equivalent of poetry.

    It's true that static type checking can catch bugs for you. That's why, in C, I'm rigorous about declaring const for anything that ought to not change; I like it when the compiler catches a bug for me.

    But the world isn't as binary as you make it out to be. It's not a choice between poetry-like code or more correct code; there is more to it than that.

    Python's "duck-typing" can let you write one function where you would need to write several in C or C++. As a contrived example:

    def add5(x):
        return float(x) + 5.0

    It doesn't matter what you pass to this function, as long as it can be converted to a float. In C++ you would have to write multiple versions of this with different type signatures. In C you would have to write multiple versions of this with different names! And it would truly suck if just one of the various versions had a bug in it; in Python, with just one function, you have one place to look for bugs.

    (You could probably do this contrived example with templates in C++, or even with a macro in C. And this is a useless example. But I think it makes the point, and real examples need not be as useless and trivial.)
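
A short usage sketch of the `add5` example, showing that one definition serves several argument types. The particular inputs are chosen only for illustration.

```python
from fractions import Fraction


def add5(x):
    return float(x) + 5.0


# An int, a numeric string, and a Fraction all go through the same function.
results = [add5(3), add5("2.5"), add5(Fraction(1, 2))]

# A value that cannot be converted fails at run time, not at compile time.
failed_at_runtime = False
try:
    add5("not a number")
except ValueError:
    failed_at_runtime = True
```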

    Not to mention, statically typed languages allow for easy refactoring possibilities that make it possible to fix all sorts of serious issues, including architectural ones, with reasonable effort expended. Dynamic languages, while they have made some progress in the area of refactoring, are really in the dark ages here.

    I have no idea what you are talking about here. Dynamic languages, which impose fewer limits on what you can do, have some sort of disadvantage over statically-typed languages for refactoring? How does that work?

    And let me counter that with an example. In Python, you are encouraged to simply read and write member variables directly, rather than writing getter and setter functions. But if you ever need to "hook" the getting or setting, you can write a property descriptor [gomaa.us] and run a getter or setter function, without needing to rewrite any code; the property descriptor does an implicit function call for you, and you can do any checking you need.
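
A minimal sketch of that pattern using the built-in `property` decorator. The `Thermostat` class and its validation rule are hypothetical.

```python
class Thermostat:
    def __init__(self, celsius):
        self.celsius = celsius          # goes through the setter below

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError("temperature below absolute zero")
        self._celsius = value


t = Thermostat(20.0)
t.celsius = 25.5        # callers still use plain attribute syntax
```

Code that previously read and wrote a plain `celsius` attribute keeps working unchanged after the check is added.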

    And finally, every language has its best practices. In the Python community, self-tests are strongly encouraged. Yes, with Python, type errors aren't caught until runtime; but you can easily add self-tests that exercise the code and catch the type errors pretty much immediately after you introduced them.
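
As a sketch of such a self-test, here is a `unittest` case for the `add5` function from earlier in this comment. The test names are made up for illustration.

```python
import unittest


def add5(x):
    return float(x) + 5.0


class Add5Test(unittest.TestCase):
    def test_accepts_number_like_values(self):
        self.assertEqual(add5(1), 6.0)
        self.assertEqual(add5("1.5"), 6.5)

    def test_rejects_unconvertible_types(self):
        # A list cannot be converted to float, so this is a TypeError.
        with self.assertRaises(TypeError):
            add5([1, 2, 3])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(Add5Test)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Running this after every change surfaces a type error as soon as the tests execute the affected line, which is the trade the comment describes.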

    http://docs.python-guide.org/en/latest/index.html [python-guide.org]

    No language is perfect, but in my experience Python hits a sweet spot. I can get a lot of work done in a few lines of Python, I can write those lines quickly, I can read them when I go back later, and I enjoy the coding more than C. I don't think my Python code is buggier than my C code per line of code, and in Python I write so many fewer lines of code I think I come out ahead on bugs.

    steveha

  • Re:007087 (Score:5, Informative)

    by lattyware (934246) <gareth@lattyware.co.uk> on Friday March 16, 2012 @07:53PM (#39384939) Homepage Journal

    [Python is sometimes faster to write in.]

    Python isn't just fast to write though - it's fast and easy to read and understand, to work with, and it's very easy to keep code clear and well documented. I'm not saying these things are impossible in other languages, but in Python it is effortless, and encouraged by the language. There are big benefits to using it besides simply 'fast to write'.

  • Re:Guido's wrong. (Score:5, Informative)

    by Terrasque (796014) on Saturday March 17, 2012 @05:31AM (#39387771) Homepage Journal

    It takes more skill and system-programming knowledge to deal with the tricky interfaces between the internals of a Python interpreter and an external C++ program.

    Is this experience talking, or guesswork?

    Admittedly, I haven't had a need for it myself, but it looks easy enough. And you have plenty of options, too!

    1. Extending Python with C or C++ [python.org]

    2. ctypes [python.org]

    3. Cython [cython.org]

    Examples for 1 and 2 [dalkescientific.com]

    Example for 3 [perrygeo.net]
