Slashdot
Python-to-C++ Compiler
Posted by
timothy
on Thu Jun 15, 2006 12:12 PM
from the calibrate-your-scales dept.
Mark Dufour writes "Shed Skin is an experimental Python-to-C++ compiler. It accepts pure, but implicitly statically typed, Python programs, and generates optimized C++ code. This means that, in combination with a C++ compiler, it allows for translation of pure Python programs into highly efficient machine language. For a set of 16 non-trivial test programs, measurements show a typical speedup of 2-40 over Psyco, about 12 on average, and 2-220 over CPython, about 45 on average. Shed Skin also outputs annotated source code."
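As a rough illustration of what "pure, but implicitly statically typed" could mean, here is a minimal sketch. The function names are hypothetical and are not taken from Shed Skin itself; the only assumption is that every variable keeps a single type for its whole lifetime, so a type could be inferred without declarations.

```python
# Hypothetical illustration, not Shed Skin code: there are no type
# declarations, yet each variable only ever holds one type.
def mean(values):
    total = 0.0              # always a float
    for v in values:         # values is always a list of floats
        total += v
    return total / len(values)


# By contrast, this function rebinds one name to values of different
# types, so it is not implicitly statically typed.
def not_implicitly_static(flag):
    x = 1                    # an int here ...
    if flag:
        x = "one"            # ... and a str here
    return x


print(mean([1.0, 2.0, 3.0]))  # 2.0
```

The first function is the kind of code a static translator could plausibly map to a fixed C++ type; the second depends on run-time typing.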
This discussion has been archived.
No new comments can be posted.
not terribly useful quite yet (Score:5, Insightful)
Re:not terribly useful quite yet (Score:3, Insightful)
I could envision it working like this. Instead of statically declaring all your variable types in every function, you instead simply declare that whatever types are being used, the
Re:not terribly useful quite yet (Score:3, Informative)
How exactly does duck typing differ from the structural subtyping of e.g. OCaml, which allows you to write a function that can be passed any object, of any class or none, if it provides all the methods that function uses? The type inference system handles it just fine.
Of course, "duck typing"
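The comparison in the comment above can be sketched in Python. This is a minimal example under stated assumptions: `typing.Protocol` was added to Python long after this discussion (Python 3.8), the class and function names are hypothetical, and the static check would come from an external type checker, not from the interpreter. At run time the call is plain duck typing; the `Quacker` protocol is what lets a checker verify the same thing structurally.

```python
from typing import Protocol


class Quacker(Protocol):
    """Anything with a quack() method returning str matches this type."""

    def quack(self) -> str: ...


class Duck:
    def quack(self) -> str:
        return "Quack"


class Robot:
    # Unrelated to Duck: no shared base class, just the same method.
    def quack(self) -> str:
        return "Beep"


def make_noise(thing: Quacker) -> str:
    # At run time this is ordinary duck typing: the method is looked up
    # on whatever object was passed in.
    return thing.quack()


print(make_noise(Duck()))   # Quack
print(make_noise(Robot()))  # Beep
```

Neither class inherits from `Quacker`; they match only because they provide the method the function uses, which is the property the comment attributes to OCaml's object types.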
Ewwwww (Score:3, Funny)
Re:Ewwwww (Score:5, Insightful)
I think you're not supposed to read it. You're only supposed to feed it to your C++ compiler. f2c produced unreadable output too, but nobody read the output; at one time it was the only free fortran option on linux.
Re:Ewwwww (Score:2, Funny)
Re:Ewwwww (Score:3, Insightful)
Re:Ewwwww (Score:4, Insightful)
Yeah, whenever I look at the output of my optimising compiler, it's really hard to understand too. It's all in assembler, for a start.
Plus, the quality of C code generated by CFront was rubbish - unreadable.
Same with the Modula-3 compiler I tried. You couldn't work out what was going on in the resulting C code without a load of work.
Can you see where I'm going with this?
Yeah, but that's not what we need. (Score:3, Insightful)
This won't be meaningful until a converted python script is compared to efficient code written natively in C++ in the first place.
Re:Yeah, but that's not what we need. (Score:4, Insightful)
Re:Yeah, but that's not what we need. (Score:5, Insightful)
no, I'd be far more interested in a good compiler to compile that python straight to machine code...
Re:Yeah, but that's not what we need. (Score:3, Insightful)
Re:Yeah, but that's not what we need. (Score:4, Insightful)
...and that's why it shouldn't be a Python to C++ translator; it should be a GCC frontend instead (i.e., translating to GCC's internal representation).
Re:Yeah, but that's not what we need. (Score:3, Insightful)
Would you also like to translate a text from Arabic to English by passing through 3 or 4 languages in between?
In this analogy the problem would probably be accuracy; in the case you presented, it would be performance lost to layers of conversion. Some high-level optimizations are inevitably lost (unless the C++ compiler has some sort of strong AI).
Re:Yeah, but that's not what we need. (Score:3, Insightful)
You are making a gigantic assumption that because this converter's better than the last one, that it's usable in efficiency arenas. By comparison, you might be looking at the difference between a shoe and a shoe with a spring (that's what air pumps do, don't laugh) wh
Re:Yeah, but that's not what we need. (Score:3, Insightful)
Is that the same way the method of using layers of multiple simple tools that all do one thing really well is more buggy than just using one larger general purpose monolithic app?
A cross platform Python to machine code compiler would presumably need to reinvent a whole lot of difficult platform specific stuff that has already been solved by C++ compilers.
Re:Yeah, but that's not what we need. (Score:3)
here's one way to look at it (Score:3, Funny)
- 4 hours to write a given program in python, 32 hours to write same program in C++
- 10 seconds to run the python program, but just 2 seconds to run the faster C++ program
- the program is run 20 times a day
- assume the developer time costs as much as the time of the person that runs it
Ok, so it'll take 630 days of running this program for the faster C++ program to make up for the extra time to develop it. So, if you can wa
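The 630-day figure can be checked with a few lines of arithmetic. This is only a sketch of the comment's own numbers; the variable names are made up for the example.

```python
# Figures taken from the comment above; the names are illustrative.
dev_python_hours = 4
dev_cpp_hours = 32
run_python_seconds = 10
run_cpp_seconds = 2
runs_per_day = 20

# Extra development time spent on the C++ version, in seconds.
extra_dev_seconds = (dev_cpp_hours - dev_python_hours) * 3600

# Run time saved per day by the faster program, in seconds.
saved_seconds_per_day = (run_python_seconds - run_cpp_seconds) * runs_per_day

break_even_days = extra_dev_seconds / saved_seconds_per_day
print(break_even_days)  # 630.0
```

That is 28 extra hours (100,800 seconds) recovered at 160 seconds per day.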
Re:Yeah, but that's not what we need. (Score:3, Informative)
Neither CPython nor PyPy is a strict interpreter; both of them compile source to byte-code and then act as a virtual machine to run that byte-code. PyPy also does some work on compiling to native code on the fly, depending on which version you're using (Armin Rigo's is the most sophisticated on the JIT/native code front, but it's far from stable).
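The byte-code step is easy to see in CPython with the standard `dis` module. This is a small sketch with a hypothetical function; the exact instructions printed vary between Python versions, so no listing is shown here, and the example says nothing about PyPy's JIT.

```python
import dis


def add(a, b):
    return a + b


# By the time the function object exists, its body has already been
# compiled to byte-code, which is stored on the code object.
print(type(add.__code__.co_code))  # <class 'bytes'>

# Disassemble the byte-code that the virtual machine will execute.
dis.dis(add)
```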
Re:wrong comparison (Score:3, Interesting)
That'll teach me to hit submit without checking the preview. I lost a big and important chunk of the reply after operator< because I forgot to write out the entity for <. Here's a repaste; yay form buffers, boo no edit button for the first five minutes of a post.
-----------------
That's the wrong comparison to make, because it assumes that the C++ programmer has unlimited time to make his C++ code efficient and correct.
Well, yes and no. I actually got into this else-thread; there are a hell
Native code (Score:3, Insightful)
The best way to get some speed and still keep the nice Python functions and layout is just to export the most heavily used functions to native code (C/C++).
I don't know if it's possible to take the C++ output and optimize it separately; that way you would have a good start on writing native code, though.
In short: Better, fast and easy, but not the best (if you can write native code)
Very interesting... (Score:4, Informative)
Among python programmers, I'm curious - how many use psyco (another python performance enhancement tool) for their projects? I fiddled with it a while ago (it didn't work because of a C module that it didn't like), but never had a compelling reason to go back to it. Performance optimization has never been important enough for my applications to merit the effort.
Re:Very interesting... (Score:3, Interesting)
Yup. Along the same lines, Ruby has a related project by Ryan Davis, Ruby2C [rubyforge.org]. It's useful for small localized speedups, but you wouldn't want to try to write your entire app in it.
Re:Very interesting... (Score:4, Interesting)
I use Psyco in my work. My app is a code generator that processes multiple models and transforms them into optimization code. Psyco reduced the time it took to process one model from 20 seconds to 2 seconds. It doesn't sound like much, but when you have to do it for lots of models, the speedup suddenly becomes quite substantial.
Lots of cross-language compiling... (Score:2)
I'm confused... (Score:5, Interesting)
Re:I'm confused... (Score:3, Interesting)
And yet, if you're going to compile Python, I'd want the translation into source code. If it's worth rewriting in C++, it's worth tuning, especially if you can improve the usage of type-safe code.
File as NBNC (Nice But No Cigar) (Score:5, Insightful)
What the Python C/C++ interested people REALLY need is a book written by a group of Python AND C/C++ masters which teaches the two simultaneously, showing complementary methods of doing any given thing, working from beginner to advanced, and I DON'T mean a "How to turn your n00b Python code into C/C++ hotness" sort of viewpoint. I mean both taught simultaneously, in sync, showing how they can interchange and complement each other.
Software tricks for converting? Ultimately worse than not having them because it leads to horrible obfuscation because we don't know exactly what is going on when 13,412 lines of Python is turned into C++ because WE DIDN'T WRITE IT AND WE NEVER LEARNED C/C++. "Say Mike, that's great but you're the company code cowboy and you don't do C++ natively and I sure as hell don't read it being management so exactly what happens if this needs to be fixed? We've gone from importing open source code you couldn't read to writing our own open source code you can't read."
Re:File as NBNC (Nice But No Cigar) (Score:4, Insightful)
That isn't how a compiler is used. When you compile a C++ program, you don't throw away your C++ source and check the executable into source control. "Oh, no! We used gcc and now we have a bunch of gobbledygook we don't understand!"
The C++ is an intermediate stage in the make process, akin to the output of various phases of gcc.
Python as prototyping language (Score:4, Interesting)
Python is a terrific prototyping language (and lots of other things besides). As a C++ coder I've been using it for prototyping stuff that will eventually be integrated into a larger application and therefore MUST be translated to C++. So what I'd like to see is a tool (written in Perl, just for the fun of having a linguistic threesome) that just does a light gloss on Python syntax to get me most of the way to human-readable C++. That would be far more useful (to me) than this thing, which sounds more like f2c, whose output could cause brain damage in humans and cancer in rats, or possibly the other way around.
Re:Why not just use pure C++? (Score:2, Insightful)
Re:Why not just use pure C++? (Score:2)
That's pretty much what he's doing. ShedSkin is a Python to C++ compiler; you then need to compile the C++ code ShedSkin yields to machine code, which you can do with gcc.
The goal (for the author) at the moment is to get a fairly complete Python to C++ compiler (ShedSkin is already very good if you're mostly doing simple operations such as crunching numbers, but if your program is really complex or uses libraries then you're out of luck)
Re:Why not just use pure C++? (Score:4, Informative)
Yes, I have wasted some time staring at the shell waiting and waiting for it to return from some complicated Python routine. I know that compiled C would be faster, and hand-rolled assembler would be faster still. But I say to myself: hey, I wrote this code in a single afternoon, how many weeks of hair-pulling would it take to re-engineer this - and make it bug-free - in C? When I put it that way, I don't mind waiting the extra minutes for Python to do my dirty work.
As a previous poster mentioned, the ability to handle tuples of mixed-types is critical. I look forward to seeing great things from Shed Skin in the future.
Re:Why not just use pure C++? (Score:4, Interesting)
Which is why languages like python were written in the first place. They pretty much just make the underlying C calls anyways, but do so in a way that handles buffer overflows, pointers, etc., that pretty much make C/C++ so troublesome, hazardous, and hard to learn. I like java (a lot really), but nothing beats a good scripting language, like perl or python, to handle tasks like text manipulation. Python is especially good at using libraries, such as the imaging library, which are written in C anyways. How much faster can you get calling a C library from C than from python? I honestly don't know, but I can't imagine it's that much more. But when you add in speed of development, safety, and even portability, it's powerful.
Python's OOP is also a feature that makes it far more attractive than perl for me. Perl does OOP, but it's not as clean as python's, and I don't think it supports all the OOP features either. Doing GUIs is not the strength of any scripting language, but it depends on what you need to do. You can write a native frontend and embed python into a C or even a java application.
Re:Why not just use pure C++? (Score:3, Informative)
This is why projects like pyGTK [pygtk.org] exist
Re:Why not just use pure C++? (Score:3, Interesting)
Your "expert" C++ guy wasn't an expert. Can you describe the problem a little better? If what you say is true, I, as a long-term C++ programmer, would consider switching, but I've looked at Python, and I simply don't believe you. I'll grant that C++ is a nightmare for beginners, with more pitfalls than an Indiana Jones movie, but once you know them, writing poorly performing code is unlikely.
Stupid comparison (Score:4, Insightful)
Re:Why not just use pure C++? (Score:3, Insightful)
As have I, but I'd certainly rather manage in languages that support first order data structures, "for each" loops for iterations, proper disjunctive types, pattern matching, and so on. C++ is better than it used to be, but all the data structures and algorithms in the standard library barely hold a candle to the expressive power of many functional programming and "scripting" languages.
Re:Sounds good... (Score:4, Insightful)
Uh, why would they have to? This goes from Python to C++, not vice versa. If there are no pointers or structs in the Python code, why would they have to handle them? Certainly, it's quite possible that some Python variable types will be converted to pointers or structs in the output code, but that's orthogonal to the issue of Python not having them natively.
If you were trying to go from C++ to Python, then you'd have to convert C++ pointers and structs to some sort of Python data type, and your comment would make sense. As it is, I'm not sure what you were trying to say.
Re:Sounds good... (Score:3, Insightful)
Why would one ever need to do that? The goal is not to write C++ in Python, it's to compile Python to machine code via an intermediate Python -> C++ compilation.
Re:Static Typing? (Score:2)
I recently wrote a largish simulation in python for a Biology course. The goal was to watch how a species spread over a planet given other competing species, natural disasters and the like. It took four in deep hack mode to write the whole thing,
Re:Static Typing? (Score:3, Interesting)
A more significant roadblock, IMO, is that he can't handle mixed types in 3+-tuples, which is very common.
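For readers unfamiliar with the pattern the comment refers to, here is a minimal sketch of a tuple with three elements of different types. The data and function name are hypothetical; whether a given Shed Skin version accepts such code is the commenter's claim, not something this example demonstrates.

```python
# A 3-tuple mixing str, int and float, a common way to return or
# pass around a small record in Python.
record = ("Alice", 30, 1.68)


def describe(rec):
    # Unpacking gives each name a different type, which a static
    # translator has to represent with a distinct type per position.
    name, age, height = rec
    return "%s is %d years old and %.2f m tall" % (name, age, height)


print(describe(record))  # Alice is 30 years old and 1.68 m tall
```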
Re:Static Typing? (Score:4, Interesting)
I love Python, but I hate the dynamic typing. It can be handy at times, but 99% of the time you make a variable to hold one kind of thing. Having static typing would both improve performance (because the interpreter would know what you were up to) and eliminate bugs (because it would complain when I tried to set a double to "And now press...").
I'd love to see Python get optional static typing.
Re:Very nice, but... (Score:2)
Jeremy
Re:Very nice, but... (Score:3, Insightful)
Re:Very nice, but... (Score:3, Informative)
Re:Very nice, but... (Score:3, Insightful)
Indeed, VB.net and C# have very similar features and capabilities, and if there are big performance differences between them, it's because the authors of one of the compilers screwed up.
But the other posters were arguing that their performance and capabilities should be identical because they both compile to MSIL, and in fact that any language that does so would have equal performance and capabilities. Which is just silly; hence my silly IRock.net example. For a less silly example, Managed C++ certainly
Re:If they can do this... (Score:5, Insightful)
If this converter proves to be successful, I believe that a GCC frontend will be written eventually. There are probably potential optimizations that would be difficult or impossible to implement any other way.
Some may think that the dynamic nature of Python may preclude its inclusion in GCC. Technically, all that would need to be done is to have a runtime to handle dynamic things, similar to how Objective-C (for which there is GCC support) has a runtime to handle message passing and late binding. However, a large portion of the potential efficiency of a compiled version of the language would be lost to these dynamic capabilities; luckily, a compiler can detect when things are implicitly static (in fact, this converter is limited to implicitly static constructs), and optimise them to be truly static at compile-time.
Re:2-40 what? (Score:3, Insightful)
Re:Speaking from experience... (Score:3, Informative)
It's a great language -- combining the benefits of Python, Ruby, and C# -- and it's wonderful for proto-typing in the