Slashdot Log In
OCaml vs. C++ for Dynamic Programming
Posted by
timothy
on Mon Mar 14, 2005 06:35 PM
from the picking-a-fight dept.
from the picking-a-fight dept.
jcr13 writes "OCaml is nearly as fast (or sometimes even faster) than C, right? At least according to the Computer Language Shootout [alternate] (OCaml supporters often point to these shootout results). My results on a real-world programming problem (optimizing a garden layout using dynamic programming) disagree. On one particular problem instance (a garden of size 7x3), my C++ implementation finished in 1 second, while the OCaml implementation was still running after 16 minutes. Bear in mind that my OCaml implementation was dramatically faster than my equivalent Haskell code. It seems that if you program using a functional style in OCaml (which I did, using map, filter, and other recursive structures in place of loops), it is quite slow. However, most of the shootout OCaml programs rely heavily on OCaml's imperative features (unlike Haskell, OCaml doesn't force you to be a functional purist). If you write OCaml code that is isomorphic to C code, it will be fast---what about if you use OCaml the way it was meant to be used?"
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Full
Abbreviated
Hidden
Loading... please wait.
Uh huh (Score:2)
Re:Uh huh (Score:2)
Have you tried running any tracing of the program?
Hmmm ... (Score:5, Interesting)
Dynamic programming depends basically on memoization (not "memorization", before someone complains about my typo) which inherently means preserving some state. If you don't preserve state, it becomes a good old, likely exponential time, recursive program. Any chance your implementation is not memoizing?
Re:Hmmm ... (Score:2)
Well, he said he was memoizing in the article. You can check his code here. [sourceforge.net]
Re:Hmmm ... (Score:2)
Re:Hmmm ... (Score:2)
Shortly after that, you'll also learn that if your dynamic programming solution is running 960 times longer than expected, the odds are awfully good that you've manage to make an exponential implementation of what should be a poly-time solution.
Re:Hmmm ... (Score:5, Insightful)
Try using a map instead. I'll send you example Ocaml code in a day or two (I'm moving, so I don't have that much free time to fix other people's bugs that I'm not getting paid to fix). Note that this is true in Haskell as well as Ocaml. Haskell may be just a little bit better at hiding the problem with laziness- but it's still a problem!
Now, for the brutal part of the response: that big of a difference in performance almost certain does mean you've messed something up in your implementation. But, instead of saying "Gee- I wonder what I screwed up?" you said "Gee- Ocaml and functional programming must just suck!" That I still fault you for.
Parent
Re:Hmmm ... (Score:2)
Re:Hmmm ... (Score:3, Informative)
But here's another hint: the fact that there's a fuinction that says "memoize" doesn't necessarily mean it works.
A guess... (Score:2)
That doesn't mean that ocaml is slow. It means that dynamic programming in a functional programming style in an eager programming language requires a lot of care, or perhaps should even be avoided. A result that wouldn't surprise me the slightest.
My realworld results differ (Score:2, Interesting)
Re:My realworld results differ (Score:2)
I mean this seriously. I'm both an experienced O'Caml (and SML, Erlang) programmer and C/C++ programmer. I have never, ever, had anyone present a problem that can't be done faster in C or C++. You might have to solve the problem differently, but C/C++ wins every time as far as performance goes.
Now implementation speed and/or maintenance is something different. O'Caml has a horrible syntax though. Regular SML is much better. They should've just stuck with th
Re:My realworld results differ (Score:5, Insightful)
Haskell type families are just elegance and beauty itself, but doing state in Haskell is an exercise in raw tedium. Very localized state (in one function) is easy enough, but anything more pervasive and you soon become more familiar with monads than you ever wanted to be. If you want a haskell program that doesn't suck up more memory than emacs, you have to stay away from many modern features so your program will compile with nhc98.
Ocaml isn't seeing a lot of new work going into it -- the language definition seems to have become cast in stone. Haskell is always evolving, though typically in ways that are really impenetrable to those of us without PhD's in category theory and denotational semantics.
I guess I could search the world over for my holy grail FP language, and always be dissatisfied...
Parent
Haskell (Score:2)
But now it all seems to be about Arrows. AAARRRGGGHHH!
Too bad. Haskell really is one of the coolest languages around.
Re:My realworld results differ (Score:5, Informative)
It's not actually the case that Haskell "forces" functional purity, at least not in the way the submitter seems to think. You can do things that are a LOT like non-pure functions, you just have to use Monads. You have the so-called "unsafe" functions, which perform side-effects in otherwise "pure" functions.
So you might ask, "If you're going to write code like that in Haskell, why not just use C++." The answer is because even when using Haskell in a non-idiomatic way, Haskell is still more beautiful :)
Monads are a means of threading "stateful" code in a very clean and predictable way through your programs. The parent's comment, "Very localized state (in one function) is easy enough, but anything more pervasive and you soon become more familiar with monads than you ever wanted to be," is sorta like saying, "You can write high-level code in C++, but you will soon become more familiar with objects than you ever wanted to be."
They are indeed a part of the language, and definitely a new concept, but monads aren't nearly as confusing as people seem to think, certainly not more confusing than objects, it's just a reputation issue that makes people think monads are confusing. Take it from a random-joe hacker like me. You don't need a PhD to perform IO in Haskell.
For instance, here's a basic implementation of 'cat' in Haskell:
The code:
is similar to assignment.getArgs just reads in the command-line arguments as a list, so 'a' represents a list of the filenames.
readFile takes a file name, reads the contents, and returns it as a list of lines.
mapM means 'perform this computation once for each item in this list'
putStr is obvious, concat just takes a list of lists and turns it into a single list.
There's a paper, Tackling the awkward squad: monadic input/output, concurrency, exceptions, and foreign-language calls in Haskell [microsoft.com] about how to do these kinds of "real-world" things in Haskell.
There's also a very cool version control system called darcs [darcs.net] that's written in Haskell, and recently an implementation of Perl 6 called Pugs [pugscode.org] in Haskell.
peace,
isaac
Parent
Re:My realworld results differ (Score:3, Interesting)
It isn't that monads are confusing, but that they're a contagion. Now, I'm not a very experienced Haskell programmer; in fact, I'm not a very experienced functional programmer, so I'm probably just making some dumb mistakes... but almost every application that I write goes through the following evolution:
Re:My realworld results differ (Score:2)
I've never seen a real good reason this is usefull, myself (and yes, Virginia, I have done signifigant programming in C++ and other languages with this ability). Many data structures have the same or similiar implementations, but generally different performance characteristics. When I use a par
Re:My realworld results differ (Score:2)
Sincerely delusional to call it rare. It's an entire paradigm of programming. For example, you cannot generically write a functor that maps over any container structure if the algorithm needs to be aware of every possible type of container. Criminy, those were off the top of my head. How about Socket.read and Pipe.read? (I'm not familiar enough with the standard library). Heck, if python, perl, and even C have polymorphic behavior the
Re:My realworld results differ (Score:4, Insightful)
Not always, not always.
For example, I have seen two different kinds of tree castings made of stone: one, a negative casting made by molten lava that built up as an accretion on a tree (which obviously burned out), and two, a positive casting made through a slow fossilization ("petrification") process.
I would happily come up with a false etymology originating in the parlance of lime-slakers, medieval wall builders, sarcophagus fillers, or even potters discussing cone-10 firing, but you'd probably call me on it.
That being said, it is a weird phrase, that probably belongs with "mute points" and exclaimations like "here here!"
Parent
Re:My realworld results differ (Score:2)
Oh, and it's "moot point" (no idea where the word moot comes from) and "hear, hear" (as in "we hear you". Or "preach it!").
Re:My realworld results differ (Score:2)
Why is it that I have a hankering for grits now, and want to watch Natalie Portman movies? Strange, that...
Can you also link to the Haskell code? (Score:2, Interesting)
http://minorgems.sf.net/Haskell.hs doesn't exist, though the the C++ and OCaml code are there.
apples and oranges (Score:3, Insightful)
Usings lists in all circumstances because its functional is not appropriate. Show the ocaml implementation using arrays or an c++ implementation using linked lists for a valid comparision.
The strength of Ocaml is the flexibility it provides to a developer. If your solution is more elegantly coded using imperative constructs, then use them!lookup table (Score:3, Interesting)
In the same kind of vein, has the code been profiled? I'd quite like to see where the time is going.
Re:lookup table (Score:5, Insightful)
I just looked at the code, and he's memoizing the function results in a associative list:
(* Set up an associative list for memoization *)
let lookup key table = List.assoc key !table;;
let insert key value table = table
Insertion is cheap, but the lookup is a linear table scan! Doh! What was he thinking?
I suspect that a Hashtable or a Map datastructure might be much better suited to the task.
In any case, it would have been very easy for him to post this code to the OCaml newsgroup and ask, "Am I writing good functional code?"
He would of gotten a lot of advice on how he could have sped up his program while still maintaining a functional style.
Lastly, in response to his question, "I could write an OCaml implementation that is isomorphic to the C++ code (using loops and side effects), but what would be the point?" The point is that you can easily mix and match styles in OCaml.
You can write 90% of your code in a functional style and fall back to imperative style if there is an inner loop that would benefit from that.
For this problem though, I suspect that a well written functional version would be pretty close in speed to his C++ version, cleaner, and easier to maintain.
Parent
About benchmarks (Score:2)
One algorithm is sometimes not enough to accurately judge between compilers. But I guess if it's OCaml, then sure, C++
Haskell code? (Score:5, Informative)
Can we see your Haskell code?
Haskell is not known for raw speed, but dynamic programming is probably the one thing it does well, thanks to lazy evaluation. You fill a CAF with unevaluated function calls, and the language engine does the rest. It won't be as fast as the hand-crafted C++ version, most likely, but if your O'Caml code is anything to go by, it might be able to be improved.
It's always possible to tune your inner loops (Score:5, Informative)
However, as an every-day ML user I find it very unlikely that your program would be a thousand times slower if you're using it "the way it's meant to be used." I am guessing that your implementation is asymptotically worse, since using map and fold correctly should really only be a constant factor slower than C, at worst. (mlton can often inline and optimize these into essentially the same code you'd write in C!) How about posting your code?
Re:It's always possible to tune your inner loops (Score:4, Insightful)
Parent
Bad string manipulation (Score:4, Informative)
You should use the Buffer module, or String.concat:If there is a lot of those mistake, no wonder it is so slow...
More efficient ocaml version (Score:3, Interesting)
If you are concerned by performance, you should use a complete cache, like in your C version.
FYI, I uploaded an ocaml translation of your C code. It doesn't use mutable state except for memoizing, and uses pattern-matching on lists, and recursion rather than for loops, but otherwise it follows closely your code. Performance should be very similar.
http://wwwfun.kurims.kyoto-u.ac.jp/~garrigue/gard
Re:More efficient ocaml version (Score:2, Interesting)
In terms of performance, I get:
$
real 0m16.951s
user 0m16.870s
sys 0m0.010s
$
real 0m10.200s
user 0m10.160s
sys 0m0.010s
So OCaml wins on performance per LOC.
However, this C++ is also very poorly written IMHO. Specifi
Email response (Score:5, Interesting)
>
> 1. You use lists. Lists aren't designed to be fast (computationally)
> to use. They're designed to be fast (programmatically) to use. You'll
> be hard pressed to find a production, speed-sensitive Lisp or O'Caml
> program that uses lists.
Okay... but here's my point: Every single example that shows how elegant Haskell and OCaml are uses lists. The 4-line Quicksort example for Haskell uses lists. All of the code that demonstrates easy reuse of functions and functions taken as arguments uses lists (like how easy it is to implement quite complicated algorithms using only map and filter, for example).
So, proponents say "Everyone should use functional languages because they can express complicated problems in elegant ways and result in cleaner, more reusable code."
But what you're saying in #1 above is that in "production," speed-sensitive code, no one is using lists... this would mean that no one is using map, filter, or any other pieces of reusable primitive code. So, they are instead all using mutable data structures... I.e., they are programming with side-effects and loops (random access instead of recursion, even when ever element of an array/list needs to be accessed/processed).
That was my point exactly. If you write elegant OCaml code using all of the lovely (and I mean lovely, really) tricks that they present when they demonstrate why OCaml is cool, you end up with code that is too slow to use in the real world.
I would say that my C++ (or most would call it C) implementation is elegant enough... easy to understand... no messy optimization tricks. Sure, I'm not using objects and templates everywhere, but these structures are hardly needed to solve this simple problem.
> 2. Practically none of your functions are written tail-recursively.
Good point.
> 2.5. You use a list append (@) inside a loop (generateStates).
> List.append is O(m), where m is the length of its first argument. If
> you write an implementation, you'll see why. It probably doesn't make
> much of a difference here (generateStates is only called once) but it's
> something to watch out for.
Of course, as you point out, generateStates has almost no effect on the running time. However, I wonder how you might implement that in an elegant way in OCaml without @. In C, I just looped over all numbers between 0 and 2^stateLength and converted the bit representations for the numbers to cell on/off states.
> 3. For Pete's sake, man, you're using an association list for your
> memos! Surely you know that lookup in an association list is O(n) in
> the size of the list.
I simply Googled for "memoization Ocaml" and found that code:
http://www.emeraldtiger.net/modules.php?op= modload &name=News&file=article&sid=9
The author pointed out how "sweet" polymorphism is... one block of code that can be used to memoize any function. Sweet indeed, and it certainly sped up my OCaml code a lot (without memoization, it was so slow as to be intractable for anything larger than about 4x4).
So... maybe you can re-write higher-order memoization code using more efficient data structures? I would love to see that code, and I'm sure the OCaml community would benefit from having that in their toolbox.
I agree that the memoization code is probably the problem in the OCaml version. However, this code came directly from the OCaml community and was the *only* example of memoization in OCaml that I could find.
For Haskell, I used an infinite list of results that was filled in lazily as the results were needed. This also sped up the algorithm dramatically. However, I cannot get a Haskell compiler to compile itself on my platform, so I was testing all code in the Hugs interpreter, which made it too slow to be practical. Isomorphic compiled OCaml code was hundreds of times fast
Re:Email response (Score:3, Insightful)
Only to the extent that all of your production code is speed-sensitive. The hot-spots are generally only 10% of the code. A lot of very complex code (eg: configuration stuff), only gets run very rarely. Using high-level languages has a certain design procedure. You write everything at the high-level, and
Re:Email response (Score:4, Informative)
Or perhaps more correctly, "every single example that you've seen". For a real quick one, look at Jason Hickey's Intro to OCaml (pdf) [caltech.edu] and have a quick peek at his Red/Black tree implementation. Or even cooler (if you're into that sort of thing) is the ever famous One Day Compilers [venge.net] talk.
But what you're saying in #1 above is that in "production," speed-sensitive code, no one is using lists... this would mean that no one is using map, filter, or any other pieces of reusable primitive code. So, they are instead all using mutable data structures... I.e., they are programming with side-effects and loops (random access instead of recursion, even when ever element of an array/list needs to be accessed/processed).
No. What he's saying in that you should use the best data structure for the job. Your best bet would have been to use the Hashtbl [inria.fr] module from the standard library, or if you wanted to stay in the purely applicative, the Map [inria.fr] module (also in the standard library) would have been loads faster...
You are aware that there are more purely functional data structures (pdf) [cmu.edu] (OCaml implementations) [oefai.at] than the list, don't you?
So... maybe you can re-write higher-order memoization code using more efficient data structures? I would love to see that code, and I'm sure the OCaml community would benefit from having that in their toolbox.
Here's a pretty neat example [ed.ac.uk] that uses arrays in a naive way, but you could certainly use, say, a map instead... And I'm pretty sure the OCaml community (by which I mean the people who would have helped you improve your code [inria.fr] had you asked them) know about things like this.
I don't think I spread any falsehoods. I mean, my experiment was real, and the results are real, and the code is there for people to inspect and try on their own. I also talked in my
Yes. And we inspected it, found it to be poorly written, and told you so. The "falsehood" here is that you claim that code written in a functional style is slow, when you really should have said "my code written in a naively functional style is slow". If I fill my gas tank with water, my car sure is slower than walking, therefore all cars are slower than walking... right?
Trust me... I am *dying* to use OCaml or Haskell for real-world programming. I have spent the past month or so exploring these languages and trying to apply them to real programming problems. Especially when shootout results showed that OCaml was sometimes faster than C, and when I discovered that OCaml was much faster than Hasell, I was really starting to think that OCaml was a possibility.
I put a link to tho OCaml mailing lists above. Use it. Ask questions of the list (you may want to start with the beginner's list). They can help you learn the language faster and better than google will.
However, the ONLY reason why I would want to use OCaml is to take advantage of the expressiveness of pure functional programmin
Parent
Re:Email response (Score:4, Interesting)
You get a significant boost just by dumping the list memoization in favor of a hashtable implementation. I'm not necessarily saying that's the optimal choice, but it's an easy drop-in replacement that is much better suited to the task. Here's a patch:Also: learn to use the profiler! It takes about five seconds to see that camlList__assoc is killing you.
Parent
The effect of community (Score:3, Insightful)
That's a very important point, and I think the effect of the community around a programming language (or indeed a whole paradigm -- functional, OOP, etc.) is often underestimated.
Just look at PERL: it has its merits, but in fairness clarity to newcomers is rarely listed as one of them. And yet, because asking a question on a PERL group tends to result in joyous cri
More sensible results. (Score:2)
Re:More sensible results. (Score:2)
Hm. Slashcode messed with the URL. Use good old copy and paste:
http://shootout.alioth.debian.org/great/benchma r k. php?test=all&lang=all&sort=fullcpu&xfullcpu=1&xmem =1&xloc=0&ackermann=1&wc=3&echo=5&fannkuch=0&fasta =0&fibo=1&harmonic=0&heapsort=4&knucleotide=0&mand elbrot=0
update the web page? (Score:2, Insightful)
Quick Haskell Rebuttal (Score:3, Interesting)
OCaml Evalation Is Strict, Not Lazy (Score:4, Informative)
I wrote some extensions for programming in OCaml in the functional style. Check out the OCaml NAE [sf.net] project, and look for the Core Foundation (Cf) package.
Re:Other languages... (Score:2)
On the flip-side, there are plenty of languages which were "official" standards, once, but which are now all but dead. ADA was the "standard" language of the DoD, for example, was (and is) very well supported, but is simply incapable of compe
Re:Other languages... (Score:2)
There are actually quite a few clean-room Java implementations. The GNU Classpath homepage lists 14 Java VMs [gnu.org] that use the Classpath [gnu.org] clean-room implementation of the Java class libraries.
For the record, Classpath is 99.79% function complete relative to JDK 1.1 according to the comparison linked from the home page. The numbers drop off with JDK
Re:Speed alone (Score:2, Interesting)
Re:Speed alone (Score:2)
Um... not true.
I consider myself an expert assembly level programmer. I recently took on a job optimizing some C code. The code was the back end bi-level compression code in a JBIG library. Bit twiddling, and updates to a compression table.
No big deal, I though. Unrolled the inner loop, and extracted patterns (as in "this case cannot possibly happen on the next n bits being 0, so bulk update and skip forward"). I wrote a program to generate the cases of interest, and tested.
Very good results --
Re:Speed alone (Score:2)
Re:Speed alone (Score:2)
Ratboy.
Re:OCaml is supposed to be faster? C++ is dynamic? (Score:2)
You apparently don't know what "dynamic programming" means. Dynamic programming is a general technique, usually applied to search, which caches the results of executions on problem subsets and then reuses those results when those subsets are encountered again. It has nothing to do with dynamic typing, or dynamic code, or whatever you're thinking about.
To put it extremely simply, dynamic programming means hash tables, although that's not really a complete defini
Re:OCaml is supposed to be faster? C++ is dynamic? (Score:3, Informative)
Perhaps this is a case of problem solving by public contradiction?