Catch up on stories from the past week (and beyond) at the Slashdot story archive

 



Forgot your password?
typodupeerror
×
Programming IT Technology

OCaml vs. C++ for Dynamic Programming 161

jcr13 writes "OCaml is nearly as fast (or sometimes even faster) than C, right? At least according to the Computer Language Shootout [alternate] (OCaml supporters often point to these shootout results). My results on a real-world programming problem (optimizing a garden layout using dynamic programming) disagree. On one particular problem instance (a garden of size 7x3), my C++ implementation finished in 1 second, while the OCaml implementation was still running after 16 minutes. Bear in mind that my OCaml implementation was dramatically faster than my equivalent Haskell code. It seems that if you program using a functional style in OCaml (which I did, using map, filter, and other recursive structures in place of loops), it is quite slow. However, most of the shootout OCaml programs rely heavily on OCaml's imperative features (unlike Haskell, OCaml doesn't force you to be a functional purist). If you write OCaml code that is isomorphic to C code, it will be fast---what about if you use OCaml the way it was meant to be used?"
This discussion has been archived. No new comments can be posted.

OCaml vs. C++ for Dynamic Programming

Comments Filter:
  • So the OCaml compiler is supposed to figure out that there's a dynamic programming solution to the problem you have specified using higher order operators? That's a great idea. Let me know when you work that out cause I'd love to be able to just write a specification, press a button and get an optimised program. If you do manage to make this though, be sure not to tell those bastards who pay our salaries, they'll probably think we're not working anymore.
    • No, he's trying to do it explicitly. I'm not an ML guru, but I'm kinda suspicious that "generate states" is getting called exponentially many times.

      Have you tried running any tracing of the program?
  • Hmmm ... (Score:5, Interesting)

    by crmartin ( 98227 ) on Monday March 14, 2005 @06:46PM (#11938437)
    That difference is so dramatic that I wonder if you made a mistake in your functional implementation? Or is there something specific about your dynamic program that makes trouble?

    Dynamic programming depends basically on memoization (not "memorization", before someone complains about my typo) which inherently means preserving some state. If you don't preserve state, it becomes a good old, likely exponential time, recursive program. Any chance your implementation is not memoizing?
    • Maybe rtfa? ("My results")
      • When you grow up, you'll learn that if the timing differs by three orders of magnitude from expectations, your problem is algorithmic; you'll smell that before you even need to read the code.

        Shortly after that, you'll also learn that if your dynamic programming solution is running 960 times longer than expected, the odds are awfully good that you've manage to make an exponential implementation of what should be a poly-time solution.
        • Re:Hmmm ... (Score:1, Informative)

          by Anonymous Coward
          How long does it take for one to learn that you should be humble enough to make sure the program you wrote even halts before publicly complaining and posting results?

          Furthermore, looking at the ml code shows some questionable practices. He reinvents the map function, which isn't nessecary and could cause relative slowdowns if the ocaml library version has been optimized. It shouldn't take 16 minutes, though. I don't have an ocaml compiler installed at the moment, so I can't really verify and delve into why
    • Or maybe just a plain old infinite loop.

      I mean it was never claimed to have umm finished corectly. I mean if every time a complicated program (dynamic programming counts as complicated/non-trivial in my book at least) went off and didn't finish I assumed the language sucked, well I would be out programing languages real quick.
    • Any chance your implementation is not memoizing?

      Well, he said he was memoizing in the article. You can check his code here. [sourceforge.net]

  • My "shoot from the hip" guess would be that your ocaml program is not efficient because it's not running the same algorithm. The Haskell program turns out not so slow, because Haskell is lazy, and in this particular case it's a big win.

    That doesn't mean that ocaml is slow. It means that dynamic programming in a functional programming style in an eager programming language requires a lot of care, or perhaps should even be avoided. A result that wouldn't surprise me the slightest.

    • it's not running the same algorithm

      A good observation.

      I'd recommend re-implementing the C++ algorithm in OCaml (it has classes and objects, after all), and then seeing if that is in the same ball park.

      Then you can compare that to your purely functional version.
  • You know I've implemented some real world applications recently for a contract job, and the Ocaml applications are actually faster than the C++ equivalents using the STL. So you mileage may vary based on your problem set (or Ocamlfu as the case may be). As for how Ocaml is supposed to be programmed, there's a reason Ocaml supports imperative programming, because you should use the form that is most efficient for your problem. Some programs benefit from a functional approach (and it helps if you implement
    • Show me one example where O'Caml beats C++.

      I mean this seriously. I'm both an experienced O'Caml (and SML, Erlang) programmer and C/C++ programmer. I have never, ever, had anyone present a problem that can't be done faster in C or C++. You might have to solve the problem differently, but C/C++ wins every time as far as performance goes.

      Now implementation speed and/or maintenance is something different. O'Caml has a horrible syntax though. Regular SML is much better. They should've just stuck with th
      • Show me one example where O'Caml beats C++.

        ICFP contests:

        2004 [upenn.edu] (ocaml 1st); 2003 [chalmers.se] (ocaml 1st); 2000 [cornell.edu] (ocaml 3rd, C++ 5th); >2001 [inria.fr] (ocaml tied 3rd place with C);

        Others winners: 1998 [mit.edu] C derivative, 2002 [ogi.edu] python (I think C++ actually beat OCaml this year)
      • You might have to solve the problem differently

        And therein lies the rub.

        I think almost everyone will agree that carefully written C (or ++) is faster than pretty much anything out there, including OCaml.

        I also (subjective statement warning) think most will agree that pretty much anything out there is easier to read / reason about / maintain than that C. Speed optimized C is rarely (tho not never) pretty.

        OCaml's strength is that is is almost as fast as C, almost as pretty(*) as haskell, and scales to l
    • by snorklewacker ( 836663 ) on Monday March 14, 2005 @07:31PM (#11938894)
      Ocaml doesn't support any ad-hoc polymorphism (overloading) whatsoever in functions. Methods on the other hand can be overloaded, but not generic. This sort of thing makes it weaker than even C++ for generic programming, let alone Haskell, though I must admit not having to use template syntax makes me want to claw my eyes out a good deal less when reading it (or my hair when writing). Modules simply don't do it for me. Having to differentiate between HashTable.insert and SkipList.insert sort of defeats the purpose of abstract types, because no one thought to make module signatures themselves first-class (except Alice).

      Haskell type families are just elegance and beauty itself, but doing state in Haskell is an exercise in raw tedium. Very localized state (in one function) is easy enough, but anything more pervasive and you soon become more familiar with monads than you ever wanted to be. If you want a haskell program that doesn't suck up more memory than emacs, you have to stay away from many modern features so your program will compile with nhc98.

      Ocaml isn't seeing a lot of new work going into it -- the language definition seems to have become cast in stone. Haskell is always evolving, though typically in ways that are really impenetrable to those of us without PhD's in category theory and denotational semantics.

      I guess I could search the world over for my holy grail FP language, and always be dissatisfied...
      • by jefu ( 53450 )
        I quite like haskell in general and finally figured out enough about monads so I can actually use them.

        But now it all seems to be about Arrows. AAARRRGGGHHH!

        Too bad. Haskell really is one of the coolest languages around.

        • Check out Epigram, hopefully they'll think of a better "helper" program, and a syntax that doesn't kill you to write without the helper (and that is clearer even with the helper).

          And one day it may even be capable of I/O.
      • by ijones ( 83977 ) on Tuesday March 15, 2005 @12:54AM (#11940911)

        It's not actually the case that Haskell "forces" functional purity, at least not in the way the submitter seems to think. You can do things that are a LOT like non-pure functions, you just have to use Monads. You have the so-called "unsafe" functions, which perform side-effects in otherwise "pure" functions.

        So you might ask, "If you're going to write code like that in Haskell, why not just use C++." The answer is because even when using Haskell in a non-idiomatic way, Haskell is still more beautiful :)

        Monads are a means of threading "stateful" code in a very clean and predictable way through your programs. The parent's comment, "Very localized state (in one function) is easy enough, but anything more pervasive and you soon become more familiar with monads than you ever wanted to be," is sorta like saying, "You can write high-level code in C++, but you will soon become more familiar with objects than you ever wanted to be."

        They are indeed a part of the language, and definitely a new concept, but monads aren't nearly as confusing as people seem to think, certainly not more confusing than objects, it's just a reputation issue that makes people think monads are confusing. Take it from a random-joe hacker like me. You don't need a PhD to perform IO in Haskell.

        For instance, here's a basic implementation of 'cat' in Haskell:

        import System.Environment(getArgs)

        main = do
        {a <- getArgs;
        lines <- mapM readFile a;
        putStr (concat lines);}

        The code:

        a <- b
        is similar to assignment.

        getArgs just reads in the command-line arguments as a list, so 'a' represents a list of the filenames.

        readFile takes a file name, reads the contents, and returns it as a list of lines.

        mapM means 'perform this computation once for each item in this list'

        putStr is obvious, concat just takes a list of lists and turns it into a single list.

        There's a paper, Tackling the awkward squad: monadic input/output, concurrency, exceptions, and foreign-language calls in Haskell [microsoft.com] about how to do these kinds of "real-world" things in Haskell.

        There's also a very cool version control system called darcs [darcs.net] that's written in Haskell, and recently an implementation of Perl 6 called Pugs [pugscode.org] in Haskell.

        peace,

        isaac

        • That will work out very slow in at least ghc and hugs (not sure about nhc) as they seem to build the whole string in core before putStr can do its business. Use (mapM_ putStrLn lines) for the last line of code instead to get iterative behaviour instead of recursive.
        • They are indeed a part of the language, and definitely a new concept, but monads aren't nearly as confusing as people seem to think, certainly not more confusing than objects

          It isn't that monads are confusing, but that they're a contagion. Now, I'm not a very experienced Haskell programmer; in fact, I'm not a very experienced functional programmer, so I'm probably just making some dumb mistakes... but almost every application that I write goes through the following evolution:

          1. Application starts out at
      • You said:

        Having to differentiate between HashTable.insert and SkipList.insert sort of defeats the purpose of abstract types, because no one thought to make module signatures themselves first-class (except Alice).

        I've never seen a real good reason this is usefull, myself (and yes, Virginia, I have done signifigant programming in C++ and other languages with this ability). Many data structures have the same or similiar implementations, but generally different performance characteristics. When I use a par

        • > Generic implementations of algorithms? Rare, in my experience.

          Sincerely delusional to call it rare. It's an entire paradigm of programming. For example, you cannot generically write a functor that maps over any container structure if the algorithm needs to be aware of every possible type of container. Criminy, those were off the top of my head. How about Socket.read and Pipe.read? (I'm not familiar enough with the standard library). Heck, if python, perl, and even C have polymorphic behavior the
        • The insert example isn't great. Despite using the same name and similar end results, these are unlikely to have the same performance characteristics, which is likely to be a significant difference in practice.

          However, consider a simple function such as map, which fundamentally takes some structured data, and produces a new set of data with the same structure, built from the original elements acted on by some function. This concept is applicable to a wide range of data structures, and writing it polymorphi

      • by Anonymous Coward
        Ocaml doesn't support any ad-hoc polymorphism (overloading) whatsoever in functions.

        This is an annoyance more than anything else... e.g. writing "+." for floating point addition instead of "+"...

        Methods on the other hand can be overloaded, but not generic.

        Actually, the language _has_ been extended recently to allow a restricted form of generic methods. For example, you can now do fold's using a syntax like this:

        class ['a] list l =
        object
        val mutable _data = l
        method fold : 'b . ('b -> '
        • Yowza, mod parent up into the stratosphere. I stand at least partly corrected, not being aware of generic methods. I know about functors, but the disconnect between them and the actual language just feels so bloody bureaucratic compared to type classes -- which are also in a separate language, but it's a small one in comparison.

          I guess I should take another look at Ocaml. My other long-running gripe about ocaml was its lack of machine words (like UInt64), and while I'm not fond of having only a library
      • Having to differentiate between HashTable.insert and SkipList.insert sort of defeats the purpose of abstract types

        I disagree. The purpose of abstract types is to syntactically enforce layers of abstraction. Overloading can save you typing (in my opinion, it usually makes programs less clear in the process), but it doesn't really have anything to do with abstract types.

        Except for some questions of how they'd interact with the module system, Haskell's type classes are compatible with eager functional/imper

        • Haskell's type classes are compatible with eager functional/imperative languages like ML


          I seem to recall that someone tried a haskell compiler with a carefully broken strictness analyzer: it claimed that every function was strict.

          Unfortunately, it didn't work, I think due to required laziness in the prelude. I don't recall it being impossible.

          Would be a fun language to play with.
  • by jd ( 1658 )
    D is supposedly superior in almost every regard to C++. There is a D interface to GCC produced by a third-party. It's not tested in the shootout, but the reference implementation of D (which is stand-alone) is.


    Occam is probably the most scalable language out there, as it was designed for massively parallel systems.


    There's a PL/I interface to GCC. I dare someone to write a Gnome or KDE interface for it...

    • by Anonymous Coward
      D has no support though. At least when compared to other modern languages (C/C++, C#, Java, Perl, Python, etc.). It really not all that different from C++. It has some better syntax and extra features, but it's simply non-standard and the tiny improvements aren't worth the risk of using something unproven. It is a dead end.
      • True, maybe, but so was PHP when it started and Python 1.x had only a single developer for the core system. Java is largely under the control of Sun, with very very few clean-room implementations. Kaffe is one of the few I know of, and it is far from feature-complete.

        On the flip-side, there are plenty of languages which were "official" standards, once, but which are now all but dead. ADA was the "standard" language of the DoD, for example, was (and is) very well supported, but is simply incapable of compe

        • Java is largely under the control of Sun, with very very few clean-room implementations. Kaffe is one of the few I know of, and it is far from feature-complete.

          There are actually quite a few clean-room Java implementations. The GNU Classpath homepage lists 14 Java VMs [gnu.org] that use the Classpath [gnu.org] clean-room implementation of the Java class libraries.

          For the record, Classpath is 99.79% function complete relative to JDK 1.1 according to the comparison linked from the home page. The numbers drop off with JDK

  • I'd like to see the Haskell sources for comparison.
    http://minorgems.sf.net/Haskell.hs doesn't exist, though the the C++ and OCaml code are there.
  • apples and oranges (Score:3, Insightful)

    by e aubin ( 121373 ) on Monday March 14, 2005 @07:21PM (#11938792)

    Usings lists in all circumstances because its functional is not appropriate. Show the ocaml implementation using arrays or an c++ implementation using linked lists for a valid comparision.

    The strength of Ocaml is the flexibility it provides to a developer. If your solution is more elegantly coded using imperative constructs, then use them!
  • Speed alone (Score:1, Flamebait)

    by idiotfromia ( 657688 )
    If you're just looking for speed, try assembly. I hear all the good programmers are using it these days.
    • Re:Speed alone (Score:2, Interesting)

      by ameline ( 771895 )
      I know you think you're joking, but you're actually right -- that's what the good programmers do. In my application, the performance critical routines (on the order of a dozen or so) are hand coded in vector assembler (Altivec on Mac, and MMX and SSE on Intel), and are about 3 to 4 times faster than the most optimized C or C++ implementation of those algorithms. If you have code that is vectorizable, and especially if it is doing saturated small integer math, (blending, resampling etc), you can do way bett
      • Ameline:

        Um... not true.

        I consider myself an expert assembly level programmer. I recently took on a job optimizing some C code. The code was the back end bi-level compression code in a JBIG library. Bit twiddling, and updates to a compression table.

        No big deal, I though. Unrolled the inner loop, and extracted patterns (as in "this case cannot possibly happen on the next n bits being 0, so bulk update and skip forward"). I wrote a program to generate the cases of interest, and tested.

        Very good results --
        • how about intel's compiler?
          • Since the job was T&M, I didn't have the freedom to explore Intels compiler. Sorry -- and it would have been very interesting.

            Ratboy.
          • I've tried 'em all. Intels is good, but when it comes to toe sort of code I was talking about in my original post, you can still get a factor of 2 or 3 over whan it can do with scalar code, if you havd code a vectorized MMX or SSE version.
        • I did explicitly say "if it's vectorizable", and "particularly if it uses small integer saturated math". I doubt any autovectorization will recognize the conditionals necessary for saturated math and make efficient use of MMX instead. (I worked on the back end of an optimizing compiler at IBM for 6 years, so I know a bit about how compilers work :-)
  • lookup table (Score:3, Interesting)

    by jefu ( 53450 ) on Monday March 14, 2005 @07:48PM (#11939030) Homepage Journal
    Looking at the ocaml code and in particular at the functions that handle the memoizing, I'm wondering how big those memo tables are getting. If they get big it seems quite possible that the overhead of doing the lookup (or if the lookup is fast) of doing the insert might not be a problem. It should be easy enough to just look and see how big the table is getting.

    In the same kind of vein, has the code been profiled? I'd quite like to see where the time is going.

    • Re:lookup table (Score:5, Insightful)

      by PylonHead ( 61401 ) on Monday March 14, 2005 @08:48PM (#11939530) Homepage Journal
      Yea, you're not kidding.

      I just looked at the code, and he's memoizing the function results in a associative list:

      (* Set up an associative list for memoization *)
      let lookup key table = List.assoc key !table;;
      let insert key value table = table := (key, value) :: !table;;

      Insertion is cheap, but the lookup is a linear table scan! Doh! What was he thinking?

      I suspect that a Hashtable or a Map datastructure might be much better suited to the task.

      In any case, it would have been very easy for him to post this code to the OCaml newsgroup and ask, "Am I writing good functional code?"

      He would of gotten a lot of advice on how he could have sped up his program while still maintaining a functional style.

      Lastly, in response to his question, "I could write an OCaml implementation that is isomorphic to the C++ code (using loops and side effects), but what would be the point?" The point is that you can easily mix and match styles in OCaml.
      You can write 90% of your code in a functional style and fall back to imperative style if there is an inner loop that would benefit from that.

      For this problem though, I suspect that a well written functional version would be pretty close in speed to his C++ version, cleaner, and easier to maintain.

      • Yup, thats what it looked like to me, but I'm not enough of an OCaml expert to know for sure if that was what he was doing, and I wasn't sure about how to profile it and find out how big the lists were getting. If the lists are always short, there should be no problem, but if they get long, things could get very slow.

    • Looking at the ocaml code and in particular at the functions that handle the memoizing, I'm wondering how big those memo tables are getting.

      Well, this is not the problem : at the end of the execution, his table are empty. The problem is that the memoization is not done. Well after using Hashtable in place of assoc list, and after having realy memoize all function (only one had the problem) I've a good speedup...
  • I've seen different C compilers with all speed optimizations enabled generate code for the same problem that differed in performance by a factor of 20. This was not consistent with other benchmarks I had seen on the web comparing the same compilers. The benchmark was an html compressor. I'd share more, but unfortunately the code was automatically deleted by XP's chkdsk after a minor crash.

    One algorithm is sometimes not enough to accurately judge between compilers. But I guess if it's OCaml, then sure, C++
  • Haskell code? (Score:5, Informative)

    by Pseudonym ( 62607 ) on Monday March 14, 2005 @11:00PM (#11940393)

    Can we see your Haskell code?

    Haskell is not known for raw speed, but dynamic programming is probably the one thing it does well, thanks to lazy evaluation. You fill a CAF with unevaluated function calls, and the language engine does the rest. It won't be as fast as the hand-crafted C++ version, most likely, but if your O'Caml code is anything to go by, it might be able to be improved.

  • by Tom7 ( 102298 ) on Tuesday March 15, 2005 @12:00AM (#11940693) Homepage Journal
    One of the joys of programming in ML is that you can write most of your code in a really nice, functional way, and (if necessary) put in the effort to write a tight inner loop, perhaps in an imperative style for speed. I don't see this as a disadvantage, and ML compilers often do a better job of optimizing such loops than C compilers, in part because of more information being available in the type system. (And if they don't, it's trivial to invoke C subroutines.) Also, if performance is really an issue, you might try mlton (which is for SML, very similar to Caml); its whole-program approach often produces significantly better code than O'Caml.

    However, as an every-day ML user I find it very unlikely that your program would be a thousand times slower if you're using it "the way it's meant to be used." I am guessing that your implementation is asymptotically worse, since using map and fold correctly should really only be a constant factor slower than C, at worst. (mlton can often inline and optimize these into essentially the same code you'd write in C!) How about posting your code?
  • by vanicat ( 162345 ) on Tuesday March 15, 2005 @03:40AM (#11941444)
    I've not read all the ocaml code, but I've seen at the very begining this:
    let rec printCells cs =
    match cs with
    | [] -> ""
    | (c::rest) -> (printCell c) ^ (printCells rest);;
    And I know that this program will be slow. The ocaml string concatenation operator copy both string each time it is called, and this concatenation work will take O(n^2) step.

    You should use the Buffer module, or String.concat:
    let rec printCells cs =
    String.concat "" (list.map printCell cs)
    If there is a lot of those mistake, no wonder it is so slow...
  • Well, I tried the code and on my computer it ran in about 10 seconds (ocamlc/bytecode) and in about 7.6 seconds (ocamlopt/machine code). So, what's the problem this discussion is about?
    • Well, ok, it's not 7x3 in the original code.

      But after looking into the code in more detail,
      I saw that many (the most (nearly all (all?!)))
      recursive functions aren't tail-recursive.

      With such simple recursive approaches
      it is no wonder that the program is slow.

      Try to read SICP and learn to write tail-recursive
      code, then the code will be reasonably fast!

      At least the functions that will be called often
      should be rewritten into tailrec functions.

      ocamlprof will help to find out, which functions
      are called most
  • by Anonymous Coward on Tuesday March 15, 2005 @04:26AM (#11941572)
    I see nothing wrong in your C++ version, while your ocaml version clearly sucks: you are memoizing using a complex key, and an association list, meaning that accessing memoized information costs a lot.

    If you are concerned by performance, you should use a complete cache, like in your C version.
    FYI, I uploaded an ocaml translation of your C code. It doesn't use mutable state except for memoizing, and uses pattern-matching on lists, and recursion rather than for loops, but otherwise it follows closely your code. Performance should be very similar.

    http://wwwfun.kurims.kyoto-u.ac.jp/~garrigue/garde n2.ml [kyoto-u.ac.jp]
    • Interesting, having tidied up the C++ code, this OCaml code is still less than half the length. The original OCaml clearly had several unused functions, several pointless reimplementations of the library functions and many comments regarding Haskell (?!).

      In terms of performance, I get:

      $ ./garden2
      real 0m16.951s
      user 0m16.870s
      sys 0m0.010s

      $ ./Garden
      real 0m10.200s
      user 0m10.160s
      sys 0m0.010s

      So OCaml wins on performance per LOC. :-)

      However, this C++ is also very poorly written IMHO. Specifi
  • You wrote it in pure C! You are comparing Ocaml with C, not C++! You are just using the C in C++.
    Try writing your program fully using OO and templates. I dare you.

    Another thing: You OCaml code is shity. Im not asking you to tinker with optimisations, but just to use the correct data structures.

    It's not fair compare lazy OCaml code to simpleminded C code.
  • Email response (Score:5, Interesting)

    by jcr13 ( 203471 ) on Tuesday March 15, 2005 @08:38AM (#11942528) Homepage
    > Here's a laundry list of why your O'Caml program in inefficient:
    >
    > 1. You use lists. Lists aren't designed to be fast (computationally)
    > to use. They're designed to be fast (programmatically) to use. You'll
    > be hard pressed to find a production, speed-sensitive Lisp or O'Caml
    > program that uses lists.

    Okay... but here's my point: Every single example that shows how elegant Haskell and OCaml are uses lists. The 4-line Quicksort example for Haskell uses lists. All of the code that demonstrates easy reuse of functions and functions taken as arguments uses lists (like how easy it is to implement quite complicated algorithms using only map and filter, for example).

    So, proponents say "Everyone should use functional languages because they can express complicated problems in elegant ways and result in cleaner, more reusable code."

    But what you're saying in #1 above is that in "production," speed-sensitive code, no one is using lists... this would mean that no one is using map, filter, or any other pieces of reusable primitive code. So, they are instead all using mutable data structures... I.e., they are programming with side-effects and loops (random access instead of recursion, even when ever element of an array/list needs to be accessed/processed).

    That was my point exactly. If you write elegant OCaml code using all of the lovely (and I mean lovely, really) tricks that they present when they demonstrate why OCaml is cool, you end up with code that is too slow to use in the real world.

    I would say that my C++ (or most would call it C) implementation is elegant enough... easy to understand... no messy optimization tricks. Sure, I'm not using objects and templates everywhere, but these structures are hardly needed to solve this simple problem.

    > 2. Practically none of your functions are written tail-recursively.

    Good point.

    > 2.5. You use a list append (@) inside a loop (generateStates).
    > List.append is O(m), where m is the length of its first argument. If
    > you write an implementation, you'll see why. It probably doesn't make
    > much of a difference here (generateStates is only called once) but it's
    > something to watch out for.

    Of course, as you point out, generateStates has almost no effect on the running time. However, I wonder how you might implement that in an elegant way in OCaml without @. In C, I just looped over all numbers between 0 and 2^stateLength and converted the bit representations for the numbers to cell on/off states.

    > 3. For Pete's sake, man, you're using an association list for your
    > memos! Surely you know that lookup in an association list is O(n) in
    > the size of the list.

    I simply Googled for "memoization Ocaml" and found that code:
    http://www.emeraldtiger.net/modules.php?op= modload &name=News&file=article&sid=9

    The author pointed out how "sweet" polymorphism is... one block of code that can be used to memoize any function. Sweet indeed, and it certainly sped up my OCaml code a lot (without memoization, it was so slow as to be intractable for anything larger than about 4x4).

    So... maybe you can re-write higher-order memoization code using more efficient data structures? I would love to see that code, and I'm sure the OCaml community would benefit from having that in their toolbox.

    I agree that the memoization code is probably the problem in the OCaml version. However, this code came directly from the OCaml community and was the *only* example of memoization in OCaml that I could find.

    For Haskell, I used an infinite list of results that was filled in lazily as the results were needed. This also sped up the algorithm dramatically. However, I cannot get a Haskell compiler to compile itself on my platform, so I was testing all code in the Hugs interpreter, which made it too slow to be practical. Isomorphic compiled OCaml code was hundreds of times fast
    • Re:Email response (Score:3, Insightful)

      by be-fan ( 61476 )
      But what you're saying in #1 above is that in "production," speed-sensitive code, no one is using lists... this would mean that no one is using map, filter, or any other pieces of reusable primitive code.

      Only to the extent that all of your production code is speed-sensitive. The hot-spots are generally only 10% of the code. A lot of very complex code (eg: configuration stuff), only gets run very rarely. Using high-level languages has a certain design procedure. You write everything at the high-level, and
    • jcr13, I feel your pain. I've hit the exact same wall and bounced in nearly the same fashion.

      There are tons of examples of OCaml code out there, but whenever you try and use them you'll find they're written for elegance, not for any real-world metric. It makes for good propaganda, but not much help for anyone trying to do anything real. Asking for help and showing this code in most communities results in a series of curt, bitter responses from many members of the community.

      So even though all kinds of arro
      • Asking for help and showing this code in most communities results in a series of curt, bitter responses from many members of the community.

        That's a very important point, and I think the effect of the community around a programming language (or indeed a whole paradigm -- functional, OOP, etc.) is often underestimated.

        Just look at PERL: it has its merits, but in fairness clarity to newcomers is rarely listed as one of them. And yet, because asking a question on a PERL group tends to result in joyous cri

    • Re:Email response (Score:4, Informative)

      by Fahrenheit 450 ( 765492 ) on Tuesday March 15, 2005 @01:45PM (#11945358)
      Okay... but here's my point: Every single example that shows how elegant Haskell and OCaml are uses lists. The 4-line Quicksort example for Haskell uses lists. All of the code that demonstrates easy reuse of functions and functions taken as arguments uses lists (like how easy it is to implement quite complicated algorithms using only map and filter, for example).

      Or perhaps more correctly, "every single example that you've seen". For a real quick one, look at Jason Hickey's Intro to OCaml (pdf) [caltech.edu] and have a quick peek at his Red/Black tree implementation. Or even cooler (if you're into that sort of thing) is the ever famous One Day Compilers [venge.net] talk.

      But what you're saying in #1 above is that in "production," speed-sensitive code, no one is using lists... this would mean that no one is using map, filter, or any other pieces of reusable primitive code. So, they are instead all using mutable data structures... I.e., they are programming with side-effects and loops (random access instead of recursion, even when ever element of an array/list needs to be accessed/processed).

      No. What he's saying in that you should use the best data structure for the job. Your best bet would have been to use the Hashtbl [inria.fr] module from the standard library, or if you wanted to stay in the purely applicative, the Map [inria.fr] module (also in the standard library) would have been loads faster...

      You are aware that there are more purely functional data structures (pdf) [cmu.edu] (OCaml implementations) [oefai.at] than the list, don't you?

      So... maybe you can re-write higher-order memoization code using more efficient data structures? I would love to see that code, and I'm sure the OCaml community would benefit from having that in their toolbox.

      Here's a pretty neat example [ed.ac.uk] that uses arrays in a naive way, but you could certainly use, say, a map instead... And I'm pretty sure the OCaml community (by which I mean the people who would have helped you improve your code [inria.fr] had you asked them) know about things like this.

      I don't think I spread any falsehoods. I mean, my experiment was real, and the results are real, and the code is there for people to inspect and try on their own. I also talked in my /. post about OCaml code that is isomorphic to C being fast, but functional code perhaps not being fast.

      Yes. And we inspected it, found it to be poorly written, and told you so. The "falsehood" here is that you claim that code written in a functional style is slow, when you really should have said "my code written in a naively functional style is slow". If I fill my gas tank with water, my car sure is slower than walking, therefore all cars are slower than walking... right?

      Trust me... I am *dying* to use OCaml or Haskell for real-world programming. I have spent the past month or so exploring these languages and trying to apply them to real programming problems. Especially when shootout results showed that OCaml was sometimes faster than C, and when I discovered that OCaml was much faster than Hasell, I was really starting to think that OCaml was a possibility.

      I put a link to tho OCaml mailing lists above. Use it. Ask questions of the list (you may want to start with the beginner's list). They can help you learn the language faster and better than google will.

      However, the ONLY reason why I would want to use OCaml is to take advantage of the expressiveness of pure functional programmin
      • Great response. IMHO the article and jcr13 should be modded "-1 fuckwit". He has drawn some very wrong conclusions from his flawed experiment.
        The "falsehood" here is that you claim that code written in a functional style is slow, when you really should have said "my code written in a naively functional style is slow"
        Quite so. If only he realised.
    • Re:Email response (Score:4, Interesting)

      by Fourier ( 60719 ) on Tuesday March 15, 2005 @02:48PM (#11945994) Journal
      So... maybe you can re-write higher-order memoization code using more efficient data structures? I would love to see that code, and I'm sure the OCaml community would benefit from having that in their toolbox.

      You get a significant boost just by dumping the list memoization in favor of a hashtable implementation. I'm not necessarily saying that's the optimal choice, but it's an easy drop-in replacement that is much better suited to the task. Here's a patch:
      --- Garden.ml 2005-03-14 13:22:04.000000000 -0500
      +++ Garden2.ml 2005-03-15 14:38:34.000000000 -0500
      @@ -135,8 +135,8 @@ let costList = map cost allStates;;

      (* Set up an associative list for memoization *)
      -let lookup key table = List.assoc key !table;;
      -let insert key value table = table := (key, value) :: !table;;
      +let lookup key table = Hashtbl.find table key;;
      +let insert key value table = Hashtbl.add table key value;;

      (* memoize any 3-parameter function *)
      @@ -150,7 +150,7 @@ let memoize3 table f x y z =
      result;;

      (* table for memoizing optLayout *)
      -let isCovered_table = ref [];;
      +let isCovered_table = Hashtbl.create 100;;

      (* checks if each cell in center colum is covered by an empty cell *)
      let rec isCovered c1 c2 c3 =
      @@ -266,7 +266,7 @@ and memo_fib n = memoize fib n;;
      *)

      (* table for memoizing optLayout *)
      -let optLayout_table = ref [];;
      +let optLayout_table = Hashtbl.create 100;;

      (*
      Also: learn to use the profiler! It takes about five seconds to see that camlList__assoc is killing you.
    • Production ML code sure does use lists, and they are efficient for what they are. But if you want fast random access, lists are simply not the right data structure. (And you don't even necessarily need to resort to imperative structures to get this; functional splay trees or functional arrays can get you at least good asymptotic performance, if not good constants!)

      Anyway, map and fold are available for arrays as well as lists.
  • Look at the shoot-out with only tests that are all available for C, C++ and O'Caml are included and weighted for both CPU and Memory. It paints quite another picture [debian.org].
    • Look at the shoot-out with only tests that are all available for C, C++ and O'Caml are included and weighted for both CPU and Memory. It paints quite another picture.

      Hm. Slashcode messed with the URL. Use good old copy and paste:

      http://shootout.alioth.debian.org/great/benchma r k. php?test=all&lang=all&sort=fullcpu&xfullcpu=1&xmem =1&xloc=0&ackermann=1&wc=3&echo=5&fannkuch=0&fasta =0&fibo=1&harmonic=0&heapsort=4&knucleotide=0&mand elbrot=0
      • And if you weight all tests evenly, rather than giving some more importance than others you get an even different picture...

        Also with cut and paste goodness (mind the injected spaces)...

        http://shootout.alioth.debian.org/great/benchma r k. php?test=all&lang=all&sort=fullcpu&xfullcpu=1&xmem =1&xloc=0&ackermann=1&wc=1&echo=1&fannkuch=0&fasta =0&fibo=1&harmonic=0&heapsort=1&knucleotide=0&mand elbrot=0&matrix=1&nbody=0&nsieve=0&nsi
  • Have you considered updating that web page - keep the content you have there now, but possibly tack on some of the feedback from here that shows that yes, a very, very bad implementation of an algorithm will run slowly, regardless of the language. It would be a shame of someone ran across the page seeking a realistic comparison, and didn't look deep enough to realize that this "study" is basically the equivalent of comparing the gas mileage of two makes of automobile without informing the readers that one
  • by Anonymous Coward on Tuesday March 15, 2005 @12:58PM (#11944864)
    This is a 10 minute proof-of-concept that Haskell shouldn't lag as much as claimed. It's hardwired for n*3 grids, doesn't use memoization or arrays, and it solves 7*3 in 15 seconds on my ancient hardware. 15 lines of code, not astoundingly elegant, but no optimization tricks at all. If anyone cares I will write a generalized version to kick C++'s arse later.
    import Word; import Bits; import List

    collength = 7
    full = 2^collength-1

    selfs c = (shiftR c 1) .|. c .|. ((shiftL c 1).&.full)

    invert c = foldl (.) id [if(testBit c i)then(flip setBit (collength-1-i))else id|i<-[0..collength-1]]
    0

    empties c = length [()|i<-[0..collength-1],testBit c i]

    valids = [((c1,c2,c3),e1+e2+e3)
    |(c1,s1,i1,e1)<-c's, (c2,s2,i2,e2)<-c's, (c3,s3,i3,e3)<-c's,
    c1==minimum[c1,i1,c3,i3],
    s1.|.c2==full, c1.|.s2.|.c3==full, c2.|.s3==full]
    where c's = zip4 cs (map selfs cs) (map invert cs) (map empties cs)
    cs = [(0::Word32)..full]

    bests = (best,[cs|(cs,score)<-valids,score==best])
    where (_,scores) = unzip valids
    best = minimum scores
    • You guys really do live in another world don't you?

      10 lines of code ... but do you seriously thing such style makes for readability and maintainability?

      invert c = foldl (.) id [if(testBit c i)then(flip setBit (collength-1-i))else id|i&lt;-[0..collength-1]]

      I suppose this example from your code is one line right? Someone tell me please whether this kind of mess is standard Haskell coding practice. Because to me it looks like theres a bunch of different things going on in that line, that would be bette


      • No, IMHO its just bad style.

        I think the author of this haskell code chose to write short and fast code when speed, readability, conciseness in that order perhaps would have been a more apropriate set of priorities in the context of the cited article.

        After all, the computational costs of additional comments and descriptive name are astonishingly low in most programming languages.

        On the bright side, the code is considerably fast (it takes 2 secs on a 1.3 GHz pb with ghc-6.4 -O) and could easily be made rea
      • I suppose this example from your code is one line right? Someone tell me please whether this kind of mess is standard Haskell coding practice. Because to me it looks like theres a bunch of different things going on in that line, that would be better readable spread over a few lines with a bit of indentation, a few comments, sensible variable names etc. Or is it just my non-functional-programming ignorance that makes me think this way?

        It's not really all that bad (though I would have put a couple of newlin
        • Gerf... I knew I'd screw something up.

          This:
          That is to say, given a base value b, an operator, op, and a list [l1, l2, l3, .., ln], we compute ((((l1 op l2) op l3)...) op l(n-1)) op ln).

          Should be:
          That is to say, given a base value b, an operator, op, and a list [l1, l2, l3, .., ln], we compute
          (((((b op l1) op l2) op l3)...) op l(n-1)) op ln).

          And consequently this:
          (f1 (f2 (f3 (... (fn (id 0))))))), or more correctly, computing the function f x = (f1 (f2 (f3 (... (fn (id x))))))) and applying it to 0

          S
  • by j h woodyatt ( 13108 ) <jhw@conjury.org> on Tuesday March 15, 2005 @06:34PM (#11948195) Homepage Journal
    If you want to program in a functional style, and you need lazy evaluation, you're going to find the standard library that comes with the compiler somewhat limited.

    I wrote some extensions for programming in OCaml in the functional style. Check out the OCaml NAE [sf.net] project, and look for the Core Foundation (Cf) package.
  • Here's my first attempt at a version which performs decent in Haskell. It doesn't do any memoization, and isn't particularily dynamic. It ended up being only 35 non-blank lines of code. The 9x3 case takes 26 seconds to complete on my Athlon 1600 compiled with GHC 6.4. Erg. You'll have to grab a copy here [sleepingsquirrel.org] as /. thinks my Haskell code has too many 'junk' characters.

"Inquiry is fatal to certainty." -- Will Durant

Working...