Please create an account to participate in the Slashdot moderation system


Forgot your password?
Programming Books Data Storage Media Book Reviews IT Technology

Purely Functional Data Structures 427

andrew cooke writes "A while ago I read the comments following a Slashdot book review. Someone had posted a request for books that covered a wider range of languages than Java, C, Python, etc. Well, I thought, why not review Okasaki's Purely Functional Data Structures? It's a classic from the underworld of functional programming - recognised as the standard reference, yet clear enough to work as an introduction to the subject for anyone with a basic functional programming background. Of course, some readers won't know what functional programming is, or what is special about pure data structures. So I hope that this review can also serve as something of an introduction to the languages that I (a software engineer paid to work with Java, C, Python, etc) choose to use in my spare time, just for the joy of coding." Read on for the rest; even if you're not planning to give up C or Perl, there are links here worth exploring.
Purely Functional Data Structures
author Chris Okasaki
pages 220
publisher Cambridge University Press
rating 8/10
reviewer Andrew Cooke
ISBN 0521663504
summary Functional programming for grown-ups.

In Okasaki's introduction he says that the "[...] benefits of functional languages are well known, but still the vast majority of programs are written in imperative languages such as C. This apparent contradiction is easily explained by the fact that functional languages have historically been slower than their more traditional cousins, but this gap is narrowing."

Indeed, OCaml has a reputation for being "as fast as C," yet it contains automatic memory management and supports object-oriented, as well as functional, programming. It's also probably the most widely used functional language outside academia (except perhaps Lisp/Scheme).

I mention OCaml not just because it's fast, free and popular, but because Okasaki uses a related language - ML - in his book. The ML family of languages are the standard strict, functional languages (Standard ML of New Jersey is perhaps the reference implementation but see also Okasaki also includes an appendix with examples in Haskell, which is the standard lazy functional language.

The difference between lazy and strict languages is the order in which code is evaluated. Most languages are strict. Unlike most languages, Haskell only evaluates something when it is absolutely necessary. Each parameter to a function, for example, is passed as a "thunk" of code, not a value. If the value is not required inside the function, the parameter is left unused; if it is required (say as part of a result that needs to be displayed) then the thunk is evaluated. This evaluation may trigger a whole slew of evaluations of functions that "should" have been called earlier (from a Java programmer's point of view).

Laziness is both good and bad. The bad side is obvious: the order in which code is executed my be very different from the order in which the program was written and some serious book-keeping is necessary in the compiler to juggle the thunks of code and their final values. The reordering of code could cause mayhem for IO operations, for example (in practice, of course, Haskell includes a solution to this problem).

The good side is that laziness can help make programs more efficient and, while the definition of ML doesn't include laziness, individual ML implementations -- including OCaml and SML/NJ -- include it as an extra.

Much of Purely Functional Data Structures (the second of three parts) focuses on how to use laziness to make data structures efficient. Lazy evaluation allows book-keeping actions to be postponed, for example, so that the cost of maintaining the data structure in an efficient form can be averaged across several read/write operations (improving worst case limits - avoiding a very slow response if the data happen to be in a "bad" order).

An understanding of how the efficiency of algorithms is rated (the big-O notation) is one piece of knowledge that this book does assume, along with a basic grasp of what Stacks, Queues, Trees, etc, are.

This lazy boost in efficiency is needed because, even though functional languages may be getting faster, it's not always possible for them to implement the efficient algorithms used in imperative (non-functional) programming.

But I'm getting ahead of myself, because I haven't described what a functional language is, or why it is useful. These are the topics of the first part of the book, which explains how functional languages, which make it impossible to change variable values by direct assignment, support persistent data structures. This section is fairly brief, and while it's a good refresher course for someone who's not had to worry about such things since studying at university, it's not sufficient as an initial introduction to functional programming in general.

There's a good explanation of functional programming in the Wikipedia, but, in all honesty, I don't see how anyone can really "get it" without writing functional code (just as I, at least, couldn't understand how OOP worked until I wrote code that used objects).

So forgive me for not telling you why functional programming is good (This paper is one famous attempt), but perhaps a better question to focus on is "Why should you spend the time to investigate this?" The best answer I can give is that it leads to a whole new way of thinking about programming. Describing functional programming as "excluding assignment to variables" doesn't do justice to the consequences of such a profound change (one I found almost unimaginable - how can you program without loop counters, for example?).

There's a practical side to all this too - learning new ways of thinking about programs makes you a better programmer. This ties in closely with the final part of Okasaki's book, which explores a few fairly esoteric approaches to data structures. Who would have thought that you can design data structures that parallel the way in which you represent numbers? Some of this is pretty heavy going - I can't say I understood it all, but I'm taking this book with me on holiday (it's slim - just over 200 pages) and I'll be bending my brain around some of the points in the last few chapters as I lie in the sun (here in the southern hemisphere it's late summer).

So just who would benefit from this book? It seems to me that it's most valuable as a second book on functional programming. There are a bunch of texts (and online guides) that can get you started in functional programming. This one goes further. It shows how to exploit the features of functional languages to solve real problems in applied computing. Admittedly, they are problems that have already been solved in imperative languages, but you might find that you, too, come to enjoy those famous benefits of functional languages. The algorithms in this book let you enjoy those benefits without paying the price of inefficiency.

Andrew Cooke last reviewed for Slashdot The Aardvark is Ready for War . You can purchase Purely Functional Data Structures from Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.

This discussion has been archived. No new comments can be posted.

Purely Functional Data Structures

Comments Filter:
  • Nice Review! (Score:5, Informative)

    by jaaron ( 551839 ) on Wednesday March 03, 2004 @02:06PM (#8453700) Homepage
    Nice to see some actual content on Slashdot!

    By the way, this reminds me of the recent article on Domain Specific Languages over on Martin Foweler's website []. Another aspect of programming worth investigating.
  • by Waffle Iron ( 339739 ) on Wednesday March 03, 2004 @02:14PM (#8453793)
    that supports GOTOs. Those are the best!

    Well, the functional language Scheme supports "continuations". These are kind of like GOTOs on acid.

  • Re:O'Camel (Score:3, Informative)

    by Anonymous Coward on Wednesday March 03, 2004 @02:15PM (#8453801)
    You read it like OKamell with a french accent (its developers are french).
  • by bcolflesh ( 710514 ) on Wednesday March 03, 2004 @02:21PM (#8453868) Homepage
    4.1.8. PureFun []
  • by skybrian ( 8681 ) on Wednesday March 03, 2004 @02:22PM (#8453889) Homepage
    Another way to think about functional programming is as constructor-based programming. Problems are solved by constructing lots of temporary objects that are a little closer to the solution, until you're finally in a position to construct the answer.

    You can do this in any language, but to be efficient, you need an good garbage collector and a compiler that can optimize out most of the temporary objects.
  • Re:Functionals (Score:4, Informative)

    by SparafucileMan ( 544171 ) on Wednesday March 03, 2004 @02:26PM (#8453930)
    Its a different mind-set required to write in a functional language, that's all. Some people find it a steeper learning curve, and some don't. But the effort put into learning the functional language style will be paid back 1,000 fold (at least) for the rest of your life, whereas learning a typical procedural language will only get you 10x return. Plus the pay-back on learning the functional language will apply to _all_ languages you ever learn. This is because all languages and algorithms are defined, on paper, in a functional style anyway (since it uses "math"), and the procedural super-set is just a needless complication from what is 'really happening'.

    But then again, I majored in math, not programming, so maybe I'm biased.

  • Just got this book (Score:5, Informative)

    by QuantumFTL ( 197300 ) * on Wednesday March 03, 2004 @02:26PM (#8453933)
    Nice review!

    I just got this book and it's clear the author has really done his research. His writing style is also very clear, concise and well thought out. Not overly chatty or pandering, yet not dryly accademic either. Precisely the kind of computer book I'd want to write!

    I'm glad the reviewer didn't try to talk a lot about why people should be interested in functional programs, however I must say that the ability to write large complex algorithms in very few lines, and prove mathematically that it works is simply a miracle in some situations. If you need to write a compiler (or do any other set of complex alogirthms on large recursive data structures, especially those that could take advantage of tagged unions, like Abstract Syntax Trees do) you should check out OCaml. And it doesn't hurt that it can figure out the type of all your functions and variables for you :)

    Oh, and if you happen to get this book, and want to play with OCaml, you can get the OCaml translation of the data structures in this book here [].

    I dont' know if very many programmers will ever program in a purely functional language, however it seems that languages of the future will have to include things like first class functions and closures, as they are incredibly useful. I know Ruby and Python already support a lot of it.

    Oh and in case anyone's wondering, it *IS* possible to encapsulate things like notion of state, error handling, and I/O in a purely functional language ("side effect free" language) using something called monads []. Now there's a fun concept to wrap your brain around!

    Hope some people here are brave enough to dig into a book like this that requires a bit of math, and more than a little faith at some points :)

    Justin Wick
  • Re:Functionals (Score:2, Informative)

    by e4liberty ( 537089 ) on Wednesday March 03, 2004 @02:28PM (#8453952)

    Having delivered commercial (yes!) applications on Lisp Machines, I can say with some authority that a G4 Mac with Digitool's Macintosh Common Lisp (MCL) is an excellent, probably superior substitute.

    The folks at Digitool [], and Gary's OpenMCL [] is a nice free alternative if you don't need the GUI.

  • I bought this book (Score:5, Informative)

    by johnnyb ( 4816 ) <> on Wednesday March 03, 2004 @02:33PM (#8454006) Homepage
    I bought this book about 6 months ago. I *love* it. The author did an excellent job showing many interesting algorithms. I did have to read it a few times to make sense of it, as I am not as engaged in the functional programming community as I would like. I still have trouble figuring out how to apply the banker's method and the physicist's method when determining the amortized performance of functional algorithms.

    Anyway, the book was a great read. I was really surprised to learn how efficient some of the functional data structures could be.

    Good discussion as well on the use of suspensions (lazy evaluation for specific blocks of code) in programming.
  • by SparafucileMan ( 544171 ) on Wednesday March 03, 2004 @02:35PM (#8454021)
    I should have included this link:

    Perl Contains the Lambda Calculus [].

    Working with the Lambda Calculus on a computer, as a mathematician who is used to only the brain and pen/paper, makes me alternatively want to piss my pants and orgasm everywhere. It's, like, the simplest programming language on the earth!

  • by johnnyb ( 4816 ) <> on Wednesday March 03, 2004 @02:39PM (#8454073) Homepage
    Also, since this was this guy's thesis, it's also available online

    See [].

    I suggest you get the book, however, as it's a great read.
  • Another Book ... (Score:2, Informative)

    by Anonymous Coward on Wednesday March 03, 2004 @02:40PM (#8454087)
    "A while ago I read the comments following a Slashdot book review. Someone had posted a request for books that covered a wider range of languages than..."

    Well he why not try "Comparative Programming Languages" it covers most of them.

    Here is an excerpt:

    "This book considers the principal programming language concepts and how they are dealt with in object-oriented languages such as Java and Delphi, in traditional procedural languages such as Pascal, C and Fortran, in hybrid object-oriented or object-based languages such as C++ and Ada 95, in functional languages such as ML and in logic languages like Prolog. "

    There is also a link to it:

    which I post as plain text because I'm lazy right now and also because I hope this prevents compulsive clickers (are there such people?) from going there without getting too much wisdom out of it. There is after all this much feared slashdot effect.

    Have fun
  • Scala (Score:3, Informative)

    by Channing ( 514228 ) <> on Wednesday March 03, 2004 @02:41PM (#8454095) Homepage
    Have a look at Scala [] if you are looking for a language that supports both paradigms.

    From their site:

    Scala is a pure object-oriented language in the sense that every value is an object. Types and behavior of objects are described by classes and traits. Class abstractions are extended by subclassing and a flexible mixin-based composition mechanism as a clean replacement for multiple inheritance.

    Scala is also a functional language in the sense that every function is a value. Scala provides a lightweight syntax for defining anonymous functions, it supports higher-order functions, it allows functions to be nested, and supports currying. Scala's case classes and its built-in support for pattern matching model algebraic types used in many functional programming languages.
  • by QuantumFTL ( 197300 ) * on Wednesday March 03, 2004 @02:43PM (#8454112)
    This review was awesome, but he didn't really mention what data structures are discussed. Here's the list from the TOC for those who care:

    • Leftist
    • Binomial
    • Splay
    • Pairing
    • Skew Binomial
    • Lazy Pairing

    • Binary Search
    • Red-Black
    • Trie

    • FIFO
    • Banker's
    • Physicist's
    • Double Ended

    There's also a lot of other things, such as some interesting ways of doing numerical representation in functional languages (lazy numbers!) Even talks about trinary/quaternary numbers, as discussed before on slashdot.

    Also if you'd like to see the source code without buying the book (you cheap bastard!) you can find it here [].

    But I hope you buy the book :)

  • Re:Functionals (Score:5, Informative)

    by gangien ( 151940 ) on Wednesday March 03, 2004 @02:50PM (#8454197) Homepage
    I can say that functional languages are a lot more complicated than procedural languages

    What?? More complicated? No no no.. They are certainly not more complicated. For instance the funcitonal languages:

    (display "Hello World") -Scheme

    main =
    putStr "Hello World" - Haskell


    main () {
    printf("hello world"); }- C

    print "hello world"; - perl

    Now those differences are subtle, but A. The functional languages are easier to read(and I think most any none biases person would aggree with me) B. No whacky syntax, hell scheme syntax is only () C. The data types are simple, Pointers? please. Ect ect. The thing that makes functional programming difficult is the lack of an imperative control flow. Which is how people tend to solve problems. For instance if you want to sum up the numbers 1 through 8, in an imperative language it'd be

    tol = 0
    for i = 1 to 8
    tol += i
    end for

    in scheme it'd be
    (define (sum a b)
    (if (equal? a b)
    (+ a (sum (+ 1 a) b)))

    which is confusing. And not obvious. But as for other people's code, that is always the case. Well I've never seen a language that oculd get around that anyhow, the closest I think is java.
  • by Jagasian ( 129329 ) on Wednesday March 03, 2004 @02:59PM (#8454307)
    Monadic programming is a fancy name for a pretty common sense design pattern used by functional programmers far before the theory of Monads was created. Basically you want a function that executes a list of commands, but the problem is that functions can only evaluate to pure values. So what you do is your function evaluates to a value that represents the list of commands you wanted to execute.

    So the design pattern consists of using functions that pass a state value in and out of each function, in addition to possibly other values. The pattern enforces some restrictions, one of which is: each state value can only be used once.

    So executing one command and then another involves each command being defined as a function that takes the world's state and returns the world's state after modification. Sequencing the two commands then consists of composing the functions such that the state is passed from one function to the next.

    There are additional properties required for something to be considered a Monad... and it turns out that this "design pattern" is a mathematical construct known from a branch of mathematics known as Category Theory, and that category theory construct is called a "Monad".

    A side note: category theory is basically the study of mathematical design patterns. Its more than that, but thats a good intuition for computer scientists to take when they study category theory.
  • Monads are trivial. (Score:3, Informative)

    by Kickasso ( 210195 ) on Wednesday March 03, 2004 @03:00PM (#8454309)
    Really. Try this one:
  • by QuantumFTL ( 197300 ) * on Wednesday March 03, 2004 @03:08PM (#8454416)
    Monadic programming is a fancy name for a pretty common sense design pattern used by functional programmers far before the theory of Monads was created. Basically you want a function that executes a list of commands, but the problem is that functions can only evaluate to pure values. So what you do is your function evaluates to a value that represents the list of commands you wanted to execute.

    What makes them special is the mathematical rigor that *ensures* the properties you need from the monads, and greatly aids in proving properties of your algorithms. Things like associativty, and the left/right unit properties really help trivialize a lot of programming proofs.

    It's more than just a "fancy name", and I'd say the most exciting thing about it is that it allows the use of composition of monads to compose functionalities described inside them. It makes functional programs much more modular because of this, and allows one to add all kinds of features to something like an interpreter without hardly changing the signatures of any functions! Quite impressive compared to the normal craziness associated with refactoring a functional program.

    It's interesting to see how monads are used in Glassgow's Haskell compiler... much more useful than I ever though "some weird triple thing" from category theory could be.

    A side note: category theory is basically the study of mathematical design patterns. Its more than that, but thats a good intuition for computer scientists to take when they study category theory.

    I always thought Category Theory was best though of by CS students as a replacement for Set Theory that has a function-centric view... a more pure theory of functions (with applications to the lamda calculus no less) that can usefully find common structure in many disparate fields of mathematics, especially discrete math.

    Justin Wick
  • Re:Functionals (Score:2, Informative)

    by bigbadbob0 ( 726480 ) on Wednesday March 03, 2004 @03:12PM (#8454450)
    Your scheme sum example runs fine for small summations. But for big numbers you'll blow the stack because you aren't tail recursive... a "better" solution to sum between a and b:

    (define (sum a b)

    (define (sum' mysum iter)

    (if (> iter b) mysum

    (sum' (+ mysum iter) (+ iter 1))
    (sum' 0 a) )
  • by Anonymous Coward on Wednesday March 03, 2004 @03:13PM (#8454469)
    As opposed to what

    Just to make sure everyone knows, it's as opposed to imperative data structures. "Imperative" refers to code that works through updating variables. OO languages such as Java and C++, and languages like C and Pascal are imperative. "Functional languages" do not support updatable variables.

  • by Godeke ( 32895 ) * on Wednesday March 03, 2004 @03:56PM (#8455037)
    I love functional programming. If I had my way, my projects would all use functional programming languages. I don't have my way however, and there are two reasons.

    1. Few commercial tools: Functional languages are under represented in the commercial space. With the exception of Franz Lisp and a few other lisp dialects, there is little commercial support. That may not be a killer for everyone, but I would like an environment with a good form designer and a large library to back me up. One I could give to another coder and expect them to be productive with it. Emacs works as an IDE for me, but I can't force that on others...

    2. Fewer programmers: The vast majority of programmers seem unable/unwilling/whatever to grasp the concepts and work with functional code. If you need to build a team of programmers, it is much harder to find those who can do functional programming (and when you do, they rock, but are expensive and in high demand).

    In the end, I use commonly used commercial tools so I can work with other people. Internally I use a lot of non commercial tools (LAMP model) and so I can sometimes indulge in my functional side there, but rarely can I do functional programming for my business clients.
  • by zeno_lee ( 125322 ) on Wednesday March 03, 2004 @04:05PM (#8455152)
    Programming in a functional language allows you to be more concise and expressive. Some studies indicate that development time is 4 times shorter, and the resulting code is 4 times smaller and is much easier to read.

    Take a language like Ocaml.

    It is a functional language with imperative features (loops, mutable data structures), modular organization, has object orientation, compiles to portable byte code (like Java and the JVM) as well as compiling to native code that runs as fast as C, has garbage collection, and a good standard library.

    I'm thinking the reason why there are such few people picking up functional programming is the same reason the US still uses the imperial system for measurement.

    While the rest of the world is on the metric system, the U.S. still uses the strange imperial system, that uses things like 12 inches to a foot, 3 feet to a yard, 16 ounces to a pound. That's because the US is entrenched in a mindset and there's no driving reason to change. This is the same reason why people have been stuck on imperative languages. Imperative languages have been the overriding paradigm because in the past, processing time and memory were expensive and imperative languages. This is no longer the case but there is a huge momentum of imperative language that that rolls on like a giant snowball, reinforced by industry and entrenched upon the next generation of computer programmers.

    How did I discover Ocaml?

    I was investigating rapid prototyping languages to add to my toolbelt and ran across Ocaml. I was drawn to it because it has done extremely well in the ICFP contests, especially in the lightning division (24 hour submission).

    Also it ranks impressively in the great language shootout by Doug Bagley, in terms of Lines of Code, Execution Speed, and memory consumption.

  • Re:Nice Review! (Score:5, Informative)

    by YoJ ( 20860 ) on Wednesday March 03, 2004 @04:18PM (#8455306) Journal
    This is one of my favorite books, and is largely responsible for me switching from math to CS. I had done C and Java programming before, but I always considered programming (and computer science) a messy but fun engineering activity akin to woodworking. When I read this book, I got a glimpse of the "true meaning" of programming.

    I would recommend this book to programmers that have a couple years experience with imperative or OO languages, who have also taken an introductory course in algorithms. A motivated reader does not need experience with ML, but having taken some sort of "Programming Languages" class that spends a day or two on ML is very helpful. I found it the most rewarding to learn ML and start programming in ML while reading this book, then attempting to implement the data structures in the book myself to use in small projects.

    If you don't know any functional languages and don't want to learn them, this book will not be helpful and you probably won't get enough motivation from the book itself to understand what it's talking about. If you don't really know functional languages but are willing to give them a try, this book is ideal. If you already know one or more functional languages well, you should try to read Okasaki's papers on functional data structures before getting the book. Of course you might want the book too.

  • by jasoegaard ( 103287 ) on Wednesday March 03, 2004 @04:57PM (#8455839) Homepage
    Okasaki's book is delightful. Anyone interested
    in data structures owe themselves read his book.

    I became so inspired that I implemented all the heap
    algorithms in Scheme:

    The code contains implementations of the following heaps:

    - leftist heaps
    - binomial heaps
    - pairing-heaps
    - splay heaps
    - lazy binomial heaps
    - lazy pairing heaps
    - skew binomial heaps

    The documentation is here:
  • by QuantumFTL ( 197300 ) * on Wednesday March 03, 2004 @05:05PM (#8455928)
    IIRC Osaki said there were some data structures he did not know how to implement in a pure functional language, but I don't think he said which. Are there any obvious gaps in the TOC given by the parent?

    Splay Trees weren't mentioned in the TOC but I haven't gotten thruogh the whole book yet...

    He does say that persistent arrays were about impossible to implement functionally except with a horrible access time. Random Access Lists are used instead (forgot to include that in my list).

    Not sure what else is missing... I didn't see skiplists anywhere in there or any other randomized data structures, but that might have been a matter of personal taste on his part. He also didn't talk much about perfectly-balanced search trees (expensive in imperitive langauges, rediculously so in functional languages).

    Good question though!

    Justin Wick
  • Mathematica? (Score:2, Informative)

    by gardyloo ( 512791 ) on Wednesday March 03, 2004 @05:08PM (#8455959)
    I've just started consciously focusing on functional programming techniques in Mathematica . It's almost religiously advanced in most of the Mathematica texts I've read as being easier to read and, ultimately, faster to program than most procedural algorithms one could implement to do the same things. I've definitely adopted it for some operations on lists and things, but I'll have to get more comfortable with it to apply it to everything I do in Mm . As far as I know, the Mathematica Journal even has a column devoted to solving problems with "one-liners", using only functional programming techniques. (Of course, if one is really interested in it, he could go to Journal of Functional Programming [] and educate us all.) I'm too cheap to find out for sure.

    It seems to me that Mathematica is extremely powerful in this respect: one can choose which paradigm to program in, and successfully mix paradigms, almost to the heart's content, and get useful information out. Of course, the execution speed may not be so hot, but for ease of use, it can't be beaten. What other programs out there allow such mixing?
  • by Tom7 ( 102298 ) on Wednesday March 03, 2004 @05:51PM (#8456468) Homepage Journal
    Regarding write-once variables: the reviewer makes it sounds as though this is some terrible burden or restriction that functional programmers must deal with. I offer the contrary point: immutable structures are an *invariant* that make programming much, much easier. Have you ever been in a situation where some other part of the code has an alias to something you're working with, and is modifying it when you don't want it to? Functional programming totally avoids that by having a language-enforced invariant of immutability.

    That (along with all of the other features that functional languages typically have that languages like Java and C don't, like higher order nested lexically-scoped functions, polymorphism, algebraic data types and pattern matching, parameterized modules, tuple and sum types, etc.) is the reason to do functional programming. Of course, SML and O'Caml allow you to program imperatively too, since some activities are more naturally expressed that way.

    I used to be a total C cowboy, and now I love functional programming. You can just do so much more great stuff with it.

    IMO, mlton [] is a great compiler with which to start functional (SML) programming. The performance is incredibly good, it behaves like a real compiler from the command line, and it's free (GPL) software.
  • by Anonymous Coward on Thursday March 04, 2004 @05:49AM (#8461391)
    We discussed this book in SU.SOFTW - Russian FIDO group about software engineering - and one found that aforementioned thesis differs from book. It was found that kook discuss a little bit more, for example, RB-trees.

This process can check if this value is zero, and if it is, it does something child-like. -- Forbes Burkowski, CS 454, University of Washington