How to mentally read Lisp/Clojure code

I think concat is a bad example to try to understand. It's a core function and it's more low-level than code you would normally write yourself, because it strives to be efficient.

Another thing to keep in mind is that Clojure code is extremely dense compared to Java code. A little Clojure code does a lot of work. The same code in Java would not be 23 lines. It would likely be multiple classes and interfaces, a great many methods, lots of local temporary throw-away variables and awkward looping constructs and generally all kinds of boilerplate.

Some general tips though...

  1. Try to ignore the parens most of the time. Use the indentation instead (as Nathan Sanders suggests). e.g.

    (if s
      (if (chunked-seq? s)
        (chunk-cons (chunk-first s) (concat (chunk-rest s) y))
        (cons (first s) (concat (rest s) y)))
      y)
    

    When I look at that my brain sees:

    if foo
      then if bar
        then baz
        else quux
      else blarf
    
  2. If you put your cursor on a paren and your text editor doesn't syntax-highlight the matching one, I suggest you find a new editor.

  3. Sometimes it helps to read code inside-out. Clojure code tends to be deeply nested.

    (let [xs (range 10)]
      (reverse (map #(/ % 17) (filter (complement even?) xs))))
    

    Bad: "So we start with numbers from 0 to 9. Then we're reversing the order of the mapping of the filtering of the complement of the wait I forgot what I'm talking about."

    Good: "OK, so we're taking some xs. (complement even?) means the opposite of even, so 'odd'. So we're filtering some collection so only the odd numbers are left. Then we're dividing them all by 17. Then we're reversing the order of them. And the xs in question are 0 to 9, gotcha."

    Sometimes it helps to do this explicitly. Take the intermediate results, throw them in a let and give them a name so you understand. The REPL is made for playing around like this. Execute the intermediate results and see what each step gives you.

    (let [xs (range 10)
          odd? (complement even?)
          odd-xs (filter odd? xs)
          odd-xs-over-17 (map #(/ % 17) odd-xs)
          reversed-xs (reverse odd-xs-over-17)]
      reversed-xs)
    

    Soon you will be able to do this sort of thing mentally without effort.

  4. Make liberal use of (doc). The usefulness of having documentation available right at the REPL can't be overstated. If you use clojure.contrib.repl-utils (in current Clojure releases, doc and source live in clojure.repl instead) and have your .clj files on the classpath, you can do (source some-function) and see all the source code for it. With repl-utils you can also do (show some-java-class) and see a description of all the methods in it. And so on.
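
As a rough illustration of tips 3 and 4, here is a sketch of what stepping through that pipeline at the REPL might look like. The names step-1 through step-3 are made up for this example, and doc is pulled from clojure.repl, which a standard REPL normally has loaded already.

```clojure
(require '[clojure.repl :refer [doc]])

;; Innermost form first: the input numbers.
(def xs (range 10))
;; xs => (0 1 2 3 4 5 6 7 8 9)

;; Keep only the odd ones.
(def step-1 (filter (complement even?) xs))
;; step-1 => (1 3 5 7 9)

;; Divide each by 17; Clojure keeps exact ratios.
(def step-2 (map #(/ % 17) step-1))
;; step-2 => (1/17 3/17 5/17 7/17 9/17)

;; Outermost form last: reverse the result.
(def step-3 (reverse step-2))
;; step-3 => (9/17 7/17 5/17 3/17 1/17)

;; Not sure what a function does? Ask for its docstring.
(doc complement)
```

Evaluating one form at a time like this is the same inside-out reading as before, just with the REPL doing the remembering for you.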

Being able to read something quickly only comes with experience. Lisp is no harder to read than any other language. It just so happens that most languages look like C, and most programmers spend most of their time reading that, so it seems like C syntax is easier to read. Practice practice practice.


Lisp code, in particular, is even harder to read than other functional languages because of the regular syntax. Wojciech gives a good answer for improving your semantic understanding. Here is some help on syntax.

First, when reading code, don't worry about parentheses. Worry about indentation. The general rule is that things at the same indent level are related. So in the following, the two branches of the if sit at the same indent level:

      (if (chunked-seq? s)
        (chunk-cons (chunk-first s) (concat (chunk-rest s) y))
        (cons (first s) (concat (rest s) y)))

Second, if you can't fit everything on one line, indent the next line a small amount. This is almost always two spaces:

(defn concat
  ([] (lazy-seq nil))  ; these two fit
  ([x] (lazy-seq x))   ; so no wrapping
  ([x y]               ; but here
    (lazy-seq          ; (lazy-seq indents two spaces
      (let [s (seq x)] ; as does (let [s (seq x)]

Third, if multiple arguments to a function can't fit on a single line, line up the second, third, etc. arguments underneath the first argument's opening parenthesis. Many macros have a similar rule with variations to allow the important parts to appear first.

; fits on one line
(chunk-cons (chunk-first s) (concat (chunk-rest s) y))

; has to wrap: line up (cat ...) underneath first ( of (chunk-first xys)
                     (chunk-cons (chunk-first xys)
                                 (cat (chunk-rest xys) zs))

; if you write a C-for macro, put the first three arguments on one line
; then the rest indented two spaces
(c-for (i 0) (< i 100) (add1 i)
  (side-effects!)
  (side-effects!)
  (get-your (side-effects!) here))
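
The c-for above is only a made-up macro to show the layout. For the curious, here is a minimal sketch of how such a macro could be written in Clojure. It is an assumption about what the example has in mind, not an existing library macro, and it uses inc where the example uses the Scheme-style add1.

```clojure
;; Hypothetical macro: (c-for (var init) test step body...)
(defmacro c-for
  [[sym init] test step & body]
  `(loop [~sym ~init]
     (when ~test
       ~@body
       (recur ~step))))

;; Header on one line, body indented two spaces beneath it.
(c-for (i 0) (< i 3) (inc i)
  (println "i is" i))
```

The macro itself is not the point. What matters is that the header stays on one line and the body is indented beneath it, so the loop reads as a block.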

These rules help you find blocks within the code: if you see

(chunk-cons (chunk-first s)

Don't count parentheses! Check the next line:

(chunk-cons (chunk-first s)
            (concat (chunk-rest s) y))

You know that the first line is not a complete expression because the next line is indented beneath it.

If you see the defn concat from above, you know you have three blocks, because there are three things on the same level. But everything below the opening line of the third block, ([x y], is indented beneath it, so the rest belongs to that third block.
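
To see the three blocks in one complete piece, here is a simplified sketch of a two-argument concat. It is called my-concat (a made-up name) and leaves out the chunked-seq branch and the variadic arity of the real core function, so treat it as an illustration of the layout rather than the actual source.

```clojure
(defn my-concat
  ([] (lazy-seq nil))      ; block 1: no arguments
  ([x] (lazy-seq x))       ; block 2: one argument
  ([x y]                   ; block 3: everything below belongs here
    (lazy-seq
      (let [s (seq x)]
        (if s
          (cons (first s) (my-concat (rest s) y))
          y)))))

(my-concat [1 2] [3 4])
;; => (1 2 3 4)
```

Reading by indentation alone: three forms sit two spaces in from defn, and the if has its two branches lined up beneath it.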

Here is a style guide for Scheme. I don't know Clojure, but most of the rules should be the same, since the Lisps don't vary much in this respect.