Building blocks of language, structure and thought

As I travel on my path towards what I perhaps deem some sort of enlightenment, back in time via Clojure to one of the great ancestors of language, structure and computational thought (Lisp), I continue to come across a simple theme.

Building Blocks

That theme is the concept of basic building blocks with which vast cathedrals can be constructed. Those building blocks are, in Lisp terms at least, car, cdr and cons.

One of my companions on this path is Daniel Higginbotham’s Clojure for the Brave and True. In Part II, covering Language Fundamentals, Clojure’s abstractions, or interfaces, are discussed. One of the Clojure philosophies is that the abstraction idea allows a simplified collection of functions that work across a range of different data structures. Abstracting action patterns from concrete implementations allows this to happen. This is nicely illustrated with a look at the first, rest and cons functions from the sequence (or ‘seq’) abstraction.

There’s a close parallel between first, rest & cons in Clojure and car, cdr & cons in other Lisps such as Scheme. And there’s an inherent and implicit beauty in a collection of constructs so simple yet collectively so powerful. You can read about the origins of the terms car and cdr on the Wikipedia page; they have a depth and a degree of venerability of their own. Essentially both sets of functions implement a linked list, which can be simply illustrated, as shown in the book and elsewhere, as a sequence of connected nodes, like this:

node1              node2              node3
+--------------+   +--------------+   +--------------+
| value | next |-->| value | next |-->| value | next |
+--------------+   +--------------+   +--------------+
    |                  |                  |
    V                  V                  V
  "one"              "two"              "three"

Implementing a linked list

Daniel goes on to show how such a linked list of nodes, along with the three functions, can be simply implemented in, say, JavaScript. Given that the nodes could be represented like this:

node3 = { value: "three", next: null }
node2 = { value: "two", next: node3 }
node1 = { value: "one", next: node2 }

then the first, rest and cons functions could be implemented as follows:

function first(n) { return n.value; }
function rest(n) { return n.next; }
function cons(newval, n) { return { value: newval, next: n }; }

With those basic building blocks implemented, you can even build the next level. For example, he shows that map might be implemented thus:

function map(s, f) {
  if (s === null) {
    return null;
  } else {
    return cons(f(first(s)), map(rest(s), f));
  }
}

To me, there’s a beauty there that is twofold. It’s implemented using the three core functions we’ve already seen, the core atoms, if you will. Moreover, there’s a beauty in the recursion and the “first and rest pattern” I touched upon earlier in “A meditation on reduction”.
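Pulling the node objects and the four functions together into one self-contained, runnable JavaScript sketch (the uppercasing callback is just an illustration of my own, not from the book):

```javascript
// Linked list nodes, as in the diagram above
const node3 = { value: "three", next: null };
const node2 = { value: "two", next: node3 };
const node1 = { value: "one", next: node2 };

// The three building blocks
function first(n) { return n.value; }
function rest(n) { return n.next; }
function cons(newval, n) { return { value: newval, next: n }; }

// map, built purely from the building blocks
function map(s, f) {
  if (s === null) {
    return null;
  } else {
    return cons(f(first(s)), map(rest(s), f));
  }
}

// Uppercase every value in the list
const shouted = map(node1, function (v) { return v.toUpperCase(); });
console.log(first(shouted));             // "ONE"
console.log(first(rest(shouted)));       // "TWO"
console.log(first(rest(rest(shouted)))); // "THREE"
```

Note that map never touches `.value` or `.next` directly; it only goes through first, rest and cons.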

Using the building blocks

Let’s look at another example of how those simple building blocks are put together to form something greater. This time, we’ll take inspiration from a presentation by Marc Feeley: “The 90 minute Scheme to C compiler”. In a slide on tail calls and garbage collection, the sample code, in Scheme (a dialect of Lisp), is shown with a tail call recursion approach thus:

(define f
  (lambda (n x) 
    (if (= n 0) 
        (car x) 
        (f (- n 1) 
           (cons (cdr x) 
                 (+ (car x) 
                    (cdr x)))))))

If you stare long enough at this you’ll realise two things: It really only uses the core functions car (first), cdr (rest) and cons. And it’s a little generator for finding the Nth term of the Fibonacci sequence:

(f 20 (cons 1 1)) ; => 10946

I love that even the example call uses cons to construct the second parameter.
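For comparison, here’s a rough JavaScript translation of the same computation, with car, cdr and cons sketched as a minimal pair implementation (the names mirror the Scheme; JavaScript engines generally won’t eliminate the tail call, so treat this as illustrative rather than stack-safe for large n):

```javascript
// A pair, with car/cdr accessors, as a plain object (an illustration,
// not a real library)
function cons(a, d) { return { car: a, cdr: d }; }
function car(p) { return p.car; }
function cdr(p) { return p.cdr; }

// Direct translation of Marc Feeley's Scheme example: each step
// replaces the pair (a . b) with (b . a+b), so car walks the
// Fibonacci sequence.
function f(n, x) {
  if (n === 0) {
    return car(x);
  }
  return f(n - 1, cons(cdr(x), car(x) + cdr(x)));
}

console.log(f(20, cons(1, 1))); // => 10946
```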

I read today, in “Farewell, Marvin Minsky (1927–2016)” by Stephen Wolfram, how Marvin said that “programming languages are the only ones that people are expected to learn to write before they can read”. This is a great observation, and one that I’d like to think about a bit more. But before I do, I’d at least like to consider that studying the building blocks of language helps in reading, as well as writing.

Breathing through the reduction meditation

The subject of the earlier post A meditation on reduction was the simple Clojure implementation of a reduce function.

Here it is again:

(defn my-reduce
  ([f initial coll]
   (loop [result initial
          remaining coll]
     (if (empty? remaining)
       result
       (recur (f result (first remaining)) (rest remaining)))))
  ([f [head & tail]]
   (my-reduce f head tail)))

If you’re not familiar with what reduce is used for, and how, you may want to take a quick look at the documentation before continuing.

OK, let’s examine it line by line.

The my-reduce function definition starts with the defn macro, a shorthand way to write (def my-reduce (fn ...)). The name my-reduce is used to distinguish it from Clojure’s own reduce function.

So begins the first arity definition, ending on line 7. This definition is for the 3-arity case, where the my-reduce function is called with three arguments: the function, a starting value, and the collection. The arguments are captured into f, initial and coll. Note that these are immutable from the get-go.

loop/recur is Clojure’s pattern for recursion from the tail position; the loop part has bindings, which we can see here on lines 3 and 4. These are like let bindings, introducing local names (remember, these are not mutable variables!). Here the names are result and remaining, bound initially (i.e. before the first recur) to initial and coll.

The test to be evaluated is whether the remaining collection (in remaining) is empty; if it is, then the current value of result becomes the return value of the call to my-reduce and execution ends. If it’s not, then the recur goes back to the target indicated by loop in line 3, specifying the values to be used for the result and remaining bindings this time around.

For result, the value is the result of applying the function bound to f to the result and the first element of the remaining collection. So for example, if we were using this to sum a series of numbers, this would be the point (and in fact the only point) at which the actual add function would be invoked.

For remaining, we take all but the first element of the remaining collection.
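Those two updates are the whole of the loop. As a sketch, the same 3-arity logic might look like this in JavaScript, with loop/recur becoming a while loop over re-assigned locals (myReduce and the plain-array representation are illustrative choices of mine, not the book’s code):

```javascript
// A sketch of my-reduce's 3-arity case. The loop "bindings"
// (result, remaining) become locals re-assigned each time around.
function myReduce(f, initial, coll) {
  let result = initial;
  let remaining = coll;
  while (remaining.length > 0) {        // (empty? remaining)
    result = f(result, remaining[0]);   // (f result (first remaining))
    remaining = remaining.slice(1);     // (rest remaining)
  }
  return result;                        // result is the return value
}

// Summing a series of numbers, as in the walkthrough: the add
// function is invoked only at that one point inside the loop.
console.log(myReduce(function (a, b) { return a + b; }, 0, [1, 2, 3, 4])); // => 10
```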

This is a nice example of the “head and tail” pattern that I mentioned in an earlier post “My Journey to Clojure”.

And here we find the second arity definition. Its simplicity is what makes it lovely, and again, this is a common pattern in Clojure. For the case where an argument to a function isn’t supplied, you’ll often see a second definition that expects the reduced set of arguments and then calls the function with the “right” number of arguments.

In this particular case, this second arity definition is for the 2-argument case [f [head & tail]] as opposed to the 3-argument case [f initial coll] in line 2.

In case you’re wondering about [f [head & tail]]: f is the first argument, and [head & tail] is the second. This second argument is an example of sequential destructuring, where the first value is bound to head and any remaining values are gathered into a sequence bound to tail.
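JavaScript happens to have an analogous form of sequential destructuring, which may make the [head & tail] idea more familiar (the names head and tail here are just illustrative):

```javascript
// Clojure's [head & tail] corresponds to [head, ...tail] in JavaScript:
// the first value is bound to head, the rest are gathered into tail.
const [head, ...tail] = ["one", "two", "three"];
console.log(head); // "one"
console.log(tail); // ["two", "three"]
```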


Scribbling down values in my notebook for sample executions helped me meditate on this.

Summing up

All in all, for such a small piece of code, such a compact meditation, there are some choice patterns and best (or at least common) practices that one can reflect upon and enjoy.

My journey to Clojure

I’m learning Clojure. Slowly, but hopefully surely. Clojure is a Lisp, which I like saying, because it makes me sound as though I know what I’m talking about and that my language experience is as old as the hills. Perhaps the only thing that is valid there is that I’m old. But anyway.

Actually, one of the things about Clojure that appeals to me is that it is a Lisp. One of the books I remember buying when I was still in my early teens was Artificial Intelligence by Patrick Henry Winston. I still have it, a second printing from 1979. While I didn’t understand very much of it, I was somewhat mesmerised by the Lisp forms, written in all caps and with beautiful bracket symmetry. Lisp cropped up again for me a few years later, in the amazing book Gödel, Escher, Bach by Douglas Hofstadter, and it was equally mesmerising.

So when I finally discovered Clojure, I decided to delve beneath the shimmering surface that had heretofore had me transfixed, and experience the beauty from within.

One of the recurring patterns emerging from what I read, even at that early stage, was that of “head and tail”. This is alternatively known as “first and rest”, or, going back to early Lisp origins, “CAR and CDR”. Given a sequence, the idea is that you can get hold of the first item, and everything but the first item (the rest), as two separate addressable entities. You do something with the first item, and then repeat the process, with the sequence now being what you had just identified as the rest.
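That process, using the first item and then repeating on the rest, can be sketched in a few lines of JavaScript (a toy sum function over an ordinary array, purely for illustration):

```javascript
// Sum a sequence using the head-and-tail pattern:
// do something with the first item, then recurse on the rest.
function sum(seq) {
  if (seq.length === 0) {
    return 0;                  // empty sequence: nothing left to add
  }
  const [head, ...tail] = seq; // the first item, and everything else
  return head + sum(tail);     // use the head, repeat on the tail
}

console.log(sum([1, 2, 3, 4, 5])); // => 15
```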

There’s something appealingly simple in this pattern, not just because it’s something that we can all immediately understand, but also because there’s an unspoken truth in it about the general approach to sequential data structures, and to data structure processing in general. It can perhaps be summed up nicely in an epigram from Alan Perlis thus:

“It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures.”