(notes from Programming Clojure, 1st Edition)

TOC

Chapter 1, Getting Started

Why Lisp?

  • Lisp is homoiconic. Lisp code is just Lisp data. This make it easy for programs to write other programs.
  • The whole language is there, all the time. Read Paul Graham's Revenge of the Nerds.

Lisp, with Fewer Parentheses

  • Clojure generalize Lisp's physical list into an abstraction called a sequence. This preserves the power of lists, while extending that power to a variety of other data structures.
  • Clojure's reliance on the JVM provides a standard library and a deployment platform with great reach.
  • Clojure's approach to symbol resolution and syntax quoting makes it easier to write many common macros.
  • Clojure provides a convenient literal syntax for a wide variety of data structures besides just lists. (for example, vector [] instead of list ())
  • In Clojure, unlike most Lisps, commas are whitespace.
  • Idiomatic Clojure does not nest parentheses more than necessary.

Clojure Is a Functional Language

Clojure is a functional language but not a pure functional language like Haskell. Functional languages have the following properties:

  • Functions are first-class objects. That is, functions can be created at runtime, passed around, returned, and in general used like any other datatype.
  • Data is immutable.
  • Functions are pure; that is, they have no side effects.

Clojure Simplifies Concurrent Programming

Clojure's support for concurrency goes beyond just functional programming. It protects mutable data via software transactional memory (STM).

Clojure Embraces the Java Virtual Machine

You can call any Java API directly:

(System/getProperties)

Clojure Coding Quick Start

you need

  • Java runtime
  • Leiningen

enter repl

$ lein repl

hello world

(println "hello world")
(defn hello [name] (str "Hello, " name))
(hello "Stu")
; `Ctrl+C` or `Ctrl+D` to exit repl

here

  • defn defines a function
  • hello is the function name
  • hello takes one argument, name
  • str is a function call that concatenates an arbitrary list of arguments into a string
  • defn, hello, name, and str are all symbols, which are names that refer to things.

load file, you can use absolute path or a path relative to where you launched repl

(load-file "temp.clj")

Chapter 2, Exploring Clojure

Forms

When you run a Clojure program, a part of Clojure called the reader reads the text of the program in chunks called forms and translates them into Clojure data structures. Clojure then compiles and executes the data structures.

Numberic Types

Numberic literals are forms.

Clojure has a build-in Ratio type:

(class (/ 22 7))
=> clojure.lang.Ratio
; if you actually want decimal division
(/ 22.0 7)
; integer quotient and remainder
(quot 22 7)
=> 3
(rem 22 7)
=> 1

Symbols

Symbols name all sorts of things in Clojure:

  • Functions like str
  • "Operators" like +, which are just functions
  • Java classes like Java.lang.Sring
  • Namespaces like clojure.core and Java packages like Java.lang
  • Data structures and references

Symbols cannot start with a number.

Clojure treats / and . specially in order to support namespaces.

Strings and Characters

Strings are delimited by ", and can span multiple lines:

"This is 
a multiline String"

you can call Java functions directly

(.toUpperCase "hello")
=> "HELLO"

the dot before toUpperCase tells Clojure to treat it as the name of a Java method instead of a Clojure function.

Clojure characters are Java characters.

Boolean and Nil

rules

  • true is true, and false is false
  • nil also evaluates to false when used in a boolean context
  • other than false and nil, everything else evaluates to true in a boolean context

empty list is not false in Clojure.

zero is not false in Clojure.

A predicate is a function that returns either true of false.

(true? expr)
(false? expr)
(nil? expr)
(zero? expr)

true? tests whether a value is actually true, not whether that value evaluates to true in a boolean context.

(true? "foo")
=> false

Maps, Keywords, and Structs

Map is a collection of key/value pairs. Literal form is surrounded by {}

(def inventors {"Lisp" "McCarthy" "Clojure" "Hickey"})
; same, comma is treated as whitespace
(def inventors {"Lisp" "McCarthy", "Clojure" "Hickey"})

Maps are functions.

(inventors "Lisp")
=> "McCarthy"

get allows you to specify a different return value for missing keys:

(get inventors "Lisp" "I dunno!")
=> "McCarthy"
(get inventors "Foo" "I dunno!")
=> "I dunno!"

Any Clojure data structure can be a key in a map. A very common key type is the Clojure keyword.

keywords begin with a colon :, keywords resolve to themselves:

:foo
=> :foo

keywords are also functions:

(:Clojure inventors)
=> "Hicky"

Structs

(defstruct book :title :author)
=> #'user/book
; instantiate with struct
(def b (struct book "Anathem" "Neal Stephenson"))
(:title b)

you can omit values for some of the basic keys.

use struct-map to add values

(struct-map book :copyright 2008 :title "Anathem")
=> {:title "Anathem", :author nil, :copyright 2008}

Reader Macros

Reader macros are special reader behaviors triggered by prefix macro characters.

(check book/online for list of reader macros)

Clojure does not allow programs to define new reader macros.

Functions

create function

(defn greeting 
    "Returns a greeting"
    [username]
    (str "Hello, " username))
; call
(greeting "world")
=> "Hello, World"

; document
(doc greeting)

exception will be thrown if argument is omitted. to use a default argument:

(defn greeting
    "Returns a greeting"
    ([] (greeting "world"))
    ([username] (str "Hello, " username)))
(greeting)
=> "Hello, world"

you can create function with variable arity by including an ampersand & in the parameter list, Clojure will bind the name after the & to a sequence of all remaining parameters.

(defn date [person-1 person-2 & chaperones]
    (println person-1 "and" person-2
        "went out with" (count chaperones) "chaperones."))
(date "Romeo" "Juliet" "Friar Lawrence" "Nurse")
=> Romeo and Juliet went out with 2 chaperones.

Anonymous Functions

you can create anonymous functions with fn

reason to use anonymous functions

  • function is so brief and self-explanatory
  • function is being used only from inside another function and needs a local name
  • function is created inside another function for the purpose of closing over some data

filter functions are often brief and self-explanatory.

(use '[clojure.contrib.str-utils :only (re-split)])
(filter (fn [w] (> (count w) 2)) (re-split #"\W+" "A fine day"))
=> ("fine" "day")

shorter syntax for anonymous functions: (#body), parameters are named %1, %2 and so on.

(filter #(> (count %) 2) (re-split #"\W+" "A fine day it is"))
=> ("fine" "day)

use named function inside the scope of another function:

(defn indexable-words [text]
    (let [indexable-word? (fn [w] (> (count w) 2))]
        (filter indexable-word? (re-split #"\W+" text))))

the let binds the name indexable-word? to the anonymous function inside the (lexical) scope of indexable-words

Vars, Bindings, and Namespaces

when you define an object with def or defn, that object is stored in a Clojure var.

the var special form returns a var itself, not the var's value

(var foo)
=> #'user/foo

vars are more than just storing a value:

  • same var name can be aliased into more than one namespace
  • vars can have metadata, includes documentation, type hints for optimization, and unit tests
  • vars can be dynamically rebound on a per-thread basis

vars are bound to names. there're other kinds of bindings. in a function call, argument values bind to parameter names.

a function's parameter bindings hava a lexical scope: they are visible only inside the text of the function body. functions are not the only way to create a lexical binding, the special form let does nothing other than create a set of lexical bindings.

(let [bindings*] exprs*)

bindings are then in effect for exprs, and the value of the let is the value of the last expression in exprs.

Destructuring

(defn greeting-author-1 [author]
    (println "Hello, " (:first-name author)))

but you don't need the author, all you need is the first-name. Clojure solves this with destructuring: any place that you bind names, you can nest a vector or a map in the binding to reach into a collection and bind only the part you want.

(defn greeting-author-2 [{fname :first-name}]
    (println "Hello, " fname))

you can use a vector to destructure any sequential collection.

(let [[x y] [1 2 3]]
    [x y])
=> [1 2]
; skip elements
(let [[_ _ z] [1 2 3]]
    z)
=> 3
; use :as to bind for the entire enclosing structure
(let [[x y :as coords] [1 2 3 4 5 6]]
    (str "x: " x ", y: " y ", total dimensions " (count coords)))
=> "x: 1, y: 2, total dimensions 6"

Namespaces

use resolve to explicitly resolve the symbol

(resolve 'foo)
=> #'user/foo

switch namespaces or creating a new one if needed, with in-ns

(in-ns 'myapp)
=> #<Namespace myapp>

when you create new namespace with in-ns, java.lang package is automatically available to you.

making Clojure's core function available in new namspace as well:

(clojure.core/use 'clojure.core)

class names outside java.lang must be fully qualified

; wrong
File/separator
; correct
java.io.File/separator

if you don't want to use a fully qualified name, you can use import

(import '(java.io InputStream File))
File/separator
=> "/"

import is only for Java classes, if you want to use a Clojure var from another namespace, you must use fully qualified name or map the name into the current namespace.

(require 'clojure.contrib.math)
(clojure.contrib.math/round 1.7)
=> 2
(round 1.7)
=> java.lang.Exception
; this maps all public vars to current namespace
(use 'clojure.contrib.math)
; use :only to list only the vars you need
(use '[clojure.contrib.math :only (round)])
; now you can call
(round 1.2)
=> 2
; add :reload to make changes available to a running program
(use :reload '[clojure.contrib.make :only (round)])
; :reload-all reloads any namespaces that examples.exploring refers to
(use :reload-all 'examples.exploring)

it's idiomatic to import Java classes and use namespaces at the top of a source file, using the ns macro:

(ns examples.exploring
    (:use examples.utils clojure.contrib.str-utils)
    (:import (java.io File))

Flow Control

Clojure has very few flow control forms, if, do, and loop/recur may be all you will ever need.

(defn is-small? [number]
    (if (< number 100) "yes"))
; if false it returns nil
(is-small? 200)
=> nil
; add "else" part
(defn is-small? [number]
    (if (< number 100) "yes" "no"))
(is-small? 200)
=> "no"

if allows only one form for each branch. do takes any number of forms, evaluates them all, and returns the last.

(defn is-small? [number]
    (if (< number 100)
        "yes"
        (do
            (println "Saw a big number" number)
            "no")))

and this is an example of side effect.

Recur with loop/recur

(loop [bindings *] exprs*]

loop just like let, establishing bindings and then evaluating exprs, the difference is that loop sets a recursion point, which can then be targeted by the recur special form

(recur exprs*)

recur binds new values for loop's bindings and return control to the top of the loop.

example

(loop [result [] x 5]
    (if (zero? x)
        result
        (recur (conj result x) (dec x))))
=> [5 4 3 2 1]

instead of using a loop, you can use recur back to the top of a function.

(defn countdown [result x]
    (if (zero? x)
        result
        (recur (conj result x) (dec x)))
(countdown [] 5)
=> [5 4 3 2 1]

recur is a powerful building block, but you may not use it very often, because many common recursions are proviced by Clojure's sequence library.

; countdown could expressed as any of these
(into [] (take 5 (iterate dec 5)))
=> [5 4 3 2 1]
(into [] (drop-last (reverse (range 6))))
=> [5 4 3 2 1]
(vec (reverse (rest (range 6))))
=> [5 4 3 2 1]

Where's My for Loop?

an implementation of Apache Commons indexOfAny in Clojure (check online for the Java source)

; indexing a string
(defn indexed [coll] (map vector (iterate inc 0) coll))
; return indices 
(defn index-filter [pred coll]
    (when pred
        (for [[idx elt] (indexed coll) :when (pred elt)] idx)))

Clojure's for is not a loop but a sequence comprehension.

(index-filter #{\a \b} "abcdbbb")
=> (0 1 4 5 6)
(index-filter #{\a \b} "xyz")
=> nil
; return first result
(defn index-of-any [pred coll]
    (first (index-filter [pred coll])))

Metadata

In Clojure, metadata is data that is orthogonal to the logical value of an object.

you can add metadata to a collection or a symbol using with-meta

(with-meta object metadata)

example

(def stu {:name "Stu" :email "stu@example.com"})
(def serializable-stu (with-meta stu {:serializable true}))

; "=" tests for value equality, like Java's "equals"
(= stu serializable-stu)
=> true
; "identical?" tests reference equality, like Java's "=="
(identical? stu serializable-stu)
=> false
; assess metadata with "meta" macro
(meta stu)
=> nil
(meta serializable-stu)
=> {:serializable true}
; or use the reader macro "^"
(^stu)
=> nil
(^serializable-stu)
=> {:serializable true}

when you create a new object based on an existing object, the existing object's metadata flows to the new object.

Reader Macro

to add your own key/value pairs to a var, use the metadata reader macro

#^metadata form

example, shout expects and returns a string, use :tag key to enforce it

(defn #^{:tag String} shout [#^{:tag String} s] (.toUpperCase s))
; inspect
^#'shout
; Exception, not a string
(shout 1)
; ":tag" is so common, you can use short-form "#^Classname"
(defn #^String shout [#^String s] (.toUpperCase s))
; can also place the metadata last
(defn shout
    ([s] (.toUpperCase s))
    {:tag String})

common metadata keys

  • :arglists
  • :doc
  • :file
  • :line
  • :macro
  • :name
  • :ns
  • :tag

it is important to note that the metadata reader macro is not the same as with-meta. reader macro addes metadata for the compiler, with-meta adds for your own data

(def #^{:testdata true} foo (with-meta [1 2 3] {:order :ascending}))
(meta #'foo)
=> {:ns .., :name foo, :file .., :testdata true}
(meta foo)
=> {:order :ascending}

as a general rule, use metadata reader macro to add metadata to vars and parameters, use with-meta to add metadata to data.


Chapter 3, Working with Java

Calling Java

create new object

(def rnd (new java.util.Random))
; call with "." special form
; following calls no-argument version of "nextInt()"
(. rnd nextInt)
; with argument
(. rnd nextInt 10)
; "." also works with static fields
(. Math PI)

Syntactic Sugar

instead of new, you can use the Classname. form

; following are equivalent
(new Random)
(Random.)
; for static fields, short form is "Classname/membername"
(. Math PI)
Math/PI
; for static methods, use "(Classname/membername)"
(System/currentTimeMillis)
; another short form is".methodOrFieldName"
(. rnd nextInt)
(.nextInt rnd)
; chaining is ugly with prefix-dot notation
(.getLocation
    (.getCodeSource (.getProtectionDomain (.getClass '(1 2)))))

; clean it up with ".." macro
(.. '(1 2) getClass getProtectionDomain getCodeSource getLocation)
; to call methods several times, use "doto" macro
(doto (System/getProperties)
    (.setProperty "name" "Stuart")
    (.setProperty "favoriteColor" "blue"))

Using Java Collections

make-array to create Java arrays

(make-array String 5)
; can wrap it as Clojure sequence
(seq (make-array String 5))

(skipped, read later)

Chapter 4, Unifying Data with Sequences

Everything Is a Sequence

A Sequence has 3 core capabilities

; get first item
(first aseq)
; get everything after first item
(rest aseq)
; construct a new sequence by adding an item to the front of an existing sequence.
(cons elem aseq)
; when apply to vector, "rest" and "cons" returns seq
(rest [1 2 3])
=> (2 3)
(cons 0 [1 2 3])
=> (0 1 2 3)
; works with maps and sets
(first {:fname "Stu" :lname "Halloway"})
=> [:fname "Stu"]
(first #{:the :quick :brown :fox})
=> :brown
; if you want a reliable order, sort it
(sorted-set :the :quick :brown :fox)
=> #{:brown :fox :quick :the}
(sorted-map :c 3 :b 2 :a 1)
=> {:a 1 :b 2 :c 3}

in addition to the core capabilities of seq, two other capabilities are worth meeting immediately: conj and into

; "conj" adds one or more elements to a collection
(conj `(1 2 3) :a)
=> (:a 1 2 3)
; "into" adds all the items in one collection to another
(into '(1 2 3) '(:a :b :c))
=> (:c :b :a 1 2 3)
; for lists, they add to the front (above)
; for vectors, they add to the back
(conj [1 2 3] :a)
=> [1 2 3 :a]
(info [1 2 3] [:a :b :c])
=> [1 2 3 :a :b :c]

most Clojure sequences are lazy: they can process sequences too large to fit in memory

Clojure sequences are immutable: they never change.

Using the Sequence Library

creating sequences

; (range start? end step?)
; ranges include their start, but not their end.
(range 5)
=> (0 1 2 3 4)
(range 1 5)
=> (1 2 3 4)
(range 1 5 2)
=> (1 3)
; (repeat n x)
(repeat 5 1)
=> (1 1 1 1 1)
; infinite version 
; (iterate f x)
; combine with take
; (take n sequence)
(take 5 (iterate inc 1))
=> (1 2 3 4 5)

; cycle takes a collection and cycles it infinitely
; (cycle coll)
(take 10 (cycle (range 3)))
=> (0 1 2 0 1 2 0 1 2 0)
; interleave takes multiple collections and produces a new collection what interleaves values from each collection until one of the collection is exhausted.
; (interleave & colls)
(interleave (iterate inc 1) ["A" "B" "C"])
=> (1 "A" 2 "B" 3 "C")
; interpose returns a sequence with each of the elements separated by a separator
; (interpose separator coll)
(interpose "," ["apples" "bananas" "grapes"])
=> ("apples" "," "bananas" "," "grapes)
; works with (apply str ...) to produce output strings
(apply str (interpose \, ["apples" "bananas" "grapes"]))
=> "apples,bananas,grapes"
; clojure-contrib wraps it as "str-join"
; (str-join separator sequence)

filtering

; (filter pred coll)
(take 10 (filter even? (iterate inc 1)))
=> (2 4 6 8 10 12 14 16 18 20)
; (take-while pred coll)
(take-while (complement #{\a\e\i\o\u}) "the-quick-brown-fox")
=> (\t \h)
; complement reverses the behavior of another function.
; opposite of take-while is drop-while
; (drop-while pred coll)
(drop-while (complement #{\a\e\i\o\u}) "the-quick-brown-fox")
(\e \- \q \u .... )
; (split-at index coll)
; (split-with pred coll)
(split-at 5 (range 10))
=> [(0 1 2 3 4) (5 6 7 8 9)]
(split-with #(<= % 10) (range 0 20 2))
=> [(0 2 4 6 8 10) (12 14 16 18)]
; all take-, split-, and drop- functions returns lazy sequences.

sequence predicates

; (every? pred coll)
(every? odd? [1 3 5])
=> true
; (some pred coll)
; some return first nonfalse value for its predicate or returns nil if no match
(some even? [1 2 3])
=> true
(some identity [nil false 1 nil 2])
=> 1
; (not-every? pred coll)
; (not-any? pred coll)

transforming sequences

; (map f coll)
(map #(format "<p>%s</p>" %) ["the" "quick" "brown" "fox"])
=> ("<p>the</p>" "<p>quick</p>" ... )
; map can take more than one collection argument, stopping whenever the smallest collection is exhausted
; (reduce f coll)
(reduce + (range 1 11))
=> 55
; (sort comp? coll)
; (sort-by a-fn comp? coll)
(sort [42 1 7 11])
=> (1 7 11 42)
(sort-by #(.toString %) [42 1 7 11])
=> (1 11 42 7)
(sort > [42 1 7 11])
=> (42 11 7 1)
(sort-by :grade > [{:grade 83} {:grade 90} {:grade 77}])
=> ({:grade 90} {:grade 83} {:grade 77}

The granddaddy of all filters and transformations is the list comprehension. A list comprehension creates a list based on an existing list, using set notation. In general, list comprehension will consist of the following

  • Input list(s)
  • Placeholder variables for elements in the input lists
  • Predicates on the elements
  • An output form that produces output from the elements of the input lists that satisfy the predicates
; list comprehension using "for" macro
; (for [binding-form coll-expr filter-expr? ...] expr)
; rewrite the map example
(for [word ["the" "quick" "brown" "fox"]]
    (format "<p>%s</p>" word))
; emulate filter with ":when"
(take 10 (for [n (iterate inc 1) :when (eval? n)]))

; the ":while" clause
(for [n (iterate inc 1) :while (even? n)] n)
=> (0)
; the real power of "for" comes when you work with more than one binding expression.
(for [file "ABCDEFGH" rank (range 1 9)] (format "%c%d" file rank))
=> ("A1" "A2" ... "H7" "H8")
; Clojure iterates over the rightmost binding expression first and then works its way left.
(for [rank (range 1 9) file "ABCDEFGH"] (format "%c%d" file rank))
=> ("A1" "B1" ... "G8" "H8")

Lazy and Infinite Sequences

Most Clojure sequences are lazy, using lazy sequences has many benefits:

  • you can postpone expensive computations that may not in fact be needed
  • you can work with huge data sets that do not fit into memory
  • you can delay I/O until it is absolutely needed
; force evaluation with "doall"
; (doall coll)
(def x (for [i (range 1 3)] (do (println i) i)))
(doall x)
| 1 
| 2
=> (1 2)
; or "dorun", "dorun" walks the elements of sequence without keeping past elements in memory
; (dorun coll)
(dorun x)
| 1 
| 2
=> nil

Clojure Makes Java Seq-able

The seq abstraction of first/rest applies to following in Java world:

  • The Collections API
  • Regular expressions
  • File system traversal
  • XML processing
  • Relational database results

Seq-ing Java Collections

(first (.getBytes "hello"))
=> 104
(rest (System/getProperties))

; Clojure will automatically wrap collections in sequences, but will not automatically rewrap them back to their original type
; use (apply str seq) to convert a seq back to a string
(apply str (reverse "hello"))
; also remember, seq wrappers are immutable

Seq-ing Regular Expressions

; (re-matcher regexp string)
; but you should use the higher-level "re-seq"
; (re-seq regexp string)
(re-seq #"\w+" "the quick brown fox")
=> ("the" "quick" "brown" "fox")

Seq-ing the File System

(count (file-seq (File. ".")))

; check by last modified time
(defn minutes-to-millis [mins] (* mins 1000 60))
(defn recently-modified? [file]
    (> (. lastModified file)
        (- (System/currentTimeMillis) (minutes-to-millis 30))))
(filter recently-modified? (file-seq (File. ".")))

Seq-ing a Stream

you can seq over the lines of any Java Reader using line-seq. To get a Reader, you can use clojure-contrib's duck-streams library.

(use '[clojure.contrib.duck-streams :only (reader))
; leaves reader open ...
(take 2 (line-seq (reader  "example/utils.clj")))
; correctly close the reader
(with-open [rdr (reader "book/utils.clj")]
    (count (filter #(re-find #"\S" %) (line-seq rdr))))
; a function that counts the lines of Clojure code in a directory tree
(use '[clojure.contrib.duck-streams :only (reader))
(def non-blank? [line] (if (re-find #"\S" line) true false))
(def non-svn? [file] (not (.contains (.toString file) ".svn")))
(def clojure-source? [file] (.endsWith (.toString file) ".clj"))
(def clojure-loc [base-file]
    (reduce
    +
    (for [file (file-seq base-file)
        :when (and (clojure-source? file) (non-svn? file))]
        (with-open [rdr (reader file)]
            (count (filter non-blank? (line-seq rdr)))))))
; usage
(clojure-loc (java.io.File. "/my/repos/clojure"))

Seq-ing XML

(use '[clojure.xml :only (parse)])
(parse (java.io.File. "examples/compositions.xml"))
; manipulate with xml-seq
; (xml-seq root)
(for [x (xml-seq
            (parse (java.io.File. "examples.xml")))
        :when (= :composition (:tag x))]
    (:composer (:attr x)))

Calling Structure-Specific Functions

Clojure includes functions that specifically target lists, vectors, maps, structs and sets.

Functions on lists

; peek is the same as first, but pop is not the same as rest. pop will throw an exception if the sequence is empty.
(peek '(1 2 3))
=> 1
(pop '(1 2 3))
=> (2 3)

Functions on Vectors

; vectors also support peek and pop, but deal with element at the end
(peek [1 2 3])
=> [3]
(pop [1 2 3])
=> [1 2]
; get reutrns value at an index, nil if out of bound
(get [:a :b :c] 1)
=> :b
; vectors are themselves functions, exception if out of bound
([:a :b :c] 1)
=> :b
; assoc associates a new value
(assoc [0 1 2 3 4] 2 :two)
=> (0 1 :two 3 4)
; subvec return a subvector of a vector
; (subvec avec start end?)
(subvec [1 2 3 4 5] 3)
=> [4 5]
(subvec [1 2 3 4 5] 1 3)
=> [2 3]

Functions on Maps

; (keys map)
; (vals map)
(keys {:sundance "spaniel", :darwin "beagle"})
=> (:sundance :darwin)
(vals {:sundance "spaniel", :darwin "beagle"})
("spaniel" "beagle")
; (get map key value-if-not-found?)
(get {:sundance "spaniel", :darwin "beagle"} :darwin)
=> "beagle"
; maps are functions too
({:sundance "spaniel", :darwin "beagle"} :darwin)
=> "beagle"
; keywords are functions too
(:darwin {:sundance "spaniel", :darwin "beagle"})
=> "beagle"
; (contains? map key)
; "assoc" returns a map with a key/value pair added
; "dissoc" returns a map with a key removed
; "select-keys" returns a map keeping only the keys passed in
; "merge" combines maps. if multiple maps contain a key, the right-most map wins
; "merge-with" is like "merge", but when two or more maps have the same key,  you can specify a function for combining the values under the key
; (merge-with merge-fn & maps)

Functions on Sets

; Clojure provides a group of functions for sets
; (use 'clojure.set)
; "union" returns set of all elements present in either input set
; "intersection" returns the set of all elements present in both input sets
; "difference" returns the set of all elements present in the first input set, minus those in the second.
; "select" returns the set of all elements matching a predicate
; "union" and "difference" also part of relational algebra, which is the basis for query languages such as SQL.
; The relational algebra consists of 6 primitive operators:
; union, difference, rename, selection, projection, and cross product.

; (rename relation rename-map)
(rename compositions {:name :title})
; (select pred relation)
(select #(= (:name %) "Requiem") compositions)
; (project relation keys)
; returns only the portions of the maps that match a set of keys
(project compositions [:name])
; (join relation-1 relation-2 keymap?)
(join compositions composers)

Chapter 5, Functional Programming

The Sixe Rules

  1. Avoid direct recrusion. (JVM limitation)
  2. Use recur when you are producing scalar values or small, fixed sequences. Clojure will optimize calls that use an explicit recur.
  3. When producing large or variable-sized sequences, always be lazy. (Do not recur)
  4. Be careful not to realize more of a lazy sequence than you need.
  5. Know the sequence library. You can often write code without using recur or the lazy APIs at all.
  6. Subdivide. Divide even simple-seeming problems into smaller pieces, and you will often find solutions in the sequence library that lead to more general, reusable code.

How to Be Lazy

A recursion definition consists of two parts:

  • A basis, which explicitly enumerates some members of the sequence
  • A induction, which provides rules for combining members of the sequence to produce additional members

There are may ways to convert a recursion definition into working code:

  • A simple recursion, using a function that call itself in some way to implement the induction step.
  • a tail recrusion, using a function only calling itself at the tail of its execution. Tail recrusion enables an important optimization.
  • A lazy sequence that eliminates actual recursion and calculates a values later, when it is needed.

example, Fibonacci numbers in simple recursion

; bad idea
(defn stack-consuming-fibo [n]
    (cond
        (= n 0) 0
        (= n 1) 1
    :else (+ (stack-consuming-fibo (- n 1))
            (stack-consuming-fibo(- n 2)))))

Tail Recursion

Recursion must come at the tail, that is, at an expression that is a return value of the function. Languages can then perform tail-call optimization (TCO), converting tail recursions into iterations that do not consume the stack.

fibo in tail recursion

(defn tail-fibo [n]
    (letfn [(fib
                [current next n]
                (if (zero? n)
                    current
                    (fib next (+ current next) (dec n))))]
    (fib 0 1 n)))
; the letfn macro
; (letfn fnspecs & body)
; fnspecs ==> (fname [params*] exprs)+

But JVM was not designed for functional languages. No language that runs directly on the JVM can perform automatic TCO.

One special case of recursion that can be optimized away on the JVM is a self-recursion. In Clojure, you can convert a function that tail calls itself into an explicit self-recursion with recur.

; better but not great
(defn recur-fibo [n]
    (letfn [(fib
            [current next n]
            (if (zero? n)
                current
                (recur next (+ current next) (dec n))))]
     (fib 0 1 n)))

Lazy Sequences

; (lazy-seq & body)

A lazy-seq will invoke its body only when needed, that is, when seq is called directly or indirectly.

(defn lazy-seq-fibo
    ([]
        (concat [0 1] (lazy-seq-fibo 0 1)))
    ([a b]
        (let [n (+ a b)]
            (lazy-seq
                (cons n (lazy-seq-fibo b n ))))))

using existing sequence library functions

(defn fibo []
    (map first (iterate (fn [[a b]] [b (+a b)]) [0 1])))

Losing Your Head

Unless you wnat to cache a sequence as you traverse it, you must be careful not to keep a reference to the head of the sequence. You should normally expose lazy sequences as a function that returns the sequence, not as a var that contains the sequence.

With lazy sequence, losing your head is often a good idea.

Currying and Partial Application

When you curry a function, you get a new function that takes one argument and returns the original function with that one argument fixed.

The absence of curry from Clojure is not a major problem, since partial is available.

; (partical f & partial-args)

Recursion Revisited

(i have too many recursions in my mind, skipped for now)

Chapter 6, Concurrency

Clojure provides a powerful concurrency library, consisting of four APIs that enforce different concurrency models: refs, atoms, agents, and vars

  • Refs manage coordinated, synchronous changes to shared state.
  • Atoms manage uncoordinated, synchronous changes to shared state.
  • Agents manage asynchronous changes to shared state.
  • Vars manage thread-local state.

Refs and Software Transactional Memory (STM)

Most objects in Clojure are immutable. When you really want mutable data, you must be explicit about it, such as by creating a mutable reference (ref) to an immutable object.

; (ref initial-state)
(def current-track (ref "Mars, the Bringer of War"))
; to read the contents of the reference, you can call "deref"
; (deref reference)
(deref current-track)
=> "Mars, the Bringer of War"
; or shortened "@" reader macro
@current-track
=> "Mars, the Bringer of War"
; you can change where a reference points to with "ref-set"
; (ref-set reference new-value)
(ref-set current-track "Venus, the Bringer of Peace")
=> java.lang.IllegalStateException

because refs are mutable, you must protect their updates. In many languages, you would use a lock for this purpose. In Clojure, you can use a transaction. Transactions are wrapped in a dosync

; (dosync & exprs)
(dosync (ref-set current-track "Venus, the Bringer of Peace"))

Transactional Properties

Like database transactions, STM transactions guarantee some important properties:

  • Updates are atomic. If you update more than one ref in a transaction, the cumulative effect of all the updates will appear as a single instantaneous event to anyone not inside your transaction.
  • Updates are consistent. Refs can specify validation functions. If any of these functions fail, the entire transaction will fail.
  • Updates are isolated. Running transactions cannot see partially completed results from other transactions.

Clojure does not guarantee that updates are durable (database ACID's D), because transactions are in-memory transactions. If you want a durable transaction in Clojure, you should use a database.

; you can make sure that updates are coordinated
(def current-track (ref "Venus, the Bringer of Peace"))
(def current-composer (ref "Holst"))
(dosync
    (ref-set current-track "Credo")
    (ref-set current-composer "Byrd"))

alter will apply an update function to a referenced object within a transaction:

;(alter ref update-fn & args ...)
(defn add-message [msg]
    (dosync (alter message conj msg)))

How STM Works: MVCC

Clojure's STM uses a technique called Multiversion Concurrency Control (MVCC). (read the book)

commute is a specialized variant of alter allowing for more concurrency.

; (commute ref update-fn & args ...)

Commutes are so named because they must be commutative. That is, updates must be able to occur in any order.

Many updates are not commutative, consider a counter that returns an increasing sequence of numbers.

In general, you should prefer alter over commute.

Adding Validation to Refs

; (ref initial-state options*)
; options include:
; :validator validate-fn
; :meta metadata-map
(def validate-message-list
    (partial every? #(and (:sender %) (:text %))))
(def messages (ref () :validator validate-message-list))

Refs are great for coordinated access to shared state, but not all tasks require such coordination. For updating a single piece of isolated data, prefer an atom.

Use Atoms for Uncoordinated, Synchronous Updates

Atoms are a lighter-weight mechanism than refs. Atoms allow updates of a single value, uncoordinated with anything else.

; (atom initial-state options?)
; options include:
; :validator validate-fn
; :meta metadata-map
(def current-track (atom "Venus, the Bringer of Peace"))
; dereference with "deref"
(deref current-track)
=> "Venus, the Bringer of Peace"
; or
@current-track
=> "Venus, the Bringer of Peace"
; atoms do not participate in transactions, to update the value of an atom, simply call "reset!"
(reset! current-track "Credo")
; (swap! an-atom f & args)
; swap! updates an-atom by calling function f on the current value of an-atom, plus any additional args.
(swap! current-track assoc :title "Sancte Deus")

Both refs and atoms perform synchronous updates. When the update function returns, the value is already changed. If you do not need this level of control and can tolerate updates happening asynchronously at some later time, prefer an agent.

Use Agent for Asynchronous Updates

; (agent initial-state)
(def counter (agent 0))

once you have an agent, you can send the agent a function to update its state. send queues an update-fn to run later, on a thread in a thread pool

; (send agent update-fn & args)
(send counter inc)

you can check the current value of an agent with deref/@

@counter
=> 1

if you want to be sure that the agent has completed the actions you sent to it, you can call await or await-for.await has no timeout, so be careful: await is willing to wait forever.

; (await & agents)
; (await-for timeout-millis & agents)

Validating Agents and Handling Errors

Agent can take a validation function also

; (agent initial-state options*)
; options include:
; :validator validate-fn
; :meta metadata-map
(def counter (agent 0 :validator number?))
(send counter (fn [_] "boo"))
; no error, send still returns immediately.
; but when try to dereference
@counter
=> java.lang.Exception: Agent has errors

to discover the specific error(s), call agent-errors, which will return a sequence of errors thrown during agent actions

; (agent-errors counter)

once an agent has errors, all Subsequence attempts to query the agent will return an error. you make the agent usable again by calling clear-agent-errors

; (clear-agent-errors agent)
(clear-agent-errors counter)
@counter
=> 0

Including Agents in Transactions

Transactions should not have side effects, because Clojure may retry a transaction an arbitrary number of times. However, sometimes you want a side effect when a transaction succeeds. Agents provide a solution. If you send an action to an agent from within a transaction, what action will be sent exactly once, if and only if the transaction succeeds.

(def backup-agent (agent "output/messages-backup.clj"))
(use '[clojure.contrib.duck-steams :only (spit)])
(defn add-message-with-backup [msg]
    (dosync
        (let [snapshot (commute message conj msg)]
            (send-off backup-agent (fn [filename]
                                    (spit filename snapshot)
                                    filename))
            snapshot)))

send-off is a variant of send for actions that expect to block.

send-off actions get their own expandable thread pool. Never send a blocking functions, or you may unnecessarily prevent other agents from making progress.

Managing Per-Thread State with Vars

When you call def or defn, you create a dynamic var, you pass an initial value to def, which becomes the root binding for the var.

(def foo 10)

the binding of foo is shared by all threads

(.start (Thread. (fn [] (println foo))))
=> nil
| 10

you can create a thread-local binding for a var with the binding macro

; (binding [bindings] & body)

bindings have dynamic scope

Structurally, a binding looks a lot like a `let

(binding [foo 42] foo)
=> 42
(defn print-foo [] (println foo))
; now see the difference between binding and let
(let [foo "let foo"] (print-foo))
| 10
(binding [foo "bound foo"] (print-foo))
| bound foo

the let has no effect outside its own form, so the first print-foo prints the root binding 10. the binding, stays in effect down any chain of calls that begins in the biding form.

Acting at a Distance

Vars intended for dynamic binding are sometimes called special variables. It is good style to name them with leading and trailing asterisks *. For example, Clojure uses dynamic binding for thread-wide options such as the standard I/O streams *in*, *out*, and *err*.

Dynamic bindings enable action at a distance. When you change a dynamic binding, you can change the behavior of distant functions without changing any function arguments.

One kind of action at a distance is temporarily argumenting the behavior of a function. In some languages this would be classified as aspect-oriented programming; in Clojure it is simply a side effect of dynamic binding.

example

(defn slow-double [n]
    (Thread/sleep 100)
    (* n 2))
(defn calls-slow-double [] 
    (map slow-double [1 2 1 2 1 2]))
; you have to run with "dorun", otherwise "map" will immediately returning a lazy sequence
(time (dorun (calls-slow-double)))
| "Elapsed time: 601.418 msecs"

slow-double is a good candidates for memoization. When you memoize a function, it keeps a cache mapping past inputs to past outputs.

(memoize function)

since the slow-double function is already in use, you can create a binding to a memoized version and call calls-slow-double from within the binding:

(defn demo-memoize []
    (time
     (dorun
      (binding [slow-double (memoize slow-down)]
       (calls-slow-double)))))
(demo-memoize)
| "Elapsed time: 203.115 msecs"

dynamic binding has great power, but functions that use dynamic bindings are not pure functions and can quickly lose the benefits of Clojure's functional style.

A Clojure Snake

snake game written by Clojure

Chapter 7, Macros

**When to Use Macros

Macro Club has two rules, plus one exception.

The first rule is Don't Write Macros. Macros are complex, and they require you to think carefully about the interplay of macro expansion time and compile time. If you can write it as a function, think twice before using a macro.

The second rule is Write Macros If That Is The Only Way to Encapsulate a Pattern.

The exception to the rule is that you can write any macro that makes life easier for your callers when compared with an equivalent function.

Writing a Control Flow Macro

time and compile time. If you can write it as a function, think twice before using a macro.

The second rule is Write Macros If That Is The Only Way to Encapsulate a Pattern.

The exception to the rule is that you can write any macro that makes life easier for your callers when compared with an equivalent function.

Writing a Control Flow Macro

create an implementation of unless

begin by tring to write unless as a function

; this is doomed to fail ..
(defn unless [expr form]
    (if expr nil form))
(unless true (println "this should not print"))
| this should not print

function arguments are always evaluated before passing them to unless

macro solve his problem, because they do not evaluate their arguments immediately. Instead, you get to choose when (and if!) the arguments to a macro are evaluated.

when Clojure encounters a macro, it processes it in two steps. First, it expaneds (executes) the macro and substitutes the result back into the program. This is called macro expansion time. Then it continues with the normal compile time.

to write unless, you need to write Clojure code to perform the translation at macro expansion time:

; (unless expr form) -> (if expr nil form)

then you need to tell Clojure that your code is a macro by using defmacro

; (defmacro name doc-string? attr-map? [params*] body)
(defmacro unless [expr form]
    (list 'if expr nil form))
(unless false (println "this should print"))
then Clojure will (invisibly to you) expand the unless form into:
(if false nil (println "this should print"))

then Clojure compiles and executes the expanded if form.

Special Forms, Design Patterns, and Macros

Clojure has no special syntax for code. Code is composed of data structures. This is true for normal functions but also for special forms and macros.

"design pattern is not a finished design that can be transformed directly into code" (Wikipedia)

that is what macros fit it. macros provide a layer of indirection so that you can automate the common parts of any recurring pattern. Macros and code-as-data work together, enabling you to reprogram your language on the fly to encapsulate patterns.

Expanding Macros

the unless macro

(defmacro unless [expr form]
    (list 'if expr nil form))

only if is quoted, because:

  • you prevent Clojure from directly evaluating if at macro expansion time. Instead, evaluation strips off the quote, leaving if to be compiled.
  • you do not want to quote expr and form, because they are macro arguments. Clojure will substitute them without evaluation at macro expansion time.
  • you do not need to quote nil, since nil evaluates to itself.

you can test macro expansions at the REPL by using macroexpand-1

; (macroexpand-1 form)
(macroexpand-1 '(unless false (println "this should print")))
=> (if false nil (println "this should print"))
(defmacro bad-unless [expr form]
    (list 'if 'expr nil form))
(macroexpand-1 '(bad-unless false (println "this should print")))
=> (if expr nil (println "this should print"))

for recursively expand, use macroexpand

; (macroexpand form)
(macroexpand '(.. arm getHand getFinger))
=> (. (. arm getHand) getFinger)

it's not a problem arm, getHand, and getFinger do not exist. you are only expanding them, not attempting to compile and execute them.

when and when-not

(unless false (println "this") (println "and also this"))
=> java.lang.IllegalArgumentException
; (when test & body)
; (when-not test & body)
(when-not false (println "this) (println "and also this))
| this
| and also this
=> nil
; source for when-not
(defmacro when-not [test & body]
    (list 'if test nil (cons 'do body)))
; exam with macroexpand-1
(macroexpand-1 '(when-not false (print "1") (print "2")))
=> (if false nil (do (print "1") (print "2")))

when differs from if in two ways:

  • if allows an else clause, and when does not.
  • since when does not have else as its second argument, it can take a variable argument list and execute all the arguments inside a do

Making Macros Simpler

build a replica of Clojure's .. macro, call it chain

(defmacro chain [x form]
    (list '. x form))
; chain needs support any number of arguments
(defmacro chain
    ([x form] (list '. x form))
    ([x form & more] (concat (list 'chain (list '. x form)) more)))
; exam it
(macroexpand '(chain arm getHead))
=> (. arm getHead)
(macroexpand '(chain arm getHead getFinger))
=> (. (. arm getHead) getFinger)

it works fine but this part is difficult to read

(contact (list 'chain (list '. x form)) more))

one solution is a templating language

; hypothetical templating language
(defmacro chain
 ([x form] (. ${x} ${form}))
 ([x form & more] (chain (. ${x} ${form}) ${more})))

Syntax Quote, Unquote, and Splicing Unquote

The syntax quote character `` `` works almost like normal quoting, but inside a syntax quoted list, the *unquote character* ~` turns quoting off again.

(defmacro chain [x form]
    `(. ~x ~form))
(macroexpand '(chain arm getHead))
=> (. arm getHead)

but it doesn't work for multiple arguments

; does not quite work
(defmacro chain
 ([x form] `(. ~x ~form))
 ([x form & more] `(chain (. ~x ~form) ~more)))
(macroexpand '(chain arm getHead getFinger))
=> (. (. arm getHead) (getFinger))

you want more to be spliced into the list. splicing unquote ~@ is a reader macro for it

(defmacro chain
 ([x form] `(. ~x ~form))
 ([x form & more] `(chain (. ~x ~form) ~@more)))
(macroexpand '(chain arm getHead getFinger))
=> (. (. arm getHead) getFinger)

many macros follow the pattern of chain:

  1. begin the macro body with syntax quote `` ` `` to treat the entire thing as a template
  2. insert individual arguments with an unquote ~
  3. splice in more arguments with splicing unquote ~@

Chapter 8, Multimethods

Chapter 9, Clojure in the Wild

Search Blog: