Pipes, Clojure and Choco (part 1: Limits of Laziness)

This is the first in a series of posts that will explore constraint programming. It’s always more fun to learn with a good example, so we’re going to write a solution for this problem (yes, that link is in Dutch - I think - but it’s pretty obvious what is required).

However, this first post won’t give any solution. Instead, it explores a detail of Clojure that can be important if you are relying on lazy sequences for efficiency.

The issue comes to light when we try to use Clojure’s lazy sequences to make an efficient depth-first search (which we are going to need for the problem above). Here’s a function that encapsulates DFS as a general process:

(defn eager-dfs [node solution? expand count trace]
  (swap! count inc)
  (if trace (trace node @count))
  (if (solution? node)
    (list [node @count])
    (apply concat
      (map (fn [n] (eager-dfs n solution? expand count trace))
        (expand node)))))

The basic idea is that we recurse down through children (provided by expand). Whenever we reach a solution we return it as part of a list of results.

If you’re unused to Clojure, but can more-or-less grok Lisp, then I suspect the only thing that seems odd here is @count. That’s because count is a mutable value [1], used to avoid “resetting” the call count on backtracking; mutable state requires special handling in Clojure, where variables and data structures are immutable by default.
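If the atom pattern is new to you, here’s a minimal sketch: an atom holds a value, swap! applies a function to update it, and @ (shorthand for deref) reads the current value.

```clojure
;; an atom holds mutable state; swap! updates it, @ reads it
(def counter (atom 0))
(swap! counter inc)    ; counter now holds 1
(swap! counter inc)    ; counter now holds 2
(println @counter)     ; prints 2
```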

Here is some test code and output that shows the search in action:

(defn expand-tree [search]
  (let [solution? (fn [n] (not (seq? n)))
        expand (fn [n] (for [c n] (let [] (debug "expand:" c) c)))
        trace (fn [n c] (debug "trace:" n c))
        results (search '((:a (:b :c )) :d ) solution? expand (atom 0) trace)]
    (println "before")
    (let [a (first results)]
      (println "first: " a))
    (let [b (first (next results))]
      (println "second: " b))
    (let [c (doall results)]
      (println "all: " c))))

(deftest test-eager
  (println "\ntest-eager")
  (expand-tree eager-dfs))
test-eager
trace: ((:a (:b :c)) :d) 1
expand: (:a (:b :c))
trace: (:a (:b :c)) 2
expand: :a
trace: :a 3
expand: (:b :c)
trace: (:b :c) 4
expand: :b
trace: :b 5
expand: :c
trace: :c 6
expand: :d
trace: :d 7
before
first:  [:a 3]
second:  [:b 5]
all:  ([:a 3] [:b 5] [:c 6] [:d 7])

And it works as expected. Which is fine, except that if we were interested only in the first result then we would have waited longer than necessary, because all results are calculated before being returned.
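To make that cost concrete, here is a tiny sketch (the slow-square name is mine, purely for illustration) comparing an eager mapv with a lazy map over a plain (non-chunked) list:

```clojure
;; count how many times the "expensive" step actually runs
(def calls (atom 0))
(defn slow-square [x] (swap! calls inc) (* x x))

;; eager: mapv realizes every element before returning
(reset! calls 0)
(first (mapv slow-square (range 100)))
(println @calls)   ; prints 100

;; lazy: map over a (non-chunked) list realizes only what is asked for
(reset! calls 0)
(first (map slow-square (apply list (range 100))))
(println @calls)   ; prints 1
```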

With Clojure this should not be a problem - the language has pervasive support for lazy sequences. All we need do is add lazy-seq:

(defn lazy-dfs [node solution? expand count trace]
  (lazy-seq
    (swap! count inc)
    (if trace (trace node @count))
    (if (solution? node)
      (list [node @count])
      (apply concat
        (map (fn [n] (lazy-dfs n solution? expand count trace))
          (expand node))))))
(deftest test-lazy
  (println "\ntest-lazy")
  (expand-tree lazy-dfs))
test-lazy
before
trace: ((:a (:b :c)) :d) 1
expand: (:a (:b :c))
expand: :d
trace: (:a (:b :c)) 2
expand: :a
expand: (:b :c)
trace: :a 3
first:  [:a 3]
trace: (:b :c) 4
expand: :b
expand: :c
trace: :b 5
second:  [:b 5]
trace: :c 6
trace: :d 7
all:  ([:a 3] [:b 5] [:c 6] [:d 7])

Bingo! The first result, :a, is available much earlier.

But look more closely. Why is expand: :d being printed before :a is returned? We do not want to invoke expand on nodes until they are needed.

If you’re experienced with Clojure you might want to go away and think about this. It was not obvious to me…

Here’s the solution:

(defn very-lazy-concat-map [f l]
  (if (seq l)
    (lazy-cat (f (first l)) (very-lazy-concat-map f (next l)))
    nil))
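To watch the item-by-item behaviour in isolation, here is a small sketch (the noisy-children helper is mine, purely for illustration — each “child” announces itself when realized):

```clojure
;; very-lazy-concat-map, as defined above
(defn very-lazy-concat-map [f l]
  (if (seq l)
    (lazy-cat (f (first l)) (very-lazy-concat-map f (next l)))
    nil))

;; each "child" prints a message at the moment it is realized
(defn noisy-children [x]
  (lazy-seq (print "realizing" x "... ") (list x)))

;; taking the first result realizes only the first child
(first (very-lazy-concat-map noisy-children '(1 2 3)))
;; prints: realizing 1 ...
```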

(defn dfs [node solution? expand count trace]
  (lazy-seq
    (swap! count inc)
    (if trace (trace node @count))
    (if (solution? node)
      (list [node @count])
      (very-lazy-concat-map
        (fn [n] (dfs n solution? expand count trace))
        (expand node)))))
(deftest test-very-lazy
  (println "\ntest-very-lazy")
  (expand-tree dfs))
test-very-lazy
before
trace: ((:a (:b :c)) :d) 1
expand: (:a (:b :c))
trace: (:a (:b :c)) 2
expand: :a
trace: :a 3
first:  [:a 3]
expand: (:b :c)
trace: (:b :c) 4
expand: :b
trace: :b 5
second:  [:b 5]
expand: :c
trace: :c 6
expand: :d
trace: :d 7
all:  ([:a 3] [:b 5] [:c 6] [:d 7])

And that really does work as expected.

But why do we need to provide very-lazy-concat-map when Clojure’s built-in map is already lazy? The manual says:

Returns a lazy sequence consisting of the result of applying f to the set of first items of each coll, followed by applying f to the set of second items in each coll, until any one of the colls is exhausted.

It turns out that, for efficiency, Clojure “chunks” many lazy sequences in groups of 32 items. So even though map is lazy, in a sense, it will happily expand 32 children at a time. (On top of that, apply concat must realize the first few elements of its argument sequence just to invoke concat, which defeats laziness even for unchunked input.) In contrast, our very-lazy-concat-map forces item-by-item laziness.

A simple way to see this is to run the following at the Clojure REPL:

user=> (defn noisy [l] (for [x l] (let [] (print x) x)))
#'user/noisy
user=> (first (noisy (range 1000)))
0123456789101112131415161718192021222324252627282930310
user=>
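A common workaround, if you need to defeat chunking, is to re-wrap a sequence so that it is realized one item at a time (unchunk is a widely-used idiom, not part of core Clojure):

```clojure
;; re-wrap a (possibly chunked) seq so it yields one item at a time
(defn unchunk [s]
  (lazy-seq
    (when-let [s (seq s)]
      (cons (first s) (unchunk (rest s))))))

;; the noisy helper from the REPL session above
(defn noisy [l] (for [x l] (let [] (print x) x)))

;; for iterates an unchunked seq element by element, so only 0 is printed
(first (noisy (unchunk (range 1000))))
```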

You can see all the code in the repo.

Next time I’ll show the first solution to the puzzle.

Andrew


  1. More exactly, it is an Atom. At one point I thought (and wrote here) that a Var could be used via with-local-vars, but that goes out of scope when lazy sequences are returned (when the lazy sequence is expanded, calculations occur that attempt to access the now-unbound Var).

