Clojure Euler: Problem 002

Each new term in the Fibonacci sequence is generated by adding the previous two terms. By starting with 1 and 2, the first 10 terms will be:

1, 2, 3, 5, 8, 13, 21, 34, 55, 89, …

By considering the terms in the Fibonacci sequence whose values do not exceed four million, find the sum of the even-valued terms.

Permalink: http://projecteuler.net/problem=2

This is not hard problem with objective to make you familliar with Fibonacci numbers (this numbers will appear in a lot of next problems). It can be implemented with the straightforward manner with the following rules applied:

F(0) = 1
F(1) = 1
F(n) = F(n-1) + F(n-2)

So, we can define simple recursive function to calculate n-th number of Fibonacci sequence in clojure:

(defn fib [n]
  (if (or (= n 1) (= n 0)) 1
    (+ (fib (- n 1)) (fib (- n 2)))))

It’s probably not the best implementation of Fibonacci numbers, but it shows the idea. We can check that it works correctly by calling (map fib (range 10)) that produces sequence (1 1 2 3 5 8 13 21 34 55). Problem appears when we try to calculate 40th number. Calling that function on my machine takes ~20 seconds. Not good.

The reason of such bad performance is repeated calculations. Let’s see how expanded call (fib 40). It produces (+ (fib 39) (fib 38)), where (fib 39) expanded to (+ (fib 38) (fib 37)) and (fib 38) to (+ (fib 37) (fib 36)) etc. The growth with factor of 2. And with these two expansions we calculated 2 values that we calculated before, 37th and 38th.

We can make better if we producing our sequence in direct order, instead of reverse. This way function will be more complex than previous:

(defn fib-seq [n]
  ((fn [a b c seq]
    (cond (= 1 n) [1]
          (= c n) seq
          :else (recur b (+ a b) (inc c) (conj seq (+ a b)))))
   1 1 2 [1 1]))

Let’s see what’s new clojure features we used:

We define function inside function. Why we did this? All is because we used “helper”-function for accumulate current Fibonacci sequence. It used only in function fib-seq, so we don’t need to define it at the root level. It closed for using for everyone, except fib-seq.
Next thing we define anonymous function by keyword fn. Basically it’s the right thing to define functions. defn just shortcut for (def (fn [] )), and..
..we pass default (initialization) values for helper function 1 1 2 [1 1].
I skip logic of helper function here. It is pretty straightforward, just notice two new clojure functions that we used.
cond - similar to switch-case structure in C-style languages.
conj - add value to collection. Dependant on collection type, insert new value in different places. For our case we using vector, so we add this value to the end of vector.

Now we can call (fib-seq 40) and it calculates all 40 values very quickly. Nice.

But let’s return to problem description “… whose values do not exceed 4 million…“. How do we know how many values we need to take that do not exceed 4 million? 50? 100? 112? This is drawback of our fib-seq function and we need to rewrite it…or take a look at clojure mechanism called lazy sequences.

Lazy sequences

In few words, lazy sequence is infinite sequence of some values, that calculates due to some expression. The main idea here is the word lazy that means we evaluate expressions when they are needed. For example, sequence of natural numbers is a lazy sequence. Clojure supports lazy sequences along with operations on them. So, we can combine sequences, limit, filter etc. Java do not support lazy sequences but similar functionality can be implemented with generators concept. Generator interface provides method next() to retrieve next value in the sequence, and that method evaluates value. But this approach is very poor with comparison to clojure lazy sequences.

Lazy sequences in clojure can be treated like ordinary sequences. Obviously some methods make no sense, due to infinity of sequence (count, last, etc.) There are way to define lazy sequences by using macro lazy-seq. But most common way to do it with the function iterate. It takes function f, and initial value a, and produces lazy sequence (a, f(a), f(f(a)),...). For example lazy sequence of natural numbers we can define as following: (iterate inc 1). Then we can play with this sequence as we want.

WARNING: Never call lazy sequence without limiting functions. It tries to evaluate all, and… You know, never call.

Now, we have a little understanding what lazy sequence is, and can implement Fibonacci lazy sequence.

(defn fib-seq-lazy []
  (map first (iterate (fn [[a b]] [b (+ a b)]) [1 1])))

Ok, What’s here:

Read from the end. We apply anonymous function to vector [1 1] and produce lazy sequence with iterate.
Note double square brackets in anonymous function definition. Outer brackets indicate function variables. Inner brackets indicate that we take one parameter to function as argument. This parameter is a sequence and we map to a and b, first and second elements in that sequence. For example for first call, a bound to 1, b bound to 1.
Our iterate produces lazy sequence with following format ([1 1] [1 2] [2 3] [3 5] [5 8] ...). Our Fibonacci sequence is just first value of each pair, and we using…
map - takes function and applies it to each element of collection. This function also produces lazy sequence.
To gather all in one we get lazy sequence that represent Fibonacci numbers (1 1 2 3 5 8 ...).

Now we have to implement functionality that in problem description. Immediately, code:

(reduce +
  (filter even? (take-while #(< % 4000000) (fib-seq-lazy))))

Another one(two)-liner. Good.

take-while - takes values from sequence (includes lazy sequences) while condition is true.
#(< % 4000000) - #() it’s shorter shortcut for anonymous function. In our case predicate return true if value lower than 4 millions. Exacly what we need.
filter - return all values for which predicate return true. Instead of take-while it proceed all sequence till the end, and, obviously, do not accept lazy sequences.
even? - predicate that return true, if number is even. Note: this is common way to append function name with question mark to indicate that this function is predicate (return true or false). Just convention.
And, finally, sum all of them.

Congratulations! We just solved Project Euler Problem 2.

There are also few optimisations to get result faster, but KISS.

GitHub for lazy!

P.S. We could use clojure-contrib library, which is often deployed with standard clojure-core. clojure-contrib.lazy-seqs library contains method (fibs) that also produces lazy Fibonacci sequence. And interestingly it implemented the same way as we did. There are two more lazy sequences: prime numbers and powers of 2. It’s good practice to use existing functionality and not invent vehicle.

Have fun!

mishadoff 15 October 2012

Clojure Euler: Problem 003 → ← Clojure + Emacs + Leiningen + nREPL

mishadoff thoughts

Clojure Euler: Problem 002

Lazy sequences