Thursday, 19 January 2012

Re-implement map

This brings us to the question of how we would implement the function map itself.

This is a basic building block that takes a function and a list and returns a list created by applying the given function to each of the elements of the given list. To map f onto a sequence, apply f to the first element and join the result onto whatever you get my mapping f onto the rest of the sequence.  So the obvious implementation looks like this:-

(defn mymap [f x]
  (if (seq x)
    (cons (f (first x)) (mymap f (rest x)))
    '()))


For example, increment a list of numbers:

user=> (mymap inc '(1 2 3 4))
(2 3 4 5)


However, the built-in function map is lazy. That is to say, I can apply it to an unlimited list provided I don't try to take the whole result. For example, recall that the function range without any parameters returns a lazy sequence of integers starting from 0 and going on forever:

user=> (take 10 (range))
(0 1 2 3 4 5 6 7 8 9)


I can still use this as a parameter in map:

user=> (take 10 (map inc (range)))
(1 2 3 4 5 6 7 8 9 10)


But if I do this with my reimplementation we get this:

user=> (take 10 (mymap inc (range)))
StackOverflowError clojure.lang.ChunkedCons.first (ChunkedCons.java:37)


Because the recursion in the function mymap starts work on its input list whether it needs it or not and just keeps going. To avoid this we want to write a lazy version of the map. In Clojure this is easier to do than to understand. I'm guided here by Fogus & Hauser, p 166. We add the macro lazy-seq to our map definition:

(defn mymap2 [f x]
  (lazy-seq
    (if (seq x)
      (cons (f (first x)) (mymap2 f (rest x)))
      '())))


Now this works for limited sequences:

user=> (take 3 (mymap2 inc '(1 2 3 4 5 6)))
(2 3 4)


But it also works for infinite ones:

user=> (take 3 (mymap2 inc (range)))
(1 2 3)


hmm. Pretty neat.

For the purposes of the 4Clojure exercise  we need to code this as an anonymous function that has an internally visible name so that it can call itself - you can add a name into the function definition form that will do this, as so:

(fn my-map [f x]
  (lazy-seq
    (if (seq x)
      (cons (f (first x)) (my-map f (rest x)))
      '())))


Here the name my-map is visible just inside the body of the definition.

No comments:

Post a Comment