Friday, 23 September 2011

Drop every nth item of a sequence

Start with a simple sequence:-

user=> (def xs [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20])
#'user/xs
user=> xs
[1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20]


The task is to write a function that will drop, say, every fourth item and return the resulting sequence.

My first thought is that we can start by partitioning the sequence up into sub-sequences each, say, 4 elements, with the built-in function partition:-

user=> (partition 4 xs)
((1 2 3 4) (5 6 7 8) (9 10 11 12) (13 14 15 16) (17 18 19 20))


Then we take only the first 3 of each of these sub-sequences, thereby
omitting each fourth item overall:-

user=> (map #(take 3 %) *1)
((1 2 3) (5 6 7) (9 10 11) (13 14 15) (17 18 19))


Then we flatten these to make the whole list without the fourth ones:-

user=> (flatten *1)
(1 2 3 5 6 7 9 10 11 13 14 15 17 18 19)


However, this does not work for the general case. The function partition returns sub-sequences only until there are enough elements left in the list to have a whole sub-sequence. So 20 elements will work if you break into sub-sequences of four elements:-

user=> (partition 4 xs)
((1 2 3 4) (5 6 7 8) (9 10 11 12) (13 14 15 16) (17 18 19 20))


But not if I split by, say, six:-

user=> (partition 6 xs)
((1 2 3 4 5 6) (7 8 9 10 11 12) (13 14 15 16 17 18))


We lose 19 and 20: but if we were dropping every sixth item we would want these in. Hmm.

OK, plan B we use a list comprehension with for.

First, we want to number the elements of the list. Here is the sequence:-

user=> (for [x xs] x)
(1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20)


To pick out certain ones, we number them by putting each one into a vector pair with successive numbers - provided by the function range:-

user=> (for [x (map vector (range) xs)] x)
([0 1] [1 2] [2 3] [3 4] [4 5] [5 6] [6 7] [7 8] [8 9] [9 10] [10 11] [11 12]
[12 13] [13 14] [14 15] [15 16] [16 17] [17 18] [18 19] [19 20])


We destructure the vectors into an element x and a count n, so our output can just show the element x:-

user=> (for [[n x] (map vector (range) xs)] x)
(1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20)


I've confused matters by starting with a sequence that contains numbers in the first place. Well, never mind.

If I want to omit every fourth item then I want to drop the ones where n mod 4 is zero - that is, retain the ones where mod n mod 4 > 0. No, hang on - the counter n starts from zero not one - so I want to retain the ones where (n + 1) mod 4 > 0.

So I want a filter in my list comprehension (> (mod (+ n 1) 4) 0) like so:-

user=> (for [[n x] (map vector (range) xs) :when (> (mod (+ n 1) 4) 0)] x)
(1 2 3 5 6 7 9 10 11 13 14 15 17 18 19)


Does this work for every sixth item?

user=> (for [[n x] (map vector (range) xs) :when (> (mod (+ n 1) 6) 0)] x)
(1 2 3 4 5 7 8 9 10 11 13 14 15 16 17 19 20)

Interpose into a sequence

The next task is to write a function that interposes a given value between elements of a sequence and returns the result.  For example we want the function f that does this:

(f 99 [1 2 3 4 5]))  => [1 99 2 99 3 99 4 99 5]

The built in function interpose does this already but the task is to write our own.

Well, the function repeat gives us a lazy infinite list of repeats of an element x:-

user=> (take 10 (repeat 99))
(99 99 99 99 99 99 99 99 99 99)


So we want to interleave this into our original sequence.  There is a function interleave that does this:-

user=> (interleave (repeat 99) [1 2 3 4 5])
(99 1 99 2 99 3 99 4 99 5)


But we do not want the first element, just the rest, so we say:-

user=> (defn f [x ys] (rest (interleave (repeat x) ys)))
#'user/f
user=> (f 99 [1 2 3 4 5])
(1 99 2 99 3 99 4 99 5)

Tuesday, 23 August 2011

Basic Destructuring

A couple of examples showing destructuring - using the argument list to break up a composite value passed to a function.  The same principle applies within a let statement so it is easy to see how it works.

The function range produces a list of values from a start to an (exclusive) end number with a specified step:

user=> (range 1 10 3)
(1 4 7)


You can omit the step and it will step by 1:

user=> (range 1 10)
(1 2 3 4 5 6 7 8 9)


You can supply just the end and it will start from 0:

user=> (range 10)
(0 1 2 3 4 5 6 7 8 9)


Call it on its own and it will start at 0 and go on as long as you take numbers out:

user=> (take 20 (range))
(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19)


So, if we pass (range) as a parameter to a let statement, we can deconstruct the first few elements of the list of numbers by using a vector of parameter identifiers like this:

user=> (let [[a b c d e f g] (range)] [c e])
[2 4]


So in the body of the (let), a=0, b=1, c=2, d=3, e=4, f=5, g=6.

And so [c e] returns [2 4]

The additional character & in the parameter deconstruction sets the following identifier to a vector of the remaining parameters that have not yet been assigned.

And the keyword :as indicates that the next identifier will take the value of the whole list.  Therefore:

user=> (let [[a b & c :as d] [1 2 3 4 5]] [a b c d])
[1 2 (3 4 5) [1 2 3 4 5]]


What happens here?  From the input values [1 2 3 4 5], a takes the first value 1, b takes the next value 2, and c takes the remaining values and is set to [3 4 5].  Then d is set to the value of the whole input list, that is [1 2 3 4 5].

Compare this with the case where we do not deconstruct the input - we could set d to be the whole of the input list like so:

user=> (let [d [1 2 3 4 5]] d)
[1 2 3 4 5]


Friday, 29 July 2011

Simple Simple Recursion

The next question asks what the following function will return:

((fn foo [x] (when (> x 0) (conj (foo (dec x)) x))) 5)


I hadn't spotted until now that an anonymous function definition can give the function a name, that is to say a name that is visible inside the definition, so you can easily write recursive lambdas.

Anyway clearly this is going to loop down from 5 to 1, building up a list as it goes - so the answer is

(1 2 3 4 5)


Wrong.  This is what happens:

user=> ((fn foo [x] (when (> x 0) (conj (foo (dec x)) x))) 5)
(5 4 3 2 1)


Ha. Maybe it would be clearer (to me at any rate) if I did this with cons instead of conj:

user=> ((fn foo [x] (when (> x 0) (cons x (foo (dec x))))) 5)
(5 4 3 2 1)


The alternative way with cons instead of conj: the x and the call to foo go round the other way. But for lists, conj and cons do the same thing - they add the new element to the start of the list:

user=> (cons 1 '(2))
(1 2)
user=> (conj '(2) 1)
(1 2)


It's with vectors that conj goes the other way:

user=> (conj [2] 1)
[2 1]


when is like if except there is no else part - when the test returns false the when clause returns nil.
You can conveniently cons onto nil to start a list as you would cons onto the empty list:

user=> (cons :a nil)
(:a)
user=> (cons :a '())
(:a)
user=> (cons :a ())
(:a)


Even though nil and () are not the same.

OK, so in this function what happens is the recursive calls burrow their way down from 5 to zero, get a nil when they are down there, and then pop their way back out consing the numbers 1, 2, 3, 4, 5 onto the return list as they go -

(foo 5)
(cons 5 (foo 4))
(cons 5 (cons 4 (foo 3)))
(cons 5 (cons 4 (cons 3 (foo 2))))
(cons 5 (cons 4 (cons 3 (cons 2 (foo 1)))))
(cons 5 (cons 4 (cons 3 (cons 2 (cons 1 (foo 0))))))
(cons 5 (cons 4 (cons 3 (cons 2 (cons 1 nil)))))
(cons 5 (cons 4 (cons 3 (cons 2 '(1))))
(cons 5 (cons 4 (cons 3 '(2 1)))
(cons 5 (cons 4 '(3 2 1)))
(cons 5 '(4 3 2 1))
'(5 4 3 2 1)


Right.  Got it.

Interleave Two Sequences

Similarly to interleave two sequences, without of course using the interleave function.

So for example [1 2 3] [:a :b :c] => [1 :a 2 :b 3 :c]

(fn [sq1] [sq2]
  (loop [src1 sq1, src2 sq2, dest []]
    (if (and (seq src1) (seq src2))
      (recur
        (rest src1)
        (rest src2)
        (conj dest (first src1) (first src2)))
       dest)))

Wednesday, 27 July 2011

Transforming Sequences

So with those points in mind here are two examples from 4Clojure.

To pack a sequence - that is to take all runs of one or more equal elements and put them inside their own sub-sequences. At each turn round the loop get the next element and then use take-while to gather equal ones into a sub-list and use drop-while to skip over the ones waiting to be processed, like this:

(defn pack-sequence [sq]
  (loop [src sq, dest []]
    (if (seq src)
      (let [x (first src)]
        (recur
          (drop-while #(= x %) src)
          (conj dest (take-while #(= x %) src))))
      dest)))


So for example (pack-sequence '(1 1 2 2 2 3) => '((1 1) (2 2 2) (3))

And similarly to replicate a sequence, meaning to duplicate each element a nominated number of times: use repeat to do the duplication and concat to add the elements to the growing destination sequence:

(defn replicate-sequence [sq n]
  (loop [src sq, dest []]
    (if (seq src)
      (recur
        (rest src)
        (concat dest (repeat n (first src))))
    dest)))


So for example (replicate-sequence '(1 2 3) 2) => '(1 1 2 2 3 3)

Doing Sequences

So I have invested in The Joy Of Clojure by Fogus & Houser. I just felt like buying a book. Well, with a few of their tips I can update my scheme for transforming sequences. One observation is that "quoting the empty list isn't idiomatic Clojure" (p.34) - so you don't have to write '() because () will do. Another observation is that the idiomatic way to test whether a sequence is empty is to use (seq s), which returns nil if (empty? s), and nil evaluates to boolean false. On p.87 they point out that you can remove that call to reverse on closing a sequence transformation by starting with a vector and using conj instead of cons. So instead of:

(defn copy-sequence-1 [sq]
  (loop [src sq, dest ()]
    (if (seq src)
      (recur
        (rest src)
        (cons (first src) dest))
      (reverse dest))))


You can do this:

(defn copy-sequence-2 [sq]
  (loop [src sq, dest []]
    (if (seq src)
      (recur
        (rest src)
        (conj dest (first src)))
      dest)))


You get a vector back but that's not a problem. In fact we read "in idiomatic Clojure code lists are used almost exclusively to represent code forms" (p.90) - for data rather than code "lists rarely offer any value over vectors".