Friday, 23 September 2011

Drop every nth item of a sequence

Start with a simple sequence:-

user=> (def xs [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20])
#'user/xs
user=> xs
[1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20]


The task is to write a function that will drop, say, every fourth item and return the resulting sequence.

My first thought is that we can start by partitioning the sequence up into sub-sequences each, say, 4 elements, with the built-in function partition:-

user=> (partition 4 xs)
((1 2 3 4) (5 6 7 8) (9 10 11 12) (13 14 15 16) (17 18 19 20))


Then we take only the first 3 of each of these sub-sequences, thereby
omitting each fourth item overall:-

user=> (map #(take 3 %) *1)
((1 2 3) (5 6 7) (9 10 11) (13 14 15) (17 18 19))


Then we flatten these to make the whole list without the fourth ones:-

user=> (flatten *1)
(1 2 3 5 6 7 9 10 11 13 14 15 17 18 19)


However, this does not work for the general case. The function partition returns sub-sequences only until there are enough elements left in the list to have a whole sub-sequence. So 20 elements will work if you break into sub-sequences of four elements:-

user=> (partition 4 xs)
((1 2 3 4) (5 6 7 8) (9 10 11 12) (13 14 15 16) (17 18 19 20))


But not if I split by, say, six:-

user=> (partition 6 xs)
((1 2 3 4 5 6) (7 8 9 10 11 12) (13 14 15 16 17 18))


We lose 19 and 20: but if we were dropping every sixth item we would want these in. Hmm.

OK, plan B we use a list comprehension with for.

First, we want to number the elements of the list. Here is the sequence:-

user=> (for [x xs] x)
(1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20)


To pick out certain ones, we number them by putting each one into a vector pair with successive numbers - provided by the function range:-

user=> (for [x (map vector (range) xs)] x)
([0 1] [1 2] [2 3] [3 4] [4 5] [5 6] [6 7] [7 8] [8 9] [9 10] [10 11] [11 12]
[12 13] [13 14] [14 15] [15 16] [16 17] [17 18] [18 19] [19 20])


We destructure the vectors into an element x and a count n, so our output can just show the element x:-

user=> (for [[n x] (map vector (range) xs)] x)
(1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20)


I've confused matters by starting with a sequence that contains numbers in the first place. Well, never mind.

If I want to omit every fourth item then I want to drop the ones where n mod 4 is zero - that is, retain the ones where mod n mod 4 > 0. No, hang on - the counter n starts from zero not one - so I want to retain the ones where (n + 1) mod 4 > 0.

So I want a filter in my list comprehension (> (mod (+ n 1) 4) 0) like so:-

user=> (for [[n x] (map vector (range) xs) :when (> (mod (+ n 1) 4) 0)] x)
(1 2 3 5 6 7 9 10 11 13 14 15 17 18 19)


Does this work for every sixth item?

user=> (for [[n x] (map vector (range) xs) :when (> (mod (+ n 1) 6) 0)] x)
(1 2 3 4 5 7 8 9 10 11 13 14 15 16 17 19 20)

Interpose into a sequence

The next task is to write a function that interposes a given value between elements of a sequence and returns the result.  For example we want the function f that does this:

(f 99 [1 2 3 4 5]))  => [1 99 2 99 3 99 4 99 5]

The built in function interpose does this already but the task is to write our own.

Well, the function repeat gives us a lazy infinite list of repeats of an element x:-

user=> (take 10 (repeat 99))
(99 99 99 99 99 99 99 99 99 99)


So we want to interleave this into our original sequence.  There is a function interleave that does this:-

user=> (interleave (repeat 99) [1 2 3 4 5])
(99 1 99 2 99 3 99 4 99 5)


But we do not want the first element, just the rest, so we say:-

user=> (defn f [x ys] (rest (interleave (repeat x) ys)))
#'user/f
user=> (f 99 [1 2 3 4 5])
(1 99 2 99 3 99 4 99 5)