Friday, 24 January 2014

Evaluator with Error Handling - Monadic Version

On this basis then we put the bind and unit functions in their own module, which looks like this:

-module(exception_monad).
-export([unit/1, bind/2]).
-export([raise/1]).

unit(A) ->
    {ok, A}.

bind({ok, A}, F) ->
    F(A);
bind({error, E}, F) ->
    {error, E}.

raise(E) ->
    {error, E}.

So here we have a function unit/1 that takes a basic value and wraps it in our tuple.  And the function bind/2 does the fork business - it applies the function F if the value is flagged as ok, but if the value is flagged as
an error it passes the error on unchanged.  The function F is going to be our eval function but here in our monad code we leave this function to be anything that returns a value of the right type.

In addition we have a function raise/1 that creates the error case of our value.  Logic says that this code belongs in here.

So now we re-write the evaluator with error to make use of the monad code as follows:

-module(eval_em).
-export([eval/1]).
-export([answer/0, error/0]).

-import(exception_monad, [unit/1, bind/2, raise/1]).

% Evaluator with exception monad

eval({con, A}) ->
    unit(A);
eval({dv, T, U}) ->
    bind(eval(T), fun(A) ->
      bind(eval(U), fun(B) -> 
        if
          B == 0 -> raise("divide by zero");
          true -> unit(A div B)
        end
      end)
    end).

answer() ->
    {dv, {dv, {con, 1972}, {con, 2}}, {con, 23}}.

error() ->
    {dv, {con, 1}, {con, 0}}.

We use the import command to get the functionality from our monad module, which is compiled separately.

Now our evaluation function can deal with the case of a simple value by calling unit/1 to turn this into a monadic value.  And it deals with an expression to evaluate - it ...evaluates the first expression and then with this value in hand it
...calls the function that
...evaluates the second expression and then with both values in hand it
...calls the function that
...either works out the sum or raises an error as appropriate.

All of this happens inside one function call.  The sequence is just the sort of regular pattern that you can smooth over with a bit of syntactical custard.

Also I like the fact that our client code does not know or need to know how the monadic value is constructed - it does not need to know that the value is created by using a tuple and the two atoms ok and
error.  All it needs to do is call the right functions, raise and unit, to create the values it wants.

Does this work as previously?

Eshell V5.10.2  (abort with ^G)
1> l(eval_em).
{module,eval_em}
2> eval_em:eval(eval_em:answer()).
{ok,42}
3> eval_em:eval(eval_em:error()).
{error,"divide by zero"}

So it does.

Enter the Rucksack

In these three variations we introduce an additional parcel of information that rides on the back of the central arguments and return values of  our function.  So the functional hamster has been fitted with a rucksack
and while the hamster keeps the important information in its cheeks the functions that we call are allowed to look inside the rucksack and can even take out what they find and put in something different.  The rodent
simile goes no further.  Anyway these three different examples show the pattern and of course patterns are what we are here to abstract out.  Enter the Rucksack.  Er, Monad I mean.

At this point I note that Erlang does have a format language for describing types of functions but I've not read up that far yet so I'll temporarily bodge it as follows.

So, instead of using functions F that take a plain value and return a plain result:

    F : x -> y

we start using enhanced functions that return an enhanced value:

    F' : x -> y'

In order to choreograph these functions we need a higher level function (generally called bind) that takes an enhanced value x' and an enhanced function F' and applies the one to the other taking care to maintain the additional information:

    bind : x',F' -> y'

In addition we need a little function called unit that goes from a naked value to an enhanced one without changing anything else:

    unit : x -> x'

Logic dictates that using unit to make a monadic value then binding to the function gives you the same result as applying the function to the original value: hope I got that right: anyway let there be a picture:


With this scheme in mind we can abstract out the maintenance of the additional information into a separate module.  So let's see.

Thursday, 16 January 2014

Evaluator with Trace

For this example we want to give our evaluator the power to create a trace of its execution. We add a function line/2 that takes a term and a value and creates a string that describes the term and shows that this evaluates to the value.

For example given the term {con, 42} and the value 42 it returns the string "eval({con,42}) <== 42" or similar.

Obviously this makes sense only if the value we pass is actually the value returned by the evaluation of the term we pass.  I mean, we won't call line({con, 42}), 99).

Then when the evaluator evaluates a term it also calls this function line/2 with the term and with the resulting value, thus showing the trace of that step, and returns this in a tuple with the actual return value.

Also I needed an extra function write_term/1 in there to turn a term into a string, for example to turn {con, 42} into the string "{con, 42}".

These extra functions are not exported from the module because we won't need to call them from the shell. Well, yes we would for testing, so you might opt to export them at first and then remove the export later.

So now each time I evaluate a term I include a trace of the evaluation in a tuple with the result, and I add to the trace each time it passes through.

Anyway here is the code now:

-module(eval_t).
-export([eval/1]).
-export([answer/0,error/0]).

% evaluator with trace

eval({con, A}) ->
    {line({con, A}, A), A};
eval({dv, T, U}) ->
    {X, A} = eval(T),
    {Y, B} = eval(U),
    {X ++ Y ++ line({dv, T, U}, (A div B)), A div B}.

write_term({con, A}) ->
    "{con, " ++ io_lib:print(A) ++ "}";
write_term({dv, T, U}) ->
    "{dv, " ++ write_term(T) ++ ", " ++ write_term(U) ++ "}".

line(Term, A) ->
    "eval (" ++ write_term(Term) ++ ") <==" ++ io_lib:print(A) ++ io_lib:nl().

answer() ->
    {dv, {dv, {con, 1972}, {con, 2}}, {con, 23}}.

error() ->
    {dv, {con, 1}, {con, 0}}.

So now when I evaluate the Answer I get back the value 42 tucked into a tuple with the string that constitutes the accumulated trace of the eval operations:

1> l(eval_t).
{module,eval_t}
2> eval_t:eval(eval_t:answer()).
{"eval ({con, 1972}) <==1972\neval ({con, 2}) <==2\neval ({dv, {con, 1972}, {con
, 2}}) <==986\neval ({con, 23}) <==23\neval ({dv, {dv, {con, 1972}, {con, 2}}, {
con, 23}}) <==42\n",
 42}

I need to spread that trace string out - split that string at each "\n" character:

eval ({con, 1972}) <==1972
eval ({con, 2}) <==2
eval ({dv, {con, 1972}, {con, 2}}) <==986
eval ({con, 23}) <==23
eval ({dv, {dv, {con, 1972}, {con, 2}}, {con, 23}}) <==42

Why does the arrow point from the result to the parameter and not the other way?  Don't know.  Hmm.

Saturday, 4 January 2014

Evaluator with State.

Leaving the error processing aside the next example incorporates an updateable state into our evaluator.  As an example we say that we wish to count how often the division operation has been called.  We could do this by creating a global variable that we increment in the division code.  In our functional version we do this by tagging our values with a state, our counter, by including it in a tuple. So instead of our evaluator returning an integer A it will return return {A, N} where A is the integer result and N is is the count of division operations that we have done.

So we adapt the evaluator to set and maintain this counter, as so:

-module(eval_s).
-export([eval/1]).
-export([answer/0, error/0]).

% evaluator with state

eval({{con, A}, X}) ->
    {A, X};
eval({{dv, T, U}, X}) ->
    {A, Y} = eval({T, X}),
    {B, Z} = eval({U, Y}),
    {A div B, Z + 1}.

answer() ->
    {dv, {dv, {con, 1972}, {con, 2}}, {con, 23}}.

error() ->
    {dv, {con, 1}, {con, 0}}.

The counter is passed through each evaluation and on the line where we do the division we add one to it.

Then when we evaluate Answer we also pass a starting value zero for the counter:  the result contains the answer 42 and the final value of the counter, 2.

1> l(eval_s).
{module,eval_s}
2> eval_s:eval({eval_s:answer(), 0}).
{42,2}
3> eval_s:eval({eval_s:error(), 0}).
** exception error: an error occurred when evaluating an arithmetic expression
     in function  eval_s:eval/1 (c:/Users/polly/Erlang/eval_s.erl, line 12)

Evaluating Error still gives you an exception of course.

Wednesday, 1 January 2014

Evaluator with error handling

The next version allows for the fact that the core division operation can fail. The natural way to deal with this is to throw an exception and get out of the module up to a higher level. However this means that your function has not properly returned.  The other approach is to plough on to the end, taking care to avoid using anything that depends on a value you couldn't get because of the error.  Everyone has written code like this. Your execution code forks and forks again.  Or rather, everyone has a favourite strategy for avoiding writing code like this.  Subject for a later blog post I think.  Anyway, this is how we can handle the error state in our evaluator.  Intead of just returning a value A we return either the tuple {ok, A}, where A is some integer, representing a successful evaluation, or else the tuple {error, Reason} where Reason is some string explaining what the error is.

Then in the evaluator when we are passed a parameter which evaluates to an error we just pass that error back.  If we are asked to divide by zero we create the error expression.  Otherwise it is calculated as normal.  we just have to add the forks to the code to make this work, like so:

-module(eval_e).
-export([eval/1]).
-export([answer/0, error/0]).

% evaluator with error handling

eval({con, A}) ->
    {ok, A};
eval({dv, T, U}) ->
    case eval(T) of
        {error, E} -> {error, E};
        {ok, A} ->
            case eval(U) of
                {error, E} -> {error, E};
                {ok, B} ->
                    if
                        B == 0 -> {error, "divide by zero"};
                        true -> {ok, A div B}

                    end
            end
    end.

answer() ->
    {dv, {dv, {con, 1972}, {con, 2}}, {con, 23}}.

error() ->
    {dv, {con, 1}, {con, 0}}.


Applying the new evaluator to Answer returns the tuple {ok, 42} indicating a successfully completed computation and showing the result:  applying it to Error returns {error, "divide by zero"}.  These are both legitimate return values for the function.

1> l(eval_e).
{module,eval_e}
2> eval_e:eval(eval_e:answer()).
{ok,42}
3> eval_e:eval(eval_e:error()).
{error,"divide by zero"}


The Streets of Software City are littered with cases where the return values from a function can include values that are not just quantitively different but qualitatively different.  For example a function that is supposed to return the next character from a stream can also return an integer that is not a character because it needs to indicate the end of file somehow.  Or a function that finds the index of a character inside a string can return -1 to indicate that there is no character to find.  End-of-file is not a type of character: -1 is not an index: your return data is polluted with meta-data.  Being able to tag values and then pattern match against the tags is one way to tackle this.  In Haskell you can create types that have alternative qualitatively different values and bolt them into your type system so the code won't even compile if you haven't handled all the cases correctly.  That has to be the right way, surely?

Anyway this handles the error but has the bad effect that your code keeps forking every time you hit a function that can return the two different types of value.

The basic evaluator

OK so I've taken down Philip Wadler's paper "Monads for functional programming" from its place on my One Day I'll Get My Head Round This pile because I ask myself, how would this work in Erlang?  Here goes.

Basic evaluator

The paper illustrates application of monads to functional programming by means of a series of variations on a tiny example program, a module for evaluating simple expressions.

The evaluator works on two types of term:
  • A term is either a constant, denoted by the tuple {con, A} where A is some integer, 
  • Or it is a quotient, denoted {dv, T, U} where T and U are other terms.
The basic evaluator looks like this. To evaluate a Constant {con, A} we return the integer A.  To evaluate the quotient {dv, T, U} we evaluate T and U and then divide the value from T by the value from U.
There are also two example expressions to test, called Answer and Error:
  • Answer represents the calculations (1972/2)/23 = 42 (I get the allusion) and 
  • Error represents 1/0 so it is there to cause an error.  
I've added these to the module as functions to save typing them out.  Here is the code.
-module(eval0).
-export([eval/1]).
-export([answer/0, error/0]).

% Basic evaluator

eval({con, A}) ->
    A;
eval({dv, T, U}) ->
    eval(T) div eval(U).

answer() ->
    {dv, {dv, {con, 1972}, {con, 2}}, {con, 23}}.

error() ->
    {dv, {con, 1}, {con, 0}}.

So I can load this code into the Erlang shell and evaluate the two expressions answer and error:

C:\Users\polly\Erlang>erl
Eshell V5.10.2  (abort with ^G)
1> l(eval0).
{module,eval0}
2> eval0:eval(eval0:answer()).
42
3> eval0:eval(eval0:error()).
** exception error: an error occurred when evaluating an arithmetic expression
     in function  eval0:eval/1 (c:/Users/polly/Erlang/eval0.erl, line 10)
4> q().
ok
5>
C:\Users\polly\Erlang>


So answer brings back the answer 42, and error leads to an exception in the shell.