Friday 8 November 2013

Hello Again World

So let us start at the Place Where People Start.  Write a script in Erlang that will write the text "Hello World" to the terminal.

I call it a "script" because that seems  appropriate when the program is short and is being executed directly from the source code.

Our Hello World script looks like this:

main([]) ->
    io:format("Hello World").

This goes in a file called hello.erl.

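Now, the first line of the script is left blank, so main actually starts on the second line of the file.  This is because escript always skips the first line: it is reserved for the system-specific interpreter line (the "#!" line on Unix) that tells the shell which program runs the script, and any Erlang code put there would be ignored.  I'll leave this blank as I'm on Windows 7 here and that trick does not work.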
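For readers who are not on Windows, here is a rough sketch of what that first line might look like.  It assumes a Unix-like system with escript on the PATH and env living at /usr/bin/env, none of which I can check from here.

```erlang
#!/usr/bin/env escript
%% The line above is the interpreter line.  escript itself skips it,
%% so on Windows it is harmless, and on a Unix-like system it lets
%% the shell find escript (assuming escript is on the PATH).
main([]) ->
    io:format("Hello World").
```

With that in place, and after making the file executable with chmod +x hello.erl, it should be possible to run it as ./hello.erl instead of typing escript hello.erl.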

The rest of the file consists of a definition of the function main.  I told you, it's like C.  The square brackets [] denote a list, and it is in a list that the system passes the arguments from the command line.  At first we'll take this to be empty and not use any arguments.  The symbol -> introduces the body of the definition.  Here the body is a call to the format function in the io module, written with a colon thus: io:format.  To this function we pass a string containing placeholders and a list of values to be slotted into the string, so it does what printf would do in C.  In this case we are just printing a single string, so no placeholders and no list of arguments.  The full stop . terminates the function definition.

The function io:format is documented under the io module in the STDLIB reference manual.  It returns just the atom ok, so of course here we are using the function for its side effect, that is to say writing text to the terminal.  You can call io:format three ways: the first way, just pass a string to be output; the second way, also pass a list of values to substitute into the string; and the third way, first specify the output device, which would otherwise default to standard output.  So these are all the same:

    io:format("Hello World").
    io:format("Hello World", []).
    io:format(standard_io, "Hello World", []).

In the Erlang jargon these would be described as format/1, format/2 and format/3.
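To see the three forms doing slightly different jobs, here is a small sketch.  The strings and values are made up for illustration; ~s, ~p and ~n are standard io:format control sequences (string, any term, newline).  Matching each call against ok is just a way of showing the return value.

```erlang
#!/usr/bin/env escript
main(_) ->
    %% format/1: a plain string, no values to substitute.
    ok = io:format("Hello World~n"),
    %% format/2: a format string plus a list of values, like printf.
    ok = io:format("Hello ~s, you are number ~p~n", ["charlie", 1]),
    %% format/3: the same again, naming the output device explicitly.
    ok = io:format(standard_io, "~s~n", ["done"]).
```

Run with escript, this should print three lines: Hello World, then Hello charlie, you are number 1, then done.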

When you execute the script from the command line with the escript command, escript executes the program starting from the main function.

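So in Erlang jargon we would call this the function main/1, meaning it's called main and it takes one argument.  That one argument is the list of command-line arguments, and our clause main([]) matches only when that list is empty.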

You run this with the command escript like this:

C:\Users\polly\Erlang>escript hello.erl
Hello World

So what about passing arguments?  What happens if I add something to the command line?

C:\Users\polly\Erlang>escript hello.erl charlie
escript: exception error: {function_clause,[{local,main,[["charlie"]]}]}
  in function  escript:code_handler/4 (escript.erl, line 838)
  in call from erl_eval:local_func/5 (erl_eval.erl, line 467)
  in call from escript:interpret/4 (escript.erl, line 774)
  in call from escript:start/1 (escript.erl, line 277)
  in call from init:start_it/1 (init.erl, line 1054)
  in call from init:start_em/1 (init.erl, line 1034)

The escript returns a run-time error because the only clause we have defined for main/1 is main([]), which matches an empty argument list, and this time main was called with the list ["charlie"].  The function exists, but no clause of it matches: that is what function_clause means.

Erlang tries to match the arguments it has in its hands against the patterns in each clause of main in turn, and finding none that match, it raises an error and halts processing.

So we can now allow for an argument by adding a clause to our existing definition:  change the full stop at the end to a semicolon and add a clause for the case with one argument, which we call Arg:

main([]) ->
    io:format("Hello World");
main([Arg]) ->
    io:format("Hello ~s", [Arg]).

So now [Arg] is a list containing a single variable Arg, which will be bound to the argument on the command line after the name of the script.

Then within the call to io:format we add a second argument, the list [Arg], and add a placeholder ~s in the string to indicate where we want its value to be slotted in.

So now this will work with or without an extra argument:

C:\Users\polly\Erlang>escript hello.erl
Hello World
C:\Users\polly\Erlang>escript hello.erl charlie
Hello charlie
C:\Users\polly\Erlang>

Which of course now poses the question:

C:\Users\polly\Erlang>escript hello.erl curly larry mo
escript: exception error: {function_clause,[{local,main,[["curly","larry","mo"]]}]}
  in function  escript:code_handler/4 (escript.erl, line 838)
  in call from erl_eval:local_func/5 (erl_eval.erl, line 467)
  in call from escript:interpret/4 (escript.erl, line 774)
  in call from escript:start/1 (escript.erl, line 277)
  in call from init:start_it/1 (init.erl, line 1054)
  in call from init:start_em/1 (init.erl, line 1034)

We've allowed for one argument but more than one is still an error.  No pattern match for it, you see.
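A stopgap, and not the route taken in the rest of this post, would be a catch-all clause that matches anything the earlier clauses did not.  This is only a sketch: the usage text is my own invention, and the exit status of 1 is just a common convention for "something went wrong".

```erlang
#!/usr/bin/env escript
main([]) ->
    io:format("Hello World~n");
main([Arg]) ->
    io:format("Hello ~s~n", [Arg]);
main(_) ->
    %% Catch-all: reached only when neither clause above matches.
    %% The message wording here is hypothetical.
    io:format("usage: escript hello.erl [name]~n"),
    halt(1).
```

That swaps the stack trace for a polite message, but it still does not greet anybody.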

So, we want the final option to take care of two or more items on the command line.  The expression [X|XS] is the Erlang code for a list whose first element is X and the rest of which is XS.  So we want to match against this as follows:

main([]) ->
    io:format("Hello World");
main([Arg]) ->
    io:format("Hello ~s", [Arg]);
main([Arg|More]) ->
    io:format("Hello ~s and~n", [Arg]),
    main(More).

Right, so here we have matched against [Arg|More] in our argument list for the main function.  Note that this doesn't just match the pattern - it doesn't just say, yes, your argument list matches the pattern [Arg|More] - it also binds the parts of the argument list, the head and tail, to the variables you supply, all in one step.  This I have to admit is neat, remembering that in Lisp I would first check that I had a list with a head and a tail and then if this were so go back and get the CAR and the CDR a couple of lines later.  Not so neat.
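The head-and-tail binding can be tried on its own with a small sketch like the one below.  The variable names and the lists are arbitrary; ~p prints any Erlang term, so the strings come out with their quotes on.

```erlang
#!/usr/bin/env escript
main(_) ->
    %% Three elements: X gets the first, XS gets the remaining two.
    [X | XS] = ["curly", "larry", "moe"],
    io:format("head: ~p~n", [X]),
    io:format("tail: ~p~n", [XS]),
    %% One element: the pattern still matches, with an empty tail.
    [Y | YS] = ["charlie"],
    io:format("head: ~p, tail: ~p~n", [Y, YS]).
```

This should print head: "curly", then tail: ["larry","moe"], then head: "charlie", tail: [].  The last line is the reason the [Arg] clause has to come before the [Arg|More] clause: a one-element list matches both patterns, and Erlang takes the first clause that fits.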

How to handle this case?  We write the first element Arg to the output and then call main again to process More, the rest of the argument list.  The first line here now ends with a comma, meaning there are further expressions within this clause body, to be evaluated in sequence, so the comma does what a PROGN would do in Lisp.

C:\Users\polly>escript hello.erl
Hello World
C:\Users\polly>escript hello.erl charlie
Hello charlie
C:\Users\polly>escript hello.erl curly larry moe
Hello curly and
Hello larry and
Hello moe
C:\Users\polly>
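The comma sequencing used in that last clause can also be looked at in isolation.  In this sketch both/0 is a made-up helper, not part of the hello script; the point is that the expressions run in order and the value of the last one becomes the value of the function, much as with PROGN.

```erlang
#!/usr/bin/env escript
%% both/0 is a hypothetical helper, here only to show sequencing.
both() ->
    io:format("first~n"),
    io:format("second~n"),
    done.

main(_) ->
    %% Result is bound to the value of the last expression in both/0.
    Result = both(),
    io:format("both() returned ~p~n", [Result]).
```

This should print first, then second, then both() returned done.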
