
9. Functions

There are essentially two classes of functions that may be called from the interpreter: intrinsic functions and S-Lang functions.

An intrinsic function is one that is implemented in C or some other compiled language and is callable from the interpreter. Nearly all of the built-in functions are of this variety. At the moment the basic interpreter provides nearly 300 intrinsic functions. Examples include the trigonometric functions sin and cos, string functions such as strcat, etc. Dynamically loaded modules such as the png and pcre modules provide additional intrinsic functions.

The other type of function is written in S-Lang and is known simply as a ``S-Lang function''. Such a function may be thought of as a group of statements that work together to perform a computation. The specification of such functions is the main subject of this chapter.

9.1 Declaring Functions

Like variables, functions must be declared before they can be used. The define keyword is used for this purpose. For example,

      define factorial ();
is sufficient to declare a function named factorial. Unlike the variable keyword used for declaring variables, the define keyword does not accept a list of names.

Usually, the above form is used only for recursive functions. In most cases, the function name is followed by a parameter list and the body of the function:

define function-name (parameter-list) { statement-list }
The function-name is an identifier and must conform to the naming scheme for identifiers discussed in the chapter on Identifiers. The parameter-list is a comma-separated list of variable names that represent parameters passed to the function, and may be empty if no parameters are to be passed. The variables in the parameter-list are implicitly declared; thus, there is no need to declare them via a variable declaration statement. In fact, any attempt to do so will result in a syntax error.

The body of the function is enclosed in braces and consists of zero or more statements (statement-list). While there are no imposed limits upon the number of statements that may occur within a S-Lang function, it is considered poor programming practice if a function contains many statements. This notion stems from the belief that a function should have a simple, well-defined purpose.
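
As a hedged illustration of why a bare declaration is useful, the following sketch defines two mutually recursive functions. The names is_even and is_odd are hypothetical and are not part of the S-Lang library. Since is_even calls is_odd before the latter has been defined, is_odd is declared first. The sketch assumes a non-negative integer argument.

```slang
define is_odd ();            % declaration only; the body follows below

define is_even (n)
{
   if (n == 0) return 1;
   return is_odd (n - 1);
}

define is_odd (n)
{
   if (n == 0) return 0;
   return is_even (n - 1);
}

vmessage ("is_even(10) = %d", is_even (10));    % is_even(10) = 1
```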

9.2 Parameter Passing Mechanism

Parameters to a function are always passed by value and never by reference. To see what this means, consider

     define add_10 (a)
     {
        a = a + 10;
     }
     variable b = 0;
     add_10 (b);
Here a function add_10 has been defined, which when executed, adds 10 to its parameter. A variable b has also been declared and initialized to zero before being passed to add_10. What will be the value of b after the call to add_10? If S-Lang were a language that passed parameters by reference, the value of b would be changed to 10. However, S-Lang always passes by value, which means that b will retain its value during and after the function call.
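
This can be checked directly. The following minimal sketch uses a hypothetical variant, add_10_copy, that also returns its modified local copy so that the two values can be compared. The vmessage call serves only to display them.

```slang
define add_10_copy (a)
{
   a = a + 10;        % changes only the local copy of the argument
   return a;
}

variable b = 0;
variable c = add_10_copy (b);
vmessage ("b = %d, c = %d", b, c);    % b is still 0; c is 10
```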

S-Lang does provide a mechanism for simulating pass by reference via the reference operator. This is described in greater detail later in this chapter, in the section on referencing variables.

If a function is called with a parameter in the parameter list omitted, the corresponding variable in the function will be set to NULL. To make this clear, consider the function

     define add_two_numbers (a, b)
     {
        if (a == NULL) a = 0;
        if (b == NULL) b = 0;
        return a + b;
     }
This function must be called with two parameters. However, either of them may be omitted by calling the function in one of the following ways:
     variable s = add_two_numbers (2,3);
     variable s = add_two_numbers (2,);
     variable s = add_two_numbers (,3);
     variable s = add_two_numbers (,);
The first example calls the function using both parameters, but at least one of the parameters was omitted in the other examples. If the parser recognizes that a parameter has been omitted by finding a comma or right-parenthesis where a value is expected, it will substitute NULL for the missing value. This means that the parser will convert the latter three statements in the above example to:
     variable s = add_two_numbers (2, NULL);
     variable s = add_two_numbers (NULL, 3);
     variable s = add_two_numbers (NULL, NULL);
It is important to note that this mechanism is available only for function calls that specify more than one parameter. That is,
     variable s = add_10 ();
is not equivalent to add_10(NULL). The reason for this is simple: the parser can only tell whether or not NULL should be substituted by looking at the position of the comma character in the parameter list, and only function calls that indicate more than one parameter will use a comma. A mechanism for handling single parameter function calls is described later in this chapter.

9.3 Returning Values

The usual way to return values from a function is via the return statement. This statement has the simple syntax

return expression-list ;
where expression-list is a comma-separated list of expressions. If a function does not return any values, the expression list will be empty. A simple example of a function that can return multiple values (two in this case) is:
        define sum_and_diff (x, y)
        {
            variable sum, diff;

            sum = x + y;  diff = x - y;
            return sum, diff;
        }

9.4 Multiple Assignment Statement

In the previous section, an example of a function returning two values was given. That function can also be written somewhat more simply as:

       define sum_and_diff (x, y)
       {
          return x + y, x - y;
       }
This function may be called using
      (s, d) = sum_and_diff (12, 5);
After the above line is executed, s will have a value of 17 and the value of d will be 7.

The most general form of the multiple assignment statement is

     ( var_1, var_2, ..., var_n ) = expression;
Here expression is an arbitrary expression that leaves n items on the stack, and var_k represents an l-value object (permits assignment). The assignment statement removes those values and assigns them to the specified variables. Usually, expression is a call to a function that returns multiple values, but it need not be. For example,
     (s,d) = (x+y, x-y);
produces results that are equivalent to the call to the sum_and_diff function. Another common use of the multiple assignment statement is to swap or permute values:
     (x,y) = (y,x);
     (a[i], a[j], a[k]) = (a[j], a[k], a[i]);

If an l-value is omitted from the list, then the corresponding value will be removed from the stack. For example,

     (s, ) = sum_and_diff (9, 4);
assigns the sum of 9 and 4 to s and the difference (9-4) is removed from the stack. Similarly,
     () = fputs ("good luck", fp);
causes the return value of the fputs function to be discarded.

It is possible to create functions that return a variable number of values instead of a fixed number. Although such functions are discouraged, it is easy to cope with them. Usually, the value at the top of the stack will indicate the actual number of return values. For such functions, the multiple assignment statement cannot directly be used. To see how such functions can be dealt with, consider the following function:

     define read_line (fp)
     {
        variable line;
        if (-1 == fgets (&line, fp))
          return -1;
        return (line, 0);
     }
This function returns either one or two values, depending upon the return value of fgets. Such a function may be handled using:
      status = read_line (fp);
      if (status != -1)
        {
           s = ();
           .
           .
        }
In this example, the last value returned by read_line is assigned to status and then tested. If it is not -1, the remaining return value is assigned to s. In particular, note the empty set of parentheses in the assignment to s. This simply indicates that whatever is on the top of the stack when the statement is executed will be assigned to s.
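
The same calling pattern applies to any function that leaves its status on the top of the stack. The sketch below does not depend on an open file. It uses a hypothetical function, find_first, which returns an index followed by 0 when a value is found in an array, and just -1 otherwise. Only the where and length intrinsics are assumed.

```slang
% Hypothetical helper: returns (index, 0) on success, or only -1 on failure.
define find_first (a, x)
{
   variable i = where (a == x);      % indices of the matching elements
   if (length (i) == 0)
     return -1;
   return (i[0], 0);
}

variable status, idx;
status = find_first ([10, 20, 30], 20);
if (status != -1)
  {
     idx = ();                       % pop the remaining return value
     vmessage ("found at index %d", idx);    % found at index 1
  }
```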

9.5 Referencing Variables

One can achieve the effect of passing by reference by using the reference (&) and dereference (@) operators. Consider again the add_10 function presented earlier in the discussion of parameter passing. This time it is written as:

     define add_10 (a)
     {
        @a = @a + 10;
     }
     variable b = 0;
     add_10 (&b);
The expression &b creates a reference to the variable b and it is the reference that gets passed to add_10. When the function add_10 is called, the value of the local variable a will be a reference to the variable b. It is only by dereferencing this value that b can be accessed and changed. So, the statement @a=@a+10 should be read as ``add 10 to the value of the object that a references and assign the result to the object that a references''.

The reader familiar with C will note the similarity between references in S-Lang and pointers in C.
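
To make the analogy with C pointers concrete, here is a minimal sketch of a function that exchanges the values of two variables through references. The name swap_values is a hypothetical helper, not an intrinsic.

```slang
define swap_values (a, b)
{
   variable tmp = @a;     % value of the object that a references
   @a = @b;               % store the value referenced by b there
   @b = tmp;
}

variable x = 1, y = 2;
swap_values (&x, &y);
vmessage ("x = %d, y = %d", x, y);    % x = 2, y = 1
```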

References are not limited to variables. A reference to a function may also be created and passed to other functions. As a simple example from elementary calculus, consider the following function which returns an approximation to the derivative of another function at a specified point:

     define derivative (f, x)
     {
        variable h = 1e-6;
        return ((@f)(x+h) - (@f)(x)) / h;
     }
     define x_squared (x)
     {
        return x^2;
     }
     dydx = derivative (&x_squared, 3);
When the derivative function is called, the local variable f will be a reference to the x_squared function. The x_squared function is called with the specified parameters by dereferencing f with the dereference operator.

9.6 Functions with a Variable Number of Arguments

S-Lang functions may be called with a variable number of arguments. A natural example of such functions is the strcat function, which takes one or more string arguments and returns the concatenated result. An example of a different sort is the strtrim function, which removes both leading and trailing whitespace from a string. In this case, when called with one argument (the string to be ``trimmed''), the characters that are considered to be whitespace are those in the character-set that have the whitespace property (space, tab, newline, ...). However, when called with two arguments, the second argument may be used to specify the characters that are to be considered as whitespace. The strtrim function exemplifies a class of variadic functions where the additional arguments are used to pass optional information to the function. Another more flexible and powerful way of passing optional information is through the use of qualifiers, which are the subject of the next section.

When a S-Lang function is called with parameters, those parameters are placed on the run-time stack. The function accesses those parameters by removing them from the stack and assigning them to the variables in its parameter list. The details of this operation are for the most part hidden from the programmer. But what happens when the number of parameters in the parameter list is not equal to the number of parameters passed to the function? If the number passed to the function is less than what the function expects, a StackUnderflow error could result as the function tries to remove items from the stack. If the number passed is greater than the number in the parameter list, then the extras will remain on the stack. The latter feature makes it possible to write functions that take a variable number of arguments.

Consider the add_10 example presented earlier. This time it is written

     define add_10 ()
     {
        variable x;
        x = ();
        return x + 10;
     }
     variable s = add_10 (12);  % ==> s = 22;
For the uninitiated, this example looks as if it is destined for disaster. The add_10 function appears to accept zero arguments, yet it was called with a single argument. On top of that, the assignment to x might look a bit strange. The truth is, the code presented in this example makes perfect sense, once you realize what is happening.

First, consider what happens when add_10 is called with the parameter 12. Internally, 12 is pushed onto the stack and then the function is called. Now, consider the function add_10 itself. In it, x is a local variable. The strange looking assignment `x=()' causes whatever is on the top of the stack to be assigned to x. In other words, after this statement, the value of x will be 12, since 12 is at the top of the stack.

A generic function of the form

    define function_name (x, y, ..., z)
    {
       .
       .
    }
is transformed internally by the parser to something akin to
    define function_name ()
    {
       variable x, y, ..., z;
       z = ();
       .
       .
       y = ();
       x = ();
       .
       .
    }
before further parsing. (The add_10 function, as defined above, is already in this form.) With this knowledge in hand, one can write a function that accepts a variable number of arguments. Consider the function:
    define average_n (n)
    {
       variable x, y;
       variable s;

       if (n == 1)
         {
            x = ();
            s = x;
         }
       else if (n == 2)
         {
            y = ();
            x = ();
            s = x + y;
         }
       else throw NotImplementedError;

       return s / n;
   }
   variable ave1 = average_n (3.0, 1);        % ==> 3.0
   variable ave2 = average_n (3.0, 5.0, 2);   % ==> 4.0
Here, the last argument passed to average_n is an integer reflecting the number of quantities to be averaged. Although this example works fine, its principal limitation is obvious: it only supports one or two values. Extending it to three or more values by adding more else if constructs is rather straightforward but hardly worth the effort. There must be a better way, and there is:
   define average_n (n)
   {
      variable s, x;
      s = 0;
      loop (n)
        {
           x = ();    % get next value from stack
           s += x;
        }
      return s / n;
   }
The principal limitation of this approach is that one must still pass an integer that specifies how many values are to be averaged. Fortunately, a special variable exists that is local to every function and contains the number of values that were passed to the function. That variable has the name _NARGS and may be used as follows:
   define average_n ()
   {
      variable x, s = 0;

      if (_NARGS == 0)
        usage ("ave = average_n (x, ...);");

      loop (_NARGS)
        {
           x = ();
           s += x;
        }
      return s / _NARGS;
   }
Here, if no arguments are passed to the function, the usage function will generate a UsageError exception along with a simple message indicating how to use the function.
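
The _NARGS variable can also be used to implement an optional trailing argument, in the manner of strtrim. The following is a minimal sketch under that assumption. The function greet and its usage message are hypothetical. Because arguments are pushed in order, the optional argument, if present, is on the top of the stack and must be removed first.

```slang
define greet ()
{
   variable name, greeting = "Hello";

   if ((_NARGS < 1) or (_NARGS > 2))
     usage ("str = greet (name [, greeting]);");

   if (_NARGS == 2)
     greeting = ();        % optional argument: pushed last, so popped first
   name = ();

   return sprintf ("%s, %s!", greeting, name);
}

message (greet ("world"));               % Hello, world!
message (greet ("world", "Goodbye"));    % Goodbye, world!
```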

9.7 Qualifiers

One way to pass optional information to a function is to do so using the variable arguments mechanism described in the previous section. However, a much more powerful mechanism is through the use of qualifiers, which were added in version 2.1.

To illustrate the use of qualifiers, consider a graphics application that defines a function called plot that plots a set of (x,y) values specified as 1-d arrays:

     plot(x,y);
Suppose that when called in the above manner, the application will plot the data as black points. But instead of black points, one might want to plot the data using a red diamond as the plot symbol. It would be silly to have a separate function such as plot_red_diamond for this purpose. A much better way to achieve this functionality is through the use of qualifiers:
    plot(x,y ; color="red", symbol="diamond");
Here, a single semicolon is used to separate the argument-list proper (x,y) from the list of qualifiers. In this case, the qualifiers are ``color'' and ``symbol''. The order of the qualifiers is unimportant; the function could just as well have been called with the symbol qualifier listed first.

Now consider the implementation of the plot function:

    define plot (x, y)
    {
       variable color = qualifier ("color", "black");
       variable symbol = qualifier ("symbol", "point");
       variable symbol_size = qualifier ("size", 1.0);
          .
          .
    }
Note that the qualifiers are not handled in the parameter list; rather they are handled in the function body using the qualifier function, which is used to obtain the value of the qualifier. The second argument to the qualifier function specifies the default value to be used if the function was not called with the specified qualifier. Also note that the variable associated with the qualifier need not have the same name as the qualifier.

A qualifier need not have a value; its mere presence may be used to enable or disable a feature or trigger some action. For example,

     plot (x, y; connect_points);
specifies a qualifier called connect_points that indicates that a line should be drawn between the data points. The presence of such a qualifier can be detected using the qualifier_exists function:
     define plot (x,y)
     {
         .
         .
       variable connect_points = qualifier_exists ("connect_points");
         .
         .
     }
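
Since the plot function above is only outlined, the following self-contained sketch may help in experimenting with qualifier and qualifier_exists. The function describe_plot is hypothetical. Instead of drawing anything, it returns a string that summarizes the qualifiers it received.

```slang
define describe_plot (x, y)
{
   variable color = qualifier ("color", "black");
   variable symbol = qualifier ("symbol", "point");
   variable connect = qualifier_exists ("connect_points");   % 1 or 0

   return sprintf ("%d points, %s %s, connect=%d",
                   length (x), color, symbol, connect);
}

message (describe_plot ([1,2,3], [4,5,6]));
% 3 points, black point, connect=0
message (describe_plot ([1,2,3], [4,5,6]; symbol="diamond", color="red"));
% 3 points, red diamond, connect=0
message (describe_plot ([1,2,3], [4,5,6]; connect_points));
% 3 points, black point, connect=1
```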

Sometimes it is useful for a function to pass the qualifiers that it has received to other functions. Suppose that the plot function calls draw_symbol to plot the specified symbol at a particular location and that it requires the symbol attributes to be specified using qualifiers. Then the plot function might look like:

    define plot (x, y)
    {
       variable i;
       variable color = qualifier ("color", "black");
       variable symbol = qualifier ("symbol", "point");
       variable symbol_size = qualifier ("size", 1.0);
          .
          .
       _for i (0, length(x)-1, 1)
         draw_symbol (x[i],y[i]
                      ;color=color, size=symbol_size, symbol=symbol);
          .
          .
    }
The problem with this approach is that it does not scale well: the plot function has to be aware of all the qualifiers that the draw_symbol function takes and explicitly pass them. In many cases this can be quite cumbersome and error prone. Rather it is better to simply pass the qualifiers that were passed to the plot function on to the draw_symbol function. This may be achieved using the __qualifiers function. The __qualifiers function returns the list of qualifiers in the form of a structure whose field names are the same as the qualifier names. In fact, the use of this function can simplify the implementation of the plot function, which may be coded more simply as
    define plot (x, y)
    {
       variable i;
       _for i (0, length(x)-1, 1)
         draw_symbol (x[i],y[i] ;; __qualifiers());
    }
Note the syntax is slightly different. The two semicolons indicate that the qualifiers are specified not as name-value pairs, but as a structure. Using a single semicolon would have created a qualifier called __qualifiers, which is not what was desired.

As alluded to above, an added benefit of this approach is that the plot function need not know or care about the qualifiers supported by draw_symbol. When called as

    plot (x, y; symbol="square", size=2.0, fill=0.8);
the fill qualifier would get passed to the draw_symbol function to specify the ``fill'' value to be used when creating the symbol.
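
The forwarding of qualifiers can be tried out with a stand-in for draw_symbol. In the sketch below, draw_symbol is a hypothetical stub that merely prints the qualifiers it sees. Its qualifier names and defaults are assumptions made for the illustration.

```slang
% Hypothetical stub: reports its qualifiers instead of drawing anything.
define draw_symbol (x, y)
{
   variable symbol = qualifier ("symbol", "point");
   variable size = qualifier ("size", 1.0);
   variable fill = qualifier ("fill", 0.0);
   vmessage ("(%g,%g): symbol=%s size=%g fill=%g", x, y, symbol, size, fill);
}

define plot (x, y)
{
   variable i;
   _for i (0, length(x)-1, 1)
     draw_symbol (x[i], y[i] ;; __qualifiers ());   % forward everything
}

% plot itself never mentions fill, yet draw_symbol receives it.
plot ([1.0, 2.0], [3.0, 4.0]; symbol="square", size=2.0, fill=0.8);
```

One line is printed per data point, each showing the symbol, size and fill values given in the call to plot.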

9.8 Exit-Blocks

An exit-block is a set of statements that get executed when a function returns. They are very useful for cleaning up when a function returns via an explicit call to return from deep within a function.

An exit-block is created by using the EXIT_BLOCK keyword according to the syntax

EXIT_BLOCK { statement-list }
where statement-list represents the list of statements that comprise the exit-block. The following example illustrates the use of an exit-block:
      define simple_demo ()
      {
         variable n = 0;

         EXIT_BLOCK { message ("Exit block called."); }

         forever
          {
            if (n == 10) return;
            n++;
          }
      }
Here, the function contains an exit-block and a forever loop. The loop will terminate via the return statement when n is 10. Before it returns, the exit-block will get executed.

A function can contain multiple exit-blocks, but only the last one encountered during execution will actually get used. For example,

      define simple_demo (n)
      {
         EXIT_BLOCK { return 1; }

         if (n != 1)
           {
              EXIT_BLOCK { return 2; }
           }
         return;
      }
If 1 is passed to this function, the first exit-block will get executed because the second one would not have been encountered during the execution. However, if some other value is passed, the second exit-block would get executed. This example also illustrates that it is possible to explicitly return from an exit-block, but nested exit-blocks are illegal.
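
A typical cleanup task is closing a file. The following sketch assumes a hypothetical function, count_lines, that returns the number of lines in a file, or -1 if the file cannot be opened. The exit-block is established only after the file has been opened successfully, so the early return does not attempt to close anything.

```slang
define count_lines (file)
{
   variable n = 0, line;
   variable fp = fopen (file, "r");
   if (fp == NULL)
     return -1;                      % no exit-block encountered yet

   % Any return executed after this point closes the file first.
   EXIT_BLOCK { () = fclose (fp); }

   while (-1 != fgets (&line, fp))
     n++;

   return n;
}
```

It might be called as count_lines ("data.txt"), where the file name is only a placeholder.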

9.9 Handling Return Values from a Function

The most important rule to remember in calling a function is that if the function returns a value, the caller must do something with it. While this might sound like a trivial statement, it is the number one issue that trips up novice users of the language.

To elaborate on this point further, consider the fputs function, which writes a string to an open file. This function can fail when, e.g., a disk is full, or the file is located on a network share and the network goes down, etc.

S-Lang supports two mechanisms that a function may use to report a failure: raising an exception or returning a status code. The latter mechanism is used by the S-Lang fputs function; i.e., it returns a value to indicate whether or not it was successful. Many users familiar with this function either seem to forget this fact, or assume that the function will succeed and do not bother handling the return value. While some languages silently remove such values from the stack, S-Lang regards the stack as a dynamic data structure that programs can utilize. As a result, the value will be left on the S-Lang stack and can cause problems later on.

There are a number of correct ways of ``doing something'' with the return value from a function. Of course the recommended procedure is to use the return value as it was meant to be used. In the case of fputs, the proper thing to do is to check the return value, e.g.,

     if (-1 == fputs ("good luck", fp))
       {
          % Handle the error
       }
Other acceptable ways to ``do something'' with the return value include assigning it to a dummy variable,
     dummy = fputs ("good luck", fp);
or simply ``popping'' it from the stack:
     fputs ("good luck", fp);  pop();
The latter mechanism can also be written as
     () = fputs ("good luck", fp);
The last form is a special case of the multiple assignment statement, which was discussed earlier. Since this form is simpler than assigning the value to a dummy variable or explicitly calling the pop function, it is recommended over the other two mechanisms. Finally, this form has the redeeming feature that it presents a visual reminder that the function is returning a value that is not being used.
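
The effect of an ignored return value can be observed with the _stkdepth intrinsic, which returns the number of objects currently on the stack. The following is a minimal sketch. The function answer and the variable names are hypothetical, and the statements are assumed to be run at the top level of a script.

```slang
define answer ()
{
   return 42;
}

variable depth_before = _stkdepth ();

answer ();                              % return value ignored
variable depth_leaked = _stkdepth ();   % depth_before + 1: 42 is still there
pop ();                                 % remove the stray value

() = answer ();                         % return value discarded at once
variable depth_clean = _stkdepth ();    % same as depth_before
```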

