7.4 Subroutines | Perl 6 Essentials

A calculation like "the factorial of a number" may be used several times in a large program. Subroutines allow this kind of functionality to be abstracted into a unit. It's a benefit for code reuse and generally makes it easier to work with the code too. Even though PASM is just an assembly language running on a virtual processor, it has a number of features to support high-level subroutine calls. IMCC offers a smoother interface to those features.

7.4.1 Stack-Based Subroutine Calls

Unlike most high-level languages, PIR and PASM don't provide a single statement to call a subroutine, pass in the arguments, and return the result. This is a language designed to implement other languages, and every language does subroutine calls a little differently. What's needed is a set of building blocks and tools, not a prepackaged solution.

PIR has several directives and instructions relevant to subroutine calls. The most important is call , which simply branches to a subroutine label. On the side of the caller, .arg passes an argument to a subroutine, and .result retrieves a result. Within the subroutine, .param retrieves an argument passed to the subroutine, and .return returns a value to the caller:

 .sub _main     .local int counter     counter = 5     .arg counter       # pass an argument     call _fact         # call the subroutine     .local int product     .result product    # retrieve the result     print product     print "\n"     end .end .sub _fact     saveall            # save caller's registers     .param int N       # retrieve the parameter     .local int prod     prod = 1 REDO:     prod = prod * N     dec N     if N > 0 goto REDO     .return prod       # return the result     restoreall         # restore caller's registers     ret                # back to the caller .end

This example reimplements the factorial code from the previous section as an independent subroutine. The subroutine _fact is a separate compilation unit, assembled and processed after the _main function. IMCC resolves global symbols like the _fact label between different units.

The _main compilation unit sets up a local variable named counter and passes it to the subroutine using the .arg directive. This is just the PASM save instruction ”it pushes the argument onto the user stack. It then calls _fact with the call instruction. This is PASM's bsr . It branches to a subroutine label and pushes the current location onto the control stack so it can return to it later.

The .result directive saves the value returned by the subroutine in the variable named product . This is just the PASM restore instruction ”it pops a return value off the user stack. The common case for return values is to use an existing local variable, so .result doesn't create a new named variable.

The first statement in the _fact subroutine is saveall . This saves all the registers off to the typed stacks, so they can be restored at the end of the subroutine. This is a callee save ”the subroutine takes responsibility for preserving the environment of the caller. The advantage here is that IMCC can ignore the subroutine's register usage when it allocates registers for the _main routine ”a call or bsr instruction has as much impact on register allocation as a noop .

The next statement is .param , which pops a function parameter off the user stack as an integer. The .param directive also creates a new named local variable for the parameter. So the statement:

 .param int N

is exactly the same as:

 .local int N restore N

which explicitly defines a local named integer variable and then calls the PASM restore to pop a value off the user stack. IMCC doesn't check the type, order, or number of the parameters to make sure they match what the the caller passes to the subroutine. You'll get no warning for a mismatch.

The next section is the same loop we had in the earlier example to calculate a factorial, but now it uses the parameter passed in to the subroutine as the counter. The .return statement at the end returns the final value of prod to the caller. This is just the PASM save instruction again ”it pushes the value onto the user stack, so .result can retrieve it after the subroutine ends. restoreall restores the caller's register values, and ret pops the top item off the control stack ”in this case, the location of the call to _fact ”and goes back to it.

When you have more then one argument to pass to the subroutine, passing order is the reverse of retrieval order. You push the final argument onto the user stack first, because it'll be the last parameter popped off the stack on the other end. Multiple return values are also passed in reverse order for the same reason. Often the first parameter or result will be a count of values passed in, especially when the number of arguments can vary:

 .arg y             # save args in reverse order .arg x call _foo          # (r, s) = _foo(x,y) .result r .result s          # restore results in order

The example above could have been written using simple labels instead of separate compilation units:

 .sub _main     $I1 = 5         # counter     call fact       # same as bsr fact     print $I0     print "\n"     $I1 = 6         # counter     call fact     print $I0     print "\n"     end fact:     $I0 = 1           # product L1:     $I0 = $I0 * $I1     dec $I1     if $I1 > 0 goto L1     ret .end

The unit of code from the fact label definition to ret is a reusable routine, but there are several problems with this simple approach. First, the caller has to know to pass the argument to fact in $I1 and to get the result from $I0 . Second, neither the caller nor the function itself preserves any registers. This is fine for the example above, because very few registers are used. But if this same bit of code were buried deeply in a math routine package, you would have a high risk of clobbering the caller's register values.

Another disadvantage of this approach is that _main and fact share the same compilation unit, so they're parsed and processed as one piece of code. When IMCC does register allocation, it calculates the data flow graph (DFG) of all symbols, ^[7] looks at their usage, calculates the interference between all possible combinations of symbols, and then assigns a Parrot register to each symbol. This process is less efficient for large compilation units than it is for several small ones, so it's better to keep the code modular. The optimizer will decide whether register usage is light enough to merit combining two compilation units, or even inlining the entire function.

^[7] The operation to calculate the DFG has a cost of O(2) or better. It depends on n_lines * n_symbols .

A Short Note on the Optimizer

The optimizer isn't powerful enough to inline small subroutines yet. But it already does other simpler optimizations. You may recall that the PASM opcode mul (multiply) has a two-argument version that uses the same register for the destination and the first operand. When IMCC comes across a PIR statement like $I0 = $I0 * $I1 , it can optimize it to the two-argument mul $I0 , $I1 instead of mul $I0, $I0, $I1 . This kind of optimization is enabled by the -O1 command-line option.

So you don't need to worry about finding the shortest PASM instruction, calculating constant terms, or avoiding branches to speed up your code. IMCC does it already.

7.4.2 Parrot Calling Conventions

Parrot defines a set of calling conventions for all subroutines that are externally visible. Since these routines may be called as part of a library or module, it's important to have a consistent interface. In these calls, the caller is responsible for preserving its own registers, and arguments and return values are passed in a predefined set of Parrot registers.

IMCC's implementation of the Parrot calling conventions is still unfinished . The example here is mostly PASM, but the comments show the planned IMCC syntax:

 .sub _main    .local int count    .local int product    count = 5    product = 1    saveall    I5 = count             # .nciarg count    I6 = product           # .nciarg product    .local Sub factsub    factsub = new Sub    $I0 = addr _fact    factsub = $I0    P0 = factsub           # .ncicall factsub  # fact(count, product)    invoke    save I5                # .nciresult $I0    restoreall    restore $I0    print $I0    print "\n"    end .end .sub _fact loop:    if I6 <= 1 goto fin    I5 = I5 * I6    dec I6    branch loop fin:    ret .end

The directives .nciarg , .ncicall , and .nciresult aren't implemented yet, but it's likely that they will be by the time you read this.

Let's take a closer look at the individual parts of this example. First, two locally named variables are defined and assigned values. All registers are preserved with the saveall instruction. The function call sequence begins with:

 I5 = count          # .nciarg count I6 = product        # .nciarg product

The arguments to _fact are assigned to consecutive registers, starting with I5 . Next, a variable of class Sub is created. IMCC's = addr syntax gets the branch offset to a label, just like PASM's set_addr . The value assigned to the factsub variable sets the offset of the subroutine it will invoke. Since the subroutine is its own compilation unit, a fixup of the global label _fact is done just before emitting the actual code.

The invoke opcode does a function call with the function object in the parrot register P0 ”it pushes the offset of the next instruction onto the control stack and branches to the subroutine.

On returning from the subroutine, the return value is saved, registers get restored, and the result is printed.

7.4.3 PASM Subroutines

We mentioned earlier that pure PASM compilation units can use the .emit and .eom directives instead of .sub and .end . These can be useful for grouping PASM functions or function wrappers. The subroutine entry labels have to be global labels:

 .emit _substr:     ...     ret _grep:     ...     ret .eom