Blocks in Perl 6

7 minute read

Perl 6 code is structured, scoped and invoked in fundamental units called Block’s (technically, there is an even more fundamental level called Code but that will be outside of the scope of this article). A Block is both a data-type and a core syntactical feature in Perl 6. When I’m talking about it as a data-type, I’ll try to spell it with a capital-“B” to distinguish from a “block” meaning the syntax where a section of code is enclosed in braces, forming a block that is, in turn, stored internally as a Block.

All of the other types of abstractions of executable units are derived from the Block including the Routine and Method (see the type graph for Routine for more clarity on these relationships.

What is a block?

In general terms, a block is any brace-enclosed piece of Perl 6 code, such as:

my $x = 42;
{
    my $y = 24;
    say "$x, $y";
}

Braces around code create blocks and they, in turn, create scope. Scope is how variables are made visible and invisible to different parts of your program. In the above example, the definition of $x is visible anywhere in the listed code because it is in the outer-most (implicit) block, and so the say can print its value. But the definition of $y is only visible within the inner block because it was declared using the lexical scoping operator, my and my attaches the variable to the scope of the inner-most block that contains it. A block represents a lexical scope (lexical, meaning word- or vocabulary-related, is used to describe something that exists within certain “tokens” in an input file such as the $x, $y appearing “lexically” within the double quotes).

But blocks are sort of routines

A block that is intended to be invoked through means other than simply proceeding from an enclosing scope is called a routine. Like blocks, routines are both an abstract syntactical element of the language and the name of the data structure (Routine) that those elements are represented by.

Technically, there is nothing that a Routine has that a Block does not (the definition of Routine is essentially just class Routine is Block {} with a few extra technical details) but we generally try to separate the two, logically, to indicate how they are intended to be used.

A routine is declared like so:

sub f($x, $y) {
    $x.substr(0, $y);
}

The parts of a routine are:

  • An introductory keyword or operator - sub in this case.
  • The name - A bare identifier that has the same naming rules as variables.
  • The signature - An (optional and sometimes implicit) list of parameters and other items that define calling detailed below (see signatures).
  • The block - A brace-enclosed block of statements.
  • Its return value - Either the value of the last statement executed within the block (including sub-blocks) or the value returned explicitly by the return statement.

All of the parameters are considered to be within the scope of the block, including any sub-blocks, as are any variables declared within it.

Anonymous Routines

Routines can exist without a name or “anonymously”. Perl 5 programmers will be familiar with the sub form of anonymous routines:

my $f = sub ($x, $y) { $x.substr(0, $y) };

In this case, the subroutine cannot be called directly by name, since it does not have one, and so must be saved to a variable or otherwise stored so that it can be invoked later. More commonly, in Perl 6, however, the “pointy sub” form of anonymous routine is used:

my $f = -> $x, $y { $x.substr(0, $y) };

Notice that the parentheses are dropped around the signature in a pointy sub. This is not just saving characters. The presence of parentheses would indicate that the sub takes one parameter that should be expanded to a list of two elements, as opposed to taking two parameters.

A common usage for pointy blocks is in function vectors:

sub perform-operator($operator, $lhs, $rhs) {
    my %operator-vector =
        '+' => -> $a, $b { $a + $b },
        '-' => -> $a, $b { $a - $b },
        '*' => -> $a, $b { $a * $b },
        '/' => -> $a, $b { $a / $b };

    # Now call the required function:
    return %operator-vector{$operator}($lhs, $rhs);
}

Pointy blocks show up in Perl 6 most often in a place where it might not be obvious that they are actually a separate construct. That is, in looping constructs such as for:

for @a -> $x {
    say $x;
}

But this is, indeed, a pointy block in action. In fact, the signature of that anonymous routine controls how the for loop will behave. Every routine has what is called an arity, which is the number of positional parameters that it takes. In the above example, the arity is one, but if you change it to two:

for @a -> $x, $y {
    say "$x $y";
}

Now you have a loop that reads the list two elements at a time.

There are four basic ways that anonymous subs are typically declared:

  • With the sub keyword
  • With the -> (pointy) operator
  • As plain blocks, possibly using placeholders
  • By using the whatever syntax via the * operator.

We’ve already seen the first two, so let’s look at the last two:

Blocks as routines

When you write:

my $x = 42;
{
    say $x;
}

You don’t typically think of this as anything other than extra syntax. The braces don’t change the way the code behaves, and you could just as easily have written:

my $x = 42;
say $x;

But that block is a routine in disguise. You can even call it:

my $x = 42;
{
    say $x++;
    &?BLOCK() if $x < 50;
}

which outputs the numbers 42 through 49. This isn’t a very useful example, but a block being passed as a routine is quite common in Perl 6, such as with grep:

(^100).grep({.is-prime})

In this example, grep is passed a block which expects the variable $_ to have been set to an element of the range (0 through 99). It invokes that variable’s is-prime method and returns its result, filtering the primes in the range. We tend to think of this as passing a Routine, but that’s not quite true. If we ask it WHAT it is:

$ perl6 -e 'say {.is-prime}.WHAT'
(Block)

It clarifies for us. These blocks-as-routines can also implicitly declare parameters using the “placeholder” syntax:

(^100).grep({$^n.is-prime})

In this example, $^n is called a placeholder. When Perl sees it, it modifies the immediately enclosing Block to have a signature that takes one more formal, positional parameter. If you re-use this name, it has no further effect and so it’s now just a normal variable that came from the argument list:

(^100).grep({$^n.is-prime or $^n == 10})

It’s important to understand that this isn’t some magic that grep is doing. A block can always behave like this in Perl 6.

Whatever and WhateverCode

The last form is only really useful in very small cases. It’s never something that you want to build large blocks of code with because it would quickly become too cumbersome. However, for many routine tasks, the “whatever” is extremely handy. For example, a sequence can be defined by using the ... operator. The arguments are:

  • Some number of list values
  • An optional block or routine
  • And a terminal value.

Here is a list of the odd numbers from 1 to 100:

1, {$^n + 2} ... 100

Notice the placeholder syntax from the previous section. But there’s a shorter way to write this:

1, * + 2 ... 100

In this example, the * operator acts like the placeholder $^n, but it does even more work. It forces the expression that it’s within to become an implicit block, taking a parameter that is then substituted for the *. This is called a WhateverCode.

You can even use more than one. For example, the Fibonacci sequence can be declared quite simply using the whatever operator twice:

1, 1, * + * ... ∞

Just like placeholders, each * will modify the implicit block’s signature to include one more formal, positional parameter and because the sequence operator checks the arity of the block or routine it is passed, it will consume that many previous values from the list each time, just like a for loop would.

Conclusion

If there is just one thing that you learn from this article, I hope it is that Perl 6 blocks are much more than meets the eye and whether they are declared explicitly or implicitly, they have all of the power of a routine at their disposal. They can call and be called, have parameters and be passed as values just like named or anonymous routines.

This makes Perl 6 fairly unusual in non-Lisp-derived languages and, to my knowledge, completely unique in the C-derived tree of languages that includes C, C++, AWK, Java, Perl 5, Python and so on.

Updated: