Protocols

One of the most fundamental constructs in neutrino is the protocol.

Protocols are similar to Java interfaces in some respects, and the basic purpose of a protocol is similar: to specify that an object supports a certain behavior without necessarily specifying an implementation of that behavior. However, protocols are also very different from interfaces.

To begin with, here's an example of a protocol declaration.

protocol Vector is Collection;

Already you see a considerable difference between protocols and interfaces: protocols don't specify any methods. A protocol is, at the outset, completely opaque and atomic. How does this match with what I just said, that a protocol is used to specify that an object supports a certain behavior?

In Java, interfaces are very much a tool of the static type system. On the one hand, implementing an interface forces you to implement a particular set of methods; otherwise your program will not compile. On the other, if you're given an instance of an interface then you're guaranteed that it will have an implementation of its methods. Note, however, that this kind of static checking is limited. As long as you define methods with the right signatures, the type checker is satisfied; it doesn't care how you implement them or whether they obey their respective contracts.
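To make the limits of that checking concrete, here is a minimal Java sketch. The interface Sized and the class BrokenBox are hypothetical names made up for this illustration; the point is only that the compiler accepts an implementation whose behavior contradicts the documented contract.

```java
// Hypothetical interface, invented for this illustration.
interface Sized {
  /** Returns the number of elements. */
  int size();

  /** Contract (documentation only): returns true exactly when size() == 0. */
  boolean isEmpty();
}

// Compiles without complaint: the signatures match, which is all javac checks.
// The isEmpty contract is violated, and nothing in the type system notices.
class BrokenBox implements Sized {
  public int size() { return 3; }
  public boolean isEmpty() { return true; }
}
```

A BrokenBox reports three elements and claims to be empty at the same time, yet it is a perfectly valid Sized as far as the type checker is concerned.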

Imagine for a moment that we could remove the type system from Java (ignoring the ways it interacts with program behavior through static method overloading, reflection, etc.). What use would the methods listed in an interface be? They would be a convenient way for a programmer to see which methods to implement, and which he could call, but the language wouldn't actually care. Machine-readable documentation is a good thing, certainly, but the goal of neutrino is to be as simple as possible at the basic level, and then allow more complex utilities – like type systems – to be built on top. Since neutrino is a dynamically typed language, the implementation really doesn't need to know which methods correspond to a particular protocol. If you say that your object is a Vector then that's all the implementation needs to know. It doesn't care what the requirements are for being a Vector: not which methods it requires nor, as with Java, whether the implementations obey the contracts for those methods.
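Java itself has a few interfaces that work this way, the so-called marker interfaces such as java.util.RandomAccess, which declare no methods at all. As a loose analogy only (this is not how neutrino is implemented), the sketch below shows that the only thing the runtime can ask is whether a class has declared membership. The helper class MarkerDemo is a hypothetical name.

```java
import java.util.List;
import java.util.RandomAccess;

class MarkerDemo {
  // True when the list's class has declared itself RandomAccess.
  // RandomAccess has no methods, so the declaration is taken on trust:
  // nothing checks that indexed access is actually fast.
  static boolean claimsRandomAccess(List<?> list) {
    return list instanceof RandomAccess;
  }
}
```

Passing an ArrayList gives true and passing a LinkedList gives false, purely because of what each class declares.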

That was the first major difference between protocols and interfaces. The second is that you can associate behavior with protocols.

In Java, an interface has traditionally been a pure specification with no behavior. I'm sure this is intentional, and I understand why that is, but it does cause some serious problems. In particular it causes tension between keeping interfaces small to make them easy to implement, and making them broad so they're easier to use. A classic example is Java's List interface. For convenience, the List interface has a lot of methods, many of which could be trivially defined in terms of each other: addAll in terms of add, isEmpty in terms of size, toArray in terms of iterator, and so on. Beyond these there are a number of utility functions in Collections, like indexOfSubList, replaceAll, rotate, etc., which really belong to List. In other words, there are some methods that are intrinsic to being a List, like add and size; some that are simple utilities included in the interface to make it convenient to work with; and some that are less common or challenging to implement, which are put outside the interface in a utility class. This sucks for the user of List because he has to look for methods both on the list itself and on the utility class, and it's a drag for the implementor of a class that implements List, who has to choose between making his one and only superclass AbstractList or re-implementing a dozen trivial utility methods.
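Here is a small Java sketch of that split. IntRangeList is a hypothetical class invented for the example; AbstractList, Collections.indexOfSubList and the inherited methods are standard library. Extending AbstractList means supplying only get and size, but it also uses up the single superclass slot.

```java
import java.util.AbstractList;

// Hypothetical read-only list of the integers from..to (inclusive).
class IntRangeList extends AbstractList<Integer> {
  private final int from;
  private final int to;

  IntRangeList(int from, int to) {
    this.from = from;
    this.to = to;
  }

  // The two intrinsics AbstractList asks a read-only list to supply.
  @Override
  public Integer get(int index) {
    if (index < 0 || index >= size()) {
      throw new IndexOutOfBoundsException("index: " + index);
    }
    return from + index;
  }

  @Override
  public int size() {
    return to - from + 1;
  }
}
```

With this in place, isEmpty, contains and indexOf come from the abstract superclasses, while something like Collections.indexOfSubList(range, Arrays.asList(3, 4)) still has to be looked up in the separate utility class.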

The way neutrino deals with this is to allow protocols to have methods with implementations. Say we required implementations of Vector to have a .length method. Regardless of how a Vector is implemented, we can then define:

def (this is Vector).is_empty -> (this.length = 0);

The tension is now gone. The module that defined Vector can provide as many utility methods as it wants directly on the vector, without thereby tying down the implementation – you're still free to implement the intrinsic methods, the core methods that everything else builds on, however you want. Your implementation is also free to override any of the default implementations. The programmer that uses Vector has access to all the convenient utilities and can enjoy greater reliability because there only has to be one implementation of .is_empty rather than, as in the case of List, one for each implementation that for whatever reason decides not to extend AbstractList.
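For readers who think in Java, default methods (available in Java 8 and later) give a rough analogue of this idea. The sketch below is only an analogy under that assumption, not a description of how neutrino dispatches methods; VectorLike, ArrayVector and Repeated are hypothetical names.

```java
// Rough Java analogue of a protocol with a method attached to it.
interface VectorLike {
  // Intrinsic: every implementation must supply this.
  int length();

  // Written once against the intrinsic and shared by every implementation.
  default boolean isEmpty() {
    return length() == 0;
  }
}

// Backed by an array.
class ArrayVector implements VectorLike {
  private final int[] elements;

  ArrayVector(int... elements) {
    this.elements = elements;
  }

  public int length() {
    return elements.length;
  }
}

// Stores only a count, no elements at all.
class Repeated implements VectorLike {
  private final int count;

  Repeated(int count) {
    this.count = count;
  }

  public int length() {
    return count;
  }
}
```

Both classes get isEmpty without writing it, and neither had to give up its superclass to do so.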

To put it all together, here is one example of how this can be used. Here's a definition of a generic Vector protocol and some utility methods to go with it:

/**
 * A finite, read-only, random-access vector.  Requires
 * implementations of .length and the indexing operator.
 */
protocol Vector;

def (this is Vector).is_empty -> (this.length = 0);

def (this is Vector).for_each(fun) {
  for (i : 0 .. this.length)
    fun(this[i]);
}

def (this is Vector).contains(val) {
  with_1cc (return) {
    for (elm : this) {
      if elm = val
        then return(true);
    }
    false;
  }
}

Based on this you can easily define new types of Vector that define the required intrinsics:

/**
 * A range represents a series of integers from
 * 'from' to 'to'.
 */
protocol Range is Vector;

def Range.new(from, to)
 -> ...; // Let's cover object construction later.

def (this is Range).length
 -> this.to - this.from + 1;

def (this is Range)[index]
 -> this.from + index;

After we've implemented the intrinsics of Vector we get all the extra functionality for free:

def range := new Range(0, 10);
range.contains(5);
> #t
range.is_empty;
> #f

In some cases we can implement these methods more efficiently, and there we can simply override the default implementations we get from Vector:

def (this is Range).contains(val)
 -> (this.from <= val) and (val <= this.to);
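The same pattern can be sketched in Java with a default method and an override. This is an analogy only, and it assumes the range is inclusive at both ends, as the .length definition above (to - from + 1) implies. IntVector, IntRange, scanContains and OverrideCheck are hypothetical names.

```java
interface IntVector {
  int length();
  int get(int index);

  // Default: a linear scan that uses only the two intrinsics.
  default boolean contains(int value) {
    for (int i = 0; i < length(); i++) {
      if (get(i) == value) return true;
    }
    return false;
  }
}

// Inclusive range of integers from..to (an assumption, see the lead-in).
class IntRange implements IntVector {
  final int from;
  final int to;

  IntRange(int from, int to) {
    this.from = from;
    this.to = to;
  }

  public int length() { return to - from + 1; }

  public int get(int index) { return from + index; }

  // Override: a constant-time bounds check replaces the scan.
  @Override
  public boolean contains(int value) {
    return from <= value && value <= to;
  }

  // Exposes the inherited default so the two versions can be compared.
  boolean scanContains(int value) {
    return IntVector.super.contains(value);
  }
}

class OverrideCheck {
  // True when the override and the default agree for every value in [lo, hi].
  static boolean agrees(IntRange range, int lo, int hi) {
    for (int v = lo; v <= hi; v++) {
      if (range.contains(v) != range.scanContains(v)) return false;
    }
    return true;
  }
}
```

An override is only safe if it gives the same answers as the default it replaces, which is what OverrideCheck.agrees tests over a span of values.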

You get some of the same advantages with abstract superclasses or traits, but this is simpler. This description has only covered protocols in terms of simple single inheritance, but it generalizes straightforwardly to objects that implement more than one protocol, and to multiple dispatch. But that will have to wait until a future post.