Category Archives: languages

Why I find IEEE 754 frustrating

Programming languages new (Java) and old (Fortran), and their compilers, still lack competent support for features of IEEE 754 so painstakingly provided by practically all hardware nowadays. […] Programmers seem unaware that IEEE 754 is a standard for their programming environment, not just for hardware.

William Kahan

This post is the result of me, a language design guy, trying and failing to figure out how to provide language bindings for IEEE 754 in a modern language. The TL;DR is,

  • The basic model described in 754 is great, no complaints.
  • The licensing terms of the spec document are deeply problematic.
  • The language bindings are too specific and have an old-school procedural flavor that makes them difficult to apply to a modern language.
  • It would be much more useful to know the motivation behind the design; that way each language can find a solution appropriate for it.

The rest of the post expands on these points. First off I’ll give some context about 754 and floating-point support in general.

754

I want to start out by saying that IEEE 754 has been enormously helpful for the world of computing. The fact that we have one ubiquitous and well-designed floating-point model across all major architectures, not multiple competing and less well designed models, is to a large extent thanks to 754. It’s been so successful that it’s easy to overlook what an achievement it was. Today it’s the water we swim in.

But 754 is two things, what I’ll call the model and the bindings. The model is a description of a binary floating-point representation and how basic operations on that representation should work. The bindings are a description of how a programming language should make this functionality available to programmers. The model is the part that’s been successful. The bindings, as Kahan, the driving force behind 754, admits, not so much.

I’ll go into more detail about why the bindings are problematic in a sec (it’s the main point of this post), but first I want to take a broader look at floating-point support in languages.

Floating-point is hard

I’m not really a numerical analysis guy at all, my interest is in language design (though as it happens the most popular blog post I’ve ever written, 0x5f3759df, is also about floating-point numbers). What interests me about floating-point (FP from now on) is that a couple of factors conspire to make it particularly difficult to get right in a language.

  1. Because floats are so fundamental, as well as for performance reasons, it’s difficult to provide FP as a library. They’re almost always part of the core of the language, the part the language designers design themselves.
  2. Numerical analysis is hard, it’s not something you just pick up if you’re not interested in it. Most language designers don’t have an interest in numerical analysis.
  3. A language can get away with incomplete or incorrect FP support. To quote James Gosling, “95% of the folks out there are completely clueless about floating-point.”

The goal of the binding part of 754 is, I think, to address the second point. Something along the lines of,

Dear language designers,
We know you’re not at home in this domain and probably regret that you can’t avoid it completely. To help you out we’ve put together these guidelines, if you just follow them you’ll be fine.

Much love,
The numerical analysts

I think this is a great idea in principle. So what’s the problem?

Licensing

The most immediate problem is the licensing on the 754 document itself. Have you ever actually read it? I hadn’t until very recently. Do you ever see links from discussions around FP into the spec? I don’t, ever. There’s a reason for that and here it is,

[image: the IEEE spec document’s licensing terms]

The document is simply not freely available. This alone is a massive problem. Before you complain that people don’t implement your spec maybe stop charging them $88 to even read it?

More insidiously, say I support 754 to the letter, how do I document it? What if I want to describe, in my language spec, under which circumstances a divideByZero is raised? This is what the spec says,

The divideByZero exception shall be signaled if and only if an exact infinite result is defined for an operation on finite operands. The default result of divideByZero shall be an ∞ correctly signed according to the operation [… snip …]

This sounds good to me as is. Can I copy it into my own spec or would that be copyright infringement? Can I write something myself that means the same? Or do I have to link to the spec and require my users to pay to read it? Even if I just say “see the spec”, might my complete implementation constitute infringement of a copyrighted structure, sequence, and organization? The spec itself says,

Users of these documents should consult all applicable laws and regulations. […] Implementers of the standard are responsible for observing or referring to the applicable regulatory requirements.

So no help there. In short, I don’t know how to create FP bindings that conform to 754 without exposing myself to a copyright infringement lawsuit in the US from IEEE. Sigh.

My first suggestion is obvious, and should have been obvious all along: IEEE 754 should be made freely available. Maybe take some inspiration from ECMA-262?

That’s not all though, there are issues with the contents of the spec too.

The spec is too specific

The main problem with how the bindings are specified is that they’re very specific. Here’s an example (my bold),

This clause specifies five kinds of exceptions that shall be signaled when they arise; the signal invokes default or alternate handling for the signaled exception. For each kind of exception the implementation shall provide a corresponding status flag.

What does it mean to provide a status flag? Is it like in JavaScript where the captures from the last regexp match are available through RegExp?

$ /.(.)./.exec("abc")
> ["abc", "b"]
$ RegExp.$1
> "b"

It looks like that’s what it means, especially if you consider functions like saveAllFlags,

Implementations shall provide the following non-computational operations that act upon multiple status flags collectively: [snip]

― flags saveAllFlags(void)
Returns a representation of the state of all status flags.

This may have made sense at the time of the original spec in 1985 but it’s mutable global state – something that has long been recognized as problematic and which languages are increasingly moving away from.
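To make the flag model concrete, here is a sketch using Python’s decimal module, whose contexts follow the same design as 754-2008’s decimal arithmetic. This is just an illustration of what sticky, thread-local status flags look like in practice, not anything taken from the spec:

from decimal import Decimal, localcontext, Inexact, DivisionByZero

# Status flags in practice: operations set them, they stay set until
# explicitly cleared, and they live in thread-local context state.
with localcontext() as ctx:
    ctx.clear_flags()
    ctx.traps[DivisionByZero] = False  # flag it instead of raising
    Decimal(1) / Decimal(3)            # result must be rounded
    Decimal(1) / Decimal(0)            # signals divideByZero
    print(ctx.flags[Inexact])          # set: the flag stuck around
    print(ctx.flags[DivisionByZero])   # set: ditto

Note that ctx here is exactly the kind of mutable, effectively global state in question.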

Here’s another example. It sounds very abstract but for simplicity you can think of “attribute” as roundingMode and “constant value” as a concrete rounding mode, for instance roundTowardsZero.

An attribute is logically associated with a program block to modify its numerical and exception semantics[…] For attribute specification, the implementation shall provide language-defined means, such as compiler directives, to specify a constant value for the attribute parameter for all standard operations in a block […].

Program block? Attribute specification? Nothing else in my language works this way. How do I even apply this in practice? If attributes are lexically scoped then that would mean rounding modes only apply to code textually within the scope – if you call out to a function its operations will be unaffected. If on the other hand attributes are dynamically scoped then it works with the functions you call – but dynamically scoped state of this flavor is really uncommon these days, most likely nothing else in your language behaves this way.
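For a concrete feel for the dynamically scoped variant, here is a sketch using Python’s decimal contexts, which are thread-local and therefore affect everything called within a block:

from decimal import Decimal, localcontext, ROUND_CEILING

def third():
    return Decimal(1) / Decimal(3)

# The rounding attribute set in the block also governs third(), which
# is defined outside it: dynamic scoping, the second option above.
with localcontext() as ctx:
    ctx.rounding = ROUND_CEILING
    print(third())  # last digit rounded up inside the dynamic extent

print(third())      # default rounding again outside the block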

You get the picture. The requirements are very specific and some of them are almost impossible to apply to a modern language. Even with the best will in the world I don’t know how to follow them except by embedding a little mini-language just for FP that is completely different in flavor from the rest. Surely that’s not what anyone wants?

What do I want then?

As I said earlier, it’s not that I want to be left alone to do FP myself. I’m a language guy, almost certainly I won’t get this right without guidance. So what is it that I want?

I want the motivation, not some solution. Give me a suggested solution if you want but please also tell me what the problem is you’re trying to solve, why I’m supporting this in the first place. That way, if for some reason your suggested solution just isn’t feasible there’s the possibility that I can design a more appropriate solution to the same problem. That way at least there’ll be a solution, just not the suggested one.

For instance, it is my understanding that the rounding mode mechanism was motivated mainly by the desire to have a clean way to determine error bounds. (Note: as pointed out on HN and in the comments, the rest of this section is wrong. Something similar was indeed the motivation for the mechanism but not exactly this, and you don’t get the guarantees I claim you do. See the HN discussion for details.) It works like this. Say you have a nontrivial computation and you want to know not just a result but also the result’s accuracy. With the rounding mode model you’d do this: repeat the exact same computation three times, first with the default rounding, then round-positive (which is guaranteed to give a result no smaller than the exact result) and finally round-negative (which is guaranteed to give a result no larger). This will give you an approximate result and an interval that the exact result is guaranteed to be within. The attribute scope gymnastics are to ensure that you can replicate the computation exactly, the only difference being the rounding mode. It’s actually a pretty neat solution if your language can support it.
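To make the mechanism concrete (though, per the note above, not the guarantees I claimed), here is a sketch in Python’s decimal arithmetic; the computation is a made-up example whose operations happen to be monotone, which is what makes the bracketing valid here:

from decimal import Decimal, localcontext, ROUND_FLOOR, ROUND_CEILING

def compute():
    # An arbitrary example computation, repeated exactly under each
    # rounding mode; only the rounding attribute differs between runs.
    return Decimal(2).sqrt() * Decimal(10) / Decimal(3)

results = {}
for mode in (ROUND_FLOOR, ROUND_CEILING):
    with localcontext() as ctx:
        ctx.rounding = mode
        results[mode] = compute()

print(results[ROUND_FLOOR], "<=", results[ROUND_CEILING])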

It’s also a use case that makes sense to me, it’s something I can work with and try to support in my own language. Actually, even if I can implement the suggested bindings exactly as they’re given in the spec the motivation is still useful background knowledge – maybe it’s a use case I’ll be extra careful to test.

I’m sure there are other uses for the rounding modes as well. But even if I assume, wrongly, that there aren’t, that misapprehension is still an improvement over the current state where I have nothing to fall back on, so I dismiss the whole rounding mode model outright.

(As a bitter side note, minutes from the 754-2008 revision meetings are available online and I expect they contain at least some of the motivation information I’m interested in. I don’t know though, I can’t search for the relevant parts because the site’s robots.txt prevents search engines from indexing them. Grr.)

In conclusion, I think the people who regret that languages don’t have proper support for 754 could do a lot to improve that situation. Specifically,

  • Fix the licensing terms on the spec document.
  • Describe the use cases you need support for, not just one very specific solution which is hard to apply to a modern language.
  • As an absolute minimum, fix the robots.txt so the spec meeting minutes become searchable.

(HN discussion, reddit discussion)

Update: removed sentence that assumed the spec’s concept of exceptions referred to try/except style exceptions. Don’t use the term “interval arithmetic” about determining error bounds.

Documentation

After taking a long break over the winter I’ve started working on neutrino again. When I left off I was working on generating static data in the binary object files: basically, taking neutrino objects and constructing binary data in the generated executables. I know how my code generator should behave for this to work, but changing it to do that has been tricky: it is a large amount of relatively complex code (generating binaries is a complex process) and I’ve forgotten exactly how it works. The problem is that I didn’t do a good job of documenting it. So step one is going to be documenting it. Or rather, it turns out, that will be step two, because at this point neutrino doesn’t have a documentation format – so step one will be implementing one.

JavaDoc
I’ve always felt that java’s JavaDoc comments were a good idea in principle but problematic in a lot of the particulars. The good part is that, unlike systems like doxygen and JsDoc, it’s a standard that comes with the language, and it provides all the primitives I want when documenting code. However…

JavaDoc builds on top of block comments. Block comments are for free-form, anything goes text. This means that it’s tricky if the language wants to impose rules on them, even if those rules might be useful. For instance, it’s legal to @link to a class or method that doesn’t exist.

Also, since it is based on comments, the connection between JavaDoc and the code it documents is weak. This, for instance, is meaningless but perfectly legal:

String x = /** Looky here! */ "foo";

Finally, JavaDoc builds directly on HTML which is fine if what you want to generate is HTML but problematic if you want to generate anything else. It is also difficult to read outside of a browser and I almost always read the documentation in the code, not in the generated output.

Another small detail that’s always bugged me is that they waste two lines of space:

/**
* Two lines of comment takes up three lines of code
* because of the start and end markers.
*/

This is a two-line comment, really, and it shouldn’t take up more than two lines in the program.

For all these reasons I’m not keen to base neutrino’s documentation format on JavaDoc and I’m not familiar with any other formats that solve all of these issues. So I’ve ended up starting basically from scratch, with some inspiration from a language I previously worked on, neptune.

Neutrino
I’ll take the last two points first: the syntax. Here’s one example of what a piece of documentation looks like:

/| Moves the specified number of disks from the @from@
| peg to the @to@ peg.
|
| param(from): Which peg to move from.
| param(to): Which peg to move to.
\| param(disks): How many disks to move.

def this.move(from:, to:, disks:) {
  ...
}

I’d like the syntax to be such that the start and end markers don’t each require a separate line. In other words, you need to be able to write the documentation on the same lines as those markers. So the way this works is: /| starts a comment and \| marks that line as the end of the block, but the whole line is included in the comment. The end marker can also occur alone, without a start marker, in which case it’s a single-line doc comment:

\| Creates a new hanoi instance.
@static def Hanoi.new() => ...

The first character on each line (above that would be the | character) is used to determine how to interpret the line. A | means that the line is just text. A @ means that it is code. For instance:

/| Moves the specified number of disks from the @from@
| peg to the @to@ peg. For instance, to execute the
| towers of hanoi with 15 disks do:
|
@ def hanoi := new Hanoi();
@ hanoi.move(from: 0, to: 2, disks: 15);
|
| param(from): Which peg to move from.
| param(to): Which peg to move to.
\| param(disks): How many disks to move.

def this.move(from:, to:, disks:) {
  ...
}

This way code examples are as easy for a human to identify and read within the source as they are for a doc generator to recognize.
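To give a sense of how little a doc generator needs to do, here is a sketch of a classifier for the line markers above; the helper is hypothetical, not neutrino’s actual implementation:

# Classify doc-comment lines by their first marker characters:
# /| starts a block, \| ends one, | is plain text, @ is example code.
def classify(line):
    stripped = line.lstrip()
    if stripped.startswith("/|"):
        return ("start", stripped[2:].strip())
    if stripped.startswith("\\|"):
        return ("end", stripped[2:].strip())
    if stripped.startswith("|"):
        return ("text", stripped[1:].strip())
    if stripped.startswith("@"):
        return ("code", stripped[1:].strip())
    return ("other", stripped)

doc = r"""
/| Moves the specified number of disks.
@ hanoi.move(from: 0, to: 2, disks: 15);
\| param(disks): How many disks to move.
""".strip().splitlines()

for line in doc:
    print(classify(line))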

The second thing that makes these different from JavaDoc is that they’re not comments, they’re annotations and can only occur in the same positions as annotations. Documentation always belongs to a particular element and it is clear from the grammar which element that is. The code above is, in principle, equivalent to:

@doc(...) def this.move(from:, to:, disks:) {
  ...
}

This also means that documentation can stay attached to source elements at runtime and be accessed through reflection, like any other annotations.

Finally, the format used is not HTML but something very close to textile. This is, again, to make it easier to read from source code, and to make it easier to generate documentation in other formats than HTML. I remember we had a doc generator for neptune that generated a PDF file for a module, which worked really well (and looked, if I do say so myself, awesome).

I think this format has a lot of appealing properties and I’m actually eager to get started documenting the code generator.

Keywords

I’m finally through the Big Cleanup (well, the first one, I’m sure there’ll be others) and have now arrived at the fun part: simplifying and generalizing based on the new cleaner infrastructure, and implementing completely new stuff. On the surface level I’ve made a few syntax changes to free up syntactic elements for some future constructs, including changing from -> to => for functions and : to in in for loops. Where before you would write

def this.size() -> this.elements.length;

def this.print_all() {
  for (elm : this)
    print(elm);
}

you must now write

def this.size() => this.elements.length;

def this.print_all() {
  for (elm in this)
    print(elm);
}

This is because I need the small arrow for delayed calls (such as process->run()) and I need the colon for keyword arguments. I have just completed a first shot at implementing keyword arguments which required a complete overhaul of how calls work. But now that they’re implemented they seem to be well worth the trouble.

To illustrate why keyword arguments are helpful consider this method from the implementation of towers of Hanoi:

def this.run(size) {
  this.build(0, size);
  this.move(0, 1, size);
  this.moves;
}

Quick, what does this do? If the algorithm being implemented here isn’t already clear in your mind when you read this it’s not obvious – you have to know the solution to infer what the various arguments are and what this does. Now consider this:

def this.run(size:) {
  this.build(pile: 0, disks: size);
  this.move(from: 0, to: 1, disks: size);
  this.moves;
}

With keyword arguments you don’t have to guess how this works, the code tells you: there are piles with integer indices, we first build pile 0 with size elements and then move that many disks from 0 to 1. Inferring this from the implementation without keywords is possible but requires you to think. Inferring it from the second requires no effort, it is clear from the code.

Keyword arguments are not a new invention, they are present in several popular languages. However, I would claim that they are used in an ad-hoc fashion, in part because languages like python give them to you for free. In python, any function, for instance

def run(self, size):
    ...

can be called with or without keyword arguments, in this case h.run(3) and h.run(size=3). This is good because it means that there’s no extra effort required to enable keyword arguments, but it’s bad because it means that you don’t have to put much thought into that part of your interface. There is also no established rule of thumb that I know of for when to use keywords in python — the google python style guide doesn’t even have an opinion, which really says something.
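(As an aside, python 3 does also have an opt-in flavor: parameters after a bare * are keyword-only, which is closer in spirit to the explicit declarations described below. A sketch:)

class Hanoi:
    # Parameters after the bare * can only be passed by keyword.
    def run(self, *, size):
        return size

h = Hanoi()
print(h.run(size=3))  # fine
# h.run(3)            # TypeError: takes 1 positional argument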

In neutrino, the intention is that, the same way you make code more readable by using descriptive variable and method names, you should use keyword arguments. It’s just one of the mechanisms that can be used to make code self-documenting. The basic assumption is that keywords should be used whenever they aid code readability but left out if they are redundant, which is actually most of the time. Keywords are a public part of a method’s signature and should be chosen as carefully as the method’s name, since they are in some sense part of the name. Consequently, parameters that can be passed via keyword arguments are declared in one way (def foo(size:) => ...) and those that can’t in another (def foo(size) => …). This is for two reasons: to make it explicit to users that it may not be obvious what these arguments mean, so you might want to consider passing them using keywords, and to make it clear to the implementor that people will depend on those keyword names, so there are compatibility concerns if you change them.

An interesting part of this will be learning to use it myself, which may lead me to change how this works. However, I’m sure I’ll hang on to the idea of keyword arguments in some form; for code readability there is just no replacement for a construct like this.

TextMate support

The project has been fairly quiet lately. That’s not because nothing is happening, it’s just that I’m still cleaning up the codebase which makes for uninteresting blogposts. The good news is that I’m done reimplementing the parts that needed to be replaced and am now working on the fun stuff: taking advantage of a better model to make the implementation cleaner.

One new “feature” I have added is a TextMate mode for neutrino. TextMate, the best non-IDE code editor I’m aware of for the Mac, makes it pretty easy to add support for new languages.

To install it simply browse to /misc/editors using Finder and double-click the Neutrino TextMate bundle. Easy!

Mutability

In my last post about constructors I mentioned that objects have three states: under construction, mutable and immutable. The under construction state is not likely to be one you’ll see often — it will be more of an implementation detail — but the mutable and immutable states you’ll see.

Neutrino is strict about mutability. You are free to have mutable state but you are limited in how you are allowed to use it. State can only be shared between threads if it is immutable. State passed from one execution stage to another (think of it as from compile time to runtime), including global constants, must be immutable. In general, immutable data is required in a lot of places and neutrino has to make it as easy as possible to work with it.

The tricky issue with immutable data is: how do you construct it? Simple data like pairs and singly-linked lists compose nicely and lend themselves to being constructed immutably, compositionally from only immutable parts. Most practical data structures, including hash maps and random-access arrays, do not lend themselves to immutable construction but are more conveniently constructed by successive operations on mutable data. To make immutable instances of these you need more complex machinery.

One popular pattern for more complex object construction is to use a mutable builder to define the structure of the object, and then have it produce an immutable copy. This is useful in a lot of cases but also has its limitations. First of all, it adds a lot of code. It also means that you get at least two copies of your object: the builder and the resulting object. Finally, it only helps if the resulting object is non-circular — you can’t use a builder to construct a circular list, for instance. Builders are neat but limited.
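A minimal sketch of the pattern in Python, with a tuple standing in for the immutable result; note both costs mentioned above, the extra code and the two objects:

class VectorBuilder:
    def __init__(self):
        self._items = []           # mutable state, builder only

    def add(self, item):
        self._items.append(item)
        return self                # allow chaining

    def build(self):
        return tuple(self._items)  # immutable snapshot, second object

v = VectorBuilder().add(1).add(2).add(3).build()
print(v)  # (1, 2, 3)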

Another non-solution to this issue is to wrap your mutable data structure in an immutable wrapper. This doesn’t actually prevent the object from being mutated — the underlying data is still mutable — but it allows you to pass the object around in a way that prevents users of the data from modifying it. That is also useful but gives none of the guarantees we need.

The thing to note about these two examples is that they’re patterns for building nontrivial immutable data in languages that have no proper concept of mutable vs. immutable data. I would claim that, as with threads, immutable data cannot be implemented as a library. This is why, even though the whole point of neutrino is simplicity, the distinction between mutable and immutable is basic and understood by the language.

As described in the post about constructors a neutrino object is in one of three states: under construction, mutable and immutable. An object can move from fewer restrictions to more, from under construction to mutable and from under construction and mutable to immutable, but never the other way. These states are used in a few different situations.

One way they’re used is when passing objects outside the current thread. Only objects that are transitively immutable, which the runtime can check, can be passed between threads, whether within the same process or via RPCs. The state of an object is also taken into account when looking up methods. Basically, you can have mutator methods on an object which are only present when the object is not in the immutable state. For instance:

protocol HashMap;

def (this is HashMap)[key] {
  def entry := this.find_entry(key, this.hash(key), false);
  if entry = null
    then null;
    else entry.value;
}

def (@mutable this is HashMap)[key]:=(value) {
  def entry := this.find_entry(key, this.hash(key), true);
  entry.value := value;
}

// Rest of hash map implementation...

The effect of the @mutable annotation is that the setter here can only be called when the object is mutable. You might think of this in terms of reclassification: part of an object’s type is whether or not it is mutable, and changes to mutability status of an object changes the object’s type. This also works for the under construction state so initialization methods can be marked so that they’re only available when the object is under construction and you can be sure that it will be impossible for anyone to call those methods once the object is no longer under construction. The pattern for creating immutable data is simple: create a mutable version, populate it and then freeze it. The mutator methods will fall away and you’ll be left with an object with the appropriate immutable interface which the language will guarantee cannot be modified further, since it will prevent any attempt to write to a field of an immutable object.
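In a language without mutability-aware method lookup you can only approximate this with explicit checks. A rough Python sketch of the create/populate/freeze pattern; unlike in neutrino, the mutator is still visible after freezing, it just fails:

class FreezableMap:
    def __init__(self):
        self._frozen = False
        self._data = {}

    def __getitem__(self, key):
        return self._data.get(key)

    def __setitem__(self, key, value):
        # In neutrino this method would simply fall away on freeze;
        # here we have to check a flag instead.
        if self._frozen:
            raise TypeError("object is immutable")
        self._data[key] = value

    def freeze(self):
        self._frozen = True
        return self

m = FreezableMap()
m["a"] = 1
m.freeze()
# m["b"] = 2  # raises TypeError: object is immutable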

Or rather, immutable objects can actually still be modified to some extent. That may sound counterintuitive but it is completely safe and follows from the mechanism above and how it interacts with field keys.

To reiterate what I described in more detail in the post about constructors, fields in neutrino are set using field keys. Rather than set and get a field through a string name like in python or javascript you use an object, the field’s key. If you have the field key then you can get and set the field. If you don’t have the key there is no way you can access the field, and no way for you to get access to the key except by having someone with the key pass it to you. Basically, it is a capability that allows you to access an object field. For a “public” field, one that others are allowed to access, there will be public getter and/or setter methods which have access to the key, so anyone can access it by calling those methods. For “private” state there would be no direct accessors; instead the methods that need to access the field will be given access to the key directly, and no one else.

This model means that there are two objects involved in getting and setting fields: the object itself and the field key. This gives four different combinations of mutability: the object can be mutable or not, the key can be mutable or not. Only if both are immutable will setting the field fail. Immutable field keys are said to be global, mutable ones are local.

Global fields are “normal” fields. They can be set only when the object is not in the immutable state, and they remain readable when an object is transferred out of its context. (A quick timeout: when I say “context” a good approximation is to think “thread”. In reality it is a bit broader and includes transferring from compile time to runtime, between threads and between processes over RPC, but the common and familiar case is between threads). Technically, the reason the values of global fields are accessible out of context is that the field key is immutable so it can be transferred along with the object and used to access the field in the destination context. You can also have global fields that are not accessible outside the context: simply don’t transfer the key along with the object.

Local fields on the other hand can be set even on immutable objects. This is safe because, being mutable, the field key cannot leave the current context and hence is guaranteed to only be accessible locally. You might think of a local field as an associative map where setting the value doesn’t actually change the object, it just creates an association for the object in the key. Whether that’s how the implementation treats it is up to it; it may decide to store the value on the object itself for efficiency, but it shouldn’t matter from the programmer’s viewpoint: you get to store some data with an object regardless of whether that object is mutable, under the restriction that the data will not be accessible outside your context. This is convenient for storing hashes, coloring objects when traversing object graphs, etc.
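The associative-map reading is easy to make concrete. Here is a Python sketch of a field key as a side table, where setting a “field” never touches the object itself; the names are made up for illustration:

import weakref

class FieldKey:
    # The key owns a weak map from objects to values, so storing a
    # value never modifies the object, and the association disappears
    # when the object is collected.
    def __init__(self):
        self._table = weakref.WeakKeyDictionary()

    def get(self, obj):
        return self._table.get(obj)

    def set(self, obj, value):
        self._table[obj] = value

class Node:
    pass

color = FieldKey()   # e.g. coloring objects during a graph traversal
n = Node()
color.set(n, "red")
print(color.get(n))  # red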

That is the general idea of how mutability works in neutrino. It is a slightly more complex model than in most languages but the complexity pays off in increased flexibility and security in a way that is practical and useful. In particular, the simple rules that make this secure (mutable data cannot leave its context, and all fields are accessed through unforgeable keys) are a key part of the design of neutrino.

Constructors

My last post about protocols used object instantiation but glossed over how it actually works. One reason for that is that I haven’t sorted out all the details of it yet, but I think I’m close enough to a solution that I can give a rough outline of how it will work.

Most languages I know have some form of construction phase when creating objects. During this phase the object isn’t fully initialized yet so many of the assumptions you can usually make don’t actually hold yet. The object under construction is allowed to be in an inconsistent state, and in some cases you can observe these inconsistencies. Here is an example of a program observing two different values for a final field in Java, which is otherwise not allowed to change:

public class Test {

  private final int value;

  public Test() {
    printValue();
    this.value = 4;
    printValue();
  }

  private void printValue() {
    System.out.println(value);
  }

}

The exact same thing is possible in C++:

class Test {
 public:
  Test() : x_(print_x()) {
    print_x();
  }
  int print_x() {
    printf("%i\n", x_);
    return 0;
  }
 private:
  const int x_;
};

Even though the field is supposedly constant this prints two different values for it (technically the value of x_ is not well-defined before it is initialized so it might not change because it was already 0).

I use Java and C++ as examples because those languages make an effort to limit what you can do during construction, an effort which is nowhere near airtight. Other object-oriented languages, for instance python, have more relaxed rules, and some, like JavaScript and smalltalk, have none at all.

For neutrino I had several requirements in mind for object construction. It must be possible for one implementation type to extend another, and extra code should not be required in the supertype for this to work. There should be no constructors: no methods that are allowed to construct or initialize objects but are covered by various rules and restrictions. It must be possible to construct and subtype immutable objects. It must be possible for each step in the construction process to execute arbitrary code to calculate the fields of the new object.

The model I ended up with to support this has two components: three-state objects and field keys. I’ll first outline how they work and then give an example at the end.

Three-state objects

All objects are in one of three states: under construction, mutable and immutable. An object can only move towards more restrictions, that is, from being under construction to being mutable and from being under construction or mutable to being immutable. For now we’ll ignore the mutable/immutable distinction and focus on being or not being under construction.

When an object is born it has no state and implements no protocols. It is an empty atom which has nothing but its object identity. However, since it is in the under construction state all that can change. An object under construction can have protocols added to it, which means that the object will now respond to all the methods associated with those protocols. It can also have fields set and changed, similar to how this works in JavaScript and python: when you set a field the object doesn’t yet have, that field is created. There are also notable differences from those languages though.

An object under construction can be passed around to any number of methods which can set fields and add protocols – construct the object, basically. When the object is considered done, typically when it is returned to the constructor call that started the construction process, it can be moved to one of the two other states. From that point on construction will be considered complete and it will no longer be possible to add protocols to the object, though it may still be possible to add and set fields.

This model means that the fact that the object is under construction is understood by the underlying system. There doesn’t have to be any restrictions on the code that constructs the object, any method can do that. One “constructor” method can be extended by having the “sub-constructor” simply call the original constructor and then extend the returned object, which is still under construction, with additional fields and protocols. There is no restriction on how many objects are under construction at the same time.

It also means that during the construction phase you’re not encumbered by restrictions on how you’re allowed to write your construction code. The object is in a special state, the runtime system understands that and allows you to construct it however is convenient for you. Afterwards, the runtime system knows that the object can no longer change how it is constructed and can use that information to optimize the program. In Java and C++ any property of an object that is violated during construction is worthless for optimization purposes because it is impossible to tell, short of adding an implicit under construction flag, whether it is safe to rely on this property.

Field keys

Earlier I mentioned that you can add and set fields during construction, similar to python and JavaScript. That is only true in a limited sense. Neutrino objects don’t have fields in the usual sense but instead use field keys which are similar to private names as proposed for JavaScript. Where JavaScript and python use string names to identify an object field, for instance the string "x" to get the x coordinate of a Point, neutrino accesses fields through keys. A key is an opaque object, you might think of it as a capability for accessing a particular field. When implementing a Point type you would create two new keys, an x key and a y key, and use them to store the actual coordinates of point objects.

protocol Point;

@static def Point_x := new GlobalField();
@static def Point_y := new GlobalField();

def Point.new(x, y) {
  def result := new Object();
  result::Point_x := x;
  result::Point_y := y;
  Point.attach(result);
  return result;
}

def (this is Point).x -> this::Point_x;

def (this is Point).y -> this::Point_y;

Here we create two field keys and use them to attach the two coordinates to the point, setting them (which means creating them) in the new method and reading them in the two accessors. Direct access to the field values is only possible if you have access to the Point_x and Point_y values, which can be kept private or exposed as appropriate. As usual, this is not how you would write your program but how the underlying concepts work and how the implementation understands object construction. There would be shorthands for doing all this and I expect that actually seeing field keys will be rare.

The way this works, there is no way for different fields to interfere — if you want to associate some data with an object you just make your own key and store the data, and no other user of the object should care. This is much more flexible than languages where the set of fields is fixed by the type declaration, but should be no more expensive in the cases where all you need is a fixed set of fields set by a straightforward constructor. Code like the above is straightforward to analyze and optimize, especially in cases like this where it is impossible to observe the result while the fields are being set and the protocol added.

That, basically, is how object construction works in neutrino. I’ll describe the second part of this, how the mutable and immutable states work, in a later post.

Protocols

One of the most fundamental constructs in neutrino is the protocol.

Protocols are similar to Java interfaces in some respects, and the basic purpose of a protocol is similar: to specify that an object supports a certain behavior without necessarily specifying an implementation of that behavior. However, protocols are also very different from interfaces.

To begin with, here’s an example of a protocol declaration.

protocol Vector is Collection;

Already you see a considerable difference between protocols and interfaces: protocols don’t specify any methods. A protocol is, at the outset, completely opaque and atomic. How does this match with what I just said, that a protocol is used to specify that an object supports a certain behavior?

In Java, interfaces are very much a tool of the static type system. On the one hand, implementing an interface forces you to implement a particular set of methods, otherwise your program will not compile. On the other, if you’re given an instance of an interface then you’re guaranteed that it will have an implementation of its methods. Note, however, that this kind of static checking is limited. As long as you define methods with the right signatures that’s all the type checker cares about; it doesn’t care how you implement them or whether they obey their respective contracts.

Imagine for a moment that we could remove the type system from Java (ignoring the ways it interacts with program behavior through static method overloading, reflection, etc.). What use would the methods listed in an interface be? They would be a convenient way for a programmer to see which methods to implement, and which he could call, but the language wouldn’t actually care. Machine-readable documentation is a good thing, certainly, but the goal of neutrino is to be as simple as possible at the basic level, and then allow more complex utilities – like type systems – to be built on top. Since neutrino is a dynamically typed language the implementation really doesn’t need to know which methods correspond to a particular protocol. If you say that your object is a Vector then that’s all the implementation needs to know; it doesn’t care what the requirements are for being a Vector, neither which methods it requires nor, as with Java, whether the implementations obey the contracts for those methods.

That was the first major difference between protocols and interfaces. The second is that you can associate behavior with protocols.

In Java, an interface is a pure specification with no behavior. I’m sure this is intentional, and I understand why that is, but it does cause some serious problems. In particular it causes tension between keeping interfaces small to make them easy to implement, and making them broad so they’re easier to use. A classic example is Java’s List interface. For convenience, the List interface has a lot of methods, many of which could be trivially defined in terms of each other. Like addAll in terms of add, isEmpty in terms of size, toArray in terms of iterator, and so on. Beyond these there are a number of utility functions in Collections, like indexOfSubList, replaceAll, rotate, etc., which actually belong to List. In other words, there are some methods that are intrinsic to being a List, like add and size, some that are simple utilities that are included in the interface to make it convenient to work with, and some that are less common or challenging to implement which are put outside the interface in a utility class. This sucks for the user of List because he has to look for methods both on the list itself and the utility class, and it’s a drag for the implementor of a class that implements List who has to choose between making his one and only superclass AbstractList or re-implementing a dozen trivial utility methods.

The way neutrino deals with this is to allow protocols to have methods with implementations. Say we required implementations of Vector to have a .length method. Regardless of how a Vector is implemented we can then define

def (this is Vector).is_empty -> (this.length = 0);

The tension is now gone. The module that defined Vector can provide as many utility methods as it wants directly on the vector, without thereby tying down the implementation – you’re still free to implement the intrinsic methods, the core methods that everything else builds on, however you want. Your implementation is also free to override any of the default implementations. The programmer that uses Vector has access to all the convenient utilities and can enjoy greater reliability because there only has to be one implementation of .is_empty rather than, as in the case of List, one for each implementation that for whatever reason decides not to extend AbstractList.

To put it all together, here is one example of how this can be used. Here’s a definition of a generic vector interface and some utility functions to go with it:

/**
 * A finite, read-only, random-access vector. Requires
 * implementations of .length and the indexing operator.
 */

protocol Vector;

def (this is Vector).is_empty -> (this.length = 0);

def (this is Vector).for_each(fun) {
  for (i : 0 .. this.length)
    fun(this[i]);
}

def (this is Vector).contains(val) {
  with_1cc (return) {
    for (elm : this) {
      if elm = val
        then return(true);
    }
    false;
  }
}

Based on this you can easily define new types of Vector which defines the required intrinsics:

/**
 * A range represents a series of integers from
 * 'from' to 'to'.
 */

protocol Range is Vector;

def Range.new(from, to)
  -> ...; // Let's cover object construction later.

def (this is Range).length
  -> this.to - this.from + 1;

def (this is Range)[index]
  -> this.from + index;

After we’ve implemented the intrinsics of Vector we get all the extra functionality for free:

def range := new Range(0, 10);
range.contains(5);
> #t
range.is_empty
> #f

In some cases we can implement these methods more efficiently, and there we can simply override the default implementations we get from Vector:

def (this is Range).contains(val)
  -> (this.from <= val) and (val < this.to);

You get some of the same advantages with abstract superclasses or traits, but this is simpler. This description has only covered protocols in terms of simple single inheritance, but it generalizes straightforwardly to objects that implement more than one protocol, and to multiple dispatch. But that will have to wait until a future post.

One-shot continuations

Neutrino has two basic control structures: method calls and one-shot continuations. In my last blog post I outlined how method calls work. In this one I’ll outline one-shot continuations.

You may already be thinking: continuations? Seriously? But it’s not as hairy as it sounds. First of all, these are one-shot, not fully general call/cc. It’s really just a convenient way to define all the usual control structures, including break, continue, and local and non-local return, in terms of one fundamental construct. In particular, one-shot continuations are, even in the most general case, no more expensive in terms of implementation complexity than non-local returns.

In the current syntax (I’ll get back to the topic of syntax and why you shouldn’t focus too much on that later) a one-shot continuation use looks something like this:

with_1cc (return) {
  for (i : 0 .. 10) {
    if found_it(i)
      then return(i)
  }
  -1;
}

What’s happening here is: at the beginning of the with_1cc statement we capture the current continuation and store it in a variable called return. We then loop from 0 to 9 and, if a condition holds, invoke the continuation and immediately yield the specified value as the result of the with_1cc expression without executing any more of the body. If the condition doesn’t hold then the continuation is implicitly invoked with the result of the body which is -1. Invoking a one-shot continuation more than once is an error and the final implicit call ensures that you’ll get an error if you try to invoke the continuation after the with_1cc expression has completed.

That’s all there is to with_1cc: it gives you a function that you can call — once — to bail out of an arbitrarily nested computation and yield a value to the whole expression. Cases like above where the capture and use of the continuation occur together in a single function are easy for the compiler to recognize and for those it can generate code that doesn’t actually materialize the continuation but simply uses a jump to leave the expression. This will work in any situation where you could use break, continue or local return in a standard C-family language:

def my_function() {
  with_1cc (return) {
    // ... prepare loop ...
    with_1cc (break) {
      for (i : 0 .. n) {
        with_1cc (continue) {
          // ... loop body ...
        }
      }
    }
    // ... finish ...
  }
}

The above code is obviously very verbose and luckily you wouldn’t have to actually write that in neutrino — the syntax is flexible and the compiler is free to automatically introduce with_1ccs when required. The point is that this can be used to implement any of the basic local jumps from the C-like languages, except for goto, and generate as efficient code for each of them as in languages where they’re built-in. In addition to mimicking other control structures it also allows patterns that are not possible in C-like languages, like expression-local returns:

def matching_entry := with_1cc (leave) {
  for (entry : this.entries) {
    if entry.name = name
      then leave(entry);
  }
}
print("Found entry: ${matching_entry}");

This code loops over a list of entries and leaves the loop, but not the whole function, immediately when the appropriate one has been found. This can be simulated in various ways: creating a utility method that uses normal return, assigning to matching_entry and breaking out of the loop, or inlining the action to be performed with the result at the place where the result is identified. Using with_1cc you can express the behavior directly without the need for workarounds. And, again, this comes without any performance penalty since the control flow can be resolved statically and the continuation doesn’t have to be materialized.

In the case where the continuation is not used locally it behaves similarly to non-local returns in languages that support those, like Smalltalk. The difference is that where non-local return ties the return to a method call, with_1cc lets you return to an arbitrary expression. Other than that they are completely equivalent.

As mentioned, the syntax of neutrino is flexible and there’s every reason for libraries to introduce shorthands based on this construct that allows it to be used without the full syntactic complexity. This is a basic building block that the base language supports, not something you’re likely to see that often in its naked form.

Update: Kevin Millikin points out that just calling the continuation on exit isn’t enough to ensure that it won’t get called more than once. Kevin’s counterexample is

def get_me_back_in := with_1cc (get_me_out) {
  with_1cc (invoke_me_later) {
    get_me_out(invoke_me_later);
  }
}
print("hest");
get_me_back_in(...);

Here we escape the implicit invocation of invoke_me_later by leaving via a surrounding continuation. Bummer. This issue is present in any language with non-local return that can be captured, for instance Smalltalk:

outer
  | innerReturn |
  innerReturn := self inner.
  ^ innerReturn value: 5.

inner
  ^ [ :v | ^ v ]

(I just tried this in Squeak; you get a BlockCannotReturn exception.)

The solution is to enclose the body of the with_1cc in an unwind-protect that ensures that no matter how we leave the with_1cc, the corresponding continuation will be disabled. Divorcing calling a continuation from disabling it does mean that the name “one-shot continuation” stops being accurate so I’ll probably have to come up with a more fitting name.
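Here is a sketch of the fixed semantics in Python: an exception stands in for the escape, a token distinguishes nested blocks, and a finally clause plays the role of the unwind-protect. The helper names are made up, and a real implementation would of course compile the local cases down to plain jumps:

class _Escape(Exception):
    def __init__(self, token, value):
        self.token, self.value = token, value

def with_1cc(body):
    token = object()  # identifies this particular block
    live = [True]
    def cont(value):
        if not live[0]:
            raise RuntimeError("continuation already disabled")
        raise _Escape(token, value)
    try:
        return body(cont)  # falling off the end: implicit invocation
    except _Escape as e:
        if e.token is not token:
            raise              # someone else's escape, keep unwinding
        return e.value
    finally:
        live[0] = False        # the unwind-protect: always disable

def find_index(pred, items):
    def body(leave):
        for i, x in enumerate(items):
            if pred(x):
                leave(i)
        return -1
    return with_1cc(body)

print(find_index(lambda x: x > 2, [1, 2, 3]))  # prints 2

Kevin’s example works out under this sketch: escaping through the inner block runs its finally clause, so the captured continuation is disabled by the time you try to invoke it.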

One thing I haven’t written about yet is coroutines, which neutrino also supports. Mixing not-really-one-shot-continuations-anymore with those is also interesting but that is a future blogpost.

Methods

Neutrino is an object-oriented language and as such the most fundamental operation is the method call. Neutrino methods are not smalltalk-style single dispatch ones, however, but multimethods.

The smalltalk model is nice and simple — but also limiting in practice. This obviously doesn’t just apply to smalltalk but any single dispatch language. You find yourself repeating the double dispatch pattern over and over. Operations that are binary by nature, such as addition and equality, are difficult to express in a way that is fully general. Generalizations such as catch-all methods or proxy objects can be awkward to express. Single dispatch is like an iceberg: the tip, the mechanism itself, has little complexity but there is a huge amount of hidden complexity that comes from having to express operations that are not, by nature, singly dispatched using only that mechanism. With neutrino I wanted to try a slightly more complex basic mechanism, under the assumption that it would reduce the amount of hidden complexity drastically.

The mechanism works like this. When defining what looks like a method you’re actually defining a matcher, so

def (this is Point).norm()
  -> Math.sqrt(this.x*this.x + this.y*this.y);

might look like a method definition on Point but is, in reality, a matcher that matches any call where the name is "norm" and the first argument is an instance of Point. Similarly,

def (this is Point)+(this is Point) -> ...

is not a method on Point but a matcher that matches any call with the name "+" and a first and second argument which are Points.

Using this model, method lookup becomes a matter of, given a call, finding the matcher that best matches the call. This is done using a scoring function which, for each possible matcher, gives an integer value to each entry in the matcher and then takes the lowest scoring (best matching) function. In the first example, if you pass a Point then the match is perfect and the score is 1. If you pass an instance of a subtype, say a CartesianPoint, then the match is less perfect and the score is 1 + the shortest distance through the inheritance hierarchy. This means that if there is a more specific method, one that takes a CartesianPoint, then it will score lower and be preferred over the method that takes general Points. If no method matches, a LookupError is thrown. If there is no unique best match, an AmbiguityError is thrown. That’s it.
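Here is a toy Python version of the scoring lookup; it is only meant to illustrate the idea, the actual rules are on the wiki page mentioned at the end of this post:

class NoMatchError(Exception): pass     # the post's LookupError
class AmbiguityError(Exception): pass

def score(expected, arg):
    # 1 for an exact type match, plus one per inheritance step;
    # None if this entry does not match at all.
    for distance, cls in enumerate(type(arg).__mro__):
        if cls is expected:
            return 1 + distance
    return None

def dispatch(matchers, *args):
    scored = []
    for types, fn in matchers:
        if len(types) != len(args):
            continue
        scores = [score(t, a) for t, a in zip(types, args)]
        if None not in scores:
            scored.append((sum(scores), fn))
    if not scored:
        raise NoMatchError()
    scored.sort(key=lambda pair: pair[0])
    if len(scored) > 1 and scored[0][0] == scored[1][0]:
        raise AmbiguityError()
    return scored[0][1](*args)

class Point: pass
class CartesianPoint(Point): pass

plus = [
    ((Point, Point), lambda a, b: "general +"),
    ((CartesianPoint, CartesianPoint), lambda a, b: "cartesian +"),
]
print(dispatch(plus, CartesianPoint(), CartesianPoint()))  # cartesian +
print(dispatch(plus, Point(), CartesianPoint()))           # general +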

When matching, all parts of the matcher are treated equally. Some parts will usually be known at compile time, for instance the method name, which means that the compiler will be able to generate more efficient code. However, that is an implementation detail and the method name is a parameter like any other. In particular, a matcher that matches any String as the name and a Point as the first argument is as valid and no different from one that happens to have a compile-time known name.

I’ve been using this mechanism for a while and find it to be much more convenient than single dispatch. There is really no comparison. No more visitors, you get to just express what you’re trying to do. Multimethods are often accused of being confusing and hard to understand. I think this is a matter of keeping the lookup rules simple and not being afraid to report an error if there is no obvious best match, rather than use elaborate tie-breaker rules.

This may look like a challenge to implement efficiently but the mechanism has been designed such that you only pay for the generality if you actually use it. If you do normal single or double dispatch with compile-time known method names then that’s what you pay for and, in particular, double dispatch can be less expensive because you can express it directly rather than through chains of separate calls, which means that the system can more easily optimize for it.

I’ve written up the details of the mechanism on the wiki under MultiMethodDispatch. This also describes how this works with keyword methods, which neutrino has but which I haven’t described here.

What is neutrino?

One of my plans with this blog is to keep it light and stick with relatively short posts about a single thing at a time. I’m planning to start writing about what neutrino is and, in keeping with the plan, I will do it in a number of separate posts rather than one big one. In this one I’ll talk about what the goal of neutrino is.

What is neutrino? It is a new programming language and implementation. I’ve implemented a lot of experimental programming languages in my spare time. Some I’ve written about, like neptune and saturn. Most of them I haven’t because they’ve just been experiments that lived on my laptop and nowhere else. Neutrino is the next programming language in the sequence and the first one I’ve worked on for this long and still felt that everything fits together just as it should.

Here’s a snippet of neutrino code, a teaser. The syntax isn’t actually that important and will change over time so don’t pay too much attention to that. I’ll explain the details of what’s going on here in future posts.

protocol ElfStringTable is ElfSection;

def ElfStringTable.new() {
  def result := new ElfStringTable {
    table := new HashMap(),
    strings := new ArrayList(),
    r_size := new Ref(0)
  };
  result.ensure("");
  result;
}

def (this is ElfStringTable).alignment -> 1;

def (this is ElfStringTable).size -> this.r_size.get();

def (this is ElfStringTable).name -> ".strtab";

def (this is ElfStringTable).type -> 3; // SHT_STRTAB

def (this is ElfStringTable)[key]
  -> this.table[key];

def (this is ElfStringTable).ensure(key) {
  def prev := this.table[key];
  if prev = null then {
    def index := this.size;
    this.table[key] := index;
    this.r_size.set(index + key.length + 1);
    this.strings.add(key);
    index;
  } else {
    prev;
  }
}

def (this is ElfStringTable).encode(out, elf) {
  def info := elf.layout.info(this.name);
  assert_equal(info.start, out.length);
  for (str : this.strings)
    out.write_c_string(str);
  assert_equal(info.end, out.length);
}

This is taken from the ELF file encoder that is part of neutrino’s linux implementation.

Besides some original ideas, neutrino takes its inspiration from a number of different places. Its concurrency and security constructs resemble those in E (the overall syntactic similarity with E is a coincidence though). The module mechanism is inspired by newspeak. The multi-stage programming model is inspired by MetaML and APT. The implementation approach and low-level programming library are very much inspired by the jikes RVM. There are many more examples like these.

These are all good ideas that are not available in any mainstream languages. There is great potential in creating a solid high-performance language implementation that combines these different ideas within one coherent model. The goal of neutrino is to do that.