Monthly Archives: February 2006

Figurines

Here’s a gift for the person who has everything: Hieronymus Bosch action figures. When someone has an idea this good you don’t think “how did they come up with this” but “why hasn’t anyone thought of this before”.

Disappointed!

I’m working on our new neptune plugin for eclipse and I draw a lot of inspiration from the JDT java plugin that comes with eclipse. One thing I really like is that you can do refactorings and navigation in a file even if parts of the source are not legal java code. The parser they use apparently just skips the illegal parts and gets right back on track as soon as the source becomes parseable again. For instance, the parser has no problem reading this nonsense:

public class Klass {

int float double foo
String bar() {
List baz() { { {
float %$%$ quux();
try catch finally if else
class K {

}

This class is shown in the outline as having one field, foo, three methods, bar, baz and quux, and an inner class K. I would have expected the unmatched brackets or illegal characters to throw the parser off a bit but it doesn’t seem to care.

My first guess was that maybe they used the indentation as a guide, but apparently whitespace is completely irrelevant: putting the whole thing on one line makes no difference. So to understand how they did it I ended up single-stepping through the gory innards of JDT until I found the parser, hoping to see all sorts of well-documented heuristic rules that I could understand and use in our own plugin.

Full of anticipation, I stepped over lines and lines of initialization code, saw the first token being read from the input and finally reached the sacred inner loop of the parser. After having gone around in the loop a few times I noticed that what appeared to control everything was the result of a single method call. Ready for the big revelation, I stepped into the method and found a single line:

return term_action[term_check[base_action[state]+sym]
== sym ? base_action[state] + sym : base_action[state]];

Disappointed! I hate automaton-based parsers.
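
For readers who have not met this idiom before: the line reads like a lookup in a compressed parse table of the kind parser generators emit. Here is a minimal Java sketch of the same expression running against tiny hand-made tables. The table contents, the class name ParseTableSketch, the method name tAction and the meaning attached to each array in the comments are assumptions made for illustration, not something taken from JDT.

```java
// Toy tables for illustration only; real generated tables are far larger.
class ParseTableSketch {
    // Assumed meaning: base_action[state] is the offset of that state's row
    // inside the two shared arrays below.
    static final int[] base_action = {0, 4};
    // Assumed meaning: term_check[i] is the symbol that owns slot i,
    // or 0 when the slot holds no entry for any symbol.
    static final int[] term_check = {0, 1, 2, 0, 0, 0, 0, 3};
    // term_action[i] is the action stored in slot i. The slot at the row
    // offset itself holds the state's default action.
    static final int[] term_action = {99, 10, 20, 0, 77, 0, 0, 30};

    // Same shape as the one-liner above: use the slot for (state, sym) if it
    // really belongs to sym, otherwise fall back to the state's default slot.
    static int tAction(int state, int sym) {
        return term_action[term_check[base_action[state] + sym] == sym
                ? base_action[state] + sym
                : base_action[state]];
    }

    public static void main(String[] args) {
        System.out.println(tAction(0, 1)); // 10: explicit entry for symbol 1
        System.out.println(tAction(0, 3)); // 99: no entry, default of state 0
        System.out.println(tAction(1, 3)); // 30: explicit entry for symbol 3
        System.out.println(tAction(1, 1)); // 77: no entry, default of state 1
    }
}
```

In this toy version symbols are numbered from 1, so slot 0 of each row is free to hold the default. All the behaviour lives in the numbers, which is why stepping through such a parser reveals no readable heuristics.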

Constructors

In the new neptune language, we’ve experimented with various shorthands that allow you to write code that is more concise and readable than equivalent code in neptune’s “parent” languages C/C++ and smalltalk. Most of the new constructs are identical or similar to well-known constructs in existing languages but usually have a different twist. The most “twisted” of our new constructs is the constructor.

One of the things that’s bothered me most in smalltalk is all the boilerplate code you need when writing a new kind of object: to construct an object you usually need two methods, a static constructor method which is mostly boilerplate code and an init method for initializing the instance. Using this pattern, the construction part of a Point would look something like this:

Point = Object (

| x y |

initPointX: xInt y: yInt = (
x := xInt.
y := yInt.
)

) class (

newX: xInt y: yInt = (
| result |
result := super new.
result initPointX: xInt y: yInt.
^ result.
)

)

With this implementation you create a new point by writing Point newX: 0 y: 1. Here’s the same code again with all the code that’s not boilerplate underlined:

Point = Object (

| x y |

initPointX: xInt y: yInt = (
x := xInt.
y := yInt.
)

) class (

newX: xInt y: yInt = (
| result |
result := super new.
result initPointX: xInt y: yInt.
^ result.
)

)

The only “interesting” parts of the code in the common case are 1) how many arguments the constructor expects, 2) how the super constructor is invoked, and 3) how the instance is initialized. And once you’ve written a few dozen objects you start to get real tired of initWhatever methods.

One of the shorthands we’ve added in the neptune language is constructors. In neptune, you could write the class above as:

class Point {

hidden int x, y;

Point(int _x, int _y) {
x = _x;
y = _y;
}

}

To create a new point, you write new Point(0, 0). In this code, all you see is the non-boilerplate code from the example above. And in fact you don’t even see the call to the super constructor because it is generated for you if you don’t write it yourself. Constructors can be used in much the same way as constructors in C++, Java or C# but are also very different from constructors in those languages. First of all, there are no special rules about how you implement your constructors beyond the rule that if you don’t write a call to super(...) somewhere one will be generated for you. But you’re free to call the super constructor whenever you want and as many times as you want.
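
To make the “whenever you want and as many times as you want” part concrete, here is a rough Java analogy. Java has traditionally insisted that a super(...) call come first in a constructor, so this sketch uses plain init methods instead of Java constructors. The ClampedPoint class, its clampCount field and the create method are hypothetical names invented for the example, and none of this is neptune syntax.

```java
class Point {
    int x, y;

    // Stands in for the initializer that a constructor body would turn into.
    void initPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

// Hypothetical subclass: a point whose x coordinate is never negative.
class ClampedPoint extends Point {
    int clampCount;

    void initClampedPoint(int x, int y) {
        clampCount = 0;      // work done before the "super constructor" runs
        initPoint(x, y);     // the "super constructor" call, not the first statement
        if (x < 0) {
            clampCount++;
            initPoint(0, y); // and nothing stops us from calling it a second time
        }
    }

    // Stands in for the generated static new operator.
    static ClampedPoint create(int x, int y) {
        ClampedPoint result = new ClampedPoint();
        result.initClampedPoint(x, y);
        return result;
    }
}
```

Because the initializer is an ordinary method, the order and number of super calls is just a matter of what the body says.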

Another difference from traditional constructors is that neptune constructors are only a shorthand that you don’t actually need to use to construct objects. When you write new Point(0, 0), that simply means calling the new method on class Point, which you are free to implement any way you want. An equivalent implementation of a Point constructor would be:

class Point {

hidden int x, y;

void initPoint(int _x, int _y) {
x = _x;
y = _y;
}

static operator new(int x, int y) {
Point result = new super();
result.initPoint(x, y);
return result;
}

}

All that happens when you use the constructor syntax is that the two methods are created for you: an instance initializer containing the body of the constructor and a static new operator that creates the object and calls the initializer. One of the problems with constructors in many languages is that calling the constructor in a class must return an instance of that class. In smalltalk, “constructor” methods are free to return whatever they want, which can be a very powerful tool. And the same thing is true in neptune since you are free to implement the new operator however you want. But most of the time constructors simply create and initialize instances, and in those cases you can use the shorthand.
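
As an illustration of why a constructor that may return anything is useful, here is a small Java sketch in which a static factory method plays the role of a hand-written new operator and hands back a shared instance for the origin. Java has no overridable new operator, so the create method is only an analogy, and CachedPoint and ORIGIN are hypothetical names.

```java
class CachedPoint {
    // One shared instance for the most common point.
    private static final CachedPoint ORIGIN = new CachedPoint(0, 0);

    final int x, y;

    private CachedPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Stands in for a custom new operator: it decides what to return and is
    // not obliged to allocate a fresh object on every call.
    static CachedPoint create(int x, int y) {
        if (x == 0 && y == 0) {
            return ORIGIN;
        }
        return new CachedPoint(x, y);
    }
}
```

Callers write CachedPoint.create(0, 0) and cannot tell whether they got a new object or a cached one, which is the freedom the paragraph above describes.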

There is one piece of boilerplate left in the code above, however. When I write a constructor, the arguments are very often stored in fields in the object. In the point example above, that’s all the constructor does: stores _x and _y in x and y. Because this happens so often, we’ve added another shorthand for storing arguments in instance variables:

class Point {

hidden int x, y;

Point(int -> x, int -> y);

}

The arrow notation, int -> x, means that the constructor takes an integer argument and stores it in the field x. In this case we don’t even need to give the constructor a body since all it does is set the variables. Compared with the original smalltalk code, and the fully expanded neptune code, the last example is not only much faster to write but easier to understand and maintain. And there’s no “magic”: every shorthand used above maps in a trivial way to other constructs in the language.

The last constructor-related shorthand we’ve added is instance variable initialization. In many cases, instance variables must be initialized before the object can be used. For instance, a PushButton might have a list of button click listeners:

class PushButton {

hidden List button_click_listeners;

PushButton() {
button_click_listeners = new LinkedList();
}

}

The disadvantage of this is that every constructor needs to initialize the list of listeners (or call a constructor that does), and that it is less clear how the field is initialized, since you need to inspect all the constructors to find out. An alternative way to write this is to initialize the field directly:

class PushButton {

hidden List button_click_listeners = new LinkedList();

}

This means the same as the code above but is more compact and, again, easier to understand and maintain.
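
Java field initializers behave in a comparable way, so a Java sketch can show the benefit once there is more than one constructor. The class name PushButtonSketch, the label field and the helper methods are hypothetical additions for the example; only the listener list comes from the code above.

```java
import java.util.LinkedList;
import java.util.List;

class PushButtonSketch {
    // Initialized in one place; Java runs this initializer no matter which
    // constructor is used to create the object.
    private final List<Runnable> buttonClickListeners = new LinkedList<>();
    private final String label;

    PushButtonSketch() {
        this("untitled");
    }

    PushButtonSketch(String label) {
        this.label = label; // neither constructor has to mention the list
    }

    String label() {
        return label;
    }

    void addClickListener(Runnable listener) {
        buttonClickListeners.add(listener);
    }

    int listenerCount() {
        return buttonClickListeners.size();
    }

    // Notifies every registered listener in registration order.
    void click() {
        for (Runnable listener : buttonClickListeners) {
            listener.run();
        }
    }
}
```

Adding a third constructor later cannot leave the list uninitialized, and a reader only has to look at the field declaration to see how it starts out.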

We have a bunch of other shorthands which I’ll probably write more about later, including collection initializers, local functions, and (possibly) default arguments.