Constructors
My last post about protocols used object instantiation but glossed over how it actually works. One reason for that is that I haven't sorted out all the details of it yet, but I think I'm close enough to a solution that I can give a rough outline of how it will work.
Most languages I know have some form of construction phase when creating objects. During this phase the object isn't fully initialized yet so many of the assumptions you can usually make don't actually hold yet. The object under construction is allowed to be in an inconsistent state, and in some cases you can observe these inconsistencies. Here is an example of a program observing two different values for a final field in Java, which is otherwise not allowed to change:
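public class Test {
    private final int value;

    public Test() {
        printValue();    // prints 0: the final field has not been assigned yet
        this.value = 4;
        printValue();    // prints 4
    }

    private void printValue() {
        System.out.println(value);
    }

    public static void main(String[] args) {
        new Test();
    }
}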
The exact same thing is possible in C++:
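#include <cstdio>

class Test {
 public:
  // print_x() runs before x_ is initialized; its return value initializes x_.
  Test() : x_(print_x()) {
    print_x();
  }
  int print_x() {
    std::printf("%i\n", x_);
    return 0;
  }
 private:
  const int x_;
};

int main() {
  Test test;
  return 0;
}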
Even though the field is supposedly constant, this can print two different values for it. (Technically, x_ holds an indeterminate value before it is initialized, so the output might not appear to change if that memory already happened to contain 0.)
I use Java and C++ as examples because those languages make an effort to limit what you can do during construction, an effort which is nowhere near airtight. Other object-oriented languages, for instance Python, have more relaxed rules, and some, like JavaScript and Smalltalk, have none at all.
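For comparison, here is a minimal Python sketch of the same experiment. The class and method names are made up for illustration. In Python the field is not merely unset during construction; it does not exist at all until the first assignment.

```python
class Test:
    def __init__(self):
        self.observed = []
        self.observe()      # 'value' does not exist on the object yet
        self.value = 4
        self.observe()      # now it does

    def observe(self):
        # getattr with a default avoids the AttributeError that a plain
        # self.value would raise before the field has been assigned.
        self.observed.append(getattr(self, "value", "<missing>"))


t = Test()
print(t.observed)
```

The first observation sees the placeholder and the second sees 4, so code called from the initializer can tell that the object is only half built.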
For neutrino I had several requirements in mind for object construction. It must be possible for one implementation type to extend another, and extra code should not be required in the supertype for this to work. There should be no constructors, that is, no special methods that are allowed to construct or initialize objects but are covered by various rules and restrictions. It must be possible to construct and subtype immutable objects. It must be possible for each step in the construction process to execute arbitrary code to calculate the fields of the new object.
The model I ended up with to support this has two components: three-state objects and field keys. I'll first outline how they work and then give an example at the end.
Three-state objects
All objects are in one of three states: under construction, mutable and immutable. An object can only move towards more restrictions, that is, from being under construction to being mutable and from being under construction or mutable to being immutable. For now we'll ignore the mutable/immutable distinction and focus on being or not being under construction.
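The allowed transitions can be written down as a small table. This is only a Python sketch of the rule as stated above; the names State, ALLOWED and can_move are illustrative and say nothing about how neutrino represents states internally.

```python
from enum import Enum


class State(Enum):
    UNDER_CONSTRUCTION = 1
    MUTABLE = 2
    IMMUTABLE = 3


# Legal moves: every transition adds restrictions, none removes them.
ALLOWED = {
    State.UNDER_CONSTRUCTION: {State.MUTABLE, State.IMMUTABLE},
    State.MUTABLE: {State.IMMUTABLE},
    State.IMMUTABLE: set(),
}


def can_move(src, dst):
    """Return True if an object in state src may move to state dst."""
    return dst in ALLOWED[src]


print(can_move(State.UNDER_CONSTRUCTION, State.IMMUTABLE))
print(can_move(State.MUTABLE, State.UNDER_CONSTRUCTION))
```

The first query is a legal move and the second is not: once an object has left the under construction state there is no way back.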
When an object is born it has no state and implements no protocols. It is an empty atom which has nothing but its object identity. However, since it is in the under construction state, all that can change. An object under construction can have protocols added to it, which means that the object will now respond to all the methods associated with those protocols. It can also have fields set and changed, similar to how this works in JavaScript and Python: when you set a field the object doesn't yet have, that field is created. There are also notable differences from those languages, though.
An object under construction can be passed around to any number of methods which can set fields and add protocols – construct the object, basically. When the object is considered done, typically when it is returned to the constructor call that started the construction process, it can be moved to one of the two other states. From that point on construction will be considered complete and it will no longer be possible to add protocols to the object, though it may still be possible to add and set fields.
This model means that the fact that the object is under construction is understood by the underlying system. There don't have to be any restrictions on the code that constructs the object; any method can do that. One "constructor" method can be extended by having the "sub-constructor" simply call the original constructor and then extend the returned object, which is still under construction, with additional fields and protocols. There is no restriction on how many objects are under construction at the same time.
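A rough Python sketch of the sub-constructor pattern may help. Everything here is a stand-in: Obj, attach, finish and the two build functions are hypothetical names, protocols are modeled as plain strings, and fields are keyed by strings only to keep the sketch short.

```python
class ConstructionError(Exception):
    """Raised when a protocol is attached after construction has ended."""


class Obj:
    """Hypothetical stand-in for a neutrino object."""

    def __init__(self):
        self.under_construction = True
        self.protocols = set()
        self.fields = {}

    def attach(self, protocol):
        # Protocols may only be added while the object is under construction.
        if not self.under_construction:
            raise ConstructionError("construction is complete")
        self.protocols.add(protocol)

    def finish(self):
        # Leave the under construction state; there is no way back.
        self.under_construction = False
        return self


def build_point(x, y):
    obj = Obj()
    obj.fields["x"] = x
    obj.fields["y"] = y
    obj.attach("Point")
    return obj  # still under construction


def build_point3d(x, y, z):
    # The "sub-constructor" calls the original one and keeps extending
    # the object it gets back; build_point needs no extra code for this.
    obj = build_point(x, y)
    obj.fields["z"] = z
    obj.attach("Point3D")
    return obj


p = build_point3d(1, 2, 3).finish()
print(sorted(p.protocols), p.fields)
```

Note that build_point knows nothing about build_point3d. The only thing tying them together is that the object handed back is still under construction, and once finish has been called any further attach fails.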
It also means that during the construction phase you're not encumbered by restrictions on how you're allowed to write your construction code. The object is in a special state; the runtime system understands that and allows you to construct it however is convenient for you. Afterwards, the runtime system knows that the object can no longer change how it is constructed and can use that information to optimize the program. In Java and C++ any property of an object that is violated during construction is worthless for optimization purposes because it is impossible to tell, short of adding an implicit under construction flag, whether it is safe to rely on this property.
Field keys
Earlier I mentioned that you can add and set fields during construction, similar to Python and JavaScript. That is only true in a limited sense. Neutrino objects don't have fields in the usual sense but instead use field keys, which are similar to private names as proposed for JavaScript. Where JavaScript and Python use string names to identify an object field, for instance the string "x" to get the x coordinate of a Point, neutrino accesses fields through keys. A key is an opaque object; you might think of it as a capability for accessing a particular field. When implementing a Point type you would create two new keys, an x key and a y key, and use them to store the actual coordinates of point objects.
protocol Point;

@static def Point_x := new GlobalField();
@static def Point_y := new GlobalField();

def Point.new(x, y) {
  def result := new Object();
  result::Point_x := x;
  result::Point_y := y;
  Point.attach(result);
  return result;
}

def (this is Point).x -> this::Point_x;
def (this is Point).y -> this::Point_y;
Here we create two field keys and use them to attach the two coordinates to the point, setting them (which means creating them) in the new method and reading them in the two accessors. Direct access to the field values is only possible if you have access to the Point_x and Point_y values, which can be kept private or exposed as appropriate. As usual, this is not how you would write your program but how the underlying concepts work and how the implementation understands object construction. There would be shorthands for doing all this and I expect that actually seeing field keys will be rare.
The way this works, there is no way for different fields to interfere: if you want to associate some data with an object you just make your own key and store the data, and no other user of the object should care. This is much more flexible than languages where the set of fields is fixed by the type declaration, but should be no more expensive in the cases where all you need is a fixed set of fields set by a straightforward constructor. Code like the above is straightforward to analyze and optimize, especially in cases like this where it is impossible to observe the result while the fields are being set and the protocol added.
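The non-interference property is easy to model in Python. This is only a sketch under my own assumptions: FieldKey and Obj are hypothetical names, and storing fields in a dictionary keyed by object identity is just the simplest way to imitate opaque keys, not a claim about how neutrino stores them.

```python
class FieldKey:
    """Opaque key: its identity, not its label, decides which field it names."""

    def __init__(self, label):
        self.label = label  # kept for debugging only


class Obj:
    """Hypothetical object whose fields can only be reached through keys."""

    def __init__(self):
        self._fields = {}

    def set(self, key, value):
        self._fields[key] = value

    def get(self, key):
        return self._fields[key]


point_x = FieldKey("x")
cache_x = FieldKey("x")  # created elsewhere, happens to share the label

p = Obj()
p.set(point_x, 3)
p.set(cache_x, "unrelated data")

print(p.get(point_x), p.get(cache_x))
```

Both keys carry the label "x", yet each one reaches only its own value. Code that holds neither key object has no way to name either field, which is what makes a key behave like a capability.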
That, basically, is how object construction works in neutrino. I'll describe the second part of this, how the mutable and immutable states work, in a later post.