Category Archives: neptune

Saturn?

Posted on July 17, 2006 | 1 Comment

As I mentioned in my last post, I’ve started work on a new implementation of neptune. Even though Esmertec Denmark has been closed down, that doesn’t mean the language died, not as such. What it means is that an implementation of the language died. Now that I have some time off I might as well write a new one.

Will the new implementation be exactly the same as the old one? Not exactly, no. Now that there’s no marketing aspect I’ll take the opportunity to try out a few things. One clear difference is that concurrency control will be based on a transactional memory model. I’ve experimented with changing the syntax a little, for instance allowing -, ?, ! and such in identifiers and method names but in the end I decided against that. There will probably be “real” dynamic privacy. There will definitely be tail call optimization and full closures. But the plan is to otherwise stay close to the original language. There’s a very good reason for that. I know myself: if I’m completely free to design a language I get stuck before I even begin with theoretical issues such as “what is a method name, really?” and “how can I unify instance variables and local variables?”. So to get any work done at all I’ll have to stick with what we already have except for adding a few well-understood mechanisms.

I don’t know if this language will ever get anywhere. I have no world domination plans; so far I’m just doing this because I enjoy implementing programming languages. Of course I do hope that at some point there will be room in the world for a statically typed object-oriented language that is not C++, Java or C#. Otherwise I’m afraid the next few decades of my career will be something of a disappointment.

By the way, if you feel like contributing or just taking a look at how things are going, neplang at sourceforge is open for business.

Posted in neptune, saturn

RIP

Posted on July 3, 2006 | 1 Comment

Esmertec sent out an innocent-looking press release today: Jean-Claude Martinez Confirmed as CEO of Esmertec. But there’s a bit more to it than the title suggests:

Cost reductions: Management has decided to close down Esmertec’s subsidiary in Denmark and reduce significantly the operations in Japan. The company has assigned resources from other Esmertec’s offices and in Japan to continue supporting the products and customers in these regions.

What that means is that I now have a really long summer vacation. It also means that you shouldn’t expect an open source version of neptune to be released any time soon — in fact, don’t expect any version to be released at all.

Why has this happened? This press release gives a hint.

Posted in neptune

Types #2

Posted on July 2, 2006 | Leave a comment

When I wrote the first post about types in neptune I ran out of steam before I was completely finished. In this post I’ll finish by describing two features I only mentioned briefly in the first post, the void type and protocol literals.

Checks

But first: as Isaac points out, I’ve spent a lot of time explaining what the type system doesn’t do and doesn’t require, and very little explaining what it actually does do for you. I guess I didn’t think to write about it because the checks that are performed are pretty simple and conventional. Here’s a summary of the checks performed by the compiler.

If you call a method m and the type system knows that the receiver has a type that doesn’t define the method m, the checker will issue a warning:
```
protocol Runnable {
  run();
}

Runnable action = ...;
action.runn(); // warning: undefined method Runnable.runn()
```

If the type checker knows the signature of a method and you pass it an argument with an incompatible type, a warning will be issued:
```
int gcd(int a, int b) { ... }
gcd("foo", "bar"); // warning: incompatible types
```
I’ll return to what “incompatible” means in a second.

If a value is assigned to a variable of type T, a warning is issued if the type of that value is not a subtype of T:
```
int x = "4"; // warning: incompatible assignment
```

When returning a value from a method with return type T, the returned value must be a subtype of T.
```
int foo() {
  return "bar"; // warning: incompatible return type
}
```
As a special case of this rule which I’ll return to later, returning a value from a void method is illegal, and not returning a value from a non-void method is illegal.

Finally, the compiler checks that methods override each other properly. An overriding method is allowed to return a subtype of the overridden method (return type covariance) and it is allowed to take arguments that are supertypes of the arguments of the overridden method (argument contravariance):
```
class SuperClass {
  Object foo(int x) { ... }
}

class SubClass: SuperClass {
  Object foo(String x) { ... } // incompatible override
  int foo(Object x) { ... } // legal
}
```

There may be more checks that I’ve forgotten but at least these are the most important ones. All but one of these checks use the subtype relation between two types, which is simple in neptune: the type X is a subtype of Y if they are equal, if X is a class that extends Y, directly or indirectly, or if X implements the protocol of Y, again directly or indirectly. It’s a simple version of the nominal typing relation of many other object-oriented languages, including Java and C#, except that it is more general because one class can be a subtype of another class without having that class as a superclass.
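To make the relation concrete, here is a minimal sketch in Python (not neptune) that takes X to be a subtype of Y when X equals Y, extends Y, or implements Y's protocol, and then uses that relation for the override check. The `TYPES` table, the type names in it, and both function names are assumptions made up for illustration; this is not how the neptune compiler is implemented.

```python
# Hypothetical model: each type records its superclass ("extends") and the
# protocols it declares ("implements"). Names and layout are illustrative only.
TYPES = {
    "Object":   {"extends": None,     "implements": []},
    "String":   {"extends": "Object", "implements": []},
    "Integer":  {"extends": "Object", "implements": []},
    "MyString": {"extends": "Object", "implements": ["String"]},
}


def is_subtype(x, y):
    """True if x equals y, extends y, or implements y's protocol (transitively)."""
    if x == y:
        return True
    info = TYPES[x]
    parents = list(info["implements"])
    if info["extends"] is not None:
        parents.append(info["extends"])
    return any(is_subtype(parent, y) for parent in parents)


def is_valid_override(base, override):
    """Check return covariance and argument contravariance.

    Each signature is a (return_type, [argument_types]) pair.
    """
    base_ret, base_args = base
    over_ret, over_args = override
    if len(base_args) != len(over_args):
        return False
    # The overriding method may return a subtype of the original return type.
    if not is_subtype(over_ret, base_ret):
        return False
    # Each overriding parameter must be a supertype of the original parameter.
    return all(is_subtype(b, o) for b, o in zip(base_args, over_args))
```

With this table, `is_subtype("MyString", "String")` holds even though `MyString` does not extend `String`, which is the extra generality over a purely class-based relation.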

Besides these checks, the compiler also issues warnings if it thinks a class has not been properly implemented: if non-abstract classes have abstract methods, if a class doesn’t provide all the methods required by the traits it uses, if it declares that a method overrides another but it doesn’t, etc. However, all those things could be checked even if there was no static type system.

I hope that clarifies what the type system will do for you. Now let’s move on to describing the last two features of the type system.

Void

In most languages, methods that don’t return a meaningful value are treated differently than other methods. In the C language family, such functions and methods are declared as if they did return a value of type void, but the void type is special and magical; what it really means is that the function doesn’t return a value at all. In particular, there is no value of type void and you can’t declare a variable of that type:

void do_nothing() { }

void my_void = do_nothing(); // illegal

Void methods and functions in the C languages are fundamentally different from functions and methods that return values and the type system keeps track of whether or not you’re allowed to use the result of a call. In neptune we can’t do that because you are free to call a method even if the type system knows nothing whatsoever about it. So we couldn’t use the C model, at least not directly.

In smalltalk, on the other hand, there is no such thing as a method that doesn’t return a value. You are free to not explicitly return a value from a method, but in that case the method returns self. I don’t consider that to be a particularly good design and long before we considered switching from smalltalk to another language we experimented with changing this in OSVM. The problem is that many methods really aren’t intended to return a meaningful value but return self by default. Then someone accidentally uses this value and then later when the method is changed, under the assumption that no one is using the return value, the program breaks somewhere completely unexpected. The problem is that the intention of the method isn’t clear because sometimes methods really do intend to return self but you can’t tell if this is the case since the return isn’t written explicitly. Personally, I used the convention that if I wanted a method to return a value I would always explicitly return it, even self. If I didn’t want the method to return a meaningful value, I would return nil.

In neptune, we wanted a combo solution: all methods should return a value but we also wanted to avoid the problems from smalltalk and we wanted the code to look like C. The solution was to make void into a real type and introduce a void value. All methods that don’t explicitly return a value automatically return the void value, and a return statement without an explicit value also returns the void value. The void value itself is a full, real object just like null:

void get_void_value() {
  return;
}

var void_value = get_void_value();
System.out.println(void_value.to_string()); // prints "void"

This model gives you almost the same behavior as C, Java and co. but for different reasons. If you declare that a method returns void the type system will warn you if you return a value of another type, since the returned value is not a subtype of void. If a method is declared to return something other than void but doesn’t return a value, the type check fails because a method that doesn’t explicitly return is made to return the void value, which is not a subtype of the return type. Finally, if the void value ever turns up as the value of a variable or somewhere else where you don’t expect it you’ll know that there is a problem.

This model does give a little more flexibility, some of it is probably useless (but you never know) and some of it is slightly useful. First of all, since void is a regular type you can declare variables of type void. I don’t know why you would do that but you can. Secondly, it is legal to explicitly return a value from a void method if that value is the void value. That might not sound useful but it can be since it means that you can call another method and return in the same statement:

void add(var obj) {
  if (delegate) return target.add(obj);
  ...
}

In Java, you can’t call a void method and return in the same statement so you would have to write

void add(Object obj) {
  if (delegate) {
    target.add(obj);
    return;
  }
  ...
}

It’s a small thing but it does give greater uniformity. By the way, I believe you can actually do the same thing in C++.

Of course, having a dynamic representation of the “nothing” value is not a new thing, many languages have that. My experience with it, at least in the context of this kind of language, is that it works very well in practice and better than any of the alternatives I’ve tried.
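As a rough illustration of the void-value model, here is a minimal sketch in Python rather than neptune. The `Void`, `Target` and `Forwarder` classes are hypothetical names invented for the example. Python itself works along similar lines: a function that doesn't explicitly return gives back the real object `None`.

```python
class Void:
    """Hypothetical stand-in for neptune's void value: a single real object."""

    _instance = None

    def __new__(cls):
        # Always hand back the same instance so there is exactly one void value.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def to_string(self):
        return "void"


VOID = Void()


class Target:
    def __init__(self):
        self.items = []

    def add(self, obj):
        self.items.append(obj)
        return VOID  # a "void method" hands back the void value


class Forwarder:
    def __init__(self, target):
        self.target = target

    def add(self, obj):
        # Call another void method and return in the same statement.
        return self.target.add(obj)
```

Because the void value is an ordinary object, forwarding it through `Forwarder.add` needs no special case.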

Protocol literals

The last thing is sort of an oddball. Sometimes it can be practical to represent a type as a run-time value. For instance, in our unit test framework we have a method that asserts that running a particular piece of code throws a particular exception:

assert_throws(IOException.protocol, fun -> file.open());

The expression IOException.protocol gives you an object that represents the protocol IOException. A protocol object can be used for exactly one thing: to test whether other objects implement the protocol it represents:

Protocol p = String.protocol;
var x = "Beige";
if (p.is_supported_by(x)) ...

That may not sound very exciting but it can be useful in combination with other abstractions that are built on protocol objects, for instance exception handling. In a post a long time ago I mentioned that in neptune, an exception handler such as this:

try {
  ...
} catch (e: IOException) {
  ...
}

is expanded into this:

fun {
  ...
}.try_catch(
  new Vector { IOException.protocol }, 
  fun (Exception e) {
    ...
  }
)

Exception handling is based on the try_catch method which uses a list of protocol literals to decide whether or not it catches a particular exception. The fact that you can call this method directly means that you can make exception handlers that are not statically bound to a particular exception type but where it can be determined dynamically which exceptions to catch. This is how assert_throws is implemented:


void assert_throws(Protocol type, fun f) {
  fun {
    f();
    fail_not_thrown(type);
  }.try_catch(new Vector { type }, fun {
    // Ignore exceptions of the right type
  });
}

It’s not something you use often but on those rare occasions where you do use it it’s really handy.
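The same idea, an exception handler whose caught types are decided at run time, can be sketched in Python, where exception classes are ordinary values. This is only an analogue under that assumption; the function names mirror the neptune ones but the signatures are made up for the example.

```python
def try_catch(body, protocols, handler):
    """Run body(); if it raises one of the given exception types, call handler."""
    try:
        return body()
    except tuple(protocols) as exc:
        # The tuple of types to catch is computed when the handler runs.
        return handler(exc)


def assert_throws(exc_type, f):
    """Fail unless f() raises an exception of type exc_type."""

    def body():
        f()
        # Reached only if f() did not raise. Note: this failure would itself be
        # swallowed if exc_type were AssertionError or one of its base classes.
        raise AssertionError(f"expected {exc_type.__name__} to be thrown")

    # Ignore exceptions of the right type.
    try_catch(body, [exc_type], lambda exc: None)
```

A call such as `assert_throws(ZeroDivisionError, lambda: 1 / 0)` passes silently, while `assert_throws(ZeroDivisionError, lambda: None)` raises an `AssertionError`.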

Posted in neptune

Types

Posted on June 23, 2006 | 1 Comment

The last big thing in neptune that I haven’t written about is the static type system. We’re not actually 100% done with it yet: the design is mostly done but there are still a few late additions that we seem to agree are good ideas but that we haven’t implemented yet. I think we’re far enough, though, that I can give a pretty accurate overview.

The way the type system in neptune came about is kind of unusual. Neptune descends from smalltalk which is not statically typed, but an important goal in the design of the language was to make it look familiar to C and C++ programmers. If you look at these two versions of the same program:

int gcd(int a, int b) {
  while (a != 0) {
    int c = a;
    a = b % a;
    b = c;
  }
  return b;
}

and

gcd(a, b) {
  while (a != 0) {
    var c = a;
    a = b % a;
    b = c;
  }
  return b;
}

the first program looks just like C (because that’s what it is) and the second one looks sort of C-ish but also pretty different. The only difference between the two programs is, of course, that one has type annotations and the other one doesn’t. When we set out to design neptune there was really no question: if we wanted it to look like C, we had to have a static type system. That is not to say that this was the only reason to have a static type system, there are plenty of very good reasons for that, but it is the reason why not having a type system was never an option. Besides the syntax, the most important reasons were that it allows you to write documentation that can be understood and checked by the compiler, which in turn enables much better tool support. The most important non-reasons were optimization and security: they were basically irrelevant in the design.

So, what’s the neptune type system like? Well, first of all it’s an optional type system. This means that you are never required to specify a type for anything. For instance, the two gcd programs above are both legal neptune code¹. Basically, we want the type system to be there as a service that you can use if you want to but which doesn’t impose itself on you. From a “genealogical” viewpoint, it is closely related to the Strongtalk type system for smalltalk.

The examples above demonstrate the type syntax. If you use type annotations the syntax is the same as in the C language family. If you don’t use type annotations you either just leave the type out, as with return types and parameters, or use the keyword var, as with local variables. In some cases, for instance method parameters, you can do both.

The type system is based on the concept of protocols, which is smalltalk for what most people know as “interfaces”. A protocol is simply a collection of methods:

protocol SimpleOutputStream {
  void write(int b);
  void flush();
  void close();
}

Unlike most statically typed object-oriented languages, including Java and C#, neptune does not have class types. Class types are a Bad Thing because they not only specify the interface of an object but also dictate how the object must be implemented, breaking encapsulation.

There are two ways to define a protocol. One way is to write it directly, as the SimpleOutputStream example above. Also, whenever you define a class you automatically get a protocol with the same name. That might sound confusing but in practice it isn’t. Protocols work much the same way as interfaces, and protocols defined explicitly work exactly the same way as protocols defined by classes. For instance, you are free to implement the protocol of a class without subclassing it:

class MyString is String {
  ...
}

In this declaration, “is String” means that MyString implements the protocol of String, not that it extends String — the superclass of MyString is in this case Object. To create a subclass of String you would write

class MyString: String {
  ...
}

A class can extend one other class but implement an arbitrary number of protocols.

If you use String as the type of a variable it’s usually safe to think of it as if the variable can contain instances of String. But that is not accurate in general: the variable can actually contain any object that implements String’s protocol. Neptune has no way to express that a variable can only contain instances of a particular class.

Programs can dynamically test whether or not an object supports a particular protocol by using the is operator:

if (obj is String) ...
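For a feel of how "implements the protocol without extending the class" plus a dynamic test can work, here is a small Python sketch using the standard `abc` module. `StringProtocol` and `MyString` are hypothetical names, and registering a virtual subclass is only an approximate analogue of neptune's `is` declaration.

```python
from abc import ABC


class StringProtocol(ABC):
    """Hypothetical protocol object standing in for neptune's String protocol."""


class MyString:
    """Implements the protocol by declaration; its superclass is still object."""

    def __init__(self, text):
        self.text = text


# Roughly "class MyString is String": a nominal declaration, no inheritance.
StringProtocol.register(MyString)

obj = MyString("Beige")
print(isinstance(obj, StringProtocol))     # True: obj supports the protocol
print(StringProtocol in MyString.__mro__)  # False: it is not a real superclass
```

The `isinstance` call plays the role of `obj is String`: it asks about the declared protocol, not about the class hierarchy.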

One of the things we haven’t implemented yet but seem to agree is a good idea is to combine this with a limited form of dependent types. In most other languages, for instance Java, the compiler doesn’t “remember” instanceof checks:

if (obj instanceof String) {
  label.setText(obj); // illegal: obj not String
}

The compiler doesn’t recognize that you only enter the body of the if if obj is a String — instead, you have to insert a cast:

if (obj instanceof String) {
  String str = (String) obj;
  label.setText(str);
}

In neptune, we plan to allow the compiler to automatically remember extra type information about local variables after is checks, something that will in many cases make casts unnecessary:

if (obj is String) {
  label.setText(obj); // okay, obj is a String
}

In a language like Java that might make programs harder to understand, since the runtime behavior of the program can be affected if the compiler gains more type information. In neptune all it does is avoid unnecessary type warnings, since the runtime behavior of a program is not affected by static types.
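Python is a language where this already works in a comparable way: annotations have no effect on runtime behavior, and static checkers such as mypy narrow the type of a variable after an `isinstance` test. The function below is a hypothetical example written for this illustration.

```python
def describe(obj: object) -> str:
    if isinstance(obj, str):
        # A checker that narrows on isinstance treats obj as str here,
        # so the str-only method call needs no cast.
        return obj.upper()
    return repr(obj)
```

Whether or not a checker is run, `describe` behaves the same, which matches the point that remembered checks only remove warnings.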

This leads me to another aspect: type warnings. In neptune, all problems detected by the type checker are reported as warnings, not errors. It doesn’t matter how wrong you get the types, the program will still compile and run. Well, except for one case: you can’t mix function objects (blocks) and other objects, you have to get the types right there. That is because the compiler needs to be able to track that no references exist to a block after the method that created it returns — otherwise the runtime will die horribly. The philosophy of when to give warnings and errors is simple: we only give errors in a program if we can’t give meaning to the program, otherwise we give warnings. You won’t, for instance, see errors on unreachable code or other such nonsense. You can configure the IDE to give errors instead of warnings but that’s not the default.

Neptune’s static type system is “weak”. Even if x has type int there is no guarantee that it can only contain integers:

var my_string = "blah";
int x = my_string;
int y = x + 18;

In the second line an object of an unknown type is assigned to an integer-typed variable. That is perfectly legal and does not cause a warning so this code will compile without issuing any errors or warnings but will probably fail at runtime in the third line when adding an integer to a string. That doesn’t mean that you can use this to break the runtime. The runtime is strongly typed so no matter how wrong you get the static types it will survive. You might ask why we don’t add runtime checks to catch these situations but that is potentially expensive and it’s probably not worth the trouble. We could also add warnings if we’re not sure that an assignment is safe, but that would generate warnings whenever untyped code called typed code, which goes against the principle that the type system should never impose itself on you if you decide not to use it. Personally, this has never been a problem for me, not even once.
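The combination of unenforced static types and a strongly typed runtime can be tried out in Python, whose annotations are likewise not checked when the program runs. This is an analogue, not neptune; the function name is made up, and `value` is left unannotated much like the `var` in the snippet above.

```python
def add_eighteen(value):
    x: int = value   # the annotation is not checked at runtime
    return x + 18    # fails here if x is not actually a number


try:
    add_eighteen("blah")
except TypeError as error:
    # The runtime stays intact and reports the mismatch as an exception.
    print("runtime caught it:", error)
```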

The other thing we’ve discussed and seem to agree upon but haven’t implemented yet is nullable types. By default all types contain the null value:

String str = null; // legal

But often a value is not actually supposed to be null. Nullable types allow you to express this:

String! str = null; // illegal
String? str = null; // legal
String  str = null; // legal

The type String! does not allow null whereas String? does. The type String also allows null but is different from String? in that Strings are completely unchecked whereas using a String? in place of a String! without a null check causes a warning.
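The checked half of this distinction resembles `Optional` in Python's type hints, so a short sketch may help. It is only an approximation: Python has no counterpart to the completely unchecked plain `String`, and the function below is a hypothetical example.

```python
from typing import Optional


def length_or_zero(text: Optional[str]) -> int:
    # Rough analogue of String?: the value may be None, so check first.
    if text is None:
        return 0
    # Past the check the value behaves like String!: known not to be None.
    return len(text)
```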

Ok, I think that’s enough about types for now. There are still interesting details about the type system that I haven’t written about — protocol objects, the void type, etc. — but that will have to wait.

¹In case you were wondering, int is not a basic type in neptune — neptune has no basic types. It is just a convenient shorthand for the type Integer.

Posted in neptune

Neptune

Posted on May 27, 2006 | 4 Comments

Neptune is a new programming language inspired mainly by smalltalk and C/C++ but also taking ideas from many other places. It is, in short, a dynamically typed object-oriented language with an optional static type system. It is similar in many ways to smalltalk but has a C-like syntax. The language was designed by Esmertec AG in Århus, Denmark. Esmertec Denmark is now closed but work has started on a new implementation at sourceforge.

Posts

I’ve written about it in these posts:

Types #2: more about the type system, including the void type and protocol literals.

Types: an overview of the static type system

Selectors: about selector objects, a dynamic representation of method names, which is a very powerful abstraction.

Characters: about a neat way to specify character literals in neptune.

Why Neptune?: A post that tries to explain why we decided to switch from using smalltalk to designing our own language.

Traits: How traits work in neptune.

Exceptions: About neptune’s exception mechanism which has some unusual features

Using: How neptune’s using statement works

Brevity: An example of how you could represent a simple concept in neptune, demonstrating various language features

C# 3.0: A look at some new features in C# that look very similar to features in neptune.

Constructors: How constructors work.

Interpol: About neptune’s approach to string construction: string interpolation.

Structs and Memory: Describes a tool I’ve written to make it easier to work with external C structures from neptune. Describes neptune’s interface to external calls and external memory.

I’ll keep this list updated as I write new posts.

Posted in neptune

Selectors

Posted on May 24, 2006 | Leave a comment

One of the most important rules in software engineering is don’t repeat yourself. If you find yourself writing the same code, or almost the same code, in more than one place then that is a sign that your code smells. For instance, if you see this code

Node root = current_node;
while (root.get_parent() != null)
  root = root.get_parent();

near this code

Node topmost = left_leaf;
while (topmost.get_parent() != null)
  topmost = topmost.get_parent();

you should feel a strong urge to factor out the similarities:

Node find_root(Node start) {
  Node current = start;
  while (current.get_parent() != null)
    current = current.get_parent();
  return current;
}

and then just use that method:

Node root = find_root(current_node);
...
Node topmost = find_root(left_leaf);

Refactoring like this is something we do all the time: notice similarities in our code and factor them out.

But not all similarities are easy to factor out. In the example above it was easy: the same thing was done with two different objects, current_node and left_leaf. Factoring out the subject of an operation is usually easy: you just create a method or function that takes the subject as an argument. But consider these two code snippets:

Node root = current_node;
while (root.get_parent() != null)
  root = root.get_parent();

and

File root_directory = current_directory;
while (root_directory.get_enclosing_directory() != null)
  root_directory = root_directory.get_enclosing_directory();

These two pieces of code are almost identical in what they do but in this case they’re not only different in the object they operate on but also in which method is called. In most object-oriented languages you can’t “factor out” the name of a method, like get_parent or get_enclosing_directory in this example, so you can’t write a find_root method that can be used to replace both loops as we could in the previous example.

In neptune, on the other hand, there is a mechanism for abstracting over method names: selectors. A selector is an object that represents the name of a method. For instance, the name of the get_enclosing_directory method is written as ##get_enclosing_directory:0. The syntax of a selector, at least in the common case, is ## followed by the name of the method, colon, and the number of arguments expected by the method. Given a selector object you can invoke the corresponding method on an object using the perform syntax:

Selector sel = ##to_string:0;
Point p = new Point(3, 5);
String s = p.{sel}(); // = p.to_string()

The syntax recv.{expr}(args...) means “invoke the method specified by expr on recv with the specified arguments”. Using this, the loop example from before can be refactored into

find_root(var start, Selector method_name) {
  var current = start;
  while (current.{method_name}() != null)
    current = current.{method_name}();
  return current;
}

and then the two instances can call that method:

Node root = find_root(current_node, ##get_parent:0);
...
File root_directory = find_root(current_directory,
    ##get_enclosing_directory:0);

Using selectors this way can sometimes be useful but code that is identical except for the name of a method is pretty rare, at least in my experience. But selectors can be used for many other things.
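For comparison, the same refactoring can be sketched in Python, where a method name can be passed as a string and looked up with the built-in `getattr`. Unlike a neptune selector, a plain string carries no argument count, and the `Node` and `Directory` classes here are hypothetical stand-ins.

```python
class Node:
    def __init__(self, parent=None):
        self.parent = parent

    def get_parent(self):
        return self.parent


class Directory:
    def __init__(self, enclosing=None):
        self.enclosing = enclosing

    def get_enclosing_directory(self):
        return self.enclosing


def find_root(start, method_name):
    """Follow the method named method_name until it returns None."""
    current = start
    while getattr(current, method_name)() is not None:
        current = getattr(current, method_name)()
    return current
```

One function now serves both loops: `find_root(node, "get_parent")` and `find_root(directory, "get_enclosing_directory")`.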

One of the most useful applications of selectors is delegates. A delegate is a selector coupled with an object. You can think of it as a delayed method call: you specify a particular method to call on a particular object but you don’t perform the call just yet.

Point p = new Point(3, 5);
Delegate del = new Delegate(p, ##to_string:0);
String s = del();

Here, we create an object, then we create a delegate which can be used to send to_string() to the object, and finally we invoke the delegate which causes to_string() to be called on the point. The syntax for invoking a delegate is the standard function call syntax: delegate(args...).

The syntax new Delegate(...) is a bit cumbersome so there is also a binary operator, =>, that can be used to create delegates:

...
Delegate del = (##to_string:0 => p);
...

How are delegates useful? Well, the place where I’ve had most use for them is as event handlers. For instance, we have a rudimentary GUI toolkit based on Qt that uses delegates for all events:

void draw_controls(qt::Widget parent) {
  qt::Button ok_button = new qt::Button(parent);
  ok_button.add_on_click_listener(##ok_button_clicked:0 => this);
}

void ok_button_clicked() {
  System.out.println("Ok button clicked");
}

This code demonstrates how delegates can be used in a very light-weight mechanism for specifying event handlers, in this case causing the system to print a message on the console each time the button is clicked. And if we use accessor methods the code that sets the event handler can be made even more concise:

...
ok_button.on_click = (##ok_button_clicked:0 => this);
...
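A delegate of this kind takes only a few lines to build on top of a "perform" primitive. The sketch below does so in Python with `getattr`; the `Delegate`, `Button` and `Dialog` classes are hypothetical and are not the Qt-based toolkit mentioned above. (Python's bound methods, such as `self.ok_button_clicked`, are a built-in equivalent.)

```python
class Delegate:
    """A receiver paired with a method name; calling it performs the send."""

    def __init__(self, receiver, selector):
        self.receiver = receiver
        self.selector = selector

    def __call__(self, *args):
        return getattr(self.receiver, self.selector)(*args)


class Button:
    """Hypothetical widget that stores click listeners as callables."""

    def __init__(self):
        self.listeners = []

    def add_on_click_listener(self, listener):
        self.listeners.append(listener)

    def click(self):
        for listener in self.listeners:
            listener()


class Dialog:
    def __init__(self):
        self.clicks = 0
        self.ok_button = Button()
        # Delayed call: ok_button_clicked() on this dialog, run on each click.
        self.ok_button.add_on_click_listener(Delegate(self, "ok_button_clicked"))

    def ok_button_clicked(self):
        self.clicks += 1
```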

Another use of delegates is for spawning threads. Besides just invoking a delegate, you can also call the spawn method which invokes the delegate in a new thread:

void start_process() {
  Worklist list = new Worklist();
  (##produce:1 => this).spawn(list); // spawn producer
  (##consume:1 => this).spawn(list); // spawn consumer
}

void produce(Worklist list) {
  while (true) {
    var obj = produce();
    list.offer(obj);
  }
}

void consume(Worklist list) {
  while (true) {
    var obj = list.take();
    consume(obj);
  }
}

The start_process method starts two threads: one that adds objects to the worklist and one that consumes those objects, again in a very light-weight fashion using delegates to invoke two local methods in separate threads. I think that’s pretty elegant!
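Here is a hedged Python sketch of the spawn idea, using the standard `threading` and `queue` modules. The `Delegate` and `Process` classes and the `DONE` sentinel are assumptions made for the example, and unlike the loops above the producer is bounded so that the program terminates.

```python
import queue
import threading


class Delegate:
    def __init__(self, receiver, selector):
        self.receiver = receiver
        self.selector = selector

    def __call__(self, *args):
        return getattr(self.receiver, self.selector)(*args)

    def spawn(self, *args):
        """Invoke the delegate in a new thread and return that thread."""
        thread = threading.Thread(target=self, args=args)
        thread.start()
        return thread


class Process:
    DONE = object()  # sentinel telling the consumer to stop

    def __init__(self):
        self.consumed = []

    def start_process(self, count):
        worklist = queue.Queue()
        producer = Delegate(self, "produce").spawn(worklist, count)  # spawn producer
        consumer = Delegate(self, "consume").spawn(worklist)         # spawn consumer
        return producer, consumer

    def produce(self, worklist, count):
        for item in range(count):
            worklist.put(item)
        worklist.put(self.DONE)

    def consume(self, worklist):
        while True:
            item = worklist.get()  # blocks until an item is available
            if item is self.DONE:
                return
            self.consumed.append(item)
```

Joining both returned threads after `start_process(5)` leaves the items 0 through 4 in `consumed`, in order, since a single consumer drains a FIFO queue.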

Unlike languages like C#, delegates in neptune are not “magic”; they are implemented as pure neptune code that uses selectors and perform to do the actual delegation. While selectors might not look like that useful a construct they can be used to build some very powerful abstractions.

Posted in neptune

Characters

Posted on April 25, 2006 | Leave a comment

Quick: which character is '\u03b5'?

Let’s see, well, since it’s between 0370 and 03ff it’s clearly Coptic or Greek. The Greek lower-case characters start at 03b1 so it would be the lower case form of the fifth Greek character. That would be, let’s see, ε! Man, that Unicode stuff is just so logical!

I don’t think there’s anything wrong with Unicode. But as soon as people have to use the character codes directly, for instance when using Unicode character constants in languages like Java and C#, there’s trouble. There are tens of thousands of Unicode code points and I can’t even remember which ASCII character is line feed and which one is carriage return. On the rare occasions where I either read or write code that uses Unicode constants or ASCII control characters, I usually have to open a browser and look up the values myself. That sucks.

The fortress language improves on things: there, instead of writing the code of a character in an identifier, you can write the name. It just seems so obvious really: the Unicode standard defines a name for all the characters so why should I have the trouble of looking up the character code?

In neptune you’re welcome to still use character codes to specify Unicode characters. Writing '\u03b5' means greek small letter epsilon just as it does in Java and C#. But inspired by fortress we’ve added another syntax for specifying Unicode characters by name: writing \<name> specifies the Unicode character called name. So instead of writing '\u03b5' you can simply name the character: '\<greek small letter epsilon>'. If you feel that 'x' is too straightforward you can specify it equivalently as '\<latin small letter x>'. This works both in character literals and text strings:

"From \<greek capital letter alpha> to \<greek capital letter omega>"

which means the same as, but is a lot easier to understand than

"From \u0391 to \u03A9"

Besides Unicode names, we also allow the ASCII control characters to be specified by their short and long names. This means that I don’t have to remember that line feed is 10, not 13; instead I can just write '\<lf>' or '\<line feed>'.

The first time you want to write the character ε you probably still have to look it up to see that it’s called greek small letter epsilon. But there’s a lot more logic to the name than to the raw character code and there’s a better chance you’ll remember next time. And it will of course be obvious to anyone reading the code which character it is. The only problem I see is that the names tend to be very long. Fortress allows you to use shorter forms of some characters: you can for instance leave out words like “letter” in character names. If the length of the names turns out to be a problem we might add something like that at some point.

Either way, I think this is a pretty nifty feature. And I wouldn’t be surprised if it turns out that there are other languages that have similar mechanisms.
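Python is one language with a comparable mechanism: its string literals accept `\N{name}` escapes, and the standard `unicodedata` module maps between characters and names. The snippet below shows the Python syntax, not neptune's.

```python
import unicodedata

epsilon = "\N{GREEK SMALL LETTER EPSILON}"
print(epsilon == "\u03b5")                          # True
print(unicodedata.name(epsilon))                    # GREEK SMALL LETTER EPSILON

span = "From \N{GREEK CAPITAL LETTER ALPHA} to \N{GREEK CAPITAL LETTER OMEGA}"
print(span == "From \u0391 to \u03A9")              # True

# Names can also be resolved at runtime instead of in a literal.
print(unicodedata.lookup("LATIN SMALL LETTER X"))   # x
```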

Posted in neptune

Why Neptune?

Posted on April 18, 2006 | 14 Comments

Lately I’ve been writing a lot about the neptune language, the replacement for smalltalk on the OSVM platform. I’ve tried to focus on the technical aspects, both because there’s lots to write about and because when you’re blogging about work, keeping it technical makes it less likely that you’ll accidentally write something, shall we say, career limiting. In this post I’ll take my chances and write about one of the less technical aspects: why we made the switch from smalltalk to a different language, and why we decided not to use an existing language but design our own. It is not about the strategic business decision of changing the language (for that you’ll have to go through the official channels) but my perspective on the switch as one of the people developing the platform.

I’ll start off by explaining a bit about OSVM and why we used smalltalk for the platform in to first place.

Why Smalltalk?

To give some context I’ll start off by describing what OSVM is. The OSVM platform is basically a virtual machine targeted at memory-constrained embedded devices. The platform has been designed with several goals in mind. It should be very simple in order to make the runtime as small as possible, since memory is limited. It should never be necessary to restart the system: if you’re dealing with software for security applications or medical equipment you want the system to be able to run without any interruptions, ever. If an error occurs, and of course errors do occur, it shouldn’t kill the system; rather, it must be possible to diagnose and fix the problem in the field. That’s not always possible but there’s a lot you can do to make it more likely that the system will survive an error. Finally, it must be possible to update the running program in the field, both to fix errors and to install software upgrades. Being able to diagnose a bug in your program is no good if you can’t update the software to fix the problem. This, in short, is what OSVM is all about.

These requirements didn’t pop out of thin air. When you’re used to working with smalltalk this is what you expect from your platform: robustness, flexibility and the ability to interact directly with running programs and to update code on the fly. The smalltalk language and environment were developed together, which means that the language was designed with all the issues that are important to us in mind. This is not the case with any language I know outside the smalltalk family, which makes smalltalk uniquely suitable for our use.

But using smalltalk is not without problems.

Why not smalltalk?

The languages most used in writing embedded software are C and C++. If a company considers using a platform like OSVM instead, they not only have to consider the direct cost of buying the system from Esmertec. They also have to consider the indirect cost in time and money of training their employees to use the system before they actually get to writing software.

C++ is an object-oriented programming language (let’s be generous and call it that) but other than that, C++ and smalltalk are completely different. In order to use smalltalk, a C++ programmer would have to learn everything from scratch: how to declare a variable, how to write an if or while statement, how to write a string or character literal, everything. Since most embedded software is developed in C and C++, that means that there is a considerable indirect cost associated with using smalltalk with OSVM.

The pragmatic programmers recommend that you learn a new language every year. I think learning new languages, both programming and natural, is a great hobby. If I have a deadline, however, “broadening my thinking” by learning a new programming language is not my top priority. Imagine that you were writing a book and had the choice to either write it in English using a primitive ASCII text editor, or in Danish using a state-of-the-art graphical text editor. Assuming that you don’t speak Danish, what would you choose? Or, perhaps more realistically, what would your boss choose for you?

Even though the OSVM system is very much inspired by smalltalk and the smalltalk environment, what is important is not the language itself. The important thing is the properties of the language: robustness, flexibility, serviceability, etc. The goal of the neptune language is to retain all the important aspects of smalltalk but to adapt the less important ones, like the syntax, to make it easier to learn and use by programmers who are used to more mainstream programming languages. That is not to say that syntax is not an important part of smalltalk but it is not essential for its use as a part of OSVM.

I may give the impression that I think smalltalk is the pinnacle of programming language design and any change is a change for the bad. That is not the case: I think there are plenty of ways to improve smalltalk. In changing the language we’ve tried to address many of the quirks and shortcomings of plain smalltalk to get a system that is all-round better to use and not just easier to learn for C or C++ programmers. I won’t go into any details about how I think neptune improves upon smalltalk, though, since that’s a touchy subject and I don’t want it to obscure the point I’m trying to make here.

Why Neptune?

Why design a whole new language instead of using an existing one? The whole point is to not force people to learn a new programming language and if we design a new one we can be absolutely certain that no one knows it.

It all comes down to the nature of the OSVM platform. The language has to be very simple since the runtime must be minimal. This requirement alone excludes many languages including Java, C# and python. But the real killer is that it must be possible to update the code dynamically. I don’t know of any languages (except smalltalk) where you can give a reasonable meaning to updating the code of a running program in all cases. This is especially bad in languages where the runtime behavior of a program depends on static typing. When designing a language it is very easy to make decisions that make it very hard to update code on the fly. You have to keep this kind of use in mind during the design and smalltalk is really the only language designed like that. Having decided that smalltalk is not the way to go, that means that there are really no existing languages for us to use. Hence neptune.

Designing the language from scratch also means that we’re free to adapt the language to fit our use. The result is a language that, beneath an added layer of syntax, is still very close to smalltalk, and retains many of the properties that make smalltalk unique. An indication of how close smalltalk and neptune are is that the virtual machine is almost exactly the same as before, the biggest change being that indexing is now 0-based instead of being 1-based. On the surface, however, neptune is much closer to C++ or Java. Basic things like variable declarations, control structures, operators and literals will all be familiar to a C++ or Java programmer. When learning the language this means people will be able to use much of what they already know and will be able to be productive almost immediately, and to move on more quickly to the more advanced concepts such as traits, blocks and optional types.

Some might consider it to be “dumbing” the language down but we’ve also made a conscious effort to make it possible for people to use the language without necessarily being familiar with the full generality of it. When you’re using an ifTrue:ifFalse: expression in smalltalk it might be intellectually gratifying to know that it’s not a “magic” construct, just a method call with two block arguments. But it’s not something you necessarily want people to understand from day one in order to use the language. In neptune, we have a separate if statement but that is still just a shorthand for a call to the if method. You’re free to call that method directly, without the special syntax, to the exact same effect. The full generality of smalltalk, along with some new constructs, is still available to you in neptune but instead of having it forced upon you, you can grow into it gradually. Even though it makes the language a bit less clean than smalltalk, I think that is a pretty reasonable tradeoff.
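To make the "conditional as a method call with two block arguments" idea concrete, here is a minimal sketch in Python rather than neptune or smalltalk. The class names and the method name `if_true_if_false` are invented for illustration; the actual signature of neptune's if method is not given in this post.

```python
class TrueObject:
    """Stands in for the 'true' object: runs the first block."""

    def if_true_if_false(self, then_block, else_block):
        return then_block()


class FalseObject:
    """Stands in for the 'false' object: runs the second block."""

    def if_true_if_false(self, then_block, else_block):
        return else_block()


def describe(is_negative):
    # No built-in conditional here: the receiver's class picks the branch,
    # and each branch is a block (a zero-argument function).
    return is_negative.if_true_if_false(
        lambda: "negative",
        lambda: "non-negative",
    )


print(describe(TrueObject()))
print(describe(FalseObject()))
```

A dedicated if statement can then be treated as shorthand for exactly this kind of call, which is the tradeoff described above.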


Posted in neptune

Traits

Posted on April 15, 2006 | 2 Comments

One of the changes we’ve made in going from smalltalk to neptune, beyond pure syntax, is in the object model. We’ve added an optional type system and introduced two new concepts: traits and protocols. I don’t mean new in the sense that we invented them, they’re pretty well known actually, but they’re new in the sense that they’re not part of standard smalltalk. In this post I’ll give a brief introduction to traits.

Smalltalk’s single inheritance model is clean and simple, but it also has many limitations. The most obvious limitation is that a class can only share code with one other class, its superclass. In addition, there are also more conceptual limitations. If I create a subclass B of class A, an instance of B should be considered a special case of an instance of A. The typical toy example of this is animals. Let’s say that the class Eagle extends the class Bird which extends the class Animal. This models the fact that an eagle is a bird and a bird is an animal. On the other hand, a bat shares many of the characteristics of a bird but it is not a bird so the class Bat cannot inherit from Bird. Even though there is much code in the class Bird that could be used in class Bat, single inheritance doesn’t allow them to share code because it is just conceptually wrong. I guess this is a pretty poor example but at least if someone disagrees and believes a bat can be implemented by using parts of birds I can just sit back and wait for the villagers with torches and pitchforks to take care of him. But I digress.

C++, another ancestor of neptune, has multiple inheritance which at least solves some of the problem — a class can now share code with more than one other class — but introduces so many other problems that we’ve never really considered it as a serious alternative. The solution we’ve decided to go for is traits. Traits seem to be slowly making their way into the mainstream as a more flexible way of sharing code between classes; variants can be found in languages such as perl6, scala and fortress.

To explain how traits work in neptune I’ll show you an example from our libraries. One place where we’ve used traits in our libraries is for implementing relational operators such as <, >=, etc. In most languages I know, the standard way of making an object x comparable to other objects is to add a method to x that takes the object to compare it with, y, and returns an integer signifying whether x is less than, equal to, or greater than y. In java that method is compareTo: if x is less than y, x.compareTo(y) returns a negative number, if they're equal it returns zero and if x is greater than y it returns a positive number. In neptune, we use the spaceship operator, <=>, for comparing objects. For instance,

"aardvark" <=> "zebra"

returns a negative number because the string "aardvark" is lexicographically less than "zebra". That's all nice and general but it is not very convenient or readable. Writing

if ((x <=> y) < 0) return y;

is a lot less clear than writing

if (x < y) return y;

This is where traits come in. Since the relational operators are completely defined by the comparison operator I can implement the relational operators as a trait:

trait TRelation {

  require int operator <=>(var that);
  
  bool operator ==(var that) {
    return (this <=> that) == 0;
  }

  bool operator <(var that) {
    return (this <=> that) < 0;
  }

  // ... and >=, <=, !=, > ... 

}

This doesn't define a class but a set of methods that can be used by any class, provided that it has the <=> operator. One example of this could be an implementation of fractions:

class Fraction {

  use TRelation;

  int num, denom;

  int operator <=>(var that) {
    // ... fraction comparison ...
  }

  ...

}

The easiest way to think of the use declaration is that it essentially copies the methods declared in the TRelation trait and pastes them into the Fraction class. The only requirement in this case is that the class using the trait must define the comparison operator. By the way, a class can use as many traits as it wants.

Unlike inheritance, the class Fraction is not put in any kind of "relationship" with the trait TRelation. Traits are nothing more than collections of methods that you can import into your own classes almost as if you had used a #include in C. It should be completely irrelevant to anyone who uses the class Fraction whether the < operator is implemented by using a trait or by writing a method in the class itself. Because of this, a trait cannot be used as a type -- you cannot, for instance, declare a variable of type TRelation. This is also the reason why you write use declarations in the body of the class and not the header, since it is an implementation detail and not something you should need to know about to use the class.
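The copy-and-paste reading of use can be sketched in Python with a class decorator. This is only an approximation under stated assumptions, not how neptune implements traits: the `use` decorator, the `requires` tuple and the `compare` method (standing in for the <=> operator) are all invented for this sketch.

```python
import inspect


def use(*traits):
    """Class decorator that copies each trait's methods into the class."""
    def apply(cls):
        for trait in traits:
            # Check the trait's requirements first.
            for name in getattr(trait, "requires", ()):
                if not hasattr(cls, name):
                    raise TypeError(f"{cls.__name__} must define {name}")
            # Copy plain functions; methods written in the class itself win.
            for name, member in vars(trait).items():
                if inspect.isfunction(member) and name not in vars(cls):
                    setattr(cls, name, member)
        return cls
    return apply


class TRelation:
    requires = ("compare",)  # stands in for: require int operator <=>

    def __eq__(self, that):
        return self.compare(that) == 0

    def __lt__(self, that):
        return self.compare(that) < 0

    def __ge__(self, that):
        return self.compare(that) >= 0


@use(TRelation)
class Fraction:
    def __init__(self, num, denom):
        self.num, self.denom = num, denom

    def compare(self, that):
        # Cross-multiplication; assumes both denominators are positive.
        return self.num * that.denom - that.num * self.denom


half, two_thirds = Fraction(1, 2), Fraction(2, 3)
print(half < two_thirds, half == Fraction(2, 4), two_thirds >= half)
print(TRelation in Fraction.__mro__)  # the trait is not a superclass
```

Because the methods are copied rather than inherited, nothing in the resulting class refers back to TRelation.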

A trait is only allowed to contain methods, not fields. This might seem like a considerable limitation but it simplifies the model considerably and if a trait needs a field it can simply require the class to have it. For instance, this trait which contains methods for dynamically extending an array or vector, needs a contents field:

trait TVector {

  require accessor contents;
  require accessor contents=(var value);

  void ensure_capacity(int new_size) {
    if (new_size > this.contents.size) {
      // ... enlarge contents ...
    }
  }

}

TVector does not have the field contents itself but requires any class that uses it to have accessors for it. A class that uses TVector can implement this by actually having a field contents but since accessors are just ordinary methods (you can think of them as getter and setter methods), the class is free to implement them however it wants.
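A rough Python sketch of the same idea follows. Python has no traits, so a mixin class stands in for TVector here, which does create an inheritance relationship that neptune's traits avoid; the class names `FieldBacked` and `Delegating` are invented for the example.

```python
class TVector:
    """Trait-like mixin: no fields of its own, only methods that rely
    on the using class providing a readable and writable `contents`."""

    def ensure_capacity(self, new_size):
        if new_size > len(self.contents):
            padding = [None] * (new_size - len(self.contents))
            self.contents = self.contents + padding


class FieldBacked(TVector):
    """Satisfies the requirement with an ordinary field."""

    def __init__(self):
        self.contents = []


class Delegating(TVector):
    """Satisfies the requirement with accessors that store the data elsewhere."""

    def __init__(self, store):
        self._store = store

    @property
    def contents(self):
        return self._store["data"]

    @contents.setter
    def contents(self, value):
        self._store["data"] = value


plain = FieldBacked()
plain.ensure_capacity(3)

backing = {"data": [1]}
wrapped = Delegating(backing)
wrapped.ensure_capacity(2)
print(plain.contents, backing["data"])
```

The shared method never learns which of the two strategies a class uses, which is the point of requiring accessors rather than a field.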

Our traits are very closely modeled on the trait mechanism for smalltalk described in the article linked above (and again here for your convenience) which I can definitely recommend reading. There are plenty of subtleties to the mechanism that I haven't mentioned. What happens if there's a name clash when importing more than one trait? (There's a mechanism for renaming methods.) Can traits use other traits? (Yes.) How do super calls work with trait methods? (I can't answer that in a single sentence.) For the full answers to all this you'll just have to read the article. The original motivation for adding traits was the trouble we've had with implementing the collection hierarchy. It turns out that some of the people who wrote the smalltalk traits article have also written another one on exactly that: using traits in the design of a collection hierarchy.

So far we haven't used them that much in our libraries -- my guess would be that there is at most one trait to every ten classes. But in the cases where we have used them, they've really saved the day.


Posted in neptune

Exceptions

Posted on April 8, 2006 | 2 Comments

It seems that these days whenever anyone describes some language feature, I’m compelled to compare/contrast that with how we’ve done things in the new neptune language. Most recently James Robertson has written about exceptions in smalltalk so, well, here we go.

In smalltalk exceptions capture the stack when they’re thrown which is a bit similar to the way stack traces are captured by exceptions in java. In java, though, what is stored in the exception is exactly a stack trace which is a serialized copy of the real stack that you can print and… well, not much else. In smalltalk the exception actually captures the real live stack which means that execution can be resumed from wherever the exception was thrown. This borders on what MenTaLguY calls “the Lovecraftian madness of full continuations”. But even though this model is neat and general, everyone I’ve ever talked with about resumable exceptions, at least those that have used them in practice, has told me that their usefulness is actually pretty limited. James’ example code does little to change my attitude; that code is so convoluted that I take it more as an argument against than for resumable exceptions.

My attitude against resumable exceptions may just be a convenient rationalization, though. On the OSVM platform, which must be able to run on platforms with very little memory and processing power, we can’t afford the full smalltalk model anyway. In the previous incarnation of OSVM, which was based on smalltalk, there were no stacks or stack traces. You could throw any object as an exception and if someone else caught it there was no way to tell where it had been thrown from. In addition, there was no exception hierarchy and we usually just threw text strings or symbols: calling a method that hadn’t been defined caused the system to simply throw the string "LookupError" with no further information about which method was called on which object. Clearly there was room for improvement and now that we’ve had an opportunity to revisit the design, we’ve improved it quite a bit. The biggest change we’ve made has to do with, you guessed it, the stack. But that aspect will have to wait a bit.

The old system allowed you to throw any object so there was really no technical reason not to have an exception hierarchy and throw structured and informative exceptions rather than flat text strings. We just never really got around to it. For the new language we were rewriting all our libraries anyway so we added structured exceptions as soon as we had implemented exception handling. And, I must confess, doing it early on also meant that it would be a major hassle to take them out later if it turned out that someone thought it was a bad idea. Did I mention that the code name of our next release is “sneaky monkey”? Anyway, now when an error occurs somewhere because you’re trying to reverse a string (which you can’t currently do) you’ll get an error that tells you exactly that: LookupError("ksiftseh".reverse()).

In neptune, the syntax for exception handlers is somewhat similar to C++, Java and C#. Here’s how an exception handler looks in C++:

try {
  ...
} catch (exception& e) {
  ...
}

This exception handler catches any objects thrown that descend from the class exception, because that’s the declared type of the exception variable e. Java and C# use the same approach. In neptune we have to use a slightly different model because we have optional typing, which means that type declarations are not allowed to affect the runtime semantics of a program. Here’s an example of a neptune exception handler:

try {
  ...
} catch (Exception e: IOException) {
  ...
}

The exceptions handled by this catch clause are listed after the colon, in this case IOException, and the declared type of the variable, Exception, has no influence on which exceptions are caught. One thing this means is that you can now catch several unrelated exceptions with the same exception handler:

try {
  ...
} catch (Exception e: IOException, AssertionError) {
  ...
}

If the variable has the same type as the exception being caught, which is usually the case, you can just leave out the type declaration:

try {
  ...
} catch (e: IOException) {
  ...
}

The try/catch syntax is not “magic” but a shorthand that, like all our other shorthands, expands straightforwardly into simple method calls. A try statement corresponds to a call to the try_catch method on Function. The handler shown earlier that catches both IOException and AssertionError expands into something like this:

fun {
  ...
}.try_catch(new Vector {
 IOException.protocol,
 AssertionError.protocol
}, fun (Exception e) {
  ...
})

Hey I didn’t say that the result was straightforward, just that the transformation was. And even though it may look expensive, with allocations and whatnot, the generated code is very efficient no matter if you use the try/catch syntax or write the call to try_catch yourself. An explanation of what .protocol means will have to wait.
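The shape of that desugared call can be mimicked in Python, where functions are also first-class values. This is a sketch only: `try_catch` here is a free function written for the example (neptune puts it on Function), a tuple of exception classes stands in for the vector of protocols, and Python's own try statement does the real work underneath.

```python
def try_catch(body, handled, handler):
    """Run body(); if it raises one of the classes in `handled`,
    pass the exception to handler() and return its result."""
    try:
        return body()
    except handled as e:
        return handler(e)


def risky():
    raise AssertionError("invariant broken")


# Two unrelated exception classes handled by the same handler.
result = try_catch(
    risky,
    (OSError, AssertionError),
    lambda e: f"handled {type(e).__name__}",
)
print(result)
```

As in neptune, which exceptions are caught is decided by the explicit list passed to the call, not by any type written on the handler's parameter.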

Now for the stack traces. As you’ll remember, in Java and smalltalk the stack is stored in the exception object being thrown. I’m not one to underestimate the ingenuity of the people that implement smalltalk or java (since a majority of my colleagues have done both) but no matter how clever you are, storing the stack in an exception is bound to cost space and time. The worst part is that when creating or throwing an exception in smalltalk or java, you don’t actually know if the stack will be used or not so you always have to store it, just in case. That won’t do in OSVM because we don’t have processing power or space enough to do stuff “just in case”. Instead, you specify that you want a stack trace at the location where you’ll be using it, the exception handler:

try {
  ...
} catch (e: IOException) at (StackTrace s) {
  ...
}

This way the stack trace and the exception object are completely decoupled and we only need to instantiate the trace when it is really needed. Another advantage of decoupling the trace is that throwing an object doesn’t change it. In smalltalk, throwing an exception that has been thrown before causes the previous stack to be overwritten, which is not an especially clean model. In Java the stack trace stored in the exception is captured when the exception is instantiated, not when it is thrown. In my experience that is the desired semantics only in one situation: if you want to get the stack trace but don’t want to throw an exception. Doing

new Throwable().printStackTrace()

is not uncommon in java but it’s a hack really. In neptune there’s a much cleaner way to get the current stack: just call the static Thread.current_stack() method.
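For comparison, Python's standard library also offers a way to get the current stack without creating an exception: `traceback.extract_stack()`. The sketch below is Python, not neptune, and the helper name `current_stack` is chosen only to echo the method mentioned above.

```python
import traceback


def current_stack():
    # extract_stack() lists frames oldest first and ends with the frame
    # that called it, so drop that last entry (this helper itself).
    return traceback.extract_stack()[:-1]


def inner():
    return current_stack()


def outer():
    return inner()


names = [frame.name for frame in outer()]
print(names[-2:])
```

No exception object is involved at any point, so there is nothing to throw away afterwards.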

I don’t know of any other language with a similar model but my experience so far is that it is exactly what you want and, in addition, simpler to implement and less expensive than the alternatives. There are still issues to work out, like what happens when you rethrow an exception and what exactly should be stored in a stack trace object. The biggest problem for me, though, is the keyword at. This is something we can still easily change so if you have a good alternative let me know.


Posted in neptune
