Category Archives: Uncategorized

Completion #3

I’ve been working on the JavaDoc API gadget over Christmas. I know, it’s as if I just don’t understand the concept of “holidays”. Anyway.

The UI now works perfectly in firefox and safari (at least as far as I can tell), in opera with just a single glitch and reasonably well in IE as long as you don’t resize the browser window. I have also fixed a few bugs, including some issues with inner classes, and implemented a feature several people have been asking for: pressing in the text field now takes you to the top item on the list.

A less visible improvement is that a lot less data now has to be transferred to the gadget when you start to write in the text field — the class index is down from 60K to 20K in browsers that support gzipped content (which these days turns out to be pretty much any of them).

The downside is that the code is now pretty atrocious. I quickly realized that having just one HTML document and one javascript file for all the different browsers was no good. Instead, I now generate different files for each of firefox, safari, opera and explorer. In fact the files are being generated using the most stone-age technique imaginable: cpp. Here’s an actual piece of code from the implementation:

function KeyEventMatches(e, r) {
#ifdef EXPLORER
var c = window.event.keyCode;
#else
var c = e.which;
#endif
return String.fromCharCode(c).match(r);
}

I know, absolutely terrible but it actually works pretty well and, most importantly, requires no resources at runtime.

Oh, and I’ve moved the gadget here:


Add to Google

The old location will disappear shortly.

I hope to clear up the last outstanding bugs, especially the thing with resizing in explorer, this week.

Completion #2

A while ago I wrote about my project of creating a google gadget that gave easy access to javadoc documentation; my guinea pig is the java SDK 6 documentation. Here’s an update on how that’s going.

After my first attempt, which was modeled on google suggest, I’ve rewritten most of the UI stuff. Where the gadget used to pop up a list of suggestions it now has a stationary list:

I liked the original format but it turns out that it’s really hard for an iframe to create a popup that covers its surrounding frame. The good news is that the new format is much more robust and now, as far as I can tell, works reliably in firefox and safari. Also, it no longer loads the whole index of JDK6 classes when the gadget is first displayed but waits until you start to enter text into the text field.

Another improvement is that I’ve changed the format of the class index used by the gadget. The index is created by processing the HTML class index that’s part of the JavaDoc site and turning it into a javascript file with the same information in a compressed form that can be loaded directly by that gadget (note to self: potential security problem?). The HTML class index on the website is around 500K, my first shot at a compressed javascript index was 200K. In the latest version it’s only 60K and I’m sure I can get it down even further.

If you feel like trying it out you can add it to your customized google homepage:


Add to Google

It doesn’t work in internet explorer but I’m not sure if that is a bug or a feature. I’m leaning towards bug.

Completion

I’ve been playing around with something new: google gadgets. Well, I’ve mostly been playing around with javascript and HTML because the part about packaging it together to form a gadget is actually pretty simple.

It’s an interesting experience for someone like me who has used dynamic web pages for years without the first clue about how they were implemented. So far, the fruit of my labors is limited to a gadget that lets you search javadoc (I haven’t thought up a name for it yet):


Ok, the screenshot actually only shows the control that will eventually inhabit the gadget. It’s inspired by the completion mechanism used in google suggest. When you enter text into the text field a list of suggestions will pop up that lets you jump to an appropriate javadoc page.

The javadoc index used to produce the list of suggestions is currently a (large!) piece of javascript code generated from the javadoc HTML class index by a perl script. But I intend to implement a mechanism so that the script just fetches the class list directly from the javadoc page. That way you should be able to point the gadget to any javadoc site and it will provide site-specific completion.

This does make me wonder: does anyone actually use javadoc on the web anymore? I mostly use it within eclipse and maybe everybody’s moved om from the web-based interface?

I hope to work on this over Christmas and have a beta ready within a week or two.

Update: I’ve put the current (alpha) version of this online at quenta.org/javadoc. I’ve only tested it in firefox 2 and you’ll notice that there are still a few kinks to sort out but I think it gives a reasonable impression of how it will eventually work.

Update #2: I’ve fixed some bugs so now you can actually click on the items and go to the appropriate javadoc page. At least for most items. Also, it should now work in safari as well. Also, it can now be added as a gadget:


Add to Google

CbfgFpevcg

One of my favorite languages is PostScript. It’s not so much the language itself — it’s simple and general but it can be a bit of a hassle to use — but the idea to use a general programming language as a document format. I’ve been playing around a bit lately and have written two small PostScript programs that can transform PS documents: munch-rot13.ps and munch-nl.ps. The one script ROT13s the document (example) and the other one simply swaps the letter N and the letter L (example). The N/L thing is sort of a tradition at daimi.

How do you “run” the programs? You simply concatenate the program with the document you want to transform. To apply ROT13 to document.ps you simply write

$ cat munch-rot13.ps document.ps > document-rot13.ps

You can apply the transformations as many times as you want so given a document encoded with munch-rot13.ps you can decode it by just applying munch-rot13.ps again. And, incidentally, if want to decode a document but don’t have munch-rot13.ps, don’t worry: since the document was encoded by appending it to the script, the encoded document actually carries around the script.

You can also mix the transformations: apply ROT13, then N/L and then ROT13 again. That causes A and Y to be switched in the document.

Pointless? Maybe, but I thought it was an interesting challenge. And, as Mayer Goldberg demonstrates, you can actually write useful programs in PostScript. Maybe I’ll try that some day. I’m not completely done with pointlessness yet, though: I’m considering writing a version that Enigmas the document.

Prefix Keywords

Even though we’ve decided not to use smalltalk in OSVM it doesn’t mean that we haven’t been happy with smalltalk ourselves. You always want what you can’t have and if I wasn’t a smalltalk fan before, I was certainly made one by the decision that we were not going to use it anymore. Having said that, I think there are situations where smalltalk is less than perfect; in particular, I find some smalltalk code very hard to read. In this post I’ll discuss some examples of this and describe a neat solution (at least I think so) I once experimented with.

One thing that can make smalltalk code hard to read is the fact that almost everything is a message send, and that the receiver of a message send is always written first. For instance, in a send to ifTrue:ifFalse: you always write the condition first because that is the receiver of the send:

self size ~= other size ifTrue: [ ... ] ifFalse: [ ... ]

When you read this line you have to read past the self size ~= other size part before you see that what you’re reading is actually the condition of a conditional. In this case the line is short so it’s not a big problem but it does cost me just one or two extra brain cycles when reading the line. I would have saved those if you could write the conditional so that it was immediately clear, when reading from left to right, that it was a conditional. For instance:

if: (self size ~= other size) ...

The problem becomes much worse when you consider the operations whose receiver is a block, for instance some repetition, thread and exception operations, because blocks can be very long. An example is this code snippet taken from Squeak (well, slightly changed):

[
[ | promise |
promise := Processor tallyCPUUsageFor: secs
every: msecs.
tally := promise value.
promise := nil.
self findThePig.
] repeat
] on: Exception do: [
^nil.
]

What’s happening here? The outer block is an exception handler, the body of which repeatedly performs some operation. When you read this code you have to read ahead several lines to understand what each block means. By the way, I have no idea what findThePig does.

Another example, also taken from Squeak, is this:

[
[
newMorph := NetworkTerminalMorph connectTo:
self ipAddress.
WorldState
addDeferredUIMessage:
[newMorph openInStyle: #scaled] fixTemps.
]
on: Error
do: [ :ex |
WorldState addDeferredUIMessage: [
self inform: 'No connection to: ',
self ipAddress,' (',ex printString,')'
] fixTemps
].
] fork

Here you not only have to read way ahead to see that the outer block forks a thread, but you have to look closely at the body of the inner block to spot the on:do: selector which doesn’t stand out at all but is very important in understanding the code. This is only two levels of nesting and if you add another level the code becomes even harder to read.

Back when I wrote a lot of smalltalk I started inserting comments to make code such as this easier to read. For instance, I would insert “do” before a loop and “try” before an exception handler:

"try" [
"do" [ | promise |
promise := Processor tallyCPUUsageFor: secs
every: msecs.
tally := promise value.
promise := nil.
self findThePig.
] repeat
] on: Exception do: [
^nil.
]

Here I don’t have to read ahead because I can see which kind of construct I’m dealing with even before I see the opening bracket. You still have to read to the end to understand exactly how the code is repeated and which exceptions are caught, but my experience is that this gives just enough context that you can read and understand the code in one go, something I couldn’t do with the previous version of the code, without the comments.

Back when we were discussing how to change the language, before we decided to go for a C/C++ style syntax, I suggested something called prefix keywords to make the language slightly more C-like and easier to read. It was never very popular but I still think it’s a pretty decent idea. The suggestion was to allow keyword sends that start with a keyword. You can still write ordinary keyword sends such as:

(...) ifTrue: [...] ifFalse: [...]

but you would also be allowed to write:

if: (...) then: [...] else: [...]

which means sending the if:true:false: message to the first expression, with the two blocks as arguments. You would declare the if:then:else: method on booleans like this (here on True)

if: self then: thenPart else: elsePart
^thenPart value

That way, instead of using comments to mark the beginning of constructs like above, you can use a prefix keyword:

try: [
do: [ | promise |
promise := Processor tallyCPUUsageFor: secs
every: msecs.
tally := promise value.
promise := nil.
self findThePig.
].
] on: Exception do: [
^nil.
]

There’s no problem in parsing it and the semantics is as simple as the three other kinds of sends. It also works pretty well for methods such as acquireDuring:. Consider

mutex acquireDuring: [
... critical section ...
]

compared with

acquire: mutex during: [
... critical section ...
]

I think it’s pretty neat. It does change the flavor of the code somewhat but I think it makes the code much easier to read.

Totten

I read the Middle East Journal, a blog written by freelance journalist Michael Totten. Recently, he asked his readers what we thought about him no longer going on paid assignments. Instead, he would write exclusively for the readers of his blog and financed by our voluntary tips. The feedback in the comments were very positive and apparently so was the feedback in his tip jar. So now, if things work out, he’s off to Iran. And maybe later Afghanistan, or Syria, or North Korea.

It’s an interesting experiment — a journalist cutting out the middleman and writing directly for, and interacting with, a group of people. At first I was pretty excited about it but as I’ve thought a bit more about it I’m having some second thoughts. For one thing, some might argue that the middleman, the editor, is there for a reason. I would tend to agree. Also, these are dangerous places he’ll be going. Will I be (partially) responsible if something happens to him? I’m sure that the more remote and dangerous places he writes about, the more people will read his blog and hit his tip jar. Will put pressure on him to visit these places and put himself at even greater risk?

On the other hand there’s the distinct possibility that I shouldn’t think so much and just enjoy the blog.

Match

When I first heard about Scala, the very first thing I did was find the specification and look up the details of pattern matching. I’m a big fan of pattern matching, something I apparently share with Martin Odersky, and I wanted to see if Scala used the same model as his previous language, Pizza. Unfortunately it does as far as I can see. The reason I think it’s unfortunate is that I think there is a great potential in introducing pattern matching in an object-oriented language but I don’t think Scala’s model quite reaches that potential.

Before getting to Scala’s mechanism I’ll start off by explaining why I think pattern matching is something you would want to have in an object-oriented language. Conventional wisdom says that in an object-oriented language you’re not supposed to decide which action to take based on the type of other objects. Scott Meyers says in Effective C++:

Anytime you find yourself writing code of the form “if the object is of type T1, then do something, but if it’s of type T2, then do something else,” slap yourself.

Instead of dispatching on an object’s type, you should use polymorphism. This rule is a good first approximation, but it’s not the whole story. Sometimes you’re not in a position to add methods to other objects. And even if you are, adding methods can be a bad idea because it often means polluting other objects with methods that may be convenient for you but are completely irrelevant to the object itself and anyone else who might use it. Case in point: in Squeak the Object class has over 400 methods including isSketchMorph, playSoundNamed and, the winner by unanimous decision, hasModelYellowButtonMenuItems. Encapsulation? We don’t need no steenkin’ encapsulation!

This is where a pattern matching mechanism can be really useful. If you have a reasonably fixed set of object types that you want to operate on it gives you a structured, simple and readable way of distinguishing between objects. Unfortunately Scala’s model, and Pizza’s before it, imposes some limitations on the objects involved that make it unnecessarily difficult to use.

Scala’s pattern matching mechanism is pretty simple: you can declare a class as a case class, which enables you to match against it:

class Expr
case class Var(x: String) extends Expr
case class Apply(f: Expr, e: Expr) extends Expr
case class Lambda(x: String, e: Expr) extends Expr

Given an Expr object, or any object really, you can match it against these three cases like this:

val e: Expr = ...
e match {
case Var(x) => ...
case Apply(f, e) => ...
case Lambda(x, e) => ...
}

This looks very similar to functional-style pattern matching as in ML and Haskell.

The most important problem with this mechanism is that a case is bound to a particular class. The Var(x) case only matches instances of the Var class and its subclasses. It’s fine to dispatch based on whether or not an object implements a particular interface, because any class is free to declare that it implements that interface, and implement it however they want. Requiring classes to have a particular superclass, on the other hand, dictates how the class should be implemented which is a bad thing indeed.

Here’s an example of why that’s a problem. Let’s say I’m implementing a linear algebra library with various functions that act on points, lists, matrices, etc. I have a basic Point class:

case class Point(x: Int, y: Int) {
// All sorts of numerical methods
}

Let’s say that my library turns out to be a bit slow and after profiling it I discover that one point in particular is used a lot, (0, 0), and a lot of the algorithms can be simplified considerable in that case. So I make a separate Origin class which represents (0, 0) and has its own simplified implementations of all operations (here in an imaginary case-extension syntax):

class Origin extends Object case Point(0, 0) {
// Methods specialized for (0, 0)
}

In Scala you can’t do that — Origin must be a subclass of Point if it wants to match as a point. In this case extending Point means that Origin gets two fields and a lot of methods, none of which it is interested in using. Bugger.

The second problem is that the values in the pattern can not be computed, they are always read directly from the object. You might want to optimize the numerical library by making some operations lazy, so you have a LazyPoint which is just like a point except that it doesn’t calculate its coordinates until they are requested. Unfortunately you can’t: the coordinates are read through an accessor, which is generated for you, that simply returns the value of a variable. Double bugger.

The last problem is that once a class has declared that it’s a case, subclasses can’t change or override that. An example from the post about equality was the class LinePosition, which represents a position corresponding to a particular line in a text file. A subclass of this class was CharacterPosition, which not only specifies the line in the file but also a particular character within that line. I would probably want LinePosition to match LinePosition(line: Int) and CharacterPosition to match CharacterPosition(line: Int, char: Int). Unfortunately I can’t because if one is a subclass of the other then they must match the same pattern. In this case, CharacterPosition must match as LinePosition(line: Int). Triple bugger.

The conclusion here is that if I use case classes and pattern matching I’ve put a lot of restrictions on the way I can implement my classes. If I was to implement the linear algebra library I mentioned above and use pattern matching to distinguish between points, lists and matrices, I’ve severely limited the freedom to decide how to actually implement these objects. And the limitations are especially uncomfortable because they dictate how you should implement your objects: that superclass must be a particular class and that the internal state of the object, or at least part of it must be readable directly from variables.

What I’m saying is not that I think pattern matching in Scala was poorly designed — it’s more that I’m probably trying to use it to solve a problem it probably wasn’t designed to solve. But I don’t think it’s a fundamental problem with pattern matching. So without further stalling, here’s my suggestion for an alternative semantics that is much less restrictive.

There is already a well-known way to dispatch on the type of an object: the visitor design pattern. When you pattern match an object, you shouldn’t inspect the object “from the outside” to select the right case, you should instead let the object itself select the right case using double dispatch. If you were to write a double-dispatch version of the lambda example from before, without any syntactic sugar, the code would look something like this:

class Expr
class Var (x: String) extends Expr {
def match(m: Matcher) = m.var(x)
}
class Apply (f: Expr, e: Expr) extends Expr {
def match(m: Matcher) = m.apply(f, e)
}
class Lambda(x: String, e: Expr) extends Expr {
def match(m: Matcher) = m.lambda(x, e)
}

and a match expression would expand into something like:

val e: Expr = ...
e.match(new {
def var(x) = ...
def apply(f, e) = ...
def lambda(x, e) = ...
})

The match expression is packed into a Matcher object which is really a mini-visitor. The matcher is then given to the to the object being matched which calls back to select the case to match. Note that I’m not suggesting that you write the code like that. In simple cases the same syntax could be used as in Scala, in more complex cases you might need a more general syntax but I haven’t quite thought that through.

Using visitor-pattern-matching an object has full control over what and how it “matches”. In the Origin example this means that an Origin object is free to implement its match method so that it matches as a point. It also means that the lazy point object is free to calculate the coordinates before calling back to the Matcher. Finally, CharacterPosition can override the match method so it is no longer forced to match as a LinePosition. Nice!

Of course there are problems. If you want to be able to have default cases that match if no other cases do, the matcher object must be able to handle calls to methods it doesn’t implement — sort of like the way smalltalk does with doesNotUnderstand. Another problem is that the extra flexibility enables objects to abuse the Matcher object for instance by matching more than once or not at all. Finally, you don’t want to use regular method names like point since that’s a flat namespace — instead you would use some form of qualified names. But none of these problems are very difficult to solve.

Bottom line: I think visitor-ifying the pattern matching mechanism in Scala would make it much more flexible and useful but still provide, as a special case, the current mechanism. And oh, I almost forgot, it can also be used to solve the equality problem that I described in an earlier post but that will have to wait.

Update: I discussed this with some people from the scala group and they managed to convince me that visitors are just not a realistic implementation strategy. Bugger!

Kurdistan

Michael J. Totten is traveling in the Kurdish northern part of Iraq and blogging about it. Very interesting stuff, with lots of pictures.

Whatever food and beverage item you might want to buy (including booze), the Kurds have it. You want Red Bull? They got it. They also have Blue Ox, whatever that is. People looked at me funny when I took these pictures. Why on earth is that guy taking pictures of the Red Bull? He’s American, hasn’t he seen this shit a million times already back home? What I think they don’t understand is that what’s normal in the Middle East somehow amazes (and comforts) people who have never been here. So I took pictures of the grocery store. It’s not all burkhas, camels, and caves out here.