
Monitoring

One really useful programming tool I use whenever it’s available is exported stats monitoring. It’s one of those things that has no standard definition and which people reimplement over and over in different ad-hoc ways. This post is about WebMon, a monitoring tool I implemented recently for JavaScript programs running in chrome.

WebMon

WebMon is a very simple tool conceptually; this image tells you all you need to know to use it:

[Screenshot: the WebMon popup showing the current values of a page’s exported stats]

It has two parts: a JavaScript library, webmon.js, and a chrome extension, WebMon. In your JavaScript code you export a stat by creating an instance of the webmon.Counter type provided by the library. During execution you update the stats as appropriate using the methods on the Counter instances. The WebMon chrome extension detects any page that uses the library and can show you a popup with the current value of each exported stat, updated continuously. You can also specify some simple computations on the data; for instance, here the frame rate counter is configured to display the rate per second rather than the count. To see this in action try installing the chrome extension and going to this test page. The extension needs pretty broad permissions to see stats on any web page, so if you’re paranoid you may want to skim the source code before installing it.
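
To give a concrete picture, the sketch below shows roughly what exporting a stat could look like; the constructor argument and the increment() method are assumptions for illustration, so check webmon.js for the actual API.

// Hypothetical sketch of exporting a stat with webmon.js; the constructor
// argument and the increment() method are assumed here and may differ from
// the real webmon.Counter API.
var frameCounter = new webmon.Counter("frames");

function renderFrame() {
  // ... draw the scene ...
  frameCounter.increment();  // bump the exported stat; the extension picks up the new value
}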

The motivation for implementing this came from a WebGL-based hobby project I’m working on. I needed to keep track of the frame rate and rendering CPU load, and my initial implementation, which displayed those in a div on the page, was just not working out. WebMon gives you the same functionality with a minimum of complexity and performance impact on the JavaScript side, since almost all the code is in the chrome extension. It also doesn’t require you to clutter your page with debug info.

That’s all there is to it. The rest of this post will be about exported stats in general.

Monitoring

WebMon is just one example of the more general mechanism of exported stats. The general model is: a program exports a set of stats which it updates during execution. These stats are visible to some form of external processor/viewer which can record or at least see their value at any time. A simple example is chrome. If you run chrome with the --enable-stats-table command-line flag you can go to chrome://stats and see the current value of a number of chrome’s internal counters and timers. V8 has a similar mechanism, but since V8 doesn’t have a UI that can easily display the stats I wrote a python script that continuously reads the counter values and displays them in a window. I used it all the time when working on the V8/chrome integration layer since it allowed me to see exactly how many dom node wrappers were live, exactly what had happened during the last garbage collection, etc.

Much of this you could also get by printing debug information and processing it either manually or, for more complex output, using separate programs. A place this works really well is dumping debug information after garbage collections in a virtual machine. Garbage collections happen rarely enough that it’s not a lot of overhead and you’re likely to want information about every collection, not just the last one. Compared to logging, the advantage of using an exported stat is that updating it is cheaper — in the v8 implementation it’s just writing a single word in memory — and it scales well. If you log information about 100 different conditions your log becomes huge and difficult to process, and the I/O slows your program down. With exported stats the external processor or viewer can easily filter the stats for you and only show the ones you’re interested in. Also, the space for each exported stat is constant, typically just a single word containing the stat and another few words of metainformation. Another advantage of exported stats is that they’re testable — you can easily read back and test the value of a counter, whereas testing that the right thing is printed in a debug log is tricky.

Really I think each programming language should come with stats exporting and monitoring built in or as a canonical library. Or even better, since monitoring and viewing can (and should) be a separate process, there could be a single common monitoring tool with a standard protocol and bindings for each language.

Until that happens, if you’d like to know what is happening in your JavaScript program as it happens you should give WebMon a try.

Tracing Promises

Every so often I find myself having to write JavaScript programs that interact with the world through XMLHttpRequests. And every time, the first thing I do before I actually get to send off requests is implement a small promise framework. Promises are just so much more powerful than callbacks that at this point I wouldn’t even consider writing raw callback-based asynchronous code.

One thing about promises though, especially if your program is even moderately complex, is that they make it much more difficult to follow the flow of your code. Everything happens asynchronously in short callbacks, and when an error occurs it can be near impossible to piece together what happened. This post is about how to solve that problem.

Promises in general

Before getting to the part about tracing them, here are some examples of uses of promises, to set the stage for the kind of operations we’ll later want to trace. These examples use idioms from my own promise library, but these considerations apply to any of the libraries out there; the concrete patterns may just be different.

One place I relied heavily on promises was in the implementation of the sputnik test web runner (which has now been subsumed by test262). The function for firing off an XHR using google’s closure library looked like this:

function makeRequest(path) {
  var result = new Promise();
  goog.net.XhrIo.send(path, function () {
    result.fulfill(this.getResponse());
  }, "GET", undefined, undefined, 0);
  return result;
}

// Fetch google.com and alert the result.
makeRequest("http://www.google.com")
    .onFulfilled(alert);

What this function does is return a promise object that gets fulfilled when the XHR, which is sent off at the same time, completes and invokes its callback. This may look like a whole lot of nothing – basically making a callback into a promise – but the advantage comes at the call site. Say you have two mirrors of the same data and want to send a request to both and use the first response you get back. With promises that is trivial:

// Fetch from both mirrors, use the first response.
var response = Promise.select(makeRequest(mirrorOne),
                              makeRequest(mirrorTwo));
response.onFulfilled(alert);

The select operator takes any number of promises and returns a new promise that resolves as soon as any of the child promises resolve. Another use for select is to set timeouts:

// Fetch path but time out after one second.
var response = Promise.select(makeRequest(path),
                              Promise.failAfter("Promise timed out", 1000));

// Alert successful response, log any errors in the console.
response
    .onFulfilled(alert)
    .onFailed(console.log.bind(console));

Here the response will resolve either to the result of the request, if it happens within one second, or else will fail with the string "Promise timed out" when the result of failAfter fails after one second.

To get the results of several concurrent promises you can use join. This code sends off two XHRs at once, waits until they’re both complete, and then alerts the concatenation of the results.

// Fetch pathOne and pathTwo in parallel.
var request = Promise.join(makeRequest(pathOne),
                           makeRequest(pathTwo));

// When done, concatenate the two responses.
var response = request.thenApply(function (dataOne, dataTwo) {
  return dataOne + dataTwo;
});

// Finally alert the result.
response.onFulfilled(alert);

The join operator takes a list of promises and returns a promise that resolves to the list of values of all the child promises once they’ve been resolved. The thenApply operator on a promise returns a new promise whose value will be the result of applying the function to the value of the original promise.

Note that these examples are a lot more verbose than you would usually make such code. I wrote them like that to make it clearer what’s going on. Some operations would be packed into convenience methods, like setting a timeout, and you would generally use method chaining. With that the last example, with the addition of a three second timeout, would look like this:

var response = Promise
    .join(makeRequest(pathOne), makeRequest(pathTwo))
    .thenApply(function (a, b) { return a + b; })
    .withTimeout(3000)
    .onFulfilled(alert)
    .onFailed(console.log.bind(console));

Packs quite a punch, doesn’t it? This is what I mean when I say that promises are more powerful than callbacks; it’s not that one is more expressive than the other (they’re computationally equivalent), but in terms of conciseness and power of abstraction there’s just no competition. And this is just a small part of what you can do with promises. A good place to see the power of this style of programming in action is twitter’s finagle stack.

The Errors! The Errors!

All’s well with this model as long as all operations succeed. However, as soon as operations start failing it becomes a mess. Consider for instance the case where you have numerous concurrent operations running that have timeouts. Errors are propagated through promises nicely, so if you’ve set up a failure handler you’ll see the error, but it will look something like this:

Operation timed out
at promise.js:108:18

And that’s in the best case, where error objects have stack traces. Since each step in the chain of promises is detached and runs in a separate turn of the event loop, all you’ll see is that some timeout somewhere fired. There’s no telling which operation took too long. As soon as your program is just moderately complex this becomes a major issue. It’s like debugging a normal synchronous program with no stack traces. Stack traces are there for a reason: it sucks to debug without them.

The stack trace analogy is useful in understanding what information you’d want to have, if you could, when a promise fails. All these patterns of asynchronous operations correspond, when you’re only interested in error reporting, to simple call/return patterns we’re used to. For instance, the join operation is just an asynchronous version of taking a list of functions, calling each one, and returning a list of their results:

function syncJoin(funs) {
  var result = [];
  for (var i = 0; i < funs.length; i++)
    result.push(funs[i]());
  return result;
}

In this case if calling any of the functions fails then syncJoin will fail and both calls will show up in the stack trace. The same thing holds for join on promises: if join fails because one of the child promises fails, you want information about both. The same thing holds for select, for the same reason.

This all suggests that when a promise fails and propagates that failure through a number of other promises what you as a programmer need is information about each promise in that chain. Something like:

Operation timed out
at promise.js:108:18
--- triggering the failure of ---
at Function.withTimeout (promise.js:106:17)
at runDemo (demo.js:226:11)
at demo.html:7:168
at onload (demo.html:8:4)
--- triggering the failure of ---
at Function.select (promise.js:50:16)
at Function.withTimeout (promise.js:110:18)
at runDemo (demo.js:226:11)
at demo.html:7:168
at onload (demo.html:8:4)

This is a trace created using the promise trace library I wrote last week to track down some issues with a chrome extension I was working on. A promise trace is made up of segments, each one a stack trace identifying a promise. The top segment is the error that caused the failure, in this case exactly the same timeout as before. The next one is the promise created by withTimeout that fails when the timeout is exceeded. The bottom segment is the select between the operation and the timeout, and going to demo.js:226 will tell you which operation it was.

The trace above is from chrome; here’s the result of the same failure in firefox:

Operation timed out
--- triggering the failure of ---
GenericTraceSegment()@trace.js:197
Promise()@promise.js:23
([object Object],100)@promise.js:106
runDemo()@demo.js:226
onload([object Event])@demo.html:1
--- triggering the failure of ---
GenericTraceSegment()@trace.js:197
Promise()@file:promise.js:23
([object Object],[object Object])@promise.js:50
([object Object],100)@promise.js:110
runDemo()@demo.js:226
onload([object Event])@demo.html:1

It’s a bit more cluttered since the firefox stack trace api doesn’t support stripping irrelevant top frames but both contain the relevant clues you need when debugging.

Both traces contain a fair amount of redundant information. For instance, the two bottom promises are created close to each other so the bottoms of their stack traces are identical. To make the traces easier to read the promise trace library folds away overlapping stack traces. The actual trace that would be printed for the above error is:

Operation timed out
at promise.js:108:18
--- triggering the failure of ---
at Function.withTimeout (promise.js:106:17)
at runDemo (demo.js:226:11)
at demo.html:7:168
at onload (demo.html:8:4)
--- triggering the failure of ---
at Function.select (promise.js:50:16)
at Function.withTimeout (promise.js:110:18)
at runDemo (demo.js:226:11)
(... rest of trace same as previous ...)

This is probably still more cluttered and redundant than it needs to be – in my experience, for each segment you only need to know the top frame below the call into the promise library and the bottom frame that is different from the previous segment. But I’m not sure that’s true in all cases so for now this is as small as I’d like to make it.

Getting promise traces

How does this api work? From a user’s perspective all you see is that on failure your callback is given two values: the error and a trace. You’d typically use it something like this:

myOperation.onFailed(function (error, trace) {
  console.log(trace.toString());
});

This requires some support from the promise library but not a lot actually. Whenever a promise is created it has to create a PromiseTraceSegment value and when propagating a failure it should create and propagate a chain of PromiseTrace objects which collect the relevant segments. In my own promise library the tracing code makes up maybe ten lines.
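
As a rough illustration of what that support could look like (the names below are made up for the example, not the library’s actual API), a promise can capture a segment when it is created and extend the chain of segments as a failure propagates:

// Illustrative sketch only; names and structure are assumptions.
function captureSegment() {
  // new Error().stack works in chrome and firefox; in browsers without
  // stack trace support the segment will simply be empty.
  return new Error().stack || "";
}

function TracingPromise() {
  // Remember where this promise was created.
  this.traceSegment = captureSegment();
}

// When a failure propagates through this promise, extend the chain of
// segments so the failure handler can see every promise it passed through.
TracingPromise.prototype.extendTrace = function (trace) {
  var segments = trace ? trace.segments.concat() : [];
  segments.push(this.traceSegment);
  return {segments: segments};
};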

Performance

The way this is implemented, as is probably evident from the examples above, is by capturing a stack trace every time a promise is created (there’s also a flag to disable trace capturing altogether). The whole thing is actually pretty straightforward; the trace library is ~200 lines, most of which are for formatting the output. Capturing stack traces is not super expensive generally, but even relatively cheap operations add up if you do them often enough. To get a sense for how expensive tracing is I wrote a simple, ludicrously asynchronous implementation of the Fibonacci function:

function lazyFib(n) {
  if (n == 0 || n == 1) {
    return Promise.of(1);
  } else {
    return Promise.deferLazy(function () {
      return Promise
          .join(lazyFib(n - 2), lazyFib(n - 1))
          .thenApply(function (a, b) { return a + b; });
    });
  }
}

I then tried running it with and without promise tracing. In chrome tracing makes it around 60% slower on average. In firefox the cost is only around 20% but then this program was a lot slower overall. I believe this is because in firefox the event queue runs at a slower rate by design. In safari you can enable tracing and it’s essentially free but in the version of safari I tried you can’t actually get a stack trace so the promise traces produced are useless. I believe stack trace support is coming to safari too though.

This example is the absolute worst case performance cost; nobody would write as promise-heavy a program as that example. To also get a sense for the impact on a more realistic program I wrote a larger benchmark, one that simulates an RPC system with a string repository that holds a large number of strings which have to be fetched asynchronously, a few at a time, and then processed by a chain of other asynchronous operations. It was still ridiculously asynchronous, much more than any realistic program, but at least not quite as trivial. This time the performance cost in chrome was 0.8% on average. Statistically significant but tiny.

Conclusion

Debugging asynchronous JavaScript programs is difficult. However, adding support to libraries that implement asynchronous abstractions like promises is straightforward and the overhead of collecting traces is likely to be insignificant for any realistic JavaScript program.

MyJS

The first large JavaScript program I wrote after ES5 came out was my mercury chrome extension. I was surprised at how big a difference some relatively small changes to the language made to my code style. Before then I had followed the process that led to ES5 and knew about those features long before I tried them, but it was only when I actually tried them that I got a sense for how they felt to use. I thought, wouldn’t it be useful if you could get that while developing the language rather than after?

So a few months ago I sat down to write a language experimentation framework for JavaScript, MyJS. What MyJS does is allow you to define a language extension, including new syntax and/or library functions. A source file (html or js) can then specify which extensions it uses and the parser and translation framework will plug extensions together as appropriate to parse your program and convert it to plain JavaScript. How it works is worth a blog post of its own but here I’ll outline how you define a language extension and use it.

Delegates

You can try MyJS by going to this test page (I haven’t tested this in all browsers, it probably doesn’t work in all of them). The program that prints the output you see on that page uses the delegates extension, the operator ::, which returns a python-style bound method. This is what the program looks like:
// A Printer can print text to a document.
function Printer(prefix) {
  this.prefix = prefix;
}

// Add this printer's prefix and print the given value.
Printer.prototype.print = function(value) {
  var elm = document.createElement('div');
  elm.innerText = this.prefix + value;
  document.body.appendChild(elm);
};

var p = new Printer("Looky here: ");

[1, 2, 3].map(p::print);
(::print)(p, "Where?");

(The idea for the delegates operator is Erik Corry’s.) In this case the binary (“bound”) :: operator returns a function that, when called, calls p.print with the arguments it’s given. So (p::print)("foo") means exactly the same as p.print("foo"). The unary :: (“unbound”) operator returns a function that uses its first argument as the receiver and the rest as arguments. So (::print)(p, "bar") is the same as p.print("bar"). This is just a random example of an extension; the operator itself isn’t that important.

Using a dialect

The way you use an extension is by using a MyJS specific script tag:
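
Something along these lines; the file names, the text/myjs script type, and the dialect attribute are assumptions for illustration, so the real MyJS syntax may differ:

<!-- Illustrative sketch only; the text/myjs script type and the dialect
     attribute are assumptions, not necessarily the real MyJS syntax. -->
<script type="text/javascript" src="myjs.js"></script>
<script type="text/myjs" src="delegates.js"></script>
<script type="text/myjs" dialect="demo.Delegates">
  // This script can now use the :: operator from the delegates extension.
  var logBound = console::log;
  logBound("Hello from the extended language");
</script>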


This code first includes the MyJS JavaScript library. It then includes a script called delegates.js, which defines the delegates extension. MyJS comes with a few language extensions built in which make it easier to write new language extensions. Delegates uses one of those, myjs.Quote. Finally, once we’ve loaded the delegates extension, the third script element can use the extended language. Easy!

Defining an extension

How do you define a language extension? It has four parts: defining syntax trees, describing the concrete syntax, implementing the rules for translating the new syntax trees into plain JavaScript, and finally hooking it all into the framework. The full definition of the delegates extension is here. For the delegates extension we need two new syntax trees, one for bound and one for unbound delegates:

// Bound expression syntax tree node.
function BoundMethodExpression(atom, name) {
  this.type = 'BoundMethodExpression';
  this.atom = atom;
  this.name = name;
}

// Unbound method syntax tree node.
function UnboundMethodExpression(name) {
  this.type = 'UnboundMethodExpression';
  this.name = name;
}

The syntax tree format is modelled on Mozilla’s Parser API, and the parser for the standard language constructs produces Parser API-compatible syntax trees.

To produce these we need to hook into the grammar. The syntax is pretty straightforward:

LeftHandSideSuffix
::= "::" Identifier

PrimaryExpression
::= "::" Identifier

(A left hand side suffix is basically the right hand side of an operator expression). The way to express this in MyJS is to map it into calls to the grammar constructor library:

function getSyntax() {
  // Suffix syntax tree builder helper.
  function BoundMethodSuffix(name) {
    this.name = name;
  }

  // Apply this suffix to a base expression.
  BoundMethodSuffix.prototype.apply = function(atom) {
    return new BoundMethodExpression(atom, this.name);
  };

  // Build the syntax
  var f = myjs.factory;
  var syntax = myjs.Syntax.create();

  // LeftHandSideSuffix ::= "::" Identifier
  syntax.getRule('LeftHandSideSuffix')
      .addProd(f.punct('::'), f.nonterm('Identifier'))
      .setConstructor(BoundMethodSuffix);

  // PrimaryExpression ::= "::" Identifier
  syntax.getRule('PrimaryExpression')
      .addProd(f.punct('::'), f.nonterm('Identifier'))
      .setConstructor(UnboundMethodExpression);

  return syntax;
}

There’s a bit of a quirk here because a left hand side suffix doesn’t correspond to a “full” syntax tree; it has the left hand side missing. So instead it returns a suffix object that will be called later by the parser with the left hand side as an argument and must return the full syntax tree.

The reason we don’t just build the syntax but define a function for doing it is that we only want to build it if the dialect is used. If it is just defined but never used it’s a waste of time to build the syntax.

This allows the framework to parse extended code and build syntax trees. The next part is to define how they are translated into plain JavaScript:

BoundMethodExpression.prototype.translate = function(context) {
  return #Expression((function(temp) {
    return temp.,(context.translate(this.name)).bind(temp);
  })(,(context.translate(this.atom))));
};

UnboundMethodExpression.prototype.translate = function(context) {
  return #Expression(function(recv, var_args) {
    return recv.,(context.translate(this.name)).apply(recv,
        Array.prototype.splice.call(arguments, 1));
  });
};

This is where it gets a little bit tricky. This code uses a language extension, myjs.Quote, to plug together syntax trees. While this is tricky the first time you see it, it is a lot better than having to build syntax trees by hand without quoting. The #Expression part means: parse the following code as an expression, but don’t run it, return the syntax tree. The commas mean: this must be evaluated and the result, which is a syntax tree, must be spliced into the surrounding tree. If you know quasiquote from scheme that’s basically what it is. What this code does is say: if you have the syntax tree A::B, translate it into something like

(A.B).bind(A)

except that will cause A to be executed twice, so instead we use

(function (t) { return (t.B).bind(t); })(A)

and similarly for ::A. The recursive calls to translate are there to translate the subexpressions.

The last part is to hook it all into the framework. That’s done using this incantation:

myjs.registerFragment(new myjs.Fragment('demo.Delegates')
    .setSyntaxProvider(getSyntax)
    .registerType('BoundMethodExpression', BoundMethodExpression)
    .registerType('UnboundMethodExpression', UnboundMethodExpression));

Here we give the extension a name, demo.Delegates, register the function that will return the syntax to use, and register the two new types of syntax trees. That’s what it takes to define an extension.

Fragments also allow you to set up install hooks, so if your extension needs to add library functions to the global object for instance you can specify a function that is given the global object as an argument and can install any functions and methods it needs.
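
As a rough sketch of that (the setInstallHook name below is hypothetical, not necessarily what MyJS actually calls it), registering such a hook might look something like this:

// Hypothetical sketch; the hook-registration method name and signature
// are assumptions, not the actual MyJS API.
myjs.registerFragment(new myjs.Fragment('demo.Delegates')
    .setSyntaxProvider(getSyntax)
    .registerType('BoundMethodExpression', BoundMethodExpression)
    .registerType('UnboundMethodExpression', UnboundMethodExpression)
    .setInstallHook(function (global) {
      // Install any library support the extension needs on the global object.
      global.demoDelegatesLoaded = true;
    }));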

…six months later

I sort of ran out of steam on this project six months ago, so while the “hard” part is there – the extensible parser framework, the basic language definition (all the standard language constructs are defined as if they were extensions), the translation framework, etc. – there is still some work left to do before everything works; for instance, before you could define the harmony classes syntax. It would also be great if you could specify a dialect the same way you specify strict mode, within the script (“using demo.Delegates”).

I also played around with making this work with node.js but I couldn’t find the hooks in the module importing primitives that would allow me to intercept module loads and do the source translation.

The code lives on github.