Tracing Promises

Ever so often I find myself having to write JavaScript programs that interact with the world through XMLHttpRequests. And every time the first thing I do before I actually get to send off requests is implement a small promise framework. Promises are just so much more powerful than callbacks that at this point I wouldn’t even consider writing raw callback-based asynchronous code.

One thing about promises though, especially if you program is just moderately complex, is that they make it much more difficult to follow the flow of your code. Everything happens asynchronously in short callbacks and when an error occurs it can be near impossible to piece together what happened. This post is about how to solve that problem.

Promises in general

Before getting to the part about tracing them here are some examples of uses of promises, to set the stage for the kind of operations we’ll later want to trace. These examples use idioms from my own promise library but these considerations apply to any of the libraries out there, the concrete patterns may just be different.

One place I relied heavily on promises was in the implementation of the sputnik test web runner (which has now been subsumed by test262). The function for firing off an XHR using google’s closure library looked like this:

function makeRequest(path) {
  var result = new Promise();
  goog.net.XhrIo.send(path, function () {
    result.fulfill(this.getResponse());
  }, "GET", undefined, undefined, 0);
  return result;
}

// Fetch google.com and alert the result.
makeRequest("http://www.google.com")
  .onFulfilled(alert);

What this function does is return a promise object that gets fulfilled when the XHR, which is sent off at the same time, completes and invokes its callback. This may look like a whole lot of nothing – basically making a callback into a promise – but the advantage comes at the call site. Say you have two mirrors of the same data and want to send a request to both and use the first response you get back. With promises that is trivial:

// Fetch from both mirrors, use the first response.
var response = Promise.select(makeRequest(mirrorOne),
  makeRequest(mirrorTwo));
response.onFulfilled(alert);

The select operator takes any number of promises and returns a new promise that resolves as soon as any of the child promises resolve. Another use for select to set timeouts:

// Fetch path but time out after one second.
var response = Promise.select(makeRequest(path),
  Promise.failAfter("Promise timed out", 1000));

// Alert successful response, log any errors in the console.
response
    .onFulfilled(alert)
    .onFailed(console.log.bind(console));

Here the response will resolve either to the result of the request, if it happens within one second, or else will fail with the string "Promise timed out" when the result of failAfter fails after one second.

To get the results of several concurrent promises you can use join. This code sends off two XHRs at once, waits until they’re both complete, and then alerts the concatenation of the results.

// Fetch pathOne and pathTwo in parallel.
var request = Promise.join(makeRequest(pathOne),
    makeRequest(pathTwo));

// When done, concatenate the two responses.
var response = request.thenApply(function (dataOne, dataTwo) {
  return dataOne + dataTwo;
});

// Finally alert the result.
response.onFulfilled(alert);

The join operator takes a list of promises and returns a promise that resolves to the list of values of all the child promises once they’re been resolved. The thenApply operator on a promise returns a new promise whose value will be the result of applying the function to the value of the original promise.

Note that these examples are a lot more verbose than you would usually make such code. I wrote them like that to make it clearer what’s going on. Some operations would be packed into convenience methods, like setting a timeout, and you would generally use method chaining. With that the last example, with the addition of a three second timeout, would look like this:

var response = Promise
    .join(makeRequest(pathOne), makeRequest(pathTwo))
    .thenApply(function (a, b) { return a + b; })
    .withTimeout(3000)
    .onFulfilled(alert)
    .onFailed(console.log.bind(console));

Packs quite a punch doesn’t it? This is what I mean when I say that promises are more powerful than callbacks; it’s not that one is more expressive than the other, they’re computationally equivalent, but in terms of conciseness and power of abstraction there’s just no competition. And this is just a small part of what you can do with promises. A good place to see the power of this style of programming in action is twitter’s finagle stack.

The Errors! The Errors!

All’s well with this model as long as all operations succeed. However, as soon as operations start failing this is a mess. Consider for instance the case where you have numerous concurrent operations running that have timeouts. Errors are propagated through promises nicely so if you’ve set up a failure handler you’ll see the error, but it will look something like this:

Operation timed out
    at promise.js:108:18

And that’s in the best case where error objects have stack traces. Since each step in the chain of promises is detached and runs in a separate turn of the event loop all you’ll see is that some timeout somewhere fired. There’s no telling which operation took too long. As soon as your program is just moderately complex this becomes a major issue. It’s like with normal synchronous programs if there were no stack traces. Stack traces are there for a reason: it sucks to debug without them.

The stack trace analogy is useful in understanding what information you’d want to have, if you could, when a promise fails. All these patterns of asynchronous operations correspond, when you’re only interested in error reporting, to simple call/return patterns we’re used to. For instance, the join operation is just an asynchronous version of taking a list of functions, calling each one, and returning a list of their results:

function syncJoin(funs) {
  var result = [];
  for (var i = 0; i ... funs.length; i++)
    result.push(funs[i]());
  return result;
}

In this case if calling any of the functions fails then syncJoin will fail and both calls will show up in the stack trace. The same thing holds for join on promises: if join fails because one of the child promises fail you want information about both. The same thing holds for select, for the same reason.

This all suggests that when a promise fails and propagates that failure through a number of other promises what you as a programmer need is information about each promise in that chain. Something like:

Operation timed out
    at promise.js:108:18
--- triggering the failure of ---
    at Function.withTimeout (promise.js:106:17)
    at runDemo (demo.js:226:11)
    at demo.html:7:168
    at onload (demo.html:8:4)
--- triggering the failure of ---
    at Function.select (promise.js:50:16)
    at Function.withTimeout (promise.js:110:18)
    at runDemo (demo.js:226:11)
    at demo.html:7:168
    at onload (demo.html:8:4)

This is a trace created using the promise trace library I wrote last week to trace down some issues with a chrome extension I was working on. A promise trace is made up of segments, each one a stack trace identifying a promise. The top segment is the error that caused the failure, in this case exactly the same timeout as before. The next one is the promise created by withTimeout that fails when the timeout is exceeded. The bottom segment is the select between the operation and the timeout, and going to demo.js:226 will tell you which operation it was.

The trace above is from chrome; here’s the result of the same failure but in firefox

Operation timed out
--- triggering the failure of ---
GenericTraceSegment()@trace.js:197
Promise()@promise.js:23
([object Object],100)@promise.js:106
runDemo()@demo.js:226
onload([object Event])@demo.html:1
--- triggering the failure of ---
GenericTraceSegment()@trace.js:197
Promise()@file:promise.js:23
([object Object],[object Object])@promise.js:50
([object Object],100)@promise.js:110
runDemo()@demo.js:226
onload([object Event])@demo.html:1

It’s a bit more cluttered since the firefox stack trace api doesn’t support stripping irrelevant top frames but both contain the relevant clues you need when debugging.

Both traces contain a fair amount of redundant information. For instance, the two bottom promises are created close to each other so the bottom of their stack traces are identical. To make the traces easier to read the promise trace library folds away overlapping stack traces. The actual trace that would be printed for the above error is

Operation timed out
    at promise.js:108:18
--- triggering the failure of ---
    at Function.withTimeout (promise.js:106:17)
    at runDemo (demo.js:226:11)
    at demo.html:7:168
    at onload (demo.html:8:4)
--- triggering the failure of ---
    at Function.select (promise.js:50:16)
    at Function.withTimeout (promise.js:110:18)
    at runDemo (demo.js:226:11)
    (... rest of trace same as previous ...)

This is probably still more cluttered and redundant that it needs to be – in my experience for each segment you only need to know the top frame below the call into the promise library and the bottom frame that is different from the previous segment. But I’m not sure that’s true in all cases so for now this is as small as I’d like to make it.

Getting promise traces

How does this api work? From a user’s perspective all you see is that on failure your callback is given two values: the error and a trace. You’d typically use it something like this:

myOperation.onFailed(function (error, trace) {
  console.log(trace.toString());
});

This requires some support from the promise library but not a lot actually. Whenever a promise is created it has to create a PromiseTraceSegment value and when propagating a failure it should create and propagate a chain of PromiseTrace objects which collect the relevant segments. In my own promise library the tracing code makes up maybe ten lines.

Performance

The way this is implemented, as is probably evident from the examples above, is by capturing a stack trace every time a promise is created (there’s also a flag to disable trace capturing altogether). The whole thing is actually pretty straightforward, the whole trace library is ~200 lines of which most are for formatting the output. Capturing stack traces is not super expensive generally but even relatively cheap operations add up if you do them often enough. To get a sense for how expensive tracing is I wrote a simple ludicrously asynchronous implementation of the Fibonacci function:

function lazyFib(n) {
  if (n == 0 || n == 1) {
    return Promise.of(1);
  } else {
    return Promise.deferLazy(function () {
      return Promise
          .join(lazyFib(n - 2), lazyFib(n - 1))
          .thenApply(function (a, b) { return a + b;});
    });
  }
}

I then tried running it with and without promise tracing. In chrome tracing makes it around 60% slower on average. In firefox the cost is only around 20% but then this program was a lot slower overall. I believe this is because in firefox the event queue runs at a slower rate by design. In safari you can enable tracing and it’s essentially free but in the version of safari I tried you can’t actually get a stack trace so the promise traces produced are useless. I believe stack trace support is coming to safari too though.

This example is the absolute worst case performance cost; nobody would write as promise-heavy a program as that example. To also get a sense for the impact on a more realistic program I wrote a larger benchmark, one that simulates an RPC system with a string repository that holds a large number of strings which have to be fetched asynchronously, a few at a time, and then processed by a chain of other asynchronous operations. It was still ridiculously asynchronous, much more than any realistic program, but at least not quite as trivial. This time the performance cost in chrome was 0.8% on average. Statistically significant but tiny.

Conclusion

Debugging asynchronous JavaScript programs is difficult. However, adding support to libraries that implement asynchronous abstractions like promises is straightforward and the overhead of collecting traces is likely to be insignificant for any realistic JavaScript program.