Final
In his recent Java 7 wish list Peter Ahé attributes a proposal to me: a shorthand syntax for variable declarations. In this post I'll give a bit of background on where that idea came from and why I think it deserves to make it into Java 7.
The idea is really very simple: when you declare a final variable with an initializer, you are allowed to leave out the type of the variable; if you do, the type will be inferred from the initializer expression. Peter's example is this:

final strings = new ArrayList<String>();

The variable strings will get the type ArrayList<String> because that's the type of the expression used to initialize it. That's all there is to it. The reason this idea is even worth writing about is not the mechanism in itself but the fact that a number of really nice properties fall out if you link type inference and final together.
But first I have to confess that this may not have been my idea at all. All I know is that it came out of a discussion I had with Kasper Lund last winter while we worked together at Esmertec. I've since talked with Kasper about it and neither of us can remember who actually suggested it.
The idea came out of my code style at the time. In his (semi-)recent follow-up rant on agile programming, Steve Yegge talks about slob type systems and neat-freak type systems, and the slobs and neat-freaks among programmers who prefer one or the other. Well, I am -- emphatically -- a neat freak (as Erik saw fit to point out recently). If you take a look at some of my source code I'm sure you'll agree. I need my type declarations, my @Override annotations and, more than anything, I need my final variables. I can say without blushing that I was probably the biggest fan of final variables on the planet. Unfortunately, writing final on all variables whose value doesn't change is really verbose. Hence the idea to leave out the part of a final variable declaration that is actually redundant: the type. That is, in a nutshell, where the idea came from. But there's more to it.
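To make the verbosity concrete, here is a sketch of a couple of declarations as you have to write them today and as they would read under the proposal (buildIndex is a made-up method, used purely for illustration):

// Today: the type must be spelled out even though it is fully
// determined by the initializer.
final Map<String, List<String>> index = buildIndex();
final StringBuilder buffer = new StringBuilder();

// Under the proposal the types would be inferred:
//   final index = buildIndex();
//   final buffer = new StringBuilder();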
Final
Historically, the concept of local variables has been tied to mutability. It's even part of the name: it's called a local variable, not a local constant. It's only natural, really: local definitions (to use a mutability-neutral word) have historically been implemented as locations on the call stack, which makes it trivial to make them mutable. Why not make them mutable by default and give programmers the added flexibility of changing them? Well, there is actually good reason not to.
What is a local "variable"? It's a location or slot where you can store different values over time. Consider this piece of code:
String myMethod() {
  String str = "bleep";
  ...
  ... // many lines of code
  ...
  return str;
}
Say you've run into a NullPointerException somewhere and suspect that this method might return null, that is, that str is null when the method returns. It starts out non-null, but since it's a variable you have to read and understand, at least to some degree, all the code between the declaration and the return. Maybe there are no assignments, in which case you know the value is still "bleep" when the method returns. I would venture a guess that across all the world's Java code that is the common case: variables are usually only written once. But you don't know until you've read the whole method.
Variables that are initialized and then never changed are a different beast from variables that may be written multiple times, and they are much simpler. When you look at the declaration of such a variable you know not only the type of the variable but also the value. If you knew in advance in the code above that str doesn't change, you could safely ignore all the other code in the method and focus on the value used to initialize str, since you know that that's the value that will eventually be returned. The initializer may be a complex expression, sure, but it is still a simpler problem to figure out the value of an expression than to figure out the value of an expression and figure out how and when the variable might change its value. You're not creating a slot where you can store different values, just computing a value and then giving it a name.
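For contrast, here is the same method with the declaration marked final; this is just the example from above with one keyword added, so a reader can skip straight from the declaration to the return:

String myMethod() {
  final String str = "bleep";
  ...
  ... // many lines of code, none of which can reassign str
  ...
  return str; // necessarily "bleep"
}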
That's why I started writing final on all variables whose values never changed. It makes it a lot easier to skim through code and understand what is going on. In particular, since most variables will be final, it actually makes non-final variables stand out. After I'd used this for a while I stopped noticing final, since that's the common case, and instead started to notice non-final variables. Maybe it has never been a problem for you to distinguish between variables that change and variables that don't, but there's no question that when you read a piece of code you have to dedicate a few mental cycles to sorting this out whenever the code refers to a variable. And in my experience mental cycles are an extremely precious resource when reading code, which is why I'm such a neat-freak (well, that and plain OCD). If writing final or @Override saves cycles for future readers of my code I'll be inclined to do it; that future reader is likely to be myself.
Writing final everywhere is verbose, though. My god is it verbose. I have always had little patience for people who think that the fewer keystrokes it takes to write a program the better. Needless to say, Perl is not my favorite language. In The Mythical Man-Month, Fred Brooks says that programmers seem to generate about the same amount of code per day regardless of the language they use (which is bad news if 25% of the code you write is the word final). Had it been someone other than Fred Brooks I'd have called it rubbish. Paul Graham says that he has seen little to contradict this hypothesis. I have no problem calling that rubbish. The logic and abstractions used in a program, those take time to design and understand. Sitting down in front of a keyboard and pushing buttons isn't what takes the time.
Having said that, though, there's such a thing as ad absurdum, and so a few months ago I decided that I'd had enough of final. It makes the code easier to read, sure, but I was afraid that if I kept it up I'd have arthritis by the time I was 30. Dropping it was quite a relief and I doubt I'll ever revert to my old coding style. But it is frustrating: I felt I was forced to do the Wrong Thing just because the Right Thing was too verbose even for me. Bugger.
Inference
But let's turn to another source of sore fingers: generics. A declaration like this
HashMap<String, List<String>> map = new HashMap<String, List<String>>();
is around 50% redundant. And it gets worse, much worse. So obviously people have been looking for a way to remove part of this redundancy. I've heard several different suggestions. You can use inference to guess the type arguments of the new expression:
HashMap<String, List<String>> map = new HashMap();
I think this proposal is a bit confusing and it has the limitation that it only really works with new expressions. On the plus side it allows you to use different types for the variable and the constructor:
Map<String, List<String>> map = new HashMap();
Another option is to use a shorthand syntax that resembles C++'s stack allocated object syntax:
HashMap<String, List<String>> map();
This syntax is also limited to object creation, it (sort of) collides with the method declaration syntax, and has the added limitation that the type of the variable must be a class type, which is unfortunate if you want to assign other values to the variable. On the other hand, if the variable is immutable I actually think it's a pretty reasonable syntax.
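The collision with method declarations is easy to see; in an interface, for instance, the exact same line already means something else (a sketch of my own, not part of the proposal):

interface Example {
  // Today this line declares a method named map returning a
  // HashMap<String, String> -- not a variable.
  HashMap<String, String> map();
}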
Finally, you can infer the variable type from the initializer:
final map = new HashMap<String, List<String>>();
or, with the syntax proposed by James Gosling,
map := new HashMap<String, List<String>>();
The advantage of this is that the syntax is not limited to variables initialized with a new expression, and the type of the variable can be an interface type. On the minus side, it can become difficult to figure out what the type of a variable is:
final obj = myMethod();
The most important thing about unifying type inference and final, though, is that it avoids a number of problems you otherwise have when mixing type inference and mutability (which is a well-known source of trouble).
A mutable variable can be assigned any number of times during its lifetime and it's not a given that it makes sense to just use the type of the initial assignment as the type of the variable. Imagine you're doing something like:
static Properties getProperties(String name) { ... }

props := getProperties("qbert");
if (props == null)
  props = Collections.emptyMap(); // error
This code looks pretty reasonable but it doesn't type check, because the initial assignment gives props the type Properties when the intended type was actually the less specific Map (which Properties is a subtype of). It is also fragile. You could imagine that getProperties had originally been written with the return type Map, in which case the code above would work. If you then later decided that the return type could be narrowed to Properties, which seems pretty innocent, you would unexpectedly break the code.
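In today's Java you sidestep this by declaring the intended, less specific type explicitly. A sketch of the same logic that does type check, using Map<Object, Object> since that is the Map type Properties actually implements:

// The declared type fixes props as a Map, no matter how specific
// the return type of getProperties is or later becomes.
Map<Object, Object> props = getProperties("qbert");
if (props == null)
  props = Collections.emptyMap();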
On the other hand, if you only assign to a variable once, you can never go wrong in using the type of that value as the type of the variable. If you were later to narrow the type of a variable or method used to initialize such a variable, you wouldn't risk breaking code that depends on the type not being overly specific. Also, I would say that adding another level of complexity to mutable variables is not doing anyone a favor. At least, if you were to do that, it should be through a syntax that stood out more than just adding a single colon (:=), which is easily overlooked. But that's a matter of taste more than anything.
Conclusion
I think there's a very good chance that a variable type inference mechanism will be added in Java 7. Making it easier to declare final variables would be a good thing, and linking type inference with final solves a number of problems. There's a risk that code may become harder to read, because this proposal allows you to leave out type declarations not only when they're redundant but also when they're not. In most languages you could just trust programmers to exercise judgment when deciding whether or not to leave out variable types. Java's design philosophy is based on not trusting people to make that kind of decision, so introducing a mechanism like this would be a bit of a departure for the language. I hope that won't end up standing in the way of introducing this or a similar mechanism.