Final

In his recent Java 7 wish list, Peter Ahé attributes a proposal to me: a shorthand syntax for variable declaration. In this post I’ll give a bit of background on where that idea came from and why I think it deserves to make it into Java 7.

The idea is really very simple: when you declare a final variable with an initializer you are allowed to leave out the type of the variable; if you do, the type will be inferred from the initializer expression. Peter’s example is this:

final strings = new ArrayList<String>();

The variable strings will get the type ArrayList<String> because that’s the type of the expression used to initialize it. That’s all there is to it. The reason this idea is even worth writing about is not the mechanism in itself but the fact that a number of really nice properties fall out if you link type inference and final together.
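The proposed form itself is not valid Java, so it cannot be run as written. As a rough sketch only, and assuming a Java 10 or later compiler, `final var` is the nearest runnable approximation; the class and method names below are made up for illustration.

```java
import java.util.ArrayList;

class FinalInferenceSketch {
    // Declares a list without writing its type and reports its runtime class.
    static String inferredTypeName() {
        // The static type of `strings` is inferred from the initializer,
        // so it is ArrayList<String>, not just List<String>.
        final var strings = new ArrayList<String>();
        strings.add("bleep");
        strings.ensureCapacity(16); // ArrayList-specific method, still available
        // strings = new ArrayList<String>(); // would not compile: strings is final
        return strings.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(inferredTypeName());
    }
}
```

Note that `var` on its own also permits reassignment, which is exactly the combination the rest of this post argues against; only the `final var` form matches the idea described here.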

But first I have to confess that this may not have been my idea at all. All I know is that it came out of a discussion I had with Kasper Lund last winter while we worked together at Esmertec. I’ve since talked with Kasper about it and neither of us can remember who actually suggested it.

The idea came out of my code style at the time. In his (semi-)recent follow-up rant on agile programming, Steve Yegge talks about slob type systems and neat-freak type systems and the slobs and neat-freaks among programmers who prefer either one or the other. Well, I am, emphatically, a neat freak (as Erik saw fit to point out recently). If you take a look at some of my source code I’m sure you’ll agree. I need my type declarations, my @Override annotations and, more than anything, I need my final variables. I can say without blushing that I was probably the biggest fan of final variables on the planet. Unfortunately, writing final on all variables whose value doesn’t change is really verbose. Hence the idea to leave out the part of a final variable declaration that is actually redundant: the type. That is, in a nutshell, where the idea came from. But there’s more to it.

Final

Historically, the concept of local variables has been tied to mutability. It’s even part of the name: it’s called a local variable, not a local constant. It’s only natural, really: local definitions (to use a mutable/immutable-neutral word) have historically been implemented as locations on the call stack, which makes it trivial to make them mutable. Why not give programmers added flexibility by default and allow them to change them? Well, there is actually good reason not to.

What is a local “variable”? It’s a location or slot where you can store different values over time. Consider this piece of code:

String myMethod() {
    String str = "bleep";
    ...
    ... // many lines of code
    ...
    return str;
}

Say you’ve run into a NullPointerException somewhere and suspect that this method might return null, that is, that str is null when the method returns. It starts out non-null but since it’s a variable you have to read and understand, at least to some degree, all the code between the declaration and the return. Maybe there are no assignments, in which case you know the value is still "bleep" when the method returns. I would venture a guess that across all the world’s Java code that is the common case: variables are usually only written once. But you don’t know until you’ve read the whole method.

Variables that are initialized and then never changed are a different beast from variables that may be written multiple times, and they are much simpler. When you look at the declaration of such a variable you know not only the type of the variable but also the value. If you knew in advance in the code above that str doesn’t change, you could safely ignore all other code in the method and focus on the value used to initialize str, since you know that that’s the value that will eventually be returned. The initializer may be a complex expression, sure, but it is still a simpler problem to figure out the value of an expression than to figure out the value of an expression and figure out how and when the variable might change its value. You’re not creating a slot where you can store different values but just computing a value and then giving it a name.
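To make the difference concrete, here is a small sketch in plain Java. The class name, the method names and the `rareCondition` parameter are invented for the example; the elided "many lines" are represented by comments.

```java
class WriteOnceSketch {
    // Mutable version: the return value cannot be known without
    // reading everything between the declaration and the return.
    static String mutableVersion(boolean rareCondition) {
        String str = "bleep";
        // ... imagine many lines of code here ...
        if (rareCondition) {
            str = null; // an easily missed reassignment
        }
        // ... and many more here ...
        return str;
    }

    // Write-once version: the declaration alone tells you what is returned.
    static String finalVersion() {
        final String str = "bleep";
        // ... many lines of code ...
        // str = null; // would not compile: str is final
        return str;
    }

    public static void main(String[] args) {
        System.out.println(mutableVersion(true));  // the hidden assignment wins
        System.out.println(finalVersion());
    }
}
```

In the second method the compiler, not the reader, carries the burden of checking that nothing in the elided code touches `str`.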

That’s why I started writing final on all variables whose values never changed. It makes it a lot easier to skim through code and understand what is going on. In particular, since most variables will be final, it actually makes non-final variables stand out. After I’d used this for a while I stopped noticing final, since that’s the common case, and instead started to notice non-final variables. Maybe it has never been a problem for you to distinguish between variables that change and variables that don’t, but there’s no question that when you read a piece of code you have to dedicate a few mental cycles to sorting this out whenever the code refers to a variable. And in my experience mental cycles are an extremely precious resource when reading code, which is why I’m such a neat-freak (well, that and plain OCD). If writing final or @Override saves cycles for future readers of my code I’ll be inclined to do it; that future reader is likely to be myself.

Writing final everywhere is verbose, though. My god is it verbose. I have always had little patience for people who think that the fewer keystrokes it takes to write a program the better. Needless to say, Perl is not my favorite language. In The Mythical Man-Month, Fred Brooks says that programmers seem to generate about the same amount of code per day regardless of the language they use (which is bad news if 25% of the code you write is the word final). Had it been someone other than Fred Brooks I’d have called it rubbish. Paul Graham says that he has seen little to contradict this hypothesis. I have no problem calling that rubbish. The logic and abstractions used in a program, those take time to design and understand. Sitting down in front of a keyboard and pushing buttons isn’t what takes the time.

Having said that, though, there’s such a thing as ad absurdum, and so a few months ago I decided that it was enough with the final. It makes the code easier to read, sure, but I’m afraid if I kept it up I’d have arthritis by the time I was 30. It was quite a relief and I doubt I’ll ever revert to my old coding style. But it is frustrating: I felt I was forced to do the Wrong Thing just because the Right Thing was too verbose even for me. Bugger.

Inference

But let’s turn to another source of sore fingers: generics. A declaration like this

HashMap<String, List<String>> map = new HashMap<String, List<String>>();

is around 50% redundant. And it gets worse, much worse. So obviously people have been looking for a way to remove part of this redundancy. I’ve heard several different suggestions. You can use inference to guess the type arguments of the new expression:

HashMap<String, List<String>> map = new HashMap();

I think this proposal is a bit confusing and it has the limitation that it only really works with new expressions. On the plus side it allows you to use different types for the variable and the constructor:

Map<String, List<String>> map = new HashMap();
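For a concrete feel for this first option: Java 7 and later compilers accept a closely related form in which an empty `<>` on the constructor asks for the type arguments to be inferred from the declared type. The sketch below is only an illustration of that idea; the class name, method name and map contents are invented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DiamondSketch {
    static Map<String, List<String>> buildIndex() {
        // Type arguments on the right are inferred from the left-hand side.
        // The variable has an interface type, the object is a HashMap.
        Map<String, List<String>> map = new HashMap<>();
        List<String> greetings = new ArrayList<>();
        greetings.add("bleep");
        map.put("greetings", greetings);
        return map;
    }

    public static void main(String[] args) {
        System.out.println(buildIndex());
    }
}
```

The limitation noted above still applies: this only helps when the initializer is a new expression.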

Another option is to use a shorthand syntax that resembles C++’s stack allocated object syntax:

HashMap<String, List<String>> map();

This syntax is also limited to object creation, it (sort of) collides with the method declaration syntax, and it has the added limitation that the type of the variable must be a class type, which is unfortunate if you want to assign other values to the variable. On the other hand, if the variable is immutable I actually think it’s a pretty reasonable syntax.

Finally, you can infer the variable type from the initializer:

final map = new HashMap<String, List<String>>();

or, with the syntax proposed by James Gosling,

map := new HashMap<String, List<String>>();

The advantage of this is that the syntax is not limited to variables initialized with a new expression, and the type of the variable can be an interface type. On the minus side, it can become difficult to figure out what the type of a variable is:

final obj = myMethod();

The most important thing about unifying type inference and final, though, is that it avoids a number of problems you otherwise have when mixing type inference and mutability (which is a well-known source of trouble).

A mutable variable can be assigned any number of times during its lifetime and it’s not a given that it makes sense to just use the type of the initial assignment as the type of the variable. Imagine you’re doing something like:

static Properties getProperties(String name) { ... }

props := getProperties("qbert");
if (props == null)
    props = Collections.emptyMap(); // error

This code looks pretty reasonable but it doesn’t type check because the initial assignment gives props the type Properties when the intended type was actually the less specific Map (which Properties is a subtype of). It is also fragile. You could imagine that getProperties had originally been written with the return type Map, in which case the code above would work. If you then later decided that the type could be narrowed to Properties, which seems pretty innocent, you would unexpectedly break the code.
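Since `:=` is only a proposal, here is a runnable sketch of the same situation in plain Java with the intended type written out. The body of `getProperties` is a hypothetical stand-in (the real one is elided above), as are the class name and the "lives" entry.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

class MutableInferenceSketch {
    // Hypothetical stand-in: returns null for unknown names.
    static Properties getProperties(String name) {
        if (!"qbert".equals(name)) {
            return null;
        }
        Properties p = new Properties();
        p.setProperty("lives", "3");
        return p;
    }

    static Map<Object, Object> lookup(String name) {
        // Writing the less specific type keeps the reassignment legal.
        // If the type were inferred as Properties from the initializer,
        // the assignment below would be rejected by the compiler.
        Map<Object, Object> props = getProperties(name);
        if (props == null)
            props = Collections.emptyMap();
        return props;
    }

    public static void main(String[] args) {
        System.out.println(lookup("qbert"));
        System.out.println(lookup("nobody"));
    }
}
```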

On the other hand, if you only assign to a variable once you can never go wrong in using the type of that value as the type of the variable. If you were later to narrow the type of a variable or method used to initialize such a variable you wouldn’t risk breaking code that depends on the type not being overly specific. Also, I would say that adding another level of complexity to mutable variables is not doing anyone a favor. At least, if you were to do that it should be through a syntax that stood out more than just adding a single colon (:=) which is easily overlooked. But that’s a matter of taste more than anything.
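A sketch of the write-once alternative, again assuming a Java 10 or later compiler so that `final var` can stand in for the proposed syntax. The `getProperties` body, the class name and `countEntries` are invented for the example.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

class WriteOnceInferenceSketch {
    // Hypothetical stand-in: returns null for unknown names.
    static Properties getProperties(String name) {
        if (!"qbert".equals(name)) {
            return null;
        }
        Properties p = new Properties();
        p.setProperty("lives", "3");
        return p;
    }

    static int countEntries(String name) {
        // Inferred as Properties. Narrowing or widening the return type of
        // getProperties changes this type but cannot break an assignment,
        // because there is no later assignment to break.
        final var loaded = getProperties(name);
        // The fallback gets its own write-once name, with the type spelled
        // out because here it is deliberately less specific.
        final Map<Object, Object> props =
                (loaded != null) ? loaded : Collections.emptyMap();
        return props.size();
    }

    public static void main(String[] args) {
        System.out.println(countEntries("qbert"));
        System.out.println(countEntries("nobody"));
    }
}
```

The explicit type appears only where it carries information, which is the judgment call this proposal leaves to the programmer.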

Conclusion

I think there’s a very good chance that a variable type inference mechanism will be added in Java 7. Making it easier to declare final variables would be a good thing and linking type inference with final solves a number of problems. There’s a risk that code may become harder to read because this proposal not only allows you to leave out type declarations when they’re redundant but also when they’re not. In most languages you could just trust programmers to exercise judgment when deciding whether or not to leave out variable types. Java’s design philosophy is based on not trusting people to make that kind of decision so introducing a mechanism like this would be a bit of a departure for this language. I hope that won’t end up standing in the way of introducing this or a similar mechanism.

20 Responses to Final

  1. Hm. I was deeply unconvinced when you first mentioned it (I’ve rather gone off type inference in the last year or so), but by the time I got to the end I realised that this is actually a really good idea. It eliminates a big source of verbosity nearly for free, and as a happy side effect encourages Java programmers to use more immutable structures. 🙂

    I note that you often declare method arguments and fields to be final. Presumably type inference wouldn’t work in those cases? (Fields you could manage, but I don’t think it’s a good idea. Trying to do type inference on arguments in Java seems to be asking for trouble)

  2. I think type inference could work for final fields the same way it works for variables as long as it’s only used when the initializer is part of the declaration. I wouldn’t allow it in any other cases.

    As for final parameters there’s nothing you can really do, the type has to be there. But in that case it’s actually the ‘final’ keyword that’s redundant since no sensible programmer ever assigns to parameters. Maybe the way to go is to deprecate assignments to parameters and eventually disallow them completely. Of course that would never happen.

  3. True. There’s no real reason why one can’t do type inference on final fields with declared initializers (I have this problem at the moment where I keep failing to properly switch between thinking in Nice and Java. In Nice there would be a problem, but that’s ok because the obvious analogue to use type inference in is ‘let’ declarations rather than final).

    I think the comment on parameters was inspired by the fact that I’ve seen you doing that in the codebase for Tedir. Maybe I’m misremembering though.

  4. No you’re right, I do use final for parameters (or I did at least). I don’t think the proposal would help though, you have to write the type.

  5. In some cases it makes sense to assign to parameters, because they pollute the namespace.

    E.g., given a method that takes x and y, then scales them up and works some things out with them, it’s more readable to assign to the parameters than to make variables called newX, newY, etc.

    I have my IDE make everything final that can be; don’t get me wrong, final expresses intent very well, but as Java doesn’t allow a local variable to shadow a parameter, assigning to parameters is sometimes appropriate.

  6. There’s absolutely no need for this.

    An IDE can establish that something is as good as final (never changes) and can thus colour or italicize or some such to indicate it. If it’s a good idea, that’s how it should be handled.

    An IDE can also automatically give you the left end of an assignment – and show a popup box with all relevant abstractions.

    e.g. in Eclipse there’s the quickfix – assign to local variable. You just write only the right side (e.g: new HashMap() on a blank line), then use the quickfix key shortcut (usually CMD or CTRL +1). This will then give you a popdown with HashMap, Map, and Object (the three relevant options that make sense).

    Let’s turn this argument on its head first – how would you feel about READING such verbose code? From what I can sense in your argumentation, you don’t mind at all and in fact prefer the wordiness when looking at code, you just think it’s too much to type.

    So – get to know your IDE and your problems are solved, the ‘right’ way.

  7. What if you have:

    SomeInterface someMethod() {…}

    void myMethod()
    {
        final foo = someMethod();
    }

    What concrete type should foo have in that case?

  8. David R. MacIver wrote In Nice … the obvious analogue to use type inference in is ‘let’ declarations rather than final

    In Nice ‘let’ declarations /do/ use type inference:
    let Map map = new HashMap();

    And so do ‘val’ declarations in Scala:
    val d: HashMap[String,String] = new HashMap()

    And in both Nice and Scala the syntax is not limited to constants and variables initialized with a new expression – there are plenty of examples of type inference for method local constants and variables on the computer language shootout.

    So with those languages it’s pretty easy to explore what it actually feels like to use type inference for method local constants and variables.

  9. I really can’t understand this. Type inference is always bad thing, final or not. A smart compiler can accurately infer variable’s type if it’s final and it can even try to make good on a non-final… But what is really important, is that the reader be able to know your intended type!

    If you just instantiate ArrayList and let compiler do the rest there is no way for me to know whether you just need a List or intend to use some specific services only ArrayList provides. Of course, collections are a trivial example, but business logic objects are not.

    If you’re required to declare the type, and you’re doing it correctly whenever I read your code it’s easier for me to understand it, and when I have to modify it I have a clear choice: to accept your decision that it be this general type, or to explicitly decide something more specific is needed and modify the declaration.

    Of course, making type inference on finals available will make no harm to compile type checking, but it is not what is important. It will for sure make serious harm to people who try to understand your code.

    So don’t do this just because it is possible.

    If you really want to improve Java think about what to do with such brain damaged idiom Java forces on me every now and then:

    {
        Type v = null;
        try {
            v = something();
            // …
        } finally {
            v.cleanup();
        }
    }

    C# has

    using (Type v = something()) { /* … */ }

    and such a thing would be a really nice addition to Java, reducing gratuitous verbosity and not the helpful kind, such as explicitly declaring your intentions.

  10. Reinier: An IDE can highlight variables that don’t change and I’m sure that can be very useful. For instance, that helps you if you read somebody else’s code that doesn’t use final variables.

    However, there’s a big difference between having an IDE infer information for you and actively putting information into the source code. For instance, if you write ‘final’ in your source code the compiler can check that you use the variable accordingly. An IDE can infer that a variable is final but it can’t check that a variable that you meant to be final is only written once. For that, you have to state that intention explicitly. And in any case, as I argue in the post, you need the variables to be explicitly final if you want a sound model for variable type inference.

    I may not have stated this clearly but I really do mind reading overly verbose and redundant code as much as I mind writing it. Eclipse makes it easier to write redundant code but that doesn’t mean that redundant code isn’t a problem.

  11. albert: I agree to some degree. If you write

    final obj = whatever();

    then it’s harder to figure out what the type of obj is which may make the code much harder to understand. But I think it’s the programmer’s responsibility to write the type explicitly if the code is hard to understand if the type is left out.

    The reason I think this construct is a good idea is that there are actually many cases where the type can be left out without making the code harder to read. For instance:

    final objs = new ArrayList();

    Here, there is no confusion — objs is clearly an ArrayList. In that case, and many similar cases, you can use type inference and still have readable code. I believe this happens often enough that it is useful to have type inference and I don’t think most people will abuse it to write unreadable code.

    The best way to tell is probably to look at how similar mechanisms are used in other languages with variable type inference such as the languages mentioned by Isaac, Nice and Scala.

  12. … But, Cristian, your whole rant is about getting arthritis.

    You’re absolutely right. “final” is a specific intent and it helps a lot if your compiler/IDE helps to point out you just broke your intent.

    However, in your article, again, you take some time to highlight the problem where not using final forces you to read a lot of code. Not with a properly configured IDE.

    I’m confused about your point, in other words. Anyway, back to ‘final’:

    ‘final’ being relieved of the duty of simply helping clarify it won’t change means ‘final’ can be safely left to those situations where you really really care that no one messes with that variable over the course of its scope.

    That’s a big change. There are lots of variables that I don’t mind if they change, but in the current implementation of the method, don’t change. This is very useful information, but tacking “final” onto everything just to make it obvious is annoying, and given a proper IDE, useless.

    If for a variable I have in mind that it shouldn’t change – that my further implementation code cares about this – that’s rather specific. So, manually type it out – type out “final”. Or let eclipse do this for you. (using a template you can reduce it to a single key combination).

    I also think that when I read code, I want to see those intentions: I want to SEE the intention that it never change (final), I want to SEE the operative intent (Collection, List or ArrayList?), and I want to SEE the implementation chosen (LinkedList) but I don’t care as much about that.

    The only superfluous information in this rather lengthy declare (replace all {{ with angular brackets – trying to avoid the HTMLification here):

    final List{{String}} userNames = new ArrayList{{String}}();

    is the doubling up of {{String}}. Which is why I suggest a change in language handling:

    1. to FORCE raw types, use the {{}} construct. e.g. new ArrayList{{}}; //raw type.

    2. Just plain leaving off the generics parameters entirely (which currently forces a raw type on constructors) does inference, exactly the same way static methods ALREADY do.

    Here’s an example:

    List{{String}} list = Collections.emptyList(); //TOTALLY LEGAL, no warnings!

    Collection{{String}} list = Collections.singleton("woot!"); //LEGAL! no warnings!

    I -love- static method inference. In fact, if I just write Collections.singleton("woot!");, eclipse can generate Collection/Set{{String}} for me.

    In fact, I have rewritten most of my constructors in generics-heavy libraries to use static methods instead:

    Equals.make(USERS_ID, COMMENTS_USER_REF) instead of new Equals{{Unid}}(USERS_ID, COMMENTS_USER_REF).

    But this does get annoying, and when creating people instinctively write new (classname) and engage auto-complete, which won’t help in such cases.

    NB: For backwards compatibility reasons, it’s been my opinion for a while now that for any backwards-compatibility-breaking change for which you can write a ‘wizard’ that translates old code to new code 100% without user input and with 100% accuracy (0 false negatives and positives), just go ahead, fix the language, and write that wizard. If you must, default the -source level to the previous release as was done with the whole assert/java 1.4 thing. This idea allows you to fix a metric clusterfuck of java mistakes.

  13. Albert: The closures proposal is already dealing with the ‘using’ idea. This one is independent of that.

  14. ricky: Which closures proposal? There were few but I stopped following this issue at some point.

    In my opinion closures are harmful and should be eliminated from the language in their current form (i.e. leaking finals to anonymous classes), not extended to make it even worse. If a method needs an object, just pass it, don’t let it leak.

    christian: Of course, I can determine type of the right side of assignment, that is no problem. What I cannot determine, is the type author intended to use. Do you just want a Collection, or is it a List? Or do you need some specific services only ArrayList can provide? I can’t know that unless I carefully read the following code.

    You say a decent programmer would declare this type if omitting declaration makes the code harder to read… But hey, isn’t it obvious, the type? At least it seems so, when writing the code. Might not be so obvious when reading it a few months afterwards. Especially if the right-side determined type is a class which has a fair number of superclasses up, and implements 3 more interfaces (which extend other interfaces, too).

  15. Albert: The BGGA proposal, see http://javac.info

    “Closures are harmful”. You haven’t actually given any reasons. The leakage you speak of isn’t leakage, it’s explicit. You pass a closure to a method, and you’re explicitly giving the current context away.

    I’d suggest getting familiar with Neal Gafter’s blog.

  16. Anton: If the author really intends the type of a variable to be List or Collection and not ArrayList he’s free to just write that. For final variables you don’t really gain much by discarding type information and I think the common case is that the type of the variable is the same as the type of the initializer. In that case the author’s intention really is for the types to be the same and can use type inference.

    The argument that a programmer cannot be trusted to write readable code can be used against any language construct. You cannot make a language safe against bad programmers, they could write unreadable code in Java 1.0 and can still do that in Java 7.0. I believe most programmers will be able to determine when to use this and when not to.

    Java tries to protect you against yourself under the assumption that you’re a bad programmer, which I think is at the root of many bad design decisions. In that regard, I think this is a step in the right direction.

  17. Reinier: Your suggestion sounds a lot like the proposal described here.

    As for writing vs. inferring final, I sort of agree with you when considering the current version of Java. But when you combine explicit final and type inference you get some benefits that you don’t get when final is inferred by the IDE.

  18. I agree with your comments and would like to enter the official world competition for the most use of final. I personally use final locals including method arguments virtually always, loops are about the only exception!

    With regard to type inference and final; this has been suggested before, e.g. I lodged RFE 6389769, about 12 months ago and it has the same suggestion. And I very much doubt that I was the first to make the observation that the type was redundant.

  19. Ricky: But yes, it’s a leakage. See this:
    public void someMethod() {
        final int x = 5;
        withWhatever(new FancyInterface() {
            public int closingOver() {
                return x + 4;
            }
        });
    }

    I can’t see x passed to closingOver() anywhere, can you? But it is taken, it exists there despite it’s never declared for that method. Then it is a leakage, even if you make up fancy names like “closure”.

    Christian: You miss the point entirely. If you’re good programmer, a smart one, you can code in Lisp and make a fortune out of it. Look at Graham, he’s very smart, Graham. You smart programmers need macros, not type checking anyway.

    What Java does, in fact, is not protecting a bad programmer against themselves, it is protecting most programmers (Graham calls them “mediocre” or “average”) against good and smart ones.

    You cannot protect against a really bad programmer. I saw much code written by bad programmers, and see no way anything could protect against. But you see, it was not _unreadable_, their code, not. It featured bad design, it featured crazy things like bare data structures everywhere, methods returning null to indicate success, methods passing their actual return value in an exception they were throwing and other really scary things. But it was never unreadable, in fact it was often very easy to read.

    But you can educate a bad programmer and make them an average one or even a good one if they want to listen and learn – and if they do not, they should have two choices only – to resign or get fired.

    What we really need is to protect against _good_ programmers, smart ones who think they don’t need type or variable declarations, and, which is even worse, they are right. Java does a good job here – it makes you declare nearly everything even if _you_ don’t need it. So your code is readable even to pea brains like me 😉 Because every line of unreadable code I saw was written by a good smart programmer. It often featured good design, better than I would think of, nice OO features… it was just unreadable. Not to them, to the rest of us.

    What I assert is that this protection, which makes smart programmers cry and weep, and declare and write openly what they want and do, is one of the reasons Java was ever so successful. Its success is not about making bad programmers less bad, it is about making good programmers less good. It might sound suicidal, to do this, but it is what is needed because no real project is done by one person, or by a few persons so smart, so it is better off to have all people speak the same than to have an outstanding geek who is better but does not speak to lowlings 😉

    Look at it, C# has more and more features and it wants to kill Java, but the more features it gets, it seems farther from taking over. Ever thought why? It is because you don’t need new shiny features, you need common ones so the enterprise wants you. Too bad Sun doesn’t understand this and they go into this race they cannot win.

    We are talking about Java7 here, but the enterprise still uses 1.4 and shuns even 1.5…

  20. Thanks to author for this article. Very interesting. Write more!
