Equals
I recently discovered that the seemingly simple problem of comparing objects for equality is, if you look closely, very hard, if not impossible, to solve satisfactorily. I looked around a bit to see if this was a well-known issue, and it turns out that I'm certainly not the first person to notice it. See for instance here and here. This post introduces the problem (or one of the problems) of implementing object equality methods.
I have a standard template for implementing the equals method in Java:
public class X ... {
    // ... body of X ...
    public boolean equals(final Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof X)) return false;
        final X that = (X) obj;
        // X-specific comparison
    }
}
Unfortunately it turns out that implementing equals using instanceof doesn't work in general. Let's say you have a class LinePosition which specifies a particular line in a text file. Two line positions are equal if they represent the same line:
public boolean equals(final Object obj) {
    if (this == obj) return true;
    if (!(obj instanceof LinePosition)) return false;
    final LinePosition that = (LinePosition) obj;
    return this.getLine() == that.getLine();
}
Now you might later want to create a subclass, CharacterPosition, which not only specifies a line but also a particular character in the line. Two character positions are equal if they specify the same line and the same character in that line:
public boolean equals(final Object obj) {
    if (this == obj) return true;
    if (!(obj instanceof CharacterPosition)) return false;
    final CharacterPosition that = (CharacterPosition) obj;
    return this.getLine() == that.getLine()
        && this.getCharacter() == that.getCharacter();
}
This might look reasonable, but it doesn't work. equals is required to be symmetric, meaning a.equals(b) must give the same answer as b.equals(a), and in this case it doesn't:
final LinePosition a = new LinePosition(3);
final CharacterPosition b = new CharacterPosition(3, 5);
final boolean aIsB = a.equals(b); // true
final boolean bIsA = b.equals(a); // false
The problem is that b is more picky about who it considers itself to be equal to than a. Damn.
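For anyone who wants to try this, here is a minimal self-contained sketch of the two classes. The constructors, fields, and the SymmetryDemo class are assumptions made for illustration; only the equals bodies come from the snippets above.

```java
// Sketch only: constructors, fields, and hashCode are assumed, not taken from the post.
class LinePosition {
    private final int line;

    LinePosition(final int line) {
        this.line = line;
    }

    int getLine() {
        return line;
    }

    @Override
    public boolean equals(final Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof LinePosition)) return false;
        final LinePosition that = (LinePosition) obj;
        return this.getLine() == that.getLine();
    }

    // Hash on the line only, so objects that compare equal share a hash code.
    @Override
    public int hashCode() {
        return Integer.hashCode(line);
    }
}

class CharacterPosition extends LinePosition {
    private final int character;

    CharacterPosition(final int line, final int character) {
        super(line);
        this.character = character;
    }

    int getCharacter() {
        return character;
    }

    @Override
    public boolean equals(final Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof CharacterPosition)) return false;
        final CharacterPosition that = (CharacterPosition) obj;
        return this.getLine() == that.getLine()
            && this.getCharacter() == that.getCharacter();
    }
}

class SymmetryDemo {
    public static void main(final String[] args) {
        final LinePosition a = new LinePosition(3);
        final CharacterPosition b = new CharacterPosition(3, 5);
        System.out.println(a.equals(b)); // true: b is an instance of LinePosition
        System.out.println(b.equals(a)); // false: a is not a CharacterPosition
    }
}
```

The first call runs LinePosition's equals and the second runs CharacterPosition's, which is why the two answers can disagree.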
An alternative approach is to avoid instanceof and use getClass instead:
public boolean equals(final Object obj) {
    if (this == obj) return true;
    if (obj == null || obj.getClass() != this.getClass()) return false;
    final X that = (X) obj;
    // X-specific comparison
}
Using this technique to implement equals on the text position classes would make a and b equally picky about who they consider themselves equal to, so equals would be symmetric. Unfortunately, now the objects are so picky that they break Liskov Substitutability, which is a Bad Thing: an instance of a subclass can no longer stand in for an instance of its superclass wherever equality is involved.
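Applied to the position classes, the getClass variant might look roughly like the sketch below. As before, the constructors, fields, and the GetClassDemo class are assumed for illustration.

```java
// Sketch only: the same two classes, with equals based on getClass instead of instanceof.
class LinePosition {
    private final int line;

    LinePosition(final int line) {
        this.line = line;
    }

    int getLine() {
        return line;
    }

    @Override
    public boolean equals(final Object obj) {
        if (this == obj) return true;
        if (obj == null || obj.getClass() != this.getClass()) return false;
        final LinePosition that = (LinePosition) obj;
        return this.getLine() == that.getLine();
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(line);
    }
}

class CharacterPosition extends LinePosition {
    private final int character;

    CharacterPosition(final int line, final int character) {
        super(line);
        this.character = character;
    }

    int getCharacter() {
        return character;
    }

    @Override
    public boolean equals(final Object obj) {
        if (this == obj) return true;
        if (obj == null || obj.getClass() != this.getClass()) return false;
        final CharacterPosition that = (CharacterPosition) obj;
        return this.getLine() == that.getLine()
            && this.getCharacter() == that.getCharacter();
    }
}

class GetClassDemo {
    public static void main(final String[] args) {
        final LinePosition a = new LinePosition(3);
        final CharacterPosition b = new CharacterPosition(3, 5);
        System.out.println(a.equals(b)); // false: the classes differ
        System.out.println(b.equals(a)); // false: the classes differ
        System.out.println(b.equals(new CharacterPosition(3, 5))); // true
    }
}
```

Symmetry is restored, at the price that a LinePosition can never be equal to an instance of any of its subclasses.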
Here's an example where that might hit you. Let's say that during debugging you notice that instances of LinePosition turn up in places where you didn't expect them. If I wanted to find out where these positions came from, I would probably make an empty subclass of LinePosition for each place where I instantiate them, so each instantiation site uses its own class. That way, whenever I find a line position where I didn't expect it, I can see where it was instantiated.
But if equals has been implemented using class equality I can't do that, because even though I don't change any methods, the behavior of my program has changed in a subtle way. One position can now only be equal to another position if they were created at the same place, which is very different from how the program behaved before. I've tried hunting bugs that occurred because a subclass behaved differently from its superclass merely because it had a different class. Bugs like that can be extremely evil because they're so unexpected if your mind is warped the way object orientation usually warps people's minds.
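A small sketch of that debugging trick going wrong, assuming a getClass-based LinePosition. ParserLinePosition is a hypothetical empty subclass standing in for one instantiation site, and the HashSet is only there to show the effect leaking into collection lookups.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch only: constructor, field, and hashCode are assumed for illustration.
class LinePosition {
    private final int line;

    LinePosition(final int line) {
        this.line = line;
    }

    int getLine() {
        return line;
    }

    @Override
    public boolean equals(final Object obj) {
        if (this == obj) return true;
        if (obj == null || obj.getClass() != this.getClass()) return false;
        final LinePosition that = (LinePosition) obj;
        return this.getLine() == that.getLine();
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(line);
    }
}

// Hypothetical empty subclass that tags positions created at one particular site.
class ParserLinePosition extends LinePosition {
    ParserLinePosition(final int line) {
        super(line);
    }
}

class TaggedSubclassDemo {
    public static void main(final String[] args) {
        final LinePosition plain = new LinePosition(3);
        final LinePosition tagged = new ParserLinePosition(3);

        // Same line, no overridden methods, yet no longer equal.
        System.out.println(plain.equals(tagged)); // false

        // The change also shows up in lookups that rely on equals.
        final Set<LinePosition> seen = new HashSet<>();
        seen.add(plain);
        System.out.println(seen.contains(tagged)); // false
    }
}
```

With the instanceof-based equals, both lines would print true, so merely introducing the tagging subclass changes what the program does.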
At the core, this problem is a question of when you can consider two objects to be the same kind of object. Having a common superclass does not imply that two objects are the same kind, as the LinePosition and CharacterPosition example demonstrates. On the other hand, sometimes objects should be considered to be of the same kind even if they have different classes. So how do you determine whether or not two objects should be considered to be the same kind of object? In Java, I don't know. I think the getClass approach is the greater of two evils, so I'll probably keep doing things the way I've always done, using instanceof. Looking outside Java and Java-like languages, I have an idea for a solution using a mechanism that's a combination of Scala-style pattern matching and the visitor design pattern. But that will be the subject of another post in the near future.