As Java (or Android) developers, we use Object
all the time. Java is an object-oriented programming language in the first place. We probably read/hear (more than once1 2 3) that equals()
and hashCode()
should only be overriden together, as they affect hashtable operations. Let’s do an interactive experiment to see what happens when one doesn’t follow advice.
First let’s grab some official definitions. The Java language specification (Java SE 8) is quite brief on this:
The method equals
defines a notion of object equality, which is based on value, not reference, comparison.
The method hashCode
is very useful, together with the method equals, in hashtables such as java.util.HashMap
.
Effective Java item #9 has a more lengthy explanation of this contract, summing up that:
hashCode()
when you override equals()
The contract basically says that if 2 objects are equals()
, they should have the same hashCode()
value, but 2 objects having the same hashCode()
value are not necessarily equals()
.
The experiment
Consider a common use case where we have a collection of items, each has an ID. An item is uniquely identified by its ID: if 2 items have the same ID, they are considered ‘logically equivalent’, and should be treated as being the same when we program using them. Let’s represent such item using Item
class, and use its instances as keys for map operations as follows:
class Item {
private final int id;
Item(int id) {
this.id = id;
}
}
// for some funny reason, we create 3 instances
// that represent the same thing
Item item1 = new Item(1),
item2 = new Item(1),
item3 = new Item(1);
HashMap<Item, Integer> map = new HashMap<>();
map.put(item1, 1); // set value to 1
map.put(item2, 2); // override value to 2
map.put(new Item(100), 100); // some random entry
System.out.println(map.get(item1));
System.out.println(map.get(item2));
System.out.println(map.get(item3));
As our keys are supposed to represent the same item logically, the expected output to the 3 printouts are:
2, 2, 2
(Trivial) test runs
TL;DR
- Test 1: Don’t override
hashCode()
andequals()
- Test 2: Override
hashCode()
only - Test 3: Override
equals()
only - Test 4: Override both
hashCode()
andequals()
- Test 5: What if
hashCode()
is random? - Test 6: What if
hashCode()
is always the same? - Test 7: And what if
equals()
is always false? - Test 8: Okay what if
equals()
is always true now?
You can skip to explanation, or try to work out explanation for each test output as you read on.
Test 1: Don’t override hashCode()
and equals()
class Item {
private final int id;
Item(int id) {
this.id = id;
}
}
1, 2, null [FAILED]
Test 2: Override hashCode()
only
class Item {
...
@Override
public int hashCode() {
return id;
}
}
1, 2, null [FAILED]
Test 3: Override equals()
only
class Item {
...
@Override
public boolean equals(Object obj) {
return obj instanceof Item && id == ((Item) obj).id;
}
}
1, 2, null [FAILED]
Test 4: Override both hashCode()
and equals()
class Item {
...
@Override
public int hashCode() {
return id;
}
@Override
public boolean equals(Object obj) {
return obj instanceof Item && id == ((Item) obj).id;
}
}
2, 2, 2 [PASSED]
Test 5: What if hashCode()
is random?
class Item {
private static int staticInt;
...
@Override
public int hashCode() {
return ++staticInt; // make sure each call gets a different value
}
@Override
public boolean equals(Object obj) {
return obj instanceof Item && id == ((Item) obj).id;
}
}
null, null, null [FAILED]
Test 6: What if hashCode()
is always the same?
class Item {
...
@Override
public int hashCode() {
return 0;
}
@Override
public boolean equals(Object obj) {
return obj instanceof Item && id == ((Item) obj).id;
}
}
2, 2, 2 [PASSED]
Test 7: And what if equals()
is always false?
class Item {
...
@Override
public int hashCode() {
return 0;
}
@Override
public boolean equals(Object obj) {
return false;
}
}
1, 2, null [FAILED]
Test 8: Okay what if equals()
is always true now?
class Item {
...
@Override
public int hashCode() {
return 0;
}
@Override
public boolean equals(Object obj) {
return true;
}
}
100, 100, 100 [FAILED]
Explanation
Well, if you need no more explanation for above results then congratulations, you have mastered hashCode()
, equals()
and how HashMap
works. But in case you get confused at some point, then below is the explanation.
A little dig into internal implementation reveals that HashMap
uses hashCode()
, ==
and equals()
for entry lookup. HashMap
stores entries in buckets. Ideally each bucket should store 1 entry, and by looking up a bucket in O(1) time, one can get corresponding entry in O(1) time. In practice, many entries may end up in the same bucket.
The lookup sequence for a given key k
is simplified as follows:
- Use
k.hashCode()
to determine which bucket the entry is stored, if any - If found, for each entry’s key
k1
in that bucket, ifk == k1 || k.equals(k1)
, then returnk1
’s entry - Any other outcomes, no corresponding entry
Similar sequence applies for when an entry is put into HashMap
.
Now, hashCode()
is a native implementation, but its Javadoc says:
/**
* ...
* As much as is reasonably practical, the hashCode method defined by
* class Object does return distinct integers for distinct objects.
* ...
*/
public native int hashCode();
equals()
implementation is much more straightforward:
public boolean equals(Object obj) {
return (this == obj);
}
So without overriding hashCode()
, its output will be different for each Item
object. Without overriding equals()
, 2 Item
objects that are logically equivalent would only be possible if they are represented by the same instance.
With that in mind, let’s see how each test failed/passed:
- [FAILED] Test 1: Don’t override
hashCode()
andequals()
- As each key has a different
hashCode()
value, they end up in different buckets, and represent different entries, with different values.
- As each key has a different
- [FAILED] Test 2: Override
hashCode()
only- Our keys are in the same bucket now, but they end up representing different entries, due to failed
equals()
checks.
- Our keys are in the same bucket now, but they end up representing different entries, due to failed
- [FAILED] Test 3: Override
equals()
only- Without overriding
hashCode()
,HashMap
never gets to the point where it needs to doequals()
checks. We get the same results as Test 1.
- Without overriding
- [PASSED] Test 4: Override both
hashCode()
andequals()
- Our keys are in the same bucket, and they represent the same entry, as either
==
orequals()
checks passed.
- Our keys are in the same bucket, and they represent the same entry, as either
- [FAILED] Test 5: What if
hashCode()
is random?- It’s worse than not overriding
hashCode()
, as the same key object would end up in a different bucket every time we use it.
- It’s worse than not overriding
- [PASSED] Test 6: What if
hashCode()
is always the same?- Our code is functionally correct, but the use of
HashMap
is now redundant, as all entries will end up in the same bucket, regardless of their keys, which takes O(N) or O(logN)4 to iterate.
- Our code is functionally correct, but the use of
- [FAILED] Test 7: And what if
equals()
is always false?- It would be the same as not overriding
equals()
. We get the same results as Test 2.
- It would be the same as not overriding
- [FAILED] Test 8: Okay what if
equals()
is always true now?- We would be treating each key as logically equivalent to any other key, and thus any map operation will refer to the same entry.
I hope this trivial article now convinces you to follow Java language specification advice when overriding hashCode()
and/or equals()
. Effective Java2 item #9 provides much more informative insights, and you should absolutely grab the book and check it out as well.