Hash Tables in the Standard Library

Cards (9)

  • The Standard Library includes hash table implementations of Set and Map, namely HashSet and HashMap.
  • The items in the HashSet (or the keys in the HashMap) must provide an equals and hashCode method.
  • The HashSet and HashMap are currently implemented using separate chaining hashing.
  • HashSet and HashMap can be used if it is not important for the entries to be viewable in sorted order.
  • The performance of a HashMap can often be superior to a TreeMap, but it is hard to know for sure without writing the code both ways.
  • In cases where either a HashMap or TreeMap is acceptable, it is preferable to declare variables using the interface type Map and then change the instantiation from a TreeMap to a HashMap, and perform timing tests.
  • Because the expensive part of the hash table operations is computing the hashCode, the hashCode method in the String class contains an important optimization: Each String object stores internally the value of its hashCode.
  • Initially the value of a strings hashCode is 0 , but if hashCode is invoked, the value is remembered. Thus if hashCode is computed on the same String object a second time, we can avoid the expensive recomputation. This technique is called caching the hash code, and represents another classic time-space tradeoff
  • Caching the hash code works only because Strings are immutable