Hash Tables Without Linked Lists

Cards (18)

  • In linear probing, f is a linear function of i, typically f(i) = i.
  • Linear Probing amounts to trying cells sequentially (with wraparound) in search of an empty cell.
  • For linear probing, As long as the table is big enough, a free cell can always be found, but the time to do so can get quite large.
  • For linear probing, even if the table is relatively empty, blocks of occupied cells start forming. This effect, known as primary clustering, means that any key that hashes into the cluster will require several attempts to resolve the collision, and then it will add to the cluster.
  • For linear probing, on average, successful searches should take less time than unsuccessful searches.
  • Quadratic probing is a collision resolution method that eliminates the primary clustering problem of linear probing.
  • Quadratic probing is what you would expect—the collision function is quadratic. The popular choice is f(i) = i^2.
  • For linear probing it is a bad idea to let the hash table get nearly full, because performance degrades.
  • For quadratic probing, it is a bad idea to let the hash table get nearly full: There is no guarantee of finding an empty cell once the table gets more than half full, or even before the table gets half full if the table size is not prime. This is because at most half of the table can be used as alternative locations to resolve collisions.
  • If quadratic probing is used, and the table size is prime, then a new element can always be inserted if the table is at least half empty.
  • Standard deletion cannot be performed in a probing hash table, because the cell might have caused a collision to go past it. Thus, probing hash tables require lazy deletion, although in this case there really is no laziness implied.
  • contains(x), invokes private methods isActive and findPos.
  • The private method findPos performs the collision resolution. We ensure in the insert routine that the hash table is at least twice as large as the number of elements in the table, so quadratic resolution will always work.
  • For quadratic probing, we do nothing if x is already present. It is a simple modification to do something else. Otherwise, we place it at the spot suggested by the findPos routine.
  • Although quadratic probing eliminates primary clustering, elements that hash to the same position will probe the same alternative cells. This is known as secondary clustering.
  • For double hashing, one popular choice is f(i) = i · hash2(x).
  • The double hashing says that we apply a second hash function to x and probe at a distance hash2(x), 2hash2(x), …, and so on.
  • However, if double hashing is correctly implemented, simulations imply that the expected number of probes is almost the same as for a random collision resolution strategy.