Hash tables

Cards (30)

  • Hash table
    A collection of items stored in such a way that they can quickly be located
  • Hashing
    1. Apply a hashing algorithm to the value in the key field of each record to transform it into an address
    2. Divide the key by the number of available addresses and take the remainder as the address
  • Synonyms are bound to occur with any hashing algorithm, and two record keys hashing to the same address is referred to as a collision
  • Collision handling
    Store the item in the next available free space
  • Storing items in a hash table
    1. Calculate the hash value of each item to be stored using division by 11 and taking the remainder
    2. Store items in the next available free slot in case of collisions
  • Searching for an item
    1. Apply the hashing algorithm to the key field of the item
    2. Examine the resulting cell in the list
    3. If the item is there, return the item
    4. If the cell is empty, the item is not in the table
    5. If there is another item in that spot, keep moving forward until either the item is found or a blank cell is encountered, when it is apparent that the item is not in the table
  • To be as efficient as possible, the hashing algorithm needs to be chosen so that it generates the least number of collisions. This will depend to some extent on the distribution of the items to be hashed.
  • Folding method
    Divides the item into equal parts, and adds the parts to give the hash value
  • If the hash table has fewer spaces than the maximum possible sum generated by this method, say 100 cells, then the extra step of dividing by 100 needs to be applied.
  • Searching for an item
  • when searching for an item, these steps are followed
    1. apply the hashing algorithm to the key field of the item
    2. examine the resulting cell in the list
    3. the item is there, return the item
    4. the cell is empty, the item is not in the table
    5. there is another item in that spot, keep moving forward until either the item is found or a blank cell is encountered, when it is apparent that the item is not in the table
  • Other hashing algorithms
  • Folding method
    Divides the item into equal parts, and adds the parts to give the hash value
  • If the table has fewer spaces than the maximum possible sum generated by this method, say 100 cells, then the extra step of dividing by 100 needs to be applied
  • 02: Using the folding method and division by 100, complete the hash table below to show where each number will be stored in a table of 100 spaces. (A sample 123456 is done for you.)
  • "Folded" value
  • Hashing the word CAB
    1. Add up the ASCII values for each letter
    2. If there are 11 spaces in the hash table, divide by 11 and take the remainder as the hash value
  • The ASCII code for 'T' is 84
  • The hash values associated with these words are similar
  • Collision resolution
    The process of finding an empty slot when a collision has occurred
  • Rehashing
    • Looks for the next empty slot, looping round to the first cell if the table end is reached
    • Could look at every third call (the "plus 3" rehash)
    • Could increment the hash value by 1, 3, 5, 7... until a free slot is found
  • Different hashing and rehashing methods will work more efficiently on different data sets
  • Uses of hash tables
    • Efficient lookup, e.g. for an index
    • Storing data such as user codes and encrypted passwords that need to be looked up and verified quickly
    • Implementing the data structure called a dictionary
  • A dictionary is a useful data structure for implementing graphs
  • Dictionary
    An abstract data type consisting of associated pairs of items, where each pair consists of a key and a value
  • Dictionaries are a built-in data structure in Python and Visual Basic
  • Dictionary in Python
    Comma-delimited pairs in the format key:value, enclosed in curly braces
  • Dictionary in Python
    • IDs (342: 'Harry', 634:'Jasmine', 885:'Max', 571: 'Sheila')
  • Operations on dictionaries
    • Create a new empty dictionary
    • Add a new key: value pair to the dictionary
    • Delete a key: value pair from the dictionary
    • Amend the value in a key: value pair
    • Return a value associated with key k
    • Return True or False depending on whether key is in the dictionary
    • Return the length of the dictionary, i.e. the number of key:value pairs
  • Dictionaries can be implemented using either a static or a dynamic data structure