Organisation and Structure of Data

Cards (15)

  • The purpose of a hashing algorithm is to convert a string of characters into a fixed-length key using a calculation.
  • Random access in files involves jumping to a record directly without the need to search through an index.
  • Direct access files are split into blocks, where each block contains a fixed number of records.
  • The size of a block is the product of the number of records and the size of each record in bytes.
  • The load factor is the number of records stored divided by the maximum number of records.
  • A deterministic hashing function ensures that hashing the same key always produces the same result.
  • A hashing function with uniformity spreads all keys evenly over the block range. This reduces block overflow.
  • Data normalisation means the hashing function converts all keys into the same character format (e.g. all lower case or numbers) before hashing.
  • Continuity is a property of a hashing function whereby keys that differ only slightly produce hash values that are close together.
  • A non-invertible hashing function means the hash value cannot be reversed to recover the original key.
  • Collisions occur when two different keys produce the same hash value. This causes longer linear searches as records are placed in the same block.
  • Using multipliers (e.g. a prime number) in the hash calculation can reduce collisions and improve the distribution of keys.
  • Block overflow occurs when collisions are so frequent that a block has no more available storage for further records.
  • To solve block overflow, an overflow area can be created, but this essentially defeats the purpose of a direct access system. Alternatively, the file can be rebuilt with more blocks to cater for the overflow.
  • Hashing is used for indexing/retrieving data in a data structure.
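Several of the cards above (deterministic hashing, normalisation, multipliers, block size, load factor, collisions, and overflow) can be tied together in one small sketch. This is a minimal illustration, not a real file system: the block count, records-per-block, record size, and function names are all assumed for the example.

```python
NUM_BLOCKS = 4          # assumed number of blocks in the direct access file
RECORDS_PER_BLOCK = 2   # each block holds a fixed number of records
RECORD_SIZE = 64        # assumed size of one record, in bytes
BLOCK_SIZE = RECORDS_PER_BLOCK * RECORD_SIZE  # block size = records x record size

blocks = [[] for _ in range(NUM_BLOCKS)]
overflow = []           # overflow area used once a block is full


def hash_key(key: str) -> int:
    """Deterministic hash: normalise the key to lower case, then fold the
    character codes with a prime multiplier (31) for a more uniform
    spread of keys over the block range."""
    key = key.lower()                        # data normalisation
    h = 0
    for ch in key:
        h = (h * 31 + ord(ch)) % NUM_BLOCKS  # 31 is the multiplier
    return h


def insert(key: str, record) -> None:
    """Place the record in its hashed block; on block overflow,
    fall back to the overflow area."""
    b = hash_key(key)
    if len(blocks[b]) < RECORDS_PER_BLOCK:
        blocks[b].append((key, record))
    else:
        overflow.append((key, record))       # block overflow


def load_factor() -> float:
    """Records stored in blocks divided by the maximum number of records."""
    stored = sum(len(b) for b in blocks)
    return stored / (NUM_BLOCKS * RECORDS_PER_BLOCK)


for name in ["Ada", "Grace", "Alan", "Edsger", "Barbara"]:
    insert(name, {"name": name})
```

Because the function is deterministic and normalises its input, `hash_key("Ada")` and `hash_key("ADA")` land in the same block, so a lookup can jump straight to that block rather than scanning the whole file.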