Organisation and Structure of Data

Cards (15)

  • The purpose of a hashing algorithm is to convert a string of characters into a fixed-length key using a calculation.
  • Random access in files involves jumping to a record directly without the need to search through an index.
  • Direct access files are split into blocks, where each block contains a fixed number of records.
  • The size of a block is the product of the number of records and the size of each record in bytes.
  • The load factor is the number of records stored divided by the maximum number of records.
  • A deterministic hashing function ensures that hashing the same key always produces the same result.
  • A hashing function with uniformity spreads all keys evenly over the block range. This reduces block overflow.
  • Data normalisation means the hashing function converts all keys into the same character format (e.g. all lower case or numbers) before hashing.
  • Continuity is a property of a hashing function whereby keys that differ only slightly produce hash values that are close together.
  • A non-invertible hashing function means the hash value cannot be reversed to recover the original key.
  • Collisions occur when two different keys produce the same hash value. This causes longer linear searches as records are placed in the same block.
  • Using multipliers (e.g. a prime number) in the hash calculation can reduce collisions and improve the distribution of keys.
  • Block overflow occurs when collisions are so frequent that a block has no more available storage for further records.
  • To solve block overflow, an overflow area can be created, but this essentially defeats the purpose of a direct access system. Alternatively, the file can be rebuilt with more blocks to cater for the overflow.
  • Hashing is used for indexing/retrieving data in a data structure.
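Several of the cards above (deterministic hashing, normalisation, multipliers, block size, load factor, collisions, and overflow) can be tied together in one small sketch. This is a minimal illustration, not a real file system: the block count, records-per-block, record size, and function names are all assumed for the example.

```python
NUM_BLOCKS = 4          # assumed number of blocks in the direct access file
RECORDS_PER_BLOCK = 2   # each block holds a fixed number of records
RECORD_SIZE = 64        # assumed size of one record, in bytes
BLOCK_SIZE = RECORDS_PER_BLOCK * RECORD_SIZE  # block size = records x record size

blocks = [[] for _ in range(NUM_BLOCKS)]
overflow = []           # overflow area used once a block is full


def hash_key(key: str) -> int:
    """Deterministic hash: normalise the key to lower case, then fold the
    character codes with a prime multiplier (31) for a more uniform
    spread of keys over the block range."""
    key = key.lower()                        # data normalisation
    h = 0
    for ch in key:
        h = (h * 31 + ord(ch)) % NUM_BLOCKS  # 31 is the multiplier
    return h


def insert(key: str, record) -> None:
    """Place the record in its hashed block; on block overflow,
    fall back to the overflow area."""
    b = hash_key(key)
    if len(blocks[b]) < RECORDS_PER_BLOCK:
        blocks[b].append((key, record))
    else:
        overflow.append((key, record))       # block overflow


def load_factor() -> float:
    """Records stored in blocks divided by the maximum number of records."""
    stored = sum(len(b) for b in blocks)
    return stored / (NUM_BLOCKS * RECORDS_PER_BLOCK)


for name in ["Ada", "Grace", "Alan", "Edsger", "Barbara"]:
    insert(name, {"name": name})
```

Because the function is deterministic and normalises its input, `hash_key("Ada")` and `hash_key("ADA")` land in the same block, so a lookup can jump straight to that block rather than scanning the whole file.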