Unit 4.4 - Organisation of Data
Created by Luis
Cards (15)
Batch processing is where a computer periodically completes high-volume, repetitive tasks.
Transactional files hold data from day-to-day interactions, e.g. a till at a supermarket.
Transactional Files
Temporary, serial files. At the end of a set time period, the data on the transactional file is copied onto the master file and the transactional file is wiped.
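A minimal sketch of this end-of-period roll-up, assuming both files hold one record per line; the file names are illustrative and the ordering of the master file is ignored here for simplicity.

```python
# Minimal sketch of the end-of-period roll-up (illustrative file names).
# Assumes one record per line; master-file ordering is ignored here.

def roll_up(transaction_path="transactions.txt", master_path="master.txt"):
    # Read the day-to-day records from the transactional file
    with open(transaction_path) as f:
        transactions = f.readlines()

    # Copy them onto the master file, which holds long-term data
    with open(master_path, "a") as f:
        f.writelines(transactions)

    # Wipe the transactional file ready for the next period
    open(transaction_path, "w").close()
```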
Master files hold data collected over a long time period, e.g. historical company data.
Master Files
Sequentially ordered by key field. Used to perform batch processing.
Batch Processing
A sorted transactional file is used to update a master file. A report, an updated master file and an error file are produced.
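A sketch of a single batch run, assuming records are (key, data) pairs already sorted on the key field; unmatched keys are treated as errors purely for simplicity, and all names are illustrative.

```python
# Sketch of one batch run: a sorted transactional file updates a master
# file, producing an updated master, a report and an error file.
# Records are modelled as (key, data) pairs; all names are illustrative.

def batch_update(master, transactions):
    master_dict = dict(master)               # master is keyed on the key field
    report, errors = [], []

    for key, data in transactions:           # transactions processed in key order
        if key in master_dict:
            master_dict[key] = data          # apply the update
            report.append(f"updated {key}")
        else:
            errors.append((key, data))       # unmatched record: stored to handle later

    # The updated master stays sequentially ordered by key field
    return sorted(master_dict.items()), report, errors


master = [(1, "Ann"), (2, "Bob"), (4, "Dee")]
transactions = [(2, "Bobby"), (3, "Cal")]
updated, report, errors = batch_update(master, transactions)
# updated -> [(1, 'Ann'), (2, 'Bobby'), (4, 'Dee')]
# report  -> ['updated 2'],  errors -> [(3, 'Cal')]
```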
Batch processing requires a large amount of processing power, so it must be performed at an off-peak time.
Batch processing is useful as it ignores errors and stores them to be handled later.
Batch processing can take a large amount of time to complete, so a system might be unusable while undergoing this process.
Serial Files
Records are stored in no particular order, and new records are added to the end of the file.
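A minimal sketch of adding to a serial file, assuming one record per line; the file name is illustrative.

```python
# Sketch of a serial file: records are kept in arrival order and new
# records are simply appended to the end of the file.

def add_record(record, path="serial.txt"):
    with open(path, "a") as f:     # "a" appends to the end of the file
        f.write(record + "\n")
```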
Sequential Files
Records are organised by primary key. A temporary file is required to perform updates to the file.
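A sketch of the temporary-file update, assuming each line is a "key,value" record held in primary-key order; the file names are illustrative.

```python
# Sketch of updating a sequential file via a temporary file: records are
# copied across in primary-key order with the changed record substituted,
# then the temporary file replaces the original.
import os

def update_record(key, new_value, path="sequential.txt"):
    tmp_path = path + ".tmp"
    with open(path) as src, open(tmp_path, "w") as tmp:
        for line in src:
            record_key, value = line.rstrip("\n").split(",", 1)
            if record_key == key:
                value = new_value            # substitute the updated record
            tmp.write(f"{record_key},{value}\n")
    os.replace(tmp_path, path)               # temporary file becomes the new file
```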
Direct / Random Access Files
Records can be accessed at any time by jumping to their location through a hashing algorithm, rather than performing a search for the data.
Direct / Random Access Files
Split into fixed-length blocks of data. Data added is assigned an index by a hashing algorithm, and data is retrieved by hashing the query to generate the location to look in. Too many blocks will cause space to be wasted; too few blocks will cause collisions.
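A sketch of the idea with a fixed number of single-record blocks; the block count and the toy hash are illustrative choices, and collisions are not handled here (block overflow is covered below).

```python
# Sketch of a direct-access structure: a fixed number of blocks, with a
# toy hash mapping a key to a block index for both storage and retrieval.
# Collisions simply overwrite here; overflow handling is sketched later.

NUM_BLOCKS = 8
blocks = [None] * NUM_BLOCKS                        # fixed-length blocks, one record each

def block_index(key):
    return sum(ord(c) for c in key) % NUM_BLOCKS    # hash the key to a block index

def store(key, data):
    blocks[block_index(key)] = (key, data)          # jump straight to the block

def retrieve(key):
    slot = blocks[block_index(key)]                 # hash the query, then look there
    return slot[1] if slot and slot[0] == key else None

store("AB123", "widget")
print(retrieve("AB123"))   # -> widget
```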
Hashing Algorithm Requirements
Deterministic: the same input always produces the same result
Uniformity: hash values should be spread evenly across the available blocks
Data normalisation: data fed to a hashing algorithm should be normalised
Continuity: keys that differ by small amounts should have hash values that differ by small amounts
Non-invertible: hash values should not be reversible to obtain the original data
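A toy sketch illustrating two of these requirements, data normalisation and determinism; the hash itself is illustrative and makes no attempt at uniformity, continuity or non-invertibility.

```python
# Toy sketch: keys are normalised before hashing, and the same normalised
# input always gives the same result. Not a real hashing algorithm.

def normalise(key):
    return key.strip().upper()                      # data normalisation

def toy_hash(key, table_size=100):
    key = normalise(key)
    return sum(ord(c) for c in key) % table_size    # deterministic by construction

print(toy_hash("ab 123 "), toy_hash("AB 123"))      # same value for both
```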
Block Overflow can be resolved by:
using an overflow area (separate chaining)
creating a new file
using a new hashing algorithm
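A sketch of the first option, separate chaining, where each block holds a short chain so colliding records share the block instead of being lost; the block count and hash are illustrative.

```python
# Sketch of separate chaining: each block is a small chain (list), so a
# record that hashes to an occupied block joins the chain.

NUM_BLOCKS = 4
blocks = [[] for _ in range(NUM_BLOCKS)]            # each block holds a chain

def block_index(key):
    return sum(ord(c) for c in key) % NUM_BLOCKS

def store(key, data):
    blocks[block_index(key)].append((key, data))    # overflow joins the chain

def retrieve(key):
    for record_key, data in blocks[block_index(key)]:   # search only this chain
        if record_key == key:
            return data
    return None
```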