1.1 Data Structures

Cards (31)

  • Data Structure
    A group / collection of related data items / elements
  • Arrays
    • Is a set of data elements of the same type
    • Has its elements accessed via indexes or subscripts
    • Has a fixed/pre-determined number of elements
  • Records
    • A set of data items all related to a single individual / entity
    • Can contain data of different types
  • Stacks
    A 'First In, Last Out' Data Structure
  • Stacks - full definition
    1. A stack is a container of objects that are inserted and removed according to the last-in first-out (LIFO) / first-in last-out (FILO) principle
    2. It is a limited access data structure - elements can be added and removed from the stack only at the top
  • Stacks - full definition (2/3)
    1. push adds an item to the top of the stack
    2. pop removes the item from the top
    3. A stack can be used as a recursive data structure
    4. A stack is either empty or it consists of a top and the rest which is a stack
  • Stacks - full definition (3/3)
    • Underflow occurs when an attempt is made to pop an empty stack
    • Overflow occurs when an attempt is made to add to a full stack
  • Why use a Stack for those situations?
    In each case, because the natural / desirable processing order is first in last out (e.g. last address placed on stack needs to be first one to be accessed)
  • Implementing a Stack
    1. The data for a Stack would be stored in an array
    2. A Pointer would keep track of the top of the Stack as data is added and removed (Stack Pointer)
    3. *A pointer is just a variable!
  • Implementing a Stack - Pushing
    1. If stackPointer < stackMaximum then: stackPointer = stackPointer + 1, stackArray(stackPointer) = dataItem
    2. Else: Output "Stack is full – your data has not been saved"
  • Implementing a Stack - Popping
    1. If stackPointer > 0 then: dataItem = stackArray(stackPointer), stackPointer = stackPointer - 1
    2. Else: Output "Stack is empty – no data can be retrieved"
  • Queues
    A 'First In, First Out' Data Structure
  • Why use a Queue for those situations?
    In each case, because the natural / desirable processing order is first in first out (e.g. job waiting longest should be printed next)
  • Implementing a Queue
    1. The data for a Queue would be stored in an array
    2. Two Pointers would keep track of the front and back of the queue as data is added & removed (Front Pointer & Back Pointer)
    3. *A pointer is just a variable!
  • Linked List
    A set of data elements, where each element contains the data itself and a pointer to the next element.
  • Linked List- Drawbacks
    • A linked list is more complex to program / manipulate than an array
    • Extra programming is required to access the data in the opposite direction (or the list needs to be doubly linked)
    • Can only be accessed in a linear manner
  • Binary Tree
    • A Data Structure made up of Nodes and Branches
    • The Nodes contain data
    • The Node at the top of the tree is called the Root Node
    • Binary = Each node can only have two branches
  • Why a Binary Tree?
    • Advantage: faster to search / add a value
    • Disadvantage: more complex to program / process
  • If a Binary Tree is not well balanced, then searching / processing time will be significantly increased.
  • Traversal of Binary Trees
    1. Preorder (root, left, right)
    2. Inorder (left, root, right)
    3. Postorder (left, right, root)
  • Preorder Traversal
    1. Visit the root
    2. Traverse the left subtree i.e., call Preorder(left-subtree)
    3. Traverse the right subtree i.e., call Preorder(right-subtree)
  • Inorder Traversal
    1. Traverse the left subtree i.e., call Inorder(left-subtree)
    2. Visit the root
    3. Traverse the right subtree i.e., call Inorder(right-subtree)
  • Postorder Traversal
    1. Traverse the left subtree i.e., call Postorder(left-subtree)
    2. Traverse the right subtree i.e., call Postorder(right-subtree)
    3. Visit the root
  • Hash Tables
    A hash table has two components, a table where the actual data is stored and a mapping function (called a hash function or hash algorithm)
  • The hash function uses the data that is going to be entered in the table to generate the location in the table where it should be stored.
  • Hash Tables - Separate Chaining
    This uses the original table to store a link to a dynamic structure like a linked list
  • When a new item creates the same result as an existing piece of data, a new node on the linked list in that location is used to hold the new data.
  • Whilst this would avoid collisions, this technique would slow down finding the data again as the linked list would have to be searched after the hashing algorithm has been completed.
  • Hash Tables - Another Solution
    Flag the original block and move data into designated overflow area for subsequent linear search
  • It would be important to pick a hash function which spreads the data evenly over the table to minimise collisions.
  • Linked List - Benefits
    • New items can be inserted into a linked list without rearranging all the other elements.
    • If programmed dynamically they use memory more efficiently