Data Structures and Abstract Data Types

Cards (25)

  • Data structures
    The containers within which information is stored
    • Some are better suited to different types of data than others
    • Programmers must decide which of the data structures available is the best to use
  • Arrays
    • An indexed set of related elements
    • Must be finite and indexed
    • An array’s index often starts from zero
    • Must only contain elements with the same data type
    • Arrays can be created in many dimensions
  • Fields, records and files
    Information is stored by computers as a series of files
    • Each file is made up of records which are composed of a number of fields
  • Abstract data types/data structures
    Don‘t exist as data structures in their own right
    • Make use of other data structures to form a new way of storing data
  • Queues
    An abstract data structure based on an array
    • The first item added to a queue is the first to be removed
    • Referred to as “first in, first out” (FIFO) abstract data structures
    • Used by computers in keyboard buffers and in the breadth first search algorithm
  • Linear queues
    Have two pointers, a front pointer and a rear pointer
    • These can be used to identify where to place a new item in a queue or to identify which item is at the front of the queue
    The item at the front of the queue has been in the queue for the longest
    Operations that can be performed on a queue include:
    • Enqueue
    • Dequeue
    • IsEmpty
    • IsFull
    Emptiness can be detected by comparing the front and rear pointers
  • Circular queues
    A type of queue in which the front and rear pointers can move the two ends of the queue
    • More memory efficient than a linear queue
  • Priority queues
    Items are assigned a priority
    • High priority items are dequeued before low priority items
    • In the case that two or more items have the same priority, the items are removed in the usual first in, first out order
    Frequently used in computer systems
  • Stacks
    • A first in, last out (FILO) abstract data structure
    • Often based on an array
    • Have one pointer: a top pointer
    • The item at the bottom of a stack has been in the stack for the longest
    • Operations that can be performed include:
    • Push (add an item)
    • Pop (remove the top item)
    • Peek (return the value of the item at the top of the stack without actually removing the item)
    • The functions IsFull and IsEmpty can be executed on stacks
    • If a push command is executed on a full stack, a stack overflow error is thrown
    • A stack underflow can be caused by attempting the pop command on an empty stack
  • Graphs
    • An abstract data structure used to represent complex relationships between items within datasets
    • Can be used to represent networks e.g., transport networks, IT networks and the Internet
    • Consists of nodes (vertices) joined by edges (arcs)
    • A weighted graph is one in which edges are assigned a value, representing a value such as time, distance or cost
    • Can be represented in two different ways:
    • Adjacency matrices
    • Adjacency lists
  • Adjacency matrices
    • A tabular representation of a graph
    • Each of the nodes in the graph is assigned both a row and a column
    • Unweighted:
    • 1 is used to show that an edge exists between two nodes, 0 indicates no connection
    • They have a characteristic diagonal line of 0s and display diagonal symmetry
    • Weighted:
    • Contain the weight of a connection between two nodes, if no connection exists, an arbitrarily large value is used e.g., infinity
  • Adjacency lists
    • Represent a graph using a list
    • For each node, a list of adjacent nodes is created
    • Only records existing connections in a graph
  • Evaluation of adjacency matrices vs lists
    Adjacency matrix:
    • Memory inefficient: Stores every possible edge between nodes, even those that don‘t exist, almost half is repeated data
    • Time efficient: Allows a specific edge to be queried very quickly as can be looked up by its row and column
    • Well suited to dense graphs with a large number of edges
    Adjacency list:
    • Memory efficient: Only stores the edges that exist in the graph
    • Time inefficient: Slow to query, as each item in a list must be searched sequentially until the desired edge is found
    • Well suited to sparse graphs with few edges
  • Trees
    A tree is a connected, undirected graph with no cycles
  • Rooted trees
    • Have a root node from which all other nodes stem
    • The root node is usually displayed at the top of the tree in diagrams
    • Nodes from which other nodes stem are called parent nodes
    • The root node is the only node with no parent
    • Nodes with a parent are called child nodes
    • Child nodes with no children are called leaf nodes
  • Binary trees
    A rooted tree in which each parent node has no more than two child nodes.
  • Hash tables
    • Constant time complexity
    • A hashing algorithm takes an input and returns a hash
    • A hash is unique to its input and cannot be reversed to retrieve the input value
    • A hash table stores key-value pairs
    • The key is sent to a hash function, which produces a hash
    • Resulting hash is the index of the key-value pair in the hash table
    • When an element is to be looked up in a hash table, the key is first hashed
    • Once the hash has been calculated, the position in the table corresponding to that hash is queried and the desired info located
  • Hash tables - collisions
    • Sometimes different inputs produce the same hash
    • Well-designed hash tables get around collisions by using rehashing
    • One simple rehashing technique is to keep on moving to the next position until an available one is found
  • Dictionaries
    • A collection of key-value pairs
    • Values are accessed by their associated key
    • One application of dictionaries is in dictionary-based compression
  • Vectors

    Can be represented as:
    • lists of numbers
    • functions
    • ways of representing a point in space
    If viewed as a function, a vector can be represented by using a dictionary, if viewed as a list, a one-dimensional array would be suitable. They can be visually represented as an arrow.
  • Vector addition
    Vectors can be added to achieve translation - with arrows, vectors are added “tip to tail”. Vectors can also be added by adding each of their components.
  • Scalar-vector multiplication
    In order to scale a vector, each of its components are multiplied by a scalar. Scaling a vector only affects its magnitude, not its direction.
  • Convex combination of two vectors
    With two vectors, a and b, a convex combination of the two would be ax + by where x and y:
    • are non-zero numbers
    • are less than one
    • add to 1
    The convex combination of a and b with x and y:
    • is formed on the line that would join the tips of a and b
    • splits the line in the ratio chosen for the values x and y
  • Dot product
    A single number derived from the components of vectors.
    • Can be used to determine the angle between two vectors
    • The dot product of the vectors an and b is notated a.b (said ”a dot b”)
  • Static and dynamic data structures
    • Every data structure is either static or dynamic
    • Static data structures are:
    • fixed in size
    • most frequently declared in memory as a series of sequential, contiguous memory locations
    • Dynamic data structures
    • change in size to store their content
    • store each item of data alongside a reference to where the next item is stored in memory
    • require more work on the part of the computer to set up and use