data representation

Cards (34)

  • The natural numbers are a set of numbers containing all positive whole numbers and zero. They can be used to count how many of a certain item you have. For example, three keyboards, seven printers or two servers. The symbol for the natural numbers is ℕ.
  • The integers are a set of whole numbers, both positive and negative, including zero. The symbol used for integers is ℤ
  • ASCII (American Standard Code for Information Interchange, pronounced ass-kee) and Unicode are two widely used information coding systems. Introduced in 1963, ASCII uses 7 bits to represent 128 (= 2^7) different characters, including A to Z, a to z, 0 to 9 and various symbols.
  • Unicode was introduced to allow the representation of a wide variety of alphabets by computers. Depending on the encoding used, a character takes from 8 to 32 bits (1 to 4 bytes), allowing Unicode to represent a much wider range of different characters than ASCII.
  • A parity bit is a single bit added to a transmission that can be used to check for errors in the transmitted data. Its value is calculated based on the transmitted data itself
  • In even parity, the value of the parity bit is chosen so as to make the total number of 1s in the transmitted data even
  • Odd parity works in a similar way to even parity, but adds a parity bit so that the total number of 1s in the transmitted data is odd
  • When using majority voting, each bit of the data is transmitted multiple times. When the data is received, the most commonly occurring value is taken to be correct
  • ASCII (American Standard Code for Information Interchange) and Unicode are two widely used information coding systems.
  • Unicode was introduced in 1991 to allow the representation of a wide variety of alphabets by computers.
  • Introduced in 1963, ASCII makes use of 7 bits to represent 128 (= 2^7) different characters including A to Z, a to z, 0 to 9 and various symbols.
  • Checksums involve adding a value, determined by the data itself, to the transmitted data. An algorithm is used to determine the value of a checksum based on the data being transmitted. There is no agreed algorithm for this and different systems will use their own solutions.
  • A check digit is a type of checksum in which only a single digit is added to the transmitted data. This reduces the number of different algorithms that could be used to calculate the value of the check digit and so reduces the variety of errors that the method can detect.
  • Analogue data is continuous: there are no limits to the values that the data can take. In contrast, digital data is discrete, meaning that it can only take particular values.
  • An analogue signal can take any values and can change as frequently as required whereas a digital signal must always take one of a specified range of values and can only change value at specified intervals
  • When converting from digital to analogue, a device called a digital to analogue converter (or DAC for short) is used. The device reads a bit pattern representing an analogue signal and outputs the corresponding analogue electrical signal.
  • Many sensors, such as temperature sensors and microphones, output an analogue signal. When a computer needs to make use of these, it uses an analogue to digital converter (ADC for short) to convert the analogue signal to a digital bit pattern. The device works by taking a reading of an analogue signal at regular intervals and recording the value, in a process called sampling.
  • Samples are taken at a specific frequency, given in Hertz, which determines the number of samples taken per second. This is usually a high number as greater sampling frequencies result in a better reproduction of the analogue signal
  • The resolution of an image is often expressed as a number of pixels per inch, or as the image's dimensions in pixels (e.g. 800 × 1200).
  • The number of bits assigned to a pixel in an image is called its colour depth
  • Multiplying an image’s width and height in pixels by its colour depth gives a minimum value for the storage requirements of a bitmapped image. This is because bitmap image files may also contain metadata, typical examples of which include the image’s width, height, date created and colour depth.
  • Vector graphics represent images using geometric objects and shapes such as rectangles, circles and lines. The properties (such as fill colour, fill style and dimensions) of each geometric object or shape in the image are stored in a drawing list
  • Computers represent sound as a sequence of samples, each of which takes a discrete digital value. The number of samples per second is called the sampling rate and is expressed in Hertz
  • The number of bits allocated to each sample is referred to as the sample resolution. Higher sample resolutions result in greater audio quality but also increased file size.
  • The size of a sound sample in bits can be calculated by multiplying together the duration of the sample in seconds, the sampling rate in Hertz and the sample resolution in bits.
  • The Nyquist theorem states that the sampling rate of a digital audio file must be at least twice the highest frequency present in the sound. If the sampling rate is below this, the sound may not be accurately represented.
  • Musical instrument digital interface, or MIDI, is used with electronic musical instruments which can be connected to computers. Rather than storing samples of sound, MIDI stores sound as a series of event messages, each of which represents an event in a piece of music. These can be thought of as a series of instructions which could be used to recreate a piece of music
  • MIDI files are much smaller than other types of audio files because they do not store actual sounds, just instructions on how to create them. This makes it easier to transfer large numbers of songs over networks like the internet
  • Raster images consist of pixels arranged into rows and columns. They are often created from photographs or scanned drawings. Pixels have no physical existence; instead, they exist only as points of light produced by a computer monitor
  • When using lossy compression, some information is lost in the process of reducing the file’s size. Examples include reducing the resolution of an image or lowering the sample resolution of a sampled audio file.
  • In contrast to lossy compression, there is no loss of information when using lossless compression. The size of a file can be reduced without decreasing its quality
  • Run length encoding (RLE for short) reduces the size of a file by removing repeated information and replacing it with one occurrence of the repeated information followed by the number of times it is to be repeated.
  • When a file is compressed with a dictionary-based method, a dictionary containing repeated data is appended to the file, and each repeated item in the file is replaced by a short reference to its dictionary entry.
  • Encryption is the process of scrambling data so that it cannot be understood if intercepted in order to keep it secure during transmission. Unencrypted information is referred to as plaintext and encrypted information is called ciphertext. A cipher is a type of encryption method.
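
Going back to the ASCII and Unicode cards, the short Python sketch below shows where the figure of 128 characters comes from and how many bytes a few characters need. It assumes the UTF-8 encoding, which is only one of several Unicode encodings; the variable names are made up for illustration.

```python
# 7 bits give 2**7 = 128 distinct codes, numbered 0 to 127.
ascii_codes = 2 ** 7

# ord() returns a character's code point; "A" is 65, or 1000001 in 7 bits.
code_a = ord("A")
bits_a = format(code_a, "07b")

# Bytes needed per character under UTF-8 (an assumed choice of encoding).
utf8_lengths = {ch: len(ch.encode("utf-8")) for ch in ("A", "é", "€", "😀")}

print(ascii_codes, code_a, bits_a)
print(utf8_lengths)
```

Characters that are also in ASCII take a single byte in UTF-8, while characters from other alphabets and symbol sets take more.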
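
The even and odd parity cards can be illustrated with a small sketch. Bits are held as strings of "0" and "1" purely for readability, and the function names `parity_bit` and `passes_parity` are hypothetical.

```python
def parity_bit(bits: str, even: bool = True) -> str:
    """Return the parity bit ('0' or '1') for a string of '0'/'1' characters."""
    ones = bits.count("1")
    if even:
        # Even parity: choose the bit so the total number of 1s is even.
        return "1" if ones % 2 == 1 else "0"
    # Odd parity: choose the bit so the total number of 1s is odd.
    return "0" if ones % 2 == 1 else "1"


def passes_parity(received: str, even: bool = True) -> bool:
    """Check a received bit string that already includes its parity bit."""
    ones = received.count("1")
    return ones % 2 == (0 if even else 1)


data = "1000001"                    # 7-bit ASCII code for 'A'
sent = data + parity_bit(data)      # even parity appended as the last bit
print(sent, passes_parity(sent))
```

A single flipped bit makes the check fail, but two flipped bits cancel out and pass unnoticed, which is a known limit of a single parity bit.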
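
For the majority voting card, a minimal sketch might look like the following. It assumes each bit is sent three times (any odd number avoids ties), and the helper names are invented for this example.

```python
def encode_majority(bits: str, copies: int = 3) -> str:
    """Repeat every bit `copies` times before transmission."""
    return "".join(bit * copies for bit in bits)


def decode_majority(received: str, copies: int = 3) -> str:
    """Take the most common value in each group of `copies` bits."""
    decoded = []
    for start in range(0, len(received), copies):
        group = received[start:start + copies]
        # A group decodes to '1' when more than half of its bits are 1s.
        decoded.append("1" if group.count("1") * 2 > len(group) else "0")
    return "".join(decoded)


sent = encode_majority("101")
corrupted = "110000111"             # third bit flipped in transit
print(sent, decode_majority(corrupted))
```

Unlike a parity bit, this approach can correct an error as well as detect it, at the cost of sending several times as much data.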
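
The checksum and check digit cards note that there is no agreed algorithm, so the sketch below uses two deliberately simple, hypothetical schemes: a sum of byte values modulo 256, and a check digit chosen so that all digits add up to a multiple of 10. Neither is taken from a particular standard.

```python
def simple_checksum(data: bytes) -> int:
    """Hypothetical checksum: the sum of all byte values, modulo 256."""
    return sum(data) % 256


def check_digit(digits: str) -> str:
    """Hypothetical check digit: makes the sum of all digits a multiple of 10."""
    total = sum(int(d) for d in digits)
    return str((10 - total % 10) % 10)


def is_valid(digits_with_check: str) -> bool:
    """Recompute the digit sum at the receiving end."""
    return sum(int(d) for d in digits_with_check) % 10 == 0


message = b"AB"
packet = message + bytes([simple_checksum(message)])   # checksum appended
number = "1235" + check_digit("1235")
print(packet, number, is_valid(number))
```

This check digit catches a single mistyped digit but not two swapped digits, since the sum stays the same; that is an example of how a simpler scheme detects a narrower variety of errors.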
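
The sampling cards describe an ADC taking readings at regular intervals. As a rough software analogy only, the sketch below samples a function of time and rounds each reading to one of 2^bits levels. The function `sample_signal` and the assumption that the signal stays between -1 and 1 are choices made for this example.

```python
import math


def sample_signal(signal, duration_s, sampling_rate_hz, bits):
    """Sample `signal(t)` at regular intervals and quantise each reading.

    Assumes signal values lie in the range -1.0 to 1.0.
    """
    levels = 2 ** bits
    count = int(duration_s * sampling_rate_hz)
    samples = []
    for k in range(count):
        t = k / sampling_rate_hz                 # time of the k-th reading
        value = signal(t)
        # Map -1..1 onto the integer levels 0..levels-1.
        samples.append(round((value + 1) / 2 * (levels - 1)))
    return samples


# A 1 Hz sine wave sampled at 4 Hz for one second, 3 bits per sample.
samples = sample_signal(lambda t: math.sin(2 * math.pi * t), 1, 4, 3)
print(samples)
```

Raising the sampling rate or the number of bits per sample gives a closer reproduction of the original curve, in line with the cards on sampling rate and sample resolution.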
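
For the cards on resolution, colour depth and bitmap storage, a worked calculation may help. It assumes the usual width × height × colour depth formula and ignores metadata, so the result is the minimum size; the 800 × 1200 image and 24-bit colour depth are example values.

```python
def bitmap_size_bits(width_px: int, height_px: int, colour_depth_bits: int) -> int:
    """Minimum storage for a bitmap: one colour value per pixel, no metadata."""
    return width_px * height_px * colour_depth_bits


size_bits = bitmap_size_bits(800, 1200, 24)
size_bytes = size_bits // 8
colours = 2 ** 24                   # distinct colours available at 24 bits

print(size_bits, size_bytes, colours)
```

Here the image needs at least 23,040,000 bits, or 2,880,000 bytes, before any metadata is added.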
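
The sound size and Nyquist cards can be checked with a similar calculation. The figures (10 seconds, 44,100 Hz, 16 bits, and a highest frequency of 20,000 Hz) are illustrative assumptions, and a single channel of audio is assumed.

```python
def sound_size_bits(duration_s: int, sampling_rate_hz: int,
                    sample_resolution_bits: int) -> int:
    """Size of sampled sound: duration x sampling rate x sample resolution."""
    return duration_s * sampling_rate_hz * sample_resolution_bits


def minimum_sampling_rate_hz(highest_frequency_hz: int) -> int:
    """Nyquist: sample at no less than twice the highest frequency present."""
    return 2 * highest_frequency_hz


size_bits = sound_size_bits(10, 44_100, 16)
size_bytes = size_bits // 8
nyquist_rate = minimum_sampling_rate_hz(20_000)

print(size_bits, size_bytes, nyquist_rate)
```

With these values the recording takes 7,056,000 bits (882,000 bytes), and a sound containing frequencies up to 20,000 Hz needs a sampling rate of at least 40,000 Hz.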
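
The run length encoding card can be sketched as below, storing each run as a (value, count) pair. Working on a string of characters and the names `rle_encode` and `rle_decode` are assumptions for the example; real file formats pack the pairs differently.

```python
from itertools import groupby


def rle_encode(data: str) -> list[tuple[str, int]]:
    """Replace each run of repeated characters with (character, run length)."""
    return [(value, len(list(run))) for value, run in groupby(data)]


def rle_decode(pairs: list[tuple[str, int]]) -> str:
    """Rebuild the original data by repeating each character `count` times."""
    return "".join(value * count for value, count in pairs)


encoded = rle_encode("AAAABBBCC")
print(encoded)
print(rle_decode(encoded))
```

Because decoding restores the data exactly, this is a lossless method. It only saves space when the data contains long runs; data with few repeats can end up larger after encoding.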