1. The text editor reads the file and checks for control characters
2. Each control character is either acted on or replaced with a different representation
3. The resulting text is then displayed
Control characters
Special characters read and interpreted by the editor
Newline character
ASCII code 10, Unicode U+000A, often typed as \n
The backslash is an escape character; together with the character that follows it, it forms an escape sequence
When the text editor reads the newline character, it does not display it as a symbol; instead, any text following it is placed on a new line
Understanding control characters is important as they can cause issues when working with text files
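As a minimal sketch of the idea, the Python snippet below shows that the newline is a single character with code 10 and imitates how an editor might act on it. The render function is hypothetical and does not represent how any particular editor is implemented.

```python
# Raw text as it might be stored in a file, containing one newline character.
raw = "Hello\nWorld"

# The newline is a single character with ASCII/Unicode code 10.
newline_code = ord("\n")

def render(text):
    """Hypothetical display step: start a new line at each newline character."""
    return text.split("\n")

lines = render(raw)
for line in lines:
    print(line)
```

Note that the escape sequence \n is two keystrokes in the source code but only one character in the stored text, which is why the length of raw is 11 rather than 12.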
CSV (Comma-Separated Values)
Type of text file widely supported by many programs
Useful format for moving data between applications
CSV file
Uses a character set like ASCII or Unicode
Consists of records (typically one per line) divided into fields separated by delimiters
Has the same sequence of fields for every record
File name extension is generally .csv
CSV files may have field names encoded in the first line
When importing a CSV file, you are usually asked if there is a header line and what character has been used as the delimiter
Contents of shopping.csv
Item, Description, Qty
Rice, Organic, 1kg
Milk, Skimmed, 2.27l
Eggs, Free-range, 60
Sugar, Brown, 500g
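The import step can be sketched with Python's standard csv module. This is an illustration only: it assumes the first line is a header and the delimiter is a comma, and it holds the file contents in a string so that it runs without shopping.csv on disk.

```python
import csv
import io

# Contents of shopping.csv, held in a string for this sketch.
shopping_csv = """Item, Description, Qty
Rice, Organic, 1kg
Milk, Skimmed, 2.27l
Eggs, Free-range, 60
Sugar, Brown, 500g
"""

# skipinitialspace=True drops the space that follows each comma delimiter.
reader = csv.reader(io.StringIO(shopping_csv), delimiter=",", skipinitialspace=True)

header = next(reader)   # the first line holds the field names
records = list(reader)  # the remaining lines are the records

# Pair each field name with the matching field in every record.
for record in records:
    print(dict(zip(header, record)))
```

Every record has the same three fields in the same order, which is what allows the field names from the header line to be matched to each value.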
All data on a computer is stored in binary sequences of 1s and 0s
Text files
Binary codes represent characters, which can be displayed in text editors
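To make the link between characters and binary codes concrete, the short Python sketch below encodes a two-character string as ASCII and shows each character as 8 bits. The example string is arbitrary.

```python
text = "Hi"

# Encode the characters to bytes using ASCII, then show each byte as 8 bits.
encoded = text.encode("ascii")
bits = [format(byte, "08b") for byte in encoded]
print(bits)

# Reversing the process recovers the original characters.
decoded = bytes(int(b, 2) for b in bits).decode("ascii")
print(decoded)
```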
Binary files
Hold information that is not character based, such as sound or image data
Encoded in a form that can be directly manipulated by a computer program
Most computer files are binary rather than text files, as humans rarely need to read the raw file data
File headers
Tell the computer what kind of data it is looking at, especially important in program binary files
Program binary file header
Prefix 'MZ' - initials of Mark Zbikowski, a designer of the MS-DOS executable file format
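A program could check for this prefix along the lines of the Python sketch below. The function name and the sample byte strings are made up for illustration, and a real check would read the first two bytes from an actual file.

```python
def looks_like_mz_executable(data):
    """Return True if the bytes begin with the 'MZ' prefix."""
    return data[:2] == b"MZ"

# Made-up byte strings for illustration; these are not real file contents.
fake_program = b"MZ\x90\x00" + bytes(16)
fake_text = b"Item, Description, Qty\n"

print(looks_like_mz_executable(fake_program))
print(looks_like_mz_executable(fake_text))
```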
Bitmapped graphic files
Images represented as a collection of pixels, each with an RGB colour code
Metadata also included in image files, such as dimensions and creation date
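The Python sketch below models a tiny image as a grid of RGB pixels with some metadata. It is a simplified illustration and does not follow the layout of any real image file format; the field names are assumptions.

```python
# A hypothetical 2x2 image: each pixel is a (red, green, blue) triple, 0-255.
pixels = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

# Metadata stored alongside the pixel data (illustrative field names).
metadata = {"width": 2, "height": 2}

# Flatten the pixels into raw bytes, row by row, three bytes per pixel.
raw = bytes(channel for row in pixels for pixel in row for channel in pixel)
print(len(raw), "bytes of pixel data")
```

Without the width and height in the metadata, a program reading the 12 raw bytes could not tell whether they describe a 2x2, 4x1 or 1x4 image.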
Sound files
Analogue signal converted to binary by sampling at fixed intervals
Data points are stored as binary numbers; on playback, the process is reversed to recreate the sound wave
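Sampling can be sketched in Python as follows. A 1 Hz sine wave stands in for the analogue signal, and the sample rate and 8-bit storage are assumptions chosen to keep the example small; real audio uses far higher rates.

```python
import math

SAMPLE_RATE = 8   # samples per second (assumed, far lower than real audio)
FREQUENCY = 1     # a 1 Hz sine wave stands in for the analogue signal

# Sample the signal at fixed intervals and store each point as an 8-bit number.
samples = []
for n in range(SAMPLE_RATE):
    t = n / SAMPLE_RATE
    amplitude = math.sin(2 * math.pi * FREQUENCY * t)   # between -1 and 1
    samples.append(round((amplitude + 1) / 2 * 255))    # scale to 0..255

# Reverse the process: turn the stored numbers back into amplitudes.
recovered = [s / 255 * 2 - 1 for s in samples]
print(samples)
```

The recovered values are close to the original amplitudes but not identical, because each sample was rounded to a whole number when it was stored.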
Program files
Start as text files with programming language instructions
Compiled into machine executable binary format before being run
Software companies are often reluctant to document and share the exact format of their binary files due to concerns about cybercrime and reverse-engineering
The binary is processed by the program associated with it, and is translated back into the instructions and resources that the program needs