An important job that lets companies use the data available to them by putting it into readily accessible forms
6 stages of data processing
1. Data collection
2. Data preparation
3. Data input
4. Processing
5. Data output/interpretation
6. Data storage
Data processing
Occurs when data is collected and translated into usable information
Converts raw data into a readable, functional format and stores it for future use
Data processing can be completed through advanced technological methods or manual efforts
Data processing is therefore the process of converting data into information, and vice versa
Data scientist
Uses their expertise to condense large sets of data into functional formats
Data collection
Collecting data is the first step in data processing
The data sources available must be trustworthy and well-built so the data collected is of the highest possible quality
Types of data collection
Qualitative
Quantitative
Primary
Secondary
Qualitative data collection
Non-numerical research that gathers information on concepts, thoughts or experiences
Qualitative data collection methods
Observations
Surveys
Focus groups
Interviews
Quantitative data collection
Collects numerical or statistical information
Quantitative data collection methods
Observations
Surveys
Primary data collection
Researchers obtain information directly from the original sources
Secondary data collection
Information gathered from previous research
Sources of secondary data
Books
Scholarly journals and papers
Newspapers
Websites
Podcasts
Data preparation
Raw data is cleaned up and organized for the following stage of data processing
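A minimal sketch of what cleaning and organizing raw data can look like. The field names (name, age), the sample records, and the cleaning rules are assumptions chosen for illustration; real preparation steps depend on the data set.

```python
def prepare(raw_records):
    """Clean raw records: trim whitespace, drop incomplete rows, remove duplicates."""
    cleaned = []
    seen = set()
    for record in raw_records:
        # Normalize the name: strip stray spaces and use consistent capitalization.
        name = (record.get("name") or "").strip().title()
        age_text = str(record.get("age", "")).strip()
        # Drop rows with a missing name or a non-numeric age.
        if not name or not age_text.isdigit():
            continue
        key = (name, int(age_text))
        if key in seen:
            # Skip rows that duplicate an earlier cleaned row.
            continue
        seen.add(key)
        cleaned.append({"name": name, "age": int(age_text)})
    return cleaned


# Hypothetical raw input with typical problems: padding, bad values, duplicates.
raw = [
    {"name": "  alice ", "age": "30"},
    {"name": "Bob", "age": "n/a"},
    {"name": "alice", "age": 30},
    {"name": "", "age": "41"},
    {"name": "carol", "age": " 25 "},
]
clean = prepare(raw)
print(clean)
```

Only the rows for Alice and Carol survive here: the other rows are dropped because of a non-numeric age, a duplicate entry, or a missing name.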
Data input
The clean data is entered into its destination, which may be general-purpose data processing software or a processing system designed for specific needs
Processing
The data entered into the computer is processed for interpretation, either manually or with functions provided by the chosen data processing system
Data output/interpretation
The stage at which data finally becomes usable by people who are not data scientists, producing outputs like reports, charts, graphs, videos, images, or other visual aids and documents
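As a small illustration of the output stage, the sketch below turns processed totals into a plain-text bar chart that a non-specialist can read. The function name text_bar_chart and the sales figures are hypothetical, and the code assumes a non-empty set of positive values.

```python
def text_bar_chart(totals, width=20):
    """Render a dict of label -> positive number as a plain-text bar chart."""
    largest = max(totals.values())  # assumes at least one positive value
    lines = []
    for label, value in totals.items():
        # Scale each bar relative to the largest value.
        bar = "#" * round(width * value / largest)
        lines.append(f"{label:<8}{bar} {value}")
    return "\n".join(lines)


# Hypothetical processed results.
sales_by_region = {"North": 120, "South": 60, "West": 30}
report = text_bar_chart(sales_by_region)
print(report)
```

In practice this stage is more often handled by charting or reporting tools; the idea is the same, turning processed numbers into something readable at a glance.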
Data storage
The final stage of data processing, where the data is stored for future use
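One possible way to store processed data for future use is a small database. This sketch uses Python's built-in sqlite3 module with an in-memory database; the table name results, its columns, and the sample rows are assumptions for illustration only.

```python
import sqlite3


def store(records, connection):
    """Save (name, score) rows so they can be retrieved later."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS results (name TEXT, score REAL)"
    )
    connection.executemany(
        "INSERT INTO results (name, score) VALUES (?, ?)", records
    )
    connection.commit()


def load(connection):
    """Read the stored rows back, ordered by name."""
    cursor = connection.execute("SELECT name, score FROM results ORDER BY name")
    return cursor.fetchall()


# ":memory:" keeps the example self-contained; a file path would persist the data.
connection = sqlite3.connect(":memory:")
store([("alice", 91.5), ("bob", 78.0)], connection)
stored = load(connection)
print(stored)
```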
Types of data processing
Batch processing
Distributed processing
Multi-processing
Real-time processing
Transaction processing
Batch processing
Processes large groups of data at the same time, sacrificing immediate results for efficiency
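The idea of batch processing can be sketched as follows: items are grouped and each group is handled in one pass, so no result is available until its whole batch is done. The batch size and the per-batch operation (a simple sum) are assumptions for illustration.

```python
from itertools import islice


def batches(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def process_in_batches(items, size):
    """Process each batch as a unit; here the 'processing' is a sum."""
    totals = []
    for batch in batches(items, size):
        totals.append(sum(batch))
    return totals


batch_totals = process_in_batches(range(1, 11), size=4)
print(batch_totals)
```

The numbers 1 to 10 are split into groups of 4, 4, and 2, and one total is produced per group.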
Distributed processing
Processing is spread across multiple machines or servers, which offers high fault tolerance
Multi-processing
Uses multiple processors within the same physical component, which speeds up processing but leaves it more susceptible to slowdowns
Real-time processing
Processes information as quickly as possible, skipping entries with errors and moving on
Transaction processing
A real-time processing method for important information that must be error-free, pausing processing until errors are corrected
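The difference in error handling between real-time and transaction processing can be sketched side by side. The function names and the fix_entry callback are hypothetical; the callback stands in for whatever correction step (a person or another system) supplies a fixed value.

```python
def parse_amount(entry):
    """Convert one entry to a number; raises ValueError on bad input."""
    return float(entry)


def real_time(entries):
    """Process entries as they arrive, skipping any entry with an error."""
    results = []
    for entry in entries:
        try:
            results.append(parse_amount(entry))
        except ValueError:
            continue  # skip the bad entry and move on
    return results


def transactional(entries, fix_entry):
    """Process entries in order, pausing on each error until it is corrected."""
    results = []
    for entry in entries:
        while True:
            try:
                results.append(parse_amount(entry))
                break
            except ValueError:
                # Nothing after this entry is processed until a valid
                # correction arrives (fix_entry must eventually return one).
                entry = fix_entry(entry)
    return results


entries = ["10.5", "oops", "3"]
skipped = real_time(entries)
corrected = transactional(entries, fix_entry=lambda bad: "0")
print(skipped, corrected)
```

With the same input, the real-time version returns two values because the bad entry is dropped, while the transactional version returns three because the bad entry is corrected before processing continues.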