Data wrangling, also known as data munging or data preparation, is the process of refining, organizing, and enhancing raw data to make it more suitable for analysis.
Essential steps in data wrangling
Structuring
Cleaning
Enriching
Validating
Generating output
Structuring
1. Arrange the raw data into a format suitable for analysis
2. Convert data types
3. Reorder columns
4. Address missing values
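The structuring steps above can be sketched in plain Python. This is a minimal illustration, not a prescribed method: the records, the field names (order_id, order_date, amount), and the target column order are all hypothetical, and an empty string is assumed to mean "missing".

```python
from datetime import date

# Hypothetical raw records: every value arrives as text, keys in arbitrary order.
raw_rows = [
    {"amount": "19.99", "order_id": "1001", "order_date": "2024-03-05"},
    {"amount": "", "order_id": "1002", "order_date": "2024-03-06"},
]

# Assumed target layout for the analysis-ready table.
COLUMN_ORDER = ["order_id", "order_date", "amount"]


def structure(row):
    """Convert types, mark missing values as None, and reorder columns."""
    converted = {
        "order_id": int(row["order_id"]),
        "order_date": date.fromisoformat(row["order_date"]),
        # Assumption: an empty string stands for a missing value.
        "amount": float(row["amount"]) if row["amount"] else None,
    }
    # Rebuild the dict so keys follow the chosen column order.
    return {name: converted[name] for name in COLUMN_ORDER}


structured = [structure(row) for row in raw_rows]
```

Marking missing values explicitly as None at this stage, rather than filling them in, leaves the decision about how to handle them to the cleaning step.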
Cleaning
1. Identify and address errors, inconsistencies, or outliers within the dataset
2. Remove duplicate records
3. Rectify typos
4. Manage missing or incomplete information
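As a rough sketch of the cleaning steps above, the following plain-Python example removes exact duplicates, corrects a known typo, discards an implausible value, and fills missing values with the median. The data, the correction table CITY_FIXES, and the plausible range VALID_TEMP are assumptions made for illustration; median imputation is only one of several possible strategies.

```python
from statistics import median

# Hypothetical records with a duplicate, a typo, a missing value, and an outlier.
records = [
    {"id": 1, "city": "Berlin", "temp_c": 21.0},
    {"id": 1, "city": "Berlin", "temp_c": 21.0},  # exact duplicate
    {"id": 2, "city": "Berlni", "temp_c": 19.5},  # typo in city name
    {"id": 3, "city": "Paris", "temp_c": None},   # missing measurement
    {"id": 4, "city": "Paris", "temp_c": 950.0},  # implausible outlier
]

CITY_FIXES = {"Berlni": "Berlin"}  # assumed table of known typos
VALID_TEMP = (-60.0, 60.0)         # assumed plausible range in degrees Celsius


def clean(rows):
    seen, cleaned = set(), []
    for row in rows:
        key = (row["id"], row["city"], row["temp_c"])
        if key in seen:
            continue  # skip exact duplicate records
        seen.add(key)
        fixed = dict(row)
        fixed["city"] = CITY_FIXES.get(fixed["city"], fixed["city"])
        temp = fixed["temp_c"]
        # Treat out-of-range values as missing rather than trusting them.
        if temp is not None and not (VALID_TEMP[0] <= temp <= VALID_TEMP[1]):
            fixed["temp_c"] = None
        cleaned.append(fixed)

    # Fill the remaining gaps with the median of the known values.
    known = [r["temp_c"] for r in cleaned if r["temp_c"] is not None]
    fill = median(known) if known else None
    for r in cleaned:
        if r["temp_c"] is None:
            r["temp_c"] = fill
    return cleaned


cleaned = clean(records)
```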
Enriching
1. Enhance the dataset by adding relevant information from other sources
2. Merge datasets
3. Extract additional features
4. Incorporate external data to provide more context and depth to the analysis
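One way to picture the enriching steps above is a lookup-based merge plus a derived feature. The sketch below assumes a hypothetical orders list and a hypothetical external customers table keyed by customer_id; the is_weekend feature is an invented example of feature extraction.

```python
from datetime import date

# Hypothetical primary dataset.
orders = [
    {"order_id": 1001, "customer_id": "C1", "order_date": date(2024, 3, 5)},
    {"order_id": 1002, "customer_id": "C9", "order_date": date(2024, 3, 9)},
]

# Hypothetical external source, keyed by customer_id.
customers = {"C1": {"country": "DE", "segment": "retail"}}


def enrich(order):
    """Merge in customer attributes and derive an extra feature."""
    extra = customers.get(order["customer_id"], {})
    enriched = {
        **order,
        # Behaves like a left join: unmatched orders keep None for these fields.
        "country": extra.get("country"),
        "segment": extra.get("segment"),
    }
    # Derived feature: weekday() returns 5 for Saturday and 6 for Sunday.
    enriched["is_weekend"] = order["order_date"].weekday() >= 5
    return enriched


enriched_orders = [enrich(order) for order in orders]
```

Keeping unmatched rows, with None in the added fields, avoids silently dropping data when the external source is incomplete.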
Validating
1. Ensure that the data adheres to specific rules, standards, or expectations
2. Identify any remaining errors or inconsistencies
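The validating steps above might be expressed as a small set of named rules checked against every row. The rules and sample rows here are hypothetical; real rules would come from the standards and expectations that apply to the dataset at hand.

```python
# Hypothetical rows to validate after cleaning and enriching.
rows = [
    {"order_id": 1001, "amount": 19.99, "country": "DE"},
    {"order_id": 1002, "amount": -5.0, "country": "DE"},
    {"order_id": 1002, "amount": 12.5, "country": "Germany"},
]

# Assumed per-row rules: each maps a message to a check returning True when valid.
RULES = {
    "amount must be non-negative": lambda r: r["amount"] >= 0,
    "country must be a two-letter code": lambda r: (
        len(r["country"]) == 2 and r["country"].isupper()
    ),
}


def validate(rows):
    """Return a list of (row_index, message) pairs for every violated rule."""
    problems = []
    seen_ids = set()
    for index, row in enumerate(rows):
        for message, rule in RULES.items():
            if not rule(row):
                problems.append((index, message))
        # Cross-row rule: order_id is expected to be unique.
        if row["order_id"] in seen_ids:
            problems.append((index, "order_id must be unique"))
        seen_ids.add(row["order_id"])
    return problems


problems = validate(rows)
for index, message in problems:
    print(f"row {index}: {message}")
```

Collecting all violations instead of stopping at the first one makes it easier to see which remaining errors should be sent back to the cleaning stage.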