Data Cleaning

Data

  • Data quality is crucial to data analysis: errors and faults in the data distort the analysis and lead to wrong predictions.
  • Bad data consists of empty cells, data in the wrong format, wrong data, and duplicates.
  • Before performing any sort of data analysis, data scientists spend a significant amount of time cleaning the dataset so that it is in the refined state that data science techniques require.
  • Handling messy data, such as missing values, inconsistent formatting, and incorrect data types, is an essential part of a data scientist's job.
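
The four kinds of bad data listed above can be spotted with a few quick checks. The following is a minimal sketch, assuming the pandas library is used (the text does not name a tool); the table and its column names (`name`, `age`, `joined`) are hypothetical and invented purely for illustration.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
raw = pd.DataFrame({
    "name": ["Asha", "Ben", "Ben", "Chen"],
    "age": ["34", "twenty", "twenty", None],
    "joined": ["2021-01-05", "2021-02-30", "2021-02-30", "2021-03-12"],
})

# Empty cells: number of missing values in each column.
empty_cells = raw.isna().sum()

# Duplicates: rows that repeat an earlier row exactly.
duplicate_rows = int(raw.duplicated().sum())

# Wrong format: non-missing values that cannot be parsed as numbers.
parsed_age = pd.to_numeric(raw["age"], errors="coerce")
bad_age = parsed_age.isna() & raw["age"].notna()

# Wrong data: strings that look like dates but name a day that does not exist.
parsed_date = pd.to_datetime(raw["joined"], format="%Y-%m-%d", errors="coerce")
bad_date = parsed_date.isna() & raw["joined"].notna()

print(empty_cells)
print("duplicate rows:", duplicate_rows)
print(raw[bad_age | bad_date])
```

Checks like these only report problems; deciding whether to fix, fill, or drop the affected rows depends on the dataset.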

The process of data cleaning is largely covered by these steps:

  • Dropping irrelevant columns.

  • Renaming column names to meaningful names.

  • Making data values consistent.

  • Imputing missing values.
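
As a rough preview, the four steps above can be sketched in a few lines. This is an illustrative sketch only, assuming pandas; the table, the column names (`cust_nm`, `cty`, `amt`, `internal_id`), and the choice of the median for imputation are all assumptions made for the example, not a prescribed method.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
raw = pd.DataFrame({
    "cust_nm": ["Asha", "Ben", "Chen", "Dee"],
    "cty": [" london", "London", "PARIS ", "paris"],
    "amt": [120.0, None, 80.0, 100.0],
    "internal_id": [9, 8, 7, 6],
})

# Step 1: drop columns that are irrelevant to the analysis.
df = raw.drop(columns=["internal_id"])

# Step 2: rename columns to meaningful names.
df = df.rename(columns={"cust_nm": "customer", "cty": "city", "amt": "amount"})

# Step 3: make values consistent (trim whitespace, unify capitalization).
df["city"] = df["city"].str.strip().str.title()

# Step 4: impute missing values; the median is one common choice (an assumption here).
df["amount"] = df["amount"].fillna(df["amount"].median())

print(df)
```

Each step returns a new DataFrame or column rather than modifying `raw`, so the original data stays available for comparison.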

In the coming tutorials, we will cover each of these steps. By the end of the series, you will be able to clean data and prepare it for analysis.