Pandas Intro

About Pandas

Welcome to Pandas Data Manipulation!

Pandas is a Python library for working with tabular data (data that has rows and columns). It combines the functionality of tools such as SQL and Excel with the power of Python, and data analysts use it to ingest, organize, and analyze large data sets. It provides functions for analyzing, cleaning, exploring, and manipulating data.

Why Pandas is important

Pandas lets us examine large amounts of data and draw conclusions from it using statistical methods. It can also clean up messy data sets, for example ones with missing values, duplicate rows, or inconsistent formats, making them more readable and relevant. Data relevance is critical in data science.

By the end of this section, you will know how to ingest or create tables of data, summarize data with aggregates, and combine data from multiple tables using merge.
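
As a preview of those three skills, here is a minimal sketch. The table contents, column names, and variable names (orders, customers, totals, report) are made up for illustration and are not part of the course material. It reads one table from CSV text, creates a second table directly, sums the order amounts per customer, and merges the two results.

```python
import io

import pandas as pd

# Ingest: read a table from CSV text (hypothetical sample data).
# In practice this would usually be a file path passed to pd.read_csv.
csv_text = """order_id,customer_id,amount
1,10,25.0
2,11,40.0
3,10,15.0
4,12,30.0
"""
orders = pd.read_csv(io.StringIO(csv_text))

# Create: build a second table directly from a dictionary of columns.
customers = pd.DataFrame({
    "customer_id": [10, 11, 12],
    "name": ["Ana", "Ben", "Cara"],
})

# Summarize: total order amount per customer (an aggregate).
totals = orders.groupby("customer_id", as_index=False)["amount"].sum()

# Combine: attach each customer's name to their total using merge.
report = totals.merge(customers, on="customer_id", how="left")
print(report)
```

Each step is covered in more detail later in the section; the point here is only to show how little code a typical ingest, summarize, and combine workflow can take.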