6 Pandas Functions for a Quick Exploratory Data Analysis

Soner Yıldırım
6 min read · Dec 25, 2021

The first step into exploring your data

The most important ingredient of a data science project or product is the data. The better we understand the data, the more accurate and robust the end product is.

This is why a substantial share of the time in a project is spent on data cleaning and exploratory data analysis.

Pandas, one of the most widely used data analysis and manipulation libraries, provides a rich selection of functions to help explore the data.

An exploratory data analysis can become very detailed depending on the task, and how you approach the data depends on its characteristics. However, some fundamental operations are performed in almost every case.

In this article, we will go over 6 Pandas functions to perform a quick exploratory data analysis.

I have previously created a sales dataset filled with mock data. Let’s start by creating a Pandas data frame from this dataset. Feel free to download it and follow along.

import pandas as pd
sales = pd.read_csv("sales.csv")
sales.head()

Sales data frame (image by author)

The dataset contains stock, cost, price, and sales information of some products at a retail store.
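
If you do not have the file at hand, the loading step can be tried on a small stand-in. The sketch below is only an illustration: the column names and values are made up and are not the article's actual dataset, so the results will differ from the ones shown in this article.

```python
import io
import pandas as pd

# Hypothetical stand-in for sales.csv; these column names and values are
# assumptions for illustration, not the article's actual data.
csv_text = """product_code,stock_qty,cost,price,last_week_sales
1001,120,4.50,6.99,35
1002,0,12.00,17.50,12
1003,48,1.20,1.99,80
"""

# read_csv accepts a file path or any file-like object, such as StringIO.
sales_demo = pd.read_csv(io.StringIO(csv_text))

# head(n) returns the first n rows; without an argument it returns 5.
print(sales_demo.head(2))
```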

1. Shape

The shape attribute returns a tuple that shows the number of rows and columns in a data frame. Since it is an attribute rather than a method, it is accessed without parentheses.

sales.shape
(1000, 8)

The sales data frame has 1000 rows and 8 columns. Pandas also has a size attribute that returns the total number of cells (i.e. the number of rows times the number of columns).

sales.size
8000
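
The relationship between shape and size can be checked on a small frame. This is a minimal sketch on a hypothetical three-row data frame (the column names are assumptions), not the sales data used above.

```python
import pandas as pd

# Small hypothetical frame; the column names are assumptions for illustration.
df = pd.DataFrame({
    "stock_qty": [10, 0, 25],
    "price": [1.5, 2.25, 3.0],
})

# shape is a (rows, columns) tuple, so it can be unpacked directly.
n_rows, n_cols = df.shape
print(n_rows, n_cols)

# size is the total number of cells: rows times columns.
print(df.size == n_rows * n_cols)

# len() on a data frame gives the number of rows only.
print(len(df))
```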

2. Dtypes

It is important to use appropriate data types for all the columns, for three main reasons:

  • Some functions and methods work more efficiently with specific data types.
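
As a rough illustration of why data types matter, the sketch below inspects the dtypes attribute and converts two columns to more suitable types. The data frame and its column names are hypothetical, and the exact dtype reported for text columns depends on the Pandas version.

```python
import pandas as pd

# Hypothetical mock data; column names are assumptions, not the article's dataset.
df = pd.DataFrame({
    "product_group": ["A", "B", "A", "C"] * 250,
    "price": ["1.50", "2.25", "3.00", "4.75"] * 250,  # numbers stored as text
})

# dtypes is an attribute that lists the data type of every column.
# Both columns hold text here, so numeric operations on price are not available yet.
print(df.dtypes)

# Convert each column to a more appropriate type.
df["product_group"] = df["product_group"].astype("category")
df["price"] = df["price"].astype("float64")

print(df.dtypes)

# With a numeric dtype, aggregations such as mean() work on the price column.
print(df["price"].mean())
```

The category type is a common choice for columns with few distinct values, and a numeric type is required before aggregations like the mean make sense.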
