Practical Data Analysis Guide with Pandas and Altair

Soner Yıldırım
5 min read · Jan 20, 2021


An analysis of the songs collected from Spotify

Photo by Puria Berenji on Unsplash

Pandas is a highly popular data analysis and manipulation library for Python. It provides numerous functions and methods for efficient data analysis. Although Pandas offers some basic plotting, a dedicated data visualization library is preferable for creating advanced and versatile visualizations.

There are many data visualization libraries for Python, such as Matplotlib, Seaborn, and Altair. In this article, we will use Altair along with Pandas to analyze a dataset of songs collected from the Spotify API.

Altair is a statistical visualization library for Python. Its syntax is clean and easy to understand, as we will see in the examples. Altair also supports data transformations, such as filtering and aggregation, directly in the chart definition, which makes it well suited to exploratory data analysis.

Let’s start by importing the libraries and reading the dataset into a Pandas DataFrame.

import numpy as np
import pandas as pd
import altair as alt

df = pd.read_csv("/content/data_by_year.xls", parse_dates=['year'])
df.shape
(100, 14)
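The parse_dates=['year'] argument asks Pandas to read the year column as dates rather than plain integers. The sketch below illustrates the same idea on a small in-memory CSV, using an explicit conversion with pd.to_datetime so the result does not depend on automatic format inference. The column names and values here are invented for illustration and are not taken from the real file.

```python
import io

import pandas as pd

# Hypothetical sample data: invented values, not rows from data_by_year.
csv_text = (
    "year,tempo,danceability\n"
    "1921,101.5,0.42\n"
    "1922,100.9,0.48\n"
    "1923,114.0,0.58\n"
)

sample = pd.read_csv(io.StringIO(csv_text))

# Convert the integer years to timestamps (January 1st of each year).
sample["year"] = pd.to_datetime(sample["year"].astype(str), format="%Y")

print(sample.dtypes)
print(sample["year"].dt.year.tolist())
```

With the column stored as datetimes, the .dt accessor and time-aware chart axes become available.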

The dataset (data_by_year) has 100 rows and 14 columns: one row per year from 1921 to 2020, with features that describe song characteristics.
