Practical Data Analysis Guide with Pandas and Altair
An analysis of the songs collected from Spotify
Pandas is a highly popular data analysis and manipulation library for Python. It contains numerous functions and methods for efficient data analysis. Although Pandas offers some basic plotting, a dedicated data visualization library is preferable for creating advanced and versatile visualizations.
There are many data visualization libraries for Python, such as Matplotlib, Seaborn, and Altair. In this article, we will use Altair along with Pandas to analyze a dataset of songs collected from the Spotify API.
Altair is a statistical visualization library for Python. Its syntax is clean and easy to understand, as we will see in the examples. Altair is also highly flexible in terms of data transformations, which makes the library well suited to exploratory data analysis.
Let’s start by importing the libraries and reading the dataset into a Pandas dataframe.
import numpy as np
import pandas as pd
import altair as alt

df = pd.read_csv("/content/data_by_year.xls", parse_dates=['year'])
df.shape
(100, 14)
The dataset (data_by_year) has 100 rows and 14 columns. Its features describe song characteristics for the years 1921 to 2020.
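To get a feel for what a datetime year column looks like, here is a minimal sketch on a tiny made-up sample rather than the real file. The column names other than year and all of the values are assumptions for illustration only. Instead of relying on parse_dates, it converts the years explicitly with pd.to_datetime, which is one way to make the format unambiguous.

```python
import io

import pandas as pd

# Hypothetical three-row sample; the "tempo" and "danceability" columns
# and their values are made up for illustration, not taken from the dataset.
csv_text = """year,tempo,danceability
1921,101.5,0.42
1922,100.9,0.48
1923,114.0,0.58
"""

sample = pd.read_csv(io.StringIO(csv_text))

# Convert integer years to timestamps explicitly.
# With format="%Y", each value becomes January 1 of that year.
sample["year"] = pd.to_datetime(sample["year"].astype(str), format="%Y")

print(sample.shape)   # (rows, columns)
print(sample.dtypes)  # "year" is now a datetime column
print(sample["year"].min(), sample["year"].max())
```

Having the year stored as a datetime rather than a plain integer lets a plotting library treat it as a temporal axis, which is convenient for time-series charts.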