5 Ways to Filter Data in R

Soner Yıldırım
5 min read · Mar 5, 2023

A fundamental piece in data cleaning

Photo by Samantha Gades on Unsplash

Python and R are the two key players in the data science ecosystem. While R is not as popular as Python, it is just as efficient and capable as Python at data manipulation and analysis, and even outperforms it in some cases.

In this article, we will learn 5 different ways to filter data in R. Filtering is one of the most frequently performed data wrangling operations. We filter data for two main reasons:

  1. Not all the data is needed for the task at hand
  2. Some part of the data is redundant, not useful, or just bad

How to filter data largely depends on the data type, but most methods can be used with different data types, as we will see in the examples.

We will be using a sample dataset that I prepared with mock data. You can download it from my datasets repo. Let’s start by creating a data table from the “sales_data_with_stores” csv file.

library(data.table)

dt <- fread("sales_data_with_stores.csv")

# display the first 6 rows
head(dt)

First 6 rows of the data (image by author)
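
If you do not have the csv file at hand, the general filtering syntax can still be tried on a small stand-in table. The following is a minimal sketch, not part of the original dataset: the table `mock` and its columns `store`, `price`, and `qty` are hypothetical names and values made up for illustration. It shows that the same `dt[condition]` form works for both numeric and textual columns.

```r
library(data.table)

# Hypothetical stand-in data: column names and values are made up for
# illustration and are NOT taken from sales_data_with_stores.csv
mock <- data.table(
  store = c("A", "B", "A", "C"),
  price = c(9.5, 120.0, 45.2, 300.0),
  qty   = c(10L, 0L, 0L, 3L)
)

# Numeric condition: keep rows where price is above 100
expensive <- mock[price > 100]

# Textual condition: keep rows that belong to store "A"
store_a <- mock[store == "A"]

# Conditions can be combined with & (and) or | (or)
store_a_in_stock <- mock[store == "A" & qty > 0]

print(expensive)
print(store_a)
print(store_a_in_stock)
```

The condition goes in the first position inside the square brackets, and column names can be used there directly without quoting or prefixing them with the table name.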

The dataset contains both numeric and textual columns. Before we start, let’s briefly…
