We see a bunch of different clocks in this photo. They have different shapes, colors, and sizes. However, all of them are a type of clock. We can think of classes in Python in a similar way. A class represents a type (clock) and we can create many instances of that type (clocks in the photo above).
The object-oriented programming (OOP) paradigm is built around the idea of having objects that belong to a particular type. In a sense, the type is what explains the object to us.
The explanation of an object is of crucial importance for OOP. …
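As a minimal sketch of the clock analogy, the hypothetical `Clock` class below defines the type, and each object created from it is a separate instance. The class name, its attributes, and the `describe` method are illustrative assumptions, not taken from the original article.

```python
class Clock:
    """A type that describes what every clock has in common."""

    def __init__(self, shape, color, size_cm):
        # Each instance stores its own values for these attributes.
        self.shape = shape
        self.color = color
        self.size_cm = size_cm

    def describe(self):
        # Build a short description from the instance's own attributes.
        return f"{self.color} {self.shape} clock, {self.size_cm} cm"


# Two instances of the same type with different attribute values.
wall_clock = Clock("round", "white", 30)
desk_clock = Clock("square", "black", 10)

print(type(wall_clock) is type(desk_clock))
print(wall_clock.describe())
```

Both objects share the same type, so they offer the same attributes and methods, but each one carries its own shape, color, and size.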
SQL is a programming language that is used to manage data stored in tabular form (i.e. tables) in relational databases.
A relational database consists of multiple tables that relate to each other. The relation between tables is formed in the sense of shared columns.
There are many different relational database management systems (e.g. MySQL, PostgreSQL, SQL Server). The SQL syntax they adopt might differ slightly. However, the differences are small, so if you learn how to use one, you can easily switch to another.
In this article, we will go over 30 examples that cover the following operations with…
Pandas is one of the most popular data analysis libraries. There are numerous Pandas functions and methods that ease and expedite the data cleaning and analysis process.
Pandas also provides some functions that are not so common but come in handy for certain tasks. In this post, we will cover 7 uncommon Pandas functions.
The functions that will be discussed are:
We always start by importing the dependencies.
import numpy as np
import pandas as pd
The clip function trims the values in a dataframe based on the given lower and upper bounds. It does not drop the rows that fall outside the specified range. Instead, if a value is outside the boundaries, the clip function sets it equal to the appropriate boundary value. …
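The behavior described above can be sketched with a small example. The dataframe and the bounds of 0 and 10 are made-up values chosen only for illustration.

```python
import pandas as pd

# A small made-up dataframe with values on both sides of the bounds.
df = pd.DataFrame({"A": [-5, 2, 9], "B": [0.5, 12.0, -3.0]})

# Values below 0 become 0 and values above 10 become 10.
# The number of rows stays the same; nothing is dropped.
clipped = df.clip(lower=0, upper=10)

print(clipped)
```

Values that are already inside the range, such as 2 and 0.5, are left unchanged.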
Data science has experienced tremendous growth in recent years. The potential to create value out of data has attracted businesses, which in turn has driven new investments in this field.
The popularity and potential of data science, along with the increasing demand for data scientists, have led many people to make a career change to work in this field.
The biggest challenge for aspiring data scientists is to take the first step into the field. I think the following reasons make it hard to take that first step:
Pandas is one of the most commonly used data analysis and manipulation libraries in the data science ecosystem. It offers plenty of functions and methods to perform efficient operations.
What I like most about Pandas is that there are almost always multiple ways to accomplish a given task. However, we should consider time and computational complexity when selecting a method from the available options.
It is not enough just to complete a given task. We should make it as efficient as possible. Thus, having a comprehensive understanding of how functions and methods work is of crucial importance.
In this article, we will go through examples that compare the apply and applymap functions of pandas to vectorized operations. The apply and applymap functions come in handy for many tasks. However, as the size of data increases, time becomes an issue. …
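A rough way to see the difference is to time a row-wise apply against the equivalent vectorized expression. This is only a sketch: the dataframe, the column names `a` and `b`, and the helper `row_sum` are assumptions made for illustration, and the measured times depend on the machine and the pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Made-up data: two integer columns with 10,000 rows each.
df = pd.DataFrame({"a": np.arange(10_000), "b": np.arange(10_000)})


def row_sum(row):
    # Called once per row when used with apply(axis=1).
    return row["a"] + row["b"]


# Row-wise apply: a Python-level function call for every row.
t_apply = timeit.timeit(lambda: df.apply(row_sum, axis=1), number=1)

# Vectorized: the whole columns are added in a single operation.
t_vectorized = timeit.timeit(lambda: df["a"] + df["b"], number=1)

print(f"apply:      {t_apply:.4f} s")
print(f"vectorized: {t_vectorized:.4f} s")
```

Both approaches produce the same values, so the comparison is only about how long each one takes.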
SQL is a programming language used by most relational database management systems (RDBMS) to manage data stored in tabular form (i.e. tables). A relational database consists of multiple tables that relate to each other. The relation between tables is formed with shared columns.
When we retrieve data from a relational database, the desired data is typically spread across multiple tables. In such cases, we use SQL joins, which select rows from two or more related tables.
In order to be consistent while selecting rows from different tables, SQL joins make use of the shared column. In this article, we will go over 7 examples to demonstrate how SQL joins can be used to retrieve data from multiple tables. …
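To illustrate how a join relies on a shared column, here is a minimal sketch that runs SQL through Python's built-in sqlite3 module. The `customer` and `orders` tables, their columns, and the rows are hypothetical and are not the tables used in the article.

```python
import sqlite3

# An in-memory database that disappears when the connection is closed.
conn = sqlite3.connect(":memory:")

# Two made-up tables that share the cust_id column.
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER, amount REAL);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# The join matches rows from both tables on the shared column.
rows = conn.execute("""
    SELECT c.name, o.amount
    FROM customer AS c
    JOIN orders AS o ON o.cust_id = c.cust_id
    ORDER BY o.order_id
""").fetchall()

print(rows)
conn.close()
```

Each returned row combines a customer name from one table with an order amount from the other, matched through `cust_id`.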
A function is a block of code that takes zero or more inputs, performs some operations, and returns a value. Functions are essential tools to create efficient and powerful programs.
In this article, we will cover a special form of functions in Python: lambda expressions. The first and foremost point we need to emphasize is that a lambda expression is a function.
square = lambda x: x**2
type(square)
The square is a function that returns the square of a number. In the traditional form of defining functions in Python, the square function would look as below.
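A minimal sketch of that traditional form, placed next to the lambda version for comparison. The name `square_def` is used here only to keep the two definitions apart.

```python
# Lambda form: a single expression bound to a name.
square = lambda x: x**2


# Traditional form: the same logic written with def and return.
def square_def(x):
    return x**2


print(type(square))
print(square(5), square_def(5))
```

Both names refer to ordinary function objects, and both return the same result for the same input.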
Data visualization is a powerful tool for exploratory data analysis. We can use it to reveal the underlying structure within data or the relationships among variables. An overview of basic descriptive statistics can also be obtained from a data visualization.
Data visualization is of crucial importance in the data science field. Thus, there are lots of libraries and packages in this domain. Although they have different syntax and methods to create visualizations, the ultimate goal is the same: explore and understand the data.
In this article, we will explore a medical cost dataset using the ggplot2 library of the R programming language. The dataset is good for practicing because it contains a mixture of variables with different data types. …
SQL is a language used for managing data in relational databases. The core component of a relational database is the table, which stores data in tabular form with labelled rows and columns.
We query data from a relational database with the select statement of SQL. The select statement is highly versatile and flexible in terms of data transformation and filtering operations.
In that sense, SQL can be considered as a data analysis tool. The advantage of using SQL for data transformation and filtering is that we only retrieve the data we need. …
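As a rough sketch of filtering and transforming inside the query itself, the example below uses Python's built-in sqlite3 module. The `sales` table, its columns, and its rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# An in-memory database that disappears when the connection is closed.
conn = sqlite3.connect(":memory:")

# A made-up table of sales records.
conn.executescript("""
    CREATE TABLE sales (item TEXT, region TEXT, price REAL);
    INSERT INTO sales VALUES
        ('pen', 'north', 2.0),
        ('book', 'north', 12.0),
        ('pen', 'south', 2.5),
        ('lamp', 'south', 30.0);
""")

# Filter rows with WHERE and aggregate with GROUP BY in the database,
# so only the summarized result is retrieved.
result = conn.execute("""
    SELECT region, ROUND(AVG(price), 2) AS avg_price
    FROM sales
    WHERE price > 2.0
    GROUP BY region
    ORDER BY region
""").fetchall()

print(result)
conn.close()
```

The query returns one summary row per region instead of every record in the table, which is the point about retrieving only the data we need.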
Pandas is a highly popular data analysis and manipulation library for Python. It is one of the very first tools introduced in data science education. Pandas provides plentiful functions and methods for more efficient data analysis.
What I like most about Pandas is that there are almost always multiple ways to complete a given task. One way might outperform others in terms of time and complexity. However, having multiple options makes you think outside the box. It also helps to improve your approach to solving complex tasks.
Another advantage of practicing different ways to solve an issue is that it greatly improves your knowledge of pandas. …