A complete tutorial on evaluation metrics for imbalanced classification

38 mins read A classifier is only as good as the metric used to evaluate it. If you choose the wrong metric to […]
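
As a minimal sketch of why metric choice matters on imbalanced data (the labels below are made up for illustration and are not taken from the tutorial), a classifier that always predicts the majority class scores high accuracy while finding none of the positives:

```python
# Hypothetical, illustrative labels: 95 negatives and 5 positives.
y_true = [0] * 95 + [1] * 5
# A trivial "classifier" that always predicts the majority class (0).
y_pred = [0] * 100

# Accuracy: fraction of predictions that match the true label.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Recall: fraction of actual positives that were predicted positive.
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_positives / sum(y_true)

print(f"accuracy={accuracy:.2f}, recall={recall:.2f}")
```

Here accuracy looks strong even though recall on the minority class is zero, which is the kind of mismatch the post's title points at.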

Exploratory Data Analysis (EDA) example: Road safety dataset case study

20 mins read Getting a good feel for a new dataset is not always easy and takes time. However, a good and broad […]

Pandas data selection using .loc and .iloc

8 mins read When it comes to selecting data from a DataFrame, Pandas loc and iloc are two top favorites. They are quick, easy to read, […]
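
A short sketch of the distinction, assuming pandas is installed; the DataFrame, its column names, and its index labels are hypothetical and chosen only for illustration. `.loc` selects by label, while `.iloc` selects by integer position:

```python
import pandas as pd

# Hypothetical example data with string index labels.
df = pd.DataFrame(
    {"city": ["Oslo", "Lima", "Pune"], "temp_c": [4, 19, 31]},
    index=["a", "b", "c"],
)

# Label-based: row labelled "b", column named "temp_c".
by_label = df.loc["b", "temp_c"]

# Position-based: second row, second column (zero-indexed).
by_position = df.iloc[1, 1]

# Label slices include both endpoints; positional slices exclude the end.
label_slice = df.loc["a":"b"]
pos_slice = df.iloc[0:1]

print(by_label, by_position, len(label_slice), len(pos_slice))
```

Note the slicing difference: `df.loc["a":"b"]` returns two rows, whereas `df.iloc[0:1]` returns one.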

Understanding hypothesis testing with Covid-19 case study (Z-test and t-test)

13 mins read Introduction: The coronavirus pandemic has made a statistician out of us all. We are constantly checking the numbers, making our […]

Difference between ETL and ELT

13 mins read At a high level, ETL transforms your data before loading, while ELT transforms data only after loading it into your warehouse. In this post, […]

Guidelines to use Transfer Learning in Convolutional Neural Networks

9 mins read Transfer Learning: How to adapt an expert’s CNN architecture that has already learned so much about how to find the […]

Installing g++ (C++ Compiler) on Windows

11 mins read Follow these steps to install g++ (the GNU C++ compiler) for Windows. There is no room for creativity here; you must […]

Kalman Filter Simply Explained

5 mins read Let’s start with what a Kalman filter is: It’s a method of predicting the future state of a system based […]

Common behavioral questions in job interviews

3 mins read 1. Getting to Know You: What motivates you at work? Describe what your preferred supervisor-employee relationship looks like. What two […]

A good LinkedIn profile checklist

3 mins read Here is a list of rules to make your LinkedIn profile professional: General Criteria Meet Specification Overall, profile is professional, […]

Styling Pandas DataFrames using Style API

10 mins read Python’s Pandas library allows you to present tabular data in a way similar to Excel. What’s not so similar is […]

Understanding the probabilistic interpretation of linear regression

6 mins read Linear regression is about finding a linear model that best fits a given dataset. For example, in a simple linear […]

Understanding Beta Distribution

9 mins read When to use Beta distribution: The Beta distribution is a probability distribution on probabilities. For example, we can use it to model […]
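
To illustrate "a distribution on probabilities", here is a minimal sketch using only Python's standard library. The shape parameters `alpha = 8` and `beta = 2` are hypothetical choices, not values from the post. Every sample lies in [0, 1], so each one can itself be read as a probability, and the sample mean should land close to the theoretical mean alpha / (alpha + beta):

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

# Hypothetical shape parameters, chosen only for illustration.
alpha, beta = 8, 2

# Draw samples from Beta(alpha, beta); each sample lies in [0, 1].
samples = [random.betavariate(alpha, beta) for _ in range(10_000)]

sample_mean = sum(samples) / len(samples)
theoretical_mean = alpha / (alpha + beta)  # mean of Beta(alpha, beta)

print(f"sample mean={sample_mean:.3f}, theoretical mean={theoretical_mean:.3f}")
```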

Walkthrough of an exploratory analysis for classification problems

20 mins read In this post, I’ll outline how to perform an exploratory analysis for a binary classification problem. I am going to […]

Dealing with imbalanced data in machine learning

8 mins read Imbalanced classes are a common problem in machine learning classification where there is a disproportionate ratio of observations in each […]

List of useful tutorials for Exploratory Data Analysis (EDA)

< 1 min
https://towardsdatascience.com/exploratory-data-analysis-8fc1cb20fd15
https://medium.com/omarelgabrys-blog/statistics-probability-exploratory-data-analysis-714f361b43d1
https://www.kaggle.com/ekami66/detailed-exploratory-data-analysis-with-python
https://www.kaggle.com/dvigneshwer/kernele7f4dbb964/notebook
Visualizing the distribution of a dataset (seaborn 0.10.0 documentation): https://seaborn.pydata.org/tutorial/distributions.html
https://www.kaggle.com/kashnitsky/topic-1-exploratory-data-analysis-with-pandas
https://iq.opengenus.org/exploratory-data-analysis-python/
Plotting with categorical data […]

Types of Data & Measurement Scales: Nominal, Ordinal, Interval and Ratio

6 mins read There are four measurement scales: nominal, ordinal, interval, and ratio. These are simply ways to categorize different types of variables […]

Using Kaggle Datasets in Google Colab

< 1 min Steps: Create an API key in Kaggle. To do this, go to kaggle.com/ and open your user settings page. Next, scroll […]

Getting Started With Google Colab

5 mins read If you want to create a machine learning model but say you don’t have a computer that can take the […]

Understanding Gated Recurrent Unit (GRU) with PyTorch code

21 mins read The Gated Recurrent Unit (GRU) is the younger sibling of the more popular Long Short-Term Memory (LSTM) network, and also a […]