Understanding and interpreting Residuals Plot for linear regression

27 mins read Interpreting Residual Plots to Improve Your Regression: When you run a regression, calculating and plotting residuals help you understand and improve your […]

Implementing Transformers step-by-step in PyTorch from scratch

14 mins read Doing away with clunky for-loops, the transformer instead finds a way to allow whole sentences to simultaneously enter the network […]

A review on information theory concepts for machine learning: Entropy, Cross-Entropy, KL divergence, Information gain, and Mutual Information

58 mins read Information Theory: Information theory is a field of study concerned with quantifying information for communication. It is a subfield of mathematics […]

Understanding ROC and Precision-Recall curves

25 mins read It can be more flexible to predict probabilities of an observation belonging to each class in a classification problem rather […]

A tutorial on data science project experimentation with Jupyter, Papermill, and MLflow

7 mins read Your company (e.g., an e-commerce platform across several countries) is starting a new project on fraud detection. You begin by […]

Interpreting coefficients of Dummy Variables in a Linear Regression Model

5 mins read Linear regression is a method we can use to quantify the relationship between one or more predictor variables and a response variable. […]

Styling Pandas dataframes using Styler

7 mins read What is styling and why care? The basic idea behind styling is that a user will want to modify the way […]

Mel Spectrogram Explained with Python Code

6 mins read Signals: A signal is a variation in a certain quantity over time. For audio, the quantity that varies is air pressure. How […]

A comprehensive tutorial on Transformers Architecture

43 mins read We’ve been hearing a lot about Transformers and with good reason. They have taken the world of NLP by storm […]

Categorical data type in Pandas

8 mins read You may have categorical data in your dataset. Categorical data is a type of data that takes one of two or more categories. If […]

NumPy Broadcasting tutorial

13 mins read In operations between NumPy arrays (ndarray), the arrays' shapes are automatically converted to a common shape by broadcasting. This article describes the following […]

Minimal PyTorch LSTM example for regression and classification tasks

10 mins read The Idea Behind RNNs: Recurrent neural networks in general maintain state information about data previously passed through the network. This […]

Understanding TF-IDF with Python example

6 mins read Term Frequency – Inverse Document Frequency (TF-IDF) is a widely used statistical method in natural language processing and information retrieval. […]

Understanding 1D, 2D, and 3D convolutional layers in deep neural networks

21 mins read In deep learning, convolutional layers have been major building blocks in many deep neural networks. The design was inspired by […]

Understanding Attention Mechanism with example

14 mins read For decades, Statistical Machine Translation was the dominant translation model, until the birth of Neural Machine Translation (NMT). NMT is an […]

Different approaches for finding feature importance using Random Forests

16 mins read In many (business) cases, it is important to have a model that is not only accurate but also interpretable. Oftentimes, […]

Out of Bag (OOB) score in Random Forests with example

12 mins read Introduction: This post describes the intuition behind the Out of Bag (OOB) score in Random Forest, how it is calculated, […]

Understanding the Random Forest algorithm and its hyperparameters

17 mins read In this post, we will see how the Random Forest algorithm works internally. To truly appreciate it, it might be […]

Difference between discriminative and generative machine learning models

8 mins read Introduction: In today’s world, machine learning has become one of the most popular and exciting fields of study, one that gives machines the ability […]

Feature Selection for categorical data with Python code

17 mins read Feature selection is the process of identifying and selecting a subset of input features that are most relevant to the target […]