2022-07-09

Which performance metrics to use for evaluating a classification model on imbalanced datasets?

8 mins read There are various metrics to evaluate a classification model: Accuracy, Precision, Recall, F1-score, and AUC-ROC score. However, it is always […]
2022-07-08

Understanding the ROC curve and AUC-ROC with Python example

17 mins read The AUC (Area Under the Curve)-ROC (Receiver Operating Characteristic) curve helps us visualize how well our machine learning classifier is performing. Although […]
2022-07-07

Hyperparameter optimization with Scikit-Learn GridSearchCV using different models

4 mins read It is a bit difficult to manually perform a grid search across different models in scikit-learn. We usually need to […]
2022-07-03

Visual comparison of decision boundaries for different classifiers

33 mins read There are many debates on how to decide on the best classifier. Measuring the Performance Metrics score, and getting the […]
2022-07-01

Handling imbalanced datasets for machine learning tasks

12 mins read You can find the code for this post in the GitHub Gist. Introduction When observations in one class […]
2022-07-01

Speed up Pandas using Numba

20 mins read Numba is a widely used library for speeding up computations in Python code. It lets us speed up […]
2022-06-26

A complete guide on Pandas Grouping, Aggregating, and Transformation

51 mins read Introduction One of the most basic analysis functions is grouping and aggregating data. In some cases, this level of analysis […]
2022-06-25

Understanding Moving Average Model in Time Series with Python

10 mins read One of the foundational models for time series forecasting is the moving average model, denoted as MA(q). This is one […]
2022-06-23

A tutorial on Pandas apply, applymap, map, and transform

16 mins read In data processing, it is often necessary to perform operations (such as statistical calculations, splitting, or substituting values) on a […]
2022-06-19

Evaluation metrics for Multi-Label Classification with Python codes

10 mins read In a traditional classification problem formulation, classes are mutually exclusive. In other words, under the condition of mutual exclusivity, each […]
2022-06-19

Understanding Micro, Macro, and Weighted Averages for Scikit-Learn metrics in multi-class classification with example

11 mins read The F1 score (aka F-measure) is a popular metric for evaluating the performance of a classification model. In the case […]
2022-06-19

Why are precision, recall, and F1 score equal when using micro averaging in a multi-class problem?

9 mins read In one of my projects, I was wondering why I get the exact same value for precision, recall, and the F1 score when using scikit-learn’s metrics. […]
2022-06-18

A guide on regression error metrics (MSE, RMSE, MAE, MAPE, sMAPE, MPE) with Python code

25 mins read Regression is one of the most commonly used tools in a data scientist’s kit. The quality of a regression model is how […]
2022-06-15

Understanding Contiguous vs Non-Contiguous Tensors in PyTorch

13 mins read Tensor and View A view uses the same data chunk as the original tensor, just with a different way to ‘view’ its […]
2022-06-14

Deploying and sharing Machine Learning projects easily using Gradio

7 mins read Students or professionals from other fields, such as business studies, practice and excel in data science. But when it comes to […]
2022-06-13

Detecting elbow/knee points in a graph using Python

16 mins read Theory When working with data, it is sometimes important to know where a data point’s “relative costs to increase some […]
2022-06-03

A complete guide on feature selection techniques with Python code

33 mins read Considering you are working on high-dimensional data that’s coming from IoT sensors or healthcare with hundreds to thousands of features, […]
2022-05-30

A tutorial on Scikit-Learn Pipeline, ColumnTransformer, and FeatureUnion

20 mins read These three powerful tools are must-knows for anyone who wants to master sklearn. It is therefore crucial to learn how to […]
2022-05-29

Understanding different types of Scikit-Learn Cross Validation methods

14 mins read Cross-validation is an important concept in machine learning that helps data scientists in two major ways: it can reduce the […]
2022-05-28

How to interpret logistic regression coefficients?

15 mins read Logistic Regression is a fairly simple yet powerful Machine Learning model that can be applied to various use cases. It’s […]