2022-08-01

A simple tutorial on Sampling Importance and Monte Carlo with Python codes

16 mins read Introduction In this post, I’m going to explain importance sampling. Importance sampling is an approximation method instead of a […]
2022-07-30

What is Reservoir Sampling in Stream Processing?

4 mins read Reservoir sampling is a fascinating algorithm that is especially useful when you have to deal with streaming data, which is […]
2022-07-30

A comprehensive tutorial on MLflow for MLOps: From experimentation to production

39 mins read After reading this post you will be able to: Understand how you and your Data Science teams can improve your […]
2022-07-28

Understanding TF-IDF with Python example

7 mins read Term Frequency – Inverse Document Frequency (TF-IDF) is a popular statistical technique utilized in natural language processing and information retrieval […]
2022-07-28

Steps to package and publish Python codes to PyPI (pip)

6 mins read You wrote a new Python package that solves a specific problem and it’s now time to share it with the […]
2022-07-25

Understanding DenseNet architecture with PyTorch code

20 mins read DenseNet Architecture Introduction In a standard Convolutional Neural Network, we have an input image that is then passed through the network […]
2022-07-23

A guide to Bootstrapping for Statistical Inference – Confidence Interval and Hypothesis Testing

14 mins read Introduction Inferential Statistics is the process of examining the observed data (sample) in order to make conclusions about the properties/parameters […]
2022-07-21

Partial Dependence Plots with Python code

17 mins read What Are Partial Dependence Plots Some people complain machine learning models are black boxes. These people will argue we cannot see how […]
2022-07-19

Understanding Deep U-Nets for Semantic Segmentation: A salt identification case study with Keras

19 mins read Introduction Deep Learning has enabled the field of Computer Vision to advance rapidly in the last few years. In this […]
2022-07-19

Understanding Transposed Convolution with Python example

25 mins read Transposed convolution is a revolutionary concept for applications like image segmentation, super-resolution, etc., but sometimes it becomes a little trickier […]
2022-07-19

Understanding the basics of audio data with Python code

36 mins read Overview A huge amount of audio data is being generated every day in almost every organization. Audio data yields substantial […]
2022-07-16

SQL Window Functions explained with example

32 mins read All database users know about regular aggregate functions which operate on an entire table and are used with a GROUP […]
2022-07-14

Setup Apache Spark on a multi-node cluster

12 mins read This article covers the basic steps to install and configure Apache Spark 3.1.1 on a multi-node cluster, which includes installing spark […]
2022-07-11

A tutorial on Transfer Learning using PyTorch

21 mins read Overview The art of transfer learning could transform the way you build machine learning and deep learning models. Learn how […]
2022-07-11

Stratified K-fold Cross Validation for imbalanced classification tasks

10 mins read Model evaluation involves using the available dataset to fit a model and estimate its performance when making predictions on unseen […]
2022-07-11

Understanding Perplexity for language models

17 mins read In general, perplexity is a measurement of how well a probability model predicts a sample. In the context of Natural Language Processing, […]
2022-07-11

How to select classification threshold for imbalanced datasets

21 mins read Classification predictive modeling typically involves predicting a class label. Nevertheless, many machine learning algorithms are capable of predicting a probability […]
2022-07-09

Which performance metrics to use for evaluating a classification model on imbalanced datasets?

8 mins read There are various metrics to evaluate a classification model: Accuracy, Precision, Recall, F1-score, and AUC-ROC score. However, it is always […]
2022-07-08

Understanding the ROC curve and AUC-ROC with Python example

17 mins read The AUC (Area Under the Curve)-ROC (Receiver Operating Characteristic) curve helps us visualize how well our machine learning classifier is performing. Although […]
2022-07-07

Hyperparameter optimization with Scikit-Learn GridSearchCV using different models

4 mins read It is somewhat difficult to manually perform a grid search across different models in scikit-learn. We usually need to […]