A simple tutorial on Sampling Importance and Monte Carlo with Python codes

16 mins read In this post, I’m going to explain importance sampling. Importance sampling is an approximation method instead of a […]

What is Reservoir Sampling in Stream Processing?

4 mins read Reservoir sampling is a fascinating algorithm that is especially useful when you have to deal with streaming data, which is […]

Evaluation metrics for Multi-Label Classification with Python codes

10 mins read In a traditional classification problem formulation, classes are mutually exclusive. In other words, under the condition of mutual exclusivity, each […]

A guide on regression error metrics (MSE, RMSE, MAE, MAPE, sMAPE, MPE) with Python code

25 mins read Regressions are one of the most commonly used tools in a data scientist’s kit. The quality of a regression model is how […]

How to interpret logistic regression coefficients?

15 mins read Logistic Regression is a fairly simple yet powerful Machine Learning model that can be applied to various use cases. It’s […]

Understanding interaction effects in regression analysis

22 mins read In regression, an interaction effect exists when the effect of an independent variable on a dependent variable changes, depending on […]

Performing A/B test in Python example – A case study from Udacity Data Scientist Nanodegree

11 mins read This is a simple walkthrough of an A/B test case study developed and used by Udacity. It is part of […]

A guide to Bootstrapping for Statistical Inference – Confidence Interval and Hypothesis Testing

14 mins read Inferential statistics is the process of examining the observed data (sample) in order to make conclusions about the properties/parameters […]

Understanding p-value using bootstrapping technique in statistics

13 mins read For context, we are using the bootstrapping methods (that I’ve referenced previously) for simulating null and sampling distributions (rather than standard […]

Understanding Bootstrapping approach vs. Traditional approaches in statistics

13 mins read Bootstrapping is a statistical procedure that resamples a single dataset to create many simulated samples. This process allows you to […]

Understanding Jacobian and Hessian matrices with example

19 mins read In this post, you will find what the Jacobian matrix and the Hessian matrix are and how to calculate them. […]

Understanding and interpreting Residuals Plot for linear regression

27 mins read Interpreting residual plots to improve your regression: when you run a regression, calculating and plotting residuals helps you understand and improve your […]

A review on information theory concepts for machine learning: Entropy, Cross-Entropy, KL divergence, Information gain, and Mutual Information

58 mins read Information theory is a field of study concerned with quantifying information for communication. It is a subfield of mathematics […]

Understanding Discrete Fourier Transformation with mathematics and Python codes

16 mins read The Fourier Transformation is applied in engineering to determine the dominant frequencies in a vibration signal. When the dominant […]

Mathematical view of Bias-Variance trade-off

6 mins read The bias-variance trade-off is an important concept in statistics and machine learning. This is used to get better performance out […]