2022-08-01

A simple tutorial on Importance Sampling and Monte Carlo with Python code

16 mins read Introduction In this post, I’m going to explain importance sampling. Importance sampling is an approximation method instead of a […]
2022-07-30

What is Reservoir Sampling in Stream Processing?

4 mins read Reservoir sampling is a fascinating algorithm that is especially useful when you have to deal with streaming data, which is […]
2022-05-28

Understanding interaction effects in regression analysis

22 mins read In regression, an interaction effect exists when the effect of an independent variable on a dependent variable changes, depending on […]
2022-05-24

Performing an A/B test in Python – A case study from the Udacity Data Scientist Nanodegree

11 mins read This is a simple walkthrough of an A/B test case study developed and used by Udacity. It is part of […]
2022-05-23

A guide to Bootstrapping for Statistical Inference – Confidence Interval and Hypothesis Testing

14 mins read Introduction Inferential Statistics is the process of examining the observed data (sample) in order to make conclusions about the properties/parameters […]
2022-05-19

Understanding the basics of Bayesian Inference with Python Code

10 mins read Why did someone have to invent the Bayesian Inference? In one sentence: to update the probability as we gather more data. The […]
2022-02-20

Bayesian view of linear regression – Maximum Likelihood Estimation (MLE) and Maximum A Posteriori (MAP)

16 mins read Linear Regression is commonly the first machine learning problem that newcomers to the field take an interest in. For […]
2022-02-13

Handling skewness in features by applying transformation in Python

13 mins read In this tutorial, you will learn how to deal with your data when it does not follow a normal distribution. One […]
2022-01-24

A guide on Maximum likelihood and Bayesian inference for parameter estimation

28 mins read Introduction In this post, I’ll explain what the maximum likelihood and Bayesian inference methods for parameter estimation are and go […]
2022-01-18

Why does LASSO regression (L1 regularization) shrink coefficients to zero but not the Ridge?

11 mins read We read almost everywhere that Lasso regression encourages zero coefficients and hence also provides a great tool for variable selection, but it […]
2021-11-01

ARIMA and SARIMA for Real-World Time Series Forecasting in Python

15 mins read Time series and forecasting have been some of the key problems in statistics and Data Science. Data becomes a time […]
2021-10-19

Difference between Probability Density and Probability

5 mins read The probability density at x can be greater than one but then, how can it integrate to one? It’s a […]
2021-10-19

What is Conjugate Prior?

5 mins read What is Prior? Prior probability is the probability of an event before we see the data. In Bayesian Inference, the prior […]
2021-10-17

Important probability distributions for Data Science with Python code

33 mins read For an aspiring data scientist, statistics is a must-learn subject. It can process complex and challenging problems in the real […]
2021-07-03

Understanding and discovering multicollinearity in regression analysis with Python code

9 mins read In this post, I will explain the concept of collinearity and multicollinearity and why it is important to understand them […]
2021-07-02

Measure the correlation between numerical and categorical variables and the correlation between two categorical variables in Python: Chi-Square and ANOVA

27 mins read This scenario can happen when we are doing regression or classification in machine learning. Regression: The target variable is numeric […]
2021-06-21

Data Science and Machine Learning Cheat Sheets

5 mins read Click on the links to get the high-resolution cheat sheets: Algebra, Linear Algebra, Calculus, Probability, Statistics, Python, R, Machine Learning […]
2021-04-17

Understanding the intuition behind Markov Chain Monte Carlo Methods (MCMC)

15 mins read For many of us, Bayesian statistics is voodoo magic at best or completely subjective nonsense at worst. Among the trademarks […]
2021-03-17

Methods for sampling from complex distributions

8 mins read This writeup includes descriptions from a recent paper on algorithmic sampling, to describe in simpler terms the motivation and approach for […]
2021-03-02

ARIMA for time series forecasting in Python

11 mins read Making out-of-sample forecasts can be confusing when getting started with time series data. The statsmodels Python API provides functions for […]