2021-05-11

Which Mean should we use? A guide on Arithmetic, Geometric, and Harmonic Means in Data Analysis

45 mins read It’s probably the most common data analysis task: You have a bunch of numbers. You want to summarize them […]
2021-05-04

Setup and run Jupyter notebook from a remote server by ssh

5 mins read In my research, I usually work with remote servers to run deep learning models inside machines more powerful than my […]
2021-04-28

Python Scipy sparse matrices explained

8 mins read What is a Sparse Matrix? Imagine you have a two-dimensional data set with 10 rows and 10 columns such that […]
2021-04-17

Understanding intuition behind Markov Chain Monte Carlo Methods (MCMC)

15 mins read For many of us, Bayesian statistics is voodoo magic at best or completely subjective nonsense at worst. Among the trademarks […]
2021-04-14

A complete tutorial on tmux in Linux

16 mins read What’s tmux? tmux authors describe it as a terminal multiplexer. Behind this fancy term hides a simple concept: Within one terminal […]
2021-03-27

12 useful Python decorators

12 mins read Python decorators are powerful tools that help you produce clean, reusable, and maintainable code. I’ve long waited to learn about […]
2021-03-23

Review of important offline evaluation metrics for recommendation systems

28 mins read We are in an era of personalization. The user wants personalized content and businesses are capitalizing on the same. Recommendation […]
2021-03-12

Bayesian Linear Regression using PyMC3

8 mins read In statistics, Bayesian linear regression is an approach to linear regression in which the statistical analysis is undertaken within […]
2021-03-02

ARIMA for time series forecasting in Python

11 mins read Making out-of-sample forecasts can be confusing when getting started with time series data. The statsmodels Python API provides functions for […]
2021-02-25

Identifying time series AR, MA, ARMA, or ARIMA Models using ACF and PACF plots

4 mins read In time series analysis, the Autocorrelation Function (ACF) and the partial autocorrelation function (PACF) plots are essential in providing the […]
2021-02-19

Pivot, Melt, Stack, and Unstack methods in Pandas

5 mins read Data does not come in a usable format by default; a data science professional has to spend 70–80% of their […]
2021-02-15

Recommended tools and environment setup for a Data Scientist

16 mins read In this post, I would like to describe in detail our setup and development environment (hardware and […]
2021-02-13

Python testing tutorial using pytest

18 mins read Testing your code brings a wide variety of benefits. It increases your confidence that the code behaves as you expect and […]
2021-02-08

Probability Density Estimation: Maximum Likelihood Estimation (MLE), Maximum A Posteriori (MAP), and Bayesian inference

14 mins read Maximum Likelihood Estimation (MLE) and Maximum A Posteriori (MAP) estimation are methods of estimating parameters of statistical models. Despite a […]
2021-02-04

Implicit Recommender Systems with Alternating Least Squares

13 mins read In today’s post, we will explain a certain algorithm for matrix factorization models for recommender systems which goes by the […]
2020-12-18

How to determine epsilon and MinPts parameters of DBSCAN clustering

9 mins read Every data mining task has the problem of parameters. Every parameter influences the algorithm in specific ways. DBSCAN (Density-Based Spatial […]
2020-11-24

A review of Deep learning based recommendation systems

20 mins read The number of research publications on deep learning-based recommendation systems has increased exponentially in recent years. In […]
2020-11-20

Steps to setup PyTorch with GPU for NVIDIA GTX 960m (Asus VivoBook n552vw) in Ubuntu

3 mins read In this post, I’m gonna describe the steps I used to utilize GPU for the PyTorch Deep Learning framework on […]
2020-11-18

Basics of Convolutional Neural Networks (CNN) from Deep Learning specialization

8 mins read These notes are taken from the first two weeks of the Convolutional Neural Networks course (part of Deep Learning specialization) by Andrew Ng […]
2020-11-14

Machine Learning From Scratch Series: Linear Regression with Gradient Descent

10 mins read In the following sections, we are going to implement linear regression in a step-by-step fashion using just Python and NumPy. We will […]