A guide on Maximum likelihood and Bayesian inference for parameter estimation

28 mins read In this post, I’ll explain what the maximum likelihood and Bayesian inference methods for parameter estimation are and go […]

Implementing Attention Mechanism in Python

7 mins read The attention mechanism was introduced to improve the performance of the encoder-decoder model for machine translation. The idea behind the […]

Understanding AdaBoost algorithm and its mathematics

15 mins read If you’re going through this tutorial, you’ve probably heard of XGBoost, LightGBM, or something of that sort before. These are […]

An illustrated guide to Attention Mechanism in Sequence Models with PyTorch code

22 mins read In this article, I will be covering the main concepts behind Attention, including the implementation of a sequence-to-sequence Attention model, […]

Understanding Self-Attention in Transformers with example

10 mins read What do BERT, RoBERTa, ALBERT, SpanBERT, DistilBERT, SesameBERT, SemBERT, SciBERT, BioBERT, MobileBERT, TinyBERT and CamemBERT all have in common? And […]

Why does LASSO regression (L1 regularization) shrink coefficients to zero but not the Ridge?

11 mins read We often read that Lasso regression encourages zero coefficients and hence is a great tool for variable selection as well, but it […]

Theory of Generalization: growth function, dichotomies, and break points

15 mins read The size of our data set N plays a major role when it comes to the reliability of the generalization Ein […]

Mathematical view of Bias-Variance trade-off

6 mins read The bias-variance trade-off is an important concept in statistics and machine learning. This is used to get better performance out […]

Walk-forward optimization for algorithmic trading strategies on cloud architecture

11 mins read Table of Contents: •  Introduction •  Terminology •  Walk-forward Optimization •  Design of walk-forwards •  The Architecture •  Configuring cloud machines using Ansible •  Docker Swarm Optimization […]

Understanding stdin, stdout, and stderr in Linux

11 mins read stdin, stdout, and stderr are three data streams created when you launch a Linux command. You can use them to tell if your […]

What is a Makefile?

27 mins read If you want to run or update a task when certain files are updated, the make utility can come in handy. The make utility […]

Useful sed command use cases in Linux

8 mins read In this article, we will review sed, the well-known stream editor, and share 15 tips to use it in order […]

Shebang in Linux Shell Scripting

6 mins read You’ll often come across shell scripts whose first line starts with #!. This #! is called a shebang or hashbang. The shebang plays an important role […]

Useful keyboard shortcuts for Linux Bash

5 mins read The bash shell features a wide variety of keyboard shortcuts you can use. These will work in bash on any […]

Understanding Gaussian Process

79 mins read A Gaussian process is a machine learning technique. You can use it for regression and classification, among many other things. Being […]

Docker Swarm tutorial along with code

29 mins read Table of Contents: •  Your first Swarm cluster •  Your first Swarm deployment •  Explore the stack •  Set up […]

Sampling from a multivariate Gaussian (Normal) distribution with Python code

3 mins read Steps: A widely used method for drawing (sampling) a random vector from the N-dimensional multivariate normal distribution with mean vector and covariance […]

Solving six problems with Bayesian statistics

8 mins read 1) The first one is a warm-up problem. Suppose there are two full bowls of cookies. Bowl #1 has 10 […]

Bahdanau and Luong Attention Mechanisms explained

11 mins read Conventional encoder-decoder architectures for machine translation encoded every source sentence into a fixed-length vector, irrespective of its length, from which […]

Difference between Maximum Likelihood Estimation (MLE) and Maximum A Posteriori (MAP)

4 mins read Maximum Likelihood Estimation (MLE) and Maximum A Posteriori (MAP) are both methods for estimating a variable from probability distributions or graphical […]