2022-03-28

Bulk Boto3 (bulkboto3): Python package for fast and parallel transferring a bulk of files to S3 based on boto3!

Table of Contents: Introduction, About bulkboto3, Getting Started, Prerequisites, Installation, Usage, Contributing, Conclusion. Introduction: “How to transfer a bulk of […]
2021-12-26

Walk-forward optimization for algorithmic trading strategies on cloud architecture

Table of Contents: Introduction, Terminology, Walk-forward Optimization, Design of walk-forwards, The Architecture, Configuring cloud machines using Ansible, Docker Swarm, Optimization […]
2022-05-26

Understand Ordinal and One-Hot Encodings for categorical features

Machine learning models require all input and output variables to be numeric. This means that if your data contains categorical […]
2022-05-26

When should we drop the first one-hot encoded column?

Many machine learning models demand that categorical features be converted to a format they can comprehend via a widely used […]
2022-05-26

Alternatives for One-Hot Encoding of Categorical Variables

One-hot encoding, otherwise known as dummy variables, is a method of converting categorical variables into several binary columns, where a […]
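
As a quick illustration of the definition in this excerpt, here is a minimal sketch of one-hot encoding in plain Python. The helper name `one_hot` and the sample `colors` list are hypothetical and not taken from the post; a real pipeline would more likely rely on a library encoder.

```python
def one_hot(values):
    """Return the sorted category names and one binary row per input value."""
    categories = sorted(set(values))
    # Each row has a 1 in the column of its own category and 0 elsewhere.
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows


colors = ["red", "green", "blue", "green"]  # made-up example data
columns, encoded = one_hot(colors)

for value, row in zip(colors, encoded):
    print(value, dict(zip(columns, row)))
```

Each categorical value becomes one row with exactly one column set to 1, which is what "several binary columns" amounts to.
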
2022-05-26

Handling cyclical features, such as hours in a day, for machine learning pipelines with Python example

What’s the difference between 23 and 1? If we’re talking about time, it’s 2. Hours of the day, days of […]
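
The claim that 23 and 1 are only 2 hours apart can be checked with a small sketch. The helper names `circular_hour_distance` and `encode_cyclical` are hypothetical, and the sine/cosine mapping is one common way to encode cyclical values, assumed here rather than quoted from the post.

```python
import math


def circular_hour_distance(a, b, period=24):
    """Shortest distance between two hours on a clock of length `period`."""
    diff = abs(a - b) % period
    return min(diff, period - diff)


def encode_cyclical(value, period=24):
    """Map a cyclical value to a (sin, cos) point on the unit circle."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


naive_gap = abs(23 - 1)                    # treats hours as plain integers
clock_gap = circular_hour_distance(23, 1)  # wraps around midnight

# After encoding, 23 and 1 are as close to each other as 1 and 3 are.
gap_23_1 = math.dist(encode_cyclical(23), encode_cyclical(1))
gap_1_3 = math.dist(encode_cyclical(1), encode_cyclical(3))
print(naive_gap, clock_gap, round(gap_23_1, 4), round(gap_1_3, 4))
```

The encoded representation keeps neighbouring hours close together even across the midnight boundary, which a raw integer feature does not.
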
2022-05-26

Common mistakes to avoid as a Machine Learning Engineer

In machine learning, there are many ways to build a product or solution and each way assumes something different. Many […]
2022-05-24

Performing A/B test in Python example – A case study from Udacity Data Scientist Nano Degree

This is a simple walkthrough of an A/B test case study developed and used by Udacity. It is part of […]
2022-05-23

A guide to Bootstrapping for Statistical Inference – Confidence Interval and Hypothesis Testing

Introduction: Inferential statistics is the process of examining the observed data (sample) in order to draw conclusions about the properties/parameters […]
2022-05-23

Understand p-value using bootstrapping technique in statistics

For context, we are using the bootstrapping methods (that I’ve referenced previously) for simulating null and sampling distributions (rather than standard […]
2022-05-22

Understanding Bootstrapping approach vs. Traditional approaches in statistics

Bootstrapping is a statistical procedure that resamples a single dataset to create many simulated samples. This process allows you to […]
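
The resampling idea described in this excerpt can be sketched with the standard library alone. The function names, the sample values, and the choice of a percentile interval for the mean are illustrative assumptions, not the post's own code.

```python
import random
import statistics


def bootstrap_means(sample, n_resamples=1000, seed=0):
    """Resample `sample` with replacement and collect the mean of each resample."""
    rng = random.Random(seed)
    size = len(sample)
    return [statistics.fmean(rng.choices(sample, k=size))
            for _ in range(n_resamples)]


def percentile_interval(values, lower=2.5, upper=97.5):
    """Approximate percentile interval read off the sorted values."""
    ordered = sorted(values)
    low_index = int(len(ordered) * lower / 100)
    high_index = min(int(len(ordered) * upper / 100), len(ordered) - 1)
    return ordered[low_index], ordered[high_index]


sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.8, 2.7, 4.9]  # made-up data
means = bootstrap_means(sample)
ci_low, ci_high = percentile_interval(means)
print(f"sample mean: {statistics.fmean(sample):.2f}")
print(f"approximate 95% interval for the mean: ({ci_low:.2f}, {ci_high:.2f})")
```

Every simulated sample is drawn from the single observed dataset, so the spread of the resampled means stands in for the sampling distribution.
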
2022-05-16

SQL Window Functions explained with example

All database users know about regular aggregate functions, which operate on an entire table and are used with a GROUP […]
2022-05-11

Understanding Perplexity for language models

In general, perplexity is a measurement of how well a probability model predicts a sample. In the context of Natural Language Processing, […]
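
To make the idea concrete, here is a minimal sketch that computes perplexity as the exponential of the average negative log-probability a model assigns to each token. The function name `perplexity` and the probability lists are hypothetical; the excerpt itself does not give a formula, so this uses the standard definition.

```python
import math


def perplexity(token_probs):
    """exp of the mean negative log-probability of the observed tokens."""
    count = len(token_probs)
    average_neg_log = -sum(math.log(p) for p in token_probs) / count
    return math.exp(average_neg_log)


# A model that spreads probability evenly over 4 choices at every step
# is as uncertain as a fair 4-way guess.
uniform_model = perplexity([0.25] * 10)

# A model that always assigns probability 1 to the observed token
# is never surprised.
perfect_model = perplexity([1.0] * 5)

print(uniform_model, perfect_model)
```

Lower values mean the model predicts the sample better; a perplexity of k roughly corresponds to choosing uniformly among k options at each step.
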
2022-05-10

Understanding GloVe embedding with Tensorflow implementation

In this article, you will learn about GloVe, a very powerful word vector learning technique. This article will focus on […]
2022-05-09

Understanding Word2vec embedding with Tensorflow implementation

This article is about Word2vec algorithms, which output word vectors. Word vectors underpin many of the […]
2022-05-02

Understand Jacobian and Hessian matrices with example

In this post, you will find what the Jacobian matrix and the Hessian matrix are and how to calculate them. […]
2022-04-27

Understanding and interpreting Residuals Plot for linear regression

Interpreting Residual Plots to Improve Your Regression: When you run a regression, calculating and plotting residuals helps you understand and improve your […]