2021-10-08

Fundamentals of statistics for Data Scientists and Analysts with Python Code

As Karl Pearson, a British mathematician, once stated, statistics is the grammar of science, and this holds especially for Computer and […]
2021-10-07

Monte Carlo Simulation Explained

Monte Carlo Methods: I Am Feeling (Un-)Lucky! In short, Monte Carlo methods refer to a series of statistical methods essentially […]
2021-09-12

Best storage formats to save Pandas dataframes

When working on data analysis projects, I usually use Jupyter notebooks and the great pandas library to process and move my data around. It […]
2021-07-30

ANOVA (Analysis of variance) simply explained

Introduction: Buying a new product or testing a new technique but not sure how it stacks up against the alternatives? […]
2020-12-18

How to determine epsilon and MinPts parameters of DBSCAN clustering

Every data mining task has the problem of parameters. Every parameter influences the algorithm in specific ways. For DBSCAN, the […]
2020-11-09

Restricted Boltzmann Machines (RBMs) Simply Explained

Contents: Definition & Structure; Reconstructions; Probability Distributions; Code Sample: Stacked RBMs; Parameters & k; Continuous RBMs; Next Steps; Other Resources […]
2020-02-21

Walkthrough of an exploratory analysis for classification problems

In this post I outline how to perform an exploratory analysis for a binary classification problem. I am going to […]
2020-02-05

Dealing with Imbalanced Data

https://towardsdatascience.com/methods-for-dealing-with-imbalanced-data-5b761be45a18 Imbalanced classes are a common problem in machine learning classification where there is a disproportionate ratio of observations […]