Fundamentals of statistics for Data Scientists and Analysts with Python Code

As Karl Pearson, a British mathematician, once stated, statistics is the grammar of science, and this holds especially for Computer and […]

Monte Carlo Simulation Explained

Monte Carlo Methods: I Am Feeling (Un-)Lucky! In short, Monte Carlo methods refer to a series of statistical methods essentially […]

Best storage formats to save Pandas dataframes

When working on data analytics projects, I usually use Jupyter notebooks and the great pandas library to process and move my data around. It […]

ANOVA (Analysis of variance) simply explained

Buying a new product or testing a new technique but not sure how it stacks up against the alternatives? […]

How to determine epsilon and MinPts parameters of DBSCAN clustering

Every data mining task has the problem of parameters. Every parameter influences the algorithm in specific ways. For DBSCAN, the […]

Restricted Boltzmann Machines (RBMs) Simply Explained

Contents: Definition & Structure; Reconstructions; Probability Distributions; Code Sample: Stacked RBMs; Parameters & k; Continuous RBMs; Next Steps; Other Resources […]

Walkthrough of an exploratory analysis for classification problems

In this post I outline how to perform an exploratory analysis for a binary classification problem. I am going to […]

Dealing with Imbalanced Data

https://towardsdatascience.com/methods-for-dealing-with-imbalanced-data-5b761be45a18 Imbalanced classes are a common problem in machine learning classification where there is a disproportionate ratio of observations […]