2022-07-14

The default Random Forest feature importance is not reliable: Understanding Permutation Feature Importance

47 mins read The scikit-learn Random Forest feature importance and R’s default Random Forest feature importance strategies are biased. To get reliable results […]
2022-06-03

A complete guide on feature selection techniques with Python code

33 mins read Suppose you are working with high-dimensional data from IoT sensors or healthcare, with hundreds to thousands of features, […]
2022-05-08

Encoding categorical features using the category_encoders package

11 mins read There are loads of different ways to convert categorical variables into numeric features so they can be used within machine […]
2022-05-04

Understand different feature scaling techniques with Python code

19 mins read In many machine learning algorithms, we need to scale the features to put them all on the same footing, so that […]
2022-02-15

Different approaches for finding feature importance using Random Forests

16 mins read In many (business) cases it is important to have a model that is not only accurate but also interpretable. Oftentimes, […]
2022-02-03

Feature Selection for categorical data with Python code

17 mins read Feature selection is the process of identifying and selecting a subset of input features that are most relevant to the target […]
2021-07-04

Feature Scaling with Scikit-Learn

9 mins read Contents: 1 Introduction; 2 Loading the libraries; 3 Scaling methods; 3.1 Standard Scaler; 3.2 Min-Max Scaler; 3.3 Robust Scaler; 3.4 Comparison […]