2022-03-22

Categorical data type in Pandas

8 mins read You may have categorical data in your dataset. Categorical data is a type of data with two or more categories. If […]
2022-03-22

NumPy Broadcasting tutorial

13 mins read In operations between NumPy arrays (ndarray), the shapes are automatically made compatible by broadcasting. This article describes the following […]
2022-03-10

Minimal PyTorch LSTM example for regression and classification tasks

10 mins read The Idea Behind RNNs: Recurrent neural networks in general maintain state information about data previously passed through the network. This […]
2022-03-09

A complete guide to writing custom Datasets and DataLoader in PyTorch

19 mins read Table of Contents: An Introduction To PyTorch Dataset and DataLoader; Why Write Good Data Loaders and Datasets?; The Basic PyTorch Dataset Structure; Implementing […]
2022-03-02

Setup Celery with Redis for Django Tutorial

9 mins read When you work on data-intensive applications, long-running tasks can seriously slow down your users. Modern users expect pages to load […]
2022-02-28

Understanding TF-IDF with Python example

6 mins read Term Frequency – Inverse Document Frequency (TF-IDF) is a widely used statistical method in natural language processing and information retrieval. […]
2022-02-28

Understanding Pandas and NumPy views vs copies to handle SettingWithCopyWarning

33 mins read Table of Contents: Prerequisites; Example of a SettingWithCopyWarning; Views and Copies in NumPy and Pandas; Understanding Views and Copies in […]
2022-02-24

Understanding 1D, 2D, and 3D convolutional layers in deep neural networks

21 mins read In deep learning, convolutional layers have been major building blocks in many deep neural networks. The design was inspired by […]
2022-02-23

Useful shortcut keys in Linux terminal

11 mins read Ubuntu comes with a powerful set of keyboard shortcuts that you can use to increase your productivity with minimal effort. […]
2022-02-18

A guide on PySpark Window Functions with Partition By

11 mins read PySpark window functions are useful when you want to examine relationships within groups of data rather than between groups of […]
2022-02-17

Setting up a multi-node Apache Spark Cluster on a local Windows machine with VirtualBox

6 mins read Prerequisite: Understand how to install Ubuntu inside Windows using Oracle VM VirtualBox. Apache Spark is a fast and […]
2022-02-17

Useful magic commands in Jupyter Notebook/Lab

30 mins read Jupyter Notebook/Lab is the go-to tool that data scientists and developers worldwide use for data analysis. It provides […]
2022-02-15

Different approaches for finding feature importance using Random Forests

16 mins read In many (business) cases it is important to have a model that is not only accurate but also interpretable. Oftentimes, […]
2022-02-14

Understanding GROUP BY, GROUPING SETS, ROLLUP, and CUBE in SQL

18 mins read GROUP BY: A table in a database has columns of information in it. Each column in a table represents an […]
2022-02-13

Common loss functions for training deep neural networks with Keras examples

30 mins read Deep neural networks are trained using the stochastic gradient descent optimization algorithm. As part of the optimization algorithm, the error for […]