Coursera Deep Learning Specialization Notes

2 mins read A couple of years ago I completed the Deep Learning Specialization taught by AI pioneer Andrew Ng. I found this series […]

Understanding Perplexity for language models

17 mins read In general, perplexity is a measurement of how well a probability model predicts a sample. In the context of Natural Language Processing, […]

Understanding GloVe embedding with Tensorflow implementation

9 mins read In this article, you will learn about GloVe, a very powerful word vector learning technique. This article will focus on […]

Understanding Word2vec embedding with Tensorflow implementation

15 mins read This article is going to be about Word2vec algorithms. Word2vec algorithms output word vectors. Word vectors underpin many of the […]

Delving into GPT-2 and GPT-3 Language Models

32 mins read This year, we saw a dazzling application of machine learning. The OpenAI GPT-2 exhibited an impressive ability to write coherent and passionate […]

A comprehensive tutorial on Transformers Architecture

43 mins read We’ve been hearing a lot about Transformers, and with good reason. They have taken the world of NLP by storm […]

Review of intuitions behind the recent advances in NLP: From RNNs to Transformers and BERT

48 mins read Few areas of AI are more exciting than NLP right now. In recent years language models (LM), which can perform […]

Understanding TF-IDF with Python example

6 mins read Term Frequency – Inverse Document Frequency (TF-IDF) is a widely used statistical method in natural language processing and information retrieval. […]

Understanding Attention Mechanism with example

14 mins read For decades, Statistical Machine Translation was the dominant translation model, until the birth of Neural Machine Translation (NMT). NMT is an […]

An illustrated guide to Attention Mechanism in Sequence Models with PyTorch code

22 mins read In this article, I will be covering the main concepts behind Attention, including the implementation of a sequence-to-sequence Attention model, […]

Understanding Self-Attention in Transformers with example

10 mins read What do BERT, RoBERTa, ALBERT, SpanBERT, DistilBERT, SesameBERT, SemBERT, SciBERT, BioBERT, MobileBERT, TinyBERT and CamemBERT all have in common? And […]

Machine Learning From Scratch Series: Naive Bayes and Gaussian Naive Bayes

16 mins read Introduction: The Naïve Bayes algorithm is a supervised classification algorithm based on Bayes' theorem with a strong (naïve) independence assumption among features. In machine learning and data […]

The BERT Model

17 mins read The year 2018 has been an inflection point for machine learning models handling text (or more accurately, Natural Language Processing […]

Using BERT for Sentence Sentiment Classification

11 mins read Progress has been rapidly accelerating in machine learning models that process language over the last couple of years. This progress […]

Seq2Seq models, Attention Mechanism, and Transformers Explained

29 mins read Sequence-to-sequence models are deep learning models that have achieved a lot of success in tasks like machine translation, text summarization, […]

Understanding Attention Mechanism in Sequence 2 Sequence Machine Translation

39 mins read Introduction: Recurrent Neural Networks (or more precisely LSTM/GRU) have been found to be very effective in solving complex sequence-related problems […]

What is Word2vec word embedding?

24 mins read I find the concept of embeddings to be one of the most fascinating ideas in machine learning. If you’ve ever […]

What are Word Embeddings and how do they work? An introduction to Word2Vec (CBOW and Skip Gram)

22 mins read Word embedding is one of the most popular representations of document vocabulary. It is capable of capturing the context of […]

Implementing LSTM Networks in Python with Keras

27 mins read A powerful and popular recurrent neural network is the long short-term memory network, or LSTM. It is widely used because […]

A complete guide to understanding Long Short Term Memory (LSTM) Networks

37 mins read In this post, I provide three useful resources for understanding LSTMs. Introduction: Sequence prediction problems have been around for a […]