Sefidian Academy
Amir Masoud Sefidian
2023-01-11
Categories
Artificial Intelligence
Big Data
Data Science
Machine Learning
NLP
Python
Recommendation Systems
Spark
Time Series
Machine Learning for Big Data using PySpark with real-world projects
10 mins read
Introduction: I have prepared a GitHub repository that provides a set of self-study tutorials on machine learning for big data
[…]
2022-09-18
Categories
Big Data
Python
Spark
A guide on PySpark Window Functions with Partition By
11 mins read
PySpark window functions are useful when you want to examine relationships within groups of data rather than between groups of
[…]
2022-07-14
Categories
Big Data
Data Science
Linux
Python
Spark
Setup Apache Spark on a multi-node cluster
12 mins read
This article covers the basic steps to install and configure Apache Spark 3.1.1 on a multi-node cluster, which includes installing Spark
[…]
2022-03-22
Categories
Artificial Intelligence
Big Data
Data Science
Machine Learning
Python
Spark
PySpark equivalent methods for Pandas dataframes
8 mins read
Pandas is the go-to library for every data scientist. It is essential for every person who wishes to manipulate data
[…]
2022-02-17
Categories
Big Data
Python
Spark
Setting up a multi-node Apache Spark Cluster on a local Windows machine with Virtual Box
6 mins read
Prerequisite: understand how to install Ubuntu inside Windows using Oracle VM VirtualBox from this link. Apache Spark is a fast and
[…]