Machine Learning and Deep Learning in Hebrew

Avraham Raviv

Mike Erlihson

Chapter authors:

David Ben Attar

Gal Peretz

Jeremy Rutman

Maya Rapaport

Nava (Reinitz) Leibovich

Or Avrahami

Or Shemesh

Ron Levy

Uri Almog


Avi Caciularu

Avshalom Dayan

Rachel Wities


If you find this book useful in your research work, please consider citing:

author = {Raviv, Avraham and Erlihson, Mike},
booktitle = {Machine and Deep learning in Hebrew},
year = {2021}

Table of contents

Part I:

Part II:

Part III:

1. Introduction to Machine Learning

1.1 What is Machine Learning?

  • 1.1.1 The Basic Concept

  • 1.1.2 Data, Tasks and Learning

1.2 Applied Math

  • 1.2.1 Linear Algebra

  • 1.2.2 Calculus

  • 1.2.3 Probability

2. Machine Learning Algorithms

2.1 Supervised Learning Algorithms

  • 2.1.1 Support Vector Machines (SVM)

  • 2.1.2 Naïve Bayes

  • 2.1.3 K-Nearest Neighbors (K-NN)

  • 2.1.4 Quadratic/Linear Discriminant Analysis (QDA/LDA)

  • 2.1.5 Decision Trees

2.2 Unsupervised Learning Algorithms

  • 2.2.1 K-means

  • 2.2.2 Mixture Models

  • 2.2.3 Expectation–Maximization (EM)

  • 2.2.4 Hierarchical Clustering

  • 2.2.5 Local Outlier Factor

2.3 Dimensionality Reduction

  • 2.3.1 Principal Components Analysis (PCA)

  • 2.3.2 t-distributed Stochastic Neighbor Embedding (t-SNE)

  • 2.3.3 Uniform Manifold Approximation and Projection (UMAP)

2.4 Ensemble Learning

  • 2.4.1 Introduction to Ensemble Learning

  • 2.4.2 Bagging

  • 2.4.3 Boosting

3. Linear Neural Networks (Regression problems)

3.1 Linear Regression

  • 3.1.1 The Basic Concept

  • 3.1.2 Gradient Descent

  • 3.1.3 Regularization and Cross Validation

  • 3.1.4 Linear Regression as Classifier

3.2 Softmax Regression

  • 3.2.1 Logistic Regression

  • 3.2.2 Cross Entropy and Gradient Descent

  • 3.2.3 Optimization

  • 3.2.4 Softmax Regression – Multi-Class Logistic Regression

  • 3.2.5 Softmax Regression as Neural Network

4. Deep Neural Networks

4.1 MLP – Multilayer Perceptrons

  • 4.1.1 From a Single Neuron to Deep Neural Network

  • 4.1.2 Activation Function

  • 4.1.3 XOR

4.2 Computational Graphs and Propagation

  • 4.2.1 Computational Graphs

  • 4.2.2 Forward and Backward Propagation

4.3 Optimization

  • 4.3.1 Data Normalization

  • 4.3.2 Weight Initialization

  • 4.3.3 Batch Normalization

  • 4.3.4 Mini Batch

  • 4.3.5 Gradient Descent Optimization Algorithms

4.4 Generalization

  • 4.4.1 Regularization

  • 4.4.2 Weight Decay

  • 4.4.3 Model Ensembles and Dropout

  • 4.4.4 Data Augmentation

5. Convolutional Neural Networks

5.1 Convolutional Layers

  • 5.1.1 From Fully-Connected Layers to Convolutions

  • 5.1.2 Padding, Stride and Dilation

  • 5.1.3 Pooling

  • 5.1.4 Training

  • 5.1.5 Convolutional Neural Networks (LeNet)

5.2 CNN Architectures

  • 5.2.1 AlexNet

  • 5.2.2 VGG

  • 5.2.3 GoogLeNet

  • 5.2.4 Residual Networks (ResNet)

  • 5.2.5 Densely Connected Networks (DenseNet)

  • 5.2.6 U-Net

  • 5.2.7 Transfer Learning

6. Recurrent Neural Networks

6.1 Sequence Models

  • 6.1.1 Recurrent Neural Networks

  • 6.1.2 Learning Parameters

6.2 RNN Architectures

  • 6.2.1 Long Short-Term Memory (LSTM)

  • 6.2.2 Gated Recurrent Units (GRU)

  • 6.2.3 Deep RNN

  • 6.2.4 Bidirectional RNN

  • 6.2.5 Sequence to Sequence Learning

7. Deep Generative Models

7.1 Variational AutoEncoder (VAE)

  • 7.1.1 Dimensionality Reduction

  • 7.1.2 Autoencoders (AE)

  • 7.1.3 Variational AutoEncoders (VAE)

7.2 Generative Adversarial Networks (GANs)

  • 7.2.1 Generator and Discriminator

  • 7.2.2 DCGAN

  • 7.2.3 Conditional GAN (cGAN)

  • 7.2.4 Pix2Pix

  • 7.2.5 CycleGAN

  • 7.2.6 Progressively Growing (ProGAN)

  • 7.2.7 StyleGAN

  • 7.2.8 Wasserstein GAN

7.3 Auto-Regressive Generative Models

  • 7.3.1 PixelRNN

  • 7.3.2 PixelCNN

  • 7.3.3 Gated PixelCNN

  • 7.3.4 PixelCNN++

8. Attention Mechanism

8.1 Sequence to Sequence Learning and Attention

  • 8.1.1 Attention in Seq2Seq Models

  • 8.1.2 Bahdanau Attention and Luong Attention

8.2 Transformer

  • 8.2.1 Positional Encoding

  • 8.2.2 Self-Attention Layer

  • 8.2.3 Multi Head Attention

  • 8.2.4 Transformer End to End

  • 8.2.5 Transformer Applications

9. Computer Vision

9.1 Object Detection

  • 9.1.1 R-CNN

  • 9.1.2 You Only Look Once (YOLO)

  • 9.1.3 Single Shot Detector (SSD)

  • 9.1.4 Spatial Pyramid Pooling (SPP-net)

  • 9.1.5 Feature Pyramid Networks

  • 9.1.6 Deformable Convolutional Networks

  • 9.1.7 DETR: Object Detection with Transformers

9.2 Segmentation

  • 9.2.1 Semantic Segmentation vs. Instance Segmentation

  • 9.2.2 SegNet Neural Network

  • 9.2.3 Atrous Convolutions

  • 9.2.4 Atrous Spatial Pyramid Pooling

  • 9.2.5 Conditional Random Fields for Refining the Final Output

  • 9.2.6 See More Than Once: Kernel-Sharing Atrous Convolution

9.3 Face Recognition and Pose Estimation

  • 9.3.1 Face Recognition

  • 9.3.2 Pose Estimation

9.5 Few-Shot Learning

  • 9.5.1 The Problem

  • 9.5.2 Metric Learning

  • 9.5.3 Meta-Learning (Learning-to-Learn)

  • 9.5.4 Data Augmentation

  • 9.5.5 Zero-Shot Learning

10. Natural Language Processing

10.1 Language Model

  • 10.1.1 N-gram

  • 10.1.2 Word Representation (Vectors)

  • 10.1.3 Word2Vec/GloVe

  • 10.1.4 ELMo - Embeddings from Language Models

  • 10.1.5 Attention/Transformer (GPT)

10.2 Neural Machine Translation

  • 10.2.1 Neural Machine Translation by Jointly Learning to Align and Translate

  • 10.2.2 Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation

  • 10.2.3 ConvS2S

  • 10.2.4 RNMT+

  • 10.2.5 Transformer and Transformer based models

  • 10.2.6 Named Entity Recognition (NER)

  • 10.2.7 Bilingual Evaluation Understudy (BLEU score)

  • 10.2.8 Unsupervised Machine Translation

10.3 Speech Recognition

  • 10.3.1 Connectionist Temporal Classification

  • 10.3.2 Listen, Attend, and Spell

  • 10.3.3 Very Deep Convolutional Networks for End-to-End Speech Recognition

10.4 Document Summarization

Extractive Text Summarization:

  • 10.4.1 TextRank

  • 10.4.2 LexRank

  • 10.4.3 Luhn

  • 10.4.4 Latent Semantic Analysis, LSA

  • 10.4.5 KL-Sum

Abstractive Text Summarization:

  • 10.4.6 T5 Transformers

  • 10.4.7 BART Transformers

  • 10.4.8 GPT-2 Transformers

  • 10.4.9 XLM Transformers

11. Reinforcement Learning

11.1 Introduction to RL

  • 11.1.1 Markov Decision Process (MDP) and RL

  • 11.1.2 Planning

  • 11.1.3 Learning Algorithms

11.2 Exploration and Exploitation

11.3 Planning by Dynamic Programming

11.4 Policy Gradient Methods

11.5 Monte-Carlo

11.6 Temporal-Difference Learning

11.7 Model-Based Algorithms


Stanford CS231n

Machine Learning - Andrew Ng

Dive into Deep Learning

Deep Learning Book

A complete book in Hebrew on machine learning and deep learning