saiful9379 / Vision_Transformer

Vision Transformer (ViT) is a neural network architecture for image classification. Unlike traditional convolutional neural networks (CNNs), which rely on convolutions to extract local features from an image, ViT splits the image into fixed-size patches, treats them as a sequence of tokens, and applies self-attention so that every patch can attend to every other patch, yielding global features for classification.
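The contrast with local convolutions can be made concrete with a minimal NumPy sketch of the two core steps: cutting an image into patches and running single-head self-attention over them. This is an illustration only, not the code in this repository. The function names (`image_to_patches`, `self_attention`), the 32x32 input, the 8x8 patch size, the embedding width of 64, and the random weights are all assumptions chosen for the example. A full ViT also adds a class token, positional embeddings, multiple heads, and MLP blocks, which are omitted here.

```python
import numpy as np


def image_to_patches(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    gh, gw = h // patch_size, w // patch_size
    # (gh, p, gw, p, c) -> (gh, gw, p, p, c) -> (num_patches, p * p * c)
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(gh * gw, patch_size * patch_size * c)


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a token sequence."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)  # each row: one patch attending to all patches
    return weights @ v, weights


# Hypothetical sizes and random (untrained) weights, for illustration only.
rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))
patches = image_to_patches(image, patch_size=8)

dim = 64
w_embed = rng.normal(scale=0.02, size=(patches.shape[1], dim))
tokens = patches @ w_embed  # linear patch embedding

w_q, w_k, w_v = (rng.normal(scale=0.02, size=(dim, dim)) for _ in range(3))
out, weights = self_attention(tokens, w_q, w_k, w_v)

print(patches.shape, out.shape, weights.shape)
```

Each row of `weights` spans all 16 patches, so the output for a single patch can draw on information from anywhere in the image after just one layer. A convolution, by comparison, only sees a small neighborhood per layer.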
