Implementation of a Vision Transformer (ViT) model for image classification on a custom dataset (the pyCOCO dataset). The model uses the transformer architecture to classify images into 5 categories.
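
A ViT treats an image as a sequence of fixed-size patches, which are then embedded and passed through a transformer encoder. The repository's own code is not reproduced here, so the following is only a minimal sketch of the patch-splitting step. The function name `split_into_patches`, the toy 4x4 image, and the patch size of 2 are all assumptions made for illustration, not values taken from the project.

```python
def split_into_patches(image, patch_size):
    """Split an H x W image (a list of rows) into flattened, non-overlapping patches.

    Hypothetical helper for illustration; not the repository's implementation.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")

    patches = []
    # Walk the image in row-major order, one patch at a time.
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten the patch_size x patch_size block into a single vector.
            patch = [
                image[r][c]
                for r in range(top, top + patch_size)
                for c in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# Toy single-channel 4x4 "image" whose pixel values are 0..15 (assumed data).
image = [[r * 4 + c for c in range(4)] for r in range(4)]
patches = split_into_patches(image, patch_size=2)

# (4 / 2) * (4 / 2) = 4 patches, each flattened to 2 * 2 = 4 values.
print(len(patches), len(patches[0]))
```

In a full model, each flattened patch would typically be projected to an embedding vector, combined with a position embedding, and fed to the encoder; a classification head with 5 outputs would then match the 5 categories mentioned above.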