andrehuang / my-foundation-models

my-foundation-models

This repository collects useful resources on foundation models that I rely on in my work.

Language foundation models

Vision foundation models

  • Class-agnostic segmentation models: SAM, HQ-SAM
  • ImageNet-22k trained: Swin Transformer?
  • Self-supervised models: MAE, DINOv2

Vision-Language models

  • CLIP, DiHT, SigLIP

  • Image tag generation: RAM, RAM++

  • Region-level grounding models: GLaMM, Grounding DINO, Grounded-SAM

  • VQA, caption generation: BLIP2, CaSED, LLaVA
