Emin Orhan's repositories
humanlike-vits
ViT models pretrained with up to ~5k hours of human-like video data
silicon-menagerie
Menagerie of models trained on SAYCam (and more)
llm-memory
Memory experiments with LLMs
video-models
Menagerie of video models trained on various video datasets
third-order
An experimental third-order attention model
vqgan-gpt-video
GPT on VQGAN for video
webdataset-example
A simple example of writing sharded tar files with webdataset
optimized-mae
An optimized implementation of masked autoencoders (MAEs)
optimized-stmae
An optimized implementation of spatiotemporal masked autoencoders
adept
ADEPT intuitive physics benchmark data and tools
annealed-attention
An experimental annealed self-attention layer
dit
DiT clone
faster-dit
An even faster DiT clone
hvm-1
Video models trained on ~5k hours of human-like video data
igpt-memory
Can deep learning match the efficiency of human visual long-term memory for object details?
mugs
Mugs
sa1b-downloader
Downloader for the SA-1B dataset
tae
A simple transformer-based autoencoder model
vision
Datasets, Transforms and Models specific to Computer Vision
visual-recognition-memory
Visual recognition memory in humans and machines