huggingface / pytorch-image-models

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXt, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more

Home Page: https://huggingface.co/docs/timm

VAE or VQ-VAE is needed

amirshamaei opened this issue

Is your feature request related to a problem? Please describe.
Currently, the timm library lacks implementations of Variational Autoencoder (VAE) and Vector Quantized VAE (VQ-VAE) models. Users who want to use these autoencoder architectures must either implement them from scratch or integrate external implementations into their projects.
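For context, the core of a VAE is an encoder that predicts the mean and log-variance of an approximate posterior, the reparameterization trick, and a KL term in the loss. A minimal PyTorch sketch (module and variable names here are hypothetical, not existing timm API):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyVAE(nn.Module):
    """Minimal fully-connected VAE; purely illustrative, not a timm module."""
    def __init__(self, in_dim=784, hidden=400, latent=20):
        super().__init__()
        self.enc = nn.Linear(in_dim, hidden)
        self.fc_mu = nn.Linear(hidden, latent)
        self.fc_logvar = nn.Linear(hidden, latent)
        self.dec = nn.Sequential(
            nn.Linear(latent, hidden), nn.ReLU(),
            nn.Linear(hidden, in_dim),
        )

    def forward(self, x):  # x: (B, in_dim)
        h = F.relu(self.enc(x))
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        recon = self.dec(z)
        # Negative ELBO = reconstruction error + KL(q(z|x) || N(0, I))
        rec = F.mse_loss(recon, x, reduction="sum") / x.shape[0]
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1).mean()
        return recon, rec + kl
```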

Describe the solution you'd like
I would like to request the addition of Variational Autoencoder (VAE) and Vector Quantized VAE (VQ-VAE) models to the timm library. This would involve adding modules for these autoencoder architectures that follow existing timm conventions for simplicity and compatibility.
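The distinguishing piece of a VQ-VAE is the vector-quantization bottleneck: a nearest-neighbour codebook lookup trained with codebook and commitment losses and a straight-through gradient. A sketch of such a layer, again with hypothetical names rather than a proposed timm API:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VectorQuantizer(nn.Module):
    """Nearest-neighbour codebook lookup with a straight-through estimator."""
    def __init__(self, num_codes=512, code_dim=64, beta=0.25):
        super().__init__()
        self.beta = beta
        self.codebook = nn.Embedding(num_codes, code_dim)
        self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)

    def forward(self, z):  # z: (B, C, H, W) with C == code_dim
        z_flat = z.permute(0, 2, 3, 1).reshape(-1, z.shape[1])  # (B*H*W, C)
        # Squared L2 distance from each latent vector to every codebook entry
        d = (z_flat.pow(2).sum(1, keepdim=True)
             - 2 * z_flat @ self.codebook.weight.t()
             + self.codebook.weight.pow(2).sum(1))
        idx = d.argmin(dim=1)
        z_q = self.codebook(idx).view(z.shape[0], z.shape[2], z.shape[3], -1)
        z_q = z_q.permute(0, 3, 1, 2).contiguous()
        # Codebook loss pulls codes toward encoder outputs; commitment loss
        # (weighted by beta) keeps encoder outputs near their chosen codes
        loss = F.mse_loss(z_q, z.detach()) + self.beta * F.mse_loss(z, z_q.detach())
        # Straight-through: copy gradients from z_q back to the encoder output z
        z_q = z + (z_q - z).detach()
        return z_q, idx, loss
```

A timm implementation could pair a bottleneck like this with an existing backbone as the encoder (e.g., one created via timm.create_model(..., num_classes=0)), though that design choice is only a suggestion.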

Describe alternatives you've considered
Users can currently implement VAE and VQ-VAE models from scratch or use external implementations from other libraries such as Hugging Face diffusers. However, having native support for these models in the timm library would provide a more streamlined and integrated experience.
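For instance, a pretrained KL-regularized VAE can already be loaded from diffusers. A short sketch (this assumes the stabilityai/sd-vae-ft-mse checkpoint; the 4-channel, 8x-downsampled latent shape is a property of that particular model):

```python
import torch
from diffusers import AutoencoderKL

# Pretrained KL-regularized VAE (the one used by Stable Diffusion)
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")
vae.eval()

images = torch.randn(1, 3, 256, 256)  # dummy batch, values roughly in [-1, 1]
with torch.no_grad():
    latents = vae.encode(images).latent_dist.sample()  # (1, 4, 32, 32)
    recon = vae.decode(latents).sample                 # (1, 3, 256, 256)
```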

@amirshamaei I would love to contribute to this.