kyegomez / GPT4o

Community Open Source Implementation of GPT4o in PyTorch

Home Page:https://discord.gg/7VckQVxvKk

GPT4o

Install

Architecture

  • TikToken tokenizer: the tokenizer is the one component we know for sure, since it is published in tiktoken.
  • The model understands images and audio natively. There are two approaches: process them natively, or use a separate encoder for each modality. I think they are using encoders such as Whisper and ViT, for simplicity and brevity.
  • DALL-E 3 as the output head to generate images
  • Special tokens to denote when to generate an image or audio
  • Whisper output head for the audio outputs
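
The "tokens to denote when to generate an image or audio" idea can be sketched in a few lines of plain Python. This is only an illustration of the routing step, not the repository's code: the marker names (`<|image_start|>` and so on), the `Segment` class, and `route_segments` are all hypothetical, since the README does not say what the special tokens are called.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical marker tokens; the README does not name the real ones.
BEGIN_IMAGE, END_IMAGE = "<|image_start|>", "<|image_end|>"
BEGIN_AUDIO, END_AUDIO = "<|audio_start|>", "<|audio_end|>"

_OPEN = {BEGIN_IMAGE: "image", BEGIN_AUDIO: "audio"}
_CLOSE = {"image": END_IMAGE, "audio": END_AUDIO}


@dataclass
class Segment:
    modality: str      # "text", "image", or "audio"
    tokens: List[str]  # tokens inside the segment, markers excluded


def route_segments(tokens: List[str]) -> List[Segment]:
    """Split a decoded token stream into per-modality segments."""
    segments: List[Segment] = []
    mode = "text"
    buf: List[str] = []

    def flush() -> None:
        # Emit the buffered tokens under the current mode, then reset.
        nonlocal buf
        if buf:
            segments.append(Segment(mode, buf))
        buf = []

    for tok in tokens:
        if mode == "text" and tok in _OPEN:
            flush()               # close the running text segment
            mode = _OPEN[tok]     # switch to the image or audio head
        elif mode != "text" and tok == _CLOSE[mode]:
            flush()               # close the image or audio segment
            mode = "text"
        else:
            buf.append(tok)
    flush()                       # emit whatever is left at end of stream
    return segments
```

In a full model, each `image` segment would be handed to the image output head and each `audio` segment to the audio output head (DALL-E 3 and Whisper in the guess above), while `text` segments are decoded as usual.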

License

MIT


Languages

Shell 59.3%, Python 40.7%