Aurora-M

Adapting Starcoderplus for Multimodal Experts

This repository contains code to finetune and run Aurora-M, an open-source Starcoderplus-based model trained on 400B additional tokens of multilingual and multi-domain data, and adapted for multimodal understanding using the BakLLaVA/LLaVA 1.5 code base. The 400B additional tokens were trained with BigCode's Megatron fork. The model is intended for mixture-of-experts (MoE) adaptation using the M*DEL approach; see our M*DEL project page for more details.
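For reference, here is a minimal sketch of loading the base model for text generation with the Hugging Face transformers library. The checkpoint name aurora-m/aurora-m-base is an assumption used for illustration, not a confirmed release name; substitute the actual checkpoint you want to run.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# NOTE: hypothetical checkpoint name for illustration only;
# replace with the actual released Aurora-M checkpoint.
model_id = "aurora-m/aurora-m-base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumes a GPU with bf16 support
    device_map="auto",           # requires the accelerate package
)

# A Starcoderplus-based model handles code completion prompts naturally.
prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```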

Compute was provided by the LUMI and JUWELS supercomputer centers. Thank you!

Also check out our BakLLaVA project, a collaboration between the open-source AI organizations LAION, Ontocord, Skunkworks OSS AI group, and AI Alignment Lab.

Aurora Over BakLLaVA

License: Apache License 2.0

