Metaseq

A codebase for working with Open Pre-trained Transformers.

Community Integrations

Using OPT with 🤗 Transformers

The OPT 125M--66B models are now available in Hugging Face Transformers. You can access them under the facebook organization on the Hugging Face Hub.
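As a rough illustration of what loading one of these checkpoints might look like, here is a minimal sketch. It assumes the `transformers` and `torch` packages are installed and that checkpoints follow a `facebook/opt-<size>` naming pattern on the Hub. The helper names `opt_model_id` and `generate`, and the list of sizes, are hypothetical and not part of metaseq or Transformers.

```python
# Sizes in the 125M--66B range mentioned above. The "facebook/opt-<size>"
# naming pattern is an assumption about how the Hub checkpoints are named.
OPT_SIZES = ("125m", "350m", "1.3b", "2.7b", "6.7b", "13b", "30b", "66b")


def opt_model_id(size: str = "125m") -> str:
    """Return the assumed Hub identifier for an OPT checkpoint size."""
    size = size.lower()
    if size not in OPT_SIZES:
        raise ValueError(f"unknown OPT size {size!r}; expected one of {OPT_SIZES}")
    return f"facebook/opt-{size}"


def generate(prompt: str, size: str = "125m", max_new_tokens: int = 20) -> str:
    """Download the checkpoint (on first use) and continue `prompt`."""
    # Imported lazily so opt_model_id() works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = opt_model_id(size)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Tokenize to PyTorch tensors, generate a continuation, and decode it.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Hello, my name is")` would download the smallest checkpoint and return the prompt plus a short continuation. The larger sizes need substantially more memory, so starting with `125m` is the safer choice when trying this out.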

Using OPT-175B with Alpa

The OPT 125M--175B models are now supported in the Alpa project, which enables serving OPT-175B with more flexible parallelism on older generations of GPUs, such as 40GB A100, V100, T4, and M60.

Using OPT with Colossal-AI

The OPT models are now supported in Colossal-AI, which helps users run OPT training and inference quickly and efficiently, reducing both the budget needed for large AI models and the labor cost of learning and deployment.

Getting Started in Metaseq

Follow setup instructions here to get started.

Documentation on workflows

Training
API

Background Info

Support

If you have any questions, bug reports, or feature requests regarding either the codebase or the models released in the projects section, please don't hesitate to post on our GitHub Issues page.

Please remember to follow our Code of Conduct.

Contributing

We welcome PRs from the community!

You can find information about contributing to metaseq in our Contributing document.

The Team

Metaseq is currently maintained by the CODEOWNERS: Susan Zhang, Naman Goyal, Punit Singh Koura, Moya Chen, Kurt Shuster, Ruan Silva, David Esiobu, Igor Molybog, Peter Albert, Sharan Narang, Andrew Poulton, Nikolay Bashlykov, and Binh Tang.

Previous maintainers include: Stephen Roller, Anjali Sridhar, Christopher Dewan.

License

The majority of metaseq is licensed under the MIT license; however, portions of the project are available under separate license terms:

Megatron-LM is licensed under the Megatron-LM license
