AakashKumarNain / mistral_jax

This is a port of the Mistral-7B model to JAX.


Mistral-7B reference implementation in JAX and Equinox

This repository contains a port of the original Mistral-7B model to JAX and Equinox. The model was not pretrained or fine-tuned here: the weights are ported from PyTorch to JAX and are provided on an "as is" basis, without warranties or conditions of any kind.

Any official restrictions that come with the original code and model apply here as well. Please check the original license and repository for details.
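As a rough illustration of what porting weights from PyTorch to JAX can involve, the sketch below converts a dictionary of parameters into JAX arrays by going through NumPy. It is not the conversion script used in this repository. The function name `port_state_dict`, the parameter names, and the tiny shapes are hypothetical, and NumPy arrays stand in for PyTorch tensors so the example runs without PyTorch installed.

```python
import numpy as np
import jax.numpy as jnp


def port_state_dict(state_dict):
    """Convert a mapping of parameter names to array-likes into JAX arrays.

    Hypothetical helper for illustration only.
    """
    ported = {}
    for name, tensor in state_dict.items():
        # With real PyTorch tensors one would typically call
        # tensor.detach().cpu().numpy() first; the stand-ins used
        # here are already NumPy arrays.
        ported[name] = jnp.asarray(np.asarray(tensor))
    return ported


# Hypothetical parameter names and toy shapes, not the real checkpoint layout.
fake_state_dict = {
    "layers.0.attention.wq.weight": np.ones((4, 4), dtype=np.float32),
    "norm.weight": np.arange(4, dtype=np.float32),
}

jax_params = port_state_dict(fake_state_dict)
for name, value in jax_params.items():
    print(name, value.shape, value.dtype)
```

Keeping the parameter names unchanged in the output dictionary makes it easier to check the ported values against the source checkpoint one entry at a time.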

References

[1] Mistral 7B: Official code implementation

[2] Generating Long Sequences with Sparse Transformers, Child et al. 2019

[3] Longformer: The Long-Document Transformer, Beltagy et al. 2020

About


License: MIT License


Languages

Python 97.6%, Shell 2.4%