This repository contains a port of the original Mistral-7B model to JAX and Equinox. No pretraining or fine-tuning was done here: the weights are ported from PyTorch to JAX and are provided on an "as is" basis, without warranties or conditions of any kind.
Any official restrictions that come with the original code and model apply here as well. Please check the original license and repository for details.
[1] Mistral 7B - Official code implementation
[2] Generating Long Sequences with Sparse Transformers, Child et al. 2019
[3] Longformer: The Long-Document Transformer, Beltagy et al. 2020