Oxen.ai's repositories
oxen-release
Lightning-fast data version control system for structured and unstructured machine learning datasets. We aim to make versioning datasets as easy as versioning code.
Self-Rewarding-Language-Models
Work by the Oxen.ai community to reproduce the Self-Rewarding Language Models paper from Meta AI.
mamba-dive
Code from our practical dive into using Mamba for information extraction
BitNet-1.58-Instruct
Implementation of instruction tuning for BitNet-1.58
Score-Entropy-Discrete-Diffusion
A modified Score-Entropy-Discrete-Diffusion that trains a character-level ML model and integrates with Oxen
chat-gpt-export
Web app that converts a ChatGPT export into a CSV file
homebrew-oxen
Homebrew release formula for oxen
homebrew-oxen-server
Homebrew release formula for oxen-server
ImageClassification
Boilerplate Repository for Image Classification
CatsVsDogsClassification
Example Jupyter notebook showing how to classify cats vs. dogs using data cloned from Oxen
OxenLM
Testing language models with datasets powered by Oxen
Attention-Is-All-You-Need-PyTorch
A PyTorch implementation of "Attention Is All You Need", using Oxen datasets
Awesome-LLM-Finetuning-Datasets
A list of awesome public datasets for fine-tuning LLMs
Flying-Oxen-Modal
Code for generating a flying oxen using modal.com
Llama-Fine-Tune
Example of fine-tuning Llama 2 and exporting to GGML to run on CPU
Text2SQL
An example of fine-tuning a Text2SQL LLM
TextDiffusionSEDD
A reproduction of the Score Entropy Discrete Diffusion paper, optimized for understanding