cfoster0 / pile-explorer

For exploring the data and documenting its limitations

Exploring the Pile

This repository contains code for exploring the Pile and documenting its limitations.

Language Modeling Data Format

The data in the Pile is stored in the lm_dataformat format, and this repository is designed to work on data stored in that format. For documentation, see the lm_dataformat repository.
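As a rough illustration of the kind of exploration this enables, the sketch below tallies documents and characters per subset from JSON Lines records. It is not code from this repository: the helper names `iter_jsonl_documents` and `summarize` are hypothetical, and the record layout (a `text` field plus a `meta` dictionary with a `pile_set_name` key) is an assumption about how the documents are serialized. It uses only the standard library and reads already-decompressed lines.

```python
import json
from typing import Dict, Iterable, Iterator


def iter_jsonl_documents(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed record per non-empty JSON Lines row."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def summarize(records: Iterable[dict]) -> Dict[str, Dict[str, int]]:
    """Count documents and characters per subset.

    Assumes each record looks like {"text": ..., "meta": {"pile_set_name": ...}};
    records without that metadata key are grouped under "unknown".
    """
    stats: Dict[str, Dict[str, int]] = {}
    for record in records:
        meta = record.get("meta") or {}
        name = meta.get("pile_set_name", "unknown")
        entry = stats.setdefault(name, {"docs": 0, "chars": 0})
        entry["docs"] += 1
        entry["chars"] += len(record.get("text", ""))
    return stats


if __name__ == "__main__":
    sample = [
        '{"text": "hello world", "meta": {"pile_set_name": "ExampleSet"}}',
        '{"text": "no metadata here"}',
    ]
    for subset, counts in summarize(iter_jsonl_documents(sample)).items():
        print(subset, counts)
```

With lm_dataformat installed, the documents could instead come from its reader (its README describes a `Reader(path).stream_data()` interface); the counting logic would stay the same, with only the source of records changing.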

License: MIT License


Languages

Python 100.0%