EleutherAI

Location: The Internet

Home Page: www.eleuther.ai

Twitter: @AIEleuther

EleutherAI's repositories

Language: Python | Stargazers: 8 | Issues: 3 | Issues: 0

pile-literotica

Download, parse, and filter data from Literotica. The output is ready for inclusion in The Pile.

Language: Python | Stargazers: 8 | Issues: 4 | Issues: 0

pile-cc-filtering

The code used to filter CC data for The Pile

Language: Python | License: MIT | Stargazers: 6 | Issues: 2 | Issues: 0

pile-uspto

A script for collecting the USPTO Backgrounds dataset in a language modelling friendly format.

Language: Python | License: MIT | Stargazers: 6 | Issues: 3 | Issues: 1

pile-allpoetry

Scraper to gather poems from allpoetry.com

Language: Python | License: MIT | Stargazers: 3 | Issues: 2 | Issues: 0

pile-ubuntu-irc

A script for collecting the Ubuntu IRC dataset in a language modelling friendly format.

Language: Python | License: MIT | Stargazers: 3 | Issues: 2 | Issues: 0

bucket-cleaner

A small utility to clear out old model checkpoints in Google Cloud Buckets whilst keeping TensorBoard event files

Language: Python | Stargazers: 2 | Issues: 2 | Issues: 0

jusText

Heuristic-based boilerplate removal tool

Language: Python | License: BSD-2-Clause | Stargazers: 1 | Issues: 2 | Issues: 0

lang-filter

Filter text files or archives by language

Language: Python | Stargazers: 1 | Issues: 1 | Issues: 0

pile-cord19

A script for collecting the CORD-19 dataset in a language modelling friendly format.

Language: Python | License: MIT | Stargazers: 1 | Issues: 2 | Issues: 0

discord-role-bot

Control Discord Roles with Reactions

Stargazers: 0 | Issues: 0 | Issues: 0

lingvo

Lingvo

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0