Stargazers: 1380
Watchers: 32
Issues: 100
Forks: 117
EleutherAI/the-pile Issues
Link in Readme produces 404
Updated
10 days ago
Comments count
15
Could you possibly share the 825GB pile data temporarily and unofficially?
Updated
2 months ago
Comments count
1
Question regarding Shuffling
Updated
2 months ago
Comments count
1
Issue reproducing the GitHub partition
Updated
5 months ago
Comments count
3
Meta data `file_name` in the GitHub part of The Pile a bit off
Updated
5 months ago
Comments count
2
"Github" code data download only
Updated
5 months ago
Comments count
2
link for book3
Updated
7 months ago
Comments count
1
When accessing https://the-eye.eu/public/AI/pile_preliminary_components/, a 404 error occurs
Updated
9 months ago
Mismatched data size Problem
Closed
9 months ago
book3 metadata
Updated
10 months ago
Any search tools?
Updated
a year ago
Ubuntu IRC broken encoding, impacting generative models downstream
Updated
a year ago
Comments count
6
pass2_shuffle_holdout.py - ModuleNotFoundError: No module named 'parse'
Closed
a year ago
Comments count
1
URL Links
Updated
a year ago
Comments count
2
(Natural) Languages in The PILE
Updated
a year ago
Comments count
1
Appending data to the Pile.
Updated
a year ago
Comments count
1
Suggested corpus: Adult stories
Updated
a year ago
Comments count
1
Cannot download data, error
Updated
a year ago
Reducing download size
Updated
a year ago
Pile-CC Size
Updated
a year ago
ConvoKit datasets
Closed
a year ago
Comments count
2
Accepting submissions to the Pile
Closed
a year ago
Comments count
1
Public website to explore dataset
Updated
a year ago
Comments count
1
failed to download stackexchange
Updated
a year ago
Comments count
1
tfds_pile
Updated
2 years ago
Scripts for dedup and filter Common Crawl?
Updated
2 years ago
Comments count
1
import fasttext_pybind as fasttext fails with undefined symbol
Updated
2 years ago
download website is not accessible
Updated
2 years ago
Comments count
1
Royalroad
Closed
3 years ago
Comments count
1
SHA256 Sums
Closed
3 years ago
Comments count
1
Code generation
Closed
3 years ago
Comments count
1
Paper checklist
Closed
3 years ago
Caucasian Languages Dataset
Closed
3 years ago
Make treemaps
Closed
3 years ago
Comments count
1
Russian dialogs and stories from the Pickabu website
Closed
3 years ago
PDF parsing
Closed
3 years ago
Comments count
11
Israeli Legal Databases
Closed
3 years ago
Legal Contracts
Closed
3 years ago
Comments count
1
Set up webpage
Closed
3 years ago
Comments count
1
Debate notes
Closed
3 years ago
Comments count
5
case.law
Closed
3 years ago
Comments count
3
Multilingual Wikipedia
Closed
3 years ago
Comments count
2
Southern African Legal Datasets
Closed
3 years ago
Comments count
1
European Patent Office
Closed
3 years ago
Comments count
1
Royal Society Publishing
Closed
3 years ago
Comments count
2
Exploiting bitexts
Closed
3 years ago
Comments count
2
Api
Closed
3 years ago
Early Buddhism
Closed
3 years ago
Comments count
4
ImportError: cannot import name 'train_chars' from 'the_pile.pile'
Closed
3 years ago
Comments count
5
The `--using` flag doesn't actually do anything
Closed
3 years ago