mlfoundations/dclm
DataComp for Language Models
Stargazers: 820
Watchers: 34
Issues: 20
Forks: 72
mlfoundations/dclm Issues
Missing scale configs?
Closed
4 days ago
Comments count: 1
How to train and fine-tune the model
Updated
5 days ago
BFF code?
Updated
6 days ago
Missing files or bugs in evaluation code?
Updated
8 days ago
Any web demo?
Updated
8 days ago
botocore.exceptions.NoCredentialsError: Unable to locate credentials
Closed
8 days ago
Comments count: 1
Which data file corresponds to table 4 fasttext?
Closed
a month ago
Comments count: 7
Unable to run `eval/eval_openlm_ckpt.py`
Closed
13 days ago
Comments count: 10
Ray Actor dies during tokenization process
Updated
14 days ago
Comments count: 1
ArrowConversionError when running tokenization
Closed
15 days ago
Comments count: 12
Causal Transformer for Perplexity
Closed
18 days ago
Comments count: 3
Would you share the 0.28T token dataset used to achieve the highest scores in the 7B-2x experiment?
Closed
18 days ago
Comments count: 1
Tokenization file missing
Closed
24 days ago
Comments count: 2
Accessing S3 bucket dcnlp-west
Closed
a month ago
Comments count: 4
Data download script
Closed
25 days ago
Comments count: 2
Request to DCLM-Pool
Closed
a month ago
Comments count: 3
Duplicated licenses
Closed
a month ago
Comments count: 1
How to find CORE, MMLU, EXTENDED values in the eval json?
Closed
a month ago
Comments count: 2
Why are all of these leaderboards empty?
Closed
a month ago
Comments count: 1
Question regarding the evaluation
Closed
a month ago
Comments count: 1