facebookresearch / metaseq

Repo for external large-scale work

facebookresearch/metaseq Issues

train opt-125M from scratch
Updated a month ago · 2
Weights/Code for CM3Leon
Updated 5 months ago · 2
How to load the checkpoints into a HF model?
Updated 6 months ago
How to generate a binary data file?
Closed a year ago · 1
I changed Num_head of OPT-1.3b, and it causes CUDA Error: IndexSelectLargeIndex
Updated a year ago
Access request for opt-175b
Updated a year ago · 1
setup to pyproject
Updated a year ago
Process blocks when deploying OPT-1.3B with FasterTransformer
Closed a year ago
How can I pretrain an opt-model with the codes?
Updated a year ago
pre-training script failed
Closed a year ago · 6
Possible feature and bugfix contributions from Microsoft research team's fork of Metaseq
Updated a year ago · 4
How does OPT perform on Chinese dialogue?
Updated a year ago
Grammatical Error Correction (GEC) prompt for OPT-IML
Updated a year ago
load checkpoint failed when training with multi-nodes.
Closed a year ago · 1
FSDP is incompatible with BF16
Closed a year ago · 4
Sub-workers exit without messages
Updated a year ago · 6
OPT and LLaMA
Closed a year ago · 1
How to finetune from a consolidated model?
Updated a year ago · 1
Add type hints to all methods
Updated a year ago
Deprecate convert_to_singleton
Updated a year ago · 13
Converting OPT-175B tokenizer to HF format?
Updated a year ago · 2
Confirm md5sums after running reshard_fsdp.py on OPT-175B #702
Closed a year ago · 3
Incorrect md5sums after running reshard_fsdp.py on OPT-175B
Closed a year ago · 2
generation_args["stop"] doesn't work for stop sequence "\n\n"
Updated a year ago
Downloading opt-66B part7 gets access denied
Closed a year ago · 1
Torch setup
Closed a year ago · 2
convert_to_singleton doesn't seem to handle bias properly
Updated a year ago
`reshard_mp.py --num-output-parts 1` merges to smaller OPT file
Closed a year ago · 3
Re-release consolidated OPT / OPT-IML checkpoints
Updated a year ago · 2
Implement `finish_reason` in API response
Updated a year ago
Remove `--reset-lr-scheduler` flag
Updated a year ago
Very weird predictions of OPT-IML-30B on Blended Skill Talk dataset.
Closed a year ago · 4
OPT-IML Bench release?
Closed a year ago · 2
A scholar who loves AI asked
Closed a year ago · 1
RuntimeError
Closed a year ago · 1
Access request for opt-175b
Closed a year ago · 1
Errors: Could not load 'base_config'
Updated a year ago · 2
How much GPU memory is needed for opt-175B fine-tuning?
Closed a year ago · 2
Example scripts for training are not working at all.
Closed a year ago · 1
Implement --update-freq for StreamingLanguageModeling task
Updated a year ago
Convert to singleton script doesn't work using a unified tokenizer file
Closed a year ago
Bring CM3 in!
Closed a year ago
how to get sharded ckpt
Updated a year ago · 4
No response for access of 175B checkpoints
Closed a year ago
Failure after loading checkpoint shards.
Closed a year ago
metaseq-train hangs at global barrier when 2-nodes launch.
Closed a year ago · 5
How to launch pretraining jobs without a Slurm cluster
Closed a year ago
kill code of "zero_sharding" in metaseq
Closed a year ago
How long does it take to get a reply after filling out the Request Form?
Closed a year ago · 1
Why is ReLU used?
Closed a year ago · 2