kingoflolz / mesh-transformer-jax
Model parallel transformers in JAX and Haiku
Stargazers: 6251 | Watchers: 112 | Issues: 205 | Forks: 890
kingoflolz/mesh-transformer-jax Issues
Colab Demo Notebook Not Working (Updated 2 years ago, 11 comments)
Save failed during checkpoint saving function call (Updated 2 years ago, 4 comments)
Fine-tuning (Closed 3 years ago, 1 comment)
how to implement stop sequence in gpt j (Closed 2 years ago, 3 comments)
Fine Tuning Dataset Format (Updated 2 years ago, 1 comment)
The latest update to PIP breaks installation (Updated 2 years ago, 7 comments)
Error! (Closed 2 years ago, 1 comment)
Using `no_repeat_ngram_size` like HF (Closed 2 years ago, 3 comments)
No TPU found, falling back to CPU (Closed 2 years ago, 1 comment)
invalid syntax in to_hf_weights.py and device_train.py (Closed 2 years ago, 1 comment)
Instruct GPT fine tuning (Closed 2 years ago, 1 comment)
How does it work? (Closed 2 years ago, 3 comments)
Did you use the splits made by the Pile directly? (Closed 2 years ago)
RESOURCE_EXHAUSTED: Failed to allocate request for 256.00MiB (268435456B) on device ordinal 0 (Closed 2 years ago, 1 comment)
To run multiple model on one model (Updated 2 years ago)
TpuEmbeddingEngine_WriteParameters not available in this library. (Closed 2 years ago, 11 comments)
looking at multiple versions of different packages (slow progress of requirements file) (Updated 2 years ago, 4 comments)
Seen floating point types of different precisions in %opt-barrier (Updated 2 years ago, 1 comment)
how to restart training (Closed 2 years ago)
the-eye.eu down - alternative access to GPT-J-6B/step_383500_slim.tar.zstd ? (Closed 3 years ago, 13 comments)
Error: AssertionError: Incompatible checkpoints (8,) vs (8, 4096) (Closed 2 years ago, 1 comment)
how to speed up the inference time (Closed 2 years ago, 2 comments)
having issue in running the model (Closed 2 years ago)
Is "to_hf_weights.py" specific to "6B_roto_256.json" only? (Updated 3 years ago)
save_config_to_hf_format() (Updated 3 years ago)
gpt-neo models are not compatible with this codebase (Updated 3 years ago)
Error fine-tuning train (Closed 3 years ago)
smaller models (Closed 3 years ago, 2 comments)
to_hf_weights script returns "Failed to allocate" error (Closed 3 years ago, 2 comments)
HF model does not work on Torch/XLA (Closed 3 years ago, 1 comment)
read_ckpt getting killed (OOM?) (Closed 3 years ago, 4 comments)
Generating random numbers – None PRNGKey error (Closed 3 years ago, 1 comment)
to_hf_weights.py cpu assertion error (Closed 3 years ago)
Verifying logic for LR schedule (Closed 3 years ago)
Weight download problem (Closed 3 years ago, 5 comments)
Minor requirement conflict on tqdm (Closed 3 years ago, 1 comment)
sequence_length=2049 or 2048? (Closed 3 years ago, 3 comments)
Can I do Fine-Tune GPT-J in colab pro? (Closed 3 years ago, 1 comment)
jax/haiku versions incompatible? (Updated 3 years ago)
end sequence possible? (Closed 3 years ago, 1 comment)
Freeze Transformer Weight (Closed 3 years ago, 3 comments)
sample data configuration for finetuning (Closed 3 years ago, 1 comment)
top-k sampling off by 1 bug (Closed 3 years ago, 1 comment)
Finetuning and training minimum requirements (Closed 3 years ago, 1 comment)
Execute the model in a local machine (or WSL) (Closed 3 years ago, 1 comment)
Pre trained weights for transfer learning (Closed 3 years ago, 1 comment)
How to do pre-train from scratch ? (Closed 3 years ago, 1 comment)
Colab version now breaks on "import optax" (Closed 3 years ago, 1 comment)
How to run v3-128? (Closed 3 years ago, 2 comments)
limitation min_length=1024 (Closed 3 years ago, 1 comment)