kingoflolz / mesh-transformer-jax
Model parallel transformers in JAX and Haiku
Stargazers: 6251 | Watchers: 112 | Issues: 205 | Forks: 890
kingoflolz/mesh-transformer-jax Issues
Colab Demo Notebook Not Working (Updated 2 years ago, 11 comments)
Save failed during checkpoint saving function call (Updated 2 years ago, 4 comments)
Fine-tuning (Closed 3 years ago, 1 comment)
how to implement stop sequence in gpt j (Closed 2 years ago, 3 comments)
Fine Tuning Dataset Format (Updated 2 years ago, 1 comment)
The latest update to PIP breaks installation (Updated 2 years ago, 7 comments)
Error! (Closed 2 years ago, 1 comment)
Using `no_repeat_ngram_size` like HF (Closed 2 years ago, 3 comments)
No TPU found, falling back to CPU (Closed 2 years ago, 1 comment)
invalid syntax in to_hf_weights.py and device_train.py (Closed 2 years ago, 1 comment)
Instruct GPT fine tuning (Closed 2 years ago, 1 comment)
How does it work? (Closed 2 years ago, 3 comments)
Did you use the splits made by the Pile directly? (Closed 2 years ago)
RESOURCE_EXHAUSTED: Failed to allocate request for 256.00MiB (268435456B) on device ordinal 0 (Closed 2 years ago, 1 comment)
To run multiple model on one model (Updated 2 years ago)
TpuEmbeddingEngine_WriteParameters not available in this library. (Closed 2 years ago, 11 comments)
looking at multiple versions of different packages (slow progress of requirements file) (Updated 2 years ago, 4 comments)
Seen floating point types of different precisions in %opt-barrier (Updated 2 years ago, 1 comment)
how to restart training (Closed 2 years ago)
the-eye.eu down - alternative access to GPT-J-6B/step_383500_slim.tar.zstd ? (Closed 3 years ago, 13 comments)
Error: AssertionError: Incompatible checkpoints (8,) vs (8, 4096) (Closed 2 years ago, 1 comment)
how to speed up the inference time (Closed 2 years ago, 2 comments)
having issue in running the model (Closed 2 years ago)
Is "to_hf_weights.py" specific to "6B_roto_256.json" only? (Updated 3 years ago)
save_config_to_hf_format() (Updated 3 years ago)
gpt-neo models are not compatible with this codebase (Updated 3 years ago)
Error fine-tuning train (Closed 3 years ago)
smaller models (Closed 3 years ago, 2 comments)
to_hf_weights script returns "Failed to allocate" error (Closed 3 years ago, 2 comments)
HF model does not work on Torch/XLA (Closed 3 years ago, 1 comment)
read_ckpt getting killed (OOM?) (Closed 3 years ago, 4 comments)
Generating random numbers – None PRNGKey error (Closed 3 years ago, 1 comment)
to_hf_weights.py cpu assertion error (Closed 3 years ago)
Verifying logic for LR schedule (Closed 3 years ago)
Weight download problem (Closed 3 years ago, 5 comments)
Minor requirement conflict on tqdm (Closed 3 years ago, 1 comment)
sequence_length=2049 or 2048? (Closed 3 years ago, 3 comments)
Can I do Fine-Tune GPT-J in colab pro? (Closed 3 years ago, 1 comment)
jax/haiku versions incompatible? (Updated 3 years ago)
end sequence possible? (Closed 3 years ago, 1 comment)
Freeze Transformer Weight (Closed 3 years ago, 3 comments)
sample data configuration for finetuning (Closed 3 years ago, 1 comment)
top-k sampling off by 1 bug (Closed 3 years ago, 1 comment)
Finetuning and training minimum requirements (Closed 3 years ago, 1 comment)
Execute the model in a local machine (or WSL) (Closed 3 years ago, 1 comment)
Pre trained weights for transfer learning (Closed 3 years ago, 1 comment)
How to do pre-train from scratch ? (Closed 3 years ago, 1 comment)
Colab version now breaks on "import optax" (Closed 3 years ago, 1 comment)
How to run v3-128? (Closed 3 years ago, 2 comments)
limitation min_length=1024 (Closed 3 years ago, 1 comment)