Mesh TensorFlow: Model Parallelism Made Easier
Stargazers: 1561
Watchers: 50
Issues: 83
Forks: 255
tensorflow/mesh Issues
Error while importing Meshtensorflow (Closed 5 months ago)
Optimizer momentums not properly populated training model with DTensors (Closed 8 months ago, 1 comment)
AttributeError: module 'tensorflow.python.framework.ops' has no attribute 'register_tensor_conversion_function' (Closed 9 months ago, 4 comments)
Does load-balanced loss help the loss converge? (Updated a year ago)
Future of this project? (Updated a year ago, 2 comments)
When running BERT on GPU: Resource exhausted: failed to allocate memory (Updated 2 years ago, 1 comment)
Getting "NanLossDuringTrainingError: NaN loss during training." (Updated 2 years ago)
mask_1_flat and mask_2_flat applied to gates twice? (Updated 2 years ago)
Debug in mesh Tensorflow (Updated 2 years ago, 3 comments)
Mesh-tf model conversion to onnx? (Updated 2 years ago, 2 comments)
About the mixture of expert model (Updated 2 years ago)
How to freeze embedding layers (Updated 3 years ago)
Beam search (Updated 3 years ago)
the `model_executor.py` example is broken (Closed 3 years ago)
Ability to add Custom Tensorflow Hooks (Updated 3 years ago)
[MOE-transformer] How do you build static graph of MOE-Model? (Updated 3 years ago)
How to use tf.contrib.opt.ScipyOptimizerInterface or tfp.optimizer.lbfgs_minimize with MeshTF ? (Updated 3 years ago)
How to assign values to specific slice of a data block on a specific GPU? (Updated 3 years ago)
performing the opposite of mtf.lowering (Updated 3 years ago, 1 comment)
Performance on GPUs and multiple GPU support (Updated 3 years ago, 12 comments)
[Wrong Code Comments] In moe.py, there are two wrong code comments (Updated 3 years ago)
MeshTF + pipeline parallelism? (Closed 3 years ago)
OpenNMT-tf (Updated 3 years ago)
mtf.dropout is inverted (Updated 3 years ago)
Tensorflow Mesh needs documentation. Will this be provided anytime soon? (Updated 3 years ago, 1 comment)
error when learning_rate_schedule is a callable (Updated 3 years ago)
different target score when using logits from sample_autoregressive (Updated 3 years ago)
Mesh tensorflow support for multi-node (Updated 3 years ago, 5 comments)
bias in selfAttention (Updated 4 years ago)
more memory occupation in first device (Closed 4 years ago, 1 comment)
AttributeError: module 'mesh_tensorflow' has no attribute 'auto_mtf' (Updated 4 years ago, 4 comments)
Does this supports tf 2 keras API? (Updated 4 years ago)
Memory issues when using the "distillation" class (Closed 4 years ago, 1 comment)
Appropriate values for model_parallelism and tokens_per_batch to train a t5.small_ssm model on v3_512, v3_1024 and v3_2048 TPUs (Closed 4 years ago, 1 comment)
Predict vs Eval functionality (Updated 4 years ago)
Finetuning a `bfloat16` checkpoint with `float32` (Updated 4 years ago)
Preventing leak in packed sequences (Updated 4 years ago)
Communication Between TPU Cores and Encoder->Reduce->Decoder Pattern (Updated 4 years ago)
PROBLEM=./mesh_tensorflow/transformer/gin/problems/lm1b.gin (Updated 4 years ago)
README.md is outdated (Updated 4 years ago)
Convolution layers in mesh tensorflow (Updated 4 years ago)
tf2 in mesh_tensorflow/utils.py incompatible with tensor2tensor/rl (Updated 4 years ago)
Split along layers (Updated 4 years ago)
[Bug] brackets missing (Updated 4 years ago)
[Bug Fix] Evaluation and Prediction for Aligned model (Updated 4 years ago, 1 comment)
mixed precision support on GPUs (Updated 4 years ago)
Capture performance profile using Tensorboard (Updated 4 years ago)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process (Updated 4 years ago)
SelfAttention & EncDecAttention in mesh transformer allow different values for query, key, value (Updated 4 years ago)
Could you please set to False the default value of ignore_comments? (Updated 4 years ago)