Error in the Stackoverflow Tokenizer example
WilliamYi96 opened this issue
Kai Yi commented
TensorFlow version: 2.5.3
fedjax version: 0.0.16
jax version: 0.4.8
When I follow the docs (https://fedjax.readthedocs.io/en/latest/fedjax.datasets.html#fedjax.datasets.stackoverflow.load_data) to process the Stackoverflow dataset with the following code:
```python
from fedjax.datasets import stackoverflow

# Load partially preprocessed splits.
train, held_out, test = stackoverflow.load_data(cache_dir='../data')

# Apply tokenizer during batching.
tokenizer = stackoverflow.StackoverflowTokenizer()
train_max_length, eval_max_length = 20, 30
train_for_train = train.preprocess_batch(
    tokenizer.as_preprocess_batch(train_max_length))
train_for_eval = train.preprocess_batch(
    tokenizer.as_preprocess_batch(eval_max_length))
```
It fails with the following error:
```
2023-05-06 23:46:33.460149: W tensorflow/core/platform/cloud/google_auth_provider.cc:184] All attempts to get a Google authentication bearer token failed, returning an empty token. Retrieving token from files failed with "Not found: Could not locate the credentials file.". Retrieving token from GCE failed with "Failed precondition: Error executing an HTTP request: libcurl code 6 meaning 'Couldn't resolve host name', error details: Could not resolve host: metadata".
Traceback (most recent call last):
  File "test.py", line 26, in <module>
    tokenizer = stackoverflow.StackoverflowTokenizer()
  File "/home/yik/anaconda2/envs/fl/lib/python3.8/site-packages/fedjax/datasets/stackoverflow.py", line 185, in __init__
    self._table = tf.lookup.StaticVocabularyTable(
  File "/home/yik/anaconda2/envs/fl/lib/python3.8/site-packages/tensorflow/python/ops/lookup_ops.py", line 1255, in __init__
    raise TypeError("Invalid key dtype, expected one of %s, but got %s." %
TypeError: Invalid key dtype, expected one of (tf.int64, tf.string), but got <dtype: 'float32'>.
Exception ignored in: <function CapturableResource.__del__ at 0x2b2156f4c040>
Traceback (most recent call last):
  File "/home/yik/anaconda2/envs/fl/lib/python3.8/site-packages/tensorflow/python/training/tracking/tracking.py", line 269, in __del__
    with self._destruction_context():
AttributeError: 'StaticVocabularyTable' object has no attribute '_destruction_context'
```
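For reference, the dtype check in the traceback can be triggered directly, independent of fedjax. This is a minimal sketch; the float32 keys below are a hypothetical stand-in for whatever the tokenizer's vocabulary loading actually produced on my machine:

```python
import tensorflow as tf

# tf.lookup.StaticVocabularyTable only accepts tf.int64 or tf.string keys;
# any other key dtype raises the same TypeError as in the traceback above.
init = tf.lookup.KeyValueTensorInitializer(
    keys=tf.constant([1.0, 2.0], dtype=tf.float32),  # invalid key dtype
    values=tf.constant([0, 1], dtype=tf.int64))
try:
    tf.lookup.StaticVocabularyTable(init, num_oov_buckets=1)
except TypeError as e:
    print(e)
```

So somehow the vocabulary keys end up as float32 instead of strings before the table is built.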
Could you please help fix this?
Wu, Ke commented
Sorry for the late response. Somehow the email notification slipped through all team members' inboxes.
I cannot reproduce the problem on TensorFlow 2.5.3 installed from pip. Could you tell us how you installed the packages?
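One quick way to share that, in case it helps (a sketch; assumes the packages were installed with pip):

```shell
# Show version and install location for each package; the Location field
# distinguishes pip wheels from conda or editable installs.
pip show tensorflow fedjax jax | grep -E '^(Name|Version|Location)'
```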