Error in the Stackoverflow Tokenizer example
WilliamYi96 opened this issue
Kai Yi commented
TensorFlow version: 2.5.3
fedjax version: 0.0.16
jax version: 0.4.8
When I follow the docs (https://fedjax.readthedocs.io/en/latest/fedjax.datasets.html#fedjax.datasets.stackoverflow.load_data) to process the Stackoverflow dataset with the following code:
```python
from fedjax.datasets import stackoverflow

# Load partially preprocessed splits.
train, held_out, test = stackoverflow.load_data(cache_dir='../data')

# Apply tokenizer during batching.
tokenizer = stackoverflow.StackoverflowTokenizer()
train_max_length, eval_max_length = 20, 30
train_for_train = train.preprocess_batch(
    tokenizer.as_preprocess_batch(train_max_length))
train_for_eval = train.preprocess_batch(
    tokenizer.as_preprocess_batch(eval_max_length))
```
It fails with the following error:
```
2023-05-06 23:46:33.460149: W tensorflow/core/platform/cloud/google_auth_provider.cc:184] All attempts to get a Google authentication bearer token failed, returning an empty token. Retrieving token from files failed with "Not found: Could not locate the credentials file.". Retrieving token from GCE failed with "Failed precondition: Error executing an HTTP request: libcurl code 6 meaning 'Couldn't resolve host name', error details: Could not resolve host: metadata".
Traceback (most recent call last):
  File "test.py", line 26, in <module>
    tokenizer = stackoverflow.StackoverflowTokenizer()
  File "/home/yik/anaconda2/envs/fl/lib/python3.8/site-packages/fedjax/datasets/stackoverflow.py", line 185, in __init__
    self._table = tf.lookup.StaticVocabularyTable(
  File "/home/yik/anaconda2/envs/fl/lib/python3.8/site-packages/tensorflow/python/ops/lookup_ops.py", line 1255, in __init__
    raise TypeError("Invalid key dtype, expected one of %s, but got %s." %
TypeError: Invalid key dtype, expected one of (tf.int64, tf.string), but got <dtype: 'float32'>.
Exception ignored in: <function CapturableResource.__del__ at 0x2b2156f4c040>
Traceback (most recent call last):
  File "/home/yik/anaconda2/envs/fl/lib/python3.8/site-packages/tensorflow/python/training/tracking/tracking.py", line 269, in __del__
    with self._destruction_context():
AttributeError: 'StaticVocabularyTable' object has no attribute '_destruction_context'
```
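For reference, the dtype check in the traceback can be triggered directly, independent of fedjax. This is a minimal sketch; the float32 keys below are a hypothetical stand-in for whatever the tokenizer's vocabulary loading actually produced on my machine:

```python
import tensorflow as tf

# tf.lookup.StaticVocabularyTable only accepts tf.int64 or tf.string keys;
# any other key dtype raises the same TypeError as in the traceback above.
init = tf.lookup.KeyValueTensorInitializer(
    keys=tf.constant([1.0, 2.0], dtype=tf.float32),  # invalid key dtype
    values=tf.constant([0, 1], dtype=tf.int64))
try:
    tf.lookup.StaticVocabularyTable(init, num_oov_buckets=1)
except TypeError as e:
    print(e)
```

So somehow the vocabulary keys end up as float32 instead of strings before the table is built.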
Could you please help fix this?
Wu, Ke commented
Sorry for the late response. Somehow the email notification slipped through all team members' inboxes.
I cannot reproduce the problem on TensorFlow 2.5.3 installed from pip. Could you tell us how you installed the packages?
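One quick way to share that, in case it helps (a sketch; assumes the packages were installed with pip):

```shell
# Show version and install location for each package; the Location field
# distinguishes pip wheels from conda or editable installs.
pip show tensorflow fedjax jax | grep -E '^(Name|Version|Location)'
```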