google-research / robotics_transformer

Failing at loading checkpoints

AliBuildsAI opened this issue

Hi,

I am trying to load the checkpoints. I followed #11 and ran this code:

from tf_agents.policies import py_tf_eager_policy

saved_path = './trained_checkpoints/rt1main'

# Load the exported SavedModel policy from the checkpoint directory.
py_tf_eager_policy.SavedModelPyTFEagerPolicy(
    model_path=saved_path,
    load_specs_from_pbtxt=True,
    use_tf_function=True,
)

But I am getting this error:

Traceback (most recent call last):
  File "/home/ali/workspace/repos/google-research/robotics_transformer/load_checkpoints.py", line 7, in <module>
    py_tf_eager_policy.SavedModelPyTFEagerPolicy(
  File "/home/ali/anaconda3/envs/rt9/lib/python3.8/site-packages/gin/config.py", line 1605, in gin_wrapper
    utils.augment_exception_message_and_reraise(e, err_str)
  File "/home/ali/anaconda3/envs/rt9/lib/python3.8/site-packages/gin/utils.py", line 41, in augment_exception_message_and_reraise
    raise proxy.with_traceback(exception.__traceback__) from None
  File "/home/ali/anaconda3/envs/rt9/lib/python3.8/site-packages/gin/config.py", line 1582, in gin_wrapper
    return fn(*new_args, **new_kwargs)
  File "/home/ali/anaconda3/envs/rt9/lib/python3.8/site-packages/tf_agents/policies/py_tf_eager_policy.py", line 179, in __init__
    policy = tf.compat.v2.saved_model.load(model_path)
  File "/home/ali/anaconda3/envs/rt9/lib/python3.8/site-packages/tensorflow/python/saved_model/load.py", line 936, in load
    result = load_internal(export_dir, tags, options)["root"]
  File "/home/ali/anaconda3/envs/rt9/lib/python3.8/site-packages/tensorflow/python/saved_model/load.py", line 974, in load_internal
    loader = loader_cls(object_graph_proto, saved_model_proto, export_dir,
  File "/home/ali/anaconda3/envs/rt9/lib/python3.8/site-packages/tensorflow/python/saved_model/load.py", line 187, in __init__
    self._restore_checkpoint()
  File "/home/ali/anaconda3/envs/rt9/lib/python3.8/site-packages/tensorflow/python/saved_model/load.py", line 560, in _restore_checkpoint
    load_status = saver.restore(variables_path, self._checkpoint_options)
  File "/home/ali/anaconda3/envs/rt9/lib/python3.8/site-packages/tensorflow/python/training/tracking/util.py", line 1351, in restore
    object_graph_string = reader.get_tensor(base.OBJECT_GRAPH_PROTO_KEY)
  File "/home/ali/anaconda3/envs/rt9/lib/python3.8/site-packages/tensorflow/python/training/py_checkpoint_reader.py", line 66, in get_tensor
    return CheckpointReader.CheckpointReader_GetTensor(
IndexError: Read less bytes than requested
  In call to configurable 'SavedModelPyTFEagerPolicy' (<class 'tf_agents.policies.py_tf_eager_policy.SavedModelPyTFEagerPolicy'>)

Process finished with exit code 1

I am using Python 3.8.0 and the following packages:

(rt9) λ › pip list
Package                       Version
----------------------------- ---------
absl-py                       1.4.0
astunparse                    1.6.3
cachetools                    5.3.0
certifi                       2022.12.7
charset-normalizer            3.1.0
cloudpickle                   2.2.1
decorator                     5.1.1
dill                          0.3.6
dm-tree                       0.1.8
etils                         1.1.1
flatbuffers                   23.3.3
gast                          0.5.3
gin-config                    0.5.0
google-auth                   2.16.2
google-auth-oauthlib          0.4.6
google-pasta                  0.2.0
googleapis-common-protos      1.59.0
grpcio                        1.51.3
gym                           0.26.2
gym-notices                   0.0.8
h5py                          3.8.0
idna                          3.4
importlib-metadata            6.1.0
importlib-resources           5.12.0
keras                         2.8.0
Keras-Preprocessing           1.1.2
libclang                      15.0.6.1
Markdown                      3.4.1
MarkupSafe                    2.1.2
numpy                         1.24.2
oauthlib                      3.2.2
opt-einsum                    3.3.0
packaging                     23.0
Pillow                        9.4.0
pip                           23.0.1
promise                       2.3
protobuf                      3.19.6
pyasn1                        0.4.8
pyasn1-modules                0.2.8
requests                      2.28.2
requests-oauthlib             1.3.1
rsa                           4.9
setuptools                    65.6.3
six                           1.16.0
tensorboard                   2.8.0
tensorboard-data-server       0.6.1
tensorboard-plugin-wit        1.8.1
tensorflow                    2.8.2
tensorflow-addons             0.17.1
tensorflow-datasets           4.6.0
tensorflow-estimator          2.8.0
tensorflow-hub                0.12.0
tensorflow-io-gcs-filesystem  0.26.0
tensorflow-metadata           1.9.0
tensorflow-model-optimization 0.7.2
tensorflow-probability        0.16.0
tensorflow-text               2.8.2
termcolor                     2.2.0
tf-agents                     0.12.0
toml                          0.10.2
tqdm                          4.65.0
typeguard                     3.0.1
typing_extensions             4.5.0
urllib3                       1.26.15
Werkzeug                      2.2.3
wheel                         0.38.4
wrapt                         1.15.0
zipp                          3.15.0

Hi, have you solved this problem? I am getting the same error. It would be great if you could share a solution or any advice.

Hi, no, I could not solve it.

The problem has been solved! You need to download the repo using "git lfs", not plain "git" or the zip file.
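
For anyone hitting the same error, here is a minimal sketch for checking whether the checkpoint files were actually downloaded. It assumes the cause is the usual one for a clone made without Git LFS: the large files are left as small text pointer stubs that begin with the line "version https://git-lfs.github.com/spec/v1", which would fit the "Read less bytes than requested" message. The helper name find_lfs_pointers is made up for this example and is not part of the repo or of tf_agents; the path is the one used in the snippet above.

```python
import os

# Every Git LFS pointer file starts with this line (Git LFS pointer format).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(root):
    """Return sorted paths under `root` that look like Git LFS pointer stubs."""
    pointers = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Only the first few bytes are needed to recognise a pointer stub.
            with open(path, "rb") as f:
                head = f.read(len(LFS_POINTER_PREFIX))
            if head == LFS_POINTER_PREFIX:
                pointers.append(path)
    return sorted(pointers)


if __name__ == "__main__":
    # Path taken from the snippet in this issue; adjust to your checkout.
    saved_path = "./trained_checkpoints/rt1main"
    stubs = find_lfs_pointers(saved_path)
    if stubs:
        print("These files are Git LFS pointers, not real data:")
        for p in stubs:
            print("  ", p)
    else:
        print("No Git LFS pointer files found under", saved_path)
```

If any files are listed, running "git lfs install" and then "git lfs pull" inside the clone should replace the stubs with the real checkpoint data.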