KeyError: "The name 'Identity:0' refers to a Tensor which does not exist. The operation, 'Identity', does not exist in the graph."
karantai opened this issue
I am trying to apply federated learning to an LSTM model. After transforming the data and the model as expected without any issues, my code raises an error on this line:
train_state = training_process.initialize()
with the error:
KeyError: "The name 'Identity:0' refers to a Tensor which does not exist. The operation, 'Identity', does not exist in the graph."
I searched my model's graph for the name Identity:0 but found nothing. I suspect that something goes wrong with the naming among the clients, but I do not know where to start.
Any ideas?
My code is large, so if anyone needs a part of it, just ask.
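For readers who want to repeat the graph search described above, here is a minimal sketch of one way to check whether a tensor name exists in a `tf.Graph`. The two toy graphs and the helper `has_tensor` are illustrative assumptions, not part of the original code; the lookup uses the same `get_tensor_by_name` call that appears in the stack trace.

```python
import tensorflow as tf


def has_tensor(graph, tensor_name):
    """Return True if `tensor_name` (e.g. 'Identity:0') exists in `graph`."""
    try:
        graph.get_tensor_by_name(tensor_name)
        return True
    except KeyError:
        # Same exception type as in the reported error.
        return False


# Toy graph (assumption) that contains an op named 'Identity'.
graph_with_identity = tf.Graph()
with graph_with_identity.as_default():
    x = tf.constant([1.0, 2.0], name="x")
    tf.identity(x, name="Identity")

# Toy graph (assumption) without such an op.
graph_without_identity = tf.Graph()
with graph_without_identity.as_default():
    tf.constant([1.0, 2.0], name="x")

# Listing op names is a quick way to see what a graph really contains.
op_names = [op.name for op in graph_with_identity.get_operations()]
print(op_names)
print(has_tensor(graph_with_identity, "Identity:0"))
print(has_tensor(graph_without_identity, "Identity:0"))
```

If the name is missing from the graph you inspect, the output tensor was probably registered against a different graph than the one being searched.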
This is the complete error message:
main()
File "....", line 137, in main
train_state = training_process.initialize()
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow_federated/python/core/impl/computation/computation_impl.py", line 139, in __call__
return self._context_stack.current.invoke(self, arg)
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow_federated/python/core/impl/execution_contexts/sync_execution_context.py", line 65, in invoke
return self._async_runner.run_coro_and_return_result(
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow_federated/python/common_libs/async_utils.py", line 224, in run_coro_and_return_result
return future.result()
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/concurrent/futures/_base.py", line 446, in result
return self.__get_result()
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
raise self._exception
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow_federated/python/common_libs/retrying.py", line 119, in retry_coro_fn
raise e
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow_federated/python/common_libs/retrying.py", line 109, in retry_coro_fn
result = await fn(*args, **kwargs)
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow_federated/python/core/impl/execution_contexts/async_execution_context.py", line 220, in invoke
comp = self._compiler_pipeline.compile(comp)
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow_federated/python/core/impl/execution_contexts/compiler_pipeline.py", line 51, in compile
return self._compiler_fn(comp)
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow_federated/python/core/backends/native/compiler.py", line 144, in desugar_and_transform_to_native
native_form = transform_to_native_form(
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow_federated/python/core/backends/native/compiler.py", line 87, in transform_to_native_form
compiled_computation_transformations.optimize_tensorflow_graphs(
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow_federated/python/core/impl/compiler/compiled_computation_transformations.py", line 134, in optimize_tensorflow_graphs
return transformation_utils.transform_postorder(
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow_federated/python/core/impl/compiler/transformation_utils.py", line 105, in transform_postorder
result, result_modified = transform_postorder(comp.result, transform)
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow_federated/python/core/impl/compiler/transformation_utils.py", line 116, in transform_postorder
value, value_modified = transform_postorder(value, transform)
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow_federated/python/core/impl/compiler/transformation_utils.py", line 97, in transform_postorder
arg, arg_modified = transform_postorder(comp.argument, transform)
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow_federated/python/core/impl/compiler/transformation_utils.py", line 74, in transform_postorder
return transform(comp)
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow_federated/python/core/impl/compiler/compiled_computation_transformations.py", line 128, in transform
return optimize_tensorflow_comp(comp, self._config_proto), True
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow_federated/python/core/impl/compiler/compiled_computation_transformations.py", line 81, in optimize_tensorflow_comp
optimized_graph_spec = graph_optimizations.optimize_graph_spec(
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow_federated/python/tensorflow_libs/graph_optimizations.py", line 38, in optimize_graph_spec
meta_graph_def = graph_spec_obj.to_meta_graph_def()
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow_federated/python/tensorflow_libs/graph_spec.py", line 85, in to_meta_graph_def
out_names_to_tensor_specs = {
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow_federated/python/tensorflow_libs/graph_spec.py", line 86, in <dictcomp>
name: _get_tensor_spec(name) for name in self.out_names
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow_federated/python/tensorflow_libs/graph_spec.py", line 79, in _get_tensor_spec
graph_for_tensor_specs.get_tensor_by_name(name)
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 4188, in get_tensor_by_name
return self.as_graph_element(name, allow_tensor=True, allow_operation=False)
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 4012, in as_graph_element
return self._as_graph_element_locked(obj, allow_tensor, allow_operation)
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 4052, in _as_graph_element_locked
raise KeyError("The name %s refers to a Tensor which does not "
KeyError: "The name 'Identity:0' refers to a Tensor which does not exist. The operation, 'Identity', does not exist in the graph."
Exception ignored in: <function local_cpp_executor_factory.<locals>.ServiceManager.__del__ at 0x7fed5e4dc040>
Traceback (most recent call last):
File "/home/johnny/anaconda3/envs/fl_tf/lib/python3.9/site-packages/tensorflow_federated/python/core/impl/executor_stacks/executor_factory.py", line 150, in __del__
AttributeError: 'NoneType' object has no attribute 'Popen'
@karantai Can you fill in the entire bug template? In particular, please provide things like which TFF version, and which python package versions you're using.
Sure, here it is.
Ubuntu 22.04
GeForce 1070
Python 3.9.16
Package Version
----------------------------- ----------
absl-py 1.0.0
aiohttp 3.8.1
aiosignal 1.3.1
array-record 0.2.0
asttokens 2.2.1
astunparse 1.6.3
async-timeout 4.0.2
attrs 21.4.0
backcall 0.2.0
backports.functools-lru-cache 1.6.4
blinker 1.6.2
brotlipy 0.7.0
cachetools 3.1.1
certifi 2022.12.7
cffi 1.15.0
charset-normalizer 2.1.1
click 8.1.3
cloudpickle 2.2.1
contourpy 1.0.7
cryptography 3.4.8
cycler 0.11.0
debugpy 1.5.1
decorator 5.1.1
dm-tree 0.1.7
dp-accounting 0.3.0
entrypoints 0.4
etils 1.2.0
executing 1.2.0
farmhashpy 0.4.0
flatbuffers 23.3.3
fonttools 4.39.3
frozenlist 1.3.3
gast 0.4.0
google-auth 2.17.3
google-auth-oauthlib 1.0.0
google-pasta 0.2.0
google-vizier 0.1.4
googleapis-common-protos 1.59.0
greenlet 2.0.2
grpcio 1.48.2
grpcio-tools 1.48.2
h5py 3.8.0
idna 3.4
immutabledict 2.2.4
importlab 0.8
importlib-metadata 6.6.0
importlib-resources 5.12.0
ipykernel 6.15.0
ipython 8.12.0
jax 0.3.15
jaxlib 0.3.15
jedi 0.18.2
Jinja2 3.1.2
joblib 1.2.0
jupyter-client 7.0.6
jupyter_core 5.3.0
keras 2.12.0
Keras-Preprocessing 1.1.2
kiwisolver 1.4.4
libclang 16.0.0
libcst 0.4.9
Markdown 3.4.3
MarkupSafe 2.1.2
matplotlib 3.7.1
matplotlib-inline 0.1.6
mpmath 1.2.1
multidict 6.0.2
mypy-extensions 1.0.0
nest-asyncio 1.5.6
networkx 2.8.3
ninja 1.11.1
numpy 1.22.4
oauthlib 3.2.2
opt-einsum 3.3.0
packaging 22.0
pandas 1.5.3
parameterized 0.9.0
parso 0.8.3
patsy 0.5.3
pexpect 4.8.0
pickleshare 0.7.5
Pillow 9.5.0
pip 23.0.1
platformdirs 3.2.0
portpicker 1.5.2
promise 2.3
prompt-toolkit 3.0.38
protobuf 3.20.3
psutil 5.9.5
ptyprocess 0.7.0
pure-eval 0.2.2
pyasn1 0.5.0
pyasn1-modules 0.3.0
pycparser 2.21
pydot 1.4.2
Pygments 2.15.1
PyJWT 2.6.0
pyOpenSSL 20.0.1
pyparsing 3.0.9
PySocks 1.7.1
python-dateutil 2.8.2
pytype 2022.12.15
pytz 2023.3
pyu2f 0.1.5
PyYAML 6.0
pyzmq 19.0.2
requests 2.28.2
requests-oauthlib 1.3.1
rsa 4.9
scikit-learn 1.2.2
scipy 1.7.3
semantic-version 2.10.0
setuptools 66.0.0
six 1.16.0
SQLAlchemy 1.4.0
stack-data 0.6.2
statsmodels 0.13.5
tabulate 0.9.0
tensorboard 2.12.2
tensorboard-data-server 0.7.0
tensorboard-plugin-wit 1.8.1
tensorflow 2.12.0
tensorflow-compression 2.12.0
tensorflow-datasets 4.9.2
tensorflow-estimator 2.12.0
tensorflow-federated 0.56.0
tensorflow-io-gcs-filesystem 0.32.0
tensorflow-metadata 1.13.1
tensorflow-model-optimization 0.7.3
tensorflow-privacy 0.8.8
tensorflow-probability 0.15.0
termcolor 2.2.0
threadpoolctl 3.1.0
toml 0.10.2
tornado 6.1
tqdm 4.65.0
traitlets 5.9.0
typing_extensions 4.4.0
typing-inspect 0.8.0
urllib3 1.26.15
wcwidth 0.2.6
Werkzeug 2.2.3
wheel 0.38.4
wrapt 1.14.1
yarl 1.7.2
zipp 3.15.0
Thanks. There are no obvious dependency/package errors. From the stack trace, it seems like your computation training_process.initialize() may be mixing graph and eager mode. Do you have a snippet that shows how this computation is built? In particular, I would hazard a guess that your model construction involves tensors defined in eager mode (which are therefore not available in graph mode), but this is only a rough guess.
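A quick way to test this guess is to check which execution mode the program is in before the training process is built. The sketch below is only an illustration; the helper name `describe_execution_mode` is hypothetical. In TensorFlow 2.x eager execution is on by default, so "graph" at this point would suggest that something such as `tf.compat.v1.disable_eager_execution()` was called earlier.

```python
import tensorflow as tf


def describe_execution_mode():
    """Return 'eager' or 'graph' for the current top-level context."""
    return "eager" if tf.executing_eagerly() else "graph"


# Call this right before building the federated computation.
mode = describe_execution_mode()
print("Execution mode:", mode)
```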
I had disabled eager mode in TensorFlow because of an earlier error related to eager functions (or something like that). I re-enabled it and added @tf.function where it was needed, and now there are no errors left!
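As a rough sketch of that pattern (not the actual code from this issue), eager execution stays enabled globally and only the functions that need graph execution are wrapped in `@tf.function`. The function `scale_features` and its input are hypothetical placeholders.

```python
import tensorflow as tf

# Records the mode seen while the function body is being traced.
trace_time_modes = []


@tf.function
def scale_features(x):
    # Python code here runs once per trace, inside a graph context.
    trace_time_modes.append(tf.executing_eagerly())
    return (x - tf.reduce_mean(x)) / (tf.math.reduce_std(x) + 1e-8)


# Eager execution is left enabled at the top level.
outside_mode = tf.executing_eagerly()

scaled = scale_features(tf.constant([1.0, 2.0, 3.0]))
print("Eager outside the function:", outside_mode)
print("Eager while tracing:", trace_time_modes)
print("Scaled values:", scaled.numpy())
```

This gives graph execution inside the decorated function without disabling eager mode for the rest of the program.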
Perfect, marking this as resolved.