tensorflow / serving

A flexible, high-performance serving system for machine learning models

Home Page: https://www.tensorflow.org/serving

Include tensorflow-text 2.9.0 to support Ops for FastBertNormalize

dwyatte opened this issue

Bug Report

Marking as a bug since, from what I can tell, including support for tensorflow-text is the intention

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): 5.10.124-linuxkit
  • TensorFlow Serving installed from (source or binary): Binary/Docker
  • TensorFlow Serving version: 2.10

Describe the problem

When FastBertTokenizer and its submodules are included in a SavedModel, TensorFlow Serving does not recognize them and fails with the following error:

2022-10-04 19:09:35.657342: E external/org_tensorflow/tensorflow/core/grappler/optimizers/tfg_optimizer_hook.cc:134] tfg_optimizer{tfg-consolidate-attrs,tfg-toposort,tfg-shape-inference{graph-version=0},tfg-prepare-attrs-export} failed: INVALID_ARGUMENT: Unable to find OpDef for FastBertNormalize
	While importing function: __inference_call_182243
	when importing GraphDef to MLIR module in GrapplerHook
...
2022-10-04 19:10:22.730427: E external/org_tensorflow/tensorflow/core/grappler/optimizers/meta_optimizer.cc:954] function_optimizer failed: NOT_FOUND: Op type not registered 'FastBertNormalize' in binary running on 2d4c8711fa01. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-10-04 19:10:22.933758: E external/org_tensorflow/tensorflow/core/grappler/optimizers/tfg_optimizer_hook.cc:134] tfg_optimizer{tfg-consolidate-attrs,tfg-toposort,tfg-shape-inference{graph-version=0},tfg-prepare-attrs-export} failed: INVALID_ARGUMENT: Unable to find OpDef for FastBertNormalize
	While importing function: __inference_call_182243
	when importing GraphDef to MLIR module in GrapplerHook

Exact Steps to Reproduce

import tensorflow as tf
import tensorflow_text as text


class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.tokenizer = text.FastBertTokenizer(vocab=["a", "b", "c", "[UNK]"])

    @tf.function(input_signature=[tf.TensorSpec([None], dtype=tf.string, name="text")])
    def call(self, inputs):
        return self.tokenizer.tokenize(inputs).to_tensor()


model = MyModel()
model(["hello world"])
model.save("tokenizer/0")

tensorflow_model_server --rest_api_port=8501 --model_base_path=$PWD/tokenizer
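
For completeness, here is a rough sketch of how the served model could be queried over the REST API once the server is up. The model name in the URL (tokenizer) is an assumption and has to match whatever name the model server is actually serving under (e.g. via --model_name):

import requests

# Hypothetical client: POST a predict request to TF Serving's REST endpoint.
# Adjust host, port, and model name to match the running server.
response = requests.post(
    "http://localhost:8501/v1/models/tokenizer:predict",
    json={"instances": ["hello world"]},
)
print(response.json())  # token ids appear under "predictions" once the ops are registered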

It looks like there was a commit to include tensorflow-text 2.9.0 (where these ops were introduced) that got reverted: a73f292

@broken, any additional context on whether support for these ops is planned? I can mark this as a feature request if that's the case.

IIRC, the update was rolled back because the ops used C++17 features, but the model server was being compiled with an earlier standard. I need to look back at what code actually broke it and either update our patch file or see if tf.serving will upgrade its C++ version. Nobody was actively asking for the upgrade, so I have been busy with other priorities people were asking for, which is why there hasn't been progress here.

I doubt I'll be able to get to it this week, but I'll try to find time next week since it is blocking you.

+1. Is there any workaround for that?

@dwyatte,

Please let us know if the workaround provided in the above commit works for you. The workaround introduces a flag that lets you use tensorflow_text and BertTokenizer rather than FastBertTokenizer.

Thank you!
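
For reference, a rough sketch of what the BertTokenizer-based workaround could look like on the model side, so that only ops already registered in older TF Serving binaries are used. The vocab file path and the merge_dims reshaping are illustrative assumptions, not taken from the commit referenced above:

import tensorflow as tf
import tensorflow_text as text

# BertTokenizer takes a vocab lookup table or a path to a vocab file,
# so write the toy vocab to disk first (illustrative only).
vocab = ["a", "b", "c", "[UNK]"]
with open("vocab.txt", "w") as f:
    f.write("\n".join(vocab))


class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        # BertTokenizer uses lookup-table based ops instead of FastBertNormalize.
        self.tokenizer = text.BertTokenizer("vocab.txt")

    @tf.function(input_signature=[tf.TensorSpec([None], dtype=tf.string, name="text")])
    def call(self, inputs):
        # BertTokenizer returns a [batch, words, wordpieces] RaggedTensor; merge the
        # last two dims to roughly match FastBertTokenizer's output shape.
        return self.tokenizer.tokenize(inputs).merge_dims(-2, -1).to_tensor()


model = MyModel()
model(["hello world"])
model.save("tokenizer/0")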

@singhniraj08 I wasn't using transformers for the use case that spawned this issue, although it's good to see they are adding the same functionality to make models and tokenizers servable end-to-end in TF Serving 🚀

I do think we can change this to a feature request since it's a known limitation -- should I edit the original issue to reflect that or do you just want to change the label?

@singhniraj08 The problem is that it is compiling with C++0x and not C++17, despite what is set in the .bazelrc file. Do you know somebody more familiar with tf.serving builds who can look at this?

@SHvsMK, Could you please look into this feature request.

A couple of things I noticed. First, you have the flag -D_GLIBCXX_USE_CXX11_ABI=0 set, with a comment that this is to bring parity with TF. TF flipped this recently, so I would recommend removing it from .bazelrc as well.

Another thing in the .bazelrc file I noticed was:

# TensorFlow Decision Forests does not use Absl StatusOr.
# Reason: The version of Absl included by TensorFlow lacks support for StatusOr.
build --define no_absl_statusor=1

...but I don't believe this to be true. TF Text uses TF's version of absl and compiles fine using StatusOr. In fact, if you search the TF codebase, you will notice uses of it there as well. Is there another reason for this? Could it be the same compilation problem of not using C++17?
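
For what it's worth, the kind of .bazelrc change being discussed would presumably be along these lines (a sketch only; the exact flags TF Serving's build needs may differ):

# Hypothetical: force C++17 for both target and host compilation
build --cxxopt=-std=c++17
build --host_cxxopt=-std=c++17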

@dwyatte,

TF Text is updated to v2.9.0 in the latest TF Serving release, 2.11.0. Ref: commit

Please try the latest TF Serving release 2.11.0 and let us know if your issue has been resolved. Thank you!

It looks like the changes didn't make the TF Serving 2.11.0 release on Docker Hub, but they are in the nightly image. Confirmed the example in the issue is working there.
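
In case it helps anyone else, the nightly can be exercised with the standard TF Serving Docker flow; the model name and mount path below are assumptions that match the repro above:

# Pull the nightly image and serve the SavedModel exported by the repro
docker pull tensorflow/serving:nightly
docker run --rm -p 8501:8501 \
  -v "$PWD/tokenizer:/models/tokenizer" \
  -e MODEL_NAME=tokenizer \
  tensorflow/serving:nightly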

Thanks @broken and @singhniraj08 for the continued support for TF Text in TF Serving, very handy for production NLP!