tensorflow / serving

A flexible, high-performance serving system for machine learning models

Home Page: https://www.tensorflow.org/serving

Include tensorflow-text 2.9.0 to support Ops for FastBertNormalize

dwyatte opened this issue

Bug Report

Marking as a bug since, from what I can tell, including support for tensorflow-text is the intention

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): 5.10.124-linuxkit
  • TensorFlow Serving installed from (source or binary): Binary/Docker
  • TensorFlow Serving version: 2.10

Describe the problem

When FastBertTokenizer and its submodules are included in a SavedModel, TensorFlow Serving does not recognize them and fails with the following error:

2022-10-04 19:09:35.657342: E external/org_tensorflow/tensorflow/core/grappler/optimizers/tfg_optimizer_hook.cc:134] tfg_optimizer{tfg-consolidate-attrs,tfg-toposort,tfg-shape-inference{graph-version=0},tfg-prepare-attrs-export} failed: INVALID_ARGUMENT: Unable to find OpDef for FastBertNormalize
	While importing function: __inference_call_182243
	when importing GraphDef to MLIR module in GrapplerHook
...
2022-10-04 19:10:22.730427: E external/org_tensorflow/tensorflow/core/grappler/optimizers/meta_optimizer.cc:954] function_optimizer failed: NOT_FOUND: Op type not registered 'FastBertNormalize' in binary running on 2d4c8711fa01. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-10-04 19:10:22.933758: E external/org_tensorflow/tensorflow/core/grappler/optimizers/tfg_optimizer_hook.cc:134] tfg_optimizer{tfg-consolidate-attrs,tfg-toposort,tfg-shape-inference{graph-version=0},tfg-prepare-attrs-export} failed: INVALID_ARGUMENT: Unable to find OpDef for FastBertNormalize
	While importing function: __inference_call_182243
	when importing GraphDef to MLIR module in GrapplerHook

Exact Steps to Reproduce

import tensorflow as tf
import tensorflow_text as text


class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.tokenizer = text.FastBertTokenizer(vocab=["a", "b", "c", "[UNK]"])

    @tf.function(input_signature=[tf.TensorSpec([None], dtype=tf.string, name="text")])
    def call(self, inputs):
        return self.tokenizer.tokenize(inputs).to_tensor()


model = MyModel()
model(["hello world"])
model.save("tokenizer/0")

tensorflow_model_server --rest_api_port=8501 --model_base_path=$PWD/tokenizer
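
For completeness, here is a rough sketch of how the served model could be queried over the REST API once the server is up. The model name in the URL (tokenizer) is an assumption and has to match whatever name the model server is actually serving under (e.g. via --model_name):

import requests

# Hypothetical client: POST a predict request to TF Serving's REST endpoint.
# Adjust host, port, and model name to match the running server.
response = requests.post(
    "http://localhost:8501/v1/models/tokenizer:predict",
    json={"instances": ["hello world"]},
)
print(response.json())  # token ids appear under "predictions" once the ops are registered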

It looks like there was a commit to include tensorflow-text 2.9.0 (where these ops were introduced) that got reverted: a73f292

@broken, any additional context on whether support for these ops is planned? I can mark this as a feature request if that's the case.

IIRC, the update was rolled back because the ops used C++17 features, but the model server was being compiled with an earlier standard. I need to look back at what code actually broke it and either update our patch file or see if tf.serving will upgrade its C++ version. Nobody was actively asking for the upgrade, so I have been busy with other priorities people were asking for, which is why there hasn't been progress here.

I doubt I'll be able to get to it this week, but I'll try to find time next week since it is blocking you.

+1. Is there any workaround for that?

@dwyatte,

Please let us know if the workaround provided in the above commit works for you. The workaround introduces a flag that lets you use tensorflow_text and BertTokenizer rather than FastBertTokenizer.

Thank you!
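
For reference, a rough sketch of what the BertTokenizer-based workaround could look like on the model side, so that only ops already registered in older TF Serving binaries are used. The vocab file path and the merge_dims reshaping are illustrative assumptions, not taken from the commit referenced above:

import tensorflow as tf
import tensorflow_text as text

# BertTokenizer takes a vocab lookup table or a path to a vocab file,
# so write the toy vocab to disk first (illustrative only).
vocab = ["a", "b", "c", "[UNK]"]
with open("vocab.txt", "w") as f:
    f.write("\n".join(vocab))


class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        # BertTokenizer uses lookup-table based ops instead of FastBertNormalize.
        self.tokenizer = text.BertTokenizer("vocab.txt")

    @tf.function(input_signature=[tf.TensorSpec([None], dtype=tf.string, name="text")])
    def call(self, inputs):
        # BertTokenizer returns a [batch, words, wordpieces] RaggedTensor; merge the
        # last two dims to roughly match FastBertTokenizer's output shape.
        return self.tokenizer.tokenize(inputs).merge_dims(-2, -1).to_tensor()


model = MyModel()
model(["hello world"])
model.save("tokenizer/0")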

@singhniraj08 I wasn't using transformers for the use case that spawned this issue, although it's good to see they are adding the same functionality to make models and tokenizers servable end-to-end in TF Serving 🚀

I do think we can change this to a feature request since it's a known limitation -- should I edit the original issue to reflect that or do you just want to change the label?

@singhniraj08 The problem is that it is compiling with C++0x and not C++17, despite what is set in the .bazelrc file. Do you know somebody more familiar with tf.serving builds who can look at this?

@SHvsMK, Could you please look into this feature request.

A couple of things I noticed. First, you have the flag -D_GLIBCXX_USE_CXX11_ABI=0 set, with a comment that this is to bring parity with TF. TF flipped this recently, so I would recommend removing it from .bazelrc as well.

Another thing in the .bazelrc file I noticed was:

# TensorFlow Decision Forests does not use Absl StatusOr.
# Reason: The version of Absl included by TensorFlow lacks support for StatusOr.
build --define no_absl_statusor=1

...but I don't believe this to be true. TF Text uses TF's version of absl and compiles fine using StatusOr. In fact, if you search the TF codebase, you will notice uses of it there as well. Is there another reason for this? Could it be the same compilation problem of not using C++17?
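
For what it's worth, the kind of .bazelrc change being discussed would presumably be along these lines (a sketch only; the exact flags TF Serving's build needs may differ):

# Hypothetical: force C++17 for both target and host compilation
build --cxxopt=-std=c++17
build --host_cxxopt=-std=c++17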

@dwyatte,

TF Text is updated to v2.9.0 in the latest TF Serving release, 2.11.0. Ref: commit

Please try the latest TF Serving release 2.11.0 and let us know if your issue has been resolved. Thank you!

It looks like the changes didn't make the TF Serving 2.11.0 release on Docker Hub, but they are in the nightly image. Confirmed the example in the issue is working there.
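
In case it helps anyone else, the nightly can be exercised with the standard TF Serving Docker flow; the model name and mount path below are assumptions that match the repro above:

# Pull the nightly image and serve the SavedModel exported by the repro
docker pull tensorflow/serving:nightly
docker run --rm -p 8501:8501 \
  -v "$PWD/tokenizer:/models/tokenizer" \
  -e MODEL_NAME=tokenizer \
  tensorflow/serving:nightly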

Thanks @broken and @singhniraj08 for the continued support for TF Text in TF Serving, very handy for production NLP!