tensorflow / tfx-addons

Developers helping developers. TFX-Addons is a collection of community projects to build new components, examples, libraries, and tools for TFX. The projects are organized under the auspices of the special interest group, SIG TFX-Addons. Join the group at http://goo.gle/tfx-addons-group

A load test of the exported model

michalbrys opened this issue · comments

It would be helpful to have a component (dedicated, or an extension to the InfraValidator TFX Pipeline Component) that performs a load test of the exported model.

The expected behavior is to load the exported model, create a TensorFlow Serving endpoint, send requests to the prediction endpoint, and measure the response time. The load test could be performed with common open-source tools such as Locust or Vegeta for HTTP, or ghz for gRPC.
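For illustration, here is a minimal sketch of what the Locust-based variant could look like against TensorFlow Serving's REST predict API. The model name `my_model` and the input payload are placeholders and would have to match the actual exported model's signature:

```python
# Minimal Locust load-test sketch against a TensorFlow Serving REST endpoint.
# Assumes TF Serving is running with a model named "my_model"; the payload
# below is a placeholder and must match the model's serving signature.
from locust import HttpUser, task, between


class ServingUser(HttpUser):
    # Each simulated user waits 0.1-0.5 s between requests.
    wait_time = between(0.1, 0.5)

    @task
    def predict(self):
        # TF Serving's REST predict API: POST /v1/models/<name>:predict
        self.client.post(
            "/v1/models/my_model:predict",
            json={"instances": [[1.0, 2.0, 3.0, 4.0]]},
        )
```

This could be run headless against a local Serving instance with something like `locust -f loadtest.py --host http://localhost:8501 --headless -u 50 -r 10 -t 1m`; Locust's built-in stats then give request rate and response-time percentiles, which is roughly the measurement the component would need to capture.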

The motivation is that prediction time can vary depending on the model's type and structure. With this component, we could check early whether the model will meet the business requirements for prediction latency.

Larry and Hannes both confirmed that they have a requirement for low-latency serving that would benefit from this. Open questions remain about pushing and how to structure the pipeline: the test Serving instance would not produce a blessing, but a blessing would be required for the production Push. Exactly reproducing the production environment is difficult. InfraValidator does something similar and may be using TF Serving. Similar to Evaluator, we could perhaps compare two models to assess the performance difference on the current test infra, as sketched below.
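As a rough illustration of that Evaluator-like comparison (not part of TFX; the endpoint URLs and payload are hypothetical), the component could measure per-request latency against a baseline and a candidate model and compare percentiles:

```python
# Rough sketch: compare latency of two serving endpoints, analogous to how
# Evaluator compares model quality. URLs and payload are hypothetical.
import statistics
import time

import requests

PAYLOAD = {"instances": [[1.0, 2.0, 3.0, 4.0]]}


def measure_latencies(url: str, n_requests: int = 200) -> list[float]:
    """Send n_requests predictions and return per-request latency in ms."""
    latencies = []
    for _ in range(n_requests):
        start = time.perf_counter()
        requests.post(url, json=PAYLOAD, timeout=5).raise_for_status()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def p95(samples: list[float]) -> float:
    # 95th percentile of the observed latencies.
    return statistics.quantiles(samples, n=100)[94]


baseline = measure_latencies("http://localhost:8501/v1/models/baseline:predict")
candidate = measure_latencies("http://localhost:8502/v1/models/candidate:predict")
print(f"baseline  p95: {p95(baseline):.1f} ms")
print(f"candidate p95: {p95(candidate):.1f} ms")
# A pipeline could bless the candidate only if its p95 stays within budget.
```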

Is this currently being worked on?