Revise configuration storage logic
AlexKoff88 opened this issue
Alexander Kozlov commented
Since we provide the ability to serialize a model that came through the Model API, we need to put the whole configuration into the rt_info of the model so that next time it can be loaded and used as is. This applies to both the Python and C++ APIs.
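For illustration, here is a rough sketch of storing a configuration in rt_info and serializing it, assuming the OpenVINO Python API; the `model_info` section name and the configuration keys are illustrative, not a final Model API layout:

```python
import openvino.runtime as ov

core = ov.Core()
model = core.read_model("model.xml")

# Hypothetical wrapper configuration to persist alongside the model
config = {"model_type": "ssd", "confidence_threshold": "0.5"}

# Store every configuration entry under a dedicated rt_info section so the
# model can be re-loaded later and used as is
for key, value in config.items():
    model.set_rt_info(value, ["model_info", key])

# Serialize the model together with its rt_info to IR
ov.serialize(model, "model_with_config.xml")
```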
There is a special case here: pre/post-processing embedding. We can approach it in the following way:
- Align pre/post-processing between the C++ and Python APIs (the most complicated part here)
- Set the corresponding parameter (e.g. `embeded_processing`) in rt_info and check it every time the model is loaded (a sketch of such a check follows this list)
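A minimal sketch of that load-time check, assuming the OpenVINO Python API; the `model_info` section name is an assumption, and the flag name mirrors the `embeded_processing` parameter above:

```python
import openvino.runtime as ov

core = ov.Core()
model = core.read_model("model_with_config.xml")

flag_path = ["model_info", "embeded_processing"]
if model.has_rt_info(flag_path):
    # Pre/post-processing was already embedded at serialization time;
    # the flag is stored as a string, so compare its string value
    embedded = str(model.get_rt_info(flag_path)) == "True"
else:
    # Flag is absent: treat the model as one without embedded processing
    embedded = False
```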
How it should work:
- The user can specify `embeded_processing` in the model configuration, and Model API will check the availability of this parameter in rt_info. If it is absent, the preprocessing is embedded into the model at load time (see the sketch after this list). This applies to local inference with the C++ or Python API.
- For the serving Python API, the user needs to specify the exact configuration that is embedded into the IR being served on the endpoint.
- The serving API in the Python part should work only with models that have embedded processing.
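A sketch of the "embed on load" branch for local Python inference, assuming the OpenVINO `PrePostProcessor` API; the concrete preprocessing steps (u8 input, normalization values) are placeholders, not Model API defaults:

```python
import openvino.runtime as ov
from openvino.preprocess import PrePostProcessor

core = ov.Core()
model = core.read_model("model.xml")

if not model.has_rt_info(["model_info", "embeded_processing"]):
    ppp = PrePostProcessor(model)
    # Accept u8 input and fold normalization into the graph
    ppp.input().tensor().set_element_type(ov.Type.u8)
    ppp.input().preprocess() \
        .convert_element_type(ov.Type.f32) \
        .mean(127.5) \
        .scale(127.5)
    model = ppp.build()
    # Record that processing is now embedded so later loads can skip this
    model.set_rt_info("True", ["model_info", "embeded_processing"])

compiled = core.compile_model(model, "CPU")
```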