Revise configuration storage logic
AlexKoff88 opened this issue
Alexander Kozlov commented
Since we provide the ability to serialize a model that came through the Model API, we need to put the whole configuration into the rt_info of the model so that next time it can be loaded and used as is. This applies to both the Python and C++ APIs.
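For illustration, here is a rough sketch of storing a configuration in rt_info and serializing it, assuming the OpenVINO Python API; the `model_info` section name and the configuration keys are illustrative, not a final Model API layout:

```python
import openvino.runtime as ov

core = ov.Core()
model = core.read_model("model.xml")

# Hypothetical wrapper configuration to persist alongside the model
config = {"model_type": "ssd", "confidence_threshold": "0.5"}

# Store every configuration entry under a dedicated rt_info section so the
# model can be re-loaded later and used as is
for key, value in config.items():
    model.set_rt_info(value, ["model_info", key])

# Serialize the model together with its rt_info to IR
ov.serialize(model, "model_with_config.xml")
```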
There is a special case here: pre/post-processing embedding. We can approach it in the following way:
- Align pre/post-processing between the C++ and Python APIs (the most complicated part here)
- Set the corresponding parameter (e.g. `embeded_processing`) in rt_info and check it every time the model is loaded (a sketch of such a check follows this list)
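A minimal sketch of that load-time check, assuming the OpenVINO Python API; the `model_info` section name is an assumption, and the flag name mirrors the `embeded_processing` parameter above:

```python
import openvino.runtime as ov

core = ov.Core()
model = core.read_model("model_with_config.xml")

flag_path = ["model_info", "embeded_processing"]
if model.has_rt_info(flag_path):
    # Pre/post-processing was already embedded at serialization time;
    # the flag is stored as a string, so compare its string value
    embedded = str(model.get_rt_info(flag_path)) == "True"
else:
    # Flag is absent: treat the model as one without embedded processing
    embedded = False
```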
How it should work:
- The user can specify `embeded_processing` in the model configuration, and Model API will check the availability of this parameter in rt_info. If it is absent, the preprocessing is embedded into the model at load time (see the sketch after this list). This applies to local inference with the C++ or Python API.
- For the serving Python API, the user needs to specify the exact configuration that is embedded into the IR being served on the endpoint.
- The serving API in the Python part should work only with models that have embedded processing.
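A sketch of the "embed on load" branch for local Python inference, assuming the OpenVINO `PrePostProcessor` API; the concrete preprocessing steps (u8 input, normalization values) are placeholders, not Model API defaults:

```python
import openvino.runtime as ov
from openvino.preprocess import PrePostProcessor

core = ov.Core()
model = core.read_model("model.xml")

if not model.has_rt_info(["model_info", "embeded_processing"]):
    ppp = PrePostProcessor(model)
    # Accept u8 input and fold normalization into the graph
    ppp.input().tensor().set_element_type(ov.Type.u8)
    ppp.input().preprocess() \
        .convert_element_type(ov.Type.f32) \
        .mean(127.5) \
        .scale(127.5)
    model = ppp.build()
    # Record that processing is now embedded so later loads can skip this
    model.set_rt_info("True", ["model_info", "embeded_processing"])

compiled = core.compile_model(model, "CPU")
```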