SkalskiP / make-sense-inference

Template for a https://makesense.ai inference server. Use your pre-trained model to automate the annotation process.

base inference server

SkalskiP opened this issue

overview

The goal of this repository is to give makesense.ai users even more ways to support manual annotation with pre-trained models. So far we've been using tensorflow.js for this purpose, but recently an issue was opened - SkalskiP/make-sense#293 - in which users asked for support for inference over HTTP. I've been thinking about this for a while now, so we're doing it!

scope

  • To start, the server should support only one architecture - YOLOv5 or YOLOv7. But when writing the service, please keep in mind that we will probably expand this in the future.
  • To start, the server should support only one CV task - object detection. But when writing the service, please keep in mind that we will probably expand this in the future.
  • Communication must be over HTTP.
  • Authentication is not required - at most a simple token that is randomly generated by the server at startup, printed in the console, and then included in every HTTP request.
  • Request and response format is to be determined - I do not have a preference at this time. One possible shape is sketched after this list.
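
Purely for illustration (nothing below is a decided format - the endpoint path, token header, and response fields are hypothetical placeholders), a request/response could look like this:

import requests

# Hypothetical endpoint, token header, and response fields - illustrative only.
with open("image.jpg", "rb") as f:
    response = requests.post(
        "http://localhost:8080/predictions/yolo",
        headers={"Authorization": "Bearer <token printed at startup>"},
        files={"data": f},
    )

# Expected response: a list of detections, for example:
# [{"x_min": 10, "y_min": 20, "x_max": 110, "y_max": 220, "class_name": "person", "confidence": 0.92}]
print(response.json())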

scope (nice to have)

  • It would be nice if everything ran in Docker.
  • It would be nice if the server could be configured through a YML file - for example, to define the location of the weight files. A small sketch of that idea follows this list.
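
As a sketch of that idea (the file name and keys below are made up, not a decided schema), the server could load its settings like this:

import yaml  # pyyaml

# Hypothetical config.yml:
#
# model:
#   weights: /models/yolov5s.pt
#   confidence_threshold: 0.25
# device: cpu

with open("config.yml") as f:
    config = yaml.safe_load(f)

weights_path = config["model"]["weights"]
confidence = config["model"].get("confidence_threshold", 0.25)
device = config.get("device", "cpu")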

@SkalskiP I checked the implemented code. In general, I don't think using TorchServe is a good idea. For example, TorchServe could be replaced by a REST API server (Flask / FastAPI / Tornado / etc.) with model serving via ONNX / OpenCV DNN / TensorFlow / etc., using OpenCV for image handling. That way the user is not bound to TorchServe only. If I am missing some specific advantage, let me know. I can create such a small server within a day - I have already built something like that.
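
As a minimal sketch of that idea (FastAPI is just one option, and run_model is a placeholder for whatever backend gets plugged in - ONNX Runtime, OpenCV DNN, TensorFlow, ...), such an endpoint could be as small as:

from typing import List

import cv2
import numpy as np
from fastapi import FastAPI, File, UploadFile

app = FastAPI()

@app.post("/predict")
async def predict(image: UploadFile = File(...)) -> List[dict]:
    # Decode the uploaded image bytes into an OpenCV BGR array.
    data = np.frombuffer(await image.read(), dtype=np.uint8)
    frame = cv2.imdecode(data, cv2.IMREAD_COLOR)
    # run_model() is a placeholder for whatever serving backend is plugged in.
    return run_model(frame)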

@hardikdava - basically it is done. Obviously, you can implement it in another way, but this method supports ONNX, TRT, TF, and whatever you want - TorchServe is "Torch"-oriented only by the name. We now have YOLOv5 and YOLOv7, deploying TorchHub models is trivial (see the sketch below), and other models should also work.
Obviously, that is only an example server - everyone can provide their own implementation, only the interface matters.
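
For example (a rough sketch - yolov5s is just the smallest public checkpoint), pulling a YOLOv5 model from TorchHub boils down to:

import numpy as np
import torch

# Downloads the yolov5s weights from TorchHub on first use.
model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)
model.eval()

image = np.zeros((640, 480, 3), dtype=np.uint8)  # stand-in for a real image
with torch.no_grad():
    results = model(image)  # YOLOv5 accepts numpy arrays, PIL images, or file paths
print(results.pandas().xyxy[0])  # detections as a pandas DataFrame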

It is better now to focus on the integration from the labeling app side.

Hi @PawelPeczek and @hardikdava 👋!

I apologize for engaging so little here so far. The new job is weighing me down a bit. I promise to improve. :)

Guys, remember that this server is only an example that we will use for development and as a guideline for others building an API compatible with make-sense. Others may use it, but they don't have to :)

@PawelPeczek could you describe in simple words how non-torch models can be deployed?

As per the details described in the README - a custom dependency can be installed in the environment and used inside the model handler. First, the model needs to be loaded (it will yield some object of an arbitrary type - the torch.device object should only be used there to decide which device to run on). Then, at inference time, a properly constructed handler function is given a reference to the model and the input image - this should be enough to infer. If I were not so lazy I could add an ONNX model, for instance 😂 (a rough sketch follows the signatures below).

As per readme:

ModelObject = Any  # this can be anything, even tensorflow if someone wanted
InferenceFunction = Callable[
    [ModelObject, List[np.ndarray], torch.device], List[InferenceResult]
]

and you are supposed to define two functions in your module - the first is:

def load_model(
    context: Context, device: torch.device
) -> Tuple[ModelObject, InferenceFunction]:

and the second is the InferenceFunction, matching the signature above.
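
If someone did want that ONNX example, a rough sketch could look like the following - assuming onnxruntime as the backend; the model path, input layout, and pre-processing are placeholders, and mapping the raw outputs to InferenceResult (a type defined in the repository) is left out:

from typing import Any, List

import numpy as np
import onnxruntime as ort
import torch


def infer(session, images: List[np.ndarray], device: torch.device) -> List[Any]:
    results = []
    for image in images:
        # Pre-processing is model specific; assume a float32 NCHW input here.
        blob = image.astype(np.float32).transpose(2, 0, 1)[None]
        outputs = session.run(None, {session.get_inputs()[0].name: blob})
        results.append(outputs)  # a real handler would map these to InferenceResult
    return results


def load_model(context, device: torch.device):
    # torch.device is only used to pick an ONNX Runtime execution provider.
    providers = ["CUDAExecutionProvider"] if device.type == "cuda" else ["CPUExecutionProvider"]
    session = ort.InferenceSession("/path/to/model.onnx", providers=providers)  # illustrative path
    return session, infer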