Are there any plans to support streaming of prediction responses?
Legion2 opened this issue
I'm currently trying to set up streaming responses for LLM generation from vLLM, but I receive a "Streaming not yet supported" error from ModelMesh. I think it comes from this code snippet:
It looks like implementing streaming in the SidecarModelMesh class is a non-trivial task. Are there any plans to implement streaming support, or are there any blockers for this?
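For context, here is a minimal sketch of the kind of unary-only guard that could produce an error like this at the gRPC layer. It is purely illustrative, written against the plain grpc-java API; the UnaryOnlyInterceptor name and structure are my own assumptions, not the actual ModelMesh code:

```java
import io.grpc.Metadata;
import io.grpc.MethodDescriptor.MethodType;
import io.grpc.ServerCall;
import io.grpc.ServerCallHandler;
import io.grpc.ServerInterceptor;
import io.grpc.Status;

// Hypothetical interceptor that refuses any non-unary gRPC call,
// mirroring the behaviour I'm seeing (not the actual ModelMesh source).
public class UnaryOnlyInterceptor implements ServerInterceptor {

    @Override
    public <ReqT, RespT> ServerCall.Listener<ReqT> interceptCall(
            ServerCall<ReqT, RespT> call,
            Metadata headers,
            ServerCallHandler<ReqT, RespT> next) {

        // Server-streaming, client-streaming, and bidi calls are all rejected up front
        if (call.getMethodDescriptor().getType() != MethodType.UNARY) {
            call.close(Status.UNIMPLEMENTED
                    .withDescription("Streaming not yet supported"), new Metadata());
            return new ServerCall.Listener<ReqT>() {};
        }
        // Unary calls are forwarded as usual
        return next.startCall(call, headers);
    }
}
```

If that is roughly what happens today, then supporting streaming would presumably mean proxying server-streaming calls end to end through the sidecar rather than closing them up front, which matches my impression that this is non-trivial.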