mystic-ai / pipeline

Pipeline is an open-source Python SDK for building AI/ML workflows.

Home Page: https://www.mystic.ai


Optimized runtime inference

jerrymatjila opened this issue

I'm looking for advice. In your experience, which engine provides better-optimized runtime inference: vLLM, TensorRT-LLM, or any other engine you have encountered for running on NVIDIA GPUs?
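For context, here is a minimal sketch of how one might measure raw generation throughput with vLLM's public Python API when comparing engines; the model name, prompt set, and sampling settings are illustrative assumptions, not taken from this issue:

```python
# Rough throughput measurement with vLLM (illustrative sketch).
# Assumptions: model name, prompts, and sampling settings are placeholders.
import time

from vllm import LLM, SamplingParams

prompts = ["Explain KV-cache paging in one sentence."] * 64  # illustrative batch
sampling_params = SamplingParams(temperature=0.0, max_tokens=128)

llm = LLM(model="facebook/opt-125m")  # swap in the model you actually serve

start = time.perf_counter()
outputs = llm.generate(prompts, sampling_params)
elapsed = time.perf_counter() - start

generated_tokens = sum(len(o.outputs[0].token_ids) for o in outputs)
print(f"{generated_tokens / elapsed:.1f} generated tokens/s over {elapsed:.1f}s")
```

Running the same prompt set and generation length through each candidate engine on the same GPU gives a like-for-like tokens-per-second comparison, which is usually more informative than quoted benchmark numbers.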

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.