octoml / mlc-llm

Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.

Home Page: https://mlc.ai/mlc-llm


[Tracking] Sampler optimization

masahi opened this issue · comments

Let's collect the remaining issues we are aware of related to sampler performance:

  • Small regression (a drop of ~1 req/sec in benchmark_throughput.py) after #192 when only greedy sampling is used.
  • Logprobs and JSON mode are extremely slow.
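To illustrate the first item: a common reason a greedy-only workload regresses is that greedy requests get routed through the full sampling pipeline instead of a plain argmax. Below is a minimal, hypothetical sketch (not the mlc-llm implementation) of a batched sampler with a greedy fast path: rows with temperature 0 take argmax directly, and only the remaining rows pay for softmax and a multinomial draw.

```python
import numpy as np

def sample_batch(logits: np.ndarray, temperatures: np.ndarray,
                 rng: np.random.Generator) -> np.ndarray:
    """Sample one token per request from a batch of logits.

    Requests with temperature == 0 take a cheap argmax fast path;
    only the remaining rows pay for softmax + multinomial sampling.
    """
    tokens = np.empty(logits.shape[0], dtype=np.int64)
    greedy = temperatures == 0.0
    # Fast path: pure argmax, no softmax or random draw needed.
    tokens[greedy] = logits[greedy].argmax(axis=-1)
    # Slow path: temperature-scaled softmax, then an inverse-CDF draw.
    rest = ~greedy
    if rest.any():
        scaled = logits[rest] / temperatures[rest][:, None]
        scaled -= scaled.max(axis=-1, keepdims=True)  # numerical stability
        probs = np.exp(scaled)
        probs /= probs.sum(axis=-1, keepdims=True)
        cdf = probs.cumsum(axis=-1)
        draws = rng.random((cdf.shape[0], 1))
        # Clamp guards against float round-off in the last CDF entry.
        tokens[rest] = np.minimum((cdf < draws).sum(axis=-1),
                                  logits.shape[-1] - 1)
    return tokens
```

If every request in the batch is greedy, the whole slow path is skipped, so a regression in that case points at overhead added before this dispatch (e.g. building per-request sampling state unconditionally).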

The first issue seems to have been fixed by @vvchernov in #215.

Hello @masahi! No, my fix in #215 resolved a very strong (more than one order of magnitude) slowdown introduced after #214.
About task 1: we observed a ~25-30% reduction after #192. It has not been resolved yet; I'm investigating the issue.
About task 2: I remember about logprobs, but it looks like resolving task 1 requires a sampler refactor, and I want to do that first (or somebody else will).
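On the logprobs point: a naive per-token implementation sorts the whole vocabulary at every step to report the top-k log-probabilities, which is expensive at typical vocabulary sizes. A minimal sketch (my own illustration, not the mlc-llm code) of the cheaper approach, using a partial selection instead of a full sort:

```python
import numpy as np

def topk_logprobs(logits: np.ndarray, k: int):
    """Return the indices and log-probabilities of the k most likely tokens.

    np.argpartition selects the k largest entries in O(V) time, versus
    O(V log V) for fully sorting the vocabulary at every decode step.
    """
    # Numerically stable log-softmax over the vocabulary.
    shifted = logits - logits.max()
    logprobs = shifted - np.log(np.exp(shifted).sum())
    idx = np.argpartition(logprobs, -k)[-k:]    # k best entries, unordered
    idx = idx[np.argsort(logprobs[idx])[::-1]]  # order just those k
    return idx, logprobs[idx]
```

Only requests that actually asked for logprobs should pay this cost; batches without logprobs can skip the log-softmax entirely.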