mlcommons / inference

Reference implementations of MLPerf™ inference benchmarks

Home Page: https://mlcommons.org/en/groups/inference


Llama2-70b Loadgen Issue in Offline Scenario

Maxusmusti opened this issue

Since the fixes for token measurement in the server scenario were introduced, there is a new issue in the offline scenario:

:::MLLOG {"key": "error_runtime", "value": "n_tokens argument missing or attempted to record 0 as number of tokens", "time_ms": 0.151550, "namespace": "mlperf::logging", "event_type": "POINT_IN_TIME", "metadata": {"is_error": true, "is_warning": false, "file": "logging.cc", "line_no": 442, "pid": 394, "tid": 419}}

Every time we call

                response = [lg.QuerySampleResponse(qitem[i].id, bi[0], bi[1])]
                lg.QuerySamplesComplete(response)

in the offline scenario, we are missing the new n_tokens field.
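
A minimal sketch of a possible fix, assuming the updated loadgen Python bindings accept an optional fourth n_tokens argument on QuerySampleResponse (mirroring the server-path change): count the generated tokens for each sample and pass that count when completing the query. The helper name complete_offline_query and the output_token_ids parameter are hypothetical; how the generated token ids are obtained depends on the SUT implementation.

    import array
    import mlperf_loadgen as lg

    def complete_offline_query(qitem, i, output_token_ids):
        # output_token_ids: generated token ids for this sample
        # (hypothetical; the real source depends on the SUT).
        n_tokens = len(output_token_ids)

        # Pack the token ids into a contiguous byte buffer for loadgen.
        response_array = array.array(
            "B", array.array("q", output_token_ids).tobytes()
        )
        bi = response_array.buffer_info()

        # Pass n_tokens as the (assumed) fourth argument so loadgen
        # records a nonzero token count instead of raising
        # "n_tokens argument missing or attempted to record 0 as
        # number of tokens".
        response = [lg.QuerySampleResponse(qitem[i].id, bi[0], bi[1], n_tokens)]
        lg.QuerySamplesComplete(response)

Note that response_array must stay alive until QuerySamplesComplete returns, which holds here because the call happens inside the same function scope.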