intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, Axolotl, etc.


Ollama codestral model produces nonsensical output on PVC

tkarna opened this issue · comments

I'm using ollama in langchain. The following code-generation test uses the codestral model and produces a valid response on a standard ollama install running on CPU. If I run with ollama installed via ipex-llm (following the online instructions), I get nonsensical characters in the output.

Brief tests suggest that the length of the input prompt may trigger this. Shorter/simpler queries work fine.
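If prompt length really is the trigger, the threshold can be located without manual trial and error by bisecting over prompt prefixes. A minimal sketch (the `invoke` and `is_garbled` callables here are placeholders for `llm.invoke` and whatever garble check you prefer; they are not part of the original report):

```python
def find_garble_threshold(prompt: str, invoke, is_garbled) -> int:
    """Binary-search the shortest prefix length of `prompt` whose response
    is garbled. Assumes responses are clean below some length and garbled
    at or above it (monotonic), which matches the observed behaviour."""
    lo, hi = 0, len(prompt)  # invariant: prompt[:lo] clean, prompt[:hi] garbled
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_garbled(invoke(prompt[:mid])):
            hi = mid
        else:
            lo = mid
    return hi

# Simulated model for illustration: output corrupts once input exceeds 100 chars.
fake_invoke = lambda p: "<s>" * 30 if len(p) > 100 else "ok"
threshold = find_garble_threshold("x" * 400, fake_invoke, lambda r: "<s>" in r)
# threshold == 101: the shortest prefix length that triggers corruption
```

With the real model, each probe is one `llm.invoke` call, so the search needs only about log2(len(prompt)) requests.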

Tested on a server with Sapphire Rapids CPUs and 2 PVC GPUs. Docker with Ubuntu 22.04.4 LTS, oneAPI 2024.1, Python 3.10.12, langchain 0.2.1.

Valid response on CPU:

```diff
  - ResultData data{
  + ResultData data{
  +     utils::GetPid(),
  +     device_id_,
  +     correlator_.GetTimestamp(),
  +     options_.GetMetricGroup(),
  +     device_props_list_,
  +     kernel_name_list,
  +     std::move(kernel_interval_list)};
```

Invalid response on ipex-llm and PVC:

```diff
-    ResultData data{
+    ResultData data{
        utils<s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s>...
```

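When running this kind of test in a loop, the failure mode above can be flagged automatically by scanning the response for long runs of sentinel tokens. A small heuristic sketch (the `looks_garbled` helper and its threshold are assumptions for illustration, not part of the reproducer):

```python
import re

def looks_garbled(text: str, max_special_run: int = 8) -> bool:
    """Heuristic: flag responses containing long runs of BOS/EOS markers.

    A healthy diff never repeats <s>/</s> tokens, so more than
    `max_special_run` consecutive occurrences is treated as corrupted
    decoder output.
    """
    return re.search(r"(?:</?s>){%d,}" % max_special_run, text) is not None

# The corrupted PVC output trips the check; the valid CPU diff does not.
bad = "utils" + "<s>" * 28
good = "+     utils::GetPid(),"
```

This keeps regression runs unattended: a garbled response fails the check instead of requiring someone to eyeball the diff.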
Reproducer:

reproducer.py
from langchain_community.llms import Ollama

patch_prompt = """You are an experienced programmer. You will be given an issue related to a C++ software project,
some related code blocks for context, and a proposed fix. Your ultimate goal is to write code block that fixes the issue.

To respond you MUST use the following format.
~~~
```diff
[fixed code block in diff format.]
```
~~~

Issue: {input}
"""

issue_text = """
Issue description: COPY_INSTEAD_OF_MOVE
Creating a copy of a variable that is no longer used instead of using std::move().
Context: 
"kernel_interval_list" is copied in call to copy constructor "std::vector >", when it could be moved instead.
Use "std::move"("kernel_interval_list") instead of "kernel_interval_list".
Affected part of source code:
```cpp

    ResultData data{
        utils::GetPid(),
        device_id_,
        correlator_.GetTimestamp(),
        options_.GetMetricGroup(),
        device_props_list_,
        kernel_name_list,
        kernel_interval_list};
```
In:
identifier: Profiler::DumpResultFile
path: pti/tools/oneprof/profiler.h:262
```cpp
  void Profiler::DumpResultFile() {
    std::vector kernel_name_list;
    std::vector kernel_interval_list;

    if (CheckOption(PROF_KERNEL_INTERVALS) ||
        CheckOption(PROF_KERNEL_METRICS) ||
        CheckOption(PROF_AGGREGATION)) {

      if (cl_kernel_collector_ != nullptr) {
        const ClKernelIntervalList& cl_kernel_interval_list =
          cl_kernel_collector_->GetKernelIntervalList();

        std::vector device_list =
          utils::cl::GetDeviceList(CL_DEVICE_TYPE_GPU);
        if (!device_list.empty()) {
          PTI_ASSERT(device_id_ < device_list.size());
          AddKernelIntervals(
              cl_kernel_interval_list,
              device_list[device_id_],
              kernel_name_list,
              kernel_interval_list);
        }
      }

      if (ze_kernel_collector_ != nullptr) {
        const ZeKernelIntervalList& ze_kernel_interval_list =
          ze_kernel_collector_->GetKernelIntervalList();

        std::vector device_list =
          utils::ze::GetDeviceList();
        if (!device_list.empty()) {
          PTI_ASSERT(device_id_ < device_list.size());
          AddKernelIntervals(
              ze_kernel_interval_list,
              device_list[device_id_],
              kernel_name_list,
              kernel_interval_list);
        }
      }
    }

    if (CheckOption(PROF_KERNEL_QUERY)) {
      PTI_ASSERT(metric_query_collector_ != nullptr);
      kernel_name_list = metric_query_collector_->GetKernels();
    }

    ResultStorage* storage = ResultStorage::Create(
        options_.GetRawDataPath(), utils::GetPid());
        PTI_ASSERT(storage != nullptr);

    ResultData data{
        utils::GetPid(),
        device_id_,
        correlator_.GetTimestamp(),
        options_.GetMetricGroup(),
        device_props_list_,
        kernel_name_list,
        kernel_interval_list};

    storage->Dump(&data);

    delete storage;
  }
```
"""

llm = Ollama(
    model="codestral",
    temperature=0.1,
    top_k=10,
    top_p=0.5,
    repeat_penalty=1.03,
    num_thread=28,
)

full_prompt = patch_prompt.format(input=issue_text)
response = llm.invoke(full_prompt)
print(response)

Hi @tkarna ,
I have reproduced this error, and we are trying to figure out the root cause and fix it.
Once it is done, we will update here to let you know.

Hi @tkarna ,
We have fixed this issue; you can try again with `pip install ipex-llm[cpp]==2.1.0b20240603` (which will be released tonight).

Thank you! I confirm that with the latest version `ipex-llm[cpp]==2.1.0b20240603` the example works correctly.