Ollama codestral model produces nonsensical output on PVC
tkarna opened this issue · comments
I'm using ollama in langchain. The following code generation test uses the codestral model and produces a valid response on a standard ollama install running on CPU. If I run with ollama installed with ipex-llm (following the online instructions), I get nonsensical characters in the output.
Brief tests suggest that the length of the input prompt may trigger this. Shorter/simpler queries work fine.
Tested on a server with Sapphire Rapids CPUs and 2 PVCs. Docker with Ubuntu 22.04.4 LTS, OneAPI 2024.1, Python 3.10.12, langchain 0.2.1.
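One way to narrow down the prompt-length trigger would be to bisect on prompt size and flag responses that contain long runs of the `<s>` token, as in the invalid output in this report. The sketch below is illustrative only: `looks_degenerate`, `shortest_failing_prefix`, and the `generate` callable are hypothetical names, and the bisection assumes that once a prefix of some length fails, every longer prefix fails too, which has not been verified here.

```python
import re
from typing import Callable, Optional

# Matches four or more consecutive "<s>" tokens, as seen in the bad output.
_REPEATED_BOS = re.compile(r"(?:<s>){4,}")


def looks_degenerate(response: str) -> bool:
    """Return True if the response contains a long run of "<s>" tokens."""
    return _REPEATED_BOS.search(response) is not None


def shortest_failing_prefix(
    prompt: str, generate: Callable[[str], str]
) -> Optional[int]:
    """Binary-search the shortest prompt prefix whose output looks degenerate.

    Assumes (unverified) that failures are monotonic in prefix length.
    Returns None if even the full prompt produces clean output.
    """
    if not looks_degenerate(generate(prompt)):
        return None
    lo, hi = 1, len(prompt)  # invariant: the prefix of length hi fails
    while lo < hi:
        mid = (lo + hi) // 2
        if looks_degenerate(generate(prompt[:mid])):
            hi = mid
        else:
            lo = mid + 1
    return lo
```

With the reproducer's objects this could be called as `shortest_failing_prefix(full_prompt, llm.invoke)`. Sampling is not fully deterministic at `temperature=0.1`, so the returned length should be treated as approximate.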
Valid response on CPU:
```diff
- ResultData data{
+ ResultData data{
+ utils::GetPid(),
+ device_id_,
+ correlator_.GetTimestamp(),
+ options_.GetMetricGroup(),
+ device_props_list_,
+ kernel_name_list,
+ std::move(kernel_interval_list)};
```
Invalid response on ipex-llm and PVC:
```diff
- ResultData data{
+ ResultData data{
utils<s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s>...
```
Reproducer:
reproducer.py
````python
from langchain_community.llms import Ollama

# Prompt template; {input} is replaced with the issue text below.
patch_prompt = """You are an experienced programmer. You will be given an issue related to a C++ software project, some related code blocks for context, and a proposed fix. Your ultimate goal is to write code block that fixes the issue.

To respond you MUST use the following format.
~~~
```diff
[fixed code block in diff format.]
```
~~~

Issue: {input}
"""

issue_text = """
Issue description: COPY_INSTEAD_OF_MOVE
Creating a copy of a variable that is no longer used instead of using std::move().

Context:
"kernel_interval_list" is copied in call to copy constructor "std::vector >", when it could be moved instead.
Use "std::move"("kernel_interval_list") instead of "kernel_interval_list".

Affected part of source code:
```cpp
    ResultData data{
      utils::GetPid(),
      device_id_,
      correlator_.GetTimestamp(),
      options_.GetMetricGroup(),
      device_props_list_,
      kernel_name_list,
      kernel_interval_list};
```

In:
identifier: Profiler::DumpResultFile
path: pti/tools/oneprof/profiler.h:262
```cpp
void Profiler::DumpResultFile() {
    std::vector kernel_name_list;
    std::vector kernel_interval_list;

    if (CheckOption(PROF_KERNEL_INTERVALS) ||
        CheckOption(PROF_KERNEL_METRICS) ||
        CheckOption(PROF_AGGREGATION)) {
      if (cl_kernel_collector_ != nullptr) {
        const ClKernelIntervalList& cl_kernel_interval_list =
          cl_kernel_collector_->GetKernelIntervalList();
        std::vector device_list =
          utils::cl::GetDeviceList(CL_DEVICE_TYPE_GPU);
        if (!device_list.empty()) {
          PTI_ASSERT(device_id_ < device_list.size());
          AddKernelIntervals(
              cl_kernel_interval_list, device_list[device_id_],
              kernel_name_list, kernel_interval_list);
        }
      }

      if (ze_kernel_collector_ != nullptr) {
        const ZeKernelIntervalList& ze_kernel_interval_list =
          ze_kernel_collector_->GetKernelIntervalList();
        std::vector device_list =
          utils::ze::GetDeviceList();
        if (!device_list.empty()) {
          PTI_ASSERT(device_id_ < device_list.size());
          AddKernelIntervals(
              ze_kernel_interval_list, device_list[device_id_],
              kernel_name_list, kernel_interval_list);
        }
      }
    }

    if (CheckOption(PROF_KERNEL_QUERY)) {
      PTI_ASSERT(metric_query_collector_ != nullptr);
      kernel_name_list = metric_query_collector_->GetKernels();
    }

    ResultStorage* storage = ResultStorage::Create(
      options_.GetRawDataPath(), utils::GetPid());
    PTI_ASSERT(storage != nullptr);

    ResultData data{
      utils::GetPid(),
      device_id_,
      correlator_.GetTimestamp(),
      options_.GetMetricGroup(),
      device_props_list_,
      kernel_name_list,
      kernel_interval_list};

    storage->Dump(&data);
    delete storage;
}
```
"""

llm = Ollama(
    model="codestral",
    temperature=0.1,
    top_k=10,
    top_p=0.5,
    repeat_penalty=1.03,
    num_thread=28,
)

# Fill the template and send the prompt to the model.
full_prompt = patch_prompt.format(input=issue_text)
response = llm.invoke(full_prompt)
print(response)
````
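To check whether the problem sits in langchain or in the Ollama server itself, the same request could be sent straight to Ollama's REST API. This is a minimal sketch, assuming the server listens on the default `http://localhost:11434`; `build_payload` and `generate` are hypothetical helper names, and the option values simply mirror the ones in reproducer.py.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default address


def build_payload(prompt: str) -> dict:
    """Build a non-streaming /api/generate request mirroring reproducer.py."""
    return {
        "model": "codestral",
        "prompt": prompt,
        "stream": False,
        "options": {
            "temperature": 0.1,
            "top_k": 10,
            "top_p": 0.5,
            "repeat_penalty": 1.03,
            "num_thread": 28,
        },
    }


def generate(prompt: str, url: str = OLLAMA_URL) -> str:
    """POST the prompt to Ollama and return the generated text."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read())["response"]
```

If `generate(full_prompt)` also returns runs of `<s>`, langchain can likely be ruled out as the cause.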
Hi @tkarna ,
I have reproduced this error, and we are working to find the root cause and fix it.
We will post an update here once it is done.
Hi @tkarna ,
We have fixed this issue. You can try again with `pip install ipex-llm[cpp]==2.1.0b20240603` (which will be released tonight).
Thank you! I confirm that with the latest version, `ipex-llm[cpp]==2.1.0b20240603`, the example works correctly.
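For anyone landing here later, a quick way to see whether an environment already has the fixed build is to compare the installed version's date suffix with the one named above. This is a rough sketch, assuming nightly versions end in `bYYYYMMDD` as in `2.1.0b20240603`; the helper names are hypothetical, and versions without such a suffix (for example a stable release) are reported as not confirmed rather than guessed at.

```python
import re
from importlib import metadata
from typing import Optional

FIXED_BUILD = 20240603  # nightly build named in this thread


def build_date(version: str) -> Optional[int]:
    """Extract YYYYMMDD from a version such as "2.1.0b20240603"."""
    match = re.search(r"b(\d{8})$", version)
    return int(match.group(1)) if match else None


def has_fix(version: str) -> bool:
    """True if the version is a nightly build dated on or after the fix."""
    date = build_date(version)
    return date is not None and date >= FIXED_BUILD


def installed_has_fix() -> Optional[bool]:
    """Check the installed ipex-llm; None if the package is not installed."""
    try:
        return has_fix(metadata.version("ipex-llm"))
    except metadata.PackageNotFoundError:
        return None
```

The date comparison ignores the base version number, so it is only meaningful for nightly builds of the same release line.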