khoj-ai / khoj

Your AI second brain. Get answers to your questions, whether they be online or in your own notes. Use online AI models (e.g. gpt4) or private, local LLMs (e.g. llama3). Self-host locally or use our cloud instance. Access from Obsidian, Emacs, Desktop app, Web or WhatsApp.

https://khoj.dev

[FIX] Gracefully handle error case when user-generated data is indexed with two different search models

sabaimran opened this issue 6 months ago

sabaimran commented 6 months ago

Describe the bug

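The error reported in this thread occurs when a user's data has been indexed with two different search models, here the multilingual model and the default model. The two models produce embeddings of different dimensions (384 and 768, per the stack trace), so the vector comparison at query time fails.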

To Reproduce

Steps to reproduce the behavior:

1. Index some data with the default model.
2. Change the search model in the settings page.
3. Index some new data.

Any chat query after that results in an internal server error.

Stack trace

[2024-02-21 14:13:31 +0000] [126371] [ERROR] Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
psycopg2.errors.DataException: different vector dimensions 384 and 768
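
A minimal sketch of why this fails, assuming the query embedding is compared against stored embeddings with a vector distance function. The function below is illustrative, not Khoj's or the database's actual code, and which side has 384 versus 768 dimensions is an assumption made only for the example.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two vectors of equal length."""
    if len(a) != len(b):
        # Distance is undefined across dimensions; mirror the database error text.
        raise ValueError(f"different vector dimensions {len(a)} and {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Assumption: one entry was indexed with a 384-dimension model, and the
# query is embedded with a 768-dimension model after the settings change.
stored_embedding = [0.1] * 384
query_embedding = [0.1] * 768

try:
    cosine_distance(stored_embedding, query_embedding)
except ValueError as err:
    message = str(err)
```

Because a single mismatched row is enough to raise the error, every query fails once the index contains embeddings from both models.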

Potential Fixes

- Delete all indexed data when the search model is changed.
- Maintain state of which search model was used to generate which embeddings. It will still be tricky to stitch together results across different models.
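
Both options can be sketched roughly as follows, assuming each indexed entry records the name of the model that produced its embedding. The `Entry` class, the `search_model` field, and the helper functions are hypothetical names for illustration and are not taken from the Khoj codebase; the dimensions assigned to each model are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    text: str
    embedding: list
    search_model: str  # hypothetical field: model that generated the embedding


def entries_for_model(entries, active_model):
    """Second option: search only embeddings generated by the active model."""
    return [e for e in entries if e.search_model == active_model]


def stale_entries(entries, active_model):
    """Entries from other models: candidates for deletion (first option) or re-indexing."""
    return [e for e in entries if e.search_model != active_model]


# Illustrative data; which model has which dimension is an assumption.
entries = [
    Entry("note a", [0.0] * 384, "default"),
    Entry("note b", [0.0] * 768, "multilingual"),
]

searchable = entries_for_model(entries, "multilingual")
to_reindex = stale_entries(entries, "multilingual")
```

Filtering by model avoids the dimension error but silently hides older entries from search until they are re-indexed, which is the trade-off against simply deleting them on a model change.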

Platform

Server:
- Cloud-Hosted (https://app.khoj.dev)

If self-hosted

Server Version: 1.6.0
