SamurAIGPT / EmbedAI

An app to interact with your documents using the power of GPT, 100% private, no data leaks

Home Page: https://www.thesamur.ai/?utm_source=github&utm_medium=link&utm_campaign=github_privategpt


500 Internal Server Error

dKonsalik opened this issue · comments

After uploading a document, when asking questions, I get:
Error getting data.<!doctype html> <title>500 Internal Server Error</title>

Internal Server Error

The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/flask/app.py", line 2190, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.10/dist-packages/flask/app.py", line 1486, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.10/dist-packages/flask_cors/extension.py", line 165, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
  File "/usr/local/lib/python3.10/dist-packages/flask/app.py", line 1484, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.10/dist-packages/flask/app.py", line 1469, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/home/it/Scripts/PrivateChatGPT/server/privateGPT.py", line 146, in get_answer
    res = qa(query)
  File "/usr/local/lib/python3.10/dist-packages/langchain/chains/base.py", line 140, in __call__
    raise e
  File "/usr/local/lib/python3.10/dist-packages/langchain/chains/base.py", line 134, in __call__
    self._call(inputs, run_manager=run_manager)
  File "/usr/local/lib/python3.10/dist-packages/langchain/chains/retrieval_qa/base.py", line 119, in _call
    docs = self._get_docs(question)
  File "/usr/local/lib/python3.10/dist-packages/langchain/chains/retrieval_qa/base.py", line 181, in _get_docs
    return self.retriever.get_relevant_documents(question)
  File "/usr/local/lib/python3.10/dist-packages/langchain/vectorstores/base.py", line 377, in get_relevant_documents
    docs = self.vectorstore.similarity_search(query, **self.search_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/langchain/vectorstores/chroma.py", line 182, in similarity_search
    docs_and_scores = self.similarity_search_with_score(query, k, filter=filter)
  File "/usr/local/lib/python3.10/dist-packages/langchain/vectorstores/chroma.py", line 229, in similarity_search_with_score
    results = self.__query_collection(
  File "/usr/local/lib/python3.10/dist-packages/langchain/utils.py", line 52, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/langchain/vectorstores/chroma.py", line 121, in __query_collection
    return self._collection.query(
  File "/usr/local/lib/python3.10/dist-packages/chromadb/api/models/Collection.py", line 227, in query
    return self._client._query(
  File "/usr/local/lib/python3.10/dist-packages/chromadb/api/local.py", line 437, in _query
    uuids, distances = self._db.get_nearest_neighbors(
  File "/usr/local/lib/python3.10/dist-packages/chromadb/db/clickhouse.py", line 585, in get_nearest_neighbors
    uuids, distances = index.get_nearest_neighbors(embeddings, n_results, ids)
  File "/usr/local/lib/python3.10/dist-packages/chromadb/db/index/hnswlib.py", line 240, in get_nearest_neighbors
    raise NoIndexException(
chromadb.errors.NoIndexException: Index not found, please create an instance before querying
192.168.70.36 - - [30/May/2023 08:26:06] "POST /get_answer HTTP/1.1" 500 -
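The NoIndexException at the bottom of the traceback means the query ran before any embeddings had been persisted, i.e. before a successful ingest created the vector index in the db folder. A defensive check the server could make before querying is sketched below; the `index_exists` helper name and the `db` path are assumptions for illustration, not the project's actual code:

```python
from pathlib import Path

def index_exists(db_dir: str = "db") -> bool:
    """Return True if the persisted vector-store directory exists and is non-empty."""
    p = Path(db_dir)
    return p.is_dir() and any(p.iterdir())

# A /get_answer handler could then return a clear client error
# instead of letting the query raise a 500:
#
#     if not index_exists():
#         return "Please ingest a document first", 400
```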

Have you done an ingest of the document?

Yes, I did.

(screenshot: the uploaded document listed in the UI)

This shows the document is uploaded, but you also need to click the Ingest button to train the model on your document data.
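For context, the ingest step triggered by that button loads the files in source_documents, splits them into chunks of at most 500 characters (as the logs in this thread show), embeds each chunk, and persists the result to the db folder. A rough stdlib sketch of just the splitting stage; `split_into_chunks` is a hypothetical helper, not the project's actual implementation:

```python
def split_into_chunks(text: str, max_chars: int = 500) -> list[str]:
    """Greedily split text into chunks of at most max_chars characters,
    preferring to break at whitespace so words stay intact."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            space = text.rfind(" ", start, end)
            if space > start:
                end = space  # break at the last space inside the window
        chunk = text[start:end].strip()
        if chunk:
            chunks.append(chunk)
        start = end if end > start else start + max_chars
    return chunks
```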

Thank you @Anil-matcha
The result is gibberish; I assume we need to ingest more documents?

Loading documents from source_documents
Loaded 3 documents from source_documents
Split into 96 chunks of text (max. 500 characters each)
03@919:D5362GF%0='H(!4H",&-:-69);!=,'A<76H-D&75>"=#.4"@0.912&=8GC5=)"+EG7!))!&60>:,==$'+%'@(3:(,.>3$A+A"C&;"9GH#@):D:>+0A&%EEA@=),=G5H&C0GD5)8&G(A+5A>8*<.@3;<&A9.D4C,)>:+B<@,;G3643EAE6G>-)$27*"6&%79C10=#.H0@7H64H.*33(,2A9G.;:5B2%BD.@9&D0$75-59#>!HB,*D+
Source: source_documents/state_of_the_union.txt
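When the model answers with noise like the above, a quick programmatic sanity check can catch it before it reaches the user. This heuristic (a hypothetical `looks_like_gibberish` helper, not part of EmbedAI) flags text in which too few whitespace-separated tokens look like ordinary words:

```python
import re

# A "wordish" token: a letter followed by letters, apostrophes, or hyphens.
WORD_RE = re.compile(r"[A-Za-z][A-Za-z'-]*$")

def looks_like_gibberish(text: str, threshold: float = 0.5) -> bool:
    """Return True when fewer than `threshold` of the tokens look like
    ordinary words (edge punctuation is stripped before matching)."""
    tokens = text.split()
    if not tokens:
        return True
    wordish = sum(1 for t in tokens if WORD_RE.match(t.strip(".,;:!?\"()")))
    return wordish / len(tokens) < threshold
```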

What documents have you uploaded?

Maybe you can clean the source_documents and db folders, then upload your document again, ingest, and run the query.


Removed DB and source documents.
I uploaded a PDF, and even tried a simple txt file with 2 sentences; the output is the same, just gibberish.

192.168.70.36 - - [31/May/2023 17:44:11] "GET /download_model HTTP/1.1" 200 -
192.168.70.36 - - [31/May/2023 17:46:18] "POST /upload_doc HTTP/1.1" 200 -
Loading documents from source_documents
Loaded 1 documents from source_documents
Split into 69 chunks of text (max. 500 characters each)
Using embedded DuckDB with persistence: data will be stored in: db/
192.168.70.36 - - [31/May/2023 17:46:29] "GET /ingest HTTP/1.1" 200 -
192.168.70.36 - - [31/May/2023 17:46:50] "OPTIONS /get_answer HTTP/1.1" 200 -
Using embedded DuckDB with persistence: data will be stored in: db/
.A>#68*<$DC36G$(3<2H:3E26.H0#F*3@93#H7$,2DCDA(=''G$<F7G&()$34&(<2=;,GGF0))$6>5D,0C('&&.@eg#B'+A4H7!;2G7)C'&*3$-B@D<E""2)*5(***3@@6.1(<);$"*0&5;0