PromtEngineer / localGPT

Chat with your documents on your local device using GPT models. No data leaves your device and 100% private.

problem when ingesting (just CPU)

alexmc6 opened this issue · comments

I have set things up on an Ubuntu 22.04 box as per the instructions: Anaconda installed, new Python 3.10 env created. I get part way through python ingest.py --device_type cpu, and then it fails with:

  File "/home/alex2/anaconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/__init__.py", line 143, in Client
    api = system.instance(API)
  File "/home/alex2/anaconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/config.py", line 195, in instance
    impl = type(self)
  File "/home/alex2/anaconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/api/segment.py", line 82, in __init__
    self._manager = self.require(SegmentManager)
  File "/home/alex2/anaconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/config.py", line 134, in require
    inst = self._system.instance(type)
  File "/home/alex2/anaconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/config.py", line 192, in instance
    type = get_class(fqn, type)
  File "/home/alex2/anaconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/config.py", line 239, in get_class
    module = importlib.import_module(module_name)
  File "/home/alex2/anaconda3/envs/localGPT/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/alex2/anaconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/segment/impl/manager/local.py", line 13, in <module>
    from chromadb.segment.impl.vector.local_persistent_hnsw import (
  File "/home/alex2/anaconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/segment/impl/vector/local_persistent_hnsw.py", line 9, in <module>
    from chromadb.segment.impl.vector.local_hnsw import (
  File "/home/alex2/anaconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/segment/impl/vector/local_hnsw.py", line 21, in <module>
    import hnswlib
ImportError: /home/alex2/anaconda3/envs/localGPT/lib/python3.10/site-packages/hnswlib.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZNSt15__exception_ptr13exception_ptr10_M_releaseEv

This looks like it might be the same problem as

langchain-ai/langchain#3017

The discussion there suggests I may have the wrong version of hnswlib and recommends

pip install hnswlib --user --no-build-isolation
pip install chromadb --user

However, I can't install hnswlib that way as it crashes too - in a different way.

Is there anything more I should check? Has anyone done this option recently?

(PS: this machine has no proper GPU, just the onboard graphics, so I am using the CPU-only setting for now. 32 GB RAM, i5, Intel motherboard, Ubuntu 22.04.)

I am just ingesting the one pdf which came with the git repo.
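
For anyone triaging the same failure, a minimal sketch (not part of localGPT or chromadb) that pulls the shared-object path and the missing symbol out of an ImportError message shaped like the one above. The names MissingSymbol, parse_undefined_symbol and diagnose_import are hypothetical helpers introduced for illustration. An "undefined symbol" error from a compiled extension usually suggests that the extension was built against a different C++ runtime than the one loaded at import time, but that is an assumption to verify on the machine, not a confirmed diagnosis.

```python
import importlib
import re
from typing import NamedTuple, Optional


class MissingSymbol(NamedTuple):
    """Shared object that failed to load and the symbol it could not resolve."""
    library: str
    symbol: str


# Matches messages of the form "<path>.so: undefined symbol: <mangled name>".
_PATTERN = re.compile(
    r"^(?P<library>\S+\.so[\w.]*): undefined symbol: (?P<symbol>\S+)$"
)


def parse_undefined_symbol(message: str) -> Optional[MissingSymbol]:
    """Return the library path and symbol, or None if the message has another shape."""
    match = _PATTERN.match(message.strip())
    if match is None:
        return None
    return MissingSymbol(match["library"], match["symbol"])


def diagnose_import(module_name: str) -> Optional[MissingSymbol]:
    """Try to import a module; return the missing symbol if that is why it failed.

    Returns None when the import succeeds. Any other import failure
    (for example, the module not being installed) is re-raised unchanged.
    """
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        parsed = parse_undefined_symbol(str(exc))
        if parsed is None:
            raise
        return parsed
    return None
```

Calling diagnose_import("hnswlib") inside the affected conda env would return the path of the .so file and the mangled symbol name, which are the two things worth quoting when searching for or reporting the problem.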

Man oh man, this is also not working for me on PC only, @PromtEngineer.

python ingest.py --device_type opencl
2024-05-15 04:59:51,314 - INFO - ingest.py:147 - Loading documents from /home/rob/localGPT/SOURCE_DOCUMENTS
Importing: Orca_paper.pdf
2024-05-15 04:59:51,334 - INFO - ingest.py:47 - Loading document batch
/home/rob/localGPT/SOURCE_DOCUMENTS/Orca_paper.pdf loaded.

2024-05-15 05:00:24,027 - INFO - :241 - pikepdf C++ to Python logger bridge initialized
2024-05-15 05:00:55,455 - INFO - ingest.py:156 - Loaded 1 documents from /home/rob/localGPT/SOURCE_DOCUMENTS
2024-05-15 05:00:55,455 - INFO - ingest.py:157 - Split into 193 chunks of text
2024-05-15 05:01:02,291 - INFO - SentenceTransformer.py:66 - Load pretrained SentenceTransformer: colbert-ir/colbertv2.0
2024-05-15 05:01:03,468 - WARNING - SentenceTransformer.py:805 - No sentence-transformers model found with name /home/rob/.cache/torch/sentence_transformers/colbert-ir_colbertv2.0. Creating a new one with MEAN pooling.
2024-05-15 05:01:06,047 - INFO - ingest.py:168 - Loaded embeddings from colbert-ir/colbertv2.0
Traceback (most recent call last):
  File "/home/rob/localGPT/ingest.py", line 182, in <module>
    main()
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/rob/localGPT/ingest.py", line 170, in main
    db = Chroma.from_documents(
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/langchain/vectorstores/chroma.py", line 613, in from_documents
    return cls.from_texts(
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/langchain/vectorstores/chroma.py", line 568, in from_texts
    chroma_collection = cls(
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/langchain/vectorstores/chroma.py", line 120, in __init__
    self._client = chromadb.Client(_client_settings)
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/__init__.py", line 143, in Client
    api = system.instance(API)
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/config.py", line 195, in instance
    impl = type(self)
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/api/segment.py", line 82, in __init__
    self._manager = self.require(SegmentManager)
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/config.py", line 134, in require
    inst = self._system.instance(type)
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/config.py", line 192, in instance
    type = get_class(fqn, type)
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/config.py", line 239, in get_class
    module = importlib.import_module(module_name)
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/segment/impl/manager/local.py", line 13, in <module>
    from chromadb.segment.impl.vector.local_persistent_hnsw import (
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/segment/impl/vector/local_persistent_hnsw.py", line 9, in <module>
    from chromadb.segment.impl.vector.local_hnsw import (
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/segment/impl/vector/local_hnsw.py", line 21, in <module>
    import hnswlib
ImportError: /home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/hnswlib.cpython-310-x86_64-linux-gnu.so: undefined symbol: __cxa_call_terminate
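
Both tracebacks end in a symbol that belongs to the C++ runtime rather than to hnswlib itself, so one thing that may be worth checking is which libstdc++ the env actually provides. The sketch below is an assumption-laden diagnostic, not a fix: it scans the raw bytes of a libstdc++.so.6 file for its GLIBCXX_x.y.z and CXXABI_x.y.z version tags and reports the highest of each, roughly what strings libstdc++.so.6 | grep GLIBCXX would show. The function names are hypothetical, and the file locations mentioned afterwards are typical guesses that may differ on a given machine.

```python
import re
from pathlib import Path
from typing import Dict, Tuple

# Version tags appear as plain ASCII strings inside the shared object,
# e.g. "GLIBCXX_3.4.29" or "CXXABI_1.3.13".
_TAG = re.compile(rb"(GLIBCXX|CXXABI)_(\d+(?:\.\d+)*)")


def highest_abi_tags(data: bytes) -> Dict[str, Tuple[int, ...]]:
    """Return the highest numeric version found for each tag family."""
    highest: Dict[str, Tuple[int, ...]] = {}
    for family, version in _TAG.findall(data):
        parsed = tuple(int(part) for part in version.decode().split("."))
        key = family.decode()
        if key not in highest or parsed > highest[key]:
            highest[key] = parsed
    return highest


def highest_abi_tags_in_file(path: str) -> Dict[str, Tuple[int, ...]]:
    """Convenience wrapper that reads a shared object from disk."""
    return highest_abi_tags(Path(path).read_bytes())
```

Running highest_abi_tags_in_file on the env's copy (often under the env's lib/ directory) and on the system copy (on Ubuntu often under /usr/lib/x86_64-linux-gnu/) makes it easy to see whether one is older than the other. If the copy loaded at runtime is the older one, that would be consistent with the undefined symbol errors above, though it does not prove it.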