gkamradt / langchain-tutorials

Overview and tutorial of the LangChain Library

'UnstructuredPDFLoader' is not defined

netingweb opened this issue · comments

Hello,
When I try to run the tutorial, either on Colab or locally, I always get the following error:
NameError: name 'UnstructuredPDFLoader' is not defined
This happens even after installing all the packages listed at https://langchain.readthedocs.io/en/latest/modules/document_loaders/examples/unstructured_file.html
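
For readers hitting the same error: a NameError like this usually means the import statement for the loader never ran successfully, so the name was never bound. A minimal sketch for surfacing the underlying cause, assuming the older `langchain.document_loaders` import path that the tutorial used at the time (newer LangChain releases may have moved it):

```python
# Assumption: the loader lives at langchain.document_loaders, as in the
# LangChain version the tutorial targeted. Newer releases may differ.
try:
    from langchain.document_loaders import UnstructuredPDFLoader
except ImportError as exc:
    # ModuleNotFoundError is a subclass of ImportError, so a missing
    # package also lands here instead of leaving the name undefined.
    UnstructuredPDFLoader = None
    print(f"Could not import UnstructuredPDFLoader: {exc}")

if UnstructuredPDFLoader is None:
    print("Fix the import error above before running the rest of the notebook.")
```

If the import itself fails, the message it prints is normally more informative than the later NameError.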

Thank you, Greg, for your prompt reply!
I fixed it by reinstalling the following packages on my machine:

!brew install poppler
!brew install tesseract

It is working now.
Cheers

Just a comment for those who can't install poppler.
I had the same issue, and the installation above didn't work for me.

I ran
pip install unstructured

and it helped.


ModuleNotFoundError Traceback (most recent call last)
File .venv/Lib/site-packages/langchain/document_loaders/url.py:14, in UnstructuredURLLoader.__init__(self, urls)
13 try:
---> 14 import unstructured # noqa:F401
15 except ImportError:

ModuleNotFoundError: No module named 'unstructured'
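
The traceback above shows the loader failing at `import unstructured`. A quick way to see which packages are absent before running the notebook is sketched below. The helper name `missing_modules` and the list of required modules are assumptions for illustration, not part of the tutorial.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed list of modules the PDF loader needs; adjust for your setup.
required = ["langchain", "unstructured"]

missing = missing_modules(required)
if missing:
    print("Missing modules, try: pip install " + " ".join(missing))
else:
    print("All required modules are importable.")
```

This only checks Python packages. System tools such as poppler and tesseract, mentioned earlier in the thread, have to be checked separately.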

Unstructured is tough to get going for some users.

I just updated the code to use another loader:
https://github.com/gkamradt/langchain-tutorials/blob/main/data_generation/Ask%20A%20Book%20Questions.ipynb