[Bug]: Knowledge not being returned from brain in latest release
Croccodoyle opened this issue
What happened?
Latest build. The context is visible in the document in the brain, but I get the following message: "2024-03-09 12:41:07 [INFO] modules.brain.knowledge_brain_qa [knowledge_brain_qa.py:107]: No source documents found or source_documents is not a list." Could this be because I switched the brain from Ollama to an OpenAI model? When I check knowledge_brain_qa.py, Pylance reports: Import "langchain.callbacks.streaming_aiter" could not be resolved (reportMissingImports). It also reports: Import "pydantic_settings" could not be resolved (reportMissingImports).
Based on the log message you're receiving, the source_documents parameter passed to the generate_source function is either None or not a list. You can add a type check at the beginning of generate_source so that it only proceeds when source_documents is a list. Here's an example of how you can do this:
def generate_source(self, source_documents):
    if not isinstance(source_documents, list):
        raise TypeError("source_documents must be a list")
    # rest of the function
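If raising is too strict for your case, a gentler variant is to log what was actually received and fall back to an empty list, which also tells you whether retrieval returned None or some other type. This is only a sketch: ensure_document_list is a hypothetical helper name, not something that exists in the project.

```python
import logging
from typing import Any, List

logger = logging.getLogger(__name__)


def ensure_document_list(source_documents: Any) -> List[Any]:
    """Return source_documents if it is a list, otherwise an empty list.

    Hypothetical helper: logs the reason so the cause is visible in the logs.
    """
    if source_documents is None:
        logger.info("source_documents is None; the retriever returned nothing")
        return []
    if not isinstance(source_documents, list):
        logger.info(
            "source_documents has unexpected type %s",
            type(source_documents).__name__,
        )
        return []
    if not source_documents:
        logger.info("source_documents is an empty list")
    return source_documents
```

Calling this at the top of generate_source would keep the function from failing while still recording why no sources were produced.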
Regarding the unresolved imports of langchain.callbacks.streaming_aiter and pydantic_settings in knowledge_brain_qa.py: Pylance raises reportMissingImports when it cannot find a module in the Python environment selected in the editor. The packages are therefore most likely not installed in that environment, or the editor is pointing at a different interpreter than the one that runs the app.
For the langchain.callbacks.streaming_aiter import, ensure that the langchain library is installed in that environment (e.g., pip install langchain). If the import still fails after installing, the installed version may organize its modules differently, so either adjust the import statement to match it or install the version listed in the project's requirements.
For the pydantic_settings import, note that pydantic_settings is a real package that is installed separately from Pydantic (pip install pydantic-settings). Since Pydantic v2, the BaseSettings class used for configuration management lives in that package, so the import should look like this:
from pydantic_settings import BaseSettings
The older form, from pydantic import BaseSettings, only works with Pydantic v1. Ensure that both pydantic and pydantic-settings are installed in the environment that the editor and the app use.
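To confirm whether the two modules are visible to a given interpreter, you could run a short check like the one below with the same Python that runs the app, and again with the one selected in the editor. It is a minimal sketch using only the standard library; is_importable is a hypothetical helper name.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if module_name can be located by the current interpreter."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec imports parent packages first, so a missing parent
        # package raises ModuleNotFoundError (a subclass of ImportError).
        return False


if __name__ == "__main__":
    # Shows which interpreter is being checked.
    print("Interpreter:", sys.executable)
    for name in ("langchain.callbacks.streaming_aiter", "pydantic_settings"):
        status = "found" if is_importable(name) else "MISSING"
        print(f"{name}: {status}")
```

If a module shows as MISSING in one interpreter but found in another, the editor and the app are using different environments.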
Regarding the structure and expected format of the source_documents parameter in the generate_source function, it is expected to be an iterable collection of objects, where each object represents a document and contains a metadata attribute. This metadata attribute is a dictionary that may include keys such as original_file_name and file_name. The function processes these documents to generate a list of Sources objects, where each Sources object includes the document's name, type (url or file), source URL, and original file name.
Here's a breakdown of the expected structure based on the code:
- Each item in source_documents should be an object that has a metadata dictionary.
- The metadata dictionary should have:
  - original_file_name: Used to determine if the document is a URL (starts with "http") and to get the document's name if it is a URL.
  - file_name: Used to get the document's name if it is not a URL.
- The function checks if original_file_name exists and starts with "http" to classify the document as a URL; otherwise, it is treated as a file.
- For files, a file_path is constructed using brain_id and the file_name from metadata, and a signed URL is generated for accessing the file.
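Example of a minimal source_documents structure based on the description (written as plain dictionaries for readability; in the function, each item is an object that exposes metadata as an attribute):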
source_documents = [
    {
        "metadata": {
            "original_file_name": "http://example.com/document1.pdf",
            "file_name": "document1.pdf"
        }
    },
    {
        "metadata": {
            "file_name": "document2.pdf"
        }
    }
]
This example includes two documents, one treated as a URL and the other as a file, based on the presence and content of the original_file_name key.
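The URL-versus-file classification described above can be sketched roughly as follows. This is an illustration, not the project's actual implementation: Document, Source, fake_sign_url, and classify_sources are hypothetical stand-ins, the file_path format is assumed, and the real code generates a signed storage URL where the placeholder is used.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Document:
    """Hypothetical stand-in for a retrieved document with a metadata dict."""
    metadata: dict


@dataclass
class Source:
    """Hypothetical stand-in for the Sources object described above."""
    name: str
    type: str  # "url" or "file"
    source_url: str
    original_file_name: str


def fake_sign_url(file_path: str) -> str:
    # Placeholder only: the real project would return a signed storage URL.
    return f"signed://{file_path}"


def classify_sources(
    source_documents: Any,
    brain_id: str,
    sign_url: Callable[[str], str] = fake_sign_url,
) -> List[Source]:
    """Classify each document as a URL or a file based on its metadata."""
    if not isinstance(source_documents, list):
        return []
    sources = []
    for doc in source_documents:
        original = doc.metadata.get("original_file_name") or ""
        if original.startswith("http"):
            # URL documents take their name and link from original_file_name.
            sources.append(Source(original, "url", original, original))
        else:
            # File documents are addressed by brain_id plus file_name
            # (assumed path format).
            file_name = doc.metadata.get("file_name", "")
            file_path = f"{brain_id}/{file_name}"
            sources.append(Source(file_name, "file", sign_url(file_path), original))
    return sources
```

Running it on the two example documents, wrapped as Document objects, would produce one Source of type url and one of type file.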
I hope this helps! If you have any other questions or need further clarification, feel free to ask.