[Bug]: AttributeError: 'tuple' object has no attribute 'score'
frank-morales2020 opened this issue
Bug Description
/usr/local/lib/python3.10/dist-packages/llama_index/packs/rag_fusion_query_pipeline/base.py in <lambda>(x)
     33     for nodes_with_scores in results:
     34         for rank, node_with_score in enumerate(
---> 35             sorted(nodes_with_scores, key=lambda x: x.score or 0.0, reverse=True)
     36         ):
     37             text = node_with_score.node.get_content()

AttributeError: 'tuple' object has no attribute 'score'
Version
0.10.35
Steps to Reproduce
from llama_index.core.llama_pack import download_llama_pack

# download and install dependencies
RAGFusionPipelinePack = download_llama_pack(
    "RAGFusionPipelinePack", "./rag_fusion_pipeline_pack"
)
Relevant Logs/Tracebacks
SOLUTION
Modify the source file llama_index/packs/rag_fusion_query_pipeline/base.py at line 35:
replace key=lambda x: x.score or 0.0 with key=lambda x: x[1]
The error you're encountering is due to nodes_with_scores being populated with tuples instead of NodeWithScore objects, which is why you're seeing an AttributeError when attempting to access a score attribute that doesn't exist on tuples. The line causing the issue is trying to sort nodes_with_scores based on the score attribute of each item, which works only if those items are NodeWithScore objects.
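The failure mode described here can be reproduced without llama_index. The sketch below uses a hypothetical FakeNodeWithScore dataclass as a stand-in for NodeWithScore (an assumption for illustration, not the library's class) and shows that the sort key from line 35 works on objects with a score attribute but raises the reported error on tuples.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeNodeWithScore:
    """Hypothetical stand-in for NodeWithScore (illustration only)."""

    text: str
    score: Optional[float] = None


def by_score(item):
    # Same key as line 35 of base.py: a missing score counts as 0.0.
    return item.score or 0.0


# Objects with a `score` attribute sort as intended.
nodes = [
    FakeNodeWithScore("a", 0.2),
    FakeNodeWithScore("b", 0.9),
    FakeNodeWithScore("c"),
]
sorted_nodes = sorted(nodes, key=by_score, reverse=True)

# Tuples have no `score` attribute, so the same key raises AttributeError.
pairs = [("a", 0.2), ("b", 0.9)]
try:
    sorted(pairs, key=by_score, reverse=True)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print([n.text for n in sorted_nodes])
print(error_message)
```

Why tuples reach the sort in the actual pack is not established in this thread, so the sketch only demonstrates the symptom, not the cause.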
Your proposed solution to change key=lambda x: x.score or 0.0 to key=lambda x: x[1] will indeed avoid the AttributeError by treating each item as a tuple and sorting based on its second element. However, this approach assumes that the structure of nodes_with_scores is intended to be a list of tuples, which might not align with the original design of the code. This change could lead to incorrect behavior if the list is expected to contain objects with a score attribute.
A more appropriate solution would be to ensure that nodes_with_scores is populated with NodeWithScore objects as originally intended. This might require reviewing the upstream code that generates nodes_with_scores to correct the data type mismatch. This approach addresses the root cause of the issue and maintains the integrity of the data handling logic in the codebase.
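One way to act on this advice is to check the shape of the input before it reaches the sort. The following is a minimal sketch, not part of the pack: as_ranked_lists is a hypothetical helper, and FakeNodeWithScore is a stand-in for NodeWithScore. It assumes that a flat list of nodes should be treated as a single ranked list and that anything else of the wrong type should fail early with a clear message.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class FakeNodeWithScore:
    """Hypothetical stand-in for NodeWithScore (illustration only)."""

    text: str
    score: Optional[float] = None


def as_ranked_lists(results: Any, node_type: type = FakeNodeWithScore) -> List[list]:
    """Return `results` as a list of lists of `node_type`, or raise TypeError."""
    if isinstance(results, node_type):
        raise TypeError("got a single node; expected a list of result lists")
    results = list(results)
    # A flat, non-empty list of nodes is treated as one ranked list.
    if results and all(isinstance(item, node_type) for item in results):
        return [results]
    checked: List[list] = []
    for index, ranked in enumerate(results):
        if not isinstance(ranked, (list, tuple)):
            raise TypeError(
                f"results[{index}] is {type(ranked).__name__}, expected a list"
            )
        for item in ranked:
            if not isinstance(item, node_type):
                raise TypeError(
                    f"results[{index}] contains {type(item).__name__}, "
                    f"expected {node_type.__name__}"
                )
        checked.append(list(ranked))
    return checked


a = FakeNodeWithScore("a", 0.5)
b = FakeNodeWithScore("b", 0.7)
print(as_ranked_lists([a, b]))      # one ranked list
print(as_ranked_lists([[a], [b]]))  # two ranked lists, unchanged
```

Failing early like this keeps the sort key on line 35 untouched and makes the upstream type mismatch visible at the point where it enters the function.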
Sources
- llama-index-packs/llama-index-packs-rag-fusion-query-pipeline/llama_index/packs/rag_fusion_query_pipeline/base.py
- docs/docs/examples/retrievers/reciprocal_rerank_fusion.ipynb
- docs/docs/examples/low_level/fusion_retriever.ipynb
- llama-index-core/llama_index/core/retrievers/fusion_retriever.py
- llama-index-core/llama_index/core/schema.py
- llama-index-legacy/llama_index/legacy/llama_pack/download.py
My full solution here:
from typing import Dict, List

from llama_index.core.schema import NodeWithScore


def reciprocal_rank_fusion(
    results: List[NodeWithScore],
) -> List[NodeWithScore]:
    """Apply reciprocal rank fusion.

    The original paper uses k=60 for best results:
    https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf
    """
    k = 60.0  # `k` is a parameter used to control the impact of outlier rankings.
    fused_scores: Dict[str, float] = {}
    text_to_node: Dict[str, NodeWithScore] = {}

    # compute reciprocal rank scores; `results` is treated as one flat,
    # already-ranked list of nodes, with ranks starting at 1
    for rank, node_with_score in enumerate(results, start=1):
        if not isinstance(node_with_score, NodeWithScore):
            raise TypeError("node_with_score must be a NodeWithScore object.")
        text = node_with_score.node.get_content()
        text_to_node[text] = node_with_score
        if text not in fused_scores:
            fused_scores[text] = 0.0
        fused_scores[text] += 1.0 / (rank + k)

    # sort results
    reranked_results = dict(
        sorted(fused_scores.items(), key=lambda x: x[1], reverse=True)
    )

    # adjust node scores
    reranked_nodes: List[NodeWithScore] = []
    for text, score in reranked_results.items():
        reranked_nodes.append(text_to_node[text])
        reranked_nodes[-1].score = score

    return reranked_nodes
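For comparison, the fusion step can also be sketched over several ranked lists, which is the shape the loop in the traceback iterates over. This is a self-contained illustration under stated assumptions, not the pack's code: FakeNodeWithScore and rrf_over_lists are hypothetical names, nodes are identified by their text, and ranks start at 1 as in the function above (the enumerate call in the traceback has no start argument, so it counts from 0).

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class FakeNodeWithScore:
    """Hypothetical stand-in for NodeWithScore (illustration only)."""

    text: str
    score: Optional[float] = None


def rrf_over_lists(
    results: List[List[FakeNodeWithScore]], k: float = 60.0
) -> Dict[str, float]:
    """Fuse several ranked lists; each appearance adds 1 / (rank + k)."""
    fused: Dict[str, float] = defaultdict(float)
    for ranked in results:
        # Rank each list by its own scores, highest first.
        ordered = sorted(ranked, key=lambda n: n.score or 0.0, reverse=True)
        for rank, node in enumerate(ordered, start=1):
            fused[node.text] += 1.0 / (rank + k)
    # Highest fused score first.
    return dict(sorted(fused.items(), key=lambda kv: kv[1], reverse=True))


list_one = [FakeNodeWithScore("A", 0.9), FakeNodeWithScore("B", 0.5)]
list_two = [FakeNodeWithScore("B", 0.8), FakeNodeWithScore("C", 0.3)]
fused = rrf_over_lists([list_one, list_two])
print(list(fused))
```

Here B appears in both lists (rank 2, then rank 1), so its fused score is 1/62 + 1/61 and it moves ahead of A (1/61) and C (1/62).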