[Bug]: AttributeError: 'tuple' object has no attribute 'score'

Question

frank-morales2020 opened this issue a month ago · comments

FRANK MORALES, SMIEEE commented a month ago

Bug Description

/usr/local/lib/python3.10/dist-packages/llama_index/packs/rag_fusion_query_pipeline/base.py in <lambda>(x)
33 for nodes_with_scores in results:
34 for rank, node_with_score in enumerate(
---> 35 sorted(nodes_with_scores, key=lambda x: x.score or 0.0, reverse=True)
36 ):
37 text = node_with_score.node.get_content()

AttributeError: 'tuple' object has no attribute 'score'

Version

0.10.35

Steps to Reproduce

from llama_index.core.llama_pack import download_llama_pack

# download and install dependencies
RAGFusionPipelinePack = download_llama_pack(
    "RAGFusionPipelinePack", "./rag_fusion_pipeline_pack"
)

Relevant Logs/Tracebacks

Solution

Modify the source file llama_index/packs/rag_fusion_query_pipeline/base.py at line 35:

replace key=lambda x: x.score or 0.0 with key=lambda x: x[1]

dosubot · Answer 1 · Wed May 08 2024 21:37:23 GMT+0800 (China Standard Time)

The error you're encountering is due to nodes_with_scores being populated with tuples instead of NodeWithScore objects, which is why you're seeing an AttributeError when attempting to access a score attribute that doesn't exist on tuples. The line causing the issue is trying to sort nodes_with_scores based on the score attribute of each item, which works only if those items are NodeWithScore objects.

Your proposed solution to change key=lambda x: x.score or 0.0 to key=lambda x: x[1] will indeed avoid the AttributeError by treating each item as a tuple and sorting based on its second element. However, this approach assumes that the structure of nodes_with_scores is intended to be a list of tuples, which might not align with the original design of the code. This change could lead to incorrect behavior if the list is expected to contain objects with a score attribute.
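To make the trade-off concrete, here is a minimal sketch of a sort key that tolerates both shapes. `FakeNodeWithScore` and `score_of` are hypothetical names used only for illustration, and the `(node, score)` tuple layout is an assumption; neither is part of llama_index.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeNodeWithScore:
    """Hypothetical stand-in for NodeWithScore, for illustration only."""
    text: str
    score: Optional[float] = None


def score_of(item: Any) -> float:
    """Return a sortable score for an object with `.score` or a (node, score) tuple."""
    if isinstance(item, tuple):
        # Assumed layout: (node, score). A missing or None score counts as 0.0.
        raw = item[1] if len(item) > 1 else None
    else:
        raw = getattr(item, "score", None)
    return raw if raw is not None else 0.0


items = [FakeNodeWithScore("a", 0.2), ("b", 0.9), FakeNodeWithScore("c", None)]
ranked = sorted(items, key=score_of, reverse=True)
```

A key like this only hides the mismatch during sorting. Any later code that reads `.node` or `.score` on the items would still fail on tuples.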

A more appropriate solution would be to ensure that nodes_with_scores is populated with NodeWithScore objects as originally intended. This might require reviewing the upstream code that generates nodes_with_scores to correct the data type mismatch. This approach addresses the root cause of the issue and maintains the integrity of the data handling logic in the codebase.
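One way to locate the mismatch is to validate the shape of the input before sorting, so the failure points at the offending element rather than at the lambda. The sketch below assumes the fusion step expects a list of lists of scored items; `check_results_shape` and `FakeNodeWithScore` are hypothetical helpers, not llama_index APIs.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class FakeNodeWithScore:
    """Hypothetical stand-in for NodeWithScore, for illustration only."""
    text: str
    score: Optional[float] = None


def check_results_shape(results: Any) -> List[List[Any]]:
    """Raise a readable TypeError unless `results` is a list of lists of scored items."""
    if not isinstance(results, list):
        raise TypeError(f"expected a list of result lists, got {type(results).__name__}")
    for i, inner in enumerate(results):
        if not isinstance(inner, list):
            raise TypeError(
                f"results[{i}] should be a list of scored nodes, got {type(inner).__name__}"
            )
        for j, item in enumerate(inner):
            if not hasattr(item, "score"):
                raise TypeError(
                    f"results[{i}][{j}] has no 'score' attribute ({type(item).__name__})"
                )
    return results


nested_ok = check_results_shape([[FakeNodeWithScore("a", 0.4)], [FakeNodeWithScore("b", 0.7)]])
```

Calling such a check at the top of the fusion function would report whether the input arrived as a flat list or as tuples, which narrows down where the upstream code diverges.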


FRANK MORALES, SMIEEE · Answer 2 · Sat May 11 2024 16:53:45 GMT+0800 (China Standard Time)

My full solution here:

from typing import Dict, List

from llama_index.core.schema import NodeWithScore


def reciprocal_rank_fusion(
    results: List[NodeWithScore],
) -> List[NodeWithScore]:
    """Apply reciprocal rank fusion to a flat list of nodes.

    The original paper uses k=60 for best results:
    https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf
    """
    k = 60.0  # `k` is a parameter used to control the impact of outlier rankings.
    fused_scores: Dict[str, float] = {}
    text_to_node: Dict[str, NodeWithScore] = {}

    # Rank is the 1-based position in `results`; the list is not re-sorted.
    for rank, node_with_score in enumerate(results, start=1):
        if not isinstance(node_with_score, NodeWithScore):
            raise TypeError("node_with_score must be a NodeWithScore object.")
        text = node_with_score.node.get_content()
        text_to_node[text] = node_with_score
        if text not in fused_scores:
            fused_scores[text] = 0.0
        fused_scores[text] += 1.0 / (rank + k)

    # sort results
    reranked_results = dict(
        sorted(fused_scores.items(), key=lambda x: x[1], reverse=True)
    )

    # adjust node scores
    reranked_nodes: List[NodeWithScore] = []
    for text, score in reranked_results.items():
        reranked_nodes.append(text_to_node[text])
        reranked_nodes[-1].score = score
    return reranked_nodes
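For comparison, here is a small self-contained sketch of reciprocal rank fusion over several ranked lists, where each item contributes 1 / (rank + k) per list it appears in. It uses a hypothetical `FakeNodeWithScore` stand-in instead of llama_index classes, assumes ranks start at 1, and is not the pack's actual implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class FakeNodeWithScore:
    """Hypothetical stand-in for NodeWithScore, for illustration only."""
    text: str
    score: Optional[float] = None


def rrf(result_lists: List[List[FakeNodeWithScore]], k: float = 60.0) -> List[FakeNodeWithScore]:
    """Fuse several ranked lists; each appearance adds 1 / (rank + k) to an item's score."""
    fused: Dict[str, float] = {}
    for ranked in result_lists:
        # Order each list by its own retrieval score, best first.
        ordered = sorted(ranked, key=lambda n: n.score or 0.0, reverse=True)
        for rank, item in enumerate(ordered, start=1):
            fused[item.text] = fused.get(item.text, 0.0) + 1.0 / (rank + k)
    # Return new items carrying the fused score, highest first.
    return [
        FakeNodeWithScore(text, score)
        for text, score in sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
    ]


list_a = [FakeNodeWithScore("alpha", 0.9), FakeNodeWithScore("beta", 0.5)]
list_b = [FakeNodeWithScore("beta", 0.8), FakeNodeWithScore("gamma", 0.3)]
fused = rrf([list_a, list_b])
```

Here "beta" appears in both lists (rank 2 and rank 1), so its fused score 1/62 + 1/61 places it ahead of "alpha", which only appears once at rank 1.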