Bias In Retriever towards Long Texts

Question

kuldeep7688 opened this issue 4 years ago

I tried the retriever module to fetch documents related to a question, but almost every time long documents were ranked as the best match.

I tried to find out whether the tf vectors are length-normalized when the compressed sparse matrix is created, but I couldn't tell from the code.
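To illustrate the effect I suspect, here is a toy sketch in plain Python. It is not the retriever's actual code: the helper names (`tf_vector`, `dot`, `l2_normalize`) and the example texts are made up, and dicts stand in for rows of the sparse matrix. It only shows that scoring with unnormalized term counts lets a long document win through sheer repetition of a common word, while L2 normalization (cosine similarity) removes that advantage.

```python
import math

def tf_vector(tokens):
    """Raw term-frequency vector as a sparse dict: term -> count."""
    counts = {}
    for t in tokens:
        counts[t] = counts.get(t, 0) + 1
    return counts

def dot(a, b):
    """Dot product of two sparse dict vectors."""
    return sum(w * b.get(t, 0) for t, w in a.items())

def l2_normalize(vec):
    """Scale a sparse vector to unit Euclidean length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else dict(vec)

# Hypothetical query and documents, chosen only to show the effect.
query = tf_vector("capital of france".split())
short_doc = tf_vector("paris is the capital of france".split())
# A long, off-topic document that repeats the common word "of" many times.
long_doc = tf_vector(
    ("the history of the wine trade of the region and of the coast " * 5).split()
)

# Unnormalized scores: the long document wins on repetition alone.
raw_short = dot(query, short_doc)
raw_long = dot(query, long_doc)

# Cosine scores: both sides are scaled to unit length before the dot product.
q_unit = l2_normalize(query)
cos_short = dot(q_unit, l2_normalize(short_doc))
cos_long = dot(q_unit, l2_normalize(long_doc))

print("raw:   ", raw_short, raw_long)
print("cosine:", round(cos_short, 3), round(cos_long, 3))
```

With raw counts the long document scores higher even though it never mentions "capital" or "france"; after normalization the short, relevant document comes out on top. If the retriever's matrix holds unnormalized counts, this is the kind of bias I would expect to see.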

Can someone confirm whether I am right or wrong?
Has anyone else noticed this?