jma127 / pyltr

Python learning to rank (LTR) toolkit

question about calculating NDCG score

Eku0194 opened this issue · comments

Your code shows:

    class NDCG(Metric):
        def __init__(self, k=10, gain_type='exp2'):
            super(NDCG, self).__init__()
            self.k = k
            self.gain_type = gain_type
            self._dcg = DCG(k=k, gain_type=gain_type)
            self._ideals = {}

        def evaluate(self, qid, targets):
            return (self._dcg.evaluate(qid, targets) /
                    max(_EPS, self._get_ideal(qid, targets)))

Question:

  1. Is the evaluate method used to calculate the NDCG score of the model?
     If yes, what is the format of targets? Is it the array of labels for each query-document pair?
     If not, how do we calculate the NDCG score of the selected model?

Hello! targets is an ordered list of relevance scores, where targets[0] is the top of the result set. And the evaluate() method is used to calculate the NDCG score of each individual result set. You could sum and divide by n to get the average NDCG across the dataset.
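To make the convention concrete, here is a minimal sketch (not pyltr's actual implementation) of NDCG@k for a single result set, using the same 'exp2' gain and _EPS guard that appear in the quoted class. The function names and the _EPS value are illustrative assumptions:

```python
import math

_EPS = 1e-10  # guard against division by zero; the exact value is an assumption


def dcg(targets, k=10, gain_type='exp2'):
    """DCG@k over an ordered list of relevance labels (targets[0] = top result)."""
    total = 0.0
    for i, rel in enumerate(targets[:k]):
        # 'exp2' gain: 2^rel - 1; otherwise use the raw label as the gain
        gain = 2.0 ** rel - 1.0 if gain_type == 'exp2' else float(rel)
        total += gain / math.log2(i + 2)  # rank i (0-based) discounted by log2(rank + 2)
    return total


def ndcg(targets, k=10, gain_type='exp2'):
    """NDCG@k: DCG of the given ordering divided by DCG of the ideal ordering."""
    ideal = dcg(sorted(targets, reverse=True), k, gain_type)
    return dcg(targets, k, gain_type) / max(_EPS, ideal)
```

For example, `ndcg([3, 2, 1, 0])` is 1.0 (the list is already ideally ordered), while any ordering that puts a lower label above a higher one scores strictly less.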

Thank you, Jerry.
So, the relevance score is the result of the ranking function?
To calculate the average NDCG for multiple queries at once, do I sum the NDCG scores of all the individual result sets and divide by n, or do I calculate the average NDCG per query_id and then divide by the total number of queries?

There's a one-to-one correspondence between a query and a result set (i.e. if n denotes the number of result sets, then n is equal to the total number of queries).

Thank you so much, Jerry!