lezzago / LambdaMart

Python implementation of LambdaMart

Answer Wrong

houchenyu opened this issue · comments

Hello, and sorry to bother you. I am a Ph.D. from China and have recently been learning LambdaMART. Thank you for sharing this repository. However, while implementing my own version of LambdaMART with reference to your code, I found a discrepancy. I don't know whether I have misunderstood something or whether there is a mistake in your code, so I am opening this issue in the hope of discussing it with you.

While debugging your code by printing the lambdas during the construction of each tree, I used a small sample dataset and found that the first lambdas are not consistent with my hand calculation.

The samples are:
0 qid:1830 1:0.002736 2:0.000000 3:0.000000 4:0.000000 5:0.002736 6:0.000000 7:0.000000 8:0.000000 9:0.000000 10:0.000000
0 qid:1830 1:0.025992 2:0.125000 3:0.000000 4:0.000000 5:0.027360 6:0.000000 7:0.000000 8:0.000000 9:0.000000 10:0.000000
0 qid:1830 1:0.001368 2:0.000000 3:0.000000 4:0.000000 5:0.001368 6:0.000000 7:0.000000 8:0.000000 9:0.000000 10:0.000000
1 qid:1830 1:0.188782 2:0.375000 3:0.333333 4:1.000000 5:0.195622 6:0.000000 7:0.000000 8:0.000000 9:0.000000 10:0.000000
1 qid:1830 1:0.077975 2:0.500000 3:0.666667 4:0.000000 5:0.086183 6:0.000000 7:0.000000 8:0.000000 9:0.000000 10:0.000000
0 qid:1830 1:0.075239 2:0.125000 3:0.333333 4:0.000000 5:0.077975 6:0.000000 7:0.000000 8:0.000000 9:0.000000 10:0.000000
1 qid:1830 1:0.079343 2:0.250000 3:0.666667 4:0.000000 5:0.084815 6:0.000000 7:0.000000 8:0.000000 9:0.000000 10:0.000000
1 qid:1830 1:0.147743 2:0.000000 3:0.000000 4:0.000000 5:0.147743 6:0.000000 7:0.000000 8:0.000000 9:0.000000 10:0.000000
0 qid:1830 1:0.058824 2:0.000000 3:0.000000 4:0.000000 5:0.058824 6:0.000000 7:0.000000 8:0.000000 9:0.000000 10:0.000000
0 qid:1830 1:0.071135 2:0.125000 3:0.333333 4:0.000000 5:0.073871 6:0.000000 7:0.000000 8:0.000000 9:0.000000 10:0.000000

The first lambda for each document should be:
-0.49454758 -0.20639226 -0.10416753 0.23122752 0.23122752 -0.03293464
0.24015702 0.2471325 -0.05118031 -0.06052223

But your code prints:
-0.06881364 -0.06414268 -0.05850759 -0.18821454 -0.17928504 -0.03063916
0.08398904 0.13811402 0.11171096 0.25578862
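
For reference, the first set of values quoted above (the hand calculation) is consistent with the following minimal sketch. It rests on several assumptions that are not stated in the issue: all initial scores are zero, tied documents keep their input order, the gain is `2^label - 1`, the discount is `1 / log2(1 + rank)`, and sigma is 1. The function name `pairwise_lambdas` is hypothetical and is not part of the repository.

```python
import numpy as np

def pairwise_lambdas(labels, scores):
    """Sketch of LambdaMART lambdas with |delta NDCG| weighting (assumed conventions)."""
    labels = np.asarray(labels, dtype=float)
    scores = np.asarray(scores, dtype=float)
    n = len(labels)

    # Rank documents by descending score; a stable sort keeps input order on ties.
    order = np.argsort(-scores, kind="stable")
    rank = np.empty(n, dtype=int)
    rank[order] = np.arange(1, n + 1)

    gain = 2.0 ** labels - 1.0
    discount = 1.0 / np.log2(1.0 + rank)

    # Ideal DCG: gains sorted in descending order, placed at ranks 1..n.
    idcg = np.sum(np.sort(gain)[::-1] / np.log2(np.arange(2, n + 2)))
    lambdas = np.zeros(n)
    if idcg == 0:
        return lambdas

    for i in range(n):
        for j in range(n):
            if labels[i] > labels[j]:  # document i should rank above document j
                rho = 1.0 / (1.0 + np.exp(scores[i] - scores[j]))
                delta = abs((gain[i] - gain[j]) * (discount[i] - discount[j])) / idcg
                lambdas[i] += rho * delta  # push the more relevant document up
                lambdas[j] -= rho * delta  # push the less relevant document down
    return lambdas

# Labels of the ten qid:1830 samples, in the order listed in the issue.
labels = [0, 0, 0, 1, 1, 0, 1, 1, 0, 0]
lambdas = pairwise_lambdas(labels, np.zeros(len(labels)))
print(np.round(lambdas, 8))
```

Under these assumptions the result agrees with the hand-calculated values, so the mismatch more likely comes from how the repository assigns ranks or indexes documents than from the lambda formula itself.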

I would appreciate a response to this issue. You can contact me by email: houcy@zjut.edu.cn

Best wishes.

sample.txt

@houchenyu Have you solved the problem yet? I found that the code for calculating lambda doesn't seem right, especially the use of np.argsort for the index.

You can refer to my repository.

Hi @lezzago,
I know the final input format is pairs of score comparisons (>, <, or possibly even =), but what is qid here?

LambdaMart/lambdamart.py

Lines 237 to 238 in 4c5154a

true_scores = [self.training_data[query_indexes[query], 0] for query in query_keys]
good_ij_pairs = get_pairs(true_scores)

Have you solved this problem? Is qid used here in place of the query, with no actual query content? If so, how are the docs matched to their query?
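
On the qid question: in the LETOR/SVMrank-style text format used by the sample above, each line has the form `label qid:<id> feature:value ...`. The qid is only a group identifier. Lines that share a qid are documents retrieved for the same query, and pairs are formed only within a group, so the query text itself is not needed. A minimal sketch of grouping lines by qid follows. The helper names `parse_letor_line` and `group_by_qid` are hypothetical, the feature vectors are shortened, and the `qid:1831` line is made up for illustration.

```python
from collections import defaultdict

def parse_letor_line(line):
    """Split one 'label qid:<id> idx:val ...' line into (label, qid, features)."""
    tokens = line.split("#", 1)[0].split()  # drop any trailing comment
    label = int(tokens[0])
    qid = tokens[1].split(":", 1)[1]
    features = {}
    for token in tokens[2:]:
        index, value = token.split(":", 1)
        features[int(index)] = float(value)
    return label, qid, features

def group_by_qid(lines):
    """Collect (label, features) tuples per query id, preserving line order."""
    groups = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue
        label, qid, features = parse_letor_line(line)
        groups[qid].append((label, features))
    return dict(groups)

sample = [
    "0 qid:1830 1:0.002736 2:0.000000",
    "1 qid:1830 1:0.188782 2:0.375000",
    "1 qid:1831 1:0.500000 2:0.250000",  # made-up second query
]
groups = group_by_qid(sample)
for qid, docs in groups.items():
    print(qid, [label for label, _ in docs])
```

Each group can then be ranked and scored independently, which is where the per-query relevance labels in the quoted lines come in.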