mozilla / bugbug

Platform for Machine Learning projects on Software Engineering

MEI: inconsistency between the ME number and the corresponding bug queries

hsinyi opened this issue · comments

The queries for opened and closed bugs are not snapshots of the moment the corresponding MEs were calculated. Taking the DOM Core quality report as an example: the report is usually sent every Tuesday at 2-3 AM CET, but the query criterion includes bugs opened or closed until the end of that Tuesday. So when I check the email later, I can't get the bug list that the ME was computed from. It would be more helpful if the bug query attached to the report could be updated.

Also, a week in the current query is actually 8 days long, e.g. this query runs from 2024-04-02 to 2024-04-09 23:59:59.
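To make the 8-day count concrete, here is a minimal sketch of the date arithmetic. It assumes the query's start date is inclusive from 00:00:00 and that the end runs to 23:59:59, so both calendar days are fully covered. The helper names `inclusive_days` and `seven_day_window` are made up for illustration and are not part of bugbug.

```python
from datetime import date, timedelta


def inclusive_days(first_day: date, last_day: date) -> int:
    """Count calendar days covered when both endpoints are fully included."""
    return (last_day - first_day).days + 1


def seven_day_window(last_day: date) -> tuple[date, date]:
    """Return (first_day, last_day) covering exactly 7 calendar days."""
    return last_day - timedelta(days=6), last_day


# The range quoted in the issue: 2024-04-02 through 2024-04-09 inclusive.
print(inclusive_days(date(2024, 4, 2), date(2024, 4, 9)))  # 8

# A window ending on the same day that covers exactly one week.
print(seven_day_window(date(2024, 4, 9)))
```

Under these assumptions, a true 7-day window ending on 2024-04-09 would start on 2024-04-03 rather than 2024-04-02.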

I think the discrepancy comes from the exact dates that are being used, which are slightly different between the report and the query.

No I was wrong, we are actually using the same query.

@hsinyi I guess the discrepancy might actually come from the fact that the query is run during the night, while you look at it in the morning, and some bugs might have changed in the meantime.

Yes, exactly, that's what I thought. So I do wonder how easy it is to fix the problem.

One solution may be: when the query runs on Day_N, the queried period ends at Day_N-1. Since that period is already in the past when the query runs, we get a "fixed" list even when I check it later.
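The Day_N-1 idea could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `closed_week_window` is hypothetical, the window is taken to be 7 full calendar days, and timestamps are naive (a single time zone is assumed throughout). It is not how bugbug currently builds its query.

```python
from datetime import datetime, time, timedelta


def closed_week_window(run_time: datetime) -> tuple[datetime, datetime]:
    """Return a 7-day window that ends at the close of the day before run_time.

    Because the whole window lies in the past when the query runs, the same
    bounds select the same period no matter when the query is re-run later.
    """
    last_day = run_time.date() - timedelta(days=1)   # Day_N-1
    first_day = last_day - timedelta(days=6)         # 7 calendar days in total
    start = datetime.combine(first_day, time.min)    # 00:00:00 on the first day
    end = datetime.combine(last_day, time(23, 59, 59))
    return start, end


# Report generated on Tuesday 2024-04-09 at 02:30.
start, end = closed_week_window(datetime(2024, 4, 9, 2, 30))
print(start, end)
```

For a report generated on Tuesday 2024-04-09, the window would cover 2024-04-02 00:00:00 through 2024-04-08 23:59:59.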

There might still be inconsistencies in the calculations because some bugs might change (e.g. get closed) between the time the query runs and the time you use it yourself, though I agree it would be more often consistent than it is now.

@jensstutte do you see any downsides?

I guess another approach that minimizes changes is adding the exact hour to the query and not just the day.
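A rough sketch of the exact-timestamp approach, assuming the report generator knows its own run time. The function name `snapshot_bounds` and the dictionary keys `"from"` and `"to"` are placeholders for illustration; they are not Bugzilla's actual query parameter names, and this is not bugbug's code.

```python
from datetime import datetime, timedelta, timezone


def snapshot_bounds(run_time: datetime, days: int = 7) -> dict[str, str]:
    """Build query bounds that end at the exact moment the report was computed.

    The keys "from" and "to" are hypothetical placeholders for whatever
    parameters the real query uses.
    """
    if run_time.tzinfo is None:
        raise ValueError("run_time must be timezone-aware")
    # Normalize to UTC and drop sub-second precision for a stable string.
    end = run_time.astimezone(timezone.utc).replace(microsecond=0)
    start = end - timedelta(days=days)
    fmt = "%Y-%m-%d %H:%M:%S"
    return {"from": start.strftime(fmt), "to": end.strftime(fmt)}


# A run at 02:30 local time in a UTC+2 zone.
local = timezone(timedelta(hours=2))
print(snapshot_bounds(datetime(2024, 4, 9, 2, 30, tzinfo=local)))
```

Pinning the upper bound to the run time means changes made after the report was generated fall outside the period. A bug whose state changes afterwards can still show up differently when the query is re-run later.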

Yes, I think having exact timestamps would help with most of the differences. We can still see differences from bugs that changed components and from weights changing. But that is the nature of MEI, and it is intended that it can change with those parameters: we want to look at what we know today (when we measure) about our bugs.