aiarena / aiarena-web

A website for running the aiarena.net ladder.

Home Page: https://aiarena.net


Optimize slow front page queries

lladdy opened this issue

top10 = Ladders.get_competition_ranked_participants(comp, amount=10).prefetch_related(
    Prefetch("bot", queryset=Bot.objects.all().only("user_id", "name")),
    Prefetch("bot__user", queryset=User.objects.all().only("patreon_level")),
    Prefetch(
        "bot__matchparticipation_set",
        queryset=MatchParticipation.objects.filter(
            match__requested_by__isnull=True,
            match__round__competition=comp,
        ),
        to_attr="match_participations",
    ),
)
# top 10 bots
relative_result = RelativeResult.with_row_number([x.bot.id for x in top10], comp)
try:
    sql, params = relative_result.query.sql_with_params()
except EmptyResultSet:
    # See https://code.djangoproject.com/ticket/26061
    pass
else:
    with connection.cursor() as cursor:
        cursor.execute(
            """
            SELECT bot_id as id, SUM("elo_change") trend FROM ({}) row_numbers
            WHERE "row_number" <= %s
            GROUP BY bot_id
            """.format(
                sql
            ),
            [*params, elo_trend_n_matches],
        )
        rows = cursor.fetchall()
    for participant in top10:
        participant.trend = next(iter([x[1] for x in rows if x[0] == participant.bot.id]), None)
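
For anyone following along, the raw SQL above boils down to "number each bot's rows from newest to oldest, then sum elo_change over the first N". Here is a minimal standalone sketch of that idea using only the stdlib sqlite3 module. The participation table, its columns, and ordering by id are assumptions for illustration, not the real aiarena schema or what RelativeResult.with_row_number generates.

```python
import sqlite3

# Hypothetical, simplified table; the real schema is different.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE participation ("
    "id INTEGER PRIMARY KEY, bot_id INTEGER, elo_change INTEGER)"
)
sample = [
    (1, 4), (1, -2), (1, 6), (1, 3),  # bot 1, inserted oldest first
    (2, -5), (2, 1), (2, 2),          # bot 2, inserted oldest first
]
conn.executemany(
    "INSERT INTO participation (bot_id, elo_change) VALUES (?, ?)", sample
)

n_matches = 2  # plays the role of elo_trend_n_matches

# Inner query: number each bot's rows, newest (highest id) first.
inner = """
    SELECT bot_id, elo_change,
           ROW_NUMBER() OVER (PARTITION BY bot_id ORDER BY id DESC) AS rn
    FROM participation
"""
# Outer query: keep only the newest n rows per bot and sum them.
query = f"""
    SELECT bot_id AS id, SUM(elo_change) AS trend FROM ({inner}) row_numbers
    WHERE rn <= ?
    GROUP BY bot_id
"""
trends = dict(conn.execute(query, (n_matches,)).fetchall())
print(trends)
```

Window functions need SQLite 3.25 or newer, which the sketch assumes.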

I tried to get rid of the raw SQL and arrived at this code, which produced significantly faster queries, but the trend numbers it gave weren't correct:

top10 = Ladders.get_competition_ranked_participants(comp, amount=10).prefetch_related(
    Prefetch("bot", queryset=Bot.objects.all().only("user_id", "name")),
    Prefetch("bot__user", queryset=User.objects.all().only("patreon_level")),
    Prefetch(
        "bot__matchparticipation_set",
        queryset=MatchParticipation.objects.filter(
            elo_change__isnull=False,
            match__requested_by__isnull=True,
            match__round__competition=comp,
        ),
        to_attr="match_participations",
    ),
)


for participant in top10:
    participant.trend = sum(participation.elo_change for participation in participant.bot.match_participations[:30])
    print("TREND: ", participant.bot, participant.trend)

I've added some indexes to help increase the speed of queries.

@ipeterov
It appears you were quite close with your code above.
It just needed an ordering clause on your MatchParticipation prefetch.

I've added the change here: #649
Note my comment regarding performance - do you mind testing it yourself to see?
You might want to cherry pick it across to the main branch though.
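
To illustrate why the ordering matters, here is a small pure-Python sketch. The Participation class and its match_id field are hypothetical stand-ins (assuming a higher match_id means a more recent match); they are not the real model. Slicing an unordered list takes whichever rows happen to come first, while sorting newest-first before slicing gives the intended trend.

```python
from dataclasses import dataclass


@dataclass
class Participation:
    """Hypothetical stand-in for a match participation row."""
    match_id: int    # assumption: higher id means a more recent match
    elo_change: int


def trend(participations, n):
    """Sum elo_change over the n most recent participations."""
    latest_first = sorted(participations, key=lambda p: p.match_id, reverse=True)
    return sum(p.elo_change for p in latest_first[:n])


# Rows in arbitrary order, as an unordered query may return them.
unordered = [
    Participation(2, -4),
    Participation(5, 7),
    Participation(1, 10),
    Participation(4, 3),
    Participation(3, -1),
]

naive = sum(p.elo_change for p in unordered[:2])  # first two rows as returned
correct = trend(unordered, 2)                     # two most recent matches
print(naive, correct)
```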

It looks like the problem is that we're actually prefetching thousands of match participations for each participant, instead of the latest 30. It's a known problem, and for a long time the only solution was to use a subquery in the prefetch queryset. I tried that, but for some reason it takes like 10 minutes to complete the query with that approach.

But Django actually added native support for slicing prefetch querysets in 4.2; here's how that can be used.

I think I'll try to upgrade Django to 4.2 and see if the libraries we're using allow that.

I upgraded Django to 4.2, along with django-constance, django-discord-bind, and django-robots, so that they support Django 4.

The result is that https://aiarena-test.net opens in about 1 second, a significant improvement over the 10-12 seconds we were getting before, and even over the current aiarena.net, which averages about 6 seconds.

I checked that the new trends are correct by comparing them with the trends calculated the old way (raw SQL) and making sure they match. I used the old production backup for this purpose.