Slicer / slicer_download

Source code of the site for downloading preview and stable application packages. This site is published at download.slicer.org and maintained by @Kitware on behalf of the 3D Slicer community.

Home Page: https://download.slicer.org


Investigate drop in download count associated with macOS

jcfr opened this issue

There is a big drop in macOS downloads, from about 140 downloads per day in April 2021 to about 60 per day in September 2021. Do you think that many fewer macOS users are really downloading Slicer, or is there some technical issue? macOS downloads in the US dropped in July 2021 from about 30 to about 13 and stayed at that level. Is it a coincidence that downloads with an empty version started around the same date? (The situation is similar in Western Europe.)

Originally reported by @lassoan on Jul 11, 2022 by email

macOS downloads:

image

Windows+Linux downloads:

image

As of this posting, there is definitely a large discrepancy between the macOS downloads counted on slicer-packages.kitware.com and those shown on the download stats page. I'm not sure how frequently the download stats page gets updated (maybe once a day?), but that would not explain this discrepancy.

Slicer 5.0.2 totals from:
https://slicer-packages.kitware.com/#collection/5f4474d0e1d8c75dfc70547e/folder/6286cc6ee8408647b39f7e46

Slicer 5.0.3 totals from:
https://slicer-packages.kitware.com/#collection/5f4474d0e1d8c75dfc70547e/folder/62cc513eaa08d161a31c1372

Slicer 5.0.x Download Stats Page Total from:
https://download.slicer.org/download-stats/ (reported values filtered by OS, with Stability: release and Version: 5.0)

| OS | Slicer 5.0.2 | Slicer 5.0.3 | Slicer 5.0.x Total | Slicer 5.0.x Download Stats Page Total |
|---|---|---|---|---|
| Linux | 2838 | 841 | 3679 | 3152 |
| macOS | 4276 | 1438 | 5117 | 1091 |
| Windows | 21266 | 3668 | 24934 | 23950 |

At the recent weekly hangout, we discussed making the download stats page count the number of times the macOS package was downloaded, rather than relying on the user agent of the device used to download a package. You could, for example, use Windows to download a macOS package and vice versa. For the user breakdown, what we care about is how many times a given package is downloaded, not which machine was used to download it.

Interesting. I would expect downloads per OS to be correlated with website sessions per OS, but I can't figure out how, or whether it is possible at all, to get the breakdown of per-OS sessions over time in Google Analytics. It is interesting that there seem to be many more sessions in total on Mac than on Linux, but the download stats do not reflect that.

image

I don't have much experience with Google Analytics, but the difference that @jamesobutler found between the actual Slicer 5.0.x total downloads and the totals left after the filtering used by the download statistics is quite telling.

The filtering discarded 79% of the macOS downloads, but only 14% of the Linux downloads and 4% of the Windows downloads.
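As a quick check, these percentages can be reproduced from the table above. This is a minimal sketch that uses the "Slicer 5.0.x Total" and "Download Stats Page Total" columns exactly as reported; the variable and function names are only illustrative.

```python
# (package downloads total, download stats page total), as reported in the table above
totals = {
    "Linux": (3679, 3152),
    "macOS": (5117, 1091),
    "Windows": (24934, 23950),
}


def discarded_percent(package_total, stats_total):
    """Share of package downloads missing from the stats page, in percent."""
    return 100.0 * (package_total - stats_total) / package_total


# Round to whole percents, matching the precision quoted in the thread
discarded = {
    os_name: round(discarded_percent(*counts))
    for os_name, counts in totals.items()
}

for os_name, pct in discarded.items():
    print(f"{os_name}: {pct}% discarded")
```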

I think this is probably a case of the user agent information changing and breaking the user agent sniffing done by the Slicer download stats, while Google Analytics still gets the information correctly. The timeline matches the various user agent reduction efforts intended to reduce passive fingerprinting across browsers, as well as the user agent changes made when macOS 10.15 naming switched to macOS 11, which resulted in the reported version being capped.

This makes sense. To address this, the plan is to:

  • update the user-agents package to its latest version. See https://pypi.org/project/user-agents
  • rely on the platform associated with the downloaded package to identify the platform, and rely on the user agent only to check whether the download attempt should be discarded (bot/crawler vs. legitimate download)
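The second point of the plan could look roughly like the sketch below. It is not the actual slicer_download implementation: the filename convention (e.g. `Slicer-5.0.3-macosx-amd64.dmg`), the bot markers, and all function names are assumptions made for illustration, and it uses only the standard library rather than the user-agents package.

```python
import re

# Assumed filename convention: "<App>-<version>-<os>-<arch>.<ext>"
PACKAGE_OS_PATTERN = re.compile(r"-(linux|macosx|win)-", re.IGNORECASE)
OS_LABELS = {"linux": "Linux", "macosx": "macOS", "win": "Windows"}

# Illustrative markers only; a maintained user agent parser would do this better
BOT_MARKERS = ("bot", "crawler", "spider")


def platform_from_package(filename):
    """Derive the platform from the downloaded package name, not the client."""
    match = PACKAGE_OS_PATTERN.search(filename)
    return OS_LABELS[match.group(1).lower()] if match else None


def is_probable_bot(user_agent):
    """Use the user agent only to decide whether to discard the download."""
    ua = (user_agent or "").lower()
    return any(marker in ua for marker in BOT_MARKERS)


def classify_download(filename, user_agent):
    """Return the platform to count, or None if the download is discarded."""
    if is_probable_bot(user_agent):
        return None
    return platform_from_package(filename)


# A macOS package fetched from a Windows browser still counts as macOS
windows_ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
print(classify_download("Slicer-5.0.3-macosx-amd64.dmg", windows_ua))
```

With this split, a change in how browsers report the operating system no longer affects the per-platform counts; it can only affect which requests are filtered out as bots.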

@jcfr Have you had time to look at this? It would be important to know if there is a real drop or just a bug in the statistics computation.

The issue still persists:

image

After implementing the following changes:

The stats seem to include the missing macOS downloads:

Before After
image image
image image

Additional comparison based on data from 2022.10.19:

| | Before | After | Difference |
|---|---|---|---|
| All | 1,132,388 | 1,180,023 | +47,635 |
| Linux | 138,881 | 141,521 | +2,640 |
| macOS | 184,779 | 219,565 | +34,786 |
| Windows | 808,728 | 818,937 | +10,209 |
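To see how the recovered downloads are distributed, the per-platform differences can be expressed as a share of the overall difference. This is a small sketch using only the numbers from the comparison above; the variable names are illustrative.

```python
# Download counts from the 2022.10.19 comparison above
before = {"Linux": 138_881, "macOS": 184_779, "Windows": 808_728}
after = {"Linux": 141_521, "macOS": 219_565, "Windows": 818_937}

# Downloads that the fix brought back into the statistics, per platform
recovered = {os_name: after[os_name] - before[os_name] for os_name in before}
total_recovered = sum(recovered.values())

# Each platform's share of the recovered downloads, in percent
share = {
    os_name: 100.0 * count / total_recovered
    for os_name, count in recovered.items()
}

for os_name in recovered:
    print(f"{os_name}: +{recovered[os_name]:,} ({share[os_name]:.1f}% of recovered)")
```

Most of the recovered downloads are macOS downloads, which is consistent with the filtering having affected macOS far more than the other platforms.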

Awesome, thanks for the update. It is great that it is confirmed to be a data collection issue and not an actual dropout of users.

We can close the issue when the production server is updated with the fixes.

@jcfr Is the production server going to be updated soon? I see at https://download.slicer.org/download-stats/ that macOS downloads are still down.

@jcfr The fix was ready one year ago, but it seems that the production server has still not been updated. Is there anything we could do to help get the download reporting fixed?

macOS download stats are kind of meaningless now:

image

I was planning to address this today so that stats are relevant before RSNA.

Awesome, thank you! What a coincidence...