medama-io / medama

Self-hostable, privacy-focused website analytics.

Home Page: https://oss.medama.io

Performance Insights

ayuhito opened this issue

Once this product is more stable, I think a page dedicated to performance monitoring (focused on Core Web Vitals and other SEO-related checks) would be very useful for users, but there are several approaches we need to consider.

Synthetic Testing

This is the least invasive approach: an automated browser such as Playwright or Puppeteer loads the page with all the necessary measurement scripts, and we then display the resulting stats. Running the test a few times and presenting the median values should filter out most outliers.
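As a rough sketch of the "run a few times, take the median" idea, the snippet below collapses several runs into one record. The `RunMetrics` shape and the `summariseRuns` name are assumptions made for illustration, not part of any existing Medama code, and the step that actually collects the numbers from a browser is left out.

```typescript
// Hypothetical shape of one synthetic run; field names are illustrative only.
interface RunMetrics {
  lcpMs: number;  // Largest Contentful Paint, in milliseconds
  cls: number;    // Cumulative Layout Shift, unitless
  ttfbMs: number; // Time to First Byte, in milliseconds
}

// Median of a non-empty list; averages the two middle values for even lengths.
function median(values: number[]): number {
  if (values.length === 0) {
    throw new Error("median of empty list");
  }
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Collapse several runs into one record by taking the per-metric median.
function summariseRuns(runs: RunMetrics[]): RunMetrics {
  return {
    lcpMs: median(runs.map((r) => r.lcpMs)),
    cls: median(runs.map((r) => r.cls)),
    ttfbMs: median(runs.map((r) => r.ttfbMs)),
  };
}
```

With three runs where one is a clear outlier (for example an LCP of 9000 ms next to 1200 ms and 1250 ms), the summary reports 1250 ms, whereas a mean would be dragged up to roughly 3800 ms.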

However, there are many limitations with this approach. A synthetic test doesn't account for different user devices and their performance, nor can it effectively test from multiple locations, which can hugely affect latency values.

A self-hosted instance can only monitor a website from where it is located, while the paid cloud offering can be designed to run tests from a multitude of preset locations. Additionally, since it is a server-intensive operation, it would be difficult to measure every page on a website or keep data up to date in real time. It would make sense to offer a cron schedule option, such as daily tests, and to rely on user-submitted URLs or sitemaps to decide which pages to test.
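To make the URL-selection part concrete, here is a minimal sketch that merges user-submitted URLs with the `<loc>` entries of a sitemap and caps the result at a per-run page budget. The function names and the `maxPages` budget are hypothetical, and the regex-based parsing is a simplification: it does not decode XML entities or follow sitemap index files.

```typescript
// Extract page URLs from the <loc> entries of a sitemap XML string.
// Simplified: no XML entity decoding, no handling of nested sitemap indexes.
function extractSitemapUrls(xml: string): string[] {
  const pattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  const urls: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(xml)) !== null) {
    urls.push(match[1]);
  }
  return urls;
}

// Build the list of pages for one scheduled run (e.g. a daily cron job).
// User-submitted URLs come first so they are kept when the budget is tight.
function planScheduledRun(
  submitted: string[],
  sitemapXml: string,
  maxPages: number,
): string[] {
  const plan: string[] = [];
  for (const url of submitted.concat(extractSitemapUrls(sitemapXml))) {
    if (plan.indexOf(url) === -1) {
      plan.push(url); // skip duplicates, preserve first-seen order
    }
  }
  return plan.slice(0, Math.max(0, maxPages));
}
```

Putting submitted URLs ahead of sitemap entries is one possible policy. Since each test is expensive, the budget keeps a large sitemap from turning one scheduled run into thousands of browser sessions.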

Real User Monitoring

This approach would add the additional ~1.5 kB Core Web Vitals script (definitely optimisable) to the visitor tracker. This would give us a lot more datapoints from various devices and locations, giving a more accurate picture of website performance. Collecting information on a per-page basis also becomes a lot easier as we aren’t running this on our servers.

The downside is that it would increase the size of the script (not very problematic since the script will be cached on subsequent loads) and can be a bit excessive in terms of data collection, even if we apply randomised sampling to reduce the number of datapoints. It’s a very subjective line to cross.
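The randomised sampling mentioned above could be as simple as the following sketch. The `shouldSample` name and the `sampleRate` parameter are assumptions for illustration. The random source is injectable only so that the behaviour can be tested deterministically.

```typescript
// Decide whether this page view should report performance data.
// `sampleRate` is the fraction of page views to keep, from 0 (none) to 1 (all).
function shouldSample(
  sampleRate: number,
  random: () => number = Math.random,
): boolean {
  if (sampleRate <= 0) return false; // sampling disabled
  if (sampleRate >= 1) return true;  // collect from every page view
  return random() < sampleRate;      // keep roughly sampleRate of page views
}
```

A tracker would call this once per page load and only collect and send vitals when it returns `true`. With a `sampleRate` of 0.1, about one in ten page views would contribute a datapoint.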

Conclusion

I think synthetic tests are a great feature to offer, one that at least provides useful baseline insights without increasing visitor tracking. Real user monitoring might be worth investigating as an opt-in feature, but it should only be considered once benchmarking has shown that our synthetic test approach is insufficient for addressing user needs.