HTTPArchive / almanac.httparchive.org

HTTP Archive's annual "State of the Web" report made by the web community

Home Page: https://almanac.httparchive.org

Performance of generate script

tunetheweb opened this issue

We have a Node.js npm run generate command, which generates all the chapters, then the featured quotes (which requires all the chapters to have been processed first), and then the ebook.

In GitHub Actions we also test and export every web page, saving the fully rendered HTML (as opposed to the Jinja2 HTML templates produced by the generate script, which are not complete HTML because they still include Jinja functions), and then lint the result.

Finally, we run Lighthouse on any changed chapters, plus a few base chapters (in case we change core things like CSS or Python).

This is all fantastic and automated and gives us real confidence in merging PRs. It also allows PR authors to address a lot of issues before maintainers even get there.

However, this is starting to get slow as more translations and more years are added.

In GitHub Actions it typically takes 9 minutes to run the Test Web Site action, including:

  • 1 minute 30 seconds to download the GitHub super linter image for linting
  • 2 minutes 30 seconds to generate and test the website
  • 2 minutes 30 seconds to lint the generated HTML (this is done for the WHOLE site)
  • 1 minute 30 seconds to run the Lighthouse tests
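As a quick sanity check on those numbers, here is a small sketch that adds up the itemised steps and compares them with the 9 minute total. The step labels are just shorthand for the bullets above; the remainder is presumably checkout, dependency installs and other setup, but that part is an assumption rather than something measured.

```javascript
// Durations in seconds, copied from the breakdown above.
const steps = {
  'download super linter image': 90,
  'generate and test website': 150,
  'lint generated HTML': 150,
  'run Lighthouse tests': 90,
};

// Typical total run time of the Test Web Site action (9 minutes).
const totalSeconds = 9 * 60;

// Sum the itemised steps and work out what is left over.
const itemisedSeconds = Object.values(steps).reduce((sum, s) => sum + s, 0);
const unaccountedSeconds = totalSeconds - itemisedSeconds;

console.log(`Itemised: ${itemisedSeconds / 60} min`);
console.log(`Not itemised: ${unaccountedSeconds / 60} min`);
```

So the four listed steps account for 8 of the 9 minutes, with linting (image download plus the lint itself) making up 4 of them.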

Now we do have the ability to generate a single chapter (or subset of the site):

npm run generate en/2021/pwa

We can also automatically regenerate a chapter on save:

npm run watch

However, some contributors may not realise this. Also, both of those options skip the featured quotes on the year home page, which are useful for people to see.

Should we look at improving the performance or is it good enough? Most people use it through GitHub Actions and probably don't really notice/care that it's slow. But for those running it locally it's more of an issue/annoyance.

Ideas:

  • Parallelise the years in our JavaScript script
  • Add featured quote regeneration to the chapter-only and watch scripts
  • Move HTML linting to separate GitHub Actions (but so much is repeated!)
  • Move Lighthouse to separate GitHub Actions (but so much is repeated!)
  • Remove/Consolidate some of our dependencies (JSDom, Prettier, Showdown, SmartyPants)
  • Move to a new tech stack for the SSG part (11ty?)
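To make the first idea a bit more concrete, here is a rough sketch of what running the years concurrently could look like. This is not our actual generate script: generateYear and generateAllYears are hypothetical names, and the timer just stands in for the real per-year work. It also assumes the per-year work is mostly asynchronous I/O; if it turns out to be CPU-bound (Markdown parsing, Prettier), Promise.all alone won't help much and we would need something like worker threads instead.

```javascript
// Hypothetical stand-in for the real per-year generation step.
async function generateYear(year) {
  // Simulate asynchronous, I/O-bound work (reading Markdown, writing templates).
  await new Promise((resolve) => setTimeout(resolve, 10));
  return `${year}: generated`;
}

// Start every year at once and wait for all of them to finish.
// Promise.all keeps the results in the same order as the input array.
async function generateAllYears(years) {
  return Promise.all(years.map((year) => generateYear(year)));
}

// Usage: the featured quotes and ebook steps would run after this resolves,
// since they need every chapter to have been processed first.
generateAllYears([2019, 2020, 2021]).then((results) => {
  results.forEach((line) => console.log(line));
});
```

The featured quotes step still has to wait for all years, so this only speeds up the chapter generation part of the run.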