GoogleChrome / rendertron

A Headless Chrome rendering solution

Home Page: https://render-tron.appspot.com/

How can I detect that the page is being loaded by Rendertron? Or how to tell Rendertron the page is done loading

vesper8 opened this issue

On a new site that I'm using Rendertron on, I'm seeing constant timeouts even though the important data loads pretty quickly. I believe what's probably happening is my ads are causing the browser to believe the page isn't done rendering.

There are two things I would like to do to solve this: either

  1. Be able to explicitly tell Rendertron that the page is "done loading" after my content APIs have returned their data

or

  2. Be able to detect that the page is being loaded through Rendertron so that I can disable some things, such as the ads

Unfortunately the documentation doesn't indicate how to do either of these.

I found some tips about how to do this with prerender.io:
https://stackoverflow.com/questions/43731659/how-to-detect-whether-request-is-from-prerender-iocrawlers-or-from-real-userb

But I can't find anything regarding Rendertron.

I believe that hiding ads may still be considered "cloaking", so I would really prefer to be able to somehow indicate to Rendertron that the page is done loading.

Any advice would be appreciated. Many thanks!

@vesper8 I have not tried any of these for blocking ads, but here are a few thoughts/options:

  1. What if you set a custom user-agent for rendertron and detect that in your website to do the needful?
  2. Assuming you are using something like Google Tag Manager to instrument ads, you can block loading of the GTM tag using the Rendertron restrictedUrlPattern config.
  3. Similar to 2 above, but you can instead block the requests/scripts that load the ads using the Rendertron restrictedUrlPattern config.
  4. This needs #643: you can add some JS code using the hooks from that PR that will let you identify or disable the ads on the page.
  5. Contributing a PR that will allow adding puppeteer plugins like https://github.com/berstend/puppeteer-extra/tree/master/packages/puppeteer-extra-plugin-adblocker
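
For options 2 and 3, here is a rough sketch of how one might build and sanity-check a restrictedUrlPattern value before putting it in the Rendertron config. The host names, the helper names (escapeRegExp, buildRestrictedUrlPattern, wouldBeBlocked) and the assumption that Rendertron treats the value as a regular expression matched against each request URL are mine, not something confirmed in this thread, so please verify against your Rendertron version.

```typescript
// Hypothetical ad/tag hosts; replace with the hosts your own page loads.
const blockedHosts: string[] = ["googletagmanager.com", "doubleclick.net"];

// Escape regex metacharacters so dots in host names match literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build one pattern string matching any URL on the listed hosts or their subdomains.
function buildRestrictedUrlPattern(hosts: string[]): string {
  const alternatives = hosts.map(escapeRegExp).join("|");
  return `^https?://([^/]+\\.)?(${alternatives})/`;
}

// Shape of the config fragment; only restrictedUrlPattern is shown here.
const rendertronConfig = {
  restrictedUrlPattern: buildRestrictedUrlPattern(blockedHosts),
};

// Local check that assumes Rendertron matches the pattern as a RegExp against the URL.
function wouldBeBlocked(url: string): boolean {
  return new RegExp(rendertronConfig.restrictedUrlPattern).test(url);
}
```

Running a few of your real ad and API URLs through wouldBeBlocked first is a cheap way to make sure the pattern does not accidentally block your own content requests.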

Thank you very much for the thoughtful and detailed reply! Those are definitely some good options that I hadn't considered, especially the restrictedUrlPattern, which wasn't on my radar yet. Thank you!

The current UA is "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/93.0.4577.0 Safari/537.36"

I suppose one could use "HeadlessChrome" to detect.
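
A minimal sketch of that check, assuming the user agent string is available (for example from navigator.userAgent in the page or from the User-Agent request header on the server). The function name isHeadlessChrome is made up for illustration. Note that this matches any headless Chrome, not only Rendertron.

```typescript
// Returns true when the user agent contains the "HeadlessChrome/<version>" token.
function isHeadlessChrome(userAgent: string | undefined): boolean {
  if (typeof userAgent !== "string") {
    return false;
  }
  return /HeadlessChrome\//.test(userAgent);
}

// The UA quoted above in this thread.
const rendertronUa =
  "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/93.0.4577.0 Safari/537.36";

// Flag a page could use to skip loading ads.
const skipAds: boolean = isHeadlessChrome(rendertronUa);
```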

One could also set the reqHeaders in the Config.
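
As a sketch of the reqHeaders idea: the header name X-Rendered-By and its value are hypothetical, and this assumes the headers configured there actually reach your server on the requests Rendertron makes. The server side then only needs to look for the marker.

```typescript
// Hypothetical marker header; any name/value you control would do.
const MARKER_HEADER = "x-rendered-by";
const MARKER_VALUE = "rendertron";

// Config fragment: only reqHeaders is shown.
const rendertronConfigFragment = {
  reqHeaders: { "X-Rendered-By": MARKER_VALUE },
};

// Header map in the shape Node-style servers expose (values may be repeated).
type HeaderMap = Record<string, string | string[] | undefined>;

// Case-insensitive lookup of the marker header in an incoming request.
function hasRendertronMarker(headers: HeaderMap): boolean {
  for (const name of Object.keys(headers)) {
    if (name.toLowerCase() !== MARKER_HEADER) {
      continue;
    }
    const value = headers[name];
    const values = Array.isArray(value) ? value : [value];
    return values.some((v) => v === MARKER_VALUE);
  }
  return false;
}
```

Unlike the user agent check, a custom header only identifies your own Rendertron instance, but anyone can send it, so it should not gate anything sensitive.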

In general, as we're deprecating the project, you should look into alternative approaches to rendering on the web.