fathyb / html2svg

Convert HTML and `<canvas>` to SVG, PDF, or images using Chromium

incredible work

cjroebuck opened this issue

Hey, I've just read the blog post and found the repo. Amazing work! How long did it take you to learn and understand the Chromium source code? I learned so much just reading through your post, so thank you!

Really appreciate your kind words, thanks!

I started working on the WebKit codebase around 12 years ago. Back then I was doing digital forensics and my job was to find exploits on mobile devices. After that I moved to a digital photography company and built a photo editor using NW.js. I started doing some work around the shell, for example by implementing vibrancy on macOS. You can see an early version of this editor in this issue: nwjs/nw.js#2817 (comment)

Later on I joined an AR company where I worked on a cross-platform AR runtime based on a fork of Electron.js. Our customers were mostly using enterprise-oriented Windows tablets, so we greatly benefited from the GPU compatibility that ANGLE provides.

I stopped working on Chromium for a while after joining Segment around 4 years ago, and started again around the start of this year during my free time because I kinda missed it!

Very cool! You’re clearly very talented and know your way around the chromium source code very well.

I do have one question which is kind of related to this project and since you know chromium so well I thought it may be worth asking you about.

The reason I was interested in your blog post (I saw it on Reddit) is that I’m using headless Chromium to generate PDFs from HTML via Puppeteer, which I imagine uses similar code paths in Chromium to the ones you’ve patched for this project, especially the Skia parts.

The problem I’m having is that for some HTML, the PDF output ignores certain media queries. Here’s an example issue: puppeteer/puppeteer#6974. There are quite a few more, all related to this bug, which I believe is in Chromium rather than Puppeteer.

I haven’t had a chance to try out html2svg yet but I’d be interested to know if it also has the same issue when it comes to ‘ignoring’ media queries.

I’m wondering if you could explain where in the Chromium source code the print-to-PDF logic lives, especially any part that deals with media queries, so that I can understand this behaviour / bug.

Thank you 🙏🏼

I suspect that Puppeteer is using the standard printing method, which re-evaluates media queries (@media print for example). It is equivalent to opening the print dialog and picking "Export to PDF". This does a few things, but roughly:

  1. chromium/components/printing does the preparation, mostly in print_render_frame_helper.cc:
    a. take a DOM snapshot of the page (aka "preview document")
    b. re-evaluate CSS media queries for printing <-- likely your problem
    c. apply a few rendering adjustments (typeface, spacing units) based on the output type (PDF or printer)
  2. chromium/printing does the actual export/printing
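
If the re-evaluation in step 1b is the cause, one thing to try from the Puppeteer side is forcing the `screen` media type before printing. The sketch below assumes Puppeteer's `Page.emulateMediaType` and `Page.pdf` methods. `PrintablePage` and `pdfWithScreenMedia` are hypothetical names introduced here so the example stays self-contained and can be exercised without launching a browser. It only changes the media type, so whether it helps with width-based queries such as max-width is untested here.

```typescript
// Minimal subset of a Puppeteer page used by this sketch. The name
// PrintablePage is hypothetical; a real Puppeteer Page is assumed to
// satisfy it through its emulateMediaType() and pdf() methods.
interface PrintablePage {
  emulateMediaType(type?: string): Promise<void>;
  pdf(options?: { path?: string; printBackground?: boolean }): Promise<unknown>;
}

// Hypothetical helper (not part of Puppeteer or html2svg): ask the page to
// evaluate CSS as `screen` instead of `print`, then export it to a PDF file.
async function pdfWithScreenMedia(
  page: PrintablePage,
  path: string
): Promise<void> {
  // Must happen before pdf(), otherwise @media print rules still apply.
  await page.emulateMediaType("screen");
  // printBackground keeps background colors and images in the export.
  await page.pdf({ path, printBackground: true });
}
```

With a real browser, the `page` argument would come from `puppeteer.launch()` followed by `browser.newPage()` and `page.goto(...)`.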

html2svg uses a graphical snapshot, which should produce the same output a user sees when viewing the page, and should respond to media queries such as max-width.

If you have a page example, I can run a quick export and share it with you to see if it produces the expected output. A Docker image should be available by the end of today.


(I'm closing this issue but feel free to keep the discussion going!)