Sicos1977 / ChromiumHtmlToPdf

Convert HTML to PDF with a Chromium based browser

Feature to enable JavaScript to resolve 'window.chrome.webview.hostObjects.printPagesCount'

sangeethnandakumar opened this issue

I'm using ChromeHtmlToPdf to convert HTML to PDF.
Everything works well, but I can't get it to put page numbers in the footer correctly.

Here's my footer, where I need the page number filled in dynamically, for example 'Page 1 of 45':

<footer id="pdf-footer">
  <div class="container">
    <div class="left">A123</div>
    <div class="center">Adobe Systems</div>
    <div class="right">Page <span class="page-number"></span></div>
  </div>
</footer>

First I tried the CSS approach. I added this CSS, but it had no effect:

@page {
  @bottom-center {
    content: "Page " counter(page) " of " counter(pages);
  }
}

.page-number {
  counter-increment: page;
}

Then I tried a JavaScript approach. I'm not sure whether this library supports JavaScript execution, but I hope it does.

This is what I tried, but it isn't working either:

<script>
  window.onload = function() {
    var pdfFooter = document.getElementById('pdf-footer');
    var pageCount = 0;
    var footerInterval = setInterval(function() {
      pageCount++;
      if (pageCount === window.chrome.webview.hostObjects.printPagesCount) {
        clearInterval(footerInterval);
      }
      var pageNumber = document.createElement('span');
      pageNumber.innerHTML = pageCount;
      pdfFooter.querySelector('.page-number').appendChild(pageNumber);
    }, 100);
  }
</script>

On further research, somebody told me I need to add the flag --enable-print-browser for this to work.

I tried to pass flags like this, but I can't find documentation on how to do it:

public static void Generate(string html, string pdfPath)
{
    var chromeOptions = new ChromeHtmlToPdfLib.Settings.ChromeOptions
    {
        BrowserArgs = new string[] { "--enable-print-browser" }
    };

    using (var converter = new ChromeHtmlToPdfLib.Converter(chromeOptions))
    {
        // Set the window size in pixels to achieve a DPI of 300
        converter.SetWindowSize(2550, 3300); // 8.5 x 11 inches * 300 pixels per inch
        converter.ConvertToPdf(html, pdfPath, new ChromeHtmlToPdfLib.Settings.PageSettings
        {
            PrintBackground = true,
            PaperWidth = 8.5, // US Letter width in inches (A4 would be about 8.27)
            PaperHeight = 11, // US Letter height in inches (A4 would be about 11.69)
            Scale = 1,
        });
    }
}
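
As a side note on the numbers in the snippet above: they are plain inch-to-pixel arithmetic, and 8.5 x 11 inches is the US Letter size, while A4 is 210 x 297 mm. A minimal sketch of the conversion follows; `ToPixels` and `MmToInches` are hypothetical helpers written for illustration and are not part of ChromeHtmlToPdfLib.

```csharp
using System;

// Hypothetical helper (not part of ChromeHtmlToPdfLib): converts a paper
// dimension in inches to pixels at the given DPI.
static int ToPixels(double inches, int dpi) => (int)Math.Round(inches * dpi);

// Hypothetical helper: converts millimetres to inches (1 inch = 25.4 mm).
static double MmToInches(double mm) => mm / 25.4;

// US Letter at 300 DPI: the window size used in the snippet above.
Console.WriteLine($"Letter: {ToPixels(8.5, 300)} x {ToPixels(11, 300)} px");

// A4 is 210 x 297 mm, which is roughly 8.27 x 11.69 inches.
double a4Width = MmToInches(210);
double a4Height = MmToInches(297);
Console.WriteLine($"A4: {ToPixels(a4Width, 300)} x {ToPixels(a4Height, 300)} px");
```

If A4 output is what you want, the A4 values would go into the paper width and height instead of 8.5 and 11.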

I still can't find a way to make dynamic page numbers work with this library. Given all its other features, I hope it supports this and I'm just doing something wrong.

@Sicos1977, how can I achieve this with ChromeHtmlToPdf?
Is automatic page numbering such as 'Page 1 of 45' possible, and if so, how do I do it with this library?

commented

I'm not a JavaScript expert myself. This library uses Chrome to convert HTML to PDF, so if you get it to work in Chrome, it should also work with this library. If you want to inject JavaScript into the page, you can use this property:

        /// <summary>
        ///     Runs the given javascript after the webpage has been loaded and before it is converted
        ///     to PDF
        /// </summary>
        public string RunJavascript { get; set; }

Set it before calling the ConvertToPdf method.
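
A rough sketch of how that could look. It assumes `RunJavascript` is a property on the `Converter` instance and reuses the `ConvertToPdf(html, stream, pageSettings)` call shape that appears elsewhere in this thread. `BuildSetTextScript`, the element id `generated-on`, and the date value are made up for illustration. Note that injected script runs in the page, so it cannot know the final PDF page count.

```csharp
using System.IO;
using System.Text.Json;
using ChromeHtmlToPdfLib;
using ChromeHtmlToPdfLib.Settings;

// Hypothetical helper: builds a script that writes text into the element with
// the given id. JsonSerializer.Serialize turns each value into a quoted,
// escaped literal that is also a valid JavaScript string.
static string BuildSetTextScript(string elementId, string text) =>
    $"document.getElementById({JsonSerializer.Serialize(elementId)})" +
    $".textContent = {JsonSerializer.Serialize(text)};";

// Assumption: RunJavascript lives on Converter, as the comment above suggests.
static void ConvertWithScript(string html, string pdfPath)
{
    using var output = new FileStream(pdfPath, FileMode.Create, FileAccess.Write);
    using var converter = new Converter();

    // Runs after the page has loaded and before it is converted to PDF.
    converter.RunJavascript = BuildSetTextScript("generated-on", "2024-01-31");

    converter.ConvertToPdf(html, output,
        new PageSettings(ChromeHtmlToPdfLib.Enums.PaperFormat.A4));
}

// Usage (not executed here, since it needs Chrome installed):
// ConvertWithScript("<html><body><p id=\"generated-on\"></p></body></html>", "out.pdf");
```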

Thank you, Sicos.

After further research, I found that modern (post-2016) browsers don't support the CSS approach above.
It is a browser implementation limitation.

Also, this expression only resolves in embedded Chrome:

window.chrome.webview.hostObjects.printPagesCount

On top of that, for the JavaScript approach to work, the footer can't use position: fixed or position: absolute. Most footers need one of the two, which makes this challenging.

So for anybody facing the same issue: there's no definitive way to do it in the browser.

I'll close this ticket, as this is a browser limitation (CSS3 Paged Media) and not a problem with this beautiful library.

For anyone who lands here with the same problem: I fixed it by post-processing the PDF generated by ChromeHtmlToPdf with PDFsharp, which can cleanly stamp each page with its page number.
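
The post-processing step could look roughly like the sketch below. This is not the exact code used here; it assumes the PDFsharp package, an available "Arial" font (newer PDFsharp versions may need a font resolver on non-Windows systems), and an arbitrarily chosen footer position. `StampPageNumbers` and `FormatPageLabel` are hypothetical names.

```csharp
using PdfSharp.Drawing;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

// Hypothetical helper: the label drawn on each page.
static string FormatPageLabel(int page, int total) => $"Page {page} of {total}";

// Opens an existing PDF, draws "Page X of Y" near the bottom of every page,
// and saves the result to a new file.
static void StampPageNumbers(string inputPath, string outputPath)
{
    using PdfDocument document = PdfReader.Open(inputPath, PdfDocumentOpenMode.Modify);
    var font = new XFont("Arial", 9); // assumes this font can be resolved

    for (int i = 0; i < document.PageCount; i++)
    {
        PdfPage page = document.Pages[i];
        using XGraphics gfx = XGraphics.FromPdfPage(page);

        // A 20-point-high strip that starts 30 points above the bottom edge.
        var footerArea = new XRect(0, page.Height.Point - 30, page.Width.Point, 20);
        gfx.DrawString(FormatPageLabel(i + 1, document.PageCount), font,
                       XBrushes.Gray, footerArea, XStringFormats.Center);
    }

    document.Save(outputPath);
}

// Usage (not executed here, since it needs an existing PDF):
// StampPageNumbers("converted.pdf", "converted-numbered.pdf");
```

Because the total is read from `document.PageCount` after conversion, the "of Y" part is always correct, which is the piece the in-browser approaches could not provide.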

Using both libraries in tandem worked for me.
Anyway, thanks for this amazing library, @Sicos1977.

Let me point out a few features that would be very useful for my common business use cases:

  • I couldn't find a setter for the page kind. When I tried to set the page layout to A4, I was told I can't because it is a read-only property. As a workaround I set the height and width to match A4, but an enum of standard page sizes or kinds (A3, A4, Legal, Wide, etc.) would be far more useful.
  • Could we have more options to set page orientation, color, margins, zoom factor (scale), etc.? I'm not sure whether they already exist and I just couldn't find them. If they do, could you list all tweakable settings in the ReadMe as a table with a small example? That would help new adopters like me a lot.
  • Could you also explain in the ReadMe how to pass custom flags to Chrome when the library launches it, for example as an array?

Again, thanks for this amazing library!

commented

Page properties can be set through the PageSettings object (https://github.com/Sicos1977/ChromeHtmlToPdf/blob/master/ChromeHtmlToPdfLib/Settings/PageSettings.cs), which you can then pass into the ConvertToPdf method:

        /// <summary>
        ///     Converts the given <paramref name="inputUri" /> to PDF
        /// </summary>
        /// <param name="inputUri">The webpage to convert</param>
        /// <param name="outputStream">The output stream</param>
        /// <param name="pageSettings"><see cref="PageSettings"/></param>
        /// <param name="waitForWindowStatus">Wait until the javascript window.status has this value before
        ///     rendering the PDF</param>
        /// <param name="waitForWindowsStatusTimeout"></param>
        /// <param name="conversionTimeout">An conversion timeout in milliseconds, if the conversion fails
        ///     to finished in the set amount of time then an <see cref="ConversionTimedOutException"/> is raised</param>
        /// <param name="mediaLoadTimeout">When set a timeout will be started after the DomContentLoaded
        ///     event has fired. After a timeout the NavigateTo method will exit as if the page has been completely loaded</param>
        /// <param name="logger">When set then this will give a logging for each conversion. Use the logger
        ///     option in the constructor if you want one log for all conversions</param>
        /// <exception cref="ConversionTimedOutException">Raised when <paramref name="conversionTimeout"/> is set and the 
        /// conversion fails to finish in this amount of time</exception>
        /// <remarks>
        ///     When the property <see cref="CaptureSnapshot"/> has been set then the snapshot is saved to the
        ///     property <see cref="SnapshotStream"/>
        /// </remarks>
        public void ConvertToPdf(
            ConvertUri inputUri,
            Stream outputStream,
            PageSettings pageSettings,
            string waitForWindowStatus = "",
            int waitForWindowsStatusTimeout = 60000,
            int? conversionTimeout = null,
            int? mediaLoadTimeout = null,
            ILogger logger = null)

You can use the method below to pass any custom arguments to Chrome

        #region AddChromedArgument
        /// <summary>
        ///     Adds an extra conversion argument to the <see cref="DefaultChromeArguments" />
        /// </summary>
        /// <remarks>
        ///     This is a one time only default setting which can not be changed when doing multiple conversions.
        ///     Set this before doing any conversions. You can get all the set argument through the <see cref="DefaultChromeArguments"/> property
        /// </remarks>
        /// <param name="argument">The Chrome argument</param>
        public void AddChromeArgument(string argument)
        {
            if (IsChromeRunning)
                throw new ChromeException($"Chrome is already running, you need to set the argument '{argument}' before staring Chrome");

            if (string.IsNullOrWhiteSpace(argument))
                throw new ArgumentException("Argument is null, empty or white space");

            if (!_defaultChromeArgument.Contains(argument, StringComparison.CurrentCultureIgnoreCase))
            {
                WriteToLog($"Adding Chrome argument '{argument}'");
                _defaultChromeArgument.Add(argument);
            }
            else
                WriteToLog($"The Chrome argument '{argument}' has already been set, ignoring it");
        }

When I have some spare time I'll try to extend the readme with more examples.
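
A small usage sketch for the method above, passing several flags from an array. It relies only on `AddChromeArgument` and the `ConvertToPdf(html, stream, pageSettings)` call shape seen in this thread. `ConvertWithArguments` is a hypothetical wrapper, and `--lang=en-US` is just an illustrative Chromium switch.

```csharp
using System.Collections.Generic;
using System.IO;
using ChromeHtmlToPdfLib;
using ChromeHtmlToPdfLib.Settings;

// Illustrative list only; replace it with the switches you actually need.
string[] extraArguments = { "--lang=en-US" };

// Hypothetical wrapper: adds every argument, then converts the HTML.
static void ConvertWithArguments(string html, string pdfPath, IEnumerable<string> arguments)
{
    using var output = new FileStream(pdfPath, FileMode.Create, FileAccess.Write);
    using var converter = new Converter();

    // Add the arguments before the first conversion: AddChromeArgument throws
    // a ChromeException once Chrome is already running.
    foreach (var argument in arguments)
        converter.AddChromeArgument(argument);

    converter.ConvertToPdf(html, output,
        new PageSettings(ChromeHtmlToPdfLib.Enums.PaperFormat.A4));
}

// Usage (not executed here, since it needs Chrome installed):
// ConvertWithArguments("<html><body>Hello</body></html>", "out.pdf", extraArguments);
```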

Got it, that's how to pass flags.

Regarding PageSettings: yes, there is a PaperFormat property.
The issue I mentioned earlier is that the property only has a getter (read-only) and no setter.

The code below worked fine for me (tested in LINQPad):

void Main()
{
	var body = """
		<html>
		<head>
			<title>Experiment</title>
		</head>
		<body>
			<div style="page-break-after: always;">This is the first page of the body</div>
			<div>This is the body of the second page</div>
		</body>
		</html>
		""";
	//See https://chromedevtools.github.io/devtools-protocol/tot/Page/#method-printToPDF for explanation of CSS classes
	//that are substituted for you automatically, when used in the header/footer
	var header = """
		<div class="text center">Header Content (<span class="title"></span>)</div>
		""";
	var footer = """
		<div class="text center">Page <span class="pageNumber"></span> of <span class="totalPages"></span></div>
		""";

	var pdfMemoryStream = new MemoryStream();
	using (var converter = new ChromeHtmlToPdfLib.Converter())
	{
		//paperFormat is not needed in PageSettings, the default is A4
		converter.ConvertToPdf(body, pdfMemoryStream, new ChromeHtmlToPdfLib.Settings.PageSettings(/*paperFormat: ChromeHtmlToPdfLib.Enums.PaperFormat.A4*/)
		{
			DisplayHeaderFooter = true,
			HeaderTemplate = header,
			FooterTemplate = footer
		});		
	}
	
	pdfMemoryStream.Position = 0; //Reset stream position back to 0, so that we can write the contents of the stream to a file 
	using (FileStream fileStream = new FileStream(@"C:\test.pdf", FileMode.Create, FileAccess.Write))
	{
		// Copy the contents of the input stream to the file stream using a buffer
		byte[] buffer = new byte[8192];
		int bytesRead;
		while ((bytesRead = pdfMemoryStream.Read(buffer, 0, buffer.Length)) > 0)
		{
			fileStream.Write(buffer, 0, bytesRead);
		}
	}
}

commented

You need to set the paper format through the constructor; after that you can change anything you want on the settings object.

        /// <summary>
        /// Makes this object and sets all the settings to it's default values
        /// </summary>
        /// <remarks>
        /// Default paper settings are set to <see cref="Enums.PaperFormat.A4"/>
        /// </remarks>
        /// <param name="paperFormat"></param>
        public PageSettings(PaperFormat paperFormat)
        {
            ResetToDefaultSettings();
            PaperFormat = paperFormat;
            SetPaperFormat(paperFormat);
        }
        #endregion

You can use the pageNumber and totalPages CSS classes in the footer template:

string footer = @"
<div style=""color: lightgray; border-top: solid lightgray 1px; font-size: 10px; padding-top: 5px; text-align: center; width: 100%;"">
    <span>Page</span> <span class=""pageNumber""></span> <span>of</span> <span class=""totalPages""></span>
</div>";
string header = @"<div style=""font-size: 0px; width: 100%;""></div>";

PageSettings pageSettings = new(ChromeHtmlToPdfLib.Enums.PaperFormat.A4)
{
    PrintBackground = true,
    DisplayHeaderFooter= true,
    HeaderTemplate = header,
    FooterTemplate = footer,
    MarginBottom = 0.6,
    MarginTop = 0.6
};
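
If the footer wording needs to vary (for example per language), the template string can be built by a small function. The sketch below is only an illustration: `BuildFooterTemplate` is a hypothetical helper, and the only thing it relies on is that Chrome fills elements carrying the `pageNumber` and `totalPages` classes when it prints.

```csharp
using System;

// Hypothetical helper: builds a footer template for PageSettings.FooterTemplate.
// The label words are parameters so they can be localized.
static string BuildFooterTemplate(string pageWord = "Page", string ofWord = "of") =>
    "<div style=\"font-size: 10px; text-align: center; width: 100%;\">" +
    $"{pageWord} <span class=\"pageNumber\"></span> " +
    $"{ofWord} <span class=\"totalPages\"></span>" +
    "</div>";

Console.WriteLine(BuildFooterTemplate());
```

The returned string can be assigned to `FooterTemplate` in place of a hand-written one; every `span` is closed explicitly, which avoids malformed markup in the template.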
