xoofx / markdig

A fast, powerful, CommonMark compliant, extensible Markdown processor for .NET

Markdown.ToHtml takes more than two minutes to complete when processing the 32K file

Metalnem opened this issue

Markdown.ToHtml takes more than two minutes to complete when processing the 32K file from the attached archive. You can reproduce this by running the following program and passing it the path to the extracted file as a command line argument:

using System.IO;

namespace Markdig.Run
{
  public class Program
  {
    public static void Main(string[] args)
    {
      var text = File.ReadAllText(args[0]);
      var pipeline = new MarkdownPipelineBuilder().UseAdvancedExtensions().Build();
      Markdown.ToHtml(text, pipeline);
    }
  }
}

I'm using Markdig 0.15.7 and .NET Core 2.2.103.

Found via SharpFuzz.

This does not appear to be an issue on the master branch. There is a slight difference in output, though; see the end of the huge block:
https://gist.github.com/MihaZupan/a5ce072fffec28315cbdcfd779b203e9/revisions#diff-20e4e18dfae3d1b38df3bce24863b74dR3

I believe this was addressed by the changes to the Abbreviation parser.

On 0.15.7, this loop seems to iterate over every character.

EDIT:

Said loop iterates over each character in both versions; the master branch is just faster because of this check.

Basically, for each character, both versions loop over the remainder of the string, which is O(n^2), but the 0.15.7 version also makes a TextMatcher.TryMatch call on each of those inner iterations.
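To make the difference concrete, here is a rough sketch of the two loop shapes described above. It is not Markdig's actual code: `QuadraticScanSketch`, `ExpensiveMatch`, and the first-character guard are all hypothetical stand-ins, and the guard is only an assumed example of a cheap pre-check, not the real check on master. Both variants take the same number of inner steps, n(n+1)/2 for an input of length n, but only the first one pays for a match call on every step.

```csharp
using System;

public static class QuadraticScanSketch
{
    // Hypothetical stand-in for a costly per-position match.
    // This is NOT Markdig's TextMatcher.TryMatch.
    private static bool ExpensiveMatch(string text, int offset, string candidate)
    {
        return string.CompareOrdinal(text, offset, candidate, 0, candidate.Length) == 0;
    }

    // Shape attributed to 0.15.7 in this thread: for every start position,
    // walk the rest of the string and attempt a match at every position.
    public static (long innerSteps, long matchCalls) ScanAlwaysMatching(string text, string candidate)
    {
        long innerSteps = 0, matchCalls = 0;
        for (int start = 0; start < text.Length; start++)
        {
            for (int i = start; i < text.Length; i++)
            {
                innerSteps++;
                matchCalls++;
                ExpensiveMatch(text, i, candidate);
            }
        }
        return (innerSteps, matchCalls);
    }

    // Same nested walk, but a cheap guard (assumed here: first character
    // equality) filters out most positions before the costly call.
    public static (long innerSteps, long matchCalls) ScanWithCheapCheck(string text, string candidate)
    {
        long innerSteps = 0, matchCalls = 0;
        for (int start = 0; start < text.Length; start++)
        {
            for (int i = start; i < text.Length; i++)
            {
                innerSteps++;
                if (candidate.Length > 0 && text[i] == candidate[0])
                {
                    matchCalls++;
                    ExpensiveMatch(text, i, candidate);
                }
            }
        }
        return (innerSteps, matchCalls);
    }
}
```

The guard lowers the constant factor only. The inner step count stays quadratic in both variants, which is why the complexity question still stands even though master is fast on this input.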

You are right, master works fine. Do you want me to close this issue, or is there something that can be done about the O(n^2) complexity?