Markdown.ToHtml takes more than two minutes to complete when processing the 32K file

Metalnem opened this issue 5 years ago

Nemanja Mijailovic commented 5 years ago

Markdown.ToHtml takes more than two minutes to complete when processing the 32K file from the attached archive. You can reproduce this by running the following program and passing it the path to the extracted file as a command line argument:

using System.IO;

namespace Markdig.Run
{
  public class Program
  {
    public static void Main(string[] args)
    {
      var text = File.ReadAllText(args[0]);
      var pipeline = new MarkdownPipelineBuilder().UseAdvancedExtensions().Build();
      Markdown.ToHtml(text, pipeline);
    }
  }
}

I'm using Markdig 0.15.7 and .NET Core 2.2.103.

Found via SharpFuzz.

Miha Zupan · Answer 1 · Sat Feb 09 2019 20:03:41 GMT+0800 (China Standard Time)

This does not appear to be an issue on the master branch, although there is a slight difference in output:
https://gist.github.com/MihaZupan/a5ce072fffec28315cbdcfd779b203e9/revisions#diff-20e4e18dfae3d1b38df3bce24863b74dR3
See the end of the huge block.

Miha Zupan · Answer 2 · Sat Feb 09 2019 20:11:05 GMT+0800 (China Standard Time)

I believe this was addressed by the changes to the Abbreviation parser.

On 0.15.7, this loop seems to iterate over every character.

EDIT:

Said loop iterates over each character in both versions; the master branch is just faster because of this check.

Basically, both versions loop over the remainder of the string for each character, which is O(n^2), but the 0.15.7 version also makes a TextMatcher.TryMatch call on each of those inner iterations.
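
To make the difference between the two loop shapes concrete, here is a minimal sketch. It is not Markdig's code: QuadraticScanSketch, ExpensiveMatch, and both counting methods are hypothetical names, and the first-character guard is only an assumed stand-in for whatever the check on master actually tests. It counts how often the costly call runs with and without a cheap guard in front of it.

```csharp
using System;

// Hypothetical illustration only; none of these names come from Markdig.
public static class QuadraticScanSketch
{
    // Stand-in for a costly per-position matcher call.
    // Compares the candidate against the text starting at the given offset.
    private static bool ExpensiveMatch(string text, int offset, string candidate)
    {
        return string.CompareOrdinal(text, offset, candidate, 0, candidate.Length) == 0;
    }

    // Shape A: the costly call runs on every inner iteration.
    public static long CountExpensiveCallsUnguarded(string text, string candidate)
    {
        long calls = 0;
        for (int i = 0; i < text.Length; i++)
        {
            for (int j = i; j < text.Length; j++)
            {
                calls++;
                ExpensiveMatch(text, j, candidate);
            }
        }
        return calls;
    }

    // Shape B: the same nested loops, but a cheap first-character check
    // (an assumed guard; candidate must be non-empty) skips most costly calls.
    public static long CountExpensiveCallsGuarded(string text, string candidate)
    {
        long calls = 0;
        for (int i = 0; i < text.Length; i++)
        {
            for (int j = i; j < text.Length; j++)
            {
                if (text[j] != candidate[0])
                {
                    continue;
                }
                calls++;
                ExpensiveMatch(text, j, candidate);
            }
        }
        return calls;
    }
}
```

Both shapes still perform n(n+1)/2 inner iterations, so the guard only shrinks the constant factor. For an input of about 32,000 characters that is on the order of 5 × 10^8 iterations either way, which is why the quadratic behaviour remains even when the guarded version feels fast.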

Nemanja Mijailovic · Answer 3 · Sat Feb 09 2019 23:02:32 GMT+0800 (China Standard Time)

You are right, the master branch works fine. Do you want me to close this issue, or is there something that can be done about the O(n^2) complexity?