java-diff-utils / java-diff-utils

Diff Utils is an open-source library for comparing texts or other kinds of data: computing diffs, applying patches, generating and parsing unified diffs, producing diff output for later display (such as a side-by-side view), and so on.

Home Page: https://java-diff-utils.github.io/java-diff-utils/

Performance Very Poor Compared to Unix Diff Tool

DaGeRe opened this issue

Describe the bug
I would expect an implementation of the Myers diff algorithm to perform (at least asymptotically) the same as the Unix diff tool. Unfortunately, this is not the case for this implementation: starting at roughly 100k compared lines, the time consumption grows heavily (at least more than linearly).

You can see this in the following graph:

[graph image]

To Reproduce
Steps to reproduce the behavior:

  1. Use https://github.com/DaGeRe/diff-benchmark (it contains 414ae6.zip and afdedc.zip, which hold the data to compare)
  2. Run [runJavaDiff.sh](https://github.com/DaGeRe/diff-benchmark/blob/main/java-difflib-benchmark/runJavaDiff.sh) and runBashDiff.sh to obtain the measurement values.
  3. Copy the measurement data and execute plot.plt to obtain the graph.

Expected behavior
The duration of the diff execution should grow linearly with the number of analyzed lines.
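
One rough way to check this expectation without plotting is to time the diff at doubling input sizes and look at the ratio between consecutive runs: a ratio near 2 suggests linear growth, a ratio near 4 suggests quadratic growth. The sketch below is only an illustration. The class `GrowthCheck`, its helper methods, and the synthetic data are assumptions and not part of the benchmark repository, and the placeholder lambda has to be replaced by the diff call under test. Single runs without JVM warm-up are noisy, so the numbers are indicative at best.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

public class GrowthCheck {
    /** Builds n synthetic lines; every step-th line differs when changed is true. */
    public static List<String> lines(int n, int step, boolean changed) {
        List<String> out = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            out.add(changed && i % step == 0 ? "changed " + i : "line " + i);
        }
        return out;
    }

    /** Times one call of diff per input size and returns the elapsed nanoseconds. */
    public static long[] measure(int[] sizes, int step,
                                 BiConsumer<List<String>, List<String>> diff) {
        long[] nanos = new long[sizes.length];
        for (int i = 0; i < sizes.length; i++) {
            List<String> a = lines(sizes[i], step, false);
            List<String> b = lines(sizes[i], step, true);
            long start = System.nanoTime();
            diff.accept(a, b);
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) {
        int[] sizes = {25_000, 50_000, 100_000, 200_000};
        // Placeholder only: swap in the diff implementation to be measured.
        long[] t = measure(sizes, 100, (a, b) -> a.equals(b));
        for (int i = 1; i < sizes.length; i++) {
            System.out.printf("%d -> %d lines: time x%.2f%n",
                    sizes[i - 1], sizes[i], (double) t[i] / t[i - 1]);
        }
    }
}
```

Because every `step`-th line is changed, the number of differences grows together with the input size, which is the situation where a diff algorithm's running time is most likely to rise faster than linearly.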

System

  • Java version 1.8
  • Version 4.11

Did you use the new Myers algorithm implementation (#134)? Since the Unix tools use the linear-space variant proposed in Myers' paper, which brings a significant performance gain, I am surprised that DiffUtils does not perform even worse. The default implementation in DiffUtils uses the unimproved version from Myers' paper. Look into that issue and try the new implementation. It is not yet the default algorithm that DiffUtils uses.
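
For context on why the running time depends on more than the line count: the basic greedy algorithm from Myers' paper runs in O((N+M)·D) time, where D is the number of insertions and deletions in the shortest edit script. It is therefore close to linear only while the inputs differ in few places. The following is a minimal sketch of that basic algorithm, computing only the edit script length. It is not the java-diff-utils code, and the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class MyersSketch {
    /** Length of the shortest edit script (inserts plus deletes) turning a into b. */
    public static <T> int shortestEditLength(List<T> a, List<T> b) {
        int n = a.size();
        int m = b.size();
        int max = n + m;
        int offset = max;
        // v[offset + k] holds the furthest x reached on diagonal k, where k = x - y.
        int[] v = new int[2 * max + 2];
        for (int d = 0; d <= max; d++) {
            for (int k = -d; k <= d; k += 2) {
                int x;
                if (k == -d || (k != d && v[offset + k - 1] < v[offset + k + 1])) {
                    x = v[offset + k + 1];     // move down: take an element from b
                } else {
                    x = v[offset + k - 1] + 1; // move right: drop an element from a
                }
                int y = x - k;
                // Follow the diagonal for as long as the elements match.
                while (x < n && y < m && a.get(x).equals(b.get(y))) {
                    x++;
                    y++;
                }
                v[offset + k] = x;
                if (x >= n && y >= m) {
                    return d; // both sequences are consumed after d edits
                }
            }
        }
        return max; // not reached: d = n + m always suffices
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("A", "B", "C", "A", "B", "B", "A");
        List<String> b = Arrays.asList("C", "B", "A", "B", "A", "C");
        System.out.println(shortestEditLength(a, b)); // prints 5
    }
}
```

The outer loop runs D + 1 times and each round visits up to D + 1 diagonals, so inputs with many differences cost far more than inputs of the same size with few. Recovering the actual edit script with this basic version additionally requires keeping a copy of `v` for every round, which is the memory cost the linear-space variant avoids.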

Another point, though this is a matter of taste: the original algorithm produces more natural patches and is better suited to inline comparison.

Any more tests here?
