verhas / levenshtein

A very simple imlpementation of the Levenshtein distance calculation

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Levenshtein distance calculator

This implementation is used to create informative error message when a user mistypes a command, parameter etc. The use case implies that

  • Speed is not a primary issue, because we are in an error situation, where an error message is created to be displayed to the user. In a situation like that the reaction of the user is order of magnitudes larger than the speed of the program.

  • The compared strings are not expected to be long.

  • The maximum distance we are interested can be limited. If the distance is too large, then the suggestion in the error message would not be informative enough anyway.

Thus, the calculation of the distance is limited to MAX_COST, fixed to the value to 5 in the code.

To use the library add the following dependency to your project.

<dependency>
    <groupId>com.javax0</groupId>
    <artifactId>levenshtein</artifactId>
    <version>1.0.0</version>
</dependency>

After that add the next line to your module-info.java file.

requires levenshtein;

If you do not have module-info.java file, then just forget it. The minimum Java version is 11.

When you have that dependency you can use the library in your code. A sample code is the following:

        final String[] commands = {"ls", "rm", "mkdir", "cd", "pwd"};
        final String missTyped = userInput();
        String closestCommand = null;
        int distance = Integer.MAX_VALUE;
        for (String command : commands) {
            final int d = Levenshtein.distance(command, missTyped);
            if( d < distance) {
                distance = d;
                closestCommand = command;
            }
        }
        userOutput(closestCommand);

This sample program from the unit test gets a few commands and a mistyped command. It goes through all the commands and finds the one closest to the one given by the user.

Some examples from the unit tests:

Word 1 Word 2 Distance

same string

same string

0

same

seme

1

same

shame

1

same

sam

1

same

shama

2

same

sham

2

same

esam

2

same

xmek

3

same

xyzk

4

About

A very simple imlpementation of the Levenshtein distance calculation

License:Apache License 2.0


Languages

Language:Java 100.0%