vickumar1981 / stringdistance

A fuzzy matching string distance library for Scala and Java that includes Levenshtein distance, Jaro distance, Jaro-Winkler distance, Dice coefficient, N-Gram similarity, Cosine similarity, Jaccard similarity, Longest common subsequence, Hamming distance, and more.

Home Page: https://vickumar1981.github.io/stringdistance/api/com/github/vickumar1981/stringdistance/index.html

Bug in Jaro Winkler score after 1.1.5 version

SpaceCowboyMax opened this issue

The Jaro-Winkler algorithm returns a different score after updating past version 1.1.5.

This sample code produces different results depending on the library version:

  import com.github.vickumar1981.stringdistance.StringDistance.JaroWinkler
  println(JaroWinkler.score("kkk_k", "kkk"))

Version 1.1.5:

0.9066666666666667

Newer versions:

0.30000000000000004

It looks like the problem is in the getCommonChars function of the CommonStringDistanceAlgo interface.
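For reference, the 1.1.5 value can be checked by hand against the textbook Jaro-Winkler definition. The following is a minimal standalone sketch, not the library's code: the object `JaroWinklerSketch` and its methods are hypothetical names, and it assumes the common defaults of a scaling factor of 0.1 and a prefix capped at 4 characters. For "kkk_k" and "kkk" it finds 3 common characters and 0 transpositions, so Jaro is (3/5 + 3/3 + 3/3) / 3 ≈ 0.8667, and the shared 3-character prefix raises the Jaro-Winkler score to about 0.9067.

```scala
// Hypothetical reference sketch of textbook Jaro-Winkler; NOT the stringdistance implementation.
object JaroWinklerSketch {

  // Jaro similarity: count characters that match within a sliding window,
  // then penalize matched characters that appear in a different order.
  def jaro(s1: String, s2: String): Double = {
    if (s1.isEmpty && s2.isEmpty) 1.0
    else if (s1.isEmpty || s2.isEmpty) 0.0
    else {
      val window = math.max(0, math.max(s1.length, s2.length) / 2 - 1)
      val matched1 = new Array[Boolean](s1.length)
      val matched2 = new Array[Boolean](s2.length)
      var matches = 0
      for (i <- s1.indices) {
        val hi = math.min(s2.length - 1, i + window)
        var j = math.max(0, i - window)
        var found = false
        // Each character of s2 may be claimed by at most one character of s1.
        while (!found && j <= hi) {
          if (!matched2(j) && s1(i) == s2(j)) {
            matched1(i) = true
            matched2(j) = true
            matches += 1
            found = true
          }
          j += 1
        }
      }
      if (matches == 0) 0.0
      else {
        // Compare the matched characters in order; half the mismatches are transpositions.
        val m1 = s1.indices.filter(matched1(_)).map(s1(_))
        val m2 = s2.indices.filter(matched2(_)).map(s2(_))
        val transpositions = m1.zip(m2).count { case (a, b) => a != b } / 2
        val m = matches.toDouble
        (m / s1.length + m / s2.length + (m - transpositions) / m) / 3.0
      }
    }
  }

  // Winkler adjustment: boost the Jaro score for a shared prefix of up to 4 characters.
  def jaroWinkler(s1: String, s2: String, p: Double = 0.1): Double = {
    val j = jaro(s1, s2)
    val prefix = s1.zip(s2).takeWhile { case (a, b) => a == b }.length.min(4)
    j + prefix * p * (1.0 - j)
  }
}
```

Calling `JaroWinklerSketch.jaroWinkler("kkk_k", "kkk")` gives roughly 0.9067, which agrees with the 1.1.5 output above and suggests the 0.30 result from newer versions is the regression.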

👍 @SpaceCowboyMax I can confirm that this broke when the code was refactored to use generalized arrays. I'm working on a fix. Thanks for reporting the issue and for pointing out specifically where the bug is.

@SpaceCowboyMax I published a 1.2.7 release that should fix the issue; it should sync to Maven Central shortly. I also released a 1.2.8-SNAPSHOT that you can use in the meantime if you are willing to use the snapshot repository.

https://oss.sonatype.org/content/repositories/snapshots/com/github/vickumar1981/stringdistance_2.13/1.2.8-SNAPSHOT/
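As a rough sketch of how the snapshot could be pulled in, assuming an sbt build on Scala 2.13 (the build tool is an assumption; the coordinates are read off the snapshot URL above):

```scala
// build.sbt (sketch): resolve the snapshot from the Sonatype snapshots repository.
resolvers += "Sonatype OSS Snapshots" at "https://oss.sonatype.org/content/repositories/snapshots"

// %% appends the Scala binary version, yielding stringdistance_2.13 on Scala 2.13.
libraryDependencies += "com.github.vickumar1981" %% "stringdistance" % "1.2.8-SNAPSHOT"
```

Once 1.2.7 is available on Maven Central, the resolver line can be dropped and the version changed to "1.2.7".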

Let me know if that addresses the issue, and again, thanks for catching that and reporting the issue.

Fixed by commit: 5800a55

Thanks, looks like it's fixed now.