ruby / did_you_mean

The gem that has been saving people from typos since 2014

TreeSpellChecker: cases where it doesn't work as expected?

pudiva opened this issue

I've just started playing with TreeSpellChecker and found a couple of cases where it doesn't give me the corrections I would expect.

If I run this script on Ruby 3.0.3:

#!/usr/bin/env ruby
require 'set'

paths = [
  "dir/typo",
  "typo",
]
sc = DidYouMean::TreeSpellChecker.new(dictionary: paths)
puts "*** none slashed:"
puts sc.correct("tyop")

paths = [
  "dir/typo",
  "typo",
]
sc = DidYouMean::TreeSpellChecker.new(dictionary: paths)
puts "*** query slashed:"
puts sc.correct("/tyop")

paths = [
  "dir/typo",
  "/typo",
]
sc = DidYouMean::TreeSpellChecker.new(dictionary: paths)
puts "*** dict slashed:"
puts sc.correct("tyop")

paths = [
  "dir/typo",
  "/typo",
]
sc = DidYouMean::TreeSpellChecker.new(dictionary: paths)
puts "*** both slashed:"
puts sc.correct("/tyop")

I get this output:

*** none slashed:
*** query slashed:
dir/typo
*** dict slashed:
*** both slashed:
/typo

I think that in all four cases the second path in the list (either typo or /typo) should be suggested.

What do you think? ✨✨✨
cc @yuki24
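
To make the expectation above easier to check, here is a minimal sketch that runs the same dictionary and query through both TreeSpellChecker and the flat DidYouMean::SpellChecker, so the two results can be compared side by side. The helper name compare_checkers is hypothetical and not part of the gem, and the flat checker is only a point of comparison, not a claim about what TreeSpellChecker is supposed to return.

```ruby
require 'set'          # kept from the scripts above
require 'did_you_mean'

# Hypothetical helper: returns the suggestions of both checkers
# for one dictionary and one query.
def compare_checkers(paths, query)
  tree = DidYouMean::TreeSpellChecker.new(dictionary: paths)
  flat = DidYouMean::SpellChecker.new(dictionary: paths)

  # dup the query so neither checker sees a string the other has touched
  { tree: tree.correct(query.dup), flat: flat.correct(query.dup) }
end

# The four cases from the script above.
cases = [
  [["dir/typo", "typo"],  "tyop"],
  [["dir/typo", "typo"],  "/tyop"],
  [["dir/typo", "/typo"], "tyop"],
  [["dir/typo", "/typo"], "/tyop"],
]

cases.each do |paths, query|
  result = compare_checkers(paths, query)
  puts "#{paths.inspect} ? #{query.inspect}"
  puts "  tree: #{result[:tree].inspect}"
  puts "  flat: #{result[:flat].inspect}"
end
```

Printing with inspect makes empty results visible as [] instead of a blank line.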

To work around the issue above, I decided to prefix every path with ROOT/ and strip the prefix again afterwards, but then I found some other surprising cases...

Script:

#!/usr/bin/env ruby
require 'set'
paths = [
  "ROOT/dir/typo",
  "ROOT/typo",
]
sc = DidYouMean::TreeSpellChecker.new(dictionary: paths)

puts "*** matches typo :)"
puts sc.correct("ROOT/tyop")

puts "*** doesn't match if too distant :)"
puts sc.correct("ROOT/a")

puts "*** matches too distant :("
puts sc.correct("ROOT/asduhij2ed8uuo3iekd3e/238eoiu3jkr3o48if")

Output:

*** matches typo :)
ROOT/typo
*** doesn't match if too distant :)
*** matches too distant :(
ROOT/dir/typo
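
The prefix-and-unprefix workaround described above could be wrapped in a small helper along these lines. This is only a sketch: ROOT_PREFIX and correct_with_root are hypothetical names, not part of did_you_mean, and the helper inherits the behaviour shown in the output above, including the match on a very distant query.

```ruby
require 'set'          # kept from the scripts above
require 'did_you_mean'

# Hypothetical prefix, chosen to match the ROOT/ used in the script above.
ROOT_PREFIX = "ROOT/"

# Hypothetical helper: prefixes every dictionary entry and the query
# with ROOT/, asks TreeSpellChecker for corrections, then strips the
# prefix from each suggestion again.
def correct_with_root(paths, query)
  prefixed = paths.map { |path| ROOT_PREFIX + path }
  checker  = DidYouMean::TreeSpellChecker.new(dictionary: prefixed)

  checker.correct(ROOT_PREFIX + query).map do |suggestion|
    suggestion.delete_prefix(ROOT_PREFIX)
  end
end

paths = ["dir/typo", "typo"]
puts correct_with_root(paths, "tyop").inspect
puts correct_with_root(paths, "a").inspect
```

Keeping the prefix handling in one place means callers never see ROOT/ in the suggestions, but it does not change which suggestions TreeSpellChecker picks.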