TreeSpellChecker: cases where it doesn't work as expected?
pudiva opened this issue
satana commented
I've just started playing with TreeSpellChecker and found a couple of cases where it doesn't give me the corrections I would expect.
If I run this script with Ruby 3.0.3:
#!/usr/bin/env ruby
require 'set'
paths = [
"dir/typo",
"typo",
]
sc = DidYouMean::TreeSpellChecker.new(dictionary: paths)
puts "*** none slashed:"
puts sc.correct("tyop")
paths = [
"dir/typo",
"typo",
]
sc = DidYouMean::TreeSpellChecker.new(dictionary: paths)
puts "*** query slashed:"
puts sc.correct("/tyop")
paths = [
"dir/typo",
"/typo",
]
sc = DidYouMean::TreeSpellChecker.new(dictionary: paths)
puts "*** dict slashed:"
puts sc.correct("tyop")
paths = [
"dir/typo",
"/typo",
]
sc = DidYouMean::TreeSpellChecker.new(dictionary: paths)
puts "*** both slashed:"
puts sc.correct("/tyop")
I get this output:
*** none slashed:
*** query slashed:
dir/typo
*** dict slashed:
*** both slashed:
/typo
I think that in all four cases the second path in the list (either typo or /typo) should be suggested.
What do you think? ✨✨✨
cc @yuki24
satana commented
To work around the issue above, I decided to prefix every path with ROOT/ and strip the prefix again afterwards, but then I found some other cases that don't behave as I would expect...
Script:
#!/usr/bin/env ruby
require 'set'
paths = [
"ROOT/dir/typo",
"ROOT/typo",
]
sc = DidYouMean::TreeSpellChecker.new(dictionary: paths)
puts "*** matches typo :)"
puts sc.correct("ROOT/tyop")
puts "*** doesn't match if too distant :)"
puts sc.correct("ROOT/a")
puts "*** matches too distant :("
puts sc.correct("ROOT/asduhij2ed8uuo3iekd3e/238eoiu3jkr3o48if")
Output:
*** matches typo :)
ROOT/typo
*** doesn't match if too distant :)
*** matches too distant :(
ROOT/dir/typo
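For reference, the ROOT/ prefix workaround described above could be sketched as a small wrapper. This is only a minimal sketch: the class name PrefixedTreeSpellChecker and its prefix: keyword are made up for illustration and are not part of did_you_mean. It relies only on DidYouMean::TreeSpellChecker.new(dictionary:) and #correct, as used in the scripts above.

```ruby
require "did_you_mean"

# Hypothetical wrapper (not part of did_you_mean): gives every dictionary
# entry and every query a synthetic root directory, so that top-level
# entries also have a parent, then strips that root from the suggestions.
class PrefixedTreeSpellChecker
  def initialize(dictionary:, prefix: "ROOT/")
    @prefix = prefix
    prefixed = dictionary.map { |path| "#{prefix}#{path}" }
    @checker = DidYouMean::TreeSpellChecker.new(dictionary: prefixed)
  end

  # Prefix the query, delegate to TreeSpellChecker, and remove the
  # synthetic prefix from each suggestion before returning.
  def correct(input)
    @checker.correct("#{@prefix}#{input}").map { |s| s.delete_prefix(@prefix) }
  end
end

checker = PrefixedTreeSpellChecker.new(dictionary: ["dir/typo", "typo"])
suggestions = checker.correct("tyop")
puts suggestions
```

Going by the ROOT/tyop result above, this should suggest typo for tyop. The wrapper only hides the prefix, so it inherits the "matches too distant" behaviour from the last case as well.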