gin problems
jhellerstein opened this issue
It appears that the gin opclasses don't guarantee correct results:
joe=# create index aix on foo using gin(a gin_similarity_ops);
CREATE INDEX
joe=# select a, b, lev(a,b) from foo, bar where a ~== b;
a | b | lev
---------------------------+----------------------+-------------------
Euler Taveira de Oliveira | Euler T. de Oliveira | 0.76
Euler | Euller | 0.833333333333333
(2 rows)
joe=# explain select a, b, lev(a,b) from foo, bar where a ~== b;
                           QUERY PLAN
----------------------------------------------------------------
 Nested Loop  (cost=0.00..122.43 rows=7 width=64)
   Join Filter: (foo.a ~== bar.b)
   ->  Seq Scan on bar  (cost=0.00..23.10 rows=1310 width=32)
   ->  Materialize  (cost=0.00..1.07 rows=5 width=32)
         ->  Seq Scan on foo  (cost=0.00..1.05 rows=5 width=32)
(5 rows)
joe=# show enable_seqscan;
enable_seqscan
----------------
on
(1 row)
joe=# set enable_seqscan = 'off';
SET
joe=# explain select a, b, lev(a,b) from foo, bar where a ~== b;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Nested Loop  (cost=10000000000.01..10000005300.97 rows=7 width=64)
   ->  Seq Scan on bar  (cost=10000000000.00..10000000023.10 rows=1310 width=32)
   ->  Bitmap Heap Scan on foo  (cost=0.01..4.02 rows=1 width=32)
         Recheck Cond: (a ~== bar.b)
         ->  Bitmap Index Scan on aix  (cost=0.00..0.01 rows=1 width=0)
               Index Cond: (a ~== bar.b)
(6 rows)
joe=# select a, b, lev(a,b) from foo, bar where a ~== b;
a | b | lev
---------------------------+----------------------+------
Euler Taveira de Oliveira | Euler T. de Oliveira | 0.76
(1 row)
joe=# set enable_seqscan = 'on';
SET
joe=# select a, b, lev(a,b) from foo, bar where a ~== b;
a | b | lev
---------------------------+----------------------+-------------------
Euler Taveira de Oliveira | Euler T. de Oliveira | 0.76
Euler | Euller | 0.833333333333333
(2 rows)
joe=# \q
(joe@3fac) pg_similarity >
This was an oversight, fixed in e699efb. The problem was that some operators can't use indexes. It seems soundex can use indexes, but I left that for another commit.
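For anyone wondering how an index scan can drop a row that a seq scan returns, here is a toy model in Python. It is only a sketch under assumptions, not the extension's actual code: it assumes the index stores whitespace-separated word tokens, that `~==` compares a length-normalized Levenshtein similarity against a threshold of 0.7, and every function name in it is made up for illustration. Under those assumptions, "Euler" and "Euller" share no token, so the index never offers the pair to the recheck step, while the seq scan compares every pair directly.

```python
# Toy model only: the tokenizer, the threshold, and all names are assumptions
# made for illustration; none of this is taken from the pg_similarity source.

THRESHOLD = 0.7  # assumed similarity threshold for the ~== operator


def levenshtein(s, t):
    """Edit distance via the classic two-row dynamic programme."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (cs != ct)))  # substitution
        prev = cur
    return prev[-1]


def lev_sim(s, t):
    """Similarity in [0, 1]: 1 - distance / length of the longer string."""
    longest = max(len(s), len(t))
    return 1.0 if longest == 0 else 1.0 - levenshtein(s, t) / longest


def tokens(s):
    """Assumed index keys: lower-cased whitespace-separated words."""
    return set(s.lower().split())


def seq_scan_join(foo, bar):
    """Compare every pair, like the nested loop with a join filter."""
    return [(a, b) for b in bar for a in foo if lev_sim(a, b) >= THRESHOLD]


def index_scan_join(foo, bar):
    """Fetch candidates through a token index, then recheck each one."""
    index = {}
    for a in foo:
        for tok in tokens(a):
            index.setdefault(tok, set()).add(a)
    result = []
    for b in bar:
        candidates = set()
        for tok in tokens(b):
            candidates |= index.get(tok, set())
        # The recheck can only reject candidates; it cannot add missed rows.
        result.extend((a, b) for a in sorted(candidates)
                      if lev_sim(a, b) >= THRESHOLD)
    return result


# Only the strings visible in the session above; the real tables hold more rows.
foo = ["Euler Taveira de Oliveira", "Euler"]
bar = ["Euler T. de Oliveira", "Euller"]

print("seq scan  :", seq_scan_join(foo, bar))
print("index scan:", index_scan_join(foo, bar))
```

In this model the recheck is not the culprit: it only filters what the index hands over. A character-level measure such as Levenshtein can rate two strings as similar even when they have no word in common, which is one plausible reading of "some operators can't use indexes".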