Casecommons / pg_search

pg_search builds ActiveRecord named scopes that take advantage of PostgreSQL’s full text search

Home Page: http://www.casebook.net

Prefix search does not work with Multisearch

vfonic opened this issue

I've followed the instructions in the README for setting up prefix search: https://github.com/Casecommons/pg_search#prefix-postgresql-84-and-newer-only

PostgreSQL's full text search matches on whole words by default. If you want to search for partial words, however, you can set :prefix to true. Since this is a :tsearch-specific option, you should pass it to :tsearch directly, as shown in the following example.

class Superhero < ActiveRecord::Base
  include PgSearch::Model
  pg_search_scope :whose_name_starts_with,
                  against: :name,
                  using: {
                    tsearch: { prefix: true }
                  }
end

batman = Superhero.create name: 'Batman'
batgirl = Superhero.create name: 'Batgirl'
robin = Superhero.create name: 'Robin'

Superhero.whose_name_starts_with("Bat") # => [batman, batgirl]

The above example works as expected. However, the same example returns no results when used through multisearch:

class Superhero < ActiveRecord::Base
  include PgSearch::Model
  multisearchable against: :name, using: { tsearch: { prefix: true } }
end

batman = Superhero.create name: 'Batman'
batgirl = Superhero.create name: 'Batgirl'
robin = Superhero.create name: 'Robin'

PgSearch.multisearch('Bat') # => []
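
For intuition about why 'Bat' finds nothing when the prefix option is not applied, here is a rough plain-Ruby approximation of whole-word versus prefix matching under the 'simple' text search configuration. It is not pg_search or PostgreSQL code: the helper names lexemes and matches? are made up for illustration, and real tokenization is more involved.

```ruby
# Rough approximation of PostgreSQL 'simple' full text matching.
# `lexemes` and `matches?` are hypothetical helpers, not pg_search API.

# Split text into lowercased word tokens, loosely like to_tsvector('simple', ...).
def lexemes(text)
  text.downcase.scan(/[[:alnum:]]+/)
end

# Whole-word match by default; with prefix: true, behave like the ':*' marker.
def matches?(document, term, prefix: false)
  needle = term.downcase
  lexemes(document).any? do |lexeme|
    prefix ? lexeme.start_with?(needle) : lexeme == needle
  end
end

names = %w[Batman Batgirl Robin]

whole_word  = names.select { |name| matches?(name, "Bat") }
with_prefix = names.select { |name| matches?(name, "Bat", prefix: true) }

puts whole_word.inspect   # => []
puts with_prefix.inspect  # => ["Batman", "Batgirl"]
```

The empty first result mirrors what multisearch returns here, which suggests the prefix option is not reaching the generated query.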

I've looked into the SQL queries that are being generated and found this:

Superhero.whose_name_starts_with("Bat") # => [batman, batgirl]

Generates the following SQL query:

-- formatting mine
SELECT "superheroes".*
FROM "superheroes"
INNER JOIN (
  SELECT "superheroes"."id" AS pg_search_id,
  (
    ts_rank(
      (to_tsvector('simple', coalesce("superheroes"."title"::text, ''))),
      (to_tsquery('simple', ''' ' || 'Bat' || ' ''' || ':*')),
      0
    )
  ) AS rank
  FROM "superheroes"
  WHERE (
    (to_tsvector('simple', coalesce("superheroes"."title"::text, ''))) @@ (to_tsquery('simple', ''' ' || 'Bat' || ' ''' || ':*'))
  )
) AS pg_search_ccc5ed3d02159c90808324
ON "superheroes"."id" = pg_search_ccc5ed3d02159c90808324.pg_search_id
ORDER BY pg_search_ccc5ed3d02159c90808324.rank DESC, "superheroes"."id" ASC

While this:

PgSearch.multisearch('Bat') # => []

Generates the following SQL query:

-- formatting mine
SELECT "pg_search_documents".*
FROM "pg_search_documents"
INNER JOIN (
  SELECT
    "pg_search_documents"."id" AS pg_search_id,
    (
      ts_rank(
        (to_tsvector('simple', coalesce("pg_search_documents"."content"::text, ''))),
        (to_tsquery('simple', ''' ' || 'Bat' || ' ''')),
        0
      )
    ) AS rank
  FROM "pg_search_documents"
  WHERE (
    (to_tsvector('simple', coalesce("pg_search_documents"."content"::text, ''))) @@ (to_tsquery('simple', ''' ' || 'Bat' || ' '''))
  )
) AS pg_search_ce9b9dd18c5c0023f2116f
ON "pg_search_documents"."id" = pg_search_ce9b9dd18c5c0023f2116f.pg_search_id
ORDER BY pg_search_ce9b9dd18c5c0023f2116f.rank DESC, "pg_search_documents"."id" ASC;

The difference is in the to_tsquery arguments: the pg_search_scope query appends ':*' to the term, while the multisearch query does not. I modified the multisearch SQL to append it, ran it directly in psql, and it returned both records:

-- the commented-out lines are the non-working tsquery arguments
SELECT "pg_search_documents".*
FROM "pg_search_documents"
INNER JOIN (
  SELECT "pg_search_documents"."id" AS pg_search_id,
  (
    ts_rank(
      (to_tsvector('simple', coalesce("pg_search_documents"."content"::text, ''))),
      -- (to_tsquery('simple', ''' ' || 'Bat' || ' ''')),
      (to_tsquery('simple', ''' ' || 'Bat' || ' ''' || ':*')),
      0
    )
  ) AS rank
  FROM "pg_search_documents"
  WHERE (
    -- (to_tsvector('simple', coalesce("pg_search_documents"."content"::text, ''))) @@ (to_tsquery('simple', ''' ' || 'Bat' || ' '''))
    (to_tsvector('simple', coalesce("pg_search_documents"."content"::text, ''))) @@ (to_tsquery('simple', ''' ' || 'Bat' || ' ''' || ':*'))
  )
) AS pg_search_ce9b9dd18c5c0023f2116f
ON "pg_search_documents"."id" = pg_search_ce9b9dd18c5c0023f2116f.pg_search_id
ORDER BY pg_search_ce9b9dd18c5c0023f2116f.rank DESC, "pg_search_documents"."id" ASC
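
To spell out what the two to_tsquery arguments evaluate to, here is a small plain-Ruby sketch that mimics the string concatenation in the SQL above. tsquery_term is a hypothetical helper for illustration only; it is not how pg_search builds its queries internally.

```ruby
# Mimics the SQL expression ''' ' || term || ' ''' (plus || ':*' when prefix is on).
# `tsquery_term` is a hypothetical helper, not part of pg_search.
def tsquery_term(term, prefix: false)
  quoted = "' #{term} '"          # the term wrapped in single quotes and spaces
  prefix ? "#{quoted}:*" : quoted # ':*' asks PostgreSQL for prefix matching
end

puts tsquery_term("Bat")               # argument in the multisearch query
puts tsquery_term("Bat", prefix: true) # argument in the pg_search_scope query
```

The first call prints ' Bat ' and the second prints ' Bat ':*, so the only difference between the two queries is the trailing ':*' marker.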

Any ideas how to fix this?

Thanks!

Here's how I currently "fixed" this:

Instead of relying on the default scope (pg_search_scope :search), I've added a new pg_search_scope :multisearch to PgSearch::Document:

# config/initializers/pg_search/document.rb
module PgSearch
  class Document < ActiveRecord::Base
    pg_search_scope :multisearch, { against: :content, using: { tsearch: { prefix: true } } }
  end
end

And I'm performing a multisearch like this:

search_results = PgSearch::Document.multisearch(search_term)

Instead of using the built-in multisearch method:

search_results = PgSearch.multisearch(search_term)

This only works because, in my app, all of my multisearchable models use the same options:
using: { tsearch: { prefix: true } }

You can change the multisearch scope's options globally by assigning to PgSearch.multisearch_options.

See https://github.com/Casecommons/pg_search#configuring-multi-search
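
For the setup in this issue, that would presumably look something like the initializer below. This is a sketch based on the README's multisearch configuration section; the file path is only a convention, and the options hash reuses the one from the original post.

```ruby
# config/initializers/pg_search.rb (the path is a convention, not a requirement)
# These options apply to every PgSearch.multisearch call, regardless of model.
PgSearch.multisearch_options = {
  using: { tsearch: { prefix: true } }
}
```

With this in place, PgSearch.multisearch('Bat') should generate a tsquery ending in ':*', and the custom scope on PgSearch::Document should no longer be needed.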

@nertzy ahhhhhhh! Thanks! :)