Casecommons / pg_search

pg_search builds ActiveRecord named scopes that take advantage of PostgreSQL’s full text search

Home Page: http://www.casebook.net

Rebuild is not resumable

m5rk opened this issue

We'd like to use this gem to backfill a relatively large set of search documents. But because we use conditional and additional attributes, the rebuild loops over every record. If the corresponding Sidekiq job fails, the retry has to start all over again. Is that a known issue?

I think this gem could provide more guidance about how to backfill / rebuild a large set of search documents.

We handled this by implementing, in each model we search, the class method that pg_search looks for, along with the corresponding searchable and without_search_documents scopes:

  def self.rebuild_pg_search_documents
    searchable.without_search_documents.find_each(&:update_pg_search_document)
  end

Our rebuild Sidekiq job's perform method passes options to skip both the clean-up step and the transaction. We expect the job to fail and need to retry. If it cleaned up on every retry (essentially truncating the table), it would never finish. We skip the transaction because it's not practical to wrap such a massive job in one. Besides, given that we expect the job to retry and resume where it left off, there's no practical value in a transaction. Personally, I think transactional: false should be the default.

  def perform(class_name)
    PgSearch::Multisearch.rebuild(
      class_name.classify.constantize, clean_up: false, transactional: false)
  end
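
To illustrate why skipping clean-up matters, here is a minimal plain-Ruby simulation of the retry behavior described above. It does not use pg_search, Sidekiq, or a database; FakeIndex, run_with_retries, and the fail_after parameter are hypothetical names invented for this sketch, and the in-memory hash only stands in for the search documents table.

```ruby
# Plain-Ruby simulation only: no pg_search, Sidekiq, or database involved.
# FakeIndex, run_with_retries, and fail_after are hypothetical names.

class FakeIndex
  attr_reader :documents, :writes

  def initialize(record_ids)
    @record_ids = record_ids
    @documents = {} # record id => search document
    @writes = 0     # total documents written across all attempts
  end

  # Stands in for the idea behind searchable.without_search_documents:
  # only records that do not have a document yet.
  def missing_ids
    @record_ids.reject { |id| @documents.key?(id) }
  end

  # One rebuild attempt. Raises after `fail_after` writes to mimic a job crash.
  def rebuild(clean_up:, fail_after: nil)
    @documents.clear if clean_up # clean-up wipes all earlier progress
    written = 0
    missing_ids.each do |id|
      raise "simulated job failure" if fail_after && written >= fail_after
      @documents[id] = "document for #{id}"
      written += 1
      @writes += 1
    end
  end
end

# Retries the rebuild until it completes or attempts run out.
# Returns the number of attempts used, or nil if it never finished.
def run_with_retries(index, clean_up:, fail_after:, max_attempts:)
  (1..max_attempts).each do |attempt|
    begin
      index.rebuild(clean_up: clean_up, fail_after: fail_after)
      return attempt
    rescue RuntimeError
      next # the job failed; try again
    end
  end
  nil
end

# Without clean-up, each retry continues from the documents already written.
resumable = FakeIndex.new((1..10).to_a)
resumable_attempts = run_with_retries(
  resumable, clean_up: false, fail_after: 4, max_attempts: 5
)

# With clean-up, each retry wipes the documents and starts from scratch.
restarting = FakeIndex.new((1..10).to_a)
restarting_attempts = run_with_retries(
  restarting, clean_up: true, fail_after: 4, max_attempts: 5
)
```

In this simulation the run without clean-up finishes on its third attempt with all ten documents written exactly once, while the run with clean-up never gets past four documents no matter how often it retries.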