fatkodima / pluck_in_batches

A faster alternative to the custom use of `in_batches` with `pluck`


ActiveRecord comes with find_each / find_in_batches / in_batches methods to batch process records from a database. ActiveRecord also has the pluck method which allows the selection of a set of fields without pulling the entire record into memory.

This gem combines these ideas and provides pluck_each and pluck_in_batches methods to allow batch processing of plucked fields from the database.

It performs half as many SQL queries, allocates up to half as much memory, and is up to 2x faster (or more, depending on how far your database is from the application) than the available alternative:

# Before
User.in_batches do |batch| # or .find_in_batches, or .select(:email).find_each etc
  emails = batch.pluck(:email)
  # do something with emails
end

# Now, using this gem (up to 2x faster)
User.pluck_in_batches(:email) do |emails|
  # do something with emails
end
Note: You may also find sidekiq-iteration useful when iterating over large collections in Sidekiq jobs.


  • Ruby 2.7+
  • ActiveRecord 6+

If you need support for older versions, open an issue.


Add this line to your application's Gemfile:

gem 'pluck_in_batches'

And then execute:

$ bundle

Or install it yourself as:

$ gem install pluck_in_batches



pluck_each behaves similarly to ActiveRecord's find_each method, but yields each set of values corresponding to the specified columns.

# Single column
User.where(active: true).pluck_each(:email) do |email|
  # do something with email
end

# Multiple columns
User.where(active: true).pluck_each(:id, :email) do |id, email|
  # do something with id and email
end
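
The difference between the single-column and multi-column block arguments can be illustrated without a database. The sketch below is plain Ruby, and `each_from_batches` is a hypothetical helper, not part of the gem. It only assumes that per-row iteration is conceptually a one-level flattening of batches, where a single column gives one value per row and several columns give one array per row.

```ruby
# Hypothetical helper: yields every row's values from a list of batches.
def each_from_batches(batches)
  batches.each do |batch|
    batch.each { |values| yield values }
  end
end

# Single column: each batch is a flat array of values.
single = [["a@example.com", "b@example.com"], ["c@example.com"]]
emails = []
each_from_batches(single) { |email| emails << email }

# Multiple columns: each row is an array that the block destructures.
multi = [[[1, "a@example.com"], [2, "b@example.com"]], [[3, "c@example.com"]]]
pairs = []
each_from_batches(multi) { |id, email| pairs << "#{id}:#{email}" }
```

Ruby blocks splat a single array argument across several block parameters, which is why `|id, email|` works in the multi-column case.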


pluck_in_batches behaves similarly to ActiveRecord's in_batches method, but yields each batch of values corresponding to the specified columns.

# Single column
User.where("age > 21").pluck_in_batches(:email) do |emails|
  jobs = emails.map { |email| PartyReminderJob.new(email) }
end

# Multiple columns
User.pluck_in_batches(:name, :email).with_index do |group, index|
  puts "Processing group ##{index}"
  jobs = group.map { |name, email| PartyReminderJob.new(name, email) }
end

Both methods support the following configuration options:

  • :batch_size - Specifies the size of the batch. Defaults to 1000. Also aliased as :of.
  • :start - Specifies the primary key value to start from, inclusive of the value.
  • :finish - Specifies the primary key value to end at, inclusive of the value.
  • :error_on_ignore - Overrides the application config to specify if an error should be raised when an order is present in the relation.
  • :cursor_column - Specifies the column(s) on which the iteration should be done. The column(s) should be orderable (e.g. an integer or string). Defaults to primary key.
  • :order - Specifies the primary key order (can be :asc or :desc or an array consisting of :asc or :desc). Defaults to :asc.
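
To make the interaction of :batch_size, :start and :finish concrete, here is a small in-memory model. It is only a sketch under stated assumptions: `plucked_batches` is a hypothetical function, not the gem's implementation. It covers ascending order on an `id` column only and treats both bounds as inclusive, as described in the list above.

```ruby
# Hypothetical model of cursor-based batching over an array of hashes.
# Ascending order only; start and finish are inclusive bounds on :id.
def plucked_batches(rows, column, batch_size: 1000, start: nil, finish: nil)
  in_range = rows.sort_by { |row| row[:id] }.select do |row|
    (start.nil? || row[:id] >= start) && (finish.nil? || row[:id] <= finish)
  end

  # Split the remaining rows into batches and keep only the requested column.
  in_range.each_slice(batch_size).map do |slice|
    slice.map { |row| row[column] }
  end
end

rows = (1..7).map { |id| { id: id, email: "user#{id}@example.com" } }
result = plucked_batches(rows, :email, batch_size: 2, start: 2, finish: 6)
```

Here `result` holds three batches covering ids 2 through 6, the last one with a single email. With the gem, the corresponding call would presumably look like `User.pluck_in_batches(:email, batch_size: 2, start: 2, finish: 6)`.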


To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and the created tag, and push the .gem file to rubygems.org.


Bug reports and pull requests are welcome on GitHub at https://github.com/fatkodima/pluck_in_batches.


The gem is available as open source under the terms of the MIT License.

