O-I / pluck

Twitter favorites manager

Home Page: http://kerpluck.herokuapp.com/

Ensure against duplicate favorites in the database

O-I opened this issue

commented

The uniqueness validation on tweet_id was added after the pluck_all rake task had already run in production. Because max_id is inclusive, the last tweet saved from each batch call to the Twitter API is also the first tweet saved in the subsequent call. Essentially, every 200th tweet plucked with pluck_all is duplicated. It isn't precisely every 200th tweet because of the chronological error I mention in issue #1. To fix:

  1. Write a rake task to remove duplicates from the current database
  2. Change line 12 of faves.rake to set max_id to one less than the current minimum. Something like this should do (but check):

min_id = faves.map(&:id).min
options[:max_id] = min_id - 1 unless min_id.nil?
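
To check the max_id change without hitting the real API, here is a minimal sketch of the paging loop. `fetch_favorites` and `pluck_all_ids` are hypothetical stand-ins, not the code in faves.rake; the stub only mimics the inclusive max_id behaviour described above, with a batch size of 3 instead of 200.

```ruby
# Hypothetical tweet record; only the id matters for this sketch.
Tweet = Struct.new(:id)

# Hypothetical stand-in for the favorites API call: returns up to `count`
# tweets, newest first, whose id is <= max_id (max_id is inclusive).
def fetch_favorites(all_ids, count:, max_id: nil)
  ids = all_ids.sort.reverse
  ids = ids.select { |id| id <= max_id } if max_id
  ids.first(count).map { |id| Tweet.new(id) }
end

# Pages through every favorite, collecting the ids in the order fetched.
def pluck_all_ids(all_ids, count: 3)
  options = { count: count }
  saved = []
  loop do
    faves = fetch_favorites(all_ids, **options)
    break if faves.empty?

    saved.concat(faves.map(&:id))
    # Step one below the smallest id seen so the next batch
    # does not start with a tweet that was already saved.
    options[:max_id] = faves.map(&:id).min - 1
  end
  saved
end

ids = pluck_all_ids((1..10).to_a)
puts ids.inspect
```

Dropping the `- 1` in this sketch makes every batch begin with the last tweet of the previous one, and the loop then never terminates on the final tweet, so the off-by-one matters for both duplication and termination.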

commented

Fixed. Implemented a remove_dupluckates rake task, amended pluck_all, and added a uniqueness constraint on tweet_id at the database level.
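
For anyone doing a similar cleanup, the core of a deduplication pass can be sketched in plain Ruby as below. This is not the actual remove_dupluckates task; `Favorite` and `duplicate_record_ids` are hypothetical names, and the sketch assumes the copy with the lowest record id is the one to keep.

```ruby
# Hypothetical row shape: a database record id plus the tweet it points to.
Favorite = Struct.new(:id, :tweet_id)

# Returns the record ids that should be deleted: for each tweet_id,
# every row except the one with the lowest record id.
def duplicate_record_ids(favorites)
  favorites
    .group_by(&:tweet_id)
    .values
    .flat_map { |group| group.sort_by(&:id).drop(1).map(&:id) }
end

rows = [
  Favorite.new(1, 100),
  Favorite.new(2, 101),
  Favorite.new(3, 101), # duplicate of record 2
  Favorite.new(4, 102)
]

to_delete = duplicate_record_ids(rows)
puts to_delete.inspect
```

In a Rails app the resulting ids would presumably be passed to something like `where(id: ...).delete_all` on the model, and the cleanup has to run before the unique index is added, since the index cannot be created while duplicates exist.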