Undefined method *_data for nil:NilClass in plugins/entity
patrykk21 opened this issue · comments
Brief Description
I'm finally opening an issue because I haven't been able to understand or debug why this happens.
It's also really hard to replicate: so far it has only occurred in the production environment, where it keeps happening consistently to date.
If I manage to reproduce the issue I promise to create a docker-compose configuration to test it with, but for now I'm not able to.
The issue is that the upload seems to fail at random times in Attacher.promote_block.
The problematic code looks like so:
# frozen_string_literal: true

class TextUploader < Shrine
  Attacher.promote_block do
    cached_data = file_data
    # Synchronous in order to get the file data immediately so we can store it in the DB
    self.atomic_promote
    # Asynchronous since we can delete the cached data eventually
    Uploaders::DestroyWorker.perform_async(self.class, cached_data)
  end
end
Shrine configuration
# frozen_string_literal: true

require 'oj'
require 'shrine'
require 'shrine/storage/s3'

# The `:cache` store is temporary, contrary to `:store` being permanent.
# The cache store is used to upload the file before the data is persisted and finalised.
Shrine.storages =
  if Rails.env.development? && ENV['DOCS_BUCKET_USE_LOCAL'] == 'true'
    {
      cache: Shrine::Storage::FileSystem.new('public', prefix: 'uploads/backlog'),
      store: Shrine::Storage::FileSystem.new('public', prefix: 'uploads/store')
    }
  else
    docs_bucket = Rails.application.secrets[:docs_bucket]

    s3_options = {
      bucket: docs_bucket[:bucket_name],
      access_key_id: docs_bucket[:access_key_id],
      secret_access_key: docs_bucket[:secret_access_key],
      region: docs_bucket[:region],
      public: true,
      force_path_style: true
    }
    s3_options[:endpoint] = docs_bucket[:endpoint] if docs_bucket[:endpoint].present?

    {
      cache: Shrine::Storage::S3.new(prefix: 'backlog', **s3_options),
      store: Shrine::Storage::S3.new(prefix: 'store', **s3_options)
    }
  end

Shrine.plugin :mongoid
Shrine.plugin :model
Shrine.plugin :restore_cached_data
Shrine.plugin :cached_attachment_data
Shrine.plugin :backgrounding
Shrine.plugin :column, serializer: Oj
The model that uploads, stripped of unnecessary methods, looks like this:
# frozen_string_literal: true

class BinaryDocument < ApplicationModel
  include Mongoid::Document
  include Mongoid::Timestamps
  ...
  include TextUploader::Attachment(:document)

  DELEGATE_TO_DOCUMENT_METHODS = %i[url data storage metadata storage_key uploader read rewind].freeze

  field :id, type: String, default: -> { SecureRandom.uuid }
  field :owner_id, type: String
  field :timestamp, type: DateTime, default: -> { ::BinaryDocument.default_timestamp }
  field :document_data, type: Hash
  field :asset_id, type: String

  validates :id, :owner_id, :timestamp, presence: true
  validates :id, length: { maximum: 255 }

  before_save :set_asset_id

  index({ id: 1, owner_id: 1 }, unique: true)
  index(timestamp: -1)

  default_scope -> { order_by(timestamp: :desc) }

  delegate(*DELEGATE_TO_DOCUMENT_METHODS, to: :document)

  def self.default_timestamp
    DateTime.now.utc
  end

  def read_from_beginning
    ...
  end

  def stored_id
    document&.id
  end

  def cached?
    storage_key == :cache
  end

  def stored?
    storage_key == :store
  end

  private

  def set_asset_id
    self.asset_id = stored_id
  end
end
Stacktrace
Expected behavior
The expected behaviour is for uploads to succeed consistently, and for a failed upload to return an error like "ImageDamaged" or "S3UploadFailed".
Actual behavior
The actual behaviour is that the promotion sometimes fails, and it is unclear why this happens.
Simplest self-contained example code to demonstrate issue
I'm sorry I haven't provided replication steps, as I wasn't able to reproduce it locally.
Should I create the linked template anyway?
System configuration
Ruby version:
ruby 2.5.7p206
Shrine version:
shrine (3.2.1)
shrine-mongoid (1.0.0)
LMK if I can help with more details.
Any pointers or directions on what to check/debug/try would be highly appreciated.
Thank you
A wild, wild guess, but the only thing I could think of is that the cached files are removed before the promotion takes place.
Uploaders::DestroyWorker.perform_async(self.class, cached_data) # consider removing this line and seeing if you still have the same problem?
Not a solution, but a side note: you can set a lifecycle rule on your bucket (if that is feasible) so that cached files are removed after a set period of time, e.g. 24 hours. That would obviate the need to run a background job to delete the cached files; the deletion can be handled by Amazon directly.
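For concreteness, such a rule could look roughly like this. The rule id is made up, the prefix mirrors the :cache prefix from the configuration above, and the hash is the shape the aws-sdk-s3 gem's put_bucket_lifecycle_configuration expects — a sketch, not a tested setup:

```ruby
# Illustrative S3 lifecycle rule: expire cached Shrine uploads after ~24h.
# The id is an assumption; "backlog/" mirrors the :cache prefix used in
# the Shrine configuration earlier in this issue.
lifecycle_rule = {
  id: "expire-shrine-cache",
  status: "Enabled",
  filter: { prefix: "backlog/" },
  expiration: { days: 1 }
}

# With the aws-sdk-s3 gem, this could then be applied along these lines
# (bucket name is a placeholder):
#
#   Aws::S3::Client.new.put_bucket_lifecycle_configuration(
#     bucket: "your-docs-bucket",
#     lifecycle_configuration: { rules: [lifecycle_rule] }
#   )

p lifecycle_rule
```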
Mmm, that is a good guess.
I could try a .perform_in(10.minutes, .. or something. Will let you know if this makes our app happier.
About Amazon, thank you for this tip. I was aware of this option, however the idea of not having control over it really scares me, in the sense of the following scenario:
- A promotion fails for whatever reason
- I get back to work on Monday and check the issue
- I notice Amazon deleted the file and I'm not able to restore it anymore
I know I could set a retention time of weeks or something, however that would still mean I have a deadline to fix the issue.
For now I opted for this approach, as it also lets me check how many files are kept in the cache directory of S3 at any time. If this keeps increasing, it means we have a leak somewhere: we're creating files but never promoting them.
Does it make sense?
Thank you for your reply
It looks like you're nesting the Document record within the BinaryDocument record: you include the attaching logic in BinaryDocument and then delegate methods to Document. Why is this the case? You could have all the attaching logic in Document itself, without delegation. My concern is that Shrine makes use of after-commit callbacks, so I'm unsure of the interplay between those callbacks on the BinaryDocument record and the record holding the actual storage data (the Document record), how that interacts with backgrounding, and whether all methods that need to be delegated are in fact delegated. Perhaps someone more knowledgeable in the library can comment.
If I understand your message correctly, your concept would be: model Document has one model BinaryDocument.
However, we just have BinaryDocument, plain and simple. document is just a field in MongoDB for that model, in which we store the document data for later retrieval from S3.
So instead of a chain like binary_document.document.url we can just do binary_document.url.
Does this answer what you wrote?
From your previous message, I misunderstood how your BinaryDocument model works - please disregard my previous message.
Hi there: has this issue been resolved? Were you able to determine whether this was a Shrine bug or not?
Hello there :) Sorry for the delayed answer, I was on vacation.
We tested .perform_in(10.minutes, .. in production and eventually received the same error.
I will try to recreate a sandbox environment and replicate it, but it seems very hard.
What does Mongoid::Document#reload do when the underlying document has been deleted? This line in shrine-mongoid relies on Mongoid raising an exception if the document belonging to the model instance has been deleted (e.g. Active Record would raise ActiveRecord::RecordNotFound in this case). If Mongoid happens to return nil here instead, that would cause the error you're seeing.
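The suspected failure mode can be sketched with plain-Ruby stand-ins (FakeModel is hypothetical, not a Mongoid or Shrine class): when reload raises, the failure is explicit; when reload returns nil, the subsequent *_data read blows up with exactly the NoMethodError on nil from the issue title.

```ruby
# Plain-Ruby sketch of the suspected failure mode. FakeModel is a stand-in
# for a Mongoid model; it is not part of Shrine or shrine-mongoid.
class FakeModel
  attr_accessor :document_data

  def initialize(deleted: false, raise_not_found: true)
    @deleted = deleted
    @raise_not_found = raise_not_found
    @document_data = { "id" => "abc" }
  end

  def reload
    return self unless @deleted
    raise "DocumentNotFound" if @raise_not_found
    nil # Mongoid with raise_not_found_error: false returns nil instead
  end
end

# shrine-mongoid presumably does something like
# `record.reload.send(:"#{name}_data")` when reloading the attachment column
def reloaded_data(model)
  model.reload.document_data
end

p reloaded_data(FakeModel.new) # happy path: returns the data hash

begin
  reloaded_data(FakeModel.new(deleted: true, raise_not_found: false))
rescue NoMethodError => e
  puts e.message # "undefined method `document_data' for nil..."
end
```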
This is just a guess. I would check myself, but I've since uninstalled MongoDB from my laptop to save on disk space.
It indeed returns nil:
irb(main):001:0> c = Model.last
irb(main):002:0> c.delete
=> true
irb(main):003:0> c.reload
=> nil
The reason for this is that we are using raise_not_found_error: false in mongoid.yml.
This was due to maintaining compatibility when switching from DynamoDB to MongoDB.
However, we never delete data, so I don't really understand the issue yet.
I will work on replicating it outside of work and let you know.
I really believe this is caused by the raise_not_found_error: false setting, and that some documents are indeed being deleted. I looked at shrine-mongoid, but there is really no clean way to handle it other than raising an error ourselves, which probably defeats the purpose of that setting.
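The "raise an error ourselves" option can be sketched with plain-Ruby doubles so it runs without the gems. AttachmentDeleted, promote_with_guard, and the record doubles below are illustrative stand-ins, not real Shrine or Mongoid classes; the point is simply to check the reload result before touching the data:

```ruby
# Sketch of raising an error ourselves: treat a nil reload as a deleted
# record instead of letting nil propagate into a `*_data` call.
# AttachmentDeleted and the record doubles are hypothetical stand-ins.
AttachmentDeleted = Class.new(StandardError)

def promote_with_guard(record)
  reloaded = record.reload
  raise AttachmentDeleted, "record was deleted before promotion" if reloaded.nil?
  reloaded.document_data # safe: reloaded is known to be non-nil here
end

# A record that still exists: reload returns itself.
LiveRecord = Struct.new(:document_data) do
  def reload
    self
  end
end

# A deleted record under raise_not_found_error: false: reload returns nil.
gone = Object.new
def gone.reload
  nil
end

p promote_with_guard(LiveRecord.new({ "id" => "abc" }))

begin
  promote_with_guard(gone)
rescue AttachmentDeleted => e
  puts "guarded: #{e.message}"
end
```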
Since I don't currently have enough information to reproduce this bug, I will close this issue for now. Let me know if you get more information.