ankane / onnxruntime-ruby

Run ONNX models in Ruby

Instantiated model occasionally... goofs? Requiring re-instantiation

dhnaranjo opened this issue

Hello! I wish I could better describe the problem I'm encountering, but hopefully I can demonstrate it effectively.

I find myself scoring a whole bunch of things serially. The dummy approach of loading a new model every time was a waste of time, so I decided to create a lil singleton that would load the model once and then run predictions as needed. A problem I'm encountering is that after some indeterminate number of predictions the model will kinda... stop working good. I'll get errors like Invalid Feed Input Name:��Y� or input name cannot be empty.

If I then reload the model and run the same inputs that just failed it'll carry on successfully for a while.

I wrote up a little test scenario to demonstrate. Forgive me if I don't attach the model as it's, you know, the corporate sauce. Hopefully this will be useful to you without that.

require "singleton"
require "onnxruntime"

class TrainedModel
  include Singleton

  MODEL_NAME = "trained_model.onnx"

  attr_reader :model
  attr_accessor :consecutive_successes

  def self.predict(input)
    results = instance.model.predict(input)
    instance.consecutive_successes += 1
    results
  rescue OnnxRuntime::Error => e
    puts "Reloading model due to exception: #{e}. Consecutive successes: #{instance.consecutive_successes}"
    instance.consecutive_successes = 0
    instance.load_model
    instance.model.predict(input)
  end

  def initialize
    @consecutive_successes = 0
    load_model
  end

  def predict(input)
    model.predict(input)
  end

  def load_model
    @model = OnnxRuntime::Model.new(model_path)
  end

  def model_path
    File.join(File.dirname(__FILE__), self.class::MODEL_NAME)
  end
end

# Our model accepts an array of 10 #s betwixt 0.0 and 1.0
input = Array.new(10) { 1.0 }

n = 100_000

n.times do
  TrainedModel.predict({"input:0" => [input]})
end

puts "Final consecutive successes: #{TrainedModel.instance.consecutive_successes}"

What's curious is that the failures seem to be in the earliest set of predictions, as shown in a few runs of the test:

➜  onnxruntime_test ruby script.rb
Reloading model due to exception: input name cannot be empty. Consecutive successes: 1669
Final consecutive successes: 98330
➜  onnxruntime_test ruby script.rb
Reloading model due to exception: Invalid Feed Input Name:��Wb�. Consecutive successes: 80
Final consecutive successes: 99919
➜  onnxruntime_test ruby script.rb
Reloading model due to exception: input name cannot be empty. Consecutive successes: 761
Reloading model due to exception: Invalid Feed Input Name:��F�. Consecutive successes: 1674
Final consecutive successes: 97563
➜  onnxruntime_test ruby script.rb
Reloading model due to exception: Invalid Feed Input Name:�����. Consecutive successes: 216
Reloading model due to exception: Invalid Feed Input Name:�����. Consecutive successes: 43
Final consecutive successes: 99739

Anyways, this retry strategy is gonna get my feature delivered, but it sure is curious.
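Since the real model can't be shared, here is a minimal sketch of the same reload-and-retry idea that runs without onnxruntime. Everything in it is hypothetical and for illustration only: FakeModel stands in for OnnxRuntime::Model and fails after a fixed number of predictions, FakeError stands in for OnnxRuntime::Error, and ReloadingPredictor is just one way to cap the number of retries so a model that is truly broken can't loop forever.

```ruby
# Hypothetical stand-in for OnnxRuntime::Error.
class FakeError < StandardError; end

# Hypothetical stand-in for a loaded model: raises once it has
# served `limit` predictions, mimicking the intermittent failure.
class FakeModel
  def initialize(limit)
    @limit = limit
    @served = 0
  end

  def predict(input)
    raise FakeError, "input name cannot be empty" if @served >= @limit

    @served += 1
    input.sum
  end
end

# Wraps a loader block. On failure it reloads the model and retries
# the same input, at most `max_retries` times per prediction.
class ReloadingPredictor
  attr_reader :reloads

  def initialize(max_retries: 1, &loader)
    @loader = loader
    @max_retries = max_retries
    @reloads = 0
    @model = @loader.call
  end

  def predict(input)
    attempts = 0
    begin
      @model.predict(input)
    rescue FakeError
      # Give up and re-raise once the retry budget is spent.
      raise if attempts >= @max_retries

      attempts += 1
      @reloads += 1
      @model = @loader.call
      retry
    end
  end
end

predictor = ReloadingPredictor.new { FakeModel.new(3) }
results = Array.new(10) { predictor.predict([1.0, 2.0]) }
puts "Reloads: #{predictor.reloads}"
```

With the real library, the loader block would presumably build an OnnxRuntime::Model and the rescue would name OnnxRuntime::Error, as in the script above.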

I'm on an M1 Mac Mini but a coworker saw similar results on an Intel-based Mac.

Hey @dhnaranjo, thanks for reporting (and the great explanation)! Just pushed 0.6.1 with a fix (objects were being garbage collected too early).

Confirm'd working. Gosh, good looking out.