tensorflow / tfjs

A WebGL accelerated JavaScript library for training and deploying ML models.

Home Page: https://js.tensorflow.org

[tfjs-node] Backend overwritten when new environment (such as a worker) is created

Pierrci opened this issue

TensorFlow.js version

tfjs-node 2.0.1

Browser version

Node.js 12.16.3

Describe the problem or feature request

When tfjs-node is required in a new environment (such as a worker when using worker threads) and has already been required in another environment before (such as the main thread), the new TF backend instantiated in the new environment overwrites any previously instantiated one.

This behavior implies that if you try to:

  1. Load a SavedModel in an environment (such as the main thread, or a worker thread)
  2. Load another SavedModel in a different environment (another worker thread).

Then the SavedModel loaded in step 1 will "disappear" (Tensor not referenced errors when trying to run it), even in the environment it was created in. However, the environment from step 1 will be able to access and run the new SavedModel loaded in step 2.
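
The symptom described above can be mimicked with a small toy model that does not use tfjs-node at all. It is only a sketch under the assumption that a single process-wide slot holds the current backend and that each newly initialized environment replaces it. The names `processWide`, `initEnvironment`, `loadModel` and `run` are hypothetical and are not tfjs-node APIs, and the real internals may well differ.

```javascript
// Toy model of the reported symptom. All names here are hypothetical and
// are NOT tfjs-node APIs; the real internals may differ.

// One process-wide slot holds the current backend.
const processWide = { backend: null };
let nextModelId = 1; // ids are unique across the whole process

// Stands in for requiring tfjs-node in a new environment (main thread, worker, ...).
function initEnvironment(envName) {
  // Creating a backend replaces whatever backend was registered before.
  processWide.backend = { owner: envName, models: new Map() };
  return {
    loadModel(name) {
      const id = nextModelId++;
      processWide.backend.models.set(id, name);
      return id;
    },
    run(id) {
      const { owner, models } = processWide.backend;
      if (!models.has(id)) {
        throw new Error(`model ${id} is not referenced in the backend created by ${owner}`);
      }
      return models.get(id);
    },
  };
}

const env1 = initEnvironment('main');
const modelA = env1.loadModel('A');

const env2 = initEnvironment('worker'); // overwrites the backend env1 was using
const modelB = env2.loadModel('B');

let env1Error = null;
try {
  env1.run(modelA); // model A is gone, even for the environment that loaded it
} catch (err) {
  env1Error = err.message;
}
console.log(env1Error);
console.log(env1.run(modelB)); // env1 can still run the model loaded by env2
```

In this toy, the first environment loses its own model but can run the one loaded by the second environment, which matches the two observations above.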

It seems to me that the expected behavior should be one of the following:

  • (a) The TF backend is shared between all the different environments tfjs-node is required in. This means that if a SavedModel is loaded in env 2, it will also be available in env 1, without overwriting any SavedModel loaded in env 1. In the same way, models loaded in env 1 will be available in env 2.
  • (b) A different, isolated TF backend for each environment, so that environments are isolated from each other when handling SavedModels and each one has to load the models it wants to use.
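
To make the difference between (a) and (b) concrete, here is a minimal sketch that runs the same two-environment scenario against a shared model table and against one table per environment. It does not use tfjs-node. The classes `SharedBackend` and `IsolatedBackends` and the helpers `forEnvironment` and `scenario` are hypothetical names introduced only for illustration, and the environment ids stand in for whatever identifies a thread or context in a real implementation.

```javascript
// Hypothetical sketch of options (a) and (b); this is not tfjs-node code.

// Option (a): one model table shared by every environment.
class SharedBackend {
  constructor() {
    this.models = new Map();
  }
  forEnvironment(envId) {
    const models = this.models; // same table regardless of envId
    return {
      load: (name) => models.set(name, { loadedBy: envId }),
      has: (name) => models.has(name),
    };
  }
}

// Option (b): one model table per environment, created on first use.
class IsolatedBackends {
  constructor() {
    this.perEnv = new Map();
  }
  forEnvironment(envId) {
    if (!this.perEnv.has(envId)) {
      this.perEnv.set(envId, new Map());
    }
    const models = this.perEnv.get(envId);
    return {
      load: (name) => models.set(name, { loadedBy: envId }),
      has: (name) => models.has(name),
    };
  }
}

// Env 1 loads model A, env 2 loads model B, then we record who sees what.
function scenario(backends) {
  const env1 = backends.forEnvironment(1);
  const env2 = backends.forEnvironment(2);
  env1.load('A');
  env2.load('B');
  return {
    env1SeesA: env1.has('A'),
    env1SeesB: env1.has('B'),
    env2SeesA: env2.has('A'),
    env2SeesB: env2.has('B'),
  };
}

const shared = scenario(new SharedBackend());
const isolated = scenario(new IsolatedBackends());
console.log({ shared, isolated });
```

With the shared table every environment sees both models and nothing is overwritten. With per-environment tables each environment sees only the model it loaded itself.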

Code to reproduce the bug / link to feature request

I coded fixes corresponding to the two different behaviors:

I've been successfully experimenting with both fixes for my use case, which uses multiple worker threads to interact with SavedModels. I'm willing to work on a PR for (a) or (b), which can of course be amended based on your feedback (particularly the solution for (b)).

@Pierrci Hi, it looks like the Agent concept would create good isolation. Are you still willing to contribute your changes for option (b)? Thanks.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you.

Closing as stale. Please @mention us if this needs more attention.
