tensorflow / tfjs

A WebGL accelerated JavaScript library for training and deploying ML models.

Home Page: https://js.tensorflow.org

Provided weight data has no target variable: batch_normalization

rajeev-samalkha opened this issue · comments

To get help from the community, check out our Google group.

TensorFlow.js version

0.13

Browser version

Chrome Version 69.0.3497.100

Describe the problem or feature request

I converted a Keras model to tfjs using the Python utility with no errors. But when I try to load the model in tfjs, I get the following error:

tfjs@0.13.0:2 Uncaught (in promise) Error: Provided weight data has no target variable: batch_normalization_1_2/gamma
    at new t (tfjs@0.13.0:2)
    at loadWeightsFromNamedTensorMap (tfjs@0.13.0:2)
    at t.loadWeights (tfjs@0.13.0:2)
    at tfjs@0.13.0:2
    at tfjs@0.13.0:2
    at Object.next (tfjs@0.13.0:2)
    at i (tfjs@0.13.0:2)

Code to reproduce the bug / link to feature request

Running it on my local machine:
model = await tf.loadModel(<path_to_model.json>)

Hi
Can you please share your (original) model and the commands used to convert & load?
Thanks

If that weight is an extra one that is lying around for some reason but is not actually needed, you can call tf.loadModel(..., strict=false) to disable the error.

Of course, if the weight is needed, doing this would leave you with a broken model. In that case, as @bileschi said, we'd need to see the original Keras model to determine whether there is a conversion bug.

@rajeev-samalkha is this issue resolved? If so, feel free to close. Thank you.

Folks, sorry for delayed response.

When I take out the Batch Norm layer, it seems to work fine (I had to retrain the model). Is there any difference between Batch Norm in tf.keras and tfjs? I have used the tensorflowjs.converters utility. The model itself is quite simple, with a few Conv2D layers interspersed with Max Pool/Dropout.

Regards
Rajeev

TFJS BatchNorm maintains up to 4 weights:
gamma, beta, movingMean, and movingVariance, which matches those in keras-team/keras:

https://github.com/keras-team/keras/blob/master/keras/layers/normalization.py#L93

I wonder if tf.keras is saving additional tensors, possibly for optimization or related to the momentum for training?

Looking at the tensorflow/keras implementation, I see that it is somewhat more complex, including a 'fused' batch-norm implementation that reaches into C:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/layers/normalization.py

Can you list out the weights in your model from the python code?

Do you need the weights for all the layers, or just the BatchNorm one?

I get the same error while loading a Keras-converted model in the browser. The following error pops up:

(screenshot: embeddings_issue)

As for the model weights:

(screenshot: model_issue_embedding)

Note: some of my converted models containing embeddings tend to work fine in the browser, but sometimes this error shows up. @bileschi @davidsoergel kindly post a proper fix for this issue.

Same error here. Someone please help.

tfjs@0.13.0:2 Uncaught (in promise) Error: Provided weight data has no target variable: conv2d/kernel
at new t (tfjs@0.13.0:2)
at loadWeightsFromNamedTensorMap (tfjs@0.13.0:2)
at t.loadWeights (tfjs@0.13.0:2)
at tfjs@0.13.0:2
at tfjs@0.13.0:2
at Object.next (tfjs@0.13.0:2)
at i (tfjs@0.13.0:2)

I hit the same error without batch norm. Appreciate your help.

Folks

I think I found why we are getting this error. The error can happen for any layer. Steps to reproduce the error:

  1. Load the model in TensorFlow using tf.keras.
  2. Load the same model again (basically load the model more than once).
  3. Use tfjs.converters to convert keras model and you get this error.

It seems every layer name changes in the model.json file (it will be different from the name in model.summary()). For example, one of the layers in my model was 'conv2d_6', but it got renamed to 'conv2d_6_2' when I loaded the model twice. However, the actual weights (presumably in the shard file) still expect 'conv2d_6' in my case.

So until we get a fix, please make sure you load your model only once before doing the tfjs conversion. Hope this helps.
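The renaming behavior described above can be illustrated with a minimal sketch of a Keras-style per-session name registry. The `NameRegistry` class below is hypothetical (it is not the actual Keras code), but it models the relevant behavior: loading the same model twice in one Python session requests the same base names again and receives suffixed copies.

```python
from collections import defaultdict

class NameRegistry:
    """Hypothetical mimic of a Keras-style per-session name registry.

    When a base name is requested again (e.g. by loading the same model
    twice in one session), the registry hands out a suffixed copy so
    that names stay unique.
    """
    def __init__(self):
        self._counts = defaultdict(int)

    def unique(self, base):
        self._counts[base] += 1
        count = self._counts[base]
        return base if count == 1 else f"{base}_{count}"

registry = NameRegistry()  # one registry for the whole session

# First load: the layer name matches what the weight shards expect.
print(registry.unique("conv2d_6"))  # -> conv2d_6

# Second load in the same session: the layer is renamed in model.json,
# while the shard data still refers to plain "conv2d_6".
print(registry.unique("conv2d_6"))  # -> conv2d_6_2
```

This is why loading the model exactly once before conversion (or resetting the session) avoids the mismatch: the registry never hands out a suffixed name.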

I have tried that too, loading the model exactly once, but the same error still occurs.

Not sure if this helps, but when I converted my h5 model using the python code
import tensorflowjs as tfjs
from keras.models import load_model

modelk = load_model('./input/model.h5')
tfjs.converters.save_keras_model(modelk, './output/')

I would receive the following error:

errors.ts:48 Uncaught (in promise) Error: Provided weight data has no target variable: dense_1_7/kernel
    at new t (errors.ts:48)
    at loadWeightsFromNamedTensorMap (container.ts:190)
    at t.loadWeights (container.ts:759)
    at models.ts:285
    at index.ts:79
    at Object.next (index.ts:79)
    at i (index.ts:79)

But if I convert the h5 model using the tensorflowjs_converter command line tool my tfjs json model file will load without any problems.

(screenshot: model summary)

const tf = require('@tensorflow/tfjs');
require('@tensorflow/tfjs-node');
const path = require('path');

async function load(){
  await tf.loadModel(`file://${path.join('xxx', 'model.json')}`);
}

load()

error

(node:72812) UnhandledPromiseRejectionWarning: Error: Provided weight data has no target variable: conv2d_10_1/kernel

@demohi Can you try using setting the strict argument to false, i.e.,

  await tf.loadModel(`file://${path.join('xxx', 'model.json')}`, false);

Also, this might be a bug in loadModel. Can you provide the weight and JSON file to us so we may try reproducing this issue on our end? Thanks.

@caisq Thank you for your reply. It works.

You can convert this Keras model to reproduce the bug.

@demohi I'm looking into this issue now. It seems the cause has to do with the following facts:

  • The model has a layer with the name conv2d_10, however
  • One of the weights for that layer is named conv2d_10_1/kernel in the model.h5 file, so there is an extra suffix _1.

This is the reason why the weight loading fails and you get the error. Can you tell me a little about how the model was saved on the Python side? Is it possible that there were multiple instances of the model in Python memory?

I think we need to fix this issue regardless of what happens on the Python side, as Python Keras / TensorFlow can load this sort of model correctly. But I just want to understand the conditions under which this kind of name mismatches happen. Thanks.
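The failure mode described above, and what the strict flag changes, can be sketched in plain Python. This is an illustrative model of the loader's name matching, not the actual tfjs implementation:

```python
def load_weights(target_vars, provided_weights, strict=True):
    """Assign provided weight data to target variables by exact name.

    target_vars:      dict of variable name -> variable (placeholder here)
    provided_weights: dict of weight name -> weight data
    In strict mode, an orphaned weight name raises, which is exactly the
    error reported in this issue; in non-strict mode it is skipped.
    """
    loaded = {}
    for name, data in provided_weights.items():
        if name in target_vars:
            loaded[name] = data
        elif strict:
            raise ValueError(
                f"Provided weight data has no target variable: {name}")
    return loaded

# The model declares a layer "conv2d_10", but the saved weight carries
# the collision suffix "_1" from a double load on the Python side.
targets = {"conv2d_10/kernel": None}
weights = {"conv2d_10_1/kernel": [0.1, 0.2]}

try:
    load_weights(targets, weights)          # strict: raises
except ValueError as err:
    print(err)

print(load_weights(targets, weights, strict=False))  # -> {} (weight dropped)
```

Note that in this situation, non-strict loading leaves conv2d_10/kernel uninitialized, which is why strict=false only helps when the orphaned weight is genuinely redundant.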

@caisq

I use the colab to train this model with keras.

# keras model
model.save('xxx.h5')

@demohi At the risk of asking too much, I wonder whether you could try running the same code from a Python file (or reset the state of the Colab kernel and run the code from scratch, making sure that each code block is run only once). I expect the name mismatch to disappear in those cases.

Again, don't feel obliged to try that. But if you do have time to try it and let me know, it would be wonderful.

We'll work on a fix in the meantime.

I am facing the same issue. It appears to happen when I run the tensorflowjs_converter command (via os.system) while the model is still loaded from the same input Keras .h5 file. If I run tensorflowjs_converter separately from the shell after the Python program has finished, it seems to work fine.

Same error with

Error: Provided weight data has no target variable: Conv1_1/kernel

Could someone fix it? I am converting mobilenet_v2 to TensorFlow.js for a browser classification task.

@demohi At the risk of asking too much, I wonder whether you could try running the same code from a Python file (or reset the state of the Colab kernel and run the code from scratch, making sure that each code block is run only once). I expect the name mismatch to disappear in those cases.

Again, don't feel obliged to try that. But if you do have time to try it and let me know, it would be wonderful.

We'll work on a fix in the meantime.

This helped me solve the problem. I had to run my Google Colab notebook again after clearing the runtime. I made sure I executed each code block once, and then I converted the model using
tensorflowjs_converter --input_format keras ./my_model.h5 ./my_model_as_tfjs

It is working perfectly fine in the browser now :)

I was experiencing this in the following scenario:

  • train a model using model.fit(..., callbacks=[ModelCheckpoint])
  • load the best model (not just the model weights) using model = load_model(ckpt_path)
  • convert and save using tfjs.converters.save_keras_model

This might be obvious, but the issue was that the call to load_model created a whole new set of layers without removing the old tf variables. Keras was showing the proper layer.name, but I still had the mismatch.

The underlying tf.Variable objects had name collisions with the first model, and therefore got a suffix of _1 (like char_embedding_1/embeddings:0 instead of char_embedding/embeddings:0). You can see these names with something like

for layer in model.layers:
    print(layer.weights)

To solve my version of the issue (where there was at some point a copy of the same model and I loaded a new one), you can reset the tf session entirely before loading

import keras.backend as K
...
model.fit(data, callbacks=[...])
K.clear_session()  # this resets the session containing the stale, not-best version of the model
model = load_model(ckpt_path)
tfjs.converters.save_keras_model(model, out_dir)
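A suffixed variable name like the ones above can also be mapped back to its layer programmatically. The match_weight helper below is a hypothetical diagnostic (not part of tfjs or Keras) that strips a single trailing collision counter when the exact layer name is not found:

```python
import re

def match_weight(weight_name, layer_names):
    """Map a possibly-suffixed weight name back to a known layer.

    'conv2d_10_1/kernel' -> 'conv2d_10/kernel' when the model only has a
    layer 'conv2d_10': the TF-style ':0' suffix is dropped first, then a
    single trailing '_<n>' collision counter is stripped if needed.
    Returns None when no known layer matches.
    """
    layer, _, var = weight_name.split(":")[0].partition("/")
    if layer not in layer_names:
        layer = re.sub(r"_\d+$", "", layer)  # drop one collision counter
        if layer not in layer_names:
            return None
    return f"{layer}/{var}" if var else layer

layers = {"conv2d_10", "char_embedding"}
print(match_weight("conv2d_10_1/kernel", layers))             # conv2d_10/kernel
print(match_weight("char_embedding_1/embeddings:0", layers))  # char_embedding/embeddings
print(match_weight("dense_3/bias", layers))                   # None
```

This only works as a diagnostic: it cannot tell a genuine trailing number in a layer name from a collision counter, so clearing the session before loading remains the reliable fix.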

The way I solved the same problem (provided weight data has no target variable conv1_1/kernel) was by clearing all output and cache of my Jupyter notebook, loading the model (model = load_model('./tf_files/keras/modelKeras2.h5')), and converting with tfjs (tfjs.converters.save_keras_model(model, './tfjsModelConverted/model6')).

Hope it helps...

I was experiencing this in the following scenario:

  • train a model using model.fit(..., callbacks=[ModelCheckpoint])
  • load the best model (not just the model weights) using model = load_model(ckpt_path)
  • convert and save using tfjs.converters.save_keras_model

This might be obvious, but the issue was that the call to load_model created a whole new set of layers without removing the old tf variables. Keras was showing the proper layer.name, but I still had the mismatch.

The underlying tf.Variable objects had name collisions with the first model, and therefore got a suffix of _1 (like char_embedding_1/embeddings:0 instead of char_embedding/embeddings:0). You can see these names with something like

for layer in model.layers:
    print(layer.weights)

To solve my version of the issue (where there was at some point a copy of the same model and I loaded a new one), you can reset the tf session entirely before loading

import keras.backend as K
...
model.fit(data, callbacks=[...])
K.clear_session()  # this resets the session containing the stale, not-best version of the model
model = load_model(ckpt_path)
tfjs.converters.save_keras_model(model, out_dir)

This particular way worked for me, but instead of using the

keras.backend.clear_session()

call, I used the TensorFlow API to Keras, as clearing the session directly through the keras module throws an error with TensorFlow. The following is what I used:

tf.keras.backend.clear_session()

Here is how I converted the model using Google Colab (IPython).
The Python API seems to work for this version at least. No need to set the strict parameter in this case.

Once you have saved the entire model as an h5 file, upload it to Colab and run the script to generate the tfjs model.

!pip install tensorflowjs==1.2.6

Restart runtime after installation

import os
import keras
from keras.models import load_model
import tensorflow as tf
import tensorflowjs as tfjs

Load the model:

tf.compat.v1.disable_eager_execution()
model = load_model('/content/model.h5')  # path to model

Create a directory:

!mkdir model

Convert the model:

tfjs.converters.save_keras_model(model, '/content/model')

!zip -r model.zip /content/model

Download and verify.

@anilsathyan7 thank you , closing this issue.