Save/Load Model have different model evaluation

Question

Arfius opened this issue 3 years ago · comments

Hello.
I see an anomaly when saving and loading models.
I trained and evaluated a model on the fashion_mnist dataset in a single Jupyter notebook.
I ran it on two machines: an M1 and an Intel i5.
The notebook follows these steps:

- load dataset
- train the model
- evaluate the model
- save the model in h5 file
- load the model from the file above
- evaluate the model

On the Intel i5, steps 3 and 6 (the two evaluations) give the same result; on the M1, they differ.
The attached PDFs show the results step by step.
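
Steps 3 to 6 above can be sketched as a small round-trip check. This is a minimal sketch, not the notebook's actual code: `evaluations_match` and `roundtrip_check` are hypothetical helper names, the tolerances are arbitrary, and `load_fn` is assumed to be a loader such as `keras.models.load_model`. The sketch only relies on the model exposing `evaluate` and `save`.

```python
import math


def evaluations_match(before, after, rel_tol=1e-5, abs_tol=1e-7):
    """Return True if two evaluate() results agree within a tolerance."""
    # evaluate() returns a scalar loss, or a list [loss, metric, ...]
    before = list(before) if isinstance(before, (list, tuple)) else [before]
    after = list(after) if isinstance(after, (list, tuple)) else [after]
    if len(before) != len(after):
        return False
    return all(
        math.isclose(b, a, rel_tol=rel_tol, abs_tol=abs_tol)
        for b, a in zip(before, after)
    )


def roundtrip_check(model, x_test, y_test, path, load_fn):
    """Evaluate, save, reload, evaluate again, and compare the two results."""
    before = model.evaluate(x_test, y_test, verbose=0)   # step 3
    model.save(path)                                     # step 4
    reloaded = load_fn(path)                             # step 5
    after = reloaded.evaluate(x_test, y_test, verbose=0) # step 6
    return before, after, evaluations_match(before, after)
```

With Keras this might be called as `roundtrip_check(model, x_test, y_test, "model.h5", keras.models.load_model)`. A `False` in the last position corresponds to the mismatch seen on the M1.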

Thanks for your support

Training - Intel CORE i5 - 7th - Jupyter Notebook.pdf
Training MacBookPro M1 - Jupyter Notebook.pdf

Deleted user · Answer 1 · Thu Apr 29 2021 18:43:54 GMT+0800 (China Standard Time)

I have a similar issue. I work on reinforcement learning with TensorFlow and Keras on my MacBook M1. I train a NN on a game and it solves it properly; I even run a stability check, verifying 10 times that the NN gives the same result, and it is always good. I save it using save_model. When I load it back using load_model, sometimes it gives the good result and sometimes not. Very strange. I think there is an issue with the load/save functions in the M1 version of TensorFlow.
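
Because the mismatch is intermittent, one way to quantify it is to save the model once and reload it several times. The following is a rough sketch under assumptions: `reload_trials` is a hypothetical helper, `load_fn` stands in for a loader such as `keras.models.load_model`, and the tolerance is arbitrary.

```python
import math


def reload_trials(model, x, y, path, load_fn, trials=10, rel_tol=1e-5):
    """Save once, reload `trials` times, and report which reloads disagree."""

    def as_list(result):
        # evaluate() gives a scalar loss or a list [loss, metric, ...]
        return list(result) if isinstance(result, (list, tuple)) else [result]

    reference = as_list(model.evaluate(x, y, verbose=0))
    model.save(path)

    mismatches = []
    for trial in range(trials):
        current = as_list(load_fn(path).evaluate(x, y, verbose=0))
        same = len(current) == len(reference) and all(
            math.isclose(r, c, rel_tol=rel_tol)
            for r, c in zip(reference, current)
        )
        if not same:
            # Keep the trial index and the result that differed
            mismatches.append((trial, current))
    return reference, mismatches
```

An empty `mismatches` list means every reload reproduced the original evaluation; a non-empty list gives the trial indices where it did not.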

Deleted user · Answer 2 · Fri Apr 30 2021 20:52:39 GMT+0800 (China Standard Time)

I spent a lot of time testing different approaches, and it seems that using the GPU is causing the issue. I put the following code at the beginning of my notebook and it seems to work fine now, even much faster than with the GPU:
# Run in graph mode instead of eager mode
from tensorflow.python.framework.ops import disable_eager_execution
disable_eager_execution()
# Pin ML Compute to the CPU (this module exists only in Apple's tensorflow_macos fork)
from tensorflow.python.compiler.mlcompute import mlcompute
mlcompute.set_mlc_device(device_name='cpu')
I know that there was a buggy version of TensorFlow with load/save when using the GPU in the past; maybe that is the one Apple forked to create this mlcompute library.

Alfonso Farruggia · Answer 3 · Fri Apr 30 2021 20:54:44 GMT+0800 (China Standard Time)

Thanks, I will try it for sure.

Deleted user · Answer 4 · Fri Apr 30 2021 21:02:03 GMT+0800 (China Standard Time)

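Also, when saving your model, please try:
with open(modelfile + '.json', 'w') as f:
    f.write(model.to_json())  # architecture only
model.save_weights(modelfile + '.h5', overwrite=True)  # weights only
and when loading:
from tensorflow.keras.models import model_from_json
with open(modelfile + '.json') as f:
    model = model_from_json(f.read())  # uncompiled: call compile() before evaluate()
model.load_weights(modelfile + '.h5')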

Deleted user · Answer 5 · Fri Apr 30 2021 21:04:05 GMT+0800 (China Standard Time)

I went around in circles for weeks before I found that :) The fact that the CPU runs faster than the GPU was both a good and a bad surprise (good because it greatly improves my training cycles, bad because it is not supposed to work this way :) )

Alfonso Farruggia · Answer 6 · Thu May 06 2021 15:10:00 GMT+0800 (China Standard Time)

It doesn't work on my side; something is messed up on my machine for sure.