thevasudevgupta / gsoc-wav2vec2

GSoC'2021 | TensorFlow implementation of Wav2Vec2

Home Page: https://thevasudevgupta.github.io/gsoc-wav2vec2/assets/final_report

Port original fine-tuned checkpoint to TFHub

thevasudevgupta opened this issue · comments

Hello @sayakpaul @MorganR,

The 1st checkpoint is up here: https://tfhub.dev/vasudevgupta7/wav2vec2/1 🎉 🎉

Now I can transfer the 2nd checkpoint to TFHub. It's the converted checkpoint that Facebook fine-tuned on the LibriSpeech dataset (the TensorFlow equivalent of this). I think we can make changes to this notebook and link it to our 2nd checkpoint.

Please share your suggestions/comments on this.

Congratulations, @vasudevgupta7 🥳 Be sure to update this in the README of this repository and in other places you think are relevant.

Just a nit: why is the architecture field blank?

[Screenshot 2021-07-29 at 7 13 36 AM]

@MorganR is this a known bug?

> Now I can transfer the 2nd checkpoint to TFHub. It's the converted checkpoint that Facebook fine-tuned on the LibriSpeech dataset (the TensorFlow equivalent of this). I think we can make changes to this notebook and link it to our 2nd checkpoint.

Sounds good. However, help me understand this. I see TensorFlow model weights here:

[Screenshot 2021-07-29 at 7 16 58 AM]

Is it equivalent to what you used in your notebook? Also, when the LibriSpeech fine-tuning (the one you are working on) is complete, it should match the results you got in the notebook, right?

I am not sure about the blank architecture field. Am I missing something in the PR, or is it just a bug?

These TF weights (which I am planning to add to TFHub) are converted from pytorch_model.bin (in the above screenshot) using this script, and are used in many tests (see this). I have also used them in my notebook.

tf_model.h5 (in the above screenshot) is different from our checkpoint (HuggingFace also recently added a TF version of Wav2Vec2).

Yes, our fine-tuned checkpoint (once training is over) should ideally give the same results as this checkpoint, if trained exactly the same way Facebook did. Do you think it's a good idea to send the "converted fine-tuned" checkpoint to TFHub as well, since it came with the paper?
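As an aside, here is a minimal, hypothetical sketch of the kind of transformation such a PyTorch-to-TensorFlow conversion script has to perform (the function name and shapes are illustrative only; the real script handles every layer type, including convolutions):

```python
import numpy as np

# Hypothetical sketch, NOT the actual conversion script: PyTorch's
# nn.Linear stores its weight as (out_features, in_features), while a
# Keras Dense layer stores its kernel as (in_features, out_features),
# so every dense weight must be transposed while copying.

def torch_linear_to_tf_kernel(torch_weight: np.ndarray) -> np.ndarray:
    """Transpose a (out, in) PyTorch Linear weight into a (in, out) TF kernel."""
    return np.ascontiguousarray(torch_weight.T)

# Toy weight with wav2vec2-like sizes: 768 output features, 512 inputs.
torch_w = np.ones((768, 512))
tf_kernel = torch_linear_to_tf_kernel(torch_w)
assert tf_kernel.shape == (512, 768)
```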

> tf_model.h5 (in the above screenshot) is different from our checkpoint (HuggingFace also recently added a TF version of Wav2Vec2).

Could you elaborate a bit more on the aspects in which they are different? Maybe I am missing something.

> Do you think it's a good idea to send the "converted fine-tuned" checkpoint to TFHub as well, since it came with the paper?

I think if our fine-tuned checkpoints produce similar results on the evaluation set, then it's fine to export only those to the Hub.

> tf_model.h5 (in the above screenshot) is different from our checkpoint (HuggingFace also recently added a TF version of Wav2Vec2).

> Could you elaborate a bit more on the aspects in which they are different? Maybe I am missing something.

HuggingFace also added TF Wav2Vec2 to Transformers, so they too converted Wav2Vec2 from their PyTorch version to TF. Their converted model and mine are completely equivalent in terms of outputs, but my conversion script is very different from theirs. The layer and weight naming is therefore different, and their TF checkpoint doesn't work with my code.
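To illustrate the point above with a toy sketch: checkpoints are restored by variable name, so two output-equivalent ports with different naming schemes cannot load each other's weights without an explicit translation table. All names below are made up for illustration; they are not the real variable names in either codebase.

```python
# Hypothetical weights as saved by one port (names are illustrative).
huggingface_weights = {
    "tf_wav2vec2.encoder.layers.0.attention.q_proj.kernel": [0.1, 0.2],
}

# The names another port's code expects (again, illustrative only).
expected_names = {"wav2vec2/encoder/layer_0/attention/query/kernel"}

# A direct load fails: none of the saved names match the expected ones.
matching = expected_names & set(huggingface_weights)
assert not matching

# Making the checkpoint usable would require a name-translation table,
# one entry per variable:
name_map = {
    "tf_wav2vec2.encoder.layers.0.attention.q_proj.kernel":
        "wav2vec2/encoder/layer_0/attention/query/kernel",
}
remapped = {name_map[k]: v for k, v in huggingface_weights.items()}
assert set(remapped) == expected_names
```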

> Do you think it's a good idea to send the "converted fine-tuned" checkpoint to TFHub as well, since it came with the paper?

> I think if our fine-tuned checkpoints produce similar results on the evaluation set, then it's fine to export only those to the Hub.

Okay.

> HuggingFace also added TF Wav2Vec2 to Transformers, so they too converted Wav2Vec2 from their PyTorch version to TF. Their converted model and mine are completely equivalent in terms of outputs, but my conversion script is very different from theirs. The layer and weight naming is therefore different, and their TF checkpoint doesn't work with my code.

Thanks for the clarification.