thevasudevgupta / gsoc-wav2vec2

GSoC'2021 | TensorFlow implementation of Wav2Vec2

Home Page: https://thevasudevgupta.github.io/gsoc-wav2vec2/assets/final_report


How to change input signature

bytosaur opened this issue

hey there!

thanks for making this repository! This may be a huge help for me.

When I download the model from https://tfhub.dev/vasudevgupta7/wav2vec2/1, saved_model_cli reports that the model's input signature is (None, 50000) and not (None, 246000). However, when I use TF Hub to load the model into a Keras layer (as done in this Colab), it is (None, 246000).

i am confused... please help :)
thanks a lot!

Hello @bytosaur,

It's kind of strange that the warning asks for shape (None, 50_000); the model was exported with (None, 246_000) only, and I am not sure why that's happening. Here is the link to the checkpoint that works with the code in this repo (https://huggingface.co/vasudevgupta/gsoc-wav2vec2), in case you would like to export the model yourself.

I hope this helps with your project!

thanks a lot! I'll try it out right away :)

cool! I was able to export the model with a sequence length of 80000 using these commands (you can swap conda for virtualenv or whatever):

git clone https://github.com/vasudevgupta7/gsoc-wav2vec2
cd gsoc-wav2vec2

conda create -n wav2vec2 python=3.7
conda activate wav2vec2

pip install -r requirements.txt
pip install torch tensorflow transformers

python src/convert_torch_to_tf.py --hf_model_id "facebook/wav2vec2-base"

python src/export2hub.py --model_id tf-wav2vec2-base --saved_model_dir gsoc-wav2vec2/saved-model --seqlen 80000

# show the signatures
saved_model_cli show --dir gsoc-wav2vec2/saved-model/ --tag_set serve --signature_def serving_default

Note: as of now, the flag is --hf_model_id in the first script but --model_id in the second one.

as a side note: tensorflow==2.6 worked for me :)
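
Since the exported model only accepts inputs of exactly the --seqlen it was exported with, the audio probably has to be cut or padded to that length before calling it. Here is a rough sketch of that step in plain Python. The helper name fit_to_seqlen, the zero padding value, and the 16 kHz sample rate in the example are assumptions for illustration, not something taken from the repo:

```python
from typing import List, Sequence


def fit_to_seqlen(
    samples: Sequence[float], seqlen: int, pad_value: float = 0.0
) -> List[float]:
    """Return a copy of `samples` that is exactly `seqlen` items long.

    Longer inputs are truncated at the end; shorter inputs are padded
    on the right with `pad_value` (zero padding is an assumption here).
    """
    if seqlen <= 0:
        raise ValueError("seqlen must be positive")
    clipped = list(samples[:seqlen])
    return clipped + [pad_value] * (seqlen - len(clipped))


if __name__ == "__main__":
    SEQLEN = 80000  # same value as passed to --seqlen in the export command
    one_second = [0.0] * 16000  # hypothetical clip, assuming 16 kHz audio
    print(len(fit_to_seqlen(one_second, SEQLEN)))
```

In practice you would do the same thing on a NumPy array or a tensor and add a batch dimension so the input matches (None, 80000).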

Glad to know that!