wenet-e2e / wespeaker

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

Repository from GitHub: https://github.com/wenet-e2e/wespeaker

Redimnet

freshpearYoon opened this issue

I can see Redimnet in your code (redimnet.py), but there is no mention of it in the pretrained model README.md or the docs.
Is it possible to use Redimnet?

The configuration and code have been added, but the pretrained model is not currently supported. I only have initial results, and we may update the pretrained models in the future. For now, you can try training with different configurations.

Thank you for your reply!

Hello @wsstriving ,

Thank you for providing the training code for Redimnet. While training the Redimnet model, I encountered the following error:

`.................................
[ INFO : 2024-08-28 18:33:36,699 ] - (pool): ASTP(
[ INFO : 2024-08-28 18:33:36,699 ] - (linear1): Conv1d(2592, 128, kernel_size=(1,), stride=(1,))
[ INFO : 2024-08-28 18:33:36,699 ] - (linear2): Conv1d(128, 864, kernel_size=(1,), stride=(1,))
[ INFO : 2024-08-28 18:33:36,699 ] - )
[ INFO : 2024-08-28 18:33:36,700 ] - (seg_1): Linear(in_features=1728, out_features=192, bias=True)
[ INFO : 2024-08-28 18:33:36,700 ] - (seg_bn_1): Identity()
[ INFO : 2024-08-28 18:33:36,700 ] - (seg_2): Identity()
[ INFO : 2024-08-28 18:33:36,700 ] - (projection): ArcMarginProduct(
[ INFO : 2024-08-28 18:33:36,700 ] - in_features=192, out_features=2, scale=32.0,
[ INFO : 2024-08-28 18:33:36,700 ] - margin=0.0, easy_margin=False
[ INFO : 2024-08-28 18:33:36,700 ] - )
[ INFO : 2024-08-28 18:33:36,700 ] - )
Traceback (most recent call last):
File "wespeaker/bin/train.py", line 257, in <module>
fire.Fire(train)
File "/home/anaconda3/envs/3D/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/home/anaconda3/envs/3D/lib/python3.8/site-packages/fire/core.py", line 466, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/home/anaconda3/envs/3D/lib/python3.8/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "wespeaker/bin/train.py", line 156, in train
script_model = torch.jit.script(model)
File "/home/anaconda3/envs/3D/lib/python3.8/site-packages/torch/jit/_script.py", line 1286, in script
return torch.jit._recursive.create_script_module(
File "/home/anaconda3/envs/3D/lib/python3.8/site-packages/torch/jit/_recursive.py", line 476, in create_script_module
return create_script_module_impl(nn_module, concrete_type, stubs_fn)
File "/home/anaconda3/envs/3D/lib/python3.8/site-packages/torch/jit/_recursive.py", line 538, in create_script_module_impl
script_module = torch.jit.RecursiveScriptModule._construct(cpp_module, init_fn)
File "/home/anaconda3/envs/3D/lib/python3.8/site-packages/torch/jit/_script.py", line 615, in _construct
init_fn(script_module)
File "/home/anaconda3/envs/3D/lib/python3.8/site-packages/torch/jit/_recursive.py", line 516, in init_fn
scripted = create_script_module_impl(orig_value, sub_concrete_type, stubs_fn)
File "/home/anaconda3/envs/3D/lib/python3.8/site-packages/torch/jit/_recursive.py", line 538, in create_script_module_impl
script_module = torch.jit.RecursiveScriptModule._construct(cpp_module, init_fn)
File "/home/anaconda3/envs/3D/lib/python3.8/site-packages/torch/jit/_script.py", line 615, in _construct
init_fn(script_module)
File "/home/anaconda3/envs/3D/lib/python3.8/site-packages/torch/jit/_recursive.py", line 516, in init_fn
scripted = create_script_module_impl(orig_value, sub_concrete_type, stubs_fn)
File "/home/anaconda3/envs/3D/lib/python3.8/site-packages/torch/jit/_recursive.py", line 538, in create_script_module_impl
script_module = torch.jit.RecursiveScriptModule._construct(cpp_module, init_fn)
File "/home/anaconda3/envs/3D/lib/python3.8/site-packages/torch/jit/_script.py", line 615, in _construct
init_fn(script_module)
File "/home/anaconda3/envs/3D/lib/python3.8/site-packages/torch/jit/_recursive.py", line 516, in init_fn
scripted = create_script_module_impl(orig_value, sub_concrete_type, stubs_fn)
File "/home/anaconda3/envs/3D/lib/python3.8/site-packages/torch/jit/_recursive.py", line 542, in create_script_module_impl
create_methods_and_properties_from_stubs(concrete_type, method_stubs, property_stubs)
File "/home/anaconda3/envs/3D/lib/python3.8/site-packages/torch/jit/_recursive.py", line 393, in create_methods_and_properties_from_stubs
concrete_type._create_methods_and_properties(property_defs, property_rcbs, method_defs, method_rcbs, method_defaults)
RuntimeError:
cannot statically infer the expected size of a list in this context:
File "/home/wespeaker-master/wespeaker/models/redimnet.py", line 50
def forward(self, x):
size = x.size()
bs, c, f, t = tuple(size)
~~~~~~~~~~ <--- HERE
return x.permute((0, 2, 1, 3)).reshape((bs, c * f, t))

ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 9961) of binary: /home/anaconda3/envs/3D/bin/python
Traceback (most recent call last):
File "/home/anaconda3/envs/3D/bin/torchrun", line 8, in <module>
sys.exit(main())
File "/home/anaconda3/envs/3D/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
return f(*args, **kwargs)
File "/home/anaconda3/envs/3D/lib/python3.8/site-packages/torch/distributed/run.py", line 762, in main
run(args)
File "/home/anaconda3/envs/3D/lib/python3.8/site-packages/torch/distributed/run.py", line 753, in run
elastic_launch(
File "/home/anaconda3/envs/3D/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/anaconda3/envs/3D/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 246, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: ..............
`

However, after commenting out lines 155 to 157 in https://github.com/wenet-e2e/wespeaker/blob/master/wespeaker/bin/train.py (which skips exporting the init.zip file), the model was able to train successfully.

Could you please explain what it means when exporting init.zip fails? Does this indicate that the Redimnet model cannot be exported to ONNX as well?

Thank you!

I forgot to mention that the current implementation does not yet meet the requirements for JIT export. For now, you can simply comment out the JIT check code to run the model. We will address this issue later.
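For anyone who wants to keep the JIT check enabled, here is a minimal sketch of one possible workaround, not the maintainers' fix. The traceback points at `tuple(size)` in the `forward` at redimnet.py line 50: TorchScript sees `x.size()` as a list whose length it cannot know at compile time, so it refuses to convert it to a tuple. Reading each dimension by index avoids that. The class names `To1dOriginal` and `To1dScriptable` and the helper `try_script` are hypothetical names used only for this illustration.

```python
from typing import Optional

import torch
import torch.nn as nn


class To1dOriginal(nn.Module):
    """Mirrors the forward() quoted in the traceback above."""

    def forward(self, x):
        size = x.size()
        bs, c, f, t = tuple(size)  # the line TorchScript rejects in the traceback
        return x.permute((0, 2, 1, 3)).reshape((bs, c * f, t))


class To1dScriptable(nn.Module):
    """Same computation, with each dimension read by index (assumed workaround)."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        bs, c, f, t = x.size(0), x.size(1), x.size(2), x.size(3)
        return x.permute(0, 2, 1, 3).reshape(bs, c * f, t)


def try_script(module: nn.Module) -> Optional[torch.jit.ScriptModule]:
    """Return the scripted module, or None if torch.jit.script raises."""
    try:
        return torch.jit.script(module)
    except Exception as err:  # broad on purpose: this is only a diagnostic
        first_lines = str(err).strip().splitlines()[:2]
        print(f"{type(module).__name__} is not scriptable: {' '.join(first_lines)}")
        return None


if __name__ == "__main__":
    x = torch.randn(2, 3, 4, 5)  # (batch, channels, freq, time)
    for module in (To1dOriginal(), To1dScriptable()):
        scripted = try_script(module)
        if scripted is not None:
            print(type(module).__name__, "scripted, output shape:", tuple(scripted(x).shape))
```

Both modules return the same tensor in eager mode, so the change should not affect training. This only addresses the one line reported in the traceback; other parts of the model may need similar changes before `torch.jit.script(model)` succeeds. On the ONNX question: `torch.onnx.export` traces a plain `nn.Module` by default instead of scripting it, so a scripting failure does not by itself mean ONNX export will fail, although that has not been verified here for Redimnet.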