Keyword recognition in Python: Advanced models throw SPXERR_INVALID_ARG
S1lverhand opened this issue
I am doing keyword recognition in Python 3.9 with azure-cognitiveservices-speech=1.40.0 using PyCharm 2024.1.1 (Professional Edition) on a Windows 11 Pro machine. The following code works as expected for basic models, but throws SPXERR_INVALID_ARG for advanced models (lowfa, midfa and highfa). All models were trained on the same day, 30 August.
My code:
import azure.cognitiveservices.speech as speechsdk
key = MY_KEY # private
region = 'westeurope'
speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
speech_config.set_property(speechsdk.PropertyId.Speech_LogFilename, "speech_sdk.log")
recognizer_input_stream = speechsdk.audio.PushAudioInputStream()
audio_config = speechsdk.audio.AudioConfig(stream=recognizer_input_stream)
recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)
# keyword_model = "basic.table" # good
keyword_model = "advanced_midfa.table" # throws SPXERR_INVALID_ARG
kw_model = speechsdk.KeywordRecognitionModel(keyword_model)
recognizer.start_keyword_recognition(kw_model)
Stacktrace:
D:\test_project\.venv\Scripts\python.exe D:\test_project\test.py
Traceback (most recent call last):
File "D:\test_project\test.py", line 18, in <module>
recognizer.start_keyword_recognition(kw_model)
File "D:\test_project\.venv\lib\site-packages\azure\cognitiveservices\speech\speech.py", line 821, in start_keyword_recognition
return self.start_keyword_recognition_async(model).get()
File "D:\test_project\.venv\lib\site-packages\azure\cognitiveservices\speech\speech.py", line 576, in get
result_handle = self.__get_function(self._handle)
File "D:\test_project\.venv\lib\site-packages\azure\cognitiveservices\speech\speech.py", line 1111, in resolve_future
_call_hr_fn(fn=_sdk_lib.recognizer_start_keyword_recognition_async_wait_for, *[handle, max_uint32])
File "D:\test_project\.venv\lib\site-packages\azure\cognitiveservices\speech\interop.py", line 62, in _call_hr_fn
_raise_if_failed(hr)
File "D:\test_project\.venv\lib\site-packages\azure\cognitiveservices\speech\interop.py", line 55, in _raise_if_failed
__try_get_error(_spx_handle(hr))
File "D:\test_project\.venv\lib\site-packages\azure\cognitiveservices\speech\interop.py", line 50, in __try_get_error
raise RuntimeError(message)
RuntimeError: Exception with error code:
[CALL STACK BEGIN]
> keyword_spotter_initialize
- keyword_spotter_initialize
- pal_string_to_wstring
- pal_string_to_wstring
- pal_string_to_wstring
- pal_string_to_wstring
- pal_string_to_wstring
- pal_string_to_wstring
- pal_string_to_wstring
- pal_string_to_wstring
- pal_string_to_wstring
- pal_get_value
- pal_get_value
- pal_get_value
- pal_get_value
- pal_get_value
[CALL STACK END]
Exception with an error code: 0x5 (SPXERR_INVALID_ARG)
I have attached the log file. speech_sdk.log
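While the advanced models fail to load, one way to keep an application running is to fall back to the basic model. The following is only a sketch, assuming the failure keeps surfacing as a RuntimeError whose message contains SPXERR_INVALID_ARG, as in the traceback above. The helper name start_with_fallback and its `start` parameter are hypothetical and not part of the Speech SDK.

```python
def start_with_fallback(start, model_paths):
    """Try each keyword model path in order and return the first that starts.

    `start` is any callable that takes a model path and starts keyword
    recognition (hypothetical wrapper, not an SDK function).
    """
    last_error = None
    for path in model_paths:
        try:
            start(path)
            return path
        except RuntimeError as exc:
            # Only swallow the error reported in this issue; re-raise anything else.
            if "SPXERR_INVALID_ARG" not in str(exc):
                raise
            last_error = exc
    raise RuntimeError("no keyword model could be loaded") from last_error
```

With the code from this issue, `start` could be `lambda p: recognizer.start_keyword_recognition(speechsdk.KeywordRecognitionModel(p))`, called with `["advanced_midfa.table", "basic.table"]`.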
Using a dedicated KeywordRecognizer, as in the official example (which, by the way, raises NotImplementedError for save_to_wav_file_async in line 764), also does not work for advanced models. It prints CANCELED: CancellationReason.Error.
Hi, this is because of a missing library file in the Speech SDK Python packages. We will fix it in the next Speech SDK 1.41.0 release due in October. Before that, please try the following as a workaround:
- Check where azure-cognitiveservices-speech is installed, for example:
C:\>python
Python 3.12.4 (tags/v3.12.4:8e8a4ba, Jun 6 2024, 19:30:16) [MSC v.1940 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import azure.cognitiveservices.speech as speechsdk
>>> print(speechsdk.__file__)
C:\Apps\WPy64-31241\python-3.12.4.amd64\Lib\site-packages\azure\cognitiveservices\speech\__init__.py
In this example, C:\Apps\WPy64-31241\python-3.12.4.amd64\Lib\site-packages\azure\cognitiveservices\speech\ is the location of the installed module.
- Download the Speech SDK NuGet package of the same version from https://www.nuget.org/packages/Microsoft.CognitiveServices.Speech/1.40.0
- Unzip the downloaded microsoft.cognitiveservices.speech.1.40.0.nupkg (it is a zip-compressed archive).
- Go to the extracted runtimes\win-x64\native folder and copy Microsoft.CognitiveServices.Speech.extension.kws.ort.dll to the Python module location shown above.
Then the advanced keyword models will work.
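The unzip and copy steps above could also be scripted. This is a minimal sketch, not an official tool: the function name install_kws_extension is made up for illustration, and it assumes the DLL is stored in the archive under runtimes/win-x64/native/, matching the folder named in the steps.

```python
import shutil
import zipfile
from pathlib import Path

DLL_NAME = "Microsoft.CognitiveServices.Speech.extension.kws.ort.dll"
# Assumed archive member path, mirroring the runtimes\win-x64\native folder.
DLL_MEMBER = f"runtimes/win-x64/native/{DLL_NAME}"


def install_kws_extension(nupkg_path, module_dir):
    """Copy the keyword-spotting extension DLL from the .nupkg into module_dir.

    Raises KeyError if the archive does not contain the expected member.
    """
    target = Path(module_dir) / DLL_NAME
    with zipfile.ZipFile(nupkg_path) as archive:
        with archive.open(DLL_MEMBER) as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return target
```

The module directory can be obtained as in the first step, for example `Path(speechsdk.__file__).parent` after `import azure.cognitiveservices.speech as speechsdk`, and passed in together with the path of the downloaded microsoft.cognitiveservices.speech.1.40.0.nupkg.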
Fixed in the Speech SDK 1.41.1 release.