unispeech / unimrcp

Open source cross-platform implementation of MRCP protocol

Home Page:http://www.unimrcp.org

[QUESTION] unimrcp gRPC implementation

OmarHory opened this issue

Hello! I hope you're having a good time!

Please bear with me, I am new to this.

I have my own gRPC services for ASR, NLU, and TTS, and would like to use UniMRCP to call those engines.
All I want to do is call each engine with an RPC call.

I have seen that there is a googleAPI-gRPC implementation, but as far as I can see there is no code to follow or draw inspiration from.
Also, I noticed there is a grammar. From what I have read, it serves the same role as a language model in machine learning and deep learning. I do not know how to implement that either. Is it mandatory to have a grammar?

I would appreciate clear guidance on how to implement my own custom plugin using the MRCPv2 protocol.

Would appreciate your help.

Take a look at the demo plugins. See:

  • Recognizer - unimrcp\plugins\demo-recog\src\demo_recog_engine.c
  • Synthesizer - unimrcp\plugins\demo-synth\src\demo_synth_engine.c

These demonstrate how to make a plugin to support ASR and TTS. They don't actually call an engine; that part is left to you.
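
To give a rough idea of the part that is left to you, here is a minimal sketch of one way a recognizer could buffer incoming audio and hand it to an external engine in a single call. It is written in Python for brevity rather than the C a real UniMRCP plugin uses, and every name in it (EngineBridge, on_audio_frame, on_end_of_speech, and the transcribe callable standing in for your gRPC stub) is hypothetical and not part of UniMRCP.

```python
from typing import Callable, List


class EngineBridge:
    """Hypothetical adapter: collects audio frames, then makes one engine call."""

    def __init__(self, transcribe: Callable[[bytes], str]) -> None:
        # `transcribe` stands in for your own RPC stub (for example a gRPC call).
        self._transcribe = transcribe
        self._frames: List[bytes] = []

    def on_audio_frame(self, frame: bytes) -> None:
        # Called once per incoming audio frame while recognition is active.
        self._frames.append(frame)

    def on_end_of_speech(self) -> str:
        # Send everything buffered so far to the engine and reset the buffer.
        audio = b"".join(self._frames)
        self._frames.clear()
        return self._transcribe(audio)


def fake_transcribe(audio: bytes) -> str:
    # Stand-in engine used only for this example; it reports the byte count.
    return f"received {len(audio)} bytes"


bridge = EngineBridge(fake_transcribe)
bridge.on_audio_frame(b"\x00" * 160)
bridge.on_audio_frame(b"\x00" * 160)
result = bridge.on_end_of_speech()
```

Passing the engine call in as a plain callable keeps the buffering logic independent of whichever RPC framework you use, so the same shape should carry over when you port the idea to the C plugin.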

Must you implement grammars? Um, no, not really, but it also depends on what MRCP client is calling your ASR/NLU/TTS service and what conventions it will use.

In my opinion (which may be wrong), MRCP was originally designed to work with VXML. Voice browsers would interpret VXML and call ASR and TTS engines using MRCP. And yes, VXML pages use grammars to specify how speech should be interpreted. But you can certainly take liberties with the grammars. You don't have to support SRGS grammars (which are common in non-NLU IVR VXML applications). You are free to develop your own grammar schemes as long as your MRCP client doesn't complain and you stick to the protocol.

As an example, many of the plugins provided for UniMRCP at unimrcp.org support a grammar known as builtin:speech/transcribe. For example, look at http://unimrcp.org/manuals/html/GoogleSRUsageManual.html. MRCP requires that your grammar be identified by a URI. In the example transcribe grammar, a builtin: URI scheme is used. This works well because the builtin: scheme is a common convention used in VXML.
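
A small sketch of how a plugin might decide what to do with a grammar URI, assuming it only cares about the URI scheme. The function name classify_grammar and the labels it returns are made up for illustration; the real plugins may handle this differently.

```python
from typing import Tuple
from urllib.parse import urlsplit


def classify_grammar(uri: str) -> Tuple[str, str]:
    """Return a (kind, detail) pair describing a grammar URI (hypothetical helper)."""
    parts = urlsplit(uri)
    if parts.scheme == "builtin":
        # For builtin:speech/transcribe the path is "speech/transcribe".
        return ("builtin", parts.path)
    if parts.scheme in ("http", "https"):
        # A grammar document that would have to be fetched from a server.
        return ("external", uri)
    # Anything else is treated as unsupported in this sketch.
    return ("unsupported", uri)


kind, detail = classify_grammar("builtin:speech/transcribe")
```

With this kind of dispatch, a plugin that only does open-ended transcription could accept builtin:speech/transcribe and reject everything else, without ever parsing an SRGS document.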

In an MRCP RECOGNIZE command, you can specify any proprietary grammar URI scheme you like; just make sure the MRCP client is providing those URIs in the body of the message with content-type text/uri-list.
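
To make that concrete, here is a minimal sketch that assembles an MRCPv2 RECOGNIZE request whose body is a text/uri-list, following the message layout shown in RFC 6787. The helper name build_recognize is hypothetical, and the channel identifier and request ID are example values only; a real client would take them from its own session.

```python
from typing import List


def build_recognize(channel_id: str, request_id: int, grammar_uris: List[str]) -> bytes:
    """Build a RECOGNIZE request carrying grammar URIs as text/uri-list (sketch)."""
    # text/uri-list: one URI per line, each terminated by CRLF.
    body = "".join(uri + "\r\n" for uri in grammar_uris).encode("ascii")
    headers = (
        f"Channel-Identifier:{channel_id}\r\n"
        "Content-Type:text/uri-list\r\n"
        f"Content-Length:{len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    # The message-length in the start line counts the whole message, including
    # the start line itself, so recompute until the value stops changing.
    length = 0
    while True:
        start = f"MRCP/2.0 {length} RECOGNIZE {request_id}\r\n".encode("ascii")
        total = len(start) + len(headers) + len(body)
        if total == length:
            return start + headers + body
        length = total


message = build_recognize(
    "32AECB23433801@speechrecog",   # example channel identifier
    543257,                          # example request ID
    ["builtin:speech/transcribe"],
)
```

On the server side, the plugin would read the same body back line by line to recover the grammar URIs.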

See https://www.w3.org/TR/voicexml20/#dml2.3.1.2 and https://tools.ietf.org/html/rfc6787#section-9.9

Thanks for the great answer Michael!
I am closing the issue.

By the way, you may want to use the discussion group forum for these types of questions - https://groups.google.com/g/unimrcp?pli=1