nay0648 / ego2022

JOINT EGO-NOISE SUPPRESSION AND KEYWORD SPOTTING ON SWEEPING ROBOTS

Yueyue Na¹, Ziteng Wang¹, Liang Wang², Qiang Fu¹

¹Alibaba Group, China
{yueyue.nyy, ziteng.wzt, fq153277}@alibaba-inc.com

²School of Electronics and Communication Engineering
Sun Yat-sen University (SYSU), Guangzhou, Guangdong, 510275, China
wangliang7@mail.sysu.edu.cn

ABSTRACT

Keyword spotting is necessary for triggering human-machine speech interaction. It is a challenging task, especially in low signal-to-noise ratio and moving scenarios, such as on a sweeping robot with strong ego-noise. This paper proposes a novel approach for joint ego-noise suppression and keyword detection. The keyword detection model accepts outputs from multi-look adaptive beamformers. The noise covariance matrix in the beamformer is in turn updated using the keyword absence probability given by the model, forming an end-to-end loop-back. The keyword model also adopts multi-channel feature fusion using self-attention, and a hidden Markov model for online decoding. The performance of the proposed approach is verified on real-world datasets recorded on a sweeping robot.

Links

ICASSP 2022 paper sharing: joint optimization of speech enhancement and keyword spotting applied to sweeping robots

Generate Differential Beamformers

The differential beamformers used in this paper are generated by solving the following complex-valued optimization problem with CVX [1].

$$ \begin{aligned} \mathbf{w} = &\arg\min_{\mathbf{w}} \ \mathbf{w}^H \mathbf{\Phi} \mathbf{w} \\\\ \text{s.t.} \quad &\mathbf{w}^H \mathbf{a} = 1 \\\\ &\mathbf{w}^H \mathbf{w} \le 10^{g_{min} / 10} \end{aligned} $$

Here $\mathbf{w}$ is the beamformer weight vector, $\mathbf{\Phi}$ is the noise covariance matrix, $\mathbf{a}$ is the steering vector of the look direction, and $g_{min}$ (in dB) is the white noise gain threshold. The first constraint is the distortionless constraint, which prevents the target speech from being cancelled. The second constraint bounds the norm of the weights to avoid white noise amplification.
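
The problem above maps almost directly onto CVX syntax. The following is a minimal sketch for a single frequency bin, not the code from this repository: the linear array geometry, the design frequency, the diffuse (spherically isotropic) noise model for $\mathbf{\Phi}$, and all numeric values are illustrative assumptions. The norm bound is taken exactly as written in the second constraint, $10^{g_{min}/10}$; if $g_{min}$ is instead meant as a lower bound on the white noise gain $1 / (\mathbf{w}^H \mathbf{w})$, the sign of the exponent would flip.

```matlab
% Sketch only: geometry, frequency, and noise model are illustrative
% assumptions, not values taken from the repository.
c     = 343;      % speed of sound (m/s)
f     = 1000;     % design frequency (Hz), hypothetical
d     = 0.03;     % microphone spacing (m), hypothetical
M     = 4;        % number of microphones, hypothetical
theta = 0;        % look direction (rad), 0 = endfire
g_min = 10;       % threshold (dB), used exactly as in the written constraint

pos = (0:M-1).' * d;                           % linear array positions (m)
a   = exp(-1j * 2*pi*f/c * pos * cos(theta));  % far-field steering vector

% Assumed noise model: diffuse field, Phi(i,j) = sin(k*d_ij) / (k*d_ij),
% with slight diagonal loading to keep Phi positive definite.
kd  = 2*pi*f/c * abs(pos - pos.');
Phi = ones(M);
Phi(kd > 0) = sin(kd(kd > 0)) ./ kd(kd > 0);
Phi = Phi + 1e-6 * eye(M);

cvx_begin quiet
    variable w(M) complex
    minimize( quad_form(w, Phi) )        % w^H * Phi * w
    subject to
        a' * w == 1;                     % equivalent to w^H * a = 1
        norm(w, 2) <= 10^(g_min / 20);   % w^H * w <= 10^(g_min / 10)
cvx_end

fprintf('status: %s, w^H w = %.3f\n', cvx_status, real(w' * w));
```

The norm constraint is written as `norm(w, 2) <= 10^(g_min / 20)`, which is the square root of the original inequality and is a form CVX accepts directly. In practice the problem would be solved once per frequency bin and per look direction.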

References

Languages

Language: MATLAB 100.0%