BhagyashreeMukherjee / ASRU2021_Paper

Analysis of conversational speech with application to voice adaptation

Home Page: https://bhagyashreemukherjee.github.io/ASRU2021_Paper/ASRU2021/

Title : Analysis of Conversational Speech with Application to Voice Adaptation

Authors : Bhagyashree Mukherjee; Anusha Prakash; Hema A. Murthy

Abstract: Conversational speech has always been challenging in the context of text-to-speech synthesis (TTS). Most speech synthesis systems are trained on read speech data recorded in a studio environment. However, the intelligibility of TTS systems degrades drastically when using conversational speech. The proposed work attempts an extensive analysis of the issues in dealing with conversational speech compared to read speech. As an application, we try to dub lectures available in English into an Indian language (Hindi) in the original speaker's voice. The task is difficult because classroom lectures are extempore, vary in speaking rate, and contain speaker mannerisms that lead to disfluencies. We analyze the capability of end-to-end TTS systems in modeling lecture-based data. Based on the analysis, an attempt is made to adapt a “read speech TTS system” using conversational speech data to produce lectures in the original speaker's voice.

Published in: 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)

Date of Conference: 13-17 December 2021

Date Added to IEEE Xplore: 03 February 2022

ISBN Information: Electronic ISBN: 978-1-6654-3739-4; USB ISBN: 978-1-6654-3738-7; Print on Demand (PoD) ISBN: 978-1-6654-3740-0

DOI: 10.1109/ASRU51503.2021.9688146

Publisher: IEEE

Conference Location: Cartagena, Colombia

Funding Agency: 10.13039/501100008628 - Ministry of Electronics and Information Technology (MeitY) (Grant Numbers: CS2021012MEIT003119, CS2021152OPSA003119)

Paper link : https://ieeexplore.ieee.org/document/9688146

Audio samples : https://bhagyashreemukherjee.github.io/ASRU2021_Paper/ASRU2021/

Languages

HTML 67.9%, CSS 25.3%, JavaScript 6.9%