There are no files associated with this item.
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.citation.number | 4 | - |
dc.citation.startPage | 93 | - |
dc.citation.title | ACM TRANSACTIONS ON GRAPHICS | - |
dc.citation.volume | 36 | - |
dc.contributor.author | Taylor, Sarah | - |
dc.contributor.author | Kim, Taehwan | - |
dc.contributor.author | Yue, Yisong | - |
dc.contributor.author | Mahler, Moshe | - |
dc.contributor.author | Krahe, James | - |
dc.contributor.author | Rodriguez, Anastasio Garcia | - |
dc.contributor.author | Hodgins, Jessica | - |
dc.contributor.author | Matthews, Iain | - |
dc.date.accessioned | 2023-12-21T22:07:09Z | - |
dc.date.available | 2023-12-21T22:07:09Z | - |
dc.date.created | 2021-09-01 | - |
dc.date.issued | 2017-07 | - |
dc.description.abstract | We introduce a simple and effective deep learning approach to automatically generate natural-looking speech animation that synchronizes to input speech. Our approach uses a sliding window predictor that learns arbitrary nonlinear mappings from phoneme label input sequences to mouth movements in a way that accurately captures natural motion and visual coarticulation effects. Our deep learning approach enjoys several attractive properties: it runs in real-time, requires minimal parameter tuning, generalizes well to novel input speech sequences, is easily edited to create stylized and emotional speech, and is compatible with existing animation retargeting approaches. One important focus of our work is to develop an effective approach for speech animation that can be easily integrated into existing production pipelines. We provide a detailed description of our end-to-end approach, including machine learning design decisions. Generalized speech animation results are demonstrated over a wide range of animation clips on a variety of characters and voices, including singing and foreign language input. Our approach can also generate on-demand speech animation in real-time from user speech input. | - |
dc.identifier.bibliographicCitation | ACM TRANSACTIONS ON GRAPHICS, v.36, no.4, pp.93 | - |
dc.identifier.doi | 10.1145/3072959.3073699 | - |
dc.identifier.issn | 0730-0301 | - |
dc.identifier.scopusid | 2-s2.0-85030773470 | - |
dc.identifier.uri | https://scholarworks.unist.ac.kr/handle/201301/53796 | - |
dc.identifier.url | https://dl.acm.org/doi/10.1145/3072959.3073699 | - |
dc.identifier.wosid | 000406432100061 | - |
dc.language | English | - |
dc.publisher | ASSOC COMPUTING MACHINERY | - |
dc.title | A Deep Learning Approach for Generalized Speech Animation | - |
dc.type | Article | - |
dc.description.isOpenAccess | FALSE | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Software Engineering | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.type.docType | Article | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.subject.keywordAuthor | Speech Animation | - |
dc.subject.keywordAuthor | Machine Learning | - |
dc.subject.keywordPlus | TALKING HEAD | - |