Language-Based AI Agent Interaction with Children

KEYNOTE TALKS

Prof. Khiet P. Truong
University of Twente,
Enschede, Netherlands
Human Media Interaction Group

Towards spoken conversational interaction technology
In this presentation, I will highlight some of the research we are carrying out at Human Media Interaction, University of Twente. Talking to embodied agents, such as robots or virtual agents, can be beneficial for several reasons, ranging from being able to multitask hands-free to expressing oneself in a less restricted manner in the context of storytelling or information search. Creating these spoken interactions with embodied agents requires not only automatic speech recognition technology but also knowledge about paralinguistics, conversational interaction, and human-computer interaction. I will present several examples of our research in which speech technology and knowledge from speech communication research drive our investigations into spoken interactions with embodied agents.

Biography
Khiet Truong is an associate professor in the Human Media Interaction group at the University of Twente. Her research interests lie in investigating spoken (conversational) interaction in human-human and human-agent communication. In particular, she is interested in paralinguistic aspects that are indicative of speaker characteristics and conversation dynamics among speakers, for application domains such as human-robot interaction, multimedia retrieval, and digital health. Currently, she is an Associate Editor for IEEE Transactions on Affective Computing, an Editorial Member of Computer Speech and Language, Area Chair for Interspeech 2023, and General Co-Chair for Interspeech 2025. She has also served as a reviewer and held numerous chair positions for conferences such as Interspeech, ACM ICMI, IEEE ACII, ACM/IEEE HRI and IEEE ICASSP.

Prof. Shrikanth S. Narayanan
University of Southern California,
Los Angeles, CA
Signal Analysis and Interpretation Laboratory

Child-centered Multimodal Machine Intelligence
Converging technological advances in sensing, machine learning and computing offer tremendous opportunities for continuous, contextually rich yet unobtrusive multimodal, spatiotemporal characterization of a child’s behavior, communication and interaction across stages of development. This in turn promises novel possibilities for understanding and supporting various aspects of child-inclusive interaction applications, from health and well-being to learning and entertainment. Recent approaches that have leveraged judicious use of both data and knowledge have yielded significant advances in this regard, for example in deriving rich, context-aware information from multimodal biobehavioral signal sources including human speech, language, and videos of behavior as well as physiological information. This talk will focus on some of the advances, opportunities and challenges in gathering such data and creating algorithms for machine processing of such cues in a child-centric setting. It will highlight some of our ongoing efforts, drawing examples from the domain of Autism Spectrum Disorder.

Biography
Shrikanth (Shri) Narayanan is University Professor and Niki & C. L. Max Nikias Chair in Engineering at the University of Southern California, where he is Professor of Electrical & Computer Engineering, Computer Science, Linguistics, Psychology, Neuroscience, Pediatrics, and Otolaryngology-Head & Neck Surgery, Director of the Ming Hsieh Institute and Research Director of the Information Sciences Institute. Prior to USC he was with AT&T Bell Labs and AT&T Research. His research focuses on human-centered information processing and communication technologies. He is a Guggenheim Fellow, a member of the European Academy of Sciences and Arts, and a Fellow of the National Academy of Inventors, the Acoustical Society of America, IEEE, ISCA, the American Association for the Advancement of Science (AAAS), the Association for Psychological Science, the Association for the Advancement of Affective Computing (AAAC) and the American Institute for Medical and Biological Engineering (AIMBE). He has received several honors, including the 2015 Engineers Council’s Distinguished Educator Award, a Mellon award for mentoring excellence, the 2005 and 2009 Best Journal Paper awards from the IEEE Signal Processing Society, a 2018 ISCA CSL Best Journal Paper award, and, from ACM ICMI, the Ten Year Technical Impact Award in 2014 and the Sustained Accomplishment Award in 2020. He has also served as Distinguished Lecturer for the IEEE Signal Processing Society (2010-11), ISCA Distinguished Lecturer (2015-16), and Willard R. Zemlin Memorial Lecturer for ASHA (2017). His research and inventions have led to technology commercialization, including through startups he co-founded: Behavioral Signals Technologies, focused on the telecommunication services and AI-based conversational assistance industry, and Lyssn, focused on mental health care delivery, treatment and quality assurance.