Date on Master's Thesis/Doctoral Dissertation
5-2025
Document Type
Doctoral Dissertation
Degree Name
Ph.D.
Department
Electrical and Computer Engineering
Degree Program
Electrical Engineering, PhD
Committee Chair
Popa, Dan
Committee Member
McIntyre, Michael
Committee Member
Naber, John
Committee Member
Roussel, Thomas
Author's Keywords
Robotics; affective computing; healthcare; deep learning; large language models
Abstract
This dissertation explores the integration of multimodal data streams and artificial intelligence pipelines to understand human affect in neurotypical individuals and children with Autism Spectrum Disorder (ASD). Human affect is captured in the context of human-robot interaction (HRI) through multiple studies involving both children with ASD and neurotypical adults. This dissertation makes four contributions: 1) The first study introduces autonomy into perspective-taking teaching sessions by generating verbal content with large language models (LLMs). This system is the first of its kind to teach perspective-taking in a semi-autonomous manner under the supervision of domain experts. The robotic intervention was evaluated by domain experts using the NASA TLX and Godspeed questionnaires, and the semi-autonomous system was perceived as safe and likable on the Godspeed questionnaire. The second part of this study examines the physiological responses of neurotypical individuals when the robot operates in different modes (different voice types and hand gestures). It concludes that different voice types and hand gestures elicit not only distinct physiological responses but also distinct perceptions of the robot. 2) The second study forecasts the Blood Volume Pulse (BVP) signal of children with ASD using a CNN+LSTM time-series model, applied to data recorded during candid conversations between six pairs of children with ASD. This contribution differs from prior approaches in the literature, where time-series forecasting has not been explicitly leveraged to anticipate challenging behaviors in children with ASD from physiological signals. Such forecasting is important for making robotic interventions personalized and adaptable to an individual's needs. 3) The third contribution highlights the importance of multimodal data for affect recognition in neurotypical individuals and individuals with ASD, demonstrated through two studies: (i) affective analysis of human-led and robot-led sessions for a social stories intervention, and (ii) multimodal sensing and machine learning to compare printed and robot-based instruction for a simulated assembly task. In both cases, multimodal data outperformed individual modalities for affect recognition. 4) The last study addresses speech emotion recognition (SER) in the context of HRI. Vision transformers have not previously been applied in the HRI-SER literature across diverse demographics. This dissertation bridges that gap by classifying speech collected from participants into four primary emotions: happy, sad, angry, and neutral. Vision-transformer-based models outperformed previous state-of-the-art models on speech from non-North-American accents even though they were initially fine-tuned on datasets of speakers with North American accents, and they also outperformed the current state of the art on the RAVDESS and TESS SER datasets.
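To illustrate the semi-autonomous design of the first contribution, the sketch below shows one way LLM-drafted verbal content could be gated by a human expert before a robot speaks. This is a minimal sketch only: the model (distilgpt2 via the Hugging Face transformers pipeline), the prompt, and the approval loop are illustrative assumptions, not the dissertation's actual system.

```python
# Minimal sketch: an LLM drafts a perspective-taking utterance and a domain
# expert approves it before the robot speaks (semi-autonomous operation).
# The model and prompt are illustrative assumptions.
from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")

def draft_utterance(scenario: str) -> str:
    prompt = (
        "You are a friendly robot teaching perspective-taking to a child. "
        f"Scenario: {scenario}\nRobot says:"
    )
    out = generator(prompt, max_new_tokens=60, do_sample=True)[0]["generated_text"]
    return out[len(prompt):].strip()

draft = draft_utterance("Your friend dropped their ice cream and looks sad.")
print("Proposed utterance:", draft)
# Expert-in-the-loop gate: the robot only speaks approved content.
if input("Approve for the robot to speak? [y/n] ").lower() == "y":
    print("Robot speaks:", draft)
```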
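The second contribution names a CNN+LSTM architecture for BVP forecasting; below is a minimal PyTorch sketch of that family of models. The window length, channel counts, and forecast horizon are illustrative assumptions, not the dissertation's configuration.

```python
# Minimal sketch of a CNN+LSTM forecaster for a physiological signal
# such as BVP. All hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn

class CNNLSTMForecaster(nn.Module):
    def __init__(self, window=64, horizon=8, channels=32, hidden=64):
        super().__init__()
        # 1-D convolutions extract local waveform features from the window
        self.cnn = nn.Sequential(
            nn.Conv1d(1, channels, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size=5, padding=2),
            nn.ReLU(),
        )
        # LSTM models temporal dependencies across the feature sequence
        self.lstm = nn.LSTM(channels, hidden, batch_first=True)
        # Linear head maps the final hidden state to the forecast horizon
        self.head = nn.Linear(hidden, horizon)

    def forward(self, x):
        feats = self.cnn(x.unsqueeze(1))   # (batch, 1, window) -> (batch, C, window)
        feats = feats.transpose(1, 2)      # (batch, window, C) for the LSTM
        out, _ = self.lstm(feats)          # (batch, window, hidden)
        return self.head(out[:, -1, :])    # (batch, horizon) future samples

model = CNNLSTMForecaster()
past = torch.randn(4, 64)        # four dummy BVP windows
print(model(past).shape)         # torch.Size([4, 8])
```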
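The third contribution's claim that multimodal data beats single modalities can be illustrated with feature-level fusion: features from two modalities are concatenated before classification. In the sketch below, the synthetic "physiological" and "facial" features, their dimensions, and the logistic-regression classifier are all assumptions for illustration, not the dissertation's pipeline.

```python
# Minimal sketch of feature-level (early) fusion on synthetic data:
# compare each modality alone against their concatenation.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 200
labels = rng.integers(0, 2, n)                              # binary affect label
physio = rng.normal(labels[:, None] * 0.5, 1.0, (n, 8))    # assumed modality 1
facial = rng.normal(labels[:, None] * 0.5, 1.0, (n, 16))   # assumed modality 2
fused = np.hstack([physio, facial])                         # feature-level fusion

for name, X in [("physio", physio), ("facial", facial), ("fused", fused)]:
    acc = cross_val_score(LogisticRegression(max_iter=1000), X, labels, cv=5).mean()
    print(f"{name:>6}: {acc:.2f}")   # fused typically scores highest
```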
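For the SER contribution, a common vision-transformer recipe treats a log-mel spectrogram as an image and classifies it with a pretrained ViT. The sketch below follows that general recipe using torchaudio and torchvision's vit_b_16; the preprocessing, resolution, and model choice are assumptions, not necessarily the dissertation's pipeline.

```python
# Minimal sketch: speech -> log-mel spectrogram "image" -> ViT with a
# 4-way emotion head. Hyperparameters are illustrative assumptions.
import torch
import torchaudio
import torchvision

EMOTIONS = ["happy", "sad", "angry", "neutral"]

to_mel = torchaudio.transforms.MelSpectrogram(sample_rate=16000, n_mels=128)
to_db = torchaudio.transforms.AmplitudeToDB()

def speech_to_image(waveform):
    spec = to_db(to_mel(waveform))                     # (1, 128, time)
    spec = torch.nn.functional.interpolate(
        spec.unsqueeze(0), size=(224, 224), mode="bilinear"
    ).squeeze(0)                                       # (1, 224, 224)
    return spec.repeat(3, 1, 1)                        # ViT expects 3 channels

# Pretrained ViT with its head replaced for four emotion classes
model = torchvision.models.vit_b_16(weights="IMAGENET1K_V1")
model.heads = torch.nn.Linear(model.hidden_dim, len(EMOTIONS))
model.eval()

wave = torch.randn(1, 16000)                           # 1 s dummy utterance
with torch.no_grad():
    logits = model(speech_to_image(wave).unsqueeze(0)) # (1, 4)
print(EMOTIONS[logits.argmax(dim=1).item()])
```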
Recommended Citation
Mishra, Ruchik, "Multimodal emotion recognition for human-robot interaction across neuro-diverse populations." (2025). Electronic Theses and Dissertations. Paper 4567.
Retrieved from https://ir.library.louisville.edu/etd/4567