Proceedings Chapter
Reference
Detecting and managing impressions for a more engaging virtual agent
WANG, Chen
WANG, Chen. Detecting and managing impressions for a more engaging virtual agent. In: The 6th International Workshop on Symbiotic Interaction. 2017.
Available at:
http://archive-ouverte.unige.ch/unige:103432
Detecting and managing impressions for a more engaging virtual agent
Chen Wang¹
¹University of Geneva, Geneva, Switzerland [email protected]
Abstract. Impressions play an important role in both human-human and human-virtual agent interaction. However, detecting impressions from physiological and nonverbal cues has not yet been explored. This work aims to recognize impressions along the warmth and competence dimensions from multimodal physiological signals and behaviors. Eye gaze, facial expressions, gestures, heart rate and other physiological parameters will be used for impression recognition. In addition, the temporal unfolding of expressions between the two agents will be considered. In this way, the nonverbal cues that contribute to the formation of impressions will be learned. Behaviors that favor a specific impression will then be adapted so that a virtual agent can interact effectively with users. The resulting work will consist of an affective loop between a virtual agent and a human, with museums as the envisioned setting.
Keywords: Impression Recognition, Virtual Agent, Physiological Signal.
1 Introduction
Impressions are fundamental to social interaction. Studying impressions in terms of warmth and competence has received growing attention, since these are universal dimensions of social perception [1]. Virtual agents (VAs) are considered promising for human-computer interaction because they can mimic naturalistic human communication. According to Wang et al. [2], people tend to treat VAs similarly to real human beings. Previous work has examined how the believability of VAs correlates with warmth and competence [3]. However, to the best of our knowledge, no study has aimed at detecting formed impressions within these two dimensions using multimodal cues.
Motivated by the role of VAs in interpersonal interaction and by the state of the art in affect recognition, this research aims to build an affective interaction framework between an anthropomorphic virtual agent and human beings. This work focuses on detecting the impressions that a human forms of a VA. Detection is multimodal: it uses physiological signals, facial expressions, gestures and eye gaze information from the user, as well as the nonverbal behaviors of the VA. With the impression recognition model and the learned nonverbal cues, we will later explore how the VA's reactions can be adjusted to generate a specific impression on the user.
The project schematic is shown in Fig. 1.
2 Research Approach
This study attempts to answer the following questions: (1) How can impressions be detected in the warmth and competence dimensions through multimodal cues from users? (2) Which nonverbal behaviors contribute most to the formation of impressions? (3) What is the most efficient way to adjust the nonverbal behaviors of VAs in order to create better impressions?
To answer these questions, we first design a human-human interaction experiment to build a multimodal model capable of detecting formed impressions. Impression information will be extracted from eye gaze, head poses, facial expressions and physiological signals. Eye gaze indicates points of interest and engagement level [4]. Head poses and facial expressions indicate concentration level, emotions and attitudes towards the other person [5, 6]. Physiological signals indicate excitement, engagement and boredom [7]. All these cues will be mapped into a continuous 2D space whose axes are warmth and competence. By synchronizing the two users’ signals, we will identify which dyadic dynamic stimuli (facial expressions, gestures, eye movements and physiological reactions) induce specific impressions. A human-virtual agent experiment will then be designed to verify the influential nonverbal cues learned from the human-human experiment. With the results of these two experiments, a reaction mechanism will be designed that allows VAs to change their behaviors based on the detected user impressions. The final goal of the research is an adaptive agent capable of reacting to users’ impressions in real time; the impact of this interactive loop will subsequently be evaluated.
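The synchronization step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each user's cue (for example a smile-intensity trace) is available as timestamped samples; the two recordings are resampled onto a common time grid and the lag at which one user's signal best follows the other's is found by lagged Pearson correlation. The function names `resample`, `pearson` and `best_lag` are hypothetical and are not part of the proposed system, which does not specify its synchronization method.

```python
from bisect import bisect_left
from math import sqrt


def resample(times, values, grid):
    """Linearly interpolate a signal sampled at `times` (strictly increasing,
    in seconds) onto the time points in `grid`. Points outside the recorded
    range take the nearest recorded value."""
    out = []
    for t in grid:
        i = bisect_left(times, t)
        if i == 0:
            out.append(values[0])
        elif i == len(times):
            out.append(values[-1])
        else:
            t0, t1 = times[i - 1], times[i]
            w = (t - t0) / (t1 - t0)
            out.append(values[i - 1] + w * (values[i] - values[i - 1]))
    return out


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def best_lag(a, b, max_lag):
    """Return (lag, r): the delay in samples (0..max_lag) at which signal `b`
    correlates most strongly with signal `a`, i.e. b[k + lag] tracks a[k]."""
    best = (0, -2.0)
    for lag in range(max_lag + 1):
        r = pearson(a[: len(a) - lag], b[lag:])
        if r > best[1]:
            best = (lag, r)
    return best


# Hypothetical example: user B mirrors user A's expression two samples later.
user_a = [0, 0, 1, 3, 1, 0, 0, 2, 5, 2, 0, 0]
user_b = [0, 0] + user_a[:-2]
lag, r = best_lag(user_a, user_b, max_lag=4)
```

A positive lag with a high correlation would suggest that one participant's behavior follows the other's, which is the kind of dyadic dynamic the experiment is meant to uncover. Lagged correlation is only one candidate measure and is an assumption of this sketch.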
3 Dissertation Status and Expected Contribution
So far, we have explored remote physiological signal detection. During human-computer interaction, it is difficult to attach physiological sensors to users’ bodies. Thus, blood volume pulse information was extracted from face videos to calculate heart rate. On the MAHNOB-HCI dataset [7], the heart rate estimates from our pipeline were reliable and significantly better than those of some state-of-the-art studies (t=5.0, p<0.001). Future work will study methods for calculating heart rate variability and respiration rate. In addition, we have designed the first experimental protocol to gather data for the dyadic impression formation model between two humans.
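To make the heart rate computation concrete, the following is a minimal sketch of one common way to turn a pulse trace into a heart rate: pick the frequency with the most spectral power inside a plausible heart rate band. It is not the pipeline evaluated above, whose details are not given here. The input `trace` is assumed to be a per-frame pulse signal (for instance the mean green-channel value of a face region), and the band of 42 to 240 beats per minute as well as the function name `estimate_heart_rate` are assumptions of this example.

```python
from math import cos, sin, pi


def estimate_heart_rate(trace, fps, lo_bpm=42, hi_bpm=240):
    """Return the dominant frequency of `trace` in beats per minute.

    `trace` is a list of per-frame pulse values and `fps` the video frame
    rate. The spectrum is evaluated directly at every integer BPM in the
    assumed band [lo_bpm, hi_bpm], and the strongest one is returned.
    """
    n = len(trace)
    mean = sum(trace) / n
    centered = [v - mean for v in trace]  # remove the constant offset

    best_bpm, best_power = None, -1.0
    for bpm in range(lo_bpm, hi_bpm + 1):
        f = bpm / 60.0  # candidate frequency in Hz
        re = sum(v * cos(2 * pi * f * k / fps) for k, v in enumerate(centered))
        im = sum(v * sin(2 * pi * f * k / fps) for k, v in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm


# Hypothetical example: 10 s of a synthetic 1.2 Hz pulse recorded at 30 fps.
fps = 30
synthetic = [0.5 + sin(2 * pi * 1.2 * k / fps) for k in range(10 * fps)]
bpm = estimate_heart_rate(synthetic, fps)
```

Real face videos would additionally need face tracking, detrending and motion or illumination compensation before such a spectral estimate becomes usable; those steps are omitted here.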
The technical outcomes of the project include methods for non-intrusive physiological signal monitoring and an impression recognition model that maps dyadic multimodal recordings into the warmth-competence space. The project will help enhance interaction with VAs and improve users’ acceptance and experience of VA technology. It also has potential implications for the broad integration of impression recognition as a generic capability in interactive systems.
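As an illustration of what mapping multimodal recordings into the warmth-competence space could look like, the sketch below fuses per-modality estimates by a confidence-weighted average (a simple late-fusion scheme). This is an assumption made for the example only: the fusion strategy of the proposed model is not specified, and the function name `fuse_impressions`, the modality names and the numeric values are all hypothetical.

```python
def fuse_impressions(estimates):
    """Confidence-weighted average of per-modality impression estimates.

    `estimates` maps a modality name to a tuple
    (warmth, competence, confidence), with confidence >= 0.
    Returns a single (warmth, competence) point in the 2D space.
    """
    total = sum(conf for _, _, conf in estimates.values())
    if total <= 0:
        raise ValueError("at least one modality needs positive confidence")
    warmth = sum(w * conf for w, _, conf in estimates.values()) / total
    competence = sum(c * conf for _, c, conf in estimates.values()) / total
    return warmth, competence


# Hypothetical per-modality outputs for one moment of an interaction.
point = fuse_impressions({
    "eye_gaze": (0.5, 0.0, 1.0),
    "facial_expression": (1.0, 1.0, 1.0),
})
```

Weighting by confidence lets an unreliable modality (for example a heart rate estimate during strong head motion) contribute less to the fused point. A learned regression model over joint features would be an alternative design.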
Fig. 1. Impression recognition using multimodal cues from the user (formed impression) and the nonverbal behaviors of the virtual agent (intended impression).
References
1. Cuddy, Amy JC, Peter Glick, and Anna Beninger. "The dynamics of warmth and competence judgments, and their outcomes in organizations." Research in Organizational Behavior 31 (2011): 73-98.
2. Wang, Yuqiong, Joe Geigel, and Andrew Herbert. "Reading personality: Avatar vs. human faces." Affective Computing and Intelligent Interaction (ACII), 2013 Humaine Association Conference on. IEEE, (2013).
3. Demeure, Virginie, Radosław Niewiadomski, and Catherine Pelachaud. "How is believability of a virtual agent related to warmth, competence, personification, and embodiment?." Presence 20.5 (2011): 431-448.
4. Zeng, Zhihong, et al. "A survey of affect recognition methods: Audio, visual, and spontaneous expressions." IEEE Transactions on Pattern Analysis and Machine Intelligence 31.1 (2009): 39-58.
5. El Kaliouby, R., and P. Robinson. "Real-time inference of complex mental states from facial expression and head gestures." Proc. IEEE Int'l Conf. Computer Vision and Pattern Recognition (CVPR '04), vol. 3 (2004): 154.
6. Yeasin, M., B. Bullot, and R. Sharma. "Recognition of facial expressions and measurement of levels of interest from video." IEEE Transactions on Multimedia 8.3 (2006): 500-507.
7. Soleymani, Mohammad, et al. "A multimodal database for affect recognition and implicit tagging." IEEE Transactions on Affective Computing 3.1 (2012): 42-55.