Biomedical/physiological signal processing for wearable technology
Wearable biometric sensors are increasingly embedded in our everyday life, yielding large amounts of biomedical/physiological data for which the presence of human experts cannot always be guaranteed. This underlines the need for robust physiological models that efficiently analyze and interpret the acquired signals, with applications in daily life, well-being, healthcare, security, and human-computer interaction. The goal of my research is the development of robust algorithms for the reliable representation and interpretation of biomedical/physiological signals and their co-evolution with other signal modalities and behavioral indices, centered around three main axes.
Learning physiological representations
This research thread focuses on robust representations of biomedical signals, such as Electrodermal Activity (EDA) and Electrocardiogram (ECG), that take into account their characteristic structure, efficiently encode the underlying information, and provide reliable computation of well-established psychophysiological measurements, such as Skin Conductance Responses (SCRs) and QRS complexes. We have proposed the design of signal-specific dictionaries and a sparse Bayesian learning framework for inferring the corresponding dictionary parameters, yielding low reconstruction errors and high compression rates. Post-processing of the selected dictionary atoms results in reliable detection of the underlying signal characteristics, as compared to human-annotated ground truth.
- Chaspari et al., “Markov Chain Monte Carlo Inference of Parametric Dictionaries for Sparse Bayesian Approximations,” IEEE TSP 2016
- Chaspari et al., “Sparse Representation of Electrodermal Activity with Knowledge-Driven Dictionaries,” IEEE TBME 2015
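As a simplified illustration of the signal-specific dictionary idea (not the published MCMC-based Bayesian inference), the sketch below builds a small dictionary of bi-exponential, SCR-shaped atoms parameterized by rise time, decay time, and onset, and greedily decomposes a signal with matching pursuit; all function names, shapes, and parameter values are hypothetical:

```python
import numpy as np

def scr_atom(length, tau_rise, tau_decay, onset):
    """Unit-norm bi-exponential atom resembling the shape of a Skin Conductance Response."""
    s = np.maximum(np.arange(length, dtype=float) - onset, 0.0)
    a = np.exp(-s / tau_decay) - np.exp(-s / tau_rise)
    return a / np.linalg.norm(a)

def build_dictionary(length, rises, decays, onsets):
    """Stack parametric atoms column-wise into a dictionary matrix."""
    atoms = [scr_atom(length, r, d, o) for r in rises for d in decays for o in onsets]
    return np.column_stack(atoms)

def matching_pursuit(x, D, n_atoms):
    """Greedy sparse decomposition: repeatedly pick the atom most correlated with the residual."""
    residual = x.astype(float).copy()
    coeffs = np.zeros(D.shape[1])
    for _ in range(n_atoms):
        corr = D.T @ residual
        k = int(np.argmax(np.abs(corr)))
        coeffs[k] += corr[k]
        residual -= corr[k] * D[:, k]
    return coeffs, residual
```

Matching pursuit serves here only as a greedy stand-in for the sparse Bayesian inference of dictionary parameters developed in the papers.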
Developing novel physiological measures
The second thread extends the first, using the learned representations to compute novel physiological measures associated with various social, psychological, and developmental constructs. We have proposed the Sparse EDA Synchrony Measure (SESM), an index that quantifies the similarity of EDA signals jointly modeled through sparse decomposition techniques. SESM was evaluated on in-lab dyadic interactions between couples, showing distinct patterns across tasks of varying intensity and associations with behavioral measures of attachment. Extensions of this thread exploit parametric representations for developing quantifiable behavioral indices of stress, anxiety, and emotional arousal from physiological signals.
- Chaspari et al., “EDA-Gram: Designing Electrodermal Activity Fingerprints for Visualization and Feature Extraction,” EMBC 2016
- Chaspari et al., “Quantifying EDA synchrony through joint sparse representation: A case-study of couples’ interactions,” ICASSP 2015
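A minimal proxy for the synchrony idea can be sketched as follows, assuming a shared dictionary of Gaussian-bump atoms and scoring the overlap between the atoms each signal selects; the published SESM relies on joint sparse decomposition and a more principled similarity score, so this is illustrative only:

```python
import numpy as np

def gaussian_dictionary(length, centers, widths):
    """Shared dictionary of unit-norm Gaussian bumps at various centers and widths."""
    t = np.arange(length, dtype=float)
    atoms = []
    for c in centers:
        for w in widths:
            a = np.exp(-0.5 * ((t - c) / w) ** 2)
            atoms.append(a / np.linalg.norm(a))
    return np.column_stack(atoms)

def sparse_support(x, D, n_atoms):
    """Indices of the n_atoms dictionary atoms most correlated with x (a crude sparse code)."""
    corr = np.abs(D.T @ x)
    return set(np.argsort(corr)[-n_atoms:].tolist())

def synchrony(x1, x2, D, n_atoms=5):
    """Jaccard overlap of the two signals' selected atom supports: 1 means identical atom usage."""
    s1, s2 = sparse_support(x1, D, n_atoms), sparse_support(x2, D, n_atoms)
    return len(s1 & s2) / len(s1 | s2)
```

Two signals whose responses occur at the same times select the same atoms and score high; temporally misaligned responses select disjoint atoms and score low.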
Modeling the interplay between overt and covert signal cues
Psychophysiological signals co-evolve as part of an integrated body system interacting with its environment. This thread focuses on models that quantify the interplay between internal indices and external signal cues or behavioral event markers. We have proposed a non-homogeneous Poisson process that models EDA signals as a time sequence of SCRs, whose rate function incorporates external factors of observable behavior. This has found application in Autism intervention, providing insights into the types of emotion self- and co-regulation events that benefit child-therapist interactions. We have also studied how to appropriately combine physiological cues and linguistic measures of children with Autism and their parents to predict co-occurring verbal response latencies in conversational scenarios, as indicators of social and cognitive load. Another topic of this thread involves the interplay among vocal, behavioral, and physiological indices of arousal during emotionally charged conversations between dating partners, which can contribute to identifying protective factors against interpersonal conflict.
- Chaspari et al., “Dynamical Systems Modeling of Acoustic and Physiological Arousal in Young Couples,” AAAI Spring Symposia 2016
- Chaspari et al., “A non-homogeneous Poisson process model of Skin Conductance Responses integrated with observed regulatory behaviors for Autism intervention,” ICASSP 2014
- Chaspari et al., “Using physiology and language cues for modeling verbal response latencies of children with ASD,” ICASSP 2013
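The event-sequence view of EDA can be illustrated by simulating a non-homogeneous Poisson process whose rate rises during a hypothetical observed regulation window; the sketch uses standard Lewis-Shedler thinning, and the baseline, boost, and window values are illustrative assumptions, not the rate model of the paper:

```python
import numpy as np

def simulate_nhpp(rate_fn, t_max, lam_max, rng):
    """Lewis-Shedler thinning: propose events at constant rate lam_max,
    keep each candidate at time t with probability rate_fn(t) / lam_max."""
    t, events = 0.0, []
    while True:
        t += rng.exponential(1.0 / lam_max)
        if t > t_max:
            return np.array(events)
        if rng.random() < rate_fn(t) / lam_max:
            events.append(t)

def rate(t, baseline=0.2, boost=1.5, window=(20.0, 40.0)):
    """Hypothetical SCR rate: a baseline plus a boost while a behavioral event is observed."""
    return baseline + (boost if window[0] <= t <= window[1] else 0.0)
```

Simulated event times cluster inside the behavioral window, mirroring how the model lets observed regulation behaviors modulate the expected frequency of SCRs.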
Acoustic analysis of emotion and behavior
Acoustic aspects of speech, such as intonation and prosody, are linked to emotion, affect, and several psychopathological factors. We have analyzed non-verbal vocalizations (e.g., laughter) in terms of children's enjoyment patterns. We have further explored the use of speech modulation features to enhance automatic emotion recognition. Finally, the co-regulation of acoustic patterns between children has been studied in relation to their engagement levels during speech-controlled interactive robot-companion games.
- Chaspari et al., “Exploring Children’s Verbal and Acoustic Synchrony: Towards Promoting Engagement in Speech-Controlled Robot-Companion Games,” INTERPERSONAL@ICMI 2015
- Chaspari et al., “Emotion classification of speech using modulation features,” EUSIPCO 2014
- Chaspari et al., “An acoustic analysis of shared enjoyment in ECA interactions of children with Autism,” ICASSP 2012
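To give a flavor of modulation features (a simplified stand-in for those used in the paper, not their actual feature set), the sketch below estimates the fraction of amplitude-envelope energy falling in the syllable-rate 2-8 Hz modulation band; the band limits and envelope-smoothing window are assumptions:

```python
import numpy as np

def modulation_energy(signal, fs, band=(2.0, 8.0)):
    """Fraction of envelope spectral energy in a slow modulation band (e.g. 2-8 Hz)."""
    env = np.abs(signal)
    # smooth with a ~20 ms moving average to approximate the amplitude envelope
    win = max(1, int(0.02 * fs))
    env = np.convolve(env, np.ones(win) / win, mode="same")
    env -= env.mean()
    spec = np.abs(np.fft.rfft(env)) ** 2
    freqs = np.fft.rfftfreq(len(env), d=1.0 / fs)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return spec[mask].sum() / spec.sum()
```

A tone amplitude-modulated at 4 Hz scores high on this measure, while an unmodulated tone scores low, which is the kind of contrast modulation features exploit.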
Narrative structure quantification
Storytelling is a commonly used technique for analyzing people's social and communication skills. Children with Autism are likely to produce less coherent narratives than their typically developing peers and to demonstrate poor building of causal events in a story. We quantify narrative structure by modeling the frequency and evolution of entities, i.e., the co-referent noun phrases that represent the main characters, objects, and ideas in the story. Our features capture the distribution of entity frequencies using decaying probability distributions and their transitions with a Markov chain model. The evolution of entities through the story is represented with step sequences, and their interactions with directed normalized distance measures.
- Chaspari et al., “Analyzing the Structure of Parent-Moderated Narratives from Children with ASD Using an Entity-Based Approach,” Interspeech 2013
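The entity-frequency and transition features can be sketched as follows, assuming the entity mentions have already been extracted and coreference-resolved into a flat sequence (the function name and interface are hypothetical):

```python
import numpy as np
from collections import Counter

def entity_features(mentions):
    """From a sequence of entity mentions, compute the entity-frequency
    distribution and a first-order Markov transition matrix between entities."""
    entities = sorted(set(mentions))
    idx = {e: i for i, e in enumerate(entities)}
    counts = Counter(mentions)
    freq = np.array([counts[e] for e in entities], dtype=float)
    freq /= freq.sum()
    trans = np.zeros((len(entities), len(entities)))
    for a, b in zip(mentions, mentions[1:]):
        trans[idx[a], idx[b]] += 1.0
    row = trans.sum(axis=1, keepdims=True)
    trans = np.divide(trans, row, out=np.zeros_like(trans), where=row > 0)
    return entities, freq, trans
```

A coherent narrative tends to concentrate probability mass on a few recurring entities and to show structured rather than erratic transitions between them.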
Voice activity detection
Voice activity detection (VAD) is the task of distinguishing speech from non-speech segments, such as silence and noise, in an audio signal. It is a fundamental step for many speech processing applications (e.g., language and speaker identification, ASR) and was among the tasks of the DARPA Robust Automatic Transcription of Speech (RATS) program, in which SAIL participated. We developed a system that uses long-term spectral variability features over multiple time and spectral resolutions, taking advantage of the inherent long-term information in speech, which is usually not present in noise.
- Tsiartas et al., “Multi-band long-term signal variability features for robust voice activity detection,” Interspeech 2013
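A simplified sketch in the spirit of a long-term signal variability (LTSV) measure, at a single time-frequency resolution rather than the multi-resolution feature set of the paper: for each frequency bin, the spectrum is normalized over a long context of frames and its temporal entropy is computed; the variance of these entropies across frequency tends to separate time-varying speech from stationary noise. Frame sizes and context length below are illustrative assumptions:

```python
import numpy as np

def ltsv(signal, fs, frame_len=0.025, hop=0.010, R=30):
    """Per-frame long-term variability score: variance across frequency bins of
    the temporal entropy of the spectrum over the last R frames."""
    n, h = int(frame_len * fs), int(hop * fs)
    frames = np.array([signal[i:i + n] * np.hanning(n)
                       for i in range(0, len(signal) - n, h)])
    S = np.abs(np.fft.rfft(frames, axis=1)) ** 2 + 1e-12
    scores = []
    for m in range(R, S.shape[0]):
        block = S[m - R:m]                     # R frames x K bins
        xi = block / block.sum(axis=0)         # normalize over time, per bin
        H = -(xi * np.log(xi)).sum(axis=0)     # temporal entropy per bin
        scores.append(H.var())                 # spread of entropies across bins
    return np.array(scores)
```

For stationary noise, all bins look equally "flat" over time and the entropies cluster tightly (low variance); speech-like signals concentrate energy in some bins during some spans, spreading the entropies apart (high variance).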