Perceptual evaluation for automatic anomaly detection in disordered speech: Focus on ambiguous cases
Publication date: December 2018
Source: Speech Communication, Volume 105
Author(s): Imed Laaridh, Christine Meunier, Corinne Fredouille
Abstract: Perceptual evaluation is still the most common method in clinical practice for diagnosing and following the condition progression of people suffering from dysarthria (or speech disorders more generally). Such evaluations are frequently described as non-trivial, subjective and highly time-consuming (depending on the evaluation level). Most of the time, perceptual assessment is performed individually by clinicians, which can be problematic since judgment may vary from one clinician to th...
Source: Speech Communication - November 2, 2018 Category: Speech-Language Pathology Source Type: research

An ultrasound study of contextual and syllabic effects in consonant sequences produced under heavy articulatory constraint conditions
Publication date: Available online 28 October 2018
Source: Speech Communication
Author(s): Daniel Recasens, Clara Rodríguez
Abstract: The present ultrasound study investigates lingual coarticulatory resistance and differences in tongue configuration and contextual variability for consonants in syllable onset vs coda position in Catalan /iC#Ci/ sequences. In agreement with comparable ultrasound data for /aC#Ca/ sequences reported in an earlier study, coarticulatory resistance turned out to be less for /r/ (which is realized as a trill in onset position and exhibits a more tap-like realization in coda position), /l/ (which is no...
Source: Speech Communication - October 29, 2018 Category: Speech-Language Pathology Source Type: research

Semi-supervised acoustic model training for speech with code-switching
Publication date: Available online 23 October 2018
Source: Speech Communication
Author(s): Emre Yılmaz, Mitchell McLaren, Henk van den Heuvel, David A. van Leeuwen
Abstract: In the FAME! project, we aim to develop an automatic speech recognition (ASR) system for Frisian-Dutch code-switching (CS) speech extracted from the archives of a local broadcaster with the ultimate goal of building a spoken document retrieval system. Unlike Dutch, Frisian is a low-resourced language with a very limited amount of manually annotated speech data. In this paper, we describe several automatic annotation approaches to enable the use of a large am...
Source: Speech Communication - October 23, 2018 Category: Speech-Language Pathology Source Type: research

A new verification of the Speech Transmission Index for the English language
Publication date: Available online 19 October 2018
Source: Speech Communication
Author(s): Lorenzo Morales, Francis F. Li
Abstract: The speech transmission index (STI) is one of the most widely used and standardized methods for objective prediction of speech intelligibility of transmission channels. The original verification of the relationship between the STI and the intelligibility for the English language was published in 1987. The methodology employed then for the listening tests and the different input spectrum recommended today by the current STI method suggest that the relationship STI vs speech intelligibility needs to ...
Source: Speech Communication - October 20, 2018 Category: Speech-Language Pathology Source Type: research

Deep neural network based i-vector mapping for speaker verification using short utterances
Publication date: Available online 19 October 2018
Source: Speech Communication
Author(s): Jinxi Guo, Ning Xu, Kailun Qian, Yang Shi, Kaiyuan Xu, Yingnian Wu, Abeer Alwan
Abstract: Text-independent speaker recognition using short utterances is a highly challenging task due to the large variation and content mismatch between short utterances. I-vector and probabilistic linear discriminant analysis (PLDA) based systems have become the standard in speaker verification applications, but they are less effective with short utterances. In this paper, we first compare two state-of-the-art universal background model (UBM) training metho...
Source: Speech Communication - October 20, 2018 Category: Speech-Language Pathology Source Type: research

Perceptual Evaluation for Automatic Anomaly Detection in Disordered Speech: focus on ambiguous cases
Publication date: Available online 12 October 2018
Source: Speech Communication
Author(s): Imed Laaridh, Christine Meunier, Corinne Fredouille
Abstract: Perceptual evaluation is still the most common method in clinical practice for diagnosing and following the condition progression of people suffering from dysarthria (or speech disorders more generally). Such evaluations are frequently described as non-trivial, subjective and highly time-consuming (depending on the evaluation level). Most of the time, perceptual assessment is performed individually by clinicians, which can be problematic since judgment may vary from one clinicia...
Source: Speech Communication - October 13, 2018 Category: Speech-Language Pathology Source Type: research

The Prosodic Marionette: a method to visualize speech prosody and assess perceptual and expressive prosodic abilities
Publication date: Available online 13 October 2018
Source: Speech Communication
Author(s): Jonathan S. Brumberg, Jill C. Thorson, Rupal Patel
Abstract: Speech technology applications have emerged as a promising method for assessing speech-language abilities and at-home therapy, including prosody. Many applications assume that observed prosody errors are due to an underlying disorder; however, they may be instead due to atypical representations of prosody such as immature and developing speech motor control, or compensatory adaptations by those with congenital neuromotor disorders. The result is the same – vocal productions ma...
Source: Speech Communication - October 13, 2018 Category: Speech-Language Pathology Source Type: research

Robustness metric-based tuning of the augmented Kalman filter for the enhancement of speech corrupted with coloured noise
Publication date: Available online 11 October 2018
Source: Speech Communication
Author(s): Aidan E.W. George, Stephen So, Ratna Ghosh, Kuldip K. Paliwal
Abstract: In this paper, we describe a tuning method based on a robustness metric and extended to work with the augmented Kalman filter for enhancing coloured-noise-corrupted speech. The method proposed within utilises the robustness metric to provide dynamic and adaptive tuning of the Kalman filter gain in order to reduce the residual noise that results from poor speech model estimates. An analysis of the Kalman filter recursion equations is presented that augments the robustn...
Source: Speech Communication - October 11, 2018 Category: Speech-Language Pathology Source Type: research

Interaction between Speech Variations and Background Noise on Speech Intelligibility by Mandarin-Speaking Cochlear Implant Patients
In this study, the effect of varying speaking rates and styles and background noise on speech understanding was investigated in Mandarin-speaking cochlear implant (CI) and normal-hearing (NH) listeners. Thirteen (5 male and 8 female, age 19 to 62 years) Mandarin-speaking, post-lingually deafened adult CI patients using their clinical processors and 9 (5 male and 4 female, age 23 to 59 years) NH subjects listening to unprocessed speech participated. Five different types of speech variations, including 3 speaking rates (slow, normal, fast) and 2 speaking styles (emotional, shouted) were presented with two masking noises (speech-shaped steady state noise-SS...
Source: Speech Communication - October 5, 2018 Category: Speech-Language Pathology Source Type: research

Assessing the Position Tracking Reliability of Carstens’ AG500 and AG501 Electromagnetic Articulographs during Constrained Movements and Speech Tasks
This study has shown that these issues do not affect the newer AG501, which not only performs according to the manufacturer's claim of 0.3 mm dynamical accuracy within a 20-cm-wide spherical region inside the recording volume, but also performs well outside. Furthermore, while the AG500 shows perturbed trajectories in some instances, the AG501 consistently shows accurate results in reproducing the displacements of consonantal and vocalic gestures for the tested speech tasks. Our findings reveal that the AG501 is more stable and significantly more accurate than the previous model, the AG500, which, in turn, performs reasonab...
Source: Speech Communication - October 5, 2018 Category: Speech-Language Pathology Source Type: research

Analysis and classification of the nasal finals in hearing-impaired patients using tongue movement features
In this study, the patient group included 10 young adults with hearing impairment (HI), and the control group included 12 young adults with normal hearing (NH). All participants produced six nasal finals in Mandarin under the same condition, chosen from the Mandarin Chinese phonetic alphabet. Six kinematic features (displacement, duration, maximum velocity, minimum velocity, mean velocity and standard deviation of velocity) were extracted and analyzed in the HI group and compared with those of the NH group. We performed an independent samples t-test to investigate significant differences in the means of the normal and path...
Source: Speech Communication - October 5, 2018 Category: Speech-Language Pathology Source Type: research

A Closed-form Solution to the Graph Total Variation Problem for Continuous Emotion Profiling in Noisy Environment
Publication date: Available online 21 September 2018
Source: Speech Communication
Author(s): Shaoling Jing, Xia Mao, Lijiang Chen, Maria Colomba Comes, Arianna Mencattini, Grazia Raguso, Fabien Ringeval, Björn Schuller, Corrado Di Natale, Eugenio Martinelli
Abstract: Time-continuous emotion estimation (e.g., arousal and valence) from spontaneous speech expressions has recently drawn increasing commercial attention. However, real-life applications of emotion recognition technology require challenging conditions, such as noise from recording devices and background environments. In this work, we introduce a novel personalized e...
Source: Speech Communication - September 21, 2018 Category: Speech-Language Pathology Source Type: research

Editorial Board
Publication date: October 2018
Source: Speech Communication, Volume 103
Source: Speech Communication - September 16, 2018 Category: Speech-Language Pathology Source Type: research

Estimation of the glottal flow from speech pressure signals: Evaluation of three variants of iterative adaptive inverse filtering using computational physical modelling of voice production
Publication date: Available online 12 September 2018
Source: Speech Communication
Author(s): Parham Mokhtari, Brad Story, Paavo Alku, Hiroshi Ando
Abstract: The aim of this study is to comparatively review and evaluate three variants of the glottal inverse filtering algorithm based on iterative adaptive inverse filtering (IAIF): the Standard algorithm, and two recently proposed variants that use iterative optimal preemphasis (IOP) and a glottal flow model (GFM), respectively. To enable an objective evaluation, a computational physical model of voice production is used to generate time-domain signals pertaining to both the input...
Source: Speech Communication - September 12, 2018 Category: Speech-Language Pathology Source Type: research

Single-Channel Multi-talker Speech Recognition with Permutation Invariant Training
Publication date: Available online 7 September 2018
Source: Speech Communication
Author(s): Yanmin Qian, Xuankai Chang, Dong Yu
Abstract: Although great progress has been made in automatic speech recognition (ASR), significant performance degradation is still observed when recognizing multi-talker mixed speech. In this paper, we propose and evaluate several architectures to address this problem under the assumption that only a single channel of mixed signal is available. Our technique extends permutation invariant training (PIT) by introducing the front-end feature separation module with the minimum mean square error (MSE) crit...
Source: Speech Communication - September 8, 2018 Category: Speech-Language Pathology Source Type: research