Coding and decoding of messages in human speech communication: implications for machine recognition of speech

Publication date: Available online 15 December 2018
Source: Speech Communication
Author(s): Hynek Hermansky

Abstract

This paper postulates that the linguistic message in speech is coded redundantly in both the time and the frequency domains. Such redundant coding of the message in the signal evolved over millennia of human evolution so that relevant spectral and temporal properties of human hearing can be used to extract these messages in the presence of noise. This view of human speech suggests a particular architecture for an automatic speech recognition (ASR) system, in which longer temporal segments of spectrally smoothed temporal trajectories of spectral energies in individual frequency bands of speech are used to derive estimates of the posterior probabilities of speech sounds. Combinations of these estimates from reliable frequency bands are then adaptively fused to yield the final probability vectors that best satisfy the adopted performance monitoring criteria. The paper also mentions some ASR systems that already use elements of the suggested architecture.
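The fusion step described in the abstract can be illustrated with a small sketch. The abstract does not specify the fusion rule or the performance monitoring criterion, so both are assumptions here: per-band posteriors are combined by a normalized product, and mean frame entropy stands in for the performance monitor (lower is treated as better). The names `product_fuse`, `mean_entropy`, `fuse_bands`, and the toy array `bands` are hypothetical and not taken from the paper.

```python
from itertools import combinations

import numpy as np


def product_fuse(posteriors):
    """Normalized product of posteriors with shape (bands, frames, classes)."""
    log_p = np.log(np.clip(posteriors, 1e-12, 1.0)).sum(axis=0)
    log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)


def mean_entropy(posteriors):
    """Average per-frame entropy (nats) of a (frames, classes) matrix.

    Assumption: used only as a stand-in performance monitor.
    """
    p = np.clip(posteriors, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum(axis=1).mean())


def fuse_bands(band_posteriors):
    """Search all non-empty band subsets and keep the lowest-entropy fusion."""
    n_bands = band_posteriors.shape[0]
    best = None
    for k in range(1, n_bands + 1):
        for subset in combinations(range(n_bands), k):
            fused = product_fuse(band_posteriors[list(subset)])
            score = mean_entropy(fused)
            if best is None or score < best[0]:
                best = (score, fused, subset)
    return best[1], best[2]


# Toy input: 3 bands, 2 frames, 3 sound classes.
# Bands 0 and 1 agree; band 2 plays the role of a noise-corrupted band.
bands = np.array([
    [[0.7, 0.2, 0.1], [0.2, 0.7, 0.1]],
    [[0.6, 0.3, 0.1], [0.3, 0.6, 0.1]],
    [[0.1, 0.2, 0.7], [0.7, 0.1, 0.2]],
])

fused, chosen = fuse_bands(bands)
print("chosen bands:", chosen)
print("fused posteriors:\n", fused.round(3))
```

On this toy input the search keeps bands 0 and 1 and discards the conflicting band 2. The exhaustive subset search grows exponentially with the number of bands, so it is only practical for a handful of streams, and entropy alone would favor a confidently wrong band, which is why a real system needs a more careful monitor.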
Source: Speech Communication - Category: Speech-Language Pathology Source Type: research