Journal articles on the topic 'Speech waveforms'

Consult the top 50 journal articles for your research on the topic 'Speech waveforms.'
1

Milenkovic, Paul. "Least Mean Square Measures of Voice Perturbation." Journal of Speech, Language, and Hearing Research 30, no. 4 (December 1987): 529–38. http://dx.doi.org/10.1044/jshr.3004.529.

Abstract:
A signal processing technique is described for measuring the jitter, shimmer, and signal-to-noise ratio of sustained vowels. The measures are derived from the least mean square fit of a waveform model to the digitized speech waveform. The speech waveform is digitized at an 8.3 kHz sampling rate, and an interpolation technique is used to improve the temporal resolution of the model fit. The ability of these procedures to measure low levels of perturbation is evaluated both on synthetic speech waveforms and on the speech recorded from subjects with normal voice characteristics.
2

Coorman, Geert. "Speech synthesis using concatenation of speech waveforms." Journal of the Acoustical Society of America 124, no. 6 (2008): 3371. http://dx.doi.org/10.1121/1.3047443.

3

Ferit Gigi, Ercan. "Speech Synthesis Using Concatenation Of Speech Waveforms." Journal of the Acoustical Society of America 129, no. 1 (2011): 545. http://dx.doi.org/10.1121/1.3554813.

4

Coorman, Geert. "Speech synthesis using concatenation of speech waveforms." Journal of the Acoustical Society of America 116, no. 3 (2004): 1331. http://dx.doi.org/10.1121/1.1809938.

5

Xiong, Yan, Fang Xu, Qiang Chen, and Jun Zhang. "Speech Enhancement Using Heterogeneous Information." International Journal of Grid and High Performance Computing 10, no. 3 (July 2018): 46–59. http://dx.doi.org/10.4018/ijghpc.2018070104.

Abstract:
This article describes how to use heterogeneous information in speech enhancement. In most current speech enhancement systems, clean speech is recovered only from signals collected by acoustic microphones, which are greatly affected by acoustic noise. However, heterogeneous information from different kinds of sensors, usually called the “multi-stream,” is seldom used in speech enhancement because speech waveforms cannot be recovered from the signals provided by many kinds of sensors. In this article, the authors propose a new model-based multi-stream speech enhancement framework that can make use of the heterogeneous information provided by signals from different kinds of sensors, even when some of them are not directly related to the speech waveform. A new speech enhancement scheme using acoustic and throat microphone recordings is then proposed based on this framework. Experimental results show that the proposed scheme outperforms several single-stream speech enhancement methods in different noisy environments.
6

Kleijn, W. B. "Encoding speech using prototype waveforms." IEEE Transactions on Speech and Audio Processing 1, no. 4 (1993): 386–99. http://dx.doi.org/10.1109/89.242484.

7

Maddieson, Ian. "Commentary on ‘Reading waveforms’." Journal of the International Phonetic Association 21, no. 2 (December 1991): 89–91. http://dx.doi.org/10.1017/s0025100300004436.

Abstract:
The previous issue of the Journal contained a discussion by Peter Ladefoged about interpreting the information in a speech waveform (JIPA, 21, 32–34), noting that examination of waveform displays is becoming commonplace with the easy availability of personal computer tools for digitizing and editing. As Ladefoged noted: “Several aspects of sounds are clearly distinguishable from the waveforms of a phrase. Stop closures are very evident, as are differences between voiced sounds which have repetitive waveforms and voiceless sounds which do not. Differences in amplitude can be used to distinguish high frequency, high intensity sibilants from lower intensity non-sibilant fricatives; and nasals and laterals usually have smaller amplitudes than the louder adjacent vowels. An expanded view of the waveform allows us to see intervals between peaks in the damped wave of a voiced sound, and thus to calculate the frequency of the first formant. Nasals can often be distinguished from vowels in these expanded waveforms, not only by their smaller amplitudes but also by the less clear formant structure.”
8

Yohanes, Banu W. "Linear Prediction and Long Term Predictor Analysis and Synthesis." Techné : Jurnal Ilmiah Elektroteknika 16, no. 01 (April 3, 2017): 49–58. http://dx.doi.org/10.31358/techne.v16i01.158.

Abstract:
Spectral analysis may not provide an accurate description of speech articulation. This article presents an experimental setup for representing the speech waveform directly in terms of time-varying parameters related to the transfer function of the vocal tract. Linear Prediction (LP) and Long Term Predictor (LTP) analysis and synthesis filters are designed and implemented, and the theory behind them is introduced. The workflows of the filters are explained in detail, along with their code. Original waveforms are framed with a Hamming window, the filters are applied to each frame, and the reconstructed speech is compared to the original waveforms. The results show that LP and LTP analysis can be used in DSPs owing to the periodic character of speech, although some distortion may be introduced, as examined in the experiments.
9

Arda, Betul, Daniel Rudoy, and Patrick J. Wolfe. "Testing for periodicity in speech waveforms." Journal of the Acoustical Society of America 125, no. 4 (April 2009): 2699. http://dx.doi.org/10.1121/1.4784326.

10

Terzopoulos, D. "Co-occurrence analysis of speech waveforms." IEEE Transactions on Acoustics, Speech, and Signal Processing 33, no. 1 (February 1985): 5–30. http://dx.doi.org/10.1109/tassp.1985.1164511.

11

Latypov, R. Kh, R. R. Nigmatullin, and E. L. Stolov. "Classification of speech files by waveforms." Lobachevskii Journal of Mathematics 36, no. 4 (October 2015): 496–502. http://dx.doi.org/10.1134/s1995080215040265.

12

Verneuil, Andrew, Bruce R. Gerratt, David A. Berry, Ming Ye, Jody Kreiman, and Gerald S. Berke. "Modeling Measured Glottal Volume Velocity Waveforms." Annals of Otology, Rhinology & Laryngology 112, no. 2 (February 2003): 120–31. http://dx.doi.org/10.1177/000348940311200204.

Abstract:
The source-filter theory of speech production describes a glottal energy source (volume velocity waveform) that is filtered by the vocal tract and radiates from the mouth as phonation. The characteristics of the volume velocity waveform, the source that drives phonation, have been estimated, but never directly measured at the glottis. To accomplish this measurement, constant temperature anemometer probes were used in an in vivo canine constant pressure model of phonation. A 3-probe array was positioned supraglottically, and an endoscopic camera was positioned subglottically. Simultaneous recordings of airflow velocity (using anemometry) and glottal area (using stroboscopy) were made in 3 animals. Glottal airflow velocities and areas were combined to produce direct measurements of glottal volume velocity waveforms. The anterior and middle parts of the glottis contributed significantly to the volume velocity waveform, with less contribution from the posterior part of the glottis. The measured volume velocity waveforms were successfully fitted to a well-known laryngeal airflow model. A noninvasive measured volume velocity waveform holds promise for future clinical use.
13

Das, Amitava, and Eddie L. T. Choy. "Synthesis of speech from pitch prototype waveforms by time-synchronous waveform interpolation." Journal of the Acoustical Society of America 117, no. 3 (2005): 993. http://dx.doi.org/10.1121/1.1896665.

14

Fox, Lisa G., and Susan D. Dalebout. "Use of the Median Method to Enhance Detection of the Mismatch Negativity in the Responses of Individual Listeners." Journal of the American Academy of Audiology 13, no. 02 (February 2002): 083–92. http://dx.doi.org/10.1055/s-0040-1715951.

Abstract:
The median method was evaluated as an alternative way of expressing the mismatch negativity (MMN). Traditionally, signal averaging has been used to extract these event-related potentials from unwanted background noise. However, mean values are biased by unrejected artifact that skews the relatively small distribution of values on which the MMN is based. Because the median is a more valid measure of central tendency in asymmetric distributions, it may describe MMN data more accurately. Better representation of the signal in the median waveform might enhance detection of the MMN in the responses of individual listeners. Mean and median waveforms were computed from previously recorded MMN data. Visually identified MMNs were validated using area and onset latency criteria. Detectability of the MMN was not improved using median waveforms. Despite this result, a theoretical argument for use of the median is presented.
15

Sprague, Richard P. "Compression of stored waveforms for artificial speech." Journal of the Acoustical Society of America 90, no. 4 (October 1991): 2220. http://dx.doi.org/10.1121/1.401581.

16

Payton, Karen L., and Louis D. Braida. "A method to determine the speech transmission index from speech waveforms." Journal of the Acoustical Society of America 106, no. 6 (December 1999): 3637–48. http://dx.doi.org/10.1121/1.428216.

17

Wang, Jiaquan, Qijun Huang, Qiming Ma, Sheng Chang, Jin He, Hao Wang, Xiao Zhou, Fang Xiao, and Chao Gao. "Classification of VLF/LF Lightning Signals Using Sensors and Deep Learning Methods." Sensors 20, no. 4 (February 14, 2020): 1030. http://dx.doi.org/10.3390/s20041030.

Abstract:
Lightning waveform plays an important role in lightning observation, location, and lightning disaster investigation. Based on a large amount of lightning waveform data provided by existing real-time very low frequency/low frequency (VLF/LF) lightning waveform acquisition equipment, an automatic and accurate lightning waveform classification method becomes extremely important. With the widespread application of deep learning in image and speech recognition, it becomes possible to use deep learning to classify lightning waveforms. In this study, 50,000 lightning waveform samples were collected. The data were divided into the following categories: positive cloud ground flash, negative cloud ground flash, cloud ground flash with ionosphere reflection signal, positive narrow bipolar event, negative narrow bipolar event, positive pre-breakdown process, negative pre-breakdown process, continuous multi-pulse cloud flash, bipolar pulse, and skywave. A multi-layer one-dimensional convolutional neural network (1D-CNN) was designed to automatically extract VLF/LF lightning waveform features and distinguish lightning waveforms. The model achieved an overall accuracy of 99.11% on the lightning dataset and an overall accuracy of 97.55% in a thunderstorm process. Considering its excellent performance, this model could be used in lightning sensors to assist in lightning monitoring and positioning.
18

Chen, Jie, and Jing Chen. "The Operation of Cool Edit Pro in Corpus-Based Spoken Language." Advanced Materials Research 694-697 (May 2013): 2383–87. http://dx.doi.org/10.4028/www.scientific.net/amr.694-697.2383.

Abstract:
Corpus refers to a database of language materials. Cool Edit Pro is a media editing software package. This paper explores how to construct a spoken language corpus and how to use Cool Edit Pro 2 to display sound-wave contrasts, giving experimenters an intuitive view of their own speech waveforms. The key is to offer a clear waveform contrast among the sampled waveform of the native speaker, the original, unmodified waveform of the experimenter, and the experimenter's new waveform after modifications and the teacher's instructions, which makes autonomous oral learning more feasible and scientific. From long or short waves, troughs or crests, smooth or sharp waves, the experimenter's deviations from the standard can be easily identified during autonomous practice and efficiently corrected. Additionally, experimenters can also observe their improvements frequently, which makes the experiment more instructive.
19

Ng, C. S., and P. H. Milenkovic. "Unstable covariance LPC solutions from nonstationary speech waveforms." IEEE Transactions on Acoustics, Speech, and Signal Processing 37, no. 5 (May 1989): 651–54. http://dx.doi.org/10.1109/29.17557.

20

Breen, Andrew Paul. "Synthesising speech by converting phonemes to digital waveforms." Journal of the Acoustical Society of America 115, no. 4 (2004): 1401. http://dx.doi.org/10.1121/1.1738269.

21

Kang, George S., and Lawrence J. Fransen. "Method and apparatus for generating modified speech from pitch-synchronous segmented speech waveforms." Journal of the Acoustical Society of America 107, no. 6 (2000): 2950. http://dx.doi.org/10.1121/1.429368.

22

Bergstrom, Chad Scott. "Method and apparatus for synthesis of speech excitation waveforms." Journal of the Acoustical Society of America 104, no. 5 (November 1998): 2556. http://dx.doi.org/10.1121/1.423784.

23

McKinley, Paula S., Peter A. Shapiro, Emilia Bagiella, Michael M. Myers, Ronald E. De Meersman, Igor Grant, and Richard P. Sloan. "Deriving heart period variability from blood pressure waveforms." Journal of Applied Physiology 95, no. 4 (October 2003): 1431–38. http://dx.doi.org/10.1152/japplphysiol.01110.2002.

Abstract:
International standards for calculating heart period variability (HPV) from a series of R-wave intervals (R-R) in an electrocardiographic (ECG) recording have been widely accepted. It is possible, and potentially useful in various settings, to use systolic blood pressure waveform intervals to estimate HPV, but the validity of HPV derived from blood pressure (BP) waveforms has not been established. To test the reliability between BP- and ECG-derived HPV indexes, we evaluated data from 234 healthy adults in four studies of HPV reactivity to stress. Study conditions included resting baseline, arithmetic, Stroop test, speech presentation, and orthostatic tilt. Continuous ECG and BP recordings were sampled at a rate of 500 Hz, scored by the same methods, and used to calculate heart rate and time- and frequency-domain measures of HPV. Overall, reliability between the two methods was very high for computing heart rate and HPV indexes. High-frequency HPV indexes were somewhat less reliably computed. In conclusion, in healthy adults, with the use of appropriate methods, BP waveforms can produce reliable indexes of HPV.
24

Freyman, Richard L., G. Patrick Nerbonne, and Heather A. Cote. "Effect of Consonant-Vowel Ratio Modification on Amplitude Envelope Cues for Consonant Recognition." Journal of Speech, Language, and Hearing Research 34, no. 2 (April 1991): 415–26. http://dx.doi.org/10.1044/jshr.3402.415.

Abstract:
This investigation examined the degree to which modification of the consonant-vowel (C-V) intensity ratio affected consonant recognition under conditions in which listeners were forced to rely more heavily on waveform envelope cues than on spectral cues. The stimuli were 22 vowel-consonant-vowel utterances, which had been mixed at six different signal-to-noise ratios with white noise that had been modulated by the speech waveform envelope. The resulting waveforms preserved the gross speech envelope shape, but spectral cues were limited by the white-noise masking. In a second stimulus set, the consonant portion of each utterance was amplified by 10 dB. Sixteen subjects with normal hearing listened to the unmodified stimuli, and 16 listened to the amplified-consonant stimuli. Recognition performance was reduced in the amplified-consonant condition for some consonants, presumably because waveform envelope cues had been distorted. However, for other consonants, especially the voiced stops, consonant amplification improved recognition. Patterns of errors were altered for several consonant groups, including some that showed only small changes in recognition scores. The results indicate that when spectral cues are compromised, nonlinear amplification can alter waveform envelope cues for consonant recognition.
25

Tucker, Denise A., Susan Dietrich, Stacy Harris, and Sarah Pelletier. "Effects of Stimulus Rate and Gender on the Auditory Middle Latency Response." Journal of the American Academy of Audiology 13, no. 03 (March 2002): 146–53. http://dx.doi.org/10.1055/s-0040-1715956.

Abstract:
The effects of stimulus rate and gender on auditory middle latency response (AMLR) waveforms were examined in 20 young adult male and female subjects. Four repetition rates were presented (1.1/sec, 4.1/sec, 7.7/sec, and 11.3/sec). Stimulus repetition rate had a significant effect on Pa latency, Pa amplitude, and Pb amplitude: Pa and Pb amplitudes decreased, and Pa latency significantly increased, with increasing stimulus rate. No significant differences were seen in Pb latency or recording site. Gender had a significant effect on Pa latency and Pa amplitude: Pa latencies were longer in male subjects, and Pa amplitudes were larger in female subjects. Gender did not have a significant effect on the Pb waveform.
26

Sharma, Pranav, Puneet Kochar, Priti Soin, and Steven Cohen. "Bisystolic Vertebral Artery: Critical Finding or can be Ignored?" Journal of Clinical Imaging Science 9 (January 31, 2019): 2. http://dx.doi.org/10.4103/jcis.jcis_80_18.

Abstract:
Carotid Doppler imaging in three adults, presenting with vertigo, with transient speech difficulty, and for cardiac pre-bypass graft surgery, revealed two systolic peaks in one of the vertebral arteries. In presteal situations, the vertebral artery waveform shows either two systolic peaks with a sharp first and a rounded second peak, or two systolic peaks with a deep cleft between them with antegrade flow. As stenosis increases beyond 80%, flow becomes bidirectional and later reverses. We discuss the types of presteal vertebral artery waveforms, their clinical implications, and a brief review of the literature.
27

Neel, Amy T. "Using Acoustic Phonetics in Clinical Practice." Perspectives on Speech Science and Orofacial Disorders 20, no. 1 (July 2010): 14–24. http://dx.doi.org/10.1044/ssod20.1.14.

Abstract:
Acoustic phonetics deals with the physical aspects of speech sounds associated with the production and perception of speech. Acoustic measurement techniques can be used by speech-language pathologists to assess and treat a variety of speech disorders. In this article, we will review the source-filter theory of speech production, acoustic theory of vowels, and acoustic properties of consonants. We will examine how visual displays of acoustic information in the form of waveforms, amplitude spectra, and spectrograms can be used to analyze aspects of speech that might be difficult to hear and serve to provide biofeedback to clients to improve their speech production.
28

Palmer, Shannon B., and Frank E. Musiek. "N1-P2 Recordings to Gaps in Broadband Noise." Journal of the American Academy of Audiology 24, no. 01 (January 2013): 037–45. http://dx.doi.org/10.3766/jaaa.24.1.5.

Abstract:
Background: Normal temporal processing is important for the perception of speech in quiet and in difficult listening situations. Temporal resolution is commonly measured using a behavioral gap detection task, where the patient or subject must participate in the evaluation process. This is difficult to achieve with subjects who cannot reliably complete a behavioral test. However, recent research has investigated the use of evoked potential measures to evaluate gap detection. Purpose: The purpose of the current study was to record N1-P2 responses to gaps in broadband noise in normal hearing young adults. Comparisons were made of the N1 and P2 latencies, amplitudes, and morphology to different length gaps in noise in an effort to quantify the changing responses of the brain to these stimuli. It was the goal of this study to show that electrophysiological recordings can be used to evaluate temporal resolution and measure the influence of short and long gaps on the N1-P2 waveform. Research Design: This study used a repeated-measures design. All subjects completed a behavioral gap detection procedure to establish their behavioral gap detection threshold (BGDT). N1-P2 waveforms were recorded to the gap in a broadband noise. Gap durations were 20 msec, 2 msec above their BGDT, and 2 msec. These durations were chosen to represent a suprathreshold gap, a near-threshold gap, and a subthreshold gap. Study Sample: Fifteen normal-hearing young adult females were evaluated. Subjects were recruited from the local university community. Data Collection and Analysis: Latencies and amplitudes for N1 and P2 were compared across gap durations for all subjects using a repeated-measures analysis of variance. A qualitative description of responses was also included. Results: Most subjects did not display an N1-P2 response to a 2 msec gap, but all subjects had present clear evoked potential responses to 20 msec and 2+ msec gaps. 
Decreasing gap duration toward threshold resulted in decreasing waveform amplitude. However, N1 and P2 latencies remained stable as gap duration changed. Conclusions: N1-P2 waveforms can be elicited by gaps in noise in young normal-hearing adults. The responses are present as low as 2 msec above behavioral gap detection thresholds (BGDT). Gaps that are below BGDT do not generally evoke an electrophysiological response. These findings indicate that when a waveform is present, the gap duration is likely above the listener's BGDT. Waveform amplitude is also a good index of gap detection, since amplitude decreases with decreasing gap duration. Future studies in this area will focus on various age groups and individuals with auditory disorders.
29

Hamon, Christian. "Processing device for speech synthesis by addition of overlapping waveforms." Journal of the Acoustical Society of America 101, no. 4 (April 1997): 1766. http://dx.doi.org/10.1121/1.418194.

30

Wohlert, Amy B. "Event-Related Brain Potentials Preceding Speech and Nonspeech Oral Movements of Varying Complexity." Journal of Speech, Language, and Hearing Research 36, no. 5 (October 1993): 897–905. http://dx.doi.org/10.1044/jshr.3605.897.

Abstract:
Cortical preparation for movement is reflected in the readiness potential (RP) waveform preceding voluntary limb movements. In the case of oral movements, the RP may be affected by the complexity or linguistic nature of the tasks. In this experiment, EEG potentials before a nonspeech task (lip pursing), a speech-like task (lip rounding), and single word production were recorded from scalp electrodes placed at the cranial vertex (Cz) and over the left and right motor strips (C3′ and C4′). Seven right-handed female subjects produced at least 70 repetitions of the three tasks, in each of five repeated sessions. EEG records were averaged with respect to EMG onset at the lip. The word task, as opposed to the other tasks, was associated with greater negative amplitude in the RP waveform at the vertex site. Differences between the waveforms recorded at the right- and left-hemisphere sites were insignificant. Although intersubject variability was high, individuals had relatively stable patterns of response across sessions. Results suggest that the RP recorded at the vertex site is sensitive to changes in task complexity. The RP did not reflect lateralized activity indicative of hemispheric dominance.
31

Rothenberg, Martin, and James J. Mahshie. "Monitoring Vocal Fold Abduction through Vocal Fold Contact Area." Journal of Speech, Language, and Hearing Research 31, no. 3 (September 1988): 338–51. http://dx.doi.org/10.1044/jshr.3103.338.

Abstract:
A number of commercial devices for measuring the transverse electrical conductance of the thyroid cartilage produce waveforms that can be useful for monitoring movements within the larynx during voice production, especially movements that are closely related to the time-variation of the contact between the vocal folds as they vibrate. This paper compares the various approaches that can be used to apply such a device, usually referred to as an electroglottograph, to the problem of monitoring the time-variation of vocal fold abduction and adduction during voiced speech. One method, in which a measure of relative vocal fold abduction is derived from the duty cycle of the linear-phase high pass filtered electroglottograph waveform, is developed in detail.
32

Zhao, Yong. "Refining of segmental boundaries in speech waveforms using contextual-dependent models." Journal of the Acoustical Society of America 128, no. 6 (2010): 3827. http://dx.doi.org/10.1121/1.3544446.

33

Gibson, J. "Digital coding of waveforms: Principles and applications to speech and video." IEEE Transactions on Acoustics, Speech, and Signal Processing 33, no. 6 (December 1985): 1636–37. http://dx.doi.org/10.1109/tassp.1985.1164724.

34

Bergstrom, Chad S. "Method and apparatus for characterization and reconstruction of speech excitation waveforms." Journal of the Acoustical Society of America 102, no. 5 (1997): 2481. http://dx.doi.org/10.1121/1.420376.

35

Gibson, J. D. "Digital coding of waveforms: Principles and applications to speech and video." Proceedings of the IEEE 75, no. 4 (1987): 526–27. http://dx.doi.org/10.1109/proc.1987.13765.

36

Jayant, N. S., and P. Noll. "Digital coding of waveforms. Principles and applications to speech and video." Signal Processing 9, no. 2 (September 1985): 139–40. http://dx.doi.org/10.1016/0165-1684(85)90053-2.

37

Glista, Danielle, Vijayalakshmi Easwar, David W. Purcell, and Susan Scollie. "A Pilot Study on Cortical Auditory Evoked Potentials in Children: Aided CAEPs Reflect Improved High-Frequency Audibility with Frequency Compression Hearing Aid Technology." International Journal of Otolaryngology 2012 (2012): 1–12. http://dx.doi.org/10.1155/2012/982894.

Abstract:
Background. This study investigated whether cortical auditory evoked potentials (CAEPs) could reliably be recorded and interpreted using clinical testing equipment, to assess the effects of hearing aid technology on the CAEP. Methods. Fifteen normal hearing (NH) and five hearing impaired (HI) children were included in the study. NH children were tested unaided; HI children were tested while wearing hearing aids. CAEPs were evoked with tone bursts presented at a suprathreshold level. Presence/absence of CAEPs was established based on agreement between two independent raters. Results. Present waveforms were interpreted for most NH listeners and all HI listeners, when stimuli were measured to be at an audible level. The younger NH children were found to have significantly different waveform morphology, compared to the older children, with grand averaged waveforms differing in the later part of the time window (the N2 response). Results suggest that in some children, frequency compression hearing aid processing improved audibility of specific frequencies, leading to increased rates of detectable cortical responses in HI children. Conclusions. These findings provide support for the use of CAEPs in measuring hearing aid benefit. Further research is needed to validate aided results across a larger group of HI participants and with speech-based stimuli.
38

Beyreuther, M., and J. Wassermann. "Hidden semi-Markov Model based earthquake classification system using Weighted Finite-State Transducers." Nonlinear Processes in Geophysics 18, no. 1 (February 14, 2011): 81–89. http://dx.doi.org/10.5194/npg-18-81-2011.

Abstract:
Automatic earthquake detection and classification is required for efficient analysis of large seismic datasets. Such techniques are particularly important now because access to measures of ground motion is nearly unlimited and the target waveforms (earthquakes) are often hard to detect and classify. Here, we propose to use models from speech synthesis which extend the double stochastic models from speech recognition by integrating a more realistic duration of the target waveforms. The method, which has general applicability, is applied to earthquake detection and classification. First, we generate characteristic functions from the time-series. The Hidden semi-Markov Models are estimated from the characteristic functions and Weighted Finite-State Transducers are constructed for the classification. We test our scheme on one month of continuous seismic data, which corresponds to 370 151 classifications, showing that incorporating the time dependency explicitly in the models significantly improves the results compared to Hidden Markov Models.
39

Howard, Mary F., and David Poeppel. "Discrimination of Speech Stimuli Based on Neuronal Response Phase Patterns Depends on Acoustics But Not Comprehension." Journal of Neurophysiology 104, no. 5 (November 2010): 2500–2511. http://dx.doi.org/10.1152/jn.00251.2010.

Abstract:
Speech stimuli give rise to neural activity in the listener that can be observed as waveforms using magnetoencephalography. Although waveforms vary greatly from trial to trial due to activity unrelated to the stimulus, it has been demonstrated that spoken sentences can be discriminated based on theta-band (3–7 Hz) phase patterns in single-trial response waveforms. Furthermore, manipulations of the speech signal envelope and fine structure that reduced intelligibility were found to produce correlated reductions in discrimination performance, suggesting a relationship between theta-band phase patterns and speech comprehension. This study investigates the nature of this relationship, hypothesizing that theta-band phase patterns primarily reflect cortical processing of low-frequency (<40 Hz) modulations present in the acoustic signal and required for intelligibility, rather than processing exclusively related to comprehension (e.g., lexical, syntactic, semantic). Using stimuli that are quite similar to normal spoken sentences in terms of low-frequency modulation characteristics but are unintelligible (i.e., their time-inverted counterparts), we find that discrimination performance based on theta-band phase patterns is equal for both types of stimuli. Consistent with earlier findings, we also observe that whereas theta-band phase patterns differ across stimuli, power patterns do not. We use a simulation model of the single-trial response to spoken sentence stimuli to demonstrate that phase-locked responses to low-frequency modulations of the acoustic signal can account not only for the phase but also for the power results. The simulation offers insight into the interpretation of the empirical results with respect to phase-resetting and power-enhancement models of the evoked response.
40

Rosen, Stuart, John Walliker, Judith A. Brimacombe, and Bradly J. Edgerton. "Prosodic and Segmental Aspects of Speech Perception with the House/3M Single-Channel Implant." Journal of Speech, Language, and Hearing Research 32, no. 1 (March 1989): 93–111. http://dx.doi.org/10.1044/jshr.3201.93.

Abstract:
Four adult users of the House/3M single-channel cochlear implant were tested for their ability to label question and statement intonation contours (by auditory means alone) and to identify a set of 12 intervocalic consonants (with and without lipreading). Nineteen of 20 scores obtained on the question/statement task were significantly better than chance. Simplifying the stimulating waveform so as to signal fundamental frequency alone sometimes led to an improvement in performance. In consonant identification, lipreading alone scores were always far inferior to those obtained by lipreading with the implant. Phonetic feature analyses showed that the major effect of using the implant was to increase the transmission of voicing information, although improvements in the appropriate labelling of manner distinctions were also found. Place of articulation was poorly identified from the auditory signal alone. These results are best explained by supposing that subjects can use the relatively gross temporal information found in the stimulating waveforms (periodicity, randomness and silence) in a linguistic fashion. Amplitude envelope cues are of significant, but secondary, importance. By providing information that is relatively invisible, the House/3M device can thus serve as an important aid to lipreading, even though it relies primarily on the temporal structure of the stimulating waveform. All implant systems, including multi-channel ones, might benefit from the appropriate exploitation of such temporal features.
41

Tür, Gökhan, Dilek Hakkani-Tür, Andreas Stolcke, and Elizabeth Shriberg. "Integrating Prosodic and Lexical Cues for Automatic Topic Segmentation." Computational Linguistics 27, no. 1 (March 2001): 31–57. http://dx.doi.org/10.1162/089120101300346796.

Abstract:
We present a probabilistic model that uses both prosodic and lexical cues for the automatic segmentation of speech into topically coherent units. We propose two methods for combining lexical and prosodic information using hidden Markov models and decision trees. Lexical information is obtained from a speech recognizer, and prosodic features are extracted automatically from speech waveforms. We evaluate our approach on the Broadcast News corpus, using the DARPA-TDT evaluation metrics. Results show that the prosodic model alone is competitive with word-based segmentation methods. Furthermore, we achieve a significant reduction in error by combining the prosodic and word-based knowledge sources.
42

Wohlert, Amy B., and Anne Smith. "Developmental Change in Variability of Lip Muscle Activity During Speech." Journal of Speech, Language, and Hearing Research 45, no. 6 (December 2002): 1077–87. http://dx.doi.org/10.1044/1092-4388(2002/086).

Abstract:
Compared to adults, children's speech production measures sometimes show higher trial-to-trial variability in both kinematic and acoustic analyses. A reasonable hypothesis is that this variability reflects variations in neural drive to muscles as the developing system explores different solutions to achieving vocal tract goals. We investigated that hypothesis in the present study by analyzing EMG waveforms produced across repetitions of a phrase spoken by 7-year-olds, 12-year-olds, and young adults. The EMG waveforms recorded via surface electrodes at upper lip sites were clearly modulated in a consistent manner corresponding to lip closure for the bilabial consonants in the utterance. Thus we were able to analyze the amplitude envelope of the rectified EMG with a phrase-level variability index previously used with kinematic data. Both the 7- and 12-year-old children were significantly more variable on repeated productions than the young adults. These results support the idea that children are using varying combinations of muscle activity to achieve phonetic goals. Even at age 12 years, these children were not adult-like in their performance. These and earlier kinematic studies of the oral motor system suggest that children retain their flexibility, employing more degrees of freedom than adults, to dynamically control lip aperture during speech. This strategy is adaptive given the many neurophysiological and biomechanical changes that occur during the transition from adolescence to adulthood.
43

Stokes, Michael A. "Identification of vowels based on visual cues within raw complex speech waveforms." Journal of the Acoustical Society of America 99, no. 4 (April 1996): 2589–603. http://dx.doi.org/10.1121/1.415250.

44

Zhang, Fawen, Chelsea Benson, and Steven J. Cahn. "Cortical Encoding of Timbre Changes in Cochlear Implant Users." Journal of the American Academy of Audiology 24, no. 01 (January 2013): 046–58. http://dx.doi.org/10.3766/jaaa.24.1.6.

Abstract:
Background: Most cochlear implant (CI) users describe music as a noise-like and unpleasant sound. Using behavioral tests, most prior studies have shown that perception of pitch-based melody and timbre is poor in CI users. Purpose: This article focuses on cortical encoding of timbre changes in CI users, which may allow us to find solutions to further improve CI benefits. Furthermore, this study may demonstrate the value of using objective measures to reveal neural encoding of timbre changes. Research Design: A case-control study of the mismatch negativity (MMN) was conducted using an electrophysiological technique. To derive MMNs, three randomly arranged oddball paradigms were presented, consisting of the standard/deviant instrument pairs saxophone/piano, cello/trombone, and flute/French horn, respectively. Study Sample: Ten CI users and ten normal-hearing (NH) listeners participated in this study. Data Collection and Analysis: After filtering, epoching, and baseline correction, independent component analysis (ICA) was performed to remove artifacts. The averaged waveforms in response to the standard stimuli (STANDARD waveform) and the deviant stimuli (DEVIANT waveform) in each condition were separately derived. The responses from nine electrodes in the fronto-central area were averaged to form one waveform. The STANDARD waveform was subtracted from the DEVIANT waveform to derive the difference waveform, for which the MMN was judged to be present or absent. The measures used to evaluate the MMN included the MMN peak latency and amplitude as well as MMN duration. Results: The MMN, which reflects the ability to automatically detect acoustic changes, was present in all NH listeners but in only approximately half of the CI users. In CI users with present MMNs, the MMN peak amplitude and duration were significantly smaller and shorter than those in NH listeners. Conclusions: Our electrophysiological results were consistent with prior behavioral findings that CI users' timbre perception is significantly poorer than that of NH listeners. Our results may suggest that timbre information is poorly registered in the auditory cortex of CI users and that the capability to automatically detect timbre changes is degraded in CI users. Although the MMN has some limitations in CI users, it may, along with other objective auditory evoked potential tools, be a useful objective indicator of the extent of sound registration in auditory cortex in future efforts to improve CI design and speech-processing strategies.
45

Zhang, Ming. "Using Concha Electrodes to Measure Cochlear Microphonic Waveforms and Auditory Brainstem Responses." Trends in Amplification 14, no. 4 (December 2010): 211–17. http://dx.doi.org/10.1177/1084713810388811.

46

Painter, Colin, John M. Fredrickson, Timothy Kaiser, and Roanne Karzon. "Human Speech Development for an Implantable Artificial Larynx." Annals of Otology, Rhinology & Laryngology 96, no. 5 (September 1987): 573–77. http://dx.doi.org/10.1177/000348948709600519.

Abstract:
An electromagnetic artificial larynx was implanted in two volunteer laryngectomees. Both patients were able to communicate well, but the voice quality still needed improving. Therefore, in this investigation, listener judgments were obtained of 22 different sound sources with a view to incorporating the preferred speech sound in a new version of the device. Electroglottograms were used as sound sources in a speech synthesizer and sentences were produced with different voice qualities for judgmental tests. The results of the listening tests showed a distinct preference for waveforms corresponding to a long completely open phase, a very brief completely closed phase, and an abrupt closing gesture. The optimum acoustic characteristics for the device will be used by electrical engineers to manufacture a new version of the artificial larynx with an improved voice quality.
47

Alku, Paavo, Erkki Vilkman, and Anne-Maria Laukkanen. "Parameterization of the Voice Source by Combining Spectral Decay and Amplitude Features of the Glottal Flow." Journal of Speech, Language, and Hearing Research 41, no. 5 (October 1998): 990–1002. http://dx.doi.org/10.1044/jslhr.4105.990.

Abstract:
A new method is presented for the parameterization of glottal volume velocity waveforms that have been estimated by inverse filtering acoustic speech pressure signals. The new technique, Parameter for Spectral and Amplitude Features of the Glottal Flow (PSA), combines two features of voice production, the AC value and the spectral decay of the glottal flow, both of which contribute to changes in vocal loudness. PSA yields a single parameter that characterizes the glottal flow in different loudness conditions. By analyzing voices of 8 speakers it was shown that the new parameter correlates strongly with the sound pressure level of speech.
48

Foti, Enzo. "Method of speech synthesis by means of concatenation and partial overlapping of waveforms." Journal of the Acoustical Society of America 105, no. 2 (1999): 587. http://dx.doi.org/10.1121/1.427008.

49

Grant, P. M. "Book review: Digital Coding of Waveforms—Principles and Applications to Speech and Video." IEE Proceedings F Communications, Radar and Signal Processing 132, no. 3 (1985): 186. http://dx.doi.org/10.1049/ip-f-1.1985.0041.

50

Rothenberg, Martin. "Correcting Low-Frequency Phase Distortion in Electroglottograph Waveforms." Journal of Voice 16, no. 1 (March 2002): 32–36. http://dx.doi.org/10.1016/s0892-1997(02)00069-3.
