Journal articles on the topic 'Speech prediction EEG'

To see the other types of publications on this topic, follow the link: Speech prediction EEG.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Select a source type:

Consult the top 50 journal articles for your research on the topic 'Speech prediction EEG.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Press on it, and we will generate automatically the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as pdf and read online its abstract whenever available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Maki, Hayato, Sakriani Sakti, Hiroki Tanaka, and Satoshi Nakamura. "Quality prediction of synthesized speech based on tensor structured EEG signals." PLOS ONE 13, no. 6 (June 14, 2018): e0193521. http://dx.doi.org/10.1371/journal.pone.0193521.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Gibson, Jerry. "Entropy Power, Autoregressive Models, and Mutual Information." Entropy 20, no. 10 (September 30, 2018): 750. http://dx.doi.org/10.3390/e20100750.

Full text
Abstract:
Autoregressive processes play a major role in speech processing (linear prediction), seismic signal processing, biological signal processing, and many other applications. We consider the quantity defined by Shannon in 1948, the entropy rate power, and show that the log ratio of entropy powers equals the difference in the differential entropy of the two processes. Furthermore, we use the log ratio of entropy powers to analyze the change in mutual information as the model order is increased for autoregressive processes. We examine when we can substitute the minimum mean squared prediction error for the entropy power in the log ratio of entropy powers, thus greatly simplifying the calculations to obtain the differential entropy and the change in mutual information and therefore increasing the utility of the approach. Applications to speech processing and coding are given and potential applications to seismic signal processing, EEG classification, and ECG classification are described.
APA, Harvard, Vancouver, ISO, and other styles
3

Sohoglu, Ediz, and Matthew H. Davis. "Perceptual learning of degraded speech by minimizing prediction error." Proceedings of the National Academy of Sciences 113, no. 12 (March 8, 2016): E1747—E1756. http://dx.doi.org/10.1073/pnas.1523266113.

Full text
Abstract:
Human perception is shaped by past experience on multiple timescales. Sudden and dramatic changes in perception occur when prior knowledge or expectations match stimulus content. These immediate effects contrast with the longer-term, more gradual improvements that are characteristic of perceptual learning. Despite extensive investigation of these two experience-dependent phenomena, there is considerable debate about whether they result from common or dissociable neural mechanisms. Here we test single- and dual-mechanism accounts of experience-dependent changes in perception using concurrent magnetoencephalographic and EEG recordings of neural responses evoked by degraded speech. When speech clarity was enhanced by prior knowledge obtained from matching text, we observed reduced neural activity in a peri-auditory region of the superior temporal gyrus (STG). Critically, longer-term improvements in the accuracy of speech recognition following perceptual learning resulted in reduced activity in a nearly identical STG region. Moreover, short-term neural changes caused by prior knowledge and longer-term neural changes arising from perceptual learning were correlated across subjects with the magnitude of learning-induced changes in recognition accuracy. These experience-dependent effects on neural processing could be dissociated from the neural effect of hearing physically clearer speech, which similarly enhanced perception but increased rather than decreased STG responses. Hence, the observed neural effects of prior knowledge and perceptual learning cannot be attributed to epiphenomenal changes in listening effort that accompany enhanced perception. Instead, our results support a predictive coding account of speech perception; computational simulations show how a single mechanism, minimization of prediction error, can drive immediate perceptual effects of prior knowledge and longer-term perceptual learning of degraded speech.
APA, Harvard, Vancouver, ISO, and other styles
4

Shen, Stanley, Jess R. Kerlin, Heather Bortfeld, and Antoine J. Shahin. "The Cross-Modal Suppressive Role of Visual Context on Speech Intelligibility: An ERP Study." Brain Sciences 10, no. 11 (November 2, 2020): 810. http://dx.doi.org/10.3390/brainsci10110810.

Full text
Abstract:
The efficacy of audiovisual (AV) integration is reflected in the degree of cross-modal suppression of the auditory event-related potentials (ERPs, P1-N1-P2), while stronger semantic encoding is reflected in enhanced late ERP negativities (e.g., N450). We hypothesized that increasing visual stimulus reliability should lead to more robust AV-integration and enhanced semantic prediction, reflected in suppression of auditory ERPs and enhanced N450, respectively. EEG was acquired while individuals watched and listened to clear and blurred videos of a speaker uttering intact or highly-intelligible degraded (vocoded) words and made binary judgments about word meaning (animate or inanimate). We found that intact speech evoked larger negativity between 280–527-ms than vocoded speech, suggestive of more robust semantic prediction for the intact signal. For visual reliability, we found that greater cross-modal ERP suppression occurred for clear than blurred videos prior to sound onset and for the P2 ERP. Additionally, the later semantic-related negativity tended to be larger for clear than blurred videos. These results suggest that the cross-modal effect is largely confined to suppression of early auditory networks with weak effect on networks associated with semantic prediction. However, the semantic-related visual effect on the late negativity may have been tempered by the vocoded signal’s high-reliability.
APA, Harvard, Vancouver, ISO, and other styles
5

Sriraam, N. "EEG Based Thought Translator." International Journal of Biomedical and Clinical Engineering 2, no. 1 (January 2013): 50–62. http://dx.doi.org/10.4018/ijbce.2013010105.

Full text
Abstract:
A brain computer interface is a communication system that translates brain activities into commands for a computer. For physically disabled people, who cannot express their needs through verbal mode (such as thirst, appetite etc), a brain-computer interface (BCI) is the only feasible channel for communicating with others. This technology has the capability of providing substantial independence and hence, a greatly improved quality of life for the physically disabled persons. The BCI technique utilizes electrical brain potentials to directly communicate to devices such as a personal computer system. Cerebral electric activity is recorded via the electroencephalogram (EEG) electrodes attached to the scalp measure the electric signals of the brain. These signals are transmitted to the computer, which transforms them into device control commands. The efficiency of the BCI techniques lies in the extraction of suitable features from EEG signals followed by the classification scheme. This paper focuses on development of brain-computer interface model for motor imagery tasks such as movement of left hand, right hand etc. Several time domain features namely, spike rhythmicity, autoregressive method by Burgs, auto regression with exogenous input, autoregressive method based on Levinson are used by varying the prediction order. Frequency domain method involving estimation of power spectral density using Welch and Burg’s method are applied. A binary classification based on recurrent neural network is used. An optimal classification of the imagery tasks with an overall accuracy of 100% is achieved based on configuring the neural network model and varying the extracted feature and EEG channels optimally. A device command translator finally converts these tasks into speech thereby providing the practical usage of this model for real-time BCI application.
APA, Harvard, Vancouver, ISO, and other styles
6

Weissbart, Hugo, Katerina D. Kandylaki, and Tobias Reichenbach. "Cortical Tracking of Surprisal during Continuous Speech Comprehension." Journal of Cognitive Neuroscience 32, no. 1 (January 2020): 155–66. http://dx.doi.org/10.1162/jocn_a_01467.

Full text
Abstract:
Speech comprehension requires rapid online processing of a continuous acoustic signal to extract structure and meaning. Previous studies on sentence comprehension have found neural correlates of the predictability of a word given its context, as well as of the precision of such a prediction. However, they have focused on single sentences and on particular words in those sentences. Moreover, they compared neural responses to words with low and high predictability, as well as with low and high precision. However, in speech comprehension, a listener hears many successive words whose predictability and precision vary over a large range. Here, we show that cortical activity in different frequency bands tracks word surprisal in continuous natural speech and that this tracking is modulated by precision. We obtain these results through quantifying surprisal and precision from naturalistic speech using a deep neural network and through relating these speech features to EEG responses of human volunteers acquired during auditory story comprehension. We find significant cortical tracking of surprisal at low frequencies, including the delta band as well as in the higher frequency beta and gamma bands, and observe that the tracking is modulated by the precision. Our results pave the way to further investigate the neurobiology of natural speech comprehension.
APA, Harvard, Vancouver, ISO, and other styles
7

MacGregor, Lucy J., Jennifer M. Rodd, Rebecca A. Gilbert, Olaf Hauk, Ediz Sohoglu, and Matthew H. Davis. "The Neural Time Course of Semantic Ambiguity Resolution in Speech Comprehension." Journal of Cognitive Neuroscience 32, no. 3 (March 2020): 403–25. http://dx.doi.org/10.1162/jocn_a_01493.

Full text
Abstract:
Semantically ambiguous words challenge speech comprehension, particularly when listeners must select a less frequent (subordinate) meaning at disambiguation. Using combined magnetoencephalography (MEG) and EEG, we measured neural responses associated with distinct cognitive operations during semantic ambiguity resolution in spoken sentences: (i) initial activation and selection of meanings in response to an ambiguous word and (ii) sentence reinterpretation in response to subsequent disambiguation to a subordinate meaning. Ambiguous words elicited an increased neural response approximately 400–800 msec after their acoustic offset compared with unambiguous control words in left frontotemporal MEG sensors, corresponding to sources in bilateral frontotemporal brain regions. This response may reflect increased demands on processes by which multiple alternative meanings are activated and maintained until later selection. Disambiguating words heard after an ambiguous word were associated with marginally increased neural activity over bilateral temporal MEG sensors and a central cluster of EEG electrodes, which localized to similar bilateral frontal and left temporal regions. This later neural response may reflect effortful semantic integration or elicitation of prediction errors that guide reinterpretation of previously selected word meanings. Across participants, the amplitude of the ambiguity response showed a marginal positive correlation with comprehension scores, suggesting that sentence comprehension benefits from additional processing around the time of an ambiguous word. Better comprehenders may have increased availability of subordinate meanings, perhaps due to higher quality lexical representations and reflected in a positive correlation between vocabulary size and comprehension success.
APA, Harvard, Vancouver, ISO, and other styles
8

Moinuddin, Kazi Ashraf, Felix Havugimana, Rakib Al-Fahad, Gavin M. Bidelman, and Mohammed Yeasin. "Unraveling Spatial-Spectral Dynamics of Speech Categorization Speed Using Convolutional Neural Networks." Brain Sciences 13, no. 1 (December 30, 2022): 75. http://dx.doi.org/10.3390/brainsci13010075.

Full text
Abstract:
The process of categorizing sounds into distinct phonetic categories is known as categorical perception (CP). Response times (RTs) provide a measure of perceptual difficulty during labeling decisions (i.e., categorization). The RT is quasi-stochastic in nature due to individuality and variations in perceptual tasks. To identify the source of RT variation in CP, we have built models to decode the brain regions and frequency bands driving fast, medium and slow response decision speeds. In particular, we implemented a parameter optimized convolutional neural network (CNN) to classify listeners’ behavioral RTs from their neural EEG data. We adopted visual interpretation of model response using Guided-GradCAM to identify spatial-spectral correlates of RT. Our framework includes (but is not limited to): (i) a data augmentation technique designed to reduce noise and control the overall variance of EEG dataset; (ii) bandpower topomaps to learn the spatial-spectral representation using CNN; (iii) large-scale Bayesian hyper-parameter optimization to find best performing CNN model; (iv) ANOVA and posthoc analysis on Guided-GradCAM activation values to measure the effect of neural regions and frequency bands on behavioral responses. Using this framework, we observe that α−β (10–20 Hz) activity over left frontal, right prefrontal/frontal, and right cerebellar regions are correlated with RT variation. Our results indicate that attention, template matching, temporal prediction of acoustics, motor control, and decision uncertainty are the most probable factors in RT variation.
APA, Harvard, Vancouver, ISO, and other styles
9

Cimtay, Yucel, and Erhan Ekmekcioglu. "Investigating the Use of Pretrained Convolutional Neural Network on Cross-Subject and Cross-Dataset EEG Emotion Recognition." Sensors 20, no. 7 (April 4, 2020): 2034. http://dx.doi.org/10.3390/s20072034.

Full text
Abstract:
The electroencephalogram (EEG) has great attraction in emotion recognition studies due to its resistance to deceptive actions of humans. This is one of the most significant advantages of brain signals in comparison to visual or speech signals in the emotion recognition context. A major challenge in EEG-based emotion recognition is that EEG recordings exhibit varying distributions for different people as well as for the same person at different time instances. This nonstationary nature of EEG limits the accuracy of it when subject independency is the priority. The aim of this study is to increase the subject-independent recognition accuracy by exploiting pretrained state-of-the-art Convolutional Neural Network (CNN) architectures. Unlike similar studies that extract spectral band power features from the EEG readings, raw EEG data is used in our study after applying windowing, pre-adjustments and normalization. Removing manual feature extraction from the training system overcomes the risk of eliminating hidden features in the raw data and helps leverage the deep neural network’s power in uncovering unknown features. To improve the classification accuracy further, a median filter is used to eliminate the false detections along a prediction interval of emotions. This method yields a mean cross-subject accuracy of 86.56% and 78.34% on the Shanghai Jiao Tong University Emotion EEG Dataset (SEED) for two and three emotion classes, respectively. It also yields a mean cross-subject accuracy of 72.81% on the Database for Emotion Analysis using Physiological Signals (DEAP) and 81.8% on the Loughborough University Multimodal Emotion Dataset (LUMED) for two emotion classes. Furthermore, the recognition model that has been trained using the SEED dataset was tested with the DEAP dataset, which yields a mean prediction accuracy of 58.1% across all subjects and emotion classes. Results show that in terms of classification accuracy, the proposed approach is superior to, or on par with, the reference subject-independent EEG emotion recognition studies identified in literature and has limited complexity due to the elimination of the need for feature extraction.
APA, Harvard, Vancouver, ISO, and other styles
10

Strauß, Antje, Sonja A. Kotz, and Jonas Obleser. "Narrowed Expectancies under Degraded Speech: Revisiting the N400." Journal of Cognitive Neuroscience 25, no. 8 (August 2013): 1383–95. http://dx.doi.org/10.1162/jocn_a_00389.

Full text
Abstract:
Under adverse listening conditions, speech comprehension profits from the expectancies that listeners derive from the semantic context. However, the neurocognitive mechanisms of this semantic benefit are unclear: How are expectancies formed from context and adjusted as a sentence unfolds over time under various degrees of acoustic degradation? In an EEG study, we modified auditory signal degradation by applying noise-vocoding (severely degraded: four-band, moderately degraded: eight-band, and clear speech). Orthogonal to that, we manipulated the extent of expectancy: strong or weak semantic context (±con) and context-based typicality of the sentence-last word (high or low: ±typ). This allowed calculation of two distinct effects of expectancy on the N400 component of the evoked potential. The sentence-final N400 effect was taken as an index of the neural effort of automatic word-into-context integration; it varied in peak amplitude and latency with signal degradation and was not reliably observed in response to severely degraded speech. Under clear speech conditions in a strong context, typical and untypical sentence completions seemed to fulfill the neural prediction, as indicated by N400 reductions. In response to moderately degraded signal quality, however, the formed expectancies appeared more specific: Only typical (+con +typ), but not the less typical (+con −typ) context–word combinations led to a decrease in the N400 amplitude. The results show that adverse listening “narrows,” rather than broadens, the expectancies about the perceived speech signal: limiting the perceptual evidence forces the neural system to rely on signal-driven expectancies, rather than more abstract expectancies, while a sentence unfolds over time.
APA, Harvard, Vancouver, ISO, and other styles
11

Shen, Deju, Yuqin Deng, Chunyan Lin, Jianshu Li, Xuehua Lin, and Chaoning Zou. "Clinical Characteristics and Gene Mutation Analysis of Poststroke Epilepsy." Contrast Media & Molecular Imaging 2022 (August 29, 2022): 1–10. http://dx.doi.org/10.1155/2022/4801037.

Full text
Abstract:
Epilepsy is one of the most common brain disorders worldwide. Poststroke epilepsy (PSE) affects functional retrieval after stroke and brings considerable social values. A stroke occurs when the blood circulation to the brain fails, causing speech difficulties, memory loss, and paralysis. An electroencephalogram (EEG) is a tool that may detect anomalies in brain electrical activity, including those induced by a stroke. Using EEG data to determine the electrical action in the brains of stroke patients is an effort to measure therapy. Hence in this paper, deep learning assisted gene mutation analysis (DL-GMA) was utilized for classifying poststroke epilepsy in patients. This study suggested a model categorizing poststroke patients based on EEG signals that utilized wavelet, long short-term memory (LSTM), and convolutional neural networks (CNN). Gene mutation analysis can help determine the cause of an individual’s epilepsy, leading to an accurate diagnosis and the best probable medical management. The test outcomes show the viability of noninvasive approaches that quickly evaluate brain waves to monitor and detect daily stroke diseases. The simulation outcomes demonstrate that the proposed GL-GMA achieves a high accuracy ratio of 98.3%, a prediction ratio of 97.8%, a precision ratio of 96.5%, and a recall ratio of 95.6% and decreases the error rate 10.3% compared to other existing methods.
APA, Harvard, Vancouver, ISO, and other styles
12

Voitenkov, V. B., A. B. A. B. Palchick, N. A. Savelieva, and E. P. Bogdanova. "Bioelectric activity of the brain in 3-4 years old children in eyes-open resting state." Translational Medicine 8, no. 4 (November 18, 2021): 47–56. http://dx.doi.org/10.18705/2311-4495-2021-8-4-47-56.

Full text
Abstract:
Background. Electroencephalography is the main technique for assessing the functional state of the brain. Indications for EEG are diagnosis of paroxysmal states, prediction of the outcome of a pathological state, evaluation of bioelectrical activity if brain death is suspected. Up to 90 % of the native EEG in calm wakefulness in healthy individuals is occupied by “alpha activity”. In children in active wakefulness, the EEG pattern depends to a great extent on their age.Objective. The aim of the work was to assess EEG parameters in children aged 3–4 years in eyes-open resting state. Design and methods. 31 healthy participants aged 3–4 years were enrolled. EEG was registered for 30 minutes in a state of passive wakefulness in the supine position with open eyes. Average values of the power of the spectra for the alpha-rhythm, delta-rhythm and theta-rhythm in the frontal and temporal leads, as well as the ratio of the average power of alpha/theta and alpha/delta rhythms in the frontal and temporal leads were calculated.Results. Average power of the alpha-rhythm was significantly higher over the right frontal lobe than over the right frontal-temporal area, as well as average amplitude of it was significantly higher in F3-A1 than F7-A1, F4-A2 than F8-A2, which is associated with the articulatory praxis. Average alpha-rhythm power was significantly higher in T5-A1 than T3-A1 and T6-A2 than T4-A2, which corresponds to the recognition and naming of objects optically. Significant differences according to the total average power of the alpha- and theta-rhythms above the frontal and frontal-temporal regions reflect the relationship between the frontal cortex temporal lobes and the premotor zones, i.e. arcuate bundle, responsible for the “speech system”.Conclusion. The identified patterns can reflect the characteristics of the state of active wakefulness in a 3–4-year-old child and can be used for comparison in the future (both in the course of behavioral experiments and observation of patients with certain pathological processes).
APA, Harvard, Vancouver, ISO, and other styles
13

Goller, Lisa, Michael Schwartze, Ana Pinheiro, and Sonja Kotz. "M52. VOICES IN THE HEAD: AUDITORY VERBAL HALLUCINATIONS (AVH) IN HEALTHY INDIVIDUALS." Schizophrenia Bulletin 46, Supplement_1 (April 2020): S153—S154. http://dx.doi.org/10.1093/schbul/sbaa030.364.

Full text
Abstract:
Abstract Background Auditory verbal hallucinations (AVH) are conscious sensory experiences occurring in the absence of external stimulation. AVH are experienced by 75% of individuals diagnosed with schizophrenia and can manifest in other neuropsychiatric disorders. However, AVH are also reported amongst healthy individuals. This implies that hearing voices is not necessarily linked to psychopathology. Amongst voice hearers, the likelihood of AVH seems to reflect individual differences in hallucination proneness (HP). The HP construct allows placing individuals on a psychosis continuum ranging from non-clinical to clinical experiences. Clinical voice hearers tend to misattribute internal events to external sources (externalization bias). Specifically, they seem to experience altered sensory feedback in response to self-initiated stimuli: Although more predictable, clinical voice hearers show similar, neurophysiological responses in reaction to self-initiated vs. externally presented stimuli. EEG studies suggest that this aberrance of prediction is associated with diminished N1-suppression effects that are observed in healthy individuals in response to self-initiated stimuli. Accordingly, clinical voice hearers may have problems differentiating between self-initiated and externally generated speech, potentially leading to externalization of their own speech. In line with this proposal, the current study focusses on non-clinical aspects of the psychosis continuum in healthy voice hearers and controls. This approach avoids confounding factors (medication, disease onset/duration etc.) that typically impede comparisons of clinical and non-clinical voice hearers. By utilizing insights on prediction from the forward model concept within the auditory-sensory domain, we want to investigate how N1-amplitudes in reaction to one’s own or someone else’s voice are modulated as a function of HP. Next to ascertaining the mechanism behind AVH, this research could give direction to identifying risk factors that potentiate the emergence of first-incidence psychosis. Methods HP was assessed by means of the Launay-Slade Hallucination Scale. Each participant’s voice was recorded prior to EEG data acquisition (monosyllabic utterances, “ah” & “oh”, duration = 500 ms). Voice stimuli were morphed with an anchor voice, so that voice identity could be alternated from self- to other-voice (0%, 40%, 50%, 60%, 100%). To contrast neurophysiological responses between self- vs. externally generated voice stimuli, a well-established motor-to-auditory paradigm was used: In a motor-to-auditory condition (MAC) participants were prompted to press a button, thereby eliciting a voice stimulus (self-initiation). In an auditory-only condition (AOC), participants were prompted to passively listen to the voice stimulus (external generation). The motor-only condition (MOC), in which participants executed the button press only, served as a control condition to correct for motor activity in MAC. Results Data from 38 participants replicate the classical N1-suppression effects for self-initiated vs. externally generated self-voice stimuli. This pattern of suppression is also visible for other-voice stimuli. Furthermore, current findings seem to replicate reversed N1-suppression for self-voice in individuals with high HP. Discussion Preliminary findings suggest that HP modulates voice identity processing. More specifically, HP determines how voice stimuli are processed within the internal and external domain. 
Particularly, individuals with high HP show a reversal of N1-suppression for self-voice stimuli, which corroborates the external biasing hypothesis.
APA, Harvard, Vancouver, ISO, and other styles
14

Acharyya, Rupam, Shouman Das, Ankani Chattoraj, and Md Iftekhar Tanveer. "FairyTED: A Fair Rating Predictor for TED Talk Data." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 01 (April 3, 2020): 338–45. http://dx.doi.org/10.1609/aaai.v34i01.5368.

Full text
Abstract:
With the recent trend of applying machine learning in every aspect of human life, it is important to incorporate fairness into the core of the predictive algorithms. We address the problem of predicting the quality of public speeches while being fair with respect to sensitive attributes of the speakers, e.g. gender and race. We use the TED talks as an input repository of public speeches because it consists of speakers from a diverse community and has a wide outreach. Utilizing the theories of Causal Models, Counterfactual Fairness and state-of-the-art neural language models, we propose a mathematical framework for fair prediction of the public speaking quality. We employ grounded assumptions to construct a causal model capturing how different attributes affect public speaking quality. This causal model contributes in generating counterfactual data to train a fair predictive model. Our framework is general enough to utilize any assumption within the causal model. Experimental results show that while prediction accuracy is comparable to recent work on this dataset, our predictions are counterfactually fair with respect to a novel metric when compared to true data labels. The FairyTED setup not only allows organizers to make informed and diverse selection of speakers from the unobserved counterfactual possibilities but it also ensures that viewers and new users are not influenced by unfair and unbalanced ratings from arbitrary visitors to the ted.com website when deciding to view a talk.
APA, Harvard, Vancouver, ISO, and other styles
15

Shidlovskaya, Tetiana, Tamara Shidlovskaya, Nikolay Kozak, and Lyubov Petruk. "Statе of bioelectric activity of the brain in persons who received acoustic trauma in area of combat actions with a different stage of disorders in the auditory system." OTORHINOLARYNGOLOGY, no. 1(1) 2018 (March 27, 2018): 17–25. http://dx.doi.org/10.37219/2528-8253-2018-1-17.

Full text
Abstract:
Topicality: Providing medical care to patients with combat acoustic trauma remains a topical issue of military medicine. There are works in the literature that show changes in the central nervous system under the influence of intense noise and at acoustic trauma, however, only in individual studies this objective assessment of the functional state of the central nervous system in patients with sensorineural hearing loss is shown as well as the promising use of them. Aim: is to determine the most significant indicators of bioelectric activity of the brain according to the EEG in terms of prediction of the course and ways of co-rejection of sensorineural hearing disorders in persons who have received an acute trauma in the area of fighting. Materials and methods: A group of servicemen with acoustic trauma was examined with the most characteristic, typical forms of audiometric curves – with a downward, precipitous type of the curve, which were divided into three groups depending on the degree of severity of sensorineural deafness. Group 1 – patients with initial non-expressed violations of the function of sound perception in the basal part of the cochlea, group 2 – with a more significant SDP accompanied by violations of speech and supra-vocal audiometry, the 3 groups included patients with severe violations of auditory function, lesions of the mediobasal part of the cochlea, often – with a "break" of perception of tones in the conventional range. A total of 205 patients with acoustic trauma were examined. As a control group, 15 healthy normal people were examined. The EEG study was carried out using the computer electroencephalometry of the firm "DX-System" (Ukraine) according to the generally accepted methodology according to the scheme of electrodes "10-20" Results: In qualitative analysis of electroencephalograms, servicemen with combat acoustic trauma revealed deviations from the norm in the functional state of the central nervous system, expressed in varying degrees, with the most characteristic decreasing of the bioelectric activity of the brain, irritative changes, disorganization and desynchronization of rhythms, more often in the temporal and frontal leads. The most significant changes were in patients with more severe hearing impairment (group 3). These changes indicate signs of severe cortex irritation and deep brain structures in servicemen with acoustic trauma from the combat zone. The analysis of EEG quantitative indicators showed that changes in the bioelectric activity of the cerebral brain in patients with acoustic trauma were manifested by deformation of the basic rhythm with modulation and weakened response to functional loads, especially in the anterior leads. Patients had the significantly (P<0,05) decreased percentage of alfa rhythm in the normal picture of the EEG and the increased representation of beta and delta rhythm, both in the background recording and in the functional loading of photostimulations and hyperventilation . The most significant (P<0,05) changes in bioelectric activity, in comparison with the control group, were observed in individuals 2 and, personally, in 3 groups, with more significant violations of auditory function. We also conducted a comparative analysis of EEG quantitative indicators among the study groups. 
The results of the research indicate a reliable (P<0,05) difference in the indices in the groups, from the first to the third group there was an increase in the representation of delta, theta and beta rhythm, most in the forward projections, and the decrease in the proportion of alpha rhythm. Moreover, these tendencies were maintained both during the background recording and at the functional loads. Conclusions: Thus, the servicemen with an acoustic trauma revealed objective signs of functional disorders in the cortical and deep structures of the brain. As the auditory function decreases in patients with acoustic trauma and redistribution of the main EEG rhythms in the direction of the growth of manifestations of slow-wave activity on a disorganized background occurs, especially in the frontal and temporal infections. In the subjects we surveyed with severe violations of auditory function, there are significantly more significant changes in the central nervous system than in patients with an initial SDE, which should be taken into account when carrying out treatment and preventive measures aimed at rehabilitation of the victims of combat operations with acoustic trauma.
APA, Harvard, Vancouver, ISO, and other styles
16

Schädler, Marc René. "Interactive spatial speech recognition maps based on simulated speech recognition experiments." Acta Acustica 6 (2022): 31. http://dx.doi.org/10.1051/aacus/2022028.

Full text
Abstract:
In their everyday life, the speech recognition performance of human listeners is influenced by diverse factors, such as the acoustic environment, the talker and listener positions, possibly impaired hearing, and optional hearing devices. Prediction models come closer to considering all required factors simultaneously to predict the individual speech recognition performance in complex, that is, e.g. multi-source dynamic, acoustic environments. While such predictions may still not be sufficiently accurate for serious applications, such as, e.g. individual hearing aid fitting, they can already be performed. This raises an interesting question: What could we do if we had a perfect speech intelligibility model? In a first step, means to explore and interpret the predicted outcomes of large numbers of speech recognition experiments would be helpful, and large amounts of data demand an accessible, that is, easily comprehensible, representation. In this contribution, an interactive, that is, user manipulable, representation of speech recognition performance is proposed and investigated by means of a concrete example, which focuses on the listener’s head orientation and the spatial dimensions – in particular width and depth – of an acoustic scene. An exemplary modeling toolchain, that is, a combination of an acoustic model, a hearing device model, and a listener model, was used to generate a data set for demonstration purposes. Using the spatial speech recognition maps to explore this data set demonstrated the suitability of the approach to observe possibly relevant listener behavior. The proposed representation was found to be a suitable target to compare and validate modeling approaches in ecologically relevant contexts, and should help to explore possible applications of future speech recognition models. Ultimately, it may serve as a tool to use validated prediction models in the design of spaces and devices which take speech communication into account.
APA, Harvard, Vancouver, ISO, and other styles
17

Accou, Bernd, Mohammad Jalilpour Monesi, Hugo Van hamme, and Tom Francart. "Predicting speech intelligibility from EEG in a non-linear classification paradigm *." Journal of Neural Engineering 18, no. 6 (November 15, 2021): 066008. http://dx.doi.org/10.1088/1741-2552/ac33e9.

Full text
Abstract:
Abstract Objective. Currently, only behavioral speech understanding tests are available, which require active participation of the person being tested. As this is infeasible for certain populations, an objective measure of speech intelligibility is required. Recently, brain imaging data has been used to establish a relationship between stimulus and brain response. Linear models have been successfully linked to speech intelligibility but require per-subject training. We present a deep-learning-based model incorporating dilated convolutions that operates in a match/mismatch paradigm. The accuracy of the model’s match/mismatch predictions can be used as a proxy for speech intelligibility without subject-specific (re)training. Approach. We evaluated the performance of the model as a function of input segment length, electroencephalography (EEG) frequency band and receptive field size while comparing it to multiple baseline models. Next, we evaluated performance on held-out data and finetuning. Finally, we established a link between the accuracy of our model and the state-of-the-art behavioral MATRIX test. Main results. The dilated convolutional model significantly outperformed the baseline models for every input segment length, for all EEG frequency bands except the delta and theta band, and receptive field sizes between 250 and 500 ms. Additionally, finetuning significantly increased the accuracy on a held-out dataset. Finally, a significant correlation (r = 0.59, p = 0.0154) was found between the speech reception threshold (SRT) estimated using the behavioral MATRIX test and our objective method. Significance. Our method is the first to predict the SRT from EEG for unseen subjects, contributing to objective measures of speech intelligibility.
APA, Harvard, Vancouver, ISO, and other styles
18

Nogueira, Waldo, and Hanna Dolhopiatenko. "Predicting speech intelligibility from a selective attention decoding paradigm in cochlear implant users." Journal of Neural Engineering 19, no. 2 (April 1, 2022): 026037. http://dx.doi.org/10.1088/1741-2552/ac599f.

Full text
Abstract:
Abstract Objectives. Electroencephalography (EEG) can be used to decode selective attention in cochlear implant (CI) users. This work investigates if selective attention to an attended speech source in the presence of a concurrent speech source can predict speech understanding in CI users. Approach. CI users were instructed to attend to one out of two speech streams while EEG was recorded. Both speech streams were presented to the same ear and at different signal to interference ratios (SIRs). Speech envelope reconstruction of the to-be-attended speech from EEG was obtained by training decoders using regularized least squares. The correlation coefficient between the reconstructed and the attended ( ρ A SIR ) or the unattended ρ U SIR speech stream at each SIR was computed. Additionally, we computed the difference correlation coefficient at the same ( ρ Diff = ρ A SIR − ρ U SIR ) and opposite SIR ( ρ DiffOpp = ρ A SIR − ρ U − SIR ) . ρ Diff compares the attended and unattended correlation coefficient to speech sources presented at different presentation levels depending on SIR. In contrast, ρ DiffOpp compares the attended and unattended correlation coefficients to speech sources presented at the same presentation level irrespective of SIR. Main results. Selective attention decoding in CI users is possible even if both speech streams are presented monaurally. A significant effect of SIR on ρ A SIR , ρ Diff and ρ DiffOpp, but not on ρ U SIR , was observed. Finally, the results show a significant correlation between speech understanding performance and ρ A SIR as well as with ρ U SIR across subjects. Moreover, ρ DiffOpp which is less affected by the CI artifact, also demonstrated a significant correlation with speech understanding. Significance. Selective attention decoding in CI users is possible, however care needs to be taken with the CI artifact and the speech material used to train the decoders. These results are important for future development of objective speech understanding measures for CI users.
APA, Harvard, Vancouver, ISO, and other styles
19

Summers, Van, Ken W. Grant, Brian E. Walden, Mary T. Cord, Rauna K. Surr, and Mounya Elhilali. "Evaluation of A “Direct-Comparison” Approach to Automatic Switching In Omnidirectional/Directional Hearing Aids." Journal of the American Academy of Audiology 19, no. 09 (October 2008): 708–20. http://dx.doi.org/10.3766/jaaa.19.9.6.

Full text
Abstract:
Background: Hearing aids today often provide both directional (DIR) and omnidirectional (OMNI) processing options with the currently active mode selected automatically by the device. The most common approach to automatic switching involves “acoustic scene analysis” where estimates of various acoustic properties of the listening environment (e.g., signal-to-noise ratio [SNR], overall sound level) are used as a basis for switching decisions. Purpose: The current study was carried out to evaluate an alternative, “direct-comparison” approach to automatic switching that does not involve assumptions about how the listening environment may relate to microphone preferences. Predictions of microphone preference were based on whether DIR- or OMNI-processing of a given listening environment produced a closer match to a reference template representing the spectral and temporal modulations present in clean speech. Research Design: A descriptive and correlational study. Predictions of OMNI/DIR preferences were determined based on degree of similarity between spectral and temporal modulations contained in a reference, clean-speech template, and in OMNI- and DIR-processed recordings of various listening environments. These predictions were compared with actual preference judgments (both real-world judgments and laboratory responses to the recordings). Data Collection And Analysis: Predictions of microphone preference were based on whether DIR- or OMNI-processing of a given listening environment produced a closer match to a reference template representing clean speech. The template is the output of an auditory processing model that characterizes the spectral and temporal modulations associated with a given input signal (clean speech in this case). A modified version of the spectro-temporal modulation index (mSTMI) was used to compare the template to both DIR- and OMNI-processed versions of a given listening environment, as processed through the same auditory model. These analyses were carried out on recordings (originally collected by Walden et al, 2007) of OMNI- and DIR-processed speech produced in a range of everyday listening situations. Walden et al reported OMNI/DIR preference judgments made by raters at the same time the field recordings were made and judgments based on laboratory presentations of these recordings to hearing-impaired and normal-hearing listeners. Preference predictions based on the mSTMI analyses were compared with both sets of preference judgments. Results: The mSTMI analyses showed better than 92% accuracy in predicting the field preferences and 82–85% accuracy in predicting the laboratory preference judgments. OMNI processing tended to be favored over DIR processing in cases where the analysis indicated fairly similar mSTMI scores across the two processing modes. This is consistent with the common clinical assignment of OMNI mode as the default setting, most likely to be preferred in cases where neither mode produces a substantial improvement in SNR. Listeners experienced with switchable OMNI/DIR hearing aids were more likely than other listeners to favor the DIR mode in instances where mSTMI scores only slightly favored DIR processing. Conclusions: A direct-comparison approach to OMNI/DIR mode selection was generally successful in predicting user preferences in a range of listening environments. Future modifications to the approach to further improve predictive accuracy are discussed.
APA, Harvard, Vancouver, ISO, and other styles
20

Kaur, Gurpreet, Mohit Srivastava, and Amod Kumar. "Genetic Algorithm for Combined Speaker and Speech Recognition using Deep Neural Networks." Journal of Telecommunications and Information Technology 2 (June 29, 2018): 23–31. http://dx.doi.org/10.26636/jtit.2018.119617.

Full text
Abstract:
Huge growth is observed in the speech and speaker recognition field due to many artificial intelligence algorithms being applied. Speech is used to convey messages via the language being spoken, emotions, gender and speaker identity. Many real applications in healthcare are based upon speech and speaker recognition, e.g. a voice-controlled wheelchair helps control the chair. In this paper, we use a genetic algorithm (GA) for combined speaker and speech recognition, relying on optimized Mel Frequency Cepstral Coefficient (MFCC) speech features, and classification is performed using a Deep Neural Network (DNN). In the first phase, feature extraction using MFCC is executed. Then, feature optimization is performed using GA. In the second phase training is conducted using DNN. Evaluation and validation of the proposed work model is done by setting a real environment, and efficiency is calculated on the basis of such parameters as accuracy, precision rate, recall rate, sensitivity, and specificity. Also, this paper presents an evaluation of such feature extraction methods as linear predictive coding coefficient (LPCC), perceptual linear prediction (PLP), mel frequency cepstral coefficients (MFCC) and relative spectra filtering (RASTA), with all of them used for combined speaker and speech recognition systems. A comparison of different methods based on existing techniques for both clean and noisy environments is made as well.
APA, Harvard, Vancouver, ISO, and other styles
21

Sengupta, Ranit, and Sazzad M. Nasir. "The predictive roles of neural oscillations in speech motor adaptability." Journal of Neurophysiology 115, no. 5 (May 1, 2016): 2519–28. http://dx.doi.org/10.1152/jn.00043.2016.

Full text
Abstract:
The human speech system exhibits a remarkable flexibility by adapting to alterations in speaking environments. While it is believed that speech motor adaptation under altered sensory feedback involves rapid reorganization of speech motor networks, the mechanisms by which different brain regions communicate and coordinate their activity to mediate adaptation remain unknown, and explanations of outcome differences in adaption remain largely elusive. In this study, under the paradigm of altered auditory feedback with continuous EEG recordings, the differential roles of oscillatory neural processes in motor speech adaptability were investigated. The predictive capacities of different EEG frequency bands were assessed, and it was found that theta-, beta-, and gamma-band activities during speech planning and production contained significant and reliable information about motor speech adaptability. It was further observed that these bands do not work independently but interact with each other suggesting an underlying brain network operating across hierarchically organized frequency bands to support motor speech adaptation. These results provide novel insights into both learning and disorders of speech using time frequency analysis of neural oscillations.
APA, Harvard, Vancouver, ISO, and other styles
22

King, John E., Marek Polak, Annelle V. Hodges, Stacy Payne, and Fred F. Telischi. "Use of Neural Response Telemetry Measures to Objectively Set the Comfort Levels in the Nucleus 24 Cochlear Implant." Journal of the American Academy of Audiology 17, no. 06 (June 2006): 413–31. http://dx.doi.org/10.3766/jaaa.17.6.4.

Full text
Abstract:
Cochlear implant programming necessitates accurate setting of programming levels, including maximum stimulation levels, of all active electrodes. Frequently, clinical techniques are adequate for setting these levels; however, they are sometimes insufficient (e.g., very young children). In the Nucleus 24, several methods have been suggested for estimation of comfort levels (C levels) from neural response telemetry (NRT); however, many require co-application of clinical measurements. Data was obtained from 21 adult Nucleus 24 recipients to develop reliable predictions of C levels. Multiple regression analysis was performed on NRT threshold, slope of the NRT growth function, age, length of deafness, length of cochlear implant use and electrode impedance to examine predictive ability. Only the NRT threshold and slope of the growth function measures were significant predictors yielding R2 values from 0.391 to 0.769. Results demonstrated that these measures may provide an alternative means of estimating C levels when other clinical measures are unavailable.
APA, Harvard, Vancouver, ISO, and other styles
23

Taillez, Tobias de, Florian Denk, Bojana Mirkovic, Birger Kollmeier, and Bernd T. Meyer. "Modeling Nonlinear Transfer Functions from Speech Envelopes to Encephalography with Neural Networks." International Journal of Psychological Studies 11, no. 4 (August 13, 2019): 1. http://dx.doi.org/10.5539/ijps.v11n4p1.

Full text
Abstract:
Diferent linear models have been proposed to establish a link between an auditory stimulus and the neurophysiological response obtained through electroencephalography (EEG). We investigate if non-linear mappings can be modeled with deep neural networks trained on continuous speech envelopes and EEG data obtained in an auditory attention two-speaker scenario. An artificial neural network was trained to predict the EEG response related to the attended and unattended speech envelopes. After training, the properties of the DNN-based model are analyzed by measuring the transfer function between input envelopes and predicted EEG signals by using click-like stimuli and frequency sweeps as input patterns. Using sweep responses allows to separate the linear and nonlinear response components also with respect to attention. The responses from the model trained on normal speech resemble event-related potentials despite the fact that the DNN was not trained to reproduce such patterns. These responses are modulated by attention, since we obtain significantly lower amplitudes at latencies of 110 ms, 170 ms and 300 ms after stimulus presentation for unattended processing in contrast to the attended. The comparison of linear and nonlinear components indicates that the largest contribution arises from linear processing (75%), while the remaining 25% are attributed to nonlinear processes in the model. Further, a spectral analysis showed a stronger 5 Hz component in modeled EEG for attended in contrast to unattended predictions. The results indicate that the artificial neural network produces responses consistent with recent findings and presents a new approach for quantifying the model properties.
APA, Harvard, Vancouver, ISO, and other styles
24

BinKhamis, Ghada, Antonio Elia Forte, Tobias Reichenbach, Martin O’Driscoll, and Karolina Kluk. "Speech Auditory Brainstem Responses in Adult Hearing Aid Users: Effects of Aiding and Background Noise, and Prediction of Behavioral Measures." Trends in Hearing 23 (January 2019): 233121651984829. http://dx.doi.org/10.1177/2331216519848297.

Full text
Abstract:
Evaluation of patients who are unable to provide behavioral responses on standard clinical measures is challenging due to the lack of standard objective (non-behavioral) clinical audiological measures that assess the outcome of an intervention (e.g., hearing aids). Brainstem responses to short consonant-vowel stimuli (speech-auditory brainstem responses [speech-ABRs]) have been proposed as a measure of subcortical encoding of speech, speech detection, and speech-in-noise performance in individuals with normal hearing. Here, we investigated the potential application of speech-ABRs as an objective clinical outcome measure of speech detection, speech-in-noise detection and recognition, and self-reported speech understanding in 98 adults with sensorineural hearing loss. We compared aided and unaided speech-ABRs, and speech-ABRs in quiet and in noise. In addition, we evaluated whether speech-ABR F0 encoding (obtained from the complex cross-correlation with the 40 ms [da] fundamental waveform) predicted aided behavioral speech recognition in noise or aided self-reported speech understanding. Results showed that (a) aided speech-ABRs had earlier peak latencies, larger peak amplitudes, and larger F0 encoding amplitudes compared to unaided speech-ABRs; (b) the addition of background noise resulted in later F0 encoding latencies but did not have an effect on peak latencies and amplitudes or on F0 encoding amplitudes; and (c) speech-ABRs were not a significant predictor of any of the behavioral or self-report measures. These results show that speech-ABR F0 encoding is not a good predictor of speech-in-noise recognition or self-reported speech understanding with hearing aids. However, our results suggest that speech-ABRs may have potential for clinical application as an objective measure of speech detection with hearing aids.
APA, Harvard, Vancouver, ISO, and other styles
25

de Prada Pérez, Ana. "Theoretical implications of research on bilingual subject production: The Vulnerability Hypothesis." International Journal of Bilingualism 23, no. 2 (March 29, 2018): 670–94. http://dx.doi.org/10.1177/1367006918763141.

Full text
Abstract:
In this paper we propose a new hypothesis for the formal analysis of cross-linguistic influence, the Vulnerability Hypothesis (VH), with the support of data from subject personal pronoun use in Spanish and Catalan in Minorca, and contrast it to the Interface Hypothesis (IH). The VH establishes a categorical–variable continuum of permeability, that is, structures that show variable distributions are permeable while those that exhibit categorical distributions are not. To test the predictions of the VH, Spanish language samples were collected from 12 monolingual Spanish speakers, 11 Spanish-dominant bilinguals, and 12 Catalan-dominant bilinguals, and Catalan language samples from 12 Catalan-dominant speakers. Following a variationist comparative analysis, 4,466 first person singular (1sg) and 1,291 third person singular (3sg) tokens were coded for speech connectivity, verb form ambiguity, and semantic verb type. The language-external variable included in the analysis was language group (Spanish monolinguals, Spanish-dominant bilinguals, Catalan-dominant bilinguals, and Catalan controls). Results indicated that speech connectivity is the highest ranked variable in the Spanish control group (most categorical variable), while ambiguity and verb type are ranked lower, with only ambiguity reaching significance. The VH would, therefore, predict bilinguals would be similar to monolinguals in the most categorical variables, in this case, speech connectivity. This is in contrast to the IH, which would predict bilinguals would exhibit difficulty with the pragmatically driven distributions (e.g. speech connectivity), while they would show no contact effects or lesser effects with distributions at the lexico-semantic interface with syntax (e.g. verb form ambiguity and verb type). The prediction of the VH bears out in our data. Bilinguals do not differ with respect to speech connectivity. Ambiguity, on the other hand, is no longer significant in the bilingual groups and verb type reaches significance with 1sg (and not with 3sg) subjects. These results are discussed, redefining the concepts of convergence and simplification from language contact research to adapt to the variationist analysis used. Simplification is specified as the reduction of lower ranked predicting variables, while convergence is defined as an increase in parallels across languages with respect to the variables that are significant, their effect size (variable ranking), and the direction of effects (constraint ranking). Regarding language group, it was not returned as significant in 1sg data. Thus, the groups did not differ in their rates of overt pronominal expression. Differences, however, emerged across groups in the 3sg data, where bilinguals used significantly more overt pronominal subjects than monolinguals do. This paper contributes to current discussions in the fields of language contact, second language acquisition, and bilingualism, introducing a new hypothesis and contrasting it with the IH. In addition, it contributes to variationist approaches by examining a novel community of bilingual speakers.
APA, Harvard, Vancouver, ISO, and other styles
26

Falk, Simone, Cosima Lanzilotti, and Daniele Schön. "Tuning Neural Phase Entrainment to Speech." Journal of Cognitive Neuroscience 29, no. 8 (August 2017): 1378–89. http://dx.doi.org/10.1162/jocn_a_01136.

Full text
Abstract:
Musical rhythm positively impacts subsequent speech processing. However, the neural mechanisms underlying this phenomenon are so far unclear. We investigated whether carryover effects from a preceding musical cue to a speech stimulus result from a continuation of neural phase entrainment to periodicities that are present in both music and speech. Participants listened to and memorized French metrical sentences that contained (quasi-)periodic recurrences of accents and syllables. Speech stimuli were preceded by a rhythmically regular or irregular musical cue. Our results show that, compared with the irregular condition, the presence of a regular cue modulates the neural response during speech processing, as estimated by EEG power spectral density, intertrial coherence, and source analyses at critical frequencies. Importantly, intertrial coherences for regular cues were indicative of the participants' success in memorizing the subsequent speech stimuli. These findings underscore the highly adaptive nature of neural phase entrainment across fundamentally different auditory stimuli. They also support current models of neural phase entrainment as a tool of predictive timing and attentional selection across cognitive domains.
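To make the intertrial coherence measure mentioned above concrete, the sketch below computes a phase-locking (ITC) time course from single-channel EEG epochs with numpy/scipy; the frequency band, filter order, sampling rate, and synthetic epochs are illustrative assumptions rather than the study's settings.

import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def intertrial_coherence(epochs, fs, f_lo, f_hi):
    """Intertrial phase coherence (ITC) for single-channel EEG epochs.

    epochs : array of shape (n_trials, n_samples)
    fs     : sampling rate in Hz
    f_lo, f_hi : band edges in Hz (e.g., a syllable-rate band)
    Returns an ITC time course in [0, 1]; 1 = identical phase on every trial.
    """
    b, a = butter(4, [f_lo, f_hi], btype="bandpass", fs=fs)
    filtered = filtfilt(b, a, epochs, axis=1)           # zero-phase band-pass
    phase = np.angle(hilbert(filtered, axis=1))         # instantaneous phase
    return np.abs(np.mean(np.exp(1j * phase), axis=0))  # length of mean phase vector

# Illustrative call: 40 trials of 2 s EEG sampled at 250 Hz, 4-5 Hz band.
rng = np.random.default_rng(0)
epochs = rng.standard_normal((40, 500))
itc = intertrial_coherence(epochs, fs=250, f_lo=4.0, f_hi=5.0)
print(itc.shape, itc.max())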
APA, Harvard, Vancouver, ISO, and other styles
27

Lu, Yuanxun, Jinxiang Chai, and Xun Cao. "Live speech portraits." ACM Transactions on Graphics 40, no. 6 (December 2021): 1–17. http://dx.doi.org/10.1145/3478513.3480484.

Full text
Abstract:
To the best of our knowledge, we present the first live system that generates personalized photorealistic talking-head animation driven only by audio signals, at over 30 fps. Our system contains three stages. The first stage is a deep neural network that extracts deep audio features along with a manifold projection to project the features to the target person's speech space. In the second stage, we learn facial dynamics and motions from the projected audio features. The predicted motions include head poses and upper body motions, where the former are generated by an autoregressive probabilistic model that models the head pose distribution of the target person. Upper body motions are deduced from head poses. In the final stage, we generate conditional feature maps from previous predictions and send them, together with a candidate image set, to an image-to-image translation network to synthesize photorealistic renderings. Our method generalizes well to in-the-wild audio and successfully synthesizes high-fidelity personalized facial details, e.g., wrinkles and teeth. Our method also allows explicit control of head poses. Extensive qualitative and quantitative evaluations, along with user studies, demonstrate the superiority of our method over state-of-the-art techniques.
APA, Harvard, Vancouver, ISO, and other styles
28

PETERS, RYAN E., THERES GRÜTER, and ARIELLE BOROVSKY. "Vocabulary size and native speaker self-identification influence flexibility in linguistic prediction among adult bilinguals." Applied Psycholinguistics 39, no. 6 (October 8, 2018): 1439–69. http://dx.doi.org/10.1017/s0142716418000383.

Full text
Abstract:
When language users predict upcoming speech, they generate pluralistic expectations, weighted by likelihood (Kuperberg & Jaeger, 2016). Many variables influence the prediction of highly likely sentential outcomes, but less is known regarding variables affecting the prediction of less-likely outcomes. Here we explore how English vocabulary size and self-identification as a native speaker (NS) of English modulate adult bi-/multilinguals’ preactivation of less-likely sentential outcomes in two visual-world experiments. Participants heard transitive sentences containing an agent, action, and theme (The pirate chases the ship) while viewing four referents varying in expectancy relative to the agent and action. In Experiment 1 (N=70), spoken themes referred to highly expected items (e.g., ship). Results indicate that lower-skill (smaller vocabulary size) and less confident (not identifying as NS) bi-/multilinguals activate less-likely action-related referents more than their higher skill/confidence peers. In Experiment 2 (N=65), themes were one of two less-likely items (The pirate chases the bone/cat). Results approaching significance indicate an opposite effect of similar size: higher skill/confidence listeners activate less-likely action-related (e.g., bone) referents slightly more than lower skill/confidence listeners. Results across experiments suggest that higher skill/confidence participants more flexibly modulate their linguistic predictions per the demands of the task, with similar but not identical patterns emerging when bi-/multilinguals are grouped by self-ascribed NS status versus vocabulary size.
APA, Harvard, Vancouver, ISO, and other styles
29

Mckee, J. J., N. E. Evans, and F. J. Owens. "Digital transmission of 12-lead electrocardiograms and duplex speech in the telephone bandwidth." Journal of Telemedicine and Telecare 2, no. 1 (March 1, 1996): 42–49. http://dx.doi.org/10.1258/1357633961929150.

Full text
Abstract:
A system was developed which allowed the simultaneous communication of digitized, full-duplex speech and electrocardiography (ECG) signals in real time. A single band-limited channel was used over a standard telephone line providing 3 kHz bandwidth. The full-duplex speech was compressed to 2400 bit/s using linear predictive coding, while multiple ECG signals were processed by a novel ECG compression technique. Extensive use of digital signal processing reduced the combined bit rate to less than 9600 bit/s, allowing the use of low-cost commercial modems. The ECG communications system was tested in clinical trials. It enabled a hospital-based clinician to provide diagnostic and treatment advice to a remote location over the standard telephone network.
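As a back-of-the-envelope illustration of the bit-rate budget reported above, the snippet below allocates the 9600 bit/s modem link between the 2400 bit/s LPC speech stream and the compressed ECG channels; the framing overhead and per-lead split are assumptions for illustration, not figures from the paper.

# Illustrative bit-rate budget for combined duplex speech + multi-lead ECG
# over a 9600 bit/s telephone modem. Only the speech and modem rates come
# from the abstract; the overhead fraction and lead split are assumptions.
MODEM_RATE = 9600          # bit/s, low-cost commercial modem
SPEECH_RATE = 2400         # bit/s, LPC-coded duplex speech
FRAMING_OVERHEAD = 0.05    # assumed fraction of the link used for framing/sync

ecg_budget = MODEM_RATE * (1 - FRAMING_OVERHEAD) - SPEECH_RATE
n_independent_leads = 8    # a 12-lead ECG carries 8 independent signals
per_lead = ecg_budget / n_independent_leads
print(f"ECG budget: {ecg_budget:.0f} bit/s total, ~{per_lead:.0f} bit/s per lead")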
APA, Harvard, Vancouver, ISO, and other styles
30

Singer, Cara M., Sango Otieno, Soo-Eun Chang, and Robin M. Jones. "Predicting Persistent Developmental Stuttering Using a Cumulative Risk Approach." Journal of Speech, Language, and Hearing Research 65, no. 1 (January 12, 2022): 70–95. http://dx.doi.org/10.1044/2021_jslhr-21-00162.

Full text
Abstract:
Purpose: The purpose of this study was to explore how well a cumulative risk approach, based on empirically supported predictive factors, predicts whether a young child who stutters is likely to develop persistent developmental stuttering. In a cumulative risk approach, the number of predictive factors indicating a child is at risk to develop persistent stuttering is evaluated, and a greater number of indicators of risk are hypothesized to confer greater risk of persistent stuttering. Method: We combined extant data on 3- to 5-year-old children who stutter from two longitudinal studies to identify cutoff values for continuous predictive factors (e.g., speech and language skills, age at onset, time since onset, stuttering frequency) and, in combination with binary predictors (e.g., sex, family history of stuttering), used all-subsets regression and receiver operating characteristic curves to compare the predictive validity of different combinations of 10 risk factors. The optimal combination of predictive factors and the odds of a child developing persistent stuttering based on an increasing number of factors were calculated. Results: Based on 67 children who stutter (i.e., 44 persisting and 23 recovered) with relatively strong speech-language skills, the predictive factor model that yielded the best predictive validity was based on time since onset (≥ 19 months), speech sound skills (≤ 115 standard score), expressive language skills (≤ 106 standard score), and stuttering severity (≥ 17 Stuttering Severity Instrument total score). When the presence of at least two predictive factors was used to confer elevated risk to develop persistent stuttering, the model yielded 93% sensitivity and 65% specificity. As a child presented with a greater number of these four risk factors, the odds for persistent stuttering increased. Conclusions: Findings support the use of a cumulative risk approach and the predictive utility of assessing multiple domains when evaluating a child's risk of developing persistent stuttering. Clinical implications and future directions are discussed.
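A minimal sketch of the cumulative risk scoring idea, using the four cutoffs reported above; the function, dictionary keys, and example child are illustrative scaffolding, not the authors' implementation.

# Cumulative risk scoring for persistent stuttering, using the four
# predictive factors and cutoffs reported in the abstract. The >= 2 rule is
# the threshold reported as yielding 93% sensitivity and 65% specificity.

def count_risk_factors(child):
    """child: dict with the four measures; returns the number of risk indicators."""
    indicators = [
        child["months_since_onset"] >= 19,       # time since onset
        child["speech_sound_ss"] <= 115,         # speech sound standard score
        child["expressive_language_ss"] <= 106,  # expressive language standard score
        child["ssi_total"] >= 17,                # Stuttering Severity Instrument total
    ]
    return sum(indicators)

# Hypothetical example child (values are illustrative only).
child = {"months_since_onset": 22, "speech_sound_ss": 110,
         "expressive_language_ss": 104, "ssi_total": 15}
n = count_risk_factors(child)
print(f"{n} risk indicators -> elevated risk: {n >= 2}")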
APA, Harvard, Vancouver, ISO, and other styles
31

Kacur, Juraj, Boris Puterka, Jarmila Pavlovicova, and Milos Oravec. "On the Speech Properties and Feature Extraction Methods in Speech Emotion Recognition." Sensors 21, no. 5 (March 8, 2021): 1888. http://dx.doi.org/10.3390/s21051888.

Full text
Abstract:
Many speech emotion recognition systems have been designed using different features and classification methods. Still, there is a lack of knowledge and reasoning regarding the underlying speech characteristics and processing, i.e., how basic characteristics, methods, and settings affect the accuracy, and to what extent. This study extends the physical perspective on speech emotion recognition by analyzing basic speech characteristics and modeling methods, e.g., time characteristics (segmentation, window types, and classification regions—lengths and overlaps), frequency ranges, frequency scales, processing of whole speech (spectrograms), vocal tract (filter banks, linear prediction coefficient (LPC) modeling) and excitation (inverse LPC filtering) signals, magnitude and phase manipulations, cepstral features, etc. In the evaluation phase, a state-of-the-art classification method and rigorous statistical tests were applied, namely N-fold cross-validation, paired t-tests, and rank and Pearson correlations. The results revealed several settings in a 75% accuracy range (seven emotions). The most successful methods were based on vocal tract features using psychoacoustic filter banks covering the 0–8 kHz frequency range. Spectrograms carrying vocal tract and excitation information also scored well. It was found that even basic processing such as pre-emphasis, segmentation, and magnitude modification can dramatically affect the results. Most findings are robust, exhibiting strong correlations across the tested databases.
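As a concrete illustration of the vocal-tract (LPC) modeling mentioned above, the sketch below estimates linear prediction coefficients for a single windowed frame with the autocorrelation method; the frame length, pre-emphasis coefficient, LPC order, and synthetic signal are common defaults chosen for illustration, not settings from the study.

import numpy as np
from scipy.linalg import toeplitz

def lpc_coefficients(frame, order=12):
    """LPC via the autocorrelation method for a single speech frame."""
    frame = frame * np.hamming(len(frame))                # taper the frame
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    R = toeplitz(r[:order])                               # autocorrelation matrix
    a = np.linalg.solve(R, r[1:order + 1])                # predictor coefficients
    return np.concatenate(([1.0], -a))                    # A(z) = 1 - sum(a_k z^-k)

# Illustrative call: 25 ms frame of synthetic "speech" at 16 kHz with pre-emphasis.
fs = 16000
rng = np.random.default_rng(0)
t = np.arange(int(0.025 * fs)) / fs
signal = (np.sin(2 * np.pi * 120 * t) + 0.3 * np.sin(2 * np.pi * 720 * t)
          + 0.01 * rng.standard_normal(t.size))
pre = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])   # pre-emphasis
print(lpc_coefficients(pre, order=12))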
APA, Harvard, Vancouver, ISO, and other styles
32

Du, Yi, and Robert J. Zatorre. "Musical training sharpens and bonds ears and tongue to hear speech better." Proceedings of the National Academy of Sciences 114, no. 51 (December 4, 2017): 13579–84. http://dx.doi.org/10.1073/pnas.1712223114.

Full text
Abstract:
The idea that musical training improves speech perception in challenging listening environments is appealing and of clinical importance, yet the mechanisms of any such musician advantage are not well specified. Here, using functional magnetic resonance imaging (fMRI), we found that musicians outperformed nonmusicians in identifying syllables at varying signal-to-noise ratios (SNRs), which was associated with stronger activation of the left inferior frontal and right auditory regions in musicians compared with nonmusicians. Moreover, musicians showed greater specificity of phoneme representations in bilateral auditory and speech motor regions (e.g., premotor cortex) at higher SNRs and in the left speech motor regions at lower SNRs, as determined by multivoxel pattern analysis. Musical training also enhanced the intrahemispheric and interhemispheric functional connectivity between auditory and speech motor regions. Our findings suggest that improved speech in noise perception in musicians relies on stronger recruitment of, finer phonological representations in, and stronger functional connectivity between auditory and frontal speech motor cortices in both hemispheres, regions involved in bottom-up spectrotemporal analyses and top-down articulatory prediction and sensorimotor integration, respectively.
APA, Harvard, Vancouver, ISO, and other styles
33

Partheeban, Pachaivannan, Krishnamurthy Karthik, Partheeban Navin Elamparithi, Krishnan Somasundaram, and Baskaran Anuradha. "Urban road traffic noise on human exposure assessment using geospatial technology." Environmental Engineering Research 27, no. 5 (September 30, 2021): 210249–0. http://dx.doi.org/10.4491/eer.2021.249.

Full text
Abstract:
Sounds produced by humans, industry, transport, and animals that pose a threat to human or animal health can be characterized as noise pollution. Adverse effects of noise exposure include interference with speech communication and declines in children's learning skills. Highway traffic noise contributes 80% of all noise. It has grown to a massive scale because population growth along roads has led to rapid changes in land use, and it has become a common reality in various Indian cities. The main objective of this work is to develop a road traffic noise prediction model using ArcGIS 10.3 for the busy corridors of Chennai. The collected data include traffic volume, speed, and noise levels in the lateral and vertical directions. Noise levels were measured at 9 locations using a noise level meter. It is observed that the noise levels vary from 50 dB to 96 dB. It is found that the noise problem is severe in 18% of the area, and 6.3% of people are exposed to the traffic noise problem. The results obtained in this study show that the city is affected by severe noise pollution due to road traffic.
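One elementary step underlying any traffic-noise assessment of this kind is the logarithmic combination and energy-averaging of sound pressure levels; the sketch below shows those standard formulas with made-up levels, not the study's measurements or its ArcGIS model.

import numpy as np

def combine_levels(levels_db):
    """Total level of simultaneous incoherent sources (energy summation)."""
    return 10 * np.log10(np.sum(10 ** (np.asarray(levels_db) / 10)))

def equivalent_level(levels_db):
    """Energy-averaged (Leq-style) level of a series of measurements."""
    return 10 * np.log10(np.mean(10 ** (np.asarray(levels_db) / 10)))

# Made-up example: two 70 dB sources combine to about 73 dB, not 140 dB.
print(combine_levels([70, 70]))          # ~73.0
print(equivalent_level([50, 96, 65]))    # dominated by the loudest sample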
APA, Harvard, Vancouver, ISO, and other styles
34

Phan, Tran-Dac-Thinh, Soo-Hyung Kim, Hyung-Jeong Yang, and Guee-Sang Lee. "EEG-Based Emotion Recognition by Convolutional Neural Network with Multi-Scale Kernels." Sensors 21, no. 15 (July 27, 2021): 5092. http://dx.doi.org/10.3390/s21155092.

Full text
Abstract:
Besides facial- or gesture-based emotion recognition, electroencephalogram (EEG) data have been drawing attention thanks to their ability to counter deceptive external expressions of humans, such as facial expressions or speech. Emotion recognition based on EEG signals relies heavily on the features and their delineation, which requires selecting feature categories derived from the raw signals and representations that can capture the intrinsic properties of an individual signal or a group of signals. Moreover, the correlations and interactions among channels and frequency bands also contain crucial information for emotional state prediction, yet they are commonly disregarded in conventional approaches. Therefore, in our method, the correlations between 32 channels and frequency bands were used to enhance emotion prediction performance. The extracted time-domain features were arranged into feature-homogeneous matrices, with their positions following the corresponding electrodes placed on the scalp. Based on this 3D representation of EEG signals, the model must be able to learn the local and global patterns that describe the short- and long-range relations among EEG channels, along with the embedded features. To address this problem, we propose a 2D CNN with convolutional layers of different kernel sizes assembled into a convolution block, combining features distributed over small and large regions. Ten-fold cross-validation was conducted on the DEAP dataset to demonstrate the effectiveness of our approach. We achieved average accuracies of 98.27% and 98.36% for arousal and valence binary classification, respectively.
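A minimal PyTorch sketch of the multi-scale idea described above: parallel 2D convolutions with different kernel sizes over an electrode-grid feature map, concatenated into a single block. The input size, channel counts, kernel sizes, and 9 x 9 grid are illustrative assumptions, not the authors' exact architecture.

import torch
import torch.nn as nn

class MultiScaleConvBlock(nn.Module):
    """Parallel 2D convolutions with different kernel sizes, concatenated."""
    def __init__(self, in_ch=1, out_ch=16, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, k, padding=k // 2),  # keeps spatial size
                nn.BatchNorm2d(out_ch),
                nn.ReLU(),
            )
            for k in kernel_sizes
        ])

    def forward(self, x):
        # Concatenate small- and large-receptive-field features along channels.
        return torch.cat([branch(x) for branch in self.branches], dim=1)

# Illustrative input: batch of 8 "EEG images" (1 channel, assumed 9 x 9 electrode
# grid filled with time-domain features).
x = torch.randn(8, 1, 9, 9)
block = MultiScaleConvBlock()
print(block(x).shape)   # torch.Size([8, 48, 9, 9])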
APA, Harvard, Vancouver, ISO, and other styles
35

Paulraj, M. P., Kamalraj Subramaniam, Sazali Bin Yaccob, Abdul H. Bin Adom, and C. R. Hema. "Auditory Evoked Potential Response and Hearing Loss: A Review." Open Biomedical Engineering Journal 9, no. 1 (February 27, 2015): 17–24. http://dx.doi.org/10.2174/1874120701509010017.

Full text
Abstract:
Hypoacusis is the most prevalent sensory disability in the world and, consequently, it can impede speech in human beings. One of the best approaches to tackling this issue is to conduct early and effective hearing screening tests using electroencephalography (EEG). EEG-based hearing threshold determination is most suitable for persons who lack verbal communication and behavioral responses to sound stimulation. The auditory evoked potential (AEP) is a type of EEG signal evoked at the scalp by an acoustic stimulus. The goal of this review is to assess the current state of knowledge in estimating hearing threshold levels based on the AEP response. The AEP response reflects the auditory ability of an individual. An intelligent hearing perception level system makes it possible to examine and determine the functional integrity of the auditory system. Systematic evaluation of EEG-based hearing perception level systems for predicting hearing loss in newborns, infants, and individuals with multiple handicaps will be a priority for future research.
APA, Harvard, Vancouver, ISO, and other styles
36

CRISTIA, ALEJANDRINA, and AMANDA SEIDL. "The hyperarticulation hypothesis of infant-directed speech." Journal of Child Language 41, no. 4 (February 13, 2013): 913–34. http://dx.doi.org/10.1017/s0305000912000669.

Full text
Abstract:
Typically, the point vowels [i,ɑ,u] are acoustically more peripheral in infant-directed speech (IDS) compared to adult-directed speech (ADS). If caregivers seek to highlight lexically relevant contrasts in IDS, then two sounds that are contrastive should become more distinct, whereas two sounds that are surface realizations of the same underlying sound category should not. To test this prediction, vowels that are phonemically contrastive ([i–ɪ] and [eɪ–ε]), vowels that map onto the same underlying category ([æ–] and [ε–]), and the point vowels [i,ɑ,u] were elicited in IDS and ADS by American English mothers of two age groups of infants (four- and eleven-month-olds). As in other work, point vowels were produced in more peripheral positions in IDS compared to ADS. However, there was little evidence of hyperarticulation per se (e.g. [i–ɪ] was hypoarticulated). We suggest that across-the-board lexically based hyperarticulation is not a necessary feature of IDS.
APA, Harvard, Vancouver, ISO, and other styles
37

Whitlock, James A. T., and George Dodd. "Speech Intelligibility in Classrooms: Specific Acoustical Needs for Primary School Children." Building Acoustics 15, no. 1 (January 2008): 35–47. http://dx.doi.org/10.1260/135101008784050223.

Full text
Abstract:
Classrooms for primary school children should be built to criteria based on children's speech intelligibility needs, which in some respects – e.g. reverberation time – differ markedly from the traditional criteria for adults. To further identify why the needs of children and adults for speech perception are so different, we have measured the ‘integration time’ of speech for adults and children using a novel technique to obviate the complicating effects of differing language. The results for children are significantly different from those for adults (35 ms cf. 50 ms), and recommendations for classroom design based on the children's requirements have been made. When groups of children engage in ‘co-operative learning’ activities in the classroom, the “café effect” produces a rising activity noise level. We suggest the Lombard effect is responsible for this. Measurements show children are more susceptible to the effect, and we have developed a prediction model for activity noise in a classroom.
APA, Harvard, Vancouver, ISO, and other styles
38

Grant, Ken W., and Therese C. Walden. "Understanding Excessive SNR Loss in Hearing-Impaired Listeners." Journal of the American Academy of Audiology 24, no. 04 (April 2013): 258–73. http://dx.doi.org/10.3766/jaaa.24.4.3.

Full text
Abstract:
Background: Traditional audiometric measures, such as pure-tone thresholds or unaided word-recognition in quiet, appear to be of marginal use in predicting speech understanding by hearing-impaired (HI) individuals in background noise with or without amplification. Suprathreshold measures of auditory function (tolerance of noise, temporal and frequency resolution) appear to contribute more to success with amplification and may describe more effectively the distortion component of hearing. However, these measures are not typically measured clinically. When combined with measures of audibility, suprathreshold measures of auditory distortion may provide a much more complete understanding of speech deficits in noise by HI individuals. Purpose: The primary goal of this study was to investigate the relationship among measures of speech recognition in noise, frequency selectivity, temporal acuity, modulation masking release, and informational masking in adult and elderly patients with sensorineural hearing loss to determine whether peripheral distortion for suprathreshold sounds contributes to the varied outcomes experienced by patients with sensorineural hearing loss listening to speech in noise. Research Design: A correlational study. Study Sample: Twenty-seven patients with sensorineural hearing loss and four adults with normal hearing were enrolled in the study. Data Collection and Analysis: The data were collected in a sound attenuated test booth. For speech testing, subjects' verbal responses were scored by the experimenter and entered into a custom computer program. For frequency selectivity and temporal acuity measures, subject responses were recorded via a touch screen. Simple correlation, step-wise multiple linear regression analyses and a repeated analysis of variance were performed. Results: Results showed that the signal-to-noise ratio (SNR) loss could only be partially predicted by a listener's thresholds or audibility measures such as the Speech Intelligibility Index (SII). Correlations between SII and SNR loss were higher using the Hearing-in-Noise Test (HINT) than the Quick Speech-in-Noise test (QSIN) with the SII accounting for 71% of the variance in SNR loss for the HINT but only 49% for the QSIN. However, listener age and the addition of suprathreshold measures improved the prediction of SNR loss using the QSIN, accounting for nearly 71% of the variance. Conclusions: Two standard clinical speech-in-noise tests, QSIN and HINT, were used in this study to obtain a measure of SNR loss. When administered clinically, the QSIN appears to be less redundant with hearing thresholds than the HINT and is a better indicator of a patient's suprathreshold deficit and its impact on understanding speech in noise. Additional factors related to aging, spectral resolution, and, to a lesser extent, temporal resolution improved the ability to predict SNR loss measured with the QSIN. For the HINT, a listener's audibility and age were the only two significant factors. For both QSIN and HINT, roughly 25–30% of the variance in individual differences in SNR loss (i.e., the dB difference in SNR between an individual HI listener and a control group of NH listeners at a specified performance level, usually 50% word or sentence recognition) remained unexplained, suggesting the need for additional measures of suprathreshold acuity (e.g., sensitivity to temporal fine structure) or cognitive function (e.g., memory and attention) to further improve the ability to understand individual variability in SNR loss.
APA, Harvard, Vancouver, ISO, and other styles
39

Parris, Benjamin A., Dinkar Sharma, Brendan S. Hackett Weekes, Mohammad Momenian, Maria Augustinova, and Ludovic Ferrand. "Response Modality and the Stroop Task." Experimental Psychology 66, no. 5 (September 2019): 361–67. http://dx.doi.org/10.1027/1618-3169/a000459.

Full text
Abstract:
A long-standing debate in the Stroop literature concerns whether the way we respond to the color dimension determines how we process the irrelevant dimension, or whether word processing is purely stimulus driven. Models and findings in the Stroop literature differ in their predictions about how response modes (e.g., responding manually vs. vocally) affect how the irrelevant word is processed (i.e., phonologically, semantically) and the interference and facilitation that results, with some predicting qualitatively different Stroop effects. Here, we investigated whether response mode modifies phonological facilitation produced by the irrelevant word. In a fully within-subject design, we sought evidence for the use of a serial print-to-speech prelexical phonological processing route when using manual and vocal responses by testing for facilitating effects of phonological overlap between the irrelevant word and the color name at the initial and final phoneme positions. The results showed that phoneme overlap leads to facilitation with both response modes, a result that is inconsistent with qualitative differences between the two response modes.
APA, Harvard, Vancouver, ISO, and other styles
40

Zhu-Zhou, Fangfang, Roberto Gil-Pita, Joaquín García-Gómez, and Manuel Rosa-Zurera. "Robust Multi-Scenario Speech-Based Emotion Recognition System." Sensors 22, no. 6 (March 18, 2022): 2343. http://dx.doi.org/10.3390/s22062343.

Full text
Abstract:
Every human being experiences emotions daily, e.g., joy, sadness, fear, and anger. These might be revealed through speech—words are often accompanied by our emotional states when we talk. Different acoustic emotional databases are freely available for solving the Emotional Speech Recognition (ESR) task. Unfortunately, many of them were generated under unrealistic conditions, i.e., the emotions were acted and recorded under fictitious circumstances in which noise is non-existent. Another weakness in the design of emotion recognition systems is the scarcity of patterns in the available databases, which causes generalization problems and leads to overfitting. This paper examines how different elements of the recording environment impact system performance, using a simple logistic regression algorithm. Specifically, we conducted experiments simulating different scenarios with different levels of Gaussian white noise, real-world noise, and reverberation. The results show a performance deterioration in all scenarios, with the error probability increasing from 25.57% to 79.13% in the worst case. Additionally, a virtual enlargement method and a robust multi-scenario speech-based emotion recognition system are proposed. Our system’s average error probability of 34.57% is comparable to the best-case scenario of 31.55%. The findings support the prediction that simulated emotional speech databases do not offer sufficient closeness to real scenarios.
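The scenario simulation described above reduces, at its core, to mixing speech with noise at a controlled signal-to-noise ratio; the sketch below shows that generic step in numpy, with a synthetic tone standing in for real emotional speech.

import numpy as np

def add_noise_at_snr(speech, noise, snr_db):
    """Scale `noise` so that mixing it with `speech` yields the requested SNR."""
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    target_noise_power = speech_power / (10 ** (snr_db / 10))
    return speech + noise * np.sqrt(target_noise_power / noise_power)

# Illustrative call: white Gaussian noise added to a synthetic signal at 10 dB SNR.
rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 200 * np.arange(16000) / 16000)
noisy = add_noise_at_snr(speech, rng.standard_normal(16000), snr_db=10)
measured = 10 * np.log10(np.mean(speech ** 2) / np.mean((noisy - speech) ** 2))
print(f"measured SNR ≈ {measured:.1f} dB")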
APA, Harvard, Vancouver, ISO, and other styles
41

Usler, Evan, Anna Bostian, Ranjini Mohan, Katelyn Gerwin, Barbara Brown, Christine Weber, Anne Smith, and Bridget Walsh. "What Are Predictors for Persistence in Childhood Stuttering?" Seminars in Speech and Language 39, no. 04 (August 24, 2018): 299–312. http://dx.doi.org/10.1055/s-0038-1667159.

Full text
Abstract:
Over the past 10 years, we (the Purdue Stuttering Project) have implemented longitudinal studies to examine factors related to persistence and recovery in early childhood stuttering. Stuttering develops essentially as an impairment in speech sensorimotor processes that is strongly influenced by dynamic interactions among motor, language, and emotional domains. Our work has assessed physiological, behavioral, and clinical features of stuttering within the motor, linguistic, and emotional domains. We describe the results of studies in which measures collected when the child was 4 to 5 years old are related to eventual stuttering status. We provide supplemental evidence of the role of known predictive factors (e.g., sex and family history of persistent stuttering). In addition, we present new evidence that early delays in basic speech motor processes (especially in boys), poor performance on a nonword repetition test, stuttering severity at the age of 4 to 5 years, and delayed or atypical functioning in central nervous system language processing networks are predictive of persistent stuttering.
APA, Harvard, Vancouver, ISO, and other styles
42

Bai, Fan, Antje S. Meyer, and Andrea E. Martin. "Neural dynamics differentially encode phrases and sentences during spoken language comprehension." PLOS Biology 20, no. 7 (July 14, 2022): e3001713. http://dx.doi.org/10.1371/journal.pbio.3001713.

Full text
Abstract:
Human language stands out in the natural world as a biological signal that uses a structured system to combine the meanings of small linguistic units (e.g., words) into larger constituents (e.g., phrases and sentences). However, the physical dynamics of speech (or sign) do not stand in a one-to-one relationship with the meanings listeners perceive. Instead, listeners infer meaning based on their knowledge of the language. The neural readouts of the perceptual and cognitive processes underlying these inferences are still poorly understood. In the present study, we used scalp electroencephalography (EEG) to compare the neural response to phrases (e.g., the red vase) and sentences (e.g., the vase is red), which were close in semantic meaning and had been synthesized to be physically indistinguishable. Differences in structure were well captured in the reorganization of neural phase responses in delta (approximately <2 Hz) and theta bands (approximately 2 to 7 Hz), and in power and power connectivity changes in the alpha band (approximately 7.5 to 13.5 Hz). Consistent with predictions from a computational model, sentences showed more power, more power connectivity, and more phase synchronization than phrases did. Theta–gamma phase–amplitude coupling occurred, but did not differ between the syntactic structures. Spectral–temporal response function (STRF) modeling revealed different encoding states for phrases and sentences, over and above the acoustically driven neural response. Our findings provide a comprehensive description of how the brain encodes and separates linguistic structures in the dynamics of neural responses. They imply that phase synchronization and strength of connectivity are readouts for the constituent structure of language. The results provide a novel basis for future neurophysiological research on linguistic structure representation in the brain, and, together with our simulations, support time-based binding as a mechanism of structure encoding in neural dynamics.
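The (S)TRF analysis mentioned above can be sketched as a lagged ridge regression from a stimulus feature (such as the acoustic envelope) onto an EEG channel; the numpy example below uses simulated data and a fixed regularization constant, and is not the authors' pipeline.

import numpy as np

def estimate_trf(stimulus, eeg, n_lags, ridge=1.0):
    """Temporal response function mapping a stimulus feature to one EEG channel.

    stimulus : 1-D array (e.g., speech envelope), same length as eeg
    eeg      : 1-D array, one EEG channel
    n_lags   : number of time lags (in samples) to include
    Returns the TRF weights, one per lag.
    """
    # Lagged design matrix: column k holds the stimulus delayed by k samples.
    X = np.column_stack([np.roll(stimulus, k) for k in range(n_lags)])
    X[:n_lags, :] = 0.0                       # discard wrapped-around samples
    # Ridge-regularized least squares: (X'X + lambda*I)^-1 X'y
    return np.linalg.solve(X.T @ X + ridge * np.eye(n_lags), X.T @ eeg)

# Illustrative data: EEG built from a known 30-lag filter of the envelope plus noise.
rng = np.random.default_rng(0)
env = rng.standard_normal(5000)
true_trf = np.exp(-np.arange(30) / 8.0)
eeg = np.convolve(env, true_trf, mode="full")[:5000] + 0.5 * rng.standard_normal(5000)
print(estimate_trf(env, eeg, n_lags=30)[:5].round(2))   # should approximate true_trf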
APA, Harvard, Vancouver, ISO, and other styles
43

Lindborg, Alma, and Tobias S. Andersen. "Bayesian binding and fusion models explain illusion and enhancement effects in audiovisual speech perception." PLOS ONE 16, no. 2 (February 19, 2021): e0246986. http://dx.doi.org/10.1371/journal.pone.0246986.

Full text
Abstract:
Speech is perceived with both the ears and the eyes. Adding congruent visual speech improves the perception of a faint auditory speech stimulus, whereas adding incongruent visual speech can alter the perception of the utterance. The latter phenomenon is exemplified by the McGurk illusion, where an auditory stimulus such as “ba” dubbed onto a visual stimulus such as “ga” produces the illusion of hearing “da”. Bayesian models of multisensory perception suggest that both the enhancement and the illusion case can be described as a two-step process of binding (informed by prior knowledge) and fusion (informed by the information reliability of each sensory cue). However, there is to date no study which has accounted for how they each contribute to audiovisual speech perception. In this study, we expose subjects to both congruent and incongruent audiovisual speech, manipulating the binding and the fusion stages simultaneously. This is done by varying both temporal offset (binding) and auditory and visual signal-to-noise ratio (fusion). We fit two Bayesian models to the behavioural data and show that they can both account for the enhancement effect in congruent audiovisual speech, as well as the McGurk illusion. This modelling approach allows us to disentangle the effects of binding and fusion on behavioural responses. Moreover, we find that these models have greater predictive power than a forced fusion model. This study provides a systematic and quantitative approach to measuring audiovisual integration in the perception of the McGurk illusion as well as congruent audiovisual speech, which we hope will inform future work on audiovisual speech perception.
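The fusion stage of such Bayesian models is often formalized as reliability-weighted (precision-weighted) averaging of the auditory and visual estimates; the sketch below shows that textbook step with made-up numbers, not the paper's full binding-plus-fusion model.

# Reliability-weighted fusion of an auditory and a visual cue (textbook
# forced-fusion step; values are made up for illustration). Each cue is a
# Gaussian estimate on some internal phonetic dimension.
def fuse(mu_a, var_a, mu_v, var_v):
    """Precision-weighted average of two Gaussian cues and its variance."""
    w_a = (1 / var_a) / (1 / var_a + 1 / var_v)   # auditory weight = relative reliability
    mu = w_a * mu_a + (1 - w_a) * mu_v
    var = 1 / (1 / var_a + 1 / var_v)             # fused estimate is more reliable
    return mu, var

# A noisy auditory cue (large var_a) pulls the percept toward the visual cue.
print(fuse(mu_a=0.0, var_a=4.0, mu_v=1.0, var_v=1.0))   # -> (0.8, 0.8)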
APA, Harvard, Vancouver, ISO, and other styles
44

Moon, Jerald B., Patricia Zebrowski, Donald A. Robin, and John W. Folkins. "Visuomotor Tracking Ability of Young Adult Speakers." Journal of Speech, Language, and Hearing Research 36, no. 4 (August 1993): 672–82. http://dx.doi.org/10.1044/jshr.3604.672.

Full text
Abstract:
This study was conducted to (a) study the ability of young adult subjects to track target signals with the lower lip, jaw, or larynx, (b) examine subjects’ abilities to track different sinusoidal frequencies and unpredictable target signals, and (c) test notions of response mode and predictive mode tracking reported for nonspeech structures by previous authors (e.g., Noble, Fitts, & Warren, 1955; Flowers, 1978). Twenty-five normal speakers tracked sinusoidal and unpredictable target signals using lower lip and jaw movement and fundamental frequency modulation. Tracking accuracy varied as a function of target frequency and articulator used to track. The results quantify the visuomotor tracking abilities of normal speakers using speech musculature and show the potential of visuomotor tracking tasks in the assessment of speech articulatory control.
APA, Harvard, Vancouver, ISO, and other styles
45

Knolle, Franziska, Erich Schröger, Pamela Baess, and Sonja A. Kotz. "The Cerebellum Generates Motor-to-Auditory Predictions: ERP Lesion Evidence." Journal of Cognitive Neuroscience 24, no. 3 (March 2012): 698–706. http://dx.doi.org/10.1162/jocn_a_00167.

Full text
Abstract:
Forward predictions are crucial in motor action (e.g., catching a ball, or being tickled) but may also apply to sensory or cognitive processes (e.g., listening to distorted speech or to a foreign accent). According to the “internal forward model,” the cerebellum generates predictions about somatosensory consequences of movements. These predictions simulate motor processes and prepare respective cortical areas for anticipated sensory input. Currently, there is very little evidence that a cerebellar forward model also applies to other sensory domains. In the current study, we address this question by examining the role of the cerebellum when auditory stimuli are anticipated as a consequence of a motor act. We applied an N100 suppression paradigm and compared the ERP in response to self-initiated with the ERP response to externally produced sounds. We hypothesized that sensory consequences of self-initiated sounds are precisely predicted and should lead to an N100 suppression compared with externally produced sounds. Moreover, if the cerebellum is involved in the generation of a motor-to-auditory forward model, patients with focal cerebellar lesions should not display an N100 suppression effect. Compared with healthy controls, patients showed a largely attenuated N100 suppression effect. The current results suggest that the cerebellum forms not only motor-to-somatosensory predictions but also motor-to-auditory predictions. This extends the cerebellar forward model to other sensory domains such as audition.
APA, Harvard, Vancouver, ISO, and other styles
46

Saito, Kazuya. "THE ROLE OF AGE OF ACQUISITION IN LATE SECOND LANGUAGE ORAL PROFICIENCY ATTAINMENT." Studies in Second Language Acquisition 37, no. 4 (June 23, 2015): 713–43. http://dx.doi.org/10.1017/s0272263115000248.

Full text
Abstract:
The current project examined whether and to what degree age of acquisition (AOA), defined as the first intensive exposure to a second language (L2) environment, can be predictive of the end state of postpubertal L2 oral proficiency attainment. Data were collected from 88 experienced Japanese learners of English and two groups of 20 baseline speakers (inexperienced Japanese speakers and native English speakers). The global quality of their spontaneous speech production was first judged by 10 native English-speaking raters based on accentedness (linguistic nativelikeness) and comprehensibility (ease of understanding) and was then submitted to segmental, prosodic, temporal, lexical, and grammatical analyses. According to the results, AOA was negatively correlated with the accentedness and comprehensibility components of L2 speech production, owing to relatively strong age effects on segmental and prosodic attainment. Yet significant age effects were not observed in the case of fluency and lexicogrammar attainment. The results suggest that AOA plays a key role in determining the extent to which learners can attain advanced-level L2 oral abilities via improving the phonological domain of language (e.g., correct consonant and vowel pronunciation and adequate and varied prosody) and that the temporal and lexicogrammatical domains of language (e.g., optimal speech rate and proper vocabulary and grammar usage) may be enhanced with increased L2 experience, regardless of age.
APA, Harvard, Vancouver, ISO, and other styles
47

Moodie, Sheila, Jonathan Pietrobon, Eileen Rall, George Lindley, Leisha Eiten, Dave Gordey, Lisa Davidson, et al. "Using the Real-Ear-to-Coupler Difference within the American Academy of Audiology Pediatric Amplification Guideline: Protocols for Applying and Predicting Earmold RECDs." Journal of the American Academy of Audiology 27, no. 03 (March 2016): 264–75. http://dx.doi.org/10.3766/jaaa.15086.

Full text
Abstract:
Background: Real-ear-to-coupler difference (RECD) measurements are used for the purposes of estimating degree and configuration of hearing loss (in dB SPL ear canal) and predicting hearing aid output from coupler-based measures. Accurate measurements of hearing threshold, derivation of hearing aid fitting targets, and predictions of hearing aid output in the ear canal assume consistent matching of RECD coupling procedure (i.e., foam tip or earmold) with that used during assessment and in verification of the hearing aid fitting. When there is a mismatch between these coupling procedures, errors are introduced. Purpose: The goal of this study was to quantify the systematic difference in measured RECD values obtained when using a foam tip versus an earmold with various tube lengths. Assuming that systematic errors exist, the second goal was to investigate the use of a foam tip to earmold correction for the purposes of improving fitting accuracy when mismatched RECD coupling conditions occur (e.g., foam tip at assessment, earmold at verification). Study Sample: Eighteen adults and 17 children (age range: 3–127 mo) participated in this study. Data Collection and Analysis: Data were obtained using simulated ears of various volumes and earmold tubing lengths and from patients using their own earmolds. Derived RECD values based on simulated ear measurements were compared with RECD values obtained for adult and pediatric ears for foam tip and earmold coupling. Results: Results indicate that differences between foam tip and earmold RECDs are consistent across test ears for adults and children, which supports the development of a correction between foam tip and earmold couplings for RECDs that can be applied across individuals. Conclusions: The foam tip to earmold correction values developed in this study can be used to provide improved estimations of earmold RECDs. This may support better accuracy in acoustic transforms related to transforming thresholds and/or hearing aid coupler responses to ear canal sound pressure level for the purposes of fitting behind-the-ear hearing aids.
APA, Harvard, Vancouver, ISO, and other styles
48

FRANCO, KARLIEN, and SALI A. TAGLIAMONTE. "The most stable it's ever been: the preterit/present perfect alternation in spoken Ontario English." English Language and Linguistics 26, no. 4 (November 18, 2022): 779–806. http://dx.doi.org/10.1017/s1360674322000016.

Full text
Abstract:
English tense/aspect-marking is an area where variation abounds and where many theories have been formulated. Diachronic studies of the preterit/present perfect alternation indicate that the present perfect (e.g. I have eaten already) has been losing ground to the preterit (e.g. I ate already) (e.g. Elsness 1997, but see Hundt & Smith 2009, Werner 2014). However, few studies have examined this alternation in vernacular speech. This article fills this lacuna by analyzing spoken data from Ontario, Canada, from an apparent-time perspective. Using a large archive of multiple communities and people of different generations, we focus on linguistic contexts known to be variable, viz. with adverbs of indefinite time. Results indicate that, in contrast with previous studies, the alternation is mostly stable. We find evidence of change only with the adverb ever. Where there is evidence of change, this change is different from the predictions in the literature, with the preterit increasing in frequency. We suggest that a minor constructionalization process operates in tandem with ongoing specialization of the preterit/present perfect contrast. Taken together, these results provide another example of the importance of including speech in research on language variation and change and of the unique contribution certain constructions make to more general systems of grammar.
APA, Harvard, Vancouver, ISO, and other styles
49

Israelsson, Kjell-Erik, Renata Bogo, and Erik Berninger. "Reliability in Hearing Threshold Prediction in Normal-Hearing and Hearing-Impaired Participants Using Mixed Multiple ASSR." Journal of the American Academy of Audiology 26, no. 03 (March 2015): 299–310. http://dx.doi.org/10.3766/jaaa.26.3.9.

Full text
Abstract:
Background and Purpose: The rapidly evolving field of hearing aid fitting in infants requires rapid, objective, and highly reliable methods for diagnosing hearing impairment. The aim was to determine test-retest reliability in hearing thresholds predicted by multiple auditory steady-state response (ASSRthr) among normal-hearing (NH) and hearing-impaired (HI) adults, and to study differences between ASSRthr and pure-tone threshold (PTT) as a function of frequency in each participant. ASSR amplitude versus stimulus level was analyzed to study ASSR growth rate in NH and HI participants, especially at ASSRthr. Research Design and Study Sample: Mixed multiple ASSR (100% AM, 20% FM), using long-time averaging at a wide range of stimulus levels, and PTT were recorded in 10 NH and 14 HI adults. ASSRthr was obtained in 10 dB steps simultaneously in both ears using a test-retest protocol (center frequencies = 500, 1000, 2000, and 4000 Hz; modulation frequencies = 80–96 Hz). The growth rate at ASSRthr was calculated as the slope (nV/dB) of the ASSR amplitudes obtained at, and 10 dB above, ASSRthr. PTT was obtained in both ears in 1 dB steps using a fixed-frequency Békésy technique. All of the NH participants showed PTTs better than 20 dB HL (125–8000 Hz), and mean pure-tone average (PTA; 500–4000 Hz) was 1.8 dB HL. The HI participants exhibited quite symmetrical sensorineural hearing losses, as revealed by a mean interaural PTA difference of 6.5 dB. Their mean PTA in the better ear was 38.7 dB HL. Results: High ASSRthr reproducibility (independent of PTT) was found in both NH and HI participants (test-retest interquartile range = 10 dB). The prediction error was numerically higher in NH participants (f ≥1000 Hz), although only a significant difference existed at 1000 Hz. The median difference between ASSRthr (dB HL) and PTT (dB HL) was approximately 10 dB in the HI group at frequencies of 1000 Hz or greater, and 20 dB at 500 Hz. In general, the prediction error decreased (p < 0.001) with increasing hearing threshold, although large intersubject variability existed. Regression analysis (PTT versus ASSRthr) in HI participants revealed correlation coefficients between 0.72–0.88 (500–4000 Hz) and slopes at approximately 1.0. Large variability in ASSRthr-PTT versus frequency was demonstrated across HI participants (interquartile range approximately 20 dB). The maximum across-frequency difference (ASSRthr-PTT) in an individual participant was 50 dB. HI participants showed overall significantly higher amplitudes and slopes at ASSRthr than did NH participants (p < 0.02). The amplitude-intensity function revealed monotonically increasing ASSRs in NH participants (slope 2 nV/dB), whereas HI participants exhibited heterogeneous and mostly nonmonotonically increasing ASSRs. Conclusions: Long-time averaging of ASSR revealed high ASSRthr reproducibility and systematic decrease in prediction error with increasing hearing threshold, albeit large intersubject variability in prediction error existed. A plausible explanation for the systematic difference in ASSRthr between NH and HI adults might be significantly higher ASSR amplitudes and higher overall growth rates at ASSRthr among HI participants. Across-frequency comparison of PTT and ASSRthr in an individual HI participant demonstrated large variation; thus, ASSR may not be optimal for, e.g., reliable threshold prediction in infants and subsequent fine-tuning of hearing aids.
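The growth-rate measure described above is simply the amplitude change across a 10 dB step expressed in nV/dB; a tiny sketch with made-up amplitudes makes the calculation explicit.

def assr_growth_rate(amp_at_thr_nv, amp_at_thr_plus10_nv, step_db=10.0):
    """ASSR growth rate (nV/dB): slope between threshold and threshold + 10 dB."""
    return (amp_at_thr_plus10_nv - amp_at_thr_nv) / step_db

# Made-up amplitudes: a normal-hearing-like slope (~2 nV/dB) vs. a steeper one.
print(assr_growth_rate(20.0, 40.0))   # 2.0 nV/dB
print(assr_growth_rate(30.0, 80.0))   # 5.0 nV/dB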
APA, Harvard, Vancouver, ISO, and other styles
50

KILPATRICK, ALEXANDER J., RIKKE L. BUNDGAARD-NIELSEN, and BRETT J. BAKER. "Japanese co-occurrence restrictions influence second language perception." Applied Psycholinguistics 40, no. 2 (January 30, 2019): 585–611. http://dx.doi.org/10.1017/s0142716418000711.

Full text
Abstract:
Most current models of nonnative speech perception (e.g., extended perceptual assimilation model, PAM-L2, Best & Tyler, 2007; speech learning model, Flege, 1995; native language magnet model, Kuhl, 1993) base their predictions on the native/nonnative status of individual phonetic/phonological segments. This paper demonstrates that the phonotactic properties of Japanese influence the perception of natively contrasting consonants and suggests that phonotactic influence must be formally incorporated in these models. We first propose that by extending the perceptual categories outlined in PAM-L2 to incorporate sequences of sounds, we can account for the effects of differences in native and nonnative phonotactics on nonnative and cross-language segmental perception. In addition, we test predictions based on such an extension in two perceptual experiments. In Experiment 1, Japanese listeners categorized and rated vowel–consonant–vowel strings in combinations that either obeyed or violated Japanese phonotactics. The participants categorized phonotactically illegal strings to the perceptually nearest (legal) categories. In Experiment 2, participants discriminated the same strings in AXB discrimination tests. Our results show that Japanese listeners are more accurate and have faster response times when discriminating between legal strings than between legal and illegal strings. These findings expose serious shortcomings in currently accepted nonnative perception models, which offer no framework for the influence of native language phonotactics.
APA, Harvard, Vancouver, ISO, and other styles