Academic literature on the topic 'ID-speech'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'ID-speech.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "ID-speech"

1

Trainor, Laurel J., Caren M. Austin, and Renée N. Desjardins. "Is Infant-Directed Speech Prosody a Result of the Vocal Expression of Emotion?" Psychological Science 11, no. 3 (May 2000): 188–95. http://dx.doi.org/10.1111/1467-9280.00240.

Full text
Abstract:
Many studies have found that infant-directed (ID) speech has higher pitch, has more exaggerated pitch contours, has a larger pitch range, has a slower tempo, and is more rhythmic than typical adult-directed (AD) speech. We show that the ID speech style reflects free vocal expression of emotion to infants, in comparison with more inhibited expression of emotion in typical AD speech. When AD speech does express emotion, the same acoustic features are used as in ID speech. We recorded ID and AD samples of speech expressing love-comfort, fear, and surprise. The emotions were equally discriminable in the ID and AD samples. Acoustic analyses showed few differences between the ID and AD samples, but robust differences across the emotions. We conclude that ID prosody itself is not special. What is special is the widespread expression of emotion to infants in comparison with the more inhibited expression of emotion in typical adult interactions.
2

Bryant, Gregory A., and H. Clark Barrett. "Recognizing Intentions in Infant-Directed Speech." Psychological Science 18, no. 8 (August 2007): 746–51. http://dx.doi.org/10.1111/j.1467-9280.2007.01970.x.

Full text
Abstract:
In all languages studied to date, distinct prosodic contours characterize different intention categories of infant-directed (ID) speech. This vocal behavior likely exists universally as a species-typical trait, but little research has examined whether listeners can accurately recognize intentions in ID speech using only vocal cues, without access to semantic information. We recorded native-English-speaking mothers producing four intention categories of utterances (prohibition, approval, comfort, and attention) as both ID and adult-directed (AD) speech, and we then presented the utterances to Shuar adults (South American hunter-horticulturalists). Shuar subjects were able to reliably distinguish ID from AD speech and were able to reliably recognize the intention categories in both types of speech, although performance was significantly better with ID speech. This is the first demonstration that adult listeners in an indigenous, nonindustrialized, and nonliterate culture can accurately infer intentions from both ID speech and AD speech in a language they do not speak.
3

Hayashi, Akiko, Yuji Tamekawa, and Shigeru Kiritani. "Developmental Change in Auditory Preferences for Speech Stimuli in Japanese Infants." Journal of Speech, Language, and Hearing Research 44, no. 6 (December 2001): 1189–200. http://dx.doi.org/10.1044/1092-4388(2001/092).

Full text
Abstract:
The developmental change in auditory preferences for speech stimuli was investigated for Japanese infants aged 4–14 months old. We conducted three experiments using two speech pairs in the head-turn preference procedure. Infant-directed (ID) speech and adult-directed (AD) speech stimuli were used in a longitudinal study (Experiment 1) and a cross-sectional study (Experiment 2). Native (Japanese) and non-native (English) speech stimuli were used in a cross-sectional study (Experiment 3). In all experiments, infants demonstrated a developmental change in their listening preference. For the ID/AD speech pair used in Experiments 1 and 2, infants show a U-shaped developmental shift with three developmental stages: Stage 1, in which very young infants tend to prefer ID speech over AD speech; Stage 2, in which the preference for ID speech decreases temporarily; and Stage 3, in which older infants again show a consistent preference for ID speech. For the native/non-native speech pair, there is a tendency for an increased preference for native speech over non-native speech, although infants did not demonstrate a U-shaped pattern. The difference in developmental pattern between the two types of speech pairs was discussed.
4

Golinkoff, Roberta Michnick, and Anthony Alioto. "Infant-directed speech facilitates lexical learning in adults hearing Chinese: implications for language acquisition." Journal of Child Language 22, no. 3 (October 1995): 703–26. http://dx.doi.org/10.1017/s0305000900010011.

Full text
Abstract:
Experiments 1 and 2 examined the effects of infant-directed (ID) speech on adults' ability to learn an individual target word in sentences in an unfamiliar, non-Western language (Chinese). English-speaking adults heard pairs of sentences read by a female, native Chinese speaker in either ID or adult-directed (AD) speech. The pairs of sentences described slides of 10 common objects. The Chinese name for the object (the target word) was placed in an utterance-final position in experiment 1 (n = 61) and in a medial position in experiment 2 (n = 79). At test, each Chinese target word was presented in isolation in AD speech in a recognition task. Only subjects who heard ID speech with the target word in utterance-final position demonstrated learning of the target words. The results support assertions that ID speech, which tends to put target words in sentence-final position, may assist infants in segmenting and remembering portions of the linguistic stream. In experiment 3 (n = 23), subjects judged whether each of the ID and AD speech samples prepared for experiments 1 and 2 were directed to an adult or to an infant. Judgements were above chance for two types of sentence: ID speech with the target word in the final position and AD speech with the target word in a medial position. In addition to indirectly confirming the results of experiments 1 and 2, these findings suggest that at least some of the prosodic features which comprise ID speech in Chinese and English must overlap.
5

Kaplan, Peter S., Jo-Anne Bachorowski, Moria J. Smoski, and William J. Hudenko. "Infants of Depressed Mothers, Although Competent Learners, Fail to Learn in Response to Their Own Mothers' Infant-Directed Speech." Psychological Science 13, no. 3 (May 2002): 268–71. http://dx.doi.org/10.1111/1467-9280.00449.

Full text
Abstract:
Depressed mothers use less of the exaggerated prosody that is typical of infant-directed (ID) speech than do nondepressed mothers. We investigated the consequences of this reduced perceptual salience in ID speech for infant learning. Infants of nondepressed mothers readily learned that their mothers' speech signaled a face, whereas infants of depressed mothers failed to learn that their mothers' speech signaled the face. Infants of depressed mothers did, however, show strong learning in response to speech produced by an unfamiliar nondepressed mother. These outcomes indicate that the reduced perceptual salience of depressed mothers' ID speech could lead to deficient learning in otherwise competent learners.
6

Smith, Nicholas A., and Heather L. Strader. "Infant-directed visual prosody." Interaction Studies 15, no. 1 (June 10, 2014): 38–54. http://dx.doi.org/10.1075/is.15.1.02smi.

Full text
Abstract:
Acoustical changes in the prosody of mothers’ speech to infants are distinct and near universal. However, less is known about the visible properties of mothers’ infant-directed (ID) speech, and their relation to speech acoustics. Mothers’ head movements were tracked as they interacted with their infants using ID speech, and compared to movements accompanying their adult-directed (AD) speech. Movement measures along three dimensions of head translation, and three axes of head rotation were calculated. Overall, more head movement was found for ID than AD speech, suggesting that mothers exaggerate their visual prosody in a manner analogous to the acoustical exaggerations in their speech. Regression analyses examined the relation between changing head position and changing acoustical pitch (F0) over time. Head movements and voice pitch were more strongly related in ID speech than in AD speech. When these relations were examined across time windows of different durations, stronger relations were observed for shorter time windows (< 5 sec). However, the particular form of these more local relations did not extend or generalize to longer time windows. This suggests that the multimodal correspondences in speech prosody are variable in form, and occur within limited time spans.
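A minimal sketch of the kind of windowed analysis described in this abstract, written in Python with NumPy: it correlates an F0 track with a head-position track over successive fixed-length windows. The sampling rate, window length, and the synthetic example data are illustrative assumptions, not values taken from the study.

    import numpy as np

    def windowed_correlation(f0, head_pos, fs=100, window_s=5.0):
        # Pearson correlation between F0 and head position in successive,
        # non-overlapping windows of window_s seconds; both tracks are assumed
        # to be aligned and sampled at fs Hz, with unvoiced frames interpolated.
        n = int(window_s * fs)
        coeffs = []
        for start in range(0, len(f0) - n + 1, n):
            a, b = f0[start:start + n], head_pos[start:start + n]
            if a.std() > 0 and b.std() > 0:
                coeffs.append(np.corrcoef(a, b)[0, 1])
        return np.array(coeffs)

    # Synthetic, loosely coupled tracks just to exercise the function
    t = np.linspace(0, 20, 2000)
    f0_track = 220 + 30 * np.sin(0.5 * t) + np.random.randn(t.size)
    head_track = 0.02 * np.sin(0.5 * t + 0.3) + 0.002 * np.random.randn(t.size)
    print(windowed_correlation(f0_track, head_track).round(2))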
7

Pavlikova, Maria I., Olga V. Frolova, and Elena E. Lyakso. "Intonation Characteristics of Speech in Children with Intellectual Disabilities." Vestnik Tomskogo gosudarstvennogo universiteta, no. 462 (2021): 31–39. http://dx.doi.org/10.17223/15617793/462/4.

Full text
Abstract:
In the literature, data on the formation of intonation in Russian-speaking children with mild intellectual disabilities (mental retardation) without genetic syndromes and serious neurological disorders (for example, cerebral palsy) based on the instrumental analysis of children’s speech are absent. The aim of this study was to compare the intonation characteristics of speech in children, aged 5 to 7, with typical development and with mild intellectual disabilities. The participants of the study were 20 children aged 5 to 7: 10 children (5 girls and 5 boys) with typical development (TD group) and 10 children (6 boys and 4 girls) with mild intellectual disabilities (ID group, ICD-10-CM Code F70). Intellectual disabilities were not associated with genetic or severe neurological disorders (non-specific ID). Child speech was taken from the AD-CHILD.RU speech database. Audio and video recordings of speech and behavior of TD group children (in a kindergarten) and ID group children (in an orphanage) were made in the model situation of a “dialogue with an adult”. Two studies were conducted: a perceptual experiment (n=10 listeners – native speakers, researchers in the field of child speech development) and an instrumental spectrographic analysis of child speech. The instrumental analysis of speech was made in the Praat program. The duration of utterances and stressed vowels, pitch values (average, maximum and minimum), pitch range values of utterances, and pitch range values of vowels were analyzed. The perceptual experiment showed that the utterances of ID group children classified as less clear and more emotional than the utterances of TD group children. The task of phrase stress (words highlighted by voice) revealing was more difficult for adults when they were listening to the speech of ID group children vs. TD group children. In ID group children, the values of utterance duration are lower and the values of vowel duration are higher than in TD group children. The average, maximum, and minimum pitch values, the pitch range values of ID group children’s utterances are higher vs. the corresponding parameters of TD group children’s speech. The duration and pitch range values of stressed vowels from ID group children’s words highlighted by intonation are higher than these features of TD group children’s stressed vowels. The pitch contours of stressed vowels from TD group children’s words highlighted by intonation were presented in most cases by the rise of the pitch contour; the pitch contours of stressed vowels from ID group children’s words highlighted by intonation were presented by the fall of the pitch. The dome-shaped vowel pitch contour and U-shaped contour are more frequent in the speech of ID group children vs. TD group children. In the future, the intonation characteristics of speech of children with different diagnoses could be considered as additional diagnostic criteria of developmental disorders.
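The abstract describes measuring utterance duration and pitch statistics in Praat. As a rough illustration of that measurement step only, the Python sketch below uses the parselmouth package (a Python wrapper around Praat); the file name is a placeholder and the analysis settings are Praat defaults, not necessarily the authors' settings.

    import numpy as np
    import parselmouth  # Python interface to Praat

    def utterance_pitch_stats(wav_path):
        # Duration plus average, maximum, minimum pitch and pitch range,
        # i.e. the kinds of features listed in the abstract.
        snd = parselmouth.Sound(wav_path)
        pitch = snd.to_pitch()                       # default Praat pitch analysis
        f0 = pitch.selected_array['frequency']
        f0 = f0[f0 > 0]                              # keep voiced frames only
        return {
            "duration_s": snd.get_total_duration(),
            "f0_mean_hz": float(f0.mean()),
            "f0_max_hz": float(f0.max()),
            "f0_min_hz": float(f0.min()),
            "f0_range_hz": float(f0.max() - f0.min()),
        }

    print(utterance_pitch_stats("utterance_01.wav"))  # placeholder file name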
8

Dilley, Laura, Matthew Lehet, Elizabeth A. Wieland, Meisam K. Arjmandi, Maria Kondaurova, Yuanyuan Wang, Jessa Reed, Mario Svirsky, Derek Houston, and Tonya Bergeson. "Individual Differences in Mothers' Spontaneous Infant-Directed Speech Predict Language Attainment in Children With Cochlear Implants." Journal of Speech, Language, and Hearing Research 63, no. 7 (July 17, 2020): 2453–67. http://dx.doi.org/10.1044/2020_jslhr-19-00229.

Full text
Abstract:
Purpose Differences across language environments of prelingually deaf children who receive cochlear implants (CIs) may affect language acquisition; yet, whether mothers show individual differences in how they modify infant-directed (ID) compared with adult-directed (AD) speech has seldom been studied. This study assessed individual differences in how mothers realized speech modifications in ID register and whether these predicted differences in language outcomes for children with CIs. Method Participants were 36 dyads of mothers and their children aged 0;8–2;5 (years;months) at the time of CI implantation. Mothers' spontaneous speech was recorded in a lab setting in ID or AD conditions before ~15 months postimplantation. Mothers' speech samples were characterized for acoustic–phonetic and lexical properties established as canonical indices of ID speech to typically hearing infants, such as vowel space area differences, fundamental frequency variability, and speech rate. Children with CIs completed longitudinal administrations of one or more standardized language assessment instruments at variable intervals from 6 months to 9.5 years postimplantation. Standardized scores on assessments administered longitudinally were used to calculate linear regressions, which gave rise to predicted language scores for children at 2 years postimplantation and language growth over 2-year intervals. Results Mothers showed individual differences in how they modified speech in ID versus AD registers. Crucially, these individual differences significantly predicted differences in estimated language outcomes at 2 years postimplantation in children with CIs. Maternal speech variation in lexical quantity and vowel space area differences across ID and AD registers most frequently predicted estimates of language attainment in children with CIs, whereas prosodic differences played a minor role. Conclusion Results support that caregiver language behaviors play a substantial role in explaining variability in language attainment in children receiving CIs. Supplemental Material https://doi.org/10.23641/asha.12560147
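The abstract mentions fitting per-child linear regressions to longitudinal standardized scores and reading off predicted language at 2 years post-implantation plus growth over 2-year intervals. A toy Python illustration of that idea, with made-up ages and scores, is sketched below; it is not the authors' analysis code.

    import numpy as np

    def language_trajectory(months_post_implant, standard_scores):
        # Fit one child's longitudinal scores with a straight line, then report
        # the predicted score at 24 months post-implantation and the slope
        # expressed as growth per 2-year interval.
        slope, intercept = np.polyfit(months_post_implant, standard_scores, deg=1)
        predicted_at_2y = intercept + slope * 24.0
        growth_per_2y = slope * 24.0
        return predicted_at_2y, growth_per_2y

    # Hypothetical child assessed at 6, 12, 30 and 48 months post-implantation
    pred, growth = language_trajectory([6, 12, 30, 48], [78, 82, 90, 96])
    print(round(pred, 1), round(growth, 1))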
9

Droucker, Danielle, Suzanne Curtin, and Athena Vouloumanos. "Linking Infant-Directed Speech and Face Preferences to Language Outcomes in Infants at Risk for Autism Spectrum Disorder." Journal of Speech, Language, and Hearing Research 56, no. 2 (April 2013): 567–76. http://dx.doi.org/10.1044/1092-4388(2012/11-0266).

Full text
Abstract:
Purpose In this study, the authors aimed to examine whether biases for infant-directed (ID) speech and faces differ between infant siblings of children with autism spectrum disorder (ASD) (SIBS-A) and infant siblings of typically developing children (SIBS-TD), and whether speech and face biases predict language outcomes and risk group membership. Method Thirty-six infants were tested at ages 6, 8, 12, and 18 months. Infants heard 2 ID and 2 adult-directed (AD) speech passages paired with either a checkerboard or a face. The authors assessed expressive language at 12 and 18 months and general functioning at 12 months using the Mullen Scales of Early Learning (Mullen, 1995). Results Both infant groups preferred ID to AD speech and preferred faces to checkerboards. SIBS-TD demonstrated higher expressive language at 18 months than did SIBS-A, a finding that correlated with preferences for ID speech at 12 months. Although both groups looked longer to face stimuli than to the checkerboard, the magnitude of the preference was smaller in SIBS-A and predicted expressive vocabulary at 18 months in this group. Infants' preference for faces contributed to risk-group membership in a logistic regression analysis. Conclusion Infants at heightened risk of ASD differ from typically developing infants in their preferences for ID speech and faces, which may underlie deficits in later language development and social communication.
10

Suttora, Chiara, Nicoletta Salerni, Paola Zanchi, Laura Zampini, Maria Spinelli, and Mirco Fasolo. "Relationships between structural and acoustic properties of maternal talk and children’s early word recognition." First Language 37, no. 6 (June 21, 2017): 612–29. http://dx.doi.org/10.1177/0142723717714946.

Full text
Abstract:
This study aimed to investigate specific associations between structural and acoustic characteristics of infant-directed (ID) speech and word recognition. Thirty Italian-acquiring children and their mothers were tested when the children were 1;3. Children’s word recognition was measured with the looking-while-listening task. Maternal ID speech was recorded during a mother–child interaction session and analyzed in terms of amount of speech, lexical and syntactic complexity, positional salience of nouns and verbs, high pitch and variation, and temporal characteristics. The analyses revealed that final syllable length positively predicts children’s accuracy in word recognition whereas the use of verbs in the utterance-final position has an adverse effect on children’s performance. Several of the expected associations between ID speech features and children’s word recognition skills, however, were not significant. Taken together, these findings suggest that only specific structural and acoustic properties of ID speech can facilitate word recognition in children, thereby fostering their ability to extrapolate sound patterns from the stream and map them with their referents.

Dissertations / Theses on the topic "ID-speech"

1

Genovese, Giuliana. "L'infant-directed speech nella lingua italiana: caratteristiche lessicali, sintattiche, prosodiche e relazione con lo sviluppo linguistico." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2019. http://hdl.handle.net/10281/241109.

Full text
Abstract:
This research investigates the features of infant-directed speech in Italian during the first year of life and its effects on language development, from preverbal precursors to lexical and syntactic skills. The theoretical framework assumes that language acquisition has social foundations. The first part comprises two studies describing the lexical, syntactic, and prosodic properties of this special register. The second part includes two studies on the quality and effects of linguistic input in language acquisition, considering a preverbal precursor as well as lexical and syntactic abilities in the second year of life; this section also identifies predictors of language learning, taking into account both the characteristics of the input and the contribution of the infant's early communicative skills. The first study is a longitudinal investigation describing, through global and specific measures, the lexical and syntactic characteristics of Italian infant-directed speech; the register emerges as simplified but not simple, with a period of maximum simplification in the second half of the first year. The second, also longitudinal, study examines the prosodic properties of infant-directed speech and the prosodic characterization of utterances with different pragmatic functions. Results show that prosody in Italian infant-directed speech is generally emphasized in the preverbal period but, surprisingly, only moderately so; moreover, prosody changes over the first year in a pattern that differs from that reported for other non-tonal languages. Finally, utterances with different pragmatic functions show distinctive prosodic characteristics, underlining the highly informative role of prosody. The third study longitudinally explores the antecedents of language development, evaluating the contribution of the infant's early communicative skills and the role of the input, whose quality and stability were examined; the literature still reports conflicting results on these questions. The data indicate that language development in the second year of life reflects early communicative abilities and appears to be fostered by rich, redundant, and syntactically articulated input. Finally, the fourth study uses an experimental design to analyze the possible effects of infant-directed song, hypothesizing that it facilitates phonetic discrimination, a preverbal precursor of language development, relative to infant-directed speech. This topic has been largely neglected in a literature that has concentrated on the effects of the prosody typical of infant-directed speech. The main results show that the facilitating role of infant-directed song emerges at the end of the first year of life, when the ability to discriminate native and non-native phonemes changes developmentally. Benefits of greater exposure to music and song in the preverbal period were also found, both for phonetic discrimination and for subsequent lexical development.
2

Mustafa, M. K. "On-device mobile speech recognition." Thesis, Nottingham Trent University, 2016. http://irep.ntu.ac.uk/id/eprint/28044/.

Full text
Abstract:
Despite many years of research, Speech Recognition remains an active area of research in Artificial Intelligence. Currently, the most common commercial application of this technology on mobile devices uses a wireless client – server approach to meet the computational and memory demands of the speech recognition process. Unfortunately, such an approach is unlikely to remain viable when fully applied over the approximately 7.22 Billion mobile phones currently in circulation. In this thesis we present an On – Device Speech recognition system. Such a system has the potential to completely eliminate the wireless client-server bottleneck. For the Voice Activity Detection part of this work, this thesis presents two novel algorithms used to detect speech activity within an audio signal. The first algorithm is based on the Log Linear Predictive Cepstral Coefficients Residual signal. These LLPCCRS feature vectors were then classified into voice signal and non-voice signal segments using a modified K-means clustering algorithm. This VAD algorithm is shown to provide a better performance as compared to a conventional energy frame analysis based approach. The second algorithm developed is based on the Linear Predictive Cepstral Coefficients. This algorithm uses the frames within the speech signal with the minimum and maximum standard deviation, as candidates for a linear cross correlation against the rest of the frames within the audio signal. The cross correlated frames are then classified using the same modified K-means clustering algorithm. The resulting output provides a cluster for Speech frames and another cluster for Non–speech frames. This novel application of the linear cross correlation technique to linear predictive cepstral coefficients feature vectors provides a fast computation method for use on the mobile platform; as shown by the results presented in this thesis. The Speech recognition part of this thesis presents two novel Neural Network approaches to mobile Speech recognition. Firstly, a recurrent neural networks architecture is developed to accommodate the output of the VAD stage. Specifically, an Echo State Network (ESN) is used for phoneme level recognition. The drawbacks and advantages of this method are explained further within the thesis. Secondly, a dynamic Multi-Layer Perceptron approach is developed. This builds on the drawbacks of the ESN and provides a dynamic way of handling speech signal length variabilities within its architecture. This novel Dynamic Multi-Layer Perceptron uses both the Linear Predictive Cepstral Coefficients (LPC) and the Mel Frequency Cepstral Coefficients (MFCC) as input features. A speaker dependent approach is presented using the Centre for spoken Language and Understanding (CSLU) database. The results show a very distinct behaviour from conventional speech recognition approaches because the LPC shows performance figures very close to the MFCC. A speaker independent system, using the standard TIMIT dataset, is then implemented on the dynamic MLP for further confirmation of this. In this mode of operation the MFCC outperforms the LPC. Finally, all the results, with emphasis on the computation time of both these novel neural network approaches are compared directly to a conventional hidden Markov model on the CSLU and TIMIT standard datasets.
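As a very loose sketch of the clustering idea behind the thesis's voice activity detection (frame-level features grouped into a speech and a non-speech cluster), the Python snippet below substitutes standard MFCC features and scikit-learn's ordinary K-means for the LLPCCRS/LPCC features and the modified K-means developed in the thesis; it is an assumption-laden stand-in, not the thesis algorithm.

    import numpy as np
    import librosa
    from sklearn.cluster import KMeans

    def two_cluster_vad(wav_path):
        # Cluster frame-level features into two groups and call the
        # higher-energy cluster "speech"; returns a boolean mask per frame.
        y, sr = librosa.load(wav_path, sr=16000)
        feats = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13,
                                     n_fft=2048, hop_length=512).T   # (frames, 13)
        energy = librosa.feature.rms(y=y, frame_length=2048,
                                     hop_length=512)[0]              # same frame grid
        labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(feats)
        speech_label = max((0, 1), key=lambda k: energy[labels == k].mean())
        return labels == speech_label

    mask = two_cluster_vad("recording.wav")   # placeholder file name
    print(f"{mask.mean():.0%} of frames labelled as speech")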
3

Melnikoff, Stephen Jonathan. "Speech recognition in programmable logic." Thesis, University of Birmingham, 2003. http://etheses.bham.ac.uk//id/eprint/16/.

Full text
Abstract:
Speech recognition is a computationally demanding task, especially the decoding part, which converts pre-processed speech data into words or sub-word units, and which incorporates Viterbi decoding and Gaussian distribution calculations. In this thesis, this part of the recognition process is implemented in programmable logic, specifically, on a field-programmable gate array (FPGA). Relevant background material about speech recognition is presented, along with a critical review of previous hardware implementations. Designs for a decoder suitable for implementation in hardware are then described. These include details of how multiple speech files can be processed in parallel, and an original implementation of an algorithm for summing Gaussian mixture components in the log domain. These designs are then implemented on an FPGA. An assessment is made as to how appropriate it is to use hardware for speech recognition. It is concluded that while certain parts of the recognition algorithm are not well suited to this medium, much of it is, and so an efficient implementation is possible. Also presented is an original analysis of the requirements of speech recognition for hardware and software, which relates the parameters that dictate the complexity of the system to processing speed and bandwidth. The FPGA implementations are compared to equivalent software, written for that purpose. For a contemporary FPGA and processor, the FPGA outperforms the software by an order of magnitude.
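One step named in this abstract, summing Gaussian mixture components in the log domain, is the standard log-sum-exp computation. The Python sketch below shows that arithmetic in software form (the FPGA design approximates the correction term differently); the toy mixture parameters are illustrative only.

    import numpy as np

    def log_gaussian(x, mean, var):
        # Log density of a diagonal-covariance Gaussian.
        return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

    def log_mixture_likelihood(x, log_weights, means, variances):
        # Sum weighted components entirely in the log domain:
        # log sum_k exp(log w_k + log N(x; mu_k, var_k)),
        # stabilised by factoring out the maximum term (log-sum-exp).
        logs = np.array([lw + log_gaussian(x, m, v)
                         for lw, m, v in zip(log_weights, means, variances)])
        peak = logs.max()
        return peak + np.log(np.exp(logs - peak).sum())

    # Two-component toy mixture over 3-dimensional features
    x = np.array([0.2, -0.1, 0.4])
    print(log_mixture_likelihood(x,
                                 np.log([0.6, 0.4]),
                                 [np.zeros(3), np.ones(3)],
                                 [np.ones(3), 2 * np.ones(3)]))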
4

Safavi, Saeid. "Speaker characterization using adult and children's speech." Thesis, University of Birmingham, 2015. http://etheses.bham.ac.uk//id/eprint/6029/.

Full text
Abstract:
Speech signals contain important information about a speaker, such as age, gender, language, accent, and emotional/psychological state. Automatic recognition of these types of characteristics has a wide range of commercial, medical and forensic applications such as interactive voice response systems, service customization, natural human-machine interaction, recognizing the type of pathology of speakers, and directing the forensic investigation process. Many such applications depend on reliable systems using short speech segments without regard to the spoken text (text-independent). All these applications are also applicable using children’s speech. This research aims to develop accurate methods and tools to identify different characteristics of the speakers. Our experiments cover speaker recognition, gender recognition, age-group classification, and accent identification. However, similar approaches and techniques can be applied to identify other characteristics such as emotional/psychological state. The main focus of this research is on detecting these characteristics from children’s speech, which is previously reported as a more challenging subject compared to adult. Furthermore, the impact of different frequency bands on the performances of several recognition systems is studied, and the performance obtained using children’s speech is compared with the corresponding results from experiments using adults’ speech. Speaker characterization is performed by fitting a probability density function to acoustic features extracted from the speech signals. Since the distribution of acoustic features is complex, Gaussian mixture models (GMM) are applied. Due to lack of data, parametric model adaptation methods have been applied to adapt the universal background model (UBM) to the characteristics of utterances. An effective approach involves adapting the UBM to speech signals using the Maximum-A-Posteriori (MAP) scheme. Then, the Gaussian means of the adapted GMM are concatenated to form a Gaussian mean super-vector for a given utterance. Finally, a classification or regression algorithm is used to identify the speaker characteristics. While effective, Gaussian mean super-vectors are of a high dimensionality resulting in high computational cost and difficulty in obtaining a robust model in the context of limited data. In the field of speaker recognition, recent advances using the i-vector framework have increased the classification accuracy. This framework, which provides a compact representation of an utterance in the form of a low dimensional feature vector, applies a simple factor analysis on GMM means.
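The abstract outlines the classic GMM-UBM recipe: MAP-adapt the universal background model's means to one utterance and concatenate them into a mean supervector. Below is a compact Python sketch of that step under stated assumptions (scikit-learn's GaussianMixture as the UBM, mean-only adaptation, an illustrative relevance factor); it is a simplified stand-in rather than the system built in the thesis.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def map_mean_supervector(ubm, frames, relevance=16.0):
        # frames: (T, D) acoustic feature vectors from one utterance.
        post = ubm.predict_proba(frames)            # (T, C) component posteriors
        n_c = post.sum(axis=0)                      # zeroth-order statistics
        f_c = post.T @ frames                       # first-order statistics (C, D)
        alpha = (n_c / (n_c + relevance))[:, None]  # data-dependent adaptation weight
        safe_n = np.maximum(n_c, 1e-10)[:, None]
        adapted_means = alpha * (f_c / safe_n) + (1.0 - alpha) * ubm.means_
        return adapted_means.reshape(-1)            # C*D-dimensional supervector

    # Toy example: an 8-component, diagonal-covariance UBM on random "background" data
    rng = np.random.default_rng(0)
    ubm = GaussianMixture(n_components=8, covariance_type="diag",
                          random_state=0).fit(rng.normal(size=(2000, 13)))
    print(map_mean_supervector(ubm, rng.normal(size=(300, 13))).shape)  # (104,)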
5

Tang, Andrea. "Narration and speech and thought presentation in comics." Thesis, University of Huddersfield, 2016. http://eprints.hud.ac.uk/id/eprint/27960/.

Full text
Abstract:
The purpose of this study was to test the application of two linguistic models of narration and one linguistic model of speech and thought presentation on comic texts: Fowler's (1986) internal and external narration types, Simpson's (1993) narrative categories from his 'modal grammar of point of view' and Leech and Short's (1981) speech and thought presentation scales. These three linguistic models of narration and speech and thought presentation, originally designed and used for the analysis of prose texts, were applied to comics, a multimodal medium that tells stories through a combination of both words and images. Through examples from comics, I demonstrate in this thesis that Fowler's (1986) basic distinction between internal and external narration types and Simpson's (1993) narrative categories (categories A, B(N) and B(R) narration) can be identified in both visual and textual forms in the pictures and the words of comics. I also demonstrate the potential application of Leech and Short's (1981) speech and thought presentation scales on comics by identifying instances of the scales' categories (NPV/NPT, NPSA/NPTA, DS/DT and FDS/FDT) from comics, but not all of the speech and thought presentation categories existed in my comic data (there was no evidence of IS/IT and the categorisation of FIS/FIT was debatable). In addition, I identified other types of discourse that occurred in comics which were not accounted for by Leech and Short's (1981) speech and thought presentation categories: internally and externally-located DS and DT (DS and DT that are presented within (internally) or outside of (externally) the scenes that they originate from), narrator-influenced forms of DS and DT (where narrator interference seems to occur in DS and DT), visual presentations of speech and thought (where speech and thought are represented by pictorial or symbolic content in balloons) and non-verbal balloons (where no speech or thought is being presented, but states of mind and emphasized pauses or silence are represented by punctuation marks and other symbols in speech balloons).
6

Dalby, Jonathan Marler. "Phonetic structure of fast speech in American English." Bloomington : Reproduced by the Indiana University Linguistics Club, 1986. http://books.google.com/books?id=6MpWAAAAMAAJ.

Full text
7

Shen, Ao. "The selective use of gaze in automatic speech recognition." Thesis, University of Birmingham, 2014. http://etheses.bham.ac.uk//id/eprint/5202/.

Full text
Abstract:
The performance of automatic speech recognition (ASR) degrades significantly in natural environments compared to in laboratory assessments. Being a major source of interference, acoustic noise affects speech intelligibility during the ASR process. There are two main problems caused by the acoustic noise. The first is the speech signal contamination. The second is the speakers' vocal and non-vocal behavioural changes. These phenomena elicit mismatch between the ASR training and recognition conditions, which leads to considerable performance degradation. To improve noise-robustness, exploiting prior knowledge of the acoustic noise in speech enhancement, feature extraction and recognition models are popular approaches. An alternative approach presented in this thesis is to introduce eye gaze as an extra modality. Eye gaze behaviours have roles in interaction and contain information about cognition and visual attention; not all behaviours are relevant to speech. Therefore, gaze behaviours are used selectively to improve ASR performance. This is achieved by inference procedures using noise-dependant models of gaze behaviours and their temporal and semantic relationship with speech. `Selective gaze-contingent ASR' systems are proposed and evaluated on a corpus of eye movement and related speech in different clean, noisy environments. The best performing systems utilise both acoustic and language model adaptation.
8

Najafian, Maryam. "Acoustic model selection for recognition of regional accented speech." Thesis, University of Birmingham, 2016. http://etheses.bham.ac.uk//id/eprint/6461/.

Full text
Abstract:
Accent is cited as an issue for speech recognition systems. Our experiments showed that the ASR word error rate is up to seven times greater for accented speech compared with standard British English. The main objective of this research is to develop Automatic Speech Recognition (ASR) techniques that are robust to accent variation. We applied different acoustic modelling techniques to compensate for the effects of regional accents on the ASR performance. For conventional GMM-HMM based ASR systems, we showed that using a small amount of data from a test speaker to choose an accent dependent model using an accent identification system, or building a model using the data from N neighbouring speakers in AID space, will result in superior performance compared to that obtained with unsupervised or supervised speaker adaptation. In addition we showed that using a DNN-HMM rather than a GMM-HMM based acoustic model would improve the recognition accuracy considerably. Even if we apply two stages of accent followed by speaker adaptation to the GMM-HMM baseline system, the GMM-HMM based system will not outperform the baseline DNN-HMM based system. For more contemporary DNN-HMM based ASR systems we investigated how adding different types of accented data to the training set can provide better recognition accuracy on accented speech. Finally, we proposed a new approach for visualisation of the AID feature space. This is helpful in analysing the AID recognition accuracies and analysing AID confusion matrices.
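Purely as an illustration of the model-selection idea in the abstract (choose an accent-dependent model, or the N nearest training speakers, using a small sample of the test speaker mapped into an accent-identification space), here is a generic Python sketch; the embedding space, distance measure, and variable names are all assumptions rather than details from the thesis.

    import numpy as np

    def select_models(test_vec, accent_means, speaker_vecs, speaker_ids, n_neighbours=20):
        # Strategy 1: pick the accent whose mean AID-space vector is closest.
        accents = list(accent_means)
        distances = [np.linalg.norm(test_vec - accent_means[a]) for a in accents]
        chosen_accent = accents[int(np.argmin(distances))]
        # Strategy 2: pick the N nearest training speakers in the same space,
        # whose data would then be pooled to build a model.
        d = np.linalg.norm(speaker_vecs - test_vec, axis=1)
        neighbours = [speaker_ids[i] for i in np.argsort(d)[:n_neighbours]]
        return chosen_accent, neighbours

    # Hypothetical two-accent example with random 10-dimensional AID vectors
    rng = np.random.default_rng(1)
    accent_means = {"accent_A": rng.normal(size=10), "accent_B": rng.normal(size=10)}
    speaker_vecs = rng.normal(size=(50, 10))
    ids = [f"spk{i:02d}" for i in range(50)]
    print(select_models(rng.normal(size=10), accent_means, speaker_vecs, ids, 5))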
9

Fritz, Isabella. "How gesture and speech interact during production and comprehension." Thesis, University of Birmingham, 2018. http://etheses.bham.ac.uk//id/eprint/8084/.

Full text
Abstract:
This thesis investigates the mechanisms that underlie the interaction of gesture and speech during the production and comprehension of language on a temporal and semantic level. The results from the two gesture-speech production experiments provide unambiguous evidence that gestural content is shaped online by the ways in which speakers package information into planning units in speech rather than being influenced by how events are lexicalised. In terms of gesture-speech synchronisation, a meta-analysis of these experiments showed that lexical items which are semantically related to the gesture's content (i.e., semantic affiliates) compete for synchronisation when these affiliates are separated within a sentence. This competition leads to large proportions of gestures not synchronising with any semantic affiliate. These findings demonstrate that gesture onset can be attracted by lexical items that do not co-occur with the gesture. The thesis then tested how listeners process gestures when synchrony is lost and whether preceding discourse related to a gesture's meaning impacts gesture interpretation and processing. Behavioural and ERP results show that gesture interpretation and processing is discourse dependent. Moreover, the ERP experiment demonstrates that when synchronisation between gesture and semantic affiliate is not present the underlying integration processes are different from synchronous gesture-speech combinations.
10

Zhang, Li. "A syllable-based, pseudo-articulatory approach to speech recognition." Thesis, University of Birmingham, 2004. http://etheses.bham.ac.uk//id/eprint/4905/.

Full text
Abstract:
The prevailing approach to speech recognition is Hidden Markov Modelling, which yields good performance. However, it ignores phonetics, which has the potential for going beyond the acoustic variance to provide a more abstract underlying representation. The novel approach pursued in this thesis is motivated by phonetic and phonological considerations. It is based on the notion of pseudo-articulatory representations, which are abstract and idealized accounts of articulatory activity. The original work presented here demonstrates the recovery of syllable structure information from pseudo-articulatory representations directly without resorting to statistical models of phone sequences. The work is also original in its use of syllable structures to recover phonemes. This thesis presents the three-stage syllable based, pseudo-articulatory approach in detail. Though it still has problems, this research leads to a more plausible style of automatic speech recognition and will contribute to modelling and understanding speech behaviour. Additionally, it also permits a 'multithreaded' approach combining information from different processes.

Books on the topic "ID-speech"

1

Markowitz, Judith A. Voice ID source profiles. [Evanston, IL]: J. Markowitz, 1997.

Find full text
2

Ansel, B. Acoustic Predictors of Speech Intelligibility in Cerebral Palsy (Id No 87101). Indiana Univ, 1989.

Find full text
3

Sappok, Tanja, Sabine Zepperitz, and Mark Hudson. Meeting Emotional Needs in Intellectual Disability: The Developmental Approach. Hogrefe Publishing, 2021. http://dx.doi.org/10.1027/00589-000.

Full text
Abstract:
Using a developmental perspective, the authors offer a new, integrated model for supporting people with intellectual disability (ID). This concept builds upon recent advances in attachment-informed approaches, by drawing upon a broader understanding of the social, emotional, and cognitive competencies of people with ID, which is grounded in developmental neuroscience and psychology. The book explores in detail how challenging behaviour and mental health difficulties in people with ID arise when their basic emotional needs are not being met by those in the environment. Using individually tailored interventions, which complement existing models of care, practitioners can help to facilitate maturational processes and reduce behavior that is challenging to others. As a result, the ‘fit’ of a person within his or her individual environment can be improved. Case examples throughout the book illuminate how this approach works by targeting interventions towards the person’s stage of emotional development. This book will be of interest to a wide range of professionals working with people with ID, including: clinical psychologists, psychiatrists, occupational therapists, learning disability nurses, speech and language therapists, and teachers in special education settings, as well as parents and caregivers.
4

Bhaumik, Sabyasachi, and Regi Alexander, eds. Oxford Textbook of the Psychiatry of Intellectual Disability. Oxford University Press, 2020. http://dx.doi.org/10.1093/med/9780198794585.001.0001.

Full text
Abstract:
Intellectual Disability (ID), a lifelong condition characterized by an impairment of intellectual functioning and deficits in adaptive skills is part of a spectrum of developmental disorders which also includes other conditions like autism and ADHD. While psychiatric problems are three to four times more common in those with ID, diagnosing it can be fraught with difficulties due to associated communication problems, atypical presentations, overlap with physical conditions, and experience of marginalization and abuse. In addition, treatment approaches may be different and the potential for treatment-related side effects greater. With a range of international experts authoring its chapters and providing the up-to-date evidence base in assessment, diagnosis, and treatment of mental health problems in people with ID, this book will be useful not just for the trainee doctor in psychiatry, but also for those in allied professions like general practice, nursing, psychology, speech and language therapy, social work, and occupational therapy as well as family members and carers and all those involved in any way with organizing or delivering care and treatment for people with intellectual disability and mental health problems. Throughout, the book addresses issues that are of relevance to those on the frontline and hence most chapters offer examples of clinical issues that come up in day to day practice. There are also a number of single response multiple choice questions that will serve as an aid to learning.

Book chapters on the topic "ID-speech"

1

Ortego-Resa, Carlos, Ignacio Lopez-Moreno, Daniel Ramos, and Joaquin Gonzalez-Rodriguez. "Anchor Model Fusion for Emotion Recognition in Speech." In Biometric ID Management and Multimodal Communication, 49–56. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009. http://dx.doi.org/10.1007/978-3-642-04391-8_7.

Full text
2

Scheidat, Tobias, Michael Biermann, Jana Dittmann, Claus Vielhauer, and Karl Kümmel. "Multi-biometric Fusion for Driver Authentication on the Example of Speech and Face." In Biometric ID Management and Multimodal Communication, 220–27. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009. http://dx.doi.org/10.1007/978-3-642-04391-8_29.

Full text
3

Staroniewicz, Piotr. "Recognition of Emotional State in Polish Speech - Comparison between Human and Automatic Efficiency." In Biometric ID Management and Multimodal Communication, 33–40. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009. http://dx.doi.org/10.1007/978-3-642-04391-8_5.

Full text
4

Khademi, Hamidreza, and Walcir Cardoso. "Learning L2 pronunciation with Google Translate." In Intelligent CALL, granular systems and learner data: short papers from EUROCALL 2022, 228–33. Research-publishing.net, 2022. http://dx.doi.org/10.14705/rpnet.2022.61.1463.

Full text
Abstract:
This article, based on Khademi’s (2021) Master’s thesis, examines the use of Google Translate (GT) and its speech capabilities, Text-to-Speech Synthesis (TTS) and Automatic Speech Recognition (ASR), in helping L2 learners acquire the pronunciation of English past -ed allomorphy (/t/, /d/, /id/) in a semi-autonomous context, considering three levels of pronunciation development: phonological awareness, perception, and production. Our pre/posttest results indicate significant improvements in the participants’ awareness and perception of the English past -ed, but no improvements in production (except for /id/). These findings corroborate our hypothesis that GT’s speech capabilities can be used as pedagogical tools to help learners acquire the target pronunciation feature.
5

Ross, Cordelia Y. "Below-average intellectual and adaptive functioning." In Child and Adolescent Psychiatry, 17–24. Oxford University Press, 2021. http://dx.doi.org/10.1093/med/9780197577479.003.0003.

Full text
Abstract:
Intellectual disability (ID) is a neurodevelopmental disorder characterized by deficits in intellectual and adaptive functioning that begins during the developmental period. The case describes a child with Fragile X syndrome, the most common inherited cause of ID. Adaptive functioning deficits include difficulty acquiring age-appropriate conceptual, social, and practical skills. There are many causes of ID including genetic, environmental, and idiopathic etiologies. The assessment of ID includes evaluating for underlying and/or associated medical conditions and neuropsychological testing to evaluate the child’s cognitive and functional abilities. The severity of ID is determined by the child’s functional abilities. Components of treatment may include speech and language therapy, occupational therapy, physical therapy, family counseling and respite care, behavioral interventions, and educational interventions. Families should also be educated on transition-to-adulthood topics including guardianship. Pharmacologic treatments can be used to treat behavioral symptoms, sleep difficulty, or comorbid psychiatric disorders.
6

He, Yue, and Walcir Cardoso. "Can online translators and their speech capabilities help English learners improve their pronunciation?" In CALL and professionalisation: short papers from EUROCALL 2021, 126–31. Research-publishing.net, 2021. http://dx.doi.org/10.14705/rpnet.2021.54.1320.

Full text
Abstract:
This study investigated whether a translation tool (Microsoft Translator – MT) and its built-in speech features (Text-To-Speech synthesis – TTS – and speech recognition) can promote learners’ acquisition in pronunciation of English regular past tense -ed in a self-directed manner. Following a pretest/posttest design, we compared 29 participants’ performances of past -ed allomorphy (/t/, /d/, and /id/) by assessing their pronunciation in terms of phonological awareness, phonemic discrimination, and oral production. The findings highlight the affordances of MT regarding its pedagogical use for helping English as a Foreign Language (EFL) learners improve their pronunciation.
7

Nidhyananthan, S. Selva, Joe Virgin A., and Shantha Selva Kumari R. "Wireless Enhanced Security Based on Speech Recognition." In Handbook of Research on Information Security in Biomedical Signal Processing, 228–53. IGI Global, 2018. http://dx.doi.org/10.4018/978-1-5225-5152-2.ch012.

Full text
Abstract:
Security is the most notable fact of all computerized control gadgets. In this chapter, a voice ID computerized gadget is utilized for the security motivation using speech recognition. Mostly, the voices are trained by extracting mel frequency cepstral coefficient feature (MFCC), but it is very sensitive to noise interference and degrades the performance; hence, dynamic MFCC is used for speech and speaker recognition. The registered voices are stored in a database. When the device senses any voice, it cross checks with the registered voice. If any mismatches occur, it gives an alert to the authorized person through global system for mobile communication (GSM) to intimate the unauthorized access. GSM works at a rate of 168 Kb/s up to 40 km and it operates at different operating frequencies like 800MHz, 900MHz, etc. This proposed work is more advantageous for the security systems to trap the unauthorized persons through an efficient communication.
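As a toy illustration of the enrolment-and-matching step this chapter describes (dynamic MFCC features compared against registered voices, with a mismatch triggering a GSM alert), the Python sketch below uses librosa MFCCs plus their deltas and a simple cosine-similarity check; the threshold, file names, and averaging scheme are assumptions, and the GSM step is only indicated by a comment.

    import numpy as np
    import librosa

    def voice_template(wav_path, sr=16000):
        # "Dynamic" MFCCs: static MFCCs stacked with their delta coefficients,
        # averaged over time into a single template vector.
        y, _ = librosa.load(wav_path, sr=sr)
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
        return np.vstack([mfcc, librosa.feature.delta(mfcc)]).mean(axis=1)

    def is_registered(test_wav, enrolled_templates, threshold=0.95):
        # Cross-check the sensed voice against every registered template.
        probe = voice_template(test_wav)
        sims = [float(np.dot(probe, t) / (np.linalg.norm(probe) * np.linalg.norm(t) + 1e-10))
                for t in enrolled_templates]
        return max(sims) >= threshold

    enrolled = [voice_template("owner_enrolment.wav")]       # placeholder file names
    if not is_registered("sensed_voice.wav", enrolled):
        print("Unauthorized voice; the device would send an alert over GSM here")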
8

Anjum, Iram, Aysha Saeed, Komal Aslam, Bibi Nazia Murtaza, and Shahid Mahmood Baig. "Primary Microcephaly and Schizophrenia: Genetics, Diagnostics and Current Therapeutics." In Omics Technologies for Clinical Diagnosis and Gene Therapy: Medical Applications in Human Genetics, 283–300. BENTHAM SCIENCE PUBLISHERS, 2022. http://dx.doi.org/10.2174/9789815079517122010020.

Full text
Abstract:
Intellectual disabilities (ID) are among the most common genetic disabilities worldwide. Over the last two decades, ID has especially drawn special scientific interest being the key to understanding normal brain development, growth, and functioning. Here, we discuss two intellectual disabilities to better understand the emerging trends in disease diagnosis as well as the therapies available for their management. Primary microcephaly (MCPH) is a monogenic genetic disorder with twenty-eight loci (MCPH1-MCPH28) mapped so far with all the causative genes being elucidated as well. The role of these genes in disease prognosis along with their association with various MCPH-linked phenotypes plays an important role in the molecular diagnosis of the disease. As there is no cure/treatment yet available to enlarge a congenitally small brain, management modalities in use include physical, speech and occupational therapies as well as psychological and genetic counselling to not only reduce the incidence of the disorder but also to help families cope better. The second intellectual disability being discussed here is schizophrenia which is a multifactorial disorder owing to its complex and extremely heterogeneous etiology. Although various environmental factors play an important role, the genetic factors have been identified to play the most pivotal role in disease presentation as to date, 19 loci (SCZD1-SCZD19) have been linked to schizophrenia. However, underlying genes for only six of these loci have been mapped along with 10 other genes that are either linked to schizophrenia or show susceptibility to it. Diagnosis of schizophrenia needs careful consideration and various tests and tools currently employed for complete diagnosis have been discussed here. The management options for schizophrenia include pharmacological, non-pharmacological and intracranial therapies. These disorders shed light on the important role omics technologies have played not only in better understanding of the disease prognosis but also assisting in disease diagnosis and treatment modalities too.
9

"34. Parsing with ID/LP and PS Rules." In Natural Language Processing and Speech Technology, 342–54. De Gruyter Mouton, 1996. http://dx.doi.org/10.1515/9783110821895-035.

Full text

Conference papers on the topic "ID-speech"

1

Lyakso, Elena, Olga Frolova, and Aleksandr Nikolaev. "VOICE AND SPEECH FEATURES AS A DIAGNOSTIC SYMPTOM." In International Psychological Applications Conference and Trends. inScience Press, 2021. http://dx.doi.org/10.36315/2021inpact074.

Full text
Abstract:
The study of the peculiarities of speech of children with atypical development is necessary for the development of educational programs, children’s socialization and adaptation in society. The aim of this study is to determine the acoustic features of voice and speech of children with autism spectrum disorders (ASD) as a possible additional diagnostic criterion. The multiplicity of symptomatology, different age of its manifestation, and the presence of a leading symptom complex individually for each child make it difficult to diagnose ASD. To determine the specificity of speech features of ASD, we analyzed the speech of children with developmental disabilities in which speech disorders accompany the disease - Down syndrome (DS), intellectual disabilities (ID), mixed specific developmental disorders (MDD). The features that reflect the main physiological processes occurring in the speech tract during voice and speech production are selected for analysis. The speech of 300 children aged 4-16 years was analyzed. Speech files are selected from the speech database "AD_Child.Ru" (Lyakso et al., 2019). Acoustic features of voice and speech, which are specific for different developmental disorders, were determined. The speech of ASD children is characterized by: high pitch values (high voice); pitch variability; high values for the third formant (emotional) and its intensity causing "atypical" spectrogram of the speech signal; high values of vowel articulation index (VAI). The speech of children with DS is characterized by the maximal duration of vowels in words; low pitch values (low voice); a wide range of values of the VAI depending on the difficulty of speech material; low values of the third formant; unformed most of consonant phonemes. The characteristics of speech of children with ID are: high values of vowel’s duration in words, the pitch, and the third formant, low values of the VAI; of MDD - low pitch values and high values of the VAI. Based on the identified peculiarities specific to each disease, the set of acoustic features specific to ASD can be considered as a biomarker of autism and used as an additional diagnostic criterion. This will allow a timely diagnose, appoint treatment and develop individual programs for children. Speech characteristics of children with ID, DS, and MDD can be considered to a greater extent in the training and socialization of children and used in the development of training programs taking into account individual peculiarities of children.
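For readers unfamiliar with the vowel articulation index (VAI) mentioned in this abstract, the small Python function below computes it in one commonly used form, VAI = (F2/i/ + F1/a/) / (F1/i/ + F1/u/ + F2/u/ + F2/a/); whether the authors used exactly this variant is an assumption, and the example formant values are invented.

    def vowel_articulation_index(f1_a, f2_i, f1_i, f1_u, f2_u, f2_a):
        # Higher values indicate more peripheral (less centralized) corner vowels.
        return (f2_i + f1_a) / (f1_i + f1_u + f2_u + f2_a)

    # Illustrative corner-vowel formant values in Hz
    print(round(vowel_articulation_index(f1_a=850, f2_i=2300,
                                         f1_i=320, f1_u=340, f2_u=900, f2_a=1400), 2))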
2

Jianwei, An, Guo Fengyuan, Yang Yuliang, and Zhu Mengyu. "Research on ID Resolution of Speech Communication in VANET." In 2010 International Conference on Communications and Mobile Computing (CMC). IEEE, 2010. http://dx.doi.org/10.1109/cmc.2010.189.

Full text
3

Bedyakin, Roman, and Nikolay Mikhaylovskiy. "Language ID Prediction from Speech Using Self-Attentive Pooling." In Proceedings of the Third Workshop on Computational Typology and Multilingual NLP. Stroudsburg, PA, USA: Association for Computational Linguistics, 2021. http://dx.doi.org/10.18653/v1/2021.sigtyp-1.12.

Full text
4

Owsley, L. M. D., J. J. McLaughlin, L. G. Cazzanti, and S. R. Salaymeh. "Using speech technology to enhance isotope ID and classification." In 2009 IEEE Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC 2009). IEEE, 2009. http://dx.doi.org/10.1109/nssmic.2009.5402002.

Full text
5

Morley, Eric, Esther Klabbers, Jan P. H. van Santen, Alexander Kain, and Seyed Hamidreza Mohammadi. "Synthetic F0 can effectively convey speaker ID in delexicalized speech." In Interspeech 2012. ISCA: ISCA, 2012. http://dx.doi.org/10.21437/interspeech.2012-151.

Full text
6

Waters, Austin, Neeraj Gaur, Parisa Haghani, Pedro Moreno, and Zhongdi Qu. "Leveraging Language ID in Multilingual End-to-End Speech Recognition." In 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2019. http://dx.doi.org/10.1109/asru46091.2019.9003870.

Full text
7

Lee, Junmo, Kwangsub Song, Kyoungjin Noh, Tae-Jun Park, and Joon-Hyuk Chang. "DNN based multi-speaker speech synthesis with temporal auxiliary speaker ID embedding." In 2019 International Conference on Electronics, Information, and Communication (ICEIC). IEEE, 2019. http://dx.doi.org/10.23919/elinfocom.2019.8706390.

Full text
8

Soto, Victor, and Julia Hirschberg. "Joint Part-of-Speech and Language ID Tagging for Code-Switched Data." In Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching. Stroudsburg, PA, USA: Association for Computational Linguistics, 2018. http://dx.doi.org/10.18653/v1/w18-3201.

Full text
9

Giuseppe, Celano. "A ResNet-50-Based Convolutional Neural Network Model for Language ID Identification from Speech Recordings." In Proceedings of the Third Workshop on Computational Typology and Multilingual NLP. Stroudsburg, PA, USA: Association for Computational Linguistics, 2021. http://dx.doi.org/10.18653/v1/2021.sigtyp-1.13.

Full text
10

Dowlagar, Suman, and Radhika Mamidi. "A Pre-trained Transformer and CNN model with Joint Language ID and Part-of-Speech Tagging for Code-Mixed Social-Media Text." In International Conference Recent Advances in Natural Language Processing. INCOMA Ltd. Shoumen, BULGARIA, 2021. http://dx.doi.org/10.26615/978-954-452-072-4_042.

Full text

Reports on the topic "ID-speech"

1

Hansen, John H. Robust Speech Processing & Recognition: Speaker ID, Language ID, Speech Recognition/Keyword Spotting, Diarization/Co-Channel/Environmental Characterization, Speaker State Assessment. Fort Belvoir, VA: Defense Technical Information Center, October 2015. http://dx.doi.org/10.21236/ada623029.

Full text