Journal articles on the topic 'Acoustic-Articulatory Mapping'

Consult the top 27 journal articles for your research on the topic 'Acoustic-Articulatory Mapping.'

1

Zussa, F., Q. Lin, G. Richard, D. Sinder, and J. Flanagan. "Open‐loop acoustic‐to‐articulatory mapping." Journal of the Acoustical Society of America 98, no. 5 (November 1995): 2931. http://dx.doi.org/10.1121/1.414151.

2

Riegelsberger, Edward L., and Ashok K. Krishnamurthy. "Acoustic‐to‐articulatory mapping of fricatives." Journal of the Acoustical Society of America 97, no. 5 (May 1995): 3417. http://dx.doi.org/10.1121/1.412480.

3

Ananthakrishnan, G., and Olov Engwall. "Mapping between acoustic and articulatory gestures." Speech Communication 53, no. 4 (April 2011): 567–89. http://dx.doi.org/10.1016/j.specom.2011.01.009.

4

Sepulveda-Sepulveda, Alexander, and German Castellanos-Domínguez. "Time-Frequency Energy Features for Articulator Position Inference on Stop Consonants." Ingeniería y Ciencia 8, no. 16 (November 30, 2012): 37–56. http://dx.doi.org/10.17230/ingciencia.8.16.2.

Abstract:
Acoustic-to-articulatory inversion offers new perspectives and interesting applications in the speech processing field; however, it remains an open issue. This paper presents a method to estimate the distribution of the articulatory information contained in the acoustics of stop consonants, parametrized using the wavelet packet transform. The main focus is on measuring the relevant acoustic information, in terms of statistical association, for inferring the positions of the critical articulators involved in stop consonant production. Kendall's rank correlation coefficient is used as the relevance measure. Maps of relevant time–frequency features are calculated for the MOCHA–TIMIT database, from which stop consonants are extracted and analysed. The proposed method obtains a set of time–frequency components closely related to the articulatory phenomena, offering a deeper understanding of the relationship between the articulatory and acoustic domains. The relevance maps are tested in an acoustic-to-articulatory mapping system based on Gaussian mixture models, where they are shown to improve the performance of such a system on stop consonants. The method could be extended to other manner-of-articulation categories, e.g. fricatives, in order to adapt it to acoustic-to-articulatory mapping over whole speech.
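
The relevance-analysis idea in this abstract (rank-correlating time–frequency energies with articulator positions) can be sketched in a few lines of Python. This is an illustrative stand-in only: it uses a plain STFT energy grid instead of the paper's wavelet packet parametrization, and the synthetic audio and placeholder EMA channel are assumptions, not MOCHA–TIMIT data.

```python
# Minimal sketch: Kendall's tau between each time-frequency energy feature and
# an articulator position trajectory (the "relevance map" idea). The STFT is a
# simplification; the paper uses wavelet packets.
import numpy as np
from scipy.signal import stft
from scipy.stats import kendalltau

def tf_energy_features(signal, fs, nperseg=256):
    """Time-frequency energy matrix with shape (n_frames, n_bands)."""
    _, _, Z = stft(signal, fs=fs, nperseg=nperseg)
    return (np.abs(Z) ** 2).T  # frames along rows, frequency bands along columns

def relevance_map(energies, articulator):
    """Kendall tau between each band's energy trajectory and an articulator
    position trajectory sampled at the same frames."""
    n_frames, n_bands = energies.shape
    assert articulator.shape[0] == n_frames
    taus = np.empty(n_bands)
    for b in range(n_bands):
        taus[b], _ = kendalltau(energies[:, b], articulator)
    return taus  # one relevance value per frequency band

# Example with synthetic data (placeholder for MOCHA-TIMIT stop-consonant frames).
fs = 16000
audio = np.random.randn(fs)                      # 1 s of noise as a stand-in signal
feats = tf_energy_features(audio, fs)
tongue_tip_y = np.random.randn(feats.shape[0])   # hypothetical EMA channel
print(relevance_map(feats, tongue_tip_y))
```
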
5

Sorokin, V. N., and A. V. Trushkin. "Articulatory-to-acoustic mapping for inverse problem." Speech Communication 19, no. 2 (August 1996): 105–18. http://dx.doi.org/10.1016/0167-6393(96)00028-3.

6

Wu, Zhiyong, Kai Zhao, Xixin Wu, Xinyu Lan, and Helen Meng. "Acoustic to articulatory mapping with deep neural network." Multimedia Tools and Applications 74, no. 22 (August 1, 2014): 9889–907. http://dx.doi.org/10.1007/s11042-014-2183-z.

7

Riegelsberger, Edward L., and Ashok K. Krishnamurthy. "Acoustic‐to‐articulatory mapping of isolated and intervocalic fricatives." Journal of the Acoustical Society of America 101, no. 5 (May 1997): 3175. http://dx.doi.org/10.1121/1.419149.

8

Atal, Bishnu. "A study of ambiguities in the acoustic-articulatory mapping." Journal of the Acoustical Society of America 122, no. 5 (2007): 3079. http://dx.doi.org/10.1121/1.2942998.

9

McGowan, Richard S., and Michael A. Berger. "Acoustic-articulatory mapping in vowels by locally weighted regression." Journal of the Acoustical Society of America 126, no. 4 (2009): 2011. http://dx.doi.org/10.1121/1.3184581.

10

Schmidt, Anna Marie. "Korean to English articulatory mapping: Palatometric and acoustic data." Journal of the Acoustical Society of America 95, no. 5 (May 1994): 2820–21. http://dx.doi.org/10.1121/1.409681.

11

Shahrebabaki, Abdolreza Sabzi, Giampiero Salvi, Torbjorn Svendsen, and Sabato Marco Siniscalchi. "Acoustic-to-Articulatory Mapping With Joint Optimization of Deep Speech Enhancement and Articulatory Inversion Models." IEEE/ACM Transactions on Audio, Speech, and Language Processing 30 (2022): 135–47. http://dx.doi.org/10.1109/taslp.2021.3133218.

12

Chennoukh, S., D. Sinder, G. Richard, and J. Flanagan. "Methods for acoustic‐to‐articulatory mapping and voice mimic systems." Journal of the Acoustical Society of America 101, no. 5 (May 1997): 3179. http://dx.doi.org/10.1121/1.419218.

13

Behbood, Hossein, Seyyed Ali Seyyedsalehi, Hamid Reza Tohidypour, Mojtaba Najafi, and Shahriar Gharibzadeh. "A novel neural-based model for acoustic-articulatory inversion mapping." Neural Computing and Applications 21, no. 5 (March 15, 2011): 935–43. http://dx.doi.org/10.1007/s00521-011-0563-0.

14

Kello, Christopher T., and David C. Plaut. "A neural network model of the articulatory-acoustic forward mapping trained on recordings of articulatory parameters." Journal of the Acoustical Society of America 116, no. 4 (October 2004): 2354–64. http://dx.doi.org/10.1121/1.1715112.

15

You, Kang, Kele Xu, Jilong Wang, and Ming Feng. "Domain adaptation towards speaker-independent ultrasound tongue imaging based articulatory-to-acoustic conversion." Journal of the Acoustical Society of America 153, no. 3_supplement (March 1, 2023): A366. http://dx.doi.org/10.1121/10.0019181.

Abstract:
In this paper, we address an articulatory-to-acoustic problem: estimating the mel-spectrogram of the acoustic signal from midsagittal ultrasound tongue images of the vocal tract. Previous attempts employed statistical methods for the inversion between articulatory movements and speech, while deep learning has begun to dominate the field. Despite sustained efforts, mapping performance can vary greatly across speakers, and most previous methods are constrained to the speaker-dependent scenario. Here, we present a novel approach to speaker-independent mapping inspired by domain adaptation. Specifically, we decouple the spectrogram generation task from the speaker recognition task. Leveraging a newly designed loss function and an adversarial learning strategy, we improve performance in the speaker-independent scenario. To demonstrate the effectiveness of the proposed method, extensive experiments are conducted on the Tongue and Lips (TaL) corpus. Objective evaluation compares the generated spectrograms against ground truth using three metrics: MSE, SSIM, and CW-SSIM. The results indicate that the proposed method achieves superior performance in the speaker-independent scenario compared with competitive solutions. Our code is available at https://github.com/xianyi11/Articulatory-to-Acoustic-with-Domain-Adaptation.
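
The speaker-adversarial decoupling described in this abstract is commonly realized with a gradient reversal layer; the sketch below shows that general pattern in PyTorch. The network sizes, the flattened ultrasound input, and the loss weighting are assumptions for illustration and do not reproduce the authors' architecture (their code is at the linked repository).

```python
# Sketch of adversarial speaker-independent articulatory-to-acoustic mapping:
# a shared encoder feeds a spectrogram decoder and, through a gradient
# reversal layer, a speaker classifier that the encoder learns to defeat.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, negated (scaled) gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lamb * grad_output, None

class SpeakerAdversarialAAM(nn.Module):
    def __init__(self, n_feats, n_mels, n_speakers, lamb=1.0):
        super().__init__()
        self.lamb = lamb
        self.encoder = nn.Sequential(nn.Linear(n_feats, 256), nn.ReLU())
        self.decoder = nn.Linear(256, n_mels)        # predicts one mel-spectrogram frame
        self.spk_head = nn.Linear(256, n_speakers)   # adversarial speaker branch

    def forward(self, x):
        h = self.encoder(x)
        mel = self.decoder(h)
        spk = self.spk_head(GradReverse.apply(h, self.lamb))
        return mel, spk

# One training step (sketch): minimize spectrogram error while the reversed
# gradient pushes the encoder to remove speaker information.
model = SpeakerAdversarialAAM(n_feats=128 * 64, n_mels=80, n_speakers=10)
x = torch.randn(4, 128 * 64)              # flattened ultrasound frames (placeholder)
mel_target = torch.randn(4, 80)
spk_target = torch.randint(0, 10, (4,))
mel, spk = model(x)
loss = nn.functional.mse_loss(mel, mel_target) + nn.functional.cross_entropy(spk, spk_target)
loss.backward()
```
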
16

Honda, Masaaki, and Tokihiko Kaburagi. "Estimation of articulatory‐to‐acoustic mapping using input and output measurements." Journal of the Acoustical Society of America 93, no. 4 (April 1993): 2353–54. http://dx.doi.org/10.1121/1.406212.

17

Dusan, Sorin, and Li Deng. "Vocal‐tract length normalization for acoustic‐to‐articulatory mapping using neural networks." Journal of the Acoustical Society of America 106, no. 4 (October 1999): 2181. http://dx.doi.org/10.1121/1.427279.

18

Girin, Laurent, Thomas Hueber, and Xavier Alameda-Pineda. "Extending the Cascaded Gaussian Mixture Regression Framework for Cross-Speaker Acoustic-Articulatory Mapping." IEEE/ACM Transactions on Audio, Speech, and Language Processing 25, no. 3 (March 2017): 662–73. http://dx.doi.org/10.1109/taslp.2017.2651398.

19

Toda, Tomoki, Alan W. Black, and Keiichi Tokuda. "Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model." Speech Communication 50, no. 3 (March 2008): 215–27. http://dx.doi.org/10.1016/j.specom.2007.09.001.

20

Fonseca, Marco. "Japanese devoiced vowel tongue movement: Acoustics and articulory mapping." Journal of the Acoustical Society of America 153, no. 3_supplement (March 1, 2023): A372. http://dx.doi.org/10.1121/10.0019215.

Abstract:
The goal of this study is to evaluate the tongue movements of the Japanese devoiced vowels /i/ and /u/ using acoustic and articulatory data. Four speakers of Japanese were recorded with an EchoBlaster 128Z ultrasound probe attached to their chin. Participants read a total of 22 tokens in a carrier sentence (12 repetitions). Duration (ms) and Center of Gravity (CoG, Hz) were fit in two linear mixed-effects models as dependent variables, with the interaction between voicing (voiced/devoiced) and vowel (/i/, /u/) as predictors. For the duration values, there was an effect of both predictors; there was also an effect of voicing and of the interaction between vowel and voicing. For the articulatory data, tongue splines were obtained through EdgeTrak from the last data frame of the segment preceding the vowel prone to devoicing. A total of 12,317 x-y coordinates were fit into generalized additive models. In conclusion, devoiced vowels are shorter, and the CoG of devoiced vowels is higher than that of voiced ones. The CoG results indicate that the former segments are more fronted than the latter. The tongue splines likewise indicate that devoiced vowels are more fronted than voiced ones, consistent with the CoG data.
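
A minimal sketch of the statistical setup described above, assuming a table with per-token duration, CoG, voicing, vowel, and speaker columns; the column names, the CSV file, and the speaker random effect are hypothetical, and the tongue-spline GAM step is only indicated in a comment.

```python
# Two linear mixed-effects models (duration and CoG as dependent variables,
# voicing-by-vowel interaction as fixed effects, speaker as a random intercept).
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("devoiced_vowels.csv")   # hypothetical measurement table

for dv in ("duration_ms", "cog_hz"):
    model = smf.mixedlm(f"{dv} ~ voicing * vowel", data=df,
                        groups=df["speaker"])   # random intercept per speaker
    result = model.fit()
    print(dv, result.summary())

# The articulatory analysis would then fit the tongue-spline x-y coordinates
# with a generalized additive model (e.g., pygam.LinearGAM) with smooth terms
# over x per voicing condition.
```
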
21

Csapó, Tamás Gábor, Gábor Gosztolya, László Tóth, Amin Honarmandi Shandiz, and Alexandra Markó. "Optimizing the Ultrasound Tongue Image Representation for Residual Network-Based Articulatory-to-Acoustic Mapping." Sensors 22, no. 22 (November 8, 2022): 8601. http://dx.doi.org/10.3390/s22228601.

Abstract:
Within speech processing, articulatory-to-acoustic mapping (AAM) methods can use ultrasound tongue imaging (UTI) as an input. (Micro)convex transducers are mostly used, which provide a wedge-shaped visual image. However, this processing is optimized for visual inspection by the human eye, and the signal is often post-processed by the equipment. With newer ultrasound equipment, it is now possible to access the raw scanline data (i.e., ultrasound echo return) without any internal post-processing. In this study, we compared the raw scanline representation with the wedge-shaped processed UTI as input to the residual network applied for AAM, and we also investigated the optimal size of the input image. We found no significant differences between the performance attained using the raw data and the wedge-shaped image extrapolated from it. We found the optimal pixel size to be 64 × 43 for the raw scanline input, and 64 × 64 when transformed to a wedge. It is therefore not necessary to use the full original 64 × 842 pixel raw scanline; a smaller image is enough. This allows smaller networks to be built, and will be beneficial for the development of session- and speaker-independent methods for practical applications. AAM systems have the target application of a “silent speech interface”, which could be helpful for the communication of the speaking impaired, in military applications, or in extremely noisy conditions.
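
For orientation, the sketch below feeds the two input representations discussed above (a downsampled 64 × 43 raw scanline image and a 64 × 64 wedge image) through the same small residual CNN. The layer layout and output dimensionality are assumptions for illustration, not the paper's actual residual network.

```python
# Tiny residual CNN mapping one ultrasound frame to one spectral frame,
# applied to both input representations to show the shapes involved.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.c1 = nn.Conv2d(ch, ch, 3, padding=1)
        self.c2 = nn.Conv2d(ch, ch, 3, padding=1)
    def forward(self, x):
        return F.relu(x + self.c2(F.relu(self.c1(x))))

class UTItoSpectrum(nn.Module):
    def __init__(self, n_mels=80):
        super().__init__()
        self.stem = nn.Conv2d(1, 16, 3, padding=1)
        self.body = nn.Sequential(ResBlock(16), ResBlock(16))
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Linear(16, n_mels)   # one spectral frame per ultrasound frame
    def forward(self, x):
        h = self.pool(self.body(F.relu(self.stem(x)))).flatten(1)
        return self.head(h)

net = UTItoSpectrum()
raw = torch.randn(8, 1, 64, 842)                  # full raw scanline frames
raw_small = F.interpolate(raw, size=(64, 43))     # downsampled raw input
wedge = torch.randn(8, 1, 64, 64)                 # wedge-transformed input
print(net(raw_small).shape, net(wedge).shape)     # both -> torch.Size([8, 80])
```
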
22

Howard, Ian S., and Mark A. Huckvale. "Learning to control an articulatory synthesizer by imitating real speech." ZAS Papers in Linguistics 40 (January 1, 2005): 63–78. http://dx.doi.org/10.21248/zaspil.40.2005.258.

Abstract:
The goal of our current project is to build a system that can learn to imitate a version of a spoken utterance using an articulatory speech synthesiser. The approach is informed and inspired by knowledge of early infant speech development. Thus we expect our system to reproduce and exploit the utility of infant behaviours such as listening, vocal play, babbling and word imitation. We expect our system to develop a relationship between the sound-making capabilities of its vocal tract and the phonetic/phonological structure of imitated utterances. At the heart of our approach is the learning of an inverse model that relates acoustic and motor representations of speech. The acoustic-to-auditory mapping uses an auditory filter bank and a self-organizing phase of learning. The inverse model from auditory features to vocal tract control parameters is estimated using a babbling phase, in which the vocal tract is driven in an essentially random manner, much like the babbling phase of speech acquisition in infants. The complete system can then be used to imitate simple utterances through a direct mapping from sound to control parameters. Our initial results show that this procedure works well for sounds generated by the system's own voice. Further work is needed to build a phonological control level and achieve better performance with real speech.
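
The babbling-based inverse model described above can be illustrated with a toy numerical sketch: random control parameters are pushed through a stand-in forward model, and imitation becomes a nearest-neighbour lookup from auditory space back to control space. The forward map and dimensionalities are invented placeholders, not the project's articulatory synthesiser.

```python
# Babbling-phase inverse model, reduced to its simplest form.
import numpy as np

rng = np.random.default_rng(0)

def forward_model(params):
    """Stand-in articulatory-to-auditory forward map (unknown to the learner)."""
    W = np.array([[1.0, 0.3, -0.2], [0.1, 0.8, 0.4]])   # fixed, arbitrary
    return np.tanh(W @ params)

# Babbling phase: random vocal-tract control parameters and the sounds they make.
babble_params = rng.uniform(-1, 1, size=(5000, 3))
babble_sounds = np.array([forward_model(p) for p in babble_params])

def imitate(target_sound):
    """Inverse model: choose the babbled control vector whose sound is closest."""
    idx = np.argmin(np.linalg.norm(babble_sounds - target_sound, axis=1))
    return babble_params[idx]

target = forward_model(np.array([0.5, -0.2, 0.7]))       # a "heard" utterance
est = imitate(target)
print(est, np.linalg.norm(forward_model(est) - target))  # reproduction error
```
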
23

EDGERTON, MICHAEL EDWARD. "Palatal Sound: a comprehensive model of vocal tract articulation*." Organised Sound 4, no. 2 (June 1999): 93–110. http://dx.doi.org/10.1017/s1355771899002058.

Abstract:
Palatal Sound is a model of vocal tract articulation influenced by physiologic and acoustic analysis of the voice. Specifically, the term articulation refers to all movement within the vocal tract that results in open, filter-like sonorities, as well as in turbulent to absolute airflow modification. This model presents a complete mapping of place within the vocal tract that features flexibility across different vocal tract sizes and proportions. The principles behind this comprehensive mapping of acoustic and physical sound production techniques should not be foreign to those persons who create, combine, design, model or research sound. Therefore, this model might suggest avenues of sound exploration regardless of media or application. This text first presents a brief overview of the current trends of oral modification using vowels, followed by an introduction to and acoustic analyses of the comprehensive vocal tract model as applied to open-like sonorities. This model is then expanded through the presentation of other methods of open-like behaviours. Following the discussion of open sonorities, turbulent-like behaviours are discussed by first identifying the use of language-based fricatives and stops. After this (re-)exposition, the comprehensive model is applied to turbulent structures through examples and acoustic analyses. Finally, these turbulent methods are completed by additional, complementary methods of vocal tract turbulence. The intentions of this paper are: (i) to document this model clearly, (ii) to identify differences between speech and song articulatory behaviour and that of this comprehensive model with the aid of selected acoustic analyses, (iii) to suggest that this model renders valuable scientific information about the limits of vocal tract physiology, and (iv) to propose the practical use of this model by composers and performers.
24

Callan, Daniel E., Ray D. Kent, Frank H. Guenther, and Houri K. Vorperian. "An Auditory-Feedback-Based Neural Network Model of Speech Production That Is Robust to Developmental Changes in the Size and Shape of the Articulatory System." Journal of Speech, Language, and Hearing Research 43, no. 3 (June 2000): 721–36. http://dx.doi.org/10.1044/jslhr.4303.721.

Abstract:
The purpose of this article is to demonstrate that self-produced auditory feedback is sufficient to train a mapping between auditory target space and articulator space under conditions in which the structures of speech production are undergoing considerable developmental restructuring. One challenge for competing theories that propose invariant constriction targets is that it is unclear what teaching signal could specify constriction location and degree so that a mapping between constriction target space and articulator space can be learned. It is predicted that a model trained by auditory feedback will accomplish speech goals, in auditory target space, by continuously learning to use different articulator configurations to adapt to the changing acoustic properties of the vocal tract during development. The Maeda articulatory synthesis part of the DIVA neural network model (Guenther et al., 1998) was modified to reflect the development of the vocal tract by using measurements taken from MR images of children. After training, the model was able to maintain the 11 English vowel targets in auditory planning space, utilizing varying articulator configurations, despite morphological changes that occur during development. The vocal-tract constriction pattern (derived from the vocal-tract area function) as well as the formant values varied during the course of development in correspondence with morphological changes in the structures involved with speech production. Despite changes in the acoustical properties of the vocal tract that occur during the course of development, the model was able to demonstrate motor-equivalent speech production under lip-restriction conditions. The model accomplished this in a self-organizing manner even though there was no prior experience with lip restriction during training.
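
A toy numerical illustration of the article's central claim, under strong simplifying assumptions: the "vocal tract" is a linear map that changes with age, the auditory target is fixed, and articulator settings are corrected by simple gradient steps on the auditory error. This is not the DIVA/Maeda model, only a sketch of how auditory-feedback-driven learning can keep hitting fixed auditory targets with changing articulator configurations.

```python
# Auditory-feedback learning with a developmentally changing "vocal tract".
import numpy as np

rng = np.random.default_rng(1)
target = np.array([0.6, -0.3])                  # fixed goal in auditory space

def vocal_tract(age):
    """Stand-in articulatory-to-auditory map that changes during development."""
    return np.array([[1.0 + 0.5 * age, 0.2],
                     [0.1, 0.8 - 0.3 * age]])

artic = rng.uniform(-1, 1, size=2)              # current articulator configuration
for step in range(2001):
    age = step / 2000                           # vocal tract morphology changes over time
    A = vocal_tract(age)
    err = A @ artic - target                    # auditory feedback error
    artic -= 0.05 * A.T @ err                   # feedback-driven correction of articulators
    if step % 500 == 0:
        print(f"age={age:.2f} articulators={artic.round(2)} "
              f"auditory error={np.linalg.norm(err):.3f}")
```
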
25

Ananthakrishnan, G., Olov Engwall, and Daniel Neiberg. "Exploring the Predictability of Non-Unique Acoustic-to-Articulatory Mappings." IEEE Transactions on Audio, Speech, and Language Processing 20, no. 10 (December 2012): 2672–82. http://dx.doi.org/10.1109/tasl.2012.2210876.

26

Sepúlveda-Sepúlveda, Franklin Alexander, César Germán Castellanos-Domínguez, and Pedro Gómez-Vilda. "Subject-independent acoustic-to-articulatory mapping of fricative sounds by using vocal tract length normalization." Revista Facultad de Ingeniería Universidad de Antioquia, no. 77 (December 2015). http://dx.doi.org/10.17533/udea.redin.n77a19.

27

Pathi, Soujanya, and Prakash Mondal. "The mental representation of sounds in speech sound disorders." Humanities and Social Sciences Communications 8, no. 1 (January 27, 2021). http://dx.doi.org/10.1057/s41599-021-00706-z.

Abstract:
The objective of this study is to investigate facets of the human phonological system in an attempt to elucidate the special nature of mental representations and operations underlying some of the errors in speech sound disorders (SSDs). After examining different theories on the mental representations of sounds and their organization in SSDs, we arrive at the conclusion that the existing elucidations on the phonological representations do not suffice to explain some distinctive facets of SSDs. Here, we endorse a hypothesis in favor of representationalism but offer an alternative conceptualization of the phonological representations (PR). We argue that the PR is to be understood in terms of a phonological base that holds information about a segment’s acoustic structure, and which interacts with other levels in the speech sound system in the mind so as to produce a certain sound. We also propose that the PR is connected to an interface module which mediates interactions between the PR and the articulatory system (AS) responsible for the physical manifestation of speech sounds in real time by way of the coordination of activities of speech organs in the vocal tract. We specifically consider different stages of operations within the interface, a specialized system within the cognitive system, which can explain patterns in the SSD data that have so far remained elusive. Positioned between the PR and the AS, the interface module is the heart of the current study. The presence of an interface module is necessitated by the fact that not all errors of SSDs are explainable in terms of structural, motor or even the symbolic misrepresentations at the level of PR. The interface acts as a mediating system mapping sound representations onto articulatory instructions for the actual production of sounds. The interface module can receive, process, and share the phonological inputs with other levels within the speech sound system. We believe an interface module such as ours holds the key to explaining at least certain speech disarticulations in SSDs.