Journal articles on the topic "Statistical Parametric Speech Synthesizer"

To see other types of publications on this topic, follow the link: Statistical Parametric Speech Synthesizer.

Format your source according to APA, MLA, Chicago, Harvard, and other styles

Consult the top 50 journal articles for your research on the topic "Statistical Parametric Speech Synthesizer".

Next to every source in the list of references there is an "Add to bibliography" button. Press it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scholarly publication in .pdf format and read the online abstract of the work, if these are available in the metadata.

Browse journal articles across a wide range of disciplines and compile your bibliography correctly.

1

Szklanny, Krzysztof, and Jakub Lachowicz. "Implementing a Statistical Parametric Speech Synthesis System for a Patient with Laryngeal Cancer." Sensors 22, no. 9 (April 21, 2022): 3188. http://dx.doi.org/10.3390/s22093188.

Abstract:
Total laryngectomy, i.e., the surgical removal of the larynx, has a profound influence on a patient’s quality of life. The procedure results in a loss of natural voice, which constitutes a significant socio-psychological problem for the patient. The main aim of the study was to develop a statistical parametric speech synthesis system for a patient with laryngeal cancer, on the basis of the patient’s speech samples recorded shortly before the surgery, and to check whether it was possible to generate speech of a quality close to that of the original recordings. The recording made use of a representative corpus of the Polish language, consisting of 2150 sentences. The recorded voice proved to indicate dysphonia, which was confirmed by the auditory-perceptual RBH scale (roughness, breathiness, hoarseness) and by acoustical analysis using the AVQI (Acoustic Voice Quality Index). The speech synthesis model was trained using the Merlin repository. Twenty-five experts participated in MUSHRA listening tests, rating the synthetic voice at 69.4 on a 0–100 scale relative to the professional voice-over talent recording, which is a very good result. The authors compared the quality of this synthetic voice to that of another synthetic speech model trained on the same corpus, but with speech samples recorded by a voice-over talent. The same experts rated that voice at 63.63, which means the synthetic voice of the patient with laryngeal cancer obtained a higher score than the one built from the talent’s recordings. As such, the method enabled the creation of a statistical parametric speech synthesizer for patients awaiting total laryngectomy. As a result, the solution would improve the patient’s quality of life as well as mental wellbeing.
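As a rough illustration of how MUSHRA-style listening scores such as the 69.4 and 63.63 reported above are typically aggregated, here is a minimal Python sketch; the listener scores in it are placeholders, not the study's data.

# Minimal sketch of MUSHRA-style score aggregation (placeholder data,
# not the authors' actual ratings): each listener rates each system 0-100.
import statistics

ratings = {
    "patient_voice": [72, 65, 70, 68, 71],   # hypothetical listener scores
    "voice_talent":  [61, 66, 60, 67, 64],
}

for system, scores in ratings.items():
    mean = statistics.mean(scores)
    ci95 = 1.96 * statistics.stdev(scores) / len(scores) ** 0.5
    print(f"{system}: mean={mean:.2f} +/- {ci95:.2f} (95% CI)")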
2

Chee Yong, Lau, Oliver Watts, and Simon King. "Combining Lightly-supervised Learning and User Feedback to Construct and Improve a Statistical Parametric Speech Synthesizer for Malay." Research Journal of Applied Sciences, Engineering and Technology 11, no. 11 (December 15, 2015): 1227–32. http://dx.doi.org/10.19026/rjaset.11.2229.

3

Coto-Jiménez, Marvin. "Discriminative Multi-Stream Postfilters Based on Deep Learning for Enhancing Statistical Parametric Speech Synthesis." Biomimetics 6, no. 1 (February 7, 2021): 12. http://dx.doi.org/10.3390/biomimetics6010012.

Abstract:
Statistical parametric speech synthesis based on Hidden Markov Models (HMM) has been an important technique for the production of artificial voices, due to its ability to produce results with high intelligibility and sophisticated features such as voice conversion and accent modification with a small footprint, particularly for low-resource languages where deep learning-based techniques remain unexplored. Despite this progress, the quality of HMM-based results does not reach that of the predominant approaches, based on unit selection of speech segments or on deep learning. One of the proposals to improve the quality of HMM-based speech has been to incorporate postfiltering stages, which aim to increase the quality while preserving the advantages of the process. In this paper, we present a new approach to postfiltering synthesized voices through the application of discriminative postfilters, built with several long short-term memory (LSTM) deep neural networks. Our motivation stems from modeling a specific mapping from synthesized to natural speech on those segments corresponding to voiced or unvoiced sounds, given the different qualities of those sounds and how HMM-based voices can present distinct degradation in each. The paper analyses the discriminative postfilters obtained using five voices, evaluated with objective measures, including the Mel-cepstral distance, and subjective tests. The results indicate the advantages of the discriminative postfilters in comparison with the HTS voice and the non-discriminative postfilters.
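For readers unfamiliar with the general setup, the following is a hedged Python (PyTorch) sketch of a discriminative postfilter arrangement in the spirit of this entry: separate LSTM mappings from synthesized to natural features for voiced and unvoiced frames. All dimensions, data, and the training step are illustrative assumptions, not the paper's implementation.

# Sketch: one LSTM postfilter per sound class (voiced / unvoiced),
# each mapping synthesized feature frames to natural feature frames.
import torch
import torch.nn as nn

class LSTMPostfilter(nn.Module):
    def __init__(self, feat_dim=40, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, feat_dim)

    def forward(self, x):                 # x: (batch, frames, feat_dim)
        h, _ = self.lstm(x)
        return self.out(h)                # enhanced features

voiced_pf, unvoiced_pf = LSTMPostfilter(), LSTMPostfilter()
loss_fn = nn.MSELoss()
opt = torch.optim.Adam(list(voiced_pf.parameters()) + list(unvoiced_pf.parameters()))

# One illustrative step on random stand-ins for (synthetic, natural) pairs.
synth_v, nat_v = torch.randn(8, 100, 40), torch.randn(8, 100, 40)
synth_u, nat_u = torch.randn(8, 100, 40), torch.randn(8, 100, 40)
loss = loss_fn(voiced_pf(synth_v), nat_v) + loss_fn(unvoiced_pf(synth_u), nat_u)
opt.zero_grad(); loss.backward(); opt.step()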
4

Coto-Jiménez, Marvin. "Improving Post-Filtering of Artificial Speech Using Pre-Trained LSTM Neural Networks." Biomimetics 4, no. 2 (May 28, 2019): 39. http://dx.doi.org/10.3390/biomimetics4020039.

Abstract:
Several researchers have explored deep learning-based post-filters to increase the quality of statistical parametric speech synthesis; such post-filters perform a mapping of the synthetic speech to the natural speech, considering the different parameters separately and trying to reduce the gap between them. Long Short-Term Memory (LSTM) neural networks have been applied successfully for this purpose, but there are still many aspects to improve in the results and in the process itself. In this paper, we introduce a new pre-training approach for the LSTM, with the objective of enhancing the quality of the synthesized speech, particularly in the spectrum, in a more efficient manner. Our approach begins with an auto-associative training of one LSTM network, which is then used as an initialization for the post-filters. We show the advantages of this initialization for enhancing the Mel-frequency cepstral parameters of synthetic speech. Results show that the initialization achieves better results in enhancing the statistical parametric speech spectrum in most cases when compared to the common random initialization of the networks.
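The pre-training idea can be sketched as a two-stage procedure: train an LSTM auto-associatively on natural features (target equals input), then fine-tune the same weights as a synthetic-to-natural postfilter. The PyTorch sketch below uses made-up dimensions and random stand-in data.

import torch
import torch.nn as nn

class Postfilter(nn.Module):
    def __init__(self, dim=25, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, dim)

    def forward(self, x):
        h, _ = self.lstm(x)
        return self.out(h)

net = Postfilter()
opt = torch.optim.Adam(net.parameters())
mse = nn.MSELoss()

natural = torch.randn(4, 200, 25)     # stand-in for natural cepstral sequences
for _ in range(10):                   # stage 1: auto-associative pre-training
    opt.zero_grad()
    mse(net(natural), natural).backward()
    opt.step()

synthetic = torch.randn(4, 200, 25)   # stand-in for aligned synthetic features
for _ in range(10):                   # stage 2: fine-tune as a postfilter
    opt.zero_grad()
    mse(net(synthetic), natural).backward()
    opt.step()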
5

Trinh, Son, and Kiem Hoang. "HMM-Based Vietnamese Speech Synthesis." International Journal of Software Innovation 3, no. 4 (October 2015): 33–47. http://dx.doi.org/10.4018/ijsi.2015100103.

Abstract:
In this paper, improving the naturalness of HMM-based speech synthesis for the Vietnamese language is described. With this synthesis method, trajectories of speech parameters are generated from trained Hidden Markov models, and a final speech waveform is synthesized from those speech parameters. The main objective of the development is to achieve maximum naturalness in the output speech through three key points. Firstly, the system uses a high-quality recorded Vietnamese speech database appropriate for training, especially in the statistical parametric modeling approach. Secondly, prosodic information such as tone, POS (part of speech), and features based on the characteristics of the Vietnamese language is added to ensure the quality of the synthetic speech. Thirdly, the system uses STRAIGHT, which has shown its ability to produce high-quality voice manipulation and has been successfully incorporated into HMM-based speech synthesis. The results collected show that the speech produced by our system achieves the best result when compared with other Vietnamese TTS systems trained on the same speech data.
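A minimal sketch of the label-enrichment step described here, adding tone and POS to the context features used for HMM training; the label format, tone index, and POS tag are made-up simplifications, not the authors' scheme.

# Illustrative context-label builder (hypothetical format, not the paper's).
def context_label(phone, tone, pos, prev_phone, next_phone):
    return f"{prev_phone}-{phone}+{next_phone}/TONE:{tone}/POS:{pos}"

# Example with placeholder values for one phone in context.
print(context_label("a", tone=2, pos="N", prev_phone="m", next_phone="sil"))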
6

Zen, Heiga, Keiichi Tokuda, and Alan W. Black. "Statistical parametric speech synthesis." Speech Communication 51, no. 11 (November 2009): 1039–64. http://dx.doi.org/10.1016/j.specom.2009.04.004.

7

Ekpenyong, Moses, Eno-Abasi Urua, Oliver Watts, Simon King, and Junichi Yamagishi. "Statistical parametric speech synthesis for Ibibio." Speech Communication 56 (January 2014): 243–51. http://dx.doi.org/10.1016/j.specom.2013.02.003.

8

Chen, Sin‐Horng, Saga Chang, and Su‐Min Lee. "A statistical model based fundamental frequency synthesizer for Mandarin speech." Journal of the Acoustical Society of America 92, no. 1 (July 1992): 114–20. http://dx.doi.org/10.1121/1.404276.

9

Takahashi, Sateshi, Yasuaki Satoh, Takeshi Ohno, and Katsuhiko Shirai. "Statistical modeling of dynamic spectral patterns for a speech synthesizer." Journal of the Acoustical Society of America 84, S1 (November 1988): S23. http://dx.doi.org/10.1121/1.2026230.

10

King, Simon. "An introduction to statistical parametric speech synthesis." Sadhana 36, no. 5 (October 2011): 837–52. http://dx.doi.org/10.1007/s12046-011-0048-y.

11

Maia, Ranniery, Masami Akamine, and Mark J. F. Gales. "Complex cepstrum for statistical parametric speech synthesis." Speech Communication 55, no. 5 (June 2013): 606–18. http://dx.doi.org/10.1016/j.specom.2012.12.008.

12

Shannon, M., Heiga Zen, and W. Byrne. "Autoregressive Models for Statistical Parametric Speech Synthesis." IEEE Transactions on Audio, Speech, and Language Processing 21, no. 3 (March 2013): 587–97. http://dx.doi.org/10.1109/tasl.2012.2227740.

13

Holtse, Peter, and Anders Olsen. "SPL: A speech synthesis programming language." Annual Report of the Institute of Phonetics University of Copenhagen 19 (January 1, 1985): 1–42. http://dx.doi.org/10.7146/aripuc.v19i.131806.

Abstract:
This report describes the first version of a high-level computer programming language for experiments with synthetic speech. In SPL a context-sensitive parser is programmed to recognize linguistic constructs in an input string. Both the structural and phonetic descriptions of the recognized structures may be modified under program control. The final output of an SPL program is a data stream capable of driving a parametric speech synthesizer. The notation used is based on the principles known from Chomsky and Halle's "The Sound Pattern of English". This means that in principle all linguistic constructs are programmed in segmental units. However, in SPL certain macro facilities have been provided for more complicated units such as syllables or words.
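The Sound Pattern of English rule notation that SPL builds on, A → B / C _ D (rewrite A as B only between contexts C and D), can be mimicked in a few lines of Python; this toy sketch illustrates the notation only, and is not SPL itself.

import re

def apply_rule(s, a, b, left, right):
    # Contexts go into lookbehind/lookahead so they are matched, not consumed.
    pattern = f"(?<={re.escape(left)}){re.escape(a)}(?={re.escape(right)})"
    return re.sub(pattern, b, s)

# Hypothetical rule: devoice 'z' to 's' between 'o' and a word boundary '#'.
print(apply_rule("loz#", "z", "s", "o", "#"))  # -> los#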
14

Yong. "LOW FOOTPRINT HIGH INTELLIGIBILITY MALAY SPEECH SYNTHESIZER BASED ON STATISTICAL DATA." Journal of Computer Science 10, no. 2 (February 1, 2014): 316–24. http://dx.doi.org/10.3844/jcssp.2014.316.324.

15

Khudoyberdiev, Khurshed A. "The Algorithms of Tajik Speech Synthesis by Syllable." ITM Web of Conferences 35 (2020): 07003. http://dx.doi.org/10.1051/itmconf/20203507003.

Abstract:
This article is devoted to the development of a prototype computer synthesizer of Tajik speech from text. The need for such a synthesizer arises from the fact that its analogues for other languages not only help people with visual and speech impairments, but also find ever wider application in communication technology and in information and reference systems. In the future, such programs will take their proper place in the broad acoustic dialogue of humans with automatic machines and robotics in various fields of human activity. The article describes the author's prototype of a Tajik text-to-speech synthesizer, constructed on the principle of a concatenative synthesizer in which the syllable is chosen as the speech unit; this, in turn, calls for the most complete possible description of the variety of Tajik syllables. To study the patterns of the Tajik language associated with the concept of the syllable, the concept of the "syllabic structure of the word" was introduced. The statistical distribution of these structures is obtained, i.e., a correspondence is established between the syllabic structures of words and the frequencies of their occurrence in texts in the Tajik language. An algorithm for breaking Tajik words into syllables is proposed and implemented as a computer program. A solution to the problem of Tajik speech synthesis from an arbitrary text is proposed. The article describes the computer implementation of the algorithm for the synchronization of words, numbers, characters, and text. For each syllable the corresponding sound realization is extracted from the "syllable-sound" database, and the sound of the word is then synthesized from the extracted elements.
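The concatenative scheme described, syllabify, look up each syllable's sound realization, and join, can be sketched as follows; the syllable store, waveforms, and example word are hypothetical placeholders, not the author's algorithm or database.

import numpy as np

syllable_db = {                      # syllable -> waveform (fake sine stand-ins)
    "ki": np.sin(np.linspace(0, 2 * np.pi * 100, 8000)),
    "tob": np.sin(np.linspace(0, 2 * np.pi * 150, 8000)),
}

def synthesize(syllables):
    # Concatenate the stored sound realization of each syllable in order.
    return np.concatenate([syllable_db[s] for s in syllables])

waveform = synthesize(["ki", "tob"])   # e.g., "kitob"
print(waveform.shape)                  # (16000,)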
16

Fagel, Sascha. "Merging methods of speech visualization." ZAS Papers in Linguistics 40 (January 1, 2005): 19–32. http://dx.doi.org/10.21248/zaspil.40.2005.255.

Abstract:
The author presents MASSY, the MODULAR AUDIOVISUAL SPEECH SYNTHESIZER. The system combines two approaches to visual speech synthesis. Two control models are implemented: a (data-based) di-viseme model and a (rule-based) dominance model, both of which produce control commands in a parameterized articulation space. Analogously, two visualization methods are implemented: an image-based (video-realistic) face model and a 3D synthetic head. Both face models can be driven by both the data-based and the rule-based articulation model. The high-level visual speech synthesis generates a sequence of control commands for the visible articulation. For every virtual articulator (articulation parameter), the 3D synthetic face model defines a set of displacement vectors for the vertices of the 3D objects of the head. The vertices of the 3D synthetic head are then moved by linear combinations of these displacement vectors to visualize articulation movements. For the image-based video synthesis, a single reference image is deformed to fit the facial properties derived from the control commands. Facial feature points and facial displacements have to be defined for the reference image. The algorithm can also use an image database with appropriately annotated facial properties; an example database was built automatically from video recordings. Both the 3D synthetic face and the image-based face generate visual speech that is capable of increasing the intelligibility of audible speech. Other well-known image-based audiovisual speech synthesis systems like MIKETALK and VIDEO REWRITE concatenate pre-recorded single images or video sequences, respectively. Parametric talking heads like BALDI control a parametric face with a parametric articulation model. The presented system demonstrates the compatibility of parametric and data-based visual speech synthesis approaches.
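The 3D head's control scheme, moving vertices by linear combinations of per-parameter displacement vectors, reduces to a few lines of linear algebra; the sketch below uses random stand-in geometry and made-up parameter names, not MASSY's data.

import numpy as np

n_vertices = 5
rest = np.random.rand(n_vertices, 3)              # neutral vertex positions

# One set of per-vertex displacement vectors per articulation parameter.
displacements = {
    "jaw_open":     np.random.rand(n_vertices, 3) * 0.1,
    "lip_rounding": np.random.rand(n_vertices, 3) * 0.05,
}

def deform(weights):
    # weights: articulation parameter values from the control model
    return rest + sum(w * displacements[k] for k, w in weights.items())

frame = deform({"jaw_open": 0.8, "lip_rounding": 0.2})
print(frame.shape)   # (5, 3)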
17

Saito, Yuki, Shinnosuke Takamichi, and Hiroshi Saruwatari. "Statistical Parametric Speech Synthesis Incorporating Generative Adversarial Networks." IEEE/ACM Transactions on Audio, Speech, and Language Processing 26, no. 1 (January 2018): 84–96. http://dx.doi.org/10.1109/taslp.2017.2761547.

18

Koriyama, Tomoki, and Takao Kobayashi. "Statistical Parametric Speech Synthesis Using Deep Gaussian Processes." IEEE/ACM Transactions on Audio, Speech, and Language Processing 27, no. 5 (May 2019): 948–59. http://dx.doi.org/10.1109/taslp.2019.2905167.

19

Liu, Zheng-Chen, Zhen-Hua Ling, and Li-Rong Dai. "Statistical Parametric Speech Synthesis Using Generalized Distillation Framework." IEEE Signal Processing Letters 25, no. 5 (May 2018): 695–99. http://dx.doi.org/10.1109/lsp.2018.2819886.

20

Zen, Heiga, Mark J. F. Gales, Yoshihiko Nankaku, and Keiichi Tokuda. "Product of Experts for Statistical Parametric Speech Synthesis." IEEE Transactions on Audio, Speech, and Language Processing 20, no. 3 (March 2012): 794–805. http://dx.doi.org/10.1109/tasl.2011.2165280.

21

Brumberg, Jonathan S., and Kevin M. Pitt. "Motor-Induced Suppression of the N100 Event-Related Potential During Motor Imagery Control of a Speech Synthesizer Brain–Computer Interface." Journal of Speech, Language, and Hearing Research 62, no. 7 (July 15, 2019): 2133–40. http://dx.doi.org/10.1044/2019_jslhr-s-msc18-18-0198.

Abstract:
Purpose: Speech motor control relies on neural processes for generating sensory expectations using an efference copy mechanism to maintain accurate productions. The N100 auditory event-related potential (ERP) has been identified as a possible neural marker of the efference copy, with a reduced amplitude during active listening while speaking when compared to passive listening. This study investigates N100 suppression while controlling a motor imagery speech synthesizer brain–computer interface (BCI) with instantaneous auditory feedback, to determine whether similar mechanisms are used for monitoring BCI-based speech output, which may both support BCI learning through existing speech motor networks and be used as a clinical marker for speech network integrity in individuals without severe speech and physical impairments. Method: The motor-induced N100 suppression is examined based on data from 10 participants who controlled a BCI speech synthesizer using limb motor imagery. We considered listening to auditory target stimuli (without motor imagery) in the BCI study as passive listening, and listening to BCI-controlled speech output (with motor imagery) as active listening, since the audio output depends on imagined movements. The resulting ERP was assessed for statistical significance using a mixed-effects general linear model. Results: Statistically significant N100 ERP amplitude differences were observed between active and passive listening during the BCI task. Post hoc analyses confirm the N100 amplitude was suppressed during active listening. Conclusion: Observation of the N100 suppression suggests motor planning brain networks are active as participants control the BCI synthesizer, which may aid speech BCI mastery.
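The reported analysis style can be sketched with statsmodels: N100 amplitude modeled with listening condition as a fixed effect and participant as the grouping (random) factor. The data frame below is simulated, not the study's recordings, and the effect size is arbitrary.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "subject": np.repeat([f"s{i}" for i in range(10)], 20),
    "condition": np.tile(["active", "passive"], 100),
})
# Simulate suppression: the N100 is negative, so a suppressed (reduced)
# amplitude in the active condition is modeled as a shift toward zero.
df["amplitude"] = rng.normal(-4.0, 1.0, 200) + (df["condition"] == "active") * 1.5

model = smf.mixedlm("amplitude ~ condition", df, groups=df["subject"]).fit()
print(model.summary())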
22

Přibil, Jiří, Anna Přibilová, and Jindřich Matoušek. "Automatic statistical evaluation of quality of unit selection speech synthesis with different prosody manipulations." Journal of Electrical Engineering 71, no. 2 (April 1, 2020): 78–86. http://dx.doi.org/10.2478/jee-2020-0012.

Abstract:
Quality of speech synthesis is a crucial issue in the comparison of various text-to-speech (TTS) systems. We proposed a system for automatic evaluation of speech quality by statistical analysis of temporal features (time duration, phrasing, and time structuring of an analysed sentence) together with standard spectral and prosodic features. This system was successfully tested on sentences produced by a unit selection speech synthesizer with a male as well as a female voice, using two different approaches to prosody manipulation. Experiments have shown that for correct, sharp, and stable results all three types of speech features (spectral, prosodic, and temporal) are necessary. Furthermore, the number of statistical parameters used has a significant impact on the correctness and precision of the evaluated results. It was also demonstrated that the stability of the whole evaluation process is improved by enlarging the speech material used. Finally, the functionality of the proposed system was verified by comparing its results with those of a standard listening test.
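A minimal sketch of this evaluation idea: compute a sentence-level feature for two prosody variants and compare the distributions statistically. The feature values below are random placeholders, and Welch's t-test stands in for whatever statistics the authors actually used.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# e.g., per-sentence F0 standard deviation (Hz) for two prosody variants
system_a = rng.normal(25.0, 4.0, 50)
system_b = rng.normal(28.0, 4.0, 50)

t, p = stats.ttest_ind(system_a, system_b, equal_var=False)  # Welch's t-test
print(f"t={t:.2f}, p={p:.4f}")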
23

Wang, Xin, Shinji Takaki, and Junichi Yamagishi. "Autoregressive Neural F0 Model for Statistical Parametric Speech Synthesis." IEEE/ACM Transactions on Audio, Speech, and Language Processing 26, no. 8 (August 2018): 1406–19. http://dx.doi.org/10.1109/taslp.2018.2828650.

24

Koriyama, Tomoki, Takashi Nose, and Takao Kobayashi. "Statistical Parametric Speech Synthesis Based on Gaussian Process Regression." IEEE Journal of Selected Topics in Signal Processing 8, no. 2 (April 2014): 173–83. http://dx.doi.org/10.1109/jstsp.2013.2283461.

25

Tao, Jianhua, Keikichi Hirose, Keiichi Tokuda, Alan W. Black, and Simon King. "Introduction to the Issue on Statistical Parametric Speech Synthesis." IEEE Journal of Selected Topics in Signal Processing 8, no. 2 (April 2014): 170–72. http://dx.doi.org/10.1109/jstsp.2014.2309416.

26

Cai, Ming-Qi, Zhen-Hua Ling, and Li-Rong Dai. "Statistical parametric speech synthesis using a hidden trajectory model." Speech Communication 72 (September 2015): 149–59. http://dx.doi.org/10.1016/j.specom.2015.05.008.

27

Saheer, Lakshmi, John Dines, and Philip N. Garner. "Vocal Tract Length Normalization for Statistical Parametric Speech Synthesis." IEEE Transactions on Audio, Speech, and Language Processing 20, no. 7 (September 2012): 2134–48. http://dx.doi.org/10.1109/tasl.2012.2198058.

28

Al-Radhi, Mohammed Salah, Tamás Gábor Csapó, and Géza Németh. "Continuous Noise Masking Based Vocoder for Statistical Parametric Speech Synthesis." IEICE Transactions on Information and Systems E103.D, no. 5 (May 1, 2020): 1099–107. http://dx.doi.org/10.1587/transinf.2019edp7167.

29

Wang, Xin, Shinji Takaki, and Junichi Yamagishi. "Neural Source-Filter Waveform Models for Statistical Parametric Speech Synthesis." IEEE/ACM Transactions on Audio, Speech, and Language Processing 28 (2020): 402–15. http://dx.doi.org/10.1109/taslp.2019.2956145.

30

Reddy, M. Kiran, and K. Sreenivasa Rao. "Excitation modelling using epoch features for statistical parametric speech synthesis." Computer Speech & Language 60 (March 2020): 101029. http://dx.doi.org/10.1016/j.csl.2019.101029.

31

Achanta, Sivanand, and Suryakanth V. Gangashetty. "Deep Elman recurrent neural networks for statistical parametric speech synthesis." Speech Communication 93 (October 2017): 31–42. http://dx.doi.org/10.1016/j.specom.2017.08.003.

32

Yu, Kai, and Steve Young. "Continuous F0 Modeling for HMM Based Statistical Parametric Speech Synthesis." IEEE Transactions on Audio, Speech, and Language Processing 19, no. 5 (July 2011): 1071–79. http://dx.doi.org/10.1109/tasl.2010.2076805.

33

Zen, H., N. Braunschweiler, S. Buchholz, M. J. F. Gales, K. Knill, S. Krstulovic, and J. Latorre. "Statistical Parametric Speech Synthesis Based on Speaker and Language Factorization." IEEE Transactions on Audio, Speech, and Language Processing 20, no. 6 (August 2012): 1713–24. http://dx.doi.org/10.1109/tasl.2012.2187195.

34

Adiga, Nagaraj, and S. R. M. Prasanna. "Acoustic Features Modelling for Statistical Parametric Speech Synthesis: A Review." IETE Technical Review 36, no. 2 (March 21, 2018): 130–49. http://dx.doi.org/10.1080/02564602.2018.1432422.

35

Barra-Chicote, Roberto, Junichi Yamagishi, Simon King, Juan Manuel Montero, and Javier Macias-Guarasa. "Analysis of statistical parametric and unit selection speech synthesis systems applied to emotional speech." Speech Communication 52, no. 5 (May 2010): 394–404. http://dx.doi.org/10.1016/j.specom.2009.12.007.

36

Chee Yong, Lau, and Tan Tian Swee. "Statistical Parametric Speech Synthesis of Malay Language using Found Training Data." Research Journal of Applied Sciences, Engineering and Technology 7, no. 24 (June 25, 2014): 5143–47. http://dx.doi.org/10.19026/rjaset.7.910.

37

Csapó, Tamás Gábor, and Géza Németh. "Statistical parametric speech synthesis with a novel codebook-based excitation model." Intelligent Decision Technologies 8, no. 4 (June 27, 2014): 289–99. http://dx.doi.org/10.3233/idt-140197.

38

Takamichi, Shinnosuke, Tomoki Toda, Alan W. Black, Graham Neubig, Sakriani Sakti, and Satoshi Nakamura. "Postfilters to Modify the Modulation Spectrum for Statistical Parametric Speech Synthesis." IEEE/ACM Transactions on Audio, Speech, and Language Processing 24, no. 4 (April 2016): 755–67. http://dx.doi.org/10.1109/taslp.2016.2522655.

39

Erro, Daniel, Inaki Sainz, Eva Navas, and Inma Hernaez. "Harmonics Plus Noise Model Based Vocoder for Statistical Parametric Speech Synthesis." IEEE Journal of Selected Topics in Signal Processing 8, no. 2 (April 2014): 184–94. http://dx.doi.org/10.1109/jstsp.2013.2283471.

40

Chen, Ling-Hui, Tuomo Raitio, Cassia Valentini-Botinhao, Zhen-Hua Ling, and Junichi Yamagishi. "A Deep Generative Architecture for Postfiltering in Statistical Parametric Speech Synthesis." IEEE/ACM Transactions on Audio, Speech, and Language Processing 23, no. 11 (November 2015): 2003–14. http://dx.doi.org/10.1109/taslp.2015.2461448.

41

Adiga, Nagaraj, Banriskhem K. Khonglah, and S. R. Mahadeva Prasanna. "Improved voicing decision using glottal activity features for statistical parametric speech synthesis." Digital Signal Processing 71 (December 2017): 131–43. http://dx.doi.org/10.1016/j.dsp.2017.09.007.

42

Airaksinen, Manu, Lauri Juvela, Bajibabu Bollepalli, Junichi Yamagishi, and Paavo Alku. "A Comparison Between STRAIGHT, Glottal, and Sinusoidal Vocoding in Statistical Parametric Speech Synthesis." IEEE/ACM Transactions on Audio, Speech, and Language Processing 26, no. 9 (September 2018): 1658–70. http://dx.doi.org/10.1109/taslp.2018.2835720.

43

Csapó, Tamás Gábor, and Géza Németh. "Modeling Irregular Voice in Statistical Parametric Speech Synthesis With Residual Codebook Based Excitation." IEEE Journal of Selected Topics in Signal Processing 8, no. 2 (April 2014): 209–20. http://dx.doi.org/10.1109/jstsp.2013.2292037.

44

Souza, Fernando, and Adolfo Maia Jr. "A Mathematical, Graphical and Visual Approach to Granular Synthesis Composition." Revista Vórtex 9, no. 2 (December 10, 2021): 1–27. http://dx.doi.org/10.33871/23179937.2021.9.2.4.

Abstract:
We show a method for Granular Synthesis Composition based on mathematical modeling of the musical gesture. Each gesture is drawn as a curve generated from a particular mathematical model (or function) and coded as a MATLAB script. The gestures can be deterministic, defined through mathematical time functions, drawn freehand, or even randomly generated. This parametric information of the gestures is interpreted through OSC messages by a granular synthesizer (Granular Streamer). The musical composition is then realized with the models (scripts) written in MATLAB and exported to a graphical score (Granular Score). The method is amenable to statistical analysis of the granular sound streams and of the final music composition. We also offer a way to create granular streams based on correlated pairs of grain parameters.
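The gesture-to-synthesizer link described here can be sketched in Python with the python-osc package: evaluate a deterministic gesture curve and stream its parameter values as OSC messages. The OSC address and port below are assumptions, not the paper's protocol (and the paper itself works in MATLAB).

import math
from pythonosc.udp_client import SimpleUDPClient

client = SimpleUDPClient("127.0.0.1", 57120)   # hypothetical receiver

for i in range(100):
    t = i / 100.0
    density = 20 + 10 * math.sin(2 * math.pi * t)   # grains per second
    duration = 0.05 + 0.02 * t                      # grain length in seconds
    client.send_message("/grain", [density, duration])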
45

Al-Radhi, Mohammed Salah, Tamás Gábor Csapó, and Géza Németh. "Adaptive Refinements of Pitch Tracking and HNR Estimation within a Vocoder for Statistical Parametric Speech Synthesis." Applied Sciences 9, no. 12 (June 16, 2019): 2460. http://dx.doi.org/10.3390/app9122460.

Abstract:
Recent studies in text-to-speech synthesis have shown the benefit of using a continuous pitch estimate: one that interpolates fundamental frequency (F0) even when voicing is not present. However, continuous F0 is still sensitive to additive noise in speech signals and suffers from short-term errors (when it changes rather quickly over time). To alleviate these issues, three adaptive techniques have been developed in this article for achieving a robust and accurate F0: (1) we weight the pitch estimates with the state noise covariance using an adaptive Kalman-filter framework, (2) we iteratively apply a time-axis warping to the input frame signal, (3) we optimize all F0 candidates using an instantaneous-frequency-based approach. Additionally, the second goal of this study is to introduce an extension of a novel continuous-based speech synthesis system (i.e., one in which all parameters are continuous). We propose adding a new excitation parameter named Harmonic-to-Noise Ratio (HNR) to the voiced and unvoiced components to indicate the degree of voicing in the excitation and to reduce the buzziness caused by the vocoder. Results based on objective and perceptual tests demonstrate that the voice built with the proposed framework gives state-of-the-art speech synthesis performance while outperforming the previous baseline.
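The Kalman-filtering idea in technique (1) can be illustrated with a minimal one-dimensional filter over a continuous F0 track; the constant-value state model and fixed noise covariances below are simplifications of the paper's adaptive scheme, not its implementation.

import numpy as np

def kalman_smooth_f0(f0_obs, q=1.0, r=25.0):
    # q: process noise variance, r: observation noise variance (illustrative)
    x, p = f0_obs[0], 1.0          # state estimate and its variance
    out = np.empty_like(f0_obs)
    for i, z in enumerate(f0_obs):
        p += q                     # predict: variance grows by process noise
        k = p / (p + r)            # Kalman gain
        x += k * (z - x)           # update with observation z
        p *= (1 - k)
        out[i] = x
    return out

noisy = 120 + 5 * np.random.randn(200)    # noisy continuous F0 track (Hz)
print(kalman_smooth_f0(noisy)[:5])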
46

Mazenan, Mohd Nizam, Tan Tian Swee, Tan Hui Ru, and Azran Azhim. "Statistical Parametric Evaluation on New Corpus Design for Malay Speech Articulation Disorder Early Diagnosis." American Journal of Applied Sciences 12, no. 7 (July 1, 2015): 452–62. http://dx.doi.org/10.3844/ajassp.2015.452.462.

47

Juvela, Lauri, Bajibabu Bollepalli, Vassilis Tsiaras, and Paavo Alku. "GlotNet—A Raw Waveform Model for the Glottal Excitation in Statistical Parametric Speech Synthesis." IEEE/ACM Transactions on Audio, Speech, and Language Processing 27, no. 6 (June 2019): 1019–30. http://dx.doi.org/10.1109/taslp.2019.2906484.

48

Maia, Ranniery, and Masami Akamine. "On the impact of excitation and spectral parameters for expressive statistical parametric speech synthesis." Computer Speech & Language 28, no. 5 (September 2014): 1209–32. http://dx.doi.org/10.1016/j.csl.2013.10.001.

49

Yu, Kai, Heiga Zen, François Mairesse, and Steve Young. "Context adaptive training with factorized decision trees for HMM-based statistical parametric speech synthesis." Speech Communication 53, no. 6 (July 2011): 914–23. http://dx.doi.org/10.1016/j.specom.2011.03.003.

50

Raitio, Tuomo, Lauri Juvela, Antti Suni, Martti Vainio, and Paavo Alku. "Phase perception of the glottal excitation and its relevance in statistical parametric speech synthesis." Speech Communication 81 (July 2016): 104–19. http://dx.doi.org/10.1016/j.specom.2016.01.007.
