A selection of scholarly literature on the topic "Statistical Parametric Speech Synthesizer"
Format your source according to APA, MLA, Chicago, Harvard, and other styles
Consult the lists of relevant articles, books, theses, conference papers, and other scholarly sources on the topic "Statistical Parametric Speech Synthesizer".
Next to each source in the list of references there is an "Add to bibliography" button. Use it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the scholarly publication as a .pdf file and read its abstract online, provided the corresponding details are available in the work's metadata.
Journal articles on the topic "Statistical Parametric Speech Synthesizer"
Szklanny, Krzysztof, and Jakub Lachowicz. "Implementing a Statistical Parametric Speech Synthesis System for a Patient with Laryngeal Cancer." Sensors 22, no. 9 (April 21, 2022): 3188. http://dx.doi.org/10.3390/s22093188.
Chee Yong, Lau, Oliver Watts, and Simon King. "Combining Lightly-supervised Learning and User Feedback to Construct and Improve a Statistical Parametric Speech Synthesizer for Malay." Research Journal of Applied Sciences, Engineering and Technology 11, no. 11 (December 15, 2015): 1227–32. http://dx.doi.org/10.19026/rjaset.11.2229.
Coto-Jiménez, Marvin. "Discriminative Multi-Stream Postfilters Based on Deep Learning for Enhancing Statistical Parametric Speech Synthesis." Biomimetics 6, no. 1 (February 7, 2021): 12. http://dx.doi.org/10.3390/biomimetics6010012.
Coto-Jiménez, Marvin. "Improving Post-Filtering of Artificial Speech Using Pre-Trained LSTM Neural Networks." Biomimetics 4, no. 2 (May 28, 2019): 39. http://dx.doi.org/10.3390/biomimetics4020039.
Trinh, Son, and Kiem Hoang. "HMM-Based Vietnamese Speech Synthesis." International Journal of Software Innovation 3, no. 4 (October 2015): 33–47. http://dx.doi.org/10.4018/ijsi.2015100103.
Zen, Heiga, Keiichi Tokuda, and Alan W. Black. "Statistical parametric speech synthesis." Speech Communication 51, no. 11 (November 2009): 1039–64. http://dx.doi.org/10.1016/j.specom.2009.04.004.
Ekpenyong, Moses, Eno-Abasi Urua, Oliver Watts, Simon King, and Junichi Yamagishi. "Statistical parametric speech synthesis for Ibibio." Speech Communication 56 (January 2014): 243–51. http://dx.doi.org/10.1016/j.specom.2013.02.003.
Chen, Sin‐Horng, Saga Chang, and Su‐Min Lee. "A statistical model based fundamental frequency synthesizer for Mandarin speech." Journal of the Acoustical Society of America 92, no. 1 (July 1992): 114–20. http://dx.doi.org/10.1121/1.404276.
Takahashi, Satoshi, Yasuaki Satoh, Takeshi Ohno, and Katsuhiko Shirai. "Statistical modeling of dynamic spectral patterns for a speech synthesizer." Journal of the Acoustical Society of America 84, S1 (November 1988): S23. http://dx.doi.org/10.1121/1.2026230.
King, Simon. "An introduction to statistical parametric speech synthesis." Sadhana 36, no. 5 (October 2011): 837–52. http://dx.doi.org/10.1007/s12046-011-0048-y.
Повний текст джерелаДисертації з теми "Statistical Parametric Speech Synthesizer"
Hu, Qiong. "Statistical parametric speech synthesis based on sinusoidal models." Thesis, University of Edinburgh, 2017. http://hdl.handle.net/1842/28719.
Merritt, Thomas. "Overcoming the limitations of statistical parametric speech synthesis." Thesis, University of Edinburgh, 2017. http://hdl.handle.net/1842/22071.
Dall, Rasmus. "Statistical parametric speech synthesis using conversational data and phenomena." Thesis, University of Edinburgh, 2017. http://hdl.handle.net/1842/29016.
Fonseca De Sam Bento Ribeiro, Manuel. "Suprasegmental representations for the modeling of fundamental frequency in statistical parametric speech synthesis." Thesis, University of Edinburgh, 2018. http://hdl.handle.net/1842/31338.
Hong, Jung. "Statistical Parametric Models and Inference for Biomedical Signal Processing: Applications in Speech and Magnetic Resonance Imaging." Thesis, Harvard University, 2012. http://dissertations.umi.com/gsas.harvard:10074.
Повний текст джерелаEngineering and Applied Sciences
Evrard, Marc. "Synthèse de parole expressive à partir du texte : Des phonostyles au contrôle gestuel pour la synthèse paramétrique statistique." Thesis, Paris 11, 2015. http://www.theses.fr/2015PA112202.
The subject of this thesis was the study and conception of a platform for expressive speech synthesis. The LIPS3 Text-to-Speech system, developed in the context of this thesis, includes a linguistic module and a parametric statistical module (built upon HTS and STRAIGHT). The system was based on a new single-speaker corpus, which was designed, recorded, and annotated. The first study analyzed the influence of the precision of the training-corpus phonetic labeling on synthesis quality. It showed that statistical parametric synthesis is robust to labeling and alignment errors, which addresses the issue of variation in phonetic realizations for expressive speech. The second study presents an acoustic-phonetic analysis of the corpus, characterizing the expressive space used by the speaker to instantiate the instructions that described the different expressive conditions. Voice-source parameters and articulatory settings were analyzed according to their phonetic classes, which allowed for a fine phonostylistic characterization. The third study focused on intonation and rhythm. Calliphony 2.0 is a real-time chironomic interface that controls the f0 and rhythmic parameters of prosody using drawing/writing hand gestures with a stylus and a graphic tablet. These hand-controlled modulations are used to enhance the TTS output, producing speech that is more realistic, without degradation, since they are applied directly to the vocoder parameters. Intonation and rhythm stylization using this interface brings significant improvement to the prototypicality of expressivity, as well as to the general quality of synthetic speech. These studies show that parametric statistical synthesis, combined with a chironomic interface, offers an efficient solution for expressive speech synthesis, as well as a powerful tool for the study of prosody.
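The chironomic control described in this abstract amounts to applying a hand-drawn pitch offset directly to the vocoder's f0 track. A minimal sketch, assuming per-frame f0 values in Hz with 0.0 marking unvoiced frames (a common vocoder convention) and a gesture sampled as a per-frame semitone offset; the function name and input format are illustrative, not Calliphony's actual interface:

```python
def apply_chironomic_f0(f0_hz, gesture_semitones):
    """Shift a synthesizer's per-frame f0 contour by a hand-drawn offset.

    f0_hz: fundamental frequency per frame in Hz, 0.0 = unvoiced frame.
    gesture_semitones: per-frame pitch offset captured from the stylus,
    in semitones (hypothetical input format, for illustration).
    """
    out = []
    for f0, offset in zip(f0_hz, gesture_semitones):
        if f0 > 0.0:
            # Voiced frame: a semitone offset corresponds to a
            # frequency ratio of 2 ** (offset / 12).
            out.append(f0 * 2.0 ** (offset / 12.0))
        else:
            # Unvoiced frame: nothing to shift.
            out.append(f0)
    return out
```

Because the modification happens in the vocoder's parameter domain rather than on the waveform, no resynthesis artifacts are introduced beyond those of the vocoder itself.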
"Statistical Parametric Speech Synthesis using Deep Learning Architectures." 2016. http://repository.lib.cuhk.edu.hk/en/item/cuhk-1292251.
To represent prosodic contexts more precisely, this thesis defines a hierarchical prosodic structure to organize segmental and suprasegmental features, and builds speech synthesis systems with deep learning architectures using a syllable-level representation of that hierarchical structure.
Inspired by the success of the Deep Belief Network (DBN) in handwritten-digit image recognition and generation, this thesis models the speech spectrum and fundamental frequency within the DBN framework. To accommodate prosodic and acoustic parameter data with different distributions, the original DBN is extended into a Weighted Multi-Distribution Deep Belief Network (wMD-DBN). Compared with the conventional Hidden Markov Model (HMM) based approach, the wMD-DBN generates spectra with lower distortion in objective evaluations, and in subjective evaluations it obtains overall results similar to the HMM baseline, demonstrating its advantage.
In speech research, previous deep neural network (DNN) work concentrated mainly on speech recognition, where DNNs serve as classifiers for better acoustic models. This thesis instead uses the DNN as a generative model that maps prosodic features to acoustic features in speech synthesis. Unlike the DBN, which models the joint probability, the DNN models the conditional probability, making the feature mapping more intuitive. As in the wMD-DBN, a multi-distribution output layer is adopted, and specialized loss functions are designed for acoustic features with unusual distributions. For good performance, a generatively pre-trained DBN is used as the model initialization, yielding the Multi-Distribution Deep Neural Network (MD-DNN) architecture. Subjective and objective evaluations show that the MD-DNN synthesizes speech with higher naturalness than the wMD-DBN and HMM models.
This thesis presents a statistical parametric speech synthesis framework using deep learning techniques and models. Existing speech synthesis systems face two main challenges: the complexity of expressing speech prosody through its acoustic realizations, and the sparsity of training data. Both limit the naturalness of synthesized speech. This thesis attempts to improve synthesis performance in terms of speech naturalness by leveraging the modeling power of deep learning architectures.
To precisely represent the linguistic contexts, we defined a hierarchical prosodic structure to organize both the segmental and suprasegmental features, and proposed a syllable-level representation of the hierarchical structure for speech synthesis using deep learning architectures.
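A syllable-level representation of such a hierarchy can be pictured as flattening nested prosodic units into per-syllable positional features. A minimal sketch under assumed, simplified structures; the class names and feature names are hypothetical, not those used in the thesis:

```python
from dataclasses import dataclass

@dataclass
class Syllable:
    phones: list   # segmental units inside the syllable
    stress: int = 0

@dataclass
class Word:
    syllables: list

@dataclass
class Phrase:
    words: list

def syllable_contexts(phrase):
    """Flatten a phrase into one feature dict per syllable, recording
    its position within the word and phrase plus simple counts: the
    kind of positional information a syllable-level input vector
    encodes. Feature names are illustrative only."""
    n_syl_phrase = sum(len(w.syllables) for w in phrase.words)
    feats = []
    idx_phrase = 0
    for wi, word in enumerate(phrase.words):
        for si, syl in enumerate(word.syllables):
            feats.append({
                "syl_in_word": si,
                "word_in_phrase": wi,
                "syl_in_phrase": idx_phrase,
                "n_phones": len(syl.phones),
                "stress": syl.stress,
                "n_syl_phrase": n_syl_phrase,
            })
            idx_phrase += 1
    return feats
```

Each dict would then be numericized into one input vector per syllable for the network.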
Inspired by the Deep Belief Network's (DBN's) success in handwritten-digit image recognition and generation, we propose to model speech spectrograms, in addition to F0 contours, as 2-D images in the DBN framework. To fit the speech prosodic and acoustic parameters, which consist of data with various distributions, we adapt the original model into a Weighted Multi-Distribution DBN (wMD-DBN). Compared with the predominant HMM-based approach, objective evaluation shows that the spectrum generated from the wMD-DBN has less distortion. Subjective tests also confirm the advantage of the wMD-DBN spectrum, and the wMD-DBN system gives a similar overall quality to the HMM baseline.
Previous work on DNNs in the speech community focused mainly on using them as classifiers for better acoustic modeling in the speech recognition task. Here we treat the DNN as a generative model and use it for linguistic-to-acoustic feature mapping in speech synthesis. Compared to the DBN model, the DNN requires only a single computing pass for feature prediction, making it more suitable for real-time synthesis. On the other hand, the DNN models the conditional probability instead of the joint probability as in the DBN model, which is more intuitive for the feature mapping task. As with the wMD-DBN, we adapt the output layer of a plain DNN into a multi-distribution (MD) output layer. We also design specialized loss functions for acoustic features with uncommon distributions. To achieve good performance with a deep model structure, we use the generatively pre-trained DBN as the model initialization to build the MD-DNN architecture. Both objective and subjective evaluations show that the MD-DNN model outperforms the wMD-DBN and HMM in terms of the naturalness of synthesized speech.
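A multi-distribution output layer pairs real-valued acoustic features with discrete ones such as the voiced/unvoiced flag, each trained with a matching loss. A simplified sketch of such a combined loss (squared error for continuous features plus binary cross-entropy for the flag); the thesis's actual output distributions and loss functions differ:

```python
import math

def md_output_loss(pred_spec, true_spec, vuv_logits, vuv_targets):
    """Combined loss for a multi-distribution output layer: mean squared
    error for real-valued spectral features plus binary cross-entropy
    for the voiced/unvoiced flag. A simplification for illustration."""
    # Gaussian-style term for the continuous spectral stream.
    mse = sum((p - t) ** 2 for p, t in zip(pred_spec, true_spec)) / len(pred_spec)
    eps = 1e-12  # guard against log(0)
    bce = 0.0
    for logit, target in zip(vuv_logits, vuv_targets):
        p = 1.0 / (1.0 + math.exp(-logit))  # sigmoid -> P(voiced)
        bce -= target * math.log(p + eps) + (1 - target) * math.log(1 - p + eps)
    bce /= len(vuv_targets)
    return mse + bce
```

Training then backpropagates the summed streams through the shared hidden layers, so one network serves features with different distributions.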
Kang, Shiyin.
Thesis (Ph.D.), Chinese University of Hong Kong, 2016.
Includes bibliographical references (leaves ).
Abstracts also in Chinese.
Title from PDF title page (viewed on …).
Detailed summary in vernacular field only.
Books on the topic "Statistical Parametric Speech Synthesizer"
Rao, K. Sreenivasa, and N. P. Narendra. Source Modeling Techniques for Quality Enhancement in Statistical Parametric Speech Synthesis. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-02759-9.
Source Modeling Techniques for Quality Enhancement in Statistical Parametric Speech Synthesis. Springer, 2018.
Book chapters on the topic "Statistical Parametric Speech Synthesizer"
Smruti, Soumya, Jagyanseni Sahoo, Monalisa Dash, and Mihir N. Mohanty. "An Approach to Design an Intelligent Parametric Synthesizer for Emotional Speech." In Advances in Intelligent Systems and Computing, 367–74. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-12012-6_40.
Al-Radhi, Mohammed Salah, Tamás Gábor Csapó, and Géza Németh. "A Continuous Vocoder Using Sinusoidal Model for Statistical Parametric Speech Synthesis." In Speech and Computer, 11–20. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-99579-3_2.
Coto-Jiménez, Marvin. "Measuring the Effect of Reverberation on Statistical Parametric Speech Synthesis." In Communications in Computer and Information Science, 369–82. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-41005-6_25.
An, Xiaochun, Hongwu Yang, and Zhenye Gan. "Towards Realizing Sign Language-to-Speech Conversion by Combining Deep Learning and Statistical Parametric Speech Synthesis." In Communications in Computer and Information Science, 678–90. Singapore: Springer Singapore, 2016. http://dx.doi.org/10.1007/978-981-10-2053-7_61.
Ma, Dabiao, Zhiba Su, Wenxuan Wang, Yuhao Lu, and Zhen Li. "UFANS: U-Shaped Fully-Parallel Acoustic Neural Structure for Statistical Parametric Speech Synthesis." In PRICAI 2019: Trends in Artificial Intelligence, 273–78. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-29894-4_22.
Singh, Harman, Parminder Singh, and Manjot Kaur Gill. "Statistical Parametric Speech Synthesis for Punjabi Language using Deep Neural Network." In SCRS Conference Proceedings on Intelligent Systems, 431–41. Soft Computing Research Society, 2021. http://dx.doi.org/10.52458/978-93-91842-08-6-41.
Tits, Noé, Kevin El Haddad, and Thierry Dutoit. "The Theory behind Controllable Expressive Speech Synthesis: A Cross-Disciplinary Approach." In Human 4.0 - From Biology to Cybernetic. IntechOpen, 2021. http://dx.doi.org/10.5772/intechopen.89849.
Повний текст джерелаТези доповідей конференцій з теми "Statistical Parametric Speech Synthesizer"
Zen, Heiga, Yannis Agiomyrgiannakis, Niels Egberts, Fergus Henderson, and Przemysław Szczepaniak. "Fast, Compact, and High Quality LSTM-RNN Based Statistical Parametric Speech Synthesizers for Mobile Devices." In Interspeech 2016. ISCA, 2016. http://dx.doi.org/10.21437/interspeech.2016-522.
Black, Alan W., Heiga Zen, and Keiichi Tokuda. "Statistical Parametric Speech Synthesis." In 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07. IEEE, 2007. http://dx.doi.org/10.1109/icassp.2007.367298.
Fagel, Sascha. "Video-realistic synthetic speech with a parametric visual speech synthesizer." In Interspeech 2004. ISCA, 2004. http://dx.doi.org/10.21437/interspeech.2004-422.
Aroon, Athira, and S. B. Dhonde. "Statistical Parametric Speech Synthesis: A review." In 2015 IEEE 9th International Conference on Intelligent Systems and Control (ISCO). IEEE, 2015. http://dx.doi.org/10.1109/isco.2015.7282379.
Gutscher, Lorenz, Michael Pucher, Carina Lozo, Marisa Hoeschele, and Daniel C. Mann. "Statistical parametric synthesis of budgerigar songs." In 10th ISCA Speech Synthesis Workshop. ISCA, 2019. http://dx.doi.org/10.21437/ssw.2019-23.
Ni, Jinfu, Yoshinori Shiga, Hisashi Kawai, and Hideki Kashioka. "Experiments on unsupervised statistical parametric speech synthesis." In 2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP 2012). IEEE, 2012. http://dx.doi.org/10.1109/iscslp.2012.6423518.
Shylendra, Ahish, Sina Haji Alizad, Priyesh Shukla, and Amit Ranjan Trivedi. "Non-parametric Statistical Density Function Synthesizer and Monte Carlo Sampler in CMOS." In 2020 33rd International Conference on VLSI Design and 2020 19th International Conference on Embedded Systems (VLSID). IEEE, 2020. http://dx.doi.org/10.1109/vlsid49098.2020.00021.
Zen, Heiga, Andrew Senior, and Mike Schuster. "Statistical parametric speech synthesis using deep neural networks." In ICASSP 2013 - 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2013. http://dx.doi.org/10.1109/icassp.2013.6639215.
An, Shumin, Zhenhua Ling, and Lirong Dai. "Emotional statistical parametric speech synthesis using LSTM-RNNs." In 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). IEEE, 2017. http://dx.doi.org/10.1109/apsipa.2017.8282282.
Tokuda, Keiichi, Kei Hashimoto, Keiichiro Oura, and Yoshihiko Nankaku. "Temporal modeling in neural network based statistical parametric speech synthesis." In 9th ISCA Speech Synthesis Workshop. ISCA, 2016. http://dx.doi.org/10.21437/ssw.2016-18.