Academic literature on the topic "Singing voice recognition"
Create a correct reference in APA, MLA, Chicago, Harvard, and other styles
Consult the topical lists of journal articles, books, theses, conference reports, and other academic sources on the topic "Singing voice recognition".
Journal articles on the topic "Singing voice recognition"
Wang, Xiaochen, and Tao Wang. "Voice Recognition and Evaluation of Vocal Music Based on Neural Network." Computational Intelligence and Neuroscience 2022 (May 20, 2022): 1–9. http://dx.doi.org/10.1155/2022/3466987.
Liusong, Yang, and Du Hui. "Voice Quality Evaluation of Singing Art Based on 1DCNN Model." Mathematical Problems in Engineering 2022 (July 30, 2022): 1–9. http://dx.doi.org/10.1155/2022/2074844.
Huang, Chunyuan. "Vocal Music Teaching Pharyngeal Training Method Based on Audio Extraction by Big Data Analysis." Wireless Communications and Mobile Computing 2022 (May 6, 2022): 1–11. http://dx.doi.org/10.1155/2022/4572904.
Owen, Ceri. "On Singing and Listening in Vaughan Williams's Early Songs." 19th-Century Music 40, no. 3 (2017): 257–82. http://dx.doi.org/10.1525/ncm.2017.40.3.257.
Muhathir, R. Muliono, N. Khairina, M. K. Harahap, and S. M. Putri. "Analysis Discrete Hartley Transform for the recognition of female voice based on voice register in singing techniques." Journal of Physics: Conference Series 1361 (November 2019): 012039. http://dx.doi.org/10.1088/1742-6596/1361/1/012039.
Yuan, Weitao, Boxin He, Shengbei Wang, Jianming Wang, and Masashi Unoki. "Enhanced feature network for monaural singing voice separation." Speech Communication 106 (January 2019): 1–6. http://dx.doi.org/10.1016/j.specom.2018.11.004.
Hu, Meihui, Zhiwei Xiang, and Kai Li. "Application of Artificial Intelligence Voice Technology in Radio and Television Media." Journal of Physics: Conference Series 2031, no. 1 (September 1, 2021): 012051. http://dx.doi.org/10.1088/1742-6596/2031/1/012051.
Liu, Pengfei, Wenjin Deng, Hengda Li, Jintai Wang, Yinglin Zheng, Yiwei Ding, Xiaohu Guo, and Ming Zeng. "MusicFace: Music-driven expressive singing face synthesis." Computational Visual Media 10, no. 1 (February 2023): 119–36. http://dx.doi.org/10.1007/s41095-023-0343-7.
Liu, Lilin. "The New Approach Research on Singing Voice Detection Algorithm Based on Enhanced Reconstruction Residual Network." Journal of Mathematics 2022 (February 23, 2022): 1–11. http://dx.doi.org/10.1155/2022/7987592.
Le, Dinh Son, Huy Hung Ha, Dinh Quan Nguyen, Van An Tran, and The Hung Nguyen. "Researching and designing an intelligent humanoid robot for teaching English language." Ministry of Science and Technology, Vietnam 64, no. 6 (June 25, 2022): 35–39. http://dx.doi.org/10.31276/vjst.64(6).35-39.
Theses on the topic "Singing voice recognition"
Regnier, Lise. "Localization, Characterization and Recognition of Singing Voices." PhD thesis, Université Pierre et Marie Curie - Paris VI, 2012. http://tel.archives-ouvertes.fr/tel-00687475.
Vaglio, Andrea. "Leveraging lyrics from audio for MIR." Electronic thesis or dissertation, Institut polytechnique de Paris, 2021. http://www.theses.fr/2021IPPAT027.
Texte intégralLyrics provide a lot of information about music since they encapsulate a lot of the semantics of songs. Such information could help users navigate easily through a large collection of songs and to recommend new music to them. However, this information is often unavailable in its textual form. To get around this problem, singing voice recognition systems could be used to obtain transcripts directly from the audio. These approaches are generally adapted from the speech recognition ones. Speech transcription is a decades-old domain that has lately seen significant advancements due to developments in machine learning techniques. When applied to the singing voice, however, these algorithms provide poor results. For a number of reasons, the process of lyrics transcription remains difficult. In this thesis, we investigate several scientifically and industrially difficult ’Music Information Retrieval’ problems by utilizing lyrics information generated straight from audio. The emphasis is on making approaches as relevant in real-world settings as possible. This entails testing them on vast and diverse datasets and investigating their scalability. To do so, a huge publicly available annotated lyrics dataset is used, and several state-of-the-art lyrics recognition algorithms are successfully adapted. We notably present, for the first time, a system that detects explicit content directly from audio. The first research on the creation of a multilingual lyrics-toaudio system are as well described. The lyrics-toaudio alignment task is further studied in two experiments quantifying the perception of audio and lyrics synchronization. A novel phonotactic method for language identification is also presented. Finally, we provide the first cover song detection algorithm that makes explicit use of lyrics information extracted from audio
Marxer Piñón, Ricard. "Audio source separation for music in low-latency and high-latency scenarios." Doctoral thesis, Universitat Pompeu Fabra, 2013. http://hdl.handle.net/10803/123808.
This thesis proposes specific methods to address the limitations of current music source separation methods in low-latency and high-latency scenarios. First, we focus on methods with low computational cost and low latency. We propose the use of Tikhonov regularization as a method for spectrum decomposition in the low-latency context. We compare it to existing techniques in pitch estimation and tracking tasks, crucial steps in many separation methods. We then use the proposed spectrum decomposition method in low-latency separation tasks targeting singing voice, bass and drums. Second, we propose several high-latency methods that improve the separation of singing voice by modeling components that are often not accounted for, such as breathiness and consonants. Finally, we explore using temporal correlations and human annotations to enhance the separation of drums and complex polyphonic music signals.
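The low-latency spectrum-decomposition step can be pictured as follows. This is a minimal sketch of Tikhonov-regularized decomposition onto a tiny two-atom dictionary, not the thesis implementation: it solves the normal equations (BᵀB + λI)a = Bᵀx in closed form for two basis spectra.

```python
def tikhonov_activations(b1, b2, x, lam=1e-3):
    """Toy Tikhonov-regularized decomposition of a magnitude spectrum x
    onto a two-atom dictionary (b1, b2): solve (B^T B + lam*I) a = B^T x
    via the closed-form solution of the 2x2 normal equations."""
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    g11, g22, g12 = dot(b1, b1) + lam, dot(b2, b2) + lam, dot(b1, b2)
    r1, r2 = dot(b1, x), dot(b2, x)
    det = g11 * g22 - g12 * g12  # strictly positive whenever lam > 0
    return (r1 * g22 - r2 * g12) / det, (g11 * r2 - g12 * r1) / det
```

The regularizer λ keeps the system well conditioned at negligible cost, which is what makes this kind of decomposition attractive in a low-latency setting; a realistic dictionary would of course contain many atoms and require a general linear solver.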
Chung, Nien-Yu (鍾念佑). "Recognition of Singing Voice and Instrument Sound Using Combinations of Acoustic Features." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/39449100792026503384.
National Taiwan University of Science and Technology
Department of Computer Science and Information Engineering
104
This thesis aims to recognize the class to which an input sound clip belongs. The two classes considered are singing sound (with vocal singing) and instrument sound (without vocal singing). The research focuses on testing different combinations of acoustic features to find the most effective feature vector for sound-class recognition. The features considered include mel-frequency cepstral coefficients (MFCC), pitch-detection coefficients (PDC), Chroma-extended features, and their delta coefficients. The recognition method is based on Gaussian mixture models (GMM). Different numbers of mixtures (8, 16, 32, and 64) are used to train the GMM parameters, and the trained GMMs are then used in experiments to recognize external sound clips. In the sound-frame recognition experiments, we tried 6 different feature vectors, i.e., 6 different combinations of acoustic features. Among them, MFCC plus PDC gave a significantly higher recognition rate than MFCC alone. With delta values added to the feature vector and a voting mechanism applied, the best frame-level recognition rate achieved is 71.3%. In the sound-clip recognition experiments, we tried 8 different feature vectors. For pure-instrument clips, the feature vector of 40 coefficients performs best, with a recognition rate of 97.1%. For mixed-sound clips, the 17-coefficient vector (MFCC+PDC) performs best, with a recognition rate of 94.7%. In terms of average recognition rate, the 40-coefficient vector is the best, achieving 93.8%.
Therefore, the feature vector that obtains the highest recognition rate is of 40 dimensions and consists of MFCC, PDC, Chroma-extended features, and their delta values.
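A minimal sketch of the frame-level GMM scoring and clip-level voting described above, assuming feature extraction (MFCC, PDC, etc.) has already produced per-frame vectors; the diagonal-covariance models and all names here are illustrative, not the thesis code.

```python
import math

def gmm_log_likelihood(frame, gmm):
    """Log-likelihood of a feature frame under a diagonal-covariance GMM.
    gmm is a list of (weight, means, variances) component tuples."""
    comp = []
    for w, mu, var in gmm:
        ll = math.log(w)
        for x, m, v in zip(frame, mu, var):
            ll -= 0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        comp.append(ll)
    hi = max(comp)  # log-sum-exp over mixture components
    return hi + math.log(sum(math.exp(c - hi) for c in comp))

def classify_clip(frames, gmm_singing, gmm_instrument):
    """Per-frame decisions followed by majority voting over the clip."""
    votes = sum(
        1 if gmm_log_likelihood(f, gmm_singing) > gmm_log_likelihood(f, gmm_instrument)
        else -1
        for f in frames)
    return "singing" if votes > 0 else "instrument"
```

In the thesis the GMM parameters are trained (with 8 to 64 mixtures) via EM on labeled data; here they would simply be supplied to the scoring functions.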
Pereira, Ana Isabel Lemos do Carmo. "The influence of singing with text and a neutral syllable on Portuguese children's vocal performance, song recognition, and use of singing voice." Doctoral thesis, 2019. http://hdl.handle.net/10362/91276.
Book chapters on the topic "Singing voice recognition"
Żwan, Paweł, Piotr Szczuko, Bożena Kostek, and Andrzej Czyżewski. "Automatic Singing Voice Recognition Employing Neural Networks and Rough Sets." In Transactions on Rough Sets IX, 455–73. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008. http://dx.doi.org/10.1007/978-3-540-89876-4_25.
Rocamora, Martín, and Alvaro Pardo. "Separation and Classification of Harmonic Sounds for Singing Voice Detection." In Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, 707–14. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-33275-3_87.
Jefferson, Ann. "The Romantic Poet and the Brotherhood of Genius." In Genius in France. Princeton University Press, 2014. http://dx.doi.org/10.23943/princeton/9780691160658.003.0006.
Texte intégralActes de conférences sur le sujet "Singing voice recognition"
Zhou, Huali, Yueqian Lin, Yao Shi, Peng Sun, and Ming Li. "Bisinger: Bilingual Singing Voice Synthesis." In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2023. http://dx.doi.org/10.1109/asru57964.2023.10389659.
Gao, Xiaoxue, Xiaohai Tian, Yi Zhou, Rohan Kumar Das, and Haizhou Li. "Personalized Singing Voice Generation Using WaveRNN." In Odyssey 2020 The Speaker and Language Recognition Workshop. ISCA, 2020. http://dx.doi.org/10.21437/odyssey.2020-36.
Huang, Wen-Chin, Lester Phillip Violeta, Songxiang Liu, Jiatong Shi, and Tomoki Toda. "The Singing Voice Conversion Challenge 2023." In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2023. http://dx.doi.org/10.1109/asru57964.2023.10389671.
Wang, Jun-You, Hung-Yi Lee, Jyh-Shing Roger Jang, and Li Su. "Zero-Shot Singing Voice Synthesis from Musical Score." In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2023. http://dx.doi.org/10.1109/asru57964.2023.10389711.
Liu, Ruolan, Xue Wen, Chunhui Lu, Liming Song, and June Sig Sung. "Vibrato Learning in Multi-Singer Singing Voice Synthesis." In 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2021. http://dx.doi.org/10.1109/asru51503.2021.9688029.
Suzuki, Motoyuki, Sho Tomita, and Tomoki Morita. "Lyrics Recognition from Singing Voice Focused on Correspondence Between Voice and Notes." In Interspeech 2019. ISCA, 2019. http://dx.doi.org/10.21437/interspeech.2019-1318.
Khunarsal, Peerapol, Chidchanok Lursinsap, and Thanapant Raicharoen. "Singing voice recognition based on matching of spectrogram pattern." In 2009 International Joint Conference on Neural Networks (IJCNN 2009 - Atlanta). IEEE, 2009. http://dx.doi.org/10.1109/ijcnn.2009.5179014.
Liu, Songxiang, Yuewen Cao, Dan Su, and Helen Meng. "DiffSVC: A Diffusion Probabilistic Model for Singing Voice Conversion." In 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2021. http://dx.doi.org/10.1109/asru51503.2021.9688219.
Chowdhury, Anurag, Austin Cozzo, and Arun Ross. "Domain Adaptation for Speaker Recognition in Singing and Spoken Voice." In ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2022. http://dx.doi.org/10.1109/icassp43922.2022.9746111.
Yamamoto, Ryuichi, Reo Yoneyama, Lester Phillip Violeta, Wen-Chin Huang, and Tomoki Toda. "A Comparative Study of Voice Conversion Models With Large-Scale Speech and Singing Data: The T13 Systems for the Singing Voice Conversion Challenge 2023." In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2023. http://dx.doi.org/10.1109/asru57964.2023.10389779.