Ready-made bibliography on the topic "Singing voice recognition"
Create accurate references in APA, MLA, Chicago, Harvard, and many other styles
Browse lists of current articles, books, dissertations, abstracts, and other scholarly sources on the topic "Singing voice recognition".
Journal articles on the topic "Singing voice recognition"
Wang, Xiaochen, and Tao Wang. "Voice Recognition and Evaluation of Vocal Music Based on Neural Network". Computational Intelligence and Neuroscience 2022 (May 20, 2022): 1–9. http://dx.doi.org/10.1155/2022/3466987.
Liusong, Yang, and Du Hui. "Voice Quality Evaluation of Singing Art Based on 1DCNN Model". Mathematical Problems in Engineering 2022 (July 30, 2022): 1–9. http://dx.doi.org/10.1155/2022/2074844.
Huang, Chunyuan. "Vocal Music Teaching Pharyngeal Training Method Based on Audio Extraction by Big Data Analysis". Wireless Communications and Mobile Computing 2022 (May 6, 2022): 1–11. http://dx.doi.org/10.1155/2022/4572904.
Owen, Ceri. "On Singing and Listening in Vaughan Williams's Early Songs". 19th-Century Music 40, no. 3 (2017): 257–82. http://dx.doi.org/10.1525/ncm.2017.40.3.257.
Muhathir, R. Muliono, N. Khairina, M. K. Harahap, and S. M. Putri. "Analysis Discrete Hartley Transform for the recognition of female voice based on voice register in singing techniques". Journal of Physics: Conference Series 1361 (November 2019): 012039. http://dx.doi.org/10.1088/1742-6596/1361/1/012039.
Yuan, Weitao, Boxin He, Shengbei Wang, Jianming Wang, and Masashi Unoki. "Enhanced feature network for monaural singing voice separation". Speech Communication 106 (January 2019): 1–6. http://dx.doi.org/10.1016/j.specom.2018.11.004.
Hu, Meihui, Zhiwei Xiang, and Kai Li. "Application of Artificial Intelligence Voice Technology in Radio and Television Media". Journal of Physics: Conference Series 2031, no. 1 (September 1, 2021): 012051. http://dx.doi.org/10.1088/1742-6596/2031/1/012051.
Liu, Pengfei, Wenjin Deng, Hengda Li, Jintai Wang, Yinglin Zheng, Yiwei Ding, Xiaohu Guo, and Ming Zeng. "MusicFace: Music-driven expressive singing face synthesis". Computational Visual Media 10, no. 1 (February 2023): 119–36. http://dx.doi.org/10.1007/s41095-023-0343-7.
Liu, Lilin. "The New Approach Research on Singing Voice Detection Algorithm Based on Enhanced Reconstruction Residual Network". Journal of Mathematics 2022 (February 23, 2022): 1–11. http://dx.doi.org/10.1155/2022/7987592.
Le, Dinh Son, Huy Hung Ha, Dinh Quan Nguyen, Van An Tran, and The Hung Nguyen. "Researching and designing an intelligent humanoid robot for teaching English language". Ministry of Science and Technology, Vietnam 64, no. 6 (June 25, 2022): 35–39. http://dx.doi.org/10.31276/vjst.64(6).35-39.
Doctoral dissertations on the topic "Singing voice recognition"
Regnier, Lise. "Localization, Characterization and Recognition of Singing Voices". PhD thesis, Université Pierre et Marie Curie - Paris VI, 2012. http://tel.archives-ouvertes.fr/tel-00687475.
Vaglio, Andrea. "Leveraging lyrics from audio for MIR". Electronic thesis or dissertation, Institut polytechnique de Paris, 2021. http://www.theses.fr/2021IPPAT027.
Pełny tekst źródłaLyrics provide a lot of information about music since they encapsulate a lot of the semantics of songs. Such information could help users navigate easily through a large collection of songs and to recommend new music to them. However, this information is often unavailable in its textual form. To get around this problem, singing voice recognition systems could be used to obtain transcripts directly from the audio. These approaches are generally adapted from the speech recognition ones. Speech transcription is a decades-old domain that has lately seen significant advancements due to developments in machine learning techniques. When applied to the singing voice, however, these algorithms provide poor results. For a number of reasons, the process of lyrics transcription remains difficult. In this thesis, we investigate several scientifically and industrially difficult ’Music Information Retrieval’ problems by utilizing lyrics information generated straight from audio. The emphasis is on making approaches as relevant in real-world settings as possible. This entails testing them on vast and diverse datasets and investigating their scalability. To do so, a huge publicly available annotated lyrics dataset is used, and several state-of-the-art lyrics recognition algorithms are successfully adapted. We notably present, for the first time, a system that detects explicit content directly from audio. The first research on the creation of a multilingual lyrics-toaudio system are as well described. The lyrics-toaudio alignment task is further studied in two experiments quantifying the perception of audio and lyrics synchronization. A novel phonotactic method for language identification is also presented. Finally, we provide the first cover song detection algorithm that makes explicit use of lyrics information extracted from audio
Marxer Piñón, Ricard. "Audio source separation for music in low-latency and high-latency scenarios". Doctoral thesis, Universitat Pompeu Fabra, 2013. http://hdl.handle.net/10803/123808.
This thesis proposes specific methods to address the limitations of current music source separation methods in low-latency and high-latency scenarios. First, we focus on methods with low computational cost and low latency. We propose the use of Tikhonov regularization as a method for spectrum decomposition in the low-latency context. We compare it to existing techniques in pitch estimation and tracking tasks, crucial steps in many separation methods. We then use the proposed spectrum decomposition method in low-latency separation tasks targeting singing voice, bass and drums. Second, we propose several high-latency methods that improve the separation of singing voice by modeling components that are often not accounted for, such as breathiness and consonants. Finally, we explore using temporal correlations and human annotations to enhance the separation of drums and complex polyphonic music signals.
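What makes Tikhonov regularization attractive in the low-latency setting described above is that, for a fixed matrix of spectral templates, the activations of each incoming frame have a closed-form ridge solution: one small linear solve per frame, no iterative updates. The following sketch illustrates only that general idea; the template matrix, the λ value, and the non-negativity clipping are assumptions for illustration, not details taken from the thesis:

```python
import numpy as np

def tikhonov_activations(W, x, lam=0.1):
    """Estimate activations a of spectral templates W for one
    magnitude-spectrum frame x by solving
        min_a ||x - W a||^2 + lam * ||a||^2
    via the closed-form ridge solution (W^T W + lam I)^-1 W^T x."""
    k = W.shape[1]
    a = np.linalg.solve(W.T @ W + lam * np.eye(k), W.T @ x)
    return np.clip(a, 0.0, None)  # crude non-negativity as a post-step

# Toy example: two spectral templates, frame dominated by the first one.
rng = np.random.default_rng(0)
W = np.abs(rng.normal(size=(64, 2)))   # 64 frequency bins, 2 templates
x = 2.0 * W[:, 0] + 0.1 * W[:, 1]      # synthetic mixture frame
a = tikhonov_activations(W, x, lam=0.01)
print(a)  # the first activation should dominate
```

Because the solve involves only a k-by-k system (k = number of templates), each frame can be decomposed as soon as it arrives, which is the property the thesis exploits for low-latency separation.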
Chung, Nien-Yu, and 鍾念佑. "Recognition of Singing Voice and Instrument Sound Using Combinations of Acoustic Features". Thesis, 2016. http://ndltd.ncl.edu.tw/handle/39449100792026503384.
National Taiwan University of Science and Technology
Department of Computer Science and Information Engineering
104
This thesis aims to recognize the class to which an input sound clip belongs. The two sound classes concerned here are singing sound (with vocal singing) and instrument sound (without vocal singing). The focus of this research is on testing different combinations of the considered acoustic features in order to find the most effective feature vector for sound-class recognition. The acoustic coefficients considered include mel-frequency cepstral coefficients (MFCC), pitch-detection coefficients (PDC), chroma-extended features, and their delta coefficients. The recognition method studied is based on the Gaussian mixture model (GMM). Different numbers of mixtures, e.g. 8, 16, 32 and 64, are used to train the parameters of the GMMs, which are then used in experiments to recognize external sound clips. In the experiments on sound-frame recognition, we tried 6 different feature vectors, i.e. 6 different combinations of acoustic features. Among the 6 feature vectors, MFCC plus PDC is found to be significantly better than MFCC alone in recognition rate. If the feature vector is augmented with delta values and a voting mechanism is added, the best recognition rate achieved is 71.3% for sound-frame recognition. In the experiments on sound-clip recognition, we tried 8 different feature vectors, i.e. 8 different combinations of acoustic features. To recognize pure-instrument sound clips, the feature vector consisting of 40 coefficients is found to be the best, with a recognition rate of 97.1%. To recognize mixed-sound clips, the feature vector consisting of 17 coefficients (MFCC+PDC) is the best, with a recognition rate of 94.7%. In terms of average recognition rate, the 40-coefficient feature vector is the best, achieving 93.8%.
Therefore, the feature vector that obtains the highest recognition rate has 40 dimensions and consists of MFCC, PDC, chroma-extended features, and their delta values.
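The pipeline this abstract describes — one GMM per sound class trained on frame-level feature vectors, with a clip assigned to the class whose model yields the higher total log-likelihood — can be sketched as follows. This is an illustrative reconstruction using scikit-learn and synthetic random features, not the thesis code; the 17-dimensional vectors and 8-mixture models echo numbers from the abstract, but everything else is assumed:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(42)

# Synthetic stand-ins for frame-level feature vectors (e.g. MFCC+PDC, 17-dim).
singing_frames = rng.normal(loc=0.5, scale=1.0, size=(500, 17))
instrument_frames = rng.normal(loc=-0.5, scale=1.0, size=(500, 17))

# One GMM per sound class; mixture counts of 8-64 were tried in the thesis.
gmm_sing = GaussianMixture(n_components=8, random_state=0).fit(singing_frames)
gmm_inst = GaussianMixture(n_components=8, random_state=0).fit(instrument_frames)

def classify_clip(frames):
    """Assign a clip to the class whose GMM gives the higher total
    log-likelihood summed over all of the clip's frames."""
    ll_sing = gmm_sing.score_samples(frames).sum()
    ll_inst = gmm_inst.score_samples(frames).sum()
    return "singing" if ll_sing > ll_inst else "instrument"

test_clip = rng.normal(loc=0.5, scale=1.0, size=(100, 17))
print(classify_clip(test_clip))
```

Summing per-frame log-likelihoods over the clip plays the same role as the frame-voting mechanism in the abstract: individual noisy frames are outweighed by the clip-level evidence.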
Pereira, Ana Isabel Lemos do Carmo. "The influence of singing with text and a neutral syllable on Portuguese children's vocal performance, song recognition, and use of singing voice". Doctoral thesis, 2019. http://hdl.handle.net/10362/91276.
Book chapters on the topic "Singing voice recognition"
Żwan, Paweł, Piotr Szczuko, Bożena Kostek, and Andrzej Czyżewski. "Automatic Singing Voice Recognition Employing Neural Networks and Rough Sets". In Transactions on Rough Sets IX, 455–73. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008. http://dx.doi.org/10.1007/978-3-540-89876-4_25.
Rocamora, Martín, and Alvaro Pardo. "Separation and Classification of Harmonic Sounds for Singing Voice Detection". In Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, 707–14. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-33275-3_87.
Jefferson, Ann. "The Romantic Poet and the Brotherhood of Genius". In Genius in France. Princeton University Press, 2014. http://dx.doi.org/10.23943/princeton/9780691160658.003.0006.
Conference papers on the topic "Singing voice recognition"
Zhou, Huali, Yueqian Lin, Yao Shi, Peng Sun, and Ming Li. "Bisinger: Bilingual Singing Voice Synthesis". In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2023. http://dx.doi.org/10.1109/asru57964.2023.10389659.
Gao, Xiaoxue, Xiaohai Tian, Yi Zhou, Rohan Kumar Das, and Haizhou Li. "Personalized Singing Voice Generation Using WaveRNN". In Odyssey 2020: The Speaker and Language Recognition Workshop. ISCA, 2020. http://dx.doi.org/10.21437/odyssey.2020-36.
Huang, Wen-Chin, Lester Phillip Violeta, Songxiang Liu, Jiatong Shi, and Tomoki Toda. "The Singing Voice Conversion Challenge 2023". In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2023. http://dx.doi.org/10.1109/asru57964.2023.10389671.
Wang, Jun-You, Hung-Yi Lee, Jyh-Shing Roger Jang, and Li Su. "Zero-Shot Singing Voice Synthesis from Musical Score". In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2023. http://dx.doi.org/10.1109/asru57964.2023.10389711.
Liu, Ruolan, Xue Wen, Chunhui Lu, Liming Song, and June Sig Sung. "Vibrato Learning in Multi-Singer Singing Voice Synthesis". In 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2021. http://dx.doi.org/10.1109/asru51503.2021.9688029.
Suzuki, Motoyuki, Sho Tomita, and Tomoki Morita. "Lyrics Recognition from Singing Voice Focused on Correspondence Between Voice and Notes". In Interspeech 2019. ISCA, 2019. http://dx.doi.org/10.21437/interspeech.2019-1318.
Khunarsal, Peerapol, Chidchanok Lursinsap, and Thanapant Raicharoen. "Singing voice recognition based on matching of spectrogram pattern". In 2009 International Joint Conference on Neural Networks (IJCNN 2009 - Atlanta). IEEE, 2009. http://dx.doi.org/10.1109/ijcnn.2009.5179014.
Liu, Songxiang, Yuewen Cao, Dan Su, and Helen Meng. "DiffSVC: A Diffusion Probabilistic Model for Singing Voice Conversion". In 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2021. http://dx.doi.org/10.1109/asru51503.2021.9688219.
Chowdhury, Anurag, Austin Cozzo, and Arun Ross. "Domain Adaptation for Speaker Recognition in Singing and Spoken Voice". In ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2022. http://dx.doi.org/10.1109/icassp43922.2022.9746111.
Yamamoto, Ryuichi, Reo Yoneyama, Lester Phillip Violeta, Wen-Chin Huang, and Tomoki Toda. "A Comparative Study of Voice Conversion Models With Large-Scale Speech and Singing Data: The T13 Systems for the Singing Voice Conversion Challenge 2023". In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2023. http://dx.doi.org/10.1109/asru57964.2023.10389779.