A selection of scientific literature on the topic "Singing voice recognition"
Cite a source in APA, MLA, Chicago, Harvard, and other citation styles
Table of contents
Consult the lists of current articles, books, dissertations, reports, and other scientific sources on the topic "Singing voice recognition".
Next to every work in the bibliography, the "Add to bibliography" option is available. Use it, and the bibliographic reference for the chosen work will be formatted automatically in the required citation style (APA, MLA, Harvard, Chicago, Vancouver, etc.).
You can also download the full text of the scientific publication as a PDF and read an online annotation of the work, if the relevant parameters are provided in the metadata.
Journal articles on the topic "Singing voice recognition"
Wang, Xiaochen, and Tao Wang. "Voice Recognition and Evaluation of Vocal Music Based on Neural Network". Computational Intelligence and Neuroscience 2022 (May 20, 2022): 1–9. http://dx.doi.org/10.1155/2022/3466987.
Liusong, Yang, and Du Hui. "Voice Quality Evaluation of Singing Art Based on 1DCNN Model". Mathematical Problems in Engineering 2022 (July 30, 2022): 1–9. http://dx.doi.org/10.1155/2022/2074844.
Huang, Chunyuan. "Vocal Music Teaching Pharyngeal Training Method Based on Audio Extraction by Big Data Analysis". Wireless Communications and Mobile Computing 2022 (May 6, 2022): 1–11. http://dx.doi.org/10.1155/2022/4572904.
Owen, Ceri. "On Singing and Listening in Vaughan Williams's Early Songs". 19th-Century Music 40, no. 3 (2017): 257–82. http://dx.doi.org/10.1525/ncm.2017.40.3.257.
Muhathir, R. Muliono, N. Khairina, M. K. Harahap, and S. M. Putri. "Analysis Discrete Hartley Transform for the recognition of female voice based on voice register in singing techniques". Journal of Physics: Conference Series 1361 (November 2019): 012039. http://dx.doi.org/10.1088/1742-6596/1361/1/012039.
Yuan, Weitao, Boxin He, Shengbei Wang, Jianming Wang, and Masashi Unoki. "Enhanced feature network for monaural singing voice separation". Speech Communication 106 (January 2019): 1–6. http://dx.doi.org/10.1016/j.specom.2018.11.004.
Hu, Meihui, Zhiwei Xiang, and Kai Li. "Application of Artificial Intelligence Voice Technology in Radio and Television Media". Journal of Physics: Conference Series 2031, no. 1 (September 1, 2021): 012051. http://dx.doi.org/10.1088/1742-6596/2031/1/012051.
Liu, Pengfei, Wenjin Deng, Hengda Li, Jintai Wang, Yinglin Zheng, Yiwei Ding, Xiaohu Guo, and Ming Zeng. "MusicFace: Music-driven expressive singing face synthesis". Computational Visual Media 10, no. 1 (February 2023): 119–36. http://dx.doi.org/10.1007/s41095-023-0343-7.
Liu, Lilin. "The New Approach Research on Singing Voice Detection Algorithm Based on Enhanced Reconstruction Residual Network". Journal of Mathematics 2022 (February 23, 2022): 1–11. http://dx.doi.org/10.1155/2022/7987592.
Le, Dinh Son, Huy Hung Ha, Dinh Quan Nguyen, Van An Tran, and The Hung Nguyen. "Researching and designing an intelligent humanoid robot for teaching English language". Ministry of Science and Technology, Vietnam 64, no. 6 (June 25, 2022): 35–39. http://dx.doi.org/10.31276/vjst.64(6).35-39.
Dissertations on the topic "Singing voice recognition"
Regnier, Lise. "Localization, Characterization and Recognition of Singing Voices". PhD thesis, Université Pierre et Marie Curie - Paris VI, 2012. http://tel.archives-ouvertes.fr/tel-00687475.
Vaglio, Andrea. "Leveraging lyrics from audio for MIR". Electronic thesis or dissertation, Institut polytechnique de Paris, 2021. http://www.theses.fr/2021IPPAT027.
Der volle Inhalt der QuelleLyrics provide a lot of information about music since they encapsulate a lot of the semantics of songs. Such information could help users navigate easily through a large collection of songs and to recommend new music to them. However, this information is often unavailable in its textual form. To get around this problem, singing voice recognition systems could be used to obtain transcripts directly from the audio. These approaches are generally adapted from the speech recognition ones. Speech transcription is a decades-old domain that has lately seen significant advancements due to developments in machine learning techniques. When applied to the singing voice, however, these algorithms provide poor results. For a number of reasons, the process of lyrics transcription remains difficult. In this thesis, we investigate several scientifically and industrially difficult ’Music Information Retrieval’ problems by utilizing lyrics information generated straight from audio. The emphasis is on making approaches as relevant in real-world settings as possible. This entails testing them on vast and diverse datasets and investigating their scalability. To do so, a huge publicly available annotated lyrics dataset is used, and several state-of-the-art lyrics recognition algorithms are successfully adapted. We notably present, for the first time, a system that detects explicit content directly from audio. The first research on the creation of a multilingual lyrics-toaudio system are as well described. The lyrics-toaudio alignment task is further studied in two experiments quantifying the perception of audio and lyrics synchronization. A novel phonotactic method for language identification is also presented. Finally, we provide the first cover song detection algorithm that makes explicit use of lyrics information extracted from audio
Marxer, Piñón Ricard. „Audio source separation for music in low-latency and high-latency scenarios“. Doctoral thesis, Universitat Pompeu Fabra, 2013. http://hdl.handle.net/10803/123808.
This thesis proposes specific methods to address the limitations of current music source separation methods in low-latency and high-latency scenarios. First, we focus on methods with low computational cost and low latency. We propose the use of Tikhonov regularization as a method for spectrum decomposition in the low-latency context. We compare it to existing techniques in pitch estimation and tracking tasks, crucial steps in many separation methods. We then use the proposed spectrum decomposition method in low-latency separation tasks targeting singing voice, bass and drums. Second, we propose several high-latency methods that improve the separation of singing voice by modeling components that are often not accounted for, such as breathiness and consonants. Finally, we explore using temporal correlations and human annotations to enhance the separation of drums and complex polyphonic music signals.
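The low-latency spectrum-decomposition step this abstract describes has a closed form, which is part of its appeal over iterative factorization updates. The sketch below is illustrative only, assuming a fixed spectral basis; the function name, basis matrix, and regularization weight are not taken from the thesis:

```python
import numpy as np

def tikhonov_decompose(x, B, lam=0.1):
    """Decompose a magnitude spectrum x (n_bins,) onto a basis B (n_bins, n_atoms)
    via Tikhonov-regularized least squares: argmin_g ||x - B g||^2 + lam ||g||^2."""
    n_atoms = B.shape[1]
    # Closed-form normal equations: (B^T B + lam I) g = B^T x
    return np.linalg.solve(B.T @ B + lam * np.eye(n_atoms), B.T @ x)
```

Because the solution is a single linear solve per frame (and `B.T @ B + lam*I` can be pre-factorized when the basis is fixed), this fits the low-latency constraint better than iterative decompositions.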
Chung, Nien-Yu, and 鍾念佑. "Recognition of Singing Voice and Instrument Sound Using Combinations of Acoustic Features". Thesis, 2016. http://ndltd.ncl.edu.tw/handle/39449100792026503384.
National Taiwan University of Science and Technology
Department of Computer Science and Information Engineering
Academic year 104
This thesis aims to recognize the class an input sound clip belongs to. The two sound classes concerned are singing sound (with vocal singing) and instrument sound (without vocal singing). The focus of this research is on testing different combinations of the considered acoustic features in order to find the most effective feature vector for sound class recognition. The acoustic coefficients considered include mel-frequency cepstral coefficients (MFCC), pitch-detection coefficients (PDC), Chroma-extended features, and their delta coefficients. The recognition method studied is based on Gaussian mixture models (GMMs). Different numbers of mixtures, e.g. 8, 16, 32 and 64, are used to train the parameters of the GMMs. These GMMs are then used in experiments for recognizing external sound clips. In the experiments on sound frame recognition, we tried 6 different feature vectors, i.e. 6 different combinations of acoustic features. Among them, the vector MFCC plus PDC was found to have a significantly better recognition rate than MFCC alone. If the feature vector is augmented with delta values and a voting mechanism is added, the best recognition rate achieved is 71.3% for sound frame recognition. In the experiments on sound clip recognition, we tried 8 different feature vectors, i.e. 8 different combinations of acoustic features. To recognize pure-instrument sound clips, the feature vector consisting of 40 coefficients was found to be the best, with a recognition rate of 97.1%. To recognize mixed-sound clips, the feature vector consisting of 17 coefficients (MFCC+PDC) was the best, with a recognition rate of 94.7%. In terms of average recognition rate, the feature vector consisting of 40 coefficients is the best, achieving 93.8%.
Therefore, the feature vector that obtains the highest recognition rate has 40 dimensions and consists of MFCC, PDC, Chroma-extended features, and their delta values.
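The GMM-based clip decision this abstract describes can be illustrated in its simplest form, a one-mixture diagonal-covariance model per class with a maximum-likelihood decision over the clip's frames. The feature dimensionality, function names, and class labels below are assumptions for illustration, not the thesis implementation:

```python
import numpy as np

def fit_diag_gaussian(frames):
    # frames: (n_frames, n_features) acoustic features such as MFCC+PDC;
    # a one-mixture diagonal GMM reduces to one diagonal Gaussian per class
    return frames.mean(axis=0), frames.var(axis=0) + 1e-6

def clip_log_likelihood(frames, model):
    mu, var = model
    # sum of per-frame diagonal-Gaussian log densities over the whole clip
    return np.sum(-0.5 * (np.log(2 * np.pi * var) + (frames - mu) ** 2 / var))

def classify_clip(frames, models):
    # maximum-likelihood decision between the trained class models
    return max(models, key=lambda name: clip_log_likelihood(frames, models[name]))
```

With more mixtures (8, 16, 32, 64, as in the thesis), `fit_diag_gaussian` would be replaced by EM training of a full GMM; the per-clip decision rule stays the same.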
Pereira, Ana Isabel Lemos do Carmo. "The influence of singing with text and a neutral syllable on Portuguese children's vocal performance, song recognition, and use of singing voice". Doctoral thesis, 2019. http://hdl.handle.net/10362/91276.
Book chapters on the topic "Singing voice recognition"
Żwan, Paweł, Piotr Szczuko, Bożena Kostek, and Andrzej Czyżewski. "Automatic Singing Voice Recognition Employing Neural Networks and Rough Sets". In Transactions on Rough Sets IX, 455–73. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008. http://dx.doi.org/10.1007/978-3-540-89876-4_25.
Rocamora, Martín, and Alvaro Pardo. "Separation and Classification of Harmonic Sounds for Singing Voice Detection". In Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, 707–14. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-33275-3_87.
Jefferson, Ann. "The Romantic Poet and the Brotherhood of Genius". In Genius in France. Princeton University Press, 2014. http://dx.doi.org/10.23943/princeton/9780691160658.003.0006.
Conference papers on the topic "Singing voice recognition"
Zhou, Huali, Yueqian Lin, Yao Shi, Peng Sun, and Ming Li. "Bisinger: Bilingual Singing Voice Synthesis". In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2023. http://dx.doi.org/10.1109/asru57964.2023.10389659.
Gao, Xiaoxue, Xiaohai Tian, Yi Zhou, Rohan Kumar Das, and Haizhou Li. "Personalized Singing Voice Generation Using WaveRNN". In Odyssey 2020 The Speaker and Language Recognition Workshop. ISCA, 2020. http://dx.doi.org/10.21437/odyssey.2020-36.
Huang, Wen-Chin, Lester Phillip Violeta, Songxiang Liu, Jiatong Shi, and Tomoki Toda. "The Singing Voice Conversion Challenge 2023". In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2023. http://dx.doi.org/10.1109/asru57964.2023.10389671.
Wang, Jun-You, Hung-Yi Lee, Jyh-Shing Roger Jang, and Li Su. "Zero-Shot Singing Voice Synthesis from Musical Score". In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2023. http://dx.doi.org/10.1109/asru57964.2023.10389711.
Liu, Ruolan, Xue Wen, Chunhui Lu, Liming Song, and June Sig Sung. "Vibrato Learning in Multi-Singer Singing Voice Synthesis". In 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2021. http://dx.doi.org/10.1109/asru51503.2021.9688029.
Suzuki, Motoyuki, Sho Tomita, and Tomoki Morita. "Lyrics Recognition from Singing Voice Focused on Correspondence Between Voice and Notes". In Interspeech 2019. ISCA, 2019. http://dx.doi.org/10.21437/interspeech.2019-1318.
Khunarsal, Peerapol, Chidchanok Lursinsap, and Thanapant Raicharoen. "Singing voice recognition based on matching of spectrogram pattern". In 2009 International Joint Conference on Neural Networks (IJCNN 2009 - Atlanta). IEEE, 2009. http://dx.doi.org/10.1109/ijcnn.2009.5179014.
Liu, Songxiang, Yuewen Cao, Dan Su, and Helen Meng. "DiffSVC: A Diffusion Probabilistic Model for Singing Voice Conversion". In 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2021. http://dx.doi.org/10.1109/asru51503.2021.9688219.
Chowdhury, Anurag, Austin Cozzo, and Arun Ross. "Domain Adaptation for Speaker Recognition in Singing and Spoken Voice". In ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2022. http://dx.doi.org/10.1109/icassp43922.2022.9746111.
Yamamoto, Ryuichi, Reo Yoneyama, Lester Phillip Violeta, Wen-Chin Huang, and Tomoki Toda. "A Comparative Study of Voice Conversion Models With Large-Scale Speech and Singing Data: The T13 Systems for the Singing Voice Conversion Challenge 2023". In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2023. http://dx.doi.org/10.1109/asru57964.2023.10389779.