Academic literature on the topic "Singing voice recognition"
Create a correct reference in APA, MLA, Chicago, Harvard, and other styles
Consult the topical lists of journal articles, books, theses, conference reports, and other academic sources on the topic "Singing voice recognition".
Journal articles on the topic "Singing voice recognition"
Wang, Xiaochen, and Tao Wang. "Voice Recognition and Evaluation of Vocal Music Based on Neural Network." Computational Intelligence and Neuroscience 2022 (May 20, 2022): 1–9. http://dx.doi.org/10.1155/2022/3466987.
Liusong, Yang, and Du Hui. "Voice Quality Evaluation of Singing Art Based on 1DCNN Model." Mathematical Problems in Engineering 2022 (July 30, 2022): 1–9. http://dx.doi.org/10.1155/2022/2074844.
Huang, Chunyuan. "Vocal Music Teaching Pharyngeal Training Method Based on Audio Extraction by Big Data Analysis." Wireless Communications and Mobile Computing 2022 (May 6, 2022): 1–11. http://dx.doi.org/10.1155/2022/4572904.
Owen, Ceri. "On Singing and Listening in Vaughan Williams's Early Songs." 19th-Century Music 40, no. 3 (2017): 257–82. http://dx.doi.org/10.1525/ncm.2017.40.3.257.
Muhathir, R. Muliono, N. Khairina, M. K. Harahap, and S. M. Putri. "Analysis Discrete Hartley Transform for the recognition of female voice based on voice register in singing techniques." Journal of Physics: Conference Series 1361 (November 2019): 012039. http://dx.doi.org/10.1088/1742-6596/1361/1/012039.
Yuan, Weitao, Boxin He, Shengbei Wang, Jianming Wang, and Masashi Unoki. "Enhanced feature network for monaural singing voice separation." Speech Communication 106 (January 2019): 1–6. http://dx.doi.org/10.1016/j.specom.2018.11.004.
Hu, Meihui, Zhiwei Xiang, and Kai Li. "Application of Artificial Intelligence Voice Technology in Radio and Television Media." Journal of Physics: Conference Series 2031, no. 1 (September 1, 2021): 012051. http://dx.doi.org/10.1088/1742-6596/2031/1/012051.
Liu, Pengfei, Wenjin Deng, Hengda Li, Jintai Wang, Yinglin Zheng, Yiwei Ding, Xiaohu Guo, and Ming Zeng. "MusicFace: Music-driven expressive singing face synthesis." Computational Visual Media 10, no. 1 (February 2023): 119–36. http://dx.doi.org/10.1007/s41095-023-0343-7.
Liu, Lilin. "The New Approach Research on Singing Voice Detection Algorithm Based on Enhanced Reconstruction Residual Network." Journal of Mathematics 2022 (February 23, 2022): 1–11. http://dx.doi.org/10.1155/2022/7987592.
Le, Dinh Son, Huy Hung Ha, Dinh Quan Nguyen, Van An Tran, and The Hung Nguyen. "Researching and designing an intelligent humanoid robot for teaching English language." Ministry of Science and Technology, Vietnam 64, no. 6 (June 25, 2022): 35–39. http://dx.doi.org/10.31276/vjst.64(6).35-39.
Theses on the topic "Singing voice recognition"
Regnier, Lise. "Localization, Characterization and Recognition of Singing Voices." PhD thesis, Université Pierre et Marie Curie - Paris VI, 2012. http://tel.archives-ouvertes.fr/tel-00687475.
Vaglio, Andrea. "Leveraging lyrics from audio for MIR." Electronic thesis or dissertation, Institut polytechnique de Paris, 2021. http://www.theses.fr/2021IPPAT027.
Texte intégralLyrics provide a lot of information about music since they encapsulate a lot of the semantics of songs. Such information could help users navigate easily through a large collection of songs and to recommend new music to them. However, this information is often unavailable in its textual form. To get around this problem, singing voice recognition systems could be used to obtain transcripts directly from the audio. These approaches are generally adapted from the speech recognition ones. Speech transcription is a decades-old domain that has lately seen significant advancements due to developments in machine learning techniques. When applied to the singing voice, however, these algorithms provide poor results. For a number of reasons, the process of lyrics transcription remains difficult. In this thesis, we investigate several scientifically and industrially difficult ’Music Information Retrieval’ problems by utilizing lyrics information generated straight from audio. The emphasis is on making approaches as relevant in real-world settings as possible. This entails testing them on vast and diverse datasets and investigating their scalability. To do so, a huge publicly available annotated lyrics dataset is used, and several state-of-the-art lyrics recognition algorithms are successfully adapted. We notably present, for the first time, a system that detects explicit content directly from audio. The first research on the creation of a multilingual lyrics-toaudio system are as well described. The lyrics-toaudio alignment task is further studied in two experiments quantifying the perception of audio and lyrics synchronization. A novel phonotactic method for language identification is also presented. Finally, we provide the first cover song detection algorithm that makes explicit use of lyrics information extracted from audio
Marxer Piñón, Ricard. "Audio source separation for music in low-latency and high-latency scenarios." Doctoral thesis, Universitat Pompeu Fabra, 2013. http://hdl.handle.net/10803/123808.
This thesis proposes specific methods to address the limitations of current music source separation methods in low-latency and high-latency scenarios. First, we focus on methods with low computational cost and low latency. We propose the use of Tikhonov regularization as a method for spectrum decomposition in the low-latency context. We compare it to existing techniques in pitch estimation and tracking tasks, crucial steps in many separation methods. We then use the proposed spectrum decomposition method in low-latency separation tasks targeting singing voice, bass and drums. Second, we propose several high-latency methods that improve the separation of singing voice by modeling components that are often not accounted for, such as breathiness and consonants. Finally, we explore using temporal correlations and human annotations to enhance the separation of drums and complex polyphonic music signals.
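The low-latency spectrum-decomposition step can be pictured as follows. This is a minimal sketch of Tikhonov-regularized decomposition onto a tiny two-atom dictionary, not the thesis implementation: it solves the normal equations (BᵀB + λI)a = Bᵀx in closed form for two basis spectra.

```python
def tikhonov_activations(b1, b2, x, lam=1e-3):
    """Toy Tikhonov-regularized decomposition of a magnitude spectrum x
    onto a two-atom dictionary (b1, b2): solve (B^T B + lam*I) a = B^T x
    via the closed-form solution of the 2x2 normal equations."""
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    g11, g22, g12 = dot(b1, b1) + lam, dot(b2, b2) + lam, dot(b1, b2)
    r1, r2 = dot(b1, x), dot(b2, x)
    det = g11 * g22 - g12 * g12  # strictly positive whenever lam > 0
    return (r1 * g22 - r2 * g12) / det, (g11 * r2 - g12 * r1) / det
```

The regularizer λ keeps the system well conditioned at negligible cost, which is what makes this kind of decomposition attractive in a low-latency setting; a realistic dictionary would of course contain many atoms and require a general linear solver.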
Chung, Nien-Yu (鍾念佑). "Recognition of Singing Voice and Instrument Sound Using Combinations of Acoustic Features." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/39449100792026503384.
National Taiwan University of Science and Technology
Department of Computer Science and Information Engineering
104
This thesis aims to recognize the class to which an input sound clip belongs. The two classes considered are singing sound (with vocal singing) and instrument sound (without vocal singing). The research focuses on testing different combinations of acoustic features to find the most effective feature vector for sound-class recognition. The features considered include mel-frequency cepstral coefficients (MFCC), pitch-detection coefficients (PDC), Chroma-extended features, and their delta coefficients. The recognition method is based on Gaussian mixture models (GMM). Different numbers of mixtures (8, 16, 32, and 64) are used to train the GMM parameters, and the trained GMMs are then used in experiments to recognize external sound clips. In the sound-frame recognition experiments, we tried 6 different feature vectors, i.e., 6 different combinations of acoustic features. Among them, MFCC plus PDC gave a significantly higher recognition rate than MFCC alone. With delta values added to the feature vector and a voting mechanism applied, the best frame-level recognition rate achieved is 71.3%. In the sound-clip recognition experiments, we tried 8 different feature vectors. For pure-instrument clips, the feature vector of 40 coefficients performs best, with a recognition rate of 97.1%. For mixed-sound clips, the 17-coefficient vector (MFCC+PDC) performs best, with a recognition rate of 94.7%. In terms of average recognition rate, the 40-coefficient vector is the best, achieving 93.8%.
Therefore, the feature vector that obtains the highest recognition rate is of 40 dimensions and consists of MFCC, PDC, Chroma-extended features, and their delta values.
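A minimal sketch of the frame-level GMM scoring and clip-level voting described above, assuming feature extraction (MFCC, PDC, etc.) has already produced per-frame vectors; the diagonal-covariance models and all names here are illustrative, not the thesis code.

```python
import math

def gmm_log_likelihood(frame, gmm):
    """Log-likelihood of a feature frame under a diagonal-covariance GMM.
    gmm is a list of (weight, means, variances) component tuples."""
    comp = []
    for w, mu, var in gmm:
        ll = math.log(w)
        for x, m, v in zip(frame, mu, var):
            ll -= 0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        comp.append(ll)
    hi = max(comp)  # log-sum-exp over mixture components
    return hi + math.log(sum(math.exp(c - hi) for c in comp))

def classify_clip(frames, gmm_singing, gmm_instrument):
    """Per-frame decisions followed by majority voting over the clip."""
    votes = sum(
        1 if gmm_log_likelihood(f, gmm_singing) > gmm_log_likelihood(f, gmm_instrument)
        else -1
        for f in frames)
    return "singing" if votes > 0 else "instrument"
```

In the thesis the GMM parameters are trained (with 8 to 64 mixtures) via EM on labeled data; here they would simply be supplied to the scoring functions.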
Pereira, Ana Isabel Lemos do Carmo. "The influence of singing with text and a neutral syllable on Portuguese children's vocal performance, song recognition, and use of singing voice." Doctoral thesis, 2019. http://hdl.handle.net/10362/91276.
Book chapters on the topic "Singing voice recognition"
Żwan, Paweł, Piotr Szczuko, Bożena Kostek, and Andrzej Czyżewski. "Automatic Singing Voice Recognition Employing Neural Networks and Rough Sets." In Transactions on Rough Sets IX, 455–73. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008. http://dx.doi.org/10.1007/978-3-540-89876-4_25.
Rocamora, Martín, and Alvaro Pardo. "Separation and Classification of Harmonic Sounds for Singing Voice Detection." In Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, 707–14. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-33275-3_87.
Jefferson, Ann. "The Romantic Poet and the Brotherhood of Genius." In Genius in France. Princeton University Press, 2014. http://dx.doi.org/10.23943/princeton/9780691160658.003.0006.
Texte intégralActes de conférences sur le sujet "Singing voice recognition"
Zhou, Huali, Yueqian Lin, Yao Shi, Peng Sun, and Ming Li. "Bisinger: Bilingual Singing Voice Synthesis." In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2023. http://dx.doi.org/10.1109/asru57964.2023.10389659.
Gao, Xiaoxue, Xiaohai Tian, Yi Zhou, Rohan Kumar Das, and Haizhou Li. "Personalized Singing Voice Generation Using WaveRNN." In Odyssey 2020 The Speaker and Language Recognition Workshop. ISCA, 2020. http://dx.doi.org/10.21437/odyssey.2020-36.
Huang, Wen-Chin, Lester Phillip Violeta, Songxiang Liu, Jiatong Shi, and Tomoki Toda. "The Singing Voice Conversion Challenge 2023." In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2023. http://dx.doi.org/10.1109/asru57964.2023.10389671.
Wang, Jun-You, Hung-Yi Lee, Jyh-Shing Roger Jang, and Li Su. "Zero-Shot Singing Voice Synthesis from Musical Score." In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2023. http://dx.doi.org/10.1109/asru57964.2023.10389711.
Liu, Ruolan, Xue Wen, Chunhui Lu, Liming Song, and June Sig Sung. "Vibrato Learning in Multi-Singer Singing Voice Synthesis." In 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2021. http://dx.doi.org/10.1109/asru51503.2021.9688029.
Suzuki, Motoyuki, Sho Tomita, and Tomoki Morita. "Lyrics Recognition from Singing Voice Focused on Correspondence Between Voice and Notes." In Interspeech 2019. ISCA, 2019. http://dx.doi.org/10.21437/interspeech.2019-1318.
Khunarsal, Peerapol, Chidchanok Lursinsap, and Thanapant Raicharoen. "Singing voice recognition based on matching of spectrogram pattern." In 2009 International Joint Conference on Neural Networks (IJCNN 2009 - Atlanta). IEEE, 2009. http://dx.doi.org/10.1109/ijcnn.2009.5179014.
Liu, Songxiang, Yuewen Cao, Dan Su, and Helen Meng. "DiffSVC: A Diffusion Probabilistic Model for Singing Voice Conversion." In 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2021. http://dx.doi.org/10.1109/asru51503.2021.9688219.
Chowdhury, Anurag, Austin Cozzo, and Arun Ross. "Domain Adaptation for Speaker Recognition in Singing and Spoken Voice." In ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2022. http://dx.doi.org/10.1109/icassp43922.2022.9746111.
Yamamoto, Ryuichi, Reo Yoneyama, Lester Phillip Violeta, Wen-Chin Huang, and Tomoki Toda. "A Comparative Study of Voice Conversion Models With Large-Scale Speech and Singing Data: The T13 Systems for the Singing Voice Conversion Challenge 2023." In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2023. http://dx.doi.org/10.1109/asru57964.2023.10389779.