Scientific literature on the topic "Voice conversion"
Create a correct reference in APA, MLA, Chicago, Harvard, and various other styles
Consult thematic lists of journal articles, books, theses, conference reports, and other scholarly sources on the topic "Voice conversion."
Next to every source in the list of references there is an "Add to bibliography" button. Click it, and we will automatically generate the bibliographic reference for the chosen source in your preferred citation style: APA, MLA, Harvard, Vancouver, Chicago, etc.
You can also download the full text of the scholarly publication as a PDF and read its abstract online whenever the metadata include this information.
Journal articles on the topic "Voice conversion"
Thamrin, Lily. "Phonological Description of Teochew Dialect in Pontianak West Kalimantan." Lingua Cultura 14, no. 2 (December 30, 2020): 195–201. http://dx.doi.org/10.21512/lc.v14i2.6600.
Vevia Romero, Fernando Carlos. "Nostalgia de la conversación." Argos 6, no. 17 (January 1, 2019): 149–57. http://dx.doi.org/10.32870/argos.v6.n19.14a19.
Harris, Taran. "Treating Audio Manipulation Effects like Photoshop: Exploring the Negative Impacts of a Lack of Transparency in Contemporary Vocal Music on Young Learners." INSAM Journal of Contemporary Music, Art and Technology, no. 8 (July 15, 2022): 47–59. http://dx.doi.org/10.51191/issn.2637-1898.2022.5.8.47.
Nishimura, Shogo, Takuya Nakamura, Wataru Sato, Masayuki Kanbara, Yuichiro Fujimoto, Hirokazu Kato, and Norihiro Hagita. "Vocal Synchrony of Robots Boosts Positive Affective Empathy." Applied Sciences 11, no. 6 (March 11, 2021): 2502. http://dx.doi.org/10.3390/app11062502.
Nirmal, Jagannath, Suprava Patnaik, Mukesh Zaveri, and Pramod Kachare. "Complex Cepstrum Based Voice Conversion Using Radial Basis Function." ISRN Signal Processing 2014 (February 6, 2014): 1–13. http://dx.doi.org/10.1155/2014/357048.
Zeitels, Steven M., Ramon A. Franco, Robert E. Hillman, and Glenn W. Bunting. "Voice and Treatment Outcome from Phonosurgical Management of Early Glottic Cancer." Annals of Otology, Rhinology & Laryngology 111, no. 12_suppl (December 2002): 3–20. http://dx.doi.org/10.1177/0003489402111s1202.
Adachi, Seiji, Hironori Takemoto, Tatsuya Kitamura, Parham Mokhtari, and Kiyoshi Honda. "Vocal tract length perturbation and its application to male-female vocal tract shape conversion." Journal of the Acoustical Society of America 121, no. 6 (June 2007): 3874–85. http://dx.doi.org/10.1121/1.2730743.
Vijayan, Karthika, Haizhou Li, and Tomoki Toda. "Speech-to-Singing Voice Conversion: The Challenges and Strategies for Improving Vocal Conversion Processes." IEEE Signal Processing Magazine 36, no. 1 (January 2019): 95–102. http://dx.doi.org/10.1109/msp.2018.2875195.
Treinkman, Melissa. "A Conversation with Leslie Holmes." Journal of Singing 80, no. 1 (August 15, 2023): 89–91. http://dx.doi.org/10.53830/tfcq4189.
Geist, Rose, and Susan E. Tallett. "Diagnosis and Management of Psychogenic Stridor Caused by a Conversion Disorder." Pediatrics 86, no. 2 (August 1, 1990): 315–17. http://dx.doi.org/10.1542/peds.86.2.315.
Texte intégralThèses sur le sujet "Conversion vocale"
Huber, Stefan. "Voice Conversion by modelling and transformation of extended voice characteristics." Electronic thesis or dissertation, Paris 6, 2015. https://accesdistant.sorbonne-universite.fr/login?url=https://theses-intra.sorbonne-universite.fr/2015PA066750.pdf.
Voice Conversion (VC) aims at transforming the characteristics of a source speaker's voice in such a way that it will be perceived as being uttered by a target speaker. The principle of VC is to define mapping functions for the conversion from one source speaker's voice to one target speaker's voice. The transformation functions of common state-of-the-art (START) VC systems adapt instantaneously to the characteristics of the source voice. While recent VC systems have made considerable progress over the conversion quality of initial approaches, the quality is nevertheless not yet sufficient, and considerable improvements are required before VC techniques can be used in a professional industrial environment. The objective of this thesis is to improve the quality of voice conversion enough to facilitate its industrial applicability. The basic properties of different START algorithms for voice conversion are discussed with respect to their intrinsic advantages and shortcomings. Based on experimental evaluations of one GMM-based state-of-the-art VC approach, the conclusion is that most VC systems which rely on statistical models are, due to the averaging effect of the linear regression, less appropriate for achieving the similarity to the target speaker required for industrial usage. The contributions established throughout this thesis lie in the extended means to a) model the glottal excitation source, b) model a voice descriptor set using a novel speech system based on an extended source-filter model, and c) further advance IRCAM's novel VC system by combining it with the contributions of a) and b).
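For readers unfamiliar with the GMM-based mapping this abstract critiques, the joint-density formulation can be sketched in a few lines. The following is a minimal illustration only, not Huber's system: the feature dimension, component count, and synthetic training pairs are assumptions. The final posterior-weighted sum over components is exactly the linear-regression averaging that the thesis identifies as limiting speaker similarity.

```python
# Minimal sketch of joint-density GMM voice conversion (illustrative assumptions).
import numpy as np
from sklearn.mixture import GaussianMixture

D = 24   # spectral feature dimension per frame (assumed)
M = 8    # number of mixture components (assumed)
rng = np.random.default_rng(0)

# Stand-ins for time-aligned source/target spectral frames (e.g., MFCCs).
src = rng.standard_normal((2000, D))
tgt = 0.8 * src + 0.1 * rng.standard_normal((2000, D))

# Joint GMM over concatenated [source; target] vectors.
gmm = GaussianMixture(n_components=M, covariance_type="full", random_state=0)
gmm.fit(np.hstack([src, tgt]))

def convert(x):
    """Map one source frame x (shape (D,)) into the target speaker's space."""
    mu_x, mu_y = gmm.means_[:, :D], gmm.means_[:, D:]
    S_xx = gmm.covariances_[:, :D, :D]   # source-source covariance blocks
    S_yx = gmm.covariances_[:, D:, :D]   # target-source cross-covariance blocks
    diff = x - mu_x                      # (M, D)
    # Posterior P(m | x) under the marginal source model (up to a constant).
    log_p = np.array([
        np.log(gmm.weights_[m])
        - 0.5 * np.linalg.slogdet(S_xx[m])[1]
        - 0.5 * diff[m] @ np.linalg.solve(S_xx[m], diff[m])
        for m in range(M)
    ])
    post = np.exp(log_p - log_p.max())
    post /= post.sum()
    # Per-component linear regression, averaged by the posterior weights:
    # this weighted sum is the "averaging effect" the thesis criticizes.
    return sum(post[m] * (mu_y[m] + S_yx[m] @ np.linalg.solve(S_xx[m], diff[m]))
               for m in range(M))

converted_frame = convert(src[0])
```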
Guéguin, Marie. "Evaluation objective de la qualité vocale en contexte de conversation." PhD thesis, Université Rennes 1, 2006. http://tel.archives-ouvertes.fr/tel-00132550.
Texte intégralOgun, Sewade. « Generating diverse synthetic data for ASR training data augmentation ». Electronic Thesis or Diss., Université de Lorraine, 2024. http://www.theses.fr/2024LORR0116.
In the last two decades, the error rate of automatic speech recognition (ASR) systems has dropped drastically, making them more useful in real-world applications. This improvement can be attributed to several factors, including new architectures using deep learning techniques, new training algorithms, large and diverse training datasets, and data augmentation. In particular, large-scale training datasets have been pivotal to learning robust speech representations for ASR. Their large size allows them to effectively cover the inherent diversity in speech in terms of speaker voice, speaking rate, pitch, reverberation, and noise. However, the size and diversity of datasets typically found in high-resourced languages are not available for medium- and low-resourced languages or in domains with specialised vocabulary such as the medical domain. Therefore, the popular method to increase dataset diversity is data augmentation. With the recent increase in the naturalness and quality of synthetic data that can be generated by text-to-speech (TTS) and voice conversion (VC) systems, these systems have also become viable options for ASR data augmentation. However, several problems limit their application. First, TTS/VC systems require high-quality speech data for training. Hence, we develop a method of dataset curation from an ASR-designed corpus for training a TTS system. This method leverages the increasing accuracy of deep-learning-based, non-intrusive quality estimators to filter high-quality samples. We explore filtering the ASR dataset at different thresholds to balance the size of the dataset, the number of speakers, and quality, and with this method we create a high-quality multi-speaker dataset comparable to LibriTTS in quality. Second, the data generation process needs to be controllable so as to generate diverse TTS/VC data with specific attributes. Previous TTS/VC systems either condition the system on the speaker embedding alone or use discriminative models to learn the speech variabilities. In our approach, we design an improved flow-based architecture that learns the distribution of different speech variables. We find that our modifications significantly increase the diversity and naturalness of the generated utterances over a GlowTTS baseline, while remaining controllable. Lastly, we evaluate the significance of generating diverse TTS and VC data for augmenting ASR training data. Rather than naively generating the TTS/VC data, we independently examine different approaches, such as sentence selection methods and increasing the diversity of speakers, phoneme durations, and pitch contours, in addition to systematically increasing the environmental conditions of the generated data. Our results show that TTS/VC augmentation holds promise for increasing ASR performance in low- and medium-data regimes. In conclusion, our experiments provide insight into the variabilities that are particularly important for ASR and reveal a systematic approach to ASR data augmentation using synthetic data.
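The quality-based curation step described in this abstract amounts to a threshold sweep over predicted quality scores. A minimal sketch follows, assuming per-utterance scores have already been produced by some non-intrusive deep estimator; the corpus layout, score scale, and threshold values are illustrative assumptions, not the thesis's actual pipeline.

```python
# Sketch of threshold-based corpus filtering for TTS training data curation.
from dataclasses import dataclass

@dataclass
class Utterance:
    path: str        # audio file location
    speaker: str     # speaker label
    duration: float  # seconds
    mos: float       # predicted quality from a non-intrusive estimator (assumed 1-5 scale)

def filter_corpus(corpus, threshold):
    """Keep utterances whose predicted quality clears the threshold,
    and report the resulting size/speaker-coverage trade-off."""
    kept = [u for u in corpus if u.mos >= threshold]
    hours = sum(u.duration for u in kept) / 3600.0
    speakers = {u.speaker for u in kept}
    print(f"threshold={threshold}: {hours:.1f} h, {len(speakers)} speakers")
    return kept

# Sweeping thresholds exposes the balance described in the abstract:
# stricter filtering yields cleaner TTS training data but fewer hours and speakers.
corpus = [Utterance(f"utt{i}.wav", f"spk{i % 50}", 6.0, 2.0 + (i % 30) / 10)
          for i in range(3000)]
for t in (3.5, 4.0, 4.5):
    filter_corpus(corpus, t)
```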
Berger, Israel. "Inaction and silent action in interaction." Thesis, University of Roehampton, 2013. https://pure.roehampton.ac.uk/portal/en/studentthesis/inaction-and-silent-action-in-interaction(a49cedf3-0263-463f-9362-12e13ad2f6e9).html.
Howell, Ashley N. "Effects of Social Context on State Anxiety, Submissive Behavior, and Perceived Social Task Performance in Females with Social Anxiety." Ohio University / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1365441706.
Deschamps-Berger, Théo. "Social Emotion Recognition with multimodal deep learning architecture in emergency call centers." Electronic thesis or dissertation, Université Paris-Saclay, 2024. http://www.theses.fr/2024UPASG036.
This thesis explores automatic speech-emotion recognition systems in a medical emergency context. It addresses some of the challenges encountered when studying emotions in social interactions and is rooted in modern theories of emotions, particularly Lisa Feldman Barrett's work on the construction of emotions. Indeed, the manifestation of emotions in human interactions is complex: they are often nuanced, mixed, and highly linked to context. This study is based on the CEMO corpus, which is composed of telephone conversations between callers and emergency medical dispatchers (EMDs) from a French emergency call center. This corpus provides a rich dataset for exploring the capacity of deep learning systems, such as Transformers and pre-trained models, to recognize spontaneous emotions in spoken interactions. Applications could include providing emotional cues to improve call handling and decision-making by EMDs, or summarizing calls. The work carried out in my thesis focused on different techniques related to speech emotion recognition, including transfer learning from pre-trained models, multimodal fusion strategies, dialogic context integration, and mixed-emotion detection. An initial acoustic system based on temporal convolutions and recurrent networks was developed and validated on IEMOCAP, an emotional corpus widely used by the affective computing community, and then on the CEMO corpus. Extensive research on multimodal systems, pre-trained on acoustics and linguistics and adapted to emotion recognition, is presented. In addition, the integration of dialog context in emotion recognition was explored, underlining the complex dynamics of emotions in social interactions. Finally, research has been initiated towards developing multi-label, multimodal systems capable of handling the subtleties of mixed emotions, which often arise from the annotator's perception and the social context. Our research highlights some solutions and challenges in recognizing emotions in the wild. The CNRS AI HUMAAINE Chair (HUman-MAchine Affective Interaction & Ethics) funded this thesis.
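As one concrete illustration of a multimodal fusion strategy of the kind this abstract describes, here is a minimal late-fusion sketch that combines pooled acoustic and linguistic embeddings (e.g., from a wav2vec 2.0-style speech encoder and a BERT-style text encoder). The dimensions, pooling, fusion scheme, and label set are assumptions and do not reproduce the thesis's CEMO architecture.

```python
# Late-fusion speech emotion recognition head over pre-trained embeddings (sketch).
import torch
import torch.nn as nn

class LateFusionSER(nn.Module):
    """Emotion classifier over pooled acoustic and text embeddings."""
    def __init__(self, d_audio=768, d_text=768, d_hidden=256, n_classes=4):
        super().__init__()
        self.proj_audio = nn.Linear(d_audio, d_hidden)
        self.proj_text = nn.Linear(d_text, d_hidden)
        self.head = nn.Sequential(
            nn.ReLU(), nn.Dropout(0.3), nn.Linear(2 * d_hidden, n_classes))

    def forward(self, audio_feats, text_feats):
        # Mean-pool frame/token embeddings to one vector per utterance,
        # then fuse by concatenation (late fusion).
        a = self.proj_audio(audio_feats.mean(dim=1))
        t = self.proj_text(text_feats.mean(dim=1))
        return self.head(torch.cat([a, t], dim=-1))

model = LateFusionSER()
# Batch of 2 utterances: 120 speech frames and 30 text tokens each.
logits = model(torch.randn(2, 120, 768), torch.randn(2, 30, 768))  # shape (2, 4)
```

For the mixed-emotion (multi-label) case mentioned above, the same head can be trained with a per-class sigmoid loss (e.g., torch's BCEWithLogitsLoss) instead of a softmax cross-entropy.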
"Conversation, Dark haze, San-shui Xi-nan." 1998. http://library.cuhk.edu.hk/record=b5896306.
Thesis (M.Mus.)--Chinese University of Hong Kong, 1998. Abstract also in Chinese.
Contents: Part I: Abstract (p. 1). Part II: Analysis on "Conversation" (p. 3); "Conversation" (full score) (p. 6); Analysis on "Dark Haze" (p. 25); "Dark Haze" (full score) (p. 28); Analysis on "San-Shui Xi-Nan" (p. 65); "San-Shui Xi-Nan" (full score) (p. 69). Part III: Biography (p. 119).
Books on the topic "Voice conversion"
Klein, Evelyn R., Cesar E. Ruiz, and Louis R. Chesney. Echo: A Vocal Language Program for Building Ease and Comfort with Conversation. Plural Publishing, Incorporated, 2021.
Eidsheim, Nina Sun, and Katherine Meizel, eds. The Oxford Handbook of Voice Studies. Oxford University Press, 2019. http://dx.doi.org/10.1093/oxfordhb/9780199982295.001.0001.
Barnard, Stephen R. Hacking Hybrid Media. New York: Oxford University Press, 2024. http://dx.doi.org/10.1093/oso/9780197570272.001.0001.
Budney, Stephen. William Jay. Greenwood Publishing Group, Inc., 2005. http://dx.doi.org/10.5040/9798216035947.
Texte intégralChapitres de livres sur le sujet "Conversion vocale"
Vekkot, Susmitha, and Shikha Tripathi. "Vocal Emotion Conversion Using WSOLA and Linear Prediction." In Speech and Computer, 777–87. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-66429-3_78.
Vekkot, Susmitha, and Shikha Tripathi. "Significance of Glottal Closure Instants Detection Algorithms in Vocal Emotion Conversion." In Soft Computing Applications, 462–73. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-62521-8_40.
Teo, Nicole, Zhaoxia Wang, Ezekiel Ghe, Yee Sen Tan, Kevan Oktavio, Alexander Vincent Lewi, Allyne Zhang, and Seng-Beng Ho. "DLVS4Audio2Sheet: Deep Learning-Based Vocal Separation for Audio into Music Sheet Conversion." In Lecture Notes in Computer Science, 95–107. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2650-9_8.
Matthews, Colin. "Un Colloque sentimental (A Sentimental Conversation)." In New Vocal Repertory 2, 172–79. Oxford: Oxford University Press, 1998. http://dx.doi.org/10.1093/oso/9780198790181.003.0038.
Juslin, Patrik N., and Klaus R. Scherer. "Vocal expression of affect." In The New Handbook of Methods in Nonverbal Behavior Research, 65–136. Oxford: Oxford University Press, 2005. http://dx.doi.org/10.1093/oso/9780198529613.003.0003.
McNally, Michael D. "Ojibwes, Missionaries, and Hymn Singing, 1828–1867." In Ojibwe Singers, 43–80. New York, NY: Oxford University Press, 2000. http://dx.doi.org/10.1093/oso/9780195134643.003.0003.
Recasens, Daniel. "Velar palatalization." In Phonetic Causes of Sound Change, 22–76. Oxford University Press, 2020. http://dx.doi.org/10.1093/oso/9780198845010.003.0003.
"'A Little Singer on Broadway.'" In Blues Mamas and Broadway Belters, 106–61. Duke University Press, 2024. http://dx.doi.org/10.1215/9781478059967-004.
Schneider, Magnus Tessing. "From the General to the Specific: The Musical Director's Perspective." In Performing the Eighteenth Century: Theatrical Discourses, Practices, and Artefacts, 225–34. Stockholm University Press, 2023. http://dx.doi.org/10.16993/bce.k.
"'To Change the Order of Conversation': Interruption and Vocal Diversity in Holmes' American Talk." In Oliver Wendell Holmes and the Culture of Conversation, 61–90. Cambridge University Press, 2001. http://dx.doi.org/10.1017/cbo9780511485503.003.
Texte intégralActes de conférences sur le sujet "Conversion vocale"
Chan, Paul Y., Minghui Dong, S. W. Lee, and Ling Cen. "Solo to a capella conversion - Synthesizing vocal harmony from lead vocals." In 2011 IEEE International Conference on Multimedia and Expo (ICME). IEEE, 2011. http://dx.doi.org/10.1109/icme.2011.6012032.
Liliana, Resmana Lim, and Elizabeth Kwan. "Voice conversion application (VOCAL)." In 2011 International Conference on Uncertainty Reasoning and Knowledge Engineering (URKE). IEEE, 2011. http://dx.doi.org/10.1109/urke.2011.6007812.
Rao, K. Sreenivasa, and B. Yegnanarayana. "Voice Conversion by Prosody and Vocal Tract Modification." In 9th International Conference on Information Technology (ICIT'06). IEEE, 2006. http://dx.doi.org/10.1109/icit.2006.92.
Vekkot, Susmitha. "Building a generalized model for multi-lingual vocal emotion conversion." In 2017 Seventh International Conference on Affective Computing and Intelligent Interaction (ACII). IEEE, 2017. http://dx.doi.org/10.1109/acii.2017.8273658.
Turk, Oytun, and Levent M. Arslan. "Voice conversion methods for vocal tract and pitch contour modification." In 8th European Conference on Speech Communication and Technology (Eurospeech 2003). ISCA, 2003. http://dx.doi.org/10.21437/eurospeech.2003-36.
Weichao, Xie, and Zhang Linghua. "Vocal tract spectrum transformation based on clustering in voice conversion system." In 2012 International Conference on Information and Automation (ICIA). IEEE, 2012. http://dx.doi.org/10.1109/icinfa.2012.6246812.
Korotaev, Nikolay. "Collaborative constructions in Russian conversations: A multichannel perspective." In International Conference on Computational Linguistics and Intellectual Technologies. RSUH, 2023. http://dx.doi.org/10.28995/2075-7182-2023-22-254-266.
Shah, Nirmesh, Maulik C. Madhavi, and Hemant Patil. "Unsupervised Vocal Tract Length Warped Posterior Features for Non-Parallel Voice Conversion." In Interspeech 2018. ISCA, 2018. http://dx.doi.org/10.21437/interspeech.2018-1712.
Saito, Daisuke, Satoshi Asakawa, Nobuaki Minematsu, and Keikichi Hirose. "Structure to speech conversion - speech generation based on infant-like vocal imitation." In Interspeech 2008. ISCA, 2008. http://dx.doi.org/10.21437/interspeech.2008-178.
Zhu, Zhi, Ryota Miyauchi, Yukiko Araki, and Masashi Unoki. "Feasibility of vocal emotion conversion on modulation spectrogram for simulated cochlear implants." In 2017 25th European Signal Processing Conference (EUSIPCO). IEEE, 2017. http://dx.doi.org/10.23919/eusipco.2017.8081526.