Selected scholarly literature on the topic "Synthèse audio"

Author: Grafiati

Published: 12 October 2024

Browse the list of current articles, books, theses, conference proceedings, and other scholarly sources relevant to the topic "Synthèse audio".

Journal articles on the topic "Synthèse audio"

Rioufreyt, Thibaut. "La transcription outillée en SHS. Un panorama des logiciels de transcription audio/vidéo". Bulletin of Sociological Methodology/Bulletin de Méthodologie Sociologique 139, no. 1 (24 April 2018): 96–133. http://dx.doi.org/10.1177/0759106318762455.

Abstract:

This article sketches an overview of the main software tools available to transcribers to ease their work and ensure the quality of transcriptions. The aim is twofold. First, it provides a synthesis of the existing tools (24 software packages are covered) and of the main technical characteristics of each. The range on offer is so abundant that the average user can struggle to find their way around. Second, the article offers not just a list of tools but nine simple criteria for choosing the software best suited to the needs of one's research, corpus, and skills: the role of the research question and research object, the specificity of the field, the platform on which the tool runs, the availability of the software, the mode of data representation, the articulation of media, the data format, the audio playback and processing tools, and the analysis functions these programs offer.

Gauthier, Geneviève, Simonne Couture and Christina St-Onge. "Jugement évaluatif : confrontation d’un modèle conceptuel à des données empiriques". Pédagogie Médicale 19, no. 1 (February 2018): 15–25. http://dx.doi.org/10.1051/pmed/2019002.

Abstract:

Context: Reliance on assessors' judgment is increasingly common in competency-based training; however, its subjectivity has often been criticized. More recently, assessors' varied perspectives have begun to be treated as an important source of information, and research on evaluative judgment (rater cognition) has multiplied. In a synthesis of empirical studies on the subject, Gauthier et al. proposed a conceptual model encompassing a series of converging results. Objective: In this concurrent embedded mixed-methods study (quan/QUAL), we compare this theoretical model against empirical data from semi-structured interviews with outstanding assessors. The analysis aims to validate the theoretical model and determine its usefulness for better understanding evaluative judgment. Methods: Verbatim transcripts of audio-recorded interviews with 11 participants observing and judging a video of a resident during a consultation with a standardized patient were coded using the theoretical model as the coding tree. Quantitative data on the occurrence and co-occurrence of each code, overall and per individual, were extracted and analyzed. Results: The data corroborate that all nine mechanisms of the conceptual model are well represented in the assessors' discourse. However, the results suggest that the model with its nine independent mechanisms does not do justice to the complexity of the interactions between certain mechanisms, and that one mechanism, the personal concept of competence, appears to underpin a large share of the others.

Xu, Shu. "AI Color Organ: Piano Music Visualization using Onset Detection and HistoGAN". Highlights in Science, Engineering and Technology 39 (1 April 2023): 274–79. http://dx.doi.org/10.54097/hset.v39i.6539.

Abstract:

The music visualization algorithm described in this study allows users to construct piano audio files using imported image files. This paper contributes to previous studies and designs of sonification by highlighting the effectiveness of utilizing onset detection in creating intuitive sonic changes. The audio-visual correspondences employed in this study could be expanded to many other syntheses and sample manipulation techniques. Translating visual information into sonic changes could yield many creative applications in music production, as it offers musicians a simultaneously optical and auditory production experience. This approach to audio manipulation also increases the unpredictability of the sound output, which could be appealing to experimental musicians seeking to control sounds with the visual structure of artworks that they enjoy, as opposed to precise parameters. It is looking forward to seeing creative implementations of the techniques in audio-visual artworks, music production tools, and interactive multimedia systems. These results shed light on guiding further exploration of AI composing.

Dreier, Christian, and Michael Vorländer. "Vehicle pass-by noise auralization in a virtual urban environment". INTER-NOISE and NOISE-CON Congress and Conference Proceedings 265, no. 6 (1 February 2023): 1907–15. http://dx.doi.org/10.3397/in_2022_0269.

Abstract:

Auralization is a suitable method for the subjective evaluation of environmental noise. Due to its complexity, the plausible and immersive acoustic representation of outdoor scenarios in urban environments is an ongoing field of research. This work presents the design and implementation of a vehicle pass-by noise model with application in a real-time environmental noise auralization. The pass-by noise sources are implemented by procedural audio syntheses of engine and road-tyre noise with according directivities. In an audiovisual demonstration, the resulting source model is auralized considering the sound propagation phenomena in a virtual urban environment using the Virtual Acoustics framework.

Dick, Michael. "„Der neue Audi Q5 ist eine Synthese aus Sportlichkeit, innovativer Technologie und einzigartigem Design.“". ATZextra 13, no. 2 (June 2008): 200. http://dx.doi.org/10.1365/s35778-008-0108-z.

Abbas, Wasfi. "Audio-Visual Poetry A Semiotic-Cultural Reading in Interactive Digital Poem (Vision of Hope)". Journal of Umm Al-Qura University for Language Sciences and Literature, no. 28 (1 August 2021): 253–302. http://dx.doi.org/10.54940/ll78073145.

Abstract:

This research aims at reading the Saudi poet Mohamed Habibi’s interactive digital poem, ‘Vision of Hope’, through the digital semiotic-cultural approach, trying to depict the external and internal text relationships, identifying the type of rhetoric to which the smart movement of the image and the multifaceted diversity of voices have added, and recognizing the cultural dimension of depicting a human being as a thing. What adds to the importance of this research is pinpointing the role of the technical syntheses of figurative language in the poem, ‘Vision of Hope’ that served the poetic text to become animatic, and served the non-verbal texts to become poetic. The research concludes that though the poem is technically simple because of the absence of hypertext, it is significantly rich because of the existence of the technical synthesis that has contributed in giving each of its component the advantage of the other. In addition, although the poet follows his admiration of the beauty of the three musical compositions of Nusseir Shammah, some of them were not well employed. The poet’s creative experience of interactive digital poetry is promising and deserves to be supported and sustained.

McGlacken-Byrne, Sinéad M., Nuala P. Murphy and Sarah Barry. "A realist synthesis of multicentre comparative audit implementation: exploring what works and in which healthcare contexts". BMJ Open Quality 13, no. 1 (March 2024): e002629. http://dx.doi.org/10.1136/bmjoq-2023-002629.

Abstract:

Background: Multicentre comparative clinical audits have the potential to improve patient care, allow benchmarking and inform resource allocation. However, implementing effective and sustainable large-scale audit can be difficult within busy and resource-constrained contemporary healthcare settings. There are little data on what facilitates the successful implementation of multicentre audits. As healthcare environments are complex sociocultural organisational environments, implementing multicentre audits within them is likely to be highly context dependent. Objective: We aimed to examine factors that were influential in the implementation process of multicentre comparative audits within healthcare contexts: what worked, why, how and for whom? Methods: A realist review was conducted in accordance with the Realist and Meta-narrative Evidence Syntheses: Evolving Standards reporting standards. A preliminary programme theory informed two systematic literature searches of peer-reviewed and grey literature. The main context-mechanism-outcome (CMO) configurations underlying the implementation processes of multicentre audits were identified and formed a final programme theory. Results: 69 original articles were included in the realist synthesis. Four discrete CMO configurations were deduced from this synthesis, which together made up the final programme theory. These were: (1) generating trustworthy data; (2) encouraging audit participation; (3) ensuring audit sustainability; and (4) facilitating audit cycle completion. Conclusions: This study elucidated contexts, mechanisms and outcomes influential to the implementation processes of multicentre or national comparative audits in healthcare. The relevance of these contextual factors and generative mechanisms was supported by established theories of behaviour and findings from previous empirical research. These findings highlight the importance of balancing reliability with pragmatism within complex adaptive systems, generating and protecting human capital, ensuring fair and credible leadership and prioritising change facilitation.

Iftikhar, Hassan, and Yan Luximon. "The syntheses of static and mobile wayfinding information: an empirical study of wayfinding preferences and behaviour in complex environments". Facilities 40, no. 7/8 (4 March 2022): 452–74. http://dx.doi.org/10.1108/f-06-2021-0052.

Abstract:

Purpose: The efficient delivery of environmental information to wayfinders in complex environments is a challenge for information designers. Wayfinding tasks can be quite strenuous and frustrating in the visual absence of dedicated wayfinding information. This study aims to explore the behaviour regarding the use of wayfinding information by navigators in complex environments. Design/methodology/approach: An experiment has been conducted in which participants have performed wayfinding tasks in a spatially complex university campus. The participants were instructed to use the think-aloud protocol during the experiment. The behaviour has been recorded using the head-mounted video recorder (GoPro), mobile phone screen (audio/video) recorder and interview. Twelve university students have been selected based on the equal level of spatial ability using the Santa Barbara Sense of Direction scale. Each participant performed three wayfinding tasks to locate the unknown locations inside the campus using a mobile wayfinding application and other information sources. Findings: The results of this study demonstrated significant behavioural preferences in acquiring wayfinding information. Most of the participants synthesised the static and mobile wayfinding information sources, while some preferred only the static ones. Gender differences have also been found for planning and route finding. This study recommends the syntheses of static and mobile wayfinding information for designing an efficient institutional wayfinding system. Research limitations/implications: The sample size has been kept small because of the qualitative exploration of the wayfinding behaviour regarding the wayfinding information syntheses behaviour. The experiment findings can be further explored with larger data set and controlled behavioural metrics. This study can help understand the user requirements in facilities management for spatially complex institutional environments. Practical implications: The current findings can be further used to develop a framework for wayfinding information designers to assist them in understanding the current practices and incorporate them for improving institutional wayfinding systems. The management of the offered facilities within an institution can be further improved to make the space more efficient by saving users’ time and efforts. Originality/value: Information syntheses or symbiosis of environmental information with the beacon-based digital wayfinding system is a new concept. This study explores the potential of such information syntheses for enhancing the legibility of complex institutional environments.

Mao, Ling-Xiang, Jing Lan, Zifeng Li and Hua Shi. "Undergraduate Teaching Audit and Evaluation Using an Extended ORESTE Method with Interval-Valued Hesitant Fuzzy Linguistic Sets". Systems 11, no. 5 (23 April 2023): 216. http://dx.doi.org/10.3390/systems11050216.

Abstract:

Undergraduate teaching audit and evaluation (UTAE) plays a substantial role in the teaching quality assurance and monitoring of universities. It achieves the goal of selecting the best university for promoting the quality of higher education in China. Generally, the UTAE is a complex decision-making problem by considering competing evaluation criteria. Moreover, the evaluation information on the teaching quality of universities is often ambiguous and hesitant because of the vagueness existing in human judgments. Previous studies on UTAE have paid subtle attention towards the managing of linguistic expressions and the performance priority of universities. The interval-valued hesitant fuzzy linguistic sets (IVHFLSs) can effectively describe uncertainty, hesitancy, and inconsistency inherent in decision-making process. The ORESTE (organisation, rangement et synthèse de données relationnelles, in French) is a new outranking decision-making method which can show detailed distinctions between alternatives. Therefore, in this study, we propose a new UTAE approach based on the IVHFLSs and ORESTE method to resolve the prioritization of universities for selecting the optimal university to benchmark. Specifically, the presented method handles the hesitant and uncertain linguistic expressions of experts by adopting the IVHFLSs and determines the ranking of universities with an extended ORESTE approach. Finally, a practical UTAE example illustrates the feasibility of the proposed approach and a comparison analysis provides grounding for the superiority of the integrated approach. When the obtained results are evaluated, U2 has been determined as the best university. The results indicate the good performance of the proposed UTAE approach in evaluating and improving the teaching quality of universities.

Jermia, Gabriella. "The Usability of Kamishibai Card in Patient Safety: A Literature Review". Fundamental and Management Nursing Journal 5, no. 2 (1 October 2022): 51–54. http://dx.doi.org/10.20473/fmnj.v5i2.36837.

Abstract:

Introduction: Monitoring patients’ safety in real time and compliance requires a valid instrument. The Kamishibai Card, or K-Card, was introduced as an instrument to monitor central line-associated bloodstream infections (CLABSIs) associated with nosocomial infection. However, the effectiveness of the instrument remains inconsistent. The study aims to review the effectiveness of the K-Card. Methods: This study applied a systematic review of published papers retrieved from the databases Elsevier Science, Web of Science, EBSCO, and Science Direct. Inclusion was limited to full English articles published from January 1, 2012 to February 28, 2022. The JBI Critical Appraisal Checklist for systematic reviews and research syntheses was applied to each study. The feasibility study of methodology using CASP tools Meta-analysis was performed to analyze the articles. Results: A total of four articles were retrieved for analysis and synthesis. The implementation of the K-Card as an audit tool showed positive results. It leads to quick identification of high-risk patients, increasing patient satisfaction, helping the frontline staff educate patients' families, and helping the leaders have better communication with their staff. It simplifies the audit process for patient safety with real-time data and direct feedback to solve problems. Conclusions: The use of the K-Card allows the leaders and staff to solve their daily problems in real time that relate to patient safety, allows direct feedback and creates a bond between leaders and staff, increases patient trust and satisfaction, and also enables timely root cause analyses to improve patient care.

Theses on the topic "Synthèse audio"

Coulibaly, Patrice Yefoungnigui. "Codage audio à bas débit avec synthèse sinusoïdale". Mémoire, Université de Sherbrooke, 2000. http://savoirs.usherbrooke.ca/handle/11143/1078.

Abstract:

The objectives of our research fall under two main points: 1) To explore parametric coding techniques based on sinusoidal synthesis and apply them to audio signals (mainly music). 2) To improve the intrinsic quality of these models, in particular with regard to the time/frequency trade-offs inherent in transform coding. As our methodology, we ran simulations in C and MATLAB of recent sinusoidal synthesis algorithms, drawing in particular on the MSLPC (Multisinusoid LPC) coder of Wen-Whei C., De-Yu W., and Li-Wei W. of National Chiao-Tung University, Taiwan (5). This thesis contains four chapters. Chapter 1 gives an introduction and context. Chapter 2 gives an overview of parametric coding and the value of the technique, followed by a presentation of the existing types of parametric coders. Chapter 3 describes the successive steps in designing a sinusoidal-synthesis coder using recently developed methods. Chapter 4 presents the design and rigorous implementation of the model, together with the time/frequency trade-off we propose to improve the intrinsic quality of the sinusoidal coder. Chapter 4 also presents an informal evaluation of our model's performance. The thesis ends with a conclusion.
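
As a rough illustration of the decoder side of sinusoidal synthesis, the sketch below sums a handful of sinusoids with fixed frequency, amplitude, and phase. It is not the coder described in the thesis: the function name `synthesize`, the sample rate, and the example partials are assumptions made here, and a real parametric coder would re-estimate and interpolate these parameters frame by frame.

```python
import math


def synthesize(partials, duration, sample_rate=8000):
    """Return samples of a sum of stationary sinusoids.

    partials: list of (frequency_hz, amplitude, phase_rad) tuples.
    duration: length of the output in seconds.
    """
    n_samples = int(duration * sample_rate)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate  # time of sample i in seconds
        value = sum(a * math.sin(2 * math.pi * f * t + p) for f, a, p in partials)
        samples.append(value)
    return samples


# Hypothetical example: a 440 Hz tone with a weaker octave partial.
signal = synthesize([(440.0, 0.5, 0.0), (880.0, 0.25, 0.0)], duration=0.5)
```

Each tuple is one "parameter set" in the parametric-coding sense: transmitting a few such triples per frame is what keeps the bit rate low compared with coding the waveform itself.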

Oger, Marie. "Model-based techniques for flexible speech and audio coding". Nice, 2007. http://www.theses.fr/2007NICE4109.

Abstract:

The objective of this thesis is to develop optimal speech and audio coding techniques that are more flexible than the state of the art and can adapt in real time to various constraints (rate, bandwidth, delay). This problem is addressed using several tools: statistical models, high-rate quantization theory, and flexible entropy coding. First, a novel method of flexible coding for linear prediction coding (LPC) coefficients is proposed, using the Karhunen-Loève transform (KLT) and scalar quantization based on generalized Gaussian modelling. This method has a performance equivalent to the LPC quantizer used in AMR-WB with a lower complexity. Then, two transform audio coding structures are proposed, using either stack-run coding or model-based bit plane coding. In both cases the coefficients after perceptual weighting and modified discrete cosine transform (MDCT) are approximated by a generalized Gaussian distribution. The coding of MDCT coefficients is optimized according to this model. The performance is compared with that of ITU-T G.722.1. The stack-run coder is better than G.722.1 at low bit rates and equivalent at high bit rates. However, the computational complexity of the proposed stack-run coder is higher and the memory requirement is low. The bit plane coder has the advantage of being bit rate scalable. The generalized Gaussian model is used to initialize the probability tables of an arithmetic coder. The bit plane coder is worse than stack-run coding at low bit rates and equivalent at high bit rates. It has a computational complexity close to G.722.1 while the memory requirement is still low.

Liuni, Marco. "Adaptation Automatique de la Résolution pour l'Analyse et la Synthèse du Signal Audio". PhD thesis, Université Pierre et Marie Curie - Paris VI, 2012. http://tel.archives-ouvertes.fr/tel-00773550.

Abstract:

This thesis addresses methods for locally varying the time-frequency resolution in sound analysis and re-synthesis. In time-frequency analysis, adaptivity is the ability to design representations and operators whose characteristics can be modified according to the objects being analyzed: the first objective of this work is the formal definition of a mathematical framework that can give rise to adaptive methods for sound analysis. The second is to make the adaptation automatic; criteria are established for locally defining the best time-frequency resolution by optimizing suitable sparsity measures. To exploit adaptivity in spectral sound processing, efficient reconstruction methods are introduced, based on variable-resolution analyses and designed to preserve and improve current sound-manipulation techniques. The main idea is that adaptive algorithms can help simplify the use of sound-processing methods that today require a high level of expertise. In particular, the need for detailed manual configuration is a major limitation in consumer applications of high-quality sound processing (for example, transposition and time compression/stretching). We show examples where automatic management of the time-frequency resolution not only significantly reduces the number of parameters to set but also improves processing quality.

Olivero, Anaik. "Les multiplicateurs temps-fréquence : Applications à l’analyse et la synthèse de signaux sonores et musicaux". Thesis, Aix-Marseille, 2012. http://www.theses.fr/2012AIXM4788/document.

Abstract:

This thesis lies in the context of the analysis/transformation/synthesis of audio signals using time-frequency representations of the Gabor transform type. In this context, the complexity of the transformations relating sounds can be modeled by Gabor multipliers: linear signal operators characterized by a complex-valued time-frequency transfer function called the Gabor mask. Gabor multipliers formalize the concept of filtering in the time-frequency plane. Because they act multiplicatively in the time-frequency plane, they are a priori well suited to sound transformations such as modifications of timbre. First, this thesis addresses the modeling of the problem of estimating a Gabor mask between two given signals and the development of efficient computational methods for solving it. The Gabor multiplier between two signals is not uniquely defined, and the proposed estimation techniques make it possible to construct multipliers that produce sound signals of satisfactory quality. Second, we show that Gabor masks contain relevant information capable of establishing a classification of signals, and we propose strategies for automatically locating the time-frequency regions involved in differentiating two classes of signals. Finally, we show that Gabor multipliers provide a whole range of sound transformations between two sounds, which in some situations can be guided by timbre descriptors.

Analysis/Transformation/Synthesis is a general paradigm in signal processing that aims at manipulating or generating signals for practical applications. This thesis deals with time-frequency representations obtained with Gabor atoms. In this context, the complexity of a sound transformation can be modeled by a Gabor multiplier. Gabor multipliers are linear diagonal operators acting on signals, and are characterized by a complex-valued time-frequency transfer function called the Gabor mask. Gabor multipliers make it possible to formalize the concept of filtering in the time-frequency domain. As they act by multiplying in the time-frequency domain, they are a priori well adapted to produce sound transformations like timbre transformations. In a first part, this work proposes to model the problem of Gabor mask estimation between two given signals, and provides algorithms to solve it. The Gabor multiplier between two signals is not uniquely defined, and the proposed estimation strategies are able to generate Gabor multipliers that produce signals with a satisfactory sound quality. In a second part, we show that a Gabor mask contains relevant information, as it can be viewed as a time-frequency representation of the difference of timbre between two given sounds. By averaging the energy contained in a Gabor mask, we obtain a measure of this difference that makes it possible to discriminate between different musical instrument sounds. We also propose strategies to automatically localize the time-frequency regions responsible for such a timbre dissimilarity between musical instrument classes. Finally, we show that the Gabor multipliers can be used to construct many sound-morphing trajectories, and propose an extension
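
For orientation, the usual textbook definition of a Gabor multiplier can be written as below. The notation (analysis window g, synthesis window h, time step a, frequency step b, mask m) is assumed here for illustration and is not taken from the thesis, which may use different conventions.

```latex
% Gabor atoms: the window g shifted in time by k a and modulated in frequency by n b
g_{k,n}(t) = e^{2\pi i\, n b\, t}\, g(t - k a)

% Gabor multiplier with mask m: analyse with g, weight each coefficient, resynthesise with h
\mathbb{M}_{m}\, x \;=\; \sum_{k}\sum_{n} m(k,n)\, \langle x,\, g_{k,n} \rangle\, h_{k,n}
```

When the mask m is identically 1 and h is a dual window of g, the operator reduces to the identity, which is why a mask can be read as a time-frequency "filter" applied between analysis and resynthesis.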

Renault, Lenny. "Neural audio synthesis of realistic piano performances". Electronic Thesis or Diss., Sorbonne université, 2024. http://www.theses.fr/2024SORUS196.

Abstract:

Musician and instrument make up a central duo in the musical experience. Inseparable, they are the key actors of the musical performance, transforming a composition into an emotional auditory experience. To this end, the instrument is a sound device that the musician controls to transcribe and share their understanding of a musical work. Access to the sound of such instruments, often the result of advanced craftsmanship, and to the mastery of playing them, can require extensive resources that limit the creative exploration of composers. This thesis explores the use of deep neural networks to reproduce the subtleties introduced by the musician's playing and the sound of the instrument, making the music realistic and alive. Focusing on piano music, the conducted work has led to a sound synthesis model for the piano, as well as an expressive performance rendering model. DDSP-Piano, the piano synthesis model, is built upon the hybrid approach of Differentiable Digital Signal Processing (DDSP), which enables the inclusion of traditional signal processing tools into a deep learning model. The model takes symbolic performances as input and explicitly includes instrument-specific knowledge, such as inharmonicity, tuning, and polyphony. This modular, lightweight, and interpretable approach synthesizes sounds of realistic quality while separating the various components that make up the piano sound. As for the performance rendering model, the proposed approach enables the transformation of MIDI compositions into symbolic expressive interpretations. In particular, thanks to unsupervised adversarial training, it stands out from previous works by not relying on aligned score-performance training pairs to reproduce expressive qualities. The combination of the sound synthesis and performance rendering models would enable the synthesis of expressive audio interpretations of scores, while enabling modification of the generated interpretations in the symbolic domain.
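
The abstract names inharmonicity among the instrument-specific knowledge built into DDSP-Piano. As a rough illustration only, the sketch below uses the textbook stiff-string relation f_n = n * f0 * sqrt(1 + B * n^2), where B is the inharmonicity coefficient. The function name, the example coefficient, and the choice of this formula are assumptions made here, not details taken from the thesis.

```python
import math


def partial_frequencies(f0, inharmonicity, count):
    """Return the first `count` partial frequencies (Hz) of a stiff string.

    f0: fundamental frequency of the ideal (perfectly flexible) string, in Hz.
    inharmonicity: dimensionless coefficient B; B = 0 gives exact harmonics.
    """
    return [
        n * f0 * math.sqrt(1.0 + inharmonicity * n * n)
        for n in range(1, count + 1)
    ]


# Hypothetical values: A4 with a small, piano-like inharmonicity coefficient.
stretched = partial_frequencies(440.0, 1e-4, 8)
```

With a positive coefficient, each partial sits slightly above the corresponding exact multiple of f0, and the deviation grows with the partial index. This is the effect an additive piano synthesizer has to reproduce to sound realistic.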

Molina, Villota Daniel Hernán. "Vocal audio effects : tuning, vocoders, interaction". Electronic Thesis or Diss., Sorbonne université, 2024. http://www.theses.fr/2024SORUS166.

Abstract:

Cette recherche se concentre sur l'utilisation d'effets audio numériques (DAFx) sur les pistes vocales dans la musique moderne ; on étudie principalement la correction de la hauteur et le vocoding. Malgré son utilisation répandue, il n'y a pas eu suffisamment de discussions sur la manière d'améliorer l'autotune ou sur ce qui rend une modification de la hauteur plus intéressante d'un point de vue musical. Une analyse taxonomique des effets vocaux a été réalisée, montrant des exemples de la manière dont les effets peuvent préserver ou transformer l'identité vocale et leur utilisation musicale, en particulier traitant la modification de la hauteur. En outre, un recueil de termes technico-musicaux a été élaboré pour distinguer les types de tuning vocal et les cas de correction de la hauteur. Une méthode de correction de la hauteur est proposée pour son utilisation vocale : Dynamic Pitch Warping (DPW). Cette méthode est validée par des courbes de hauteur théoriques (appuyées par l'audio) et comparée à une méthode de référence. Bien que le vocodeur soit essentiel pour la correction de la hauteur, il y a un manque de base descriptive et comparative pour les techniques de vocodeur. Par conséquent, une description sonore du vocodeur est proposée, compte tenu de son utilisation pour le tuning, en utilisant quatre algorithmes différents : Antares, Retune, World et Circe. Ensuite, une évaluation psychoacoustique subjective est réalisée pour comparer les quatre systèmes dans les cas suivants : resynthèse de la tonalité originale, correction vocale douce et correction vocale extrême. Cette évaluation psychoacoustique cherche à comprendre la coloration de chaque vocodeur (préservation de l'identité vocale) et le rôle de la mélodie dans la correction vocale extrême. Aussi, un protocole d'évaluation subjective des méthodes de correction de la hauteur est proposé et mis en œuvre. Ce protocole compare notre méthode de correction de hauteur DPW à la méthode de référence ATA.
Cette étude vise à déterminer s'il existe des différences perceptives entre les systèmes et dans quels cas elles se produisent, ce qui est utile pour développer de nouvelles méthodes de modification mélodique à l'avenir. Enfin, l'utilisation interactive des effets vocaux a été explorée, en capturant le mouvement des mains à l'aide de capteurs sans fil et en le mappant pour contrôler les effets qui modifient la perception de l'espace et de la mélodie vocale.
This research focuses on the use of digital audio effects (DAFx) on vocal tracks in modern music, mainly pitch correction and vocoding. Despite its widespread use, there has not been enough discussion on how to improve autotune or what makes a pitch-modification more musically interesting. A taxonomic analysis of vocal effects has been conducted, demonstrating examples of how they can preserve or transform vocal identity and their musical use, particularly with pitch modification. Furthermore, a compendium of technical-musical terms has been developed to distinguish types of vocal tuning and cases of pitch correction. Additionally, a graphical correction method for vocal pitch correction is proposed. This method is validated with theoretical pitch curves (supported by audio) and compared with a reference method. Although the vocoder is essential for pitch correction, there is a lack of descriptive and comparative basis for vocoding techniques. Therefore, a sonic description of the vocoder is proposed, given its use for tuning, employing four different techniques: Antares, Retune, World, and Circe. Subsequently, a subjective psychoacoustic evaluation is conducted to compare the four systems in the following cases: original tone resynthesis, soft vocal correction, and extreme vocal correction. This psychoacoustic evaluation seeks to understand the coloring of each vocoder (preservation of vocal identity) and the role of melody in extreme vocal correction. Furthermore, a protocol for the subjective evaluation of pitch correction methods is proposed and implemented. This protocol compares our DPW pitch correction method with the ATA reference method. This study aims to determine if there are perceptual differences between the systems and in which cases they occur, which is useful for developing new melodic modification methods in the future. 
Finally, the interactive use of vocal effects has been explored, capturing hand movement with wireless sensors and mapping it to control effects that modify the perception of space and vocal melody.

Meynard, Adrien. "Stationnarités brisées : approches à l'analyse et à la synthèse". Thesis, Aix-Marseille, 2019. http://www.theses.fr/2019AIXM0475.

Abstract:

La non-stationnarité est caractéristique des phénomènes physiques transitoires. Par exemple, elle peut être engendrée par la variation de vitesse d'un moteur lors d'une accélération. De même, du fait de l'effet Doppler, un son stationnaire émis par une source en mouvement sera perçu comme étant non stationnaire par un observateur fixe. Ces exemples nous conduisent à considérer une classe de non-stationnarité formée des signaux stationnaires dont la stationnarité a été brisée par un opérateur de déformation physiquement pertinent. Après avoir décrit les modèles de déformation considérés (chapitre 1), nous présentons différentes méthodes permettant d'étendre l'analyse et la synthèse spectrale à de tels signaux. L'estimation spectrale des signaux revient à déterminer le spectre du processus stationnaire sous-jacent et la déformation ayant brisé sa stationnarité. Ainsi, dans le chapitre 2, nous nous intéressons à l'analyse de signaux localement déformés pour lesquels la déformation subie s'exprime simplement comme un déplacement des coefficients d'ondelettes dans le plan temps-échelle. Nous tirons profit de cette propriété pour proposer l'algorithme d'estimation du spectre instantané JEFAS. Dans le chapitre 3, nous étendons cette analyse spectrale aux signaux multi-capteurs pour lesquels l'opérateur de déformation prend une forme matricielle. Il s'agit d'un problème de séparation de sources doublement non stationnaire. Dans le chapitre 4, nous proposons une approche à la synthèse pour étudier des signaux localement déformés. Enfin, dans le chapitre 5, nous construisons une représentation temps-fréquence adaptée à l'étude des signaux localement harmoniques.
Nonstationarity characterizes transient physical phenomena. For example, it may be caused by a speed variation of an accelerating engine. Similarly, because of the Doppler effect, a stationary sound emitted by a moving source is perceived as being nonstationary by a motionless observer. These examples lead us to consider a class of nonstationary signals formed from stationary signals whose stationarity has been broken by a physically relevant deformation operator. After describing the considered deformation models (chapter 1), we present different methods that extend the spectral analysis and synthesis to such signals. The spectral estimation amounts to determining simultaneously the spectrum of the underlying stationary process and the deformation breaking its stationarity. To this end, we consider representations of the signal in which this deformation is characterized by a simple operation. Thus, in chapter 2, we are interested in the analysis of locally deformed signals. The deformation describing these signals is simply expressed as a displacement of the wavelet coefficients in the time-scale domain. We take advantage of this property to develop a method for the estimation of these displacements. Then, we propose an instantaneous spectrum estimation algorithm, named JEFAS. In chapter 3, we extend this spectral analysis to multi-sensor signals where the deformation operator takes a matrix form. This is a doubly nonstationary blind source separation problem. In chapter 4, we propose a synthesis approach to study locally deformed signals. Finally, in chapter 5, we construct a time-frequency representation adapted to the description of locally harmonic signals
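
As an illustration of the idea of a stationary signal whose stationarity is broken by a deformation, the sketch below time-warps a pure tone. It is not the JEFAS algorithm and is not taken from the thesis: the warping form sqrt(γ'(t))·x(γ(t)), the particular γ, and the names `warp`, `zero_crossings`, `gamma` and `dgamma` are assumptions chosen for the example. Counting zero crossings in two successive one-second windows gives a rough indication that the warped tone's frequency drifts upward, while the original tone stays constant.

```python
import math

def warp(x, gamma, dgamma):
    """Return the time-warped signal t -> sqrt(gamma'(t)) * x(gamma(t)).

    `gamma` is assumed to be smooth and strictly increasing (gamma' > 0).
    """
    return lambda t: math.sqrt(dgamma(t)) * x(gamma(t))

def zero_crossings(signal, t_start, t_end, rate=8000):
    """Count sign changes of `signal` sampled at `rate` Hz on [t_start, t_end)."""
    n = int((t_end - t_start) * rate)
    samples = [signal(t_start + k / rate) for k in range(n)]
    return sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0)

F0 = 100.0  # frequency of the underlying stationary tone, in Hz

def tone(t):
    return math.sin(2 * math.pi * F0 * t)

# Hypothetical warping function: gamma(t) = t + 0.25 t^2, so gamma'(t) = 1 + 0.5 t.
def gamma(t):
    return t + 0.25 * t * t

def dgamma(t):
    return 1.0 + 0.5 * t

warped = warp(tone, gamma, dgamma)

# The stationary tone has about the same number of crossings in each window.
tone_first = zero_crossings(tone, 0.0, 1.0)
tone_second = zero_crossings(tone, 1.0, 2.0)

# The warped tone has more crossings in the later window, because its
# instantaneous frequency F0 * gamma'(t) grows with t.
warped_first = zero_crossings(warped, 0.0, 1.0)
warped_second = zero_crossings(warped, 1.0, 2.0)
```

The amplitude factor sqrt(γ'(t)) is positive, so it does not affect the crossing counts; only the warping of the time axis does.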

Nistal, Hurlé Javier. "Exploring generative adversarial networks for controllable musical audio synthesis". Electronic Thesis or Diss., Institut polytechnique de Paris, 2022. http://www.theses.fr/2022IPPAT009.

Abstract:

Les synthétiseurs audio sont des instruments de musique électroniques qui génèrent des sons artificiels sous un certain contrôle paramétrique. Alors que les synthétiseurs ont évolué depuis leur popularisation dans les années 70, deux défis fondamentaux restent encore non résolus : 1) le développement de systèmes de synthèse répondant à des paramètres sémantiquement intuitifs ; 2) la conception de techniques de synthèse « universelles », indépendantes de la source à modéliser. Cette thèse étudie l’utilisation des réseaux adversariaux génératifs (ou GAN) pour construire de tels systèmes. L’objectif principal est de rechercher et de développer de nouveaux outils pour la production musicale, qui offrent des moyens intuitifs de manipulation du son, par exemple en contrôlant des paramètres qui répondent aux propriétés perceptives du son et à d’autres caractéristiques. Notre premier travail étudie les performances des GAN lorsqu’ils sont entraînés sur diverses représentations de signaux audio. Ces expériences comparent différentes formes de données audio dans le contexte de la synthèse sonore tonale. Les résultats montrent que la représentation magnitude-fréquence instantanée et la transformée de Fourier à valeur complexe obtiennent les meilleurs résultats. En s’appuyant sur ce résultat, notre travail suivant présente DrumGAN, un synthétiseur audio de sons percussifs. En conditionnant le modèle sur des caractéristiques perceptives décrivant des propriétés timbrales de haut niveau, nous démontrons qu’un contrôle intuitif peut être obtenu sur le processus de génération. Ce travail aboutit au développement d’un plugin VST générant de l’audio haute résolution. La rareté des annotations dans les ensembles de données audio musicales remet en cause l’application de méthodes supervisées pour la génération conditionnelle. On utilise une approche de distillation des connaissances pour extraire de telles annotations à partir d’un système d’étiquetage audio préentraîné.
DarkGAN est un synthétiseur de sons tonaux qui utilise les probabilités de sortie d’un tel système (appelées « étiquettes souples ») comme informations conditionnelles. Les résultats montrent que DarkGAN peut répondre modérément à de nombreux attributs intuitifs, même avec un conditionnement d’entrée hors distribution. Les applications des GAN à la synthèse audio apprennent généralement à partir de données de spectrogramme de taille fixe. Nous abordons cette limitation en exploitant une méthode auto-supervisée pour l’apprentissage de caractéristiques discrètes à partir de données séquentielles. De telles caractéristiques sont utilisées comme entrée conditionnelle pour fournir au modèle des informations dépendant du temps par étapes. La cohérence globale est assurée en fixant le bruit d’entrée z (caractéristique en GANs). Les résultats montrent que, tandis que les modèles entraînés sur un schéma de taille fixe obtiennent une meilleure qualité et diversité audio, les nôtres peuvent générer avec compétence un son de n’importe quelle durée. Une direction de recherche intéressante est la génération d’audio conditionnée par du matériel musical préexistant. Nous étudions si un générateur GAN, conditionné sur des signaux audio musicaux hautement compressés, peut générer des sorties ressemblant à l’audio non compressé d’origine. Les résultats montrent que le GAN peut améliorer la qualité des signaux audio par rapport aux versions MP3 pour des taux de compression très élevés (16 et 32 kbit/s). En conséquence directe de l’application de techniques d’intelligence artificielle dans des contextes musicaux, nous nous demandons comment la technologie basée sur l’IA peut favoriser l’innovation dans la pratique musicale. 
Par conséquent, nous concluons cette thèse en offrant une large perspective sur le développement d’outils d’IA pour la production musicale, éclairée par des considérations théoriques et des rapports d’utilisation d’outils d’IA dans le monde réel par des artistes professionnels.
Audio synthesizers are electronic musical instruments that generate artificial sounds under some parametric control. While synthesizers have evolved since they were popularized in the 70s, two fundamental challenges are still unresolved: 1) the development of synthesis systems responding to semantically intuitive parameters; 2) the design of "universal," source-agnostic synthesis techniques. This thesis researches the use of Generative Adversarial Networks (GAN) towards building such systems. The main goal is to research and develop novel tools for music production that afford intuitive and expressive means of sound manipulation, e.g., by controlling parameters that respond to perceptual properties of the sound and other high-level features. Our first work studies the performance of GANs when trained on various common audio signal representations (e.g., waveform, time-frequency representations). These experiments compare different forms of audio data in the context of tonal sound synthesis. Results show that the Magnitude and Instantaneous Frequency of the phase and the complex-valued Short-Time Fourier Transform achieve the best results. Building on this, our following work presents DrumGAN, a controllable adversarial audio synthesizer of percussive sounds. By conditioning the model on perceptual features describing high-level timbre properties, we demonstrate that intuitive control can be gained over the generation process. This work results in the development of a VST plugin generating full-resolution audio and compatible with any Digital Audio Workstation (DAW). We show extensive musical material produced by professional artists from Sony ATV using DrumGAN. The scarcity of annotations in musical audio datasets challenges the application of supervised methods to conditional generation settings. Our third contribution employs a knowledge distillation approach to extract such annotations from a pre-trained audio tagging system. 
DarkGAN is an adversarial synthesizer of tonal sounds that employs the output probabilities of such a system (so-called “soft labels”) as conditional information. Results show that DarkGAN can respond moderately to many intuitive attributes, even with out-of-distribution input conditioning. Applications of GANs to audio synthesis typically learn from fixed-size two-dimensional spectrogram data analogously to the "image data" in computer vision; thus, they cannot generate sounds with variable duration. In our fourth paper, we address this limitation by exploiting a self-supervised method for learning discrete features from sequential data. Such features are used as conditional input to provide step-wise time-dependent information to the model. Global consistency is ensured by fixing the input noise z (characteristic in adversarial settings). Results show that, while models trained on a fixed-size scheme obtain better audio quality and diversity, ours can competently generate audio of any duration. One interesting direction for research is the generation of audio conditioned on preexisting musical material, e.g., the generation of some drum pattern given the recording of a bass line. Our fifth paper explores a simple pretext task tailored at learning such types of complex musical relationships. Concretely, we study whether a GAN generator, conditioned on highly compressed MP3 musical audio signals, can generate outputs resembling the original uncompressed audio. Results show that the GAN can improve the quality of the audio signals over the MP3 versions for very high compression rates (16 and 32 kbit/s). As a direct consequence of applying artificial intelligence techniques in musical contexts, we ask how AI-based technology can foster innovation in musical practice. 
Therefore, we conclude this thesis by providing a broad perspective on the development of AI tools for music production, informed by theoretical considerations and reports from real-world AI tool usage by professional artists.

Tiger, Guillaume. "Synthèse sonore d'ambiances urbaines pour les applications vidéoludiques". Thesis, Paris, CNAM, 2014. http://www.theses.fr/2015CNAM0968/document.

Abstract:

Suite à un état de l'art détaillant la création et l'utilisation de l'espace sonore dans divers environnements urbains virtuels (soundmaps, jeux vidéo, réalité augmentée), il s'agira de déterminer une méthodologie et des techniques de conception pour les espaces sonores urbains virtuels du point de vue de l'immersion, de l'interface et de la dramaturgie. Ces développements se feront dans le cadre du projet Terra Dynamica, tendant vers une utilisation plurielle de la ville virtuelle (sécurité et sûreté, transports de surface, aménagement de l'urbanisme, services de proximité et citoyens, jeux). Le principal objectif du doctorat sera de déterminer des réponses informatiques concrètes à la problématique suivante : comment, en fonction de leur utilisation anticipée, les espaces sonores urbains virtuels doivent-ils être structurés et avec quels contenus ? La formalisation informatique des solutions étayées au fil du doctorat et la création du contenu sonore illustrant le projet seront basés sur l'analyse de données scientifiques provenant de domaines variés tels que la psychologie de la perception, l'architecture et l'urbanisme, l'acoustique, la recherche esthétique (musicale) ainsi que sur l'observation et le recueil de données audio-visuelles du territoire urbain, de manière à rendre compte tant de la richesse du concept d'espace sonore que de la multiplicité de ses déclinaisons dans le cadre de la ville virtuelle.
In video gaming and interactive media, the making of complex sound ambiences relies heavily on the allowed memory and computational resources. A compromise is therefore necessary regarding the choice of audio material and its treatment in order to reach immersive and credible real-time ambiences. Alternatively, the use of procedural audio techniques, i.e. the generation of audio content from the data provided by the virtual scene, has increased in recent years. Procedural methodologies seem appropriate to sonify complex environments such as virtual cities. In this thesis we specifically focus on the creation of interactive urban sound ambiences. Our analysis of these ambiences is based on the Soundscape theory and on a state of the art on game-oriented urban interactive applications. We infer that the virtual urban soundscape is made of several perceptive auditory grounds including a background. As a first contribution we define the morphological and narrative properties of such a background. We then consider the urban background sound as a texture and propose, as a second contribution, to pinpoint, specify and prototype a granular synthesis tool dedicated to interactive urban sound backgrounds. The synthesizer prototype is created using the visual programming language Pure Data. On the basis of our state of the art, we include an urban ambiences recording methodology to feed the granular synthesis. Finally, two validation steps regarding the prototype are described: the integration into the virtual city simulation Terra Dynamica on the one hand and a perceptive listening comparison test on the other.
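
For readers unfamiliar with granular synthesis, the following minimal sketch shows the general principle of building a sound texture from short windowed grains of a recording. It is not the author's Pure Data prototype: it is written in Python, uses white noise as a stand-in for a recorded urban ambience, and the names `hann` and `granulate` and all parameter values are assumptions made for the illustration.

```python
import math
import random

def hann(n):
    """Hann window of length n (n >= 2)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]

def granulate(source, out_len, grain_len=512, hop=256, seed=0):
    """Build a texture of `out_len` samples by overlap-adding windowed
    grains cut at random positions from `source` (a list of floats)."""
    if len(source) < grain_len:
        raise ValueError("source must be at least one grain long")
    rng = random.Random(seed)
    window = hann(grain_len)
    out = [0.0] * out_len
    # Place one grain every `hop` samples in the output.
    for start in range(0, out_len, hop):
        pos = rng.randrange(len(source) - grain_len + 1)
        for k in range(grain_len):
            if start + k >= out_len:
                break  # the last grains are truncated at the end of the output
            out[start + k] += window[k] * source[pos + k]
    return out

# Stand-in for a recorded ambience: one second of white noise at 8 kHz.
noise_rng = random.Random(1)
recording = [noise_rng.uniform(-1.0, 1.0) for _ in range(8000)]

# Two seconds of texture generated from the one-second recording.
texture = granulate(recording, out_len=16000)
```

Because grains are drawn at random positions, the output can be made longer than the source material, which is one reason the technique suits background ambiences under memory constraints. With a hop of half the grain length, overlapping Hann windows sum to approximately one, so the output level stays close to that of the source.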

Musti, Utpala. "Synthèse acoustico-visuelle de la parole par sélection d'unités bimodales". Thesis, Université de Lorraine, 2013. http://www.theses.fr/2013LORR0003.

Abstract:

Ce travail porte sur la synthèse de la parole audio-visuelle. Dans la littérature disponible dans ce domaine, la plupart des approches traite le problème en le divisant en deux problèmes de synthèse. Le premier est la synthèse de la parole acoustique et l'autre étant la génération d'animation faciale correspondante. Mais, cela ne garantit pas une parfaite synchronisation et cohérence de la parole audio-visuelle. Pour pallier implicitement l'inconvénient ci-dessus, nous avons proposé une approche de synthèse de la parole acoustique-visuelle par la sélection naturelle des unités synchrones bimodales. La synthèse est basée sur le modèle de sélection d'unité classique. L'idée principale derrière cette technique de synthèse est de garder l'association naturelle entre la modalité acoustique et visuelle intacte. Nous décrivons la technique d'acquisition de corpus audio-visuelle et la préparation de la base de données pour notre système. Nous présentons une vue d'ensemble de notre système et nous détaillons les différents aspects de la sélection d'unités bimodales qui ont besoin d'être optimisées pour une bonne synthèse. L'objectif principal de ce travail est de synthétiser la dynamique de la parole plutôt qu'une tête parlante complète. Nous décrivons les caractéristiques visuelles cibles que nous avons conçues. Nous avons ensuite présenté un algorithme de pondération de la fonction cible. Cet algorithme que nous avons développé effectue une pondération de la fonction cible et l'élimination de fonctionnalités redondantes de manière itérative. Elle est basée sur la comparaison des classements de coûts cible et en se basant sur une distance calculée à partir des signaux de parole acoustiques et visuels dans le corpus. Enfin, nous présentons l'évaluation perceptive et subjective du système de synthèse final. Les résultats montrent que nous avons atteint l'objectif de synthétiser la dynamique de la parole raisonnablement bien
This work deals with audio-visual speech synthesis. In the vast literature available in this direction, many of the approaches deal with it by dividing it into two synthesis problems. One is acoustic speech synthesis and the other is the generation of the corresponding facial animation. However, this does not guarantee perfectly synchronous and coherent audio-visual speech. To overcome the above drawback implicitly, we proposed a different approach of acoustic-visual speech synthesis by the selection of naturally synchronous bimodal units. The synthesis is based on the classical unit selection paradigm. The main idea behind this synthesis technique is to keep the natural association between the acoustic and visual modality intact. We describe the audio-visual corpus acquisition technique and database preparation for our system. We present an overview of our system and detail the various aspects of bimodal unit selection that need to be optimized for good synthesis. The main focus of this work is to synthesize the speech dynamics well rather than a comprehensive talking head. We describe the visual target features that we designed. We subsequently present an algorithm for target feature weighting. This algorithm that we developed performs target feature weighting and redundant feature elimination iteratively. This is based on the comparison of target cost based ranking and a distance calculated based on the acoustic and visual speech signals of units in the corpus. Finally, we present the perceptual and subjective evaluation of the final synthesis system. The results show that we have achieved the goal of synthesizing the speech dynamics reasonably well.

Books on the topic "Synthèse audio"

Kunow, Kristian. Rundfunk und Internet: These, Antithese, Synthese? Edited by Arbeitsgemeinschaft der Landesmedienanstalten in der Bundesrepublik Deutschland. Berlin: Vistas, 2013.

Kamajou, François, 1944-, and Cameroon. Ministry of Scientific Research, eds. Audit scientifique de la recherche agricole au Cameroun: Synthèse de l'audit, rapport général. [Yaoundé]: République du Cameroun, Ministère de la recherche scientifique et technique, 1993.

Obert, Robert. Synthèse droit et comptabilité, DESCF numéro 1 : Tome 2 - Audit et commissariat aux comptes. Aspects internationaux. Dunod, 2002.
