Contents
A selection of scholarly literature on the topic "Synthèse audio"
Format your source in APA, MLA, Chicago, Harvard, and other citation styles
Browse lists of current articles, books, dissertations, conference abstracts, and other scholarly sources on the topic "Synthèse audio".
Next to each source in the list of references there is an "Add to bibliography" button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the publication as a .pdf file and read its abstract online, where these are available in the metadata.
Journal articles on the topic "Synthèse audio"
Rioufreyt, Thibaut. "La transcription outillée en SHS. Un panorama des logiciels de transcription audio/vidéo." Bulletin of Sociological Methodology/Bulletin de Méthodologie Sociologique 139, no. 1 (April 24, 2018): 96–133. http://dx.doi.org/10.1177/0759106318762455.
Gauthier, Geneviève, Simonne Couture, and Christina St-Onge. "Jugement évaluatif : confrontation d’un modèle conceptuel à des données empiriques." Pédagogie Médicale 19, no. 1 (February 2018): 15–25. http://dx.doi.org/10.1051/pmed/2019002.
Xu, Shu. "AI Color Organ: Piano Music Visualization using Onset Detection and HistoGAN." Highlights in Science, Engineering and Technology 39 (April 1, 2023): 274–79. http://dx.doi.org/10.54097/hset.v39i.6539.
Dreier, Christian, and Michael Vorländer. "Vehicle pass-by noise auralization in a virtual urban environment." INTER-NOISE and NOISE-CON Congress and Conference Proceedings 265, no. 6 (February 1, 2023): 1907–15. http://dx.doi.org/10.3397/in_2022_0269.
Dick, Michael. "„ Der neue Audi Q5 ist eine Synthese aus Sportlichkeit, innovativer Technologie und einzigartigem Design.“." ATZextra 13, no. 2 (June 2008): 200. http://dx.doi.org/10.1365/s35778-008-0108-z.
Abbas, Wasfi. "Audio-Visual Poetry A Semiotic-Cultural Reading in Interactive Digital Poem (Vision of Hope)." Journal of Umm Al-Qura University for Language Sciences and Literature, no. 28 (August 1, 2021): 253–302. http://dx.doi.org/10.54940/ll78073145.
McGlacken-Byrne, Sinéad M., Nuala P. Murphy, and Sarah Barry. "A realist synthesis of multicentre comparative audit implementation: exploring what works and in which healthcare contexts." BMJ Open Quality 13, no. 1 (March 2024): e002629. http://dx.doi.org/10.1136/bmjoq-2023-002629.
Iftikhar, Hassan, and Yan Luximon. "The syntheses of static and mobile wayfinding information: an empirical study of wayfinding preferences and behaviour in complex environments." Facilities 40, no. 7/8 (March 4, 2022): 452–74. http://dx.doi.org/10.1108/f-06-2021-0052.
Mao, Ling-Xiang, Jing Lan, Zifeng Li, and Hua Shi. "Undergraduate Teaching Audit and Evaluation Using an Extended ORESTE Method with Interval-Valued Hesitant Fuzzy Linguistic Sets." Systems 11, no. 5 (April 23, 2023): 216. http://dx.doi.org/10.3390/systems11050216.
Jermia, Gabriella. "The Usability of Kamishibai Card in Patient Safety: A Literature Review." Fundamental and Management Nursing Journal 5, no. 2 (October 1, 2022): 51–54. http://dx.doi.org/10.20473/fmnj.v5i2.36837.
Повний текст джерелаДисертації з теми "Synthèse audio"
Coulibaly, Patrice Yefoungnigui. "Codage audio à bas débit avec synthèse sinusoïdale." Mémoire, Université de Sherbrooke, 2000. http://savoirs.usherbrooke.ca/handle/11143/1078.
Oger, Marie. "Model-based techniques for flexible speech and audio coding." Nice, 2007. http://www.theses.fr/2007NICE4109.
Повний текст джерелаThe objective of this thesis is to develop optimal speech and audio coding techniques which are more flexible than the state of the art and can adapt in real-time to various constraints (rate, bandwidth, delay). This problem is addressed using several tools : statistical models, high-rate quantization theory, flexible entropy coding. Firstly, a novel method of flexible coding for linear prediction coding (LPC) coefficients is proposed using Karhunen-Loeve transform (KLT) and scalar quantization based on generalized Gaussian modelling. This method has a performance equivalent to the LPC quantizer used in AMR-WB with a lower complexity. Then, two transform audio coding structures are proposed using either stack-run coding or model-based bit plane coding. In both case the coefficients after perceptual weighting and modified discrete cosine transform (MDCT) are approximated by a generalized Gaussian distribution. The coding of MDCT coefficients is optimized according to this model. The performance is compared with that of ITU-T G. 7222. 1. The stack-run coder is better than G. 7222. 1 at low bit rates and equivalent at high bit rates. However, the computational complexity of the proposed stack-run coder is higher and the memory requirement is low. The bit plane coder has the advantage of being bit rate scalable. The generalized Gaussian model is used to initialize the probability tables of an arithmetic coder. The bit plane coder is worse than stack-run coding at low bit rates and equivalent at high bit rates. It has a computational complexity close to G. 7222. 1 while memory requirement is still low
Liuni, Marco. "Adaptation Automatique de la Résolution pour l'Analyse et la Synthèse du Signal Audio." PhD thesis, Université Pierre et Marie Curie - Paris VI, 2012. http://tel.archives-ouvertes.fr/tel-00773550.
Olivero, Anaik. "Les multiplicateurs temps-fréquence : Applications à l’analyse et la synthèse de signaux sonores et musicaux." Thesis, Aix-Marseille, 2012. http://www.theses.fr/2012AIXM4788/document.
Повний текст джерелаAnalysis/Transformation/Synthesis is a generalparadigm in signal processing, that aims at manipulating or generating signalsfor practical applications. This thesis deals with time-frequencyrepresentations obtained with Gabor atoms. In this context, the complexity of a soundtransformation can be modeled by a Gabor multiplier. Gabormultipliers are linear diagonal operators acting on signals, andare characterized by a time-frequency transfer function of complex values, called theGabor mask. Gabor multipliers allows to formalize the conceptof filtering in the time-frequency domain. As they act by multiplying in the time-frequencydomain, they are "a priori'' well adapted to producesound transformations like timbre transformations. In a first part, this work proposes to model theproblem of Gabor mask estimation between two given signals,and provides algorithms to solve it. The Gabor multiplier between two signals is not uniquely defined and the proposed estimationstrategies are able to generate Gabor multipliers that produce signalswith a satisfied sound quality. In a second part, we show that a Gabor maskcontain a relevant information, as it can be viewed asa time-frequency representation of the difference oftimbre between two given sounds. By averaging the energy contained in a Gabor mask, we obtain a measure of this difference that allows to discriminate different musical instrumentsounds. We also propose strategies to automaticallylocalize the time-frequency regions responsible for such a timbre dissimilarity between musicalinstrument classes. Finally, we show that the Gabor multipliers can beused to construct a lot of sounds morphing trajectories,and propose an extension
Renault, Lenny. "Neural audio synthesis of realistic piano performances." Electronic Thesis or Diss., Sorbonne université, 2024. http://www.theses.fr/2024SORUS196.
Musician and instrument make up a central duo in the musical experience. Inseparable, they are the key actors of the musical performance, transforming a composition into an emotional auditory experience. To this end, the instrument is a sound device that the musician controls to transcribe and share their understanding of a musical work. Access to the sound of such instruments, often the result of advanced craftsmanship, and to the mastery of playing them, can require extensive resources that limit the creative exploration of composers. This thesis explores the use of deep neural networks to reproduce the subtleties introduced by the musician's playing and the sound of the instrument, making the music realistic and alive. Focusing on piano music, the work conducted has led to a sound synthesis model for the piano, as well as an expressive performance rendering model. DDSP-Piano, the piano synthesis model, is built upon the hybrid approach of Differentiable Digital Signal Processing (DDSP), which enables the inclusion of traditional signal processing tools in a deep learning model. The model takes symbolic performances as input and explicitly includes instrument-specific knowledge, such as inharmonicity, tuning, and polyphony. This modular, lightweight, and interpretable approach synthesizes sounds of realistic quality while separating the various components that make up the piano sound. As for the performance rendering model, the proposed approach transforms MIDI compositions into expressive symbolic interpretations. In particular, thanks to unsupervised adversarial training, it stands out from previous work by not relying on aligned score-performance training pairs to reproduce expressive qualities. Combining the sound synthesis and performance rendering models makes it possible to synthesize expressive audio interpretations of scores, while allowing the generated interpretations to be modified in the symbolic domain.
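One piece of instrument-specific knowledge named in this abstract, inharmonicity, can be illustrated directly. Here is a minimal Python sketch (not DDSP-Piano itself) of additive synthesis with piano-like stretched partials following the stiff-string law f_k = k·f0·sqrt(1 + B·k²); the inharmonicity coefficient B, amplitude rolloff, and decay are assumed values.

```python
# A minimal sketch of inharmonic additive synthesis; all parameters are
# illustrative assumptions, not DDSP-Piano's learned values.
import numpy as np

def inharmonic_tone(f0=220.0, B=1e-4, n_partials=20, dur=1.0, sr=16000):
    t = np.arange(int(dur * sr)) / sr
    k = np.arange(1, n_partials + 1)
    freqs = k * f0 * np.sqrt(1.0 + B * k ** 2)   # stretched (inharmonic) partials
    amps = 1.0 / k                               # simple spectral rolloff
    decay = np.exp(-3.0 * t)                     # global exponential decay
    tone = sum(a * np.sin(2 * np.pi * f * t) for a, f in zip(amps, freqs))
    return decay * tone / np.max(np.abs(tone))

audio = inharmonic_tone()
print(audio.shape)  # (16000,) — one second of piano-like tone
```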
Molina, Villota Daniel Hernán. "Vocal audio effects : tuning, vocoders, interaction." Electronic Thesis or Diss., Sorbonne université, 2024. http://www.theses.fr/2024SORUS166.
This research focuses on the use of digital audio effects (DAFx) on vocal tracks in modern music, mainly pitch correction and vocoding. Despite their widespread use, there has not been enough discussion of how to improve autotune or of what makes a pitch modification more musically interesting. A taxonomic analysis of vocal effects has been conducted, with examples of how they can preserve or transform vocal identity and of their musical use, particularly with pitch modification. A compendium of technical-musical terms has also been developed to distinguish types of vocal tuning and cases of pitch correction. In addition, a graphical method for vocal pitch correction is proposed; it is validated with theoretical pitch curves (supported by audio) and compared with a reference method. Although the vocoder is essential for pitch correction, there is a lack of descriptive and comparative groundwork on vocoding techniques. A sonic description of the vocoder as used for tuning is therefore proposed, employing four different techniques: Antares, Retune, World, and Circe. A subjective psychoacoustic evaluation then compares the four systems in three cases: original tone resynthesis, soft vocal correction, and extreme vocal correction. This evaluation seeks to understand the coloring of each vocoder (preservation of vocal identity) and the role of melody in extreme vocal correction. Furthermore, a protocol for the subjective evaluation of pitch correction methods is proposed and implemented, comparing our DPW pitch correction method with the ATA reference method. This study aims to determine whether there are perceptual differences between the systems and in which cases they occur, which is useful for developing new melodic modification methods in the future. Finally, the interactive use of vocal effects has been explored, capturing hand movement with wireless sensors and mapping it to control effects that modify the perception of space and vocal melody.
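The basic pitch-correction step discussed in this abstract can be sketched in a few lines. The Python example below snaps an estimated f0 contour to the nearest equal-tempered semitone, with a "strength" blend standing in, very roughly, for a retune-speed control; it is an illustration, not the DPW or ATA method from the thesis.

```python
# A minimal sketch of semitone snapping with adjustable correction strength;
# the strength parameter is an illustrative simplification of retune speed.
import numpy as np

def snap_to_semitones(f0_hz, strength=1.0, ref=440.0):
    f0_hz = np.asarray(f0_hz, dtype=float)
    midi = 69.0 + 12.0 * np.log2(f0_hz / ref)        # Hz -> fractional MIDI
    target = np.round(midi)                           # nearest semitone
    corrected = midi + strength * (target - midi)     # partial or full pull
    return ref * 2.0 ** ((corrected - 69.0) / 12.0)   # back to Hz

f0 = [438.0, 442.5, 452.0, 465.0]
print(snap_to_semitones(f0, strength=1.0))  # hard tuning
print(snap_to_semitones(f0, strength=0.5))  # softer correction
```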
Meynard, Adrien. "Stationnarités brisées : approches à l'analyse et à la synthèse." Thesis, Aix-Marseille, 2019. http://www.theses.fr/2019AIXM0475.
Nonstationarity characterizes transient physical phenomena. For example, it may be caused by a speed variation of an accelerating engine. Similarly, because of the Doppler effect, a stationary sound emitted by a moving source is perceived as being nonstationary by a motionless observer. These examples lead us to consider a class of nonstationary signals formed from stationary signals whose stationarity has been broken by a physically relevant deformation operator. After describing the considered deformation models (chapter 1), we present different methods that extend the spectral analysis and synthesis to such signals. The spectral estimation amounts to determining simultaneously the spectrum of the underlying stationary process and the deformation breaking its stationarity. To this end, we consider representations of the signal in which this deformation is characterized by a simple operation. Thus, in chapter 2, we are interested in the analysis of locally deformed signals. The deformation describing these signals is simply expressed as a displacement of the wavelet coefficients in the time-scale domain. We take advantage of this property to develop a method for the estimation of these displacements. Then, we propose an instantaneous spectrum estimation algorithm, named JEFAS. In chapter 3, we extend this spectral analysis to multi-sensor signals where the deformation operator takes a matrix form. This is a doubly nonstationary blind source separation problem. In chapter 4, we propose a synthesis approach to study locally deformed signals. Finally, in chapter 5, we construct a time-frequency representation adapted to the description of locally harmonic signals.
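The deformation model described in this abstract can be simulated directly. The following minimal Python sketch builds a stationary signal and breaks its stationarity with a smooth time warp x(γ(t)), giving a Doppler-like deformation; the warp function here is an assumption chosen for illustration, whereas JEFAS itself estimates such deformations without knowing them.

```python
# A minimal sketch of a stationarity-breaking time warp; the warp gamma(t)
# is an illustrative assumption, not an estimated deformation.
import numpy as np

sr = 8000
t = np.arange(2 * sr) / sr
rng = np.random.default_rng(1)
# Stationary signal: filtered white noise (moving-average smoothing).
x = np.convolve(rng.standard_normal(t.size), np.ones(8) / 8, mode="same")

gamma = t + 0.15 * np.sin(np.pi * t / t[-1])  # smooth, strictly increasing warp
warped = np.interp(gamma, t, x)               # x(gamma(t)): nonstationary signal
print(warped.shape)
```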
Nistal, Hurlé Javier. "Exploring generative adversarial networks for controllable musical audio synthesis." Electronic Thesis or Diss., Institut polytechnique de Paris, 2022. http://www.theses.fr/2022IPPAT009.
Audio synthesizers are electronic musical instruments that generate artificial sounds under some parametric control. While synthesizers have evolved since they were popularized in the 70s, two fundamental challenges are still unresolved: 1) the development of synthesis systems responding to semantically intuitive parameters; 2) the design of "universal," source-agnostic synthesis techniques. This thesis researches the use of Generative Adversarial Networks (GAN) towards building such systems. The main goal is to research and develop novel tools for music production that afford intuitive and expressive means of sound manipulation, e.g., by controlling parameters that respond to perceptual properties of the sound and other high-level features. Our first work studies the performance of GANs when trained on various common audio signal representations (e.g., waveform, time-frequency representations). These experiments compare different forms of audio data in the context of tonal sound synthesis. Results show that the Magnitude and Instantaneous Frequency of the phase and the complex-valued Short-Time Fourier Transform achieve the best results. Building on this, our following work presents DrumGAN, a controllable adversarial audio synthesizer of percussive sounds. By conditioning the model on perceptual features describing high-level timbre properties, we demonstrate that intuitive control can be gained over the generation process. This work results in the development of a VST plugin generating full-resolution audio and compatible with any Digital Audio Workstation (DAW). We show extensive musical material produced by professional artists from Sony ATV using DrumGAN. The scarcity of annotations in musical audio datasets challenges the application of supervised methods to conditional generation settings. Our third contribution employs a knowledge distillation approach to extract such annotations from a pre-trained audio tagging system. DarkGAN is an adversarial synthesizer of tonal sounds that employs the output probabilities of such a system (so-called “soft labels”) as conditional information. Results show that DarkGAN can respond moderately to many intuitive attributes, even with out-of-distribution input conditioning. Applications of GANs to audio synthesis typically learn from fixed-size two-dimensional spectrogram data analogously to the "image data" in computer vision; thus, they cannot generate sounds with variable duration. In our fourth paper, we address this limitation by exploiting a self-supervised method for learning discrete features from sequential data. Such features are used as conditional input to provide step-wise time-dependent information to the model. Global consistency is ensured by fixing the input noise z (characteristic in adversarial settings). Results show that, while models trained on a fixed-size scheme obtain better audio quality and diversity, ours can competently generate audio of any duration. One interesting direction for research is the generation of audio conditioned on preexisting musical material, e.g., the generation of some drum pattern given the recording of a bass line. Our fifth paper explores a simple pretext task tailored at learning such types of complex musical relationships. Concretely, we study whether a GAN generator, conditioned on highly compressed MP3 musical audio signals, can generate outputs resembling the original uncompressed audio.
Results show that the GAN can improve the quality of the audio signals over the MP3 versions for very high compression rates (16 and 32 kbit/s). As a direct consequence of applying artificial intelligence techniques in musical contexts, we ask how AI-based technology can foster innovation in musical practice. Therefore, we conclude this thesis by providing a broad perspective on the development of AI tools for music production, informed by theoretical considerations and reports from real-world AI tool usage by professional artists
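The conditioning idea running through this abstract, feeding a GAN generator perceptual feature values alongside the latent noise so those features steer the synthesis, can be sketched compactly. The PyTorch example below is an illustration of that mechanism only; the layer sizes, feature count, and output length are assumptions, not DrumGAN's or DarkGAN's actual architecture.

```python
# A minimal sketch of a feature-conditioned GAN generator; all sizes are
# illustrative assumptions.
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    def __init__(self, z_dim=128, cond_dim=7, out_samples=16384):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim + cond_dim, 256),
            nn.ReLU(),
            nn.Linear(256, out_samples),
            nn.Tanh(),  # waveform samples in [-1, 1]
        )

    def forward(self, z, cond):
        # Concatenate noise with perceptual conditioning (e.g., brightness,
        # boominess) so generation depends on the requested feature values.
        return self.net(torch.cat([z, cond], dim=-1))

G = ConditionalGenerator()
z = torch.randn(4, 128)
cond = torch.rand(4, 7)      # 4 sounds, 7 perceptual features each
print(G(z, cond).shape)      # torch.Size([4, 16384])
```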
Tiger, Guillaume. "Synthèse sonore d'ambiances urbaines pour les applications vidéoludiques." Thesis, Paris, CNAM, 2014. http://www.theses.fr/2015CNAM0968/document.
In video games and interactive media, the making of complex sound ambiences is constrained by the available memory and computational resources, so a compromise is necessary between the choice of audio material and its treatment in order to achieve immersive, credible real-time ambiences. Alternatively, the use of procedural audio techniques, i.e. the generation of audio content from the data provided by the virtual scene, has increased in recent years. Procedural methodologies seem appropriate for sonifying complex environments such as virtual cities. In this thesis we focus specifically on the creation of interactive urban sound ambiences. Our analysis of these ambiences is based on soundscape theory and on a state of the art of game-oriented urban interactive applications. We infer that the virtual urban soundscape is made of several perceptual auditory grounds, including a background. As a first contribution, we define the morphological and narrative properties of such a background. We then treat the urban background sound as a texture and propose, as a second contribution, to pinpoint, specify, and prototype a granular synthesis tool dedicated to interactive urban sound backgrounds. The synthesizer prototype is implemented in the visual programming language Pure Data. Building on our state of the art, we include an urban-ambience recording methodology to feed the granular synthesis. Finally, two validation steps for the prototype are described: integration into the virtual city simulation Terra Dynamica on the one hand, and a perceptual listening comparison test on the other.
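The granular approach this abstract proposes is straightforward to prototype outside Pure Data as well. Here is a minimal Python sketch: short Hann-windowed grains drawn at random positions from a source recording are overlap-added into a continuous background texture. The grain size, hop, and the noise stand-in for a real recording are illustrative assumptions.

```python
# A minimal sketch of granular texture synthesis by windowed overlap-add;
# parameters and source material are illustrative.
import numpy as np

def granular_texture(source, out_len, grain=2048, hop=512, seed=0):
    rng = np.random.default_rng(seed)
    win = np.hanning(grain)
    out = np.zeros(out_len + grain)
    for start in range(0, out_len, hop):
        pos = rng.integers(0, len(source) - grain)    # random grain position
        out[start:start + grain] += win * source[pos:pos + grain]
    return out[:out_len] / np.max(np.abs(out) + 1e-12)

sr = 16000
source = np.random.default_rng(42).standard_normal(10 * sr)  # stand-in recording
texture = granular_texture(source, out_len=5 * sr)
print(texture.shape)  # 5 seconds of continuous background texture
```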
Musti, Utpala. "Synthèse acoustico-visuelle de la parole par sélection d'unités bimodales." Thesis, Université de Lorraine, 2013. http://www.theses.fr/2013LORR0003.
This work deals with audio-visual speech synthesis. In the vast literature available on this topic, many approaches divide it into two synthesis problems: acoustic speech synthesis on the one hand, and the generation of the corresponding facial animation on the other. This does not guarantee perfectly synchronous and coherent audio-visual speech. To overcome this drawback implicitly, we propose a different approach to acoustic-visual speech synthesis based on the selection of naturally synchronous bimodal units. The synthesis follows the classical unit-selection paradigm, the main idea being to keep the natural association between the acoustic and visual modalities intact. We describe the audio-visual corpus acquisition technique and the database preparation for our system. We present an overview of the system and detail the various aspects of bimodal unit selection that need to be optimized for good synthesis. The main focus of this work is to synthesize the speech dynamics well rather than to build a comprehensive talking head. We describe the visual target features that we designed and subsequently present an algorithm for target feature weighting. This algorithm performs target feature weighting and redundant feature elimination iteratively, based on the comparison of a target-cost-based ranking with a distance computed from the acoustic and visual speech signals of units in the corpus. Finally, we present the perceptual and subjective evaluation of the final synthesis system. The results show that we have achieved the goal of synthesizing the speech dynamics reasonably well.
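The unit-selection paradigm this abstract builds on reduces to a shortest-path search. The following minimal Python sketch shows a Viterbi-style selection that picks, for each target position, the candidate unit minimising an accumulated target cost plus concatenation cost; the random cost matrices are placeholders for the learned bimodal costs described above.

```python
# A minimal sketch of dynamic-programming unit selection; cost matrices here
# are random placeholders, not the thesis's learned costs.
import numpy as np

def select_units(target_costs, concat_costs):
    """target_costs: (T, N) cost of unit n at target position t.
    concat_costs: (N, N) cost of joining unit i to unit j."""
    T, N = target_costs.shape
    acc = target_costs[0].copy()          # best cost ending at each unit
    back = np.zeros((T, N), dtype=int)    # backpointers for path recovery
    for t in range(1, T):
        # total[i, j]: best path ending at i, joined to unit j at position t
        total = acc[:, None] + concat_costs + target_costs[t][None, :]
        back[t] = np.argmin(total, axis=0)
        acc = np.min(total, axis=0)
    path = [int(np.argmin(acc))]
    for t in range(T - 1, 0, -1):         # backtrack the optimal unit sequence
        path.append(back[t, path[-1]])
    return path[::-1]

rng = np.random.default_rng(3)
print(select_units(rng.random((5, 4)), rng.random((4, 4))))
```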
Books on the topic "Synthèse audio"
Kunow, Kristian. Rundfunk und Internet: These, Antithese, Synthese? Edited by Arbeitsgemeinschaft der Landesmedienanstalten in der Bundesrepublik Deutschland. Berlin: Vistas, 2013.
Kamajou, François, and Cameroon Ministry of Scientific Research, eds. Audit scientifique de la recherche agricole au Cameroun: Synthèse de l'audit, rapport général. [Yaoundé]: République du Cameroun, Ministère de la recherche scientifique et technique, 1993.
Obert, Robert. Synthèse droit et comptabilité, DESCF numéro 1 : Tome 2 - Audit et commissariat aux comptes. Aspects internationaux. Dunod, 2002.