Dissertations / Theses on the topic 'Traitement audio numérique'
Consult the top 24 dissertations / theses for your research on the topic 'Traitement audio numérique.'
Fillon, Thomas. "Traitement numérique du signal acoustique pour une aide aux malentendants." Phd thesis, Télécom ParisTech, 2004. http://pastel.archives-ouvertes.fr/pastel-00001201.
Nesvadba, Jan. "Segmentation sémantique des contenus audio-visuels." Bordeaux 1, 2007. http://www.theses.fr/2007BOR13456.
Lapierre, Jimmy. "Approches paramétriques pour le codage audio multicanal." Mémoire, Université de Sherbrooke, 2007. http://savoirs.usherbrooke.ca/handle/11143/1355.
González, Santos Ángel de Dios. "Circuits de traitement de signal numérique en temps continu ultra-faible consommation en technologie 28nm FDSOI pour applications audio." Thesis, Lille 1, 2020. http://www.theses.fr/2020LIL1I047.
The focus of this work is the study and development of a feature extraction system using Continuous-Time Digital Signal Processing (CT DSP) techniques, to mitigate the drawbacks of existing implementations based on traditional analog and digital solutions of always-on monitoring sensors for the Internet of Things (IoT). The target is to extract the spectral content of an audio signal using a novel architecture based on a cascade of configurable CT DSP Finite Impulse Response (FIR) filters. An efficient cascade scheme is enabled by the proposed glitch elimination and delta encoding techniques. Additionally, this work introduces a CT function to estimate the instantaneous power within selected frequency bands to build an output spectrogram. The proposed 12-band system has been validated using behavioral simulations. The key element for the implementation of this system is the digital delay element. A new delay element has been designed and fabricated in 28nm FDSOI technology and achieves a record tuning range from 30 ns to 97 µs with a power consumption of 15 fJ/event. By extrapolating this result, the system would have an overall peak power consumption of 2.85 µW when processing typical female speech, while consuming approximately 100 nW when no events are generated. Thus, the average system power consumption outperforms state-of-the-art feature extraction circuits.
Hassaïne, Abdelâali. "Restauration des pistes sonores optiques cinématographiques : approche par traitement d'images." Phd thesis, École Nationale Supérieure des Mines de Paris, 2009. http://pastel.archives-ouvertes.fr/pastel-00005981.
Lapierre, Jimmy. "Amélioration de codecs audio standardisés avec maintien de l'interopérabilité." Thèse, Université de Sherbrooke, 2016. http://hdl.handle.net/11143/8816.
Digital audio applications have grown exponentially during the last decades, in good part because of the establishment of international standards. However, imposing such norms necessarily introduces hurdles that can impede the improvement of technologies that have already been deployed, potentially leading to a proliferation of new standards. This thesis shows that existing coders can be better exploited by improving their quality or their bitrate, even within the rigid constraints posed by established standards. Three aspects are studied: the enhancement of the encoder, the decoder, and the bit stream. In every case, compatibility with the other elements of the existing coder is maintained. Thus, it is shown that the audio signal can be improved at the decoder without transmitting new information, that an encoder can produce an improved signal without modifying its decoder, and that a bit stream can be optimized for a new application. In particular, this thesis shows that even a standard like G.711, which has been deployed for decades, has the potential to be significantly improved after the fact. This contribution has even served as the core for a new standard embedded coder that had to maintain that compatibility. It is also shown that the subjective and objective audio quality of the AAC (Advanced Audio Coding) decoder can be improved, without adding any extra information from the encoder, by better exploiting the knowledge of the coder model’s limitations. Finally, it is shown that the fixed rate bit stream of the AMR-WB+ (Extended Adaptive Multi-Rate Wideband) can be compressed more efficiently when considering a variable bit rate scenario, showing the need to adapt a coder to its use case.
Parekh, Sanjeel. "Learning representations for robust audio-visual scene analysis." Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLT015/document.
The goal of this thesis is to design algorithms that enable robust detection of objects and events in videos through joint audio-visual analysis. This is motivated by humans’ remarkable ability to meaningfully integrate auditory and visual characteristics for perception in noisy scenarios. To this end, we identify two kinds of natural associations between the modalities in recordings made using a single microphone and camera, namely motion-audio correlation and appearance-audio co-occurrence. For the former, we use audio source separation as the primary application and propose two novel methods within the popular non-negative matrix factorization framework. The central idea is to utilize the temporal correlation between audio and motion for objects/actions where the sound-producing motion is visible. The first proposed method focuses on soft coupling between audio and motion representations capturing temporal variations, while the second is based on cross-modal regression. We segregate several challenging audio mixtures of string instruments into their constituent sources using these approaches. To identify and extract many commonly encountered objects, we leverage appearance-audio co-occurrence in large datasets. This complementary association mechanism is particularly useful for objects where motion-based correlations are not visible or available. The problem is dealt with in a weakly-supervised setting wherein we design a representation learning framework for robust AV event classification, visual object localization, audio event detection and source separation. We extensively test the proposed ideas on publicly available datasets. The experiments demonstrate several intuitive multimodal phenomena that humans utilize on a regular basis for robust scene understanding.
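The abstract above builds on the non-negative matrix factorization (NMF) framework. As a rough illustration of plain NMF only, the following sketch factors a toy magnitude "spectrogram" V into spectral templates W and activations H with the standard multiplicative updates for the Euclidean cost. It does not implement the audio-motion coupling or cross-modal regression proposed in the thesis; the function names, the toy data, and the iteration count are assumptions made for this example.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(r) for r in zip(*m)]


def nmf(v, rank, iters=200, eps=1e-9, seed=0):
    """Approximate non-negative V (bins x frames) as W @ H.

    Uses multiplicative updates for the squared Euclidean cost;
    W and H stay non-negative because every factor in the update is.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + eps for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_cols)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n_rows)]
    return w, h


def frobenius_error(v, w, h):
    """Frobenius norm of V - W @ H."""
    approx = matmul(w, h)
    return sum((a - b) ** 2 for ra, rb in zip(v, approx) for a, b in zip(ra, rb)) ** 0.5


# Toy "spectrogram": 4 frequency bins x 6 frames, built from two hypothetical sources.
templates = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
activations = [[1.0, 0.0, 2.0, 0.0, 1.0, 3.0],
               [0.0, 1.0, 0.0, 2.0, 1.0, 0.0]]
v = matmul(templates, activations)

w0, h0 = nmf(v, rank=2, iters=0)    # random initialisation only
w, h = nmf(v, rank=2, iters=200)    # after 200 multiplicative updates
print(frobenius_error(v, w0, h0), frobenius_error(v, w, h))
```

In a source separation setting, each column of W would typically be read as a spectral template and each row of H as its activation over time; how the components are grouped into sources is left open here.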
Mbaye, Amadou. "Linéarisation des amplificateurs de puissance large-bande pour des applications de communications tactiques et de diffusion audio ou vidéo numérique." Thesis, Paris Est, 2015. http://www.theses.fr/2015PEST1021/document.
The power amplifier (PA) is one of the most critical elements in radiocommunication systems. The PA is their main source of nonlinearities and contributes heavily to the emitter's power consumption. Running the PA at the highest power efficiency is thus as crucial as keeping it linear for good communication quality. However, these two specifications of the PA are antagonistic, and PA manufacturers need to find a compromise between linearity and power efficiency. Digital Predistortion (DPD) and Crest Factor Reduction (CFR) techniques are intended to improve power efficiency while preserving linearity, or vice versa. Linearization of wideband RF power amplifiers using Digital Predistortion is the focus of this thesis. Three DPD issues are investigated in this work. The first issue deals with multiband linearization, where signals with various waveforms located at different frequency bands are amplified. The second objective of this thesis is to study a concurrent DPD/CFR system based on an automatic estimation of the necessary CFR gain. The last part of this dissertation deals with PA linearization under antenna load variations. Indeed, the impedance of the antenna may vary because of electromagnetic objects that are present in its vicinity. Those impedance variations may cause signal reflections toward the PA that modify some of its main specifications (linearity, delivered power and efficiency). Our goal in this field is to preserve DPD linearization performance under antenna load mismatch.
Gillet, Olivier. "Transcription des signaux percussifs : application à l'analyse de scènes musicales audiovisuelles." Phd thesis, Télécom ParisTech, 2007. http://pastel.archives-ouvertes.fr/pastel-00002805.
Bayle, Yann. "Apprentissage automatique de caractéristiques audio : application à la génération de listes de lecture thématiques." Thesis, Bordeaux, 2018. http://www.theses.fr/2018BORD0087/document.
This doctoral dissertation presents, discusses and proposes tools for automatic information retrieval in big musical databases. The main application is the supervised classification of musical themes to generate thematic playlists. The first chapter introduces the different contexts and concepts around big musical databases and their consumption. The second chapter focuses on the description of existing music databases as part of academic experiments in audio analysis. This chapter notably introduces issues concerning the variety and unequal proportions of the themes contained in a database, which remain complex to take into account in supervised classification. The third chapter explains the importance of extracting and developing relevant audio features in order to better describe the content of music tracks in these databases. This chapter explains several psychoacoustic phenomena and uses sound signal processing techniques to compute audio features. New methods of aggregating local audio features are proposed to improve song classification. The fourth chapter describes the use of the extracted audio features in order to sort the songs by themes and thus to allow musical recommendations and the automatic generation of homogeneous thematic playlists. This part involves the use of machine learning algorithms to perform music classification tasks. The contributions of this dissertation are summarized in the fifth chapter, which also proposes research perspectives in machine learning and extraction of multi-scale audio features.
Bitton, Adrien. "Meaningful audio synthesis and musical interactions by representation learning of sound sample databases." Electronic Thesis or Diss., Sorbonne université, 2021. http://www.theses.fr/2021SORUS362.
Computer-assisted music relies extensively on audio sample libraries and virtual instruments, which provide users with an ever-increasing amount of content to produce music with. However, principled methods for large-scale interactions are lacking, so that browsing samples and presets with respect to a target sound idea is a tedious and arbitrary process. Indeed, library metadata can only describe coarse categories of sounds and does not meaningfully convey the underlying acoustic content and continuous variations in timbre, which are key elements of music production and creativity. Recent advances in deep generative modelling show unprecedented successes at learning large-scale unsupervised representations which invert to data as diverse as images, texts and audio. These probabilistic models could be refined to specific generative tasks such as unpaired image translation and semantic manipulations of visual features, demonstrating the ability to learn transformations and representations that are perceptually meaningful. In this thesis, we target efficient analysis and synthesis with auto-encoders to learn low-dimensional acoustic representations for timbre manipulations and intuitive interactions for music production. First, we adapt domain translation techniques to timbre transfer and propose alternatives to adversarial learning for many-to-many transfers. Then we develop models for explicit modelling of timbre variations and controllable audio sampling, using conditioning for semantic attribute manipulations and hierarchical learning to represent both acoustic and temporal variations.
Huvet, Chloé. "D’Un nouvel espoir (1977) à La revanche des Sith (2005) : écriture musicale et traitement de la partition au sein du complexe audio-visuel dans la saga Star Wars." Thesis, Rennes 2, 2017. http://www.theses.fr/2017REN20048.
The scores of the Star Wars saga, a gigantic dischronic cycle spanning over a long period of twenty-eight years, are all composed by John Williams, a unique configuration in cinema history. This compositional consistency should theoretically establish the two trilogies (1977-1983 and 1999-2005) as a coherent and unified whole, especially as George Lucas considers the six episodes as one single entity. Nevertheless, the hexalogy’s musical unity and the existence of a Star Wars musical signature are far from self-evident, instead taking the form of an ideal devoid of real, solid foundations. By adopting a comparative cross-disciplinary approach and by resorting to different scales of analysis (episode, trilogy, saga), this dissertation aims to show in which ways the musical material, Williams’ compositional practice as well as the use and integration of the score within the audiovisual complex are subjected to profound transformations between the two trilogies. This research also questions how and to what extent these changes in Williams’s writing and the score’s treatment in the different episodes are related to the mutations of film techniques, especially those of the digital age. Drawing on unreleased hand-written sources and personal interviews conducted with Williams’ main orchestrator, Conrad Pope, and his music editor, Kenneth Wannberg, this dissertation implements a firm interdisciplinarity at the intersection of musical analysis, cinema and technology history.
De, Campos Teixeira Gomes Leandro. "Tatouage de signaux audio." Paris 5, 2002. http://www.theses.fr/2002PA05S009.
Pallone, Grégory. "Dilatation et transposition sous contraintes perceptives des signaux audio : application au transfert cinéma-vidéo." Aix-Marseille 2, 2003. https://tel.archives-ouvertes.fr/tel-00003363v4.
Olivero, Anaik. "Les multiplicateurs temps-fréquence : Applications à l’analyse et la synthèse de signaux sonores et musicaux." Thesis, Aix-Marseille, 2012. http://www.theses.fr/2012AIXM4788/document.
Analysis/Transformation/Synthesis is a general paradigm in signal processing that aims at manipulating or generating signals for practical applications. This thesis deals with time-frequency representations obtained with Gabor atoms. In this context, the complexity of a sound transformation can be modeled by a Gabor multiplier. Gabor multipliers are linear diagonal operators acting on signals, and are characterized by a complex-valued time-frequency transfer function called the Gabor mask. Gabor multipliers make it possible to formalize the concept of filtering in the time-frequency domain. As they act by multiplication in the time-frequency domain, they are "a priori" well suited to producing sound transformations such as timbre transformations. In the first part, this work proposes a model for the problem of estimating a Gabor mask between two given signals, and provides algorithms to solve it. The Gabor multiplier between two signals is not uniquely defined, and the proposed estimation strategies are able to generate Gabor multipliers that produce signals with satisfactory sound quality. In the second part, we show that a Gabor mask contains relevant information, as it can be viewed as a time-frequency representation of the difference in timbre between two given sounds. By averaging the energy contained in a Gabor mask, we obtain a measure of this difference that makes it possible to discriminate between different musical instrument sounds. We also propose strategies to automatically localize the time-frequency regions responsible for such a timbre dissimilarity between musical instrument classes. Finally, we show that Gabor multipliers can be used to construct many sound morphing trajectories, and propose an extension
Daudet, Laurent. "Représentations structurelles de signaux audiophoniques : méthodes hybrides pour des applications à la compression." Aix-Marseille 1, 2000. http://www.theses.fr/2000AIX11056.
Gonon, Gilles. "Proposition d'un schéma d'analyse/synthèse adaptatif dans le plan temps-fréquence basé sur des critères entropiques : application au codage audio par transformée." Le Mans, 2002. http://cyberdoc.univ-lemans.fr/theses/2002/2002LEMA1004.pdf.
Adaptive representations contribute to the study and characterization of the information carried by any signal. In this work, we present a new decomposition which uses separate segmentation criteria in time and frequency to improve the adaptivity of the analysis to the signal. This scheme is applied to a transform perceptual audio coder. The signal is first temporally segmented using a local entropic criterion. Based upon an estimator of the local entropy, the segmentation criterion reflects the entropy variations in a signal and makes it possible to separate stationary parts from transient ones. The temporal frames thus defined are filtered in frequency using the Wavelet Packet Decomposition, and the adaptation is performed by means of the Best Basis Search Algorithm. An extension of the library of dyadic bases is derived to improve the entropic gain achieved over the signal and thus the adaptivity of the decomposition. The perceptual audio coder we developed follows an original design in order to include the proposed scheme. The whole implementation of the coder is described in the document. This coder is evaluated with subjective tests, performed according to absolute and blind comparison for a rate of 96 kbps. While many parts of our coder are still to be improved, results show a subjective quality equivalent to the tested standard and hardly transparent relative to the original sounds.
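To make the idea of entropy-driven temporal segmentation concrete, here is a minimal sketch. It is not the estimator used in the thesis: it assumes fixed-length frames, takes the Shannon entropy of each frame's normalised sample-energy distribution, and marks a boundary wherever the entropy jumps by more than a threshold between consecutive frames. The function names, frame length, and threshold are hypothetical choices for illustration.

```python
import math


def frame_entropy(frame, eps=1e-12):
    """Shannon entropy (in bits) of the normalised sample-energy distribution."""
    energies = [s * s for s in frame]
    total = sum(energies)
    if total <= eps:
        return 0.0  # silent frame: treat as zero entropy
    probs = [e / total for e in energies]
    return -sum(p * math.log2(p) for p in probs if p > 0)


def entropy_boundaries(signal, frame_len, threshold):
    """Sample indices where entropy changes sharply between adjacent frames."""
    entropies = [frame_entropy(signal[i:i + frame_len])
                 for i in range(0, len(signal) - frame_len + 1, frame_len)]
    return [k * frame_len for k in range(1, len(entropies))
            if abs(entropies[k] - entropies[k - 1]) > threshold]


# Stationary sinusoid with one large click inside the third 64-sample frame.
signal = [math.sin(2 * math.pi * n / 16) for n in range(256)]
signal[130] = 100.0
boundaries = entropy_boundaries(signal, frame_len=64, threshold=1.0)
print(boundaries)  # [128, 192]
```

The click concentrates almost all of the frame's energy in one sample, so that frame's entropy collapses while the stationary frames share nearly identical values; this is the kind of stationary/transient contrast an entropic criterion can exploit.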
Dessein, Arnaud. "Méthodes Computationnelles en Géométrie de l'Information et Applications Temps Réel au Traitement du Signal Audio." Phd thesis, Université Pierre et Marie Curie - Paris VI, 2012. http://tel.archives-ouvertes.fr/tel-00768524.
Najnudel, Judy. "Power-Balanced Modeling of Nonlinear Electronic Components and Circuits for Audio Effects." Electronic Thesis or Diss., Sorbonne université, 2022. http://www.theses.fr/2022SORUS223.
This thesis is concerned with the modeling of nonlinear components and circuits for simulations in audio applications. Our goal is to propose models that are sufficiently sophisticated for simulations to sound realistic, but that remain simple enough for real time to be attainable. To this end, we explore two approaches, both based on a port-Hamiltonian systems formulation. Indeed, this formulation structurally guarantees power balance and passivity. Combined with ad hoc numerical methods, this ensures the numerical stability of simulations. The first approach is comparable to "white box" modeling. It assumes that the circuit topology is known, and focuses on the modeling of specific components found in vintage audio circuits, namely ferromagnetic coils (found in wah-wah pedals and guitar amplifiers) and opto-isolators (found in tremolos and optical compressors). The proposed models are physically-based, passive, modular, and usable in real time. The second approach is comparable to "grey box" modeling. It aims to retrieve the topology and constitutive laws of a circuit from measurements. The learning of the circuit topology is informed by an underlying port-Hamiltonian formulation, and nonlinearities are concomitantly addressed through kernel-based methods. Thus, necessary physical properties are enforced, while the use of reproducing kernels allows for a variety of nonlinear behaviors to be described with a smaller number of parameters and a higher interpretability compared to neural network methods. Finally, a possible generalization of this approach for a larger class of circuits is outlined through the introduction of the Koopman operator.
Emiya, Valentin. "Transcription automatique de la musique de piano." Phd thesis, Télécom ParisTech, 2008. http://pastel.archives-ouvertes.fr/pastel-00004867.
Cuvillier, Philippe. "On temporal coherency of probabilistic models for audio-to-score alignment." Thesis, Paris 6, 2016. http://www.theses.fr/2016PA066532/document.
This thesis deals with automatic alignment of audio recordings with corresponding music scores. We study algorithmic solutions for this problem in the framework of probabilistic models which represent hidden evolution on the music score as a stochastic process. We begin this work by investigating theoretical foundations of the design of such models. To do so, we undertake an axiomatic approach which is based on an application peculiarity: music scores provide a nominal duration for each event, which is a hint for the actual and unknown duration. Thus, modeling this specific temporal structure through stochastic processes is our main problem. We define temporal coherency as compliance with such prior information and refine this abstract notion by stating two criteria of coherency. Focusing on hidden semi-Markov models, we demonstrate that coherency is guaranteed by specific mathematical conditions on the probabilistic design and that fulfilling these prescriptions significantly improves the precision of alignment algorithms. Such conditions are derived by combining two fields of mathematics, Lévy processes and total positivity of order 2. This is why the second part of this work is a theoretical investigation which extends existing results in the related literature.
Cuvillier, Philippe. "On temporal coherency of probabilistic models for audio-to-score alignment." Electronic Thesis or Diss., Paris 6, 2016. http://www.theses.fr/2016PA066532.
Massé, Pierre. "Analysis, Treatment, and Manipulation Methods for Spatial Room Impulse Responses Measured with Spherical Microphone Arrays." Electronic Thesis or Diss., Sorbonne université, 2022. http://www.theses.fr/2022SORUS079.
The use of spatial room impulse responses (SRIR) for the reproduction of three-dimensional reverberation effects through multi-channel convolution over immersive surround-sound loudspeaker systems has become commonplace within the last few years, thanks in large part to the commercial availability of various spherical microphone arrays (SMA) as well as a constant increase in computing power. This use has in turn created a demand for analysis and treatment techniques not only capable of ensuring the faithful reproduction of the measured reverberation effect, but which could also be used to control various modifications of the SRIR in a more "creative" approach, as is often encountered in the production of immersive musical performances and installations. Within this context, the principal objective of the current thesis is the definition of a complete space-time-frequency framework for the analysis, treatment, and manipulation of SRIRs. The analysis tools should lead to an in-depth model allowing for measurements to first be treated with respect to their inherent limitations (measurement conditions, background noise, etc.), as well as offering the ability to modify different characteristics of the final reverberation effect described by the SRIR. These characteristics can be either completely objective, even physical, or otherwise informed by knowledge of human auditory perception with regard to room acoustics. The theoretical work in this research project is therefore presented in two main parts. First, the underlying SRIR signal model is described, heavily inspired by the historical approaches from the fields of artificial reverberation synthesis and SMA signal processing, while at the same time (incrementally) extending both.
The signal model is then used to define the analysis methods that form the core of the final framework; these focus particularly on (a) identifying the "mixing time" that defines the moment of transition between the early reflection and late reverberation regimes, (b) obtaining a space-time cartography of the early reflections, and (c) estimating the frequency- and direction-dependent properties of the late reverberation's exponential energy decay envelope. In order to account for the directional dependence of these properties, a procedure for generating directional SRIR representations (i.e. directional room impulse responses, DRIR) that guarantee the preservation of certain fundamental reverberation properties must also be defined. In the second part, the model parameters made explicit by the analysis methods are exploited in order to either treat (i.e. attempt to correct some of the inevitable limitations inherent to the SMA measurement process) or more creatively manipulate and modify the SRIR. Two treatment methods in particular are developed in this thesis: (1) a pre-analysis procedure acting directly on repeated exponential sweep method (ESM) SMA measurement signals in an attempt to simultaneously increase the resulting SRIR's signal-to-noise ratio (SNR) while reducing its vulnerability to non-stationary noise events, and (2) a post-analysis denoising technique based on replacing the SRIR's background noise floor with a resynthesized extrapolation of the late reverberation tail. The theoretical descriptions thus complete, the main analysis methods as well as the DRIR generation and the denoising treatment procedures are then subjected to a series of validation tests, wherein simulated SRIRs (or parts thereof) are used to evaluate the performance, discuss the limitations, and parameterize the implementation of the different techniques. 
These sub-studies allow each method to be individually verified, resulting in a comprehensive investigation into the inner workings of the analysis toolbox (as well as the denoising process). Finally, to provide a concluding overview of the complete analysis-treatment-manipulation framework, similar studies are carried out using examples of real-world [...]
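The abstract above mentions estimating the exponential energy decay envelope of late reverberation. As a simplified, single-channel and broadband illustration (not the frequency- and direction-dependent procedure developed in the thesis), the sketch below applies Schroeder backward integration to a synthetic exponentially decaying impulse response and fits a line to the decay curve between -5 dB and -25 dB to extrapolate a reverberation time. The function names, the sample rate, and the synthetic response are assumptions made for this example.

```python
import math


def schroeder_edc_db(h):
    """Energy decay curve in dB (relative to total energy) by backward integration."""
    edc = []
    acc = 0.0
    for s in reversed(h):
        acc += s * s          # remaining energy from this sample to the end
        edc.append(acc)
    edc.reverse()
    return [10.0 * math.log10(e / edc[0]) for e in edc if e > 0.0]


def rt60_from_edc(edc_db, fs, hi_db=-5.0, lo_db=-25.0):
    """Least-squares line fit on [lo_db, hi_db], extrapolated to a 60 dB decay."""
    pts = [(n / fs, d) for n, d in enumerate(edc_db) if lo_db <= d <= hi_db]
    count = len(pts)
    mean_t = sum(t for t, _ in pts) / count
    mean_d = sum(d for _, d in pts) / count
    slope = (sum((t - mean_t) * (d - mean_d) for t, d in pts)
             / sum((t - mean_t) ** 2 for t, _ in pts))   # dB per second
    return -60.0 / slope


# Synthetic response: amplitude falls by 60 dB after true_rt60 seconds.
fs = 1000
true_rt60 = 0.5
h = [10 ** (-3.0 * n / (fs * true_rt60)) * (1 if n % 2 == 0 else -1)
     for n in range(fs)]

estimate = rt60_from_edc(schroeder_edc_db(h), fs)
print(round(estimate, 3))
```

On measured responses the background noise floor bends the decay curve upward, which is one reason a limited fitting range is used here; the thesis's own treatment of the noise floor is not reproduced in this sketch.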
Huvet, Chloé. "D’Un nouvel espoir (1977) à La Revanche des Sith (2005) : écriture musicale et traitement de la partition au sein du complexe audio-visuel dans la saga Star Wars." Thèse, 2017. http://hdl.handle.net/1866/20076.