Doctoral dissertations on the topic "Acoustic Scene Analysis"
Create an accurate reference in APA, MLA, Chicago, Harvard, and many other styles
Consult the top 20 doctoral dissertations on the topic "Acoustic Scene Analysis".
An "Add to bibliography" button is available next to each work in the list. Use it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the scholarly publication as a .pdf file and read its abstract online, whenever the relevant details are available in the work's metadata.
Browse doctoral dissertations from a wide range of disciplines and compile your bibliography correctly.
Kudo, Hiroaki, Jinji Chen, and Noboru Ohnishi. "Scene Analysis by Clues from the Acoustic Signals". INTELLIGENT MEDIA INTEGRATION NAGOYA UNIVERSITY / COE, 2004. http://hdl.handle.net/2237/10426.
Ford, Logan H. "Large-scale acoustic scene analysis with deep residual networks". Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/123026.
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 63-66).
Many of the recent advances in audio event detection, particularly on the AudioSet dataset, have focused on improving performance using the released embeddings produced by a pre-trained model. In this work, we instead study the task of training a multi-label event classifier directly from the audio recordings of AudioSet. Using the audio recordings, we are not only able to reproduce results from prior work but also confirm the improvements brought by other proposed additions, such as an attention module. Moreover, by training the embedding network jointly with these additions, we achieve a mean Average Precision (mAP) of 0.392 and an area under the ROC curve (AUC) of 0.971, surpassing the state-of-the-art without transfer learning from a large dataset. We also analyze the output activations of the network and find that the models are able to localize audio events when a finer time resolution is needed. In addition, we use this model to explore multimodal learning, transfer learning, and real-time sound event detection tasks.
by Logan H. Ford.
M. Eng.
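As a concrete illustration of the attention module and multi-label setup described in the abstract above, here is a minimal sketch (PyTorch) of attention pooling over frame-level embeddings; the layer sizes, the 527-class output (AudioSet's label count), and all identifiers are illustrative assumptions, not taken from the thesis.

```python
# Minimal sketch: per-class attention pooling over frame embeddings
# for multi-label audio tagging. All shapes and sizes are assumptions.
import torch
import torch.nn as nn

class AttentionTagger(nn.Module):
    def __init__(self, embed_dim=512, n_classes=527):
        super().__init__()
        self.attn = nn.Linear(embed_dim, n_classes)  # per-class attention logits
        self.cls = nn.Linear(embed_dim, n_classes)   # per-class frame-level scores

    def forward(self, frames):
        # frames: (batch, time, embed_dim) embeddings from a backbone network
        w = torch.softmax(self.attn(frames), dim=1)  # attention weights over time
        p = torch.sigmoid(self.cls(frames))          # frame-level class probabilities
        return (w * p).sum(dim=1)                    # (batch, n_classes) clip scores

probs = AttentionTagger()(torch.randn(2, 100, 512))
print(probs.shape)  # torch.Size([2, 527])
```

The frame-level probabilities are what permit the event localization noted in the abstract; clip-level mAP and AUC can be computed from the pooled scores with `sklearn.metrics.average_precision_score` and `roc_auc_score`.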
Teutsch, Heinz. "Wavefield decomposition using microphone arrays and its application to acoustic scene analysis". [S.l.] : [s.n.], 2006. http://deposit.ddb.de/cgi-bin/dokserv?idn=97902806X.
McMullan, Amanda R. "Electroencephalographic measures of auditory perception in dynamic acoustic environments". Thesis, Lethbridge, Alta. : University of Lethbridge, Dept. of Neuroscience, c2013, 2013. http://hdl.handle.net/10133/3354.
x, 90 leaves : col. ill. ; 29 cm
Narayanan, Arun. "Computational auditory scene analysis and robust automatic speech recognition". The Ohio State University, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=osu1401460288.
Di Carlo, Diego. "Echo-aware signal processing for audio scene analysis". Thesis, Rennes 1, 2020. http://www.theses.fr/2020REN1S075.
Most audio signal processing methods regard reverberation, and acoustic echoes in particular, as a nuisance. However, echoes convey important spatial and semantic information about sound sources, and recent echo-aware methods have been proposed to exploit it. In this work we focus on two directions. First, we study how to estimate acoustic echoes blindly from microphone recordings; two approaches are proposed, one leveraging continuous dictionaries and one using recent deep learning techniques. Second, we extend existing methods in audio scene analysis to their echo-aware forms: the multichannel NMF framework for audio source separation, the SRP-PHAT localization method, and the MVDR beamformer for speech enhancement are all extended to echo-aware versions.
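To make the echo model concrete: an observed channel can be written as the dry source plus a few scaled, delayed copies of itself, x(t) = s(t) + Σk gk s(t − τk). The NumPy sketch below synthesizes such a signal and recovers the echo delays by picking autocorrelation peaks; this toy procedure only stands in for the blind estimators (continuous dictionaries, deep networks) developed in the thesis, and all delays and gains are invented values.

```python
# Toy echo model: source plus two scaled, delayed copies; recover the
# delays from autocorrelation peaks. Delays and gains are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n = 16000
s = rng.standard_normal(n)              # 1 s of white-noise "source" at 16 kHz
delays, gains = [120, 340], [0.6, 0.3]  # echo delays (samples) and gains

x = s.copy()                            # direct path
for d, g in zip(delays, gains):         # add the early echoes
    x[d:] += g * s[:-d]

ac = np.correlate(x, x, mode="full")[n - 1:]       # autocorrelation, lags >= 0
ac[:50] = 0                                        # suppress the zero-lag peak
est = np.sort(np.argpartition(ac[:500], -2)[-2:])  # two strongest lags below 500
print(est)                                         # ~ [120 340]
```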
Deleforge, Antoine. "Acoustic Space Mapping : A Machine Learning Approach to Sound Source Separation and Localization". Thesis, Grenoble, 2013. http://www.theses.fr/2013GRENM033/document.
In this thesis, we address the long-studied problem of binaural (two-microphone) sound source separation and localization through supervised learning. To achieve this, we develop a new paradigm referred to as acoustic space mapping, at the crossroads of binaural perception, robot hearing, audio signal processing, and machine learning. The proposed approach consists in learning a link between the auditory cues perceived by the system and the emitting sound source's position in another modality of the system, such as the visual space or the motor space. We propose new experimental protocols to automatically gather large training sets that associate such data. The obtained datasets are then used to reveal fundamental intrinsic properties of acoustic spaces and lead to the development of a general family of probabilistic models for locally-linear high- to low-dimensional space mapping. We show that these models unify several existing regression and dimensionality-reduction techniques while encompassing a large number of new models that generalize previous ones. The properties and inference of these models are thoroughly detailed, and the advantage of the proposed methods over state-of-the-art techniques is established on different space-mapping applications, beyond the scope of auditory scene analysis. We then show how the proposed methods can be probabilistically extended to tackle the long-known cocktail-party problem, i.e., accurately localizing one or several sound sources emitting at the same time in a real-world environment, and separating the mixed signals. We show that the resulting techniques perform these tasks with unequaled accuracy. This demonstrates the important role of learning and puts forward the acoustic space mapping paradigm as a promising tool for robustly addressing the most challenging problems in computational binaural audition.
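The acoustic space mapping paradigm is easy to state in code: gather pairs of (binaural cue vector, source position), then learn a cue-to-position regression. The thesis develops probabilistic locally-linear mappings for this; the sketch below substitutes a toy sinusoidal cue model and a plain nearest-neighbour lookup purely to illustrate the supervised pairing, so every number and function name in it is an assumption.

```python
# Toy acoustic space mapping: learn (cue -> azimuth) from labelled pairs,
# then answer queries by nearest neighbour. The cue model is an assumption.
import numpy as np

def cues(az):
    # Toy binaural cues: an ITD-like and an ILD-like sinusoid of azimuth.
    return np.stack([np.sin(az), 0.5 * np.sin(az)], axis=-1)

rng = np.random.default_rng(1)
az_train = rng.uniform(-np.pi / 2, np.pi / 2, 500)           # training positions
X_train = cues(az_train) + 0.01 * rng.standard_normal((500, 2))

x_query = cues(np.array([0.3]))                              # cue of an unseen source
nearest = np.argmin(((X_train - x_query) ** 2).sum(axis=1))  # closest training cue
print(az_train[nearest])                                     # ~ 0.3 rad
```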
Mouterde, Solveig. "Long-range discrimination of individual vocal signatures by a songbird : from propagation constraints to neural substrate". Thesis, Saint-Etienne, 2014. http://www.theses.fr/2014STET4012/document.
In communication systems, one of the biggest challenges is that the information encoded by the emitter is always modified before reaching the receiver, who has to process this altered information in order to recover the intended message. In acoustic communication in particular, the transmission of sound through the environment is a major source of signal degradation, caused by attenuation, absorption, and reflections, all of which decrease the signal relative to the background noise. How animals cope with the need to exchange information under such constraining conditions has been the subject of many studies, at either the emitter's or the receiver's level. A more integrated approach to auditory scene analysis, however, has seldom been taken, and is needed to address the complexity of this process. The goal of my research was to use such a transversal approach to study how birds adapt to the constraints of long-distance communication, investigating the information coding at the emitter's level, the propagation-induced degradation of the acoustic signal, and the discrimination of this degraded information by the receiver at both the behavioral and neural levels. Taking into account the everyday issues faced by animals in their natural environment, and using stimuli and paradigms that reflected the behavioral relevance of these challenges, was the cornerstone of my approach. Focusing on the information about individual identity in the distance calls of zebra finches (Taeniopygia guttata), I investigated how the individual vocal signature is encoded, degraded, and finally discriminated, from the emitter to the receiver. This study shows that the individual signature of zebra finches is very resistant to propagation-induced degradation and that the most individualized acoustic parameters vary depending on distance. Testing female birds in operant conditioning experiments, I showed that they are experts at discriminating between the degraded vocal signatures of two males and that they can improve this ability substantially when they can train over increasing distances. Finally, I showed that this impressive discrimination ability also occurs at the neural level: we found a population of neurons in the avian auditory forebrain that discriminate individual voices with various degrees of propagation-induced degradation, without prior familiarization or training. The finding of such high-level auditory processing in the primary auditory cortex opens a new range of investigations at the interface of neural processing and behavior.
Teki, S. "Cognitive analysis of complex acoustic scenes". Thesis, University College London (University of London), 2013. http://discovery.ucl.ac.uk/1413017/.
Wang, Yuxuan. "Supervised Speech Separation Using Deep Neural Networks". The Ohio State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=osu1426366690.
Chen, Jitong. "On Generalization of Supervised Speech Separation". The Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1492038295603502.
Huet, Moïra-Phoebé. "Voice mixology at a cocktail party : Combining behavioural and neural tracking for speech segregation". Thesis, Lyon, 2020. http://www.theses.fr/2020LYSEI070.
It is not always easy to follow a conversation in a noisy environment. In order to discriminate two speakers, we have to mobilize many perceptual and cognitive processes to maintain attention on a target voice and avoid shifting attention to the background. In this dissertation, the processes underlying speech segregation are explored through behavioural and neurophysiological experiments. In a preliminary phase, the development of an intelligibility task, the Long-SWoRD test, is introduced. This protocol allows participants to draw on cognitive resources, such as linguistic knowledge, to separate two talkers in a realistic listening environment. The similarity between the two speakers, and thus by extension the difficulty of the task, was controlled by manipulating the acoustic parameters of the target and masker voices. In a second phase, participants' performance on this task is evaluated in three behavioural and neurophysiological (EEG) studies. The behavioural results are consistent with the literature and show that the distance between voices, spatialisation cues, and semantic information influence participants' performance. The neurophysiological results, analysed with temporal response functions (TRFs), indicate that the neural representations of the two speakers differ according to the difficulty of the listening conditions; in addition, these representations are constructed more quickly when the voices are easily distinguishable. It is often presumed in the literature that participants' attention remains constantly on the same voice. The experimental protocol presented in this work makes it possible to infer retrospectively when participants were listening to each voice. Therefore, in a third phase, a combined analysis of this attentional information and the EEG signals is presented. The results show that information about attentional focus can be used to improve the neural representation of the attended voice in situations where the voices are similar.
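For readers unfamiliar with temporal response functions: a TRF models the EEG as a lagged linear filtering of the speech envelope, eeg(t) ≈ Σ_τ w(τ) env(t − τ), with w estimated by regularized regression. The NumPy sketch below recovers a known toy filter by ridge regression; the sampling rate, number of lags, and regularizer are illustrative assumptions, not the dissertation's actual analysis settings.

```python
# Toy TRF estimation: simulate EEG as a lagged filter of an envelope,
# then recover the filter with ridge regression. All values assumed.
import numpy as np

rng = np.random.default_rng(2)
fs, n = 64, 64 * 60                                   # 64 Hz, 60 s of signal
env = np.abs(rng.standard_normal(n))                  # stand-in speech envelope
true_trf = np.array([0.0, 0.5, 1.0, 0.4, -0.2, 0.0])  # 6 lags (~0-80 ms at 64 Hz)
eeg = np.convolve(env, true_trf)[:n] + 0.5 * rng.standard_normal(n)

L = len(true_trf)
X = np.stack([np.pad(env, (k, 0))[:n] for k in range(L)], axis=1)  # lagged design
lam = 1.0                                                          # ridge parameter
w = np.linalg.solve(X.T @ X + lam * np.eye(L), X.T @ eeg)          # ridge estimate
print(np.round(w, 2))                                              # ~ true_trf
```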
Baque, Mathieu. "Analyse de scène sonore multi-capteurs : un front-end temps-réel pour la manipulation de scène". Thesis, Le Mans, 2017. http://www.theses.fr/2017LEMA1013/document.
The context of this thesis is the development of spatialized audio (5.1 content, Dolby Atmos...) and particularly of 3D audio. Among the existing 3D audio formats, Ambisonics and Higher Order Ambisonics (HOA) allow a homogeneous spatial representation of a sound field and support basic manipulations, such as rotations or distortions. The aim of the thesis is to provide efficient tools for Ambisonic and HOA sound scene analysis and manipulation. Real-time operation and robustness to reverberation are the main constraints to deal with. The implemented algorithm is based on a frame-by-frame Independent Component Analysis (ICA), which decomposes the sound field into a set of acoustic contributions. A Bayesian classification step is then applied to the extracted components to identify the real sources and the residual reverberation. The directions of arrival of the sources are extracted from the mixing matrix estimated by ICA, according to the Ambisonic formalism, and a real-time map of the sound scene is obtained. Performance has been evaluated in different acoustic environments to assess the influence of several parameters, such as the Ambisonic order, the frame length, or the number of sources. Accurate results in terms of source localization and source counting have been obtained for frame lengths of a few hundred milliseconds. The algorithm is used as a pre-processing step for a speech recognition prototype and significantly improves recognition results in far-field conditions and in the presence of noise and interfering sources.
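A minimal sketch of the localization half of that pipeline, assuming a 2D first-order Ambisonic (W/X/Y) convention in which a plane wave from azimuth θ is encoded by the vector [1, cos θ, sin θ]: run ICA on the channels, then read each source's direction off the corresponding mixing-matrix column. The Bayesian classification step and the frame-by-frame processing are omitted, and the encoding convention is an assumption made for the example.

```python
# First-order Ambisonic toy scene: two sources, ICA separation, then
# DOAs from the mixing-matrix columns. Conventions are assumptions.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(3)
az = np.deg2rad([30.0, -75.0])                      # true source azimuths
A = np.stack([np.ones(2), np.cos(az), np.sin(az)])  # 3 x 2 W/X/Y encoding matrix
S = rng.laplace(size=(2, 16000))                    # two non-Gaussian sources
B = A @ S                                           # Ambisonic channels W, X, Y

ica = FastICA(n_components=2, random_state=0)
ica.fit(B.T)                                        # expects samples x channels
for col in ica.mixing_.T:                           # each column ~ scaled A column
    col = col * np.sign(col[0])                     # fix the sign via the W channel
    print(np.rad2deg(np.arctan2(col[2], col[1])))   # ~ 30 and -75 (in some order)
```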
Woodruff, John F. "Integrating Monaural and Binaural Cues for Sound Localization and Segregation in Reverberant Environments". The Ohio State University, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=osu1332425718.
Sundar, Harshavardhan. "Who Spoke What And Where? A Latent Variable Framework For Acoustic Scene Analysis". Thesis, 2016. https://etd.iisc.ac.in/handle/2005/2569.
Aaronson, Neil L. "Speech-on-speech masking in a front-back dimension and analysis of binaural parameters in rooms using MLS methods". Diss., 2008.
Teutsch, Heinz [Verfasser]. "Wavefield decomposition using microphone arrays and its application to acoustic scene analysis / vorgelegt von Heinz Teutsch". 2006. http://d-nb.info/97902806X/34.
Pelluri, Sai Gunaranjan. "Joint Spectro-Temporal Analysis of Moving Acoustic Sources". Thesis, 2017. http://etd.iisc.ac.in/handle/2005/4279.
Pełny tekst źródła"Psychophysical and Neural Correlates of Auditory Attraction and Aversion". Master's thesis, 2014. http://hdl.handle.net/2286/R.I.27518.
Master's thesis, Psychology, 2014.