Ready-made bibliography on the topic "Auditory source separation"

Create an accurate reference in APA, MLA, Chicago, Harvard, and many other styles

See the lists of current articles, books, dissertations, abstracts, and other scholarly sources on the topic "Auditory source separation".

Next to every work in the bibliography there is an "Add to bibliography" button. Use it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scholarly publication in .pdf format and read the abstract of the work online, provided the relevant details are available in the record's metadata.

Journal articles on the topic "Auditory source separation"

1

Li, Han, Kean Chen, Lei Wang, Jianben Liu, Baoquan Wan, and Bing Zhou. "Sound Source Separation Mechanisms of Different Deep Networks Explained from the Perspective of Auditory Perception". Applied Sciences 12, no. 2 (January 14, 2022): 832. http://dx.doi.org/10.3390/app12020832.

Abstract:
Thanks to the development of deep learning, various sound source separation networks have been proposed and have made significant progress. However, the study of the underlying separation mechanisms is still in its infancy. In this study, deep networks are explained from the perspective of auditory perception mechanisms. For separating two arbitrary sound sources from monaural recordings, three different networks with different parameters are trained and achieve excellent performance. The networks' output can obtain an average scale-invariant signal-to-distortion ratio improvement (SI-SDRi) higher than 10 dB, comparable with human performance in separating natural sources. More importantly, the most intuitive principle, proximity, is explored through simultaneous and sequential organization experiments. Results show that regardless of network structures and parameters, the proximity principle is learned spontaneously by all networks. If components are proximate in frequency or time, they are not easily separated by networks. Moreover, the frequency resolution at low frequencies is better than at high frequencies. These behavioral characteristics of all three networks are highly consistent with those of the human auditory system, which implies that the learned proximity principle is not accidental, but the optimal strategy selected by networks and humans when facing the same task. The emergence of auditory-like separation mechanisms provides the possibility to develop a universal system that can be adapted to all sources and scenes.
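For readers who want to reproduce the SI-SDRi figure quoted above, the following is a minimal NumPy sketch of the scale-invariant SDR as commonly defined in the separation literature; it is a reference computation, not code from the cited paper.

    import numpy as np

    def si_sdr(estimate: np.ndarray, reference: np.ndarray) -> float:
        """Scale-invariant signal-to-distortion ratio (SI-SDR) in dB."""
        # Project the estimate onto the reference to obtain the scaled target.
        alpha = np.dot(estimate, reference) / np.dot(reference, reference)
        target = alpha * reference   # scale-matched target component
        noise = estimate - target    # everything else counts as distortion
        return 10.0 * np.log10(np.sum(target**2) / np.sum(noise**2))

    def si_sdri(estimate: np.ndarray, reference: np.ndarray, mixture: np.ndarray) -> float:
        """SI-SDR improvement: gain over using the raw mixture as the estimate."""
        return si_sdr(estimate, reference) - si_sdr(mixture, reference)

An SI-SDRi above 10 dB, as reported in the abstract, means the separated output is at least 10 dB closer to the clean source, in this scale-invariant sense, than the unprocessed mixture was.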
2

Sasaki, Yoko, Saori Masunaga, Simon Thompson, Satoshi Kagami, and Hiroshi Mizoguchi. "Sound Localization and Separation for Mobile Robot Tele-Operation by Tri-Concentric Microphone Array". Journal of Robotics and Mechatronics 19, no. 3 (June 20, 2007): 281–89. http://dx.doi.org/10.20965/jrm.2007.p0281.

Abstract:
The paper describes a tele-operated mobile robot system which can perform multiple sound source localization and separation using a 32-channel tri-concentric microphone array. Tele-operated mobile robots require two main capabilities: 1) audio/visual presentation of the robot’s environment to the operator, and 2) autonomy for mobility. This paper focuses on the auditory system of a tele-operated mobile robot in order to improve both the presentation of sound sources to the operator and also to facilitate autonomous robot actions. The auditory system is based on a 32-channel distributed microphone array that uses highly efficient directional design for localizing and separating multiple moving sound sources. Experimental results demonstrate the feasibility of inter-person distant communication through the tele-operated robot system.
3

Doll, Theodore J., Thomas E. Hanna, and Joseph S. Russotti. "Masking in Three-Dimensional Auditory Displays". Human Factors: The Journal of the Human Factors and Ergonomics Society 34, no. 3 (June 1992): 255–65. http://dx.doi.org/10.1177/001872089203400301.

Abstract:
The extent to which simultaneous inputs in a three-dimensional (3D) auditory display mask one another was studied in a simulated sonar task. The minimum signal-to-noise ratio (SNR) required to detect an amplitude-modulated 500-Hz tone in a background of broadband noise was measured using a loudspeaker array in a free field. Three aspects of the 3D array were varied: angular separation of the sources, degree of correlation of the background noises, and listener head movement. Masking was substantially reduced when the sources were uncorrelated. The SNR needed for detection decreased with source separation, and the rate of decrease was significantly greater with uncorrelated sources than with partially or fully correlated sources. Head movement had no effect on the SNR required for detection. Implications for the design and application of 3D auditory displays are discussed.
4

Li, Han, Kean Chen, Rong Li, Jianben Liu, Baoquan Wan, and Bing Zhou. "Auditory-like simultaneous separation mechanisms spontaneously learned by a deep source separation network". Applied Acoustics 188 (January 2022): 108591. http://dx.doi.org/10.1016/j.apacoust.2021.108591.

5

Drake, Laura, and Janet Rutledge. "Auditory scene analysis‐constrained array processing for sound source separation". Journal of the Acoustical Society of America 101, no. 5 (May 1997): 3106. http://dx.doi.org/10.1121/1.418868.

6

Farley, Brandon J., and Arnaud J. Noreña. "Membrane potential dynamics of populations of cortical neurons during auditory streaming". Journal of Neurophysiology 114, no. 4 (October 2015): 2418–30. http://dx.doi.org/10.1152/jn.00545.2015.

Abstract:
How a mixture of acoustic sources is perceptually organized into discrete auditory objects remains unclear. One current hypothesis postulates that perceptual segregation of different sources is related to the spatiotemporal separation of cortical responses induced by each acoustic source or stream. In the present study, the dynamics of subthreshold membrane potential activity were measured across the entire tonotopic axis of the rodent primary auditory cortex during the auditory streaming paradigm using voltage-sensitive dye imaging. Consistent with the proposed hypothesis, we observed enhanced spatiotemporal segregation of cortical responses to alternating tone sequences as their frequency separation or presentation rate was increased, both manipulations known to promote stream segregation. However, across most streaming paradigm conditions tested, a substantial cortical region maintaining a response to both tones coexisted with more peripheral cortical regions responding more selectively to one of them. We propose that these coexisting subthreshold representation types could provide neural substrates to support the flexible switching between the integrated and segregated streaming percepts.
7

Drake, Laura A., Janet C. Rutledge, and Aggelos Katsaggelos. "Computational auditory scene analysis‐constrained array processing for sound source separation". Journal of the Acoustical Society of America 106, no. 4 (October 1999): 2238. http://dx.doi.org/10.1121/1.427622.

8

Zakeri, Sahar, and Masoud Geravanchizadeh. "Supervised binaural source separation using auditory attention detection in realistic scenarios". Applied Acoustics 175 (April 2021): 107826. http://dx.doi.org/10.1016/j.apacoust.2020.107826.

9

McElveen, J. K., Leonid Krasny, and Scott Nordlund. "Applying matched field array processing and machine learning to computational auditory scene analysis and source separation challenges". Journal of the Acoustical Society of America 151, no. 4 (April 2022): A232. http://dx.doi.org/10.1121/10.0011162.

Abstract:
Matched field processing (MFP) techniques employing physics-based models of acoustic propagation have been successfully and widely applied to underwater target detection and localization, while machine learning (ML) techniques have enabled detection and extraction of patterns in data. Fusing MFP and ML enables the estimation of Green’s Function solutions to the Acoustic Wave Equation for waveguides from data captured in real, reverberant acoustic environments. These Green’s Function estimates can further enable the robust separation of individual sources, even in the presence of multiple loud, interfering, interposed, and competing noise sources. We first introduce MFP and ML and then discuss their application to Computational Auditory Scene Analysis (CASA) and acoustic source separation. Results from a variety of tests using a binaural headset, as well as different wearable and free-standing microphone arrays are then presented to illustrate the effects of the number and placement of sensors on the residual noise floor after separation. Finally, speculations on the similarities between this proprietary approach and the human auditory system’s use of interaural cross-correlation in formulation of acoustic spatial models will be introduced and ideas for further research proposed.
10

Otsuka, Takuma, Katsuhiko Ishiguro, Hiroshi Sawada, and Hiroshi Okuno. "Bayesian Unification of Sound Source Localization and Separation with Permutation Resolution". Proceedings of the AAAI Conference on Artificial Intelligence 26, no. 1 (September 20, 2021): 2038–45. http://dx.doi.org/10.1609/aaai.v26i1.8376.

Abstract:
Sound source localization and separation with permutation resolution are essential for achieving a computational auditory scene analysis system that can extract useful information from a mixture of various sounds. Because existing methods cope separately with these problems despite their mutual dependence, the overall result with these approaches can be degraded by any failure in one of these components. This paper presents a unified Bayesian framework to solve these problems simultaneously where localization and separation are regarded as a clustering problem. Experimental results confirm that our method outperforms state-of-the-art methods in terms of the separation quality with various setups including practical reverberant environments.
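To illustrate the permutation problem that the title and abstract refer to: frequency-domain separation recovers the sources in an arbitrary order at each frequency bin, and the per-bin orderings must be aligned before resynthesis. The sketch below shows the classic envelope-correlation heuristic for that alignment step; it is an illustrative baseline under assumed array shapes, not the Bayesian clustering method proposed in the paper.

    from itertools import permutations
    import numpy as np

    def resolve_permutations(S: np.ndarray) -> np.ndarray:
        """Align per-frequency source orderings by envelope correlation.

        S: separated magnitude envelopes, shape (n_freq, n_src, n_frames)."""
        F, N, _ = S.shape
        aligned = S.copy()
        centroid = aligned.mean(axis=0)  # (n_src, n_frames) reference envelopes
        for f in range(F):
            # Choose the ordering whose envelopes best correlate with the centroid.
            best = max(
                permutations(range(N)),
                key=lambda p: sum(np.corrcoef(aligned[f, p[n]], centroid[n])[0, 1]
                                  for n in range(N)),
            )
            aligned[f] = aligned[f, list(best)]
        return aligned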

Doctoral dissertations on the topic "Auditory source separation"

1

Beauvois, Michael W. "A computer model of auditory stream segregation". Thesis, Loughborough University, 1991. https://dspace.lboro.ac.uk/2134/33091.

Abstract:
A simple computer model is described that takes a novel approach to the problem of accounting for perceptual coherence among successive pure tones of changing frequency by using simple physiological principles that operate at a peripheral, rather than a central level. The model is able to reproduce a number of streaming phenomena found in the literature using the same parameter values. These are: (1) the build-up of streaming over time; (2) the temporal coherence and fission boundaries of human listeners; (3) the ambiguous region; and (4) the trill threshold. In addition, the principle of excitation integration used in the model can be used to account for auditory grouping on the basis of the Gestalt perceptual principles of closure, proximity, continuity, and good continuation, as well as the pulsation threshold. The examples of Gestalt auditory grouping accounted for by the excitation integration principle indicate that the predictive power of the model would be considerably enhanced by the addition of a cross-channel grouping mechanism that worked on the basis of common onsets and offsets, as more complex stimuli could then be processed by the model.
2

Leech, Stuart Matthew. "The effect on audiovisual speech perception of auditory and visual source separation". Thesis, University of Sussex, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.271770.

3

Melih, Kathy. "Audio Source Separation Using Perceptual Principles for Content-Based Coding and Information Management". Griffith University. School of Information Technology, 2004. http://www4.gu.edu.au:8080/adt-root/public/adt-QGU20050114.081327.

Abstract:
The information age has brought with it a dual problem. In the first place, the ready access to mechanisms to capture and store vast amounts of data in all forms (text, audio, image and video) has resulted in a continued demand for ever more efficient means to store and transmit these data. In the second, the rapidly increasing store demands effective means to structure and access the data in an efficient and meaningful manner. In terms of audio data, the first challenge has traditionally been the realm of audio compression research, which has focused on statistical, unstructured audio representations that obfuscate the inherent structure and semantic content of the underlying data. This has only served to further complicate the resolution of the second challenge, resulting in access mechanisms that are either impractical to implement, too inflexible for general application or too low level for the average user. Thus, an artificial dichotomy has been created from what is in essence a dual problem. The founding motivation of this thesis is that, although the hypermedia model has been identified as the ideal, cognitively justified method for organising data, existing audio data representations and coding models provide little, if any, support for, or resemblance to, this model. It is the contention of the author that any successful attempt to create hyperaudio must resolve this schism, addressing both storage and information management issues simultaneously. In order to achieve this aim, an audio representation must be designed that provides compact data storage while, at the same time, revealing the inherent structure of the underlying data. Thus it is the aim of this thesis to present a representation designed with these factors in mind. Perhaps the most difficult hurdle in the way of achieving the aims of content-based audio coding and information management is that of auditory source separation. The MPEG committee noted this requirement during the development of its MPEG-7 standard; however, the mechanics of "how" to achieve auditory source separation were left as an open research question. This same committee proposed that MPEG-7 would "support descriptors that can act as handles referring directly to the data, to allow manipulation of the multimedia material." While meta-data tags are a partial solution to this problem, they cannot allow manipulation of audio material down to the level of individual sources when several simultaneous sources exist in a recording. In order to achieve this aim, the data themselves must be encoded in a manner that allows these descriptors to be formed. Thus, content-based coding is obviously required. In the case of audio, this is impossible to achieve without effecting auditory source separation. Auditory source separation is the concern of computational auditory scene analysis (CASA). However, the findings of CASA research have traditionally been restricted to a limited domain. To date, the only real application of CASA research to what could loosely be classified as information management has been in the area of signal enhancement for automatic speech recognition systems. In these systems, a CASA front end serves as a means of separating the target speech from the background "noise". As such, the design of a CASA-based approach, as presented in this thesis, to one of the most significant challenges facing audio information management research represents a significant contribution to the field of information management.
Thus, this thesis unifies research from three distinct fields in an attempt to resolve some specific and general challenges faced by all three. It describes an audio representation that is based on a sinusoidal model from which low-level auditory primitive elements are extracted. The use of a sinusoidal representation is somewhat contentious, with the modern trend in CASA research tending toward more complex approaches in order to resolve issues relating to co-incident partials. However, the choice of a sinusoidal representation has been validated by the demonstration of a method to resolve many of these issues. The majority of the thesis contributes several algorithms to organise the low-level primitives into low-level auditory objects that may form the basis of nodes or link anchor points in a hyperaudio structure. Finally, preliminary investigations into the representation's suitability for coding and information management tasks are outlined as directions for future research.
4

Melih, Kathy. "Audio Source Separation Using Perceptual Principles for Content-Based Coding and Information Management". Thesis, Griffith University, 2004. http://hdl.handle.net/10072/366279.

Abstract:
The information age has brought with it a dual problem. In the first place, the ready access to mechanisms to capture and store vast amounts of data in all forms (text, audio, image and video) has resulted in a continued demand for ever more efficient means to store and transmit these data. In the second, the rapidly increasing store demands effective means to structure and access the data in an efficient and meaningful manner. In terms of audio data, the first challenge has traditionally been the realm of audio compression research, which has focused on statistical, unstructured audio representations that obfuscate the inherent structure and semantic content of the underlying data. This has only served to further complicate the resolution of the second challenge, resulting in access mechanisms that are either impractical to implement, too inflexible for general application or too low level for the average user. Thus, an artificial dichotomy has been created from what is in essence a dual problem. The founding motivation of this thesis is that, although the hypermedia model has been identified as the ideal, cognitively justified method for organising data, existing audio data representations and coding models provide little, if any, support for, or resemblance to, this model. It is the contention of the author that any successful attempt to create hyperaudio must resolve this schism, addressing both storage and information management issues simultaneously. In order to achieve this aim, an audio representation must be designed that provides compact data storage while, at the same time, revealing the inherent structure of the underlying data. Thus it is the aim of this thesis to present a representation designed with these factors in mind. Perhaps the most difficult hurdle in the way of achieving the aims of content-based audio coding and information management is that of auditory source separation. The MPEG committee noted this requirement during the development of its MPEG-7 standard; however, the mechanics of "how" to achieve auditory source separation were left as an open research question. This same committee proposed that MPEG-7 would "support descriptors that can act as handles referring directly to the data, to allow manipulation of the multimedia material." While meta-data tags are a partial solution to this problem, they cannot allow manipulation of audio material down to the level of individual sources when several simultaneous sources exist in a recording. In order to achieve this aim, the data themselves must be encoded in a manner that allows these descriptors to be formed. Thus, content-based coding is obviously required. In the case of audio, this is impossible to achieve without effecting auditory source separation. Auditory source separation is the concern of computational auditory scene analysis (CASA). However, the findings of CASA research have traditionally been restricted to a limited domain. To date, the only real application of CASA research to what could loosely be classified as information management has been in the area of signal enhancement for automatic speech recognition systems. In these systems, a CASA front end serves as a means of separating the target speech from the background "noise". As such, the design of a CASA-based approach, as presented in this thesis, to one of the most significant challenges facing audio information management research represents a significant contribution to the field of information management.
Thus, this thesis unifies research from three distinct fields in an attempt to resolve some specific and general challenges faced by all three. It describes an audio representation that is based on a sinusoidal model from which low-level auditory primitive elements are extracted. The use of a sinusoidal representation is somewhat contentious, with the modern trend in CASA research tending toward more complex approaches in order to resolve issues relating to co-incident partials. However, the choice of a sinusoidal representation has been validated by the demonstration of a method to resolve many of these issues. The majority of the thesis contributes several algorithms to organise the low-level primitives into low-level auditory objects that may form the basis of nodes or link anchor points in a hyperaudio structure. Finally, preliminary investigations into the representation's suitability for coding and information management tasks are outlined as directions for future research.
Thesis (PhD Doctorate), Doctor of Philosophy (PhD), School of Information Technology.
5

Belzner, Katharine Ann. "DPOAE two-source separation in adult Japanese quail (Coturnix coturnix japonica)". Full text of dissertation on the Internet (891.53 KB), 2010. http://www.lib.jmu.edu/general/etd/2010/doctorate/belzneka/belzneka_doctorate_04-19-2010_02.pdf.

6

Deleforge, Antoine. "Acoustic Space Mapping : A Machine Learning Approach to Sound Source Separation and Localization". Thesis, Grenoble, 2013. http://www.theses.fr/2013GRENM033/document.

Abstract:
In this thesis, we address the long-studied problem of binaural (two-microphone) sound source separation and localization through supervised learning. To achieve this, we develop a new paradigm referred to as acoustic space mapping, at the crossroads of binaural perception, robot hearing, audio signal processing and machine learning. The proposed approach consists in learning a link between the auditory cues perceived by the system and the position of the emitting sound source in another modality of the system, such as the visual space or the motor space. We propose new experimental protocols to automatically gather large training sets that associate such data. The obtained datasets are then used to reveal some fundamental intrinsic properties of acoustic spaces and lead to the development of a general family of probabilistic models for locally-linear high- to low-dimensional space mapping. We show that these models unify several existing regression and dimensionality reduction techniques, while encompassing a large number of new models that generalize previous ones. The properties and inference of these models are thoroughly detailed, and the prominent advantage of the proposed methods over state-of-the-art techniques is established on different space mapping applications, beyond the scope of auditory scene analysis. We then show how the proposed methods can be probabilistically extended to tackle the long-known cocktail party problem, i.e., accurately localizing one or several sound sources emitting at the same time in a real-world environment, and separating the mixed signals. We show that the resulting techniques perform these tasks with unequaled accuracy. This demonstrates the important role of learning and puts forward the acoustic space mapping paradigm as a promising tool for robustly addressing the most challenging problems in computational binaural audition.
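To make the acoustic space mapping paradigm concrete, here is a deliberately simplified sketch: store (interaural cue vector, source position) training pairs, then localize by regressing over the nearest stored cues. The thesis develops much richer probabilistic locally-linear mappings; the nearest-neighbour regressor and the per-band interaural level difference features are assumptions of this illustration only.

    import numpy as np

    class CueToPositionMap:
        """Toy supervised mapping from binaural cues to source positions."""

        def fit(self, cues: np.ndarray, positions: np.ndarray) -> "CueToPositionMap":
            # cues: (n_examples, n_features), e.g. per-band interaural level differences.
            # positions: (n_examples, n_dims), e.g. azimuth/elevation of the source.
            self.cues, self.positions = cues, positions
            return self

        def predict(self, cue: np.ndarray, k: int = 5) -> np.ndarray:
            # Average the positions of the k training cues closest to the query.
            dist = np.linalg.norm(self.cues - cue, axis=1)
            return self.positions[np.argsort(dist)[:k]].mean(axis=0)

With real data, the training pairs would come from the automatic acquisition protocols the abstract describes, which associate auditory cues with positions observed in the visual or motor modality.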
7

Joseph, Joby. "Why only two ears? Some indicators from the study of source separation using two sensors". Thesis, Indian Institute of Science, 2004. http://hdl.handle.net/2005/55.

Abstract:
In this thesis we develop algorithms for estimating broadband source signals from a mixture using only two sensors. This is motivated by what is known in the literature as the cocktail party effect: the ability of human beings to listen to a desired source within a mixture of sources using at most two ears. Such a study lets us achieve a better understanding of the auditory pathway in the brain and confirm results from physiology and psychoacoustics, obtain a clue for locating an equivalent structure in the brain corresponding to a modification that improves the algorithm, come up with a benchmark system to automate the evaluation of systems like 'surround sound', and perform speech recognition in noisy environments. Moreover, it is possible that what we learn about replicating the functional units of the brain may help us replace those units with signal processing units for patients suffering from defects in them. There are two parts to the thesis. In the first part we assume the source signals to be broadband and to have strong spectral overlap. The channel is assumed to have a few strong multipaths. We propose an algorithm to estimate all the strong multipaths from each source to the sensors, for more than two sources, with measurements from two sensors. Because the channel matrix is not invertible when the number of sources exceeds the number of sensors, we make use of the estimates of the multipath delays for each source to improve the SIR of the sources. In the second part we look at a specific scenario of colored signals and a channel with a prominent direct path. Speech signals as the sources in a weakly reverberant room, with a pair of microphones as the sensors, satisfy these conditions. We consider the cases with and without a head-like structure between the microphones; the head-like structure we used was a cubical block of wood. We propose an algorithm for separating sources under such a scenario. We identify the features of speech and of the channel which make it possible for the human auditory system to solve the cocktail party problem; these properties are the same as those satisfied by our model. The algorithm works well in a partly acoustically treated room (with three persons speaking, two microphones, and data acquired using a standard PC setup) and not so well in a heavily reverberant scenario. We see that there are similarities between the processing steps involved in the algorithm and what we know of the way our auditory system works, especially in the regions before the auditory cortex in the auditory pathway. Based on the above experiments we give reasons to support the hypothesis of why all known organisms need only two ears and not more, yet may have more than two eyes to their advantage. Our results also indicate that part of the pitch estimation for individual sources might occur in the brain after the individual source components have been separated, which might resolve the dilemma of having to perform multi-pitch estimation. Recent works suggest that there are parallel pathways in the brain, up to the primary auditory cortex, which deal with temporal-cue-based and spatial-cue-based processing. Our model seems to mimic the pathway which makes use of the spatial cues.
8

Ardam, Nagaraju. "Study of ASA Algorithms". Thesis, Linköpings universitet, Elektroniksystem, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-70996.

Abstract:
Hearing aid devices are used to help people with hearing impairment. The number of people that require hearing aid devices is possibly constant over the years; however, the number of people that now have access to hearing aid devices is increasing rapidly. The hearing aid devices must be small, consume very little power, and be fairly accurate, even though it is normally more important for the user that the hearing aid looks good (is discreet). Once the hearing aid device is prescribed to the user, she/he needs to train and adjust the device to compensate for the individual impairment. Within the framework of this project we are researching hearing aid devices that can be trained by the hearing-impaired person her-/himself. This project is about finding a suitable noise cancellation algorithm for the hearing-aid device. We consider several types of algorithms, such as microphone array signal processing, Independent Component Analysis (ICA) based on two microphones, known as Blind Source Separation (BSS), and the DRNPE algorithm. We ran these current, most sophisticated and robust algorithms against noise backgrounds such as cocktail noise, street, public places, train and babble situations to test their efficiency. The BSS algorithm did well in some situations and gave average results in others, whereas the one-microphone algorithm gave steady results in all situations. The output is good enough to listen to the targeted audio. The functionality and performance of the proposed algorithm are evaluated with different non-stationary noise backgrounds. From the performance results it can be concluded that by using the proposed algorithm we are able to reduce the noise to a certain level. SNR, system delay, minimum error and audio perception are the vital parameters considered to evaluate the performance of the algorithms. Based on these parameters, an algorithm is suggested for the hearing aid.
9

Otsuka, Takuma. "Bayesian Microphone Array Processing". 京都大学 (Kyoto University), 2014. http://hdl.handle.net/2433/188871.

Kyoto University doctoral dissertation, Doctor of Informatics (博士(情報学)), Graduate School of Informatics, Department of Intelligence Science and Technology. Examining committee: Prof. Hiroshi G. Okuno (chair), Prof. Tatsuya Kawahara, Assoc. Prof. Marco Cuturi, and Lect. Kazuyoshi Yoshii.
10

Chen, Zhuo. "Single Channel auditory source separation with neural network". Thesis, 2017. https://doi.org/10.7916/D8W09C8N.

Abstract:
Although distinguishing different sounds in a noisy environment is a relatively easy task for humans, source separation has long been extremely difficult in audio signal processing. The problem is challenging for three reasons: the large variety of sound types, the abundant mixing conditions and the unclear mechanism for distinguishing sources, especially similar sounds. In recent years, neural network based methods have achieved impressive successes in various problems, including speech enhancement, where the task is to separate the clean speech out of the noisy mixture. However, the current deep learning based source separators do not perform well on real recorded noisy speech, and more importantly, are not applicable in more general source separation scenarios such as overlapped speech. In this thesis, we first propose extensions to the current mask learning network for the problem of speech enhancement, to fix the scale mismatch problem which usually occurs in real recorded audio. We solve this problem by combining two additional restoration layers with the existing mask learning network. We also propose a residual learning architecture for speech enhancement, further improving the network's generalization under different recording conditions. We evaluate the proposed speech enhancement models on CHiME 3 data. Without retraining the acoustic model, the best bidirectional LSTM with residual connections yields a 25.13% relative WER reduction on real data and 34.03% WER on simulated data. We then propose a novel neural network based model called "deep clustering" for more general source separation tasks. We train a deep network to assign contrastive embedding vectors to each time-frequency region of the spectrogram in order to implicitly predict the segmentation labels of the target spectrogram from the input mixtures. This yields a deep network-based analogue to spectral clustering, in that the embeddings form a low-rank pairwise affinity matrix that approximates the ideal affinity matrix, while enabling much faster performance. At test time, the clustering step "decodes" the segmentation implicit in the embeddings by optimizing K-means with respect to the unknown assignments. Experiments on single channel mixtures from multiple speakers show that a speaker-independent model trained on two-speaker and three-speaker mixtures can improve signal quality for mixtures of held-out speakers by an average of over 10 dB. We then propose an extension of deep clustering, named the "deep attractor" network, that allows the system to perform efficient end-to-end training. In the proposed model, attractor points for each source are first created from the acoustic signals; these pull together the time-frequency bins corresponding to each source by finding the centroids of the sources in the embedding space, and are subsequently used to determine the similarity of each bin in the mixture to each source. The network is then trained to minimize the reconstruction error of each source by optimizing the embeddings. We show that this framework can achieve even better results. Lastly, we introduce two applications of the proposed models, in singing voice separation and in a smart hearing aid device. For the former, a multi-task architecture is proposed, which combines the deep clustering and the classification based network, and a new state-of-the-art separation result was achieved, where the signal-to-noise ratio was improved by 11.1 dB on music and 7.9 dB on singing voice.
In the smart hearing aid application, we combine neural decoding with the separation network. The system first decodes the user's attention, which is then used to guide the separator toward the target source. Both objective and subjective studies show that the proposed system can accurately decode attention and significantly improve the user experience.
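The deep clustering "decoding" step described above can be made concrete in a few lines: K-means is run over the per-bin embeddings, and each cluster becomes a binary time-frequency mask. The sketch below is a schematic reading of the abstract that assumes the embedding network's output is already available as a matrix; it is not code from the thesis.

    import numpy as np
    from sklearn.cluster import KMeans

    def decode_masks(embeddings: np.ndarray, n_sources: int) -> np.ndarray:
        """Cluster per-bin embeddings into binary separation masks.

        embeddings: (n_bins, emb_dim), one embedding per time-frequency bin.
        Returns masks of shape (n_sources, n_bins)."""
        labels = KMeans(n_clusters=n_sources, n_init=10).fit_predict(embeddings)
        return np.stack([(labels == k).astype(float) for k in range(n_sources)])

    # Each mask, reshaped to (n_frames, n_freq), is applied to the mixture
    # spectrogram and the corresponding source resynthesised by inverse STFT.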

Book chapters on the topic "Auditory source separation"

1

Hummersone, Christopher, Toby Stokes, and Tim Brookes. "On the Ideal Ratio Mask as the Goal of Computational Auditory Scene Analysis". In Blind Source Separation, 349–68. Berlin, Heidelberg: Springer Berlin Heidelberg, 2014. http://dx.doi.org/10.1007/978-3-642-55016-4_12.

2

Duong, Ngoc Q. K., Emmanuel Vincent, and Rémi Gribonval. "Under-Determined Reverberant Audio Source Separation Using Local Observed Covariance and Auditory-Motivated Time-Frequency Representation". In Latent Variable Analysis and Signal Separation, 73–80. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. http://dx.doi.org/10.1007/978-3-642-15995-4_10.

3

Hamada, Nozomu, and Ning Ding. "Source Separation and DOA Estimation for Underdetermined Auditory Scene". In Soundscape Semiotics - Localisation and Categorisation. InTech, 2014. http://dx.doi.org/10.5772/56013.


Conference papers on the topic "Auditory source separation"

1

Li, Han, Kean Chen, and Bernhard U. Seeber. "Auditory Filterbanks Benefit Universal Sound Source Separation". In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021. http://dx.doi.org/10.1109/icassp39728.2021.9414105.

2

Kim, Chanwoo, Kshitiz Kumar, and Richard M. Stern. "Binaural sound source separation motivated by auditory processing". In ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2011. http://dx.doi.org/10.1109/icassp.2011.5947497.

3

Faller, Kenneth John, Jason Riddley, and Elijah Grubbs. "Automatic blind source separation of speech sources in an auditory scene". In 2017 51st Asilomar Conference on Signals, Systems, and Computers. IEEE, 2017. http://dx.doi.org/10.1109/acssc.2017.8335176.

4

Kong, Qiuqiang, Yuxuan Wang, Xuchen Song, Yin Cao, Wenwu Wang, and Mark D. Plumbley. "Source Separation with Weakly Labelled Data: an Approach to Computational Auditory Scene Analysis". In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2020. http://dx.doi.org/10.1109/icassp40776.2020.9053396.

5

Hussain, Abrar, Kalaivani Chellappan, and Siti Zamratol Mai-Sarah Mukari. "Evaluation of source separation using projection pursuit algorithm for computer-based auditory training system". In 2017 7th IEEE International Conference on System Engineering and Technology (ICSET). IEEE, 2017. http://dx.doi.org/10.1109/icsengt.2017.8123436.

6

Cantisani, Giorgia, Slim Essid, and Gael Richard. "Neuro-Steered Music Source Separation With EEG-Based Auditory Attention Decoding And Contrastive-NMF". In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021. http://dx.doi.org/10.1109/icassp39728.2021.9413841.
