Dissertations / Theses: 'Vocoder'

1

LeBlanc, Wilfrid P. "An advanced speech coder based on a rate-distortion theory framework." Dissertation, Carleton University, Electrical Engineering, Ottawa, 1988.

2

Griffin, Daniel W. "Multi-band excitation vocoder." Thesis, Massachusetts Institute of Technology, 1987. http://hdl.handle.net/1721.1/14803.

3

Martins, José Antônio. "Vocoder LPC com quantização vetorial." [s.n.], 1991. http://repositorio.unicamp.br/jspui/handle/REPOSIP/261389.

Abstract:

Advisor: Fabio Violaro
Dissertation (master's) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica
Abstract: This work describes the principles of the LPC vocoder and presents the methods for computing its parameters. It also reports simulation results for LPC vocoders using scalar quantization, vector quantization, and interpolation of the quantized parameters. An unquantized LPC vocoder was designed first and served as the reference for evaluating the quantized vocoders. Scalar quantization of the log-area ratio coefficients yielded a vocoder at 2200 bit/s with good quality and high intelligibility of the synthesized speech. Vector quantization gave good performance at rates on the order of 1000 bit/s. These rates were reduced by 50% through linear interpolation, transmitting only the parameters of the odd-numbered frames. This yielded vocoders at rates around 500 bit/s whose synthesized speech is degraded relative to the previous systems but still retains good intelligibility.
Master's degree
Electronics and Communications
Master in Electrical Engineering
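The rate halving described in this abstract (transmit only the odd-numbered frames, rebuild the rest by linear interpolation) can be illustrated with a minimal sketch. The function name `decode_interpolated` and the representation of a frame as a plain list of parameter values are assumptions made for illustration; the dissertation's actual parameter set and interpolation details are not given here.

```python
def decode_interpolated(odd_frames):
    """Rebuild a full frame sequence from the transmitted odd-numbered frames.

    Each frame is a list of parameter values (hypothetical layout).
    Every skipped frame is taken as the midpoint of its two transmitted
    neighbours, i.e. linear interpolation at the halfway point.
    Expects at least one transmitted frame.
    """
    frames = []
    for cur, nxt in zip(odd_frames, odd_frames[1:]):
        frames.append(cur)
        frames.append([(a + b) / 2.0 for a, b in zip(cur, nxt)])
    frames.append(odd_frames[-1])
    return frames


# Three transmitted frames with two parameters each.
sent = [[0.0, 2.0], [1.0, 4.0], [3.0, 0.0]]
rebuilt = decode_interpolated(sent)
print(len(rebuilt))  # number of frames available to the synthesizer
```

Because only every other frame is sent, the parameter bit rate is roughly halved, which matches the drop from about 1000 bit/s to about 500 bit/s reported in the abstract.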

4

Hudson, Nicholaus D. W. "The self-excited vocoder for mobile telephony." Thesis, University of Bath, 1992. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.760629.

5

Moore, James Thomas. "A mixed excitation vocoder with fuzzy logic classifier." Thesis, Monterey, California. Naval Postgraduate School, 1992. http://hdl.handle.net/10945/23960.

6

Foley, Jeffrey J. (Jeffrey Joseph). "Digital implementation of a frequency-lowering channel vocoder." Thesis, Massachusetts Institute of Technology, 1996. http://hdl.handle.net/1721.1/38798.

Abstract:

Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1996.
Includes bibliographical references (p. 58-59).
by Jeffrey J. Foley.
M.Eng.

7

Carr, Raymond C. "Improvements to a pitch-synchronous linear predictive coding (LPC) vocoder." Thesis, University of Ottawa (Canada), 1989. http://hdl.handle.net/10393/5954.

8

Yeh, Ernest Nanjung. "Advanced Vocoder Idle Slot Exploitation for TIA IS-136 standard." Thesis, Massachusetts Institute of Technology, 1998. http://hdl.handle.net/1721.1/47580.

Abstract:

Thesis (S.B. and M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1998.
Includes bibliographical references (p. 55).
by Ernest Nanjung Yeh.
S.B. and M.Eng.

9

Manjunath, Sharath. "Implementation of a variable rate vocoder and its performance analysis." Thesis, 1994. http://scholar.lib.vt.edu/theses/available/etd-06102009-063255/.

10

Iyengar, Vasu. "A low delay 16 kbit/sec coder for speech signals." Thesis, McGill University, 1987. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=63799.

11

Huang, Ying. "Effects of vocoder distortion and packet loss on network echo cancellation." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape4/PQDD_0029/MQ66876.pdf.

12

McCree, Alan V. "A new LPC vocoder model for low bit rate speech coding." Diss., Georgia Institute of Technology, 1992. http://hdl.handle.net/1853/15053.

13

Chung, Jae H. "A new homomorphic vocoder framework using analysis-by-synthesis excitation analysis." Diss., Georgia Institute of Technology, 1991. http://hdl.handle.net/1853/15471.

14

Donaldson, Nicholas. "Extending the phase vocoder with damped sinusoid atomic decomposition of transients." Thesis, McGill University, 2011. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=104832.

Abstract:

Pitch-preserving time scale modification and time-preserving pitch modification of recorded sounds are integral effects in modern digital music production, and some implementation of these effects can be found in nearly all commercial digital audio production software. Recent research has reduced transient smearing artifacts in otherwise high-quality frequency domain time scaling (phase vocoder) algorithms, but many modern implementations still exhibit noticeable smoothing of very abrupt transients, especially for drastic time scale modifications. By using a sparse atomic decomposition method to create representations of the transients in an audio signal, the transient and steady-state content of the signal can be separated and processed independently. The phase vocoder can then be used to modify only the steady-state content of the signal, preserving the fidelity of transients when using time scaling effects. Such an extension is introduced here, along with a working software implementation, which performs this feature-specific processing by using a matching pursuit algorithm with exponentially damped sinusoids to represent and remove transients from an audio signal. A high-resolution transient onset detection algorithm is also presented, as well as a practical application of phase locking to a computationally efficient phase vocoder formulation that is particularly suited to time stretching.
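For readers unfamiliar with phase vocoder time scaling, the per-bin phase propagation step can be sketched as follows. This is the common textbook formulation, not the computationally efficient or phase-locked variant the thesis presents; the function names and the parameters `hop_analysis` and `hop_synthesis` are assumptions chosen for illustration.

```python
import math


def wrap_phase(phi):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (phi + math.pi) % (2.0 * math.pi) - math.pi


def propagate_phase(prev_phase, cur_phase, prev_synth_phase,
                    k, n_fft, hop_analysis, hop_synthesis):
    """Return the synthesis phase of bin k for the current frame.

    prev_phase and cur_phase are the measured analysis phases of bin k in
    two consecutive frames spaced hop_analysis samples apart.
    """
    omega_k = 2.0 * math.pi * k / n_fft       # bin centre frequency, rad/sample
    expected = omega_k * hop_analysis         # phase advance of an on-bin sinusoid
    deviation = wrap_phase(cur_phase - prev_phase - expected)
    inst_freq = omega_k + deviation / hop_analysis  # estimated true frequency
    # Advance the output phase by the estimated frequency over the output hop.
    return prev_synth_phase + inst_freq * hop_synthesis


# Stretch by a factor of two: analysis hop 16, synthesis hop 32.
phase_out = propagate_phase(0.3, 0.46, 0.0, k=4, n_fft=64,
                            hop_analysis=16, hop_synthesis=32)
```

Applying this update to every bin keeps each sinusoidal component coherent from frame to frame, which works well for steady-state content and is exactly where abrupt transients get smeared. That limitation is what motivates separating transients before time scaling.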

15

Apel, Theodore R. "Feature preservation and negated music in a phase vocoder sound representation." Diss., University of California, San Diego, 2008. http://wwwlib.umi.com/cr/ucsd/fullcit?p3303958.

Abstract:

Thesis (Ph. D.)--University of California, San Diego, 2008.
Title from first page of PDF file (viewed Jun. 17, 2008). Available via ProQuest Digital Dissertations. Vita. Includes bibliographical references: P. 92-98.

16

Morgenstern, Robert M. "Vector quantization applied to speech coding in the wireless environment." Thesis, 1994. http://scholar.lib.vt.edu/theses/available/etd-07292009-090440/.

17

LeBlanc, Wilfrid P. "Speech coding at low to medium bit rates." Dissertation, Carleton University, Electrical Engineering, Ottawa, 1992.

18

Stefanovic, Milos. "Vocoder model based variable rate narrowband and wideband speech coding below 9 kbps." Thesis, University of Surrey, 1999. http://epubs.surrey.ac.uk/843965/.

Abstract:

The past two decades have witnessed rapid growth and development within the telecommunications industry. This has been primarily fuelled by the proliferation of digital mobile communication applications and services which have become commonplace and easily within the financial reach of businesses and the general public. Current research trends, involving integration and packetisation of voice, video and data channels into true multimedia communications, promise a similar technological revolution in the next decade. One of the key design issues of the new high quality multimedia services is a requirement for very high data rates. Whilst the available bandwidth in wire based terrestrial network is a relatively cheap and expandable resource, it becomes inherently limited in satellite or cellular radio systems. In order to accommodate ever growing numbers of subscribers whilst maintaining high quality and low operational costs, it is necessary to maximise spectral efficiency and reduce power consumption. This has given rise to the rapid development of signal compression techniques, which in the speech transmission domain are known as speech coding algorithms. The research carried out for this thesis has mainly focused on the design and development of low bit rate narrowband and wideband speech coding systems which utilise a variable rate approach in order to improve their perceptual quality and reduce their transmission rates. The algorithms subsequently developed are based on the existing vocoding schemes, whose rigid fixed rate structure is a major limitation to achieving higher quality and lower rates. The variable rate schemes utilise the time-varying characteristics of the speech signal which is classified according to the developed segmentation algorithms. 
Two main schemes were developed, a variable bit rate with an average as low as 1.35 kbps and a variable frame rate with an average of 2.1 kbps, both achieving or even surpassing the subjective quality of the existing vocoding standard at 4.15 kbps. Wideband speech exhibits characteristics which are not embodied within narrowband speech and which contribute to the superior perceived quality. A very high quality wideband vocoder operating at rates (fixed and variable) below 9 kbps is presented in this thesis, whereby particular attention is paid to preserving the information in higher frequencies in order to maximise the attainable quality.

19

Kim, Hyun Soo. "Speech analysis techniques useful for low or variable bit rate coding." Awarded by: University of New South Wales, School of Electrical Engineering and Telecommunications, Faculty of Engineering, 2005. http://handle.unsw.edu.au/1959.4/22050.

Abstract:

We investigate, improve and develop speech analysis techniques which can be used to enhance various speech processing systems, especially low bit rate or variable bit rate coding of speech. The coding technique based on the sinusoidal representation of speech is investigated and implemented. Based on this study of the sinusoidal model of speech, improved analysis techniques for voicing, pitch and spectral estimation are developed, as well as a noise reduction technique. We investigate the properties and limitations of the spectral envelope estimation vocoder (SEEVOC). We generalize, optimize and improve the SEEVOC and also compare it with linear prediction (LP) in the presence of noise. The properties and applications of morphological filters for speech analysis are investigated. We introduce and investigate a novel nonlinear spectral envelope estimation method based on morphological operations, which is found to be very robust against noise. This method is also compared with the SEEVOC method. A simple method for the optimum selection of the structuring set size without using prior pitch information is proposed. The morphological approach is then used for a new pitch estimation method and for the general sinusoidal analysis of speech or audio. Many of the new methods are based on a novel systematic analysis of the peak features of signals, including the study of higher order peaks. We propose a novel peak feature algorithm, which measures the peak characteristics of the speech signal in the time domain, to be used for end point detection and segmentation of speech. This nonparametric algorithm is flexible, efficient and very robust in noise. Several simple voicing measures are proposed and used in a new speech classifier. The harmonic-plus-noise decomposition technique is improved and extended to give an alternative to the methods used in the sinusoidal analysis method. Its applications to pitch estimation, speech classification and noise reduction are investigated.

20

Molina, Villota Daniel Hernán. "Vocal audio effects : tuning, vocoders, interaction." Electronic Thesis or Diss., Sorbonne université, 2024. http://www.theses.fr/2024SORUS166.

Abstract:

This research focuses on the use of digital audio effects (DAFx) on vocal tracks in modern music, mainly pitch correction and vocoding. Despite the widespread use of these effects, there has not been enough discussion of how to improve autotune or of what makes a pitch modification more musically interesting. A taxonomic analysis of vocal effects was conducted, with examples of how effects can preserve or transform vocal identity and of their musical use, particularly in pitch modification. Furthermore, a compendium of technical-musical terms was developed to distinguish types of vocal tuning and cases of pitch correction. A graphical pitch correction method for vocal use, Dynamic Pitch Warping (DPW), is proposed. This method is validated with theoretical pitch curves (supported by audio) and compared with a reference method. Although the vocoder is essential for pitch correction, a descriptive and comparative basis for vocoding techniques is lacking. Therefore, a sonic description of the vocoder is proposed, given its use for tuning, employing four different algorithms: Antares, Retune, World, and Circe. Subsequently, a subjective psychoacoustic evaluation is conducted to compare the four systems in the following cases: original tone resynthesis, soft vocal correction, and extreme vocal correction. This evaluation seeks to understand the coloring of each vocoder (preservation of vocal identity) and the role of melody in extreme vocal correction. Furthermore, a protocol for the subjective evaluation of pitch correction methods is proposed and implemented. This protocol compares our DPW pitch correction method with the ATA reference method. This study aims to determine whether there are perceptual differences between the systems and in which cases they occur, which is useful for developing new melodic modification methods in the future. Finally, the interactive use of vocal effects was explored, capturing hand movement with wireless sensors and mapping it to control effects that modify the perception of space and vocal melody.

21

Atkinson, Ian Andrew. "Advanced linear predictive speech compression at 3.0 kbits/sec and below." Thesis, University of Surrey, 1997. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.336527.

22

Fischman, Rajmil. "Musical applications of digital synthesis and processing techniques : realisation using Csound and the Phase Vocoder." Thesis, University of York, 1991. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.280530.

23

Gouveia, Paulo D. F. "Codificação de fala por modelos variáveis no tempo." Master's thesis, Universidade de Aveiro, 1996. http://hdl.handle.net/10198/1572.

Abstract:

The work presented in this thesis is a contribution to the optimization of speech coding. To this end, coding models based on LP (linear prediction) filters with time-varying parameters are used, in contrast with the fixed models used in conventional methods. In conventional methods, the prediction filters are adapted simply through periodic updates of their parameters, which therefore do not represent a gradual and continuous evolution over time. The technique used to implement the time-varying models is based on B-spline functions that represent the waveforms of the LP parameters. To study the viability of the proposed model, the performance of a linear predictive vocoder was analyzed with both the time-varying LP model and the conventional fixed-parameter LP model, enabling a comparison of their performance. From the results, we conclude that speech coding by time-varying models, although it did not show convincing advantages, can be regarded as another form of coding that competes with existing methodologies.

24

Leitner, Jakub. "Hlasové kodéry pro nízké přenosové rychlosti." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2009. http://www.nusl.cz/ntk/nusl-218173.

Abstract:

The thesis deals with coders and voice coders used in speech signal processing. The aim is to create a comprehensive overview of coders and voice coders, including a description of their properties. In the second part of the thesis, speech processing algorithms and methods are simulated in Matlab Simulink. The basic methods of speech processing and a parametric LPC voice coder were simulated in the time domain. The LPC voice coder model implements the algorithms for obtaining speech segment parameters: classification of voiced and unvoiced speech segments, LPC analysis, and pitch detection. The output is a parametric signal that enables a receiver to synthesize a speech signal. Appendix 1 lists the names or standard numbers of coders and their properties; Appendix 2 gives an overview of speech processing methods.
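The LPC analysis step named in this abstract is commonly done with the autocorrelation method and the Levinson-Durbin recursion. The sketch below uses that standard approach as an assumption; the thesis itself works in Matlab Simulink and its exact procedure is not described here. The function names are illustrative.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation of one speech frame for lags 0..max_lag."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r[0..order].

    Returns (a, reflection, error) where a[0] == 1 and the predictor is
    x_hat[n] = -sum(a[i] * x[n - i] for i in 1..order).
    Assumes r[0] > 0 (a non-silent frame).
    """
    a = [1.0] + [0.0] * order
    error = r[0]
    reflection = []
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / error
        reflection.append(k)
        prev = a[:]
        for i in range(1, m):
            a[i] = prev[i] + k * prev[m - i]
        a[m] = k
        error *= (1.0 - k * k)  # remaining prediction error energy
    return a, reflection, error


# Autocorrelation of an ideal first-order process with coefficient 0.5.
coeffs, ks, err = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

The reflection coefficients returned alongside the predictor are the quantities usually converted to log-area ratios before quantization in LPC vocoders.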

25

Markle, Blake L. "A comparative study of time-stretching algorithms for audio signals." Thesis, McGill University, 2001. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=31119.

Abstract:

Algorithms exist which perform independent transformations on the frequency or duration of a digital audio signal. These processes have different results on different types of audio signals. A comparative study of granular and phase vocoder algorithms, their implementation, and their respective effects on audio signals was made to determine which algorithm is best suited to a particular type of audio signal.

26

SOTERO, FILHO Roberto Fernando Batista. "Novas abordagens para codificação de voz e reconhecimento automático de locutor projetadas via mascaramento pleno em frequência por oitava." Universidade Federal de Pernambuco, 2009. https://repositorio.ufpe.br/handle/123456789/26231.

Abstract:

Previous issue date: 2009-10-30
CAPES
Digital processing of speech signals (DPSS) is one of the most important areas of digital signal processing. Voice coding and automatic speaker recognition (ASR) are relevant DPSS sub-fields. This dissertation introduces a new vocoder scheme based on full frequency masking per octave (FFMO), combined with a new spectral filling technique that uses the beta probability distribution. The FFMO method simplifies the magnitude of the voice spectrum by retaining just one spectral sample per octave. This approach offers a tradeoff between bit rate (e.g., 2.7 kbit/s), complexity, intelligibility and voice quality, and led to a new binary file format for the digital representation of voice, termed voz. A novel, low-complexity ASR technique based on one of the key properties of human auditory perception, frequency masking, is also presented. The feature vector of each voice frame is represented by the average amplitude of the largest spectral samples within each octave. Both text-dependent and text-independent speaker recognition are investigated. The results confirm that the proposed algorithm offers a tradeoff between complexity and recognition rate (typically 85%), which makes it attractive for embedded systems.
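The idea of keeping one spectral sample per octave can be illustrated with a small sketch. Several details are assumptions not stated in the abstract: octaves are defined over FFT bin indices as [1, 2), [2, 4), [4, 8) and so on, the retained sample is the largest magnitude in each octave, and the DC bin is dropped. The function name `octave_mask` is hypothetical.

```python
def octave_mask(magnitudes):
    """Keep only the largest-magnitude bin in each octave of bin indices.

    magnitudes is a list of non-negative spectral magnitudes indexed by
    FFT bin. All other bins, including DC (bin 0), are set to zero.
    """
    kept = [0.0] * len(magnitudes)
    lo = 1
    while lo < len(magnitudes):
        hi = min(2 * lo, len(magnitudes))     # octave covers bins lo..hi-1
        peak = max(range(lo, hi), key=lambda k: magnitudes[k])
        kept[peak] = magnitudes[peak]
        lo = hi
    return kept


spectrum = [9.0, 1.0, 3.0, 2.0, 5.0, 4.0, 8.0, 1.0]
simplified = octave_mask(spectrum)
```

Under these assumptions the number of retained samples grows only logarithmically with the spectrum length, which is consistent with the low bit rate and low complexity the abstract reports.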

27

Disch, Sascha. "Modulation vocoder for analysis, processing and synthesis of audio signals with application to frequency selective pitch transposition." Hannover: Technische Informationsbibliothek und Universitätsbibliothek Hannover (TIB), 2011. http://d-nb.info/1014323789/34.

28

Mesnildrey, Quentin. "Towards a better understanding of the cochlear implant-auditory nerve interface : from intracochlear electrical recordings to psychophysics." Thesis, Aix-Marseille, 2017. http://www.theses.fr/2017AIXM0007/document.

Abstract:

The cochlear implant is a neural prosthesis designed to restore auditory sensation in people with severe to profound sensorineural deafness. While satisfactory speech recognition can be achieved in quiet, performance drops dramatically in more complex sound environments. One main limitation of the present device is that each electrode stimulates a wide portion of the cochlea. As a result, when several electrodes are activated, the electrical fields they produce overlap, which distorts the transmission of sound information. Several alternative stimulation modes have been proposed to overcome this issue, but the benefit in terms of speech recognition has remained limited. In this project, we first used an acoustic simulator of the cochlear implant to explain the disappointing results obtained with the bipolar stimulation mode. We then tried to better understand the electrical behavior of the implanted cochlea in order to optimize the multipolar phased array stimulation strategy (van den Honert and Kelsall 2007). Efficient stimulation of the neural population also requires determining the distribution of neural survival. We therefore aimed to better understand the electrode-neuron interface and to identify a possible psychophysical correlate of neural survival. Finally, we discuss the main results and the possibility of designing an optimal stimulation strategy that produces a spatially focused electrical field at the level of the nerve fibers.

29

Daniell, Paul. "A Cross-Language Acoustic-Perceptual Study of the Effects of Simulated Hearing Loss on Speech Intonation." Thesis, University of Canterbury. Department of Communication Disorders, 2012. http://hdl.handle.net/10092/7646.

Abstract:

Aim : The purpose of this study was to examine the impact of simulated hearing loss on the acoustic contrasts between declarative questions and declarative statements and on the perception of speech intonation. A further purpose of the study was to investigate whether any such effects are universal or language specific. Method: Speakers included four native speakers of English and four native speakers of Mandarin and Taiwanese, with two female and two male adults in each group. Listeners included ten native English and ten native speakers of Mandarin and Taiwanese, with five female and five male adults in each group. All participants were aged between 19 and 55 years old. The speaker groups were asked to read a list of 28 phrases, with each phrase expressed as a declarative statement or a declarative question separately. These phrases were then filtered through six types of simulated hearing loss configurations, including three levels of temporal jittering for simulating a loss in neural synchrony, a high level of temporal jittering in combination with a high-pass or a low-pass filter that simulate falling and rising audiometric hearing loss configurations, and a vocoder processing procedure to simulate cochlear implant processing. A selection of acoustic measures was derived from the sentences and from some embedded vowels, including /i/, /a/, and /u/. The listener groups were asked to listen to the tokens in their native language and indicate if they heard a statement or a question. Results: The maximum fundamental frequency (F0) of the last syllable (MaxF0-last) and the maximum F0 of the remaining sentence segment (MaxF0-rest) were found to be consistently higher in declarative questions than in declarative statements. The percent jitter measure was found to worsen with simulated hearing loss as the level of temporal jittering increased. 
The vocoder-processed signals showed the highest percent jitter measure and the spread of spectral energy around the dominant pitch. Results from the perceptual data showed that participants in all three groups performed significantly worse with vocoder-processed tokens compared to the original tokens. Tokens with temporal jitter alone did not result in significantly worse perceptual results. Perceptual results from the Taiwanese group were significantly worse than the English group under the two filtered conditions. Mandarin listeners performed significantly worse with the neutral tone on the last syllable, and Taiwanese listeners performed significantly worse with the rising tone on the last syllable. Perception of male intonation was worse than female intonation with temporal jitter and high-pass filtering, and perception of female intonation was worse than male intonation with most temporal jittering conditions, including the temporal jitter and low-pass filtering condition. Conclusion: A rise in pitch for the whole sentence, as well as that in the final syllable, was identified as the main acoustic marker of declarative questions in all of the three languages tested. Perception of intonation was significantly reduced by vocoder processing, but not by temporal jitter alone. Under certain simulated hearing loss conditions, perception of intonation was found to be significantly affected by language, lexical tone, and speaker gender.
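
The percent jitter measure reported above can be illustrated with a short sketch. It assumes the common "local jitter" definition (mean absolute difference between consecutive pitch periods, divided by the mean period); the abstract does not say which definition the study used, and the period values below are invented for illustration.

```python
def percent_jitter(periods):
    """Local jitter in percent.

    Assumed definition: mean absolute difference between consecutive
    pitch periods, divided by the mean period, times 100.
    """
    if len(periods) < 2:
        raise ValueError("need at least two pitch periods")
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return 100.0 * mean_diff / mean_period


# Hypothetical pitch periods in seconds (around 200 Hz).
steady = [0.005] * 10
jittered = [0.005, 0.0052, 0.0049, 0.0051, 0.005]

print(percent_jitter(steady))
print(percent_jitter(jittered))
```

A perfectly regular voice gives zero jitter under this definition, and the value grows as cycle-to-cycle period variation increases, which is the direction of change the abstract describes for the temporal jittering conditions.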

30

Hu, Qiong. "Statistical parametric speech synthesis based on sinusoidal models." Thesis, University of Edinburgh, 2017. http://hdl.handle.net/1842/28719.

Abstract:

This study focuses on improving the quality of statistical speech synthesis based on sinusoidal models. Vocoders play a crucial role during the parametrisation and reconstruction process, so we first conduct an experimental comparison of a broad range of the leading vocoder types. Although our study shows that, for analysis / synthesis, sinusoidal models with complex amplitudes can generate high-quality speech compared with source-filter ones, the component sinusoids are correlated with each other, and the number of parameters is high and varies from frame to frame, which constrains their application to statistical speech synthesis. Therefore, we first propose a perceptually based dynamic sinusoidal model (PDM) to decrease and fix the number of components typically used in the standard sinusoidal model. Then, in order to apply the proposed vocoder with an HMM-based speech synthesis system (HTS), two strategies for modelling sinusoidal parameters are compared. In the first method (DIR parameterisation), features extracted from the fixed- and low-dimensional PDM are statistically modelled directly. In the second method (INT parameterisation), we convert both the static amplitude and the dynamic slope from all the harmonics of a signal, which we term the Harmonic Dynamic Model (HDM), to intermediate parameters (regularised cepstral coefficients (RDC)) for modelling. Our results show that HDM with intermediate parameters can generate quality comparable to STRAIGHT. As correlations between features in the dynamic model cannot be modelled satisfactorily by a typical HMM-based system with diagonal covariance, we have applied and tested a deep neural network (DNN) for modelling features from these two methods. To fully exploit DNN capabilities, we investigate ways to combine INT and DIR at the level of both DNN modelling and waveform generation. 
For DNN training, we propose to use multi-task learning to model cepstra (from INT) and log amplitudes (from DIR) as primary and secondary tasks. We conclude from our results that sinusoidal models are indeed highly suited to statistical parametric synthesis. The proposed method outperforms the state-of-the-art STRAIGHT-based equivalent when used in conjunction with DNNs. To further improve the voice quality, phase features generated from the proposed vocoder also need to be parameterised and integrated into statistical modelling. Here, an alternative statistical model referred to as the complex-valued neural network (CVNN), which treats complex coefficients as a whole, is proposed to model complex amplitude explicitly. A complex-valued back-propagation algorithm using a logarithmic minimisation criterion that includes both amplitude and phase errors is used as a learning rule. Three parameterisation methods are studied for mapping text to acoustic features: RDC / real-valued log amplitude, complex-valued amplitude with minimum phase, and complex-valued amplitude with mixed phase. Our results show the potential of using CVNNs for modelling both real and complex-valued acoustic features. Overall, this thesis has established competitive alternative vocoders for speech parametrisation and reconstruction. The use of the proposed vocoders with various acoustic models (HMM / DNN / CVNN) demonstrates that they are well suited to statistical parametric speech synthesis.

31

Rahrer, Timothy J. (Timothy Joseph) Carleton University Dissertation Engineering Electrical. "A digital signal processing-based hearing prosthesis and implementation of principal components analysis for a tactile aid." Ottawa, 1990.

32

Karoui, Chadlia. "Neuroplasticity behind the rehabilitation of asymmetrical hearing loss and tinnitus through cochlear implantation : from psychoacoustic evaluations to neuroimaging studies." Thesis, Toulouse 3, 2019. http://www.theses.fr/2019TOU30151.

Abstract:

Ce travail de thèse visait à étudier les adaptations périphériques et centrales du système auditif liées à l'effet bénéfique des implants cochléaires (IC) chez les sujets présentant une perte auditive asymétrique (AHL) et des acouphènes. En ce sens, notre principal intérêt était d'étudier la possibilité d'une fusion entre le signal électrique de l'IC et le signal acoustique de l'oreille auditive et de déterminer si cette fusion restaure les mécanismes d'intégration binaurale, comme chez les sujets ayant une audition normale (NH), tant sur le niveau comportemental qu'au niveau central. D'un point de vue clinique, ces études sur la récupération de l'audition chez les sujets souffrant d'AHL fourniront des informations cruciales sur les capacités plastiques du cerveau à s'adapter à la stimulation électrique et guideront ainsi les stratégies thérapeutiques permettant de mieux récupérer les capacités binaurales et la perception linguistique et paralinguistique. Nous avons combiné différents types de tests comportementaux et audiologiques, des analyses radiologiques et une évaluation en neuroimagerie (imagerie PET Scan H2O15). En outre, nous avons pu décrire certaines propriétés qualitatives du son perçu du côté implanté et évaluer la réponse centrale à cette incohérence spectrale - lorsque les deux signaux de nature différente sont présentés, nous renseignant potentiellement sur des stratégies adaptatives possibles. Par ailleurs, nous avons confirmé que les principaux avantages de la réafférentation électrique via l'IC sont principalement la diminution et, dans certains cas, la suppression des acouphènes. Nous avons également envisagé plusieurs stratégies thérapeutiques pour le masquage des acouphènes impliquant non seulement l'oreille IC, mais également l'oreille NH. Dans l'ensemble, nous estimons que les sujets AHL bénéficient réellement de l'implantation cochléaire. 
Par conséquent, nos données indiquent que les adaptations plastiques induites par la réafférentation électrique chez les sujets AHL pourraient jouer un rôle déterminant dans la restauration des capacités binaurales, dans l'adaptation aux caractéristiques spectrales du signal IC et dans la suppression des acouphènes, ce qui permettrait potentiellement d'apporter un peu plus d'informations sur leurs mécanismes sous-jacents
This thesis investigated the peripheral and central adaptations of the auditory system related to the beneficial effect of cochlear implants (CI) in subjects with asymmetrical hearing loss (AHL) and tinnitus. Our main interest was to study the possible fusion between the electric signal of the CI and the acoustic signal from the hearing ear, and to assess whether it restores binaural integration mechanisms as in normal-hearing (NH) subjects, at both the behavioral and central levels. From a clinical standpoint, these studies of hearing recovery in AHL CI subjects will provide crucial information on the plastic abilities of the brain to adapt to electrical stimulation, and will thus guide therapeutic strategies to better recover binaural abilities and linguistic and paralinguistic perception. We combined behavioral and audiological testing, radiological analysis and neuroimaging (H2O15 PET scan imaging). We were also able to describe some qualitative properties of the sound perceived on the implanted side and to evaluate the central response to the spectral inconsistency that arises when two signals of different nature are presented, potentially informing on possible adaptive strategies. In addition, we confirmed that the main benefit of electrical reafferentation via the CI is the decrease, and in some cases the suppression, of tinnitus. We also considered several therapeutic strategies for tinnitus masking involving not only the CI ear but also the NH ear. Overall, we strongly believe that AHL subjects truly benefit from cochlear implantation. Our data indicate that plastic adaptations to the CI input in AHL subjects may play a key role in restoring binaural hearing abilities, in accommodating to the spectral characteristics of the CI signal, and in suppressing tinnitus, which may shed some light on the underlying mechanisms.

33

Schoerner, Sven-Markus, and Erik Zakrisson. "Audioeffects with digital soundprocessing." Thesis, Linköping University, Department of Electrical Engineering, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-3777.

Abstract:

To demonstrate the strengths of digital signal processing for producing sound effects, a sound effects demo is used in the lectures of the course TSRT78, Digital Signal Processing, given at Linköping University.

Many effects can be used instructively for educational purposes, and the existing version of the sound effects demo is somewhat limited in its range of effects.

The main focus of this report is a presentation of the kinds of effects that are of interest for such a demo. Each effect is presented with its background theory and with examples of how it can be implemented in software, mainly in MATLAB. How well the effects run in real time in the Simulink toolbox has also been investigated.

The report also presents a new version of the sound effects demo, produced with user friendliness and future updates in mind. In the new demo, all of the effects are implemented as presented. The report concludes with suggestions for further work on the sound effects demo.

34

Barrett, Jenna. "Perception of Spectrally-Degraded, Foreign-Accented Speech." Ohio University Honors Tutorial College / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=ouhonors1619012518297988.

35

Calitz, Wietsche Roets. "Independent formant and pitch control applied to singing voice." Thesis, Stellenbosch : University of Stellenbosch, 2004. http://hdl.handle.net/10019.1/16267.

Abstract:

Thesis (MScIng)--University of Stellenbosch, 2004.
ENGLISH ABSTRACT: A singing voice can be manipulated artificially by means of a digital computer for the purposes of creating new melodies or to correct existing ones. When the fundamental frequency of an audio signal that represents a human voice is changed by simple algorithms, the formants of the voice tend to move to new frequency locations, making it sound unnatural. The main purpose is to design a technique by which the pitch and formants of a singing voice can be controlled independently.
AFRIKAANSE OPSOMMING: Onafhanklike formant- en toonhoogte beheer toegepas op ’n sangstem: ’n Sangstem kan deur ’n digitale rekenaar gemanipuleer word om nuwe melodieë te skep, of om bestaandes te verbeter. Wanneer die fundamentele frekwensie van ’n klanksein (wat ’n menslike stem voorstel) deur ’n eenvoudige algoritme verander word, skuif die oorspronklike formante na nuwe frekwensie gebiede. Dit veroorsaak dat die resultaat onnatuurlik klink. Die hoof oogmerk is om ’n tegniek te ontwerp wat die toonhoogte en die formante van ’n sangstem apart kan beheer.
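
The formant-shifting problem described in this abstract can be seen in a small sketch. It assumes that the "simple algorithm" is plain resampling (playing the signal back faster), and it uses a single pure tone as a stand-in for one formant peak. The 700 Hz value, the 1.5 ratio and the function names are illustrative assumptions, not taken from the thesis.

```python
import math


def resample(signal, ratio):
    """Read `signal` at `ratio` times the original speed (linear interpolation)."""
    out = []
    pos = 0.0
    while pos < len(signal) - 1:
        i = int(pos)
        frac = pos - i
        out.append((1.0 - frac) * signal[i] + frac * signal[i + 1])
        pos += ratio
    return out


def estimate_frequency(signal, sample_rate):
    """Estimate the frequency of a single sinusoid from its rising zero crossings."""
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if a < 0.0 <= b)
    return crossings * sample_rate / len(signal)


SAMPLE_RATE = 16000
PEAK_HZ = 700.0  # hypothetical spectral peak standing in for a formant

tone = [math.sin(2 * math.pi * PEAK_HZ * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
shifted = resample(tone, 1.5)  # naive pitch shift up by a factor of 1.5

print(estimate_frequency(tone, SAMPLE_RATE))
print(estimate_frequency(shifted, SAMPLE_RATE))
```

Because resampling scales every frequency by the same ratio, the peak moves from about 700 Hz to about 1050 Hz along with the pitch. Independent control, as the thesis aims for, requires changing the fundamental frequency while leaving such spectral peaks in place.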

36

Křupka, Aleš. "Moderní algoritmy posunu výšky základního tónu a jejich využití ve virtuálních hudebních nástrojích." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2011. http://www.nusl.cz/ntk/nusl-219343.

Abstract:

This diploma thesis deals with methods for pitch shifting acoustic signals. The theoretical part describes three pitch shifting techniques: a method using a modulated delay line, the PICOLA method, and a method using a phase vocoder. The first two methods operate in the time domain; the third operates in the frequency domain. In connection with the PICOLA method, the thesis also covers algorithms for pitch estimation. The practical part demonstrates the use of these methods. It describes a sampler, a virtual musical instrument based on the playback of sounds stored in memory, and the individual units that provide the required functionality. Sound generation is controlled by the MIDI protocol. The PICOLA method is implemented in the sampler.

37

Massida, Zoé. "Étude de la perception de la voix chez le patient sourd postlingual implanté cochléaire unilatéral et le normo-entendant en condition de simulation d'implant. Psychophysique et imagerie." Phd thesis, Université Paul Sabatier - Toulouse III, 2010. http://tel.archives-ouvertes.fr/tel-00803654.

Abstract:

This thesis studied the perceptual and neurofunctional mechanisms involved in voice perception in postlingually deaf patients with a unilateral cochlear implant and in normal-hearing subjects under cochlear implant simulation. To this end, we tested the behavioral performance of implanted patients in voice detection tasks and in tasks involving the perception of paralinguistic voice information, such as gender. Patients were tested both in a longitudinal follow-up and in cross-sectional measurements. We compared their performance with that of normal-hearing subjects under cochlear implant simulation (vocoder). We also tested normal-hearing subjects in an fMRI protocol that measured voice-specific activity during implant simulation. Overall, this work shows that after cochlear implantation, deaf patients are impaired in voice perception, in contrast to language comprehension. This deficit is not solely due to the degradation of the signal by the cochlear implant processor, but is also most likely due to cortical reorganization following deafness.

38

Massida, Zoé. "Étude de la perception de la voix chez le patient sourd post lingual implanté cochléaire unilatéral et le sujet normo-entendant en condition de simulation d'implant : psychophysique et imagerie." Toulouse 3, 2010. http://thesesups.ups-tlse.fr/1806/.

Abstract:

Ce travail de thèse a consisté à étudier les mécanismes perceptifs et neurofonctionnels impliqués lors de la perception de la voix chez des patients sourds postlinguaux implantés cochléaires unilatéralement, et chez des sujets normo-entendants en simulation d'implant. Pour répondre à cet objectif, nous avons testé les performances comportementales des patients implantés dans des tâches de détection de la voix ainsi que dans des tâches de perception de l'information paralinguistique de la voix, comme le genre. Les patients ont été testés au cours d'un suivi ainsi qu'en mesures transversales. Nous avons comparé leurs performances à celles de sujets normo-entendants en condition de simulation d'implant cochléaire (vocoder). Nous avons également testé les sujets normo-entendants dans un protocole IRMf consistant à mesurer l'activité spécifique à la voix lors de la simulation d'implant. Dans l'ensemble, ces travaux montrent qu'après implantation cochléaire, les patients sourds sont déficitaires en matière de perception de la voix, contrairement à la compréhension du langage. Ce déficit n'est pas uniquement lié à la dégradation du signal par le processeur de l'implant cochléaire, mais aussi certainement à des réorganisations corticales subséquentes à la surdité
This work studied the perceptual and underlying neuronal mechanisms involved in voice perception in postlingually deaf cochlear-implanted patients and in normal-hearing controls tested with a cochlear implant simulation. We analyzed the behavioral performance of implanted patients in a voice detection task and in tasks involving the perception of paralinguistic information, such as gender. Two groups of patients were tested, using either a longitudinal follow-up or a cross-sectional approach. We compared their performance with that of normal-hearing control subjects tested under cochlear implant simulation (vocoder). In addition, we performed an fMRI study in normal-hearing subjects to reveal the effect of cochlear implant simulation on cortical activity in areas sensitive to the human voice. The results consistently point to a deficit in voice perception following cochlear implantation, in contrast to speech comprehension. This deficit is not only due to the degradation of the signal by the vocoder, but probably also results from cortical reorganization induced by deafness.

39

Згуровський, Артур Андрійович, and Artur Zghurovskyi. "Метод кодування мовних сигналів для комунікаційних систем." Master's thesis, Тернопільський національний технічний університет імені Івана Пулюя, 2020. http://elartu.tntu.edu.ua/handle/lib/33949.

Abstract:

Кваліфікаційну роботу магістра присвячено аналізу методу кодування мовних сигналів для комунікаційних систем. Розглянуто переваги та недоліки відомих методів кодування і виділено переваги фазових вокодерів. Проведено оцінювання параметрів голосових сигналів, що використовуються при кодуванні їх в фазових вокодерах
The master's thesis is devoted to the analysis of the method of coding speech signals for communication systems. The advantages and disadvantages of known coding methods are considered and the advantages of phase vocoders are highlighted. The parameters of voice signals used in their encoding in phase vocoders are evaluated.
Contents: Introduction. Chapter 1. Analytical part: 1.1 The problem of constructing vocoders; 1.2 Characteristics and structural parameters of the voice; 1.3 Conclusions to Chapter 1. Chapter 2. Main part: 2.1 Parameterisation of the speech signal; 2.2 HTK: architecture and capabilities; 2.3 Technology for modelling speech recognition systems using the HTK toolkit; 2.4 Results of experimental studies; 2.5 Conclusions to Chapter 2. Chapter 3. Research part: 3.1 Measuring the parameters of speech signal filters; 3.2 Measuring the fundamental frequency; 3.3 Forming the excitation signal; 3.4 Synthesis: reconstructing the speech signal; 3.5 Conclusions to Chapter 3. Chapter 4. Special part: 4.1 Metrological support of the research; 4.2 Building application software for solving the research problem; 4.3 Conclusions to Chapter 4. Chapter 5. Occupational safety and safety in emergency situations: 5.1 Occupational safety; 5.1.1 Planning of occupational safety measures. Types of planning and monitoring of occupational safety; 5.1.2 Features of investigating and recording non-work-related accidents; 5.1.3 Fire alarm and communication. Fire extinguishing means. Fire-fighting water supply. Primary fire extinguishing means. Automatic fire extinguishing means at industry facilities; 5.2 Safety in emergency situations; 5.3 Conclusions to Chapter 5. General conclusions. References. Appendices.

40

Kovačev, Radovan. "Časově-frekvenční analýza signálu." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2012. http://www.nusl.cz/ntk/nusl-236491.

Abstract:

The main subject of this work is time-frequency signal analysis. It first provides the essential theoretical background, with a focus on the continuous wavelet transform, and compares its key features with those of its close relative, the short-time Fourier transform. This is followed by a practical demonstration: the aim is to create a phase vocoder for changing the duration of a sound recording and for pitch shifting. The functional principles, design, implementation procedure, outputs and achieved results are documented.

41

Seldran, Fabien. "Spécificités de l'implant électro-acoustique : indications, interface bioélectrique et stratégie de codage." Phd thesis, Université Claude Bernard - Lyon I, 2011. http://tel.archives-ouvertes.fr/tel-00751913.

Abstract:

Clinicians are sometimes faced with subjects who have a hearing loss greater than 90 dB HL above 1 kHz with residual hearing at low frequencies. Several technologies are now available to rehabilitate the high frequencies: conventional amplification, frequency compression, cochlear implants and, for the past ten years or so, electro-acoustic stimulation (EAS), which stimulates low-frequency sounds acoustically and high-frequency sounds electrically via a cochlear implant. The first part of this thesis identified the factors that influence the ability of partially deaf patients to process the low-frequency information in speech. We used a low-pass filtered speech audiometry test. Our results indicate that speech intelligibility scores are positively correlated with the duration of deafness. This means that, over time, these hearing-impaired subjects learn to understand speech with this low-pass-filter type of hearing, to the point that some show supra-normal performance in their use of low frequencies. Our results also show a negative correlation between the age at onset of deafness and intelligibility scores. This test could help clinicians better identify the most suitable device for each patient profile. The second part of this thesis, devoted to EAS, used simulations in normal-hearing listeners to evaluate various sound coding strategies for the EAS implant. At present, the strategy used for EAS is modelled on that of the cochlear implant, and our results suggest that this strategy can be optimised.

42

Crossman, A. H. "Multipulse-excitation applied to vocoders." Thesis, University of Cambridge, 1987. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.232981.

Abstract:

Multipulse-excitation has greatly improved the speech quality achievable from linear predictive coders, which previously required speech to be classified as voiced or unvoiced for excitation purposes. Multipulse removes the need for voicing classification, improving speech quality by enhancing the excitation and offsetting errors in the vocal tract filter. An investigation of multipulse-excitation applied to a channel vocoder and a formant synthesiser was conducted. The prime objective was to improve the performance of these algorithms and achieve multipulse linear prediction speech quality, our target quality. This dissertation outlines and restates the idea of multipulse-excitation applied to a linear predictive vocoder. We then examine a high quality channel vocoder and formant synthesiser, and the use of multipulse-excitation to improve their performance. In each case time and frequency domain multipulse algorithms were used. Various modifications were made to these algorithms in order to accommodate multipulse-excitation and improve the overall speech quality. In the case of the channel vocoder this involved a novel technique, which sacrificed the inherent waveform preserving properties of the multipulse algorithm. Only by increasing both the pulse rate and the number of channels could the multipulse-excited channel vocoder achieve our target quality. With the formant synthesiser it was possible, by variation of the pulse rate alone, to achieve our target quality. Comparisons are drawn between the three multipulse algorithms and reasons given for their differing performance; this is substantiated by experimental results. These results suggested interesting improvements to the multipulse-excited formant synthesiser, and also hinted at a novel technique for formant tracking, using multipulse-excitation applied to a formant synthesiser.

43

Hervais-Adelman, Alexis Georges. "The perceptual learning of noise-vocoded speech." Thesis, University of Cambridge, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.611867.

44

Lee, Keebbum state. "Korean-English Bilinguals’ perception of noise-vocoded speech." The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1562004544370682.

45

Chmayssani, Toufic. "Modulation sur les canaux vocodés." Phd thesis, Université Paris-Est, 2010. http://tel.archives-ouvertes.fr/tel-00587629.

Abstract:

Vocoded channels are communication channels dedicated to voice, in which the signal passes through equipment designed for voice transport, such as speech coders, voice activity detectors (VAD), and discontinuous transmission (DTX) systems. They may be wired or mobile telephone systems (2G/3G cellular networks, INMARSAT satellites, etc.) or voice over IP. The speech coders in recent standards for mobile telephony and voice-over-IP networks rely on compression algorithms derived from the CELP (Code Excited Linear Prediction) technique, which reach bit rates on the order of ten kb/s, well below those of the coders used in wired telephone networks (typically 64 or 32 kb/s). These coders owe their efficiency to characteristics specific to speech signals and to human hearing, so signals other than speech are generally heavily distorted by them. Transmitting data over vocoded channels can be attractive because voice channels are widely available and because the communication is discreet (security). However, the modulated signal sent over these channels is subject to the degradations caused by the speech coders, which constrains the type of modulation that can be used. This thesis addressed the design and evaluation of modulations for data transmission over vocoded channels. Two modulation approaches were proposed, for applications with quite different achievable bit rates. The main application targeted by the thesis is the transmission of encrypted speech, in which the speech signal is digitized, compressed at a low bit rate by a speech coder, and then secured by an encryption algorithm. For this application, we focused on communication networks using CELP coders with bit rates above about ten kb/s, typically second- or third-generation mobile communication channels. The first modulation approach addresses this application. It uses digital modulations whose parameters are optimized to account for the constraints imposed by the channel and to allow bit rates and error probabilities compatible with encrypted speech transmission (typically a bit rate above 1200 b/s with a BER on the order of 10^-3). We showed that optimized QPSK modulation achieves this performance. A synchronization system was also studied and adapted to the needs and constraints of the vocoded channel. The performance achieved by QPSK modulation with the proposed synchronization system, together with the quality of the secured speech transmitted, was evaluated by simulation and validated experimentally on a real GSM channel using a test bench developed in the thesis. The second modulation approach favored robustness of the modulated signal when passing through any speech coder, even a low-bit-rate coder such as the MELP coders at 2400 or 1200 b/s. To this end, we proposed a modulation built by concatenating segments of natural speech, combined with a demodulation technique that segments the received signal and identifies the speech segments by dynamic programming with a high recognition rate. This modulation was evaluated by simulation on several speech coders and was also tested on real GSM channels. The results show a very low error probability regardless of the vocoded channel and of the bit rate of the speech coders used, but only at relatively low achievable bit rates: conceivable applications are restricted to bit rates typically below 200 b/s. Finally, we studied voice activity detectors, whose effect can be very damaging for data signals. We proposed a method for countering the VADs used in GSM networks. Its principle is to break the stationarity of the spectrum of the modulated signal, the stationarity on which the VAD relies to decide that the signal is not speech.
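
The QPSK figures quoted in the abstract above can be made concrete with a small sketch. It is not the thesis's optimized modulator: the Gray mapping, the phase convention, and the names `symbol_rate`, `qpsk_modulate` and `qpsk_demodulate` are illustrative assumptions, and no carrier, pulse shaping, synchronization or speech coder is modelled. It only shows that QPSK carries two bits per symbol, so a 1200 b/s stream needs 600 symbols per second, and that a noiseless round trip recovers the bits.

```python
import cmath
import math

# Gray-coded bit pairs mapped to odd multiples of pi/4 (illustrative choice,
# not taken from the thesis).
GRAY_PHASE = {(0, 0): 1, (0, 1): 3, (1, 1): 5, (1, 0): 7}


def symbol_rate(bit_rate_bps, bits_per_symbol=2):
    """Symbols per second needed to carry bit_rate_bps (QPSK: 2 bits/symbol)."""
    return bit_rate_bps / bits_per_symbol


def qpsk_modulate(bits):
    """Map an even-length list of 0/1 bits to unit-magnitude complex symbols."""
    if len(bits) % 2:
        raise ValueError("QPSK needs an even number of bits")
    return [
        cmath.exp(1j * math.pi / 4 * GRAY_PHASE[(bits[i], bits[i + 1])])
        for i in range(0, len(bits), 2)
    ]


def qpsk_demodulate(symbols):
    """Hard-decision demapping: the quadrant of each symbol gives two bits."""
    bits = []
    for s in symbols:
        bits.append(0 if s.imag >= 0 else 1)  # first bit: sign of imaginary part
        bits.append(0 if s.real >= 0 else 1)  # second bit: sign of real part
    return bits
```

With Gray coding, neighbouring constellation points differ in a single bit, so a decision error into an adjacent quadrant costs one bit error rather than two.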

46

Ma, Wei. "Multi-band excitation based vocoders and their real-time implementation." Thesis, University of Surrey, 1994. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.240182.

47

Rochette, Denis. "Etude et réalisation d'un vocodeur à dictionnaire LPC 800 BITS/S." Grenoble 2 : ANRT, 1986. http://catalogue.bnf.fr/ark:/12148/cb37600727b.

48

Saadane, Abdelhakim. "Optimisation d'un vocodeur a canaux pour la correction de la parole hyperbare." Rennes 1, 1989. http://www.theses.fr/1989REN10090.

Abstract:

Beyond a certain depth, divers breathe synthetic gas mixtures whose inhalation alters the functioning of phonation. Results from earlier work are used to propose a new approach to channel vocoders. In this approach, the spectral envelope is estimated by a filter bank, for which an optimization is given.

49

McGettigan, Carolyn. "Factors affecting the perception of noise-vocoded speech : stimulus properties and listener variability." Thesis, University College London (University of London), 2008. http://discovery.ucl.ac.uk/1444460/.

Abstract:

This thesis presents an investigation of two general factors affecting speech perception in normal-hearing adults. Two sets of experiments are described, in which speakers of English are presented with degraded (noise-vocoded) speech. The first set of studies investigates the importance of linguistic rhythm as a cue for perceptual adaptation to noise-vocoded sentences. Results indicate that the presence of native English rhythmic patterns benefits speech recognition and adaptation, but not when higher-level linguistic information is absent (i.e. when the sentences are in a foreign language). It is proposed that rhythm may help in the perceptual encoding of degraded speech in phonological working memory. Experiments in this strand also present evidence against a critical role for indexical characteristics of the speaker in the adaptation process. The second set of studies concerns the issue of individual differences in speech perception. A psychometric curve-fitting approach is selected as the preferred method of quantifying variability in noise-vocoded sentence recognition. Measures of working memory and verbal IQ are identified as candidate correlates of performance with noise-vocoded sentences. When the listener is exposed to noise-vocoded stimuli from different linguistic categories (consonants and vowels, isolated words, sentences), there is evidence for the interplay of two initial listening 'modes' in response to the degraded speech signal, representing 'top-down' cognitive-linguistic processing and 'bottom-up' acoustic-phonetic analysis. Detailed analysis of segment recognition presents a perceptual role for temporal information across all the linguistic categories, and suggests that performance could be improved through training regimes that direct attention to the most informative acoustic properties of the stimulus. Across several experiments, the results also demonstrate long-term aspects of perceptual learning. 
In sum, this thesis demonstrates that consideration of both stimulus-based and listener-based factors forms a promising approach to the characterization of speech perception processes in the healthy adult listener.

50

SMAIL, ZAHIR. "Optimisation de la correction du signal de parole hyperbare par un vocodeur a canaux modifie." Rennes 1, 1997. http://www.theses.fr/1997REN10039.

Abstract:

Underwater exploration and offshore oil production on the continental shelf require ever deeper human dives. Beyond a few tens of metres, compressed air causes harmful physical, physiological and psychological effects. Synthetic breathing mixtures (heliox, hydrox, etc.) are used at greater depths. One drawback of these mixtures is that they make speech, known as hyperbaric speech, unintelligible. The main cause of the distortion of hyperbaric speech is the shift of the vocal tract formants toward high frequencies, which results from the change in the speed of sound in the mixture. Fant's law models this non-linear effect fairly well. To date, various correctors have been studied, none of them fully satisfactory. After an analysis of these correctors, which justifies the choice of the digital channel vocoder, this thesis proposes significant improvements. The first is the inversion of the algorithm previously used, which allows non-linear correction at any depth, without the earlier limit of about 300 m. A better-suited spacing of the filter banks is proposed. Multirate sampling is adopted; by greatly reducing processing time, it improves the real-time implementation. Finally, processing of pitch variations is introduced. This new feature helps identify the speaker more reliably without degrading the quality of the restored speech. Quantitative spectral analyses show that the algorithm works correctly, and listening tests confirm the improvements.
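
As a rough illustration of the formant shift described in the abstract above, the sketch below applies only the first-order effect: the resonances of an acoustic tube scale with the speed of sound in the gas that fills it. This is a simplifying assumption. It is not Fant's law, which the abstract describes as non-linear, and it is not the corrector developed in the thesis. The function names and the example sound speed for the mixture are hypothetical.

```python
C_AIR = 343.0  # approximate speed of sound in air at about 20 degrees C, in m/s


def shifted_formants(formants_hz, c_mix, c_air=C_AIR):
    """Linear approximation: each formant scales by the sound-speed ratio."""
    ratio = c_mix / c_air
    return [f * ratio for f in formants_hz]


def corrected_formants(shifted_hz, c_mix, c_air=C_AIR):
    """Undo the linear shift by applying the inverse ratio."""
    ratio = c_air / c_mix
    return [f * ratio for f in shifted_hz]


# Example with a hypothetical mixture in which sound travels twice as fast
# as in air: every formant doubles, and the inverse scaling restores it.
normal = [500.0, 1500.0, 2500.0]
hyperbaric = shifted_formants(normal, c_mix=686.0)
restored = corrected_formants(hyperbaric, c_mix=686.0)
```

A channel vocoder corrector would apply a mapping of this kind to the centre frequencies of its analysis channels before resynthesis; the non-linear term that a purely proportional scaling ignores is what the abstract attributes to Fant's law.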
