Dissertations / Theses: 'Speech filtering'

1

Ledoux, Christelle Michelle. "Robust speech filtering in impulsive noise environments." Thesis, Virginia Tech, 1999. http://hdl.handle.net/10919/46325.

Abstract:

This thesis presents a new robust filtering technique that suppresses impulsive noise in speech signals. The method makes use of Projection Statistics based on medians to detect segments of speech with impulses. The autoregressive model employed to smooth out the speech signal is identified by means of a robust nonlinear estimator known as the Schweppe-type Huber GM-estimator. Simulation results are presented that demonstrate the effectiveness of the filter. Another contribution of the work is the development of a robust version of the Kalman filter based on the Huber M-estimator. The performance of this filter is evaluated for a simple autoregressive process.
Master of Science
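
As a rough illustration of median-based impulse detection, the sketch below scores each sample by its distance from the median, scaled by the median absolute deviation. This is only the one-dimensional special case of a projection statistic and is not the detector developed in the thesis; the function names, the threshold of 3.0, and the sample values are assumptions made for the example.

```python
from statistics import median


def robust_scores(samples):
    """Distance of each sample from the median, in robust scale units."""
    med = median(samples)
    # Median absolute deviation; 1.4826 makes it consistent with the
    # standard deviation for Gaussian data.
    mad = median([abs(v - med) for v in samples])
    scale = 1.4826 * mad
    if scale == 0.0:
        # Degenerate case (more than half the samples are identical):
        # no usable scale, so nothing is flagged in this sketch.
        return [0.0 for _ in samples]
    return [abs(v - med) / scale for v in samples]


def flag_impulses(samples, threshold=3.0):
    """Indices whose robust score exceeds the (assumed) threshold."""
    return [i for i, s in enumerate(robust_scores(samples)) if s > threshold]


# Hypothetical segment with one impulse at index 4.
segment = [0.1, -0.2, 0.0, 0.15, 8.0, -0.1, 0.05]
print(flag_impulses(segment))
```

Because both the centre and the scale are medians, a single large impulse barely moves them, which is why the impulse stands out clearly instead of masking itself as it could with a mean and standard deviation.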

2

Ramachandran, Ravi P. "Pitch filtering in adaptive predictive coding of speech." Thesis, McGill University, 1986. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=65345.

3

Klein, Mark 1977. "Signal subspace speech enhancement with perceptual post-filtering." Thesis, McGill University, 2002. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=33975.

Abstract:

Speech enhancement blocks form a critical part of voice communications systems. Unfortunately, most enhancement schemes have difficulty eliminating noise from speech without introducing distortion or artefacts. Many of the disturbances originate from poor parameter estimation and interframe fluctuations.
This thesis introduces the Enhanced Signal Subspace (ESS) system to mitigate the above problems. Based on a signal subspace framework, ESS has been designed to attenuate disturbances while minimizing audible distortion.
Artefacts are reduced by employing an auditory post-filter to smooth the enhanced speech spectra. This filter performs averaging in a manner that exploits the properties of the human auditory system. As such, distortion of the underlying speech signal is reduced.
Testing shows that listeners prefer the proposed algorithm to traditional signal subspace speech enhancement.

4

Chan, Dominic Sai Fan. "Speech production modelling based on glottal inverse filtering." Thesis, Imperial College London, 1994. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.307161.

5

Lewine, Andrew (Andrew P.). "Speech filtering for improving intelligibility in noisy transients." Thesis, Massachusetts Institute of Technology, 2011. http://hdl.handle.net/1721.1/66433.

Abstract:

Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011.
Cataloged from PDF version of thesis.
Includes bibliographical references.
Hearing impairment is a problem that affects a large percentage of the population. Cochlear implants allow those with profound or total hearing loss to regain some hearing by stimulating auditory nerve fibers with implanted electrodes, in response to sound picked up by an external microphone. The signal processing chain from microphone input to stimulation output is an important factor in the overall speech intelligibility of the implant system. This thesis work improves on an existing ultra-low-power cochlear implant system by utilizing an improved noise and power efficient bandpass filter bank to implement a novel frequency-selective gain control algorithm capable of reducing, and in some cases removing, loud transient noises, thereby improving speech intelligibility. This gain control algorithm takes advantage of the inherent frequency-specific gain control afforded by the improved bandpass filter topology. This contribution makes an improvement to the existing state-of-the-art system in both power efficiency and performance.
by Andrew Lewine.
M.Eng.

6

Dubbin, Gregory. "Applying particle filtering to unsupervised part-of-speech induction." Thesis, University of Oxford, 2014. http://ora.ox.ac.uk/objects/uuid:48caedb6-478f-4bb0-8ca7-975ee7fe5e38.

Abstract:

Statistical Natural Language Processing (NLP) lies at the intersection of Computational Linguistics and Machine Learning. As linguistic models incorporate more subtle nuances of language and its structure, standard inference techniques can fall behind. One such application is research on the unsupervised induction of part-of-speech tags. It has the potential to improve both our understanding of the plausibility of theories of first language acquisition, and Natural Language Processing applications such as Speech Recognition and Machine Translation. Sequential Monte Carlo (SMC) approaches, i.e. particle filters, are well suited to approximating such models. This thesis seeks to determine whether one application of SMC methods, particle Gibbs sampling, is capable of performing inference in otherwise intractable NLP applications. Specifically, this research analyses the benefits and drawbacks of relying on particle Gibbs to perform unsupervised part-of-speech induction without the flawed one-tag-per-type assumption of similar approaches. Additionally, this thesis explores the effects of type-based supervision with tag dictionaries extracted from annotated corpora or from Wiktionary. The semi-supervised tag dictionary improves the performance of the local Gibbs PYP-HMM sampler enough to nearly match the performance of the particle Gibbs type-sampler. Finally, this thesis also extends the Pitman-Yor HMM tagger of Blunsom and Cohn (2011) to include an explicit model of the lexicon which encodes those tags from which a word-type may be generated. This has the effect of both biasing the model to produce fewer tags per type and modelling the tendency for open class words to be ambiguous between only a subset of the available tags. Furthermore, I extend the type-based particle Gibbs inference algorithm to simultaneously resample the ambiguity class as well as tags for all of the tokens of a given word type.
The result is a principled probabilistic model of part-of-speech induction that achieves state-of-the-art performance. Overall, the experiments and contributions of this thesis demonstrate the applicability of the particle Gibbs sampler and particle methods in general to otherwise intractable problems in NLP.

7

Papanagiotou, Kyriakos. "Enhancement of body conducted speech from an ear microphone." Thesis, University of Southampton, 2003. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.289914.

8

Darlington, David J. "The enhancement of noise-corrupted speech by sub-band adaptive filtering." Thesis, University of the West of Scotland, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.388213.

9

Hu, Rong. "Enhancement of adaptive de-correlation filtering separation model for robust speech recognition." Diss., Columbia, Mo. : University of Missouri-Columbia, 2007. http://hdl.handle.net/10355/4682.

Abstract:

Thesis (Ph. D.)--University of Missouri-Columbia, 2007.
The entire dissertation/thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file (which also appears in the research.pdf); a non-technical general description, or public abstract, appears in the public.pdf file. Title from title screen of research.pdf file (viewed on September 25, 2007) Vita. Includes bibliographical references.

10

Mustiere, Frederic. "Particle filtering methods for the enhancement of speech corrupted by additive noise." Thesis, University of Ottawa (Canada), 2006. http://hdl.handle.net/10393/27398.

Abstract:

In this work, we study the application of particle filtering (PF) algorithms to the problem of speech enhancement. The goal of the thesis is to devise PF algorithms that will enhance speech signals corrupted by additive noise, and to evaluate their performance via comparisons with other existing algorithms based on several quality measures. Speech enhancement, or noise reduction, is an important problem in many applications, such as telephony and telecommunications in general, sound recording, human-coaching interface (where speech recognition is important), etc. Even though many algorithms already exist for speech enhancement, there is still much work to do, especially in terms of intelligibility. In many cases, it may be easier to understand the original, noisy speech than the processed, "cleaned-out" one. In other cases, the residual noise may be too annoying to carry out a comfortable conversation. In this context, new approaches for the denoising of speech are welcome. As a first contribution, a practical approach to deriving simple Rao-Blackwellised Particle Filters (RBPFs), which was developed in parallel with a theoretical review of PFs, is presented. In addition, a novel algorithm, called the modified Rao-Blackwellised Particle Filter (RBPF), is proposed to reduce the computational load of regular RBPFs. Several new speech enhancement methods using particle filters are also derived, and shown to outperform some other existing PF-based algorithms. As a secondary contribution, a novel strategy to extend their range of application to colored noise is explained and applied. Compared with the other types of enhancement algorithms tested (including spectral subtraction, signal subspace, dual extended Kalman filter, perceptually constrained Kalman filter, dual perceptually constrained unscented Kalman filter), we find that the particle-filter based algorithms presented have the advantage of not introducing any musical noise.
Furthermore, in the conditions of our experiments, using several objective measures we find that they are able to compete with and outperform most of the other algorithms tested. Using these measures and based on informal listening, we highlight their advantages---naturalness of the enhanced speech, low intrusiveness of the non-musical residual noise, very good performance at high SNR, flexibility---and their main limitations---intraspeech residual noise "modulated" by the speech, computational burden. Considering how flexible and parametrizable PFs are, there is a strong potential for further improvement.

11

Wang, Yao Electrical Engineering & Telecommunications Faculty of Engineering UNSW. "Single channel speech enhancement based on perceptual temporal masking model." Awarded by: University of New South Wales. Electrical Engineering & Telecommunications, 2007. http://handle.unsw.edu.au/1959.4/40454.

Abstract:

In most speech communication systems, the presence of background noise causes the quality and intelligibility of speech to degrade, especially when the Signal-to-Noise Ratio (SNR) is low. Numerous speech enhancement techniques have been employed successfully in many applications. However, at low signal-to-noise ratios most of these speech enhancement techniques tend to introduce a perceptually annoying residual noise known as "musical noise". The research presented in this thesis aims to minimize this musical noise and maximize the noise reduction ability of speech enhancement algorithms to improve speech quality in low SNR environments. This thesis proposes two novel speech enhancement algorithms based on Wiener and Kalman filters that exploit the masking properties of the human auditory system to reduce background noise. The perceptual Wiener filter method uses either temporal or simultaneous masking to adjust the Wiener gain in order to suppress noise below the masking thresholds. The second algorithm involves reshaping the corrupted signal according to the masking threshold in each critical band, followed by Kalman filtering. A comparison of the results from these proposed techniques with those obtained from traditional methods suggests that the proposed algorithms address the problem of noise reduction effectively while decreasing the level of the musical noise. In this thesis, many other existing competitive noise suppression methods have also been discussed and their performance evaluated under different types of noise environments. The performances were evaluated and compared to each other using both objective PESQ measures (ITU-T P.862) and subjective listening tests (ITU-T P.835). The proposed speech enhancement schemes based on the auditory masking model outperformed the other methods that were tested.
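
For readers unfamiliar with the Wiener gain that the perceptual method adjusts, the sketch below computes the textbook per-bin gain from power-spectrum estimates. It deliberately leaves out the masking-threshold adjustment described in the abstract, since the abstract does not give its form; the function name and the sample values are assumptions for illustration only.

```python
def wiener_gains(noisy_power, noise_power):
    """Textbook Wiener gain per frequency bin.

    noisy_power: power of the noisy speech in each bin
    noise_power: estimated noise power in each bin
    Gain = snr / (1 + snr), with snr estimated by power subtraction.
    """
    gains = []
    for p_y, p_n in zip(noisy_power, noise_power):
        if p_n <= 0.0:
            # No noise estimated in this bin: pass it through unchanged.
            gains.append(1.0)
            continue
        # Clamp at zero so a noise overestimate cannot give a negative SNR.
        snr = max(p_y - p_n, 0.0) / p_n
        gains.append(snr / (1.0 + snr))
    return gains


# Hypothetical three-bin frame.
print(wiener_gains([4.0, 1.0, 2.0], [1.0, 2.0, 0.0]))  # [0.75, 0.0, 1.0]
```

Bins whose gain flips between zero and a small positive value from frame to frame are one common source of the musical noise the abstract refers to.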

12

Ma, Ning. "Speech enhancement algorithms using Kalman filtering and masking properties of human auditory systems." Thesis, University of Ottawa (Canada), 2005. http://hdl.handle.net/10393/29229.

Abstract:

Speech enhancement algorithms have been employed successfully in many areas such as VoIP, automatic speech recognition and speaker verification. Many approaches are presented in the literature. This thesis focuses on enhancing single channel speech degraded by white noise or colored noise. A Kalman filter algorithm combined with the masking properties of human auditory systems is proposed. The threshold computed from the masking properties is used as a constraint in the Kalman filter to theoretically derive a modified Kalman filter. The derivation gives a theoretical foundation for the feasibility of combining masking properties with a Kalman filter. Some heuristic methods are also proposed for an easier implementation. One algorithm proposes to use the frequency domain masking level as a hard threshold to reshape the Kalman filtered signal. Another algorithm is to use a post-filter concatenated with the Kalman filter, using a threshold where both time-domain and frequency domain masking properties are taken into account. The goal of the masking is to make the energy of the estimate state error smaller than the threshold. To further decrease the computational cost, a wavelet Kalman filter combined with masking thresholds is also introduced. In the above algorithms, the speech model is assumed to be linear. Nonlinear speech models are also considered in the thesis. To address the nonlinear model problem, dual Extended Kalman Filter (EKF) and dual Unscented Kalman Filter (UKF) algorithms are studied. In these cases, both time-domain and frequency domain masking properties are taken into account. The simulation results show that all the proposed methods combining Kalman filter and masking properties can produce promising results from the point of view of PESQ scores. The average PESQ score gains obtained by these proposed methods are from about 0.35 to 0.45. Some informal subjective tests also show that the performance of the proposed methods is promising. 
No voice activity detection is required in the proposed methods.
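
To make the linear-model case concrete, here is a minimal scalar Kalman filter for a first-order autoregressive signal observed in white noise. It is a generic sketch, not one of the thesis's algorithms: it contains no masking constraint, and the model order, the parameter names `a`, `q`, `r`, and the initial values are all assumptions. It also assumes `q` or `r` is positive so the gain is well defined.

```python
def kalman_ar1(measurements, a, q, r, x0=0.0, p0=1.0):
    """Filter y[k] = x[k] + v[k], where x[k] = a * x[k-1] + w[k].

    a: AR(1) coefficient, q: process noise variance,
    r: measurement noise variance, x0/p0: initial state and variance.
    """
    x, p = x0, p0
    estimates = []
    for y in measurements:
        # Predict the state and its error variance one step ahead.
        x_pred = a * x
        p_pred = a * a * p + q
        # Kalman gain balances prediction error against measurement noise.
        k = p_pred / (p_pred + r)
        # Correct the prediction with the new measurement.
        x = x_pred + k * (y - x_pred)
        p = (1.0 - k) * p_pred
        estimates.append(x)
    return estimates


# Hypothetical constant signal observed twice.
print(kalman_ar1([1.0, 1.0], a=1.0, q=0.0, r=1.0))
```

With these settings the first estimate is 0.5 and the second is about 0.667: each new measurement pulls the estimate closer to the observed value as the error variance shrinks.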

13

Gransden, I. R. "High speed auditory analysis." Thesis, University of Sheffield, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.364247.

14

Gobl, Christer. "The Voice Source in Speech Communication - Production and Perception Experiments Involving Inverse Filtering and Synthesis." Doctoral thesis, KTH, Speech Transmission and Music Acoustics, 2003. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-3665.

Abstract:

This thesis explores, through a number of production and perception studies, the nature of the voice source signal and how it varies in spoken communication. Research is also presented that deals with the techniques and methodologies for analysing and synthesising the voice source. The main analytic technique involves interactive inverse filtering for obtaining the source signal, which is then parameterised to permit the quantification of source characteristics. The parameterisation is carried out by means of model matching, using the four-parameter LF model of differentiated glottal flow.

The first three analytic studies focus on segmental and suprasegmental determinants of source variation. As part of the prosodic variation of utterances, focal stress shows for the glottal excitation an enhancement between the stressed vowel and the surrounding consonants. At a segmental level, the voice source characteristics of a vowel show potentially major differences as a function of the voiced/voiceless nature of an adjacent stop. Cross-language differences in the extent and directionality of the observed effects suggest different underlying control strategies in terms of the timing of the laryngeal and supralaryngeal gestures, as well as in the laryngeal tension settings. Different classes of voiced consonants also show differences in source characteristics: here the differences are likely to be passive consequences of the aerodynamic conditions that are inherent to the consonants. Two further analytic studies present voice source correlates for six different voice qualities as defined by Laver's classification system. Data from stressed and unstressed contexts clearly show that the transformation from one voice quality to another does not simply involve global changes of the source parameters. As well as providing insights into these aspects of speech production, the analytic studies provide quantitative measures useful in technology applications, particularly in speech synthesis.

The perceptual experiments use the LF source implementation in the KLSYN88 synthesiser to test some of the analytic results and to harness them to explore the paralinguistic dimension of speech communication. A study of the perceptual salience of different parameters associated with breathy voice indicates that the source spectral slope is critically important and that, surprisingly, aspiration noise contributes relatively little. Further perceptual tests using stimuli with different voice qualities explore the mapping between voice quality and its paralinguistic function of expressing emotion, mood and attitude. The results of these studies highlight the crucial role of voice quality in expressing affect as well as providing pointers to how it combines with f₀ for this purpose.

The last section of the thesis focuses on the techniques used for the analysis and synthesis of the source. A semi-automatic method for inverse filtering is presented, which is novel in that it optimises the inverse filter by exploiting the knowledge that is typically used by the experimenter when carrying out manual interactive inverse filtering. A further study looks at the properties of the modified LF model in the KLSYN88 synthesiser: it highlights how it differs from the standard LF model and discusses the implications for synthesising the glottal source signal from LF model data. Effective and robust source parameterisation for the analysis of voice quality is the topic of the final paper: the effectiveness of global, amplitude-based, source parameters is examined across speech tokens with large differences in f₀. Additional amplitude-based parameters are proposed to enable a more detailed characterisation of the glottal pulse.

Keywords: Voice source dynamics, glottal source parameters, source-filter interaction, voice quality, phonation, perception, affect, emotion, mood, attitude, paralinguistic, inverse filtering, knowledge-based, formant synthesis, LF model, fundamental frequency, f₀.

15

Courtis, N. J. "Some aspects of speech intelligibility enhancement with particular regard to adaptive filtering and room acoustics." Thesis, University of Hertfordshire, 1985. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.356313.

16

Ozbek, Ibrahim Yucel. "Dynamic System Modeling And State Estimation For Speech Signal." Phd thesis, METU, 2010. http://etd.lib.metu.edu.tr/upload/3/12611777/index.pdf.

Abstract:

This thesis presents an all-inclusive framework on how the current formant tracking and audio (and/or visual)-to-articulatory inversion algorithms can be improved. The possible improvements are summarized as follows. The first part of the thesis investigates the problem of formant frequency estimation when the number of formants to be estimated is fixed or variable, respectively. The fixed-number formant tracking method is based on the assumption that the number of formant frequencies is fixed along the speech utterance. The proposed algorithm is based on the combination of a dynamic programming algorithm and Kalman filtering/smoothing. In this method, the speech signal is divided into voiced and unvoiced segments, and the formant candidates are associated via a dynamic programming algorithm for each voiced and unvoiced part separately. Individual adaptive Kalman filtering/smoothing is used to perform the formant frequency estimation. The performance of the proposed algorithm is compared with some algorithms given in the literature. The variable-number formant tracking method considers those formant frequencies which are visible in the spectrogram. Therefore, the number of formant frequencies is not fixed and can change along the speech waveform. In that case, it is also necessary to estimate the number of formants to track. For this purpose, the proposed algorithm uses extra logic (a formant track start/end decision unit). The measurement update of each individual formant trajectory is handled via Kalman filters. The performance of the proposed algorithm is illustrated by some examples. The second part of this thesis is concerned with improving audiovisual-to-articulatory inversion performance. The related studies can be examined in two parts: Gaussian mixture model (GMM) regression based inversion and Jump Markov Linear System (JMLS) based inversion. The GMM regression based inversion method involves modeling audio (and/or visual) and articulatory data as a joint Gaussian mixture model. The conditional expectation of this distribution gives the desired articulatory estimate. In this method, we examine the usefulness of the combination of various acoustic features and the effectiveness of various types of fusion techniques in combination with audiovisual features. Also, we propose dynamic smoothing methods to smooth articulatory trajectories. The performance of the proposed algorithm is illustrated and compared with conventional algorithms. JMLS inversion involves tying the acoustic (and/or visual) spaces and articulatory space via multiple state space representations. In this way, the articulatory inversion problem is converted into a state estimation problem where the audiovisual data are considered as measurements and articulatory positions are state variables. The proposed inversion method first learns the parameter set of the state space model via an expectation maximization (EM) based algorithm, and the state estimation is handled via an interacting multiple model (IMM) filter/smoother.
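
The conditional expectation used in GMM regression has a standard closed form, written out below for reference. The notation is an assumption, not taken from the thesis: a is the acoustic (or audiovisual) feature vector, x the articulatory vector, and the joint mixture has K components with weights π_k, means (μ_a, μ_x), and block covariances Σ_aa, Σ_xa.

```latex
\hat{x}(a) = \mathbb{E}[x \mid a]
  = \sum_{k=1}^{K} \beta_k(a)
    \left[ \mu_x^{(k)} + \Sigma_{xa}^{(k)} \bigl(\Sigma_{aa}^{(k)}\bigr)^{-1}
           \bigl(a - \mu_a^{(k)}\bigr) \right],
\qquad
\beta_k(a) = \frac{\pi_k \, \mathcal{N}\bigl(a;\, \mu_a^{(k)}, \Sigma_{aa}^{(k)}\bigr)}
                  {\sum_{j=1}^{K} \pi_j \, \mathcal{N}\bigl(a;\, \mu_a^{(j)}, \Sigma_{aa}^{(j)}\bigr)}
```

Each component contributes a linear regression from a to x, and the posterior weights β_k(a) blend those regressions according to how well each component explains the observed features.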

17

Abel, Andrew. "Towards an intelligent fuzzy based multimodal two stage speech enhancement system." Thesis, University of Stirling, 2013. http://hdl.handle.net/1893/15989.

Abstract:

This thesis presents a novel two stage multimodal speech enhancement system, making use of both visual and audio information to filter speech, and explores the extension of this system with the use of fuzzy logic to demonstrate proof of concept for an envisaged autonomous, adaptive, and context aware multimodal system. The design of the proposed cognitively inspired framework is scalable, meaning that it is possible for the techniques used in individual parts of the system to be upgraded and there is scope for the initial framework presented here to be expanded. In the proposed system, the concept of single modality two stage filtering is extended to include the visual modality. Noisy speech information received by a microphone array is first pre-processed by visually derived Wiener filtering employing the novel use of the Gaussian Mixture Regression (GMR) technique, making use of associated visual speech information, extracted using a state of the art Semi Adaptive Appearance Models (SAAM) based lip tracking approach. This pre-processed speech is then enhanced further by audio only beamforming using a state of the art Transfer Function Generalised Sidelobe Canceller (TFGSC) approach. This results in a system which is designed to function in challenging noisy speech environments (using speech sentences with different speakers from the GRID corpus and a range of noise recordings), and both objective and subjective test results (employing the widely used Perceptual Evaluation of Speech Quality (PESQ) measure, a composite objective measure, and subjective listening tests), showing that this initial system is capable of delivering very encouraging results with regard to filtering speech mixtures in difficult reverberant speech environments. Some limitations of this initial framework are identified, and the extension of this multimodal system is explored, with the development of a fuzzy logic based framework and a proof of concept demonstration implemented. 
Results show that this proposed autonomous, adaptive, and context aware multimodal framework is capable of delivering very positive results in difficult noisy speech environments, with cognitively inspired use of audio and visual information, depending on environmental conditions. Finally, some concluding remarks are made along with proposals for future work.

18

Nallamilli, Sai Chandra Sekhar Reddy, and Nihanth Kandi. "Detection of Human Emotion from Noise Speech." Thesis, Blekinge Tekniska Högskola, Institutionen för tillämpad signalbehandling, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-19610.

Abstract:

Detecting human emotion from speech is a challenging task. Factors such as intonation, pitch, and loudness vary from one voice to another, so knowing the exact pitch, intonation, and loudness of the speech is important, and this is part of what makes detection challenging. Some recordings contain high background noise, which affects the amplitude or pitch of the signal, so detailed knowledge of the properties of the speech is necessary to detect emotion. Detection of human emotion from speech signals is a recent research field. One scenario where it has been applied is in situations where human integrity and security are at risk. In this project we propose a set of features based on the decomposition signals from the discrete wavelet transform to characterize different types of negative emotions such as anger, happiness, sadness, and desperation. The features are measured in three different conditions: (1) the original speech signals, (2) the signals that are contaminated with noise or are affected by the presence of a phone channel, and (3) the signals that are obtained after processing using an algorithm for Speech Enhancement Transform. According to the results, when speech enhancement is applied, the detection of emotion in speech improves compared to the results obtained when the speech signal is highly contaminated with noise. Our objective is to use an artificial neural network, motivated by the brain, which is the most efficient machine for recognizing speech and is itself built from neural networks. Artificial neural networks also offer advantages such as nonlinearity and high classification capability, and they can be used to develop a machine or computer that detects emotion. Here we use a feedforward neural network, which is suitable for classification, with the sigmoid function as the activation function.
The detection of human emotion from speech is achieved by training the neural network with features extracted from the speech. To achieve this, we need proper features from the speech, so we must remove the background noise, which can be done with filters. The wavelet transform is the filtering technique used to remove the background noise and enhance the required features in the speech.
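
The forward pass of a feedforward network with sigmoid activations can be sketched in a few lines. This is a generic illustration only: the layer sizes, weights, and feature values below are hypothetical, and the thesis's actual architecture, wavelet features, and training procedure are not reproduced here.

```python
import math


def sigmoid(z):
    """Logistic activation, written to avoid overflow for large |z|."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dense(inputs, weights, biases):
    """One fully connected layer: sigmoid(W @ inputs + b)."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(features, layers):
    """Propagate a feature vector through a list of (weights, biases) layers."""
    activations = features
    for weights, biases in layers:
        activations = dense(activations, weights, biases)
    return activations


# Hypothetical network: 2 input features, 2 hidden units, 1 output unit.
layers = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0]),
    ([[1.0, -1.0]], [0.0]),
]
print(forward([0.2, 0.8], layers))
```

Each output lies strictly between 0 and 1, so with one output unit per emotion class the largest activation can be read as the predicted class.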

19

Thüne, Philipp [Verfasser], Gerald [Gutachter] Enzner, and Peter [Gutachter] Jax. "Advances in blind multichannel Wiener filtering of noisy speech / Philipp Thüne ; Gutachter: Gerald Enzner, Peter Jax ; Fakultät für Elektrotechnik und Informationstechnik." Bochum : Ruhr-Universität Bochum, 2017. http://d-nb.info/1150509546/34.

20

Torres, Juan Félix. "Estimation of glottal source features from the spectral envelope of the acoustic speech signal." Diss., Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/34736.

Abstract:

Speech communication encompasses diverse types of information, including phonetics, affective state, voice quality, and speaker identity. From a speech production standpoint, the acoustic speech signal can be mainly divided into glottal source and vocal tract components, which play distinct roles in rendering the various types of information it contains. Most deployed speech analysis systems, however, do not explicitly represent these two components as distinct entities, as their joint estimation from the acoustic speech signal becomes an ill-defined blind deconvolution problem. Nevertheless, because of the desire to understand glottal behavior and how it relates to perceived voice quality, there has been continued interest in explicitly estimating the glottal component of the speech signal. To this end, several inverse filtering (IF) algorithms have been proposed, but they are unreliable in practice because of the blind formulation of the separation problem. In an effort to develop a method that can bypass the challenging IF process, this thesis proposes a new glottal source information extraction method that relies on supervised machine learning to transform smoothed spectral representations of speech, which are already used in some of the most widely deployed and successful speech analysis applications, into a set of glottal source features. A transformation method based on Gaussian mixture regression (GMR) is presented and compared to current IF methods in terms of feature similarity, reliability, and speaker discrimination capability on a large speech corpus, and potential representations of the spectral envelope of speech are investigated for their ability to represent glottal source variation in a predictable manner. The proposed system was found to produce glottal source features that reasonably matched their IF counterparts in many cases, while being less susceptible to spurious errors.
The development of the proposed method entailed a study into the aspects of glottal source information that are already contained within the spectral features commonly used in speech analysis, yielding an objective assessment regarding the expected advantages of explicitly using glottal information extracted from the speech signal via currently available IF methods, versus the alternative of relying on the glottal source information that is implicitly contained in spectral envelope representations.

21

Wu, Mingyang. "Pitch tracking and speech enhancement in noisy and reverberant environments." Columbus, Ohio : Ohio State University, 2003. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1064341479.

Abstract:

Thesis (Ph. D.)--Ohio State University, 2003.
Title from first page of PDF file. Document formatted into pages; contains xvi, 149 p.; also includes graphics. Includes abstract and vita. Advisor: DeLiang Wang, Dept. of Computer and Information Science. Includes bibliographical references (p. 136-149).

APA, Harvard, Vancouver, ISO, and other styles

22

Roman, Nicoleta. "Auditory-based algorithms for sound segregation in multisource and reverberant environments." Connect to resource, 2005. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1124370749.

Full text

Abstract:

Thesis (Ph. D.)--Ohio State University, 2005.
Title from first page of PDF file. Document formatted into pages; contains i-xxii, xx-xxi, 183 p.; also includes graphics. Includes bibliographical references (p. 171-183). Available online via OhioLINK's ETD Center

APA, Harvard, Vancouver, ISO, and other styles

23

Tan, Ke. "Convolutional and recurrent neural networks for real-time speech separation in the complex domain." The Ohio State University, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=osu1626983471600193.

Full text

APA, Harvard, Vancouver, ISO, and other styles

24

Neville, Katrina Lee. "Channel Compensation for Speaker Recognition Systems." RMIT University. Electrical and Computer Engineering, 2007. http://adt.lib.rmit.edu.au/adt/public/adt-VIT20080514.093453.

Full text

Abstract:

This thesis addresses the problem of how best to remedy different types of channel distortion on speech that is to be used in automatic speaker recognition and verification systems. In automatic speaker recognition, a machine analyses a person's voice and determines the person's identity by comparing speech features to a known set of speech features. In automatic speaker verification, a person claims an identity and the machine determines whether the claimed identity is correct or the person is an impostor. Channel distortion occurs whenever information is sent electronically through any type of channel, whether a basic wired telephone channel or a wireless channel. The types of distortion that can corrupt the information include time-variant or time-invariant filtering and the addition of 'thermal noise'; both can cause varying degrees of error in the information being received and analysed. The experiments presented in this thesis investigate the effects of channel distortion on average speaker recognition rates and test the effectiveness of various channel compensation algorithms designed to mitigate those effects. The speaker recognition system was represented by a basic recognition algorithm consisting of speech analysis, extraction of feature vectors in the form of Mel-cepstral coefficients, and a classification stage based on the minimum distance rule.
Two types of channel distortion were investigated: convolutional (lowpass filtering) effects and the addition of white Gaussian noise. Three methods of channel compensation were tested: Cepstral Mean Subtraction (CMS), RelAtive SpecTrAl (RASTA) processing, and the Constant Modulus Algorithm (CMA). The results showed that, in the case of filtering at low cutoff frequencies (3 or 4 kHz), both CMS and RASTA processing improved the average speaker recognition rates compared to speech with no compensation. The improvement due to RASTA processing was larger than that achieved with CMS. Neither CMS nor RASTA improved the accuracy of the speaker recognition system for cutoff frequencies of 5 kHz, 6 kHz or 7 kHz. For noisy speech, all methods analysed were able to compensate at high SNRs of 40 dB and 30 dB, and only RASTA processing was able to compensate and improve the average recognition rate for speech corrupted with a high level of noise (SNRs of 20 dB and 10 dB).
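As a rough illustration of why Cepstral Mean Subtraction can counter convolutional distortion: a time-invariant channel multiplies the spectrum, so it adds an approximately constant vector to every cepstral frame, and removing the per-coefficient mean over time removes that vector. The sketch below is a minimal version under that assumption; the function name and the (frames, coefficients) array layout are illustrative choices, not taken from the thesis.

```python
import numpy as np

def cepstral_mean_subtraction(cepstra):
    """Remove the time-average of each cepstral coefficient.

    cepstra: array of shape (frames, coefficients), e.g. Mel-cepstral features.
    """
    cepstra = np.asarray(cepstra, dtype=float)
    # The mean over frames captures any constant (channel-induced) offset.
    return cepstra - cepstra.mean(axis=0, keepdims=True)

# Toy demonstration with synthetic features (not real speech).
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 13))          # stand-in for clean cepstra
channel_offset = rng.normal(size=(1, 13))   # constant offset from a fixed channel
distorted = clean + channel_offset
```

After normalisation the clean and distorted feature sets coincide, which is the idealised behaviour; additive noise is not a constant cepstral offset, so it is not removed this way.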

APA, Harvard, Vancouver, ISO, and other styles

25

Deivard, Johannes. "How accuracy of estimated glottal flow waveforms affects spoofed speech detection performance." Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-48414.

Full text

Abstract:

In automatic speaker verification, one of the challenges is to keep malicious users out of the system. One way to do this is to create algorithms that detect spoofed speech. There are several types of spoofed speech and several ways to detect them, one of which is to look at the glottal flow waveform (GFW) of a speech signal. This waveform is often estimated using glottal inverse filtering (GIF), since special invasive equipment is required to obtain the ground truth GFW. To the author's knowledge, no research has investigated the correlation between GFW accuracy and spoofed speech detection (SSD) performance. This thesis tries to find out whether that correlation exists. First, the performance of different GIF methods is evaluated; then simple SSD machine learning (ML) models are trained and evaluated based on their macro average precision. The ML models use different datasets composed of parametrized GFWs estimated with the GIF methods from the previous step. Results from the previous tasks are then combined in order to spot any correlations. The evaluations showed that the different GIF methods created GFWs of varying accuracy. The machine learning models also showed varying performance depending on the type of dataset used. However, when combining the results, no obvious correlations between GFW accuracy and SSD performance were detected. This suggests that the overall accuracy of a GFW is not a substantial factor in the performance of machine learning-based SSD algorithms.

APA, Harvard, Vancouver, ISO, and other styles

26

Al-saqaf, Walid. "Breaking digital firewalls : analyzing internet censorship and circumvention in the arab world." Doctoral thesis, Örebro universitet, Institutionen för humaniora, utbildnings- och samhällsvetenskap, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:oru:diva-34596.

Full text

Abstract:

This dissertation explores the role of Internet censorship and circumvention in the Arab world as well as Arabs’ views on the limits to free speech on the Internet. The project involves the creation of an Internet censorship circumvention tool named Alkasir that allows users to report and access certain types of censored websites. The study covers the Arab world at large with special focus on Egypt, Syria, Tunisia, and Yemen. This work is of interdisciplinary nature and draws on the disciplines of media and communication studies and computer science. It uses a pioneering experimental approach by placing Alkasir in the hands of willing users who automatically feed a server with data about usage patterns without storing any of their personal information. In addition to the analysis of Alkasir usage data, Web surveys were used to learn about any technical and nontechnical Internet censorship practices that Arab users and content producers may have been exposed to. The study also aims at learning about users’ experiences with circumvention tools and how such tools could be improved. The study found that users have successfully reported and accessed hundreds of censored social networking, news, dissident, multimedia and other websites. The survey results show that while most Arab informants disapprove censoring online anti-government political content, the majority support the censoring of other types of content such as pornography, hate speech, and anti-religion material. Most informants indicated that circumvention tools should be free of charge, fast and reliable. An increase in awareness among survey respondents of the need for privacy and anonymity features in circumvention solutions was observed.

APA, Harvard, Vancouver, ISO, and other styles

27

Motlagh, Zadeh Lina. "Developing a digits in noise screening test with higher sensitivity to high-frequency hearing loss." University of Cincinnati / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1552378973670023.

Full text

APA, Harvard, Vancouver, ISO, and other styles

28

Houdek, Miroslav. "Rozpoznání emočního stavu člověka z řeči." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2009. http://www.nusl.cz/ntk/nusl-218117.

Full text

Abstract:

This master's thesis deals with the recognition of emotional states and gender on the basis of speech signal analysis. We used various prosodic and cepstral features to describe the speech signal. The text describes non-invasive methods for glottal pulse estimation. The described speech features were implemented in MATLAB. For classification we used a GMM classifier, which models the feature space with Gaussian probability distributions. Furthermore, we constructed a system for recognizing the emotional state of the speaker and a system for gender recognition from speech. We tested the success of the resulting systems with several features on speech signal segments of various lengths and compared the results. In the last part we tested the influence of speaker and gender on the success of emotional state recognition.

APA, Harvard, Vancouver, ISO, and other styles

29

Rao, Peddi Srinivas, and Vallabhaneni Sreelatha. "Implementation and Evaluation of Spectral Subtraction with Minimum Statistics using WOLA and FFT Modulated Filter Banks." Thesis, Blekinge Tekniska Högskola, Institutionen för tillämpad signalbehandling, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-2906.

Full text

Abstract:

In communication systems, the speech signal is corrupted by additive acoustic noise, which degrades the quality and intelligibility of speech and hence the effectiveness of communication. Current research asks how effectively acoustic noise can be eliminated without affecting the original speech quality, and this is the challenge addressed in this thesis. The work proposes a multi-tiered detection method based on time-frequency analysis (i.e. the filter bank concept) of noisy speech signals, using a standard speech enhancement method based on the proven spectral subtraction, for single-channel speech data and for a wide range of noise types at various noise levels. Several variants of the standard spectral subtraction proposed by S. F. Boll have been introduced. In this thesis we designed and implemented a novel approach to Spectral Subtraction based on Minimum Statistics (MinSSS). The power spectrum of the noise signal is estimated by finding the minimum values of a smoothed power spectrum of the noisy speech signal, which circumvents the speech activity detection problem. This approach is also capable of dealing with non-stationary noise signals. To analyze the system in the time-frequency domain, we implemented two filter bank approaches: Weighted OverLap Add (WOLA) and Fast Fourier Transform Modulated (FFTMod). The proposed systems were implemented and evaluated offline in Matlab, and their performance was validated with objective quality measures, namely Signal to Noise Ratio Improvement (SNRI) and Perceptual Evaluation of Speech Quality (PESQ). The systems were tested with a combination of clean male and female speech sampled at 8 kHz; these signals were corrupted with various kinds of noise at different noise power levels.
The MinSSS algorithm implemented with the FFTMod filter bank outperforms the implementation with the WOLA filter bank.
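The noise-tracking idea described in this abstract can be sketched in a few lines: smooth the noisy power spectrum over time, take the running minimum in each frequency bin as the noise estimate, and subtract it. The code below is only a simplified sketch under stated assumptions. The smoothing constant `alpha`, the `window` length, and the spectral `floor` are hypothetical parameters, the bias compensation used in full minimum-statistics estimators is omitted, and the filter bank stage (WOLA or FFTMod) is assumed to have already produced the power spectrogram.

```python
import numpy as np

def min_statistics_noise(power_spec, alpha=0.85, window=50):
    """Estimate the noise power per bin as the running minimum of a smoothed spectrum.

    power_spec: array of shape (frames, bins) holding noisy-speech power.
    """
    power_spec = np.asarray(power_spec, dtype=float)
    smoothed = np.empty_like(power_spec)
    smoothed[0] = power_spec[0]
    # First-order recursive smoothing along time.
    for t in range(1, len(power_spec)):
        smoothed[t] = alpha * smoothed[t - 1] + (1.0 - alpha) * power_spec[t]
    noise = np.empty_like(smoothed)
    # Minimum over the most recent `window` frames, separately for each bin.
    for t in range(len(smoothed)):
        start = max(0, t - window + 1)
        noise[t] = smoothed[start:t + 1].min(axis=0)
    return noise

def spectral_subtraction(power_spec, noise, floor=0.01):
    """Subtract the noise estimate, keeping a small spectral floor."""
    power_spec = np.asarray(power_spec, dtype=float)
    return np.maximum(power_spec - noise, floor * power_spec)
```

Because the minimum is taken over a window longer than a typical speech burst, the estimate keeps following the noise floor during speech, which is why no separate speech activity detector is needed in this scheme.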

APA, Harvard, Vancouver, ISO, and other styles

30

Matos, Adriano Nogueira. "Extração de características do sinal de voz utilizando análise fatorial verdadeira." Universidade Federal do Amazonas, 2008. http://tede.ufam.edu.br/handle/tede/2959.

Full text

Abstract:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
Digital processing of the speech signal is used in many computer applications, chiefly speech recognition, synthesis and coding. All of these applications require the amount of data in the acoustic signal to be reduced so that it can be processed by a computer. Feature extraction from the speech signal, the subject of this study, performs this task. The extracted features should characterize the speech signal well and contain no redundancy, in order to maximize the performance of the systems that use them. The Mel Frequency Cepstral Coefficients (MFCC) feature extraction method partially fulfils these requirements, but it is seriously degraded in the presence of noise. The statistical method of Factor Analysis is applied here to filter the noise components from the speech. The results of the experiments performed in this work show that this is a competitive method, especially when used to generate robust acoustic models in severe noise conditions.

APA, Harvard, Vancouver, ISO, and other styles

31

Crespo, Cuaresma Jesus, and Martin Feldkircher. "Spatial Filtering, Model Uncertainty and the Speed of Income Convergence in Europe." Wiley, 2013. http://dx.doi.org/10.1002/jae.2277.

Full text

Abstract:

In this paper we put forward a Bayesian Model Averaging method aimed at performing inference under model uncertainty in the presence of potential spatial autocorrelation. The method uses spatial filtering in order to account for uncertainty in spatial linkages. Our procedure is applied to a dataset of income per capita growth and 50 potential determinants for 255 NUTS-2 European regions. We show that ignoring uncertainty in the type of spatial weight matrix can have an important effect on the estimates of the parameters attached to the model covariates. After integrating out the uncertainty implied by the choice of regressors and spatial links, human capital investments and transitional dynamics related to income convergence appear as the most robust determinants of growth at the regional level in Europe. Our results imply that a quantitatively important part of the income convergence process in Europe is influenced by spatially correlated growth spillovers.

APA, Harvard, Vancouver, ISO, and other styles

32

Olugbenga, Olubodun. "High speed optical phase modulated signaling with offset filtering in a 50 GHz grid." Thesis, Swansea University, 2011. https://cronfa.swan.ac.uk/Record/cronfa42896.

Full text

APA, Harvard, Vancouver, ISO, and other styles

33

Hawkins, Mikhel E. "High speed target tracking using Kalman filter and partial window imaging." Thesis, Georgia Institute of Technology, 2002. http://hdl.handle.net/1853/16709.

Full text

APA, Harvard, Vancouver, ISO, and other styles

34

Sturmel, Nicolas. "Analyse de la qualité vocale appliquée à la parole expressive." Phd thesis, Université Paris Sud - Paris XI, 2011. http://tel.archives-ouvertes.fr/tel-00591638.

Full text

Abstract:

The analysis of speech signals makes it possible to understand how the vocal apparatus works, and also to describe new parameters for qualifying and quantifying voice perception. In the case of expressive speech, the interest lies in large variations of voice quality and their links with the expressiveness and intention of the speaker. To describe these links, one must be able to estimate the parameters of the production model and also to decompose the speech signal into each of the parts that contribute to this model. The work carried out in this thesis therefore centres on the segmentation and decomposition of speech signals and on the estimation of the parameters of the voice production model. First, the multi-scale decomposition of speech signals is addressed. Building on the LoMA method, which traces lines following the amplitude maxima of the time-domain responses of a wavelet filter bank, a number of characteristics of the speech signal can be detected: the glottal closure instants, the energy associated with each cycle and its spectral distribution, and the open quotient of the glottal cycle (by observing the phase delay of the first harmonic). This method is then tested on synthetic and real signals. Next, the harmonic + noise decomposition of speech signals is addressed. An existing method (PAPD, Periodic/APeriodic Decomposition) is adapted to variations in fundamental frequency by dynamically varying the size of the analysis window, and is called PAP-A. This new method is then tested on a database of synthetic signals. Sensitivity to the accuracy of fundamental frequency estimation is examined in particular. The results show better-quality decompositions for PAP-A than for PAPD. Then, the problem of source/filter deconvolution is addressed.
Source/filter separation by ZZT (zeros of the Z-transform) is compared with the usual methods based on linear prediction. The ZZT is used to estimate the parameters of the glottal source model via a simple but robust method that allows joint estimation of two glottal flow parameters: the open quotient and the asymmetry. The method thus developed is tested and combined with wavelet-based open quotient estimation. Finally, these three estimation methods are applied to a large number of files from a database covering different speaking styles. The results of this analysis are discussed in order to characterise the link between style, the values of the voice production parameters, and voice quality. In particular, a very clear emergence of style groups is observed.
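The core step of a ZZT-style decomposition can be sketched as follows: treat the samples of a frame as polynomial coefficients, find the zeros of that polynomial, and split them by whether they lie inside or outside the unit circle. This is a minimal sketch, not the thesis's implementation. It assumes the frame has already been windowed and centred on a glottal closure instant, and the attribution of outside zeros to the glottal open phase follows the usual causal/anticausal interpretation of ZZT; the function names are illustrative.

```python
import numpy as np

def zzt_split(frame):
    """Split the zeros of a frame's Z-transform by position relative to the unit circle."""
    x = np.asarray(frame, dtype=float)
    # X(z) = sum_n x[n] z^(-n) has the same non-trivial zeros as the polynomial
    # x[0] z^(N-1) + x[1] z^(N-2) + ... + x[N-1], whose coefficients are the samples.
    zeros = np.roots(x)
    magnitude = np.abs(zeros)
    inside = zeros[magnitude <= 1.0]   # causal part (assumed: vocal tract, return phase)
    outside = zeros[magnitude > 1.0]   # anticausal part (assumed: glottal open phase)
    return inside, outside

def spectrum_from_zeros(zeros, n_fft=512):
    """Magnitude spectrum on the unit circle of the polynomial with the given zeros."""
    coeffs = np.poly(zeros) if len(zeros) else np.array([1.0])
    return np.abs(np.fft.rfft(np.real(coeffs), n_fft))
```

Root finding on frames of a few hundred samples is numerically delicate, so practical use would need care with frame length and windowing that this sketch does not address.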

APA, Harvard, Vancouver, ISO, and other styles

35

Hollis, Timothy Mowry. "Circuit and Modeling Solutions for High-Speed Chip-to-Chip Communication." BYU ScholarsArchive, 2007. https://scholarsarchive.byu.edu/etd/1067.

Full text

Abstract:

This dissertation presents methods for modeling and mitigating voltage noise and timing jitter across high-speed chip-to-chip interconnects. Channel equalization and associated tuning schemes have been developed to target the distinct characteristics and signal degradation exhibited in the clock and data signals of multi-Gigabit/second digital communication links. Multiple methods for generating realistically degraded signals for the purpose of simulation are also presented and used to verify the proposed equalization and filtering topologies. Specifically, a new technique for modeling high-speed jittery clocks in the frequency domain is presented and shown to reduce transient simulation time and memory requirements, while simultaneously improving the timing resolution and accuracy of the simulation by minimizing the dependence on the transient simulation time-step. The technique is further developed to provide unprecedented control over the timing characteristics of the generated signals, and is then extended to the generation of random data signals with definable jitter statistics. Through these techniques, realistic clock and data waveforms are constructible, providing for the visualization of the combined effects of voltage and timing degradation, while at the same time tracking the phase relationship between the clock and data signals as they pass across their respective channels and through the receiving circuitry of the communication link. New methods for the automated tuning of second-order continuous-time channel equalizers are proposed based on the simulated or measured single pulse and double pulse responses of the transmission channel. Using only one degree of freedom, the methods target the reduction of inter-symbol interference (ISI) as identified in the single and double pulses.
Through tuning either the circuit quality factor (Q), the peaking frequency, or the frequency zero, the methods are shown to adapt to a variety of channel lengths and datarates from the same original equalizer transfer function, implying a good degree of generality, while offering a simple, yet effective, method for ISI reduction. Finally, the design of an active 5 Gigahertz (GHz) bandpass filter, employed for high-speed clock conditioning, is presented and shown to address both random and deterministic components of the clock signal degradation. The bandpass transfer function is achieved through a combination of AC coupling and a resonant LC tank consisting of on-chip interleaved spiral inductors and a tunable capacitor array. Through adjusting the load capacitance in parallel with the inductors, the center frequency of the filter is tunable over a range of nearly 5GHz. The design targets a supply voltage of 1.2 volts and draws approximately 5.7 milliamps of current.

APA, Harvard, Vancouver, ISO, and other styles

36

Jemâa, Imen. "Suivi de formants par analyse en multirésolution." Thesis, Université de Lorraine, 2013. http://www.theses.fr/2013LORR0026/document.

Full text

Abstract:

The research presented in this thesis aims to optimize the performance of formant tracking algorithms. We began by analyzing the existing techniques used in automatic formant tracking. This analysis showed that automatic formant estimation remains difficult despite the use of various complex techniques. Because no reference database was available for Arabic, we built a phonetically balanced Arabic corpus with manual phonetic and formant labeling. We then presented our two new formant tracking approaches. The first is based on the estimation of Fourier ridges (spectrogram maxima) or wavelet ridges (scalogram maxima), using as a tracking constraint the center of gravity of the combination of candidate frequencies for each formant; the second is based on dynamic programming combined with Kalman filtering. Finally, we carried out an exploratory study using our manually labeled corpus as a reference to evaluate the two new approaches quantitatively against other automatic formant tracking methods. We tested the first approach, wavelet ridge detection with the center-of-gravity calculation, on synthetic signals and then on real signals from our labeled corpus, using three types of complex wavelets (CMOR, SHAN and FBSP). These tests indicate that the formant tracking and scalogram resolution given by the CMOR and FBSP wavelets are better than those given by the SHAN wavelet. To evaluate our two approaches quantitatively, we calculated the mean absolute difference and the normalized standard deviation. We ran several tests with different speakers (male and female) on various long and short vowels and on continuous speech, taking the labeled signals from our database as the reference.
The tracking results were then compared with those of the Fourier ridge method using the center-of-gravity calculation, LPC analysis combined with the filter banks of Mustafa Kamran, and the LPC analysis in the Praat software. For the vowels /a/ and /A/, we found that tracking by the wavelet method with CMOR is generally better than that of the Praat and Fourier methods. This method therefore gives a relevant formant track (F1, F2 and F3) that is closer to the reference. The results of the Fourier and wavelet methods are very close in some cases, since both show fewer errors than the Praat method for the five male speakers; this is not the case for the other vowels, where errors sometimes appear on F2 and sometimes on F3. For continuous speech, we found that for male speakers the results of the two new approaches are notably better than those of Mustafa Kamran's LPC method and those of Praat, even though they often show some errors on F3. They are also very close to the Fourier ridge detection method using the center-of-gravity calculation. The results obtained for female speakers confirm the trend observed for the male speakers.
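To give a feel for how Kalman filtering can stabilise a formant track, the sketch below filters a sequence of per-frame frequency estimates with a scalar Kalman filter. It is an illustration only: the abstract does not specify the state model or how the filter is coupled with dynamic programming, so a random-walk model is assumed here, and the variances `process_var` and `meas_var` are hypothetical tuning values in Hz squared.

```python
import numpy as np

def kalman_smooth_track(measurements, process_var=50.0 ** 2, meas_var=150.0 ** 2):
    """Filter a noisy formant-frequency track (Hz) with a scalar random-walk Kalman filter."""
    z = np.asarray(measurements, dtype=float)
    x = z[0]          # initial state estimate: the first measurement
    p = meas_var      # initial uncertainty of that estimate
    out = np.empty_like(z)
    out[0] = x
    for t in range(1, len(z)):
        # Predict: a random walk keeps the mean and inflates the variance.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + meas_var)
        x = x + gain * (z[t] - x)
        p = (1.0 - gain) * p
        out[t] = x
    return out
```

A larger `process_var` lets the track follow rapid formant transitions more closely, at the cost of passing through more measurement noise.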

APA, Harvard, Vancouver, ISO, and other styles

37

Vatte, Madhu Latha Reddy. "Readout Circuitry for a Logarithmic CMOS Active Pixel Sensor That Facilities High Speed Image Processing." University of Akron / OhioLINK, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=akron1278549382.

Full text

APA, Harvard, Vancouver, ISO, and other styles

38

Lai, Ying-Chun. "A Development of a Common-Mode Filter Using an EBG Structure in High Speed Serial Links." Thesis, KTH, Elektroteknisk teori och konstruktion, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-104986.

Full text

Abstract:

As signal speed increases and electronic products become progressively smaller, the risks of electromagnetic radiation and interference are also heightened. Ericsson's SCXB, an Ethernet switch card, experiences exactly this problem, with excessive emission levels probably caused by common-mode noise. In this project, a common-mode filter using the electromagnetic bandgap (EBG) structure has been designed and implemented in the SCXB. Unlike conventional common-mode filters, the common-mode filter is embedded in the printed circuit board (PCB) beneath the differential lines. The effect of the common-mode filter is assessed by measuring the insertion loss and the power radiation of a shielded cable connected to the common-mode filter. A compact common-mode filter using an EBG structure has been proposed in this project and this works effectively at 937.5 MHz. One of the results from the parametric analysis shows that the common-mode filter is suitable to work in a high frequency range due to the smaller structure and the wider bandwidth range. The common-mode filter is constructed with the PCB fabrication process. No additional components are necessary, although more layers of the PCB's stack up are required in which to embed the common-mode filter.

APA, Harvard, Vancouver, ISO, and other styles

39

Hamlet, Sean Michael. "COMPARING ACOUSTIC GLOTTAL FEATURE EXTRACTION METHODS WITH SIMULTANEOUSLY RECORDED HIGH-SPEED VIDEO FEATURES FOR CLINICALLY OBTAINED DATA." UKnowledge, 2012. http://uknowledge.uky.edu/ece_etds/12.

Full text

Abstract:

Accurate methods for glottal feature extraction include the use of high-speed video imaging (HSVI). There have been previous attempts to extract these features from the acoustic recording. However, none of these methods compare their results with an objective method such as HSVI. This thesis tests these acoustic methods on a large, diverse population of 46 subjects. Two previously studied acoustic methods, as well as one introduced in this thesis, were compared against two video methods, area and displacement, for open quotient (OQ) estimation. The area comparison proved to be somewhat ambiguous and challenging due to thresholding effects. The displacement comparison, which is based on glottal edge tracking, proved to be more robust than the area comparison. The first acoustic method's OQ estimate had a relatively small average error of 8.90% and the second method had a relatively large average error of -59.05% compared to the displacement OQ. The newly proposed method had a relatively small error of -13.75% compared to the displacement OQ. Although the acoustic methods showed relatively high error, there was some success, and they may be used to augment the features collected by HSVI for more accurate glottal feature estimation.

40

Pokora, C. D. "Spatio-temporal correlations of jets using high-speed particle image velocimetry." Thesis, Loughborough University, 2009. https://dspace.lboro.ac.uk/2134/13185.

Abstract:

The major source of aircraft noise at take-off is jet noise. If jet noise is not adequately addressed, environmental impact concerns will constrain the planned growth of the air transport system. A considerable amount of research worldwide has therefore been aimed at identifying ways to reduce jet noise, including development of a predictive tool that can estimate the noise generated by new nozzle designs. Current noise prediction techniques, however, still require the input of empirically calibrated noise source models, and their performance is still inadequate. In addition, development of detailed noise source identification measurements and the associated understanding of how to control (and reduce) the noise at the source has been limited. The fundamental turbulence property which acts as the source of propagating noise in shear layers is the two-point space-time velocity correlation (Rijkl). Very few measurements exist for this property to guide model development. It is therefore the aim of the work reported in this thesis to provide new experimental data that helps identify the turbulence sources located within the shear layer of jets. The technique of Particle Image Velocimetry (PIV) is used to capture directly the flowfield and all relevant turbulent statistics.

41

Chenais, Patrick. "Une carte de traitement et de reconnaissance de la parole : etude de cibles acoustiques." Toulouse 3, 1987. http://www.theses.fr/1987TOU30009.

Abstract:

Design of a digital speech processing board. The intended application is the control of actuators in an environment control device. The hardware and software implementations are described. To reduce cost, the basic idea behind the architecture is to integrate the recognition function and the control function on the same module. Recognition uses five acoustic cues, which are computed in real time by the signal processor. The segmentation of speech into discrete units is approached through a strategy of searching for acoustic targets. Recognition is performed by dynamic programming with a weighted distance; the question of rejecting or accepting the proposed candidates is treated as an automatic classification problem.

42

Caliskan, Hakan. "Modeling And Experimental Evaluation Of Variable Speed Pump And Valve Controlled Hydraulic Servo Drives." Master's thesis, METU, 2009. http://etd.lib.metu.edu.tr/upload/3/12611090/index.pdf.

Abstract:

In this thesis study, a valveless hydraulic servo system controlled by two pumps is investigated and its performance characteristics are compared with a conventional valve controlled system both experimentally and analytically. The two control techniques are applied to the position control of a single rod linear actuator. In the valve controlled system, the flow rate through the actuator is regulated with a servovalve, whereas in the pump controlled system, two variable speed pumps driven by servomotors regulate the flow rate according to the needs of the system, thus eliminating the valve losses. To understand the dynamic behavior of the two systems, the order of the differential equations defining the dynamics of both systems is reduced by using the fact that the dynamic pressure changes in the hydraulic cylinder chambers become linearly dependent on leakage coefficients and cylinder chamber volumes above and below some prescribed cut off frequencies. Thus the open loop speed responses of the pump controlled and valve controlled systems are defined by second order transfer functions. The two systems are modeled in the MATLAB Simulink environment and the assumptions are validated. For the position control of the single rod hydraulic actuator, a linear state feedback control scheme is applied. Its state feedback gains are determined by using the linear and linearized reduced order dynamic system equations. A linear Kalman filter for the pump controlled system and an unscented Kalman filter for the valve controlled system are designed for estimation and filtering purposes. The dynamic performances of both systems are investigated on an experimental test set up developed by conducting open loop and closed loop frequency response and step response tests. The MATLAB Real Time Windows Target (RTWT) module is used in the tests for application purposes.

43

Buyukkeles, Umit. "Improved Torque And Speed Control Performance In A Vector-controlled Pwm-vsi Fed Surface-mounted Pmsm Drive With Conventional P-i Controllers." Master's thesis, METU, 2012. http://etd.lib.metu.edu.tr/upload/12614294/index.pdf.

Abstract:

In this thesis, high performance torque and speed control for a surface-mounted permanent magnet synchronous machine (PMSM) is designed, simulated and implemented. A three-phase two-level pulse width modulation voltage-source inverter (PWM-VSI) with power MOSFETs is used to feed the PMSM. The study has three objectives. The first is to compensate the voltage disturbance caused by nonideal characteristics of the voltage-source inverter (VSI). The second is to decouple the coupled variables in the synchronous reference frame model of the PMSM. The last is to design a load torque estimator in order to increase the disturbance rejection capability of the speed control. The angular acceleration required for load torque estimation is extracted through a Kalman filter from noisy velocity measurements. Proposed methods for improved torque and speed control performance are verified through simulations and experimental tests. The drive system is modeled in Matlab/Simulink, and control algorithms are developed based on this model. The experimental drive system comprises a three-phase VSI and a 385 W surface-mounted PMSM. Control algorithms developed in the study have been implemented in a digital signal processor (DSP) board and tested comprehensively. With the use of the proposed methods, a considerable improvement of torque and speed control performance has been achieved.
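
The acceleration-from-velocity step described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the two-state constant-acceleration model, the function name `kalman_velocity_acceleration`, and the noise parameters `q` and `r` are all assumptions chosen for illustration.

```python
def kalman_velocity_acceleration(measurements, dt, q=50.0, r=0.04):
    """Estimate velocity and acceleration from noisy velocity samples.

    Assumed model (hypothetical, not taken from the thesis):
    state = [velocity, acceleration], acceleration driven by white
    noise of intensity q, velocity measured with noise variance r.
    """
    w, a = measurements[0], 0.0        # initial state estimate
    p11, p12, p22 = r, 0.0, 100.0      # entries of the symmetric covariance P
    estimates = []
    for z in measurements:
        # Predict with F = [[1, dt], [0, 1]] and P <- F P F^T + Q
        w = w + a * dt
        p11 = p11 + 2 * dt * p12 + dt * dt * p22 + q * dt**3 / 3
        p12 = p12 + dt * p22 + q * dt**2 / 2
        p22 = p22 + q * dt
        # Update with measurement matrix H = [1, 0]
        s = p11 + r                    # innovation variance
        k1, k2 = p11 / s, p12 / s      # Kalman gain
        innovation = z - w
        w += k1 * innovation
        a += k2 * innovation
        p11, p12, p22 = (1 - k1) * p11, (1 - k1) * p12, p22 - k2 * p12
        estimates.append((w, a))
    return estimates
```

Calling the function on a list of sampled velocities returns one `(velocity, acceleration)` pair per sample. A larger `q` makes the acceleration estimate respond faster but pass more measurement noise, which is the usual tuning trade-off for this kind of estimator.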

44

Olsson, Rickard. "Signal processing and high speed imaging as monitoring tools for pulsed laser welding." Licentiate thesis, Luleå tekniska universitet, Produkt- och produktionsutveckling, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-26555.

Abstract:

In Laser Materials Processing there has always been a need for suitable methods to supervise and monitor the processes on line, to ensure correct production quality or to trigger alarms when failures are detected. Numerous investigations have been made in this field, including experimental and theoretical work. It is common practice in this field to monitor surface temperature, plasma radiation and back-reflected laser light, coaxially with the laser beam. Traditionally, the monitoring systems involved carry out no statistical analysis of the signals received; they merely involve thresholds. This thesis looks at the feedback collected during laser welding using such a co-axial setup from a Digital Signal Processing point of view and also uses high speed video photography to correlate signal perturbations with process anomalies. Modern Digital Signal Processing techniques such as Kalman filtering, Principal Component Analysis and Cluster Analysis have been applied to the measurement data and have generated new ways to describe the weld behaviour using parameters such as reflected pulse shape. The limitations of commercially available welding supervision systems have been studied and design suggestions for the next generation of on line weld monitoring equipment have been formulated.
Approved; 2009; 20091103 (ricols); LICENTIATE SEMINAR Subject area: Production Development/Manufacturing Systems Engineering Examiner: Professor Alexander Kaplan, Luleå tekniska universitet Time: Wednesday, 16 December 2009 at 13:00 Venue: E 232, Luleå tekniska universitet

45

Bartholomew, David Ray. "Design of a High Speed Mixed Signal CMOS Mutliplying Circuit." Diss., 2004. http://contentdm.lib.byu.edu/ETD/image/etd362.pdf.

46

Kučera, Jan. "Filtrace paketů ve 100 Gb sítích." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2016. http://www.nusl.cz/ntk/nusl-255423.

Abstract:

This master's thesis deals with the design and implementation of an algorithm for high-speed network packet filtering. The main goal was to provide a hardware architecture that supports large rule sets and can be used in 100 Gbps networks. The system has been designed with respect to implementation on an FPGA card and the time-space complexity trade-off. Properties of the system have been evaluated using various available rule sets. Thanks to the highly optimized and deeply pipelined architecture, it was possible to reach a high working frequency (above 220 MHz) together with a considerable memory reduction (on average about 72% for the compared algorithms). It is also possible to efficiently store up to five thousand filtering rules on an FPGA with only 8% on-chip memory utilization. The architecture allows high-speed network packet filtering at a wire speed of 100 Gbps.

47

Ispir, Mehmet. "Design Of Moving Target Indication Filters With Non-uniform Pulse Repetition Intervals." Master's thesis, METU, 2013. http://etd.lib.metu.edu.tr/upload/12615361/index.pdf.

Abstract:

Staggering the pulse repetition intervals is a widely used solution to alleviate the blind speed problem in Moving Target Indication (MTI) radar systems. With non-uniform sampling it is possible to increase the first blind speed by a factor on the order of ten. The improvement in blind speed results in passband fluctuations that may degrade the detection performance for particular Doppler frequencies. Therefore, it is important to design MTI filters with non-uniform interpulse periods that have minimum passband ripple with sufficient clutter attenuation, along with good range and blind velocity performance. In this thesis work, the design of MTI filters with non-uniform interpulse periods is studied through the least square, convex and min-max filter design methodologies. A trade-off between the contradictory objectives of maximum clutter suppression and minimum desired signal attenuation is established by the introduction of a weight factor into the designs. The weight factor enables the adaptation of the MTI filter to different operational scenarios such as operation under low, medium or high clutter power. The performances of the studied designs are investigated by comparing the frequency response characteristics and the average signal-to-clutter suppression capabilities of the filters with respect to a number of defined performance measures. Two further approaches are considered to increase the signal-to-clutter suppression performance. The first approach is based on a modified min-max filter design, whereas the second focuses on multiple filter implementations. In addition, a detailed review and performance comparison with the non-uniform MTI filter designs from the literature are also given.
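
The blind speed effect described above can be illustrated with a small numerical sketch. The three-pulse canceller weights, the 1 ms pulse repetition interval, and the 9:11 stagger ratio are illustrative assumptions, not values from the thesis, and the weights are not an optimized design of the kind the thesis studies.

```python
import cmath
from itertools import accumulate

def mti_response(weights, intervals, doppler_hz):
    """Magnitude of H(f) = sum_n w_n * exp(-j*2*pi*f*t_n).

    t_n are the pulse transmit times obtained by accumulating the
    (possibly non-uniform) pulse repetition intervals.
    """
    times = [0.0] + list(accumulate(intervals))
    return abs(sum(w * cmath.exp(-2j * cmath.pi * doppler_hz * t)
                   for w, t in zip(weights, times)))

weights = [1.0, -2.0, 1.0]      # three-pulse canceller (illustrative)
uniform = [1.0e-3, 1.0e-3]      # uniform 1 ms intervals
staggered = [0.9e-3, 1.1e-3]    # same mean interval, 9:11 stagger

# Uniform sampling is blind at 1 kHz; the staggered filter is not.
blind_uniform = mti_response(weights, uniform, 1000.0)
visible_staggered = mti_response(weights, staggered, 1000.0)
# The staggered filter is first blind where every f*t_n is an integer.
blind_staggered = mti_response(weights, staggered, 10000.0)
```

With these assumed intervals the first blind Doppler frequency moves from 1 kHz to 10 kHz, which is consistent with the tenfold order of improvement mentioned in the abstract. The response between those frequencies is uneven, which is the passband ripple the thesis sets out to minimize.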

48

Törnquist, Martin. "Investigation of rotational velocity sensors." Thesis, Linköping University, Department of Electrical Engineering, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-15904.

Abstract:

To improve the speed measurement of construction equipment, different sensor technologies have been investigated. Many of these sensor technologies are very interesting, but to limit the scope of the thesis only two were chosen for testing: magnetic absolute angle sensors using Hall and GMR technology. The aim was to investigate whether they are a valid replacement for the current measurement system, which uses a passive sensor. Tests show that these sensors are capable of speed measurement, but because of noisy angle estimates they need filtering for good speed computation. This filtering introduces a large time delay that is significant for the quality of the estimate. A Kalman filter has been implemented in an attempt to lower the time delays, but since only a very simple model has been used it does not give any improvement over ordinary low pass filtering. For these sensors the mounting tolerance is of great interest. For best performance the offset between the sensor and magnet centres needs to be kept small for both sensors, because an offset causes a non-linearity effect. The distance between the sensor and the magnet is not critical for linearity, only for the quality of the signal, which might drop out when the distance is too large. This is where the sensor using GMR technology stands out. Compared to the Hall technology sensor, the GMR sensor can handle distances that are more than 10 times larger. The conclusion is that these sensors can be a valid replacement for the current measurement system. They will introduce more functionality, with the capability of detecting rotational direction and zero velocity. In an application with more than one sensor they can also be used for further purposes, such as detecting slip in clutches. Depending on the application, the time delays may not be critical; otherwise more work needs to be done to improve the estimate, e.g. with a more advanced model for the Kalman filter.

49

Silva, Cristiane Cristina Sousa da. "UM ALGORITMO TIPO RLS BASEADO EM SUPERFÍCIES NÃO QUADRÁTICAS." Universidade Federal do Maranhão, 2013. http://tedebc.ufma.br:8080/jspui/handle/tede/550.

Abstract:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
In adaptive filtering, many adaptive filters are based on the mean square error (MSE) method. These filters were developed to improve convergence speed with a lower misadjustment. The least mean square (LMS) and the recursive least square (RLS) algorithms have been the hallmark of adaptive filtering. In this work we develop adaptive algorithms based on the even powers of the error, inspired by the recursive least square (RLS) algorithm, namely the recursive non-quadratic (RNQ) algorithm. The idea is based on Widrow's least mean fourth (LMF) algorithm. First we derive equations based on a single even power of the error in order to obtain criteria that guarantee convergence. We also determine equations that measure the misadjustment and the time constant of the adaptive process of the RNQ algorithm. We also work toward making the algorithm less sensitive to the size of the error in an alternative direction, by proposing a cost function which is a sum of the even powers of the error. This second approach brings the error explicitly into the RLS algorithm formulation by proposing a new cost function that preserves the mean-square-error (MSE) solution but allows for the exploitation of higher order moments of the error to speed up the convergence of the algorithm. The main goal of this work is to create from first principles (new cost functions) a mechanism to include instantaneous error information in the RLS algorithm, make it track better, and allow for the design of the forgetting factor. As we will see, the key aspect of our approach is to include the error in the Kalman gain that effectively controls the speed of adaptation of the RLS algorithm.
In adaptive filtering, many filters are based on the mean squared error (MSE) method, and many of them were developed to achieve fast convergence with lower misadjustment. The least mean square (LMS) and recursive least square (RLS) algorithms were milestones in adaptive filtering. In this work we present the development of a family of adaptive algorithms based on the even powers of the error, inspired by the derivation of the standard RLS algorithm. We call these new algorithms recursive non-quadratic (RNQ). The basic idea builds on the cost function presented by Widrow in the least mean fourth (LMF) algorithm. We first derive equations based on a single even power of the error to obtain criteria that guarantee convergence. We also determine equations that define the misadjustment and the learning time of the adaptation process of the RNQ algorithm based on an arbitrary even power. We also work toward making the algorithm less sensitive to the size of the error in an alternative direction, by proposing a cost function based on the sum of the even powers of the error. This second approach makes the role of the error explicit in the RLS formulation by proposing a new cost function that preserves the MSE solution but allows the higher-order moments of the error to be used to increase the convergence speed of the algorithm. The main goal of our work is to create from first principles (new cost functions) a mechanism to include instantaneous error information in the RLS algorithm and make it track better. Thus, the key aspect of this new approach is to include the error in the Kalman gain, which effectively controls the adaptation speed of the RLS algorithm.
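
For orientation, the standard exponentially weighted RLS recursion that the abstract takes as its starting point can be sketched as follows. This is the textbook algorithm, not the RNQ variant proposed in the thesis; the function name `rls_identify` and the defaults for the forgetting factor `lam` and the initialization constant `delta` are assumptions made for illustration.

```python
def rls_identify(inputs, desired, order, lam=0.99, delta=100.0):
    """Standard RLS for an FIR filter of the given order.

    lam is the forgetting factor; P starts as delta * I.
    Returns the final weight vector.
    """
    w = [0.0] * order
    P = [[delta if i == j else 0.0 for j in range(order)]
         for i in range(order)]
    x = [0.0] * order
    for u, d in zip(inputs, desired):
        x = [u] + x[:-1]                        # tapped delay line
        Px = [sum(P[i][j] * x[j] for j in range(order))
              for i in range(order)]
        denom = lam + sum(x[i] * Px[i] for i in range(order))
        k = [v / denom for v in Px]             # gain vector (the "Kalman gain")
        e = d - sum(w[i] * x[i] for i in range(order))   # a priori error
        w = [w[i] + k[i] * e for i in range(order)]
        # P <- (P - k x^T P) / lam, using x^T P = Px^T because P is symmetric
        P = [[(P[i][j] - k[i] * Px[j]) / lam for j in range(order)]
             for i in range(order)]
    return w
```

In this standard form the gain vector `k` depends only on the input history, not on the error `e`. The abstract's stated key idea is to let the error enter that gain, which is the step this sketch does not attempt to reproduce.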

50

Hodaň, David. "Možnosti akcelerace symbolické regrese pomocí kartézského genetického programování." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2019. http://www.nusl.cz/ntk/nusl-403198.

Abstract:

This thesis is focused on finding procedures that would accelerate symbolic regression in Cartesian Genetic Programming. It describes Cartesian Genetic Programming and its use in the task of symbolic regression. It deals with the SIMD architecture and the SSE and AVX instruction sets. Several optimizations that lead to a significant acceleration of evolution in Cartesian Genetic Programming are presented. A method of bit-level parallel simulation that uses AVX2 vectors makes it possible to process 256 input combinations of a logic circuit in parallel. Similarly, it is possible to use byte-level parallel simulation and work with 32 bytes when evolving an image filter. A new method of batch mutation can accelerate the evolution of combinational logic circuits by up to a thousand times, depending on the problem size. For example, using a combination of these and other methods, the evolution of 5 x 5b multipliers took 5.8 seconds on average on an Intel Core i5-4590 processor.
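
The idea of bit-level parallel simulation can be sketched in a few lines. The example below is only an illustration of the principle: Python integers stand in for the AVX2 vectors used in the thesis, and the helper names `input_patterns`, `full_adder`, and `fitness` are hypothetical.

```python
def input_patterns(n_inputs):
    """Pack the truth-table column of each input into one integer.

    Bit k of pattern i holds the value of input i in input combination k,
    so a single bitwise operation evaluates a gate for all combinations.
    """
    n_rows = 1 << n_inputs
    return [sum(((k >> i) & 1) << k for k in range(n_rows))
            for i in range(n_inputs)]

def full_adder(a, b, c):
    """Candidate circuit: evaluated on every input combination at once."""
    s = a ^ b ^ c
    carry = (a & b) | (c & (a ^ b))
    return s, carry

def fitness(outputs, targets, n_rows):
    """Count output bits that match the target truth table."""
    mask = (1 << n_rows) - 1
    return sum(bin(~(o ^ t) & mask).count("1")
               for o, t in zip(outputs, targets))

a, b, c = input_patterns(3)
outputs = full_adder(a, b, c)
```

With eight inputs each pattern occupies 256 bits, which matches the number of input combinations the abstract says one AVX2 vector can process in parallel.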
