Doctoral dissertations on the topic "Speaker verification system"
Browse the top 43 doctoral dissertations on the topic "Speaker verification system".
Nosratighods, Mohaddeseh. "Robust speaker verification system". University of New South Wales, Electrical Engineering & Telecommunications, 2008. http://handle.unsw.edu.au/1959.4/42796.
Sarma, Sridevi Vedula. "A segment-based speaker verification system using SUMMIT". Thesis, Massachusetts Institute of Technology, 1997. http://hdl.handle.net/1721.1/43406.
Includes bibliographical references (p. 75-79).
by Sridevi Vedula Sarma.
M.S.
Zhou, Yichao. "Lip password-based speaker verification system with unknown language alphabet". HKBU Institutional Repository, 2018. https://repository.hkbu.edu.hk/etd_oa/562.
Mtibaa, Aymen. "Towards robust and privacy-preserving speaker verification systems". Electronic Thesis or Diss., Institut polytechnique de Paris, 2022. http://www.theses.fr/2022IPPAS002.
Speaker verification systems are a key technology in many devices and services, such as smartphones, intelligent digital assistants, healthcare, and banking applications. Additionally, with the COVID pandemic, access control systems based on fingerprint scanners or keypads increase the risk of virus propagation. Therefore, companies are now rethinking their employee access control systems and considering touchless authorization technologies, such as speaker verification systems. However, a speaker verification system requires users to transmit their recordings, features, or models derived from their voice samples, without any obfuscation, over untrusted public networks, to be stored and processed on a cloud-based infrastructure. If the system is compromised, an adversary can use this biometric information to impersonate the genuine user and extract personal information. The voice samples may contain information about the user's gender, accent, ethnicity, and health status, which raises several privacy issues. In this context, the present PhD thesis addresses the privacy and security issues of speaker verification systems based on Gaussian mixture models (GMM), i-vectors, and x-vectors as speaker models. The objective is the development of speaker verification systems that perform biometric verification while preserving the privacy and security of the user. To that end, we propose biometric protection schemes for speaker verification systems that achieve the privacy requirements (revocability, unlinkability, irreversibility) described in the ISO/IEC 24745 standard on biometric information protection and improve the robustness of the systems against different attack scenarios.
Li, Yi. "Speaker Diarization System for Call-center data". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-286677.
To answer the question of who spoke when, speaker diarization (SD) is a critical step for many speech applications in practice. The task of our project is to build an MFCC-vector-based speaker diarization system on top of a speaker verification (SV) system, which is an existing call-center application for checking a customer's identity from a phone call. Our diarization system uses 13-dimensional MFCCs as features and performs voice activity detection (VAD), segmentation, linear clustering, and hierarchical clustering based on GMM and BIC scores. By applying it, we reduce the EER (equal error rate) from 18.1% in the baseline experiment to 3.26% on the general call-center conversations. To better analyze and evaluate the system, we also simulated a set of call-center data based on the public ICSI corpus audio database.
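The equal error rate (EER) quoted in the abstract above is the operating point at which the false acceptance rate and the false rejection rate are equal. A minimal Python sketch of how it can be estimated from two lists of trial scores follows. The function name `equal_error_rate`, the threshold sweep over observed scores, and the accept-if-score-at-or-above-threshold rule are illustrative assumptions, not the thesis's evaluation code.

```python
from typing import Sequence


def equal_error_rate(genuine: Sequence[float], impostor: Sequence[float]) -> float:
    """Estimate the EER by sweeping a threshold over all observed scores.

    Assumption: a trial is accepted when its score is >= the threshold.
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")
    best_gap = float("inf")
    eer = 1.0
    for thr in sorted(set(genuine) | set(impostor)):
        # False acceptance rate: impostor trials that pass the threshold.
        far = sum(s >= thr for s in impostor) / len(impostor)
        # False rejection rate: genuine trials that fail the threshold.
        frr = sum(s < thr for s in genuine) / len(genuine)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2.0
    return eer


# Toy scores (hypothetical): one genuine trial scores low, one impostor scores high.
toy_eer = equal_error_rate([0.9, 0.8, 0.7, 0.3], [0.1, 0.2, 0.4, 0.6])
```

With finite score lists the two error rates rarely cross exactly, so the sketch reports the average of the two rates at the closest threshold; evaluation toolkits often interpolate instead.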
Guo, Yunfei. "Personalized Voice Activated Grasping System for a Robotic Exoskeleton Glove". Thesis, Virginia Tech, 2021. http://hdl.handle.net/10919/101751.
Master of Science
The robotic exoskeleton glove used in this research is designed to help patients with hand disabilities. This thesis proposes a voice-activated grasping system to control the exoskeleton glove. The user can use a self-defined keyword to activate the exoskeleton and use voice to control it. The voice command system can distinguish between different users' voices, thereby improving the safety of the glove control. A smartphone is used to process the voice commands and send them to an onboard computer on the exoskeleton glove. The exoskeleton glove then accurately applies force to each fingertip using a force feedback actuator. This study focused on designing a state-of-the-art human-machine interface to control an exoskeleton glove and perform an accurate and stable grasp.
Bekli, Zeid, and William Ouda. "A performance measurement of a Speaker Verification system based on a variance in data collection for Gaussian Mixture Model and Universal Background Model". Thesis, Malmö universitet, Fakulteten för teknik och samhälle (TS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-20122.
Shou-Chun, Yin 1980. "Speaker adaptation in joint factor analysis based text independent speaker verification". Thesis, McGill University, 2006. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=100735.
Chan, Siu Man. "Improved speaker verification with discrimination power weighting /". View abstract or full-text, 2004. http://library.ust.hk/cgi/db/thesis.pl?ELEC%202004%20CHANS.
Includes bibliographical references (leaves 86-93). Also available in electronic version. Access restricted to campus users.
Cilliers, Francois Dirk. "Tree-based Gaussian mixture models for speaker verification". Thesis, Link to the online version, 2005. http://hdl.handle.net/10019.1/1639.
Wan, Qianhui. "Speaker Verification Systems Under Various Noise and SNR Conditions". Thesis, Université d'Ottawa / University of Ottawa, 2017. http://hdl.handle.net/10393/36888.
Wark, Timothy J. "Multi-modal speech processing for automatic speaker recognition". Thesis, Queensland University of Technology, 2001.
Phythian, Mark. "Speaker identification for forensic applications". Thesis, Queensland University of Technology, 1998. https://eprints.qut.edu.au/36079/3/__qut.edu.au_Documents_StaffHome_StaffGroupR%24_rogersjm_Desktop_36079_Digitised%20Thesis.pdf.
Slomka, Stefan. "Multiple classifier structures for automatic speaker recognition under adverse conditions". Thesis, Queensland University of Technology, 1999.
Leis, John W. "Spectral coding methods for speech compression and speaker identification". Thesis, Queensland University of Technology, 1998. https://eprints.qut.edu.au/36062/7/36062_Digitised_Thesis.pdf.
Barger, Peter James. "Speech processing for forensic applications". Thesis, Queensland University of Technology, 1998. https://eprints.qut.edu.au/36081/1/36081_Barger_1998.pdf.
PINHEIRO, Hector Natan Batista. "Verificação de locutores independente de texto: uma análise de robustez a ruído". Universidade Federal de Pernambuco, 2015. https://repositorio.ufpe.br/handle/123456789/18045.
The personal identification of individuals is a task executed millions of times every day by organizations from diverse fields. Questions such as "Who is this individual?" or "Is this person who he or she claims to be?" are constantly asked by organizations in financial services, health care, e-commerce, telecommunication systems and governments. Biometric identification is the process of identifying people using their physiological or behavioral characteristics. These characteristics are generally known as biometrics; examples include face, fingerprint, iris, handwriting and speech. Speaker recognition is a biometric modality that performs personal identification using speaker-specific information from speech. This work focuses on the development of text-independent speaker verification systems. In these systems, speech from an individual is used to verify the claimed identity of that individual. Furthermore, the verification must occur independently of the pronounced word or phrase. The main challenge in the development of speaker recognition systems comes from the mismatches that may occur in the acquisition of the speech signals. The techniques proposed to mitigate the mismatch effects are referred to as compensation methods. They may operate in three domains: the feature extraction process, the estimation of the speaker models, and the computation of the decision score. Besides presenting a broad description of the main techniques used in the development of text-independent speaker verification systems, this work describes the main feature-, model- and score-based compensation methods. In the experiments, this work presents comprehensive comparisons between the conventional techniques and the alternative compensation methods. Furthermore, two compensation methods are proposed: one operates in the model domain and the other in the score domain.
The proposed score-domain compensation method is based on the Normal cumulative distribution function and, in some contexts, outperformed the main score-domain compensation techniques. The model-domain compensation technique proposed in this work, on the other hand, is based on a method from the literature that combines two concepts: multi-condition training and Missing Data Theory. The formulation proposed by its authors is based on Posterior Union Models and is not completely appropriate for the text-independent speaker verification task. This work proposes a more appropriate formulation for this context, which combines the concepts used by those authors with a type of modeling using Universal Background Models (UBMs). The proposed method outperformed the usual GMM-UBM modeling technique, based on Gaussian Mixture Models (GMMs).
Lucey, Simon. "Audio-visual speech processing". Thesis, Queensland University of Technology, 2002. https://eprints.qut.edu.au/36172/7/SimonLuceyPhDThesis.pdf.
"Text-independent bilingual speaker verification system". 2003. http://library.cuhk.edu.hk/record=b5891732.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2003.
Includes bibliographical references (leaves 96-102).
Abstracts in English and Chinese.
Abstract --- p.i
Acknowledgement --- p.iv
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Biometrics --- p.2
Chapter 1.2 --- Speaker Verification --- p.3
Chapter 1.3 --- Overview of Speaker Verification Systems --- p.4
Chapter 1.4 --- Text Dependency --- p.4
Chapter 1.4.1 --- Text-Dependent Speaker Verification --- p.5
Chapter 1.4.2 --- GMM-based Speaker Verification --- p.6
Chapter 1.5 --- Language Dependency --- p.6
Chapter 1.6 --- Normalization Techniques --- p.7
Chapter 1.7 --- Objectives of the Thesis --- p.8
Chapter 1.8 --- Thesis Organization --- p.8
Chapter 2 --- Background --- p.10
Chapter 2.1 --- Background Information --- p.11
Chapter 2.1.1 --- Speech Signal Acquisition --- p.11
Chapter 2.1.2 --- Speech Processing --- p.11
Chapter 2.1.3 --- Engineering Model of Speech Signal --- p.13
Chapter 2.1.4 --- Speaker Information in the Speech Signal --- p.14
Chapter 2.1.5 --- Feature Parameters --- p.15
Chapter 2.1.5.1 --- Mel-Frequency Cepstral Coefficients --- p.16
Chapter 2.1.5.2 --- Linear Predictive Coding Derived Cepstral Coefficients --- p.18
Chapter 2.1.5.3 --- Energy Measures --- p.20
Chapter 2.1.5.4 --- Derivatives of Cepstral Coefficients --- p.21
Chapter 2.1.6 --- Evaluating Speaker Verification Systems --- p.22
Chapter 2.2 --- Common Techniques --- p.24
Chapter 2.2.1 --- Template Model Matching Methods --- p.25
Chapter 2.2.2 --- Statistical Model Methods --- p.26
Chapter 2.2.2.1 --- HMM Modeling Technique --- p.27
Chapter 2.2.2.2 --- GMM Modeling Techniques --- p.30
Chapter 2.2.2.3 --- Gaussian Mixture Model --- p.31
Chapter 2.2.2.4 --- The Advantages of GMM --- p.32
Chapter 2.2.3 --- Likelihood Scoring --- p.32
Chapter 2.2.4 --- General Approach to Decision Making --- p.35
Chapter 2.2.5 --- Cohort Normalization --- p.35
Chapter 2.2.5.1 --- Probability Score Normalization --- p.36
Chapter 2.2.5.2 --- Cohort Selection --- p.37
Chapter 2.3 --- Chapter Summary --- p.38
Chapter 3 --- Experimental Corpora --- p.39
Chapter 3.1 --- The YOHO Corpus --- p.39
Chapter 3.1.1 --- Design of the YOHO Corpus --- p.39
Chapter 3.1.2 --- Data Collection Process of the YOHO Corpus --- p.40
Chapter 3.1.3 --- Experimentation with the YOHO Corpus --- p.41
Chapter 3.2 --- CUHK Bilingual Speaker Verification Corpus --- p.42
Chapter 3.2.1 --- Design of the CUBS Corpus --- p.42
Chapter 3.2.2 --- Data Collection Process for the CUBS Corpus --- p.44
Chapter 3.3 --- Chapter Summary --- p.46
Chapter 4 --- Text-Dependent Speaker Verification --- p.47
Chapter 4.1 --- Front-End Processing on the YOHO Corpus --- p.48
Chapter 4.2 --- Cohort Normalization Setup --- p.50
Chapter 4.3 --- HMM-based Speaker Verification Experiments --- p.53
Chapter 4.3.1 --- Subword HMM Models --- p.53
Chapter 4.3.2 --- Experimental Results --- p.55
Chapter 4.3.2.1 --- Comparison of Feature Representations --- p.55
Chapter 4.3.2.2 --- Effect of Cohort Normalization --- p.58
Chapter 4.4 --- Experiments on GMM-based Speaker Verification --- p.61
Chapter 4.4.1 --- Experimental Setup --- p.61
Chapter 4.4.2 --- The number of Gaussian Mixture Components --- p.62
Chapter 4.4.3 --- The Effect of Cohort Normalization --- p.64
Chapter 4.4.4 --- Comparison of HMM and GMM --- p.65
Chapter 4.5 --- Comparison with Previous Systems --- p.67
Chapter 4.6 --- Chapter Summary --- p.70
Chapter 5 --- Language- and Text-Independent Speaker Verification --- p.71
Chapter 5.1 --- Front-End Processing of the CUBS --- p.72
Chapter 5.2 --- Language- and Text-Independent Speaker Modeling --- p.73
Chapter 5.3 --- Cohort Normalization --- p.74
Chapter 5.4 --- Experimental Results and Analysis --- p.75
Chapter 5.4.1 --- Number of Gaussian Mixture Components --- p.78
Chapter 5.4.2 --- The Cohort Normalization Effect --- p.79
Chapter 5.4.3 --- Language Dependency --- p.80
Chapter 5.4.4 --- Language-Independency --- p.83
Chapter 5.5 --- Chapter Summary --- p.88
Chapter 6 --- Conclusions and Future Work --- p.90
Chapter 6.1 --- Summary --- p.90
Chapter 6.1.1 --- Feature Comparison --- p.91
Chapter 6.1.2 --- HMM Modeling --- p.91
Chapter 6.1.3 --- GMM Modeling --- p.91
Chapter 6.1.4 --- Cohort Normalization --- p.92
Chapter 6.1.5 --- Language Dependency --- p.92
Chapter 6.2 --- Future Work --- p.93
Chapter 6.2.1 --- Feature Parameters --- p.93
Chapter 6.2.2 --- Model Quality --- p.93
Chapter 6.2.2.1 --- Variance Flooring --- p.93
Chapter 6.2.2.2 --- Silence Detection --- p.94
Chapter 6.2.3 --- Conversational Speaker Verification --- p.95
Bibliography --- p.102
Huang, Wei-Hsun, and 黃威勛. "A Study of Speaker Verification System". Thesis, 2008. http://ndltd.ncl.edu.tw/handle/59811329629420223200.
大葉大學
電信工程學系碩士班
96
The main purpose of speaker verification is to identify the speaker from information carried in the voice signal, and a computer needs many processing steps to capture the differences between such signals. In this thesis, Mel-frequency cepstral coefficients (MFCCs) are used as voice features to match the characteristics of human pronunciation and hearing. The Gaussian mixture model is widely used in text-independent speaker verification. However, differences between speakers' voices are caused not only by different oral cavity shapes and vocal cords, but also by articulation speed. Because the Gaussian mixture model does not account for differences in articulation speed, a high-order ergodic Gaussian model is adopted in this thesis to implement the text-independent speaker verification system. The two models are tested under the same conditions, and the results show that the high-order ergodic Gaussian model improves performance: the equal error rate is reduced by 3.8 percentage points.
Shiy, Zhi-Hong, and 許志宏. "A Study of Speaker Verification System". Thesis, 2007. http://ndltd.ncl.edu.tw/handle/30030484954050201539.
清雲科技大學
電機工程研究所
95
With the progress of technology, speech recognition has become mature enough for use in daily life, and speaker recognition, a subfield of speech recognition, is often used in security systems. There are many different speaker recognition techniques (or methods), and each has its strengths and drawbacks. In this thesis, we discuss the differences between these methods. Chapter one introduces the background and motivation of the study and outlines each chapter. Chapter two discusses the signal processing steps that precede speech recognition: framing, end-point detection, Hamming windowing, and feature selection. Chapter three explains two matching methods, dynamic programming and vector quantization. Chapter four discusses the experimental results, which were obtained using Matlab. In our experiments, the characteristic parameters of a speaker are Mel-frequency features, and we vary the experimental parameters to observe their influence on the verification rate. These parameters are the number of samples in each frame, the number of overlapping samples between frames, and the dimension of the Mel-frequency features.
劉耀隆. "Parameter choice for speaker verification system and imitating voiceprint verification". Thesis, 2005. http://ndltd.ncl.edu.tw/handle/grmqju.
Yu-hong, Li, and 李昱鴻. "An Enhanced Text-Independent Speaker Verification System". Thesis, 2005. http://ndltd.ncl.edu.tw/handle/90462488993922482956.
國立中興大學
電機工程學系
93
Speaker verification is an important technique in security and crime monitoring. In this thesis, we propose two algorithms for a traditional text-independent speaker verification system. First, an entropy-based algorithm is used for endpoint detection and to determine the test utterance length. Next, a normalized background model is proposed to enhance the verification rate; it differs from the traditional background model in its reduced computation. Experimental results demonstrate that the proposed algorithms are efficient for text-independent speaker verification: the normalized background model yields a more compact score distribution and a lower equal error rate, and the results for the entropy-based algorithm show that the improved feature can be used successfully in noisy environments.
xing-min-lin and 林幸民. "A Robust Text Dependent Speaker Verification System". Thesis, 2004. http://ndltd.ncl.edu.tw/handle/73367737259637336248.
國立中興大學
電機工程學系
92
With the growth of the computer industry, people have higher demands for security. Thus, speaker verification with a high recognition rate and low cost is indispensable. In general, in a quiet laboratory, speaker verification can reach a high recognition rate. However, the recognition rate can vary considerably across different channels. Therefore, improving the recognition rate across different channels is the major issue addressed in this thesis. Volume normalization and cepstral normalization are added to increase the speaker verification rate. We tested many voice recordings in quiet and in noisy environments, and also tested speech over different channels. Simulation results show that cepstral normalization reduces the channel effect and increases the speaker verification rate. Volume normalization also improves the speaker verification rate in a quiet environment.
Chung-Ying, Hsieh. "An Improved Speaker Verification System Using Orthogonal GMM". 2006. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0005-1508200621115700.
Li, An-Chi, and 李安基. "Noise Reduction for Text Dependent Speaker Verification System". Thesis, 2008. http://ndltd.ncl.edu.tw/handle/28242714521701538505.
國立中興大學
電機工程學系所
96
Speaker verification systems are maturing and their range of applications is expanding. Raising the recognition rate is the key issue in speech recognition. In this thesis, we use several noise reduction methods to reduce the noise in speech and raise the recognition rate. Two major methods were used to reduce noise for text-dependent speaker verification. In the speaker verification experiment, the speech signals were taken from the MMLab database, NCHU, and 100 speakers (50 male, 50 female) were used in the test. The tests show that the cepstral mean subtraction (CMS) noise reduction method effectively increases the speaker verification rate. Adding the cepstral weighting (CW) noise reduction method further improves verification performance.
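Cepstral mean subtraction, named in the abstract above, removes from every cepstral coefficient its average over the frames of an utterance; a stationary channel effect appears as a constant offset in the cepstral domain and is cancelled by this step. The sketch below illustrates the idea in plain Python. The function name, the list-of-lists frame layout, and the toy values are assumptions made for illustration and are not taken from the thesis.

```python
from typing import List


def cepstral_mean_subtraction(frames: List[List[float]]) -> List[List[float]]:
    """Subtract the per-coefficient mean over all frames of one utterance.

    `frames` is assumed to hold one list of cepstral coefficients per frame,
    all of the same length.
    """
    if not frames:
        return []
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    # Mean of each coefficient across the whole utterance.
    means = [sum(f[k] for f in frames) / n_frames for k in range(n_coeffs)]
    return [[f[k] - means[k] for k in range(n_coeffs)] for f in frames]


# Toy example: a constant offset stands in for a stationary channel effect.
clean = [[1.0, -2.0], [3.0, 0.0], [-4.0, 2.0]]
shifted = [[c + 5.0 for c in frame] for frame in clean]
normalized_clean = cepstral_mean_subtraction(clean)
normalized_shifted = cepstral_mean_subtraction(shifted)
```

In this toy case the clean and the offset utterance map to the same normalized features, which is the property that makes the method useful against channel mismatch.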
Chen, Bo-ren, and 陳柏仁. "The Application of Voting to the Speaker Verification System". Thesis, 2007. http://ndltd.ncl.edu.tw/handle/78831977862308500010.
國立中央大學
電機工程研究所
95
This thesis applies a new kind of score computation, voting, to the speaker verification system and improves its performance. We combine voting with test normalization and propose four new speaker verification systems, of which the improved hybrid speaker verification system achieves the greatest improvement. The experimental results show that, compared with the traditional speaker verification system, the improved hybrid system improves the EER by up to 3.25% and the DCF by up to 0.0402. Compared with the test normalization speaker verification system, it improves the EER by up to 0.59% and the DCF by up to 0.0022. The proposed systems can complement the test normalization speaker verification system: they supply additional speaker information and improve the performance of speaker verification.
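Test normalization (T-norm), which the abstract above combines with voting, is commonly defined as standardizing a raw verification score with the mean and standard deviation of the scores that the same test utterance obtains against a set of impostor (cohort) models. The sketch below shows that standardization only. The function name, the use of the population standard deviation, and the toy scores are assumptions; the voting scheme is not specified in the abstract and is not sketched here.

```python
import statistics
from typing import Sequence


def t_norm(raw_score: float, cohort_scores: Sequence[float]) -> float:
    """Standardize a raw score with cohort statistics from the same test utterance.

    `cohort_scores` are assumed to be the scores of the test utterance against
    a set of impostor models (hypothetical input for this sketch).
    """
    if len(cohort_scores) < 2:
        raise ValueError("need at least two cohort scores")
    mu = statistics.mean(cohort_scores)
    # Population standard deviation; a sample estimate is an equally valid choice.
    sigma = statistics.pstdev(cohort_scores)
    if sigma == 0.0:
        raise ValueError("cohort scores have zero spread")
    return (raw_score - mu) / sigma


# Toy values (hypothetical): cohort mean 0 and standard deviation 1.
normalized = t_norm(3.0, [-1.0, 1.0])
```

Because the cohort statistics are computed per test utterance, the normalized scores of different utterances become more comparable against a single decision threshold.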
Chang, Sheng-Jyun, and 張勝鈞. "Double Feature Extraction for Text Dependent Speaker Verification System". Thesis, 2007. http://ndltd.ncl.edu.tw/handle/12043081480115871273.
中興大學
電機工程學系所
95
In recent years, the applications of speaker verification have expanded and the importance of its study is increasing. In this thesis, we develop a combined feature set and use it in place of conventional LPC-only or MFCC-only features. Linear predictive coding (LPC) and its delta-cepstral coefficients have shown good results in speaker verification. Mel-frequency cepstral coefficients (MFCC), which use twenty triangular filters to approximate the overall speech features, have also been used in speaker verification for many years. The experimental results show that the new combined LPCC-MFCC feature gives better performance in a text-dependent speaker verification system.
Chang, Su-Yu, and 張蘇瑜. "Speaker Verification System with Converted Speech Spoofing Detection Mechanism". Thesis, 2019. http://ndltd.ncl.edu.tw/handle/et33a6.
國立中山大學
資訊工程學系研究所
107
In this paper, we implement a speaker verification system that can detect converted speech attacks by combining representation learning and neural networks. The system is divided into two subsystems: the countermeasure system and the verification system. The countermeasure system is responsible for detecting whether the speech is spoofed speech generated by voice conversion or speech synthesis. The verification system verifies, through voiceprint features, whether the speech is consistent with the identity claimed by the speaker. In the countermeasure system, we use representation learning and transfer learning so that the neural network can learn the features of various kinds of spoofed speech. First we train on data with multiple labels, then fine-tune the models on data with two labels to learn representation vectors of bona fide and spoofed speech, and finally classify with a support vector machine. On the ASVspoof 2019 evaluation set, our system achieves a minimum tandem decision cost of 0.1782 and an equal error rate (EER) of 7.62%. In the speaker verification system, we use a large training set to learn speaker representations and use the learned representations for enrollment and verification. We focus on the text-dependent task; evaluated in a real environment with 20 testers, our system achieves 99% accuracy.
Chang, Li-Chi. "Defending against Adversarial Attacks in Speaker Verification Systems". Thesis, 2021.
With the advance of Internet of Things technologies, smart devices and virtual personal assistants at home, such as Google Assistant, Apple Siri, and Amazon Alexa, have been widely used to control and access different objects such as door locks, bulbs, air conditioners, and even bank accounts, which makes our lives more convenient. Because of its ease of operation, voice control has become a main interface between users and these smart devices. To make voice control more secure, speaker verification systems have been researched to apply the human voice as a biometric to accurately identify a legitimate user and prevent illegal access. Recent studies, however, have shown that speaker verification systems are vulnerable to different security attacks such as replay, voice cloning, and adversarial attacks. Among all attacks, adversarial attacks are the most dangerous and very challenging to defend against. Currently, there is no known method that can effectively defend against such attacks in speaker verification systems.
The goal of this project is to design and implement a defense system that is simple, lightweight, and effective against adversarial attacks on speaker verification. To achieve this goal, we study the audio samples from adversarial attacks in both the time domain and the Mel spectrogram, and find that the generated adversarial audio is simply a clean illegal audio with small perturbations that are similar to white noise but well designed to fool speaker verification. Our intuition is that if these perturbations can be removed or modified, adversarial attacks can potentially lose their attacking ability. Therefore, we propose to add a plugin-function module to preprocess the input audio before it is fed into the verification system. As a first attempt, we study two opposite plugin functions: denoising, which attempts to remove or reduce perturbations, and noise-adding, which adds small Gaussian noise to the input audio. We show through experiments that both methods can significantly degrade the performance of a state-of-the-art adversarial attack. Specifically, denoising and noise-adding reduce the targeted attack success rate from 100% to only 56% and 5.2%, respectively. Moreover, noise-adding can slow down the attack by a factor of 25 and has only a minor effect on the normal operation of a speaker verification system. Therefore, we believe that noise-adding can be applied to any speaker verification system against adversarial attacks. To the best of our knowledge, this is the first attempt to apply the noise-adding method to defend against adversarial attacks in speaker verification systems.
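The noise-adding plugin described in the abstract above can be pictured as a small preprocessing step that perturbs each audio sample with zero-mean Gaussian noise before the signal reaches the verifier. The sketch below is only an illustration of that idea. The function name, the default `sigma`, the clipping to [-1.0, 1.0], and the assumption that samples are floats in that range are all hypothetical choices, not the settings used in the thesis.

```python
import random
from typing import List, Optional, Sequence


def add_gaussian_noise(
    audio: Sequence[float],
    sigma: float = 0.002,  # hypothetical noise level, not from the thesis
    seed: Optional[int] = None,
) -> List[float]:
    """Return a copy of `audio` with small zero-mean Gaussian noise added.

    Samples are assumed to be floats in [-1.0, 1.0]; results are clipped
    back into that range.
    """
    rng = random.Random(seed)
    noisy = []
    for sample in audio:
        perturbed = sample + rng.gauss(0.0, sigma)
        noisy.append(max(-1.0, min(1.0, perturbed)))
    return noisy


# Toy signal (hypothetical); a real system would pass the recorded waveform.
toy_audio = [0.0, 0.5, -0.5, 1.0]
toy_noisy = add_gaussian_noise(toy_audio, sigma=0.01, seed=0)
```

In a deployed pipeline the seed would normally be left unset, so that the added noise differs on every call and cannot be anticipated by an attacker.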
ChiaFeng, Chen, and 陳嘉峰. "A Study on Speaker Verification System Using Hidden Markov Model". Thesis, 2000. http://ndltd.ncl.edu.tw/handle/91019952933404054926.
國立臺北科技大學
電腦通訊與控制研究所
88
This thesis develops a text-dependent speaker verification system based on the Hidden Markov Model (HMM). The fixed digit-string password utterances are segmented into a sequence of isolated-word units for constructing speaker models by employing a segmental K-means training procedure. To improve the performance of speaker verification, normalized log-likelihood scoring is used against the specified speaker reference models and speaker background models, which are obtained from a cohort speaker set selected by a similarity measure. Several sets of experimental utterances were used to evaluate the system, including male and female utterances recorded through microphone and telephone networks. Experimental results indicate that, with individual speaker background models, the best equal error rates (EER) of 0.3% and 4.48% were achieved for microphone speech (20 true speakers, 5 impostors) and telephone speech (20 true speakers, 10 impostors), respectively.
Chang, Jung-Lin, and 張榮霖. "Case Study of CTI System And Speaker Verification Via Telephone". Thesis, 2003. http://ndltd.ncl.edu.tw/handle/28926152976021236089.
國立高雄第一科技大學
電腦與通訊工程所
91
The opening of the telecommunications market, the establishment of broadband Internet, the abundance of services and businesses, and Computer Telephony Integration (CTI) together generate the largest economic benefit for enterprises and customers. This research aims to further study applied CTI technology. To begin with, the research investigates the CTI systems of Genesys and Chain Sea Integration individually. It then develops speaker verification in the telephone system. First, we use Dialogic's speech telephone card to design a speech verification process combined with speaker verification. Next, we set up a member speech database supported by a website registration system. The core of speaker verification is based on Hidden Markov Models (HMM), which are trained on the member speech database, after which a threshold level is created for each member. Members can receive advanced functions and services after verification through website registration and the telephony speaker verification system.
Wu, Cheng-Hsiung, and 吳正雄. "Performance Evaluation of Speaker Verification for Mobile Voice-Activated Trading System". Thesis, 2001. http://ndltd.ncl.edu.tw/handle/09854519574455568728.
國立臺北科技大學
機電整合研究所
89
This thesis investigates the effects of transcoded speech and real GSM speech on the performance of speaker verification for a mobile voice-activated trading system. The transcoded speech for simulation is obtained by transcoding microphone and wired telephone speech databases using various coding schemes. To match real-world environments, a GSM speech database consisting of 20 male and 20 female speakers is also collected over the mobile wireless network. Three in-vehicle call environments are considered: stopped cars (0 km/hr) with the engine running, and moving cars at driving speeds of 50 km/hr and 90 km/hr. Each speaker pronounced 40 7-digit strings in each condition. This results in a database of 4800 digit strings, which is suitable for use in related research. A text-dependent Hidden Markov Model-based system is implemented for performance evaluation. Experimental results demonstrate that verification performance on real GSM speech is far worse than on transcoded speech due to channel effects and background noise. Consequently, this investigation provides a useful and practical baseline for performance evaluation of mobile voice-activated trading systems. The results also indicate that the 0 km/hr case yields the best performance in matched conditions, that 90 km/hr results in the worst performance in mismatched conditions, and that performance for male speakers is always superior to that for female speakers in all conditions. Moreover, we find that the proposed mixed training model improves performance in some cases.
Lin, Shiou-De, and 林修德. "Deep Neural Network based Factor Analysis for Robust Speaker Verification System". Thesis, 2013. http://ndltd.ncl.edu.tw/handle/pf7dv7.
國立臺北科技大學
電腦與通訊研究所
102
The goal of this study is to build a robust speaker verification model. In speaker verification, performance is affected by noise, environment, session, and so on. i-Vector + Linear Discriminant Analysis (LDA) and i-Vector + Probabilistic Linear Discriminant Analysis (PLDA) systems have become the state-of-the-art techniques in the speaker verification field. PLDA's speaker model rests on the strong assumption that the data follow a Gaussian distribution, but because of the variability of the data this assumption does not always hold. We therefore further propose a variant Deep Neural Network-based system. We use a model (FA-DNN) whose hidden layer, which has a high degree of representation, includes non-language speaker nodes; at test time, only the contribution of the speaker nodes is considered. In this thesis, three methods are evaluated on SRE14. The experimental results on the min DCF trial show that the relative performance gain of FA-DNN is 9.84%, and the EER of PLDA is 13.25%.
Lin, Hung-Lung, and 林宏隆. "Implementation of a Speaker Verification System Using a Neural Network Processor". Thesis, 1993. http://ndltd.ncl.edu.tw/handle/28891650358178728058.
大同工學院
電機工程研究所
81
Speaker verification is one of the applications of speaker recognition and has practical uses. The goal of this research is to design and implement a real-time speaker verification system using a digital signal processor (DSP96002) and a neural network chip (80170NX). As in a typical speaker recognition system, there are two main parts in our system: feature extraction and a pattern classifier. We take the linear predictive coding (LPC) derived cepstrum as the feature, which has been found to give the best performance for speaker recognition. The hardware of our speaker verification system has been implemented successfully and meets the requirement of real-time operation. System performance is evaluated, and the system parameters giving the highest recognition rate are suggested. The precision of the analog voltage in the circuit is the key to system performance. The experimental results show that our speaker verification design has high potential and flexibility for practical application.
Ho, Hon-Ron, and 何宏榮. "A Study of Speaker Verification on Contactless Smart Card Application System". Thesis, 2001. http://ndltd.ncl.edu.tw/handle/72175373022550969647.
國立成功大學
工程科學系
89
This thesis concerns the application of contactless smart cards and speaker verification technology. First, a person's speech is recorded through a microphone, and the speech waveform data are then processed by a DSP device, which performs FFT and LPC operations. The training results are packed and filtered into a compact speaker voice signature as small as 32 bytes. This signature data can then be written into a smart card chip for cardholder ID verification through the same speech recorder unit. The speaker ID verification system is not only user friendly but also good for user ID protection. A V-star’s QUISAR 560 card reader and writer, Mifare contactless smart cards, a compact microphone, a voice direct module, and a 64MB RAM Sound Dialog Card are used together with a Pentium III 500 PC for this thesis. The development software includes Borland C++ 3.1, Visual Basic 6.0, and Matlab 5.3 tool kits for this particular application. This work can also be applied to secure cardholder verification in future e-business.
Yu, Hao Chung, and 俞皓中. "Open Set Classification Based on Tolerance Interval for Speaker Verification System". Thesis, 2002. http://ndltd.ncl.edu.tw/handle/48804952821192817644.
國立臺灣大學
資訊工程學研究所
90
Speaker verification systems solve the problem of verifying whether a given utterance comes from a claimed speaker. This problem is important because an accurate speaker verification system can be applied to many security systems. Compared with other biometric methods such as fingerprint or face recognition, speaker verification systems do not require expensive specialized equipment and are especially effective for remote identity verification. Previously, Reynolds et al. proposed a speaker verification system using a Gaussian mixture model, but their system is incomplete because it needs a set of background speaker models, which are constructed from a large speech database covering a variety of speakers. It may not be feasible to obtain such a database in the real world. In this thesis, I propose a new solution for speaker verification called OSCILLO. By applying the tolerance interval technique from statistics, OSCILLO can verify a speaker's ID without background speaker models. This greatly reduces the size of the whole system and the time needed for both training and testing. We compare OSCILLO and Reynolds' method using three standard speech databases: TCC-300, TIMIT, and NIST. The experimental results show that OSCILLO performs well on all databases.
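The abstract does not describe how OSCILLO applies tolerance intervals, so the following is only a generic sketch of the statistical idea, not the thesis method. It assumes that a speaker's own training scores are roughly normally distributed and uses a standard closed-form approximation for the one-sided normal tolerance factor. An utterance would be accepted when its score is at or above the lower bound, so no background speaker scores are needed. All names and default values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def one_sided_tolerance_factor(n, coverage=0.90, confidence=0.95):
    """Approximate factor k for a one-sided normal tolerance bound.

    With n samples, mean - k * stdev is a bound that at least `coverage`
    of the population exceeds, with the stated `confidence`.
    This is an approximation, not the exact noncentral-t value.
    """
    z_p = NormalDist().inv_cdf(coverage)
    z_g = NormalDist().inv_cdf(confidence)
    a = 1.0 - z_g ** 2 / (2.0 * (n - 1))
    if a <= 0.0:
        raise ValueError("too few samples for this confidence level")
    b = z_p ** 2 - z_g ** 2 / n
    return (z_p + math.sqrt(z_p ** 2 - a * b)) / a


def lower_tolerance_bound(scores, coverage=0.90, confidence=0.95):
    """Lower acceptance threshold computed from a speaker's own scores."""
    k = one_sided_tolerance_factor(len(scores), coverage, confidence)
    return mean(scores) - k * stdev(scores)
```

The factor shrinks toward the plain normal quantile as the number of training scores grows, so a speaker with more enrollment data gets a tighter threshold.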
Su, Yu-Jui, and 蘇俞睿. "A study and implementation on Speaker Verification System using Gender Information". Thesis, 2018. http://ndltd.ncl.edu.tw/handle/3jrk4n.
國立臺灣大學
資訊工程學研究所
106
For the speaker verification task, one way to improve a system's accuracy without changing the acoustic modeling algorithm is to use a gender-dependent model instead of a gender-independent one. However, since the gender of test speakers is not available, the gender classifier plays an important role, because its accuracy directly affects the performance of the whole speaker verification system. Furthermore, ensuring that the system maintains good performance under different gender compositions of test speakers is also an important requirement. To explore how different uses of gender information affect a speaker verification system, this work implements a speaker verification system using i-vectors and a PLDA model as the speaker feature and scoring model, respectively, together with three i-vector-based gender classifiers. After analyzing the general weaknesses of speaker verification systems that use gender-dependent models, we propose several methods for applying gender information when the gender classifier performs well and when it performs poorly. We also analyze the performance of each method under different gender compositions of test speakers. Finally, we reach the goal of making our system perform better than traditional practice under different circumstances.
Yan, Kuan-Hao, and 管浩延. "The Use of Mixture Endpoint Detection Technique for Text-Dependent Speaker Verification System". Thesis, 2009. http://ndltd.ncl.edu.tw/handle/56600116939838315911.
國立中興大學
通訊工程研究所
97
Speaker verification has been used in the area of biometric authentication. The recognition rate is a key issue in the recent development of speech recognition. In this thesis, instead of the traditional endpoint detection method, we develop a new mixture endpoint detection method to increase the recognition rate of speaker verification. We adopt entropy and zero-crossing for endpoint detection in order to detect the real speech sections. We use entropy to locate a real speech section in noisy data, and then use zero-crossing to detect air sounds in the speech. After these processes, the SNR reflects the real speech level, so the threshold can be set. The mixture endpoint detection technique can readily increase the efficiency of a text-dependent speaker verification system.
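The zero-crossing measure mentioned in this abstract can be sketched as follows. This covers only the zero-crossing part, not the entropy step or the thesis's combined decision rule. The use of non-overlapping frames and the function names are assumptions made for illustration.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def frame_zcr(samples, frame_len):
    """Zero-crossing rate of each non-overlapping frame of `frame_len` samples.

    Trailing samples that do not fill a whole frame are ignored.
    """
    return [
        zero_crossing_rate(samples[start:start + frame_len])
        for start in range(0, len(samples) - frame_len + 1, frame_len)
    ]
```

Noise-like or unvoiced frames tend to show a high zero-crossing rate, while voiced speech frames show a lower one, which is why the measure is useful alongside an energy or entropy criterion.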
Wun-SyongLin and 林文雄. "An Embedded System Design and Implementation for Speaker Independent Single-Words Speech Verification using AMDF-based Pitch Features". Thesis, 2011. http://ndltd.ncl.edu.tw/handle/58222930150850195164.
國立成功大學
電機工程學系碩博士班
99
The speech interface of a human-machine interactive system provides not only a friendly interface but also a direct feedback mechanism for the user. In this thesis, an embedded system is designed and implemented that uses pitch-based single-word speech verification to extend the functionality of a speech interactive interface. The proposed system is designed especially for environments with limited hardware resources and has the following features: small size, low cost, low power consumption, and real-time operation; it can be widely applied to speech interactive applications. For pitch detection, the average magnitude difference function (AMDF) is adopted to estimate the pitch period feature for speaker-independent utterance verification. We propose an upper bound strategy to reduce the number of SAA (subtraction, absolute operation, and accumulation) iterations, and this approach reduces the computational load while retaining high AMDF accuracy. The proposed pitch period feature extraction method is implemented on an embedded system with an 8051 MCU, a preamplifier circuit, and an AGC and filtering circuit. For single commands of 2 seconds of speech data, the average speech verification accuracy is about 95% at different distances from the speaker to the microphone. From the experimental results, we find that the pitch period detected by the modified AMDF is still reliable. The proposed prototype can be widely applied to human-machine interactive systems such as alarm clocks, intelligent toys, real-time feedback systems, and hands-free remote controllers.
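The AMDF-based pitch search can be illustrated with a short sketch. The abstract does not spell out the upper bound strategy, so the early exit below is one plausible reading, not the thesis algorithm: accumulation for a lag stops as soon as the running sum reaches the best total found so far. Function and parameter names are hypothetical.

```python
import math


def amdf_pitch_period(x, min_lag, max_lag):
    """Return the lag (in samples) that minimizes the AMDF sum.

    For each lag, |x[n] - x[n + lag]| is accumulated over a fixed number
    of terms so that the sums are comparable across lags.
    """
    n_terms = len(x) - max_lag
    if n_terms <= 0 or min_lag < 1 or min_lag > max_lag:
        raise ValueError("invalid lag range for this signal length")
    best_lag, best_sum = min_lag, float("inf")
    for lag in range(min_lag, max_lag + 1):
        acc = 0.0
        for n in range(n_terms):
            acc += abs(x[n] - x[n + lag])
            if acc >= best_sum:
                # Upper bound reached: this lag cannot beat the current best.
                break
        else:
            # Loop finished without hitting the bound: new best lag.
            best_lag, best_sum = lag, acc
    return best_lag
```

As a usage example, a sine wave built as `[math.sin(2 * math.pi * n / 40) for n in range(400)]` has a period of 40 samples, and searching lags 20 to 60 recovers that period. The early exit does not change the result, because a lag is only discarded once its partial sum already matches or exceeds the best complete sum.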
Su, Hua, and 蘇樺. "PSO Algorithm for Speaker Verification Systems". Thesis, 2014. http://ndltd.ncl.edu.tw/handle/64080634241659055321.
國立中央大學
電機工程學系
102
This thesis focuses on speaker verification between a test corpus and registered speaker models. First, the thesis introduces score normalization approaches for the speaker verification system. Then, we apply the Particle Swarm Optimization (PSO) algorithm to optimize the model parameters. The main idea of the PSO method resembles the foraging behavior of fish. All particles in PSO have memory. The algorithm involves simple calculations and converges quickly. By using the optimized features to build a more accurate speaker model, the system becomes more discriminative. In addition, the thesis introduces a regression analysis method for the speaker verification system. Regression analysis is a useful statistical analysis method. We build a regression model for each speaker using ordinary least squares estimation and coefficient of determination analysis. Experiments show that the proposed method can improve the performance of the speaker verification system.
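A minimal sketch of a standard global-best PSO loop is given below to illustrate the particle memory (personal best) and swarm memory (global best) mentioned in the abstract. It minimizes a generic objective function and is not the thesis's model-parameter optimization. The function name and the inertia and acceleration values are assumptions chosen for illustration.

```python
import random


def pso_minimize(objective, dim, bounds=(-5.0, 5.0), n_particles=20,
                 n_iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box with a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers the best position it has visited.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

In a speaker verification setting, `objective` would be replaced by an error measure of the speaker model on held-out data; here any function of a parameter list works, for example `lambda p: sum(v * v for v in p)`.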
Mohan, Aanchan K. "Combining speech recognition and speaker verification". 2008. http://hdl.rutgers.edu/1782.2/rucore10001600001.ETD.17528.
Chandrasekaran, Aravind. "Efficient methods for rapid UBM training (RUT) for robust speaker verification /". 2008. http://proquest.umi.com/pqdweb?did=1650508671&sid=2&Fmt=2&clientId=10361&RQT=309&VName=PQD.