A ready-made bibliography on "Speaker recognition systems"

Create an accurate reference in APA, MLA, Chicago, Harvard, and many other styles

Select a source type:

See the lists of current articles, books, dissertations, abstracts, and other scholarly sources on "Speaker recognition systems".

An "Add to bibliography" button is available next to every work in the list. Use it, and we will automatically create a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the publication as a .pdf file and read its abstract online, where these are available in the record's metadata.

Journal articles on "Speaker recognition systems"

1

Gonzalez-Rodriguez, Joaquin. "Evaluating Automatic Speaker Recognition systems: An overview of the NIST Speaker Recognition Evaluations (1996-2014)". Loquens 1, no. 1 (June 30, 2014): e007. http://dx.doi.org/10.3989/loquens.2014.007.

Full text available
Styles: APA, Harvard, Vancouver, ISO, etc.
2

Bouziane, Ayoub, Jamal Kharroubi and Arsalane Zarghili. "Towards an Optimal Speaker Modeling in Speaker Verification Systems using Personalized Background Models". International Journal of Electrical and Computer Engineering (IJECE) 7, no. 6 (December 1, 2017): 3655. http://dx.doi.org/10.11591/ijece.v7i6.pp3655-3663.

Full text available
Abstract:
This paper presents a novel speaker modeling approach for speaker recognition systems. The basic idea of this approach consists of deriving the target speaker model from a personalized background model, composed only of the UBM Gaussian components which are really present in the speech of the target speaker. The motivation behind deriving speakers' models from personalized background models is to exploit the observed difference in some acoustic classes between speakers, in order to improve the performance of speaker recognition systems. The proposed approach was evaluated for the speaker verification task using various amounts of training and testing speech data. The experimental results showed that the proposed approach is efficient in terms of both verification performance and computational cost during the testing phase of the system, compared to traditional UBM-based speaker recognition systems.
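As a generic illustration of the GMM-UBM scoring that this line of work builds on: a trial is scored by the average log-likelihood of the test frames under the target speaker's GMM minus that under the UBM. This is a minimal sketch only, not the authors' personalized-background-model method; the diagonal-covariance parameter layout and the function names are hypothetical.

```python
import numpy as np

def gmm_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D) feature frames; weights: (M,); means, variances: (M, D).
    """
    diff = frames[:, None, :] - means[None, :, :]                    # (T, M, D)
    exponent = -0.5 * np.sum(diff ** 2 / variances, axis=2)          # (T, M)
    log_norm = -0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)  # (M,)
    log_comp = np.log(weights) + log_norm + exponent                 # (T, M)
    # log-sum-exp over mixture components for numerical stability
    m = log_comp.max(axis=1, keepdims=True)
    ll = m[:, 0] + np.log(np.exp(log_comp - m).sum(axis=1))
    return ll.mean()

def llr_score(frames, speaker_model, ubm):
    """UBM-normalised verification score: positive favours the target speaker."""
    return gmm_loglik(frames, *speaker_model) - gmm_loglik(frames, *ubm)
```

With toy one-dimensional models, frames drawn near a component that only the speaker model has will score positive, and frames matching only the UBM will score negative.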
3

Singh, Satyanand. "Forensic and Automatic Speaker Recognition System". International Journal of Electrical and Computer Engineering (IJECE) 8, no. 5 (October 1, 2018): 2804. http://dx.doi.org/10.11591/ijece.v8i5.pp2804-2811.

Full text available
Abstract:
Current automatic speaker recognition (ASR) systems have emerged as an important means of confirming identity in many businesses, e-commerce applications, forensics and law enforcement. Specialists trained in criminological recognition can perform this task far better by examining an arrangement of acoustic, prosodic, and semantic attributes, in what has been referred to as structured listening. Algorithm-based systems have been developed for forensic speaker recognition by physics scientists and forensic linguists to reduce the probability of contextual bias or a pre-centric understanding of a reference model with the validity of an unknown audio sample and any suspicious individual. Many researchers continue to develop automatic algorithms in signal processing and machine learning so that improving performance can effectively establish the speaker's identity, with the automatic system performing on a par with human listeners. In this paper, I examine the literature on the identification of speakers by machines and humans, emphasizing the key technical speaker patterns emerging in automatic technology over the last decade. I focus on many aspects of ASR systems, including speaker-specific features, speaker models, standard assessment data sets, and performance metrics.
4

Singh, Mahesh K., P. Mohana Satya, Vella Satyanarayana and Sridevi Gamini. "Speaker Recognition Assessment in a Continuous System for Speaker Identification". International Journal of Electrical and Electronics Research 10, no. 4 (December 30, 2022): 862–67. http://dx.doi.org/10.37391/ijeer.100418.

Full text available
Abstract:
This research article focuses on recognizing speakers in multi-speaker speech. Conferences, talks and discussions all involve several speakers, and this type of speech raises its own problems and processing stages. Challenges include the particular impurity of the surroundings, the number of speakers involved, speaker distance, microphone equipment, etc. Besides addressing these hurdles in real time, there are also problems in the treatment of the multi-speaker speech itself. Identifying speech segments, separating the speaking segments, constructing clusters of similar segments and finally recognizing the speaker using these segments are the common sequential operations in multi-speaker speech recognition. All the linked phases of the speech recognition process are discussed with relevant methodologies in this article, together with the common metrics and methods. The paper examines the speech recognition algorithm at its different stages: the voice recognition system is built through phases such as voice filtering, speaker segmentation, speaker idolization and recognition of the speaker, evaluated with 20 speakers.
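The "cluster similar segments, then recognize" pipeline sketched above can be illustrated with a deliberately simple greedy pass over per-segment speaker embeddings. This is an illustrative toy, not the authors' method; the function name, the cosine threshold, and the use of the first member as each cluster's representative are all assumptions.

```python
import numpy as np

def cluster_segments(embeddings, threshold=0.8):
    """Greedy single-pass clustering of per-segment speaker embeddings.

    Each segment joins the first-created cluster whose representative
    (its first member, length-normalised) has cosine similarity above
    `threshold`; otherwise it starts a new cluster. Returns one integer
    label per segment, one anonymous speaker per cluster.
    """
    reps, labels = [], []
    for e in embeddings:
        e = e / np.linalg.norm(e)
        sims = [float(r @ e) for r in reps]
        if sims and max(sims) >= threshold:
            labels.append(int(np.argmax(sims)))
        else:
            reps.append(e)
            labels.append(len(reps) - 1)
    return labels
```

A real diarization system would refine this with agglomerative clustering and re-segmentation, but the toy shows the shape of the stage: embeddings in, per-segment speaker labels out.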
5

Mridha, Muhammad Firoz, Abu Quwsar Ohi, Muhammad Mostafa Monowar, Md Abdul Hamid, Md Rashedul Islam and Yutaka Watanobe. "U-Vectors: Generating Clusterable Speaker Embedding from Unlabeled Data". Applied Sciences 11, no. 21 (October 27, 2021): 10079. http://dx.doi.org/10.3390/app112110079.

Full text available
Abstract:
Speaker recognition deals with recognizing speakers by their speech. Most speaker recognition systems are built upon two stages: the first stage extracts low-dimensional correlation embeddings from speech, and the second performs the classification task. The robustness of a speaker recognition system mainly depends on the extraction process of the speech embeddings, which are primarily pre-trained on a large-scale dataset. As the embedding systems are pre-trained, the performance of speaker recognition models greatly depends on the domain adaptation policy, and may degrade if trained using inadequate data. This paper introduces a speaker recognition strategy dealing with unlabeled data, which generates clusterable embedding vectors from small fixed-size speech frames. The unsupervised training strategy relies on the assumption that a small speech segment contains a single speaker. Based on this assumption, a pairwise constraint is constructed with noise augmentation policies and used to train an AutoEmbedder architecture that generates speaker embeddings. Without relying on a domain adaptation policy, the process produces clusterable speaker embeddings in an unsupervised manner, termed unsupervised vectors (u-vectors). The evaluation is conducted on two popular speaker recognition datasets for the English language, TIMIT and LibriSpeech. A Bengali dataset is also included to illustrate the diversity of domain shifts for speaker recognition systems. Finally, we conclude that the proposed approach achieves satisfactory performance using pairwise architectures.
6

Nematollahi, Mohammad Ali, and S. A. R. Al-Haddad. "Distant Speaker Recognition: An Overview". International Journal of Humanoid Robotics 13, no. 02 (May 25, 2016): 1550032. http://dx.doi.org/10.1142/s0219843615500322.

Full text available
Abstract:
A distant speaker recognition (DSR) system assumes that the microphones are far away from the speaker's mouth, and the position of the microphones can vary. Furthermore, various challenges and limitations in terms of coloration, ambient noise and reverberation can make recognition of the speaker difficult. Although applying speech enhancement techniques can attenuate speech distortion components, it may remove speaker-specific information and increase the processing time in real-time applications. Currently, many efforts are being invested in developing DSR for commercially viable systems. In this paper, state-of-the-art techniques in DSR such as robust feature extraction, feature normalization, robust speaker modeling, model compensation, dereverberation and score normalization are discussed to overcome the speech degradation components, i.e., reverberation and ambient noise. Performance results on DSR show that as the speaker-to-microphone distance increases, recognition rates decrease and the equal error rate (EER) increases. Finally, the paper concludes that applying robust features and speaker models that vary less with distance can improve DSR performance.
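The equal error rate (EER) cited above is the operating point at which the false-acceptance and false-rejection rates coincide. A minimal sketch of estimating it from target and impostor scores follows; the function name and the midpoint interpolation at the closest threshold are our assumptions, not taken from the paper.

```python
import numpy as np

def equal_error_rate(target_scores, impostor_scores):
    """Estimate the EER: the rate where false acceptance equals false rejection.

    target_scores: scores of genuine (same-speaker) trials.
    impostor_scores: scores of impostor (different-speaker) trials.
    """
    thresholds = np.sort(np.concatenate([target_scores, impostor_scores]))
    # FAR: impostors accepted (score >= t); FRR: targets rejected (score < t)
    far = np.array([(impostor_scores >= t).mean() for t in thresholds])
    frr = np.array([(target_scores < t).mean() for t in thresholds])
    i = np.argmin(np.abs(far - frr))  # threshold where the two rates cross
    return (far[i] + frr[i]) / 2
```

Perfectly separated score distributions give an EER of 0; identical distributions give 0.5 (chance-level verification).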
7

Garcia‐Romero, Daniel, and Carol Espy‐Wilson. "Automatic speaker recognition: Advances toward informative systems." Journal of the Acoustical Society of America 128, no. 4 (October 2010): 2394. http://dx.doi.org/10.1121/1.3508584.

Full text available
8

Padmanabhan, M., L. R. Bahl, D. Nahamoo and M. A. Picheny. "Speaker clustering and transformation for speaker adaptation in speech recognition systems". IEEE Transactions on Speech and Audio Processing 6, no. 1 (1998): 71–77. http://dx.doi.org/10.1109/89.650313.

Full text available
9

Singh, Satyanand. "Bayesian distance metric learning and its application in automatic speaker recognition systems". International Journal of Electrical and Computer Engineering (IJECE) 9, no. 4 (August 1, 2019): 2960. http://dx.doi.org/10.11591/ijece.v9i4.pp2960-2967.

Full text available
Abstract:
This paper proposes a state-of-the-art Automatic Speaker Recognition (ASR) system based on a Bayesian distance learning metric as a feature extractor. In this modeling, I explored the constraints on the distance between modified and simplified i-vector pairs from the same speaker and from different speakers. An approximation of the distance metric is used as a weighted covariance matrix from the higher eigenvectors of the covariance matrix, which is used to estimate the posterior distribution of the metric distance. Given a speaker tag, I select the data pairs of different speakers with the highest cosine score to form a set of speaker constraints. This collection captures the most discriminating variability between the speakers in the training data. This Bayesian distance learning approach achieves better performance than the most advanced methods. Furthermore, the method is insensitive to normalization compared to cosine scores and is very effective in the case of limited training data. The modified supervised i-vector based ASR system is evaluated on the NIST SRE 2008 database. The best performance with the combined cosine score, an EER of 1.767%, was obtained using LDA200 + NCA200 + LDA200, and the best performance with Bayes_dml, an EER of 1.775%, was obtained using LDA200 + NCA200 + LDA100. Bayes_dml outperforms the combined cosine score norm and gives the best reported result for the short2-short3 condition of the NIST SRE 2008 data.
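The cosine scoring used as the baseline above reduces to a dot product of length-normalised i-vectors, compared against a decision threshold. A minimal sketch (the function name is assumed; real systems add score normalisation such as s-norm on top):

```python
import numpy as np

def cosine_score(w_enroll, w_test):
    """Cosine score between two i-vectors; higher means more likely the same speaker."""
    w1 = w_enroll / np.linalg.norm(w_enroll)
    w2 = w_test / np.linalg.norm(w_test)
    return float(w1 @ w2)
```

Because both vectors are length-normalised first, the score depends only on their angle: collinear i-vectors score 1.0 and orthogonal ones score 0.0, regardless of magnitude.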
10

Kamiński, Kamil A., and Andrzej P. Dobrowolski. "Automatic Speaker Recognition System Based on Gaussian Mixture Models, Cepstral Analysis, and Genetic Selection of Distinctive Features". Sensors 22, no. 23 (December 1, 2022): 9370. http://dx.doi.org/10.3390/s22239370.

Full text available
Abstract:
This article presents an Automatic Speaker Recognition System (ASR System) that successfully resolves problems such as identification within an open set of speakers and verification of speakers in difficult recording conditions similar to telephone transmission. The article provides complete information on the architecture of the various internal processing modules of the ASR System. The proposed system has been compared closely with competing systems and achieves improved speaker identification and verification results on a known, certified voice dataset. The ASR System owes this to the dual use of genetic algorithms, both in the feature selection process and in the optimization of the system's internal parameters. The proprietary feature generation and the corresponding classification process using Gaussian mixture models also contributed. This allowed the development of a system that makes an important contribution to the current state of the art in speaker recognition for telephone transmission applications with known speech coding standards.

Doctoral dissertations on "Speaker recognition systems"

1

Neville, Katrina Lee. "Channel Compensation for Speaker Recognition Systems". RMIT University. Electrical and Computer Engineering, 2007. http://adt.lib.rmit.edu.au/adt/public/adt-VIT20080514.093453.

Full text available
Abstract:
This thesis attempts to address the problem of how best to remedy different types of channel distortion on speech when that speech is to be used in automatic speaker recognition and verification systems. Automatic speaker recognition is when a person's voice is analysed by a machine and the person's identity is worked out by comparing speech features to a known set of speech features. Automatic speaker verification is when a person claims an identity and the machine determines whether that claimed identity is correct or whether the person is an impostor. Channel distortion occurs whenever information is sent electronically through any type of channel, whether that channel is a basic wired telephone channel or a wireless channel. The types of distortion that can corrupt the information include time-variant or time-invariant filtering of the information or the addition of 'thermal noise'; both types of distortion can cause varying degrees of error in the information being received and analysed. The experiments presented in this thesis investigate the effects of channel distortion on average speaker recognition rates and test the effectiveness of various channel compensation algorithms designed to mitigate those effects. The speaker recognition system was represented by a basic recognition algorithm consisting of: speech analysis, extraction of feature vectors in the form of Mel-Cepstral Coefficients, and a classification part based on the minimum distance rule.
Two types of channel distortion were investigated:
• Convolutional (or lowpass filtering) effects
• Addition of white Gaussian noise
Three different methods of channel compensation were tested:
• Cepstral Mean Subtraction (CMS)
• RelAtive SpecTrAl (RASTA) Processing
• Constant Modulus Algorithm (CMA)
The results from the experiments showed that for both CMS and RASTA processing, filtering at low cutoff frequencies (3 or 4 kHz) produced improvements in the average speaker recognition rates compared to speech with no compensation. The levels of improvement due to RASTA processing were higher than those achieved with the CMS method. Neither the CMS nor the RASTA method was able to improve the accuracy of the speaker recognition system for cutoff frequencies of 5 kHz, 6 kHz or 7 kHz. In the case of noisy speech, all the methods analysed were able to compensate at high SNRs of 40 dB and 30 dB, and only RASTA processing was able to compensate and improve the average recognition rate for speech corrupted with a high level of noise (SNRs of 20 dB and 10 dB).
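Of the compensation methods listed, cepstral mean subtraction is the simplest to state: a stationary convolutional channel multiplies the spectrum, which becomes an additive constant in the cepstral domain, so removing the per-utterance mean of each coefficient cancels it. A minimal sketch, with the array layout assumed:

```python
import numpy as np

def cepstral_mean_subtraction(cepstra):
    """Remove the per-utterance mean from each cepstral coefficient.

    cepstra: (num_frames, num_coeffs) array of e.g. mel-cepstral features.
    A time-invariant channel adds the same offset to every frame, so
    subtracting the utterance mean makes the features channel-invariant.
    """
    return cepstra - cepstra.mean(axis=0, keepdims=True)
```

The invariance is easy to check: adding any constant per-coefficient offset to an utterance leaves the CMS-processed features unchanged.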
2

Du Toit, Ilze. "Non-acoustic speaker recognition". Thesis, Stellenbosch: University of Stellenbosch, 2004. http://hdl.handle.net/10019.1/16315.

Full text available
Abstract:
Thesis (MScIng)--University of Stellenbosch, 2004.
ENGLISH ABSTRACT: In this study the phoneme labels derived from a phoneme recogniser are used for phonetic speaker recognition. The time-dependencies among phonemes are modelled by using hidden Markov models (HMMs) for the speaker models. Experiments are done using firstorder and second-order HMMs and various smoothing techniques are examined to address the problem of data scarcity. The use of word labels for lexical speaker recognition is also investigated. Single word frequencies are counted and the use of various word selections as feature sets are investigated. During April 2004, the University of Stellenbosch, in collaboration with Spescom DataVoice, participated in an international speaker verification competition presented by the National Institute of Standards and Technology (NIST). The University of Stellenbosch submitted phonetic and lexical (non-acoustic) speaker recognition systems and a fused system (the primary system) that fuses the acoustic system of Spescom DataVoice with the non-acoustic systems of the University of Stellenbosch. The results were evaluated by means of a cost model. Based on the cost model, the primary system obtained second and third position in the two categories that were submitted.
AFRIKAANSE OPSOMMING: Hierdie projek maak gebruik van foneem-etikette wat geklassifiseer word deur ’n foneemherkenner en daarna gebruik word vir fonetiese sprekerherkenning. Die tyd-afhanklikhede tussen foneme word gemodelleer deur gebruik te maak van verskuilde Markov modelle (HMMs) as sprekermodelle. Daar word geëksperimenteer met eerste-orde en tweede-orde HMMs en verskeie vergladdingstegnieke word ondersoek om dataskaarsheid aan te spreek. Die gebruik van woord-etikette vir sprekerherkenning word ook ondersoek. Enkelwoordfrekwensies word getel en daar word geëksperimenteer met verskeie woordseleksies as kenmerke vir sprekerherkenning. Gedurende April 2004 het die Universiteit van Stellenbosch in samewerking met Spescom DataVoice deelgeneem aan ’n internasionale sprekerverifikasie kompetisie wat deur die National Institute of Standards and Technology (NIST) aangebied is. Die Universiteit van Stellenbosch het ingeskryf vir ’n fonetiese en ’n woordgebaseerde (nie-akoestiese) sprekerherkenningstelsel, asook ’n saamgesmelte stelsel wat as primêre stelsel dien. Die saamgesmelte stelsel is ’n kombinasie van Spescom DataVoice se akoestiese stelsel en die twee nie-akoestiese stelsels van die Universiteit van Stellenbosch. Die resultate is geëvalueer deur gebruik te maak van ’n koste-model. Op grond van die koste-model het die primêre stelsel tweede en derde plek behaal in die twee kategorieë waaraan deelgeneem is.
3

Yin, Shou-Chun. "Speaker adaptation in joint factor analysis based text independent speaker verification". Thesis, McGill University, 2006. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=100735.

Full text available
Abstract:
This thesis presents methods for supervised and unsupervised speaker adaptation of Gaussian mixture speaker models in text-independent speaker verification. The proposed methods are based on an approach which is able to separate speaker and channel variability so that progressive updating of speaker models can be performed while minimizing the influence of the channel variability associated with the adaptation recordings. This approach relies on a joint factor analysis model of intrinsic speaker variability and session variability where inter-session variation is assumed to result primarily from the effects of the transmission channel. These adaptation methods have been evaluated under the adaptation paradigm defined under the NIST 2005 speaker recognition evaluation plan which is based on conversational telephone speech.
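The joint factor analysis model referred to above is commonly written, in its standard formulation (the symbols below are the conventional ones, not notation taken from this thesis), as a decomposition of the speaker- and session-dependent GMM mean supervector:

```latex
\mathbf{M}_{s,h} \;=\; \mathbf{m} \;+\; \mathbf{V}\mathbf{y}_{s} \;+\; \mathbf{U}\mathbf{x}_{s,h} \;+\; \mathbf{D}\mathbf{z}_{s}
```

Here \(\mathbf{m}\) is the speaker-independent UBM mean supervector, \(\mathbf{V}\mathbf{y}_{s}\) the speaker factors (eigenvoices), \(\mathbf{U}\mathbf{x}_{s,h}\) the session/channel factors, which vary with each recording \(h\), and \(\mathbf{D}\mathbf{z}_{s}\) a residual speaker-specific offset. Progressive speaker adaptation updates the speaker terms while the channel term absorbs the session variability of the adaptation recordings, which is exactly the separation the abstract describes.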
4

Uzuner, Halil. "Robust text-independent speaker recognition over telecommunications systems". Thesis, University of Surrey, 2006. http://epubs.surrey.ac.uk/843391/.

Full text available
Abstract:
Biometric recognition methods, using human features such as voice, face or fingerprints, are increasingly popular for user authentication. Voice is unique in that it is a non-intrusive biometric which can be transmitted over the existing telecommunication networks, thereby allowing remote authentication. Current speaker recognition systems can provide high recognition rates on clean speech signals. However, their performance has been shown to degrade in real-life applications such as telephone banking, where speech compression and background noise can affect the speech signal. In this work, three important advancements are introduced to improve speaker recognition performance where it is affected by coder mismatch, by the aliasing distortion caused by Line Spectral Frequency (LSF) parameter extraction, and by background noise. The first advancement focuses on investigating speaker recognition performance in a multi-coder environment using a Speech Coder Detection (SCD) System, which minimises the mismatch between training and testing data and improves speaker recognition performance. Having reduced speaker recognition error rates for the multi-coder environment, further investigation of the GSM-EFR speech coder is performed to deal with a particular problem related to the LSF parameter extraction method. It has previously been shown that the classic technique for extraction of LSF parameters in speech coders is prone to aliasing distortion. Low-pass filtering of up-sampled LSF vectors has been shown to alleviate this problem, thereby improving speech quality. In this thesis, as a second advancement, the Non-Aliased LSF (NA-LSF) extraction method is introduced in order to reduce the unwanted effects of the GSM-EFR coder on speaker recognition performance. Another important factor that affects the performance of speaker recognition systems is the presence of background noise.
Background noise might severely reduce the performance of the targeted application, such as the quality of the coded speech or the performance of the speaker recognition system. The third advancement was achieved by using a noise canceller to improve speaker recognition performance in mismatched environments with varying background noise conditions. A speaker recognition system with a Minimum Mean Square Error - Log Spectral Amplitudes (MMSE-LSA) noise canceller used as a pre-processor is proposed and investigated to determine the efficiency of noise cancellation on speaker recognition performance using speech corrupted by different background noise conditions. The effects of noise cancellation on speaker recognition performance using coded noisy speech have also been investigated. Keywords: Identification, Verification, Recognition, Gaussian Mixture Models, Speech Coding, Noise Cancellation.
5

Wildermoth, Brett Richard. "Text-Independent Speaker Recognition Using Source Based Features". Griffith University. School of Microelectronic Engineering, 2001. http://www4.gu.edu.au:8080/adt-root/public/adt-QGU20040831.115646.

Full text available
Abstract:
The speech signal is basically meant to carry information about the linguistic message, but it also contains speaker-specific information. It is generated by acoustically exciting the cavities of the mouth and nose, and can be used to recognize (identify/verify) a person. This thesis deals with the speaker identification task, i.e., finding the identity of a person from his/her speech within a group of persons already enrolled during the training phase. Listeners use many audible cues in identifying speakers. These cues range from high-level cues such as the semantics and linguistics of the speech, to low-level cues relating to the speaker's vocal tract and voice source characteristics. Generally, the vocal tract characteristics are modeled in modern speaker identification systems by cepstral coefficients. Although these coefficients are good at representing vocal tract information, they can be supplemented with pitch and voicing information. Pitch provides very important and useful information for identifying speakers. In current speaker recognition systems it is rarely used, as it cannot be reliably extracted and is not always present in the speech signal. In this thesis, an attempt is made to utilize this pitch and voicing information for speaker identification. The thesis illustrates, through a text-independent speaker identification system, the reasonable performance of cepstral coefficients, achieving an identification error of 6%. Using pitch as a feature in a straightforward manner results in identification errors in the range of 86% to 94%, which is not very helpful. There are two main reasons why the direct use of pitch as a feature does not work for speaker recognition. First, speech is not always periodic; only about half of the frames are voiced, so pitch cannot be estimated for the unvoiced frames, and the problem is how to account for pitch information for those frames during the recognition phase. Second, pitch estimation methods are not very reliable: they classify some frames as unvoiced when they are really voiced, and they make pitch estimation errors (such as doubling or halving the pitch value, depending on the method). In order to use pitch information for speaker recognition, these problems have to be overcome. What is needed is a method which does not use the pitch value directly as a feature and which works reliably for voiced as well as unvoiced frames. We propose here a method which uses the autocorrelation function of the given frame to derive pitch-related features, called maximum autocorrelation value (MACV) features. These features can be extracted for voiced as well as unvoiced frames and do not suffer from pitch doubling or halving errors. Using these MACV features along with the cepstral features, speaker identification performance is improved by 45%.
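The MACV idea described above can be sketched directly from the abstract: take the largest peaks of the frame's normalised autocorrelation in a plausible pitch-lag range, which are near 1 for periodic (voiced) frames and small for aperiodic (unvoiced) frames, without ever committing to a pitch value. This is a reconstruction, not the thesis code; the lag range and the number of retained peaks are assumptions.

```python
import numpy as np

def macv_features(frame, num_values=3):
    """Maximum autocorrelation value (MACV) features for one speech frame.

    Defined for voiced and unvoiced frames alike, and immune to
    pitch-doubling/halving since no explicit pitch is estimated.
    """
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]  # lags 0..N-1
    ac = ac / ac[0]                       # normalise so that lag 0 equals 1
    lo, hi = 20, min(160, len(ac) - 1)    # assumed pitch-lag search range
    peaks = np.sort(ac[lo:hi])[::-1]      # largest autocorrelation values first
    return peaks[:num_values]
```

On a strongly periodic frame the top MACV is close to 1, while on a white-noise frame it stays small, which is exactly the voicing cue the features encode.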
6

Wildermoth, Brett Richard. "Text-Independent Speaker Recognition Using Source Based Features". Thesis, Griffith University, 2001. http://hdl.handle.net/10072/366289.

Full text available
Thesis (Masters)
Master of Philosophy (MPhil)
School of Microelectronic Engineering
Faculty of Engineering and Information Technology
Full Text
7

Adami, André Gustavo. "Modeling prosodic differences for speaker and language recognition". Thesis, 2004. Full text open access at: http://content.ohsu.edu/u?/etd,19.

Full text available
8

Yu, K. P. "Text dependency and adaptation in training speaker recognition systems". Thesis, Swansea University, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.636721.

Full text available
Abstract:
This thesis investigates speaker-specific models trained with training sets containing different numbers of repetitions per text, focusing mainly on models trained with only a few (fewer than 3) repetitions. This work aims to assess the abilities of a speaker model as the amount of training data increases while keeping the length of test utterances fixed. This theme is chosen because small data sets are problematic for the training of models for speech and speaker recognition. Small training sets regularly occur when training speaker-specific models, as it is often difficult to collect a large amount of speaker-specific data. In the first part of this work, three speaker recognition approaches, namely vector quantisation (VQ), dynamic time warping (DTW) and continuous density hidden Markov models (CDHMMs), are assessed. These experiments use increasing training set sizes which contain from 1 to 10 repetitions of each text to train each speaker model. Here the intent is to show which approach is most appropriate across the range of available training set sizes, for text-dependent and text-independent speaker recognition. This part concludes by suggesting that the text-dependent (TD) DTW approach is the best of the chosen configurations. The second part of the work concerns adaptation using text-dependent CDHMMs. A new approach for adaptation called cumulative likelihood estimation (CLE) is introduced and compared with the maximum a posteriori (MAP) approach and other benchmark results. The framework is chosen such that only single repetitions of each utterance are available for enrolment and subsequent adaptation of the speaker model. The objective is to assess whether creating speaker models through an adaptation approach is a viable alternative to creating speaker models from stored speaker-specific speech. It is concluded that both MAP and CLE are viable alternatives, and CLE in particular can create a model by adapting single repetitions of data which achieves performance as good as or better than that of an equivalent model, such as DTW, trained using an equivalent amount of stored data.
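One of the approaches the abstract compares is dynamic time warping, which aligns two utterances of the same text despite differences in speaking rate. As a rough illustration only (not taken from the thesis), a minimal DTW distance between two sequences of per-frame feature vectors can be sketched in Python:

```python
import math

def dtw_distance(seq_a, seq_b):
    """Dynamic time warping distance between two sequences of feature vectors.

    Each sequence is a list of equal-length feature vectors (lists of floats),
    e.g. per-frame cepstral coefficients. A lower distance means the two
    utterances are more similar after optimal time alignment.
    """
    n, m = len(seq_a), len(seq_b)
    INF = float("inf")
    # cost[i][j] = best accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])  # local Euclidean cost
            cost[i][j] = d + min(cost[i - 1][j],       # insertion
                                 cost[i][j - 1],       # deletion
                                 cost[i - 1][j - 1])   # match
    return cost[n][m]

# Toy example: a "template" utterance and a time-stretched version of it.
template = [[0.0, 1.0], [1.0, 2.0], [2.0, 1.0]]
stretched = [[0.0, 1.0], [0.0, 1.0], [1.0, 2.0], [2.0, 1.0]]
print(dtw_distance(template, stretched))  # 0.0: warping absorbs the stretch
```

In a text-dependent setting, a claimant's utterance would be compared against the enrolled speaker's template of the same phrase and the DTW distance thresholded; the feature extraction and decision logic here are omitted.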
9

Wark, Timothy J. "Multi-modal speech processing for automatic speaker recognition". Thesis, Queensland University of Technology, 2001.

10

Mathan, Luc Stefan. "Speaker-independent access to a large lexicon". Thesis, McGill University, 1987. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=63773.


Books on the topic "Speaker recognition systems"

1

Fundamentals of speaker recognition. New York: Springer, 2011.

2

Müller, Christian, ed. Speaker classification. Berlin: Springer, 2007.

3

Meisel, William S. The telephony voice user interface: Applications of speech recognition, text-to-speech, and speaker verification over the telephone. Tarzana, CA: TMA Associates, 1998.

4

Sabourin, Conrad. Computational speech processing: Speech analysis, recognition, understanding, compression, transmission, coding, synthesis, text to speech systems, speech to tactile displays, speaker identification, prosody processing : bibliography. Montréal: Infolingua, 1994.

5

Russell, M. J. The development of the speaker independent ARM continuous speech recognition system. [London]: Controller, H.M.S.O., 1992.

6

Gerl, Franz, Wolfgang Minker and SpringerLink (Online service), eds. Self-Learning Speaker Identification: A System for Enhanced Speech Recognition. Berlin, Heidelberg: Springer-Verlag Berlin Heidelberg, 2011.

7

Beigi, Homayoon. Fundamentals of Speaker Recognition. Springer, 2016.

8

Gallardo, Laura Fernández. Human and Automatic Speaker Recognition over Telecommunication Channels. Springer, 2015.

9

Speaker Classification. Springer, 2007.

10

Müller, Christian. Speaker Classification I: Fundamentals, Features, and Methods. Springer London, Limited, 2007.


Book chapters on the topic "Speaker recognition systems"

1

Ghate, P. M., Shraddha Chadha, Aparna Sundar and Ankita Kambale. "Automatic Speaker Recognition System". In Advances in Intelligent Systems and Computing, 1037–44. New Delhi: Springer India, 2013. http://dx.doi.org/10.1007/978-81-322-0740-5_126.

2

Katrak, Kayan K., Kanishk Singh, Aayush Shah, Rohit Menon and V. R. Badri Prasad. "Transformers for Speaker Recognition". In Machine Learning and Autonomous Systems, 49–62. Singapore: Springer Singapore, 2022. http://dx.doi.org/10.1007/978-981-16-7996-4_5.

3

Glasser, Avery. "Designing Better Speaker Verification Systems: Bridging the Gap between Creators and Implementers of Investigatory Voice Biometric Technologies". In Forensic Speaker Recognition, 511–27. New York, NY: Springer New York, 2011. http://dx.doi.org/10.1007/978-1-4614-0263-3_18.

4

Martin, Alvin, Mark Przybocki and Joseph P. Campbell. "The NIST speaker recognition evaluation program". In Biometric Systems, 241–62. London: Springer London, 2005. http://dx.doi.org/10.1007/1-84628-064-8_8.

5

Koolagudi, Shashidhar G., Kritika Sharma and K. Sreenivasa Rao. "Speaker Recognition in Emotional Environment". In Eco-friendly Computing and Communication Systems, 117–24. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-32112-2_15.

6

Returi, Kanaka Durga, Y. Radhika and Vaka Murali Mohan. "A Simple Method for Speaker Recognition and Speaker Verification". In Advances in Intelligent Systems and Computing, 663–72. Singapore: Springer Singapore, 2020. http://dx.doi.org/10.1007/978-981-15-5400-1_64.

7

Hemakumar, G., and P. Punitha. "Large Vocabulary Speech Recognition: Speaker Dependent and Speaker Independent". In Advances in Intelligent Systems and Computing, 73–80. New Delhi: Springer India, 2015. http://dx.doi.org/10.1007/978-81-322-2250-7_8.

8

Shulipa, Andrey, Sergey Novoselov and Yuri Matveev. "Scores Calibration in Speaker Recognition Systems". In Speech and Computer, 596–603. Cham: Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-43958-7_72.

9

Hatem, Ahmed Samit, Muthanna J. Adulredhi, Ali M. Abdulrahman and Mohammed A. Fadhel. "Human Speaker Recognition Based Database Method". In Advances in Intelligent Systems and Computing, 1145–54. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-71187-0_106.

10

Rajendran, Sindhu, Meghamadhuri Vakil, Praveen Kumar Gupta, Lingayya Hiremath, S. Narendra Kumar and Ajeet Kumar Srivastava. "An Overview of the Concept of Speaker Recognition". In Intelligent Systems, 107–24. Apple Academic Press, 2019. http://dx.doi.org/10.1201/9780429265020-6.


Conference papers on the topic "Speaker recognition systems"

1

Kohler, M. A., W. D. Andrews, J. P. Campbell and J. Hernández-Cordero. "Phonetic speaker recognition". In Conference Record. Thirty-Fifth Asilomar Conference on Signals, Systems and Computers. IEEE, 2001. http://dx.doi.org/10.1109/acssc.2001.987748.

2

Badji, Aliou, Youssou Dieng, Ibrahima Diop, Papa Alioune Cisse and Boubacar Diouf. "Automatic Speaker Recognition (ASR)". In ICIST '20: 10th International Conference on Information Systems and Technologies. New York, NY, USA: ACM, 2020. http://dx.doi.org/10.1145/3447568.3448544.

3

Zhu, Jian-wei, Shui-fa Sun, Xiao-li Liu and Bang-jun Lei. "Pitch in Speaker Recognition". In 2009 Ninth International Conference on Hybrid Intelligent Systems (HIS 2009). IEEE, 2009. http://dx.doi.org/10.1109/his.2009.14.

4

Heck, Larry P., and Dominique Genoud. "Combining speaker and speech recognition systems". In 7th International Conference on Spoken Language Processing (ICSLP 2002). ISCA: ISCA, 2002. http://dx.doi.org/10.21437/icslp.2002-415.

5

Pandey, Bipul, Alok Ranjan, Rajeev Kumar and Anupam Shukla. "Multilingual speaker recognition using ANFIS". In 2010 2nd International Conference on Signal Processing Systems (ICSPS). IEEE, 2010. http://dx.doi.org/10.1109/icsps.2010.5555759.

6

"SPEAKER RECOGNITION USING DECISION FUSION". In International Conference on Bio-inspired Systems and Signal Processing. SciTePress - Science and Technology Publications, 2008. http://dx.doi.org/10.5220/0001065502670272.

7

Kadyrov, Shirali, Cemil Turan, Altynbek Amirzhanov and Cemal Ozdemir. "Speaker Recognition from Spectrogram Images". In 2021 IEEE International Conference on Smart Information Systems and Technologies (SIST). IEEE, 2021. http://dx.doi.org/10.1109/sist50301.2021.9465954.

8

Fei, Wanchun, Liangjun Xu and Xingxing Lu. "Speaker recognition on nonstationary characteristics". In 2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery (FSKD). IEEE, 2010. http://dx.doi.org/10.1109/fskd.2010.5569783.

9

Selvan, Karthik, Aju Joseph and K. K. Anish Babu. "Speaker recognition system for security applications". In 2013 IEEE Recent Advances in Intelligent Computational Systems (RAICS). IEEE, 2013. http://dx.doi.org/10.1109/raics.2013.6745441.

10

Slyh, Raymond, Eric Hansen and Brian Ore. "The 2005 AFRL/HEC One-Speaker Detection Systems". In 2006 IEEE Odyssey - The Speaker and Language Recognition Workshop. IEEE, 2006. http://dx.doi.org/10.1109/odyssey.2006.248119.


Reports on the topic "Speaker recognition systems"

1

Slyh, Raymond E., Eric G. Hansen and Timothy R. Anderson. AFRL/HECP Speaker Recognition Systems for the 2004 NIST Speaker Recognition Evaluation. Fort Belvoir, VA: Defense Technical Information Center, December 2004. http://dx.doi.org/10.21236/ada430750.

2

Ferrer, Luciana, Mitchell McLaren, Nicolas Scheffer, Yun Lei, Martin Graciarena and Vikramjit Mitra. A Noise-Robust System for NIST 2012 Speaker Recognition Evaluation. Fort Belvoir, VA: Defense Technical Information Center, August 2013. http://dx.doi.org/10.21236/ada614010.

3

Remus, Jeremiah. Advanced Subspace Techniques for Modeling Channel and Session Variability in a Speaker Recognition System. Fort Belvoir, VA: Defense Technical Information Center, March 2012. http://dx.doi.org/10.21236/ada557785.

4

Issues in Data Processing and Relevant Population Selection. OSAC Speaker Recognition Subcommittee, November 2022. http://dx.doi.org/10.29325/osac.tg.0006.

Abstract:
In Forensic Automatic Speaker Recognition (FASR), forensic examiners typically compare audio recordings of a speaker whose identity is in question with recordings of known speakers to assist investigators and triers of fact in a legal proceeding. The performance of automated speaker recognition (SR) systems used for this purpose depends largely on the characteristics of the speech samples being compared. Examiners must understand the requirements of specific systems in use as well as the audio characteristics that impact system performance. Mismatch conditions between the known and questioned data samples are of particular importance, but the need for, and impact of, audio pre-processing must also be understood. The data selected for use in a relevant population can also be critical to the performance of the system. This document describes issues that arise in the processing of case data and in the selections of a relevant population for purposes of conducting an examination using a human supervised automatic speaker recognition approach in a forensic context. The document is intended to comply with the Organization of Scientific Area Committees (OSAC) for Forensic Science Technical Guidance Document.
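FASR systems of the kind described above typically report a likelihood ratio weighing the same-speaker hypothesis against the different-speaker hypothesis, with the score distributions estimated from relevant-population data. As a toy one-dimensional illustration only (the function names and parameter values below are hypothetical, not from this document), the calculation can be sketched as:

```python
import math

def gaussian_logpdf(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def log_likelihood_ratio(score, same_mean, same_var, diff_mean, diff_var):
    """Log likelihood ratio for a comparison score.

    Numerator: density of the score under the same-speaker model.
    Denominator: density under the different-speaker model.
    In practice both Gaussians would be fitted to scores the SR system
    produces on data from a relevant population (hypothetical values here).
    """
    return (gaussian_logpdf(score, same_mean, same_var)
            - gaussian_logpdf(score, diff_mean, diff_var))

# Hypothetical calibration: same-speaker scores centred near 2.0,
# different-speaker scores centred near -1.0, unit variance for both.
llr = log_likelihood_ratio(1.5, same_mean=2.0, same_var=1.0,
                           diff_mean=-1.0, diff_var=1.0)
print(llr > 0)  # positive log-LR: the score favours the same-speaker hypothesis
```

A mismatch between the questioned and known recordings, or a poorly chosen relevant population, shifts these score distributions and therefore the reported ratio, which is why the document stresses understanding both.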