Academic literature on the topic 'Voice signal transformation'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Voice signal transformation.'
Next to every source in the list of references there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.
Journal articles on the topic "Voice signal transformation"
Savic, Michael, and Il-Hyun Nam. "Voice personality transformation." Digital Signal Processing 1, no. 2 (April 1991): 107–10. http://dx.doi.org/10.1016/1051-2004(91)90099-7.
Tandyo, Anny, Martono Martono, and Adi Widyatmoko. "SPEAKER IDENTIFICATION MENGGUNAKAN TRANSFORMASI WAVELET DISKRIT DAN JARINGAN SARAF TIRUAN BACK-PROPAGATION." CommIT (Communication and Information Technology) Journal 2, no. 1 (May 31, 2008): 1. http://dx.doi.org/10.21512/commit.v2i1.482.
Zhao, Chun Hua, Chun Yu Ning, and Xiao Qiang Ji. "Design of Voice Prompt Temperature Detection System." Applied Mechanics and Materials 644-650 (September 2014): 1270–73. http://dx.doi.org/10.4028/www.scientific.net/amm.644-650.1270.
Wah, B. W., and Dong Lin. "Transformation-based reconstruction for real-time voice transmissions over the Internet." IEEE Transactions on Multimedia 1, no. 4 (1999): 342–51. http://dx.doi.org/10.1109/6046.807954.
Tan, Choon Beng, Mohd Hanafi Ahmad Hijazi, Frazier Kok, Mohd Saberi Mohamad, and Puteri Nor Ellyza Nohuddin. "Artificial speech detection using image-based features and random forest classifier." IAES International Journal of Artificial Intelligence (IJ-AI) 11, no. 1 (March 1, 2022): 161. http://dx.doi.org/10.11591/ijai.v11.i1.pp161-172.
Beltman, Willem, Hector Cordourier, and Paulo Lopez Meyer. "Hearing protection and communication in high noise environments using vibration sensing and neural network voice transformation." INTER-NOISE and NOISE-CON Congress and Conference Proceedings 263, no. 1 (August 1, 2021): 5027–37. http://dx.doi.org/10.3397/in-2021-2925.
Fong, Simon, Kun Lan, and Raymond Wong. "Classifying Human Voices by Using Hybrid SFX Time-Series Preprocessing and Ensemble Feature Selection." BioMed Research International 2013 (2013): 1–27. http://dx.doi.org/10.1155/2013/720834.
Guimarães, Paula. "Retrieving Fin-de-Siècle Women Poets: The Transformative Myths, Fragments and Voices of Webster, Blind and Levy." Comparative Critical Studies 14, no. 2-3 (October 2017): 225–49. http://dx.doi.org/10.3366/ccs.2017.0237.
Ibrahim, Abu Bakar, and Ahmad Zamzuri Mohamad Ali. "Design of Microwave LNA Based on Ladder Matching Networks for WiMAX Applications." International Journal of Electrical and Computer Engineering (IJECE) 6, no. 4 (August 1, 2016): 1717. http://dx.doi.org/10.11591/ijece.v6i4.pp1717-1724.
Full textDissertations / Theses on the topic "Voice signal transformation"
Ardaillon, Luc. "Synthesis and expressive transformation of singing voice." Thesis, Paris 6, 2017. http://www.theses.fr/2017PA066511/document.
This thesis aimed at conducting research on the synthesis and expressive transformation of the singing voice, towards the development of a high-quality synthesizer that can generate a natural and expressive singing voice automatically from a given score and lyrics. Three main research directions can be identified: methods for modelling the voice signal to automatically generate an intelligible and natural-sounding voice according to the given lyrics; the control of the synthesis to render an adequate interpretation of a given score while conveying some expressivity related to a specific singing style; and the transformation of the voice signal to improve its naturalness and add expressivity by varying the timbre adequately according to pitch, intensity, and voice quality. This thesis provides contributions in each of those three directions. First, a fully functional synthesis system has been developed, based on diphone concatenation. The modular architecture of this system makes it possible to integrate and compare different signal-modeling approaches. Then, the question of control is addressed, encompassing the automatic generation of f0, intensity, and phoneme durations. The modeling of specific singing styles has also been addressed, by learning the expressive variations of the modeled control parameters from commercial recordings of famous French singers. Finally, some investigations of expressive timbre transformations have been conducted, for future integration into our synthesizer. This mainly concerns methods related to intensity transformation, considering the effects of both the glottal source and the vocal tract, and the modeling of vocal roughness.
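The control layer this abstract describes (automatic generation of an expressive f0 curve from note targets) can be illustrated with a toy contour generator. Everything below — function names, frame rate, vibrato settings — is an illustrative assumption, not code from the thesis.

```python
import math

def midi_to_hz(note):
    """Equal-temperament MIDI note number to frequency in Hz."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)

def f0_contour(notes, fs=100, vibrato_rate=5.5, vibrato_cents=30.0):
    """Toy singing-voice f0 contour: for each (midi_note, duration_s)
    pair, hold the note's pitch and add sinusoidal vibrato, expressed
    as a deviation in cents, sampled at fs frames per second."""
    contour, t = [], 0.0
    for note, dur in notes:
        base = midi_to_hz(note)
        for _ in range(int(round(dur * fs))):
            cents = vibrato_cents * math.sin(2.0 * math.pi * vibrato_rate * t)
            contour.append(base * 2.0 ** (cents / 1200.0))
            t += 1.0 / fs
    return contour

# A4 for 0.5 s, then C5 for 0.5 s: 100 frames at 100 frames/s
curve = f0_contour([(69, 0.5), (72, 0.5)])
```

A real system, as the abstract notes, would learn style-specific transitions, intensity, and phoneme durations from recordings rather than hard-coding a single vibrato.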
Degottex, Gilles. "Glottal source and vocal-tract separation : estimation of glottal parameters, voice transformation and synthesis using a glottal model." Paris 6, 2010. http://www.theses.fr/2010PA066399.
Loscos, Àlex. "Spectral processing of the singing voice." Doctoral thesis, Universitat Pompeu Fabra, 2007. http://hdl.handle.net/10803/7542.
Full textLa tesi presenta nous procediments i formulacions per a la descripció i transformació d'aquells atributs específicament vocals de la veu cantada. La tesis inclou, entre d'altres, algorismes per l'anàlisi i la generació de desordres vocals como ara rugositat, ronquera, o veu aspirada, detecció i modificació de la freqüència fonamental de la veu, detecció de nasalitat, conversió de veu cantada a melodia, detecció de cops de veu, mutació de veu cantada, i transformació de veu a instrument; exemplificant alguns d'aquests algorismes en aplicacions concretes.
Esta tesis doctoral versa sobre el procesado digital de la voz cantada, más concretamente, sobre el análisis, transformación y síntesis de este tipo de voz basándose e dominio espectral, con especial énfasis en aquellas técnicas relevantes para el desarrollo de aplicaciones musicales.
La tesis presenta nuevos procedimientos y formulaciones para la descripción y transformación de aquellos atributos específicamente vocales de la voz cantada. La tesis incluye, entre otros, algoritmos para el análisis y la generación de desórdenes vocales como rugosidad, ronquera, o voz aspirada, detección y modificación de la frecuencia fundamental de la voz, detección de nasalidad, conversión de voz cantada a melodía, detección de los golpes de voz, mutación de voz cantada, y transformación de voz a instrumento; ejemplificando algunos de éstos en aplicaciones concretas.
This dissertation is centered on the digital processing of the singing voice, more concretely on the analysis, transformation and synthesis of this type of voice in the spectral domain, with special emphasis on those techniques relevant for music applications.
The thesis presents new formulations and procedures for both describing and transforming those attributes of the singing voice that can be regarded as voice-specific. The thesis includes, among others, algorithms for rough and growl analysis and transformation, breathiness estimation and emulation, pitch detection and modification, nasality identification, voice-to-melody conversion, voice beat-onset detection, singing-voice morphing, and voice-to-instrument transformation; some of these are exemplified with concrete applications.
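Pitch detection, one of the algorithms listed above, can be sketched with a plain autocorrelation peak-pick — a generic textbook method, not the estimator used in the thesis:

```python
import math

def detect_pitch(signal, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate f0 by picking the autocorrelation peak within the
    lag range that corresponds to the [fmin, fmax] frequency band."""
    lag_min = int(sample_rate / fmax)
    lag_max = int(sample_rate / fmin)
    best_lag, best_corr = 0, 0.0
    for lag in range(lag_min, min(lag_max, len(signal) - 1)):
        # correlation of the signal with itself shifted by `lag` samples
        corr = sum(signal[i] * signal[i + lag]
                   for i in range(len(signal) - lag))
        if corr > best_corr:
            best_corr, best_lag = corr, lag
    return sample_rate / best_lag if best_lag else 0.0

# 220 Hz test tone at 8 kHz: the estimate lands within a few Hz
sr = 8000
tone = [math.sin(2 * math.pi * 220 * n / sr) for n in range(2048)]
f0 = detect_pitch(tone, sr)
```

Real singing-voice estimators add windowing, sub-sample interpolation, and a voiced/unvoiced decision on top of this basic idea.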
Calzada, Defez Àngel. "Conveying expressivity and vocal effort transformation in synthetic speech with Harmonic plus Noise Models." Doctoral thesis, Universitat Ramon Llull, 2016. http://hdl.handle.net/10803/360587.
This thesis was conducted in the Grup en Tecnologies Mèdia (GTM) of the Escola d'Enginyeria i Arquitectura la Salle. The group has a long trajectory in the speech synthesis field and has developed its own Unit-Selection Text-To-Speech (US-TTS) system, which conveys multiple expressive styles using multiple expressive corpora, one for each style. Thus, in order to convey aggressive speech, the US-TTS uses an aggressive corpus, whereas for a sensual speech style, the system uses a sensual corpus. Unlike that approach, this dissertation presents a new schema for enhancing the flexibility of the US-TTS system so that it can perform multiple expressive styles using a single neutral corpus. The approach followed in this dissertation applies Digital Signal Processing (DSP) techniques to modify the speech signal so that it conveys the desired expressive style. The Harmonics plus Noise Model (HNM) was chosen for these modifications because of its flexibility in signal manipulation. Voice Quality (VoQ) has been shown to play an important role in different expressive styles, so low-level VoQ acoustic parameters were explored for conveying multiple emotions. This raised several problems that set new objectives for the rest of the thesis, among them finding a single parameter with a strong impact on the conveyed expressive style. Vocal Effort (VE) was selected for conducting expressive speech-style modifications because of its salient role in expressive speech. The first approach transferred VE between two parallel utterances using the Adaptive Pre-emphasis Linear Prediction (APLP) technique. This approach succeeded in transferring VE, but the model was limited in its ability to generate new intermediate VE levels. Aiming to improve the flexibility and control of the conveyed VE, a new approach using a polynomial model of VE was presented.
This model not only allowed transferring VE levels between two different utterances, but also made it possible to generate VE levels other than those present in the speech corpus. This is aligned with the general goal of this thesis: allowing US-TTS systems to convey multiple expressive styles with a single neutral corpus. Moreover, the proposed methodology introduces a parameter for controlling the degree of VE in the synthesized speech signal. This opens new possibilities for controlling the synthesis process, such as the simple and intuitive graphical interfaces developed in the CreaVeu project, also conducted in the GTM group. The dissertation concludes with a review of the conducted work and a proposal for schema modifications within a US-TTS system, introducing the VE-modification blocks designed in this dissertation.
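Vocal effort is closely tied to spectral tilt, which is what the pre-emphasis-based transfer described above manipulates. A minimal stand-in for that idea is a first-order tilt filter — an illustrative sketch, not the dissertation's APLP implementation:

```python
def tilt_filter(signal, alpha):
    """y[n] = x[n] - alpha * x[n-1]: a first-order spectral-tilt knob.
    Positive alpha boosts high frequencies (a brighter, higher-effort
    timbre); negative alpha boosts lows (a softer, lower-effort one)."""
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - alpha * signal[n - 1])
    return out

# A Nyquist-rate component (alternating samples) is amplified...
bright = tilt_filter([1.0, -1.0] * 4, 0.9)
# ...while a DC component is attenuated.
soft = tilt_filter([1.0] * 8, 0.9)
```

Sweeping alpha gives a continuous control parameter of the kind the thesis argues for, here in the crudest possible form.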
Тодорів, Андрій Дмитрович. "Система багатофакторної аутентифікації користувачів комп’ютерних систем" [A multi-factor authentication system for computer system users]. Master's thesis, КПІ ім. Ігоря Сікорського, 2020. https://ela.kpi.ua/handle/123456789/38366.
Topic relevance. In the twenty-first century, the problem of corporate data protection has moved beyond physical interaction with employees, because the information to be protected now exists in computer form. This has created the need to develop and implement new mechanisms for protecting corporate data. The proposed authentication system for computer system users, built on neural network technologies, identifies users from individual anthropometric visual and voice characteristics in order to prevent theft of corporate data and to identify malicious actors. The object of study is the transformation of anthropometric indicators into computer form. The subject of study is pattern-recognition mechanisms. The goal of this work is to improve biometric identification methods by developing a new neural-network-based architecture. Study methods: comparison of existing algorithms against the criteria of accuracy, speed, resource cost, and reliability, in order to implement and further modify the corporate control system. The scientific novelty is a new mechanism for identifying subjects that combines voice and visual identification algorithms. The practical value lies in the possibility of using this system in a corporate environment to prevent data leakage and identify malicious actors; its low resource consumption makes the developed algorithm suitable for highly loaded systems. Structure and scope of work: the master's dissertation consists of an introduction, four chapters, conclusions, and appendices. The introduction analyzes the problem of corporate data protection, substantiates the prospects of solving it with biometric voice and visual identification, and surveys biometric identification algorithms.
The first section describes existing algorithms for recognizing visual and voice patterns. The second section investigates the feasibility of using existing algorithms for voice and visual biometric identification, and analyzes and compares existing image-recognition architectures. The third section describes the development of algorithms for visual and voice biometric user identification. The fourth section presents the characteristics of the developed system and the test results; the system is evaluated on different data sets and modified to achieve the specified accuracy. The conclusions summarize the results of the research and development.
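The combined voice-plus-visual identification described above reduces, at its simplest, to score-level fusion of the two matchers. The sketch below uses hypothetical names, weights, and a hypothetical threshold — none of it is from the dissertation — to show that decision rule:

```python
def fuse_scores(voice_score, face_score, w_voice=0.5, threshold=0.7):
    """Weighted score-level fusion of two biometric matchers.
    Both scores are assumed normalized to [0, 1]; returns the fused
    score and the accept/reject decision against the threshold."""
    fused = w_voice * voice_score + (1.0 - w_voice) * face_score
    return fused, fused >= threshold

# Strong match on both modalities -> accepted
score, accepted = fuse_scores(0.9, 0.8)
```

A production system would calibrate the weights and threshold on held-out genuine/impostor score distributions rather than fixing them by hand.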
Book chapters on the topic "Voice signal transformation"
Alsubari, Akram, Ghanshyam D. Ramteke, and Rakesh J. Ramteke. "Transformation of Voice Signals to Spatial Domain for Code Optimization in Digital Image Processing." In Communications in Computer and Information Science, 196–209. Singapore: Springer Singapore, 2021. http://dx.doi.org/10.1007/978-981-16-0493-5_18.
Kumekawa, Ian. "Retreat to the Ivory Tower." In The First Serious Optimist. Princeton University Press, 2017. http://dx.doi.org/10.23943/princeton/9780691163482.003.0006.
Gaines, Malik. "The Cockettes, Sylvester, and Performance as Life." In Black Performance on the Outskirts of the Left. NYU Press, 2017. http://dx.doi.org/10.18574/nyu/9781479837038.003.0005.
Full textConference papers on the topic "Voice signal transformation"
Stylianou, Yannis. "Voice Transformation: A survey." In ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2009. http://dx.doi.org/10.1109/icassp.2009.4960401.
Jin, Qin, Arthur R. Toth, Tanja Schultz, and Alan W. Black. "Voice convergin: Speaker de-identification by voice transformation." In ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2009. http://dx.doi.org/10.1109/icassp.2009.4960482.
Valbret, H., E. Moulines, and J. P. Tubach. "Voice transformation using PSOLA technique." In [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing. IEEE, 1992. http://dx.doi.org/10.1109/icassp.1992.225951.
Popa, Victor, Hanna Silen, Jani Nurminen, and Moncef Gabbouj. "Local linear transformation for voice conversion." In ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2012. http://dx.doi.org/10.1109/icassp.2012.6288922.
Sisman, Berrak, Haizhou Li, and Kay Chen Tan. "Transformation of prosody in voice conversion." In 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). IEEE, 2017. http://dx.doi.org/10.1109/apsipa.2017.8282288.
Fernandez, Raul, Andrew Rosenberg, Alexander Sorin, Bhuvana Ramabhadran, and Ron Hoory. "Voice-transformation-based data augmentation for prosodic classification." In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2017. http://dx.doi.org/10.1109/icassp.2017.7953214.
Takahashi, Masahito, and Hiroki Matsumoto. "A consideration on the method of transformation from whisper voice to voice sounds." In 2011 International Symposium on Intelligent Signal Processing and Communications Systems (ISPACS 2011). IEEE, 2011. http://dx.doi.org/10.1109/ispacs.2011.6146160.
Jin, Qin, Arthur R. Toth, Alan W. Black, and Tanja Schultz. "Is voice transformation a threat to speaker identification?" In ICASSP 2008 - 2008 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2008. http://dx.doi.org/10.1109/icassp.2008.4518742.
Turk, Oytun, Osman Buyuk, Ali Haznedaroglu, and Levent M. Arslan. "Application of voice conversion for cross-language rap singing transformation." In ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2009. http://dx.doi.org/10.1109/icassp.2009.4960404.
Chithra, PL, and R. Aparna. "Voice Signal Encryption Scheme Using Transformation and Embedding Techniques for Enhanced Security." In 2018 2nd International Conference on Imaging, Signal Processing and Communication (ICISPC). IEEE, 2018. http://dx.doi.org/10.1109/icispc44900.2018.9006681.