Click this link to see other types of publications on the topic: VOICE SIGNALS.

Journal articles on the topic "VOICE SIGNALS"

Create a correct reference in APA, MLA, Chicago, Harvard, and many other styles

Select a source type:

Consult the top 50 scholarly journal articles on the topic "VOICE SIGNALS".

Next to every entry in the bibliography there is an "Add to bibliography" button. Press it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scholarly publication as a .pdf file and read its abstract online, if the corresponding parameters are available in the work's metadata.

Browse journal articles from a wide variety of disciplines and compile appropriate bibliographies.

1

Ahamed, Mohamed Rasmi Ashfaq, Mohammad Hossein Babini, and Hamidreza Namazi. "Complexity-based decoding of the relation between human voice and brain activity". Technology and Health Care 28, no. 6 (November 17, 2020): 665–74. http://dx.doi.org/10.3233/thc-192105.

Full text source
Abstract:
BACKGROUND: The human voice is the main feature of human communication. It is known that the brain controls the human voice; therefore, there should be a relation between the characteristics of the voice and brain activity. OBJECTIVE: In this research, electroencephalography (EEG), as a measure of brain activity, and voice signals were analyzed simultaneously. METHOD: For this purpose, we changed the activity of the human brain by applying different odours and simultaneously recorded participants' voices and EEG signals while they read a text. For the analysis, we used fractal theory, which deals with the complexity of objects. The fractal dimension of the EEG signal versus the voice signal at different levels of brain activity was computed and analyzed. RESULTS: The results indicate that the activity of the human voice is related to brain activity: variations in the complexity of the EEG signal are linked to variations in the complexity of the voice signal. In addition, the EEG and voice signal complexities are related to the molecular complexity of the applied odours. CONCLUSION: The method of analysis employed in this research can be widely applied to other physiological signals in order to relate the activities of different human organs, such as the heart, to the activity of the brain.
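The fractal-dimension comparison described in this abstract can be illustrated with Higuchi's method, a standard fractal-dimension estimator for 1-D signals. The sketch below is illustrative only; the paper does not publish its code, and its exact estimator may differ.

```python
import numpy as np

def higuchi_fd(x, kmax=8):
    """Estimate the fractal dimension of a 1-D signal with Higuchi's method."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    ks = np.arange(1, kmax + 1)
    lk = []
    for k in ks:
        lengths = []
        for m in range(k):
            idx = np.arange(m, n, k)  # subsampled curve x[m], x[m+k], ...
            if len(idx) < 2:
                continue
            dist = np.abs(np.diff(x[idx])).sum()
            # Higuchi's normalization for unequal subsample lengths.
            lengths.append(dist * (n - 1) / ((len(idx) - 1) * k) / k)
        lk.append(np.mean(lengths))
    # Fractal dimension = slope of log L(k) against log (1/k).
    slope, _ = np.polyfit(np.log(1.0 / ks), np.log(lk), 1)
    return slope
```

A smooth signal such as a sine wave yields a dimension near 1, while white noise approaches 2; complexity analyses like the one above track how such values change across conditions.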
2

Mittal, Vikas, and R. K. Sharma. "Classification of Pathological Voices Using Glottal Signal Parameters". Journal of Computational and Theoretical Nanoscience 16, no. 9 (September 1, 2019): 3999–4002. http://dx.doi.org/10.1166/jctn.2019.8284.

Full text source
Abstract:
The discrimination of voice signals has numerous applications in the diagnosis of voice-related pathologies. This paper discusses the use of the glottal signal to recognize two voice disorders: laryngitis and laryngeal dystonia (LD). Parameters of the glottal signal serve as input to classifiers that assign speakers to three groups: speakers with laryngitis, speakers with laryngeal dystonia, and speakers with healthy voices. The database is composed of voice recordings containing samples of the three groups. For laryngitis, the SVM, KNN, and Ensemble classifiers achieved 60%, 70%, and 80% classification accuracy, respectively. Voice signals of patients affected by laryngeal dystonia were also collected and tested with the same classifiers, and accuracies of 90%, 80%, and 50% were obtained with SVM, KNN, and Ensemble, respectively.
3

Silva, Augusto Felix Tavares, Samuel R. de Abreu, Silvana Cunha Costa, and Suzete Elida Nobrega Correia. "Classificação de sinais de voz através da aplicação da transformada Wavelet Packet e redes neurais artificiais". Revista Principia - Divulgação Científica e Tecnológica do IFPB 1, no. 37 (December 21, 2017): 34. http://dx.doi.org/10.18265/1517-03062015v1n37p34-41.

Full text source
Abstract:
Pathologies such as edema, nodules and paralysis are quite recurrent and directly influence vocal dysfunctions. Acoustic analysis has been used to evaluate the disorders caused in voice signals, detecting the presence of pathologies in the larynx through digital signal processing techniques. This work aims to distinguish healthy voice signals from those affected by laryngeal pathologies, using the Wavelet Packet transform in the feature extraction step. Energy and entropy measures, at six resolution levels, obtained through the Daubechies wavelet of order 4, are used in the discrimination of the voice signals. The classification is done through Artificial Neural Networks. Accuracies above 90% were obtained with the entropy measure in the discrimination between healthy voices and ones affected by vocal fold pathologies (nodules, Reinke's edema, and paralysis).
4

Choi, Hee-Jin, and Ji-Yeoun Lee. "Comparative Study between Healthy Young and Elderly Subjects: Higher-Order Statistical Parameters as Indices of Vocal Aging and Sex". Applied Sciences 11, no. 15 (July 28, 2021): 6966. http://dx.doi.org/10.3390/app11156966.

Full text source
Abstract:
The objective of this study was to test higher-order statistical (HOS) parameters for the classification of young and elderly voice signals and identify gender- and age-related differences through HOS analysis. This study was based on data from 116 subjects (58 females and 58 males) extracted from the Saarbruecken voice database. In the gender analysis, the same number of voice samples were analyzed for each sex. Further, we conducted experiments on the voices of elderly people using gender analysis. Finally, we reviewed the standards and reference models to reduce sex and gender bias. The acoustic parameters were extracted from young and elderly voice signals using Praat and a time–frequency analysis program (TF32). Additionally, we investigated the gender- and age-related differences in HOS parameters. Young and elderly voice signals significantly differed in normalized skewness (p = 0.005) in women and normalized kurtosis (p = 0.011) in men. Therefore, normalized skewness is a useful parameter for distinguishing between young and elderly female voices, and normalized kurtosis is essential for distinguishing between young and elderly male voices. We will continue to investigate parameters that represent important information in elderly voice signals.
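The higher-order statistical (HOS) parameters named in this abstract, normalized skewness and kurtosis, can be computed from a voice signal with standard sample estimators. The sketch below uses SciPy's estimators after mean removal and variance normalization; the paper's exact normalization may differ.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def hos_parameters(signal):
    """Third- and fourth-order statistics of a signal, computed after
    removing the mean and scaling to unit variance."""
    s = np.asarray(signal, dtype=float)
    s = (s - s.mean()) / s.std()
    return {
        "skewness": skew(s),                    # asymmetry of the amplitude distribution
        "kurtosis": kurtosis(s, fisher=False),  # peakedness; 3 for a Gaussian
    }
```

For a Gaussian signal the skewness is near 0 and the kurtosis near 3; group comparisons like the one above test whether these values differ significantly between young and elderly voices.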
5

Swanborough, Huw, Matthias Staib, and Sascha Frühholz. "Neurocognitive dynamics of near-threshold voice signal detection and affective voice evaluation". Science Advances 6, no. 50 (December 2020): eabb3884. http://dx.doi.org/10.1126/sciadv.abb3884.

Full text source
Abstract:
Communication and voice signal detection in noisy environments are universal tasks for many species. The fundamental problem of detecting voice signals in noise (VIN) is underinvestigated especially in its temporal dynamic properties. We investigated VIN as a dynamic signal-to-noise ratio (SNR) problem to determine the neurocognitive dynamics of subthreshold evidence accrual and near-threshold voice signal detection. Experiment 1 showed that dynamic VIN, including a varying SNR and subthreshold sensory evidence accrual, is superior to similar conditions with nondynamic SNRs or with acoustically matched sounds. Furthermore, voice signals with affective meaning have a detection advantage during VIN. Experiment 2 demonstrated that VIN is driven by an effective neural integration in an auditory cortical-limbic network at and beyond the near-threshold detection point, which is preceded by activity in subcortical auditory nuclei. This demonstrates the superior recognition advantage of communication signals in dynamic noise contexts, especially when carrying socio-affective meaning.
6

Liu, Boquan, Evan Polce, and Jack Jiang. "Application of Local Intrinsic Dimension for Acoustical Analysis of Voice Signal Components". Annals of Otology, Rhinology & Laryngology 127, no. 9 (June 17, 2018): 588–97. http://dx.doi.org/10.1177/0003489418780439.

Full text source
Abstract:
Purpose: The overall aim of this study was to apply local intrinsic dimension (Di) estimation to quantify high-dimensional, disordered voice and discriminate between the 4 types of voice signals. It was predicted that continuous Di analysis throughout the entire time-series would generate comprehensive descriptions of voice signal components, called voice type component profiles (VTCP), that effectively distinguish between the 4 voice types. Method: One hundred thirty-five voice recording samples of the sustained vowel /a/ were obtained from the Disordered Voice Database Model 4337 and spectrographically classified into the voice type paradigm. The Di and correlation dimension (D2) were then used to objectively analyze the voice samples and compared based on voice type differentiation efficacy. Results: The D2 exhibited limited effectiveness in distinguishing between the 4 voice type signals. For Di analysis, significant differences were primarily observed when comparing voice type component 1 (VTC1) and 4 (VTC4) across the 4 voice type signals (P < .001). The 4 voice type components (VTCs) significantly differentiated between low-dimensional, type 3 and high-dimensional, type 4 signals (P < .001). Conclusions: The Di demonstrated improvements over D2 in 2 distinct manners: enhanced resolution at high data dimensions and comprehensive description of voice signal elements.
7

Martin, David P., and Virginia I. Wolfe. "Effects of Perceptual Training Based upon Synthesized Voice Signals". Perceptual and Motor Skills 83, no. 3_suppl (December 1996): 1291–98. http://dx.doi.org/10.2466/pms.1996.83.3f.1291.

Full text source
Abstract:
28 undergraduate students participated in a perceptual voice experiment to assess the effects of training utilizing synthesized voice signals. An instructional strategy based upon synthesized examples of a three-part classification system: “breathy,” “rough,” and “hoarse,” was employed. Training samples were synthesized with varying amounts of jitter (cycle-to-cycle deviation in pitch period) and harmonic-to-noise ratios to represent these qualities. Before training, listeners categorized 60 pathological voices into “breathy,” “rough,” and “hoarse,” largely on the basis of fundamental frequency. After training, categorizations were influenced by harmonic-to-noise ratios as well as fundamental frequency, suggesting that listeners were more aware of spectral differences in pathological voices associated with commonly occurring laryngeal conditions. 40% of the pathological voice samples remained unclassified following training.
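Jitter, one of the synthesis parameters mentioned in this abstract, quantifies cycle-to-cycle deviation in pitch period. A minimal sketch of local jitter computed from a sequence of already-extracted pitch periods follows (period extraction itself is assumed; this is not the study's code).

```python
import numpy as np

def local_jitter_percent(periods):
    """Local jitter (%): mean absolute difference between consecutive
    pitch periods, relative to the mean period."""
    p = np.asarray(periods, dtype=float)
    return 100.0 * np.mean(np.abs(np.diff(p))) / np.mean(p)
```

A perfectly periodic voice has 0% jitter; "rough" or "hoarse" qualities are typically synthesized by raising it.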
8

Zhu, Xin-Cheng, Deng-Huang Zhao, Yi-Hua Zhang, Xiao-Jun Zhang, and Zhi Tao. "Multi-Scale Recurrence Quantification Measurements for Voice Disorder Detection". Applied Sciences 12, no. 18 (September 14, 2022): 9196. http://dx.doi.org/10.3390/app12189196.

Full text source
Abstract:
Due to the complexity and non-stationarity of the voice generation system, the nonlinearity of speech signals cannot be accurately quantified. Recently, the recurrence quantification analysis method has been used for voice disorder detection. In this paper, multiscale recurrence quantification measures (MRQMs) are proposed. The signals are reconstructed in the high-dimensional phase space at the equivalent rectangular bandwidth scale. Recurrence plots (RPs) combining the characteristics of human auditory perception are drawn with an appropriate recurrence threshold. Based on the above, the nonlinear dynamic recurrence features of the speech signal are quantized from the recurrence plot of each frequency channel. Furthermore, this paper explores the recurrence quantification thresholds that are most suitable for pathological voices. Our results show that the proposed MRQMs with support vector machine (SVM), random forest (RF), Bayesian network (BN) and Local Weighted Learning (LWL) achieve an average accuracy of 99.45%, outperforming traditional features and other complex measurements. In addition, MRQMs also have the potential for multi-classification of voice disorder, achieving an accuracy of 89.05%. This study demonstrates that MRQMs can characterize the recurrence characteristic of pathological voices and effectively detect voice disorders.
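The recurrence quantification idea in this abstract starts from a recurrence plot: the signal is embedded in phase space and pairs of near-identical states are marked. A bare-bones NumPy sketch follows; the paper's multiscale, auditory-filterbank version is considerably more elaborate, and the parameter defaults here are illustrative.

```python
import numpy as np

def recurrence_plot(x, dim=3, tau=1, eps=0.1):
    """Binary recurrence matrix of a time-delay-embedded signal."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * tau
    # Time-delay embedding into a dim-dimensional phase space.
    states = np.column_stack([x[i * tau : i * tau + n] for i in range(dim)])
    # Mark state pairs whose Euclidean distance is within the threshold.
    dists = np.linalg.norm(states[:, None, :] - states[None, :, :], axis=-1)
    return (dists <= eps).astype(np.uint8)

def recurrence_rate(rp):
    """Fraction of recurrent points: the simplest recurrence measure."""
    return float(rp.mean())
```

Richer measures (determinism, laminarity, entropy of diagonal-line lengths) are then computed from the structure of this matrix.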
9

Bartusiak, Emily R., and Edward J. Delp. "Frequency Domain-Based Detection of Generated Audio". Electronic Imaging 2021, no. 4 (January 18, 2021): 273–1. http://dx.doi.org/10.2352/issn.2470-1173.2021.4.mwsf-273.

Full text source
Abstract:
Attackers may manipulate audio with the intent of presenting falsified reports, changing an opinion of a public figure, and winning influence and power. The prevalence of inauthentic multimedia continues to rise, so it is imperative to develop a set of tools that determines the legitimacy of media. We present a method that analyzes audio signals to determine whether they contain real human voices or fake human voices (i.e., voices generated by neural acoustic and waveform models). Instead of analyzing the audio signals directly, the proposed approach converts the audio signals into spectrogram images displaying frequency, intensity, and temporal content and evaluates them with a Convolutional Neural Network (CNN). Trained on both genuine human voice signals and synthesized voice signals, we show our approach achieves high accuracy on this classification task.
10

Liu, Boquan, Evan Polce, Julien C. Sprott, and Jack J. Jiang. "Applied Chaos Level Test for Validation of Signal Conditions Underlying Optimal Performance of Voice Classification Methods". Journal of Speech, Language, and Hearing Research 61, no. 5 (May 17, 2018): 1130–39. http://dx.doi.org/10.1044/2018_jslhr-s-17-0250.

Full text source
Abstract:
Purpose The purpose of this study is to introduce a chaos level test to evaluate linear and nonlinear voice type classification method performances under varying signal chaos conditions without subjective impression. Study Design Voice signals were constructed with differing degrees of noise to model signal chaos. Within each noise power, 100 Monte Carlo experiments were applied to analyze the output of jitter, shimmer, correlation dimension, and spectrum convergence ratio. The computational output of the 4 classifiers was then plotted against signal chaos level to investigate the performance of these acoustic analysis methods under varying degrees of signal chaos. Method A diffusive behavior detection–based chaos level test was used to investigate the performances of different voice classification methods. Voice signals were constructed by varying the signal-to-noise ratio to establish differing signal chaos conditions. Results Chaos level increased sigmoidally with increasing noise power. Jitter and shimmer performed optimally when the chaos level was less than or equal to 0.01, whereas correlation dimension was capable of analyzing signals with chaos levels of less than or equal to 0.0179. Spectrum convergence ratio demonstrated proficiency in analyzing voice signals with all chaos levels investigated in this study. Conclusion The results of this study corroborate the performance relationships observed in previous studies and, therefore, demonstrate the validity of the validation test method. The presented chaos level validation test could be broadly utilized to evaluate acoustic analysis methods and establish the most appropriate methodology for objective voice analysis in clinical practice.
11

Geng, Lei, Hongfeng Shan, Zhitao Xiao, Wei Wang, and Mei Wei. "Voice pathology detection and classification from speech signals and EGG signals based on a multimodal fusion method". Biomedical Engineering / Biomedizinische Technik 66, no. 6 (November 29, 2021): 613–25. http://dx.doi.org/10.1515/bmt-2021-0112.

Full text source
Abstract:
Automatic voice pathology detection and classification plays an important role in the diagnosis and prevention of voice disorders. To accurately describe the pronunciation characteristics of patients with dysarthria and improve the effect of pathological voice detection, this study proposes a pathological voice detection method based on a multi-modal network structure. First, speech signals and electroglottography (EGG) signals are mapped from the time domain to the frequency domain spectrogram via a short-time Fourier transform (STFT). The Mel filter bank acts on the spectrogram to enhance the signal’s harmonics and denoise. Second, a pre-trained convolutional neural network (CNN) is used as the backbone network to extract sound state features and vocal cord vibration features from the two signals. To obtain a better classification effect, the fused features are input into the long short-term memory (LSTM) network for voice feature selection and enhancement. The proposed system achieves 95.73% for accuracy with 96.10% F1-score and 96.73% recall using the Saarbrucken Voice Database (SVD); thus, enabling a new method for pathological speech detection.
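The front end this abstract describes, an STFT followed by a Mel filter bank, can be sketched with SciPy and NumPy. Filter-bank details below (number of bands, FFT size) are illustrative choices, not taken from the paper.

```python
import numpy as np
from scipy.signal import stft

def mel_filterbank(sr, n_fft, n_mels=40):
    """Triangular filters spaced evenly on the mel scale."""
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    hz = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    bins = np.floor((n_fft + 1) * hz / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, center, hi = bins[i], bins[i + 1], bins[i + 2]
        for k in range(lo, center):          # rising slope of triangle i
            fb[i, k] = (k - lo) / max(center - lo, 1)
        for k in range(center, hi):          # falling slope of triangle i
            fb[i, k] = (hi - k) / max(hi - center, 1)
    return fb

def mel_spectrogram(x, sr, n_fft=512, n_mels=40):
    """Power spectrogram of the signal mapped through the mel filter bank."""
    _, _, z = stft(x, fs=sr, nperseg=n_fft)
    return mel_filterbank(sr, n_fft, n_mels) @ (np.abs(z) ** 2)
```

The resulting (n_mels × frames) matrix is the kind of time-frequency image that is then fed to the CNN backbone.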
12

McNair, Bruce E. "Processing of encrypted voice signals". Journal of the Acoustical Society of America 83, no. 6 (June 1988): 2474. http://dx.doi.org/10.1121/1.396315.

Full text source
13

Scalassara, P., C. Maciel, and J. Pereira. "Predictability analysis of voice signals". IEEE Engineering in Medicine and Biology Magazine 28, no. 5 (September 2009): 30–34. http://dx.doi.org/10.1109/memb.2009.934245.

Full text source
14

Fonseca, E. S., and J. C. Pereira. "Normal versus pathological voice signals". IEEE Engineering in Medicine and Biology Magazine 28, no. 5 (September 2009): 44–48. http://dx.doi.org/10.1109/memb.2009.934248.

Full text source
15

Kreiman, Jody. "Why we talk about voices as we do". Journal of the Acoustical Society of America 153, no. 3_supplement (March 1, 2023): A78. http://dx.doi.org/10.1121/10.0018224.

Full text source
Abstract:
The problem of how to characterize voice quality is an endless source of debate and frustration across disciplines. The richness of the vocabulary available to describe voice is overwhelming, but the density of the information conveyed by voice has led some scholars to conclude that language can never adequately specify what we hear. Others have argued that terminology derives from tradition and lacks an empirical basis, so that language-based scales are inadequate a priori. Finally, efforts to link terms to acoustic signal characteristics have had limited success. However, a reconsideration suggests that a few terms appear consistently across studies, disciplines, and eras. These terms align with dimensions that account for acoustic variance in voice across speakers, regardless of gender, language spoken, or the kind of speech sample, and correlate with physical size and arousal across many species. They, thus, may have an evolutionary basis. This suggests talk about voices rests on a bedrock of biology: We have evolved to perceive voices in terms of size/arousal, and these factors structure both voice acoustics and the language we use to describe voices. Such linkages could help integrate studies of physical signals and their meaning, producing a truly interdisciplinary approach to voice.
16

Hyun Kim, Bong. "Analysis of Voice Signals Change by Voice Modulation Program". International Journal of Engineering & Technology 7, no. 3.34 (September 1, 2018): 506. http://dx.doi.org/10.14419/ijet.v7i3.34.19369.

Full text source
Abstract:
Background/Objectives: Voice modulation is used in various fields. It is widely used in entertainment programs to alter voices for viewers' amusement, and in news broadcasts to protect a victim's identity. Recently, however, voice modulation has been exploited for crime: as information and communication technology has developed rapidly, crime using voice modulation is increasing. Methods/Statistical analysis: Therefore, in this paper, the change in the voice signal is analyzed by comparing normal voices with modulated voices. For this purpose, normal voices were collected under the same conditions (place, time, microphone, etc.), and modulated voices were collected by applying a voice modulation program. In addition, various voice analysis parameters such as spectrum, formant, intensity, pitch, pulse, jitter, shimmer, DoVB, and NHR were applied in the study. Findings: Experimental results show that voice modulation changes various voice signal analysis parameters. In particular, the spectrum, pitch, and DoVB values of the modulated voice decreased compared with the normal voice, while the jitter, shimmer, and NHR values of the modulated voice were higher than those of the normal voice. There was no significant difference in the intensity, formant, or pulse measurements. Based on these results, it is possible to identify the voice analysis parameters changed by a voice modulation program. Improvements/Applications: Voice modulation is useful in various respects, such as entertainment and identity protection, but it has recently been exploited, for example in voice phishing. Therefore, in this paper, we measured the voice analysis parameters that are changed by a voice modulation program, compared and analyzed normal and modulated voices, and extracted the pattern of the voice signal changed by voice modulation.
17

Lei, Zhengdong, Lisa Martignetti, Chelsea Ridgway, Simon Peacock, Jon T. Sakata, and Nicole Y. K. Li-Jessen. "Wearable Neck Surface Accelerometers for Occupational Vocal Health Monitoring: Instrument and Analysis Validation Study". JMIR Formative Research 6, no. 8 (August 5, 2022): e39789. http://dx.doi.org/10.2196/39789.

Full text source
Abstract:
Background Neck surface accelerometer (NSA) wearable devices have been developed for voice and upper airway health monitoring. As opposed to acoustic sounds, NSA senses mechanical vibrations propagated from the vocal tract to neck skin, which are indicative of a person’s voice and airway conditions. NSA signals do not carry identifiable speech information and a speaker’s privacy is thus protected, which is important and necessary for continuous wearable monitoring. Our device was already tested for its durable endurance and signal processing algorithms in controlled laboratory conditions. Objective This study aims to further evaluate both instrument and analysis validity in a group of occupational vocal users, namely, voice actors, who use their voices extensively at work in an ecologically valid setting. Methods A total of 16 professional voice actors (age range 21-50 years; 11 females and 5 males) participated in this study. All participants were mounted with an NSA on their sternal notches during the voice acting and voice assessment sessions. The voice acting session was 4-hour long, directed by a voice director in a professional sound studio. Voice assessment sessions were conducted before, during, and 48 hours after the acting session. The assessment included phonation tasks of passage reading, sustained vowels, maximum vowel phonation, and pitch glides. Clinical acoustic metrics (eg, fundamental frequency, cepstral measures) and a vocal dose measure (ie, accumulated distance dose from acting) were computed from NSA signals. A commonly used online questionnaire (Self-Administered Voice Rating questionnaire) was also implemented to track participants’ perception of vocal fatigue. Results The NSA wearables stayed in place for all participants despite active body movements during the acting. The ensued body noise did not interfere with the NSA signal quality. All planned acoustic metrics were successfully derived from NSA signals and their numerical values were comparable with literature data. For a 4-hour long voice acting, the averaged distance dose was about 8354 m with no gender differences. Participants perceived vocal fatigue as early as 2 hours after the start of voice acting, with recovery 24-48 hours after the acting session. Among all acoustic metrics across phonation tasks, cepstral peak prominence and spectral tilt from the passage reading most closely mirrored trends in perceived fatigue. Conclusions The ecological validity of an in-house NSA wearable was vetted in a workplace setting. One key application of this wearable is to prompt occupational voice users when their vocal safety limits are reached for duly protection. Signal processing algorithms can thus be further developed for near real-time estimation of clinically relevant metrics, such as accumulated distance dose, cepstral peak prominence, and spectral tilt. This functionality will enable continuous self-awareness of vocal behavior and protection of vocal safety in occupational voice users.
18

Wang, Hui Jun, and Guan Li. "A Design of the Bone Conduction Ultrasonic Hearing Device". Advanced Materials Research 1030-1032 (September 2014): 2330–33. http://dx.doi.org/10.4028/www.scientific.net/amr.1030-1032.2330.

Full text source
Abstract:
Because traditional air-conduction hearing aids do not help patients whose ear canals are blocked, this paper introduces a kind of ultrasonic hearing device. By modulating voice signals onto ultrasound delivered through bone conduction, hearing-impaired patients can gain a certain degree of hearing. The device, with a TMS320VC5410 as its signal processing unit, modulates the voice signals onto an ultrasonic carrier and transmits them through a bone conduction headphone to the human auditory nerve. The experimental results show that the hearing device can help patients with severe deafness recognize sound and voice. As a result, it is of high application value.
19

Peterson, K. Linnea, Katherine Verdolini-Marston, Julie M. Barkmeier, and Henry T. Hoffman. "Comparison of Aerodynamic and Electroglottographic Parameters in Evaluating Clinically Relevant Voicing Patterns". Annals of Otology, Rhinology & Laryngology 103, no. 5 (May 1994): 335–46. http://dx.doi.org/10.1177/000348949410300501.

Full text source
Abstract:
The purpose of the present study was to identify one or more aerodynamic or electroglottographic measures that distinguish among voicing patterns that are clinically relevant for nodule pathogenesis and regression: a presumably pathogenic pattern (pressed voice), a neutral pattern (normal voice), and two presumably therapeutic patterns (resonant voice and breathy voice). Trained subjects with normal voices produced several tokens of each voice type on sustained vowels /a/, /i/, and /u/. For each token, maximum flow declination rate, alternating current flow, and minimum flow were obtained from inverse-filtered airflow signals, and closed quotient and closing time were obtained from electroglottographic signals. The results indicate that for /a/ and /i/ (but not for /u/), the closed quotient provides a sensitive tool for distinguishing the voice types in physiologically interpretable directions. Further, post-hoc analyses confirmed a direct relationship between the closed quotient and videoscopic ratings of laryngeal adduction, which previous work links to nodule pathogenesis and regression.
20

Chan, Karen M. K., and Edwin M.-L. Yiu. "The Effect of Anchors and Training on the Reliability of Perceptual Voice Evaluation". Journal of Speech, Language, and Hearing Research 45, no. 1 (February 2002): 111–26. http://dx.doi.org/10.1044/1092-4388(2002/009).

Full text source
Abstract:
Perceptual voice evaluation is a common clinical tool for rating the severity of vocal quality impairment. However, the evaluation process involves subjective judgment, and reliability is therefore a major issue that needs to be considered. When listeners are asked to judge the quality of a voice signal, they use their own internal standards as the references. These internal standards can be variable, as different individuals may have acquired different standards in prior situations. In order to improve the reliability of the perceptual voice evaluation process, external anchors and training are provided to counteract the effect of these internal standards. This study investigated to what extent the provision of anchors and a training program would improve the reliability of perceptual voice evaluation by naive listeners. The results show, in general, that anchors and training helped to improve the reliability of perceptual voice evaluation, especially in the rating of male voices. Furthermore, it was found that anchors made up of synthesized signals combined with training were more effective in improving reliability in judging perceptual roughness and breathiness than natural voice anchors.
21

Sprecher, Alicia, Aleksandra Olszewski, Jack J. Jiang, and Yu Zhang. "Updating signal typing in voice: Addition of type 4 signals". Journal of the Acoustical Society of America 127, no. 6 (June 2010): 3710–16. http://dx.doi.org/10.1121/1.3397477.

Full text source
22

Wei, Yan Ping, and Hai Liu Xiao. "Design of Voice Signal Visualization Acquisition System Based on Sound Card and MATLAB". Applied Mechanics and Materials 716-717 (December 2014): 1272–76. http://dx.doi.org/10.4028/www.scientific.net/amm.716-717.1272.

Full text source
Abstract:
With the development of computer and information technology, voice interaction has become a necessary means of human-computer interaction, and voice signal acquisition and processing is its precondition and foundation. This paper introduces the MATLAB visualization method into a voice signal acquisition system and uses MATLAB programming to drive the sound card directly, which realizes the identification and acquisition of voice signals in a new voice signal visualization acquisition system. To optimize the system, a variance analysis algorithm is introduced into the design, which optimizes the voice signal recognition model for different level parameters. Finally, the paper presents a numerical simulation of the speech signal acquisition system; through signal acquisition, 2D and 3D visualizations of the voice signals are obtained. Single-signal characteristics are extracted, which provides a theoretical reference for the design of signal acquisition systems.
23

Herzel, Hanspeter. "Bifurcations and Chaos in Voice Signals". Applied Mechanics Reviews 46, no. 7 (July 1, 1993): 399–413. http://dx.doi.org/10.1115/1.3120369.

Full text source
Abstract:
The basic physical mechanisms of speech production are described. A rich variety of bifurcations and episodes of irregular behaviour are observed. Poincaré sections and analysis of the underlying attractor suggest that these noise-like episodes are low-dimensional deterministic chaos. Possible implications for the very early diagnosis of brain disorders are discussed.
24

O'Callaghan, Tiffany. "Voice almighty: decoding speech's secret signals". New Scientist 219, no. 2925 (July 2013): 38–41. http://dx.doi.org/10.1016/s0262-4079(13)61754-6.

Full text source
Style APA, Harvard, Vancouver, ISO itp.
25

Al-Rawashdeh, A. Y., and Z. Al-Qadi. "Using Wave Equation to Extract Digital Signal Features". Engineering, Technology & Applied Science Research 8, no. 4 (18.08.2018): 3153–56. http://dx.doi.org/10.48084/etasr.2088.

Full text source
Abstract:
Voice signals are one of the most popular data types and are used in various applications such as security systems. In the current study, a method based on the wave equation was proposed, implemented, and tested. The method generates a feature array that can be used as a key to identify a voice signal, independent of the signal's type or size. Results indicated that the proposed method produces a unique feature array for each voice signal and can be faster than other feature extraction methods.
APA, Harvard, Vancouver, ISO and other styles
26

Ruben, Aarne. "The “unknown voice” in Western history since Socrates". Semiotica 2017, no. 215 (1.03.2017): 269–80. http://dx.doi.org/10.1515/sem-2016-0032.

Full text source
Abstract:
Socrates remains one of the most prominent paternal figures of Western dialogism and the phonocentric paradigm; the man who stirred up the dialectic imaginations of his day. In Plato's Socratic dialogues, his inner voice (daimonion) sounds as a last-instance statement casting light on the final solution of the conversation. In the context of antiquity and the cultural tradition that followed, Socrates was the only hearer of warning signals from inside. The rest of the voices were urging ones (voices of imaginable cursed souls, saints, angels, etc.). There was no need for a "personal" dictating voice when a divine dictation was already present. According to Charles Peirce's classification, a "voice without evident source", or a voice from the head, is a dicent indexical legisign.
APA, Harvard, Vancouver, ISO and other styles
27

Maryn, Youri, and Andrzej Zarowski. "Calibration of Clinical Audio Recording and Analysis Systems for Sound Intensity Measurement". American Journal of Speech-Language Pathology 24, no. 4 (November 2015): 608–18. http://dx.doi.org/10.1044/2015_ajslp-14-0082.

Full text source
Abstract:
Purpose Sound intensity is an important acoustic feature of voice/speech signals. Yet recordings are performed with different microphone, amplifier, and computer configurations, and it is therefore crucial to calibrate sound intensity measures of clinical audio recording and analysis systems on the basis of output of a sound-level meter. This study was designed to evaluate feasibility, validity, and accuracy of calibration methods, including audiometric speech noise signals and human voice signals under typical speech conditions. Method Calibration consisted of 3 comparisons between data from 29 measurement microphone-and-computer systems and data from the sound-level meter: signal-specific comparison with audiometric speech noise at 5 levels, signal-specific comparison with natural voice at 3 levels, and cross-signal comparison with natural voice at 3 levels. Intensity measures from recording systems were then linearly converted into calibrated data on the basis of these comparisons, and validity and accuracy of calibrated sound intensity were investigated. Results Very strong correlations and quasisimilarity were found between calibrated data and sound-level meter data across calibration methods and recording systems. Conclusions Calibration of clinical sound intensity measures according to this method is feasible, valid, accurate, and representative for a heterogeneous set of microphones and data acquisition systems in real-life circumstances with distinct noise contexts.
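The linear conversion of recording-system intensity measures to sound-level-meter values can be sketched as a least-squares fit between paired readings; the numbers below are hypothetical illustrations, not the study's data:

```python
import numpy as np

# Hypothetical paired readings: uncalibrated dB from a recording system
# and reference dB from a sound-level meter at the same signal levels.
system_db = np.array([48.2, 53.1, 58.4, 63.0, 68.1])   # recording system
slm_db    = np.array([55.0, 60.0, 65.0, 70.0, 75.0])   # sound-level meter

# Fit the linear conversion slm ~ a * system + b used to calibrate the system.
a, b = np.polyfit(system_db, slm_db, 1)

def calibrate(raw_db):
    """Convert an uncalibrated intensity reading to calibrated dB SPL."""
    return a * raw_db + b
```

Once fitted, the same conversion can be applied to any intensity measure produced by that microphone-and-computer system.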
APA, Harvard, Vancouver, ISO and other styles
28

van de Wouwer, G., P. Scheunders, D. van Dyck, M. de Bodt, F. Wuyts and P. H. van de Heyning. "Voice Recognition from Spectrograms: A Wavelet Based Approach". Fractals 05, supp01 (April 1997): 165–72. http://dx.doi.org/10.1142/s0218348x97000735.

Full text source
Abstract:
The performance of a pattern recognition technique is usually determined by its ability to extract useful features from the available data so as to effectively characterize and discriminate between patterns. We describe a novel method for feature extraction from speech signals. For this purpose, we generate spectrograms, which are time-frequency representations of the original signal. We show that, by considering the spectrogram as a textured image, a wavelet transform can be applied to generate useful features for recognizing the speech signal. The method is used for the classification of voice dysphonia, and its performance is compared with another technique taken from the literature. A recognition accuracy of 98% is achieved for the classification between normal and dysphonic voices.
APA, Harvard, Vancouver, ISO and other styles
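A minimal sketch of the idea: treat the log-spectrogram as a textured image and summarize it with wavelet subband energies. The one-level Haar transform and the synthetic test tone are illustrative assumptions, not the authors' exact wavelet or data:

```python
import numpy as np
from scipy.signal import spectrogram

def haar2d(img):
    """One level of a 2D Haar wavelet transform (approx + 3 detail subbands)."""
    img = img[:img.shape[0] // 2 * 2, :img.shape[1] // 2 * 2]  # even dims
    a = (img[0::2] + img[1::2]) / 2.0      # row lowpass
    d = (img[0::2] - img[1::2]) / 2.0      # row highpass
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0   # approximation
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0   # diagonal detail
    return ll, lh, hl, hh

def texture_features(signal, fs):
    """Mean energy of each Haar subband of the log-spectrogram."""
    _, _, sxx = spectrogram(signal, fs=fs, nperseg=256)
    log_sxx = np.log(sxx + 1e-10)
    return np.array([np.mean(band ** 2) for band in haar2d(log_sxx)])

# Example: features for a synthetic 440 Hz tone
fs = 8000
t = np.arange(fs) / fs
feats = texture_features(np.sin(2 * np.pi * 440 * t), fs)
```

Such subband-energy vectors could then be fed to any standard classifier for the normal/dysphonic decision.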
APA, Harvard, Vancouver, ISO and other styles
29

Restrepo, Juan F., and Gastón Schlotthauer. "Invariant Measures Based on the U-Correlation Integral: An Application to the Study of Human Voice". Complexity 2018 (2018): 1–9. http://dx.doi.org/10.1155/2018/2173640.

Full text source
Abstract:
Nonlinear measures such as the correlation dimension, the correlation entropy, and the noise level were used in this article to characterize normal and pathological voices. These invariants were estimated through an automated algorithm based on the recently proposed U-correlation integral. Our results show that the voice dynamics have a low dimension. The value of correlation dimension is greater for pathological voices than for normal ones. Furthermore, its value also increases along with the type of the voice. The low correlation entropy values obtained for normal and pathological type 1 and type 2 voices suggest that their dynamics are nearly periodic. Regarding the noise level, in the context of voice signals, it can be interpreted as the power of an additive stochastic perturbation intrinsic to the voice production system. Our estimations suggest that the noise level is greater for pathological voices than for normal ones. Moreover, it increases along with the type of voice, being the highest for type 4 voices. From these results, we can conclude that the voice production dynamical system is more complex in the presence of a pathology. In addition, the presence of the inherent stochastic perturbation strengthens along with the voice type. Finally, based on our results, we propose that the noise level can be used to quantitatively differentiate between type 3 and type 4 voices.
APA, Harvard, Vancouver, ISO and other styles
30

Lin, Chin-Feng, Tsung-Jen Su, Hung-Kai Chang, Chun-Kang Lee, Shun-Hsyung Chang, Ivan A. Parinov and Sergey Shevtsov. "Direct-Mapping-Based MIMO-FBMC Underwater Acoustic Communication Architecture for Multimedia Signals". Applied Sciences 10, no. 1 (27.12.2019): 233. http://dx.doi.org/10.3390/app10010233.

Full text source
Abstract:
In this paper, a direct-mapping (DM)-based multi-input multi-output (MIMO) filter bank multi-carrier (FBMC) underwater acoustic multimedia communication architecture (UAMCA) is proposed. Such a DM-based MIMO-FBMC UAMCA has rarely been explored in underwater multimedia communication research. The following are integrated into the proposed UAMCA: a 2 × 2 DM transmission mechanism, a (2000, 1000) low-density parity-check code encoder, a power assignment mechanism, an object-composition Petri-net mechanism, and adaptive binary phase shift keying modulation and 4-offset quadrature amplitude modulation methods. The multimedia signals include voice, image, and data. The DM transmission mechanism transmits different multimedia packets on different spatial hardware devices. The proposed underwater multimedia transmission power allocation algorithm (UMTPAA) is simple, fast, and easy to implement, and the threshold transmission bit error rates (BERs) and real-time requirements for voice, image, and data signals can be achieved using it. The BERs of the multimedia signals, data symbol error rates of the data signals, power saving ratios of the voice, image, and data signals, mean square errors of the voice signals, and peak signal-to-noise ratios of the image signals were explored for the proposed UAMCA with perfect channel estimation, and with channel estimation errors of 5%, 10%, and 20%, respectively. Simulation results demonstrate that the proposed 2 × 2 DM-based MIMO-FBMC UAMCA is suitable for low-power, high-speed underwater multimedia sensor networks.
APA, Harvard, Vancouver, ISO and other styles
31

Sateesh, Tulluri, Pantham Saikishore, Manchala Ganga Akhila, Karli Nikhil Kumar and Gopagani Ajay Bhargav. "Implementation of Echo Cancellation and Noise Reduction System". International Journal for Research in Applied Science and Engineering Technology 10, no. 11 (30.11.2022): 1083–89. http://dx.doi.org/10.22214/ijraset.2022.47560.

Full text source
Abstract:
Abstract: Modern communication increasingly relies on hands-free operation, which helps people talk more naturally and confidently without holding devices such as microphones or telephones. This paper describes an alternative method of estimating signals corrupted by additive noise or interference. The acoustic echo cancellation problem is discussed across different noise cancellation techniques, comparing different parameters and their relative results. In this project, a voice signal is used as input; echo is added to the voice signal and then removed together with noise, after which different applications of the processed voice signal are considered. Filtering stages are used to remove the noise from the input voice signal.
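The echo-cancellation filtering step can be illustrated with a standard normalized LMS (NLMS) adaptive filter; the paper does not specify its exact filter, so the structure, tap count, and simulated room response below are assumptions:

```python
import numpy as np

def nlms_echo_cancel(far_end, mic, n_taps=32, mu=0.5, eps=1e-6):
    """Adaptively estimate the echo path and subtract the echo estimate.

    far_end: loudspeaker (reference) signal; mic: microphone signal
    containing the echo. Returns the echo-cancelled error signal."""
    w = np.zeros(n_taps)                 # adaptive filter weights
    err = np.zeros(len(mic))
    for n in range(n_taps, len(mic)):
        x = far_end[n - n_taps:n][::-1]  # most recent samples first
        y = w @ x                        # current echo estimate
        e = mic[n] - y                   # error = mic minus estimated echo
        w += mu * e * x / (x @ x + eps)  # normalized LMS weight update
        err[n] = e
    return err

# Simulated echo: far-end signal convolved with a short room response.
rng = np.random.default_rng(0)
far = rng.standard_normal(5000)
room = np.array([0.0, 0.6, 0.3, -0.1])   # hypothetical echo path
mic = np.convolve(far, room)[:5000]
residual = nlms_echo_cancel(far, mic)
```

After convergence, the residual carries far less power than the echoed microphone signal, which is the behavior an echo canceller is judged on.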
APA, Harvard, Vancouver, ISO and other styles
32

Yin, Shu Hua. "Design of the Auxiliary Speech Recognition System of Super-Short-Range Reconnaissance Radar". Applied Mechanics and Materials 556-562 (May 2014): 4830–34. http://dx.doi.org/10.4028/www.scientific.net/amm.556-562.4830.

Full text source
Abstract:
To improve the usability and operability of the hybrid-identification reconnaissance radar for individual use, a voice identification system was designed. With the SPCE061A audio signal microprocessor as its core, digital signal processing technology was used to obtain the audio-band Doppler radar signals over an audio cable. A/D acquisition was then conducted to obtain digital signals, and the acquired data were preprocessed and adaptively filtered to eliminate background noise. Segmented FFT transforms were used to identify the types of the signals. The overall design of radar voice recognition for an individual soldier was thereby fulfilled. Actual measurements showed that the circuit design improved radar resolution and the accuracy of radar identification.
APA, Harvard, Vancouver, ISO and other styles
33

Mahesh Kumar, Pala. "A New Human Voice Recognition System". Asian Journal of Science and Applied Technology 5, no. 2 (5.11.2016): 23–30. http://dx.doi.org/10.51983/ajsat-2016.5.2.931.

Full text source
Abstract:
In an effort to provide a more efficient representation of the speech signal, the application of wavelet analysis is considered. This research presents an effective and robust method for extracting features for speech processing. We propose a new human voice recognition system combining the decimated wavelet (DW) and the Relative Spectra (RASTA) algorithm with linear predictive coding (LPC). First, the proposed techniques are applied to the training speech signals to form a training feature vector containing the extracted low-level features, wavelet coefficients, and linear predictive coefficients. The same process is then applied to the testing speech signals to form a test feature vector. The two feature vectors are compared by calculating the Euclidean distance between them to identify the speech and the speaker: if the distance between the vectors is near zero, the tested speech/speaker is matched with the trained speech/speaker. Simulation results have been compared with the LPC scheme and show that the proposed scheme performs better than the existing technique. Using fifty preloaded voice signals from six individuals, verification tests were carried out and an accuracy rate of approximately 90% was achieved.
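The matching step, which compares a test feature vector against stored training vectors by Euclidean distance, can be sketched as follows; the feature values and speaker names are hypothetical:

```python
import numpy as np

def identify(test_vec, train_vecs, labels):
    """Return the label of the nearest training feature vector by
    Euclidean distance; a distance near zero indicates a match."""
    dists = np.linalg.norm(train_vecs - test_vec, axis=1)
    return labels[int(np.argmin(dists))], float(dists.min())

# Hypothetical per-speaker feature vectors (e.g., wavelet + LPC features).
train = np.array([[1.0, 0.2, 0.5],
                  [0.1, 0.9, 0.4],
                  [0.7, 0.7, 0.1]])
names = ["alice", "bob", "carol"]
label, dist = identify(np.array([0.12, 0.88, 0.42]), train, names)
```

A small threshold on the returned distance would then decide whether the match is accepted or rejected.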
APA, Harvard, Vancouver, ISO and other styles
34

Davies-Thompson, Jodie, Giulia V. Elli, Mohamed Rezk, Stefania Benetti, Markus van Ackeren and Olivier Collignon. "Hierarchical Brain Network for Face and Voice Integration of Emotion Expression". Cerebral Cortex 29, no. 9 (1.10.2018): 3590–605. http://dx.doi.org/10.1093/cercor/bhy240.

Full text source
Abstract:
The brain has separate specialized computational units to process faces and voices, located in occipital and temporal cortices. However, humans seamlessly integrate signals from the faces and voices of others for optimal social interaction. How are emotional expressions, when delivered by different sensory modalities (faces and voices), integrated in the brain? In this study, we characterized the brain's response to faces, voices, and combined face–voice information (congruent, incongruent), which varied in expression (neutral, fearful). Using a whole-brain approach, we found that only the right posterior superior temporal sulcus (rpSTS) responded more to bimodal stimuli than to face or voice alone, but only when the stimuli contained emotional expression. Face- and voice-selective regions of interest, extracted from independent functional localizers, similarly revealed multisensory integration in the face-selective rpSTS only; further, this was the only face-selective region that also responded significantly to voices. Dynamic causal modeling revealed that the rpSTS receives unidirectional information from the face-selective fusiform face area and the voice-selective temporal voice area, with emotional expression affecting the connection strength. Our study supports a hierarchical model of face and voice integration, with convergence in the rpSTS, in which integration depends on the (emotional) salience of the stimuli.
APA, Harvard, Vancouver, ISO and other styles
35

Lee, Ki-Seung. "Voice Conversion Using a Perceptual Criterion". Applied Sciences 10, no. 8 (22.04.2020): 2884. http://dx.doi.org/10.3390/app10082884.

Full text source
Abstract:
In voice conversion (VC), it is highly desirable to obtain transformed speech signals that are perceptually close to a target speaker's voice. To this end, the proposed VC scheme adopts a perceptually meaningful criterion, in which the human auditory system is taken into consideration when measuring the distances between the converted and target voices. The conversion rules for the features associated with the spectral envelope and the pitch modification factor were jointly constructed so that the perceptual distance measure was minimized. This minimization problem was solved in a deep neural network (DNN) framework, where input features and target features were derived from source speech signals and time-aligned versions of target speech signals, respectively. Validation tests were carried out on the CMU ARCTIC database to evaluate the effectiveness of the proposed method, especially in terms of perceptual quality. The experimental results showed that the proposed method yielded perceptually preferred results compared with independent conversion using the conventional mean-square error (MSE) criterion. The maximum improvement in perceptual evaluation of speech quality (PESQ) was 0.312, compared with the conventional VC method.
APA, Harvard, Vancouver, ISO and other styles
36

Singhal, Abhishek, and Devendra Kumar Sharma. "Estimation of Accuracy in Human Gender Identification and Recall Values Based on Voice Signals Using Different Classifiers". Journal of Engineering 2022 (15.12.2022): 1–9. http://dx.doi.org/10.1155/2022/9291099.

Full text source
Abstract:
This paper presents the estimation of accuracy in male, female, and transgender identification using different classifiers with the help of voice signals. The recall value of each gender is also calculated. This paper reports third-gender (transgender) identification for the first time. Voice signals are the most appropriate and convenient way to transfer information between subjects, and voice signal analysis is vital for accurate and fast identification of gender. Mel Frequency Cepstral Coefficients (MFCCs) are used here as the extracted feature of the speakers' voice signals; MFCCs are the most convenient and reliable feature for configuring a gender identification system. Recurrent Neural Network–Bidirectional Long Short-Term Memory (RNN-BiLSTM), Support Vector Machine (SVM), and Linear Discriminant Analysis (LDA) are utilized as classifiers in this work. In the proposed models, the experimental result does not depend on the text of the speech, the language of the speakers, or the duration of the voice samples. The experimental results are obtained by analyzing common voice samples. In this article, the RNN-BiLSTM classifier has a single-layer architecture, while SVM and LDA use a k-fold value of 5. The recall values and accuracy of the proposed models also vary with the number of voice samples in the training and testing datasets. The highest accuracy for gender identification is found to be 94.44%. The simulation results show that the accuracy of the RNN-BiLSTM is always higher than that of SVM and LDA. The gender-wise highest recall values of the proposed model are 95.63%, 96.71%, and 97.22% for males, females, and transgender speakers, respectively; the recall value for transgender speakers is higher than for the other genders.
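The MFCC front end that such classifiers depend on can be sketched in plain NumPy: framing, FFT power spectrum, triangular mel filterbank, log compression, and a DCT-II. The frame size, hop, and filterbank settings below are common defaults, not the paper's reported configuration:

```python
import numpy as np

def mfcc(signal, fs, n_fft=512, hop=160, n_mels=26, n_coeffs=13):
    """Minimal MFCC sketch, not a tuned production front end."""
    frames = np.lib.stride_tricks.sliding_window_view(signal, n_fft)[::hop]
    frames = frames * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2

    # Triangular mel filterbank between 0 Hz and fs/2.
    def to_mel(f): return 2595 * np.log10(1 + f / 700)
    def to_hz(m): return 700 * (10 ** (m / 2595) - 1)
    mel_pts = to_hz(np.linspace(0, to_mel(fs / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * mel_pts / fs).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fbank[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)   # rising edge
        fbank[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)   # falling edge

    logmel = np.log(power @ fbank.T + 1e-10)
    # DCT-II over the mel axis keeps the first n_coeffs cepstral coefficients.
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_coeffs), n + 0.5) / n_mels)
    return logmel @ dct.T

fs = 16000
t = np.arange(fs) / fs
coeffs = mfcc(np.sin(2 * np.pi * 200 * t), fs)   # one frame row per hop
```

The resulting per-frame coefficient matrix is what would be fed (possibly averaged or sequenced) into an RNN-BiLSTM, SVM, or LDA classifier.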
APA, Harvard, Vancouver, ISO and other styles
37

Zafar, Shakeel, Imran Fareed Nizami, Mobeen Ur Rehman, Muhammad Majid and Jihyoung Ryu. "NISQE: Non-Intrusive Speech Quality Evaluator Based on Natural Statistics of Mean Subtracted Contrast Normalized Coefficients of Spectrogram". Sensors 23, no. 12 (16.06.2023): 5652. http://dx.doi.org/10.3390/s23125652.

Full text source
Abstract:
With the evolution in technology, communication based on the voice has gained importance in applications such as online conferencing, online meetings, voice-over internet protocol (VoIP), etc. Limiting factors such as environmental noise, encoding and decoding of the speech signal, and limitations of technology may degrade the quality of the speech signal. Therefore, there is a requirement for continuous quality assessment of the speech signal. Speech quality assessment (SQA) enables the system to automatically tune network parameters to improve speech quality. Furthermore, there are many speech transmitters and receivers that are used for voice processing, including mobile devices and high-performance computers, that can benefit from SQA. SQA plays a significant role in the evaluation of speech-processing systems. Non-intrusive speech quality assessment (NI-SQA) is a challenging task due to the unavailability of pristine speech signals in real-world scenarios. The success of NI-SQA techniques highly relies on the features used to assess speech quality. Various NI-SQA methods are available that extract features from speech signals in different domains, but they do not take into account the natural structure of the speech signals for assessment of speech quality. This work proposes a method for NI-SQA based on the natural structure of the speech signals that is approximated using the natural spectrogram statistical (NSS) properties derived from the speech signal spectrogram. The pristine version of the speech signal follows a structured natural pattern that is disrupted when distortion is introduced in the speech signal. The deviation of NSS properties between the pristine and distorted speech signals is utilized to predict speech quality. The proposed methodology shows better performance in comparison to state-of-the-art NI-SQA methods on the Centre for Speech Technology Voice Cloning Toolkit corpus (VCTK-Corpus) with a Spearman's rank-ordered correlation constant (SRC) of 0.902, Pearson correlation constant (PCC) of 0.960, and root mean squared error (RMSE) of 0.206. Conversely, on the NOIZEUS-960 database, the proposed methodology shows an SRC of 0.958, PCC of 0.960, and RMSE of 0.114.
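The mean-subtracted contrast-normalized (MSCN) computation underlying such NSS features amounts to a local mean/variance normalization of the spectrogram; the window size, stabilizing constant, and random stand-in spectrogram below are assumptions for illustration:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def mscn(spec, win=7, c=1e-3):
    """MSCN coefficients of a spectrogram: subtract a local mean and
    divide by a local standard deviation over a win x win neighborhood."""
    mu = uniform_filter(spec, size=win)
    var = np.maximum(uniform_filter(spec ** 2, size=win) - mu ** 2, 0.0)
    return (spec - mu) / (np.sqrt(var) + c)

rng = np.random.default_rng(1)
spec = rng.random((64, 80))        # stand-in for a log-spectrogram
coeffs = mscn(spec)
```

The statistics of these coefficients (roughly zero-mean, unit-scale for natural signals) are what deviate under distortion and drive the quality prediction.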
APA, Harvard, Vancouver, ISO and other styles
38

Shirataki, Jun, and Manabu Ishihara. "Perception of Intermittently Eliminated Speech Waves (Auditory Sense Characteristics in Case of Having Eliminated the Voice Signals at a Specified Interval)". Journal of Robotics and Mechatronics 6, no. 1 (20.02.1994): 87–91. http://dx.doi.org/10.20965/jrm.1994.p0087.

Full text source
Abstract:
The present paper discusses the auditory sense characteristics of how a human listens to voice signals from which segments have been eliminated at a specified interval. The authors first clarify the intelligibility of voice signals eliminated at a certain interval (called the eliminated voice signals), as well as the degree of sentence understanding. As a result, the authors detail the relationship between the elimination cycle of eliminated voice signals and the voice section (or block). Clarity on the order of 60% can be obtained up to a voice section on the order of 60%. Furthermore, in the case of a sentence, more than 90% comprehension can be obtained even if the voice section is at 50%. The authors consider that clarifying these auditory sense characteristics provides basic data for studying how a man's ears correspond to the case of a machine having ears.
APA, Harvard, Vancouver, ISO and other styles
39

Abdallah, Hanaa A., and Souham Meshoul. "A Multilayered Audio Signal Encryption Approach for Secure Voice Communication". Electronics 12, no. 1 (20.12.2022): 2. http://dx.doi.org/10.3390/electronics12010002.

Full text source
Abstract:
In this paper, multilayer cryptosystems for encrypting audio communications are proposed. These cryptosystems combine audio signals with other active concealing signals, such as speech signals, by continuously fusing the audio signal with a speech signal without silent periods. The goal of these cryptosystems is to prevent unauthorized parties from listening to encrypted audio communications. Preprocessing is performed on both the speech signal and the audio signal before they are combined, as this is necessary to get the signals ready for fusion. Instead of encoding and decoding methods, the cryptosystems rely on the values of audio samples, which allows for saving time while increasing their resistance to hackers and environments with a noisy background. The main feature of the proposed approach is to consider three levels of encryption namely fusion, substitution, and permutation where various combinations are considered. The resulting cryptosystems are compared to the one-dimensional logistic map-based encryption techniques and other state-of-the-art methods. The performance of the suggested cryptosystems is evaluated by the use of the histogram, structural similarity index, signal-to-noise ratio (SNR), log-likelihood ratio, spectrum distortion, and correlation coefficient in simulated testing. A comparative analysis in relation to the encryption of logistic maps is given. This research demonstrates that increasing the level of encryption results in increased security. It is obvious that the proposed salting-based encryption method and the multilayer DCT/DST cryptosystem offer better levels of security as they attain the lowest SNR values, −25 dB and −2.5 dB, respectively. In terms of the used evaluation metrics, the proposed multilayer cryptosystem achieved the best results in discrete cosine transform and discrete sine transform, demonstrating a very promising performance.
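One of the three encryption levels, permutation, can be illustrated with a keyed shuffle of audio samples; this sketches the permutation layer only (the fusion and substitution layers are omitted), and the key handling is hypothetical:

```python
import numpy as np

def encrypt(samples, key):
    """Permutation layer of a sample-domain audio cipher: reorder the
    samples with a keyed pseudo-random permutation."""
    perm = np.random.default_rng(key).permutation(len(samples))
    return samples[perm], perm

def decrypt(cipher, perm):
    """Invert the permutation to recover the original sample order."""
    plain = np.empty_like(cipher)
    plain[perm] = cipher
    return plain

audio = np.sin(2 * np.pi * 440 * np.arange(800) / 8000)
cipher, perm = encrypt(audio, key=1234)
restored = decrypt(cipher, perm)
```

Because the cipher operates directly on sample values and order, no codec is involved, which is the time-saving property the abstract highlights.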
APA, Harvard, Vancouver, ISO and other styles
40

Ingrisano, Dennis R.-S., Cecyle K. Perry and Kairsten R. Jepson. "Environmental Noise". American Journal of Speech-Language Pathology 7, no. 1 (February 1998): 91–96. http://dx.doi.org/10.1044/1058-0360.0701.91.

Full text source
Abstract:
The effects of environmental noise were estimated from automatic computer-assisted analyses of voice samples. Signals consisted of a live voice sample and a synthesized triangular waveform. Noise was generated from a personal computer fan. Six different A-weighted signal-to-noise [S/N(A)] conditions were created for the live voice and synthetic signal: 25, 20, 15, 10, 5, and 0 dB. Results revealed that automatic estimates were systematically affected by the different S/N levels. As the noise floor increased, baseline estimates of jitter and shimmer also increased in value. Results are discussed with reference to safeguards and standards in voice recording and analysis.
APA, Harvard, Vancouver, ISO and other styles
41

Sulistyawan, V. N., S. E. Widhira, A. Fatin and N. A. Salim. "Signal acquisition system based on wireless transmission for environmental sound monitoring system". IOP Conference Series: Earth and Environmental Science 969, no. 1 (1.01.2022): 012015. http://dx.doi.org/10.1088/1755-1315/969/1/012015.

Full text source
Abstract:
In today's technological era, accessing information through digital signals, especially voice, requires sophisticated and comprehensive applications that can convert physical signals into electrical signals. Their purpose is to assist humans in displaying and analysing structured and automated data obtained from tools with their own unique sets of features. In this study, we used signal data for different sounds, such as rock songs, birdsong, acoustic sounds, and conversational sounds. The data is checked using software and undergoes a data acquisition process. The results are expected to reinforce the prediction that environmental sound processing will assist the development of more sophisticated automated monitoring systems capable of combining voice and visual data in a complementary way.
APA, Harvard, Vancouver, ISO and other styles
42

Hillenbrand, James. "A Methodological Study of Perturbation and Additive Noise in Synthetically Generated Voice Signals". Journal of Speech, Language, and Hearing Research 30, no. 4 (December 1987): 448–61. http://dx.doi.org/10.1044/jshr.3004.448.

Full text source
Abstract:
There is a relatively large body of research that is aimed at finding a set of acoustic measures of voice signals that can be used to: (a) aid in the detection, diagnosis, and evaluation of voice-quality disorders; (b) identify individual speakers by their voice characteristics; or (c) improve methods of voice synthesis. Three acoustic parameters that have received a relatively large share of attention, especially in the voice-disorders literature, are pitch perturbation, amplitude perturbation, and additive noise. The present study consisted of a series of simulations using a general-purpose formant synthesizer that were designed primarily to determine whether these three parameters could be measured independent of one another. Results suggested that changes in any single dimension can affect measured values of all three parameters. For example, adding noise to a voice signal resulted not only in a change in measured signal-to-noise ratio, but also in measured values of pitch and amplitude perturbation. These interactions were quite large in some cases, especially in view of the fact that the perturbation phenomena being measured are generally quite small. For the most part, the interactions appear to be readily explainable when the measurement techniques are viewed in relation to what is known about the acoustics of voice production.
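The two perturbation measures can be sketched from cycle-by-cycle period and peak-amplitude sequences; the relative mean-absolute-difference definitions and example values below are common conventions, not the study's exact extraction method:

```python
import numpy as np

def jitter_shimmer(periods, amplitudes):
    """Relative perturbation: mean absolute difference between
    consecutive cycles, normalized by the mean value."""
    periods = np.asarray(periods, float)
    amplitudes = np.asarray(amplitudes, float)
    jitter = np.mean(np.abs(np.diff(periods))) / np.mean(periods)
    shimmer = np.mean(np.abs(np.diff(amplitudes))) / np.mean(amplitudes)
    return jitter, shimmer

# Hypothetical cycle-by-cycle measurements (seconds, linear amplitude).
periods = [0.0100, 0.0101, 0.0099, 0.0100, 0.0102]
amps    = [0.80, 0.82, 0.79, 0.81, 0.80]
j, s = jitter_shimmer(periods, amps)
```

The study's point is that additive noise perturbs the estimated periods and amplitudes themselves, so these small ratios are inflated even when the underlying source perturbation is unchanged.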
APA, Harvard, Vancouver, ISO and other styles
43

Naresh, B., S. Rambabu and D. Khalandar Basha. "ARM Controller and EEG based Drowsiness Tracking and Controlling during Driving". International Journal of Reconfigurable and Embedded Systems (IJRES) 6, no. 3 (28.05.2018): 127. http://dx.doi.org/10.11591/ijres.v6.i3.pp127-132.

Full text source
Abstract:
<span>This paper discusses EEG-based drowsiness tracking during distracted driving based on brain-computer interfaces (BCI). BCIs are systems that can bypass conventional channels of communication (i.e., muscles and thoughts) to provide direct communication and control between the human brain and physical devices by translating different patterns of brain activity into commands for a controller device in real time. In MATLAB, the spectrum of these brain signals is analyzed to estimate the driver's concentration and meditation states. If another vehicle comes close, a voice alert is given to the driver, and if the driver is falling asleep, a voice alert is issued using a voice chip; information about traffic signal status is provided using RFID. The patterns of interaction between neurons are represented as thoughts and emotional states. As human feelings change, these patterns change, producing different electrical waves; a muscle contraction also generates a unique electrical signal. All these electrical waves are sensed by the brain wave sensor, which converts the data into packets and transmits them over Bluetooth. A level analyzer unit (LAU) receives the raw data from the brain wave sensor and extracts and processes the signal on the MATLAB platform. Information about nearby vehicles is obtained through ultrasonic sensors with voice alerts, and the traffic signal condition is detected through RF technology.</span>
APA, Harvard, Vancouver, ISO and other styles
44

Lengagne, T., J. Lauga and T. Aubin. "Intra-syllabic acoustic signatures used by the king penguin in parent-chick recognition: an experimental approach". Journal of Experimental Biology 204, no. 4 (15.02.2001): 663–72. http://dx.doi.org/10.1242/jeb.204.4.663.

Full text source
Abstract:
In king penguin colonies, several studies have shown that both parent-chick recognition and mate-pair recognition are achieved by acoustic signals. The call of king penguins consists of strong frequency modulations with added beats of varying amplitude induced by the two-voice generating process. Both the frequency modulation pattern and the two-voice system could play a role in the identification of the calling bird. We investigated the potential role of these features in individual discrimination. Experiments were conducted by playing back altered or reconstructed parental signals to the corresponding chick. The results proved that the king penguin performs a complex analysis of the call, using both frequency modulation and the two-voice system. Reversed or frequency-modulation-suppressed signals do not elicit any responses. Modifying the shape of the frequency modulation by 30 % also impairs the recognition process. Moreover, we have demonstrated for the first time that birds perform an analysis of the beat amplitude induced by the two-voice system to assess individual identity. These two features, which are well preserved during the propagation of the signal, seem to be a reliable strategy to ensure the accurate transmission of individual information in a noisy colonial environment.
APA, Harvard, Vancouver, ISO and other styles
45

Wang, Da Hu, Qie Qie Zhang and Yi Fan Sun. "Design of Wireless Voice Communication System in Underground Coal Mine Based on ZigBee". Applied Mechanics and Materials 548-549 (April 2014): 1402–6. http://dx.doi.org/10.4028/www.scientific.net/amm.548-549.1402.

Full text source
Abstract:
To address the disadvantages of present mine voice communication systems, a wireless voice communication system based on ZigBee is put forward. The paper provides detailed information about the hardware and software of the wireless voice communication device. The system adopts the CC2530 as the RF transceiver unit of the voice communication node, converts speech between analog and digital form with the CSP1027, encodes and decodes quantized voice data with AMBE voice codec technology, and realizes two-way wireless voice messaging between devices over the ZigBee wireless communication protocol, IEEE 802.15.4. Experiments have shown that the device achieves clear voice and high reliability within the effective communication distance, meeting the requirements of voice communication.
46

Sadou, Jean‐Claude B. "Device for the processing of voice signals". Journal of the Acoustical Society of America 79, no. 2 (February 1986): 590–91. http://dx.doi.org/10.1121/1.393495.

47

Wang, JingHui, and YuanChao Zhao. "Voice Prediction Based On All-Poles Signals". Procedia Engineering 29 (2012): 1506–10. http://dx.doi.org/10.1016/j.proeng.2012.01.163.

48

Che Kassim, Farah Nazlia, Vikneswaran Vijean, Zulkapli Abdullah, Hariharan Muthusamy and Rokiah Abdullah. "OPTIMIZATION OF DUAL-TREE COMPLEX WAVELET PACKET BASED ENTROPY FEATURES FOR VOICE PATHOLOGIES DETECTION". Jurnal Teknologi 82, no. 6 (21.10.2020): 21–28. http://dx.doi.org/10.11113/jurnalteknologi.v82.14748.

Abstract:
The Dual-Tree Complex Wavelet Packet Transform (DT-CWPT) has been successfully implemented in numerous fields because it introduces limited redundancy and provides approximate shift-invariance and geometrically oriented signal analysis in multiple dimensions, properties that are lacking in the traditional wavelet transform. This paper investigates the performance of features extracted using DT-CWPT algorithms, quantified using k-Nearest Neighbors (k-NN) and Support Vector Machine (SVM) classifiers, for detecting voice pathologies. The voice signals are decomposed, and Shannon entropy and approximate entropy (ApEn) are computed to signify the complexity of the voice signals in the time and frequency domains. Feature selection using the ReliefF algorithm and a Genetic Algorithm (GA) is applied to obtain the optimum features for multiclass classification. The best accuracies obtained using DT-CWPT with ApEn are 91.15% for the k-NN and 93.90% for the SVM classifier. The proposed work provides a promising detection rate for multiple voice disorders and is useful for the development of computer-based diagnostic tools for voice pathology screening in health care facilities.
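As an illustration of the entropy features used in this line of work, the Shannon entropy of a subband's normalized energy distribution can be computed as below. The DT-CWPT decomposition itself is omitted, and the function is a generic sketch rather than the authors' implementation:

```python
import numpy as np

def shannon_entropy(coeffs) -> float:
    """Shannon entropy (in bits) of a coefficient vector's normalized
    energy distribution. Applied to wavelet-packet subband coefficients,
    a flat energy spread gives high entropy and a concentrated spread
    gives low entropy, summarizing the signal's complexity."""
    p = np.square(np.asarray(coeffs, dtype=float))
    p /= p.sum()
    p = p[p > 0]                        # 0 * log(0) contributes nothing
    return float(-(p * np.log2(p)).sum())
```

For example, a vector with energy spread evenly over 8 coefficients yields 3 bits, while a single nonzero coefficient yields 0 bits.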
49

Putri, Farika, Wahyu Caesarendra, Elta Diah Pamanasari, Mochammad Ariyanto and Joga D. Setiawan. "Parkinson Disease Detection Based on Voice and EMG Pattern Classification Method for Indonesian Case Study". Journal of Energy, Mechanical, Material and Manufacturing Engineering 3, no. 2 (31.12.2018): 87. http://dx.doi.org/10.22219/jemmme.v3i2.6977.

Abstract:
Parkinson's disease (PD) detection using pattern recognition methods has been presented in the literature. This paper presents multi-class PD detection utilizing voice and electromyography (EMG) features of Indonesian subjects. The multi-class classification consists of healthy control, possible, probable and definite stages, based on the Hughes scale used in Indonesia for PD. Voice signals were recorded from 15 people with Parkinson's (PWP) and 8 healthy control subjects. Voice and EMG data acquisition were conducted at dr Kariadi General Hospital, Semarang, Central Java, Indonesia. Twenty-two features are used for voice signal feature extraction and twelve features are employed for the EMG signal. An artificial neural network is used as the classification method. The voice classification results show a testing accuracy of 94.4%; for EMG classification, the testing accuracy is 71%.
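The paper's classifier is an artificial neural network. As a stand-in with the same interface (a feature vector in, one of the four Hughes-scale stages out), here is a minimal nearest-centroid rule; it is a simpler decision rule for illustration, not the method used in the study:

```python
import numpy as np

# The four classes named in the abstract.
STAGES = ["healthy", "possible", "probable", "definite"]

def nearest_stage(x, centroids) -> str:
    """Assign a feature vector to the stage whose class centroid is
    closest in Euclidean distance. `centroids` has one row per stage,
    in STAGES order; in practice each row would be the mean of that
    class's training feature vectors (voice or EMG features)."""
    d = np.linalg.norm(centroids - x, axis=1)
    return STAGES[int(np.argmin(d))]
```

With real data the centroids would be estimated from the extracted voice or EMG features of the training subjects; an ANN instead learns a nonlinear decision boundary between the same four classes.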
50

Dimolitsas, S. "Characterization of low-rate digital voice coder performance with non-voice signals". Speech Communication 12, no. 2 (June 1993): 135–44. http://dx.doi.org/10.1016/s0167-6393(05)80005-6.

