Dissertations / Theses on the topic 'Speech waveforms'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 42 dissertations / theses for your research on the topic 'Speech waveforms.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.
Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.
Lowry, Andrew. "Efficient structures for vector quantisation of speech waveforms." Thesis, Queen's University Belfast, 1988. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.292596.
Full text
Carandang, Alfonso B. "Recognition of phonemes using shapes of speech waveforms in WAL." University of Canberra. Information Sciences & Engineering, 1994. http://erl.canberra.edu.au./public/adt-AUC20060626.144432.
Full text
Deivard, Johannes. "How accuracy of estimated glottal flow waveforms affects spoofed speech detection performance." Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-48414.
Full text
Bamini, Praveen Kumar. "FPGA-based Implementation of Concatenative Speech Synthesis Algorithm." [Tampa, Fla.] : University of South Florida, 2003. http://purl.fcla.edu/fcla/etd/SFE0000187.
Full text
Choy, Eddie L. T. "Waveform interpolation speech coder at 4 kb/s." Thesis, McGill University, 1998. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=20901.
Full text
Leong, Michael. "Representing voiced speech using prototype waveform interpolation for low-rate speech coding." Thesis, McGill University, 1992. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=56796.
Full text
In examining the PWI method, it was found that although the method generally works very well, there are occasional sections of the reconstructed voiced speech where audible distortion can be heard, even when the prototypes are not quantized. The research undertaken in this thesis focuses on the fundamental principles behind modelling voiced speech using PWI, rather than on bit allocation for encoding the prototypes. Problems are found in the PWI method that may have been overlooked as encoding error if full encoding had been implemented.
Kleijn uses PWI to represent voiced sections of the excitation signal, which is the residual obtained after the removal of short-term redundancies by a linear predictive filter. The problem with this method is that when the PWI-reconstructed excitation is passed through the inverse filter to synthesize the speech, undesired effects occur due to the time-varying nature of the filter. The reconstructed speech may have undesired envelope variations, which result in audible warble.
This thesis proposes an energy fixup to smooth the synthesized speech envelope when the interpolation procedure fails to provide the smooth linear result that is desired. Further investigation, however, leads to the final proposal in this thesis that PWI should be performed on the clean speech signal instead of the excitation to achieve consistently reliable results for all voiced frames.
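The core idea behind prototype waveform interpolation described above can be sketched in a few lines: extract one pitch-cycle prototype per frame, align and resample them to a common length, and synthesize the intermediate cycles by interpolating between consecutive prototypes. The sketch below is a minimal toy illustration of that interpolation step, not Kleijn's or the thesis's implementation; all names are ours.

```python
import numpy as np

def pwi_synthesize(proto_a, proto_b, n_cycles):
    """Linearly interpolate between two pitch-cycle prototypes.

    proto_a, proto_b: prototype cycles already resampled to a common length.
    Returns the concatenated intermediate cycles.
    """
    cycles = []
    for k in range(1, n_cycles + 1):
        alpha = k / (n_cycles + 1)              # interpolation weight
        cycles.append((1 - alpha) * proto_a + alpha * proto_b)
    return np.concatenate(cycles)

# Toy prototypes: one cycle of a sine, and the same cycle at half amplitude.
t = np.linspace(0, 2 * np.pi, 80, endpoint=False)
a = np.sin(t)
b = 0.5 * np.sin(t)
out = pwi_synthesize(a, b, n_cycles=4)
# Per-cycle energy decreases smoothly from proto_a's level toward proto_b's,
# which is exactly the smooth envelope evolution PWI relies on.
```

When the interpolation between prototypes fails to track the true envelope, the synthesized energy contour deviates from the original; the thesis's "energy fixup" addresses that mismatch.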
Choy, Eddie L. T. "Waveform interpolation speech coder at 4 kb/s." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape11/PQDD_0028/MQ50596.pdf.
Full text
Davis, Andrew J. "Waveform coding of speech and voiceband data signals." Thesis, University of Liverpool, 1988. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.232946.
Full text
Eide, Ellen Marie. "A linguistic feature representation of the speech waveform." Thesis, Massachusetts Institute of Technology, 1993. http://hdl.handle.net/1721.1/12510.
Full text
Includes bibliographical references (leaves 95-97).
by Ellen Marie Eide.
Ph.D.
Zeghidour, Neil. "Learning representations of speech from the raw waveform." Thesis, Paris Sciences et Lettres (ComUE), 2019. http://www.theses.fr/2019PSLEE004/document.
Full text
While deep neural networks are now used in almost every component of a speech recognition system, from acoustic to language modeling, the input to such systems is still fixed, handcrafted spectral features such as mel-filterbanks. This contrasts with computer vision, in which deep neural networks are now trained on raw pixels. Mel-filterbanks contain valuable and documented prior knowledge from human auditory perception as well as signal processing, and are the input to state-of-the-art speech recognition systems that are now on par with human performance in certain conditions. However, mel-filterbanks, like any fixed representation, are inherently limited by the fact that they are not fine-tuned for the task at hand. We hypothesize that learning the low-level representation of speech jointly with the rest of the model, rather than using fixed features, could push the state of the art even further. We first explore a weakly supervised setting and show that a single neural network can learn to separate phonetic information and speaker identity from mel-filterbanks or the raw waveform, and that these representations are robust across languages. Moreover, learning from the raw waveform provides significantly better speaker embeddings than learning from mel-filterbanks. These encouraging results lead us to develop a learnable alternative to mel-filterbanks that can be used directly in place of these features. In the second part of this thesis we introduce Time-Domain filterbanks, a lightweight neural network that takes the waveform as input, can be initialized as an approximation of mel-filterbanks, and is then learned with the rest of the neural architecture. Across extensive and systematic experiments, we show that Time-Domain filterbanks consistently outperform mel-filterbanks and can be integrated into a new state-of-the-art speech recognition system, trained directly from the raw audio signal.
Since fixed speech features are also used for non-linguistic classification tasks, for which they are even less optimal, we perform dysarthria detection from the waveform with Time-Domain filterbanks and show that they significantly improve over mel-filterbanks and low-level descriptors. Finally, we discuss how our contributions fall within a broader shift towards fully learnable audio understanding systems.
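The initialization idea behind a learnable time-domain frontend can be illustrated with plain numpy: build a bank of FIR filters as windowed cosines at mel-spaced center frequencies, then apply them to the raw waveform by convolution. This is only a sketch of the fixed initialization (the published Time-Domain filterbanks use a specific gammatone-based parameterization and are then trained end-to-end, which is omitted here); all function names and parameter values are ours.

```python
import numpy as np

def mel_spaced_centers(n_filters, sr, fmin=60.0, fmax=7800.0):
    """Center frequencies equally spaced on the mel scale."""
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    imel = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    return imel(np.linspace(mel(fmin), mel(fmax), n_filters))

def init_filterbank(n_filters=40, width=401, sr=16000):
    """FIR bank: Hann-windowed cosines at mel-spaced center frequencies."""
    n = np.arange(width) - width // 2
    win = np.hanning(width)
    return np.stack([win * np.cos(2 * np.pi * f * n / sr)
                     for f in mel_spaced_centers(n_filters, sr)])

filters = init_filterbank()
x = np.random.default_rng(0).standard_normal(16000)   # 1 s of toy "audio"
# Filter energies over time: the features a network would further process.
feats = np.stack([np.convolve(x, h, mode='same') ** 2 for h in filters])
```

In a learnable frontend these filter taps become trainable parameters of the first network layer, so gradient descent can move them away from the mel initialization.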
ARAUJO, ANTONIO MARCOS DE LIMA. "ANALYSIS OF WAVEFORM CODERS FOR SPEECH AND DATA SIGNALS." PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 1986. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=9246@1.
Full text
This thesis evaluates the performance of waveform coders at 32, 56 and 64 kbit/s for digital transmission of speech signals and of 4800 bit/s PSK-8 and 9600 bit/s QAM-16 voiceband data signals. A detailed analysis of the systems is carried out under both ideal and noisy channel conditions. From this analysis it was found that a scheme which accurately distinguishes the two classes of signals would allow a more efficient encoding procedure. A method of statistical identification of speech and data signals is proposed, and its use in waveform coders is then analysed. The incorporation of this method into the 32 kbit/s ADPCM system recommended by the CCITT provides an improvement in performance for data signals without sacrificing its efficiency for speech signals.
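The abstract above does not spell out its statistical identification method, but one classic way to separate speech from voiceband modem signals is a higher-order statistic: speech amplitudes are bursty and heavy-tailed (high kurtosis), while PSK/QAM signals have a near-constant envelope (negative excess kurtosis). The sketch below demonstrates that discriminator on toy signals; the threshold and signal models are hypothetical, not the thesis's method.

```python
import numpy as np

def excess_kurtosis(x):
    """Fourth standardized moment minus 3 (0 for a Gaussian)."""
    x = x - x.mean()
    return np.mean(x**4) / (np.mean(x**2) ** 2) - 3.0

rng = np.random.default_rng(1)
sr = 8000
t = np.arange(sr) / sr

# Toy "speech": a tone gated by random activity bursts (heavy-tailed).
speech = np.sin(2 * np.pi * 200 * t) * (rng.random(sr) < 0.3)
# Toy "modem" signal: constant-envelope carrier with random QPSK phases.
data = np.cos(2 * np.pi * 1800 * t
              + rng.choice([0, np.pi / 2, np.pi, 3 * np.pi / 2], sr))

is_speech = lambda x: excess_kurtosis(x) > 1.0   # hypothetical threshold
```

A coder front-end could compute such a statistic per block and switch to the encoding mode better matched to the detected signal class.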
Yaghmaie, Khashayar. "Prototype waveform interpolation based low bit rate speech coding." Thesis, University of Surrey, 1997. http://epubs.surrey.ac.uk/843152/.
Full text
Khan, Mohammad M. A. "Coding of excitation signals in a waveform interpolation speech coder." Thesis, McGill University, 2001. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=32961.
Full text
Product code vector quantizers (PC-VQs) are a family of structured VQs that circumvent the complexity obstacle. The performance of product code VQs can be traded off against their storage and encoding complexity. This thesis introduces split/shape-gain VQ, a hybrid product code VQ, as an approach to quantizing the SEW magnitude. The amplitude spectrum of the SEW is split into three non-overlapping subbands. The gains of the three subbands form a gain vector, which is quantized using the conventional Generalized Lloyd Algorithm (GLA). Each shape vector, obtained by normalizing a subband by its corresponding coded gain, is quantized using a dimension-conversion VQ along with a perceptually based bit allocation strategy and a perceptually weighted distortion measure. At the receiver, the discontinuity of the gain contour at the subband boundaries introduces buzziness into the reconstructed speech. This problem is tackled by smoothing the gain-versus-frequency contour using a piecewise monotonic cubic interpolant. Simulation results indicate that the new method improves speech quality significantly.
The necessity of SEW phase information in the WI coder is also investigated in this thesis. Informal subjective test results demonstrate that transmitting the SEW magnitude encoded by split/shape-gain VQ, together with a fixed phase spectrum drawn from a voiced segment of a high-pitched male speaker, obviates the need to send phase information.
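The gain-smoothing step described above (a piecewise monotone cubic across subband gains) can be sketched with a Fritsch-Carlson monotone cubic Hermite interpolant, which removes the staircase discontinuity at subband boundaries without overshooting. The subband centers and gain values below are hypothetical, and this is our own small implementation of the standard technique, not the thesis's code.

```python
import numpy as np

def pchip_slopes(x, y):
    """Fritsch-Carlson monotone slopes for cubic Hermite interpolation."""
    h = np.diff(x)
    d = np.diff(y) / h                      # secant slopes
    m = np.zeros_like(y)
    for i in range(1, len(y) - 1):
        if d[i - 1] * d[i] > 0:             # harmonic mean keeps monotonicity
            m[i] = 2.0 / (1.0 / d[i - 1] + 1.0 / d[i])
    m[0], m[-1] = d[0], d[-1]               # simple one-sided end slopes
    return m

def pchip_eval(x, y, xq):
    """Evaluate the monotone cubic Hermite interpolant at points xq."""
    m = pchip_slopes(x, y)
    i = np.clip(np.searchsorted(x, xq) - 1, 0, len(x) - 2)
    h = x[i + 1] - x[i]
    s = (xq - x[i]) / h
    h00 = (1 + 2 * s) * (1 - s) ** 2        # Hermite basis functions
    h10 = s * (1 - s) ** 2
    h01 = s ** 2 * (3 - 2 * s)
    h11 = s ** 2 * (s - 1)
    return h00 * y[i] + h10 * h * m[i] + h01 * y[i + 1] + h11 * h * m[i + 1]

# Hypothetical subband-center frequencies (kHz) and decoded subband gains.
centers = np.array([0.5, 2.0, 3.5])
gains = np.array([1.8, 0.9, 0.4])
freq = np.linspace(0.5, 3.5, 61)
smooth = pchip_eval(centers, gains, freq)   # smooth, monotone gain contour
```

Because the interpolant is monotone between knots, the smoothed contour never introduces spurious gain peaks between subbands, which is the property that suppresses the buzziness.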
Choi, Hung Bun. "Pitch synchronous waveform interpolation for very low bit rate speech coding." Thesis, University of Liverpool, 1997. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.243264.
Full text
Pollard, Matthew Peter. "Waveform interpolation methods for pitch and time-scale modification of speech." Thesis, University of Liverpool, 1997. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.263905.
Full text
Smith, Lloyd A. (Lloyd Allen). "Speech Recognition Using a Synthesized Codebook." Thesis, University of North Texas, 1988. https://digital.library.unt.edu/ark:/67531/metadc332203/.
Full text
Ghaidan, Khaldoon A. "A study of the application of modern techniques to speech waveform analysis." Thesis, Loughborough University, 1986. https://dspace.lboro.ac.uk/2134/28015.
Full text
Bliūdžius, Mindaugas. "Skaitmeninių kalbos įrašų glaudinimo metodai" [Methods for compressing digital speech recordings]. Master's thesis, Lithuanian Academic Libraries Network (LABT), 2004. http://vddb.library.lt/obj/LT-eLABa-0001:E.02~2004~D_20040529_122424-17577.
Full text
Beall, Jeffery C. "Stored waveform adaptive motor control." Thesis, Virginia Tech, 1986. http://hdl.handle.net/10919/45746.
Full text
Master of Science
Pelteku, Altin E. "Development of an electromagnetic glottal waveform sensor for applications in high acoustic noise environments." Link to electronic thesis, 2004. http://www.wpi.edu/Pubs/ETD/Available/etd-0114104-142855/.
Full text
Keywords: basis functions; perfectly matched layers; PML; neck model; parallel plate resonator; finite element; circulator; glottal waveform; multi-transmission line; dielectric properties of human tissues; radiation currents; weighted residuals; non-acoustic sensor. Includes bibliographical references (p. 104-107).
Keenaghan, Kevin Michael. "A Novel Non-Acoustic Voiced Speech Sensor Experimental Results and Characterization." Link to electronic thesis, 2004. http://www.wpi.edu/Pubs/ETD/Available/etd-0114104-144946/.
Full text
Nehl, Albert Henry. "Investigation of techniques for high speed CMOS arbitrary waveform generation." PDXScholar, 1990. https://pdxscholar.library.pdx.edu/open_access_etds/4109.
Full text
Joseph, Andrew Paul. "Assessing the effects of GMAW-Pulse parameters on arc power and weld heat input /." Connect to resource, 2001. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1130521651.
Full text
Elangovan, Saravanan, Jerry L. Cranford, Letitia Walker, and Andrew Stuart. "A Comparison of the Mismatch Negativity and a Differential Waveform Response." Digital Commons @ East Tennessee State University, 2005. https://dc.etsu.edu/etsu-works/1556.
Full text
Torres, Juan Félix. "Estimation of glottal source features from the spectral envelope of the acoustic speech signal." Diss., Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/34736.
Full text
Liu, Lup Shun Nelson. "Sensitivity analysis and optimization of high-speed VLSI interconnects using asymptotic waveform evaluation." Carleton University dissertation, Electronics Engineering. Ottawa, 1993.
Find full text
Bhatta, Debesh. "Algorithms and methodology for incoherent undersampling based acquisition of high speed signal waveforms using low cost test instrumentation." Diss., Georgia Institute of Technology, 2014. http://hdl.handle.net/1853/54262.
Full text
Tsutsumi, Monike. "Avaliação da videolaringoscopia de alta velocidade de sujeitos normais" [Evaluation of high-speed videolaryngoscopy of normal subjects]. Universidade de São Paulo, 2015. http://www.teses.usp.br/teses/disponiveis/82/82131/tde-28032016-142207/.
Full text
Several studies using laryngeal images from high-speed videolaryngoscopy of normal subjects reveal a diversity of tools and metrics used for different populations. However, a shortage of operational standardization and of reference vocal-fold parameters is evident. The main objectives of this study were to obtain parameters of vocal dynamics using computational tools of the Medical Engineering Research Group (GPEM - CNPq) and to characterize the vocal-fold vibration pattern of normal subjects using glottal area waveforms and high-speed kymography. Methods: From laryngeal images of high-speed videolaryngoscopy we extracted the following quantitative parameters: i) phase times of the glottal area waveform, ii) total period of the vibratory cycle, iii) quotients of high-speed kymography. Furthermore, qualitative parameters of the glottal area waveform were analyzed according to a visual pattern protocol. Results: Mean values from the glottal area waveforms, in milliseconds, were: closed phase: female = 0.83 and male = 2.47; opening phase: female = 2.43 and male = 2.95; closing phase: female = 2.08 and male = 2.53; open phase: female = 6.15 and male = 6.18; total period of the vibratory cycle: female = 6.98 and male = 8.65; closing quotient: female = 0.14 and male = 0.29; opening quotient: female = 0.85 and male = 0.70; speed quotient: female = 1.16 and male = 1.19; in addition, 73% showed a periodic signal. For the high-speed kymography the quantitative parameters obtained were: closed phase: female = 1.75 and male = 3.32; opening phase: female = 1.47 and male = 2.32; closing phase: female = 1.51 and male = 2.22; open phase: female = 2.91 and male = 4.56; and total period of the vibratory cycle: female = 4.67 and male = 7.89. The quotients obtained were: closing quotient: female = 0.37 and male = 0.42; opening quotient: female = 0.62 and male = 0.57; speed quotient: female = 1.02 and male = 1.12. Amplitude symmetry of 59% and phase asymmetry of 54% were obtained in the high-speed kymography of normal subjects.
Conclusion: Using specific computational tools to analyse high-speed laryngeal images, we obtained quantitative and qualitative parameters of glottal area waveforms and high-speed kymography that can be used as standard reference data for normal subjects.
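The phase times and quotients reported above can be computed from a glottal area waveform with simple thresholding: find the closed phase where the area is near zero, split the open phase at the area peak into opening and closing parts, and form the quotients. The sketch below is illustrative only; the closure threshold, the quotient definitions, and the toy waveform are our assumptions, not the study's exact protocol.

```python
import numpy as np

def cycle_quotients(area, fs):
    """Phase times (ms) and quotients for one glottal cycle.

    area: glottal area samples for a single cycle, starting closed.
    fs:   sampling (frame) rate of the high-speed recording.
    """
    thresh = 0.05 * area.max()                 # hypothetical closure threshold
    open_idx = np.where(area > thresh)[0]
    peak = open_idx[np.argmax(area[open_idx])]
    opening = (peak - open_idx[0] + 1) / fs * 1e3
    closing = (open_idx[-1] - peak) / fs * 1e3
    closed = (len(area) - len(open_idx)) / fs * 1e3
    total = len(area) / fs * 1e3
    return {
        "closed_ms": closed, "opening_ms": opening, "closing_ms": closing,
        "open_quotient": (opening + closing) / total,
        "closing_quotient": closing / total,
        "speed_quotient": opening / closing,   # opening time / closing time
    }

fs = 4000                                      # high-speed camera frame rate
t = np.linspace(0, 1, 28, endpoint=False)
cycle = np.maximum(0.0, np.sin(2 * np.pi * t))  # toy half-rectified cycle
q = cycle_quotients(cycle, fs)
```

Averaging these per-cycle values over many cycles and subjects yields reference tables like the ones quoted in the abstract.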
Schulz, Yvonne Katrin. "Parameter analysis of the Glottal Area Waveform based on high-speed recordings within a synthetic larynx model." Supervisor and reviewer: Stefan Kniesburges. Erlangen: Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), 2020. http://d-nb.info/120337769X/34.
Full text
Baker, Katherine Louise. "Cognitive Evoked Auditory Potentials and Neuropsychological Measures Following Concussion in College Athletes." Miami University / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=miami1209744334.
Full text
Narayanan, G. "Synchronised Pulsewidth Modulation Strategies Based On Space Vector Approach For Induction Motor Drives." Thesis, Indian Institute of Science, 1999. http://hdl.handle.net/2005/139.
Full text
LIN, YAN-JUN, and 林妍君. "Speech waveform coding based on vector quantization." Thesis, 1986. http://ndltd.ncl.edu.tw/handle/88277127723707099722.
Full text
Biskup, John Fredrick. "Applications of the speedy delivery waveform." Thesis, 2003. http://hdl.handle.net/2152/29809.
Full text
"Unit selection and waveform concatenation strategies in Cantonese text-to-speech." 2005. http://library.cuhk.edu.hk/record=b5892349.
Full text
Thesis (M.Phil.)--Chinese University of Hong Kong, 2005.
Includes bibliographical references.
Abstracts in English and Chinese.
Chapter 1. --- Introduction --- p.1
Chapter 1.1 --- An overview of Text-to-Speech technology --- p.2
Chapter 1.1.1 --- Text processing --- p.2
Chapter 1.1.2 --- Acoustic synthesis --- p.3
Chapter 1.1.3 --- Prosody modification --- p.4
Chapter 1.2 --- Trends in Text-to-Speech technologies --- p.5
Chapter 1.3 --- Objectives of this thesis --- p.7
Chapter 1.4 --- Outline of the thesis --- p.9
References --- p.11
Chapter 2. --- Cantonese Speech --- p.13
Chapter 2.1 --- The Cantonese dialect --- p.13
Chapter 2.2 --- Phonology of Cantonese --- p.14
Chapter 2.2.1 --- Initials --- p.15
Chapter 2.2.2 --- Finals --- p.16
Chapter 2.2.3 --- Tones --- p.18
Chapter 2.3 --- Acoustic-phonetic properties of Cantonese syllables --- p.19
References --- p.24
Chapter 3. --- Cantonese Text-to-Speech --- p.25
Chapter 3.1 --- General overview --- p.25
Chapter 3.1.1 --- Text processing --- p.25
Chapter 3.1.2 --- Corpus based acoustic synthesis --- p.26
Chapter 3.1.3 --- Prosodic control --- p.27
Chapter 3.2 --- Syllable based Cantonese Text-to-Speech system --- p.28
Chapter 3.3 --- Sub-syllable based Cantonese Text-to-Speech system --- p.29
Chapter 3.3.1 --- Definition of sub-syllable units --- p.29
Chapter 3.3.2 --- Acoustic inventory --- p.31
Chapter 3.3.3 --- Determination of the concatenation points --- p.33
Chapter 3.4 --- Problems --- p.34
References --- p.36
Chapter 4. --- Waveform Concatenation for Sub-syllable Units --- p.37
Chapter 4.1 --- Previous work in concatenation methods --- p.37
Chapter 4.1.1 --- Determination of concatenation point --- p.38
Chapter 4.1.2 --- Waveform concatenation --- p.38
Chapter 4.2 --- Problems and difficulties in concatenating sub-syllable units --- p.39
Chapter 4.2.1 --- Mismatch of acoustic properties --- p.40
Chapter 4.2.2 --- Allophone problem of Initials /z/, /c/ and /s/ --- p.42
Chapter 4.3 --- General procedures in concatenation strategies --- p.44
Chapter 4.3.1 --- Concatenation of unvoiced segments --- p.45
Chapter 4.3.2 --- Concatenation of voiced segments --- p.45
Chapter 4.3.3 --- Measurement of spectral distance --- p.48
Chapter 4.4 --- Detailed procedures in concatenation points determination --- p.50
Chapter 4.4.1 --- Unvoiced segments --- p.50
Chapter 4.4.2 --- Voiced segments --- p.53
Chapter 4.5 --- Selected examples in concatenation strategies --- p.58
Chapter 4.5.1 --- Concatenation at Initial segments --- p.58
Chapter 4.5.1.1 --- Plosives --- p.58
Chapter 4.5.1.2 --- Fricatives --- p.59
Chapter 4.5.2 --- Concatenation at Final segments --- p.60
Chapter 4.5.2.1 --- V group (long vowel) --- p.60
Chapter 4.5.2.2 --- D group (diphthong) --- p.61
References --- p.63
Chapter 5. --- Unit Selection for Sub-syllable Units --- p.65
Chapter 5.1 --- Basic requirements in unit selection process --- p.65
Chapter 5.1.1 --- Availability of multiple copies of sub-syllable units --- p.65
Chapter 5.1.1.1 --- Levels of "identical" --- p.66
Chapter 5.1.1.2 --- Statistics on the availability --- p.67
Chapter 5.1.2 --- Variations in acoustic parameters --- p.70
Chapter 5.1.2.1 --- Pitch level --- p.71
Chapter 5.1.2.2 --- Duration --- p.74
Chapter 5.1.2.3 --- Intensity level --- p.75
Chapter 5.2 --- Selection process: availability check on sub-syllable units --- p.77
Chapter 5.2.1 --- Multiple copies found --- p.79
Chapter 5.2.2 --- Unique copy found --- p.79
Chapter 5.2.3 --- No matched copy found --- p.80
Chapter 5.2.4 --- Illustrative examples --- p.80
Chapter 5.3 --- Selection process: acoustic analysis on candidate units --- p.81
References --- p.88
Chapter 6. --- Performance Evaluation --- p.89
Chapter 6.1 --- General information --- p.90
Chapter 6.1.1 --- Objective test --- p.90
Chapter 6.1.2 --- Subjective test --- p.90
Chapter 6.1.3 --- Test materials --- p.91
Chapter 6.2 --- Details of the objective test --- p.92
Chapter 6.2.1 --- Testing method --- p.92
Chapter 6.2.2 --- Results --- p.93
Chapter 6.2.3 --- Analysis --- p.96
Chapter 6.3 --- Details of the subjective test --- p.98
Chapter 6.3.1 --- Testing method --- p.98
Chapter 6.3.2 --- Results --- p.99
Chapter 6.3.3 --- Analysis --- p.101
Chapter 6.4 --- Summary --- p.107
References --- p.108
Chapter 7. --- Conclusions and Future Works --- p.109
Chapter 7.1 --- Conclusions --- p.109
Chapter 7.2 --- Suggested future works --- p.111
References --- p.113
Appendix 1 Mean pitch level of Initials and Finals stored in the inventory --- p.114
Appendix 2 Mean durations of Initials and Finals stored in the inventory --- p.121
Appendix 3 Mean intensity level of Initials and Finals stored in the inventory --- p.124
Appendix 4 Test word used in performance evaluation --- p.127
Appendix 5 Test paragraph used in performance evaluation --- p.128
Appendix 6 Pitch profile used in the Text-to-Speech system --- p.131
Appendix 7 Duration model used in Text-to-Speech system --- p.132
Xu, Wen-Long, and 許文龍. "A Mandarin Speech Synthesizer Using Time Proportionated Interpolation of Pitch Waveform." Thesis, 1996. http://ndltd.ncl.edu.tw/handle/70269225450402799343.
Full text
Hsu, Wen-Lung, and 許文龍. "A Mandarin Speech Synthesizer Using Time Proportionated Interpolation of Pitch Waveform." Thesis, 1996. http://ndltd.ncl.edu.tw/handle/64478207895251783493.
Full text
National Taiwan University of Science and Technology (國立臺灣科技大學)
Graduate Institute of Electrical Engineering (電機工程研究所)
1995 (ROC year 84)
In this thesis, a text-to-speech system is designed and implemented on the MS-Windows operating system. The 408 first-tone Mandarin syllables are adopted as the synthesis units. For the synthesis of the syllable signal, a time-domain processing method called "Time Proportionated Interpolation of Pitch Waveform (TPIPW)" is proposed. For the prosodic processing unit, a rule-based method proposed by other researchers is adopted and slightly modified. In our method, the two parts of a syllable, i.e. the unvoiced part (e.g. voiceless consonants) and the voiced part (e.g. voiced consonants and vowels), are processed separately; the method's name reflects the voiced-part processing. By using this method, a syllable's tone (or pitch contour), duration, and formant-frequency height can be controlled almost independently. In particular, the duration of a syllable can be changed freely to a value between one half and double of the original length without notable side effects on the other two control factors. Besides, a function for increasing or decreasing formant-frequency values is provided to simulate adjustment of the vocal-tract length, so that the originally recorded male voice can be more naturally converted to a female voice. For the unvoiced part, signal waveforms are classified into two classes and a method is proposed to process each class differently. This method not only synthesizes clear and intelligible signals but also supports the control of duration.
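The duration control the abstract describes (one half to double the original length, pitch unchanged) can be sketched as time-proportionate interpolation between pitch cycles: output cycle positions are mapped proportionally back into the input cycle sequence, and each output cycle is interpolated from its two neighbours. This is a toy simplification of the idea, not the TPIPW implementation; all names are ours.

```python
import numpy as np

def stretch_voiced(cycles, factor):
    """Change duration by time-proportionate interpolation of pitch cycles.

    cycles: 2-D array (n_cycles, cycle_len) of equal-length pitch cycles.
    factor: desired duration ratio (the thesis allows roughly 0.5 .. 2.0).
    """
    n = len(cycles)
    n_out = max(1, round(n * factor))
    out = []
    for j in range(n_out):
        # Map output cycle j to a fractional position in the input sequence.
        pos = j * (n - 1) / max(1, n_out - 1)
        i, frac = int(pos), pos - int(pos)
        i2 = min(i + 1, n - 1)
        out.append((1 - frac) * cycles[i] + frac * cycles[i2])
    return np.concatenate(out)

t = np.linspace(0, 2 * np.pi, 60, endpoint=False)
cycles = np.stack([(1 - 0.1 * k) * np.sin(t) for k in range(5)])  # fading tone
longer = stretch_voiced(cycles, 1.6)    # ~1.6x duration, same pitch period
shorter = stretch_voiced(cycles, 0.6)   # ~0.6x duration
```

Because each output frame is still exactly one pitch period long, the pitch contour is preserved while the number of cycles, and hence the duration, changes.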
Yan, Jyh-Horng, and 顏誌宏. "The Research of Voice Waveform Reconstruction in CELP Speech Coding due to the Speech Frame''s Codewords Loss." Thesis, 1999. http://ndltd.ncl.edu.tw/handle/77877098696811301575.
Full text
National Taipei University of Technology (國立臺北科技大學)
Graduate Institute of Computer, Communication and Control (電腦通訊與控制研究所)
1998 (ROC year 87)
This thesis explores the best ways to reconstruct the voice waveform when some frames' codewords of the transmitted speech signal are lost in speech coding. We focus on speech coding based on code-excited linear prediction (CELP), and use a high-quality, low-bit-rate vocoder, the FS1016 4.8 kbps CELP coder, to compare voice-reconstruction performance under loss of frame codewords. Considering speed and simplicity, we use reconstruction of the lost frame's codewords as the main approach. The reconstruction methods can be classified into two groups: reconstruction without any information about the lost frame's codewords, and reconstruction with partial information about them. We conduct experiments to investigate the best ways to reconstruct the voice in various situations. We also study the importance of the parameters in a CELP frame's codeword for voice reconstruction, and improve reconstruction quality by keeping the most important partial information in the lost frame's codewords. Our results show that the most important parameter for voice reconstruction under lost frame codewords is the pitch gain. By adequately keeping the pitch gain and LSF coefficients of the lost frames, we can reconstruct acceptable speech, even at a high frame-loss rate (30%). Finally, we implement our results as an ACM codec and use NetMeeting to perform real-time voice transmission over the Internet. We also port our programs to a TI DSP chip (TMS320C54x) system.
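The concealment strategy studied above can be sketched as follows: when a frame's codewords are lost, rebuild the frame from the previous one, and when partial information survives (e.g. the pitch gain and LSF coefficients the thesis found most important), substitute those received fields. The field names below are illustrative CELP-style parameters, not the FS1016 bitstream layout, and the damping factor is a hypothetical choice.

```python
from dataclasses import dataclass, replace

@dataclass
class CelpFrame:
    lsf: tuple          # spectral envelope (LSF coefficients)
    pitch_lag: int      # adaptive-codebook lag
    pitch_gain: float   # adaptive-codebook gain
    codebook_gain: float

def conceal(prev, received=None):
    """Rebuild a lost frame from the previous frame's parameters.

    received: dict of any fields that did survive transmission, else None.
    """
    if received is None:
        # Total loss: repeat the previous frame with slightly damped gains
        # to avoid sustaining a stale excitation at full level.
        return replace(prev, pitch_gain=0.9 * prev.pitch_gain,
                       codebook_gain=0.9 * prev.codebook_gain)
    return replace(prev, **received)

prev = CelpFrame(lsf=(0.03, 0.05, 0.09), pitch_lag=60,
                 pitch_gain=0.8, codebook_gain=1.2)
full_loss = conceal(prev)
partial = conceal(prev, {"pitch_gain": 0.7, "lsf": (0.02, 0.06, 0.1)})
```

The decoder then synthesizes the concealed frame exactly as if its parameters had been received, which is why keeping the perceptually dominant fields (pitch gain, LSFs) recovers most of the quality.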
Huang, Tzu-Yun, and 黃姿云. "A Dual Complementary Acoustic Embedding Network: Mining Discriminative Characteristics from Raw-waveform for Speech Emotion Recognition." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/y8zcm7.
Full text
Chandravadan, Vora Santoshkumar. "Novel Methods For Estimation Of Static Nonlinearity Of High-Speed High-Resolution Waveform Digitizers." Thesis, 2009. http://hdl.handle.net/2005/1019.
Full text
Chang, Lung-Yin, and 張龍吟. "Research on speed and quality of anodic bonding using applied voltage with various waveforms." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/s656s6.
Full text
National Taiwan Normal University (國立臺灣師範大學)
Department of Industrial Education (工業教育學系)
2007 (ROC year 96)
Anodic bonding is an important technique often used in the packaging of MEMS components. It joins materials through ionic bonding, requires the surfaces of the silicon and glass to be very flat, and is a direct (non-intermediate-layer) bonding method. Important factors for the bonding ratio and quality include the applied voltage, the temperature, and the type of electrode. Different forms of applied voltage lead to different bonding results, for the following reason: with a constant voltage waveform the bonding current decays from its maximum as the bonding time increases, whereas a variable voltage waveform keeps the current at a high value, greatly improving the bonding ratio and quality. This research shows that, using a radiate-line electrode with a square voltage waveform to bond a 4-inch wafer, the bonding ratio can reach 99.2% when the average voltage is 250 V, the period is 8 s, the temperature is 400 ºC, and the bonding time is 200 s. In this research, we develop a novel conical-frustum electrode combined with a variable voltage waveform for anodic bonding. It not only keeps the bonding current at a high value to reduce the bonding time, but also achieves the same bonding quality as a constant applied voltage. The research shows that, using the novel electrode with a constant voltage waveform to bond a 4-inch wafer, the bonding ratio can reach 99.98% when the average voltage is 800 V, the temperature is 400 ºC, and the bonding time is 15 s. Using the novel electrode with a square voltage waveform, the bonding ratio reaches only 72.93% when the average voltage is 250 V, the period is 8 s, the temperature is 400 ºC, and the bonding time is 15 s. The efficiency of the bonding system is limited when square or constant voltage waveforms are combined with the conical-frustum electrode; although the output voltage cannot reach the set value during bonding, the research still achieves its expected purpose.
Zheng, Yu-Xiang, and 鄭宇翔. "Analysis of Time-domain Waveform for Stub Effect in High-Speed Digital Circuits." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/p7u4d6.
Full text
Chung Yuan Christian University (中原大學)
Graduate Institute of Electronic Engineering (電子工程研究所)
2017 (ROC year 106)
This thesis investigates how via stubs affect the time-domain transmission (TDT) waveform, the time-domain reflectometry (TDR) waveform, S21, S11, and the eye diagram in high-speed circuits. Owing to the limitations of the laboratory process, the via stub was changed to an open stub. To understand the effect on the waveform, various parameters of the open stub are analysed. How the time-domain reflection noise affects the TDR waveform, and how the step voltage affects the TDT waveform, are investigated using a lattice diagram; in special cases, waveform overlays are used to explain the behavior of the time-domain transmission waveform. The effects on the TDT waveform of mismatched stub lengths due to layer changes are investigated, and the effect on rise time is observed by changing the number of open stubs. An eye mask is proposed in order to create a design chart. Formulas for the TDT waveform are derived from the lattice diagram and the waveform overlays.
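The lattice (bounce) diagram bookkeeping mentioned above amounts to tracking reflection coefficients at each discontinuity and accumulating the arrivals at the observation point. The sketch below does this for the simplest case, a single lossless line between a source and a load, to show how the staircase TDR levels arise; it is a textbook illustration, not the thesis's derivation, and the impedance values are hypothetical.

```python
def tdr_steps(z_src, z_line, z_load, n_bounces=4):
    """Cumulative voltage levels at the source end of a lossless line.

    Unit-step, unit-amplitude source; returns the TDR level after the
    initial launch and after each subsequent round-trip arrival.
    """
    rho_s = (z_src - z_line) / (z_src + z_line)    # source reflection coeff.
    rho_l = (z_load - z_line) / (z_load + z_line)  # load reflection coeff.
    v_launch = z_line / (z_src + z_line)           # step launched into line
    v = v_launch
    levels = [v]
    wave = v_launch
    for _ in range(n_bounces):
        wave *= rho_l              # reflect off the load
        v += wave * (1 + rho_s)    # arrival: incident plus its re-reflection
        levels.append(v)
        wave *= rho_s              # re-reflected part travels back out
    return levels

# Matched 50-ohm source and line driving a 100-ohm load:
levels = tdr_steps(50.0, 50.0, 100.0)
# Levels settle at the resistive divider value z_load / (z_src + z_load).
```

Adding a stub introduces a second discontinuity, so the same bookkeeping produces multiple overlapping arrival trains; superposing them (the "waveform overlays" of the thesis) yields the distorted TDT step.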
"low bit rate speech coder based on waveform interpolation =: 基於波形預測方法的低比特率語音編碼." 1999. http://library.cuhk.edu.hk/record=b5889941.
Full text
Thesis (M.Phil.)--Chinese University of Hong Kong, 1999.
Includes bibliographical references (leaves 101-107).
Text in English; abstracts in English and Chinese.
by Ge Gao.
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Attributes of speech coders --- p.1
Chapter 1.1.1 --- Bit rate --- p.2
Chapter 1.1.2 --- Speech quality --- p.3
Chapter 1.1.3 --- Complexity --- p.3
Chapter 1.1.4 --- Delay --- p.4
Chapter 1.1.5 --- Channel-error sensitivity --- p.4
Chapter 1.2 --- Development of speech coding techniques --- p.5
Chapter 1.3 --- Motivations and objectives --- p.7
Chapter 2 --- Waveform interpolation speech model --- p.9
Chapter 2.1 --- Overview of speech production model --- p.9
Chapter 2.2 --- Linear prediction (LP) --- p.11
Chapter 2.3 --- Linear-prediction based analysis-by-synthesis coding (LPAS) --- p.14
Chapter 2.4 --- Sinusoidal model --- p.15
Chapter 2.5 --- Mixed Excitation Linear Prediction (MELP) model --- p.16
Chapter 2.6 --- Waveform interpolation model --- p.16
Chapter 2.6.1 --- Principles of waveform interpolation model --- p.18
Chapter 2.6.2 --- Outline of a WI coding system --- p.25
Chapter 3 --- Pitch detection --- p.31
Chapter 3.1 --- Overview of existing pitch detection methods --- p.31
Chapter 3.2 --- Robust Algorithm for Pitch Tracking (RAPT) --- p.33
Chapter 3.3 --- Modifications of RAPT --- p.37
Chapter 4 --- Development of a 1.7kbps speech coder --- p.44
Chapter 4.1 --- Architecture of the coder --- p.44
Chapter 4.2 --- Encoding of unvoiced speech --- p.46
Chapter 4.3 --- Encoding of voiced speech --- p.46
Chapter 4.3.1 --- Generation of PCW --- p.48
Chapter 4.3.2 --- Variable Dimensional Vector Quantization (VDVQ) --- p.53
Chapter 4.3.3 --- Sparse frequency representation (SFR) of speech --- p.56
Chapter 4.3.4 --- Sample selective linear prediction (SSLP) --- p.58
Chapter 4.4 --- Practical implementation issues --- p.60
Chapter 5 --- Development of a 2.0kbps speech coder --- p.67
Chapter 5.1 --- Features of the coder --- p.67
Chapter 5.2 --- Postfiltering --- p.75
Chapter 5.3 --- Voice activity detection (VAD) --- p.76
Chapter 5.4 --- Performance evaluation --- p.79
Chapter 6 --- Conclusion --- p.85
Chapter A --- Subroutine for pitch detection algorithm --- p.88
Chapter B --- Subroutines for Pitch Cycle Waveform (PCW) generation --- p.96
Chapter B.1 --- The main subroutine --- p.96
Chapter B.2 --- Subroutine for peak picking algorithm --- p.98
Chapter B.3 --- Subroutine for encoding the residue (using VDVQ) --- p.99
Chapter B.4 --- Subroutine for synthesizing PCW from its residue --- p.100
Bibliography --- p.101