Dissertations / Theses on the topic 'Coder'
Consult the top 50 dissertations / theses for your research on the topic 'Coder.'
Asteborg, Marcus. "Flexible Audio Coder." Thesis, KTH, Ljud- och bildbehandling, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-55344.
Tong, Henry Hoi-Yu. "A perceptually adaptive JPEG coder." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1997. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp03/MQ29417.pdf.
Madour, Lila. "A low-delay code excited linear prediction speech coder at 8 kbit/s /." Thesis, McGill University, 1994. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=68042.
Choy, Eddie L. T. "Waveform interpolation speech coder at 4 kbs." Thesis, McGill University, 1998. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=20901.
De, Aloknath. "Auditory distortion measures for speech coder evaluation." Thesis, McGill University, 1993. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=41270.
Unno, Takahiro. "An improved mixed excitation linear predictive (MELP) coder." Thesis, Georgia Institute of Technology, 1998. http://hdl.handle.net/1853/13270.
Chai, Shan. "Performance Evaluation of Perceptually Lossless Medical Image Coder." RMIT University. Electrical and Computer Engineering, 2007. http://adt.lib.rmit.edu.au/adt/public/adt-VIT20080205.120648.
Choy, Eddie L. T. "Waveform interpolation speech coder at 4 kb/s." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape11/PQDD_0028/MQ50596.pdf.
Neumeyer, Leonardo G. (Leonardo Gabriel) Carleton University Dissertation Engineering Electrical. "A low-delay backward-adaptive CELP speech coder." Ottawa, 1990.
Hardwick, John C. (John Clark). "A 4.8 Kbps multi-band excitation speech coder." Thesis, Massachusetts Institute of Technology, 1988. http://hdl.handle.net/1721.1/14751.
Brandstein, Michael Shapiro. "A 1.5 Kbps multi-band excitation speech coder." Thesis, Massachusetts Institute of Technology, 1990. http://hdl.handle.net/1721.1/14283.
Includes bibliographical references (leaves 58-60).
by Michael Shapiro Brandstein.
M.S.
Gaylord, W. J. (William J. ). "A hierarchical video coder with application to multipoint teleconferencing." Thesis, Georgia Institute of Technology, 1993. http://hdl.handle.net/1853/14725.
Pereira, Wesley. "Modifying LPC parameter dynamics to improve speech coder efficiency." Thesis, McGill University, 2001. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=32970.
Konaté, Cheick Mohamed. "Enhancing speech coder quality: improved noise estimation for postfilters." Thesis, McGill University, 2011. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=104578.
ITU-T G.711.1 is a multirate wideband extension of the widely deployed ITU-T G.711 audio compression standard. The extension is interoperable with the original narrowband version. When legacy G.711 is used to encode a speech signal and G.711.1 is used to decode it, quantization noise can be audible. For this case, the standard proposes an optional postfilter. The postfilter requires an estimate of the quantization noise, and the accuracy of that estimate affects the postfilter's performance. In this thesis, we propose a better quantization-noise estimator for the postfilter proposed for the G.711.1 codec and evaluate its performance. The proposed estimator gives a more accurate estimate of the quantization noise at the same complexity.
Nagaswamy, Sriram. "Comparison of CELP speech coder with a wavelet method." Lexington, Ky. : [University of Kentucky Libraries], 2005. http://lib.uky.edu/ETD/ukyelen2006t00376/Thesis.pdf.
Title from document title page (viewed on January 30, 2006). Document formatted into pages; contains: ix, 124 p. : ill. Includes abstract and vita. Includes bibliographical references (p. 118-123).
Nagaswamy, Sriram. "Comparison of CELP speech coder with a wavelet method." UKnowledge, 2006. http://uknowledge.uky.edu/gradschool_theses/269.
Wong, Wing-Tak Kenneth. "A speech coder design for land mobile radio communications." Thesis, University of Liverpool, 1989. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.237531.
Lawai, Adnan Husain. "Scalable coding of HDTV pictures using the MPEG coder." Thesis, Massachusetts Institute of Technology, 1994. http://hdl.handle.net/1721.1/38008.
Includes bibliographical references (leaves 118-121).
by Adnan Husain Lawai.
M.S.
Lalgudi, Hariharan G., Michael W. Marcellin, Ali Bilgin, and Mariappan S. Nadar. "SCALABLE LOW COMPLEXITY CODER FOR HIGH RESOLUTION AIRBORNE VIDEO." International Foundation for Telemetering, 2007. http://hdl.handle.net/10150/605492.
Real-time transmission of airborne images to a ground station is highly desirable in many telemetering applications. Such transmission is often through an error-prone, time-varying wireless channel, possibly under jamming conditions. Hence, a fast, efficient, scalable, and error-resilient image compression scheme is vital to realize the full potential of airborne reconnaissance. JPEG2000, the current international standard for image compression, offers most of these features. However, the computational complexity of JPEG2000 limits its use in some applications. Thus, we present a scalable low complexity coder (SLCC) that possesses many desirable features of JPEG2000 while achieving high throughput.
Khan, Mohammad M. A. "Coding of excitation signals in a waveform interpolation speech coder." Thesis, McGill University, 2001. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=32961.
Product code vector quantizers (PC-VQ) are a family of structured VQs that circumvent the complexity obstacle. The performance of product code VQs can be traded off against their storage and encoding complexity. This thesis introduces split/shape-gain VQ, a hybrid product code VQ, as an approach to quantizing the magnitude of the slowly evolving waveform (SEW). The amplitude spectrum of the SEW is split into three non-overlapping subbands. The gains of the three subbands form the gain vector, which is quantized using the conventional Generalized Lloyd Algorithm (GLA). Each shape vector, obtained by normalizing a subband by its corresponding coded gain, is quantized using a dimension conversion VQ along with a perceptually based bit allocation strategy and a perceptually weighted distortion measure. At the receiver, the discontinuity of the gain contour at the subband boundaries introduces buzziness in the reconstructed speech. This problem is tackled by smoothing the gain versus frequency contour using a piecewise monotonic cubic interpolant. Simulation results indicate that the new method improves speech quality significantly.
The necessity of SEW phase information in the WI coder is also investigated in this thesis. Informal subjective test results demonstrate that transmission of SEW magnitude encoded by split/shape-gain VQ and inclusion of a fixed phase spectrum drawn from a voiced segment of a high-pitched male speaker obviates the need to send phase information.
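The split into subband gains and normalized shapes described in this abstract can be illustrated with a minimal Python sketch. The function names, the RMS definition of gain, and the subband edges are assumptions made for illustration; the thesis normalizes by the coded (quantized) gain, whereas this sketch uses the unquantized gain and omits the quantizers entirely.

```python
import math

def split_shape_gain(spectrum, edges):
    """Split a magnitude spectrum into subbands and separate gain from shape.

    `edges` holds subband boundaries as indices (hypothetical values),
    e.g. [0, 4, 8, 12] gives three non-overlapping subbands.
    """
    gains, shapes = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = spectrum[lo:hi]
        # Gain: root-mean-square value of the subband (an assumed definition).
        gain = math.sqrt(sum(v * v for v in band) / len(band))
        # Shape: the subband divided by its gain; all-zero bands are kept as is.
        shape = [v / gain for v in band] if gain > 0 else list(band)
        gains.append(gain)
        shapes.append(shape)
    return gains, shapes

def reconstruct(gains, shapes):
    """Rebuild the spectrum by scaling each shape vector by its gain."""
    out = []
    for gain, shape in zip(gains, shapes):
        out.extend(gain * v for v in shape)
    return out
```

In a full coder the gain vector and each shape vector would be quantized separately before reconstruction, which is where the trade-off between storage and encoding complexity arises.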
Iyengar, Vasu. "A low delay 16 kbit/sec coder for speech signals /." Thesis, McGill University, 1987. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=63799.
Khan, Abdul Hannan. "Tree encoding in the ITU-T G.711.1 speech coder." Thesis, McGill University, 2011. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=97215.
This thesis studies in detail improvements to the ITU-T G.711.1 speech coder. The original G.711 coder is in effect a μ-law quantizer. The wideband extension G.711.1 uses noise shaping together with a lower-band enhancement layer in addition to the higher band. To improve the coding of the core lower band, we study the use of vector quantization and delayed decision. The tree coder with delayed decision is implemented with the (M,L) algorithm. The new quantizer takes past information into account and therefore also accounts for the propagation of the error caused by noise shaping. It codes several samples using μ-law. The final bitstream is compatible with the decoder of the G.711.1 wideband extension and therefore naturally with the original G.711 decoder. An evaluation method, ITU-T P.862 (PESQ), is used to assess performance. The results show that vector quantization and the tree coder perform perceptually better than the original core-band coder. We note, however, that they are computationally more complex to implement. Further studies are suggested.
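As background for the μ-law quantizer mentioned in this abstract, the following Python sketch shows the continuous μ-law companding formula followed by uniform quantization. It is an illustration only: the real G.711 codec uses a segmented piecewise-linear approximation with a specific bit layout, and the function names and the 127-level scaling here are assumptions, not the standard's definition.

```python
import math

MU = 255.0  # companding constant commonly associated with mu-law

def mu_compress(x, mu=MU):
    """Map x in [-1, 1] to [-1, 1] with logarithmic compression."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_expand(y, mu=MU):
    """Inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

def quantize(x, bits=8):
    """Clip, compand, then quantize uniformly to a signed integer code."""
    levels = 2 ** (bits - 1) - 1  # 127 for 8 bits (assumed scaling)
    clipped = max(-1.0, min(1.0, x))
    return round(mu_compress(clipped) * levels)

def dequantize(code, bits=8):
    """Map an integer code back to an amplitude in [-1, 1]."""
    levels = 2 ** (bits - 1) - 1
    return mu_expand(code / levels)
```

Because the step size is uniform in the companded domain, small amplitudes are reconstructed with finer resolution than large ones, which is the property a sample-by-sample μ-law quantizer relies on.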
Belkoura, Zouhair M. "Analysis and application of turbo coder based distributed video coding." [S.l.] : [s.n.], 2007. http://opus.kobv.de/tuberlin/volltexte/2007/1618.
Keisarian, Farhad. "A pyramid image coder using Block Template Matching (BTM) algorithm." Thesis, University of Nottingham, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.263412.
Sahle, Eskinder. "Development of a user interface for MARIAN and CODER systems." Master's thesis, This resource online, 1993. http://scholar.lib.vt.edu/theses/available/etd-04272010-020142/.
Rosenberg, Jonathan David. "Recasting a scene adaptive video coder for real time implementation." Thesis, Massachusetts Institute of Technology, 1995. http://hdl.handle.net/1721.1/36544.
Includes bibliographical references (p. 117-118).
by Jonathan David Rosenberg.
M.S.
Allen, Matthew S. "Performance Assessment of Model-Driven FPGA-based Software-Defined Radio Development." Digital WPI, 2014. https://digitalcommons.wpi.edu/etd-theses/943.
Skretkowicz, Steven J. "Implementing a low-complexity, adaptive, layered video coder for video teleconferencing." Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 1999. http://handle.dtic.mil/100.2/ADA371067.
"September 1999". Thesis advisor(s): Murali Tummala. Includes bibliographical references (p. 103-105). Also Available online.
LeBlanc, Wilfrid P. (Wilfrid Paul) Carleton University Dissertation Engineering Electrical. "An advanced speech coder based on a rate-distortion theory framework." Ottawa, 1988.
Min, Rex K. (Rex Kee) 1976. "Demonstration system for an ultra-low-power video coder and decoder." Thesis, Massachusetts Institute of Technology, 1999. http://hdl.handle.net/1721.1/80553.
Weaver, Marybeth Therese. "Implementing an intelligent information retrieval system: the CODER system, version 1.0." Thesis, Virginia Tech, 1988. http://hdl.handle.net/10919/44097.
Master of Science
Namburu, Visala. "Speech Coder using Line Spectral Frequencies of Cascaded Second Order Predictors." Thesis, Virginia Tech, 2001. http://hdl.handle.net/10919/35670.
Master of Science
Fan, Yun-Hui. "A stereo audio coder with a nearly constant signal-to-noise ratio." Diss., Georgia Institute of Technology, 2001. http://hdl.handle.net/1853/14788.
Carlén, Stefan. "Estimation of visual focus for control of a FOA-based image coder." Thesis, Linköping University, Department of Electrical Engineering, 2003. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-2032.
A major feature of the human eye is the non-uniform sensitivity of the retina. An image coder that makes use of this can heavily compress the parts of the image that are not close to the focus of our eyes. Existing image coding schemes require that the gaze direction of the viewer is measured. It would be a great advantage, however, if an estimator could instead predict the focus of attention (FOA) regions in the image.
This report presents such an implementation, which is based on a model that mimics many of the biological features of the human visual system (HVS). For example, it uses a center-surround mechanism, which is a replica of the receptive fields of the neurons in the HVS.
An extra feature of the implementation is the extension to handle video sequences, and the expansion of the FOAs. Tests of the system show good results on a large variety of images.
Chou, Chih-Ping, and 周治平. "An Efficient Speech Coder." Thesis, 1994. http://ndltd.ncl.edu.tw/handle/22928460631222292871.
National Chung Hsing University
Institute of Applied Mathematics
82
In this thesis, we design a speech coder with less distortion. The speech coder removes redundancy (silence and quasi-periodicity) from the speech signal, and it combines a lossy coder with a lossless coder to achieve a high compression ratio. For practical application, we design a speech coder that combines the representative period wave algorithm, ADPCM, and the LZW algorithm. According to the experimental results, this speech coder achieves a high compression ratio (about 85%), real-time decoding, and toll quality.
Jhuang, Jyun-Jie, and 莊俊傑. "An Improved Central Decoder for Multiple Description Coder Integrated with Distributed Video Coder." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/49009735230666378491.
National Taiwan University of Science and Technology
Department of Electrical Engineering
97
Multiple Description Coding (MDC) is an effective solution for multimedia transmission over the congested Internet and unstable wireless networks. MDC promises stable and reliable multimedia communications over multiple transmission paths. Distributed Video Coding (DVC) shifts coding complexity from the encoder to the decoder in order to provide video communications for mobile devices. The Wyner-Ziv codec was proposed for DVC-based image/video coding. These two approaches, MDC and DVC, can be effectively integrated to provide robust and highly efficient video coding. We propose to integrate MDC with DVC, abbreviated as MDVC, to provide stable and efficient video coding. The MDVC yields two separate video descriptions, each comprising one key-frame bitstream and one WZ-frame bitstream. To improve the rate-distortion performance of the MDVC, we adopt DPCM at the encoder side to reduce the signal entropy for encoding. For DPCM predictive coding, an adaptive prediction method is designed to select the best prediction frame based on the inter-frame correlations, which demonstrated better PSNR performance than previous works. Simulations demonstrated that the proposed MDVC codec outperforms previous research in image PSNR when there are no transmission errors. It also demonstrated robustness to transmission errors and smaller PSNR variations. For the MDVC central decoder, the error correlations within and between descriptions are used to dynamically select the best reconstructed frames from the two descriptions, instead of selecting only one description or only the key frames from the two descriptions, to yield the final reconstructed video. Simulations demonstrate average PSNRs 1 to 3 dB higher and PSNR variations 50%-70% smaller than the current central decoder with a fixed control policy.
Miao, Ting Wu, and 苗廷武. "Dynamic micropipelined RSA coder/decoder." Thesis, 1995. http://ndltd.ncl.edu.tw/handle/98711515824482884312.
Liu, Shin-Hua, and 劉欣華. "Improvement of G.723.1 Speech Coder." Thesis, 1999. http://ndltd.ncl.edu.tw/handle/03566954154494548963.
Huang, Hsing-Yang. "Design and Implementation of AAC audio coder." 2004. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0001-1906200412395300.
CHYAN, CHUN-AN, and 簡崇安. "Design and Implementation of JPEG2000 EBCOT coder." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/44684953777007916715.
National Taiwan University
Graduate Institute of Electrical Engineering
90
The JPEG2000 system is the newest standard for still image compression. In this thesis, we discuss the basic architecture of the JPEG2000 system, which can be viewed as an evolution of image compression techniques in recent years. However, the key component, called "EBCOT," involves many bit-level computations and multiple scans, which makes JPEG2000 too slow for some applications when it is executed on a general-purpose CPU. We design and implement an ASIC to accelerate EBCOT. The cycles needed are reduced to about 45% of the original algorithm, and the clock rate can reach 133 MHz in our simulation.
CHEN, JUN-DE, and 陳俊德. "Motion-compensated coder for monochrome image sequences." Thesis, 1988. http://ndltd.ncl.edu.tw/handle/58263237512020963857.
Tai, Chiao-Yen, and 戴譙彥. "Trade-offs in Video Coder VLSI Design." Thesis, 1995. http://ndltd.ncl.edu.tw/handle/30888868208864499827.
National Chiao Tung University
Institute of Electronics
83
This thesis focuses on the system-level design of an image compression coder chip for the CCITT H.261 standard. An H.261 coder consists of several operation units, such as motion estimation, discrete cosine transform, quantization, and variable length coding. This thesis places more emphasis on how to make efficient and economical connections between the various units. The issues under consideration are system operating frequency, system synchronization, general logic vs. memory circuits, the number of external buses, and fabrication process technologies.
Shou, Shu-Chen, and 邵淑真. "Realization of G.723.1 Speech Coder on." Thesis, 2001. http://ndltd.ncl.edu.tw/handle/62672914634810138702.
National Chiao Tung University
Department of Electrical and Control Engineering
89
In this thesis, we investigate a speech coding standard, G.723.1, and implement it on an ARM processor. With the invention and progress of the Internet, multimedia has become essential in wireless communication and 3C products. Audio and video are two aspects of multimedia applications, and speech coding belongs to the former. G.723.1 is a speech coding standard produced by ITU-T (International Telecommunication Union - Telecommunication Standardization Sector) in 1996 for compressing the speech or other audio signal component of multimedia services at a very low bit rate. It is designed on the basis of the CELP concept with dual rates, 5.3 and 6.3 kbit/s, and can be widely used in the video phone, the Internet phone, and particularly as part of the MPEG-4 standard. First, the coder and decoder principles of G.723.1 are described and explained. Then the implementation is realized step by step. To achieve real-time operation, some modifications are made to both the C code and the assembly code to take full advantage of the ARM processor. Finally, real-time implementation is attained on an ARM9 processor.
Sun, Ming-Hung, and 孫銘鴻. "Joint Source-Channel Coding for MELP Coder." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/h7dv25.
National Taipei University of Technology
Department of Electrical Engineering
94
Source coding and channel coding are definitely different techniques in nature. The former is the method that compresses the original signals into small ones, but still maintains the perceptual quality. On the other hand, the latter adds redundancy for detecting and correcting the transmission error. In this thesis, the mixed excitation linear prediction (MELP) vocoder and the soft-output Viterbi algorithm (SOVA) are used to achieve the purpose of joint source-channel decoding (JSCD). These output statistics of MELP vocoder according to voiced or unvoiced state are regarded as the prior probability of the SOVA decoder. The proposed channel decoding scheme is called the JSCD SOVA. Both unequal error protection (UEP) and equal error protection (EEP) are also included for comparison. The experimental results show that the performance of the UEP is higher than that of the EEP in low signal-to-noise ratio (SNR) condition. In addition, using the EEP to protect transmission bits, the JSCD SOVA decoder performs better than the conventional SOVA decoder with equal prior probabilities under poor SNR environment.
Nieh, Yung-Hsiang, and 聶永祥. "AMR Speech Coder and AMR Reference Model." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/6dadzx.
National Taipei University of Technology
Industrial R&D Master's Program in Electronics, Computers, and Communications
95
This paper addresses the principles of the Adaptive Multi-Rate (AMR) speech coder and its underlying theory. Because the AMR speech coding specifications and simulation programs are complicated and not easy to understand, R&D (Research and Development) engineers need a lot of time to understand their structure. We therefore use the Unified Modeling Language (UML) for analysis and design to obtain a reference model of AMR. This model can be used by different people, and it is also a prerequisite for setting standards for an AMR reference model. It meets the needs of companies working with AMR technology that require real-time techniques, and it saves development time for communication and research engineers. This matters because, in the highly competitive telecommunication industry, development time needs to be very short, and the commercialization of R&D products has become key to obtaining maximum profits. Finally, the paper proposes an AMR reference model and develops a value-added service via a speech system without changing the content of the program, while maintaining the communication specifications of the AMR receiver. The purpose is to allow mobile users to benefit from this reference model and to examine its possibilities and practicality.
Lo, Chi-Wen, and 羅啟文. "The Quality Control for Wavelet Video Coder." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/f99f8g.
National Taiwan University of Science and Technology
Department of Electrical Engineering
94
Three-dimensional (3D) wavelet transforms have been used in video codecs (WVC) to generate quality scalable bitstreams for communications over heterogeneous networks. However, reconstructed pictures of 3D-WVC suffer temporal quality fluctuations due to unequal distortion distributions in the code/decode process. Theoretical WVC coding models had been proposed to smooth temporal picture quality fluctuations. However, in practical coding processes, signal properties do not comply with assumptions in the ideal WVC model. We propose to bridge this gap between ideal and practical signal properties to improve the temporal quality smoothness (TS) of WVC. The relation between distortions of subbands and qualities of reconstructed pictures of one WVC was exploited, from which the TS algorithm was developed. The most important features of the proposed TS-WVC method are: 1) low computation complexity; 2) largely reduce the temporal picture quality variations; and 3) make it feasible for real-time WVC applications.
Kuo, Kuo-Bao, and 郭國寶. "Complexity Reduction for IETF-iLBC Speech Coder." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/83293637830643107097.
Southern Taiwan University of Science and Technology
Department of Computer Science and Information Engineering
97
With the development of the Internet, the need to transmit information over the Internet has also increased. Finding a low-rate but high-quality speech coding technology is the major issue for speech communication over the Internet. The iLBC speech codec has been recommended by the international organization Internet Engineering Task Force (IETF). In the iLBC encoder, the dynamic codebooks, including the base codebook, the augmented base codebook, the expanded codebook, and the augmented expanded codebook, bear the maximum load of the coding procedure. In particular, the base codebook and the expanded codebook account for the largest share of the computational complexity in the codebook search. This thesis aims to reduce the computational complexity with perceptually negligible degradation of the speech quality. A pattern-pre-inspecting search method is therefore proposed, which reduces the computation of the codebook search. In addition, this thesis proposes a complexity-scalable design for the adaptive codebook search on hand-held or mobile devices, so that the coding process can be adjusted dynamically according to the computational load. According to the MOS-LQO score, an objective measure from the ITU standards, the proposed approaches not only substantially reduce the computational complexity but also maintain good quality of the coded speech. Furthermore, because the writing format of IETF documents differs from that of other international standards, they are difficult for engineers to read and implement. The thesis therefore also illustrates the iLBC coding process in detail and rewrites it in the style of the ITU and MPEG standards.
Cai, Yu-Lin, and 蔡育霖. "Empirical Mode Decomposition in Voice Coder Application." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/64973999073939345026.
National Taiwan Ocean University
Department of Electrical Engineering
97
Waveform speech coding has the advantages of high quality and low complexity. In this thesis, the empirical mode decomposition (EMD) proposed by E. Huang is first used to decompose nonstationary speech into several intrinsic mode functions (IMFs). The fast Fourier transform (FFT) is then employed to analyze the frequency spectrum of each IMF, from which the cutoff frequency of the subsequent high-pass filter is decided and used to eliminate low-frequency content of the original speech signal for data reduction. Finally, the filtered speech signal is fed into a voice coder for compression and coding, so that the compression rate can be reduced while preserving the sound quality. In the experiments, speech signals from the MAT-400 voice database are used as the test data, and EMD is combined with the G.729 and DPCM voice encoders respectively for performance comparison.
Huang, Hsing-Yang, and 黃興洋. "Design and Implementation of AAC audio coder." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/58945082713462878185.
National Taiwan University
Graduate Institute of Computer Science and Information Engineering
92
AAC provides the highest compression rate and quality among all audio coding standards. However, the complexity of the AAC encoder is also very high. The high complexity mainly comes from the large number of operations performed in the filter bank, the psychoacoustic model, and the bit allocation module. In this thesis, we study fast algorithms and try to reduce the complexities of the filter bank and the bit allocation module. The fast MDCT algorithm takes advantage of the relationship between the MDCT and the O2DFT (Odd Time Odd Frequency DFT) and reduces the MDCT computation to an N/4-point FFT. For the bit allocation module, we apply a fast initial gain search method and a noise estimation method to develop a new bit allocation module. The new module reduces the two-nested-loop structure to a single loop, so its complexity is much lower than that of the bit allocation module proposed in the AAC standard. These algorithms efficiently reduce the complexities of the filter bank and the bit allocation module. Compared to the well-known coder FAAC, our coder saves 50% of the encoding time and still maintains good quality.
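For context on the filter bank cost discussed in this abstract, the following Python sketch evaluates the MDCT directly from its textbook definition, which takes on the order of N² operations per block. It is a baseline illustration only, not the fast O2DFT-based algorithm of the thesis; the function name is hypothetical and windowing is omitted.

```python
import math

def mdct_direct(block):
    """Direct MDCT: maps 2M input samples to M coefficients.

    Uses X[k] = sum_n x[n] * cos(pi/M * (n + 0.5 + M/2) * (k + 0.5)),
    evaluated term by term with no fast algorithm and no window.
    """
    if len(block) % 2:
        raise ValueError("block length must be even")
    m = len(block) // 2
    coeffs = []
    for k in range(m):
        acc = 0.0
        for n, x in enumerate(block):
            acc += x * math.cos(math.pi / m * (n + 0.5 + m / 2) * (k + 0.5))
        coeffs.append(acc)
    return coeffs
```

The nested loops make the quadratic cost visible, which is the motivation for reducing the transform to a much shorter FFT as the abstract describes.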
Lin, Tzung-Liang, and 林宗良. "A VARIABLE BIT RATE CELP SPEECH CODER." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/88623392182276404082.
Tatung University
Graduate Institute of Electrical Engineering
90
In the era of mobile and network communication, speech is still the most natural and convenient way for humans to exchange information. Attempts are made continuously to pursue speech coding techniques with lower bit rates and better synthetic speech quality. The use of variable bit rate (VBR) coders is undeniably an attractive approach for maintaining speech quality at a lower average bit rate. The aim of this thesis is thus to design a VBR speech coder based on Code Excited Linear Prediction (CELP) at 4.8 kbps, which was standardized as Federal Standard FS-1016 in 1982. The basic idea of our system design comes from the following observation: the frame size of most speech coders is around 20-30 ms, while the speech signal is slowly time-varying, e.g., vowel sounds may last for 200-300 ms, during which the vocal tract remains nearly unchanged. This observation suggests that the speech parameters, such as the Linear Predictive Coding (LPC) parameters and the Line Spectral Frequencies (LSFs), may be highly similar between the current frame and some temporally close previous frames. This means that it is not necessary to transmit a new set of parameters for each frame. Instead, the speech parameters of a previous frame may be reused in the decoder to save bit rate. Based on this concept, we introduce an adaptive forward/backward quantization (AFBQ) [1] scheme to reduce the bits required for transmitting LPC parameters. Specifically, the spectral distances between the current frame and some previous frames are calculated, and an experimentally determined threshold is used to decide whether the LPC parameters of the current frame should be transmitted or whether it is sufficient to transmit only a location index of a previous frame. The AFBQ scheme can reduce the bit rate at a minimal cost in speech quality.
To further reduce the bit rate, instead of using a 34-bit scalar quantizer, the proposed VBR coder uses a 10-bit vector quantizer (VQ) for the quantization of the LSF parameters. To improve speech quality, we adopt a perceptually weighted distance measure in the LSF vector quantizer and incorporate an interpolation scheme for the LSF parameters to smooth the spectral changes in the synthetic speech. The LSF interpolation scheme improves the speech quality without transmitting extra bits, but at the cost of a 15-ms longer coding delay. Our experimental results and informal listening tests show that, by using the AFBQ scheme and the LSF vector quantizer, the proposed VBR speech coder can maintain speech quality at a lower average bit rate. For example, the VBR coder at 3.9 kbps has an average segmental signal-to-noise ratio (segSNR) only 0.6 dB lower than that of the 4.8 kbps CELP coder, and their synthetic speech quality can hardly be differentiated. The experimental results also show that including the LSF interpolation scheme improves the speech quality, raising the average segSNR by 0.8 dB.
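The per-frame decision at the heart of the AFBQ scheme, as described in this abstract, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the Euclidean distance between LSF vectors stands in for the thesis's spectral distance, and the function names, the history buffer, and the threshold value are hypothetical.

```python
def spectral_distance(a, b):
    """Euclidean distance between two LSF vectors (an assumed measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def afbq_decide(current, history, threshold):
    """Choose between sending new parameters and reusing a previous frame.

    Returns ("index", i) when history[i] is close enough to the current
    frame, otherwise ("params", current).
    """
    best_i, best_d = None, float("inf")
    for i, prev in enumerate(history):
        d = spectral_distance(current, prev)
        if d < best_d:
            best_i, best_d = i, d
    if best_i is not None and best_d <= threshold:
        return ("index", best_i)
    return ("params", current)
```

Sending only an index costs far fewer bits than a full parameter set, so the average bit rate falls as more frames pass the threshold test; the threshold trades that saving against spectral accuracy.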