Dissertations / Theses: 'Coder'

1

Asteborg, Marcus. "Flexible Audio Coder." Thesis, KTH, Ljud- och bildbehandling, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-55344.

Abstract:

As modern communications networks are heterogeneous and, therefore, highly variable, the design of source coders should reflect this network variability. It is desirable that source coders be able to adapt instantly to any bit-rate constraint. Source coders that possess this property offer coding flexibility that facilitates optimal utilization of the available communication channel within a heterogeneous network. Flexible coders are able to utilize feedback information and therefore perform an instant re-optimization. This property implies that flexible audio coders are better suited to networks with high variability, because a single configuration of the flexible coder can operate over a continuum of bit-rates. The aim of the thesis is to implement a flexible audio coder in a real-time demonstrator (a VoIP application) that facilitates instant re-optimization of the flexible coding scheme. The demonstrator provides real-time full-duplex communications over a packet network, and the operating bit-rate may be adjusted on the fly. The coding performance of the flexible audio coding scheme should remain comparable to that of non-flexible schemes optimized at their operating bit-rates. The report provides a background for the thesis work and describes the real-time implementation of the demonstrator. Finally, test results are provided. The coder is evaluated by means of a subjective MUSHRA test. The performance of the flexible audio coder is compared to relevant state-of-the-art codecs.

2

Tong, Henry Hoi-Yu. "A perceptually adaptive JPEG coder." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1997. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp03/MQ29417.pdf.

3

Madour, Lila. "A low-delay code excited linear prediction speech coder at 8 kbit/s /." Thesis, McGill University, 1994. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=68042.

Abstract:

The goal of this thesis is to design a high quality low-delay 8 kb/s speech coder. This research is motivated by the need of the telecommunication industries to standardize a high quality, low-delay and low rate speech coder. To meet these requirements, we use a coder based on code-excited linear prediction (CELP). To meet the demands of high quality and low bit rate, a vector quantizer is used to code the excitation signal. To meet the low-delay requirement, a backward adaptation technique of the synthesis filters is used. The focus of the research is on comparing different pitch synthesis filters in the CELP coder. Of the three-order pitch synthesis filter, the first-order integer delay pitch synthesis filter and the first-order fractional delay pitch synthesis filter that are tested in this research, the last produces the best quality.

4

Choy, Eddie L. T. "Waveform interpolation speech coder at 4 kb/s." Thesis, McGill University, 1998. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=20901.

Abstract:

Speech coding at bit rates near 4 kbps is expected to be widely deployed in applications such as visual telephony, mobile and personal communications. This research focuses on developing a speech coder based on the waveform interpolation (WI) scheme, with an attempt to deliver near toll-quality speech at rates around 4 kbps. A WI coder has been simulated in floating-point using the C programming language. The high performance of the WI model has been confirmed by subjective listening tests in which the unquantized coder outperforms the 32 kbps G.726 standard (ADPCM) 98% of the time under clean input speech conditions; the reconstructed speech is perceived to be essentially indistinguishable from the original. When fully quantized, the speech quality of the WI coder at 4.25 kbps has been judged to be equivalent to or better than that of G.729 (the ITU-T toll-quality 8 kbps standard) for 45% of the test sentences. Further refinements of the quantization techniques are warranted to bring the coder closer to the toll-quality benchmark. Yet, the existing implementation has produced good quality coded speech with a high degree of intelligibility and naturalness when compared to the conventional coding schemes operating in the neighbourhood of 4 kbps.

5

De, Aloknath. "Auditory distortion measures for speech coder evaluation." Thesis, McGill University, 1993. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=41270.

Abstract:

One of the important research problems in the area of speech coding is to determine the sound quality of coded speech signals. This quality can best be evaluated by a subjective assessment which is often difficult to administer and time consuming. An objective measure which is consistent with subjective assessment could play a vital role in the evaluation as well as in the design of a low bit-rate speech coder. In this dissertation, we introduce two distortion measures for speech coder evaluation. Since the perceptual abilities of a human being determine the precision with which speech data must be processed, we consider the details of cochlear (inner ear) and other auditory processing. Using Lyon's auditory model, the time-domain signal is mapped onto a perceptual-domain (PD). Any speech utterance is communicated to the brain through a series of all-or-none electrical spikes (firings) and the PD representation provides information pertaining to the probability-of-firings in the neural channels. Our first measure, namely the cochlear discrimination information (CDI), evaluates the cross-entropy of the neural firings for the coded speech with respect to those for the original one. With this measure, we also compute the rate-distortion function determining the lowest bit-rate required for a specified amount of distortion. In the second measure, namely the cochlear hidden Markovian (CHM) measure, we attempt to capture the high-level processing in the brain with simple hidden Markov models (HMMs). We characterize the firing events by HMMs where the order of occurrence of PD observations and correlations among adjacent observations are modeled suitably. For computing the coder distortion, the PD observations of the coded speech are matched against the HMMs derived from the PD observations of the original speech. Experimental results show that these measures conform to subjective evaluation results in majority of the cases. Finally, the introduced measures are also app

6

Unno, Takahiro. "An improved mixed excitation linear predictive (MELP) coder." Thesis, Georgia Institute of Technology, 1998. http://hdl.handle.net/1853/13270.

7

Chai, Shan. "Performance Evaluation of Perceptually Lossless Medical Image Coder." RMIT University. Electrical and Computer Engineering, 2007. http://adt.lib.rmit.edu.au/adt/public/adt-VIT20080205.120648.

Abstract:

Medical imaging technologies offer the benefits of faster and more accurate diagnosis. When medical imaging is combined with digitization, it offers the advantages of permanent storage and fast transmission to any geographical location. However, there is a need for efficient compression algorithms that alleviate the taxing burden of both large storage space and transmission bandwidth requirements. The Perceptually Lossless Medical Image Coder (PLMIC) is a new image compression technique. It provides a solution to the challenge of delivering clinically critical information in the shortest time possible. It embeds visual pruning into the JPEG 2000 coding framework to achieve optimal compression without losing the visual integrity of medical images. However, the performance of the PLMIC under certain medical image operations is still unknown. In this thesis, we investigate the performance of the PLMIC by applying linear, quadratic and cubic standard and centered B-spline interpolation filters. In order to evaluate the visual performance, a subjective assessment consisting of 30 medical images and 6 image processing experts was conducted. The PLMIC was compared to the state-of-the-art JPEG-LS compliant LOCO and NLOCO image coders. The results show that, overall, there were no perceivable differences of statistical significance when the medical images were enlarged by a factor of 2. The findings of the thesis may help researchers to further improve the coder. Additionally, they may inform radiologists of the performance of the PLMIC to help them with correct diagnosis.

8

Choy, Eddie L. T. "Waveform interpolation speech coder at 4 kb/s." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape11/PQDD_0028/MQ50596.pdf.

9

Neumeyer, Leonardo G. (Leonardo Gabriel) Carleton University Dissertation Engineering Electrical. "A low-delay backward-adaptive CELP speech coder." Ottawa, 1990.

10

Hardwick, John C. (John Clark). "A 4.8 Kbps multi-band excitation speech coder." Thesis, Massachusetts Institute of Technology, 1988. http://hdl.handle.net/1721.1/14751.

11

Brandstein, Michael Shapiro. "A 1.5 Kbps multi-band excitation speech coder." Thesis, Massachusetts Institute of Technology, 1990. http://hdl.handle.net/1721.1/14283.

Abstract:

Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1990.
Includes bibliographical references (leaves 58-60).
by Michael Shapiro Brandstein.
M.S.

12

Gaylord, W. J. (William J. ). "A hierarchical video coder with application to multipoint teleconferencing." Thesis, Georgia Institute of Technology, 1993. http://hdl.handle.net/1853/14725.

13

Pereira, Wesley. "Modifying LPC parameter dynamics to improve speech coder efficiency." Thesis, McGill University, 2001. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=32970.

Abstract:

Reducing the transmission bandwidth and achieving higher speech quality are primary concerns in developing new speech coding algorithms. The goal of this thesis is to improve the perceptual speech quality of algorithms that employ linear predictive coding (LPC). Most LPC-based speech coders extract parameters representing an all-pole filter. This LPC analysis is performed on each block or frame of speech. To smooth out the evolution of the LPC tracks, each block is divided into subframes for which the LPC parameters are interpolated. This improves the perceptual quality without additional transmission bit rate. A method of modifying the interpolation endpoints to improve the spectral match over all the subframes is introduced. The spectral distortion and weighted Euclidean LSF (Line Spectral Frequencies) distance are used as objective measures of the performance of this warping method. The algorithm has been integrated in a floating point C-version of the Adaptive Multi Rate (AMR) speech coder and these results are presented.
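
The subframe interpolation described in this abstract can be illustrated with a minimal sketch. It shows only plain linear interpolation between the LSF vectors of two consecutive frames; the endpoint modification the thesis introduces is not reproduced here. The function name, the number of subframes, and the evenly spaced weights are assumptions made for illustration, not values taken from the thesis or from the AMR standard.

```python
def interpolate_lsf(prev_lsf, curr_lsf, num_subframes=4):
    """Return one interpolated LSF vector per subframe.

    prev_lsf and curr_lsf are the (hypothetical) LSF vectors of the
    previous and current frames. The weights k / num_subframes are an
    illustrative choice, not the weights of any particular standard.
    """
    if len(prev_lsf) != len(curr_lsf):
        raise ValueError("LSF vectors must have the same order")
    tracks = []
    for k in range(1, num_subframes + 1):
        w = k / num_subframes  # weight moves from the old frame to the new one
        tracks.append([(1 - w) * p + w * c for p, c in zip(prev_lsf, curr_lsf)])
    return tracks


if __name__ == "__main__":
    for vector in interpolate_lsf([0.0, 1.0], [1.0, 2.0]):
        print(vector)
```

Because the last subframe uses weight 1, it reproduces the current frame's LSF vector exactly, so no extra bits are needed for the intermediate subframes.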

14

Konaté, Cheick Mohamed. "Enhancing speech coder quality: improved noise estimation for postfilters." Thesis, McGill University, 2011. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=104578.

Abstract:

ITU-T G.711.1 is a multirate wideband extension of the well-known ITU-T G.711 pulse code modulation of voice frequencies. The extended system is fully interoperable with the legacy narrowband one. In the case where the legacy G.711 is used to code a speech signal and G.711.1 is used to decode it, quantization noise may be audible. For this situation, the standard proposes an optional postfilter. The application of postfiltering requires an estimate of the quantization noise. The more accurate the estimate of the quantization noise is, the better the performance of the postfilter can be. In this thesis, we propose an improved noise estimator for the postfilter proposed for the G.711.1 codec and assess its performance. The proposed estimator provides a more accurate estimate of the noise with the same computational complexity.

15

Nagaswamy, Sriram. "Comparison of CELP speech coder with a wavelet method." Lexington, Ky. : [University of Kentucky Libraries], 2005. http://lib.uky.edu/ETD/ukyelen2006t00376/Thesis.pdf.

Abstract:

Thesis (M.S.)--University of Kentucky, 2005.
Title from document title page (viewed on January 30, 2006). Document formatted into pages; contains: ix, 124 p. : ill. Includes abstract and vita. Includes bibliographical references (p. 118-123).

16

Nagaswamy, Sriram. "Comparison of CELP speech coder with a wavelet method." UKnowledge, 2006. http://uknowledge.uky.edu/gradschool_theses/269.

Abstract:

This thesis compares the speech quality of Code Excited Linear Predictor (CELP, Federal Standard 1016) speech coder with a new wavelet method to compress speech. The performances of both are compared by performing subjective listening tests. The test signals used are clean signals (i.e. with no background noise), speech signals with room noise and speech signals with artificial noise added. Results indicate that for clean signals and signals with predominantly voiced components the CELP standard performs better than the wavelet method but for signals with room noise the wavelet method performs much better than the CELP. For signals with artificial noise added, the results are mixed depending on the level of artificial noise added with CELP performing better for low level noise added signals and the wavelet method performing better for higher noise levels.

17

Wong, Wing-Tak Kenneth. "A speech coder design for land mobile radio communications." Thesis, University of Liverpool, 1989. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.237531.

18

Lawai, Adnan Husain. "Scalable coding of HDTV pictures using the MPEG coder." Thesis, Massachusetts Institute of Technology, 1994. http://hdl.handle.net/1721.1/38008.

Abstract:

Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1994.
Includes bibliographical references (leaves 118-121).
by Adnan Husain Lawai.
M.S.

19

Lalgudi, Hariharan G., Michael W. Marcellin, Ali Bilgin, and Mariappan S. Nadar. "SCALABLE LOW COMPLEXITY CODER FOR HIGH RESOLUTION AIRBORNE VIDEO." International Foundation for Telemetering, 2007. http://hdl.handle.net/10150/605492.

Abstract:

ITC/USA 2007 Conference Proceedings / The Forty-Third Annual International Telemetering Conference and Technical Exhibition / October 22-25, 2007 / Riviera Hotel & Convention Center, Las Vegas, Nevada
Real-time transmission of airborne images to a ground station is highly desirable in many telemetering applications. Such transmission is often through an error-prone, time-varying wireless channel, possibly under jamming conditions. Hence, a fast, efficient, scalable, and error-resilient image compression scheme is vital to realize the full potential of airborne reconnaissance. JPEG2000, the current international standard for image compression, offers most of these features. However, the computational complexity of JPEG2000 limits its use in some applications. Thus, we present a scalable low complexity coder (SLCC) that possesses many desirable features of JPEG2000 while offering high throughput.

20

Khan, Mohammad M. A. "Coding of excitation signals in a waveform interpolation speech coder." Thesis, McGill University, 2001. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=32961.

Abstract:

The goal of this thesis is to improve the quality of the Waveform Interpolation (WI) coded speech at 4.25 kbps. The quality improvement is focused on the efficient coding scheme of voiced speech segments, while keeping the basic coding format intact. In the WI paradigm voiced speech is modelled as a concatenation of the Slowly Evolving pitch-cycle Waveforms (SEW). Vector quantization is the optimal approach to encode the SEW magnitude at low bit rates, but its complexity imposes a formidable barrier.
Product code vector quantizers (PC-VQ) are a family of structured VQs that circumvent the complexity obstacle. The performance of product code VQs can be traded off against their storage and encoding complexity. This thesis introduces split/shape-gain VQ, a hybrid product code VQ, as an approach to quantizing the SEW magnitude. The amplitude spectrum of the SEW is split into three non-overlapping subbands. The gains of the three subbands form the gain vector, which is quantized using the conventional Generalized Lloyd Algorithm (GLA). Each shape vector, obtained by normalizing each subband by its corresponding coded gain, is quantized using a dimension conversion VQ along with a perceptually based bit allocation strategy and a perceptually weighted distortion measure. At the receiver, the discontinuity of the gain contour at the subband boundaries introduces buzziness in the reconstructed speech. This problem is tackled by smoothing the gain versus frequency contour using a piecewise monotonic cubic interpolant. Simulation results indicate that the new method improves speech quality significantly.
The necessity of SEW phase information in the WI coder is also investigated in this thesis. Informal subjective test results demonstrate that transmission of SEW magnitude encoded by split/shape-gain VQ and inclusion of a fixed phase spectrum drawn from a voiced segment of a high-pitched male speaker obviates the need to send phase information.
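
The split into gain and shape vectors described in this abstract can be sketched as follows. This is a rough illustration under stated assumptions: the band edges, the use of an RMS gain, and the function name are hypothetical, and the sketch normalizes by the unquantized gain, whereas the thesis normalizes each subband by its coded gain. No codebook search is shown.

```python
import math


def split_shape_gain(spectrum, edges):
    """Split a magnitude spectrum into subbands and separate gain from shape.

    edges is a list of (start, stop) index pairs, one per subband
    (hypothetical band edges chosen by the caller).
    Returns (gains, shapes): one RMS gain and one normalized shape per band.
    """
    gains, shapes = [], []
    for start, stop in edges:
        band = spectrum[start:stop]
        gain = math.sqrt(sum(v * v for v in band) / len(band))  # RMS gain
        gains.append(gain)
        # Guard against an all-zero band, which has no defined shape.
        shapes.append([v / gain if gain > 0 else 0.0 for v in band])
    return gains, shapes


if __name__ == "__main__":
    gains, shapes = split_shape_gain(
        [3.0, 4.0, 0.0, 0.0, 1.0, 1.0], [(0, 2), (2, 4), (4, 6)]
    )
    print(gains)
    print(shapes)
```

Separating the two parts lets a quantizer spend bits on the three gains and on each shape independently, which is what makes the product-code structure cheaper than one large vector quantizer.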

21

Iyengar, Vasu. "A low delay 16 kbit/sec coder for speech signals /." Thesis, McGill University, 1987. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=63799.

22

Khan, Abdul Hannan. "Tree encoding in the ITU-T G.711.1 speech coder." Thesis, McGill University, 2011. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=97215.

Abstract:

This thesis examines further enhancements to the ITU-T G.711.1 speech coder. The original G.711 coder is effectively a low-band μ-law quantizer. The G.711.1 extension adds noise feedback and a lower-band enhancement layer in addition to the higher band. To further improve the core lower-band coding performance, the use of both vector quantization and a delayed-decision multi-path tree encoder in the low-band portion of the coder is studied. The delayed-decision multi-path tree encoding is implemented by the (M,L) algorithm. The new quantizer takes into account past history, and hence the error propagation due to noise feedback, and codes multiple samples under μ-law. The final bitstream is compatible with the G.711.1 decoder and, hence, with the original G.711 decoder. An evaluation method, ITU-T P.862 perceptual evaluation of speech quality (PESQ), is used to evaluate the performance. Both the vector quantizer and the tree encoder perform better than the original core layer encoder in terms of perceptual quality, though they are limited by the increased computational complexity. Future studies are suggested.
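
As background for the μ-law quantizer mentioned in this abstract, the following sketch applies the continuous μ-law companding curve with μ = 255 and a uniform 256-level quantizer in the companded domain. It is an illustration only: G.711 itself specifies a segmented, piecewise-linear approximation with its own bit layout, which is not reproduced here, and the function names are hypothetical.

```python
import math

MU = 255.0  # companding constant of the continuous mu-law curve


def mu_compress(x):
    """Map x in [-1, 1] to the companded domain [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_expand(y):
    """Inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def quantize(x, levels=256):
    """Quantize x uniformly in the companded domain and expand it again.

    Assumption: a plain mid-point uniform quantizer, not the G.711 codeword map.
    """
    step = 2.0 / levels
    index = min(levels - 1, int((mu_compress(x) + 1.0) / step))
    return mu_expand(-1.0 + (index + 0.5) * step)


if __name__ == "__main__":
    for sample in (0.001, 0.01, 0.1, 0.5):
        print(sample, quantize(sample))
```

The logarithmic curve gives small amplitudes finer resolution than large ones, so the relative quantization error stays roughly constant over a wide dynamic range.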

23

Belkoura, Zouhair M. "Analysis and application of turbo coder based distributed video coding." [S.l.] : [s.n.], 2007. http://opus.kobv.de/tuberlin/volltexte/2007/1618.

24

Keisarian, Farhad. "A pyramid image coder using Block Template Matching (BTM) algorithm." Thesis, University of Nottingham, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.263412.

25

Sahle, Eskinder. "Development of a user interface for MARIAN and CODER systems." Master's thesis, This resource online, 1993. http://scholar.lib.vt.edu/theses/available/etd-04272010-020142/.

26

Rosenberg, Jonathan David. "Recasting a scene adaptive video coder for real time implementation." Thesis, Massachusetts Institute of Technology, 1995. http://hdl.handle.net/1721.1/36544.

Abstract:

Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1995.
Includes bibliographical references (p. 117-118).
by Jonathan David Rosenberg.
M.S.

27

Allen, Matthew S. "Performance Assessment of Model-Driven FPGA-based Software-Defined Radio Development." Digital WPI, 2014. https://digitalcommons.wpi.edu/etd-theses/943.

Abstract:

This thesis presents technologies that integrate field programmable gate arrays (FPGAs), model-driven design tools, and software-defined radios (SDRs). Specifically, an assessment of current state-of-the-art practices applying model-driven development techniques targeting SDR systems is conducted. FPGAs have become increasingly versatile computing devices due to their size and resource enhancements, advanced core generation, partial reconfigurability, and system-on-a-chip (SoC) implementations. Although FPGAs possess relatively better performance per watt when compared to central processing units (CPUs) or graphics processing units (GPUs), FPGAs have been avoided due to long development cycles and higher implementation costs due to significant learning curves and low levels of abstraction associated with the hardware description languages (HDLs). This thesis conducts a performance assessment of SDR designs using both a model-driven design approach developed with Mathworks HDL Coder and a hand-optimized design approach created from the model-driven VHDL. Each design was implemented on the FPGA fabric of a Zynq-7000 SoC, using a Zedboard evaluation platform for hardware verification. Furthermore, a set of guidelines and best practices for applying model-driven design techniques toward the development of SDR systems using HDL Coder is presented.

28

Skretkowicz, Steven J. "Implementing a low-complexity, adaptive, layered video coder for video teleconferencing." Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 1999. http://handle.dtic.mil/100.2/ADA371067.

Abstract:

Thesis (M.S. in Electrical Engineering) Naval Postgraduate School, September 1999.
"September 1999". Thesis advisor(s): Murali Tummala. Includes bibliographical references (p. 103-105). Also Available online.

29

LeBlanc, Wilfrid P. (Wilfrid Paul) Carleton University Dissertation Engineering Electrical. "An advanced speech coder based on a rate-distortion theory framework." Ottawa, 1988.

30

Min, Rex K. (Rex Kee) 1976. "Demonstration system for an ultra-low-power video coder and decoder." Thesis, Massachusetts Institute of Technology, 1999. http://hdl.handle.net/1721.1/80553.

31

Weaver, Marybeth Therese. "Implementing an intelligent information retrieval system: the CODER system, version 1.0." Thesis, Virginia Tech, 1988. http://hdl.handle.net/10919/44097.

Abstract:

For individuals requiring interactive access to online text, information storage and retrieval systems provide a way to retrieve desired documents and/or text passages. The CODER (COmposite Document Expert/effective/extended Retrieval) system is a testbed for determining how useful various artificial intelligence techniques are for increasing the effectiveness of information storage and retrieval systems. The system, designed previously, has three components: an analysis subsystem for analyzing and storing document contents, a central spine for manipulation and storage of world and domain knowledge, and a retrieval subsystem for matching user queries to relevant documents. This thesis discusses the implementation of the retrieval subsystem and portions of the spine and analysis subsystem. It illustrates that logic programming, specifically with the Prolog language, is suitable for development of an intelligent information retrieval system. Furthermore, it shows that system modularity provides a flexible research testbed, allowing many individuals to work on different parts of the system which may later be quickly integrated. The retrieval subsystem has been implemented in a modular fashion so that new approaches to information retrieval can be easily compared to more traditional ones. A powerful knowledge representation language, a comprehensive lexicon and individually tailored experts using standardized blackboard modules for communication and control allowed rapid prototyping, incremental development and ready adaptability to change. The system executes on a DEC VAX 11/785 running ULTRIX™, a variant of 4.2 BSD UNIX. It has been implemented as a set of MU-Prolog and C modules communicating through TCP/IP sockets.
Master of Science

32

Namburu, Visala. "Speech Coder using Line Spectral Frequencies of Cascaded Second Order Predictors." Thesis, Virginia Tech, 2001. http://hdl.handle.net/10919/35670.

Abstract:

A major objective in speech coding is to represent speech with as few bits as possible. Usual transmission parameters include auto regressive parameters, pitch parameters, excitation signals and excitation gains. The pitch predictor makes these coders sensitive to channel errors. Aiming for robustness to channel errors, we do not use pitch prediction and compensate for its lack with a better representation of the excitation signal. We propose a new speech coding approach, Vector Sum Excited Cascaded Linear Prediction (VSECLP), based on code excited linear prediction. We implement forward linear prediction using five cascaded second order sections - parameterized in terms of line spectral frequency - in place of the conventional tenth order filter. The line spectral frequency parameters estimated by the Direct Line Spectral Frequency (DLSF) adaptation algorithm are closer to the true values than those estimated by the Cascaded Recursive Least Squares - Subsection algorithm. A simplified version of DLSF is proposed to further reduce computational complexity. Split vector quantization is used to quantize the line spectral frequency parameters and vector sum codebooks to quantize the excitation signals. The effect on reconstructed speech quality and transmission rate, of an increased number of bits and differently split combinations, is analyzed by testing VSECLP on the TIMIT database. The quantization of the excitation vectors using the discrete cosine transform resulted in segmental signal to noise ratio of 4 dB at 20.95 kbps, whereas the same quality was obtained at 9.6 kbps using vector sum codebooks.
Master of Science
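
The segmental signal-to-noise ratio quoted in this abstract is commonly computed as the average of per-frame SNR values in decibels. The sketch below follows that common definition; the frame length, the handling of silent or error-free frames, and the function name are assumptions, since the abstract does not state the exact variant used.

```python
import math


def segmental_snr(reference, decoded, frame_len=160):
    """Average per-frame SNR in dB between a reference and a decoded signal.

    Frames with zero signal energy or zero error are skipped (an
    illustrative choice; other variants clamp the per-frame values).
    """
    snrs = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal = sum(r * r for r in ref)
        noise = sum((r - d) ** 2 for r, d in zip(ref, dec))
        if signal == 0 or noise == 0:
            continue
        snrs.append(10.0 * math.log10(signal / noise))
    if not snrs:
        raise ValueError("no frame with measurable signal and noise")
    return sum(snrs) / len(snrs)


if __name__ == "__main__":
    print(segmental_snr([1.0] * 4, [0.9] * 4, frame_len=2))
```

Averaging in the dB domain weights quiet and loud frames equally, which is why segmental SNR tends to track perceived quality better than a single SNR over the whole utterance.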

33

Fan, Yun-Hui. "A stereo audio coder with a nearly constant signal-to-noise ratio." Diss., Georgia Institute of Technology, 2001. http://hdl.handle.net/1853/14788.

34

Carlén, Stefan. "Estimation of visual focus for control of a FOA-based image coder." Thesis, Linköping University, Department of Electrical Engineering, 2003. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-2032.

Abstract:

A major feature of the human eye is the compressed sensitivity of the retina. An image coder that makes use of this can heavily compress the parts of the image that are not close to the focus of our eyes. Existing image coding schemes require that the gaze direction of the viewer be measured. However, it would be a great advantage if an estimator could predict the focus of attention (FOA) regions in the image.

This report presents such an implementation, which is based on a model that mimics many of the biological features of the human visual system (HVS). For example, it uses a center-surround mechanism, which is a replica of the receptive fields of the neurons in the HVS.

An extra feature of the implementation is the extension to handle video sequences, and the expansion of the FOAs. Tests of the system show good results for a large variety of images.

35

Chou, Chih-Ping, and 周治平. "An Efficient Speech Coder." Thesis, 1994. http://ndltd.ncl.edu.tw/handle/22928460631222292871.

Abstract:

Master's thesis
National Chung Hsing University
Institute of Applied Mathematics
82
In this thesis, we design a speech coder with low distortion. The coder removes redundancy (silence and quasi-periodicity) from the speech signal, and it combines a lossy coder with a lossless coder to achieve a high compression ratio. In practice, the coder combines the representative period wave algorithm, ADPCM, and the LZW algorithm. According to the experimental results, this speech coder achieves a high compression ratio (about 85%), real-time decoding, and toll quality.


36

Jhuang, Jyun-Jie, and 莊俊傑. "An Improved Central Decoder for Multiple Description Coder Integrated with Distributed Video Coder." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/49009735230666378491.


Abstract:

Master's
National Taiwan University of Science and Technology
Department of Electrical Engineering
97
Multiple Description Coding (MDC) is an effective solution for multimedia transmission over the congested Internet and unstable wireless networks. MDC promises stable and reliable multimedia communications over multiple transmission paths. Distributed Video Coding (DVC) shifts coding complexity from the encoder to the decoder in order to provide video communications for mobile devices. The Wyner-Ziv codec was proposed for DVC-based image/video coding. These two video coding approaches, MDC and DVC, can be effectively integrated to provide robust and highly efficient video coding. We propose to integrate MDC with DVC, abbreviated as MDVC, to provide stable and efficient video coding. The MDVC yields two separate video descriptions, each comprising one key-frame bitstream and one WZ-frame bitstream. To improve the rate-distortion performance of the MDVC, we adopt DPCM at the encoder side to reduce the signal entropy for encoding. For DPCM predictive coding, an adaptive prediction method is designed to select the best prediction frame based on inter-frame correlations, which yields better PSNR performance than previous work. Simulations demonstrate that the proposed MDVC codec outperforms previous work in image PSNR when there are no transmission errors. It also shows robustness to transmission errors and smaller PSNR variations. For the MDVC central decoder, the error correlations within and between descriptions are used to dynamically select the best reconstructed frames from the two descriptions, instead of selecting only one description or only the key frames from the two descriptions, to yield the final reconstructed video. Simulations demonstrate average PSNRs that are 1 to 3 dB higher and PSNR variations that are 50%-70% smaller than those of the current central decoder with a fixed control policy.
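
The adaptive DPCM prediction mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's method: the abstract does not specify the correlation criterion, so the sum of absolute differences (`sad`) is used here as an assumed stand-in, frames are flattened to lists of integers, and the residual is left unquantized. All function names are hypothetical.

```python
from typing import List, Tuple

Frame = List[int]  # a frame flattened to a list of pixel values (assumption)


def sad(a: Frame, b: Frame) -> int:
    """Sum of absolute differences; a low value suggests high inter-frame correlation."""
    return sum(abs(x - y) for x, y in zip(a, b))


def dpcm_encode(frame: Frame, candidates: List[Frame]) -> Tuple[int, Frame]:
    """Pick the candidate reference closest to `frame` and return (index, residual)."""
    best = min(range(len(candidates)), key=lambda i: sad(frame, candidates[i]))
    residual = [x - p for x, p in zip(frame, candidates[best])]
    return best, residual


def dpcm_decode(index: int, residual: Frame, candidates: List[Frame]) -> Frame:
    """Rebuild the frame by adding the residual back to the chosen reference."""
    return [r + p for r, p in zip(residual, candidates[index])]


references = [[10, 10, 10, 10], [50, 52, 54, 56]]
current = [51, 52, 55, 56]
index, residual = dpcm_encode(current, references)
restored = dpcm_decode(index, residual, references)
```

Choosing the closer reference keeps the residual small, which is what lowers the entropy of the signal handed to the subsequent coder.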


37

Miao, Ting Wu, and 苗廷武. "Dynamic micropipelined RSA coder/decoder." Thesis, 1995. http://ndltd.ncl.edu.tw/handle/98711515824482884312.


38

Liu, Shin-Hua, and 劉欣華. "Improvement of G.723.1 Speech Coder." Thesis, 1999. http://ndltd.ncl.edu.tw/handle/03566954154494548963.


39

Huang, Hsing-Yang. "Design and Implementation of AAC audio coder." 2004. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0001-1906200412395300.


40

CHYAN, CHUN-AN, and 簡崇安. "Design and Implementation of JPEG2000 EBCOT coder." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/44684953777007916715.


Abstract:

Master's
National Taiwan University
Graduate Institute of Electrical Engineering
90
The JPEG2000 system is the newest standard for still image compression. In this thesis, we discuss the basic architecture of the JPEG2000 system, which can be viewed as an evolution of the image compression techniques of recent years. However, its key component, called "EBCOT," involves many bit-level computations and multiple scans, which makes JPEG2000 too slow for some applications when it is executed on a general-purpose CPU. We design and implement an ASIC to accelerate EBCOT; the cycles needed are reduced to about 45% of those of the original algorithm, and the clock rate reaches 133 MHz in our simulation.


41

CHEN, JUN-DE, and 陳俊德. "Motion-compensated coder for monochrome image sequences." Thesis, 1988. http://ndltd.ncl.edu.tw/handle/58263237512020963857.


42

Tai, Chiao-Yen, and 戴譙彥. "Trade-offs in Video Coder VLSI Design." Thesis, 1995. http://ndltd.ncl.edu.tw/handle/30888868208864499827.


Abstract:

Master's
National Chiao Tung University
Institute of Electronics
83
This thesis focuses on the system-level design of an image compression coder chip for the CCITT H.261 standard. An H.261 coder consists of several operation units, such as motion estimation, discrete cosine transform, quantization, and variable-length coding. This thesis places more emphasis on how to make efficient and economical connections between the various units. The issues under consideration are system operating frequency, system synchronization, general logic vs. memory circuits, the number of external buses, and fabrication process technologies.


43

Shou, Shu-Chen, and 邵淑真. "Realization of G.723.1 Speech Coder on." Thesis, 2001. http://ndltd.ncl.edu.tw/handle/62672914634810138702.


Abstract:

Master's
National Chiao Tung University
Department of Electrical and Control Engineering
89
In this thesis, we investigate a speech coding standard, G.723.1, and implement it on an ARM processor. With the invention and progress of the Internet, multimedia has become essential in wireless communication and 3C products. Audio and video are two aspects of multimedia applications, and speech coding is one kind of the former. G.723.1 is a speech coding standard produced by the ITU-T (International Telecommunication Union - Telecommunication Standardization Sector) in 1996 for compressing the speech or other audio signal component of multimedia services at a very low bit rate. It is designed on the basis of the CELP concept with dual rates, 5.3 and 6.3 kbit/s, and can be widely used in the video phone, the Internet phone, and particularly as part of the MPEG-4 standard. First, the coder and decoder principles of G.723.1 are described and explained in turn. Then the implementation is realized step by step. To achieve real-time operation, some modifications are made to both the C code and the assembly code to take full advantage of the ARM processor. Finally, the aim of a real-time implementation is attained with an ARM9 processor.


44

Sun, Ming-Hung, and 孫銘鴻. "Joint Source-Channel Coding for MELP Coder." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/h7dv25.


Abstract:

Master's
National Taipei University of Technology
Department of Electrical Engineering
94
Source coding and channel coding are fundamentally different techniques. The former compresses the original signals into smaller ones while still maintaining the perceptual quality. The latter, on the other hand, adds redundancy for detecting and correcting transmission errors. In this thesis, the mixed excitation linear prediction (MELP) vocoder and the soft-output Viterbi algorithm (SOVA) are used to achieve joint source-channel decoding (JSCD). The output statistics of the MELP vocoder, conditioned on the voiced or unvoiced state, are regarded as the prior probabilities of the SOVA decoder. The proposed channel decoding scheme is called the JSCD SOVA. Both unequal error protection (UEP) and equal error protection (EEP) are also included for comparison. The experimental results show that UEP performs better than EEP under low signal-to-noise ratio (SNR) conditions. In addition, when EEP is used to protect the transmitted bits, the JSCD SOVA decoder performs better than the conventional SOVA decoder with equal prior probabilities in poor SNR environments.


45

Nieh, Yung-Hsiang, and 聶永祥. "AMR Speech Coder and AMR Reference Model." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/6dadzx.


Abstract:

Master's
National Taipei University of Technology
Industrial R&D Master's Program in Electronics, Computers, and Communications
95
This paper concerns the principles of Adaptive Multi-Rate (AMR) speech coding and its relevant theory. Because the AMR speech coding specifications and simulation programs are complicated and not easy to understand, R&D (research and development) engineers need a lot of time to understand their structure. To support the development of AMR, we use the Unified Modeling Language (UML) for analysis and design to obtain a reference model of AMR. This model can be used by different people, and it is also a prerequisite for setting standards for an AMR reference model. It meets the needs of AMR technology companies, which require techniques in real time in order to save development time for communication and research department engineers. This matters because, in the highly competitive telecommunication industry, development time must be very short, as the commercialization of R&D products has become a key to maximizing profits. Finally, the paper proposes an AMR reference model and develops a value-added service via a speech system without changing the content of the program, while maintaining the communication specifications of the AMR receiver. The purpose is to allow mobile users to benefit from this reference model and to examine its possibilities and practicality.


46

Lo, Chi-Wen, and 羅啟文. "The Quality Control for Wavelet Video Coder." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/f99f8g.


Abstract:

Master's
National Taiwan University of Science and Technology
Department of Electrical Engineering
94
Three-dimensional (3D) wavelet transforms have been used in wavelet video codecs (WVC) to generate quality-scalable bitstreams for communications over heterogeneous networks. However, the reconstructed pictures of a 3D-WVC suffer from temporal quality fluctuations due to unequal distortion distributions in the coding/decoding process. Theoretical WVC coding models have been proposed to smooth temporal picture quality fluctuations. In practical coding processes, however, signal properties do not comply with the assumptions of the ideal WVC model. We propose to bridge this gap between ideal and practical signal properties to improve the temporal quality smoothness (TS) of WVC. The relation between the distortions of subbands and the qualities of the reconstructed pictures of a WVC is exploited, and the TS algorithm is developed from it. The most important features of the proposed TS-WVC method are: 1) low computational complexity; 2) greatly reduced temporal picture quality variations; and 3) feasibility for real-time WVC applications.


47

Kuo, Kuo-Bao, and 郭國寶. "Complexity Reduction for IETF-iLBC Speech Coder." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/83293637830643107097.


Abstract:

Master's
Southern Taiwan University of Science and Technology
Department of Computer Science and Information Engineering
97
With the development of the Internet, the need to transmit information through the Internet has also increased. Finding a low-rate but high-quality speech coding technology is the major issue for speech communications over the Internet. The iLBC speech codec has been recommended by the Internet Engineering Task Force (IETF), an international organization. In the iLBC encoder, the dynamic codebooks, including the base codebook, the augmented base codebook, the expanded codebook, and the augmented expanded codebook, bear the heaviest load of the coding procedure. In particular, the base codebook and the expanded codebook account for the largest share of the computational complexity in the codebook search. This thesis aims to reduce the computational complexity with perceptually negligible degradation of the speech quality. A pattern-pre-inspecting search method is therefore proposed, and it does reduce the computation of the codebook search. In addition, this thesis proposes a complexity-scalable design of the adaptive codebook search for hand-held or mobile devices, so that the coding process can be adjusted dynamically according to the computational load. According to the MOS-LQO score, an objective measure from the ITU standards, the proposed approaches not only substantially reduce the computational complexity but also maintain good quality of the coded speech. Furthermore, because the writing format of the IETF differs from that of other international standards, the specification is difficult for engineers to read and implement. The thesis therefore also illustrates the iLBC coding process in detail and rewrites the process in the style of the ITU and MPEG standards.


48

Cai, Yu-Lin, and 蔡育霖. "Empirical Mode Decomposition in Voice Coder Application." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/64973999073939345026.


Abstract:

Master's
National Taiwan Ocean University
Department of Electrical Engineering
97
Waveform speech coding has the advantages of high quality and low complexity. In this thesis, the empirical mode decomposition (EMD) proposed by E. Huang is first used to decompose nonstationary speech into several intrinsic mode functions (IMFs). The fast Fourier transform (FFT) is then employed to analyze the frequency spectrum of each IMF, from which the cutoff frequency of the subsequent high-pass filter is decided; the filter eliminates low-frequency content of the original speech signal for data reduction. Finally, the filtered speech signal is fed into a voice coder for compression and coding, so that the compression rate can be reduced while preserving the sound quality. In the experiments, speech signals from the MAT-400 voice database are used as test data, and EMD is combined with the G.729 and DPCM voice encoders, respectively, for performance comparison.


49

Huang, Hsing-Yang, and 黃興洋. "Design and Implementation of AAC audio coder." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/58945082713462878185.


Abstract:

Master's
National Taiwan University
Graduate Institute of Computer Science and Information Engineering
92
AAC provides the highest compression rate and quality among all audio coding standards. However, the complexity of the AAC encoder is also very high. The high complexity mainly comes from the large number of operations performed in the filter bank, the psychoacoustic model, and the bit allocation module. In this thesis, we study fast algorithms and try to reduce the complexity of the filter bank and the bit allocation module. The fast MDCT algorithm takes advantage of the relationship between the MDCT and the O2DFT (Odd Time Odd Frequency DFT) and reduces the MDCT computation to an N/4-point FFT. For the bit allocation module, we apply a fast initial gain search method and a noise estimation method to develop a new bit allocation module, which reduces the two-level nested loop structure to a single loop. Its complexity is much lower than that of the bit allocation module proposed in the AAC standard. These algorithms efficiently reduce the complexity of the filter bank and the bit allocation module. Compared with the well-known coder FAAC, our coder saves 50% of the encoding time and still maintains good quality.
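
For orientation, the MDCT that this abstract accelerates can be written directly from its definition. The sketch below is the plain O(N²) form with a rectangular window, not the thesis's FFT-based fast algorithm; it could serve as a correctness baseline for a fast implementation. Here a block of 2N samples yields N coefficients, so the abstract's N (presumably the window length) corresponds to 2N in this code. Function names are hypothetical.

```python
import math
from typing import List


def mdct(block: List[float]) -> List[float]:
    """Direct MDCT: 2N input samples -> N coefficients."""
    n_out = len(block) // 2
    return [
        sum(x * math.cos(math.pi / n_out * (n + 0.5 + n_out / 2) * (k + 0.5))
            for n, x in enumerate(block))
        for k in range(n_out)
    ]


def imdct(coeffs: List[float]) -> List[float]:
    """Direct inverse MDCT: N coefficients -> 2N time-aliased samples."""
    n_out = len(coeffs)
    return [
        sum(c * math.cos(math.pi / n_out * (n + 0.5 + n_out / 2) * (k + 0.5))
            for k, c in enumerate(coeffs)) / n_out
        for n in range(2 * n_out)
    ]


def overlap_add(signal: List[float], hop: int) -> List[float]:
    """Transform 50%-overlapped blocks and sum the inverses (aliasing cancels)."""
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * hop + 1, hop):
        y = imdct(mdct(signal[start:start + 2 * hop]))
        for i, v in enumerate(y):
            out[start + i] += v
    return out


signal = [1.0, -2.0, 3.5, 0.25, 4.0, -1.0, 2.0, 7.0]
rebuilt = overlap_add(signal, hop=2)
```

Only samples covered by two overlapping blocks (indices 2 to 5 in this example) are reconstructed exactly; the first and last half-blocks still contain aliasing because they have no neighbor to cancel it.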


50

Lin, Tzung-Liang, and 林宗良. "A VARIABLE BIT RATE CELP SPEECH CODER." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/88623392182276404082.


Abstract:

Master's
Tatung University
Graduate Institute of Electrical Engineering
90
In the era of mobile and network communication, speech is still the most natural and convenient way for humans to exchange information. Attempts are continuously made to pursue speech coding techniques with lower bit rates and better synthetic speech quality. The use of variable bit rate (VBR) coders is undeniably an attractive approach for maintaining speech quality at a lower average bit rate. The aim of this thesis is thus to design a VBR speech coder based on Code Excited Linear Prediction (CELP) at 4.8 kbps, which was standardized as Federal Standard FS-1016 in 1982. The basic idea of our system design comes from the following observation: the frame size of most speech coders is around 20-30 ms, while the speech signal is slowly time-varying; for example, vowel sounds may last for 200-300 ms, during which the vocal tract remains nearly unchanged. This observation suggests that the speech parameters, such as the Linear Predictive Coding (LPC) parameters and the Line Spectral Frequencies (LSFs), may be highly similar between the current frame and some temporally close previous frames. This means that it is not necessary to transmit a set of new parameters for each frame. Instead, the speech parameters of a previous frame may be reused in the decoder to save bit rate. Based on this concept, we introduce an adaptive forward/backward quantization (AFBQ) [1] scheme to reduce the bit rate required for transmitting the LPC parameters. Specifically, the spectral distances between the current frame and some previous frames are calculated, and an experimentally determined threshold is used to decide whether the LPC parameters of the current frame should be transmitted or whether it is sufficient to transmit only a location index of a previous frame. The AFBQ scheme can reduce the bit rate at a minimal cost in speech quality.
To further reduce the bit rate, instead of using a 34-bit scalar quantizer, the proposed VBR coder uses a 10-bit vector quantizer (VQ) for the quantization of the LSF parameters. To improve speech quality, we adopt a perceptually weighted distance measure in the LSF vector quantizer and incorporate an interpolation scheme for the LSF parameters to smooth the spectral changes in the synthetic speech. The LSF interpolation scheme improves the speech quality without transmitting extra bits, but at the cost of a 15-ms longer coding delay. Our experimental results and informal listening tests show that, by using the AFBQ scheme and the LSF vector quantizer, the proposed VBR speech coder can maintain speech quality at a lower average bit rate. For example, the VBR coder at 3.9 kbps retains an average segmental signal-to-noise ratio (segSNR) only 0.6 dB lower than that of the 4.8 kbps CELP coder, and their synthetic speech quality can hardly be differentiated. The experimental results also show that including the LSF interpolation scheme improves the speech quality, raising the average segSNR by 0.8 dB.
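
The threshold decision described for the AFBQ scheme can be sketched as follows. This is an illustration under stated assumptions, not the thesis's implementation: the abstract does not define the spectral distance, so a plain Euclidean distance between LSF vectors is assumed, and the function names, the history layout, and the threshold value are hypothetical.

```python
import math
from typing import List, Optional, Tuple


def lsf_distance(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two LSF vectors (assumed distance measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def afbq_decide(current: List[float],
                history: List[List[float]],
                threshold: float) -> Tuple[str, Optional[int]]:
    """Return ("reuse", index) if a stored frame is close enough, else ("new", None)."""
    if history:
        nearest = min(range(len(history)),
                      key=lambda i: lsf_distance(current, history[i]))
        if lsf_distance(current, history[nearest]) <= threshold:
            return ("reuse", nearest)  # send only the index of the previous frame
    return ("new", None)               # send freshly quantized parameters


history = [[0.10, 0.20, 0.30], [0.40, 0.50, 0.60]]
steady = afbq_decide([0.41, 0.50, 0.61], history, threshold=0.05)
changed = afbq_decide([0.90, 0.90, 0.90], history, threshold=0.05)
```

Sending an index into a short history costs only a few bits, far fewer than a full parameter set, which is where the bit-rate saving during steady sounds such as vowels would come from.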
