Doctoral dissertations on the topic "Text compression"
Consult the 50 best academic doctoral dissertations on the topic "Text compression".
Wilson, Timothy David. "Animation of text compression algorithms". Thesis, University of Canterbury. Computer Science, 1992. http://hdl.handle.net/10092/9570.
Branavan, Satchuthananthavale Rasiah Kuhan. "High compression rate text summarization". Thesis, Massachusetts Institute of Technology, 2008. http://hdl.handle.net/1721.1/44368.
Includes bibliographical references (p. 95-97).
This thesis focuses on methods for condensing large documents into highly concise summaries, achieving compression rates on par with human writers. While the need for such summaries in the current age of information overload is increasing, the desired compression rate has thus far been beyond the reach of automatic summarization systems. The potency of our summarization methods is due to their in-depth modelling of document content in a probabilistic framework. We explore two types of document representation that capture orthogonal aspects of text content. The first represents the semantic properties mentioned in a document in a hierarchical Bayesian model. This method is used to summarize thousands of consumer reviews by identifying the product properties mentioned by multiple reviewers. The second representation captures discourse properties, modelling the connections between different segments of a document. This discriminatively trained model is employed to generate tables of contents for books and lecture transcripts. The summarization methods presented here have been incorporated into large-scale practical systems that help users effectively access information online.
by Satchuthananthavale Rasiah Kuhan Branavan.
S.M.
Langiu, Alessio. "Optimal Parsing for dictionary text compression". Thesis, Paris Est, 2012. http://www.theses.fr/2012PEST1091/document.
Dictionary-based compression algorithms include a parsing strategy that transforms the input text into a sequence of dictionary phrases. For a given text this process is usually not unique and, for compression purposes, it makes sense to find a parsing that minimizes the final compression ratio. This is the parsing problem. An optimal parsing is a parsing strategy, or a parsing algorithm, that solves the parsing problem while taking into account all the constraints of a compression algorithm or of a class of homogeneous compression algorithms. Such constraints include, for instance, the dictionary itself, i.e. the dynamic set of available phrases, and how much a phrase weighs on the compressed text, i.e. the length of the codeword that represents the phrase, also called the cost of a dictionary pointer encoding. In more than thirty years of history of dictionary-based text compression, while plenty of algorithms, variants and extensions have appeared and the approach has become one of the most widely used in storage and communication, only a few optimal parsing algorithms have been presented. Many compression algorithms still lack an optimal parsing or, at least, a proof of optimality. This is because there has been no general model of the parsing problem that covers all dictionary-based algorithms, and because the existing optimal parsings work under overly restrictive hypotheses. This work focuses on the parsing problem and presents both a general model for dictionary-based text compression, called the Dictionary-Symbolwise theory, and a general parsing algorithm that is proved to be optimal under some realistic hypotheses.
This algorithm is called Dictionary-Symbolwise Flexible Parsing and covers almost all dictionary-based text compression algorithms, together with the large class of their variants in which the text is decomposed into a sequence of symbols and dictionary phrases. In this work we further consider the case of a free mixture of a dictionary compressor and a symbolwise compressor; our Dictionary-Symbolwise Flexible Parsing covers this case as well. We thus obtain an optimal parsing algorithm for dictionary-symbolwise compression whenever the dictionary is prefix-closed and the cost of encoding a dictionary pointer is variable. The symbolwise compressor can be any classical one that works in linear time, as many common variable-length encoders do. Our algorithm works under the assumption that a special graph, described in what follows, is well defined; even when this condition is not satisfied, the same method can be used to obtain almost-optimal parses. In detail, when the dictionary is LZ78-like, we show how to implement our algorithm in linear time; when the dictionary is LZ77-like, it can be implemented in O(n log n) time. Both implementations have O(n) space complexity. Although the main aim of this work is theoretical, some experimental results are included to illustrate the practical effect of parsing optimality on compression performance, with more detailed experiments in a dedicated appendix.
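The parsing problem described in this abstract can be pictured as a shortest-path computation: text positions are graph nodes, each dictionary phrase contributes an edge weighted by its codeword cost, and an optimal parsing is a minimum-cost path from position 0 to position n. The sketch below is a generic illustration of that idea for a static dictionary and a user-supplied cost function, not the Dictionary-Symbolwise Flexible Parsing algorithm itself; all names are illustrative.

```python
# Hypothetical sketch: optimal parsing as a shortest path over text positions.
# Nodes are positions 0..n; an edge (i, j) exists when text[i:j] is a
# dictionary phrase, weighted by the cost of that phrase's codeword.

def optimal_parse(text, dictionary, cost):
    """Return a minimum-cost sequence of dictionary phrases covering `text`.

    `dictionary` is a set of phrases; `cost(phrase)` is its codeword length.
    Simple dynamic program over a DAG of text positions.
    """
    n = len(text)
    INF = float("inf")
    best = [INF] * (n + 1)   # best[i] = min cost to encode text[:i]
    back = [None] * (n + 1)  # back-pointer to recover the parse
    best[0] = 0
    max_len = max(map(len, dictionary))
    for i in range(n):
        if best[i] == INF:
            continue
        for j in range(i + 1, min(n, i + max_len) + 1):
            phrase = text[i:j]
            if phrase in dictionary:
                c = best[i] + cost(phrase)
                if c < best[j]:
                    best[j], back[j] = c, i
    if best[n] == INF:
        return None  # text cannot be covered by this dictionary
    # Recover the phrase sequence from the back-pointers.
    parse, j = [], n
    while j > 0:
        i = back[j]
        parse.append(text[i:j])
        j = i
    return parse[::-1]

dictionary = {"a", "b", "ab", "ba", "aba"}
print(optimal_parse("ababa", dictionary, cost=lambda p: 1))  # → ['ab', 'aba']
```

With a unit cost per codeword this is the classical "fewest phrases" parsing; a variable-cost `cost` function models the variable-length dictionary pointers the abstract mentions.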
Ong, Ghim Hwee. "Text compression for transmission and storage". Thesis, Loughborough University, 1989. https://dspace.lboro.ac.uk/2134/13790.
Jones, Greg 1963-2017. "RADIX 95n: Binary-to-Text Data Conversion". Thesis, University of North Texas, 1991. https://digital.library.unt.edu/ark:/67531/metadc500582/.
He, Meng. "Indexing Compressed Text". Thesis, University of Waterloo, 2003. http://hdl.handle.net/10012/1143.
Blandon, Julio Cesar. "A novel lossless compression technique for text data". FIU Digital Commons, 1999. http://digitalcommons.fiu.edu/etd/1694.
Thaper, Nitin 1975. "Using compression for source-based classification of text". Thesis, Massachusetts Institute of Technology, 2001. http://hdl.handle.net/1721.1/86595.
Zhang, Nan. "TRANSFORM BASED AND SEARCH AWARE TEXT COMPRESSION SCHEMES AND COMPRESSED DOMAIN TEXT RETRIEVAL". Doctoral diss., University of Central Florida, 2005. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/3938.
Ph.D.
School of Computer Science
Engineering and Computer Science
Computer Science
Linhares, Pontes Elvys. "Compressive Cross-Language Text Summarization". Thesis, Avignon, 2018. http://www.theses.fr/2018AVIG0232/document.
The popularization of social networks and digital documents has quickly increased the information available on the Internet. However, this huge amount of data cannot be analyzed manually. Natural Language Processing (NLP) analyzes the interactions between computers and human languages in order to process and analyze natural language data. NLP techniques incorporate a variety of methods, including linguistics, semantics and statistics, to extract entities and relationships and to understand a document. Among several NLP applications, we are interested, in this thesis, in cross-language text summarization, which produces a summary in a language different from the language of the source documents. We also analyzed other NLP tasks (word encoding representation, semantic similarity, sentence and multi-sentence compression) to generate more stable and informative cross-lingual summaries. Most NLP applications (including all types of text summarization) use some kind of similarity measure to analyze and compare the meaning of words, chunks, sentences and texts. A way to analyze this similarity is to generate a representation of these sentences that contains their meaning. The meaning of sentences is defined by several elements, such as the context of words and expressions, the order of words and the previous information. Simple metrics, such as the cosine metric and the Euclidean distance, provide a measure of similarity between two sentences; however, they do not analyze the order of words or multi-word expressions. Analyzing these problems, we propose a neural network model that combines recurrent and convolutional neural networks to estimate the semantic similarity of a pair of sentences (or texts) based on the local and general contexts of words.
Our model predicted better similarity scores than the baselines by better analyzing the local and general meanings of words and multi-word expressions. In order to remove redundancies and non-relevant information from similar sentences, we propose a multi-sentence compression method that compresses similar sentences by fusing them into correct, short compressions that retain the main information of the similar sentences. We model clusters of similar sentences as word graphs. Then, we apply an integer linear programming model that guides the compression of these clusters based on a list of keywords: we look for a path in the word graph that has good cohesion and contains as many keywords as possible. Our approach outperformed baselines by generating more informative and correct compressions for French, Portuguese and Spanish. Finally, we combine these methods to build a cross-language text summarization system. Our system is an {English, French, Portuguese, Spanish}-to-{English, French} cross-language text summarization framework that analyzes the information in both languages to identify the most relevant sentences. Inspired by compressive text summarization methods in monolingual analysis, we adapt our multi-sentence compression method to this problem to keep just the main information. Our system proves to be a good alternative for compressing redundant information while preserving relevant information, and it improves informativeness scores without losing grammatical quality for French-to-English cross-lingual summaries. Analyzing {English, French, Portuguese, Spanish}-to-{English, French} cross-lingual summaries, our system significantly outperforms state-of-the-art extractive baselines for all these languages. In addition, we analyze the cross-language summarization of transcript documents, where our approach achieved better and more stable scores even for documents with grammatical errors and missing information.
Carlsson, Yvonne. "Genericitet i text". Doctoral thesis, Stockholms universitet, Institutionen för nordiska språk, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-81330.
Bell, Timothy. "A unifying theory and improvements for existing approaches to text compression". Thesis, University of Canterbury. Computer Science, 1986. http://hdl.handle.net/10092/8411.
Martin, Wickus. "A lossy, dictionary-based method for short message service (SMS) text compression". Master's thesis, University of Cape Town, 2009. http://hdl.handle.net/11427/6415.
Tao, Tao. "COMPRESSED PATTERN MATCHING FOR TEXT AND IMAGES". Doctoral diss., University of Central Florida, 2005. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/2739.
Ph.D.
School of Computer Science
Engineering and Computer Science
Computer Science
Matsubara, Shigeki, Yoshihide Kato and Seiji Egawa. "Sentence Compression by Removing Recursive Structure from Parse Tree". Springer, 2008. http://hdl.handle.net/2237/15113.
Hertz, David. "Secure Text Communication for the Tiger XS". Thesis, Linköping University, Department of Electrical Engineering, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-8011.
The option of communicating via SMS messages can be considered available in all GSM networks. It therefore constitutes an almost universally available method for mobile communication.
The Tiger XS, a device for secure communication manufactured by Sectra, is equipped with an encrypted text message transmission system. As the text message service of this device is becoming increasingly popular and as options to connect the Tiger XS to computers or to a keyboard are being researched, the text message service is in need of upgrade.
This thesis proposes amendments to the existing protocol structure. It thoroughly examines a number of options for source coding of small text messages and makes recommendations on implementing such features. It also suggests security enhancements and introduces a novel form of steganography.
Young, David A. "Compression of Endpoint Identifiers in Delay Tolerant Networking". Ohio University / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1385559406.
Gardner-Stephen, Paul Mark. "Explorations In Searching Compressed Nucleic Acid And Protein Sequence Databases And Their Cooperatively-Compressed Indices". Flinders University. Computer Science, Engineering & Mathematics, 2008. http://catalogue.flinders.edu.au./local/adt/public/adt-SFU20081111.105047.
Tam, Wai I. "Compression, indexing and searching of a large structured-text database in a library monitoring and control system (LiMaCS)". Thesis, University of Macau, 1998. http://umaclib3.umac.mo/record=b1636991.
Malatji, Promise Tshepiso. "The development of accented English synthetic voices". Thesis, University of Limpopo, 2019. http://hdl.handle.net/10386/2917.
A Text-to-speech (TTS) synthesis system is a software system that receives text as input and produces speech as output. A TTS synthesis system can be used for, amongst others, language learning and reading out text for people living with different disabilities (physically challenged, visually impaired, etc.), by native and non-native speakers of the target language. Most people relate easily to a second language spoken by a non-native speaker with whom they share a native language. Most online English TTS synthesis systems are developed using native speakers of English. This research study focuses on developing accented English synthetic voices as spoken by non-native speakers in the Limpopo province of South Africa. The Modular Architecture for Research on speech sYnthesis (MARY) TTS engine is used in developing the synthetic voices, and the Hidden Markov Model (HMM) method was used to train them. A secondary training text corpus is used to develop the training speech corpus by recording six speakers reading the text corpus. The quality of the developed synthetic voices is measured in terms of their intelligibility, similarity and naturalness using a listening test. The results are classified based on the evaluators' occupation and gender, alongside the overall results. The subjective listening test indicates that the developed synthetic voices have a high level of acceptance in terms of similarity and intelligibility. Speech analysis software is used to compare the synthesised speech with the human recordings; there is no significant difference in voice pitch between the speakers and the synthetic voices, except for one synthetic voice.
Borggren, Lukas. "Automatic Categorization of News Articles With Contextualized Language Models". Thesis, Linköpings universitet, Artificiell intelligens och integrerade datorsystem, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-177004.
Shang, Guokan. "Spoken Language Understanding for Abstractive Meeting Summarization Unsupervised Abstractive Meeting Summarization with Multi-Sentence Compression and Budgeted Submodular Maximization. Energy-based Self-attentive Learning of Abstractive Communities for Spoken Language Understanding Speaker-change Aware CRF for Dialogue Act Classification". Thesis, Institut polytechnique de Paris, 2021. http://www.theses.fr/2021IPPAX011.
With the impressive progress that has been made in transcribing spoken language, it is becoming increasingly possible to exploit transcribed data for tasks that require comprehension of what is said in a conversation. The work in this dissertation, carried out in the context of a project devoted to the development of a meeting assistant, contributes to ongoing efforts to teach machines to understand multi-party meeting speech. We have focused on the challenge of automatically generating abstractive meeting summaries. We first present our results on Abstractive Meeting Summarization (AMS), which aims to take a meeting transcription as input and produce an abstractive summary as output. We introduce a fully unsupervised framework for this task based on multi-sentence compression and budgeted submodular maximization. We also leverage recent advances in word embeddings and graph degeneracy applied to NLP, to take exterior semantic knowledge into account and to design custom diversity and informativeness measures. Next, we discuss our work on Dialogue Act Classification (DAC), whose goal is to assign each utterance in a discourse a label that represents its communicative intention. DAC yields annotations that are useful for a wide variety of tasks, including AMS. We propose a modified neural Conditional Random Field (CRF) layer that takes into account not only the sequence of utterances in a discourse, but also speaker information and, in particular, whether there has been a change of speaker from one utterance to the next. The third part of the dissertation focuses on Abstractive Community Detection (ACD), a sub-task of AMS, in which utterances in a conversation are grouped according to whether they can be jointly summarized by a common abstractive sentence.
We provide a novel approach to ACD in which we first introduce a neural contextual utterance encoder featuring three types of self-attention mechanisms and then train it using the siamese and triplet energy-based meta-architectures. We further propose a general sampling scheme that enables the triplet architecture to capture subtle patterns (e.g., overlapping and nested clusters).
Sjöstrand, Björn. "Evaluation of Compression Testing and Compression Failure Modes of Paperboard : Video analysis of paperboard during short-span compression and the suitability of short- and long-span compression testing of paperboard". Thesis, Karlstads universitet, Institutionen för ingenjörs- och kemivetenskaper (from 2013), 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:kau:diva-27519.
Gattis, Sherri L. "Ruggedized Television Compression Equipment for Test Range Systems". International Foundation for Telemetering, 1988. http://hdl.handle.net/10150/615062.
The Wideband Data Protection Program arose from the need to develop digitized, compressed video to enable encryption.
Jas, Abhijit. "Test vector compression techniques for systems-on-chip /". Full text (PDF) from UMI/Dissertation Abstracts International, 2001. http://wwwlib.umi.com/cr/utexas/fullcit?p3008359.
Rocher, Tatiana. "Compression et indexation de séquences annotées". Thesis, Lille 1, 2018. http://www.theses.fr/2018LIL1I004/document.
This thesis in text algorithmics studies the compression, indexing and querying of labeled texts. A labeled text is a text to which we add information. For example, in a V(D)J recombination, a marker for lymphocytes, the text is a DNA sequence and the labels are gene names. A person's immune system can be represented by a set of V(D)J recombinations. With high-throughput sequencing, we have access to millions of V(D)J recombinations, which are stored and need to be retrieved and compared quickly. The first contribution of this manuscript is a compression method for labeled texts that uses the concept of storage by references: the text is divided into sections which point to pre-established labeled sequences. The second contribution offers two indexes for a labeled text. Both use a Burrows-Wheeler transform to index the text and a wavelet tree to index the labels. These indexes allow efficient queries on the text, the labels, or both. We would like to use one of these indexes on the V(D)J recombinations obtained by hematology services during the diagnosis and follow-up of patients suffering from leukemia.
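As a miniature of the two ingredients the abstract names, the sketch below computes a Burrows-Wheeler transform by rotation sorting and answers counting queries with FM-index-style backward search (the wavelet tree over labels is omitted, and the naive construction is only workable for short strings). Function names are illustrative, not taken from the thesis.

```python
# Hypothetical sketch: BWT construction plus backward-search counting.

def bwt(text):
    """Burrows-Wheeler transform via sorted rotations ('$' terminates)."""
    s = text + "$"
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)

def count_occurrences(bwt_string, pattern):
    """Count occurrences of `pattern` via backward search over the BWT."""
    first_col = sorted(bwt_string)
    # C[c] = number of characters in the text strictly smaller than c
    C = {c: first_col.index(c) for c in set(bwt_string)}

    def rank(c, i):  # occurrences of c in bwt_string[:i]
        return bwt_string[:i].count(c)

    lo, hi = 0, len(bwt_string)
    for c in reversed(pattern):
        if c not in C:
            return 0
        lo = C[c] + rank(c, lo)
        hi = C[c] + rank(c, hi)
        if lo >= hi:
            return 0
    return hi - lo

print(bwt("banana"))                              # → annb$aa
print(count_occurrences(bwt("banana"), "ana"))    # → 2 (overlaps counted)
```

A practical index replaces the linear-time `rank` with a wavelet tree or sampled occurrence table, which is exactly where the thesis's label index plugs in.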
Langiu, Alessio. "Parsing optimal pour la compression du texte par dictionnaire". Phd thesis, Université Paris-Est, 2012. http://tel.archives-ouvertes.fr/tel-00804215.
Navickas, T. A., and S. G. Jones. "PULSE CODE MODULATION DATA COMPRESSION FOR AUTOMATED TEST EQUIPMENT". International Foundation for Telemetering, 1991. http://hdl.handle.net/10150/612065.
Development of automated test equipment for an advanced telemetry system requires continuous monitoring of PCM data while exercising telemetry inputs. This requirement leads to a large amount of data that needs to be stored and later analyzed. For example, a data stream of 4 Mbits/s and a test time of thirty minutes would yield 900 Mbytes of raw data. Along with this raw data, information needs to be stored to correlate the raw data to the test stimulus, giving a total of 1.8 GB of data to be stored and analyzed. There is no method to analyze this amount of data in a reasonable time, so a data compression method is needed to reduce the amount of data collected to a reasonable amount. The solution to the problem was data reduction, accomplished by real-time limit checking, time stamping, and smart software. Limit checking was accomplished by an eight-state finite state machine and four compression algorithms. Time stamping was needed to correlate stimulus to the appropriate output for data reconstruction. The software was written in the C programming language with a DOS extender used to allow it to run in extended mode. A 94-98% compression in the amount of data gathered was accomplished using this method.
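The limit-checking idea above can be illustrated in a few lines: rather than storing every PCM sample, record a time-stamped event only when a channel leaves or re-enters its limits. This is a hypothetical simplification of the eight-state machine described in the abstract; the names and event format are made up for the example.

```python
# Hypothetical sketch of limit-check data reduction: keep only
# time-stamped limit-crossing events instead of the raw sample stream.

def limit_check_compress(samples, low, high):
    """Reduce a sample stream to time-stamped limit-crossing events."""
    events, in_range = [], True
    for t, value in enumerate(samples):
        ok = low <= value <= high
        if ok != in_range:  # state change: record timestamp and value
            events.append((t, value, "in" if ok else "out"))
            in_range = ok
    return events

stream = [5, 6, 7, 12, 13, 6, 5, 4, 15]
print(limit_check_compress(stream, low=0, high=10))
# → [(3, 12, 'out'), (5, 6, 'in'), (8, 15, 'out')]  (9 samples -> 3 events)
```

The time stamps preserve exactly the stimulus-to-output correlation the abstract says is needed for reconstruction.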
Poirier, Régis. "Compression de données pour le test des circuits intégrés". Montpellier 2, 2004. http://www.theses.fr/2004MON20119.
Khayat, Moghaddam Elham. "On low power test and low power compression techniques". Diss., University of Iowa, 2011. https://ir.uiowa.edu/etd/997.
Zacharia, Nadime. "Compression and decompression of test data for scan-based designs". Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1997. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape11/PQDD_0004/MQ44048.pdf.
Zacharia, Nadime. "Compression and decompression of test data for scan based designs". Thesis, McGill University, 1996. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=20218.
The design of the decompression unit is treated in depth and a design is proposed that minimizes the amount of extra hardware required. In fact, the design of the decompression unit uses flip-flops already on the chip: it is implemented without inserting any additional flip-flops.
The proposed scheme is applied in two different contexts: (1) in (external) deterministic-stored testing, to reduce the memory requirements imposed on the test equipment; and (2) in built-in self test, to design a test pattern generator capable of generating deterministic patterns with modest area and memory requirements.
Experimental results are provided for the largest ISCAS'89 benchmarks. All of these results show that the proposed technique greatly reduces the amount of test data while requiring little area overhead. Compression factors of more than 20 are reported for some circuits.
Pateras, Stephen. "Correlated and cube-contained random patterns : test set compression techniques". Thesis, McGill University, 1991. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=70300.
The concepts of correlated and cube-contained random patterns can be viewed as methods to compress a deterministic test set into a small amount of information which is then used to control the generation of a superset of the deterministic test set. The goal is to make this superset as small as possible while maintaining its containment of the original test set. The two concepts are meant to be used in either a Built-In Self-Test (BIST) environment or with an external tester when the storage requirements of a deterministic test are too large.
Experimental results show that both correlated and cube-contained random patterns can achieve 100% fault coverage of synthesized circuits using orders of magnitude fewer patterns than when equiprobable random patterns are used.
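A minimal sketch of cube containment, under assumptions simplified from the abstract: a deterministic test cube fixes some bits and leaves the rest as don't-cares, and random patterns are drawn so that every pattern contains the cube; only the cube (plus a generator seed) needs storing rather than the full patterns. The helper below is illustrative, not the thesis's generator.

```python
import random

# Hypothetical illustration of "cube-contained random patterns": '0'/'1'
# positions of a test cube are fixed, 'x' (don't-care) positions are
# randomized, so every generated pattern contains the cube by construction.

def expand_cube(cube, rng):
    """Randomize the don't-care positions of a test cube."""
    return "".join(bit if bit in "01" else rng.choice("01") for bit in cube)

rng = random.Random(0)           # a stored seed regenerates the same set
cube = "1xx0x1"                  # deterministic cube with 3 care bits
patterns = [expand_cube(cube, rng) for _ in range(4)]
for p in patterns:
    # every pattern contains the cube: care bits survive unchanged
    assert p[0] == "1" and p[3] == "0" and p[5] == "1"
```

The compression comes from storing the cube and seed instead of the patterns; correlating or weighting the random bits, as the thesis does, shrinks the required superset further.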
Dalmasso, Julien. "Compression de données de test pour architecture de systèmes intégrés basée sur bus ou réseaux et réduction des coûts de test". Thesis, Montpellier 2, 2010. http://www.theses.fr/2010MON20061/document.
While microelectronic systems become more and more complex, test costs have increased in the same way. Recent years have seen many works focused on test cost reduction through test data compression. However, these techniques target individual digital circuits whose structural implementation (netlist) is fully known by the designer; they are therefore not suitable for testing the cores of a complete system. The goal of this PhD work was to provide a new solution for test data compression of integrated circuits that takes into account the paradigm of systems-on-chip (SoC) built from pre-synthesized functions (IPs or cores). Two system testing methods using compression are proposed for two different system architectures: the first concerns SoCs with an IEEE 1500 test architecture (with a bus-based test access mechanism), the second concerns NoC-based systems. Both techniques combine test scheduling with test data compression for better exploration of the design space; the idea is to increase test parallelism at no extra hardware cost. Experimental results on system-on-chip benchmarks show that the use of test data compression leads to a test time reduction of about 50% at the system level.
Willis, Stephen, and Bernd Langer. "A Duel Compression Ethernet Camera Solution for Airborne Applications". International Foundation for Telemetering, 2014. http://hdl.handle.net/10150/577522.
Camera technology is now ubiquitous, with smartphones, laptops, automotive and industrial applications frequently utilizing high-resolution imaging sensors. There is an increasing demand for high-definition cameras in the aerospace market; however, such cameras must meet several requirements that do not apply to average consumer use, including high reliability and ruggedization for harsh environments. A significant issue is managing the large volumes of data that one or more HD cameras produce. One method of addressing this issue is to use compression algorithms that reduce video bandwidth. This can be achieved with dedicated compression units or modules within data acquisition systems. For flight test applications it is important that data from cameras is available for telemetry and coherently synchronized, while also being available for storage. Ideally the data in the telemetry stream should be highly compressed to preserve downlink bandwidth, while the recorded data is lightly compressed to provide maximum quality for onboard/post-flight analysis. This paper discusses the requirements for airborne applications and presents an innovative solution using Ethernet cameras with integrated compression that output two streams of data. This removes the need for dedicated video and compression units while offering all their features, including switching camera sources and optimized video streams.
Limprasert, Tawan. "Behaviour of soil, soil-cement and soil-cement-fiber under multiaxial test". Ohio University / OhioLINK, 1995. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1179260769.
Junior, Célio Anderson da Silva. "Avaliação das propriedades mecânicas de ossos de coelhas submetidas à administração de glicocorticóides". Universidade de São Paulo, 2003. http://www.teses.usp.br/teses/disponiveis/82/82131/tde-11122003-143837/.
Corticosteroids are used in many clinical conditions because they have strong anti-inflammatory and immunosuppressive activities. At the same time, however, they can cause many metabolic alterations and side effects, mainly with prolonged use. In the present research we studied the possible alterations caused by steroids on the mechanical properties of lamellar and trabecular bone of rabbits. The mechanical properties were assessed by bending tests performed on intact femurs and tibiae as well as on samples of cortical bone; compression tests were performed on the L5 vertebra. Thirty-seven female rabbits were randomly distributed into an experimental group (EG animals) and a control group (CG animals). These groups were divided into four subgroups: two experimental and two control. The experimental animals received 2 mg/kg/day of methylprednisolone (Solumedrol®) for three weeks. The control animals received the same volume of intramuscular injections of saline, once a day, for three weeks. From the load-deformation curves, the load and deflection at the yield point were obtained. The ultimate load and the resilience were also obtained for the intact bones. When the specimens were analysed, the ultimate tension was determined. The statistical analyses did not show any difference in the mechanical properties between treated and untreated animals, but the treated animals showed a significant loss of body weight. We feel that these results call for deeper investigation.
Lancel, Jerome. "Analysis and test of a centrifugal compressor". Thesis, University of Sussex, 2002. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.250183.
Chen, Liang-Chih, and 陳良智. "On Text Compression Algorithms". Thesis, 1997. http://ndltd.ncl.edu.tw/handle/68972127981738088023.
National Tsing Hua University
Institute of Information Science
85
The purpose of this thesis is to make a comprehensive survey of text compression and to find an optimal algorithm for it. There have been some famous algorithms in this realm. How these algorithms work and why they compress data well are the two topics we are most concerned with, so we first survey these algorithms and analyze their strengths and limitations from a theoretical viewpoint. Then the performance of the algorithms is evaluated by experiments. Finally, context modelling is suggested to further improve the compression ratio; with it, almost all of the redundancy in the source messages is, to our understanding, exhausted. A new algorithm using context modelling is proposed and evaluated.
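The benefit of the context modelling mentioned above can be made concrete with a toy order-1 model: conditioning each symbol on its predecessor typically lowers the total code length an entropy coder would need, compared with an order-0 model. The sketch below only computes those ideal code lengths; it is not a full compressor, and the function names are invented for the example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical sketch: ideal code lengths (in bits) under an order-0 model
# versus an order-1 context model, as an entropy coder such as arithmetic
# coding would approach them.

def code_length_order0(text):
    counts = Counter(text)
    n = len(text)
    return -sum(c * math.log2(c / n) for c in counts.values())

def code_length_order1(text):
    contexts = defaultdict(Counter)
    for prev, cur in zip(text, text[1:]):
        contexts[prev][cur] += 1
    bits = 0.0
    for ctx_counts in contexts.values():
        total = sum(ctx_counts.values())
        bits -= sum(c * math.log2(c / total) for c in ctx_counts.values())
    return bits

text = "abababababcabababab"  # strongly predictable from the previous char
print(code_length_order0(text), code_length_order1(text))
```

On text with strong local structure the order-1 total is far smaller, which is exactly the redundancy a context-modelling compressor exploits (real coders must also pay for transmitting or adaptively learning the model).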
"Text compression for Chinese documents". Chinese University of Hong Kong, 1995. http://library.cuhk.edu.hk/record=b5888571.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1995.
Includes bibliographical references (leaves 133-137).
Abstract --- p.i
Acknowledgement --- p.iii
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Importance of Text Compression --- p.1
Chapter 1.2 --- Historical Background of Data Compression --- p.2
Chapter 1.3 --- The Essences of Data Compression --- p.4
Chapter 1.4 --- Motivation and Objectives of the Project --- p.5
Chapter 1.5 --- Definition of Important Terms --- p.6
Chapter 1.5.1 --- Data Models --- p.6
Chapter 1.5.2 --- Entropy --- p.10
Chapter 1.5.3 --- Statistical and Dictionary-based Compression --- p.12
Chapter 1.5.4 --- Static and Adaptive Modelling --- p.12
Chapter 1.5.5 --- One-Pass and Two-Pass Modelling --- p.13
Chapter 1.6 --- Benchmarks and Measurements of Results --- p.15
Chapter 1.7 --- Sources of Testing Data --- p.16
Chapter 1.8 --- Outline of the Thesis --- p.16
Chapter 2 --- Literature Survey --- p.18
Chapter 2.1 --- Data compression Algorithms --- p.18
Chapter 2.1.1 --- Statistical Compression Methods --- p.18
Chapter 2.1.2 --- Dictionary-based Compression Methods (Ziv-Lempel Family) --- p.23
Chapter 2.2 --- Cascading of Algorithms --- p.33
Chapter 2.3 --- Problems of Current Compression Programs on Chinese --- p.34
Chapter 2.4 --- Previous Chinese Data Compression Literatures --- p.37
Chapter 3 --- Chinese-related Issues --- p.38
Chapter 3.1 --- Characteristics in Chinese Data Compression --- p.38
Chapter 3.1.1 --- Large and Not Fixed Size Character Set --- p.38
Chapter 3.1.2 --- Lack of Word Segmentation --- p.40
Chapter 3.1.3 --- Rich Semantic Meaning of Chinese Characters --- p.40
Chapter 3.1.4 --- Grammatical Variance of Chinese Language --- p.41
Chapter 3.2 --- Definition of Different Coding Schemes --- p.41
Chapter 3.2.1 --- Big5 Code --- p.42
Chapter 3.2.2 --- GB (Guo Biao) Code --- p.43
Chapter 3.2.3 --- Unicode --- p.44
Chapter 3.2.4 --- HZ (Hanzi) Code --- p.45
Chapter 3.3 --- Entropy of Chinese and Other Languages --- p.45
Chapter 4 --- Huffman Coding on Chinese Text --- p.49
Chapter 4.1 --- The use of the Chinese Character Identification Routine --- p.50
Chapter 4.2 --- Result --- p.51
Chapter 4.3 --- Justification of the Result --- p.53
Chapter 4.4 --- Time and Memory Resources Analysis --- p.58
Chapter 4.5 --- The Heuristic Order-n Huffman Coding for Chinese Text Compression --- p.61
Chapter 4.5.1 --- The Algorithm --- p.62
Chapter 4.5.2 --- Result --- p.63
Chapter 4.5.3 --- Justification of the Result --- p.64
Chapter 4.6 --- Chapter Conclusion --- p.66
Chapter 5 --- The Ziv-Lempel Compression on Chinese Text --- p.67
Chapter 5.1 --- The Chinese LZSS Compression --- p.68
Chapter 5.1.1 --- The Algorithm --- p.69
Chapter 5.1.2 --- Result --- p.73
Chapter 5.1.3 --- Justification of the Result --- p.74
Chapter 5.1.4 --- Time and Memory Resources Analysis --- p.80
Chapter 5.1.5 --- Effects in Controlling the Parameters --- p.81
Chapter 5.2 --- The Chinese LZW Compression --- p.92
Chapter 5.2.1 --- The Algorithm --- p.92
Chapter 5.2.2 --- Result --- p.94
Chapter 5.2.3 --- Justification of the Result --- p.95
Chapter 5.2.4 --- Time and Memory Resources Analysis --- p.97
Chapter 5.2.5 --- Effects in Controlling the Parameters --- p.98
Chapter 5.3 --- A Comparison of the performance of the LZSS and the LZW --- p.100
Chapter 5.4 --- Chapter Conclusion --- p.101
Chapter 6 --- Chinese Dictionary-based Huffman coding --- p.103
Chapter 6.1 --- The Algorithm --- p.104
Chapter 6.2 --- Result --- p.107
Chapter 6.3 --- Justification of the Result --- p.108
Chapter 6.4 --- Effects of Changing the Size of the Dictionary --- p.111
Chapter 6.5 --- Chapter Conclusion --- p.114
Chapter 7 --- Cascading of Huffman coding and LZW compression --- p.116
Chapter 7.1 --- Static Cascading Model --- p.117
Chapter 7.1.1 --- The Algorithm --- p.117
Chapter 7.1.2 --- Result --- p.120
Chapter 7.1.3 --- Explanation and Analysis of the Result --- p.121
Chapter 7.2 --- Adaptive (Dynamic) Cascading Model --- p.125
Chapter 7.2.1 --- The Algorithm --- p.125
Chapter 7.2.2 --- Result --- p.126
Chapter 7.2.3 --- Explanation and Analysis of the Result --- p.127
Chapter 7.3 --- Chapter Conclusion --- p.128
Chapter 8 --- Concluding Remarks --- p.129
Chapter 8.1 --- Conclusion --- p.129
Chapter 8.2 --- Future Work Direction --- p.130
Chapter 8.2.1 --- Improvement in Efficiency and Resources Consumption --- p.130
Chapter 8.2.2 --- The Compressibility of Chinese and Other Languages --- p.131
Chapter 8.2.3 --- Use of Grammar Model --- p.131
Chapter 8.2.4 --- Lossy Compression --- p.131
Chapter 8.3 --- Epilogue --- p.132
Bibliography --- p.133
Liou, Chin-yuan, and 劉欽源. "Text compression schemes : a comparison". Thesis, 1993. http://ndltd.ncl.edu.tw/handle/77425729174310677189.
Perera, Paththamestrige. "Syntactic Sentence Compression for Text Summarization". Thesis, 2013. http://spectrum.library.concordia.ca/977725/1/Paththamestrige_MSc_F2013.pdf.
Zhang, Xiaoxi. "Efficient Parallel Text Compression on GPUs". Thesis, 2011. http://hdl.handle.net/1969.1/ETD-TAMU-2011-12-10308.
Langiu, Alessio. "Optimal Parsing for Dictionary Text Compression". Doctoral thesis, 2012. http://hdl.handle.net/10447/94651.
Ye, Yan. "Text image compression based on pattern matching". Diss., 2002. http://wwwlib.umi.com/cr/ucsd/fullcit?p3036946.
"Context-based compression algorithms for text and image data". 1997. http://library.cuhk.edu.hk/record=b5889317.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1997.
Includes bibliographical references (leaves 80-85).
ABSTRACT --- p.1
Chapter 1. --- INTRODUCTION --- p.2
Chapter 1.1 --- Motivation --- p.4
Chapter 1.2 --- Original Contributions --- p.5
Chapter 1.3 --- Thesis Structure --- p.5
Chapter 2. --- BACKGROUND --- p.7
Chapter 2.1 --- Information Theory --- p.7
Chapter 2.2 --- Early Compression --- p.8
Chapter 2.2.1 --- Some Source Codes --- p.10
Chapter 2.2.1.1 --- Huffman Code --- p.10
Chapter 2.2.1.2 --- Tunstall Code --- p.10
Chapter 2.2.1.3 --- Arithmetic Code --- p.11
Chapter 2.3 --- Modern Techniques for Compression --- p.14
Chapter 2.3.1 --- Statistical Modeling --- p.14
Chapter 2.3.1.1 --- Context Modeling --- p.15
Chapter 2.3.1.2 --- State Based Modeling --- p.17
Chapter 2.3.2 --- Dictionary Based Compression --- p.17
Chapter 2.3.2.1 --- LZ-compression --- p.19
Chapter 2.3.3 --- Other Compression Techniques --- p.20
Chapter 2.3.3.1 --- Block Sorting --- p.20
Chapter 2.3.3.2 --- Context Tree Weighting --- p.21
Chapter 3. --- SYMBOL REMAPPING --- p.22
Chapter 3.1 --- Reviews on Block Sorting --- p.22
Chapter 3.1.1 --- Forward Transformation --- p.23
Chapter 3.1.2 --- Inverse Transformation --- p.24
Chapter 3.2 --- Ordering Method --- p.25
Chapter 3.3 --- Discussions --- p.27
Chapter 4. --- CONTENT PREDICTION --- p.29
Chapter 4.1 --- Prediction and Ranking Schemes --- p.29
Chapter 4.1.1 --- Content Predictor --- p.29
Chapter 4.1.2 --- Ranking Technique --- p.30
Chapter 4.2 --- Reviews on Context Sorting --- p.31
Chapter 4.2.1 --- Context Sorting Basis --- p.31
Chapter 4.3 --- General Framework of Content Prediction --- p.31
Chapter 4.3.1 --- A Baseline Version --- p.32
Chapter 4.3.2 --- Context Length Merge --- p.34
Chapter 4.4 --- Discussions --- p.36
Chapter 5. --- BOUNDED-LENGTH BLOCK SORTING --- p.38
Chapter 5.1 --- Block Sorting with Bounded Context Length --- p.38
Chapter 5.1.1 --- Forward Transformation --- p.38
Chapter 5.1.2 --- Reverse Transformation --- p.39
Chapter 5.2 --- Locally Adaptive Entropy Coding --- p.43
Chapter 5.3 --- Discussion --- p.45
Chapter 6. --- CONTEXT CODING FOR IMAGE DATA --- p.47
Chapter 6.1 --- Digital Images --- p.47
Chapter 6.1.1 --- Redundancy --- p.48
Chapter 6.2 --- Model of a Compression System --- p.49
Chapter 6.2.1 --- Representation --- p.49
Chapter 6.2.2 --- Quantization --- p.50
Chapter 6.2.3 --- Lossless coding --- p.51
Chapter 6.3 --- The Embedded Zerotree Wavelet Coding --- p.51
Chapter 6.3.1 --- Simple Zerotree-like Implementation --- p.53
Chapter 6.3.2 --- Analysis of Zerotree Coding --- p.54
Chapter 6.3.2.1 --- Linkage between Coefficients --- p.55
Chapter 6.3.2.2 --- Design of Uniform Threshold Quantizer with Dead Zone --- p.58
Chapter 6.4 --- Extensions on Wavelet Coding --- p.59
Chapter 6.4.1 --- Coefficients Scanning --- p.60
Chapter 6.5 --- Discussions --- p.61
Chapter 7. --- CONCLUSIONS --- p.63
Chapter 7.1 --- Future Research --- p.64
APPENDIX --- p.65
Chapter A --- Lossless Compression Results --- p.65
Chapter B --- Image Compression Standards --- p.72
Chapter C --- Human Visual System Characteristics --- p.75
Chapter D --- Lossy Compression Results --- p.76
COMPRESSION GALLERY --- p.77
Context-based Wavelet Coding --- p.75
RD-OPT-based JPEG Compression --- p.76
SPIHT Wavelet Compression --- p.77
REFERENCES --- p.80
Wen, Chih-Ming, and 溫智旻. "Text Compression Using a Word-Based Large Alphabet". Thesis, 2005. http://ndltd.ncl.edu.tw/handle/74710510610120099923.
National Taiwan University of Science and Technology
Department of Computer Science and Information Engineering
93
In this thesis, some word-based large-alphabet text compression schemes are studied. After a word token is parsed from an English or Chinese text file, its occurrence probability is predicted with blended predictive models or with a partial-match predictive model. This probability is then encoded by the arithmetic-coding module. To improve the speed of our compression schemes, we have also studied data structures and corresponding processing methods suited to a large alphabet. The schemes studied here are implemented as executable programs that compress and decompress typical text files. Performance is compared between our program and other text compression programs such as GZIP, bzip2, and PPMd. Experimental results show that our program achieves a better compression rate than these programs on both Chinese and English text files. On average, the rate improvements are 17.02%, 5.48%, and 1.12% for Chinese text files, and 12.08%, 2.04%, and 0.29% for English text files, respectively.
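The word-based large-alphabet idea can be sketched roughly as follows; the tokenizer, the order-0 adaptive model, and the +1 smoothing here are illustrative assumptions, not the blended/partial-match models the thesis actually uses.

```python
import math
import re

def word_tokens(text):
    """Split text into alternating word and non-word tokens, so the
    alphabet is the set of distinct tokens rather than single characters."""
    return re.findall(r"\w+|\W+", text)

def adaptive_code_length(tokens):
    """Total bits an adaptive order-0 word model (with a crude +1 novel-token
    slot) would spend; an arithmetic coder can approach this total."""
    counts = {}
    seen = 0
    bits = 0.0
    for tok in tokens:
        vocab = len(counts) + 1                      # +1 slot for a novel token
        p = (counts.get(tok, 0) + 1) / (seen + vocab)
        bits += -math.log2(p)                        # ideal arithmetic-code cost
        counts[tok] = counts.get(tok, 0) + 1
        seen += 1
    return bits

text = "the cat sat on the mat because the cat liked the mat"
tokens = word_tokens(text)
bits = adaptive_code_length(tokens)
```

Because whole words repeat far more often than arbitrary character strings, repeated tokens quickly become cheap, which is the intuition behind treating words as the alphabet.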
Huang, Ping-Feng, and 黃品丰. "Recovering corrupted text files with dictionary-based compression". Thesis, 2018. http://ndltd.ncl.edu.tw/handle/8npz55.
Feng Chia University
Department of Communications Engineering
106
The pace of technology development nowadays is getting faster and faster. New technologies create enormous amounts of data over the internet, and that data is growing exponentially. Storing all of it pushes for highly efficient data compression techniques, which make data transfer faster than ever and storage requirements smaller than ever. However, most data compression techniques concentrate on a high compression ratio rather than on recovering data from corrupted compressed files. A compressed file may fail to decompress correctly if even a few bits in it are corrupted, requiring a re-transmission or the retrieval of another copy from another storage device. To address the need to recover data from such files, we propose a segmented data compression technique. This technique lets us retrieve most of the uncorrupted data even when some contents of the file are lost. The trade-off of the proposed scheme is a less-than-optimal compression ratio in exchange for better protection of data files.
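A segmented scheme of the general kind proposed here can be sketched with per-segment zlib streams and explicit length headers; the segment size and framing below are our assumptions for illustration, not the thesis's actual format.

```python
import struct
import zlib

def compress_segmented(data: bytes, seg_size: int = 4096) -> bytes:
    """Compress each segment independently and prefix it with its
    compressed length, so one corrupted segment cannot cascade."""
    out = bytearray()
    for i in range(0, len(data), seg_size):
        block = zlib.compress(data[i:i + seg_size])
        out += struct.pack(">I", len(block)) + block
    return bytes(out)

def recover_segments(blob: bytes) -> list:
    """Decompress every segment that still decodes; mark corrupted ones."""
    recovered, pos = [], 0
    while pos + 4 <= len(blob):
        (length,) = struct.unpack(">I", blob[pos:pos + 4])
        chunk = blob[pos + 4:pos + 4 + length]
        try:
            recovered.append(zlib.decompress(chunk))
        except zlib.error:
            recovered.append(None)          # this segment is lost; others survive
        pos += 4 + length
    return recovered

data = b"text compression " * 1000
blob = bytearray(compress_segmented(data, seg_size=1024))
(length0,) = struct.unpack(">I", bytes(blob[:4]))
blob[4 + length0 // 2] ^= 0xFF              # flip a byte inside segment 0's payload
parts = recover_segments(bytes(blob))       # only segment 0 fails to decode
```

With a monolithic stream, the same single-byte error would typically destroy everything after the corruption point; here it costs one segment.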
Chuang, Yu-Ting, and 莊侑頲. "Text Detection in Color Images and Compound Document Compression". Thesis, 2003. http://ndltd.ncl.edu.tw/handle/91068801328009029959.
National Taiwan University
Graduate Institute of Communication Engineering
91
Abstract: With the growth of multimedia, news, magazines, Web pages, and the like are everywhere in our lives, and the text in these documents plays an important role when people need to understand the details of their downloaded data. Moreover, to help people who speak different languages easily understand the content of such data, text localization and translation in color images are becoming more and more important; clearly, good text translation can be achieved only if text regions are localized accurately. To achieve good translation performance, we propose a novel approach to detecting text in color images with a very low false-alarm rate. First, neural-network color quantization is used to compact the text colors. Second, 3D histogram analysis chooses several color candidates, and each candidate is extracted to obtain a bi-level image. For each extracted bi-level image, connected-component analysis and several morphological operators are applied to retain bounding boxes that are possible text regions. Finally, a LoG (Laplacian of Gaussian) edge detector verifies the accurate text regions among the candidates. In complex color images, multiple quantization layers can also be integrated to reject non-text parts and further reduce the false-alarm rate. Beyond localizing text regions in color images, the text localization technique can be applied to the compression of compound documents such as magazines and newspapers. This application not only reduces the transmission rate effectively but also keeps the text parts clear under low-bit-rate transmission.
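The connected-component step of such a pipeline can be sketched as follows; the 4-connectivity and the minimum-size filter are generic choices offered for illustration, not the thesis's exact operators.

```python
from collections import deque

def connected_components(bitmap):
    """4-connected components of a binary image (list of 0/1 rows);
    returns each component's bounding box as (top, left, bottom, right)."""
    h, w = len(bitmap), len(bitmap[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if bitmap[y][x] and not seen[y][x]:
                queue = deque([(y, x)])
                seen[y][x] = True
                t, l, b, r = y, x, y, x
                while queue:                      # BFS flood fill of one component
                    cy, cx = queue.popleft()
                    t, l = min(t, cy), min(l, cx)
                    b, r = max(b, cy), max(r, cx)
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w and bitmap[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                boxes.append((t, l, b, r))
    return boxes

def plausible_text_boxes(boxes, min_side=2):
    """Reject specks smaller than min_side in either dimension -- a crude
    stand-in for the morphological filtering described above."""
    return [(t, l, b, r) for t, l, b, r in boxes
            if b - t + 1 >= min_side and r - l + 1 >= min_side]

bitmap = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 1],   # a 3x3 blob (candidate glyph) and one noise pixel
    [1, 1, 1, 0, 0],
]
boxes = connected_components(bitmap)
text_boxes = plausible_text_boxes(boxes)
```

On a real bi-level extraction, each surviving box would then be passed to the edge-based verification stage.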
Lin, Wen Long, and 林文隆. "Wavelet-Based Color Document Compression with Graph and Text Segmentation". Thesis, 1998. http://ndltd.ncl.edu.tw/handle/03869862444590817113.
National Chiao Tung University
Department of Electrical and Control Engineering
86
In this thesis, we apply graphics/text segmentation to wavelet coefficients to separate the graphics and text parts of a color document. Zerotree coding encodes the graphics part, and the text part is coded with a multi-plane text coding method. The color count, the ratio of projection variances, and the fractal dimension, which differ between the graphics and text parts of a block, provide the information used for segmentation. Because these three parameters exhibit strongly fuzzy behavior, we develop a fuzzy rule to perform the segmentation. Simulation results show that image compression with graphics/text segmentation performs well at high compression ratios on color documents. We also discuss optimal bit-rate allocation in color images, the relation between PSNR and the number of wavelet-transform levels, and how the high-frequency coefficients affect image quality.
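Two of the block features named above (color count and projection-variance ratio) can be computed as in this rough sketch; the crisp threshold stands in for the thesis's fuzzy rule, and all names and values here are illustrative assumptions.

```python
def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def block_features(block):
    """block: 2D list of pixel values. Returns (distinct color count,
    ratio of horizontal to vertical projection variance)."""
    colors = len({p for row in block for p in row})
    row_proj = [sum(row) for row in block]              # horizontal projection
    col_proj = [sum(col) for col in zip(*block)]        # vertical projection
    ratio = variance(row_proj) / (variance(col_proj) + 1e-9)
    return colors, ratio

def looks_like_text(block, max_colors=4):
    """Crisp placeholder for the fuzzy rule: text blocks tend to use few
    colors (e.g. ink and paper) while picture blocks use many."""
    colors, _ratio = block_features(block)
    return colors <= max_colors

text_block = [            # two-color block, as printed text typically is
    [0, 255, 0, 255],
    [0, 255, 0, 255],
    [0, 0, 0, 0],
    [255, 255, 255, 255],
]
picture_block = [[16 * i + j for j in range(4)] for i in range(4)]  # 16 distinct values
```

A fuzzy version would replace the hard `max_colors` cut-off with membership functions over all three features and combine them with fuzzy rules, as the thesis does.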