Dissertations / Theses on the topic 'Chinese language Technical Chinese Data processing'

Consult the top 50 dissertations / theses for your research on the topic 'Chinese language Technical Chinese Data processing.'


1

洪進德 and Chun-tak Hung. "Chinese workbench: an integrated environment for Chinese writers." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1992. http://hub.hku.hk/bib/B31210314.

2

Yiu, Lai Kuen Candy. "Chinese character synthesis: towards universal Chinese information exchange." HKBU Institutional Repository, 2003. http://repository.hkbu.edu.hk/etd_ra/477.

3

余銘龍 and Ming-lung Yu. "Automatic processing of Chinese language bank cheques." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2002. http://hub.hku.hk/bib/B31225548.

4

羅憲璋 and Hin-cheung Hubert Law. "A language model for Mandarin Chinese." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1997. http://hub.hku.hk/bib/B29913391.

5

Lee, Chi-yin. "A pure orthographic stage in processing Chinese characters: evidence from data of sub-morphemic processing in preschool children." Thesis, University of Hong Kong, 2003. http://lookup.lib.hku.hk/lookup/bib/B38888919.

Abstract:
Thesis (B.Sc.)--University of Hong Kong, 2003.
"A dissertation submitted in partial fulfilment of the requirements for the Bachelor of Science (Speech and Hearing Sciences), The University of Hong Kong, April 30, 2003." Includes bibliographical references (p. 28-30). Also available in print.
6

Lee, Hiu-wing Doris, and 李曉穎. "A study of automatic expansion of Chinese abbreviations." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2005. http://hub.hku.hk/bib/B31609338.

7

施雷 and Lui Sze. "Computer recognition of printed Chinese characters." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1996. http://hub.hku.hk/bib/B31213601.

8

Wong, Kun-wing Peter, and 黃冠榮. "Breaking the learning barrier of Chinese Changjei input method." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1998. http://hub.hku.hk/bib/B31961198.

9

黃伯光 and Pak-kwong Wong. "Statistical language models for Chinese recognition: speech and character." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1998. http://hub.hku.hk/bib/B31239456.

10

陳國評 and Kwok-ping Chan. "Fuzzy set theoretic approach to handwritten Chinese character recognition." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1989. http://hub.hku.hk/bib/B30425876.

11

Wong, Kun-wing Peter. "Breaking the learning barrier of Chinese Changjei input method." Hong Kong: University of Hong Kong, 1998. http://sunzi.lib.hku.hk/hkuto/record.jsp?B21367875.

12

Wong, Ping-wai, and 黃炳蔚. "Semantic annotation of Chinese texts with message structures based on HowNet." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2007. http://hub.hku.hk/bib/B38212389.

13

林依民 and Yi-min Lin. "Computer recognition of printed Chinese characters." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1990. http://hub.hku.hk/bib/B31209919.

14

梁祥海 and Cheung-hoi Leung. "Computer recognition of handprinted Chinese characters." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1986. http://hub.hku.hk/bib/B31230660.

15

Chen, Yong. "Constructing a language model based on data mining techniques for a Chinese character recognition system." 2004. http://sunzi.lib.hku.hk/hkuto/record/B30708527.

16

Chen, Yong, and 陳勇. "Constructing a language model based on data mining techniques for a Chinese character recognition system." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2004. http://hub.hku.hk/bib/B44570193.

17

Pang, Bo. "Handwriting Chinese character recognition based on quantum particle swarm optimization support vector machine." Thesis, University of Macau, 2018. http://umaclib3.umac.mo/record=b3950620.

18

林碧 and Bik Lum. "A rule-based analysis system for Chinese sentences." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1989. http://hub.hku.hk/bib/B31208769.

19

葉賜權 and Chee-kuen Yip. "Machine recognition of multi-font printed Chinese Characters." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1990. http://hub.hku.hk/bib/B31210120.

20

Ho, Yuen-ying, and 何婉瑩. "The effect of introducing a computer software in enhancing comprehension of classical Chinese text." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1995. http://hub.hku.hk/bib/B31957869.

21

He, Tingting, and 何婷婷. "A study on several problems in online handwritten Chinese character recognition." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2008. http://hub.hku.hk/bib/B42182086.

22

Shen, Jingdi. "Regional Lexical Variation in Modern Written Chinese: Analysis and Characterization Using Geo-Tagged Social Media Data." The Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu1531845935585073.

23

李嘉雯 and Ka-man Carmen Lee. "Chinese and English computer-mediated communication in the context of New Literacy Studies." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2002. http://hub.hku.hk/bib/B29872959.

24

陸穎剛 and Wing-kong Luk. "Concept space approach for cross-lingual information retrieval." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2000. http://hub.hku.hk/bib/B30147724.

25

Chan, Wai-man, and 陳偉文. "Medical document management system using XML." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2001. http://hub.hku.hk/bib/B31224039.

26

"Chinese character processing." Chinese University of Hong Kong, 1987. http://library.cuhk.edu.hk/record=b5885798.

27

"A comprehensive Chinese thesaurus system." Chinese University of Hong Kong, 1995. http://library.cuhk.edu.hk/record=b5888467.

Abstract:
by Chen Hong Yi.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1995.
Includes bibliographical references (leaves 62-65).
Abstract --- p.ii
Acknowledgement --- p.iv
List of Tables --- p.viii
List of Figures --- p.ix
Chapter 1 --- Introduction --- p.1
Chapter 2 --- Background Information And Thesis Scope --- p.6
Chapter 2.1 --- Basic Concepts and Terminologies --- p.6
Chapter 2.1.1 --- Semantic Classification Of A Word --- p.6
Chapter 2.1.2 --- Relationship Link And Relationship Type --- p.7
Chapter 2.1.3 --- Semantic Closeness, Link Weight And Semantic Distance --- p.8
Chapter 2.1.4 --- Thesaurus Model And Semantic Net --- p.9
Chapter 2.1.5 --- Thesaurus Building And Maintaining Tool --- p.9
Chapter 2.2 --- Chinese Information Processing --- p.9
Chapter 2.2.1 --- The Segmentation of Chinese Words --- p.10
Chapter 2.2.2 --- The Ambiguity of Chinese Words --- p.10
Chapter 2.2.3 --- Multiple Chinese Character Code Set Standards --- p.11
Chapter 2.3 --- Related Work --- p.11
Chapter 2.4 --- Thesis Scope --- p.13
Chapter 3 --- System Design Principles --- p.15
Chapter 3.1 --- Application Context Of TheSys --- p.15
Chapter 3.2 --- Overall System Architecture --- p.16
Chapter 3.3 --- Entry-Term Construct And Thesaurus Frame --- p.19
Chapter 3.3.1 --- Words, Entry Terms And Entry Term Construct --- p.21
Chapter 3.3.2 --- Semanteme, Relationship And Thesaurus Frame --- p.23
Chapter 3.3.3 --- Dealing With Term Ambiguity --- p.28
Chapter 3.4 --- Weighting Scheme --- p.33
Chapter 3.4.1 --- Assumption --- p.33
Chapter 3.4.2 --- Quantify The Relevancy Between Two Directly Linked Concepts --- p.34
Chapter 3.4.3 --- Quantify The Relevancy Between Two Indirectly Linked Concepts --- p.35
Chapter 3.5 --- Term Ranking --- p.38
Chapter 3.6 --- Thesaurus Module and Maintenance Module --- p.39
Chapter 3.6.1 --- The Procedure Of Building A Thesaurus --- p.40
Chapter 3.6.2 --- Thesaurus Nomination --- p.41
Chapter 3.6.3 --- Semantic Classification Tree Construction --- p.41
Chapter 3.6.4 --- Relation Type Definition --- p.42
Chapter 3.6.5 --- Entry Term Construct Construction --- p.42
Chapter 3.6.6 --- Thesaurus Frame Construction --- p.43
Chapter 3.6.7 --- Thesaurus Query --- p.44
Chapter 4 --- System Implementation --- p.45
Chapter 4.1 --- Data Structure --- p.45
Chapter 4.1.1 --- Entry Term Construct --- p.45
Chapter 4.1.2 --- Thesaurus Frame --- p.49
Chapter 4.2 --- API --- p.50
Chapter 4.3 --- User Interface --- p.54
Chapter 4.3.1 --- Widget And Its Callback --- p.54
Chapter 4.3.2 --- Bilingual User Interface --- p.55
Chapter 4.3.3 --- Chinese Character Input Method --- p.57
Chapter 5 --- Conclusion And Future Work --- p.60
Chapter A --- System Installation --- p.66
Chapter A.1 --- Files In TheSys --- p.67
Chapter A.2 --- Employ TheSys As Application Package --- p.70
Chapter A.3 --- Set Up TheSys With UI --- p.71
Chapter A.4 --- Verify The Word Using External Dictionary --- p.74
Chapter B --- API Description --- p.77
Chapter B.1 --- thesys.h File --- p.77
Chapter B.2 --- API Reference --- p.82
Chapter C --- User Interface Reference --- p.108
28

"A generic Chinese PAT tree data structure for Chinese documents clustering." 2002. http://library.cuhk.edu.hk/record=b5891265.

Abstract:
Kwok Chi Leong.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2002.
Includes bibliographical references (leaves 122-127).
Abstracts in English and Chinese.
Abstract --- p.ii
Acknowledgment --- p.vi
Table of Contents --- p.vii
List of Tables --- p.x
List of Figures --- p.xi
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Contributions --- p.2
Chapter 1.2 --- Thesis Overview --- p.3
Chapter 2 --- Background Information --- p.5
Chapter 2.1 --- Documents Clustering --- p.5
Chapter 2.1.1 --- Review of Clustering Techniques --- p.5
Chapter 2.1.2 --- Suffix Tree Clustering --- p.7
Chapter 2.2 --- Chinese Information Processing --- p.8
Chapter 2.2.1 --- Sentence Segmentation --- p.8
Chapter 2.2.2 --- Keyword Extraction --- p.10
Chapter 3 --- The Generic Chinese PAT Tree --- p.12
Chapter 3.1 --- PAT Tree --- p.13
Chapter 3.1.1 --- Patricia Tree --- p.13
Chapter 3.1.2 --- Semi-Infinite String --- p.14
Chapter 3.1.3 --- Structure of Tree Nodes --- p.17
Chapter 3.1.4 --- Some Examples of PAT Tree --- p.22
Chapter 3.1.5 --- Storage Complexity --- p.24
Chapter 3.2 --- The Chinese PAT Tree --- p.26
Chapter 3.2.1 --- The Chinese PAT Tree Structure --- p.26
Chapter 3.2.2 --- Some Examples of Chinese PAT Tree --- p.30
Chapter 3.2.3 --- Storage Complexity --- p.33
Chapter 3.3 --- The Generic Chinese PAT Tree --- p.34
Chapter 3.3.1 --- Structure Overview --- p.34
Chapter 3.3.2 --- Structure of Tree Nodes --- p.35
Chapter 3.3.3 --- Essential Node --- p.37
Chapter 3.3.4 --- Some Examples of the Generic Chinese PAT Tree --- p.41
Chapter 3.3.5 --- Storage Complexity --- p.45
Chapter 3.4 --- Problems of Embedded Nodes --- p.46
Chapter 3.4.1 --- The Reduced Structure --- p.47
Chapter 3.4.2 --- Disadvantages of Reduced Structure --- p.48
Chapter 3.4.3 --- A Case Study of Reduced Design --- p.50
Chapter 3.4.4 --- Experiments on Frequency Mismatch --- p.51
Chapter 3.5 --- Strengths of the Generic Chinese PAT Tree --- p.55
Chapter 4 --- Performance Analysis on the Generic Chinese PAT Tree --- p.58
Chapter 4.1 --- The Construction of the Generic Chinese PAT Tree --- p.59
Chapter 4.2 --- Counting the Essential Nodes --- p.61
Chapter 4.3 --- Performance of Various PAT Trees --- p.62
Chapter 4.4 --- The Implementation Analysis --- p.64
Chapter 4.4.1 --- Pure Dynamic Memory Allocation --- p.64
Chapter 4.4.2 --- Node Production Factory Approach --- p.66
Chapter 4.4.3 --- Experiment Result of the Factory Approach --- p.68
Chapter 5 --- The Chinese Documents Clustering --- p.70
Chapter 5.1 --- The Clustering Framework --- p.70
Chapter 5.1.1 --- Documents Cleaning --- p.73
Chapter 5.1.2 --- PAT Tree Construction --- p.76
Chapter 5.1.3 --- Essential Node Extraction --- p.77
Chapter 5.1.4 --- Base Clusters Detection --- p.80
Chapter 5.1.5 --- Base Clusters Filtering --- p.86
Chapter 5.1.6 --- Base Clusters Combining --- p.94
Chapter 5.1.7 --- Documents Assigning --- p.95
Chapter 5.1.8 --- Result Presentation --- p.96
Chapter 5.2 --- Discussion --- p.96
Chapter 5.2.1 --- Flexibility of Our Framework --- p.96
Chapter 5.2.2 --- Our Clustering Model --- p.97
Chapter 5.2.3 --- More About Clusters Detection --- p.98
Chapter 5.2.4 --- Analysis and Complexity --- p.100
Chapter 6 --- Evaluations on the Chinese Documents Clustering --- p.101
Chapter 6.1 --- Details of Experiment --- p.101
Chapter 6.1.1 --- Parameter of Weighted Frequency --- p.105
Chapter 6.1.2 --- Effect of CLP Analysis --- p.105
Chapter 6.1.3 --- Result of Clustering --- p.108
Chapter 6.2 --- Clustering on Larger Collection --- p.109
Chapter 6.2.1 --- Comparing the Base Clusters --- p.109
Chapter 6.2.2 --- Result of Clustering --- p.111
Chapter 6.2.3 --- Discussion --- p.112
Chapter 6.3 --- Clustering with Part of Documents --- p.113
Chapter 6.3.1 --- Clustering with News Headlines --- p.114
Chapter 6.3.2 --- Clustering with News Abstract --- p.117
Chapter 7 --- Conclusion --- p.119
Bibliography --- p.122
29

"Text compression for Chinese documents." Chinese University of Hong Kong, 1995. http://library.cuhk.edu.hk/record=b5888571.

Abstract:
by Chi-kwun Kan.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1995.
Includes bibliographical references (leaves 133-137).
Abstract --- p.i
Acknowledgement --- p.iii
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Importance of Text Compression --- p.1
Chapter 1.2 --- Historical Background of Data Compression --- p.2
Chapter 1.3 --- The Essences of Data Compression --- p.4
Chapter 1.4 --- Motivation and Objectives of the Project --- p.5
Chapter 1.5 --- Definition of Important Terms --- p.6
Chapter 1.5.1 --- Data Models --- p.6
Chapter 1.5.2 --- Entropy --- p.10
Chapter 1.5.3 --- Statistical and Dictionary-based Compression --- p.12
Chapter 1.5.4 --- Static and Adaptive Modelling --- p.12
Chapter 1.5.5 --- One-Pass and Two-Pass Modelling --- p.13
Chapter 1.6 --- Benchmarks and Measurements of Results --- p.15
Chapter 1.7 --- Sources of Testing Data --- p.16
Chapter 1.8 --- Outline of the Thesis --- p.16
Chapter 2 --- Literature Survey --- p.18
Chapter 2.1 --- Data compression Algorithms --- p.18
Chapter 2.1.1 --- Statistical Compression Methods --- p.18
Chapter 2.1.2 --- Dictionary-based Compression Methods (Ziv-Lempel Family) --- p.23
Chapter 2.2 --- Cascading of Algorithms --- p.33
Chapter 2.3 --- Problems of Current Compression Programs on Chinese --- p.34
Chapter 2.4 --- Previous Chinese Data Compression Literatures --- p.37
Chapter 3 --- Chinese-related Issues --- p.38
Chapter 3.1 --- Characteristics in Chinese Data Compression --- p.38
Chapter 3.1.1 --- Large and Not Fixed Size Character Set --- p.38
Chapter 3.1.2 --- Lack of Word Segmentation --- p.40
Chapter 3.1.3 --- Rich Semantic Meaning of Chinese Characters --- p.40
Chapter 3.1.4 --- Grammatical Variance of Chinese Language --- p.41
Chapter 3.2 --- Definition of Different Coding Schemes --- p.41
Chapter 3.2.1 --- Big5 Code --- p.42
Chapter 3.2.2 --- GB (Guo Biao) Code --- p.43
Chapter 3.2.3 --- Unicode --- p.44
Chapter 3.2.4 --- HZ (Hanzi) Code --- p.45
Chapter 3.3 --- Entropy of Chinese and Other Languages --- p.45
Chapter 4 --- Huffman Coding on Chinese Text --- p.49
Chapter 4.1 --- The use of the Chinese Character Identification Routine --- p.50
Chapter 4.2 --- Result --- p.51
Chapter 4.3 --- Justification of the Result --- p.53
Chapter 4.4 --- Time and Memory Resources Analysis --- p.58
Chapter 4.5 --- The Heuristic Order-n Huffman Coding for Chinese Text Compression --- p.61
Chapter 4.5.1 --- The Algorithm --- p.62
Chapter 4.5.2 --- Result --- p.63
Chapter 4.5.3 --- Justification of the Result --- p.64
Chapter 4.6 --- Chapter Conclusion --- p.66
Chapter 5 --- The Ziv-Lempel Compression on Chinese Text --- p.67
Chapter 5.1 --- The Chinese LZSS Compression --- p.68
Chapter 5.1.1 --- The Algorithm --- p.69
Chapter 5.1.2 --- Result --- p.73
Chapter 5.1.3 --- Justification of the Result --- p.74
Chapter 5.1.4 --- Time and Memory Resources Analysis --- p.80
Chapter 5.1.5 --- Effects in Controlling the Parameters --- p.81
Chapter 5.2 --- The Chinese LZW Compression --- p.92
Chapter 5.2.1 --- The Algorithm --- p.92
Chapter 5.2.2 --- Result --- p.94
Chapter 5.2.3 --- Justification of the Result --- p.95
Chapter 5.2.4 --- Time and Memory Resources Analysis --- p.97
Chapter 5.2.5 --- Effects in Controlling the Parameters --- p.98
Chapter 5.3 --- A Comparison of the performance of the LZSS and the LZW --- p.100
Chapter 5.4 --- Chapter Conclusion --- p.101
Chapter 6 --- Chinese Dictionary-based Huffman coding --- p.103
Chapter 6.1 --- The Algorithm --- p.104
Chapter 6.2 --- Result --- p.107
Chapter 6.3 --- Justification of the Result --- p.108
Chapter 6.4 --- Effects of Changing the Size of the Dictionary --- p.111
Chapter 6.5 --- Chapter Conclusion --- p.114
Chapter 7 --- Cascading of Huffman coding and LZW compression --- p.116
Chapter 7.1 --- Static Cascading Model --- p.117
Chapter 7.1.1 --- The Algorithm --- p.117
Chapter 7.1.2 --- Result --- p.120
Chapter 7.1.3 --- Explanation and Analysis of the Result --- p.121
Chapter 7.2 --- Adaptive (Dynamic) Cascading Model --- p.125
Chapter 7.2.1 --- The Algorithm --- p.125
Chapter 7.2.2 --- Result --- p.126
Chapter 7.2.3 --- Explanation and Analysis of the Result --- p.127
Chapter 7.3 --- Chapter Conclusion --- p.128
Chapter 8 --- Concluding Remarks --- p.129
Chapter 8.1 --- Conclusion --- p.129
Chapter 8.2 --- Future Work Direction --- p.130
Chapter 8.2.1 --- Improvement in Efficiency and Resources Consumption --- p.130
Chapter 8.2.2 --- The Compressibility of Chinese and Other Languages --- p.131
Chapter 8.2.3 --- Use of Grammar Model --- p.131
Chapter 8.2.4 --- Lossy Compression --- p.131
Chapter 8.3 --- Epilogue --- p.132
Bibliography --- p.133
30

"Lexical and sublexical processing in Chinese character recognition." 2013. http://library.cuhk.edu.hk/record=b5884442.

Abstract:
Mo, Deyuan.
Thesis (Ph.D.)--Chinese University of Hong Kong, 2013.
Includes bibliographical references (leaves 153-167).
Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Abstract also in Chinese; appendixes include Chinese.
31

"Domain-optimized Chinese speech generation." 2001. http://library.cuhk.edu.hk/record=b5890609.

Abstract:
Fung Tien Ying.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2001.
Includes bibliographical references (leaves 119-128).
Abstracts in English and Chinese.
Abstract --- p.1
Acknowledgement --- p.1
List of Figures --- p.7
List of Tables --- p.11
Chapter 1 --- Introduction --- p.14
Chapter 1.1 --- General Trends on Speech Generation --- p.15
Chapter 1.2 --- Domain-Optimized Speech Generation in Chinese --- p.16
Chapter 1.3 --- Thesis Organization --- p.17
Chapter 2 --- Background --- p.19
Chapter 2.1 --- Linguistic and Phonological Properties of Chinese --- p.19
Chapter 2.1.1 --- Articulation --- p.20
Chapter 2.1.2 --- Tones --- p.21
Chapter 2.2 --- Previous Development in Speech Generation --- p.22
Chapter 2.2.1 --- Articulatory Synthesis --- p.23
Chapter 2.2.2 --- Formant Synthesis --- p.24
Chapter 2.2.3 --- Concatenative Synthesis --- p.25
Chapter 2.2.4 --- Existing Systems --- p.31
Chapter 2.3 --- Our Speech Generation Approach --- p.35
Chapter 3 --- Corpus-based Syllable Concatenation: A Feasibility Test --- p.37
Chapter 3.1 --- Capturing Syllable Coarticulation with Distinctive Features --- p.39
Chapter 3.2 --- Creating a Domain-Optimized Wavebank --- p.41
Chapter 3.2.1 --- Generate-and-Filter --- p.44
Chapter 3.2.2 --- Waveform Segmentation --- p.47
Chapter 3.3 --- The Use of Multi-Syllable Units --- p.49
Chapter 3.4 --- Unit Selection for Concatenative Speech Output --- p.50
Chapter 3.5 --- A Listening Test --- p.51
Chapter 3.6 --- Chapter Summary --- p.52
Chapter 4 --- Scalability and Portability to the Stocks Domain --- p.55
Chapter 4.1 --- Complexity of the ISIS Responses --- p.56
Chapter 4.2 --- XML for input semantic and grammar representation --- p.60
Chapter 4.3 --- Tree-Based Filtering Algorithm --- p.63
Chapter 4.4 --- Energy Normalization --- p.67
Chapter 4.5 --- Chapter Summary --- p.69
Chapter 5 --- Investigation in Tonal Contexts --- p.71
Chapter 5.1 --- The Nature of Tones --- p.74
Chapter 5.1.1 --- Human Perception of Tones --- p.75
Chapter 5.2 --- Relative Importance of Left and Right Tonal Context --- p.77
Chapter 5.2.1 --- Tonal Contexts in the Date-Time Subgrammar --- p.77
Chapter 5.2.2 --- Tonal Contexts in the Numeric Subgrammar --- p.82
Chapter 5.2.3 --- Conclusion regarding the Relative Importance of Left versus Right Tonal Contexts --- p.86
Chapter 5.3 --- Selection Scheme for Tonal Variants --- p.86
Chapter 5.3.1 --- Listening Test for our Tone Backoff Scheme --- p.90
Chapter 5.3.2 --- Error Analysis --- p.92
Chapter 5.4 --- Chapter Summary --- p.94
Chapter 6 --- Summary and Future Work --- p.95
Chapter 6.1 --- Contributions --- p.97
Chapter 6.2 --- Future Directions --- p.98
Chapter A --- Listening Test Questionnaire for FOREX Response Generation --- p.100
Chapter B --- Major Response Types For ISIS --- p.102
Chapter C --- Recording Corpus for Tone Investigation in Date-time Subgrammar --- p.105
Chapter D --- Statistical Test for Left Tonal Context --- p.109
Chapter E --- Statistical Test for Right Tonal Context --- p.112
Chapter F --- Listening Test Questionnaire for Backoff Unit Selection Scheme --- p.115
Chapter G --- Statistical Test for the Backoff Unit Selection Scheme --- p.117
Chapter H --- Statistical Test for the Backoff Unit Selection Scheme --- p.118
Bibliography --- p.119
32

"Text segmentation and error detection for Chinese spell checking." 1999. http://library.cuhk.edu.hk/record=b5890046.

Abstract:
Ng Mau Kit Michael.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1999.
Includes bibliographical references (leaves 117-120).
Abstract and appendix in English and Chinese.
Abstract --- p.i
Acknowledgments --- p.iv
Chapter 1 --- Introduction --- p.1
Chapter 2 --- Background Knowledge and Basic Concepts --- p.7
Chapter 2.1 --- Classification of Natural Languages --- p.7
Chapter 2.2 --- Chinese Spell Checking --- p.9
Chapter 2.3 --- Characteristics of Chinese --- p.12
Chapter 2.3.1 --- Word Frequency and Statistical Information of Chinese Words --- p.12
Chapter 2.3.2 --- Chinese Grammar --- p.15
Chapter 2.3.2.1 --- Word Class --- p.15
Chapter 2.3.2.2 --- Grammar Rules --- p.17
Chapter 3 --- Problems with Chinese Spell Checking and Related Work --- p.18
Chapter 3.1 --- Ambiguities --- p.19
Chapter 3.2 --- Unknown Words --- p.20
Chapter 3.3 --- Text Errors --- p.21
Chapter 3.4 --- Combinatory Explosion --- p.23
Chapter 3.5 --- Related Work --- p.26
Chapter 4 --- The Chinese Spell Checking System --- p.33
Chapter 4.1 --- Architecture of the Chinese Spell Checking System (CSCS) --- p.35
Chapter 4.2 --- The Segmenter and the Error Detector --- p.39
Chapter 5 --- The Block-of-Combinations Segmentation Algorithm and Error Detection --- p.42
Chapter 5.1 --- Single-character-word Function --- p.43
Chapter 5.2 --- Segmentation Strategy --- p.46
Chapter 5.3 --- Maximum Number of Combinations of the BOC --- p.51
Chapter 5.4 --- A Case Study of the BOC --- p.54
Chapter 5.5 --- Evaluation of the BOC --- p.59
Chapter 5.5.1 --- Accuracy --- p.59
Chapter 5.5.2 --- Speed --- p.61
Chapter 5.5.3 --- Discussion --- p.62
Chapter 5.6 --- Experiments on Error Detection for the BOC --- p.63
Chapter 5.6.1 --- Experimental Results of the Error Detection for the BOC --- p.65
Chapter 6 --- The Genetic Algorithm Segmentation Method --- p.69
Chapter 6.1 --- Basic Concepts of Genetic Algorithm --- p.69
Chapter 6.2 --- Genetic Algorithm Model --- p.73
Chapter 6.2.1 --- Chromosome Representation --- p.75
Chapter 6.2.2 --- The Flow of the GAS --- p.76
Chapter 6.2.2.1 --- Crossover --- p.77
Chapter 6.2.2.2 --- Replacement --- p.78
Chapter 6.2.2.3 --- Mutation --- p.80
Chapter 6.2.2.4 --- Termination Criteria --- p.80
Chapter 6.2.3 --- Fitness Function --- p.81
Chapter 6.2.3.1 --- Single-character-word Function --- p.82
Chapter 6.2.3.2 --- Known-word Function and Unknown-word Function --- p.83
Chapter 6.2.3.3 --- Grammar Rules Scoring Function --- p.83
Chapter 6.3 --- Maximum Number of Combinations of the GAS --- p.86
Chapter 6.4 --- Evaluation of the GAS --- p.86
Chapter 6.5 --- Discussion --- p.88
Chapter 7 --- The Improved-BOC Algorithm for Handling Unknown Words and Errors --- p.90
Chapter 7.1 --- Segmentation Principle of the Improved-BOC Method --- p.91
Chapter 7.2 --- Improvement of the Scoring Function --- p.93
Chapter 7.2.1 --- The Choice of Grammar Rules --- p.93
Chapter 7.2.2 --- Phrase-structure Style --- p.96
Chapter 7.2.3 --- Computer Model of Grammar Rules for Handling Unknown Words --- p.98
Chapter 7.3 --- Evaluation of Segmentation --- p.102
Chapter 7.4 --- Error Detection --- p.104
Chapter 7.4.1 --- Evaluation of Error Detection --- p.106
Chapter 7.5 --- Discussion --- p.108
Chapter 7.6 --- Comparison between the MM, BOC, GA and Improved-BOC --- p.109
Chapter 8 --- Conclusion --- p.114
Bibliography --- p.117
Appendix A: Sample Result of the Genetic Algorithm Segmentation Method --- p.121
Appendix B: Set of Grammar Rules --- p.123
33

"Free-style phonetic input of Chinese." Chinese University of Hong Kong, 1993. http://library.cuhk.edu.hk/record=b5887712.

Abstract:
by Lau Chi Ching, Donny.
Thesis (M.Sc.)--Chinese University of Hong Kong, 1993.
Includes bibliographical references (leaves [71]).
Chapter 1. --- Introduction
Chapter 1.1 --- Introduction --- p.1
Chapter 1.2 --- Comparison of Phonetic and Written Character Input --- p.2
Chapter 1.3 --- Significance of Phonetic Input --- p.4
Chapter 1.4 --- Drawbacks of Current Phonetic Input Methods --- p.4
Chapter 2. --- Objectives of the Research
Chapter 2.1 --- Main Objectives --- p.6
Chapter 2.2 --- User Background Pre-requisite --- p.8
Chapter 2.3 --- Roman-Spelling (Recommended Phonetic Scheme) --- p.9
Chapter 2.4 --- User Input and the Output Scenario --- p.10
Chapter 2.5 --- Outline of Free-Style Phonetic Input Processing --- p.15
Chapter 3. --- Lexical Analyser
Chapter 3.1 --- Overview of Lexical Analyser --- p.17
Chapter 3.2 --- Identification of Character Boundary --- p.19
Chapter 3.3 --- Lexical Tree --- p.20
Chapter 4. --- Selection Module
Chapter 4.1 --- Overview of Selection Module --- p.23
Chapter 4.2 --- Fault-tolerance Capability --- p.24
Chapter 4.3 --- Group Table (Groups of Similar Sounds) --- p.26
Chapter 4.4 --- Distance Calculation Algorithm --- p.30
Chapter 4.4.1 --- Character Dictionary --- p.31
Chapter 4.4.2 --- Phrase Dictionary --- p.33
Chapter 4.4.3 --- Hashing Key of the Dictionaries --- p.35
Chapter 4.4.4 --- Maintenance of Dictionaries --- p.36
Chapter 4.4.5 --- Distance Calculation of Character Input --- p.37
Chapter 4.4.5.1 --- Examples of Character Output --- p.39
Chapter 4.4.6 --- Distance Calculation of Phrase Input --- p.40
Chapter 4.4.6.1 --- Examples of Phrase Output --- p.44
Chapter 4.4.7 --- Explanation of Algorithm --- p.45
Chapter 5. --- Syntax Analyser
Chapter 5.1 --- Overview of Syntax Analyser --- p.46
Chapter 5.2 --- Overview of a Chinese Simple Sentence --- p.47
Chapter 5.3 --- Testing Simple Sentence Rules --- p.48
Chapter 5.3.1 --- NDFA for Chinese Grammar Rules --- p.49
Chapter 5.4 --- Syntax Analysis Algorithm --- p.51
Chapter 5.4.1 --- Explanation of Algorithm --- p.52
Chapter 5.4.2 --- Justification of Algorithm --- p.54
Chapter 5.4.3 --- Examples of Syntax Analysis --- p.55
Chapter 5.5 --- Parse Tree for Semantic Analysis --- p.59
Chapter 6. --- Division of Technical Work --- p.61
Chapter 7. --- Applied Areas of the Research
Chapter 7.1 --- Chinese User Interface with Operating System --- p.63
Chapter 7.2 --- Bilingual Programming Language Editor --- p.64
Chapter 7.3 --- Development of a Chinese Programming Language --- p.66
Chapter 7.4 --- Putonghua Training --- p.67
Chapter 8. --- Conclusions and Future Improvements
Chapter 8.1 --- Conclusions --- p.68
Chapter 8.2 --- Future Improvements --- p.69
References
Appendix A
34

"Rasterization techniques for Chinese outline fonts." Chinese University of Hong Kong, 1994. http://library.cuhk.edu.hk/record=b5887282.

Abstract:
Kwong-ho Wu.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1994.
Includes bibliographical references (leaves 72-75).
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Outline Fonts --- p.2
Chapter 1.1.1 --- Advantages and Disadvantages --- p.4
Chapter 1.1.2 --- Representations --- p.4
Chapter 1.1.3 --- Rasterization --- p.5
Chapter 1.2 --- Introduction to This Thesis --- p.6
Chapter 1.2.1 --- Objectives --- p.7
Chapter 1.2.2 --- Organization --- p.7
Chapter 2 --- Chinese Characters Fonts --- p.8
Chapter 2.1 --- Large Character Set --- p.8
Chapter 2.2 --- Font Styles --- p.8
Chapter 2.3 --- Storage Problems --- p.9
Chapter 2.4 --- Hierarchical Structure --- p.10
Chapter 2.5 --- High Stroke Count --- p.11
Chapter 3 --- Rasterization --- p.13
Chapter 3.1 --- The Basic Rasterization --- p.13
Chapter 3.1.1 --- Scan Conversion --- p.14
Chapter 3.1.2 --- Filling Outline --- p.16
Chapter 3.2 --- Font Rasterization --- p.17
Chapter 3.2.1 --- Outline Scaling --- p.17
Chapter 3.2.2 --- Hintings --- p.17
Chapter 3.2.3 --- Basic Rasterization Approach for Chinese Fonts --- p.18
Chapter 3.3 --- Hintings --- p.20
Chapter 3.3.1 --- Phase Control --- p.20
Chapter 3.3.2 --- Auto-Hints --- p.21
Chapter 3.3.3 --- Storage of Hintings Information in TrueType Font and Postscript Font --- p.22
Chapter 4 --- An Improved Chinese Font Rasterizer --- p.24
Chapter 4.1 --- Floating Point Avoidance --- p.24
Chapter 4.2 --- Filling --- p.25
Chapter 4.2.1 --- Filling with Horizontal Scan Line --- p.25
Chapter 4.2.2 --- Filling with Vertical Scan Line --- p.27
Chapter 4.3 --- Hintings --- p.30
Chapter 4.3.1 --- Assumptions --- p.30
Chapter 4.3.2 --- Maintaining Regular Strokes Width --- p.30
Chapter 4.3.3 --- Maintaining Regular Spacing Among Strokes --- p.34
Chapter 4.3.4 --- Hintings of Single Stroke Contour --- p.42
Chapter 4.3.5 --- Storing the Hinting Information in Font File --- p.49
Chapter 4.4 --- A Rasterization Algorithm for Printing --- p.51
Chapter 4.4.1 --- A Simple Algorithm for Generating Smooth Characters --- p.52
Chapter 4.4.2 --- Algorithm --- p.54
Chapter 4.4.3 --- Results --- p.54
Chapter 5 --- Experiments --- p.56
Chapter 5.1 --- Apparatus --- p.56
Chapter 5.2 --- Experiments for Investigating Rasterization Speed --- p.56
Chapter 5.2.1 --- Investigation into the Effects of Features of Chinese Fonts on Rasterization Time --- p.56
Chapter 5.2.2 --- Improvement of Fast Rasterizer --- p.57
Chapter 5.2.3 --- Details of Experiments --- p.57
Chapter 5.3 --- Experiments for Rasterization Speed of Font File with Hints --- p.57
Chapter 6 --- Results and Conclusions --- p.58
Chapter 6.1 --- Observations --- p.58
Chapter 6.1.1 --- Relationship Between Time for Rasterization and Stroke Count --- p.58
Chapter 6.1.2 --- Effects of Style --- p.61
Chapter 6.1.3 --- Investigation into the Observed Relationship --- p.62
Chapter 6.2 --- Improvement of the Improved Rasterizer --- p.64
Chapter 6.3 --- Gain and Cost of Inserting Hints into Font File --- p.68
Chapter 6.3.1 --- Cost --- p.68
Chapter 6.3.2 --- Gain --- p.68
Chapter 6.4 --- Conclusions --- p.69
Chapter 6.5 --- Future Work --- p.69
Appendix
35

"Chinese window system with distributed fonts." Chinese University of Hong Kong, 1990. http://library.cuhk.edu.hk/record=b5886624.

Abstract:
Cheang Sio Man.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1990.
Bibliography: leaves [103-106]
Chapter 1. --- THE EMERGENCE OF WINDOW SYSTEMS --- p.1-1
Chapter 2. --- THE NEED OF A CHINESE WINDOW SYSTEM --- p.2-1
Chapter 3. --- REQUIREMENTS AND DIFFICULTIES OF DEVELOPING A CHINESE WINDOW SYSTEM --- p.3-1
Chapter 3.1. --- Input Method and Character Encoding --- p.3-1
Chapter 3.2. --- Layout Direction and Formatting Mechanism --- p.3-3
Chapter 3.3. --- Fonts --- p.3-3
Chapter 3.3.1. --- Bitmap font --- p.3-4
Chapter 3.3.2. --- Outline font --- p.3-6
Chapter 4. --- A TRIAL TO OVERCOME THE DIFFICULTIES IN SUPPORTING CHINESE FONTS - OVERVIEW OF A CHINESE FONT SERVER SYSTEM --- p.4-1
Chapter 4.1. --- Network Font Server --- p.4-3
Chapter 4.2. --- Local Font Server --- p.4-4
Chapter 4.3. --- Fonts --- p.4-5
Chapter 4.3.1. --- Bitmap font --- p.4-5
Chapter 4.3.2. --- Outline font --- p.4-5
Chapter 4.4. --- Caching --- p.4-6
Chapter 5. --- ORGANIZATION OF THE CHINESE FONT SERVER SYSTEM --- p.5-1
Chapter 5.1. --- Communication Module --- p.5-2
Chapter 5.1.1. --- Client connection request channel --- p.5-3
Chapter 5.1.2. --- Client communication channels --- p.5-3
Chapter 5.1.3. --- Network server connection channel --- p.5-4
Chapter 5.2. --- Client Service Module --- p.5-7
Chapter 5.2.1. --- Font manipulation module --- p.5-7
Chapter 5.2.1.1. --- Request to open a new font --- p.5-8
Chapter 5.2.1.2. --- Request to close an opened font --- p.5-8
Chapter 5.2.1.3. --- Request to load a font character --- p.5-9
Chapter 5.2.2. --- Cache module --- p.5-10
Chapter 6. --- FROM THE CHINESE FONT SERVER SYSTEM TO A CHINESE WINDOW SYSTEM --- p.6-4
Chapter 7. --- SCREEN FONTS --- p.7-1
Chapter 7.1. --- Hand-edit --- p.7-3
Chapter 7.2. --- Bitmap Scaling --- p.7-3
Chapter 7.3. --- Outline Scaling --- p.7-5
Chapter 7.4. --- Manual Refinement --- p.7-16
Chapter 8. --- FONT CACHING --- p.8-1
Chapter 8.1. --- Font Caching Strategies --- p.8-1
Chapter 8.1.1. --- Pre-loading --- p.8-1
Chapter 8.1.2. --- Fix-loading --- p.8-4
Chapter 8.1.3. --- Demand loading --- p.8-6
Chapter 8.1.3.1. --- Least Recently Used (LRU) replacement --- p.8-9
Chapter 8.1.3.2. --- Least Frequently Used (LFU) replacement --- p.8-9
Chapter 8.1.4. --- Hybrid loading --- p.8-16
Chapter 8.2. --- Retrieval Method --- p.8-22
Chapter 8.2.1. --- Binary searching --- p.8-22
Chapter 8.2.2. --- Tree searching --- p.8-24
Chapter 8.2.3. --- Hash searching --- p.8-26
Chapter 8.3. --- Cache Expansion and Retraction --- p.8-33
Chapter 9. --- AN EXPERIMENTAL CHINESE FONT SERVER SYSTEM - CAPABILITIES AND RESTRICTIONS --- p.9-1
Chapter 9.1. --- Experimental Servers --- p.9-1
Chapter 9.2. --- Programming Interfaces --- p.9-3
Chapter 9.2.1. --- Connection request --- p.9-3
Chapter 9.2.2. --- Open and close fonts --- p.9-4
Chapter 9.2.3. --- Request to load cache --- p.9-5
Chapter 9.2.4. --- Change the current font --- p.9-5
Chapter 9.2.5. --- Request a font character --- p.9-5
Chapter 9.3. --- Testing Applications --- p.9-6
Chapter 9.4. --- Statistics --- p.9-8
Chapter 9.4.1. --- Cache performance --- p.9-8
Chapter 9.4.1.1. --- Tests --- p.9-8
Chapter 9.4.1.2. --- Results --- p.9-10
Chapter 9.4.1.3. --- Discussion --- p.9-10
Chapter 9.4.2. --- Local Server Vs. Network Server --- p.9-12
Chapter 9.4.2.1. --- Tests --- p.9-12
Chapter 9.4.2.2. --- Results --- p.9-13
Chapter 9.4.2.3. --- Discussion --- p.9-13
Chapter 9.4.3. --- Outline Font --- p.9-14
Chapter 9.4.3.1. --- Tests --- p.9-14
Chapter 9.4.3.2. --- Results --- p.9-14
Chapter 9.4.3.3. --- Discussion --- p.9-15
Chapter 10. --- EPILOGUE --- p.10-1
Chapter 10.1. --- Conclusion --- p.10-1
Chapter 10.2. --- Future Extension --- p.10-2
36

"The word segmentation & part-of-speech tagging system for the modern Chinese." Chinese University of Hong Kong, 1994. http://library.cuhk.edu.hk/record=b5888174.

Abstract:
Liu Hon-lung.
Title also in Chinese characters.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1994.
Includes bibliographical references (leaves [58-59]).
Chapter 1. --- Introduction --- p.1
Chapter 2. --- Word Segmentation and Part-of-Speech Tagging: Techniques, Current Researches and The Embraced Problems --- p.6
Chapter 2.1. --- Various Methods on Word Segmentation and Part-of-Speech Tagging --- p.6
Chapter 2.2. --- Current Researches on Word Segmentation and Part-of-Speech Tagging --- p.9
Chapter 2.3. --- Embraced Problems in Word Segmentation and Part-of-Speech Tagging --- p.9
Chapter 3. --- Branch-and-Bound Algorithm for Combinational Optimization of the Probabilistic Scoring Function --- p.15
Chapter 3.1. --- Definition of Word Segmentation and Part-of-Speech Tagging --- p.15
Chapter 3.2. --- Framework --- p.17
Chapter 3.3. --- Weight Assignment, Intermediate Score Computation & Optimization --- p.20
Chapter 4. --- Implementation Issues of the Proposed Word Segmentation and Part-of-Speech Tagging System --- p.26
Chapter 4.1. --- Design of System Dictionary and Data Structure --- p.30
Chapter 4.2. --- Training Process --- p.33
Chapter 4.3. --- Tagging Process --- p.35
Chapter 4.4. --- Tagging Samples of the Word Segmentation & Part-of-Speech Tagging System --- p.39
Chapter 5. --- Experiments on the Proposed Word Segmentation and Part-Of-Speech Tagging System --- p.41
Chapter 5.1. --- Closed Test --- p.41
Chapter 5.2. --- Open Test --- p.42
Chapter 6. --- Testing and Statistics --- p.43
Chapter 7. --- Conclusions and Discussions --- p.47
References
Appendices
Appendix A: sysdict.tag Sample
Appendix B: econ.tag Sample
Appendix C: open.tag Sample
Appendix D: 漢語分詞及詞性標注系統 (Chinese Word Segmentation and Part-of-Speech Tagging System) for Windows
Appendix E: Neural Network
37

"A DBMS query language in natural Chinese language form." Chinese University of Hong Kong, 1995. http://library.cuhk.edu.hk/record=b5888492.

Abstract:
by Lam Chin-keung.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1995.
Includes bibliographical references (leaves 129-135 (2nd gp.)).
ACKNOWLEDGMENTS --- p.I
ABSTRACT --- p.II
TABLE OF CONTENTS --- p.III
LIST OF FIGURES --- p.VI
LIST OF TABLES --- p.VIII
Chapter CHAPTER 1 --- INTRODUCTION --- p.1
Chapter 1.1 --- Motivations --- p.1
Chapter 1.2 --- Objectives --- p.3
Chapter 1.3 --- More to go --- p.3
Chapter 1.4 --- Chapter Summary --- p.4
Chapter CHAPTER 2 --- RELATED WORK --- p.6
Chapter 2.1 --- Chinese Related Work --- p.6
Chapter 2.1.1 --- Chinese Natural Language --- p.6
Chapter 2.1.2 --- Chinesized Query Language From English --- p.7
Chapter 2.2 --- High Level Database Query Language --- p.8
Chapter 2.2.1 --- Relational Algebra vs Relational Calculus --- p.9
Chapter 2.2.2 --- Procedural vs Declarative --- p.10
Chapter 2.2.3 --- Natural Language (NL) vs Restricted Natural Language (RNL) --- p.11
Chapter 2.3 --- Database Query Interface --- p.13
Chapter 2.3.1 --- Linear Textual Interface --- p.13
Chapter 2.3.2 --- Form-based Interface --- p.14
Chapter 2.3.3 --- Graphical Interface --- p.14
Chapter 2.4 --- Remarks --- p.14
Chapter CHAPTER 3 --- DESIGN PRINCIPLES --- p.16
Chapter 3.1 --- Underlying Data Model of the new language --- p.16
Chapter 3.2 --- Problems Under Attack --- p.17
Chapter 3.2.1 --- Naturalness --- p.17
Chapter 3.2.2 --- Procedural vs Declarative --- p.19
Chapter 3.2.3 --- Supports of Chinese Characters --- p.21
Chapter 3.3 --- Design Principles --- p.22
Chapter 3.4 --- Chapter Summary --- p.26
Chapter CHAPTER 4 --- LANGUAGE DEFINITION --- p.28
Chapter 4.1 --- Language Overview --- p.28
Chapter 4.2 --- The Data Manipulation Language --- p.29
Chapter 4.2.1 --- Relational Operators --- p.30
Chapter 4.2.2 --- Rail-Track Diagram of Chiql --- p.32
Chapter 4.2.3 --- The 11-template --- p.33
Chapter 4.2.4 --- Chiql Examples --- p.37
Chapter 4.2.5 --- Common Language Constructs --- p.39
Chapter 4.2.6 --- ONE issue about GROUP BY and RESTRICTION --- p.41
Chapter 4.3 --- Other Language Features --- p.42
Chapter 4.3.1 --- Aggregate Functions --- p.43
Chapter 4.3.2 --- Attribute Alias --- p.44
Chapter 4.3.3 --- Conditions in Chinese --- p.45
Chapter 4.3.4 --- Unquantified Predicates --- p.45
Chapter 4.3.5 --- Sorting --- p.47
Chapter 4.4 --- Treatment of Quantified Predicates --- p.48
Chapter 4.5 --- The Data Definition Language --- p.52
Chapter 4.5.1 --- Create Table --- p.52
Chapter 4.5.2 --- Drop Table --- p.54
Chapter 4.5.3 --- Alter Table --- p.54
Chapter 4.5.4 --- Insert Row --- p.56
Chapter 4.5.5 --- Delete Row --- p.56
Chapter 4.5.6 --- Update Row --- p.57
Chapter 4.5.7 --- Remarks on DDL --- p.58
Chapter 4.6 --- Chapter Summary --- p.59
Chapter CHAPTER 5 --- END-USER INTERFACE --- p.61
Chapter 5.1 --- EUI Overview --- p.61
Chapter 5.2 --- Design Principles --- p.62
Chapter 5.2.1 --- Language Independent Aspects --- p.62
Chapter 5.2.2 --- Language Dependent Aspects --- p.64
Chapter 5.3 --- Complex Condition Handling --- p.68
Chapter 5.4 --- Input Sequences of the EUI --- p.71
Chapter 5.5 --- Query Formulation: An Example --- p.73
Chapter 5.6 --- Chapter Summary --- p.85
Chapter CHAPTER 6 --- CHIQL TO SQL TRANSLATIONS --- p.86
Chapter 6.1 --- Related Work --- p.87
Chapter 6.2 --- Translation Overview --- p.87
Chapter 6.2.1 --- Pass One: Mapping (Input = Chiql, Output = multi-statement SQL) --- p.89
Chapter 6.2.2 --- Pass Two: Nesting (Input = multi-statement SQL, Output = single-statement SQL) --- p.92
Chapter 6.2.3 --- Technical Difficulties in Chiql/SQL Translation --- p.99
Chapter 6.3 --- Chapter Summary --- p.106
Chapter CHAPTER 7 --- EVALUATION --- p.108
Chapter 7.1 --- Expressiveness Test --- p.108
Chapter 7.1.1 --- Results --- p.109
Chapter 7.1.2 --- Implications --- p.111
Chapter 7.2 --- Usability Evaluation --- p.111
Chapter 7.2.1 --- Evaluation Methodology --- p.112
Chapter 7.2.2 --- Result: Completion Time --- p.113
Chapter 7.2.3 --- Result: Additional Help --- p.116
Chapter 7.2.4 --- Result: Query Error --- p.116
Chapter 7.2.5 --- Result: Overall Score --- p.118
Chapter 7.2.6 --- User Comments --- p.120
Chapter 7.3 --- Chapter Summary --- p.120
Chapter CHAPTER 8 --- CONCLUSIONS --- p.122
Chapter 8.1 --- Thesis Conclusions --- p.122
Chapter 8.2 --- Future Work --- p.124
REFERENCES
APPENDIX
38

"A natural language based indexing technique for Chinese information retrieval." 1997. http://library.cuhk.edu.hk/record=b5889267.

Abstract:
Pang Chun Kiu.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1997.
Includes bibliographical references (leaves 101-107).
Chapter 1 --- Introduction --- p.2
Chapter 1.1 --- Chinese Indexing using Noun Phrases --- p.6
Chapter 1.2 --- Objectives --- p.8
Chapter 1.3 --- An Overview of the Thesis --- p.8
Chapter 2 --- Background --- p.10
Chapter 2.1 --- Technology Influences on Information Retrieval --- p.10
Chapter 2.2 --- Related Work --- p.13
Chapter 2.2.1 --- Statistical/Keyword Approaches --- p.13
Chapter 2.2.2 --- Syntactical approaches --- p.15
Chapter 2.2.3 --- Semantic approaches --- p.17
Chapter 2.2.4 --- Noun Phrases Approach --- p.18
Chapter 2.2.5 --- Chinese Information Retrieval --- p.20
Chapter 2.3 --- Our Approach --- p.21
Chapter 3 --- Chinese Noun Phrases --- p.23
Chapter 3.1 --- Different types of Chinese Noun Phrases --- p.23
Chapter 3.2 --- Ambiguous noun phrases --- p.27
Chapter 3.2.1 --- Ambiguous English Noun Phrases --- p.27
Chapter 3.2.2 --- Ambiguous Chinese Noun Phrases --- p.28
Chapter 3.2.3 --- Statistical data on the three NPs --- p.33
Chapter 4 --- Index Extraction from De-de Conj. NP --- p.35
Chapter 4.1 --- Word Segmentation --- p.36
Chapter 4.2 --- Part-of-speech tagging --- p.37
Chapter 4.3 --- Noun Phrase Extraction --- p.37
Chapter 4.4 --- The Chinese noun phrase partial parser --- p.38
Chapter 4.5 --- Handling Parsing Ambiguity --- p.40
Chapter 4.6 --- Index Building Strategy --- p.41
Chapter 4.7 --- The cross-set generation rules --- p.44
Chapter 4.8 --- Example 1: Indexing De-de NP --- p.46
Chapter 4.9 --- Example 2: Indexing Conjunctive NP --- p.48
Chapter 4.10 --- Experimental results and Discussion --- p.49
Chapter 5 --- Indexing Compound Nouns --- p.52
Chapter 5.1 --- Previous Researches on Compound Nouns --- p.53
Chapter 5.2 --- Indexing two-term Compound Nouns --- p.55
Chapter 5.2.1 --- About the thesaurus 《同義詞詞林》 (Tongyici Cilin) --- p.56
Chapter 5.3 --- Indexing Compound Nouns of three or more terms --- p.58
Chapter 5.4 --- Corpus learning approach --- p.59
Chapter 5.4.1 --- An Example --- p.60
Chapter 5.4.2 --- Experimental Setup --- p.63
Chapter 5.4.3 --- An Experiment using the third level of the Cilin --- p.65
Chapter 5.4.4 --- An Experiment using the second level of the Cilin --- p.66
Chapter 5.5 --- Contextual Approach --- p.68
Chapter 5.5.1 --- The algorithm --- p.69
Chapter 5.5.2 --- An Illustrative Example --- p.71
Chapter 5.5.3 --- Experiments on compound nouns --- p.72
Chapter 5.5.4 --- Experiment I: Word Distance Based Extraction --- p.73
Chapter 5.5.5 --- Experiment II: Semantic Class Based Extraction --- p.75
Chapter 5.5.6 --- Experiment III: On different boundaries --- p.76
Chapter 5.5.7 --- The Final Algorithm --- p.79
Chapter 5.5.8 --- Experiments on other compounds --- p.82
Chapter 5.5.9 --- Discussion --- p.83
Chapter 6 --- Overall Effectiveness --- p.85
Chapter 6.1 --- Illustrative Example for the Integrated Algorithm --- p.86
Chapter 6.2 --- Experimental Setup --- p.90
Chapter 6.3 --- Experimental Results & Discussion --- p.91
Chapter 7 --- Conclusion --- p.95
Chapter 7.1 --- Summary --- p.95
Chapter 7.2 --- Contributions --- p.97
Chapter 7.3 --- Future Directions --- p.98
Chapter 7.3.1 --- Word-sense determination --- p.98
Chapter 7.3.2 --- Hybrid approach for compound noun indexing --- p.99
Chapter A --- Cross-set Generation Rules --- p.108
Chapter B --- Tag set by Tsinghua University --- p.110
Chapter C --- Noun Phrases Test Set --- p.113
Chapter D --- Compound Nouns Test Set --- p.124
Chapter D.1 --- Three-term Compound Nouns --- p.125
Chapter D.1.1 --- NVN --- p.125
Chapter D.1.2 --- Other three-term compound nouns --- p.129
Chapter D.2 --- Four-term Compound Nouns --- p.133
Chapter D.3 --- Five-term and six-term Compound Nouns --- p.134
39

"Chinese readability analysis and its applications on the internet." 2007. http://library.cuhk.edu.hk/record=b5893108.

Abstract:
Lau Tak Pang.
Thesis submitted in: October 2006.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2007.
Includes bibliographical references (leaves 110-122).
Abstracts in English and Chinese.
Abstract --- p.i
Acknowledgement --- p.v
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Motivation and Major Contributions --- p.1
Chapter 1.1.1 --- Chinese Readability Analysis --- p.1
Chapter 1.1.2 --- Web Readability Analysis --- p.3
Chapter 1.2 --- Thesis Chapter Organization --- p.6
Chapter 2 --- Related Work --- p.7
Chapter 2.1 --- Readability Assessment --- p.7
Chapter 2.1.1 --- Assessment for Text Document --- p.8
Chapter 2.1.2 --- Assessment for Web Page --- p.13
Chapter 2.2 --- Support Vector Machine --- p.14
Chapter 2.2.1 --- Characteristics and Advantages --- p.14
Chapter 2.2.2 --- Applications --- p.16
Chapter 2.3 --- Chinese Word Segmentation --- p.16
Chapter 2.3.1 --- Difficulty in Chinese Word Segmentation --- p.16
Chapter 2.3.2 --- Approaches for Chinese Word Segmentation --- p.17
Chapter 3 --- Chinese Readability Analysis --- p.20
Chapter 3.1 --- Chinese Readability Factor Analysis --- p.20
Chapter 3.1.1 --- Systematic Analysis --- p.20
Chapter 3.1.2 --- Feature Extraction --- p.30
Chapter 3.1.3 --- Limitation of Our Analysis and Possible Extension --- p.32
Chapter 3.2 --- Research Methodology --- p.33
Chapter 3.2.1 --- Definition of Readability --- p.33
Chapter 3.2.2 --- Data Acquisition and Sampling --- p.34
Chapter 3.2.3 --- Text Processing and Feature Extraction --- p.35
Chapter 3.2.4 --- Regression Analysis using Support Vector Regression --- p.36
Chapter 3.2.5 --- Evaluation --- p.36
Chapter 3.3 --- Introduction to Support Vector Regression --- p.38
Chapter 3.3.1 --- Basic Concept --- p.38
Chapter 3.3.2 --- Non-Linear Extension using Kernel Technique --- p.41
Chapter 3.4 --- Implementation Details --- p.42
Chapter 3.4.1 --- Chinese Word Segmentation --- p.42
Chapter 3.4.2 --- Building Basic Chinese Character / Word Lists --- p.47
Chapter 3.4.3 --- Full Sentence Detection --- p.49
Chapter 3.4.4 --- Feature Selection Using Genetic Algorithm --- p.50
Chapter 3.5 --- Experiments --- p.55
Chapter 3.5.1 --- Experiment 1: Evaluation on Chinese Word Segmentation using the LMR-RC Tagging Scheme --- p.56
Chapter 3.5.2 --- Experiment 2: Initial SVR Parameters Searching with Different Kernel Functions --- p.61
Chapter 3.5.3 --- Experiment 3: Feature Selection Using Genetic Algorithm --- p.63
Chapter 3.5.4 --- Experiment 4: Training and Cross-validation Performance using the Selected Feature Subset --- p.67
Chapter 3.5.5 --- Experiment 5: Comparison with Linear Regression --- p.74
Chapter 3.6 --- Summary and Future Work --- p.76
Chapter 4 --- Web Readability Analysis --- p.78
Chapter 4.1 --- Web Page Readability --- p.79
Chapter 4.1.1 --- Readability as Comprehension Difficulty --- p.79
Chapter 4.1.2 --- Readability as Grade Level --- p.81
Chapter 4.2 --- Web Site Readability --- p.83
Chapter 4.3 --- Experiments --- p.85
Chapter 4.3.1 --- Experiment 1: Web Page Readability Analysis - Comprehension Difficulty --- p.87
Chapter 4.3.2 --- Experiment 2: Web Page Readability Analysis - Grade Level --- p.92
Chapter 4.3.3 --- Experiment 3: Web Site Readability Analysis --- p.98
Chapter 4.4 --- Summary and Future Work --- p.101
Chapter 5 --- Conclusion --- p.104
Chapter A --- List of Symbols and Notations --- p.107
Chapter B --- List of Publications --- p.110
Bibliography --- p.113
40

"A robust unification-based parser for Chinese natural language processing." 2001. http://library.cuhk.edu.hk/record=b5895881.

Abstract:
Chan Shuen-ti Roy.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2001.
Includes bibliographical references (leaves 168-175).
Abstracts in English and Chinese.
Chapter 1. --- Introduction --- p.12
Chapter 1.1. --- The nature of natural language processing --- p.12
Chapter 1.2. --- Applications of natural language processing --- p.14
Chapter 1.3. --- Purpose of study --- p.17
Chapter 1.4. --- Organization of this thesis --- p.18
Chapter 2. --- Organization and methods in natural language processing --- p.20
Chapter 2.1. --- Organization of natural language processing system --- p.20
Chapter 2.2. --- Methods employed --- p.22
Chapter 2.3. --- Unification-based grammar processing --- p.22
Chapter 2.3.1. --- Generalized Phrase Structure Grammar (GPSG) --- p.27
Chapter 2.3.2. --- Head-driven Phrase Structure Grammar (HPSG) --- p.31
Chapter 2.3.3. --- Common drawbacks of UBGs --- p.33
Chapter 2.4. --- Corpus-based processing --- p.34
Chapter 2.4.1. --- Drawback of corpus-based processing --- p.35
Chapter 3. --- Difficulties in Chinese language processing and its related works --- p.37
Chapter 3.1. --- A glance at the history --- p.37
Chapter 3.2. --- Difficulties in syntactic analysis of Chinese --- p.37
Chapter 3.2.1. --- Writing system of Chinese causes segmentation problem --- p.38
Chapter 3.2.2. --- Words serving multiple grammatical functions without inflection --- p.40
Chapter 3.2.3. --- Word order of Chinese --- p.42
Chapter 3.2.4. --- The Chinese grammatical word --- p.43
Chapter 3.3. --- Related works --- p.45
Chapter 3.3.1. --- Unification grammar processing approach --- p.45
Chapter 3.3.2. --- Corpus-based processing approach --- p.48
Chapter 3.4. --- Restatement of goal --- p.50
Chapter 4. --- SERUP: Statistical-Enhanced Robust Unification Parser --- p.54
Chapter 5. --- Step One: automatic preprocessing --- p.57
Chapter 5.1. --- Segmentation of lexical tokens --- p.57
Chapter 5.2. --- Conversion of date, time and numerals --- p.61
Chapter 5.3. --- Identification of new words --- p.62
Chapter 5.3.1. --- Proper nouns - Chinese names --- p.63
Chapter 5.3.2. --- Other proper nouns and multi-syllabic words --- p.67
Chapter 5.4. --- Defining smallest parsing unit --- p.82
Chapter 5.4.1. --- The Chinese sentence --- p.82
Chapter 5.4.2. --- Breaking down the paragraphs --- p.84
Chapter 5.4.3. --- Implementation --- p.87
Chapter 6. --- Step Two: grammar construction --- p.91
Chapter 6.1. --- Criteria in choosing a UBG model --- p.91
Chapter 6.2. --- The grammar in details --- p.92
Chapter 6.2.1. --- The PHON feature --- p.93
Chapter 6.2.2. --- The SYN feature --- p.94
Chapter 6.2.3. --- The SEM feature --- p.98
Chapter 6.2.4. --- Grammar rules and features principles --- p.99
Chapter 6.2.5. --- Verb phrases --- p.101
Chapter 6.2.6. --- Noun phrases --- p.104
Chapter 6.2.7. --- Prepositional phrases --- p.113
Chapter 6.2.8. --- "Ba2" and "Bei4" constructions --- p.115
Chapter 6.2.9. --- The terminal node S --- p.119
Chapter 6.2.10. --- Summary of phrasal rules --- p.121
Chapter 6.2.11. --- Morphological rules --- p.122
Chapter 7. --- Step Three: resolving structural ambiguities --- p.128
Chapter 7.1. --- Sources of ambiguities --- p.128
Chapter 7.2. --- The traditional practices: an illustration --- p.132
Chapter 7.3. --- Deficiency of current practices --- p.134
Chapter 7.4. --- A new point of view: Wu (1999) --- p.140
Chapter 7.5. --- Improvement over Wu (1999) --- p.142
Chapter 7.6. --- Conclusion on semantic features --- p.146
Chapter 8. --- Implementation, performance and evaluation --- p.148
Chapter 8.1. --- Implementation --- p.148
Chapter 8.2. --- Performance and evaluation --- p.150
Chapter 8.2.1. --- The test set --- p.150
Chapter 8.2.2. --- Segmentation of lexical tokens --- p.150
Chapter 8.2.3. --- New word identification --- p.152
Chapter 8.2.4. --- Parsing unit segmentation --- p.156
Chapter 8.2.5. --- The grammar --- p.158
Chapter 8.3. --- Overall performance of SERUP --- p.162
Chapter 9. --- Conclusion --- p.164
Chapter 9.1. --- Summary of this thesis --- p.164
Chapter 9.2. --- Contribution of this thesis --- p.165
Chapter 9.3. --- Future work --- p.166
References --- p.168
Appendix I --- p.176
Appendix II --- p.181
Appendix III --- p.183
41

"An investigation on Chinese noun phrase extraction." 2000. http://library.cuhk.edu.hk/record=b5890319.

Abstract:
Chan Kun-Chung Timothy.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2000.
Includes bibliographical references (leaves 79-83).
Abstracts in English and Chinese.
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Motivation --- p.1
Chapter 1.2 --- Outline of Thesis --- p.3
Chapter 2 --- Background --- p.5
Chapter 2.1 --- Chinese Noun Phrase Structure --- p.5
Chapter 2.2 --- Literature Review --- p.6
Chapter 2.3 --- Observations --- p.10
Chapter 2.4 --- Chapter Summary --- p.11
Chapter 3 --- Maximal Chinese Noun Phrase Extraction System --- p.13
Chapter 3.1 --- Background --- p.13
Chapter 3.1.1 --- Part-of-speech Tagset --- p.13
Chapter 3.1.2 --- The Tagging System --- p.14
Chapter 3.1.3 --- Chinese Corpus --- p.16
Chapter 3.1.4 --- Grammar Rules and Boundary Information --- p.17
Chapter 3.1.5 --- Feature Selection --- p.19
Chapter 3.2 --- Overview of Our Chinese Noun Phrase Extraction System --- p.19
Chapter 3.2.1 --- Training --- p.19
Chapter 3.2.2 --- Testing --- p.21
Chapter 3.3 --- Chapter Summary --- p.21
Chapter 4 --- Preliminary Noun Phrase Extraction --- p.23
Chapter 4.1 --- Framework --- p.23
Chapter 4.2 --- Boundary Information Acquisition --- p.24
Chapter 4.3 --- Candidate Boundary Insertion --- p.26
Chapter 4.4 --- Pairing of Candidate Boundaries --- p.27
Chapter 4.4.1 --- Conditional Probability-based Model --- p.28
Chapter 4.4.2 --- Heuristic-based Model --- p.29
Chapter 4.4.3 --- Dynamic Programming-based Model --- p.30
Chapter 4.4.4 --- Model Selection --- p.31
Chapter 4.4.5 --- Revised Dynamic Programming Model --- p.32
Chapter 4.4.6 --- Analysis of the Impact of the Revised DP Model --- p.35
Chapter 4.4.7 --- Experiments of Dynamic Programming-based Model --- p.38
Chapter 4.4.8 --- Result Analysis --- p.42
Chapter 4.5 --- Concluding Remarks on DP-Based Model --- p.47
Chapter 4.6 --- Chapter Summary --- p.49
Chapter 5 --- Automatic Error Correction --- p.50
Chapter 5.1 --- Introduction --- p.50
Chapter 5.1.1 --- Statistical Properties of TEL --- p.54
Chapter 5.1.2 --- Related Applications --- p.55
Chapter 5.2 --- Settings of Main Components --- p.57
Chapter 5.2.1 --- Initial State --- p.58
Chapter 5.2.2 --- Transformation Actions --- p.58
Chapter 5.2.3 --- Triggering Features of Transformation Templates --- p.58
Chapter 5.2.4 --- Evaluation of Rule --- p.62
Chapter 5.2.5 --- Stopping Threshold --- p.62
Chapter 5.3 --- Experiments and Results --- p.63
Chapter 5.3.1 --- Setup and Procedure --- p.63
Chapter 5.3.2 --- Overall Performance --- p.63
Chapter 5.3.3 --- Contribution of Rules --- p.67
Chapter 5.3.4 --- Remarks on Rules Learning --- p.69
Chapter 5.3.5 --- Discussion on Recall Performance --- p.70
Chapter 5.4 --- Chapter Summary --- p.73
Chapter 6 --- Conclusion --- p.74
Chapter 6.1 --- Summary --- p.74
Chapter 6.2 --- Contributions --- p.76
Chapter 6.3 --- Future Work --- p.76
Bibliography --- p.79
Chapter A --- Chinese POS Tag Set --- p.84
Chapter B --- Algorithms of Boundary Pairing Models --- p.88
Chapter B.1 --- Heuristic based Model --- p.88
Chapter B.2 --- Dynamic Programming based Model --- p.89
Chapter C --- Triggering Environments of Transformation Templates --- p.91
42

"Chinese outline fonts support in X Window System." Chinese University of Hong Kong, 1994. http://library.cuhk.edu.hk/record=b5887264.

Abstract:
by Raymond Cheuk-kuen Chen.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1994.
Includes bibliographical references (leaves 157-160).
Chapter 1. --- INTRODUCTION --- p.8
Chapter 1.1. --- Windowing System --- p.8
Chapter 1.2. --- Fonts --- p.10
Chapter 1.2.1. --- Bitmap Fonts --- p.11
Chapter 1.2.2. --- Outline Fonts --- p.12
Chapter 1.3. --- Different font support models --- p.15
Chapter 1.3.1. --- Supported by applications --- p.15
Chapter 1.3.2. --- Supported by windowing system --- p.17
Chapter 1.3.3. --- Supported by a dedicated server --- p.19
Chapter 1.4. --- Issues of Chinese Font Support --- p.20
Chapter 2. --- OVERVIEW OF X WINDOW SYSTEM --- p.22
Chapter 2.1. --- Introduction --- p.22
Chapter 2.2. --- Architecture --- p.23
Chapter 2.3. --- Font Management in the X Window System --- p.23
Chapter 2.3.1. --- Before X Version 11 Release 5 --- p.24
Chapter 2.3.2. --- In X Version 11 Release 5 --- p.25
Chapter 2.3.3. --- Portable Compiled Format --- p.25
Chapter 2.3.4. --- Font Server --- p.26
Chapter 2.3.5. --- Font Management Library --- p.28
Chapter 2.4. --- Internal Code --- p.29
Chapter 3. --- CHINESE FONT SERVER --- p.30
Chapter 3.1. --- Motivation --- p.30
Chapter 3.2. --- Font Server Architecture --- p.31
Chapter 3.2.1. --- Device Independent Font Server layer (DIFS) --- p.32
Chapter 3.2.2. --- Operating System layer (OS) --- p.32
Chapter 3.2.3. --- Font Management Library (FML) --- p.33
Chapter 3.2.4. --- Font Path Element --- p.34
Chapter 3.2.5. --- Font File Renderer --- p.35
Chapter 3.2.6. --- Font server Renderer --- p.36
Chapter 3.3. --- Implementation of Chinese Font Server --- p.36
Chapter 3.3.1. --- Font data and code set --- p.36
Chapter 3.3.2. --- Registering a new font reader --- p.38
Chapter 3.3.3. --- Font specific functions --- p.42
Chapter 3.3.4. --- Load-All Scheme --- p.43
Chapter 3.3.5. --- Demand-Loading Scheme --- p.44
Chapter 3.3.6. --- Embedding of font rasterizer --- p.44
Chapter 3.4. --- Test Results --- p.45
Chapter 3.4.1. --- X Application Tests --- p.45
Chapter 3.4.2. --- Demand-Loading Test --- p.49
Chapter 3.5. --- Some Remarks --- p.53
Chapter 4. --- OVERVIEW OF PRINTING SYSTEM --- p.54
Chapter 4.1. --- Motivation --- p.54
Chapter 4.2. --- Design Considerations --- p.56
Chapter 4.2.1. --- Modification of the X server --- p.56
Chapter 4.2.2. --- Embed the printing system into the font server --- p.57
Chapter 4.2.3. --- Distributed Architecture --- p.58
Chapter 4.3. --- System Architecture --- p.60
Chapter 4.4. --- Printer Server --- p.61
Chapter 4.5. --- Font Server --- p.63
Chapter 4.6. --- Printing Services Protocols --- p.63
Chapter 4.7. --- X Window System Server --- p.65
Chapter 4.8. --- Printer Server Library --- p.65
Chapter 4.9. --- Client Applications --- p.65
Chapter 5. --- DESIGN AND IMPLEMENTATION OF A PRINTER SERVER --- p.67
Chapter 5.1. --- Objects identification --- p.67
Chapter 5.1.1. --- Dispatcher (dispatcher) --- p.68
Chapter 5.1.2. --- Communication Channel (ComChannel) --- p.68
Chapter 5.1.3. --- Font Cache Manager (FnCache) --- p.69
Chapter 5.1.4. --- PrnFont (PrnFont) --- p.69
Chapter 5.1.5. --- Per-Font Cache (CacheStruct) --- p.70
Chapter 5.1.6. --- Font Server (FnServer) --- p.71
Chapter 5.1.7. --- Client Manager (LRUList) --- p.71
Chapter 5.1.8. --- Client Record (ClientRec) --- p.71
Chapter 5.1.9. --- Printer Driver (PrnDriver) --- p.71
Chapter 5.1.10. --- Down Loaded Font Table (DownLoadedFont) --- p.72
Chapter 5.1.11. --- Request Header (reqHeader) --- p.72
Chapter 5.1.12. --- Generic Reply (replyGeneric) --- p.74
Chapter 5.2. --- Objects Organization --- p.74
Chapter 5.2.1. --- Server Control Subsystem --- p.75
Chapter 5.2.2. --- Client Management Subsystem --- p.78
Chapter 5.2.3. --- Request Handling Subsystem --- p.84
Chapter 5.2.4. --- Font Managing Subsystem --- p.86
Chapter 6. --- SAMPLE PRINTER DRIVER --- p.94
Chapter 6.1. --- Printer Control Languages --- p.94
Chapter 6.1.1. --- Structure of PCL Command --- p.95
Chapter 6.1.2. --- PCL Command Example --- p.97
Chapter 6.2. --- Printer Font Resources --- p.98
Chapter 6.3. --- Traditional Font Handling Methods in a Printer Driver --- p.99
Chapter 6.4. --- Soft Font Creation in PCL Printer --- p.101
Chapter 6.4.1. --- Font ID number --- p.102
Chapter 6.4.2. --- Font Descriptor --- p.102
Chapter 6.4.3. --- Character Code --- p.104
Chapter 6.4.4. --- Character Descriptor --- p.105
Chapter 6.4.5. --- Character Bitmap Data --- p.107
Chapter 6.5. --- New font downloading schemes for double-byte fonts --- p.107
Chapter 6.5.1. --- Terminology --- p.108
Chapter 6.5.2. --- Underlying Concepts of Algorithm One --- p.109
Chapter 6.5.3. --- Algorithm One --- p.111
Chapter 6.5.3.1. --- Code Mapping --- p.112
Chapter 6.5.3.2. --- Example --- p.114
Chapter 6.5.3.3. --- Memory Consideration --- p.115
Chapter 6.5.4. --- Algorithm Two --- p.117
Chapter 7. --- EXPERIMENT RESULTS AND DISCUSSIONS --- p.121
Chapter 7.1. --- Cache Test --- p.121
Chapter 7.2. --- Printer Driver Test --- p.125
Chapter 7.2.1. --- Testing with 10 points font --- p.126
Chapter 7.2.2. --- Testing with 12 points font --- p.129
Chapter 7.2.3. --- Testing with 15 points font --- p.131
Chapter 7.2.4. --- Testing with 18 points font --- p.134
Chapter 7.3. --- Time Measurement --- p.136
Chapter 7.4. --- Discussion --- p.139
Chapter 7.5. --- Further Improvement --- p.143
Chapter 8. --- CONCLUSIONS --- p.145
APPENDIX A. PRINTER DRIVER CLASS --- p.147
APPENDIX B. SAMPLE OUTPUT --- p.149
REFERENCES --- p.157
43

"Automatic noun phrase extraction from full Chinese text." 1997. http://library.cuhk.edu.hk/record=b6073093.

Abstract:
by Li Wenjie.
Thesis (Ph.D.)--Chinese University of Hong Kong, 1997.
Includes bibliographical references (p. 209-226).
Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Mode of access: World Wide Web.
44

"A new approach for extracting inter-word semantic relationship from a contemporary Chinese thesaurus." Chinese University of Hong Kong, 1995. http://library.cuhk.edu.hk/record=b5888491.

Abstract:
by Lam Sze-sing.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1995.
Includes bibliographical references (leaves 119-123).
Chapter 1 --- INTRODUCTION --- p.1
Chapter 1.1 --- Introduction --- p.1
Chapter 1.2 --- Statement of Thesis --- p.5
Chapter 1.3 --- Organization of this Thesis --- p.6
Chapter 2 --- RELATED WORK --- p.8
Chapter 2.1 --- Overview --- p.8
Chapter 2.2 --- Corpus-Based Knowledge Acquisition --- p.12
Chapter 2.3 --- Linguistic-Based Knowledge Acquisition --- p.18
Chapter 2.3.1 --- Knowledge Acquisition from Standard Dictionaries --- p.18
Chapter 2.3.2 --- Knowledge Acquisition from Standard Thesauri --- p.23
Chapter 2.4 --- Remarks --- p.24
Chapter 3 --- A METHOD TO EXTRACT THE INTER-WORD SEMANTIC RELATIONSHIP FROM《同義詞詞林》 --- p.25
Chapter 3.1 --- Background --- p.25
Chapter 3.1.1 --- Structure of《同義詞詞林》 --- p.26
Chapter 3.1.2 --- Knowledge Representation of a Machine Tractable Thesaurus --- p.28
Chapter 3.1.3 --- Extracting the Semantic Knowledge by Simple Co-occurrence --- p.28
Chapter 3.2 --- Association Network --- p.31
Chapter 3.3 --- Semantic Association Model --- p.33
Chapter 3.3.1 --- Problems with the Simple Co-occurrence Method --- p.34
Chapter 3.3.2 --- Methodology of Semantic Association Model --- p.39
Chapter 3.4 --- Inter-word Semantic Function --- p.51
Chapter 4 --- NOUN-VERB-NOUN COMPOUND WORD DETECTION : AN EXPERIMENT --- p.55
Chapter 4.1 --- Overview --- p.56
Chapter 4.2 --- N-V-N Compound Word Detection Model --- p.61
Chapter 4.3 --- Experimental Results of N-V-N Compound Word Detection --- p.63
Chapter 5 --- WORD SENSE DISAMBIGUATION : AN APPLICATION --- p.66
Chapter 5.1 --- Overview --- p.67
Chapter 5.2 --- Word-Sense Disambiguation Model --- p.72
Chapter 5.2.1 --- Linguistic Resource --- p.72
Chapter 5.2.2 --- The LSD-C Algorithm --- p.73
Chapter 5.2.3 --- LSD-C in Action --- p.78
Chapter 5.3 --- Experimental Results of Word Sense Disambiguation --- p.83
Chapter 6 --- CONCLUSIONS & FURTHER RESEARCH --- p.93
Chapter 6.1 --- Conclusions --- p.93
Chapter 6.2 --- Further Research --- p.96
Chapter 6.2.1 --- Enriching the Knowledge --- p.96
Chapter 6.2.2 --- Enhancing the N-V-N Compound Word Detection Model --- p.98
Chapter 6.2.3 --- Enhancing the LSD-C Algorithm --- p.99
APPENDICES --- p.101
Appendix A - Dependency Grammar --- p.101
Appendix B - Sample Articles from a Local Chinese Newspaper --- p.104
Appendix C - Ambiguous Words with the Senses Given by《現代漢語詞典》 --- p.108
Appendix D - List of Stop Words for the Testing Samples --- p.117
REFERENCES --- p.119
45

"A corpus-based induction learning approach to natural language processing." Chinese University of Hong Kong, 1996. http://library.cuhk.edu.hk/record=b5888859.

Abstract:
by Leung Chi Hong.
Thesis (Ph.D.)--Chinese University of Hong Kong, 1996.
Includes bibliographical references (leaves 163-171).
Chapter 1. --- Introduction --- p.1
Chapter 2. --- Background Study of Natural Language Processing --- p.9
Chapter 2.1. --- Knowledge-based approach --- p.9
Chapter 2.1.1. --- Morphological analysis --- p.10
Chapter 2.1.2. --- Syntactic parsing --- p.11
Chapter 2.1.3. --- Semantic parsing --- p.16
Chapter 2.1.3.1. --- Semantic grammar --- p.19
Chapter 2.1.3.2. --- Case grammar --- p.20
Chapter 2.1.4. --- Problems of knowledge acquisition in knowledge-based approach --- p.22
Chapter 2.2. --- Corpus-based approach --- p.23
Chapter 2.2.1. --- Beginning of corpus-based approach --- p.23
Chapter 2.2.2. --- An example of corpus-based application: word tagging --- p.25
Chapter 2.2.3. --- Annotated corpus --- p.26
Chapter 2.2.4. --- State of the art in the corpus-based approach --- p.26
Chapter 2.3. --- Knowledge-based approach versus corpus-based approach --- p.28
Chapter 2.4. --- Co-operation between two different approaches --- p.32
Chapter 3. --- Induction Learning applied to Corpus-based Approach --- p.35
Chapter 3.1. --- General model of traditional corpus-based approach --- p.36
Chapter 3.1.1. --- Division of a problem into a number of sub-problems --- p.36
Chapter 3.1.2. --- Solution selected from a set of predefined choices --- p.36
Chapter 3.1.3. --- Solution selection based on a particular kind of linguistic entity --- p.37
Chapter 3.1.4. --- Statistical correlations between solutions and linguistic entities --- p.37
Chapter 3.1.5. --- Prediction of the best solution based on statistical correlations --- p.38
Chapter 3.2. --- First problem in the corpus-based approach: Irrelevance in the corpus --- p.39
Chapter 3.3. --- Induction learning --- p.41
Chapter 3.3.1. --- General issues about induction learning --- p.41
Chapter 3.3.2. --- Reasons of using induction learning in the corpus-based approach --- p.43
Chapter 3.3.3. --- General model of corpus-based induction learning approach --- p.45
Chapter 3.3.3.1. --- Preparation of positive corpus and negative corpus --- p.45
Chapter 3.3.3.2. --- Statistical correlations between solutions and linguistic entities --- p.46
Chapter 3.3.3.3. --- Combination of the statistical correlations obtained from the positive and negative corpora --- p.48
Chapter 3.4. --- Second problem in the corpus-based approach: Modification of initial probabilistic approximations --- p.50
Chapter 3.5. --- Learning feedback modification --- p.52
Chapter 3.5.1. --- Determination of which correlation scores to be modified --- p.52
Chapter 3.5.2. --- Determination of the magnitude of modification --- p.53
Chapter 3.5.3. --- A general algorithm of learning feedback modification --- p.56
Chapter 4. --- Identification of Phrases and Templates in Domain-specific Chinese Texts --- p.59
Chapter 4.1. --- Analysis of the problem solved by the traditional corpus-based approach --- p.61
Chapter 4.2. --- Phrase identification based on positive and negative corpora --- p.63
Chapter 4.3. --- Phrase identification procedure --- p.64
Chapter 4.3.1. --- Step 1: Phrase seed identification --- p.65
Chapter 4.3.2. --- Step 2: Phrase construction from phrase seeds --- p.65
Chapter 4.4. --- Template identification procedure --- p.67
Chapter 4.5. --- Experiment and result --- p.70
Chapter 4.5.1. --- Testing data --- p.70
Chapter 4.5.2. --- Details of experiments --- p.71
Chapter 4.5.3. --- Experimental results --- p.72
Chapter 4.5.3.1. --- Phrases and templates identified in financial news articles --- p.72
Chapter 4.5.3.2. --- Phrases and templates identified in political news articles --- p.73
Chapter 4.6. --- Conclusion --- p.74
Chapter 5. --- A Corpus-based Induction Learning Approach to Improving the Accuracy of Chinese Word Segmentation --- p.76
Chapter 5.1. --- Background of Chinese word segmentation --- p.77
Chapter 5.2. --- Typical methods of Chinese word segmentation --- p.78
Chapter 5.2.1. --- Syntactic and semantic approach --- p.78
Chapter 5.2.2. --- Statistical approach --- p.79
Chapter 5.2.3. --- Heuristic approach --- p.81
Chapter 5.3. --- Problems in word segmentation --- p.82
Chapter 5.3.1. --- Chinese word definition --- p.82
Chapter 5.3.2. --- Word dictionary --- p.83
Chapter 5.3.3. --- Word segmentation ambiguity --- p.84
Chapter 5.4. --- Corpus-based induction learning approach to improving word segmentation accuracy --- p.86
Chapter 5.4.1. --- Rationale of approach --- p.87
Chapter 5.4.2. --- Method of constructing modification rules --- p.89
Chapter 5.5. --- Experiment and results --- p.94
Chapter 5.6. --- Characteristics of modification rules constructed in experiment --- p.96
Chapter 5.7. --- Experiment constructing rules for compound words with suffixes --- p.98
Chapter 5.8. --- Relationship between modification frequency and Zipf's first law --- p.99
Chapter 5.9. --- Problems in the approach --- p.100
Chapter 5.10. --- Conclusion --- p.101
Chapter 6. --- Corpus-based Induction Learning Approach to Automatic Indexing of Controlled Index Terms --- p.103
Chapter 6.1. --- Background of automatic indexing --- p.103
Chapter 6.1.1. --- Definition of index term and indexing --- p.103
Chapter 6.1.2. --- Manual indexing versus automatic indexing --- p.105
Chapter 6.1.3. --- Different approaches to automatic indexing --- p.107
Chapter 6.2. --- Corpus-based induction learning approach to automatic indexing --- p.109
Chapter 6.2.1. --- Fundamental concept about corpus-based automatic indexing --- p.110
Chapter 6.2.2. --- Procedure of automatic indexing --- p.111
Chapter 6.2.2.1. --- Learning process --- p.112
Chapter 6.2.2.2. --- Indexing process --- p.118
Chapter 6.3. --- Experiments of corpus-based induction learning approach to automatic indexing --- p.118
Chapter 6.3.1. --- An experiment evaluating the complete procedures --- p.119
Chapter 6.3.1.1. --- Testing data used in the experiment --- p.119
Chapter 6.3.1.2. --- Details of the experiment --- p.119
Chapter 6.3.1.3. --- Experimental result --- p.121
Chapter 6.3.2. --- An experiment comparing with the traditional approach --- p.122
Chapter 6.3.3. --- An experiment determining the optimal indexing score threshold --- p.124
Chapter 6.3.4. --- An experiment measuring the precision and recall of indexing performance --- p.127
Chapter 6.4. --- Learning feedback modification --- p.128
Chapter 6.4.1. --- Positive feedback --- p.129
Chapter 6.4.2. --- Negative feedback --- p.131
Chapter 6.4.3. --- Change of indexed proportions of positive/negative training corpus in feedback iterations --- p.132
Chapter 6.4.4. --- An experiment evaluating the learning feedback modification --- p.134
Chapter 6.4.5. --- An experiment testing the significance factor in merging process --- p.136
Chapter 6.5. --- Conclusion --- p.138
Chapter 7. --- Conclusion --- p.140
Appendix A: Some examples of identified phrases in financial news articles --- p.149
Appendix B: Some examples of identified templates in financial news articles --- p.150
Appendix C: Some examples of texts containing the templates in financial news articles --- p.151
Appendix D: Some examples of identified phrases in political news articles --- p.152
Appendix E: Some examples of identified templates in political news articles --- p.153
Appendix F: Some examples of texts containing the templates in political news articles --- p.154
Appendix G: Syntactic tags used in word segmentation modification rule experiment --- p.155
Appendix H: An example of semantic approach to automatic indexing --- p.156
Appendix I: An example of syntactic approach to automatic indexing --- p.158
Appendix J: Samples of INSPEC and MEDLINE Records --- p.161
Appendix K: Examples of Promoting and Demoting Words --- p.162
References --- p.163
46

"Towards discourse classification for Chinese, a resource-poor language." 2014. http://repository.lib.cuhk.edu.hk/en/item/cuhk-1290645.

Abstract:
Discourse raises issues about semantics, especially the nature of the coherence and cohesion of texts. Like part-of-speech tagging and syntactic parsing, discourse classification is fundamental in computational linguistics, yet it is comparatively understudied. The lack of annotated corpora limits research on discourse classification for most languages other than English (e.g., Chinese). Manual annotation for discourse classification is complex, time-consuming, and costly. One way to overcome this predicament is to explore unsupervised learning methods. Nevertheless, previous work on English showed that unsupervised methods could only deal with coarse-grained discourse relations and suffered from low precision. Another possibility is to make use of discourse classification capabilities from other languages that have rich discourse corpora, but cross-language discourse classification is still very much an open problem. Using Chinese as the target, this thesis presents the first study on discourse classification for a resource-poor language. Furthermore, we also annotate the first open discourse treebank for Chinese, which includes 890 news articles.
We first propose a novel bootstrapping unsupervised method based on semantic sequential representation (SSR) for discourse classification. SSR is a new representation for discourse instances that integrates basic bag-of-words information with lexical, semantic, and word-sequence information. Our method starts with a small set of cue-phrase-based patterns to collect a large number of discourse instances, which are later converted to SSRs. We then propose an unsupervised SSR learner to generate, weigh, and filter new SSRs without cue phrases for recognizing discourse relations. Experimental results showed that our method outperformed the previous unsupervised method by 7% in F-score. We also show that SSRs are effective features for supervised learning methods.
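The cue-phrase seeding step described in this paragraph can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the two connective patterns, the relation labels, and the function name `collect_instances` are assumptions invented for this example, and the later conversion of instances to SSRs is not shown.

```python
import re

# Hypothetical seed patterns: each maps a relation label to a regular
# expression built around a pair of explicit Chinese discourse connectives.
# Neither the labels nor the patterns are taken from the thesis.
CUE_PATTERNS = {
    "causal": re.compile(r"因為(?P<arg1>[^,,。]+)[,,]所以(?P<arg2>[^。]+)"),
    "contrast": re.compile(r"雖然(?P<arg1>[^,,。]+)[,,]但是(?P<arg2>[^。]+)"),
}


def collect_instances(sentences):
    """Return (relation, arg1, arg2) triples for sentences matching a seed pattern."""
    instances = []
    for sentence in sentences:
        for relation, pattern in CUE_PATTERNS.items():
            match = pattern.search(sentence)
            if match:
                instances.append((relation, match.group("arg1"), match.group("arg2")))
    return instances


if __name__ == "__main__":
    corpus = [
        "因為下雨,所以比賽取消。",
        "雖然價格上升,但是需求沒有下降。",
        "今天天氣很好。",  # no cue phrase, so no instance is collected
    ]
    for instance in collect_instances(corpus):
        print(instance)
```

In a bootstrapping setting, instances collected this way would serve only as seeds; recognizing relations in text that lacks cue phrases is the job of the learner the abstract describes.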
The SSR-based method (F-score = 0.63) ignores the ambiguities of discourse connectives. As a result, it suffers from low recall (Recall = 0.49). To discover and eliminate these ambiguities, we further propose a cross-language framework for discourse classification. In our framework, discourse classification for Chinese is achieved in two steps: (1) discourse connective/trigger identification and (2) sense classification. The English Penn Discourse Treebank 2 (PDTB2) and Chinese-English parallel data are coupled to provide the training data for a co-training-based framework. Experimental results showed that our method achieved a significant improvement compared to the SSR-based method. The proposed framework is practical and effective, especially in coping with the inter community problem, which is common in cross-language discourse classification. Moreover, the proposed framework does not integrate any language-specific features, making it theoretically applicable to other languages.
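The two figures reported for the SSR-based method also imply its precision, if one assumes that the F-score is the balanced F1 (the harmonic mean of precision and recall); the abstract does not say which F-measure is used. Under that assumption, P = F * R / (2R - F). A small sketch, using the hypothetical helper name `precision_from_f1`:

```python
def precision_from_f1(f1, recall):
    """Solve F1 = 2PR / (P + R) for P, given F1 and R.

    Assumes the balanced F1 measure; this is an assumption, since the
    abstract only says "F-score".
    """
    denominator = 2 * recall - f1
    if denominator <= 0:
        raise ValueError("no valid precision exists for this F1/recall pair")
    return f1 * recall / denominator


if __name__ == "__main__":
    # Figures quoted in the abstract: F-score = 0.63, recall = 0.49.
    print(round(precision_from_f1(0.63, 0.49), 3))  # prints 0.882
```

Under this assumption the implied precision is roughly 0.88, which is consistent with the abstract's remark that the method's weakness is recall rather than precision.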
Every language has its unique characteristics, and our cross-language framework, which focuses on the characteristics that languages share, is ineffective at detecting characteristics specific to Chinese. We therefore package the corpus used in this research to form the Discourse Treebank for Chinese (DTBC). DTBC adopts the principles of PDTB2 while also incorporating the linguistic characteristics of Chinese. The annotation work adds a discourse layer to 890 articles from the Penn Chinese Tree Bank 5 (CTB5). DTBC is the first open Chinese discourse treebank and will be an invaluable linguistic resource for future research in Chinese discourse.
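Discourse (語篇) raises questions of semantic understanding, in particular the cohesion and coherence of texts. Like lexical and syntactic analysis, discourse classification is one of the basic problems of computational linguistics, but compared with other problems in the field its study is still at an early stage. For the great majority of languages other than English, research on discourse classification is severely limited by the lack of discourse-annotated data. Annotating discourse data is well known to be complex and very time-consuming. One way to overcome this difficulty is to explore unsupervised discourse classification methods. Prior work on English, however, shows that unsupervised methods have low precision and can handle only coarse-grained discourse relations. Another way is to transfer discourse classification techniques from a source language with plentiful annotated data to other target languages, but cross-language discourse classification is not yet mature. Taking Chinese as the target language, this thesis is the first to study discourse classification for Chinese when locally annotated data are very limited (resource-poor). In addition, we annotate the first open discourse treebank for Chinese, containing 890 news articles.
To overcome the shortcomings of earlier unsupervised methods, we first propose a novel unsupervised method based on Semantic Sequential Representation (SSR). SSR is a new way of representing discourse instances that integrates bag-of-words, lexical, semantic, and word-order information. Our method starts from a small set of patterns based on discourse connectives, collects a large number of discourse instances from raw Chinese corpora, and represents these instances as SSRs. We then propose an unsupervised method that mines, scores, and filters SSRs without considering discourse connectives. Experimental results show that the proposed method improves on the previous method by 7% in F-score. We also show that SSRs can serve as effective features for supervised discourse classification.
The unsupervised method based on mining SSRs (F-score = 0.63) ignores the ambiguity of discourse connectives, so its recall is low. To resolve this ambiguity, we further propose a cross-language framework for discourse classification. In our framework, Chinese discourse classification consists of two steps: (1) identification of discourse connectives/triggers and (2) classification of discourse relations. We combine the English discourse treebank (PDTB2: Penn Discourse TreeBank 2.0) and the Chinese treebank (CTB5: Chinese TreeBank 5.0) as training data and use them as input to a co-training framework. Experimental results show that the proposed cross-language method achieves a very significant improvement in F-score over the method that uses SSRs alone. This indicates that the proposed cross-language framework can, using bilingual parallel corpora as a bridge, effectively identify what discourse classification has in common across languages. Notably, the framework requires no language-specific features, so it can readily be extended and applied to other languages.
Every language has its own characteristics, and because our cross-language method focuses mainly on what languages have in common, it cannot effectively capture the characteristics unique to Chinese discourse classification. We consolidated the Chinese discourse data annotated in our experiments to form the Discourse TreeBank for Chinese (DTBC). DTBC inherits the construction principles of the English discourse treebank while undertaking extensive localization for the characteristics unique to Chinese. Our annotation adds a discourse layer to all 890 news articles of the Chinese TreeBank 5.0 (CTB5). DTBC is the first open, large-scale Chinese discourse treebank, and it provides essential annotated data for future research on Chinese discourse analysis.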
Zhou, Lanjun.
Thesis (Ph.D.)--Chinese University of Hong Kong, 2014.
Includes bibliographical references (leaves 98-104).
Abstracts also in Chinese.
Title from PDF title page (viewed on 20, December, 2016).
Detailed summary in vernacular field only.
47

"A methodology for constructing compact Chinese font libraries by radical composition." Chinese University of Hong Kong, 1993. http://library.cuhk.edu.hk/record=b5887716.

Abstract:
by Wai-Yip Tung.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1993.
Includes bibliographical references (leaves 55-56).
Chapter 1. --- Introduction --- p.1
Chapter 1.1. --- Previous work --- p.2
Chapter 1.1.1. --- A Chinese METAFONT --- p.2
Chapter 1.1.2. --- Chinese character generator --- p.2
Chapter 1.1.3. --- Chinese Character Design System CCDS --- p.2
Chapter 1.2. --- Goals of the thesis --- p.3
Chapter 1.3. --- Overview of the thesis --- p.3
Chapter 2. --- Construction of Chinese Characters --- p.5
Chapter 2.1 --- Introduction --- p.5
Chapter 2.2. --- liu shu (六書): Six Principles of Chinese Character Construction --- p.5
Chapter 2.3. --- Structural Analysis of Chinese Characters --- p.7
Chapter 2.3.1. --- Left-Right Structure --- p.8
Chapter 2.3.2. --- Top-Bottom Structure --- p.9
Chapter 2.3.3. --- Inside-Outside Structure --- p.10
Chapter 2.3.4. --- Singleton Structure --- p.10
Chapter 2.4. --- Usage frequency of radicals --- p.11
Chapter 2.5. --- Usage frequency of Bushou --- p.11
Chapter 2.6. --- Usage frequency of Shengpang --- p.13
Chapter 2.7. --- Summary --- p.15
Chapter 3. --- Composition by Radicals --- p.17
Chapter 3.1. --- Introduction --- p.17
Chapter 3.2. --- Transforming radicals --- p.18
Chapter 3.3. --- Quality of transformed radicals --- p.19
Chapter 3.4. --- Lower level components --- p.20
Chapter 3.5. --- Summary --- p.23
Chapter 4. --- Automatic Hinting for Chinese Font --- p.24
Chapter 4.1 --- Introduction --- p.24
Chapter 4.2. --- Automatic hinting for Chinese font --- p.26
Chapter 4.3. --- Stroke recognition --- p.30
Chapter 4.3.1. --- Identify horizontal lines --- p.31
Chapter 4.3.2. --- Identify stroke segments --- p.31
Chapter 4.3.3. --- Stroke recognition --- p.32
Chapter 4.4. --- Regularize stroke width --- p.33
Chapter 4.5. --- Grid-fitting horizontal and vertical strokes --- p.33
Chapter 4.6. --- Grid-fitting radicals --- p.37
Chapter 4.7. --- Summary --- p.39
Chapter 5. --- RADIT - A Chinese Font Editor --- p.41
Chapter 5.1. --- Introduction --- p.41
Chapter 5.2. --- RADIT basics --- p.41
Chapter 5.2.1. --- Character selection window --- p.42
Chapter 5.2.2. --- Character window --- p.42
Chapter 5.2.3. --- Tools Palette --- p.43
Chapter 5.2.4. --- Toolbar --- p.43
Chapter 5.2.5. --- Zooming the character window --- p.44
Chapter 5.3. --- Editing a character --- p.44
Chapter 5.3.1. --- Selecting handles --- p.44
Chapter 5.3.2. --- Adding lines and curves --- p.45
Chapter 5.3.3. --- Delete control points --- p.45
Chapter 5.3.4. --- Moving control points --- p.45
Chapter 5.3.5. --- Cut and paste --- p.46
Chapter 5.3.6. --- Undo --- p.46
Chapter 5.4. --- Adding radicals to a character --- p.46
Chapter 5.5. --- Rasterizing and grid-fitting a character --- p.47
Chapter 5.5.1. --- Rasterizing a character --- p.48
Chapter 5.5.2. --- Stroke detection and regularization --- p.48
Chapter 5.5.3. --- Grid-fitting and rasterizing a character --- p.49
Chapter 6. --- Conclusions --- p.50
Appendix A: Sample Fonts --- p.52
References --- p.55
48

"Hybrid tag-set for natural language processing." 1999. http://library.cuhk.edu.hk/record=b5889925.

Abstract:
Leung Wai Kwong.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1999.
Includes bibliographical references (leaves 90-95).
Abstracts in English and Chinese.
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Motivation --- p.1
Chapter 1.2 --- Objective --- p.3
Chapter 1.3 --- Organization of thesis --- p.3
Chapter 2 --- Background --- p.5
Chapter 2.1 --- Chinese Noun Phrases Parsing --- p.5
Chapter 2.2 --- Chinese Noun Phrases --- p.6
Chapter 2.3 --- Problems with Syntactic Parsing --- p.11
Chapter 2.3.1 --- Conjunctive Noun Phrases --- p.11
Chapter 2.3.2 --- De-de Noun Phrases --- p.12
Chapter 2.3.3 --- Compound Noun Phrases --- p.13
Chapter 2.4 --- Observations --- p.15
Chapter 2.4.1 --- Inadequacy in Part-of-Speech Categorization for Chinese NLP --- p.16
Chapter 2.4.2 --- The Need of Semantic in Noun Phrase Parsing --- p.17
Chapter 2.5 --- Summary --- p.17
Chapter 3 --- Hybrid Tag-set --- p.19
Chapter 3.1 --- Objectives --- p.19
Chapter 3.1.1 --- Resolving Parsing Ambiguities --- p.19
Chapter 3.1.2 --- Investigation of Nominal Compound Noun Phrases --- p.20
Chapter 3.2 --- Definition of Hybrid Tag-set --- p.20
Chapter 3.3 --- Introduction to Cilin --- p.21
Chapter 3.4 --- Problems with Cilin --- p.23
Chapter 3.4.1 --- Unknown words --- p.23
Chapter 3.4.2 --- Multiple Semantic Classes --- p.25
Chapter 3.5 --- Introduction to Chinese Word Formation --- p.26
Chapter 3.5.1 --- Disyllabic Word Formation --- p.26
Chapter 3.5.2 --- Polysyllabic Word Formation --- p.28
Chapter 3.5.3 --- Observation --- p.29
Chapter 3.6 --- Automatic Assignment of Hybrid Tag to Chinese Word --- p.31
Chapter 3.7 --- Summary --- p.34
Chapter 4 --- Automatic Semantic Assignment --- p.35
Chapter 4.1 --- Previous Researches on Semantic Tagging --- p.36
Chapter 4.2 --- SAUW - Automatic Semantic Assignment of Unknown Words --- p.37
Chapter 4.2.1 --- POS-to-SC Association (Process 1) --- p.38
Chapter 4.2.2 --- Morphology-based Deduction (Process 2) --- p.39
Chapter 4.2.3 --- Di-syllabic Word Analysis (Process 3 and 4) --- p.41
Chapter 4.2.4 --- Poly-syllabic Word Analysis (Process 5) --- p.47
Chapter 4.3 --- Illustrative Examples --- p.47
Chapter 4.4 --- Evaluation and Analysis --- p.49
Chapter 4.4.1 --- Experiments --- p.49
Chapter 4.4.2 --- Error Analysis --- p.51
Chapter 4.5 --- Summary --- p.52
Chapter 5 --- Word Sense Disambiguation --- p.53
Chapter 5.1 --- Introduction to Word Sense Disambiguation --- p.54
Chapter 5.2 --- Previous Works on Word Sense Disambiguation --- p.55
Chapter 5.2.1 --- Linguistic-based Approaches --- p.56
Chapter 5.2.2 --- Corpus-based Approaches --- p.58
Chapter 5.3 --- Our Approach --- p.60
Chapter 5.3.1 --- Bi-gram Co-occurrence Probabilities --- p.62
Chapter 5.3.2 --- Tri-gram Co-occurrence Probabilities --- p.63
Chapter 5.3.3 --- Design consideration --- p.65
Chapter 5.3.4 --- Error Analysis --- p.67
Chapter 5.4 --- Summary --- p.68
Chapter 6 --- Hybrid Tag-set for Chinese Noun Phrase Parsing --- p.69
Chapter 6.1 --- Resolving Ambiguous Noun Phrases --- p.70
Chapter 6.1.1 --- Experiment --- p.70
Chapter 6.1.2 --- Results --- p.72
Chapter 6.2 --- Summary --- p.78
Chapter 7 --- Conclusion --- p.80
Chapter 7.1 --- Summary --- p.80
Chapter 7.2 --- Difficulties Encountered --- p.83
Chapter 7.2.1 --- Lack of Training Corpus --- p.83
Chapter 7.2.2 --- Features of Chinese word formation --- p.84
Chapter 7.2.3 --- Problems with linguistic sources --- p.85
Chapter 7.3 --- Contributions --- p.86
Chapter 7.3.1 --- Enrichment to the Cilin --- p.86
Chapter 7.3.2 --- Enhancement in syntactic parsing --- p.87
Chapter 7.4 --- Further Researches --- p.88
Chapter 7.4.1 --- Investigation into words that undergo semantic changes --- p.88
Chapter 7.4.2 --- Incorporation of more information into the hybrid tag-set --- p.89
Appendix A --- POS Tag-set by Tsinghua University (清華大學) --- p.96
Appendix B --- Morphological Rules --- p.100
Appendix C --- Syntactic Rules for Di-syllabic Words Formation --- p.104
49

"ACTION: automatic classification for Chinese documents." Chinese University of Hong Kong, 1994. http://library.cuhk.edu.hk/record=b5895378.

Abstract:
by Jacqueline, Wai-ting Wong.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1994.
Includes bibliographical references (p. 107-109).
Abstract --- p.i
Acknowledgement --- p.iii
List of Tables --- p.viii
List of Figures --- p.ix
Chapter 1 --- Introduction --- p.1
Chapter 2 --- Chinese Information Processing --- p.6
Chapter 2.1 --- Chinese Word Segmentation --- p.7
Chapter 2.1.1 --- Statistical Method --- p.8
Chapter 2.1.2 --- Probabilistic Method --- p.9
Chapter 2.1.3 --- Linguistic Method --- p.10
Chapter 2.2 --- Automatic Indexing --- p.10
Chapter 2.2.1 --- Title Indexing --- p.11
Chapter 2.2.2 --- Free-Text Searching --- p.11
Chapter 2.2.3 --- Citation Indexing --- p.12
Chapter 2.3 --- Information Retrieval Systems --- p.13
Chapter 2.3.1 --- Users' Assessment of IRS --- p.13
Chapter 2.4 --- Concluding Remarks --- p.15
Chapter 3 --- Survey on Classification --- p.16
Chapter 3.1 --- Text Classification --- p.17
Chapter 3.2 --- Survey on Classification Schemes --- p.18
Chapter 3.2.1 --- Commonly Used Classification Systems --- p.18
Chapter 3.2.2 --- Classification of Newspapers --- p.31
Chapter 3.3 --- Concluding Remarks --- p.37
Chapter 4 --- System Models and the ACTION Algorithm --- p.38
Chapter 4.1 --- Factors Affecting Systems Performance --- p.38
Chapter 4.1.1 --- Specificity --- p.39
Chapter 4.1.2 --- Exhaustivity --- p.40
Chapter 4.2 --- Assumptions and Scope --- p.42
Chapter 4.2.1 --- Assumptions --- p.42
Chapter 4.2.2 --- System Scope - Data Flow Diagrams --- p.44
Chapter 4.3 --- System Models --- p.48
Chapter 4.3.1 --- Article --- p.48
Chapter 4.3.2 --- Matching Table --- p.49
Chapter 4.3.3 --- Forest --- p.51
Chapter 4.3.4 --- Matching --- p.53
Chapter 4.4 --- Classification Rules --- p.54
Chapter 4.5 --- The ACTION Algorithm --- p.56
Chapter 4.5.1 --- Algorithm Design Objectives --- p.56
Chapter 4.5.2 --- Measuring Node Significance --- p.56
Chapter 4.5.3 --- Pseudocodes --- p.61
Chapter 4.6 --- Concluding Remarks --- p.64
Chapter 5 --- Analysis of Results and Validation --- p.66
Chapter 5.1 --- Seeking for Exhaustivity Rather Than Specificity --- p.67
Chapter 5.1.1 --- The News Article --- p.67
Chapter 5.1.2 --- The Matching Results --- p.68
Chapter 5.1.3 --- The Keyword Values --- p.68
Chapter 5.1.4 --- Analysis of Classification Results --- p.71
Chapter 5.2 --- Catering for Hierarchical Relationships Between Classes and Subclasses --- p.72
Chapter 5.2.1 --- The News Article --- p.72
Chapter 5.2.2 --- The Matching Results --- p.73
Chapter 5.2.3 --- The Keyword Values --- p.74
Chapter 5.2.4 --- Analysis of Classification Results --- p.75
Chapter 5.3 --- A Representative With Zero Occurrence --- p.78
Chapter 5.3.1 --- The News Article --- p.78
Chapter 5.3.2 --- The Matching Results --- p.79
Chapter 5.3.3 --- The Keyword Values --- p.80
Chapter 5.3.4 --- Analysis of Classification Results --- p.81
Chapter 5.4 --- Statistical Analysis --- p.83
Chapter 5.4.1 --- Classification Results with Highest Occurrence Frequency --- p.83
Chapter 5.4.2 --- Classification Results with Zero Occurrence Frequency --- p.85
Chapter 5.4.3 --- Distribution of Classification Results on Level Numbers --- p.86
Chapter 5.5 --- Concluding Remarks --- p.87
Chapter 5.5.1 --- Advantageous Characteristics of ACTION --- p.88
Chapter 6 --- Conclusion --- p.93
Chapter 6.1 --- Perspectives in Document Representation --- p.93
Chapter 6.2 --- Classification Schemes --- p.95
Chapter 6.3 --- Classification System Model --- p.95
Chapter 6.4 --- The ACTION Algorithm --- p.96
Chapter 6.5 --- Advantageous Characteristics of the ACTION Algorithm --- p.96
Chapter 6.6 --- Testing and Validating the ACTION algorithm --- p.98
Chapter 6.7 --- Future Work --- p.99
Chapter 6.8 --- A Final Remark --- p.100
Appendix A --- System Models --- p.102
Appendix B --- Classification Rules --- p.104
Appendix C --- Node Significance Definitions --- p.105
References --- p.107
50

Kuzuoglu, Ulug. "Codes of Modernity: Infrastructures of Language and Chinese Scripts in an Age of Global Information Revolution." Thesis, 2018. https://doi.org/10.7916/D80C6C6B.

Abstract:
This dissertation explores the global history of Chinese script reforms—the effort to phoneticize Chinese language and/or simplify the writing system—from its inception in the 1890s to its demise in the 1980s. These reforms took place at the intersection of industrialization, colonialism, and new information technologies, such as alphabet-based telegraphy and breakthroughs in printing technologies. As these social and technological transformations put unprecedented pressure on knowledge management and the use of mental and clerical labor, many Chinese intellectuals claimed that learning Chinese characters consumed too much time and mental energy. Chinese script reforms, this dissertation argues, were an effort to increase speed in producing, transmitting, and accessing information, and thus meet the demands of the industrializing knowledge economy. The industrializing knowledge economy that this dissertation explores was built on and sustained by a psychological understanding of the human subject as a knowledge machine, and it was part of a global moment in which the optimization of labor in knowledge production was a key concern for all modernizing economies. While Chinese intellectuals were inventing new signs of inscription, American behavioral psychologists, Soviet psycho-economists, and Central Asian and Ottoman technicians were all experimenting with new scripts in order to increase mental efficiency and productivity. This dissertation reveals the intimate connections between the Chinese and non-Chinese script engineering projects that were taking place synchronically across the world. The chapters of this work demonstrate for the first time, for instance, that the simplification of Chinese characters in the 1920s and 1930s was intimately connected to the discipline of behavioral psychology in the US. 
The first generation of Chinese psychologists employed the American psychologists’ methods to track eye movements, count word-frequencies, and statistically analyze the speed of reading, writing, and memorizing in order to simplify and “rationalize” the Chinese writing system in an effort to discipline and optimize mental labor. Other chapters explore the issue of mental and clerical optimization by finding the origins of the Chinese Latin Alphabet (CLA), the mother of pinyin, in hitherto unknown Eurasian connections. The CLA, the pages of this work show, was the product of a transnational exchange that involved Ottoman and Transcaucasian typographers as well as Russian engineers and Chinese communists who sought efficiency in knowledge production through inventing new scripts. Situating the Chinese script reforms at this global intersection of psychology, economy, and linguistics, this dissertation examines the global connections and forces that turned the human subject into a knowledge worker who was cognitively managed through education, literacy, propaganda, and other measures of organizing information, all of which had the script at the center. The search for efficiency and productivity, the core values of industrialism, lay at the heart of script reforms in China, but this search was inseparable from linguistic orders and political ambitions. Even if writing, transmitting, and learning a phonetic script could theoretically be easier and more efficient than Chinese characters, the alphabet opened a veritable Pandora’s Box around the issue of selection: given the complex linguistic landscape in China, which speech was a phonetic script supposed to represent? There were myriad languages spoken throughout the empire and the subsequent nation-state, most of which were mutually incomprehensible.
Mandarin as spoken in Beijing was different from that spoken in the south, and “topolects” or regional languages such as Min or Cantonese were to Mandarin what Romanian is to English. As a linguistic life-or-death issue, phonetic scripts stood for the infrastructural possibilities and limitations in the representation of speeches. Some scripts, such as Lao Naixuan’s phonetic script composed of more than a hundred signs, were capable of representing multiple Mandarin and non-Mandarin speeches, whereas others, such as Phonetic Symbols, which has only thirty-seven syllabic signs, represented only one speech, i.e., Mandarin. Using Mandarin-oriented scripts to transcribe non-Mandarin speeches was like writing English with fifteen letters, hence the acrimonious disputes that fill the pages of this dissertation. Succinctly put, it was at the level of script invention that Chinese and non-Chinese actors engineered different infrastructures not only for laboring minds but also for the social world of Chinese languages. The history of information technologies and knowledge economy in China was thus inseparable from the world of speech and language, as each script offered a new potential to reassemble the written matter and the speaking mind in a different way. “Codes of Modernity” thus conceptualizes the script itself as an infrastructural medium. A script was not merely a passive carrier of information, but an existential artifact. Building on an expanding literature on infrastructures, it endorses the observation that infrastructures, technologies, and the social world around them work in a recursive loop. An infrastructure is not just the physical object that permits the flow of information, goods, ideas, and people, but a sociotechnical product that enables the experience of culture, while imposing constraints on it at the same time.
Like electricity grids, transportation systems, and sewage canals, the experience of scripts as infrastructures is the experience of thought worlds. After a long tradition of structuralism and poststructuralism that sought to understand the world through the semiotic prism of language, “Codes of Modernity” argues that it is time for an infrastructuralism that excavates the indispensable media that enable the production of language and thought.