Academic literature on the topic 'Lexicography – Data processing'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Lexicography – Data processing.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Lexicography – Data processing"

1

Ochilova, Mehriniso. "ELECTRONIC DICTIONARY–LEXICOGRAPHY DEVELOPMENT AS A NEW STAGE PRODUCT." INTERNATIONAL JOURNAL OF WORD ART 6, no. 3 (June 30, 2020): 124–30. http://dx.doi.org/10.26739/2181-9297-2020-6-17.

Full text
Abstract:
The article discusses the importance of automating the order of data processing and creating new active lexicographic systems, resulting in the creation of automated (electronic) dictionaries from traditional dictionaries. In particular, information was provided on the advantages and convenience of electronic dictionaries available on the Internet over traditional dictionaries. There are also comments on the need for electronic dictionaries in the Uzbek language, such as the electronic dictionary ABBYY Lingvo or the electronic dictionary Urban Dictionary. It was noted that the issue of creating a terminological database (TBD) and the creation of Uzbek cyber lexicography
APA, Harvard, Vancouver, ISO, and other styles
2

Ljubešić, Nikola. "'Deep lexicography' – Fad or Opportunity?" Rasprave Instituta za hrvatski jezik i jezikoslovlje 46, no. 2 (October 30, 2020): 839–52. http://dx.doi.org/10.31724/rihjj.46.2.21.

Full text
Abstract:
In recent years, we have been witnessing staggering improvements in various semantic data processing tasks due to developments in the area of deep learning, ranging from image and video processing to speech processing and natural language understanding. In this paper, we discuss the opportunities and challenges that these developments pose for the area of electronic lexicography. We primarily focus on the concept of representation learning of the basic elements of language, namely words, and the applicability of these word representations to lexicography. We first discuss well-known approaches to learning static representations of words, the so-called word embeddings, and their usage in lexicography-related tasks such as semantic shift detection and cross-lingual prediction of lexical features such as concreteness and imageability. We wrap up the paper with the most recent developments in the area of word representation learning in the form of learning dynamic, context-aware representations of words, showcasing some dynamic word embedding examples and discussing improvements on the lexicography-relevant tasks of word sense disambiguation and word sense induction.
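The semantic shift detection that this abstract mentions can be illustrated with a toy sketch: given static word vectors trained separately on two time periods (and mapped into a shared space), the cosine distance between a word's two vectors serves as a shift score. The vectors below are invented for illustration; real work would use embeddings learned from period-specific corpora.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm

# Toy vectors for the same words in two aligned embedding spaces
# (in practice these would come from corpora of different periods).
period_a = {"mouse": [0.9, 0.1, 0.0], "bread": [0.1, 0.8, 0.2]}
period_b = {"mouse": [0.2, 0.1, 0.9], "bread": [0.1, 0.9, 0.1]}

def shift_score(word):
    """Semantic shift as cosine distance between the two period vectors."""
    return 1.0 - cosine(period_a[word], period_b[word])

ranked = sorted(period_a, key=shift_score, reverse=True)
print(ranked)  # ['mouse', 'bread']
```

Here "mouse" ranks first because its toy vector moved across the space, mimicking the animal-to-device sense shift that such methods are designed to surface.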
APA, Harvard, Vancouver, ISO, and other styles
3

Nkwenti Azeh, Blaise. "Descriptive tools for electronic processing of dictionary data: Studies in computational lexicography." Machine Translation 4, no. 4 (1989): 303–6. http://dx.doi.org/10.1007/bf00713704.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Romary, Laurent, and Andreas Witt. "Méthodes pour la représentation informatisée de données lexicales / Methoden der Speicherung lexikalischer Daten [Methods of saving lexical data]." Lexicographica 30, no. 1 (October 10, 2014): 152–86. http://dx.doi.org/10.1515/lexi-2014-0006.

Full text
Abstract:
In recent years, new developments in the area of lexicography have not only altered the management, processing and publishing of lexicographical data, but also created new types of products such as electronic dictionaries and thesauri. These expand the range of possible uses of lexical data and support users with more flexibility, for instance in assisting human translation. In this article, we give a short and easy-to-understand introduction to the problematic nature of the storage, display and interpretation of lexical data. We then describe the main methods and specifications used to build and represent lexical data. This paper is aimed at the following groups of people: linguists, lexicographers, IT specialists, computational linguists and all others who wish to learn more about the modelling, representation and visualization of lexical knowledge. This paper is written in two languages: French and German.
APA, Harvard, Vancouver, ISO, and other styles
5

Dmitryuk, Natalya, and Galina Abramova. "Associative Dictionaries as an Ethnic Mental Phenomenon: Basic Values in the Core of Ethnic Group Language Consciousness." PSYCHOLINGUISTICS 30, no. 2 (August 4, 2021): 59–84. http://dx.doi.org/10.31470/2309-1797-2021-30-2-59-84.

Full text
Abstract:
Introduction. Associative research is widely practiced in the sciences related to linguistics as an interdisciplinary approach to studying the relationship of language with consciousness, the psyche, and human culture; the corpus of associative data we have created in the Kazakh language replenishes associative lexicography in the context of broad intercultural comparisons. Materials and methods. The dictionaries of the associative norms of the Kazakh language (Dmitryuk, 1978; Dmitryuk, Moldalieva et al., 2014), prepared on the basis of data from free associative experiments (FAE) with 1000 Kazakh students, contain unique information about the mentality and ethnocultural characteristics of the Kazakh ethnic group in the Soviet and modern periods. The FAE is a well-known method of employing associative experiment data and a reliable way to access a person's linguistic consciousness; statistical processing of the FAE associative data provided for an analytical comparison of the hierarchical sequence of basic Kazakh values as the core of linguistic consciousness – its central and peripheral zones – in the Soviet and post-Soviet periods. Results. The diachronic and interethnic comparative analysis showed that such basic Kazakh values as religious beliefs, freedom, sovereignty, and state symbols have been subject to significant changes, while ethnic cultural kernel preferences remained traditionally unchanged, constituting the specific essence of the national mentality: attitudes to motherland, mother, elders, and men, with gender and age as specific peculiarities in the hierarchy of family relations. Conclusions. The intralingual comparison of the dictionaries' contents revealed a very stable body of unchanging value priorities, indicating a fairly strong core and a significant degree of vitality in Kazakh society.
The work contributes to intercultural associative research and associative lexicography and provides for the development of promising research in psycholinguistics in Kazakhstan.
APA, Harvard, Vancouver, ISO, and other styles
6

Gao, Guang Xia, Zhi Wang Zhang, and Shi Yong Kang. "Chinese Semantic Word-Formation Analysis Using FKP-MCO Classifier Based on Layered and Weighted GED." Applied Mechanics and Materials 284-287 (January 2013): 3044–50. http://dx.doi.org/10.4028/www.scientific.net/amm.284-287.3044.

Full text
Abstract:
For Chinese information processing, automatic classification based on a large-scale database for different patterns of semantic word-formation can remarkably improve the identification of unregistered words, automatic lexicography, semantic analysis, and other applications. However, owing to noise, anomalies, nonlinear characteristics, class imbalance, and other uncertainties in word-formation data, the predictive performance of the multi-criteria optimization classifier (MCOC) and other traditional data mining approaches rapidly degenerates. In this paper we put forward a novel MCOC with fuzzification, kernel, and penalty factors (FKP-MCOC) based on layered and weighted graph edit distance (GED): first, the layered and weighted GEDs between each semantic word-formation graph and prototype graphs are calculated and used as the dissimilarity measure; then the normalized GEDs are embedded into a new feature vector space, and an FKP-MCO classifier based on this feature vector space is built for predicting the patterns of semantic word-formation. Our experimental results on Chinese word-formation analysis and a comparison with support vector machines (SVM) show that our proposed approach can increase the separation of different patterns and the predictive performance for the semantic pattern of a new compound word.
APA, Harvard, Vancouver, ISO, and other styles
7

Babović, Dželila, and Madžida Mašić. "Literary Heritage of Bosnia and Herzegovina." Prilozi za orijentalnu filologiju, no. 70 (November 30, 2021): 185–207. http://dx.doi.org/10.48116/issn.2303-8586.2020.70.185.

Full text
Abstract:
The manuscript collection of the Specialized Library “Behram-beg” in Tuzla contains 131 manuscript codices written in Arabic, Turkish, Persian and Bosnian. The largest part of the collection consists of manuscripts of the Qur’an, works from the Qur’anic disciplines, hadith sciences, Islamic law, dogmatics, prayers, sermons, grammar, lexicography and belles lettres. Of particular value to this collection are the works of Bosniak authors and works by other authors copied by Bosniaks, as well as works that are rarely found in other manuscript collections and those written in Arabic script in the Bosnian language. Of the total number of manuscripts stored in the collection of the Behram-beg library, 78 have been digitised. We will present a part of these manuscripts in this paper, trying to draw attention to the growing importance of digital data processing and storage with the aim of valid protection, study and valorization of written heritage. Digital archives as safe places of storage on the one hand, and top presenters of cultural heritage to a large number of users on the other, can reliably guarantee that times of “archival silence” have passed and that the manuscript treasure will experience its reaffirmation and increasingly arouse the interest of researchers and scientists around the world.
APA, Harvard, Vancouver, ISO, and other styles
8

Gantar, Polona. "Dictionary of Modern Slovene." Rasprave Instituta za hrvatski jezik i jezikoslovlje 46, no. 2 (October 30, 2020): 589–602. http://dx.doi.org/10.31724/rihjj.46.2.7.

Full text
Abstract:
The ability to process language data has become fundamental to the development of technologies in various areas of human life in the digital world. The development of digitally readable linguistic resources, methods, and tools is, therefore, also a key challenge for the contemporary Slovene language. This challenge has been recognized in the Slovene language community both at the professional and state level and has been the subject of many activities over the past ten years, which will be presented in this paper. The idea of a comprehensive dictionary database covering all levels of linguistic description in modern Slovene, from the morphological and lexical levels to the syntactic level, had already been formulated within the framework of the European Social Fund's Communication in Slovene (2008-2013) project; the Slovene Lexical Database was also created within the framework of this project. Two goals were pursued in designing the Slovene Lexical Database (SLD): creating linguistic descriptions of Slovene intended for human users that would also be useful for the machine processing of Slovene. Ever since the construction of the first Slovene corpus, it has become evident that there is a need for a description of modern Slovene based on real language data, and that it is necessary to understand the needs of language users to create useful language reference works. It also became apparent that only the digital medium enables the comprehensiveness of language description and that the design of the database must be adapted to it from the start. Also, the description must follow best practices as closely as possible in terms of formats and international standards, as this enables the inclusion of Slovene into a wider network of resources, such as Linked Open Data, BabelNet, and ELEXIS.
Due to time pressures and trends in lexicography, procedures to automate the extraction of linguistic data from corpora and the inclusion of crowdsourcing into the lexicographic process were taken into consideration. Following the essential idea of creating an all-inclusive digital dictionary database for Slovene, a few independent databases have been created over the past two years: the Collocations Dictionary of Modern Slovene, and the automatically generated Thesaurus of Modern Slovene, both of which also exist as independent online dictionary portals. One of the novelties that we put forward together with both dictionaries is the ‘responsive dictionary’ concept, which includes crowdsourcing methods. Ultimately, the Digital Dictionary Database provides all (other) levels of linguistic description: the morphological level with the Sloleks database upgrade, the phraseological level with the construction of a multi-word expressions lexicon, and the syntactic level with the formalization of Slovene verb valency patterns. Each of these databases contains its specific language data that will ultimately be included in the comprehensive Slovene Digital Dictionary Database, which will represent basic linguistic descriptions of Slovene both for the human and machine user.
APA, Harvard, Vancouver, ISO, and other styles
9

Matvieieva, Svitlana A., Nataliya Ye Lemish, Alla A. Zernetska, Volodymyr O. Babych, and Maryna A. Torgovets. "English-Ukrainian Parallel Corpus: Prerequisites for Building and Practical Use in Translation Studies." Studies about Languages 1, no. 40 (July 13, 2022): 61–74. http://dx.doi.org/10.5755/j01.sal.40.1.30735.

Full text
Abstract:
Consistent demand for highly professional translators drives researchers' and programmers' continuous attempts to develop reliable tools both for improving translation quality and for facilitating translators' work. The last ten years have brought parallel and comparable corpora into the focus of Ukrainian scientists' attention. The aim of the paper is to specify the prerequisites for building the English-Ukrainian parallel corpus and describe its application in Translation Studies. A parallel corpus as a separate type of linguistic corpus cannot be built without alignment, which enables placing and extracting corresponding sentences/paragraphs of source and target texts in one space. To create parallel corpora, it is necessary to perform additional text preparation. The Sketch Engine system (an example of a web-oriented system for work with corpora) can offer a solution for annotation with Excel. However, Sketch Engine lacks artificial intelligence techniques for further word processing. It is probable that employment of a neural network will in the future enable text alignment in parallel corpora instead of system users. Data from parallel corpora can be used in translation lexicography, comparative lexico-grammatical works, studies in the theory and practice of translation, language teaching, and the development of machine translation systems. Corpus-based translation analysis is extremely relevant to identifying translation solutions that can only be explored on the basis of translation products. This is stipulated by the rather frequent absence of dictionary equivalents in most contexts and by the ready evidence of possible translation variants in parallel corpora, which show the usage of a language unit in a wide range of contexts.
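The sentence alignment step this abstract describes can be approximated, in the simplest length-based spirit of Gale and Church, by a dynamic program that pairs source and target sentences by character length; real aligners add 2:1 merges and a statistical length model. The sentences and the gap penalty below are invented for illustration.

```python
def align(src, tgt, gap=10.0):
    """Align sentence lists with 1:1/1:0/0:1 moves, minimizing length mismatch."""
    n, m = len(src), len(tgt)
    INF = float("inf")
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == INF:
                continue
            if i < n and j < m:  # 1:1 match, penalized by length difference
                c = cost[i][j] + abs(len(src[i]) - len(tgt[j]))
                if c < cost[i + 1][j + 1]:
                    cost[i + 1][j + 1], back[i + 1][j + 1] = c, (i, j, "match")
            if i < n and cost[i][j] + gap < cost[i + 1][j]:  # skip a source sentence
                cost[i + 1][j], back[i + 1][j] = cost[i][j] + gap, (i, j, "del")
            if j < m and cost[i][j] + gap < cost[i][j + 1]:  # skip a target sentence
                cost[i][j + 1], back[i][j + 1] = cost[i][j] + gap, (i, j, "ins")
    # Trace back the cheapest path to recover the matched sentence pairs.
    pairs, i, j = [], n, m
    while (i, j) != (0, 0):
        pi, pj, op = back[i][j]
        if op == "match":
            pairs.append((i - 1, j - 1))
        i, j = pi, pj
    return sorted(pairs)

src = ["The cat sleeps.", "It is raining a lot today."]
tgt = ["Kit spyt.", "Sohodni yde silnyi doshch."]
print(align(src, tgt))  # [(0, 0), (1, 1)]
```

The returned index pairs are exactly what a parallel corpus needs in order to place corresponding source and target sentences "in one space" for later querying.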
APA, Harvard, Vancouver, ISO, and other styles
10

Mairal-Usón, Ricardo, and Francisco Cortés-Rodríguez. "Automatically Representing TExt Meaning via an Interlingua-based System (ARTEMIS). A further step towards the computational representation of RRG." Journal of Computer-Assisted Linguistic Research 1, no. 1 (June 26, 2017): 61. http://dx.doi.org/10.4995/jclr.2017.7788.

Full text
Abstract:
Within the framework of FUNK Lab – a virtual laboratory for natural language processing inspired by a functionally-oriented linguistic theory, Role and Reference Grammar – a number of computational resources have been built dealing with different aspects of language and with applications in different scientific domains, i.e. terminology, lexicography, sentiment analysis, document classification, text analysis, data mining, etc. One of these resources is ARTEMIS (Automatically Representing TExt Meaning via an Interlingua-based System), which departs from the pioneering work of Periñán-Pascual (2013) and Periñán-Pascual & Arcas (2014). This computational tool is a proof-of-concept prototype which allows the automatic generation of a conceptual logical structure (CLS) (cf. Mairal-Usón, Periñán-Pascual and Pérez 2012; Van Valin and Mairal-Usón 2014), that is, a fully specified semantic representation of an input text on the basis of a reduced sample of sentences. The primary aim of this paper is to develop the syntactic rules that form part of the computational grammar for the representation of simple clauses in English. More specifically, this work focuses on the format of those syntactic rules that account for the upper levels of the RRG Layered Structure of the Clause (LSC), that is, the core (and the level-1 construction associated with it), the clause and the sentence (Van Valin 2005). In essence, this analysis, together with that in Cortés-Rodríguez and Mairal-Usón (2016), offers an almost complete description of the computational grammar behind the LSC for simple clauses.
APA, Harvard, Vancouver, ISO, and other styles

Dissertations / Theses on the topic "Lexicography – Data processing"

1

Mok, Yuen-kwan Sally, and 莫婉君. "Multilingual information retrieval on the world wide web: the development of a Cantonese-Dagaare-English trilingual electronic lexicon." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2006. http://hub.hku.hk/bib/B36399085.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

So, Keith Kam-Ho (Computer Science & Engineering, Faculty of Engineering, UNSW). "Lexicographic path searches for FPGA routing." University of New South Wales, Computer Science & Engineering, 2008. http://handle.unsw.edu.au/1959.4/41295.

Full text
Abstract:
This dissertation reports on studies of the application of lexicographic graph searches to solve problems in FPGA detailed routing. Our contributions include the derivation of iteration limits for scalar implementations of negotiation congestion for standard floating point types and the identification of pathological cases for path choice. In the study of the routability-driven detailed FPGA routing problem, we show universal detailed routability is NP-complete based on a related proof by Lee and Wong. We describe the design of a lexicographic composition operator of totally-ordered monoids as path cost metrics and show its optimality under an adapted A* search. Our new router, CornNC, based on lexicographic composition of congestion and wirelength, established a new minimum track count for the FPGA Place and Route Challenge. For the problem of long-path timing-driven FPGA detailed routing, we show that long-path budgeted detailed routability is NP-complete by reduction to universal detailed routability. We generalise the lexicographic composition to any finite length and verify its optimality under A* search. The application of the timing budget solution of Ghiasi et al. is used to solve the long-path timing budget problem for FPGA connections. Our delay-clamped spiral lexicographic composition design, SpiralRoute, ensures connection based budgets are always met, thus achieves timing closure when it successfully routes. For 113 test routing instances derived from standard benchmarks, SpiralRoute found 13 routable instances with timing closure that were unroutable by a scalar negotiated congestion router and achieved timing closure in another 27 cases when the scalar router did not, at the expense of increased runtime. 
We also study techniques to improve SpiralRoute runtimes, including a data structure of a trie augmented by data stacks for minimum element retrieval, and the technique of step tomonoid elimination in reducing the retrieval depth in a trie of stacks structure.
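The lexicographic composition of path cost metrics that this dissertation formalizes can be mimicked in a few lines of ordinary code: Python tuples compare lexicographically, so a shortest-path search over (congestion, wirelength) cost tuples prefers low congestion first and breaks ties on wirelength. The tiny graph below is invented; the real router operates on FPGA routing resource graphs.

```python
import heapq

def lex_dijkstra(graph, start, goal):
    """Dijkstra where edge costs are (congestion, wirelength) tuples,
    accumulated component-wise and compared lexicographically."""
    INF = (float("inf"), float("inf"))
    best = {start: (0, 0)}
    heap = [((0, 0), start, [start])]
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == goal:
            return cost, path
        if cost > best.get(node, INF):
            continue  # stale heap entry
        for nxt, (cong, wire) in graph.get(node, []):
            new = (cost[0] + cong, cost[1] + wire)
            if new < best.get(nxt, INF):  # tuple '<' is lexicographic
                best[nxt] = new
                heapq.heappush(heap, (new, nxt, path + [nxt]))
    return None

# Two routes A->C: a direct but congested edge, or a longer congestion-free detour.
graph = {
    "A": [("C", (1, 2)), ("B", (0, 3))],
    "B": [("C", (0, 4))],
}
print(lex_dijkstra(graph, "A", "C"))  # ((0, 7), ['A', 'B', 'C'])
```

The detour wins because (0, 7) < (1, 2) lexicographically; the search stays optimal here because both components are additive and non-negative, which is the kind of condition the dissertation's monoid framework makes precise.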
APA, Harvard, Vancouver, ISO, and other styles
3

Tiedemann, Jörg. "Recycling Translations : Extraction of Lexical Data from Parallel Corpora and their Application in Natural Language Processing." Doctoral thesis, Uppsala University, Department of Linguistics, 2003. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-3791.

Full text
Abstract:

The focus of this thesis is on re-using translations in natural language processing. It involves the collection of documents and their translations in an appropriate format, the automatic extraction of translation data, and the application of the extracted data to different tasks in natural language processing.

Five parallel corpora containing more than 35 million words in 60 languages have been collected within co-operative projects. All corpora are sentence aligned and parts of them have been analyzed automatically and annotated with linguistic markup.

Lexical data are extracted from the corpora by means of word alignment. Two automatic word alignment systems have been developed, the Uppsala Word Aligner (UWA) and the Clue Aligner. UWA implements an iterative "knowledge-poor" word alignment approach using association measures and alignment heuristics. The Clue Aligner provides an innovative framework for the combination of statistical and linguistic resources in aligning single words and multi-word units. Both aligners have been applied to several corpora. Detailed evaluations of the alignment results have been carried out for three of them using fine-grained evaluation techniques.

A corpus processing toolbox, Uplug, has been developed. It includes the implementation of UWA and is freely available for research purposes. A new version, Uplug II, includes the Clue Aligner. It can be used via an experimental web interface (UplugWeb).

Lexical data extracted by the word aligners have been applied to different tasks in computational lexicography and machine translation. The use of word alignment in monolingual lexicography has been investigated in two studies. In a third study, the feasibility of using the extracted data in interactive machine translation has been demonstrated. Finally, extracted lexical data have been used for enhancing the lexical components of two machine translation systems.
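The "knowledge-poor" association-measure approach that UWA exemplifies can be sketched with the Dice coefficient over co-occurrence counts in sentence-aligned pairs; the toy bitext below is invented, and real aligners combine several such measures with heuristics.

```python
from collections import Counter
from itertools import product

# Toy sentence-aligned bitext (invented for illustration).
bitext = [
    ("the house", "das haus"),
    ("the cat", "die katze"),
    ("a house", "ein haus"),
]

src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
for src, tgt in bitext:
    s_words, t_words = set(src.split()), set(tgt.split())
    src_count.update(s_words)
    tgt_count.update(t_words)
    pair_count.update(product(s_words, t_words))  # co-occurrence in aligned pairs

def dice(s, t):
    """Dice association between a source and a target word."""
    return 2 * pair_count[s, t] / (src_count[s] + tgt_count[t])

# For each source word, pick the target word it is most strongly associated with.
links = {s: max(tgt_count, key=lambda t: dice(s, t)) for s in src_count}
print(links["house"])  # haus
```

Because "house" and "haus" co-occur in every aligned pair in which either appears, their Dice score is 1.0 and the link is extracted; noisy links for function words like "the" are exactly what the fine-grained evaluations mentioned above are designed to catch.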

APA, Harvard, Vancouver, ISO, and other styles
4

Yang, Li. "Improving Topic Tracking with Domain Chaining." Thesis, University of North Texas, 2003. https://digital.library.unt.edu/ark:/67531/metadc4274/.

Full text
Abstract:
Topic Detection and Tracking (TDT) research has produced some successful statistical tracking systems. While lexical chaining, a non-statistical approach, was also applied to the tracking task by Carthy and Stokes for the 2001 TDT evaluation, an efficient tracking system based on this technology has yet to be developed. In this thesis we investigate two new techniques which can improve Carthy's original design. First, at the core of our system is a semantic domain chainer. This chainer relies not only on the WordNet database for semantic relationships but also on Magnini's semantic domain database, which is an extension of WordNet. The domain-chaining algorithm is a linear algorithm. Second, to handle proper nouns, we gather all of those that occur in a news story together into a chain reserved for proper nouns. In this thesis we also discuss the linguistic limitations of lexical chainers in representing textual meaning.
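The repetition-based chaining idea underlying such systems can be sketched as follows: occurrences of the same term are grouped into a chain as long as consecutive occurrences fall within a fixed gap, and the spans of the resulting chains then hint at topical segments. The tokenization and gap threshold below are invented for illustration; real chainers also follow WordNet relations, not just repetition.

```python
def lexical_chains(tokens, max_gap=3):
    """Group repeated tokens into chains; a chain breaks when the next
    occurrence is more than max_gap positions after the previous one."""
    chains = []
    open_chains = {}  # word -> index of its currently open chain
    for pos, word in enumerate(tokens):
        idx = open_chains.get(word)
        if idx is not None and pos - chains[idx][-1] <= max_gap:
            chains[idx].append(pos)      # extend the open chain
        else:
            open_chains[word] = len(chains)
            chains.append([pos])         # start a new chain for this word
    return [c for c in chains if len(c) > 1]  # keep only true repetitions

tokens = "bank loan bank rate storm rain storm wind".split()
print(lexical_chains(tokens))  # [[0, 2], [4, 6]]
```

The two chains occupy disjoint spans of the text, which is the cue a chain-based segmenter or tracker exploits: few chains cross position 3-4, so a topic boundary is plausible there.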
APA, Harvard, Vancouver, ISO, and other styles
5

Samia, Michel. "Databáze XML pro správu slovníkových dat [XML databases for managing dictionary data]." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2011. http://www.nusl.cz/ntk/nusl-412859.

Full text
Abstract:
This diploma thesis deals with the processing of dictionary data, especially data in XML-based formats. First, the reader is acquainted with the linguistic and lexicographical terms used in this work. Then particular types of lexicographical data formats and specific formats are introduced, and their advantages and disadvantages are discussed. According to previously set criteria, the LMF format was chosen for the design and implementation of a Python application, which focuses especially on the intelligent merging of multiple dictionaries into one. After passing all unit tests, this application was used for processing the LMF dictionaries located on the faculty server of the research group for natural language processing. Finally, the advantages and disadvantages of this application are discussed and ways of further use and extension are suggested.
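The merging task this thesis tackles can be sketched in a much-simplified form (ignoring LMF's actual XML structure) as combining entries by lemma and deduplicating their senses; the sample entries below are invented.

```python
def merge_dictionaries(*dicts):
    """Merge lemma -> list-of-senses dictionaries, preserving sense order
    and dropping duplicate senses for the same lemma."""
    merged = {}
    for d in dicts:
        for lemma, senses in d.items():
            seen = merged.setdefault(lemma, [])
            for sense in senses:
                if sense not in seen:
                    seen.append(sense)  # keep first occurrence only
    return merged

a = {"bank": ["financial institution"], "tree": ["woody plant"]}
b = {"bank": ["financial institution", "river edge"]}
print(merge_dictionaries(a, b)["bank"])  # ['financial institution', 'river edge']
```

A real LMF merger must additionally reconcile grammatical features, examples, and cross-references attached to each sense, which is where the "intelligent" part of the merging lies.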
APA, Harvard, Vancouver, ISO, and other styles
6

Makgabutlane, Kelebohile Hilda. "An investigation into lemmatization in Southern Sotho." Diss., 1996. http://hdl.handle.net/10500/17302.

Full text
Abstract:
Lemmatization refers to the process whereby a lexicographer assigns a specific place in a dictionary to a word which he regards as the most basic form amongst other related forms. The fact that in Bantu languages formative elements can be added to one another in an often seemingly interminable series till quite long words are produced, evokes curiosity as far as lemmatization is concerned. Being aware of the productive nature of Southern Sotho it is interesting to observe how lexicographers go about handling the question of morphological complexities they are normally faced with in the process of arranging lexical items. This study has shown that some difficulties are encountered as far as adhering to the traditional method of alphabetization is concerned. It does not aim at proposing solutions but does point out some considerations which should be borne in mind in the process of lemmatization.
African Languages
M.A. (African Languages)
APA, Harvard, Vancouver, ISO, and other styles
7

"Statistical modeling for lexical chains for automatic Chinese news story segmentation." 2010. http://library.cuhk.edu.hk/record=b5894500.

Full text
Abstract:
Chan, Shing Kai.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2010.
Includes bibliographical references (leaves 106-114).
Abstracts in English and Chinese.
Contents: 1. Introduction (Problem Statement; Motivation for Story Segmentation; Terminologies; Thesis Goals; Thesis Organization). 2. Background Study (Coherence-based Approaches: Defining Coherence, Lexical Chaining, Cosine Similarity, Language Modeling; Feature-based Approaches: Lexical Cues, Audio Cues, Video Cues; Pros and Cons and Hybrid Approaches). 3. Experimental Corpora (The TDT2 and TDT3 Multi-language Text Corpus; Data Preprocessing: Challenges of Lexical Chain Formation on Chinese Text, Word Segmentation for Word Units Extraction, Part-of-speech Tagging for Candidate Words Extraction). 4. Indication of Lexical Cohesiveness by Lexical Chains (Lexical Chain as a Representation of Cohesiveness; Lexical Chain as an Indicator of Story Segments). 5. Indication of Story Boundaries by Lexical Chains (Formal Definition of the Classification Procedures; Theoretical Framework for Segmentation Based on Lexical Chaining; Comparing Segmentation Models). 6. Analysis of Lexical Chain Features as Boundary Indicators (Error Analysis; Window Length in the LRT Model; The Relative Importance of Each Set of Features; The Effect of Removing Timing Information). 7. Conclusions and Future Work.
APA, Harvard, Vancouver, ISO, and other styles

Books on the topic "Lexicography – Data processing"

1

Boguraev, Branimir, and E. J. Briscoe, eds. Computational lexicography for natural language processing. London: Longman, 1988.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
2

Computer corpus lexicography. Edinburgh: Edinburgh University Press, 1998.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
3

COMPLEX '92 (Budapest, Hungary). Papers in computational lexicography, COMPLEX '92. Budapest: Linguistics Institute, Hungarian Academy of Sciences, 1992.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
4

COMPLEX '96 (Budapest, Hungary). Papers in computational lexicography, COMPLEX '96. Budapest: Research Institute for Linguistics, Hungarian Academy of Sciences, 1996.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
5

COMPLEX '94 (Budapest, Hungary). Papers in computational lexicography, COMPLEX '94. Budapest: Research Institute for Linguistics, Hungarian Academy of Sciences, 1994.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
6

Boas, Hans Christian, ed. Multilingual FrameNets in computational lexicography: Methods and applications. New York, NY: Mouton de Gruyter, 2009.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
7

E-lexicography: The internet, digital initiatives and lexicography. London: Continuum, 2011.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
8

Pajzs, Júlia. Számítógép és lexikográfia [Computers and lexicography]. Budapest: Magyar Tudományos Akadémia Nyelvtudományi Intézete, 1990.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
9

Schnorf, Peter. Dynamic instantiation, configuration, and testing of efficient lexical analysers. Zürich: Institut für Informatik Universität Zürich, 1987.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
10

Ci shu yu shu zi hua yan jiu [Dictionaries and digitization research]. Shanghai: Shanghai ci shu chu ban she, 2005.

Find full text
APA, Harvard, Vancouver, ISO, and other styles

Book chapters on the topic "Lexicography – Data processing"

1

Chang, Ching-Chun, and Chang-Tsun Li. "Privacy-Preserving Reversible Watermarking for Data Exfiltration Prevention Through Lexicographic Permutations." In Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing, 330–39. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-030-03745-1_41.

Full text
APA, Harvard, Vancouver, ISO, and other styles