To see the other types of publications on this topic, follow the link: Computer dictionary.

Dissertations / Theses on the topic 'Computer dictionary'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the top 50 dissertations / theses for your research on the topic 'Computer dictionary.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.

1

Spitkovsky, Valentin I. (Valentin Ilyich) 1977. "A fast genomic dictionary." Thesis, Massachusetts Institute of Technology, 2000. http://hdl.handle.net/1721.1/86505.

Full text
Abstract:
Thesis (S.B. and M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2000.
Includes bibliographical references (p. 252-254).
by Valentin I. Spitkovsky.
S.B. and M.Eng.
APA, Harvard, Vancouver, ISO, and other styles
2

Lee, Ka-hing, and 李家興. "The dictionary problem: theory and practice." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1996. http://hub.hku.hk/bib/B31234963.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Hoffman, Beth Huhn. "Creating a data dictionary from a requirements specification." Thesis, Kansas State University, 1985. http://hdl.handle.net/2097/9850.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Lee, Ka-hing. "The dictionary problem : theory and practice /." Hong Kong : University of Hong Kong, 1996. http://sunzi.lib.hku.hk/hkuto/record.jsp?B19667085.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Tam, Siu-lung. "Linear-size indexes for approximate pattern matching and dictionary matching." Click to view the E-thesis via HKUTO, 2010. http://sunzi.lib.hku.hk/hkuto/record/B44205326.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Tam, Siu-lung, and 譚小龍. "Linear-size indexes for approximate pattern matching and dictionary matching." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2010. http://hub.hku.hk/bib/B44205326.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Lee, Francis Hyeongwoo. "A web-based universal encyclopedia/dictionary." CSUSB ScholarWorks, 1998. https://scholarworks.lib.csusb.edu/etd-project/1812.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Tikkireddy, Lakshmi Venkata Sai Sri. "Dictionary-based Compression Algorithms in Mobile Packet Core." Thesis, Blekinge Tekniska Högskola, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-18813.

Full text
Abstract:
With the rapid growth in technology, the amount of data to be transmitted and stored is increasing. The efficiency of information retrieval and storage has become a major bottleneck, which is why data compression has come into the picture. Data compression is a technique that effectively reduces the size of data to save storage and speed up its transmission from one place to another. Data compression comes in various forms and is mainly categorized into lossy compression and lossless compression, where lossless compression is most often used to compress data. At Ericsson, the SGSN-MME uses one of the lossless compression techniques, namely Deflate, to compress each user's data independently. Because of the trade-off between compression ratio and compression/decompression speed, the Deflate algorithm is not optimal for the SGSN-MME's use case. To mitigate this problem, the Deflate algorithm has to be replaced with a better compression algorithm.
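As background to the comparison described above, the sketch below (illustrative Python, not the thesis's SGSN-MME code) shows how Deflate's compression ratio and compress/decompress times can be measured with the standard zlib module, which implements the same algorithm:

    import time
    import zlib

    def deflate_stats(data: bytes, level: int = 6) -> dict:
        # Compress with Deflate (via zlib) at the given level and time both directions.
        t0 = time.perf_counter()
        compressed = zlib.compress(data, level)
        t1 = time.perf_counter()
        restored = zlib.decompress(compressed)
        t2 = time.perf_counter()
        assert restored == data  # lossless round trip
        return {
            "ratio": len(data) / len(compressed),
            "compress_s": t1 - t0,
            "decompress_s": t2 - t1,
        }

    sample = b"user session payload " * 1000  # stand-in for per-user data
    for level in (1, 6, 9):
        print(level, deflate_stats(sample, level))

Running it at levels 1, 6 and 9 makes the ratio-versus-speed trade-off that motivates the thesis directly visible.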
APA, Harvard, Vancouver, ISO, and other styles
9

Lam, Jacqueline Kam-mei. "A study of semi-technical vocabulary in computer science texts with special reference to ESP teaching and lexicography." Thesis, University of Exeter, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.326882.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Al-Hafez, Muhammad Yassin. "A semantic knowledge-based computational dictionary for support of natural language processing systems." Thesis, Cranfield University, 1993. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.385765.

Full text
APA, Harvard, Vancouver, ISO, and other styles
11

Salberg, Randall N. "The systems resource dictionary : a synergism of artificial intelligence, database management and software engineering methodologies." Thesis, Kansas State University, 1985. http://hdl.handle.net/2097/9877.

Full text
APA, Harvard, Vancouver, ISO, and other styles
12

Bhave, Sampada Vasant. "Novel dictionary learning algorithm for accelerating multi-dimensional MRI applications." Diss., University of Iowa, 2016. https://ir.uiowa.edu/etd/2182.

Full text
Abstract:
The clinical utility of multi-dimensional MRI applications like multi-parameter mapping and 3D dynamic lung imaging is limited by long acquisition times. Quantification of multiple tissue MRI parameters has been shown to be useful for early detection and diagnosis of various neurological diseases and psychiatric disorders. They also provide useful information about disease progression and treatment efficacy. Dynamic lung imaging enables the diagnosis of abnormalities in respiratory mechanics in dyspnea and regional lung function in pulmonary diseases like chronic obstructive pulmonary disease (COPD), asthma etc. However, the need to acquire multiple contrast-weighted images, as in multi-parameter mapping, or multiple time points, as in pulmonary imaging, makes these applications less practical in the clinical setting since they increase the scan time considerably. In order to achieve reasonable scan times, there are often tradeoffs between SNR and resolution. Since most MRI images are sparse in a known transform domain, they can be recovered from fewer samples. Several compressed sensing schemes have been proposed which exploit the sparsity of the signal in pre-determined transform domains (e.g., the Fourier transform) or exploit the low-rank characteristic of the data. However, these methods perform sub-optimally in the presence of inter-frame motion, since the pre-determined dictionary does not account for the motion and the rank of the data is considerably higher. Other schemes rely on a two-step approach in which they first estimate the dictionary from low-resolution data and then, using these basis functions, estimate the coefficients by fitting the measured data to the signal model. The main focus of the thesis is accelerating the multi-parameter mapping and 3D dynamic lung imaging applications to achieve the desired volume coverage and spatio-temporal resolution. We propose a novel dictionary learning framework called the blind compressed sensing (BCS) scheme to recover the underlying data from undersampled measurements, in which the underlying signal is represented as a sparse linear combination of basis functions from a learned dictionary. We also provide an efficient implementation using a variable splitting technique to reduce the computational complexity by up to 15 fold. In both multi-parameter mapping and 3D dynamic lung imaging, comparisons of the BCS scheme with other schemes indicate superior performance, as it provides a richer representation of the data. The reconstructions from the BCS scheme result in highly accurate parameter maps for parameter imaging and diagnostically relevant image series to characterize respiratory mechanics in pulmonary imaging.
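One standard way to write the blind compressed sensing recovery sketched above, stated here as a generic formulation for orientation rather than the exact objective of the thesis, is

\[
\{\hat{U},\hat{V}\} \;=\; \arg\min_{U,V}\; \lVert \mathcal{A}(UV) - b \rVert_2^2 \;+\; \lambda\,\lVert V \rVert_{\ell_1} \quad \text{subject to } \lVert U \rVert_F^2 \le c,
\]

where the image series is arranged as a matrix factored into a learned dictionary U of temporal/parametric basis functions and a sparse coefficient matrix V, \(\mathcal{A}\) is the undersampled measurement (Fourier) operator, and b holds the acquired k-space samples. Alternating between a sparse-coding update of V and a dictionary update of U, typically with variable splitting, is the usual way such objectives are minimized.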
APA, Harvard, Vancouver, ISO, and other styles
13

Lescoat, Thibault. "Geometric operators for 3D modeling using dictionary-based shape representations." Electronic Thesis or Diss., Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAT005.

Full text
Abstract:
In this thesis, we study high-level 3D shape representations and develop the algorithmic primitives necessary to manipulate shapes represented as a composition of several parts. We first review existing representations, starting with the usual low-level ones and then expanding on a high-level family of shape representations based on dictionaries. Notably, we focus on representing shapes via a discrete composition of atoms from a dictionary of parts. We observe that there was no method to smoothly blend non-overlapping atoms while still looking plausible: most methods either require overlapping parts or do not preserve large-scale details. Moreover, very few methods guarantee the exact preservation of the input, which is very important when dealing with artist-authored meshes to avoid destroying the artist's work. We address this challenge by proposing a composition operator that is guaranteed to keep the input exactly while also propagating large-scale details. To improve the speed of our composition operator and allow interactive editing, we propose to simplify the input parts prior to composing them. This allows us to interactively preview the composition of large meshes. For this, we introduce a method to simplify a detailed mesh into a coarse one while preserving the large-scale details. While more constrained than related approaches that do not produce a mesh, our method still yields outputs faithful to the detailed shapes.
APA, Harvard, Vancouver, ISO, and other styles
14

Byrne, Bernadette M. "A longitudinal study of the diffusion of the ISO/IEC information resource dictionary system standard (IRDS)." Thesis, Aston University, 2001. http://publications.aston.ac.uk/10610/.

Full text
Abstract:
The IRDS standard is an international standard produced by the International Organisation for Standardisation (ISO). In this work the process for producing standards in formal standards organisations, for example the ISO, and in more informal bodies, for example the Object Management Group (OMG), is examined. This thesis examines previous models and classifications of standards. The previous models and classifications are then combined to produce a new classification. The IRDS standard is then placed in a class in the new model as a reference anticipatory standard. Anticipatory standards are standards which are developed ahead of the technology in order to attempt to guide the market. The diffusion of the IRDS is traced over a period of eleven years. The economic conditions which affect the diffusion of standards are examined, particularly those which prevail in compatibility markets such as the IT and ICT markets. Additionally, the consequences of introducing gateway or converter devices into a market where a standard has not yet been established are examined. The IRDS standard did not have an installed base and this hindered its diffusion. The thesis concludes that the IRDS standard was overtaken by new developments such as object oriented technologies and middleware. This was partly because of the slow process of developing standards in traditional organisations which operate on a consensus basis, and partly because the IRDS standard did not have an installed base. Also, the rise and proliferation of middleware products resulted in exchange mechanisms becoming dominant rather than repository solutions. The research method used in this work is a longitudinal study of the development and diffusion of the ISO/IEC IRDS standard. The research is regarded as a single case study and follows the interpretative epistemological point of view.
APA, Harvard, Vancouver, ISO, and other styles
15

Pound, Andrew E. "Exploiting Sparsity and Dictionary Learning to Efficiently Classify Materials in Hyperspectral Imagery." DigitalCommons@USU, 2014. https://digitalcommons.usu.edu/etd/4020.

Full text
Abstract:
Hyperspectral imaging (HSI) produces spatial images with pixels that, instead of consisting of three colors, consist of hundreds of spectral measurements. Because there are so many measurements for each pixel, analysis of HSI is difficult. Frequently, standard techniques are used to help make analysis more tractable by representing the HSI data in a different manner. This research explores the utility of representing the HSI data in a learned dictionary basis for the express purpose of material identification and classification. Multiclass classification is performed on the transformed data using the RandomForests algorithm. Performance results are reported. In addition to classification, single-material detection is also considered. The performance of commonly used detection algorithms is demonstrated on both raw radiance pixels and HSI represented in dictionary-learned bases. Comparison results are shown which indicate that detection on dictionary-learned sparse representations performs as well as detection on radiance. In addition, a different method of performing detection, capitalizing on dictionary learning, is established and performance comparisons are reported, showing gains over traditional detection methods.
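A minimal sketch of the pipeline described above, using scikit-learn and placeholder data (the dictionary size, sparsity penalty and forest size are illustrative assumptions, not the thesis's settings):

    import numpy as np
    from sklearn.decomposition import MiniBatchDictionaryLearning
    from sklearn.ensemble import RandomForestClassifier

    # Placeholder hyperspectral data: rows are pixels, columns are spectral bands.
    rng = np.random.default_rng(0)
    X = rng.random((500, 200))                 # 500 pixels x 200 bands
    y = rng.integers(0, 5, size=500)           # 5 material classes

    # Learn a dictionary and re-express every pixel as sparse coefficients over its atoms.
    dico = MiniBatchDictionaryLearning(n_components=64, alpha=1.0, random_state=0)
    codes = dico.fit(X).transform(X)

    # Multiclass material classification in the learned-dictionary basis.
    forest = RandomForestClassifier(n_estimators=200, random_state=0)
    forest.fit(codes, y)
    print("training accuracy:", forest.score(codes, y))

The point of the exercise is that the classifier sees sparse coefficients rather than raw radiance, which mirrors the representational change evaluated in the thesis.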
APA, Harvard, Vancouver, ISO, and other styles
16

Dilenschneider, Robert. "Investigating the Use of a Computer Thesaurus and an On-line Dictionary for Unknown Words in Texts." Diss., Temple University Libraries, 2015. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/334785.

Full text
Abstract:
Teaching & Learning
Ed.D.
Two studies were conducted to explore the feasibility of adapting reading passages by means of a computer thesaurus and an on-line dictionary. In Study 1, the Computer Thesaurus Study, three word replacement conditions for replacing the marked mid- and low-frequency words of a reading passage were compared. Each condition's performance was examined for the proportion of marked words that could be replaced, the word frequency levels of the submitted synonyms that replaced the marked words, and the time needed to replace the marked words. The participants were 12 English-language instructors who were native English speakers. The findings from Study 1 were determined by averaging the proportion of marked words replaced, averaging the word frequency level of the synonyms submitted to replace the marked words, and averaging the time needed to replace the marked words. In Study 2, the On-line Dictionary Study, three look-up conditions for language learners to learn mid- and low-frequency target words and comprehend a reading passage when they are transferred away to an on-line dictionary were compared. The research questions focused on how each look-up condition affected the recall and recognition of word forms, word meanings, and passage comprehension. The participants were 84 first-year Japanese medical university students. The data were analyzed with the Rasch model with regard to the recall and recognition of word forms and word meanings, and passage comprehension. Overall, the results suggest three findings for adapting the lexis of reading passages for learners in a timely manner. First, the results from the Computer Thesaurus Study suggest that, in terms of the proportion of synonyms offered, the frequency level of synonyms offered, and the amount of time needed to provide synonyms, it might be best for instructors to replace the unknown words in a passage without the use of a computer thesaurus. This study showed that the dynamics involved in choosing synonyms listed by a computer thesaurus increase the word frequency level and the time needed to replace target words in a reading passage. Second, with regard to the On-line Dictionary Study, if the results that were both statistically significant and measurably different are considered, the spell, click and control conditions might promote learning the forms of words, the meanings of words, and passage comprehension, respectively. However, the click condition might promote both the learning of word meanings and passage comprehension, because its effects were higher than and measurably different from those of the spell condition on these measures. Third, if only the results from the On-line Dictionary Study that were statistically significant are considered, there might be two conditions instead of three for learning words and comprehending reading passages. The spell condition might be best for learning the forms and meanings of words, and the control condition might be best for promoting passage comprehension. In terms of learning words, the results are consistent with the Type of Processing-Resource Allocation model in that there is a tradeoff in retaining the forms and meanings of words depending on whether an activity emphasizes the spellings of words (spell) or the meanings of words (click). In terms of comprehending passages, the results are consistent with Cognitive Load Theory in that germane loads that increase the mental effort to complete a task (click or spell) diminish the processing available to learn information (control).
Temple University--Theses
APA, Harvard, Vancouver, ISO, and other styles
17

Kupriianov, Yevhen, and Nunu Akopiants. "Developing linguistic research tools for virtual lexicographic laboratory of the spanish language explanatory dictionary." Thesis, Ruzica Piskac, 2019. http://repository.kpi.kharkov.ua/handle/KhPI-Press/42372.

Full text
Abstract:
The present article is devoted to the problems of creating linguistic tools for the virtual lexicographic laboratory of the Spanish explanatory dictionary (DLE 23). The goal of the research is to consider some issues related to the development of linguistic tools for the virtual lexicographic laboratory. To achieve this goal, the dictionary was analyzed to define the peculiarities of how linguistic facts are represented, its structure and its metalanguage. On the basis of the dictionary analysis and the theory of lexicographic systems, a formal model of DLE 23 was developed and its main components, including their relationships, were determined to ensure their availability via linguistic tools for accessing linguistic information. The range of research activities to be performed by using the linguistic tools was outlined.
APA, Harvard, Vancouver, ISO, and other styles
18

Hall, Abraham. "Using Freebase, an Automatically Generated Dictionary, and a Classifier to Identify a Person's Profession in Tweets." Master's thesis, University of Central Florida, 2013. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/5788.

Full text
Abstract:
Algorithms for classifying pre-tagged person entities in tweets into one of eight profession categories are presented. A classifier using a semi-supervised learning algorithm that takes into consideration the local context surrounding the entity in the tweet, hashtag information, and topic signature scores is described. In addition to the classifier, this research investigates two dictionaries containing the professions of persons. These two dictionaries are used in their own classification algorithms, which are independent of the classifier. The method for creating the first dictionary dynamically from the web and the algorithm that accesses this dictionary to classify a person into one of the eight profession categories are explained next. The second dictionary is Freebase, an openly available online database that is maintained by its online community. The algorithm that uses Freebase for classifying a person into one of the eight professions is described. The results show that classifications made using the automatically constructed dictionary, Freebase, or the classifier are all moderately successful. The results also show that classifications made with the automatically constructed person dictionary are slightly more accurate than classifications made using Freebase. Various hybrid methods combining the classifier and the two dictionaries are also explained. The results of those hybrid methods show significant improvement over any of the individual methods.
M.S.
Masters
Electrical Engineering and Computer Science
Engineering and Computer Science
Computer Science
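The hybrid strategy described in this entry can be pictured with a short sketch (illustrative Python; the dictionary names, lookup order and fallback classifier are assumptions for the example, not the thesis's actual design):

    def classify_profession(person, auto_dict, freebase_dict, classifier):
        """Hypothetical hybrid lookup: try both dictionaries, fall back to the classifier.

        `auto_dict` stands in for the dictionary built automatically from the web,
        `freebase_dict` for a person-to-profession map derived from Freebase, and
        `classifier` for the semi-supervised model using context, hashtags and
        topic signatures.
        """
        name = person.lower()
        if name in auto_dict:
            return auto_dict[name]
        if name in freebase_dict:
            return freebase_dict[name]
        return classifier.predict(name)

The thesis compares several such combinations and reports that the hybrids outperform any of the individual methods.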
APA, Harvard, Vancouver, ISO, and other styles
19

Mairal, Julien. "Sparse coding for machine learning, image processing and computer vision." Phd thesis, École normale supérieure de Cachan - ENS Cachan, 2010. http://tel.archives-ouvertes.fr/tel-00595312.

Full text
Abstract:
We study in this thesis a particular machine learning approach to represent signals, which consists of modelling data as linear combinations of a few elements from a learned dictionary. It can be viewed as an extension of the classical wavelet framework, whose goal is to design such dictionaries (often orthonormal bases) adapted to natural signals. An important success of dictionary learning methods has been their ability to model natural image patches and the performance of the image denoising algorithms that this has yielded. We address several open questions related to this framework: How to efficiently optimize the dictionary? How can the model be enriched by adding structure to the dictionary? Can current image processing tools based on this method be further improved? How should one learn the dictionary when it is used for a different task than signal reconstruction? How can it be used for solving computer vision problems? We answer these questions with a multidisciplinary approach, using tools from statistical machine learning, convex and stochastic optimization, image and signal processing, and computer vision, but also optimization on graphs.
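For orientation, the basic dictionary learning objective that this line of work builds on, in its standard form, is

\[
\min_{D \in \mathcal{C},\; \alpha_1,\dots,\alpha_n}\; \frac{1}{n}\sum_{i=1}^{n}\Bigl(\tfrac{1}{2}\lVert x_i - D\alpha_i \rVert_2^2 + \lambda\,\lVert \alpha_i \rVert_1\Bigr),
\]

where the \(x_i\) are signals such as image patches, D is the dictionary whose atoms (columns) are constrained by the set \(\mathcal{C}\) to have at most unit \(\ell_2\) norm, each \(\alpha_i\) is a sparse coefficient vector, and \(\lambda\) controls the sparsity. The open questions listed in the abstract concern how this objective is structured, optimized and repurposed for tasks other than reconstruction.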
APA, Harvard, Vancouver, ISO, and other styles
20

Sjöstrand, Mattias Håkansson. "En studie i komprimeringsalgoritmer." Thesis, Blekinge Tekniska Högskola, Avdelningen för för interaktion och systemdesign, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-2971.

Full text
Abstract:
Compression algorithms can be used everywhere. For example, when you watch a DVD movie, a lossy algorithm is used for both picture and sound. If you want to back up your data, you might be using a lossless algorithm. This thesis explains how many of the more common lossless compression algorithms work. During the work on this thesis I also developed a new lossless compression algorithm. I compared this new algorithm to the more common algorithms by testing them on five different types of files. The result was that the new algorithm was comparable to the other algorithms in terms of compression ratio, and in some cases it also performed better than the others.
APA, Harvard, Vancouver, ISO, and other styles
21

Wheeler, Ryan. "BlindCanSeeQL: Improved Blind SQL Injection For DB Schema Discovery Using A Predictive Dictionary From Web Scraped Word Based Lists." Scholar Commons, 2015. http://scholarcommons.usf.edu/etd/6050.

Full text
Abstract:
SQL injections are still a prominent threat on the web. Using a custom-built tool, BlindCanSeeQL (BCSQL), we will explore how to automate blind SQL attacks to discover database schemas using fewer requests than the standard methods, thus helping avoid the detection that comes from overloading a server with hits. This tool uses a web crawler to discover keywords that assist with autocompleting schema object names, along with improvements in ASCII bisection to lower the number of requests sent to the server. Along with this tool, we will discuss ways to prevent and protect against such attacks.
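The ASCII bisection mentioned above is an ordinary binary search driven by a yes/no oracle. A minimal sketch of that idea (illustrative Python; the oracle here is a local stand-in, not the tool's actual request logic):

    def recover_char(oracle, position):
        """Recover one character of a hidden value with a true/false oracle.

        `oracle(position, value)` stands in for a blind probe of the form
        "is the ASCII code of character <position> greater than <value>?"; a real
        tool would send an HTTP request and inspect the response. Bisection needs
        at most 7 probes per character instead of up to 127 with linear guessing.
        """
        lo, hi = 0, 127
        while lo < hi:
            mid = (lo + hi) // 2
            if oracle(position, mid):   # oracle answers: code > mid
                lo = mid + 1
            else:
                hi = mid
        return chr(lo)

    # Local stand-in for the remote database, for demonstration only:
    secret = "users"
    probe = lambda i, v: ord(secret[i]) > v
    print("".join(recover_char(probe, i) for i in range(len(secret))))

The thesis's contribution layers dictionary-assisted autocompletion of schema names on top of this, so that many characters never need to be probed at all.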
APA, Harvard, Vancouver, ISO, and other styles
22

Mårtensson, Christian. "Managing language learning data in mobile apps." Thesis, Luleå tekniska universitet, Institutionen för system- och rymdteknik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-81078.

Full text
Abstract:
On the journey of learning a language we are exposed to countless words that are easily forgotten and subsequently difficult to find again. This study investigates how to design a personal data management system that enables its users to efficiently organize, find and input the words and phrases that they encounter on their journey. Using DSRM, an artifact was iteratively developed and tested in usability tests and interviews by a total of 10 participants. The feedback from the respondents indicates a strong demand for this type of app and also uncovered design knowledge in this new context. The contribution of this study is a set of 14 design principles for making data management in language learning apps more user-friendly and efficient.
APA, Harvard, Vancouver, ISO, and other styles
23

Kipfer-Westerlund, B. A. "Towards the onomasiological dictionary : The use of the computer in providing diversified access as exemplified by an electronic lexicon of baby and child-care concepts." Thesis, University of Exeter, 1989. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.234553.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

Horskainen, Arvi. "Kamusi ya Kiswahili sanifu in test: A computer system for analyzing dictionaries and for retrieving lexical data." Swahili Forum; 1 (1994), S. 169-179, 1994. https://ul.qucosa.de/id/qucosa%3A10567.

Full text
Abstract:
The paper describes a computer system for testing the coherence and adequacy of dictionaries. The system is also well suited for retrieving lexical material in context from computerized text archives. Results are presented from a series of tests made with Kamusi ya Kiswahili Sanifu (KKS), a monolingual Swahili dictionary. The test of the internal coherence of KKS shows that the text itself contains several hundred words for which there is no entry in the dictionary. Examples and frequency numbers of the most often occurring words are given. The adequacy of KKS was also tested with a corpus of nearly one million words, and it was found that 1.32% of words in book texts were not recognized by KKS, while for newspaper texts the figure was 2.24%. The higher number for newspaper texts is partly due to the numerous names occurring in news articles. Some statistical results are given on the frequencies of word forms not recognized by KKS. The tests show that although KKS covers the modern vocabulary quite well, there are several areas where the dictionary should be improved. The internal coherence is far from satisfactory, and there are more than a thousand rather common words in prose text which are not included in KKS. The system described in this article is an effective tool for detecting problems and for retrieving lexical data in context for missing words.
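The coverage test reported above boils down to counting the share of corpus tokens that have no dictionary entry. A toy sketch of that computation (illustrative Python over surface forms only; the real system also performs morphological analysis, and the file and variable names here are assumptions):

    import re
    from collections import Counter

    def coverage_report(corpus_text, headwords):
        """Percentage of corpus tokens with no dictionary entry, plus the top misses."""
        tokens = re.findall(r"[a-zA-Z']+", corpus_text.lower())
        misses = Counter(t for t in tokens if t not in headwords)
        rate = 100.0 * sum(misses.values()) / max(len(tokens), 1)
        return rate, misses.most_common(20)

    # Hypothetical usage with a set of KKS headwords and a book or newspaper corpus:
    # rate, top_missing = coverage_report(open("corpus.txt", encoding="utf-8").read(), kks_headwords)
    # print(f"{rate:.2f}% of tokens not recognized", top_missing)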
APA, Harvard, Vancouver, ISO, and other styles
25

Holst, Andy. "Automatic Transcript Generator for Podcast Files." Thesis, Linnaeus University, School of Computer Science, Physics and Mathematics, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-6936.

Full text
Abstract:

In the modern world, the Internet has become a popular place, yet people with hearing disabilities, as well as search engines, cannot access the speech content in podcast files. In order to solve the problem partially, Sphinx decoders such as Sphinx-3 and Sphinx-4 can be used to implement an Auto Transcript Generator application, either by coupling an already existing large acoustic model, language model and dictionary, or by training your own large acoustic model and language model and creating your own dictionary, to support a continuous, speaker-independent speech recognition system.

APA, Harvard, Vancouver, ISO, and other styles
26

McClanahan, Peter J. "A Probabilistic Morphological Analyzer for Syriac." BYU ScholarsArchive, 2010. https://scholarsarchive.byu.edu/etd/2200.

Full text
Abstract:
We show that a carefully crafted probabilistic morphological analyzer significantly outperforms a reasonable, naive baseline for Syriac. Syriac is an under-resourced Semitic language for which there are no available language tools such as morphological analyzers. Such tools are widely used to contribute to the process of annotating morphologically complex languages. We introduce and connect novel data-driven models for segmentation, dictionary linkage, and morphological tagging in a joint pipeline to create a probabilistic morphological analyzer requiring only labeled data. We explore the performance of this model with varying amounts of training data and find that with about 34,500 tokens, it can outperform the baseline trained on over 99,000 tokens and achieve an accuracy of just over 80%. When trained on all available training data, this joint model achieves 86.47% accuracy — a 29.7% reduction in error rate over the baseline.
APA, Harvard, Vancouver, ISO, and other styles
27

Тесленко, М. В. "Засоби вираження та функції англомовного комп’ютерного сленгу." Thesis, Сумський державний університет, 2019. https://essuir.sumdu.edu.ua/handle/123456789/77485.

Full text
Abstract:
The personal computer has undergone many changes: computer hardware has improved, and new software and technologies have appeared. This rapid development of the computer industry and computer technology has given rise to a large number of neologisms. Most of them were initially used in the informal speech of people in a particular profession, but they then became common among almost all computer users. As the number of people involved with computers grows, this slang is passing into general usage.
APA, Harvard, Vancouver, ISO, and other styles
28

Gustafsson, Jessica. "Lärande bildlexikon : Ett interaktivt sätt att lära sig." Thesis, University of Kalmar, School of Communication and Design, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:hik:diva-2121.

Full text
Abstract:

A picture dictionary provides visual stimulation and turns learning into an interactive game. But picture dictionaries are not just for children; they are also made for young people and adults. They can tell stories and fairy tales in an amusing way, and they can even retell history. Their purpose is decided only by their creator. This report is about the creation of a picture dictionary. The purpose of this dictionary is to educate young people about everyday life; anything you can encounter in everyday life is in this dictionary. The company behind this picture dictionary is Euroway Media Business AB; they have long worked with websites but now want to move into new territory where they can expand their knowledge. In a later project, the picture dictionary will be made into an application and integrated into a website that is currently under construction. The important things to keep in mind when creating a picture dictionary are to have a central theme throughout the work, to carry out many surveys to make sure you stay on the right track, and finally to be creative. Various well-established methods, such as questionnaires and personas, were used during the project to collect data, examine the target group and evaluate the dictionary. The result is a well-structured dictionary with enough pictures and categories to be educational. Graphically, it is appealing and highlights the content of the dictionary.

APA, Harvard, Vancouver, ISO, and other styles
29

Купріянов, Євген Валерійович. "Українська термінологічна підсистема "гідротурбіни" як об'єкт комп'ютерного словника." Thesis, Харкiвський нацiональний унiверситет iм. В. Н. Каразiна, 2011. http://repository.kpi.kharkov.ua/handle/KhPI-Press/2493.

Full text
Abstract:
The dissertation deals with the description of Ukrainian hydroturbine terminology and its lexicographic representation in an electronic dictionary. The formation and development of computer lexicography in Ukraine and abroad are considered, the lines of research in modern Ukrainian computer lexicography are analyzed, and its development prospects are outlined. Theoretical studies by leading scientists on the description of lexical units in paper and electronic dictionaries are surveyed. On the basis of extensive research material, the hydroturbine term subsystem is analyzed, in particular its sources of formation, its structure and the system relations between terms, and the structural and semantic characteristics of hydroturbine terms are given. A model of an electronic dictionary of special vocabulary is proposed, and the lexicographic parameters used to describe the term system in the dictionary are substantiated.
APA, Harvard, Vancouver, ISO, and other styles
30

Goussard, George Willem. "Unsupervised clustering of audio data for acoustic modelling in automatic speech recognition systems." Thesis, Stellenbosch : University of Stellenbosch, 2011. http://hdl.handle.net/10019.1/6686.

Full text
Abstract:
Thesis (MScEng (Electrical and Electronic Engineering))--University of Stellenbosch, 2011.
This thesis presents a system that is designed to replace the manual process of generating a pronunciation dictionary for use in automatic speech recognition. The proposed system has several stages. The first stage segments the audio into what will be known as the subword units, using a frequency domain method. In the second stage, dynamic time warping is used to determine the similarity between the segments of each possible pair of these acoustic segments. These similarities are used to cluster similar acoustic segments into acoustic clusters. The final stage derives a pronunciation dictionary from the orthography of the training data and the corresponding sequence of acoustic clusters. This process begins with an initial mapping between words and their sequences of clusters, established by Viterbi alignment with the orthographic transcription. The dictionary is refined iteratively by pruning redundant mappings, hidden Markov model estimation and Viterbi re-alignment in each iteration. This approach is evaluated experimentally by applying it to two subsets of the TIMIT corpus. It is found that, when test words are repeated often in the training material, the approach leads to a system whose accuracy is almost as good as one trained using the phonetic transcriptions. When test words are not repeated often in the training set, the proposed approach leads to better results than those achieved using the phonetic transcriptions, although the recognition is poor overall in this case.
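The second stage above relies on dynamic time warping to score how similar two variable-length acoustic segments are. A compact sketch of that computation (illustrative Python/NumPy; the feature type and normalisation are assumptions, not the thesis's exact choices):

    import numpy as np

    def dtw_distance(a, b):
        """Dynamic time warping distance between two feature sequences.

        a and b are (n, d) and (m, d) arrays, e.g. short-time spectral or MFCC
        frames of two acoustic segments; dividing by (n + m) is one simple way
        to make segments of different length comparable.
        """
        n, m = len(a), len(b)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = np.linalg.norm(a[i - 1] - b[j - 1])
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        return D[n, m] / (n + m)

    # Pairwise DTW distances over all segment pairs would then drive the clustering stage.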
APA, Harvard, Vancouver, ISO, and other styles
31

Liang, Hsuan Lorraine. "Spell checkers and correctors : a unified treatment." Diss., Pretoria : [s.n.], 2009. http://upetd.up.ac.za/thesis/available/etd-06252009-163007/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
32

Al-Olimat, Hussein S. "Knowledge-Enabled Entity Extraction." Wright State University / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=wright1578100367105233.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Black, Kevin P. "Interactive Machine Assistance: A Case Study in Linking Corpora and Dictionaries." BYU ScholarsArchive, 2015. https://scholarsarchive.byu.edu/etd/5620.

Full text
Abstract:
Machine learning can provide assistance to humans in making decisions, including linguistic decisions such as determining the part of speech of a word. Supervised machine learning methods derive patterns indicative of possible labels (decisions) from annotated example data. For many problems, including most language analysis problems, acquiring annotated data requires human annotators who are trained to understand the problem and to disambiguate among multiple possible labels. Hence, the availability of experts can limit the scope and quantity of annotated data. Machine-learned pre-annotation assistance, which suggests probable labels for unannotated items, can enable expert annotators to work more quickly and thus to produce broader and larger annotated resources more cost-efficiently. Yet, because annotated data is required to build the pre-annotation model, bootstrapping is an obstacle to utilizing pre-annotation assistance, especially for low-resource problems where little or no annotated data exists. Interactive pre-annotation assistance can mitigate bootstrapping costs, even for low-resource problems, by continually refining the pre-annotation model with new annotated examples as the annotators work. In practice, continually refining models has seldom been done except for the simplest of models which can be trained quickly. As a case study in developing sophisticated, interactive, machine-assisted annotation, this work employs the task of corpus-dictionary linkage (CDL), which is to link each word token in a corpus to its correct dictionary entry. CDL resources, such as machine-readable dictionaries and concordances, are essential aids in many tasks including language learning and corpus studies. We employ a pipeline model to provide CDL pre-annotations, with one model per CDL sub-task. We evaluate different models for lemmatization, the most significant CDL sub-task since many dictionary entry headwords are usually lemmas. The best performing lemmatization model is a hybrid which uses a maximum entropy Markov model (MEMM) to handle unknown (novel) word tokens and other component models to handle known word tokens. We extend the hybrid model design to the other CDL sub-tasks in the pipeline. We develop an incremental training algorithm for the MEMM which avoids wasting previous computation as would be done by simply retraining from scratch. The incremental training algorithm facilitates the addition of new dictionary entries over time (i.e., new labels) and also facilitates learning from partially annotated sentences which allows annotators to annotate words in any order. We validate that the hybrid model attains high accuracy and can be trained sufficiently quickly to provide interactive pre-annotation assistance by simulating CDL annotation on Quranic Arabic and classical Syriac data.
APA, Harvard, Vancouver, ISO, and other styles
34

Ahmed, Olfet, and Nawar Saman. "Utvärdering av nätverkssäkerheten på J Bil AB." Thesis, KTH, Data- och elektroteknik, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-123403.

Full text
Abstract:
Detta examensarbete är en utvärdering av nätverkssäkerheten hos J BiL AB, både på social och teknisk nivå. Företaget är beroende av säkra Internet-anslutningar för att nå externa tjänster och interna servrar lokaliserade på olika geografiska platser. Företaget har ingen IT-ansvarig som aktivt underhåller och övervakar nätverket, utan konsulterar ett externt dataföretag. Syftet med examensarbetet är att utvärdera säkerheten, upptäcka brister, ge förbättringsförslag och till viss del implementera lösningar. För att undersöka säkerheten har observationer och intervjuer med personalen gjorts och ett flertal attacker mot nätverket har utförts. Utifrån den data som samlats in kunde slutsatsen dras att företaget har brister vad gäller IT-säkerheten. Framförallt den sociala säkerheten visade sig ha stora luckor vilket till stor del beror på att de anställda varken har blivit utbildade eller fått någon information om hur de ska hantera lösenord, datorer och IT-frågor i allmänt. Förbättringsförslag har getts och viss implementation har genomförts för att eliminera bristerna. De anställda har även med hjälp av en IT-policy och föreläsning blivit utbildade i hur de ska agera och tänka kring IT-relaterade säkerhetsfrågor.
The aim of this project is to evaluate the network security at J Bil AB. The focus will be on both social and technical issues. For the employees to be able to con-nect to remote servers and external services and perform their daily work tasks, secure connections is needed. J Bil Ab has no IT manager who actively maintains and monitors the network; rather they consult a computer company when changes and implementations are required. The projects’ goal is to identify gaps, come up with suggestions for improvement and to some extent implement so-lutions. To do this, an observation of the employees hav been made, an inter-view have been held, and several attacks on the network have been performed. Based on the data collected, it was concluded that the company has shortcom-ings in IT security. Above all, the social security appeared to have major gaps in it and that is mainly because the lack of knowledge among the employees and they have never been informed of how to manage their passwords, computers and IT issues in general. Suggestions for improvement have been given and some implementations have been performed to eliminate the deficiencies.
APA, Harvard, Vancouver, ISO, and other styles
35

Дегтярьова, Тетяна Олегівна, Татьяна Олеговна Дегтярева, Tetiana Olehivna Dehtiarova, and Бадер Моса. "Использование электронных словарей на уроках русского языка как иностранного." Thesis, Сумский государственный университет, 2016. http://essuir.sumdu.edu.ua/handle/123456789/47569.

Full text
Abstract:
Информационные компьютерные технологии активно входят в практику преподавания русского языка как иностранного. В настоящее время все большую популярность получают мобильные устройства: мобильные телефоны, смартфоны, планшеты. Эти гаджеты являются неотъемлемой частью современного урока благодаря своей доступности, мобильности, простоте использования. ИКТ становятся привычными средствами обучения. Особое место среди них занимают электронные двуязычные словари. Многие исследователи отмечают, что словарь может быть использован не только как простой справочный материал, но и как самостоятельное учебное пособие для развития всех видов речевой деятельности, для формирования коммуникативной компетенции студентов- иностранцев.
APA, Harvard, Vancouver, ISO, and other styles
36

Tadisetty, Srikanth. "Prediction of Psychosis Using Big Web Data in the United States." Kent State University / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=kent1532962079970169.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Alesand, Elias, and Hanna Sterneling. "A shoulder-surfing resistant graphical password system." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-138163.

Full text
Abstract:
The focus of this report is to discuss graphical password systems and how they can contribute to handle security problems that threaten authentication processes. One such threat is shoulder-surfing attacks, which are also reviewed in this report. Three already existing systems that are claimed to be shoulder-surfing resilient are described and a new proposed system is presented and evaluated through a user study. Moreover, the system is compared to the mentioned existing systems to further evaluate the usability, memorability and the time it takes to authenticate. The user study shows that test subjects are able to remember their chosen password one week after having registered and signed in once. It is also shown that the average time to sign in to the system after five minutes of practice is within a range of 3.30 to 5.70 seconds. The participants in the experiments gave the system an average score above 68 on the System Usability Scale, which is the score of an average system.
APA, Harvard, Vancouver, ISO, and other styles
38

Sarkar, Subrata. "Solving Linear and Bilinear Inverse Problems using Approximate Message Passing Methods." The Ohio State University, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=osu1595529156778986.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Jabbarzadeh, Gangeh Mehrdad. "Kernelized Supervised Dictionary Learning." Thesis, 2013. http://hdl.handle.net/10012/7455.

Full text
Abstract:
The representation of a signal using a learned dictionary instead of predefined operators, such as wavelets, has led to state-of-the-art results in various applications such as denoising, texture analysis, and face recognition. The area of dictionary learning is closely associated with sparse representation, which means that the signal is represented using few atoms in the dictionary. Despite recent advances in the computation of a dictionary using fast algorithms such as K-SVD, online learning, and cyclic coordinate descent, which make the computation of a dictionary from millions of data samples computationally feasible, the dictionary is mainly computed using unsupervised approaches such as k-means. These approaches learn the dictionary by minimizing the reconstruction error without taking into account the category information, which is not optimal in classification tasks. In this thesis, we propose a supervised dictionary learning (SDL) approach by incorporating information on class labels into the learning of the dictionary. To this end, we propose to learn the dictionary in a space where the dependency between the signals and their corresponding labels is maximized. To maximize this dependency, the recently-introduced Hilbert Schmidt independence criterion (HSIC) is used. The learned dictionary is compact and has closed form; the proposed approach is fast. We show that it outperforms other unsupervised and supervised dictionary learning approaches in the literature on real-world data. Moreover, the proposed SDL approach has as its main advantage that it can be easily kernelized, particularly by incorporating a data-driven kernel such as a compression-based kernel, into the formulation. In this thesis, we propose a novel compression-based (dis)similarity measure. The proposed measure utilizes a 2D MPEG-1 encoder, which takes into consideration the spatial locality and connectivity of pixels in the images. The proposed formulation has been carefully designed based on MPEG encoder functionality. To this end, by design, it solely uses P-frame coding to find the (dis)similarity among patches/images. We show that the proposed measure works properly on both small and large patch sizes on textures. Experimental results show that by incorporating the proposed measure as a kernel into our SDL, it significantly improves the performance of a supervised pixel-based texture classification on Brodatz and outdoor images compared to other compression-based dissimilarity measures, as well as state-of-the-art SDL methods. It also improves the computation speed by about 40% compared to its closest rival. Eventually, we have extended the proposed SDL to multiview learning, where more than one representation is available on a dataset. We propose two different multiview approaches: one fusing the feature sets in the original space and then learning the dictionary and sparse coefficients on the fused set; and the other by learning one dictionary and the corresponding coefficients in each view separately, and then fusing the representations in the space of the dictionaries learned. We will show that the proposed multiview approaches benefit from the complementary information in multiple views, and investigate the relative performance of these approaches in the application of emotion recognition.
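For reference, the dependence measure named in this abstract, the empirical Hilbert-Schmidt independence criterion, is usually written as

\[
\mathrm{HSIC}(X,Y) \;=\; \frac{1}{(n-1)^2}\,\operatorname{tr}(KHLH), \qquad H = I_n - \tfrac{1}{n}\mathbf{1}\mathbf{1}^{\top},
\]

where K and L are kernel matrices computed over the n signals and over their labels respectively, and H centres them; this is the standard estimator, quoted here as background rather than the thesis's own derivation. Learning the dictionary in a space where this quantity is maximised is what ties the learned atoms to the class structure.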
APA, Harvard, Vancouver, ISO, and other styles
40

Ballesteros, Lisa Ann. "Resolving ambiguity for cross -language information retrieval: A dictionary approach." 2001. https://scholarworks.umass.edu/dissertations/AAI3027176.

Full text
Abstract:
The global exchange of information has been facilitated by the rapid expansion in the size and use of the Internet, which has led to a large increase in the availability of on-line texts. Expanded international collaboration, the increase in the availability of electronic foreign language texts, the growing number of non-English-speaking users, and the lack of a common language of discourse compels us to develop cross-language information retrieval (CLIR) tools capable of bridging the language barrier. Cross-language retrieval bridges this gap by enabling a person to search in one language and retrieve documents across languages. There are several goals for the research described herein. The first is to gain a clear understanding of the problems associated with the cross-language task and to develop techniques for addressing them. Empirical work shows that ambiguity and lack of lexical resources are the main hurdles. Second we show that cross-language effectiveness does not depend upon linguistic analysis. We demonstrate how statistical techniques can be used to significantly reduce the effects of ambiguity. We also show that combining these techniques is as effective as or more effective than a reasonable machine translation system. Third, we show that an approach based on multi-lingual dictionaries and statistical analysis can be used as the foundation for a cross-language retrieval architecture that circumvents the problem of limited resources.
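At its simplest, the dictionary-based query translation that this dissertation starts from replaces each source-language query term with all of its dictionary translations; the ambiguity this introduces is what the statistical techniques are then used to reduce. A toy sketch (illustrative Python; the dictionary object and example terms are assumptions):

    def translate_query(query_terms, bilingual_dict):
        """Naive dictionary-based query translation for cross-language retrieval.

        `bilingual_dict` maps a source-language term to a list of target-language
        translations (an assumed structure for illustration). Every translation of
        every term is kept, which is precisely the ambiguity that co-occurrence
        statistics and phrase translation are later used to reduce.
        """
        translated = []
        for term in query_terms:
            translated.extend(bilingual_dict.get(term, [term]))  # keep unknown terms as-is
        return translated

    # e.g. translate_query(["banco", "credito"], es_en_dict) might yield
    # ["bank", "bench", "credit"], mixing a correct and a spurious sense of "banco".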
APA, Harvard, Vancouver, ISO, and other styles
41

Wilson, Loren. "A data dictionary for the INGRES data base management system." 1986. http://hdl.handle.net/2097/22179.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

Zhou, Mingyuan. "Nonparametric Bayesian Dictionary Learning and Count and Mixture Modeling." Diss., 2013. http://hdl.handle.net/10161/7204.

Full text
Abstract:

Analyzing the ever-increasing data of unprecedented scale, dimensionality, diversity, and complexity poses considerable challenges to conventional approaches of statistical modeling. Bayesian nonparametrics constitute a promising research direction, in that such techniques can fit the data with a model that can grow with complexity to match the data. In this dissertation we consider nonparametric Bayesian modeling with completely random measures, a family of pure-jump stochastic processes with nonnegative increments. In particular, we study dictionary learning for sparse image representation using the beta process and the dependent hierarchical beta process, and we present the negative binomial process, a novel nonparametric Bayesian prior that unites the seemingly disjoint problems of count and mixture modeling. We show a wide variety of successful applications of our nonparametric Bayesian latent variable models to real problems in science and engineering, including count modeling, text analysis, image processing, compressive sensing, and computer vision.


Dissertation
APA, Harvard, Vancouver, ISO, and other styles
43

"Sparse Methods in Image Understanding and Computer Vision." Doctoral diss., 2013. http://hdl.handle.net/2286/R.I.17719.

Full text
Abstract:
Image understanding has been playing an increasingly crucial role in vision applications. Sparse models form an important component in image understanding, since the statistics of natural images reveal the presence of sparse structure. Sparse methods lead to parsimonious models, in addition to being efficient for large-scale learning. In sparse modeling, data is represented as a sparse linear combination of atoms from a "dictionary" matrix. This dissertation focuses on understanding different aspects of sparse learning, thereby enhancing the use of sparse methods by incorporating tools from machine learning. With the growing need to adapt models for large-scale data, it is important to design dictionaries that can model the entire data space and not just the samples considered. By exploiting the relation of dictionary learning to 1-D subspace clustering, a multilevel dictionary learning algorithm is developed and shown to outperform conventional sparse models in compressed recovery and image denoising. Theoretical aspects of learning such as algorithmic stability and generalization are considered, and ensemble learning is incorporated for effective large-scale learning. In addition to building strategies for efficiently implementing 1-D subspace clustering, a discriminative clustering approach is designed to estimate the unknown mixing process in blind source separation. By exploiting the non-linear relation between the image descriptors, and allowing the use of multiple features, sparse methods can be made more effective in recognition problems. The idea of multiple kernel sparse representations is developed, and algorithms for learning dictionaries in the feature space are presented. Using object recognition experiments on standard datasets, it is shown that the proposed approaches outperform other sparse-coding-based recognition frameworks. Furthermore, a segmentation technique based on multiple kernel sparse representations is developed and successfully applied to automated brain tumor identification. Using sparse codes to define the relation between data samples can lead to a more robust graph embedding for unsupervised clustering. By performing discriminative embedding using sparse-coding-based graphs, an algorithm for measuring the glomerular number in kidney MRI images is developed. Finally, approaches to build dictionaries for local sparse coding of image descriptors are presented and applied to object recognition and image retrieval.
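The sparse representation described here, a sample written as a sparse combination of dictionary atoms, can be illustrated with a bare-bones orthogonal matching pursuit in NumPy. This is a generic textbook routine on synthetic data, not the multilevel or kernel algorithms proposed in the dissertation.

import numpy as np

def omp(D, x, n_nonzero):
    """Greedy orthogonal matching pursuit: approximate x ~ D @ a with a sparse a."""
    residual, support = x.copy(), []
    a = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        support.append(k)
        # Re-fit the coefficients on the chosen support by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        a[:] = 0.0
        a[support] = coef
        residual = x - D @ a
    return a

rng = np.random.default_rng(0)
D = rng.normal(size=(32, 128))
D /= np.linalg.norm(D, axis=0)         # unit-norm atoms
x = D[:, [3, 70]] @ np.array([1.5, -2.0])
print(np.nonzero(omp(D, x, 2))[0])     # recovers atoms 3 and 70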
Dissertation/Thesis
Ph.D. Electrical Engineering 2013
APA, Harvard, Vancouver, ISO, and other styles
44

Almeida, David Moreira de. "Face recognition via sparse representation." Master's thesis, 2019. http://hdl.handle.net/10773/29465.

Full text
Abstract:
Face recognition has recently seen a peak in interest due to developments in deep learning. These developments have attracted great attention to the field, not only from the research community but also from a commercial perspective. While such methods provide the best accuracies when performing face recognition tasks, they often require millions of face images, a substantial amount of processing power and a considerable amount of time to develop. In recent years, sparse representations have been successfully applied to a number of computer vision applications. One of those applications is face recognition. One of the first methods proposed for this task was Sparse Representation Based Classification (SRC). Since then, several different methods based on SRC have been proposed. These include dictionary-learning-based methods, as well as patch-based classification. This thesis aims to study face recognition using sparse classification. Multiple methods will be explored, and some of these will be tested extensively in order to provide a comprehensive view of the field.
There has recently been a surge of interest in the area of face recognition, due especially to developments related to deep learning. These have stimulated interest in the area, not only from an academic perspective but also from a commercial one. Although such methods provide the best accuracy when performing face recognition tasks, they generally require millions of face images, considerable processing power and a substantial amount of development time. In recent years, sparse representations have been successfully applied to several computer vision applications. One of those applications is face recognition. One of the first methods proposed for this task was Sparse Representation Based Classification (SRC). Since then, several different methods based on SRC have been proposed. These include dictionary learning methods and methods based on the classification of image patches. The goal of this thesis is to study face recognition using sparse representations. Multiple methods will be explored, and some of them will be tested extensively in order to provide a comprehensive view of the area.
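A minimal sketch of the SRC decision rule mentioned in this abstract is given below: the test face is sparsely coded over the stacked training faces, and the identity whose training columns give the smallest reconstruction residual wins. A plain ISTA loop stands in for the l1 solver, and the random "faces", sizes and regularization weight are purely illustrative.

import numpy as np

def soft(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(A, y, lam=0.05, iters=500):
    """Plain ISTA for min_a 0.5||y - A a||^2 + lam ||a||_1."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2      # 1 / Lipschitz constant of the gradient
    a = np.zeros(A.shape[1])
    for _ in range(iters):
        a = soft(a - step * A.T @ (A @ a - y), step * lam)
    return a

def src_classify(train, labels, y, lam=0.05):
    """Sparse Representation-based Classification: smallest class-wise residual wins."""
    a = lasso_ista(train, y, lam)
    residuals = {}
    for c in np.unique(labels):
        a_c = np.where(labels == c, a, 0.0)     # keep only class-c coefficients
        residuals[c] = np.linalg.norm(y - train @ a_c)
    return min(residuals, key=residuals.get)

rng = np.random.default_rng(0)
train = rng.normal(size=(100, 40))              # 40 training faces as 100-dim features
train /= np.linalg.norm(train, axis=0)
labels = np.repeat(np.arange(8), 5)             # 8 identities, 5 images each
y = train[:, 12] + 0.05 * rng.normal(size=100)  # noisy copy of a class-2 image
print(src_classify(train, labels, y))           # expected: 2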
Master's in Computer and Telematics Engineering
APA, Harvard, Vancouver, ISO, and other styles
45

SIXTOVÁ, Jana. "Komunikace ve sféře výpočetní techniky." Master's thesis, 2009. http://www.nusl.cz/ntk/nusl-52466.

Full text
Abstract:
Annotation: This diploma thesis deals with the communication used within information technologies, especially computer slang. The theoretical part of the work is dedicated to the term 'slang' and its position within the structure of the national language. It presents different definitions of slang and the opinions of linguists on defining this term. The thesis also describes the linguistic and non-linguistic aspects of slang. The second part of the work comments on the history of information technologies and introduces an overview of the written sources concerning computer slang. The main aim of this work was to compile a dictionary of computer slang, composed largely on the basis of questionnaire research carried out over the period from 2007 to 2009. The linguistic material was then analysed with a focus on the basic characteristics of slang, especially synonymy and expressivity. The work also explores computer anglicisms and the character of their adaptation into the Czech language.
APA, Harvard, Vancouver, ISO, and other styles
46

"Semantic Sparse Learning in Images and Videos." Doctoral diss., 2014. http://hdl.handle.net/2286/R.I.25183.

Full text
Abstract:
Many learning models have been proposed for various tasks in visual computing. Popular examples include hidden Markov models and support vector machines. Recently, sparse-representation-based learning methods have attracted a lot of attention in the computer vision field, largely because of their impressive performance in many applications. In the literature, many such sparse learning methods focus on designing or applying learning techniques in a certain feature space without much explicit consideration of the possible interaction between the underlying semantics of the visual data and the employed learning technique. Rich semantic information in most visual data, if properly incorporated into algorithm design, should help achieve improved performance while delivering intuitive interpretation of the algorithmic outcomes. My study addresses the problem of how to explicitly consider the semantic information of the visual data in sparse learning algorithms. In this work, we identify four problems which are of great importance and broad interest to the community. Specifically, a novel approach is proposed to incorporate label information to learn a dictionary which is not only reconstructive but also discriminative; considering the formation process of face images, a novel image decomposition approach for an ensemble of correlated images is proposed, where a subspace is built from the decomposition and applied to face recognition; based on the observation that the foreground (or salient) objects are sparse in the input domain and the background is sparse in the frequency domain, a novel and efficient spatio-temporal saliency detection algorithm is proposed to identify the salient regions in video; and a novel hidden Markov model learning approach is proposed by utilizing a sparse set of pairwise comparisons among the data, which are easier to obtain and, in many scenarios such as evaluating motion skills in surgical simulations, more meaningful and consistent than traditional labels. In those four problems, different types of semantic information are modeled and incorporated in designing sparse learning algorithms for the corresponding visual computing tasks. Several real-world applications are selected to demonstrate the effectiveness of the proposed methods, including face recognition, spatio-temporal saliency detection, abnormality detection, spatio-temporal interest point detection, motion analysis and emotion recognition. In those applications, data of different modalities are involved, ranging from audio signals and images to video. Experiments on large-scale real-world data, with comparisons to state-of-the-art methods, confirm that the proposed approaches deliver clear advantages, showing that incorporating such semantic information dramatically improves the performance of general sparse learning methods.
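The input-domain/frequency-domain sparsity observation quoted above can be illustrated with a deliberately simple toy: model the background with the few largest Fourier coefficients of the image and treat the residual as salient. This is only an illustration of the intuition on synthetic data, not the spatio-temporal algorithm proposed in the dissertation.

import numpy as np

def toy_saliency(img, k=20):
    """Toy illustration: model the background with the k largest-magnitude Fourier
    coefficients (background sparse in frequency) and call the remainder salient
    (foreground sparse in the spatial domain)."""
    F = np.fft.fft2(img)
    thresh = np.sort(np.abs(F).ravel())[-k]
    background = np.fft.ifft2(np.where(np.abs(F) >= thresh, F, 0)).real
    return np.abs(img - background)

# Smooth periodic background plus one small bright blob.
x, y = np.meshgrid(np.arange(64), np.arange(64))
img = np.sin(x / 5.0) + np.cos(y / 7.0)
img[30:34, 40:44] += 3.0
sal = toy_saliency(img)
print(np.unravel_index(np.argmax(sal), sal.shape))  # lands near the blob at rows 30-33, cols 40-43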
Dissertation/Thesis
Ph.D. Computer Science 2014
APA, Harvard, Vancouver, ISO, and other styles
47

"Scaling Up Large-scale Sparse Learning and Its Application to Medical Imaging." Doctoral diss., 2017. http://hdl.handle.net/2286/R.I.44043.

Full text
Abstract:
Large-scale $\ell_1$-regularized loss minimization problems arise in high-dimensional applications such as compressed sensing and high-dimensional supervised learning, including classification and regression problems. In many applications, it remains challenging to apply the sparse learning model to large-scale problems that have massive data samples with high-dimensional features. One popular and promising strategy is to scale up the optimization in parallel. Parallel solvers run on multiple cores of a shared-memory system or in a distributed environment to speed up the computation, but their practical use is limited by the huge dimensionality of the feature space and by synchronization problems. In this dissertation, I carry out research along this direction, with a particular focus on scaling up the optimization of sparse learning for supervised and unsupervised learning problems. For supervised learning, I first propose an asynchronous parallel solver to optimize the large-scale sparse learning model in a multithreading environment. Moreover, I propose a distributed framework to conduct the learning process when the dataset is stored in a distributed fashion across different machines. The proposed model is then further extended to the study of genetic risk factors for Alzheimer's Disease (AD) across different research institutions, integrating a group feature selection framework to rank the top risk SNPs for AD. For the unsupervised learning problem, I propose a highly efficient solver, termed Stochastic Coordinate Coding (SCC), which scales up the optimization of dictionary learning and sparse coding problems. A common issue in medical imaging research is that patients' longitudinal features across different time points are beneficial to study together. To further improve the dictionary learning model, I propose a multi-task dictionary learning method, learning the different tasks simultaneously and utilizing shared and individual dictionaries to encode both consistent and changing imaging features.
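As a concrete reference point for the $\ell_1$-regularized problems discussed here, a minimal cyclic coordinate-descent lasso solver is sketched below. It is sequential, so it shows only the per-coordinate soft-threshold update, not the asynchronous, distributed or stochastic-coordinate-coding machinery contributed by the dissertation; data and regularization weight are illustrative.

import numpy as np

def soft(v, t):
    return np.sign(v) * max(abs(v) - t, 0.0)

def lasso_cd(X, y, lam, sweeps=100):
    """Cyclic coordinate descent for min_w 0.5/n ||y - Xw||^2 + lam ||w||_1."""
    n, d = X.shape
    w = np.zeros(d)
    col_sq = (X ** 2).sum(axis=0) / n
    residual = y - X @ w
    for _ in range(sweeps):
        for j in range(d):
            # Partial residual including feature j, then a 1-D soft-threshold update.
            rho = X[:, j] @ (residual + X[:, j] * w[j]) / n
            w_new = soft(rho, lam) / col_sq[j]
            residual += X[:, j] * (w[j] - w_new)
            w[j] = w_new
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
w_true = np.zeros(50); w_true[[3, 17, 42]] = [2.0, -1.5, 1.0]
y = X @ w_true + 0.1 * rng.normal(size=200)
print(np.nonzero(np.abs(lasso_cd(X, y, lam=0.1)) > 1e-3)[0])   # recovers the planted support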
Dissertation/Thesis
Doctoral Dissertation Computer Science 2017
APA, Harvard, Vancouver, ISO, and other styles
48

Li, Lingbo. "Nonparametric Bayesian Models for Joint Analysis of Imagery and Text." Diss., 2014. http://hdl.handle.net/10161/8675.

Full text
Abstract:

It has become increasingly important to develop statistical models to manage large-scale, high-dimensional image data. This thesis presents novel hierarchical nonparametric Bayesian models for joint analysis of imagery and text. The thesis consists of two main parts.

The first part is based on single-image processing. We first present a spatially dependent model for simultaneous image segmentation and interpretation. Given a corrupted image, by imposing spatial inter-relationships within the imagery, the model not only improves reconstruction performance but also yields smooth segmentation. Then we develop an online variational Bayesian algorithm for dictionary learning to process large-scale datasets, based on online stochastic optimization with a natural gradient step. We show that the dictionary is learned simultaneously with image reconstruction on large natural images containing tens of millions of pixels.

The second part applies dictionary learning to the joint analysis of multiple images and text to infer relationships among images. We show that feature extraction and image organization with annotation (when available) can be integrated by unifying dictionary learning and hierarchical topic modeling. We present image organization in both "flat" and hierarchical constructions. Compared with traditional algorithms, in which feature extraction is separated from model learning, our algorithms not only fit the datasets better but also provide richer and more interpretable structures of the images.
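Leaving the variational and nonparametric machinery aside, the online flavour of dictionary learning referred to in the first part can be sketched as a point-estimate stand-in: each mini-batch is sparse-coded against the current dictionary, and the dictionary then takes a small gradient step. The batch sizes, learning rate and random "patch stream" below are illustrative, and the natural-gradient variational updates of the thesis are not reproduced.

import numpy as np

def online_dictionary_learning(patch_stream, n_atoms=64, lam=0.1, lr=0.05):
    """Simplified online dictionary learning: sparse-code each mini-batch with a few
    ISTA steps, then take a gradient step on the dictionary and renormalise its atoms."""
    rng = np.random.default_rng(0)
    D = None
    for X in patch_stream:                       # X: (batch, dim)
        if D is None:
            D = rng.normal(size=(X.shape[1], n_atoms))
            D /= np.linalg.norm(D, axis=0)
        # Sparse codes A (n_atoms, batch) via a few ISTA iterations.
        A = np.zeros((n_atoms, X.shape[0]))
        step = 1.0 / np.linalg.norm(D, 2) ** 2
        for _ in range(30):
            G = D.T @ (D @ A - X.T)
            A = np.sign(A - step * G) * np.maximum(np.abs(A - step * G) - step * lam, 0)
        # Stochastic gradient step on the dictionary, then renormalise the atoms.
        D -= lr * (D @ A - X.T) @ A.T / X.shape[0]
        D /= np.maximum(np.linalg.norm(D, axis=0), 1e-8)
    return D

stream = (np.random.default_rng(i).normal(size=(128, 256)) for i in range(10))
D = online_dictionary_learning(stream)
print(D.shape)   # (256, 64)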


Dissertation
APA, Harvard, Vancouver, ISO, and other styles
49

"New Directions in Sparse Models for Image Analysis and Restoration." Doctoral diss., 2013. http://hdl.handle.net/2286/R.I.16472.

Full text
Abstract:
Effective modeling of high-dimensional data is crucial in information processing and machine learning. Classical subspace methods have been very effective in such applications. However, over the past few decades, there has been considerable research towards the development of new modeling paradigms that go beyond subspace methods. This dissertation focuses on the study of sparse models and their interplay with modern machine learning techniques such as manifold, ensemble and graph-based methods, along with their applications in image analysis and recovery. By considering graph relations between data samples while learning sparse models, graph-embedded codes can be obtained for use in unsupervised, supervised and semi-supervised problems. Using experiments on standard datasets, it is demonstrated that the codes obtained from the proposed methods outperform several baseline algorithms. In order to facilitate sparse learning with large-scale data, the paradigm of ensemble sparse coding is proposed, and different strategies for constructing weak base models are developed. Experiments with image recovery and clustering demonstrate that these ensemble models perform better when compared to conventional sparse coding frameworks. When examples from the data manifold are available, manifold constraints can be incorporated with sparse models, and two approaches are proposed to combine sparse coding with manifold projection. The improved performance of the proposed techniques in comparison to sparse coding approaches is demonstrated using several image recovery experiments. In addition to these approaches, it might be required in some applications to combine multiple sparse models with different regularizations. In particular, combining an unconstrained sparse model with non-negative sparse coding is important in image analysis, and it poses several algorithmic and theoretical challenges. A convex and an efficient greedy algorithm for recovering combined representations are proposed. Theoretical guarantees on sparsity thresholds for exact recovery using these algorithms are derived, and recovery performance is demonstrated using simulations on synthetic data. Finally, the problem of non-linear compressive sensing, where the measurement process is carried out in a feature space obtained using non-linear transformations, is considered. An optimized non-linear measurement system is proposed, and improvements in recovery performance are demonstrated in comparison to using random measurements as well as optimized linear measurements.
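The ensemble sparse coding paradigm mentioned here can be sketched very roughly: several weak sparse models, each restricted to a random subset of the atoms, reconstruct the same signal, and their outputs are averaged. The matching-pursuit base learner, subset sizes and random data are illustrative choices only; the snippet shows the mechanics and makes no claim about the quality gains reported in the dissertation.

import numpy as np

def weak_sparse_reconstruction(D, x, n_nonzero=3):
    """One weak base model: greedy matching pursuit on a (sub-)dictionary with unit-norm atoms."""
    r, recon = x.copy(), np.zeros_like(x)
    for _ in range(n_nonzero):
        k = int(np.argmax(np.abs(D.T @ r)))
        coef = D[:, k] @ r
        recon += coef * D[:, k]
        r -= coef * D[:, k]
    return recon

def ensemble_reconstruction(D_full, x, n_models=15, sub_size=32, seed=0):
    """Toy ensemble sparse coding: average reconstructions from weak models,
    each using a random subset of the atoms."""
    rng = np.random.default_rng(seed)
    recs = []
    for _ in range(n_models):
        idx = rng.choice(D_full.shape[1], size=sub_size, replace=False)
        recs.append(weak_sparse_reconstruction(D_full[:, idx], x))
    return np.mean(recs, axis=0)

rng = np.random.default_rng(1)
D = rng.normal(size=(64, 256)); D /= np.linalg.norm(D, axis=0)
x = D[:, 10] * 2.0 + D[:, 100] * -1.0 + 0.05 * rng.normal(size=64)
print(np.linalg.norm(x - ensemble_reconstruction(D, x)) / np.linalg.norm(x))  # relative error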
Dissertation/Thesis
Ph.D. Electrical Engineering 2013
APA, Harvard, Vancouver, ISO, and other styles
50

Ravindra, G. "Information Theoretic Approach To Extractive Text Summarization." Thesis, 2006. http://hdl.handle.net/2005/452.

Full text
Abstract:
Automatic text summarization techniques, which can reduce a source text to a summary text by content generalization or selection, have assumed significance in recent times due to the ever-expanding information explosion created by the World Wide Web. Summaries generated by generalization of information are called abstracts, and those generated by selection of portions of text (sentences, phrases etc.) are called extracts. Further, summaries could be generated for each document separately, or multiple documents could be summarized together to produce a single summary. The challenges in making machines generate extracts or abstracts are primarily due to the lack of understanding of human cognitive processes. Summaries generated by humans seem to be influenced by their moral, emotional and ethical stance on the subject and their background knowledge of the content being summarized. These characteristics are poorly understood and difficult to model mathematically. Automatic summarization is further handicapped by the limitations of existing computing resources and the lack of good mathematical models of cognition. In view of these, the role of rigorous mathematical theory in summarization has been limited hitherto. The research reported in this thesis is a contribution towards bringing the power of well-established concepts from information theory to the field of summarization. Contributions of the thesis: The specific focus of this thesis is on extractive summarization. Its domain spans multi-document summarization as well as single-document summarization. Throughout the thesis, the words "summarization" and "summary" imply extract generation and sentence extracts, respectively. In this thesis, two novel summarizers referred to as ESCI (Extractive Summarization using Collocation Information) and De-ESCI (Dictionary enhanced ESCI) have been proposed. In addition, an automatic summary evaluation technique called DeFuSE (Dictionary enhanced Fuzzy Summary Evaluator) has also been introduced. The mathematical basis for the evolution of the scoring scheme proposed in this thesis and its relationship with other well-known summarization algorithms such as Latent Semantic Indexing (LSI) is also derived. The work detailed in this thesis is specific to the domain of extractive summarization of unstructured text, without taking into account data set characteristics such as the positional importance of sentences. This is to ensure that the summarizer works well for a broad class of documents and to keep the proposed models as generic as possible. Central to the proposed work is the concept of the "Collocation Information of a word", its quantification and its application to summarization. "Collocation Information" (CI) is the amount of information (Shannon's measure) that a word and its collocations together contribute to the total information in the document(s) being summarized. The CI of a word has been computed using Shannon's measure for information using a joint probability distribution. Further, a base value of CI called the "Discrimination Threshold" (DT) has also been derived. To determine DT, sentences from a large collection of documents covering various topics, including the topic covered by the document(s) being summarized, were broken down into sequences of word collocations. The number of possible neighbors for a word within a specified collocation window was determined. This number has been called the "cardinality of the collocating set" and is represented as |ℵ(w)|.
It is proved that if |ℵ(w)| determined from this large document collection for any word w is fixed, then the maximum value of the CI for a word w is proportional to |ℵ(w)|. This constrained maximum is the "Discrimination Threshold" and is used as the base value of CI. Experimental evidence detailed in this thesis shows that sentences containing words with CI greater than DT are most likely to be useful in an extract. Words in every sentence of the document(s) being summarized have been assigned scores based on the difference between their current value of CI and their respective DT. Individual word scores have been summed to derive a score for every sentence. Sentences are ranked according to their scores, and the first few sentences in the rank order have been selected as the extract summary. Redundant and semantically similar sentences have been excluded from the selection process using a simple similarity detection algorithm. This novel method for extraction has been called ESCI in this thesis. In the second part of the thesis, the advantages of tagging words as nouns, verbs, adjectives and adverbs without the use of sense disambiguation have been explored. A hierarchical model for abstraction of knowledge has been proposed, and those cases where such a model can improve summarization accuracy have been explained. Knowledge abstraction has been achieved by converting collocations into their hypernymous versions. The number of levels of abstraction varies based on the sense tag given to each word in the collocation being abstracted. Once abstractions have been determined, the Expectation-Maximization algorithm is used to determine the probability value of each collocation at every level of abstraction. A combination of abstracted collocations from various levels is then chosen, and sentences are assigned scores based on the collocation information of these abstractions. This summarization scheme has been referred to as De-ESCI (Dictionary enhanced ESCI). It had been observed in many human summary data sets that the factual attribute of the human determines the choice of noun and verb pairs. Similarly, the emotional attribute of the human determines the choice of the number of noun and adjective pairs. In order to bring these attributes into the machine-generated summaries, two variants of De-ESCI have been proposed. The summarizer with the factual attribute has been called De-ESCI-F, while the summarizer with the emotional attribute has been called De-ESCI-E in this thesis. Both create summaries having two parts. The first part of the summary created by De-ESCI-F is obtained by scoring and selecting only those sentences where a fixed number of nouns and verbs occur. The second part of De-ESCI-F is obtained by ranking and selecting those sentences which do not qualify for the selection process in the first part. Assigning sentence scores and selecting sentences for the second part of the summary is done exactly as in ESCI. Similarly, the first part of De-ESCI-E is generated by scoring and selecting only those sentences where a fixed number of nouns and adjectives occur.
The second part of the summary produced by De-ESCI-E is exactly like the second part in De-ESCI-F. As the model summary generated by human summarizers may or may not contain sentences with preference given to qualifiers (adjectives), the automatic summarizer does not know a priori whether to choose sentences with qualifiers over those without qualifiers. As there are two versions of the summary produced by De-ESCI-F and De-ESCI-E, one of them should be closer to the human summarizer's point of view (in terms of giving importance to qualifiers). This technique of choosing the best candidate summary has been referred to as De-ESCI-F/E. Performance metrics: The focus of this thesis is to propose new models and sentence ranking techniques aimed at improving the accuracy of the extract in terms of sentences selected, rather than the readability of the summary. As a result, the order of sentences in the summary is not given importance during evaluation. Automatic evaluation metrics have been used, and the performance of the automatic summarizer has been evaluated in terms of precision, recall and f-scores obtained by comparing its output with model human-generated extract summaries. A novel summary evaluator called DeFuSE has been proposed in this thesis, and its scores are used along with the scores given by a standard evaluator called ROUGE. DeFuSE evaluates an extract in terms of precision, recall and f-score, relying on the WordNet hypernymy structure to identify semantically similar sentences in a document. It also uses fuzzy set theory to compute the extent to which a sentence from the machine-generated extract belongs to the model summary. The performance of candidate summarizers has been discussed in terms of percentage improvement in f-score relative to the baselines. The average of the ROUGE and DeFuSE f-scores for every summary is computed, and the mean value of these scores is used to compare performance improvement. Performance: For illustrative purposes, the DUC 2002 and DUC 2003 multi-document data sets have been used. From these data sets, only the 400-word summaries of DUC 2002 and the track-4 (novelty track) summaries of DUC 2003 are useful for the evaluation of sentence extracts, and hence only these have been used. The f-score has been chosen as the measure of performance. Standard baselines such as coverage, size and lead, as well as probabilistic baselines, have been used to measure the percentage improvement in f-score of candidate summarizers relative to these baselines. Further, summaries generated by MEAD using centroid and length as features for ranking (MEAD-CL), MEAD using positional, centroid and length features for ranking (MEAD-CLP), the Microsoft Word automatic summarizer (MS-Word) and a Latent Semantic Indexing (LSI) based summarizer were used to compare the performance of the proposed summarization schemes.
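A compact sketch of the scoring idea summarized above is given below: collocation pairs are counted within a window, each word's Shannon information contribution is computed from the joint distribution of pairs, and sentences are ranked by the summed word scores. The tokenisation, the window size and the crude mean-CI stand-in for the Discrimination Threshold are simplifications, not the thesis's exact formulation.

import math
from collections import Counter

def collocation_info(sentences, window=3):
    """Joint counts of (word, neighbour) pairs within a window, turned into a per-word
    Shannon information contribution: CI(w) = -sum_v p(w, v) log2 p(w, v)."""
    pair_counts = Counter()
    for sent in sentences:
        words = sent.lower().split()
        for i, w in enumerate(words):
            for v in words[max(0, i - window): i] + words[i + 1: i + 1 + window]:
                pair_counts[(w, v)] += 1
    total = sum(pair_counts.values())
    ci = Counter()
    for (w, _v), c in pair_counts.items():
        p = c / total
        ci[w] += -p * math.log2(p)
    return ci

def extract_summary(sentences, n=2, window=3):
    ci = collocation_info(sentences, window)
    baseline = sum(ci.values()) / max(len(ci), 1)      # crude stand-in for the DT
    def score(sent):
        return sum(ci[w] - baseline for w in sent.lower().split())
    return sorted(sentences, key=score, reverse=True)[:n]

doc = ["Dictionary learning builds sparse models of image patches.",
       "Sparse models represent patches with few dictionary atoms.",
       "The weather was pleasant that afternoon.",
       "Learned dictionaries outperform fixed wavelet bases for denoising."]
print(extract_summary(doc, n=2))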
APA, Harvard, Vancouver, ISO, and other styles
