Dissertations / Theses on the topic 'Intelligent language processing'

Consult the top 50 dissertations / theses for your research on the topic 'Intelligent language processing.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Lerjebo, Linus, and Johannes Hägglund. "Intelligent chatbot assistant: A study of Natural Language Processing and Artificial Intelligence." Thesis, Högskolan i Halmstad, Akademin för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-42691.

Full text
Abstract:
Development and research in Artificial Intelligence have surged in recent years, including in the medical field. Despite the new technology and tools available, medical staff still work under heavy workloads. The goal of this thesis is to analyze the possibilities of a chatbot whose purpose is to assist the medical staff and provide safety for patients by guaranteeing that they are being monitored. Using technologies such as Artificial Intelligence, Natural Language Processing, and Voice over Internet Protocol, the chatbot can communicate with the patient. It works as an assistant for the staff and forwards the information from the calls to the medical staff. With the answers provided during the call, the staff no longer need to ask routine questions every time and can provide help more quickly. The chatbot is administered through a web application where administrators can initiate calls and add patients to the database.
APA, Harvard, Vancouver, ISO, and other styles
2

Huang, Qiang. "Speech and language processing for intelligent call routing." Thesis, University of East Anglia, 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.426693.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Glinos, Demetrios George. "An intelligent editor for natural language processing of unrestricted text." Master's thesis, University of Central Florida, 1999. http://digital.library.ucf.edu/cdm/ref/collection/RTD/id/16913.

Full text
Abstract:
University of Central Florida College of Arts and Sciences Thesis
The understanding of natural language by computational methods has been a continuing and elusive problem in artificial intelligence. In recent years there has been a resurgence in natural language processing research. Much of this work has been on empirical or corpus-based methods which use a data-driven approach to train systems on large amounts of real language data. Using corpus-based methods, the performance of part-of-speech (POS) taggers, which assign to the individual words of a sentence their appropriate part of speech category (e.g., noun, verb, preposition), now rivals human performance levels, achieving accuracies exceeding 95%. Such taggers have proved useful as preprocessors for such tasks as parsing, speech synthesis, and information retrieval. Parsing remains, however, a difficult problem, even with the benefit of POS tagging. Moreover, as sentence length increases, there is a corresponding combinatorial explosion of alternative possible parses. Consider the following sentence from a New York Times online article: After Salinas was arrested for murder in 1995 and lawyers for the bank had begun monitoring his accounts, his personal banker in New York quietly advised Salinas' wife to move the money elsewhere, apparently without the consent of the legal department. To facilitate parsing and other tasks, we would like to decompose this sentence into the following three shorter sentences which, taken together, convey the same meaning as the original: 1. Salinas was arrested for murder in 1995. 2. Lawyers for the bank had begun monitoring his accounts. 3. His personal banker in New York quietly advised Salinas' wife to move the money elsewhere, apparently without the consent of the legal department. This study investigates the development of heuristics for decomposing such long sentences into sets of shorter sentences without affecting the meaning of the original sentences. Without parsing or semantic analysis, heuristic rules were developed based on: (1) the output of a POS tagger (Brill's tagger); (2) the punctuation contained in the input sentences; and (3) the words themselves. The heuristic algorithms were implemented in an intelligent editor program which first augmented the POS tags and assigned tags to punctuation, and then tested the rules against a corpus of 25 New York Times online articles containing approximately 1,200 sentences and over 32,000 words, with good results. Recommendations are made for improving the algorithms and for continuing this line of research.
M.S.
Computer Science
Arts and Sciences
xii, 220 leaves, bound : ill. ; 28 cm.
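To make the flavour of such POS-and-punctuation heuristics concrete, here is a minimal Python sketch. It is an illustration only, not the rule set developed in the thesis: it assumes Penn Treebank tags (as produced by Brill's tagger) and splits at commas and coordinating conjunctions whenever a finite verb appears on both sides of the candidate boundary.

```python
# Illustrative sketch of splitting a long sentence into shorter ones using
# only POS tags and punctuation. Tags follow Penn Treebank conventions; the
# rules are simplified stand-ins for the thesis's actual heuristics, which
# also handle subordinators such as the leading "After".

def has_finite_verb(tagged):
    """A clause candidate must contain a verb to stand as a sentence."""
    return any(tag.startswith("VB") for _, tag in tagged)

def decompose(tagged_sentence):
    """Split at commas and coordinating conjunctions (CC) whenever the
    material on both sides of the split point contains a verb."""
    pieces, current = [], []
    for i, (word, tag) in enumerate(tagged_sentence):
        rest = tagged_sentence[i + 1:]
        if tag in {",", "CC"} and has_finite_verb(current) and has_finite_verb(rest):
            pieces.append(current)   # close the clause before the boundary
            current = []             # drop the comma/conjunction itself
        else:
            current.append((word, tag))
    if current:
        pieces.append(current)
    return [" ".join(w for w, _ in p) for p in pieces]

tagged = [("After", "IN"), ("Salinas", "NNP"), ("was", "VBD"), ("arrested", "VBN"),
          ("in", "IN"), ("1995", "CD"), ("and", "CC"), ("lawyers", "NNS"),
          ("had", "VBD"), ("begun", "VBN"), ("monitoring", "VBG"),
          ("his", "PRP$"), ("accounts", "NNS"), (",", ","),
          ("his", "PRP$"), ("banker", "NN"), ("advised", "VBD"),
          ("them", "PRP"), (".", ".")]
print(decompose(tagged))
# -> ['After Salinas was arrested in 1995',
#     'lawyers had begun monitoring his accounts',
#     'his banker advised them .']
```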
APA, Harvard, Vancouver, ISO, and other styles
4

Amaral, Luiz Alexandre Mattos do. "Designing intelligent language tutoring systems for integration into foreign language instruction." Columbus, Ohio : Ohio State University, 2007. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1179979688.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Amaral, Luiz A. "Designing intelligent language tutoring systems for integration into foreign language instruction." The Ohio State University, 2007. http://rave.ohiolink.edu/etdc/view?acc_num=osu1179979688.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Moiseeva, Alena [author], and Hinrich Schütze [academic supervisor]. "Statistical natural language processing methods for intelligent process automation / Alena Moiseeva; Supervisor: Hinrich Schütze." München : Universitätsbibliothek der Ludwig-Maximilians-Universität, 2020. http://d-nb.info/1218466944/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Bailey, Stacey M. "Content Assessment in Intelligent Computer-aided Language Learning: Meaning Error Diagnosis for English as a Second Language." Columbus, Ohio : Ohio State University, 2008. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1204556485.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Sasa, Yuko. "Intelligence Socio-Affective pour un Robot : primitives langagières pour une interaction évolutive d'un robot de l’habitat intelligent." Thesis, Université Grenoble Alpes (ComUE), 2018. http://www.theses.fr/2018GREAM041/document.

Full text
Abstract:
Natural Language Processing (NLP) is steadily improving in vocabulary coverage, morphosyntactic complexity, and the handling of style and aesthetics in human speech. Affective Computing likewise tends to integrate an "emotional" dimension, sharing with NLP the goal of disambiguating natural language and increasing the naturalness of human-machine interaction. In social robotics, this interaction is modelled in dialogue systems that tend to create a dimension of attachment whose effects must be controlled ethically and collectively. The dynamics of situated human language, however, undermine the efficiency of automated systems. The first hypothesis of this thesis is that every interaction involves a "socio-affective glue" that brings into synchrony two individuals, each with a social role embedded in the interaction context. This glue arises from processes with an altruistic dimension, orthogonal to the dominance dimension described in studies of emotion. It would carry language events between interlocutors, constantly modifying their relation and their roles, which in turn reshape the glue itself, ensuring the continuity of communication. The second hypothesis is that the glue is built from a "pure socio-affective prosody" carried by audible and visible vocal micro-expressions. The effect of these language events would be gradual, following the degree of control over communicative intentionality, observed successively through language primitives: (1) mouth noises (neither phonetic nor phonological), (2) pre-lexical sounds, (3) interjections and onomatopoeia, and (4) imitations with controlled lexical content carrying the same socio-affective prosody. A living-lab methodology was developed on the Domus platform, based on agile, iterative loops co-constructed with industrial and societal partners. A Wizard-of-Oz system, EmOz, controls these vocal primitives as the sole language repertoire of a smart-home butler robot interacting with elderly people in relational isolation, yielding a large corpus, EmOz Elderly Expressions (EEE). The relational isolation methodologically provides a contrastive situation in which the glue is degraded, so that the effects of the primitives can be observed through multimodal cues. The societal stakes addressed by gerontechnology show isolation to be a frailty factor: degraded communication erodes the relational network of elderly people, even though these bonds benefit their health and well-being, as the emergence of assistive robotics illustrates. The automated system derived from the data and analyses of this study could then train people to fully engage their relationship-building mechanisms and restore the desire to communicate with their human entourage. Analyses of the collected EEE corpus show the relation evolving through various temporally organised interactional cues.
These parameters are intended to feed an incremental dialogue system, SASI. Its first steps take the form of a speech-recognition prototype whose robustness rests not on the accuracy of the recognised language content, but on recognising the degree of glue, that is, the relational state between the interlocutors. Recognition errors would thus tend to be compensated by the adaptive socio-affective intelligence with which the robot could be endowed.
APA, Harvard, Vancouver, ISO, and other styles
9

Wärmegård, Erik. "Intelligent chatbot assistant: A study of integration with VOIP and Artificial Intelligence." Thesis, Högskolan i Halmstad, Akademin för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-42693.

Full text
Abstract:
Development and research in Artificial Intelligence have increased in recent years, and the field of medicine is no exception as a target for this modern technology. Despite new research and tools in favor of medical care, the staff is still under heavy workloads. The goal of this thesis is to analyze and propose the possibility of a chatbot that aims to ease the pressure on the medical staff and to guarantee that patients are being monitored. With Artificial Intelligence, VOIP, Natural Language Processing, and web development, this chatbot can communicate with a patient and act as an assistant that conducts preparatory work for the medical staff. The system is administered through a web application where the administrator can initiate calls and store clients in the database. To ascertain that the system operates in real time, several tests have been carried out concerning the latency between subsystems and the quality of service.
APA, Harvard, Vancouver, ISO, and other styles
10

Kauppi, Ilkka. "Intermediate language for mobile robots : a link between the high-level planner and low-level services in robots /." Espoo [Finland] : VTT Technical Research Centre of Finland, 2003. http://www.vtt.fi/inf/pdf/publications/2003/P510.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
11

Dabiri, Sina. "Application of Deep Learning in Intelligent Transportation Systems." Diss., Virginia Tech, 2019. http://hdl.handle.net/10919/87409.

Full text
Abstract:
The rapid growth of population and the permanent increase in the number of vehicles engender several issues in transportation systems, which in turn call for an intelligent and cost-effective approach to resolve them efficiently. One cost-effective approach to improving and optimizing transportation-related problems is to unlock the hidden knowledge in ever-increasing spatiotemporal and crowdsourced information collected from various sources such as mobile phone sensors (e.g., GPS sensors) and social media networks (e.g., Twitter). Data mining and machine learning techniques are the major tools for analyzing the collected data and extracting useful knowledge on traffic conditions and mobility behaviors. Deep learning is an advanced branch of machine learning that has enjoyed a lot of success in computer vision and natural language processing in recent years. However, deep learning techniques have been applied to only a small number of transportation applications such as traffic flow and speed prediction. Accordingly, my main objective in this dissertation is to develop state-of-the-art deep learning architectures for transport-related applications that have not yet been treated in much detail, including (1) travel mode detection, (2) vehicle classification, and (3) traffic information systems. To this end, an efficient representation for spatiotemporal and crowdsourced data (e.g., GPS trajectories) must also be designed so that it is not only adaptable to deep learning architectures but also contains information useful for solving the task at hand. Furthermore, since the good performance of a deep learning algorithm is primarily contingent on access to a large volume of training samples, efficient data collection and labeling strategies are developed for different data types and applications. Finally, the performance of the proposed representations and models is evaluated by comparison with several state-of-the-art techniques in the literature. The experimental results clearly and consistently demonstrate the superiority of the proposed deep-learning-based framework for each application.
PHD
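One common way to make raw GPS trajectories digestible by deep architectures, in the spirit of the representation question raised above, is to convert each trajectory into fixed-length channels of kinematic features. The sketch below is a hedged illustration; the feature choices, fixed length, and function names are assumptions, not the dissertation's exact design.

```python
# Hedged sketch: turn a GPS trajectory into fixed-length feature channels
# (speed, per-step speed change) that a 1-D CNN could consume.
import math

def haversine_m(p, q):
    """Great-circle distance in metres between (lat, lon) points."""
    R = 6371000.0
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * R * math.asin(math.sqrt(a))

def trajectory_channels(points, times, length=200):
    """points: [(lat, lon)], times: [unix seconds]. Returns speed and
    per-step speed-delta channels, zero-padded/truncated to a fixed length."""
    speeds = []
    for i in range(1, len(points)):
        dt = max(times[i] - times[i - 1], 1e-6)
        speeds.append(haversine_m(points[i - 1], points[i]) / dt)
    # per-step delta as a crude acceleration proxy
    deltas = [speeds[i] - speeds[i - 1] for i in range(1, len(speeds))]
    pad = lambda xs: (xs + [0.0] * length)[:length]
    return pad(speeds), pad(deltas)

pts = [(40.0, -83.0), (40.0005, -83.0), (40.0015, -83.0)]
spd, acc = trajectory_channels(pts, times=[0, 10, 20], length=8)
print(spd)  # two real speed values (~5.6, ~11.1 m/s), then zero padding
```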
APA, Harvard, Vancouver, ISO, and other styles
12

Boyd, Adriane Amelia. "Detecting and Diagnosing Grammatical Errors for Beginning Learners of German: From Learner Corpus Annotation to Constraint Satisfaction Problems." The Ohio State University, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=osu1325170396.

Full text
APA, Harvard, Vancouver, ISO, and other styles
13

Elvir, Miguel. "EPISODIC MEMORY MODEL FOR EMBODIED CONVERSATIONAL AGENTS." Master's thesis, University of Central Florida, 2010. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/3000.

Full text
Abstract:
Embodied Conversational Agents (ECAs) form part of a range of virtual characters whose intended purposes include engaging in natural conversations with human users. While the literature is rife with descriptions of attempts at producing viable ECA architectures, few authors have addressed the role of episodic memory models in conversational agents. This form of memory, which provides a sense of autobiographic record-keeping in humans, has only recently been peripherally integrated into dialog management tools for ECAs. In our work, we propose to take a closer look at the shared characteristics of episodic memory models in recent examples from the field. Additionally, we propose several enhancements to these existing models through a unified episodic memory model for ECAs. As part of our research into episodic memory models, we present a process for determining the prevalent contexts in the conversations obtained from the aforementioned interactions. The process demonstrates the use of statistical and machine learning services, as well as Natural Language Processing techniques, to extract relevant snippets from conversations. Finally, mechanisms to store, retrieve, and recall episodes from previous conversations are discussed. A primary contribution of this research is in the context of contemporary memory models for conversational agents and cognitive architectures. To the best of our knowledge, this is the first attempt at providing a comparative summary of existing works. As implementations of ECAs become more complex and encompass more realistic conversation engines, we expect that episodic memory models will continue to evolve and further enhance the naturalness of conversations.
M.S.Cp.E.
School of Electrical Engineering and Computer Science
Engineering and Computer Science
Computer Engineering MSCpE
APA, Harvard, Vancouver, ISO, and other styles
14

Laurent, Mario. "Recherche et développement du Logiciel Intelligent de Cartographie Inversée, pour l’aide à la compréhension de texte par un public dyslexique." Thesis, Université Clermont Auvergne‎ (2017-2020), 2017. http://www.theses.fr/2017CLFAL016/document.

Full text
Abstract:
Children with language impairments, such as dyslexia, face great difficulties when learning to read and in any subsequent reading task. These difficulties severely compromise access to the meaning of the texts they encounter during their schooling, which leads to learning difficulties and often to academic failure. Over the past fifteen years, tools developed in the field of Natural Language Processing have been repurposed as aid and compensation strategies for students in difficulty. At the same time, the use of concept maps and mind maps to help dyslexic children formulate their thoughts or retain knowledge has become widespread. This thesis surveys and cross-references knowledge about dyslexic readers, their care and their difficulties; the pedagogical possibilities opened up by the use of maps; and technologies for automatic summarisation and keyword extraction. The objective is to build an innovative piece of software that automatically transforms a given text into a map that facilitates reading comprehension while including functionality adapted to dyslexic teenagers. The project led, first, to an exploratory experiment on supporting text comprehension with mind maps, which made it possible to define new lines of research, and second, to a prototype of automatic mapping software presented at the end of the thesis.
APA, Harvard, Vancouver, ISO, and other styles
15

Silvestre, Cerdà Joan Albert. "Different Contributions to Cost-Effective Transcription and Translation of Video Lectures." Doctoral thesis, Universitat Politècnica de València, 2016. http://hdl.handle.net/10251/62194.

Full text
Abstract:
In recent years, on-line multimedia repositories have experienced strong growth that has consolidated them as essential knowledge assets, especially in the area of education, where large repositories of video lectures have been built in order to complement or even replace traditional teaching methods. However, most of these video lectures are neither transcribed nor translated due to a lack of cost-effective solutions that give sufficiently accurate results. Solutions of this kind are clearly necessary to make these lectures accessible to speakers of different languages and to people with hearing disabilities. They would also facilitate lecture search and analysis functions, such as classification, recommendation or plagiarism detection, as well as the development of advanced educational functionalities like content summarisation to assist student note-taking. For this reason, the main aim of this thesis is to develop a cost-effective solution capable of transcribing and translating video lectures to a reasonable degree of accuracy. More specifically, we address the integration of state-of-the-art techniques in Automatic Speech Recognition and Machine Translation into large video lecture repositories to generate high-quality multilingual video subtitles without human intervention and at a reduced computational cost. We also explore the potential benefits of exploiting the information known a priori about these repositories, that is, lecture-specific knowledge such as the speaker, topic or slides, to create specialised, in-domain transcription and translation systems by means of massive adaptation techniques. The proposed solutions have been tested in real-life scenarios through several objective and subjective evaluations, with very positive results. The main outcome of this thesis, The transLectures-UPV Platform, has been publicly released as open-source software and, at the time of writing, is serving automatic transcriptions and translations for several thousand video lectures in many Spanish and European universities and institutions.
Silvestre Cerdà, JA. (2016). Different Contributions to Cost-Effective Transcription and Translation of Video Lectures [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/62194
Doctoral thesis
APA, Harvard, Vancouver, ISO, and other styles
16

Venter, Wessel Johannes. "An embodied conversational agent with autistic behaviour." Thesis, Stellenbosch : Stellenbosch University, 2012. http://hdl.handle.net/10019.1/20115.

Full text
Abstract:
Thesis (MSc)--Stellenbosch University, 2012.
In this thesis we describe the creation of an embodied conversational agent which exhibits the behavioural traits of a child who has Asperger Syndrome. The agent is rule-based rather than artificially intelligent, a choice for which we give justification. We then describe the design and implementation of the agent, paying particular attention to the interaction between emotion, personality and social context. A 3D demonstration program shows that the typical output conforms to Asperger-like answers, with corresponding emotional responses.
APA, Harvard, Vancouver, ISO, and other styles
17

Panesar, Kulvinder. "Natural language processing (NLP) in Artificial Intelligence (AI): a functional linguistic perspective." Vernon Press, 2020. http://hdl.handle.net/10454/18140.

Full text
Abstract:
This chapter encapsulates the multi-disciplinary nature of NLP in AI and reports on a linguistically orientated conversational software agent (CSA) framework (Panesar 2017) sensitive to natural language processing (NLP) and language in the agent environment. We present a novel computational approach that uses the functional linguistic theory of Role and Reference Grammar (RRG) as the linguistic engine. Viewing language as action, utterances change the state of the world, and the mental states of speakers and hearers change as a result of these utterances. A plan-based method of discourse management (DM) using the BDI model architecture is deployed to support greater conversational complexity. This CSA investigates the integration, intersection and interface of language, knowledge, speech act constructions (SAC) as grammatical objects, and the sub-models of BDI and DM for NLP. We present an investigation into the intersection and interface between our linguistic and knowledge (belief base) models for both dialogue management and planning. The architecture has three models: (1) a linguistic model based on RRG; (2) an Agent Cognitive Model (ACM) with (a) a knowledge representation model employing conceptual graphs (CGs) serialised to the Resource Description Framework (RDF), and (b) a planning model underpinned by BDI concepts, intentionality and rational interaction; and (3) a dialogue model employing common ground. Use of RRG as a linguistic engine for the CSA was successful. We identify the complexity of the semantic gap between internal representations and detail a conceptual bridging solution.
APA, Harvard, Vancouver, ISO, and other styles
18

Li, Wenhui. "Sentiment analysis: Quantitative evaluation of subjective opinions using natural language processing." Thesis, University of Ottawa (Canada), 2008. http://hdl.handle.net/10393/28000.

Full text
Abstract:
Sentiment Analysis consists of recognizing sentiment orientation towards specific subjects within natural language texts. Most research in this area focuses on classifying documents as positive or negative. The purpose of this thesis is to quantitatively evaluate subjective opinions in customer reviews using a five-star rating system, which is widely used on online review websites, and to make the predicted score as accurate as possible. Firstly, this thesis presents two methods for rating reviews: classifying reviews by supervised learning methods, as in multi-class classification, or rating reviews by using association scores of sentiment terms with a set of seed words extracted from the corpus, i.e. the unsupervised learning method. We extend the feature selection approach used in Turney's PMI-IR estimation by introducing semantic relatedness measures based upon the content of WordNet. This thesis reports on experiments using the two methods mentioned above for rating reviews using the combined feature set enriched with WordNet-selected sentiment terms. The results of these experiments suggest ways in which incorporating WordNet relatedness measures into feature selection may yield improvements over classification and unsupervised learning methods which do not use them. Furthermore, via ordinal meta-classifiers, we utilize the ordering information contained in the scores of bank reviews to improve performance, we explore the effectiveness of re-sampling for reducing the problem of skewed data, and we check whether discretization benefits the ordinal meta-learning process. Finally, we combine the unsupervised and supervised meta-learning methods to optimize performance on our sentiment prediction task.
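As a toy illustration of the unsupervised method described above, the following sketch scores a review by the PMI-style association of its terms with positive and negative seed words. The seed lists, counts and the mapping onto a five-star scale are illustrative assumptions; the thesis additionally blends in WordNet-based relatedness.

```python
# Toy sketch of seed-word association scoring in the spirit of Turney's
# PMI-IR. Counts come from a small invented co-occurrence table; unseen
# pairs become strongly negative because of the tiny smoothing constant.
import math

POS_SEEDS = {"excellent", "good"}
NEG_SEEDS = {"poor", "bad"}

def pmi(cooc, counts, total, w, seed):
    """Pointwise mutual information from raw co-occurrence counts."""
    joint = cooc.get((w, seed), 0) + 1e-12  # smoothed
    return math.log2(joint * total / (counts[w] * counts[seed]))

def semantic_orientation(word, cooc, counts, total):
    so_pos = sum(pmi(cooc, counts, total, word, s) for s in POS_SEEDS)
    so_neg = sum(pmi(cooc, counts, total, word, s) for s in NEG_SEEDS)
    return so_pos - so_neg

def rate_review(terms, cooc, counts, total):
    """Map the mean orientation of sentiment terms onto a 1-5 star scale."""
    scores = [semantic_orientation(t, cooc, counts, total) for t in terms]
    mean = sum(scores) / max(len(scores), 1)
    return max(1, min(5, round(3 + mean)))  # 3 stars = neutral

counts = {"superb": 20, "awful": 18, "excellent": 40,
          "good": 90, "poor": 30, "bad": 60}
cooc = {("superb", "excellent"): 12, ("superb", "good"): 15,
        ("awful", "poor"): 10, ("awful", "bad"): 14}
print(rate_review(["superb"], cooc, counts, total=10000))  # -> 5
print(rate_review(["awful"], cooc, counts, total=10000))   # -> 1
```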
APA, Harvard, Vancouver, ISO, and other styles
19

Jarmasz, Mario. ""Roget's Thesaurus" as a lexical resource for natural language processing." Thesis, University of Ottawa (Canada), 2003. http://hdl.handle.net/10393/26493.

Full text
Abstract:
This dissertation presents an implementation of an electronic lexical knowledge base that uses the 1987 Penguin edition of Roget's Thesaurus as the source for its lexical material---the first implementation of a computerized Roget's to use an entire current edition. It explains the steps necessary for taking a machine-readable file and transforming it into a tractable system. Roget's organization is studied in detail and contrasted with WordNet's. We show two applications of the computerized Thesaurus: computing semantic similarity between words and phrases, and building lexical chains in a text. The experiments are performed using well-known benchmarks and the results are compared to those of other systems that use Roget's, WordNet and statistical techniques. Roget's has turned out to be an excellent resource for measuring semantic similarity; lexical chains are easily built but more difficult to evaluate. We also explain ways in which Roget's Thesaurus and WordNet can be combined.
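A hedged sketch of the intuition behind thesaurus-based similarity: the deeper the level at which two words share a Roget's grouping, the more similar they are. The toy index below is invented for illustration; the thesis works over the full 1987 Penguin edition with a more refined scoring scheme.

```python
# Toy illustration of thesaurus-path similarity. The hierarchy paths are
# invented stand-ins for Roget's class > section > head > paragraph > group.
LEVELS = ["class", "section", "head", "paragraph", "group"]

# word -> path through the hierarchy, shallowest to deepest (assumed data)
INDEX = {
    "car":  ["matter", "motion", "vehicle", "automobile", "car"],
    "bus":  ["matter", "motion", "vehicle", "automobile", "bus"],
    "idea": ["intellect", "thought", "idea", "notion", "idea"],
}

def similarity(w1, w2):
    """Score 0..len(LEVELS): number of hierarchy levels shared from the top."""
    p1, p2 = INDEX.get(w1), INDEX.get(w2)
    if not p1 or not p2:
        return 0
    shared = 0
    for a, b in zip(p1, p2):
        if a != b:
            break
        shared += 1
    return shared

print(similarity("car", "bus"))   # 4: same automobile paragraph
print(similarity("car", "idea"))  # 0: different top-level classes
```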
APA, Harvard, Vancouver, ISO, and other styles
20

Rogers, Paul Anton Peter. "The baby project : processing character patterns in textual representations of language." Thesis, Bournemouth University, 2000. http://eprints.bournemouth.ac.uk/306/.

Full text
Abstract:
This thesis describes an investigation into a proposed theory of AI. The theory postulates that a machine can be programmed to predict aspects of human behaviour by selecting and processing stored, concrete examples of previously experienced patterns of behaviour. Validity is tested in the domain of natural language. Externalisations that model the resulting theory of NLP entail fuzzy components. Fuzzy formalisms may exhibit inaccuracy and/or over-productivity. A research strategy is developed, designed to investigate this aspect of the theory. The strategy includes two experimental hypotheses designed to test: (1) whether the model can process simple language interaction, and (2) the effect of fuzzy processes on such language interaction. The experimental design requires three implementations, each with progressively greater degrees of fuzziness in their processes. They are respectively named NonfuzzBabe, CorrBabe and FuzzBabe. NonfuzzBabe is used to test the first hypothesis and all three implementations are used to test the second hypothesis. A system description is presented for NonfuzzBabe. Testing the first hypothesis provides results that show NonfuzzBabe is able to process simple language interaction. A system description for CorrBabe and FuzzBabe is presented. Testing the second hypothesis provides results that show a positive correlation between the degree of fuzzy processing and improved simple language performance. FuzzBabe's ability to process more complex language interaction is then investigated and model-intrinsic limitations are found. Research to overcome this problem is designed to illustrate the potential of externalisation of the theory and is conducted less rigorously than the previous parts of this investigation. Augmenting FuzzBabe to include fuzzy evaluation of non-pattern elements of interaction is hypothesised as a possible solution. The term FuzzyBaby was coined for the augmented implementation. Results of a pilot study designed to measure FuzzyBaby's reading comprehension are given. Little research has been conducted that investigates NLP by the fuzzy processing of concrete patterns in language. Consequently, it is proposed that this research contributes to the intellectual disciplines of NLP and AI in general.
APA, Harvard, Vancouver, ISO, and other styles
21

Augustsson, Christopher. "Multipurpose Case-Based Reasoning System, Using Natural Language Processing." Thesis, Linnéuniversitetet, Institutionen för datavetenskap och medieteknik (DM), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-104890.

Full text
Abstract:
Working as a field technician of any sort can often be a challenging task. You frequently find yourself alone with a machine you have limited knowledge about, and the only support you have is the user manuals. As a result, it is not uncommon for companies to aid technicians with a knowledge base, often built around a SharePoint site. Unfortunately, such sites quickly get cluttered with so much information that users are left overwhelmed. Case-based reasoning (CBR), a form of problem-solving technology, uses previous cases to help users solve new problems they encounter, which could benefit the field technician. But for a CBR system to work with a wide variety of machines, the system must be dynamic in nature and handle multiple data types. By developing a prototype focusing on case retrieval, based on .NET Core and MySQL, this report lays the foundation for a highly dynamic CBR system that uses natural language processing to map case attributes during case retrieval. In addition, the system's accuracy is validated using datasets from UCI and Kaggle, and a dataset created explicitly for this report shows the system to be robust.
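The retrieval step of such a CBR system can be illustrated with a short sketch: rank stored cases by weighted similarity to a query case. The attribute names, weights, and the crude token-overlap measure used for free-text attributes are assumptions for illustration; the thesis maps attributes with richer NLP than this.

```python
# Minimal sketch of CBR case retrieval: weighted attribute similarity,
# with Jaccard token overlap standing in for real NLP attribute matching.

def text_sim(a, b):
    """Crude free-text similarity: Jaccard overlap of lowercased tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def case_similarity(query, case, weights):
    score = 0.0
    for attr, w in weights.items():
        if attr in query and attr in case:
            score += w * text_sim(str(query[attr]), str(case[attr]))
    return score

def retrieve(query, case_base, weights, k=3):
    """Return the k most similar past cases, best first."""
    ranked = sorted(case_base,
                    key=lambda c: case_similarity(query, c, weights),
                    reverse=True)
    return ranked[:k]

cases = [
    {"machine": "pump P-200", "symptom": "motor overheats under load",
     "solution": "clean cooling vents, check bearing grease"},
    {"machine": "pump P-200", "symptom": "no pressure at outlet",
     "solution": "inspect impeller for wear"},
]
query = {"machine": "pump P-200", "symptom": "motor gets hot under load"}
print(retrieve(query, cases, weights={"machine": 0.4, "symptom": 0.6}, k=1))
```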
APA, Harvard, Vancouver, ISO, and other styles
22

Keller, Thomas Anderson. "Comparison and Fine-Grained Analysis of Sequence Encoders for Natural Language Processing." Thesis, University of California, San Diego, 2017. http://pqdtopen.proquest.com/#viewpdf?dispub=10599339.

Full text
Abstract:

Most machine learning algorithms require a fixed-length input to be able to perform commonly desired tasks such as classification, clustering, and regression. For natural language processing, the inherently unbounded and recursive nature of the input poses a unique challenge when deriving such fixed-length representations. Although today there is a general consensus on how to generate fixed-length representations of individual words which preserve their meaning, the same cannot be said for sequences of words in sentences, paragraphs, or documents. In this work, we study the encoders commonly used to generate fixed-length representations of natural language sequences, and analyze their effectiveness across a variety of high- and low-level tasks including sentence classification and question answering. Additionally, we propose novel improvements to the existing Skip-Thought and End-to-End Memory Network architectures and study their performance on both the original and auxiliary tasks. Ultimately, we show that the setting in which the encoders are trained, and the corpus used for training, have a greater influence on the final learned representation than the underlying sequence encoders themselves.
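A small illustration of the core problem studied here, turning a variable-length token sequence into one fixed-length vector, is the mean-pooling baseline against which stronger encoders such as Skip-Thought are usually compared. The toy embeddings below are assumptions; real systems load pretrained vectors.

```python
# Mean-pooling baseline: a fixed-length sentence vector as the average of
# word vectors, regardless of how many tokens the sentence contains.
import numpy as np

EMB = {  # toy 4-dimensional word vectors; real systems use pretrained ones
    "the": np.array([0.1, 0.0, 0.2, 0.1]),
    "cat": np.array([0.9, 0.3, 0.1, 0.0]),
    "sat": np.array([0.2, 0.8, 0.0, 0.1]),
}

def encode_mean(tokens, dim=4):
    """Fixed-length sentence vector: average of known word vectors."""
    vecs = [EMB[t] for t in tokens if t in EMB]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

v = encode_mean("the cat sat".split())
print(v.shape)  # (4,) regardless of sentence length
```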

APA, Harvard, Vancouver, ISO, and other styles
23

Meyer, Christopher Henry. "On improving natural language processing through phrase-based and one-to-one syntactic algorithms." Thesis, Manhattan, Kan. : Kansas State University, 2008. http://hdl.handle.net/2097/1096.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

Leroy, Gondy, Hsinchun Chen, and Jesse D. Martinez. "A shallow parser based on closed-class words to capture relations in biomedical text." Elsevier, 2003. http://hdl.handle.net/10150/105844.

Full text
Abstract:
Artificial Intelligence Lab, Department of MIS, University of Arizona
Natural language processing for biomedical text currently focuses mostly on entity and relation extraction. These entities and relations are usually pre-specified entities, e.g., proteins, and pre-specified relations, e.g., inhibit relations. A shallow parser that automatically captures the relations between noun phrases in free text has been developed and evaluated. It uses heuristics and a noun phraser to capture entities of interest in the text. Cascaded finite state automata structure the relations between individual entities. The automata are based on closed-class English words and model generic relations not limited to specific words. The parser also recognizes coordinating conjunctions and captures negation in text, a feature usually ignored by others. Three cancer researchers evaluated 330 relations extracted from 26 abstracts of interest to them. Of these, 296 relations were correctly extracted, resulting in 90% precision and an average of 11 correct relations per abstract.
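The cascaded finite-state idea can be conveyed with a condensed sketch: assume noun phrases are already chunked (the paper uses a noun phraser), and let a small pattern over closed-class and verb tags link them into relations while flagging negation. The pattern below is a simplified stand-in for the paper's cascades, not its actual automata.

```python
# Condensed sketch of finite-state relation capture over pre-chunked text.
# The single regex stands in for the paper's cascaded automata.
import re

# a sentence pre-chunked into NP / closed-class / verb tokens (toy input)
tokens = [("[NP p53]", "NP"), ("does", "AUX"), ("not", "NEG"),
          ("inhibit", "V"), ("[NP apoptosis]", "NP")]

tag_string = " ".join(tag for _, tag in tokens)
pattern = re.compile(r"NP (?:AUX )?(NEG )?V NP")  # NP (aux) (not) verb NP

m = pattern.match(tag_string)
if m:
    nps = [w for w, t in tokens if t == "NP"]
    verb = next(w for w, t in tokens if t == "V")
    negated = m.group(1) is not None
    print(nps[0], ("NOT " if negated else "") + verb, nps[1])
    # -> [NP p53] NOT inhibit [NP apoptosis]
```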
APA, Harvard, Vancouver, ISO, and other styles
25

Sabri, Ayoub Diar. "Leveraging Artificial Intelligence For Sustained Organizational Competitive Advantage : A Study In Natural Language Processing And Dynamic Capabilities." Thesis, KTH, Industriell ekonomi och organisation (Inst.), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-301650.

Full text
Abstract:
Technologies such as Artificial Intelligence (AI) and Machine Learning (ML) are disrupting industries worldwide and are categorized as drivers of a technological revolution. The economic impact is hypothesized to amount to hundreds of billions of US dollars in lost wages, consequently affecting governmental tax revenue streams. Firms that manage to leverage these technologies by developing sustained competitive advantage are ultimately the firms that will prosper. Competitive advantage stems from dynamic capabilities: the organizational and managerial processes in place to withstand external environmental turbulence, such as the technological revolution galvanized by AI. This research analyzes how a tele- and cloud-communication company leverages AI to materialize competitive advantage. The research was conducted in two principal parts. First, an ML model for language-agnostic document retrieval (LaDPR) was developed and its performance evaluated against Facebook's Dense Passage Retrieval (DPR) model. The ML experiments show that the LaDPR model outperforms Facebook's DPR model by over 2x on average on multilingual document retrieval. This performance increase rises to over 4x when English, the language DPR was trained on, is excluded. Second, interviews were conducted with key representatives to study how such technological advancements can be exploited in the organizational pursuit of competitive advantage. Specific capabilities such as automated decision-making, knowledge integration, and platform maturity are the three prominent organizational and managerial processes that advanced AI systems can undergird. The results pinpoint that the process of a high-technology department focused solely on developing such AI systems, packaging them with engineering competence and then transferring ownership internally within the organization, ultimately coalesces into hard-to-imitate dynamic capabilities, materializing competitive advantage.
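For context, the dual-encoder retrieval scheme that DPR, the baseline the thesis compares against, popularised can be sketched as follows: questions and passages are encoded separately and ranked by dot product. The checkpoints named are Facebook's public DPR models; LaDPR itself is not publicly specified here, so this is illustrative only. It assumes the transformers and torch packages.

```python
# Hedged sketch of dual-encoder dense retrieval with the public DPR
# checkpoints: encode question and passages separately, rank by dot product.
import torch
from transformers import (DPRContextEncoder, DPRContextEncoderTokenizer,
                          DPRQuestionEncoder, DPRQuestionEncoderTokenizer)

Q = "facebook/dpr-question_encoder-single-nq-base"
C = "facebook/dpr-ctx_encoder-single-nq-base"
q_tok, q_enc = DPRQuestionEncoderTokenizer.from_pretrained(Q), DPRQuestionEncoder.from_pretrained(Q)
c_tok, c_enc = DPRContextEncoderTokenizer.from_pretrained(C), DPRContextEncoder.from_pretrained(C)

passages = ["Reset the admin password from the gateway console.",
            "SIP trunk configuration requires a static IP."]

with torch.no_grad():
    q_vec = q_enc(**q_tok("how to reset the admin password",
                          return_tensors="pt")).pooler_output
    p_vecs = torch.cat([c_enc(**c_tok(p, return_tensors="pt")).pooler_output
                        for p in passages])

scores = (q_vec @ p_vecs.T).squeeze(0)  # dot-product relevance scores
print(passages[int(scores.argmax())])   # -> the password-reset passage
```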
APA, Harvard, Vancouver, ISO, and other styles
26

Espinosa-Anke, Luis. "Knowledge acquisition in the information age: the interplay between lexicography and natural language processing." Doctoral thesis, Universitat Pompeu Fabra, 2017. http://hdl.handle.net/10803/404985.

Full text
Abstract:
Natural Language Processing (NLP) is the branch of Artificial Intelligence aimed at understanding and generating language as closely as possible to a human's. Today, NLP benefits substantially from large amounts of unannotated corpora, from which it derives state-of-the-art resources for text understanding such as vector representations or knowledge graphs. In addition, NLP also leverages structured and semi-structured information in the form of ontologies, knowledge bases (KBs), encyclopedias or dictionaries. In this dissertation, we present several improvements in NLP tasks such as Definition and Hypernym Extraction, Hypernym Discovery, Taxonomy Learning and KB construction and completion, and in all of them we take advantage of knowledge repositories of various kinds, showing that these are essential enablers in text understanding. Conversely, we use NLP techniques to create, improve or extend existing repositories, and release them along with the associated code for the use of the community.
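One classic technique underlying hypernym extraction is Hearst-style lexico-syntactic patterns over raw text; the dissertation goes well beyond this, but a small sketch conveys the task. The single pattern and the crude head-noun guess below are illustrative assumptions.

```python
# Toy Hearst-pattern sketch: extract (hyponym, hypernym) pairs from a
# "X such as A, B and C" construction.
import re

HEARST = [re.compile(r"([\w ]+?) such as ([\w ,]+)")]

def hypernym_pairs(text):
    pairs = []
    for pat in HEARST:
        for m in pat.finditer(text):
            hyper = m.group(1).strip().split()[-1]  # crude head-noun guess
            for hypo in re.split(r",? and |, ", m.group(2)):
                pairs.append((hypo.strip(), hyper))
    return pairs

print(hypernym_pairs("musical instruments such as violins, guitars and cellos"))
# -> [('violins', 'instruments'), ('guitars', 'instruments'),
#     ('cellos', 'instruments')]
```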
APA, Harvard, Vancouver, ISO, and other styles
27

Antici, Francesco. "Advanced techniques for cross-language annotation projection in legal texts." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/23884/.

Full text
Abstract:
Nowadays, the majority of the services we benefit from are provided online, and their use is regulated by the user's acceptance of the terms of service. All our data are handled in accordance with the clauses of this document, and all our behaviour must comply with it. It would therefore be very useful to find automated techniques for ensuring the fairness of such documents or informing users about possible threats. The focus of this work is to create resources for the development of such tools in languages other than English, which may lack linguistic resources and annotated corpora. The enormous breakthroughs of recent years in Natural Language Processing have made it possible to create such tools through automated and unsupervised processes. One means to achieve this is annotation projection between two parallel corpora, since the difficulty and cost of creating ad hoc resources for every language call for another way of achieving the goal. This work investigates a cross-language annotation projection technique based on sentence embeddings and similarity metrics to find matches between sentences. Several combinations of methods and algorithms are compared, among them monolingual and multilingual neural embedding models. The experiments are conducted on two datasets, where the reference language is always English and the projections are evaluated on Italian, German and Polish. The results provide a robust and reliable technique for the task and a good starting point for building multilingual tools.
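The embedding-plus-similarity matching at the heart of the projection technique can be sketched as follows: encode both sides with a multilingual sentence encoder, align each annotated English sentence with its most similar target-side sentence, and copy the label across. The model choice and the toy clause labels are assumptions for illustration; it requires the sentence-transformers package.

```python
# Hedged sketch of cross-language annotation projection: match annotated
# English sentences to target-language sentences by embedding similarity,
# then transfer the labels onto the matched sentences.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

english = [("You waive any right to a class action.", "unfair"),
           ("We may update these terms at any time.", "fair")]
italian = ["Potremmo aggiornare questi termini in qualsiasi momento.",
           "L'utente rinuncia a qualsiasi azione collettiva."]

en_vecs = model.encode([s for s, _ in english], normalize_embeddings=True)
it_vecs = model.encode(italian, normalize_embeddings=True)

sims = en_vecs @ it_vecs.T  # cosine similarities on unit vectors
for (sent, label), row in zip(english, sims):
    match = italian[int(np.argmax(row))]
    print(f"{label!r} projected onto: {match}")
```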
APA, Harvard, Vancouver, ISO, and other styles
28

Atkinson, Elizabeth A. M. "Artificial intelligence and the operation of merchant ships : aspects of natural language processing which relate to marine systems." Thesis, Cardiff University, 1990. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.369725.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

Cotra, Aditya Kousik. "Trend Analysis on Artificial Intelligence Patents." University of Cincinnati / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1617104823936441.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Panesar, Kulvinder. "Conversational artificial intelligence - demystifying statistical vs linguistic NLP solutions." Universitat Politécnica de Valéncia, 2020. http://hdl.handle.net/10454/18121.

Full text
Abstract:
This paper aims to demystify the hype and attention surrounding chatbots and their association with conversational artificial intelligence. Both are slowly emerging as a real presence in our lives thanks to impressive technological developments in machine learning, deep learning and natural language understanding solutions. However, our question is what is under the hood, and how far and to what extent chatbot/conversational artificial intelligence solutions can work. Natural language is the most easily understood knowledge representation for people, but certainly not the best for computers, because of its inherently ambiguous, complex and dynamic nature. We critique the knowledge representation of heavily statistical chatbot solutions against linguistic alternatives. In order to react intelligently to the user, natural language solutions must critically consider other factors such as context, memory, intelligent understanding, previous experience, and personalized knowledge of the user. We delve into the spectrum of conversational interfaces and focus on a strong artificial intelligence concept. This is explored via a text-based conversational software agent with a deep strategic role: to hold a conversation, enable the mechanisms needed to plan, decide what to do next, and manage the dialogue to achieve a goal. To demonstrate this, a deeply linguistically aware and knowledge-aware text-based conversational agent (LING-CSA) is presented as a proof of concept of a non-statistical conversational AI solution.
APA, Harvard, Vancouver, ISO, and other styles
31

Holzer, Corey T. "The application of natural language processing to open source intelligence for ontology development in the advanced persistent threat domain." Thesis, Purdue University, 2017. http://pqdtopen.proquest.com/#viewpdf?dispub=10249704.

Full text
Abstract:

Over the past decade, the Advanced Persistent Threat (APT) has risen to the forefront of cybersecurity threats. APTs are a major contributor to the billions of dollars lost by corporations around the world annually. The threat is significant enough that the Navy Cyber Power 2020 plan identified them as a “must mitigate” threat in order to ensure the security of its warfighting network.

Reports, white papers, and various other open source materials offer a plethora of information to cybersecurity professionals regarding these APT attacks and the organizations behind them, but mining and correlating information from these various sources needs the support of a standardized language and a common understanding of terms that comes from an accepted APT ontology.

This paper and its related research apply Natural Language Processing to Open Source Intelligence in order to build an open-source ontology in the APT domain, with the goal of creating a dictionary and taxonomy for this complex domain.

APA, Harvard, Vancouver, ISO, and other styles
32

Rolnic, Sergiu Gabriel. "Anonimizzazione di documenti mediante Named Entity Recognition e Neural Language Model." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2022.

Find full text
Abstract:
Transformers have revolutionized machine interpretation of language. The ability to train a neural language model on entire vocabularies and encyclopedias, and then transfer the acquired knowledge to specific tasks, has made it possible to reach the state of the art in almost every application domain of Natural Language Processing. In this context, an application for document anonymization was developed, capable of identifying specific entities that represent personal data.
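A minimal sketch of the kind of transformer-based anonymization described here might look as follows, assuming the Hugging Face transformers pipeline; the checkpoint name and masking scheme are illustrative assumptions rather than the application's actual design.

    # Illustrative sketch: mask person entities found by a transformer NER model.
    from transformers import pipeline

    # Assumed checkpoint; any token-classification model with PER labels works.
    ner = pipeline("token-classification",
                   model="Davlan/bert-base-multilingual-cased-ner-hrl",
                   aggregation_strategy="simple")

    def anonymize(text: str) -> str:
        """Replace spans tagged as persons with a placeholder, right to left
        so earlier character offsets stay valid."""
        entities = [e for e in ner(text) if e["entity_group"] == "PER"]
        for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
            text = text[:ent["start"]] + "[REDACTED]" + text[ent["end"]:]
        return text

    print(anonymize("Mario Rossi ha firmato il contratto a Bologna."))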
APA, Harvard, Vancouver, ISO, and other styles
33

Goh, Ong Sing. "A framework and evaluation of conversation agents." Murdoch University, 2008. http://wwwlib.murdoch.edu.au/adt/browse/view/adt-MU20081020.134601.

Full text
Abstract:
This project details the development of a novel and practical framework for the development of conversation agents (CAs), or conversation robots. CAs are software programs which can be used to provide a natural interface between humans and computers. In this study, ‘conversation’ refers to real-time dialogue exchange between human and machine, which may range from web chatting to “on-the-go” conversation through mobile devices. In essence, the project proposes a “smart and effective” communication technology where an autonomous agent is able to carry out simulated human conversation via multiple channels. The CA developed in this project is termed “Artificial Intelligence Natural-language Identity” (AINI), and AINI is used to illustrate the implementation and testing carried out in this project. Up to now, most CAs have been developed with the short-term objective of serving as tools to convince users that they are talking with real humans, as in the case of the Turing Test. Traditional designs have mainly relied on ad-hoc approaches and hand-crafted domain knowledge. Such approaches make it difficult for a fully integrated system to be developed and modified for other domain applications and tasks. The proposed framework in this thesis addresses such limitations. Overcoming the weaknesses of previous systems has been the key challenge in this study. The research in this study has provided a better understanding of the system requirements and the development of a systematic approach for the construction of intelligent CAs based on agent architecture using a modular N-tiered approach. This study demonstrates an effective implementation and exploration of the new paradigm of Computer Mediated Conversation (CMC) through CAs. The most significant aspect of the proposed framework is its ability to re-use and encapsulate expertise such as domain knowledge, natural language query and human-computer interfaces through plug-in components. As a result, the developer does not need to change the framework implementation for different applications. The proposed system provides interoperability among heterogeneous systems, and it has the flexibility to be adapted for other languages, interface designs and domain applications. A modular design of knowledge representation facilitates the creation of the CA knowledge bases. This enables easier integration of open-domain and domain-specific knowledge, with the ability to provide answers for broader queries. In order to build the knowledge base for the CAs, this study has also proposed a mechanism to gather information from commonsense collaborative knowledge and online web documents. The proposed Automated Knowledge Extraction Agent (AKEA) has been used for the extraction of unstructured knowledge from the Web. It is also important to establish the trustworthiness of the sources of information, so this thesis introduces a Web Knowledge Trust Model (WKTM) for that purpose. In order to assess the proposed framework, relevant tools and application modules have been developed, and an evaluation of their effectiveness has been carried out to validate the performance and accuracy of the system. Both laboratory and public experiments with online users in real time have been carried out. The results have shown that the proposed system is effective. In addition, it has been demonstrated that the CA could be implemented on the Web, mobile services and Instant Messaging (IM).
In the real-time human-machine conversation experiment, it was shown that AINI is able to carry out conversations with human users by providing spontaneous interaction in an unconstrained setting. The study observed that AINI and humans share common properties in linguistic features and paralinguistic cues. These human-computer interactions have been analysed and have contributed to the understanding of how users interact with CAs. Such knowledge is also useful for the development of conversation systems utilising the commonalities found in these interactions. While AINI was found to have difficulties in responding to some forms of paralinguistic cues, this points to directions for further work to improve CA performance in the future.
APA, Harvard, Vancouver, ISO, and other styles
34

Avenberg, Anna. "Automatic language identification of short texts." Thesis, Uppsala universitet, Avdelningen för beräkningsvetenskap, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-421032.

Full text
Abstract:
The world is growing more connected through the use of online communication, exposing software and humans to all the world's languages. While devices are able to understand and share raw data between themselves and with humans, the information itself is not expressed in a monolithic format. This causes issues both in human-to-computer interaction and in human-to-human communication. Automatic language identification (LID) is a field within artificial intelligence and natural language processing that strives to solve a part of these issues by identifying languages from text, sign language and speech. One of the challenges is to identify the short pieces of text that can be found online, such as messages, comments and posts on social media, due to the small amount of information they carry. The goal of this thesis has been to build a machine learning model that can identify the language of these short pieces of text. A long short-term memory (LSTM) machine learning model was built and benchmarked against Facebook's fastText model. The results show that the LSTM model reached an accuracy of around 95%, while the fastText model used for comparison reached an accuracy of 97%. The LSTM model struggled more with texts shorter than 50 characters than with longer texts. The classification performance of the LSTM model was also relatively poor in cases where languages were similar, such as Croatian and Serbian. Both the LSTM model and the fastText model reached accuracies above 94%, which can be considered high, depending on how it is evaluated. There are, however, many improvements and possible directions for future work to be considered: looking further into texts shorter than 50 characters, evaluating the model's softmax output vector values, and handling similar languages.
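For reference, the fastText side of such a comparison can be run off the shelf. The sketch below assumes the fasttext Python package and its publicly released lid.176 identification model; the file path and example inputs are illustrative.

    # Illustrative sketch: off-the-shelf language identification with fastText.
    # Assumes the pretrained lid.176.bin model has been downloaded from
    # https://fasttext.cc/docs/en/language-identification.html
    import fasttext

    model = fasttext.load_model("lid.176.bin")

    short_texts = [
        "hej hur mår du",   # Swedish
        "kako si danas",    # Croatian/Serbian -- hard to separate
    ]
    labels, probs = model.predict(short_texts, k=1)
    for text, label, prob in zip(short_texts, labels, probs):
        print(f"{text!r} -> {label[0].replace('__label__', '')} ({prob[0]:.2f})")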
APA, Harvard, Vancouver, ISO, and other styles
35

Suddrey, Gavin. "Instructing and training robots through a natural language dialogue." Thesis, Queensland University of Technology, 2022. https://eprints.qut.edu.au/229145/1/Gavin_Suddrey_Thesis.pdf.

Full text
Abstract:
This thesis focused on the problem of allowing non-expert users, such as the elderly, to teach robots how to perform everyday tasks through dialogue. In particular, this thesis addressed issues relating to how task knowledge, extracted from spoken instructions, should be encoded by a robot to allow the robot to learn complex tasks; how to generalise knowledge provided by human instructors such that the robot can perform the same task across different scenarios; as well as how to handle information gaps present in the explanation of tasks provided by novice users.
APA, Harvard, Vancouver, ISO, and other styles
36

Oldham, Joseph Dowell. "Generating documents by means of computational registers." Lexington, Ky. : [University of Kentucky Libraries], 2000. http://lib.uky.edu/ETD/ukycosc2000d00006/oldham.pdf.

Full text
Abstract:
Thesis (Ph. D.)--University of Kentucky, 2000.
Title from document title page. Document formatted into pages; contains ix, 169 p. : ill. Includes abstract. Includes bibliographical references (p. 160-167).
APA, Harvard, Vancouver, ISO, and other styles
37

Crocker, Matthew Walter. "A principle-based system for natural language analysis and translation." Thesis, University of British Columbia, 1988. http://hdl.handle.net/2429/27863.

Full text
Abstract:
Traditional views of grammatical theory hold that languages are characterised by sets of constructions. This approach entails the enumeration of all possible constructions for each language being described. Current theories of transformational generative grammar have established an alternative position. Specifically, Chomsky's Government-Binding theory proposes a system of principles which are common to human language. Such a theory is referred to as a "Universal Grammar" (UG). Associated with the principles of grammar are parameters of variation which account for the diversity of human languages. The grammar for a particular language is known as a "Core Grammar", and is characterised by an appropriately parametrised instance of UG. Despite these advances in linguistic theory, construction-based approaches have remained the status quo within the field of natural language processing. This thesis investigates the possibility of developing a principle-based system which reflects the modular nature of the linguistic theory. That is, rather than stipulating the possible constructions of a language, a system is developed which uses the principles of grammar and language-specific parameters to parse language. Specifically, a system is presented which performs syntactic analysis and translation for a subset of English and German. The cross-linguistic nature of the theory is reflected by the system, which can be considered a procedural model of UG.
Faculty of Science, Department of Computer Science, Graduate.
APA, Harvard, Vancouver, ISO, and other styles
38

Idris, Muhammad. "Real-time Business Intelligence through Compact and Efficient Query Processing Under Updates." Doctoral thesis, Université Libre de Bruxelles, 2019. https://dipot.ulb.ac.be/dspace/bitstream/2013/284705/5/contratMI.pdf.

Full text
Abstract:
Responsive analytics are rapidly taking over the traditional data analytics dominated by post-fact approaches in data warehousing. Recent advancements in analytics demand placing analytical engines at the forefront of the system to react to updates occurring at high speed and to detect patterns, trends, and anomalies. These kinds of solutions find applications in Financial Systems, Industrial Control Systems, Business Intelligence and on-line Machine Learning, among others. These applications are usually associated with Big Data and require the ability to react to constantly changing data in order to obtain timely insights and take proactive measures. Generally, these systems specify the analytical results or their basic elements in a query language, where the main task then is to maintain query results efficiently under frequent updates. The task of reacting to updates and analyzing changing data has been addressed in two ways in the literature: traditional business intelligence (BI) solutions focus on historical data analysis where the data is refreshed periodically and in batches, while stream processing solutions process streams of data from transient sources as flows of data items. Both kinds of systems share the niche of reacting to updates (known as dynamic evaluation); however, they differ in architecture, query languages, and processing mechanisms. In this thesis, we investigate the possibility of a reactive and unified framework to model queries that appear in both kinds of systems. In traditional BI solutions, evaluating queries under updates has been studied under the umbrella of incremental evaluation of queries, which is based on the relational incremental view maintenance model and mostly focuses on queries that feature equi-joins. Streaming systems, in contrast, generally follow automaton-based models to evaluate queries under updates, and they generally process queries that mostly feature comparisons of temporal attributes (e.g. timestamp attributes) along with comparisons of non-temporal attributes over streams of bounded sizes. Temporal comparisons constitute inequality constraints, while non-temporal comparisons can be either equality or inequality constraints; hence these systems mostly process inequality joins. As a starting point for our research, we postulate the thesis that queries in streaming systems can also be evaluated efficiently based on the paradigm of incremental evaluation, just like in BI systems, in a main-memory model. The efficiency of such a model is measured in terms of runtime memory footprint and update processing cost. The existing approaches to dynamic evaluation in both kinds of systems present a trade-off between memory footprint and update processing cost. More specifically, systems that avoid materialization of query (sub)results incur high update latency, and systems that materialize (sub)results incur a high memory footprint. We are interested in investigating the possibility of building a model that can address this trade-off. In particular, we overcome this trade-off by investigating a practical dynamic evaluation algorithm for queries that appear in both kinds of systems, and we present a main-memory data representation that allows query (sub)results to be enumerated without materialization and can be maintained efficiently under updates.
We call this representation the Dynamic Constant Delay Linear Representation (DCLR). We devise DCLRs with the following properties: 1) they allow, without materialization, enumeration of query results with bounded delay (and with constant delay for a sub-class of queries); 2) they allow tuple lookup in query results with logarithmic delay (and with constant delay for conjunctive queries with equi-joins only); 3) they take space linear in the size of the database; 4) they can be maintained efficiently under updates. We first study DCLRs with the above-described properties for the class of acyclic conjunctive queries featuring equi-joins with projections, and we present a dynamic evaluation algorithm called the Dynamic Yannakakis (DYN) algorithm. Then, we present the generalization of the DYN algorithm to the class of acyclic queries featuring multi-way Theta-joins with projections, and call it Generalized DYN (GDYN). The DCLRs with the above properties for acyclic conjunctive queries, and the working of DYN and GDYN over DCLRs, are based on a particular variant of join trees, called Generalized Join Trees (GJTs), that guarantee the above-described properties of DCLRs. We define GJTs and present algorithms to test a conjunctive query featuring Theta-joins for acyclicity and to generate GJTs for such queries. We extend the classical GYO algorithm from testing a conjunctive query with equalities for acyclicity to testing a conjunctive query featuring multi-way Theta-joins with projections for acyclicity, and we further extend the GYO algorithm to generate GJTs for queries that are acyclic. GDYN is hence a unified framework based on DCLRs that enables processing of queries that appear in streaming systems as well as in BI systems in a unified main-memory model and addresses the space-time trade-off. We instantiate GDYN for the particular case where all Theta-joins involve only equalities and inequalities and call this instantiation IEDYN. We implement DYN and IEDYN as query compilers that generate executable programs in the Scala programming language and provide all the necessary data structures and their maintenance and enumeration methods in a continuous stream processing model. We evaluate DYN and IEDYN against state-of-the-art BI and streaming systems on both industrial and synthetically generated benchmarks. We show that DYN and IEDYN outperform existing systems by over an order of magnitude in both memory footprint and update processing time.
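A toy sketch of the incremental-evaluation paradigm the thesis builds on is shown below: a join size is kept current in constant time per single-tuple insert instead of being recomputed from scratch. The relations and update protocol are invented for illustration and are unrelated to DYN's actual data structures.

    # Toy sketch of incremental query maintenance: keep |R join S on k| current
    # under single-tuple inserts, in O(1) per update instead of recomputing.
    from collections import defaultdict

    r_count = defaultdict(int)  # multiplicity of each key in R
    s_count = defaultdict(int)  # multiplicity of each key in S
    join_size = 0

    def insert_r(key):
        global join_size
        join_size += s_count[key]  # new R-tuple pairs with every matching S-tuple
        r_count[key] += 1

    def insert_s(key):
        global join_size
        join_size += r_count[key]
        s_count[key] += 1

    for k in [1, 2, 1]:
        insert_r(k)
    for k in [1, 1, 3]:
        insert_s(k)
    print(join_size)  # 4: two R-tuples with key 1, each matching two S-tuples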
Doctorate in Engineering Sciences and Technology
APA, Harvard, Vancouver, ISO, and other styles
39

Minhas, Saliha Z. "A corpus driven computational intelligence framework for deception detection in financial text." Thesis, University of Stirling, 2016. http://hdl.handle.net/1893/25345.

Full text
Abstract:
Financial fraud rampages onwards seemingly uncontained. The annual cost of fraud in the UK is estimated to be as high as £193bn a year [1]. From a data science perspective, and hitherto less explored, this thesis demonstrates how the use of linguistic features to drive data mining algorithms can aid in unravelling fraud. To this end, the spotlight is turned on Financial Statement Fraud (FSF), known to be the costliest type of fraud [2]. A new corpus of 6.3 million words is composed of 102 annual reports/10-Ks (narrative sections) from firms formally indicted for FSF, juxtaposed with 306 non-fraud firms of similar size and industrial grouping. Differently from other similar studies, this thesis uniquely takes a wide-angled view and extracts a range of features of different categories from the corpus. These linguistic correlates of deception are uncovered using a variety of techniques and tools. Corpus linguistics methodology is applied to extract keywords and to examine linguistic structure. N-grams are extracted to draw out collocations. Readability measurement in financial text is advanced through the extraction of new indices that probe the text at a deeper level. Cognitive and perceptual processes are also picked out. Tone, intention and liquidity are gauged using customised word lists. Linguistic ratios are derived from grammatical constructs and word categories. An attempt is also made to determine ‘what’ was said as opposed to ‘how’. Further, a new module is developed to condense synonyms into concepts. Lastly, frequency counts from keywords unearthed from a previous content analysis study on financial narrative are also used. These features are then used to drive machine learning based classification and clustering algorithms to determine if they aid in discriminating a fraud firm from a non-fraud firm. The results derived from the battery of models built typically exceed a classification accuracy of 70%. The above process is amalgamated into a framework. The process outlined, driven by empirical data, demonstrates in a practical way how linguistic analysis could aid in fraud detection and constitutes a unique contribution to deception detection studies.
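As a generic stand-in for the feature-driven classification step, the sketch below trains a TF-IDF classifier on toy narratives; the documents and labels are invented, and TF-IDF n-grams merely approximate the much richer feature set extracted in the thesis.

    # Illustrative sketch of linguistic-feature-driven classification for
    # deception detection; TF-IDF stands in for the thesis's richer feature set.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Invented toy narratives; real inputs would be 10-K narrative sections.
    docs = ["revenues grew strongly on robust demand",
            "certain adjustments were made to previously reported figures",
            "cash flow remained healthy across all segments",
            "management revised estimates following an internal review"]
    labels = [0, 1, 0, 1]  # 0 = non-fraud, 1 = fraud (toy)

    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
    clf.fit(docs, labels)
    print(clf.predict(["figures were restated after review"]))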
APA, Harvard, Vancouver, ISO, and other styles
40

Aljadri, Sinan. "Chatbot : A qualitative study of users' experience of Chatbots." Thesis, Linnéuniversitetet, Institutionen för datavetenskap och medieteknik (DM), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-105434.

Full text
Abstract:
The aim of the present study has been to examine users' experience of chatbots from a business perspective and a consumer perspective. The study has also focused on highlighting the limitations a chatbot can have and possible improvements for future development. The study is based on a qualitative research method with semi-structured interviews that have been analyzed by means of a thematic analysis. The interview material has been analyzed in the light of previous research and theoretical perspectives such as Artificial Intelligence (AI) and Natural Language Processing (NLP). The results of the study show that the experience of chatbots can differ between businesses that offer them, which are more positive, and consumers who use them as customer service. Limitations and suggestions for improvements around chatbots are also a consistent result of the study.
APA, Harvard, Vancouver, ISO, and other styles
41

Janevski, Angel. "UniversityIE: Information Extraction From University Web Pages." UKnowledge, 2000. http://uknowledge.uky.edu/gradschool_theses/217.

Full text
Abstract:
The amount of information available on the web is growing constantly. As a result, the problem of retrieving any desired information is getting more difficult by the day. To alleviate this problem, several techniques are currently being used, both for locating pages of interest and for extracting meaningful information from the retrieved pages. Information extraction (IE) is one such technology that is used for summarizing unrestricted natural language text into a structured set of facts. IE is already being applied within several domains such as news transcripts, insurance information, and weather reports. Various approaches to IE have been taken and a number of significant results have been reported. In this thesis, we describe the application of IE techniques to the domain of university web pages. This domain is broader than previously evaluated domains and has a variety of idiosyncratic problems to address. We present an analysis of the domain of university web pages and the consequences of having them input to IE systems. We then present UniversityIE, a system that can search a web site, extract relevant pages, and process them for information such as admission requirements or general information. The UniversityIE system, developed as part of this research, contributes three IE methods and a web-crawling heuristic that worked relatively well and predictably over a test set of university web sites. We designed UniversityIE as a generic framework for plugging in and executing IE methods over pages acquired from the web. We also integrated in the system a generic web crawler (built at the University of Kentucky), ported to Java, and integrated an external word lexicon (WordNet) and a syntax parser (Link Grammar Parser).
APA, Harvard, Vancouver, ISO, and other styles
42

Mahendru, Aroma. "Role of Premises in Visual Question Answering." Thesis, Virginia Tech, 2017. http://hdl.handle.net/10919/78030.

Full text
Abstract:
In this work, we make a simple but important observation: questions about images often contain premises -- objects and relationships implied by the question -- and reasoning about premises can help Visual Question Answering (VQA) models respond more intelligently to irrelevant or previously unseen questions. When presented with a question that is irrelevant to an image, state-of-the-art VQA models will still answer based purely on learned language biases, resulting in nonsensical or even misleading answers. We note that a visual question is irrelevant to an image if at least one of its premises is false (i.e. not depicted in the image). We leverage this observation to construct a dataset for Question Relevance Prediction and Explanation (QRPE) by searching for false premises. We train novel irrelevant question detection models and show that models that reason about premises consistently outperform models that do not. We also find that forcing standard VQA models to reason about premises during training can lead to improvements on tasks requiring compositional reasoning.
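The premise observation can be made concrete with a toy relevance check: a question is judged irrelevant if any object it presupposes is missing from the image's detections. The functions below are illustrative placeholders, not the QRPE models.

    # Toy sketch of premise-based relevance: a question is irrelevant to an
    # image if any object it presupposes is absent from the detections.
    def premises(question_objects):  # stand-in for parsing premises from a question
        return set(question_objects)

    def relevant(question_objects, detected_objects):
        return premises(question_objects) <= set(detected_objects)

    detections = {"dog", "frisbee", "grass"}
    print(relevant({"dog", "frisbee"}, detections))  # True: all premises hold
    print(relevant({"cat"}, detections))             # False: premise 'cat' is false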
Master of Science
APA, Harvard, Vancouver, ISO, and other styles
43

Marquard, Stephen. "Improving searchability of automatically transcribed lectures through dynamic language modelling." Thesis, University of Cape Town, 2012. http://pubs.cs.uct.ac.za/archive/00000846/.

Full text
Abstract:
Recording university lectures through lecture capture systems is increasingly common. However, a single continuous audio recording is often unhelpful for users, who may wish to navigate quickly to a particular part of a lecture, or locate a specific lecture within a set of recordings. A transcript of the recording can enable faster navigation and searching. Automatic speech recognition (ASR) technologies may be used to create automated transcripts, to avoid the significant time and cost involved in manual transcription. Low accuracy of ASR-generated transcripts may however limit their usefulness. In particular, ASR systems optimized for general speech recognition may not recognize the many technical or discipline-specific words occurring in university lectures. To improve the usefulness of ASR transcripts for the purposes of information retrieval (search) and navigating within recordings, the lexicon and language model used by the ASR engine may be dynamically adapted for the topic of each lecture. A prototype is presented which uses the English Wikipedia as a semantically dense, large language corpus to generate a custom lexicon and language model for each lecture from a small set of keywords. Two strategies for extracting a topic-specific subset of Wikipedia articles are investigated: a naïve crawler which follows all article links from a set of seed articles produced by a Wikipedia search from the initial keywords, and a refinement which follows only links to articles sufficiently similar to the parent article. Pair-wise article similarity is computed from a pre-computed vector space model of Wikipedia article term scores generated using latent semantic indexing. The CMU Sphinx4 ASR engine is used to generate transcripts from thirteen recorded lectures from Open Yale Courses, using the English HUB4 language model as a reference and the two topic-specific language models generated for each lecture from Wikipedia. Three standard metrics – Perplexity, Word Error Rate and Word Correct Rate – are used to evaluate the extent to which the adapted language models improve the searchability of the resulting transcripts, and in particular improve the recognition of specialist words. Ranked Word Correct Rate is proposed as a new metric better aligned with the goals of improving transcript searchability and specialist word recognition. Analysis of recognition performance shows that the language models derived using the similarity-based Wikipedia crawler outperform models created using the naïve crawler, and that transcripts using similarity-based language models have better perplexity and Ranked Word Correct Rate scores than those created using the HUB4 language model, but worse Word Error Rates. It is concluded that English Wikipedia may successfully be used as a language resource for unsupervised topic adaptation of language models to improve recognition performance for better searchability of lecture recording transcripts, although possibly at the expense of other attributes such as readability.
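Below is a small sketch of the similarity test the refined crawler relies on, with truncated SVD over TF-IDF standing in for the pre-computed LSI vectors; the article texts and the link-following threshold are invented for illustration.

    # Illustrative sketch of the similarity test the refined crawler relies on:
    # LSI (truncated SVD over TF-IDF) vectors compared by cosine similarity.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD
    from sklearn.metrics.pairwise import cosine_similarity

    articles = {  # toy stand-ins for Wikipedia article texts
        "Astrophysics": "stars galaxies stellar evolution radiation",
        "Black hole":   "gravity stars collapse event horizon radiation",
        "Baking":       "flour oven dough yeast bread",
    }
    tfidf = TfidfVectorizer().fit_transform(articles.values())
    lsi = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)

    parent, child = lsi[0:1], lsi[1:2]  # Astrophysics -> Black hole link
    sim = cosine_similarity(parent, child)[0, 0]
    print(f"follow link? {sim:.2f} > 0.5 -> {sim > 0.5}")  # threshold is illustrative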
APA, Harvard, Vancouver, ISO, and other styles
44

Grefenstette, Edward Thomas. "Category-theoretic quantitative compositional distributional models of natural language semantics." Thesis, University of Oxford, 2013. http://ora.ox.ac.uk/objects/uuid:d7f9433b-24c0-4fb5-925b-d8b3744b7012.

Full text
Abstract:
This thesis is about the problem of compositionality in distributional semantics. Distributional semantics presupposes that the meanings of words are a function of their occurrences in textual contexts. It models words as distributions over these contexts and represents them as vectors in high dimensional spaces. The problem of compositionality for such models concerns itself with how to produce distributional representations for larger units of text (such as a verb and its arguments) by composing the distributional representations of smaller units of text (such as individual words). This thesis focuses on a particular approach to this compositionality problem, namely using the categorical framework developed by Coecke, Sadrzadeh, and Clark, which combines syntactic analysis formalisms with distributional semantic representations of meaning to produce syntactically motivated composition operations. This thesis shows how this approach can be theoretically extended and practically implemented to produce concrete compositional distributional models of natural language semantics. It furthermore demonstrates that such models can perform on par with, or better than, other competing approaches in the field of natural language processing. There are three principal contributions to computational linguistics in this thesis. The first is to extend the DisCoCat framework on the syntactic front and semantic front, incorporating a number of syntactic analysis formalisms and providing learning procedures allowing for the generation of concrete compositional distributional models. The second contribution is to evaluate the models developed from the procedures presented here, showing that they outperform other compositional distributional models present in the literature. The third contribution is to show how using category theory to solve linguistic problems forms a sound basis for research, illustrated by examples of work on this topic, that also suggest directions for future research.
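A toy numerical instance of this style of syntax-driven composition: a transitive verb is represented as a matrix built from subject/object co-occurrence outer products, and a sentence meaning is the verb matrix pointwise-multiplied with the subject-object outer product. The vectors are invented, and this is one simplified model from the DisCoCat family, not the thesis's learning procedures.

    # Toy sketch of categorical composition in the relational DisCoCat style:
    # a transitive verb is a matrix of subject (x) object co-occurrences, and
    # a sentence meaning is that matrix pointwise-multiplied with subj (x) obj.
    import numpy as np

    # Invented 3-d noun vectors (a real model would learn these from a corpus).
    dog, cat, ball = (np.array(v, float) for v in
                      ([0.9, 0.1, 0.0], [0.8, 0.2, 0.1], [0.0, 0.3, 0.9]))

    # Verb matrix: sum of subject (x) object outer products seen with "chases".
    chases = np.outer(dog, cat) + np.outer(cat, ball)

    def sentence(subj, verb, obj):
        return verb * np.outer(subj, obj)  # pointwise product, a matrix "meaning"

    s1 = sentence(dog, chases, cat)
    s2 = sentence(cat, chases, ball)
    cos = (s1 * s2).sum() / (np.linalg.norm(s1) * np.linalg.norm(s2))
    print(f"similarity('dog chases cat', 'cat chases ball') = {cos:.2f}")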
APA, Harvard, Vancouver, ISO, and other styles
45

Goh, Ong Sing. "A framework and evaluation of conversation agents." Thesis, Goh, Ong Sing (2008) A framework and evaluation of conversation agents. PhD thesis, Murdoch University, 2008. https://researchrepository.murdoch.edu.au/id/eprint/752/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Goh, Ong Sing. "A framework and evaluation of conversation agents." Goh, Ong Sing (2008) A framework and evaluation of conversation agents. PhD thesis, Murdoch University, 2008. http://researchrepository.murdoch.edu.au/752/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Mills, Michael Thomas. "Natural Language Document and Event Association Using Stochastic Petri Net Modeling." Wright State University / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=wright1369408524.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Armstrong, Anna-Marie. "Unconscious processing at the subjective threshold : semantic comprehension?" Thesis, University of Sussex, 2014. http://sro.sussex.ac.uk/id/eprint/51557/.

Full text
Abstract:
Our thoughts and behaviours can sometimes be influenced by stimuli that we are not consciously aware of having seen. For example, the presentation of a word that is blocked from entering conscious visual perception through masking can subsequently influence the cognitive processing of a further target word. However, the idea that unconscious cognition is sophisticated enough to process the semantic meaning of subliminal stimuli is controversial. This thesis attempts to explore the extent of subliminal priming. Empirical research centering on subjective methods of measuring conscious knowledge is presented in a series of three articles. The first article investigates the subliminal priming of negation. A series of experiments demonstrates that unconscious processing can accurately discriminate between two nouns beyond chance performance when subliminally instructed to either pick or not pick a given noun. This article demonstrates not only semantic processing of the instructional word, but also unconscious cognitive control by following a two-word subliminal instruction to not choose the primed noun. The second article investigates subliminal priming of active versus passive verb voice by presenting a prime sentence denoting one of two characters as either active or passive and asking which of two pictorial representations best matches the prime. The series of experiments demonstrates that overall, participants were able to identify the correct image for both active and passive conditions beyond chance expectations. This article suggests that individuals are able to process the meaning of word combinations that they are not aware of seeing. The third article attempts to determine whether subliminal processing is sophisticated enough to allow for the activation of specific anxieties relating to relationships. Whilst the findings reveal a small subliminal priming effect on generalised anxiety, the evidence regarding the subliminal priming of very specific anxieties is insensitive. The unconscious is shown in these experiments to be more powerful than previously supposed in terms of the fine grained processing of the semantics of word combinations, though not yet in terms of the fine grained resolution of emotional priming.
APA, Harvard, Vancouver, ISO, and other styles
49

De, Scalzi Marika. "An embodied approach to language comprehension in probable Alzheimer's Disease : could perceptuo-motor processing be a key to better understanding?" Thesis, University of Sussex, 2013. http://sro.sussex.ac.uk/id/eprint/47190/.

Full text
Abstract:
One of the central tenets of the embodied theory of language comprehension is that the process of understanding prompts the same perceptuo-motor activity involved in actual perception and action. This activity is a component of comprehension that is not memory–dependent and is hypothesized to be intact in Alzheimer's Disease (AD). Each article in this thesis is aimed at answering the question whether individuals with probable AD, healthy older adults and younger adults show differences in their performance on tests where perceptual and motoric priming take place during language comprehension. The second question each article asks is whether language comprehension in AD can be facilitated by the specific use of this perceptual and motoric priming. Article I examines whether the way individuals with pAD represent verbs spatially matches the way healthy older and younger adults do, and how stable these representations are. It also explores in what way spatial representations may relate to verb comprehension, more specifically, whether representations matching the norms translate into a better quality of verb comprehension. Article II tests the interaction between the verbs' spatial representations taking place during comprehension and perceptual cues - compatible and incompatible to the representations - in order to investigate whether individuals with pAD show differences in susceptibility to perceptual cues, compared to healthy older and younger participants. The second aim of this article is to explore in what way performance on a word-picture verification task can be affected, with reference to the fact that in previous studies on young participants, both priming and interference have resulted from the interaction of linguistic and perceptual processing. Article III explores the Action Compatibility Effect (ACE) (Glenberg & Kaschak, 2002) with the aim of finding out whether the ACE exists for volunteers with pAD and whether it can facilitate language comprehension. The order of presentation of language and movement is manipulated to establish whether there is a reciprocal relationship between them. This information could be crucial in view of possible applications to individuals with pAD. These articles test, for the first time, the effects of the manipulation of the perceptuo-motor component during language comprehension in individuals with pAD; they are intended as a methodological exploration contributing to a better understanding of the potential of embodiment principles to support language comprehension changes associated with pAD. Embodiment effects need to be studied further with a view to putting them to use in either clinical or real-life applications.
APA, Harvard, Vancouver, ISO, and other styles
50

Oldham, Joseph D. "DEXTER: Generating Documents by means of computational registers." UKnowledge, 2000. http://uknowledge.uky.edu/gradschool_diss/321.

Full text
Abstract:
Software is often capable of efficiently storing and managing data on computers. However, even software systems that store and manage data efficiently often do an inadequate job of presenting data to users. A prototypical example is the display of raw data in the tabular results of SQL queries. Users may need a presentation that is sensitive to data values and sensitive to domain conventions. One way to enhance presentation is to generate documents that correctly convey the data to users, taking into account the needs of the user and the values in the data. I have designed and implemented a software approach to generating human-readable documents in a variety of domains. The software to generate a document is called a "computational register", or "register" for short. A "register system" is a software package for authoring and managing individual registers. Registers generating documents in various domains may be managed by one register system. In this thesis I describe computational registers at an architectural level and discuss registers as implemented in DEXTER, my register system. Input to DEXTER registers is a set of SQL query results. DEXTER registers use a rule-based approach to create a document outline from the input. A register creates the output document by using flexible templates to express the document outline. The register approach is unique in several ways. Content determination and structural planning are carried out sequentially rather than simultaneously. Content planning itself is broken down into data re-representation followed by content selection. No advanced linguistic knowledge is required to understand the approach. Register authoring follows a course very similar to writing a single document. The internal data representation and content planning steps allow registers to use flexible templates, rather than more abstract grammar-based approaches, to render the final document. Computational registers are applicable in a variety of domains. What registers can be written is restricted not by domain, but by the original data representation. Finally, DEXTER shows that a single software suite can assist in authoring and management of a variety of registers.
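A toy sketch of the register idea follows: rules select and organize content from SQL-style rows into an outline, and templates render the outline as prose. All names and rules are invented for illustration and are not DEXTER's actual implementation.

    # Toy sketch of a computational register: rules build an outline from
    # SQL-style rows, then templates render it as a human-readable document.
    rows = [  # stand-in for SQL query results
        {"city": "Lexington", "high": 31, "low": 18, "rain": 0.0},
        {"city": "Louisville", "high": 29, "low": 20, "rain": 0.4},
    ]

    def outline(row):
        items = [("temps", row)]
        if row["rain"] > 0:  # content-selection rule: mention rain only if any
            items.append(("rain", row))
        return items

    TEMPLATES = {
        "temps": "In {city}, expect a high of {high} and a low of {low}.",
        "rain":  "About {rain} inches of rain are likely.",
    }

    for row in rows:
        print(" ".join(TEMPLATES[kind].format(**r) for kind, r in outline(row)))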
APA, Harvard, Vancouver, ISO, and other styles
