Dissertations / Theses on the topic 'Language processing tasks'


Consult the top 44 dissertations / theses for your research on the topic 'Language processing tasks.'


1

Medlock, Benjamin William. "Investigating classification for natural language processing tasks." Thesis, University of Cambridge, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.611949.

2

Dyson, Lucy. "Insights into language processing in aphasia from semantic priming and semantic judgement tasks." Thesis, University of Sheffield, 2017. http://etheses.whiterose.ac.uk/19144/.

Abstract:
The nature of semantic impairment in people with aphasia (PWA) provides the background to the current study, which examines whether different methods of semantic assessment can account for such deficits. Cognitive ability, which has previously been linked to language ability in PWA, may impact on test performance and was therefore also examined. The aims of the current study were to compare performance of control participants and PWA on implicit and explicit assessment of semantics, and to relate it to performance on tests of cognition. The impact of semantically similar versus associative relationship types between test stimuli was also considered. Three experimental semantic tasks were developed, including one implicit measure of semantic processing (Semantic Priming) and two explicit measures (Word to Picture Verification and Word to Picture Matching). Test stimuli were matched in terms of key psycholinguistic variables of frequency, imageability and length, and other factors including visual similarity, semantic similarity, and association. Performance of 40 control participants and 20 PWA was investigated within and between participant groups. The relationship between semantic task performance and existing semantic and cognitive assessments was also explored in PWA. An important finding related to a subgroup of PWA who were impaired on the explicit experimental semantic tasks but demonstrated intact semantic processing via the implicit method. Within tasks some differences were found in the effects of semantically related or associated stimuli. No relationships were found between experimental semantic task performance and cognitive task accuracy. The research offers insights into the role of implicit language testing, the impact of stimuli relationship type, and the complex relationship between semantic processing and cognition. The findings underline the need for valid and accurate measures of semantic processing to be in place to enable accurate diagnosis for PWA, in order to direct appropriate intervention choice and facilitate successful rehabilitation.
3

Zahidin, Ahmad Zamri. "Using Ada tasks (concurrent processing) to simulate a business system." Virtual Press, 1988. http://liblink.bsu.edu/uhtbin/catkey/539634.

Abstract:
Concurrent processing has always been a traditional problem in developing operating systems. Today, concurrent algorithms occur in many application areas such as science and engineering, artificial intelligence, business systems, databases, and many more. The presence of concurrent processing facilities allows the natural expression of these algorithms as concurrent programs. This is a very distinct advantage if the underlying computer offers parallelism. On the other hand, the lack of concurrent processing facilities forces these algorithms to be written as sequential programs, thus destroying the structure of the algorithms and making them hard to understand and analyze. The first major programming language to offer high-level concurrent processing facilities is Ada. Ada is a complex, general-purpose programming language that provides an excellent concurrent programming facility, called the task, which is based on the rendezvous concept. In this study, concurrent processing is practiced by simulating a business system using the Ada language and its facilities. A warehouse (the business system) consisting of a number of employees purchases microwave ovens from various vendors and distributes them to several retailers. Simulation of activities in the system is carried out by assigning each employee to a specific task, and all tasks run simultaneously. The programs written for this business system produce the transactions and financial statements of a typical business day. They also examine the behavior of activities that occur simultaneously. The end results show that concurrency and Ada work efficiently and effectively.
Department of Computer Science
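The rendezvous mechanism behind Ada tasks pairs a caller's entry call with the callee's accept, so two tasks synchronize at a hand-off point. As a rough illustration only (not the thesis's Ada code; the roles and the sentinel protocol below are invented), the same warehouse idea can be sketched in Python with threads and a blocking hand-off queue:

```python
import queue
import threading

# Toy approximation of an Ada-style rendezvous: a zero-buffer hand-off is
# imitated with a capacity-1 queue, so the purchaser blocks until the
# distributor is ready to take each oven.
orders = queue.Queue(maxsize=1)

def purchaser(vendor_count):
    # One "employee task" buying ovens from vendors.
    for vendor in range(vendor_count):
        orders.put(f"oven from vendor {vendor}")  # blocks until handed over
    orders.put(None)  # sentinel: no more stock today

def distributor():
    # Another "employee task" shipping ovens to retailers.
    while (item := orders.get()) is not None:
        print(f"shipping {item} to a retailer")

threads = [threading.Thread(target=purchaser, args=(3,)),
           threading.Thread(target=distributor)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```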
4

Laws, Florian [Verfasser], and Hinrich [Akademischer Betreuer] Schütze. "Effective active learning for complex natural language processing tasks / Florian Laws. Betreuer: Hinrich Schütze." Stuttgart : Universitätsbibliothek der Universität Stuttgart, 2013. http://d-nb.info/1030521204/34.

5

Lorello, Luca Salvatore. "Small transformers for Bioinformatics tasks." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/23883/.

Abstract:
Recent trends in bioinformatics are trying to align the field with more modern approaches based on statistical natural language processing and deep learning; however, state-of-the-art neural natural language processing techniques remain relatively unexplored in this domain. Large models are capable of achieving state-of-the-art performance, but a typical bioinformatics lab has limited hardware resources. For this reason, this thesis focuses on small architectures, the training of which can be performed in a reasonable amount of time, while trying to limit or even negate the performance loss compared to SOTA. In particular, sparse attention mechanisms (such as the one proposed by Longformer) and parameter sharing techniques (such as the one proposed by ALBERT) are jointly explored with respect to two genetic languages: the human genome and the eukaryotic mitochondrial genome of 2000+ different species. Contextual embeddings for each token are learned via pretraining on a language understanding task, in both RoBERTa and ALBERT styles, to highlight differences in performance and training efficiency. The learned contextual embeddings are finally exploited for fine-tuning a localization task (transcription start site in human promoters) and two sequence classification tasks (12S metagenomics in fishes and chromatin profile prediction, single-class and multi-class respectively). Using smaller architectures, near-SOTA performances are achieved on all the tasks already explored in the literature, and a new SOTA has been established for the other tasks. Further experiments with larger architectures consistently improved the previous SOTA for every task.
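One of the two techniques explored, ALBERT-style cross-layer parameter sharing, amounts to reusing a single layer's weights at every depth, which is what keeps such models small. A minimal PyTorch sketch of just that idea (the dimensions and vocabulary are invented, the Longformer-style sparse attention is omitted, and this is not the thesis's architecture):

```python
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    """Toy ALBERT-style encoder: one layer's weights reused at every depth."""
    def __init__(self, d_model=128, nhead=4, depth=6, vocab_size=4096):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # A single transformer layer; looping over it `depth` times shares
        # its parameters across the whole stack.
        self.layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.depth = depth

    def forward(self, token_ids):
        x = self.embed(token_ids)
        for _ in range(self.depth):  # same weights at every "layer"
            x = self.layer(x)
        return x

# A 6-deep stack with the parameter count of a 1-deep one.
model = SharedEncoder()
out = model(torch.randint(0, 4096, (2, 16)))  # (batch=2, seq=16, d_model=128)
```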
6

Curiel, Diaz Arturo Tlacaélel. "Using formal logic to represent sign language phonetics in semi-automatic annotation tasks." Thesis, Toulouse 3, 2015. http://www.theses.fr/2015TOU30308/document.

Abstract:
This thesis presents a formal framework for the representation of Signed Languages (SLs), the languages of Deaf communities, in semi-automatic recognition tasks. SLs are complex visio-gestural communication systems; by using corporal gestures, signers achieve the same level of expressivity held by sound-based languages like English or French. However, unlike these, SL morphemes correspond to complex sequences of highly specific body postures, interleaved with postural changes: during signing, signers use several parts of their body simultaneously in order to combinatorially build phonemes. This situation, paired with an extensive use of the three-dimensional space, makes them difficult to represent with tools already existent in Natural Language Processing (NLP) of vocal languages. For this reason, the current work presents the development of a formal representation framework, intended to transform SL video repositories (corpora) into an intermediate representation layer, where automatic recognition algorithms can work under better conditions. The main idea is that corpora can be described with a specialized Labeled Transition System (LTS), which can then be annotated with logic formulae for its study. A multi-modal logic was chosen as the basis of the formal language: Propositional Dynamic Logic (PDL). This logic was originally created to specify and prove properties of computer programs. In particular, PDL uses the modal operators [a] and <a> to denote necessity and possibility, respectively. For SLs, a particular variant based on the original formalism was developed: the PDL for Sign Language (PDLSL). With PDLSL, body articulators (like the hands or head) are interpreted as independent agents; each articulator has its own set of valid actions and propositions, and executes them without influence from the others. The simultaneous execution of different actions by several articulators yields distinct situations, which can be searched over an LTS with formulae, by using the semantic rules of the logic. Together, the use of PDLSL and the proposed specialized data structures could help curb some of the current problems in SL study, notably the heterogeneity of corpora and the lack of automatic annotation aids. In the same vein, this may not only increase the size of the available datasets, but even extend previous results to new corpora; the framework inserts an intermediate representation layer which can serve to model any corpus, regardless of its technical limitations. With this, annotation is possible by defining with formulae the characteristics to annotate. Afterwards, a formal verification algorithm may be able to find those features in corpora, as long as they are represented as consistent LTSs. Finally, the development of the formal framework led to the creation of a semi-automatic annotator based on the presented theoretical principles. Broadly, the system receives an untreated corpus video, converts it automatically into a valid LTS (by way of some predefined rules), and then verifies human-created PDLSL formulae over the LTS. The final product is an automatically generated sub-lexical annotation, which can later be corrected by human annotators for use in other areas such as linguistics.
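The verification step at the heart of this pipeline can be pictured as evaluating box and diamond formulae over a labelled transition system. A toy Python sketch, where the states, action labels and propositions are invented stand-ins for articulator postures (actual PDLSL formulae and LTSs are far richer):

```python
# state -> {action label -> set of successor states}
lts = {
    "s0": {"move_right_hand": {"s1"}, "tilt_head": {"s2"}},
    "s1": {"move_right_hand": {"s2"}},
    "s2": {},
}
props = {"s1": {"hand_at_chest"}, "s2": {"head_tilted"}}  # valuation

def diamond(state, action, prop):
    """<action>prop: some action-successor of state satisfies prop."""
    return any(prop in props.get(t, set())
               for t in lts[state].get(action, set()))

def box(state, action, prop):
    """[action]prop: every action-successor of state satisfies prop."""
    return all(prop in props.get(t, set())
               for t in lts[state].get(action, set()))

print(diamond("s0", "move_right_hand", "hand_at_chest"))  # True
print(box("s0", "tilt_head", "head_tilted"))              # True
```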
7

Milajevs, Dmitrijs. "A study of model parameters for scaling up word to sentence similarity tasks in distributional semantics." Thesis, Queen Mary, University of London, 2018. http://qmro.qmul.ac.uk/xmlui/handle/123456789/36225.

Abstract:
Representation of sentences that captures semantics is an essential part of natural language processing systems, such as information retrieval or machine translation. The representation of a sentence is commonly built by combining the representations of the words that the sentence consists of. Similarity between words is widely used as a proxy to evaluate semantic representations. Word similarity models are well studied and are shown to correlate positively with human similarity judgements. Current evaluation of models of sentential similarity builds on the results obtained in lexical experiments. The main focus is how the lexical representations are used, rather than what they should be. It is often assumed that the optimal representations for word similarity are also optimal for sentence similarity. This work discards this assumption and systematically looks for lexical representations that are optimal for similarity measurement between sentences. We find that the best representation for word similarity is not always the best for sentence similarity, and vice versa. The best models in word similarity tasks perform best with additive composition. However, the best result on compositional tasks is achieved with Kronecker-based composition. There are representations that are equally good in both tasks when used with multiplicative composition. The systematic study of the parameters of similarity models reveals that the more information lexical representations contain, the more attention should be paid to noise. In particular, the word vectors in models with a feature size on the order of the vocabulary size should be sparse, but if a small number of context features is used then the vectors should be dense. Given the right lexical representations, compositional operators achieve state-of-the-art performance, improving over models that use neural word embeddings. To avoid overfitting, either several test datasets should be used or parameter selection should be based on parameters' average behaviours.
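The three composition operators compared above are easy to state concretely. A small numpy sketch with made-up four-dimensional word vectors (real experiments use high-dimensional distributional vectors):

```python
import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Made-up "word vectors" purely for illustration.
dog, barks = np.array([0.2, 0.9, 0.1, 0.4]), np.array([0.7, 0.3, 0.8, 0.1])
cat, meows = np.array([0.3, 0.8, 0.2, 0.5]), np.array([0.6, 0.2, 0.9, 0.2])

additive       = lambda u, v: u + v          # pointwise sum
multiplicative = lambda u, v: u * v          # pointwise product
kronecker      = lambda u, v: np.kron(u, v)  # tensor (Kronecker) product

for name, compose in [("additive", additive),
                      ("multiplicative", multiplicative),
                      ("Kronecker", kronecker)]:
    s1, s2 = compose(dog, barks), compose(cat, meows)
    print(f"{name}: sim(dog barks, cat meows) = {cosine(s1, s2):.3f}")
```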
8

Al-Hadlaq, Mohammed S. "Retention of words learned incidentally by Saudi EFL learners through working on vocabulary learning tasks constructed to activate varying depths of processing." Virtual Press, 2003. http://liblink.bsu.edu/uhtbin/catkey/1263891.

Abstract:
This study investigated the effectiveness of four vocabulary learning tasks on 104 Saudi EFL learners' retention of ten previously unencountered lexical items. These four tasks were: 1) writing original sentences (WS), 2) writing an original text (i.e. composition) (WT), 3) filling in the blanks of single sentences (FS), and 4) filling in the blanks of a text (FT). Different results were obtained depending on whether the amount of time required by these tasks was considered in the analysis or not. When time was not considered in the analysis, the WT group outperformed the other groups while the FS group obtained the lowest score. No significant differences were found between WS and FT. The picture, however, changed dramatically when time was considered in the analysis. The analysis of the ratio of score to time taken revealed no significant differences between the four groups except between FT and FS, in favor of FT. The differences in vocabulary gains between the four groups were ascribed to the level (or depth) of processing these tasks required the subjects to do, and to the richness of the context available in two of the four exercises, namely WT and FT. The researcher concluded that composition writing was the most helpful task for vocabulary retention and also for general language learning, followed by FT. Sentence fill-in was considered the least useful activity in this regard.
Department of English
9

Chen, Charles L. "Neural Network Models for Tasks in Open-Domain and Closed-Domain Question Answering." Ohio University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1578592581367428.

10

Malapetsa, Christina. "Stroop tasks with visual and auditory stimuli : How different combinations of spoken words, written words, images and natural sounds affect reaction times." Thesis, Stockholms universitet, Institutionen för lingvistik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-185057.

Abstract:
The Stroop effect is the delay in reaction times due to interference. Since the original experiments of 1935, it has been used primarily in linguistic contexts. Language is a complex skill unique to humans, which involves a large part of the cerebral cortex and many subcortical regions. It is perceived primarily in auditory form (spoken) and secondarily in visual form (written), but it is also always perceived in representational form (natural sounds, images, smells, etc.). Auditory signals are processed much faster than visual signals, and the language processing centres are closer to the primary auditory cortex than to the primary visual cortex, but due to the integration of stimuli and the role of the executive functions, we are able to perceive both simultaneously and coherently. However, auditory signals are still processed faster, and this study focused on establishing how auditory and visual, linguistic and representational stimuli interact with each other and affect reaction times in four Stroop tasks with four archetypal mammals (dog, cat, mouse and pig): a written word against an image, a spoken word against an image, a written word against a natural sound, and a spoken word against a natural sound. Four hypotheses were tested: in all tasks reaction times would be faster when the stimuli were congruent (Stroop Hypothesis); reaction times would be faster when both stimuli were auditory than when both were visual (Audiovisual Hypothesis); reaction times would be similar in the tasks where one stimulus was auditory and the other visual (Similarity Hypothesis); finally, reaction times would be slower when stimuli came from two sources than when they came from one source (Attention Hypothesis). Twelve native speakers of Swedish between the ages of 22 and 40 participated. The experiment took place in the EEG lab of the Linguistics Department of Stockholm University. The same researcher (the author) and equipment were used for all participants. The results confirmed the Stroop Hypothesis, did not confirm the Audiovisual and Similarity Hypotheses, and the results for the Attention Hypothesis were mixed. The somewhat controversial results were mostly attributed to a false initial assumption, namely that having two different auditory stimuli (one in each ear) counted as one source of stimuli, and possibly to the poor quality of some natural sounds. With this additional consideration, the results seemed to be in accord with previous research. Future research could focus on more efficient ways to test reaction times in Stroop tasks involving auditory and visual stimuli, as well as on different populations, especially neurodiverse and bilingual populations.
11

Elliot, Mark James. "A computational model of task oriented discourse." Thesis, University of Nottingham, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.284055.

12

Felce, Catherine. "L’ouverture de l’énoncé en allemand L2 : De la compréhension d’un phénomène à son appropriation et à son enseignement. Perspectives en didactique des langues." Thesis, Sorbonne Paris Cité, 2015. http://www.theses.fr/2015USPCA157/document.

Abstract:
This study concerns the acquisition of language-specific discourse preferences in the first years of German as a foreign language in French secondary schools, and how new approaches could improve language instruction. Instead of focusing on verbal placement in declaratives, we decided to consider how learners start a sentence; that is, which constituent they decide to put before the finite verb form. The initial field in German (called the pre-field) represents a syntactically undetermined position, as it can be occupied by a variety of elements. To understand the constraints which influence the choice of the first constituent in a sentence, textual categories, as well as pragmatic and information-structural criteria, are required. These aspects were incorporated in the tasks the learners worked on in the classroom, in order to make them use such principles and to modify the processing preferences they may have built up during acquisition of their L1. Acquisition draws on internal processes which set limits on instructional intervention. Practitioners should take these limitations into account if they aim to elaborate instructional proposals with linguistic and psycholinguistic relevance. Drawing on findings from SLA research, we analyse the beginnings of sentences in a corpus of written and oral learner samples. We used these empirical observations as a guideline to redesign proposals for an instructional intervention that better fits a learning progression. Notions from different areas of linguistics contribute to highlighting specific functions of sentence beginnings in German. It would be possible to integrate these functions in a teaching programme. SLA research offers a theoretical framework for our research, as its findings provide a better understanding of the cognitive dimension of learning and its notions constitute a theoretical backing for our didactical proposals.
13

Røkenes, Håkon Drolsum. "Graph-based Natural Language Processing : Graph edit distance applied to the task of detecting plagiarism." Thesis, Norges teknisk-naturvitenskapelige universitet, Institutt for datateknikk og informasjonsvitenskap, 2012. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-20778.

Abstract:
The focus of this thesis is the exploration of graph-based similarity in the context of natural language processing. The work is motivated by a need for richer representations of text. A graph edit distance algorithm was implemented that calculates the difference between graphs. Sentences were represented by means of dependency graphs, which consist of words connected by dependencies; a dependency graph captures the syntactic structure of a sentence. The graph-based similarity approach was applied to the problem of detecting plagiarism and was compared against state-of-the-art systems. The key advantages of graph-based textual representations are mainly word order indifference and the ability to capture similarity between words based on sentence structure. The approach was compared against contributions made to the PAN plagiarism detection challenge at the CLEF 2011 conference, and would have achieved 5th place out of 10 contestants. The evaluation results suggest that the approach can be applicable to the task of detecting plagiarism, but requires some fine-tuning of input parameters. The evaluation results demonstrated that dependency graphs are best represented by directed edges. The graph edit distance algorithm scored best with a combination of node and edge label matching. Different edit weights were applied, which increased performance. Keywords: Graph Edit Distance, Natural Language Processing, Dependency Graphs, Plagiarism Detection
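As a concrete illustration of the approach, networkx provides a graph edit distance routine that accepts node- and edge-matching predicates, echoing the node and edge label matching described above. The two dependency graphs below are invented toys, not the thesis's data or implementation:

```python
import networkx as nx

def dep_graph(edges):
    """Build a tiny directed dependency graph from (head, dependent, label)."""
    g = nx.DiGraph()
    for head, dependent, label in edges:
        g.add_node(head, word=head)
        g.add_node(dependent, word=dependent)
        g.add_edge(head, dependent, dep=label)
    return g

g1 = dep_graph([("wrote", "author", "nsubj"), ("wrote", "thesis", "obj")])
g2 = dep_graph([("wrote", "student", "nsubj"), ("wrote", "thesis", "obj")])

# Combine node label matching with edge (dependency) label matching.
dist = nx.graph_edit_distance(
    g1, g2,
    node_match=lambda a, b: a["word"] == b["word"],
    edge_match=lambda a, b: a["dep"] == b["dep"],
)
print(dist)  # 1.0: the two graphs differ by a single node substitution
```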
14

Ugalde-Gonzalez, Arturo. "Processing conditions as influences on task-based foreign language performance : a longitudinal study of communication strategies." Thesis, King's College London (University of London), 2006. https://kclpure.kcl.ac.uk/portal/en/theses/processing-conditions-as-influences-on-taskbased-foreign-language-performance--a-longitudinal-study-of-communication-strategies(5e7d3767-5b17-4a47-abb5-df068881a0f8).html.

Abstract:
Communicative language ability requires the development of formal linguistic resources to achieve realistic language use in real contexts and in real time. Linguistic-based instruction may become less productive and fruitful than exposure to real-world task-based instruction. This study focuses on foreign language development over time in sixteen subjects performing task-based activities involving description, narration and problem-solving situations. Learners do not seem to present a clear acquisition sequencing for language forms and language functions. The major ideas and claims that have been used as a conceptual frame for this research focus on the cognitive basis for language learning. The output practice which learners engage in enables them to cope with the task demands, which are aimed at integrating language knowledge with productive use. The central issue concentrates on how the cognitive processing demands push the learner towards successful communicative achievement. The pressure exerted on the learner's capacity to deal with meaning complexity, accurate performance and fluent communication is intended to focus on the learner's divided efforts to convey appropriate formulations and effective messages.
15

Alamry, Ali. "Grammatical Gender Processing in Standard Arabic as a First and a Second Language." Thesis, Université d'Ottawa / University of Ottawa, 2019. http://hdl.handle.net/10393/39965.

Abstract:
The present dissertation investigates grammatical gender representation and processing in Modern Standard Arabic (MSA) as a first (L1) and a second (L2) language. It mainly examines whether L2 speakers can process gender agreement in a native-like manner, and the extent to which L2 processing is influenced by the properties of the L2 speakers' L1. Additionally, it examines whether L2 gender agreement processing is influenced by noun animacy (animate and inanimate) and word order (verb-subject and subject-verb). A series of experiments using both online and offline techniques were conducted to address these questions. In all of the experiments, gender agreement between verbs and nouns was examined. The first series of experiments examined native speakers of MSA (n=49) using a self-paced reading (SPR) task, an event-related potential (ERP) experiment, and a grammaticality judgment (GJ) task. Results of these experiments revealed that native speakers were sensitive to grammatical violations. Native speakers showed longer reaction times (RTs) in the SPR task, and a P600 effect in the ERP, in response to sentences with mismatched gender agreement as compared to sentences with matched gender agreement. They also performed at ceiling in the GJ task. The second series of experiments examined L2 speakers of MSA (n=74) using an SPR task and a GJ task. Both experiments included adult L2 speakers who were divided into two subgroups, -Gender and +Gender, based on whether or not their L1 has a grammatical gender system. The results of both experiments revealed that both groups were sensitive to gender agreement violations. The L2 speakers showed longer RTs, in the SPR task, in response to sentences with mismatched gender agreement as compared to sentences with matched gender agreement. No difference was found between the L2 groups in this task. The L2 speakers also performed well in the GJ task, as they were able to correctly identify the grammatical and ungrammatical sentences. Interestingly, in this task the -Gender group outperformed the +Gender group, which could be due to proficiency in the L2, as the former group obtained a better score on the proficiency task, or it could be that the +Gender group showed negative transfer from their L1s. Based on the results of these two experiments, this dissertation argues that late L2 speakers are not restricted to their L1 grammar, and thus are able to acquire the gender agreement system of their L2 even if this feature is not instantiated in their L1. The results provide converging evidence for the FTFA model rather than the FFFH model, as it appears that the -Gender group was able to reset their L1 gender parameter according to the L2 gender values. Although the L2 speakers were advanced, they showed slower RTs than the native speakers in the SPR task, and lower accuracy in the GJT. However, it is possible that they are still in the process of acquiring gender agreement in MSA and have not reached the final stage of acquisition. This is supported by the fact that some L2 speakers from both the -Gender and +Gender groups performed as well as native speakers in both the SPR and GJ tasks. Regarding the effect of animacy, the L2 speakers had slower RTs and lower accuracy on sentences with inanimate nouns than on those with animate ones, which is in line with previous L2 studies (Anton-Medez, 1999; Alarcón, 2009; Gelin & Bugaiska, 2014). The native speakers, on the other hand, showed no effect of animacy in either the SPR task or the GJT.
Further, no N400 effect was observed as a result of semantic gender agreement violations in the ERP experiment. Finally, the results revealed a potential effect of word order. Both the native and L2 speakers showed longer RTs for VS word order than SV word order in the SPR task. Further, the native speakers showed an earlier and greater P600 effect for VS word order than SV word order in the ERP. This result suggests that processing gender agreement violations is more complex in the VS word order than in the SV word order, due to the inherent asymmetry in the subject-verb agreement system of the two word orders in MSA.
16

Suddrey, Gavin. "Instructing and training robots through a natural language dialogue." Thesis, Queensland University of Technology, 2022. https://eprints.qut.edu.au/229145/1/Gavin_Suddrey_Thesis.pdf.

Abstract:
This thesis focused on the problem of allowing non-expert users, such as the elderly, to teach robots how to perform everyday tasks through dialogue. In particular, this thesis addressed issues relating to how task knowledge, extracted from spoken instructions, should be encoded by a robot to allow the robot to learn complex tasks; how to generalise knowledge provided by human instructors such that the robot can perform the same task across different scenarios; as well as how to handle information gaps present in the explanation of tasks provided by novice users.
17

Hikima, Noriko. "The effects of processing instruction and re-exposure on interpretation discourse level tasks : the case of Japanese passive forms." Thesis, University of Greenwich, 2010. http://gala.gre.ac.uk/6604/.

Abstract:
The present study was conducted to investigate possible discourse-level interpretation effects of processing instruction, and of re-exposure to processing instruction, on the acquisition of a specific feature of the Japanese linguistic system: Japanese passive forms. Processing instruction is a type of focus on form which is framed around the input processing theoretical framework. In order to carry out this investigation, two separate experimental studies were conducted. All participants were native English speakers and were randomly assigned to two groups. In both experimental studies, one group received processing instruction, which involved an explicit instruction component and structured input practice directed at altering the way L2 learners process input; the other group was used as a control group and received no instruction. Interpretation and production sentence-level tasks, and discourse-level tasks, were used to measure performance after a one-day instruction period. A pre-test/post-test design was adopted to collect data in both studies. In the second experimental study, the processing instruction group received a re-exposure treatment between the post-test and the delayed post-test. Based on previous research carried out on the effectiveness of processing instruction, it was hypothesised that processing instruction would have positive effects on the accuracy with which subjects interpreted and produced sentences containing Japanese passive forms. A further hypothesis was that re-exposure to the processing instruction treatment (after the first post-test) would further improve subjects' ability to interpret and produce sentences containing Japanese passive forms. A set of two hypotheses was formulated on possible discourse-level interpretation effects of processing instruction. It was hypothesised that the group receiving processing instruction would improve in its ability to interpret discourse (guided recall: dialogue and story versions) containing Japanese passive forms, and that learners in this group, receiving re-exposure to the processing instruction treatment, would further improve in their ability to interpret discourse containing Japanese passive forms. Overall, the statistical analyses carried out on the raw scores of all the measures used supported the four hypotheses of this study. The results obtained in this research provide clear evidence that processing instruction has positive effects on the acquisition of the Japanese passive construction. The present study showed that processing instruction was successful in altering the way in which learners processed the input, and its effects also had an impact on the way learners produced Japanese passive forms. The main findings of the present study also provide new evidence on the effectiveness of processing instruction in improving learners' performance on discourse-level interpretation tasks. In addition, the study provides new evidence that learners receiving re-exposure to the processing instruction treatment between a post-test and a delayed post-test can further improve in their ability to interpret and produce the target feature at sentence level and to interpret the target feature at discourse level. The results obtained in the two studies have implications at two levels. At the theoretical level, this research provides further support for the role that input processing plays in SLA. At the pedagogical level, it demonstrates the effectiveness of processing instruction on the acquisition of a different linguistic feature of the Japanese grammar system (passive forms), not only on interpretation and production sentence-level tasks but also on a discourse-level interpretation task. It also demonstrates the important role of a re-exposure instructional treatment.
18

Acosta, Andrew D. "Laff-O-Tron: Laugh Prediction in TED Talks." DigitalCommons@CalPoly, 2016. https://digitalcommons.calpoly.edu/theses/1667.

Abstract:
Did you hear where the thesis found its ancestors? They were in the "parent-thesis"! This joke, whether you laughed at it or not, contains a fascinating and mysterious quality: humor. Humor is something so incredibly human that if you squint, the two words can even look the same. As such, humor is not often considered something that computers can understand. But that doesn't mean we won't try to teach it to them. In this thesis, we propose the system Laff-O-Tron to attempt to predict when the audience of a public speech would laugh by looking only at the text of the speech. To do this, we create a corpus of over 1700 TED Talks retrieved from the TED website. We then adapted various techniques used by researchers to identify humor in text. We also investigated features that were specific to our public speaking environment. Using supervised learning, we try to classify whether a chunk of text would cause the audience to laugh or not based on these features. We examine the effects of each feature, classifier, and size of the text chunk provided. On a balanced data set, we are able to predict laughter with up to 75% accuracy in our best conditions; medium-level conditions prove to be around 70% accuracy, while our worst conditions result in 66% accuracy. Computers with humor recognition capabilities would be useful in the fields of human-computer interaction and communications. Humor can make a computer easier to interact with, and it can function as a tool to check whether humor was properly used in an advertisement or speech.
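The classification setup can be sketched with a generic text pipeline. The toy example below substitutes TF-IDF features and logistic regression for the thesis's richer humor- and speech-specific features; the chunks and labels are invented:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented stand-in for the TED corpus: text chunks labelled with whether
# the audience laughed after them.
chunks = [
    "they were in the parent thesis",
    "so i opened the fridge and my code was still running",
    "the mitochondria is the powerhouse of the cell",
    "today we will discuss quarterly revenue figures",
]
laughed = [1, 1, 0, 0]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(chunks, laughed)
print(clf.predict(["and that is why my thesis has ancestors"]))
```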
19

Zezinka, Alexandra. "Investigation of a Computerized Nonverbal Word-Picture Verification Task in Healthy Adults." The Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1491932143220066.

20

Foster, Pauline Mary. "Attending to message and medium : the effects of planning time on the task-based language performance of native and non-native speakers." Thesis, King's College London (University of London), 2000. https://kclpure.kcl.ac.uk/portal/en/theses/attending-to-message-and-medium--the-effects-of-planning-time-on-the-taskbased-language-performance-of-native-and-nonnative-speakers(528f66b6-1324-4ca7-a76c-0f4b3247e14a).html.

21

Enzinna, Naomi R. "The Processing of Preposition-Stranding Constructions in English." FIU Digital Commons, 2013. http://digitalcommons.fiu.edu/etd/889.

Abstract:
One of the prominent questions in modern psycholinguistics is the relationship between the grammar and the parser. Within the approach of Generative Grammar, this issue has been investigated in terms of the role that Principles of Universal Grammar may play in language processing. The aim of this research experiment is to investigate this topic. Specifically, this experiment aims to test whether the Minimal Structure Principle (MSP) plays a role in the processing of Preposition-Stranding versus Pied-Piped Constructions. This investigation is made with a self-paced reading task, an on-line processing test that measures participants’ unconscious reaction to language stimuli. Monolingual English speakers’ reading times of sentences with Preposition-Stranding and Pied-Piped Constructions are compared. Results indicate that neither construction has greater processing costs, suggesting that factors other than the MSP are active during language processing.
22

Masellis, Maria C. "Evidence for temporal processing deficits in children with attention deficit hyperactivity disorder and language impairments on a dichotic listening task." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape11/PQDD_0007/MQ40662.pdf.

23

Serafini, Sandra. "Functional neuroanatomy during language processing : correspondence of cortical stimulation mapping, fMRI, PEPSI, and ERP during a visual object naming task /." Thesis, Connect to this title online; UW restricted, 2002. http://hdl.handle.net/1773/8235.

24

Simonis, Rita. "The effects of multilingualism on executive processing." Thesis, Stockholms universitet, Centrum för tvåspråkighetsforskning, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-157571.

Abstract:
In the first decades of the 20th century, research on bilingualism was just beginning. The first studies on bilingual children proposed a substantial disadvantage with respect to intelligence and learning abilities. This first proposition was later discarded when Peal and Lambert (1962) suggested that, on the contrary, speaking two languages provided children with significant advantages in their cognition. At present, while knowing more than one language is not considered negative, the supposition that bilingualism might have positive effects on executive processing is subject to controversy. The Bilingual Executive Advantage (BEA) hypothesis has been tested many times and in several ways. Nevertheless, it appears more like an overstated theory than a real and proven fact. The purpose of this study is to contribute to this scholarly debate not only by conducting one more experiment but also by investigating a possible extension of the original hypothesis; more specifically, the possibility that additional languages might confer an even greater cognitive advantage than the one that has been claimed to exist for bilingual individuals. In the study, 23 young adults were tested on a version of the Attentional Network Task and a Colour-Shape switching task, both used in a previous study on professional interpreters (Babcock and Vallesi, 2017). The subjects were divided into two groups, bilinguals and multilinguals. The comparison of their performances in the two tasks revealed no significant difference in any of the examined measures.
25

Widjaja, Hendra. "Visor++ : a software visualisation tool for task-parallel object-orientated programs." Title page, abstract and contents only, 1998. http://web4.library.adelaide.edu.au/theses/09AS/09asw639.pdf.

Abstract:
Bibliography: leaves 173-184. This thesis describes Visor++, a tool for visualising programs written in CC++, a task-parallel, object-orientated language derived from C++. Visor++ provides a framework for visualising task-parallel object-orientated programs in the absence of language support for visualisation, i.e. for programs such as CC++ which are written in languages that are not "visualisation-conscious". The development of techniques using a wide selection of language features is described, and their effectiveness is demonstrated by experimentation.
26

Gan, Gabriela, Christian Büchel, and Frédéric Isel. "Effect of language task demands on the neural response during lexical access: a functional magnetic resonance imaging study." Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2013. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-127023.

Abstract:
This study examined the effects of linguistic task demands on the neuroanatomical localization of the neural response related to automatic semantic processing of concrete German nouns, combining the associative priming paradigm with functional magnetic resonance imaging (fMRI). To clarify the functional role of the inferior frontal gyrus (IFG) in semantic processing with respect to semantic decision making, as compared to semantic processing per se, we used a linguistic task that either involved a binary decision process (i.e., semantic categorization; Experiment 1) or did not (i.e., silently thinking about a word's meaning; Experiment 2). We observed associative priming effects, indicated as neural suppression, in bilateral superior temporal gyri (STG), anterior cingulate cortex (ACC), occipito-temporal brain areas, and medial frontal brain areas, independently of the linguistic task. Inferior parietal brain areas were more active for silently thinking about a word's meaning compared to semantic categorization. A conjunction analysis of linguistic task revealed that both tasks activated the same left-lateralized occipito-temporo-frontal network, including the IFG. Contrasting neural associative priming effects across linguistic task demands, we found a significant interaction in the right IFG. The present fMRI data give rise to the assumption that activation of the left inferior frontal gyrus (LIFG) in the semantic domain might be important for semantic processing in general and not only for semantic decision making. These findings contrast with a recent study regarding the role of the LIFG in binary decision making in the lexical domain (Wright et al. 2011).
27

Eyorokon, Vahid. "Measuring Goal Similarity Using Concept, Context and Task Features." Wright State University / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=wright1534084289041091.

28

Tovedal, Sofiea. "On The Effectiveness of Multi-Task Learning: An evaluation of Multi-Task Learning techniques in deep learning models." Thesis, Umeå universitet, Institutionen för datavetenskap, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-172257.

Abstract:
Multi-Task Learning is today an interesting and promising field which many mention as a must for achieving the next level of advancement within machine learning. However, in reality, Multi-Task Learning is much more rarely used in real-world implementations than its more popular cousin, Transfer Learning. The question is why that is, and whether Multi-Task Learning outperforms its Single-Task counterparts. In this thesis, different Multi-Task Learning architectures were utilized in order to build a model that can handle labeling real technical issues within two categories. The model faces a challenging imbalanced data set with many labels to choose from and short texts to base its predictions on. Can task-sharing be the answer to these problems? This thesis investigated three Multi-Task Learning architectures and compared their performance to a Single-Task model. An authentic data set and two labeling tasks were used in training the models with the method of supervised learning. The four model architectures (Single-Task, Multi-Task, Cross-Stitched and Shared-Private) first went through a hyperparameter tuning process using one of the two layer options, LSTM or GRU. They were then boosted by auxiliary tasks and finally evaluated against each other.
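A common denominator of such architectures is hard parameter sharing: one shared encoder feeding one head per task, trained on the summed task losses. A minimal PyTorch sketch with invented sizes and label counts (not the thesis's exact Multi-Task, Cross-Stitched or Shared-Private models):

```python
import torch
import torch.nn as nn

class MultiTaskClassifier(nn.Module):
    """Toy hard-parameter-sharing model: a shared GRU encoder feeds two
    task-specific classification heads."""
    def __init__(self, vocab=5000, emb=64, hidden=128, labels_a=10, labels_b=4):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.encoder = nn.GRU(emb, hidden, batch_first=True)  # shared by both tasks
        self.head_a = nn.Linear(hidden, labels_a)  # e.g. issue category
        self.head_b = nn.Linear(hidden, labels_b)  # e.g. a second labeling task

    def forward(self, token_ids):
        _, h = self.encoder(self.embed(token_ids))  # h: (1, batch, hidden)
        h = h.squeeze(0)
        return self.head_a(h), self.head_b(h)

model = MultiTaskClassifier()
logits_a, logits_b = model(torch.randint(0, 5000, (8, 20)))
# Both tasks update the shared encoder through a summed loss:
loss = nn.functional.cross_entropy(logits_a, torch.randint(0, 10, (8,))) \
     + nn.functional.cross_entropy(logits_b, torch.randint(0, 4, (8,)))
loss.backward()
```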
29

Gibson, Bob. "Think aloud, non-continuous reporting, and annotated cloze : using verbal report and a self-coding procedure in looking at German and Japanese informants' processing of a second language cloze task." Thesis, University of Edinburgh, 2005. http://hdl.handle.net/1842/24611.

Abstract:
Following a survey of the history and nature of the cloze test as applied to second-language assessment, the study looks at the role of introspective and verbal-report data in understanding second-language test-taking processes. Particular attention is paid to the verbal report task format known as 'think-aloud'. The theoretical bases of this procedure are critiqued, and some problematic aspects of its use in the study of linguistic tasks are discussed. Attention is drawn to apparent divergences between the model of think-aloud and its real-world applications. The use of think-aloud in the study of cloze test-taking by German and Japanese first-language informants is discussed, and a number of specific shortcomings are identified. These shortcomings lie mainly in the areas of practical sample size, interpretability and comprehensiveness of data, and negative affective responses on the part of informants. A modification of the 'classical' think-aloud is proposed, labelled non-continuous reporting, and the results of this method are compared with those of think-aloud. It is concluded that the advantages of non-continuous reporting outweigh its shortcomings. A further alternative real-time data-gathering procedure is proposed, the so-called 'annotated cloze', and its strengths and drawbacks are discussed. The relative efficiencies of annotated cloze and the two variants of think-aloud are examined in terms of their ability to generate a picture of how test-takers process cloze passages, and suggestions are offered regarding optimal use of these task formats in the elicitation of further data.
30

Dockes, Jérôme. "Statistical models for comprehensive meta-analysis of neuroimaging studies." Electronic Thesis or Diss., Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLT048.

Abstract:
Thousands of neuroimaging studies are published every year. Exploiting this huge amount of results is difficult. Indeed, individual studies lack statistical power and report many spurious findings. Even genuine effects are often specific to particular experimental settings and difficult to reproduce. Meta-analysis aggregates studies to identify consistent trends in reported associations between brain structure and behavior. The standard approach to meta-analysis starts by gathering a sample of studies that investigate the same mental process or disease. Then, a statistical test delineates brain regions where there is significant agreement among reported findings. In this thesis, we develop a different kind of meta-analysis that focuses on prediction rather than hypothesis testing. We build predictive models that map textual descriptions of experiments, mental processes or diseases to anatomical regions in the brain. Our supervised learning approach comes with a natural quantitative evaluation framework, and we conduct extensive experiments to validate and compare statistical models. We collect and share the largest existing dataset of neuroimaging studies and stereotactic coordinates. This dataset contains the full text and locations of neurological observations for over 13 000 publications. In the last part, we turn to decoding: inferring mental states from brain activity. We perform this task through meta-analysis of fMRI statistical maps collected from an online data repository, using fMRI data to distinguish a wide range of mental conditions. Standard meta-analysis is an essential tool for distinguishing true discoveries from noise and artifacts. This thesis introduces methods for predictive meta-analysis, which complement the standard approach and help interpret neuroimaging results and formulate hypotheses or formal statistical priors.
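The predictive framing can be caricatured as supervised regression from study text to reported brain coordinates. The scikit-learn sketch below uses invented abstracts and single peaks (the thesis's models predict full spatial densities over thousands of studies):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

# Invented stand-in data: study texts paired with one reported activation
# peak each (x, y, z in MNI space).
texts = [
    "auditory cortex response to spoken words",
    "visual word form area during reading",
    "motor cortex activity during finger tapping",
]
peaks = np.array([[-54.0, -22.0, 8.0],
                  [-44.0, -56.0, -12.0],
                  [-38.0, -24.0, 56.0]])

model = make_pipeline(TfidfVectorizer(), Ridge(alpha=1.0))
model.fit(texts, peaks)                        # multi-output regression
print(model.predict(["listening to speech"]))  # predicted (x, y, z)
```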
31

Chan, Siu-Yuen. "Efficient user level infrastructure support for adaptive parallel computing on heterogeneous networks of workstations." Thesis, Queensland University of Technology, 2000.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
32

Scarlato, Michele. "Sicurezza di rete, analisi del traffico e monitoraggio." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2012. http://amslaurea.unibo.it/3223/.

Full text
Abstract:
The work was divided into three macro-areas. The first concerns a theoretical analysis of how intrusions work, of which software tools are used to carry them out, and of how to protect against them (using the devices generically known as firewalls). A second macro-area analyzes an intrusion carried out from the outside against sensitive servers of a LAN. This analysis is conducted on files captured by two network interfaces configured in promiscuous mode on a probe located in the LAN; there are two interfaces so as to connect to two LAN segments with two different subnet masks. The attack is analyzed with various software tools. A third part of the work can indeed be identified: the part where the files captured by the two interfaces are analyzed, first with software for full-content data, such as Wireshark, then with software for session data, processed with Argus, and finally statistical data, processed with Ntop. The penultimate chapter, the one before the conclusions, covers the installation of Nagios and its configuration for monitoring, through plugins, the remaining disk space on a remote agent machine and the MySQL and DNS services. Naturally, Nagios can be configured to monitor any type of service offered on the network.
APA, Harvard, Vancouver, ISO, and other styles
33

Yi, Li. "Why Do Young Children Fail in False Belief Tasks: Linguistic Representations and Implicit Processing." Diss., 2009. http://hdl.handle.net/10161/1192.

Full text
Abstract:

Despite recent evidence that infants under one year of age have an implicit understanding of theory of mind, three-year-old children repeatedly fail in traditional false belief tasks. A series of four studies investigated two possible sources of errors. First, children's comprehension of theory-of-mind questions was tested in an elicited imitation task. Second, their understanding of mental events was measured using anticipatory eye movements in non-verbal tasks. Results showed that young children's performance in verbal false belief tasks is limited by their understanding of linguistic representations of beliefs and by their ability to monitor mental states in real time. This points to young children's limitations in keeping track of complex social events and in understanding language conventions in real time.


Dissertation
APA, Harvard, Vancouver, ISO, and other styles
34

Sturrock, Sara Katheleen. "Electroencephalographic correlates of cerebral engagement in auditory and visual language processing tasks in persons with down syndrome." Thesis, 2002. http://hdl.handle.net/2429/13260.

Full text
Abstract:
There has been extensive research on language processing in persons with Down syndrome (DS). The results of initial dichotic listening studies and movement control studies led Elliott, Weeks, and Elliott (1987) to propose a model of functional dissociation between the brain centers sub-serving language perception and those sub-serving language production in persons with DS. Based on this model, many new avenues of research emerged. The predictions of the model that have been tested thus far have focused on behavioural actions of persons with DS as compared to non-DS populations, and the results of these subsequent experiments have supported the dissociation model (Elliott, Weeks & Elliott, 1987). The present experiments were intended to further investigate language laterality in persons with DS. EEG was used as a means of investigating the purported atypical cerebral lateralization for language perception in persons with DS. The results lend some support to the dissociation model of DS and suggest that the model may be broadened to encompass language processing in general, not auditory language processing specifically.
APA, Harvard, Vancouver, ISO, and other styles
35

Henriques, Daniel Filipe Rodrigues. "Automatic Completion of Text-based Tasks." Master's thesis, 2019. http://hdl.handle.net/10362/92296.

Full text
Abstract:
Crowdsourcing is a widespread problem-solving model which consists in assigning tasks to an existing pool of workers in order to solve a problem, and is a scalable alternative to hiring a group of experts for labeling high volumes of data. It can provide results of similar quality, with the advantage of achieving such standards in a faster and more efficient manner. Modern approaches to crowdsourcing use Machine Learning models to label the data and request the crowd to validate the results. Such approaches can only be applied if the data on which the model was trained (source data) and the data that needs labeling (target data) share some relation. Furthermore, since the model is not adapted to the target data, its predictions may contain a substantial amount of errors, and validating these predictions can be very time-consuming. In this thesis, we propose an approach that leverages in-domain data, a labeled portion of the target data, to adapt the model. The remainder of the data is labeled based on the model's predictions, and the crowd is tasked with generating the in-domain data and validating the model's predictions. Under this approach, we train the model with only in-domain data, and with both in-domain data and data from an outer domain. We apply these learning settings with the intent of optimizing a crowdsourcing pipeline for Natural Language Processing, more concretely for the task of Named Entity Recognition (NER). This optimization relates to the effort required by the crowd to perform the NER task. The results of the experiments show that the usage of in-domain data achieves effort savings ranging from 6% to 53%. Furthermore, we observe such savings in nine distinct datasets, which demonstrates the robustness and breadth of application of this approach. In conclusion, the in-domain data approach is capable of optimizing a crowdsourcing pipeline for NER, and has a broader range of use cases than reusing a model to generate predictions in the target data.
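The two learning settings in this abstract can be sketched minimally as below, assuming scikit-learn and using a per-token classifier as a stand-in for a full NER model. The datasets, features, and effort metric are hypothetical illustrations, not the thesis's pipeline.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def featurize(tokens):
    # Toy per-token features standing in for real NER features
    return [{"word": w.lower(), "is_cap": w[0].isupper()} for w in tokens]

# Hypothetical data: out-of-domain source vs labeled in-domain target portion
out_domain = (["Paris", "is", "nice"], ["LOC", "O", "O"])
in_domain = (["Acme", "hired", "Bob"], ["ORG", "O", "PER"])
target_rest = (["Acme", "fired", "Ann"], ["ORG", "O", "PER"])  # gold, for eval

def train(datasets):
    X, y = [], []
    for tokens, labels in datasets:
        X += featurize(tokens)
        y += labels
    return make_pipeline(DictVectorizer(),
                         LogisticRegression(max_iter=1000)).fit(X, y)

for name, data in [("in-domain only", [in_domain]),
                   ("in + out domain", [in_domain, out_domain])]:
    model = train(data)
    pred = model.predict(featurize(target_rest[0]))
    # Crowd effort ~ number of predictions the crowd must correct
    effort = sum(p != g for p, g in zip(pred, target_rest[1]))
    print(name, "- corrections needed:", effort)
```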
APA, Harvard, Vancouver, ISO, and other styles
36

Nouri, Golmaei Sara. "Improving the Performance of Clinical Prediction Tasks by using Structured and Unstructured Data combined with a Patient Network." Thesis, 2021. http://dx.doi.org/10.7912/C2/41.

Full text
Abstract:
Indiana University-Purdue University Indianapolis (IUPUI)
With the increasing availability of Electronic Health Records (EHRs) and advances in deep learning techniques, developing deep predictive models that use EHR data to solve healthcare problems has gained momentum in recent years. The majority of clinical predictive models benefit from structured data in EHRs (e.g., lab measurements and medications). Still, learning clinical outcomes from all possible information sources is one of the main challenges when building predictive models. This work focuses mainly on two sources of information that have been underused by researchers: unstructured data (e.g., clinical notes) and a patient network. We propose a novel hybrid deep learning model, DeepNote-GNN, that integrates clinical notes and the topological structure of a patient network to improve 30-day hospital readmission prediction. DeepNote-GNN is a robust deep learning framework consisting of two modules: DeepNote and the patient network. DeepNote extracts deep representations of clinical notes using a feature aggregation unit on top of a state-of-the-art Natural Language Processing (NLP) technique, BERT. By exploiting these deep representations, a patient network is built, and a Graph Neural Network (GNN) is trained on the network for hospital readmission prediction. Performance evaluation on the MIMIC-III dataset demonstrates that DeepNote-GNN achieves superior results compared to the state-of-the-art baselines on the 30-day hospital readmission task. We extensively analyze the DeepNote-GNN model to illustrate the effectiveness and contribution of each of its components. The analysis shows that the patient network contributes significantly to the overall performance, and that DeepNote-GNN is robust and consistently performs well on the 30-day readmission prediction task. To evaluate how the DeepNote and patient network modules generalize to new prediction tasks, we create a multimodal model and train it on structured and unstructured data from the MIMIC-III dataset to predict patient mortality and Length of Stay (LOS). Our proposed multimodal model consists of four components: DeepNote, the patient network, DeepTemporal, and score aggregation. While DeepNote keeps its functionality and extracts representations of clinical notes, we build a DeepTemporal module using a fully connected layer stacked on top of a one-layer Gated Recurrent Unit (GRU) to extract deep representations of temporal signals. Independently of DeepTemporal, we extract feature vectors of temporal signals and use them to build a patient network. Finally, the DeepNote, DeepTemporal, and patient network scores are linearly aggregated to fit the multimodal model on downstream prediction tasks. Our results are very competitive with the baseline model. The multimodal model analysis reveals that unstructured text data help prediction more than temporal signals, and that there is no obstacle to applying a patient network to structured data. In comparison to the other modules, the patient network makes a more significant contribution to the prediction tasks. We believe that our efforts in this work have opened up a new study area that can be used to enhance the performance of clinical predictive models.
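A minimal sketch of the note-embeddings-plus-patient-network idea in this abstract follows, assuming PyTorch. Random vectors stand in for aggregated BERT note representations, the similarity threshold and all sizes are hypothetical, and the hand-rolled graph convolution is a toy substitute for the actual GNN trained on MIMIC-III.

```python
import torch
import torch.nn.functional as F

n_patients, note_dim = 8, 32
# Stand-in for aggregated BERT embeddings of each patient's clinical notes
notes = torch.randn(n_patients, note_dim)

# Build the patient network: edge if cosine similarity exceeds a threshold
sim = F.cosine_similarity(notes.unsqueeze(1), notes.unsqueeze(0), dim=-1)
adj = (sim > 0.1).float()
adj = adj / adj.sum(dim=1, keepdim=True)           # row-normalize

class TinyGCN(torch.nn.Module):
    def __init__(self, in_dim, hidden, classes=2):
        super().__init__()
        self.w1 = torch.nn.Linear(in_dim, hidden)
        self.w2 = torch.nn.Linear(hidden, classes)
    def forward(self, x, a):
        x = F.relu(self.w1(a @ x))                  # neighborhood aggregation
        return self.w2(a @ x)                       # readmission logits

model = TinyGCN(note_dim, 16)
labels = torch.randint(0, 2, (n_patients,))         # toy readmission labels
opt = torch.optim.Adam(model.parameters(), lr=0.01)
for _ in range(50):
    opt.zero_grad()
    loss = F.cross_entropy(model(notes, adj), labels)
    loss.backward()
    opt.step()
print("train loss:", float(loss))
```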
APA, Harvard, Vancouver, ISO, and other styles
37

LIN, JI-WEI, and 林季緯. "Multi-label Classification with Multi-task learning - A case study of Natural Language Processing." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/9a6q4q.

Full text
Abstract:
Master's thesis
National Chiao Tung University
Department of Industrial Engineering and Management
Academic year 108 (2019)
Natural language processing (NLP) has long been an important research topic. Many NLP problems, such as spam filtering, text classification, movie rating, and toxic comment detection, are present in our daily life. In recent years, with the development of deep learning and the improvement of computing power, many studies have applied deep learning to NLP problems. Deep learning models use large amounts of training data to automatically extract important and discriminative features, and achieve promising results; deep learning has therefore been widely used in speech recognition, object detection, and image recognition. The multi-label classification problem refers to settings in which each data sample may carry multiple labels. In toxic comment classification, for example, one comment may contain both identity hate and threats, while another may contain threats, obscenity, and toxicity. This thesis proposes to use multi-task learning to deal with multi-label classification problems, treating each label as a separate task; the performance on each task can then be improved by sharing representations between tasks. We conduct experiments on three different NLP classification datasets using our proposed multi-task learning architecture. Since NLP problems are sequence problems, and recurrent neural network (RNN) architectures are well suited to processing sequences, we construct our model with a bidirectional GRU, followed by a CNN for further feature learning. The experimental results on the three datasets indicate that our proposed method outperforms the alternatives.
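The architecture described in this abstract can be sketched minimally as below, assuming PyTorch: a bidirectional GRU over token embeddings, a CNN for further feature learning, and one sigmoid output per label (one task per label). All sizes are hypothetical; the thesis's exact hyperparameters are not reproduced.

```python
import torch
import torch.nn as nn

class BiGRUCNN(nn.Module):
    def __init__(self, vocab=5000, emb=64, hidden=64, n_labels=6):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.gru = nn.GRU(emb, hidden, batch_first=True, bidirectional=True)
        self.conv = nn.Conv1d(2 * hidden, 64, kernel_size=3, padding=1)
        self.out = nn.Linear(64, n_labels)    # one logit per label/task

    def forward(self, tokens):                # tokens: (batch, seq_len)
        x, _ = self.gru(self.embed(tokens))   # (batch, seq, 2*hidden)
        x = torch.relu(self.conv(x.transpose(1, 2)))  # (batch, 64, seq)
        x = x.max(dim=2).values               # global max-pool over time
        return self.out(x)                    # raw logits

model = BiGRUCNN()
batch = torch.randint(0, 5000, (4, 20))       # toy batch of token ids
targets = torch.randint(0, 2, (4, 6)).float() # multi-hot label vectors
loss = nn.BCEWithLogitsLoss()(model(batch), targets)
print("loss:", float(loss))
```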
APA, Harvard, Vancouver, ISO, and other styles
38

"An in-depth look of phonological processing in Chinese character recognition: the effects of task." 1997. http://library.cuhk.edu.hk/record=b5889289.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

"Domain-Agnostic Context-Aware Assistant Framework for Task-Based Environment." Master's thesis, 2020. http://hdl.handle.net/2286/R.I.57245.

Full text
Abstract:
Smart home assistants are becoming a norm due to their ease-of-use. They employ spoken language as an interface, facilitating easy interaction with their users. Even with their obvious advantages, natural-language-based interfaces are not prevalent outside the domain of home assistants. It is hard to adopt them for computer-controlled systems due to the numerous complexities involved with their implementation in varying fields. The main challenge is the grounding of natural language base terms into the underlying system's primitives. The existing systems that do use natural language interfaces are specific to one problem domain only. In this thesis, a domain-agnostic framework that creates natural language interfaces for computer-controlled systems has been developed by making the mapping between the language constructs and the system primitives customizable. The framework employs ontologies built using OWL (Web Ontology Language) for knowledge representation purposes and machine learning models for language processing tasks. It has been evaluated within a simulation environment consisting of objects and a robot. This environment has been deployed as a web application, providing anonymous user testing for evaluation, and generating training data for machine learning components. Performance evaluation has been done on metrics such as time taken for a task or the number of instructions given by the user to the robot to accomplish a task. Additionally, the framework has been used to create a natural language interface for a database system to demonstrate its domain independence.
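The customizable grounding idea in this abstract can be sketched as below. The domain, verbs, and primitives are all hypothetical, and a plain dictionary stands in for the OWL ontologies and learned models the thesis actually uses; the point is only that swapping the mapping retargets the same interface to a new system.

```python
from typing import Callable, Dict

class NLInterface:
    def __init__(self, grounding: Dict[str, Callable[[str], str]]):
        self.grounding = grounding        # language term -> system primitive

    def execute(self, command: str) -> str:
        verb, _, arg = command.partition(" ")
        action = self.grounding.get(verb.lower())
        if action is None:
            return f"unknown action: {verb}"
        return action(arg)

# Robot domain: swap this dict to retarget the interface to another system
robot_grounding = {
    "grab": lambda obj: f"robot.gripper.close_on({obj!r})",
    "goto": lambda loc: f"robot.base.navigate_to({loc!r})",
}
ui = NLInterface(robot_grounding)
print(ui.execute("grab red block"))   # -> robot.gripper.close_on('red block')
print(ui.execute("goto kitchen"))
```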
Dissertation/Thesis
Masters Thesis Software Engineering 2020
APA, Harvard, Vancouver, ISO, and other styles
40

Werfelmann, Robert. "A Study of Recurrent and Convolutional Neural Networks in the Native Language Identification Task." Thesis, 2018. http://hdl.handle.net/10754/627954.

Full text
Abstract:
Native Language Identification (NLI) is the task of predicting the native language of an author from their text written in a second language. The idea is to find writing habits that transfer from an author's native language to their second language. Many approaches to this task have been studied, from simple word frequency analysis to analyzing grammatical and spelling mistakes in order to find patterns and traits that are common to different authors with the same native language. This can be a very complex task, depending on the native language and the author's proficiency in the second language. The most common approach, which has seen very good results, is based on n-gram features of words and characters. In this thesis, we attempt to extract lexical, grammatical, and semantic features from the sentences of non-native English essays using neural networks. The training and testing data were obtained from a large corpus of publicly available essays written by authors from several countries around the world. The neural network models consisted of Long Short-Term Memory and Convolutional networks taking the sentences of each document as input. Additional statistical features were generated from the text to complement the predictions of the neural networks, and both were then used as feature inputs to a Support Vector Machine, which makes the final prediction. Results show that a Long Short-Term Memory neural network can improve performance over a naive bag-of-words approach, but with a much smaller feature set. With more fine-tuning of the neural network hyperparameters, these results will likely improve significantly.
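The final-stage ensemble in this abstract can be sketched minimally as below, assuming scikit-learn: neural-network scores are concatenated with hand-crafted statistical features and fed to an SVM that makes the final native-language prediction. Random vectors stand in for the LSTM/CNN outputs and essay statistics.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_essays, n_classes = 200, 5
# Stand-in for per-class scores produced by the LSTM/CNN models
nn_scores = rng.random((n_essays, n_classes))
# Stand-in for statistical features (e.g. avg sentence length, error counts)
stats = rng.random((n_essays, 3))
X = np.hstack([nn_scores, stats])           # combined feature vector
y = rng.integers(0, n_classes, n_essays)    # native-language labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
svm = SVC(kernel="rbf").fit(X_tr, y_tr)
print("held-out accuracy:", svm.score(X_te, y_te))
```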
APA, Harvard, Vancouver, ISO, and other styles
41

Garay, Lucas Gonzalo. "Procesamiento de imágenes médicas para generación automática de reportes." Bachelor's thesis, 2019. http://hdl.handle.net/11086/13418.

Full text
Abstract:
Thesis (Licentiate in Computer Science)--Universidad Nacional de Córdoba, Facultad de Matemática, Astronomía, Física y Computación, 2019.
In this work we address the problem of automatically generating medical reports from images. Writing the reports that interpret medical images consumes a large share of specialists' time and is, in many cases, a very repetitive task. In this context, an automatically generated draft can reduce the physician's workload: instead of writing the whole report, the physician reviews and modifies the generated text. The objective of this thesis is to consolidate a neural-network-based implementation of image captioning. To that end, an architecture designed for generic image captioning is applied to the medical domain. Finally, we compare our generic approach with domain-specific implementations, both quantitatively and qualitatively. The main difficulty is the scarcity of available data: although large volumes of data are generated, they are not always available for use. To address this problem, we apply techniques such as subsampling and suprasampling. Another problem concerns the standard evaluation metric, BLEU, which does not measure the similarity between two texts the way we would expect; to remedy this, we propose using cosine similarity. Finally, we report the impact of domain-specific word embeddings and of the attention mechanism.
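The evaluation idea in this abstract can be sketched minimally as below, assuming scikit-learn: score a generated report against a reference with cosine similarity over TF-IDF vectors, instead of (or alongside) BLEU. The report texts are hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

reference = "no acute cardiopulmonary abnormality, heart size normal"
generated = "heart size is normal with no acute abnormality"

vec = TfidfVectorizer().fit([reference, generated])
M = vec.transform([reference, generated])
print("cosine similarity:", float(cosine_similarity(M[0], M[1])[0, 0]))
```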
APA, Harvard, Vancouver, ISO, and other styles
42

Jaime, Rodrigo. "Aprendizaje activo para la extracción de relaciones en textos." Bachelor's thesis, 2019. http://hdl.handle.net/11086/13415.

Full text
Abstract:
Thesis (Licentiate in Computer Science)--Universidad Nacional de Córdoba, Facultad de Matemática, Astronomía, Física y Computación, 2019.
When working on a machine learning task, we may have an abundant source of unlabeled data whose manual labeling is a costly process. A useful technique in these cases is to allow the learning algorithm to query an oracle about the labels of certain data points that the algorithm selects as important. This kind of iterative semi-supervised learning is called active learning. Using IEPY, an information extraction framework oriented towards relation extraction, we build a relation extraction system with active learning. We also perform a series of experiments on instance selection and feature labeling over a corpus, based on the work of Settles (2011), with the goal of accelerating the model's initial performance.
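Pool-based active learning with uncertainty sampling, the kind of query strategy discussed in Settles (2011), can be sketched minimally as below, assuming scikit-learn. The data and oracle are synthetic stand-ins; IEPY's actual pipeline is not reproduced here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, random_state=0)
# Seed with a few labeled instances from each class
labeled = list(np.where(y == 0)[0][:5]) + list(np.where(y == 1)[0][:5])
pool = [i for i in range(len(y)) if i not in labeled]

model = LogisticRegression(max_iter=1000)
for _ in range(20):                        # 20 query rounds
    model.fit(X[labeled], y[labeled])
    proba = model.predict_proba(X[pool])
    # Uncertainty sampling: query the instance closest to the decision boundary
    query = pool[int(np.argmin(np.abs(proba[:, 1] - 0.5)))]
    labeled.append(query)                  # oracle reveals y[query]
    pool.remove(query)

print("accuracy after querying:", model.score(X, y))
```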
APA, Harvard, Vancouver, ISO, and other styles
43

Atapattu, Mudiyanselage Thushari Dilhani Atapattu. "A computational model for task-adapted knowledge organisation: improving learning through concept maps extracted from lecture slides." Thesis, 2015. http://hdl.handle.net/2440/95127.

Full text
Abstract:
This thesis presents a framework for automatically generating concept maps from lecture slides. A concept map is recognised as a valuable educational visualisation tool, which assists students in organising, sharing and representing knowledge. Expert maps (also known as expert concept maps) are prepared by domain experts with the intention of serving as scaffolding to facilitate learning. Automated concept map generation provides an alternative to the labour-intensive and time-consuming process of manually constructing expert maps. Therefore, the main objective of this thesis is to develop techniques to extract maps from lecture slides, ensuring that auto-generated concept maps may be utilised as a positive alternative to expert maps. This process is known as concept map mining (CMM). The particular interest of this thesis is CMM from lecture slides, due to their wide usage in teaching and the poor support that sequentially structured slides give learners in identifying relationships between pieces of information. In general, the semantically and syntactically incomplete and ambiguous text of lecture slides makes previously developed CMM algorithms unsuitable. Within this thesis, a set of Natural Language Processing (NLP) algorithms is developed to extract concept-relation-concept triples that form concept maps. To support knowledge extraction and to overcome the noise associated with the text, this work utilises contextual features specific to lecture slides. The natural layout of the lecture slides is used to organise the extracted triples in a hierarchy. Structural features (e.g. co-occurrence, term frequency) and graph-based features (e.g. degree centrality) are utilised to rank the triples according to their importance within the domain. A series of evaluation studies in this thesis identifies promising results, with several case studies demonstrating a strong positive correlation between auto-generated and human-generated concept maps. These results indicate that this research provides an effective and efficient alternative to expert maps. Auto-generated concept maps can be utilised to provide scaffolding in problem-solving contexts, in particular supporting students who lack the required skills. Even though this application has been studied previously, prior studies do not specifically focus on the relevance of information to learning. To fill this gap, this thesis investigates an approach to providing concept maps that are more relevant to a given problem. In pursuit of this goal, a framework capable of automatically extracting concept maps tailored to given problems (named task-adapted concept maps) is developed, utilising auto-generated concept maps from lecture slides as domain knowledge. In order to investigate the effect of task-adapted concept maps as scaffolding for learning, an evaluation study was undertaken, in which students in the task-adapted concept map scaffolding group demonstrated statistically significant learning gains compared to students who received lecture slides or full concept maps as scaffolding.
Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 2015
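The triple-ranking idea in this abstract, scoring concepts with degree centrality, can be sketched minimally as below, assuming networkx. The triples are hypothetical examples, not output of the thesis's actual NLP extraction pipeline.

```python
import networkx as nx

triples = [
    ("neural network", "is a", "machine learning model"),
    ("neural network", "has", "layers"),
    ("backpropagation", "trains", "neural network"),
    ("layers", "contain", "neurons"),
]

g = nx.DiGraph()
for head, relation, tail in triples:
    g.add_edge(head, tail, label=relation)

centrality = nx.degree_centrality(g)
# Rank triples by the centrality of the concepts they connect
ranked = sorted(triples,
                key=lambda t: centrality[t[0]] + centrality[t[2]],
                reverse=True)
for head, rel, tail in ranked:
    print(f"{head} --{rel}--> {tail}")
```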
APA, Harvard, Vancouver, ISO, and other styles
44

Sankar, Chinnadhurai. "Neural approaches to dialog modeling." Thesis, 2020. http://hdl.handle.net/1866/24802.

Full text
Abstract:
This thesis by articles consists of four articles which contribute to the field of deep learning, specifically to understanding and learning neural approaches to dialog systems. The first article takes a step towards understanding whether commonly used neural dialog architectures effectively capture the information present in the conversation history. Through a series of perturbation experiments on popular dialog datasets, we find that commonly used neural dialog architectures like recurrent and transformer-based seq2seq models are rarely sensitive to most input context perturbations, such as missing or reordered utterances, shuffled words, etc. The second article introduces a simple and cost-effective way to collect large-scale datasets for modeling task-oriented dialog systems. This approach avoids the requirement of a complex argument annotation schema. The initial release of the dataset includes 13,215 task-based dialogs comprising six domains and around 8k unique named entities, almost 8 times more than the popular MultiWOZ dataset. The third article proposes to improve response generation quality in open-domain dialog systems by jointly modeling the utterances with the dialog attributes of each utterance. Dialog attributes of an utterance refer to discrete features or aspects associated with the utterance, such as dialog acts, sentiment, emotion, speaker identity, speaker personality, etc. The final article introduces an embedding-free method to compute word representations on the fly. This approach significantly reduces the memory footprint, which facilitates deployment on memory-constrained devices. Apart from being independent of the vocabulary size, we find this approach to be inherently resilient to common misspellings.
APA, Harvard, Vancouver, ISO, and other styles