Theses on the topic « LL. Automated language processing »
Browse the top 40 theses for your research on the topic « LL. Automated language processing ».
Allott, Nicholas Mark. « A natural language processing framework for automated assessment ». Thesis, Nottingham Trent University, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.314333.
Onyenwe, Ikechukwu Ekene. « Developing methods and resources for automated processing of the African language Igbo ». Thesis, University of Sheffield, 2017. http://etheses.whiterose.ac.uk/17043/.
Leonhard, Annette Christa. « Automated question answering for clinical comparison questions ». Thesis, University of Edinburgh, 2012. http://hdl.handle.net/1842/6266.
Xozwa, Thandolwethu. « Automated statistical audit system for a government regulatory authority ». Thesis, Nelson Mandela Metropolitan University, 2015. http://hdl.handle.net/10948/6061.
Sommers, Alexander Mitchell. « EXPLORING PSEUDO-TOPIC-MODELING FOR CREATING AUTOMATED DISTANT-ANNOTATION SYSTEMS ». OpenSIUC, 2021. https://opensiuc.lib.siu.edu/theses/2862.
Wang, Wei. « Automated spatiotemporal and semantic information extraction for hazards ». Diss., University of Iowa, 2014. https://ir.uiowa.edu/etd/1415.
Teske, Alexander. « Automated Risk Management Framework with Application to Big Maritime Data ». Thesis, Université d'Ottawa / University of Ottawa, 2018. http://hdl.handle.net/10393/38567.
Salov, Aleksandar. « Towards automated learning from software development issues : Analyzing open source project repositories using natural language processing and machine learning techniques ». Thesis, Linnéuniversitetet, Institutionen för medieteknik (ME), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-66834.
Sunil, Kamalakar FNU. « Automatically Generating Tests from Natural Language Descriptions of Software Behavior ». Thesis, Virginia Tech, 2013. http://hdl.handle.net/10919/23907.
Master of Science
Mao, Jin, Lisa R. Moore, Carrine E. Blank, Elvis Hsin-Hui Wu, Marcia Ackerman, Sonali Ranade and Hong Cui. « Microbial phenomics information extractor (MicroPIE) : a natural language processing tool for the automated acquisition of prokaryotic phenotypic characters from text sources ». BIOMED CENTRAL LTD, 2016. http://hdl.handle.net/10150/622562.
Munnecom, Lorenna, and Miguel Chaves de Lemos Pacheco. « Exploration of an Automated Motivation Letter Scoring System to Emulate Human Judgement ». Thesis, Högskolan Dalarna, Mikrodataanalys, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:du-34563.
Cunningham-Nelson, Samuel Kayne. « Enhancing student conceptual understanding and learning experience through automated textual analysis ». Thesis, Queensland University of Technology, 2019. https://eprints.qut.edu.au/134145/1/Samuel_Cunningham-Nelson_Thesis.pdf.
Paterson, Kimberly Laurel Ms. « TSPOONS : Tracking Salience Profiles Of Online News Stories ». DigitalCommons@CalPoly, 2014. https://digitalcommons.calpoly.edu/theses/1222.
Svensson, Pontus. « Automated Image Suggestions for News Articles : An Evaluation of Text and Image Representations in an Image Retrieval System ». Thesis, Linköpings universitet, Interaktiva och kognitiva system, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-166669.
Lepage, Yves. « Un système de grammaires correspondancielles d'identification ». Grenoble 1, 1989. http://www.theses.fr/1989GRE10059.
Xia, Menglin. « Text readability and summarisation for non-native reading comprehension ». Thesis, University of Cambridge, 2019. https://www.repository.cam.ac.uk/handle/1810/288740.
Fancellu, Federico. « Computational models for multilingual negation scope detection ». Thesis, University of Edinburgh, 2018. http://hdl.handle.net/1842/33038.
Lermuzeaux, Jean-Marc. « Contribution à l'intégration des niveaux de traitement automatique de la langue écrite : ANAEL : un environnement de compréhension basé sur les objets, les actions et les grammaires d'événements ». Caen, 1988. http://www.theses.fr/1988CAEN2029.
Dyremark, Johanna, and Caroline Mayer. « Bedömning av elevuppsatser genom maskininlärning ». Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-262041.
Today, essay scoring makes up a large part of a teacher's workload, and grading varies considerably between teachers. This report examines what accuracy can be achieved with an automated essay scoring system for Swedish. Three machine learning models for classification are trained and tested with 5-fold cross-validation on essays from Swedish national tests: Linear Discriminant Analysis, K-Nearest Neighbour and Random Forest. Essays are classified based on 31 attributes related to language structure, such as token-based length measures, similarity to texts with different levels of formality, and use of grammar. The results show a maximal quadratic weighted kappa value of 0.4829 and a grade identical to the expert's assessment in 57.53% of all tests. These results were achieved by a model based on Linear Discriminant Analysis, which showed higher inter-rater reliability with expert grading than a local teacher did. Despite ongoing digitalization within the Swedish educational system, a number of obstacles prevent complete automation of essay scoring, such as users' attitudes, ethical issues and the difficulty current techniques have in understanding semantics. Nevertheless, a partial integration of automatic essay scoring has the potential to effectively identify essays suitable for double grading, which can increase the consistency of large-scale tests at a low cost.
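The quadratic weighted kappa reported in the abstract above measures agreement between two graders while penalizing large disagreements more heavily than small ones. The following is a minimal sketch of the standard definition, not the authors' code; the function name, the integer encoding of grades as 0..num_grades-1, and the sample data are assumptions made for illustration.

```python
from collections import Counter


def quadratic_weighted_kappa(rater_a, rater_b, num_grades):
    """Agreement between two graders on grades encoded as 0..num_grades-1.

    Returns 1.0 for perfect agreement and about 0.0 for chance-level agreement.
    The grade encoding is an assumption of this sketch.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must grade the same non-empty set of essays")
    if num_grades < 2:
        raise ValueError("at least two grade levels are required")

    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts O[i][j]
    hist_a = Counter(rater_a)                  # marginal counts for rater A
    hist_b = Counter(rater_b)                  # marginal counts for rater B

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_grades):
        for j in range(num_grades):
            # Quadratic penalty: grows with the squared distance between grades.
            weight = (i - j) ** 2 / (num_grades - 1) ** 2
            weighted_observed += weight * observed[(i, j)]
            # Expected count if the two raters graded independently.
            weighted_expected += weight * hist_a[i] * hist_b[j] / n

    if weighted_expected == 0:
        # Both raters used one identical grade for every essay.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    # Hypothetical grades from an expert and a model on six essays.
    expert = [0, 1, 2, 3, 2, 1]
    model = [0, 1, 2, 2, 2, 0]
    print(quadratic_weighted_kappa(expert, model, num_grades=4))
```

With scikit-learn available, `sklearn.metrics.cohen_kappa_score(expert, model, weights="quadratic")` computes the same statistic over the labels that actually occur in the data.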
Marshall, Susan LaVonne. « Concept of Operations (CONOPS) for foreign language and speech translation technologies in a coalition military environment ». Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 2005. http://library.nps.navy.mil/uhtbin/hyperion/05Mar%5FMarshall.pdf.
Silveira, Gabriela. « Narrativas produzidas por indivíduos afásicos e indivíduos cognitivamente sadios : análise computadorizada de macro e micro estrutura ». Universidade de São Paulo, 2018. http://www.teses.usp.br/teses/disponiveis/5/5170/tde-01112018-101055/.
INTRODUCTION: Analysis of aphasic discourse provides important information about the phonological, morphological, syntactic, semantic and pragmatic aspects of the language of patients who have suffered a stroke. Discourse evaluation, along with other methods, can contribute to observing the evolution of language and communication in aphasic patients; however, manual analysis is laborious and can lead to errors. OBJECTIVES: (1) to analyze, using computerized technologies, macro- and microstructural aspects of the discourse of cognitively healthy individuals and of individuals with Broca's and anomic aphasia; (2) to explore discourse as an indicator of the evolution of aphasia; (3) to analyze the contribution of single photon emission computed tomography (SPECT) to verifying the correlation between behavioral and neuroimaging evolution data. METHOD: Two groups of patients were studied: GA1, consisting of eight individuals with Broca's aphasia and anomic aphasia, who were analyzed longitudinally from the sub-acute phase of the lesion and after three and six months; and GA2, composed of 15 individuals with Broca's and anomic aphasia at varying times since stroke onset. A control group, GC, consisted of 30 cognitively healthy participants. Computerized technologies were explored for the analysis of metrics related to the micro- and macrostructure of discourses elicited from the Cinderella story and the Cookie Theft picture. RESULTS: Comparing GC and GA2 on discourse macrostructure, the GA2 aphasics differed significantly from the GC in the total number of propositions produced; on microstructure, seven metrics differentiated the two groups. There was a significant difference in macro- and microstructure between the discourses of subjects with Broca's aphasia and those with anomic aphasia. Differences in macro- and microstructure measurements were observed in GA1 as time since injury advanced.
In GA1, the comparison between parameters in the sub-acute phase and six months after the stroke revealed differences in macrostructure: an increase in the number of propositions in the orientation block and in total propositions. Regarding microstructure, the initial measures of syllables per content word, incidence of nouns and incidence of content words differed after six months of intervention. The variable "incidence of words missing from the dictionary" showed a significantly lower value three months after the stroke. The Cinderella story provided more complete microstructure data than the Cookie Theft picture. SPECT showed no change over time and did not reflect the evolution of the aphasia. CONCLUSION: The discourse produced from the Cinderella story and the Cookie Theft picture generated material for macrostructure and microstructure analysis of cognitively healthy and aphasic individuals, made it possible to quantify and qualify the evolution of language in different phases of stroke recovery, and distinguished the behavior of healthy individuals from that of individuals with Broca's and anomic aphasia in macro- and microstructural aspects. The computerized tools facilitated analysis of the microstructure data but were not applicable to the macrostructure, showing that the tools need adjustment for the discourse analysis of patients. SPECT data did not reflect the behavioral improvement in the language of aphasic subjects.
Toledo, Cíntia Matsuda. « Análise de aspectos micro e macrolinguísticos da narrativa de indivíduos com doença de Alzheimer, comprometimento cognitivo leve e sem comprometimentos cognitivos ». Universidade de São Paulo, 2017. http://www.teses.usp.br/teses/disponiveis/5/5170/tde-11092017-133850/.
INTRODUCTION: Population aging is a well-known social trend in developed countries and is increasingly pronounced in developing countries. Dementia is considered one of the main health problems arising from the rapid growth of the elderly population, and language disorders are considered important in this context. Discourse is important for identifying linguistic disorders in dementia as well as for following up these patients. Characterizing discourse differences can help with differential diagnosis, contribute to the creation of future tools for clinical intervention, and help prevent the evolution and/or progression of dementia. Transcription and discourse analysis are laborious, so computational methods were used to help identify and extract linguistic characteristics. OBJECTIVE: The objective of this study was to identify changes in micro- and macrolinguistic aspects that differentiate individuals with Alzheimer's disease, individuals with mild cognitive impairment and healthy elderly individuals during narration of a sequence of pictures, and to explore the computational tool Coh-Metrix-Dementia for analyzing the subjects' discourse. METHODS: 60 subjects were evaluated, 20 in each research group (mild Alzheimer's disease, GDA; amnestic mild cognitive impairment, GCCLa; and control, GC). The subjects were asked to construct a narrative based on a sequence of pictures of the Cinderella story. The following linguistic-cognitive tests were also applied: Verbal Fluency, Boston Naming Test, and Camel and Cactus test. Coh-Metrix-Dementia was used for automatic metric extraction. RESULTS: The values extracted by Coh-Metrix-Dementia were treated statistically, and it was possible to obtain metrics capable of distinguishing the groups studied.
Regarding microlinguistic aspects, the GDA showed reduced syntactic abilities, greater difficulty in word retrieval, and discourse with less cohesion and local coherence. At the macrolinguistic level, the GDA produced the least informative discourse, with greater loss of global coherence and the greatest number of modalizations. The GDA also showed greater impairment of narrative structure. No discourse metric produced by the tool in this study could discriminate between GCCLa and GC. CONCLUSION: The GDA subjects produced discourse with greater macro- and microstructural impairment. The computational tool proved to be an important ally for discourse analysis.
Murakami, Tiago R. M. « Tesauros e a World Wide Web ». Thesis, 2005. http://eprints.rclis.org/9863/1/murakami-tesauros.pdf.
Vidal-Santos, Gerard. « Avaluació de processos de reconeixement d’entitats (NER) com a complement a interfícies de recuperació d’informació en dipòsits digitals complexos ». Thesis, 2018. http://eprints.rclis.org/33589/1/VidalSantos_TFG_2018.pdf.
Vidal-Santos, Gerard. « Avaluació de processos de reconeixement d’entitats (NER) com a complement a interfícies de recuperació d’informació en dipòsits digitals complexos ». Thesis, 2018. http://eprints.rclis.org/33692/1/VidalSantos_TFG_2018.pdf.
Wille, Jens. « Automatisches Klassifizieren bibliographischer Beschreibungsdaten : Vorgehensweise und Ergebnisse ». Thesis, 2006. http://eprints.rclis.org/7790/1/wille_-_automatisches_klassifizieren_bibliographischer_beschreibungsdaten_%28diplomarbeit%29.pdf.
Gómez-Díaz, Raquel. « Estudio de la incidencia del conocimiento lingüístico en los sistemas de recuperación de la información para el español ». Thesis, 2001. http://eprints.rclis.org/15670/1/DBD_G%C3%B3mezD%C3%ADazR_Estudiodelaincidencia.pdf.
Çapkın, Çağdaş. « Türkçe metin tabanlı açık arşivlerde kullanılan dizinleme yönteminin değerlendirilmesi / Evaluation of indexing method used in Turkish text-based open archives ». Thesis, 2011. http://eprints.rclis.org/28804/1/Cagdas_CAPKIN_Yuksek_Lisans_tezi.pdf.
Oberhauser, Otto. « Automatisches Klassifizieren : Verfahren zur Erschliessung elektronischer Dokumente ». Thesis, 2004. http://eprints.rclis.org/8526/1/OCO_MLIS_Thesis.pdf.
Bejarano-Ballen, Juan S. « Análisis de los altos cargos de la Generalitat Valenciana ». Thesis, 2017. http://eprints.rclis.org/31994/1/TFM_Juan_Sebastian_Bejarano.pdf.
Schmidt, Nora. « Semantisches Publizieren im interdisziplinären Wissenschaftsnetzwerk. Theoretische Grundlagen und Anforderungen ». Thesis, 2014. http://eprints.rclis.org/24215/1/schmidt_semantic-publishing_e-lis.html.
« An automated Chinese text processing system (ACCESS) : user-friendly interface and feature enhancement ». Chinese University of Hong Kong, 1994. http://library.cuhk.edu.hk/record=b5888227.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1994.
Includes bibliographical references (leaves 65-67).
Introduction
Chapter 1. ACCESS with an Extendible User-friendly X/Chinese Interface
1.1. System requirement
1.1.1. User interface issue
1.1.2. Development issue
1.2. Development decision
1.2.1. X window system
1.2.2. X/Chinese toolkit
1.2.3. C language
1.2.4. Source code control system
1.3. System architecture
1.4. User interface
1.5. Sample screen
1.6. System extension
1.7. System portability
Chapter 2. Study on Algorithms for Automatically Correcting Characters in Chinese Cangjie-typed Text
2.1. Chinese character input
2.1.1. Chinese keyboards
2.1.2. Keyboard redefinition scheme
2.2. Cangjie input method
2.3. Review on existing techniques for automatically correcting words in English text
2.3.1. Nonword error detection
2.3.2. Isolated-word error correction
2.3.2.1. Spelling error patterns
2.3.2.2. Correction techniques
2.3.3. Context-dependent word correction research
2.3.3.1. Natural language processing approach
2.3.3.2. Statistical language model
2.4. Research on error rates and patterns in Cangjie input method
2.5. Similarities and differences between Chinese and English typed text
2.5.1. Similarities
2.5.2. Differences
2.6. Proposed algorithm for automatic Chinese text correction
2.6.1. Sentence level
2.6.2. Part-of-speech level
2.6.3. Character level
Conclusion
Appendix A. Cangjie Radix Table
Appendix B. Sample Text
Article 1
Article 2
Article 3
Article 4
Appendix C. Error Statistics
References
Gruzd, Anatoliy A., and Caroline Haythornthwaite. « Automated Discovery and Analysis of Social Networks from Threaded Discussions ». 2008. http://hdl.handle.net/10150/105081.
Zhang, Lei. « DASE : Document-Assisted Symbolic Execution for Improving Automated Test Generation ». Thesis, 2014. http://hdl.handle.net/10012/8532.
Cerveira, João Miguel dos Santos. « Automated Metrics System to Support Software Development Process with Natural Language Assistant ». Master's thesis, 2017. http://hdl.handle.net/10316/83083.
Whitesmith is a software development and product consulting company that uses a variety of monitoring tools to aid its product development process. For this method to be well implemented, several data repositories covering all development planning and monitoring are needed. This information must be stored in tools that are easy to reach and quick to understand. Given this need for data storage, several tools able to store and manipulate information have appeared on the market to aid software development. As the company has grown, a large amount of information has become distributed across these tools, so analyzing the development stage of a given project requires looking up information and entering it manually. Thus arose the need for a solution that not only collects all the information but also analyzes the development status of all projects. To avoid creating friction in the development process, the solution must involve minimal human-computer interaction, with the entire process automated. The only interaction required by the company was the integration of a natural language assistant into the communication platform used by all members, in order to improve the usability of information collection. This communication should be possible from both sides, depending on the subject of the metric in question, creating the perfect atmosphere.
Radford, Benjamin James. « Automated Learning of Event Coding Dictionaries for Novel Domains with an Application to Cyberspace ». Diss., 2016. http://hdl.handle.net/10161/13386.
Event data provide high-resolution and high-volume information about political events. From COPDAB to KEDS, GDELT, ICEWS, and PHOENIX, event datasets and the frameworks that produce them have supported a variety of research efforts across fields, including political science. While these datasets are machine-coded from vast amounts of raw text input, they nonetheless require substantial human effort to produce and update sets of required dictionaries. I introduce a novel method for generating large dictionaries appropriate for event-coding given only a small sample dictionary. This technique leverages recent advances in natural language processing and deep learning to greatly reduce the researcher-hours required to go from defining a new domain-of-interest to producing structured event data that describe that domain. An application to cybersecurity is described and both the generated dictionaries and resultant event data are examined. The cybersecurity event data are also examined in relation to existing datasets in related domains.
Dissertation
Gruzd, Anatoliy. « Name Networks : A Content-Based Method for Automated Discovery of Social Networks to Study Collaborative Learning ». 2009. http://hdl.handle.net/10150/105553.
« Analysis and Decision-Making with Social Media ». Doctoral diss., 2019. http://hdl.handle.net/2286/R.I.54830.
Dissertation/Thesis
Doctoral Dissertation Computer Science 2019
Silva, Filipe José Good da. « Criação de um Módulo de Aprendizagem Computacional Automatizada para Cientistas de Dados ». Master's thesis, 2020. http://hdl.handle.net/10316/92522.
The area of Machine Learning has never attracted as much interest, or had as much influence, as it does today. There are several other areas in which it can add value and address the growing need for improvement, from the human sphere, in which our decisions are made by computer algorithms developed to perform certain tasks, to the industrial sphere, where companies use Machine Learning to gain value from the huge amount of data they produce. However, developing a Machine Learning system is not trivial. It is a complex task that requires a large amount of knowledge and time, limiting its development to people with experience in the area. Automated Machine Learning (AutoML) seeks to remove the limitations associated with developing intelligent systems by automating the different phases of a Machine Learning project. This new area aims to address the growing need for tools that make Machine Learning more accessible and less complex. In this work, we explored the current capabilities of AutoML in order to develop an AutoML module. The implemented module can execute several phases of a Machine Learning project in an automated way. In addition, we explored a scenario where AutoML could be integrated. To this end, the implemented module was integrated into a virtual assistant, creating a proof of concept that allows AutoML operations to be executed using natural language. Our results suggest that the two implemented tools make it possible to overcome two obstacles to implementing Machine Learning projects. On the one hand, the AutoML module reduces the complexity associated with developing intelligent systems, allowing individuals without knowledge of Machine Learning to benefit from it. On the other hand, the implemented virtual assistant eliminates the need for programming experience, which is usually vital in Machine Learning projects.
Nogueira, Afonso Manuel Salazar. « Comparação de desempenho de algoritmos de Machine Learning na classificação de IT incident tickets ». Master's thesis, 2020. http://hdl.handle.net/1822/71092.
This dissertation, part of the master's dissertation project in Engineering and Management of Information Systems at the Information Systems department of the University of Minho, has the theme "Performance Comparison of Machine Learning Algorithms in Classifying IT Incident Tickets", which derives from the professional internship that the author completed at the Petrotec Group. Every day, employees from the institution's many departments report technological incidents, that is, problems related to the most varied elements of their daily work that can, in principle, be solved by IT professionals. When faced with a problem, they go to a platform where they can describe the incident both by category and in free text, so that the support agent can easily understand the heart of the matter. However, not all employees are rigorous and accurate in describing the incident, and the selected category is often completely out of step with the ticket's textual description, which makes it more time-consuming for the professional to work out the solution. This dissertation proposes a solution that assigns a category to each new incident ticket by classifying it, thereby identifying the support agent specialized in solving the incident in question. The mechanism uses Text Mining, Natural Language Processing (NLP) and Machine Learning techniques and tries to reduce human intervention in ticket classification as much as possible, decreasing the time spent understanding and resolving tickets. The classification of the attribute holding the ticket's textual description is therefore central to assigning the support agent who will solve the incident.
The results obtained were quite satisfactory: they identified the best text processing procedures to carry out and subsequently yielded, in most of the classification models used, an accuracy of more than 90%, which justifies implementing all the adopted methodologies in a real scenario, that is, in the Petrotec Group. For data collection, processing and mining, the Cross Industry Standard Process for Data Mining (CRISP-DM) methodology was followed, and Design Science Research (DSR) was used as the research methodology.