Theses on the topic "Artificial datasets"
Consult the top 50 theses for your research on the topic "Artificial datasets".
Hilton, Erwin. "Visual datasets for artificial intelligence agents". Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/119553.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from PDF version of thesis.
Includes bibliographical references (page 41).
In this thesis, I designed and implemented two frameworks for generating visual datasets. With these tools, I introduce new, procedurally generated data on which to test VQA agents and other visual AI models. The first tool is the Spatial IQ Generative Dataset (SIQGD), which generates images based on the Raven's Progressive Matrices spatial IQ examination metric. The second tool is a collection of 3D models along with a Blender3D extension that renders images of the models from multiple viewpoints along with their depth maps.
by Erwin Hilton.
M. Eng.
Siddique, Nahian A. "PATTERN RECOGNITION IN CLASS IMBALANCED DATASETS". VCU Scholars Compass, 2016. http://scholarscompass.vcu.edu/etd/4480.
Lundberg, Oskar. "Decentralized machine learning on massive heterogeneous datasets : A thesis about vertical federated learning". Thesis, Uppsala universitet, Avdelningen för systemteknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-444639.
Horečný, Peter. "Metody segmentace obrazu s malými trénovacími množinami". Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2020. http://www.nusl.cz/ntk/nusl-412996.
Woods, Brent J. "Computer-Aided Detection of Malignant Lesions in Dynamic Contrast Enhanced MRI Breast and Prostate Cancer Datasets". The Ohio State University, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=osu1218155270.
Yasarer, Hakan. "Decision making in engineering prediction systems". Diss., Kansas State University, 2013. http://hdl.handle.net/2097/16231.
Department of Civil Engineering
Yacoub M. Najjar
Access to databases has become easier since the digital revolution because large databases are progressively available. Knowledge discovery in these databases via intelligent data analysis technology is a relatively young and interdisciplinary field. In engineering applications, there is a demand for turning low-level data-based knowledge into high-level knowledge via the use of various data analysis methods. The main reason for this demand is that collecting and analyzing databases can be expensive and time consuming. In cases where experimental or empirical data are already available, prediction models can be used to characterize the desired engineering phenomena and/or eliminate unnecessary future experiments and their associated costs. Phenomena characterization based on available databases has been performed with Artificial Neural Networks (ANNs) for more than two decades. However, there is a need to introduce new paradigms to improve the reliability of the available ANN models and optimize their predictions through a hybrid decision system. In this study, a new set of ANN modeling approaches/paradigms, along with a new method to tackle partially missing data (Query method), is introduced for this purpose. The potential use of these methods via a hybrid decision making system is examined using seven available databases obtained from civil engineering applications. Overall, the newly proposed approaches have shown notable prediction accuracy improvements on the seven databases in terms of quantified statistical accuracy measures. The proposed methods are capable of effectively characterizing the general behavior of a specific engineering/scientific phenomenon and can be collectively used to optimize predictions with a reasonable degree of accuracy.
The proposed hybrid decision making system (HDMS), implemented in an Excel-based environment, can easily be applied by the end user to any available data-rich database without the need for extensive training.
Gusarov, Nikita. "Performances des modèles économétriques et de Machine Learning pour l’étude économique des choix discrets de consommation". Electronic Thesis or Diss., Université Grenoble Alpes, 2024. http://www.theses.fr/2024GRALE001.
This thesis is a cross-disciplinary study of discrete choice modeling, addressing both econometrics and machine learning (ML) techniques applied to individual choice modeling. The problem arises from insufficient points of contact between users (economists and engineers) and data scientists, who pursue different objectives although they use similar techniques. To bridge this interdisciplinary gap, the PhD work proposes a unified framework for model performance analysis. It facilitates the comparison of data analysis techniques under varying assumptions and transformations. The designed framework is suitable for a variety of econometrics and ML models. It addresses the performance comparison task from the research procedure perspective, incorporating all the steps that potentially affect the perception of performance. To demonstrate the framework's capabilities, we propose a series of three applied studies. In those studies, model performance is explored in the face of changes in (1) sample size and balance, resulting from data collection; (2) the structure of preferences within the population, reflecting incorrect behavioral assumptions; and (3) model selection, which is directly intertwined with the perception of performance.
Matsumoto, Élia Yathie. "A methodology for improving computed individual regressions predictions". Universidade de São Paulo, 2015. http://www.teses.usp.br/teses/disponiveis/3/3142/tde-12052016-140407/.
This research proposes a methodology for improving the predictions computed by a regression model without modifying its parameters or architecture. In other words, the goal is to obtain better results by adjusting the values computed by the regression, without altering or rebuilding the original prediction model. The proposal is to adjust the values predicted by the regression using individual reliability estimators capable of indicating whether a given estimated value is likely to produce an error that the user of the regression considers critical. The proposed method was tested in three sets of experiments using three different types of data. The first set of experiments worked with artificially produced data, the second with cross-sectional data drawn from the public UCI Machine Learning Repository, and the third with time-series data drawn from ISO-NE (Independent System Operator in New England). The experiments with artificial data were run to verify the method's behavior in controlled situations. In this case, the experiments achieved their best results on clean, artificially produced data and showed progressive deterioration as random elements were added. The experiments with real data drawn from the UCI and ISO-NE databases were carried out to investigate the applicability of the methodology in the real world. The proposed method was able to improve the values predicted by regressions in about 95% of the experiments performed with real data.
Gualandi, Giacomo. "Analisi di dataset in campo finanziario mediante reti neurali LSTM". Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/19623/.
Mattiussi, Vlad. "Una Rassegna di Dataset e Applicazioni Innovative di Intelligenza Artificiale per Affrontare la Pandemia da COVID19". Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/21844/.
Nett, Ryan. "Dataset and Evaluation of Self-Supervised Learning for Panoramic Depth Estimation". DigitalCommons@CalPoly, 2020. https://digitalcommons.calpoly.edu/theses/2234.
Elatfi, Hamza. "Sviluppo di un sistema di crowdsourcing per la validazione e l'arricchimento di dataset". Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/18452/.
Bartocci, John Timothy. "Generating a synthetic dataset for kidney transplantation using generative adversarial networks and categorical logit encoding". Bowling Green State University / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1617104572023027.
Lohniský, Michal. "Všesměrová detekce objektů". Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2014. http://www.nusl.cz/ntk/nusl-236093.
Bianchi, Eric Loran. "COCO-Bridge: Common Objects in Context Dataset and Benchmark for Structural Detail Detection of Bridges". Thesis, Virginia Tech, 2019. http://hdl.handle.net/10919/87588.
MS
Common Objects in Context for bridge inspection (COCO-Bridge) was introduced to improve a drone-conducted bridge inspection process. Drones are a great tool for bridge inspectors because they bring flexibility and access to the inspection. However, drones have a notoriously difficult time operating near bridges, because the signal between the operator and the drone can be lost. COCO-Bridge is an image-based dataset that uses Artificial Intelligence (AI) as a solution to this particular problem, but it has applications in other facets of the inspection as well. This effort initiated a dataset with a focus on identifying specific parts of a bridge, or structural bridge elements. This would allow a drone to fly without explicit direction if the signal was lost, and also has the potential to extend its flight time. Extending flight time and operating autonomously are great advantages for drone operators and bridge inspectors. The output from COCO-Bridge would also help inspectors identify areas that are prone to defects by highlighting regions that require inspection. The image dataset consisted of 774 images for detecting four structural bridge elements that are commonly reviewed and rated during bridge inspections. The goal is to continue to increase the number of images and encompass more structural bridge elements in the dataset so that it may be used for all types of bridges. Methods to reduce the required number of images were investigated, because gathering images of structural bridge elements is challenging. The results from model tests helped build a roadmap for the expansion and best practices for developing a dataset of this type.
Poggi, Cavalletti Stefano. "Utilizzo di tecniche di Machine Learning per l'analisi di dataset in ambito sanitario". Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/21743/.
Hou, Chuanchuan. "Vibration-based damage identification with enhanced frequency dataset and a cracked beam element model". Thesis, University of Edinburgh, 2016. http://hdl.handle.net/1842/20434.
Furundzic, Bojan and Fabian Mathisson. "Dataset Evaluation Method for Vehicle Detection Using TensorFlow Object Detection API". Thesis, Malmö universitet, Fakulteten för teknik och samhälle (TS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-43345.
In the field of object detection, recent developments have demonstrated large variation in quality between visual datasets. As a result, there is a need for standardized validation methods for comparing visual datasets and their performance. This thesis, with a focus on vehicle detection, aims to develop a reliable validation method that can be used to compare visual datasets. This validation method was then used to determine which dataset contributed to the system with the best ability to detect vehicles. The datasets used in this study were BDD100K, KITTI and Udacity, which were trained on individual detection models. By applying this validation method, it was determined that BDD100K was the dataset that contributed to the system with the best-performing detection ability. An analysis of dataset size, label distribution and the average number of labels per image was also carried out. Together with an experiment carried out to test the models in real-world settings, it could be determined that the validation method agreed with the established results. Finally, the ability of the TensorFlow Object Detection API to improve the performance obtained from a visual dataset was studied. Using a modified dataset, it could be established that the TensorFlow Object Detection API is a suitable modification tool that can be used to increase the performance of a visual dataset.
Thumé, Gabriela Salvador. "Geração de imagens artificiais e quantização aplicadas a problemas de classificação". Universidade de São Paulo, 2016. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-16122016-150334/.
Each image can be represented by a combination of several features, such as color frequency and texture properties. Those features compose a multidimensional vector, which represents the original image. Commonly this vector is given as input to a classification method that can learn from examples and build a decision model. The literature suggests that image preparation steps such as accurate acquisition, preprocessing and segmentation can positively impact such classification. In addition, class imbalance is a barrier to achieving good classification accuracy. Some features and methods can be explored to improve the description of objects, and thus their classification. Possible suggestions include reducing the number of colors before feature extraction, instead of applying quantization methods to raw vectors already extracted, and generating synthetic images from original ones to balance the number of samples in an uneven data set. We propose to improve image classification using image processing methods before feature extraction. Specifically, we want to analyze the influence of both balancing and quantization methods when applied to datasets in a classification routine. This research also analyzes the visualization of the feature space after artificial image generation and feature interpolation (SMOTE), compared with the original space. Such visualization is used because it shows how important the rebalancing method is. The results show that quantization simplifies images by producing compact vectors before feature extraction and dimensionality reduction, and that using artificial generation to rebalance image datasets can improve classification, compared with the original dataset and with applying methods to the already extracted feature vectors.
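The feature interpolation (SMOTE) mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's code: the function name `smote_like_oversample`, the parameter `k` and the toy feature vectors are assumptions made for illustration, and the sketch shows only the core idea of placing a synthetic sample on the segment between a minority-class sample and one of its nearest minority-class neighbors.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic vectors interpolated between minority-class neighbors.

    Hypothetical helper for illustration; not a full SMOTE implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority samples to the base sample, excluding itself.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority-class feature vectors (made up for the example).
features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_rows = smote_like_oversample(features, n_new=6)
```

Because every synthetic vector is a convex combination of two existing ones, the new samples stay inside the region already occupied by the minority class, which is why the technique operates on extracted feature vectors rather than on raw images.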
Malazizi, Ladan. "Development of Artificial Intelligence-based In-Silico Toxicity Models. Data Quality Analysis and Model Performance Enhancement through Data Generation". Thesis, University of Bradford, 2008. http://hdl.handle.net/10454/4262.
La, Mura Francesco. "Tecniche di Preparazione di Dataset da Immagini Satellitari di Siti Archeologici per Elaborazioni con Deep Learning". Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/20428/.
Sievert, Rolf. "Instance Segmentation of Multiclass Litter and Imbalanced Dataset Handling : A Deep Learning Model Comparison". Thesis, Linköpings universitet, Datorseende, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-175173.
Nilsson, Alexander and Martin Thönners. "A Framework for Generative Product Design Powered by Deep Learning and Artificial Intelligence : Applied on Everyday Products". Thesis, Linköpings universitet, Maskinkonstruktion, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-149454.
Barbazza, Sigfrido. "Deep-learning applicato all'identificazione automatica di frutta in immagini". Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2016. http://amslaurea.unibo.it/11526/.
Johansson, David. "Price Prediction of Vinyl Records Using Machine Learning Algorithms". Thesis, Linnéuniversitetet, Institutionen för datavetenskap och medieteknik (DM), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-96464.
Udaya, Kumar Magesh Kumar. "Classification of Parkinson’s Disease using MultiPass Lvq,Logistic Model Tree,K-Star for Audio Data set : Classification of Parkinson Disease using Audio Dataset". Thesis, Högskolan Dalarna, Datateknik, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:du-5596.
Galassi, Andrea. "Symbolic versus sub-symbolic approaches: a case study on training Deep Networks to play Nine Men’s Morris game". Master's thesis, Alma Mater Studiorum - Università di Bologna, 2017. http://amslaurea.unibo.it/12859/.
Kaster, Joshua M. "Training Convolutional Neural Network Classifiers Using Simultaneous Scaled Supercomputing". University of Dayton / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1588973772607826.
Duncan, Andrew Paul. "The analysis and application of artificial neural networks for early warning systems in hydrology and the environment". Thesis, University of Exeter, 2014. http://hdl.handle.net/10871/17569.
Del, Vecchio Matteo. "Improving Deep Question Answering: The ALBERT Model". Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/20414/.
Kišš, Martin. "Rozpoznávání historických textů pomocí hlubokých neuronových sítí". Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2018. http://www.nusl.cz/ntk/nusl-385912.
Chen, Jianan. "Deep Learning Based Multimodal Retrieval". Electronic Thesis or Diss., Rennes, INSA, 2023. http://www.theses.fr/2023ISAR0019.
Multimodal tasks play a crucial role in the progression towards achieving general artificial intelligence (AI). The primary goal of multimodal retrieval is to employ machine learning algorithms to extract relevant semantic information, bridging the gap between different modalities such as visual images, linguistic text, and other data sources. It is worth noting that the information entropy associated with heterogeneous data for the same high-level semantics varies significantly, posing a significant challenge for multimodal models. Deep learning-based multimodal network models provide an effective solution to tackle the difficulties arising from substantial differences in information entropy. These models exhibit impressive accuracy and stability in large-scale cross-modal information matching tasks, such as image-text retrieval. Furthermore, they demonstrate strong transfer learning capabilities, enabling a well-trained model from one multimodal task to be fine-tuned and applied to a new multimodal task, even in scenarios involving few-shot or zero-shot learning. In our research, we develop a novel generative multimodal multi-view database specifically designed for the multimodal referential segmentation task. Additionally, we establish a state-of-the-art (SOTA) benchmark and multi-view metric for referring expression segmentation models in the multimodal domain. The results of our comparative experiments are presented visually, providing clear and comprehensive insights
Bustos, Aurelia. "Extraction of medical knowledge from clinical reports and chest x-rays using machine learning techniques". Doctoral thesis, Universidad de Alicante, 2019. http://hdl.handle.net/10045/102193.
Alsulami, Khalil Ibrahim D. "Application-Based Network Traffic Generator for Networking AI Model Development". University of Dayton / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1619387614152354.
Štarha, Dominik. "Meření podobnosti obrazů s pomocí hlubokého učení". Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2018. http://www.nusl.cz/ntk/nusl-377018.
Varga, Adam. "Identifikace a charakterizace škodlivého chování v grafech chování". Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2021. http://www.nusl.cz/ntk/nusl-442388.
Wåhlin, Peter. "Enhanching the Human-Team Awareness of a Robot". Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-16371.
The use of autonomous robots in our society increases every day, and a robot is no longer seen as a tool but as a team member. Robots now work side by side with us and support us during dangerous work where humans are otherwise exposed to risk. This development has in turn increased the need for robots with more human-awareness. The goal of this thesis is therefore to contribute to strengthening the human-awareness of robots. Specifically, we investigate the possibilities of equipping autonomous robots with the ability to assess and detect different behaviors of human teams. This ability could, for example, be used in the robot's reasoning and planning to make decisions and in turn improve collaboration between human and robot. We propose to improve existing activity recognizers by adding the ability to interpret intangible human behaviors such as stress, motivation and focus. Being able to distinguish team activities within a human team is fundamental for a robot that is to support the team. Hidden Markov models have previously proven very effective for activity recognition and have therefore been used in this work. For a robot to be able to give effective support to a human team, it must take into account not only the spatial parameters of the team members but also the psychological ones. To interpret psychological parameters in humans, this master's thesis advocates the use of human body signals, such as heart rate and skin conductance. Combined with the body's signals, we demonstrate the possibility of using system dynamics models to interpret intangible behaviors, which in turn can strengthen the human-awareness of a robot.
The thesis work was conducted in Stockholm, Kista at the department of Informatics and Aero System at Swedish Defence Research Agency.
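As a rough illustration of how hidden Markov models can be used for activity recognition, the sketch below scores an observation sequence under one model per activity with the forward algorithm and picks the most likely activity. Everything concrete here is an assumption made for the example: the activity names, the two observation symbols and all probabilities are invented, and none of it is taken from the thesis.

```python
def forward_likelihood(obs, start, trans, emit):
    """Return P(obs | model) computed with the forward algorithm."""
    n_states = len(start)
    # alpha[s] = probability of the observations so far, ending in state s
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][o]
            for s in range(n_states)
        ]
    return sum(alpha)


# Hypothetical two-state models; observation 0 = "still", 1 = "moving".
models = {
    "patrolling": dict(
        start=[0.5, 0.5],
        trans=[[0.7, 0.3], [0.3, 0.7]],
        emit=[[0.2, 0.8], [0.4, 0.6]],
    ),
    "guarding": dict(
        start=[0.5, 0.5],
        trans=[[0.9, 0.1], [0.1, 0.9]],
        emit=[[0.9, 0.1], [0.7, 0.3]],
    ),
}


def classify(obs):
    """Pick the activity whose model assigns the sequence the highest likelihood."""
    return max(models, key=lambda name: forward_likelihood(obs, **models[name]))


label = classify([1, 1, 0, 1])  # a mostly "moving" sequence
```

One model per activity keeps the classifier simple: adding a new team activity only means adding another set of parameters, which in practice would be estimated from recorded data rather than written by hand.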
"Anomaly Detection in Categorical Datasets with Artificial Contrasts". Master's thesis, 2016. http://hdl.handle.net/2286/R.I.40782.
Dissertation/Thesis
Masters Thesis Industrial Engineering 2016
"Understanding the Importance of Entities and Roles in Natural Language Inference : A Model and Datasets". Master's thesis, 2019. http://hdl.handle.net/2286/R.I.54921.
Dissertation/Thesis
Masters Thesis Computer Science 2019
Lee, Chia-Yi and 李佳怡. "Investigating the Hybrid Models of Decision Tree, Logistic Regression and Artificial Neural Network for Predicting Recurrence of Breast Cancer from Public Microarray Datasets". Thesis, 2008. http://ndltd.ncl.edu.tw/handle/72125064198285323874.
Mendonça, Francisco de Andrade Bravo. "AudioMood: Classificação de emoções em bandas sonoras de filmes usando Redes Neuronais". Master's thesis, 2021. http://hdl.handle.net/10451/49347.
Nowadays, the use of Artificial Intelligence to help with or execute a task is ever more frequent. From personal assistants to video games to autonomous cars, the uses of AI are vast, and it is being adopted in new areas. As the complexity of AI increases, developing new methods to help train it becomes critical. In that sense, this dissertation aims to contribute to the study of training methods for Neural Networks, using audio sources, so that a network is able to identify the different sounds present in a movie. To this end, the first step was the analysis of different datasets, to find one suited to the methodology used. The chosen dataset was AudioSet by Google, which has more than 2 million annotated videos. Next, tools were developed to create smaller datasets from AudioSet. These tools handled downloading the videos, converting them to audio, manipulating and processing the audio, and constructing new datasets. In this process, two data augmentation methods, data rotation and volume control, were applied with the intention of creating new data. Models were then trained on the new dataset. The same model architecture was used for all training runs, with small differences in the training method. For the chosen task, increasing the amount of data in the dataset and using data rotation improved the test results, while volume control did not alter the data enough to improve them.
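The two augmentation methods named in this abstract can be sketched on a raw waveform. The abstract does not define "data rotation", so the sketch assumes it means a circular shift of the samples; the function names, the 8 kHz sample rate and the 440 Hz test tone are likewise hypothetical choices for the example, not details from the dissertation.

```python
import math


def rotate(samples, shift):
    """Circularly shift the waveform by `shift` samples (assumed meaning of 'data rotation')."""
    shift %= len(samples)
    if shift == 0:
        return list(samples)
    return samples[-shift:] + samples[:-shift]


def change_volume(samples, gain):
    """Scale the amplitude by `gain`, clipping to the [-1.0, 1.0] range."""
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


sample_rate = 8000  # hypothetical sample rate in Hz
# One second of a 440 Hz tone standing in for a real audio clip.
tone = [0.5 * math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]

augmented = [
    rotate(tone, sample_rate // 4),  # start the clip a quarter second later
    change_volume(tone, 0.5),        # quieter copy
    change_volume(tone, 1.5),        # louder copy
]
```

Each augmented clip keeps the label of the clip it was derived from, so a small set of annotated audio yields several training examples.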
Jin, Nanxin. "ASD PREDICTION FROM STRUCTURAL MRI WITH MACHINE LEARNING". Thesis, 2020.
"Referring Expression Comprehension for CLEVR-Ref+ Dataset". Master's thesis, 2020. http://hdl.handle.net/2286/R.I.62696.
Dissertation/Thesis
Masters Thesis Computer Science 2020
Gandhi, Priyanka. "Extracting Symptoms from Narrative Text using Artificial Intelligence". Thesis, 2020. http://hdl.handle.net/1805/24759.
Electronic health records collect an enormous amount of data about patients. However, the information about the patient's illness is stored in progress notes that are in an unstructured format. It is difficult for humans to annotate symptoms listed in the free text. Recently, researchers have explored how advances in deep learning can be applied to process biomedical data. The information in the text can be extracted with the help of natural language processing. The research presented in this thesis aims at automating the process of symptom extraction. The proposed methods use pre-trained word embeddings such as BioWord2Vec, BERT, and BioBERT to generate vectors of the words based on the semantics and syntactic structure of sentences. BioWord2Vec embeddings are fed into a BiLSTM neural network with a CRF layer to capture the dependencies between the co-related terms in the sentence. The pre-trained BERT and BioBERT embeddings are fed into the BERT model with a CRF layer to analyze the output tags of neighboring tokens. The research shows that with the help of the CRF layer in neural network models, longer phrases of symptoms can be extracted from the text. The proposed models are compared with the UMLS MetaMap tool, which uses various sources to categorize the terms in the text into different semantic types, and with Stanford CoreNLP, a dependency parser that analyzes syntactic relations in the sentence to extract information. The performance of the models is analyzed using strict, relaxed, and n-gram evaluation schemes. The results show that BioBERT with a CRF layer can extract the majority of the human-labeled symptoms. Furthermore, the model is used to extract symptoms from COVID-19 tweets. The model was able to extract symptoms listed by the CDC as well as new symptoms.
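The difference between strict and relaxed evaluation of extracted phrases can be sketched as follows. The abstract does not define the schemes, so this example assumes a common convention: a strict match requires identical span boundaries, while a relaxed match accepts any overlap. The spans, function names and the half-open `(start, end)` token representation are assumptions for illustration.

```python
def strict_match(pred, gold):
    """Spans match only if their boundaries are identical."""
    return pred == gold


def relaxed_match(pred, gold):
    """Spans match if the half-open (start, end) token ranges overlap at all."""
    return pred[0] < gold[1] and gold[0] < pred[1]


def precision_recall_f1(predicted, gold, match):
    """Score predicted spans against gold spans under the given matching rule."""
    matched_pred = sum(any(match(p, g) for g in gold) for p in predicted)
    matched_gold = sum(any(match(p, g) for p in predicted) for g in gold)
    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Made-up token spans: human-labeled symptoms vs. model output.
gold = [(2, 4), (7, 8), (10, 13)]
predicted = [(2, 4), (10, 12), (15, 16)]

strict = precision_recall_f1(predicted, gold, strict_match)
relaxed = precision_recall_f1(predicted, gold, relaxed_match)
```

Here the prediction `(10, 12)` covers only part of the gold span `(10, 13)`, so it counts as a hit under the relaxed rule but as a miss under the strict one, which is the kind of partial extraction of longer symptom phrases that the two schemes are meant to separate.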
Lima, Bruno Tiago da Silva. "VISUALIZAÇÃO E ANÁLISE EM OBSERVAÇÕES AÉREAS POR INTELIGÊNCIA ARTIFICIAL". Master's thesis, 2020. http://hdl.handle.net/11110/1965.
"Accessible Retail Shopping For The Visually Impaired Using Deep Learning". Master's thesis, 2020. http://hdl.handle.net/2286/R.I.57075.
Dissertation/Thesis
Masters Thesis Computer Science 2020
Foroozandeh, Mehdi. "GAN-Based Synthesis of Brain Tumor Segmentation Data : Augmenting a dataset by generating artificial images". Thesis, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-169863.
Jang, Ikbeom. "Diffusion Tensor Imaging Analysis for Subconcussive Trauma in Football and Convolutional Neural Network-Based Image Quality Control That Does Not Require a Big Dataset". Thesis, 2019.
Dale, Ashley S. "3D Object Detection Using Virtual Environment Assisted Deep Network Training". Thesis, 2020. http://hdl.handle.net/1805/24756.
An RGBZ synthetic dataset consisting of five object classes in a variety of virtual environments and orientations was combined with a small sample of real-world image data and used to train the Mask R-CNN (MR-CNN) architecture in a variety of configurations. When the MR-CNN architecture was initialized with MS COCO weights and the heads were trained with a mix of synthetic data and real-world data, F1 scores improved in four of the five classes: the average maximum F1-score of all classes and all epochs for the networks trained with synthetic data is F1∗ = 0.91, compared to F1 = 0.89 for the networks trained exclusively with real data, and the standard deviation of the maximum mean F1-score for synthetically trained networks is σ∗_F1 = 0.015, compared to σ_F1 = 0.020 for the networks trained exclusively with real data. Various backgrounds in synthetic data were shown to have negligible impact on F1 scores, opening the door to abstract backgrounds and minimizing the need for intensive synthetic data fabrication. When the MR-CNN architecture was initialized with MS COCO weights and depth data was included in the training data, the network was shown to rely heavily on the initial convolutional input to feed features into the network, the image depth channel was shown to influence mask generation, and the image color channels were shown to influence object classification. A set of latent variables for a subset of the synthetic dataset was generated with a Variational Autoencoder and then analyzed using Principal Component Analysis and Uniform Manifold Approximation and Projection (UMAP). The UMAP analysis showed no meaningful distinction between real-world and synthetic data, and a small bias towards clustering based on image background.
Dale, Ashley S. "3D OBJECT DETECTION USING VIRTUAL ENVIRONMENT ASSISTED DEEP NETWORK TRAINING". Thesis, 2021.
An RGBZ synthetic dataset consisting of five object classes in a variety of virtual environments and orientations was combined with a small sample of real-world image data and used to train the Mask R-CNN (MR-CNN) architecture in a variety of configurations. When the MR-CNN architecture was initialized with MS COCO weights and the heads were trained with a mix of synthetic data and real-world data, F1 scores improved in four of the five classes: the average maximum F1-score of all classes and all epochs for the networks trained with synthetic data is F1∗ = 0.91, compared to F1 = 0.89 for the networks trained exclusively with real data, and the standard deviation of the maximum mean F1-score for synthetically trained networks is σ∗_F1 = 0.015, compared to σ_F1 = 0.020 for the networks trained exclusively with real data. Various backgrounds in synthetic data were shown to have negligible impact on F1 scores, opening the door to abstract backgrounds and minimizing the need for intensive synthetic data fabrication. When the MR-CNN architecture was initialized with MS COCO weights and depth data was included in the training data, the network was shown to rely heavily on the initial convolutional input to feed features into the network, the image depth channel was shown to influence mask generation, and the image color channels were shown to influence object classification. A set of latent variables for a subset of the synthetic dataset was generated with a Variational Autoencoder and then analyzed using Principal Component Analysis and Uniform Manifold Approximation and Projection (UMAP). The UMAP analysis showed no meaningful distinction between real-world and synthetic data, and a small bias towards clustering based on image background.