Theses on the topic "Dataset VISION"
Browse the top 39 theses for your research on the topic "Dataset VISION".
Toll, Abigail. "Matrices of Vision : Sonic Disruption of a Dataset". Thesis, Kungl. Musikhögskolan, Institutionen för komposition, dirigering och musikteori, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kmh:diva-4152.
Berriel, Rodrigo Ferreira. "Vision-based ego-lane analysis system : dataset and algorithms". Mestrado em Informática, 2016. http://repositorio.ufes.br/handle/10/6775.
FAPES
Lane detection and analysis are important and challenging tasks in advanced driver assistance systems and autonomous driving. These tasks are required to help autonomous and semi-autonomous vehicles operate safely. Decreasing costs of vision sensors and advances in embedded hardware have boosted lane-related research (detection, estimation, tracking, etc.) over the past two decades. Interest in this topic has increased even more with the demand for advanced driver assistance systems (ADAS) and self-driving cars. Although these problems have been studied extensively in isolation, there is still a need for studies that propose a combined solution to the multiple problems related to the ego-lane, such as lane departure warning (LDW), lane change detection, lane marking type (LMT) classification, road marking detection and classification, and detection of adjacent lanes. This work proposes a real-time Ego-Lane Analysis System (ELAS) capable of estimating ego-lane position, classifying LMTs and road markings, performing LDW, and detecting lane change events. The proposed vision-based system works on a temporal sequence of images. Lane marking features are extracted in both perspective and Inverse Perspective Mapping (IPM) images, and the two are combined to increase robustness. The final estimated lane is modeled as a spline using a combination of methods (Hough lines, Kalman filter, and particle filter). Based on the estimated lane, all other events are detected. Moreover, the proposed system was integrated for experimentation into an autonomous car that is being developed by the High Performance Computing Laboratory (LCAD) of the Universidade Federal do Espírito Santo (UFES). To validate the proposed algorithms and address the lack of lane datasets in the literature, a new dataset with more than 20 different scenes (more than 15,000 frames) covering a variety of scenarios (urban roads, highways, traffic, shadows, etc.) was created.
The dataset was manually annotated and made publicly available to enable evaluation of several events of interest to the research community (i.e., lane estimation, change, and centering; road markings; intersections; LMTs; crosswalks; and adjacent lanes). Furthermore, the system was also validated qualitatively through its integration with the autonomous vehicle. ELAS achieved high detection rates in all real-world events and proved to be ready for real-time applications.
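The abstract above names a Kalman filter as one of the methods combined to estimate the lane. As a rough illustration only, and not the ELAS implementation, the following sketch smooths a noisy sequence of lateral lane offsets with a scalar random-walk Kalman filter. The function name `kalman_smooth`, the noise variances, and the sample offsets are all assumptions made for this example.

```python
def kalman_smooth(measurements, process_var=1e-3, meas_var=0.25):
    """Filter noisy scalar measurements with a random-walk Kalman filter.

    process_var and meas_var are illustrative (assumed) noise variances.
    """
    estimate = measurements[0]  # initialise the state from the first measurement
    variance = meas_var         # initial uncertainty of that state
    smoothed = [estimate]
    for z in measurements[1:]:
        # Predict: the state is assumed unchanged, but its uncertainty grows.
        variance += process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = variance / (variance + meas_var)
        estimate += gain * (z - estimate)
        variance *= 1.0 - gain
        smoothed.append(estimate)
    return smoothed


# Hypothetical lateral offsets (in metres) with one outlier at index 3.
offsets = [0.10, 0.14, 0.08, 0.55, 0.11]
filtered = kalman_smooth(offsets)
```

In this sketch the outlier is pulled towards the running estimate instead of being followed directly, which is the kind of temporal smoothing a per-frame lane estimate can benefit from. A full lane tracker would filter a vector state (for example, spline control points) rather than a single scalar.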
Ragonesi, Ruggero. "Addressing Dataset Bias in Deep Neural Networks". Doctoral thesis, Università degli studi di Genova, 2022. http://hdl.handle.net/11567/1069001.
Xie, Shuang. "A Tiny Diagnostic Dataset and Diverse Modules for Learning-Based Optical Flow Estimation". Thesis, Université d'Ottawa / University of Ottawa, 2019. http://hdl.handle.net/10393/39634.
Nett, Ryan. "Dataset and Evaluation of Self-Supervised Learning for Panoramic Depth Estimation". DigitalCommons@CalPoly, 2020. https://digitalcommons.calpoly.edu/theses/2234.
Andruccioli, Matteo. "Previsione del Successo di Prodotti di Moda Prima della Commercializzazione: un Nuovo Dataset e Modello di Vision-Language Transformer". Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/24956/.
Joubert, Deon. "Saliency grouped landmarks for use in vision-based simultaneous localisation and mapping". Diss., University of Pretoria, 2013. http://hdl.handle.net/2263/40834.
Dissertation (MEng)--University of Pretoria, 2013.
Electrical, Electronic and Computer Engineering
Horečný, Peter. "Metody segmentace obrazu s malými trénovacími množinami". Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2020. http://www.nusl.cz/ntk/nusl-412996.
Tagebrand, Emil, and Ek Emil Gustafsson. "Dataset Generation in a Simulated Environment Using Real Flight Data for Reliable Runway Detection Capabilities". Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-54974.
Sievert, Rolf. "Instance Segmentation of Multiclass Litter and Imbalanced Dataset Handling : A Deep Learning Model Comparison". Thesis, Linköpings universitet, Datorseende, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-175173.
Molin, David. "Pedestrian Detection Using Convolutional Neural Networks". Thesis, Linköpings universitet, Datorseende, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-120019.
Arcidiacono, Claudio Salvatore. "An empirical study on synthetic image generation techniques for object detectors". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-235502.
Convolutional neural networks are a very powerful machine learning tool that has outperformed other techniques in image recognition. The biggest drawback of this method is the massive amount of training data required, since producing training data for image recognition tasks is very labor-intensive. To address this problem, various techniques have been proposed to generate synthetic training data automatically. These synthetic data generation techniques can be grouped into two categories: the first generates synthetic images using computer graphics software and CAD models of the objects to be recognized; the second generates synthetic images by cutting the object out of one image and pasting it onto another. Since both techniques have their advantages and drawbacks, it would be interesting for industry to investigate the two approaches in more depth. A common case in industrial scenarios is detecting and classifying objects in an image. Different objects belonging to classes that are relevant in industrial scenarios are often indistinguishable (for example, they are all the same component). For these reasons, this thesis aims to answer the question "Among CAD generation techniques, cut-paste generation techniques, and a combination of the two, which technique is more suitable for generating images for training object detectors in industrial scenarios?" To answer the research question, two synthetic image generation techniques, one from each category, are proposed. The proposed techniques are tailored to applications where all objects belonging to the same class are indistinguishable, but they can also be extended to other applications. The two synthetic image generation techniques are compared by measuring the performance of an object detector trained on synthetic images on a test dataset of real images.
The performance of the two synthetic data generation techniques when used for data augmentation has also been measured. The empirical results show that the CAD modeling technique works substantially better than the cut-paste generation technique when synthetic images are the only source of training data (61% better), whereas the two generation techniques perform equally well as data augmentation techniques. In addition, the empirical results show that models trained only on synthetic images perform almost as well as the model trained on real images (7.4% worse) and that augmenting the dataset of real images with synthetic images improves the model's performance (9.5% better).
Capuzzo, Davide. "3D StixelNet Deep Neural Network for 3D object detection stixel-based". Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/22017/.
Mahmood, Muhammad Habib. "Motion annotation in complex video datasets". Doctoral thesis, Universitat de Girona, 2018. http://hdl.handle.net/10803/667583.
Motion segmentation refers to the process of separating regions and trajectories of a video sequence into subsets that are coherent in space and time. In this thesis we created a new, multifaceted dataset of real-life sequences that includes varying numbers of motions and frames per sequence, as well as distortions with missing data. It also includes ground truth for all frames, based on trajectory and region measures. We also proposed a new semi-automatic tool for delineating trajectories in complex videos, even videos captured with moving cameras. With minimal manual annotation of the objects, the algorithm is able to propagate the annotation to all frames. During occlusions, labels are corrected by applying mask tracking for each depth order. The results show that our approach performs successfully on a wide variety of video sequences.
Bustos, Aurelia. "Extraction of medical knowledge from clinical reports and chest x-rays using machine learning techniques". Doctoral thesis, Universidad de Alicante, 2019. http://hdl.handle.net/10045/102193.
Cooper, Lee Alex Donald. "High Performance Image Analysis for Large Histological Datasets". The Ohio State University, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=osu1250004647.
Chaudhary, Gautam. "RZSweep a new volume-rendering technique for uniform rectilinear datasets /". Master's thesis, Mississippi State : Mississippi State University, 2003. http://library.msstate.edu/etd/show.asp?etd=etd-04012003-141739.
Bäck, Erik. "En vision om en ny didaktik för undervisning i företagsekonomi". Thesis, Malmö högskola, Lärarutbildningen (LUT), 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-34009.
Ramswamy, Lakshmy. "PARZSweep a novel parallel algorithm for volume rendering of regular datasets /". Master's thesis, Mississippi State : Mississippi State University, 2003. http://library.msstate.edu/etd/show.asp?etd=etd-04012003-140443.
Jia, Sen. "Data from the wild in computer vision : generating and exploiting large scale and noisy datasets". Thesis, University of Bristol, 2016. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.738203.
Godavarthy, Sridhar. "Microexpression Spotting in Video Using Optical Strain". Scholar Commons, 2010. https://scholarcommons.usf.edu/etd/1642.
Touranakou, Maria. "A Novel System for Deep Analysis of Large-Scale Hand Pose Datasets". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-240419.
This thesis project proposes the design and implementation of a new system for deep analysis of large-scale hand pose datasets. The system consists of a set of modules for automatic redundancy removal, classification, statistical analysis, and visualization of large-scale datasets based on their characteristics. In this project, the work targets the specific use case of images of hand movements in front of smartphone cameras. The characteristics of the images are examined, and the images are preprocessed to reduce repetitive content and noise in the data. Two different design paradigms for content analysis and image classification are used: a computer vision pipeline and a deep learning pipeline. The computer vision pipeline includes several image processing steps, including image segmentation, hand detection, and feature extraction, followed by a classification step. The deep learning pipeline uses a convolutional neural network for classification. For industrial applications with high diversity in data content, deep learning is proposed for image classification and computer vision is recommended for feature analysis. Finally, statistical analysis is performed to visually extract the required information about hand features and the diversity of the classified data. The main contribution of this work lies in adapting computer vision and deep learning tools to the design and implementation of a hybrid system for deep data analysis.
Kratochvíla, Lukáš. "Trasování objektu v reálném čase". Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2019. http://www.nusl.cz/ntk/nusl-403748.
Hummel, Georg. "On synthetic datasets for development of computer vision algorithms in airborne reconnaissance applications". Universität der Bundeswehr München, Fakultät für Luft- und Raumfahrttechnik (supervisor: Peter Stütz; reviewers: Peter Stütz, Paolo Remagnino). Neubiberg : Universitätsbibliothek der Universität der Bundeswehr München, 2017. http://d-nb.info/1147386331/34.
Michaud, Dorian. "Indexation bio-inspirée pour la recherche d'images par similarité". Thesis, Poitiers, 2018. http://www.theses.fr/2018POIT2288/document.
Image retrieval is still a very active field of image processing, as the number of available image datasets continuously increases. One of the principal objectives of Content-Based Image Retrieval (CBIR) is to return the images most similar to a given query with respect to their visual content. Our work fits in a very specific application context: indexing small expert image datasets, with no prior knowledge of the images. Because of the image complexity, one of our contributions is the choice of effective descriptors from the literature, placed in direct competition. Two strategies are used to combine features: a psycho-visual one and a statistical one. In this context, we propose an unsupervised and adaptive framework, based on the well-known bags of visual words and phrases models, that selects relevant visual descriptors for each keypoint to construct a more discriminative image representation. Experiments show the value of this type of methodology at a time when convolutional neural networks are ubiquitous. We also propose a study of semi-interactive retrieval to improve the accuracy of CBIR systems by using the knowledge of expert users.
Malireddi, Sri Raghu. "Systematic generation of datasets and benchmarks for modern computer vision". Thesis, 2019. http://hdl.handle.net/1828/10689.
Karpenko, Alexandre. "50,000 Tiny Videos: A Large Dataset for Non-parametric Content-based Retrieval and Recognition". Thesis, 2009. http://hdl.handle.net/1807/17690.
Shullani, Dasara. "Video forensic tools exploiting features from video-container to video-encoder level". Doctoral thesis, 2018. http://hdl.handle.net/2158/1126144.
Moreira, Gonçalo Rebelo de Almeida. "Neuromorphic Event-based Facial Identity Recognition". Master's thesis, 2021. http://hdl.handle.net/10316/98251.
Facial recognition research has been around for more than half a century. This great interest in the field stems from its tremendous potential to enhance various industries, such as video surveillance, personal authentication, criminal investigation, and leisure. Most state-of-the-art algorithms rely on facial appearance; in particular, these methods use the static characteristics of the human face (e.g., the distance between the eyes, nose location, nose shape) to determine the subject's identity extremely accurately. However, it is further argued that humans also make use of another type of facial information to identify other people, namely, one's idiosyncratic facial motion. This kind of facial data is relevant because it is hard to replicate or forge, whereas appearance can be easily distorted by cheap software available to anyone. On another note, event cameras are quite recent neuromorphic devices that are remarkable at encoding dynamic information in a scene. These sensors are inspired by the biological operation mode of the human eye. Rather than detecting light intensity, they capture light intensity variations in the scene. Thus, in comparison to standard cameras, this sensing mechanism has a high temporal resolution, so it does not suffer from motion blur, and has low power consumption, among other benefits. A few of its early applications have been real-time Simultaneous Localization And Mapping (SLAM), anomaly detection, and action/gesture recognition. Taking all of this into account, the main purpose of this work is to evaluate the aptitude of the technology offered by event cameras for completing a more complex task, namely facial identity recognition, and how easily it could be integrated into real-world systems. Additionally, the dataset created in the scope of this dissertation (NVSFD Dataset) is made available in order to facilitate future third-party investigation on the topic.
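The abstract above describes event cameras as sensors that report intensity variations rather than absolute intensity. As a hedged illustration, the sketch below simulates that behaviour from ordinary frames using a commonly used idealised model: a pixel emits an event when its log-intensity has changed by at least a threshold since its last event. The function name `generate_events`, the threshold value, and the toy frames are assumptions for this example and do not come from the thesis.

```python
import math


def generate_events(frames, threshold=0.2):
    """Simulate events from a list of frames (2D lists of positive intensities).

    Returns a list of (frame_index, row, col, polarity) tuples, where polarity
    is +1 for a brightness increase and -1 for a decrease.
    """
    # Per-pixel reference log-intensity, taken from the first frame.
    reference = [[math.log(value) for value in row] for row in frames[0]]
    events = []
    for t, frame in enumerate(frames[1:], start=1):
        for y, row in enumerate(frame):
            for x, value in enumerate(row):
                delta = math.log(value) - reference[y][x]
                if abs(delta) >= threshold:
                    events.append((t, y, x, 1 if delta > 0 else -1))
                    # Reset the reference only for the pixel that fired.
                    reference[y][x] = math.log(value)
    return events


# Toy 1x2 sequence: the right pixel brightens once, the left pixel is static.
frames = [[[1.0, 1.0]], [[1.0, 2.0]], [[1.0, 2.0]]]
events = generate_events(frames)
```

Only the pixel that changed produces an event, and it stays silent once the scene is static again, which illustrates why such sensors emphasise scene dynamics. Real sensors operate asynchronously per pixel and with noise, which this frame-based sketch ignores.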
Foroozandeh, Mehdi. "GAN-Based Synthesis of Brain Tumor Segmentation Data : Augmenting a dataset by generating artificial images". Thesis, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-169863.
Bojja, Abhishake Kumar. "Deep neural networks for semantic segmentation". Thesis, 2020. http://hdl.handle.net/1828/11696.
Anderson, Peter James. "Vision and Language Learning: From Image Captioning and Visual Question Answering towards Embodied Agents". Phd thesis, 2018. http://hdl.handle.net/1885/164018.
Amelio, Ravelli Andrea. "Annotation of Linguistically Derived Action Concepts in Computer Vision Datasets". Doctoral thesis, 2020. http://hdl.handle.net/2158/1200356.
Gebali, Aleya. "Detection of salient events in large datasets of underwater video". Thesis, 2012. http://hdl.handle.net/1828/4156.
Breslav, Mikhail. "3D pose estimation of flying animals in multi-view video datasets". Thesis, 2016. https://hdl.handle.net/2144/19720.
Hult, Jim, and Pontus Pihl. "Inspecting product quality with computer vision techniques : Comparing traditional image processing methods with deep learning methods on small datasets in finding surface defects". Thesis, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:hj:diva-54056.
Russo, Paolo. "Broadening deep learning horizons: models for RGB and depth images adaptation". Doctoral thesis, 2020. http://hdl.handle.net/11573/1365047.
Dale, Ashley S. "3D Object Detection Using Virtual Environment Assisted Deep Network Training". Thesis, 2021.
An RGBZ synthetic dataset consisting of five object classes in a variety of virtual environments and orientations was combined with a small sample of real-world image data and used to train the Mask R-CNN (MR-CNN) architecture in a variety of configurations. When the MR-CNN architecture was initialized with MS COCO weights and the heads were trained with a mix of synthetic and real-world data, F1 scores improved in four of the five classes: the average maximum F1 score over all classes and all epochs for the networks trained with synthetic data is F1* = 0.91, compared to F1 = 0.89 for the networks trained exclusively with real data, and the standard deviation of the maximum mean F1 score for synthetically trained networks is σ*_F1 = 0.015, compared to σ_F1 = 0.020 for the networks trained exclusively with real data. Various backgrounds in synthetic data were shown to have negligible impact on F1 scores, opening the door to abstract backgrounds and minimizing the need for intensive synthetic data fabrication. When the MR-CNN architecture was initialized with MS COCO weights and depth data was included in the training data, the network was shown to rely heavily on the initial convolutional input to feed features into the network, the image depth channel was shown to influence mask generation, and the image color channels were shown to influence object classification. A set of latent variables for a subset of the synthetic dataset was generated with a Variational Autoencoder and then analyzed using Principal Component Analysis and Uniform Manifold Approximation and Projection (UMAP). The UMAP analysis showed no meaningful distinction between real-world and synthetic data, and a small bias towards clustering based on image background.
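The results above are reported as F1 scores. For readers unfamiliar with the metric, the following minimal sketch computes F1 as the harmonic mean of precision and recall from detection counts. The function name `f1_score` and the example counts are hypothetical; they are not taken from the thesis and do not reproduce its evaluation protocol.

```python
def f1_score(true_pos, false_pos, false_neg):
    """Return the F1 score (harmonic mean of precision and recall)."""
    if true_pos == 0:
        # No correct detections: precision and recall are zero or undefined.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for one object class.
score = f1_score(true_pos=90, false_pos=10, false_neg=10)
```

With these made-up counts, precision and recall are both 0.9, so the F1 score is also 0.9. Because F1 is a harmonic mean, it drops sharply when either precision or recall is low.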