Doctoral dissertations on the topic "Visual tracking"
Create a correct reference in APA, MLA, Chicago, Harvard, and many other styles
Consult the 50 best doctoral dissertations on the topic "Visual tracking".
An "Add to bibliography" button is available next to each work in the list. Use it, and we will automatically create a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the scholarly publication as a ".pdf" file and read its abstract online, whenever these are available in the record's metadata.
Browse doctoral dissertations from a wide range of disciplines and compile an accurate bibliography.
Danelljan, Martin. "Visual Tracking". Thesis, Linköpings universitet, Datorseende, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-105659.
Wessler, Mike. "A modular visual tracking system". Thesis, Massachusetts Institute of Technology, 1995. http://hdl.handle.net/1721.1/11459.
Klein, Georg. "Visual tracking for augmented reality". Thesis, University of Cambridge, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.614262.
Salama, Gouda Ismail Mohamed. "Monocular and Binocular Visual Tracking". Diss., Virginia Tech, 1999. http://hdl.handle.net/10919/37179.
Dehlin, Carl. "Visual Tracking Using Stereo Images". Thesis, Linköpings universitet, Datorseende, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-153776.
Salti, Samuele <1982>. "On-line adaptive visual tracking". Doctoral thesis, Alma Mater Studiorum - Università di Bologna, 2011. http://amsdottorato.unibo.it/3735/1/samuele_salti_tesi.pdf.
Salti, Samuele <1982>. "On-line adaptive visual tracking". Doctoral thesis, Alma Mater Studiorum - Università di Bologna, 2011. http://amsdottorato.unibo.it/3735/.
Delabarre, Bertrand. "Contributions to dense visual tracking and visual servoing using robust similarity criteria". Thesis, Rennes 1, 2014. http://www.theses.fr/2014REN1S124/document.
In this thesis, we address the visual tracking and visual servoing problems, which are crucial topics in computer and robot vision. Most existing techniques use geometrical primitives extracted from the images in order to estimate motion from an image sequence. But using geometrical features means having to extract and match them at each new image before performing the tracking or servoing step. To avoid this algorithmic step, recent approaches have proposed to use the information provided by the whole image directly instead of extracting geometrical primitives. Most of these algorithms, referred to as direct techniques, are based on the luminance values of every pixel in the image. But this strategy limits their use, since the criterion is very sensitive to scene perturbations such as illumination shifts or occlusions. To overcome this problem, we propose to use robust similarity measures, the sum of conditional variance and mutual information, in order to perform robust direct visual tracking and visual servoing. Several algorithms based on these criteria are then proposed in order to be robust to scene perturbations. These methods are tested and analysed in several setups where perturbations occur, which demonstrates their efficiency.
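The sum of conditional variance and mutual information mentioned above are similarity criteria computed directly on pixel intensities rather than on extracted features. As a rough illustration of the idea only (not code from the thesis; the function name and toy patches are invented), a histogram-based mutual information score between a template and a candidate patch can be written as follows:

```python
import numpy as np

def mutual_information(patch_a, patch_b, bins=32):
    """Histogram estimate of the mutual information between two patches.

    Direct trackers score a warped candidate region against a reference
    template with such criteria because they tolerate illumination changes
    far better than a plain sum of squared pixel differences.
    """
    a = np.ravel(patch_a).astype(np.float64)
    b = np.ravel(patch_b).astype(np.float64)
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    pxy = joint / joint.sum()            # joint intensity distribution
    px = pxy.sum(axis=1, keepdims=True)  # marginal of the template
    py = pxy.sum(axis=0, keepdims=True)  # marginal of the candidate
    nz = pxy > 0                         # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

# Toy check: an illumination-shifted copy still scores higher than noise.
template = np.random.rand(40, 40)
print(mutual_information(template, 0.7 * template + 0.1),
      mutual_information(template, np.random.rand(40, 40)))
```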
Arslan, Ali Erkin. "Visual Tracking With Group Motion Approach". Master's thesis, METU, 2003. http://etd.lib.metu.edu.tr/upload/4/1056100/index.pdf.
Zhu, Biwen. "Visual Tracking with Deep Learning : Automatic tracking of farm animals". Thesis, KTH, Radio Systems Laboratory (RS Lab), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-240086.
Automated tracking in farm surveillance footage can help support farm management. In this project, an automated system for detecting sows in surveillance video is designed using deep learning and computer vision methods. Because of storage, time, and bandwidth constraints, and to enable real-time scenarios over a network in the future, tracking in compressed video streams is essential. The proposed system uses a discriminative correlation filter (DCF) as a classifier to detect the target, and the tracking model is updated by training the classifier with online learning methods. Compression encodes the video data and reduces the bitrate at which video signals are transmitted, which helps video transmission adapt to constrained networks; however, it can also degrade image quality and reduce the accuracy of the tracker. We therefore evaluate the performance of existing visual tracking algorithms on compressed video sequences. The ultimate goal is to build a tracking system with the same performance but lower network-resource requirements. The proposed algorithm successfully tracks each sow across consecutive frames in most cases, and its performance was compared with two state-of-the-art trackers: Siamese Fully-Convolutional (FC) and Efficient Convolution Operators (ECO). The evaluation shows that the proposed tracker achieves performance similar to Siamese FC and ECO. Compared with tracking on the original video, the proposed tracker achieved similar accuracy while requiring much less storage and producing a lower bitrate when the video was compressed with suitable parameters. The system is much slower than required for real-time tracking due to its high computational complexity; more efficient methods for updating the tracking model are therefore needed to achieve real-time tracking.
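The discriminative correlation filter (DCF) referred to here is learned and applied in the Fourier domain, which is what makes this family of trackers fast. The MOSSE-style sketch below is a minimal, illustrative NumPy version only; it omits the preprocessing, multi-channel features, and elaborate online updates used by trackers such as ECO, and all names and the toy example are assumptions for illustration:

```python
import numpy as np

def train_filter(patches, sigma=2.0, reg=1e-3):
    """Fit a MOSSE-style correlation filter to centred training patches."""
    h, w = patches[0].shape
    ys, xs = np.mgrid[0:h, 0:w]
    g = np.exp(-((xs - w // 2) ** 2 + (ys - h // 2) ** 2) / (2 * sigma ** 2))
    G = np.fft.fft2(g)                       # desired Gaussian response
    A = np.zeros((h, w), dtype=complex)
    B = np.zeros((h, w), dtype=complex)
    for p in patches:
        F = np.fft.fft2(p)
        A += G * np.conj(F)
        B += F * np.conj(F)
    return A / (B + reg)                     # filter in the Fourier domain

def detect(H, patch):
    """Correlate the filter with a search patch; return the target offset."""
    response = np.real(np.fft.ifft2(H * np.fft.fft2(patch)))
    dy, dx = np.unravel_index(np.argmax(response), response.shape)
    return dy - patch.shape[0] // 2, dx - patch.shape[1] // 2

# Toy usage: a random patch stands in for a cropped image of one animal.
rng = np.random.default_rng(0)
target = rng.random((64, 64))
H = train_filter([target])
print(detect(H, np.roll(target, (3, -5), axis=(0, 1))))   # approx. (3, -5)
```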
White, Jacob Harley. "Real-Time Visual Multi-Target Tracking in Realistic Tracking Environments". BYU ScholarsArchive, 2019. https://scholarsarchive.byu.edu/etd/7486.
Niethammer, Marc. "Dynamic Level Sets for Visual Tracking". Diss., Georgia Institute of Technology, 2004. http://hdl.handle.net/1853/7606.
Ndiour, Ibrahima Jacques. "Dynamic curve estimation for visual tracking". Diss., Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/37283.
Laberge, Dominic. "Visual tracking for human-computer interaction". Thesis, University of Ottawa (Canada), 2003. http://hdl.handle.net/10393/26504.
Khan, Muhammad Haris. "Visual tracking over multiple temporal scales". Thesis, University of Nottingham, 2015. http://eprints.nottingham.ac.uk/33056/.
Wong, Matthew. "Tracking maneuvering target using visual sensor". Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp01/MQ39896.pdf.
Maggio, Emilio. "Monte Carlo methods for visual tracking". Thesis, Queen Mary, University of London, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.497791.
Nott, Viswajith Karapoondi. "Joint Visual and Wireless Tracking System". UKnowledge, 2009. http://uknowledge.uky.edu/gradschool_theses/592.
Luo, Tao, and 羅濤. "Human visual tracking in surveillance video". Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2014. http://hdl.handle.net/10722/206727.
North, Ben. "Learning dynamical models for visual tracking". Thesis, University of Oxford, 1998. http://ora.ox.ac.uk/objects/uuid:6ed12552-4c30-4d80-88ef-7245be2d8fb8.
Gladh, Susanna. "Visual Tracking Using Deep Motion Features". Thesis, Linköpings universitet, Datorseende, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-134342.
Thanikasalam, Kokul. "Appearance based online visual object tracking". Thesis, Queensland University of Technology, 2019. https://eprints.qut.edu.au/130875/1/Kokul_Thanikasalam_Thesis.pdf.
Di Nardo, Emanuel. "Advanced methodologies for visual object tracking". Doctoral thesis, Università degli Studi di Milano, 2022. http://hdl.handle.net/2434/931766.
Olsson, Mica. "Visual composition in video games : Visual analyzation using eye-tracking". Thesis, Luleå tekniska universitet, Institutionen för konst, kommunikation och lärande, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-64489.
This report investigates how theory from classical art and visual structure can be applied to interactive, real-time game environments in order to influence a player's choices and actions. It first reviews the theory used in classical art, such as composition, working with colour, lines and shapes, and the contrasts between them. Based on this theory, a number of images are analysed to predict what the viewer will look at and find interesting, and the same method is then applied to interactive media. All data in these studies is collected with an eye-tracking system, which registers the movement and position of the viewer's gaze on a computer screen. The results indicate that there is much more to study regarding the use of eye tracking for game analysis: eye tracking on still images works very well and yields clear, readable data, but for interactive environments the data quickly becomes more abstract, and further work is needed on how eye tracking can best be used there.
Pålsson, Nicholas. "Guiding the viewer using visual components : Eye-tracking for visual analysis". Thesis, Luleå tekniska universitet, Institutionen för konst, kommunikation och lärande, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-74563.
Ali, Saad. "Taming Crowded Visual Scenes". Doctoral diss., University of Central Florida, 2008. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/3593.
Larsson, Olof. "Visual-inertial tracking using Optical Flow measurements". Thesis, Linköping University, Automatic Control, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-59970.
Visual-inertial tracking is a well-known technique for tracking a combination of a camera and an inertial measurement unit (IMU). An issue with the straightforward approach is the need for known 3D points. To bypass this, 2D information can be used, without recovering depth, to estimate the position and orientation (pose) of the camera. This Master's thesis investigates the feasibility of using Optical Flow (OF) measurements and indicates the benefits of this approach.
The 2D information is added using OF measurements. OF describes the visual flow of interest points in the image plane. Since the depth of these points need not be estimated, the computational complexity is reduced. With the increased amount of 2D information, less 3D information is required for the pose estimate.
The use of 2D points for pose estimation has been verified with experimental data gathered by a real camera/IMU system. Several data sequences containing different trajectories are used to estimate the pose. It is shown that OF measurements can be used to improve visual-inertial tracking with a reduced need for 3D-point registrations.
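As background on how such 2D flow measurements are typically gathered, the sketch below uses OpenCV's pyramidal Lucas-Kanade tracker to produce image-plane displacement vectors for a set of interest points. It only illustrates the measurement side: the file name and parameters are placeholders, and the fusion with IMU data in a pose filter is not shown.

```python
import cv2

cap = cv2.VideoCapture("sequence.mp4")          # placeholder video path
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                              qualityLevel=0.01, minDistance=8)

while True:
    ok, frame = cap.read()
    if not ok or pts is None:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    new_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None,
                                                  winSize=(21, 21), maxLevel=3)
    good_new = new_pts[status.ravel() == 1]
    good_old = pts[status.ravel() == 1]
    flow = good_new - good_old        # 2D optical-flow measurements (pixels)
    # ... the flow vectors would be fed to the pose filter together with
    #     the IMU readings at this point ...
    prev_gray, pts = gray, good_new.reshape(-1, 1, 2)
```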
Ergezer, Hamza. "Visual Detection And Tracking Of Moving Objects". Master's thesis, METU, 2007. http://etd.lib.metu.edu.tr/upload/2/12609098/index.pdf.
Kalman tracker and mean-shift tracker are other approaches which have been utilized. A new approach has been proposed for the problem of tracking multiple targets. We have implemented this method for single and multiple camera configurations. Multiple cameras have been used to augment the measurements. A homography matrix has been calculated to find the correspondence between cameras. Then, measurements and tracks have been associated by the new tracking method.
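The homography step mentioned in this abstract, relating measurements in one camera view to another, can be illustrated with OpenCV as below. The point correspondences are placeholder values; in practice they would come from calibration or from matched image features.

```python
import numpy as np
import cv2

# Corresponding ground-plane points seen in camera 1 and camera 2 (placeholders).
pts_cam1 = np.array([[100, 200], [400, 210], [390, 480], [110, 470]], dtype=np.float32)
pts_cam2 = np.array([[80, 180], [420, 200], [400, 500], [90, 460]], dtype=np.float32)

H, inlier_mask = cv2.findHomography(pts_cam1, pts_cam2, cv2.RANSAC, 5.0)

# Map a measurement (e.g. a target position) from camera 1 into camera 2 so
# that measurements and tracks from both views can be associated.
measurement = np.array([[[250.0, 300.0]]], dtype=np.float64)   # shape (1, 1, 2)
print(cv2.perspectiveTransform(measurement, H).ravel())
```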
Qin, Lei. "Online machine learning methods for visual tracking". Thesis, Troyes, 2014. http://www.theses.fr/2014TROY0017/document.
We study the challenging problem of tracking an arbitrary object in video sequences with no prior knowledge other than a template annotated in the first frame. To tackle this problem, we build a robust tracking system consisting of the following components. First, for image region representation, we propose some improvements to the region covariance descriptor: characteristics of the specific object are taken into consideration before constructing the covariance descriptor. Second, for building the object appearance model, we propose to combine the merits of both generative and discriminative models by organizing them in a detection cascade. Specifically, generative models are deployed in the early layers to eliminate most easy candidates, whereas discriminative models in the later layers distinguish the object from a few similar "distracters". Partial Least Squares Discriminant Analysis (PLS-DA) is employed for building the discriminative object appearance models. Third, for updating the generative models, we propose a weakly supervised model-updating method based on cluster analysis using the mean-shift gradient density estimation procedure. Fourth, a novel online PLS-DA learning algorithm is developed for incrementally updating the discriminative models. The final tracking system that integrates all these building blocks exhibits good robustness for most challenges in visual tracking. Comparative results on challenging video sequences show that the proposed tracking system performs favourably with respect to a number of state-of-the-art methods.
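To make the cascade idea concrete, the schematic sketch below filters candidate feature vectors with a cheap generative test first and scores the survivors with a PLS-DA model (PLS regression on ±1 labels, here via scikit-learn). It is a simplified illustration under assumed descriptors and thresholds, not the region-covariance features or the online PLS-DA update developed in the thesis.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def cascade_scores(candidates, template, pls, gen_threshold):
    """Stage 1: generative gate (distance to template). Stage 2: PLS-DA score."""
    dists = np.linalg.norm(candidates - template, axis=1)
    keep = dists < gen_threshold                  # cheap test removes easy negatives
    scores = np.full(len(candidates), -np.inf)
    if keep.any():
        scores[keep] = pls.predict(candidates[keep]).ravel()
    return scores

# Toy setup: 64-dimensional descriptors, +1 for the object, -1 for distracters.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 64))
y = np.where(X[:, 0] > 0, 1.0, -1.0)
pls = PLSRegression(n_components=4).fit(X, y)

template = X[y > 0].mean(axis=0)
candidates = rng.normal(size=(50, 64))
best = int(np.argmax(cascade_scores(candidates, template, pls, gen_threshold=12.0)))
print(best)
```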
Häger, Gustav. "Improving Discriminative Correlation Filters for Visual Tracking". Thesis, Linköpings universitet, Datorseende, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-125963.
Generic visual tracking is a classical problem in computer vision. In the standard formulation, no prior knowledge about the object to be tracked is assumed beyond an initial bounding box in the first frame of a video sequence. It is a very difficult problem to solve in general because of occlusions, rotations, illumination changes, and variations in the perceived size of the object. In recent years, tracking methods based on discriminative correlation filters have shown promising results. These methods use the Fourier transform to compute detections and model updates efficiently, achieving very good performance at many hundreds of frames per second. Current methods, however, only estimate the translation of the tracked object, while scale changes are ignored. This thesis evaluates a number of approaches for scale estimation within a correlation-filter framework, including a novel method based on constructing separate scale and translation filters. The proposed method is robust, gives significantly better tracking performance, and can still be run in real time. An evaluation of different feature representations on two large tracking benchmark datasets is also performed.
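A dedicated scale estimate of the kind described above is often obtained by evaluating a small pyramid of differently scaled patches around the current target position. The crude sketch below uses normalised cross-correlation in place of a learned one-dimensional scale filter, so it only illustrates the search structure; the names and parameter values are assumptions.

```python
import numpy as np
import cv2

def best_scale(frame_gray, center, base_size, template, scales=(0.95, 1.0, 1.05)):
    """Return the relative scale whose patch around `center` best matches `template`."""
    cx, cy = center
    h, w = base_size
    best, best_score = 1.0, -np.inf
    for s in scales:
        sh, sw = int(round(h * s)), int(round(w * s))
        y0, x0 = int(cy - sh / 2), int(cx - sw / 2)
        patch = frame_gray[max(y0, 0):y0 + sh, max(x0, 0):x0 + sw]
        if patch.size == 0:
            continue
        # Resize the candidate to the template size and score the match.
        patch = cv2.resize(patch, (template.shape[1], template.shape[0]))
        score = cv2.matchTemplate(patch.astype(np.float32),
                                  template.astype(np.float32),
                                  cv2.TM_CCOEFF_NORMED)[0, 0]
        if score > best_score:
            best, best_score = s, score
    return best
```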
Nassif, Samer Chaker. "Cooperative windowing for real-time visual tracking". Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1997. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp03/NQ30107.pdf.
Turker, Burcu. "Multiple hypothesis tracking for multiple visual targets". Master's thesis, METU, 2010. http://etd.lib.metu.edu.tr/upload/3/12611837/index.pdf.
Wesierski, Daniel. "Visual tracking of articulated and flexible objects". PhD thesis, Institut National des Télécommunications, 2013. http://tel.archives-ouvertes.fr/tel-00939073.
Kaucic, Robert August. "Lip tracking for audio-visual speech recognition". Thesis, University of Oxford, 1997. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.360392.
Khoo, B. E. "A visual, knowledge-based robot tracking system". Thesis, Swansea University, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.637791.
Tosas, Martin. "Visual articulated hand tracking for interactive surfaces". Thesis, University of Nottingham, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.438416.
Wang, Yiming. "Active visual tracking in multi-agent scenarios". Thesis, Queen Mary, University of London, 2018. http://qmro.qmul.ac.uk/xmlui/handle/123456789/42804.
Du, X. "Visual tracking in robotic minimally invasive surgery". Thesis, University College London (University of London), 2018. http://discovery.ucl.ac.uk/10047149/.
Wesierski, Daniel. "Visual tracking of articulated and flexible objects". Thesis, Evry, Institut national des télécommunications, 2013. http://www.theses.fr/2013TELE0007/document.
Humans can track objects visually with little effort. For a computer, however, it is hard to track a fast-moving object under varying illumination and occlusions, in clutter, and with an appearance that varies in the camera's projective space due to relaxed rigidity or changes in viewpoint. Since a generic, precise, robust, and fast tracker could enable many applications, object tracking has been a fundamental problem of practical importance since the beginnings of computer vision. The first contribution of the thesis is a computationally efficient approach to tracking objects of various shapes and motions. It describes a unifying tracking system that can be configured to track the pose of a deformable object in a low- or high-dimensional state space. The object is decomposed into a chained assembly of segments of multiple parts that are arranged under a hierarchy of tailored spatio-temporal constraints. The robustness and generality of the approach is demonstrated extensively on tracking various flexible and articulated objects. Haar-like features are widely used in tracking. The second contribution of the thesis is a parser of ensembles of Haar-like features that computes them efficiently. The features are decomposed into simpler kernels, possibly shared by subsets of features, thus forming multi-pass convolutions. Discovering and aligning these kernels within and between passes allows forming recursive trees of kernels that require fewer memory operations than the classic computation, producing the same result more efficiently. The approach is validated experimentally on popular examples of Haar-like features.
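The Haar-like features discussed in the second contribution are box-filter differences; once an integral image is available, each box sum costs four lookups. The snippet below shows only that classic computation as background, not the kernel-sharing parser contributed by the thesis; the function names are illustrative.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a leading zero row/column for easy slicing."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def box_sum(ii, y, x, h, w):
    """Sum of img[y:y+h, x:x+w] in four lookups."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

def haar_two_rect(ii, y, x, h, w):
    """Two-rectangle Haar-like feature: left half minus right half."""
    return box_sum(ii, y, x, h, w // 2) - box_sum(ii, y, x + w // 2, h, w // 2)

# Quick check against a direct computation on a random image.
img = np.random.rand(24, 24)
ii = integral_image(img)
assert np.isclose(box_sum(ii, 3, 5, 8, 10), img[3:11, 5:15].sum())
```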
Woodley, Thomas Edward. "Visual tracking using offline and online learning". Thesis, University of Cambridge, 2010. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.608814.
Pełny tekst źródłaLoxam, James Ronald. "Robust filtering for real-time visual tracking". Thesis, University of Cambridge, 2011. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.609503.
Pełny tekst źródłaHoffmann, McElory Roberto. "Stochastic visual tracking with active appearance models". Thesis, Stellenbosch : University of Stellenbosch, 2009. http://hdl.handle.net/10019.1/1381.
ENGLISH ABSTRACT: In many applications an accurate, robust and fast tracker is needed, for example in surveillance, gesture recognition, tracking lips for lip-reading, and creating an augmented reality by embedding a tracked object in a virtual environment. In this dissertation we investigate the viability of a tracker that combines the accuracy of active appearance models with the robustness of the particle filter (a stochastic process); we call this combination the PFAAM. In order to obtain a fast system, we suggest local optimisation as well as using active appearance models fitted with non-linear approaches. Active appearance models use both contour (shape) and greyscale information to build a deformable template of an object. They are typically accurate, but not necessarily robust, when tracking contours. A particle filter is a generalisation of the Kalman filter. In a tutorial style, we show how the particle filter is derived as a numerical approximation for the general state estimation problem. The algorithms are tested for accuracy, robustness and speed on a PC, in an embedded environment, and by tracking in 3D. The algorithms run in real time on a PC and near real time in the embedded environment. In both cases good accuracy and robustness are achieved, even if the tracked object moves fast against a cluttered background, and for uncomplicated occlusions.
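For readers unfamiliar with the stochastic half of the PFAAM combination, a bootstrap particle filter iterates prediction, re-weighting, and resampling. The generic sketch below (random-walk motion, user-supplied likelihood) only illustrates that loop; it is not the AAM-specific tracker of the dissertation, and all names are assumptions.

```python
import numpy as np

def particle_filter_step(particles, weights, measurement, likelihood,
                         motion_std=2.0, rng=np.random.default_rng()):
    """One predict-update-resample cycle of a bootstrap particle filter.

    `particles` is an (N, d) array of state hypotheses (for a PFAAM these
    would be pose/appearance parameters); `likelihood(p, z)` scores how well
    hypothesis p explains the current measurement z.
    """
    n = len(particles)
    # Predict: diffuse hypotheses with a simple random-walk motion model.
    particles = particles + rng.normal(scale=motion_std, size=particles.shape)
    # Update: re-weight each hypothesis by its measurement likelihood.
    weights = weights * np.array([likelihood(p, measurement) for p in particles])
    weights = weights / weights.sum()
    # Resample when the effective sample size collapses.
    if 1.0 / np.sum(weights ** 2) < n / 2:
        idx = rng.choice(n, size=n, p=weights)
        particles, weights = particles[idx], np.full(n, 1.0 / n)
    return particles, weights

# Toy usage: 1-D state with a Gaussian likelihood around the measurement.
gaussian = lambda p, z: np.exp(-0.5 * (p[0] - z) ** 2)
P, W = np.zeros((100, 1)), np.full(100, 1.0 / 100)
P, W = particle_filter_step(P, W, measurement=3.0, likelihood=gaussian)
```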
Lao, Yuanwei. "Visual Tracking by Exploiting Observations and Correlations". The Ohio State University, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=osu1269547716.
Wettermark, Emma, and Linda Berglund. "Multi-Modal Visual Tracking Using Infrared Imagery". Thesis, Linköpings universitet, Datorseende, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-176540.
Johnander, Joakim. "Visual Tracking with Deformable Continuous Convolution Operators". Thesis, Linköpings universitet, Datorseende, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-138597.
Kilic, V. "Audio-visual tracking of multiple moving speakers". Thesis, University of Surrey, 2016. http://epubs.surrey.ac.uk/809761/.
Wu, Zheng. "Occlusion reasoning for multiple object visual tracking". Thesis, Boston University, 2013. https://hdl.handle.net/2144/12892.
Occlusion reasoning for visual object tracking in uncontrolled environments is a challenging problem. It becomes significantly more difficult when dense groups of indistinguishable objects are present in the scene, causing frequent inter-object interactions and occlusions. We present several practical solutions that tackle inter-object occlusions for video surveillance applications. In particular, this thesis proposes three methods. First, we propose "reconstruction-tracking," an online multi-camera spatial-temporal data association method for tracking large groups of objects imaged at low resolution. As a variant of the well-known Multiple-Hypothesis Tracker, our approach localizes the positions of objects in 3D space from possibly occluded observations in multiple camera views and performs temporal data association in 3D. Second, we develop "track linking," a class of offline batch-processing algorithms for long-term occlusions, where the decision has to be made based on observations from the entire tracking sequence. We construct a graph representation to characterize occlusion events and propose an efficient graph-based/combinatorial algorithm to resolve occlusions. Third, we propose a novel Bayesian framework in which detection and data association are combined into a single module and solved jointly. Almost all traditional tracking systems address the detection and data association tasks separately, in sequential order. Such a design implies that the output of the detector has to be reliable in order to make the data association work. Our framework takes advantage of the often complementary nature of the two subproblems, which not only avoids the error-propagation issue from which traditional "detection-tracking" approaches suffer but also eschews common heuristics such as "non-maximum suppression" of hypotheses, by modeling the likelihood of the entire image. The thesis describes a substantial number of experiments involving challenging and notably distinct simulated and real data, including infrared and visible-light data sets recorded by us or taken from publicly available data sets. In these videos, the number of objects ranges from a dozen to a hundred per frame, in both monocular and multiple views. The experiments demonstrate that our approaches achieve results comparable to those of state-of-the-art approaches.
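As background for the data-association vocabulary used above, the simplest building block is a gated, single-frame assignment between existing tracks and new detections. The sketch below uses the Hungarian algorithm from SciPy; it is a deliberately naive baseline, far simpler than the multi-camera, batch, and joint detection-association methods of the thesis, and the names and gate value are assumptions.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(track_positions, detections, gate=30.0):
    """Gated Hungarian assignment between (T, 2) tracks and (D, 2) detections."""
    cost = np.linalg.norm(track_positions[:, None, :] - detections[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)
    return [(int(r), int(c)) for r, c in zip(rows, cols) if cost[r, c] <= gate]

# Toy usage: two tracks, three detections; the distant detection is left unmatched.
tracks = np.array([[10.0, 10.0], [50.0, 60.0]])
dets = np.array([[12.0, 11.0], [48.0, 63.0], [200.0, 200.0]])
print(associate(tracks, dets))   # [(0, 0), (1, 1)]
```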
Firouzi, Hadi. "Visual non-rigid object tracking in dynamic environments". Thesis, University of British Columbia, 2013. http://hdl.handle.net/2429/44629.
Campos, Teófilo Emídio de. "3D visual tracking of articulated objects and hands". Thesis, University of Oxford, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.442396.
Sudderth, Erik B. (Erik Blaine) 1977. "Graphical models for visual object recognition and tracking". Thesis, Massachusetts Institute of Technology, 2006. http://hdl.handle.net/1721.1/34023.
We develop statistical methods which allow effective visual detection, categorization, and tracking of objects in complex scenes. Such computer vision systems must be robust to wide variations in object appearance, the often small size of training databases, and ambiguities induced by articulated or partially occluded objects. Graphical models provide a powerful framework for encoding the statistical structure of visual scenes, and developing corresponding learning and inference algorithms. In this thesis, we describe several models which integrate graphical representations with nonparametric statistical methods. This approach leads to inference algorithms which tractably recover high-dimensional, continuous object pose variations, and learning procedures which transfer knowledge among related recognition tasks. Motivated by visual tracking problems, we first develop a nonparametric extension of the belief propagation (BP) algorithm. Using Monte Carlo methods, we provide general procedures for recursively updating particle-based approximations of continuous sufficient statistics. Efficient multiscale sampling methods then allow this nonparametric BP algorithm to be flexibly adapted to many different applications. As a particular example, we consider a graphical model describing the hand's three-dimensional (3D) structure, kinematics, and dynamics. This graph encodes global hand pose via the 3D position and orientation of several rigid components, and thus exposes local structure in a high-dimensional articulated model. Applying nonparametric BP, we recover a hand tracking algorithm which is robust to outliers and local visual ambiguities. Via a set of latent occupancy masks, we also extend our approach to consistently infer occlusion events in a distributed fashion. In the second half of this thesis, we develop methods for learning hierarchical models of objects, the parts composing them, and the scenes surrounding them. Our approach couples topic models originally developed for text analysis with spatial transformations, and thus consistently accounts for geometric constraints. By building integrated scene models, we may discover contextual relationships, and better exploit partially labeled training images. We first consider images of isolated objects, and show that sharing parts among object categories improves accuracy when learning from few examples. Turning to multiple object scenes, we propose nonparametric models which use Dirichlet processes to automatically learn the number of parts underlying each object category, and objects composing each scene. Adapting these transformed Dirichlet processes to images taken with a binocular stereo camera, we learn integrated, 3D models of object geometry and appearance. This leads to a Monte Carlo algorithm which automatically infers 3D scene structure from the predictable geometry of known object categories.
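The Dirichlet processes referred to in this abstract place a prior over mixtures with an unbounded number of components, which is what lets the number of parts and objects be learned rather than fixed in advance. A standard truncated stick-breaking construction of such mixture weights, given purely as generic background rather than code from the thesis, is:

```python
import numpy as np

def stick_breaking_weights(alpha, n_atoms, rng=np.random.default_rng()):
    """Truncated stick-breaking weights of a Dirichlet process with concentration alpha."""
    betas = rng.beta(1.0, alpha, size=n_atoms)
    # w_k = beta_k * prod_{j<k} (1 - beta_j): break off a fraction of what remains.
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    return betas * remaining

# Small alpha concentrates mass on a few components (few parts/objects),
# large alpha spreads it over many.
print(stick_breaking_weights(1.0, 10).round(3))
```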