Dissertations / Theses on the topic 'Visual object'
Consult the top 50 dissertations / theses for your research on the topic 'Visual object.'
Figueroa, Flores Carola. "Visual Saliency for Object Recognition, and Object Recognition for Visual Saliency." Doctoral thesis, Universitat Autònoma de Barcelona, 2021. http://hdl.handle.net/10803/671964.
For humans, the recognition of objects is an almost instantaneous, precise and extremely adaptable process. Furthermore, we have the innate capability to learn new object classes from only a few examples. The human brain lowers the complexity of the incoming data by filtering out part of the information and only processing those things that capture our attention. This, combined with our biological predisposition to respond to certain shapes or colors, allows us to recognize at a single glance the most important or salient regions of an image. This mechanism can be observed by analyzing which parts of an image subjects attend to, i.e. where they fix their eyes when the image is shown to them. The most accurate way to record this behavior is to track eye movements while displaying images. Computational saliency estimation aims to identify to what extent regions or objects stand out, for human observers, with respect to their surroundings. Saliency maps can be used in a wide range of applications including object detection, image and video compression, and visual tracking. The majority of research in the field has focused on automatically estimating saliency maps given an input image. Instead, in this thesis, we set out to incorporate saliency maps in an object recognition pipeline: we want to investigate whether saliency maps can improve object recognition results. We identify several problems related to visual saliency estimation. First, to what extent can saliency estimation be exploited to improve the training of an object recognition model when training data is scarce? To solve this problem, we design an image classification network that incorporates saliency information as input. This network processes the saliency map through a dedicated branch and uses the resulting features to modulate the standard bottom-up visual features computed from the original image.
We refer to this technique as saliency-modulated image classification (SMIC). In extensive experiments on standard benchmark datasets for fine-grained object recognition, we show that our proposed architecture can significantly improve performance, especially on datasets with scarce training data. Next, we address the main drawback of the above pipeline: SMIC requires an explicit saliency algorithm that must be trained on a saliency dataset. To solve this, we implement a hallucination mechanism that allows us to incorporate the saliency estimation branch in an end-to-end trained neural network architecture that only needs the RGB image as input. A side-effect of this architecture is the estimation of saliency maps. In experiments, we show that this architecture can obtain object recognition results similar to SMIC, but without requiring ground-truth saliency maps to train the system. Finally, we evaluate the accuracy of the saliency maps that arise as a side-effect of object recognition. For this purpose, we use a set of benchmark datasets for saliency evaluation based on eye-tracking experiments. Surprisingly, the estimated saliency maps are very similar to the maps computed from human eye-tracking experiments. Our results show that these maps obtain competitive scores on saliency benchmarks; on one synthetic saliency dataset the method even achieves the state of the art without ever having seen an actual saliency map during training.
Universitat Autònoma de Barcelona. Programa de Doctorat en Informàtica
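The saliency-modulated classification idea described in this entry's abstract can be sketched in a few lines of NumPy. Everything below (the gating scheme, shapes and names) is an illustrative assumption, not the thesis's actual architecture:

```python
import numpy as np

def saliency_modulated_features(image_feats, saliency_map, w_branch):
    """Hypothetical sketch: a saliency branch produces per-channel gates
    that modulate bottom-up image features (SMIC-style modulation)."""
    # Saliency branch: pool the map and project it to one gate per channel.
    pooled = saliency_map.mean()                        # global average pooling
    gates = 1.0 / (1.0 + np.exp(-(w_branch * pooled)))  # sigmoid gate per channel
    # Modulate each feature channel by its gate.
    return image_feats * gates

rng = np.random.default_rng(0)
feats = rng.standard_normal((8,))   # 8 feature channels (toy size)
sal = rng.random((4, 4))            # toy 4x4 saliency map
w = rng.standard_normal((8,))       # branch weights (one per channel)
out = saliency_modulated_features(feats, sal, w)
print(out.shape)                    # (8,)
```

Since the gates lie in (0, 1), the modulation here can only attenuate channels; a learned network branch would of course be free to amplify them as well.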
Fergus, Robert. "Visual object category recognition." Thesis, University of Oxford, 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.425029.
Wallenberg, Marcus. "Embodied Visual Object Recognition." Doctoral thesis, Linköpings universitet, Datorseende, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-132762.
Nguyen, Duong B. T. "The visual object editing kit." Carleton University Dissertation, Computer Science. Ottawa, 1993.
Tauber, Zinovi. "Visual object retrieval based on locales." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape3/PQDD_0013/MQ61504.pdf.
Breuel, Thomas M. "Geometric Aspects of Visual Object Recognition." Thesis, Massachusetts Institute of Technology, 1992. http://hdl.handle.net/1721.1/7342.
Meger, David Paul. "Visual object recognition for mobile platforms." Thesis, University of British Columbia, 2013. http://hdl.handle.net/2429/44682.
Fu, Huanzhang. "Contributions to generic visual object categorization." Phd thesis, Ecole Centrale de Lyon, 2010. http://tel.archives-ouvertes.fr/tel-00599713.
Choi, Changhyun. "Visual object perception in unstructured environments." Diss., Georgia Institute of Technology, 2014. http://hdl.handle.net/1853/53003.
Buchler, Daniela Martins. "Visual perception of the designed object." Thesis, Staffordshire University, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.442502.
Fang, Jianzhong. "Computational approaches to visual object detection." Thesis, University of Nottingham, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.416393.
Moghaddam, Baback 1963. "Probabilistic visual learning for object detection." Thesis, Massachusetts Institute of Technology, 1997. http://hdl.handle.net/1721.1/10242.
Includes bibliographical references (leaves 78-82).
by Baback Moghaddam.
Ph.D.
Lim, Joseph J. (Joseph Jaewhan). "Toward visual understanding of everyday object." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/101574.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 83-92).
The computer vision community has made impressive progress on object recognition using large scale data. However, for any visual system to interact with objects, it needs to understand much more than simply recognizing where the objects are. The goal of my research is to explore and solve object understanding tasks for interaction - finding an object's pose in 3D, understanding its various states and transformations, and interpreting its physical interactions. In this thesis, I will focus on two specific aspects of this agenda: 3D object pose estimation and object state understanding. Precise pose estimation is a challenging problem. One reason is that an object's appearance inside an image can vary a lot based on different conditions (e.g. location, occlusions, and lighting). I address these issues by utilizing 3D models directly. The goal is to develop a method that can exploit all possible views provided by a 3D model - a single 3D model represents infinitely many 2D views of the same object. I have developed a method that uses the 3D geometry of an object for pose estimation. The method can then also learn additional real-world statistics, such as which poses appear more frequently, which area is more likely to contain an object, and which parts are commonly occluded and discriminative. These methods allow us to localize and estimate the exact pose of objects in natural images. Finally, I will also describe the work on learning and inferring different states and transformations an object class can undergo. Objects in visual scenes come in a rich variety of transformed states. A few classes of transformation have been heavily studied in computer vision: mostly simple, parametric changes in color and geometry. However, transformations in the physical world occur in many more flavors, and they come with semantic meaning: e.g., bending, folding, aging, etc. 
Hence, the goal is to learn about an object class, in terms of its states and transformations, using collections of images from image search engines.
by Joseph J. Lim.
Ph. D.
Tuovinen, Antti-Pekka. "Object-oriented engineering of visual languages." Helsinki : University of Helsinki, 2002. http://ethesis.helsinki.fi/julkaisut/mat/tieto/vk/tuovinen/.
Thanikasalam, Kokul. "Appearance based online visual object tracking." Thesis, Queensland University of Technology, 2019. https://eprints.qut.edu.au/130875/1/Kokul_Thanikasalam_Thesis.pdf.
Full textYang, Tao. "visual tracking and object motion prediction for intelligent vehicles." Thesis, Bourgogne Franche-Comté, 2019. http://www.theses.fr/2019UBFCA005.
Object tracking and motion prediction are important for autonomous vehicles and can be applied in many other fields. First, we design a single-object tracker that uses compressive tracking to correct optical-flow tracking, achieving a balance between performance and processing speed. Given the efficiency of compressive feature extraction, we apply this tracker to multi-object tracking to improve performance without sacrificing too much speed. Second, we improve a DCF-based single-object tracker by introducing multi-layer CNN features, spatial reliability analysis (through a foreground mask) and a conditional model-updating strategy. We then apply the DCF-based CNN tracker to multi-object tracking. The pre-trained VGGNet-19 and DCFNet are tested as feature extractors, and the discriminative model obtained by the DCF is used for data association. Third, two LSTM models (seq2seq and seq2dense) are proposed for motion prediction of vehicles and pedestrians in the camera coordinate frame. Based on visual data and 3D point clouds (LiDAR), a Kalman-filter-based multi-object tracking system with a 3D detector is used to generate object trajectories for testing. The proposed models are compared for evaluation against a polynomial regression model taken as the baseline.
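The polynomial-regression baseline mentioned at the end of this abstract can be illustrated with a toy sketch (the function name, degree and data below are made up for illustration, not the thesis's code):

```python
import numpy as np

def predict_trajectory(track_xy, n_future, degree=2):
    """Toy baseline: fit a polynomial to each coordinate of an observed
    track over time and extrapolate n_future time steps ahead."""
    t = np.arange(len(track_xy))
    t_future = np.arange(len(track_xy), len(track_xy) + n_future)
    preds = []
    for dim in range(track_xy.shape[1]):              # x and y independently
        coeffs = np.polyfit(t, track_xy[:, dim], degree)
        preds.append(np.polyval(coeffs, t_future))
    return np.stack(preds, axis=1)                    # shape (n_future, 2)

# A pedestrian moving at constant velocity: a degree-1 fit extrapolates exactly.
track = np.stack([np.arange(10.0), 2.0 * np.arange(10.0)], axis=1)
future = predict_trajectory(track, 3, degree=1)
print(future)   # approximately [[10, 20], [11, 22], [12, 24]]
```

Such a baseline has no notion of interaction or context, which is precisely the gap the learned seq2seq models aim to close.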
Hussain, Sabit ul. "Machine Learning Methods for Visual Object Detection." Thesis, Grenoble, 2011. http://www.theses.fr/2011GRENM070/document.
The goal of this thesis is to develop better practical methods for detecting common object classes in real world images. We present a family of object detectors that combine Histogram of Oriented Gradient (HOG), Local Binary Pattern (LBP) and Local Ternary Pattern (LTP) features with efficient Latent SVM classifiers and effective dimensionality reduction and sparsification schemes to give state-of-the-art performance on several important datasets including PASCAL VOC2006 and VOC2007, INRIA Person and ETHZ. The three main contributions are as follows. Firstly, we pioneer the use of Local Ternary Pattern features for object detection, showing that LTP gives better overall performance than HOG and LBP, because it captures both rich local texture and object shape information while being resistant to variations in lighting conditions. It thus works well both for classes that are recognized mainly by their structure and ones that are recognized mainly by their textures. We also show that HOG, LBP and LTP complement one another, so that an extended feature set that incorporates all three of them gives further improvements in performance. Secondly, in order to tackle the speed and memory usage problems associated with high-dimensional modern feature sets, we propose two effective dimensionality reduction techniques. The first, feature projection using Partial Least Squares, allows detectors to be trained more rapidly with negligible loss of accuracy and no loss of run time speed for linear detectors. The second, feature selection using SVM weight truncation, allows active feature sets to be reduced in size by almost an order of magnitude with little or no loss, and often a small gain, in detector accuracy. Despite its simplicity, this feature selection scheme outperforms all of the other sparsity enforcing methods that we have tested.
Lastly, we describe work in progress on Local Quantized Patterns (LQP), a generalized form of local pattern features that uses lookup table based vector quantization to provide local pattern style pixel neighbourhood codings that have the speed of LBP/LTP and some of the flexibility and power of traditional visual word representations. Our experiments show that LQP outperforms all of the other feature sets tested, including HOG, LBP and LTP.
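The Local Ternary Pattern feature highlighted in this abstract is easy to sketch for a single 3x3 patch; the neighbor ordering and threshold below are illustrative choices rather than the exact encoding used in the thesis:

```python
import numpy as np

def ltp_codes(patch, t=5):
    """Local Ternary Pattern of the 8 neighbors around the center of a
    3x3 patch: +1 if above center+t, -1 if below center-t, 0 otherwise.
    The ternary code is conventionally split into two binary LBP-style
    codes (positive half and negative half)."""
    c = patch[1, 1]
    # 8 neighbors in clockwise order starting at the top-left corner.
    nb = patch[[0, 0, 0, 1, 2, 2, 2, 1], [0, 1, 2, 2, 2, 1, 0, 0]]
    ternary = np.where(nb > c + t, 1, np.where(nb < c - t, -1, 0))
    weights = 2 ** np.arange(8)
    upper = int(np.sum(weights * (ternary == 1)))   # positive half
    lower = int(np.sum(weights * (ternary == -1)))  # negative half
    return upper, lower

patch = np.array([[90, 50, 50],
                  [50, 50, 50],
                  [50, 50, 10]])
print(ltp_codes(patch))   # (1, 16): one bright neighbor, one dark neighbor
```

The threshold t is what gives LTP its resistance to lighting noise: small intensity fluctuations around the center value map to the 0 state instead of flipping bits as they would in plain LBP.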
Craddock, Matthew Peter. "Comparing the attainment of object constancy in haptic and visual object recognition." Thesis, University of Liverpool, 2010. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.539615.
Gepperth, Alexander Rainer Tassilo. "Neural learning methods for visual object detection." [S.l.] : [s.n.], 2006. http://deposit.ddb.de/cgi-bin/dokserv?idn=981053998.
Allred, Sarah R. "The Neural basis of visual object perception." Thesis, Connect to this title online; UW restricted, 2006. http://hdl.handle.net/1773/10645.
Mahmood, Hamid. "Visual Attention-based Object Detection and Recognition." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-94024.
Hussain, Sibt Ul. "Machine Learning Methods for Visual Object Detection." Phd thesis, Université de Grenoble, 2011. http://tel.archives-ouvertes.fr/tel-00680048.
Rebai, Ahmed. "Interactive Object Retrieval using Interpretable Visual Models." Phd thesis, Université Paris Sud - Paris XI, 2011. http://tel.archives-ouvertes.fr/tel-00608467.
Webber, James. "Visual object-oriented development of parallel applications." Thesis, University of Newcastle Upon Tyne, 2000. http://hdl.handle.net/10443/1762.
Villalba, Michael Joseph. "Fast visual recognition of large object sets." Thesis, Massachusetts Institute of Technology, 1990. http://hdl.handle.net/1721.1/42211.
Aghajanian, J. "Patch-based models for visual object classes." Thesis, University College London (University of London), 2011. http://discovery.ucl.ac.uk/1306170/.
Revie, Gavin F. "Object based attention in visual word processing." Thesis, University of Dundee, 2015. https://discovery.dundee.ac.uk/en/studentTheses/205c8224-4954-4b76-aa8c-b0ecd40a6591.
Kinuthia, Charles. "Visual Object Detector for Vehicle Teleoperation Applications." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-276857.
Self-driving cars have captured the interest of vehicle manufacturers thanks to breakthroughs in machine learning and AI algorithms. One area of particular interest is improved vehicle perception through accurate real-time object detection. As more vehicles become autonomous, the need to monitor and remotely operate them grows, in order to handle special cases that are difficult to automate or anticipate. This requires streaming video from the vehicle to a remote driver. Because of network problems caused by bandwidth fluctuations, simply transmitting video is not enough; the remote driver's experience can be improved by highlighting objects such as vehicles and people. The main contribution of this thesis is a real-time object detector with accuracy comparable to Faster R-CNN. The proposed detector is modular, so the whole model does not need to be retrained when a new object class is added. Finally, the detector is evaluated on video containing artifacts to assess its performance.
Lindqvist, Zebh. "Design Principles for Visual Object Recognition Systems." Thesis, Luleå tekniska universitet, Institutionen för system- och rymdteknik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-80769.
Teynor, Alexandra. "Visual object class recognition using local descriptions." [S.l. : s.n.], 2008. http://nbn-resolving.de/urn:nbn:de:bsz:25-opus-62371.
Wu, Hanwei. "Object Ranking for Mobile 3D Visual Search." Thesis, KTH, Skolan för elektro- och systemteknik (EES), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-175146.
Wu, Zheng. "Occlusion reasoning for multiple object visual tracking." Thesis, Boston University, 2013. https://hdl.handle.net/2144/12892.
Occlusion reasoning for visual object tracking in uncontrolled environments is a challenging problem. It becomes significantly more difficult when dense groups of indistinguishable objects are present in the scene that cause frequent inter-object interactions and occlusions. We present several practical solutions that tackle the inter-object occlusions for video surveillance applications. In particular, this thesis proposes three methods. First, we propose "reconstruction-tracking," an online multi-camera spatial-temporal data association method for tracking large groups of objects imaged with low resolution. As a variant of the well-known Multiple-Hypothesis-Tracker, our approach localizes the positions of objects in 3D space with possibly occluded observations from multiple camera views and performs temporal data association in 3D. Second, we develop "track linking," a class of offline batch processing algorithms for long-term occlusions, where the decision has to be made based on the observations from the entire tracking sequence. We construct a graph representation to characterize occlusion events and propose an efficient graph-based/combinatorial algorithm to resolve occlusions. Third, we propose a novel Bayesian framework where detection and data association are combined into a single module and solved jointly. Almost all traditional tracking systems address the detection and data association tasks separately in sequential order. Such a design implies that the output of the detector has to be reliable in order to make the data association work. Our framework takes advantage of the often complementary nature of the two subproblems, which not only avoids the error propagation issue from which traditional "detection-tracking approaches" suffer but also eschews common heuristics such as "nonmaximum suppression" of hypotheses by modeling the likelihood of the entire image.
The thesis describes a substantial number of experiments, involving challenging, notably distinct simulated and real data, including infrared and visible-light data sets recorded ourselves or taken from data sets publicly available. In these videos, the number of objects ranges from a dozen to a hundred per frame in both monocular and multiple views. The experiments demonstrate that our approaches achieve results comparable to those of state-of-the-art approaches.
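As a generic illustration of the temporal data-association step that trackers like these build on (a toy stand-in, not any of the thesis's three algorithms), greedy nearest-neighbor matching between detections in consecutive frames might look like:

```python
import numpy as np

def greedy_associate(prev_pts, curr_pts, max_dist=5.0):
    """Greedily match previous-frame detections to current-frame detections
    by Euclidean distance; anything left unmatched would start a new track.
    Toy illustration of the data-association subproblem."""
    cost = np.linalg.norm(prev_pts[:, None, :] - curr_pts[None, :, :], axis=2)
    matches, used = {}, set()
    # Visit candidate pairs from cheapest to most expensive.
    for i, j in sorted(np.ndindex(cost.shape), key=lambda ij: cost[ij]):
        if i not in matches and j not in used and cost[i, j] <= max_dist:
            matches[i] = j
            used.add(j)
    return matches

prev_pts = np.array([[0.0, 0.0], [10.0, 0.0]])
curr_pts = np.array([[9.0, 1.0], [1.0, 0.0]])
print(greedy_associate(prev_pts, curr_pts))   # {0: 1, 1: 0}
```

When targets are dense and indistinguishable, exactly this greedy step breaks down under occlusion, which is what motivates the multi-hypothesis, graph-based and joint-Bayesian formulations in the abstract above.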
Van, Thielen Tessa. "From object towards island." Thesis, Konstfack, Institutionen för Konst (K), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:konstfack:diva-5924.
Yang, Fan. "Visual Infrastructure based Accurate Object Recognition and Localization." The Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1492752246062673.
Piñol, Naranjo Mónica. "Reinforcement learning of visual descriptors for object recognition." Doctoral thesis, Universitat Autònoma de Barcelona, 2014. http://hdl.handle.net/10803/283927.
The human visual system is able to recognize an object in an image even if the object is partially occluded, seen from various points of view, in different colors, or independently of the distance to the object. To do this, the eye captures an image and extracts features that are sent to the brain, where the object is recognized. In computer vision, the object recognition field tries to learn from the behaviour of the human visual system to achieve the same goal. Hence, one algorithm is used to identify representative features of the scene (detection), another is used to describe these points (the descriptor), and finally the extracted information is used to classify the object in the scene. The selection of this set of algorithms is a very complicated task and thus a very active research field. In this thesis we focus on selecting/learning the best descriptor for a given image. Several descriptors exist in the state of the art, but choosing the best one is difficult because it depends on the scenes to be used (the dataset) and on the algorithm chosen for classification. We propose a framework based on reinforcement learning and bag of features to choose the best descriptor for a given image. The system can analyse the behaviour of different learning algorithms and descriptor sets. Furthermore, the proposed framework for improving the classification/recognition rate can be used with minor changes in other computer vision fields, such as video retrieval.
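The reinforcement-learning-based descriptor selection described here can be illustrated with a toy epsilon-greedy bandit; the candidate descriptors, rewards and update rule below are hypothetical stand-ins, not the thesis's actual setup:

```python
import random

def select_descriptor(q, eps=0.1):
    """Epsilon-greedy choice over candidate descriptors, given the current
    estimate q of each descriptor's classification reward."""
    if random.random() < eps:
        return random.randrange(len(q))           # explore
    return max(range(len(q)), key=q.__getitem__)  # exploit the best so far

def update(q, counts, arm, reward):
    """Incremental running average of observed reward per descriptor."""
    counts[arm] += 1
    q[arm] += (reward - q[arm]) / counts[arm]

random.seed(0)
descriptors = ["SIFT", "SURF", "BRIEF"]  # hypothetical candidate set
true_acc = [0.6, 0.8, 0.5]               # made-up per-descriptor accuracy
q, counts = [0.0] * 3, [0] * 3
for _ in range(2000):
    arm = select_descriptor(q)
    reward = 1.0 if random.random() < true_acc[arm] else 0.0
    update(q, counts, arm, reward)
best = max(range(3), key=q.__getitem__)
print(descriptors[best], [round(v, 2) for v in q])
```

In this toy run the estimates converge toward the made-up accuracies, so the selector learns to prefer the strongest descriptor; the thesis couples a richer state (the image) to the choice, which a plain bandit ignores.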
Ventura, Royo Carles. "Visual object analysis using regions and local features." Doctoral thesis, Universitat Politècnica de Catalunya, 2016. http://hdl.handle.net/10803/398407.
The first part of this thesis focuses on the analysis of spatial context in semantic image segmentation. First, we review how spatial context has been addressed in the literature by means of local descriptors and spatial aggregation techniques. Starting from the discussion of whether context is beneficial for object recognition, we extend an object/boundary/background segmentation for the spatial aggregation of local descriptors with annotations to a more realistic scenario where object-location hypotheses are used instead of annotations. While the object and background regions represent those respective areas of the image, the boundary is a region around the object, which has turned out to be the region richest in contextual information for object recognition. In addition, we propose a new spatial aggregation technique for the local descriptors inside the object by dividing this region into four subregions. Both contributions have been validated on a semantic segmentation benchmark using a combination of context-dependent and context-independent local descriptors, which lets the models automatically learn whether context is beneficial for each semantic category. The second part of the thesis addresses the problem of semantic segmentation for a set of related images in an uncalibrated multi-view scenario. State-of-the-art semantic segmentation algorithms fail to segment objects correctly across different viewpoints when applied independently to each view. The lack of a large number of annotations available for multi-view segmentation prevents obtaining a model that is robust to viewpoint changes. In this second part, we exploit the spatial correlation among the different viewpoints to obtain a more robust semantic segmentation.
First, we review state-of-the-art techniques in co-clustering, co-segmentation and video segmentation that aim to segment the set of images generically, that is, without considering semantics. Next, we propose a new co-clustering architecture that takes motion information into account, provides a segmentation at multiple resolutions, and improves on state-of-the-art techniques for generic multi-view segmentation. Finally, the proposed multi-view segmentation is combined with the semantic segmentation results, yielding a method for automatic resolution selection and a coherent multi-view semantic segmentation.
Wilson, Susan E. "Perceptual organization and symmetry in visual object recognition." Thesis, University of British Columbia, 1991. http://hdl.handle.net/2429/29802.
Faculty of Science, Department of Computer Science, Graduate.
Wallenberg, Marcus, and Per-Erik Forssén. "A Research Platform for Embodied Visual Object Recognition." Linköpings universitet, Datorseende, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-70769.
Firouzi, Hadi. "Visual non-rigid object tracking in dynamic environments." Thesis, University of British Columbia, 2013. http://hdl.handle.net/2429/44629.
Leeds, Daniel Demeny. "Searching for the Visual Components of Object Perception." Research Showcase @ CMU, 2013. http://repository.cmu.edu/dissertations/313.
Lovell, Kylie Sarah. "Implicit and explicit processes in visual object recognition." Thesis, University of Reading, 2002. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.430835.
Sudderth, Erik B. (Erik Blaine) 1977. "Graphical models for visual object recognition and tracking." Thesis, Massachusetts Institute of Technology, 2006. http://hdl.handle.net/1721.1/34023.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Includes bibliographical references (p. 277-301).
We develop statistical methods which allow effective visual detection, categorization, and tracking of objects in complex scenes. Such computer vision systems must be robust to wide variations in object appearance, the often small size of training databases, and ambiguities induced by articulated or partially occluded objects. Graphical models provide a powerful framework for encoding the statistical structure of visual scenes, and developing corresponding learning and inference algorithms. In this thesis, we describe several models which integrate graphical representations with nonparametric statistical methods. This approach leads to inference algorithms which tractably recover high-dimensional, continuous object pose variations, and learning procedures which transfer knowledge among related recognition tasks. Motivated by visual tracking problems, we first develop a nonparametric extension of the belief propagation (BP) algorithm. Using Monte Carlo methods, we provide general procedures for recursively updating particle-based approximations of continuous sufficient statistics. Efficient multiscale sampling methods then allow this nonparametric BP algorithm to be flexibly adapted to many different applications.
As a particular example, we consider a graphical model describing the hand's three-dimensional (3D) structure, kinematics, and dynamics. This graph encodes global hand pose via the 3D position and orientation of several rigid components, and thus exposes local structure in a high-dimensional articulated model. Applying nonparametric BP, we recover a hand tracking algorithm which is robust to outliers and local visual ambiguities. Via a set of latent occupancy masks, we also extend our approach to consistently infer occlusion events in a distributed fashion. In the second half of this thesis, we develop methods for learning hierarchical models of objects, the parts composing them, and the scenes surrounding them. Our approach couples topic models originally developed for text analysis with spatial transformations, and thus consistently accounts for geometric constraints. By building integrated scene models, we may discover contextual relationships, and better exploit partially labeled training images. We first consider images of isolated objects, and show that sharing parts among object categories improves accuracy when learning from few examples.
Turning to multiple object scenes, we propose nonparametric models which use Dirichlet processes to automatically learn the number of parts underlying each object category, and objects composing each scene. Adapting these transformed Dirichlet processes to images taken with a binocular stereo camera, we learn integrated, 3D models of object geometry and appearance. This leads to a Monte Carlo algorithm which automatically infers 3D scene structure from the predictable geometry of known object categories.
by Erik B. Sudderth.
Ph.D.
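The particle-based message update at the core of the nonparametric BP algorithm described above can be illustrated with a toy 1-D sketch. This is not Sudderth's implementation; the Gaussian random-walk potential, the observation model, and all names below are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

def nbp_message_update(particles, weights, pairwise_sample, likelihood, n_out=200):
    """One nonparametric BP message update on a 1-D state (toy).

    The incoming message is a weighted particle set; we resample it,
    propagate the samples through the pairwise potential, and reweight
    by the local evidence at the target node.
    """
    idx = rng.choice(len(particles), size=n_out, p=weights)
    out = pairwise_sample(particles[idx])   # propagate through the potential
    w = likelihood(out)                     # local evidence
    return out, w / w.sum()                 # normalised outgoing message

# Toy setup: Gaussian random-walk potential, Gaussian observation near 2.0
particles = rng.normal(0.0, 1.0, size=200)
weights = np.full(200, 1.0 / 200)
step = lambda x: x + rng.normal(0.0, 0.5, size=x.shape)
obs = lambda x: np.exp(-0.5 * ((x - 2.0) / 0.7) ** 2)
p, w = nbp_message_update(particles, weights, step, obs)
print(round(float(np.sum(p * w)), 2))  # weighted mean is pulled toward the observation
```

In the full algorithm this update runs over every edge of the graph (e.g. between the rigid hand components) rather than a single chain step.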
Kuo, Michael. "Learning visual object categories from few training examples." Thesis, Massachusetts Institute of Technology, 2011. http://hdl.handle.net/1721.1/66430.
Full text
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 73-74).
During visual perception of complex objects, humans fixate on salient regions of a particular object, moving their gaze from one region to another in order to gain information about that object. The Bayesian Integrate and Shift (BIAS) model is a recently proposed model for learning visual object categories that is modeled after the process of human visual perception, integrating information from within and across fixations. Previous works have described preliminary evaluations of the BIAS model and demonstrated that it can learn new object categories from only a few examples. In this thesis, we introduce and evaluate improvements to the learning algorithm, demonstrate that the model benefits from using information from fixating on multiple regions of a particular object, evaluate the limitations of the model when learning different object categories, and assess the performance of the learning algorithm when objects are partially occluded.
by Michael Kuo.
M.Eng.
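The idea of integrating evidence within and across fixations, as in the abstract above, can be sketched as a naive-Bayes-style log-posterior accumulation. This is a generic simplification, not the BIAS model itself; the arrays and numbers are made up:

```python
import numpy as np

def integrate_fixations(log_prior, fixation_loglik):
    """Accumulate category evidence across fixations (toy Bayesian update).

    log_prior       : (n_classes,) log prior over object categories
    fixation_loglik : (n_fixations, n_classes) log-likelihood of each
                      fixation's features under each category
    """
    log_post = log_prior + fixation_loglik.sum(axis=0)
    return log_post - np.logaddexp.reduce(log_post)  # normalise in log space

# Two categories; three fixations, each slightly favouring category 1
log_prior = np.log(np.array([0.5, 0.5]))
loglik = np.log(np.array([[0.4, 0.6], [0.45, 0.55], [0.3, 0.7]]))
post = np.exp(integrate_fixations(log_prior, loglik))
print(post.argmax())  # -> 1: weak evidence from each fixation combines
```

Each additional fixation sharpens the posterior, which is why such a model can benefit from fixating on multiple regions of the same object.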
Sun, Yaoru. "Hierarchical object-based visual attention for machine vision." Thesis, University of Edinburgh, 2003. http://hdl.handle.net/1842/316.
Full text
Peterson, Jason W. "Visual assessment of object color chroma and colorfulness." Online version of thesis, 1994. http://hdl.handle.net/1850/11868.
Full textWallenberg, Marcus. "Components of Embodied Visual Object Recognition : Object Perception and Learning on a Robotic Platform." Licentiate thesis, Linköpings universitet, Datorseende, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-93812.
Full text
Embodied Visual Object Recognition
Naha, Shujon. "Zero-shot Learning for Visual Recognition Problems." IEEE, 2015. http://hdl.handle.net/1993/31806.
Full text
October 2016
Corradi, Tadeo. "Integrating visual and tactile robotic perception." Thesis, University of Bath, 2018. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.761005.
Full textZoccoli, Sandra L. "Object features and object recognition Semantic memory abilities during the normal aging process /." Ann Arbor, Mich. : ProQuest, 2007. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3288933.
Full text
Title from PDF title page (viewed Nov. 19, 2009). Source: Dissertation Abstracts International, Volume: 68-11, Section: B, page: 7695. Adviser: Alan S. Brown. Includes bibliographical references.
Eren, Kanat Selda. "Visual Object Representations: Effects Of Feature Frequency And Similarity." PhD thesis, METU, 2011. http://etd.lib.metu.edu.tr/upload/12613978/index.pdf.
Full text
… “old” responses for unstudied objects as the number of frequently repeated features (FRFs) on the object increased. In the second experiment, where all features had equal frequency, the similarity of test objects did not affect old/new responses. An evaluation of models of object recognition and categorization against these results showed that they can only partially explain the findings. A comprehensive model of visual object representation and old/new recognition, called CDZ-VIS, is proposed, built on the Convergence-Divergence Zone framework of Damasio (1989). In this framework, co-occurring object features converge onto upper-layer units of a hierarchical representation, which act as binding units; as more objects are displayed, frequently repeated features cause these binding units to group and converge onto higher-level binding units. On the feature frequency and similarity experiments of the present study, the CDZ-VIS model was shown to match the performance of the human participants more closely than two models from the categorization literature.
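The central effect in this abstract, frequently repeated features inflating “old” responses for unstudied objects, can be illustrated with a toy familiarity score. This is an invented simplification for illustration, not the CDZ-VIS model; the features and objects are made up:

```python
from collections import Counter

def familiarity(studied_objects, test_object):
    """Toy familiarity signal: the sum of how often each of the test
    object's features appeared across the studied set. Unstudied objects
    that carry frequently repeated features score high, which is the
    mechanism behind false "old" responses."""
    freq = Counter(f for obj in studied_objects for f in obj)
    return sum(freq[f] for f in test_object)

studied = [{"red", "round", "handle"},
           {"red", "square", "handle"},
           {"red", "round", "spout"}]
new_with_frequent = {"red", "round", "lid"}  # shares frequently repeated features
new_with_rare = {"blue", "oval", "lid"}      # shares none
print(familiarity(studied, new_with_frequent))  # 5  (red x3 + round x2)
print(familiarity(studied, new_with_rare))      # 0
```

A threshold on this score yields an old/new decision, so unstudied objects with more FRFs are more likely to be judged “old”, matching the first experiment's pattern.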