Dissertations / Theses on the topic 'Computer vision; Active'

Consult the top 50 dissertations / theses for your research on the topic 'Computer vision; Active.'


1

Tordoff, Ben. "Active control of zoom for computer vision." Thesis, University of Oxford, 2002. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.270752.

2

Luckman, Adrian John. "Active perception in machine vision." Thesis, University of York, 1991. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.280521.

3

Li, Fuxing. "Active stereo for AGV navigation." Thesis, University of Oxford, 1996. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.338984.

4

Du, Fenglei. "The fundamentals of an active vision system." Thesis, University of Oxford, 1994. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.239358.

5

Onder, Murat. "Face Detection And Active Robot Vision." Master's thesis, METU, 2004. http://etd.lib.metu.edu.tr/upload/2/12605290/index.pdf.

Abstract:
The main task in this thesis is to design a robot vision system with face detection and tracking capability. The thesis therefore comprises two main parts. First, a face must be detected in the images taken from the camera on the robot. This is a demanding real-time image processing task, so timing constraints are critical: a processing rate of 1 frame/second is targeted, which requires a fast face detection algorithm. The Eigenface method and the Subspace LDA (Linear Discriminant Analysis) method were implemented, tested, and compared for face detection, and the Eigenface method proposed by Turk and Pentland was selected. The images are first passed through several preprocessing algorithms, such as skin detection and histogram equalization, to improve performance. After this filtering stage, the candidate face regions are passed to the face detection algorithm to determine whether the image contains a face. Some modifications were applied to the Eigenface algorithm to detect faces more accurately and quickly. Second, the robot must move towards the face in the image, which involves robot motion. The robot used for this purpose is a Pioneer 2-DX8 Plus, a product of ActivMedia Robotics Inc.; only the interfaces needed to move the robot were implemented in the thesis software. The robot must detect faces at different distances and adjust its position according to the distance between the human and the robot. A scaling mechanism is therefore needed, applied either to the training images or to the input image taken from the camera. Because of the timing constraint and the low camera resolution, only a limited number of scales is applied in the face detection process, so faces of people who are very far from or very close to the robot are not detected. The aim was a background-independent face detection system, although the resulting algorithm remains slightly dependent on the background. There are no other constraints in the system.
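As a rough illustration of the Eigenface idea this abstract relies on (a generic sketch, not the thesis's code; the data, dimensions, and component count below are invented):

```python
import numpy as np

# Hypothetical training data: each row is a flattened grayscale face image.
rng = np.random.default_rng(0)
train = rng.random((50, 64 * 64))

# Eigenfaces: principal components of the mean-centred training faces.
mean_face = train.mean(axis=0)
centred = train - mean_face
# SVD of the centred data gives the eigenfaces as right singular vectors.
_, _, vt = np.linalg.svd(centred, full_matrices=False)
eigenfaces = vt[:10]  # keep the 10 leading components

def reconstruction_error(image):
    """Distance from 'face space': small values suggest a face-like patch."""
    x = image - mean_face
    coeffs = eigenfaces @ x          # project onto the eigenface basis
    recon = eigenfaces.T @ coeffs    # reconstruct from the projection
    return np.linalg.norm(x - recon)

# A patch drawn from the training distribution scores much lower than a
# patch far outside it, which is the cue used to accept or reject a face.
print(reconstruction_error(train[0]) < reconstruction_error(np.ones(64 * 64) * 5))
```

In an actual detector the candidate regions produced by skin detection would be rescaled to the training-image size before being scored this way.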
6

Benameur, Kaouthar. "Control strategies for an active vision system." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1997. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape11/PQDD_0003/NQ44363.pdf.

7

Bradshaw, Kevin J. "Surveillance of dynamic scenes with an active vision system." Thesis, University of Oxford, 1994. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.260139.

8

Hoad, Paul. "Active robot vision and its use in object recognition." Thesis, University of Surrey, 1994. http://epubs.surrey.ac.uk/844223/.

Abstract:
Object recognition has been one of the main areas of research into computer vision over the last 20-30 years. Until recently, most of this research was performed on scenes taken using static monocular, binocular or even trinocular cameras. It is believed, however, that by adding the ability to move the look point and concentrate on a region of interest, a more robust and efficient method of vision can be achieved. Recent studies into the ability to provide human-like vision systems for a more active approach to vision have led to the development of a number of robot-controlled vision systems. In this thesis the development of one such system at the University of Surrey, the stereo robot head "Getafix", is described. The design, construction and development of the head and its control system have been undertaken as part of this project with the aim of improving current vision tasks, in particular that of object recognition. In this thesis the design of the control systems, kinematics and control software of the stereo robot head is discussed. A number of simple commissioning experiments are also shown, using the concepts of the robot control developed herein. Camera lens control and calibration are also described. A review of classical primitive-based object recognition systems is given and the development of a novel generic cylindrical object recognition strategy is shown. The use of this knowledge source is demonstrated with other vision processes of colour and stereo. The work on the cylinder recognition strategy and the stereo robot head is finally combined within an active vision framework. A purposive active vision strategy is used to detect cylindrical structures that would otherwise be undetectable by the cylindrical object detection algorithm alone.
9

Alvino, Christopher Vincent. "Multiscale Active Contour Methods in Computer Vision with Applications in Tomography." Diss., Georgia Institute of Technology, 2005. http://hdl.handle.net/1853/6896.

Abstract:
Most applications in computer vision suffer from two major difficulties. The first is that they are notoriously ridden with sub-optimal local minima. The second is that they typically require high computational cost to be solved robustly. The reason for these two drawbacks is that most problems in computer vision, even when well-defined, typically require finding a solution in a very large, high-dimensional space. It is for these two reasons that multiscale methods are particularly well-suited to problems in computer vision. Multiscale methods, by way of looking at the coarse-scale nature of a problem before considering the fine-scale nature, often have the ability to avoid sub-optimal local minima and obtain a more globally optimal solution. In addition, multiscale methods typically enjoy reduced computational cost. This thesis applies novel multiscale active contour methods to several problems in computer vision, especially the simultaneous segmentation and reconstruction of tomography images. In addition, novel multiscale methods are applied to contour registration using minimal surfaces and to the computation of non-linear rotationally invariant optical flow. Finally, a methodology for fast, robust image segmentation is presented that relies on a lower-dimensional image basis derived from an image scale space. The specific advantages of using multiscale methods in each of these problems are highlighted in the various simulations throughout the thesis, particularly their ability to avoid sub-optimal local minima and to solve the problems at a lower overall computational cost.
10

Antonis, Jan. "Development of an active computer vision system for 3 dimensional modelling." Thesis, Queen's University Belfast, 1999. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.301753.

11

Wong, Winnie Sze-Wing. "Design of A Saccadic Active Vision System." Thesis, University of Waterloo, 2006. http://hdl.handle.net/10012/953.

Abstract:
Human vision is remarkable. By limiting the main concentration of high-acuity photoreceptors to the eye's central fovea region, we efficiently view the world by redirecting the fovea between points of interest using eye movements called saccades.

Part I describes a saccadic vision system prototype design. The dual-resolution saccadic camera detects objects of interest in a scene by processing low-resolution image information; it then revisits salient regions at high resolution. The end product is a dual-resolution image in which background information is displayed at low resolution and salient areas are captured at high acuity. This yields a resource-efficient active vision system.

Part II describes CMOS image sensor designs for active vision. Specifically, this discussion focuses on methods to determine regions of interest and achieve high dynamic range on the sensor.
12

Toh, Peng Seng. "Three-dimensional reconstruction by active integration of visual cues." Thesis, Imperial College London, 1990. http://hdl.handle.net/10044/1/46581.

13

Rowe, Simon Michael. "Robust feature search for active tracking." Thesis, University of Oxford, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.318616.

14

Jacobs, Emmerentia. "Deterministic tracking using active contours." Thesis, Link to the online version, 2005. http://hdl.handle.net/10019/1055.

15

Halverson, Timothy E. "An "active vision" computational model of visual search for human-computer interaction." Connect to title online (Scholars' Bank and ProQuest), 2008. http://hdl.handle.net/1794/9174.

Abstract:
Thesis (Ph. D.)--University of Oregon, 2008.
Typescript. Includes vita and abstract. Includes bibliographical references (leaves 185-191). Also available online in Scholars' Bank and in ProQuest, free to University of Oregon users.
16

Halverson, Timothy E., 1971-. "An "active vision" computational model of visual search for human-computer interaction." Thesis, University of Oregon, 2008. http://hdl.handle.net/1794/9174.

Abstract:
xx, 191 p. : ill. (some col.)
Visual search is an important part of human-computer interaction (HCI). The visual search processes that people use have a substantial effect on the time expended and the likelihood of finding the information they seek. This dissertation investigates visual search through experiments and computational cognitive modeling. Computational cognitive modeling is a powerful methodology that uses computer simulation to capture, assert, record, and replay plausible sets of interactions among the many human processes at work during visual search. This dissertation aims to provide a cognitive model of visual search that can be utilized by predictive interface analysis tools, and to do so in a manner consistent with a comprehensive theory of human visual processing, namely active vision. The model accounts for the four questions of active vision, the answers to which are important to both practitioners and researchers in HCI: What can be perceived in a fixation? When do the eyes move? Where do the eyes move? What information is integrated between eye movements? This dissertation presents a principled progression of the development of a computational model of active vision. Three experiments were conducted that investigate the effects of visual layout properties: density, color, and word meaning. The experimental results provide a better understanding of how these factors affect human-computer visual interaction. Three sets of data, two from the experiments reported here, were accurately modeled in the EPIC (Executive Process-Interactive Control) cognitive architecture. This work extends the practice of computational cognitive modeling by (a) informing the process of developing computational models through the use of eye movement data and (b) providing the first detailed instantiation of the theory of active vision in a computational framework.
This instantiation allows us to better understand (a) the effects and interactions of visual search processes and (b) how these visual search processes can be used computationally to predict people's visual search behavior. This research ultimately benefits HCI by giving researchers and practitioners a better understanding of how users visually interact with computers and provides a foundation for tools to predict that interaction. This dissertation includes both previously published and co-authored material.
Adviser: Anthony J. Hornof
17

Li, Yue. "Active Vision through Invariant Representations and Saccade Movements." Ohio University / OhioLINK, 2006. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1149389174.

18

Yang, Christopher Chuan-Chi, 1968-. "Active vision inspection: Planning, error analysis, and tolerance design." Diss., The University of Arizona, 1997. http://hdl.handle.net/10150/282424.

Abstract:
Inspection is a process used to determine whether a component deviates from a given set of specifications. In industry, a coordinate measuring machine (CMM) is usually used to inspect CAD-based models, but inspection using vision sensors has recently drawn more attention because of advances in computer and imaging technologies. In this dissertation, we introduce active vision inspection for CAD-based three-dimensional models. We divide the dissertation into three major components: (i) planning, (ii) error analysis, and (iii) tolerance design. In inspection planning, the inputs are a boundary representation (object-centered representation) and an aspect graph (viewer-centered representation) of the inspected component; the output is a sensor arrangement for dimensioning a set of topologic entities. In planning, we first use geometric reasoning and object-oriented representation to determine a set of topologic entities (measurable entities) to be dimensioned, based on the manufactured features on the component (such as slots, pockets, and holes) and their spatial relationships. Using the aspect graph, we obtain a set of possible sensor settings and determine an optimized set of sensor settings (sensor arrangement) for dimensioning the measurable entities. Since quantization errors and displacement errors are inherent in an active vision system, we analyze and model the density functions of these errors based on their characteristics and use them to determine the accuracy of inspection for a given sensor setting. In addition, we utilize hierarchical interval constraint networks for tolerance design. We redefine network satisfaction and constraint consistency for application in tolerance design and develop new forward and backward propagation techniques for tolerance analysis and tolerance synthesis, respectively.
19

Mahmoodi, Sasan. "A knowledge based computer vision system for skeletal age assessment of children." Thesis, University of Newcastle Upon Tyne, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.245704.

20

Sommerlade, Eric Chris Wolfgang. "Active visual scene exploration." Thesis, University of Oxford, 2011. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.542975.

Abstract:
This thesis addresses information theoretic methods for control of one or several active cameras in the context of visual surveillance. This approach has two advantages. Firstly, any system dealing with real inputs must take into account noise in the measurements and the underlying system model. Secondly, the control of cameras in surveillance often has different, potentially conflicting objectives. Information theoretic metrics not only yield a way to assess the uncertainty in the current state estimate, they also provide means to choose the observation parameters that optimally reduce this uncertainty. The latter property allows comparison of sensing actions with respect to different objectives. This allows specification of a preference for objectives, where the generated control will fulfil these desired objectives accordingly. The thesis provides arguments for the utility of information theoretic approaches to control visual surveillance systems, by addressing the following objectives in particular: Firstly, how to choose a zoom setting of a single camera to optimally track a single target with a Kalman filter. Here emphasis is put on an arbitration between loss of track due to noise in the observation process, and information gain due to higher accuracy after successful observation. The resulting method adds a running average of the Kalman filter’s innovation to the observation noise, which not only ameliorates tracking performance in the case of unexpected target motions, but also provides a higher maximum zoom setting. The second major contribution of this thesis is a term that addresses exploration of the supervised area in an information theoretic manner. The reasoning behind this term is to model the appearance of new targets in the supervised environment, and use this as prior uncertainty about the occupancy of areas currently not under observation. 
Furthermore, this term uses the performance of an object detection method to gauge the information that observations of a single location can yield. Additionally, this thesis shows experimentally that a preference for control objectives can be set using a single scalar value. This linearly combines the objective functions of the two conflicting objectives of detection and exploration, and results in the desired control behaviour. The third contribution is an objective function that addresses classification methods. The thesis shows in detail how the information can be derived that can be gained from the classification of a single target, under consideration of its gaze direction. Quantitative and qualitative validation show the increase in performance when compared to standard methods.
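The zoom-control idea described above, adding a running average of the Kalman filter's innovation to the observation noise, can be sketched in a scalar toy filter (all parameters and the motion model here are invented for illustration; this is not the thesis's implementation):

```python
import numpy as np

def track(measurements, q=0.01, r=0.1, alpha=0.9):
    """Scalar Kalman filter that inflates the observation noise by a
    running average of the squared innovation (sketch of the idea only)."""
    x, p = measurements[0], 1.0   # state estimate and its variance
    innov_avg = 0.0               # running average of squared innovations
    estimates = []
    for z in measurements:
        p += q                                  # predict (static motion model)
        nu = z - x                              # innovation
        innov_avg = alpha * innov_avg + (1 - alpha) * nu ** 2
        r_eff = r + innov_avg                   # inflated observation noise
        k = p / (p + r_eff)                     # Kalman gain
        x += k * nu                             # update state
        p *= 1 - k                              # update variance
        estimates.append(x)
    return np.array(estimates)

# An abrupt target jump inflates r_eff, so the filter temporarily trusts
# its predictions less aggressively, yet still converges to the new level.
rng = np.random.default_rng(1)
z = np.concatenate([np.zeros(20), np.ones(20)]) + 0.05 * rng.standard_normal(40)
est = track(z)
```

The widened effective noise after unexpected motion is what lets the camera keep a higher zoom setting without losing track.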
21

Ivins, James P. "Statistical snakes: active region models." Thesis, University of Sheffield, 1996. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.484310.

22

Fung, Chun Him. "A biomimetic active stereo head with torsional control /." View abstract or full-text, 2006. http://library.ust.hk/cgi/db/thesis.pl?ECED%202006%20FUNG.

23

Curwen, R. W. M. "Dynamic and adaptive contours." Thesis, University of Oxford, 1993. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.239353.

24

Mueller, Martin F. "Physics-driven variational methods for computer vision and shape-based imaging." Diss., Georgia Institute of Technology, 2014. http://hdl.handle.net/1853/54034.

Abstract:
In this dissertation, novel variational optical-flow and active-contour methods are investigated to address challenging problems in computer vision and shape-based imaging. Starting from traditional applications of these methods in computer vision, such as object segmentation, tracking, and detection, this research subsequently applies similar active contour techniques to the realm of shape-based imaging, which is an image reconstruction technique estimating object shapes directly from physical wave measurements. In particular, the first and second parts of this thesis deal with the following two physically inspired computer vision applications. Optical Flow for Vision-Based Flame Detection: Fire motion is estimated using optimal mass transport optical flow, whose motion model is inspired by the physical law of mass conservation, a governing equation for fire dynamics. The estimated motion fields are used to first detect candidate regions characterized by high motion activity, which are then tracked over time using active contours. To classify candidate regions, a neural net is trained on a set of novel motion features, which are extracted from optical flow fields of candidate regions. Coupled Photo-Geometric Object Features: Active contour models for segmentation in thermal videos are presented, which generalize the well-known Mumford-Shah functional. The diffusive nature of heat processes in thermal imagery motivates the use of Mumford-Shah-type smooth approximations for the image radiance. Mumford-Shah's isotropic smoothness constraint is generalized to anisotropic diffusion in this dissertation, where the image gradient is decomposed into components parallel and perpendicular to level set curves describing the object's boundary contour. In a limiting case, this anisotropic Mumford-Shah segmentation energy yields a one-dimensional "photo-geometric" representation of an object which is invariant to translation, rotation and scale.
These properties allow the photo-geometric object representation to be efficiently used as a radiance feature; a recognition-segmentation active contour energy, whose shape and radiance follow a training model obtained by principal component analysis of a training set's shape and radiance features, is finally applied to tracking problems in thermal imagery. The third part of this thesis investigates a physics-driven active contour approach for shape-based imaging. Adjoint Active Contours for Shape-Based Imaging: The goal of this research is to estimate both location and shape of buried objects from surface measurements of waves scattered from the object. These objects' shapes are described by active contours: A misfit energy quantifying the discrepancy between measured and simulated wave amplitudes is minimized with respect to object shape using the adjoint state method. The minimizing active contour evolution requires numerical forward scattering solutions, which are obtained by way of the method of fundamental solutions, a meshfree collocation method. In combination with active contours being implemented as level sets, one obtains a completely meshfree algorithm; a considerable advantage over previous work in this field. With future applications in medical and geophysical imaging in mind, the method is formulated for acoustic and elastodynamic wave processes in the frequency domain.
25

Hallenberg, Johan. "Robot Tool Center Point Calibration using Computer Vision." Thesis, Linköping University, Department of Electrical Engineering, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-9520.

Abstract:

Today, tool center point calibration is mostly done by a manual procedure. This method is very time consuming, and the result may vary depending on how skilled the operators are.

This thesis proposes a new automated iterative method for tool center point calibration of industrial robots, making use of computer vision and image processing techniques. The new method has several advantages over the manual calibration method. Experimental verification has shown that the proposed method is much faster while delivering comparable or even better accuracy. The setup of the proposed method is very easy: only one USB camera connected to a laptop computer is needed, and no contact with the robot tool is necessary during the calibration procedure.

The method can be split into three parts. First, the transformation between the robot wrist and the tool is determined by solving a closed loop of homogeneous transformations. Second, an image segmentation procedure is described for finding point correspondences on a rotationally symmetric robot tool. The image segmentation part is necessary for measuring the camera-to-tool transformation with six degrees of freedom. The last part of the proposed method is an iterative procedure which automates an ordinary four-point tool center point calibration algorithm. The iterative procedure ensures that the accuracy of the tool center point calibration depends only on the accuracy of the camera when registering a movement between two positions.
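The closed loop of homogeneous transformations in the first step can be illustrated with 4x4 matrices (the frame names and numeric values below are invented for the sketch; this is not the thesis's algorithm):

```python
import numpy as np

def rot_z(theta):
    """Homogeneous 4x4 rotation about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    T = np.eye(4)
    T[:2, :2] = [[c, -s], [s, c]]
    return T

def translate(x, y, z):
    """Homogeneous 4x4 translation."""
    T = np.eye(4)
    T[:3, 3] = [x, y, z]
    return T

# Ground-truth (unknown in practice) wrist-to-tool transform.
T_wrist_tool = translate(0.0, 0.05, 0.12) @ rot_z(0.3)

# Known quantities in the loop: base->wrist from robot kinematics,
# base->camera from calibration, camera->tool measured by vision.
T_base_wrist = translate(0.4, 0.1, 0.5) @ rot_z(1.1)
T_base_cam = translate(1.0, 0.0, 0.8)
T_cam_tool = np.linalg.inv(T_base_cam) @ T_base_wrist @ T_wrist_tool

# Closing the loop: base->wrist->tool equals base->cam->tool, so the
# unknown wrist->tool transform follows by inverting the wrist pose.
T_est = np.linalg.inv(T_base_wrist) @ T_base_cam @ T_cam_tool
print(np.allclose(T_est, T_wrist_tool))
```

In the real procedure the camera-to-tool measurement is noisy, which is why the thesis wraps this step in an iterative refinement.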

26

Aragon, Camarasa Gerardo. "A hierarchical active binocular robot vision architecture for scene exploration and object appearance learning." Thesis, University of Glasgow, 2012. http://theses.gla.ac.uk/3640/.

Abstract:
This thesis presents an investigation of a computational model of hierarchical visual behaviours within an active binocular robot vision architecture. The robot vision system is able to localise multiple instances of the same object class, while simultaneously maintaining vergence and directing its gaze to attend to and recognise objects within cluttered, complex scenes. This is achieved by implementing all image analysis in an egocentric symbolic space without creating explicit pixel-space maps and without the need for calibration or other knowledge of the camera geometry. One of the important aspects of the active binocular vision paradigm requires that visual features in both camera eyes be bound together in order to drive visual search to saccade, locate and recognise putative objects or salient locations in the robot's field of view. The system structure is based on the "attentional spotlight" metaphor of biological systems and a collection of abstract and reactive visual behaviours arranged in a hierarchical structure. Several studies have shown that the human brain represents and learns objects for recognition through snapshots of 2-dimensional views of the imaged scene that happen to contain the object of interest during active interaction with (exploration of) the environment. Likewise, psychophysical findings indicate that the primate visual cortex represents common everyday objects by a hierarchical structure of their parts or sub-features and, consequently, recognises them by simple but imperfect 2D approximations of views of their parts. This thesis incorporates the above observations into an active visual learning behaviour in the hierarchical active binocular robot vision architecture. By actively exploring the object viewing sphere (as higher mammals do), the robot vision system automatically synthesises and creates its own part-based object representation from multiple observations, while a human teacher indicates the object and supplies a classification name. 
It is proposed to adopt the computational concepts of a visual learning exploration mechanism that controls the accumulation of visual evidence and directs attention towards spatially salient object parts. The behavioural structure of the binocular robot vision architecture is loosely modelled on the WHAT and WHERE visual streams. The WHERE stream maintains and binds spatial attention on the object part coordinates that egocentrically characterise the location of the object of interest, and extracts spatio-temporal properties of feature coordinates and descriptors. The WHAT stream either determines the identity of an object or triggers a learning behaviour that stores view-invariant feature descriptions of the object part. The robot vision system is therefore capable of performing a collection of different specific visual tasks such as vergence, detection, discrimination, recognition, localisation and multiple same-instance identification. This classification of tasks enables the robot vision system to execute and fulfil specified high-level tasks, e.g. autonomous scene exploration and active object appearance learning.
27

Spica, Riccardo. "Contributions to active visual estimation and control of robotic systems." Thesis, Rennes 1, 2015. http://www.theses.fr/2015REN1S080/document.

Abstract:
As every scientist and engineer knows, running an experiment requires a careful and thorough planning phase. The goal of such a phase is to ensure that the experiment will give the scientist as much information as possible about the process that she/he is observing, so as to minimize the experimental effort (in terms of, e.g., number of trials or duration of each experiment) needed to reach a trustworthy conclusion. Similarly, perception is an active process in which the perceiving agent (be it a human, an animal or a robot) tries its best to maximize the amount of information acquired about the environment using its limited sensing capabilities and resources. In many sensor-based robot applications, the state of a robot can only be partially retrieved from its on-board sensors. State estimation schemes can be exploited for recovering online the “missing information”, which is then fed to any planner/motion controller in place of the actual unmeasurable states. In non-trivial cases, however, state estimation must often cope with nonlinear sensor mappings from the observed environment to the sensor space, which make the estimation convergence and accuracy strongly dependent on the particular trajectory followed by the robot/sensor. For instance, vision-based control techniques such as Image-Based Visual Servoing (IBVS) require some knowledge about the 3-D structure of the scene for a correct execution of the task. However, this 3-D information cannot, in general, be extracted from a single camera image without additional assumptions on the scene. One can exploit a Structure from Motion (SfM) estimation process for reconstructing this missing 3-D information. 
However, the performance of any SfM estimator is known to be highly affected by the trajectory followed by the camera during the estimation process, thus creating a tight coupling between camera motion (needed to, e.g., realize a visual task) and the performance/accuracy of the estimated 3-D structure. In this context, a main contribution of this thesis is the development of an online trajectory optimization strategy that maximizes the convergence rate of a SfM estimator by (actively) affecting the camera motion. The optimization is based on the classical persistence-of-excitation condition used in the adaptive control literature to characterize the well-posedness of an estimation problem. This metric is also strongly related to the Fisher information matrix employed in probabilistic estimation frameworks for similar purposes. We also show how this technique can be coupled with the concurrent execution of an IBVS task using appropriate redundancy resolution and maximization techniques. All of the theoretical results presented in this thesis are validated by an extensive experimental campaign run on a real robotic manipulator equipped with an in-hand camera.
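The persistence-of-excitation idea above can be made concrete: the well-posedness of the estimation is commonly measured through the smallest eigenvalue of an observability Gramian accumulated along the camera trajectory. A minimal numerical sketch (illustrative only, not code from the thesis; the regressor shapes are assumed):

```python
import numpy as np

def excitation_gramian(regressors, dt):
    """Accumulate G = sum_k Omega_k^T Omega_k * dt over a window.

    The regressors are persistently exciting when the smallest
    eigenvalue of G is bounded away from zero; an active strategy
    chooses camera motions that increase this eigenvalue."""
    n = regressors[0].shape[1]
    G = np.zeros((n, n))
    for Om in regressors:
        G += Om.T @ Om * dt
    return G

# Toy trajectory: 50 random 1x2 regressors (two unknown parameters).
rng = np.random.default_rng(0)
regs = [rng.normal(size=(1, 2)) for _ in range(50)]
G = excitation_gramian(regs, dt=0.01)
lam_min = float(np.linalg.eigvalsh(G)[0])  # conditioning of the estimation
```

An active scheme would compare candidate camera velocities by the `lam_min` each induces over the prediction horizon.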
APA, Harvard, Vancouver, ISO, and other styles
28

Defretin, Joseph. "Stratégies de vision active pour la reconnaissance d'objets." Phd thesis, École normale supérieure de Cachan - ENS Cachan, 2011. http://tel.archives-ouvertes.fr/tel-00696044.

Full text
Abstract:
This thesis, carried out in cooperation with ONERA, addresses the active recognition of 3-D objects by an autonomous agent equipped with an observation camera. Whereas in passive recognition the acquisition conditions of the observations are imposed and sometimes generate ambiguities, active recognition exploits the possibility of controlling these acquisition conditions online, during a sequential inference process, in order to resolve the ambiguity. The goal of this work is to establish planning strategies for information acquisition, with a concern for a realistic implementation of active recognition. The statistical learning framework is used to this end. The first part of the work is devoted to learning to plan. Two realistic constraints are taken into account: on the one hand, an imperfect object model that may generate additional ambiguities; on the other hand, the learning budget is costly (in time and energy) and therefore limited. The second part of the work focuses on making the best use of the observations during recognition. The possibility of multi-scale active recognition is studied, to allow interpretation as early as possible in the sequential information-acquisition process. The observations are also used to estimate the pose of the object robustly, so as to ensure consistency between the planned acquisition conditions and those actually reached by the visual agent.
APA, Harvard, Vancouver, ISO, and other styles
29

Chaumette, Francois. "De la perception à l'action : l'asservissement visuel, de l'action à la perception : la vision active." Habilitation à diriger des recherches, Université Rennes 1, 1998. http://tel.archives-ouvertes.fr/tel-00843890.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Eicher, Anton. "Active Shape Model Segmentation of Brain Structures in MR Images of Subjects with Fetal Alcohol Spectrum Disorder." Thesis, University of Cape Town, 2010. http://pubs.cs.uct.ac.za/archive/00000637/.

Full text
Abstract:
Fetal Alcohol Spectrum Disorder (FASD) is the most common form of preventable mental retardation worldwide. This condition affects children whose mothers excessively consume alcohol whilst pregnant. FASD can be identified by physical and mental defects, such as stunted growth, facial deformities, cognitive impairment, and behavioural abnormalities. Magnetic Resonance Imaging provides a non-invasive means to study the neural correlates of FASD. One such approach aims to detect brain abnormalities through an assessment of volume and shape of sub-cortical structures on high-resolution MR images. Two brain structures of interest are the Caudate Nucleus and Hippocampus. Manual segmentation of these structures is time-consuming and subjective. We therefore present a method for automatically segmenting the Caudate Nucleus and Hippocampus from high-resolution MR images captured as part of an ongoing study into the neural correlates of FASD. Our method incorporates an Active Shape Model (ASM), which is used to learn shape variation from manually segmented training data. A discrete Geometrically Deformable Model (GDM) is first deformed to fit the relevant structure in each training set. The vertices belonging to each GDM are then used as 3D landmark points - effectively generating point correspondence between training models. An ASM is then created from the landmark points. This ASM is only able to deform to fit structures with similar shape to those found in the training data. There are many variations of the standard ASM technique - each suited to the segmentation of data with particular characteristics. Experiments were conducted on the image search phase of ASM segmentation, in order to find the technique best suited to segmentation of the research data. Various popular image search techniques were tested, including an edge detection method and a method based on grey-profile Mahalanobis distance measurement. 
A heuristic image search method, especially designed to target Caudate Nuclei and Hippocampi, was also developed and tested. This method was extended to include multisampling of voxel profiles. ASM segmentation quality was evaluated according to various quantitative metrics, including: overlap, false positives, false negatives, mean squared distance and Hausdorff distance. Results show that ASMs that use the heuristic image search technique, without multisampling, produce the most accurate segmentations. Mean overlap for segmentation of the various target structures ranged from 0.76 to 0.82. Mean squared distance ranged from 0.72 to 0.76 - indicating sub-1 mm accuracy, on average. Mean Hausdorff distance ranged from 2.7 mm to 3.1 mm. An ASM constructed using our heuristic technique will enable researchers to quickly, reliably, and automatically segment test data for use in the FASD study - thereby facilitating a better understanding of the effects of this unfortunate condition.
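The evaluation metrics named above (overlap, false positives, false negatives) are straightforward to compute on binary masks. A small illustrative sketch, not code from the thesis:

```python
import numpy as np

def segmentation_metrics(pred, truth):
    """Overlap (Jaccard), false-positive and false-negative rates
    between two binary segmentation masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return {
        "overlap": inter / union,
        "false_pos": np.logical_and(pred, ~truth).sum() / pred.sum(),
        "false_neg": np.logical_and(~pred, truth).sum() / truth.sum(),
    }

truth = np.zeros((8, 8), bool); truth[2:6, 2:6] = True  # 4x4 reference square
pred = np.zeros((8, 8), bool);  pred[3:7, 2:6] = True   # same square, shifted 1 row
m = segmentation_metrics(pred, truth)
```

For the one-row shift above the overlap is 12/20 = 0.6, with a quarter of the predicted voxels false positives and a quarter of the true voxels missed.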
APA, Harvard, Vancouver, ISO, and other styles
31

Nelson, Eric D. "Zoom techniques for achieving scale invariant object tracking in real-time active vision systems /." Online version of the thesis, 2006. https://ritdml.rit.edu/dspace/handle/1850/2620.

Full text
APA, Harvard, Vancouver, ISO, and other styles
32

Marchand, Eric. "Stratégies de perception par vision active pour la reconstruction et l'exploration de scènes statiques." Phd thesis, Université Rennes 1, 1996. http://tel.archives-ouvertes.fr/tel-00843873.

Full text
Abstract:
This work contributes to the problem of scene reconstruction and exploration in an active vision context. As the basis of the reconstruction process, we chose a method that consists in constraining the camera motion so as to obtain a precise and robust estimation of parametric geometric primitives such as segments and cylinders. Beyond this continuous aspect of the reconstruction process, namely the estimation of the primitive parameters, strategies must be defined for reconstructing and exploring the scene, which is assumed to be composed of segments, polyhedra and cylinders. This reconstruction is event-driven and is guided by the discovery of new primitives in the image. The approach we have defined consists in automatically selecting the relevant image information and then successively focusing the camera on the different primitives of the scene in order to recognize and then reconstruct them. The first stage of exploration, which includes 3-D reconstruction, incrementally reconstructs all the primitives that appear in the camera's field of view. We call this phase local exploration, since it relies only on locally available information. It is based on a hypothesis prediction/verification approach managed with Bayesian networks. This approach yields a higher-level representation of the objects under consideration while handling local occlusion problems. On the other hand, once all the previously observed primitives have been reconstructed, a different strategy must be applied in order to focus the camera on areas of the scene that have not yet been observed. This global exploration stage ensures the completeness of the reconstruction. 
This method relies on the multi-scale ICM optimization of a suitably modeled cost function that takes the obstacles in the scene into account. Finally, the developed algorithms were specified and implemented in the synchronous language Signal, allowing the continuous/event-driven duality inherent to this type of algorithm to be integrated within a single formalism, Signal and SignalGTi. The methods we developed were implemented on the robotic vision cell at Irisa. They make it possible to reconstruct, in real time, a 3-D environment composed of several primitives in a precise, robust, complete and fully autonomous way.
APA, Harvard, Vancouver, ISO, and other styles
33

Ulusoy, Ilkay. "Active Stereo Vision: Depth Perception For Navigation, Environmental Map Formation And Object Recognition." Phd thesis, METU, 2003. http://etd.lib.metu.edu.tr/upload/12604737/index.pdf.

Full text
Abstract:
Stereo-vision-based navigation and mapping is used in very few mobile robotic applications, because dealing with stereo images is hard and very time consuming. Despite all the problems, stereo vision remains one of the most important means for a mobile robot to know its world, because imaging provides much more information than most other sensors. Real robotic applications are very complicated because, besides the problem of finding how the robot should behave to complete the task at hand, the problems faced while controlling the robot's internal parameters bring a high computational load. Thus, it is preferable to find the strategy to be followed in a simulated world and then apply it on a real robot for real applications. In this study, we describe an algorithm for object recognition and cognitive map formation using stereo image data in a 3D virtual world where 3D objects and a robot with an active stereo imaging system are simulated. The stereo imaging system is simulated so that the actual human visual system properties are parameterized. Only the stereo images obtained from this world are supplied to the virtual robot. By applying our disparity algorithm, a depth map for the current stereo view is extracted. Using the depth information for the current view, a cognitive map of the environment is updated gradually while the virtual agent explores the environment. The agent explores its environment in an intelligent way using the current view and the environmental map information obtained so far. During exploration, if a new object is observed, the robot turns around it, obtains stereo images from different directions and extracts a 3D model of the object. Using the available set of possible objects, it recognizes the object.
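The depth-map step mentioned above rests on the standard rectified-stereo relation Z = f·B/d (focal length times baseline over disparity). A toy sketch with made-up camera parameters, purely for illustration:

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Rectified pinhole stereo: depth Z = f * B / d.

    A large disparity means a close point; zero disparity would put
    the point at infinity, so it is rejected."""
    if disparity_px <= 0:
        raise ValueError("zero/negative disparity: point at infinity or bad match")
    return focal_px * baseline_m / disparity_px

# Hypothetical camera: 640 px focal length, 10 cm baseline.
z = depth_from_disparity(disparity_px=32.0, focal_px=640.0, baseline_m=0.10)
# z is about 2.0 metres
```

Running this per matched pixel is what turns a disparity image into the depth map used to update the cognitive map.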
APA, Harvard, Vancouver, ISO, and other styles
34

Kihlström, Helena. "Active Stereo Reconstruction using Deep Learning." Thesis, Linköpings universitet, Institutionen för medicinsk teknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-158276.

Full text
Abstract:
Depth estimation using stereo images is an important task in many computer vision applications. A stereo camera contains two image sensors that observe the scene from slightly different viewpoints, making it possible to find the depth of the scene. An active stereo camera also uses a laser projector that projects a pattern into the scene. The advantage of the laser pattern is the additional texture that gives better depth estimations in dark and textureless areas.  Recently, deep learning methods have provided new solutions producing state-of-the-art performance in stereo reconstruction. The aim of this project was to investigate the behavior of a deep learning model for active stereo reconstruction, when using data from different cameras. The model is self-supervised, which solves the problem of having enough ground truth data for training the model. It instead uses the known relationship between the left and right images to let the model learn the best estimation. The model was separately trained on datasets from three different active stereo cameras. The three trained models were then compared using evaluation images from all three cameras. The results showed that the model did not always perform better on images from the camera that was used for collecting the training data. However, when comparing the results of different models using the same test images, the model that was trained on images from the camera used for testing gave better results in most cases.
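The self-supervised training signal described above, which uses the known left/right relationship instead of ground-truth depth, is typically a photometric reconstruction loss: warp one image by the predicted disparity and compare it with the other. A 1-D toy sketch (illustrative only, not the network or loss from the thesis):

```python
import numpy as np

def photometric_loss(left, right, disparity):
    """Reconstruct the left scanline by sampling the right one at
    x - disparity, then measure the mean absolute photometric error."""
    w = left.shape[0]
    xs = np.clip(np.arange(w) - disparity, 0, w - 1)
    warped = right[np.round(xs).astype(int)]   # nearest-neighbour warp
    return float(np.abs(left - warped).mean())

# Synthetic scanlines: the left view sees the right one shifted by 3 px.
right = np.arange(16, dtype=float)
left = np.concatenate([np.zeros(3), np.arange(13.0)])

loss_true = photometric_loss(left, right, 3.0)   # correct disparity
loss_wrong = photometric_loss(left, right, 0.0)  # wrong disparity
```

A network trained this way simply adjusts its disparity output until the warped image explains the other view, which is why no labeled depth is needed.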
APA, Harvard, Vancouver, ISO, and other styles
35

Trujillo-Romero, Felipe De Jesus. "Modélisation et reconnaissance active d'objets 3D de forme libre par vision en robotique." Phd thesis, Institut National Polytechnique de Toulouse - INPT, 2008. http://tel.archives-ouvertes.fr/tel-00842693.

Full text
Abstract:
This thesis concerns robotics in the service of humans. A robot companion will have to manipulate everyday 3-D objects (bottle, glass...), recognized and located from data acquired by sensors mounted on the robot. We exploit vision, monocular or stereo. To handle manipulation from visual data, two representations must first be built for each object: a 3-D geometric model, essential for controlling the grasp, and a visual appearance model, necessary for recognition. This thesis therefore deals with learning these representations, and then proposes an active approach for recognizing objects in images acquired by the on-board cameras. Modeling is addressed for an isolated 3-D object placed on a table; we exploit 3-D data acquired by a stereo sensor mounted on a manipulator arm; the arm moves the sensor around the object to acquire N images, which are used to build a triangular-mesh model. We first propose an original approach for registering the partial views of the object, based on pseudo-color information generated from the 3-D points acquired on the object to be learned; then a simple and fast method, based on spherical parameterization, is proposed to build a triangular mesh from the registered views merged into a 3-D point cloud. For active recognition, we exploit a single camera. The appearance model of each object is likewise learned by moving this sensor around the isolated object placed on a table. This model is thus made of several views; in each one, (1) the silhouette of the object is extracted with an active contour, and (2) several descriptors are extracted, either global (color, silhouette signature, shape context) or local (interest points, color or shape context in regions). 
During recognition, the scene may contain a single isolated object or several in a heap, possibly including objects that were never learned; we propose an active, incremental approach that updates a set of probabilities P(Obj_i), i = 1 to N+1, where N objects have been learned; unknown objects are assigned to class N+1; P(Obj_i) gives the probability that an object of class i is present in the scene. At each step the best sensor position is selected by maximizing the mutual information. Numerous results on synthetic and real images have validated this approach.
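The mutual-information criterion for choosing the next sensor position can be illustrated with a tiny discrete example: pick the viewpoint whose observation is expected to reduce the class-posterior entropy the most. A sketch with hypothetical likelihood tables, not data or code from the thesis:

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def best_view(prior, likelihoods):
    """Next-best-view: choose the viewpoint with the lowest expected
    posterior entropy, i.e. the highest mutual information between
    the observation and the object class."""
    best_v, best_h = 0, np.inf
    for v, L in enumerate(likelihoods):      # L[z, c] = P(obs z | class c) at view v
        pz = L @ prior                       # predictive distribution over observations
        h = sum(pz[z] * entropy(L[z] * prior / (L[z] * prior).sum())
                for z in range(L.shape[0]))
        if h < best_h:
            best_v, best_h = v, h
    return best_v

prior = np.array([0.5, 0.5])                       # two learned object classes
views = [np.array([[0.5, 0.5], [0.5, 0.5]]),       # ambiguous viewpoint
         np.array([[0.9, 0.1], [0.1, 0.9]])]       # discriminative viewpoint
chosen = best_view(prior, views)
```

The ambiguous viewpoint leaves the posterior at its prior entropy, so the discriminative one is selected.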
APA, Harvard, Vancouver, ISO, and other styles
36

Kargén, Rolf. "Utveckling av ett active vision system för demonstration av EDSDK++ i tillämpningar inom datorseende." Thesis, Linköpings universitet, Datorseende, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-107186.

Full text
Abstract:
Computer vision is a rapidly growing, interdisciplinary field whose applications are taking an increasingly prominent role in today's society. With an increased interest in computer vision there is also an increasing need to be able to control cameras connected to computer vision systems. At the division of computer vision, at Linköping University, the framework EDSDK++ has been developed to remotely control digital cameras made by Canon Inc. The framework is very comprehensive and contains a large amount of features and configuration options. The system is therefore largely still relatively untested. This thesis aims to develop a demonstrator to EDSDK++ in the form of a simple active vision system, which utilizes real-time face detection in order to control a camera tilt, and a camera mounted on the tilt, to follow, zoom in and focus on a face or a group of faces. A requirement was that the OpenCV library would be used for face detection and EDSDK++ would be used to control the camera. Moreover, an API to control the camera tilt was to be developed. During development, different methods for face detection were investigated. In order to improve performance, multiple, parallel face detectors using multithreading, were used to scan an image from different angles. Both experimental and theoretical approaches were made to determine the parameters needed to control the camera and camera tilt. The project resulted in a fully functional demonstrator, which fulfilled all requirements.
APA, Harvard, Vancouver, ISO, and other styles
37

Hoffmann, McElory Roberto. "Stochastic visual tracking with active appearance models." Thesis, Stellenbosch : University of Stellenbosch, 2009. http://hdl.handle.net/10019.1/1381.

Full text
Abstract:
Thesis (PhD (Applied Mathematics))--University of Stellenbosch, 2009.
ENGLISH ABSTRACT: In many applications, an accurate, robust and fast tracker is needed, for example in surveillance, gesture recognition, tracking lips for lip-reading and creating an augmented reality by embedding a tracked object in a virtual environment. In this dissertation we investigate the viability of a tracker that combines the accuracy of active appearance models with the robustness of the particle filter (a stochastic process); we call this combination the PFAAM. In order to obtain a fast system, we suggest local optimisation as well as using active appearance models fitted with non-linear approaches. Active appearance models use both contour (shape) and greyscale information to build a deformable template of an object. They are typically accurate, but not necessarily robust, when tracking contours. A particle filter is a generalisation of the Kalman filter. In a tutorial style, we show how the particle filter is derived as a numerical approximation for the general state estimation problem. The algorithms are tested for accuracy, robustness and speed on a PC, in an embedded environment and by tracking in 3D. The algorithms run in real time on a PC and near real time in our embedded environment. In both cases, good accuracy and robustness are achieved, even if the tracked object moves fast against a cluttered background, and for uncomplicated occlusions.
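The particle filter at the core of the PFAAM can be illustrated with a minimal bootstrap filter for a 1-D state: predict each hypothesis with a motion model, reweight by the measurement likelihood, and resample. An illustrative sketch under simple Gaussian assumptions, not the dissertation's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

def particle_filter_step(particles, weights, measurement, motion_std, meas_std):
    """One predict/update/resample cycle of a bootstrap particle filter
    for a 1-D state observed directly with Gaussian noise."""
    # Predict: propagate each hypothesis through a random-walk motion model.
    particles = particles + rng.normal(0.0, motion_std, size=particles.shape)
    # Update: reweight by the Gaussian measurement likelihood.
    weights = weights * np.exp(-0.5 * ((measurement - particles) / meas_std) ** 2)
    weights = weights / weights.sum()
    # Resample: draw particles proportionally to their weights.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

particles = rng.uniform(-10, 10, 500)          # initially know nothing
weights = np.full(500, 1.0 / 500)
for z in [2.0, 2.1, 1.9, 2.0]:                 # noisy measurements around x = 2
    particles, weights = particle_filter_step(particles, weights, z, 0.2, 0.5)
estimate = float(particles.mean())
```

In the PFAAM setting the state would be the AAM shape/pose parameters and the likelihood an image-matching score, but the cycle is the same.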
APA, Harvard, Vancouver, ISO, and other styles
38

Sundaramoorthi, Ganesh. "Global Optimizing Flows for Active Contours." Diss., Georgia Institute of Technology, 2007. http://hdl.handle.net/1853/16145.

Full text
Abstract:
This thesis makes significant contributions to the object detection problem in computer vision. The object detection problem is, given a digital image of a scene, to detect the relevant object in the image. One technique for performing object detection, called "active contours", optimizes a constructed energy that is defined on contours (closed curves) and is tailored to image features. An optimization method can be used to perform the optimization of the energy, and thereby deform an initially placed contour to the relevant object. The typical optimization technique used in almost every active contour paper is evolving the contour by the energy's gradient descent flow, i.e., the steepest descent flow, in order to drive the initial contour to (hopefully) the minimum curve. The problem with this technique is that often times the contour becomes stuck in a sub-optimal and undesirable local minimum of the energy. This problem can be partially attributed to the fact that the gradient flows of these energies make use of only local image and contour information. By local, we mean that in order to evolve a point on the contour, only information local to that point is used. Therefore, in this thesis, we introduce a new class of flows that are global in that the evolution of a point on the contour depends on global information from the entire curve. These flows help avoid a number of problems with traditional flows including helping in avoiding undesirable local minima. We demonstrate practical applications of these flows for the object detection problem, including applications to both image segmentation and visual object tracking.
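The purely local gradient-descent evolution described above can be illustrated with the simplest such flow, discrete curve shortening, where each vertex moves toward the midpoint of its neighbours; this is exactly the kind of local rule the thesis contrasts with its global flows. An illustrative sketch, not code from the thesis:

```python
import numpy as np

def curve_shortening_step(pts, dt):
    """One explicit gradient-descent step on contour length: each vertex
    moves toward the midpoint of its two neighbours (a discrete
    curvature flow, driven only by local contour information)."""
    prev_ = np.roll(pts, 1, axis=0)
    next_ = np.roll(pts, -1, axis=0)
    return pts + dt * (0.5 * (prev_ + next_) - pts)

# A noisy circle: the flow smooths the wiggles and shrinks the contour.
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
pts = np.stack([np.cos(t), np.sin(t)], axis=1)
pts = pts + 0.05 * np.sin(8 * t)[:, None]      # high-frequency perturbation
for _ in range(100):
    pts = curve_shortening_step(pts, dt=0.5)
radius = float(np.linalg.norm(pts - pts.mean(0), axis=1).mean())
```

Because each update sees only two neighbours, nothing prevents such a flow from settling on a spurious nearby edge, which is the local-minimum failure mode motivating the global flows.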
APA, Harvard, Vancouver, ISO, and other styles
39

Appia, Vikram VijayanBabu. "Non-local active contours." Diss., Georgia Institute of Technology, 2012. http://hdl.handle.net/1853/44739.

Full text
Abstract:
This thesis deals with image segmentation problems that arise in various computer vision related fields such as medical imaging, satellite imaging, video surveillance, recognition and robotic vision. More specifically, this thesis deals with a special class of image segmentation technique called Snakes or Active Contour Models. In active contour models, image segmentation is posed as an energy minimization problem, where an objective energy function (based on certain image related features) is defined on the segmenting curve (contour). Typically, a gradient descent energy minimization approach is used to drive the initial contour towards a minimum for the defined energy. The drawback associated with this approach is that the contour has a tendency to get stuck at undesired local minima caused by subtle and undesired image features/edges. Thus, active contour based curve evolution approaches are very sensitive to initialization and noise. The central theme of this thesis is to develop techniques that can make active contour models robust against certain classes of local minima by incorporating global information in energy minimization. These techniques lead to energy minimization with global considerations; we call these models -- 'Non-local active contours'. In this thesis, we consider three widely used active contour models: 1) Edge- and region-based segmentation model, 2) Prior shape knowledge based segmentation model, and 3) Motion segmentation model. We analyze the traditional techniques used for these models and establish the need for robust models that avoid local minima. We address the local minima problem for each model by adding global image considerations.
APA, Harvard, Vancouver, ISO, and other styles
40

Flandin, Grégory. "Modélisation probabiliste et exploration visuelle autonome pour la reconstruction de scènes inconnues." Phd thesis, Université Rennes 1, 2001. http://tel.archives-ouvertes.fr/tel-00843884.

Full text
Abstract:
This work deals with the reconstruction of unknown scenes: the aim is to interpret visual information and generate the viewpoints that allow a map of the environment to be built progressively. The problem is decomposed hierarchically into three functionalities. First, the system determines the sequence of actions leading to an inventory of all the objects in the scene. This problem belongs to the very general context of object search. The approach we present is based on a probabilistic description of the occupancy of the scene by objects. The search then consists in generating a sequence of observations leading to probabilities close to 1 where an object is present and close to 0 elsewhere. We develop several strategies to this end. Second, exploration is focused on each object in order to improve its description. We present an object model based on a mixture of stochastic and bounded-error models, which represents the approximate envelope of the object while accounting for localization uncertainties. We develop an online estimation algorithm for this model and design a real-time optimal exploration process based on minimizing the localization uncertainty of the observed object. Finally, the last functionality concerns the tracking of the set-points used to move the camera while following the object of interest. This problem is solved by visual servoing, for which we study the potential of global-camera/local-camera cooperation.
APA, Harvard, Vancouver, ISO, and other styles
41

Ben, Hamadou Achraf. "Contribution à la cartographie 3D des parois internes de la vessie par cystoscopie à vision active." Phd thesis, Institut National Polytechnique de Lorraine - INPL, 2011. http://tel.archives-ouvertes.fr/tel-00628292.

Full text
Abstract:
Cystoscopy is currently the reference clinical examination for the visual exploration of the internal walls of the bladder. The cystoscope (the instrument used for this examination) acquires a video sequence of the epithelial walls of the bladder. However, each image of the video sequence covers only a small surface of a few square centimeters of the wall. The work carried out in this thesis aims to build a 3-D map that faithfully reproduces the shapes and textures of the internal bladder walls. Such a representation of the inside of the bladder would improve the interpretation of the data acquired during a cystoscopic examination. To reach this goal, a new flexible algorithm is proposed for calibrating active-vision cystoscopic systems. This algorithm provides the parameters required for the accurate reconstruction of 3-D points on the surface portion imaged at each given instant of the cystoscopic video sequence. Thus, for each acquisition of the video sequence, a set of a few 3-D/2-D points and a 2-D image are available. The goal of the second algorithm proposed in this thesis is to bring all the data obtained for a sequence into a global frame, in order to generate a 3-D point cloud and a 2-D panoramic image representing, respectively, the 3-D shape and the texture of the whole wall imaged in the video sequence. This 3-D mapping method allows the simultaneous estimation of the rigid 3-D transformations and perspective 2-D transformations linking, respectively, the cystoscope positions and the images of consecutive acquisition pairs. The results obtained on realistic bladder phantoms show that these algorithms compute 3-D surfaces reproducing the shapes to be recovered.
APA, Harvard, Vancouver, ISO, and other styles
42

Dambreville, Samuel. "Statistical and geometric methods for shape-driven segmentation and tracking." Diss., Atlanta, Ga. : Georgia Institute of Technology, 2008. http://hdl.handle.net/1853/22707.

Full text
Abstract:
Thesis (Ph. D.)--Electrical and Computer Engineering, Georgia Institute of Technology, 2008.
Committee Chair: Allen Tannenbaum; Committee Member: Anthony Yezzi; Committee Member: Marc Niethammer; Committee Member: Patricio Vela; Committee Member: Yucel Altunbasak.
APA, Harvard, Vancouver, ISO, and other styles
43

Li, Xin. "Multi-label Learning under Different Labeling Scenarios." Diss., Temple University Libraries, 2015. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/350482.

Full text
Abstract:
Computer and Information Science
Ph.D.
Traditional multi-class classification problems assume that each instance is associated with a single label from a category set Y where |Y| > 2. Multi-label classification generalizes multi-class classification by allowing each instance to be associated with multiple labels from Y. In many real-world data analysis problems, data objects can be assigned to multiple categories and hence produce multi-label classification problems. For example, an image for object categorization can be labeled as 'desk' and 'chair' simultaneously if it contains both objects. A news article discussing the effect of the Olympic games on the tourism industry might belong to multiple categories such as 'sports', 'economy', and 'travel', since it may cover multiple topics. Regardless of the approach used, multi-label learning in general requires a sufficient amount of labeled data to recover high-quality classification models. However, due to label sparsity, i.e. each instance only carries a small number of labels from the label set Y, it is difficult to prepare sufficient well-labeled data for each class. Many approaches have been developed in the literature to overcome this challenge by exploiting label correlation or label dependency. In this dissertation, we propose a probabilistic model to capture the pairwise interactions between labels so as to alleviate the label sparsity. Besides the traditional setting, which assumes the training data is fully labeled, we also study multi-label learning under other scenarios. For instance, training data can be unreliable due to missing values; a conditional Restricted Boltzmann Machine (CRBM) is proposed to address this challenge. Furthermore, labeled training data can be very scarce due to the cost of labeling, while unlabeled data are abundant. We propose two novel multi-label learning algorithms in the active learning setting to address this issue, one for the standard single-level problem and one for the hierarchical problem.
Our empirical results on multiple multi-label data sets demonstrate the efficacy of the proposed methods.
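As an illustration of the active setting described in this abstract, a common query strategy scores each unlabeled instance by the mean binary entropy of its predicted label probabilities and sends the most uncertain ones for annotation. This is a generic sketch of uncertainty sampling, not the dissertation's algorithms; the function names are ours:

```python
import numpy as np

def mean_label_entropy(probs):
    """Per-instance uncertainty: mean binary entropy over the label set.
    probs: (n_instances, n_labels) predicted label probabilities."""
    p = np.clip(probs, 1e-12, 1 - 1e-12)
    ent = -(p * np.log(p) + (1 - p) * np.log(1 - p))
    return ent.mean(axis=1)

def select_queries(probs, k):
    """Pick the k most uncertain unlabeled instances to send for annotation."""
    scores = mean_label_entropy(probs)
    return np.argsort(scores)[::-1][:k]

probs = np.array([[0.99, 0.01, 0.98],   # confident instance
                  [0.50, 0.55, 0.45],   # very uncertain instance
                  [0.90, 0.10, 0.60]])  # mildly uncertain instance
queried = select_queries(probs, 1)      # index of the instance to label next
```

Approaches that exploit label correlation, as the dissertation does, would refine such a score rather than treat the labels independently.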
Temple University--Theses
APA, Harvard, Vancouver, ISO, and other styles
44

Nain, Delphine. "Scale-based decomposable shape representations for medical image segmentation and shape analysis." Diss., Available online, Georgia Institute of Technology, 2006, 2006. http://etd.gatech.edu/theses/available/etd-11192006-184858/.

Full text
Abstract:
Thesis (Ph. D.)--Computing, Georgia Institute of Technology, 2007.
Aaron Bobick, Committee Chair ; Allen Tannenbaum, Committee Co-Chair ; Greg Turk, Committee Member ; Steven Haker, Committee Member ; W. Eric. L. Grimson, Committee Member.
APA, Harvard, Vancouver, ISO, and other styles
45

Veyret, Morgan. "Un guide virtuel autonome pour la description d'un environnement réel dynamique: interaction entre la perception et la prise de décision." Phd thesis, Université de Bretagne occidentale - Brest, 2009. http://tel.archives-ouvertes.fr/tel-00376176.

Full text
Abstract:
Classically, augmented reality consists in annotating fixed objects for a user moving within a real environment. The work presented in this thesis addresses the use of augmented reality to describe dynamic objects whose behavior is only partly predictable. We are particularly interested in the problems raised by the dynamic nature of the environment regarding: 1) the description of the real world (adapting the explanations provided by the virtual guide to the evolution of the environment); 2) the perception of the real world (perceiving the environment to be explained in real time using cameras).
Describing the real world consists in the generation of a presentation by the virtual guide. This generation rests on two points: a priori knowledge in the form of explanations, and a behavior described by a hierarchical automaton. We regard the guided tour as the joint evolution of the virtual guide's behavior and the explanations it provides to visitors. An explanation describes, as a graph, the sequencing of discourse elements on a given topic. Each of these elements describes an indivisible discourse unit specifying the use of the different modalities (speech, gestures, expression, ...) in the form of a script. The execution of an explanation graph is carried out by the behavior, which integrates the notion of interruption. When an explanation process is interrupted, it is suspended and the current topic of the guided tour is re-evaluated. This re-evaluation relies on a set of experts voting for the available explanations, each from a particular point of view. The vote is based on the current context of the guided tour (history, elapsed/remaining time, ...) and on the state of the real environment.
Perception consists in building and updating a representation of the environment. This is done in real time through the cooperation of different perception routines. The complexity of the observed environment (amount of information and variations in lighting conditions) prevents a complete analysis of the video stream. We propose to overcome this problem by using adapted information-gathering strategies. These perception strategies are implemented by certain routines through the choice and parameterization of the processing they perform. We present a minimal set of routines needed to build a representation of the environment that is usable for describing that environment. This system rests on three perception strategies: vigilance, which coordinates detection processing in time and space; tracking, which updates the spatial properties of the entities existing in the representation; and recognition, whose role is to identify these entities. The effectiveness of the perception strategies presupposes an interaction between the decision making (generating the presentation) and the perception (building a representation of the environment) of our autonomous virtual actor. We propose to implement this interaction through the representation of the environment and through the queries made by the decision-making process on this representation.
We conducted experiments to demonstrate how the different aspects of our proposal work and to validate them under controlled conditions. This work is applied to a concrete case of a complex, dynamic real environment within the ANR SIRENE project. This application brings out the questions related to our problem and shows the relevance of our approach in the context of presenting a marine aquarium at Océanopolis.
APA, Harvard, Vancouver, ISO, and other styles
46

Sörsäter, Michael. "Active Learning for Road Segmentation using Convolutional Neural Networks." Thesis, Linköpings universitet, Datorseende, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-152286.

Full text
Abstract:
In recent years, the development of Convolutional Neural Networks has enabled high-performing semantic segmentation models. Generally, these deep-learning-based segmentation methods require a large amount of annotated data. Acquiring such annotated data for semantic segmentation is a tedious and expensive task. Within machine learning, active learning involves the selection of new data in order to limit the amount of annotated data needed. In active learning, the model is trained for several iterations, and additional samples that the model is uncertain about are selected. The model is then retrained with the additional samples and the process is repeated. In this thesis, an active learning framework has been applied to road segmentation, i.e. semantic segmentation of objects related to road scenes. The uncertainty of the samples is estimated with Monte Carlo dropout. In Monte Carlo dropout, several dropout masks are applied to the model and the variance is captured, working as an estimate of the model's uncertainty. Other metrics to rank uncertainty evaluated in this work are: a baseline method that selects samples randomly, the entropy of the default predictions, and three variations/extensions of Monte Carlo dropout. Both the active learning framework and the uncertainty estimation are implemented in the thesis. Monte Carlo dropout performs slightly better than the baseline on 3 out of 4 metrics. Entropy outperforms all other implemented methods on all metrics. The three additional methods do not perform better than Monte Carlo dropout. An analysis of what kind of uncertainty Monte Carlo dropout captures is performed, together with a comparison of the samples selected by the baseline and by Monte Carlo dropout. Future development and possible improvements are also discussed.
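The Monte Carlo dropout scheme this abstract describes — several stochastic forward passes whose per-sample variance ranks the unlabeled data — can be sketched as follows. This is a toy NumPy illustration with an assumed single random layer and our own function names, not the thesis's segmentation network:

```python
import numpy as np

rng = np.random.default_rng(42)

def forward_with_dropout(x, W, p=0.5):
    """One stochastic forward pass: a fresh dropout mask is sampled each call."""
    mask = rng.random(W.shape[1]) > p           # drop hidden units with prob. p
    h = np.maximum(x @ W, 0.0) * mask / (1 - p) # ReLU layer with inverted dropout
    return h.sum(axis=1)                        # toy scalar prediction per input

def mc_dropout_uncertainty(x, W, T=100):
    """Run T stochastic passes; the per-input variance of the predictions
    serves as the uncertainty estimate used to rank samples for annotation."""
    preds = np.stack([forward_with_dropout(x, W) for _ in range(T)])
    return preds.mean(axis=0), preds.var(axis=0)

x = rng.standard_normal((5, 8))                 # 5 unlabeled inputs, 8 features
W = rng.standard_normal((8, 16))                # toy single-layer weights
mean, var = mc_dropout_uncertainty(x, W, T=200)
most_uncertain = int(np.argmax(var))            # candidate to annotate next
```

For segmentation the same idea applies per pixel, with the per-pixel variances aggregated into one score per image.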
APA, Harvard, Vancouver, ISO, and other styles
47

Wernersson, Björn, and Mikael Södergren. "Automatiserad inlärning av detaljer för igenkänning och robotplockning." Thesis, Linköping University, Department of Electrical Engineering, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-170.

Full text
Abstract:

Just how far is it possible to make the learning of new parts for recognition and robot picking autonomous? This thesis initially gives the prerequisites for the learning and calibration steps that are to be automated. Among these tasks are selecting a suitable part model from numerous candidates with the help of a new part segmenter, as well as computing the spatial extent of this part to facilitate robotic collision handling. Other tasks are analyzing the part model in order to highlight correct and suitable edge segments for increasing pattern-matching certainty, and choosing appropriate acceptance levels for pattern matching. Further tasks deal with simplifying camera calibration by analyzing the calibration pattern, and with compensating for differences in perspective at great depth variations by calculating the center of perspective of the image. The image-processing algorithms created to solve these tasks are described and evaluated thoroughly. This thesis shows that simplifying the learning and calibration steps with the help of advanced image processing really is possible.

APA, Harvard, Vancouver, ISO, and other styles
48

Fanelli, Gabriele. "Facial Features Tracking using Active Appearance Models." Thesis, Linköping University, Department of Electrical Engineering, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-7658.

Full text
Abstract:

This thesis aims at building a system capable of automatically extracting and parameterizing the position of a face and its features in images acquired from a low-end monocular camera. Such a challenging task is justified by the importance and variety of its possible applications, ranging from face and expression recognition to animation of virtual characters using video depicting real actors. The implementation includes the construction of Active Appearance Models of the human face from training images. The existing face model Candide-3 is used as a starting point, making the translation of the tracking parameters to standard MPEG-4 Facial Animation Parameters easy.

The Inverse Compositional Algorithm is employed to adapt the models to new images, working on a subspace where the appearance is "projected out" and thus focusing only on shape.

The algorithm is tested on a generic model, aiming at tracking different people’s faces, and on a specific model, considering one person only. In the former case, the need for improvements in the robustness of the system is highlighted. By contrast, the latter case gives good results regarding both quality and speed, with real time performance being a feasible goal for future developments.

APA, Harvard, Vancouver, ISO, and other styles
49

Dune, Claire. "Localisation et caractérisation d'objets inconnus à partir d'informations visuelles : vers une saisie intuitive pour les personnes en situation de handicap." Phd thesis, Université Rennes 1, 2009. http://tel.archives-ouvertes.fr/tel-00844919.

Full text
Abstract:
The starting point of the work presented in this thesis is the desire to develop a robotic aid for intuitive grasping for people with disabilities. The proposed tool is a manipulator controlled directly from the information transmitted by two cameras: one mounted on the gripper, which gives a detailed view of the scene, and one external, which offers an overall view. The goal of our work is to grasp an a priori unknown object from a single click by the user on an image acquired by the external camera. We propose methods to coarsely localize and characterize a convex object so that it can be grasped by the manipulator's gripper. This thesis can be seen as complementary to existing methods based on the use of databases. The manuscript is divided into two parts: the coarse localization of an unknown object, and the characterization of its shape. The object lies on the line of sight passing through the optical center of the external camera and the click. The projection of this line of sight in the on-board camera is the epipolar line associated with the click. We therefore propose a visual servoing scheme based on epipolar geometry to drive the on-board camera along this line. The visual features extracted from the on-board images are then matched with the features detected in the neighborhood of the click to estimate the 3D position of the object. This method is robust to relative motions of the object and the external camera during the localization process. At the end of the process, the designated object lies in the field of view of both cameras, and these two views can be used to initiate a more precise characterization of the object, sufficient for grasping. The problem of characterizing the shape of the object was addressed in the framework of dynamic monocular observation.
The shape of the object is modeled by a quadric whose parameters are estimated from its projections in a set of images. The contours of the object are detected by an active contour method initialized from the coarse localization of the object. The characterization of the object is all the more precise as the views used to estimate it are well chosen. The last contribution of this thesis is an active-vision method for selecting the optimal views for reconstruction. The best views are chosen by searching for the camera positions that maximize information.
APA, Harvard, Vancouver, ISO, and other styles
50

Martinez, Pujol Oriol. "Template tracking of articulated objects using active contours." Doctoral thesis, Universitat Pompeu Fabra, 2016. http://hdl.handle.net/10803/373919.

Full text
Abstract:
In this dissertation we fuse two of the traditional topics in Computer Vision: object segmentation and tracking. For segmentation we use the Active Contours (AC) framework and for tracking we use the Template Tracking (TT) scheme. Our aim is to combine them to create efficient and robust methods to segment and track articulated or deformable objects. In Chapter 1, we review the AC framework and we apply it over MilliMeter-Waves (MMW) images to segment bodies and concealed threats (such as explosives or guns) behind their wearing clothes. In Chapter 2 we review two of the main trends of TT methods: Lucas-Kanade optical flow and particle filters. Moreover, we combine them with an AC method to create a robust tracker for articulated or deformable objects without using prior shape information. Finally, in Chapter 3 we give the clues of how to efficiently introduce shape priors into the TT framework using AC methods.
APA, Harvard, Vancouver, ISO, and other styles