Dissertations / Theses on the topic 'Invariant Object Recognition'

Consult the top 50 dissertations / theses for your research on the topic 'Invariant Object Recognition.'


1

Srestasathiern, Panu. "View Invariant Planar-Object Recognition." The Ohio State University, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=osu1420564069.

2

Tonge, Ashwini Kishor. "Object Recognition Using Scale-Invariant Chordiogram." Thesis, University of North Texas, 2017. https://digital.library.unt.edu/ark:/67531/metadc984116/.

Abstract:
This thesis describes an approach for object recognition using the chordiogram shape-based descriptor. Global shape representations are highly susceptible to clutter generated by the background or other irrelevant objects in real-world images. To overcome the problem, we aim to extract precise object shape using superpixel segmentation, perceptual grouping, and connected components. The employed shape descriptor, the chordiogram, is based on geometric relationships of chords generated from the pairs of boundary points of an object. The chordiogram descriptor applies holistic properties of the shape and has also proven suitable for object detection and digit recognition mechanisms. Additionally, it is translation invariant and robust to shape deformations. In spite of such excellent properties, the chordiogram is not scale-invariant. To this end, we propose scale-invariant chordiogram descriptors and intend to achieve a similar performance before and after applying scale invariance. Our experiments show that we achieve similar performance with and without scale invariance for silhouettes and real-world object images. We also show experiments at different scales to confirm that we obtain scale invariance for the chordiogram.
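As a rough illustration of the idea in this abstract, the sketch below builds chord features from pairs of boundary points and divides each chord length by the longest chord. That normalization is an assumption made here for illustration; the abstract does not say how the thesis obtains scale invariance, and a real chordiogram is a histogram that also uses boundary normals. The names `chord_features`, `square` and `scaled_shifted` are hypothetical.

```python
import math
from itertools import combinations

def chord_features(points):
    """Sorted (normalized length, orientation) pairs for all chords between boundary points."""
    chords = []
    for (x1, y1), (x2, y2) in combinations(points, 2):
        dx, dy = x2 - x1, y2 - y1
        chords.append((math.hypot(dx, dy), math.atan2(dy, dx)))
    # Assumed normalization: dividing by the longest chord removes uniform scale.
    longest = max(length for length, _ in chords)
    return sorted((round(length / longest, 9), round(angle, 9))
                  for length, angle in chords)

# A unit square and the same square scaled by 3 and translated.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
scaled_shifted = [(3.0 * x + 5.0, 3.0 * y - 2.0) for x, y in square]

same = chord_features(square) == chord_features(scaled_shifted)
```

Chord differences cancel translation, and the length ratio cancels scale, so both shapes yield the same feature list in this toy case.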
3

Dahmen, Jörg. "Invariant image object recognition using Gaussian mixture densities." [S.l.] : [s.n.], 2001. http://deposit.ddb.de/cgi-bin/dokserv?idn=964586940.

4

Booth, Michael C. A. "Temporal lobe mechanisms for view-invariant object recognition." Thesis, University of Oxford, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.299094.

5

Hsu, Tao-i. "Affine invariant object recognition by voting match techniques." Thesis, Monterey, California. Naval Postgraduate School, 1988. http://hdl.handle.net/10945/22865.

Abstract:
Approved for public release; distribution is unlimited
This thesis begins with a general survey of different model-based systems for object recognition. The advantages and disadvantages of those systems are discussed. A system is then selected for study because of its effective affine-invariant matching [Ref. 1] characteristic. This system involves two separate phases, modeling and recognition: one is done off-line and the other on-line. A hashing technique is implemented to achieve fast accessing and voting. Different test data sets are used in experiments to illustrate the recognition capabilities of this system. These experiments demonstrate partial matching, recognition of objects under similarity transformations applied to the models, and the effects of noise perturbation. The testing results are discussed, and related experiences and recommendations are presented.
http://archive.org/details/affineinvarianto00hsut
Captain, Taiwan Republic of China Army
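The two-phase, hash-and-vote scheme described in this abstract resembles what is commonly called geometric hashing. The sketch below is a minimal version under that assumption and is not the system studied in the thesis: model points are stored off-line under coordinates that are invariant to affine maps, and scene points vote on-line. All names (`affine_coords`, `build_table`, `recognize`, the models "A" and "B") and the 3-decimal quantization are hypothetical.

```python
from collections import defaultdict
from itertools import permutations

def affine_coords(point, basis):
    """Coordinates (a, b) of point in the frame of three basis points, or None if degenerate."""
    (x0, y0), (x1, y1), (x2, y2) = basis
    ux, uy, vx, vy = x1 - x0, y1 - y0, x2 - x0, y2 - y0
    det = ux * vy - uy * vx
    if abs(det) < 1e-9:
        return None  # collinear basis points define no frame
    dx, dy = point[0] - x0, point[1] - y0
    return (round((dx * vy - dy * vx) / det, 3),
            round((ux * dy - uy * dx) / det, 3))

def build_table(models):
    """Off-line phase: index every model point under every ordered basis triple."""
    table = defaultdict(list)
    for name, pts in models.items():
        for basis_idx in permutations(range(len(pts)), 3):
            basis = [pts[i] for i in basis_idx]
            for i, p in enumerate(pts):
                if i in basis_idx:
                    continue
                key = affine_coords(p, basis)
                if key is not None:
                    table[key].append((name, basis_idx))
    return table

def recognize(table, scene, basis_idx):
    """On-line phase: scene points vote for (model, basis) entries; return best model and votes."""
    votes = defaultdict(int)
    basis = [scene[i] for i in basis_idx]
    for i, p in enumerate(scene):
        if i in basis_idx:
            continue
        for entry in table.get(affine_coords(p, basis), []):
            votes[entry] += 1
    (name, _), count = max(votes.items(), key=lambda kv: kv[1])
    return name, count

models = {"A": [(0, 0), (4, 0), (0, 4), (4, 4), (2, 1)],
          "B": [(0, 0), (5, 0), (1, 3), (7, 2)]}
table = build_table(models)
# Scene: model "A" under the affine map (x, y) -> (2x + y + 3, -x + 3y + 1).
scene = [(2 * x + y + 3, -x + 3 * y + 1) for x, y in models["A"]]
result = recognize(table, scene, (0, 1, 2))
```

A practical system would try several scene bases and tolerate missing or noisy points; this sketch tries one basis on clean data.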
6

Robinson, Leigh. "Invariant object recognition : biologically plausible and machine learning approaches." Thesis, University of Warwick, 2015. http://wrap.warwick.ac.uk/83167/.

Abstract:
Understanding the processes that facilitate object recognition is a task that draws on a wide range of fields, integrating knowledge from neuroscience, psychology, computer science and mathematics. The substantial work done in these fields has led to two major outcomes: firstly, a rich interplay between computational models and biological experiments that seek to explain the biological processes that underpin object recognition; secondly, engineered vision systems that on many tasks are approaching the performance of humans. This work first highlights the importance of ensuring that models which aim for biological relevance actually produce biologically plausible representations that are consistent with what has been measured within the primate visual cortex. To accomplish this, two leading biologically plausible models, HMAX and VisNet, are compared on a set of visual processing tasks. The work then changes approach, focusing on models that do not explicitly seek to model any biological process, but rather solve a particular vision task with the goal being increased performance. This section explores the recently discovered problem of convolutional networks being susceptible to adversarial exemplars. An extension of previous work is shown that allows state-of-the-art networks to be fooled into classifying any image as any label while leaving the original image visually unchanged. Secondly, an efficient implementation of applying dropout in a batchwise fashion is introduced that approximately halves the computational cost, allowing models twice as large to be trained. Finally, an extension to Deep Belief Networks is proposed that constrains the connectivity of a given layer to that of a topologically local region of the previous one.
7

Allan, Moray. "Sprite learning and object category recognition using invariant features." Thesis, University of Edinburgh, 2007. http://hdl.handle.net/1842/2430.

Abstract:
This thesis explores the use of invariant features for learning sprites from image sequences, and for recognising object categories in images. A popular framework for the interpretation of image sequences is the layers or sprite model of e.g. Wang and Adelson (1994), Irani et al. (1994). Jojic and Frey (2001) provide a generative probabilistic model framework for this task, but their algorithm is slow as it needs to search over discretised transformations (e.g. translations, or affines) for each layer. We show that by using invariant features (e.g. Lowe’s SIFT features) and clustering their motions we can reduce or eliminate the search and thus learn the sprites much faster. The algorithm is demonstrated on example image sequences. We introduce the Generative Template of Features (GTF), a parts-based model for visual object category detection. The GTF consists of a number of parts, and for each part there is a corresponding spatial location distribution and a distribution over ‘visual words’ (clusters of invariant features). We evaluate the performance of the GTF model for object localisation as compared to other techniques, and show that such a relatively simple model can give state-of-the-art performance. We also discuss the connection of the GTF to Hough-transform-like methods for object localisation.
8

Bone, Peter. "Fully invariant object recognition and tracking from cluttered scenes." Thesis, University of Sussex, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.444109.

9

Banarse, D. S. "A generic neural network architecture for deformation invariant object recognition." Thesis, Bangor University, 1997. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.362146.

10

Sim, Hak Chuah. "Invariant object matching with a modified dynamic link network." Thesis, University of Southampton, 1999. https://eprints.soton.ac.uk/256269/.

11

Graf, Thorsten. "Flexible object recognition based on invariant theory and agent technology." [S.l. : s.n.], 2000. http://deposit.ddb.de/cgi-bin/dokserv?idn=96086170X.

12

Woo, Myung Chul. "Biologically-inspired translation, scale, and rotation invariant object recognition models /." Online version of thesis, 2007. http://hdl.handle.net/1850/3933.

13

Tafazoli, Sina. "Behavioral and Neuronal Substrates of Invariant Object Recognition in Rats." Doctoral thesis, SISSA, 2014. http://hdl.handle.net/20.500.11767/4838.

Abstract:
The visual system of humans and other primates has the remarkable ability to recognize objects despite tremendous variation in their appearance, due to changes in size, position, background, and viewpoint. While this ability is central to human visual perception, the underlying brain mechanisms are poorly understood, and transformation-tolerant recognition remains a major challenge in the development of artificial vision systems. Arguably, this is a consequence of the formidable complexity of the primate visual system and the relatively narrow range of experimental approaches that human and nonhuman primate studies allow. Although the invasive study of the neuronal basis of object vision has traditionally been restricted to non-human primate experiments, rodents have recently been emerging as powerful models to study visual processing. However, successful use of rodents as models for studying visual object recognition crucially depends on the ability of their visual system to construct representations of visual objects that tolerate (i.e., remain relatively unchanged with respect to) the tremendous changes in object appearance produced, for instance, by size and viewpoint variation. In the first part of this thesis, I addressed this question by training rats to categorize a continuum of morph objects resulting from blending two object prototypes. The resulting psychometric curve (reporting the proportion of responses to one prototype along the morph line) served as a reference when, in a second phase of the experiment, either prototype was briefly presented as a prime, immediately before a test morph object. The resulting shift of the psychometric curve showed that recognition became biased (primed) toward the identity of the prime. Critically, this bias was also observed when the primes were transformed along a variety of dimensions (i.e., size, position, viewpoint, and their combination) that the animals had never experienced before.
These results indicate that rats spontaneously perceive different views/appearances of an object as similar (i.e., as instances of the same object) and argue for the existence of neuronal substrates underlying the formation of transformation-tolerant object representations in rats. As the next step, I tried to characterize such neuronal substrates by performing multi-electrode neuronal recordings (in anesthetized rats exposed to a battery of visual objects) from five different cortical areas of the rat brain: primary visual cortex (V1) and four extrastriate areas (named LM, AL, LI and LL) that are located laterally to V1 and have been proposed as candidate stages of a putative rat visual shape processing stream, homologous to the monkey ventral visual stream (apart from area AL, which probably belongs to the dorsal pathway in the rat). An object set consisting of 10 different objects, each transformed across a variety of axes (position, size, in-depth azimuth rotation and in-plane rotation), was used. I found that along the processing hierarchy V1->LM->LI->LL, receptive fields become progressively larger and response latencies progressively longer. Using information theory, I found that, as information travels through this hierarchy, the fractional information that each cell carries about luminance gradually decreases, whereas the fractional information about shape gradually increases. Accordingly, I found that neurons along this pathway become increasingly tolerant to transformations. This indicates that neurons along this hierarchy become progressively tuned to more complex visual attributes and become more tolerant to transformations, thus suggesting that the pathway V1->LM->LI->LL could be homologous to the primate ventral stream. Overall, the combination of these behavioral and neurophysiological studies will provide an unprecedented understanding of high-level visual processing in a rodent species.
14

Voils, Danny. "Scale Invariant Object Recognition Using Cortical Computational Models and a Robotic Platform." PDXScholar, 2012. https://pdxscholar.library.pdx.edu/open_access_etds/632.

Abstract:
This paper proposes an end-to-end, scale invariant, visual object recognition system, composed of computational components that mimic the cortex in the brain. The system uses a two-stage process. The first stage is a filter that extracts scale invariant features from the visual field. The second stage uses inference-based spatio-temporal analysis of these features to identify objects in the visual field. The proposed model combines Numenta's Hierarchical Temporal Memory (HTM) with HMAX, developed by MIT's Brain and Cognitive Science Department. While these two biologically inspired paradigms are based on what is known about the visual cortex, HTM and HMAX tackle the overall object recognition problem from different directions. Image-pyramid-based methods like HMAX make explicit use of scale, but have no sense of time. HTM, on the other hand, only indirectly tackles scale, but makes explicit use of time. By combining HTM and HMAX, both scale and time are addressed. In this paper, I show that HTM and HMAX can be combined to make a complete cortex-inspired object recognition model that explicitly uses both scale and time to recognize objects in temporal sequences of images. Additionally, through experimentation, I examine several variations of HMAX and its
15

Isik, Leyla. "The dynamics of invariant object and action recognition in the human visual system." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/98000.

Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Computational and Systems Biology Program, 2015.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 123-138).
Humans can quickly and effortlessly recognize objects, people, and their actions from complex visual inputs. Despite the ease with which the human brain solves this problem, the underlying computational steps have remained enigmatic. What makes object and action recognition challenging are identity-preserving transformations that alter the visual appearance of objects and actions, such as changes in scale, position, and viewpoint. The majority of visual neuroscience studies examining visual recognition either use physiology recordings, which provide high spatiotemporal resolution data with limited brain coverage, or functional MRI, which provides high spatial resolution data from across the brain with limited temporal resolution. High temporal resolution data from across the brain is needed to break down and understand the computational steps underlying invariant visual recognition. In this thesis I use magnetoencephalography, machine learning, and computational modeling to study invariant visual recognition. I show that a temporal association learning rule for learning invariance in hierarchical visual systems is very robust to manipulations and visual disruptions that happen during development (Chapter 2). I next show that object recognition occurs very quickly, with invariance to size and position developing in stages beginning around 100 ms after stimulus onset (Chapter 3), and that action recognition occurs on a similarly fast time scale, 200 ms after video onset, with this early representation being invariant to changes in actor and viewpoint (Chapter 4). Finally, I show that the same hierarchical feedforward model can explain both the object and action recognition timing results, putting this timing data in the broader context of computer vision systems and models of the brain.
This work sheds light on the computational mechanisms underlying invariant object and action recognition in the brain and demonstrates the importance of using high temporal resolution data to understand neural computations.
by Leyla Isik.
Ph. D.
16

Rahtu, E. (Esa). "A multiscale framework for affine invariant pattern recognition and registration." Doctoral thesis, University of Oulu, 2007. http://urn.fi/urn:isbn:9789514286018.

Abstract:
This thesis presents a multiscale framework for the construction of affine invariant pattern recognition and registration methods. The idea in the introduced approach is to extend the given pattern to a set of affine covariant versions, each carrying slightly different information, and then to apply known affine invariants to each of them separately. The key part of the framework is the construction of the affine covariant set, and this is done by combining several scaled representations of the original pattern. The advantages compared to previous approaches include the possibility of many variations and the inclusion of spatial information on the patterns in the features. The application of the multiscale framework is demonstrated by constructing several new affine invariant methods using different preprocessing techniques, combination schemes, and final recognition and registration approaches. The techniques introduced are briefly described from the perspective of the multiscale framework, and further treatment and properties are presented in the corresponding original publications. The theoretical discussion is supported by several experiments where the new methods are compared to existing approaches. In this thesis the patterns are assumed to be gray scale images, since this is the main application where affine relations arise. Nevertheless, multiscale methods can also be applied to other kinds of patterns where an affine relation is present. An additional application of one multiscale based technique in convexity measurements is introduced. The method, called multiscale autoconvolution, can be used to build a convexity measure which is a descriptor of object shape. The proposed measure has two special features compared to existing approaches. It can be applied directly to gray scale images approximating binary objects, and it can be easily modified to produce a number of measures.
The new measure is shown to be straightforward to evaluate for a given shape, and it performs well in the applications, as demonstrated by the experiments in the original paper.
17

Eskizara, Omer. "3d Geometric Hashing Using Transform Invariant Features." Master's thesis, METU, 2009. http://etd.lib.metu.edu.tr/upload/12610546/index.pdf.

Abstract:
3D object recognition is performed using geometric hashing, where transformation and scale invariant 3D surface features are utilized. 3D features are extracted from object surfaces after a scale space search in which the size of each feature is also estimated. The scale space is constructed from orientation invariant surface curvature values which classify each surface point's shape. Extracted features are grouped into triplets, and orientation invariant descriptors are defined for each triplet. Each pose of each object is indexed in a hash table using these triplets. For scale invariant matching, cosine similarity is applied to the scale variant triplet variables. Tests were performed on the Stuttgart database, where 66 poses of 42 objects are stored in the hash table during training and 258 poses of 42 objects are used during testing. A recognition rate of 90.97% is achieved.
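To see why cosine similarity suits scale variant quantities, consider a triplet described by three lengths. The sketch below is an illustration only; the abstract does not list the actual triplet variables, so the side-length vectors and the name `cosine_similarity` are assumptions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; unchanged when either is uniformly scaled."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical triplet side lengths: the same triangle at two scales, and a different one.
small = (3.0, 4.0, 5.0)
large = (6.0, 8.0, 10.0)
other = (3.0, 4.0, 10.0)

same_shape = cosine_similarity(small, large)
different_shape = cosine_similarity(small, other)
```

Scaling a vector changes its norm but not its direction, so the same triplet at two sizes scores close to 1, while a differently shaped triplet scores lower.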
18

Zografos, V. "Pose-invariant, model-based object recognition, using linear combination of views and Bayesian statistics." Thesis, University College London (University of London), 2009. http://discovery.ucl.ac.uk/18954/.

Abstract:
This thesis presents an in-depth study on the problem of object recognition, and in particular the detection of 3-D objects in 2-D intensity images which may be viewed from a variety of angles. A solution to this problem remains elusive to this day, since it involves dealing with variations in geometry, photometry and viewing angle, noise, occlusions and incomplete data. This work restricts its scope to a particular kind of extrinsic variation: variation of the image due to changes in the viewpoint from which the object is seen. A technique is proposed and developed to address this problem, which falls into the category of view-based approaches, that is, a method in which an object is represented as a collection of a small number of 2-D views, as opposed to a generation of a full 3-D model. This technique is based on the theoretical observation that the geometry of the set of possible images of an object undergoing 3-D rigid transformations and scaling may, under most imaging conditions, be represented by a linear combination of a small number of 2-D views of that object. It is therefore possible to synthesise a novel image of an object given at least two existing and dissimilar views of the object, and a set of linear coefficients that determine how these views are to be combined in order to synthesise the new image. The method works in conjunction with a powerful optimization algorithm to search for and recover the optimal linear combination coefficients that will synthesize a novel image which is as similar as possible to the target scene view. If the similarity between the synthesized and the target images is above some threshold, then an object is determined to be present in the scene and its location and pose are defined, in part, by the coefficients. The key benefit of using this technique is that, because it works directly with pixel values, it avoids the need for problematic, low-level feature extraction and solution of the correspondence problem.
As a result, a linear combination of views (LCV) model is easy to construct and use, since it only requires a small number of stored 2-D views of the object in question and the selection of a few landmark points on the object, a process which is easily carried out during the offline, model-building stage. In addition, this method is general enough to be applied across a variety of recognition problems and different types of objects. The development and application of this method is initially explored by looking at two-dimensional problems, and then by extending the same principles to 3-D. Additionally, the method is evaluated across synthetic and real-image datasets containing variations in the objects’ identity and pose. Possible extensions to incorporate a foreground/background model and lighting variations of the pixels are examined as future work.
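The geometric observation behind this abstract can be checked on a toy example. The sketch below assumes orthographic projection and known landmark correspondences, and it recovers the coefficients by solving a small linear system rather than by the optimization over pixel values that the thesis describes. All names (`solve`, `project`, `fit_lcv`, `synthesize`) and the test geometry are hypothetical.

```python
import math

def solve(a, b):
    """Solve the small linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def project(points3d, yaw, pitch=0.0, scale=1.0, shift=(0.0, 0.0)):
    """Orthographic view after rotating about the y axis (yaw), then the x axis (pitch)."""
    view = []
    for x, y, z in points3d:
        x1 = x * math.cos(yaw) + z * math.sin(yaw)
        z1 = -x * math.sin(yaw) + z * math.cos(yaw)
        y1 = y * math.cos(pitch) - z1 * math.sin(pitch)
        view.append((scale * x1 + shift[0], scale * y1 + shift[1]))
    return view

def fit_lcv(view_a, view_b, target, landmarks):
    """Coefficients expressing target x and y as combinations of (x_a, y_a, x_b, 1)."""
    rows = [[view_a[i][0], view_a[i][1], view_b[i][0], 1.0] for i in landmarks]
    coeff_x = solve(rows, [target[i][0] for i in landmarks])
    coeff_y = solve(rows, [target[i][1] for i in landmarks])
    return coeff_x, coeff_y

def synthesize(view_a, view_b, coeffs, i):
    """Predict point i of the novel view from the two stored views."""
    basis = [view_a[i][0], view_a[i][1], view_b[i][0], 1.0]
    return tuple(sum(c * b for c, b in zip(cs, basis)) for cs in coeffs)

points = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
view_a = project(points, 0.0)
view_b = project(points, 0.5)
novel = project(points, 0.3, pitch=0.2, scale=1.5, shift=(2.0, -1.0))

coeffs = fit_lcv(view_a, view_b, novel, landmarks=[0, 1, 2, 3])
predicted = synthesize(view_a, view_b, coeffs, 4)  # point not used for fitting
```

Four non-coplanar landmarks fix the eight coefficients, and the fifth point is then predicted from the two stored views alone.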
19

Ojansivu, V. (Ville). "Blur invariant pattern recognition and registration in the Fourier domain." Doctoral thesis, University of Oulu, 2009. http://urn.fi/urn:isbn:9789514292552.

Abstract:
Pattern recognition and registration are integral elements of computer vision, which considers image patterns. This thesis presents novel blur invariant features, and combined blur and geometric invariant features, for pattern recognition and registration related to images. These global or local features are based on the Fourier transform phase, and are invariant or insensitive to image blurring with a centrally symmetric point spread function, which can result, for example, from linear motion or from the image being out of focus. The global features are based on the even powers of the phase-only discrete Fourier spectrum or bispectrum of an image and are invariant to centrally symmetric blur. These global features are used for object recognition and image registration. The features are extended for geometrical invariances up to similarity transformation: shift invariance is obtained using the bispectrum, and rotation-scale invariance using log-polar mapping of bispectrum slices. Affine invariance can be achieved as well, using rotated sets of the log-log mapped bispectrum slices. The novel invariants are shown to be more robust to additive noise than the earlier blur invariants, and combined blur and geometric invariants, based on image moments. The local features are computed using the short-term Fourier transform in local windows around the points of interest. Only the lowest horizontal, vertical, and diagonal frequency coefficients are used, the phase of which is insensitive to centrally symmetric blur. The phases of these four frequency coefficients are quantized and used to form a descriptor code for the local region. When these local descriptors are used for texture classification, they are computed for every pixel and added up to a histogram which describes the local pattern. No earlier texture features have been claimed to be invariant to blur.
The proposed descriptors were superior in the classification of blurred textures compared to a few non-blur invariant state of the art texture classification methods.
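The claim that even powers of the phase-only spectrum are invariant to centrally symmetric blur can be illustrated in one dimension. The sketch below is a toy check and not the thesis's 2-D features: a symmetric point spread function has a real-valued DFT, so blurring flips the phase by at most π, and squaring the unit phasor removes that flip. The signal, the circular box blur and the function names are assumptions.

```python
import cmath

def dft(signal):
    """Plain O(n^2) discrete Fourier transform."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def circular_blur(signal, psf):
    """Circular convolution of the signal with the point spread function."""
    n = len(signal)
    return [sum(signal[(t - s) % n] * psf[s] for s in range(n)) for t in range(n)]

def phase_only(signal):
    """Unit-magnitude spectrum (assumes no DFT coefficient is zero)."""
    return [c / abs(c) for c in dft(signal)]

def squared_phase(signal):
    """Second (even) power of the phase-only spectrum."""
    return [p ** 2 for p in phase_only(signal)]

signal = [1.0, 3.0, 2.0, 5.0, 4.0, 0.0, 1.0, 2.0]
# Centrally symmetric PSF: psf[s] == psf[-s mod n], here a 3-tap circular box blur.
psf = [1 / 3, 1 / 3, 0.0, 0.0, 0.0, 0.0, 0.0, 1 / 3]
blurred = circular_blur(signal, psf)

max_diff = max(abs(a - b) for a, b in zip(squared_phase(signal), squared_phase(blurred)))
plain_diff = abs(phase_only(signal)[4] - phase_only(blurred)[4])
```

Here the plain phase changes sign at frequency index 4, where the blur's transfer function is negative, while the squared phase agrees at every frequency.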
20

Evans, Benjamin D. "Learning transformation-invariant visual representations in spiking neural networks." Thesis, University of Oxford, 2012. https://ora.ox.ac.uk/objects/uuid:15bdf771-de28-400e-a1a7-82228c7f01e4.

Abstract:
This thesis aims to understand the learning mechanisms which underpin the process of visual object recognition in the primate ventral visual system. The computational crux of this problem lies in the ability to retain specificity to recognize particular objects or faces, while exhibiting generality across natural variations and distortions in the view (DiCarlo et al., 2012). In particular, the work presented is focussed on gaining insight into the processes through which transformation-invariant visual representations may develop in the primate ventral visual system. The primary motivation for this work is the belief that some of the fundamental mechanisms employed in the primate visual system may only be captured through modelling the individual action potentials of neurons and therefore, existing rate-coded models of this process constitute an inadequate level of description to fully understand the learning processes of visual object recognition. To this end, spiking neural network models are formulated and applied to the problem of learning transformation-invariant visual representations, using a spike-time dependent learning rule to adjust the synaptic efficacies between the neurons. The ways in which the existing rate-coded CT (Stringer et al., 2006) and Trace (Földiák, 1991) learning mechanisms may operate in a simple spiking neural network model are explored, and these findings are then applied to a more accurate model using realistic 3-D stimuli. Three mechanisms are then examined, through which a spiking neural network may solve the problem of learning separate transformation-invariant representations in scenes composed of multiple stimuli by temporally segmenting competing input representations. The spike-time dependent plasticity in the feed-forward connections is then shown to be able to exploit these input layer dynamics to form individual stimulus representations in the output layer. 
Finally, the work is evaluated and future directions of investigation are proposed.
21

Nelson, Eric D. "Zoom techniques for achieving scale invariant object tracking in real-time active vision systems /." Online version of the thesis, 2006. https://ritdml.rit.edu/dspace/handle/1850/2620.

22

Mathew, Alex. "Rotation Invariant Histogram Features for Object Detection and Tracking in Aerial Imagery." University of Dayton / OhioLINK, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1397662849.

23

Donatti, Guillermo Sebastián [Verfasser], Rolf [Gutachter] Würtz, and Boris [Gutachter] Suchan. "Memory organization for invariant object recognition and categorization / Guillermo Sebastián Donatti ; Gutachter: Rolf Würtz, Boris Suchan." Bochum : Ruhr-Universität Bochum, 2016. http://d-nb.info/1114496944/34.

24

Hall, Daniela. "Viewpoint independent recognition of objects from local appearance." Grenoble INPG, 2001. http://www.theses.fr/2001INPG0086.

25

Krasilenko, V. G., O. I. Nikolskyy, A. A. Lazarev, D. V. Nikitovich, В. Г. Красіленко, О. І. Нікольський, О. О. Лазарєв, and Д. В. Нікітович. "Simulating optical pattern recognition algorithms for object tracking based on nonlinear models and subtraction of frames." Thesis, Український державний хіміко-технологічний університет, 2015. http://ir.lib.vntu.edu.ua//handle/123456789/23850.

Abstract:
We propose and discuss optical pattern recognition algorithms for object tracking based on nonlinear equivalence models and subtraction of frames. Experimental results of the suggested algorithms in Mathcad and LabVIEW are shown. Applying equivalence functions and frame differences gives good results for recognizing and tracking moving objects.
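Of the two ingredients named in this abstract, frame subtraction is simple enough to sketch. The example below covers only that part and does not attempt the nonlinear equivalence models, whose definition is not given here. The tiny frames, the threshold and the function names are hypothetical.

```python
def moving_pixels(prev_frame, next_frame, threshold):
    """Return (row, col) positions whose absolute intensity change exceeds the threshold."""
    return [(r, c)
            for r, (row_a, row_b) in enumerate(zip(prev_frame, next_frame))
            for c, (a, b) in enumerate(zip(row_a, row_b))
            if abs(a - b) > threshold]

def centroid(pixels):
    """Mean (row, col) of the changed pixels, a crude estimate of where motion occurred."""
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return (sum(rows) / len(pixels), sum(cols) / len(pixels))

# A bright pixel moves one column to the right between two 4x4 frames.
frame_a = [[0, 0, 0, 0],
           [0, 9, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]
frame_b = [[0, 0, 0, 0],
           [0, 0, 9, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]

changed = moving_pixels(frame_a, frame_b, threshold=4)
track_point = centroid(changed)
```

The difference image marks both the old and the new position of the object, so the centroid falls between them.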
26

Leßmann, Markus [Verfasser], Laurenz [Gutachter] Wiskott, and Rolf [Gutachter] Würtz. "Learning of invariant object recognition in hierarchical neural networks using temporal continuity / Markus Leßmann ; Gutachter: Laurenz Wiskott, Rolf Würtz ; Fakultät für Physik und Astronomie." Bochum : Ruhr-Universität Bochum, 2015. http://d-nb.info/1239416415/34.

27

Vinther, Sven. "Active 3D object recognition using geometric invariants." Thesis, University of Cambridge, 1994. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.362974.

28

Beis, Jeffrey S. "Indexing without invariants in model-based object recognition." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1997. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp04/nq25014.pdf.

29

Zhu, Yonggen. "Feature extraction and 2D/3D object recognition using geometric invariants." Thesis, King's College London (University of London), 1996. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.362731.

30

Smart, Michael Howard William. "Adaptive, linear, subspatial projections for invariant recognition of objects in real infrared images." Thesis, University of Edinburgh, 1998. http://hdl.handle.net/1842/12974.

Abstract:
In recent years computer technology has advanced to a state whereby large quantities of data can be processed. This advancement has fuelled a dramatic increase in research into areas of image processing which were previously impractical, such as automated vision systems for both military and domestic purposes. Automatic Target Recognition (ATR) systems are one example of these automated processes. ATR is the automatic detection, isolation and identification of objects, often derived from raw video, in a real-world, potentially hostile environment. The ability to process each frame of the incoming video stream rapidly and accurately is paramount to the success of the system, in order to output suitable actions against constantly changing situations. One of the main functions of an ATR system is to identify correctly all the objects detected in each frame of data. The standard approach to implementing this component is to divide the identification process into two separate modules: feature extraction and classification. However, it is often difficult to optimise such a dual system with respect to reducing the probability of mis-identification. This can lead to reduced performance. One potential solution is a neural network that accepts image data at the input and outputs an estimated classification. Unfortunately, neural network models of this type are prone to misuse due to their apparent black box solutions. In this thesis a new technique, based on existing adaptive wavelet algorithms, is implemented that offers ease of use, adaptability to new environments, and good generalisation in a single image-in-classification-out model that avoids many of the problems of the neural network approach. This new model is compared with the standard two-stage approach using real-world, infrared, ATR data.
Various extensions to the model are proposed to incorporate invariance to particular object deformations, such as size and rotation, which are necessary for reliable ATR performance. Further work increases the flexibility of the model to further improve generalisation. Other aspects, such as data analysis and object generation accuracy, which are often neglected, are also considered.
31

Glauser, Thomas. "CAD-based recognition of polyhedral 3-D objects using affine invariant surface representations." Bern: Universität Bern, Institut für Informatik und angewandte Mathematik, 1992. http://www.ub.unibe.ch/content/bibliotheken_sammlungen/sondersammlungen/dissen_bestellformular/index_ger.html.

32

Weismantel, Eric. "Perceptual Salience of Non-accidental Properties." The Ohio State University, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=osu1376610211.

33

Sadek, Rida. "Some problems on temporally consistent video editing and object recognition." Doctoral thesis, Universitat Pompeu Fabra, 2012. http://hdl.handle.net/10803/101413.

Abstract:
Video editing and object recognition are two significant fields in computer vision: the first has remarkably assisted digital production and post-production tasks for digital video footage; the second is considered fundamental to image classification and image-based search in large databases (e.g. the web). In this thesis, we address two problems: we present a novel formulation that tackles video editing tasks, and we develop a mechanism for generating more robust descriptors for objects in an image. Concerning the first problem, this thesis proposes two variational models to perform temporally coherent video editing. These models are applied to change an object's (rigid or non-rigid) texture throughout a given video sequence. One model is based on propagating color information from a given frame (or between two given frames) along the motion trajectories of the video, while the other is based on propagating gradient-domain information. The models we present in this thesis require minimal user intervention and automatically accommodate illumination changes in the scene. Concerning the second problem, this thesis addresses the problem of affine invariance in object recognition. We introduce a way to generate geometric affine invariant quantities that are used in the construction of feature descriptors. We show that when these quantities are used they do indeed achieve more robust recognition than state-of-the-art descriptors.
34

Umasuthan, M. "Recognition and position estimation of 3D objects from range images using algebraic and moment invariants." Thesis, Heriot-Watt University, 1995. http://hdl.handle.net/10399/763.

35

Soysal, Medeni. "Joint Utilization Of Local Appearance Descriptors And Semi-local Geometry For Multi-view Object Recognition." Phd thesis, METU, 2012. http://etd.lib.metu.edu.tr/upload/12614313/index.pdf.

Abstract:
Novel methods of object recognition that form a bridge between today's local feature frameworks and the previous decade's strong but deserted geometric invariance field are presented in this dissertation. The rationale behind this effort is to complement the lowered discriminative capacity of local features with invariant geometric descriptions. Similar to our predecessors, we first start with constrained cases and then extend the applicability of our methods to more general scenarios. The local features approach, on which our methods are established, is reviewed in three parts, namely detectors, descriptors and the methods of object recognition that employ them. Next, a novel planar object recognition framework that lifts the requirement for exact appearance-based local feature matching is presented. This method enables matching of groups of features by utilizing both appearance information and group geometric descriptions. An under-investigated area, scene logo recognition, is selected for real-life application of this method. Finally, we present a novel method for three-dimensional (3D) object recognition, which utilizes well-known local features in a more efficient way without any reliance on partial or global planarity. Geometrically consistent local features, which form the crucial basis for object recognition, are identified using affine 3D geometric invariants. The utilization of 3D geometric invariants replaces the classical 2D affine transform estimation/verification step, and provides the ability to directly verify 3D geometric consistency. The accuracy and robustness of the proposed method in highly cluttered scenes, with no prior segmentation or post 3D reconstruction requirements, are demonstrated in the experiments.
36

Araújo, Sidnei Alves de. "Casamento de padrões em imagens digitais livre de segmentação e invariante sob transformações de similaridade." Universidade de São Paulo, 2009. http://www.teses.usp.br/teses/disponiveis/3/3142/tde-18122009-124219/.

Abstract:
Pattern recognition in images is a classical problem in computer vision. It consists of detecting a reference pattern or template in a digital image. Most existing pattern recognition techniques first simplify the images through operations such as binarization, segmentation, or detection of edges or interest points, and then extract a set of descriptive features. Unfortunately, these simplifications can discard rich grayscale information needed to describe the patterns, decreasing the robustness of the detection process. An efficient method should be able to identify a pattern subject to geometric transformations such as translation, scaling, rotation and shearing and, in the case of color images, should also deal with the color constancy problem. In addition, the set of features that describe a pattern should be small enough to make practical applications feasible, such as a robot vision or surveillance system. These are some of the reasons that justify the effort behind the many works of this nature found in the literature. In this work we propose a segmentation-free template matching method named Ciratefi (Circular, Radial and Template-Matching Filter) that is invariant to similarity transformations (rotation, scale and translation), brightness and contrast. Ciratefi consists of three cascaded filters that successively exclude pixels that have no chance of matching the template from further processing. We also propose two extensions of Ciratefi: Morphological Ciratefi, which uses mathematical morphology to extract the descriptors, and Color Ciratefi, which handles color images. We conducted various experiments to compare the performance of the proposed method with two of the main methods found in the literature. The experimental results show that Ciratefi outperforms the methods used in the comparison.
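The abstract does not spell out how the three filters work, so the following is only a minimal sketch of the general idea suggested by the word "Circular" in the method's name: grey levels averaged along circles around a pixel do not change when the image is rotated about that pixel. The function names `circular_means` and `rotate90`, the nearest-pixel sampling, and the choice of eight samples per circle are assumptions made for illustration; this is not the thesis's implementation.

```python
import math


def circular_means(image, cx, cy, radii, samples=8):
    """Mean grey level on circles of the given radii centred at (cx, cy)."""
    means = []
    for r in radii:
        total = 0.0
        for j in range(samples):
            angle = 2.0 * math.pi * j / samples
            # Nearest-pixel sampling keeps the sketch short; a real matcher
            # would interpolate and check the image bounds.
            x = cx + round(r * math.cos(angle))
            y = cy + round(r * math.sin(angle))
            total += image[y][x]
        means.append(total / samples)
    return means


def rotate90(image):
    """Rotate a square image by 90 degrees about its centre (demo helper)."""
    n = len(image)
    return [[image[x][n - 1 - y] for x in range(n)] for y in range(n)]
```

Comparing `circular_means(image, 3, 3, [1, 2, 3])` on a 7×7 image with the same call on `rotate90(image)` gives identical feature vectors, which is the property a rotation-invariant first-stage filter can exploit to discard non-matching pixels cheaply.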
37

Eberhardt, Sven. "Analysis and Modeling of Visual Invariance for Object Recognition and Spatial Cognition." Supervised by Kerstin Schill; reviewed by Kerstin Schill and Manfred Fahle. Bremen: Staats- und Universitätsbibliothek Bremen, 2015. http://d-nb.info/1072746344/34.

38

Gundam, Madhuri. "Automatic Classification of Fish in Underwater Video; Pattern Matching - Affine Invariance and Beyond." ScholarWorks@UNO, 2015. http://scholarworks.uno.edu/td/1976.

Abstract:
Underwater video is used by marine biologists to observe, identify, and quantify living marine resources. Video sequences are typically analyzed manually, which is a time-consuming and laborious process. Automating this process will save significant time and cost. This work proposes a technique for automatic fish classification in underwater video. The steps involved are background subtraction, fish region tracking, and feature-based classification. Background processing is used to separate moving objects from their surrounding environment. Tracking associates multiple views of the same fish in consecutive frames. This step is especially important since recognizing and classifying one or a few of the views as a species of interest may allow labeling the sequence as that particular species. Shape features are extracted from each object using Fourier descriptors and are presented to a nearest neighbor classifier. Finally, the nearest neighbor classifier results are combined using a probabilistic-like framework to classify an entire sequence. The majority of existing pattern matching techniques focus on affine invariance, mainly because rotation, scale, translation and shear are common image transformations. However, in some situations, other transformations may be modeled as a small deformation on top of an affine transformation. The proposed algorithm complements the existing Fourier transform-based pattern matching methods in such a situation. First, the spatial domain pattern is decomposed into non-overlapping concentric circular rings centered at the middle of the pattern. The Fourier transforms of the rings are computed and then mapped to the polar domain. The algorithm assumes that the individual rings are rotated with respect to each other. The variable angles of rotation provide information about the directional features of the pattern.
This angle of rotation is determined starting from the Fourier transform of the outermost ring and moving inwards to the innermost ring. Two different approaches, one using a dynamic programming algorithm and the other using a greedy algorithm, are used to determine the directional features of the pattern.
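The first half of the abstract (Fourier descriptors fed to a nearest neighbor classifier) can be illustrated with the textbook form of the technique. The sketch below assumes the object boundary is available as an ordered, closed list of (x, y) points traversed in a consistent direction; the names `fourier_descriptor` and `nearest_neighbour`, the `order` parameter and the normalisation by the largest magnitude are illustrative choices, not details taken from the thesis.

```python
import cmath
import math


def fourier_descriptor(points, order=4):
    """Shape signature invariant to translation, rotation, scale and start point."""
    z = [complex(x, y) for x, y in points]
    n = len(z)

    def coefficient(k):
        # Direct evaluation of one DFT coefficient of the boundary signal.
        return sum(z[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))

    # k = 0 only encodes position, so it is skipped. Magnitudes discard the
    # phase, which is where rotation and the choice of start point live.
    mags = []
    for k in range(1, order + 1):
        mags.append(abs(coefficient(k)))
        mags.append(abs(coefficient(-k)))
    scale = max(mags)  # assumes a non-degenerate contour, so scale > 0
    return [m / scale for m in mags]


def nearest_neighbour(query, labelled):
    """Return the label whose descriptor is closest (Euclidean) to the query."""
    return min(labelled, key=lambda item: math.dist(query, item[1]))[0]
```

Given a list of `(label, descriptor)` pairs built from reference contours, `nearest_neighbour(fourier_descriptor(contour), references)` returns the label of the closest reference shape. Reversing the traversal direction of a contour swaps positive and negative frequencies, so contours should be oriented consistently before comparison.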
39

Tromans, James Matthew. "Computational neuroscience of natural scene processing in the ventral visual pathway." Thesis, University of Oxford, 2012. http://ora.ox.ac.uk/objects/uuid:b82e1332-df7b-41db-9612-879c7a7dda39.

Abstract:
Neural responses in the primate ventral visual system become more complex in the later stages of the pathway. For example, not only do neurons in IT cortex respond to complete objects, they also learn to respond invariantly with respect to the viewing angle of an object and also with respect to the location of an object. These types of neural responses have helped guide past research with VisNet, a computational model of the primate ventral visual pathway that self-organises during learning. In particular, previous research has focussed on presenting to the model one object at a time during training, and has placed emphasis on the transform invariant response properties of the output neurons of the model that consequently develop. This doctoral thesis extends previous VisNet research and investigates the performance of the model with a range of more challenging and ecologically valid training paradigms, for example when multiple objects are presented to the network during training, or when objects partially occlude one another during training. The different mechanisms that help output neurons to develop object selective, transform invariant responses during learning are proposed and explored. Such mechanisms include the statistical decoupling of objects through multiple object pairings, and the separation of object representations by independent motion. Consideration is also given to the heterogeneous response properties of neurons that develop during learning. For example, although IT neurons demonstrate a number of differing invariances, they also convey spatial information and view specific information about the objects presented on the retina. An updated, scaled-up version of the VisNet model, with a significantly larger retina, is introduced in order to explore these heterogeneous neural response properties.
40

Minařík, Martin. "Strukturální metody identifikace objektů pro řízení průmyslového robotu." Doctoral thesis, Vysoké učení technické v Brně. Fakulta strojního inženýrství, 2009. http://www.nusl.cz/ntk/nusl-233840.

Abstract:
This PhD thesis deals with the use of structural methods of object identification for the control of industrial robots. First, the present state of knowledge in the field is described, i.e. the whole process of object recognition with the aid of common methods of syntactic analysis. The main disadvantage of these methods is that they cannot recognize objects whose digitized image is corrupted in some way (due to excessive noise or image disturbances), so that the objects appear deformed. Next, other methods for the recognition of deformed objects are described. These methods use a structural description of objects for recognition, i.e. they determine the distance between attribute descriptions of images. The core part of the thesis begins in Chapter 5, where deformation grammars, capable of describing all possible object deformations, are introduced. The only complication in the analysis is the ambiguity of the deformation grammar, which lowers the effectiveness of the analysis. The thesis then deals with the selection and modification of a suitable parser that can analyze a deformation grammar efficiently. Three parsers are described: the modified Earley parser, the modified Tomita parser and the modified hybrid LRE(k) parser. For the modified Earley parser, ways of implementing it efficiently are described. Ensuring invariances is a necessary part of object recognition, and the thesis covers it in detail as well. Finally, the results of the described algorithms (success rate and speed of recognition of deformed objects) are reported, and the proposed testing environment and the implemented algorithms are described. In conclusion, the identified capabilities of deformation grammars and their results are summarized.
41

López, Guillermo Ángel Pérez. "AFORAPRO: reconhecimento de objetos invariante sob transformações afins." Universidade de São Paulo, 2011. http://www.teses.usp.br/teses/disponiveis/3/3142/tde-31052011-155411/.

Abstract:
Object recognition is a basic application in image processing and computer vision. The usual recognition procedure consists of finding occurrences of a query image in another image to be analyzed. Consequently, if the images differ in camera viewpoint, the algorithm normally fails. Viewpoint invariance is the property that allows an object to be recognized even when it shows distortions resulting from a perspective transformation caused by a change of viewpoint. An approach based on viewpoint simulation, called ASIFT, has recently been proposed to address this problem. ASIFT is viewpoint invariant; however, it fails in the presence of repetitive patterns and low contrast. The objective of our work is to use a variant of the viewpoint simulation technique in combination with the extraction of Fourier coefficients of radial and circular projections (FORAPRO) to propose an algorithm that is viewpoint invariant and robust to repetitive patterns and low contrast. In general, our proposal consists of the following stages: (a) we distort the image, varying the tilt and rotation parameters of the camera, to generate several models and achieve invariance to perspective deformations; (b) we use each one as a model to be searched for in the image, in order to choose the one that matches best; (c) we perform the template matching. The last two stages are based on features invariant to rotation, scale, brightness and contrast, extracted from the Fourier coefficients. Our approach, which we call AFORAPRO, was tested on 350 images with diverse requirements, and proved to be viewpoint invariant and to perform very well in the presence of repetitive patterns and low contrast.
42

Wilbert, Niko. "Hierarchical Slow Feature Analysis on visual stimuli and top-down reconstruction." Doctoral thesis, Humboldt-Universität zu Berlin, Mathematisch-Naturwissenschaftliche Fakultät I, 2012. http://dx.doi.org/10.18452/16526.

Abstract:
This thesis examines a model of the visual system that is based on the principle of unsupervised slowness learning and uses Slow Feature Analysis (SFA). We apply this model to the task of invariant object recognition and several related problems. The model not only learns to extract the underlying discrete variables of the stimuli (e.g., identity of the shown object) but also to extract continuous variables (e.g., position and rotational angles). It is shown to be capable of dealing with complex transformations like in-depth rotation. The performance of the model is first measured with the help of supervised post-processing methods. We then show that biologically motivated methods like reinforcement learning are also capable of processing the high-level output from the model. This enables reinforcement learning to deal with high-dimensional visual stimuli. In the second part of this thesis we try to extend the model with top-down processes, centered around the task of reconstructing visual stimuli. We utilize the method of vector quantization and combine it with gradient descent. The key components of our simulation software have been integrated into an open-source software library, the Modular toolkit for Data Processing (MDP). These components are presented in the last part of the thesis.
43

Yokono, Jerry Jun, and Tomaso Poggio. "Rotation Invariant Object Recognition from One Training Example." 2004. http://hdl.handle.net/1721.1/30465.

Abstract:
Local descriptors are increasingly used for the task of object recognition because of their perceived robustness with respect to occlusions and to global geometrical deformations. Such a descriptor, based on a set of oriented Gaussian derivative filters, is used in our recognition system. We report here an evaluation of several techniques for orientation estimation to achieve rotation invariance of the descriptor. We also describe feature selection based on a single training image. Virtual images are generated by rotating and rescaling the image and robust features are selected. The results confirm robust performance in cluttered scenes, in the presence of partial occlusions, and when the object is embedded in different backgrounds.
44

Lin, Nan-Chieh, and 林楠傑. "Efficient Wavelet-Based Scale Invariant Features for Object Recognition." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/63665272937756208162.

Abstract:
Master's thesis, Tamkang University, Master's Program of the Department of Computer Science and Information Engineering, academic year 97.
Feature point matching is a popular method for dealing with object recognition problems. However, image variations such as shift, rotation, and scaling reduce matching correctness. The main challenge is therefore to build a feature point matching system with a distinctive and invariant feature point detector as well as a robust description mechanism. We use the discrete wavelet transform (DWT) and an accumulated map to detect feature points, which are the local maxima of the accumulated map. DWT calculation is very efficient compared to Harris corner detection or the Difference of Gaussians (DoG) proposed by Lowe. In addition, feature points detected by the DWT are distributed more evenly over textured areas, whereas those detected by the Harris detector cluster at corners. To be scale invariant, the dominant scale (DS) is determined for each feature point. According to the DS of a feature point, an appropriately sized region centered at this feature point is transformed to a log-polar coordinate system to improve rotation and scale invariance. A descriptor of dimension 32 is built from contrast information to enhance robustness to illumination. Finally, in the matching stage, a geometric relation is adopted to improve the matching accuracy. Compared to existing methods, the proposed algorithm performs better, especially in scale invariance and robustness to blurring.
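The log-polar step mentioned in the abstract rests on a simple geometric fact that a few lines can make concrete: in (log radius, angle) coordinates, scaling a pattern about the centre adds a constant to the first coordinate and rotating it adds a constant to the second. The helper names `to_log_polar` and `scale_and_rotate` below are hypothetical, and the sketch works on single points rather than on image regions as the thesis does.

```python
import math


def to_log_polar(x, y, cx=0.0, cy=0.0):
    """Map a point to (log radius, angle) relative to the centre (cx, cy).

    The point must differ from the centre, otherwise the log is undefined.
    """
    dx, dy = x - cx, y - cy
    return math.log(math.hypot(dx, dy)), math.atan2(dy, dx)


def scale_and_rotate(x, y, scale, angle):
    """Apply a scaling and a rotation about the origin."""
    c, s = math.cos(angle), math.sin(angle)
    return scale * (c * x - s * y), scale * (s * x + c * y)
```

For instance, mapping the point (3, 4) and its image under `scale_and_rotate(3, 4, 2.0, 0.5)` through `to_log_polar` yields coordinates that differ by `math.log(2.0)` in log radius and by 0.5 in angle (up to angle wrap-around at ±π), so a descriptor sampled on a log-polar grid only shifts under these transformations instead of changing shape.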
45

Werkhoven, Shaun. "Improving interest point object recognition." Thesis, 2010. http://hdl.handle.net/1959.13/804109.

Abstract:
Research Doctorate - Doctor of Philosophy (PhD)
Vision is a fundamental ability for humans. It is essential to a wide range of activities. The ability to see underpins almost all tasks of our day-to-day life. It is also an ability exercised by people almost effortlessly. Yet, in spite of this, it is an ability that is still poorly understood, and it has been possible to reproduce it in machines only to a very limited degree. This work grows out of a belief that substantial progress is currently being made in understanding visual recognition processes. Advances in algorithms and computer power have recently resulted in clear and measurable progress in recognition performance. Many of the key advances in recognizing objects have related to recognition of key points or interest points. Such image primitives now underpin a wide array of tasks in computer vision such as object recognition, structure from motion, and navigation. The object of this thesis is to find ways to improve the performance of such interest point methods. The most popular interest point methods such as SIFT (Scale Invariant Feature Transform) consist of a descriptor, a feature detector and a standard distance metric. This thesis outlines methods whereby all of these elements can be varied to deliver higher performance in some situations. SIFT is a performance standard to which we often refer herein. Typically, the standard Euclidean distance metric is used as a distance measure with interest points. This metric fails to take account of the specific geometric nature of the information in the descriptor vector. By varying this distance measure in a way that accounts for its geometry we show that performance improvements can be obtained. We investigate whether this can be done in an effective and computationally efficient way. Use of sparse detectors or feature points is a mainstay of current interest point methods. Yet such an approach is questionable for class recognition since the most discriminative points may not be selected by the detector.
We therefore develop a dense interest point method, whereby interest points are calculated at every point. This requires a low dimensional descriptor to be computationally feasible. Also, we use aggressive approximate nearest neighbour methods. These dense features can be used for both point matching and class recognition, and we provide experimental results for each. These results show that it is competitive with, and in some cases superior to, traditional interest point methods. Having formed dense descriptors, we then have a multi-dimensional quantity at every point. Each of these can be regarded as a new image and descriptors can be applied to them again. Thus we have higher level descriptors – ‘descriptors upon descriptors’. Experimental results are obtained demonstrating that this provides an improvement to matching performance. Standard image databases are used for experiments. The application of these methods to several tasks, such as navigation (or structure from motion) and object class recognition is discussed.
46

Nagao, Kanji, and W. Eric L. Grimson. "Object Recognition By Alignment Using Invariant Projections of Planar Surfaces." 1994. http://hdl.handle.net/1721.1/6623.

Abstract:
In order to recognize an object in an image, we must determine the best transformation from object model to the image. In this paper, we show that for features from coplanar surfaces which undergo linear transformations in space, there exist projections invariant to the surface motions up to rotations in the image field. To use this property, we propose a new alignment approach to object recognition based on centroid alignment of corresponding feature groups. This method uses only a single pair of 2D model and data. Experimental results show the robustness of the proposed method against perturbations of feature positions.
47

Zhang, Yuhang. "Local invariant feature based object retrieval in a supermarket." Master's thesis, 2009. http://hdl.handle.net/1885/150903.

48

LAN, SUI-GING, and 藍遂青. "Invariant object recognition for robot vision using a single neural network." Thesis, 1989. http://ndltd.ncl.edu.tw/handle/20284166199952394633.

49

Tzeng, Chih-Hung, and 曾智宏. "Using Local Invariant in Occluded Object Recognition by Hopfield Neural Network." Thesis, 2003. http://ndltd.ncl.edu.tw/handle/61679627653397484929.

Abstract:
Master's thesis, National Sun Yat-sen University, Graduate Institute of Mechanical and Electro-Mechanical Engineering, academic year 91.
In our research, we propose a novel invariant for 2-D image contour recognition based on the Hopfield-Tank neural network. First, we search for feature points, which are located at high-curvature points and corners on the contour, and use a polygonal approximation to describe the image contour. Two patterns are defined: a model pattern and a test pattern. The Hopfield-Tank network is employed to perform feature matching between them. Our results show that the method can handle test patterns that have undergone translation, rotation and scaling, whether the pattern appears alone or is occluded.
50

Dahmen, Jörg. "Invariant image object recognition using Gaussian mixture densities." 2001. http://d-nb.info/964586940/34.
