Dissertations / Theses: 'Scale Invariant Feature Descriptor'

1

Emir, Erdem. "A Comparative Performance Evaluation Of Scale Invariant Interest Point Detectors For Infrared And Visual Images." Master's thesis, METU, 2008. http://etd.lib.metu.edu.tr/upload/2/12610159/index.pdf.

Abstract:

In this thesis, the performance of four state-of-the-art feature detectors, together with the SIFT and SURF descriptors, is evaluated for matching object features in mid-wave infrared, long-wave infrared and visual-band images across changes in viewpoint and distance. The feature detectors used are the Scale Invariant Feature Transform (SIFT), multiscale Harris-Laplace, multiscale Hessian-Laplace and Speeded Up Robust Features (SURF) detectors, all of which are invariant to image scale and rotation. Features are extracted from images of different blackbodies, human faces and vehicles, and the reliability of matching is explored between different views of these objects, each within its own category. All of these feature detectors provide good matching performance in infrared-band images compared with visual-band images. The matching performance for mid-wave and long-wave infrared images is also compared, and it is observed that long-wave infrared images provide good matching performance for objects at lower temperatures, whereas mid-wave infrared images provide good matching performance for objects at higher temperatures. For human face images in the long-wave infrared band, the SURF detector and descriptor are found to outperform the other detectors and descriptors.

2

Hall, Daniela. "Viewpoint independent recognition of objects from local appearance." Grenoble INPG, 2001. http://www.theses.fr/2001INPG0086.

3

Kerr, Dermot. "Autonomous Scale Invariant Feature Extraction." Thesis, University of Ulster, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.502896.

4

Saad, Elhusain Salem. "Defocus Blur-Invariant Scale-Space Feature Extractions." University of Dayton / OhioLINK, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1418907974.

5

Shen, Yao. "Scene Analysis Using Scale Invariant Feature Extraction and Probabilistic Modeling." Thesis, University of North Texas, 2011. https://digital.library.unt.edu/ark:/67531/metadc84275/.

Abstract:

Conventional pattern recognition systems have two components: feature analysis and pattern classification. For any object in an image, features can be considered the major characteristics of the object, for either object recognition or object tracking. Features extracted from a training image can be used to identify the object when attempting to locate it in a test image containing many other objects. To perform reliable scene analysis, it is important that the features extracted from the training image remain detectable under changes in image scale, noise and illumination. Scale-invariant features have wide applications in image processing, such as image classification, object recognition and object tracking. In this thesis, color features and SIFT (scale invariant feature transform) features are treated as scale-invariant features. The classification, recognition and tracking results were evaluated with a novel evaluation criterion and compared with some existing methods. I also studied different types of scale-invariant features for the purpose of solving scene analysis problems. I propose probabilistic models as the foundation for analysing the scene content of images. In order to differentiate the content of images, I develop novel algorithms for the adaptive combination of multiple features extracted from images. I demonstrate the performance of the developed algorithms on several scene analysis tasks, including object tracking, video stabilization, medical video segmentation and scene classification.

6

Zhang, Zheng, and 张政. "Passivity assessment and model order reduction for linear time-invariant descriptor systems in VLSI circuit simulation." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2010. http://hub.hku.hk/bib/B44909056.

Abstract:

The Best MPhil Thesis in the Faculties of Dentistry, Engineering, Medicine and Science (University of Hong Kong), Li Ka Shing Prize, 2009-2010.
Department: Electrical and Electronic Engineering.
Degree: Master of Philosophy.

7

Accordino, Andrea. "Studio e sviluppo di descrittori locali per nuvole di punti basati su proprietà geometriche." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/17919/.

Abstract:

In this work, two new descriptors for point clouds are proposed: ReSHOT and KPL-Descriptor. In addition, several ideas for improving the performance of the whole feature-matching pipeline were tested. The work includes a comparison with pre-existing descriptors.

8

Lindeberg, Tony. "Scale Selection Properties of Generalized Scale-Space Interest Point Detectors." KTH, Beräkningsbiologi, CB, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-101220.

Abstract:

Scale-invariant interest points have found several highly successful applications in computer vision, in particular for image-based matching and recognition. This paper presents a theoretical analysis of the scale selection properties of a generalized framework for detecting interest points from scale-space features presented in Lindeberg (Int. J. Comput. Vis. 2010, under revision) and comprising: an enriched set of differential interest operators at a fixed scale including the Laplacian operator, the determinant of the Hessian, the new Hessian feature strength measures I and II and the rescaled level curve curvature operator, as well as an enriched set of scale selection mechanisms including scale selection based on local extrema over scale, complementary post-smoothing after the computation of non-linear differential invariants and scale selection based on weighted averaging of scale values along feature trajectories over scale. A theoretical analysis of the sensitivity to affine image deformations is presented, and it is shown that the scale estimates obtained from the determinant of the Hessian operator are affine covariant for an anisotropic Gaussian blob model. Among the other purely second-order operators, the Hessian feature strength measure I has the lowest sensitivity to non-uniform scaling transformations, followed by the Laplacian operator and the Hessian feature strength measure II. The predictions from this theoretical analysis agree with experimental results of the repeatability properties of the different interest point detectors under affine and perspective transformations of real image data. A number of less complete results are derived for the level curve curvature operator.
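
The scale selection idea in this abstract can be checked numerically in the simplest case. The sketch below assumes a 2D isotropic Gaussian blob of variance t0 (a simplification of the anisotropic blob model mentioned in the abstract), for which the scale-normalized Laplacian and determinant-of-Hessian responses at the blob centre have closed forms. The function names and the scale grid are illustrative assumptions, not taken from the paper.

```python
import math

def norm_laplacian_at_center(t, t0):
    """|t * Laplacian(L)| at the centre of a 2D Gaussian blob of variance t0.

    Smoothing the blob with a Gaussian of variance t gives a Gaussian of
    variance s = t0 + t, whose Laplacian at the centre is -1 / (pi * s**2).
    """
    s = t0 + t
    return t / (math.pi * s ** 2)

def norm_det_hessian_at_center(t, t0):
    """t**2 * det(Hessian(L)) at the centre of the same blob."""
    s = t0 + t
    return t ** 2 / (4.0 * math.pi ** 2 * s ** 4)

def select_scale(response, t0, scales):
    """Return the scale at which the response is largest (extremum over scale)."""
    return max(scales, key=lambda t: response(t, t0))

# Hypothetical scale grid: 0.25, 0.5, ..., 50.0
scales = [0.25 * k for k in range(1, 201)]
selected_lap = select_scale(norm_laplacian_at_center, 4.0, scales)
selected_det = select_scale(norm_det_hessian_at_center, 4.0, scales)
print(selected_lap, selected_det)
```

For this idealized blob, both operators select the scale equal to the blob variance, which is the sense in which the selected scale reflects the size of the image structure.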

Image descriptors and scale-space theory for spatial and spatio-temporal recognition

9

May, Michael. "Data analytics and methods for improved feature selection and matching." Thesis, University of Manchester, 2012. https://www.research.manchester.ac.uk/portal/en/theses/data-analytics-and-methods-for-improved-feature-selection-and-matching(965ded10-e3a0-4ed5-8145-2af7a8b5e35d).html.

Abstract:

This work focuses on analysing and improving feature detection and matching. After creating an initial framework of study, four main areas of work are researched. These areas make up the main chapters of this thesis and focus on using the Scale Invariant Feature Transform (SIFT). The preliminary analysis of SIFT investigates how the algorithm functions. Included are an analysis of the SIFT feature descriptor space and an investigation into the noise properties of SIFT. The thesis introduces a novel use of the a contrario methodology and shows the success of this method as a way of discriminating between images that are likely to contain corresponding regions and images that are not. Parameter analysis of SIFT uses both parameter sweeps and genetic algorithms as an intelligent means of setting the SIFT parameters for different image types, utilising a GPGPU implementation of SIFT. The results demonstrate which parameters are more important when optimising the algorithm and which areas of the parameter space to focus on when tuning the values. A multi-exposure, High Dynamic Range (HDR) feature fusion process has been developed in which SIFT image features are matched within high contrast scenes. Bracketed exposure images are analysed, and features are extracted and combined from different images to create a set of features which describe a larger dynamic range. They are shown to reduce the effects of noise and artefacts that are introduced when extracting features from HDR images directly, and to have superior image matching performance. The final area is the development of a novel, 3D-based SIFT weighting technique which utilises the 3D data from a pair of stereo images to cluster and classify matched SIFT features. Weightings are applied to the matches based on the 3D properties of the features and how they cluster, in order to discriminate between correct and incorrect matches using the a contrario methodology.
The results show that the technique provides a method for discriminating between correct and incorrect matches and that the a contrario methodology has potential for future investigation as a method for correct feature match prediction.

10

Decombas, Marc. "Compression vidéo très bas débit par analyse du contenu." Thesis, Paris, ENST, 2013. http://www.theses.fr/2013ENST0067/document.

Abstract:

The objective of this thesis is to find new methods for semantic video compression compatible with a traditional encoder such as H.264/AVC. The main objective is to maintain the semantics rather than the global quality. A target bitrate of 300 kb/s has been fixed for defense and security applications. To this end, a complete compression chain has been built. A study of, and new contributions to, spatio-temporal saliency models have been carried out to extract the important information in the scene. To reduce the bitrate, a resizing method named seam carving has been combined with the H.264/AVC encoder. In addition, a metric combining SIFT points and SSIM has been created to measure the quality of objects without being disturbed by less important areas, which contain most of the artifacts. A database that can be used for testing saliency models as well as for video compression is proposed, containing sequences with their manually extracted binary masks. The different approaches have been validated by various tests. An extension of this work to video summarization applications is also proposed.
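
Seam carving, the resizing method named in this abstract, removes connected low-energy paths of pixels. A minimal sketch of the usual dynamic-programming search for one vertical seam follows. The energy grid is a made-up toy example and the function names are hypothetical; the saliency-driven energy and the coupling with the H.264/AVC encoder described in the thesis are not reproduced here.

```python
def find_vertical_seam(energy):
    """Return one column index per row for a minimum-energy vertical seam.

    `energy` is a list of rows; a seam may move at most one column
    left or right between consecutive rows.
    """
    rows, cols = len(energy), len(energy[0])
    cost = [list(energy[0])]          # cumulative minimum cost per pixel
    back = []                         # back[r - 1][c] = column chosen in row r - 1
    for r in range(1, rows):
        prev, cur_cost, cur_back = cost[-1], [], []
        for c in range(cols):
            candidates = [k for k in (c - 1, c, c + 1) if 0 <= k < cols]
            best = min(candidates, key=lambda k: prev[k])
            cur_cost.append(energy[r][c] + prev[best])
            cur_back.append(best)
        cost.append(cur_cost)
        back.append(cur_back)
    # Backtrack from the cheapest pixel in the last row.
    c = min(range(cols), key=lambda k: cost[-1][k])
    seam = [c]
    for r in range(rows - 1, 0, -1):
        c = back[r - 1][c]
        seam.append(c)
    seam.reverse()
    return seam

def remove_seam(image, seam):
    """Drop the seam pixel from every row, narrowing the image by one column."""
    return [row[:c] + row[c + 1:] for row, c in zip(image, seam)]

# Toy energy map: the low values trace the path a seam should follow.
toy_energy = [
    [9, 1, 9, 9],
    [9, 9, 1, 9],
    [9, 9, 1, 9],
]
seam = find_vertical_seam(toy_energy)
narrower = remove_seam(toy_energy, seam)
```

Repeating the search and removal narrows the frame one column at a time while leaving high-energy (here, presumably salient) regions untouched.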

11

Dardas, Nasser Hasan Abdel-Qader. "Real-time Hand Gesture Detection and Recognition for Human Computer Interaction." Thèse, Université d'Ottawa / University of Ottawa, 2012. http://hdl.handle.net/10393/23499.

Abstract:

This thesis focuses on bare hand gesture recognition by proposing a new architecture to solve the problem of real-time vision-based hand detection, tracking, and gesture recognition for interaction with an application via hand gestures. The first stage of our system detects and tracks a bare hand in a cluttered background using face subtraction, skin detection and contour comparison. The second stage recognizes hand gestures using bag-of-features and multi-class Support Vector Machine (SVM) algorithms. Finally, a grammar has been developed to generate gesture commands for application control. Our hand gesture recognition system consists of two steps: offline training and online testing. In the training stage, after extracting the keypoints for every training image using the Scale Invariant Feature Transform (SIFT), a vector quantization technique maps the keypoints from every training image into a histogram vector of unified dimension (bag-of-words) after K-means clustering. This histogram is treated as an input vector for a multi-class SVM to build the classifier. In the testing stage, for every frame captured from a webcam, the hand is detected using my algorithm. Then, the keypoints are extracted for every small image that contains the detected hand posture and fed into the cluster model to map them into a bag-of-words vector, which is fed into the multi-class SVM classifier to recognize the hand gesture. Another hand gesture recognition system was proposed using Principal Component Analysis (PCA). The most significant eigenvectors and the weights of the training images are determined. In the testing stage, the hand posture is detected for every frame using my algorithm. Then, the small image that contains the detected hand is projected onto the most significant eigenvectors of the training images to form its test weights. Finally, the minimum Euclidean distance between the test weights and the training weights of each training image is determined to recognize the hand gesture.
Two applications of gesture-based interaction with a 3D gaming virtual environment were implemented. The first, an exertion videogame, uses a stationary bicycle as one of the main inputs for game playing; the user can control left-right movement and shooting actions in the game with a set of hand gesture commands. In the second game, the user can steer a helicopter over a city with a set of hand gesture commands.
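
The vector quantization step in this abstract, which turns a variable number of keypoint descriptors into one fixed-length histogram, can be sketched as follows. The 2-D descriptors and the three cluster centres are toy values chosen for illustration (real SIFT descriptors have 128 dimensions, and the centres would come from K-means on training data). The function names are hypothetical, and normalising the counts is an assumption of this sketch rather than something the abstract states.

```python
import math

def nearest_center(descriptor, centers):
    """Index of the cluster centre closest to the descriptor (Euclidean)."""
    return min(range(len(centers)),
               key=lambda i: math.dist(descriptor, centers[i]))

def bag_of_words(descriptors, centers):
    """Normalised histogram of how many descriptors fall in each cluster."""
    hist = [0] * len(centers)
    for d in descriptors:
        hist[nearest_center(d, centers)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * len(centers)

# Toy codebook (stand-in for K-means output) and toy keypoint descriptors.
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
descriptors = [(0.5, 0.2), (9.0, 1.0), (8.5, -0.5), (1.0, 9.5)]
histogram = bag_of_words(descriptors, centers)
```

Whatever the number of keypoints in an image, the resulting vector has one entry per cluster, which is what lets it serve as a fixed-size input to a classifier such as the multi-class SVM.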

12

Mykhalchuk, Vasyl. "Correspondance de maillages dynamiques basée sur les caractéristiques." Thesis, Strasbourg, 2015. http://www.theses.fr/2015STRAD010/document.

Abstract:

Shape correspondence is a fundamental problem in many research disciplines, such as computational geometry, computer vision and computer graphics. Commonly defined as the problem of finding an injective/multivalued correspondence between a source and a target, it constitutes a central task in many applications, including attribute transfer, shape retrieval, etc. In shape retrieval, one can first compute the correspondence between the query shape and the shapes in a database, then obtain the best match using a predefined correspondence quality measure. Shape correspondence is also particularly advantageous in applications based on statistical shape modelling. By encapsulating the statistical properties of the subject's anatomy in the shape model, such as geometric variations, density variations, etc., it is useful not only for analysing anatomical structures such as organs or bones and their valid variations, but also for learning the deformation models of the object class. In this thesis, we investigate a new shape matching method that exploits the large redundancy of information in dynamic, time-varying datasets. Recently, a large amount of research has been carried out in computer graphics on establishing correspondences between static meshes (Anguelov, Srinivasan et al. 2005, Aiger, Mitra et al. 2008, Castellani, Cristani et al. 2008). These methods rely on geometric features or on extrinsic/intrinsic properties of static surfaces (Lipman and Funkhouser 2009, Sun, Ovsjanikov et al. 2009, Ovsjanikov, Mérigot et al. 2010, Kim, Lipman et al. 2011) to prune pairs efficiently.
Although the use of geometric features is still a gold standard, methods relying solely on static shape information can produce grossly misleading correspondence results when the shapes are radically different or do not contain enough geometric features. [...]
3D geometry modelling tools and 3D scanners are becoming more capable and more affordable. As a result, the development of new algorithms in geometry processing, shape analysis and shape correspondence is gathering momentum in computer graphics. These algorithms steadily extend, and increasingly replace, prevailing methods based on images and videos. Non-rigid shape correspondence, or deformable shape matching, has long been studied in computer graphics and related research fields. Shape correspondence is widely used in applications such as statistical shape analysis, motion cloning, texture transfer, medical applications and many more. However, robust and efficient non-rigid shape correspondence remains a challenging task due to fundamental variations between individual subjects, acquisition noise and the number of degrees of freedom involved in the correspondence search. Although the dynamic 2D/3D intra-subject shape correspondence problem has been addressed by a rich set of previous methods, dynamic inter-subject shape correspondence has received much less attention. The primary purpose of our research is to develop a novel, efficient and robust deforming shape analysis and correspondence framework for animated meshes based on their dynamic and motion properties. We develop our method by exploiting the rich motion data exhibited by deforming meshes with time-varying embedding. Our approach is based on the observation that a dynamic, deforming shape of a given subject contains much more information than a single static posture of it.
This differs from existing methods, which rely on static shape information for shape correspondence and analysis. Our framework for deforming shape analysis and correspondence of animated meshes comprises several major contributions: a new dynamic feature detection technique based on the multi-scale deformation characteristics of animated meshes, a novel dynamic feature descriptor, and an adaptation of a robust graph-based feature correspondence approach followed by fine matching of the animated meshes. [...]

13

Sahin, Yavuz. "A Programming Framework To Implement Rule-based Target Detection In Images." Master's thesis, METU, 2008. http://etd.lib.metu.edu.tr/upload/12610213/index.pdf.

Abstract:

An expert system is useful when conventional programming techniques fall short of capturing human expert knowledge and making decisions using this information. In this study, we describe a framework for capturing expert knowledge in the form of a decision tree; the framework can then be used to make decisions based on the captured knowledge. The proposed framework is generic and can be used to create domain-specific expert systems for different problems. Features are created or processed by the nodes of the decision tree, and a final conclusion is reached for each feature. The framework supplies three types of nodes for constructing a decision tree. The first type is the decision node, which guides the search path with its answers. The second type is the operator node, which creates new features from its inputs. The last type is the end node, which corresponds to a conclusion about a feature. Once the nodes of the tree are developed, the user can interactively create the decision tree and run the supplied inference engine to collect the result for a specific problem. The proposed framework is tested on two case studies: "Airport Runway Detection in High Resolution Satellite Images" and "Urban Area Detection in High Resolution Satellite Images". In these studies, linear features are used for structural decisions, and Scale Invariant Feature Transform (SIFT) features are used to test for the existence of man-made structures.
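
The three node types in this abstract can be sketched as a small set of classes. Everything below is hypothetical: the class names, the dictionary-based feature representation and the toy runway rule are illustrative assumptions, not the actual framework or API of the thesis.

```python
class EndNode:
    """Leaf: attaches a conclusion to the feature."""
    def __init__(self, conclusion):
        self.conclusion = conclusion

    def run(self, feature):
        return self.conclusion

class DecisionNode:
    """Asks a yes/no question about the feature and follows one branch."""
    def __init__(self, question, if_yes, if_no):
        self.question, self.if_yes, self.if_no = question, if_yes, if_no

    def run(self, feature):
        branch = self.if_yes if self.question(feature) else self.if_no
        return branch.run(feature)

class OperatorNode:
    """Derives a new feature from its input, then passes it on."""
    def __init__(self, operator, child):
        self.operator, self.child = operator, child

    def run(self, feature):
        return self.child.run(self.operator(feature))

def add_aspect_ratio(feature):
    # Operator: extend the feature with a derived attribute.
    return {**feature, "aspect": feature["length"] / feature["width"]}

# Toy rule (assumed, not from the thesis): very elongated linear
# features are kept as runway candidates.
tree = OperatorNode(
    add_aspect_ratio,
    DecisionNode(lambda f: f["aspect"] > 20,
                 EndNode("runway candidate"),
                 EndNode("rejected")),
)

result_long = tree.run({"length": 3000.0, "width": 45.0})
result_short = tree.run({"length": 100.0, "width": 50.0})
```

Running the tree on a feature plays the role of the inference engine in this sketch: each feature travels from the root to exactly one end node and receives that node's conclusion.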

14

Murtin, Chloé Isabelle. "Traitement d’images de microscopie confocale 3D haute résolution du cerveau de la mouche Drosophile." Thesis, Lyon, 2016. http://www.theses.fr/2016LYSEI081/document.

Abstract:

Although laser scanning microscopy is a powerful tool for obtaining thin optical sections, the possible depth of imaging is limited not only by the working distance of the microscope objective but also by the image degradation caused by the attenuation of both the excitation laser beam and the light emitted from the fluorescence-labeled objects. Several workaround techniques have been employed to overcome this problem, such as recording the images from both sides of the sample, or progressively cutting off the sample surface. The different views must then be combined into a single volume. However, a straightforward concatenation is often not possible, because small movements of the sample during the acquisition procedure introduce offsets not only in translation along the x, y and z axes but also in rotation around those axes, making the fusion difficult. To address this problem we implemented a new algorithm called 2D-SIFT-in-3D-Space, which uses SIFT (Scale Invariant Feature Transform) to achieve a robust registration of big image stacks. Our method registers the images by separately correcting rotations and translations around and along the three axes, using the extraction and matching of stable features in 2D cross-sections. To evaluate the registration quality, we created a simulator that generates artificial images mimicking laser scanning image stacks; it produces a mock pair of image stacks, one of which is derived from the other but rotated arbitrarily by known angles and degraded with a known noise level. For a precise and natural-looking concatenation of the two images, we also developed a module that progressively corrects brightness and contrast as a function of the distance to the sample surface. These tools were successfully used to generate three-dimensional high-resolution images of the brain of the fly Drosophila melanogaster, in particular of its octopaminergic and dopaminergic neurons and their synapses.
These monoamine neurons appear to be critical for the correct operation of the central nervous system, and a precise and systematic analysis of their evolution and interaction is necessary to understand its mechanisms. Although an evolution over time could not be demonstrated through the analysis of pre-synaptic sites, our study nevertheless suggests that the inactivation of one of these neuron types triggers drastic changes in the neural network.

15

Dellinger, Flora. "Descripteurs locaux pour l'imagerie radar et applications." Thesis, Paris, ENST, 2014. http://www.theses.fr/2014ENST0037/document.

Abstract:

We study here the value of local features for optical and SAR satellite images. These features, because of their invariances and their compact representation, are of real interest for comparing satellite images acquired under different conditions. While they are easy to apply to optical images, they offer limited performance on SAR images because of the strong multiplicative noise in such images. We propose here an original feature for the comparison of SAR images. This algorithm, called SAR-SIFT, relies on the same structure as the SIFT algorithm (detection of keypoints and extraction of descriptors) and offers better performance on SAR images. To adapt these steps to multiplicative noise, we have developed a differential operator, the Gradient by Ratio, which makes it possible to compute a gradient magnitude and orientation that are robust to this type of noise. This operator allows us to modify the steps of the SIFT algorithm. We also present two remote sensing applications based on local features. First, we estimate a global transformation between two SAR images with the help of SAR-SIFT. The estimation is carried out with a RANSAC algorithm, using the matched keypoints as tie points. Finally, we have conducted a prospective study on the use of local features for change detection in remote sensing. The proposed method compares the densities of matched keypoints with the densities of detected keypoints in order to highlight changed areas.
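
Why a ratio-based gradient suits multiplicative noise can be illustrated with a one-dimensional toy example. The sketch below is a simplification under stated assumptions: it uses plain means over the two halves of a window rather than whatever weighting the thesis uses, and the function names and sample values are made up. Scaling the signal by a constant factor changes the difference-based edge strength but leaves the log-ratio unchanged.

```python
import math

def mean(values):
    return sum(values) / len(values)

def difference_edge(left, right):
    """Classical edge strength: difference of the two local means."""
    return abs(mean(right) - mean(left))

def log_ratio_edge(left, right):
    """Ratio-based edge strength: absolute log of the ratio of local means."""
    return abs(math.log(mean(right) / mean(left)))

# The same edge (contrast 2:1) seen in a dark and in a bright region.
dark_left, dark_right = [1.0, 1.0, 1.0], [2.0, 2.0, 2.0]
bright_left = [10.0 * v for v in dark_left]
bright_right = [10.0 * v for v in dark_right]

diff_dark = difference_edge(dark_left, dark_right)
diff_bright = difference_edge(bright_left, bright_right)
ratio_dark = log_ratio_edge(dark_left, dark_right)
ratio_bright = log_ratio_edge(bright_left, bright_right)
```

Since multiplicative noise scales with the local intensity, a difference-based detector responds more strongly in bright areas for the same relative contrast, whereas the ratio-based response depends only on the contrast itself.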

16

Leoputra, Wilson Suryajaya. "Video foreground extraction for mobile camera platforms." Thesis, Curtin University, 2009. http://hdl.handle.net/20.500.11937/1384.

Abstract:

Foreground object detection is a fundamental task in computer vision with many applications in areas such as object tracking, event identification, and behavior analysis. Most conventional foreground object detection methods work only in stable illumination environments using fixed cameras. In real-world applications, however, the algorithm often needs to operate under challenging conditions: drastic lighting changes, object shape complexity, moving cameras, low frame capture rates, and low resolution images. This thesis presents four novel approaches for foreground object detection on real-world datasets using cameras deployed on moving vehicles. The first problem addresses passenger detection and tracking for public transport buses, investigating changing illumination conditions and low frame capture rates. Our approach integrates a stable SIFT (Scale Invariant Feature Transform) background seat modelling method with a human shape model into a weighted Bayesian framework to detect passengers. To deal with the problem of tracking multiple targets, we employ the Reversible Jump Monte Carlo Markov Chain tracking algorithm. Using the SVM classifier, the appearance transformation models capture changes in the appearance of the foreground objects across two consecutive frames under low frame rate conditions. In the second problem, we present a system for pedestrian detection in scenes captured by a mobile bus surveillance system. It integrates scene localization, foreground-background separation, and pedestrian detection modules into a unified detection framework. The scene localization module performs a two-stage clustering of the video data. In the first stage, SIFT Homography is applied to cluster frames in terms of their structural similarity, and the second stage further clusters these aligned frames according to consistency in illumination.
This produces clusters of images that differ from one another in viewpoint and lighting. A kernel density estimation (KDE) technique for colour and gradient is then used to construct background models for each image cluster, which are further used to detect candidate foreground pixels. Finally, using a hierarchical template matching approach, pedestrians can be detected. In addition to the second problem, we present three direct pedestrian detection methods that extend the HOG (Histogram of Oriented Gradient) techniques (Dalal and Triggs, 2005) and provide a comparative evaluation of these approaches. The three approaches are: a) a new histogram feature, formed by the weighted sum of both the gradient magnitude and the filter responses from a set of elongated Gaussian filters (Leung and Malik, 2001) corresponding to the quantised orientation, which we refer to as the Histogram of Oriented Gradient Banks (HOGB) approach; b) the codebook-based HOG feature with the branch-and-bound (efficient subwindow search) algorithm (Lampert et al., 2008); and c) the codebook-based HOGB approach. In the third problem, a unified framework that combines 3D and 2D background modelling is proposed to detect scene changes using a camera mounted on a moving vehicle. The 3D scene is first reconstructed from a set of videos taken at different times. The 3D background modelling identifies inconsistent scene structures as foreground objects. For the 2D approach, foreground objects are detected using the spatio-temporal MRF algorithm. Finally, the 3D and 2D results are combined using morphological operations. The significance of this research is that it provides basic frameworks for automatic large-scale mobile surveillance applications and facilitates many higher-level applications such as object tracking and behaviour analysis.
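
The kernel density estimation step in this abstract can be sketched for a single pixel and a single grey-level channel. This is a minimal illustration, assuming a Gaussian kernel with a hand-picked bandwidth and threshold; the thesis models colour and gradient per image cluster, and the names and numbers below are hypothetical.

```python
import math

def kde_likelihood(value, samples, bandwidth):
    """Gaussian kernel density estimate of `value` given background samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((value - s) / bandwidth) ** 2) for s in samples
    )

def is_foreground(value, samples, bandwidth=5.0, threshold=1e-3):
    """Flag the pixel as foreground when it is unlikely under the background model."""
    return kde_likelihood(value, samples, bandwidth) < threshold

# Grey levels observed at one pixel across the background frames of a cluster.
background_samples = [100, 102, 98, 101, 99, 103]

close_to_background = is_foreground(101, background_samples)   # similar to history
far_from_background = is_foreground(180, background_samples)   # very different
```

Because the model is built from the samples themselves rather than from a fitted parametric distribution, it can represent backgrounds whose appearance at a pixel takes several distinct values.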

APA, Harvard, Vancouver, ISO, and other styles

17

Hejl, Zdeněk. "Rekonstrukce 3D scény z obrazových dat." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2012. http://www.nusl.cz/ntk/nusl-236495.

Full text

Abstract:

This thesis describes methods for reconstructing 3D scenes from photographs and videos using the Structure from Motion approach. New software capable of automatic reconstruction of point clouds and polygonal models from ordinary images and videos was implemented based on these methods. The software uses a variety of existing and custom solutions and links them cleanly into one easily executable application. The reconstruction consists of feature point detection, pairwise matching, bundle adjustment, stereoscopic algorithms, and polygonal model creation from the point cloud using the PCL library. The program is based on Bundler and PMVS. The Poisson surface reconstruction algorithm, simple triangulation, and a custom reconstruction method based on plane segmentation were used for polygonal model creation.

APA, Harvard, Vancouver, ISO, and other styles

18

Saravi, Sara. "Use of Coherent Point Drift in computer vision applications." Thesis, Loughborough University, 2013. https://dspace.lboro.ac.uk/2134/12548.

Full text

Abstract:

This thesis presents the novel use of Coherent Point Drift (CPD) in improving the robustness of a number of computer vision applications. The CPD approach includes two methods for registering two images, rigid and non-rigid point set registration, which differ in the transformation model used. The key characteristic of a rigid transformation is that the distance between points is preserved, which means it can be used in the presence of translation, rotation, and scaling. Non-rigid transformations, or affine transforms, provide the opportunity of registering under non-uniform scaling and skew. The idea is to move one point set coherently to align with the second point set. The CPD method finds both the non-rigid transformation and the correspondence distance between two point sets at the same time, without requiring an a priori declaration of the transformation model used. The first part of this thesis is focused on speaker identification in video conferencing. A real-time, audio-coupled, video-based approach is presented, which focuses more on the video analysis side than on the audio analysis, which is known to be prone to errors. CPD is effectively utilised for lip movement detection, and a temporal face detection approach is used to minimise false positives if the face detection algorithm fails to perform. The second part of the thesis is focused on multi-exposure and multi-focus image fusion with compensation for camera shake. The Scale Invariant Feature Transform (SIFT) is first used to detect keypoints in the images being fused. Subsequently this point set is reduced to remove outliers using RANSAC (RANdom SAmple Consensus), and finally the point sets are registered using CPD with non-rigid transformations. The registered images are then fused with a Contourlet-based image fusion algorithm that makes use of a novel alpha blending and filtering technique to minimise artefacts. The thesis evaluates the performance of the algorithm in comparison to a number of state-of-the-art approaches, including the key commercial products available in the market at present, showing significantly improved subjective quality in the fused images. The final part of the thesis presents a novel approach to Vehicle Make & Model Recognition (VMMR) in CCTV video footage. CPD is used to effectively remove the skew of detected vehicles, as CCTV cameras are not specifically configured for the VMMR task and may capture vehicles at different approach angles. A LESH (Local Energy Shape Histogram) feature-based approach is used for vehicle make and model recognition, with the novelty that temporal processing is used to improve reliability. A number of further algorithms are used to maximise the reliability of the final outcome. Experimental results are provided to show that the proposed system achieves an accuracy in excess of 95% when tested on real CCTV footage with no prior camera calibration.
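The contrast the abstract draws between distance-preserving transformations and affine transforms with non-uniform scaling and skew can be illustrated with a small Python sketch. This is not the CPD algorithm and is not taken from the thesis: the function names, the sample points, and the matrices are illustrative assumptions, and the distance-preserving case is restricted here to rotation plus translation.

```python
import math

def apply_affine(points, matrix, translation):
    """Apply x' = A x + t to a list of 2D points."""
    (a, b), (c, d) = matrix
    tx, ty = translation
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]

def rotation(theta):
    """2x2 rotation matrix for an angle in radians."""
    return ((math.cos(theta), -math.sin(theta)),
            (math.sin(theta), math.cos(theta)))

def pairwise_distances(points):
    """Distances between every unordered pair of points."""
    return [math.dist(p, q)
            for i, p in enumerate(points)
            for q in points[i + 1:]]

def distances_preserved(before, after):
    return all(math.isclose(u, v, abs_tol=1e-9)
               for u, v in zip(pairwise_distances(before),
                               pairwise_distances(after)))

# Hypothetical sample data.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]
rotated = apply_affine(points, rotation(math.pi / 6), (3.0, -1.0))
skewed = apply_affine(points, ((2.0, 0.5), (0.0, 1.0)), (0.0, 0.0))

rotation_keeps_distances = distances_preserved(points, rotated)
skew_keeps_distances = distances_preserved(points, skewed)
```

In this toy case the rotated and translated copy keeps all pairwise distances, while the skewed and non-uniformly scaled copy does not, which is why a more general transformation model is needed to register it.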

APA, Harvard, Vancouver, ISO, and other styles

19

GUPTA, ANKITA. "PERSONAL MULTIMODAL BIOMETRIC AUTHENTICATION USING UNSUPERVISED LEARNING, HIDDEN MARKOV MODEL (HMM)." Thesis, 2016. http://dspace.dtu.ac.in:8080/jspui/handle/repository/14543.

Full text

Abstract:

Biometric authentication systems have been used for decades. Palmprints and finger knuckle prints are two modalities that are universal and unique. A variety of algorithms are available to extract features from these modalities and perform authentication. In this report, an unsupervised machine learning algorithm, the Hidden Markov Model (HMM), is proposed to classify users into genuine and imposter classes. A multimodal system using palmprint and finger knuckle print is proposed that combines the Harris Corner Detector, SIFT descriptors, and a Continuous Density Hidden Markov Model (CDHMM). Here the states that generate the observation feature vectors are hidden. The features are extracted using the Harris Corner Detector and described using Scale Invariant Feature Transform (SIFT) descriptors. An approach is proposed to perform authentication at feature level as well as at score level. The log-likelihood is computed by the HMM and maximised over the model parameters with the Expectation-Maximization algorithm. An iterative approach is used to increase the authentication rates and to find the correct number of states in each user's Hidden Markov Model at feature level, and for the genuine and imposter classes at score level. Various score-level fusion methods are evaluated on the PolyU and IITD palmprint databases and the PolyU finger knuckle print database. The authentication rates obtained are as high as 99% GAR at 0.01 FAR for the PolyU palmprint database, which is comparable to other methods of authentication at score level. The highest GAR was recorded using the SUM fusion rule. The authentication rates are also high for feature-level authentication, for both knuckle prints and palmprints. GAR was recorded as high as 97% for the right middle finger knuckle print at 0.01 FAR.

APA, Harvard, Vancouver, ISO, and other styles

20

"Bending invariant correspondence matching on 3D models with feature descriptor." 2010. http://library.cuhk.edu.hk/record=b5896651.

Full text

Abstract:

Li, Sai Man.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2010.
Includes bibliographical references (leaves 91-96).
Abstracts in English and Chinese.
Abstract
List of Figures
Acknowledgement
Chapter 1 --- Introduction
1.1 --- Problem definition
1.2 --- Proposed algorithm
1.3 --- Main features
Chapter 2 --- Literature Review
2.1 --- Local Feature Matching techniques
2.2 --- Global Iterative alignment techniques
2.3 --- Other Approaches
Chapter 3 --- Correspondence Matching
3.1 --- Fundamental Techniques
3.1.1 --- Geodesic Distance Approximation
3.1.1.1 --- Dijkstra's algorithm
3.1.1.2 --- Wavefront Propagation
3.1.2 --- Farthest Point Sampling
3.1.3 --- Curvature Estimation
3.1.4 --- Radial Basis Function (RBF)
3.1.5 --- Multi-dimensional Scaling (MDS)
3.1.5.1 --- Classical MDS
3.1.5.2 --- Fast MDS
3.2 --- Matching Processes
3.2.1 --- Posture Alignment
3.2.1.1 --- Sign Flip Correction
3.2.1.2 --- Input model Alignment
3.2.2 --- Surface Fitting
3.2.2.1 --- Optimizing Surface Fitness
3.2.2.2 --- Optimizing Surface Smoothness
3.2.3 --- Feature Matching Refinement
3.2.3.1 --- Feature descriptor
3.2.3.3 --- Feature Descriptor matching
Chapter 4 --- Experimental Result
4.1 --- Result of the Fundamental Techniques
4.1.1 --- Geodesic Distance Approximation
4.1.2 --- Farthest Point Sampling (FPS)
4.1.3 --- Radial Basis Function (RBF)
4.1.4 --- Curvature Estimation
4.1.5 --- Multi-Dimensional Scaling (MDS)
4.2 --- Result of the Core Matching Processes
4.2.1 --- Posture Alignment Step
4.2.2 --- Surface Fitting Step
4.2.3 --- Feature Matching Refinement
4.2.4 --- Application of the proposed algorithm
4.2.4.1 --- Design Automation in Garment Industry
4.3 --- Analysis
4.3.1 --- Performance
4.3.2 --- Accuracy
4.3.3 --- Approach Comparison
Chapter 5 --- Conclusion
5.1 --- Strength and contributions
5.2 --- Limitation and future works
References

APA, Harvard, Vancouver, ISO, and other styles

21

Huang, Liangkang, and 黃亮綱. "Visual Words With Scale-Invariant Features And Color Features For Image Description And Classification." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/67267197214332166074.

Full text

Abstract:

Master's
I-Shou University
Department of Information Engineering
100
With the rapid growth of image databases, managing a database effectively has become an important issue. Content-based image retrieval (CBIR) is a well-known technique that has been widely adopted in multimedia database applications. Image classification differs from image retrieval: an image classification system uses visual words to assign an image with undefined content to a suitable class, typically by comparing the image's features against the visual words in a dictionary. In this thesis, we first use the Scale-Invariant Feature Transform (SIFT) to extract image features. We then train the visual words, which serve as the image descriptor and the standard for comparison, by merging similar features. The trained visual words are collected into our visual dictionary. Experimental results show that our word dictionary is able to describe images effectively.
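The step of describing an image by comparing its features against a visual dictionary can be sketched as a bag-of-visual-words histogram. The Python below is a minimal sketch under stated assumptions: the dictionary is given rather than trained, the descriptors are toy 2D tuples instead of 128-dimensional SIFT vectors, and the function names are hypothetical rather than the thesis's own.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def nearest_word(descriptor, dictionary):
    """Index of the visual word closest to the descriptor."""
    return min(range(len(dictionary)),
               key=lambda i: squared_distance(descriptor, dictionary[i]))

def bag_of_words(descriptors, dictionary):
    """Normalised histogram of visual-word occurrences for one image."""
    counts = [0] * len(dictionary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, dictionary)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts

# Hypothetical toy data: three visual words, four image descriptors.
dictionary = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
descriptors = [(1.0, 1.0), (9.0, 9.0), (0.0, 1.0), (1.0, 9.0)]
histogram = bag_of_words(descriptors, dictionary)
```

Two images can then be compared, or fed to a classifier, through their histograms instead of their raw descriptor sets.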

APA, Harvard, Vancouver, ISO, and other styles

22

Huang, Ling-Hsuan, and 黃齡萱. "CBIR System with Scale-Invariant Feature Transform." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/30841024099347199111.

Full text

Abstract:

Master's
National Ilan University
Master's Program, Graduate Institute of Information Engineering
97
In recent years, with the development of multimedia systems and computer networks, the number of digital images has grown rapidly. This thesis presents a content-based image retrieval (CBIR) system that uses the Scale-Invariant Feature Transform with the assistance of an artificial neural network to improve the accuracy and efficiency of retrieval. To address the semantic gap in content-based image retrieval, the image feature analysis in this thesis uses color and texture characteristics combined with local gray-level variation to obtain keypoints. These keypoints are invariant to scale and rotation, so keypoint information can be found more easily when images differ in scale or rotation, and matching through these keypoints narrows the semantic gap and improves the accuracy of retrieval.

APA, Harvard, Vancouver, ISO, and other styles

23

Barreiros, João Carlos da Costa. "Fast Scale-Invariant Feature Transform on GPU." Master's thesis, 2020. http://hdl.handle.net/10316/93988.

Full text

Abstract:

Integrated Master's dissertation in Electrical and Computer Engineering presented to the Faculdade de Ciências e Tecnologia
Feature extraction from high-resolution images is a challenging procedure in low-power signal processing applications. This thesis describes how to optimize and efficiently parallelize the scale-invariant feature transform (SIFT) feature detection algorithm and maximize the use of bandwidth on the GPU subsystem. Together with the minimization of data communications between host and device, the successful parallelization of all the main kernels used in SIFT allowed a global speedup above 78x on high-resolution images while being more than an order of magnitude more energy efficient (FPS/W) than its serial counterpart. Of the 3 GPUs tested, the low-power GPU showed superior energy efficiency: 44 FPS/W.

APA, Harvard, Vancouver, ISO, and other styles

24

"Locally Scale-Invariant Descriptor for 2D Whole-Shape and Partial-Shape Matching." 2015. http://repository.lib.cuhk.edu.hk/en/item/cuhk-1292582.

Full text

APA, Harvard, Vancouver, ISO, and other styles

25

Chen, Shih-Min, and 陳士民. "Rotation, Translation, and Scale Invariant Bag of Feature based on Feature Density." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/14962400969645161689.

Full text

Abstract:

Master's
National Chung Cheng University
Graduate Institute of Information Engineering
103
In human vision, people can easily recognize an object in an image at any size, position, or angle, even against a complicated background. In computer vision, however, it is hard to achieve image recognition with such invariance. Spatial Pyramid Matching (SPM) performs excellently in computer vision applications, but it still has difficulty when the position of the object changes between images. In recent years, researchers have tried to find robust representations, such as translation-invariant, rotation-invariant, and scale-invariant features. Existing work, however, deals with only one of the three invariances at a time, and a robust representation that handles all three simultaneously is lacking. In our work, we aim to develop a robust feature that achieves translation, rotation, and scale invariance simultaneously. To this end, we propose a novel method named Block Based Integral Image that searches for the densest region of features, constrains the region size to be similar to a predefined region size, and then finds the approximate center of the object in the image. Then, we apply SPR with the image center replaced by the approximate object center to handle translation and rotation invariance. After that, we use histogram equalization to adjust the captured representation for scale invariance. After the adjustment, a robust representation is obtained that handles translation, rotation, and scale invariance simultaneously. Finally, we verify our system on an image classification task with different datasets. Experimental results show that our system can indeed deal with translation, rotation, and scale variation simultaneously and achieves higher accuracy than previous methods.
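The idea of using an integral image to search for the densest region of features can be sketched as follows. This is only an illustration of the general integral-image technique, not the thesis's Block Based Integral Image method: the grid of per-block feature counts, the fixed window size, and all function names are assumptions.

```python
def integral_image(grid):
    """Summed-area table with an extra zero row and column."""
    rows, cols = len(grid), len(grid[0])
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            table[r + 1][c + 1] = (grid[r][c] + table[r][c + 1]
                                   + table[r + 1][c] - table[r][c])
    return table

def window_sum(table, top, left, height, width):
    """Sum of the grid inside a window, using four table lookups."""
    bottom, right = top + height, left + width
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])

def densest_window(grid, height, width):
    """Return (count, top, left) of the window holding the most features."""
    table = integral_image(grid)
    best = None
    for top in range(len(grid) - height + 1):
        for left in range(len(grid[0]) - width + 1):
            count = window_sum(table, top, left, height, width)
            if best is None or count > best[0]:
                best = (count, top, left)
    return best

# Hypothetical feature counts per image block.
feature_counts = [
    [0, 0, 0, 0],
    [0, 1, 3, 0],
    [0, 2, 4, 0],
    [0, 0, 0, 1],
]
count, top, left = densest_window(feature_counts, 2, 2)
# Approximate object center, in block coordinates.
center = (top + 2 / 2, left + 2 / 2)
```

Because each window sum costs four lookups regardless of window size, the search over all positions stays cheap even for large windows.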

APA, Harvard, Vancouver, ISO, and other styles

26

Tsai, Ruei-Jen, and 蔡睿烝. "Accelerating Scale-Invariant Feature Transform Using Graphic Processing Units." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/60537499609581683635.

Full text

Abstract:

Master's
National Taiwan Normal University
Department of Technology Application and Human Resource Development
101
Content-based image retrieval (CBIR) is the application of computer vision techniques to searching for digital images in large databases using the actual contents of the images, such as colors, shapes, and textures, rather than metadata such as keywords, tags, and/or descriptions associated with the image. Many image processing and computer vision techniques are applied to capture image contents. Among them, the scale invariant feature transform (SIFT) has been widely adopted in many applications, such as object recognition, image stitching, and stereo correspondence, to extract and describe local features in images. In certain applications such as CBIR, feature extraction is a preprocessing step and feature matching is the most computing-intensive process. Graphic Processing Units (GPUs) have attracted a lot of attention because of their dramatic power of parallel computing on massive data. In this thesis, we propose a GPU-based SIFT that accelerates linear search and K-Nearest Neighbor (KNN) on GPUs. The proposed approach runs 22 times faster than the ordinary Nearest Neighbor (NN) search performed on CPUs, and 11 times faster than the ordinary linear search and KNN performed on CPUs.
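For reference, the CPU-side baseline that such work accelerates, a linear-search K-nearest-neighbor match over descriptors, can be sketched in a few lines of Python. The function name and the toy descriptors are illustrative assumptions; the thesis's GPU kernels are not reproduced here.

```python
import heapq

def knn_linear_search(query, database, k):
    """Return the k closest database entries as (squared_distance, index).

    Every database vector is scored independently of the others, which is
    the property that makes this search easy to spread across many threads.
    """
    scored = ((sum((q - x) ** 2 for q, x in zip(query, vector)), index)
              for index, vector in enumerate(database))
    return heapq.nsmallest(k, scored)

# Hypothetical toy descriptors (real SIFT descriptors have 128 dimensions).
database = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0), (6.0, 8.0)]
neighbors = knn_linear_search((0.0, 0.0), database, k=2)
```

With `k=1` the same routine reduces to the ordinary nearest-neighbor search mentioned in the abstract.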

APA, Harvard, Vancouver, ISO, and other styles

27

Chang, Che-wei, and 張哲維. "A scale invariant feature transform based palm vein recognition system." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/76035923255925172095.

Full text

Abstract:

Master's
National Taiwan University of Science and Technology
Department of Information Engineering
98
Biometrics is playing an increasingly important role in modern society. From flash drives, notebooks, and entrance guard systems to automatic teller machines, biometrics appears as a built-in application. Palm vein recognition is arguably a burgeoning research focus in biometrics. A palm vein image contains rich information for identification and authentication, and it provides an accurate recognition rate. With the advantage that it cannot be fabricated, it is becoming a new star of biometrics, and a rapidly growing market share can be expected. However, in our country, research on palm vein recognition is unfortunately rare. In our research, we focused on building a palm texture recognition system using the scale invariant feature transform (SIFT). SIFT transforms captured palm vein images into distinctive feature points, which can be compared and used for identifying people. The experimental results show that it is well suited to a biometric system and that its future is promising.

APA, Harvard, Vancouver, ISO, and other styles

28

Chen, Pao-Feng, and 陳寶鳳. "Detection and Recognition of Road Signs Using Scale Invariant Feature Transform." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/99439764816390051879.

Full text

Abstract:

Master's
Yuan Ze University
Graduate Institute of Information Management
93
This study describes an automatic road sign detection and recognition system by using scale invariant feature transform (SIFT). The method consists of two stages. In the detection stage, the relative position of road sign is located by using a priori knowledge, shape and specific color information. The shape feature is then used to reconstruct the road sign in the candidate region, and the road sign image is fully extracted from the original image for further recognition. In the recognition stage, distinctive invariant features are extracted from the road sign image by using SIFT to perform reliable matching. The recognition proceeds by matching individual features to a database of features from known road signs using the fast nearest-neighbor algorithm, a Hough transform for identifying clusters that agree on object pose, and finally performing verification through least-squares solution for consistent pose parameters. Experimental results demonstrate that most road signs can be correctly detected and recognized with an accuracy of 95.37%. Moreover, the extensive experiments have also shown that the proposed method is robust against the major difficulties of detecting and recognizing road signs such as image scaling and rotation, illumination change, partial occlusion, deformation, perspective distortion, and so on. The proposed approach can be very helpful for the development of Driver Support System and Intelligent Autonomous Vehicles to provide effective driving assistance.
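The Hough transform step, in which matched features vote for clusters that agree on object pose, can be sketched roughly as below. This is a simplified illustration and not the thesis's implementation: it votes only on relative scale and rotation (location bins and the least-squares verification are omitted), and the bin widths, the `min_votes` threshold, and the function names are assumptions.

```python
import math
from collections import defaultdict

def pose_votes(matches, angle_bin_deg=30.0):
    """Group matches by the (scale, rotation) change they imply.

    Each match is ((model_scale, model_angle_deg),
                   (image_scale, image_angle_deg)).
    """
    bins = defaultdict(list)
    for index, ((s_model, a_model), (s_image, a_image)) in enumerate(matches):
        scale_bin = round(math.log2(s_image / s_model))   # factor-of-2 bins
        angle_bin = int(((a_image - a_model) % 360.0) // angle_bin_deg)
        bins[(scale_bin, angle_bin)].append(index)
    return bins

def best_cluster(matches, min_votes=3):
    """Return (bin, match indices) of the largest cluster, or None."""
    bins = pose_votes(matches)
    if not bins:
        return None
    key, members = max(bins.items(), key=lambda item: len(item[1]))
    return (key, members) if len(members) >= min_votes else None

# Hypothetical matches: three agree on "2x larger, rotated about 45 degrees".
matches = [
    ((1.0, 10.0), (2.0, 55.0)),
    ((2.0, 100.0), (4.1, 140.0)),
    ((1.5, 350.0), (3.0, 38.0)),
    ((1.0, 0.0), (0.5, 200.0)),   # outlier
]
cluster = best_cluster(matches)
```

Matches outside the winning bin, such as the last one above, are treated as outliers before any pose parameters are fitted to the remaining ones.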

APA, Harvard, Vancouver, ISO, and other styles

29

IRMAWULANDARI and IRMAWULANDARI. "Image Fusion Using the Scale Invariant Feature TRansform as Image Registration." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/vk7fbc.

Full text

Abstract:

Master's
National Taipei University of Technology
Graduate Institute of Information Engineering
100
Image fusion is the process of combining two or more images into a single image that retains important features from each. Image fusion is one way to resolve the problem of unfocused images produced by non-professional camera users. Image fusion can also be used in remote sensing, robotics, and medical applications. In this thesis, a new image fusion technique for multi-focus images based on SIFT (Scale Invariant Feature Transform) is proposed. The fusion procedure is performed by matching the SIFT image features and then fusing the two images by averaging, after first decomposing them with the Discrete Wavelet Transform. Conditional sharpening is applied to improve image quality. Experimental results show that the method performs well in multi-focus image fusion.

APA, Harvard, Vancouver, ISO, and other styles

30

Jian-Wen, Chen, and 陳建文. "Dynamic Visual Tracking Using Scale-Invariant Feature Transform and Particle Filter." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/22327718769209427122.

Full text

Abstract:

Master's
National Kaohsiung University of Applied Sciences
Master's Program, Department of Electrical Engineering
95
We propose an estimation model for affine transformation that simultaneously contains translation, scaling, and rotation, using the scale-invariant feature transform (SIFT) technique in a dynamic recognition application. Under the model assumption, it can effectively draw the suitable shape of a distorted target in a cluttered environment. SIFT is an algorithm that searches for invariant features by recording the orientation information around each keypoint, and this method is insensitive to momentary changes in illumination or occlusion. For tracking applications, our proposed algorithm is based on the extended particle filter (EPF) approach, which uses prior and posterior distributions to estimate the parameters of a highly nonlinear system. To improve tracker performance, the particle filter combines the foreground-background absolute difference (FBAD) with SIFT to achieve real-time tracking and reliable recognition. Each particle represents a possible state with an associated weight from a measurable likelihood distribution. The estimation results are robust against light and shade changes, and real-time implementation is plausible.

APA, Harvard, Vancouver, ISO, and other styles

31

Yang, Tzung-Da, and 楊宗達. "Scale-Invariant Feature Transform (SIFT) Based Iris Match Technology for Identity Identification." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/52714099795239015467.

Full text

Abstract:

Master's
National Chung Hsing University
Department of Electrical Engineering
105
Biometrics has been widely applied to personal recognition and is becoming more important. Iris recognition is one of the biometric identification methods, and the technology can provide accurate personal recognition. As early as 2004, the airport in Frankfurt, Germany, began to use an iris identification system. Through iris scan identification, the iris information is linked to the passport database and the personal identity is verified. In recent years, iris identification has been used increasingly widely for personal identification. Even mobile phones have begun to use iris identification systems, and biometrics is gaining more and more attention. Traditional iris recognition technology mainly transforms the iris feature region into a square matrix using the polar coordinate method, transforms the square matrix into feature codes, and finally uses the resulting signature for feature matching. The proposed system differs from traditional iris recognition systems in two ways. First, to avoid eyelid and eyelash interference, the proposed design retrieves only the ring area near the pupil and the lower half of the iris area for recognition. Second, whereas traditional iris identification uses feature code matching, the proposed method uses image feature matching, namely the scale-invariant feature transform (SIFT) method. SIFT uses the local features of the image and keeps the features invariant to changes in rotation, scaling, and brightness. SIFT also maintains a certain degree of stability under perspective change, affine transformation, and noise. Therefore, SIFT is well suited to iris feature matching. In the proposed design, the accuracy of iris recognition is 95%. Compared with other methods that use the same database and a similar SIFT-based matching method, the recognition performance of the proposed design is suitable.

APA, Harvard, Vancouver, ISO, and other styles

32

Hsieh, Chih-Hsiung, and 謝志雄. "Planer Object Detection Using Scale Invariant Feature Transform Accompanying with Generalized Hough Transform." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/y4u6bz.

Full text

Abstract:

Master's
National Taipei University of Technology
In-service Master's Program in Electrical Engineering and Computer Science
102
In recent years, the scale-invariant feature transform (SIFT) algorithm has seen a wide range of applications, such as object detection and recognition systems, security monitoring systems, factory automation and inspection systems, and video indexing systems. Without a doubt, SIFT feature points show significant invariance and superiority under conditions such as scaling, rotation, slight perspective change, and illumination change in images. However, a certain degree of error is to be expected in feature point matching. SIFT is particularly less reliable in object detection when the textures or features of the test object are similar to or the same as those of other foreground objects. To address these matching errors, researchers have proposed methods involving the Nearest Neighbor (NN), the Hough transform (HT), and RANSAC. However, experiments demonstrate that the voting method of the Hough transform can only slightly reduce errors and fails to overcome the problems caused by multiple objects having the same features or textures. In the proposed method, SIFT features are combined with a model of reference points and edge points established with the generalized Hough transform (GHT). This allows for the detection of objects with unknown rotation changes, scale ratios, and irregular shapes. Our results show that the proposed method improves the precision of object detection in experiments and reduces computation time by over 50% compared with the original method. In addition, the method achieves good stability in relevant experiments.

APA, Harvard, Vancouver, ISO, and other styles

33

Lin, Chih-Chang, and 林志展. "Implementation of an Object Security System based on Scale Invariant Feature Transform Algorithm." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/27472606492265476126.

Full text

Abstract:

Master's
Fo Guang University
Department of Informatics
97
The use of surveillance cameras has increased significantly in the past few years. Ideally, surveillance cameras and video monitoring systems can not only help alert their users before threatening situations get worse, but also provide them with vital recorded evidence of security/safety events. However, one common shortcoming of traditional video surveillance systems is that they still need human operators to monitor the cameras and to trace security/safety events after the fact through huge amounts of video records. As more and more surveillance cameras are mounted around our society to help stop crime and protect property, there is an enormous need for software solutions and other technologies that make video surveillance systems smarter, in order to streamline and automate their on-line monitoring and evidence retrieval processes. Intelligent video analysis (also known as video analytics) is a well-known solution for making video surveillance systems smarter. Object recognition technologies in video analytics usually refer to image processing algorithms that detect and track objects of interest to look for possible security/safety threats or breaches. Recently, the Scale Invariant Feature Transform (SIFT) algorithm has been recognized as a very useful method for video analytics applications because of its effectiveness in dealing with scale, illumination, or position changes of the object of interest. In this research, a SIFT-based intelligent video surveillance system is proposed to help monitor objects (valuable property) displayed in open spaces. Once the proposed system detects abnormal or suspicious activities via video analytics, it provides a precautionary warning or records only the video of the suspicious activity. In this intelligent system, the Self Adaptive SIFT (SA-SIFT) algorithm, an improved version of the original SIFT algorithm, is also proposed; it adds a mechanism for continually updating the template of SIFT features and adjusting the region of interest. These enhancements are designed to extend the capability of the intelligent system in object recognition under motion and scene changes. The efficiency and effectiveness of the proposed intelligent object security system are demonstrated experimentally. Benchmarking against the original SIFT algorithm in the same experiments confirms that the proposed SA-SIFT algorithm is a more suitable method for helping surveillance operators monitor expensive or important objects via intelligent video surveillance.

APA, Harvard, Vancouver, ISO, and other styles

34

Hsieh, Wan-Ching, and 謝皖青. "Using Scale Invariant Feature Transform for Target Identification in High Resolution Optical Image." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/59944080644154827021.

Full text

Abstract:

Master's
National Central University
In-service Master's Program, Graduate Institute of Communication Engineering
98
With the finer resolution of satellite imagery, people can extract abundant information and develop more applications from it. Nowadays, research institutes and commercial imagery companies all over the world work intensively to develop image processing techniques. However, satellite imagery still requires correction and value-added processing before further utilization and application. Because of differences in collection time, angle, and sensor, images of the same location still differ in scale, rotation, and translation. In such cases, feature extraction is the key technique for target identification across different images. In this thesis, we use the Scale Invariant Feature Transform (SIFT) to extract features and match them in images acquired under different collection conditions. The results show that SIFT is capable of extracting stable features, and many of them are matched even when the images differ in scale and distortion.

35

Lin, Jia-Hong, and 林家弘. "Combining Scale Invariant Feature Transform with Principal Component Analysis in Face Recognition System." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/14470588190349908465.

Abstract:

Master's
National Dong Hwa University
Department of Computer Science and Information Engineering
96
Because individual identification, access control, and security appliances attract much attention, face recognition applications are increasingly popular. The main challenge in face recognition is that performance is strongly affected by variations in illumination, expression, pose, and accessories, and most algorithms proposed in recent years focus on overcoming these constraints. This paper combines Principal Component Analysis (PCA) and the Scale Invariant Feature Transform (SIFT) for face recognition. First, SIFT extracts stable feature vectors that are invariant to image scaling and rotation. Second, PCA projects the feature vectors into a new feature space as PCA-SIFT local descriptors, greatly reducing their dimension. Last, the local descriptors are clustered with the K-means algorithm, and local and global image information are combined for face recognition. In the simulation results, the PCA-SIFT local descriptor performs better than the other compared methods and is robust to variations in accessories and expression. Another advantage of the PCA-SIFT local descriptor is better computational efficiency, because PCA greatly reduces the descriptor dimension.

36

Pan, Wei-Zheng, and 潘偉正. "FPGA-Based Implementation for Scale Invariant Feature Transform (SIFT) of Image Recognition Algorithm." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/yjp76f.

Abstract:

Master's
National Taiwan Normal University
Department of Electrical Engineering
104
Image recognition requires substantial computation time in software. To solve this problem, we present a hardware implementation of the SIFT recognition algorithm that achieves real-time execution through offline software calculation of the Gaussian kernel, a mathematical derivation that computes the inverse matrix without any dividers, a parallel realization of the image pyramid, and other techniques. As a result, the system reduces the number of logic units required and significantly increases the system frequency. In addition, the CORDIC algorithm is employed to implement in hardware not only mathematical functions such as trigonometric functions and square root computation, but also an image gradient histogram. Consequently, dominant orientation detection and keypoint descriptors can be implemented using the image gradient histogram. To develop an applicable system, the first step is to apply a software and hardware co-design approach to accelerate functional modules, and subsequently to implement the entire system in pure hardware. In addition, all modules are structured as pipelines. Experimental results demonstrate that the proposed approach significantly reduces the computation time required and increases the maximum system frequency. Most importantly, the execution speed achieves real-time computation for practical applications.
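As a rough illustration of how CORDIC can yield the gradient magnitude and orientation that feed an orientation histogram, the following is a minimal sketch of vectoring-mode CORDIC. It is not the thesis's design: it uses floating point rather than fixed-point shifts, and the names `cordic_vectoring`, `ITERATIONS`, `ATAN_TABLE`, and `GAIN` are assumptions made for this example.

```python
import math

# Illustrative constants (assumed for this sketch, not taken from the thesis).
ITERATIONS = 16
ATAN_TABLE = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]

# Each micro-rotation stretches the vector; GAIN is the accumulated stretch.
GAIN = 1.0
for i in range(ITERATIONS):
    GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * i))


def cordic_vectoring(x, y):
    """Return (magnitude, angle in radians) of the vector (x, y).

    Hardware would replace the multiplications by 2**-i with bit shifts.
    """
    # Pre-rotate by 180 degrees when x < 0 so the iterations converge.
    offset = 0.0
    if x < 0:
        offset = math.pi if y >= 0 else -math.pi
        x, y = -x, -y

    angle = 0.0
    for i in range(ITERATIONS):
        shift = 2.0 ** -i
        if y > 0:
            # Rotate clockwise, driving y toward zero.
            x, y = x + y * shift, y - x * shift
            angle += ATAN_TABLE[i]
        else:
            # Rotate counterclockwise.
            x, y = x - y * shift, y + x * shift
            angle -= ATAN_TABLE[i]

    return x / GAIN, angle + offset
```

For a pixel with horizontal and vertical gradients `gx` and `gy`, `cordic_vectoring(gx, gy)` gives the magnitude and orientation from which a histogram bin could be selected.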

37

Rajeev, Namburu. "Analysis of Palmprint and Palmvein Authentication Using Scale Invariant Feature Transform(SIFT) Features." Thesis, 2017. http://ethesis.nitrkl.ac.in/8803/1/2017_MT_N_Rajeev.pdf.

Abstract:

Securing information has become a major issue, and, depending on requirements and security needs, most authentication systems have moved from passcodes and pass cards to biometric systems, in which the metrics are derived from human features. Widely used biometrics include iris, fingerprint, voice recognition, and face recognition. Other biometrics, such as the palmvein pattern, can be used to increase the security level. For this project, palmprint and palmvein patterns were selected because both metrics are extracted from the same region of the palm. By applying the Scale Invariant Feature Transform (SIFT) to palmprint and palmvein patterns, we can analyze which metric is better and how efficient authentication is under different matching techniques. The aim of the project was to analyze the performance of SIFT on palmvein and palmprint patterns to determine which is more secure: although both metrics are extracted from the same region, the palmvein pattern is harder to forge than the palmprint.

38

Xhuan, Wen-Hua, and 宣文華. "Surveillance System Design for Vehicle Tracking and VLSI Architecture Design of Feature Detection in Scale Invariant Feature Transform." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/89117569060195361137.

Abstract:

Master's
National Chung Hsing University
Department of Electrical Engineering
105
Nowadays, automatic visual systems with high-resolution video streams are increasingly common in daily life. With the great progress of computers and mobile systems, we can use these powerful tools to handle the massive computation of visual analysis and its related applications. Among visual system tasks, object tracking is one of the most basic yet complicated subjects: users always want the best balance between computation and precision, and more complex applications call for new algorithms to solve unexpected problems. In this architecture, to increase the accuracy of multi-object tracking, I use the scale invariant feature transform to establish the ID of each registered object. After matching all the features in the database against the search area, the major problem is to find robust pairs among the matched keypoints. With these keypoints, I can find a precise transformation matrix to locate the updated set of keypoints in the search area. This procedure is repeated to relocate each object in every new input frame. Because of the massive computation, part of the design must be accelerated to meet the real-time requirement. I therefore built a hardware version of SIFT feature detection to replace the software one; by exploiting the high parallelism of the detection algorithm, the hardware substantially reduces computation time and speeds up the original architecture.

39

Bastos, Rafael Afonso Chiquelho Alves. "FIRST, invariant image features for augmented reality and computer vision." Doctoral thesis, 2008. http://hdl.handle.net/10071/12002.

Abstract:

ACM Classification System: I.4.1 Digitization and Image Capture, I.4.7 Feature Measurement, I.4.8 Scene Analysis, H.5.1 Multimedia Information Systems
A variety of application areas can be attained in the fields of human-computer interaction for augmented and mixed reality, object tracking and gesture recognition. By combining the areas of 3D computer graphics, computer vision and programming, we have developed a fast, yet robust and accurate image feature detector and matcher to solve common problems that arise in the mentioned research areas. In this thesis, frequent computer vision and augmented reality problems related to camera calibration, object recognition/tracking, image stitching and gesture recognition are shown to be solved in real time using our novel feature detection and matching technique. Our method is referred to as FIRST – Feature Invariant to Rotation and Scale Transform. We have also generalized our texture tracking algorithm to a near model-based tracking method, using pre-calibrated static planar structures. Our results are compared and discussed with other state-of-the-art works in the areas of invariant feature descriptors and vision-based augmented reality, both in accuracy and performance.
In the fields of research and development related to human-machine interaction in augmented and mixed reality, object tracking and gesture recognition, a vast area of applications remains to be explored. By combining the domains of 3D computer graphics, computer vision and programming, we present an efficient yet robust and accurate method for extracting invariant features from images in order to solve common problems within these research areas. In this thesis, some common challenges in computer vision and augmented reality, such as camera calibration, object recognition and tracking, panoramic image composition and gesture recognition, are solved in real time by applying this new method for extracting and matching invariant image features. This method is referred to as FIRST – Feature Invariant to Rotation and Scale Transform. In this work, we also present a new generalization of the texture tracking algorithm in augmented reality to an approximate object tracking method based on a known three-dimensional model, through the pre-calibration of static planar structures. The results obtained are compared and discussed with other state-of-the-art works in the domains of vision-based augmented reality and image features, in terms of both accuracy and efficiency.

40

Wu, Jia-Shan, and 吳加山. "Real-time 3-D Object Recognition by Using Scale Invariant Feature Transform and Stereo Vision." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/64211211762544190810.

Abstract:

Master's
National Taiwan University of Science and Technology
Department of Mechanical Engineering
96
3-D object recognition and stereo vision are important tasks in computer vision. In this thesis, we use the Scale Invariant Feature Transform (SIFT) to search for 3-D object features and use a GPU to achieve real-time performance. Because SIFT is rotation-invariant and scale-invariant and can handle complex backgrounds, our detector can detect objects of different sizes based on their unique features. The corresponding homography is used to calculate the out-of-plane orientations. We implement the SIFT algorithm to recognize 3-D objects and use stereo vision to determine the distance from the cameras to the object. A robot arm is controlled to point to the object based on the object's orientation and depth information.
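The abstract does not state how distance is computed. A common formulation, assuming a rectified pair of identical pinhole cameras, is depth = focal length × baseline / disparity. The sketch below illustrates that relation only; the function name and parameters are hypothetical.

```python
def depth_from_disparity(focal_px, baseline_m, x_left_px, x_right_px):
    """Depth (in meters) of a point matched in a rectified stereo pair.

    focal_px    focal length in pixels (assumed equal for both cameras)
    baseline_m  distance between the two camera centers in meters
    x_left_px   column of the matched feature in the left image
    x_right_px  column of the same feature in the right image
    """
    disparity = x_left_px - x_right_px
    if disparity <= 0:
        # Zero or negative disparity means the match is invalid or the
        # point is effectively at infinity.
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity
```

With a matched SIFT keypoint at column 420 in the left image and 400 in the right, a focal length of 800 pixels, and a 0.1 m baseline, the estimated depth is about 4 m.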

41

Teng, Chtng-Yuan, and 鄧景元. "A study of using Scale Invariant Feature Transform (SIFT) algorithm for radar satellite imagery coregistration." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/84739806805595993983.

Abstract:

Master's
National Taiwan Ocean University
Department of Marine Environmental Informatics
98
Time-sequence images are collected from different orbits and at different incidence angles, so the images differ considerably in scale, position, and rotation angle. This is a problem when one tries to locate interest points in different images and match them. In addition, radar reflectance depends strongly on the local incidence angle with the terrain and on the shape of the object, which makes radar imagery harder to match. How to register radar imagery automatically has therefore become a critical issue. In this thesis, we study radar imaging geometry, radar imagery characteristics, and differences between images such as variations in scale and rotation. The Scale Invariant Feature Transform (SIFT) has been proven to match optical imagery that varies in scale, translation, and rotation. After a thorough study, we apply SIFT to radar imagery to obtain stable features automatically, without user intervention, and to avoid the influence of image shift, scale, and speckle in time-sequence images. SIFT was tested on several pairs of radar images with different resolutions and imaging angles. The results show that SIFT can locate interest points on roads and buildings in the images and match them accurately. SIFT can therefore register different radar imagery effectively and automatically.

42

Lee-YungChen and 陳李永. "Age-Variant Face Recognition Scheme Using Scale Invariant Feature Transform and the Probabilistic Neural Network." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/83926691560305817266.

Abstract:

Master's
National Cheng Kung University
Department of Electrical Engineering, In-service Master's Program
102
Improving the correct recognition rate of an automatic face recognition system under aging variation is an important issue. Most face recognition studies focus only on aging simulation or age estimation. For face recognition under age variation, it is possible to design a suitable and efficient matching framework. This thesis mainly discusses the differences caused by age using the Scale Invariant Feature Transform (SIFT) algorithm. Because SIFT is highly tolerant of noise and of changes in lighting and viewing angle, it can detect and describe local features of face images by densely sampling a local descriptor. A Probabilistic Neural Network (PNN), which makes Bayesian classification decisions, is then used, and the smoothing parameter of its probability density function is adjusted to improve the recognition success rate. Finally, the proposed age-variant face recognition scheme is applied to the FG-NET (Face and Gesture Recognition Research Network) face database, and the simulation results demonstrate that the correct recognition rate is indeed improved.
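To make the role of the smoothing parameter concrete, here is a minimal sketch of a probabilistic neural network in its textbook form: a Gaussian Parzen-window density estimate per class, followed by a Bayes decision. It assumes equal class priors and is not the thesis's implementation; `pnn_classify` and `sigma` are names chosen for this example.

```python
import math


def pnn_classify(sample, training, sigma):
    """Classify `sample` with a Gaussian-kernel PNN.

    training  dict mapping a class label to a list of feature vectors
    sigma     smoothing parameter (kernel width); smaller values make the
              decision depend more on the nearest training patterns
    Returns (best_label, scores), where scores holds each class's
    unnormalized average kernel response.
    """
    scores = {}
    for label, patterns in training.items():
        total = 0.0
        for pattern in patterns:
            sq_dist = sum((a - b) ** 2 for a, b in zip(sample, pattern))
            total += math.exp(-sq_dist / (2.0 * sigma ** 2))
        # Summation layer: average response of this class's pattern units.
        scores[label] = total / len(patterns)
    # Decision layer: pick the class with the largest estimated density.
    best_label = max(scores, key=scores.get)
    return best_label, scores
```

In a face recognition setting, each feature vector would be derived from SIFT descriptors, and `sigma` would be tuned on held-out data.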

43

FAN, SHU-DUAN, and 范恕端. "Automatic Cardiac Contour Tracking in Ultrasound Imaging Using Active Contour Model and Scale Invariant Feature Transform." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/fc878r.

Abstract:

Master's
National Chung Cheng University
Department of Information Management
102
In this study, we combined an active contour model with the scale invariant feature transform for tracking in cardiac ultrasound imaging. The conventional active contour model is ill-suited to cardiac tracking because the mitral and tricuspid valves rise and fall, which leads to poor tracking and to excessive convergence of the overall contour during systole. To remedy this deficiency, we propose adding the scale invariant feature transform to track the heart valve positions accurately, thereby preventing excessive convergence of the dynamic contour below the two heart valves. Applying this method yielded accurate segmentation and tracking results. Experiments show the segmentation results of our method, and receiver operating characteristic (ROC) curves are used to analyze the related data. Compared with two other methods, our proposed method is accurate and effective for cardiac imaging tracking.

44

Li, Jung-Lin, and 李忠霖. "Stereo Visual Navigation Based on Local Scale-Invariant Feature Transform and Its Nao Embedded System Implementation." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/17978402803821569526.

Abstract:

Master's
Yunlin University of Science and Technology
Department of Electrical Engineering, Master's Program
98
Stereo vision navigation is a fundamental capability of an intelligent robot, enabling obstacle avoidance, path planning, map building, and environmental localization. However, conventional feature detection methods cannot provide enough evenly distributed feature points to accomplish stereo vision navigation, so the intelligent robot often requires extra ultrasonic or infrared sensors for assistance. In this thesis, a Local Scale-Invariant Feature Transform (SIFT) method is proposed to obtain more, and more evenly distributed, feature points, so that accurate 3D environment modeling and a detailed stereo map can be accomplished easily. Experimental results verify that the proposed Local SIFT detects more, and more reliable, feature points. This thesis also implements simplified stereo vision navigation based on grayscale histogram segmentation on the Nao embedded robot. Implementation results show that the simplified vision navigation based on grayscale histogram analysis is simple and efficient.

45

Chen, Yu-wei, and 陳昱維. "A Geometry-Distortion Resistant Image Detection System Based on Log-Polar Transform and Scale Invariant Feature Transform." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/38443904983925420897.

Abstract:

Master's
Tatung University
Department of Computer Science and Engineering
98
Many image detection systems perform well against tampering distortions. However, geometric distortions rearrange feature positions, which often affects the results of feature comparison. The scheme presented in this thesis aims to resist geometric distortions. It consists of a feature construction phase and a comparison phase. In the feature construction phase, the scheme extracts unique features from each protected image based on the Log-Polar Transform and the Scale Invariant Feature Transform. In the comparison phase, the scheme extracts features from the suspect image and compares them with those of each protected image. This thesis also addresses similar-image identification. The scheme targets two types of similar images: images that contain similar objects, and images taken from different views. Both types pose serious problems for feature comparison, and the presented scheme is designed to solve them.

46

PRAKASH, VED. "AN ANALYTICAL APPROACH TOWARDS CONVERSION OF HUMAN SIGNED LANGUAGE TO TEXT USING MODIFIED SCALE INVARIANT FEATURE TRANSFORM (SIFT)." Thesis, 2016. http://dspace.dtu.ac.in:8080/jspui/handle/repository/14739.

Abstract:

Sign language is used as a communication medium among deaf & dumb people to convey messages to each other. A person who can talk and hear properly (a normal person) cannot communicate with a deaf & dumb person unless he or she is familiar with sign language. The same applies when a deaf & dumb person wants to communicate with a normal person or a blind person. To bridge the communication gap between the deaf & dumb community and the normal community, researchers are working to convert hand signs to voice and vice versa to help communication at both ends. A lot of research has been carried out to automate sign language interpretation with the help of image processing and pattern recognition techniques. The approaches can be broadly classified into "Data-Glove based" and "Vision-based". Vision-based approaches involve tracking the bare hand and operations to detect the hand in image frames. The main drawback of this method lies in its huge computational complexity, which is handled with the concept of the integral image. The use of the integral image for hand detection in the Viola-Jones method reduces computational complexity, but it shows satisfactory performance only in a controlled environment. To detect the hand against a cluttered background, many researchers have used color information and a histogram distribution model. A local orientation histogram technique has also been used for static gesture recognition. These algorithms perform well under controlled lighting conditions but fail under illumination changes, scaling, and rotation. To resist illumination changes, elastic graphs are applied to represent different hand gestures with local jets of Gabor filters. Adaboost for wearable computing is insensitive to camera movement and user variance. Their hand tracking is promising, but segmentation is not reliable.
Fourier descriptors of binary hand blobs have been used as the feature vector for a Radial Basis Function (RBF) classifier for pose classification, combined with HMM classifiers for gesture classification. Even though that system achieves good performance, it is not robust against multiple variations during hand movement. To overcome multiple variations such as rotation, scaling, and translation, popular techniques such as SIFT, Haar-like features with Adaboost classifiers, active learning, and appearance-based approaches are used. However, all these algorithms suffer from high time complexity. To increase the accuracy of the hand gesture recognition system, a combined feature selection approach is adopted. My thesis proposes a new approach to hand gesture recognition that recognizes sign language gestures in a real-time environment. It is a hybrid feature approach that combines the advantages of SIFT, Principal Component Analysis, and histograms, using them as a combined feature set to achieve a good recognition rate. To increase the recognition rate and make the recognition system resilient to viewpoint variations, principal component analysis is introduced. K-Nearest Neighbors (KNN[11]) is used for hybrid classification of a single signed letter. In addition, integration of a color detection method is in progress to increase accuracy further. The performance analysis of the proposed approach is presented along with the experimental results. A comparative study of these methods with other popular techniques shows that the real-time efficiency and robustness are better.
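The integral image mentioned above reduces the cost of summing pixel values over any rectangle to four table lookups, which is what makes Haar-like features cheap to evaluate in Viola-Jones style detectors. The following is a minimal sketch of the general technique, not code from the thesis; the function names are chosen for illustration.

```python
def integral_image(img):
    """Build a summed-area table with one extra row and column of zeros.

    ii[r][c] holds the sum of img over rows [0, r) and columns [0, c).
    """
    height, width = len(img), len(img[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for r in range(height):
        row_sum = 0
        for c in range(width):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of the image over rows [top, bottom) and columns [left, right)."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]
```

A two-rectangle Haar-like feature is then the difference of two `box_sum` calls, regardless of the rectangle size.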

47

Lin, Hsin-Ping, and 林鑫平. "Detection of early-stage gastric cancer in endoscopy NBI images by using scale-invariant feature transform and support vector machine." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/ky7476.

Abstract:

Master's
National Yunlin University of Science and Technology
Department of Electrical Engineering
107
In this paper, we use magnified narrow-band imaging (NBI) endoscopic images of the stomach as the data set, with 66 images in the training set and 60 in the test set. We extract scale-invariant feature transform (SIFT) features and find the abnormal regions of early gastric cancer. First, we capture the region of interest in an image and filter out the bright and dark blocks. The images are segmented into partially overlapping blocks of different sizes, such as 40×40, 50×50, and 60×60. For each block, we determine the SIFT features and then cluster these feature vectors into a bag of visual words (BOVW). Each image can therefore be represented as a histogram of visual words, which can be used as input for classifier training. In our experiments, the highest average precision and recall rates reached 85% and 81%, respectively.
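The step from clustered descriptors to a histogram can be sketched as follows. This is a minimal illustration that assumes the vocabulary (the cluster centers) has already been learned, for example by k-means; it is not the authors' code, and the short vectors in the usage note stand in for 128-dimensional SIFT descriptors.

```python
def nearest_word(descriptor, vocabulary):
    """Index of the visual word (cluster center) closest to `descriptor`."""
    best_index, best_dist = 0, float("inf")
    for index, word in enumerate(vocabulary):
        dist = sum((a - b) ** 2 for a, b in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bovw_histogram(descriptors, vocabulary):
    """Normalized histogram of visual-word occurrences for one image."""
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, vocabulary)] += 1
    total = sum(counts)
    if total == 0:
        # An image with no descriptors maps to an all-zero histogram.
        return [0.0] * len(vocabulary)
    return [count / total for count in counts]
```

For example, with a two-word vocabulary `[[0, 0], [10, 10]]` and descriptors `[[1, 0], [0, 2], [9, 9], [1, 1]]`, three descriptors fall on the first word and one on the second. The resulting fixed-length vector is what a classifier such as a support vector machine would take as input.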

48

Werkhoven, Shaun. "Improving interest point object recognition." Thesis, 2010. http://hdl.handle.net/1959.13/804109.

Abstract:

Research Doctorate - Doctor of Philosophy (PhD)
Vision is a fundamental ability for humans. It is essential to a wide range of activities. The ability to see underpins almost all tasks of our day-to-day life. It is also an ability exercised by people almost effortlessly. Yet, in spite of this, it is an ability that is still poorly understood, and it has been possible to reproduce it in machines only to a very limited degree. This work grows out of a belief that substantial progress is currently being made in understanding visual recognition processes. Advances in algorithms and computer power have recently resulted in clear and measurable progress in recognition performance. Many of the key advances in recognizing objects have related to recognition of key points or interest points. Such image primitives now underpin a wide array of tasks in computer vision, such as object recognition, structure from motion, and navigation. The object of this thesis is to find ways to improve the performance of such interest point methods. The most popular interest point methods, such as SIFT (Scale Invariant Feature Transform), consist of a descriptor, a feature detector and a standard distance metric. This thesis outlines methods whereby all of these elements can be varied to deliver higher performance in some situations. SIFT is a performance standard to which we often refer herein. Typically, the standard Euclidean distance metric is used as a distance measure with interest points. This metric fails to take account of the specific geometric nature of the information in the descriptor vector. By varying this distance measure in a way that accounts for its geometry, we show that performance improvements can be obtained. We investigate whether this can be done in an effective and computationally efficient way. Use of sparse detectors or feature points is a mainstay of current interest point methods. Yet such an approach is questionable for class recognition, since the most discriminative points may not be selected by the detector.
We therefore develop a dense interest point method, whereby interest points are calculated at every point. This requires a low dimensional descriptor to be computationally feasible. Also, we use aggressive approximate nearest neighbour methods. These dense features can be used for both point matching and class recognition, and we provide experimental results for each. These results show that it is competitive with, and in some cases superior to, traditional interest point methods. Having formed dense descriptors, we then have a multi-dimensional quantity at every point. Each of these can be regarded as a new image and descriptors can be applied to them again. Thus we have higher level descriptors – ‘descriptors upon descriptors’. Experimental results are obtained demonstrating that this provides an improvement to matching performance. Standard image databases are used for experiments. The application of these methods to several tasks, such as navigation (or structure from motion) and object class recognition is discussed.

49

Γράψα, Ιωάννα. "Ανάπτυξη τεχνικών αντιστοίχισης εικόνων με χρήση σημείων κλειδιών." Thesis, 2012. http://hdl.handle.net/10889/5500.

Abstract:

An important problem is image matching for the purpose of creating a panorama. In this work, algorithms based on the use of keypoints have been used. First, keypoints that remain unaffected by the expected distortions are found for each image with the help of the SIFT (Scale Invariant Feature Transform) algorithm. Once this process has been completed for all images, we try to find the first pair of images to be joined. To see whether two images can be joined, their keypoints are matched. Once an initial set of corresponding features has been computed, a set that will produce a high-accuracy alignment must be found. We achieve this with the RANSAC algorithm, through which we find the geometric transformation between the two images, a homography in our case. If the number of common keypoints is sufficient, that is, if the images match, they are joined. If we simply join the images, we will certainly have some problems, such as the seams between the two images being very visible. To eliminate this problem, we use the method of Laplacian pyramids. The above process is repeated until the final panorama is created, each time taking as the starting image the last image produced in the previous phase.
Stitching multiple images together to create high-resolution panoramas is one of the most popular consumer applications of image registration and blending. In this work, feature-based registration algorithms have been used. The first step is to extract from every image distinctive features that are invariant to image scale and rotation, using the SIFT (Scale Invariant Feature Transform) algorithm. After that, we try to find the first pair of images to stitch. To check whether two images can be stitched, we match their keypoints (the results from SIFT). Once an initial set of feature correspondences has been computed, we need to find the set that will produce a high-accuracy alignment. The solution to this problem is RANdom SAmple Consensus (RANSAC). Using this algorithm, we find the motion model between the two images (a homography). If there are enough corresponding points, we stitch the images. After stitching, seams are visible; the method of Laplacian pyramids is used to solve this problem. We repeat the above procedure, using the previously created panorama as the initial image.
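The RANSAC step can be illustrated with a deliberately simplified model. The sketch below estimates a pure 2-D translation between matched keypoints instead of the homography used in the thesis, so that one correspondence is enough per hypothesis and no linear algebra is needed. The function name, threshold, and iteration count are assumptions for this example.

```python
import random


def ransac_translation(matches, threshold=2.0, iterations=100, seed=0):
    """Estimate a translation (dx, dy) from noisy keypoint matches.

    matches  list of ((x1, y1), (x2, y2)) pairs from two images
    Returns ((dx, dy), inliers), or None when `matches` is empty.
    """
    if not matches:
        return None
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # Hypothesis: a translation needs only one sampled correspondence.
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1
        # Consensus set: matches that agree with the hypothesis.
        inliers = [
            (a, b) for a, b in matches
            if abs(b[0] - a[0] - dx) <= threshold
            and abs(b[1] - a[1] - dy) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit the model on all inliers of the best hypothesis.
    count = len(best_inliers)
    dx = sum(b[0] - a[0] for a, b in best_inliers) / count
    dy = sum(b[1] - a[1] for a, b in best_inliers) / count
    return (dx, dy), best_inliers
```

A homography version follows the same loop but samples four correspondences per hypothesis and measures reprojection error; the size of the consensus set is what decides whether two images overlap enough to be stitched.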

50

Rosner, Jakub. "Methods of parallelizing selected computer vision algorithms for multi-core graphics processors." Doctoral dissertation, 2015. https://repolis.bg.polsl.pl/dlibra/docmetadata?showContent=true&id=28390.
