Dissertations / Theses on the topic 'Face and Object Matching'


Consult the top 50 dissertations / theses for your research on the topic 'Face and Object Matching.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Press it, and we will automatically generate a bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.

Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.

1

Mian, Ajmal Saeed. "Representations and matching techniques for 3D free-form object and face recognition /." Connect to this title, 2006. http://theses.library.uwa.edu.au/adt-WU2007.0046.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Mian, Ajmal Saeed. "Representations and matching techniques for 3D free-form object and face recognition." University of Western Australia. School of Computer Science and Software Engineering, 2007. http://theses.library.uwa.edu.au/adt-WU2007.0046.

Full text
Abstract:
[Truncated abstract] The aim of visual recognition is to identify objects in a scene and estimate their pose. Object recognition from 2D images is sensitive to illumination, pose, clutter and occlusions. Object recognition from range data, on the other hand, does not suffer from these limitations. An important paradigm of recognition is model-based, whereby 3D models of objects are constructed offline and saved in a database using a suitable representation. During online recognition, a similar representation of a scene is matched with the database for recognizing objects present in the scene . . . The tensor representation is extended to automatic and pose-invariant 3D face recognition. As the face is a non-rigid object, expressions can significantly change its 3D shape. Therefore, the last part of this thesis investigates representations and matching techniques for automatic 3D face recognition which are robust to facial expressions. A number of novelties are proposed in this area along with their extensive experimental validation using the largest available 3D face database. These novelties include a region-based matching algorithm for 3D face recognition, a 2D and 3D multimodal hybrid face recognition algorithm, fully automatic 3D nose ridge detection, fully automatic normalization of 3D and 2D faces, a low-cost rejection classifier based on a novel Spherical Face Representation, and finally, automatic segmentation of the expression-insensitive regions of a face.
APA, Harvard, Vancouver, ISO, and other styles
3

Tewes, Andreas H. [Verfasser]. "A Flexible Object Model for Encoding and Matching Human Faces / Andreas H Tewes." Aachen : Shaker, 2006. http://d-nb.info/1170529097/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Malla, Amol Man. "Automated video-based measurement of eye closure using a remote camera for detecting drowsiness and behavioural microsleeps." Thesis, University of Canterbury. Electrical and Computer Engineering, 2008. http://hdl.handle.net/10092/2111.

Full text
Abstract:
A device capable of continuously monitoring an individual's level of alertness in real time is highly desirable for preventing drowsiness- and lapse-related accidents. This thesis presents the development of a non-intrusive and light-insensitive video-based system that uses computer-vision methods to localize the face, eyes, and eyelid positions and to measure the level of eye closure within an image, which, in turn, can be used to identify visible facial signs associated with drowsiness and behavioural microsleeps. The system was developed to be non-intrusive and light-insensitive to make it practical and end-user compliant. To monitor the subject non-intrusively without constraining their movement, the video was collected by placing a camera, a near-infrared (NIR) illumination source, and an NIR-pass optical filter at an eye-to-camera distance of 60 cm from the subject. The NIR illumination source and filter make the system insensitive to lighting conditions, allowing it to operate in both ambient light and complete darkness without visually distracting the subject. To determine the image characteristics and to quantitatively evaluate the developed methods, reference videos of nine subjects were recorded under four different lighting conditions with the subjects exhibiting several levels of eye closure, head orientations, and eye gaze. For each subject, a set of 66 frontal-face reference images was selected and manually annotated with multiple face and eye features. The eye-closure measurement system was developed using a top-down passive feature-detection approach, in which the face region of interest (fROI), eye regions of interest (eROIs), eyes, and eyelid positions were sequentially localized. The fROI was localized using an existing Haar object-detection algorithm. In addition, a Kalman filter was used to stabilize and track the fROI in the video. The left and right eROIs were localized by scaling the fROI with corresponding proportional anthropometric constants. The position of an eye within each eROI was detected by applying a template-matching method in which a pre-formed eye-template image was cross-correlated with sub-images derived from the eROI. Once the eye position was determined, the positions of the upper and lower eyelids were detected using a vertical integral projection of the eROI. The detected positions of the eyelids were then used to measure eye closure. The detection of the fROI and eROIs was very reliable for frontal-face images, which was considered sufficient for an alertness monitoring system as subjects are most likely facing straight ahead when they are drowsy or about to have a microsleep. Estimation of the y-coordinates of the eye, upper eyelid, and lower eyelid positions showed average median errors of 1.7, 1.4, and 2.1 pixels and average 90th percentile (worst-case) errors of 3.2, 2.7, and 6.9 pixels, respectively (1 pixel ≈ 1.3 mm in reference images). The average height of a fully open eye in the reference database was 14.2 pixels. The average median and 90th percentile errors of the eye and eyelid detection methods were reasonably low except for the 90th percentile error of the lower-eyelid detection method. Poor estimation of the lower eyelid was the primary limitation for accurate eye-closure measurement.
The median error of fractional eye-closure (EC) estimation (i.e., the ratio of the closed portion of an eye to its average height when fully open) was 0.15, which was sufficient to distinguish between the eyes being fully open, half closed, or fully closed. However, compounding errors in the facial-feature detection methods resulted in a 90th percentile EC estimation error of 0.42, which was too high to reliably determine the extent of eye closure. The eye-closure measurement system was relatively robust to variation in facial features except for spectacles, whose reflections can saturate much of the eye image. Therefore, in its current state, the eye-closure measurement system requires further development before it could be used with confidence for monitoring drowsiness and detecting microsleeps.
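As a rough illustration of the measures described in this abstract (not code from the thesis), the sketch below pairs a plain normalized cross-correlation search for an eye template with a fractional eye-closure estimate computed from eyelid positions; the function names, window sizes and example numbers are illustrative assumptions.

```python
import numpy as np

def ncc_search(eroi, template):
    """Slide a pre-formed eye template over an eye region of interest and
    return the (row, col) offset with the highest normalized cross-correlation."""
    th, tw = template.shape
    t = (template - template.mean()) / (template.std() + 1e-8)
    best, best_pos = -np.inf, (0, 0)
    for r in range(eroi.shape[0] - th + 1):
        for c in range(eroi.shape[1] - tw + 1):
            patch = eroi[r:r + th, c:c + tw]
            p = (patch - patch.mean()) / (patch.std() + 1e-8)
            score = float((p * t).mean())
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos

def fractional_eye_closure(upper_lid_y, lower_lid_y, open_eye_height):
    """EC = 1 - (current palpebral aperture / average fully-open height),
    clipped to [0, 1]; 0 means fully open, 1 means fully closed."""
    aperture = max(lower_lid_y - upper_lid_y, 0.0)
    return float(np.clip(1.0 - aperture / open_eye_height, 0.0, 1.0))

# Example with synthetic eyelid positions and the 14.2 px open-eye height quoted above.
print(fractional_eye_closure(upper_lid_y=52.0, lower_lid_y=59.0, open_eye_height=14.2))
```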
APA, Harvard, Vancouver, ISO, and other styles
5

Morris, Ryan L. "Hand/Face/Object." Kent State University / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=kent155655052646378.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Lennartsson, Mattias. "Object Recognition with Cluster Matching." Thesis, Linköping University, Department of Electrical Engineering, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-51494.

Full text
Abstract:

Within this thesis an algorithm for object recognition called Cluster Matching has been developed, implemented and evaluated. The image information is sampled at arbitrary sample points, instead of interest points, and local image features are extracted. These sample points are used as a compact representation of the image data and can quickly be searched for prior known objects. The algorithm is evaluated on a test set of images and the result is surprisingly reliable and time efficient.

APA, Harvard, Vancouver, ISO, and other styles
7

Havard, Catriona. "Eye movement strategies during face matching." Thesis, University of Glasgow, 2007. http://theses.gla.ac.uk/91/.

Full text
Abstract:
Although there is a large literature on face recognition, less is known about the process of face matching, i.e., deciding whether two photographs depict the same person. The research described here examines viewers’ strategies for matching faces, and addresses the issue of which parts of a face are important for this task. Consistent with previous research, several eye-tracking experiments demonstrated a bias to the eye region when looking at faces. In some studies, there was a scanning strategy whereby only one eye on each face was viewed (the left eye on the right face and the right eye on the left face). However, viewing patterns and matching performance could be influenced by manipulating the way the face pair was presented: through face inversion, changing the distance between the two faces and varying the layout. There was a strong bias to look at the face on the left first, and then to look at the face on the right. A left visual field bias for individual faces has been found in a number of previous studies, but this is the first time it has been reported using pairs of faces in a matching task. The bias to look first at the item on the left was also found when trying to match pairs of similar line drawings of objects and therefore is not specific to face stimuli. Finally, the experiments in this thesis suggest that the way face pairs are presented can influence viewers’ accuracy on a matching task, as well as the way in which these faces are viewed. This suggests that the layout of face pairs for matching might be important in real world settings, such as the attempt to identify criminals from security cameras.
APA, Harvard, Vancouver, ISO, and other styles
8

Dowsett, Andrew James. "Methods for improving unfamiliar face matching." Thesis, University of Aberdeen, 2015. http://digitool.abdn.ac.uk:80/webclient/DeliveryManager?pid=228194.

Full text
Abstract:
Matching unfamiliar faces is known to be a very difficult task. Yet, despite this, we frequently rely on this method to verify people's identity in high security situations, such as at the airport. Because of such security implications, recent research has focussed on investigating methods to improve our ability to match unfamiliar faces. This has involved methods for improving the document itself, such that photographic-ID presents a better representation of an individual, or training matchers to be better at the task. However, to date, no method has demonstrated significant improvements that would allow the technique to be put into practice in the real world. The experiments in this thesis therefore further explore methods to improve unfamiliar face matching. In the first two chapters both variability and feedback are examined to determine if these previously used techniques do produce reliable improvements. Results show that variability is only of use when training to learn a specific identity, and feedback only leads to improvements when the task is difficult. In the final chapter, collaboration is explored as a new method for improving unfamiliar face matching in general. Asking two people to perform the task together did produce consistent accuracy improvements, and importantly, also demonstrated individual training benefits. Overall, the results further demonstrate that unfamiliar face matching is difficult, and although finding methods to improve this is not straightforward, collaboration does appear to be successful and worth exploring further. The findings are discussed in relation to previous attempts at improving unfamiliar face matching, and the effect these may have on real world applications.
APA, Harvard, Vancouver, ISO, and other styles
9

Havard, Catriona. "Eye movement strategies during face matching." Thesis, University of Glasgow, 2007. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.502694.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Whitney, Hannah L. "Object agnosia and face processing." Thesis, University of Southampton, 2011. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.548326.

Full text
APA, Harvard, Vancouver, ISO, and other styles
11

Anderson, R. "Phase-based object matching using complex wavelets." Thesis, University of Cambridge, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.595514.

Full text
Abstract:
This thesis investigates the use of phase information from the Dual-Tree Complex Wavelet Transform (DT CWT) for the purpose of image matching. We review methods for image matching with a particular emphasis on matching with local features. We review the meaning and current uses of local phase, and introduce the DT CWT. We highlight the shortcomings of decimated wavelets for typical local phase analysis, and introduce two new functions that can extract useful information from the phases of decimated complex wavelets. The first, the InterLevel Product (ILP), is a pyramidal representation whose phases are equal to the phase difference between co-located, same-direction DT CWT coefficients at different scales. The second function, the Same-Level Product (SLP), has phases proportional to the phase differences between adjacent coefficients in the same level; these SLP phases are parallel to the gradient of dominant local features. We seek to represent dominant local features sparsely by clustering areas where the ILP coefficients are large and of similar phase. Three novel clustering techniques are introduced and discussed, with the Line-Growing technique shown to be the best. Line-Growing is a technique similar to the Canny edge detector, operating upon decimated coefficients and sensitive to phase symmetry changes. We call the resulting clusters Edge-Profile Clusters (EPCs). We explore three different matching techniques based upon local phase information. The first technique shows how ILP information can be combined with normalized cross-correlation (template matching) to accelerate traditional matching. We also show a hybrid ILP/SLP format, where the target is abstracted into EPC constellations that may be rotated quickly to test different match hypotheses. Finally, we show a method where EPC parameters are used to represent both images in the comparison, and a geometric hashing algorithm is combined with a cluster-overlap metric to evaluate matches in a fully affine-invariant manner. We highlight the compatibility of our EPC and ILP representations with current physiological/psychovisual observations regarding the mammalian visual cortex, including an ILP phase-based explanation of the "pop-out" effect and of perceptual grouping in "Glass patterns".
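The ILP described above rests on comparing phases of co-located complex wavelet coefficients at adjacent scales. The toy sketch below (not the thesis' DT CWT implementation) conveys only that core idea with synthetic complex subbands; the phase doubling of the coarser level is a crude stand-in for the frequency-ratio compensation used by the actual ILP.

```python
import numpy as np

def interlevel_phase(fine, coarse):
    """Toy inter-level phase comparison for complex subband coefficients.

    `fine` is a (2N x 2N) complex subband, `coarse` the co-located (N x N)
    subband one level up. The coarser coefficients are upsampled by simple
    repetition and their phase is doubled (an assumed approximation of the
    level-to-level frequency ratio) before taking the phase difference."""
    up = np.repeat(np.repeat(coarse, 2, axis=0), 2, axis=1)
    mag = np.abs(up) + 1e-12
    phase_doubled = (up / mag) ** 2 * mag          # same magnitude, doubled phase
    ilp = fine * np.conj(phase_doubled)
    return np.angle(ilp)                            # consistent phases mark dominant features

# Synthetic example: two co-located complex subbands.
rng = np.random.default_rng(0)
coarse = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
fine = rng.standard_normal((8, 8)) + 1j * rng.standard_normal((8, 8))
print(interlevel_phase(fine, coarse).shape)
```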
APA, Harvard, Vancouver, ISO, and other styles
12

McCaffery, Jennifer. "Unfamiliar face matching in the applied context." Thesis, University of York, 2016. http://etheses.whiterose.ac.uk/16130/.

Full text
Abstract:
Matching unfamiliar faces is a difficult task. Despite this, ID checks are the primary screening method for individuals wishing to access countries, employment and a range of financial and medical services. Those we might consider experts, such as passport officers, are no better at the task than the general population. Individuals with superior unfamiliar face matching have been identified, but the range of ability remains large across expert and general populations alike. Even individuals with superior face recognition skills have not consistently been found to have superior unfamiliar face matching abilities. This suggests that unfamiliar face matching ability may be highly specific. It may also suggest that the unfamiliar face matching tasks carried out in the lab are different from ID checks in the applied context. The aim of this thesis is to investigate the nature of unfamiliar face matching in the applied context and identify ways in which performance might be predicted. In Chapters 2 and 3, participants are required to match unfamiliar faces shown within a passport context and to check the validity of the accompanying biographical information. The presence of a passport context biases viewers to identify face pairs as the same, and the presence of a face pair biases responses and reduces accuracy when checking biographical information. These findings demonstrate that applied error rates in unfamiliar face matching may well have been underestimated. In Chapter 4, a battery of tasks is used to identify predictors of unfamiliar face matching ability. The results show that unfamiliar face matching is positively associated with other face identity tasks. However, same and different unfamiliar face matching also associate with more general measures of local processing and space perception. These findings are tested in Chapter 5, and the theoretical implications of these results and methods for optimising unfamiliar face matching performance are discussed.
APA, Harvard, Vancouver, ISO, and other styles
13

Salam, Hanan. "Multi-Object modelling of the face." Thesis, Supélec, 2013. http://www.theses.fr/2013SUPL0035/document.

Full text
Abstract:
The work in this thesis deals with the problem of face modelling for the purpose of facial analysis. In the first part of the thesis, we proposed the Multi-Object Facial Actions Active Appearance Model (AAM). The specificity of the proposed model is that different parts of the face are treated as separate objects and eye movements (gaze and blink) are extrinsically parameterized, which increases the generalization capabilities of the classical AAM. The second part of the thesis concerns the use of face modelling in the context of expression and emotion recognition. First, we proposed a system for the recognition of facial expressions in the form of Action Units (AUs). Our contribution concerned mainly the extraction of AAM features, for which we opted for local models. The second system concerns multi-modal recognition of four continuously valued affective dimensions. We proposed a system that fuses audio, context and visual features and gives the four emotional dimensions as output. We contribute to this system by finding a precise localization of the facial features. Accordingly, we propose the Multi-Local AAM, a model that extrinsically combines a global model of the face and a local model of the mouth through the computation of projection errors on the same global AAM.
APA, Harvard, Vancouver, ISO, and other styles
14

Kwon, Ohkyu. "Similarity measures for object matching in computer vision." Thesis, University of Bolton, 2016. http://ubir.bolton.ac.uk/890/.

Full text
Abstract:
Similarity measures for object matching and their applications have been important topics in many fields of computer vision, such as image recognition, image fusion, image analysis, video sequence matching, and so on. This critical commentary presents the efficiency of new metric methods such as the robust Hausdorff distance (RHD), the accurate M-Hausdorff distance (AMHD), and the fast sum of absolute differences (FSAD). The RHD measure computes the similarity distance of an occluded/noisy image pair and evaluates the performance of multi-modal registration algorithms. The AMHD measure is utilised for aligning pairs of occluded/noisy multi-sensor face images, and the FSAD measure, used in an adaptive template-matching method, finds the zero location of the slide in an automatic scanning microscope system. The Hausdorff distance (HD) similarity measure has been widely investigated for comparing pairs of two-dimensional (2-D) images by low-level features, since it is simple and insensitive to changes in image characteristics. In this research, novel HD measures based on the robust statistics of regression analysis are addressed for occluded and noisy object matching, resulting in two RHD measures: M-HD, based on M-estimation, and LTS-HD, based on the least trimmed squares (LTS). The M-HD is extended to a three-dimensional (3-D) version for scoring the registration algorithms of multi-modal medical images. This 3-D measure yields quantitative comparison results with different outlier-suppression parameters (OSP), even though Computed Tomography (CT) and Positron Emission Tomography (PET) images have different distinctive features. The RHD matching technique requires a high level of complexity in computing the minimum distance from one point to the nearest point between two edge point sets and in searching for the best-fit matching position. To overcome these problems, an improved 3×3 distance transform (DT) is employed. It has a separable scan structure that reduces the calculation time of the minimum distance on multi-core processors. An object matching algorithm with hierarchical structures is also demonstrated, which minimizes the computational complexity dramatically without failing to find the matching position. Object comparison between images of different modalities is still challenging due to the poor edge correspondence arising from their heterogeneous characteristics. To improve the robustness of HD measures in comparing pairs of multi-modal sensor images, an accurate M-HD (AMHD) is proposed that utilizes the orientation information of each point in addition to the DT map. This similarity measure can precisely analyse non-corresponding edges and noise by using the distance orientation information. The AMHD measure yields superior performance at aligning pairs of multi-modal face images over that achieved by conventional robust HD schemes. The sum of absolute differences (SAD) is a popular similarity measure in the template-matching technique. This thesis presents an adaptive template-matching method based on the FSAD for accurately locating the slide in an automated microscope. The adaptive template-matching method detects the fiduciary ring mark on the slide by predicting the constant used in the template, where the FSAD reduces the processing time, with a low template-matching error rate, by inducing 1-D vertical and horizontal SADs. The proposed scheme results in accurate performance in terms of detecting the ring mark and estimating the relative offset in slide alignment during the on-line calibration process.
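To make the Hausdorff-distance machinery in this abstract concrete, the sketch below computes a directed Hausdorff-type distance between two binary edge maps using a distance transform, with a trimmed mean standing in loosely for the robust (LTS-style) statistic mentioned above; it is an illustration under those assumptions, not the RHD/AMHD formulation from the thesis.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def directed_robust_hd(edges_a, edges_b, keep_fraction=0.8):
    """Directed distance from edge set A to edge set B.

    A distance transform of B gives, at every pixel, the distance to the
    nearest B edge point; reading it at A's edge pixels and averaging only
    the smallest `keep_fraction` of those values gives a trimmed (outlier
    suppressed) variant of the classical max-based Hausdorff distance."""
    dt_b = distance_transform_edt(~edges_b)        # distance to nearest edge of B
    dists = np.sort(dt_b[edges_a])                 # distances at A's edge pixels
    k = max(1, int(keep_fraction * dists.size))
    return float(dists[:k].mean())

# Tiny synthetic example: two slightly shifted square outlines.
a = np.zeros((64, 64), dtype=bool)
a[10:50, 10] = a[10:50, 49] = a[10, 10:50] = a[49, 10:50] = True
b = np.roll(a, shift=(2, 3), axis=(0, 1))
print(directed_robust_hd(a, b))
```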
APA, Harvard, Vancouver, ISO, and other styles
15

Tieu, Kinh H. (Kinh Han) 1976. "Statistical dependence estimation for object interaction and matching." Thesis, Massachusetts Institute of Technology, 2006. http://hdl.handle.net/1721.1/38316.

Full text
Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006.
Includes bibliographical references (p. 97-103).
This dissertation shows how statistical dependence estimation underlies two key problems in visual surveillance and wide-area tracking. The first problem is to detect and describe interactions between moving objects. The goal is to measure the influence objects exert on one another. The second problem is to match objects between non-overlapping cameras. There, the goal is to pair the departures in one camera with the arrivals in a different camera so that the resulting distribution of relationships best models the data. Both problems have become important for scaling up surveillance systems to larger areas and expanding the monitoring to more interesting behaviors. We show how statistical dependence estimation generalizes previous work and may have applications in other areas. The two problems represent different applications of our thesis that statistical dependence estimation underlies the learning of the structure of probabilistic models. First, we analyze the relationship between Bayesian, information-theoretic, and classical statistical methods for statistical dependence estimation. Then, we apply these ideas to formulate object interaction in terms of dependency structure model selection. We describe experiments on simulated and real interaction data to validate our approach. Second, we formulate the matching problem in terms of maximizing statistical dependence. This allows us to generalize previous work on matching, and we show improved results on simulated and real data for non-overlapping cameras. We also prove an intractability result on exact maximally dependent matching.
by Kinh Tieu.
Ph.D.
APA, Harvard, Vancouver, ISO, and other styles
16

Ko, Kwang Hee 1971. "Algorithms for three-dimensional free-form object matching." Thesis, Massachusetts Institute of Technology, 2003. http://hdl.handle.net/1721.1/29751.

Full text
Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Ocean Engineering, 2003.
Includes bibliographical references (leaves 117-126).
This thesis addresses problems of free-form object matching for the point vs. NURBS surface and the NURBS surface vs. NURBS surface cases, and its application to copyright protection. Two new methods are developed to solve a global and partial matching problem with no a priori information on correspondence or initial transformation and no scaling effects, namely the KH and the umbilic method. The KH method establishes a correspondence between two objects by utilizing the Gaussian and mean curvatures. The umbilic method uses the qualitative properties of umbilical points to find correspondence information between two objects. These two methods are extended to deal with uniform scaling effects. The umbilic method is enhanced with an algorithm for scaling factor estimation using the quantitative properties of umbilical points. The KH method is used as a building block of an optimization scheme based on the golden section search which recovers iteratively an optimum scaling factor. Since the golden section search only requires an initial interval for the scaling factor, the solution process is simplified compared to iterative optimization algorithms, which require good initial estimates of the scaling factor and the rigid body transformation. The matching algorithms are applied to problems of copyright protection. A suspect model is aligned to an original model through matching methods so that similarity between two geometric models can be assessed to determine if the suspect model contains part(s) of the original model. Three types of tests, the weak, intermediate and strong tests, are proposed for similarity assessment between two objects. The weak and intermediate tests are performed at node points obtained through shape intrinsic wireframing. The strong test relies on isolated umbilical points which can be used as fingerprints of an object for supporting an ownership claim to the original model. The three tests are organized in two decision algorithms so that they produce systematic and statistical measures for a similarity decision between two objects in a hierarchical manner. Based on the systematic statistical evaluation of similarity, a decision can be reached whether the suspect model is a copy of the original model.
by Kwang Hee Ko.
Ph.D.
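As a sketch of the golden-section step described in this abstract, the code below minimises a one-dimensional matching cost over an initial scale interval; `matching_cost` is a hypothetical stand-in for the alignment error obtained after matching the scaled model to the scene, not the thesis' actual cost.

```python
import math

def golden_section_min(f, lo, hi, tol=1e-4):
    """Minimise a unimodal 1-D function f on [lo, hi] by golden-section search;
    only an initial interval is needed, no derivative or starting guess."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:                      # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:                            # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2.0

def matching_cost(scale):
    # Hypothetical placeholder for the matching error at a given scaling factor.
    return (scale - 1.37) ** 2 + 0.05

print(golden_section_min(matching_cost, 0.5, 2.0))   # ~1.37
```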
APA, Harvard, Vancouver, ISO, and other styles
17

Ahn, Yushin. "Object space matching and reconstruction using multiple images." Columbus, Ohio : Ohio State University, 2008. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1213375997.

Full text
APA, Harvard, Vancouver, ISO, and other styles
18

Gathers, Ann D. "DEVELOPMENTAL FMRI STUDY: FACE AND OBJECT RECOGNITION." Lexington, Ky. : [University of Kentucky Libraries], 2005. http://lib.uky.edu/ETD/ukyanne2005d00276/etd.pdf.

Full text
Abstract:
Thesis (Ph. D.)--University of Kentucky, 2005.
Title from document title page (viewed on November 4, 2005). Document formatted into pages; contains xi, 152 p. : ill. Includes abstract and vita. Includes bibliographical references (p. 134-148).
APA, Harvard, Vancouver, ISO, and other styles
19

Sangi, P. (Pekka). "Object motion estimation using block matching with uncertainty analysis." Doctoral thesis, Oulun yliopisto, 2013. http://urn.fi/urn:isbn:9789526200774.

Full text
Abstract:
Estimation of 2-D motion is one of the fundamental problems in video processing and computer vision. This thesis addresses two general tasks in estimating the projected motions of background and foreground objects in a scene: global motion estimation and motion-based segmentation. The work concentrates on the study of the block matching method, and especially on those cases where the matching measure is based on the sum of squared or absolute displaced frame differences. Related techniques for performing the confidence analysis of local displacement are considered and used to improve the performance of the higher-level tasks mentioned. In general, local motion estimation techniques suffer from the aperture problem. Therefore, confidence analysis methods are needed which can complement motion estimates with information about their reliability. This work studies a particular form of confidence analysis which uses the evaluation of the match criterion for local displacement candidates. In contrast to existing approaches, the method takes into account the local image gradient. The second part of the thesis presents a four-step feature-based method for global motion estimation. For basic observations, it uses motion features which are combinations of image point coordinates, displacement estimates at those points, and representations of displacement uncertainty. A parametric form of uncertainty representation is computed exploiting the technique described in the first part of the thesis. This confidence information is used as a basis for weighting the features in motion estimation. Aspects of gradient-based feature point selection are also studied. In the experimental part, the design choices of the method are compared using both synthetic and real sequences. In the third part of the thesis, a technique for feature-based extraction of background and foreground motions is presented. The new sparse segmentation algorithm performs competitive segmentation using both the spatial and temporal propagation of support information. The weighting of features exploits parametric uncertainty information, which is experimentally shown to improve the performance of motion estimation. In the final part of the thesis, a novel framework for motion-based object detection, segmentation, and tracking is developed. It uses a block-grid-based representation for segmentation and a particle-filter-based approach to motion estimation. Analysis techniques for obtaining the segmentation are described. Finally, the approach is integrated with the sparse motion segmentation, and the combination of the methods is experimentally shown to increase both the efficiency of sampling and the accuracy of segmentation.
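The block-matching criterion discussed in this abstract can be illustrated with a minimal sum-of-absolute-differences search over a small displacement range; this is a generic sketch with assumed block and search-radius parameters, not the confidence-analysis method developed in the thesis, though the returned cost surface is the kind of input such an analysis would use.

```python
import numpy as np

def block_match_sad(prev, curr, top, left, block=8, radius=4):
    """Find the displacement of a block between two grayscale frames by
    exhaustively minimising the sum of absolute displaced frame differences
    over a (2*radius+1)^2 search window."""
    ref = prev[top:top + block, left:left + block].astype(np.float32)
    best, best_d = np.inf, (0, 0)
    costs = {}
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + block > curr.shape[0] or x + block > curr.shape[1]:
                continue
            cand = curr[y:y + block, x:x + block].astype(np.float32)
            sad = float(np.abs(ref - cand).sum())
            costs[(dy, dx)] = sad            # full cost surface, usable for confidence analysis
            if sad < best:
                best, best_d = sad, (dy, dx)
    return best_d, costs

rng = np.random.default_rng(1)
f0 = rng.integers(0, 255, (64, 64)).astype(np.uint8)
f1 = np.roll(f0, shift=(1, 2), axis=(0, 1))          # simulate a (1, 2) motion
print(block_match_sad(f0, f1, top=20, left=20)[0])
```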
APA, Harvard, Vancouver, ISO, and other styles
20

Staniaszek, Michal. "Feature-Feature Matching For Object Retrieval in Point Clouds." Thesis, KTH, Datorseende och robotik, CVAP, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-170475.

Full text
Abstract:
In this project, we implement a system for retrieving instances of objects from point clouds using feature based matching techniques. The target dataset of point clouds consists of approximately 80 full scans of office rooms over a period of one month. The raw clouds are reprocessed to remove regions which are unlikely to contain objects. Using locations determined by one of several possible interest point selection methods, one of a number of descriptors is extracted from the processed clouds. Descriptors from a target cloud are compared to those from a query object using a nearest neighbour approach. The nearest neighbours of each descriptor in the query cloud are used to vote for the position of the object in a 3D grid overlaid on the room cloud. We apply clustering in the voting space and rank the clusters according to the number of votes they contain. The centroid of each of the clusters is used to extract a region from the target cloud which, in the ideal case, corresponds to the query object. We perform an experimental evaluation of the system using various parameter settings in order to investigate factors affecting the usability of the system, and the efficacy of the system in retrieving correct objects. In the best case, we retrieve approximately 50% of the matching objects in the dataset. In the worst case, we retrieve only 10%. We find that the best approach is to use a uniform sampling over the room clouds, and to use a descriptor which factors in both colour and shape information to describe points.
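A minimal version of the voting scheme described above: query descriptors are matched to target descriptors by nearest neighbour, and each match votes for a cell of a coarse 3-D grid over the room. The descriptor dimensionality, grid resolution and variable names are illustrative assumptions, not the project's actual configuration.

```python
import numpy as np

def vote_for_object(query_desc, target_desc, target_xyz, room_min, room_max, cells=20):
    """Nearest-neighbour descriptor matching followed by voting in a 3-D grid;
    the cell with the most votes suggests where the query object sits."""
    grid = np.zeros((cells, cells, cells), dtype=int)
    cell_size = (room_max - room_min) / cells
    for q in query_desc:
        nn = np.argmin(np.linalg.norm(target_desc - q, axis=1))   # brute-force NN
        idx = np.clip(((target_xyz[nn] - room_min) / cell_size).astype(int), 0, cells - 1)
        grid[tuple(idx)] += 1
    return np.unravel_index(np.argmax(grid), grid.shape), grid

rng = np.random.default_rng(2)
target_desc = rng.standard_normal((500, 33))          # e.g. 33-D shape/colour descriptors
target_xyz = rng.uniform(0.0, 5.0, (500, 3))          # their 3-D positions in the room
query_desc = target_desc[100:120] + 0.01 * rng.standard_normal((20, 33))
best_cell, _ = vote_for_object(query_desc, target_desc, target_xyz,
                               room_min=np.zeros(3), room_max=np.full(3, 5.0))
print(best_cell)
```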
APA, Harvard, Vancouver, ISO, and other styles
21

Sim, Hak Chuah. "Invariant object matching with a modified dynamic link network." Thesis, University of Southampton, 1999. https://eprints.soton.ac.uk/256269/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
22

Krupnik, Amnon. "Multiple-patch matching in the object space for aerotriangulation /." The Ohio State University, 1994. http://rave.ohiolink.edu/etdc/view?acc_num=osu1487857546386844.

Full text
APA, Harvard, Vancouver, ISO, and other styles
23

Schellewald, Christian. "Convex Mathematical Programs for Relational Matching of Object Views." [S.l. : s.n.], 2005. http://www.bsz-bw.de/cgi-bin/xvms.cgi?SWB11947807.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

Sjahputera, Ozy. "Object registration in scene matching based on spatial relationships /." free to MU campus, to others for purchase, 2004. http://wwwlib.umi.com/cr/mo/fullcit?p3144457.

Full text
APA, Harvard, Vancouver, ISO, and other styles
25

Smith, David. "Parallel approximate string matching applied to occluded object recognition." PDXScholar, 1987. https://pdxscholar.library.pdx.edu/open_access_etds/3724.

Full text
Abstract:
This thesis develops an algorithm for approximate string matching and applies it to the problem of partially occluded object recognition. The algorithm measures the similarity of differing strings by scanning for matching substrings between strings. The length and number of matching substrings determines the amount of similarity. A classification algorithm is developed using the approximate string matching algorithm for the identification and classification of objects. A previously developed method of shape description is used for object representation.
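The abstract's idea of scoring similarity by the length and number of matching substrings can be approximated in a few lines with Python's difflib, which reports matching blocks between two sequences; this is an analogue for illustration, not the parallel algorithm developed in the thesis, and the boundary strings below are hypothetical.

```python
from difflib import SequenceMatcher

def substring_similarity(a, b):
    """Score two shape-description strings by the total length of their
    matching blocks, normalised by the longer string's length."""
    blocks = SequenceMatcher(None, a, b).get_matching_blocks()
    matched = sum(block.size for block in blocks)
    return matched / max(len(a), len(b))

def classify(boundary_string, models):
    """Pick the model whose boundary string is most similar to the observed
    (possibly occluded) object's boundary string."""
    return max(models, key=lambda name: substring_similarity(boundary_string, models[name]))

# Hypothetical chain-code-like boundary strings for two model shapes.
models = {"wrench": "001122334455667", "hammer": "007766554433221"}
print(classify("0011XX334455667", models))   # occlusion corrupted two symbols
```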
APA, Harvard, Vancouver, ISO, and other styles
26

Wong, Iok Lan. "Face detection in skin color modeling and template matching." Thesis, University of Macau, 2008. http://umaclib3.umac.mo/record=b1795653.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Fysh, Matthew. "Time pressure and human-computer interaction in face matching." Thesis, University of Kent, 2017. https://kar.kent.ac.uk/65773/.

Full text
Abstract:
Research has consistently demonstrated that the matching of unfamiliar faces is remarkably error-prone. This raises concerns surrounding the reliability of this task in operational settings, such as passport control, to verify a person's identity. A large proportion of the research investigating face matching has done so whilst employing highly optimised same-day face photographs. Conversely, such ideal conditions are unlikely to arise in realistic contexts, thus making it difficult to estimate accuracy in these settings from current research. To attempt to address this limitation, the experiments in this thesis aimed to explore performance in forensic face matching under a range of conditions that were intended to more closely approximate those at passport control. This was achieved by developing a new test of face matching - the Kent Face Matching Test (KFMT) - in which to-be-matched stimuli were photographed months apart (Chapter 2). The more challenging conditions provided by the KFMT were then utilised throughout the subsequent experiments reported, to investigate the impact of time pressure on task performance (Chapter 3), as well as the reliability of human-computer interaction at passport control (Chapter 4). The results of these experiments indicate that person identification at passport control is substantially more challenging than is currently estimated by studies that employ highly optimised face-pair stimuli. This was particularly evident on identity mismatch trials, for which accuracy deteriorated consistently within sessions, due to a match response bias that emerged over time (Chapters 2 & 3). These results are discussed within the context of passport control, and suggestions are provided for future research to further reveal why errors might arise in this task.
APA, Harvard, Vancouver, ISO, and other styles
28

Tsishkou, Dzmitry. "Face detection, matching and recognition for semantic video understanding." Ecully, Ecole centrale de Lyon, 2005. http://www.theses.fr/2005ECDL0044.

Full text
Abstract:
The objective of this work can be summarized as follows: to propose a face detection and recognition solution for video that is fast, accurate and reliable enough to be implemented in a semantic video understanding system capable of replacing a human expert in a variety of multimedia indexing applications. At the same time, we consider the research results obtained during this work to be complete enough to be adapted or modified as part of other image processing, pattern recognition, and video indexing and analysis systems.
APA, Harvard, Vancouver, ISO, and other styles
29

Oliveira, Johnatan Santos de. "Cross-domain deep face matching for banking security systems." reponame:Repositório Institucional da UnB, 2018. http://repositorio.unb.br/handle/10482/33033.

Full text
Abstract:
Dissertation (Master's)—Universidade de Brasília, Faculdade de Tecnologia, Departamento de Engenharia Elétrica, 2018.
Ensuring the security of transactions is currently one of the major challenges facing banking systems. The use of facial features for biometric authentication of users in banking systems is becoming a worldwide trend, due to the convenience and acceptability of this form of identification, and also because computers and mobile devices already have built-in cameras. This user authentication approach is attracting large investments from banking and financial institutions, especially in cross-domain scenarios, in which facial images taken from ID documents are compared with digital self-portraits (selfies) taken with mobile device cameras. In this study, we collected from the databases of the largest public Brazilian bank a large dataset, called FaceBank, with 27,002 images of selfies and ID document photos from 13,501 subjects. Then, we assessed the performance of two well-referenced Convolutional Neural Network models (VGG-Face and OpenFace) for deep face feature extraction, as well as the performance of four effective classifiers (Linear SVM, Power Mean SVM, Random Forest and Random Forest with Ensemble Vote) for robust cross-domain face authentication. Based on the results obtained (authentication accuracies higher than 90%, in general), it is possible to conclude that the deep face matching approach assessed in this study is suitable for user authentication in cross-domain banking applications. To the best of our knowledge, this is the first study that uses a large dataset composed of real banking images to assess the cross-domain face authentication approach to be used in banking systems. Additionally, this work presents a study on the real needs for the future implementation of a biometric system, proposing a cloud system to enable the adoption of biometric technologies and creating a new model of service delivery. It also proposes a secure and integrated ABIS data transmission subsystem model. All the analysis and implementation take into account full adherence to, and compatibility with, the standards and specifications proposed by the Brazilian government, while establishing mechanisms and controls to ensure the effective protection of data.
APA, Harvard, Vancouver, ISO, and other styles
30

Ahmadyfard, Alireza. "Object recognition by region matching using relaxation with relational constraints." Thesis, University of Surrey, 2003. http://epubs.surrey.ac.uk/843289/.

Full text
Abstract:
Our objective in this thesis is to develop a method for establishing an object recognition system based on the matching of image regions. A region is segmented from the image based on the colour homogeneity of pixels. The method can be applied to a number of computer vision applications such as object recognition (in general) and image retrieval. The motivation for using regions as image primitives is that they can be represented invariantly to a group of geometric transformations and that regions are stable under scaling. We model each object of interest in our database using a single frontal image. The recognition task is to determine the presence of object(s) of interest in scene images. We propose a novel method for affine invariant representation of image regions in the form of an Attributed Relational Graph (ARG). To make image regions comparable for matching, we project each region to an affine invariant space and describe it using a set of unary measurements. The distinctiveness of these features is enhanced by describing the relation between the region and its neighbours. We limit ourselves to low-order relations, binary relations, to minimise the combinatorial complexity of both feature extraction and model matching, and to maximise the probability of the features being observed. We propose two sets of binary measurements: geometric relations between pairs of regions, and the colour profile on the line connecting the centroids of regions. We demonstrate that the former measurements are very discriminative when the shape of segmented regions is informative. However, they are susceptible to distortion of region boundaries as a result of severe geometric transformations. In contrast, the colour profile binary measurements are very robust. Using this representation we construct a graph to represent the regions in the scene image and refer to it as the scene graph. Similarly, a graph containing the regions of all object models is constructed and referred to as the model graph. We consider object recognition as the problem of matching the scene graph and model graphs. We adopt the probabilistic relaxation labelling technique for our problem. The method is modified to cope better with image segmentation errors. The implemented algorithm is evaluated under affine transformation, occlusion, illumination change and cluttered scenes. Good performance for recognition, even under severe scaling and in cluttered scenes, is reported. Key words: Region Matching, Object Recognition, Relaxation Labelling, Affine Invariant.
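A bare-bones version of the probabilistic relaxation labelling update mentioned above, for a toy graph: each scene region holds a probability distribution over model labels, and the probabilities are iteratively re-weighted by the support received from neighbouring regions through pairwise compatibilities. This is the textbook update rule under assumed compatibilities, not the thesis' modified scheme for handling segmentation errors.

```python
import numpy as np

def relaxation_labelling(p, compat, neighbours, iters=10):
    """p[i, l]            : probability that scene region i takes model label l
       compat[i, j, l, m] : compatibility of label l on region i with label m on region j
       neighbours[i]      : list of regions adjacent to i
    Iteratively multiply each probability by the support from neighbours and
    renormalise (classical probabilistic relaxation labelling)."""
    n, labels = p.shape
    for _ in range(iters):
        q = np.zeros_like(p)
        for i in range(n):
            for l in range(labels):
                q[i, l] = sum(compat[i, j, l] @ p[j] for j in neighbours[i])
        p = p * q
        p = p / p.sum(axis=1, keepdims=True)
    return p

# Toy example: 3 regions, 2 candidate labels, a chain of neighbours.
rng = np.random.default_rng(3)
p0 = np.full((3, 2), 0.5)
compat = rng.uniform(0.1, 1.0, (3, 3, 2, 2))
neighbours = {0: [1], 1: [0, 2], 2: [1]}
print(relaxation_labelling(p0, compat, neighbours))
```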
APA, Harvard, Vancouver, ISO, and other styles
31

Jeong, Kideog. "OBJECT MATCHING IN DISJOINT CAMERAS USING A COLOR TRANSFER APPROACH." UKnowledge, 2007. http://uknowledge.uky.edu/gradschool_theses/434.

Full text
Abstract:
Object appearance models are a consequence of illumination, viewing direction, camera intrinsics, and other conditions that are specific to a particular camera. As a result, a model acquired in one view is often inappropriate for use in other viewpoints. In this work we treat this appearance model distortion between two non-overlapping cameras as one in which some unknown color transfer function warps a known appearance model from one view to another. We demonstrate how to recover this function in the case where the distortion function is approximated as general affine and object appearance is represented as a mixture of Gaussians. Appearance models are brought into correspondence by searching for a bijection function that best minimizes an entropic metric for model dissimilarity. These correspondences lead to a solution for the transfer function that brings the parameters of the models into alignment in the UV chromaticity plane. Finally, a set of these transfer functions acquired from a collection of object pairs are generalized to a single camera-pair-specific transfer function via robust fitting. We demonstrate the method in the context of a video surveillance network and show that recognition of subjects in disjoint views can be significantly improved using the new color transfer approach.
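As a sketch of the affine colour-transfer step described above: given corresponding Gaussian-mixture component means from the two cameras (the correspondence itself is assumed here), a general 2-D affine map in the UV chromaticity plane can be recovered by least squares. This is illustrative only; the thesis establishes the correspondences via an entropic model-dissimilarity criterion and generalises the transfer functions by robust fitting.

```python
import numpy as np

def fit_affine_color_transfer(src_uv, dst_uv):
    """Fit dst ≈ A @ src + t for 2-D chromaticity points (means of matched
    mixture components) by ordinary least squares."""
    n = src_uv.shape[0]
    design = np.hstack([src_uv, np.ones((n, 1))])       # rows are [u, v, 1]
    params, *_ = np.linalg.lstsq(design, dst_uv, rcond=None)
    A, t = params[:2].T, params[2]
    return A, t

# Synthetic matched component means under a known affine distortion.
rng = np.random.default_rng(4)
src = rng.uniform(0.2, 0.6, (8, 2))
A_true, t_true = np.array([[1.1, 0.05], [-0.02, 0.95]]), np.array([0.03, -0.01])
dst = src @ A_true.T + t_true
A_est, t_est = fit_affine_color_transfer(src, dst)
print(np.round(A_est, 3), np.round(t_est, 3))           # recovers A_true, t_true
```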
APA, Harvard, Vancouver, ISO, and other styles
32

Zhang, Jian, and 张简. "Image point matching in multiple-view object reconstruction from imagesequences." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2012. http://hub.hku.hk/bib/B48079856.

Full text
Abstract:
This thesis is concerned with three-dimensional (3D) reconstruction and point registration, which are fundamental topics of numerous applications in the area of computer vision. First, we propose the multiple epipolar lines (MEL) shape recovery method for 3D reconstruction from an image sequence captured under circular motion. This method involves recovering the 3D shape by reconstructing a set of 3D rim curves. The position of each point on a 3D rim curve is estimated by using three or more views. Two or more of these views are chosen close to each other to guarantee good image point matching, while one or more views are chosen far from these views to properly compensate for the error introduced in the triangulation scheme by the short baseline of the close views. Image point matching among all views is performed using a new method that suitably combines epipolar geometry and cross-correlation. Second, we develop the one line search (OLS) method for estimating the 3D model of an object from a sequence of images. The recovered object comprises a set of 3D rim curves. The OLS method determines the image point correspondences of each 3D point through a single line search along the ray defined by the camera center and each two-dimensional (2D) point, at the position where a photo-consistency index is maximized. With this approach, the search area is reduced to a line segment, independently of the number of views. The key advantage of the proposed method is that only one variable is involved in defining the corresponding 3D point, whereas approaches for multiple-view stereo typically exploit multiple epipolar lines and hence require multiple variables. Third, we propose the expectation conditional maximization for point registration (ECMPR) algorithm to solve the rigid point registration problem by fitting the problem into the framework of maximum likelihood with missing data. The unknown correspondences are handled via mixture models. We derive a maximization criterion based on the expected complete-data log-likelihood. The point registration problem can then be solved by an instance of the expectation conditional maximization algorithm, that is, the ECMPR algorithm. Experiments with synthetic and real data are presented in each section. The proposed approaches provide satisfactory and promising results.
Electrical and Electronic Engineering
Doctoral
Doctor of Philosophy
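To illustrate the one-line-search idea from this abstract: candidate 3-D points are sampled along the viewing ray of a reference pixel and each candidate is projected into the other views, keeping the depth whose projected colours agree most. The functions below are a sketch only; they assume grayscale images and 3x4 projection matrices are supplied, and the negative colour variance is merely one simple stand-in for a photo-consistency index.

```python
import numpy as np

def photo_consistency(point3d, cameras, images):
    """Project a 3-D point into every view, sample the grayscale value at the
    nearest pixel, and return the negative variance of the samples (higher
    means more photo-consistent)."""
    samples = []
    for P, img in zip(cameras, images):
        x = P @ np.append(point3d, 1.0)                 # homogeneous projection
        u, v = x[0] / x[2], x[1] / x[2]
        r, c = int(round(v)), int(round(u))
        if 0 <= r < img.shape[0] and 0 <= c < img.shape[1]:
            samples.append(float(img[r, c]))
    if len(samples) < 2:
        return -np.inf
    return -float(np.var(samples))

def one_line_search(center, ray_dir, depths, cameras, images):
    """Search a single variable (depth along the viewing ray) and keep the
    most photo-consistent 3-D candidate."""
    candidates = [center + d * ray_dir for d in depths]
    scores = [photo_consistency(p, cameras, images) for p in candidates]
    return candidates[int(np.argmax(scores))]
```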
APA, Harvard, Vancouver, ISO, and other styles
33

Neal, Pamela J. "Finding and matching topographic features in 3-D object meshes /." Thesis, Connect to this title online; UW restricted, 1999. http://hdl.handle.net/1773/5949.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Nilsson, Linus. "Object Tracking and Face Recognition in Video Streams." Thesis, Umeå universitet, Institutionen för datavetenskap, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-58076.

Full text
Abstract:
The goal of this project was to improve an existing face recognition system for video streams by using adaptive object tracking to track faces between frames. The knowledge of which faces do and do not occur in subsequent frames was used to filter out false faces and to better identify real ones. The recognition ability was tested by measuring how many faces were found and how many of them were correctly identified in two short video files. The tests also looked at the number of false face detections. The results were compared to a reference implementation that did not use object tracking. Two identification modes were tested: the default and strict modes. In the default mode, whichever person is most similar to a given image patch is accepted as the answer. In strict mode, the similarity also has to be above a certain threshold. The first video file had fairly high image quality. It had only frontal faces, one at a time. The second video file had slightly lower image quality. It had up to two faces at a time, in a larger variety of angles. The second video was therefore a more difficult case. The results show that the number of detected faces increased by 6-21% in the two video files, for both identification modes, compared to the reference implementation. Meanwhile, the number of false detections remained low. In the first video file, there were fewer than 0.009 false detections per frame. In the second video file, there were fewer than 0.08 false detections per frame. The number of faces that were correctly identified increased by 8-22% in the two video files in default mode. In the first video file, there was also a large improvement in strict mode, as it went from recognising 13% to 85% of all faces. In the second video file, however, neither implementation managed to identify anyone in strict mode. The conclusion is that object tracking is a good tool for improving the accuracy of face recognition in video streams. Anyone implementing face recognition for video streams should consider using object tracking as a central component.
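The filtering idea in this abstract (use frame-to-frame tracking to discard detections that do not persist) can be sketched as a simple temporal consistency check: a face is only accepted once it has been detected near the same position in several consecutive frames. The threshold and distance values below are illustrative assumptions, not those of the thesis.

```python
from dataclasses import dataclass, field

@dataclass
class Track:
    x: float
    y: float
    hits: int = 1

@dataclass
class TemporalFaceFilter:
    """Accept a detection only after it has persisted for `min_hits`
    consecutive frames within `max_dist` pixels of its previous position."""
    max_dist: float = 30.0
    min_hits: int = 3
    tracks: list = field(default_factory=list)

    def update(self, detections):
        accepted, new_tracks = [], []
        for (x, y) in detections:
            match = next((t for t in self.tracks
                          if (t.x - x) ** 2 + (t.y - y) ** 2 <= self.max_dist ** 2), None)
            hits = match.hits + 1 if match else 1
            new_tracks.append(Track(x, y, hits))
            if hits >= self.min_hits:
                accepted.append((x, y))       # stable enough to run recognition on
        self.tracks = new_tracks              # tracks with no new detection are dropped
        return accepted

f = TemporalFaceFilter()
for frame_dets in [[(100, 80)], [(102, 81)], [(103, 83), (400, 10)], [(105, 84)]]:
    print(f.update(frame_dets))               # the spurious (400, 10) detection never passes
```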
APA, Harvard, Vancouver, ISO, and other styles
35

Myers, Fiona Anne. "Face to face : sociology looks at the art object : the case of portraiture." Thesis, University of Edinburgh, 2016. http://hdl.handle.net/1842/30985.

Full text
Abstract:
The thesis emerged from two streams. First, from an interest in portraiture. Both the portrait as an art object and portraiture as a social process are mediated by power relations, yet, despite this, it is a genre that has been relatively underexplored by sociology. Second, as a response to calls within the sociology of art for approaches that, rather than maintaining a distance, seek to take the artwork “seriously” as a source of “social knowledge that is of its own worth” (Harrington, 2004, p.3). Explorations of the ‘affordances’ that material objects provoke in socially situated subjects reflect this interest in capturing the ‘in-itself-ness’ of the art object in ways which are “congruent with social constructionism” (De la Fuente, 2007). These two streams inform the thesis’ three aims, addressed through three case studies of five 20th Century portraits. First, to present portraiture and its relations of power to the sociological gaze. Second, to develop an empirical approach, characterised as ‘taking a line for a walk’, that seeks to keep in play: the material properties of the image and what these may afford to a situated viewer; to explore how these affordances operate to constitute the subjectivities of the individuals portrayed and the artists; and to consider how these processes play out in and through the processes of consecration of the object and artist within the cultural field. Third, to make a contribution to understanding how to capture sociologically the ‘in-itself-ness’ of the art object. The thesis suggests the value of an approach that keeps in focus the art object and the context of its circulation, helping to deepen understanding of the operation of the field. Second, it reveals portraiture as an exercise of power, including in the constitution of the subjectivities represented in and through the portrait. Third, it suggests the continued difficulties of empirically capturing the ‘in-itself-ness’ of the art object.
APA, Harvard, Vancouver, ISO, and other styles
36

Arashloo, Shervin Rahimzadeh. "Pose-invariant 2D face recognition by matching using graphical models." Thesis, University of Surrey, 2010. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.527013.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Geller, Felix, Robert Hirschfeld, and Gilad Bracha. "Pattern Matching for an object-oriented and dynamically typed programming language." Universität Potsdam, 2010. http://opus.kobv.de/ubp/volltexte/2010/4303/.

Full text
Abstract:
Pattern matching is a well-established concept in the functional programming community. It provides the means for concisely identifying and destructuring values of interest. This enables a clean separation of data structures and respective functionality, as well as dispatching functionality based on more than a single value. Unfortunately, expressive pattern matching facilities are seldomly incorporated in present object-oriented programming languages. We present a seamless integration of pattern matching facilities in an object-oriented and dynamically typed programming language: Newspeak. We describe language extensions to improve the practicability and integrate our additions with the existing programming environment for Newspeak. This report is based on the first author’s master’s thesis.
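The report targets Newspeak, but the flavour of pattern matching it advocates, concise destructuring of values plus dispatch on more than a single value, can be illustrated purely as an analogy in another dynamically typed, object-oriented language: Python's match statement (3.10+). The shape classes below are invented for the example and are not taken from the report.

from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float

@dataclass
class Circle:
    center: Point
    radius: float

@dataclass
class Segment:
    a: Point
    b: Point

def describe(shape):
    # Destructure the object and dispatch on its structure in one place.
    match shape:
        case Circle(center=Point(x=0, y=0), radius=r):
            return f"circle of radius {r} centred at the origin"
        case Circle(radius=r):
            return f"circle of radius {r}"
        case Segment(a=Point(x=xa), b=Point(x=xb)) if xa == xb:
            return "vertical segment"
        case Segment():
            return "segment"
        case _:
            return "unknown shape"

print(describe(Circle(Point(0, 0), 2.0)))   # circle of radius 2.0 centred at the origin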
APA, Harvard, Vancouver, ISO, and other styles
38

Jones, Michael J. (Michael Jeffrey) 1968. "Multidimensional morphable models : a framework for representing and matching object classes." Thesis, Massachusetts Institute of Technology, 1997. http://hdl.handle.net/1721.1/43399.

Full text
Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1997.
Includes bibliographical references (p. 129-133).
by Michael Jeffrey Jones.
Ph.D.
APA, Harvard, Vancouver, ISO, and other styles
39

Steliaros, Michael Konstantinos. "Motion compensation for 2D object-based video coding." Thesis, University of Warwick, 1999. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.340917.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Westerlund, Tomas. "Fast Face Finding." Thesis, Linköping University, Department of Electrical Engineering, 2004. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-2068.

Full text
Abstract:

Face detection is a classical application of object detection. There are many practical applications in which face detection is the first step: face recognition, video surveillance, image database management and video coding.

This report presents the results of an implementation of the AdaBoost algorithm to train a Strong Classifier to be used for face detection. The AdaBoost algorithm is fast and shows a low false detection rate, two characteristics which are important for face detection algorithms.

The application is an implementation of the AdaBoost algorithm with several command-line executables that support testing of the algorithm. The training and detection algorithms are separated from the rest of the application by a well defined interface to allow reuse as a software library.

The source code is documented using the JavaDoc-standard, and CppDoc is then used to produce detailed information on classes and relationships in html format.

The implemented algorithm is found to produce a relatively high detection rate and a low false alarm rate, considering the poorly suited training data used.
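For readers unfamiliar with the training loop, a minimal discrete AdaBoost over one-feature threshold stumps is sketched below. It is only an assumed illustration of the algorithm in general, not the thesis's Haar-feature implementation, and the brute-force stump search is far slower than a production detector.

import numpy as np

def train_adaboost(X, y, n_rounds=50):
    """Discrete AdaBoost with one-feature threshold stumps.
    X: (n_samples, n_features) array, y: array of labels in {-1, +1}."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)                     # sample weights
    stumps = []
    for _ in range(n_rounds):
        best = None
        for j in range(d):                      # exhaustive stump search
            for thr in np.unique(X[:, j]):
                for s in (+1, -1):
                    pred = s * np.sign(X[:, j] - thr + 1e-12)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, s)
        err, j, thr, s = best
        err = np.clip(err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)   # weak classifier weight
        pred = s * np.sign(X[:, j] - thr + 1e-12)
        w *= np.exp(-alpha * y * pred)          # re-weight: mistakes get heavier
        w /= w.sum()
        stumps.append((alpha, j, thr, s))
    return stumps

def strong_classifier(stumps, X):
    """Sign of the weighted vote of all weak classifiers."""
    score = sum(a * s * np.sign(X[:, j] - thr + 1e-12) for a, j, thr, s in stumps)
    return np.sign(score)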

APA, Harvard, Vancouver, ISO, and other styles
41

El, Sayed Abdul Rahman. "Traitement des objets 3D et images par les méthodes numériques sur graphes." Thesis, Normandie, 2018. http://www.theses.fr/2018NORMLH19/document.

Full text
Abstract:
Skin detection involves detecting the pixels corresponding to human skin in a color image. Faces constitute an important category of stimulus because of the wealth of information they convey: before recognizing any person, it is essential to locate and recognize his or her face. Most security and biometrics applications rely on the detection of skin regions, for tasks such as face detection, adult 3D object filtering and gesture recognition. In addition, saliency detection on 3D meshes is an important preprocessing phase for many computer vision applications. Segmentation of 3D objects based on salient regions has been widely used in applications such as 3D shape matching, object alignment, 3D point cloud smoothing, web image search, content-based image indexing, video segmentation, and face detection and recognition. Skin detection is a very difficult task, for reasons generally related to the variability of the shape and color to be detected (different hues from one person to another, arbitrary orientations and sizes, lighting conditions), especially for web images captured under varying lighting conditions. There are several known approaches to skin detection: approaches based on geometry and feature extraction, motion-based approaches (background subtraction, differencing of consecutive frames, optical flow computation) and color-based approaches. In this thesis, we propose numerical optimization methods for detecting skin-colored regions and salient regions on 3D meshes and 3D point clouds using a weighted graph. Based on these methods, we propose 3D face detection approaches using linear programming and data mining. We also adapt the proposed methods to the problems of 3D point cloud simplification and 3D object matching. Finally, we demonstrate the robustness, efficiency and stability of the proposed methods, including with respect to noise, through various experimental results.
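The thesis itself formulates skin and saliency detection as numerical optimisation on weighted graphs; as a far simpler point of reference (an assumed classical baseline, not the proposed method), per-pixel skin detection is often approximated with fixed chrominance bounds in the YCbCr colour space:

import numpy as np

def skin_mask_ycbcr(rgb):
    """Rough per-pixel skin mask using fixed Cb/Cr chrominance bounds.
    rgb: uint8 array of shape (H, W, 3). Returns a boolean mask."""
    rgb = rgb.astype(np.float32)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # ITU-R BT.601 conversion to YCbCr
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Commonly quoted (approximate) skin chrominance ranges
    return (cb >= 77) & (cb <= 127) & (cr >= 133) & (cr <= 173) & (y > 40)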
APA, Harvard, Vancouver, ISO, and other styles
42

Rubio, Ballester Jose C. "Many-to-Many High Order Matching. Applications to Tracking and Object Segmentation." Doctoral thesis, Universitat Autònoma de Barcelona, 2012. http://hdl.handle.net/10803/96481.

Full text
Abstract:
Feature matching is a fundamental problem in Computer Vision, with multiple applications such as tracking, image classification and retrieval, shape recognition and stereo fusion. In numerous domains, it is useful to represent the local structure of the matching features, to increase the matching accuracy or to make the correspondence invariant to certain transformations (affine, homography, etc.). However, encoding this knowledge requires complicating the model by establishing high-order relationships between the model elements, and therefore increasing the complexity of the optimization problem. The importance of many-to-many matching is sometimes dismissed in the literature. Most methods are restricted to one-to-one matching and are usually validated on synthetic or non-realistic datasets. In a real, challenging environment, with scale, pose and illumination variations of the object of interest, as well as the presence of occlusions, clutter and noisy observations, many-to-many matching is necessary to achieve satisfactory results. As a consequence, finding the most likely many-to-many correspondence often involves a challenging combinatorial optimization process. In this work, we design and demonstrate matching algorithms that compute many-to-many correspondences, applied to several challenging problems. Our goal is to use high-order representations to improve the expressive power of the matching while keeping the inference or optimization of such models feasible. We use graphical models as our preferred representation because they provide an elegant probabilistic framework for structured prediction problems. We introduce a matching-based tracking algorithm which performs matching between frames of a video sequence in order to solve the difficult problem of headlight tracking at night-time. We also generalize this algorithm to solve the problem of data association applied to various tracking scenarios. We demonstrate the effectiveness of this approach on real video sequences and show that our tracking algorithm can be used to improve the accuracy of a headlight classification system. In the second part of this work, we move from sparse (point) matching to dense (region) matching and introduce a new hierarchical image representation. We use this model to compute a high-order many-to-many matching between pairs of images. We show that the use of high-order models, in comparison to simpler models, improves not only the accuracy of the results but also the convergence speed of the inference algorithm. Finally, we keep exploiting the idea of region matching to design a fully unsupervised image co-segmentation algorithm that is able to compete with state-of-the-art supervised methods. Our method also overcomes typical drawbacks of some past works, such as the need for varied appearances in the image backgrounds. The region matching in this case is applied to effectively exploit inter-image information. We also extend this work to co-segmentation of videos, the first time that this problem has been addressed, as a way to perform video object segmentation.
APA, Harvard, Vancouver, ISO, and other styles
43

Boros, Peter. "Object Recognition: Modelling and the Interface to a Control Strategy for Matching." Doctoral thesis, Norwegian University of Science and Technology, Department of Computer and Information Science, 2007. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-1690.

Full text
Abstract:

A modelling system for object recognition and pose estimation is presented in this work, based on approximating the aspect/appearance graph of arbitrary rigid objects for a spherical viewing surface using simulated image data. The approximation is achieved by adaptively subdividing the viewing sphere, starting with an icosahedral tessellation and iteratively decreasing the patch size until the desired resolution is reached. The adaptive subdivision is controlled by both the required resolution and the object detail. The decision whether a patch should be divided is based on a similarity measure, which is obtained by applying graph matching to attributed relational graphs generated from image features.

Patches surrounded by similar views are grouped together and reference classes for the aspects are established. The reference classes are indexed by contour types encountered in the views within the group, where the contour types are computed via unsupervised clustering performed on the complete set of contours for all views of a given object.

Classification of an unknown pose is done efficiently via simple or weighted bipartite matching of the contours extracted from the unknown pose to the equivalence classes. The best suggestions are selected by a scoring scheme applied to the match results.

The modelling system is demonstrated by experimental results for a number of objects at varying levels of resolution. Pose estimation results from both synthetic and real images are also presented.
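The classification step described above, matching contours of an unknown view against reference classes by simple or weighted bipartite matching, can be sketched with SciPy's Hungarian solver. The Euclidean descriptor cost and the scoring of classes by total assignment cost are assumptions for illustration, not the thesis's exact scoring scheme.

import numpy as np
from scipy.optimize import linear_sum_assignment

def match_contours(query_desc, class_desc):
    """Weighted bipartite matching between contour descriptors.
    query_desc: (n, d) descriptors from the unknown view.
    class_desc: (m, d) descriptors of one reference class.
    Returns the matched index pairs and the total matching cost."""
    cost = np.linalg.norm(query_desc[:, None, :] - class_desc[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)        # minimum-cost assignment
    return list(zip(rows, cols)), cost[rows, cols].sum()

def score_classes(query_desc, reference_classes):
    """Rank reference classes by their bipartite matching cost (lower is better)."""
    scores = {name: match_contours(query_desc, desc)[1]
              for name, desc in reference_classes.items()}
    return sorted(scores.items(), key=lambda kv: kv[1])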

APA, Harvard, Vancouver, ISO, and other styles
44

Ta, Anh Phuong. "Inexact graph matching techniques : application to object detection and human action recognition." Lyon, INSA, 2010. http://theses.insa-lyon.fr/publication/2010ISAL0099/these.pdf.

Full text
Abstract:
Object detection and human action recognition are two active fields of research in computer vision, with applications ranging from robotics, video surveillance, medical image analysis and human-computer interaction to content-based video annotation and retrieval. Building such robust recognition systems still remains a very challenging task, because of the variations within action and object classes, the different possible viewpoints, as well as illumination changes, moving cameras, complex dynamic backgrounds and occlusions. In this thesis, we deal with object and activity recognition problems. Despite the differences in the applications' goals, the associated fundamental problems share numerous properties, for instance the necessity of handling non-rigid transformations. Describing a model object or a video by a set of local features, we formulate the recognition problem as a graph matching problem, where nodes represent local features and edges represent spatial and/or spatio-temporal relationships between them. Inexact matching of valued graphs is a well-known NP-hard problem, so we concentrate on finding approximate solutions. To this end, the graph matching problem is formulated as an energy minimization problem. Based on this energy function, we propose two different solutions for the two applications: object detection in images and activity recognition in video sequences. We also propose new features to improve the conventional bag-of-words model, which is widely used in computer vision. Experiments on both standard datasets and our own datasets demonstrate that our methods provide good results with respect to the recent state of the art in both domains.
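A toy version of the energy formulation described above, with a unary term on local descriptor distances and a pairwise term penalising distortion of spatial relations, minimised by a greedy ICM-style sweep. The particular relation measure, the weight lam and the ICM scheme are illustrative assumptions, not the approximate solvers proposed in the thesis.

import numpy as np

def matching_energy(assignment, model_desc, scene_desc, model_xy, scene_xy, lam=1.0):
    """Energy of assigning model node i to scene feature assignment[i].
    Unary: descriptor distance. Pairwise: distortion of inter-point distances."""
    a = np.asarray(assignment)
    unary = np.linalg.norm(model_desc - scene_desc[a], axis=1).sum()
    dm = np.linalg.norm(model_xy[:, None, :] - model_xy[None, :, :], axis=-1)
    ds = np.linalg.norm(scene_xy[a][:, None, :] - scene_xy[a][None, :, :], axis=-1)
    pairwise = np.abs(dm - ds).sum() / 2.0          # each pair counted once
    return unary + lam * pairwise

def icm_minimise(model_desc, scene_desc, model_xy, scene_xy, n_sweeps=10):
    """Greedy coordinate-wise (ICM-style) minimisation over candidate assignments."""
    n, m = len(model_desc), len(scene_desc)
    # initialise with the nearest descriptor for every model node
    a = np.argmin(np.linalg.norm(model_desc[:, None] - scene_desc[None], axis=-1), axis=1)
    for _ in range(n_sweeps):
        for i in range(n):
            best_j, best_e = a[i], None
            for j in range(m):                      # try every scene feature for node i
                a[i] = j
                e = matching_energy(a, model_desc, scene_desc, model_xy, scene_xy)
                if best_e is None or e < best_e:
                    best_j, best_e = j, e
            a[i] = best_j
    return a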
APA, Harvard, Vancouver, ISO, and other styles
45

Collin, Charles Alain. "Effects of spatial frequency overlap on face and object recognition." Thesis, McGill University, 2000. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=36896.

Full text
Abstract:
There has recently been much interest in how limitations in spatial frequency range affect face and object perception. This work has mainly focussed on determining which bands of frequencies are most useful for visual recognition. However, a fundamental question not yet addressed is how spatial frequency overlap (i.e., the range of spatial frequencies shared by two images) affects complex image recognition. Aside from the basic theoretical interest this question holds, it also bears on research about effects of display format (e.g., line-drawings, Mooney faces, etc.) and studies examining the nature of mnemonic representations of faces and objects. Examining the effects of spatial frequency overlap on face and object recognition is the main goal of this thesis.
A second question that is examined concerns the effect of calibration of stimuli on recognition of spatially filtered images. Past studies using non-calibrated presentation methods have inadvertently introduced aberrant frequency content to their stimuli. The effect this has on recognition performance has not been examined, leading to doubts about the comparability of older and newer studies. Examining the impact of calibration on recognition is an ancillary goal of this dissertation.
Seven experiments examining the above questions are reported here. Results suggest that spatial frequency overlap had a strong effect on face recognition and a lesser effect on object recognition. Indeed, contrary to much previous research, it was found that the band of frequencies occupied by a face image had little effect on recognition, but that small variations in overlap had significant effects. This suggests that the overlap factor is important in understanding various phenomena in visual recognition. Overlap effects likely contribute to the apparent superiority of certain spatial bands for different recognition tasks, and to the inferiority of line drawings in face recognition. Results concerning the mnemonic representation of faces and objects suggest that these are both encoded in a format that retains spatial frequency information, and do not support certain proposed fundamental differences in how these two stimulus classes are stored. Data on calibration generally show that non-calibration has little impact on visual recognition, suggesting moderate confidence in the results of older studies.
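Spatial-frequency overlap can be made concrete with a small filtering sketch: two band-pass versions of one image share only the band where their filters overlap. The log-Gaussian filter, the one-octave bandwidth and the centre frequencies below are assumptions for illustration, not the stimuli used in the experiments.

import numpy as np

def bandpass(image, centre_cpi, bandwidth_oct=1.0):
    """Band-pass filter an image with a log-Gaussian annulus in the Fourier domain.
    centre_cpi: centre frequency in cycles per image; bandwidth in octaves (FWHM)."""
    h, w = image.shape
    fy = np.fft.fftfreq(h) * h
    fx = np.fft.fftfreq(w) * w
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    radius[0, 0] = 1e-6                              # avoid log(0) at the DC term
    sigma = bandwidth_oct / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    gain = np.exp(-0.5 * ((np.log2(radius) - np.log2(centre_cpi)) / sigma) ** 2)
    return np.real(np.fft.ifft2(np.fft.fft2(image) * gain))

# Two filtered versions of one image: their spectra overlap only partly,
# which is the kind of "overlap" factor manipulated in the experiments.
img = np.random.rand(128, 128)
low_band = bandpass(img, centre_cpi=8)
high_band = bandpass(img, centre_cpi=32)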
APA, Harvard, Vancouver, ISO, and other styles
46

Tan, Cheston Y. C. (Cheston Yin-Chet). "Towards a unified account of face (and maybe object) processing." Thesis, Massachusetts Institute of Technology, 2012. http://hdl.handle.net/1721.1/73696.

Full text
Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Brain and Cognitive Sciences, 2012.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (p. 191-197).
Faces are an important class of visual stimuli, and are thought to be processed differently from objects by the human visual system. Going beyond the false dichotomy of same versus different processing, it is more important to understand how exactly faces are processed similarly or differently from objects. However, even by itself, face processing is poorly understood. Various aspects of face processing, such as holistic, configural, and face-space processing, are investigated in relative isolation, and the relationships between these are unclear. Furthermore, face processing is characteristically affected by various stimulus transformations such as inversion, contrast reversal and spatial frequency filtering, but how or why is unclear. Most importantly, we do not understand even the basic mechanisms of face processing. We hypothesize that what makes face processing distinctive is the existence of large, coarse face templates. We test our hypothesis by modifying an existing model of object processing to utilize such templates, and find that our model can account for many face-related phenomena. Using small, fine face templates as a control, we find that our model displays object-like processing characteristics instead. Overall, we believe that we may have made the first steps towards achieving a unified account of face processing. In addition, results from our control suggest that face and object processing share fundamental computational mechanisms. Coupled with recent advances in brain recording techniques, our results mean that face recognition could form the "tip of the spear" for attacking and solving the problem of visual recognition.
by Cheston Y.-C. Tan.
Ph.D.
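The "large, coarse face template" hypothesis in the abstract above can be caricatured with plain normalised cross-correlation of a whole-face template slid over an image; the exhaustive scan and the absence of any pooling hierarchy are deliberate simplifications for illustration, not the model used in the thesis.

import numpy as np

def normalised_xcorr(patch, template):
    """Normalised cross-correlation between an image patch and a template."""
    p = patch - patch.mean()
    t = template - template.mean()
    denom = np.sqrt((p ** 2).sum() * (t ** 2).sum()) + 1e-12
    return float((p * t).sum() / denom)

def template_response(image, template):
    """Best normalised-correlation response of a (coarse) template over an image."""
    ih, iw = image.shape
    th, tw = template.shape
    best = -1.0
    for y in range(ih - th + 1):            # brute-force sliding window
        for x in range(iw - tw + 1):
            best = max(best, normalised_xcorr(image[y:y+th, x:x+tw], template))
    return best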
APA, Harvard, Vancouver, ISO, and other styles
47

Lind, Anders. "High-speed View Matching using Region Descriptors." Thesis, Linköping University, Computer Vision, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-58843.

Full text
Abstract:

This thesis treats topics within the area of object recognition. A real-time view matching method has been developed to compute the transformation between two different images of the same scene. This method uses a color-based region detector called MSCR and affine transformations of these regions to create affine-invariant patches that are used as input to the SIFT algorithm. A parallel method to compute the SIFT descriptor has been created with relaxed constraints, so that the descriptor size and the number of histogram bins can be adjusted. Additionally, a matching step to deduce correspondences and a parallel RANSAC method have been created to estimate the transformation underlying the matched descriptors. To achieve real-time performance, the implementation targets the parallel nature of the GPU, with CUDA as the programming language. Focus has been put on the architecture of the GPU to find the best way to parallelize the different processing steps. CUDA has also been combined with OpenGL to use the hardware-accelerated anisotropic sampling method for affine transformations of regions. Parts of the implementation can also be used individually, either from Matlab or directly through the provided C++ library. The method was also evaluated in terms of accuracy and speed. Our algorithm has similar or better accuracy at finding correspondences than SIFT when the 3D geometry changes are large, but gives slightly worse results on images with flat surfaces.
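A CPU-side sketch of a comparable pipeline (detect, describe with SIFT, match with a ratio test, estimate the transformation with RANSAC), strung together from OpenCV. It assumes a homography model and OpenCV's SIFT detector where the thesis uses MSCR regions, its own descriptor variant and GPU kernels, so it is a point of comparison rather than the method described above.

import cv2
import numpy as np

def match_views(path_a, path_b, ratio=0.75):
    """Estimate the homography between two views with SIFT + ratio test + RANSAC."""
    img_a = cv2.imread(path_a, cv2.IMREAD_GRAYSCALE)
    img_b = cv2.imread(path_b, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    kp_a, desc_a = sift.detectAndCompute(img_a, None)
    kp_b, desc_b = sift.detectAndCompute(img_b, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    # Lowe's ratio test keeps only distinctive correspondences
    good = [m for m, n in matcher.knnMatch(desc_a, desc_b, k=2)
            if m.distance < ratio * n.distance]
    src = np.float32([kp_a[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_b[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H, int(inliers.sum()) if inliers is not None else 0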

APA, Harvard, Vancouver, ISO, and other styles
48

Banarse, D. S. "A generic neural network architecture for deformation invariant object recognition." Thesis, Bangor University, 1997. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.362146.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

Smith, H. M. J. "Matching novel face and voice identity using static and dynamic facial images." Thesis, Nottingham Trent University, 2016. http://irep.ntu.ac.uk/id/eprint/29001/.

Full text
Abstract:
Research suggests that both static and dynamic faces share identity information with voices. However, face-voice matching studies offer contradictory results. Accurate face-voice matching is consistently above chance when facial stimuli are dynamic, but not when facial stimuli are static. This thesis aims to account for previous inconsistencies, comparing accuracy across a variety of two-alternative forced-choice (2AFC) procedures to isolate the features that support accuracy. In addition, the thesis provides a clearer and more complete picture of face-voice matching ability than that available in the existing literature. Same-different procedures are used to address original research questions relating to response bias and the delay between face and voice presentation. The overall findings indicate that faces and voices offer concordant source identity information. When faces and voices are presented close together in time, matching accuracy is consistently above chance level using both dynamic and static facial stimuli. Previous contradictory findings across studies can be accounted for by procedural differences and the characteristics of specific stimulus sets. Multilevel modelling analyses show that some people look and sound more similar than others. The results also indicate that when there is only a short (~1 second) interval between faces and voices, people exhibit a bias to assume that they belong to the same person. The findings presented in this thesis have theoretical and applied relevance. They highlight the value of considering person perception from a multimodal point of view, and are consistent with evidence for the existence of early perceptual integrative mechanisms between face and voice processing pathways. The results also offer insights into how people successfully navigate complex social situations featuring a number of novel speakers.
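The response-bias analysis rests on standard signal detection measures; a minimal sketch, assuming a same-different design in which hits are correct "same" responses and false alarms are "same" responses to different-identity pairs (a negative criterion then reflects the reported bias toward "same"). The counts in the usage line are invented.

from statistics import NormalDist

def dprime_and_bias(hits, misses, false_alarms, correct_rejections):
    """Sensitivity (d') and criterion (c) from same-different response counts."""
    z = NormalDist().inv_cdf
    # log-linear correction so that rates of exactly 0 or 1 stay finite
    hr = (hits + 0.5) / (hits + misses + 1.0)
    far = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    d_prime = z(hr) - z(far)
    criterion = -0.5 * (z(hr) + z(far))   # negative c = bias toward responding "same"
    return d_prime, criterion

# e.g. a participant who says "same" too readily:
print(dprime_and_bias(hits=45, misses=5, false_alarms=20, correct_rejections=30))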
APA, Harvard, Vancouver, ISO, and other styles
50

Breuel, Thomas M. "Geometric Aspects of Visual Object Recognition." Thesis, Massachusetts Institute of Technology, 1992. http://hdl.handle.net/1721.1/7342.

Full text
Abstract:
This thesis presents three important results in visual object recognition based on shape. (1) A new algorithm (RAST: Recognition by Adaptive Subdivisions of Transformation space) is presented that has lower average-case complexity than any known recognition algorithm. (2) It is shown, both theoretically and empirically, that representing 3D objects as collections of 2D views (the "View-Based Approximation") is feasible and affects the reliability of 3D recognition systems no more than other commonly made approximations. (3) The problem of recognition in cluttered scenes is considered from a Bayesian perspective; the commonly used "bounded-error" error measure is demonstrated to correspond to an independence assumption. It is shown that by better modeling the statistical properties of real scenes, objects can be recognized more reliably.
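The RAST idea, adaptively subdividing transformation space while bounding, within each cell, how many model points could still match under a bounded-error measure, can be sketched for pure 2D translation. The error bound, the subdivision limit and the way counts are accepted for small cells are simplifications for illustration, not the algorithm as published.

import heapq
import numpy as np

def rast_translation(model, scene, eps=2.0, x_range=(-100, 100), y_range=(-100, 100)):
    """Best 2D translation under a bounded-error match count, by branch and bound.
    model, scene: (n, 2) and (m, 2) arrays of points."""
    def upper_bound(box):
        # A model point can still match somewhere in this box if shifting it by the
        # box centre brings it within eps plus half the box diagonal of a scene point
        # (an optimistic bound valid for every translation inside the box).
        (x0, x1), (y0, y1) = box
        centre = np.array([(x0 + x1) / 2.0, (y0 + y1) / 2.0])
        slack = eps + 0.5 * np.hypot(x1 - x0, y1 - y0)
        shifted = model + centre
        d = np.linalg.norm(shifted[:, None, :] - scene[None, :, :], axis=-1)
        return int((d.min(axis=1) <= slack).sum()), centre

    best_count, best_t = -1, None
    heap = [(-len(model), (x_range, y_range))]          # max-heap on the bound
    while heap:
        neg_bound, box = heapq.heappop(heap)
        if -neg_bound <= best_count:
            break                                       # nothing better remains
        bound, centre = upper_bound(box)
        (x0, x1), (y0, y1) = box
        if max(x1 - x0, y1 - y0) < 0.5:                 # cell small enough: accept
            if bound > best_count:
                best_count, best_t = bound, centre
            continue
        # subdivide the larger dimension and push the two halves with their bounds
        if x1 - x0 >= y1 - y0:
            xm = (x0 + x1) / 2.0
            children = [((x0, xm), (y0, y1)), ((xm, x1), (y0, y1))]
        else:
            ym = (y0 + y1) / 2.0
            children = [((x0, x1), (y0, ym)), ((x0, x1), (ym, y1))]
        for child in children:
            b, _ = upper_bound(child)
            heapq.heappush(heap, (-b, child))
    return best_t, best_count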
APA, Harvard, Vancouver, ISO, and other styles