Dissertations / Theses on the topic 'Detection and recognition'

To see the other types of publications on this topic, follow the link: Detection and recognition.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Select a source type:

Consult the top 50 dissertations / theses for your research on the topic 'Detection and recognition.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Press it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

O'Shea, Kieran. "Roadsign detection & recognition /." Leeds : University of Leeds, School of Computer Studies, 2008. http://www.comp.leeds.ac.uk/fyproj/reports/0708/OShea.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Bashir, Sulaimon A. "Change detection for activity recognition." Thesis, Robert Gordon University, 2017. http://hdl.handle.net/10059/3104.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Activity recognition is concerned with identifying the physical state of a user at a particular point in time. The activity recognition task requires training a classification algorithm on processed sensor data from a representative population of users. The accuracy of the generated model often degrades when classifying new instances, due to non-stationary sensor data and variations in user characteristics. Thus, there is a need to adapt the classification model to new user characteristics. However, the existing approaches to model adaptation in activity recognition are blind: they continuously adapt a classification model at a regular interval without specifically and precisely detecting the indicators of the model's degrading performance. This approach can waste the system resources dedicated to continuous adaptation. This thesis addresses the problem of detecting changes in the accuracy of an activity recognition model. The thesis develops a classifier for activity recognition. The classifier uses three statistical summaries that can be generated from any dataset for similarity-based classification of new samples. A weighted ensemble combination of the classification decisions from each statistical summary results in better performance than three existing benchmarked classification algorithms. The thesis also presents change detection approaches that can detect changes in the accuracy of the underlying recognition model without access to the ground-truth label of each activity being recognised. The first approach, called 'UDetect', computes change statistics from a window of classified data and employs a statistical process control method to detect variations between the classified data and the reference data of a class. Evaluation of the approach indicates consistent detection that correlates with the error rate of the model.
The second approach is a distance-based change detection technique that relies on the developed statistical summaries to compare newly classified samples and detect any drift from the original class of the activity. The implemented approach uses a distance function and a threshold parameter to detect accuracy changes in the classifier as it classifies new instances. Evaluation of the approach yields above 90% detection accuracy. Finally, a layered framework for activity recognition is proposed that uses the techniques developed in this thesis to make model adaptation in activity recognition informed.
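The statistical process control idea behind a 'UDetect'-style detector can be sketched in a few lines. This is a minimal illustration only: the per-window change statistics, the 3-sigma control limits and all names below are hypothetical stand-ins, not the thesis's actual parameters.

```python
import statistics

def control_limits(reference, k=3.0):
    """SPC control limits (mean +/- k*sigma) estimated from the change
    statistic of reference windows, gathered while the recognition
    model is known to be accurate."""
    mu = statistics.mean(reference)
    sigma = statistics.stdev(reference)
    return mu - k * sigma, mu + k * sigma

def detect_change(window_stats, limits):
    """Flag every window whose change statistic leaves the control
    limits, signalling that the model's accuracy may have degraded."""
    lo, hi = limits
    return [not (lo <= s <= hi) for s in window_stats]

# Hypothetical per-window change statistics from correctly classified data
reference = [0.10, 0.12, 0.11, 0.09, 0.10, 0.11, 0.12, 0.10]
limits = control_limits(reference)

# The last window drifts well outside the limits and is flagged
print(detect_change([0.11, 0.10, 0.35], limits))  # [False, False, True]
```

No ground-truth labels are consulted: only the distribution of the monitored statistic relative to its reference behaviour, which is the point of the approach.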
3

Sandström, Marie. "Liveness Detection in Fingerprint Recognition Systems." Thesis, Linköping University, Department of Electrical Engineering, 2004. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-2397.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:

Biometrics deals with identifying individuals with the help of their biological data. Fingerprint scanning is the most common of the biometric methods available today. The security of fingerprint scanners has, however, been questioned, and previous studies have shown that fingerprint scanners can be fooled with artificial fingerprints, i.e. copies of real fingerprints. Fingerprint recognition systems are evolving, and this study discusses the situation today.

Two approaches have been used to find out how good fingerprint recognition systems are in distinguishing between live fingers and artificial clones. The first approach is a literature study, while the second consists of experiments.

A literature study of liveness detection in fingerprint recognition systems has been performed. A description of different liveness detection methods is presented and discussed. Methods requiring extra hardware use temperature, pulse, blood pressure, electric resistance, etc., and methods using already existent information in the system use skin deformation, pores, perspiration, etc.

The experiments focus on making artificial fingerprints in gelatin from a latent fingerprint. Nine different systems were tested at the CeBIT trade fair in Germany, and all were deceived. Three other systems were subjected to more extensive tests with three different subjects. All systems were circumvented with all subjects' artificial fingerprints, but with varying results. The results are analyzed and discussed, partly with the help of the A/R value defined in this report.

4

Khan, Muhammad. "Hand Gesture Detection & Recognition System." Thesis, Högskolan Dalarna, Datateknik, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:du-6496.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
The project introduces an application using computer vision for hand gesture recognition. A camera records a live video stream, from which a snapshot is taken with the help of an interface. The system is trained for each type of count hand gesture (one, two, three, four, and five) at least once. After that, a test gesture is given to it and the system tries to recognize it. Research was carried out on a number of algorithms that could best differentiate hand gestures, and it was found that the diagonal sum algorithm gave the highest accuracy rate. In the preprocessing phase, a self-developed algorithm removes the background of each training gesture. The image is then converted into a binary image, and the sums of all diagonal elements of the picture are taken. These sums help in differentiating and classifying different hand gestures. Previous systems have used data gloves or markers for input; this system has no such constraints, and the user can make hand gestures in view of the camera naturally. A completely robust hand gesture recognition system is still under heavy research and development; the implemented system serves as an extendible foundation for future work.
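The diagonal-sum feature described above can be illustrated with a minimal sketch: sum the pixels along every diagonal of the binary image and match the resulting vector against per-gesture templates. The toy 4x4 'gestures', template set and nearest-neighbour matching here are assumptions for illustration, not the thesis's implementation.

```python
import numpy as np

def diagonal_sums(binary_img):
    """Feature vector: the sum of pixels along every diagonal of the
    binary image, for offsets from -(rows-1) up to cols-1."""
    rows, cols = binary_img.shape
    return np.array([binary_img.trace(offset=k)
                     for k in range(-(rows - 1), cols)])

def classify(feature, templates):
    """Nearest-neighbour match against per-gesture template features."""
    return min(templates, key=lambda g: np.linalg.norm(feature - templates[g]))

# Toy binary images standing in for segmented hand gestures
one = np.eye(4, dtype=int)            # mass on the main diagonal
two = np.fliplr(np.eye(4, dtype=int)) # mass on the anti-diagonal
templates = {"one": diagonal_sums(one), "two": diagonal_sums(two)}

print(classify(diagonal_sums(one), templates))  # one
```

The two toy images contain the same number of foreground pixels, yet their diagonal-sum vectors differ completely, which is what makes the feature discriminative.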
5

Zakir, Usman. "Automatic road sign detection and recognition." Thesis, Loughborough University, 2011. https://dspace.lboro.ac.uk/2134/9733.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Road Sign Detection and Recognition (RSDR) systems provide an additional level of driver assistance, leading to improved safety for passengers, road users and vehicles. As part of Advanced Driving Assistance Systems (ADAS), RSDR can benefit drivers (especially those with driving disabilities) by alerting them to the presence of road signs, reducing risk in situations of driving distraction, fatigue, poor sight and adverse weather conditions. Although a number of RSDR systems have been proposed in the literature, the design of a robust algorithm remains an open research problem. This thesis aims to resolve some of the outstanding research challenges in RSDR while considering variations in colour illumination, scale, rotation, translation, occlusion, computational complexity and functional limitations. The RSDR pipeline is divided into three parts: Colour Segmentation, Shape Classification and Content Recognition. This thesis presents each part as a separate chapter, except for Colour Segmentation, which introduces two distinct approaches for road sign region of interest (ROI) selection. The first approach in Colour Segmentation presents a detailed investigation of computer-based colour spaces, i.e. YCbCr, YIQ, RGB, CIElab, CYMK and HSV, whereas the second approach presents the development and use of an illumination-invariant Combined Colour Model (CCM) on gamma-corrected images containing road signs under varying illumination conditions. Shape Classification of the road sign forms the second part of the RSDR pipeline, consisting of shape feature extraction and shape feature classification stages. Shape features of road signs are extracted by introducing Contourlet Transforms at decomposition level 3 with Haar filters for generating the Laplacian Pyramid (LP) and Directional Filter Bank (DFB).
The third part of the RSDR system presented in this thesis is Content Recognition, which is carried out by extracting LESH (Local Energy based Shape Histogram) features from the normalized road sign contents. The extracted shape and content features are used to train Support Vector Machines (SVM) with polynomial kernels, which later classify the input candidate road sign shapes and contents respectively. The thesis further highlights possible extensions and improvements to the proposed approaches for RSDR.
6

Park, Chi-youn 1981. "Consonant landmark detection for speech recognition." Thesis, Massachusetts Institute of Technology, 2008. http://hdl.handle.net/1721.1/44905.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Includes bibliographical references (p. 191-197).
This thesis focuses on the detection of abrupt acoustic discontinuities in the speech signal, which constitute landmarks for consonant sounds. Because a large amount of phonetic information is concentrated near acoustic discontinuities, more focused speech analysis and recognition can be performed based on the landmarks. Three types of consonant landmarks are defined according to their characteristics -- glottal vibration, turbulence noise, and sonorant consonant -- so that the appropriate analysis method for each landmark point can be determined. A probabilistic knowledge-based algorithm is developed in three steps. First, landmark candidates are detected and their landmark types are classified based on changes in spectral amplitude. Next, a bigram model describing the physiologically feasible sequences of consonant landmarks is proposed, so that the most likely landmark sequence among the candidates can be found. Finally, it has been observed that certain landmarks are ambiguous in certain sets of phonetic and prosodic contexts, while they can be reliably detected in other contexts. A method to represent the regions where the landmarks are reliably detected versus where they are ambiguous is presented. On the TIMIT test set, 91% of all consonant landmarks and 95% of obstruent landmarks are located as landmark candidates. The bigram-based process for determining the most likely landmark sequences yields 12% deletion and substitution rates and a 15% insertion rate. An alternative representation that distinguishes reliable and ambiguous regions can detect 92% of the landmarks, and 40% of the landmarks are judged to be reliable. The deletion rate within reliable regions is as low as 5%.
(cont.) The resulting landmark sequences form a basis for a knowledge-based speech recognition system since the landmarks imply broad phonetic classes of the speech signal and indicate the points of focus for estimating detailed phonetic information. In addition, because the reliable regions generally correspond to lexical stresses and word boundaries, it is expected that the landmarks can guide the focus of attention not only at the phoneme-level, but at the phrase-level as well.
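The bigram-based search for the most likely landmark sequence can be sketched as a small Viterbi-style pass over the candidates. The landmark labels, detection scores and transition probabilities below are hypothetical illustrations, not values from the thesis.

```python
import math

def best_sequence(candidates, bigram):
    """Viterbi-style search: each time step offers candidate landmark
    types with detection scores; a bigram model scores transitions
    between consecutive landmark types (missing pairs are near-forbidden)."""
    # paths: landmark type -> (log probability, best sequence ending in it)
    paths = {t: (math.log(p), [t]) for t, p in candidates[0].items()}
    for step in candidates[1:]:
        new = {}
        for t, p in step.items():
            lp, seq = max(
                (prev_lp + math.log(bigram.get((prev, t), 1e-12)) + math.log(p),
                 prev_seq)
                for prev, (prev_lp, prev_seq) in paths.items())
            new[t] = (lp, seq + [t])
        paths = new
    return max(paths.values())[1]

# Hypothetical landmark types: +g/-g (glottal onset/offset), s (sonorant), b (burst)
candidates = [{"+g": 0.9}, {"s": 0.6, "b": 0.5}, {"-g": 0.9}]
bigram = {("+g", "s"): 0.7, ("+g", "b"): 0.1, ("s", "-g"): 0.8, ("b", "-g"): 0.8}
print(best_sequence(candidates, bigram))  # ['+g', 's', '-g']
```

Encoding physiological constraints as (near-)zero bigram probabilities is what lets the search discard candidate sequences that no vocal tract could produce.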
by Chiyoun Park.
Ph.D.
7

Ning, Guanghan. "Vehicle license plate detection and recognition." Thesis, University of Missouri - Columbia, 2016. http://pqdtopen.proquest.com/#viewpdf?dispub=10157318.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:

In this work, we develop a license plate detection method using an SVM (Support Vector Machine) classifier with HOG (Histogram of Oriented Gradients) features. The system performs window searching at different scales, analyzes the HOG features using an SVM, and locates the bounding boxes of candidate plates using a Mean Shift method. Edge information is used to accelerate the time-consuming scanning process.
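The multi-scale window search can be sketched as follows. The window size, stride, scales and the toy scoring function below are assumptions for illustration; the scorer merely stands in for "extract HOG features and evaluate the trained linear SVM".

```python
def scan(image_w, image_h, score_fn, win=(96, 32), stride=16,
         scales=(1.0, 0.8), threshold=0.0):
    """Multi-scale sliding-window search: slide a plate-shaped window
    over the image at each scale, score it, and keep windows whose
    SVM-style score exceeds the decision threshold."""
    candidates = []
    for s in scales:
        w, h = int(win[0] / s), int(win[1] / s)
        for y in range(0, image_h - h + 1, stride):
            for x in range(0, image_w - w + 1, stride):
                score = score_fn(x, y, w, h)
                if score > threshold:
                    candidates.append((score, (x, y, w, h)))
    return sorted(candidates, reverse=True)  # best-scoring windows first

# Hypothetical scorer: pretends a plate is centred near (112, 64)
def toy_score(x, y, w, h):
    cx, cy = x + w // 2, y + h // 2
    return 1.0 - (abs(cx - 112) + abs(cy - 64)) / 100.0

print(scan(320, 240, toy_score)[0][1])  # (64, 48, 96, 32)
```

In a real detector the edge-density filter mentioned above would skip most windows before scoring, and the surviving high-score windows would feed the Mean Shift localization step.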

Our license plate detection results show that this method is relatively insensitive to variations in illumination, license plate patterns, camera perspective and background. We tested our method on 200 real-life images captured on Chinese highways under different weather and lighting conditions, and achieved a detection rate of 100%.

After detecting license plates, alignment is performed on the plate candidates. Conceptually, this alignment method searches the neighborhood of the detected bounding box and finds the optimal edge position, where the regions outside differ most from the regions inside the license plate from a color perspective in RGB space. This method accurately aligns the bounding box to the edges of the plate so that the subsequent license plate segmentation and recognition can be performed accurately and reliably.

The system performs license plate segmentation using global alignment on the binarized license plate. A global model based on the layout of license plates is proposed to segment the plates; it searches for the optimal positions where the characters are all separated but not chopped into pieces. Finally, the characters are recognized by another SVM classifier with a feature size of 576, including raw features and vertical and horizontal scanning features.

Our character recognition results show that 99% of the digits are successfully recognized, while the letters achieve a recognition rate of 95%.

The license plate recognition system was then incorporated into an embedded system for parallel computing. Several TS7250 boards and an auxiliary board are used to simulate the process of vehicle retrieval.

8

Liu, Chang. "Human motion detection and action recognition." HKBU Institutional Repository, 2010. http://repository.hkbu.edu.hk/etd_ra/1108.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Anwer, Rao Muhammad. "Color for Object Detection and Action Recognition." Doctoral thesis, Universitat Autònoma de Barcelona, 2013. http://hdl.handle.net/10803/120224.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Detecting objects in images is a central problem in the field of computer vision. The detection framework based on deformable part models is currently the most effective. Generally, HOG is the image descriptor from which these models are built. Human action recognition is another of the topics of greatest current interest in computer vision. In this case, the models used follow the bag-of-(visual-)words idea, with SIFT being one of the most widely used image descriptors supporting the construction of those models. In this context, there is information highly relevant to the human visual system that is normally underused both in object detection and in action recognition: colour. That is, both HOG and SIFT are usually applied to the luminance channel, or to some projection of the colour channels that likewise discards it. Overall, this thesis focuses on incorporating colour as an additional source of information to improve both object detection and action recognition. First, the thesis analyses the problem of person detection in photographs; in particular, we focus on analysing the contribution of colour to state-of-the-art methods. We then move on to the problem of detecting objects in general, not only people. Moreover, instead of introducing colour at the lowest level of the image representation, which increases the dimensionality of the representation, raising the computational cost and the need for more training examples, this thesis focuses on introducing colour at a higher level of the representation. This is not trivial, since the system under development has to learn a set of colour attributes that are sufficiently discriminative for each task.
In particular, in this thesis we combine these colour attributes with traditional shape attributes, applying them in a way that improves the state of the art in object detection. Finally, we focus on carrying the ideas introduced for the detection task over to the action recognition task. Here too we show how incorporating colour, as proposed in this thesis, improves on the state of the art.
Recognizing object categories in real-world images is a challenging problem in computer vision. The deformable part-based framework is currently the most successful approach for object detection. Generally, HOG features are used for image representation within the part-based framework. For action recognition, the bag-of-words framework has been shown to provide promising results; within it, local image patches are described by the SIFT descriptor. In contrast to object detection and action recognition, combining color and shape has been shown to provide the best performance for object and scene recognition. In the first part of this thesis, we analyze the problem of person detection in still images. Standard person detection approaches rely on intensity-based features for image representation while ignoring color. Channel-based descriptors are one of the most commonly used approaches in object recognition. This inspires us to evaluate incorporating color information using the channel-based fusion approach for the task of person detection. In the second part of the thesis, we investigate the problem of object detection in still images. Due to high dimensionality, channel-based fusion increases the computational cost. Moreover, channel-based fusion has been found to obtain inferior results for object categories where one of the visual cues varies significantly. On the other hand, late fusion is known to provide improved results for a wide range of object categories. A consequence of the late fusion strategy is the need for a pure color descriptor. Therefore, we propose to use color attributes as an explicit color representation for object detection. Color attributes are compact and computationally efficient, and combining them with traditional shape features provides excellent results for the object detection task. Finally, we focus on the problem of action detection and classification in still images.
We investigate the potential of color for action classification and detection in still images, and evaluate different fusion approaches for combining color and shape information for action recognition. Additionally, an analysis is performed to validate the contribution of color for action recognition. Our results clearly demonstrate that combining color and shape information significantly improves the performance of both action classification and detection in still images.
10

Wang, Ge. "Verilogo proactive phishing detection via logo recognition /." Diss., [La Jolla] : University of California, San Diego, 2010. http://wwwlib.umi.com/cr/fullcit?p1477945.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Thesis (M.S.)--University of California, San Diego, 2010.
Title from first page of PDF file (viewed July 16, 2010). Available via ProQuest Digital Dissertations. Includes bibliographical references (leaves 38-40).
11

Koniaris, Christos. "Perceptually motivated speech recognition and mispronunciation detection." Doctoral thesis, KTH, Tal-kommunikation, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-102321.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
This doctoral thesis is the result of a research effort performed in two fields of speech technology: speech recognition and mispronunciation detection. Although the two areas are clearly distinguishable, the proposed approaches share a common hypothesis based on psychoacoustic processing of speech signals. The conjecture implies that the human auditory periphery provides a relatively good separation of different sound classes. Hence, it is possible to use recent findings from psychoacoustic perception together with mathematical and computational tools to model the auditory sensitivity to small speech signal changes. The performance of an automatic speech recognition system strongly depends on the representation used for the front-end. If the extracted features do not include all relevant information, the performance of the classification stage is inherently suboptimal. The work described in Papers A, B and C is motivated by the fact that humans perform better at speech recognition than machines, particularly in noisy environments. The goal is to make use of knowledge of human perception in the selection and optimization of speech features for speech recognition. These papers show that maximizing the similarity of the Euclidean geometry of the features to the geometry of the perceptual domain is a powerful tool for selecting or optimizing features. Experiments with a practical speech recognizer confirm the validity of the principle. An approach to improving mel frequency cepstrum coefficients (MFCCs) through offline optimization is also shown. The method has three advantages: i) it is computationally inexpensive, ii) it does not use the auditory model directly, thus avoiding its computational cost, and iii) importantly, it provides better recognition performance than traditional MFCCs in both clean and noisy conditions. The second task concerns automatic pronunciation error detection.
The research, described in Papers D, E and F, is motivated by the observation that almost all native speakers perceive, relatively easily, the acoustic characteristics of their own language when it is produced by speakers of the language. Small variations within a phoneme category, sometimes different for various phonemes, do not change significantly the perception of the language’s own sounds. Several methods are introduced based on similarity measures of the Euclidean space spanned by the acoustic representations of the speech signal and the Euclidean space spanned by an auditory model output, to identify the problematic phonemes for a given speaker. The methods are tested for groups of speakers from different languages and evaluated according to a theoretical linguistic study showing that they can capture many of the problematic phonemes that speakers from each language mispronounce. Finally, a listening test on the same dataset verifies the validity of these methods.

QC 20120914


European Union FP6-034362 research project ACORNS
Computer-Animated language Teachers (CALATea)
12

Mahmood, Hamid. "Visual Attention-based Object Detection and Recognition." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-94024.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
This thesis is all about visual attention, from understanding the human visual system up to applying this mechanism in a real-world computer vision application. This has been achieved by taking advantage of the latest findings about human visual attention and the increased performance of computers; these two facts played a vital role in simulating the many different aspects of this visual behavior. In addition, bio-inspired visual attention systems have become practical due to the emergence of interdisciplinary approaches to vision, which lead to a beneficial interaction between scientists from different fields. The high complexity of computer vision problems makes the visual attention paradigm a natural component of real-time computer vision solutions, which are in increasing demand. In this thesis work, different aspects of the visual attention paradigm are dealt with, ranging from biological modeling to the implementation of real-world computer vision tasks based on this visual behavior. The implementation of a traffic sign detection and recognition system benefiting from this mechanism is the central part of this thesis work.
13

Zhou, Yun. "Embedded Face Detection and Facial Expression Recognition." Digital WPI, 2014. https://digitalcommons.wpi.edu/etd-theses/583.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Face detection has been applied in many fields such as surveillance, human-machine interaction, entertainment and health care. Two main reasons for the extensive attention on this research domain are: 1) a strong need for face recognition systems, given the widespread use of security, and 2) face recognition is more user-friendly and faster, since it requires the user to do almost nothing. The system is based on an ARM Cortex-A8 development board, and the work includes porting the Linux operating system, developing drivers, and detecting faces using Haar-like features and the Viola-Jones algorithm. In this thesis, the face detection system uses the AdaBoost algorithm to detect human faces in the frames captured by the camera. The thesis discusses the pros and cons of several popular image processing algorithms. The facial expression recognition system involves face detection and emotion feature interpretation, and consists of an offline training part and an online test part. Active shape models (ASM) for facial feature point detection, optical flow for face tracking, and support vector machines (SVM) for classification are applied in this research.
14

Norris, Jeffrey S. (Jeffrey Singley) 1976. "Face detection and recognition in office environments." Thesis, Massachusetts Institute of Technology, 1999. http://hdl.handle.net/1721.1/80108.

Full text
APA, Harvard, Vancouver, ISO, and other styles
15

Aleixo, Patrícia Nunes. "Object detection and recognition for robotic applications." Master's thesis, Universidade de Aveiro, 2014. http://hdl.handle.net/10773/13811.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Mestrado em Engenharia Eletrónica e Telecomunicações
Computer vision is of great relevance to the development of robotic applications. In several applications, robots need to use vision to detect objects, a challenging and sometimes difficult task. This thesis focuses on the study and development of algorithms for detecting and identifying objects in digital images, to be applied on robots used in practical cases. Three problems are addressed: detection and identification of decorative stones for the textile industry; detection of the ball in robotic soccer; and detection of objects by a service robot that operates in a domestic environment. In each case, different methods are studied and applied, such as Template Matching, the Hough transform and visual descriptors (like SIFT and SURF). The OpenCV library was chosen in order to use its data structures for image manipulation, as well as other structures for all the information generated by the developed vision systems. Whenever possible, existing implementations of the described methods were used, and new approaches were developed, both in terms of pre-processing algorithms and in terms of modifying the source code of some of the functions used. The pre-processing algorithms used include the Canny edge detector, contour detection and extraction of colour information, among others. For the three problems, experimental results are presented and discussed in order to evaluate the best method to apply in each case. The best method for each application is already integrated, or in the process of being integrated, in the described robots.
16

Cohen, Gregory Kevin. "Event-Based Feature Detection, Recognition and Classification." Thesis, Paris 6, 2016. http://www.theses.fr/2016PA066204/document.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Detection, target tracking and recognition of visual features are fundamental problems in robotic vision. These problems are notoriously difficult and challenging. Despite progress in machine computing power and gains in sensor resolution and frame rate, the state of the art in robotic vision struggles to reach the energy efficiency and robustness offered by biological vision. The appearance of new sensors called "silicon retinas", such as the DVS (Dynamic Vision Sensor) and the ATIS (Asynchronous Time-based Imaging Sensor), which reproduce certain functions of biological retinas, opens the way to new paradigms for describing and modeling visual perception, as well as for processing the resulting visual information. Tracking and pattern recognition tasks still require the characterization and matching of visual features. Detecting and describing these features requires approaches fundamentally different from those employed in traditional robotic vision. This thesis develops and formalizes new methods for detecting and characterizing spatio-temporal features in the signals acquired by silicon retinas (more commonly called "event-based" sensors). A theoretical framework for feature detection, tracking, recognition and classification tasks is proposed. It is then validated on data from these event-based sensors, as well as on standard pattern recognition datasets converted beforehand to an event-based representation. The results presented in this thesis demonstrate the potential and efficacy of event-based systems.
This work provides an in-depth analysis of different event-based methods for pattern recognition and classification, and then proposes two feature-based solutions. Two learning mechanisms, one purely event-driven and the other iterative, are developed and then evaluated for their classification ability and robustness. The results demonstrate the validity of event-based classification and underline the importance of scene dynamics in the essential tasks of defining, detecting and characterizing features.
One of the fundamental tasks underlying much of computer vision is the detection, tracking and recognition of visual features. It is an inherently difficult and challenging problem, and despite the advances in computational power, pixel resolution, and frame rates, even the state-of-the-art methods fall far short of the robustness, reliability and energy efficiency of biological vision systems. Silicon retinas, such as the Dynamic Vision Sensor (DVS) and Asynchronous Time-based Imaging Sensor (ATIS), attempt to replicate some of the benefits of biological retinas and provide a vastly different paradigm in which to sense and process the visual world. Tasks such as tracking and object recognition still require the identification and matching of local visual features, but the detection, extraction and recognition of features requires a fundamentally different approach, and the methods that are commonly applied to conventional imaging are not directly applicable. This thesis explores methods to detect features in the spatio-temporal information from event-based vision sensors. The nature of features in such data is explored, and methods to determine and detect features are demonstrated. A framework for detecting, tracking, recognising and classifying features is developed and validated using real-world data and event-based variations of existing computer vision datasets and benchmarks. The results presented in this thesis demonstrate the potential and efficacy of event-based systems. This work provides an in-depth analysis of different event-based methods for object recognition and classification and introduces two feature-based methods. Two learning systems, one event-based and the other iterative, were used to explore the nature and classification ability of these methods. The results demonstrate the viability of event-based classification and the importance and role of motion in event-based feature detection.
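Event-based features of the kind discussed above are often built from the timing of recent events. A minimal sketch of one such descriptor, a time-surface, in Python (the function, grid size and decay constant here are illustrative assumptions, not the thesis's method):

```python
import math

def time_surface(events, t_now, size=3, tau=0.05):
    """Build a time-surface descriptor from a stream of (x, y, t) events:
    each pixel holds an exponential decay of its most recent event time
    (one common event-based feature; details differ from the thesis's)."""
    last = [[None] * size for _ in range(size)]
    for x, y, t in events:
        if 0 <= x < size and 0 <= y < size:
            last[y][x] = t          # keep only the most recent timestamp
    return [[0.0 if t is None else math.exp(-(t_now - t) / tau)
             for t in row] for row in last]

# recent events leave strong responses; an older one has decayed
surf = time_surface([(0, 0, 0.00), (1, 1, 0.099), (2, 2, 0.100)], t_now=0.1)
print([[round(v, 2) for v in row] for row in surf])
# -> [[0.14, 0.0, 0.0], [0.0, 0.98, 0.0], [0.0, 0.0, 1.0]]
```

The decayed surface makes recent motion directly visible to a downstream detector without any frame-based reconstruction.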
17

Dittmar, George William. "Object Detection and Recognition in Natural Settings." PDXScholar, 2013. https://pdxscholar.library.pdx.edu/open_access_etds/926.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Much recent research has focused on biologically inspired vision models that are based on our understanding of how the visual cortex processes information. One prominent example of such a system is HMAX [17]. HMAX attempts to simulate the biological process for object recognition in cortex based on the model proposed by Hubel & Wiesel [10]. This thesis investigates the ability of an HMAX-like system (GLIMPSE [20]) to perform object detection in cluttered natural scenes. I evaluate these results using the StreetScenes database from MIT [1, 8]. This thesis addresses three questions: (1) Can the GLIMPSE-based object detection system replicate the results on object detection reported by Bileschi using HMAX? (2) Which features computed by GLIMPSE lead to the best object detection performance? (3) What effect does elimination of clutter in the training sets have on the performance of our system? As part of this thesis, I built an object detection and recognition system using GLIMPSE [20] and demonstrate that it approximately replicates the results reported in Bileschi's thesis. In addition, I found that extracting and combining features from GLIMPSE using different layers of the HMAX model gives the best overall invariance to position, scale and translation for recognition tasks, but comes with a much higher computational overhead. Further contributions include the creation of modified training and test sets based on the StreetScenes database, with removed clutter in the training data and extending the annotations for the detection task to cover more objects of interest that were not in the original annotations of the database.
18

Olsson, Oskar, and Moa Eriksson. "Automated system tests with image recognition : focused on text detection and recognition." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-160249.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Today's airplanes and modern cars are equipped with displays to communicate important information to the pilot or driver. These displays need to be tested for safety reasons; displays that fail can be a huge safety risk and lead to catastrophic events. Today, displays are tested by checking the output signals or with the help of a person who validates the physical display manually. However, this technique is very inefficient and can lead to important errors going unnoticed. MindRoad AB is searching for a solution where validation of the display is made from a camera pointed at it; text and numbers are then recognized using a computer vision algorithm and validated in a time-efficient and accurate way. This thesis compares three text detection algorithms, EAST, SWT and Tesseract, to determine the most suitable for continued work. The chosen algorithm is then optimized, and the possibility of developing a program that meets MindRoad AB's expectations is investigated. As a result, several algorithms were combined into a fully working program to detect and recognize text in industrial displays.
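A display-validation step of the kind described, comparing recognized text against the expected string while tolerating small OCR misreads, might look like this sketch using Python's standard difflib (the function name and threshold are hypothetical, not MindRoad AB's implementation):

```python
from difflib import SequenceMatcher

def validate_display(recognized: str, expected: str, threshold: float = 0.85) -> bool:
    """Accept the display reading if the OCR output is close enough to the
    expected string (tolerates single-character OCR errors)."""
    ratio = SequenceMatcher(None, recognized.strip(), expected.strip()).ratio()
    return ratio >= threshold

# e.g. OCR misreads 'SPEED 120' as 'SPEED 12O'
print(validate_display("SPEED 12O", "SPEED 120"))  # close match -> True
print(validate_display("SPEED 80", "SPEED 120"))   # genuine mismatch -> False
```

The similarity threshold trades false alarms against missed display faults, much like the precision/recall balance discussed in the thesis.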
19

Kou, Yufeng. "Abnormal Pattern Recognition in Spatial Data." Diss., Virginia Tech, 2006. http://hdl.handle.net/10919/30145.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
In recent years, abnormal spatial pattern recognition has received a great deal of attention from both industry and academia, and has become an important branch of data mining. Abnormal spatial patterns, or spatial outliers, are those observations whose characteristics are markedly different from their spatial neighbors. The identification of spatial outliers can be used to reveal hidden but valuable knowledge in many applications. For example, it can help locate extreme meteorological events such as tornadoes and hurricanes, identify aberrant genes or tumor cells, discover highway traffic congestion points, pinpoint military targets in satellite images, determine possible locations of oil reservoirs, and detect water pollution incidents. Numerous traditional outlier detection methods have been developed, but they cannot be directly applied to spatial data in order to extract abnormal patterns. Traditional outlier detection mainly focuses on "global comparison" and identifies deviations from the remainder of the entire data set. In contrast, spatial outlier detection concentrates on discovering neighborhood instabilities that break the spatial continuity. In recent years, a number of techniques have been proposed for spatial outlier detection. However, they have the following limitations. First, most of them focus primarily on single-attribute outlier detection. Second, they may not accurately locate outliers when multiple outliers exist in a cluster and correlate with each other. Third, the existing algorithms tend to abstract spatial objects as isolated points and do not consider their geometrical and topological properties, which may lead to inexact results. This dissertation reports a study of the problem of abnormal spatial pattern recognition, and proposes a suite of novel algorithms. 
Contributions include: (1) formal definitions of various spatial outliers, including single-attribute outliers, multi-attribute outliers, and region outliers; (2) a set of algorithms for the accurate detection of single-attribute spatial outliers; (3) a systematic approach to identifying and tracking region outliers in continuous meteorological data sequences; (4) a novel Mahalanobis-distance-based algorithm to detect outliers with multiple attributes; (5) a set of graph-based algorithms to identify point outliers and region outliers; and (6) extensive analysis of experiments on several spatial data sets (e.g., West Nile virus data and NOAA meteorological data) to evaluate the effectiveness and efficiency of the proposed algorithms.
Ph. D.
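The neighborhood-based comparison that distinguishes spatial outlier detection from global "outlier vs. whole dataset" comparison can be illustrated with a short sketch (a deliberate simplification, not one of the dissertation's algorithms):

```python
import math

def spatial_outliers(values, neighbors, threshold=1.5):
    """Flag points whose attribute deviates strongly from the mean of their
    spatial neighbors (a simple neighborhood-difference z-test)."""
    # difference between each point and its neighborhood average
    diffs = [values[i] - sum(values[j] for j in nbrs) / len(nbrs)
             for i, nbrs in enumerate(neighbors)]
    mu = sum(diffs) / len(diffs)
    sigma = math.sqrt(sum((d - mu) ** 2 for d in diffs) / len(diffs))
    return [i for i, d in enumerate(diffs) if abs(d - mu) > threshold * sigma]

# 5 points on a line; point 2 breaks the spatial continuity of its neighborhood
vals = [1.0, 1.1, 9.0, 1.2, 1.0]
nbrs = [[1], [0, 2], [1, 3], [2, 4], [3]]
print(spatial_outliers(vals, nbrs))  # -> [2]
```

Note how points 1 and 3 also show large neighborhood differences (they border the outlier), illustrating the "multiple correlated outliers" difficulty the dissertation addresses with more robust methods.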
20

Pavani, Sri-Kaushik. "Methods for face detection and adaptive face recognition." Doctoral thesis, Universitat Pompeu Fabra, 2010. http://hdl.handle.net/10803/7567.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
The focus of this thesis is on facial biometrics; specifically in the problems of face detection and face recognition. Despite intensive research over the last 20 years, the technology is not foolproof, which is why we do not see use of face recognition systems in critical sectors such as banking. In this thesis, we focus on three sub-problems in these two areas of research. Firstly, we propose methods to improve the speed-accuracy trade-off of the state-of-the-art face detector. Secondly, we consider a problem that is often ignored in the literature: to decrease the training time of the detectors. We propose two techniques to this end. Thirdly, we present a detailed large-scale study on self-updating face recognition systems in an attempt to answer whether continuously changing facial appearance can be learnt automatically.
The focus of this thesis is on facial biometrics, specifically the problems of face detection and face recognition. Despite intensive research over the last 20 years, the technology is not foolproof, which is why we do not see face recognition systems used in critical sectors such as banking. In this thesis, we focus on three sub-problems in these two areas of research. First, we propose methods to improve the speed-accuracy trade-off of the state-of-the-art face detector. Second, we consider a problem that is often ignored in the literature: reducing the training time of the detectors. We propose two techniques to this end. Third, we present a detailed large-scale study on self-updating face recognition systems in an attempt to answer whether continuously changing facial appearance can be learnt automatically.
21

Beyreuther, Moritz. "Speech Recognition based Automatic Earthquake Detection and Classification." Diss., lmu, 2011. http://nbn-resolving.de/urn:nbn:de:bvb:19-132557.

Full text
APA, Harvard, Vancouver, ISO, and other styles
22

Khiari, El Hebri. "Text Detection and Recognition in the Automotive Context." Thesis, Université d'Ottawa / University of Ottawa, 2015. http://hdl.handle.net/10393/32458.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
This thesis achieved the goal of obtaining high accuracy rates (precision and recall) in a real-time system that detects and recognizes text in the automotive context. For the sake of simplicity, this work targets two Objects of Interest (OOIs): North American (NA) traffic boards (TBs) and license plates (LPs). The proposed approach adopts a hybrid detection module consisting of a Connected Component Analysis (CCA) step followed by a Texture Analysis (TA) step. An initial set of candidates is extracted by highlighting the Maximally Stable Extremal Regions (MSERs). Each subsequent step in the CCA and TA steps attempts to reduce the size of the set by filtering out false positives and retaining the true positives. The final set of candidates is fed into a recognition stage that integrates an open source Optical Character Reader (OCR) into the framework by using two additional steps that serve the purpose of minimizing false readings as well as the incurred delays. A set of manually taken videos from various regions of Ottawa were used to evaluate the performance of the system, using precision, recall and latency as metrics. The high precision and recall values reflect the proposed approach's ability in removing false positives and retaining the true positives, respectively, while the low latency values deem it suitable for the automotive context. Moreover, the ability to detect two OOIs of varying appearances demonstrates the flexibility that is featured by the hybrid detection module.
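The candidate-filtering idea behind the CCA step, pruning MSER candidates by connected-component geometry, can be sketched as follows (the thresholds and bounding-box rules here are illustrative assumptions, not the thesis's tuned values):

```python
def filter_candidates(regions, img_w, img_h):
    """Discard candidate regions unlikely to be text, using simple geometric
    rules on aspect ratio and relative size."""
    kept = []
    for x, y, w, h in regions:          # bounding boxes of MSER candidates
        aspect = w / h
        area_frac = (w * h) / (img_w * img_h)
        if 0.1 <= aspect <= 10.0 and 1e-4 <= area_frac <= 0.05:
            kept.append((x, y, w, h))
    return kept

cands = [(10, 10, 30, 12),    # plausible character/word box -> kept
         (0, 0, 640, 480),    # whole image -> rejected (too large)
         (5, 5, 1, 40)]       # thin vertical sliver -> rejected (aspect)
print(filter_candidates(cands, 640, 480))  # -> [(10, 10, 30, 12)]
```

Cheap geometric tests like these reject most false positives before the more expensive texture analysis and OCR stages run, which is what keeps the latency low.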
23

Hayes, William S. "Pattern recognition and signal detection in gene finding." Diss., Georgia Institute of Technology, 1998. http://hdl.handle.net/1853/25420.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

Liu, Sharlene Anne. "Landmark detection for distinctive feature-based speech recognition." Thesis, Massachusetts Institute of Technology, 1995. http://hdl.handle.net/1721.1/11406.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1995.
Includes bibliographical references (leaves 187-190).
by Sharlene Anne Liu.
Ph.D.
25

Patel, Ravi L. "Security system with motion detection and face recognition." Thesis, California State University, Long Beach, 2017. http://pqdtopen.proquest.com/#viewpdf?dispub=10251645.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:

Security is an essential criterion in all industries. This project develops a Security System that includes motion detection and face recognition. Motion detection is achieved by using the PIR (Passive Infrared) sensor, and face recognition is achieved by using the SIFT (Scale Invariant Feature Transform) algorithm.

The primary hardware components used in this system are a PIR sensor, microcontroller, relay, LCD (Liquid Crystal Display), buzzer, MAX232 IC, and GSM (Global System for Mobile Communication). The system incorporates the feature extraction method, which is utilized to identify the number of objects in an image, and the proposed SIFT algorithm is used for the face recognition. These two methods, the feature extraction method and SIFT algorithm, are implemented in MATLAB. The result shows that the efficiency and the recognition time of the proposed SIFT algorithm are better than those of its predecessors. This system can be used for industrial, hospital, or even residential purposes.
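SIFT-based recognition of the kind used here typically keeps only unambiguous descriptor matches via Lowe's ratio test. A toy sketch in Python, with 2-D vectors standing in for 128-D SIFT descriptors (the project itself uses MATLAB):

```python
import math

def match_descriptors(desc_a, desc_b, ratio=0.75):
    """Lowe's ratio test: keep a match only when the nearest descriptor in
    desc_b is clearly closer than the second nearest."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = sorted((math.dist(d, e), j) for j, e in enumerate(desc_b))
        best, second = dists[0], dists[1]
        if best[0] < ratio * second[0]:
            matches.append((i, best[1]))
    return matches

a = [(0.0, 1.0), (5.0, 5.0)]
b = [(0.1, 1.0), (4.0, 4.0), (4.1, 4.1)]
# a[1] matches b[1] and b[2] almost equally well, so it is discarded
print(match_descriptors(a, b))  # -> [(0, 0)]
```

Discarding ambiguous matches this way is what makes SIFT matching robust when many similar-looking features (e.g. facial texture patches) are present.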

26

Halberstadt, Warren. "Pattern recognition in the detection of Tuberculous Meningitis." Master's thesis, University of Cape Town, 2005. http://hdl.handle.net/11427/3239.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Robertson, Curtis E. "Deep Learning-Based Speed Sign Detection and Recognition." University of Cincinnati / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1595500028808679.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Qiao, Long. "Structural damage detection using signal-based pattern recognition." Diss., Manhattan, Kan. : Kansas State University, 2009. http://hdl.handle.net/2097/1385.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

Yousfi, Sonia. "Embedded Arabic text detection and recognition in videos." Thesis, Lyon, 2016. http://www.theses.fr/2016LYSEI069/document.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
This thesis addresses the detection and recognition of Arabic text embedded in videos. In this context, we propose different video text detection and OCR (Optical Character Recognition) prototypes that are robust to the complexity of Arabic text (different scales, sizes, fonts, etc.) as well as to the various challenges of the video environment and acquisition conditions (background variability, luminosity, contrast, low resolution, etc.). We introduce different machine-learning-based Arabic text detectors that require no pre-processing. The detectors rely on Convolutional Neural Networks (ConvNets) and on boosting schemes that learn to select hand-crafted textual features. Our OCR methodology is segmentation-free, treating each text image as a sequence of features through a scanning process. Unlike existing methods based on hand-crafted features, we propose relevant representations learnt automatically from data. We use different deep learning models, including Auto-Encoders, ConvNets and an unsupervised learning model, to generate these features automatically. Each model yields a specific OCR system. The recognition process relies on a recurrent connectionist approach that learns to label feature sequences without any prior segmentation. Our proposed OCR models are compared to other models based on hand-crafted features. We further propose to integrate Arabic language models (LMs) to improve recognition results. We introduce different LMs based on Recurrent Neural Networks, capable of learning long-range linguistic dependencies. 
We propose a joint decoding scheme that integrates the LM inferences in parallel with those of the OCR, introducing a set of hyper-parameters to improve recognition and reduce response time. To overcome the lack of Arabic text corpora from multimedia content, we build new corpora manually annotated from Arabic TV streams. The corpus designed for OCR, named ALIF and composed of 6,532 annotated text images, has been released for research purposes. Our systems were developed and evaluated on these corpora. The results validate our approaches and show their efficiency and genericity, with a detection rate above 97% and a word recognition rate of 88.63% on the ALIF corpus, outperforming one of the best-known commercial OCR engines by 36 points.
This thesis focuses on Arabic embedded text detection and recognition in videos. Different approaches robust to Arabic text variability (fonts, scales, sizes, etc.) as well as to environmental and acquisition condition challenges (contrasts, degradation, complex background, etc.) are proposed. We introduce different machine learning-based solutions for robust text detection without relying on any pre-processing. The first method is based on Convolutional Neural Networks (ConvNet) while the others use a specific boosting cascade to select relevant hand-crafted text features. For the text recognition, our methodology is segmentation-free. Text images are transformed into sequences of features using a multi-scale scanning scheme. Standing out from the dominant methodology of hand-crafted features, we propose to learn relevant text representations from data using different deep learning methods, namely Deep Auto-Encoders, ConvNets and unsupervised learning models. Each one leads to a specific OCR (Optical Character Recognition) solution. Sequence labeling is performed without any prior segmentation using a recurrent connectionist learning model. Proposed solutions are compared to other methods based on non-connectionist and hand-crafted features. In addition, we propose to enhance the recognition results using Recurrent Neural Network-based language models that are able to capture long-range linguistic dependencies. Both OCR and language model probabilities are incorporated in a joint decoding scheme where additional hyper-parameters are introduced to boost recognition results and reduce the response time. Given the lack of public multimedia Arabic datasets, we propose novel annotated datasets issued from Arabic videos. The OCR dataset, called ALIF, is publicly available for research purposes. To the best of our knowledge, it is the first public dataset dedicated to Arabic video OCR. Our proposed solutions were extensively evaluated. 
Obtained results highlight the genericity and the efficiency of our approaches, reaching a word recognition rate of 88.63% on the ALIF dataset and outperforming a well-known commercial OCR engine by more than 36 points.
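The joint decoding idea, mixing OCR and language-model evidence under tunable hyper-parameters, can be illustrated with a toy scoring function (the weights and length penalty below are hypothetical, analogous in spirit to the thesis's scheme rather than a reproduction of it):

```python
import math

def joint_score(ocr_probs, lm_probs, alpha=0.7, beta=0.1):
    """Score a candidate transcription by mixing OCR and language-model
    log-probabilities; alpha weights the LM and beta penalizes length."""
    score = sum(math.log(p) for p in ocr_probs)          # OCR evidence
    score += alpha * sum(math.log(p) for p in lm_probs)  # LM evidence
    score -= beta * len(ocr_probs)                       # length penalty
    return score

# two candidate words: OCR slightly prefers the first, the LM strongly the second
c1 = joint_score([0.9, 0.8], [0.05, 0.05])
c2 = joint_score([0.8, 0.7], [0.4, 0.5])
print(c1 < c2)  # LM evidence flips the decision -> True
```

Tuning weights like alpha on held-out data is the usual way such hyper-parameters are set; the thesis additionally uses them to trade recognition accuracy against response time.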
30

IACONO, MASSIMILIANO. "Object detection and recognition with event driven cameras." Doctoral thesis, Università degli studi di Genova, 2020. http://hdl.handle.net/11567/1005981.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
This thesis presents the study, analysis and implementation of algorithms to perform object detection and recognition using an event-based camera. This sensor represents a novel paradigm which opens a wide range of possibilities for future developments of computer vision. In particular, it produces a fast, compressed, illumination-invariant output, which can be exploited for robotic tasks, where fast dynamics and significant illumination changes are frequent. The experiments are carried out on the neuromorphic version of the iCub humanoid platform. The robot is equipped with a novel dual camera setup mounted directly in the robot's eyes, used to generate data with a moving camera. The motion causes the presence of background clutter in the event stream. In such a scenario the detection problem has been addressed with an attention mechanism, specifically designed to respond to the presence of objects while discarding clutter. The proposed implementation takes advantage of the nature of the data to simplify the original proto-object saliency model which inspired this work. Subsequently, the recognition task was first tackled with a feasibility study to demonstrate that the event stream carries sufficient information to classify objects, and then with the implementation of a spiking neural network. The feasibility study provides the proof-of-concept that events are informative enough in the context of object classification, whereas the spiking implementation improves the results by employing an architecture specifically designed to process event data. The spiking network was trained with a three-factor local learning rule which overcomes the weight transport, update locking and non-locality problems. The presented results prove that both detection and classification can be carried out in the target application using the event data.
31

Higgs, David Robert. "Parts-based object detection using multiple views /." Link to online version, 2005. https://ritdml.rit.edu/dspace/handle/1850/1000.

Full text
APA, Harvard, Vancouver, ISO, and other styles
32

Ma, Chengyuan. "A detection-based pattern recognition framework and its applications." Diss., Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/33889.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
The objective of this dissertation is to present a detection-based pattern recognition framework and demonstrate its applications in automatic speech recognition and broadcast news video story segmentation. Inspired by the studies of modern cognitive psychology and real-world pattern recognition systems, a detection-based pattern recognition framework is proposed to provide an alternative solution for some complicated pattern recognition problems. The primitive features are first detected and the task-specific knowledge hierarchy is constructed level by level; then a variety of heterogeneous information sources are combined together and the high-level context is incorporated as additional information at certain stages. A detection-based framework is a "divide-and-conquer" design paradigm for pattern recognition problems, which will decompose a conceptually difficult problem into many elementary sub-problems that can be handled directly and reliably. Some information fusion strategies will be employed to integrate the evidence from a lower level to form the evidence at a higher level. Such a fusion procedure continues until reaching the top level. Generally, a detection-based framework has many advantages: (1) more flexibility in both detector design and fusion strategies, as these two parts can be optimized separately; (2) parallel and distributed computational components in primitive feature detection. In such a component-based framework, any primitive component can be replaced by a new one while other components remain unchanged; (3) incremental information integration; (4) high level context information as additional information sources, which can be combined with bottom-up processing at any stage. This dissertation presents the basic principles, criteria, and techniques for detector design and hypothesis verification based on the statistical detection and decision theory. In addition, evidence fusion strategies were investigated in this dissertation. 
Several novel detection algorithms and evidence fusion methods were proposed and their effectiveness was justified in automatic speech recognition and broadcast news video segmentation systems. We believe such a detection-based framework can be employed in more applications in the future.
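One simple instance of the bottom-up evidence-fusion step, combining lower-level detector scores into a higher-level score, is a weighted product of probabilities (an illustrative choice only; the dissertation investigates several fusion strategies):

```python
def fuse(evidence, weights):
    """Combine lower-level detector scores into one higher-level score by a
    weighted product of probabilities; a weight below 1.0 discounts a less
    reliable detector."""
    assert len(evidence) == len(weights)
    score = 1.0
    for p, w in zip(evidence, weights):
        score *= p ** w
    return score

# bottom-up fusion: e.g. phone-level detector outputs -> word-level evidence
word_score = fuse([0.9, 0.7, 0.8], [1.0, 1.0, 0.5])
print(round(word_score, 3))  # -> 0.563
```

Because detector design and fusion are decoupled, the weights (or the fusion rule itself) can be re-optimized without retraining any individual detector, which is the flexibility advantage (1) noted above.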
33

Kelsey, Matthew Douglas. "THE DEVELOPMENT AND EVALUATION OF TECHNIQUES FOR USE IN MAMMOGRAPHIC SCREENING COMPUTER AIDED DETECTION SYSTEMS." OpenSIUC, 2011. https://opensiuc.lib.siu.edu/dissertations/331.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
The material presented in this dissertation details techniques developed to aid in the detection of a specific type of cancerous lesion visible on screening mammography images. These spiculated lesions most often appear as centrally bright objects with semi-defined borders. Furthermore, lesion margins are composed of indicative spiculations or fine tendrils projecting outward from the mass center. The techniques developed here to identify these characteristics and detect these objects are intended to operate as a processing pipeline. The first group of these processing stages is responsible for converting raw mammogram pixel data into localized and described objects. A second group of processing stages categorizes these objects by manipulating their descriptors and evaluating their meaning. At the conclusion of this processing pipeline, it is intended that image pixels which designate a cancerous mass will be highlighted and presented to a human operator as an aid in the early detection of breast cancers. The initial problem of object localization is addressed with breast tissue region extraction followed by a specialized spot detection algorithm. Tissue region extraction is accomplished using specific dataset image domain knowledge along with a simple threshold segmentation algorithm. Once this image area of interest is specified, contained objects of interest are identified using Iterative Disjoint Region Detection (IDRD). This specialized procedure utilizes iterative threshold segmentation to produce a three dimensional map of each image's pixel space. In this map, two dimensions directly correspond to the spatial dimension of the original image while the third corresponds to the normalized gray level of individual pixels. Traversing this map from the brightest pixel values to the darkest yields object "peaks", which are taken to be seeds of visible objects. 
Seeds are further processed at each successive threshold iteration by considering the effects of combining adjacent designations. This seeding process effectively detected all objects of interest with at least one seed. Because it was designed as a general purpose spot detection algorithm, many non-cancerous locally bright objects were detected as well. These other detections accounted for a wide majority of the seeds noted in each mammogram with approximately thirty to sixty seeds identified in most dataset images. A complementary task to object localization is the identification of each object's visible border and pixel area. This process is accomplished by a customized general purpose region growing routine, commonly known as pixel aggregation. During this procedure, spatially attached pixels are considered for inclusion with a prototype region defined by the region's corresponding seed object. Candidate pixels must meet a gray tone similarity criterion with our inclusion interval computed using the template region's average gray value. This process is supplemented by a leakage detection mechanism which serves to detect and recover from over segmentation of non-target objects in the image space. Leakage detection operates by tracking pixel aggregation rates for each iteration of the region growing process. A leakage is said to occur if the aggregation rate profile exhibits telltale characteristics of object border crossing followed by segmentation of an adjacent object. Once objects have been localized and their member pixels identified through the preceding procedures, it is the purpose of the next system stage to describe these objects using various measured features. The extraction of these measurements is the final step in transforming objects from image based visual depictions to abstract numerical representations. This new representation facilitates the forthcoming statistical treatment of these objects. 
Feature extraction is accomplished using a number of general use as well as special purpose measurements which quantify characteristics such as object shape, texture, and parent seed evolution. A total of forty-one feature measurements are extracted in order to ensure full representation of detected objects and to facilitate accurate object class membership. In the next section of work, we seek to categorize these objects which have just been detected, segmented and described using feature measurements. The role of a statistical classifier in accomplishing this is presented along with specifics as to the type of classifier used here. The use of a Bayes classifier is discussed and rationalized along with the development of the parametric Gaussian model for class conditional density estimation. Along with classifier development, a treatment of system performance evaluation is given. The Free-response Receiver Operating Characteristic (FROC) is described as an appropriate method by which to evaluate observer studies. This method suits the described CAD system, as a certain number of false positive detections are seen as acceptable and the system goal is to maximize mass sensitivity within these bounds. Our CAD system supplements the traditional classifier components by considering the effects of advanced feature vector manipulation. In total, five distinct models are developed including various iterations of feature selection and feature vector transformation. The Select model is presented as a benchmark and consists of a cumulative performance based feature selection step. The PCT Select and the DCT Select models are used to generate new feature vectors from the original measured set as linear combinations of its elements. PCT and DCT indicate the vector transformation model, Principle Components Transform and Discrete Cosine Transform respectively. 
Once transformed, the resultant feature vectors are processed with the same Select feature selection routine as in the benchmark model. The goal with both Transform-Select feature manipulation models is to generate a compact feature set which retains all of the necessary discriminatory information from measured features while rejecting measured characteristics which do not support accurate object classification. Two related models are also considered which measure the impact of implementing feature pre-selection on the PCT Select and the DCT Select models. The aptly named Select PCT Select and the Select DCT Select models seek to remove measured features which contain no discriminatory information from the pool of transformed data. System performance results for the five selection models are then compared to discern the contribution of each in the detection of cancerous masses. A complete analysis of the feature selection and transformation models show that while the benchmark Select model performs reasonably, considerable performance improvements are possible using feature vector manipulation methods. Performance metrics are generated with the use of a Free-response Receiver Operating Characteristic (FROC) plot. This method compares the mass detection sensitivity possible to the number of false positive detections per mammogram evaluated. Feature selection and classifier training are performed to maximize this sensitivity at a particular operating point, 4 FPpI. This point is taken as within the range of acceptable false indications in a typical clinical setting. Overall, the best system performance is seen with the use of the Select DCT Select feature model (84.51% sensitivity at 4 FPpI). This corresponds to a net increase of eighteen additional mass detections with the same amount of false positive indications and an increased mass sensitivity of 84.51% from 71.53% using the benchmark Select model. 
The other selection model using a pre-selection stage, Select PCT Select, reports similar performance results. This model detects 118 true positive masses, sixteen more than the Select model and just two fewer than the Select DCT Select model. Both of the other system configurations, PCT Select and DCT Select, were able to detect 109 true masses in the data set. This corresponds to a 76.76% mass sensitivity at 4 FPpI. Although not as impressive as the results generated with the pre-selection models, this is still a 5.23% improvement in mass sensitivity in comparison to the benchmark.
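The Transform-Select pipeline summarized above — transform the measured feature vectors, then keep only the most discriminative components — can be sketched generically. The orthonormal DCT construction below is standard, but the separability score and selection step are illustrative stand-ins for the thesis's cumulative-performance routine, not its actual code:

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis; rows are cosine basis vectors.
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (m + 0.5) * k / n)
    C[0, :] *= np.sqrt(0.5)
    return C

def fisher_scores(X, y):
    # Per-feature class-separability score: between-class spread
    # divided by within-class spread (a simple Fisher-style criterion).
    classes = np.unique(y)
    mu = X.mean(axis=0)
    num = sum((X[y == c].mean(axis=0) - mu) ** 2 for c in classes)
    den = sum(X[y == c].var(axis=0) for c in classes) + 1e-12
    return num / den

def dct_select(X, y, k):
    # "DCT Select": transform the measured features with the DCT,
    # then keep the k transformed components with the best scores.
    C = dct_matrix(X.shape[1])
    Z = X @ C.T
    order = np.argsort(fisher_scores(Z, y))[::-1]
    return Z[:, order[:k]], order[:k]
```

In the Select DCT Select variant, a pre-selection pass over the raw measured features would precede the transform.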
34

Ridge, Douglas John. "Imaging for small object detection." Thesis, Queen's University Belfast, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.295423.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Espinosa-Romero, Arturo. "Situated face detection." Thesis, University of Edinburgh, 2001. http://hdl.handle.net/1842/6667.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
In the last twenty years, important advances have been made in the field of automatic face processing, given the importance of human faces for personal identification, emotional expression and verbal and non-verbal communication. The very first step in a face processing algorithm is the detection of faces; while this is a trivial problem in controlled environments, the detection of faces in real environments is still a challenging task. Until now, the most successful approaches for face detection represent the face as a grey-level pattern, and the problem itself is treated as a classification between "face" and "non-face" patterns. Satisfactory results have been achieved in this area. The main disadvantage is that an exhaustive search has to be done on each image in order to locate the faces. This search normally involves testing every single position in the image at different scales, and although this does not represent an important drawback in off-line face processing systems, it is still a problem in those cases where a real-time response is needed. In the different proposed methods for face detection, the "observer" is a disembodied entity which holds no relationship with the observed scene. This thesis presents a framework for the efficient location of faces in real scenes, in which, by considering both the observer to be situated in the world and the relationships that hold between the two, a set of constraints on the search space can be defined. The constraints rely on two main assumptions: first, the observer can purposively interact with the world (i.e. change its position relative to the observed scene) and second, the camera is fully calibrated. The first source of constraint is the structural information about the observer's environment, represented as a depth map of the scene in front of the camera. 
From this representation the search space can be constrained in terms of the range of scales at which a face might be found at different positions in the image. The second source of constraint is the geometrical relationship between the camera and the scene, which allows us to project a model of the subject into the scene in order to eliminate those areas where faces are unlikely to be found. In order to test the proposed framework, a system based on the premises stated above was constructed. It is based on three different modules: a face/non-face classifier, a depth estimation module and a search module. The classifier is composed of a set of convolutional neural networks (CNN) that were trained to differentiate between face and non-face patterns; the depth estimation module uses a multilevel algorithm to compute the scene depth map from a sequence of captured images; and the search module projects the depth information and the subject model into the image where the search will be performed, in order to constrain the search space. Finally, the proposed system was validated by running a set of experiments on the individual modules and then on the whole system.
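The depth-to-scale constraint described in the abstract follows from pinhole geometry: an object of physical width W at depth Z projects to roughly f·W/Z pixels. A minimal sketch, in which the focal length and the assumed range of human face widths are illustrative values, not figures taken from the thesis:

```python
def face_scale_range(depth_m, focal_px, face_width_m=(0.12, 0.20)):
    # Under a pinhole camera model, an object of physical width W at
    # depth Z projects to roughly focal_px * W / Z pixels.  Given the
    # scene depth at an image location, this bounds the window sizes
    # a face detector needs to try there.
    lo = focal_px * face_width_m[0] / depth_m
    hi = focal_px * face_width_m[1] / depth_m
    return lo, hi
```

For example, with a 500-pixel focal length and a scene point 2 m away, the detector only needs to test windows roughly 30 to 50 pixels wide at that location instead of every scale.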
36

Al, Qader Akram Abed Al Karim Abed. "Unconstrained road sign recognition." Thesis, De Montfort University, 2017. http://hdl.handle.net/2086/14942.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
There are many types of road signs, each of which carries a different meaning and function: some signs regulate traffic, others indicate the state of the road or guide and warn drivers and pedestrians. Existing image-based road sign recognition systems work well under ideal conditions, but experience problems when the lighting conditions are poor or the signs are partially occluded. The aim of this research is to propose techniques to recognize road signs in a real outdoor environment, especially to deal with poor lighting and partially occluded road signs. To achieve this, hybrid segmentation and classification algorithms are proposed. In the first part of the thesis, we propose a hybrid dynamic threshold colour segmentation algorithm based on histogram analysis. A dynamic threshold is very important in road sign segmentation, since road sign colours may change throughout the day due to environmental conditions. In the second part, we propose a geometrical shape symmetry detection and reconstruction algorithm to detect and reconstruct the shape of the sign when it is partially occluded. This algorithm is robust to scale changes and rotations. The last part of this thesis deals with feature extraction and classification. We propose a hybrid feature vector based on histograms of oriented gradients, local binary patterns, and the scale-invariant feature transform. This vector is fed into a classifier that combines a Support Vector Machine (SVM) with a Random Forest and a hybrid SVM k-Nearest Neighbours (kNN) classifier. The overall method proposed in this thesis shows a high accuracy rate of 99.4% in ideal conditions, 98.6% in noisy and fading conditions, 98.4% in poor lighting conditions, and 92.5% for partially occluded road signs on the GRAMUAH traffic signs dataset.
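Of the three descriptors combined in the hybrid feature vector, the local binary pattern is compact enough to sketch in full. This is the basic 8-neighbour LBP summarised as a 256-bin histogram, not necessarily the exact variant used in the thesis; the HOG and SIFT descriptors would be concatenated alongside it:

```python
import numpy as np

def lbp_histogram(img):
    # Basic 8-neighbour local binary pattern: each interior pixel is
    # encoded by which of its eight neighbours are >= the centre
    # value, and the image is summarised as a 256-bin normalised
    # histogram of those codes.
    c = img[1:-1, 1:-1]
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint16)
    for bit, (dy, dx) in enumerate(shifts):
        n = img[1 + dy:img.shape[0] - 1 + dy,
                1 + dx:img.shape[1] - 1 + dx]
        code |= (n >= c).astype(np.uint16) << bit
    hist = np.bincount(code.ravel(), minlength=256).astype(float)
    return hist / hist.sum()
```

A hybrid vector would then be something like `np.concatenate([hog_vec, lbp_histogram(patch), sift_vec])` before being fed to the classifier ensemble.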
37

Chen, Datong. "Text detection and recognition in images and video sequences /." [S.l.] : [s.n.], 2003. http://library.epfl.ch/theses/?display=detail&nr=2863.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Bouganis, Christos-Savvas. "Multiple light source detection with application to face recognition." Thesis, Imperial College London, 2004. http://hdl.handle.net/10044/1/11322.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Feng, Jingwen. "Traffic Sign Detection and Recognition System for Intelligent Vehicles." Thesis, Université d'Ottawa / University of Ottawa, 2014. http://hdl.handle.net/10393/31449.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Road traffic signs provide instructions and warning information to regulate driver behavior. In addition, these signs provide a reliable guarantee for safe and convenient driving. The Traffic Sign Detection and Recognition (TSDR) system is one of the primary applications for Advanced Driver Assistance Systems (ADAS). TSDR has received a great deal of attention over recent years, but it is still a challenging field of image processing. In this thesis, we first created our own dataset for North American traffic signs, which is still being updated. We then chose Histograms of Oriented Gradients (HOG) and Support Vector Machines (SVMs) to build our system after comparing them with other techniques. For better results, we tested different HOG parameters to find the best combination. After this, we developed a TSDR system using HOG, SVM and our new color information extraction algorithm. To reduce computation time, we used Maximally Stable Extremal Regions (MSER) to replace the HOG and SVM detection stage. In addition, we developed a new approach based on Global Positioning System (GPS) information rather than image processing. Finally, we tested these three systems; the results show that all of them can recognize traffic signs with a good accuracy rate. The MSER-based system is faster than the one using only HOG and SVM, and the GPS-based system is even faster than the MSER-based system.
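The GPS-based approach mentioned in the abstract avoids image processing altogether: given the vehicle's position and a database of mapped signs, recognition reduces to a proximity lookup. The sketch below assumes a hypothetical sign database with `lat`, `lon` and `type` fields, and the search radius is illustrative:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    # Great-circle distance in metres between two GPS fixes.
    R = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = (math.sin(dp / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * R * math.asin(math.sqrt(a))

def signs_ahead(position, sign_db, radius_m=100.0):
    # A GPS-based "recognition" step needs no image at all: it simply
    # returns the mapped signs within radius_m of the vehicle.
    lat, lon = position
    return [s for s in sign_db
            if haversine_m(lat, lon, s["lat"], s["lon"]) <= radius_m]
```

This explains the speed ranking reported in the abstract: a database lookup is far cheaper than MSER region extraction, which in turn is cheaper than exhaustive HOG+SVM scanning.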
40

Cheung, Karen. "Image processing for skin cancer detection, malignant melanoma recognition." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1997. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp04/mq29403.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

GUPTA, MUSKAN. "FACIAL DETECTION and RECOGNITION." Thesis, 2022. http://dspace.dtu.ac.in:8080/jspui/handle/repository/19521.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Face recognition has been one of the most interesting and important research fields in the past two decades. The reasons come from the need for automatic recognition and surveillance systems, the interest in the human visual system's approach to face recognition, and the design of human-computer interfaces, among others. This research involves knowledge and researchers from disciplines such as neuroscience, psychology, computer vision, pattern recognition, image processing, and machine learning. Many papers have been published to overcome different factors (such as illumination, expression, scale, pose, etc.) and achieve better recognition rates, while there is still no robust technique against uncontrolled practical cases which may involve several such factors simultaneously. In this report, we go through the general ideas and structures of recognition, important issues and factors concerning human faces, and critical techniques and algorithms, and finally give a conclusion.
42

Jung-Chieh, Hsien. "Road Sign Detection and Recognition." 2003. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0009-0112200611353624.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Wang, Tsung-Jen, and 王宗任. "Traffic Sign Detection and Recognition." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/14675198330887014049.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Master's thesis
Tamkang University
Department of Computer Science and Information Engineering
97 (ROC academic year)
In this paper, we use color and shape to detect and classify traffic signs. The message on the traffic sign is then recognized for the driver. The method consists of two phases: traffic sign detection and recognition. In the detection stage, we use the distribution of traffic sign colors in the HSV color model to segment the regions of traffic signs, and then use connected component labeling and edge detection to find the positions of traffic signs. In the recognition stage, the detected traffic signs are normalized and classified by shape detection. Finally, we pass the result to a template matching system, so the information on the traffic signs is identified. Our system uses a simple algorithm to achieve a high detection rate. The format of the input image is a 640×480 true color bitmap. The average execution time for each image is 671.9 ms, the detection rate is 95% and the recognition rate is 81%.
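The detection stage described here — segment sign-coloured pixels in HSV, then group them with connected component labelling — can be sketched as follows. The hue/saturation/value thresholds are illustrative, not the thesis's calibrated values:

```python
import numpy as np
from collections import deque

def red_mask_hsv(hsv):
    # hsv: float array with H in [0, 360) and S, V in [0, 1].  Red
    # road-sign hues wrap around 0 degrees; thresholds are examples.
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    return ((h < 20) | (h > 340)) & (s > 0.5) & (v > 0.3)

def connected_components(mask):
    # 4-connected component labelling by breadth-first search.
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    for y, x in zip(*np.nonzero(mask)):
        if labels[y, x]:
            continue
        count += 1
        labels[y, x] = count
        q = deque([(y, x)])
        while q:
            cy, cx = q.popleft()
            for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                           (cy, cx - 1), (cy, cx + 1)):
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = count
                    q.append((ny, nx))
    return labels, count
```

Each labelled component is then a candidate sign region to be verified by edge and shape analysis.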
44

Hsien, Jung-Chieh, and 謝榮桀. "Road Sign Detection and Recognition." Thesis, 2003. http://ndltd.ncl.edu.tw/handle/52667283080897300108.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Master's thesis
Yuan Ze University
Department of Computer Science and Engineering
91 (ROC academic year)
This study proposes a novel road sign detection and recognition method. The position of a road sign in an image is detected using projection techniques. The features of the road sign are then extracted using the techniques of projection, moments and Markov models, which, in turn, are used to match the detected road sign to those in the database so that the goal of road sign recognition can be achieved. More specifically, the color images in the RGB color system are first converted to the HSV color system and then quantized into the specific colors found in road signs. The horizontal and vertical projections of whole images in the specific colors are then used to detect the positions of road signs. In the recognition stage, only local features around the detected positions are used and a two-step strategy is adopted. The horizontal and vertical projections of the background in the local area are used to prune irrelevant road signs. The candidate road signs are then sorted by the horizontal and vertical projections of the foreground together with moments, or by Markov model techniques. The two ranking results are integrated into a final consensus and the first-ranked candidate is regarded as the recognition result. The effectiveness of the proposed method has been demonstrated by various experiments.
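The projection technique at the core of this method reduces a binary colour mask to per-row and per-column pixel counts, whose contiguous peaks localize the sign. A minimal sketch:

```python
import numpy as np

def projection_profiles(mask):
    # Horizontal and vertical projections of a binary colour mask:
    # the per-row and per-column counts of sign-coloured pixels.
    return mask.sum(axis=1), mask.sum(axis=0)

def locate_band(profile, min_count=1):
    # First contiguous run where the profile reaches min_count,
    # returned as a (start, end) half-open interval, or None.
    idx = np.nonzero(profile >= min_count)[0]
    if idx.size == 0:
        return None
    return int(idx[0]), int(idx[-1]) + 1
```

Intersecting the row band with the column band yields a bounding box for the sign candidate, which is far cheaper than a sliding-window search.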
45

Gill, HK. "Abalone tag detection and recognition." Thesis, 2009. https://eprints.utas.edu.au/19912/1/whole_GillHarpreetKaur2009_thesis.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
In recent years, there have been serious concerns about the declining stocks of wild abalone combined with a rapidly increasing market demand and so aquaculture researchers are continuously investing in new methods for growing and monitoring cultured abalone. There are a number of new programs that have been planned for farmed abalone, such as selective breeding and genetic manipulation to meet world demand. These methods can only be successful if abalone traits and behaviour can be identified properly. Therefore, physical tagging of abalone shells and DNA (Deoxyribonucleic Acid) pedigree markers have been developed to enable tracking and tracing of individuals. Researchers are continually finding more effective methods of physical tagging so that tags can be visualised more readily and will be retained on the abalone shell for a longer period of time. Identifying the tag and character information is also time and labour intensive. Therefore, automated image analysis of abalone tags may provide a solution for tracking abalone and for identifying abalone behaviour and pedigree information. After reviewing the broad field of computer vision, an image processing system was developed in MATLAB using appropriate image analysis and processing techniques, to automate the process of extracting sub-images of physical tags attached to the abalone shells, in preparation for input to an optical character recognition system, which would read the tags on the shells. The image processing system developed was able to successfully identify a number of tags from digital images directly taken from land-based tanks on various abalone farms; tag colour and character recognition was achieved. In addition, this research will help aquaculture researchers to study abalone movement, behaviour and performance traits in a cultured environment.
46

Lin, Yuh-Ju, and 林育如. "Chinese Text Detection and Recognition." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/37v6v3.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Master's thesis
National Central University
Department of Computer Science and Information Engineering
107 (ROC academic year)
Optical Character Recognition (OCR) is a major challenge in Computer Vision. The task has become progressively harder, from recognizing English characters, numbers and a few symbols in specific fonts to detecting and recognizing text in the wild. Within text detection and recognition, Chinese text is more complex than English. First, the number of Chinese characters is far larger than that of English, and their shapes are far more intricate. Moreover, unlike English, Chinese can be written both from left to right and from top to bottom, which makes Chinese text detection and recognition much harder. Training an OCR model requires a large amount of labelled data, covering both the position of each character and its identity, and the more complex the scene, the more labelled data is needed. We focus on a simpler task: detecting and recognizing Chinese text in scanned files. Unlike text in the wild, text blocks in scanned files are more structured, so we can obtain good results with a simple network for text detection. We then only need to separate each line within the detected region and use it as the input for text recognition. By combining the OCR results with the detected positions, we recover all the text in the scanned file. These results may enable further applications, such as file classification.
47

Subudhi, K. Krishan Kumar, and Ramshankar Mishra. "Human Face Recognition and Detection." Thesis, 2011. http://ethesis.nitrkl.ac.in/2568/1/face_recognition_and_detection.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Human face detection and recognition play important roles in many applications such as video surveillance and face image database management. In our project, we have studied and worked on both face recognition and detection techniques and developed algorithms for them. In face recognition, the algorithms used are PCA (Principal Component Analysis), MPCA (Multilinear Principal Component Analysis) and LDA (Linear Discriminant Analysis), in which we recognize an unknown test image by comparing it with the known training images stored in the database, and give information regarding the person recognized. These techniques work well under robust conditions such as complex backgrounds and different face positions, and give different rates of accuracy under different conditions, as experimentally observed. In face detection, we have developed an algorithm that can detect human faces in an image, using skin colour as the detection cue. This technique works well for Indian faces, which have a specific complexion varying within a certain range. We have taken real-life examples and simulated the algorithms in MATLAB successfully.
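The PCA branch of the recognition pipeline (eigenfaces with nearest-neighbour matching in the reduced space) can be sketched in a few lines; this is a generic formulation, not the authors' MATLAB implementation:

```python
import numpy as np

def fit_pca(faces, k):
    # faces: (n_samples, n_pixels) matrix of flattened training
    # images.  The SVD of the centred data gives the top-k principal
    # axes ("eigenfaces") directly.
    mean = faces.mean(axis=0)
    _, _, vt = np.linalg.svd(faces - mean, full_matrices=False)
    return mean, vt[:k]

def project(img, mean, basis):
    # Coordinates of one flattened image in the k-dim eigenspace.
    return basis @ (img - mean)

def nearest_face(test_img, train_imgs, labels, mean, basis):
    # Classify by nearest neighbour among training projections.
    w = project(test_img, mean, basis)
    W = (train_imgs - mean) @ basis.T
    return labels[int(np.argmin(np.linalg.norm(W - w, axis=1)))]
```

MPCA and LDA would replace `fit_pca` with their own subspace estimates while keeping the same nearest-neighbour matching step.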
48

Jain, Deepak. "Moving Object Detection and Recognition." Thesis, 2017. http://ethesis.nitrkl.ac.in/8858/1/2017_MT_DJain.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Detection of moving objects is an active research topic in computer vision applications such as people counting, intruder detection, and object tracking. For the detection of moving objects, we use a background subtraction technique. In this thesis, we deal with Robust Principal Component Analysis (RPCA), which decomposes a given data matrix into a low-rank component and a sparse component. The low-rank component gives us the background portion, whereas the sparse one gives the required foreground object. RPCA overcomes the limitation of classical PCA, which is not applicable when many outliers are present. Here, RPCA is solved using Principal Component Pursuit (RPCA-PCP) under ideal conditions, i.e. the video is noise-free and a static camera is used. PCP requires the low-rank component to be exactly low-rank and the sparse component to be exactly sparse, but observations such as video surveillance footage are often corrupted by noise affecting every entry of the data matrix. In those situations PCP fails, and the problem is instead solved using Stable Principal Component Pursuit (RPCA-SPCP), which assumes that the observation matrix is the sum of a low-rank component, a sparse component, and noise. A comparative analysis is carried out by evaluating the recall and precision of different video sequences using both RPCA-PCP and RPCA-SPCP. After this, for recognition purposes, Content-Based Image Retrieval (CBIR) is used: Local Binary Patterns (LBP) and Local Ternary Patterns (LTP) are used to find the feature vectors, and query matching is then done to recognize the object.
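The RPCA-PCP decomposition described above is commonly solved with an inexact augmented Lagrangian method that alternates singular value thresholding (for the low-rank background) with soft thresholding (for the sparse foreground). The sketch below uses textbook default parameters, not the thesis's exact solver:

```python
import numpy as np

def shrink(M, tau):
    # Soft thresholding: proximal operator of the l1 norm.
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def svt(M, tau):
    # Singular value thresholding: proximal operator of the
    # nuclear norm.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt

def rpca_pcp(D, lam=None, iters=500, tol=1e-7, rho=1.5):
    # Principal Component Pursuit by inexact ALM:
    # D ~ L (low-rank background) + S (sparse moving foreground).
    m, n = D.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = 1.25 / np.linalg.norm(D, 2)
    S = np.zeros_like(D)
    Y = np.zeros_like(D)
    norm_d = np.linalg.norm(D)
    for _ in range(iters):
        L = svt(D - S + Y / mu, 1.0 / mu)
        S = shrink(D - L + Y / mu, lam / mu)
        Y += mu * (D - L - S)
        if np.linalg.norm(D - L - S) <= tol * norm_d:
            break
        mu = min(mu * rho, 1e7)
    return L, S
```

For video, `D` stacks vectorised frames as columns; SPCP would relax the exact constraint `D = L + S` to tolerate dense noise.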
49

Shen, Yu-sian, and 沈育賢. "Road Traffic Sign Detection and Recognition." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/n6uc3f.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Master's thesis
National Taiwan University of Science and Technology
Department of Polymer Engineering
95 (ROC academic year)
Traffic sign detection and recognition is a difficult task in an outdoor environment. Complex backgrounds, weather, illumination-related problems, and even covered, damaged or rotated signs may make traffic sign detection and recognition more difficult. In detection, we use the RGB and HSI color models to classify colors, and utilize a corner detector mask and shape characteristics to implement the detection components. In recognition, we use Otsu's statistical threshold selection method and the grey-level variance, together with the proposed feature extraction method, to achieve the recognition task. In the experiment, 139 images containing 165 road traffic signs are detected and recognized; 126 traffic signs are accurately identified. Thus, this system yields a recognition rate of about 76%.
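Otsu's threshold selection, used in the recognition stage here, picks the grey level that maximizes the between-class variance of the resulting binary split. A standard histogram-based sketch:

```python
import numpy as np

def otsu_threshold(gray):
    # Otsu's method: choose the grey level that maximises the
    # between-class variance of the two-class split of the histogram.
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    omega = np.cumsum(p)                    # class-0 probability
    mu = np.cumsum(p * np.arange(256))      # class-0 cumulative mean
    mu_t = mu[-1]                           # global mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    sigma_b[~np.isfinite(sigma_b)] = 0.0    # empty-class thresholds
    return int(np.argmax(sigma_b))
```

Pixels at or below the returned level form one class and the rest the other, binarising the sign interior without a hand-tuned threshold.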
50

Chuang, Chia-Lung, and 莊佳龍. "Vehicle Detection and License Plate Recognition." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/57361447713712641354.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Master's thesis
National Chung Cheng University
Graduate Institute of Opto-Mechatronics Engineering
93 (ROC academic year)
License plate recognition systems have been extensively used in a variety of applications, such as parking lot management for communities and buildings, police searches for stolen cars, roadway monitoring, and vehicle management. In past studies, the target vehicle was captured by a trigger-and-respond method. In this system, the video camera continuously records moving vehicles, and the video frames are analyzed by shadow detection, vehicle location, and license plate size to capture the required vehicle image automatically. Moreover, a car can be recognized multiple times thanks to the use of multiple frames, increasing the probability of correct recognition, so the system is more practical in real settings than one that relies on static shots.
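The multiple-frame idea in this abstract — read the same plate in several frames and let the readings vote — can be sketched with a simple per-position majority. This assumes the readings are aligned character by character, which is a simplification of any real system:

```python
from collections import Counter

def vote_plate(readings):
    # Multiple-frame recognition: the same plate is read in several
    # frames and, for each character position, the majority reading
    # is kept.  Occasional per-frame misreads are voted out.
    n = max(len(r) for r in readings)
    chars = []
    for i in range(n):
        votes = Counter(r[i] for r in readings if i < len(r))
        chars.append(votes.most_common(1)[0][0])
    return "".join(chars)
```

For example, three frame readings of which two are partly wrong can still yield the correct plate, because each position is decided independently.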

To the bibliography