
Dissertations / Theses on the topic 'Camera recognition'



Consult the top 50 dissertations / theses for your research on the topic 'Camera recognition.'

Next to every source in the list of references there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.

1

Johansson, Fredrik. "Recognition of Targets in Camera Networks." Thesis, Linköpings universitet, Institutionen för teknik och naturvetenskap, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-95351.

Abstract:
This thesis presents a re-recognition model for use in camera network surveillance systems. The method relies on a mix of covariance matrix feature descriptions and Bayesian networks for topological information. The system consists of an object recognition model and a re-recognition model. The object recognition model is responsible for separating people from the background and generating a position and description for each person and frame. This is done by using a foreground-background segmentation model to separate the background from a person. The segmented image is then processed by a tracking algorithm that produces the coordinates for each person. It is also responsible for creating a silhouette that is used to build a feature vector consisting of a covariance matrix that describes the person's appearance. A hypothesis engine is then responsible for connecting the coordinates into a continuous track that describes the trajectory a person has followed. Every trajectory is stored and made available to the re-recognition model. It compares two covariance matrices using a sophisticated distance method to generate a probabilistic score value. The score is then combined with a likelihood value of the topological match generated with a Bayesian network structure containing gathered statistical data. The topological information is mainly intended to filter out the most unlikely matches.
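The descriptor-comparison step can be sketched in a few lines. Below is a minimal region covariance descriptor and the generalized-eigenvalue (Förstner-Moonen) metric commonly used to compare such descriptors; the thesis does not specify its exact "sophisticated distance method" or per-pixel feature set, so both are assumptions:

```python
import numpy as np
from scipy.linalg import eigh

def region_covariance(patch):
    """Covariance descriptor of an image patch.

    Features per pixel: x, y, intensity and first derivatives
    (a common choice; the thesis's exact feature set is not given)."""
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    p = patch.astype(float)
    dx = np.gradient(p, axis=1)
    dy = np.gradient(p, axis=0)
    feats = np.stack([xs.ravel(), ys.ravel(), p.ravel(),
                      dx.ravel(), dy.ravel()])   # 5 x N feature matrix
    return np.cov(feats)                         # 5 x 5 covariance

def covariance_distance(c1, c2):
    """Foerstner-Moonen metric: sqrt(sum(log(lambda_i)^2)) over the
    generalized eigenvalues of the pair (c1, c2)."""
    lam = eigh(c1, c2, eigvals_only=True)
    return np.sqrt(np.sum(np.log(lam) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.random((40, 20))   # stand-ins for two person silhouettes
    b = rng.random((40, 20))
    print(covariance_distance(region_covariance(a), region_covariance(b)))
```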
2

Tadesse, Girmaw Abebe. "Human activity recognition using a wearable camera." Doctoral thesis, Universitat Politècnica de Catalunya, 2018. http://hdl.handle.net/10803/668914.

Abstract:
Advances in wearable technologies are facilitating the understanding of human activities using first-person vision (FPV) for a wide range of assistive applications. In this thesis, we propose robust multiple motion features for human activity recognition from first-person videos. The proposed features encode discriminant characteristics from the magnitude, direction and dynamics of motion estimated using optical flow. Moreover, we design novel virtual-inertial features from video, without using an actual inertial sensor, from the movement of the intensity centroid across frames. Results on multiple datasets demonstrate that centroid-based inertial features improve the recognition performance of grid-based features. Moreover, we propose a multi-layer modelling framework that encodes hierarchical and temporal relationships among activities. The first layer operates on groups of features that effectively encode motion dynamics and temporal variations of intra-frame appearance descriptors of activities with a hierarchical topology. The second layer exploits the temporal context by weighting the outputs of the hierarchy during modelling. In addition, a post-decoding smoothing technique utilises decisions on past samples based on the confidence of the current sample. We validate the proposed framework with several classifiers, and the temporal modelling is shown to improve recognition performance. We also investigate the use of deep networks to simplify the feature engineering from first-person videos. We propose a stacking of spectrograms to represent short-term global motions that contains a frequency-time representation of multiple motion components. This enables us to apply 2D convolutions to extract/learn motion features. We employ a long short-term memory recurrent network to encode long-term temporal dependency among activities. Furthermore, we apply cross-domain knowledge transfer between inertial-based and vision-based approaches for egocentric activity recognition. We propose a sparsity-weighted combination of information from different motion modalities and/or streams. Results show that the proposed approach performs competitively with existing deep frameworks, moreover, with reduced complexity.
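The virtual-inertial idea, deriving accelerometer-like signals from the motion of the intensity centroid, can be sketched as follows (a simplified reading of the approach; the frame rate and the absence of smoothing are assumptions):

```python
import numpy as np

def intensity_centroid(frame):
    """Centroid of pixel intensities (image moments m10/m00, m01/m00)."""
    f = frame.astype(float)
    total = f.sum() + 1e-9
    ys, xs = np.mgrid[0:f.shape[0], 0:f.shape[1]]
    return np.array([(xs * f).sum() / total, (ys * f).sum() / total])

def virtual_inertial_features(frames, fps=30.0):
    """Velocity and acceleration of the intensity centroid across frames,
    standing in for a body-worn inertial signal."""
    c = np.array([intensity_centroid(f) for f in frames])  # T x 2 trajectory
    vel = np.diff(c, axis=0) * fps                         # first derivative
    acc = np.diff(vel, axis=0) * fps                       # second derivative
    return vel, acc

# toy usage: a bright blob drifting right yields positive x-velocity
frames = [np.zeros((64, 64)) for _ in range(5)]
for t, f in enumerate(frames):
    f[30:34, 10 + 5 * t:14 + 5 * t] = 255.0
vel, acc = virtual_inertial_features(frames)
print(vel[:, 0])   # ~5 px/frame scaled by fps
```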
3

Erhard, Matthew John. "Visual intent recognition in a multiple camera environment /." Online version of thesis, 2006. http://hdl.handle.net/1850/3365.

4

Soh, Ling Min. "Recognition using tagged objects." Thesis, University of Surrey, 2000. http://epubs.surrey.ac.uk/844110/.

Abstract:
This thesis describes a method for the recognition of objects in an unconstrained environment with widely ranging illumination, imaged from unknown viewpoints against complicated backgrounds. The general problem is simplified by placing specially designed patterns on the object, which allows us to solve the pose determination problem easily. There are several key components involved in the proposed recognition approach: pattern detection, pose estimation, model acquisition and matching, and searching and indexing the model database. Other crucial issues pertaining to the individual components of the recognition system, such as the choice of the pattern, the reliability and accuracy of the pattern detector, pose estimator and matching, and the speed of the overall system, are addressed. After establishing the methodological framework, experiments are carried out on a wide range of both synthetic and real data to illustrate the validity and usefulness of the proposed methods. The principal contribution of this research is a methodology for Tagged Object Recognition (TOR) in unconstrained conditions. A robust pattern (calibration chart) detector is developed for off-the-shelf use. To empirically assess the effectiveness of the pattern detector and the pose estimator under various scenarios, simulated data generated using a graphics rendering process is used. This simulated data provides ground truth, which is difficult to obtain in projected images. Using the ground truth, the detection error, which is usually ignored, can be analysed. For model matching, the Chamfer matching algorithm is modified to give a more reliable matching score. The technique facilitates reliable TOR. Finally, the results of extensive quantitative and qualitative tests are presented that show the plausibility of practical use of TOR. The features characterising the enabling technology developed are the ability to a) recognise an object which is tagged with the calibration chart, b) establish camera position with respect to a landmark and c) test any camera calibration and 3D pose estimation routines, thus facilitating future research and applications in mobile robot navigation, 3D reconstruction and stereo vision.
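The Chamfer matching step can be sketched with OpenCV: scene edges are turned into a distance transform, and a candidate placement of the template is scored by the mean distance at its edge pixels. This is a sketch of the classic algorithm only; the thesis's modification to the score is not reproduced here:

```python
import cv2
import numpy as np

def chamfer_score(scene_gray, template_edge_points, offset):
    """Mean distance-transform value under the template's edge points.

    scene_gray: 8-bit image; template_edge_points: Nx2 integer (x, y)
    coordinates; offset: (dx, dy) placement of the template in the scene.
    Lower scores mean better edge alignment."""
    edges = cv2.Canny(scene_gray, 50, 150)
    # distance to the nearest edge pixel (edges must be the zero pixels)
    dist = cv2.distanceTransform(cv2.bitwise_not(edges), cv2.DIST_L2, 3)
    pts = template_edge_points + np.asarray(offset)
    pts[:, 0] = np.clip(pts[:, 0], 0, scene_gray.shape[1] - 1)
    pts[:, 1] = np.clip(pts[:, 1], 0, scene_gray.shape[0] - 1)
    return float(dist[pts[:, 1], pts[:, 0]].mean())
```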
5

Mudduluru, Sravani. "Indian Sign Language Numbers Recognition using Intel RealSense Camera." DigitalCommons@CalPoly, 2017. https://digitalcommons.calpoly.edu/theses/1815.

Abstract:
Gesture-based interaction with devices has been a significant area of research in computer science for many years. The main idea of this kind of interaction is to ease the user experience by providing a high degree of freedom and a more natural, interactive way of communicating with technology. Significant application areas of gesture recognition include video gaming, human-computer interaction, virtual reality, smart home appliances, medical systems, robotics and several others. With the availability of devices such as the Kinect, Leap Motion and Intel RealSense cameras, access to depth as well as colour information has become available to the public at affordable cost. The Intel RealSense camera is a USB-powered controller with modest hardware requirements (Windows 8 and above). Like the Kinect and Leap Motion, it can be used to track human body information, and it was designed specifically to provide fine-grained information about different parts of the body such as the face and hands. It was also designed to give users more natural and intuitive interactions with smart devices through features such as 3D avatars, high-quality 3D prints, high-quality graphic gaming visuals and virtual reality. The main aim of this study is to analyze hand-tracking information and build a training model in order to decide whether this camera is suitable for sign language recognition. In this study, we extracted the joint information of 22 joint labels per hand and trained the model to identify the Indian Sign Language (ISL) numbers 0-9. The multi-class SVM model showed a higher accuracy of 93.5% when compared to the decision tree and KNN models.
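The classification stage maps naturally onto scikit-learn. A minimal sketch with placeholder joint data (22 joints, three coordinates each, labels 0-9); the thesis's exact feature encoding and kernel settings are not given, so these are assumptions:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((500, 22 * 3))        # 22 hand joints x (x, y, z) per sample
y = rng.integers(0, 10, size=500)    # ISL digits 0-9 (placeholder labels)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = SVC(kernel="rbf", C=10.0)      # multi-class via one-vs-one by default
clf.fit(X_tr, y_tr)
print("accuracy:", clf.score(X_te, y_te))
```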
6

Bellando, John Louis. "Modeling and Recognition of Gestures Using a Single Camera." University of Cincinnati / OhioLINK, 2000. http://rave.ohiolink.edu/etdc/view?acc_num=ucin973088031.

7

Brauer, Henrik Siebo Peter. "Camera based human localization and recognition in smart environments." Thesis, University of the West of Scotland, 2014. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.739946.

8

Hannuksela, J. (Jari). "Camera based motion estimation and recognition for human-computer interaction." Doctoral thesis, University of Oulu, 2008. http://urn.fi/urn:isbn:9789514289781.

Abstract:
Communicating with mobile devices has become an unavoidable part of our daily life. Unfortunately, current user interface designs are mostly taken directly from desktop computers. This has resulted in devices that are sometimes hard to use. Since more processing power and new sensing technologies are already available, there is an opportunity to develop systems that communicate through different modalities. This thesis proposes novel computer vision approaches, including head tracking, object motion analysis and device ego-motion estimation, to allow efficient interaction with mobile devices. For head tracking, two new methods have been developed. The first method detects a face region and facial features by employing skin detection, morphology, and a geometrical face model. The second method, designed especially for mobile use, detects the face and eyes using local texture features. In both cases, Kalman filtering is applied to estimate the 3-D pose of the head. Experiments indicate that the methods introduced can be applied on platforms with limited computational resources. A novel object tracking method is also presented. The idea is to combine Kalman filtering and EM algorithms to track an object, such as a finger, using motion features. This technique is also applicable when conventional methods such as colour segmentation and background subtraction cannot be used. In addition, a new feature-based camera ego-motion estimation framework is proposed. The method exploits gradient measures for feature selection and feature displacement uncertainty analysis. Experiments with a fixed-point implementation testify to the effectiveness of the approach on a camera-equipped mobile phone. The feasibility of the methods developed is demonstrated in three new mobile interface solutions. One of them estimates the ego-motion of the device with respect to the user's face and utilises that information for browsing large documents or bitmaps on small displays. The second solution uses device or finger motion to recognize simple gestures. In addition to these applications, a novel interactive system to build document panorama images is presented. The motion estimation and recognition techniques presented in this thesis have clear potential to become practical means for interacting with mobile devices. In fact, cameras in future mobile devices may, for most of the time, be used as sensors for intuitive user interfaces rather than for digital photography.
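A constant-velocity Kalman filter of the kind used here for head and finger tracking fits in a few lines (state = 2-D position and velocity; the thesis's 3-D pose filter carries more state, so this is a simplified sketch with assumed noise levels):

```python
import numpy as np

class ConstantVelocityKF:
    """2-D constant-velocity Kalman filter: state [x, y, vx, vy]."""

    def __init__(self, q=1e-2, r=1.0):
        self.x = np.zeros(4)                  # state estimate
        self.P = np.eye(4) * 10.0             # state covariance
        self.F = np.eye(4)                    # transition, dt = 1 frame
        self.F[0, 2] = self.F[1, 3] = 1.0
        self.H = np.eye(2, 4)                 # we only measure position
        self.Q = np.eye(4) * q                # process noise
        self.R = np.eye(2) * r                # measurement noise

    def step(self, z):
        # predict
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # update with measurement z = (x, y)
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ (np.asarray(z) - self.H @ self.x)
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]                     # filtered position

kf = ConstantVelocityKF()
for t in range(10):
    print(kf.step((t + np.random.randn() * 0.3, 0.0)))
```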
9

Akman, Oytun. "Multi-camera Video Surveillance: Detection, Occlusion Handling, Tracking And Event Recognition." Master's thesis, METU, 2007. http://etd.lib.metu.edu.tr/upload/12608620/index.pdf.

Abstract:
In this thesis, novel methods for background modeling, tracking, occlusion handling and event recognition via multi-camera configurations are presented. As the initial step, the building blocks of typical single-camera surveillance systems, namely moving object detection, tracking and event recognition, are discussed, and various widely accepted methods for these building blocks are tested to assess their performance. Next, for multi-camera surveillance systems, background modeling, occlusion handling, tracking and event recognition for two-camera configurations are examined. Various foreground detection methods are discussed, and a background modeling algorithm based on a multivariate mixture of Gaussians is proposed. During the occlusion handling studies, a novel method for segmenting occluded objects is proposed, in which a top view of the scene, free of occlusions, is generated from multi-view data. The experiments indicate that the occlusion handling algorithm operates successfully on various test data. A novel tracking method using multi-camera configurations is also proposed. The main idea of multi-camera employment is fusing the 2D information coming from the cameras to obtain 3D information for better occlusion handling and seamless tracking. The proposed algorithm is tested on different data sets and shows clear improvement over a single-camera tracker. Finally, multi-camera trajectories of objects are classified by the proposed multi-camera event recognition method, in which concatenated trajectories from different views are used to train Gaussian mixture hidden Markov models. The experimental results indicate an improvement in multi-camera event recognition performance over event recognition using a single camera.
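OpenCV ships a mixture-of-Gaussians background model of the same family as the one proposed here; a minimal foreground-detection loop for reference (the video path and morphology kernel size are placeholders, and this is the stock MOG2 model, not the thesis's multivariate variant):

```python
import cv2

cap = cv2.VideoCapture("surveillance.avi")          # placeholder video file
bg = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16,
                                        detectShadows=True)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    fg = bg.apply(frame)                            # 0 bg, 127 shadow, 255 fg
    fg = cv2.morphologyEx(fg, cv2.MORPH_OPEN,       # remove speckle noise
                          cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5)))
    cv2.imshow("foreground", fg)
    if cv2.waitKey(30) & 0xFF == 27:                # Esc to quit
        break
cap.release()
cv2.destroyAllWindows()
```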
10

Kurihata, Hiroyuki, Tomokazu Takahashi, Ichiro Ide, Yoshito Mekada, Hiroshi Murase, Yukimasa Tamatsu, and Takayuki Miyahara. "Rainy weather recognition from in-vehicle camera images for driver assistance." IEEE, 2005. http://hdl.handle.net/2237/6798.

11

Turk, Matthew Robert. "A homography-based multiple-camera person-tracking algorithm /." Online version of thesis, 2008. http://hdl.handle.net/1850/7853.

12

Chen, Quanxin. "Camera calibration and shape recovery from videos of two mirrors." HKBU Institutional Repository, 2015. https://repository.hkbu.edu.hk/etd_oa/168.

Abstract:
Mirrors are often studied for camera calibration since they provide a symmetric relationship for the object, which guarantees synchronization across multiple views. However, it is sometimes difficult to compute the reflection matrices of the mirrors. This thesis aims to solve the problem of camera calibration and shape recovery from a two-mirror system, which generates five views of an object. It first studies how the motion formed by the five views in the two-mirror system relates to circular motion, and shows that this motion can be regarded as two circular motions, so that computing the reflection matrices of the mirrors can be avoided. The thesis then addresses the central problem, which is to recover the vanishing line of the rotation plane and the imaged circular points from two unknown equal angles via metric rectification. After that, it is straightforward to recover the imaged rotation axis and the vanishing point of the X-axis from the imaged circular points. Unlike the state-of-the-art algorithm, this thesis avoids computing the vanishing point of the X-axis first, because doing so accumulates error when recovering the imaged rotation axis. This is sufficient to compute the camera intrinsics, which is the main objective of this thesis. Finally, a 3D visual hull model of the object can be reconstructed once the projection matrices of all views are computed. This thesis uses a short video instead of static snapshots, so that the reconstructed 3D visual hull models of the individual frames can be combined, based on the motion sequence of the object, into a 3D animation. Such an animation can help boost the accuracy of action recognition in contrast to 2D video. In general, action recognition from 2D video distinguishes actions according to the side of the person captured by the video and fails for sides that do not appear, so videos of every direction of each human action must be stored in the database, which causes redundancy. The 3D animation deals with this problem, since the reconstructed model can be viewed from every direction, so only one 3D animation per human action needs to be stored. The experimental results show that the more frames are used, the smaller the error in the camera intrinsics, and the reconstructed 3D models show the feasibility of the approach.
13

Liang, Jian. "Processing camera-captured document images geometric rectification, mosaicing, and layout structure recognition /." College Park, Md. : University of Maryland, 2006. http://hdl.handle.net/1903/3659.

Abstract:
Thesis (Ph. D.) -- University of Maryland, College Park, 2006.
Thesis research directed by the Department of Electrical Engineering. Includes bibliographical references. Published by UMI Dissertation Services, Ann Arbor, Mich. Also available in paper.
14

Anwar, Qaiser. "Optical Navigation by recognition of reference labels using 3D calibration of camera." Thesis, Mittuniversitetet, Institutionen för informationsteknologi och medier, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-18453.

Abstract:
In this thesis, a machine-vision-based indoor navigation system is presented. This is achieved by using rotationally independent optimized colour reference labels and a geometrical camera calibration model which determines a set of camera parameters. Each reference label carries one byte of information (0 to 255) and can be designed for different values. An algorithm has been developed in Matlab so that the machine vision system can recognize N symbols at different orientations. A camera calibration model describes the mapping between the 3-D world coordinates and the 2-D image coordinates. The reconstruction system uses the direct linear transform (DLT) method with a set of control reference labels for the camera calibration. A least-squares adjustment method has been developed to calculate the parameters of the machine vision system. The experiments demonstrate that the pose of the camera can be calculated with relatively high precision by using least-squares estimation.
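The DLT step solves for the 11 free parameters of the camera projection matrix from 3-D control points (the reference labels) and their 2-D images. A standard least-squares sketch via SVD, consistent with the method named above (the exact normalization used in the thesis is not specified):

```python
import numpy as np

def dlt_projection_matrix(X3d, x2d):
    """Estimate the 3x4 projection matrix P from n >= 6 correspondences
    between 3-D points X3d (n x 3) and pixel coordinates x2d (n x 2)."""
    rows = []
    for (X, Y, Z), (u, v) in zip(X3d, x2d):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z, -u])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z, -v])
    A = np.asarray(rows, dtype=float)
    _, _, Vt = np.linalg.svd(A)
    P = Vt[-1].reshape(3, 4)      # null vector = least-squares solution
    return P / P[-1, -1]

def project(P, X):
    """Project a 3-D point with P (homogeneous division)."""
    x = P @ np.append(np.asarray(X, float), 1.0)
    return x[:2] / x[2]
```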
15

Kotwal, Thomas (Thomas Prabhakar Pramod) 1978. "The untrusted computer problem and camera based authentication using optical character recognition." Thesis, Massachusetts Institute of Technology, 2002. http://hdl.handle.net/1721.1/87272.

16

Qin, Yinghao. "The Smart Phone as a Mouse." The University of Waikato, 2006. http://hdl.handle.net/10289/2289.

Abstract:
With the development of hardware, the mobile phone has become a feature-rich handheld device. Built-in cameras and Bluetooth are supported by most current mobile phones. A real-time image processing experiment was conducted with a SonyEricsson P910i smartphone and a desktop computer. This thesis describes the design and implementation of a system which uses a mobile phone as a PC mouse. The movement of the phone is detected by analyzing the images captured by the onboard camera, and the mouse cursor on the PC is controlled by the movement of the phone.
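The phone-as-mouse idea boils down to estimating the frame-to-frame image shift. One well-known way to get it, used here purely as an illustration (the thesis does not state that it uses phase correlation), is OpenCV's phase correlation:

```python
import cv2
import numpy as np

def frame_shift(prev_gray, curr_gray):
    """Global translation between two frames via phase correlation.
    The returned (dx, dy) can be mapped onto mouse-cursor movement."""
    a = np.float32(prev_gray)
    b = np.float32(curr_gray)
    (dx, dy), response = cv2.phaseCorrelate(a, b)
    return dx, dy, response   # response ~ confidence of the estimate
```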
17

Mohammed, Abdulmalik. "Obstacle detection and emergency exit sign recognition for autonomous navigation using camera phone." Thesis, University of Manchester, 2017. https://www.research.manchester.ac.uk/portal/en/theses/obstacle-detection-and-emergency-exit-sign-recognition-for-autonomous-navigation-using-camera-phone(e0224d89-e743-47a4-8c68-52f718457098).html.

Abstract:
In this research work, we develop an obstacle detection and emergency exit sign recognition system on a mobile phone by extending the Features from Accelerated Segment Test (FAST) detector with a Harris corner filter. The first step often required for many vision-based applications is the detection of objects of interest in an image. Hence, we introduce an emergency exit sign detection method using colour histograms. The hue and saturation components of an HSV colour model are processed into features to build a 2D colour histogram. We backproject the 2D colour histogram to detect an emergency exit sign in a captured image, as the first task required before performing recognition. The classification results show that the 2D histogram is fast and can accurately discriminate between objects and background. One of the challenges confronting object recognition methods is the type of image feature to compute. In this work, therefore, we present two feature detector and descriptor methods based on the FAST detector with a Harris corner filter. The first method is called Upright FAST-Harris and Binary detector (U-FaHB), while the second is called Scale-Interpolated FAST-Harris and Binary (SIFaHB). In both methods, feature points are extracted using the accelerated segment test detector and a Harris filter that returns the strongest corner points as features. In the case of SIFaHB, however, the extraction of feature points is done across the image plane and along the scale-space. The modular design of these detectors allows for the integration of descriptors of any kind; we therefore combine them with a binary test descriptor like BRIEF to compute feature regions. These detectors and the combined descriptor are evaluated using different images observed under various geometric and photometric transformations, and their performance is compared with other detectors and descriptors. The results show that our proposed feature detector and descriptor method is fast and performs better compared with other methods like SIFT, SURF, ORB, BRISK and CenSurE. Based on the potential of the U-FaHB detector and descriptor, we extended it for use in optical flow computation, which we termed the Nearest-flow method. This method can compute flow vectors for use in obstacle detection. We evaluated the Nearest-flow method using real and synthetic image sequences and compared its performance with other methods like Lucas-Kanade, Farneback and SIFT-flow. The results show that our Nearest-flow method is faster to compute and performs better on real scene images than the other methods. In the final part of this research, we demonstrate the application potential of our proposed methods by developing an obstacle detection and exit sign recognition system on a camera phone; the results show that the methods have the potential to solve this vision-based object detection and recognition problem.
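The hue-saturation histogram backprojection used for exit-sign detection is standard OpenCV. A minimal sketch; the bin counts and binarization threshold are assumptions:

```python
import cv2
import numpy as np

def backproject_sign(model_bgr, scene_bgr):
    """Detect scene regions whose hue/saturation statistics match a model
    image of the sign, via 2-D histogram backprojection."""
    model_hsv = cv2.cvtColor(model_bgr, cv2.COLOR_BGR2HSV)
    scene_hsv = cv2.cvtColor(scene_bgr, cv2.COLOR_BGR2HSV)
    # 2-D histogram over hue (0-179) and saturation (0-255)
    hist = cv2.calcHist([model_hsv], [0, 1], None, [30, 32],
                        [0, 180, 0, 256])
    cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
    bp = cv2.calcBackProject([scene_hsv], [0, 1], hist, [0, 180, 0, 256], 1)
    _, mask = cv2.threshold(bp, 50, 255, cv2.THRESH_BINARY)  # assumed cutoff
    return mask
```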
18

Smith, Benjamin Andrew. "Determination of Normal or Abnormal Gait Using a Two-Dimensional Video Camera." Thesis, Virginia Tech, 2007. http://hdl.handle.net/10919/31795.

Abstract:
The extraction and analysis of human gait characteristics from image sequences, and the subsequent classification of these characteristics, are currently an intense area of research. Recently, the focus of this research area has turned to computer vision as an unobtrusive way of performing this analysis. With such systems becoming more common, a gait analysis system that quickly and accurately determines whether a subject is walking normally becomes more valuable. Such a system could be used as a preprocessing step in a more sophisticated gait analysis system or for rehabilitation purposes. In this thesis, a system is proposed which utilizes a novel fusion of spatial computer vision operations and motion information to accurately and efficiently determine whether a subject moving through a scene is walking normally or abnormally. Specifically, the system yields a classification of the type of motion being observed: a human walking normally, or some other kind of motion taking place within the frame. Experimental results show that the system provides accurate detection of normal walking and can reliably distinguish abnormalities as subtle as limping or walking with a straight leg.
Master of Science
19

Williams, William. "A novel multispectral and 2.5D/3D image fusion camera system for enhanced face recognition." Thesis, Birkbeck (University of London), 2017. http://bbktheses.da.ulcc.ac.uk/272/.

Abstract:
The fusion of images from the visible and long-wave infrared (thermal) portions of the spectrum produces images that have improved face recognition performance under varying lighting conditions. This is because long-wave infrared images are the result of emitted, rather than reflected, light and are therefore less sensitive to changes in ambient light. Similarly, 3D and 2.5D images have also improved face recognition under varying pose and lighting. The opacity of glass to long-wave infrared light, however, means that the presence of eyeglasses in a face image reduces the recognition performance. This thesis presents the design and performance evaluation of a novel camera system which is capable of capturing spatially registered visible, near-infrared, long-wave infrared and 2.5D depth video images via a common optical path requiring no spatial registration between sensors beyond scaling for differences in sensor sizes. Experiments using a range of established face recognition methods and multi-class SVM classifiers show that the fused output from our camera system not only outperforms the single modality images for face recognition, but that the adaptive fusion methods used produce consistent increases in recognition accuracy under varying pose, lighting and with the presence of eyeglasses.
20

Pakalapati, Himani Raj. "Programming of Microcontroller and/or FPGA for Wafer-Level Applications - Display Control, Simple Stereo Processing, Simple Image Recognition." Thesis, Linköpings universitet, Elektroniksystem, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-89795.

Abstract:
In this work, the use of a wafer-level camera (WLC) for ensuring road safety is presented. A prototype WLC along with the Aptina MT9M114 stereo board has been used for this project. The basic idea is to observe the movements of the driver; by doing so, an understanding of whether the driver is concentrating on the road can be achieved. The scene is captured with a wafer-level camera pair. Using the image pairs, stereo processing is performed to obtain the real depth of the objects in the scene. Image recognition is used to separate the object from the background, which ultimately allows concentrating on the object alone, which in the present context is the driver.
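The stereo-processing step, recovering depth from the camera pair, has a compact OpenCV counterpart. A sketch with block matching; the focal length and baseline are placeholders, not the values of the actual rig:

```python
import cv2
import numpy as np

def depth_from_pair(left_gray, right_gray, focal_px=700.0, baseline_m=0.06):
    """Block-matching disparity on a rectified 8-bit pair, converted to
    metric depth via depth = focal * baseline / disparity.
    Camera parameters here are placeholders."""
    stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
    disp = stereo.compute(left_gray, right_gray).astype(np.float32) / 16.0
    disp[disp <= 0] = np.nan              # mark invalid matches
    return focal_px * baseline_m / disp   # depth in metres
```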
21

Ozkilic, Sibel. "Performance Improvement Of A 3-d Configuration Reconstruction Algorithm For An Object Using A Single Camera Image." Master's thesis, METU, 2004. http://etd.lib.metu.edu.tr/upload/1095793/index.pdf.

Abstract:
This study focuses on the performance improvement of a 3-D configuration reconstruction algorithm using a passive secondary target. In earlier studies, a theoretical development of the 3-D configuration reconstruction algorithm was achieved and implemented as a computer program on a system consisting of an optical bench and a digital imaging system. The passive secondary target used was a circle with two internal spots. In order to use this reconstruction algorithm in autonomous systems, an automatic target recognition algorithm has been developed in this study. Starting from a pre-captured and stored 8-bit gray-level image, the algorithm automatically detects the elliptical image of a circular target and determines its contour in the scene. It was shown that the algorithm can also be used for partially captured elliptical images. Another improvement achieved in this study is the determination of the internal camera parameters of the vision system.
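Automatic detection of the circular target's elliptical image can be sketched with contour analysis and ellipse fitting (the Otsu thresholding and largest-area selection are assumptions; ellipse fitting also covers partially captured ellipses as long as enough contour points survive):

```python
import cv2

def detect_target_ellipse(gray):
    """Return the best-fitting ellipse ((cx, cy), (major, minor), angle)
    of the largest contour in an 8-bit gray-level image."""
    _, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(bw, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)
    best, best_area = None, 0.0
    for c in contours:
        if len(c) < 5:                 # fitEllipse needs at least 5 points
            continue
        area = cv2.contourArea(c)
        if area > best_area:
            best, best_area = cv2.fitEllipse(c), area
    return best
```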
22

MURASE, Hiroshi, Yoshito MEKADA, Ichiro IDE, Tomokazu TAKAHASHI, and Hiroyuki ISHIDA. "Generation of Training Data by Degradation Models for Traffic Sign Symbol Recognition." Institute of Electronics, Information and Communication Engineers, 2007. http://hdl.handle.net/2237/14958.

23

Tang, Zongzhi. "A Novel Road Marking Detection and Recognition Technique Using a Camera-based Advanced Driver Assistance System." Thesis, Université d'Ottawa / University of Ottawa, 2017. http://hdl.handle.net/10393/35729.

Abstract:
Advanced Driver Assistance Systems (ADAS) are widely studied nowadays. As crucial parts of ADAS, lane marking detection and other object detection have become more popular than before. However, most methods implemented in this area cannot balance accuracy against efficiency, and the mainstream methods (e.g. machine learning) suffer from several limitations that can hardly bridge the gap between partially and fully autonomous driving. This thesis proposes a real-time lane marking detection framework for ADAS, consisting of a 4-extreme-points-set descriptor and a rule-based cascade classifier. By analyzing the behavior of lane markings on the road surface, a characteristic of markings was discovered: standard markings sustain their shape in the plane perpendicular to the driving direction. Employing this feature, a 4-extreme-points-set descriptor is first applied to describe the shape of each marking. Specifically, after applying Maximally Stable Extremal Regions (MSER) and Hough transforms to a 2-D image, several contours of interest are obtained. A bounding box with borders parallel to the image coordinate axes intersects each contour at 4 points on its edge, named the 4-extreme points set. Afterward, to verify the consistency of each contour with a standard marking, rules abstracted from construction manuals are employed, such as an area filter, colour filter, relative location filter and convexity filter. To reduce errors caused by changes in driving direction, an enhancement module is then introduced: by tracking the vanishing point as well as other key points of the road network, a 3-D reconstruction with respect to the optical axis between the vanishing point and the camera center becomes possible. The principle of this algorithm is presented, together with a description of how the depth information is obtained from this model. Among all of these components, the key-point-based classification method is the main contribution of this thesis because it eliminates the deformation of objects caused by inverse perspective mapping. Several experiments were conducted on highways and urban roads in Ottawa. The proposed algorithm reached an average marking detection accuracy of 96.77%, while the F1 score (the harmonic mean of precision and recall) attained 90.57%. In summary, the proposed method exhibits state-of-the-art performance and represents a significant advancement in the field.
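The 4-extreme-points-set descriptor can be illustrated directly: MSER proposes candidate marking regions, and each region is reduced to the four points where it touches its axis-aligned bounding box. A sketch only; the cascade's filter thresholds are not reproduced:

```python
import cv2
import numpy as np

def four_extreme_points(pts):
    """Leftmost, rightmost, topmost and bottommost points of a region
    (Nx2 array), i.e. where it meets its axis-aligned bounding box."""
    pts = np.asarray(pts).reshape(-1, 2)
    return np.array([pts[pts[:, 0].argmin()],   # leftmost
                     pts[pts[:, 0].argmax()],   # rightmost
                     pts[pts[:, 1].argmin()],   # topmost
                     pts[pts[:, 1].argmax()]])  # bottommost

def candidate_markings(gray):
    """MSER regions as candidate lane markings, each described by its
    4-extreme points set."""
    mser = cv2.MSER_create()
    regions, _ = mser.detectRegions(gray)
    return [four_extreme_points(r) for r in regions]
```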
24

Adeeb, Karam, and Adam Alveteg. "SIYA - Slide Into Your Albums : Design and construction of a controllable dolly camera with object recognition." Thesis, KTH, Skolan för industriell teknik och management (ITM), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-264450.

Abstract:
The scope of this project is to design, construct and build an automated camera rig with object recognition. The project explores whether an automated camera rig has any advantages over a manual one, how an external camera module can be implemented to track an object, and under what circumstances the camera module registers objects best. The construction travels along a rail of two iron pipes. A camera is mounted on a small wagon that travels on top of the rail, driven by a DC motor. On the wagon, an external camera module called Pixy2 detects a predetermined object that the user wants the main camera to focus on. Using the feedback data from the Pixy2, two stepper motors rotate the main camera horizontally and vertically so that the object stays in the middle of the frame while the wagon travels along the rail.
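The feedback loop from detected object to camera rotation is essentially a proportional controller on the pixel error. A hedged sketch with a hypothetical Pixy2/stepper interface; get_largest_block and step are placeholder names, not a real driver API, and the gain is a tuning assumption:

```python
# Sketch of the centering loop; pixy and steppers are hypothetical stubs.
FRAME_W, FRAME_H = 316, 208          # Pixy2 block-coordinate resolution
KP = 0.8                             # proportional gain (tuning assumption)
DEADBAND = 5                         # pixels of acceptable off-centre error

def centre_object(pixy, pan_stepper, tilt_stepper):
    block = pixy.get_largest_block()         # hypothetical: (x, y) of object
    if block is None:
        return
    err_x = block[0] - FRAME_W / 2           # +ve: object right of centre
    err_y = block[1] - FRAME_H / 2           # +ve: object below centre
    if abs(err_x) > DEADBAND:
        pan_stepper.step(int(KP * err_x))    # hypothetical stepper call
    if abs(err_y) > DEADBAND:
        tilt_stepper.step(int(-KP * err_y))
```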
25

ULRICH, LUCA. "RGB-D methodologies for Face Expression Recognition." Doctoral thesis, Politecnico di Torino, 2021. http://hdl.handle.net/11583/2872356.

26

Sarella, Kanthi. "An image processing technique for the improvement of Sign2 using a dual camera approach /." Online version of thesis, 2008. http://hdl.handle.net/1850/5721.

27

Colberg, Kathryn. "Investigating the ability of automated license plate recognition camera systems to measure travel times in work zones." Thesis, Georgia Institute of Technology, 2013. http://hdl.handle.net/1853/49048.

Abstract:
This thesis evaluates the performance of a vehicle detection technology, Automated License Plate Recognition (ALPR) camera systems, with regard to its ability to produce real-time travel time information in active work zones. A literature review was conducted to investigate the ALPR technology and to identify other research that has used ALPR systems to collect travel time information. Next, the technology was tested in a series of field deployments in both an arterial and a freeway environment. The goal of the arterial deployment was to determine the ALPR camera angles that produce the highest license plate detection rates and accuracy percentages. A series of freeway deployments was then conducted on corridors of I-285 in Atlanta, Georgia to evaluate the ALPR system in active work zone environments. During these deployments, ALPR data was collected in conjunction with data from Bluetooth and radar technologies, as well as from high-definition video cameras. The data was analyzed to determine the ALPR vehicle detection rates. Additionally, a script was written to match the ALPR reads across two data collection stations to determine the ALPR travel times through the corridors. The ALPR travel time data was compared with the travel time data produced by the Bluetooth and video cameras, with a particular focus on identifying travel time biases associated with each technology. Finally, based on the knowledge gained, recommendations for larger-scale ALPR work zone deployments and suggestions for future research are provided.
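Matching plate reads across two stations to get travel times is a join-and-subtract operation. A pandas sketch; the column names and the 60-minute plausibility window are assumptions, not the thesis's script:

```python
import pandas as pd

def travel_times(station_a, station_b, max_minutes=60):
    """Join ALPR reads on plate text and compute per-vehicle travel times.
    Each input frame is assumed to have 'plate' and 'timestamp' columns."""
    merged = station_a.merge(station_b, on="plate", suffixes=("_a", "_b"))
    tt = (merged["timestamp_b"] - merged["timestamp_a"]).dt.total_seconds() / 60
    merged["travel_min"] = tt
    # keep plausible forward trips only (drops re-reads and U-turns)
    return merged[(tt > 0) & (tt < max_minutes)]

a = pd.DataFrame({"plate": ["ABC123", "XYZ789"],
                  "timestamp": pd.to_datetime(["2013-05-01 08:00:00",
                                               "2013-05-01 08:01:00"])})
b = pd.DataFrame({"plate": ["ABC123"],
                  "timestamp": pd.to_datetime(["2013-05-01 08:07:30"])})
print(travel_times(a, b)[["plate", "travel_min"]])
```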
28

Minetto, Rodrigo 1983. "Detecção robusta de movimento de camera em videos por analise de fluxo otico ponderado." [s.n.], 2007. http://repositorio.unicamp.br/jspui/handle/REPOSIP/276203.

Abstract:
Advisors: Neucimar Jeronimo Leite, Jorge Stolfi
Master's dissertation - Universidade Estadual de Campinas, Instituto de Computação
Our goal in this dissertation is the reliable detection of camera motion (tilt, pan, roll and zoom) in videos. We propose an original algorithm for this task based on weighted least-squares fitting of the optical flow, where an iterative procedure is used to improve the weight of each flow vector. Besides detecting camera motion, our algorithm provides a precise and reliable quantitative analysis of the movements. It also provides a rough segmentation of each frame into two regions, "foreground" and "background", corresponding to the moving and stationary parts of the scene, respectively. Tests with real videos show that the algorithm is fast and effective, even for scenes with substantial object motion.
Master's degree in Computer Science (Image Processing)
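The weighted least-squares fit of a global motion model to the optical flow, with iterative reweighting to damp outlier vectors, can be sketched as follows. The 4-parameter pan/tilt/zoom/roll flow model and the simple residual-based weight are assumptions; the dissertation's exact weighting scheme is not reproduced:

```python
import numpy as np

def fit_camera_motion(pts, flow, iters=5):
    """Fit u = tx + s*x - r*y, v = ty + r*x + s*y (pan, tilt, zoom, roll)
    to sparse optical flow by iteratively reweighted least squares.

    pts:  N x 2 point positions; flow: N x 2 flow vectors at those points."""
    x, y = pts[:, 0], pts[:, 1]
    z = np.zeros_like(x)
    o = np.ones_like(x)
    A = np.concatenate([np.stack([o, z, x, -y], axis=1),    # u equations
                        np.stack([z, o, y, x], axis=1)])    # v equations
    b = np.concatenate([flow[:, 0], flow[:, 1]])
    w = np.ones_like(b)
    for _ in range(iters):
        params, *_ = np.linalg.lstsq(A * w[:, None], b * w, rcond=None)
        resid = np.abs(A @ params - b)
        w = 1.0 / (1.0 + resid)        # heavier penalty on outlier vectors
    tx, ty, s, r = params
    return {"pan": tx, "tilt": ty, "zoom": s, "roll": r}
```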
29

Zagnoli, Andrea. "Human Activity Recognition con telecamere di profondità." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2017. http://amslaurea.unibo.it/12946/.

Abstract:
The study presented in this thesis aims to design, implement and test a Human Activity Recognition (HAR) algorithm based on depth cameras. HAR is the branch of machine learning that studies techniques which, by acquiring information from sources of different kinds, allow a machine to autonomously learn how to classify human activities. In particular, the proposed algorithm exploits depth-camera technology (the sensor used is the Microsoft Kinect) which, unlike traditional colour cameras, projects a field of infrared light and, based on how this light is reflected by the objects in the room, computes the distance between the sensor and each object. The algorithm, implemented in the .NET environment, was tested on two datasets collected by the Computer Science Department, Cornell University, and on a new dataset collected as part of this study. The experimental results confirm the effectiveness of the algorithm on all the actions contained in the different datasets.
30

Carraro, Marco. "Real-time RGB-Depth preception of humans for robots and camera networks." Doctoral thesis, Università degli studi di Padova, 2018. http://hdl.handle.net/11577/3426800.

Abstract:
This thesis deals with robot and camera network perception using RGB-Depth data. The goal is to provide efficient and robust algorithms for interacting with humans. For this reason, special care has been devoted to designing algorithms which can run in real time on consumer computers and embedded cards. The main contribution of this thesis is the 3D pose estimation of the human body. We propose two novel algorithms which take advantage of the data stream of an RGB-D camera network, outperforming the state of the art in both single-view and multi-view tests. While the first algorithm works on point cloud data, which is feasible even without external light, the second one performs better, since it deals with multiple persons with negligible overhead and does not rely on synchronization between the different cameras in the network. The second contribution regards long-term people re-identification in camera networks. This is particularly challenging since we cannot rely on appearance cues if people are to be re-identified across different days. We address this problem by proposing a face recognition framework based on a convolutional neural network and a Bayesian inference system to re-assign the correct ID and person name to each new track. The third contribution concerns Ambient Assisted Living. We propose a prototype of an assistive robot which periodically patrols a known environment, reporting unusual events such as people fallen on the ground. To this end, we developed a fast and robust approach which also works in dim scenes and is validated using a new publicly available RGB-D dataset recorded on board our open-source robot prototype. As a further contribution of this work, in order to boost research on these topics and to provide the best benefit to the robotics and computer vision community, we released most of the software implementations of the novel algorithms described in this work under open-source licenses.
31

Wang, Chong, and 王翀. "Joint color-depth restoration with kinect depth camera and its applications to image-based rendering and hand gesture recognition." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2014. http://hdl.handle.net/10722/206343.

32

Rönnqvist, Patrik. "Surveillance Applications : Image Recognition on the Internet of Things." Thesis, Mittuniversitetet, Institutionen för informationsteknologi och medier, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-18557.

Abstract:
This is a B.Sc. thesis within the Computer Science programme at Mid Sweden University. The purpose of this project has been to investigate the possibility of using image-based surveillance in smart applications on the Internet of Things. The goals involved investigating relevant technologies and designing, implementing and evaluating an application that can perform image recognition. A number of image recognition techniques have been investigated, and the use of colour histograms has been chosen for its simplicity and low resource requirements. The main source of study material has been the Internet. The solution has been developed in the Java programming language for the Android operating system, using the MediaSense platform for communication. It consists of a camera application that produces image data and a monitor application that performs image recognition and handles user interaction. To evaluate the solution, a number of tests have been performed and its pros and cons have been identified. The results show that the solution can differentiate between simple coloured stick figures in a controlled environment; variables such as lighting and the background are significant. The application can reliably send images from the camera to the monitor at a rate of one image every four seconds. The possibility of using streaming video instead of still images has been investigated but found to be difficult under the given circumstances. It is concluded that while the solution cannot differentiate between actual people, it shows that image-based surveillance is possible on the IoT, and the goals of this project have been satisfied. The results were expected and hold little newsworthiness. Suggested future work involves improvements to the MediaSense platform and infrastructure for processing and storing data.
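The colour-histogram recognition in the monitor application reduces to computing and comparing histograms. An OpenCV sketch; the bin counts, the correlation metric and the acceptance threshold are assumptions:

```python
import cv2

def colour_signature(bgr):
    """Normalised hue-saturation histogram used as an object signature."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, [30, 32], [0, 180, 0, 256])
    cv2.normalize(hist, hist, 0, 1, cv2.NORM_MINMAX)
    return hist

def same_object(img_a, img_b, threshold=0.8):
    """Compare two images by histogram correlation (1.0 = identical)."""
    score = cv2.compareHist(colour_signature(img_a),
                            colour_signature(img_b),
                            cv2.HISTCMP_CORREL)
    return score >= threshold, score
```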
33

Graumann, Jean-Marc. "Intelligent optical methods in image analysis for human detection." Thesis, Brunel University, 2005. http://bura.brunel.ac.uk/handle/2438/7892.

Abstract:
This thesis introduces the concept of a person recognition system for use on an integrated autonomous surveillance camera. Developed to enable generic surveillance tasks without the need for complex setup procedures or operator assistance, this is achieved through the novel use of a simple dynamic noise reduction and object detection algorithm requiring no previous knowledge of the installation environment and no need to train the system to its installation. The combination of this initial processing stage with a novel hybrid neural network structure, composed of a SOM mapper and an MLP classifier using a combination of common and individual input data lines, has enabled the development of a reliable detection process capable of dealing with both noisy environments and partial occlusion of valid targets. With a final correct classification rate of 94% on single-image analysis, this provides a huge step forward compared to the reported 97% failure rate of standard camera surveillance systems.
34

Bubeník, Martin. "RaspberryPI kamerový checker." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2019. http://www.nusl.cz/ntk/nusl-402129.

Abstract:
This diploma thesis deals with the industrial inspection of correctly made connectors based on computer recognition; the detection and recognition application is implemented in Python on the Raspberry Pi platform. The work uses the well-known OpenCV library for recognition. It also deals with the selection of suitable hardware, namely a camera with a lens and an illuminator, which together with the Raspberry Pi microcomputer form one compact device. The device is mounted on a purpose-designed mechanical structure, beneath which an inspection zone is created. Finally, the Raspberry Pi provides a web-based user interface for checking the inspection and an interface for writing the data to a database.
35

Bodén, Rikard, and Jonathan Pernow. "SORTED : Serial manipulator with Object Recognition Trough Edge Detection." Thesis, KTH, Skolan för industriell teknik och management (ITM), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-264513.

Abstract:
Today, there is an increasing demand for smart robots that can make decisions on their own and cooperate with humans in changing environments. The application areas for robotic arms with camera vision are likely to increase in the future of artificial intelligence as algorithms become more adaptable and intelligent than ever. The purpose of this bachelor's thesis is to develop a robotic arm that recognises arbitrarily placed objects with camera vision and can pick and place them when they appear in unpredictable positions. The robotic arm has three degrees of freedom, and the construction is modularised and 3D-printed for ease of maintenance, but also to be adaptable to new applications. The camera vision sensor is integrated in an external camera tripod with its field of view over the workspace. It recognises objects through colour filtering and uses an edge detection algorithm to return measurements of the detected objects. The measurements are then used as input for the inverse kinematics, which calculates the rotation of each stepper motor. Moreover, three angular potentiometers, one integrated in each axis, regulate the rotation of each stepper motor. The results in this thesis show that the robotic arm is able to pick up to 90% of the detected objects when barrel distortion correction is used in the algorithm. A key finding is that the barrel distortion introduced by the camera lens significantly impacts the precision of the robotic arm and thus the results. The method for barrel distortion correction is also affected by the geometry of the detected objects and by differences in illumination over the workspace. Another conclusion is that correct illumination is needed for the vision sensor to differentiate objects with different hue and saturation.
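The barrel-distortion correction that proved decisive is the standard lens-undistortion step. An OpenCV sketch; the intrinsics and distortion coefficients shown are placeholders for values that would come from calibrating the actual workspace camera:

```python
import cv2
import numpy as np

# Placeholder intrinsics; in practice these come from cv2.calibrateCamera
# run on checkerboard images of the actual camera.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
dist = np.array([-0.25, 0.08, 0.0, 0.0, 0.0])  # k1 < 0 => barrel distortion

def undistort(frame):
    """Remove radial (barrel) distortion before measuring object positions."""
    h, w = frame.shape[:2]
    newK, _ = cv2.getOptimalNewCameraMatrix(K, dist, (w, h), 0)
    return cv2.undistort(frame, K, dist, None, newK)
```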
36

Dadej, Vincent. "Raspberry Pi: programování v prostředí Matlab/Simulink." Master's thesis, Vysoké učení technické v Brně. Fakulta strojního inženýrství, 2018. http://www.nusl.cz/ntk/nusl-320104.

Full text
Abstract:
This diploma thesis focuses on programming in Matlab for the Raspberry Pi 3 platform. For demonstration purposes, two applications were designed for the Raspberry Pi using the available hardware: a camera and servos. The first application performs colour-object detection and accurate tracking using camera calibration. The second performs face detection and recognition. Both applications are implemented with modern methods and knowledge from computer vision. Object tracking and face recognition are verified by an experiment that reveals the accuracy of the methods used.
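As an illustration of the first application's colour-object detection, the sketch below shows the standard HSV masking pipeline in Python/OpenCV rather than the thesis's Matlab code; the hue bounds are assumptions for a red object and would need tuning.

```python
# Minimal colour-object detection sketch (Python/OpenCV stand-in for the
# Matlab pipeline). HSV bounds are assumed values for a red object.
import cv2

frame = cv2.imread("frame.png")
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv, (0, 120, 70), (10, 255, 255))
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
if contours:
    c = max(contours, key=cv2.contourArea)        # largest coloured blob
    M = cv2.moments(c)
    cx, cy = int(M["m10"] / M["m00"]), int(M["m01"] / M["m00"])
    print("object centre:", (cx, cy))
```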
APA, Harvard, Vancouver, ISO, and other styles
37

Kannala, J. (Juho). "Models and methods for geometric computer vision." Doctoral thesis, University of Oulu, 2010. http://urn.fi/urn:isbn:9789514261510.

Full text
Abstract:
Automatic three-dimensional scene reconstruction from multiple images is a central problem in geometric computer vision. This thesis considers topics that are related to this problem area. New models and methods are presented for various tasks in such specific domains as camera calibration, image-based modeling and image matching. In particular, the main themes of the thesis are geometric camera calibration and quasi-dense image matching. In addition, a topic related to the estimation of two-view geometric relations is studied, namely, the computation of a planar homography from corresponding conics. Further, as an example of a reconstruction system, a structure-from-motion approach is presented for modeling sewer pipes from video sequences. In geometric camera calibration, the thesis concentrates on central cameras. A generic camera model and a plane-based camera calibration method are presented. The experiments with various real cameras show that the proposed calibration approach is applicable for conventional perspective cameras as well as for many omnidirectional cameras, such as fish-eye lens cameras. In addition, a method is presented for the self-calibration of radially symmetric central cameras from two-view point correspondences. In image matching, the thesis proposes a method for obtaining quasi-dense pixel matches between two wide baseline images. The method extends the match propagation algorithm to the wide baseline setting by using an affine model for the local geometric transformations between the images. Further, two adaptive propagation strategies are presented, where local texture properties are used for adjusting the local transformation estimates during the propagation. These extensions make the quasi-dense approach applicable for both rigid and non-rigid wide baseline matching. In this thesis, quasi-dense matching is additionally applied for piecewise image registration problems which are encountered in specific object recognition and motion segmentation. The proposed object recognition approach is based on grouping the quasi-dense matches between the model and test images into geometrically consistent groups, which are supposed to represent individual objects, whereafter the number and quality of grouped matches are used as recognition criteria. Finally, the proposed approach for dense two-view motion segmentation is built on a layer-based segmentation framework which utilizes grouped quasi-dense matches for initializing the motion layers, and is applicable under wide baseline conditions.
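For context, plane-based calibration of a conventional perspective camera (the baseline case that the thesis's generic model also covers) can be sketched with OpenCV's chessboard routine; the pattern size and file names below are assumptions, and this is the standard pipeline rather than the thesis's own method.

```python
# Hedged sketch of plane-based (chessboard) camera calibration.
import cv2
import numpy as np
import glob

pattern = (9, 6)                                    # inner corners (assumed)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_pts, img_pts = [], []
for path in glob.glob("calib_*.png"):               # assumed file names
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_pts.append(objp)
        img_pts.append(corners)

rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_pts, img_pts, gray.shape[::-1], None, None)
print("reprojection RMS:", rms)
```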
APA, Harvard, Vancouver, ISO, and other styles
38

Krutílek, Jan. "Systémy průmyslového vidění s roboty Kuka a jeho aplikace na rozpoznávání volně ložených prvků." Master's thesis, Vysoké učení technické v Brně. Fakulta strojního inženýrství, 2010. http://www.nusl.cz/ntk/nusl-229174.

Full text
Abstract:
This diploma thesis deals with robot vision and its application to the manipulation of randomly placed objects. It gives an overview of the working principles of the most frequently used vision systems on the market. With regard to the task to be solved, it discusses various ways of using basic soft sensors for the recognition of different objects. The objective of this diploma thesis is also the programming and realization of a demonstration application, drawing on PLC programming, the KRL expert programming language (for KUKA robots), the design of smart-camera scripts in the Spectation software, and network communication among all devices used in this case.
APA, Harvard, Vancouver, ISO, and other styles
39

Taha, Abu Snaineh Sami. "AUTOMATIC PERFORMANCE LEVEL ASSESSMENT IN MINIMALLY INVASIVE SURGERY USING COORDINATED SENSORS AND COMPOSITE METRICS." UKnowledge, 2013. http://uknowledge.uky.edu/cs_etds/12.

Full text
Abstract:
Skills assessment in Minimally Invasive Surgery (MIS) has long been a challenge for training centers. The emerging maturity of camera-based systems has the potential to turn problems into solutions in many different areas, including MIS. The current evaluation techniques for assessing the performance of surgeons and trainees are direct observation, global assessments, and checklists. These techniques are mostly subjective and can therefore involve a margin of bias. The current automated approaches are all implemented using mechanical or electromagnetic sensors, which suffer limitations and influence the surgeon's motion. Thus, evaluating the skills of MIS surgeons and trainees objectively has become an increasing concern. In this work, we integrate and coordinate multiple camera sensors to assess the performance of MIS trainees and surgeons. This study aims at developing an objective, data-driven assessment that takes advantage of multiple coordinated sensors. The technical framework for the study is a synchronized network of sensors that captures large sets of measures from the training environment. The measures are then processed to produce a reliable set of individual and composite metrics, coordinated in time, that suggest patterns of skill development. The sensors are non-invasive, real-time, and coordinated over many cues such as eye movement, external shots of the body and instruments, and internal shots of the operative field. The platform is validated by a case study of 17 subjects and 70 sessions. The results show that the platform output is highly accurate and reliable in detecting patterns of skill development and predicting the skill level of the trainees.
APA, Harvard, Vancouver, ISO, and other styles
40

Darmadi, Steve. "Strobed IR Illumination for Image Quality Improvement in Surveillance Cameras." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-235632.

Full text
Abstract:
Infrared (IR) illumination is commonly found in surveillance cameras to improve night-time recording quality. However, the limited power available from a Power over Ethernet (PoE) connection in network-enabled cameras restricts the possibility of increasing image quality by allocating more power to the illumination system. The thesis explored an alternative way to improve image quality by using strobed IR illumination. Different strobing methods are discussed in relation to the rolling-shutter timing commonly used in CMOS sensors. The method that benefited the evaluation scenario the most was implemented in a prototype based on a commercial fixed-box camera from Axis. The prototype demonstrated how synchronization of the sensor and the strobing illumination system can be achieved. License plate recognition (LPR) on a dark highway was chosen as the evaluation scenario, and an analysis of the car movements was done in pursuit of creating an indoor test. The indoor test provided a controlled environment, while the outdoor test exposed the prototype to real-life conditions. The test results show that with strobed IR, the output image exhibited improved brightness and a reduced rolling-shutter artifact compared to constant IR. Theoretical calculations also proved that these improvements do not compromise the average power consumption or eye-safety level of the illumination system.
Infrared (IR) illumination is often found in surveillance cameras to improve image quality in night-time video recording. However, the limited power available from the Power over Ethernet (PoE) connection in network-enabled cameras places an upper limit on how much power the camera may spend on the illumination system, and thereby on how much the image quality can be improved. This thesis investigated an alternative way to improve image quality by using strobed IR illumination. Different strobing methods were examined in relation to the rolling shutter, the shutter method commonly used in CMOS sensors. The method that gave the most favourable results in the evaluation was implemented in a prototype based on a commercial fixed-box network camera from Axis Communications. This prototype successfully demonstrated a concept for how synchronization of the image sensor and the illumination system can be achieved. License plate recognition (LPR) on a dark highway was chosen as the evaluation scenario, and an analysis of the car's movements was made to create a corresponding indoor test setup. The indoor tests provided a controlled environment, while the outdoor tests exposed the prototype to real-world conditions. The test results show that with strobed IR, the image from the camera was both brighter and showed fewer artifacts caused by the rolling shutter, compared with constant IR illumination. Theoretical calculations also showed that these improvements negatively affect neither the camera's average power consumption nor the eye safety of the illumination system.
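The core timing constraint behind synchronizing a strobe with a rolling shutter can be shown with simple arithmetic; all the numbers below are assumed, not taken from the thesis.

```python
# Back-of-the-envelope timing check for strobing with a rolling shutter.
# All rows integrate simultaneously only for (exposure - readout) seconds,
# so the IR pulse must fit inside that common window to avoid banding.
exposure_ms = 10.0    # assumed row exposure time
readout_ms = 6.0      # assumed time to read all rows (rolling-shutter skew)
pulse_ms = 2.0        # assumed IR pulse width

common_window_ms = exposure_ms - readout_ms
if pulse_ms <= common_window_ms:
    print(f"pulse fits: {pulse_ms} ms <= {common_window_ms} ms window")
else:
    print("pulse too long -> rows get unequal light (banding artifacts)")

# Duty-cycled strobing also lowers average power versus constant IR:
frame_period_ms = 33.3                      # ~30 fps
duty = pulse_ms / frame_period_ms
print(f"average power is {duty:.1%} of peak (constant IR would be 100%)")
```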
APA, Harvard, Vancouver, ISO, and other styles
41

Yang, Shih-Chuan, and 楊世詮. "Human Action Recognition Using Kinect Camera." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/76928392913615657701.

Full text
Abstract:
Master's thesis
I-Shou University
Department of Information Engineering
101
An action is composed of a sequence of postures with a high degree of complexity in time and space, and thus effectively recognizing the high-level semantics of human actions has become an important issue. Along with the rapid development of human-machine interface technology, body-sensing devices have gradually shifted to video cameras for capturing human actions. In particular, Microsoft released the Kinect sensor with an infrared camera in 2010, which has gradually become widely used in digital teaching, medical applications, animation, and other applications. In the past decade, some researchers devoted themselves to relevant issues in action recognition, but most of them used video cameras to capture human actions. Since video data lack depth values for the scene, the subject cannot be stably separated from the background image. This study focuses on human action recognition based on the Kinect camera. The first step is to define user-defined actions, where each action requires recording one or more repetitions to extract common action features and build an action database. When a sequence of images with unknown high-level semantics is captured in real time, feature extraction is used to generate a series of feature symbols belonging to the action database. Then, a string-matching algorithm is applied to match against the action database, and finally the high-level action semantics are recognized based on the matching similarity.
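A minimal sketch of the symbol-matching step the abstract describes, using Python's difflib as a stand-in for the string-matching algorithm; the symbol alphabet and database entries are hypothetical.

```python
# Match a live feature-symbol sequence against stored action strings.
from difflib import SequenceMatcher

action_db = {                      # assumed database: action -> symbol string
    "wave":  "ABABAB",
    "squat": "CCDDCC",
}
observed = "ABABBB"                # symbols extracted from the live sequence

best = max(action_db,
           key=lambda a: SequenceMatcher(None, action_db[a], observed).ratio())
score = SequenceMatcher(None, action_db[best], observed).ratio()
print(f"recognized action: {best} (similarity {score:.2f})")
```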
APA, Harvard, Vancouver, ISO, and other styles
42

Lu, Chen-Che, and 盧清治. "Application Of Web Camera In Dices Recognition." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/08847007594270303858.

Full text
Abstract:
Master's thesis
Southern Taiwan University of Science and Technology
Department of Information Engineering
94
Nowadays, the Web camera has been broadly used and applied to many fields in industry. The purpose of this research is to study real-time pattern recognition using the Web camera, which is a promising solution for several fields in the modern computer industry, such as keyboard recognition. In this paper, an effective real-time pattern recognition system using the Web camera is proposed and evaluated against real objects: dice. First, the system converts the original image from RGB to HSV color space and then binarizes the image. The binarized image further goes through an iterative morphological process to eliminate noise. In this mathematical-morphology stage, region filling fills up the depressions or holes inside the pips of the dice, while erosion removes noise such as shadows. Meanwhile, a pixel-shrinking technique is employed to speed up the image processing. Finally, the system applies a clustering technique to the remaining pixels in order to count the pips and classify the numbers shown on the dice. A total of 500 test patterns were used in the experiments. The results show that the recognition rate is up to 90% in the dice recognition application. Based on the experimental results, it is practical to apply the Web camera to other recognition applications.
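The described pipeline (colour-space conversion, thresholding, morphological cleanup, pip counting) can be sketched as follows; the threshold choice and blob-area bounds are assumptions for white dice with dark pips.

```python
# Hedged sketch of the dice pipeline: HSV conversion, Otsu thresholding,
# morphological cleanup, then counting pip blobs by connected components.
import cv2
import numpy as np

img = cv2.imread("dice.png")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
_, binary = cv2.threshold(hsv[:, :, 2], 0, 255,
                          cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

kernel = np.ones((3, 3), np.uint8)
clean = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)   # erosion removes specks
clean = cv2.morphologyEx(clean, cv2.MORPH_CLOSE, kernel)   # fills holes in pips

n_labels, labels, stats, _ = cv2.connectedComponentsWithStats(clean)
pips = [i for i in range(1, n_labels)
        if 30 < stats[i, cv2.CC_STAT_AREA] < 500]          # assumed pip size
print("dice value:", len(pips))
```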
APA, Harvard, Vancouver, ISO, and other styles
43

Hu, Jhen-Da, and 胡振達. "Hybrid Hand Gesture Recognition Based on Depth Camera." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/7febjn.

Full text
Abstract:
Master's thesis
National Chiao Tung University
Institute of Multimedia Engineering
103
Hand gesture recognition (HGR) has become one of the most popular topics in recent years because hand gestures are one of the most natural and intuitive ways of communication between humans and machines, and it is widely used in human-computer interaction (HCI). In this paper, we propose a method for hand gesture recognition based on a depth camera. First, the hand information within the depth image is separated from the background based on a specific depth range, and the contour of the hand is detected after segmentation. Next, we estimate the centroid of the hand, and the palm size is calculated using linear regression. Then, the finger states of the gesture are estimated from the hand-contour information, and fingertips are located on a smoothed hand contour whose number of points is reduced by the Douglas-Peucker algorithm. Finally, we propose a gesture-type estimation algorithm to determine which gesture is being performed. Extensive experiments demonstrate that the accuracy of our method ranges from 84.35% to 99.55%, with a mean accuracy of 94.29%.
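A sketch of the segmentation and contour-simplification stages, assuming a 16-bit depth image in millimetres and a hand within an assumed 40-70 cm band; cv2.approxPolyDP implements the Douglas-Peucker reduction mentioned above.

```python
# Segment the hand by a depth band, then simplify its contour.
import cv2
import numpy as np

depth = cv2.imread("depth.png", cv2.IMREAD_UNCHANGED)   # assumed 16-bit, in mm
mask = ((depth > 400) & (depth < 700)).astype(np.uint8) * 255

contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
if contours:
    hand = max(contours, key=cv2.contourArea)
    eps = 0.01 * cv2.arcLength(hand, True)              # tolerance ~1% of perimeter
    simplified = cv2.approxPolyDP(hand, eps, True)      # Douglas-Peucker
    print("contour points:", len(hand), "->", len(simplified))
```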
APA, Harvard, Vancouver, ISO, and other styles
44

Huang, Tzu-Ta, and 黃自達. "Camera based Preprocessing System for Chinese Document Image Recognition." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/86838882293892444033.

Full text
Abstract:
Master's thesis
National Central University
Graduate Institute of Computer Science and Information Engineering
95
As we know, Chinese documents convey a lot of meaningful and useful information. Due to the popularization of digital cameras, it is convenient to take pictures and retrieve important text information from digitalized Chinese document images. A successful camera-based Chinese document processing system should overcome the problems resulting from various document formats, font sizes, and document skewing in order to extract correct text blocks without generating erroneous results. The major difference between Chinese and English documents is that Chinese characters are mainly composed of multiple connected components. The most important step in detecting the presence of Chinese text is to merge connected components correctly so as to produce complete Chinese character blocks. In this thesis, we propose a method to link Chinese characters into text lines and develop a rule that decides the merging condition of ordered connected components in order to hypothesize the presence of skewed documents. Two mechanisms are developed in the thesis. The first is the detection of inverted text blocks, which may otherwise be filtered out as oversized noise blocks in the preprocessing. The second is the detection of document images captured in an incorrect orientation, since people sometimes rotate the camera 90° or 270° to capture document images. A two-pass statistical method is proposed to automatically determine the rotation of document images (0°, 90°, 180°, 270°). The first pass exploits the phenomenon that horizontal strokes appear more frequently than vertical strokes in Chinese characters. The second pass analyzes the vertical projection histogram of each text block and defines keywords that assist in deciding the rotation angle.
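One plausible realization of the first pass (an illustration, not the thesis's own code) is to open the binarized page with long horizontal and vertical line kernels and compare how much ink survives; the kernel lengths are assumptions.

```python
# Compare horizontal vs. vertical stroke mass to guess page orientation.
import cv2

page = cv2.imread("page.png", cv2.IMREAD_GRAYSCALE)
_, ink = cv2.threshold(page, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

h_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (15, 1))
v_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 15))
h_strokes = cv2.morphologyEx(ink, cv2.MORPH_OPEN, h_kernel).sum()
v_strokes = cv2.morphologyEx(ink, cv2.MORPH_OPEN, v_kernel).sum()

# Upright Chinese text should have more horizontal than vertical strokes;
# otherwise the page is likely rotated 90 or 270 degrees, and a second
# (keyword-based) test decides which.
print("likely upright" if h_strokes > v_strokes else "likely rotated 90/270")
```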
APA, Harvard, Vancouver, ISO, and other styles
45

Chiang, Cheng-Ming, and 姜政銘. "Single-camera Hand Gesture Recognition for Human-Computer Interface." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/14566311560453722557.

Full text
Abstract:
Master's thesis
National Chiao Tung University
Department of Electronics Engineering and Institute of Electronics
102
In this thesis, we propose a novel hand gesture recognition technique for a remote-control human-computer interface (HCI) using a single visible-light camera. The system is mainly composed of an image projector and a camera installed on the left side of the panel. We wish to develop a human-computer interface that is not limited to finger touches on the board but also allows remote control of the system. Within this system, we develop our interface to locate the hand and recognize hand gestures in cluttered backgrounds in real time. In our approach, we first use a simple calibration process to obtain the initial position of the hand and the relation between image coordinates and projected-board coordinates. After that, we develop a tracking algorithm, aided by a hand detection algorithm, to obtain the position of the hand. Next, we use a gesture recognition technique to recognize the current gesture. We also integrate the detection algorithm with the tracking algorithm to boost performance. Finally, by projecting the detected hand position onto the projected screen, we can replace the mouse and use hand gestures to control the system.
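The calibration step that relates image coordinates to projected-board coordinates can be sketched as a four-point homography; the corner pixel values below are placeholders, not the thesis's measurements.

```python
# Map a detected hand position from camera pixels to board coordinates.
import cv2
import numpy as np

img_corners = np.float32([[102, 88], [530, 95], [540, 410], [95, 400]])   # assumed
board_corners = np.float32([[0, 0], [1280, 0], [1280, 800], [0, 800]])

H = cv2.getPerspectiveTransform(img_corners, board_corners)
hand_px = np.float32([[[320, 240]]])                 # detected hand position
hand_board = cv2.perspectiveTransform(hand_px, H)
print("cursor position on screen:", hand_board[0, 0])
```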
APA, Harvard, Vancouver, ISO, and other styles
46

Lin, Li-Wei, and 林立偉. "The Applications of Web Camera on Head Turning Recognition." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/47177539701971589181.

Full text
Abstract:
Master's thesis
Southern Taiwan University of Science and Technology
Department of Mechanical Engineering
97
In this paper, a webcam is used to capture images, and image processing and neural network techniques are combined to develop a head-rotation recognition system. First, the face region is identified, and the locations of the eyes and lips are found; then, for each head-rotation angle, the centers of gravity of the eyes and lips are determined. From these three centers of gravity, the three sides of the triangle they form are obtained, and after appropriate calculations five characteristic features are derived. After normalizing these features together with the corresponding head-rotation angles, a database is obtained and used as training data for the neural network. The experimental results indicate that, under the same light source and for head-rotation angles from -40 degrees to 40 degrees, the proposed system can identify the head-rotation angle and direction.
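A simplified sketch of the feature-and-network pairing: normalized triangle sides from the eye/lip centroids feed a small regressor. The thesis derives five characteristics; only three normalized sides are used here, and the training points are synthetic placeholders.

```python
# Triangle-side features from facial centroids, fed to a small MLP.
import numpy as np
from sklearn.neural_network import MLPRegressor

def triangle_sides(eye_l, eye_r, mouth):
    pts = [np.asarray(p, float) for p in (eye_l, eye_r, mouth)]
    a = np.linalg.norm(pts[0] - pts[1])
    b = np.linalg.norm(pts[1] - pts[2])
    c = np.linalg.norm(pts[2] - pts[0])
    return np.array([a, b, c]) / (a + b + c)    # normalized, scale-invariant

# Hypothetical training set: centroids measured at known rotation angles.
X = np.array([triangle_sides((100, 100), (160, 100), (130, 160)),
              triangle_sides((105, 100), (150, 100), (125, 158))])
y = np.array([0.0, 20.0])                       # head rotation in degrees

model = MLPRegressor(hidden_layer_sizes=(10,), max_iter=5000, random_state=0)
model.fit(X, y)
print("predicted angle:", model.predict(X[:1])[0])
```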
APA, Harvard, Vancouver, ISO, and other styles
47

LIN, CHENG-YEN, and 林政諺. "Taiwanese Sign Language Recognition Using an RGB-D Camera." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/91582000909725784127.

Full text
Abstract:
Master's thesis
National Taiwan University of Science and Technology
Department of Mechanical Engineering
104
In this thesis, hand images captured by an RGB-D camera are used to perform sign language recognition in general environments. Unlike many existing methods that focus on number recognition (from zero to nine) in sign language, we propose a method to perform Taiwanese sign language recognition, both for single vocabulary items and for sentences. In practical use, users first place their hands in front of the RGB-D camera at a distance between 40 cm and 70 cm. The depth information extracted from the RGB-D camera is then used to construct hand images and to perform Taiwanese sign language vocabulary recognition using Haar feature-based cascade classifiers. The recognition consists of two parts: the first is static Taiwanese sign language recognition for a sentence, and the second recognizes dynamic Taiwanese sign language vocabularies as a sentence. Because the hands move during dynamic sign language vocabulary recognition, we apply the optical flow method to recognize the hand orientation. Using the methods above, we have successfully performed Taiwanese sign language recognition. Finally, we also developed a Taiwanese sign language recognition module, which can be treated as a key technology for Taiwanese sign language translation. The recognition includes both static and dynamic Taiwanese sign language vocabularies. These results may be useful for future real-time Taiwanese sign language recognition research.
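A sketch combining the two named ingredients: a Haar cascade for the static hand shape and Farneback optical flow for the motion direction of dynamic signs. "hand_sign.xml" is a hypothetical cascade file the user would have trained, and the frame file names are placeholders.

```python
# Haar-cascade detection plus dense optical flow for motion direction.
import cv2
import numpy as np

cascade = cv2.CascadeClassifier("hand_sign.xml")     # hypothetical trained cascade
prev = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

hits = cascade.detectMultiScale(curr, scaleFactor=1.1, minNeighbors=5)
flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)
dx, dy = flow[..., 0].mean(), flow[..., 1].mean()
angle = np.degrees(np.arctan2(dy, dx))               # dominant hand-motion direction
print(f"{len(hits)} detections, mean motion {angle:.0f} deg")
```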
APA, Harvard, Vancouver, ISO, and other styles
48

Lin, Yi-ta, and 林逸達. "3D Object Tracking and Recognition with RGB-Depth Camera." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/yn2qzy.

Full text
Abstract:
Master's thesis
National Sun Yat-sen University
Department of Electrical Engineering
106
The main purpose of this thesis is 3D object tracking using an RGB-D camera. In addition, the tracked object may be changed during the tracking phase, and our system can identify the new object. The work consists of three phases: off-line training, on-line tracking, and identification of a new object. In the first phase, we create three 3D models of the tracking objects (a box, a cylinder, and a sphere) and calculate point-pair features for each 3D model. These point-pair features are stored in a database for later use. In the second phase, we use the RGB-D sensor to observe the real-world scene and calculate the point-pair features of the scene in the same way as in the first phase. We then compare the scene's point-pair features against the database to find where the 3D model is located in the scene. However, this yields only an initial pose for the 3D model, so the Iterative Closest Point (ICP) algorithm is used to obtain a better pose. In the third phase, the tracking object is changed during tracking; the system detects this situation from the scene, identifies the new tracking object, and keeps tracking it using the method introduced in the second phase.
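The ICP refinement can be sketched compactly with numpy/scipy: match each source point to its nearest scene point, then solve the best rigid transform via the SVD (Kabsch) method. This is a from-scratch illustration on synthetic points, not the thesis implementation.

```python
# Minimal ICP: nearest neighbours + SVD rigid fit, iterated.
import numpy as np
from scipy.spatial import cKDTree

def icp_step(src, dst):
    """One ICP iteration: match src to nearest dst points, then solve
    the best rigid transform with the SVD (Kabsch) method."""
    tree = cKDTree(dst)
    _, idx = tree.query(src)
    matched = dst[idx]
    mu_s, mu_d = src.mean(0), matched.mean(0)
    H = (src - mu_s).T @ (matched - mu_d)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                # avoid reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_d - R @ mu_s
    return (R @ src.T).T + t, R, t

ang = np.radians(8.0)                        # synthetic ground-truth rotation
Rz = np.array([[np.cos(ang), -np.sin(ang), 0.0],
               [np.sin(ang),  np.cos(ang), 0.0],
               [0.0, 0.0, 1.0]])
src = np.random.rand(200, 3)                 # placeholder model points
dst = src @ Rz.T + np.array([0.05, -0.02, 0.03])

cur = src.copy()
for _ in range(20):
    cur, R, t = icp_step(cur, dst)
print("mean residual:", np.linalg.norm(cur - dst, axis=1).mean())
```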
APA, Harvard, Vancouver, ISO, and other styles
49

Harguess, Joshua David. "Face recognition from video." Thesis, 2011. http://hdl.handle.net/2152/ETD-UT-2011-12-4711.

Full text
Abstract:
While the area of face recognition has been extensively studied in recent years, it remains a largely open problem, despite what movie and television studios would lead you to believe. Frontal, still face recognition research has seen a lot of success in recent years from many different researchers. However, the accuracy of such systems can be greatly diminished in cases such as increasing the variability of the database, occluding the face, and varying the illumination of the face. Further varying the pose of the face (yaw, pitch, and roll) and the facial expression (smile, frown, etc.) adds even more complexity to the face recognition task, as in the case of face recognition from video. In a more realistic video surveillance setting, a face recognition system should be robust to scale, pose, resolution, and occlusion, and should successfully track the face between frames. A more advanced face recognition system should also be able to improve the recognition result by utilizing the information present in multiple video cameras. We approach the problem of face recognition from video in the following manner. We assume that the training data for the system consists only of still image data, such as passport photos or mugshots in a real-world system. We then transform the problem of face recognition from video into a still face recognition problem. Our research focuses on solutions for detecting, tracking, and extracting face information from video frames so that it may be utilized effectively in a still face recognition system. We have developed four novel methods that assist in face recognition from video and multiple cameras. The first uses a patch-based method to handle the face recognition task when only patches, or parts, of the face are seen in a video, such as when occlusion of the face happens often. The second fuses the recognition results of multiple cameras to improve the recognition accuracy. In the third solution, we utilize multiple overlapping video cameras to improve the face tracking result, which in turn improves the face recognition accuracy of the system. We additionally implement a methodology to detect and handle occlusion so that unwanted information is not used in the tracking algorithm. Finally, we introduce the average-half-face, which is shown to improve the results of still face recognition by utilizing the symmetry of the face. As one attempt to understand the use of the average-half-face in face recognition, an analysis of the effect of face symmetry on face recognition results is presented.
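The average-half-face construction introduced above can be sketched in a few lines: split an aligned face at its vertical symmetry axis and average one half with the mirror of the other, halving the input to the recognizer. The alignment and centring of the face are assumed to have been done beforehand.

```python
# Sketch of the average-half-face: average the left half with the
# mirrored right half of an aligned, centred face crop.
import numpy as np

def average_half_face(face):
    """face: 2-D grayscale array, assumed roughly symmetric and centred."""
    h, w = face.shape
    half = w // 2
    left = face[:, :half].astype(float)
    right = np.fliplr(face[:, w - half:]).astype(float)
    return (left + right) / 2.0

face = np.random.rand(64, 64)                # placeholder aligned face crop
print(average_half_face(face).shape)         # -> (64, 32)
```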
APA, Harvard, Vancouver, ISO, and other styles
50

Kuo, Wen-Te, and 郭文德. "Auto-Recognition and Performing of Music Score Captured By Camera." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/399v4q.

Full text
Abstract:
Master's thesis
National Taipei University of Technology
Master Program of Electrical Engineering and Computer Science
100
As camera technology for capturing images has developed, people need faster and smarter processing methods. This paper focuses on the image processing of music scores captured by a video camera and has two main goals. First, because existing recognition software cannot adequately correct distorted images, the paper proposes an improved method: the image is cut into small pieces, the Hough transform is used to calculate the slope of each piece, and the tangent function is then applied to the slope to obtain the offset value. Finally, based on the slopes and the connections between the pieces, the correction improves the image; the results show that this process can effectively improve the pattern recognition rate. Second, to identify music symbols effectively and increase the recognition rate, neural networks are used to enhance music score recognition. Many experiments were carried out, and the success rate averaged 96.4%, indicating a significant improvement in music score recognition. Finally, this article integrates the two approaches into one complete processing method and verifies it by reading actual music score images, proving that the distorted-image correction and neural network recognition proposed in this paper work correctly.
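A sketch of strip-wise skew estimation in the spirit of the first method: cut the score into horizontal strips, estimate each strip's dominant near-horizontal line angle with the Hough transform, and rotate it back. The strip count and thresholds are assumptions, and the sign convention of the correction would need checking against real images.

```python
# Per-strip skew estimation and correction via the Hough transform.
import cv2
import numpy as np

page = cv2.imread("score.png", cv2.IMREAD_GRAYSCALE)
strips = np.array_split(page, 8, axis=0)         # assumed number of strips

for i, strip in enumerate(strips):
    edges = cv2.Canny(strip, 50, 150)
    lines = cv2.HoughLines(edges, 1, np.pi / 360, threshold=120)
    if lines is None:
        continue
    # Staff lines are near-horizontal: theta close to pi/2 in Hough space.
    thetas = [l[0][1] for l in lines if abs(l[0][1] - np.pi / 2) < 0.2]
    if thetas:
        skew_deg = np.degrees(np.mean(thetas) - np.pi / 2)
        M = cv2.getRotationMatrix2D((strip.shape[1] / 2, strip.shape[0] / 2),
                                    skew_deg, 1.0)
        strips[i] = cv2.warpAffine(strip, M, strip.shape[::-1])
print("strips deskewed:", len(strips))
```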
APA, Harvard, Vancouver, ISO, and other styles
