Dissertations / Theses on the topic 'Real-time vision systems'

Consult the top 50 dissertations / theses for your research on the topic 'Real-time vision systems.'


1

Benoit, Stephen M. "Monocular optical flow for real-time vision systems." Thesis, McGill University, 1996. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=23862.

Abstract:
This thesis introduces a monocular optical flow algorithm that has been shown to perform well at near real-time frame rates (4 FPS) on natural image sequences. The system is completely bottom-up, using pixel region-matching techniques. A coordinated gradient descent method is broken down into two stages: pixel region-matching error measures are locally minimized, and flow field consistency constraints apply non-linear adaptive diffusion, causing confident measurements to influence their less confident neighbors. Convergence is usually accomplished within one iteration for an image frame pair. Temporal integration and Kalman filtering predict upcoming flow fields and figure/ground separation. The algorithm is designed for flexibility: large displacements are tracked as easily as sub-pixel displacements, and higher-level information can feed flow field predictions into the measurement process.
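The two-stage scheme described above can be sketched compactly. This is an illustrative reconstruction under stated assumptions (brute-force SSD matching, a simple confidence heuristic, four-neighbor diffusion), not Benoit's actual implementation:

```python
import numpy as np

def match_region(prev, curr, y, x, r=4, search=3):
    """Stage 1: locally minimize a pixel region-matching error (SSD)."""
    patch = prev[y - r:y + r + 1, x - r:x + r + 1].astype(np.float32)
    best_err, best_d = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cand = curr[y + dy - r:y + dy + r + 1,
                        x + dx - r:x + dx + r + 1].astype(np.float32)
            err = np.sum((patch - cand) ** 2)
            if err < best_err:
                best_err, best_d = err, (dy, dx)
    conf = 1.0 / (1.0 + best_err)  # sharper minimum -> higher confidence
    return best_d, conf

def diffuse_flow(flow, conf, iters=1):
    """Stage 2: confidence-weighted diffusion, so confident flow vectors
    influence their less confident neighbors."""
    shifts = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    for _ in range(iters):
        w = conf[..., None]                       # (H, W, 1) weights
        num = sum(np.roll(flow * w, s, axis=(0, 1)) for s in shifts)
        den = sum(np.roll(w, s, axis=(0, 1)) for s in shifts)
        flow = (flow * w + num) / (w + den + 1e-9)
    return flow
```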
2

Tippetts, Beau J. "Real-Time Stereo Vision for Resource Limited Systems." BYU ScholarsArchive, 2012. https://scholarsarchive.byu.edu/etd/2972.

Abstract:
A significant amount of research in the field of stereo vision has been published in the past decade. Considerable progress has been made in improving the accuracy of results as well as in achieving real-time performance in obtaining those results. Although much of the literature does not address it, many applications are sensitive to the tradeoff between accuracy and speed that exists among stereo vision algorithms. Overall, this work aims to organize existing efforts, and encourage new ones, in the development of stereo vision algorithms for resource-limited systems. It does this through a review of the status quo as well as by providing both software and hardware designs of new stereo vision algorithms that offer an efficient tradeoff between speed and accuracy. A comprehensive review and analysis of stereo vision algorithms is provided, with specific emphasis on real-time performance and suitability for resource-limited systems. An attempt has been made to compile and present accuracy and runtime performance data for all stereo vision algorithms developed in the past decade. The tradeoff in accuracy that is typically made to achieve real-time performance is examined with an example of an existing highly accurate stereo vision algorithm that is modified to see how much speedup can be achieved. Two new stereo vision algorithms, GA Spline and Profile Shape Matching, are presented, with a hardware design of the latter also being provided, making Profile Shape Matching available to both embedded processor-based and programmable hardware-based resource-limited systems.
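Profile Shape Matching itself is not specified in the abstract, but the speed/accuracy tradeoff it discusses can be made concrete with a minimal block-matching stereo sketch: the window radius and disparity range are exactly the knobs that trade accuracy for runtime.

```python
import numpy as np

def disparity_sad(left, right, max_disp=32, r=3):
    """Brute-force SAD block matching over rectified grayscale images.
    Cost is O(H * W * max_disp * window area): enlarging max_disp or r
    improves robustness but increases runtime roughly linearly."""
    H, W = left.shape
    disp = np.zeros((H, W), np.uint8)
    for y in range(r, H - r):
        for x in range(r + max_disp, W - r):
            patch = left[y - r:y + r + 1, x - r:x + r + 1].astype(np.int32)
            errs = [np.abs(patch - right[y - r:y + r + 1,
                                         x - d - r:x - d + r + 1]).sum()
                    for d in range(max_disp)]
            disp[y, x] = int(np.argmin(errs))
    return disp
```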
3

Arshad, Norhashim Mohd. "Real-time data compression for machine vision measurement systems." Thesis, Liverpool John Moores University, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.285284.

4

Pan, Wenbo. "Real-time human face tracking." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape3/PQDD_0018/MQ55535.pdf.

5

Nguyen, Dai-Duong. "A vision system based real-time SLAM applications." Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLS518/document.

Abstract:
SLAM (Simultaneous Localization And Mapping) plays an important role in several applications such as autonomous robots, smart vehicles, unmanned aerial vehicles (UAVs) and others. Nowadays, real-time vision-based SLAM has become a subject of widespread research interest. One of the solutions to address the computational complexity of the image processing algorithms dedicated to SLAM applications is to perform high- and/or low-level processing on co-processors in order to build a System on Chip. Heterogeneous architectures have demonstrated their ability to become potential candidates for a system on chip in a hardware/software co-design approach. The aim of this thesis is to propose a vision system implementing a SLAM algorithm on a heterogeneous architecture (CPU-GPU or CPU-FPGA). The study evaluates whether these types of heterogeneous architectures are advantageous, which elementary functions and/or operators should be implemented on chip, and how to integrate image processing and the SLAM kernel on a heterogeneous architecture (i.e., how to map visual SLAM onto a System on Chip). There are two parts in a visual SLAM system: the front-end (feature extraction, image processing) and the back-end (SLAM kernel). For the front-end, we studied several feature detection and description algorithms and developed our own, denoted HOOFR (Hessian ORB Overlapped FREAK), which offers a better compromise between precision and processing time than the state of the art. This algorithm is based on a modification of the ORB (Oriented FAST and Rotated BRIEF) detector and the bio-inspired FREAK (Fast Retina Keypoint) descriptor. The improvements were validated using well-known real datasets. We then propose the HOOFR-SLAM Stereo algorithm for the back-end. This algorithm uses images acquired by a stereo camera to perform simultaneous localization and mapping, and its performance was evaluated on several datasets (KITTI, New College, Malaga, MRT, St Lucia, ...). Afterward, to reach a real-time system, we studied the algorithmic complexity of HOOFR SLAM as well as current hardware architectures dedicated to embedded systems, using a methodology based on algorithm complexity and functional block partitioning. The processing time of each block was analyzed taking into account the constraints of the targeted architectures. We achieved an implementation of HOOFR SLAM on a massively parallel CPU-GPU architecture, and evaluated its performance on a powerful workstation and on embedded architectures. In this study, we propose a system-level architecture and a design methodology to integrate a vision SLAM algorithm on an SoC. This system highlights a compromise between versatility, parallelism, processing speed and localization results, and a comparison with conventional systems is performed to evaluate the defined system architecture. To reduce energy consumption, we studied the implementation of the front-end (image processing) on an FPGA-based SoC, with the SLAM kernel intended to run on a CPU. We proposed a parallelized architecture using the HLS (high-level synthesis) method and the OpenCL programming language, and validated it on an Altera Arria 10 SoC. A comparison with state-of-the-art systems shows that the designed architecture offers better performance and a good compromise between power consumption and processing time.
6

Garner, Harry Douglas Jr. "Development of a real-time vision based absolute orientation sensor." Diss., Georgia Institute of Technology, 2001. http://hdl.handle.net/1853/17022.

7

Guo, Guanghao. "Evaluation of FPGA Partial Reconfiguration : for real-time Vision applications." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-279957.

Abstract:
The usage of programmable logic resources in Field Programmable Gate Arrays, also known as FPGAs, has increased considerably in recent years due to the complexity of modern algorithms, especially certain computer vision algorithms. As a result, the hardware resources in an FPGA are sometimes insufficient. Partial reconfiguration offers a way to solve this problem: it is a technique that can be used to reconfigure specific parts of the FPGA at run-time, thereby reducing the need for programmable logic resources. This master thesis project aims to design a software framework for partial reconfiguration that can load a set of processing components/algorithms (e.g. object detection, optical flow, Harris corner detection) into the FPGA without affecting continuously running real-time static components such as camera capture, basic image filtering and colour conversion. Partial reconfiguration has been applied to two different video processing pipelines, a direct streaming architecture and a frame-buffer streaming architecture. The results show that the reconfiguration time is predictable and depends on the partial bitstream size, and that partial reconfiguration can be used in real-time applications, provided the partial bitstream size and the frequency of switching partial bitstreams are taken into account.
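The reported proportionality between partial bitstream size and reconfiguration time suggests a simple planning model. The throughput figure below is an assumption (Zynq-7000 PCAP throughput is commonly quoted around 128 MB/s), not a number from the thesis:

```python
def reconfig_time_ms(bitstream_bytes, throughput_mb_s=128):
    """Estimated partial reconfiguration time: bitstream size divided by
    the configuration-port throughput (assumed, platform-dependent)."""
    return bitstream_bytes / (throughput_mb_s * 1e6) * 1e3

# A hypothetical 1 MiB partial bitstream at 128 MB/s takes ~8 ms, which
# bounds how often components can be swapped in a 30 fps video pipeline.
print(reconfig_time_ms(1 << 20))
```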
8

Hiromoto, Masayuki. "LSI design methodology for real-time computer vision on embedded systems." 京都大学 (Kyoto University), 2009. http://hdl.handle.net/2433/126476.

Doctoral dissertation (Ph.D. in Informatics), Department of Communications and Computer Engineering, Graduate School of Informatics, Kyoto University, 2009.
9

Pereira, Pedro André Marques. "Measuring the strain of metallic surfaces in real time through vision systems." Master's thesis, Universidade de Aveiro, 2015. http://hdl.handle.net/10773/16447.

Abstract:
Master's dissertation in Mechanical Engineering.
Vision systems have already proven to be a useful tool in various fields. The ease of their implementation, allied to their low cost, means that their growth potential is immense. In this dissertation an approach is proposed to measure strains on metallic surfaces using stereo vision, based on 3D DIC (digital image correlation). This method measures the strain of the surface by dividing it into small sections, called subsets, and iteratively finding the equation that describes each subset's shape variation through time. Calculating the transformation of a subset is, however, very time-consuming. The proposed approach optimizes this calculation by first determining the displacement field and then the strain field by derivation. The dissertation also presents experimental data and practical considerations regarding the camera setup and image equalization algorithms used to obtain better disparity maps. The results were verified experimentally and compared with the results obtained from other software.
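The proposed order of operations (per-subset displacement first, strain by derivation second) can be sketched as follows. This is a rough stand-in that assumes normalized cross-correlation as the subset-matching engine; the thesis's actual shape functions are richer than pure translation:

```python
import numpy as np
import cv2

def displacement_field(ref, cur, step=32, win=21):
    """Track each reference subset by normalized cross-correlation.
    ref, cur: grayscale uint8 images of the undeformed/deformed surface."""
    r = win // 2
    ys = np.arange(r, ref.shape[0] - r, step)
    xs = np.arange(r, ref.shape[1] - r, step)
    u = np.zeros((len(ys), len(xs)), np.float32)  # x-displacement grid
    v = np.zeros_like(u)                          # y-displacement grid
    for i, y in enumerate(ys):
        for j, x in enumerate(xs):
            tmpl = ref[y - r:y + r + 1, x - r:x + r + 1]
            res = cv2.matchTemplate(cur, tmpl, cv2.TM_CCOEFF_NORMED)
            _, _, _, (mx, my) = cv2.minMaxLoc(res)   # best match location
            u[i, j], v[i, j] = (mx + r) - x, (my + r) - y
    return u, v

def strain_field(u, v, spacing):
    """Small-strain components by numerical derivation of displacements."""
    du_dy, du_dx = np.gradient(u, spacing)
    dv_dy, dv_dx = np.gradient(v, spacing)
    return du_dx, dv_dy, 0.5 * (du_dy + dv_dx)  # exx, eyy, exy
```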
10

Katramados, Ioannis. "Real-time object detection using monocular vision for low-cost automotive sensing systems." Thesis, Cranfield University, 2013. http://dspace.lib.cranfield.ac.uk/handle/1826/10386.

Abstract:
This work addresses the problem of real-time object detection in automotive environments using monocular vision. The focus is on real-time feature detection, tracking, depth estimation using monocular vision and, finally, object detection by fusing visual saliency and depth information. Firstly, a novel feature detection approach is proposed for extracting stable and dense features even in images with very low signal-to-noise ratio. This methodology is based on image gradients, which are redefined to take account of noise as part of their mathematical model. Each gradient is based on a vector connecting a negative to a positive intensity centroid, where both centroids are symmetric about the centre of the area for which the gradient is calculated. Multiple gradient vectors define a feature, with its strength being proportional to the underlying gradient vector magnitude. The evaluation of the Dense Gradient Features (DeGraF) shows superior performance over other contemporary detectors in terms of keypoint density, tracking accuracy, illumination invariance, rotation invariance, noise resistance and detection time. The DeGraF features form the basis for two new approaches that perform dense 3D reconstruction from a single vehicle-mounted camera. The first approach tracks DeGraF features in real-time while performing image stabilisation with minimal computational cost. This means that despite camera vibration the algorithm can accurately predict the real-world coordinates of each image pixel in real-time by comparing each motion vector to the ego-motion vector of the vehicle. The performance of this approach has been compared to different 3D reconstruction methods in order to determine their accuracy, depth-map density, noise resistance and computational complexity. The second approach proposes the use of local frequency analysis of gradient features for estimating relative depth. This novel method is based on the fact that DeGraF gradients can accurately measure local image variance with sub-pixel accuracy. It is shown that the local frequency by which the centroid oscillates around the gradient window centre is proportional to the depth of each gradient centroid in the real world. The lower computational complexity of this methodology comes at the expense of depth-map accuracy as the camera velocity increases, but it is at least five times faster than the other evaluated approaches. This work also proposes a novel technique for deriving visual saliency maps by using Division of Gaussians (DIVoG). In this context, saliency maps express how different each image pixel is from its surrounding pixels across multiple pyramid levels. This approach is shown to be both fast and accurate when evaluated against other state-of-the-art approaches. Subsequently, the saliency information is combined with depth information to identify salient regions close to the host vehicle. The fused map allows faster detection of high-risk areas where obstacles are likely to exist. As a result, existing object detection algorithms, such as the Histogram of Oriented Gradients (HOG), can execute at least five times faster. In conclusion, through a step-wise approach, computationally expensive algorithms have been optimised or replaced by novel methodologies to produce a fast object detection system that is aligned to the requirements of the automotive domain.
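The centroid-based gradient is described concretely enough to attempt a toy version. Splitting the window's mass about its mean intensity is my assumption for the "negative" and "positive" intensity centroids; the thesis's exact weighting may differ:

```python
import numpy as np

def degraf_gradient(window):
    """Toy DeGraF-style gradient: the vector from the negative- to the
    positive-intensity centroid of a window, with strength equal to the
    vector magnitude. Assumes a split about the window mean."""
    h, w = window.shape
    ys, xs = np.mgrid[0:h, 0:w]
    centre = np.array([(h - 1) / 2, (w - 1) / 2])
    mean = window.mean()
    pos = np.clip(window - mean, 0, None)   # above-mean intensity mass
    neg = np.clip(mean - window, 0, None)   # below-mean intensity mass

    def centroid(weight):
        m = weight.sum()
        if m == 0:
            return centre
        return np.array([(ys * weight).sum(), (xs * weight).sum()]) / m

    g = centroid(pos) - centroid(neg)       # gradient vector (dy, dx)
    return g, np.hypot(*g)                  # direction and strength
```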
11

Björkman, Mårten. "Real-Time Motion and Stereo Cues for Active Visual Observers." Doctoral thesis, KTH, Numerical Analysis and Computer Science, NADA, 2002. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-3382.

12

Nelson, Eric D. "Zoom techniques for achieving scale invariant object tracking in real-time active vision systems /." Online version of the thesis, 2006. https://ritdml.rit.edu/dspace/handle/1850/2620.

13

Watanabe, Yoko. "Stochastically optimized monocular vision-based navigation and guidance." Diss., Atlanta, Ga. : Georgia Institute of Technology, 2007. http://hdl.handle.net/1853/22545.

Abstract:
Thesis (Ph. D.)--Aerospace Engineering, Georgia Institute of Technology, 2008.
Committee Chair: Johnson, Eric; Committee Co-Chair: Calise, Anthony; Committee Member: Prasad, J.V.R.; Committee Member: Tannenbaum, Allen; Committee Member: Tsiotras, Panagiotis.
14

Hellsten, Jonas. "Evaluation of tone mapping operators for use in real time environments." Thesis, Linköping University, Department of Science and Technology, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-9749.

Abstract:
As real-time visualizations become more realistic, it also becomes more important to simulate the perceptual effects of the human visual system. Such effects include the response to varying illumination, glare and differences between photopic and scotopic vision. This thesis evaluates several tone mapping methods to allow a greater dynamic range to be used in real-time visualisations. Several tone mapping methods have been implemented in the Avalanche Game Engine and evaluated using a small test group. To increase immersion in the visualization, several filters aimed at simulating perceptual effects have also been implemented; the primary goal of these filters is to simulate scotopic vision. The tests showed that two tone mapping methods would be suitable for the environment used in the tests: the S-curve tone mapping method gave the best result, while the Mean Value method gave good results while being the simplest and cheapest to implement. The test subjects agreed that the simulation of scotopic vision enhanced immersion in a visualization. The primary difficulties in this work have been the lack of dynamic range in the input images and the challenges of coding real-time graphics on a graphics processing unit.
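The thesis's exact operators are not given here, so the sketch below stands in with a common log-average-luminance key and an L/(1+L) compression, which matches the flavor of a global, mean-value-based operator:

```python
import numpy as np

def tonemap_mean_value(hdr, key=0.18):
    """Global tone mapping sketch. hdr: float32 RGB image of linear
    radiance. Scale by the log-average luminance, then compress with
    L/(1+L), which is S-shaped in the log domain."""
    lum = 0.2126 * hdr[..., 0] + 0.7152 * hdr[..., 1] + 0.0722 * hdr[..., 2]
    log_avg = np.exp(np.mean(np.log(lum + 1e-6)))   # scene "key" estimate
    L = key * lum / log_avg
    Ld = L / (1.0 + L)                               # display luminance
    return np.clip(hdr * (Ld / (lum + 1e-6))[..., None], 0.0, 1.0)
```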

15

Entschev, Peter Andreas. "Efficient construction of multi-scale image pyramids for real-time embedded robot vision." Universidade Tecnológica Federal do Paraná, 2013. http://repositorio.utfpr.edu.br/jspui/handle/1/720.

Abstract:
Interest point detectors, or keypoint detectors, have been of great interest for embedded robot vision for a long time, especially those which provide robustness against geometrical variations, such as rotation, affine transformations and changes in scale. The detection of scale invariant features is normally done by constructing multi-scale image pyramids and performing an exhaustive search for extrema in the scale space, an approach that is present in object recognition methods such as SIFT and SURF. These methods are able to find very robust interest points with suitable properties for object recognition, but at the same time are computationally expensive. In this work we present an efficient method for the construction of SIFT-like image pyramids in embedded systems such as the BeagleBoard-xM. The method we present here aims at using computationally less expensive techniques and reusing already processed information in an efficient manner in order to reduce the overall computational complexity. To simplify the pyramid building process we use binomial filters instead of conventional Gaussian filters used in the original SIFT method to calculate multiple scales of an image. Binomial filters have the advantage of being able to be implemented by using fixed-point notation, which is a big advantage for many embedded systems that do not provide native floating-point support. We also reduce the amount of convolution operations needed by resampling already processed scales of the pyramid. After presenting our efficient pyramid construction method, we show how to implement it in an efficient manner in an SIMD (Single Instruction, Multiple Data) platform -- the SIMD platform we use is the ARM Neon extension available in the BeagleBoard-xM ARM Cortex-A8 processor. SIMD platforms in general are very useful for multimedia applications, where normally it is necessary to perform the same operation over several elements, such as pixels in images, enabling multiple data to be processed with a single instruction of the processor. However, the Neon extension in the Cortex-A8 processor does not support floating-point operations, so the whole method was carefully implemented to overcome this limitation. Finally, we provide some comparison results regarding the method we propose here and the original SIFT approach, including performance regarding execution time and repeatability of detected keypoints. With a straightforward implementation (without the use of the SIMD platform), we show that our method takes approximately 1/4 of the time taken to build the entire original SIFT pyramid, while repeating up to 86% of the interest points found with the original method. With a complete fixed-point approach (including vectorization within the SIMD platform) we show that repeatability reaches up to 92% of the original SIFT keypoints while reducing the processing time to less than 3%.
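The fixed-point property the abstract highlights is easy to see in code: a separable [1, 2, 1]/4 binomial kernel needs only integer adds and shifts. A minimal sketch (not the thesis's implementation, which is vectorized with ARM NEON):

```python
import numpy as np

def binomial_blur_u8(img):
    """Separable [1,2,1]/4 binomial filter on a uint8 image using only
    integer adds and shifts -- no floating point required."""
    a = img.astype(np.uint16)                        # max 255*4 fits in 16 bits
    h = (a[:, :-2] + 2 * a[:, 1:-1] + a[:, 2:]) >> 2  # horizontal pass
    v = (h[:-2, :] + 2 * h[1:-1, :] + h[2:, :]) >> 2  # vertical pass
    return v.astype(np.uint8)

def pyramid(img, levels=4):
    """Repeated binomial blur plus 2x resampling approximates a Gaussian
    pyramid while reusing already-processed scales."""
    out = [img]
    for _ in range(levels - 1):
        out.append(binomial_blur_u8(out[-1])[::2, ::2])
    return out
```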
16

Cedernaes, Erasmus. "Runway detection in LWIR video : Real time image processing and presentation of sensor data." Thesis, Uppsala universitet, Avdelningen för visuell information och interaktion, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-300690.

Abstract:
Runway detection in long-wavelength infrared (LWIR) video could potentially increase the number of successful landings by increasing the situational awareness of pilots and verifying a correct approach. A method for detecting runways in LWIR video was therefore proposed and evaluated for robustness, speed and FPGA acceleration. The proposed algorithm improves the detection probability by making assumptions about the runway's appearance during approach, as well as by using a modified Hough line transform and a symmetric search for peaks in the accumulator it returns. A video chain was implemented on a Xilinx ZC702 development card with input and output via HDMI through an expansion card. The video frames were buffered to RAM, and the detection algorithm ran on the CPU, which, however, did not meet the real-time requirement. Strategies were proposed that would improve the processing speed through either hardware acceleration or algorithmic changes.
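The core steps (edge detection, a Hough line transform, and a symmetric search for accumulator peaks) can be sketched with OpenCV; the symmetry test below is a simplified stand-in for the modified transform described in the thesis:

```python
import cv2
import numpy as np

def runway_candidates(frame_gray, angle_tol=np.deg2rad(5)):
    """Find near-symmetric line pairs: during a correct approach the two
    runway edges converge roughly symmetrically about the vertical."""
    edges = cv2.Canny(frame_gray, 50, 150)
    lines = cv2.HoughLines(edges, 1, np.pi / 180, threshold=120)
    if lines is None:
        return []
    pairs = []
    thetas = lines[:, 0, 1]                 # each line is (rho, theta)
    for i in range(len(thetas)):
        for j in range(i + 1, len(thetas)):
            # Symmetry about the vertical axis: theta_i + theta_j ~ pi.
            if abs((thetas[i] + thetas[j]) - np.pi) < angle_tol:
                pairs.append((lines[i, 0], lines[j, 0]))
    return pairs
```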
17

Forsthoefel, Dana. "Leap segmentation in mobile image and video analysis." Diss., Georgia Institute of Technology, 2013. http://hdl.handle.net/1853/50285.

Abstract:
As demand for real-time image processing increases, the need to improve the efficiency of image processing systems is growing. The process of image segmentation is often used in the preprocessing stages of computer vision systems to reduce image data and increase processing efficiency. This dissertation introduces a novel image segmentation approach known as leap segmentation, which applies a flexible definition of adjacency to allow groupings of pixels into segments which need not be spatially contiguous and thus can more accurately correspond to large surfaces in the scene. Experiments show that leap segmentation correctly preserves an average of 20% more original scene pixels than traditional approaches, while using the same number of segments, and significantly improves execution performance (executing 10x - 15x faster than leading approaches). Further, leap segmentation is shown to improve the efficiency of a high-level vision application for scene layout analysis within 3D scene reconstruction. The benefits of applying image segmentation in preprocessing are not limited to single-frame image processing; segmentation is also often applied in the preprocessing stages of video analysis applications. In the second contribution of this dissertation, the fast, single-frame leap segmentation approach is extended into the temporal domain to develop a highly efficient method for multiple-frame segmentation, called video leap segmentation. This approach is evaluated for use on mobile platforms, where processing speed is critical, using moving-camera traffic sequences captured on busy, multi-lane highways. Video leap segmentation accurately tracks segments across temporal bounds, maintaining temporal coherence between the input sequence frames. It is shown that video leap segmentation can be applied with high accuracy to the task of salient segment transformation detection, alerting drivers to important scene changes that may affect future steering decisions. Finally, while research efforts in the field of image segmentation have often recognized the need for efficient implementations for real-time processing, many of today's leading image segmentation approaches exhibit processing times which exceed their camera frame periods, making them infeasible for use in real-time applications. The third research contribution of this dissertation focuses on developing fast implementations of the single-frame leap segmentation approach for use on both single-core and multi-core platforms, as well as on both high-performance and resource-constrained systems. While the design of leap segmentation lends itself to efficient implementations, the efficiency achieved by this algorithm, as with any algorithm, can be improved with careful implementation optimizations. The leap segmentation approach is analyzed in detail, and highly optimized implementations of the approach are presented with in-depth studies ranging from storage considerations to realizing parallel processing potential. The final implementations of leap segmentation for both serial and parallel platforms are shown to achieve real-time frame rates even when processing very high-resolution input images. Leap segmentation's accuracy and speed make it a highly competitive alternative to today's leading segmentation approaches for modern, real-time computer vision systems.
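The defining idea, segments that need not be spatially contiguous, can be illustrated with a toy grouping by quantized appearance alone; actual leap segmentation uses more elaborate grouping criteria than this:

```python
import numpy as np

def leap_like_labels(img_u8, bins=8):
    """Assign every pixel a segment id from its quantized colour alone,
    so a single segment can cover disconnected patches of the same
    surface (e.g. road visible on both sides of a car)."""
    q = (img_u8 // (256 // bins)).astype(np.int32)       # per-channel bin
    labels = q[..., 0] * bins * bins + q[..., 1] * bins + q[..., 2]
    return labels                                        # H x W segment ids
```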
18

Skoglund, Johan. "Robust Real-Time Estimation of Region Displacements in Video Sequences." Licentiate thesis, Linköping : Department of Electrical Engineering, Linköpings universitet, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-8006.

19

Mahammad, Sarfaraz Ahmad, and Vendrapu Sushma. "Raspberry Pi Based Vision System for Foreign Object Debris (FOD) Detection." Thesis, Blekinge Tekniska Högskola, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-20198.

Abstract:
Background: The main purpose of this research is to design and develop a cost-effective system for the detection of Foreign Object Debris (FOD), dedicated to airports. FOD detection has been a significant problem at airports, as FOD can damage aircraft. Developing such a device may require complicated hardware and software structures. The proposed solution is based on a computer vision system comprising flexible off-the-shelf components, such as a Raspberry Pi and Camera Module, allowing a simple and efficient way to detect FOD. Methods: The solution is achieved through user-centered design, which implies designing the system suitably and efficiently. The system specifications, objectives and limitations are derived from this user-centered design, and the candidate technologies are chosen from the required functionalities and constraints to obtain a real-time, efficient FOD detection system. Results: The results are obtained using background subtraction for FOD detection and an implementation of an SSD (single-shot multi-box detector) model for FOD classification. The performance of the system is analysed by testing its detection of FOD of different sizes at different distances. A web interface is also implemented to notify the user in real time when FOD occurs. Conclusions: We concluded that background subtraction and the SSD model are the most suitable algorithms for a Raspberry Pi based design to detect FOD in real time. The system performs in real time, giving an efficiency of 84% for detecting medium-sized FOD such as persons at a distance of 75 meters and 72% for detecting large-sized FOD such as cars at a distance of 125 meters; the average rate at which the system records and processes frames of the monitored area is 0.95 frames per second (fps).
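The detection stage as described (background subtraction producing candidate regions that are then passed to an SSD classifier) can be sketched with OpenCV; the SSD inference call itself is omitted and only marked:

```python
import cv2

# Gaussian-mixture background model; parameters are reasonable defaults,
# not values taken from the thesis.
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)

def fod_regions(frame, min_area=50):
    """Flag foreground blobs on the monitored surface as FOD candidates."""
    mask = subtractor.apply(frame)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN,
                            cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    boxes = [cv2.boundingRect(c) for c in contours
             if cv2.contourArea(c) >= min_area]
    # Each box would next be cropped and passed to the SSD model
    # for classification (person, car, debris, ...).
    return boxes
```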
20

Pettersson, Johan. "Real-time Object Recognition on a GPU." Thesis, Linköping University, Department of Electrical Engineering, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-10238.

Abstract:
Shape-based matching (SBM) is a known method for 2D object recognition that is rather robust against illumination variations, noise, clutter and partial occlusion. The objects to be recognized can be translated, rotated and scaled. The translation of an object is determined by evaluating a similarity measure for all possible positions (similar to cross-correlation). The similarity measure is based on dot products between normalized gradient directions in edges. Rotation and scale are determined by evaluating all possible combinations, spanning a huge search space. A resolution pyramid is used to form a heuristic for the search, which then attains real-time performance. In standard SBM, a model consisting of normalized edge gradient directions is constructed for every possible combination of rotation and scale. We avoid this by using (bilinear) interpolation in the search gradient map, which greatly reduces the amount of storage required. SBM is highly parallelizable by nature, and with our suggested improvements it becomes well suited for running on a GPU. This has been implemented and tested, and the results clearly outperform those of our reference CPU implementation (by factors in the hundreds). It is also very scalable and easily benefits from future devices without effort. Extensive evaluation material and tools for evaluating object recognition algorithms have been developed, and the implementation is evaluated and compared to two commercial 2D object recognition solutions. The results show that the method is very powerful when dealing with the distortions listed above and competes well with its opponents.
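The similarity measure is specified concretely enough to sketch: dot products between the model's normalized edge gradient directions and the search image's normalized gradients, averaged over the model points. An illustrative version, not the thesis's GPU implementation:

```python
import numpy as np
import cv2

def unit_gradients(img):
    """Normalized gradient field of a grayscale image."""
    gx = cv2.Sobel(img, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(img, cv2.CV_32F, 0, 1)
    mag = np.hypot(gx, gy) + 1e-6
    return gx / mag, gy / mag

def sbm_score(model_pts, model_dirs, search_gx, search_gy, ty, tx):
    """Mean dot product between model edge directions (model_dirs, one
    unit (dx, dy) per integer model point) and the search image's
    normalized gradients at the translated model points."""
    ys = model_pts[:, 0] + ty
    xs = model_pts[:, 1] + tx
    dots = (model_dirs[:, 0] * search_gx[ys, xs] +
            model_dirs[:, 1] * search_gy[ys, xs])
    return dots.mean()   # 1.0 = perfect match; robust to illumination
```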

21

Launila, Andreas. "Real-Time Head Pose Estimation in Low-Resolution Football Footage." Thesis, KTH, Computer Vision and Active Perception, CVAP, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-12061.

Abstract:
This report examines the problem of real-time head pose estimation in low-resolution football footage. A method is presented for inferring the head pose using a combination of footage and knowledge of the locations of the football and players. An ensemble of randomized ferns is compared with a support vector machine for processing the footage, while a support vector machine performs pattern recognition on the location data. Combining the two sources of information outperforms either in isolation. The location of the football turns out to be an important piece of information.


22

Schennings, Jacob. "Deep Convolutional Neural Networks for Real-Time Single Frame Monocular Depth Estimation." Thesis, Uppsala universitet, Avdelningen för systemteknik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-336923.

Abstract:
Vision-based active safety systems have become increasingly common in modern vehicles, estimating the depth of objects ahead for autonomous driving (AD) and advanced driver-assistance systems (ADAS). In this thesis a lightweight deep convolutional neural network performing real-time depth estimation on single monocular images is implemented and evaluated. Many of the vision-based automatic brake systems in modern vehicles only detect pre-trained object types such as pedestrians and vehicles; they fail to detect general objects such as road debris and roadside obstacles. In stereo vision systems the problem is resolved by calculating a disparity image from the stereo image pair to extract depth information. The distance to an object can also be determined using radar and LiDAR systems. Using this depth information, the system performs the actions necessary to avoid collisions with objects that are determined to be too close. However, these systems are more expensive than a regular mono camera system and are therefore not very common in the average consumer car. By implementing robust depth estimation in mono vision systems, the benefits of active safety systems could be made available to a larger segment of the vehicle fleet. This could drastically reduce traffic accidents related to human error and possibly save many lives. The network architecture evaluated in this thesis is more lightweight than other CNN architectures previously used for monocular depth estimation, and is therefore preferable on computationally constrained systems. The network solves a supervised regression problem during training in order to produce a pixel-wise depth estimation map. It was trained using sparse ground-truth images with spatially incoherent and discontinuous data, and outputs a dense, spatially coherent and continuous depth map prediction. The spatially incoherent ground truth posed a problem of discontinuity that was addressed by a masked loss function with regularization. The network was able to predict dense depth estimates on the KITTI dataset with close to state-of-the-art performance.
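The masked loss for sparse ground truth is a standard construction and can be sketched in PyTorch; the smoothness term below is a generic choice of regularizer, assumed rather than taken from the thesis:

```python
import torch

def masked_depth_loss(pred, target, smooth_weight=0.1):
    """L2 loss evaluated only where ground-truth depth exists (sparse
    LiDAR points), plus a smoothness regularizer that keeps the dense
    prediction spatially coherent between the supervised pixels."""
    valid = target > 0                                   # sparse GT mask
    data = ((pred - target)[valid] ** 2).mean()
    dx = (pred[..., :, 1:] - pred[..., :, :-1]).abs().mean()
    dy = (pred[..., 1:, :] - pred[..., :-1, :]).abs().mean()
    return data + smooth_weight * (dx + dy)
```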
23

Sällqvist, Jessica. "Real-time 3D Semantic Segmentation of Timber Loads with Convolutional Neural Networks." Thesis, Linköpings universitet, Datorseende, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-148862.

Abstract:
Volume measurement of timber loads is done in conjunction with timber trade. When dealing with goods of such major economic value, it is important to achieve an impartial and fair assessment when determining the volumes on which prices are based. With the help of Saab's missile targeting technology, CIND AB develops products for digital volume measurement of timber loads. Currently there is a system in operation that automatically reconstructs timber trucks in motion to create measurable images of them. Future iterations of the system are expected to fully automate the scaling by generating a volumetric representation of the timber and calculating its external gross volume. The first challenge towards this development is to separate the timber load from the truck. This thesis aims to evaluate and implement an appropriate method for semantic pixel-wise segmentation of timber loads in real time. Image segmentation is a classic but difficult problem in computer vision. To achieve greater robustness, it is therefore important to carefully study and make use of the conditions given by the existing system. Variations in timber type, truck type and packing together create unique combinations that the system must be able to handle. The system must work around the clock in different weather conditions while maintaining high precision and performance.
24

Mohan, Deepak. "Real-time detection of grip length deviation for fastening operations: a Mahalanobis-Taguchi system (MTS) based approach." Diss., Rolla, Mo. : University of Missouri-Rolla, 2007. http://scholarsmine.mst.edu/thesis/pdf/DeepakMohanThesisFinal_09007dcc80410b1d.pdf.

Abstract:
Thesis (M.S.)--University of Missouri--Rolla, 2007.
Vita. The entire thesis text is included in the file. Title from title screen of thesis/dissertation PDF file (viewed October 24, 2007). Includes bibliographical references.
25

Algers, Björn. "Stereo Camera Calibration Accuracy in Real-time Car Angles Estimation for Vision Driver Assistance and Autonomous Driving." Thesis, Umeå universitet, Institutionen för fysik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-149443.

Abstract:
The automotive safety company Veoneer is a producer of high-end driver visual assistance systems, but knowledge about the absolute accuracy of its dynamic calibration algorithms, which estimate the vehicle's orientation, is limited. In this thesis, a novel measurement system is proposed for gathering reference data on a vehicle's orientation as it is in motion, more specifically the pitch and roll angles of the vehicle. The focus has been to estimate how the uncertainty of the measurement system is affected by errors introduced during its construction, and to evaluate its potential as a viable tool for gathering reference data for algorithm performance evaluation. The system consisted of three laser distance sensors mounted on the body of the vehicle, and a range of data acquisition sequences with different perturbations was performed by driving along a stretch of road in Linköping with weights loaded in the vehicle. The reference data were compared to camera system data, where the bias of the calculated angles was estimated along with the dynamic behaviour of the camera system algorithms. The experimental results showed that the accuracy of the system exceeded 0.1 degrees for both pitch and roll, but no conclusions about the bias of the algorithms could be drawn, as there were systematic errors present in the measurements.
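Three body-mounted distance sensors determine the ground plane in the vehicle frame, from which pitch and roll follow directly. A minimal sketch; the mounting coordinates are placeholders, not the thesis's rig geometry, and the sign conventions are assumptions:

```python
import numpy as np

# Sensor mounting positions in the vehicle frame (x forward, y left), in
# metres. Placeholder geometry, not Veoneer's actual rig.
P = np.array([[ 2.0,  0.0],
              [-1.0,  0.8],
              [-1.0, -0.8]])

def pitch_roll(d):
    """d: three measured distances straight down to the road surface.
    Fit the ground plane z = a*x + b*y + c through the three hit points
    (each at z = -d in the body frame); a and b are the slopes along and
    across the vehicle, i.e. pitch and roll."""
    A = np.column_stack([P, np.ones(3)])
    a, b, _ = np.linalg.solve(A, -d)
    return np.degrees(np.arctan(a)), np.degrees(np.arctan(b))

print(pitch_roll(np.array([0.50, 0.52, 0.52])))  # slight nose-down pitch
```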
26

Nilsson, Linus. "Quality and real-time performance assessment of color-correction methods : A comparison between histogram-based prefiltering and global color transfer." Thesis, Mittuniversitetet, Avdelningen för informationssystem och -teknologi, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-33877.

Abstract:
In the field of computer vision, and more specifically multi-camera systems, color correction is an important topic of discussion. The need for color-tone similarity among multiple images that are used to construct a single scene is self-evident. The strengths and weaknesses of color-correction methods can be assessed by measuring structural and color-tone similarity and by timing the methods. Color transfer has better structural similarity than histogram-based prefiltering and worse color-tone similarity, and the color transfer method is faster. Color transfer is the better method if the focus is a structurally similar image after correction; if better color-tone similarity at the cost of structural similarity is acceptable, histogram-based prefiltering is a better choice. Color transfer is also easier to run with a parallel computing approach than histogram-based prefiltering, and might therefore be a better pick for real-time applications. There is, however, more room to optimize an implementation of histogram-based prefiltering utilizing parallel computing.
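The global color transfer being benchmarked is in the family of Reinhard et al.'s statistics matching, which is compact enough to sketch (assuming that variant; the thesis may use a different color space or formulation):

```python
import cv2
import numpy as np

def color_transfer(src, ref):
    """Match the per-channel mean and standard deviation of src to ref
    in Lab space, so src takes on ref's overall color tone."""
    s = cv2.cvtColor(src, cv2.COLOR_BGR2LAB).astype(np.float32)
    r = cv2.cvtColor(ref, cv2.COLOR_BGR2LAB).astype(np.float32)
    for c in range(3):
        s_mean, s_std = s[..., c].mean(), s[..., c].std() + 1e-6
        r_mean, r_std = r[..., c].mean(), r[..., c].std()
        s[..., c] = (s[..., c] - s_mean) * (r_std / s_std) + r_mean
    out = np.clip(s, 0, 255).astype(np.uint8)
    return cv2.cvtColor(out, cv2.COLOR_LAB2BGR)
```

Each channel is a global affine map, which is why the method parallelizes trivially across pixels.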
27

Ding, Yuhua. "An integrated approach to real-time multisensory inspection with an application to food processing." Diss., Available online, Georgia Institute of Technology, 2003:, 2003. http://etd.gatech.edu/theses/available/etd-11242003-180728/unrestricted/dingyuhu200312.pdf.

Abstract:
Thesis (Ph. D.)--Electrical and Computer Engineering, Georgia Institute of Technology, 2004.
Vachtsevanos, George J., Committee Chair; Dorrity, J. Lewis, Committee Member; Egerstedt, Magnus, Committee Member; Heck-Ferri, Bonnie S., Committee Co-Chair; Williams, Douglas B., Committee Member; Yezzi, Anthony J., Committee Member. Includes bibliography.
28

Alberts, Stefan Francois. "Real-time Software Hand Pose Recognition using Single View Depth Images." Thesis, Stellenbosch : Stellenbosch University, 2014. http://hdl.handle.net/10019.1/86442.

Abstract:
Thesis (MEng)--Stellenbosch University, 2014.
The fairly recent introduction of low-cost depth sensors such as Microsoft's Xbox Kinect has encouraged a large amount of research on the use of depth sensors for many common computer vision problems. Depth images are advantageous over normal colour images because of how easily objects in a scene can be segregated in real-time. Microsoft used the depth images from the Kinect to successfully separate multiple users and track various larger body joints, but has difficulty tracking smaller joints such as those of the fingers. This is a result of the low resolution and noisy nature of the depth images produced by the Kinect. The objective of this project is to use the depth images produced by the Kinect to remotely track the user's hands and to recognise static hand poses in real-time. Such a system would make it possible to control an electronic device from a distance without the use of a remote control. It can be used to control computer systems during computer-aided presentations, translate sign language and provide more hygienic control devices in clean rooms such as operating theatres and electronic laboratories. The proposed system uses the open-source OpenNI framework to retrieve the depth images from the Kinect and to track the user's hands. Random Decision Forests are trained using computer-generated depth images of various hand poses and used to classify the hand regions from a depth image. The region images are processed using a Mean-Shift based joint estimator to find the 3D joint coordinates. These coordinates are finally used to classify the static hand pose using a Support Vector Machine trained using the libSVM library. The system achieves a final accuracy of 95.61% when tested against synthetic data and 81.35% when tested against real-world data.
29

Modi, Kalpesh Prakash. "Vision application of human robot interaction : development of a ping pong playing robotic arm /." Link to online version, 2005. https://ritdml.rit.edu/dspace/handle/1850/943.

30

Ärleryd, Sebastian. "Realtime Virtual 3D Image of Kidney Using Pre-Operative CT Image for Geometry and Realtime US-Image for Tracking." Thesis, Uppsala universitet, Avdelningen för visuell information och interaktion, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-234991.

Abstract:
In this thesis a method is presented to provide a 3D visualization of the human kidney and surrounding tissue during kidney surgery. The method takes advantage of the high detail of 3D X-ray computed tomography (CT) and the high time resolution of ultrasonography (US). By extracting the geometry from a single preoperative CT scan and animating the kidney by tracking its position in real-time US images, a 3D visualization of the surgical volume can be created. The first part of the project consisted of building an imaging phantom as a simplified model of the human body around the kidney. It consists of three parts: a shell part representing surrounding tissue, a kidney part representing the kidney soft tissue, and a kidney stone part embedded in the kidney part. The shell and soft-tissue kidney parts were cast with a mixture of the synthetic polymer polyvinyl alcohol (PVA) and water, and the kidney stone part was cast with epoxy glue. All three parts were designed to look like human tissue in CT and US images. The method is a pipeline of stages that starts with acquiring the CT image as a 3D matrix of intensity values. This matrix is then segmented, resulting in separate polygonal 3D models for the three phantom parts. A scan of the model is then performed using US, producing a sequence of US images. A computer program extracts easily recognizable image feature points from the images in the sequence. Knowing the spatial position and orientation of a new US image in which these features can be found again allows the position of the kidney to be calculated. The presented method is realized as a proof-of-concept implementation of the pipeline. The implementation displays an interactive visualization where the kidney is positioned according to a user-selected US image scanned for image features. Using the proof-of-concept implementation as a guide, the accuracy of the proposed method is estimated to be bounded by the acquired image data. For high-resolution CT and US images, the accuracy can be in the order of a few millimeters.
APA, Harvard, Vancouver, ISO, and other styles
31

Somani, Nikhil [Verfasser], Alois C. [Akademischer Betreuer] Knoll, Torsten [Gutachter] Kröger, and Alois C. [Gutachter] Knoll. "Constraint-based Approaches for Robotic Systems: from Computer Vision to Real-Time Robot Control / Nikhil Somani ; Gutachter: Torsten Kröger, Alois C. Knoll ; Betreuer: Alois C. Knoll." München : Universitätsbibliothek der TU München, 2018. http://d-nb.info/1172414947/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
32

Julin, Fredrik. "Vision based facial emotion detection using deep convolutional neural networks." Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-42622.

Full text
Abstract:
Emotion detection, also known as facial expression recognition, is the task of mapping an emotion to some form of input data taken from a human. This is a powerful tool for extracting valuable information from individuals, usable for many different purposes, ranging from medical conditions such as depression to customer feedback. To solve the problem of facial expression recognition, smaller subtasks are required, and together they form the complete system. Breaking down the larger task, one can think of these subtasks as a pipeline that implements the necessary steps for classifying some input and producing an output in the form of an emotion. In recent times, with the rise of computer vision, images are often used as input for these systems and have shown great promise in assisting the task of facial expression recognition, as the human face conveys the subject's emotional state and contains more information than other inputs, such as text or audio. Many of the current state-of-the-art systems combine computer vision with another rising field, namely AI, or more specifically deep learning. These deep learning methods in many cases use a special form of neural network, the convolutional neural network, that specializes in extracting information from images, and then perform classification using the SoftMax function, which acts as the last stage before the output of the facial expression pipeline. This thesis work has explored these methods of using convolutional neural networks to extract information from images, and builds upon them by exploring a set of machine learning algorithms that replace the more commonly used SoftMax function as a classifier, in an attempt to increase not only the accuracy but also the efficient use of computational resources. The work also explores different techniques for the face detection subtask in the pipeline by comparing two approaches. One of these approaches is more frequently used in the state of the art and is said to be more viable for possible real-time applications, namely the Viola-Jones algorithm. The other is a deep learning approach using a state-of-the-art convolutional neural network to perform the detection, in many cases speculated to be too computationally intense to run in real time. By applying a newly developed, state-of-the-art-inspired convolutional neural network together with the SoftMax classifier, the final performance did not reach state-of-the-art accuracy. However, the machine learning classifiers used show promise and surpass the SoftMax function in performance in several cases when given a far smaller number of training samples. Furthermore, the results from implementing and testing a pure deep learning approach, using deep learning algorithms for both the detection and classification stages of the pipeline, show that deep learning might outperform the classic Viola-Jones algorithm in terms of both detection rate and frames per second.
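A minimal sketch of the classifier-swap idea explored above: take the penultimate-layer activations of a convolutional network as features and fit an SVM in place of the SoftMax stage. The network architecture, 48x48 input size and random data below are placeholders, not the thesis's actual models.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.svm import SVC

# A small CNN; the final layer would normally feed a SoftMax. Here we expose
# the penultimate activations as features for an external classifier.
class TinyEmotionCNN(nn.Module):
    def __init__(self, n_feat=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.fc = nn.Linear(32 * 12 * 12, n_feat)  # assumes 48x48 face crops

    def forward(self, x):
        h = self.conv(x).flatten(1)
        return torch.relu(self.fc(h))              # feature vector, no SoftMax

net = TinyEmotionCNN().eval()
faces = torch.randn(100, 1, 48, 48)                # placeholder face crops
labels = np.random.randint(0, 7, size=100)         # placeholder emotion labels

with torch.no_grad():
    feats = net(faces).numpy()

# The SVM replaces the SoftMax layer as the final classifier in the pipeline.
clf = SVC(kernel="rbf").fit(feats, labels)
print(clf.predict(feats[:5]))
```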
APA, Harvard, Vancouver, ISO, and other styles
33

Mercado-Ravell, Diego Alberto. "Autonomous navigation and teleoperation of unmanned aerial vehicles using monocular vision." Thesis, Compiègne, 2015. http://www.theses.fr/2015COMP2239/document.

Full text
Abstract:
This work addresses, both theoretically and practically, the most relevant topics in autonomous and semi-autonomous UAV navigation. In keeping with the multidisciplinary nature of the problems studied, a wide range of techniques and theories are covered in the fields of robotics, automatic control, computer science, computer vision and embedded systems, among others. As part of this thesis, two experimental platforms were developed to validate the proposed theory for autonomous UAV navigation. The first prototype, developed in the laboratory, is a quadrotor specially designed for outdoor applications. The second platform consists of a low-cost AR.Drone quadrotor made by Parrot. The vehicle is wirelessly connected to a ground station running the Robot Operating System (ROS) and dedicated to testing, in an easy, fast and safe way, the proposed vision algorithms and control strategies. The first work carried out was based on data fusion to estimate the position of the drone using inertial sensors and GPS. Two strategies were studied and applied: the Extended Kalman Filter (EKF) and the Particle Filter (PF). Both approaches take into account noisy measurements of the UAV's position, velocity and orientation. A numerical validation was performed to test the performance of the algorithms. One task of this thesis was to design control algorithms for trajectory tracking and teleoperation. To this end, a control law based on the second-order Sliding Mode approach was proposed. This control technique allows the quadrotor to track desired trajectories and to avoid frontal collisions when necessary. Since the AR.Drone platform is equipped with an attitude autopilot, the desired roll and pitch angles were used as control inputs. The proposed control algorithm adds robustness to the closed-loop system. In addition, a new monocular computer vision technique was used to localize the drone. The visual information is fused with the drone's inertial measurements to obtain a good estimate of its position. This technique uses the PTAM (Parallel Tracking and Mapping) algorithm, which produces a cloud of feature points in the image relative to a scene serving as a reference. This algorithm does not require targets, markers or well-defined scenes. The contribution of this methodology was to use the sparse point cloud to detect possible obstacles in front of the vehicle. With this information, a control algorithm was proposed to perform obstacle avoidance. This control law uses potential fields to compute a repulsive force applied to the drone. Real-time experiments demonstrated the good performance of the proposed system. These results motivated the design and development of a drone capable of safely interacting with humans and following them autonomously. A Haar cascade classifier was used to detect a person's face.
Once the face is detected, a Kalman Filter (KF) is used to improve the detection, together with an algorithm to estimate the relative position of the face. To regulate the position of the drone and keep it at a desired distance from the face, a linear control law was used.
The present document addresses, theoretically and experimentally, the most relevant topics for Unmanned Aerial Vehicles (UAVs) in autonomous and semi-autonomous navigation. In accordance with the multidisciplinary nature of the studied problems, a wide range of techniques and theories are covered in the fields of robotics, automatic control, computer science, computer vision and embedded systems, among others. As part of this thesis, two different experimental platforms were developed in order to explore and evaluate various theories and techniques of interest for autonomous navigation. The first prototype is a quadrotor specially designed for outdoor applications and was fully developed in our lab. The second testbed is composed of an inexpensive commercial AR.Drone quadrotor, wirelessly connected to a ground station equipped with the Robot Operating System (ROS), and specially intended to test computer vision algorithms and automatic control strategies in an easy, fast and safe way. In addition, this work provides a study of data fusion techniques aiming to enhance the UAV pose estimation provided by commonly used sensors. Two strategies are evaluated in particular, an Extended Kalman Filter (EKF) and a Particle Filter (PF). Both estimators are adapted for the system under consideration, taking into account noisy measurements of the UAV position, velocity and orientation. Simulations show the performance of the developed algorithms while adding noise from real GPS (Global Positioning System) measurements. Safe and accurate navigation for either autonomous trajectory tracking or haptic teleoperation of quadrotors is presented as well. A second-order Sliding Mode (2-SM) control algorithm is used to track trajectories while avoiding frontal collisions in autonomous flight. The time-scale separation of the translational and rotational dynamics allows us to design position controllers by giving desired references in the roll and pitch angles, which is suitable for quadrotors equipped with an internal attitude controller. The 2-SM control adds robustness to the closed-loop system, and a Lyapunov-based analysis proves the system stability. Vision algorithms are employed to estimate the pose of the vehicle using only monocular SLAM (Simultaneous Localization and Mapping) fused with inertial measurements. Distance to potential obstacles is detected and computed using the sparse depth map from the vision algorithm. For teleoperation tests, a haptic device is employed to feed back information to the pilot about possible collisions, by exerting opposing forces. The proposed strategies are successfully tested in real-time experiments, using a low-cost commercial quadrotor. Also, the conception and development of a Micro Aerial Vehicle (MAV) able to safely interact with human users by following them autonomously is achieved in the present work. Once a face is detected by means of a Haar cascade classifier, it is tracked by applying a Kalman Filter (KF), and an estimate of the relative position with respect to the face is obtained at a high rate. A linear Proportional Derivative (PD) controller regulates the UAV's position in order to keep a constant distance to the face, employing as well the extra available information from the embedded UAV sensors.
Several experiments were carried out under different conditions, showing good performance even under disadvantageous scenarios such as outdoor flight, and robustness against illumination changes, wind perturbations, image noise and the presence of several faces in the same image. Finally, this thesis deals with the problem of implementing a safe and fast transportation system using a quadrotor UAV with a cable-suspended load. The objective consists in transporting the load from one place to another in a fast way and with minimal swing in the cable.
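A minimal numpy sketch of the potential-field repulsion used for obstacle avoidance, assuming the sparse 3D points come from the monocular SLAM map; the gains, influence radius and the point cloud itself are illustrative, not the thesis's tuned values.

```python
import numpy as np

def repulsive_force(obstacles, uav_pos, d0=2.0, k=1.0):
    """Classic potential-field repulsion.

    obstacles: (N, 3) array of 3D points from the sparse SLAM map.
    uav_pos:   (3,) current vehicle position.
    d0:        influence radius; points farther than this exert no force.
    k:         repulsion gain (illustrative value).
    """
    force = np.zeros(3)
    for p in obstacles:
        diff = uav_pos - p
        d = np.linalg.norm(diff)
        if 1e-6 < d < d0:
            # Magnitude grows as the obstacle gets closer, vanishes at d0.
            force += k * (1.0 / d - 1.0 / d0) / d**2 * (diff / d)
    return force

cloud = np.random.uniform(-5, 5, size=(100, 3))  # placeholder point cloud
print(repulsive_force(cloud, np.zeros(3)))
```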
APA, Harvard, Vancouver, ISO, and other styles
34

Lo, Haw-Jing. "Real-time stereoscopic vision system." Thesis, Georgia Institute of Technology, 2003. http://hdl.handle.net/1853/14911.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Chambers, Simon Paul. "TIPS : a transputer based real-time vision system." Thesis, University of Liverpool, 1990. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.333629.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Clare, Anthony Joseph. "Real-time modelling and sensor fusion for a synthetic vision system." Thesis, University of Sheffield, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.434515.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Chen, Sicheng. "A single-chip real-Time range finder." Diss., Texas A&M University, 2004. http://hdl.handle.net/1969.1/553.

Full text
Abstract:
Range finders are widely used in various industrial applications, such as machine vision, collision avoidance, and robotics. Presently most range finders rely either on active transmitters or on sophisticated mechanical controllers and powerful processors to extract range information, which makes them costly, bulky, or slow, and limits their applications. This dissertation is a detailed description of a real-time vision-based range sensing technique and its single-chip CMOS implementation. To the best of our knowledge, this system is the first single-chip vision-based range finder that doesn't need any mechanical position adjustment, memory or digital processor. The entire signal processing on the chip is purely analog and occurs in parallel. The chip captures the image of an object and extracts the depth and range information from just a single picture. The on-chip, continuous-time, logarithmic photoreceptor circuits are used to couple spatial image signals into the range-extracting processing network. The photoreceptor pixels can adjust their operating regions, simultaneously achieving high sensitivity and wide dynamic range. The image sharpness processor and Winner-Take-All circuits are characterized and analyzed carefully for their temporal bandwidth and detection performance. The mathematical and optical models of the system are built and carefully verified. A prototype based on this technique has been fabricated and tested. The experimental results prove that the range finder can achieve acceptable range sensing precision with low cost and excellent speed performance in short-to-medium range coverage. Therefore, it is particularly useful for collision avoidance.
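The chip performs its sharpness measurement and Winner-Take-All selection with parallel analog circuits, but the processing chain can be mimicked in software to convey the idea. The sketch below is a loose software analogue under stated assumptions: a Laplacian-energy sharpness measure, column-wise aggregation, and a hypothetical position_to_range calibration function standing in for the mapping that the real device's optics fix in hardware.

```python
import numpy as np
from scipy.ndimage import laplace, uniform_filter

def wta_range(image, position_to_range):
    """Loose software analogue of the chip's pipeline: per-pixel sharpness
    energy, smoothed and aggregated per column, then a winner-take-all
    choice of the strongest response.

    position_to_range: hypothetical calibration function mapping the winning
    column index to a range value."""
    sharpness = laplace(image.astype(float)) ** 2              # local sharpness
    evidence = uniform_filter(sharpness, size=7).sum(axis=0)   # per-column score
    winner = int(np.argmax(evidence))                          # winner-take-all
    return position_to_range(winner)
```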
APA, Harvard, Vancouver, ISO, and other styles
38

Lu, Qiang. "A Real-Time System for Color Sorting Edge-Glued Panel Parts." Thesis, Virginia Tech, 1994. http://hdl.handle.net/10919/35881.

Full text
Abstract:
This thesis describes the development of a software system for color sorting hardwood edge-glued panel parts. Conceptually, this system can be broken down into three separate processing steps. The first step is to segment color images of each of the two part faces into background and part. The second step involves extracting color information from each region labeled part and using this information to classify each part face as one of a pre-selected number of color classes plus an out class. The third step involves using the two face labels and some distance information to determine which part face is the better one to use as the face of an edge-glued panel. Since a part face is illuminated while the background is not, the segmentation into background and part can be done using very simple computational methods. The color classification component of this system is based on the Trichromatic Color Theory. It uses an estimate of a part's 3-dimensional (3-D) color probability function, P, to characterize the surface color of the part. Each color class is also represented by an estimate of the 3-D color probability function that describes the permissible distribution of colors within this color class. Let P_omega_i denote the estimated probability function for color class omega_i. Classification is accomplished by finding the color difference between the estimated color probability function for the part and each of the estimated 3-D color probability functions that represent the color classes. The distance function used is the sum of the absolute values of the differences between the elements of the estimated probability function for a class and the estimated probability function of the part. The sample is given the label of the color class to which it is closest if this distance is less than some class-specific threshold for that class. If the distance to the closest class is larger than the threshold for that class, the part is called an out. This supervised classification procedure first requires one to select training samples from each of the color classes to be considered. These training samples are used to generate P_omega_i for each color class omega_i and to establish the value of the threshold T_i that is used to determine when a part is an out. To aid in determining which part face is better to use in making a panel, the system allows one to prioritize the various color classes so that one or more color classes can have the same priority. Using these priorities, the labels for each of the part faces, and the distance from each part face's estimated probability function to the estimated probability function of the class to which that face was assigned, the decision logic selects the "better" face. If the two part faces are assigned to color classes that have different priorities, the part face assigned to the color class with higher priority is chosen as the better face. If the two part faces have been assigned to the same color class or to two different classes having the same priority, the part face that is closest to the estimated probability function of the color class to which it has been assigned is chosen to be the better face. Finally, if both faces are labeled out, the part becomes an out part. This software system has been implemented on a prototype machine vision system that has undergone several months of in-plant testing.
To date the system has only been tested on one type of material, southern red oak, with which it has proven itself capable of significantly outperforming humans in creating high-quality edge-glued panels. Since southern red oak has significantly more color variation than any other hardwood type or species, it is believed that this system will work very well on any hardwood material.
Master of Science
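A minimal numpy sketch of the classification and face-selection logic described above: estimated 3-D color probability functions as normalized RGB histograms, an L1 (sum of absolute differences) distance with per-class thresholds, and priority-based better-face selection. The bin count, thresholds and class names are placeholders.

```python
import numpy as np

def color_hist(pixels, bins=8):
    """Estimate a face's 3-D color probability function as a normalized
    RGB histogram (bins^3 cells); pixels is an (N, 3) array."""
    h, _ = np.histogramdd(pixels, bins=(bins,) * 3, range=((0, 256),) * 3)
    return h / h.sum()

def classify_face(face_hist, class_hists, thresholds):
    """L1 distance to each class prototype; 'out' if the distance to the
    closest class exceeds that class's threshold."""
    dists = {c: np.abs(face_hist - p).sum() for c, p in class_hists.items()}
    best = min(dists, key=dists.get)
    if dists[best] > thresholds[best]:
        return "out", dists[best]
    return best, dists[best]

def better_face(face1, face2, priority):
    """Pick the better of two classified faces.

    face1/face2: (label, distance) pairs; priority: dict label -> rank,
    lower rank = higher priority, with 'out' ranked last."""
    (l1, d1), (l2, d2) = face1, face2
    if priority[l1] != priority[l2]:
        return face1 if priority[l1] < priority[l2] else face2
    return face1 if d1 <= d2 else face2  # same priority: smaller distance wins
```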
APA, Harvard, Vancouver, ISO, and other styles
39

Gornall, Matthew James. "A Real-time Computer Vision System for tracking the face and hands /." Leeds : University of Leeds, School of Computer Studies, 2008. http://www.comp.leeds.ac.uk/fyproj/reports/0708/Gornall.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Rao, Niankun. "A novel high-speed stereo-vision system for real-time position sensing." Thesis, University of British Columbia, 2011. http://hdl.handle.net/2429/39637.

Full text
Abstract:
Real-time position sensing has a wide range of applications in motion control systems, parts inspection and general metrology. Vision-based position sensing systems have significant advantages over other sensing methods, including large measurement volume, non-contact sensing, and simultaneous measurement in multiple degrees-of-freedom (DOF). Existing vision-based position sensing solutions are limited by low sampling frequency and low position accuracy. This thesis presents the theory, design, implementation and calibration of a new high-speed stereo-vision camera system for real-time position sensing based on CMOS image sensors. By reading small regions around each target image rather than the full frame data of the sensor, the frame rate and image processing speed are vastly increased. A high-speed camera interface is designed based on Camera Link technology, which allows a maximum continuous data throughput of 2.3Gbps. In addition, this stereo-vision system also includes fixed pattern noise (FPN) correction, threshold processing, and sub-pixel target position interpolation. In order to achieve high position accuracy, this system is calibrated to determine its model parameters. The primary error sources in this system include target image noise, mechanical installation error and lens distortion. The image sensor is characterized, and its FPN data is extracted, by experiment. The mechanical installation error and lens distortion parameters are identified through camera calibration. The proposed camera calibration method uses the 3D position reconstruction error as its cost function in the iterative optimization. The optimization of linear and nonlinear parameters is decoupled. By these means, better estimation of model parameters is achieved. To verify the performance of the proposed calibration method, it is compared with a traditional single camera calibration method in simulation and experiment. The results show that the proposed calibration method gives better parameter estimation than the traditional single camera calibration method. The experimental results indicate that the prototype system is capable of measuring 8 targets in 3-DOF at a sampling frequency of 8kHz. Comparison with a coordinate measurement machine (CMM) shows that the prototype system achieves a 3D position accuracy of 18μm (RMS) over a range of 400mm by 400mm by 15mm, with a resolution of 2μm.
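A minimal sketch of the 3D-reconstruction-error cost described above, assuming standard OpenCV conventions: triangulate each target from its two measured image positions using the current camera-parameter estimate, then score against known reference positions (e.g. from a CMM). The matrix sources and array names are illustrative.

```python
import numpy as np
import cv2

def reconstruction_error(P1, P2, pts1, pts2, ref_xyz):
    """Cost for the calibration loop: triangulate each target from its two
    image projections and compare with its known 3D reference position.

    P1, P2:  3x4 camera projection matrices built from the current
             parameter estimate.
    pts1/2:  (N, 2) measured target centroids in each camera.
    ref_xyz: (N, 3) reference positions (e.g. CMM measurements).
    """
    X = cv2.triangulatePoints(P1, P2,
                              pts1.T.astype(np.float64),
                              pts2.T.astype(np.float64))
    X = (X[:3] / X[3]).T                                   # de-homogenise
    return np.sqrt(((X - ref_xyz) ** 2).sum(axis=1)).mean()  # mean 3D error
```

An optimizer (e.g. scipy.optimize.minimize over the nonlinear parameters, with the linear ones solved in closed form, following the decoupling the abstract describes) would drive this cost toward zero.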
APA, Harvard, Vancouver, ISO, and other styles
41

Alhamwi, Ali. "Co-design hardware/software of real time vision system on FPGA for obstacle detection." Thesis, Toulouse 3, 2016. http://www.theses.fr/2016TOU30342/document.

Full text
Abstract:
Obstacle detection, obstacle localization and 2D occupancy map reconstruction are basic functions for a robot navigating in an indoor environment and interacting with objects in a cluttered space. Solutions based on computer vision and commonly used, such as SLAM (simultaneous localization and mapping) or optical flow, tend to be computationally intensive. These solutions require powerful computing resources to meet real-time constraints. We present a hardware architecture for obstacle detection, localization and 2D occupancy map reconstruction in real time. The proposed system is realized using an FPGA-based (field programmable gate array) vision architecture and odometry sensors for obstacle detection, localization and mapping. Fusing these two complementary sources of information results in an improved model of the environment around the robot. The proposed architecture is a low-cost system with reduced computation time, high image throughput and low power consumption.
Obstacle detection, localization and occupancy map reconstruction are essential abilities for a mobile robot to navigate in an environment. Solutions based on passive monocular vision such as simultaneous localization and mapping (SLAM) or optical flow (OF) require intensive computation. Systems based on these methods often rely on over-sized computation resources to meet real-time constraints. Inverse perspective mapping allows for obstacle detection at a low computational cost under the hypothesis of a flat ground observed during motion. It is thus possible to build an occupancy grid map by integrating obstacle detections along the course of the sensor. In this work we propose a hardware/software system for obstacle detection, localization and 2D occupancy map reconstruction in real time. The proposed system uses an FPGA-based design for vision and proprioceptive sensors for localization. Fusing this information allows for the construction of a simple model of the environment surrounding the sensor. The resulting architecture is a low-cost, low-latency, high-throughput and low-power system.
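A minimal OpenCV sketch of the flat-ground inverse-perspective test that underlies this kind of detector: warp two frames to a bird's-eye view and difference them. Under the flat-world hypothesis the ground matches; anything protruding from the plane leaves a large residual. The homography H, grid size and threshold are placeholders (H would come from the calibrated camera pose), and odometry-based ego-motion compensation is assumed to have already aligned the two views.

```python
import cv2
import numpy as np

# Placeholder homography mapping image pixels to ground-plane coordinates;
# in a real system it is derived from the camera's calibrated pose.
H = np.eye(3)

def ipm_obstacle_mask(prev_gray, cur_gray, thresh=30):
    """Warp two grayscale frames to a bird's-eye view and difference them;
    returns a binary mask of candidate obstacle cells to fuse into the
    occupancy grid. Assumes ego-motion compensation is already applied."""
    size = (400, 400)                                  # bird's-eye grid, pixels
    top_prev = cv2.warpPerspective(prev_gray, H, size)
    top_cur = cv2.warpPerspective(cur_gray, H, size)
    diff = cv2.absdiff(top_prev, top_cur)              # ground cancels out
    _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    return mask
```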
APA, Harvard, Vancouver, ISO, and other styles
42

Lu, Siliang. "Dynamic HVAC Operations Based on Occupancy Patterns With Real-Time Vision- Based System." Research Showcase @ CMU, 2017. http://repository.cmu.edu/theses/132.

Full text
Abstract:
An integrated heating, ventilation and air-conditioning (HVAC) system is one of the most important components in determining the energy consumption of an entire building. For commercial buildings, particularly office buildings and schools, the heating and cooling loads are largely dependent on occupant behavioral patterns such as occupancy rates and activities. Therefore, if HVAC systems can respond to dynamic occupancy profiles, there is a large potential to reduce energy consumption. However, most existing HVAC systems operate without the ability to adjust the supply air rate in response to the dynamic profiles of occupants. Due to this inefficiency, much of the HVAC energy use is wasted, particularly when the conditioned spaces are unoccupied or under-occupied (fewer occupants than the intended design). The solution to this inefficiency is to control the HVAC system based on dynamic occupant profiles. Motivated by this, the research provides a real-time vision-based occupant pattern recognition system for occupancy counting as well as activity-level classification. The proposed vision-based system is integrated into an existing HVAC simulation model of a U.S. office building to investigate the level of energy savings as well as the thermal comfort improvement compared to a traditional HVAC control system. The research is divided into two parts. The first part uses an open-source neural-network library for real-time occupant counting, and a background subtraction method for activity-level classification, with a common static RGB camera. The second part utilizes a DOE reference office building model with customized dynamic occupancy schedules, including the number-of-occupants schedule, activity schedule and clothing insulation schedule, to identify the potential energy savings compared with a conventional HVAC control system. The results revealed that vision-based systems can detect occupants and classify activity level in real time with an accuracy of around 90% when there are few occlusions, and that dynamic occupant schedules can indeed bring about energy savings. Details of the vision-based system, methodology, simulation configurations and results are presented in the thesis, as well as potential opportunities for use throughout multiple types of commercial buildings, with a specific focus on offices and educational institutions.
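A minimal sketch of the activity-level stage, assuming OpenCV's MOG2 background subtractor in place of whatever subtraction method the thesis used; the moving-pixel thresholds that separate activity levels are illustrative.

```python
import cv2
import numpy as np

# Foreground pixel count per frame serves as a crude activity-level signal
# for the occupants already detected by the counting stage.
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=False)

def activity_level(frame, low=0.01, high=0.05):
    """Classify activity from the fraction of moving pixels in the frame.

    low/high are illustrative cut-offs on the moving-pixel ratio."""
    fg = subtractor.apply(frame)
    ratio = np.count_nonzero(fg) / fg.size
    if ratio < low:
        return "sedentary"
    if ratio < high:
        return "light activity"
    return "active"
```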
APA, Harvard, Vancouver, ISO, and other styles
43

McCarthy, Cheryl. "Automatic non-destructive dimensional measurement of cotton plants in real-time by machine vision." University of Southern Queensland, Faculty of Engineering and Surveying, 2009. http://eprints.usq.edu.au/archive/00006228/.

Full text
Abstract:
Abstract:
Pressure on water resources in Australia necessitates improved application of water to irrigated crops. Cotton is one of Australia's major crops, but is also a large water user. On-farm water savings can be achieved by irrigating via large mobile irrigation machines (LMIMs), which are capable of implementing deficit strategies and varying water application to 1 m2 resolution. However, irrigation amounts are commonly held constant throughout a field despite differing water requirements for different areas of a crop due to spatial variability of soil, microclimate and crop properties. This research has developed a non-destructive cotton plant dimensional measurement system, capable of mounting on a LMIM and streaming live crop measurement data to a variable-rate irrigation controller. The sensor is a vision system that measures the cotton plant attribute of internode length, i.e. the distance between main stem nodes (or branch junctions) on the plant's main stem, which is a significant indicator of plant water stress. The vision system consisted of a Sony camcorder (deinterlaced image size 720 × 288 pixels) mounted behind a transparent panel that moved continuously through the crop canopy. The camera and transparent panel were embodied in a contoured fibreglass camera enclosure (dimensions 535 mm × 555 mm × 270 mm wide) that utilised the natural flexibility of the growing foliage to firstly contact the plant, such that the top five nodes of the plant were in front of the transparent panel, and then smoothly and non-destructively guide the plant under the curved bottom surface of the enclosure. By forcing the plant into a fixed object plane (the transparent panel), reliable geometric measurement was possible without the use of stereo vision. Motorisation of the camera enclosure enabled conveyance both across and along the crop rows using an in-field chassis. A custom image processing algorithm was developed to automatically extract internode distance from the images collected by the camera, and comprised both single-frame and sequential-frame analyses. Single-frame processing consisted of detecting lines corresponding to branches and calculating the intersection of the detected lines with the main stem to estimate candidate node positions. Calculation of the 'vesselness' function for each pixel using the Hessian matrix eigenvalues determined whether the pixel was likely to belong to a stem (i.e. a curvilinear structure). Large areas of connected high-vesselness pixels were identified as branches. For each branch area, centre points were determined by solving the second-order Taylor polynomial in the direction perpendicular to the line direction. The main stem was estimated with a linear Hough transform on the branch centre points within the image. Lines were then fitted to the centre points of other branch segments using the hop-along line-fitting algorithm and these lines were selectively projected to the main stem to estimate candidate node positions. The automatically-identified node positions corresponded to manual position measurements made on the source images. Within individual images, leaf edges were erroneously detected as candidate nodes ('false positives') and contributed up to 22% of the total number of detected candidate nodes. However, a grouping algorithm based on a Delaunay Triangulation mesh of the candidate node positions was used to remove the largely-random false positives and to create accurate candidate node trajectories.
The internode distance measurement was then calculated as the maximum value between detected trajectories, which corresponded to when the plant was closest to the transparent panel. From 168 video sequences of fourteen plants, 95 internode lengths were automatically detected at an average rate of one internode length per 1.75 plants for across-row measurement, and one internode length per 3.3 m for along-row measurement. Comparison with manually-measured internode lengths yielded a correlation coefficient of 0.86 for the automatic measurements and an average standard error in measurement of 3.0 mm with almost zero measurement bias. The second and third internode distances were most commonly detected by the vision system. The greatest number of measurements was obtained with the camera facing north or south, on a partially cloudy day in which the sunlight was diffused. Heliotropic effects and overexposed image background reduced image quality when the camera faced east or west. Night-time images, captured with 850 nm LED illumination, provided as many measurements as the corresponding daytime measurements. Along-row camera enclosure speeds up to 0.20 m/s yielded internode lengths using the current image processing algorithms and hardware. Calculations based on a field programmable gate array (FPGA) implementation indicated an overall algorithm run-time of 46 ms per frame, which is suitable for real-time application. It is concluded that field measurement of cotton plant internode length is possible using a moving, plant-contacting camera enclosure; that the presence of occlusions and other foliage edges can be overcome by analysing the sequence of images; and that real-time in-field operation is achievable.
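A minimal numpy/scipy sketch of the per-pixel vesselness computation described above, following the standard Frangi-style use of Hessian eigenvalues; the scale and sensitivity parameters are illustrative, axis 0 is taken as image rows, and the bright-on-dark sign convention is an assumption.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def vesselness(image, sigma=2.0, beta=0.5, c=15.0):
    """Frangi-style 2D vesselness: curvilinear structures (stems, branches)
    score high; blobs and flat regions score low."""
    img = image.astype(float)
    # Second-order Gaussian derivatives form the Hessian at each pixel
    # (axis 0 = rows/y, axis 1 = columns/x).
    Hxx = gaussian_filter(img, sigma, order=(0, 2))
    Hyy = gaussian_filter(img, sigma, order=(2, 0))
    Hxy = gaussian_filter(img, sigma, order=(1, 1))
    # Eigenvalues of [[Hxx, Hxy], [Hxy, Hyy]], ordered so |l1| <= |l2|.
    tmp = np.sqrt((Hxx - Hyy) ** 2 + 4 * Hxy ** 2)
    l1, l2 = (Hxx + Hyy + tmp) / 2, (Hxx + Hyy - tmp) / 2
    swap = np.abs(l1) > np.abs(l2)
    l1, l2 = np.where(swap, l2, l1), np.where(swap, l1, l2)
    Rb = np.abs(l1) / (np.abs(l2) + 1e-10)       # line vs. blob measure
    S = np.sqrt(l1 ** 2 + l2 ** 2)               # second-order structureness
    v = np.exp(-Rb**2 / (2 * beta**2)) * (1 - np.exp(-S**2 / (2 * c**2)))
    # Keep bright-on-dark ridges only; flip the sign test for dark stems.
    return np.where(l2 < 0, v, 0.0)
```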
APA, Harvard, Vancouver, ISO, and other styles
44

Schofield, Nicholas Roger. "A low cost, real time robot vision system with a cluster-based learning capacity." Thesis, Durham University, 1988. http://etheses.dur.ac.uk/947/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Shen, Anqi. "A real time 3D surface measurement system using projected line patterns." Thesis, University of Bradford, 2010. http://hdl.handle.net/10454/5399.

Full text
Abstract:
This thesis is based on a research project to evaluate a quality control system for car component stamping lines. The quality control system measures the abrasion of the stamping tools by measuring the surface of the products. A 3D vision system is developed for real-time online measurement of the product surface. In this thesis, there are three main research themes. The first is to produce an industrial application. All the components of this vision system are selected from industrial products and user application software is developed. A rich human-machine interface for interaction with the vision system is developed, along with a link between the vision system and a control unit for interaction with a production line. The second research theme is to enhance the robustness of the 3D measurement. As an industrial product, this system will be deployed in different factories and should be robust against environmental uncertainties. For this purpose, a high signal-to-noise ratio is required, with the light pattern being produced by a laser projector. Additionally, multiple height calculation methods and a spatial Kalman filter are proposed for optimal height estimation. The final research theme is to achieve real-time 3D measurement. The vision system is expected to be installed on production lines for online quality inspection. A new 3D measurement method is developed which combines the spatial binary-coded method with phase-shift methods so that only a single image needs to be captured.
SHRIS (Shanghai Ro-Intelligent System Co., Ltd.)
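The phase-shift principle such systems build on can be summarized in a few lines. Below is a minimal sketch of the classic four-step variant; the thesis's contribution is a single-image scheme that combines a spatial binary code with the phase measurement, which this sketch does not reproduce, and the calibration constants are placeholders.

```python
import numpy as np

def four_step_phase(I1, I2, I3, I4):
    """Classic 4-step phase-shift recovery (shifts of 0, 90, 180, 270 deg):
        I_k = A + B * cos(phi + k * pi/2)
    so the wrapped phase follows from two simple differences:
        I4 - I2 = 2B sin(phi),  I1 - I3 = 2B cos(phi)."""
    return np.arctan2(I4 - I2, I1 - I3)   # wrapped phase in (-pi, pi]

def phase_to_height(unwrapped_phase, fringe_pitch_mm, scale):
    """Map unwrapped phase to surface height; 'scale' collapses the
    projector/camera triangulation geometry into one calibrated constant
    (a placeholder for a full calibration model)."""
    return scale * unwrapped_phase * fringe_pitch_mm / (2 * np.pi)
```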
APA, Harvard, Vancouver, ISO, and other styles
46

Walker, Ryan Christopher Gareth. "Poi Poi Revolution: A real-time feedback training system for object manipulation." Thesis, University of Canterbury. Human Interface Technology Laboratory, 2013. http://hdl.handle.net/10092/8039.

Full text
Abstract:
The affordability and availability of fast motion cameras presents an ideal opportunity to build computer systems that create real-time feedback loops. These systems can enable users to learn at a faster rate than traditional systems, as well as present a more engaging experience. In this dissertation, I document the development and evaluation of a real-time audio and visual feedback system for geometric poi manipulation. The goal of the system is to present an experiential and objectively superior learning tool when compared to traditional learning techniques in the object manipulation community. For the evaluation, I conduct an experiment that compares the feedback training system with traditional learning techniques in the object manipulation community. The results suggest that the feedback system presents a more engaging experience than traditional mirror feedback training, and conclude that further research is warranted.
APA, Harvard, Vancouver, ISO, and other styles
47

Naoulou, Abdelelah. "Architectures pour la stéréovision passive dense temps réel : application à la stéréo-endoscopie." Phd thesis, Université Paul Sabatier - Toulouse III, 2006. http://tel.archives-ouvertes.fr/tel-00110093.

Full text
Abstract:
The emergence of medical robotics in laparoscopic surgery, intended to automate procedures and improve their precision, requires the implementation of miniaturized intelligent tools and sensors, of which real-time 3D vision is one of the key challenges. Although current 3D vision systems are of definite interest for precise endoscopic surgical manipulation, they have the drawback of providing a qualitative rather than quantitative 3D image, which requires specific equipment that makes the surgical act uncomfortable and prevents coupling with a computer for assisted surgery. Within the internal project "PICASO" (Plate-forme d'Intégration de CAméras multiSenOrielles), whose scientific challenges concern the conditioning of integrated sensors and the processing and fusion of multispectral images, we developed a 3D vision device compatible with the execution times of surgical procedures. This system is based on the principle of human stereoscopy and implements dense passive stereovision algorithms derived from mobile robotics. In this thesis we present massively parallel architectures, implemented in an FPGA circuit, capable of delivering disparity images at a rate of 130 frames/sec from images with a resolution of 640x480 pixels. The algorithm used is based on Census correlation with a 7 x 7 pixel computation window. It was chosen for its performance relative to its simplicity of implementation and the possibility of parallelizing most of the computations. The main objective of this algorithm is to find, for each point, the correspondence between two input images (right and left) taken from two different viewpoints, in order to obtain a "disparity map" from which the 3D scene can be reconstructed. To implement this algorithm and meet the real-time constraints, we developed pipelined architectures (mean computation, Census transform, search for stereo-corresponding points, left-right checking, filtering, and so on). Most of the different parts that make up the architecture are described in synthesizable VHDL. Finally, we examined the consumption of FPGA resources (memories, macro-cells) as a function of the desired performance.
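A minimal numpy sketch of the Census-correlation matching that the thesis parallelizes in hardware: a 7 x 7 Census transform followed by winner-take-all matching on Hamming distance. The FPGA pipeline stages (averaging, left-right check, filtering) are omitted, borders simply wrap around, and the disparity search range is illustrative.

```python
import numpy as np

def census_transform(img, w=7):
    """Census transform: encode each pixel as a bit string comparing its
    w x w neighbourhood with the centre pixel (48 bits for w = 7)."""
    r = w // 2
    codes = np.zeros(img.shape, dtype=np.uint64)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy == 0 and dx == 0:
                continue
            # np.roll wraps at borders; a real implementation would pad.
            shifted = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
            codes = (codes << np.uint64(1)) | (shifted < img).astype(np.uint64)
    return codes

def disparity_map(left, right, max_disp=64, w=7):
    """Winner-take-all: for each pixel, pick the disparity whose right-image
    census code has the smallest Hamming distance to the left-image code."""
    cl, cr = census_transform(left, w), census_transform(right, w)
    best_cost = np.full(left.shape, 255, dtype=np.uint8)
    disp = np.zeros(left.shape, dtype=np.uint8)
    for d in range(max_disp):
        x = np.bitwise_xor(cl, np.roll(cr, d, axis=1))  # differing bits
        cost = np.zeros(left.shape, dtype=np.uint8)
        while x.any():                                  # popcount of the XOR
            cost += (x & np.uint64(1)).astype(np.uint8)
            x >>= np.uint64(1)
        better = cost < best_cost
        best_cost[better] = cost[better]
        disp[better] = d
    return disp
```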
APA, Harvard, Vancouver, ISO, and other styles
48

Xing, Xiaoliang. "Etude et realisation d'un systeme temps reel de vision par ordinateur." Université Louis Pasteur (Strasbourg) (1971-2008), 1987. http://www.theses.fr/1987STR13078.

Full text
Abstract:
Study and realization of a real-time computer vision system intended for the assembly, inspection, classification and sorting of industrial parts. Description of the system, which comprises a conventional microcomputer of classical structure and a set of specialized processors designed for image processing.
APA, Harvard, Vancouver, ISO, and other styles
49

Campbell, Jacob. "Characteristics of a real-time digital terrain database Integrity Monitor for a Synthetic Vision System." Ohio University / OhioLINK, 2001. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1177441410.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Straumann, Hugo M. "The development of a software package for low cost machine vision system for real time applications." Ohio : Ohio University, 1986. http://www.ohiolink.edu/etd/view.cgi?ohiou1183378665.

Full text
APA, Harvard, Vancouver, ISO, and other styles