Dissertations / Theses on the topic 'Computer vision algorithm'


Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Computer vision algorithm.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.

1

Anani-Manyo, Nina K. "Computer Vision and Building Envelopes." Kent State University / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=kent1619539038754026.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Mac Aodha, O. "Supervised algorithm selection for flow and other computer vision problems." Thesis, University College London (University of London), 2014. http://discovery.ucl.ac.uk/1426968/.

Full text
Abstract:
Motion estimation is one of the core problems of computer vision. Given two or more frames from a video sequence, the goal is to find the temporal correspondence for one or more points from the sequence. For dense motion estimation, or optical flow, a dense correspondence field is sought between the pair of frames. A standard approach to optical flow involves constructing an energy function and then using some optimization scheme to find its minimum. These energy functions are hand designed to work well generally, with the intention that the global minimum corresponds to the ground truth temporal correspondence. As an alternative to these heuristic energy functions we aim to assess the quality of existing algorithms directly from training data. We show that the addition of an offline training phase can improve the quality of motion estimation. For optical flow, decisions such as which algorithm to use and when to trust its accuracy, can all be learned from training data. Generating ground truth optical flow data is a difficult and time consuming process. We propose the use of synthetic data for training and present a new dataset for optical flow evaluation and a tool for generating an unlimited quantity of ground truth correspondence data. We use this method for generating data to synthesize depth images for the problem of depth image super-resolution and show that it is superior to real data. We present results for optical flow confidence estimation with improved performance on a standard benchmark dataset. Using a similar feature representation, we extend this work to occlusion region detection and present state of the art results for challenging real scenes. Finally, given a set of different algorithms we treat optical flow estimation as the problem of choosing the best algorithm from this set for a given pixel. However, posing algorithm selection as a standard classification problem assumes that class labels are disjoint. 
For each training example it is assumed that there is only one class label that correctly describes it, and that all other labels are equally bad. To overcome this, we propose a novel example dependent cost-sensitive learning algorithm based on decision trees where each label is instead a vector representing a data point's affinity for each of the algorithms. We show that this new algorithm has improved accuracy compared to other classification baselines on several computer vision problems.
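The per-pixel selection idea in the abstract above can be sketched in miniature. This is an illustration only, not the thesis's learned decision-tree model: each example carries an affinity vector over the candidate algorithms (here derived from invented per-algorithm endpoint errors), and selection picks the algorithm with the highest affinity.

```python
# Sketch of example-dependent algorithm selection: each example has an
# affinity vector over candidate algorithms rather than a single class
# label. Affinities here are a simple invented transform of errors;
# the real method learns them with cost-sensitive decision trees.

def affinities(errors, scale=1.0):
    """Turn per-algorithm errors into affinities in (0, 1]; lower error -> higher affinity."""
    return [1.0 / (1.0 + e / scale) for e in errors]

def select_algorithm(errors):
    """Pick the index of the algorithm with the highest affinity."""
    a = affinities(errors)
    return max(range(len(a)), key=lambda i: a[i])

# Endpoint errors (pixels) of three hypothetical flow algorithms at one pixel.
errors = [0.8, 0.2, 1.5]
best = select_algorithm(errors)   # index 1: lowest error, highest affinity
```

A classifier trained on such vector labels can then prefer a "second-best" algorithm when the top choices are nearly tied, which a disjoint class label cannot express.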
APA, Harvard, Vancouver, ISO, and other styles
3

Zakaria, Marwan F. "An automated vision system using a fast 2-dimensional moment invariants algorithm." Thesis, McGill University, 1987. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=66244.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Zhang, Lichang. "Non-invasive detection algorithm of thermal comfort based on computer vision." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-241082.

Full text
Abstract:
Wasted building energy consumption is a major challenge worldwide, and real-time detection of human thermal comfort is an effective way to address it: detecting a person's comfort level in real time and non-invasively. However, due to factors such as individual differences in thermal comfort and climatic elements (temperature, humidity, illumination, etc.), this strategy is still far from real-life deployment. From another perspective, current HVAC (heating, ventilating and air-conditioning) systems do not provide flexible interaction channels for adjusting the indoor environment, and therefore fail to satisfy users' requirements. All of this indicates the need for a detection method for human thermal comfort. In this paper, a non-invasive detection method for human thermal comfort is proposed from two perspectives: macro human postures and skin textures. In the posture part, OpenPose is used to analyze the position coordinates of human body key points in images, for example the elbow, knee, and hipbone, and the results of this analysis are interpreted in terms of thermal comfort. For skin textures, a deep neural network is used to predict the temperature of human skin from images. Based on Fanger's theory of thermal comfort, the results of both parts are satisfying: subjects' postures can be captured and interpreted into different thermal comfort levels (hot, cold, and comfortable), and the absolute error of the network's prediction is less than 0.125 degrees centigrade, which is the equipment error of the thermometer used in data acquisition. With the solution proposed in this paper, it is promising to non-invasively detect users' thermal comfort level from postures and skin textures. Finally, the conclusion and future work are discussed in the final chapter.
APA, Harvard, Vancouver, ISO, and other styles
5

Anderson, Travis M. "Motion detection algorithm based on the common housefly eye." Laramie, Wyo. : University of Wyoming, 2007. http://proquest.umi.com/pqdweb?did=1400965531&sid=1&Fmt=2&clientId=18949&RQT=309&VName=PQD.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Chavez, Aaron J. "A fast interest point detection algorithm." Thesis, Manhattan, Kan. : Kansas State University, 2008. http://hdl.handle.net/2097/538.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Bergendahl, Jason Robert. "A computationally efficient stereo vision algorithm for adaptive cruise control." Thesis, Massachusetts Institute of Technology, 1997. http://hdl.handle.net/1721.1/43389.

Full text
Abstract:
Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1997.
Includes bibliographical references (p. 55-56).
by Jason Robert Bergendahl.
M.S.
APA, Harvard, Vancouver, ISO, and other styles
8

Ng, Brian Walter. "Wavelet based image texture segmentation using a modified K-means algorithm." Title page, table of contents and abstract only, 2003. http://web4.library.adelaide.edu.au/theses/09PH/09phn5759.pdf.

Full text
Abstract:
"August, 2003." Bibliography: p. 261-268. In this thesis, wavelet transforms are chosen as the primary analytical tool for texture analysis. Specifically, the Dual-Tree Complex Wavelet Transform is applied to the texture segmentation problem. Several possibilities for the feature extraction and clustering steps are examined, with new schemes being introduced and compared to known techniques.
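The clustering step mentioned above can be illustrated with plain 1D K-means. This is the standard algorithm, not the thesis's modified variant, and the scalar "texture features" below (standing in for wavelet subband energies) are invented:

```python
# Plain 1D K-means: alternate between assigning each feature value to
# its nearest center and moving each center to its cluster mean.

def kmeans_1d(xs, centers, iters=20):
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for x in xs:
            i = min(range(len(centers)), key=lambda j: abs(x - centers[j]))
            clusters[i].append(x)
        # Update step: move each center to its cluster mean
        # (keep the old center if a cluster ends up empty).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

features = [0.1, 0.2, 0.15, 2.0, 2.1, 1.9]   # two texture populations
centers = kmeans_1d(features, [0.0, 1.0])    # converges to ~[0.15, 2.0]
```

Pixels are then labelled by their nearest center, which yields the texture segmentation.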
APA, Harvard, Vancouver, ISO, and other styles
9

Ramswamy, Lakshmy. "PARZSweep: a novel parallel algorithm for volume rendering of regular datasets." Master's thesis, Mississippi State : Mississippi State University, 2003. http://library.msstate.edu/etd/show.asp?etd=etd-04012003-140443.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Johnson, Amanda R. "A pose estimation algorithm based on points to regions correspondence using multiple viewpoints." Laramie, Wyo. : University of Wyoming, 2008. http://proquest.umi.com/pqdweb?did=1798480891&sid=1&Fmt=2&clientId=18949&RQT=309&VName=PQD.

Full text
APA, Harvard, Vancouver, ISO, and other styles
11

Schwambach, Vítor. "Methods and tools for rapid and efficient parallel implementation of computer vision algorithms on embedded multiprocessors." Thesis, Université Grenoble Alpes (ComUE), 2016. http://www.theses.fr/2016GREAM022/document.

Full text
Abstract:
Embedded computer vision applications demand high system computational power and constitute one of the key drivers for application-specific multi- and many-core systems. A number of early system design choices can impact the system's parallel performance – among which the parallel granularity, the number of processors and the balance between computation and communication. Their impact on the final system performance is difficult to assess in early design stages, and there is a lack of tools to support designers in this task. The contributions of this thesis consist of two methods and associated tools that facilitate the selection of embedded multiprocessor architectural parameters and computer vision application parallelization strategies. The first is a Design Space Exploration (DSE) methodology that relies on Parana, a fast and accurate parallel performance estimation tool. Parana enables the evaluation of what-if parallelization scenarios and can determine their maximum achievable performance limits. The second contribution is a method for optimal 2D image tile sizing using constraint programming within the Tilana tool. The proposed method integrates non-linear DMA data transfer times and parallel scheduling overheads for increased accuracy.
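The tile-sizing problem described above can be illustrated with a toy cost model. All constants are invented, and a brute-force search stands in for Tilana's constraint programming; the point is only that per-tile fixed costs (DMA setup, scheduling) trade off against tile count:

```python
import math

# Toy 2D tile sizing: pick the tile width/height that minimizes
# compute + DMA transfer + scheduling overhead for a fixed image.
# The linear cost model and its constants are invented; the real tool
# uses constraint programming with measured non-linear DMA costs.

def tile_cost(img_w, img_h, tw, th,
              cycles_per_px=1.0, dma_setup=500.0, dma_per_px=0.25,
              sched_overhead=200.0):
    n_tiles = math.ceil(img_w / tw) * math.ceil(img_h / th)
    per_tile = tw * th * cycles_per_px + dma_setup + tw * th * dma_per_px
    return n_tiles * (per_tile + sched_overhead)

def best_tile(img_w, img_h, sizes):
    return min(((tw, th) for tw in sizes for th in sizes),
               key=lambda s: tile_cost(img_w, img_h, *s))

choice = best_tile(640, 480, [16, 32, 64, 128])
```

Under this model larger tiles win because they amortize the fixed per-tile costs; a non-linear DMA model or a local-memory capacity constraint would shift the optimum, which is why the thesis formulates it as a constraint program.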
APA, Harvard, Vancouver, ISO, and other styles
12

Turk, Matthew Robert. "A homography-based multiple-camera person-tracking algorithm." Online version of thesis, 2008. http://hdl.handle.net/1850/7853.

Full text
APA, Harvard, Vancouver, ISO, and other styles
13

Khekare, Pranav Prakash. "Application of Computer Vision algorithm and Deep learning for roundabout capacity evaluation using UAV aerial imagery and videos." University of Cincinnati / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1613748930108492.

Full text
APA, Harvard, Vancouver, ISO, and other styles
14

Ghosh, Payel. "Medical Image Segmentation Using a Genetic Algorithm." PDXScholar, 2010. https://pdxscholar.library.pdx.edu/open_access_etds/25.

Full text
Abstract:
Advances in medical imaging technology have led to the acquisition of large number of images in different modalities. On some of these images the boundaries of key organs need to be accurately identified for treatment planning and diagnosis. This is typically performed manually by a physician who uses prior knowledge of organ shapes and locations to demarcate the boundaries of organs. Such manual segmentation is subjective, time consuming and prone to inconsistency. Automating this task has been found to be very challenging due to poor tissue contrast and ill-defined organ/tissue boundaries. This dissertation presents a genetic algorithm for combining representations of learned information such as known shapes, regional properties and relative location of objects into a single framework in order to perform automated segmentation. The algorithm has been tested on two different datasets: for segmenting hands on thermographic images and for prostate segmentation on pelvic computed tomography (CT) and magnetic resonance (MR) images. In this dissertation we report the results of segmentation in two dimensions (2D) for thermographic images; and two as well as three dimensions (3D) for pelvic images. We show that combining multiple features for segmentation improves segmentation accuracy as compared with segmentation using single features such as texture or shape alone.
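The evolutionary loop behind such a genetic algorithm can be sketched very simply. The sketch below evolves only a single intensity threshold over invented pixel populations; the dissertation's GA instead combines learned shape, region, and location terms, so this shows the mechanics (selection, crossover, mutation), not the method:

```python
import random

# Minimal genetic algorithm evolving one intensity threshold that
# separates two invented pixel populations.

random.seed(0)
background = [10, 12, 15, 20, 18]
foreground = [220, 230, 240, 225]

def fitness(t):
    """Count correctly separated pixels for threshold t."""
    return sum(p < t for p in background) + sum(p >= t for p in foreground)

def evolve(pop_size=20, generations=30):
    pop = [random.uniform(0, 255) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]            # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            child = (a + b) / 2                   # blend crossover
            child += random.gauss(0, 5)           # gaussian mutation
            children.append(min(255.0, max(0.0, child)))
        pop = parents + children
    return max(pop, key=fitness)

t = evolve()   # a threshold separating the two populations
```

In the dissertation the "individual" is not a scalar but an encoding of segmentation parameters, and the fitness combines the learned shape and region terms.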
APA, Harvard, Vancouver, ISO, and other styles
15

Bae, Sung Eun. "Sequential and Parallel Algorithms for the Generalized Maximum Subarray Problem." Thesis, University of Canterbury. Computer Science and Software Engineering, 2007. http://hdl.handle.net/10092/1202.

Full text
Abstract:
The maximum subarray problem (MSP) involves selection of a segment of consecutive array elements that has the largest possible sum over all other segments in a given array. Efficient algorithms for the MSP and related problems are expected to contribute to various applications in genomic sequence analysis, data mining, computer vision, etc. The MSP is a conceptually simple problem, and several linear time optimal algorithms for the 1D version of the problem are already known. For the 2D version, the currently known upper bounds are cubic or near-cubic time. For wider applications, it would be interesting if multiple maximum subarrays were computed instead of just one, which motivates the work in the first half of the thesis. The generalized problem of the K-maximum subarray involves finding K segments of the largest sum in sorted order. Two subcategories of the problem can be defined: the K-overlapping maximum subarray problem (K-OMSP) and the K-disjoint maximum subarray problem (K-DMSP). Studies on the K-OMSP have not been undertaken previously, hence the thesis explores various techniques to speed up the computation, and several new algorithms. The first algorithm for the 1D problem is of O(Kn) time, and increasingly efficient algorithms of O(K² + n log K) time, O((n + K) log K) time and O(n + K log min(K, n)) time are presented. Considerations on extending these results to higher dimensions are made, which contributes to establishing O(n³) time for the 2D version of the problem where K is bounded by a certain range. Ruzzo and Tompa studied the problem of all maximal scoring subsequences, whose definition is almost identical to that of the K-DMSP with a few subtle differences. Despite the slight differences, their linear time algorithm is readily capable of computing the 1D K-DMSP, but it is not easily extended to higher dimensions. This observation motivates a new algorithm based on the tournament data structure, which is of O(n + K log min(K, n)) worst-case time.
The extended version of the new algorithm is capable of processing a 2D problem in O(n³ + min(K, n) · n² log min(K, n)) time, that is, O(n³) for K ≤ n/log n. For the 2D MSP, the cubic time sequential computation is still expensive for practical purposes considering potential applications in computer vision and data mining. The second half of the thesis investigates a speed-up option through parallel computation. Previous parallel algorithms for the 2D MSP have huge demands for hardware resources, or their target parallel computation models are in the realm of pure theoretics. A nice compromise between speed and cost can be realized through utilizing a mesh topology. Two mesh algorithms for the 2D MSP with O(n) running time that require a network of size O(n²) are designed and analyzed, and various techniques are considered to maximize their practicality to its full potential.
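The 1D base case that the thesis generalizes has a classic linear-time solution, Kadane's algorithm, which the O(Kn) and later bounds build upon:

```python
# Kadane's linear-time algorithm for the 1D maximum subarray problem:
# scan once, restarting the running segment whenever its sum goes
# negative, since a negative prefix can never help a later segment.

def max_subarray(a):
    """Return (best_sum, start, end) over all non-empty contiguous segments."""
    best = cur = a[0]
    best_start = best_end = cur_start = 0
    for i in range(1, len(a)):
        if cur < 0:                  # negative prefix: restart at i
            cur, cur_start = a[i], i
        else:
            cur += a[i]
        if cur > best:
            best, best_start, best_end = cur, cur_start, i
    return best, best_start, best_end

# Classic example: the best segment sums to 6 (indices 3..6).
print(max_subarray([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # (6, 3, 6)
```

The K-overlapping and K-disjoint generalizations studied in the thesis must track many candidate segments instead of one, which is where the tournament data structure and the log min(K, n) factors come in.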
APA, Harvard, Vancouver, ISO, and other styles
16

Manfredsson, Johan. "Evaluation Tool for a Road Surface Algorithm." Thesis, Linköpings universitet, Datorseende, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-138936.

Full text
Abstract:
Modern cars are often equipped with sensors like radar, infrared cameras and stereo cameras that collect information about their surroundings. By using a stereo camera, it is possible to obtain information about the distance to points in front of the car. This information can be used to estimate the height profile of the car's predicted path. An application which does this is the stereo-based Road Surface Preview (RSP) algorithm. By using the output of the RSP algorithm it is possible to apply active suspension control, which controls the vertical movement of the wheels relative to the chassis. This application primarily makes the driving experience more comfortable, but also extends the durability of the vehicle. The idea behind this Master’s thesis is to create an evaluation tool for the RSP algorithm which can be used on arbitrary roads. The thesis describes the proposed evaluation tool, where the focus has been on making an accurate comparison between camera data received from the RSP algorithm and laser data used as ground truth. Since the tool shall be used at the company proposing this thesis, focus has also been on making the tool user-friendly. The report discusses the proposed methods, possible sources of error, and improvements. The evaluation tool shows good results for the available test data, which made it possible to include an investigation of a possible improvement of the RSP algorithm.
APA, Harvard, Vancouver, ISO, and other styles
17

Kemp, Neal. "Content-Based Image Retrieval for Tattoos: An Analysis and Comparison of Keypoint Detection Algorithms." Scholarship @ Claremont, 2013. http://scholarship.claremont.edu/cmc_theses/784.

Full text
Abstract:
The field of biometrics has grown significantly in the past decade due to an increase in interest from law enforcement. Law enforcement officials are interested in adding tattoos alongside irises and fingerprints to their toolbox of biometrics. They often use these biometrics to aid in the identification of victims and suspects. Like facial recognition, tattoos have seen a spike in attention over the past few years. Tattoos, however, have not received as much attention from researchers. This lack of attention stems from the difficulty inherent in matching tattoos. Such difficulties include image quality, affine transformation, warping of tattoos around the body, and in some cases, excessive body hair covering the tattoo. We will utilize content-based image retrieval to find a tattoo in a database, which means using one image to query against a database in order to find similar tattoos. We will focus specifically on the keypoint detection process in computer vision. In addition, we are interested in finding not just exact matches but also similar tattoos. We will conclude that the ORB detector pulls the most relevant features and thus offers the best chance of yielding an accurate result from content-based image retrieval for tattoos. However, we will also show that even ORB will not work on its own in a content-based image retrieval system. Other processes will have to be involved in order to return accurate matches. We will give recommendations on next steps to create a better tattoo retrieval system.
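ORB produces binary descriptors that are compared by Hamming distance, which is why it is attractive for fast retrieval. A minimal brute-force matcher in that style (toy 8-bit descriptors below stand in for ORB's real 256-bit ones):

```python
# Brute-force matching of binary descriptors by Hamming distance, the
# comparison used for ORB descriptors. Real ORB descriptors are
# 256-bit; these 8-bit integers are toys for readability.

def hamming(a, b):
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")

def match(query_desc, db_desc):
    """For each query descriptor, return (db_index, distance) of its nearest neighbour."""
    return [min(((j, hamming(q, d)) for j, d in enumerate(db_desc)),
                key=lambda t: t[1])
            for q in query_desc]

query = [0b10110010, 0b00001111]
db    = [0b10110011, 0b11110000, 0b00001110]
print(match(query, db))   # [(0, 1), (2, 1)]
```

A retrieval system would aggregate such per-keypoint matches per database image (e.g. by vote counting) and, as the thesis argues, still needs further verification stages on top.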
APA, Harvard, Vancouver, ISO, and other styles
18

Vidas, Dario. "Performance Evaluation of Stereo Reconstruction Algorithms on NIR Images." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-191148.

Full text
Abstract:
Stereo vision is one of the most active research areas in computer vision. While hundreds of stereo reconstruction algorithms have been developed, little work has been done on the evaluation of such algorithms and almost none on evaluation on Near-Infrared (NIR) images. Of almost a hundred examined, we selected a set of 15 stereo algorithms, mostly with real-time performance, which were then categorized and evaluated on several NIR image datasets, including single stereo pair and stream datasets. The accuracy and run time of each algorithm are measured and compared, giving an insight into which categories of algorithms perform best on NIR images and which algorithms may be candidates for real-time applications. Our comparison indicates that adaptive support-weight and belief propagation algorithms have the highest accuracy of all fast methods, but also longer run times (2-3 seconds). On the other hand, faster algorithms (that achieve 30 or more fps on a single thread) usually perform an order of magnitude worse when measuring the percentage of incorrectly computed pixels.
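The "percentage of incorrectly computed pixels" used above is the standard bad-pixel rate for disparity maps; a short sketch with invented disparity values:

```python
# Bad-pixel rate: the fraction of pixels whose estimated disparity
# deviates from ground truth by more than a threshold (commonly 1 px).
# The disparity values below are invented.

def bad_pixel_rate(estimated, ground_truth, threshold=1.0):
    assert len(estimated) == len(ground_truth)
    bad = sum(abs(e - g) > threshold
              for e, g in zip(estimated, ground_truth))
    return 100.0 * bad / len(estimated)

est = [10.0, 12.5, 30.0, 8.0]
gt  = [10.2, 12.0, 27.0, 8.1]
print(bad_pixel_rate(est, gt))  # 25.0  (only the 30 vs 27 pixel is bad)
```

In practice the rate is computed only over pixels with valid ground truth, and often reported separately for occluded and non-occluded regions.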
APA, Harvard, Vancouver, ISO, and other styles
19

Lakshman, Prabhashankar. "Parallel implementation of the split and merge algorithm on the hypercube machine." Ohio : Ohio University, 1989. http://www.ohiolink.edu/etd/view.cgi?ohiou1182440993.

Full text
APA, Harvard, Vancouver, ISO, and other styles
20

Aldrovandi, Lorenzo. "Depth estimation algorithm for light field data." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
21

Alahmad, Mouhamad. "Développement de méthodes de vision par ordinateur : extraction de primitives géométriques." Université Louis Pasteur (Strasbourg) (1971-2008), 1986. http://www.theses.fr/1986STR13192.

Full text
Abstract:
This thesis concerns the development of computer vision methods for extracting geometric features (centroid, area, perimeter, principal axis, and orientation) in order to identify, locate, and compare objects from two-dimensional images.
APA, Harvard, Vancouver, ISO, and other styles
22

Lef, Annette. "CAD-Based Pose Estimation - Algorithm Investigation." Thesis, Linköpings universitet, Datorseende, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-157776.

Full text
Abstract:
One fundamental task in robotics is random bin-picking, where it is important to be able to detect an object in a bin and estimate its pose to plan the motion of a robotic arm. For this purpose, this thesis work aimed to investigate and evaluate algorithms for 6D pose estimation when the object was given by a CAD model. The scene was given by a point cloud illustrating a partial 3D view of the bin with multiple instances of the object. Two algorithms were thus implemented and evaluated. The first algorithm was an approach based on Point Pair Features, and the second was Fast Global Registration. For evaluation, four different CAD models were used to create synthetic data with ground truth annotations. It was concluded that the Point Pair Feature approach provided a robust localization of objects and can be used for bin-picking. The algorithm appears to be able to handle different types of objects, however, with small limitations when the object has flat surfaces and weak texture or many similar details. The disadvantage with the algorithm was the execution time. Fast Global Registration, on the other hand, did not provide a robust localization of objects and is thus not a good solution for bin-picking.
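The Point Pair Feature that the first evaluated approach builds on (following Drost et al.) is a four-vector combining the distance between two oriented points with three angles. A sketch with invented points and normals:

```python
import math

# The four-component Point Pair Feature: distance between two oriented
# points plus the angles relating their normals to the difference
# vector and to each other. Points/normals below are invented.

def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def norm(a):   return math.sqrt(dot(a, a))

def angle(a, b):
    """Angle between two 3D vectors, clamped for numerical safety."""
    return math.acos(max(-1.0, min(1.0, dot(a, b) / (norm(a) * norm(b)))))

def point_pair_feature(p1, n1, p2, n2):
    d = sub(p2, p1)
    return (norm(d), angle(n1, d), angle(n2, d), angle(n1, n2))

p1, n1 = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
p2, n2 = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
f = point_pair_feature(p1, n1, p2, n2)   # (1.0, pi/2, pi/2, 0.0)
```

Features from the CAD model are quantized and stored in a hash table; matching scene pairs then vote for candidate 6D poses, which gives the robustness to clutter noted in the abstract.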
APA, Harvard, Vancouver, ISO, and other styles
23

Botterill, Tom. "Visual navigation for mobile robots using the Bag-of-Words algorithm." Thesis, University of Canterbury. Computer Science and Software Engineering, 2011. http://hdl.handle.net/10092/5511.

Full text
Abstract:
Robust long-term positioning for autonomous mobile robots is essential for many applications. In many environments this task is challenging, as errors accumulate in the robot’s position estimate over time. The robot must also build a map so that these errors can be corrected when mapped regions are re-visited; this is known as Simultaneous Localisation and Mapping, or SLAM. Successful SLAM schemes have been demonstrated which accurately map tracks of tens of kilometres, however these schemes rely on expensive sensors such as laser scanners and inertial measurement units. A more attractive, low-cost sensor is a digital camera, which captures images that can be used to recognise where the robot is, and to incrementally position the robot as it moves. SLAM using a single camera is challenging however, and many contemporary schemes suffer complete failure in dynamic or featureless environments, or during erratic camera motion. An additional problem, known as scale drift, is that cameras do not directly measure the scale of the environment, and errors in relative scale accumulate over time, introducing errors into the robot’s speed and position estimates. Key to a successful visual SLAM system is the ability to continue operation despite these difficulties, and to recover from positioning failure when it occurs. This thesis describes the development of such a scheme, which is known as BoWSLAM. BoWSLAM enables a robot to reliably navigate and map previously unknown environments, in real-time, using only a single camera. In order to position a camera in visually challenging environments, BoWSLAM combines contemporary visual SLAM techniques with four new components. Firstly, a new Bag-of-Words (BoW) scheme is developed, which allows a robot to recognise places it has visited previously, without any prior knowledge of its environment. 
This BoW scheme is also used to select the best set of frames to reconstruct positions from, and to find efficient wide-baseline correspondences between many pairs of frames. Secondly, BaySAC, a new outlier-robust relative pose estimation scheme based on the popular RANSAC framework, is developed. BaySAC allows the efficient computation of multiple position hypotheses for each frame. Thirdly, a graph-based representation of these position hypotheses is proposed, which enables the selection of only reliable position estimates in the presence of gross outliers. Fourthly, as the robot explores, objects in the world are recognised and measured. These measurements enable scale drift to be corrected. BoWSLAM is demonstrated mapping a 25 minute, 2.5 km trajectory through a challenging and dynamic outdoor environment in real-time, and without any other sensor input; considerably further than previous single camera SLAM schemes.
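Bag-of-Words place recognition reduces each image to a histogram of visual-word occurrences and compares histograms. A minimal cosine-similarity sketch (word counts invented; BoWSLAM additionally builds its vocabulary online rather than from prior training data):

```python
import math

# Bag-of-Words place recognition in miniature: each image is a
# histogram of visual-word counts; a new frame is matched to the most
# similar stored place by cosine similarity.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def best_match(query_hist, place_hists):
    return max(range(len(place_hists)),
               key=lambda i: cosine(query_hist, place_hists[i]))

places = [
    [5, 0, 2, 1],   # word histogram of place 0
    [0, 4, 0, 3],   # place 1
    [1, 1, 6, 0],   # place 2
]
query = [4, 0, 3, 1]
i = best_match(query, places)   # place 0 is most similar
```

Real systems weight words by their distinctiveness (e.g. TF-IDF) and verify candidate matches geometrically before declaring a loop closure.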
APA, Harvard, Vancouver, ISO, and other styles
24

Lefebvre, Thomas. "Exploration architecturale pour la conception d'un système sur puce de vision robotique, adéquation algorithme-architecture d'un système embarqué temps-réel." Phd thesis, Université de Cergy Pontoise, 2012. http://tel.archives-ouvertes.fr/tel-00782081.

Full text
Abstract:
The subject of this thesis lies at the intersection of algorithm-architecture matching, bio-inspired vision systems for mobile robotics, and image processing. The goal is to make a robot autonomous in its visual perception process by integrating into the robot this cognitive task, which is usually offloaded to a remote compute server. To reach this goal, the design approach follows an algorithm-architecture matching process in which the different image processing stages are analyzed in detail. The image processing steps are modified and deployed on an embedded architecture so as to meet the real-time execution constraints imposed by the robotic context. Mobile robotics is an academic research field that notably draws on bio-mimetic approaches. The artificial vision studied in our context uses a bio-inspired multiresolution approach, based on extracting and structuring characteristic regions of the image. Given the complexity of these processing steps and the many constraints tied to the robot's autonomy, deploying this vision system requires a rigorous and complete exploration of the hardware and software architecture. This design space exploration process is presented in this thesis. Its results led to the design of an architecture mainly composed of parameterizable, modular hardware processing accelerators (IPs), deployed on an FPGA-type reconfigurable circuit. These IPs and the internal operation of each of them are described in the document. The impact of the architectural parameters on hardware resource usage is studied for the main processing stages. The deployment of the remaining software part is presented for several candidate FPGA platforms.
Finally, the performance obtained with this architectural solution is presented. These results allow us to conclude that the proposed solution makes it possible to embed the vision system in mobile robots while meeting the imposed real-time constraints.
APA, Harvard, Vancouver, ISO, and other styles
25

Bortolot, Zachary Jared. "An Adaptive Computer Vision Technique for Estimating the Biomass and Density of Loblolly Pine Plantations using Digital Orthophotography and LiDAR Imagery." Diss., Virginia Tech, 2004. http://hdl.handle.net/10919/27454.

Full text
Abstract:
Forests have been proposed as a means of reducing atmospheric carbon dioxide levels due to their ability to store carbon as biomass. To quantify the amount of atmospheric carbon sequestered by forests, biomass and density estimates are often needed. This study develops, implements, and tests an individual tree-based algorithm for obtaining forest density and biomass using orthophotographs and small footprint LiDAR imagery. It was designed to work with a range of forests and image types without modification, which is accomplished by using generic properties of trees found in many types of images. Multiple parameters are employed to determine how these generic properties are used. To set these parameters, training data is used in conjunction with an optimization algorithm (a modified Nelder-Mead simplex algorithm or a genetic algorithm). The training data consist of small images in which density and biomass are known. A first test of this technique was performed using 25 circular plots (radius = 15 m) placed in young pine plantations in central Virginia, together with false color orthophotograph (spatial resolution = 0.5 m) or small footprint LiDAR (interpolated to 0.5 m) imagery. The highest density prediction accuracies (r2 up to 0.88, RMSE as low as 83 trees / ha) were found for runs where photointerpreted densities were used for training and testing. For tests run using density measurements made on the ground, accuracies were consistently higher for orthophotograph-based results than for LiDAR-based results, and were higher for trees with DBH ≥ 10 cm than for trees with DBH ≥ 7 cm. Biomass estimates obtained by the algorithm using LiDAR imagery had a lower RMSE (as low as 15.6 t / ha) than most comparable studies. The correlations between the actual and predicted values (r2 up to 0.64) were lower than in comparable studies, but were generally highly significant (p ≤ 0.05 or 0.01).
In all runs there was no obvious relationship between accuracy and the amount of training data used, but the algorithm was sensitive to which training and testing data were selected. Methods were evaluated for combining predictions made using different parameter sets obtained after training using identical data. It was found that averaging the predictions produced improved results. After training using density estimates from the human photointerpreter, 89% of the trees located by the algorithm corresponded to trees found by the human photointerpreter. A comparison of the two optimization techniques found them to be comparable in speed and effectiveness.
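The parameter-tuning step described above pairs training plots with a Nelder-Mead simplex search. A minimal sketch of that idea, using SciPy's stock Nelder-Mead rather than the thesis's modified variant (the objective and training densities below are hypothetical stand-ins for the tree-detection algorithm and its training data):

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical training data: known densities (trees/ha) for three plots.
true_density = np.array([420.0, 380.0, 510.0])

def predicted_density(params, plot_idx):
    # Stand-in for running the tree-detection algorithm with `params`
    # (e.g. a brightness threshold and a minimum crown radius).
    threshold, min_radius = params
    return 400.0 + 50.0 * threshold - 30.0 * min_radius + 10.0 * plot_idx

def rmse(params):
    preds = np.array([predicted_density(params, i)
                      for i in range(len(true_density))])
    return np.sqrt(np.mean((preds - true_density) ** 2))

# Nelder-Mead minimizes the training RMSE over the parameter vector.
result = minimize(rmse, x0=[1.0, 1.0], method="Nelder-Mead")
print(result.x)  # parameter set found by the simplex search
```

A genetic algorithm, the alternative named in the abstract, could be swapped in for `minimize` without changing the objective function.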
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
26

Ivarsson, Adam. "Expediting Gathering and Labeling of Data from Zebrafish Models of Tumor Progression and Metastasis Using Bespoke Software." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-148691.

Full text
Abstract:
In this paper I describe a set of algorithms used to partly automate the labeling and preparation of images of zebrafish embryos used as models of tumor progression and metastasis. These algorithms show promise for saving time for researchers using zebrafish in this way.
APA, Harvard, Vancouver, ISO, and other styles
27

Tippetts, Beau J. "Real-Time Implementation of Vision Algorithm for Control, Stabilization, and Target Tracking for a Hovering Micro-UAV." BYU ScholarsArchive, 2008. https://scholarsarchive.byu.edu/etd/1418.

Full text
Abstract:
A lightweight, powerful, yet efficient quad-rotor platform was designed and constructed to obtain experimental results of completely autonomous control of a hovering micro-UAV using a complete on-board vision system. The on-board vision and control system is composed of a Helios FPGA board, an Autonomous Vehicle Toolkit daughterboard, and a Kestrel Autopilot. The resulting platform is referred to as the Helio-copter. An efficient algorithm to detect, correlate, and track features in a scene and estimate attitude information was implemented with a combination of hardware and software on the FPGA, and real-time performance was obtained. The algorithms implemented include a Harris feature detector, template matching feature correlator, RANSAC similarity-constrained homography, color segmentation, radial distortion correction, and an extended Kalman filter with a standard-deviation outlier rejection technique (SORT). This implementation was designed specifically for use as an on-board vision solution in determining movement of small unmanned air vehicles that have size, weight, and power limitations. Experimental results show the Helio-copter capable of maintaining level, stable flight within a 6 foot by 6 foot area for over 40 seconds without human intervention.
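Among the listed components, the Harris feature detector admits a compact software sketch. This is a simplified pure-NumPy version with a 3×3 box smoothing window, not the thesis's FPGA implementation:

```python
import numpy as np

def harris_response(img, k=0.04):
    """Harris corner response for a grayscale float image (minimal sketch)."""
    # Image gradients via central finite differences.
    Ix = np.zeros_like(img)
    Iy = np.zeros_like(img)
    Ix[:, 1:-1] = (img[:, 2:] - img[:, :-2]) / 2.0
    Iy[1:-1, :] = (img[2:, :] - img[:-2, :]) / 2.0

    def box(a):
        # 3x3 box sum (zero-padded) as a stand-in for Gaussian smoothing.
        p = np.pad(a, 1)
        return sum(p[i:i + a.shape[0], j:j + a.shape[1]]
                   for i in range(3) for j in range(3))

    # Structure-tensor entries and the Harris response det(M) - k*trace(M)^2.
    Sxx, Syy, Sxy = box(Ix * Ix), box(Iy * Iy), box(Ix * Iy)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2

# Synthetic test image: a bright square on a dark background.
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0
R = harris_response(img)
corner = np.unravel_index(np.argmax(R), R.shape)
print(corner)  # location of the strongest response
```

The response is large and positive at the square's corners, negative along its edges, and near zero in flat regions, which is what makes it usable for feature selection before correlation.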
APA, Harvard, Vancouver, ISO, and other styles
28

Guidi, Gianluca. "A new algorithm for estimating pedestrian flows during massive touristic events, optimized for an existing camera setup." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2017. http://amslaurea.unibo.it/14552/.

Full text
Abstract:
In this thesis I present a new algorithm for video analysis that computes the flow of people crossing a passage even under unfavorable camera conditions. The thesis work focused on the analysis of a series of video sequences previously extracted from a security camera pointed at the Ponte della Costituzione in Venice, with the aim of estimating the pedestrian flow over the bridge. The poor quality of the videos, due to the low resolution, and the non-optimal placement of the camera, which causes numerous overlaps between people, make many existing computer vision techniques fail, so a new solution had to be created. The algorithm was also verified through a program implementing it, analyzing both artificial and real data.
APA, Harvard, Vancouver, ISO, and other styles
29

BOUYAKHF, EL-HOUSSINE. "Description et interpretation d'images pour la vision en robotique : reconnaissance d'objets partiellement observes." Toulouse 3, 1988. http://www.theses.fr/1988TOU30016.

Full text
Abstract:
A reflection on computer vision, with an overview of theoretical approaches and an analysis of work carried out in the field. Definition of an image-description system based on modeling image contours with a set of software tools for extracting visual cues, built on an algorithm for detecting representative points. Development of two systems for interpreting scenes of quasi-planar objects in bulk. In the first, the image is described by segments extracted from the contours. The second uses the image-description algorithms based on the representative points.
APA, Harvard, Vancouver, ISO, and other styles
30

Datari, Srinivasa R. "Hypercube machine implementation of a 2-D FFT algorithm for object recognition." Ohio : Ohio University, 1989. http://www.ohiolink.edu/etd/view.cgi?ohiou1182434398.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Khajo, Gabriel. "Region Proposal Based Object Detectors Integrated With an Extended Kalman Filter for a Robust Detect-Tracking Algorithm." Thesis, Karlstads universitet, Fakulteten för hälsa, natur- och teknikvetenskap (from 2013), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kau:diva-72698.

Full text
Abstract:
In this thesis we present a detect-tracking algorithm (see figure 3.1) that combines the detection robustness of static region proposal based object detectors, like the faster region-based convolutional neural network (Faster R-CNN) and the region-based fully convolutional network (R-FCN) model, with the tracking prediction strength of extended Kalman filters, by using what we have called a translating and non-rigid user-input region of interest (RoI) mapping. This so-called RoI-mapping maps a region, which includes the object that one is interested in tracking, to a featureless three-channeled image. The detection part of our proposed algorithm is then performed on the image that includes only the RoI features (see figure 3.2). After the detection step, our model re-maps the RoI features to the original frame and translates the RoI to the center of the prediction. If no prediction occurs, our proposed model integrates a temporal dependence through a Kalman filter as a predictor; this filter is continuously corrected when detections do occur. To train the region proposal based object detectors that we integrate into our detect-tracking model, we used TensorFlow®'s object detection API with random-search hyperparameter tuning, where we fine-tuned all models from TensorFlow® Slim base network classification checkpoints. The trained region proposal based object detectors used the Inception V2 base network for the Faster R-CNN model and the R-FCN model, while the Inception V3 base network was applied only to the Faster R-CNN model. This was done to compare the two base networks and their corresponding effects on the detection models. In addition to the deep learning part of this thesis, for the implementation of our detect-tracking model, like the extended Kalman filter, we used Python and OpenCV®.
The results show that, with a stationary camera reference frame, our proposed detect-tracking algorithm, combined with region proposal based object detectors on images of size 414 × 740 × 3, can detect and track a small object in real time, like a tennis ball, moving along a horizontal trajectory with an average velocity v ≈ 50 km/h at a distance d = 25 m, with a combined detect-tracking frequency of about 13 to 14 Hz. The largest measured state error between the actual state and the state predicted by the Kalman filter, at the aforementioned horizontal velocity, has been measured to be at most 10-15 pixels (see table 5.1), but in certain frames where many detections occur this error has been shown to be much smaller (3-5 pixels). Additionally, our combined detect-tracking model has also been shown to be able to handle obstacles and two learnable features that overlap, thanks to the integrated extended Kalman filter. Lastly, our detect-tracking model was also applied to a set of infra-red images, where the goal was to detect and track a truck moving along a semi-horizontal path. Our results show that a Faster R-CNN Inception V2 model was able to extract features from a sequence of infra-red frames, and that our proposed RoI-mapping method worked relatively well at detecting only one truck in a short test sequence (see figure 5.22).
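The predictor/corrector role the Kalman filter plays between detections can be sketched with a constant-velocity linear filter (a simplification of the extended Kalman filter used in the thesis; all noise values below are hypothetical):

```python
import numpy as np

# State [x, y, vx, vy] in pixels; we observe only (x, y) from the detector.
dt = 1.0
F = np.array([[1, 0, dt, 0], [0, 1, 0, dt],
              [0, 0, 1, 0], [0, 0, 0, 1]], float)   # constant-velocity motion
H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)   # measurement model
Q = np.eye(4) * 1e-3   # process noise (hypothetical)
R = np.eye(2) * 1.0    # measurement noise (hypothetical)

x = np.zeros(4)
P = np.eye(4) * 100.0  # large initial uncertainty

def step(x, P, z=None):
    # Predict.
    x = F @ x
    P = F @ P @ F.T + Q
    # Correct only when a detection z is available; otherwise coast.
    if z is not None:
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (z - H @ x)
        P = (np.eye(4) - K @ H) @ P
    return x, P

# Track a target moving +2 px/frame in x; the detection on frame 5 is dropped,
# so the filter bridges the gap with its prediction alone.
for t in range(10):
    z = None if t == 5 else np.array([2.0 * t, 0.0])
    x, P = step(x, P, z)
print(x[:2])  # estimated position after 10 frames
```

The dropped frame illustrates the mechanism described in the abstract: the prediction carries the track through missed detections, and subsequent detections correct it.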
APA, Harvard, Vancouver, ISO, and other styles
32

Тарановський, Антон Володимирович, Антон Владимирович Тарановский, Anton Volodymyrovych Taranovskyi, Сергій Олександрович Петров, Сергей Александрович Петров, and Serhii Oleksandrovych Petrov. "Визначення оптимальних параметрів вхідного зображення на характеристики розпізнавання з використанням алгоритму Віола-Джонса." Thesis, Видавництво СумДУ, 2013. http://essuir.sumdu.edu.ua/handle/123456789/42602.

Full text
Abstract:
The algorithms implemented in computer vision systems for detection and pattern recognition operate on video and photographic material produced by photo and video cameras. The characteristics of these input data vary depending on the technical capabilities of the cameras.
APA, Harvard, Vancouver, ISO, and other styles
33

Bodily, John M. "An Optical Flow Implementation Comparison Study." Diss., CLICK HERE for online access, 2009. http://contentdm.lib.byu.edu/ETD/image/etd2818.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Viloria, John A. (John Alexander) 1978. "Optimizing clustering algorithms for computer vision." Thesis, Massachusetts Institute of Technology, 2001. http://hdl.handle.net/1721.1/86847.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Albertazzi, Riccardo. "Sistema di visione stereo su architettura ZYNQ." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2016. http://amslaurea.unibo.it/11310/.

Full text
Abstract:
The aim of the thesis is to create an FPGA architecture able to extract 3D information from a pair of stereo sensors. The pipeline was built on the Zynq System-on-Chip, which allows close interaction between the hardware part implemented in the FPGA and the CPU. After a preliminary study of the hardware and software tools, the base architecture for writing and reading images in the Zynq's DDR memory was implemented. Attention then moved to implementing stereo algorithms (rectification and stereo matching) on the FPGA and to building a pipeline able to produce accurate disparity maps in real time while acquiring images from a stereo camera.
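The stereo-matching stage can be illustrated in software with naive SAD (sum of absolute differences) block matching over a rectified pair. This is a sketch of the principle only, not the FPGA pipeline:

```python
import numpy as np

def sad_disparity(left, right, max_disp=8, window=3):
    """Naive SAD block matching: disparity map from a rectified grayscale pair."""
    h, w = left.shape
    r = window // 2
    disp = np.zeros((h, w), dtype=int)
    for y in range(r, h - r):
        for x in range(r + max_disp, w - r):
            patch = left[y - r:y + r + 1, x - r:x + r + 1]
            # Cost of each candidate disparity: absolute difference of patches.
            costs = [np.abs(patch -
                            right[y - r:y + r + 1, x - d - r:x - d + r + 1]).sum()
                     for d in range(max_disp)]
            disp[y, x] = int(np.argmin(costs))
    return disp

# Synthetic rectified pair: random texture shifted 4 pixels between views.
rng = np.random.default_rng(0)
right = rng.random((32, 32))
left = np.roll(right, 4, axis=1)  # left view sees the scene shifted by 4 px
d = sad_disparity(left, right, max_disp=8)
print(d[16, 20])  # recovered disparity at an interior pixel
```

Real pipelines replace this O(h·w·d) scan with parallel hardware and add cost aggregation and consistency checks, but the per-pixel winner-takes-all structure is the same.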
APA, Harvard, Vancouver, ISO, and other styles
36

Gultekin, Gokhan Koray. "An Fpga Based High Performance Optical Flow Hardware Design For Autonomous Mobile Robotic Platforms." Master's thesis, METU, 2010. http://etd.lib.metu.edu.tr/upload/12612483/index.pdf.

Full text
Abstract:
Optical flow is used in a number of computer vision applications. However, its use in mobile robotic applications is limited because of the high computational complexity involved and the limited availability of computational resources on such platforms. The lack of hardware capable of computing the optical flow vector field in real time is a factor that prevents the mobile robotics community from efficiently utilizing some successful techniques presented in the computer vision literature. In this thesis work, we design and implement a high performance FPGA hardware with a small footprint and low power consumption that is capable of providing faster-than-real-time optical flow data and is hence suitable for this application domain. The well known differential optical flow algorithm presented by Horn & Schunck is selected for this implementation. The complete hardware design of the proposed system is described in detail. We also discuss the design alternatives and the selected approaches, together with a discussion of the selection procedure. We present the performance analysis of the proposed hardware in terms of computation speed, power consumption and accuracy. The designed hardware is tested with some of the available test sequences that are frequently used for performance evaluations of optical flow techniques in the literature. The proposed hardware is capable of computing the optical flow vector field on 256x256 pixel images in 3.89 ms, which corresponds to a processing speed of 257 fps. The results obtained from the FPGA implementation are compared with a floating-point implementation of the same algorithm realized on PC hardware. The obtained results show that the hardware implementation achieves superior performance in terms of speed, power consumption and compactness, while there is minimal loss of accuracy due to the fixed-point implementation.
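The Horn & Schunck scheme selected for the hardware can be sketched in floating point as the classic iterative update (a NumPy sketch of the algorithm, not the fixed-point FPGA design; the test pair and parameters are illustrative):

```python
import numpy as np

def horn_schunck(im1, im2, alpha=0.1, iters=200):
    """Minimal Horn & Schunck optical flow iteration (floating-point sketch)."""
    Ix = np.gradient(im1, axis=1)
    Iy = np.gradient(im1, axis=0)
    It = im2 - im1
    u = np.zeros_like(im1)
    v = np.zeros_like(im1)
    for _ in range(iters):
        # Neighbourhood averages of the current flow estimate.
        ub = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
              np.roll(u, 1, 1) + np.roll(u, -1, 1)) / 4.0
        vb = (np.roll(v, 1, 0) + np.roll(v, -1, 0) +
              np.roll(v, 1, 1) + np.roll(v, -1, 1)) / 4.0
        # Jointly enforce brightness constancy and smoothness.
        common = (Ix * ub + Iy * vb + It) / (alpha ** 2 + Ix ** 2 + Iy ** 2)
        u = ub - Ix * common
        v = vb - Iy * common
    return u, v

# Synthetic pair: a smooth blob translated one pixel to the right.
xx, yy = np.meshgrid(np.arange(64), np.arange(64))
im1 = np.exp(-((xx - 30.0) ** 2 + (yy - 32.0) ** 2) / 50.0)
im2 = np.exp(-((xx - 31.0) ** 2 + (yy - 32.0) ** 2) / 50.0)
u, v = horn_schunck(im1, im2)
```

Each iteration touches every pixel with a fixed small stencil, which is exactly the regularity that makes the algorithm a good fit for a deeply pipelined FPGA datapath.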
APA, Harvard, Vancouver, ISO, and other styles
37

Javadi, Mohammad Saleh. "Computer Vision Algorithms for Intelligent Transportation Systems Applications." Licentiate thesis, Blekinge Tekniska Högskola, Institutionen för matematik och naturvetenskap, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-17166.

Full text
Abstract:
In recent years, Intelligent Transportation Systems (ITS) have emerged as an efficient way of enhancing traffic flow, safety and management. These goals are realized by combining various technologies and analyzing the acquired data from vehicles and roadways. Among all ITS technologies, computer vision solutions have the advantages of high flexibility, easy maintenance and a high price-performance ratio that make them very popular for transportation surveillance systems. However, computer vision solutions are demanding and challenging due to computational complexity, reliability, efficiency and accuracy, among other aspects. In this thesis, three transportation surveillance systems based on computer vision are presented. These systems are able to interpret the image data and extract information about the presence, speed and class of vehicles, respectively. The image data in these proposed systems are acquired using an Unmanned Aerial Vehicle (UAV) as a non-stationary source and a roadside camera as a stationary source. The goal of these works is to enhance the general accuracy and robustness of the systems under varying illumination and traffic conditions. This is a compilation thesis in systems engineering consisting of three parts. The red thread through each part is a transportation surveillance system. The first part presents a change detection system using aerial images of a cargo port. The extracted information shows how the space is utilized at various times, aiming for further management and development of the port. The proposed solution can be used at different viewpoints and illumination levels, e.g. at sunset. The method is able to transform the images taken from different viewpoints and match them together. Thereafter, it detects discrepancies between the images using a proposed adaptive local threshold. In the second part, a video-based vehicle speed estimation system is presented.
The measured speeds are essential information for law enforcement, and they also provide an estimation of traffic flow at certain points on the road. The system employs several intrusion lines to extract the movement pattern of each vehicle (non-equidistant sampling) as an input feature to the proposed analytical model. In addition, other parameters such as the camera sampling rate and the distances between intrusion lines are also taken into account to address the uncertainty in the measurements and to obtain the probability density function of the vehicle's speed. In the third part, a vehicle classification system is provided to categorize vehicles into "private car", "light trailer", "lorry or bus" and "heavy trailer". This information can be used by authorities for surveillance and development of the roads. The proposed system consists of multiple fuzzy c-means clusterings using input features of length, width and speed of each vehicle. The system has been constructed by using prior knowledge of traffic regulations regarding each class of vehicle in order to enhance the classification performance.
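The fuzzy c-means step of the third part can be sketched as follows; the feature values, cluster count and all numbers below are hypothetical stand-ins, not the thesis's calibration:

```python
import numpy as np

def fuzzy_cmeans(X, c, m=2.0, iters=100, seed=0):
    """Minimal fuzzy c-means sketch: returns memberships U and cluster centers."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)          # rows are soft memberships
    for _ in range(iters):
        W = U ** m                              # fuzzified weights
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-9
        # Standard membership update: u_ik proportional to d_ik^(-2/(m-1)).
        p = 2.0 / (m - 1.0)
        U = (d ** -p) / np.sum(d ** -p, axis=1, keepdims=True)
    return U, centers

# Hypothetical features [length m, width m, speed km/h] for two vehicle groups.
rng = np.random.default_rng(1)
cars = rng.normal([4.5, 1.8, 90.0], 0.2, size=(20, 3))
lorries = rng.normal([12.0, 2.5, 80.0], 0.2, size=(20, 3))
X = np.vstack([cars, lorries])
U, centers = fuzzy_cmeans(X, c=2)
labels = U.argmax(axis=1)
print(labels)  # the two groups fall into two distinct clusters
```

Unlike hard k-means, each vehicle keeps a graded membership in every class, which is what lets prior knowledge about class boundaries be folded in afterwards.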
APA, Harvard, Vancouver, ISO, and other styles
38

Lim, Choon Kee. "Hypercube machine implementation of low-level vision algorithms." Ohio : Ohio University, 1989. http://www.ohiolink.edu/etd/view.cgi?ohiou1182864143.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Carletti, Angelo. "Development of a machine learning algorithm for the automatic analysis of microscopy images in an in-vitro diagnostic platform." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021.

Find full text
Abstract:
In this thesis we present the development of machine learning algorithms for single-cell analysis in an in-vitro diagnostic platform for Cellply, a startup that operates in precision medicine. We reviewed the state of the art of deep learning for biomedical image analysis, and analyzed the impact that convolutional neural networks have had on object detection tasks. We then compared neural networks currently used for cell detection, and chose the one (Stardist) that performs the most reliable detection, including in crowded-cell contexts. We trained models using the Stardist algorithm in the open-source platform ZeroCostDL4Mic, using code and GPUs in the Colab environment. We trained different models, intended for distinct applications, and evaluated them using metrics such as precision and recall. These are our results:
• a model for single-channel brightfield images taken from samples of Covid patients, with a precision of about 0.98 and a recall of about 0.96;
• a model for multi-channel images (i.e. a stack of multiple images, each one highlighting different content) taken from experiments on natural killer cells, with precision and recall of about 0.81;
• a model for multi-channel images taken from samples of AML (Acute Myeloid Leukemia) patients, with precision and recall of about 0.73;
• a simpler model, trained to detect the main area (named "well") in which cells can be found, in order to discard whatever lies outside this area; this model has a precision of about 1 and a recall of about 0.98.
Finally, we wrote Python code that reads a text input file containing the information needed to run a specified trained model for cell detection, with certain parameters, on a given set of images from a certain experiment. The output of the code is a .csv file storing measurements for every detected "object of interest" (i.e. cells or other particles). We also discuss future developments in this field.
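The metrics quoted above follow the usual detection definitions. As a sketch, with counts chosen to reproduce the first model's figures (the counts themselves are illustrative, not taken from the thesis):

```python
# Precision and recall from matched detections: precision penalizes spurious
# detections, recall penalizes missed ground-truth objects.
def precision_recall(true_positives, false_positives, false_negatives):
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall

# E.g. 96 cells correctly found, 2 spurious detections, 4 cells missed:
p, r = precision_recall(96, 2, 4)
print(round(p, 2), round(r, 2))  # → 0.98 0.96
```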
APA, Harvard, Vancouver, ISO, and other styles
40

El, Gheche Mireille. "Proximal methods for convex minimization of Phi-divergences : application to computer vision." Thesis, Paris Est, 2014. http://www.theses.fr/2014PEST1018/document.

Full text
Abstract:
Convex optimization aims at searching for the minimum of a convex function over a convex set. While the theory of convex optimization has been largely explored for about a century, several related developments have stimulated a new interest in the topic. The first one is the emergence of efficient optimization algorithms, such as proximal methods, which allow one to easily solve large-size nonsmooth convex problems in a parallel manner. The second development is the discovery of the fact that convex optimization problems are more ubiquitous in practice than was thought previously. In this thesis, we address two different problems within the framework of convex optimization. The first one is an application to computer stereo vision, where the goal is to recover the depth information of a scene from a pair of images taken from the left and right positions. The second one is the proposition of new mathematical tools to deal with convex optimization problems involving information measures, where the objective is to minimize the divergence between two statistical objects such as random variables or probability distributions. We propose a convex approach to address the problem of dense disparity estimation under varying illumination conditions. A convex energy function is derived for jointly estimating the disparity and the illumination variation. The resulting problem is tackled in a set theoretic framework and solved using proximal tools. It is worth emphasizing the ability of this method to process multicomponent images under illumination variation. The conducted experiments indicate that this approach can effectively deal with the local illumination changes and yields better results compared with existing methods. We then extend the previous approach to the problem of multi-view disparity estimation. Rather than estimating a single depth map, we estimate a sequence of disparity maps, one for each input image. 
We address this problem by adopting a discrete reformulation that can be efficiently solved through a convex relaxation. This approach offers the advantage of handling both convex and nonconvex similarity measures within the same framework. We have shown that the additional complexity required by the application of our method to the multi-view case is small with respect to the stereo case. Finally, we have proposed a novel approach to handle a broad class of statistical distances, called φ-divergences, within the framework of proximal algorithms. In particular, we have developed the expression of the proximity operators of several φ-divergences, such as the Kullback-Leibler, Jeffreys-Kullback, Hellinger, Chi-square, Iα, and Rényi divergences. This allows proximal algorithms to deal with problems involving such divergences, thus overcoming the limitations of current state-of-the-art approaches for similar problems. The proposed approach is validated in two different contexts. The first is an application to image restoration that illustrates how to employ divergences as a regularization term, while the second is an application to image registration that employs divergences as a data fidelity term.
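For reference, the proximity operator on which these developments rest is the standard object from convex analysis, for a step size $\gamma > 0$ and a proper lower-semicontinuous convex function $f$:

```latex
\operatorname{prox}_{\gamma f}(x) \;=\; \underset{y}{\operatorname{arg\,min}}\; f(y) \;+\; \frac{1}{2\gamma}\,\lVert y - x \rVert^{2}
```

Deriving this operator in closed form for each φ-divergence is what allows primal-dual proximal algorithms to handle objective terms built from such divergences directly.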
APA, Harvard, Vancouver, ISO, and other styles
41

Orloski, Andrey. "PROCEDIMENTO PARA AUTOLOCALIZAÇÃO DE ROBÔS EM CASAS DE VEGETAÇÃO UTILIZANDO DESCRITORES SURF: Implementação Sequencial e Paralela." UNIVERSIDADE ESTADUAL DE PONTA GROSSA, 2015. http://tede2.uepg.br/jspui/handle/prefix/130.

Full text
Abstract:
This paper describes a procedure for the self-localization of mobile, autonomous agrobots in greenhouses, that is, the determination of the robot's position relative to a coordinate system using computational procedures and resources. The proposed procedure uses computer vision techniques to recognize marker objects in the greenhouse and, from them, estimate the robot's coordinates in a plane parallel to the greenhouse surface. The detection of markers in the scene is performed using the SURF algorithm. To enable coordinate estimation from data contained in a single image, the method of Rahman et al. (2008), which consists in determining the distance between a camera and a marker object, was extended to allow the coordinate calculation. The performance of the procedure was evaluated in three experiments. In the first experiment, the objective was to verify, in the laboratory, the influence of image resolution on accuracy. The results indicate that reducing the image resolution impairs the range at which the procedure can recognize the markers. These results also show that reducing the resolution increases the error in estimating the coordinates relative to the distance between the camera and the marker. The second experiment ran a test evaluating the computational performance of the SURF algorithm, in terms of computing time, for image processing. This is important because agrobots usually need to perform tasks that demand real-time processing capacity. The results of this test indicate that the efficiency of the procedure drops as image resolution increases. A second test compared the processing time of two implementations of the algorithm: one a sequential version of the SURF algorithm, the other a parallel implementation.
The results of this test suggest that the parallel implementation is more efficient at all tested resolutions, with an almost constant proportional improvement. The third experiment was performed in a greenhouse to evaluate the performance of the proposed procedure in the environment for which it was designed. Field results were similar to the laboratory results, but indicate that lighting variations require parameter adjustments of the SURF algorithm.
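The distance-estimation idea attributed to Rahman et al. (2008), extended here to coordinates, can be sketched with the pinhole camera model. The focal length and marker size below are hypothetical values, not those of the procedure:

```python
# Pinhole-camera sketch: a marker of known physical width appears w_px pixels
# wide, which gives its depth; its pixel offset from the image center then
# gives the lateral coordinate in the same plane.
def marker_coordinates(f_px, marker_width_m, w_px, u_px, cx_px):
    z = f_px * marker_width_m / w_px   # depth from apparent size
    x = (u_px - cx_px) * z / f_px      # lateral offset from the optical axis
    return x, z

# Hypothetical numbers: 800 px focal length, 30 cm marker seen 60 px wide,
# centered 80 px right of the principal point.
x, z = marker_coordinates(f_px=800.0, marker_width_m=0.30, w_px=60.0,
                          u_px=720.0, cx_px=640.0)
print(x, z)  # → 0.4 4.0  (metres right of the axis, metres away)
```

The SURF matches supply `w_px` and `u_px` for each recognized marker; with markers at known greenhouse positions, these per-marker coordinates locate the robot.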
APA, Harvard, Vancouver, ISO, and other styles
42

Boisard, Olivier. "Optimization and implementation of bio-inspired feature extraction frameworks for visual object recognition." Thesis, Dijon, 2016. http://www.theses.fr/2016DIJOS016/document.

Full text
Abstract:
Industry has a growing need for so-called intelligent systems, capable of analysing the signals acquired by sensors and making decisions accordingly. Such systems are particularly useful for video-surveillance and quality-control applications. For cost and power-consumption reasons, it is desirable that the decision be made as close to the sensor as possible. A promising approach to this problem is to use so-called bio-inspired methods, which consist in applying computational models from biology or the cognitive sciences to industrial problems. The work carried out during this doctorate consisted in selecting bio-inspired feature extraction methods and optimizing them with the aim of implementing them on dedicated hardware platforms for computer vision applications. First, we propose a generic algorithm that can be used in different use cases, with acceptable complexity and a low memory footprint. Then, we propose optimizations for a more general method, based essentially on a simplification of the data coding, together with a hardware implementation based on these optimizations. Both contributions can, moreover, be applied to many methods beyond those studied in this document.
Industry has growing needs for so-called "intelligent systems", capable of not only acquiring data, but also of analysing it and making decisions accordingly. Such systems are particularly useful for video-surveillance, in which case alarms must be raised in case of an intrusion. For cost saving and power consumption reasons, it is better to perform that processing as close to the sensor as possible. To address that issue, a promising approach is to use bio-inspired frameworks, which consist in applying computational biology models to industrial applications. The work carried out during this thesis consisted in selecting bio-inspired feature extraction frameworks, and optimizing them with the aim of implementing them on a dedicated hardware platform, for computer vision applications. First, we propose a generic algorithm, which may be used in several use case scenarios, having an acceptable complexity and a low memory footprint. Then, we proposed optimizations for a more global framework, based on precision degradation in computations, hence easing its implementation on embedded systems. Results suggest that while the framework we developed may not be as accurate as the state of the art, it is more generic. Furthermore, the optimizations we proposed for the more complex framework are fully compatible with other optimizations from the literature, and provide encouraging perspectives for future developments. Finally, both contributions have a scope that goes beyond the sole frameworks that we studied, and may be used in other, more widely used frameworks as well.
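The "precision degradation" optimization mentioned in this abstract amounts to coarsening the numeric coding of feature values so they fit cheaper hardware. A minimal sketch of uniform quantization illustrates the idea — this is an assumption-laden illustration, not the thesis' actual coding scheme:

```python
import numpy as np

def quantize(features, bits):
    """Uniformly quantize values to 2**bits levels over their observed
    range. Coarser codes shrink the memory footprint at the cost of a
    bounded rounding error (half a quantization step)."""
    lo, hi = features.min(), features.max()
    levels = 2 ** bits - 1
    q = np.round((features - lo) / (hi - lo) * levels)
    return q / levels * (hi - lo) + lo

x = np.array([0.0, 0.1, 0.5, 0.9, 1.0])
x4 = quantize(x, 4)   # 16-level approximation of x
```

Each value is reconstructed to within half a step of the 16-level grid, the kind of accuracy/footprint trade-off the abstract refers to.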
APA, Harvard, Vancouver, ISO, and other styles
43

Habe, Hitoshi. "Geometric information processing methods for elaborating computer vision algorithms." 京都大学 (Kyoto University), 2006. http://hdl.handle.net/2433/136028.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Nilsson, Mattias. "Evaluation of Computer Vision Algorithms Optimized for Embedded GPU:s." Thesis, Linköpings universitet, Datorseende, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-112575.

Full text
Abstract:
The interest in using GPUs as general processing units for heavy computations (GPGPU) has increased in the last couple of years. Manufacturers such as Nvidia and AMD make GPUs powerful enough to outperform CPUs by an order of magnitude for suitable algorithms. For embedded systems, GPUs are not as popular yet. The embedded GPUs available on the market have often not been able to justify hardware changes from the current systems (CPUs and FPGAs) to systems using embedded GPUs: they have been too hard to obtain, too energy-consuming, and not suitable for some algorithms. At SICK IVP, advanced computer vision algorithms run on FPGAs. This master thesis optimizes two such algorithms for embedded GPUs and evaluates the result. It also evaluates the status of the embedded GPUs on the market today. The results indicate that embedded GPUs perform well enough to run the evaluated algorithms as fast as needed. The implementations are also easier to understand than the competing FPGA implementations.
APA, Harvard, Vancouver, ISO, and other styles
45

Apewokin, Senyo. "Efficiently mapping high-performance early vision algorithms onto multicore embedded platforms." Diss., Atlanta, Ga. : Georgia Institute of Technology, 2009. http://hdl.handle.net/1853/28256.

Full text
Abstract:
Thesis (M. S.)--Electrical and Computer Engineering, Georgia Institute of Technology, 2009.
Committee Chair: Wills, Scott; Committee Co-Chair: Wills, Linda; Committee Member: Bader, David; Committee Member: Davis, Jeff; Committee Member: Hamblen, James; Committee Member: Lanterman, Aaron.
APA, Harvard, Vancouver, ISO, and other styles
46

Головацький, Ігор Володимирович. "Інтелектуальна система розпізнавання елементів дорожнього руху." Master's thesis, КПІ ім. Ігоря Сікорського, 2019. https://ela.kpi.ua/handle/123456789/31808.

Full text
Abstract:
The thesis addresses the problem of recognizing traffic elements in a video stream, analyses the problems and difficulties of existing recognition methods, and compares their accuracy and speed characteristics, advantages and disadvantages. An intelligent system for recognizing traffic elements was developed using machine learning algorithms and neural networks. The system can be used in dashboard video recorders and in passive vehicle safety systems. The work discusses the purpose and feasibility of using a neural network and presents a software implementation of the system in the C# programming language with the Accord.NET library, whose main requirements are: acceptable recognition accuracy, the ability to use a video stream as input, intuitive highlighting of detected elements among the others, and simplicity of configuration. Particular attention was paid to the local experimental results, which give an idea of the characteristics of the proposed system. Keywords: intelligent system, neural network, machine learning, algorithm, computer vision, traffic. The explanatory note is 81 pages long and contains 23 illustrations, 28 tables, and 6 appendices.
The thesis examines the problem of recognizing traffic elements in a video stream, analyses the problems and complexities of existing recognition methods, and compares their accuracy and speed characteristics, advantages and disadvantages. An intelligent system for recognizing traffic elements was built using machine learning algorithms and neural networks. The system can be used in video recorders and passive vehicle security systems. The paper addresses the purpose and feasibility of using a neural network and presents the software implementation of the system using the C# programming language and the Accord.NET library. Its main requirements are: acceptable recognition accuracy, the ability to use a video stream as input, intuitive highlighting of detected elements among the others, and simplicity of configuration. Special attention was paid to the local results of the experiments, which give an idea of the characteristics of the proposed system. Explanatory note size: 81 pages, containing 23 illustrations, 28 tables, and 6 appendices.
APA, Harvard, Vancouver, ISO, and other styles
47

Vivet, Tañà Marc. "Fast Computer Vision Algorithms applied to Motion Detection and Mosaicing." Doctoral thesis, Universitat Autònoma de Barcelona, 2013. http://hdl.handle.net/10803/125980.

Full text
Abstract:
This thesis focuses on motion detection and its use for summarizing video scenes in mosaic images. While building mosaic images with pivoting cameras is a well-known topic, this is not the case for freely moving cameras. The first step is to align all the images into a single coordinate system. This process, called image alignment, comes from estimating the transformation that projects each video frame into this common coordinate system. The mosaic image is generated by assigning to each point a value derived from the information conveyed by the different images covering that point. Motion and mosaics are deeply related. The thesis is structured in six chapters. After an introduction to the perceptual aspects of motion in a video sequence and an outline of the thesis, the second chapter addresses the problem of motion detection with static cameras. To this end, an extensive description of the background subtraction algorithms in the literature is presented, followed by the background subtraction algorithm developed in the thesis. This algorithm combines different visual cues and uses a probabilistic graphical model to guarantee the spatio-temporal coherence of the background model. The model represents each pixel as a random variable with two states, background and foreground. A Markov Random Field (MRF) then describes the correlation between neighbouring pixels in the space-time volume. In addition, a general framework for combining different motion-related information sources is presented in order to increase the accuracy of the motion mask. The next step is to tackle the problem of motion detection when the camera is not static, which is analysed in chapter 3. In particular, the case without parallax is considered.
This is a common case, as PTZ cameras and aerial perspectives do not produce motion parallax. To compensate the 2D affine transformations caused by the camera, Multiple Kernel Tracking is used, assuming that most of the frame belongs to the background. Multiple Kernel Tracking is first introduced, describing how it can be formulated for this particular purpose. Then the generation of the background mosaic is defined and its ability to adapt over time is validated. Chapter 4 presents a new image alignment algorithm, the Direct-Local Indirect-Global (DLIG), which compensates the 2D motion by means of a projective transformation. The key idea of DLIG alignment is to divide the image alignment problem into the problem of registering a set of spatially related image patches. Patch registration is computed iteratively, imposing both a good local match and good global spatial coherence. Each patch is aligned using a tracking algorithm, so very efficient local matching is achieved. The algorithm uses patch registration to obtain multi-frame registration, using mosaic coordinates to relate the current image patch to patches from other images that partially share the field of view. Multi-frame registration prevents the error accumulation problem, one of the most important problems in mosaicing. It is also shown how to embed a kernel-based tracking algorithm in order to obtain an accurate and efficient mosaicing algorithm. Chapter 5 tackles the problem of generating mosaics when the recorded scene contains motion parallax. The developed solution aligns the video sequence in a space-time volume based on efficient feature tracking using a Kernel Tracking algorithm.
The computation is fast and, although motion is computed only for a few regions of the image, it still provides an accurate 3D motion estimate; it is faster and more accurate than the state of the art based on direct alignment methods. The synthesis of the mosaic image is addressed with the novel Barcode Blending presented in the thesis, a new and very efficient method for applying pyramid blending to mosaic images. Barcode Blending overcomes the complexity of building pyramids for multiple narrow strips by combining all the strips in a single blending step. Finally, the thesis closes with conclusions and future work in chapter 6.
This thesis is focused on motion detection and its use for the summarization of video scenes in mosaic images. While mosaicing with pivoting cameras is a well-known topic, this is not the case with full-motion cameras. The first step is to align all the images into a single coordinate system. This process, named image alignment, comes from the estimation of the transform that projects every video image into this common coordinate system. The mosaic image is generated by assigning to each point some value derived from the information conveyed by the different images with information about that point. Motion and mosaicing are deeply related. The thesis is organized in six chapters. After an introduction to the perceptual aspects of motion in a video sequence and an exposition of the plan of the thesis, the second chapter deals with the problem of detecting motion using static cameras. To this end, an extensive description of the main background subtraction algorithms in the literature is presented, followed by the original background subtraction algorithm developed in the thesis. This algorithm combines different visual cues and uses a probabilistic graphical model to provide spatio-temporal consistency to the background model. This model represents each pixel as a random variable with two states, background and foreground. A Markov Random Field (MRF) is then used to describe the correlation between neighbouring pixels in the space-time volume. In addition, a general framework to combine different motion-related information sources is presented in order to increase the accuracy of the motion mask. The next step is to face the problem of detecting motion when the camera is not static, which is analysed in chapter 3. In particular, the case with no parallax is considered. This is a common case, as PTZ cameras or aerial perspectives do not produce motion parallax.
It is proposed to compensate for 2D affine transformations caused by the camera by using Multiple Kernel Tracking, assuming that the major part of the frame belongs to the background. The first step is to introduce Multiple Kernel Tracking, describing how it can be formulated for this particular purpose. Then the generation of the background mosaic is defined, together with its adaptability over time. Chapter 4 presents a new frame alignment algorithm, the Direct-Local Indirect-Global (DLIG), which compensates the 2D motion using a projective transformation. The key idea of the DLIG alignment is to divide the frame alignment problem into the problem of registering a set of spatially related image patches. The registration is iteratively computed by sequentially imposing a good local match and global spatial coherence. The patch registration is performed using a tracking algorithm, so very efficient local matching can be achieved. The algorithm uses the patch-based registration to obtain multiframe registration, using the mosaic coordinates to relate the current frame to patches from different frames that partially share the current field of view. Multiframe registration prevents the error accumulation problem, one of the most important problems in mosaicing. It is also shown how to embed a Kernel Tracking algorithm in order to obtain a precise and efficient mosaicing algorithm. Chapter 5 moves to the problem of generating mosaics when the recorded scene contains motion parallax. The developed solution proposes to align the video sequence in a space-time volume based on efficient feature tracking using a Kernel Tracking algorithm. Computation is fast and, as motion is computed only for a few regions of the image, it still gives accurate 3D motion. This computation is faster and more accurate than previous work based on a direct alignment method.
The synthesis of the mosaic image is faced with the novel Barcode Blending, a new and very efficient approach for using pyramid blending in video mosaics. Barcode Blending overcomes the complexity of building pyramids for multiple narrow strips by combining all strips in a single blending step. The thesis finishes with the conclusions and future work in chapter 6.
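The thesis' background subtraction combines several cues under an MRF; the underlying idea can be conveyed with a much simpler baseline, a running-average background model with per-pixel thresholding. This is an illustrative sketch only, not the algorithm developed in the thesis:

```python
import numpy as np

def update_background(bg, frame, alpha=0.05):
    """Exponential running average: bg <- (1 - alpha) * bg + alpha * frame.
    Slowly absorbs gradual scene changes into the background model."""
    return (1 - alpha) * bg + alpha * frame

def foreground_mask(bg, frame, thresh=25):
    """Pixels deviating from the background model by more than the
    threshold are flagged as foreground (two states per pixel)."""
    return np.abs(frame.astype(float) - bg) > thresh

bg = np.zeros((4, 4))                     # learned background so far
frame = np.zeros((4, 4))
frame[1:3, 1:3] = 200                     # a bright moving object appears
mask = foreground_mask(bg, frame)         # True on the 2x2 object block
bg = update_background(bg, frame)         # background slowly adapts
```

An MRF-based model, as in the thesis, would additionally couple each pixel's label to its space-time neighbours instead of thresholding pixels independently.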
APA, Harvard, Vancouver, ISO, and other styles
48

Avdiu, Blerta. "Matching Feature Points in 3D World." Thesis, Tekniska Högskolan, Högskolan i Jönköping, JTH, Data- och elektroteknik, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:hj:diva-23049.

Full text
Abstract:
This thesis deals with a highly topical subject in the computer vision field, scene understanding, approached through the matching of 3D feature point images. The objective is to make use of Saab's latest breakthrough in the extraction of 3D feature points to identify the best alignment of at least two 3D feature point images. The thesis gives a theoretical overview of the latest algorithms used for feature detection, description and matching. The work continues with a brief description of the simultaneous localization and mapping (SLAM) technique, ending with a case study evaluating the newly developed software solution for SLAM, called slam6d. Slam6d is a tool that registers point clouds into a common coordinate system, performing an automatic, high-accuracy registration of laser scans. In the case study, the use of slam6d is extended to registering 3D feature point images extracted from a stereo camera, and the results of the registration are analysed. The case study starts with the registration of a single 3D feature point image captured from a stationary image sensor and continues with the registration of multiple images following a trail. The conclusion from the case study is that slam6d can register non-laser-scan feature point images with high accuracy in the single-image case, but introduces some overlapping errors in the case of multiple images following a trail.
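Registering two 3D feature point sets into a common coordinate system can be illustrated, in its simplest noise-free form with known correspondences, by the classical Kabsch algorithm for rigid alignment. This is a sketch of the general technique, not slam6d's actual implementation:

```python
import numpy as np

def rigid_align(P, Q):
    """Least-squares rotation R and translation t mapping point set P
    onto Q (rows are corresponding 3D points), via the Kabsch algorithm."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                 # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cq - R @ cp
    return R, t

# Recover a known 90-degree rotation about z plus a shift.
R_true = np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 1.]])
P = np.random.default_rng(0).random((10, 3))
Q = P @ R_true.T + np.array([1.0, 2.0, 3.0])
R, t = rigid_align(P, Q)
```

With noisy data and unknown correspondences, tools such as slam6d iterate matching and alignment (ICP-style); the closed-form step above is the core of each such iteration.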
APA, Harvard, Vancouver, ISO, and other styles
49

Kim, Kyungnam. "Algorithms and evaluation for object detection and tracking in computer vision." College Park, Md. : University of Maryland, 2005. http://hdl.handle.net/1903/2925.

Full text
Abstract:
Thesis (Ph. D.) -- University of Maryland, College Park, 2005.
Thesis research directed by: Computer Science. Title from t.p. of PDF. Includes bibliographical references. Published by UMI Dissertation Services, Ann Arbor, Mich. Also available in paper.
APA, Harvard, Vancouver, ISO, and other styles
50

Gu, Jian. "Development of computer vision algorithms using J2ME for mobile phone applications." Thesis, University of Canterbury. Computer Science and Software Engineering, 2009. http://hdl.handle.net/10092/2683.

Full text
Abstract:
This thesis describes research on the use of Java to develop cross-platform computer vision applications for mobile phones with integrated cameras. The particular area of research that we are interested in is Mobile Augmented Reality (AR). Currently there is no computer vision library which can be used for mobile Augmented Reality using the J2ME platform. This thesis introduces the structure of our J2ME computer vision library and describes the implementation of algorithms in our library. We also present several sample applications on J2ME enabled mobile phones and report on experiments conducted to evaluate the compatibility, portability and efficiency of the implemented algorithms.
APA, Harvard, Vancouver, ISO, and other styles