
Dissertations / Theses on the topic 'Object detection in images'

Consult the top 50 dissertations / theses for your research on the topic 'Object detection in images.'

1

Kok, R. "An object detection approach for cluttered images." Thesis, Stellenbosch : Stellenbosch University, 2003. http://hdl.handle.net/10019.1/53281.

Abstract:
Thesis (MScEng)--Stellenbosch University, 2003.
ENGLISH ABSTRACT: We investigate object detection against cluttered backgrounds, based on the MINACE (Minimum Noise and Correlation Energy) filter. Application of the filter is followed by a suitable segmentation algorithm, and the standard techniques of global and local thresholding are compared to watershed-based segmentation. The aim of this approach is to provide a custom region-based object detection algorithm with a concise set of regions of interest. Two industrial case studies are examined: diamond detection in X-ray images, and the reading of a dynamic, ink-stamped 2D barcode on cluttered packaging. We demonstrate the robustness of our approach on these two diverse applications, and develop a complete algorithmic prototype for an automatic stamped-code reader.
AFRIKAANSE OPSOMMING: This thesis investigates the recognition of objects against cluttered backgrounds. Our approach relies on the MINACE ("Minimum Noise and Correlation Energy") correlation filter. The filter is applied together with a suitable segmentation algorithm, and the standard techniques of global and local thresholding are compared with a watershed-based segmentation algorithm. The aim of this detection approach is to supply a small set of candidate objects to any classification algorithm that focuses on the objects themselves. Two industrial applications are investigated: the detection of diamonds in X-ray images, and the reading of a dynamic, ink-printed 2D barcode on packaging material. We demonstrate the robustness of our approach with these two diverse examples, and develop a complete algorithmic prototype for an automatic stamped-code reader.
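As a rough illustration of the correlation-filter-plus-thresholding pipeline described above, the sketch below uses a plain frequency-domain matched filter as a stand-in for the MINACE filter; the template, the relative threshold and the function names are illustrative assumptions, not the thesis's implementation.

```python
import numpy as np

def correlation_plane(image, template):
    """Correlate an image with a template in the frequency domain
    (a plain matched filter standing in for the MINACE filter)."""
    H, W = image.shape
    F_img = np.fft.fft2(image)
    F_tpl = np.fft.fft2(template, s=(H, W))   # zero-pad template to image size
    return np.real(np.fft.ifft2(F_img * np.conj(F_tpl)))

def candidate_regions(corr, rel_threshold=0.7):
    """Global threshold on the correlation plane, giving a concise
    set of candidate regions for a downstream classifier."""
    return corr > rel_threshold * corr.max()

# usage: mask = candidate_regions(correlation_plane(xray_image, diamond_template))
```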
2

Mohan, Anuj 1976. "Robust object detection in images by components." Thesis, Massachusetts Institute of Technology, 1999. http://hdl.handle.net/1721.1/80554.

3

Grahn, Fredrik, and Kristian Nilsson. "Object Detection in Domain Specific Stereo-Analysed Satellite Images." Thesis, Linköpings universitet, Datorseende, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-159917.

Abstract:
Given satellite images with accompanying pixel classifications and elevation data, we propose different solutions to object detection. The first method uses hierarchical clustering for segmentation and then employs different methods of classification. One of these classification methods uses domain knowledge to classify objects, while the other uses Support Vector Machines. Additionally, a combination of three Support Vector Machines was used in a hierarchical structure, which outperformed the regular Support Vector Machine method in most of the evaluation metrics. The second approach is more conventional, using different types of Convolutional Neural Networks. A segmentation network was used as well as a few detection networks and different fusions between these. The Convolutional Neural Network approach proved to be the better of the two in terms of precision and recall, but the clustering approach was not far behind. This work was done using a relatively small amount of data, which potentially could have impacted the results of the Machine Learning models negatively.
4

Papageorgiou, Constantine P. "A Trainable System for Object Detection in Images and Video Sequences." Thesis, Massachusetts Institute of Technology, 2000. http://hdl.handle.net/1721.1/5566.

Abstract:
This thesis presents a general, trainable system for object detection in static images and video sequences. The core system finds a certain class of objects in static images of completely unconstrained, cluttered scenes without using motion, tracking, or handcrafted models and without making any assumptions on the scene structure or the number of objects in the scene. The system uses a set of training data of positive and negative example images as input, transforms the pixel images to a Haar wavelet representation, and uses a support vector machine classifier to learn the difference between in-class and out-of-class patterns. To detect objects in out-of-sample images, we do a brute force search over all the subwindows in the image. This system is applied to face, people, and car detection with excellent results. For our extensions to video sequences, we augment the core static detection system in several ways -- 1) extending the representation to five frames, 2) implementing an approximation to a Kalman filter, and 3) modeling detections in an image as a density and propagating this density through time according to measured features. In addition, we present a real-time version of the system that is currently running in a DaimlerChrysler experimental vehicle. As part of this thesis, we also present a system that, instead of detecting full patterns, uses a component-based approach. We find it to be more robust to occlusions, rotations in depth, and severe lighting conditions for people detection than the full body version. We also experiment with various other representations including pixels and principal components and show results that quantify how the number of features, color, and gray-level affect performance.
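The brute-force subwindow search described in this abstract can be pictured with the short sketch below; `classify_window` is a hypothetical stand-in for the Haar-wavelet + SVM stage, and the window size and stride are arbitrary assumptions.

```python
import numpy as np

def sliding_window_detect(image, classify_window, window=(128, 64), stride=8):
    """Evaluate every subwindow of the image with a window classifier and
    keep the windows whose decision value is positive (above the SVM margin)."""
    detections = []
    H, W = image.shape[:2]
    wh, ww = window
    for y in range(0, H - wh + 1, stride):
        for x in range(0, W - ww + 1, stride):
            score = classify_window(image[y:y + wh, x:x + ww])
            if score > 0:
                detections.append((x, y, ww, wh, score))
    return detections  # repeat over rescaled copies of the image for multi-scale detection
```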
5

Gonzalez-Garcia, Abel. "Image context for object detection, object context for part detection." Thesis, University of Edinburgh, 2018. http://hdl.handle.net/1842/28842.

Abstract:
Objects and parts are crucial elements for achieving automatic image understanding. The goal of the object detection task is to recognize and localize all the objects in an image. Similarly, semantic part detection attempts to recognize and localize the object parts. This thesis proposes four contributions. The first two make object detection more efficient by using active search strategies guided by image context. The last two involve parts. One of them explores the emergence of parts in neural networks trained for object detection, whereas the other improves on part detection by adding object context. First, we present an active search strategy for efficient object class detection. Modern object detectors evaluate a large set of windows using a window classifier. Instead, our search sequentially chooses what window to evaluate next based on all the information gathered before. This results in a significant reduction in the number of window evaluations needed to detect the objects in the image. We guide our search strategy using image context and the score of the classifier. In our second contribution, we extend this active search to jointly detect pairs of object classes that appear close in the image, exploiting the valuable information that one class can provide about the location of the other. This leads to an even further reduction in the number of necessary evaluations for the smaller, more challenging classes. In the third contribution of this thesis, we study whether semantic parts emerge in Convolutional Neural Networks trained for different visual recognition tasks, especially object detection. We perform two quantitative analyses that provide a deeper understanding of their internal representation by investigating the responses of the network filters. Moreover, we explore several connections between discriminative power and semantics, which provides further insights into the role of semantic parts in the network. Finally, the last contribution is a part detection approach that exploits object context. We complement part appearance with the object appearance, its class, and the expected relative location of the parts inside it. We significantly outperform approaches that use part appearance alone in this challenging task.
6

Gadsby, David. "Object recognition for threat detection from 2D X-ray images." Thesis, Manchester Metropolitan University, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.493851.

Abstract:
This thesis examines methods to identify threat objects inside airport hand-held passenger baggage. The work presents techniques for the enhancement and classification of objects from 2-dimensional x-ray images. It has been conducted in collaboration with Manchester Aviation Services and uses test images from real x-ray baggage machines. The research attempts to overcome the key problem of object occlusion that impedes the performance of x-ray baggage operators identifying threat objects such as guns and knives in x-ray images. Object occlusions can hide key information on the appearance of an object and potentially lead to a threat item entering an aircraft.
7

Vi, Margareta. "Object Detection Using Convolutional Neural Network Trained on Synthetic Images." Thesis, Linköpings universitet, Datorseende, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-153224.

Abstract:
Training data is the bottleneck for training Convolutional Neural Networks. A larger dataset gives better accuracy, but also requires longer training time. It is shown that fine-tuning neural networks on synthetically rendered images increases the mean average precision. This method was applied to two different datasets with five distinctive objects in each. The first dataset consisted of random objects with different geometric shapes. The second dataset contained objects used to assemble IKEA furniture. The neural network with the best performance, trained on 5400 images, achieved a mean average precision of 0.81 on a test set sampled from a video sequence. The impact of dataset size, batch size, number of training epochs, and different network architectures was analysed. Using synthetic images to train CNNs is a promising path for object detection where access to large amounts of annotated image data is hard to come by.
8

Rickert, Thomas D. (Thomas Dale) 1975. "Texture-based statistical models for object detection in natural images." Thesis, Massachusetts Institute of Technology, 1999. http://hdl.handle.net/1721.1/80570.

Abstract:
Thesis (S.B. and M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1999.
Includes bibliographical references (p. 63-65).
by Thomas D. Rickert.
S.B. and M.Eng.
9

Jangblad, Markus. "Object Detection in Infrared Images using Deep Convolutional Neural Networks." Thesis, Uppsala universitet, Avdelningen för systemteknik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-355221.

Abstract:
In this master's thesis on object detection (OD) using deep convolutional neural networks (DCNN), OD is tested when applied to infrared (IR) images. The goal is to use both long wave infrared (LWIR) images and short wave infrared (SWIR) images taken from an airplane in order to train a DCNN to detect runways, Precision Approach Path Indicator (PAPI) lights, and approach lights. The motivation for detecting these objects in IR images is that IR light is transmitted better than visible light under certain weather conditions, for example fog. Such a system could then help the pilot detect the runway in bad weather. The RetinaNet model architecture was used and modified in different ways to find the best performing model. The models contain parameters that are found during the training process, but some parameters, called hyperparameters, need to be determined in advance. A way to automatically find good values of these hyperparameters was also tested. In the hyperparameter optimization, the Bayesian optimization method proved to create a model with performance equal to the best performance achieved by the author using manual hyperparameter tuning. The OD system was implemented using Keras with a TensorFlow backend and achieved high performance (mAP = 0.9245) on the test data. The system manages to detect the wanted objects in the images but is expected to perform worse in a general situation, since the training data and test data are very similar. In order to further develop this system and to improve performance under general conditions, more data is needed from other airfields and under different weather conditions.
10

Melcherson, Tim. "Image Augmentation to Create Lower Quality Images for Training a YOLOv4 Object Detection Model." Thesis, Uppsala universitet, Signaler och system, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-429146.

Abstract:
Research in the Arctic is of ever-growing importance, and modern technology is used in new ways to map and understand this very complex region and how it is affected by climate change. Here, animals and vegetation are tightly coupled with their environment in a fragile ecosystem, and when the environment undergoes rapid changes these ecosystems risk being damaged severely. Understanding what kind of data has the potential to be used in artificial intelligence can be important, as many research stations have data archives from decades of work in the Arctic. In this thesis, a YOLOv4 object detection model has been trained on two classes of images to investigate the performance impact of disturbances in the training data set. An expanded data set was created by augmenting the initial data to contain various disturbances. A model was successfully trained on the augmented data set, and a correlation between worse performance and the presence of noise was detected, but changes in saturation and altered colour levels seemed to have less impact than expected. Reducing noise in gathered data is seemingly of greater importance than enhancing images with lacking colour levels. Further investigations with a larger and more thoroughly processed data set are required to gain a clearer picture of the impact of the various disturbances.
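A minimal sketch of the kind of disturbances discussed above (additive noise and reduced saturation); the noise level and blending factor are assumptions chosen only for illustration and do not reproduce the augmentation pipeline of the thesis.

```python
import numpy as np

def add_gaussian_noise(img, sigma=15.0):
    """Degrade a uint8 RGB image with additive Gaussian noise."""
    noisy = img.astype(np.float32) + np.random.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def reduce_saturation(img, factor=0.5):
    """Crudely desaturate by blending each pixel towards its grey value."""
    grey = img.astype(np.float32).mean(axis=2, keepdims=True)
    out = grey + factor * (img.astype(np.float32) - grey)
    return np.clip(out, 0, 255).astype(np.uint8)

# augmented = reduce_saturation(add_gaussian_noise(original_frame))
```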
11

Yang, Xingwei. "Shape Based Object Detection and Recognition in Silhouettes and Real Images." Diss., Temple University Libraries, 2011. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/111091.

Abstract:
Computer and Information Science
Ph.D.
Shape is essential for detecting and recognizing objects, and it is robust to changes in illumination and color. Humans can recognize objects from shape alone, so shape-based object detection and recognition methods have been popular for many years. Because of the difficulty of segmentation, some researchers have worked on silhouettes instead of real images. The main problem in this area is object recognition, and the difficulty is to handle shape articulation and distortion. Previous methods mainly focus on one-to-one shape similarity measurement, which ignores the context information between shapes. Instead, we utilize graph-transduction methods to reveal the intrinsic relation between shapes on the 'shape manifold'. Our methods consider the context information in the dataset, which improves performance considerably. To better describe the manifold structure, we also propose a novel method of adding synthetic data points to densify the data manifold. The experimental results show the advantage of the algorithm. Moreover, a novel diffusion process on a Tensor Product Graph is carried out for learning better affinities between data. This is also used for shape retrieval, where it reaches the best results reported on the MPEG-7 dataset. As shapes are important and helpful for object detection and recognition in real images, many methods have used shapes to detect and recognize objects. There are two important parts to shape-based methods: model construction, and object detection and recognition. Most current methods are based on hand-selected models, which is helpful but not extendable. To solve this problem, we propose to construct the model by shape matching between some silhouettes and one hand-decomposed silhouette. This weakly supervised method can be used not only to learn the models in one object class, but also to transfer the structural knowledge to other classes which share a similar structure with the hand-decomposed silhouette. The other problem is detecting and recognizing objects. Many methods search the images with a sliding window to detect objects, which can find the global solution but with high complexity. Instead, we use sampling methods to reduce the complexity. The method we utilize is the particle filter, which is popular in robot mapping and localization. We modified the standard particle filter to make it suitable for static observations, and it is very helpful for object detection. Moreover, the usage of the particle filter is extended to solving the jigsaw puzzle problem, where puzzle pieces are square image patches. The proposed method is able to reach much better results than the method with Loopy Belief Propagation.
Temple University--Theses
12

To, Thang Long Information Technology & Electrical Engineering Australian Defence Force Academy UNSW. "Video object segmentation using phase-base detection of moving object boundaries." Awarded by: University of New South Wales - Australian Defence Force Academy. School of Information Technology and Electrical Engineering, 2005. http://handle.unsw.edu.au/1959.4/38705.

Abstract:
A video sequence often contains a number of objects. For each object, the motion of its projection on the video frames is affected by its movement in 3-D space, as well as the movement of the camera. Video object segmentation refers to the task of delineating and distinguishing different objects that exist in a series of video frames. Segmentation of moving objects from a two-dimensional video is difficult due to the lack of depth information at the boundaries between different objects. As the motion incoherency of a region is intrinsically linked to the presence of such boundaries and vice versa, a failure to recognise a discontinuity in the motion field, or the use of an incorrect motion, often leads directly to errors in the segmentation result. In addition, many defects in a segmentation mask are also located in the vicinity of moving object boundaries, due to the unreliability of motion estimation in these regions. The approach to segmentation in this work comprises of three stages. In the first part, a phase-based method is devised for detection of moving object boundaries. This detection scheme is based on the characteristics of a phase-matched difference image, and is shown to be sensitive to even small disruptions to a coherent motion field. In the second part, a spatio-temporal approach for object segmentation is introduced, which involves a spatial segmentation in the detected boundary region, followed by a motion-based region-merging operation using three temporally adjacent video frames. In the third stage, a multiple-frame approach for stabilisation of object masks is introduced to alleviate the defects which may have existed earlier in a local segmentation, and to improve upon the temporal consistency of object boundaries in the segmentation masks along a sequence. The feasibility of the proposed work is demonstrated at each stage through examples carried out on a number of real video sequences. In the presence of another object motion, the phase-based boundary detection method is shown to be much more sensitive than direct measures such as sum-of-squared error on a motion-compensated difference image. The three-frame segmentation scheme also compares favourably with a recently proposed method initiated from a non-selective spatial segmentation. In addition, improvements in the quality of the object masks after the stabilisation stage are also observed both quantitatively and visually. The final segmentation result is then used in an experimental object-based video compression framework, which also shows improvements in efficiency over a contemporary video coding method.
13

Pathare, Sneha P. "Detection of black-backed jackal in still images." Thesis, Stellenbosch : Stellenbosch University, 2015. http://hdl.handle.net/10019.1/97023.

Abstract:
Thesis (MSc)--Stellenbosch University, 2015.
ENGLISH ABSTRACT: In South Africa, black-backed jackal (BBJ) predation of sheep causes heavy losses to sheep farmers. Different control measures such as shooting, gin-traps and poisoning have been used to control the jackal population; however, these techniques also kill many harmless animals, as they fail to differentiate between BBJ and harmless animals. In this project, a system is implemented to detect black-backed jackal faces in images. The system was implemented using the Viola-Jones object detection algorithm. This algorithm was originally developed to detect human faces, but can also be used to detect a variety of other objects. The three key features of the Viola-Jones algorithm are the representation of an image as a so-called "integral image", the use of the Adaboost boosting algorithm for feature selection, and the use of a cascade of classifiers to reduce false alarms. In this project, Python code has been developed to extract Haar features from BBJ images, which act as a classifier to distinguish between a BBJ and the background. Furthermore, the feature selection is done using the Asymboost instead of the Adaboost algorithm, so as to achieve a high detection rate and a low false positive rate. A cascade of strong classifiers is trained using a cascade learning algorithm. The inclusion of a special fifth type of Haar feature, adapted to the relative spacing of the jackal's eyes, improves accuracy further. The final system detects 78% of the jackal faces, while only 0.006% of other image frames are wrongly identified as faces.
AFRIKAANSE OPSOMMING: Black-backed jackals cause heavy livestock losses in South Africa. Countermeasures such as hunting, gin-traps and poisoning are widely used, but are not selective enough and therefore also kill many non-target species. In this project a system was developed to find black-backed jackal faces in static images. The Viola-Jones detection algorithm, originally developed for the detection of human faces, was used for this purpose. Three key aspects of this algorithm are the representation of an image by means of a so-called integral image, the use of the "Adaboost" algorithm to select suitable features, and the use of a cascade of classifiers to lower false-alarm rates. In this project Python code was developed to extract the most useful "Haar" features for the detection of these jackals. Experiments were done to contrast the usefulness of the "Asymboost" algorithm with that of the "Adaboost" algorithm. A cascade of classifiers was trained and compared for both of these techniques. The results show that the features produced by the "Asymboost" algorithm lead to lower false-alarm rates. The addition of a special fifth type of Haar feature, adapted to the relative spacing of the jackal's eyes, further increases the accuracy. The final system finds 78% of the faces while only 0.006% of other image frames are wrongly classified as faces.
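To make the integral-image idea behind the Viola-Jones detector concrete, here is a small sketch; it shows only a single two-rectangle Haar-like feature and is not the classifier trained in the thesis.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero row/column prepended,
    so any box sum needs only four lookups."""
    ii = np.cumsum(np.cumsum(img.astype(np.int64), axis=0), axis=1)
    return np.pad(ii, ((1, 0), (1, 0)))

def box_sum(ii, r, c, h, w):
    """Sum of pixels in the h-by-w box whose top-left corner is (r, c)."""
    return ii[r + h, c + w] - ii[r, c + w] - ii[r + h, c] + ii[r, c]

def two_rectangle_feature(ii, r, c, h, w):
    """Horizontal two-rectangle Haar-like feature: left half minus right half."""
    return box_sum(ii, r, c, h, w // 2) - box_sum(ii, r, c + w // 2, h, w // 2)
```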
14

Stigson, Magnus. "Object Tracking Using Tracking-Learning-Detection in Thermal Infrared Video." Thesis, Linköpings universitet, Datorseende, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-93936.

Abstract:
Automatic tracking of an object of interest in a video sequence is a task that has been much researched. Difficulties include varying scale of the object, rotation and object appearance changing over time, thus leading to tracking failures. Different tracking methods, such as short-term tracking often fail if the object steps out of the camera’s field of view, or changes shape rapidly. Also, small inaccuracies in the tracking method can accumulate over time, which can lead to tracking drift. Long-term tracking is also problematic, partly due to updating and degradation of the object model, leading to incorrectly classified and tracked objects. This master’s thesis implements a long-term tracking framework called Tracking-Learning-Detection which can learn and adapt, using so called P/N-learning, to changing object appearance over time, thus making it more robust to tracking failures. The framework consists of three parts; a tracking module which follows the object from frame to frame, a learning module that learns new appearances of the object, and a detection module which can detect learned appearances of the object and correct the tracking module if necessary. This tracking framework is evaluated on thermal infrared videos and the results are compared to the results obtained from videos captured within the visible spectrum. Several important differences between visual and thermal infrared tracking are presented, and the effect these have on the tracking performance is evaluated. In conclusion, the results are analyzed to evaluate which differences matter the most and how they affect tracking, and a number of different ways to improve the tracking are proposed.
15

Pepik, Bojan [Verfasser], and Bernt [Akademischer Betreuer] Schiele. "Richer object representations for object class detection in challenging real world images / Bojan Pepik. Betreuer: Bernt Schiele." Saarbrücken : Saarländische Universitäts- und Landesbibliothek, 2016. http://d-nb.info/1081935022/34.

16

Schrider, Christina Da-Wann. "Histogram-based template matching object detection in images with varying brightness and contrast." Wright State University / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=wright1224044521.

17

Ridge, Douglas John. "Imaging for small object detection." Thesis, Queen's University Belfast, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.295423.

18

Tang, Jiayu. "Automatic image annotation and object detection." Thesis, University of Southampton, 2008. https://eprints.soton.ac.uk/265835/.

Abstract:
We live in the midst of the information era, during which organising and indexing information more effectively is a matter of essential importance. With the fast development of digital imagery, how to search images - a rich form of information - more efficiently by their content has become one of the biggest challenges. Content-based image retrieval (CBIR) has been the traditional and dominant technique for searching images for decades. However, not until recently have researchers started to realise some vital problems existing in CBIR systems. One of the most important is perhaps what people call the semantic gap, which refers to the gap between the information that can be extracted from images and the interpretation of the images by humans. As an attempt to bridge the semantic gap, automatic image annotation has been gaining more and more attention in recent years. This thesis aims to explore a number of different approaches to automatic image annotation and some related issues. It begins with an introduction to different techniques for image description, which forms the foundation of the research on image auto-annotation. The thesis then goes on to give an in-depth examination of some of the quality issues of the data-set used for evaluating auto-annotation systems. A series of approaches to auto-annotation are presented in the follow-up chapters. Firstly, we describe an approach that incorporates a saliency-based image representation into a statistical model for better annotation performance. Secondly, we explore the use of non-negative matrix factorisation (NMF), a matrix decomposition technique, for two tasks: object class detection and automatic annotation of images. The results imply that NMF is a promising sub-space technique for these purposes. Finally, we propose a model named the image based feature space (IBFS) model for linking image regions and keywords, and for image auto-annotation. Both image regions and keywords are mapped into the same space, in which their relationships can be measured. The idea of multiple segmentations is then implemented in the model, and better results are achieved than using a single segmentation.
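For readers unfamiliar with NMF, a bare-bones sketch of the factorisation mentioned above (Lee-Seung multiplicative updates); the rank, iteration count and initialisation are arbitrary assumptions and this is not the thesis's implementation.

```python
import numpy as np

def nmf(V, rank=10, n_iter=200, eps=1e-9):
    """Factor a non-negative matrix V (features x images) as W @ H,
    keeping W and H non-negative via multiplicative updates."""
    m, n = V.shape
    rng = np.random.default_rng(0)
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H  # columns of W act as parts/topics shared across images
```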
19

Kessi, Louisa. "Unsupervised detection based on spatial relationships : Application for object detection and recognition of colored business document structures." Thesis, Lyon, 2018. http://www.theses.fr/2018LYSEI068.

Abstract:
This thesis aims to develop a system for recognizing the logical structure of business documents without a model. The task is to recognize the logical function of text blocks that are important to locate and identify. This problem is identical to that of object detection in a natural scene, since the objects must be both recognized and localized in an image. Unlike object recognition, business documents must be interpreted without any prior information about their structure models. The only solution is to develop an unsupervised approach based mainly on spatial relationships and on textual and image information. Business documents have very heterogeneous contents and forms, since every company and every administration creates its own form or its own invoice templates. We make the hypothesis that any logical document structure is made up of pieces of micro-structures already observed in other documents. The same reasoning applies to object detection in natural images: any particular object model in a scene is composed of pieces of elements already seen in other instances of objects of the same class, linked together by spatial relations that have already been observed. Our model is therefore based on part-by-part recognition and on the accumulation of evidence in the parametric and spatial space. Our solution was tested on object detection in natural scenes and on recognition of the logical structure of business documents. The good performance obtained validates the initial hypotheses. This work also contains new methods for processing and analysing colour images of documents and natural images.
This digital revolution introduces new services and new usages in numerous domains. The advent of document digitization and the automation of document processing constitutes a great cultural and economic revolution. In this context, computer vision provides numerous applications and impacts our daily lives and businesses. Behind computer-vision technology, fundamental concepts, methodologies, and algorithms have been developed worldwide over the last fifty years. Today, computer-vision technologies have reached maturity and become a reality in many domains. Computer-vision systems reach high performance thanks to large amounts of data and the increasing performance of hardware. Despite the success of computer-vision applications, however, numerous other applications require more research, new methodologies, and novel algorithms. Among the difficult problems encountered in the computer-vision domain, detection remains a challenging task. Detection consists of localizing and recognizing an object in an image. This problem is far more difficult than the problem of recognition alone. Among the numerous applications based on detection, object detection in a natural scene is the most popular application in the computer-vision community. This work is about the detection task and its applications.
20

Thakkar, Chintan. "Ventricle slice detection in MRI images using Hough Transform and Object Matching techniques." [Tampa, Fla] : University of South Florida, 2006. http://purl.fcla.edu/usf/dc/et/SFE0001815.

21

Li, Guannan. "Locality sensitive modelling approach for object detection, tracking and segmentation in biomedical images." Thesis, University of Warwick, 2016. http://wrap.warwick.ac.uk/81399/.

Abstract:
Biomedical imaging techniques play an important role in the visualisation of, e.g., biological structures, tissues, diseases and medical conditions at the cellular level. The techniques bring us enormous image datasets for studying biological processes, clinical diagnosis and medical analysis. Thanks to recent advances in computer technology and hardware, automatic analysis of biomedical images becomes more feasible and popular. Although computer scientists have made a great effort in developing advanced image processing algorithms, many problems regarding object analysis still remain unsolved due to the diversity of biomedical imaging. In this thesis, we focus on developing object analysis solutions for two entirely different biomedical image types: fluorescence microscopy sequences and endometrial histology images. In fluorescence microscopy, our task is to track massive numbers of fluorescent spots with similar appearances and complicated motion patterns in noisy environments over hundreds of frames. In endometrial histology, we are challenged by detecting different types of cells with similar appearance in terms of colour and morphology. The proposed solutions utilise several novel locality sensitive models which can extract spatial or/and temporal relational features of the objects, i.e., local neighbouring objects exhibiting certain structures or patterns, for overcoming the difficulties of object analysis in fluorescence microscopy and endometrial histology.
22

Thörnberg, Jesper. "Combining RGB and Depth Images for Robust Object Detection using Convolutional Neural Networks." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-174137.

Abstract:
We investigated the advantage of combining RGB images with depth data to get more robust object classifications and detections using pre-trained deep convolutional neural networks. We relied upon the raw images from publicly available datasets captured using Microsoft Kinect cameras. The raw images varied in size, and therefore required resizing to fit our network. We designed a resizing method called "bleeding edge" to avoid distorting the objects in the images. We present a novel method of interpolating the missing depth pixel values by comparing them to similar RGB values. This method proved superior to the other methods tested. We showed that a simple colormap transformation of the depth image can provide close to state-of-the-art performance. Using our methods, we present state-of-the-art performance on the Washington Object dataset, and we provide some results on the Washington Scenes (V1) dataset. Specifically, for the detection, we used contours at different thresholds to find the likely object locations in the images. For the classification task we report state-of-the-art results using only RGB and RGB-D images; depth data alone gave close to state-of-the-art results. For the detection task we found the RGB-only detector to be superior to the other detectors.
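One plausible reading of the RGB-guided interpolation of missing depth values is sketched below; the neighbourhood radius, the Gaussian similarity weighting and the assumption that holes are stored as zeros are illustrative choices, not the thesis's exact method.

```python
import numpy as np

def fill_depth_holes(depth, rgb, radius=5, sigma_rgb=10.0):
    """Fill zero-valued depth pixels with a weighted average of nearby valid
    depths, weighting each neighbour by its RGB similarity to the hole pixel."""
    filled = depth.astype(np.float32).copy()
    H, W = depth.shape
    for y, x in np.argwhere(depth == 0):
        y0, y1 = max(0, y - radius), min(H, y + radius + 1)
        x0, x1 = max(0, x - radius), min(W, x + radius + 1)
        d = depth[y0:y1, x0:x1].astype(np.float32)
        diff = rgb[y0:y1, x0:x1].astype(np.float32) - rgb[y, x].astype(np.float32)
        w = np.exp(-np.sum(diff ** 2, axis=2) / (2 * sigma_rgb ** 2)) * (d > 0)
        if w.sum() > 0:
            filled[y, x] = (w * d).sum() / w.sum()
    return filled
```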
23

Liu, Wenye III. "Automatic Detection of Elongated Objects in X-Ray Images of Luggage." Thesis, Virginia Tech, 1997. http://hdl.handle.net/10919/37033.

Abstract:
This thesis presents a part of the research work at Virginia Tech on developing a prototype automatic luggage scanner for explosive detection, and it deals with the automatic detection of elongated objects (detonators) in x-ray images using matched filtering, the Hough transform, and information fusion techniques. A sophisticated algorithm has been developed for detonator detection in x-ray images, and computer software utilizing this algorithm was programmed to implement the detection on both UNIX and PC platforms. A variety of template matching techniques were evaluated, and the filtering parameters (template size, template model, thresholding value, etc.) were optimized. A variation of matched filtering was found to be reasonably effective, while a Gabor-filtering method was found not to be suitable for this problem. The developed software for both single orientations and multiple orientations was tested on x-ray images generated on AS&E and Fiscan inspection systems, and was found to work well for a variety of images. The effects of object overlapping, luggage position on the conveyor, and detonator orientation variation were also investigated using the single-orientation algorithm. It was found that the effectiveness of the software depended on the extent of overlapping as well as on the objects the detonator overlapped. The software was found to work well regardless of the position of the luggage bag on the conveyor, and it was able to tolerate a moderate amount of orientation change.
Master of Science
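A compact sketch of matched filtering at several orientations, in the spirit of the approach described above; the template, the angle step and the correlation settings are assumptions, and the Hough-transform and information-fusion stages are omitted.

```python
import numpy as np
from scipy.ndimage import rotate
from scipy.signal import correlate2d

def multi_orientation_match(image, template, angles=range(0, 180, 15)):
    """Correlate a zero-mean elongated template with the image at several
    orientations and keep, per pixel, the strongest response over all angles."""
    img = image.astype(np.float64) - image.mean()
    best = np.full(img.shape, -np.inf)
    for a in angles:
        tpl = rotate(template.astype(np.float64), a, reshape=True)
        tpl -= tpl.mean()
        resp = correlate2d(img, tpl, mode="same", boundary="symm")
        best = np.maximum(best, resp)
    return best  # threshold this map to obtain candidate elongated-object locations
```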
24

Maryan, Corey C. "Detecting Rip Currents from Images." ScholarWorks@UNO, 2018. https://scholarworks.uno.edu/td/2473.

Abstract:
Rip current images are useful for assisting in climate studies but time-consuming to annotate by hand over thousands of images. Object detection is a possible solution for automatic annotation because of its success and popularity in identifying regions of interest in images, such as human faces. Similarly to faces, rip currents have distinct features that set them apart from other areas of an image, such as the more generic patterns of the surf zone. There are many distinct methods of object detection applied in face detection research. In this thesis, the best fit for a rip current object detector is found by comparing these methods. In addition, the methods are improved with Haar features created exclusively for rip current images. The compared methods include max distance from the average, support vector machines, convolutional neural networks, the Viola-Jones object detector, and a meta-learner. The presented results are compared for accuracy, false positive rate, and detection rate. Viola-Jones has the top baseline performance, achieving a detection rate of 0.88 and identifying only 15 false positives in the test image set of 53 rip currents. The described meta-learner integrates the presented Haar features, which are developed in accordance with the original Viola-Jones algorithm. AdaBoost, a feature ranking algorithm, shows that the newly presented Haar features extract more meaningful data from rip current images than some of the current features. The meta-classifier improves upon the stand-alone Viola-Jones when applying these features by reducing its false positives by 47% while retaining a similar computational cost and detection rate.
25

Yousif, Osama. "Urban Change Detection Using Multitemporal SAR Images." Doctoral thesis, KTH, Geoinformatik, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-168216.

Abstract:
Multitemporal SAR images have been increasingly used for the detection of different types of environmental changes. The detection of urban changes using SAR images is complicated due to the complex mixture of the urban environment and the special characteristics of SAR images, for example, the existence of speckle. This thesis investigates urban change detection using multitemporal SAR images with the following specific objectives: (1) to investigate unsupervised change detection, (2) to investigate effective methods for reduction of the speckle effect in change detection, (3) to investigate spatio-contextual change detection, (4) to investigate object-based unsupervised change detection, and (5) to investigate a new technique for object-based change image generation. Beijing and Shanghai, the largest cities in China, were selected as study areas. Multitemporal SAR images acquired by ERS-2 SAR and ENVISAT ASAR sensors were used for pixel-based change detection. For the object-based approaches, TerraSAR-X images were used. In Paper I, the unsupervised detection of urban change was investigated using the Kittler-Illingworth algorithm. A modified ratio operator that combines positive and negative changes was used to construct the change image. Four density function models were tested and compared. Among them, the log-normal and Nakagami ratio models achieved the best results. Despite the good performance of the algorithm, the obtained results generally suffer from the loss of fine geometric detail. This was a consequence of the use of local adaptive filters for speckle suppression. Paper II addresses this problem using the nonlocal means (NLM) denoising algorithm for speckle suppression and detail preservation. In this algorithm, denoising was achieved through a moving weighted average. The weights are a function of the similarity of small image patches defined around each pixel in the image. To decrease the computational complexity, principal component analysis (PCA) was used to reduce the dimensionality of the neighbourhood feature vectors. Simple methods were proposed to estimate the number of significant PCA components to be retained for weight computation and the required noise variance. The experimental results showed that the NLM algorithm successfully suppressed speckle effects, while preserving fine geometric detail in the scene. The analysis also indicates that filtering the change image instead of the individual SAR images was effective in terms of the quality of the results and the time needed to carry out the computation. The Markov random field (MRF) change detection algorithm showed limited capacity to simultaneously maintain fine geometric detail in urban areas and combat the effect of speckle. To overcome this problem, Paper III utilizes the NLM theory to define a nonlocal constraint on pixel class labels. The iterated conditional mode (ICM) scheme for the optimization of the MRF criterion function is extended to include a new step that maximizes the nonlocal probability model. Compared with the traditional MRF algorithm, the experimental results showed that the proposed algorithm was superior in preserving fine structural detail, effective in reducing the effect of speckle, less sensitive to the value of the contextual parameter, and less affected by the quality of the initial change map. Paper IV investigates object-based unsupervised change detection using very high resolution TerraSAR-X images over urban areas. Three algorithms, i.e., Kittler-Illingworth, Otsu, and outlier detection, were tested and compared. The multitemporal images were segmented using a multidate segmentation strategy. The analysis reveals that the three algorithms achieved similar accuracies. The achieved accuracies were very close to the maximum possible, given the modified ratio image as an input. This maximum, however, was not very high. This was attributed, partially, to the low capacity of the modified ratio image to accentuate the difference between changed and unchanged areas. Consequently, Paper V proposes a new object-based change image generation technique. The strong intensity variations associated with high resolution and speckle effects render the object mean intensity an unreliable feature. The modified ratio image is, therefore, less efficient in emphasizing the contrast between the classes. An alternative representation of the change data was proposed. To measure the intensity of change at the object level in isolation from disturbances caused by strong intensity variations and speckle effects, two techniques based on the Fourier transform and the wavelet transform of the change signal were developed. Qualitative and quantitative analyses of the results show that improved change detection accuracies can be obtained by classifying the proposed change variables.
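The ratio-image idea that runs through these papers can be illustrated in a few lines; the log-ratio operator and the simple mean-plus-k-sigma threshold below are generic stand-ins (the thesis uses a modified ratio operator and criteria such as Kittler-Illingworth), and the parameter k is an assumption.

```python
import numpy as np

def log_ratio_change_image(sar_t1, sar_t2, eps=1e-6):
    """Log-ratio operator between two co-registered SAR intensity images."""
    return np.abs(np.log((sar_t1 + eps) / (sar_t2 + eps)))

def threshold_change_map(change_image, k=2.0):
    """Global threshold (mean + k*std) separating changed from unchanged pixels."""
    t = change_image.mean() + k * change_image.std()
    return change_image > t

# change_map = threshold_change_map(log_ratio_change_image(sar_before, sar_after))
```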

26

Baris, Yuksel. "Automated Building Detection From Satellite Images By Using Shadow Information As An Object Invariant." Master's thesis, METU, 2012. http://etd.lib.metu.edu.tr/upload/12614909/index.pdf.

Abstract:
Apart from classical pattern recognition techniques applied for automated building detection in satellite images, a robust building detection methodology is proposed, where self-supervision data can be automatically extracted from the image by using shadow and its direction as an invariant for the building object. In this methodology, first the vegetation, water and shadow regions are detected from a given satellite image, and local directional fuzzy landscapes representing the existence of a building are generated from the shadow regions using the direction of illumination obtained from the image metadata. For each landscape, foreground (building) and background pixels are automatically determined and a bipartitioning is obtained using a graph-based algorithm, GrabCut. Finally, the local results are merged to obtain the final building detection result. Considering the performance evaluation results, this approach can be seen as a proof of concept that shadow is an invariant for a building object and that promising detection results can be obtained even when a single invariant for an object is used.
27

Flasseur, Olivier. "Object detection and characterization from faint signals in images : applications in astronomy and microscopy." Thesis, Lyon, 2019. http://www.theses.fr/2019LYSES042.

Abstract:
Detecting and characterizing objects in images with a low signal-to-noise ratio is a common problem in many fields such as astronomy or microscopy. In astronomy, the detection of exoplanets and their characterization by direct imaging from the Earth are very active research topics. A target star and its close environment (potentially hosting exoplanets) are observed on short exposures. In microscopy, in-line holography is a method of choice for characterizing microscopic objects at low cost. Based on the recording of a hologram, it allows digital focusing in any plane of the imaged 3-D volume. In these two target applications, the problem is made difficult by the low contrast between the objects and the non-stationary background of the recorded images. In this thesis, we propose an unsupervised algorithm dedicated to the detection and characterization of exoplanets through a statistical modelling of the background fluctuations. This method is based on modelling the statistical distribution of the data at a local scale of patches, thereby capturing their spatial covariances. Tested on several datasets from the high-contrast imager SPHERE operating at the European Very Large Telescope, this algorithm achieves better performance than state-of-the-art methods. In particular, the detection maps produced are stationary and statistically grounded. Exoplanet detection can thus be performed at a controlled probability of false alarm. The estimation of the spectral energy distribution of the detected sources is also unbiased. The use of a statistical model also makes it possible to derive reliable photometric and astrometric accuracies. This methodological framework is then adapted to the detection of spatially extended patterns, such as the diffraction patterns encountered in holographic microscopy, which are also dominated by a non-stationary background. We also propose robust approaches based on weighting strategies in order to reduce the influence of the numerous outliers present in real data. We show on holographic videos that the proposed weighting methods achieve a bias/variance trade-off. In astronomy, the robustness improves detection performance, in particular at short angular separations, where stellar leakage dominates. The developed algorithms are also adapted to take advantage of the spectral diversity of the data in addition to their temporal diversity, thereby improving their detection and characterization performance. All the developed algorithms are fully unsupervised: the weighting and/or regularization parameters are estimated directly from the data. Beyond the applications considered in astronomy and microscopy, the signal processing methods introduced in this thesis are general and could be applied to other detection and estimation problems.
Detecting and characterizing objects in images in the low signal-to-noise ratio regime is a critical issue in many areas such as astronomy or microscopy. In astronomy, the detection of exoplanets and their characterization by direct imaging from the Earth is a hot topic. A target star and its close environment (hosting potential exoplanets) are observed on short exposures. In microscopy, in-line holography is a cost-effective method for characterizing microscopic objects. Based on the recording of a hologram, it allows a digital focusing in any plane of the imaged 3-D volume. In these two fields, the object detection problem is made difficult by the low contrast between the objects and the nonstationary background of the recorded images. In this thesis, we propose an unsupervised exoplanet detection and characterization algorithm based on the statistical modeling of background fluctuations. The method, based on a modeling of the statistical distribution of patches, captures their spatial covariances. It reaches a performance superior to state-of-the-art techniques on several datasets of the European high-contrast imager SPHERE operating at the Very Large Telescope. It produces statistically grounded and spatially-stationary detection maps in which detections can be performed at a constant probability of false alarm. It also produces photometrically unbiased spectral energy distributions of the detected sources. The use of a statistical model of the data leads to reliable photometric and astrometric accuracies. This methodological framework can be adapted to the detection of spatially-extended patterns in strong structured background, such as the diffraction patterns in holographic microscopy. We also propose robust approaches based on weighting strategies to reduce the influence of the numerous outliers present in real data. We show on holographic videos that the proposed weighting approach achieves a bias/variance tradeoff. In astronomy, the robustness improves the performance of our detection method in particular at close separations where the stellar residuals dominate. Our algorithms are adapted to benefit from the possible spectral diversity of the data, which improves the detection and characterization performance. All the algorithms developed are unsupervised: weighting and/or regularization parameters are estimated in a data-driven fashion. Beyond the applications in astronomy and microscopy, the signal processing methodologies introduced are general and could be applied to other detection and estimation problems.
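The patch-level statistical test described above can be caricatured as scoring each patch by a Mahalanobis distance to background statistics learned over the temporal stack; the sketch below is a heavily simplified illustration of that idea (patch size, regularisation and the non-overlapping grid are assumptions), not the algorithm developed in the thesis.

```python
import numpy as np

def patch_detection_map(frames, patch=4, reg=1e-6):
    """Score non-overlapping patches of a temporal image stack (T, H, W) by the mean
    Mahalanobis distance of their samples to the local background mean/covariance."""
    T, H, W = frames.shape
    scores = np.zeros((H // patch, W // patch))
    for i in range(H // patch):
        for j in range(W // patch):
            block = frames[:, i*patch:(i+1)*patch, j*patch:(j+1)*patch].reshape(T, -1)
            mu = block.mean(axis=0)
            cov = np.cov(block, rowvar=False) + reg * np.eye(block.shape[1])
            d = block - mu
            scores[i, j] = np.einsum("ti,ij,tj->t", d, np.linalg.inv(cov), d).mean()
    return scores  # unusually high scores flag patches poorly explained by the background model
```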
28

Aytar, Yusuf. "Transfer learning for object category detection." Thesis, University of Oxford, 2014. http://ora.ox.ac.uk/objects/uuid:c9e18ff9-df43-4f67-b8ac-28c3fdfa584b.

Abstract:
Object category detection, the task of determining if one or more instances of a category are present in an image with their corresponding locations, is one of the fundamental problems of computer vision. The task is very challenging because of the large variations in imaged object appearance, particularly due to the changes in viewpoint, illumination and intra-class variance. Although successful solutions exist for learning object category detectors, they require massive amounts of training data. Transfer learning builds upon previously acquired knowledge and thus reduces training requirements. The objective of this work is to develop and apply novel transfer learning techniques specific to the object category detection problem. This thesis proposes methods which not only address the challenges of performing transfer learning for object category detection such as finding relevant sources for transfer, handling aspect ratio mismatches and considering the geometric relations between the features; but also enable large scale object category detection by quickly learning from considerably fewer training samples and immediate evaluation of models on web scale data with the help of part-based indexing. Several novel transfer models are introduced such as: (a) rigid transfer for transferring knowledge between similar classes, (b) deformable transfer which tolerates small structural changes by deforming the source detector while performing the transfer, and (c) part level transfer particularly for the cases where full template transfer is not possible due to aspect ratio mismatches or not having adequately similar sources. Building upon the idea of using part-level transfer, instead of performing an exhaustive sliding window search, part-based indexing is proposed for efficient evaluation of templates enabling us to obtain immediate detection results in large scale image collections. Furthermore, easier and more robust optimization methods are developed with the help of feature maps defined between proposed transfer learning formulations and the “classical” SVM formulation.
29

Andersson, Daniel. "Automatic vertebrae detection and labeling in sagittal magnetic resonance images." Thesis, Linköpings universitet, Medicinsk informatik, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-115874.

Abstract:
Radiologists are often plagued by limited time for completing their work, with an ever-increasing workload. A picture archiving and communication system (PACS) is a platform for daily image reviewing that improves their work environment, and on that platform, for example, spinal MR images can be reviewed. When reviewing spinal images a radiologist wants vertebrae labels, and in Sectra's PACS platform there is a good opportunity for implementing an automatic method for spinal labeling. In this thesis a method for performing automatic spinal labeling, called a vertebrae classifier, is presented. This method should remove the need for radiologists to perform manual spine labeling, and could be implemented in Sectra's PACS software to improve radiologists' overall work experience. Spine labeling is the process of marking vertebra centres with a name on a spinal image. The method proposed in this thesis for performing that process was developed using a machine learning approach for vertebrae detection in sagittal MR images. The developed classifier works for both the lumbar and the cervical spine, but it is optimized for the lumbar spine. During the development, three different methods for vertebrae detection were evaluated. Detection is done on multiple sagittal slices. The output from the detection is then labeled using a pictorial-structures-based algorithm which uses a trained model of the spine to assess the correct labeling. The suggested method achieves 99.6% recall and 99.9% precision for the lumbar spine. The cervical spine achieves slightly worse performance, with 98.1% for both recall and precision. This result was achieved by training the proposed method on 43 images and validating it with 89 images for the lumbar spine. The cervical spine was validated using 26 images. These results are promising, especially for the lumbar spine. However, further evaluation is needed to test the method in a clinical setting.
APA, Harvard, Vancouver, ISO, and other styles
30

Gandhi, Tarak L. "Image sequence analysis for object detection and segmentation." Adobe Acrobat reader required to view the full dissertation, 2000. http://www.etda.libraries.psu.edu/theses/approved/PSUonlyIndex/ETD-18/index.html.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Silva, Filho José Grimaldo da. "Multiscale spectral residue for faster image object detection." reponame:Repositório Institucional da UFBA, 2013. http://www.repositorio.ufba.br/ri/handle/ri/13203.

Full text
Abstract:
Accuracy in image object detection has been usually achieved at the expense of much computational load. Therefore a trade-off between detection performance and fast execution commonly represents the ultimate goal of an object detector in real life applications. Most images are composed of non-trivial amounts of background information, such as sky, ground and water. In this sense, using an object detector against a recurring background pattern can require a significant amount of the total processing time. To alleviate this problem, search space reduction methods can help focus the detection procedure on more distinctive image regions. Among the several approaches for search space reduction, we explored saliency information to organize regions based on their probability of containing objects. Saliency detectors are capable of pinpointing regions which generate stronger visual stimuli based solely on information extracted from the image. The fact that saliency methods do not require prior training is an important benefit, which allows application of these techniques in a broad range of machine vision domains. We propose a novel method toward the goal of faster object detectors. The proposed method was grounded on a multi-scale spectral residue (MSR) analysis using saliency detection. For better search space reduction, our method enables fine control of search scale, more robustness to variations on saliency intensity along an object length and also a direct way to control the balance between search space reduction and false negatives caused by region selection. Compared to a regular sliding window search over the images, in our experiments, MSR was able to reduce by 75% (on average) the number of windows to be evaluated by an object detector while improving or at least maintaining detector ROC performance. The proposed method was thoroughly evaluated over a subset of the LabelMe dataset (person images), improving detection performance in most cases. This evaluation was done comparing object detection performance against different object detectors, with and without MSR. Additionally, we also provide an evaluation of how different object classes interact with MSR, which was done using the Pascal VOC 2007 dataset. Finally, tests showed that the window selection performance of MSR has good scalability with regard to image size. From the obtained data, our conclusion is that MSR can provide substantial benefits to existing sliding window detectors.
Salvador
APA, Harvard, Vancouver, ISO, and other styles
32

Silva, Filho Jose Grimaldo da. "Multiscale Spectral Residue for Faster Image Object Detection." Escola Politécnica / Instituto de Matemática, 2013. http://repositorio.ufba.br/ri/handle/ri/21340.

Full text
Abstract:
Accuracy in image object detection has been usually achieved at the expense of much computational load. Therefore a trade-off between detection performance and fast execution commonly represents the ultimate goal of an object detector in real life applications. Most images are composed of non-trivial amounts of background information, such as sky, ground and water. In this sense, using an object detector against a recurring background pattern can require a significant amount of the total processing time. To alleviate this problem, search space reduction methods can help focus the detection procedure on more distinctive image regions.
Among the several approaches for search space reduction, we explored saliency information to organize regions based on their probability of containing objects. Saliency detectors are capable of pinpointing regions which generate stronger visual stimuli based solely on information extracted from the image. The fact that saliency methods do not require prior training is an important benefit, which allows application of these techniques in a broad range of machine vision domains. We propose a novel method toward the goal of faster object detectors. The proposed method was grounded on a multi-scale spectral residue (MSR) analysis using saliency detection. For better search space reduction, our method enables fine control of search scale, more robustness to variations on saliency intensity along an object length and also a direct way to control the balance between search space reduction and false negatives caused by region selection. Compared to a regular sliding window search over the images, in our experiments, MSR was able to reduce by 75% (on average) the number of windows to be evaluated by an object detector while improving or at least maintaining detector ROC performance. The proposed method was thoroughly evaluated over a subset of the LabelMe dataset (person images), improving detection performance in most cases. This evaluation was done comparing object detection performance against different object detectors, with and without MSR. Additionally, we also provide an evaluation of how different object classes interact with MSR, which was done using the Pascal VOC 2007 dataset. Finally, tests showed that the window selection performance of MSR has good scalability with regard to image size. From the obtained data, our conclusion is that MSR can provide substantial benefits to existing sliding window detectors.
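The MSR approach described above builds on the spectral residue saliency of Hou and Zhang. The sketch below illustrates that underlying single-scale saliency map, with a comment showing how a multi-scale combination could be assembled; scales, window sizes and smoothing are illustrative guesses, not the settings used in the dissertation.

import cv2
import numpy as np

def spectral_residual_saliency(gray, width=64):
    """Single-scale spectral residual saliency map (Hou & Zhang, 2007)."""
    small = cv2.resize(gray, (width, width)).astype(np.float32)
    f = np.fft.fft2(small)
    log_amp = np.log(np.abs(f) + 1e-8)
    phase = np.angle(f)
    # spectral residue = log amplitude minus its local average
    residual = log_amp - cv2.blur(log_amp, (3, 3))
    sal = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    sal = cv2.GaussianBlur(sal.astype(np.float32), (9, 9), 2.5)
    return cv2.resize(sal / sal.max(), (gray.shape[1], gray.shape[0]))

# Multi-scale variant in the spirit of MSR: run at several widths and combine.
# gray = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
# maps = [spectral_residual_saliency(gray, w) for w in (32, 64, 128)]
# combined = np.maximum.reduce(maps)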
APA, Harvard, Vancouver, ISO, and other styles
33

Luo, Yuanqing. "Moving Object Detection based on Background Modeling." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-230267.

Full text
Abstract:
Aiming at moving object detection, after studying several categories of background modeling methods, we design an improved ViBe algorithm based on an image segmentation algorithm. The ViBe algorithm builds its background model by storing a sample set for each pixel. In order to detect moving objects, it uses several techniques such as fast initialization, random update and classification based on the distance between a pixel value and its sample set. In our improved algorithm, we first use multi-layer histograms to extract moving objects at block level in a preprocessing stage. Second, we segment the blocks of moving objects with an image segmentation algorithm. The algorithm then constructs region-level information for the moving objects and designs classification principles for regions as well as a modification mechanism among neighboring regions. In addition, to address the problem that the original ViBe algorithm easily introduces ghost regions into the background model, the improved algorithm designs and implements a fast ghost elimination algorithm. Compared with traditional pixel-level background modeling methods, the improved method is more robust and reliable against factors such as background disturbance, noise and the presence of moving objects in the initialization stage. Specifically, our algorithm improves the precision rate from 83.17% in the original ViBe algorithm to 95.35%, and the recall rate from 81.48% to 90.25%. Considering the effect of shadows on moving object detection, this paper designs a shadow elimination algorithm based on the Red, Green and Illumination (RGI) color feature, which can be converted from the RGB color space, and a dynamic matching threshold. Experimental results demonstrate that the algorithm can effectively reduce the influence of shadows on moving object detection. Finally, the paper concludes the work of this thesis and discusses future work.
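ViBe's pixel-level core, a per-pixel sample set, a radius-based match count and a conservative random update, can be sketched as follows. The block-level histograms, region reasoning, ghost elimination and shadow handling described above are not shown, and all parameter values are illustrative.

import numpy as np

def vibe_classify(frame, samples, radius=20, min_matches=2, subsample=16, rng=None):
    """One ViBe classification/update step on a grayscale frame (vectorized sketch).

    samples: (N, H, W) background sample set per pixel (uint8).
    Returns a boolean foreground mask and updates `samples` in place.
    """
    rng = rng or np.random.default_rng()
    dist = np.abs(samples.astype(np.int16) - frame.astype(np.int16))
    matches = (dist < radius).sum(axis=0)
    foreground = matches < min_matches
    # conservative random update: background pixels occasionally refresh one sample
    update = (~foreground) & (rng.integers(0, subsample, frame.shape) == 0)
    idx = rng.integers(0, samples.shape[0], frame.shape)
    ys, xs = np.nonzero(update)
    samples[idx[ys, xs], ys, xs] = frame[ys, xs]
    return foreground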
APA, Harvard, Vancouver, ISO, and other styles
34

Dickens, James. "Depth-Aware Deep Learning Networks for Object Detection and Image Segmentation." Thesis, Université d'Ottawa / University of Ottawa, 2021. http://hdl.handle.net/10393/42619.

Full text
Abstract:
The rise of convolutional neural networks (CNNs) in the context of computer vision has occurred in tandem with the advancement of depth sensing technology. Depth cameras are capable of yielding two-dimensional arrays storing, at each pixel, the distance of objects and surfaces in a scene from a given sensor, aligned with a regular color image, to obtain so-called RGBD images. Inspired by prior models in the literature, this work develops a suite of RGBD CNN models to tackle the challenging tasks of object detection, instance segmentation, and semantic segmentation. Prominent architectures for object detection and image segmentation are modified to incorporate dual-backbone approaches that take RGB and depth images as input, combining features from both modalities through the use of novel fusion modules. For each task, the models developed are competitive with state-of-the-art RGBD architectures. In particular, the proposed RGBD object detection approach achieves 53.5% mAP on the SUN RGBD 19-class object detection benchmark, while the proposed RGBD semantic segmentation architecture yields 69.4% accuracy with respect to the SUN RGBD 37-class semantic segmentation benchmark. An original 13-class RGBD instance segmentation benchmark is introduced for the SUN RGBD dataset, for which the proposed model achieves 38.4% mAP. Additionally, an original depth-aware panoptic segmentation model is developed, trained, and tested for new benchmarks conceived for the NYUDv2 and SUN RGBD datasets. These benchmarks offer researchers a baseline for the task of RGBD panoptic segmentation on these datasets, where the novel depth-aware model outperforms a comparable RGB counterpart.
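The fusion modules in this work are novel and not specified in the abstract; as a generic picture of the dual-backbone idea, the sketch below simply concatenates RGB and depth feature maps and mixes them with a 1x1 convolution (PyTorch, with illustrative names and channel sizes).

import torch
import torch.nn as nn

class ConcatFusion(nn.Module):
    """Fuse RGB and depth feature maps from two backbones (illustrative).

    Channel-concatenates the two modalities and mixes them with a 1x1
    convolution so downstream detection/segmentation heads see a single map.
    """
    def __init__(self, rgb_ch, depth_ch, out_ch):
        super().__init__()
        self.mix = nn.Sequential(
            nn.Conv2d(rgb_ch + depth_ch, out_ch, kernel_size=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, rgb_feat, depth_feat):
        return self.mix(torch.cat([rgb_feat, depth_feat], dim=1))

# Toy usage with feature maps of matching spatial size.
fusion = ConcatFusion(rgb_ch=256, depth_ch=256, out_ch=256)
fused = fusion(torch.randn(1, 256, 64, 64), torch.randn(1, 256, 64, 64))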
APA, Harvard, Vancouver, ISO, and other styles
35

Fasth, Niklas, and Rasmus Hallblad. "Air Reconnaissance Analysis using Convolutional Neural Network-based Object Detection." Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-48422.

Full text
Abstract:
The Swedish armed forces use the Single Source Intelligent Cell (SSIC), developed by Saab, for analysis of aerial reconnaissance video and report generation. The analysis can be time-consuming and demanding for a human operator. In the analysis workflow, identifying vehicles is an important part of the work. Artificial Intelligence is widely used for analysis in many industries to aid or replace a human worker. In this paper, the possibility of aiding the human operator with air reconnaissance data analysis is investigated, specifically object detection for finding cars in aerial images. Many state-of-the-art object detection models for vehicle detection in aerial images are based on a Convolutional Neural Network (CNN) architecture. Faster R-CNN- and SSD-based models both follow this architecture and are implemented here. Comprehensive experiments are conducted using the models on two different datasets, the open Video Verification of Identity (VIVID) dataset and a confidential dataset provided by Saab. The datasets are similar, both consisting of aerial images with vehicles. The initial experiments are conducted to find suitable configurations for the proposed models. Finally, an experiment is conducted to compare the performance of a human operator and a machine. The results of this work show that object detection can support air reconnaissance image analysis, particularly with respect to inference time. The current performance of the object detectors makes them most suitable for applications where speed is more important than accuracy.
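Comparisons of this kind usually reduce to IoU-based matching of predicted boxes against ground truth. The sketch below shows one common greedy matching scheme for computing precision and recall; the 0.5 threshold and the matching policy are ordinary defaults, not necessarily the exact protocol used in the thesis.

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def precision_recall(detections, ground_truth, thr=0.5):
    """Greedy one-to-one matching of score-sorted detections to ground-truth boxes."""
    detections = sorted(detections, key=lambda d: -d["score"])
    matched, tp = set(), 0
    for det in detections:
        ious = [(iou(det["box"], gt), i)
                for i, gt in enumerate(ground_truth) if i not in matched]
        if ious and max(ious)[0] >= thr:
            matched.add(max(ious)[1])
            tp += 1
    fp = len(detections) - tp
    fn = len(ground_truth) - tp
    return tp / max(tp + fp, 1), tp / max(tp + fn, 1)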
APA, Harvard, Vancouver, ISO, and other styles
36

Ribeiro, Bruno Miguel Marques. "Object detection in robotics using morphological information." Master's thesis, Universidade de Aveiro, 2009. http://hdl.handle.net/10773/2129.

Full text
Abstract:
Master's degree in Electronics and Telecommunications Engineering
One of the most important steps in image processing systems is the detection of objects of interest. However, object detection is a challenging task. Given an arbitrary image and assuming that we are interested in locating a particular object, the goal of object detection is to determine whether or not there is any object of interest. This thesis is set in the RoboCup domain and is focused on the development of algorithms for the detection of arbitrary FIFA balls, an important object for soccer robots. To achieve the main objective, we developed three algorithms to detect arbitrary soccer balls using morphological information given by the Canny edge detector and the Hough transform. First, we developed an approach implementing a specific algorithm based on the circular Hough transform, applied after segmentation of the acquired image. Second, we implemented an algorithm that uses an OpenCV library function dedicated to finding circles in images. Finally, the first two algorithms were combined to create a new approach in which both are used. Experimental results are presented, showing that the developed algorithms are accurate and capable of reliable ball detection in real-time situations.
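The OpenCV route mentioned above corresponds to cv2.HoughCircles with the gradient method, which applies Canny edge detection internally. A minimal sketch, with an assumed input file name and illustrative parameter values:

import cv2
import numpy as np

frame = cv2.imread("camera_frame.png")                     # hypothetical input frame
gray = cv2.medianBlur(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), 5)

# HOUGH_GRADIENT runs Canny internally; param1 is its upper edge threshold,
# param2 is the accumulator threshold for circle centres.
circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1.5, minDist=40,
                           param1=160, param2=40, minRadius=8, maxRadius=120)
if circles is not None:
    for x, y, r in np.round(circles[0]).astype(int):
        cv2.circle(frame, (x, y), r, (0, 255, 0), 2)       # draw detected balls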
APA, Harvard, Vancouver, ISO, and other styles
37

Wälivaara, Marcus. "General Object Detection Using Superpixel Preprocessing." Thesis, Linköpings universitet, Datorseende, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-140874.

Full text
Abstract:
The objective of this master’s thesis work is to evaluate the potential benefit of a superpixel preprocessing step for general object detection in a traffic environment. The various effects of different superpixel parameters on object detection performance, as well as the benefit of including depth information when generating the superpixels, are investigated. In this work, three superpixel algorithms are implemented and compared, including a proposal for an improved version of the popular Simple Linear Iterative Clustering (SLIC) superpixel algorithm. The proposed improved algorithm utilises a coarse-to-fine approach which outperforms the original SLIC for high-resolution images. An object detection algorithm is also implemented and evaluated. The algorithm makes use of depth information obtained by a stereo camera to extract superpixels corresponding to foreground objects in the image. Hierarchical clustering is then applied, with the segments formed by the clustered superpixels indicating potential objects in the input image. The object detection algorithm managed to detect on average 58% of the objects present in the chosen dataset. It performed especially well for detecting pedestrians or other objects close to the car. Altering the density distribution of the superpixels in the image yielded an increase in detection rate, and could be achieved with or without utilising depth information. It was also shown that the use of superpixels greatly reduces the amount of computation needed for the algorithm, indicating that a real-time implementation is feasible.
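As a rough stand-in for the superpixel preprocessing step (the thesis' own coarse-to-fine SLIC variant is not shown), scikit-image's SLIC implementation can be used; the file name, segment count and the mean-colour feature below are illustrative choices only.

import numpy as np
from skimage import io, segmentation

image = io.imread("traffic_scene.png")                     # hypothetical input frame
superpixels = segmentation.slic(image, n_segments=1200, compactness=10,
                                start_label=0)

# Mean colour per superpixel, a typical feature for later hierarchical clustering.
mean_colors = np.array([image[superpixels == s].mean(axis=0)
                        for s in range(superpixels.max() + 1)])

# Optional visual check of the oversegmentation.
boundaries = segmentation.mark_boundaries(image, superpixels)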
APA, Harvard, Vancouver, ISO, and other styles
38

Forssén, Per-Erik. "Detection of Man-made Objects in Satellite Images." Thesis, Linköping University, Linköping University, Computer Vision, 1997. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-54356.

Full text
Abstract:

In this report, the principles of man-made object detection in satellite images are investigated. An overview of terminology and of how the detection problem is usually solved today is given. A three-level system to solve the detection problem is proposed. The main branches of this system handle road and city detection, respectively. To achieve data source flexibility, the Logical Sensor notion is used to model the low-level system components. Three Logical Sensors have been implemented and tested on Landsat TM and SPOT XS scenes. These are: BDT (Background Discriminant Transformation) to construct a man-made object property field; local orientation for texture estimation and road tracking; and texture estimation using local variance and the variance of local orientation. A gradient magnitude measure for road seed generation has also been tested.
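One of the texture cues mentioned, local variance, can be computed efficiently with box filters as E[x^2] - E[x]^2. The sketch below is a generic version of that measure; window size and threshold are chosen only for illustration and are not the report's settings.

import numpy as np
from scipy import ndimage

def local_variance(band, size=9):
    """Local variance texture measure over a sliding window."""
    band = band.astype(np.float64)
    mean = ndimage.uniform_filter(band, size)
    mean_sq = ndimage.uniform_filter(band ** 2, size)
    return np.clip(mean_sq - mean ** 2, 0, None)

# High local variance is one cue for built-up (man-made) areas in a band:
# texture = local_variance(band)
# candidate_mask = texture > np.percentile(texture, 90)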

APA, Harvard, Vancouver, ISO, and other styles
39

Landin, Roman. "Object Detection with Deep Convolutional Neural Networks in Images with Various Lighting Conditions and Limited Resolution." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-300055.

Full text
Abstract:
Computer vision is a key component of any autonomous system. Real-world computer vision applications rely on proper and accurate detection and classification of objects. A detection algorithm that doesn't guarantee reasonable detection accuracy is not applicable in real-time scenarios where safety is the main objective. Factors that impact detection accuracy are illumination conditions and image resolution. Both contribute to degradation of objects and lead to low classification and detection accuracy. Recent development of Convolutional Neural Network (CNN) based algorithms offers possibilities for low-light (LL) image enhancement and super-resolution (SR) image generation, which makes it possible to combine such models in order to improve image quality and increase detection accuracy. This thesis evaluates different CNN models for SR generation and LL enhancement by comparing generated images against ground truth images. To quantify the impact of the respective model on detection accuracy, a detection procedure was evaluated on generated images. Experimental results evaluated on images selected from the NightOwls and Caltech Pedestrian datasets showed that super-resolution image generation and low-light image enhancement improve detection accuracy by a substantial margin. Additionally, it has been shown that a cascade of SR generation and LL enhancement further boosts detection accuracy. However, the main drawback of such cascades is related to an increased computational time, which limits possibilities for a range of real-time applications.
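The cascade idea, enhance the low-light frame first, then upscale it, then run the detector, can be outlined as below. CLAHE and bicubic interpolation are used purely as stand-ins for the learned LL-enhancement and SR networks evaluated in the thesis.

import cv2

def enhance_then_upscale(bgr, scale=2):
    """Stand-in for the LL-enhancement -> SR cascade (illustrative only).

    CLAHE on the lightness channel replaces the learned low-light model and
    bicubic resizing replaces the learned super-resolution model.
    """
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = cv2.cvtColor(cv2.merge((clahe.apply(l), a, b)), cv2.COLOR_LAB2BGR)
    return cv2.resize(enhanced, None, fx=scale, fy=scale,
                      interpolation=cv2.INTER_CUBIC)

# frame = cv2.imread("night_frame.png")
# detector_input = enhance_then_upscale(frame)   # then run the object detector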
APA, Harvard, Vancouver, ISO, and other styles
40

Andersson, Oskar, and Marquez Steffany Reyna. "A comparison of object detection algorithms using unmanipulated testing images : Comparing SIFT, KAZE, AKAZE and ORB." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-186503.

Full text
Abstract:
While the thought of having computers recognize objects in images has been around for a long time, it is only in the last 20 years that this has become a reality. One of the first successful recognition algorithms was called SIFT, and to this day it is one of the most used. However, in recent years new algorithms have been published claiming to outperform SIFT. It is the goal of this report to investigate if SIFT still is the top performer 17 years after its publication, or if the newest generation of algorithms is superior. By creating a new dataset of over 170 test images with categories such as scale, rotation, illumination and general detection, a thorough test has been run comparing four algorithms: SIFT, KAZE, AKAZE and ORB. The result of this study contradicts the claims from the creators of KAZE and shows that SIFT has a higher score on all tests. It also showed that AKAZE is at least as accurate as KAZE while being significantly faster. Another result was that while SIFT, KAZE and AKAZE were relatively evenly matched when comparing single invariances, that changed when performing tests that contained multiple variables. When testing detection in cluttered environments, SIFT proved vastly superior to the other algorithms. This led to the conclusion that if the goal is the best possible detection in everyday situations, SIFT is still the best algorithm.
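Setting up such a comparison is straightforward in OpenCV: construct each detector, match descriptors with the appropriate norm, and filter matches with Lowe's ratio test. The sketch below assumes a hypothetical image pair and OpenCV 4.4 or later (where SIFT is available without the contrib modules); the report's dataset and scoring are of course more elaborate.

import cv2
import numpy as np

detectors = {
    "SIFT": cv2.SIFT_create(),
    "KAZE": cv2.KAZE_create(),
    "AKAZE": cv2.AKAZE_create(),
    "ORB": cv2.ORB_create(nfeatures=2000),
}

img1 = cv2.imread("object.png", cv2.IMREAD_GRAYSCALE)      # hypothetical test pair
img2 = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)

for name, det in detectors.items():
    kp1, des1 = det.detectAndCompute(img1, None)
    kp2, des2 = det.detectAndCompute(img2, None)
    # binary descriptors (ORB, AKAZE) use Hamming distance, float ones use L2
    norm = cv2.NORM_HAMMING if des1.dtype == np.uint8 else cv2.NORM_L2
    matches = cv2.BFMatcher(norm).knnMatch(des1, des2, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]  # Lowe ratio test
    print(f"{name}: {len(good)} ratio-test matches")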
APA, Harvard, Vancouver, ISO, and other styles
41

Nyberg, Selma. "Video Recommendation Based on Object Detection." Thesis, Uppsala universitet, Avdelningen för systemteknik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-351122.

Full text
Abstract:
In this thesis, various machine learning domains have been combined in order to build a video recommender system that is based on object detection. The work combines two extensively studied research fields, recommender systems and computer vision, which are also rapidly growing and popular techniques in commercial markets. To investigate the performance of the approach, three different content-based recommender systems have been implemented at Spotify, based on the following video features: object detections, titles and descriptions, and user preferences. These systems have then been evaluated and compared against each other, together with their hybridized result. Two algorithms have been implemented, the prediction and the top-N algorithm, where the former is the more reliable source for evaluating the system's performance. The evaluation of the system shows that the overall performance scores for predicting values of the users' liked and disliked videos are in the range from about 40% to 70% for the prediction algorithm and from about 15% to 70% for the top-N algorithm. The approach based on object detection performs worse in comparison to the other approaches. Hence, there seems to be a low correlation between user preferences and video content in terms of object detection data. Therefore, this data is not very suitable for describing the content of videos and using it in the recommender system. However, the results of this study cannot be generalized to other systems before the approach has been evaluated in other environments and on various datasets. Moreover, there is plenty of room for refinements and improvements to the system, and there are many interesting research areas for future work.
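A content-based recommender over object-detection features can be as simple as describing each video by its detected-object counts, averaging the vectors of liked videos into a user profile, and scoring candidates by cosine similarity. The sketch below illustrates only that baseline; it is not the system implemented in the thesis.

import numpy as np

def predict_score(user_profile, video_vector):
    """Cosine similarity between a user profile and a video's object-count vector."""
    num = user_profile @ video_vector
    den = np.linalg.norm(user_profile) * np.linalg.norm(video_vector) + 1e-9
    return num / den

# Each video is described by counts of detected object classes (illustrative):
videos = {"v1": np.array([3, 0, 1]),     # e.g. [person, car, guitar]
          "v2": np.array([0, 4, 0]),
          "v3": np.array([1, 0, 5])}
liked = ["v1", "v3"]
profile = np.mean([videos[v] for v in liked], axis=0)   # profile from liked videos
ranking = sorted(videos, key=lambda v: -predict_score(profile, videos[v]))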
APA, Harvard, Vancouver, ISO, and other styles
42

Pont, Tuset Jordi. "Image segmentation evaluation and its application to object detection." Doctoral thesis, Universitat Politècnica de Catalunya, 2014. http://hdl.handle.net/10803/134354.

Full text
Abstract:
The first parts of this Thesis are focused on the study of the supervised evaluation of image segmentation algorithms. Supervised in the sense that the segmentation results are compared to a human-made annotation, known as ground truth, by means of different measures of similarity. The evaluation depends, therefore, on three main points. First, the image segmentation techniques we evaluate. We review the state of the art in image segmentation, making an explicit distinction between those techniques that provide a flat output, that is, a single clustering of the set of pixels into regions; and those that produce a hierarchical segmentation, that is, a tree-like structure that represents regions at different scales from the details to the whole image. Second, ground-truth databases are of paramount importance in the evaluation. They can be divided into those annotated only at object level, that is, with marked sets of pixels that refer to objects that do not cover the whole image; and those with annotated full partitions, which provide a full clustering of all pixels in an image. Depending on the type of database, we say that the analysis is done from an object perspective or from a partition perspective. Finally, the similarity measures used to compare the generated results to the ground truth are what will provide us with a quantitative tool to evaluate whether our results are good, and in which way they can be improved. The main contributions of the first parts of the thesis are in the field of the similarity measures. First of all, from an object perspective, we review the basic measures used to compare two object representations and show that some of them are equivalent. In order to evaluate full partitions and hierarchies against an object, one needs to select which of their regions form the object to be assessed. We review and improve these techniques by means of a mathematical model of the problem. This analysis allows us to show that hierarchies can represent objects much better, with far fewer regions, than flat partitions. From a partition perspective, the literature about evaluation measures is large and entangled. Our first contribution is to review, structure, and deduplicate the measures available. We provide a new measure that we show improves previous ones in terms of a set of qualitative and quantitative meta-measures. We also extend the measures on flat partitions to cover hierarchical segmentations. The second part of this Thesis moves from the evaluation of image segmentation to its application to object detection. In particular, we build on some of the conclusions extracted in the first part to generate segmented object candidates. Given a set of hierarchies, we build the pairs and triplets of regions, we learn to combine the set from each hierarchy, and we rank them using low-level and mid-level cues. We conduct an extensive experimental validation that shows that our method outperforms the state of the art on many of the metrics tested.
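The object-perspective measures discussed above revolve around the Jaccard index between an object mask and a set of regions selected from a partition or hierarchy. The sketch below computes that index together with a simple greedy region-selection score; the greedy rule is an illustration, not the thesis' mathematical selection model.

import numpy as np

def jaccard(mask_a, mask_b):
    """Jaccard index (intersection over union) between two boolean masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return inter / union if union else 1.0

def best_object_jaccard(partition, gt_mask):
    """Jaccard reachable by selecting whole regions of a flat partition (greedy).

    Regions are tried in order of their overlap with the ground-truth object and
    kept only if adding them improves the Jaccard index.
    """
    selected = np.zeros_like(gt_mask, dtype=bool)
    best = 0.0
    labels = np.unique(partition)
    order = sorted(labels, key=lambda l: -np.logical_and(partition == l, gt_mask).sum())
    for l in order:
        candidate = selected | (partition == l)
        score = jaccard(candidate, gt_mask)
        if score > best:
            selected, best = candidate, score
    return best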
APA, Harvard, Vancouver, ISO, and other styles
43

Schels, Johannes [Verfasser], and Rainer [Akademischer Betreuer] Lienhart. "Object Class Detection Using Part-Based Models Trained from Synthetically Generated Images / Johannes Schels. Betreuer: Rainer Lienhart." Augsburg : Universität Augsburg, 2013. http://d-nb.info/1077702655/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Westerberg, Erik. "AI-based Age Estimation using X-ray Hand Images : A comparison of Object Detection and Deep Learning models." Thesis, Blekinge Tekniska Högskola, Fakulteten för datavetenskaper, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-19598.

Full text
Abstract:
Bone age assessment can be useful in a variety of ways. It can help pediatricians predict growth and puberty entrance, identify diseases, and assess whether a person lacking proper identification is a minor or not. It is a time-consuming process that is also prone to intra-observer variation, which can cause problems in many ways. This thesis attempts to improve and speed up bone age assessments by using different object detection methods to detect and segment bones anatomically important for the assessment and using these segmented bones to train deep learning models to predict bone age. A dataset consisting of 12,811 X-ray hand images of persons ranging from infant age to 19 years of age was used. In the first research question, we compared the performance of three state-of-the-art object detection models: Mask R-CNN, Yolo, and RetinaNet. We chose the best-performing model, Yolo, to segment all the growth plates in the phalanges of the dataset. We proceeded to train four different pre-trained models: Xception, InceptionV3, VGG19, and ResNet152, using both the segmented and unsegmented datasets, and compared their performance. We achieved good results using both the unsegmented and segmented datasets, although the performance was slightly better using the unsegmented dataset. The analysis suggests that we might be able to achieve a higher accuracy using the segmented dataset by adding the detection of growth plates from the carpal bones, epiphysis, and diaphysis. The best-performing model was Xception, which achieved a mean average error of 1.007 years using the unsegmented dataset and 1.193 years using the segmented dataset.
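The regression stage described, an ImageNet-pretrained Xception backbone with a small head predicting age in years, might be set up roughly as below (Keras). The head size, optimizer and loss are illustrative, and the preceding Yolo-based growth-plate segmentation is not reproduced.

import tensorflow as tf

def build_bone_age_model(input_shape=(299, 299, 3)):
    """Pre-trained Xception backbone with a single regression output (age in years)."""
    base = tf.keras.applications.Xception(include_top=False, weights="imagenet",
                                          input_shape=input_shape, pooling="avg")
    x = tf.keras.layers.Dense(256, activation="relu")(base.output)
    age = tf.keras.layers.Dense(1, activation="linear")(x)
    model = tf.keras.Model(base.input, age)
    model.compile(optimizer="adam", loss="mae")   # mean absolute error in years
    return model

# model = build_bone_age_model()
# model.fit(train_images, train_ages, validation_data=(val_images, val_ages))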

The presentation was given online via Zoom.

APA, Harvard, Vancouver, ISO, and other styles
45

Kaba, Utku. "Moving Hot Object Detection In Airborne Thermal Videos." Master's thesis, METU, 2012. http://etd.lib.metu.edu.tr/upload/12614532/index.pdf.

Full text
Abstract:
In this thesis, we present an algorithm for vision-based detection of moving objects observed by IR sensors on a moving platform. In addition, we analyze the performance of different approaches in each step of the algorithm. The proposed algorithm is composed of preprocessing, feature detection, feature matching, homography estimation and difference image analysis steps. First, a global motion estimation based on a planar homography model is performed in order to compensate for the motion of the sensor and the moving platform on which the sensors are located. Then, moving objects are identified on difference images of consecutive video frames after global motion suppression. The performance of the proposed algorithm is shown on different IR image sequences.
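The global motion compensation and differencing steps can be sketched with standard OpenCV building blocks: sparse features are matched between consecutive frames, a planar homography is fitted with RANSAC, the previous frame is warped, and the difference image is thresholded. ORB features and the numeric thresholds below are stand-in choices, not necessarily those used in the thesis.

import cv2
import numpy as np

def motion_compensated_difference(prev, curr):
    """Register prev to curr with a planar homography, then difference (sketch)."""
    orb = cv2.ORB_create(1500)
    kp1, des1 = orb.detectAndCompute(prev, None)
    kp2, des2 = orb.detectAndCompute(curr, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    warped = cv2.warpPerspective(prev, H, (curr.shape[1], curr.shape[0]))
    diff = cv2.absdiff(curr, warped)
    _, mask = cv2.threshold(diff, 30, 255, cv2.THRESH_BINARY)
    return mask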
APA, Harvard, Vancouver, ISO, and other styles
46

Zaborowski, Robert Michael. "Onboard and parts-based object detection from aerial imagery." Thesis, Monterey, California. Naval Postgraduate School, 2011. http://hdl.handle.net/10945/5523.

Full text
Abstract:
Approved for public release; distribution is unlimited.
The almost endless amount of full-motion video (FMV) data collected by Unmanned Aerial Vehicles (UAV) and similar sources presents mounting challenges to human analysts, particularly to their sustained attention to detail despite the monotony of continuous review. This digital deluge of raw imagery also places unsustainable loads on the limited resource of network bandwidth. Automated analysis onboard the UAV allows transmitting only pertinent portions of the imagery, reducing bandwidth usage and mitigating operator fatigue. Further, target detection and tracking information that is immediately available to the UAV facilitates more autonomous operations, with reduced communication needs to the ground station. Experimental results proved the utility of our onboard detection system a) through a bandwidth reduction of two orders of magnitude and b) through reduced operator workload. Additionally, a novel parts-based detection method was developed. A whole-object detector is not well suited for deformable and articulated objects, and is susceptible to failure due to partial occlusions. Parts detection with a subsequent structural model overcomes these difficulties, is potentially more computationally efficient (smaller resource footprint and able to be decomposed into a hierarchy), and permits reuse for multiple object types. Our parts-based vehicle detector achieved detection accuracy comparable to whole-object detection, while exhibiting these advantages.
APA, Harvard, Vancouver, ISO, and other styles
47

Bergenroth, Hannah. "Use of Thermal Imagery for Robust Moving Object Detection." Thesis, Linköpings universitet, Medie- och Informationsteknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-177888.

Full text
Abstract:
This work proposes a system that utilizes both infrared and visual imagery to create a more robust object detection and classification system. The system consists of two main parts: a moving object detector and a target classifier. The first stage detects moving objects in the visible and infrared spectra using background subtraction based on Gaussian Mixture Models. Low-level fusion is performed to combine the foreground regions from the respective domains. For the second stage, a Convolutional Neural Network (CNN) pre-trained on the ImageNet dataset is used to classify the detected targets into one of the pre-defined classes: human or vehicle. The performance of the proposed object detector is evaluated using multiple video streams recorded in different areas and under various weather conditions, which form a broad basis for testing the suggested method. The accuracy of the classifier is evaluated on images generated by the moving object detection stage, supplemented with the publicly available CIFAR-10 and CIFAR-100 datasets. The low-level fusion method proves more effective than using either domain separately in terms of detection results.
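The first stage described, GMM background subtraction run independently on the visible and thermal streams followed by low-level fusion of the foreground masks, can be outlined with OpenCV's MOG2 implementation. The OR-fusion and morphological clean-up below are illustrative choices, not necessarily the thesis' fusion rule.

import cv2

# One GMM background model per modality; masks are fused at the pixel level.
bg_visible = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)
bg_thermal = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)

def fused_foreground(frame_visible, frame_thermal):
    """Combine the two foreground masks; OR-fusion is one simple choice."""
    fg_v = bg_visible.apply(frame_visible)   # 0 = background, 127 = shadow, 255 = foreground
    fg_t = bg_thermal.apply(frame_thermal)
    fused = cv2.bitwise_or(fg_v, fg_t)
    # small morphological opening suppresses isolated noise pixels
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    return cv2.morphologyEx(fused, cv2.MORPH_OPEN, kernel)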

The thesis work was carried out at the Department of Science and Technology (ITN), Faculty of Science and Engineering, Linköping University.

APA, Harvard, Vancouver, ISO, and other styles
48

Jacobzon, Gustaf. "Multi-site Organ Detection in CT Images using Deep Learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-279290.

Full text
Abstract:
When optimizing a controlled dose in radiotherapy, high-resolution spatial information about healthy organs in close proximity to the malignant cells is necessary in order to mitigate dispersion into these organs-at-risk. This information can be provided by deep volumetric segmentation networks, such as 3D U-Net. However, due to limitations of memory in modern graphical processing units, it is not feasible to train a volumetric segmentation network on full image volumes, and subsampling the volume gives too coarse a segmentation. An alternative is to sample a region of interest from the image volume and train an organ-specific network. This approach requires knowledge of which region in the image volume should be sampled, which can be provided by a 3D object detection network. Typically the detection network will also be region-specific, although covering a larger region such as the thorax, and requires human assistance in choosing the appropriate network for a certain region of the body. Instead, we propose a multi-site object detection network based on YOLOv3, trained on 43 different organs, which may operate on arbitrarily chosen axial patches in the body. Our model identifies the organs present (whole or truncated) in the image volume and may automatically sample a region from the input and feed it to the appropriate volumetric segmentation network. We train our model on four small (as few as 20 images) site-specific datasets in a weakly-supervised manner in order to handle the partially unlabeled nature of site-specific datasets. Our model is able to generate organ-specific regions of interest that enclose 92% of the organs present in the test set.
APA, Harvard, Vancouver, ISO, and other styles
49

Kuchi, Aditi S. "Detection of Sand Boils from Images using Machine Learning Approaches." ScholarWorks@UNO, 2019. https://scholarworks.uno.edu/td/2618.

Full text
Abstract:
Levees provide protection for vast amounts of commercial and residential property. However, these structures degrade over time, due to the impact of severe weather, sand boils, subsidence of land, seepage, etc. In this research, we focus on detecting sand boils. Sand boils occur when water under pressure wells up to the surface through a bed of sand. These make levees especially vulnerable. Object detection is a good approach to confirm the presence of sand boils from satellite or drone imagery, which can be utilized to assist an automated levee monitoring methodology. Since sand boils have distinct features, applying object detection algorithms to them can result in accurate detection. To the best of our knowledge, this research work is the first approach to detect sand boils from images. In this research, we compare some of the latest deep learning methods, the Viola-Jones algorithm, and other non-deep-learning methods to determine the best-performing one. We also train a stacking-based machine learning model for the accurate prediction of sand boils. The accuracy of our robust model is 95.4%.
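A stacking-based classifier of the kind mentioned can be assembled directly with scikit-learn. The base learners, the meta-learner and the assumption that sand-boil candidates have already been converted into feature vectors are illustrative choices rather than the thesis' exact configuration.

from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Stacking ensemble over per-candidate feature vectors (illustrative setup).
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=200)),
                ("svm", SVC(probability=True))],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5)

# stack.fit(X_train, y_train)            # X: features extracted from image patches
# accuracy = stack.score(X_test, y_test)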
APA, Harvard, Vancouver, ISO, and other styles
50

Yigit, Ahmet. "Thermal And Visible Band Image Fusion For Abandoned Object Detection." Master's thesis, METU, 2010. http://etd.lib.metu.edu.tr/upload/3/12611720/index.pdf.

Full text
Abstract:
Packages that are left unattended in public spaces are a security concern, and timely detection of these packages is important for the prevention of potential threats. Operators should always be alert to detect abandoned items in crowded environments. However, it is very difficult for operators to stay concentrated for extended periods. Therefore, it is important to aid operators with automatic detection of abandoned items. Most of the methods in the literature define abandoned items as items newly added to the scene that stay stationary for a predefined time. Hence other stationary objects, such as people sitting on a bench, are also detected as suspicious objects, resulting in a high number of false alarms. These false alarms could be prevented by discriminating suspicious items as living/nonliving objects. In this thesis, visible band and thermal band cameras are used together to analyze the interactions between humans and other objects. Thermal images help classify objects using their heat signatures. This way, people and the objects they carry or leave behind can be detected separately. In particular, the aim is to detect abandoned items and discriminate between living and nonliving objects.
APA, Harvard, Vancouver, ISO, and other styles