Dissertations on the topic "Embedded Systems, Computer Vision, Object Classification"

To see other types of publications on this topic, follow the link: Embedded Systems, Computer Vision, Object Classification.

Create a reference in APA, MLA, Chicago, Harvard, and other citation styles

Choose a source type:

Consult the top 20 dissertations for your research on the topic "Embedded Systems, Computer Vision, Object Classification".

Next to every source in the list of references there is an "Add to bibliography" button. Press it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scientific publication in .pdf format and read its abstract online, whenever these are available in the metadata.

Browse dissertations on a wide variety of disciplines and organise your bibliography correctly.

1

Fagg, Ashton J. "Why capture frame rate matters for embedded vision." Thesis, Queensland University of Technology, 2018. https://eprints.qut.edu.au/117072/1/Ashton_Fagg_Thesis.pdf.

Full text source
Abstract:
This thesis examines the practical challenges of reliable object and facial tracking on mobile devices. We investigate the capabilities of such devices and propose a number of strategies to leverage the hardware and architectural strengths offered by smartphones and other embedded systems. We show how high-frame-rate cameras can be used as a resource to trade off algorithmic complexity while still achieving reliable, real-time tracking performance. We also propose a number of strategies for formulating tracking algorithms that make better use of the architectural redundancies inherent to modern systems-on-chip.
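To make the frame-rate/complexity trade-off concrete, here is a minimal back-of-the-envelope sketch (not taken from the thesis): at a higher capture rate the worst-case inter-frame displacement shrinks, so the per-frame search window, and with it the work a simple tracker must do, shrinks quadratically. The speed and fps figures below are illustrative assumptions.

```python
# Sketch of the trade-off: higher capture rate -> smaller inter-frame motion
# -> smaller search window for a simple local tracker. Numbers are illustrative.

def search_radius_px(max_speed_px_per_s: float, fps: float) -> float:
    """Worst-case inter-frame displacement = required search radius."""
    return max_speed_px_per_s / fps

def search_cost(radius_px: float) -> float:
    """Candidate positions grow with the area of the search window."""
    side = 2 * radius_px + 1
    return side * side

for fps in (30, 120, 240):
    r = search_radius_px(max_speed_px_per_s=600.0, fps=fps)
    print(f"{fps:>3} fps: radius {r:5.1f} px, ~{search_cost(r):8.0f} candidates/frame")
```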
APA, Harvard, Vancouver, ISO, and other styles
2

Bartoli, Giacomo. "Edge AI: Deep Learning techniques for Computer Vision applied to embedded systems." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018. http://amslaurea.unibo.it/16820/.

Full text source
Abstract:
In the last decade, Machine Learning techniques have been used in fields ranging from finance to healthcare and even marketing. Among these techniques, the ones adopting a Deep Learning approach have been shown to outperform humans in tasks such as object detection, image classification and speech recognition. This thesis introduces the concept of Edge AI: the possibility to build learning models capable of making inference locally, without any dependence on expensive servers or cloud services. A first case study is based on the Google AIY Vision Kit, an intelligent camera equipped with a graphics board to optimize Computer Vision algorithms. Then, we test the performance of CORe50, a dataset for continuous object recognition, on embedded systems. The techniques developed in these chapters are finally used to solve a challenge within the Audi Autonomous Driving Cup 2018, where a mobile car equipped with a camera, sensors and a graphics board must recognize pedestrians and stop before hitting them.
APA, Harvard, Vancouver, ISO, and other styles
3

Örn, Fredrik. "Computer Vision for Camera Trap Footage : Comparing classification with object detection." Thesis, Uppsala universitet, Avdelningen för visuell information och interaktion, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-447482.

Full text source
Abstract:
Monitoring wildlife is of great interest to ecologists and is arguably even more important in the Arctic, the region in focus for the research network INTERACT, where the effects of climate change are greater than on the rest of the planet. This master thesis studies how artificial intelligence (AI) and computer vision can be used together with camera traps to achieve an effective way to monitor populations. The study uses an image data set, containing both humans and animals, taken by camera traps from ECN Cairngorms, a station in the INTERACT network. The goal of the project is to classify these images into one of three categories: "Empty", "Animal" and "Human". Three different methods are compared: a DenseNet201 classifier, a YOLOv3 object detector, and the pre-trained MegaDetector, developed by Microsoft. No sufficient results were achieved with the classifier, but YOLOv3 performed well on human detection, with an average precision (AP) of 0.8 on both training and validation data. The animal detections for YOLOv3 did not reach as high an AP, likely because of the smaller number of training examples. The best results were achieved by MegaDetector in combination with an added method to determine whether the detected animals were dogs, reaching an average precision of 0.85 for animals and 0.99 for humans. This is the method recommended for future use, but there is potential to improve all the models and reach even more impressive results.
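As an illustration of the detector-to-category step the abstract implies, here is a hypothetical sketch of mapping per-image detections to the three labels; the priority order and the 0.5 threshold are assumptions, not the thesis's actual rules.

```python
# Hypothetical post-processing: turn per-image detections (as a
# MegaDetector-style detector would emit) into one image-level category.

from typing import List, Tuple

def image_label(detections: List[Tuple[str, float]],
                threshold: float = 0.5) -> str:
    """detections: list of (category, confidence) pairs for one image."""
    kept = {cat for cat, conf in detections if conf >= threshold}
    if "human" in kept:          # assumed priority: people over animals
        return "Human"
    if "animal" in kept:
        return "Animal"
    return "Empty"

print(image_label([("animal", 0.83), ("human", 0.31)]))  # -> "Animal"
```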
APA, Harvard, Vancouver, ISO, and other styles
4

Parvez, Bilal. "Embedded Vision Machine Learning on Embedded Devices for Image classification in Industrial Internet of things." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-219622.

Full text source
Abstract:
Thanks to Machine Learning, machines have become extremely good at image classification in near real time. Given substantial training data, powerful machines can be trained to recognize images as well as any human would. Until now, the norm has been to send pictures to a server and have the server recognize them. With an increasing number of sensors, the trend is moving towards edge computing to curb the increasing rate of data transfer and communication bottlenecks: the idea is to do the processing locally, or as close to the sensor as possible, and then transmit only actionable data to the server. While this solves a plethora of communication problems, especially in industrial settings, it creates a new one: the sensors need to perform this computationally intensive image classification themselves, which is a challenge for embedded and wearable devices due to their resource-constrained nature. This thesis analyzes Machine Learning algorithms and libraries with the goal of porting image classifiers to embedded devices. This includes comparing different supervised Machine Learning approaches to image classification and determining which are best suited for porting to embedded devices, taking a step towards making the process of testing and implementing Machine Learning algorithms as easy as on their desktop counterparts. The goal is to ease the porting of new image recognition and classification algorithms to a host of different embedded devices and to provide the motivations behind design decisions. The final proposal goes through all design considerations and implements a prototype that is hardware-independent and can be used as a reference for designing, and later porting, Machine Learning classifiers to embedded devices.
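One plausible shape for the hardware-independent prototype idea is a thin backend-agnostic interface that each embedded port implements. The sketch below illustrates the pattern under that assumption; the `ImageClassifier` and `DummyClassifier` names are illustrative, not the thesis's actual design.

```python
# Sketch: ports to different embedded targets differ only in how `infer` runs.

from abc import ABC, abstractmethod
from typing import Sequence

class ImageClassifier(ABC):
    """Backend-agnostic classifier interface."""

    @abstractmethod
    def load(self, model_path: str) -> None: ...

    @abstractmethod
    def infer(self, image: Sequence[float]) -> Sequence[float]:
        """Return class scores for a preprocessed image buffer."""

    def classify(self, image: Sequence[float]) -> int:
        scores = self.infer(image)
        return max(range(len(scores)), key=scores.__getitem__)

class DummyClassifier(ImageClassifier):
    """Toy port used only to exercise the interface."""
    def load(self, model_path: str) -> None:
        pass
    def infer(self, image: Sequence[float]) -> Sequence[float]:
        return [0.1, 0.7, 0.2]

print(DummyClassifier().classify([0.0] * 4))   # -> 1
```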
APA, Harvard, Vancouver, ISO, and other styles
5

Palazzo, Simone. "Hybrid human-machine vision systems for automated object segmentation and categorization." Doctoral thesis, Università di Catania, 2017. http://hdl.handle.net/10761/3985.

Full text source
Abstract:
Emulating human perception is a foundational component of research towards artificial intelligence (AI). Computer vision, in particular, is now one of the most active and fastest-growing research topics in AI, and its practical applications range from video-surveillance to robotics to ecological monitoring. However, in spite of all the recent progress, humans still greatly outperform machines in most visual tasks, and even competitive artificial models require thousands of examples to learn concepts that children learn easily. Hence, given the objective difficulty of emulating the human visual system, the question we investigate in this thesis is in which ways humans can support the advancement of computer vision techniques. More precisely, we investigated how the synergy between human vision expertise and automated methods can be shifted from a top-down paradigm, where direct user action or human perception principles explicitly guide the software component, to a bottom-up paradigm where, instead of trying to copy the way our mind works, we exploit the by-product (i.e., some kind of measured feedback) of its workings to extract information on how visual tasks are performed. Starting from a purely top-down approach, where a fully-automated video object segmentation algorithm is extended to encode and include principles of human perceptual organization, we moved to interactive methods, where the same task is performed with humans in the loop by means of gamification and eye-gaze analysis strategies, in a progressively more bottom-up fashion. Lastly, we pushed this trend to the limit by investigating brain-driven image classification approaches, where brain signals were used to extract compact representations of image contents. Performance evaluation of the tested approaches shows that involving people in automated vision methods can enhance their accuracy. Our experiments, carried out at different degrees of awareness and control of the generated human feedback, show that top-down approaches may achieve better accuracy than bottom-up ones, at the cost of higher user interaction time and effort. As for our most ambitious objective, the purely bottom-up image classification system based on brain pattern analysis, we were able to outperform the current state of the art with a method trained to extract brain-inspired visual content descriptors, thus removing the need to undergo EEG recording for unseen images.
APA, Harvard, Vancouver, ISO, and other styles
6

Wallenberg, Marcus. "Components of Embodied Visual Object Recognition : Object Perception and Learning on a Robotic Platform." Licentiate thesis, Linköpings universitet, Datorseende, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-93812.

Full text source
Abstract:
Object recognition is a skill we as humans often take for granted. Due to our formidable object learning, recognition and generalisation skills, it is sometimes hard to see the multitude of obstacles that need to be overcome in order to replicate this skill in an artificial system. Object recognition is also one of the classical areas of computer vision, and many ways of approaching the problem have been proposed. Recently, visually capable robots and autonomous vehicles have increased the focus on embodied recognition systems and active visual search. These applications demand that systems can learn and adapt to their surroundings, and arrive at decisions in a reasonable amount of time, while maintaining high object recognition performance. Active visual search also means that mechanisms for attention and gaze control are integral to the object recognition procedure. This thesis describes work done on the components necessary for creating an embodied recognition system, specifically in the areas of decision uncertainty estimation, object segmentation from multiple cues, adaptation of stereo vision to a specific platform and setting, and the implementation of the system itself. Contributions include the evaluation of methods and measures for predicting the potential uncertainty reduction that can be obtained from additional views of an object, allowing for adaptive target observations. Also, in order to separate a specific object from other parts of a scene, it is often necessary to combine multiple cues such as colour and depth in order to obtain satisfactory results. Therefore, a method for combining these using channel coding has been evaluated. Finally, in order to make use of three-dimensional spatial structure in recognition, a novel stereo vision algorithm extension along with a framework for automatic stereo tuning have also been investigated. All of these components have been tested and evaluated on a purpose-built embodied recognition platform known as Eddie the Embodied.
Embodied Visual Object Recognition
APA, Harvard, Vancouver, ISO, and other styles
7

Simons, Taylor Scott. "High-Speed Image Classification for Resource-Limited Systems Using Binary Values." BYU ScholarsArchive, 2021. https://scholarsarchive.byu.edu/etd/9097.

Full text source
Abstract:
Image classification is a memory- and compute-intensive task. It is difficult to implement high-speed image classification algorithms on resource-limited systems like FPGAs and embedded computers. Most image classification algorithms require many fixed- and/or floating-point operations and values. In this work, we explore the use of binary values to reduce the memory and compute requirements of image classification algorithms. Our objective was to implement these algorithms on resource-limited systems while maintaining comparable accuracy and high speeds. By implementing high-speed image classification algorithms on resource-limited systems like embedded computers, FPGAs, and ASICs, automated visual inspection can be performed on small low-powered systems. Industries like manufacturing, medicine, and agriculture can benefit from compact, high-speed, low-power visual inspection systems. Tasks like defect detection in manufactured products and quality sorting of harvested produce can be performed cheaper and more quickly. In this work, we present ECO Jet Features, an algorithm adapted to use binary values for visual inspection. The ECO Jet Features algorithm ran 3.7x faster than the original ECO Features algorithm on embedded computers. It also allowed the algorithm to be implemented on an FPGA, achieving 78x speedup over full-sized desktop systems, using a fraction of the power and space. We reviewed Binarized Neural Nets (BNNs), neural networks that use binary values for weights and activations. These networks are particularly well suited for FPGA implementation and we compared and contrasted various FPGA implementations found throughout the literature. Finally, we combined the deep learning methods used in BNNs with the efficiency of Jet Features to make Neural Jet Features. Neural Jet Features are binarized convolutional layers that are learned through deep learning and learn classic computer vision kernels like the Gaussian and Sobel kernels. These kernels are efficiently computed as a group and their outputs can be reused when forming output channels. They performed just as well as BNN convolutions on visual inspection tasks and are more stable when trained on small models.
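The arithmetic payoff of binary values can be shown in a few lines: a dot product between ±1 vectors packed into machine words collapses to an XNOR plus a popcount. This is a generic sketch of the BNN-style arithmetic the thesis surveys, not code from the thesis itself.

```python
# +/-1 dot product via XNOR + popcount: the core trick behind binarized
# convolutions, which removes fixed- and floating-point multiplies entirely.

def pack(bits):
    """Pack a +/-1 vector into one integer word (+1 -> bit 1, -1 -> bit 0)."""
    word = 0
    for b in bits:
        word = (word << 1) | (1 if b > 0 else 0)
    return word

def binary_dot(x_word: int, w_word: int, n: int) -> int:
    """Dot product of two n-long +/-1 vectors: matches minus mismatches."""
    matches = bin(~(x_word ^ w_word) & ((1 << n) - 1)).count("1")
    return 2 * matches - n

x = [+1, -1, +1, +1, -1, -1, +1, -1]
w = [+1, +1, -1, +1, -1, +1, +1, -1]
assert binary_dot(pack(x), pack(w), 8) == sum(a * b for a, b in zip(x, w))
print(binary_dot(pack(x), pack(w), 8))  # -> 2
```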
APA, Harvard, Vancouver, ISO, and other styles
8

Lindqvist, Zebh. "Design Principles for Visual Object Recognition Systems." Thesis, Luleå tekniska universitet, Institutionen för system- och rymdteknik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-80769.

Full text source
Abstract:
Today's smartphones are capable of accomplishing far more advanced tasks than reading emails. With the modern framework TensorFlow, visual object recognition becomes possible using smartphone resources. This thesis shows that the main challenge does not lie in developing an artifact which performs visual object recognition. Instead, the main challenge lies in developing an ecosystem which allows for continuous improvement of the system’s ability to accomplish the given task without laborious and inefficient data collection. This thesis presents four design principles which contribute to an efficient ecosystem with quick initiation of new object classes and efficient data collection which is used to continuously improve the system’s ability to recognize smart meters in varying environments in an automated fashion.
APA, Harvard, Vancouver, ISO, and other styles
9

Huttunen, S. (Sami). "Methods and systems for vision-based proactive applications." Doctoral thesis, Oulun yliopisto, 2011. http://urn.fi/urn:isbn:9789514296536.

Full text source
Abstract:
Human-computer interaction (HCI) is an integral part of modern society. Since the number of technical devices around us is increasing, the way of interacting is changing as well. The systems of the future should be proactive, so that they can adapt and adjust to people's movements and actions without requiring any conscious control. Visual information plays a vital role in this kind of implicit human-computer interaction due to its expressiveness. It is therefore obvious that cameras equipped with computing power and computer vision techniques provide an unobtrusive way of analyzing human intentions. Despite its many advantages, the use of computer vision is not always straightforward. Typically, every application sets specific requirements for the methods that can be applied. Given these motivations, this thesis aims to develop new vision-based methods and systems that can be utilized in proactive applications. As a case study, the thesis covers two different proactive computer vision applications. Firstly, an automated system that takes care of both the selection and switching of the video source in a distance education situation is presented. The system is further extended with a pan-tilt-zoom camera system designed to track the teacher when s/he walks at the front of the classroom. The second proactive application is targeted at mobile devices. The system presented recognizes landscape scenes, which can be utilized in automatic shooting mode selection. Distributed smart cameras have been an active area of research in recent years, and they play an important role in many applications. Most of the research has focused either on the computer vision algorithms or on a specific implementation. There has been less activity on building generic frameworks which allow different algorithms, sensors and distribution methods to be used. In this field, the thesis presents an open and extensible framework for the development of distributed sensor networks, with an emphasis on peer-to-peer networking. From the methodological point of view, the thesis makes its contribution to the field of multi-object tracking. The method presented utilizes soft assignment to associate the measurements with the objects tracked. In addition, the thesis presents two different ways of extracting location measurements from images. As a result, the method proposed provides the locations and trajectories of multiple objects, which can be utilized in proactive applications.
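A minimal sketch of the soft-assignment idea: each measurement contributes to every track with a probability instead of being hard-assigned to one track. The 1-D state and Gaussian likelihood below are assumptions for illustration, not the thesis's actual measurement model.

```python
# Soft assignment: normalize per-track likelihoods over all tracks so each
# measurement is shared probabilistically between nearby tracks.

import math

def soft_assignment(tracks, measurements, sigma=1.0):
    """Return w[i][j]: probability that measurement j belongs to track i."""
    weights = []
    for t in tracks:
        row = [math.exp(-((z - t) ** 2) / (2 * sigma ** 2)) for z in measurements]
        weights.append(row)
    for j in range(len(measurements)):          # normalise over the tracks
        col = sum(weights[i][j] for i in range(len(tracks))) or 1.0
        for i in range(len(tracks)):
            weights[i][j] /= col
    return weights

print(soft_assignment(tracks=[0.0, 5.0], measurements=[0.4, 4.2, 9.9]))
```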
APA, Harvard, Vancouver, ISO, and other styles
10

Andersson Dickfors, Robin, and Nick Grannas. "OBJECT DETECTION USING DEEP LEARNING ON METAL CHIPS IN MANUFACTURING." Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-55068.

Full text source
Abstract:
When designing cutting tools for the turning industry, providing optimal cutting parameters is important both for the client and for the company's own research. By examining the metal chips that form in the turning process, operators can recommend optimal cutting parameters. Instead of manual classification of the metal chips that come from the turning process, an automated approach to chip detection and classification is preferred. This thesis aims to evaluate whether such an approach is possible using either a Convolutional Neural Network (CNN) or CNN feature extraction coupled with machine learning (ML). The thesis started with a research phase in which we reviewed existing state-of-the-art CNNs, image processing and ML algorithms. From the research, we implemented our own object detection algorithm, and we chose to implement two CNNs, AlexNet and VGG16. A third CNN was designed and implemented with our specific task in mind. The three models were tested against each other, both as standalone image classifiers and as feature extractors coupled with an ML algorithm. Because the chips were inside a machine, different angles and light setups had to be tested to evaluate which setup provided the optimal image for classification. A top view of the cutting area was found to be the optimal angle, with light focused both below the cutting area and in the chip disposal tray. The smaller proposed CNN, with three convolutional layers, three pooling layers and two dense layers, was found to rival both AlexNet and VGG16, both as a standalone classifier and as a feature extractor. The proposed model was designed with a limited system in mind and is therefore better suited for such systems while still having a high accuracy. The classification accuracy of the proposed model as a standalone classifier was 92.03%, compared to the state-of-the-art classifier AlexNet with an accuracy of 92.20%, and VGG16 with an accuracy of 91.88%. When used as feature extractors, all three models paired best with the Random Forest algorithm, and the difference in accuracy between the feature extractors is not significant. The proposed feature extractor combined with Random Forest had an accuracy of 82.56%, compared to AlexNet with an accuracy of 81.93% and VGG16 with 79.14%.
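The feature-extractor-plus-ML configuration compared in the thesis can be sketched in a few lines of scikit-learn. The synthetic 256-dimensional features below merely stand in for CNN embeddings of chip images, so the shapes and scores are placeholders, not results from the thesis.

```python
# Sketch: treat a CNN's penultimate activations as features and classify
# them with a Random Forest, as in the feature-extractor comparison.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
features = rng.normal(size=(400, 256))   # stand-in for CNN embeddings
labels = rng.integers(0, 3, size=400)    # stand-in for chip classes

X_tr, X_te, y_tr, y_te = train_test_split(features, labels, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(f"accuracy: {clf.score(X_te, y_te):.3f}")
```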
DIGICOGS
APA, Harvard, Vancouver, ISO, and other styles
11

Leijonhufvud, Peder, and Emil Bråkenhielm. "Image Processing for Improved Bacteria Classification." Thesis, Linköpings universitet, Programvara och system, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-167416.

Full text source
Abstract:
Mastitis is a common disease among cows on dairy farms. Diagnosis of the infection is today done manually, by analyzing bacteria growth on agar plates, but classifiers are being developed for automated diagnostics using images of agar plates. Input images need to be of reasonable quality and consistent in terms of scale, positioning, perspective, and rotation for accurate classification. Therefore, this thesis investigates whether a combination of image processing techniques can be used to match each input image to a pre-defined reference model. A method was proposed to identify the key points needed to register the input image to the reference model. The key points were defined by identifying the agar plate, its compartments, and its rotation within the image. The results showed that image registration with the correct key points was sufficient to match images of agar plates to a reference model despite variations in scale, position, perspective, or rotation. However, the accuracy depended on the identification of the salient features of the agar plate. Ultimately, the work proposes an approach using image registration to transform images of agar plates based on a pre-defined reference model, rather than a reference image.
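A minimal sketch of the registration step with OpenCV, assuming the four plate corners have already been located as key points; the coordinates and the blank stand-in image are hypothetical, not data from the thesis.

```python
# Key-point registration sketch: estimate a homography from matched points
# and warp the input photo onto the reference model's coordinate frame.

import numpy as np
import cv2

# four corresponding key points: plate corners in the photo vs. the reference
src = np.float32([[102, 87], [530, 95], [541, 519], [96, 508]])
dst = np.float32([[0, 0], [512, 0], [512, 512], [0, 512]])

H, _ = cv2.findHomography(src, dst, method=cv2.RANSAC)

photo = np.zeros((600, 640, 3), np.uint8)        # stand-in for a plate photo
registered = cv2.warpPerspective(photo, H, (512, 512))
print(H.round(3), registered.shape)
```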
APA, Harvard, Vancouver, ISO, and other styles
12

Sandino Mora, Juan David. "Autonomous decision-making for UAVs operating under environmental and object detection uncertainty." Thesis, Queensland University of Technology, 2022. https://eprints.qut.edu.au/232513/1/Juan%20David_Sandino%20Mora_Thesis.pdf.

Full text source
Abstract:
This study established a framework that increases cognitive levels in small UAVs (or drones), enabling autonomous navigation in partially observable environments. The UAV system was validated in search-and-rescue scenarios by locating victims last seen inside cluttered buildings and in bushland. The framework improved the decision-making skills of the drone, allowing it to collect more accurate statistics of detected victims. This study assists in validating detected objects in real time when data is too complex for UAV pilots to interpret, and reduces human bias in scouting strategies.
APA, Harvard, Vancouver, ISO, and other styles
13

Edlund, Fredrik, and Saqib Sarker. "Smart Kitchen : Automatisk inventering av föremål." Thesis, KTH, Data- och elektroteknik, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-183583.

Full text source
Abstract:
Internet of Things is growing fast and is predicted to become a part of everyday life, which opens opportunities to create products that make everyday life easier. Automated object identification combined with an automated inventory check can make it easier to track what is in stock; this can be used, for example, in smart refrigerators. This bachelor thesis studies object identification methods with the purpose of building a system that automatically identifies objects and manages inventory status. A prototype was built and tested to explore the possibilities of such a system. The prototype uses a Raspberry Pi as its core unit, which uses the Dlib library to identify predefined objects. The user identifies unknown objects via a mobile phone application, which makes it possible for the system to learn to identify new objects. The same application is used to check the inventory status of the different objects registered by the system. The prototype can identify known objects and learn new ones, in line with the project goals.
APA, Harvard, Vancouver, ISO, and other styles
14

Stenhager, Elinore. "Sportanalys för skytte : En metod för automatisk detektion och analys av träffar." Thesis, Jönköping University, JTH, Avdelningen för datateknik och informatik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:hj:diva-45692.

Full text source
Abstract:
Score calculation and performance analysis on shooting targets is an important aspect of developing shooting ability. An image-based automatic scoring algorithm would automate this procedure and provide digital visualization of the results. Prevailing solutions are high-quality algorithms that detect hit points with high precision. However, these methods are adapted to unrealistic use cases, where single-use, high-quality target boards are photographed in favourable environments. Usually, gun shooting is performed outdoors, where bullet holes are covered with stickers between shooting rounds and targets are reused until they fall apart. This bachelor thesis introduces a method for automatic hit-point detection adapted to realistic shooting conditions that relies solely on available image processing techniques. The proposed algorithm detects holes with a 40 percent detection rate on low-quality target boards, reaching an 88 percent detection rate on targets of higher quality, while producing a significant number of false positives. The result demonstrates the possibility of developing such a system and highlights the difficulties associated with such an implementation.
APA, Harvard, Vancouver, ISO, and other styles
15

Sievert, Rolf. "Instance Segmentation of Multiclass Litter and Imbalanced Dataset Handling : A Deep Learning Model Comparison." Thesis, Linköpings universitet, Datorseende, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-175173.

Full text source
Abstract:
Instance segmentation has great potential for addressing the littering problem by autonomously detecting and segmenting different categories of litter. With this information, litter could, for example, be geotagged to aid litter pickers or to give precise location information to unmanned vehicles for autonomous litter collection. Land-based litter instance segmentation is a relatively unexplored field, and this study aims to compare the instance segmentation models Mask R-CNN and DetectoRS using the multiclass litter dataset Trash Annotations in Context (TACO), evaluated with the Common Objects in Context precision and recall scores. TACO is an imbalanced dataset, and therefore imbalanced data handling is addressed, exercising a second-order-relation iterative stratified split and, additionally, oversampling when training Mask R-CNN. Mask R-CNN without oversampling resulted in a segmentation of 0.127 mAP, and with oversampling 0.163 mAP. DetectoRS achieved 0.167 segmentation mAP and improves the segmentation mAP of small objects most noticeably, by a factor of at least 2, which is important within the litter domain since small objects such as cigarettes are overrepresented. In contrast, oversampling with Mask R-CNN does not seem to improve the general precision of small and medium objects, but only improves the detection of large objects. It is concluded that DetectoRS improves results compared to Mask R-CNN, as does oversampling. However, using a dataset that cannot have an all-class representation across train, validation, and test splits, together with an iterative stratification that does not guarantee all-class representation, makes it hard for future works to make exact comparisons with this study. Results are therefore approximate when all categories are considered, since 12 categories are missing from the test set, 4 of which were impossible to split into train, validation, and test sets. Further image collection and annotation to mitigate the imbalance would most noticeably improve results, since results depend on class-averaged values. Oversampling with DetectoRS would also help improve results. There is also the option of combining the two datasets TACO and MJU-Waste to enable training on more categories.
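For the oversampling step, one common recipe is to draw training images with probability inversely proportional to the frequency of their rarest class. The sketch below illustrates that recipe; the counting scheme is an assumption, not TACO's or the thesis's exact method.

```python
# Class-balanced oversampling sketch: images containing rare litter classes
# get proportionally higher draw weights when sampling a training epoch.

from collections import Counter
import random

def sample_weights(image_labels):
    """image_labels: per-image class lists -> per-image draw weight."""
    freq = Counter(c for labels in image_labels for c in labels)
    return [1.0 / min(freq[c] for c in labels) for labels in image_labels]

data = [["cigarette"], ["bottle", "cigarette"], ["bottle"], ["can"]]
weights = sample_weights(data)                      # [0.5, 0.5, 0.5, 1.0]
epoch = random.choices(range(len(data)), weights=weights, k=8)
print(weights, epoch)
```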
APA, Harvard, Vancouver, ISO, and other styles
16

Haynes, Keith L. "Object recognition using rapid classification trees." PhD diss., Florida State University, 2006. http://etd.lib.fsu.edu/theses/available/05182006-092602.

Full text source
Abstract:
Thesis (Ph. D.)--Florida State University, 2006.
Advisor: Xiuwen Liu, Florida State University, College of Arts and Sciences, Dept. of Computer Science. Title and description from dissertation home page (viewed Sept. 20, 2006). Document formatted into pages; contains xi, 109 pages. Includes bibliographical references.
APA, Harvard, Vancouver, ISO, and other styles
17

Vance, Lauren M. "A Transfer Learning Approach to Object Detection Acceleration for Embedded Applications." Thesis, 2021. http://dx.doi.org/10.7912/C2/62.

Full text source
Abstract:
Indiana University-Purdue University Indianapolis (IUPUI)
Deep learning solutions to computer vision tasks have revolutionized many industries in recent years, but embedded systems have too many restrictions to take advantage of current state-of-the-art configurations. Typical embedded processor hardware configurations must meet very low power and memory constraints to maintain small and lightweight packaging, and the architectures of the current best deep learning models are too computationally-intensive for these hardware configurations. Current research shows that convolutional neural networks (CNNs) can be deployed with a few architectural modifications on Field-Programmable Gate Arrays (FPGAs) resulting in minimal loss of accuracy, similar or decreased processing speeds, and lower power consumption when compared to general-purpose Central Processing Units (CPUs) and Graphics Processing Units (GPUs). This research contributes further to these findings with the FPGA implementation of a YOLOv4 object detection model that was developed with the use of transfer learning. The transfer-learned model uses the weights of a model pre-trained on the MS-COCO dataset as a starting point then fine-tunes only the output layers for detection on more specific objects of five classes. The model architecture was then modified slightly for compatibility with the FPGA hardware using techniques such as weight quantization and replacing unsupported activation layer types. The model was deployed on three different hardware setups (CPU, GPU, FPGA) for inference on a test set of 100 images. It was found that the FPGA was able to achieve real-time inference speeds of 33.77 frames-per-second, a speedup of 7.74 frames-per-second when compared to GPU deployment. The model also consumed 96% less power than a GPU configuration with only approximately 4% average loss in accuracy across all 5 classes. The results are even more striking when compared to CPU deployment, with 131.7-times speedup in inference throughput. CPUs have long since been outperformed by GPUs for deep learning applications but are used in most embedded systems. These results further illustrate the advantages of FPGAs for deep learning inference on embedded systems even when transfer learning is used for an efficient end-to-end deployment process. This work advances current state-of-the-art with the implementation of a YOLOv4 object detection model developed with transfer learning for FPGA deployment.
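The transfer-learning recipe described here, reusing pre-trained weights, freezing the backbone, and fine-tuning only the output layers for the five new classes, looks roughly like the generic PyTorch sketch below. The toy layers stand in for YOLOv4, whose real architecture is far larger.

```python
# Generic transfer-learning sketch: freeze a pre-trained backbone and train
# only a new output head for the target classes.

import torch
import torch.nn as nn

backbone = nn.Sequential(               # stand-in for a pre-trained backbone
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1))
head = nn.Linear(16, 5)                 # new output layer: five classes

for p in backbone.parameters():         # freeze the pre-trained weights
    p.requires_grad = False

opt = torch.optim.Adam(head.parameters(), lr=1e-3)  # optimise the head only
x, y = torch.randn(4, 3, 64, 64), torch.tensor([0, 2, 1, 4])
logits = head(backbone(x).flatten(1))
loss = nn.functional.cross_entropy(logits, y)
loss.backward()
opt.step()
print(loss.item())
```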
APA, Harvard, Vancouver, ISO, and other styles
18

Martí i Rabadán, Miquel. "Multitask Deep Learning models for real-time deployment in embedded systems." Thesis, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-216673.

Full text source
Abstract:
Multitask Learning (MTL) was conceived as an approach to improve the generalization ability of machine learning models. When applied to neural networks, multitask models take advantage of sharing resources to reduce the total inference time, memory footprint and model size. We propose MTL as a way to speed up deep learning models for applications in which multiple tasks need to be solved simultaneously, which is particularly useful in embedded, real-time systems such as the ones found in autonomous cars or UAVs. In order to study this approach, we apply MTL to a Computer Vision problem in which both Object Detection and Semantic Segmentation tasks are solved based on the Single Shot Multibox Detector and Fully Convolutional Networks with skip connections, respectively, using a ResNet-50 as the base network. We train multitask models for two different datasets: Pascal VOC, which is used to validate the decisions made, and a combination of datasets with aerial-view images captured from UAVs. Finally, we analyse the challenges that appear during the training of multitask networks and try to overcome them. However, these hinder the capacity of our multitask models to reach the performance of the best single-task models trained without the limitations imposed by applying MTL. Nevertheless, multitask networks benefit from sharing resources and are 1.6x faster, lighter and use less memory compared to deploying the single-task models in parallel, which becomes essential when running them on a Jetson TX1 SoC, as the parallel approach does not fit into memory. We conclude that MTL has the potential to give superior performance, as far as the object detection and semantic segmentation tasks are concerned, in exchange for a more complex training process that requires overcoming challenges not present in the training of single-task models.
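The resource-sharing pattern the abstract describes, one backbone computed once and consumed by both task heads, can be sketched as follows; the toy layers are placeholders for the ResNet-50 backbone and the SSD/FCN heads.

```python
# Multitask sketch: a shared backbone feeds a detection head and a
# segmentation head, so the expensive features are computed only once.

import torch
import torch.nn as nn

class MultitaskNet(nn.Module):
    def __init__(self, num_classes=21):
        super().__init__()
        self.backbone = nn.Sequential(          # shared feature extractor
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU())
        self.det_head = nn.Conv2d(32, 4 + num_classes, 1)   # boxes + scores
        self.seg_head = nn.Conv2d(32, num_classes, 1)       # per-pixel logits

    def forward(self, x):
        feats = self.backbone(x)                # computed once, used twice
        return self.det_head(feats), self.seg_head(feats)

det, seg = MultitaskNet()(torch.randn(1, 3, 128, 128))
print(det.shape, seg.shape)
```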
APA, Harvard, Vancouver, ISO, and other styles
19

Katare, Dewant. "EXPLORATION OF DEEP LEARNING APPLICATIONS ON AN AUTONOMOUS EMBEDDED PLATFORM (BLUEBOX 2.0)." Thesis, 2019.

Find full text source
Abstract:
An autonomous vehicle depends on a combination of the latest technologies, or ADAS safety features, such as Adaptive Cruise Control (ACC), Autonomous Emergency Braking (AEB), Automatic Parking, Blind Spot Monitor, Forward Collision Warning or Avoidance (FCW or FCA), and Lane Departure Warning. The current trend is to implement these features using artificial or deep neural networks in place of traditional algorithms. Recent research in deep learning and the development of competent processors for autonomous or self-driving cars have shown ample promise, but hardware deployment remains complex because of limited resources such as memory, computational power, and energy. Deploying several of these ADAS safety features on multiple sensors and individual processors increases integration complexity and also distributes the system, which is pivotal for autonomous vehicles.

This thesis tackles two important ADAS safety features, forward collision warning and object detection, using machine learning and deep neural networks, and their deployment on an autonomous embedded platform.

This thesis proposes the following:
1. A machine-learning-based approach to the forward collision warning system in an autonomous vehicle.
2. 3D object detection using lidar and camera, primarily based on lidar point clouds.

The proposed forward collision warning model is based on a forward-facing automotive radar providing sensed input values, such as acceleration, velocity and separation distance, to a classifier algorithm which, on the basis of a supervised learning model, alerts the driver of a possible collision. Decision Trees, Linear Regression, Support Vector Machines, Stochastic Gradient Descent, and a Fully Connected Neural Network are used for the prediction.
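A sketch of that supervised set-up with scikit-learn, using a synthetic stand-in for the radar data; the feature ranges and the toy warning rule below are assumptions, not the thesis's data or labels.

```python
# FCW sketch: radar-style features (acceleration, velocity, separation
# distance) in, collision warning out, compared across classifier families.

import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X = rng.uniform([-5, 0, 1], [5, 40, 80], size=(500, 3))  # acc, vel, dist
y = (X[:, 2] < 4 + 0.8 * X[:, 1]).astype(int)            # toy warning rule

for model in (DecisionTreeClassifier(), SVC(), SGDClassifier()):
    print(type(model).__name__, model.fit(X, y).score(X, y))
```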

The second proposed method uses an object detection architecture that combines 2D object detectors with contemporary 3D deep learning techniques. In this approach, a 2D object detector is applied first, proposing 2D bounding boxes on the images or video frames. A 3D object detection technique then instance-segments the point clouds and, based on raw point-cloud density, predicts a 3D bounding box around each previously segmented object.
APA, Harvard, Vancouver, ISO, and other styles
20

Pathak, Durvesh. "Compressed Convolutional Neural Network for Autonomous Systems." Thesis, 2019.

Find full text source
Abstract:
The word "perception" seems intuitive and perhaps the most straightforward problem for the human brain, because as children we are trained to classify images and detect objects; for computers, it can be a daunting task. Giving intuition and reasoning to a computer that has mere capabilities to accept and process commands is a big challenge. However, recent leaps in hardware development, sophisticated software frameworks, and mathematical techniques have made it a little less daunting, if not easy. Various applications are built around the concept of perception, and they require substantial computational resources, expensive hardware, and sophisticated software frameworks. Building a perception application for an embedded system is an entirely different ballgame. An embedded system is a culmination of hardware, software and peripherals developed for specific tasks, with constraints imposed on memory and power, and applications developed for these systems should respect those constraints. Before 2012, problems related to perception, such as classification and object detection, were solved using algorithms with manually engineered features. In recent years, instead of manually engineering the features, these features are learned through learning algorithms. The game-changing Convolutional Neural Network architecture proposed in 2012 by Alex Krizhevsky provided tremendous momentum in the direction of pushing neural networks for perception. This thesis is an attempt to develop a convolutional neural network architecture for embedded systems, i.e. an architecture that has a small model size and competitive accuracy, recreating state-of-the-art architectures using the fire-module concept to reduce model size. The proposed compact models are feasible for deployment on embedded devices such as the Bluebox 2.0. Furthermore, attempts are made to integrate the compact Convolutional Neural Network with object detection pipelines.
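The fire module referenced here is the SqueezeNet building block: a 1x1 "squeeze" convolution cuts channels before parallel 1x1 and 3x3 "expand" convolutions restore them, which is where the parameter savings come from. A compact PyTorch sketch of that standard block (the channel sizes below are the usual SqueezeNet toy values, not this thesis's exact configuration):

```python
# SqueezeNet-style fire module: squeeze to few channels, then expand with
# parallel 1x1 and 3x3 convolutions and concatenate the results.

import torch
import torch.nn as nn

class Fire(nn.Module):
    def __init__(self, in_ch, squeeze_ch, expand_ch):
        super().__init__()
        self.squeeze = nn.Conv2d(in_ch, squeeze_ch, kernel_size=1)
        self.expand1 = nn.Conv2d(squeeze_ch, expand_ch, kernel_size=1)
        self.expand3 = nn.Conv2d(squeeze_ch, expand_ch, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        s = self.relu(self.squeeze(x))
        return torch.cat([self.relu(self.expand1(s)),
                          self.relu(self.expand3(s))], dim=1)

out = Fire(96, 16, 64)(torch.randn(1, 96, 55, 55))
print(out.shape)  # (1, 128, 55, 55): 64 + 64 expanded channels
```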
APA, Harvard, Vancouver, ISO, and other styles