Bibliographies: '080199 Artificial Intelligence and Image Processing not elsewhere classified'

1

Leitner, Jürgen. "From vision to actions: Towards adaptive and autonomous humanoid robots." Thesis, Università della Svizzera Italiana, 2014. https://eprints.qut.edu.au/90178/2/2014INFO020.pdf.


Abstract:

Although robotics research has advanced over the last decades, robots are still not in widespread use outside industrial applications. Yet a range of proposed scenarios have robots working together, helping and coexisting with humans in daily life. In all of these, a clear need arises to deal with a more unstructured, changing environment. I herein present a system that aims to overcome the limitations of highly complex robotic systems in terms of autonomy and adaptation. The main focus of research is to investigate the use of visual feedback for improving the reaching and grasping capabilities of complex robots. To facilitate this, a combination of computer vision and machine learning techniques is employed. From a robot vision point of view, combining domain knowledge from both image processing and machine learning can expand the capabilities of robots. I present a novel framework called Cartesian Genetic Programming for Image Processing (CGP-IP). CGP-IP can be trained to detect objects in the incoming camera streams and has been successfully demonstrated on many different problem domains. The approach requires only a few training images (it was tested with 5 to 10 images per experiment) and is fast, scalable and robust. Additionally, it can generate human-readable programs that can be further customized and tuned. While CGP-IP is a supervised-learning technique, I show an integration on the iCub that allows for the autonomous learning of object detection and identification. Finally, this dissertation includes two proofs of concept that integrate the motion and action sides. First, reactive reaching and grasping is shown. It allows the robot to avoid obstacles detected in the visual stream while reaching for the intended target object. Furthermore, the integration enables us to use the robot in non-static environments, i.e. the reaching is adapted on the fly from the visual feedback received, e.g. when an obstacle is moved into the trajectory. The second integration highlights the capabilities of these frameworks by using object manipulation actions to improve visual detection.


2

Lee, Wo Jae. "AI-DRIVEN PREDICTIVE WELLNESS OF MECHANICAL SYSTEMS: ASSESSMENT OF TECHNICAL, ENVIRONMENTAL, AND ECONOMIC PERFORMANCE." Thesis, 2021.


Abstract:

One way to reduce the lifecycle cost and environmental impact of a product in a circular economy is to extend its lifespan by either creating longer-lasting products or managing the product properly during its use stage. Life extension of a product is envisioned to help utilize raw materials more efficiently and slow the rate of resource depletion. In the case of manufacturing equipment (e.g., an electric motor on a machine tool), securing a reliable service life as well as life extension is important for consistent production and operational excellence in a factory. However, manufacturing equipment is often utilized without a planned maintenance approach. Such a strategy frequently results in unplanned downtime, owing to unexpected failures. Scheduled maintenance replaces components frequently to avoid unexpected equipment stoppages, but increases the time associated with machine non-operation and maintenance cost.

Recently, the emergence of Industry 4.0 and smart systems has drawn increasing attention to predictive maintenance (PdM) strategies that can decrease the cost of downtime and increase the availability (utilization rate) of manufacturing equipment. PdM also has the potential to foster sustainable practices in manufacturing by maximizing the useful lives of components. In addition, advances in sensor technology (e.g., lower fabrication cost) enable greater use of sensors in a factory, which in turn produces larger and more diverse sets of data. Widespread use of wireless sensor networks (WSNs) and plug-and-play interfaces for collecting data on product/equipment states is allowing predictive maintenance on a much greater scale. Advances in computing have made big data analysis faster and better, allowing maintenance to transition from run-to-failure to statistical inference-based or machine learning prediction methods.

Moreover, maintenance practice in a factory is evolving from equipment “health management” to equipment “wellness” by establishing an integrated and collaborative manufacturing system that responds in real time to changing conditions in a factory. Equipment wellness is an active process of becoming aware of the health condition and of making choices that achieve the full potential of the equipment. To enable this, a large amount of machine condition data obtained from sensors needs to be analyzed to diagnose the current health condition and predict future behavior (e.g., remaining useful life). If a fault is detected during this diagnosis, the root cause of the fault must be identified to extend equipment life and prevent the problem from recurring.

However, it is challenging to build a model capturing a relationship between multi-sensor signals and mechanical failures, considering the dynamic manufacturing environment and the complex mechanical system in equipment. Another key challenge is to obtain usable machine condition data to validate a method.

A goal of the proposed work is to develop a systematic tool for maintenance in manufacturing plants using emerging technologies (e.g., AI, Smart Sensor, and IoT). The proposed method will facilitate decision-making that supports equipment maintenance by rapidly detecting a worn component and estimating remaining useful life. In order to diagnose and prognose the health condition of equipment, several data-driven models that describe the relationships between proxy measures (i.e., sensor signals) and machine health conditions are developed and validated through experiments for several different manufacturing-oriented cases (e.g., cutting tool, gear, and bearing). To enhance the robustness and the prediction capability of the data-driven models, signal processing is conducted to preprocess the raw signals using domain knowledge. Through this process, useful features from the large dataset are extracted and selected, thus increasing computational efficiency in model training. To make a decision using the processed signals, a customized deep learning architecture for each case is designed to effectively and efficiently learn the relationship between the processed signals and the model’s outputs (e.g., health indicators). Ultimately, the method developed through this research helps to avoid catastrophic mechanical failures, products of unacceptable quality, and defective products in the manufacturing process, as well as to extend equipment service life.

To summarize, this dissertation assesses the technical, environmental and economic performance of the AI-driven method for the wellness of mechanical systems. The proposed methods are applied to (1) quantify the level of tool wear in a machining process, (2) detect different faults from a power transmission mini-motor testbed (CNN), (3) detect a fault in a motor operated under various rotation speeds, and (4) predict the time to failure of rotating machinery. Also, the effectiveness of maintenance in the use stage is examined from an environmental and economic perspective using power efficiency loss as a metric for deciding between repair and replacement.


3

Güera, David. "Media Forensics Using Machine Learning Approaches." Thesis, 2019.


Abstract:

Consumer-grade imaging sensors have become ubiquitous in the past decade. Images and videos collected from such sensors are used by many entities for public and private communications, including publicity, advocacy, disinformation, and deception.

In this thesis, we present tools to extract knowledge from and understand this imagery and its provenance. Many images and videos are modified and/or manipulated prior to their public release. We also propose a set of forensic and counter-forensic techniques to determine the integrity of this multimedia content and modify it in specific ways to deceive adversaries. The presented tools are evaluated using publicly available datasets and independently organized challenges.


4

Chakraborty, Indranil. "Toward Energy-Efficient Machine Learning: Algorithms and Analog Compute-In-Memory Hardware." Thesis, 2021.


Abstract:

The ‘Internet of Things’ has increased the demand for artificial intelligence (AI)-based edge computing in applications ranging from healthcare monitoring systems to autonomous vehicles. However, the growing complexity of machine learning workloads requires rethinking to make AI amenable to resource-constrained environments such as edge devices. To that effect, the entire stack of machine learning, from algorithms to hardware primitives, has been explored to enable energy-efficient intelligence at the edge.

From the algorithmic aspect, model compression techniques such as quantization are powerful tools to address the growing computational cost of ML workloads. However, quantization in particular can result in a substantial loss of performance for complex image classification tasks. To address this, a principal component analysis (PCA)-driven methodology is proposed to identify the important layers of a binary network and to design mixed-precision networks. The proposed Hybrid-Net achieves a significant improvement in classification accuracy over binary networks such as XNOR-Net for ResNet and VGG architectures on the CIFAR-100 and ImageNet datasets, while still achieving remarkable energy efficiency.

Having explored compressed neural networks, there is a need to investigate suitable computing systems to further improve energy efficiency. Memristive crossbars have been extensively explored as an alternative to traditional CMOS-based systems for deep learning accelerators due to their high on-chip storage density and efficient Matrix Vector Multiplication (MVM) compared to digital CMOS. However, the analog nature of computing poses significant issues due to various non-idealities such as parasitic resistances and the non-linear I-V characteristics of the memristor device. To address this, a simplified equation-based model of the non-ideal behavior of crossbars is developed and, correspondingly, a modified technology-aware training algorithm is proposed. To overcome the drawbacks of equation-based modeling, a Generalized Approach to Emulating Non-Ideality in Memristive Crossbars using Neural Networks (GENIEx) is proposed, where a neural network is trained on HSPICE simulation data to learn the transfer characteristics of the non-ideal crossbar. Next, a functional simulator is developed which includes key architectural facets such as tiling and bit-slicing to analyze the impact of non-idealities on the classification accuracy of large-scale neural networks.

To truly realize the benefits of hardware primitives and the algorithms on top of the stack, it is necessary to build efficient devices that mimic the behavior of the fundamental units of a neural network, namely, neurons and synapses. However, efforts have largely been invested in implementations in the electrical domain, with potential limitations in switching speed, functional errors due to analog computing, etc. As an alternative, a purely photonic operation of an Integrate-and-Fire Spiking neuron is proposed, based on the phase change dynamics of Ge2Sb2Te5 (GST) embedded on top of a microring resonator, which alleviates the energy constraints of PCMs in the electrical domain. Further, the inherent parallelism of wavelength-division multiplexing (WDM) is leveraged to propose a photonic dot-product engine. The proposed computing platform is used to emulate an SNN inferencing engine for image-classification tasks. These explorations at different levels of the stack can enable energy-efficient machine learning for edge intelligence.

Having explored various domains to design efficient DNN models and studied various hardware primitives based on emerging technologies, we focus on silicon implementation of compute-in-memory (CIM) primitives for machine learning acceleration based on the more available CMOS technology. CIM primitives enable efficient matrix-vector multiplications (MVM) through parallelized multiply-and-accumulate operations inside the memory array itself. As CIM primitives deploy bit-serial computing, the computations are exposed to the bit-level sparsity of inputs and weights in an ML model. To that effect, we present an energy-efficient sparsity-aware reconfigurable-precision compute-in-memory (CIM) 8T-SRAM macro for machine learning (ML) applications. Standard 8T-SRAM arrays are re-purposed to enable MAC operations using selective current flow through the read-port transistors. The proposed macro dynamically leverages workload sparsity by reconfiguring the output precision in the peripheral circuitry without degrading application accuracy. Specifically, we propose a new energy-efficient reconfigurable-precision SAR ADC design with the ability to form (n+m)-bit precision using n-bit and m-bit ADCs. Additionally, the transimpedance amplifier (TIA), required to convert the summed current into voltage before conversion, is reconfigured based on sparsity to improve sense margin at lower output precision. The proposed macro, fabricated in 65 nm technology, provides 35.5-127.2 TOPS/W as the ADC precision varies from 6-bit to 2-bit, respectively. Building on top of the fabricated macro, we next design a hierarchical CIM core micro-architecture that addresses the existing CIM scaling challenges. The proposed CIM core micro-architecture consists of 32 proposed sparsity-aware CIM macros. The 32 macros are divided into 4 matrix-vector multiplication units (MVMUs) consisting of 8 macros each. The core has three unique features: i) it can adaptively reconfigure ADC precision to achieve energy efficiency and lower latency based on input and weight sparsity, determined by a sparsity controller, ii) it deploys a row-gating feature to maintain SNR requirements for accurate DNN computations, and iii) it provides hardware support for load balancing to balance latency mismatches occurring due to different ADC precisions in different compute units. Besides the CIM macros, the core micro-architecture consists of input, weight, and output memories, along with instruction memory and control circuits. The instruction set architecture allows for flexible dataflows and mapping in the proposed core micro-architecture. The sparsity-aware processing core is scheduled to be taped out next month. The proposed CIM demonstrations, complemented by our previous analysis of analog CIM systems, advance our understanding of this emerging paradigm as it pertains to ML acceleration.
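The bit-serial computing mentioned in this abstract can be illustrated with a small software sketch. It is not the thesis's circuit: it only shows, under simplifying assumptions, how a dot product can be accumulated one input bit-plane at a time and how an all-zero bit-plane (bit-level sparsity) contributes nothing and can be skipped. The function name `bit_serial_dot`, the unsigned input format, and the skip counter are hypothetical choices made for illustration.

```python
def bit_serial_dot(inputs, weights, n_bits=4):
    """Dot product computed one input bit-plane per step (illustrative only).

    Assumes unsigned inputs that fit in n_bits and integer weights.
    Returns (result, skipped), where skipped counts all-zero bit-planes.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    if any(x < 0 or x >= (1 << n_bits) for x in inputs):
        raise ValueError("inputs must be unsigned and fit in n_bits")

    total = 0
    skipped = 0
    for bit in range(n_bits):
        # Extract bit number `bit` of every input (one bit-plane).
        plane = [(x >> bit) & 1 for x in inputs]
        if not any(plane):
            # Bit-level sparsity: nothing to accumulate for this plane.
            skipped += 1
            continue
        # Sum the weights whose input bit is 1, then weight by 2**bit.
        partial = sum(w for b, w in zip(plane, weights) if b)
        total += partial << bit
    return total, skipped
```

For example, `bit_serial_dot([1, 2, 3, 0], [2, -1, 4, 7])` accumulates two non-empty bit-planes and skips the two upper ones, since no input uses bits 2 or 3.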


5

Dale, Ashley S. "3D OBJECT DETECTION USING VIRTUAL ENVIRONMENT ASSISTED DEEP NETWORK TRAINING." Thesis, 2021.


Abstract:

An RGBZ synthetic dataset consisting of five object classes in a variety of virtual environments and orientations was combined with a small sample of real-world image data and used to train the Mask R-CNN (MR-CNN) architecture in a variety of configurations. When the MR-CNN architecture was initialized with MS COCO weights and the heads were trained with a mix of synthetic data and real-world data, F1 scores improved in four of the five classes: the average maximum F1-score of all classes and all epochs for the networks trained with synthetic data is F1* = 0.91, compared to F1 = 0.89 for the networks trained exclusively with real data, and the standard deviation of the maximum mean F1-score for synthetically trained networks is σ*_F1 = 0.015, compared to σ_F1 = 0.020 for the networks trained exclusively with real data. Various backgrounds in synthetic data were shown to have negligible impact on F1 scores, opening the door to abstract backgrounds and minimizing the need for intensive synthetic data fabrication. When the MR-CNN architecture was initialized with MS COCO weights and depth data was included in the training data, the network was shown to rely heavily on the initial convolutional input to feed features into the network, the image depth channel was shown to influence mask generation, and the image color channels were shown to influence object classification. A set of latent variables for a subset of the synthetic dataset was generated with a Variational Autoencoder, then analyzed using Principal Component Analysis and Uniform Manifold Approximation and Projection (UMAP). The UMAP analysis showed no meaningful distinction between real-world and synthetic data, and a small bias towards clustering based on image background.


6

Yellamraju, Tarun. "n-TARP: A Random Projection based Method for Supervised and Unsupervised Machine Learning in High-dimensions with Application to Educational Data Analysis." Thesis, 2019.


Abstract:

Analyzing the structure of a dataset is a challenging problem in high dimensions, as the volume of the space increases at an exponential rate and data typically becomes sparse in this high-dimensional space. This poses a significant challenge to machine learning methods, which rely on exploiting structures underlying data to make meaningful inferences. This dissertation proposes the n-TARP method as a building block for high-dimensional data analysis, in both supervised and unsupervised scenarios.

The basic element, n-TARP, consists of a random projection framework to transform high-dimensional data to one-dimensional data in a manner that yields point separations in the projected space. The point separation can be tuned to reflect classes in supervised scenarios and clusters in unsupervised scenarios. The n-TARP method finds linear separations in high-dimensional data. This basic unit can be used repeatedly to find a variety of structures. It can be arranged in a hierarchical structure like a tree, which increases the model complexity, flexibility and discriminating power. Feature space extensions combined with n-TARP can also be used to investigate non-linear separations in high-dimensional data.
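The idea in this paragraph, projecting high-dimensional points onto a random direction and looking for a point separation in the resulting one-dimensional data, can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: the abstract does not state the separation criterion, so the sketch assumes a two-group split scored by the ratio of within-group to total sum of squares, and all function names (`project`, `best_split`, `random_projection_split`) and the parameters `n_trials` and `seed` are hypothetical.

```python
import random


def project(points, direction):
    """Project each point onto a direction vector (plain dot product)."""
    return [sum(p * d for p, d in zip(point, direction)) for point in points]


def best_split(values):
    """Best two-group split of 1D values as (score, threshold).

    Assumed criterion: within-group sum of squares divided by total sum
    of squares. Lower scores mean better separated groups.
    """
    xs = sorted(values)
    n = len(xs)
    mean = sum(xs) / n
    total = sum((x - mean) ** 2 for x in xs)
    if total == 0.0:
        return 1.0, xs[0]  # all values identical: no separation
    best = (1.0, xs[0])
    for k in range(1, n):
        left, right = xs[:k], xs[k:]
        mean_left = sum(left) / k
        mean_right = sum(right) / (n - k)
        within = sum((x - mean_left) ** 2 for x in left)
        within += sum((x - mean_right) ** 2 for x in right)
        score = within / total
        if score < best[0]:
            best = (score, (xs[k - 1] + xs[k]) / 2)
    return best


def random_projection_split(points, n_trials=50, seed=0):
    """Try n_trials random directions; keep the best-separating one.

    Returns (score, threshold, direction).
    """
    rng = random.Random(seed)
    dim = len(points[0])
    best = None
    for _ in range(n_trials):
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        score, threshold = best_split(project(points, direction))
        if best is None or score < best[0]:
            best = (score, threshold, direction)
    return best
```

A point is then assigned to one side or the other by comparing its projection onto the returned direction with the returned threshold, which corresponds to a linear separation in the original space.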

The application of n-TARP to both supervised and unsupervised problems is investigated in this dissertation. In the supervised scenario, a sequence of n-TARP based classifiers with increasing complexity is considered. The point separations are measured by classification metrics like accuracy, Gini impurity or entropy. The performance of these classifiers on image classification tasks is studied. This study provides an interesting insight into the working of classification methods. The sequence of n-TARP classifiers yields benchmark curves that put in context the accuracy and complexity of other classification methods for a given dataset. The benchmark curves are parameterized by classification error and computational cost to define a benchmarking plane. This framework splits this plane into regions of "positive-gain" and "negative-gain" which provide context for the performance and effectiveness of other classification methods. The asymptotes of benchmark curves are shown to be optimal (i.e. at Bayes Error) in some cases (Theorem 2.5.2).

In the unsupervised scenario, the n-TARP method highlights the existence of many different clustering structures in a dataset. However, not all structures present are statistically meaningful. This issue is amplified when the dataset is small, as random events may yield sample sets that exhibit separations that are not present in the distribution of the data. Thus, statistical validation is an important step in data analysis, especially in high dimensions. However, in order to statistically validate results, an exponentially increasing number of data samples is often required as the dimensions increase. The proposed n-TARP method circumvents this challenge by evaluating statistical significance in the one-dimensional space of data projections. The n-TARP framework also results in several different statistically valid instances of point separation into clusters, as opposed to a unique "best" separation, which leads to a distribution of clusters induced by the random projection process.

The distributions of clusters resulting from n-TARP are studied. This dissertation focuses on small sample high-dimensional problems. A large number of distinct clusters are found, which are statistically validated. The distribution of clusters is studied as the dimensionality of the problem evolves through the extension of the feature space using monomial terms of increasing degree in the original features, which corresponds to investigating non-linear point separations in the projection space.

A statistical framework is introduced to detect patterns of dependence between the clusters formed with the features (predictors) and a chosen outcome (response) in the data that is not used by the clustering method. This framework is designed to detect the existence of a relationship between the predictors and response. This framework can also serve as an alternative cluster validation tool.

The concepts and methods developed in this dissertation are applied to a real-world data analysis problem in Engineering Education. Specifically, engineering students' Habits of Mind are analyzed. The data at hand is qualitative, in the form of text, equations and figures. To use the n-TARP based analysis method, the source data must be transformed into quantitative data (vectors). This is done by modeling it as a random process based on the theoretical framework defined by a rubric. Since the number of students is small, this problem falls into the small sample high-dimensions scenario. The n-TARP clustering method is used to find groups within this data in a statistically valid manner. The resulting clusters are analyzed in the context of education to determine what is represented by the identified clusters. The dependence of student performance indicators like the course grade on the clusters formed with n-TARP is studied in the pattern dependence framework, and the observed effect is statistically validated. The data obtained suggests the presence of a large variety of different patterns of Habits of Mind among students, many of which are associated with significant grade differences. In particular, the course grade is found to be dependent on at least two Habits of Mind: "computation and estimation" and "values and attitudes."




Dissertations / Theses on the topic "080199 Artificial Intelligence and Image Processing not elsewhere classified"