Dissertations / Theses on the topic 'Multiple Sparse Bayesian Learning'
Consult the top 37 dissertations / theses for research on the topic 'Multiple Sparse Bayesian Learning.' Abstracts and full texts are included where available in the source metadata.
Higson, Edward John. "Bayesian methods and machine learning in astrophysics." Thesis, University of Cambridge, 2019. https://www.repository.cam.ac.uk/handle/1810/289728.
Parisi, Simone [Author], Jan [Academic supervisor] Peters, and Joschka [Academic supervisor] Boedeker. "Reinforcement Learning with Sparse and Multiple Rewards / Simone Parisi ; Jan Peters, Joschka Boedeker." Darmstadt: Universitäts- und Landesbibliothek Darmstadt, 2020. http://d-nb.info/1203301545/34.
Tandon, Prateek. "Bayesian Aggregation of Evidence for Detection and Characterization of Patterns in Multiple Noisy Observations." Research Showcase @ CMU, 2015. http://repository.cmu.edu/dissertations/658.
Ticlavilca, Andres M. "Multivariate Bayesian Machine Learning Regression for Operation and Management of Multiple Reservoir, Irrigation Canal, and River Systems." DigitalCommons@USU, 2010. https://digitalcommons.usu.edu/etd/600.
Jin, Junyang. "Novel methods for biological network inference : an application to circadian Ca2+ signaling network." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/285323.
Yazdani, Akram. "Statistical Approaches in Genome-Wide Association Studies." Doctoral thesis, Università degli studi di Padova, 2014. http://hdl.handle.net/11577/3423743.
Full textLo Studio di Associazione Genome-Wide, GWAS, tipicamente comprende centinaia di migliaia di polimorfismi a singolo nucleotide, SNPs, genotipizzati per pochi campioni. L'obiettivo di tale studio consiste nell'individuare le regioni cruciali SNPs e prevedere gli esiti di una variabile risposta. Dal momento che il numero di predittori è di gran lunga superiore al numero di campioni, non è possibile condurre l'analisi dei dati con metodi statistici classici. GWAS attuali, i metodi negli maggiormente utilizzati si basano sull'analisi a marcatore unico, che valuta indipendentemente l'associazione di ogni SNP con i tratti complessi. A causa della bassa potenza dell'analisi a marcatore unico nel rilevamento delle associazioni reali, l'analisi simultanea ha recentemente ottenuto più attenzione. I recenti metodi per l'analisi simultanea nel multidimensionale hanno una limitazione sulla disparità tra il numero di predittori e il numero di campioni. Pertanto, è necessario ridurre la dimensionalità dell'insieme di SNPs. Questa tesi fornisce una panoramica dell'analisi a marcatore singolo e dell'analisi simultanea, focalizzandosi su metodi Bayesiani. Vengono discussi i limiti di tali approcci in relazione ai GWAS, con riferimento alla letteratura recente e utilizzando studi di simulazione. Per superare tali problemi, si è cercato di ridurre la dimensione dell'insieme di SNPs con una tecnica a proiezione casuale. Poiché questo approccio non comporta miglioramenti nella accuratezza predittiva del modello, viene quindi proposto un approccio in due fasi, che risulta essere un metodo ibrido di analisi singola e simultanea. Tale approccio, completamente Bayesiano, seleziona gli SNPs più promettenti nella prima fase valutando l'impatto di ogni marcatore indipendentemente. Nella seconda fase, viene sviluppato un modello gerarchico Bayesiano per analizzare contemporaneamente l'impatto degli indicatori selezionati. Il modello che considera i campioni correlati pone una priori locale-globale ristretta sugli effetti dei marcatori. Tale prior riduce a zero gli effetti piccoli, mentre mantiene gli effetti più grandi relativamente grandi. Le priori specificate sugli effetti dei marcatori sono rappresentazioni gerarchiche della distribuzione Pareto doppia; queste a priori migliorano le prestazioni predittive del modello. Infine, nella tesi vengono riportati i risultati dell'analisi su dati reali di SNP basate sullo studio a marcatore singolo e sul nuovo approccio a due stadi.
Deshpande, Hrishikesh. "Dictionary learning for pattern classification in medical imaging." Thesis, Rennes 1, 2016. http://www.theses.fr/2016REN1S032/document.
Most natural signals can be approximated by a linear combination of a few atoms in a dictionary. Such sparse representations of signals and dictionary learning (DL) methods have received special attention over the past few years. While standard DL approaches are effective in applications such as image denoising or compression, several discriminative DL methods have been proposed to achieve better image classification. In this thesis, we show that the dictionary size for each class is an important factor in pattern recognition applications where the variability differs between classes, for both standard and discriminative DL methods. We validate the proposition of using a dictionary size adapted to the complexity of the class data, first in a computer vision application, lips detection in face images, and then in a more complex medical imaging application, the classification of multiple sclerosis (MS) lesions in MR images. Class-specific dictionaries are learned for the lesions and the individual healthy brain tissues, and the size of each class's dictionary is adapted to the complexity of the underlying data. The algorithm is validated on 52 multi-sequence MR images acquired from 13 MS patients.
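A minimal sketch of the class-specific-dictionary-size idea, using scikit-learn's dictionary learning and OMP coding; the per-class sizes and the reconstruction-error classification rule are illustrative assumptions, not the thesis's exact algorithm.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning, sparse_encode

def learn_class_dictionaries(data_by_class, sizes):
    """Learn one dictionary per class, with a class-specific number of atoms."""
    dicts = {}
    for label, X in data_by_class.items():
        dl = MiniBatchDictionaryLearning(n_components=sizes[label],
                                         transform_algorithm="omp",
                                         transform_n_nonzero_coefs=5,
                                         random_state=0)
        dl.fit(X)
        dicts[label] = dl.components_
    return dicts

def classify(patch, dicts, n_nonzero=5):
    """Assign the class whose dictionary reconstructs the patch best."""
    errs = {}
    for label, D in dicts.items():
        code = sparse_encode(patch[None, :], D, algorithm="omp",
                             n_nonzero_coefs=n_nonzero)
        errs[label] = np.linalg.norm(patch - code @ D)
    return min(errs, key=errs.get)
```

Under this rule, a class with more intrinsic variability (e.g., lesions) would be assigned more atoms in `sizes` than a simpler tissue class.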
Chen, Cong. "High-Dimensional Generative Models for 3D Perception." Diss., Virginia Tech, 2021. http://hdl.handle.net/10919/103948.
Doctor of Philosophy
The development of automation systems and robotics has brought the modern world unrivaled affluence and convenience. However, current automated tasks are mainly simple repetitive motions; tasks that require higher-level capabilities, such as advanced visual cognition, remain an unsolved problem for automation. Many high-level, cognition-based tasks require accurate visual perception of the environment and of dynamic objects from optical-sensor data. The capability to represent, identify, and interpret complex visual data in order to understand the geometric structure of the world is 3D perception. To better tackle existing 3D perception challenges, this dissertation proposes a set of generative learning-based frameworks on sparse tensor data for various high-dimensional robotics perception applications: underwater point cloud filtering, image restoration, deformation detection, and localization. Underwater point cloud data is relevant for many applications such as environmental monitoring or geological exploration. The data collected with sonar sensors are, however, subject to different types of defects, including holes, noisy measurements, and outliers. In the first chapter, we propose a generative model for point cloud data recovery using Variational Bayesian (VB) sparse tensor factorization methods to tackle these three defects simultaneously. In the second part of the dissertation, we propose an image restoration technique to tackle missing data, which is essential for many perception applications. An efficient generative chaotic RNN framework is introduced for recovering the sparse tensor from a single corrupted image for various types of missing data. In the last chapter, a multi-level CNN for high-dimensional tensor feature extraction for underwater vehicle localization is proposed.
Subramanian, Harshavardhan. "Combining scientific computing and machine learning techniques to model longitudinal outcomes in clinical trials." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-176427.
Francisco, André Biasin Segalla. "Esparsidade estruturada em reconstrução de fontes de EEG." Universidade de São Paulo, 2018. http://www.teses.usp.br/teses/disponiveis/43/43134/tde-13052018-112615/.
Functional Neuroimaging is an area of neuroscience which aims at developing several techniques to map the activity of the nervous system and has been under constant development in the last decades due to its high importance in clinical applications and research. Commonly applied techniques such as functional magnetic resonance imaging (fMRI) and positron emission tomography (PET) have great spatial resolution (~ mm), but a limited temporal resolution (~ s), which poses a great challenge to our understanding of the dynamics of higher cognitive functions, whose oscillations can occur on much finer temporal scales (~ ms). Such limitation occurs because these techniques rely on measurements of slow biological responses which are correlated in a complicated manner to the actual electric activity. The two major candidates that overcome this shortcoming are Electro- and Magnetoencephalography (EEG/MEG), which are non-invasive techniques that measure the electric and magnetic fields on the scalp, respectively, generated by the electrical brain sources. Both have millisecond temporal resolution, but typically low spatial resolution (~ cm) due to the highly ill-posed nature of the electromagnetic inverse problem. There has been a huge effort in the last decades to improve their spatial resolution by incorporating relevant information into the problem from other imaging modalities and/or biologically inspired constraints, allied with the development of sophisticated mathematical methods and algorithms. In this work we focus on EEG, although all techniques presented here can be equally applied to MEG because of their identical mathematical form. In particular, we explore sparsity as a useful mathematical constraint in a Bayesian framework called Sparse Bayesian Learning (SBL), which enables the achievement of meaningful unique solutions in the source reconstruction problem. Moreover, we investigate how to incorporate different structures as degrees of freedom into this framework, which is an application of structured sparsity, and show that it is a promising way to improve the source reconstruction accuracy of electromagnetic imaging methods.
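Since several entries in this list build on Sparse Bayesian Learning, a minimal sketch of its ARD-style evidence-maximization updates may help. It assumes a generic linear model y = Ax + noise with a known forward (lead-field) matrix A; the pruning threshold and all names are illustrative, not this thesis's exact algorithm.

```python
import numpy as np

def sbl(A, y, n_iter=200, prune=1e6):
    """Minimal Sparse Bayesian Learning (ARD) for y = A @ x + noise."""
    n, m = A.shape
    alpha = np.ones(m)          # per-coefficient precisions (ARD prior)
    beta = 1.0 / np.var(y)      # noise precision
    keep = np.arange(m)         # indices of active basis columns
    for _ in range(n_iter):
        Ak = A[:, keep]
        Sigma = np.linalg.inv(beta * Ak.T @ Ak + np.diag(alpha[keep]))
        mu = beta * Sigma @ Ak.T @ y
        gamma = 1.0 - alpha[keep] * np.diag(Sigma)     # well-determinedness
        alpha[keep] = gamma / (mu ** 2 + 1e-12)        # ARD hyperparameter update
        resid = y - Ak @ mu
        beta = max(n - gamma.sum(), 1e-12) / (resid @ resid + 1e-12)
        keep = keep[alpha[keep] < prune]               # prune irrelevant atoms
    x = np.zeros(m)
    Ak = A[:, keep]
    Sigma = np.linalg.inv(beta * Ak.T @ Ak + np.diag(alpha[keep]))
    x[keep] = beta * Sigma @ Ak.T @ y
    return x

# Toy usage: 3 active sources out of 256 candidates, 64 sensors.
rng = np.random.default_rng(0)
A = rng.standard_normal((64, 256))
x_true = np.zeros(256); x_true[[10, 50, 200]] = [1.0, -2.0, 1.5]
y = A @ x_true + 0.01 * rng.standard_normal(64)
x_hat = sbl(A, y)
```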
Zambonin, Giuliano. "Development of Machine Learning-based technologies for major appliances: soft sensing for drying technology applications." Doctoral thesis, Università degli studi di Padova, 2019. http://hdl.handle.net/11577/3425771.
Umakanthan, Sabanadesan. "Human action recognition from video sequences." Thesis, Queensland University of Technology, 2016. https://eprints.qut.edu.au/93749/1/Sabanadesan_Umakanthan_Thesis.pdf.
Azevedo, Carlos Renato Belo, 1984. "Anticipation in multiple criteria decision-making under uncertainty = Antecipação na tomada de decisão com múltiplos critérios sob incerteza." [s.n.], 2012. http://repositorio.unicamp.br/jspui/handle/REPOSIP/260775.
Doctoral thesis - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação
Abstract: The presence of uncertainty in future outcomes can lead to indecision in choice processes, especially when eliciting the relative importances of multiple decision criteria and of long-term vs. near-term performance. Some decisions, however, must be taken under incomplete information, which may result in precipitate actions with unforeseen consequences. When a solution must be selected under multiple conflicting views for operating in time-varying and noisy environments, implementing flexible provisional alternatives can be critical to circumvent the lack of complete information by keeping future options open. Anticipatory engineering can then be regarded as the strategy of designing flexible solutions that enable decision makers to respond robustly to unpredictable scenarios. This strategy can thus mitigate the risks of strong unintended commitments to uncertain alternatives, while increasing adaptability to future changes. In this thesis, the roles of anticipation and of flexibility in automating sequential multiple criteria decision-making processes under uncertainty are investigated. The dilemma of assigning relative importances to decision criteria and to immediate rewards under incomplete information is then handled by autonomously anticipating flexible decisions predicted to maximally preserve the diversity of future choices. An online anticipatory learning methodology is then proposed for improving the range and quality of future trade-off solution sets. This goal is achieved by predicting maximal expected hypervolume sets, for which the anticipation capabilities of multi-objective metaheuristics are augmented with Bayesian tracking in both the objective and search spaces. The methodology has been applied for obtaining investment decisions that are shown to significantly improve the future hypervolume of trade-off financial portfolios for out-of-sample stock data, when compared to a myopic strategy. Moreover, implementing flexible portfolio rebalancing decisions was confirmed as a significantly better strategy than randomly choosing an investment decision from the evolved stochastic efficient frontier in all tested artificial and real-world markets. Finally, the results suggest that anticipating flexible choices has led to portfolio compositions that are significantly correlated with the observed improvements in out-of-sample future expected hypervolume.
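The hypervolume indicator that drives this anticipatory strategy is easy to state in two dimensions; the following is a minimal sketch for a minimization problem with an illustrative reference point, not the thesis's expected-hypervolume machinery.

```python
import numpy as np

def hypervolume_2d(front, ref):
    """Area dominated by a 2D Pareto front (minimization) up to a reference point."""
    pts = sorted((tuple(p) for p in front if np.all(np.asarray(p) < ref)),
                 key=lambda p: p[0])
    hv, prev_y = 0.0, ref[1]
    for x, y in pts:
        if y < prev_y:                         # dominated points contribute nothing
            hv += (ref[0] - x) * (prev_y - y)  # rectangle added by this point
            prev_y = y
    return hv

# Example: front {(0, 0.5), (0.5, 0)} w.r.t. reference (1, 1) -> 0.75
print(hypervolume_2d([(0.0, 0.5), (0.5, 0.0)], (1.0, 1.0)))
```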
Doctorate
Computer Engineering
Doctor of Electrical Engineering
Cherief-Abdellatif, Badr-Eddine. "Contributions to the theoretical study of variational inference and robustness." Electronic Thesis or Diss., Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAG001.
This PhD thesis deals with variational inference and robustness. More precisely, it focuses on the statistical properties of variational approximations and the design of efficient algorithms for computing them in an online fashion, and investigates Maximum Mean Discrepancy based estimators as learning rules that are robust to model misspecification. In recent years, variational inference has been extensively studied from the computational viewpoint, but until very recently little attention was paid in the literature to the theoretical properties of variational approximations. In this thesis, we investigate the consistency of variational approximations in various statistical models and the conditions that ensure it. In particular, we tackle the special cases of mixture models and deep neural networks. We also justify in theory the use of the ELBO maximization strategy, a model selection criterion that is widely used in the Variational Bayes community and is known to work well in practice. Moreover, Bayesian inference provides an attractive online-learning framework for analyzing sequential data, and offers generalization guarantees which hold even under model mismatch and with adversaries. Unfortunately, exact Bayesian inference is rarely feasible in practice and approximation methods are usually employed, but do such methods preserve the generalization properties of Bayesian inference? In this thesis, we show that this is indeed the case for some variational inference algorithms. We propose new online, tempered variational algorithms and derive their generalization bounds. Our theoretical result relies on the convexity of the variational objective, but we argue that it should hold more generally and present empirical evidence in support of this. Our work presents theoretical justifications in favor of online algorithms that rely on approximate Bayesian methods. Another point addressed in this thesis is the design of a universal estimation procedure. This question is of major interest, in particular because it leads to robust estimators, a very hot topic in statistics and machine learning. We tackle the problem of universal estimation using a minimum distance estimator based on the Maximum Mean Discrepancy. We show that the estimator is robust to both dependence and the presence of outliers in the dataset. We also highlight the connections that may exist with minimum distance estimators using the L2-distance. Finally, we provide a theoretical study of the stochastic gradient descent algorithm used to compute the estimator, and we support our findings with numerical simulations. We also propose a Bayesian version of our estimator, which we study from both theoretical and computational points of view.
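As a concrete anchor for the robustness part, here is a minimal sketch of the unbiased squared-MMD statistic with a Gaussian kernel, plus a toy minimum-distance estimate of a location parameter by grid search; the bandwidth, grid, and sample sizes are illustrative assumptions.

```python
import numpy as np

def mmd2(X, Y, bandwidth=1.0):
    """Unbiased estimator of squared MMD between samples X and Y (Gaussian kernel)."""
    def gram(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * bandwidth ** 2))
    m, n = len(X), len(Y)
    Kxx, Kyy, Kxy = gram(X, X), gram(Y, Y), gram(X, Y)
    return ((Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))
            + (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
            - 2.0 * Kxy.mean())

# Minimum-MMD estimation of a location parameter, robust to 5% gross outliers.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0.0, 1.0, 95), np.full(5, 10.0)])[:, None]
grid = np.linspace(-1.0, 1.0, 41)
theta = min(grid, key=lambda t: mmd2(rng.normal(t, 1.0, 200)[:, None], data))
```

The estimated location stays near 0 despite the outliers at 10, which is the robustness property the thesis establishes in far greater generality.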
Wolley, Chirine. "Apprentissage supervisé à partir des multiples annotateurs incertains." Thesis, Aix-Marseille, 2014. http://www.theses.fr/2014AIXM4070/document.
In supervised learning tasks, obtaining the ground truth label for each instance of the training dataset can be difficult, time-consuming and/or expensive. With the advent of infrastructures such as the Internet, an increasing number of web services propose crowdsourcing as a way to collect a large enough set of labels from internet users. The use of these services provides an exceptional facility to collect labels from anonymous annotators, and thus considerably simplifies the process of building labeled datasets. Nonetheless, the main drawback of crowdsourcing services is their lack of control over the annotators and their inability to verify and control the accuracy of the labels and the level of expertise of each labeler. Hence, managing the annotators' uncertainty is key to learning from imperfect annotations. This thesis provides three algorithms for learning from multiple uncertain annotators. IGNORE generates a classifier that predicts the label of a new instance and evaluates the performance of each annotator according to their level of uncertainty. X-Ignore considers that the performance of the annotators depends both on their uncertainty and on the quality of the initial dataset to be annotated. Finally, ExpertS deals with the problem of annotator selection when generating the classifier: it identifies expert annotators and learns the classifier based only on their labels. We conducted a large set of experiments in order to evaluate our models, using both synthetic and real-world medical data. The results demonstrate the performance and accuracy of our models compared to previous state-of-the-art solutions in this context.
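The thesis's IGNORE, X-Ignore, and ExpertS algorithms are not reproduced here, but the classic Dawid-Skene EM scheme, sketched below, illustrates the shared core idea: estimating annotator reliability without ground-truth labels. Variable names and the smoothing constant are illustrative.

```python
import numpy as np

def dawid_skene(L, n_classes, n_iter=50):
    """EM for annotator reliability without ground truth (classic Dawid-Skene).
    L[i, r]: label annotator r gave item i, or -1 if missing.
    Returns posteriors over true labels and per-annotator confusion matrices."""
    n_items, n_ann = L.shape
    q = np.zeros((n_items, n_classes))              # init with per-item vote shares
    for c in range(n_classes):
        q[:, c] = (L == c).sum(axis=1)
    q /= q.sum(axis=1, keepdims=True)               # assumes every item has a label
    for _ in range(n_iter):
        pi = q.mean(axis=0)                         # M-step: class priors
        conf = np.ones((n_ann, n_classes, n_classes))   # Laplace smoothing
        for r in range(n_ann):
            for c in range(n_classes):
                conf[r, :, c] += q[L[:, r] == c].sum(axis=0)
        conf /= conf.sum(axis=2, keepdims=True)
        logq = np.tile(np.log(pi), (n_items, 1))    # E-step: posterior of true labels
        for r in range(n_ann):
            seen = L[:, r] >= 0
            logq[seen] += np.log(conf[r][:, L[seen, r]]).T
        q = np.exp(logq - logq.max(axis=1, keepdims=True))
        q /= q.sum(axis=1, keepdims=True)
    return q, conf
```

A near-random annotator ends up with a flat confusion matrix and therefore contributes little to the label posteriors, which is the same effect the three algorithms above achieve in their respective settings.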
Behúň, Kamil. "Příznaky z videa pro klasifikaci." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2013. http://www.nusl.cz/ntk/nusl-236367.
Le Folgoc, Loïc. "Apprentissage statistique pour la personnalisation de modèles cardiaques à partir de données d'imagerie." Thesis, Nice, 2015. http://www.theses.fr/2015NICE4098/document.
This thesis focuses on the calibration of an electromechanical model of the heart from patient-specific, image-based data, and on the related task of extracting the cardiac motion from 4D images. Long-term perspectives for personalized computer simulation of the cardiac function include aid to diagnosis, aid to therapy planning, and risk prevention. To this end, we explore tools and possibilities offered by statistical learning. To personalize cardiac mechanics, we introduce an efficient framework coupling machine learning and an original statistical representation of shape and motion based on 3D+t currents. The method relies on a reduced mapping between the space of mechanical parameters and the space of cardiac motion. The second focus of the thesis is on cardiac motion tracking, a key processing step in the calibration pipeline, with an emphasis on quantification of uncertainty. We develop a generic sparse Bayesian model of image registration with three main contributions: an extended image similarity term, the automated tuning of registration parameters, and uncertainty quantification. We propose an approximate inference scheme that is tractable on 4D clinical data. Finally, we wish to evaluate the quality of the uncertainty estimates returned by the approximate inference scheme. We compare the predictions of the approximate scheme with those of an inference scheme developed on the grounds of reversible jump MCMC. We provide more insight into the theoretical properties of the sparse structured Bayesian model and into the empirical behaviour of both inference schemes.
Dang, Hong-Phuong. "Approches bayésiennes non paramétriques et apprentissage de dictionnaire pour les problèmes inverses en traitement d'image." Thesis, Ecole centrale de Lille, 2016. http://www.theses.fr/2016ECLI0019/document.
Dictionary learning for sparse representation has been widely advocated for solving inverse problems. Optimization methods and parametric approaches to dictionary learning have been particularly explored. These methods have some limitations, particularly related to the choice of parameters: in general, the dictionary size is fixed in advance, and the sparsity or noise level may also be needed. In this thesis, we show how to perform dictionary and parameter learning jointly, with an emphasis on image processing. We propose and study the Indian Buffet Process for Dictionary Learning (IBP-DL) method, using a Bayesian nonparametric approach. A primer on Bayesian nonparametrics is first presented: the Dirichlet and Beta processes and their respective derivatives, the Chinese restaurant and Indian buffet processes, are described. The proposed model for dictionary learning relies on an Indian buffet prior, which permits learning a dictionary of adaptive size. The Monte Carlo method for inference is detailed. Noise and sparsity levels are also inferred, so that in practice no parameter tuning is required. Numerical experiments illustrate the performance of the approach in different settings: image denoising, inpainting, and compressed sensing. Results are compared with state-of-the-art methods. Matlab and C sources are available for the sake of reproducibility.
Gerchinovitz, Sébastien. "Prédiction de suites individuelles et cadre statistique classique : étude de quelques liens autour de la régression parcimonieuse et des techniques d'agrégation." PhD thesis, Université Paris Sud - Paris XI, 2011. http://tel.archives-ouvertes.fr/tel-00653550.
Prasad, Ranjitha. "Sparse Bayesian Learning For Joint Channel Estimation Data Detection In OFDM Systems." Thesis, 2015. http://etd.iisc.ernet.in/2005/3997.
Full textShi, Minghui. "Bayesian Sparse Learning for High Dimensional Data." Diss., 2011. http://hdl.handle.net/10161/3869.
In this thesis, we develop some Bayesian sparse learning methods for high dimensional data analysis. There are two important topics that are related to the idea of sparse learning -- variable selection and factor analysis. We start with the Bayesian variable selection problem in regression models. One challenge in Bayesian variable selection is to search the huge model space adequately, while identifying high posterior probability regions. In the past decades, the main focus has been on the use of Markov chain Monte Carlo (MCMC) algorithms for these purposes. In the first part of this thesis, instead of using MCMC, we propose a new computational approach based on sequential Monte Carlo (SMC), which we refer to as particle stochastic search (PSS). We illustrate PSS through applications to linear regression and probit models.
Besides the Bayesian stochastic search algorithms, there is a rich literature on shrinkage and variable selection methods for high-dimensional regression and classification with vector-valued parameters, such as the lasso (Tibshirani, 1996) and the relevance vector machine (Tipping, 2001). Compared with the Bayesian stochastic search algorithms, these methods do not account for model uncertainty but are more computationally efficient. In the second part of this thesis, we generalize this type of idea to matrix-valued parameters and focus on developing an efficient variable selection method for multivariate regression. We propose a Bayesian shrinkage model (BSM) and an efficient algorithm for learning the associated parameters.
In the third part of this thesis, we focus on factor analysis, which has been widely used in unsupervised learning. One central problem in factor analysis is the determination of the number of latent factors. We propose Bayesian model selection criteria for selecting the number of latent factors based on a graphical factor model. As illustrated in Chapter 4, our proposed method achieves good performance in correctly selecting the number of factors in several different settings. As for applications, we implement the graphical factor model for several different purposes, such as covariance matrix estimation, latent factor regression, and classification.
Dissertation
Huang, Din-Hwa (黃汀華). "Basis Adaptive Sparse Bayesian Learning: Algorithms and Applications." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/6n47p5.
National Chiao Tung University
Institute of Communications Engineering
Academic year 103 (2014)
Sparse Bayesian learning (SBL) is a widely used compressive sensing (CS) method that finds the solution by Bayesian inference. In this approach, a basis function is specified to form the transform matrix. For a particular application, there may exist a proper basis, with a known model function but unknown parameters, that can convert the signal to a sparse domain. In conventional SBL, the parameters of the basis are assumed to be known a priori. This assumption may not be valid in real-world applications, and the efficacy of conventional SBL approaches can be greatly affected. In this dissertation, we propose a basis-adaptive-sparse-Bayesian-learning (BA-SBL) framework, which estimates the basis and system parameters alternately and iteratively, and we explore possible applications. We start with the cooperative spectrum sensing problem in cognitive radio (CR) systems. It is known that, in addition to spectrum sparsity, spatial sparsity can also be used to further enhance spectral utilization. To achieve that, secondary users (SUs) must know the locations and signal-strength distributions of primary users' base stations (PUBSs), which is referred to as radio source positioning and power-propagation-map (PPM) reconstruction. Conventional approaches approximate PUBS power decay with a path-loss model (PLM) and assume PUBS locations on some grid points. However, the parameters of the PLM have to be known in advance, and the estimation accuracy is bounded by the resolution of the grid points. We first employ a Laplacian function to model the PUBS power decay profile and propose a BA-SBL scheme to estimate the corresponding parameters. With the proposed method, little prior information is required. To further enhance the performance, we incorporate source number detection methods so that the number of PUBSs can be precisely detected. Simulations show that the proposed algorithm has satisfactory performance even when the spatial measurement rate is low. While the proposed BA-SBL scheme can effectively reconstruct the PPM in CR systems, it can only be applied to one frequency band at a time, and frequency-band dependence is not considered. To fill the gap, we extend the Laplacian function to the multiple-band scenario. For a multi-band Laplacian function, the correlation between different bands is taken into consideration by a block SBL (BSBL) method. BA-SBL is then modified and extended to a basis-adaptive BSBL (BA-BSBL) scheme, simultaneously reconstructing the PPMs of multiple frequency bands. Simulations show that BA-BSBL outperforms BA-SBL applied to each band independently. Finally, we apply the proposed BA-BSBL procedure to the positioning problem in 3rd Generation Partnership Project (3GPP) Long Term Evolution (LTE) systems. The observed-time-difference-of-arrival (OTDOA) method is used to estimate the location of the user equipment (UE). It uses the estimated time-of-arrivals (TOAs) from three different base stations (BSs) as the observations. The TOA corresponding to a BS can be obtained from the first-tap delay of the time-domain channel response. The main problem with conventional OTDOA methods is that the precision of TOA estimation, obtained by a channel estimation method, is limited by the quantization effect of the receiver's sampler. Since wireless channels are generally sparse, we can formulate the time-domain channel estimation as a CS problem.
Using the pulse-shaping-filter response as the basis, we apply the proposed BA-BSBL procedure to conduct the channel estimation, so the TOA can be estimated without quantization. Simulations show that the proposed BA-BSBL algorithm can significantly enhance the precision of TOA estimation and thus improve the positioning performance.
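A toy sketch of the TOA-off-the-sampling-grid idea: the received signal is decomposed on a dictionary of pulse-shape replicas placed on a sub-sample delay grid, and the earliest recovered tap gives the TOA. Orthogonal matching pursuit is plainly swapped in for the thesis's BA-BSBL solver, and all numbers are illustrative.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

def estimate_toa(rx, pulse, delay_grid, fs, n_taps=3):
    """TOA as the earliest active tap of a sparse channel estimate.
    Columns of D are the pulse shape delayed on a sub-sample grid."""
    t = np.arange(rx.size) / fs
    D = np.column_stack([pulse(t - tau) for tau in delay_grid])
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=n_taps,
                                    fit_intercept=False).fit(D, rx)
    return delay_grid[np.flatnonzero(omp.coef_)].min()

# Toy usage: ideal band-limited pulse, 2-tap channel off the sample grid.
fs = 1e6
pulse = lambda t: np.sinc(t * fs)
delay_grid = np.arange(0.0, 5e-6, 0.1 / fs)        # 10x finer than one sample
t = np.arange(64) / fs
rx = pulse(t - 1.23e-6) + 0.4 * pulse(t - 2.57e-6)
print(estimate_toa(rx, pulse, delay_grid, fs))     # ~1.2e-6 s, sub-sample precision
```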
Huang, Wen-Han (黃玟翰). "Three-dimensional probabilistic site characterization by sparse Bayesian learning." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/6u62y3.
National Taiwan University
Graduate Institute of Civil Engineering
Academic year 107 (2018)
This study investigated modified cone tip resistance (qt) data from cone penetration tests (CPT), and discussed the feasibility of, and a method for, identifying the trend function. The vertical spatial distribution is expressed as a depth-dependent trend function plus a zero-mean spatial variation. The trend function helps capture how soil properties vary in space, while the spatial variation can be characterized by its standard deviation (σ) and scale of fluctuation (δ). In the 3D case, the horizontal scale of fluctuation matters in addition to the vertical one, but because far fewer data are available horizontally than vertically, the horizontal parameter is difficult to estimate. Another problem is that when many data points are analyzed at once, the covariance matrix becomes very large, increasing the computational cost and even exceeding the memory budget. We use the Cholesky decomposition and the Kronecker product to simplify the matrix, which greatly reduces the computation. This study uses a two-step Bayesian analysis to identify trend functions. The first step selects the required basis functions by sparse Bayesian learning; the effects of different kinds of basis functions are also considered. The second step uses transitional Markov chain Monte Carlo (TMCMC; Ching and Chen, 2007) to estimate the parameters of the random field. Through these two steps, we can fit the trend function and model the random field.
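The first (basis-selection) step can be sketched with scikit-learn's ARDRegression, which implements sparse-Bayesian pruning of basis-function weights; the polynomial basis, the synthetic qt profile, and all numbers below are illustrative, and the TMCMC second step is not shown.

```python
import numpy as np
from sklearn.linear_model import ARDRegression

# Synthetic qt profile along depth z (all names and numbers illustrative).
rng = np.random.default_rng(0)
z = np.linspace(0.0, 20.0, 200)
qt = 1.5 + 0.3 * z + rng.normal(0.0, 0.5, z.size)      # trend + spatial variation

zn = z / z.max()                                       # rescale for conditioning
Phi = np.column_stack([zn ** p for p in range(1, 6)])  # candidate basis functions
ard = ARDRegression().fit(Phi, qt)                     # SBL prunes useless terms
trend = ard.predict(Phi)
residual = qt - trend                                  # zero-mean spatial variation
print(ard.coef_.round(2), residual.std().round(2))     # sparse weights, sigma-hat
```

The scale of fluctuation δ would then be estimated by fitting an autocorrelation model to `residual`, which is where the thesis's TMCMC step takes over.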
Parisi, Simone. "Reinforcement Learning with Sparse and Multiple Rewards." PhD thesis, 2020. https://tuprints.ulb.tu-darmstadt.de/11372/1/THESIS.PDF.
Huang, Han-Shen (黃漢申). "Learning from Sparse Data: An Approach to Parameter Learning in Bayesian Networks." Thesis, 2003. http://ndltd.ncl.edu.tw/handle/18831073237145141413.
Full text國立臺灣大學
資訊工程學研究所
91
Many newly emerging applications with small and incomplete (hereafter, sparse) data sets present new challenges to machine learning. For example, we would like a model that can accurately predict the possibility of domestic terrorist incidents and anticipate attacks in advance. Such incidents are rare, but have severe impact when they do happen. In addition, the relevant symptoms may be unknown, unobserved, and different case by case. Learning accurate models from this kind of sparse data is therefore difficult, but very meaningful and important. One way to deal with such situations is to learn probabilistic models from sparse data sets: probability theory is well-founded for domains with uncertainty and for data sets with missing values. We use the Bayesian network as the modeling tool because of its clear semantics for human experts. The network structure, showing the causal relations between features, can be determined by domain experts; the parameters can then be learned from data, a task that is more tedious for human experts. This thesis proposes a search-based approach to the parameter learning problem in Bayesian networks from sparse training sets. A search-based solution consists of a metric and a search algorithm. The most frequently used solution is to search on the data likelihood metric, based on maximum-likelihood (ML) estimation, with the Expectation-Maximization (EM) algorithm or gradient ascent. However, our analysis shows that ML learning from sparse data tends to over/underestimate the probabilities of low/high-frequency states of multinomial random variables. We therefore propose the Entropic Rectification Function (ERF) to rectify this deviation without prior information about the application domain. The general EM-based framework for penalized data likelihood functions, the Penalized EM (PEM) algorithm, can search on ERF, but time-consuming numerical methods are required in the M-step. To accelerate the computation, we propose the Fixed-Point PEM (FPEM) algorithm, in which the M-step has a closed-form solution based on the fixed-point iteration method. We show that ERF outperforms the data likelihood metric by leading the search algorithms to stop at estimates with smaller KL divergences to the true distribution, and that FPEM outperforms PEM by finding local maxima faster. In addition, ERF can also be used to learn other probabilistic models with multinomial distributions, such as the hidden Markov model, and FPEM can search on other penalized data likelihood metrics as well.
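ERF and FPEM themselves are not public, so the sketch below uses standard MAP-EM with Dirichlet pseudo-counts on a two-node network as a hedged analogue: the prior plays the same rectifying role, pulling rare-state probabilities away from 0 and frequent ones away from 1. All names are illustrative.

```python
import numpy as np

def map_em(x_obs, y_obs, alpha=2.0, n_iter=50):
    """MAP-EM for a two-node network X -> Y with binary variables, some X
    missing (None). Dirichlet(alpha) smoothing stands in for ERF's
    rectification; this is NOT the thesis's ERF/FPEM algorithm."""
    p_x = np.array([0.5, 0.5])              # P(X)
    p_y = np.full((2, 2), 0.5)              # P(Y=y | X=x), rows indexed by x
    for _ in range(n_iter):
        cx = np.full(2, alpha - 1.0)        # prior pseudo-counts
        cy = np.full((2, 2), alpha - 1.0)
        for x, y in zip(x_obs, y_obs):
            w = p_x * p_y[:, y] if x is None else np.eye(2)[x]
            if x is None:
                w = w / w.sum()             # E-step: posterior over missing X
            cx += w
            cy[:, y] += w
        p_x = cx / cx.sum()                 # M-step: MAP (smoothed) estimates
        p_y = cy / cy.sum(axis=1, keepdims=True)
    return p_x, p_y

# Toy data: X is missing in a third of the cases.
p_x, p_y = map_em([0, 1, None, None, 1, 0], [0, 1, 1, 0, 1, 0])
```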
Lee, Kuen-Feng (李昆峯). "Construction of Document Model and Language Model Using Bayesian Sparse Learning." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/57056195766494950616.
Hsieh, Tien-Yu (謝典佑). "Modeling Students' Learning Bugs and Skills Using Combining Multiple Bayesian Networks." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/06642102650546308286.
Full text國立臺中教育大學
數學教育學系
94
This thesis develops fusion methods for combining multiple Bayesian networks in order to obtain better classification results than a single Bayesian network. Six fusion methods (Maximum, Minimum, Average, Product, Majority Vote, and Fusion Structure) were proposed and evaluated on educational test data. The results show that the proposed Fusion Structure method with dynamic cut-point selection can improve classification accuracy.
Kück, Hendrik. "Bayesian formulations of multiple instance learning with applications to general object recognition." Thesis, 2004. http://hdl.handle.net/2429/15680.
Manandhar, Achut. "Hierarchical Bayesian Learning Approaches for Different Labeling Cases." Diss., 2015. http://hdl.handle.net/10161/11321.
Full textThe goal of a machine learning problem is to learn useful patterns from observations so that appropriate inference can be made from new observations as they become available. Based on whether labels are available for training data, a vast majority of the machine learning approaches can be broadly categorized into supervised or unsupervised learning approaches. In the context of supervised learning, when observations are available as labeled feature vectors, the learning process is a well-understood problem. However, for many applications, the standard supervised learning becomes complicated because the labels for observations are unavailable as labeled feature vectors. For example, in a ground penetrating radar (GPR) based landmine detection problem, the alarm locations are only known in 2D coordinates on the earth's surface but unknown for individual target depths. Typically, in order to apply computer vision techniques to the GPR data, it is convenient to represent the GPR data as a 2D image. Since a large portion of the image does not contain useful information pertaining to the target, the image is typically further subdivided into subimages along depth. These subimages at a particular alarm location can be considered as a set of observations, where the label is only available for the entire set but unavailable for individual observations along depth. In the absence of individual observation labels, for the purposes of training standard supervised learning approaches, observations both above and below the target are labeled as targets despite substantial differences in their characteristics. As a result, the label uncertainty with depth would complicate the parameter inference in the standard supervised learning approaches, potentially degrading their performance. In this work, we develop learning algorithms for three such specific scenarios where: (1) labels are only available for sets of independent and identically distributed (i.i.d.) observations, (2) labels are only available for sets of sequential observations, and (3) continuous correlated multiple labels are available for spatio-temporal observations. For each of these scenarios, we propose a modification in a traditional learning approach to improve its predictive accuracy. The first two algorithms are based on a set-based framework called as multiple instance learning (MIL) whereas the third algorithm is based on a structured output-associative regression (SOAR) framework. The MIL approaches are motivated by the landmine detection problem using GPR data, where the training data is typically available as labeled sets of observations or sets of sequences. The SOAR learning approach is instead motivated by the multi-dimensional human emotion label prediction problem using audio-visual data, where the training data is available in the form of multiple continuous correlated labels representing complex human emotions. In both of these applications, the unavailability of the training data as labeled featured vectors motivate developing new learning approaches that are more appropriate to model the data.
A large majority of the existing MIL approaches require computationally expensive parameter optimization, do not generalize well to time-series data, and are incapable of online learning. To overcome these limitations, for sets of observations this work develops a nonparametric Bayesian approach to learning in MIL scenarios based on Dirichlet process mixture models. The nonparametric nature of the model and the use of non-informative priors remove the need for cross-validation-based optimization, while variational Bayesian inference allows rapid parameter learning. The resulting approach is highly generalizable and also capable of online learning. For sets of sequences, this work integrates hidden Markov models (HMMs) into the MIL framework and develops a new approach called the multiple instance hidden Markov model. The model parameters are inferred using variational Bayes, making the model tractable and computationally efficient. The resulting approach is likewise highly generalizable and capable of online learning. Similarly, most existing approaches for modeling multiple continuous correlated emotion labels do not model the spatio-temporal correlation among the labels. The few approaches that do model the correlation fail to predict the multiple emotion labels simultaneously, resulting in latency during testing and potentially compromising the effectiveness of the approach in real-time scenarios. This work integrates the output-associative relevance vector machine (OARVM) approach with the multivariate relevance vector machine (MVRVM) approach to simultaneously predict multiple emotion labels. The resulting approach performs competitively with existing approaches while reducing prediction time during testing, and the sparse Bayesian inference allows rapid parameter learning. Experimental results on several synthetic datasets, benchmark datasets, GPR-based landmine detection datasets, and human emotion recognition datasets show that our proposed approaches perform comparably to or better than existing approaches.
Dissertation
"Bayesian Framework for Sparse Vector Recovery and Parameter Bounds with Application to Compressive Sensing." Master's thesis, 2019. http://hdl.handle.net/2286/R.I.55639.
Dissertation/Thesis
Master's thesis, Computer Engineering, 2019
Yang, Chih-Wei (楊智為). "Modeling Student's Learning Bugs and Skills Using Combining Multiple Bayesian Networks based on SVM." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/14681956954826606265.
National Taichung University of Education
Graduate Institute of Educational Measurement and Statistics
Academic year 95 (2006)
This study combined multiple Bayesian networks with a classifier to obtain better accuracy than a single Bayesian network. Two classifiers, k-nearest neighbors and the support vector machine, were used with two kinds of input: binary outcomes and posterior probabilities. The results showed that combining the multiple Bayesian networks with a support vector machine on posterior-probability inputs can improve classification accuracy.
Rodrigues, Filipe Manuel Pereira Duarte. "Probabilistic models for learning from crowdsourced data." Doctoral thesis, 2016. http://hdl.handle.net/10316/29454.
This thesis leverages the general framework of probabilistic graphical models to develop probabilistic approaches for learning from crowdsourced data. This type of data is rapidly changing the way we approach many machine learning problems in different areas such as natural language processing, computer vision and music. By exploiting the wisdom of crowds, machine learning researchers and practitioners are able to develop approaches to perform complex tasks in a much more scalable manner. For instance, crowdsourcing platforms like Amazon Mechanical Turk provide users with an inexpensive and accessible resource for labeling large datasets efficiently. However, the different biases and levels of expertise that are commonly found among different annotators in these platforms deem the development of targeted approaches necessary. With the issue of annotator heterogeneity in mind, we start by introducing a class of latent expertise models which are able to discern reliable annotators from random ones without access to the ground truth, while jointly learning a logistic regression classifier or a conditional random field. Then, a generalization of Gaussian process classifiers to multiple-annotator settings is developed, which makes it possible to learn non-linear decision boundaries between classes and to develop an active learning methodology that is able to increase the efficiency of crowdsourcing while reducing its cost. Lastly, since the majority of the tasks for which crowdsourced data is commonly used involves complex high-dimensional data such as images or text, two supervised topic models are also proposed, one for classification and another for regression problems. Using real crowdsourced data from Mechanical Turk, we empirically demonstrate the superiority of the aforementioned models over state-of-the-art approaches in many different tasks such as classifying posts, news stories, images and music, or even predicting the sentiment of a text, the number of stars of a review or the rating of a movie. But the concept of crowdsourcing is not limited to dedicated platforms such as Mechanical Turk. For example, if we consider the social aspects of the modern Web, we begin to perceive the true ubiquitous nature of crowdsourcing. This opened up an exciting new world of possibilities in artificial intelligence. For instance, from the perspective of intelligent transportation systems, the information shared online by crowds provides the context that allows us to better understand how people move in urban environments. In the second part of this thesis, we explore the use of data generated by crowds as additional inputs in order to improve machine learning models. Namely, the problem of understanding public transport demand in the presence of special events such as concerts, sports games or festivals, is considered. First, a probabilistic model is developed for explaining non-habitual overcrowding using crowd-generated information mined from the Web. Then, a Bayesian additive model with Gaussian process components is proposed. Using real data from Singapore's transport system and crowd-generated data regarding special events, this model is empirically shown to be able to outperform state-of-the-art approaches for predicting public transport demand. Furthermore, due to its additive formulation, the proposed model is able to break down an observed time-series of transport demand into a routine component corresponding to commuting and the contributions of individual special events.
Overall, the models proposed in this thesis for learning from crowdsourced data are of wide applicability and can be of great value to a broad range of research communities.
FCT - SFRH/BD/78396/2011
Chang, Chen-Jung (陳榮昌). "Adaptive learning system based on Bayesian network using fusion strategy for combining multiple student models - using compound shape for an example." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/50476915610782026926.
Asia University
Master's Program, Department of Computer Science and Information Engineering
Academic year 95 (2006)
The main purpose of this research is to explore educational assessment based on Evidence-Centered Design (ECD) and to build a convenient and effective diagnosis system. We use multiple Bayesian networks to model assessment data and identify bugs and sub-skills in the "Compound Shape" unit of Grade 6 mathematics. The research integrates the opinions of experts, scholars, and primary school teachers, and a multimedia computer system is devised for diagnostic testing and computerized adaptive remedial instruction. Students receive not only individual diagnostic tests but also adequate and timely computerized adaptive remedial instruction, so diagnosis and remediation are achieved simultaneously. The findings of this research are as follows: 1. Multiple Bayesian networks did enhance the recognition level. 2. The system could diagnose students' errors, which shows that the adaptive test based on the multiple Bayesian networks was effective. 3. The computerized adaptive remedial instruction was shown to be able to replace written tests in a convenient and time-saving way. 4. After the computerized adaptive remedial instruction, the bugs of most students were reduced and their skills improved, showing that it helped enhance students' learning. 5. The distribution of students' bugs and skills varied with the districts where the schools were located and with the teachers' teaching methods.
Srinivas, Suraj. "Learning Compact Architectures for Deep Neural Networks." Thesis, 2017. http://etd.iisc.ernet.in/2005/3581.
Full textDivya, Padmanabhan. "New Methods for Learning from Heterogeneous and Strategic Agents." Thesis, 2017. http://etd.iisc.ernet.in/2005/3562.
Full text(8086652), Guilherme Maia Rodrigues Gomes. "Hypothesis testing and community detection on networks with missingness and block structure." Thesis, 2019.
Ashofteh, Afshin. "Data Science for Finance: Targeted Learning from (Big) Data to Economic Stability and Financial Risk Management." Doctoral thesis, 2022. http://hdl.handle.net/10362/135620.
Full textThe modelling, measurement, and management of systemic financial stability remains a critical issue in most countries. Policymakers, regulators, and managers depend on complex models for financial stability and risk management. The models are compelled to be robust, realistic, and consistent with all relevant available data. This requires great data disclosure, which is deemed to have the highest quality standards. However, stressed situations, financial crises, and pandemics are the source of many new risks with new requirements such as new data sources and different models. This dissertation aims to show the data quality challenges of high-risk situations such as pandemics or economic crisis and it try to theorize the new machine learning models for predictive and longitudes time series models. In the first study (Chapter Two) we analyzed and compared the quality of official datasets available for COVID-19 as a best practice for a recent high-risk situation with dramatic effects on financial stability. We used comparative statistical analysis to evaluate the accuracy of data collection by a national (Chinese Center for Disease Control and Prevention) and two international (World Health Organization; European Centre for Disease Prevention and Control) organizations based on the value of systematic measurement errors. We combined excel files, text mining techniques, and manual data entries to extract the COVID-19 data from official reports and to generate an accurate profile for comparisons. The findings show noticeable and increasing measurement errors in the three datasets as the pandemic outbreak expanded and more countries contributed data for the official repositories, raising data comparability concerns and pointing to the need for better coordination and harmonized statistical methods. The study offers a COVID-19 combined dataset and dashboard with minimum systematic measurement errors and valuable insights into the potential problems in using databanks without carefully examining the metadata and additional documentation that describe the overall context of data. In the second study (Chapter Three) we discussed credit risk as the most significant source of risk in banking as one of the most important sectors of financial institutions. We proposed a new machine learning approach for online credit scoring which is enough conservative and robust for unstable and high-risk situations. This Chapter is aimed at the case of credit scoring in risk management and presents a novel method to be used for the default prediction of high-risk branches or customers. This study uses the Kruskal-Wallis non-parametric statistic to form a conservative credit-scoring model and to study its impact on modeling performance on the benefit of the credit provider. The findings show that the new credit scoring methodology represents a reasonable coefficient of determination and a very low false-negative rate. It is computationally less expensive with high accuracy with around 18% improvement in Recall/Sensitivity. Because of the recent perspective of continued credit/behavior scoring, our study suggests using this credit score for non-traditional data sources for online loan providers to allow them to study and reveal changes in client behavior over time and choose the reliable unbanked customers, based on their application data. 
This is the first study to develop an online non-parametric credit scoring system that can automatically reselect effective features for continued credit evaluation and weight them by their level of contribution, with good diagnostic ability. In the third study (Chapter Four), we focus on the financial stability challenges faced by insurance companies and pension schemes when managing systematic (undiversifiable) mortality and longevity risk. For this purpose, we first developed a new ensemble learning strategy for panel time-series forecasting and studied its application to tracking respiratory disease excess mortality during the COVID-19 pandemic. The layered learning approach is a solution related to ensemble learning that addresses a given predictive task with different predictive models when a direct mapping from inputs to outputs is not accurate. We adopt a layered learning approach within an ensemble learning strategy to solve predictive tasks with improved performance, combining multiple learning processes into an ensemble model. In the proposed strategy, an appropriate holdout is specified individually for each model, and the models in the ensemble are selected by a proposed selection approach to be combined dynamically based on their predictive performance. This provides a high-performance ensemble model that automatically copes with the different kinds of time series for each panel member. For the experimental section, we studied more than twelve thousand observations in a portfolio of 61 time series (countries) of reported respiratory disease deaths with monthly sampling frequency, to show the amount of improvement in predictive performance. We then compare each country's forecasts of respiratory disease deaths generated by our model with the corresponding COVID-19 deaths in 2020. The results of this large set of experiments show that the accuracy of the ensemble model is improved noticeably by using different holdouts for the different contributing time series methods, based on the proposed model selection method. These improved time series models provide proper forecasts of respiratory disease deaths for each country, exhibiting high correlation (0.94) with COVID-19 deaths in 2020. In the fourth study (Chapter Five), we used the new ensemble learning approach for time series modeling discussed in the previous chapter, together with K-means clustering, to forecast life tables in COVID-19 times. Stochastic mortality modeling plays a critical role in public pension design, population and public health projections, and the design, pricing, and risk management of life insurance contracts and longevity-linked securities. There is no general mortality forecasting method applicable to all situations, especially for unusual years such as the COVID-19 pandemic. In this chapter, we investigate the feasibility of using an ensemble of traditional and machine learning time series methods to empower forecasts of age-specific mortality rates for groups of countries that share common longevity trends. We use Generalized Age-Period-Cohort stochastic mortality models to capture age and period effects, apply K-means clustering to the time series to group countries following common longevity trends, and use ensemble learning to forecast life expectancy and annuity prices by age and sex. To calibrate the models, we use data for 14 European countries from 1960 to 2018.
The results show that the ensemble method presents the most robust results overall, with minimum RMSE, in the presence of structural changes in the shape of the time series at the time of COVID-19. In the conclusions of this dissertation (Chapter Six), we provide more detailed insights into its overall contributions to financial stability and risk management through data science, along with opportunities, limitations, and avenues for future research on the application of data science in finance and economics.
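To ground the credit-scoring study (Chapter Three above), here is a minimal sketch of Kruskal-Wallis feature screening with scipy; the significance threshold and names are illustrative, and the thesis builds a full conservative scoring model on top of the statistic rather than this bare filter.

```python
import numpy as np
from scipy.stats import kruskal

def kw_screen(X, y, alpha=0.01):
    """Keep features whose distributions differ between default (y=1) and
    non-default (y=0) groups under the Kruskal-Wallis H test."""
    keep = []
    for j in range(X.shape[1]):
        groups = [X[y == c, j] for c in np.unique(y)]
        if kruskal(*groups).pvalue < alpha:
            keep.append(j)
    return keep

# Toy usage: feature 0 is informative, feature 1 is noise.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 300)
X = np.column_stack([y + rng.normal(0, 1, 300), rng.normal(0, 1, 300)])
print(kw_screen(X, y))   # typically [0]
```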