Theses on the topic "Apprentissage d'ensemble"
Consult the top 18 theses for your research on the topic "Apprentissage d'ensemble".
Guo, Li. "Classifieurs multiples intégrant la marge d'ensemble. Application aux données de télédétection". Bordeaux 3, 2011. http://www.theses.fr/2011BOR30022.
This dissertation focuses on exploiting the ensemble margin concept to design better ensemble classifiers. Some training data set issues, such as redundancy, imbalanced classes and noise, are investigated in an ensemble margin framework. An alternative definition of the ensemble margin is at the core of this work. An innovative approach to measure the importance of each instance in the learning process is introduced. We show that there is less redundancy among smaller margin instances than among higher margin ones. In addition, these smaller margin instances carry more significant information than higher margin instances. Therefore, these low margin instances have a major influence in forming an appropriate training set to build up a reliable classifier. Based on these observations, we propose a new boundary bagging method. Another major issue that is investigated in this thesis is the complexity induced by an ensemble approach which usually involves a significant number of base classifiers. A new efficient ensemble pruning method is proposed. It consists in ordering all the base classifiers with respect to an entropy-inspired criterion that also exploits our new version of the margin of ensemble methods. Finally, the proposed ensemble methods are applied to remote sensing data analysis at three learning levels: data level, feature level and classifier level.
Roy, Jean-Francis. "Apprentissage automatique avec garanties de généralisation à l'aide de méthodes d'ensemble maximisant le désaccord". Doctoral thesis, Université Laval, 2018. http://hdl.handle.net/20.500.11794/29563.
We focus on machine learning, a branch of artificial intelligence. When solving a classification problem, a learning algorithm is provided labelled data and has the task of learning a function that will be able to automatically classify future, unseen data. Many classical learning algorithms are designed to combine simple classifiers by building a weighted majority vote classifier out of them. In this thesis, we extend the usage of the C-bound, a bound on the risk of the majority vote classifier. This bound is defined using two quantities: the individual performance of the voters, and the correlation of their errors (their disagreement). First, we design majority vote generalization bounds based on the C-bound. Then, we extend this bound from binary classification to generalized majority votes. Finally, we develop new learning algorithms with state-of-the-art performance, by constructing majority votes that maximize the voters' disagreement, while controlling their individual performance. The generalization guarantees that we develop in this thesis are in the family of PAC-Bayesian bounds. We generalize the PAC-Bayesian theory by introducing a general theorem, from which the classical bounds from the literature can be recovered. Using this same theorem, we introduce generalization bounds based on the C-bound. We also simplify the proof process of PAC-Bayesian theorems, easing the development of new families of bounds. We introduce two new families of PAC-Bayesian bounds. One is based on a different notion of complexity than usual bounds, the Rényi divergence, instead of the classical Kullback-Leibler divergence. The second family is specialized to transductive learning, instead of inductive learning. The two learning algorithms that we introduce, MinCq and CqBoost, output a majority vote classifier that maximizes the disagreement between voters. A hyperparameter of the algorithms gives direct control over the individual performance of the voters. Since these two algorithms are designed to minimize PAC-Bayesian generalization bounds on the risk of the majority vote classifier, they come with rigorous theoretical guarantees. By performing an empirical evaluation, we show that MinCq and CqBoost perform as well as classical state-of-the-art algorithms.
Baudin, Paul. "Prévision séquentielle par agrégation d'ensemble : application à des prévisions météorologiques assorties d'incertitudes". Thesis, Université Paris-Saclay (ComUE), 2015. http://www.theses.fr/2015SACLS117/document.
In this thesis, we study sequential prediction problems. The goal is to devise and apply automatic strategies that learn from the past, with potential help from basis predictors. We want these strategies to have strong mathematical guarantees and to be valid in the most general cases. This enables us to apply the algorithms deriving from the strategies to meteorological data predictions. Finally, we are interested in theoretical and practical versions of this sequential prediction framework applied to cumulative distribution function prediction. Firstly, we study online prediction of bounded stationary ergodic processes. To do so, we consider the setting of prediction of individual sequences and propose a deterministic regression tree that performs asymptotically as well as the best L-Lipschitz predictor. Then, we show why the obtained regret bound entails asymptotic optimality with respect to the class of bounded stationary ergodic processes. Secondly, we propose a specific sequential aggregation method for meteorological simulations of mean sea level pressure. The aim is to obtain, with a ridge regression algorithm, better prediction performance than a reference prediction belonging to the class of constant linear predictions built from the basis predictors. We begin by recalling the mathematical framework and basic notions of environmental science. Then, the datasets used and the practical performance of the strategies are studied, as well as the sensitivity of the algorithm to parameter tuning. We then transpose the former method to another meteorological variable: the wind speed 10 meters above ground. This study shows that the wind speed exhibits different behaviors on a macro level. In the last chapter, we present the tools used in a probabilistic prediction framework and underline their merits. First, we explain the relevance of probabilistic prediction and present this domain's state of the art. We carry on with a historical overview of popular probabilistic scores. The algorithms used are then thoroughly described, followed by their empirical results on the mean sea level pressure and wind speed.
Loth, Manuel. "Algorithmes d'Ensemble Actif pour le LASSO". Phd thesis, Université des Sciences et Technologie de Lille - Lille I, 2011. http://tel.archives-ouvertes.fr/tel-00845441.
Tran, Anh-Tuan. "Ensemble learning-based approach for the global minimum variance portfolio". Electronic Thesis or Diss., Université Paris sciences et lettres, 2024. http://www.theses.fr/2024UPSLP010.
Ensemble Learning rests on a simple idea: combining several learning algorithms tends to yield a better result than any single learning algorithm. Empirically, an ensemble method performs better when its base models are diversified, even if they are, counter-intuitively, random algorithms such as random decision trees. Because of its advantages, Ensemble Learning is used in various applications such as fraud detection problems. In more detail, the advantages of Ensemble Learning stem from two points: i) it combines the strengths of its base models, so that the models complement one another, and ii) it neutralizes the noise and outliers among the base models, thereby reducing their impact on the final predictions. We use these two ideas of Ensemble Learning for different applications in Machine Learning and the Finance industry. Our main contributions in this thesis are to: i) efficiently deal with a hard case of the imbalanced data problem in Machine Learning, namely the extremely imbalanced big data problem, by using an undersampling technique and Ensemble Learning, ii) appropriately apply time-series Cross-Validation and Ensemble Learning to resolve a covariance matrix estimator selection problem in Quantitative Trading, and iii) reduce the impact of outliers in covariance matrix estimations, in order to increase the stability of portfolios, by using undersampling and Ensemble Learning.
Thorey, Jean. "Prévision d’ensemble par agrégation séquentielle appliquée à la prévision de production d’énergie photovoltaïque". Thesis, Paris 6, 2017. http://www.theses.fr/2017PA066526/document.
Our main objective is to improve the quality of photovoltaic power forecasts deriving from weather forecasts. Such forecasts are imperfect due to meteorological uncertainties and statistical modeling inaccuracies in the conversion of weather forecasts to power forecasts. First we gather several weather forecasts, secondly we generate multiple photovoltaic power forecasts, and finally we build linear combinations of the power forecasts. The minimization of the Continuous Ranked Probability Score (CRPS) makes it possible to statistically calibrate the combination of these forecasts, and provides probabilistic forecasts in the form of a weighted empirical distribution function. We investigate the CRPS bias in this context and several properties of scoring rules which can be seen as a sum of quantile-weighted losses or a sum of threshold-weighted losses. The minimization procedure is achieved with online learning techniques. Such techniques come with theoretical guarantees of robustness on the predictive power of the combination of the forecasts. Essentially no assumptions are needed for the theoretical guarantees to hold. The proposed methods are applied to the forecast of solar radiation using satellite data, and the forecast of photovoltaic power based on high-resolution weather forecasts and standard ensembles of forecasts.
Jaber, Ghazal. "An approach for online learning in the presence of concept changes". Phd thesis, Université Paris Sud - Paris XI, 2013. http://tel.archives-ouvertes.fr/tel-00907486.
Boulegane, Dihia. "Machine learning algorithms for dynamic Internet of Things". Electronic Thesis or Diss., Institut polytechnique de Paris, 2021. http://www.theses.fr/2021IPPAT048.
The rapid growth of Internet-of-Things (IoT) devices and sensors has produced sources that continuously release and curate vast amounts of data at a high pace in the form of streams. These ubiquitous data streams are essential for data-driven decision-making in different business sectors, using Artificial Intelligence (AI) and Machine Learning (ML) techniques to extract valuable knowledge and turn it into appropriate actions. Besides, the data being collected is often associated with a temporal indicator; such a temporal data stream is a potentially infinite sequence of observations captured over time at intervals that are often, but not necessarily, regular. Forecasting is a challenging task in the field of AI that aims at understanding the process generating the observations over time, based on past data, in order to accurately predict future behavior. Stream Learning is the emerging research field which focuses on learning from infinite and evolving data streams. The thesis tackles dynamic model combination, which achieves competitive results despite its high computational cost in terms of memory and time. We study several approaches to estimate the predictive performance of individual forecasting models according to the data, and contribute by introducing novel windowing and meta-learning based methods to cope with evolving data streams. Subsequently, we propose different selection methods that aim at constituting a committee of accurate and diverse models. The predictions of these models are then weighted and aggregated. The second part addresses model compression, which aims at building a single model to mimic the behavior of a highly performing and complex ensemble while reducing its complexity. Finally, we present the first streaming competition, "Real-time Machine Learning Competition on Data Streams", at the IEEE Big Data 2019 conference, using the new SCALAR platform.
Tremblay, Guillaume. "Optimisation d'ensembles de classifieurs non paramétriques avec apprentissage par représentation partielle de l'information". Mémoire, École de technologie supérieure, 2004. http://espace.etsmtl.ca/716/1/TREMBLAY_Guillaume.pdf.
Faddoul, Jean Baptiste. "Modèles d'Ensembles pour l'Apprentissage Multi-Tache, avec des taches Hétérogènes et sans Restrictions". Phd thesis, Université Charles de Gaulle - Lille III, 2012. http://tel.archives-ouvertes.fr/tel-00712710.
Ezzeddine, Diala. "A contribution to topological learning and its application in Social Networks". Thesis, Lyon 2, 2014. http://www.theses.fr/2014LYO22011/document.
Supervised Learning is a popular field of Machine Learning that has made recent progress. In particular, many methods and procedures have been developed to solve the classification problem. Most classical methods in Supervised Learning use the density estimation of data to construct their classifiers. In this dissertation, we show that the topology of data can be a good alternative in constructing classifiers. We propose using topological graphs like Gabriel graphs (GG) and Relative Neighborhood Graphs (RNG) that can build the topology of data based on its neighborhood structure. To apply this concept, we create a new method called Random Neighborhood Classification (RNC). In this method, we use topological graphs to construct classifiers and then apply Ensemble Methods (EM) to get all relevant information from the data. EM, well known in Machine Learning, generates many classifiers from data and then aggregates these classifiers into one. Aggregate classifiers have been shown to be very efficient in many studies, because they leverage relevant and effective information from each generated classifier. We first compare RNC to other known classification methods using data from the UC Irvine (UCI) repository. We find that RNC works very well compared to very efficient methods such as Random Forests and Support Vector Machines. Most of the time, it ranks in the top three methods in efficiency. This result has encouraged us to study the efficiency of RNC on real data like tweets. Twitter, a microblogging Social Network, is especially useful to mine opinion on current affairs and topics that span the range of human interest, including politics. Mining political opinion from Twitter poses peculiar challenges, such as the versatility of the authors when they express their political views, that motivate this study. We define a new attribute, called couple, that will be very helpful in the process of studying the opinion of tweets. A couple is an author who talks about a politician. We propose a new procedure that focuses on identifying the opinion of a tweet using couples. We think that focusing on the couples' opinion expressed by several tweets can overcome the problems of analysing each single tweet. This approach can be useful to avoid the versatility, language ambiguity and many other artifacts that are easy to understand for a human being but not automatically for a machine. We use classical Machine Learning techniques like KNN and Random Forests (RF), and also our method RNC. We proceed in two steps. First, we build a reference set of classified couples using Naive Bayes. We also apply an alternative to the Naive Bayes method, a sampling plan procedure, to compare and evaluate the results of the Naive Bayes method. Second, we evaluate the performance of this approach using proximity measures in order to use RNC, RF and KNN. The experiments are based on real data of tweets from the French presidential election in 2012. The results show that this approach works well and that RNC performs very well at classifying opinion in tweets. Topological Learning seems to be a very interesting field to study, in particular to address the classification problem. Many concepts for getting information from topological graphs still need to be analysed, like the ones described by Aupetit, M. in his work (2005). Our work shows that Topological Learning can be an effective way to address the classification problem.
Gacquer, David. "Sur l'utilisation active de la diversité dans la construction d'ensembles classifieurs : application à la détection de fumées nocives sur site industriel". Valenciennes, 2009. http://ged.univ-valenciennes.fr/nuxeo/site/esupversions/2a04cf89-c324-43d6-a36b-052aa232f813.
The influence of diversity on the design of Multiple Classifier Systems is an active topic in Machine Learning. One possible way of considering the design of Multiple Classifier Systems is to select the ensemble members from a large pool of classifiers focusing on predefined criteria, which is known as the Overproduce and Choose paradigm. The objective of this PhD Thesis is to study the trade-off between accuracy and diversity which exists in multiple classifier systems. We review some well-known Machine Learning algorithms and ensemble learning techniques from the literature, and we present in detail the concept of diversity and the way it is used by certain ensemble learning algorithms. We propose a genetic heuristic to design multiple classifier systems by controlling the trade-off between diversity and accuracy when selecting individual classifiers. We compare the proposed genetic selection with several heuristics described in the literature to build multiple classifier systems under the Overproduce and Choose paradigm. The application of our research work concerns the development of a supervised classification system to control atmospheric pollution around industrial complexes. This system is based on the analysis of visual scenes recorded by cameras and aims at detecting dangerous smoke trails emitted by steelworks or chemical factories.
Hadjem, Medina. "Contribution à l'analyse et à la détection automatique d'anomalies ECG dans le cas de l'ischémie myocardique". Thesis, Sorbonne Paris Cité, 2016. http://www.theses.fr/2016USPCB011.
Recent advances in sensing and miniaturization of ultra-low power devices allow for more intelligent and wearable health monitoring sensor-based systems. The sensors are capable of collecting vital signs, such as heart rate, temperature, oxygen saturation, blood pressure, ECG, EMG, etc., and wirelessly communicating the collected data to a remote device and/or smartphone. These advances have led a large research community to take an interest in the design and development of new biomedical data analysis systems, particularly electrocardiogram (ECG) analysis systems. Aiming to contribute to this broad research area, we have mainly focused in this thesis on the automatic analysis and detection of coronary heart diseases, such as Ischemia and Myocardial Infarction (MI), that are well known to be the leading causes of death worldwide. Toward this end, and because ECG signals are deemed to be very noisy and non-stationary, our challenge was first to extract the relevant parameters without losing their main features. This particular issue has been widely addressed in the literature and does not represent the main purpose of this thesis. However, as it is a prerequisite, it required us to understand the state-of-the-art methods and select the most suitable one for our work. Based on the ECG parameters extracted, particularly the ST segment and the T wave parameters, we have contributed two different approaches to analyze the ECG records: (1) the first analysis is performed at the time-series level, in order to detect abnormal elevations of the ST segment and the T wave, known to be an accurate predictor of ischemia or MI; (2) the second analysis is performed at the ECG beat level to automatically classify the ST segment and T wave anomalies within different categories. This latter approach is the most commonly used in the literature. However, since existing state-of-the-art works lack a performance comparison standard, we have carried out our own comparison of the existing classification methods by taking into account diverse ST and T anomaly classes, several performance evaluation parameters, as well as several ECG signal leads. To obtain more realistic performances, we have also performed the same study in the presence of other frequent cardiac anomalies, such as arrhythmia. Based on this substantial comparative study, we have proposed a new classification approach for seven ST-T anomaly classes, using a hybrid of the boosting and random undersampling methods. Our goal was ultimately to reach the best tradeoff between true positives and false positives.
Desir, Chesner. "Classification automatique d'images, application à l'imagerie du poumon profond". Phd thesis, Rouen, 2013. http://www.theses.fr/2013ROUES053.
This thesis deals with automated image classification, applied to images acquired with alveoscopy, a new imaging technique of the distal lung. The aim is to propose and develop a computer-aided diagnosis system, so as to help the clinician analyze these images never seen before. Our contributions lie in the development of effective, robust and generic methods to classify images of healthy and pathological patients. Our first classification system is based on a rich and local characterization of the images, an ensemble of random trees approach for classification and a rejection mechanism, providing the medical expert with tools to enhance the reliability of the system. Due to the complexity of alveoscopy images and to the lack of expertise on the pathological cases (unlike healthy cases), we adopt the one-class learning paradigm, which allows a classifier to be learned from healthy data only. We propose a one-class approach taking advantage of the combining and randomization mechanisms of ensemble methods to respond to common issues such as the curse of dimensionality. Our method is shown to be effective, robust to the dimension, competitive and even better than state-of-the-art methods on various public datasets. It has proved to be particularly relevant to our medical problem.
Desir, Chesner. "Classification Automatique d'Images, Application à l'Imagerie du Poumon Profond". Phd thesis, Université de Rouen, 2013. http://tel.archives-ouvertes.fr/tel-00879356.
Hadjem, Medina. "Contribution à l'analyse et à la détection automatique d'anomalies ECG dans le cas de l'ischémie myocardique". Electronic Thesis or Diss., Sorbonne Paris Cité, 2016. http://www.theses.fr/2016USPCB011.
Gacquer, David. "Sur l'utilisation active de la diversité dans la construction d'ensembles de classifieurs. Application à la détection de fumées nocives sur site industriel". Phd thesis, Université de Valenciennes et du Hainaut-Cambresis, 2008. http://tel.archives-ouvertes.fr/tel-00392616.
One particular way of building a classifier ensemble is to select the members of the ensemble individually from a pool of classifiers, based on predefined criteria.
The literature refers to this method as the Overproduce and Choose paradigm, also called classifier ensemble pruning.
The work presented in this thesis aims to study the trade-off between accuracy and diversity that exists in classifier ensembles. We also provide some answers regarding the elusive behavior of diversity when it is used explicitly during the construction of a classifier ensemble.
We begin by studying various learning algorithms from the literature. We also present the most frequently used ensemble algorithms. We then define the concept of diversity in classifier ensembles, as well as the different methods that allow it to be used directly when creating the ensemble.
We propose a genetic algorithm for building a classifier ensemble that controls the trade-off between accuracy and diversity when selecting the members of the ensemble. We compare our algorithm with various selection heuristics proposed in the literature for building a classifier ensemble under the Overproduce and Choose paradigm.
The conclusions we draw from the results obtained on various datasets from the UCI Repository lead us to propose specific conditions under which the use of diversity can improve the performance of the classifier ensemble. We also show that the effectiveness of the Overproduce and Choose approach relies largely on the inherent stability of the problem at hand.
Finally, we apply our research to the development of a supervised classification system for monitoring atmospheric pollution occurring at industrial sites. This system is based on the image-processing analysis of at-risk scenes recorded by cameras. Its main objective is to detect dangerous smoke emissions released by steelworks and petrochemical plants.
Boutaleb, Mohamed Yasser. "Egocentric Hand Activity Recognition : The principal components of an egocentric hand activity recognition framework, exploitable for augmented reality user assistance". Electronic Thesis or Diss., CentraleSupélec, 2022. http://www.theses.fr/2022CSUP0007.
Humans use their hands for various tasks in daily life and industry, making research in this area a recent focus of significant interest. Moreover, analyzing and interpreting human behavior using visual signals is one of the most active and explored areas of computer vision. With the advent of new augmented reality technologies, researchers are increasingly interested in understanding hand activity from a first-person perspective and in exploring its suitability for human guidance and assistance. Our work relies on machine learning to contribute to this research area. Recently, deep neural networks have proven their outstanding effectiveness in many research areas, allowing researchers to make significant gains in efficiency and robustness. This thesis's main objective is to propose a user activity recognition framework including four key components, which can be used to assist users during activities oriented towards specific objectives: industry 4.0 (e.g., assisted assembly, maintenance) and teaching. Thus, the system observes the user's hands and the manipulated objects from the user's viewpoint to recognize the hand activity being performed. The desired framework must robustly recognize the user's usual activities. It must also detect unusual ones in order to provide feedback and prevent the user from performing wrong maneuvers, a fundamental requirement for user assistance. This thesis, therefore, combines techniques from the research fields of computer vision and machine learning to propose comprehensive hand activity recognition components essential for a complete assistance tool.