Dissertations / Theses on the topic 'Algorithmes d'apprentissage automatique'
Consult the top 23 dissertations / theses for your research on the topic 'Algorithmes d'apprentissage automatique.'
Germain, Pascal. "Algorithmes d'apprentissage automatique inspirés de la théorie PAC-Bayes." Thesis, Université Laval, 2009. http://www.theses.ulaval.ca/2009/26191/26191.pdf.
This master's thesis first presents a general PAC-Bayes theorem, from which several well-known PAC-Bayes bounds are easily obtained. These bounds give a guarantee on the risk of a classifier from its performance on the training set. We analyze the behavior of two PAC-Bayes bounds and identify the characteristics of the classifiers they favour. We then specialize these bounds to the family of linear classifiers. Next, we design three new machine learning algorithms based on minimizing, by conjugate gradient descent, various mathematical expressions of the PAC-Bayes bounds. The last algorithm uses part of the training set to capture prior knowledge. These algorithms can be used to construct majority vote classifiers as well as linear classifiers implicitly represented through the kernel trick. Finally, an extensive empirical study compares the three algorithms and shows that some versions of them are competitive with both AdaBoost and SVM.
Listed on the Honour Roll (Tableau d'honneur) of the Faculté des études supérieures
Mariéthoz, Johnny. "Algorithmes d'apprentissage discriminants en vérification du locuteur." Lyon 2, 2006. http://theses.univ-lyon2.fr/documents/lyon2/2006/mariethoz_j.
This thesis addresses text-independent speaker verification from the point of view of statistical learning (machine learning). The theories developed in statistical learning make it possible to define the problem more precisely, to develop new unbiased performance measures, and to propose new statistical tests for comparing the proposed models objectively. A new interpretation of state-of-the-art models based on Gaussian mixture models (GMM) shows that these models are in fact discriminative and equivalent to a mixture of linear experts. A general theoretical framework for score normalization is also proposed for probabilistic and non-probabilistic models. This framework brings out the assumptions made when Z and T normalization (T- and Z-norm) are used. Several discriminative models are proposed. We present a new kernel for support vector machines (SVM) that can handle sequences. This kernel generalizes an existing kernel that has the drawback of being limited to a polynomial form. The proposed approach allows the data to be projected into an infinite-dimensional space, as is the case, for example, with a Gaussian kernel. A variant of this kernel, which searches for the best acoustic vector (frame) in the sequence being compared, improves on the best currently known results. Because this approach is particularly costly for long sequences, a clustering algorithm is used to reduce its complexity. Finally, this thesis also addresses problems specific to speaker verification, such as the strong imbalance between the numbers of positive and negative examples and the fact that the distribution of intra- and inter-class distances is specific to this type of problem. Accordingly, the kernel is modified by adding Gaussian noise to each negative example. Although this approach currently lacks theoretical justification, it produces very good empirical results and opens interesting perspectives for future research.
Lacasse, Alexandre. "Bornes PAC-Bayes et algorithmes d'apprentissage." Thesis, Université Laval, 2010. http://www.theses.ulaval.ca/2010/27635/27635.pdf.
The main purpose of this thesis is the theoretical study and the design of learning algorithms that return majority-vote classifiers. In particular, we present a PAC-Bayes theorem that allows us to bound the variance of the Gibbs loss (not only its expectation). From this theorem we deduce a bound on the risk of a majority vote that is tighter than the well-known bound based on the Gibbs risk. We also present a theorem that allows us to bound the risk associated with general loss functions. From this theorem, we design learning algorithms that build weighted majority vote classifiers minimizing a bound on the risk associated with the following loss functions: linear, quadratic and exponential. We also present algorithms based on the randomized majority vote. Some of these algorithms compare favorably with AdaBoost.
Choquette, Philippe. "Nouveaux algorithmes d'apprentissage pour classificateurs de type SCM." Master's thesis, Québec : Université Laval, 2007. http://www.theses.ulaval.ca/2007/24840/24840.pdf.
Giguère, Sébastien. "Algorithmes d'apprentissage automatique pour la conception de composés pharmaceutiques et de vaccins." Doctoral thesis, Université Laval, 2015. http://hdl.handle.net/20.500.11794/25748.
The discovery of pharmaceutical compounds is currently too time-consuming and too expensive, and its failure rate is too high. Biochemical and genomic databases continue to grow and it is now impracticable to interpret these data. A radical change is needed; some steps in this process must be automated. Peptides are molecules that play an important role in the immune system and in cell signaling. Their favorable properties make them prime candidates for initiating the design of new drugs and assisting in the design of vaccines. In addition, modern synthesis techniques can quickly generate these molecules at low cost. Statistical learning algorithms are well suited to managing large amounts of data and to learning models in an automated fashion. These methods and peptides thus offer a solution of choice to the challenges facing pharmaceutical research. We propose a kernel for learning statistical models of biochemical phenomena involving peptides. Among other things, this makes it possible to learn a universal model that can reasonably quantify the binding energy between any peptide sequence and any binding site of a protein. In addition, it unifies the theory of many existing string kernels while maintaining a low computational complexity. This kernel is particularly suitable for quantifying the interaction between antigens and proteins of the major histocompatibility complex. We provide a tool to predict peptides that are likely to be processed by the antigen presentation pathway. This tool has won an international competition and has several applications in immunology, including vaccine design. Ultimately, a peptide should maximize the interaction with a target protein or maximize bioactivity in the host. We formalize this problem as a structured prediction problem. Then, we propose an algorithm exploiting the longest paths in a graph to identify peptides maximizing the predicted bioactivity of a previously learned model. We validate this new approach in the laboratory with the discovery of new antimicrobial peptides. Finally, we provide PAC-Bayes bounds for two structured prediction algorithms, one of which is new.
Sayadi, Karim. "Classification du texte numérique et numérisé. Approche fondée sur les algorithmes d'apprentissage automatique." Thesis, Paris 6, 2017. http://www.theses.fr/2017PA066079/document.
Different disciplines in the humanities, such as philology or palaeography, face complex and time-consuming tasks whenever it comes to examining data sources. The introduction of computational approaches in the humanities makes it possible to address issues such as semantic analysis and systematic archiving. The conceptual models developed are based on algorithms that are later hard-coded in order to automate these tedious tasks. In the first part of the thesis we propose a novel method to build a semantic space based on topic modeling. In the second part, in order to classify historical documents according to their script, we propose a novel representation learning method based on stacked convolutional auto-encoders. The goal is to automatically learn plot representations of the script or the written language.
Brouard, Thierry. "Algorithmes hybrides d'apprentissage de chaines de Markov cachées : conception et applications à la reconnaissance des formes." Tours, 1999. http://www.theses.fr/1999TOUR4002.
This work centers on the quality with which hidden Markov models (HMMs) model data (called observations). Our goal is to propose algorithms that improve this quality. The criterion used to quantify the quality of an HMM is the probability that a given model generates a given observation. To solve this problem, we use a genetic hybridization of HMMs. Using genetic algorithms (GAs) together with HMMs achieves two things. First, GAs let us explore the set of models more efficiently, avoiding local optima. Second, GAs optimize an important characteristic of an HMM: its number of hidden states. The most efficient hybrid algorithm finds the best HMM for a given problem by itself. This means that the GA designs a set of states and the associated transition probabilities. Many applications have been carried out in the framework of this thesis, in domains such as image recognition, time series prediction, unsupervised image segmentation and object tracking in sequences of images. The new algorithms proposed here are applicable to all domains (provided that the hypotheses related to HMMs are satisfied). They allow fast and efficient training of HMMs, and an entirely automatic determination of the architecture (number of states, transition probabilities) of the HMM.
Drouin, Alexandre. "Inferring phenotypes from genotypes with machine learning : an application to the global problem of antibiotic resistance." Doctoral thesis, Université Laval, 2019. http://hdl.handle.net/20.500.11794/34944.
A thorough understanding of the relationship between the genomic characteristics of an individual (the genotype) and its biological state (the phenotype) is essential to personalized medicine, where treatments are tailored to each individual. This notably makes it possible to anticipate diseases, estimate response to treatments, and even identify new pharmaceutical targets. Machine learning is a science that aims to develop algorithms that learn from examples. Such algorithms can be used to learn models that estimate phenotypes based on genotypes, which can then be studied to elucidate the biological mechanisms that underlie the phenotypes. Nonetheless, the application of machine learning in this context poses significant algorithmic and theoretical challenges. The high dimensionality of genomic data and the small size of data samples can lead to overfitting; the large volume of genomic data requires adapted algorithms that limit their use of computational resources; and finally, the learned models must be interpretable by domain experts, which is not always possible. This thesis presents learning algorithms that produce interpretable models for the prediction of phenotypes based on genotypes. Firstly, we explore the prediction of discrete phenotypes using rule-based learning algorithms. We propose new, highly optimized implementations and generalization guarantees that are adapted to genomic data. Secondly, we study a more theoretical problem, namely interval regression, and propose two new learning algorithms, one of which is rule-based. Finally, we show that this type of regression can be used to predict continuous phenotypes and that this leads to models that are more accurate than those of conventional approaches in the presence of censored or noisy data. The applied theme of this thesis is the prediction of antibiotic resistance, a public health problem of global scale. We demonstrate that our algorithms can be used to predict resistance phenotypes very accurately while helping to improve our understanding of them. Ultimately, we expect that our algorithms will contribute to the development of tools that allow a better use of antibiotics and improved epidemiological surveillance, a key component of the solution to this problem.
Fournier, Laurent. "Contribution à la modélisation d'un véhicule automobile et de son environnement : Algorithmes d'apprentissage pour la commande électronique de boîte de vitesse automatique." Limoges, 1996. http://www.theses.fr/1996LIMO0061.
Bordes, Antoine. "Nouveaux Algorithmes pour l'Apprentissage de Machines à Vecteurs Supports sur de Grandes Masses de Données." Phd thesis, Université Pierre et Marie Curie - Paris VI, 2010. http://tel.archives-ouvertes.fr/tel-00464007.
Joshi, Bikash. "Algorithmes d'apprentissage pour les grandes masses de données : Application à la classification multi-classes et à l'optimisation distribuée asynchrone." Thesis, Université Grenoble Alpes (ComUE), 2017. http://www.theses.fr/2017GREAM046/document.
This thesis focuses on developing scalable algorithms for large-scale machine learning. In this work, we present two perspectives on handling large data. First, we consider the problem of large-scale multiclass classification. We introduce the task of multiclass classification and the challenge of classifying with a large number of classes. To alleviate these challenges, we propose an algorithm which reduces the original multiclass problem to an equivalent binary one. Based on this reduction technique, we introduce a scalable method to tackle the multiclass classification problem for a very large number of classes and perform detailed theoretical and empirical analyses. In the second part, we discuss the problem of distributed machine learning. In this domain, we introduce an asynchronous framework for performing distributed optimization. We present applications of the proposed asynchronous framework to two popular domains: matrix factorization for large-scale recommender systems and large-scale binary classification. In the case of matrix factorization, we perform Stochastic Gradient Descent (SGD) in an asynchronous distributed manner, whereas in the case of large-scale binary classification we use SVRG, a variant of SGD that uses a variance reduction technique, as our optimization algorithm.
Cornec, Matthieu. "Inégalités probabilistes pour l'estimateur de validation croisée dans le cadre de l'apprentissage statistique et Modèles statistiques appliqués à l'économie et à la finance." Phd thesis, Université de Nanterre - Paris X, 2009. http://tel.archives-ouvertes.fr/tel-00530876.
Sani, Amir. "Apprentissage automatique pour la prise de décisions." Thesis, Lille 1, 2015. http://www.theses.fr/2015LIL10038/document.
Strategic decision-making over valuable resources should consider risk-averse objectives. Many practical areas of application consider risk as central to decision-making. However, machine learning does not. As a result, research should provide insights and algorithms that endow machine learning with the ability to consider decision-theoretic risk. In particular, this concerns estimating decision-theoretic risk on short dependent sequences generated from the most general possible class of processes for statistical inference, and incorporating decision-theoretic risk objectives in sequential decision-making. This thesis studies these two problems to provide principled algorithmic methods for considering decision-theoretic risk in machine learning. An algorithm with state-of-the-art performance is introduced for accurate estimation of risk statistics on the most general class of stationary-ergodic processes, and risk-averse objectives are introduced in sequential decision-making (online learning) in both the stochastic multi-armed bandit setting and the adversarial full-information setting.
Ngo, Duy Hoa. "Amélioration de l'alignement d'ontologies par les techniques d'apprentissage automatique, d'appariement de graphes et de recherche d'information." Phd thesis, Université Montpellier II - Sciences et Techniques du Languedoc, 2012. http://tel.archives-ouvertes.fr/tel-00767318.
Aouini, Zied. "Traffic monitoring in home networks : from theory to practice." Thesis, La Rochelle, 2017. http://www.theses.fr/2017LAROS035/document.
Home networks are evolving continuously and becoming more and more complex. Their complexity has grown along two interrelated dimensions. On the one hand, the home network topology (devices and connectivity technologies) tends to produce more complex configurations. On the other hand, the set of services accessed through the home network is growing tremendously. This context has made home network management more challenging for both Internet Service Providers (ISPs) and end-users. In this dissertation, we focus on the traffic dimension of the complexity described above. Our first contribution consists in proposing an architecture for traffic monitoring in home networks. We provide a comparative study of some existing open source tools. Then, we perform a testbed evaluation of the main software components involved in our architecture. Based on the experimental results, we discuss several deployment limits and possibilities. In our second contribution, we conduct an analysis of residential traffic and usage based on a real trace involving more than 34 000 customers. First, we present our data collection and processing methodology. Second, we present our findings with respect to the characteristics of the different layers of the TCP/IP protocol stack. Then, we perform a subjective analysis across 645 residential customers. The results of both evaluations provide a complete synthesis of residential usage patterns and application characteristics. In our third contribution, we propose a novel scheme for real-time residential traffic classification. Our scheme, which is based on a machine learning approach called C5.0, aims to fill the gaps identified in the literature. To this end, our algorithm is evaluated using several traffic inputs. Then, we detail how we implemented a lightweight probe able to capture, track and finely identify applications running in the home network. This implementation allowed us to validate our design principles under realistic test conditions. The results clearly show the efficiency and feasibility of our solution.
Bubeck, Sébastien. "JEUX DE BANDITS ET FONDATIONS DU CLUSTERING." Phd thesis, Université des Sciences et Technologie de Lille - Lille I, 2010. http://tel.archives-ouvertes.fr/tel-00845565.
Buhot, Arnaud. "Etude de propriétés d'apprentissage supervisé et non supervisé par des méthodes de Physique Statistique." Phd thesis, Université Joseph Fourier (Grenoble), 1999. http://tel.archives-ouvertes.fr/tel-00001642.
Full textZantedeschi, Valentina. "A Unified View of Local Learning : Theory and Algorithms for Enhancing Linear Models." Thesis, Lyon, 2018. http://www.theses.fr/2018LYSES055/document.
In machine learning, data characteristics usually vary over the space: the overall distribution might be multi-modal and contain non-linearities. In order to achieve good performance, the learning algorithm should then be able to capture and adapt to these changes. Even though linear models fail to describe complex distributions, they are renowned for their scalability, at training and at testing time, to datasets that are large in terms of both number of examples and number of features. Several methods have been proposed to take advantage of the scalability and simplicity of linear hypotheses to build models with great discriminatory capabilities. These methods empower linear models, in the sense that they enhance their expressive power through different techniques. This dissertation focuses on enhancing local learning approaches, a family of techniques that infer models by capturing the local characteristics of the space in which the observations are embedded. The founding assumption of these techniques is that the learned model should behave consistently on examples that are close, implying that its results should also change smoothly over the space. Locality can be defined by spatial criteria (e.g. closeness according to a selected metric) or by other provided relations, such as association with the same category of examples or a shared attribute. Local learning approaches are known to be effective in capturing complex distributions of the data, avoiding the need to select a task-specific model. However, state-of-the-art techniques suffer from three major drawbacks: they easily memorize the training set, resulting in poor performance on unseen data; their predictions lack smoothness in particular locations of the space; and they scale poorly with the size of the datasets. The contributions of this dissertation address these pitfalls in two directions: we propose to introduce side information in the problem formulation to enforce smoothness in prediction and attenuate the memorization phenomenon; and we provide a new representation for the dataset which takes into account its local specificities and improves scalability. Thorough studies highlight the effectiveness of these contributions and confirm the soundness of the underlying intuitions. We empirically study the performance of the proposed methods on both toy and real tasks, in terms of accuracy and execution time, and compare it to state-of-the-art results. We also analyze our approaches from a theoretical standpoint, by studying their computational and memory complexities and by deriving tight generalization bounds.
Liakopoulos, Nikolaos. "Machine Learning Techniques for Online Resource Allocation in Wireless Networks." Electronic Thesis or Diss., Sorbonne université, 2019. http://www.theses.fr/2019SORUS529.
Traditionally, network optimization is used to provide good configurations in real network system problems based on mathematical models and statistical assumptions. Recently, this paradigm has been evolving, fueled by an explosion in the availability of data. The modern trend in networking problems is to tap into the power of data to extract models and deal with uncertainty. This thesis proposes algorithmic frameworks for wireless networks, based both on classical or data-driven optimization and on machine learning. We target two use cases: user association and cloud resource reservation. The baseline approach for user association, connecting wireless devices to the base station that provides the strongest signal, leads to very inefficient configurations even in current wireless networks. We focus on tailoring user association based on resource efficiency and service requirement satisfaction, depending on the underlying network demand. We first study distributed user association with priority QoS guarantees, then scalable centralized load balancing based on computational optimal transport, and finally robust user association based on approximate traffic prediction. Moving to the topic of cloud resource reservation, we develop a novel framework for resource reservation in worst-case scenarios, where the demand is engineered by an adversary aiming to harm our performance. We provide policies that have "no regret" and guarantee asymptotic feasibility in budget constraints under such workloads. More importantly, we expand to a general framework for online convex optimization (OCO) problems with long-term budget constraints, complementing the results of recent literature in OCO.
Mignacco, Francesca. "Statistical physics insights on the dynamics and generalisation of artificial neural networks." Thesis, Université Paris-Saclay, 2022. http://www.theses.fr/2022UPASP074.
Machine learning technologies have become ubiquitous in our daily lives. However, this field still remains largely empirical and its scientific stakes lack a deep theoretical understanding. This thesis explores the mechanisms underlying learning in artificial neural networks through the prism of statistical physics. In the first part, we focus on the static properties of learning problems, which we introduce in Chapter 1.1. In Chapter 1.2, we consider the prototype classification of a binary mixture of Gaussian clusters and we derive rigorous closed-form expressions for the errors in the infinite-dimensional regime, which we apply to shed light on the role of different problem parameters. In Chapter 1.3, we show how to extend the teacher-student perceptron model to encompass multi-class classification, deriving asymptotic expressions for the optimal performance and the performance of regularised empirical risk minimisation. In the second part, we turn our focus to the dynamics of learning, which we introduce in Chapter 2.1. In Chapter 2.2, we show how to track analytically the training dynamics of multi-pass stochastic gradient descent (SGD) via dynamical mean-field theory for generic non-convex loss functions and Gaussian mixture data. Chapter 2.3 presents a late-time analysis of the effective noise introduced by SGD in the underparametrised and overparametrised regimes. In Chapter 2.4, we take the sign retrieval problem as a benchmark highly non-convex optimisation problem and show that stochasticity is crucial to achieve perfect generalisation. The third part of the thesis contains the conclusions and some future perspectives.
Maillard, Odalric-Ambrym. "APPRENTISSAGE SÉQUENTIEL : Bandits, Statistique et Renforcement." Phd thesis, Université des Sciences et Technologie de Lille - Lille I, 2011. http://tel.archives-ouvertes.fr/tel-00845410.
Tshibala, Tshitoko Emmanuel. "Prédiction des efforts de test : une approche basée sur les seuils des métriques logicielles et les algorithmes d'apprentissage automatique." Thesis, 2019. http://depot-e.uqtr.ca/id/eprint/9431/1/eprint9431.pdf.