Dissertations / Theses on the topic 'Factorisation creuse de matrices'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 dissertations / theses for your research on the topic 'Factorisation creuse de matrices.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.
Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.
Ramet, Pierre. "Optimisation de la communication et de la distribution des données pour des solveurs parallèles directs en algèbre linéaire dense et creuse." Bordeaux 1, 2000. http://www.theses.fr/2000BOR10506.
Grigori, Laura. "Prédiction de structure et algorithmique parallèle pour la factorisation LU des matrices creuses." Nancy 1, 2001. http://www.theses.fr/2001NAN10264.
This dissertation treats parallel numerical computing, considering Gaussian elimination as used to solve large sparse nonsymmetric linear systems. Computations on sparse matrices usually have an initial phase that predicts the nonzero structure of the output, which helps to allocate memory, set up data structures and schedule parallel tasks prior to the numerical computation itself. To this end, we study structure prediction for sparse LU factorization with partial pivoting. We are mainly interested in identifying upper bounds on these structures that are as tight as possible. This structure prediction is then used in a phase called symbolic factorization, followed by a phase that performs the numerical computation of the factors, called numerical factorization. For very large matrices, a significant part of the overall memory space is needed by the structures used during the symbolic factorization, and this can prevent a swap-free execution of the LU factorization. We propose and study a parallel algorithm to decrease the memory requirements of the nonsymmetric symbolic factorization. For an efficient parallel execution of the numerical factorization, we consider the analysis and handling of the data dependency graphs resulting from the processing of sparse matrices. This analysis enables us to develop scalable algorithms that manage memory and computing resources effectively.
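The structure-prediction idea in this abstract can be illustrated with a toy sketch (not the thesis's algorithms, which handle nonsymmetric LU with partial pivoting): for a symmetric matrix, the fill-in of the factor can be predicted symbolically by simulating elimination on the adjacency graph. The function name and dense-input convention are illustrative choices.

```python
import numpy as np

def symbolic_fill(A):
    """Predict the nonzero positions (i, j), i < j, of the triangular
    factor of a symmetric sparse matrix by simulating elimination on
    its adjacency graph: each eliminated vertex connects its remaining
    neighbours, creating 'fill-in' beyond the pattern of A."""
    n = A.shape[0]
    adj = [set(np.nonzero(A[k])[0]) - {k} for k in range(n)]
    filled = set()
    for k in range(n):
        higher = {j for j in adj[k] if j > k}
        for j in higher:
            filled.add((k, j))
            adj[j] |= higher - {j}   # new edges among remaining neighbours
    return filled
```

On an "arrow"-like matrix, eliminating vertex 1 after vertex 0 creates the fill entry (1, 3) even though the corresponding entry of A is zero.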
Puglisi, Chiara. "Factorisation QR de grandes matrices creuses basée sur une méthode multifrontale dans un environnement multiprocesseur." Toulouse, INPT, 1993. http://www.theses.fr/1993INPT091H.
Guermouche, Abdou. "Étude et optimisation du comportement mémoire dans les méthodes parallèles de factorisation de matrices creuses." Lyon, École normale supérieure (sciences), 2004. http://www.theses.fr/2004ENSL0284.
Direct methods for solving sparse linear systems are known for their large memory requirements, which can be the limiting factor when solving large systems. The work done during this thesis concerns the study and optimization of the memory behaviour of a sparse direct method, the multifrontal method, in both the sequential and the parallel cases. Optimal memory minimization algorithms are proposed for the sequential case. For the parallel case, we introduce new scheduling strategies aiming at improving the memory behaviour of the method, and then extend these approaches to achieve good performance while keeping a good memory behaviour. In addition, when the data to be treated cannot fit into memory, out-of-core factorization schemes have to be designed. To be efficient, such approaches require overlapping I/O operations with computations and reusing the data sets already in memory to reduce the amount of I/O. Another part of the work presented in this thesis therefore concerns the design and study of implicit out-of-core techniques well adapted to the memory access pattern of the multifrontal method. These techniques are based on a modification of the standard paging policies of the operating system using a low-level tool (MMUM&MMUSSEL).
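A central data structure behind the multifrontal method mentioned above is the elimination tree, which drives both the task graph and the memory behaviour. A minimal sketch of Liu's algorithm (assuming a symmetric matrix given as a dense 0/1 pattern; the thesis itself works with far more elaborate parallel machinery):

```python
import numpy as np

def elimination_tree(A):
    """Elimination tree of a symmetric sparse matrix: parent[k] is the
    row index of the first subdiagonal nonzero of column k of the
    Cholesky factor. Sketch of Liu's algorithm with path compression."""
    n = A.shape[0]
    parent = [-1] * n
    ancestor = [-1] * n
    for j in range(n):
        for i in np.nonzero(A[j])[0]:
            if i >= j:
                continue
            r = int(i)
            # walk from i up to the current root, compressing the path
            while ancestor[r] not in (-1, j):
                nxt = ancestor[r]
                ancestor[r] = j
                r = nxt
            if ancestor[r] == -1:
                ancestor[r] = j
                parent[r] = j
    return parent
```

On the arrow-shaped example below, the tree is 0 → 1 → 3 and 2 → 3, with node 3 as the root.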
Amestoy, Patrick. "Factorisation de grandes matrices creuses non symétriques basée sur une méthode multifrontale dans un environnement multiprocesseur." Toulouse, INPT, 1990. http://www.theses.fr/1990INPT050H.
Zheng, Léon. "Frugalité en données et efficacité computationnelle dans l'apprentissage profond." Electronic Thesis or Diss., Lyon, École normale supérieure, 2024. http://www.theses.fr/2024ENSL0009.
This thesis focuses on two challenges of frugality and efficiency in modern deep learning: data frugality and computational resource efficiency. First, we study self-supervised learning, a promising approach in computer vision that does not require data annotations for learning representations. In particular, we propose a unification of several self-supervised objective functions under a framework based on rotation-invariant kernels, which opens up prospects for reducing the computational cost of these objective functions. Second, given that matrix multiplication is the predominant operation in deep neural networks, we focus on the construction of fast algorithms that allow matrix-vector multiplication with nearly linear complexity. More specifically, we examine the problem of sparse matrix factorization under the constraint of butterfly sparsity, a structure common to several fast transforms like the discrete Fourier transform. The thesis establishes new theoretical guarantees for butterfly factorization algorithms, and explores the potential of butterfly sparsity to reduce the computational costs of neural networks during their training or inference phase. In particular, we explore the efficiency of GPU implementations for butterfly sparse matrix multiplication, with the goal of truly accelerating sparse neural networks.
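Butterfly sparsity is exactly the structure underlying the fast Fourier transform: the DFT matrix factors into O(log n) sparse stages with two nonzeros per row. A minimal recursive sketch of this idea (assuming the input length is a power of two; not the GPU kernels studied in the thesis):

```python
import numpy as np

def fft_butterfly(x):
    """Radix-2 Cooley-Tukey FFT: each level of the recursion applies one
    'butterfly' stage, i.e. a sparse factor with 2 nonzeros per row,
    giving O(n log n) complexity instead of O(n^2).
    Assumes len(x) is a power of two."""
    n = len(x)
    if n == 1:
        return np.asarray(x, dtype=complex)
    even = fft_butterfly(x[::2])
    odd = fft_butterfly(x[1::2])
    tw = np.exp(-2j * np.pi * np.arange(n // 2) / n) * odd  # twiddle factors
    return np.concatenate([even + tw, even - tw])
```

The result matches `np.fft.fft`, which implements the same factorization iteratively.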
Agullo, Emmanuel. "Méthodes directes hors-mémoire (out-of-core) pour la résolution de systèmes linéaires creux de grande taille." Phd thesis, Ecole normale supérieure de lyon - ENS LYON, 2008. http://tel.archives-ouvertes.fr/tel-00563463.
Rouet, François-Henry. "Problèmes de mémoire et de performance de la factorisation multifrontale parallèle et de la résolution triangulaire à seconds membres creux." Phd thesis, Institut National Polytechnique de Toulouse - INPT, 2012. http://tel.archives-ouvertes.fr/tel-00785748.
Gouvert, Olivier. "Factorisation bayésienne de matrices pour le filtrage collaboratif." Thesis, Toulouse, INPT, 2019. https://oatao.univ-toulouse.fr/25879/1/Gouvert_Olivier.pdf.
In recent years, a lot of research has been devoted to recommender systems. The goal of these systems is to recommend to each user products that he/she may like, in order to facilitate his/her exploration of large catalogs of items. Collaborative filtering (CF) allows such recommendations to be made based on the past interactions of the users only. These data are stored in a matrix, where each entry corresponds to the feedback of a user on an item. This matrix is of very high dimension and extremely sparse, since each user has interacted with only a few items from the catalog. Implicit feedback is the easiest data to collect; it is usually available in the form of counts, corresponding to the number of times a user interacted with an item. Non-negative matrix factorization (NMF) techniques consist in approximating the feedback matrix by the product of two non-negative matrices. Thus, each user and each item is represented by a low-dimensional latent factor corresponding to its preferences or attributes respectively. The goal of this thesis is to develop Bayesian NMF methods which can directly model the overdispersed count data arising in CF. To do so, we first study Poisson factorization (PF) and present its limits for the processing of overdispersed data. To alleviate this problem, we propose two extensions of PF: negative binomial factorization (NBF) and discrete compound Poisson factorization (dcPF). In particular, dcPF leads to an interpretation of the variables especially suited to music recommendation. Then, we choose to work on quantized implicit data. This pre-processing simplifies the data, which become ordinal, and we propose a Bayesian NMF model for this kind of data, coined OrdNMF. We show that this model is also an extension of PF applied to pre-processed data. Finally, in the last chapter of this thesis, we focus on the well-known cold-start problem which affects CF techniques, and propose a matrix co-factorization model which allows us to solve this issue.
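As a point of reference for the Poisson factorization discussed above, NMF with the generalised Kullback-Leibler divergence is the maximum-likelihood estimator under a Poisson model of the counts. A minimal sketch using the classical Lee-Seung multiplicative updates (not the Bayesian NBF/dcPF/OrdNMF models of the thesis; rank, iteration count and initialization are illustrative):

```python
import numpy as np

def nmf_kl(V, k, iters=500, seed=0):
    """Multiplicative-update NMF for the generalised KL divergence,
    the maximum-likelihood objective when the count matrix V is
    modelled as Poisson with mean W @ H (Lee-Seung updates)."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, k)) + 0.1
    H = rng.random((k, n)) + 0.1
    eps = 1e-9                         # avoids division by zero
    for _ in range(iters):
        W *= ((V / (W @ H + eps)) @ H.T) / (H.sum(axis=1) + eps)
        H *= (W.T @ (V / (W @ H + eps))) / (W.sum(axis=0)[:, None] + eps)
    return W, H
```

On an exactly rank-2 non-negative matrix, the updates recover a near-exact factorization.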
Aleksandrova, Marharyta. "Factorisation de matrices et analyse de contraste pour la recommandation." Thesis, Université de Lorraine, 2017. http://www.theses.fr/2017LORR0080/document.
In many application areas, data elements can be high-dimensional. This raises the problem of dimensionality reduction. Dimensionality reduction techniques can be classified by their aim (optimal data representation vs. classification) as well as by the adopted strategy (feature selection vs. feature extraction). The set of features resulting from feature extraction methods is usually uninterpretable. The first scientific problem of the thesis is therefore: how can interpretable latent features be extracted? Dimensionality reduction for classification aims to enhance the classification power of the selected subset of features. We see the task of classification as a task of trigger-factor identification, that is, identification of those factors that can influence the transfer of data elements from one class to another. The second scientific problem of this thesis is: how can these trigger factors be identified automatically? We aim at solving both problems within the recommender systems application domain. We propose to interpret latent features of matrix factorization-based recommender systems as real users, and we design an algorithm for automatic identification of trigger factors based on the concepts of contrast analysis. Through experimental results, we show that the defined patterns can indeed be considered as trigger factors.
Aleksandrova, Marharyta. "Factorisation de matrices et analyse de contraste pour la recommandation." Electronic Thesis or Diss., Université de Lorraine, 2017. http://www.theses.fr/2017LORR0080.
In many application areas, data elements can be high-dimensional. This raises the problem of dimensionality reduction. Dimensionality reduction techniques can be classified by their aim (optimal data representation vs. classification) as well as by the adopted strategy (feature selection vs. feature extraction). The set of features resulting from feature extraction methods is usually uninterpretable. The first scientific problem of the thesis is therefore: how can interpretable latent features be extracted? Dimensionality reduction for classification aims to enhance the classification power of the selected subset of features. We see the task of classification as a task of trigger-factor identification, that is, identification of those factors that can influence the transfer of data elements from one class to another. The second scientific problem of this thesis is: how can these trigger factors be identified automatically? We aim at solving both problems within the recommender systems application domain. We propose to interpret latent features of matrix factorization-based recommender systems as real users, and we design an algorithm for automatic identification of trigger factors based on the concepts of contrast analysis. Through experimental results, we show that the defined patterns can indeed be considered as trigger factors.
Le, Magoarou Luc. "Matrices efficientes pour le traitement du signal et l'apprentissage automatique." Thesis, Rennes, INSA, 2016. http://www.theses.fr/2016ISAR0008/document.
Matrices, as the natural representation of linear mappings in finite dimension, play a crucial role in signal processing and machine learning. Multiplying a vector by a full-rank matrix a priori costs of the order of the number of non-zero entries in the matrix, in terms of arithmetic operations. However, there exist matrices that can be applied much faster, a property that is crucial to the success of certain linear transformations such as the Fourier transform or the wavelet transform. What property allows these matrices to be applied rapidly? Is it easy to verify? Can we approximate matrices with ones having this property? Can we estimate matrices having this property? This thesis investigates these questions, exploring applications such as learning dictionaries with efficient implementations, accelerating the resolution of inverse problems, or the Fast Fourier Transform on graphs.
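A concrete instance of such "fast" matrices is the Hadamard matrix, which factors into log2(n) sparse butterfly factors, so applying it costs O(n log n) instead of O(n^2). A small sketch of this factorization (an illustration of the general idea, not the dictionary-learning algorithms of the thesis):

```python
import numpy as np

def hadamard_factors(k):
    """Write the 2^k x 2^k Hadamard matrix H2^{(x)k} (k-fold Kronecker
    power of [[1,1],[1,-1]]) as a product of k sparse 'butterfly'
    factors, each with only 2 nonzeros per row."""
    H2 = np.array([[1.0, 1.0], [1.0, -1.0]])
    return [np.kron(np.kron(np.eye(2 ** i), H2), np.eye(2 ** (k - 1 - i)))
            for i in range(k)]
```

The identity behind it is the Kronecker mixed-product property: (A ⊗ I)(I ⊗ B) = A ⊗ B, applied level by level.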
Pralet, Stéphane. "Ordonnancement sous contraintes et séquencement en algèbre linéaire creuse parallèle." Toulouse, INPT, 2004. http://www.theses.fr/2004INPT033H.
Benner, Peter, and Thomas Mach. "On the QR Decomposition of H-Matrices." Universitätsbibliothek Chemnitz, 2009. http://nbn-resolving.de/urn:nbn:de:bsz:ch1-200901420.
Rigaud, François. "Modèles de signaux musicaux informés par la physique des instruments : Application à l'analyse automatique de musique pour piano par factorisation en matrices non-négatives." Thesis, Paris, ENST, 2013. http://www.theses.fr/2013ENST0073/document.
This thesis introduces new models of music signals informed by the physics of the instruments. While instrumental acoustics and audio signal processing target the modeling of musical tones from different perspectives (modeling of the production mechanism of the sound vs. modeling of generic "morphological" features of the sound), this thesis aims at mixing both approaches by constraining generic signal models with acoustics-based information. It is thus intended to design instrument-specific models for applications both in acoustics (learning of parameters related to the design and the tuning) and in signal processing (transcription). In particular, we focus on piano music analysis, for which the tones have the well-known property of inharmonicity. The inclusion of such a property in signal models however makes the optimization harder, and may even damage performance in tasks such as music transcription when compared to a simpler harmonic model. A major goal of this thesis is thus to better understand the issues arising from the explicit inclusion of inharmonicity in signal models, and to investigate whether it is really valuable for tasks such as polyphonic music transcription.
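The inharmonicity property discussed above is classically captured by the stiff-string model, in which the k-th partial is sharpened relative to the harmonic series. A small sketch of this standard formula (the coefficient B and the values below are illustrative):

```python
import numpy as np

def partial_frequencies(f0, B, n_partials):
    """Partial frequencies of a stiff string under the standard
    inharmonicity model: f_k = k * f0 * sqrt(1 + B * k^2), where B >= 0
    is the string-dependent inharmonicity coefficient (B = 0 recovers
    the purely harmonic series)."""
    k = np.arange(1, n_partials + 1)
    return k * f0 * np.sqrt(1.0 + B * k ** 2)
```

With B > 0, every partial is slightly sharper than its harmonic counterpart, and the deviation grows with k.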
Rigaud, François. "Modèles de signaux musicaux informés par la physique des instruments : Application à l'analyse automatique de musique pour piano par factorisation en matrices non-négatives." Electronic Thesis or Diss., Paris, ENST, 2013. http://www.theses.fr/2013ENST0073.
This thesis introduces new models of music signals informed by the physics of the instruments. While instrumental acoustics and audio signal processing target the modeling of musical tones from different perspectives (modeling of the production mechanism of the sound vs. modeling of generic "morphological" features of the sound), this thesis aims at mixing both approaches by constraining generic signal models with acoustics-based information. It is thus intended to design instrument-specific models for applications both in acoustics (learning of parameters related to the design and the tuning) and in signal processing (transcription). In particular, we focus on piano music analysis, for which the tones have the well-known property of inharmonicity. The inclusion of such a property in signal models however makes the optimization harder, and may even damage performance in tasks such as music transcription when compared to a simpler harmonic model. A major goal of this thesis is thus to better understand the issues arising from the explicit inclusion of inharmonicity in signal models, and to investigate whether it is really valuable for tasks such as polyphonic music transcription.
Dia, Nafissa. "Suivi non-invasif du rythme cardiaque foetal : exploitation de la factorisation non-négative des matrices sur signaux électrocardiographiques et phonocardiographiques." Thesis, Université Grenoble Alpes (ComUE), 2019. http://www.theses.fr/2019GREAS034.
With more than 200,000 births per day in the world, monitoring fetal well-being during birth is a major clinical challenge. This monitoring is done by analyzing the fetal heart rate (FHR) and its variability; it has to be robust while minimizing the number of non-invasive sensors placed on the mother's abdomen. In this context, electrocardiogram (ECG) and phonocardiogram (PCG) signals are of interest, since they bring both redundant and complementary cardiac information. This multimodality, as well as features of ECG and PCG signals such as quasi-periodicity, have been exploited. Several approaches were compared, all based on non-negative matrix factorization (NMF), a matrix decomposition algorithm well suited to physiological signals. The final solution proposed for FHR estimation is based on a source-filter model of real fetal ECG or PCG signals, previously extracted, allowing an estimation of the fundamental frequency by NMF. The approach was evaluated on a clinical database of ECG and PCG signals from pregnant women, and the FHR results were validated by comparison with cardiotocography, the clinical reference technique.
Zuniga, Anaya Juan Carlos. "Algorithmes numériques pour les matrices polynomiales avec applications en commande." Phd thesis, INSA de Toulouse, 2005. http://tel.archives-ouvertes.fr/tel-00010911.
Zúñiga, Anaya Juan Carlos. "Algorithmes numériques pour les matrices polynomiales avec applications en commande." Toulouse, INSA, 2005. http://www.theses.fr/2005ISAT0013.
In this thesis we develop new numerical algorithms for polynomial matrices. We tackle the problem of computing the eigenstructure (rank, null-space, finite and infinite structures) of a polynomial matrix and apply the obtained results to the matrix polynomial J-spectral factorization problem. We also present some applications of these algorithms in control theory. All the new algorithms presented here are based on the computation of the constant null-spaces of block Toeplitz matrices associated with the analysed polynomial matrix. For computing these null-spaces we apply standard numerical linear algebra methods such as the singular value decomposition or the QR factorization, and we also study the application of fast methods like the generalized Schur method for structured matrices. We analyze the presented algorithms in terms of algorithmic complexity and numerical stability, and present some comparisons with other algorithms from the literature.
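The block Toeplitz construction at the heart of these algorithms can be sketched in a few lines: stacking the coefficients of P(s) so that constant null-space vectors of the Toeplitz matrix encode polynomial null-space vectors of P(s). A toy illustration (the thesis's algorithms add SVD/QR-based rank decisions and fast structured variants):

```python
import numpy as np

def block_toeplitz(coeffs, depth):
    """Stack the coefficients P0..Pd of a polynomial matrix
    P(s) = P0 + P1*s + ... + Pd*s^d into the banded block Toeplitz
    matrix whose constant null-space encodes the coefficients of
    polynomial null-space vectors of P(s) of degree < depth."""
    m, n = coeffs[0].shape
    d = len(coeffs) - 1
    T = np.zeros(((d + depth) * m, depth * n))
    for j in range(depth):
        for i, Pi in enumerate(coeffs):
            T[(i + j) * m:(i + j + 1) * m, j * n:(j + 1) * n] = Pi
    return T
```

For P(s) = [1  s], whose null-space is spanned by [-s, 1]^T, the depth-2 Toeplitz matrix has nullity exactly 1.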
Boysson, Marianne de. "Contribution à l'algorithmique non commutative." Rouen, 1999. http://www.theses.fr/1999ROUES002.
Bettache, Nayel. "Matrix-valued Time Series in High Dimension." Electronic Thesis or Diss., Institut polytechnique de Paris, 2024. http://www.theses.fr/2024IPPAG002.
The objective of this thesis is to model matrix-valued time series in a high-dimensional framework; to this end, the entire study is presented in a non-asymptotic framework. We first provide a test procedure capable of distinguishing whether the covariance matrix of centered random vectors with a stationary distribution is equal to the identity or has a sparse Toeplitz structure. Secondly, we propose an extension of low-rank matrix linear regression to a regression model with two matrix parameters, which create correlations between the rows and the columns of the output random matrix. Finally, we introduce and estimate a dynamic topic model where the expected value of the observations factorizes into a static matrix and a time-dependent matrix following a simplex-valued auto-regressive process of order one.
Zhu, Fei. "Kernel nonnegative matrix factorization : application to hyperspectral imagery." Thesis, Troyes, 2016. http://www.theses.fr/2016TROY0024/document.
This thesis aims to propose new nonlinear unmixing models within the framework of kernel methods, and to develop associated algorithms, in order to address the hyperspectral unmixing problem. First, we investigate a novel kernel-based nonnegative matrix factorization (NMF) model that circumvents the pre-image problem inherent to kernel machines. Within the proposed framework, several extensions are developed to incorporate common constraints raised in hyperspectral image analysis. In order to handle large-scale and streaming data, we then extend the kernel-based NMF to an online fashion, keeping the complexity fixed and tractable. Moreover, we propose a bi-objective NMF model as an attempt to combine the linear and nonlinear unmixing models, in which the decompositions of the conventional NMF and the kernel-based NMF are performed simultaneously. The last part of this thesis studies a supervised unmixing model based on the correntropy maximization principle, which is shown to be robust to outlier bands. Two correntropy-based unmixing problems are addressed, considering different constraints of the hyperspectral unmixing problem, and the alternating direction method of multipliers (ADMM) is investigated to solve the related optimization problems.
Gao, Sheng. "Latent factor models for link prediction problems." Paris 6, 2012. http://www.theses.fr/2012PA066056.
With the rise of the Internet and modern social media, relational data has become ubiquitous: data in which objects are linked to each other by various relation types. Accordingly, relational learning techniques have been studied in a large variety of applications with relational data, such as recommender systems, social network analysis, Web mining or bioinformatics. Among the wide range of tasks encompassed by relational learning, we address the problem of link prediction in this thesis. Link prediction has arisen as a fundamental task in relational learning: predicting the presence or absence of links between objects, based on the topological structure of the network and/or the attributes of objects. However, the complexity and sparsity of network structure make this a highly challenging problem. In this thesis, we propose solutions to reduce the difficulties in learning and fit various models to the corresponding applications. In Chapter 3 we present a unified framework of latent factor models to address the generic link prediction problem, in which we discuss various configurations of the models from a computational perspective and a probabilistic view. Then, according to the applications addressed in this dissertation, we propose different latent factor models for two classes of link prediction problems: (i) structural link prediction and (ii) temporal link prediction. For the structural link prediction problem, in Chapter 4 we define a new task called Link Pattern Prediction (LPP) in multi-relational networks. By introducing a specific latent factor for different relation types, in addition to latent feature factors characterizing objects, we develop a computational tensor factorization model, and its probabilistic version with a Bayesian treatment, to reveal the intrinsic causality of interaction patterns in multi-relational networks.
Moreover, considering the complex structural patterns in relational data, in Chapter 5 we propose a novel model that simultaneously incorporates the effect of latent feature factors and the impact of latent cluster structures in the network, and we develop an optimization transfer algorithm to facilitate the model learning procedure. For the temporal link prediction problem in time-evolving networks, in Chapter 6 we propose a unified latent factor model which integrates multiple information sources in the network (the global network structure, the content of objects and the graph proximity information) to capture the time-evolving patterns of links. This joint model is built on matrix factorization and graph regularization techniques. Each model proposed in this thesis achieves state-of-the-art performance; extensive experiments are conducted on real-world datasets to demonstrate significant improvements over baseline methods. Almost all of them have been published in international or national peer-reviewed conference proceedings.
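The simplest member of the latent factor family used throughout this dissertation is a plain low-rank factorization of the adjacency matrix, fitted by gradient descent; link scores are then read off the reconstruction. A minimal sketch (function name, learning rate and regularization are illustrative choices; the thesis's models are tensor-based and Bayesian):

```python
import numpy as np

def mf_link_scores(A, k, steps=2000, lr=0.05, reg=0.01, seed=0):
    """Score links by factorising the adjacency matrix A ~ U @ V.T with
    latent factors of dimension k, fitted by plain gradient descent on
    the squared reconstruction error plus L2 regularization."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    U = rng.normal(scale=0.1, size=(n, k))
    V = rng.normal(scale=0.1, size=(n, k))
    for _ in range(steps):
        E = A - U @ V.T                # residual on all entries
        U += lr * (E @ V - reg * U)
        V += lr * (E.T @ U - reg * V)
    return U @ V.T                     # scores: higher = more likely link
```

On a rank-2 block adjacency matrix, the reconstruction recovers the block structure almost exactly.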
Hajlaoui, Ayoub. "Emotion recognition and brain activity synchronization across individuals." Thesis, Sorbonne université, 2018. http://www.theses.fr/2018SORUS623.
Affective computing needs a better understanding of human emotion elicitation. Most contributions use modalities such as speech or facial expressions, which are limited by the fact that they can be consciously altered. Physiological signals such as EEG (electro-encephalography) are an interesting alternative: EEG can reveal macroscopically invisible emotional states, and has already proved efficient in emotion classification. This thesis falls within this context. EEG signals, recorded from participants while they watch video excerpts that elicit different emotions, are analysed in the time-frequency domain using variants of Nonnegative Matrix Factorization (NMF). This method decomposes an EEG spectrogram into a product of two matrices: a dictionary of frequency atoms and an activation matrix. The focus is on a variant named Group NMF. In this thesis, we also study Inter-Subject Correlation (ISC), which measures the correlation of the EEG signals of two subjects exposed to the same stimuli; the idea is to link the ISC level to the nature of the elicited emotion. Understanding the link between ISC and the elicited emotion then makes it possible to design Group NMF methods adapted to EEG-based emotion recognition.
Le, Quoc Tung. "Algorithmic and theoretical aspects of sparse deep neural networks." Electronic Thesis or Diss., Lyon, École normale supérieure, 2023. http://www.theses.fr/2023ENSL0105.
Sparse deep neural networks offer a compelling practical opportunity to reduce the cost of training, inference and storage, which are growing exponentially in the state of the art of deep learning. In this thesis, we introduce an approach to study sparse deep neural networks through the lens of a related problem: sparse matrix factorization, i.e., the problem of approximating a (dense) matrix by a product of (multiple) sparse factors. In particular, we identify and investigate in detail some theoretical and algorithmic aspects of a variant of sparse matrix factorization named fixed support matrix factorization (FSMF), in which the sets of non-zero entries of the sparse factors are known. Several fundamental questions about sparse deep neural networks, such as the existence of optimal solutions of the training problem or topological properties of its function space, can be addressed using results on FSMF. In addition, by applying these results, we also study the butterfly parametrization, an approach that consists of replacing (large) weight matrices by products of extremely sparse and structured ones in sparse deep neural networks.
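The (FSMF) problem named above can be sketched directly: fix boolean support masks for the two factors and minimise the residual by projected gradient descent. A toy illustration under assumed triangular supports (the thesis analyses this problem theoretically; this naive solver is not prescribed by it):

```python
import numpy as np

def fixed_support_mf(A, S1, S2, steps=30000, lr=0.01, seed=0):
    """Fixed support matrix factorization (FSMF) sketch: minimise
    ||A - X @ Y||_F^2 with the nonzeros of X and Y restricted to the
    boolean support masks S1 and S2, via projected gradient descent."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=S1.shape) * S1
    Y = rng.normal(size=S2.shape) * S2
    for _ in range(steps):
        E = X @ Y - A
        X -= lr * (E @ Y.T) * S1       # gradient projected on support S1
        Y -= lr * (X.T @ E) * S2       # gradient projected on support S2
    return X, Y
```

With triangular supports this behaves like fitting a structured (UL-type) factorization; for supports where no exact factorization exists, only an approximation is reached.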
Diop, Mamadou. "Décomposition booléenne des tableaux multi-dimensionnels de données binaires : une approche par modèle de mélange post non-linéaire." Thesis, Université de Lorraine, 2018. http://www.theses.fr/2018LORR0222/document.
This work is dedicated to the study of boolean decompositions of binary multidimensional arrays using a post nonlinear mixture model. In the first part, we introduce a new approach for the boolean factorization of binary matrices (BFBM) based on a post nonlinear mixture model. Unlike existing binary matrix factorization methods, the proposed method is equivalent to the boolean factorization model when the matrices are strictly binary, and thus gives more interpretable results in the case of correlated sources and lower-rank matrix approximations compared to other state-of-the-art algorithms. A necessary and sufficient condition for the uniqueness of the BFBM is also provided. Two algorithms based on multiplicative update rules are proposed and tested in numerical simulations, as well as on a real dataset. The generalization of this approach to binary multidimensional arrays (tensors) leads to the boolean factorization of binary tensors (BFBT). The proof of the necessary and sufficient condition for the boolean decomposition of binary tensors is based on a notion of boolean independence of binary vectors. The multiplicative algorithm based on the post nonlinear mixture model is extended to the multidimensional case, and we also propose a new algorithm based on an AO-ADMM (Alternating Optimization-ADMM) strategy. These algorithms are compared to state-of-the-art algorithms on simulated and on real data.
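The boolean product underlying these factorizations differs from the ordinary matrix product only by a saturation: sums are replaced by logical ORs. A minimal sketch (a generic illustration of the model, not the BFBM/BFBT algorithms themselves):

```python
import numpy as np

def boolean_product(U, V):
    """Boolean matrix product: (U o V)[i, j] = OR_k (U[i, k] AND V[k, j]).
    The OR acts as a saturation of the ordinary integer product, which
    is the 'post nonlinear' behaviour exploited in the thesis model."""
    return ((U.astype(int) @ V.astype(int)) > 0).astype(int)
```

Where several latent components overlap, the ordinary product counts them (giving 2, 3, ...) while the boolean product saturates at 1.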
Doreille, Mathias. "Athapascan-1 : vers un modèle de programmation parallèle adapté au calcul scientifique." Phd thesis, Grenoble INPG, 1999. http://tel.archives-ouvertes.fr/tel-00004825.
Diop, Mamadou. "Décomposition booléenne des tableaux multi-dimensionnels de données binaires : une approche par modèle de mélange post non-linéaire." Electronic Thesis or Diss., Université de Lorraine, 2018. http://www.theses.fr/2018LORR0222.
This work is dedicated to the study of boolean decompositions of binary multidimensional arrays using a post nonlinear mixture model. In the first part, we introduce a new approach for the boolean factorization of binary matrices (BFBM) based on a post nonlinear mixture model. Unlike existing binary matrix factorization methods, the proposed method is equivalent to the boolean factorization model when the matrices are strictly binary, and thus gives more interpretable results in the case of correlated sources and lower-rank matrix approximations compared to other state-of-the-art algorithms. A necessary and sufficient condition for the uniqueness of the BFBM is also provided. Two algorithms based on multiplicative update rules are proposed and tested in numerical simulations, as well as on a real dataset. The generalization of this approach to binary multidimensional arrays (tensors) leads to the boolean factorization of binary tensors (BFBT). The proof of the necessary and sufficient condition for the boolean decomposition of binary tensors is based on a notion of boolean independence of binary vectors. The multiplicative algorithm based on the post nonlinear mixture model is extended to the multidimensional case, and we also propose a new algorithm based on an AO-ADMM (Alternating Optimization-ADMM) strategy. These algorithms are compared to state-of-the-art algorithms on simulated and on real data.
Montcuquet, Anne-Sophie. "Imagerie spectrale pour l'étude de structures profondes par tomographie optique diffusive de fluorescence." Phd thesis, Université de Grenoble, 2010. http://tel.archives-ouvertes.fr/tel-00557141.
Full textHamdi-Larbi, Olfa. "Étude de la Distribution, sur Système à Grande Échelle, de Calcul Numérique Traitant des Matrices Creuses Compressées." Phd thesis, Université de Versailles-Saint Quentin en Yvelines, 2010. http://tel.archives-ouvertes.fr/tel-00693322.
Full textArchid, Atika. "Méthodes par blocs adaptées aux matrices structurées et au calcul du pseudo-inverse." Thesis, Littoral, 2013. http://www.theses.fr/2013DUNK0394/document.
Full textWe study, in this thesis, some numerical block Krylov subspace methods. These methods preserve geometric properties of the reduced matrix (Hamiltonian, skew-Hamiltonian or symplectic). Among these methods, we focus on the block symplectic Arnoldi method, namely the block J-Arnoldi algorithm. Our main goal is to study this method, theoretically and numerically, using ℝ^(2n×2s) as a free module over (ℝ^(2s×2s), +, ×), with s ≪ n the block size. A second aim is to study the approximation of exp(A)V, where A is a real Hamiltonian and skew-symmetric matrix of size 2n × 2n and V a rectangular matrix of size 2n × 2s, on the block Krylov subspace Km(A, V) = blockspan{V, AV, ..., A^(m-1)V}, which preserves the structure of the initial matrix. This approximation is required in many applications; for example, it is important for solving systems of ordinary differential equations (ODEs) or time-dependent partial differential equations (PDEs). We also present a block symplectic structure-preserving Lanczos method, namely the block J-Lanczos algorithm. Our approach is based on a block J-tridiagonalization procedure of a structured matrix. We propose algorithms based on two normalization methods: the SR factorization and the RJR factorization. In the last part, we propose a generalized algorithm of Greville's method for iteratively computing the Moore-Penrose inverse of a rectangular real matrix. Our purpose is to give a block version of Greville's method. All methods are accompanied by many numerical examples
Bertin, Nancy. "Les factorisations en matrices non-négatives : approches contraintes et probabilistes, application à la transcription automatique de musique polyphonique." Phd thesis, Télécom ParisTech, 2009. http://tel.archives-ouvertes.fr/tel-00472896.
Full textGriesner, Jean-Benoit. "Systèmes de recommandation de POI à large échelle." Electronic Thesis or Diss., Paris, ENST, 2018. http://www.theses.fr/2018ENST0037.
Full textThe task of points-of-interest (POI) recommendation has become an essential feature in location-based social networks. However, it remains a challenging problem because of specific constraints of these networks. In this thesis I investigate new approaches to solve the personalized POI recommendation problem. Three main contributions are proposed in this work. The first contribution is a new matrix factorization model that integrates geographical and temporal influences. This model is based on a specific processing of geographical data. The second contribution is an innovative solution to the implicit feedback problem. This problem corresponds to the difficulty of distinguishing, among unvisited POIs, the actual "unknown" ones from the "negative" ones. Finally, the third contribution of this thesis is a new method to generate recommendations with large-scale datasets. In this approach I propose to combine a new geographical clustering algorithm with users' implicit social influences in order to define local and global mobility scales
Fuentes, Benoît. "L'analyse probabiliste en composantes latentes et ses adaptations aux signaux musicaux : application à la transcription automatique de musique et à la séparation de sources." Thesis, Paris, ENST, 2013. http://www.theses.fr/2013ENST0011/document.
Full textAutomatic music transcription consists in automatically estimating the notes in a recording, through three attributes: onset time, duration and pitch. To address this problem, there is a class of methods based on modeling a signal as a sum of basic elements carrying symbolic information. Among these analysis techniques, one can find probabilistic latent component analysis (PLCA). The purpose of this thesis is to propose variants and improvements of PLCA, so that it can better adapt to musical signals and thus better address the problem of transcription. To this aim, a first approach is to put forward new models of signals, instead of the model inherent to PLCA, expressive enough that they can adapt to musical notes having variations of both pitch and spectral envelope over time. A second aspect of this work is to provide tools to help the parameter estimation algorithm converge towards meaningful solutions, through the incorporation of prior knowledge about the signals to be analyzed, as well as a new dynamic model. All the devised algorithms are applied to the task of automatic transcription. They can also be directly used for source separation, which consists in separating several sources from a mixture, and two applications are put forward in this direction
Ravel, Sylvain. "Démixage d’images hyperspectrales en présence d’objets de petite taille." Thesis, Ecole centrale de Marseille, 2017. http://www.theses.fr/2017ECDM0006/document.
Full textThis thesis is devoted to the unmixing issue in hyperspectral images, especially in the presence of small-sized objects. Hyperspectral images contain an important amount of both spectral and spatial information. Each pixel of the image can be assimilated to the reflection spectrum of the imaged scene. Due to the sensors' low spatial resolution, the observed spectra are a mixture of the reflection spectra of the different materials present in the pixel. The unmixing issue consists in estimating those materials' spectra, called endmembers, and their corresponding abundances in each pixel. Numerous unmixing methods have been proposed, but they fail when an endmember is rare (that is to say, an endmember present in only a few of the pixels). We call rare pixels the pixels containing those endmembers. The presence of those rare endmembers can be seen as anomalies that we want to detect and unmix. First, we present two detection methods to retrieve these anomalies. The first one uses a thresholding criterion on the reconstruction error from estimated dominant endmembers. The second one is based on the wavelet transform. Then we propose an unmixing method adapted to the case where some endmembers are known a priori. This method is then used with the presented detection methods to propose an algorithm to unmix the rare pixels' endmembers. Finally, we study the application of the bootstrap resampling method to artificially upsample rare pixels and propose unmixing methods in the presence of small-sized targets
Roig, Rodelas Roger. "Chemical characterization, sources and origins of secondary inorganic aerosols measured at a suburban site in Northern France." Thesis, Lille 1, 2018. http://www.theses.fr/2018LIL1R017/document.
Full textTropospheric fine particles with aerodynamic diameters less than 2.5 µm (PM2.5) may impact health, climate and ecosystems. Secondary inorganic (SIA) and organic aerosols (OA) contribute largely to PM2.5. To understand their formation and origin, a 1-year campaign (August 2015 to July 2016) measuring inorganic precursor gases and PM2.5 water-soluble ions was performed at an hourly resolution at a suburban site in northern France using a MARGA 1S, complemented by mass concentrations of PM2.5, black carbon, nitrogen oxides and trace elements. The highest levels of ammonium nitrate (AN) and sulfate were observed at night in spring and during daytime in summer, respectively. A source apportionment study performed by positive matrix factorization (PMF) determined 8 source factors, 3 having a regional origin (sulfate-rich, nitrate-rich, marine) contributing 73-78% of the PM2.5 mass, and 5 a local one (road traffic, biomass combustion, metal industry background, local industry and dust) contributing 22-27%. In addition, a HR-ToF-AMS (aerosol mass spectrometer) and an SMPS (particle sizer) were deployed during an intensive winter campaign, to gain further insight into OA composition and new particle formation, respectively. The application of PMF to the AMS OA mass spectra allowed the identification of 5 source factors: hydrocarbon-like (15%), cooking-like (11%), oxidized biomass burning (25%), and less- and more-oxidized oxygenated factors (16% and 33%, respectively). Combining the SMPS size distribution with the chemical speciation of the aerosols and precursor gases allowed the identification of nocturnal new particle formation (NPF) events associated with the formation of SIA, in particular AN
Redko, Ievgen. "Nonnegative matrix factorization for transfer learning." Thesis, Sorbonne Paris Cité, 2015. http://www.theses.fr/2015USPCD059.
Full textThe ability of a human being to extrapolate previously gained knowledge to other domains inspired a new family of methods in machine learning called transfer learning. Transfer learning is often based on the assumption that objects in both target and source domains share some common feature and/or data space. If this assumption is false, most transfer learning algorithms are likely to fail. In this thesis we propose to investigate the problem of transfer learning from both theoretical and applied points of view. First, we present two different methods to solve the problem of unsupervised transfer learning based on non-negative matrix factorization techniques. The first one proceeds using an iterative optimization procedure that aims at aligning the kernel matrices calculated based on the data from two tasks. The second one represents a linear approach that aims at discovering an embedding for two tasks that decreases the distance between the corresponding probability distributions while preserving the non-negativity property. We also introduce a theoretical framework based on Hilbert-Schmidt embeddings that allows us to improve the current state-of-the-art theoretical results on transfer learning by introducing a natural and intuitive distance measure with strong computational guarantees for its estimation. The proposed results combine the tightness of data-dependent bounds derived from Rademacher learning theory while ensuring the efficient estimation of its key factors. Both theoretical contributions and the proposed methods were evaluated on a benchmark computer vision dataset with promising results. Finally, we believe that the research direction chosen in this thesis may have fruitful implications in the near future
Filstroff, Louis. "Contributions to probabilistic non-negative matrix factorization - Maximum marginal likelihood estimation and Markovian temporal models." Thesis, Toulouse, INPT, 2019. http://www.theses.fr/2019INPT0143.
Full textNon-negative matrix factorization (NMF) has become a popular dimensionality reduction technique, and has found applications in many different fields, such as audio signal processing, hyperspectral imaging, or recommender systems. In its simplest form, NMF aims at finding an approximation of a non-negative data matrix (i.e., with non-negative entries) as the product of two non-negative matrices, called the factors. One of these two matrices can be interpreted as a dictionary of characteristic patterns of the data, and the other one as activation coefficients of these patterns. This low-rank approximation is traditionally retrieved by optimizing a measure of fit between the data matrix and its approximation. As it turns out, for many choices of measures of fit, the problem can be shown to be equivalent to the joint maximum likelihood estimation of the factors under a certain statistical model describing the data. This leads us to an alternative paradigm for NMF, where the learning task revolves around probabilistic models whose observation density is parametrized by the product of non-negative factors. This general framework, coined probabilistic NMF, encompasses many well-known latent variable models of the literature, such as models for count data. In this thesis, we consider specific probabilistic NMF models in which a prior distribution is assumed on the activation coefficients, but the dictionary remains a deterministic variable. The objective is then to maximize the marginal likelihood in these semi-Bayesian NMF models, i.e., the integrated joint likelihood over the activation coefficients. This amounts to learning the dictionary only; the activation coefficients may be inferred in a second step if necessary. We proceed to study in greater depth the properties of this estimation process. In particular, two scenarios are considered. In the first one, we assume the independence of the activation coefficients sample-wise. Previous experimental work showed that dictionaries learned with this approach exhibited a tendency to automatically regularize the number of components, a favorable property which was left unexplained. In the second one, we lift this standard assumption, and consider instead Markov structures to add statistical correlation to the model, in order to better analyze temporal data
Limem, Abdelhakim. "Méthodes informées de factorisation matricielle non négative : Application à l'identification de sources de particules industrielles." Thesis, Littoral, 2014. http://www.theses.fr/2014DUNK0432/document.
Full textNMF methods aim to factorize a non-negative observation matrix X as the product X = G.F between two non-negative matrices G and F. Although these approaches have been studied with great interest in the scientific community, they often suffer from a lack of robustness to the data and to initial conditions, and provide multiple solutions. To this end, and in order to reduce the space of admissible solutions, the work proposed in this thesis aims to inform NMF, thus placing our work between regression and classic blind factorization. In addition, some cost functions called parametric αβ-divergences are used, so that the resulting NMF methods are robust to outliers in the data. Three types of constraints are introduced on the matrix F, i.e., (i) the "exact" or "bounded" knowledge of some components, and (ii) the sum to 1 of each line of F. Update rules are proposed so that all these constraints are taken into account, by mixing multiplicative methods with projection. Moreover, we propose to constrain the structure of the matrix G by the use of a physical model, in order to discern sources which are influential at the receiver. The considered application, consisting of source identification of particulate matter in the air around an industrial area on the French northern coast, showed the interest of the proposed methods. Through a series of experiments on both synthetic and real data, we show the contribution of the different pieces of information in making the factorization results more consistent in terms of physical interpretation and less dependent on the initialization
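The sum-to-one constraint on the lines of F mentioned above can be sketched by mixing plain multiplicative updates with a projection step (a minimal Euclidean-cost sketch on toy data; the thesis uses parametric αβ-divergences and further constraints, none of which are reproduced here):

```python
import random

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col))
             for col in zip(*B)] for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

def nmf_rows_sum_to_one(X, r, iters=500, eps=1e-9):
    """Multiplicative updates for X ~ G.F (Euclidean cost), where each
    row of F is projected back onto the sum-to-one constraint
    (each line of F summing to 1) after every update."""
    random.seed(0)  # deterministic toy run
    n, m = len(X), len(X[0])
    G = [[random.random() + 0.1 for _ in range(r)] for _ in range(n)]
    F = [[random.random() + 0.1 for _ in range(m)] for _ in range(r)]
    for _ in range(iters):
        GtX = matmul(transpose(G), X)
        GtGF = matmul(transpose(G), matmul(G, F))
        F = [[F[i][j] * GtX[i][j] / (GtGF[i][j] + eps)
              for j in range(m)] for i in range(r)]
        F = [[v / (sum(row) + eps) for v in row] for row in F]  # projection
        XFt = matmul(X, transpose(F))
        GFFt = matmul(G, matmul(F, transpose(F)))
        G = [[G[i][j] * XFt[i][j] / (GFFt[i][j] + eps)
              for j in range(r)] for i in range(n)]
    return G, F

# Toy data built from factors whose F-rows already sum to 1.
X = [[0.7, 0.3, 0.0],
     [0.2, 0.4, 1.4],
     [0.8, 0.5, 0.7]]
G_hat, F_hat = nmf_rows_sum_to_one(X, 2)
```

The scale removed from F by the projection is absorbed by G at the next update, so on an exactly factorable toy matrix the constraint costs nothing.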
Bassomo, Pierre. "Contribution à la parallélisation de méthodes numériques à matrices creuses skyline. Application à un module de calcul de modes et fréquences propres de Systus." Phd thesis, Ecole Nationale Supérieure des Mines de Saint-Etienne, 1999. http://tel.archives-ouvertes.fr/tel-00822654.
Full text
Dridi, Marwa. "Sur les méthodes rapides de résolution de systèmes de Toeplitz bandes." Thesis, Littoral, 2016. http://www.theses.fr/2016DUNK0402/document.
Full textThis thesis aims to design new fast algorithms for numerical computation via Toeplitz matrices. First, we introduce a fast algorithm to compute the inverse of a triangular Toeplitz matrix with real and/or complex entries, based on polynomial interpolation techniques. This algorithm, requiring only two FFTs of size 2n, is clearly effective compared to its predecessors. A numerical accuracy and error analysis is also considered. Numerical examples are given to illustrate the effectiveness of our method. In addition, we introduce a fast algorithm for solving a linear banded Toeplitz system. This new approach is based on extending the given matrix with several rows on the top and several columns on the right, and assigning zeros and some nonzero constants in each of these rows and columns in such a way that the augmented matrix has a lower triangular Toeplitz structure. The stability of the algorithm is discussed and its performance is shown by numerical experiments. This is essential for connecting our algorithms to applications such as image restoration, a key area in applied mathematics
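A lower triangular Toeplitz matrix is fully determined by its first column, which is what makes the fast algorithms above possible. As a hedged point of reference only, here is the naive O(n²) forward substitution on that column (the thesis's contribution is precisely to beat this with two FFTs of size 2n, which this sketch does not attempt):

```python
def solve_lower_toeplitz(c, b):
    """Solve T x = b where T is lower triangular Toeplitz with first
    column c, i.e. T[i][j] = c[i - j] for i >= j and 0 otherwise.
    Only the first column is stored; plain forward substitution."""
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        s = sum(c[i - j] * x[j] for j in range(i))
        x[i] = (b[i] - s) / c[0]
    return x

# The inverse of a triangular Toeplitz matrix is again triangular
# Toeplitz, so its first column (obtained by solving T x = e1)
# determines the whole inverse.
inv_first_col = solve_lower_toeplitz([1.0, 2.0, 3.0], [1.0, 0.0, 0.0])
```

Exploiting the Toeplitz structure, only the n entries of c are ever read, instead of the n² entries of a general triangular matrix.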
Augustin, Lefèvre. "Méthodes d'apprentissage appliquées à la séparation de sources mono-canal." Phd thesis, École normale supérieure de Cachan - ENS Cachan, 2012. http://tel.archives-ouvertes.fr/tel-00764546.
Full textProst, Vincent. "Sparse unsupervised learning for metagenomic data." Electronic Thesis or Diss., université Paris-Saclay, 2020. http://www.theses.fr/2020UPASL013.
Full textThe development of massively parallel sequencing technologies enables DNA to be sequenced at high throughput and low cost, fueling the rise of metagenomics, which is the study of complex microbial communities sequenced in their natural environment. Metagenomic problems are usually computationally difficult and are further complicated by the massive amount of data involved. In this thesis we consider two different metagenomics problems: 1. raw read binning and 2. microbial network inference from taxonomic abundance profiles. We address them using unsupervised machine learning methods leveraging the parsimony principle, typically involving l1-penalized log-likelihood maximization. The assembly of genomes from raw metagenomic datasets is a challenging task, akin to assembling a mixture of large puzzles composed of billions or trillions of pieces (DNA sequences). In the first part of this thesis, we consider the related task of clustering sequences into biologically meaningful partitions (binning). Most of the existing computational tools perform binning after read assembly as a pre-processing step, which is error-prone (yielding artifacts like chimeric contigs) and discards vast amounts of information in the form of unassembled reads (up to 50% for highly diverse metagenomes). This motivated us to address the raw read binning problem (without prior assembly). We exploit the co-abundance of species across samples as a discriminative signal. Abundance is usually measured via the number of occurrences of long k-mers (subsequences of size k). The use of Locality-Sensitive Hashing (LSH) allows us to contain, at the cost of some approximation, the combinatorial explosion of long k-mer indexing. The first contribution of this thesis is to propose a sparse Non-negative Matrix Factorization (NMF) of the samples × k-mers count matrix in order to extract abundance variation signals.
We first show that using sparse NMF is well-grounded, since the data is a sparse linear mixture of non-negative components. Sparse NMF exploiting online dictionary learning algorithms retained our attention, notably for its decent behavior on largely asymmetric data matrices. The validation of metagenomic binning being difficult on real datasets because of the absence of ground truth, we created and used several benchmarks to evaluate the different methods. We show that sparse NMF improves on state-of-the-art binning methods on those datasets. Experiments conducted on a real metagenomic cohort of 1135 human gut microbiota samples showed the relevance of the approach. In the second part of the thesis, we consider metagenomic data after taxonomic profiling: multivariate data representing abundances of taxa across samples. It is known that microbes live in communities structured by ecological interactions between the members of the community. We focus on the problem of inferring microbial interaction networks from taxonomic profiles. This problem is frequently cast into the paradigm of Gaussian graphical models (GGMs), for which efficient structure inference algorithms are available, like the graphical lasso. Unfortunately, GGMs or variants thereof cannot properly account for the extremely sparse patterns occurring in real-world metagenomic taxonomic profiles. In particular, structural zeros corresponding to true absences of biological signals fail to be properly handled by most statistical methods. We present in this part a zero-inflated log-normal graphical model specifically aimed at handling such "biological" zeros, and demonstrate significant performance gains over state-of-the-art statistical methods for the inference of microbial association networks, with the most notable gains obtained when analyzing taxonomic profiles displaying sparsity levels on par with real-world metagenomic datasets
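The k-mer indexing idea in the first part can be miniaturized as follows (a crude sketch with a plain checksum standing in for the LSH of the thesis; function and parameter names are hypothetical):

```python
import zlib

def hashed_kmer_profile(read, k, n_buckets):
    """Count the k-mers of a read into a fixed number of hash buckets.
    The full k-mer space is far too large to index directly, so each
    k-mer is mapped to a bucket at the cost of collisions; the thesis
    uses LSH for this step, here a CRC32 checksum merely plays its role."""
    profile = [0] * n_buckets
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        profile[zlib.crc32(kmer.encode()) % n_buckets] += 1
    return profile
```

Stacking such profiles over many samples gives a samples × buckets count matrix, the kind of matrix to which the sparse NMF of the abstract would then be applied.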
Brisebarre, Godefroy. "Détection de changements en imagerie hyperspectrale : une approche directionnelle." Thesis, Ecole centrale de Marseille, 2014. http://www.theses.fr/2014ECDM0010.
Full textHyperspectral imagery is an emerging imaging technology which has seen growing interest since the 2000s. This technology allows an impressive growth of the data registered from a specific scene compared to classical RGB imagery. Indeed, although the spatial resolution is significantly lower, the spectral resolution is very fine and the covered spectral range is very wide. We focus on change detection between two images of a given scene for defence purposes. In the following, we start by introducing hyperspectral imagery and the specificities of its exploitation for defence purposes. We then present a change detection and analysis method based on the search for specific directions in the space generated by the image couple, followed by a merging of the nearby directions. We then exploit this information, focusing on the unmixing capabilities of multitemporal hyperspectral data. Finally, we present a range of further work that could be done in relation to this study and conclude
Vo, Xuan Thanh. "Apprentissage avec la parcimonie et sur des données incertaines par la programmation DC et DCA." Thesis, Université de Lorraine, 2015. http://www.theses.fr/2015LORR0193/document.
Full textIn this thesis, we focus on developing optimization approaches for solving some classes of optimization problems in sparsity and robust optimization for data uncertainty. Our methods are based on DC (Difference of Convex functions) programming and DCA (DC Algorithms), which are well known as powerful tools in optimization. This thesis is composed of two parts: the first part concerns sparsity while the second part deals with uncertainty. In the first part, a unified DC approximation approach to optimization problems involving the zero-norm in the objective is thoroughly studied in both its theoretical and computational aspects. We consider a common DC approximation of the zero-norm that includes all standard sparsity-inducing penalty functions, and develop general DCA schemes that cover all standard algorithms in the field. Next, the thesis turns to the nonnegative matrix factorization (NMF) problem. We investigate the structure of the considered problem and provide appropriate DCA-based algorithms. To enhance the performance of NMF, sparse NMF formulations are proposed. Continuing this topic, we study the dictionary learning problem, where sparse representation plays a crucial role. In the second part, we exploit robust optimization techniques to deal with data uncertainty for two important problems in machine learning: feature selection in linear Support Vector Machines, and clustering. In this context, each individual data point is uncertain but varies in a bounded uncertainty set. Different models (box/spherical/ellipsoidal) related to uncertain data are studied. DCA-based algorithms are developed to solve the robust problems
Michelet, Stéphane. "Modélisation non-supervisée de signaux sociaux." Thesis, Paris 6, 2016. http://www.theses.fr/2016PA066052/document.
Full textIn a social interaction, we adapt our behavior to our interlocutors. Studying and understanding the underlying mechanisms of this adaptation is at the center of Social Signal Processing. The goal of this thesis is to propose methods of study and models for the analysis of social signals in the context of interaction, by exploiting both social signal processing and pattern recognition techniques. First, an unsupervised method allowing the measurement of imitation between two partners, in terms of delay and degree, is proposed, using only gestural data. Spatio-temporal interest points are first detected in order to select the most important regions of the videos. They are then described by histograms in order to construct bag-of-words models, in which spatial information is reintroduced. The imitation degree and the delay between partners are estimated in a continuous way thanks to the cross-correlation between the two bag-of-words models. The second part of this thesis focuses on the automatic extraction of features characterizing group interactions. After regrouping the features commonly used in the literature, we propose the use of non-negative matrix factorization. Beyond extracting the most pertinent features, it also allowed us to automatically group meetings, in an unsupervised manner, into three classes corresponding to the three types of leadership defined by psychologists. Finally, the last part focuses on the unsupervised extraction of features characterizing groups. The relevance of these features, compared to ad-hoc features from the state of the art, is then validated on a role recognition task
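The delay estimation by cross-correlation described above reduces, in a toy one-dimensional form, to picking the lag that maximizes the correlation between the two partners' feature sequences (hypothetical activity sequences; the thesis correlates full bag-of-words models, not scalars):

```python
def best_lag(a, b, max_lag):
    """Return the lag in [0, max_lag] at which partner b's sequence,
    shifted forward in time, best correlates with partner a's --
    a toy version of estimating an imitation delay."""
    def corr_at(lag):
        return sum(a[t] * b[t - lag] for t in range(lag, len(a)))
    return max(range(max_lag + 1), key=corr_at)

# b "leads", a imitates the same gesture pattern two frames later.
leader   = [1, 0, 2, 0, 1, 0, 0, 0]
imitator = [0, 0, 1, 0, 2, 0, 1, 0]
delay = best_lag(imitator, leader, 4)  # 2
```

A real implementation would normalize each term (a correlation coefficient rather than a raw dot product) so that long lags with short overlaps are compared fairly; that refinement is omitted here.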
Durrieu, Jean-Louis. "Transcription et séparation automatique de la mélodie principale dans les signaux de musique polyphoniques." Phd thesis, Paris, Télécom ParisTech, 2010. https://pastel.hal.science/pastel-00006123.
Full textWe propose to address the problem of melody extraction along with the monaural lead instrument and accompaniment separation problem. The first task is related to Music Information Retrieval (MIR), since it aims at indexing audio music signals with their melody. The separation problem is related to Blind Audio Source Separation (BASS), as it aims at breaking an audio mixture into several source tracks. Lead instrument source separation and main melody extraction are addressed within a unified framework. The lead instrument is modelled thanks to a source/filter production model: its signal is generated by two hidden states, the filter state and the source state. The proposed spectral signal model therefore explicitly uses pitches both to separate the lead instrument from the others and to transcribe the pitch sequence played by that instrument, the "main melody". This model gives rise to two alternative models, a Gaussian Scaled Mixture Model (GSMM) and an Instantaneous Mixture Model (IMM). The accompaniment is modelled with a more general spectral model. Five systems are proposed. Three systems detect the fundamental frequency sequence of the lead instrument, i.e., they estimate the main melody. A fourth system returns a musical melody transcription and the last system separates the lead instrument from the accompaniment. The results in melody transcription and source separation are at the state of the art, as shown by our participation in international evaluation campaigns (MIREX'08, MIREX'09 and SiSEC'08). The proposed extension of previous source separation work with "MIR" knowledge is therefore a very successful combination