Dissertations on the topic "Multiplication de matrices creuses"
Format your source in APA, MLA, Chicago, Harvard, and other styles
Browse the top 50 dissertations for your research on the topic "Multiplication de matrices creuses".
Next to every entry in the bibliography there is an "Add to bibliography" button. Click it, and we will automatically generate a bibliographic citation of the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the publication as a .pdf file and read its abstract online, when these are available in the metadata.
Browse dissertations on a wide variety of disciplines and compile your bibliography correctly.
Gonon, Antoine. "Harnessing symmetries for modern deep learning challenges : a path-lifting perspective." Electronic Thesis or Diss., Lyon, École normale supérieure, 2024. http://www.theses.fr/2024ENSL0043.
Neural networks have demonstrated impressive practical success, but theoretical tools for analyzing them are often limited to simple cases that do not capture the complexity of real-world applications. This thesis seeks to narrow this gap by making theoretical tools more applicable to practical scenarios. The first focus of this work is on generalization: can a given network perform well on previously unseen data? This thesis improves generalization guarantees based on the path-norm and extends their applicability to ReLU networks incorporating pooling or skip connections. By reducing the gap between theoretically analyzable networks and those used in practice, this work provides the first empirical evaluation of these guarantees on practical ReLU networks, such as ResNets. The second focus is on resource optimization (time, energy, memory). This thesis introduces a novel pruning method based on the path-norm, which not only retains the accuracy of traditional magnitude pruning but also exhibits robustness to parameter symmetries. Additionally, this work presents a new GPU matrix multiplication algorithm that enhances the state of the art for sparse matrices with Kronecker-structured support, achieving gains in both time and energy. Finally, this thesis makes approximation guarantees for neural networks more concrete by establishing sufficient bit-precision conditions to ensure that quantized networks maintain the same approximation speed as their unconstrained real-weight counterparts.
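The GPU algorithm mentioned above targets sparse matrices with Kronecker-structured support. As a hedged illustration of why that structure pays off (this is the textbook dense identity, not the thesis's kernel; all names below are ours), the product (A ⊗ B)·vec(X) can be formed without ever materializing A ⊗ B:

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, r, s = 3, 4, 2, 5
A = rng.standard_normal((p, q))
B = rng.standard_normal((r, s))
X = rng.standard_normal((s, q))          # vec(X) has length q*s, matching the columns of A⊗B

# Naive: materialize the (p*r) x (q*s) Kronecker product.
y_naive = np.kron(A, B) @ X.flatten(order="F")

# Structured: (A ⊗ B) vec(X) = vec(B X Aᵀ), column-major vec, no big matrix formed.
y_fast = (B @ X @ A.T).flatten(order="F")

assert np.allclose(y_naive, y_fast)
```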
Lawson, Jean-Christophe. "Smart : un neurocalculateur parallèle exploitant des matrices creuses." Grenoble INPG, 1993. http://www.theses.fr/1993INPG0030.
Geronimi, Sylvain. "Determination d'ensembles essentiels minimaux dans les matrices creuses : application a l'analyse des circuits." Toulouse 3, 1987. http://www.theses.fr/1987TOU30104.
Vömel, Christof. "Contributions à la recherche en calcul scientifique haute performance pour les matrices creuses." Toulouse, INPT, 2003. http://www.theses.fr/2003INPT003H.
Grigori, Laura. "Prédiction de structure et algorithmique parallèle pour la factorisation LU des matrices creuses." Nancy 1, 2001. http://www.theses.fr/2001NAN10264.
This dissertation deals with parallel numerical computing, namely Gaussian elimination as used to solve large sparse nonsymmetric linear systems. Usually, computations on sparse matrices have an initial phase that predicts the nonzero structure of the output, which helps to allocate memory, set up data structures, and schedule parallel tasks prior to the numerical computation itself. To this end, we study structure prediction for the sparse LU factorization with partial pivoting. We are mainly interested in identifying upper bounds on these structures that are as tight as possible. This structure prediction is then used in a phase called symbolic factorization, followed by a phase that performs the numerical computation of the factors, called numerical factorization. For very large matrices, a significant part of the overall memory space is needed by structures used during the symbolic factorization, and this can prevent a swap-free execution of the LU factorization. We propose and study a parallel algorithm to decrease the memory requirements of the nonsymmetric symbolic factorization. For an efficient parallel execution of the numerical factorization, we consider the analysis and the handling of the data-dependency graphs resulting from the processing of sparse matrices. This analysis enables us to develop scalable algorithms which manage memory and computing resources in an effective way.
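As a minimal sketch of what structure prediction computes, the following symbolic elimination propagates fill-in row by row; it assumes no pivoting (the thesis treats the harder partial-pivoting case, where only upper bounds can be predicted), and the function name is ours:

```python
def symbolic_fill(rows):
    """Nonzero pattern of L+U for Gaussian elimination without pivoting.

    rows[i] is the set of column indices of nonzeros in row i.
    Eliminating column k adds row k's trailing pattern to every row below
    that has a nonzero in column k: this is where fill-in comes from.
    """
    pattern = [set(r) for r in rows]
    n = len(pattern)
    for k in range(n):
        upper_k = {j for j in pattern[k] if j > k}
        for i in range(k + 1, n):
            if k in pattern[i]:
                pattern[i] |= upper_k
    return pattern

# Arrow-head example: eliminating column 0 fills the trailing block entirely.
print(symbolic_fill([{0, 1, 2, 3}, {0, 1}, {0, 2}, {0, 3}]))
```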
Geronimi, Sylvain. "Détermination d'ensembles essentiels minimaux dans les matrices creuses application à l'analyse des circuits /." Grenoble 2 : ANRT, 1987. http://catalogue.bnf.fr/ark:/12148/cb376053608.
Puglisi, Chiara. "Factorisation QR de grandes matrices creuses basée sur une méthode multifrontale dans un environnement multiprocesseur." Toulouse, INPT, 1993. http://www.theses.fr/1993INPT091H.
EDJLALI, GUY. "Contribution a la parallelisation de methodes iteratives hybrides pour matrices creuses sur architectures heterogenes." Paris 6, 1994. http://www.theses.fr/1994PA066360.
Brown, Christopher Ian. "A VLSI device for multiplication of high order sparse matrices." Thesis, University of Sheffield, 1997. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.265915.
Guermouche, Abdou. "Étude et optimisation du comportement mémoire dans les méthodes parallèles de factorisation de matrices creuses." Lyon, École normale supérieure (sciences), 2004. http://www.theses.fr/2004ENSL0284.
Direct methods for solving sparse linear systems are known for their large memory requirements, which can be the limiting factor when solving large systems. The work done during this thesis concerns the study and the optimization of the memory behaviour of a sparse direct method, the multifrontal method, in both the sequential and the parallel cases. Optimal memory-minimization algorithms are proposed for the sequential case. Concerning the parallel case, we introduce new scheduling strategies aimed at improving the memory behaviour of the method, and then extend these approaches to obtain good performance while keeping a good memory behaviour. In addition, when the data to be treated cannot fit into memory, out-of-core factorization schemes have to be designed. To be efficient, such approaches require overlapping I/O operations with computations and reusing the data sets already in memory to reduce the amount of I/O. Another part of the work presented in this thesis therefore concerns the design and study of implicit out-of-core techniques well adapted to the memory access pattern of the multifrontal method. These techniques are based on a modification of the standard paging policies of the operating system using a low-level tool (MMUM&MMUSSEL).
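A toy model of the memory behaviour being optimized: in a sequential multifrontal traversal, the peak of the contribution-block stack depends on the order in which children are processed, which is exactly the lever the memory-minimization algorithms pull. A minimal sketch, with hypothetical node sizes:

```python
def peak_memory(tree, front, cb, root):
    """Peak active memory of a sequential multifrontal traversal.

    tree[v]: children of v; front[v]: size of v's frontal matrix;
    cb[v]: size of the contribution block v pushes on the stack.
    """
    def visit(v):
        stacked, peak = 0, 0
        for c in tree[v]:                       # child order matters!
            peak = max(peak, stacked + visit(c))
            stacked += cb[c]
        # Assembling v: all children blocks and the new front live together.
        return max(peak, stacked + front[v])
    return visit(root)

tree, front, cb = {0: [1, 2], 1: [], 2: []}, {0: 8, 1: 6, 2: 4}, {0: 0, 1: 3, 2: 2}
print(peak_memory(tree, front, cb, 0))          # 13 for this tiny tree
```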
Callant, Julien. "Méthodes numériques pour le calcul des valeurs propres les plus à droite des matrices creuses de très grande taille." Doctoral thesis, Universite Libre de Bruxelles, 2012. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/209578.
Doctorat en Sciences de l'ingénieur (unpublished).
Kunchum, Rakshith. "On Improving Sparse Matrix-Matrix Multiplication on GPUs." The Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1492694387445938.
Nguyen, Duc Kien. "Parallélisation des algorithmes de multiplication rapide de matrices sur machines à mémoire distribuée." Paris 8, 2007. http://www.theses.fr/2007PA083778.
Fast matrix multiplication (FMM) algorithms for multiplying two n x n matrices reduce the asymptotic operation count from O(n³) of the traditional method to O(n^2.xx), so the parallelization of FMM algorithms gives remarkable results in comparison with parallel matrix multiplication algorithms based on the traditional method. Within this parallelization, applying FMM algorithms at the inter-processor level poses harder design problems but yields the most effective algorithms. To use FMM algorithms at the inter-processor level, the key point is to determine the submatrices after recursively applying the FMM formulas r times, and then to assemble the result matrix from the products of these submatrices. For a fixed value of r this can be solved by hand, as in previous works for r = 1, 2, but a solution for the general case had not been found. In this PhD work, by combining our general solution to this problem with a good mapping of submatrices to processors and with the parallel matrix multiplication algorithms based on the traditional method (1D-systolic, 2D-systolic, Fox (BMR), Cannon, PUMMA, BiMMeR, SUMMA, DIMMA...), we obtain a general, scalable parallelization of FMM algorithms on distributed memory computers. Complexity analyses show that our algorithms should be faster than the parallel algorithms based on the traditional method when the matrix size is large, and that our work is relevant for exploiting better algorithms when the recursion level is large enough. Experimental results on the Fujitsu Siemens Computers/hpcLine confirm the theoretical result by showing that our algorithms outperform Cannon's algorithm by a factor of 1.2 to 2.4 for matrices of size 8196 x 8196.
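The r = 1 building block referred to here is, for matrices, Strassen's seven-product scheme. A plain sequential sketch (assuming square power-of-two sizes; the inter-processor machinery of the thesis is not shown):

```python
import numpy as np

def strassen(A, B, leaf=64):
    """Multiply square matrices with 7 recursive products instead of 8."""
    n = A.shape[0]                       # n assumed to be a power of two
    if n <= leaf:
        return A @ B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    m1 = strassen(A11 + A22, B11 + B22, leaf)
    m2 = strassen(A21 + A22, B11, leaf)
    m3 = strassen(A11, B12 - B22, leaf)
    m4 = strassen(A22, B21 - B11, leaf)
    m5 = strassen(A11 + A12, B22, leaf)
    m6 = strassen(A21 - A11, B11 + B12, leaf)
    m7 = strassen(A12 - A22, B21 + B22, leaf)
    C = np.empty_like(A)
    C[:h, :h] = m1 + m4 - m5 + m7
    C[:h, h:] = m3 + m5
    C[h:, :h] = m2 + m4
    C[h:, h:] = m1 - m2 + m3 + m6
    return C

A, B = np.random.rand(256, 256), np.random.rand(256, 256)
assert np.allclose(strassen(A, B), A @ B)
```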
Amestoy, Patrick. "Factorisation de grandes matrices creuses non symétriques basée sur une méthode multifrontale dans un environnement multiprocesseur." Toulouse, INPT, 1990. http://www.theses.fr/1990INPT050H.
Etoa Etoa Jean-Bosco. "Methodes simpliciales numeriquement stables pour la resolution de programmes lineaires a matrices des contraintes tres creuses." Paris 6, 1987. http://www.theses.fr/1987PA066784.
Hamdi-Larbi, Olfa. "Étude de la Distribution, sur Système à Grande Échelle, de Calcul Numérique Traitant des Matrices Creuses Compressées." Phd thesis, Université de Versailles-Saint Quentin en Yvelines, 2010. http://tel.archives-ouvertes.fr/tel-00693322.
Hamdi-Larbi, Olfa. "Etude de la distribution, sur système à grande échelle, de calcul numérique traitant des matrices creuses compressées." Versailles-St Quentin en Yvelines, 2010. http://www.theses.fr/2010VERS0018.
Several scientific applications use kernels performing computations on large sparse matrices. For reasons of efficiency in time and space, specific compression formats are used for storing such matrices. Most sparse scientific computations address sparse linear algebra problems, where two fundamental problems are often considered: linear system resolution (LSR) and eigenvalue/eigenvector computation (EVC). In this thesis, we address the problem of distributing, onto a Large Scale Distributed System (LSDS), the computations performed in iterative methods for both LSR and EVC. The sparse matrix-vector product (SMVP) constitutes a basic kernel in such iterative methods, so our problem reduces to the study of SMVP distribution on an LSDS. In principle, three phases are required for achieving such applications: pre-processing, processing, and post-processing. In phase one, we first proceed to the optimization of four versions of the SMVP algorithm corresponding to four specific matrix compression formats, then study their performance on sequential target machines. We also focus on load balancing in the procedure of distributing the data (i.e. the sparse matrix rows) on an LSDS. The processing phase consists in validating the previous study through a series of experiments carried out on a volunteer distributed system that we installed using the XtremWeb-CH middleware. The post-processing phase consists in interpreting the experimental results previously obtained in order to deduce adequate conclusions.
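The basic kernel under study, the sparse matrix-vector product, can be sketched for the CSR compression format as follows (a sequential reference version; names are ours):

```python
import numpy as np

def spmv_csr(indptr, indices, data, x):
    """y = A @ x for A stored in CSR: row i's nonzeros are data[indptr[i]:indptr[i+1]]."""
    y = np.zeros(len(indptr) - 1)
    for i in range(len(y)):
        lo, hi = indptr[i], indptr[i + 1]
        y[i] = data[lo:hi] @ x[indices[lo:hi]]
    return y

# 2x3 example: [[1, 0, 2], [0, 3, 0]]
indptr, indices, data = [0, 2, 3], np.array([0, 2, 1]), np.array([1.0, 2.0, 3.0])
print(spmv_csr(indptr, indices, data, np.array([1.0, 1.0, 1.0])))  # [3. 3.]
```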
Boyer, Brice. "Multiplication matricielle efficace et conception logicielle pour la bibliothèque de calcul exact LinBox." Phd thesis, Université de Grenoble, 2012. http://tel.archives-ouvertes.fr/tel-00767915.
Andersson, Tobias, and Christoffer Brenden. "Parallelism in Go and Java : A Comparison of Performance Using Matrix Multiplication." Thesis, Blekinge Tekniska Högskola, Institutionen för programvaruteknik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-16548.
Sibut Pinote, Thomas. "Investigations in Computer-Aided Mathematics : Experimentation, Computation, and Certification." Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLX086/document.
This thesis proposes three contributions to computer-aided mathematical proofs. It deals not only with proofs relying on computations, but also with formal proofs, which are both produced and verified using a piece of software called a proof assistant. In the first part, we illustrate the theme of experimentation at the service of proofs by considering the problem of the complexity of matrix multiplication algorithms. This problem has historically been approached in an increasingly abstract way: modern approaches do not construct algorithms but use theoretical results to improve the lower bound on the famous omega constant. We went back to a more practical approach by attempting to program some of the algorithms implied by these theoretical results. This experimental approach reveals an unexpected pattern in some existing algorithms. While these algorithms contain a new variable epsilon whose presence is reputed to render them inefficient for reasonable matrix sizes, we discovered that we could build matrix multiplication algorithms in parallel, without epsilons, with an asymptotic complexity which can theoretically beat Strassen's algorithm in terms of the number of multiplications. A by-product of this exploration is a symbolic tool in OCaml which can analyze, compose, and export matrix multiplication algorithms. We also believe that it could be used to build new practical algorithms for matrix multiplication. In the second part, we describe a formal proof of the irrationality of the constant zeta(3), following the historical demonstration due to Apéry. The crucial step of this proof is to establish that two sequences of rational numbers satisfy a surprising common recurrence. It is in fact possible to "discover" this recurrence using symbolic algorithms and their existing implementations in a computer algebra system. This work is an example of a skeptical approach to the formal proof of theorems, in which computations are mainly accomplished by an efficient computer algebra program and then formally verified in a proof assistant. Incidentally, this work questions the value of creative telescoping certificates as complete proofs of identities. This formal proof is also based on new mathematical libraries, which were formalised for its needs. In particular, we have formalized and simplified a study of the asymptotic behaviour of the sequence lcm(1,...,n). This work is developed in the Coq proof assistant and extends the Mathematical Components libraries. In the last part, we present a procedure which computes approximations of a class of proper and improper integrals while simultaneously producing a Coq formal proof of the correctness of the result of this computation. This procedure uses a combination of interval arithmetic and rigorous polynomial approximations of functions. This work makes crucial use of the possibility to compute efficiently inside Coq's logic. It is an extension of the CoqInterval library, which provides numerical approximation of a class of real expressions. Its implementation has also resulted in extensions to the Coquelicot library for real analysis, including a better treatment of improper integrals. We illustrate the value of this tool and its performance by dealing with standard but nontrivial examples from the literature, on which other tools have in some cases been incorrect.
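The "surprising common recurrence" is Apéry's. As a quick numerical sanity check, entirely outside any proof assistant, one of the two sequences (the Apéry numbers) can be verified against it:

```python
from math import comb

def apery(n):
    """Apéry numbers: one of the two sequences of the irrationality proof."""
    return sum(comb(n, k) ** 2 * comb(n + k, k) ** 2 for k in range(n + 1))

# n^3 u_n = (34n^3 - 51n^2 + 27n - 5) u_{n-1} - (n-1)^3 u_{n-2}
for n in range(2, 25):
    assert n**3 * apery(n) == (34 * n**3 - 51 * n**2 + 27 * n - 5) * apery(n - 1) \
        - (n - 1) ** 3 * apery(n - 2)
```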
DOUADI, KAMEL. "Etude et mise en oeuvre de logiciels rapides et numeriquement stables d'optimisation sans contraintes : cas de matrices de contraintes creuses." Paris 6, 1989. http://www.theses.fr/1989PA066155.
De, Lara Nathan. "Algorithmic and software contributions to graph mining." Electronic Thesis or Diss., Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAT029.
Since the introduction of Google's PageRank method for Web searches in the late 1990s, graph algorithms have been part of our daily lives. In the mid 2000s, the arrival of social networks amplified this phenomenon, creating new use-cases for these algorithms. Relationships between entities can be of multiple types: user-user symmetric relationships for Facebook or LinkedIn, follower-followee asymmetric ones for Twitter, or even user-content bipartite ones for Netflix or Amazon. They all come with their own challenges, and the applications are numerous: centrality computation for influence measurement, node clustering for knowledge discovery, node classification for recommendation, or embedding for link prediction, to name a few. In the meantime, the context in which graph algorithms are applied has rapidly become more constrained. On the one hand, the increasing size of the datasets, with millions of entities and sometimes billions of relationships, bounds the asymptotic complexity of the algorithms for industrial applications. On the other hand, as these algorithms affect our daily lives, there is a growing demand for explainability and fairness in the domain of artificial intelligence in general; graph mining is no exception. For example, the European Union has published a set of ethics guidelines for trustworthy AI. This calls for further analysis of the current models and even new ones. This thesis provides specific answers via a novel analysis of standard graph algorithms as well as extensions, variants, and original ones, with scalability taken into account every step of the way. Following what the Scikit-learn project does for standard machine learning, we deem it important to make these algorithms available to as many people as possible and to participate in graph mining popularization. We have therefore developed an open-source software package, Scikit-network, which implements and documents the algorithms in a simple and efficient way. With this tool, we cover several areas of graph mining such as graph embedding, clustering, and semi-supervised node classification.
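As a baseline illustration of the PageRank computation the abstract opens with (a generic power-iteration sketch with scipy, not Scikit-network's implementation; dangling-node mass is simply dropped here):

```python
import numpy as np
from scipy import sparse

def pagerank(adj, damping=0.85, tol=1e-10):
    """Plain power iteration on the row-stochastic transition matrix."""
    n = adj.shape[0]
    out = np.asarray(adj.sum(axis=1)).ravel()
    inv_out = np.divide(1.0, out, out=np.zeros(n), where=out > 0)
    P = sparse.diags(inv_out) @ adj          # dangling rows stay zero (mass dropped)
    x = np.full(n, 1.0 / n)
    while True:
        x_new = damping * (P.T @ x) + (1.0 - damping) / n
        if np.abs(x_new - x).sum() < tol:
            return x_new
        x = x_new

adj = sparse.csr_matrix(np.array([[0, 1, 1], [0, 0, 1], [1, 0, 0]], dtype=float))
print(pagerank(adj).round(3))
```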
Zhang, Ye. "Méthodes itératives hybrides asynchrones sur plateformes de calcul hétérogènes pour la résolution accélérée de grands systèmes linéaires." Thesis, Lille 1, 2009. http://www.theses.fr/2009LIL10129/document.
In this thesis, we have studied an effective parallel hybrid method for solving linear systems, GMRES/LS-Arnoldi, which accelerates convergence through knowledge of some eigenvalues calculated in parallel by the Arnoldi method in the real case. The asynchronous nature of this method has the advantage of working with a heterogeneous architecture. A study of the complex case is also done by transforming the complex matrix into a real matrix of double dimension. We have implemented our hybrid GMRES method and the general GMRES method on three different types of hardware platforms: the IBM SP series supercomputer, a typically centralized hardware platform; Grid5000, a fully distributed hardware platform; and the Tsubame (Tokyo-tech Supercomputer and Ubiquitously Accessible Mass-storage Environment) supercomputer, where some nodes are equipped with an accelerator card. We have tested the performance of general GMRES and hybrid GMRES on these three platforms, observing the influence of various parameters on performance. A number of meaningful results have been obtained: we can not only improve the performance of parallel computing but also specify the direction of our future efforts.
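The hybrid GMRES/LS-Arnoldi solver itself is not in standard libraries; as a point of reference, the plain restarted GMRES it accelerates can be run on a toy diagonally dominant sparse system with scipy:

```python
import numpy as np
from scipy.sparse import eye, random as sprand
from scipy.sparse.linalg import gmres

# A toy well-conditioned system (our assumption; not the thesis's test cases).
n = 2000
A = (4 * eye(n) + sprand(n, n, density=5 / n, random_state=0)).tocsr()
b = np.ones(n)

x, info = gmres(A, b, restart=50)          # restarted GMRES(50)
print(info, np.linalg.norm(A @ x - b))     # info == 0 means convergence
```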
Wu, Wenhao. "High-performance matrix multiplication hierarchical data structures, optimized kernel routines, and qualitative performance modeling /." Master's thesis, Mississippi State : Mississippi State University, 2003. http://library.msstate.edu/etd/show.asp?etd=etd-07092003-003633.
Covanov, Svyatoslav. "Algorithmes de multiplication : complexité bilinéaire et méthodes asymptotiquement rapides." Thesis, Université de Lorraine, 2018. http://www.theses.fr/2018LORR0057/document.
Since 1960 and the result of Karatsuba, we know that the complexity of multiplication (of integers or polynomials) is sub-quadratic: given a ring R, the product in R[X] of polynomials a_0 + a_1 X and b_0 + b_1 X, for any a_0, a_1, b_0 and b_1 in R, can be computed with three rather than four multiplications over R: (a_0 + a_1 X)(b_0 + b_1 X) = m_0 + (m_2 - m_0 - m_1) X + m_1 X^2, with the three multiplications m_0 = a_0 b_0, m_1 = a_1 b_1 and m_2 = (a_0 + a_1)(b_0 + b_1). In the same manner, Strassen's algorithm multiplies two 2n x 2n matrices with only seven products of n x n matrices. The two previous examples fall in the category of bilinear maps: functions of the form Phi : K^m x K^n -> K^l, for a field K, linear in each variable. Among the most classical bilinear maps are the multiplication of polynomials, matrices, or even elements of algebraic extensions of finite fields. Given a bilinear map Phi, computing the minimal number of multiplications necessary to evaluate this map is an NP-hard problem. The purpose of this thesis is to propose algorithms minimizing this number of multiplications. Two angles of attack have been studied. The first aspect of this thesis is to study the problem of the computation of bilinear complexity under the angle of a reformulation of this problem in terms of the search for matrix subspaces of a given rank. This work led to an algorithm taking into account intrinsic properties of the considered products, such as matrix or polynomial products over finite fields. This algorithm finds all the possible decompositions, over F_2, of the product of polynomials modulo X^5 and the product of 3x2 by 2x3 matrices. The other aspect of this thesis is the development of fast asymptotic methods for integer multiplication. A particular family of algorithms has been proposed after an article by Fürer published in 2007, which gave a first algorithm, relying on fast Fourier transform (FFT), to multiply n-bit integers in O(n log n 2^{O(log^* n)}), where log^* is the iterated logarithm function. In this thesis, an algorithm relying on a number-theoretic conjecture is proposed, involving FFT and generalized Fermat primes. With a careful complexity analysis of this algorithm, we obtain a complexity of O(n log n 4^{log^* n}).
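The two-term example from the abstract, written out as code (it works for any commutative ring elements, e.g. Python integers):

```python
def karatsuba_2term(a0, a1, b0, b1):
    """(a0 + a1*X) * (b0 + b1*X) using 3 multiplications instead of 4."""
    m0 = a0 * b0
    m1 = a1 * b1
    m2 = (a0 + a1) * (b0 + b1)
    return m0, m2 - m0 - m1, m1               # coefficients of 1, X, X^2

assert karatsuba_2term(2, 3, 5, 7) == (10, 29, 21)  # (2+3X)(5+7X) = 10+29X+21X^2
```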
Covanov, Svyatoslav. "Algorithmes de multiplication : complexité bilinéaire et méthodes asymptotiquement rapides." Electronic Thesis or Diss., Université de Lorraine, 2018. http://www.theses.fr/2018LORR0057.
Since 1960 and the result of Karatsuba, we know that the complexity of multiplication (of integers or polynomials) is sub-quadratic: given a ring R, the product in R[X] of polynomials a_0 + a_1 X and b_0 + b_1 X, for any a_0, a_1, b_0 and b_1 in R, can be computed with three rather than four multiplications over R: (a_0 + a_1 X)(b_0 + b_1 X) = m_0 + (m_2 - m_0 - m_1) X + m_1 X^2, with the three multiplications m_0 = a_0 b_0, m_1 = a_1 b_1 and m_2 = (a_0 + a_1)(b_0 + b_1). In the same manner, Strassen's algorithm multiplies two 2n x 2n matrices with only seven products of n x n matrices. The two previous examples fall in the category of bilinear maps: functions of the form Phi : K^m x K^n -> K^l, for a field K, linear in each variable. Among the most classical bilinear maps are the multiplication of polynomials, matrices, or even elements of algebraic extensions of finite fields. Given a bilinear map Phi, computing the minimal number of multiplications necessary to evaluate this map is an NP-hard problem. The purpose of this thesis is to propose algorithms minimizing this number of multiplications. Two angles of attack have been studied. The first aspect of this thesis is to study the problem of the computation of bilinear complexity under the angle of a reformulation of this problem in terms of the search for matrix subspaces of a given rank. This work led to an algorithm taking into account intrinsic properties of the considered products, such as matrix or polynomial products over finite fields. This algorithm finds all the possible decompositions, over F_2, of the product of polynomials modulo X^5 and the product of 3x2 by 2x3 matrices. The other aspect of this thesis is the development of fast asymptotic methods for integer multiplication. A particular family of algorithms has been proposed after an article by Fürer published in 2007, which gave a first algorithm, relying on fast Fourier transform (FFT), to multiply n-bit integers in O(n log n 2^{O(log^* n)}), where log^* is the iterated logarithm function. In this thesis, an algorithm relying on a number-theoretic conjecture is proposed, involving FFT and generalized Fermat primes. With a careful complexity analysis of this algorithm, we obtain a complexity of O(n log n 4^{log^* n}).
Bassomo, Pierre. "Contribution à la parallélisation de méthodes numériques à matrices creuses skyline. Application à un module de calcul de modes et fréquences propres de Systus." Phd thesis, Ecole Nationale Supérieure des Mines de Saint-Etienne, 1999. http://tel.archives-ouvertes.fr/tel-00822654.
Bassomo, Pierre. "Contribution à la parallélisation de méthodes numériques à matrices creuses skylines : application à un module de calcul de modes et fréquences propres de SYSTUS." Saint-Etienne, EMSE, 1999. http://tel.archives-ouvertes.fr/docs/00/82/26/54/PDF/1999_Bassomo_Pierre.pdf.
Distributed memory machines consisting of multiple autonomous processors connected by a network are becoming commonplace. Unlike specialized machines like systolic arrays, such systems of autonomous processors provide virtual parallelism through standard message passing libraries (PVM or MPI). In the area of parallelizing existing numerical algorithms, two main approaches have been proposed: automatic parallelization techniques and explicit parallelization. In the present work, we focus on the second approach. The parallelization paradigm found to be most effective for numerical algorithms on distributed memory machines is to provide the user with a client/server architecture. The most difficult part to design is the SPMD code initiated by a client process to speed up the computing time. To do this, our methodology aims at reusing the systolic model principles to exhibit the potential parallelism inside nested loops, and at justifying the aggregation of loop iterations so as to reduce communication overheads while exploiting coarse-grained parallelism. Each aggregation is a block of fine-grained computations not located in the same hyperplane of a given space. It also defines an atomic unit of computation, i.e. no synchronization or communication is necessary during the execution of the fine-grained computations inside a block. All necessary data must thus be available before such atomic executions, which imposes the constraint that splitting the set of fine-grained computations does not result in deadlocks.
Welin-Berger, Robert, and Anton Bäckström. "Optimizing Strassen's multiplication algorithm for modern processors : A study in optimizing matrix multiplications for large matrices on modern CPUs." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-186418.
Falco, Aurélien. "Bridging the Gap Between H-Matrices and Sparse Direct Methods for the Solution of Large Linear Systems." Thesis, Bordeaux, 2019. http://www.theses.fr/2019BORD0090/document.
Many physical phenomena may be studied through modeling and numerical simulations, commonplace in scientific applications. To be tractable on a computer, appropriate discretization techniques must be considered, which often lead to a set of linear equations whose features depend on the discretization techniques. Among them, the Finite Element Method usually leads to sparse linear systems whereas the Boundary Element Method leads to dense linear systems. The size of the resulting linear systems depends on the domain where the studied physical phenomenon develops and tends to become larger and larger as the performance of the computer facilities increases. For the sake of numerical robustness, solution techniques based on the factorization of the matrix associated with the linear system are the methods of choice when affordable. In that respect, hierarchical methods based on low-rank compression have allowed a drastic reduction of the computational requirements for the solution of dense linear systems over the last two decades. For sparse linear systems, their application remains a challenge which has been studied by both the community of hierarchical matrices and the community of sparse matrices. On the one hand, the first step taken by the community of hierarchical matrices most often takes advantage of the sparsity of the problem through the use of nested dissection. While this approach benefits from the hierarchical structure, it is not, however, as efficient as sparse solvers regarding the exploitation of zeros and the structural separation of zeros from non-zeros. On the other hand, sparse factorization is organized so as to lead to a sequence of smaller dense operations, enticing sparse solvers to use this property and to exploit compression techniques from hierarchical methods in order to reduce the computational cost of these elementary operations. Nonetheless, the globally hierarchical structure may be lost if the compression of hierarchical methods is used only locally on dense submatrices. We here review the main techniques that have been employed by both communities, trying to highlight their common properties and their respective limits, with a special emphasis on studies that have aimed to bridge the gap between them. With these observations in mind, we propose a class of hierarchical algorithms based on the symbolic analysis of the structure of the factors of a sparse matrix. These algorithms rely on symbolic information to cluster and construct a hierarchical structure coherent with the non-zero pattern of the matrix. Moreover, the resulting hierarchical matrix relies on low-rank compression to reduce the memory consumption of large submatrices as well as the time to solution of the solver. We also compare multiple ordering techniques based on geometrical or topological properties. Finally, we open the discussion to a coupling between the Finite Element Method and the Boundary Element Method in a unified computational framework.
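A hedged sketch of the low-rank compression both communities rely on: an admissible block is replaced by a truncated SVD, trading m·n stored entries for k(m+n). The kernel and tolerance below are illustrative choices of ours:

```python
import numpy as np

def compress(block, tol=1e-10):
    """Truncated-SVD compression of a block at relative tolerance tol."""
    U, s, Vt = np.linalg.svd(block, full_matrices=False)
    k = int(np.sum(s > tol * s[0]))          # numerical rank
    return U[:, :k] * s[:k], Vt[:k]          # k(m+n) entries instead of m*n

# A smooth interaction between two well-separated clusters is numerically low-rank.
x = np.linspace(0.0, 1.0, 200)
y = np.linspace(2.0, 3.0, 200)
K = 1.0 / np.abs(x[:, None] - y[None, :])
A, B = compress(K)
print(A.shape[1])                            # small rank despite a 200x200 block
assert np.allclose(A @ B, K, atol=1e-6)
```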
L'Excellent, Jean-Yves. "Multifrontal Methods: Parallelism, Memory Usage and Numerical Aspects." Habilitation à diriger des recherches, Ecole normale supérieure de lyon - ENS LYON, 2012. http://tel.archives-ouvertes.fr/tel-00737751.
Ferreira, Silvia da Rocha Izidoro. "Aplicações de matrizes no ensino médio." Universidade de São Paulo, 2013. http://www.teses.usp.br/teses/disponiveis/55/55136/tde-07062013-100316/.
The aim of this work is to highlight the use of algebraic operations with matrices in mathematics teaching for secondary school students. For this purpose, we studied some topics that require algebraic operations with matrices. It was observed that these topics reveal circumstances in which matrix multiplication is useful not only in theoretical developments but also in applications. In addition, the study showed that these themes have the potential to be considered in secondary school.
Agullo, Emmanuel. "Méthodes directes hors-mémoire (out-of-core) pour la résolution de systèmes linéaires creux de grande taille." Phd thesis, Ecole normale supérieure de lyon - ENS LYON, 2008. http://tel.archives-ouvertes.fr/tel-00563463.
Negre, Stéphane. "Optimisation de la méthode multifrontale en vue de sa parallélisation." Compiègne, 1997. http://www.theses.fr/1997COMP1045.
This work is concerned with optimizing the parallelization of a particular finite element resolution method based on a substructuring principle, the multifrontal method. Our aim was to build a set of methods and tools in order to parallelize this method efficiently. The problem belongs to combinatorial optimization and graph theory: a finite element mesh can be modeled as a graph, and the computing time spent to solve the problem directly depends on the reordering of an associated graph. Because the quality of a solution has to be measured, we propose accurate computing-time estimators to evaluate our solutions. We compare different heuristics found in the literature, and propose efficient improvements of these heuristics as well as two original reordering methods. The first one is a hybrid method which interleaves greedy algorithms. The second one is based on tabu search, a metaheuristic. Another problem we are concerned with is the decomposition of the mesh into substructures. Different methods are compared and a new one is proposed, which iteratively improves an initial mesh decomposition by exchanging finite elements between substructures. In this way we aim to optimize the load balancing on the processors by estimating the computing time of each substructure. The computing time of a substructure also depends on the reordering of the finite elements within it; however, the boundary nodes between the substructures must additionally be taken into account, so we propose reordering methods tailored to this additional constraint. When a parallel treatment is performed, it is also important to schedule the tasks on the processors. We propose and study a theoretical model under some assumptions concerning the computing times (communication delays and merging tasks), and show that one scheduling strategy dominates another widely used in the mechanical and numerical community. The results are compared on the thirty meshes of Everstine's collection and show the efficiency of our algorithms.
Hofmann, B., and G. Fleischer. "Stability Rates for Linear Ill-Posed Problems with Convolution and Multiplication Operators." Universitätsbibliothek Chemnitz, 1998. http://nbn-resolving.de/urn:nbn:de:bsz:ch1-199800987.
Weisbecker, Clément. "Improving multifrontal solvers by means of algebraic Block Low-Rank representations." Phd thesis, Toulouse, INPT, 2013. http://oatao.univ-toulouse.fr/10506/1/weisbecker.pdf.
Murphy, Steven. "Methods for solving discontinuous-Galerkin finite element equations with application to neutron transport." Phd thesis, Toulouse, INPT, 2015. http://oatao.univ-toulouse.fr/14650/1/murphy.pdf.
Ailem, Melissa. "Sparsity-sensitive diagonal co-clustering algorithms for the effective handling of text data." Thesis, Sorbonne Paris Cité, 2016. http://www.theses.fr/2016USPCB087.
In the current context, there is a clear need for text mining techniques to analyse the huge quantity of unstructured text documents available on the Internet. These textual data are often represented by sparse high-dimensional matrices where rows and columns represent documents and terms respectively. Thus, it would be worthwhile to simultaneously group these terms and documents into meaningful clusters, making this substantial amount of data easier to handle and interpret. Co-clustering techniques serve just this purpose. Although many existing co-clustering approaches have been successful in revealing homogeneous blocks in several domains, these techniques are still challenged by the high dimensionality and sparsity characteristics exhibited by document-term matrices. Due to this sparsity, several co-clusters are primarily composed of zeros. While homogeneous, these co-clusters are irrelevant and must be filtered out in a post-processing step to keep only the most significant ones. The objective of this thesis is to propose new co-clustering algorithms tailored to take these sparsity-related issues into account. The proposed algorithms seek a block-diagonal structure and allow the most useful co-clusters to be identified straight away, which makes them especially effective for the text co-clustering task. Our contributions can be summarized as follows. First, we introduce and demonstrate the effectiveness of a novel co-clustering algorithm based on a direct maximization of graph modularity. While existing graph-based co-clustering algorithms rely on spectral relaxation, the proposed algorithm uses an iterative alternating optimization procedure to reveal the most meaningful co-clusters in a document-term matrix. Moreover, the proposed optimization has the advantage of avoiding the computation of eigenvectors, a task which is prohibitive for high-dimensional data. This is an improvement over spectral approaches, where the eigenvector computation is necessary to perform the co-clustering. Second, we use an even more powerful approach to discover block-diagonal structures in document-term matrices. We rely on mixture models, which offer strong theoretical foundations and considerable flexibility that makes it possible to uncover various specific cluster structures. More precisely, we propose a rigorous probabilistic model based on the Poisson distribution and the well-known Latent Block Model. Interestingly, this model includes the sparsity in its formulation, which makes it particularly effective for text data. Estimating this model's parameters under the Maximum Likelihood (ML) and the Classification Maximum Likelihood (CML) approaches, we propose four co-clustering algorithms: a hard one, a soft one, a stochastic one, and a fourth one which leverages the benefits of the soft and stochastic variants simultaneously. As a last contribution of this thesis, we propose a new biomedical text mining framework that includes some of the above-mentioned co-clustering algorithms. This work shows the contribution of co-clustering to a real biomedical text mining problem. The proposed framework is able to propose new clues about the results of genome-wide association studies (GWAS) by mining PubMed abstracts. It has been tested on asthma and allowed us to assess the strength of associations between asthma genes reported in previous GWAS, as well as to discover new candidate genes likely associated with asthma. In a nutshell, while several text co-clustering algorithms already exist, their performance can be substantially increased if more appropriate models and algorithms are available. According to the extensive experiments done on several challenging real-world text datasets, we believe that this thesis has served this objective well.
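The thesis's own algorithms are not in scikit-learn, but the spectral baseline it positions itself against is. A toy run on a planted block-diagonal document-term matrix (sizes and offsets are ours):

```python
import numpy as np
from sklearn.cluster import SpectralCoclustering

rng = np.random.default_rng(0)
X = np.full((40, 60), 0.01)                 # small offset keeps rows/columns nonzero
X[:20, :30] += rng.poisson(2.0, (20, 30))   # planted co-cluster 1
X[20:, 30:] += rng.poisson(2.0, (20, 30))   # planted co-cluster 2

model = SpectralCoclustering(n_clusters=2, random_state=0).fit(X)
print(model.row_labels_, model.column_labels_)   # recovers the two diagonal blocks
```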
Slavova, Tzvetomila. "Résolution triangulaire de systèmes linéaires creux de grande taille dans un contexte parallèle multifrontal et hors-mémoire." Thesis, Toulouse, INPT, 2009. http://www.theses.fr/2009INPT016H/document.
We consider the solution of very large systems of linear equations with direct multifrontal methods. In this context the size of the factors is an important limitation for the use of sparse direct solvers; we therefore assume that the factors have been written on the local disks of our target multiprocessor machine during the parallel factorization. Our main focus is the study and the design of efficient approaches for the forward and backward substitution phases after a sparse multifrontal factorization. These phases involve sparse triangular solution and have often been neglected in previous works on sparse direct factorization. In many applications, however, the time for the solution can be the main bottleneck for performance. This thesis consists of two parts. The focus of the first part is on optimizing the out-of-core performance of the solution phase. The focus of the second part is to further improve performance by exploiting the sparsity of the right-hand-side vectors. In the first part, we describe and compare two approaches to access data from the hard disk, and then show that in a parallel environment the task scheduling can strongly influence performance. We prove that a constrained ordering of the tasks is possible: it does not introduce any deadlock and it improves performance. Experiments on large real test problems (more than 8 million unknowns) using an out-of-core version of a sparse multifrontal code called MUMPS (MUltifrontal Massively Parallel Solver) are used to analyse the behaviour of our algorithms. In the second part, we are interested in applications with multiple sparse right-hand sides, particularly those with single nonzero entries. The motivating applications arise in electromagnetism and data assimilation. In such applications, we need either to compute the null space of a highly rank-deficient matrix or to compute entries of the inverse of a matrix associated with the normal equations of a linear least-squares problem. We cast both of these problems as linear systems with multiple right-hand-side vectors, each containing a single nonzero entry. We describe, implement and comment on efficient algorithms to reduce the input-output cost during an out-of-core execution. We show how the sparsity of the right-hand side can be exploited to limit both the number of operations and the amount of data accessed. The work presented in this thesis has been partially supported by the SOLSTICE ANR project (ANR-06-CIS6-010).
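A toy illustration of "exploiting the sparsity of the right-hand side": for a unit lower-triangular solve, only the columns reachable from the nonzeros of b are ever touched. A scalar, in-core sketch with names and storage format of our choosing:

```python
import heapq

def sparse_lower_solve(L_cols, b):
    """Solve L x = b for unit lower-triangular L stored by columns:
    L_cols[j] = [(i, L_ij), ...] with i > j; b sparse as {index: value}."""
    x = dict(b)
    heap = sorted(x)                  # process columns in increasing order
    while heap:
        j = heapq.heappop(heap)
        for i, lij in L_cols[j]:      # scatter: only reachable rows are touched
            if i not in x:
                x[i] = 0.0
                heapq.heappush(heap, i)
            x[i] -= lij * x[j]
    return x

# 5x5 example with L[2,0] = L[4,2] = 1: supp(b) = {0} only reaches rows {0, 2, 4}.
L_cols = {0: [(2, 1.0)], 1: [(3, 1.0)], 2: [(4, 1.0)], 3: [], 4: []}
print(sparse_lower_solve(L_cols, {0: 1.0}))   # {0: 1.0, 2: -1.0, 4: 1.0}
```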
Ailem, Melissa. "Sparsity-sensitive diagonal co-clustering algorithms for the effective handling of text data." Electronic Thesis or Diss., Sorbonne Paris Cité, 2016. http://www.theses.fr/2016USPCB087.
In the current context, there is a clear need for text mining techniques to analyse the huge quantity of unstructured text documents available on the Internet. These textual data are often represented by sparse high-dimensional matrices where rows and columns represent documents and terms respectively. Thus, it would be worthwhile to simultaneously group these terms and documents into meaningful clusters, making this substantial amount of data easier to handle and interpret. Co-clustering techniques serve just this purpose. Although many existing co-clustering approaches have been successful in revealing homogeneous blocks in several domains, these techniques are still challenged by the high dimensionality and sparsity characteristics exhibited by document-term matrices. Due to this sparsity, several co-clusters are primarily composed of zeros. While homogeneous, these co-clusters are irrelevant and must be filtered out in a post-processing step to keep only the most significant ones. The objective of this thesis is to propose new co-clustering algorithms tailored to take these sparsity-related issues into account. The proposed algorithms seek a block-diagonal structure and allow the most useful co-clusters to be identified straight away, which makes them especially effective for the text co-clustering task. Our contributions can be summarized as follows. First, we introduce and demonstrate the effectiveness of a novel co-clustering algorithm based on a direct maximization of graph modularity. While existing graph-based co-clustering algorithms rely on spectral relaxation, the proposed algorithm uses an iterative alternating optimization procedure to reveal the most meaningful co-clusters in a document-term matrix. Moreover, the proposed optimization has the advantage of avoiding the computation of eigenvectors, a task which is prohibitive for high-dimensional data. This is an improvement over spectral approaches, where the eigenvector computation is necessary to perform the co-clustering. Second, we use an even more powerful approach to discover block-diagonal structures in document-term matrices. We rely on mixture models, which offer strong theoretical foundations and considerable flexibility that makes it possible to uncover various specific cluster structures. More precisely, we propose a rigorous probabilistic model based on the Poisson distribution and the well-known Latent Block Model. Interestingly, this model includes the sparsity in its formulation, which makes it particularly effective for text data. Estimating this model's parameters under the Maximum Likelihood (ML) and the Classification Maximum Likelihood (CML) approaches, we propose four co-clustering algorithms: a hard one, a soft one, a stochastic one, and a fourth one which leverages the benefits of the soft and stochastic variants simultaneously. As a last contribution of this thesis, we propose a new biomedical text mining framework that includes some of the above-mentioned co-clustering algorithms. This work shows the contribution of co-clustering to a real biomedical text mining problem. The proposed framework is able to propose new clues about the results of genome-wide association studies (GWAS) by mining PubMed abstracts. It has been tested on asthma and allowed us to assess the strength of associations between asthma genes reported in previous GWAS, as well as to discover new candidate genes likely associated with asthma. In a nutshell, while several text co-clustering algorithms already exist, their performance can be substantially increased if more appropriate models and algorithms are available. According to the extensive experiments done on several challenging real-world text datasets, we believe that this thesis has served this objective well.
Rouet, François-Henry. "Memory and performance issues in parallel multifrontal factorizations and triangular solutions with sparse right-hand sides." Thesis, Toulouse, INPT, 2012. http://www.theses.fr/2012INPT0070/document.
We consider the solution of very large sparse systems of linear equations on parallel architectures. In this context, memory is often a bottleneck that prevents or limits the use of direct solvers, especially those based on the multifrontal method. This work focuses on memory and performance issues of the two memory- and computationally-intensive phases of direct methods, namely the numerical factorization and the solution phase. In the first part we consider the solution phase with sparse right-hand sides, and in the second part we consider the memory scalability of the multifrontal factorization. In the first part, we focus on the triangular solution phase with multiple sparse right-hand sides, which appears in numerous applications. We especially emphasize the computation of entries of the inverse, where both the right-hand sides and the solution are sparse. We first present several storage schemes that enable a significant compression of the solution space, both in a sequential and a parallel context. We then show that the way the right-hand sides are partitioned into blocks strongly influences the performance, and we consider two different settings: the out-of-core case, where the aim is to reduce the number of accesses to the factors, which are stored on disk, and the in-core case, where the aim is to reduce the computational cost. Finally, we show how to enhance the parallel efficiency. In the second part, we consider the parallel multifrontal factorization. We show that controlling the active memory specific to the multifrontal method is critical, and that commonly used mapping techniques usually fail to do so: they cannot achieve a high memory scalability, i.e. they dramatically increase the amount of memory needed by the factorization when the number of processors increases. We propose a class of "memory-aware" mapping and scheduling algorithms that aim at maximizing performance while enforcing a user-given memory constraint, and that provide robust memory estimates before the factorization. These techniques raised performance issues in the parallel dense kernels used at each step of the factorization, and we have proposed some algorithmic improvements. The ideas presented throughout this study have been implemented within the MUMPS (MUltifrontal Massively Parallel Solver) solver and experimented on large matrices (up to a few tens of millions of unknowns) and massively parallel architectures (up to a few thousand cores). They have been shown to improve the performance and the robustness of the code, and will be available in a future release. Some of the ideas presented in the first part have also been implemented within the PDSLin (Parallel Domain decomposition Schur complement based Linear solver) solver.
Mary, Théo. "Solveurs multifrontaux exploitant des blocs de rang faible : complexité, performance et parallélisme." Thesis, Toulouse 3, 2017. http://www.theses.fr/2017TOU30305/document.
We investigate the use of low-rank approximations to reduce the cost of sparse direct multifrontal solvers. Among the different matrix representations that have been proposed to exploit the low-rank property within multifrontal solvers, we focus on the Block Low-Rank (BLR) format, whose simplicity and flexibility make it easy to use in a general-purpose, algebraic multifrontal solver. We present different variants of the BLR factorization, depending on how the low-rank updates are performed and on the constraints to handle numerical pivoting. We first investigate the theoretical complexity of the BLR format which, unlike other formats such as hierarchical ones, was previously unknown. We prove that the theoretical complexity of the BLR multifrontal factorization is asymptotically lower than that of the full-rank solver, and show how the BLR variants can further reduce that complexity. We provide an experimental study with numerical results to support our complexity bounds. After proving that BLR multifrontal solvers can achieve a low complexity, we turn to the problem of translating that low complexity into actual performance gains on modern architectures. We first present a multithreaded BLR factorization and analyze its performance in shared-memory multicore environments on a large set of real-life problems. We put forward several algorithmic properties of the BLR variants necessary to efficiently exploit multicore systems by improving the arithmetic intensity and the scalability of the BLR factorization. We then move on to the distributed-memory BLR factorization, for which additional challenges are identified and addressed. The algorithms presented throughout this thesis have been implemented within the MUMPS solver. We illustrate the use of our approach in three industrial applications coming from geosciences and structural mechanics. We also compare our solver with the STRUMPACK package, based on Hierarchically Semi-Separable approximations. We conclude this thesis by reporting results on a very large problem (130 million unknowns) which illustrates the future challenges posed by BLR multifrontal solvers at scale.
Theveny, Philippe. "Numerical Quality and High Performance In Interval Linear Algebra on Multi-Core Processors." Thesis, Lyon, École normale supérieure, 2014. http://www.theses.fr/2014ENSL0941/document.
This work aims at determining suitable scopes for several algorithms of interval matrix multiplication. First, we quantify the numerical quality. Former error analyses of interval matrix products establish bounds on the radius overestimation by neglecting the roundoff error. We discuss here several possible measures for interval approximations. We then bound the roundoff error and compare this bound experimentally with the global error distribution on several random data sets. This approach highlights the relative importance of the roundoff and arithmetic errors depending on the value and homogeneity of the relative accuracies of the inputs, on the matrix dimension, and on the working precision. It also leads to a new algorithm that is cheaper yet as accurate as previous ones under well-identified conditions. Second, we exploit the parallelism of linear algebra. Previous implementations use calls to BLAS routines on numerical matrices. We show that this may lead to wrong interval results and also restricts the scalability of the performance when the core count increases. To overcome these problems, we implement a blocked version with OpenMP threads executing block kernels with vector instructions. The timings on a 4-octo-core machine show that this implementation is more scalable than the BLAS one and that the costs of numerical and interval matrix products are comparable.
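The midpoint-radius product that such error analyses reason about looks as follows in exact arithmetic; the floating-point roundoff term deliberately omitted here is precisely what the thesis bounds:

```python
import numpy as np

def interval_matmul(Am, Ar, Bm, Br):
    """Enclosure <Cm, Cr> of {A @ B : |A - Am| <= Ar, |B - Bm| <= Br}.

    Valid in exact arithmetic: |AB - AmBm| <= |Am| Br + Ar (|Bm| + Br).
    Rounding of the products themselves is ignored in this sketch.
    """
    Cm = Am @ Bm
    Cr = np.abs(Am) @ Br + Ar @ (np.abs(Bm) + Br)
    return Cm, Cr

Am, Ar = np.array([[1.0, 2.0]]), np.array([[0.1, 0.0]])
Bm, Br = np.array([[3.0], [4.0]]), np.array([[0.0], [0.5]])
print(interval_matmul(Am, Ar, Bm, Br))   # midpoint [[11.]], radius [[1.3]]
```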
Weisbecker, Clement. "Amélioration des solveurs multifrontaux à l'aide de représentations algébriques rang-faible par blocs." Phd thesis, Institut National Polytechnique de Toulouse - INPT, 2013. http://tel.archives-ouvertes.fr/tel-00934939.
Zenadi, Mohamed. "Méthodes hybrides pour la résolution de grands systèmes linéaires creux sur calculateurs parallèles." Thesis, Toulouse, INPT, 2013. http://www.theses.fr/2013INPT0126/document.
We are interested in solving large sparse systems of linear equations in parallel. Computing the solution of such systems requires a large amount of memory and computational power. The two main ways to obtain the solution are direct and iterative approaches: the former achieves this goal quickly but with a large memory footprint, while the latter is memory-friendly but can be slow to converge. In this work we first try to combine both approaches to create a hybrid solver that can be memory-efficient while being fast. We then discuss a novel approach that creates a pseudo-direct solver compensating for the drawbacks of the earlier approach. In the first chapters we look at row projection techniques, especially the block Cimmino method, and examine some of their numerical aspects and how they affect convergence. We then discuss the acceleration of convergence using conjugate gradients and show that a block version improves convergence. Next, we see how partitioning the linear system affects convergence and show how to improve the quality of the partitioning. We finish by discussing the parallel implementation of the hybrid solver, its performance, and how it can be improved. The last two chapters focus on an improvement to this hybrid solver: we improve the numerical properties of the linear system so that convergence is reached in a single iteration, which results in a pseudo-direct solver. We first discuss the numerical properties of the new system, see how it works in parallel, and see how it performs versus the iterative version and versus a direct solver. We finally consider some possible improvements to the solver. This work led to the implementation of a hybrid solver, our "ABCD solver" (Augmented Block Cimmino Distributed solver), that can work either in a fully iterative mode or in a pseudo-direct mode.
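A minimal, unaccelerated sketch of the block Cimmino iteration discussed above (the thesis accelerates it with block conjugate gradients and later augments the system to converge in one step; the block partitioning and parameters here are ours):

```python
import numpy as np

def block_cimmino(blocks, iters=200, omega=1.0):
    """Row-projection iteration for a row-partitioned system A x = b.

    blocks = [(A_i, b_i), ...]; each step sums the projections of the
    current residual onto the row spaces of the blocks.
    """
    p = len(blocks)
    x = np.zeros(blocks[0][0].shape[1])
    pinvs = [np.linalg.pinv(Ai) for Ai, _ in blocks]
    for _ in range(iters):
        x = x + (omega / p) * sum(Pi @ (bi - Ai @ x)
                                  for Pi, (Ai, bi) in zip(pinvs, blocks))
    return x

A = np.array([[2.0, 1.0], [1.0, 3.0], [1.0, -1.0], [0.0, 1.0]])
b = np.array([3.0, 4.0, 0.0, 1.0])
print(block_cimmino([(A[:2], b[:2]), (A[2:], b[2:])]))   # converges to [1, 1]
```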
Rouet, François-Henry. "Problèmes de mémoire et de performance de la factorisation multifrontale parallèle et de la résolution triangulaire à seconds membres creux." Phd thesis, Institut National Polytechnique de Toulouse - INPT, 2012. http://tel.archives-ouvertes.fr/tel-00785748.
He, Haiwu. "ANALYSES AVANCÉES DE LA MÉTHODE HYBRIDE GMRES/LS-ARNOLDI ASYNCHRONE PARALLÈLE ET DISTRIBUÉE POUR LES GRILLES DE CALCUL ET LES SUPERCALCULATEURS." Phd thesis, Université des Sciences et Technologie de Lille - Lille I, 2005. http://tel.archives-ouvertes.fr/tel-00431124.
Loulidi, Sanae. "Modélisation stochastique en finance, application à la construction d’un modèle à changement de régime avec des sauts." Thesis, Bordeaux 1, 2008. http://www.theses.fr/2008BOR13675/document.
Повний текст джерелаAbstract
Garnier, Romain. "Contribution à la résolution des équations de Maxwell dans les structures périodiques par la méthode des éléments finis." Phd thesis, Toulouse 3, 2013. http://thesesups.ups-tlse.fr/1944/.
Electromagnetic periodic structures are of great interest. These structures act as frequency filters and allow the manufacturing of meta-materials, artificial composite materials that exhibit electromagnetic properties unusual in natural materials, such as band gaps. This enables new devices to guide, focus, or stop propagation, which is useful, for example, to avoid coupling between various radiating elements via the characterization of the surface waves which propagate at the interface between air and the periodic structure. This thesis provides a description of the finite element method dedicated to the characterization of periodic structures. Numerical modelling results in large eigenvalue problems, involving the solution of linear systems composed of sparse matrices. A method for solving this type of problem is therefore discussed, optimizing and combining different algorithms. Before discussing the different aspects of the developed method, we establish an exhaustive list of the existing methods, stating their advantages and drawbacks, and note that the finite element method can handle a wide range of three-dimensional periodic structures without limitation on their shape. We present different formulations of this method, then detail its algorithmic aspects and show that an analysis of the resolution settings can impact the physical interpretation of the results. Finally we show the performance of our tool on classical validation results from the literature and discuss the characterization of surface waves; a study of a patch antenna array embedded in metal cavities is conducted. To conclude, the studies conducted in this thesis have resulted in a code integrated into a computational environment already in use at ONERA.
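As a hedged illustration of the linear-algebra core (not the thesis's solver): FEM discretizations lead to generalized eigenvalue problems K v = λ M v with sparse K and M, which shift-invert Lanczos solves for the eigenvalues of interest; the 1-D Laplacian below is a toy stand-in:

```python
import numpy as np
from scipy.sparse import diags, identity
from scipy.sparse.linalg import eigsh

n = 1000
K = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")  # "stiffness"
M = identity(n, format="csc")                                          # lumped "mass"

# Shift-invert targets the eigenvalues closest to sigma.
vals, _ = eigsh(K, k=6, M=M, sigma=0.0, which="LM")
print(np.sort(vals))
```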
Garnier, Romain. "Contribution à la résolution des équations de Maxwell dans les structures périodiques par la méthode des éléments finis." Phd thesis, Université Paul Sabatier - Toulouse III, 2013. http://tel.archives-ouvertes.fr/tel-00878558.