A ready-made bibliography on the topic “SpMM Multiplication”

Create a correct reference in APA, MLA, Chicago, Harvard, and many other styles


Consult the lists of current articles, books, dissertations, conference abstracts, and other scholarly sources on the topic “SpMM Multiplication.”

An “Add to bibliography” button is available next to every work in the list. Use it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scholarly publication as a “.pdf” file and read its abstract online, whenever the relevant parameters are available in the work’s metadata.

Journal articles on the topic “SpMM Multiplication”

1

Wilkinson, Lucas, Kazem Cheshmi, and Maryam Mehri Dehnavi. “Register Tiling for Unstructured Sparsity in Neural Network Inference.” Proceedings of the ACM on Programming Languages 7, PLDI (June 6, 2023): 1995–2020. http://dx.doi.org/10.1145/3591302.

Abstract:
Unstructured sparse neural networks are an important class of machine learning (ML) models, as they compact model size and reduce floating point operations. The execution time of these models is frequently dominated by the sparse matrix multiplication (SpMM) kernel, C = A × B, where A is a sparse matrix, and B and C are dense matrices. The unstructured sparsity pattern of matrices in pruned machine learning models, along with their sparsity ratio, has rendered useless the large class of libraries and systems that optimize sparse matrix multiplications. Reusing registers is particularly difficult because accesses to memory locations should be known statically. This paper proposes Sparse Register Tiling, a new technique composed of an unroll-and-sparse-jam transformation followed by data compression that is specifically tailored to sparsity patterns in ML matrices. Unroll-and-sparse-jam uses sparsity information to jam the code while improving register reuse. Sparse register tiling is evaluated across 2396 weight matrices from transformer and convolutional models with a sparsity range of 60–95% and provides an average speedup of 1.72× and 2.65× over MKL SpMM and dense matrix multiplication, respectively, on a multicore CPU processor. It also provides an end-to-end speedup of 2.12× for MobileNetV1 with 70% sparsity on an ARM processor commonly used in edge devices.
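Since the abstract pins down the kernel precisely, a minimal sketch of the SpMM it describes may help: C = A × B with sparse A and dense B, written with SciPy’s CSR type. The sizes and the 70% sparsity below are illustrative assumptions, not the paper’s configuration.

    import numpy as np
    import scipy.sparse as sp

    # SpMM as defined in the abstract: C = A x B, A sparse (CSR), B and C dense.
    rng = np.random.default_rng(0)
    A = sp.random(512, 256, density=0.30, format="csr", random_state=rng)  # ~70% sparse
    B = rng.standard_normal((256, 64))

    C = A @ B  # SciPy dispatches to its CSR sparse-times-dense kernel
    assert np.allclose(C, A.toarray() @ B)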
2

Anzt, Hartwig, Stanimire Tomov, and Jack Dongarra. “On the performance and energy efficiency of sparse linear algebra on GPUs.” International Journal of High Performance Computing Applications 31, no. 5 (October 5, 2016): 375–90. http://dx.doi.org/10.1177/1094342016672081.

Abstract:
In this paper we unveil some performance and energy efficiency frontiers for sparse computations on GPU-based supercomputers. We compare the resource efficiency of different sparse matrix–vector products (SpMV) taken from libraries such as cuSPARSE and MAGMA for GPU and Intel’s MKL for multicore CPUs, and develop a GPU sparse matrix–matrix product (SpMM) implementation that handles the simultaneous multiplication of a sparse matrix with a set of vectors in block-wise fashion. While a typical sparse computation such as the SpMV reaches only a fraction of the peak of current GPUs, we show that the SpMM succeeds in exceeding the memory-bound limitations of the SpMV. We integrate this kernel into a GPU-accelerated Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) eigensolver. LOBPCG is chosen as a benchmark algorithm for this study as it combines an interesting mix of sparse and dense linear algebra operations that is typical for complex simulation applications, and allows for hardware-aware optimizations. In a detailed analysis we compare the performance and energy efficiency against a multi-threaded CPU counterpart. The reported performance and energy efficiency results are indicative of sparse computations on supercomputers.
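The point the abstract makes, that multiplying a sparse matrix with a set of vectors in block-wise fashion beats repeated SpMV, can be sketched in a few lines; the matrix size and the block width of 8 are assumptions for illustration.

    import numpy as np
    import scipy.sparse as sp

    rng = np.random.default_rng(1)
    A = sp.random(1000, 1000, density=0.01, format="csr", random_state=rng)
    X = rng.standard_normal((1000, 8))  # a block of 8 vectors

    # k separate SpMVs read A's nonzeros k times ...
    Y_spmv = np.column_stack([A @ X[:, j] for j in range(X.shape[1])])
    # ... while one SpMM reads them once, the memory-bound win described above.
    Y_spmm = A @ X
    assert np.allclose(Y_spmv, Y_spmm)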
3

Ernst, Thomas. “On the q-Lie group of q-Appell polynomial matrices and related factorizations.” Special Matrices 6, no. 1 (February 1, 2018): 93–109. http://dx.doi.org/10.1515/spma-2018-0009.

Abstract:
In the spirit of our earlier paper [10] and Zhang and Wang [16], we introduce the matrix of multiplicative q-Appell polynomials of order M ∈ ℤ. This is the representation of the respective q-Appell polynomials in ke-ke basis. Based on the fact that the q-Appell polynomials form a commutative ring [11], we prove that this set constitutes a q-Lie group with two dual q-multiplications in the sense of [9]. A comparison with earlier results on q-Pascal matrices gives factorizations according to [7], which are specialized to q-Bernoulli and q-Euler polynomials. We also show that the corresponding q-Bernoulli and q-Euler matrices form q-Lie subgroups. In the limit q → 1 we obtain corresponding formulas for Appell polynomial matrices. We conclude by presenting the commutative ring of generalized q-Pascal functional matrices, which operates on all functions f ∈ C^∞_q.
4

Guzu, D., T. Hoffmann-Ostenhof, and A. Laptev. “On a class of sharp multiplicative Hardy inequalities.” St. Petersburg Mathematical Journal 32, no. 3 (May 11, 2021): 523–30. http://dx.doi.org/10.1090/spmj/1659.

Abstract:
A class of weighted Hardy inequalities is treated. The sharp constants depend on the lowest eigenvalues of auxiliary Schrödinger operators on a sphere. In particular, for some block radial weights these sharp constants are given in terms of the lowest eigenvalue of a Legendre type equation.
5

Bakhadly, Bakhad, Alexander Guterman, and María Jesús de la Puente. “Orthogonality for (0, −1) tropical normal matrices.” Special Matrices 8, no. 1 (February 17, 2020): 40–60. http://dx.doi.org/10.1515/spma-2020-0006.

Abstract:
We study pairs of mutually orthogonal normal matrices with respect to tropical multiplication. Minimal orthogonal pairs are characterized. The diameter and girth of three graphs arising from the orthogonality equivalence relation are computed.
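For readers unfamiliar with the setting, here is a small sketch of a tropical (max-plus) matrix product, the multiplication the orthogonality notion is built on; the max-plus convention and the sample matrix are assumptions for illustration, not taken from the paper.

    import numpy as np

    def tropical_matmul(A: np.ndarray, B: np.ndarray) -> np.ndarray:
        # Max-plus product: (A (x) B)_ij = max over k of (A_ik + B_kj).
        return (A[:, :, None] + B[None, :, :]).max(axis=1)

    A = np.array([[0.0, -1.0], [-1.0, 0.0]])  # a (0, -1) matrix as in the title
    print(tropical_matmul(A, A))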
6

Liu, Jie. “Accuracy Controllable SpMV Optimization on GPU.” Journal of Physics: Conference Series 2363, no. 1 (November 1, 2022): 012008. http://dx.doi.org/10.1088/1742-6596/2363/1/012008.

Abstract:
Sparse matrix vector multiplication (SpMV) is a key kernel widely used in a variety of fields, and mixed-precision calculation brings opportunities to SpMV optimization. Researchers have proposed to store nonzero elements in the interval (-1, 1) in single precision and calculate SpMV in mixed precision. Though it leads to high performance, it also brings loss of accuracy. This paper proposes an accuracy controllable optimization method for SpMV. By limiting the error caused by converting double-precision floating-point numbers in the interval (-1, 1) into single-precision format, the calculation accuracy of mixed-precision SpMV is effectively improved. We tested sparse matrices from the SuiteSparse Matrix Collection on Tesla V100. Compared with the existing mixed-precision MpSpMV kernel, the mixed-precision SpMV proposed in this paper achieves an accuracy improvement.
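A rough sketch of the magnitude-splitting idea the abstract describes, storing nonzeros from (−1, 1) in single precision and the rest in double, then summing two partial SpMVs, is below; the matrix, the split mechanics, and the error metric are illustrative assumptions rather than the paper’s GPU kernel.

    import numpy as np
    import scipy.sparse as sp

    rng = np.random.default_rng(2)
    A = sp.random(1000, 1000, density=0.01, format="csr", random_state=rng)
    A.data = rng.standard_normal(A.nnz) * 2.0   # mix of small and large magnitudes
    x = rng.standard_normal(1000)

    # Split A by magnitude: entries in (-1, 1) go to the float32 part.
    mask = np.abs(A.data) < 1.0
    small = A.copy()
    small.data = np.where(mask, A.data, 0.0)
    small.eliminate_zeros()
    large = A - small                            # remaining entries stay float64

    y = (small.astype(np.float32) @ x.astype(np.float32)).astype(np.float64) + large @ x
    print(np.linalg.norm(y - A @ x) / np.linalg.norm(A @ x))  # relative error of the mix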
7

Nikolski, N., and A. Pushnitski. “Szegő-type limit theorems for ‘multiplicative Toeplitz’ operators and non-Følner approximations.” St. Petersburg Mathematical Journal 32, no. 6 (October 20, 2021): 1033–50. http://dx.doi.org/10.1090/spmj/1683.

8

Merrill, Duane, and Michael Garland. “Merge-based sparse matrix-vector multiplication (SpMV) using the CSR storage format.” ACM SIGPLAN Notices 51, no. 8 (November 9, 2016): 1–2. http://dx.doi.org/10.1145/3016078.2851190.

9

Ernst, Thomas. “On the q-exponential of matrix q-Lie algebras.” Special Matrices 5, no. 1 (January 26, 2017): 36–50. http://dx.doi.org/10.1515/spma-2017-0003.

Abstract:
In this paper, we define several new concepts in the borderline between linear algebra, Lie groups and q-calculus. We first introduce the ring epimorphism r, the set of all inversions of the basis q, and then the important q-determinant and corresponding q-scalar products from an earlier paper. Then we discuss matrix q-Lie algebras with a modified q-addition, and compute the matrix q-exponential to form the corresponding n × n matrix, a so-called q-Lie group, or manifold, usually with q-determinant 1. The corresponding matrix multiplication is twisted under τ, which makes it possible to draw diagrams similar to Lie group theory for the q-exponential, or the so-called q-morphism. There is no definition of letter multiplication in a general alphabet, but in this article we introduce new q-number systems, the biring of q-integers, and the extended q-rational numbers. Furthermore, we provide examples of matrices in suq(4), and its corresponding q-Lie group. We conclude with an example of a system of equations with Ward number coefficients.
10

Alkenani, Ahmad N., Mohammad Ashraf, and Aisha Jabeen. “Nonlinear generalized Jordan (σ, Γ)-derivations on triangular algebras.” Special Matrices 6, no. 1 (December 20, 2017): 216–28. http://dx.doi.org/10.1515/spma-2017-0008.

Abstract:
Let R be a commutative ring with an identity element, let A and B be unital algebras over R, and let M be an (A, B)-bimodule which is faithful as a left A-module and also faithful as a right B-module. Suppose that T = Tri(A, M, B) is a triangular algebra which is 2-torsion free, and that σ, Γ are automorphisms of T. A map δ: T → T (not necessarily linear) is called a multiplicative generalized (σ, Γ)-derivation (resp. multiplicative generalized Jordan (σ, Γ)-derivation) on T associated with a (σ, Γ)-derivation (resp. Jordan (σ, Γ)-derivation) d on T if δ(xy) = δ(x)Γ(y) + σ(x)d(y) (resp. δ(x²) = δ(x)Γ(x) + σ(x)d(x)) holds for all x, y ∈ T. In the present paper it is shown that if δ: T → T is a multiplicative generalized Jordan (σ, Γ)-derivation on T, then δ is an additive generalized (σ, Γ)-derivation on T.

Doctoral dissertations on the topic “SpMM Multiplication”

1

Singh, Kunal. “High-Performance Sparse Matrix-Multi Vector Multiplication on Multi-Core Architecture.” The Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu1524089757826551.

2

Hong, Changwan. “Code Optimization on GPUs.” The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1557123832601533.

3

Ashari, Arash. “Sparse Matrix-Vector Multiplication on GPU.” The Ohio State University, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=osu1417770100.

4

Godwin, Jeswin Samuel. “High-Performance Sparse Matrix-Vector Multiplication on GPUs for Structured Grid Computations.” The Ohio State University, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=osu1357280824.

5

Boyer, Brice. “Multiplication matricielle efficace et conception logicielle pour la bibliothèque de calcul exact LinBox.” PhD thesis, Université de Grenoble, 2012. http://tel.archives-ouvertes.fr/tel-00767915.

Abstract:
In this thesis, we first develop efficient matrix multiplication routines. We design new schedules that reduce the amount of extra memory required by a Winograd-type multiplication while preserving a good complexity bound, thanks to purpose-built external tools (a pebble game), fine-grained complexity accounting, and new hybrid algorithms. We then use parallel technologies (multicore CPUs and GPUs) to efficiently accelerate the sparse-matrix dense-vector product (SpMV), which is essential to so-called black-box algorithms, and we create new hybrid sparse formats suited to it. Finally, we establish generic software-design methods oriented towards efficiency, notably through composition of basic building blocks and via self-optimization. We also propose methods for improving and standardizing code quality, so as to make the produced code more durable and more robust. These methods are applied in particular to the exact computation library LinBox.
6

Ramesh, Chinthala. “Hardware-Software Co-Design Accelerators for Sparse BLAS.” Thesis, 2017. http://etd.iisc.ac.in/handle/2005/4276.

Abstract:
Sparse Basic Linear Algebra Subroutines (Sparse BLAS) is an important library comprising three levels of routines. Level 1 Sparse BLAS routines compute over sparse vectors and sparse/dense vector pairs; Level 2 deals with sparse matrix and vector operations; Level 3 deals with sparse matrix and dense matrix operations. On General Purpose Processors (GPPs), these routines not only under-utilize the hardware resources but also take more compute time than the workload warrants, owing to the poor data locality of sparse vector/matrix storage formats. In the literature, tremendous software efforts have been put into improving the performance of these Sparse BLAS routines on GPPs. GPPs best suit applications with high data locality, whereas Sparse BLAS routines operate on data with little locality; hence GPP performance is poor. Various custom function units (hardware accelerators) proposed in the literature have proved more efficient than the software attempts at accelerating Sparse BLAS subroutines. Though existing hardware accelerators improve Sparse BLAS performance compared to software routines, there is still much scope to improve these accelerators. This thesis describes both the existing software solutions and the hardware-software co-designs (HW/SW co-designs), and identifies the limitations of these existing solutions. We propose a new sparse data representation called Sawtooth Compressed Row Storage (SCRS) and corresponding SpMV and SpMM algorithms. SCRS-based SpMV and SpMM perform better than existing software solutions, yet they still do not reach theoretical peak performance. The knowledge gained from studying the limitations of the existing solutions, including the proposed SCRS-based SpMV and SpMM, is used to propose new HW/SW co-designs. Since software accelerators are limited by the hardware properties of GPPs and GPUs themselves, we propose HW/SW co-designs to accelerate a few basic Sparse BLAS operations (SpVV and SpMV). Our proposed Parallel Sparse BLAS HW/SW co-design achieves near theoretical peak performance with reasonable hardware resources.

Books on the topic “SpMM Multiplication”

1

Bisseling, Rob H. Parallel Scientific Computation. Oxford University Press, 2020. http://dx.doi.org/10.1093/oso/9780198788348.001.0001.

Abstract:
This book explains how to use the bulk synchronous parallel (BSP) model to design and implement parallel algorithms in the areas of scientific computing and big data. Furthermore, it presents a hybrid BSP approach towards new hardware developments such as hierarchical architectures with both shared and distributed memory. The book provides a full treatment of core problems in scientific computing and big data, starting from a high-level problem description, via a sequential solution algorithm, to a parallel solution algorithm and an actual parallel program written in the communication library BSPlib. Numerical experiments are presented for parallel programs on modern parallel computers, ranging from desktop computers to massively parallel supercomputers. The introductory chapter gives a complete overview of BSPlib, so that readers are able to write their own parallel programs at an early stage; it also treats BSP benchmarking and parallel sorting by regular sampling. The next three chapters treat basic numerical linear algebra problems such as linear system solving by LU decomposition, sparse matrix-vector multiplication (SpMV), and the fast Fourier transform (FFT). The final chapter explores parallel algorithms for big data problems such as graph matching. The book is accompanied by a software package, BSPedupack, freely available online from the author’s homepage, which contains all programs of the book and a set of test programs.

Book chapters on the topic “SpMM Multiplication”

1

Guo, Mingfeng, Yaobin Wang, Jun Huang, Qingfeng Wang, Yaqing Zhang, Mu Xu, and Fang Lu. “Rgs-SpMM: Accelerate Sparse Matrix-Matrix Multiplication by Row Group Splitting Strategy on the GPU.” In Lecture Notes in Computer Science, 61–66. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-21395-3_6.

2

Khan, Muhammad Hannan, Osman Hassan, and Shahid Khan. “Accelerating SpMV Multiplication in Probabilistic Model Checkers Using GPUs.” In Theoretical Aspects of Computing – ICTAC 2021, 86–104. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-85315-0_6.

3

Stoyanov, Dimitar, Rui Machado, and Franz-Josef Pfreundt. “Task-Based Parallel Sparse Matrix-Vector Multiplication (SpMVM) with GPI-2.” In Large-Scale Scientific Computing, 153–60. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-26520-9_16.

4

Zekri, Ahmed S. “Three Dimensional SPMD Matrix–Matrix Multiplication Algorithm and a Stacked Many-Core Processor Architecture.” In Lecture Notes in Electrical Engineering, 1139–50. New York, NY: Springer New York, 2012. http://dx.doi.org/10.1007/978-1-4614-3535-8_94.

5

Bisseling, Rob H. “Sparse matrix–vector multiplication.” In Parallel Scientific Computation, 190–290. Oxford University Press, 2020. http://dx.doi.org/10.1093/oso/9780198788348.003.0004.

Abstract:
This chapter introduces irregular algorithms and presents the example of parallel sparse matrix-vector multiplication (SpMV), which is the central operation in iterative linear system solvers. The irregular sparsity pattern of the matrix does not change during the multiplication, which may be repeated many times. This justifies putting a lot of effort into finding a good data distribution. The Mondriaan distribution of a sparse matrix is a useful non-Cartesian distribution that can be found by hypergraph-based partitioning. The Mondriaan package implements such a partitioning and also the newer medium-grain partitioning method. The chapter analyses the special cases of random sparse matrices and Laplacian matrices. It uses performance profiles and geometric means to compare different partitioning methods. Furthermore, it presents the hybrid-BSP model and a hybrid-BSP SpMV, which are aimed at hybrid distributed/shared-memory architectures. The parallel SpMV can be incorporated in applications, ranging from PageRank computation to artificial neural networks.
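A scalar CSR SpMV, the kernel this chapter distributes, written out row by row so the irregular, pattern-dependent gathers from the input vector are visible; the tiny matrix at the end is only an illustration.

    import numpy as np

    def csr_spmv(indptr, indices, data, x):
        # y = A x with A in CSR form; x is gathered at pattern-dependent positions.
        y = np.zeros(len(indptr) - 1)
        for i in range(len(y)):
            for k in range(indptr[i], indptr[i + 1]):
                y[i] += data[k] * x[indices[k]]
        return y

    # A = [[2, 0], [1, 3]] in CSR: row pointers, column indices, values.
    print(csr_spmv([0, 1, 3], [0, 0, 1], [2.0, 1.0, 3.0], np.array([1.0, 1.0])))  # [2. 4.]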
6

Mpakos, Panagiotis, Nikela Papadopoulou, Chloe Alverti, Georgios Goumas, and Nectarios Koziris. “On the Performance and Energy Efficiency of Sparse Matrix-Vector Multiplication on FPGAs.” In Parallel Computing: Technology Trends. IOS Press, 2020. http://dx.doi.org/10.3233/apc200092.

Abstract:
The Sparse Matrix-Vector Multiplication kernel (SpMV) has been one of the most popular kernels in high-performance computing, as the building block of many iterative solvers. At the same time, it has been one of the most notorious kernels, due to its low flop per byte ratio, which leads to under-utilization of modern processing system resources and a huge gap between the peak system performance and the observed performance of the kernel. However, moving forward to exascale, performance by itself is no longer the holy grail; the requirement for energy efficient high-performance computing systems is driving a trend towards processing units with better performance per watt ratios. Following this trend, FPGAs have emerged as an alternative, low-power accelerator for high-end systems. In this paper, we implement the SpMV kernel on FPGAs, towards an accelerated library for sparse matrix computations, for single-precision floating point values. Our implementation focuses on optimizing access to the data for the SpMV kernel and applies common optimizations to improve the parallelism and the performance of the SpMV kernel on FPGAs. We evaluate the performance and energy efficiency of our implementation, in comparison to modern CPUs and GPUs, for a diverse set of sparse matrices and demonstrate that FPGAs can be an energy-efficient solution for the SpMV kernel.
7

"Sparse Linear Algebra". W Advances in Systems Analysis, Software Engineering, and High Performance Computing, 94–137. IGI Global, 2022. http://dx.doi.org/10.4018/978-1-7998-7082-1.ch004.

Abstract:
This chapter presents several new tasking-based programming strategies for parallelizing sparse linear algebra kernels. The reader will explore different approaches to improving the performance of these kernels through better workload distribution and a clearer grasp of the data layout. This is accomplished through the study of some of the most popular and widely used sparse operations, such as SpMV (sparse matrix-vector multiplication), GTSV (tridiagonal solve), and CG (conjugate gradient). The strategies have been tested on multicore systems, some equipped with GPU devices, showcasing how to overcome the peculiarities of task-based parallelized kernels in the context of sparse linear algebra computations.
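As a reminder of why tuning SpMV pays off in this context, here is a bare conjugate gradient loop, one of the kernels the abstract names, in which the SpMV dominates each iteration; the tridiagonal test matrix and tolerances are assumptions for illustration.

    import numpy as np
    import scipy.sparse as sp

    def cg(A, b, tol=1e-8, maxiter=200):
        x = np.zeros_like(b)
        r = b - A @ x
        p = r.copy()
        rs = r @ r
        for _ in range(maxiter):
            Ap = A @ p                   # the SpMV: the dominant kernel per iteration
            alpha = rs / (p @ Ap)
            x += alpha * p
            r -= alpha * Ap
            rs_new = r @ r
            if np.sqrt(rs_new) < tol:
                break
            p = r + (rs_new / rs) * p
            rs = rs_new
        return x

    A = sp.diags([[-1.0] * 99, [2.0] * 100, [-1.0] * 99], [-1, 0, 1], format="csr")
    b = np.ones(100)
    print(np.linalg.norm(A @ cg(A, b) - b))  # residual of the solve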
8

Jamalmohammed, Saira Banu, Lavanya K., Sumaiya Thaseen I., and Biju V. “Review on Sparse Matrix Storage Formats With Space Complexity Analysis.” In Applications of Artificial Intelligence for Smart Technology, 122–45. IGI Global, 2021. http://dx.doi.org/10.4018/978-1-7998-3335-2.ch009.

Abstract:
Sparse matrix-vector multiplication (SpMV) is a challenging computational kernel in linear algebra applications such as data mining, image processing, and machine learning. The performance of this kernel depends heavily on the size of the input matrix and on the underlying hardware features. Various sparse matrix storage formats, commonly referred to simply as sparse formats, have been proposed in the literature to reduce the size of the matrix. On modern multi-core and many-core architectures, the kernel's performance is mainly constrained by the memory-wall and power-wall problems. Reviews of sparse formats are normally conducted for a specific architecture or a specific application. This chapter presents a comparative study of various sparse formats across platforms: CPUs, graphics processing units (GPUs), and single instruction, multiple data (SIMD) registers. A space complexity analysis of the formats and their representations is given. Finally, the merits and demerits of each format are summarized in a table.
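The space-complexity comparison the chapter tabulates can be made concrete by counting index bytes for one matrix in two common formats; the random matrix here is an illustrative assumption.

    import numpy as np
    import scipy.sparse as sp

    A = sp.random(1000, 1000, density=0.01, format="coo", random_state=0)
    coo_bytes = A.row.nbytes + A.col.nbytes              # COO keeps 2 indices per nonzero
    csr = A.tocsr()
    csr_bytes = csr.indices.nbytes + csr.indptr.nbytes   # CSR: nnz indices + (rows + 1) pointers
    print(f"nnz={A.nnz}  COO index bytes={coo_bytes}  CSR index bytes={csr_bytes}")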
9

Janecek, Andreas, and Ying Tan. “Swarm Intelligence for Non-Negative Matrix Factorization.” In Recent Algorithms and Applications in Swarm Intelligence Research, 168–92. IGI Global, 2013. http://dx.doi.org/10.4018/978-1-4666-2479-5.ch009.

Abstract:
The Non-negative Matrix Factorization (NMF) is a special low-rank approximation which allows for an additive, parts-based, and interpretable representation of the data. This article presents efforts to improve the convergence, approximation quality, and classification accuracy of NMF using five different meta-heuristics based on swarm intelligence. Several properties of the NMF objective function motivate the use of meta-heuristics: this function is non-convex, discontinuous, and may possess many local minima. The proposed optimization strategies are two-fold: on the one hand, a new initialization strategy seeds the NMF factors prior to the factorization; on the other hand, an iterative update strategy improves the accuracy per runtime of the multiplicative-update NMF algorithm. The success of the proposed optimization strategies is shown by applying them to synthetic data and to data sets from the areas of spam filtering and email classification, and by evaluating them in their application context. Experimental results show that both optimization strategies improve NMF in terms of faster convergence, lower approximation error, and better classification accuracy. The initialization strategy in particular leads to significant reductions of the runtime-per-accuracy ratio for both the NMF approximation and the classification results achieved with NMF.
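The multiplicative update algorithm that the chapter's iterative strategy accelerates is the standard Lee-Seung rule for V ≈ WH; a bare version is sketched below. The swarm-intelligence layer itself is not shown, and the sizes, rank, and iteration count are assumptions.

    import numpy as np

    rng = np.random.default_rng(3)
    V = rng.random((50, 40))                      # non-negative data matrix
    W, H = rng.random((50, 5)), rng.random((5, 40))
    eps = 1e-12                                   # guards against division by zero
    for _ in range(200):
        H *= (W.T @ V) / (W.T @ W @ H + eps)      # multiplicative update for H
        W *= (V @ H.T) / (W @ H @ H.T + eps)      # multiplicative update for W
    print(np.linalg.norm(V - W @ H) / np.linalg.norm(V))  # relative approximation error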

Conference papers on the topic “SpMM Multiplication”

1

Huang, Guyue, Guohao Dai, Yu Wang, and Huazhong Yang. “GE-SpMM: General-Purpose Sparse Matrix-Matrix Multiplication on GPUs for Graph Neural Networks.” In SC20: International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE, 2020. http://dx.doi.org/10.1109/sc41405.2020.00076.

2

Page, Brian A., and Peter M. Kogge. “Scalability of Hybrid Sparse Matrix Dense Vector (SpMV) Multiplication.” In 2018 International Conference on High Performance Computing & Simulation (HPCS). IEEE, 2018. http://dx.doi.org/10.1109/hpcs.2018.00072.

3

Wu, Di, Wei Cao, and Lingli Wang. “SpWMM: A High-Performance Sparse-Winograd Matrix-Matrix Multiplication Accelerator for CNNs.” In 2019 International Conference on Field-Programmable Technology (ICFPT). IEEE, 2019. http://dx.doi.org/10.1109/icfpt47387.2019.00041.

4

Merrill, Duane, and Michael Garland. “Merge-based sparse matrix-vector multiplication (SpMV) using the CSR storage format.” In PPoPP '16: 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. New York, NY, USA: ACM, 2016. http://dx.doi.org/10.1145/2851141.2851190.

5

Suresh, Krishnan, and Praveen Yadav. “Large-Scale Modal Analysis on Multi-Core Architectures.” In ASME 2012 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. American Society of Mechanical Engineers, 2012. http://dx.doi.org/10.1115/detc2012-70281.

Abstract:
We propose here a subspace augmented Rayleigh-Ritz conjugate gradient method (SaRCG) for solving large-scale eigen-value problems. The method is highly scalable and well suited for multi-core architectures since it only requires sparse matrix-vector multiplications (SpMV). As a specific application, we consider the modal analysis of geometrically complex structures that are discretized via non-conforming voxels. The voxelization process is robust and relatively insensitive to geometric complexity, but it leads to large eigen-value problems that are difficult to solve via standard eigen-solvers such as block-Lanczos. Such problems are easily solved via the proposed SaRCG, where one can, in addition, exploit the voxelization structure to render the SpMV assembly-free. As the numerical experiments indicate, the resulting implementation on multi-core CPUs and graphics processing units is a practical solution to automated eigen-value estimation during early stages of design.
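The assembly-free SpMV the abstract mentions can be sketched with a matrix-free operator whose action is computed on the fly instead of stored; the 1-D Laplacian stencil below is a stand-in assumption for the paper's voxel-based operator, not its actual kernel.

    import numpy as np
    from scipy.sparse.linalg import LinearOperator, eigsh

    n = 200

    def laplacian_matvec(x):
        # Apply the 1-D [-1, 2, -1] stencil without ever assembling the matrix.
        y = 2.0 * x
        y[:-1] -= x[1:]
        y[1:] -= x[:-1]
        return y

    A = LinearOperator((n, n), matvec=laplacian_matvec, dtype=float)
    print(eigsh(A, k=4, which="LM", return_eigenvectors=False))  # 4 largest modes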
6

Hou, Kaixi, Wu-chun Feng, and Shuai Che. “Auto-Tuning Strategies for Parallelizing Sparse Matrix-Vector (SpMV) Multiplication on Multi- and Many-Core Processors.” In 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). IEEE, 2017. http://dx.doi.org/10.1109/ipdpsw.2017.155.

7

Kang, Jinsong, Shoujian Yang, Wei Xia, and Shuo Wang. “An improved deadbeat control strategy for photovoltaic grid-connected inverter based on unipolar & frequency multiplication SPWM and its implementation.” In 2014 IEEE International Power Electronics and Application Conference and Exposition (PEAC). IEEE, 2014. http://dx.doi.org/10.1109/peac.2014.7037934.

