Academic literature on the topic 'Sparse Basic Linear Algebra Subroutines'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Sparse Basic Linear Algebra Subroutines.'

Journal articles on the topic "Sparse Basic Linear Algebra Subroutines"

1

Yang, Bing, Xi Chen, Xiang Yun Liao, Mian Lun Zheng, and Zhi Yong Yuan. "FEM-Based Modeling and Deformation of Soft Tissue Accelerated by CUSPARSE and CUBLAS." Advanced Materials Research 671-674 (March 2013): 3200–3203. http://dx.doi.org/10.4028/www.scientific.net/amr.671-674.3200.

Abstract:
Realistic modeling and deformation of soft tissue is one of the key technologies of virtual surgery simulation, a challenging research field that stimulates the development of new clinical applications such as the virtual surgery simulator. In this paper we adopt the linear FEM (Finite Element Method) with sparse matrices stored in the compressed CSR (Compressed Sparse Row) format, which enables fast modeling and deformation of soft tissue on GPU hardware with NVIDIA’s CUSPARSE (Compute Unified Device Architecture Sparse Matrix) and CUBLAS (Compute Unified Device Architecture Basic Linear Algebra Subroutines) libraries. We focus on the CGS (Conjugate Gradient Solver), which is the most time-consuming part of FEM, and port it to the GPU with the two libraries mentioned above. The experimental results show that the acceleration method in this paper can achieve realistic and fast modeling and deformation simulation of soft tissue.
2

Magnin, H., and J. L. Coulomb. "A parallel and vectorial implementation of basic linear algebra subroutines in iterative solving of large sparse linear systems of equations." IEEE Transactions on Magnetics 25, no. 4 (July 1989): 2895–97. http://dx.doi.org/10.1109/20.34317.

3

Kramer, David, S. Lennart Johnsson, and Yu Hu. "Local Basic Linear Algebra Subroutines (LBLAS) for the CM-5/5E." International Journal of Supercomputer Applications and High Performance Computing 10, no. 4 (December 1996): 300–335. http://dx.doi.org/10.1177/109434209601000403.

4

Shaeffer, John. "BLAS IV: A BLAS for Rk Matrix Algebra." Applied Computational Electromagnetics Society 35, no. 11 (February 3, 2021): 1266–67. http://dx.doi.org/10.47037/2020.aces.j.351102.

Abstract:
Basic Linear Algebra Subroutines (BLAS) are well-known low-level workhorse subroutines for linear algebra vector-vector, matrix-vector and matrix-matrix operations for full rank matrices. With the advent of block low rank (Rk) full wave direct solvers, where most blocks of the system matrix are Rk, an extension to the BLAS III matrix-matrix workhorse routine is needed due to the agony of Rk addition. This note outlines the problem of BLAS III for Rk LU and solve operations and then outlines an alternative approach, which we will call BLAS IV. This approach utilizes the thrill of Rk matrix-matrix multiply and uses the Adaptive Cross Approximation (ACA) as a methodology to evaluate sums of Rk terms to circumvent the agony of low rank addition.
5

Demmel, James W., Michael T. Heath, and Henk A. van der Vorst. "Parallel numerical linear algebra." Acta Numerica 2 (January 1993): 111–97. http://dx.doi.org/10.1017/s096249290000235x.

Abstract:
We survey general techniques and open problems in numerical linear algebra on parallel architectures. We first discuss basic principles of parallel processing, describing the costs of basic operations on parallel machines, including general principles for constructing efficient algorithms. We illustrate these principles using current architectures and software systems, and by showing how one would implement matrix multiplication. Then, we present direct and iterative algorithms for solving linear systems of equations, linear least squares problems, the symmetric eigenvalue problem, the nonsymmetric eigenvalue problem, and the singular value decomposition. We consider dense, band and sparse matrices.
6

Duff, Iain S., Michele Marrone, Giuseppe Radicati, and Carlo Vittoli. "Level 3 basic linear algebra subprograms for sparse matrices." ACM Transactions on Mathematical Software 23, no. 3 (September 1997): 379–401. http://dx.doi.org/10.1145/275323.275327.

7

Dodson, David S., Roger G. Grimes, and John G. Lewis. "Sparse extensions to the FORTRAN Basic Linear Algebra Subprograms." ACM Transactions on Mathematical Software 17, no. 2 (June 1991): 253–63. http://dx.doi.org/10.1145/108556.108577.

8

Dodson, David S., and John G. Lewis. "Proposed sparse extensions to the Basic Linear Algebra Subprograms." ACM SIGNUM Newsletter 20, no. 1 (January 1985): 22–25. http://dx.doi.org/10.1145/1057935.1057938.

9

Duff, Iain S., Michael A. Heroux, and Roldan Pozo. "An overview of the sparse basic linear algebra subprograms." ACM Transactions on Mathematical Software 28, no. 2 (June 2002): 239–67. http://dx.doi.org/10.1145/567806.567810.

10

Aliaga, José I., Rocío Carratalá-Sáez, and Enrique S. Quintana-Ortí. "Parallel Solution of Hierarchical Symmetric Positive Definite Linear Systems." Applied Mathematics and Nonlinear Sciences 2, no. 1 (June 22, 2017): 201–12. http://dx.doi.org/10.21042/amns.2017.1.00017.

Abstract:
We present a prototype task-parallel algorithm for the solution of hierarchical symmetric positive definite linear systems via the ℋ-Cholesky factorization that builds upon the parallel programming standards and associated runtimes for OpenMP and OmpSs. In contrast with previous efforts, our proposal decouples the numerical aspects of the linear algebra operation from the complexities associated with high performance computing. Our experiments make an exhaustive analysis of the efficiency attained by different parallelization approaches that exploit either task-parallelism or loop-parallelism via a runtime. Alternatively, we also evaluate a solution that leverages multi-threaded parallelism via the parallel implementation of the Basic Linear Algebra Subroutines (BLAS) in Intel MKL.

Books on the topic "Sparse Basic Linear Algebra Subroutines"

1

Bisseling, Rob H. Parallel Scientific Computation. Oxford University Press, 2020. http://dx.doi.org/10.1093/oso/9780198788348.001.0001.

Abstract:
This book explains how to use the bulk synchronous parallel (BSP) model to design and implement parallel algorithms in the areas of scientific computing and big data. Furthermore, it presents a hybrid BSP approach towards new hardware developments such as hierarchical architectures with both shared and distributed memory. The book provides a full treatment of core problems in scientific computing and big data, starting from a high-level problem description, via a sequential solution algorithm to a parallel solution algorithm and an actual parallel program written in the communication library BSPlib. Numerical experiments are presented for parallel programs on modern parallel computers ranging from desktop computers to massively parallel supercomputers. The introductory chapter of the book gives a complete overview of BSPlib, so that the reader already at an early stage is able to write his/her own parallel programs. Furthermore, it treats BSP benchmarking and parallel sorting by regular sampling. The next three chapters treat basic numerical linear algebra problems such as linear system solving by LU decomposition, sparse matrix-vector multiplication (SpMV), and the fast Fourier transform (FFT). The final chapter explores parallel algorithms for big data problems such as graph matching. The book is accompanied by a software package BSPedupack, freely available online from the author’s homepage, which contains all programs of the book and a set of test programs.

Book chapters on the topic "Sparse Basic Linear Algebra Subroutines"

1

Johnsson, S. Lennart. "Data Parallel Programming and Basic Linear Algebra Subroutines." In Mathematical Aspects of Scientific Software, 183–96. New York, NY: Springer New York, 1988. http://dx.doi.org/10.1007/978-1-4684-7074-1_8.

2

Reinhardt, Gerd. "Zur maschinennahen Implementation und Performance von Basic Linear Algebra Subroutines (BLAS) Level 1, 2 und 3 auf dem Transputer T9000." In Informatik aktuell, 53–73. Berlin, Heidelberg: Springer Berlin Heidelberg, 1994. http://dx.doi.org/10.1007/978-3-642-78901-4_5.

3

Giesen, Joachim, Lars Kuehne, and Sören Laue. "The GENO Software Stack." In Lecture Notes in Computer Science, 213–28. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-21534-6_12.

Abstract:
GENO (generic optimization) is a domain specific language for mathematical optimization. The GENO software generates a solver from a specification of an optimization problem class. The optimization problems, that is, their objective function and constraints, are specified in a formal language. The problem specification is then translated into a general normal form. Problems in normal form are then passed on to a general purpose solver. In its iterations, the solver evaluates expressions for the objective function, constraints, and their derivatives. Hence, computing symbolic gradients of linear algebra expressions is an important component of the GENO software stack. The expressions are evaluated on the available hardware platforms including CPUs and GPUs from different vendors. This becomes possible by compiling the expressions into BLAS (Basic Linear Algebra Subroutines) calls that have been optimized for the different hardware platforms by their vendors. The compiler, called autoBLAS, that translates formal linear algebra expressions into optimized BLAS calls is another important component in the GENO software stack. By putting all the components together the generated solvers are competitive with problem-specific hand-written solvers and orders of magnitude faster than competing approaches that offer comparable ease-of-use. While this article describes the full GENO software stack, its components are also of interest on their own and thus have been made available independently.
4

"Introduction to Linear Algebra." In Advances in Systems Analysis, Software Engineering, and High Performance Computing, 1–25. IGI Global, 2022. http://dx.doi.org/10.4018/978-1-7998-7082-1.ch001.

Abstract:
This chapter introduces widely used linear algebra concepts in computer science, as well as information about the standard libraries that gather kernels for linear algebra operations, such as the basic linear algebra subprograms (BLAS) and the linear algebra package (LAPACK). The creation and evolution of these libraries are historically contextualized to help the reader understand their relevance and utility. Moreover, dense and sparse linear algebra are explained. The authors describe the levels of the BLAS library, the motivation behind the hierarchical structure of the BLAS library, and its connection with the LAPACK library. The authors also provide a detailed introduction to some of the most used and popular dense linear algebra kernels or routines, such as GEMM (matrix-matrix multiplication), TRSM (triangular solver), GETRF (LU factorization), and GESV (LU solve). Finally, the authors focus on the most important sparse linear algebra routines and the motivation behind the discussed approaches.
5

Petersen, Wesley, and Peter Arbenz. "Applications." In Introduction to Parallel Computing. Oxford University Press, 2004. http://dx.doi.org/10.1093/oso/9780198515760.003.0007.

Abstract:
Linear algebra is often the kernel of most numerical computations. It deals with vectors and matrices and simple operations like addition and multiplication on these objects. Vectors are one-dimensional arrays of, say, n real or complex numbers x_0, x_1, ..., x_{n−1}. We denote such a vector by x and think of it as a column vector. On a sequential computer, these numbers occupy n consecutive memory locations. This is also true, at least conceptually, on a shared memory multiprocessor computer. On distributed memory multicomputers, the primary issue is how to distribute vectors on the memory of the processors involved in the computation. Matrices are two-dimensional arrays of the form A = (a_ij). The n · m real (complex) matrix elements a_ij are stored in n · m (respectively 2 · n · m if complex datatype is available) consecutive memory locations. This is achieved by either stacking the columns on top of each other or by appending row after row. The former is called column-major, the latter row-major order. The actual procedure depends on the programming language. In Fortran, matrices are stored in column-major order, in C in row-major order. There is no difference in principle, but for writing efficient programs one has to respect how matrices are laid out. To be consistent with the libraries that we will use that are mostly written in Fortran, we will explicitly program in column-major order. Thus, the matrix element a_ij of the m×n matrix A is located i+j · m memory locations after a_00. Therefore, in our C codes we will write a[i+j*m]. Notice that there is no such simple procedure for determining the memory location of an element of a sparse matrix. In Section 2.3, we outline data descriptors to handle sparse matrices. In this and later chapters we deal with one of the simplest operations one wants to do with vectors and matrices: the so-called saxpy operation (2.3).
In Tables 2.1 and 2.2 are listed some of the acronyms and conventions for the basic linear algebra subprograms discussed in this book.
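
The column-major addressing rule and the saxpy operation mentioned in this abstract can be illustrated with a short sketch. It is written in Python rather than the C used in the book, and the names `col_major_index` and `saxpy` are hypothetical helpers chosen for illustration, not routines from the book or from any BLAS library. Unlike the BLAS routine, which overwrites y, this sketch returns a new list.

```python
def col_major_index(i, j, m):
    """Offset of element a_ij in an m-by-n matrix stored column by column.

    Hypothetical helper: mirrors the a[i+j*m] rule quoted in the abstract.
    """
    return i + j * m


def saxpy(alpha, x, y):
    """Return alpha*x + y elementwise (the level-1 'axpy' operation).

    Hypothetical helper: returns a new list instead of updating y in place.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [alpha * xi + yi for xi, yi in zip(x, y)]


# The 2-by-3 matrix [[1, 2, 3], [4, 5, 6]] laid out in column-major order:
# column 0 = (1, 4), column 1 = (2, 5), column 2 = (3, 6).
m, n = 2, 3
a = [1.0, 4.0, 2.0, 5.0, 3.0, 6.0]

# Element in row 1, column 2 (zero-based indices).
elem = a[col_major_index(1, 2, m)]

# saxpy with alpha = 2 on two short vectors.
result = saxpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
```

The flat list stands in for the consecutive memory locations described above; a sparse matrix has no comparably simple offset formula, which is why it needs separate data descriptors.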

Conference papers on the topic "Sparse Basic Linear Algebra Subroutines"

1

Zekri, Ahmed S., and Stanislav G. Sedukhin. "Evaluating the Performance of Basic Linear Algebra Subroutines on a Torus Array Processor." In 7th IEEE International Conference on Computer and Information Technology (CIT 2007). IEEE, 2007. http://dx.doi.org/10.1109/cit.2007.166.

2

Doshi, Parshwanath S., Rajesh Ranjan, and Datta V. Gaitonde. "2D and 3D Stability of Cavity Flows in High Mach Number Regimes." In ASME 2019 International Mechanical Engineering Congress and Exposition. American Society of Mechanical Engineers, 2019. http://dx.doi.org/10.1115/imece2019-10828.

Abstract:
The stability characteristics of an open cavity flow at very high Mach number are examined with BiGlobal stability analysis based on the eigenvalues of the linearized Navier-Stokes equations. During linearization, all possible first-order terms are retained without any approximation, with particular emphasis on extracting the effects of compressibility on the flowfield. The method leverages sparse linear algebra and the implicitly restarted shift-invert Arnoldi algorithm to extract eigenvalues of practical physical consequence. The stability dynamics of cavity flows at four Mach numbers between 1.4 and 4 are considered at a Reynolds number of 502. The basic states are obtained through Large Eddy Simulation (LES). Frequency results from the stability analysis show good agreement when compared to the theoretical values using Rossiter’s formula. An examination of the stability modes reveals that the shear layer is increasingly decoupled from the cavity as the Mach number is increased. Additionally, the outer lobes of the Rossiter modes are observed to get stretched and tilted in the direction of the freestream. Future efforts will extend the present analysis to examine current and potential cavity flame holder configurations, which often have downstream walls inclined to the vertical.