Journal articles on the topic "SpMM Multiplication"

To see the other types of publications on this topic, follow the link: SpMM Multiplication.

Cite a source in APA, MLA, Chicago, Harvard, and other styles


Consult the top 27 journal articles for your research on the topic "SpMM Multiplication".

Next to every source in the list of references there is an "Add to bibliography" button. Press it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scholarly publication as a .pdf and read its abstract online, whenever these are available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Wilkinson, Lucas, Kazem Cheshmi, and Maryam Mehri Dehnavi. "Register Tiling for Unstructured Sparsity in Neural Network Inference." Proceedings of the ACM on Programming Languages 7, PLDI (June 6, 2023): 1995–2020. http://dx.doi.org/10.1145/3591302.

Abstract:
Unstructured sparse neural networks are an important class of machine learning (ML) models, as they compact model size and reduce floating point operations. The execution time of these models is frequently dominated by the sparse matrix multiplication (SpMM) kernel, C = A × B, where A is a sparse matrix and B and C are dense matrices. The unstructured sparsity pattern of matrices in pruned machine learning models, along with their sparsity ratio, has rendered the large class of libraries and systems that optimize sparse matrix multiplication largely ineffective. Reusing registers is particularly difficult because accesses to memory locations must be known statically. This paper proposes Sparse Register Tiling, a new technique composed of an unroll-and-sparse-jam transformation followed by data compression that is specifically tailored to sparsity patterns in ML matrices. Unroll-and-sparse-jam uses sparsity information to jam the code while improving register reuse. Sparse register tiling is evaluated across 2396 weight matrices from transformer and convolutional models with a sparsity range of 60–95% and provides an average speedup of 1.72× and 2.65× over MKL SpMM and dense matrix multiplication, respectively, on a multicore CPU. It also provides an end-to-end speedup of 2.12× for MobileNetV1 with 70% sparsity on an ARM processor commonly used in edge devices.
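
For context, the kernel being optimized has the following reference form; this is a minimal CSR-based SpMM sketch in plain C (function and parameter names are illustrative assumptions, not the paper's register-tiled code):

```c
#include <stddef.h>

/* Reference SpMM: C = A * B, where A is m x k in CSR form with nnz
 * nonzeros, and B (k x n) and C (m x n) are dense, row-major.
 * Sparse Register Tiling replaces the inner loops with code specialized
 * to A's sparsity pattern so that a tile of C's row stays in registers. */
void spmm_csr(size_t m, size_t n,
              const size_t *row_ptr, const size_t *col_idx,
              const double *vals, const double *B, double *C)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j)        /* clear output row i */
            C[i * n + j] = 0.0;
        for (size_t p = row_ptr[i]; p < row_ptr[i + 1]; ++p) {
            double a = vals[p];               /* nonzero A(i, col_idx[p]) */
            const double *Brow = B + col_idx[p] * n;
            for (size_t j = 0; j < n; ++j)    /* axpy into row i of C */
                C[i * n + j] += a * Brow[j];
        }
    }
}
```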
2

Anzt, Hartwig, Stanimire Tomov, and Jack Dongarra. "On the performance and energy efficiency of sparse linear algebra on GPUs." International Journal of High Performance Computing Applications 31, no. 5 (October 5, 2016): 375–90. http://dx.doi.org/10.1177/1094342016672081.

Abstract:
In this paper we unveil some performance and energy efficiency frontiers for sparse computations on GPU-based supercomputers. We compare the resource efficiency of different sparse matrix–vector products (SpMV) taken from libraries such as cuSPARSE and MAGMA for GPU and Intel’s MKL for multicore CPUs, and develop a GPU sparse matrix–matrix product (SpMM) implementation that handles the simultaneous multiplication of a sparse matrix with a set of vectors in block-wise fashion. While a typical sparse computation such as the SpMV reaches only a fraction of the peak of current GPUs, we show that the SpMM succeeds in exceeding the memory-bound limitations of the SpMV. We integrate this kernel into a GPU-accelerated Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) eigensolver. LOBPCG is chosen as a benchmark algorithm for this study as it combines an interesting mix of sparse and dense linear algebra operations that is typical for complex simulation applications, and allows for hardware-aware optimizations. In a detailed analysis we compare the performance and energy efficiency against a multi-threaded CPU counterpart. The reported performance and energy efficiency results are indicative of sparse computations on supercomputers.
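
A back-of-the-envelope comparison (my own sketch, assuming double-precision CSR with 4-byte column indices and counting only the dominant matrix traffic) shows why blocking several vectors lifts SpMM above the SpMV memory bound:

```latex
% Arithmetic intensity of SpMV vs. SpMM with k right-hand-side vectors,
% at 2 flops per nonzero and ~12 bytes of matrix traffic per nonzero:
\[
I_{\mathrm{SpMV}} \approx \frac{2\,\mathrm{nnz}}{12\,\mathrm{nnz}}
  = \frac{1}{6}\ \frac{\text{flop}}{\text{byte}},
\qquad
I_{\mathrm{SpMM}} \approx \frac{2k\,\mathrm{nnz}}{12\,\mathrm{nnz}}
  = \frac{k}{6}\ \frac{\text{flop}}{\text{byte}},
\]
% i.e., the matrix is read once but reused across all k vectors.
```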
3

Ernst, Thomas. "On the q-Lie group of q-Appell polynomial matrices and related factorizations." Special Matrices 6, no. 1 (February 1, 2018): 93–109. http://dx.doi.org/10.1515/spma-2018-0009.

Abstract:
In the spirit of our earlier paper [10] and Zhang and Wang [16], we introduce the matrix of multiplicative q-Appell polynomials of order M ∈ ℤ. This is the representation of the respective q-Appell polynomials in ke-ke basis. Based on the fact that the q-Appell polynomials form a commutative ring [11], we prove that this set constitutes a q-Lie group with two dual q-multiplications in the sense of [9]. A comparison with earlier results on q-Pascal matrices gives factorizations according to [7], which are specialized to q-Bernoulli and q-Euler polynomials. We also show that the corresponding q-Bernoulli and q-Euler matrices form q-Lie subgroups. In the limit q → 1 we obtain corresponding formulas for Appell polynomial matrices. We conclude by presenting the commutative ring of generalized q-Pascal functional matrices, which operates on all functions f ∈ C^∞_q.
4

Guzu, D., T. Hoffmann-Ostenhof, and A. Laptev. "On a class of sharp multiplicative Hardy inequalities." St. Petersburg Mathematical Journal 32, no. 3 (May 11, 2021): 523–30. http://dx.doi.org/10.1090/spmj/1659.

Abstract:
A class of weighted Hardy inequalities is treated. The sharp constants depend on the lowest eigenvalues of auxiliary Schrödinger operators on a sphere. In particular, for some block radial weights these sharp constants are given in terms of the lowest eigenvalue of a Legendre type equation.
5

Bakhadly, Bakhad, Alexander Guterman, and María Jesús de la Puente. "Orthogonality for (0, −1) tropical normal matrices." Special Matrices 8, no. 1 (February 17, 2020): 40–60. http://dx.doi.org/10.1515/spma-2020-0006.

Abstract:
We study pairs of mutually orthogonal normal matrices with respect to tropical multiplication. Minimal orthogonal pairs are characterized. The diameter and girth of three graphs arising from the orthogonality equivalence relation are computed.
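
For readers unfamiliar with the setting, this is the standard max-plus product of tropical algebra (my gloss; the paper's specific orthogonality notion for (0, −1) matrices is defined in the text itself):

```latex
% Tropical (max-plus) matrix multiplication over (R U {-inf}, max, +):
\[
(A \odot B)_{ij} \;=\; \max_{k}\,\bigl(a_{ik} + b_{kj}\bigr),
\]
% ordinary addition becomes max, ordinary multiplication becomes +.
```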
6

Liu, Jie. "Accuracy Controllable SpMV Optimization on GPU." Journal of Physics: Conference Series 2363, no. 1 (November 1, 2022): 012008. http://dx.doi.org/10.1088/1742-6596/2363/1/012008.

Abstract:
Sparse matrix-vector multiplication (SpMV) is a key kernel widely used in a variety of fields, and mixed-precision calculation brings opportunities to SpMV optimization. Researchers have proposed to store nonzero elements in the interval (-1, 1) in single precision and to calculate SpMV in mixed precision. Though this leads to high performance, it also brings a loss of accuracy. This paper proposes an accuracy-controllable optimization method for SpMV. By limiting the error caused by converting double-precision floating-point numbers in the interval (-1, 1) into single-precision format, the calculation accuracy of mixed-precision SpMV is effectively improved. We tested sparse matrices from the SuiteSparse Matrix Collection on a Tesla V100. Compared with the existing mixed-precision MpSpMV kernel, the mixed-precision SpMV proposed in this paper achieves an accuracy improvement.
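
The general splitting idea can be sketched as follows in C (a sketch only, not the paper's kernel: the nonzeros with magnitude below 1 sit in a single-precision CSR part, the rest in a double-precision part, and both are accumulated in double):

```c
#include <stddef.h>

/* Mixed-precision SpMV sketch: d_* is a CSR structure holding the
 * double-precision nonzeros, s_* one holding the nonzeros from (-1, 1)
 * stored as float; accumulation stays in double, so only the float
 * conversion of small values introduces rounding error. */
void spmv_mixed(size_t m,
                const size_t *d_ptr, const size_t *d_col, const double *d_val,
                const size_t *s_ptr, const size_t *s_col, const float *s_val,
                const double *x, double *y)
{
    for (size_t i = 0; i < m; ++i) {
        double acc = 0.0;
        for (size_t p = d_ptr[i]; p < d_ptr[i + 1]; ++p)  /* |a| >= 1 */
            acc += d_val[p] * x[d_col[p]];
        for (size_t p = s_ptr[i]; p < s_ptr[i + 1]; ++p)  /* |a| < 1 */
            acc += (double)s_val[p] * x[s_col[p]];
        y[i] = acc;
    }
}
```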
7

Nikolski, N., and A. Pushnitski. "Szegő-type limit theorems for “multiplicative Toeplitz” operators and non-Følner approximations." St. Petersburg Mathematical Journal 32, no. 6 (October 20, 2021): 1033–50. http://dx.doi.org/10.1090/spmj/1683.

8

Merrill, Duane, and Michael Garland. "Merge-based sparse matrix-vector multiplication (SpMV) using the CSR storage format." ACM SIGPLAN Notices 51, no. 8 (November 9, 2016): 1–2. http://dx.doi.org/10.1145/3016078.2851190.

9

Ernst, Thomas. "On the q-exponential of matrix q-Lie algebras." Special Matrices 5, no. 1 (January 26, 2017): 36–50. http://dx.doi.org/10.1515/spma-2017-0003.

Abstract:
In this paper, we define several new concepts in the borderline between linear algebra, Lie groups and q-calculus. We first introduce the ring epimorphism r, the set of all inversions of the basis q, and then the important q-determinant and corresponding q-scalar products from an earlier paper. Then we discuss matrix q-Lie algebras with a modified q-addition, and compute the matrix q-exponential to form the corresponding n × n matrix, a so-called q-Lie group, or manifold, usually with q-determinant 1. The corresponding matrix multiplication is twisted under τ, which makes it possible to draw diagrams similar to Lie group theory for the q-exponential, or the so-called q-morphism. There is no definition of letter multiplication in a general alphabet, but in this article we introduce new q-number systems, the biring of q-integers, and the extended q-rational numbers. Furthermore, we provide examples of matrices in suq(4), and its corresponding q-Lie group. We conclude with an example of a system of equations with Ward number coefficients.
10

Alkenani, Ahmad N., Mohammad Ashraf, and Aisha Jabeen. "Nonlinear generalized Jordan (σ, Γ)-derivations on triangular algebras." Special Matrices 6, no. 1 (December 20, 2017): 216–28. http://dx.doi.org/10.1515/spma-2017-0008.

Abstract:
Let R be a commutative ring with an identity element, let A and B be unital algebras over R, and let M be an (A,B)-bimodule which is faithful as a left A-module and as a right B-module. Suppose that A = Tri(A,M,B) is a triangular algebra which is 2-torsion free and that σ, Γ are automorphisms of A. A map δ: A → A (not necessarily linear) is called a multiplicative generalized (σ, Γ)-derivation (resp. multiplicative generalized Jordan (σ, Γ)-derivation) on A associated with a (σ, Γ)-derivation (resp. Jordan (σ, Γ)-derivation) d on A if δ(xy) = δ(x)Γ(y) + σ(x)d(y) (resp. δ(x²) = δ(x)Γ(x) + σ(x)d(x)) holds for all x, y ∈ A. In the present paper it is shown that if δ: A → A is a multiplicative generalized Jordan (σ, Γ)-derivation on A, then δ is an additive generalized (σ, Γ)-derivation on A.
11

AlAhmadi, Sarah, Thaha Mohammed, Aiiad Albeshri, Iyad Katib, and Rashid Mehmood. "Performance Analysis of Sparse Matrix-Vector Multiplication (SpMV) on Graphics Processing Units (GPUs)." Electronics 9, no. 10 (October 13, 2020): 1675. http://dx.doi.org/10.3390/electronics9101675.

Abstract:
Graphics processing units (GPUs) have delivered remarkable performance for a variety of high performance computing (HPC) applications through massive parallelism. One such application is sparse matrix-vector (SpMV) computation, which is central to many scientific, engineering, and other applications, including machine learning. No single SpMV storage or computation scheme provides consistent and sufficiently high performance for all matrices due to their varying sparsity patterns. An extensive literature review reveals that the performance of SpMV techniques on GPUs has not been studied in sufficient detail. In this paper, we provide a detailed performance analysis of SpMV on GPUs using four notable sparse matrix storage schemes (compressed sparse row (CSR), ELLPACK (ELL), hybrid ELL/COO (HYB), and compressed sparse row 5 (CSR5)), five performance metrics (execution time, giga floating point operations per second (GFLOPS), achieved occupancy, instructions per warp, and warp execution efficiency), five matrix sparsity features (nnz, anpr, nprvariance, maxnpr, and distavg), and 17 sparse matrices from 10 application domains (chemical simulations, computational fluid dynamics (CFD), electromagnetics, linear programming, economics, etc.). Subsequently, based on the deeper insights gained through the detailed performance analysis, we propose a technique called the heterogeneous CPU-GPU Hybrid (HCGHYB) scheme. It utilizes both the CPU and GPU in parallel and provides better performance than the HYB format, with an average speedup of 1.7x. Heterogeneous computing is an important direction for SpMV and other application areas. Moreover, to the best of our knowledge, this is the first work where the SpMV performance on GPUs has been discussed in such depth. We believe that this work on SpMV performance analysis and the heterogeneous scheme will open up many new directions and improvements for the SpMV computing field in the future.
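
Two of the storage schemes compared above differ mainly in their memory layout; the following plain C sketch (illustrative shapes and names only) contrasts them:

```c
#include <stddef.h>

/* CSR keeps exactly nnz values behind row pointers. */
void spmv_csr(size_t m, const size_t *ptr, const size_t *col,
              const double *val, const double *x, double *y)
{
    for (size_t i = 0; i < m; ++i) {
        double acc = 0.0;
        for (size_t p = ptr[i]; p < ptr[i + 1]; ++p)
            acc += val[p] * x[col[p]];
        y[i] = acc;
    }
}

/* ELLPACK pads every row to a fixed width K (padded entries carry
 * val = 0 and a valid dummy column); the column-major m x K layout
 * trades wasted space for regular, coalescable accesses on GPUs. */
void spmv_ell(size_t m, size_t K, const size_t *col,
              const double *val, const double *x, double *y)
{
    for (size_t i = 0; i < m; ++i) {
        double acc = 0.0;
        for (size_t k = 0; k < K; ++k)
            acc += val[k * m + i] * x[col[k * m + i]];
        y[i] = acc;
    }
}
```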
12

Guo, Ping, and Liqiang Wang. "Accurate cross-architecture performance modeling for sparse matrix-vector multiplication (SpMV) on GPUs." Concurrency and Computation: Practice and Experience 27, no. 13 (February 12, 2014): 3281–94. http://dx.doi.org/10.1002/cpe.3217.

13

Fagerlund, Olav Aanes, Takeshi Kitayama, Gaku Hashimoto, and Hiroshi Okuda. "Effect of GPU Communication-Hiding for SPMV Using OpenACC." International Journal of Computational Methods 13, no. 02 (March 2016): 1640011. http://dx.doi.org/10.1142/s0219876216400119.

Abstract:
In this study, we discuss overlapping possibilities for sparse matrix-vector multiplication (SpMV) using OpenACC, in cases where we have multiple RHS vectors and where the whole sparse matrix may or may not fit into the memory of the discrete GPU at once. With GPUs, one can take advantage of their relatively high memory bandwidths; however, data needs to be transferred over the relatively slow PCIe bus. We implement communication hiding to increase performance. In the case of three degrees of freedom and a model of 2,097,152 nodes, we observe a performance increase of just above 40% from applying communication hiding in our routine. This underlines the importance of applying such techniques in simulations when it suits the algorithmic structure of the problem in relation to the underlying computer architecture.
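
The overlap pattern itself is compact; here is a hedged C/OpenACC sketch (assumed CSR layout and names, not the authors' routine), in which chunks alternate between two async queues so the PCIe upload of one chunk runs under the SpMV of the previous one:

```c
/* Pipelined SpMV: stage each chunk's nonzeros to the GPU and compute it
 * on the same async queue; consecutive chunks use different queues, so
 * transfer and compute overlap across chunks. */
void spmv_overlapped(int m, int chunk, const int *ptr, const int *col,
                     const double *val, const double *x, double *y)
{
    int nnz = ptr[m];
    #pragma acc data create(val[0:nnz], col[0:nnz]) \
                     copyin(ptr[0:m+1], x[0:m]) copyout(y[0:m])
    {
        for (int r0 = 0; r0 < m; r0 += chunk) {
            int r1 = (r0 + chunk < m) ? r0 + chunk : m;
            int q  = (r0 / chunk) % 2;          /* alternate two queues */
            int p0 = ptr[r0], len = ptr[r1] - ptr[r0];
            #pragma acc update device(val[p0:len], col[p0:len]) async(q)
            #pragma acc parallel loop async(q)  /* same queue: ordered */
            for (int i = r0; i < r1; ++i) {
                double acc = 0.0;
                for (int p = ptr[i]; p < ptr[i + 1]; ++p)
                    acc += val[p] * x[col[p]];
                y[i] = acc;
            }
        }
        #pragma acc wait                        /* drain both queues */
    }
}
```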
14

Farooq, Aamir, Mahvish Samar, Rewayat Khan, Hanyu Li, and Muhammad Kamran. "Perturbation analysis for the Takagi vector matrix." Special Matrices 10, no. 1 (July 3, 2021): 23–33. http://dx.doi.org/10.1515/spma-2020-0144.

Abstract:
In this article, we present some perturbation bounds for the Takagi vector matrix when the original matrix undergoes an additive or multiplicative perturbation. Two numerical examples are given to illustrate these bounds.
15

Liu, Sheng, Yasong Cao, and Shuwei Sun. "Mapping and Optimization Method of SpMV on Multi-DSP Accelerator." Electronics 11, no. 22 (November 11, 2022): 3699. http://dx.doi.org/10.3390/electronics11223699.

Abstract:
Sparse matrix-vector multiplication (SpMV) computes the product of a sparse matrix and a dense vector; the sparsity of such matrices often exceeds 90%. Usually the sparse matrix is compressed to save storage resources, but this causes irregular accesses to the dense vector in the algorithm, which take a lot of time and degrade the system's SpMV performance. In this study, we design a dedicated channel in the DMA to implement an indirect memory access process and speed up the SpMV operation. On this basis, we propose six SpMV algorithm schemes and map them to optimize SpMV performance. The results show that the M processor's SpMV performance reached 6.88 GFLOPS, and the average performance of the HPCG benchmark is 2.8 GFLOPS.
16

Van Tran, Nam, and Imme van den Berg. "An algebraic model for the propagation of errors in matrix calculus." Special Matrices 8, no. 1 (March 5, 2020): 68–97. http://dx.doi.org/10.1515/spma-2020-0008.

Abstract:
We assume that every element of a matrix has a small, individual error, and model it by an external number, which is the sum of a nonstandard real number and a neutrix, the latter being a convex (external) additive group. The algebraic properties of external numbers formalize common error analysis, with rules for calculation which are a sort of mellowed form of the axioms for real numbers. We model the propagation of errors in matrix calculus by the calculus of matrices with external numbers, and study its algebraic properties. Many classical properties continue to hold, sometimes stated in terms of inclusion instead of equality. There are notable exceptions, for which we give counterexamples and investigate suitable adaptations. In particular we study addition and multiplication of matrices, determinants, near inverses, and generalized notions of linear independence and rank.
17

Watanabe, Aki, Takayuki Kawaguchi, Mai Sakimoto, Yuya Oikawa, Keiichiro Furuya, and Taichi Matsuoka. "Occupational Dysfunction as a Mediator between Recovery Process and Difficulties in Daily Life in Severe and Persistent Mental Illness: A Bayesian Structural Equation Modeling Approach." Occupational Therapy International 2022 (June 17, 2022): 1–11. http://dx.doi.org/10.1155/2022/2661585.

Abstract:
Background. This study aimed to verify a hypothetical model of the structural relationship between the recovery process and difficulties in daily life, mediated by occupational dysfunction, in severe and persistent mental illness (SPMI). Methods. Community-dwelling participants with SPMI were enrolled in this multicenter cross-sectional study. The Recovery Assessment Scale (RAS), the World Health Organization Disability Assessment Schedule second edition (WHODAS 2.0), and the Classification and Assessment of Occupational Dysfunction (CAOD) were used for assessment. Confirmatory factor analysis, multiple regression analysis, and Bayesian structural equation modelling (BSEM) were used to analyze the hypothesized model. If the mediation model was significant, the path coefficient from difficulty in daily life to recovery and the product of the path coefficients mediated by occupational dysfunction were considered the direct effect and the indirect effect, respectively. The goodness of fit of the model was determined by the posterior predictive P value (PPP). Each path coefficient was validated with the median and 95% confidence interval (CI). Results. The participants comprised 98 individuals with SPMI. The factor structures of the RAS, WHODAS 2.0, and CAOD were confirmed by confirmatory factor analysis to be similar to those of their original studies. Multiple regression analysis showed that the independent variables of the RAS were WHODAS 2.0 and CAOD, and that of CAOD was WHODAS 2.0. The goodness of fit of the model in the BSEM was satisfactory, with a PPP of 0.27. The standardized path coefficients were significant at -0.372 from "difficulty in daily life" to "recovery" (the direct effect) and at -0.322 (95% CI: -0.477, -0.171) mediated by "occupational dysfunction" (the indirect effect). Conclusions. An approach that reduces not only difficulty in daily life but also occupational dysfunction may be an additional strategy of person-centered, recovery-oriented practice in SPMI.
18

Benatia, Akrem, Weixing Ji, Yizhuo Wang, and Feng Shi. "Sparse matrix partitioning for optimizing SpMV on CPU-GPU heterogeneous platforms." International Journal of High Performance Computing Applications 34, no. 1 (November 14, 2019): 66–80. http://dx.doi.org/10.1177/1094342019886628.

Abstract:
The sparse matrix–vector multiplication (SpMV) kernel dominates the computing cost in numerous applications. Most of the existing studies dedicated to improving this kernel have targeted just one type of processing unit, mainly multicore CPUs or graphics processing units (GPUs), and have not explored the potential of the recent, rapidly emerging, CPU-GPU heterogeneous platforms. To take full advantage of these heterogeneous systems, the input sparse matrix has to be partitioned over the different available processing units. The partitioning problem is more challenging with the existence of many sparse formats whose performance depends both on the sparsity of the input matrix and on the hardware used. Thus, the best performance depends not only on how the input sparse matrix is partitioned but also on which sparse format is used for each partition. To address this challenge, we propose in this article a new CPU-GPU heterogeneous method for computing the SpMV kernel that combines different sparse formats to achieve better performance and better utilization of CPU-GPU heterogeneous platforms. The proposed solution horizontally partitions the input matrix into multiple block-rows and predicts their best sparse formats using machine-learning-based performance models. A mapping algorithm is then used to assign the block-rows to the CPU and the GPU(s) available in the system. Our experimental results using real-world large unstructured sparse matrices on two different machines show a noticeable performance improvement.
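
The learned performance models are beyond a short sketch, but the block-row split underlying such partitioning is simple; a minimal C helper (hypothetical name; gpu_share would come from calibration or from models like the paper's) assigns a prefix of rows to the GPU in proportion to its share of the nonzeros:

```c
#include <stddef.h>

/* Return the first row kept on the CPU: rows [0, r) go to the GPU so
 * that their nonzero count approximates gpu_share * nnz. Each side's
 * partition can then be stored in whichever sparse format is predicted
 * to perform best for it. */
size_t split_rows(size_t m, const size_t *row_ptr, double gpu_share)
{
    size_t target = (size_t)(gpu_share * (double)row_ptr[m]);
    size_t r = 0;
    while (r < m && row_ptr[r + 1] <= target)
        ++r;
    return r;
}
```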
19

Mahmoud, Mohammed, Mark Hoffmann, and Hassan Reza. "Developing a New Storage Format and a Warp-Based SpMV Kernel for Configuration Interaction Sparse Matrices on the GPU." Computation 6, no. 3 (August 24, 2018): 45. http://dx.doi.org/10.3390/computation6030045.

Abstract:
Sparse matrix-vector multiplication (SpMV) can be used to solve diversely scaled linear systems and eigenvalue problems that exist in numerous and varying scientific applications. One of the scientific applications that SpMV is involved in is known as Configuration Interaction (CI). CI is a linear method for solving the nonrelativistic Schrödinger equation for quantum chemical multi-electron systems, and it can deal with the ground state as well as multiple excited states. In this paper, we have developed a hybrid approach for dealing with CI sparse matrices. The proposed model includes a newly developed hybrid format for storing CI sparse matrices on the Graphics Processing Unit (GPU). In addition to the new format, the proposed model includes the SpMV kernel for multiplying the CI matrix (proposed format) by a vector using the C language and the Compute Unified Device Architecture (CUDA) platform. The proposed SpMV kernel is a vector kernel that uses the warp approach. We have gauged the newly developed model in terms of two primary factors, memory usage and performance. Our proposed kernel was compared to the cuSPARSE library and the CSR5 (Compressed Sparse Row 5) format and outperformed both.
20

Muhammed, Thaha, Rashid Mehmood, Aiiad Albeshri, and Iyad Katib. "SURAA: A Novel Method and Tool for Loadbalanced and Coalesced SpMV Computations on GPUs." Applied Sciences 9, no. 5 (March 6, 2019): 947. http://dx.doi.org/10.3390/app9050947.

Abstract:
Sparse matrix-vector (SpMV) multiplication is a vital building block for numerous scientific and engineering applications. This paper proposes SURAA (translates to speed in Arabic), a novel method for SpMV computations on graphics processing units (GPUs). The novelty lies in the way we group matrix rows into different segments, and adaptively schedule various segments to different types of kernels. The sparse matrix data structure is created by sorting the rows of the matrix on the basis of the number of nonzero elements per row (npr) and forming segments of equal size (containing approximately an equal number of nonzero elements per row) using the Freedman–Diaconis rule. The segments are assembled into three groups based on the mean npr of the segments. For each group, we use multiple kernels to execute the group segments on different streams; hence, the number of threads to execute each segment is adaptively chosen. Dynamic Parallelism available in Nvidia GPUs is utilized to execute the group containing the segments with the largest mean npr, providing improved load balancing and coalesced memory access, and hence more efficient SpMV computations on GPUs. SURAA therefore minimizes the adverse effects of npr variance by uniformly distributing the load using equal-sized segments. We implement the SURAA method as a tool and compare its performance with the de facto best commercial (cuSPARSE) and open source (CUSP, MAGMA) tools using widely used benchmarks comprising 26 high-npr-variance matrices from 13 diverse domains. SURAA outperforms the other tools by delivering a 13.99x average speedup. We believe that our approach provides a fundamental shift in addressing SpMV-related challenges on GPUs, including coalesced memory access, thread divergence, and load balancing, and is set to open new avenues for further improving SpMV performance in the future.
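
For reference, the Freedman–Diaconis rule mentioned above is the standard histogram bin-width rule (applying it to the sorted npr values is my reading of the abstract):

```latex
% Freedman-Diaconis bin width for n samples x with interquartile range IQR(x):
\[
h \;=\; \frac{2\,\operatorname{IQR}(x)}{\sqrt[3]{n}},
\]
% robust to outliers, with the bin count growing roughly like n^{1/3}.
```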
21

Janecek, Andreas, and Ying Tan. "Swarm Intelligence for Non-Negative Matrix Factorization." International Journal of Swarm Intelligence Research 2, no. 4 (October 2011): 12–34. http://dx.doi.org/10.4018/jsir.2011100102.

Abstract:
The Non-negative Matrix Factorization (NMF) is a special low-rank approximation which allows for an additive parts-based and interpretable representation of the data. This article presents efforts to improve the convergence, approximation quality, and classification accuracy of NMF using five different meta-heuristics based on swarm intelligence. Several properties of the NMF objective function motivate the utilization of meta-heuristics: this function is non-convex, discontinuous, and may possess many local minima. The proposed optimization strategies are two-fold: on the one hand, a new initialization strategy for NMF is presented in order to initialize the NMF factors prior to the factorization; on the other hand, an iterative update strategy is proposed which improves the accuracy per runtime for the multiplicative update NMF algorithm. The success of the proposed optimization strategies is shown by applying them to synthetic data and to data sets from the areas of spam filtering/email classification, and by evaluating them in their application context. Experimental results show that both optimization strategies are able to improve NMF in terms of faster convergence, lower approximation error, and better classification accuracy. Especially the initialization strategy leads to significant reductions of the runtime-per-accuracy ratio for both the NMF approximation and the classification results achieved with NMF.
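
For reference, the multiplicative updates in question are the standard Lee–Seung rules for V ≈ WH under the Frobenius norm (the paper's iterative strategy wraps updates of this form):

```latex
% Lee-Seung multiplicative updates; multiplication and division elementwise.
\[
H \leftarrow H \circ \frac{W^{\top} V}{W^{\top} W H},
\qquad
W \leftarrow W \circ \frac{V H^{\top}}{W H H^{\top}},
\]
% nonnegativity of W and H is preserved since all factors are nonnegative.
```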
22

Tao, Zhuofu, Chen Wu, Yuan Liang, Kun Wang, and Lei He. "LW-GCN: A Lightweight FPGA-based Graph Convolutional Network Accelerator." ACM Transactions on Reconfigurable Technology and Systems, August 4, 2022. http://dx.doi.org/10.1145/3550075.

Abstract:
Graph convolutional networks (GCNs) have been introduced to effectively process non-Euclidean graph data. However, GCNs incur large amounts of irregularity in computation and memory access, which prevents efficient use of traditional neural network accelerators. Moreover, existing dedicated GCN accelerators demand high memory volumes and are difficult to implement on resource-limited edge devices. In this work, we propose LW-GCN, a lightweight FPGA-based accelerator with a software-hardware co-designed process to tackle irregularity in computation and memory access in GCN inference. LW-GCN decomposes the main GCN operations into Sparse Matrix-Matrix Multiplication (SpMM) and Matrix-Matrix Multiplication (MM). We propose a novel compression format to balance workload across PEs and prevent data hazards. Moreover, we apply data quantization and workload tiling, and map both SpMM and MM of GCN inference onto a uniform architecture on resource-limited hardware. Evaluations on GCN and GraphSAGE are performed on a Xilinx Kintex-7 FPGA with three popular datasets. Compared to existing CPU, GPU, and state-of-the-art FPGA-based accelerators, LW-GCN reduces latency by up to 60x, 12x, and 1.7x and increases power efficiency by up to 912x, 511x, and 3.87x, respectively. Furthermore, compared with NVIDIA's latest edge GPU Jetson Xavier NX, LW-GCN achieves speedup and energy savings of 32x and 84x, respectively.
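
To see where the SpMM/MM split comes from, recall the standard GCN propagation rule (Kipf–Welling form; that LW-GCN targets exactly this decomposition is my reading of the abstract):

```latex
% One GCN layer: \hat{A} is the normalized sparse adjacency matrix,
% H^{(l)} the dense node-feature matrix, W^{(l)} the dense weight matrix.
\[
H^{(l+1)} \;=\; \sigma\!\bigl(\hat{A}\,H^{(l)}\,W^{(l)}\bigr),
\]
% \hat{A} H^{(l)} is the SpMM step; the product with W^{(l)} is dense MM.
```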
23

Appi Reddy, K., and T. Kurmayya. "Moore-Penrose inverses of Gram matrices leaving a cone invariant in an indefinite inner product space." Special Matrices 3, no. 1 (January 10, 2015). http://dx.doi.org/10.1515/spma-2015-0013.

Abstract:
In this paper we characterize Moore-Penrose inverses of Gram matrices leaving a cone invariant in an indefinite inner product space using the indefinite matrix multiplication. This characterization includes the acuteness (or obtuseness) of certain closed convex cones.
24

Verde-Star, Luis. "Elementary triangular matrices and inverses of k-Hessenberg and triangular matrices." Special Matrices 3, no. 1 (January 6, 2015). http://dx.doi.org/10.1515/spma-2015-0025.

Abstract:
We use elementary triangular matrices to obtain some factorization, multiplication, and inversion properties of triangular matrices. We also obtain explicit expressions for the inverses of strict k-Hessenberg matrices and banded matrices. Our results can be extended to the cases of block triangular and block Hessenberg matrices. An n × n lower triangular matrix is called elementary if it is of the form I + C, where I is the identity matrix and C is lower triangular and has all of its nonzero entries in the k-th column, where 1 ≤ k ≤ n.
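
A small instance of this definition (my own illustration, taking n = 3, k = 1, and C strictly lower triangular):

```latex
\[
I + C =
\begin{pmatrix} 1 & 0 & 0 \\ a & 1 & 0 \\ b & 0 & 1 \end{pmatrix},
\qquad
(I + C)^{-1} = I - C =
\begin{pmatrix} 1 & 0 & 0 \\ -a & 1 & 0 \\ -b & 0 & 1 \end{pmatrix},
\]
% the inverse is immediate because C^2 = 0 whenever C's diagonal entry
% in column k vanishes.
```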
25

Nazarov, F., V. Vasyunin, and A. Volberg. "On a Bellman function associated with the Chang–Wilson–Wolff theorem: a case study." St. Petersburg Mathematical Journal, June 27, 2022. http://dx.doi.org/10.1090/spmj/1719.

Abstract:
The tail of the distribution (i.e., the measure of the set {f ≥ x}) is estimated for those functions f whose dyadic square function is bounded by a given constant. In particular, an estimate following from the Chang–Wilson–Wolff theorem is slightly improved. The study of the Bellman function corresponding to the problem reveals a curious structure of this function: it has jumps of the first derivative at a dense subset of the interval [0, 1] (where it is calculated exactly), but it is of class C^∞ for x > √3 (where it is calculated up to a multiplicative constant). An unusual feature of the paper is the use of computer calculations in the proof. Nevertheless, all the proofs are quite rigorous, since only integer arithmetic was assigned to a computer.
26

Hilberdink, T., and A. Pushnitski. "Spectral asymptotics for a family of LCM matrices." St. Petersburg Mathematical Journal, June 7, 2023. http://dx.doi.org/10.1090/spmj/1764.

Abstract:
The family of arithmetical matrices given explicitly by
\[
E(\sigma,\tau) = \Big\{ \frac{n^\sigma m^\sigma}{[n,m]^\tau} \Big\}_{n,m=1}^{\infty}
\]
is studied, where [n, m] is the least common multiple of n and m and the real parameters σ and τ satisfy ρ := τ − 2σ > 0, τ − σ > 1/2, and τ > 0. It is proved that E(σ, τ) is a compact selfadjoint positive definite operator on ℓ²(ℕ), and the ordered sequence of eigenvalues of E(σ, τ) obeys the asymptotic relation
\[
\lambda_n(E(\sigma,\tau)) = \frac{\varkappa(\sigma,\tau)}{n^\rho} + o(n^{-\rho}), \quad n \to \infty,
\]
with some ϰ(σ, τ) > 0. This fact is applied to the asymptotics of singular values of truncated multiplicative Toeplitz matrices with the symbol given by the Riemann zeta function on the vertical line with abscissa σ > 1/2. The relationship of the spectral analysis of E(σ, τ) with the theory of generalized prime systems is also pointed out.
27

Pistelli, Laura, Cecilia Noccioli, Francesca D'Angiolillo, and Luisa Pistelli. "Composition of volatile in micropropagated and field grown aromatic plants from Tuscany Islands." Acta Biochimica Polonica 60, no. 1 (February 25, 2013). http://dx.doi.org/10.18388/abp.2013_1949.

Abstract:
Aromatic plant species present in the natural Park of the Tuscany Archipelago are used as flavoring agents and spices, as dietary supplements, and in cosmetics and aromatherapy. The plants are usually collected from wild stands, inducing a depletion of the natural habitat. Therefore, micropropagation of these aromatic plants can play a role in the protection of the natural ecosystem, can guarantee massive sustainable production, and can provide standardized plant material for diverse economic purposes. The aim of this study is to compare the volatile organic compounds produced by the wild plants with those from in vitro plantlets using headspace solid-phase micro-extraction (HS-SPME) followed by capillary gas chromatography coupled to mass spectrometry (GC-MS). Typical plants of this natural area selected for this work were Calamintha nepeta L., Crithmum maritimum L., Lavandula angustifolia L., Myrtus communis L., Rosmarinus officinalis L., Salvia officinalis L. and Satureja hortensis L. Different explants were used: microcuttings with vegetative apical parts, axillary buds, and internodes. Sterilization percentage, multiplication rate and shoot length, as well as root formation, were measured. The volatile aromatic profiles produced by in vitro plantlets were compared with those of the wild plants, in particular for C. maritimum, R. officinalis, S. officinalis and S. hortensis. This study indicated that the micropropagation technique can represent a valid alternative for producing massive amounts of sterile plant material characterised by the same aromatic flavour as the wild grown plants.