Journal articles on the topic "Algoritmi paralleli"

Author: Grafiati

Published: March 11, 2023

See the top 50 journal articles for your research on the topic "Algoritmi paralleli".

1

Di Viggiano, Pasquale Luigi. "DEMOCRAZIA DIGITALE COME DIFFERENZA:". Revista da Faculdade Mineira de Direito 24, no. 48 (March 18, 2022): 64–78. http://dx.doi.org/10.5752/p.2318-7999.2021v24n48p64-78.

Abstract

Contemporary digital social and political participation (e-Democracy) is a product of the digitalization of the State and its apparatus, characterized by the production of new rights made possible by communication technologies. The digitalization of the State apparatus through new technologies based on intelligent algorithms, together with the regulations on the information and communication society, has triggered the production of so-called "new rights" whose enforceability broadens the concept of democracy, establishing a difference between the traditional government of public affairs and the growing demands of communities ever more tied to the digital communication system. The rights to access the Internet and the network, to e-voting, to communicate with the public administration (PA) through new technologies, and to receive digital public services are paralleled by duties of the State to satisfy these new rights. At the same time, the risk grows that forms of digital participation produce intolerable levels of exclusion that undermine democracy. Observing and describing, with the conceptual tools of the Centro di Studi sul Rischio, how the systems of law, politics and society evolve through their relationship with the digital ecosystem driven by the archipelago of artificial intelligences is the objective and the challenge: always uncertain in its outcomes, always new in its findings, but always stimulating and fruitful for social, political and legal research.

2

TRINDER, P. W., K. HAMMOND, H. W. LOIDL and S. L. PEYTON JONES. "Algorithm + strategy = parallelism". Journal of Functional Programming 8, no. 1 (January 1998): 23–60. http://dx.doi.org/10.1017/s0956796897002967.

Abstract

The process of writing large parallel programs is complicated by the need to specify both the parallel behaviour of the program and the algorithm that is to be used to compute its result. This paper introduces evaluation strategies: lazy higher-order functions that control the parallel evaluation of non-strict functional languages. Using evaluation strategies, it is possible to achieve a clean separation between algorithmic and behavioural code. The result is enhanced clarity and shorter parallel programs. Evaluation strategies are a very general concept: this paper shows how they can be used to model a wide range of commonly used programming paradigms, including divide-and-conquer parallelism, pipeline parallelism, producer/consumer parallelism, and data-oriented parallelism. Because they are based on unrestricted higher-order functions, they can also capture irregular parallel structures. Evaluation strategies are not just of theoretical interest: they have evolved out of our experience in parallelising several large-scale parallel applications, where they have proved invaluable in helping to manage the complexities of parallel behaviour. Some of these applications are described in detail here. The largest application we have studied to date, Lolita, is a 40,000 line natural language engineering system. Initial results show that for these programs we can achieve acceptable parallel performance, for relatively little programming effort.
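
The separation between algorithmic and behavioural code described in this abstract can be loosely illustrated outside Haskell. The sketch below is only a Python analogy, not the paper's API: `fib` stands for the algorithm, while `sequential` and `threaded` are hypothetical "strategy" functions that decide solely how a list of independent calls is evaluated.

```python
from concurrent.futures import ThreadPoolExecutor


def sequential(f, xs):
    """Strategy (illustrative name): evaluate every call one after another."""
    return [f(x) for x in xs]


def threaded(f, xs, workers=4):
    """Strategy (illustrative name): evaluate the calls on a thread pool.

    pool.map returns results in input order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(f, xs))


def fib(n):
    """Algorithmic code: contains no mention of parallelism."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def fib_table(ns, strategy=sequential):
    """The behavioural choice is passed in, not baked into the algorithm."""
    return strategy(fib, ns)
```

Calling `fib_table([5, 10, 15], threaded)` returns the same list as the sequential call. In standard CPython, threads do not speed up CPU-bound code such as `fib`, so the point here is the separation of concerns rather than the speedup.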

3

Bassil, Youssef. "Implementation of Computational Algorithms using Parallel Programming". International Journal of Trend in Scientific Research and Development Volume-3, Issue-3 (April 30, 2019): 704–10. http://dx.doi.org/10.31142/ijtsrd22947.

4

Deng, An-Wen and Chih-Ying Gwo. "Parallel Computing Zernike Moments via Combined Algorithms". SIJ Transactions on Computer Science Engineering & its Applications (CSEA) 04, no. 03 (June 28, 2016): 01–09. http://dx.doi.org/10.9756/sijcsea/v4i3/04020050101.

5

Shu, Qin, Xiuli He, Chang Wang and Yunxiu Yang. "Parallel registration algorithm with arbitrary affine transformation". Chinese Optics Letters 18, no. 7 (2020): 071001. http://dx.doi.org/10.3788/col202018.071001.

6

Liu, Yu and Yi Xiao. "Parallel Solution of Magnetotelluric Occam Inversion Algorithm Based on Hybrid MPI/OpenMP Model". Applied Mechanics and Materials 602-605 (August 2014): 3751–54. http://dx.doi.org/10.4028/www.scientific.net/amm.602-605.3751.

Abstract

To improve the efficiency of the magnetotelluric Occam inversion algorithm (MT Occam), a parallel algorithm is implemented on a hybrid MPI/OpenMP parallel programming model to increase its convergence speed and decrease its running time. MT Occam is partitioned so that its tasks can be mapped onto the parallel model. The parallel algorithm implements coarse-grained parallelism between computation nodes and fine-grained parallelism between the cores within each node. By analyzing the data dependencies, the computing tasks are partitioned accurately so as to reduce transmission time. The experimental results show that a higher speedup is obtained as the model scale increases. The high efficiency of the parallel partitioning strategy of the model can improve the scalability of the parallel algorithm.

7

DIKAIAKOS, MARIOS D., ANNE ROGERS and KENNETH STEIGLITZ. "FUNCTIONAL ALGORITHM SIMULATION OF THE FAST MULTIPOLE METHOD: ARCHITECTURAL IMPLICATIONS". Parallel Processing Letters 06, no. 01 (March 1996): 55–66. http://dx.doi.org/10.1142/s0129626496000078.

Abstract

Functional Algorithm Simulation is a methodology for predicting the computation and communication characteristics of parallel algorithms for a class of scientific problems, without actually performing the expensive numerical computations involved. In this paper, we use Functional Algorithm Simulation to study the parallel Fast Multipole Method (FMM), which solves the N-body problem. Functional Algorithm Simulation provides us with useful information regarding communication patterns in the algorithm, the variation of available parallelism during different algorithmic phases, and upper bounds on available speedups for different problem sizes. Furthermore, it allows us to predict the performance of the FMM on message-passing multiprocessors with topologies such as cliques, hypercubes, rings, and multirings, over a wider range of problem sizes and numbers of processors than would be feasible by direct simulation. Our simulations show that an implementation of the FMM on low-cost, scalable ring or multiring architectures can attain satisfactory performance.

8

Hahne, Jens, Stephanie Friedhoff and Matthias Bolten. "Algorithm 1016". ACM Transactions on Mathematical Software 47, no. 2 (April 2021): 1–22. http://dx.doi.org/10.1145/3446979.

Abstract

In this article, we introduce the Python framework PyMGRIT, which implements the multigrid-reduction-in-time (MGRIT) algorithm for solving (non-)linear systems arising from the discretization of time-dependent problems. The MGRIT algorithm is a reduction-based iterative method that allows parallel-in-time simulations, i.e., calculating multiple time steps simultaneously in a simulation, using a time-grid hierarchy. The PyMGRIT framework includes many different variants of the MGRIT algorithm, ranging from different multigrid cycle types and relaxation schemes, various coarsening strategies, including time-only and space-time coarsening, and the ability to utilize different time integrators on different levels in the multigrid hierarchy. The comprehensive documentation with tutorials and many examples and the fully documented code allow an easy start into the work with the package. The functionality of the code is ensured by automated serial and parallel tests using continuous integration. PyMGRIT supports serial runs suitable for prototyping and testing of new approaches, as well as parallel runs using the Message Passing Interface (MPI). In this manuscript, we describe the implementation of the MGRIT algorithm in PyMGRIT and present the usage from both a user and a developer point of view. Three examples illustrate different aspects of the package itself, especially running tests with pure time parallelism, as well as space-time parallelism through the coupling of PyMGRIT with PETSc or Firedrake.

9

Najoui, Mohamed, Anas Hatim, Said Belkouch and Noureddine Chabini. "Novel Implementation Approach with Enhanced Memory Access Performance of MGS Algorithm for VLIW Architecture". Journal of Circuits, Systems and Computers 29, no. 12 (February 19, 2020): 2050200. http://dx.doi.org/10.1142/s021812662050200x.

Abstract

The Modified Gram–Schmidt (MGS) algorithm is one of the best-known forms of QR decomposition (QRD) algorithms. It has been used in many signal and image processing applications to solve least-squares problems and linear equations or to invert matrices. However, QRD is regarded as a computationally expensive technique, and its sequential implementation fails to meet the requirements of many real-time applications. In this paper, we suggest a new parallel version of the MGS algorithm that uses VLIW (Very Long Instruction Word) resources efficiently to gain performance. The presented parallel MGS is based on compact VLIW kernels that have been designed for each algorithm step, taking into account architectural and algorithmic constraints. Based on instruction scheduling and software pipelining techniques, the proposed kernels efficiently exploit data-, instruction- and loop-level parallelism. Additionally, cache memory properties were used efficiently to enhance parallel memory access and to avoid cache misses. The robustness, accuracy and speed of the introduced parallel MGS implementation on VLIW significantly enhance the performance of systems under severe real-time and low-power constraints. Experimental results show great improvements over the optimized vendor QRD implementation and the state of the art.
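
For orientation, the textbook sequential MGS factorization that this abstract takes as its starting point can be sketched as follows. This is a plain reference version in pure Python, not the authors' VLIW kernels; the function name `mgs_qr` and the list-of-rows matrix layout are assumptions made for the example, and the input is assumed to have full column rank.

```python
import math


def mgs_qr(a):
    """Modified Gram-Schmidt QR of an m x n matrix given as a list of rows.

    Returns (q, r): q is an m x n list of rows with orthonormal columns and
    r is an n x n upper-triangular list of rows, so that a = q * r.
    """
    m, n = len(a), len(a[0])
    # Work on columns, since MGS orthogonalises column by column.
    v = [[float(a[i][j]) for i in range(m)] for j in range(n)]
    q = [[0.0] * m for _ in range(n)]
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        r[j][j] = math.sqrt(sum(x * x for x in v[j]))
        q[j] = [x / r[j][j] for x in v[j]]
        # Remove the q_j component from every remaining column right away.
        # For different k these updates do not depend on each other.
        for k in range(j + 1, n):
            r[j][k] = sum(qi * vi for qi, vi in zip(q[j], v[k]))
            v[k] = [vi - r[j][k] * qi for qi, vi in zip(q[j], v[k])]
    # Convert q from a list of columns back to a list of rows.
    q_rows = [[q[j][i] for j in range(n)] for i in range(m)]
    return q_rows, r
```

The inner loop over `k` is where data parallelism is available in this formulation, because each remaining column is updated independently once `q[j]` is known.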

10

Li, Yong. "An Improved Parallel FFT Algorithm Based on the GPU". Advanced Materials Research 647 (January 2013): 880–84. http://dx.doi.org/10.4028/www.scientific.net/amr.647.880.

Abstract

With the extensive application of the FFT in digital signal processing and image signal processing, which require large-scale computing, it becomes more and more important to improve parallelism, especially efficient and scalable parallelization of the FFT algorithm. This paper improves the parallelism of the FFT algorithm based on the Six-Step FFT algorithm. The GPU is introduced to realize parallel FFT computing on a single machine and to improve the speed of the Fourier transform. With the optimization strategy of the mapping hiding the transport matrix, the performance of the parallel FFT algorithm is remarkably improved by assigning the matrix calculation and butterfly computation to the GPU. Finally, the algorithm is applied to the design of a digital filter for seismic data.
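
The six-step decomposition named in this abstract can be sketched on the CPU as follows. This is a minimal illustration of the general six-step scheme (transpose, row transforms, twiddle multiplication, transpose, row transforms, transpose), not the paper's GPU implementation; the function names are assumptions, and a direct O(n^2) DFT stands in for the short row FFTs to keep the example small.

```python
import cmath


def dft(x):
    """Direct O(n^2) DFT, used here for the short row transforms."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def six_step_fft(x, n1, n2):
    """DFT of x (length n1 * n2) via the six-step decomposition."""
    n = n1 * n2
    assert len(x) == n
    a = [x[r * n2:(r + 1) * n2] for r in range(n1)]  # view x as n1 x n2, row-major
    b = transpose(a)                                 # step 1: now n2 x n1
    b = [dft(row) for row in b]                      # step 2: n2 independent length-n1 DFTs
    b = [[b[i][k] * cmath.exp(-2j * cmath.pi * i * k / n)
          for k in range(n1)] for i in range(n2)]    # step 3: twiddle factors
    c = transpose(b)                                 # step 4: now n1 x n2
    c = [dft(row) for row in c]                      # step 5: n1 independent length-n2 DFTs
    d = transpose(c)                                 # step 6: now n2 x n1
    return [v for row in d for v in row]             # flattened output in natural order
```

Steps 2 and 5 each consist of independent row transforms, which is what makes the scheme attractive for parallel hardware; the transposes are the data-movement cost that implementations try to hide.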

11

Qawasmeh, Ahmad, Salah Taamneh, Ashraf H. Aljammal, Nabhan Hamadneh, Mustafa Banikhalaf and Mohammad Kharabsheh. "Parallelism exploration in sequential algorithms via animation tool". Multiagent and Grid Systems 17, no. 2 (August 23, 2021): 145–58. http://dx.doi.org/10.3233/mgs-210347.

Abstract

Different high performance techniques, such as profiling, tracing, and instrumentation, have been used to tune and enhance the performance of parallel applications. However, these techniques do not show how to explore the potential of parallelism in a given application. Animating and visualizing the execution process of a sequential algorithm provide a thorough understanding of its usage and functionality. In this work, an interactive web-based educational animation tool was developed to assist users in analyzing sequential algorithms to detect parallel regions regardless of the used parallel programming model. The tool simplifies algorithms’ learning, and helps students to analyze programs efficiently. Our statistical t-test study on a sample of students showed a significant improvement in their perception of the mechanism and parallelism of applications and an increase in their willingness to learn algorithms and parallel programming.

12

Arbogast, Todd, Clint N. Dawson and Mary F. Wheeler. "A parallel algorithm for two phase multicomponent contaminant transport". Applications of Mathematics 40, no. 3 (1995): 163–74. http://dx.doi.org/10.21136/am.1995.134289.

13

B.jyothirmai, B. jyothirmai and M. premalatha M.premalatha. "Design and Implementation of Parallel Mac by Booth Algorithm". International Journal of Scientific Research 1, no. 6 (June 1, 2012): 78–79. http://dx.doi.org/10.15373/22778179/nov2012/29.

14

Priya, R. Arokia and Shreedhar Gyandeo Pawar. "Parallel Algorithm Using Opencl for Depth Estimation of Image". International Journal of Scientific Research 2, no. 10 (June 1, 2012): 1–5. http://dx.doi.org/10.15373/22778179/oct2013/46.

15

Qian, Yu Xia, K. Dong and X. N. Zhang. "Two New Parallel Algorithms Based on QPSO". Applied Mechanics and Materials 743 (March 2015): 325–32. http://dx.doi.org/10.4028/www.scientific.net/amm.743.325.

Abstract

Building on an analysis of the classical particle swarm optimization (PSO) algorithm, we adopt Sun's quantum-behaved particle swarm optimization (QPSO) algorithm. By analyzing the algorithm's natural parallelism and combining it with the high-speed parallelism of parallel computers, we put forward a new parallel quantum-behaved particle swarm optimization (PQPSO) algorithm. On this basis we introduce the island model and, in contrast to the fine-grained quantum-behaved particle swarm optimization algorithm, propose two coarse-grained parallel QPSO algorithms based on multiple populations. Finally, numerical tests with benchmark functions are run on an MPI parallel machine, together with a comparative analysis against other optimization algorithms. The results show that exchanging data based on the global optimal value gives better results than exchanging data based on local optimal values, whereas the comparison of running time shows the opposite.

16

González, Carlos H. and Basilio B. Fraguela. "An Algorithm Template for Domain-Based Parallel Irregular Algorithms". International Journal of Parallel Programming 42, no. 6 (September 1, 2013): 948–67. http://dx.doi.org/10.1007/s10766-013-0268-3.

17

Baruah, Nirvik, Peter Kraft, Fiodar Kazhamiaka, Peter Bailis and Matei Zaharia. "Parallelism-Optimizing Data Placement for Faster Data-Parallel Computations". Proceedings of the VLDB Endowment 16, no. 4 (December 2022): 760–71. http://dx.doi.org/10.14778/3574245.3574260.

Abstract

Systems performing large data-parallel computations, including online analytical processing (OLAP) systems like Druid and search engines like Elasticsearch, are increasingly being used for business-critical real-time applications where providing low query latency is paramount. In this paper, we investigate an underexplored factor in the performance of data-parallel queries: their parallelism. We find that to minimize the tail latency of data-parallel queries, it is critical to place data such that the data items accessed by each individual query are spread across as many machines as possible so that each query can leverage the computational resources of as many machines as possible. To optimize parallelism and minimize tail latency in real systems, we develop a novel parallelism-optimizing data placement algorithm that defines a linearly-computable measure of query parallelism, uses it to frame data placement as an optimization problem, and leverages a new optimization problem partitioning technique to scale to large cluster sizes. We apply this algorithm to popular systems such as Solr and MongoDB and show that it reduces p99 latency by 7-64% on data-parallel workloads.
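
The central observation of this abstract, that a query benefits when the data it touches is spread over many machines, can be illustrated with a toy metric. The sketch below is a simplified stand-in and not the paper's measure or placement algorithm: it assumes that a query's parallelism is simply the number of distinct machines holding the shards it reads, and all names (`query_parallelism`, `worst_case_parallelism`, the shard and machine labels) are hypothetical.

```python
def query_parallelism(query_shards, placement):
    """Number of distinct machines that hold at least one shard the query reads."""
    return len({placement[s] for s in query_shards})


def worst_case_parallelism(queries, placement):
    """Smallest parallelism over all queries; low values suggest slow tail queries."""
    return min(query_parallelism(q, placement) for q in queries)


# Two queries, each reading two shards, on a two-machine cluster.
queries = [{"a", "b"}, {"c", "d"}]

# Co-locating the shards of each query leaves every query on a single machine.
clustered = {"a": 0, "b": 0, "c": 1, "d": 1}

# Spreading them lets every query use both machines.
spread = {"a": 0, "b": 1, "c": 0, "d": 1}
```

Under these assumptions, `worst_case_parallelism(queries, clustered)` is 1 and `worst_case_parallelism(queries, spread)` is 2, although both placements put the same number of shards on each machine.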

18

NARAYANAN, P. J. and LARRY S. DAVIS. "REPLICATED IMAGE ALGORITHMS AND THEIR ANALYSES ON SIMD MACHINES". International Journal of Pattern Recognition and Artificial Intelligence 06, no. 02n03 (August 1992): 335–52. http://dx.doi.org/10.1142/s0218001492000217.

Abstract

Data parallel processing on processor array architectures has gained popularity in data intensive applications, such as image processing and scientific computing, as massively parallel processor array machines became feasible commercially. The data parallel paradigm of assigning one processing element to each data element results in an inefficient utilization of a large processor array when a relatively small data structure is processed on it. The large degree of parallelism of a massively parallel processor array machine does not result in a faster solution to a problem involving relatively small data structures than the modest degree of parallelism of a machine that is just as large as the data structure. We presented a data replication technique to speed up the processing of small data structures on large processor arrays. In this paper, we present replicated data algorithms for digital image convolutions and median filtering, and compare their performance with conventional data parallel algorithms for the same on three popular array interconnection networks, namely, the 2-D mesh, the 3-D mesh, and the hypercube.

19

HU, ZHENJIANG and MASATO TAKEICHI. "CALCULATING AN OPTIMAL HOMOMORPHIC ALGORITHM FOR BRACKET MATCHING". Parallel Processing Letters 09, no. 03 (September 1999): 335–45. http://dx.doi.org/10.1142/s0129626499000311.

Abstract

It is widely recognized that a key problem of parallel computation is in the development of both efficient and correct parallel software. Although many advanced language features and compilation techniques have been proposed to alleviate the complexity of parallel programming, much effort is still required to develop parallelism in a formal and systematic way. In this paper, we intend to clarify this point by demonstrating a formal derivation of a correct but efficient homomorphic parallel algorithm for a simple language recognition problem known as bracket matching. To the best of our knowledge, our formal derivation leads to a novel divide-and-conquer parallel algorithm for bracket matching.
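
To make the idea of a homomorphic, divide-and-conquer formulation concrete, the following sketch handles the simplified case of a single bracket type. It is a well-known construction and is not claimed to be the algorithm derived in the paper: each segment is summarised as a pair (unmatched closers, unmatched openers), and summaries of adjacent segments are merged with an associative operator, so chunks can be summarised independently. All function names are illustrative.

```python
from functools import reduce


def summarize(ch):
    """Summary of one character: (unmatched ')' count, unmatched '(' count)."""
    if ch == "(":
        return (0, 1)
    if ch == ")":
        return (1, 0)
    return (0, 0)


def combine(left, right):
    """Associative merge of the summaries of two adjacent segments."""
    lc, lo = left
    rc, ro = right
    matched = min(lo, rc)  # openers on the left cancel closers on the right
    return (lc + rc - matched, lo + ro - matched)


def balanced(text):
    """Left-to-right reduction over the whole text."""
    return reduce(combine, map(summarize, text), (0, 0)) == (0, 0)


def balanced_split(text, parts=4):
    """Same answer, computed from independently summarised chunks."""
    size = max(1, -(-len(text) // parts))  # ceiling division
    chunks = [text[i:i + size] for i in range(0, len(text), size)]
    partial = [reduce(combine, map(summarize, c), (0, 0)) for c in chunks]
    return reduce(combine, partial, (0, 0)) == (0, 0)
```

Because `combine` is associative with identity `(0, 0)`, the per-chunk reductions in `balanced_split` could run on separate workers and be merged in any bracketing order, which is the property a homomorphism-based parallel algorithm relies on.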

20

Lee, Taekhee and Young J. Kim. "Massively parallel motion planning algorithms under uncertainty using POMDP". International Journal of Robotics Research 35, no. 8 (August 21, 2015): 928–42. http://dx.doi.org/10.1177/0278364915594856.

Abstract

We present new parallel algorithms that solve continuous-state partially observable Markov decision process (POMDP) problems using the GPU (gPOMDP) and a hybrid of the GPU and CPU (hPOMDP). We choose the Monte Carlo value iteration (MCVI) method as our base algorithm and parallelize this algorithm using the multi-level parallel formulation of MCVI. For each parallel level, we propose efficient algorithms to utilize the massive data parallelism available on modern GPUs. Our GPU-based method uses the two workload distribution techniques, compute/data interleaving and workload balancing, in order to obtain the maximum parallel performance at the highest level. Here we also present a CPU–GPU hybrid method that takes advantage of both CPU and GPU parallelism in order to solve highly complex POMDP planning problems. The CPU is responsible for data preparation, while the GPU performs Monte Carlo simulations; these operations are performed concurrently using the compute/data overlap technique between the CPU and GPU. To the best of the authors’ knowledge, our algorithms are the first parallel algorithms that efficiently execute POMDP in a massively parallel fashion utilizing the GPU or a hybrid of the GPU and CPU. Our algorithms outperform the existing CPU-based algorithm by a factor of 75–99 based on the chosen benchmark.

21

Chen Qingjiang, 陈清江, 李金阳 Li Jinyang and 胡倩楠 Hu Qiannan. "基于并联残差网络的低照度图像增强算法". Laser & Optoelectronics Progress 58, no. 14 (2021): 1410015. http://dx.doi.org/10.3788/lop202158.1410015.

22

KR, Ramkumar, Ramkumar K.R., Ganesh Kumar M, Hemachandar N and Manoj Prasadh D. "HPRAAM: Hybrid Parallel Routing Algorithm using Ant Agents for MANETs". International Journal of Engineering and Technology 1, no. 1 (2009): 102–6. http://dx.doi.org/10.7763/ijet.2009.v1.19.

23

Lin, H. X. "Graph Transformation and Designing Parallel Sparse Matrix Algorithms beyond Data Dependence Analysis". Scientific Programming 12, no. 2 (2004): 91–100. http://dx.doi.org/10.1155/2004/169467.

Abstract

Algorithms are often parallelized based on data dependence analysis manually or by means of parallel compilers. Some vector/matrix computations such as the matrix-vector products with simple data dependence structures (data parallelism) can be easily parallelized. For problems with more complicated data dependence structures, parallelization is less straightforward. The data dependence graph is a powerful means for designing and analyzing parallel algorithms. However, for sparse matrix computations, parallelization based on solely exploiting the existing parallelism in an algorithm does not always give satisfactory results. For example, the conventional Gaussian elimination algorithm for the solution of a tri-diagonal system is inherently sequential, so algorithms designed specifically for parallel computation are needed. After briefly reviewing different parallelization approaches, a powerful graph formalism for designing parallel algorithms is introduced. This formalism will be discussed using a tri-diagonal system as an example. Its application to general matrix computations is also discussed. Its power in designing parallel algorithms beyond the ability of data dependence analysis is shown by means of a new algorithm called ACER (Alternating Cyclic Elimination and Reduction algorithm).
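
The remark that conventional Gaussian elimination on a tri-diagonal system is inherently sequential can be seen directly in the standard Thomas algorithm. The sketch below shows only that sequential baseline, not ACER; the function name and argument layout are assumptions, all four lists are expected to have the same length, and no pivoting is done, so the matrix is assumed to produce no zero pivot (for example, because it is diagonally dominant).

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system by Gaussian elimination without pivoting.

    lower[i] multiplies x[i-1] in row i (lower[0] is ignored) and upper[i]
    multiplies x[i+1] (upper[-1] does not affect the result).
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        # Row i needs c[i-1] and d[i-1]: a loop-carried dependence chain.
        pivot = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / pivot
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / pivot
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        # Back substitution has the same chain in the opposite direction.
        x[i] = d[i] - c[i] * x[i + 1]
    return x
```

Each iteration of both loops consumes the value produced by the previous one, so a dependence analysis of this code finds nothing to run concurrently. That is why parallel tridiagonal solvers restructure the elimination order instead of parallelizing this loop.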

24

Bahi, J. M., S. Contassot-Vivier, R. Couturier and F. Vernier. "A decentralized convergence detection algorithm for asynchronous parallel iterative algorithms". IEEE Transactions on Parallel and Distributed Systems 16, no. 1 (January 2005): 4–13. http://dx.doi.org/10.1109/tpds.2005.2.

25

Cheng, Dong-Nian, Yu-Xiang Hu and Cai-Xia Liu. "Parallel Algorithm Core: A Novel IPSec Algorithm Engine for Both Exploiting Parallelism and Improving Scalability". Journal of Computer Science and Technology 23, no. 5 (September 2008): 792–805. http://dx.doi.org/10.1007/s11390-008-9166-3.

26

Blelloch, Guy E. and Bruce M. Maggs. "Parallel algorithms". ACM Computing Surveys 28, no. 1 (March 1996): 51–54. http://dx.doi.org/10.1145/234313.234339.

27

Vasilev, Vasil P. "BSPGRID: VARIABLE RESOURCES PARALLEL COMPUTATION AND MULTIPROGRAMMED PARALLELISM". Parallel Processing Letters 13, no. 03 (September 2003): 329–40. http://dx.doi.org/10.1142/s0129626403001318.

Abstract

This paper introduces a new framework for the design of parallel algorithms that may be executed on multiprogrammed architectures with variable resources. These features, in combination with an implied ability to handle fault tolerance, facilitate environments such as the GRID. A new model, BSPGRID, is presented, which exploits the bulk synchronous paradigm to allow existing algorithms to be easily adapted and used. It models computation, communication, external memory accesses (I/O) and synchronization. By combining the communication and I/O operations, BSPGRID allows the easy design of portable algorithms while permitting them to execute on non-dedicated hardware and/or changing resources, which is typical for machines in a GRID. However, even with this degree of dynamicity, the model still offers a simple and tractable cost model. Each program runs in its own virtual BSPGRID machine. Its emulation on a real computer is demonstrated to show the practicality of the framework. A dense matrix multiplication algorithm and its emulation in a multiprogrammed environment are given as an example.

28

Ling, Huidong, Xinmu Zhu, Tao Zhu, Mingxing Nie, Zhenghai Liu and Zhenyu Liu. "A Parallel Multiobjective PSO Weighted Average Clustering Algorithm Based on Apache Spark". Entropy 25, no. 2 (January 31, 2023): 259. http://dx.doi.org/10.3390/e25020259.

Abstract

Multiobjective clustering algorithms using particle swarm optimization have been applied successfully in some applications. However, existing algorithms are implemented on a single machine and cannot be directly parallelized on a cluster, which makes it difficult for existing algorithms to handle large-scale data. With the development of distributed parallel computing frameworks, data parallelism was proposed. However, the increase in parallelism will lead to the problem of unbalanced data distribution affecting the clustering effect. In this paper, we propose a parallel multiobjective PSO weighted average clustering algorithm based on Apache Spark (Spark-MOPSO-Avg). First, the entire data set is divided into multiple partitions and cached in memory using the distributed parallel and memory-based computing of Apache Spark. The local fitness value of the particle is calculated in parallel according to the data in the partition. After the calculation is completed, only particle information is transmitted, and there is no need to transmit a large number of data objects between each node, reducing the communication of data in the network and thus effectively reducing the algorithm’s running time. Second, a weighted average calculation of the local fitness values is performed to mitigate the effect of unbalanced data distribution on the results. Experimental results show that the Spark-MOPSO-Avg algorithm achieves lower information loss under data parallelism, losing about 1% to 9% accuracy, but can effectively reduce the algorithm time overhead. It shows good execution efficiency and parallel computing capability under the Spark distributed cluster.
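
The role of the weighted average in this abstract can be shown with a small stand-alone calculation. The sketch below does not use Spark and does not reproduce the paper's fitness functions: it assumes a hypothetical one-dimensional fitness (mean squared distance to the nearest centroid), non-empty partitions, and weights equal to partition sizes, purely to show why an unweighted mean of local values is distorted by unbalanced partitions.

```python
def local_fitness(points, centroids):
    """Mean squared distance from each 1-D point to its nearest centroid."""
    return sum(min((p - c) ** 2 for c in centroids) for p in points) / len(points)


def weighted_average_fitness(partitions, centroids):
    """Combine per-partition fitness, weighting each value by its partition size."""
    total = sum(len(part) for part in partitions)
    return sum(len(part) * local_fitness(part, centroids) for part in partitions) / total


def plain_average_fitness(partitions, centroids):
    """Unweighted mean of the local values, shown only for comparison."""
    return sum(local_fitness(part, centroids) for part in partitions) / len(partitions)


# One large and one tiny partition: a deliberately unbalanced split.
partitions = [[0.0, 1.0, 2.0, 3.0], [10.0]]
centroids = [1.5, 10.0]
```

With this data the size-weighted combination equals the fitness computed over all five points at once, while the plain average gives the single point in the small partition as much influence as the four points in the large one.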

29

CAMPANINI, R., G. DI CARO, M. VILLANI, I. D’ANTONE and G. GIUSTI. "PARALLEL ARCHITECTURES AND INTRINSICALLY PARALLEL ALGORITHMS: GENETIC ALGORITHMS". International Journal of Modern Physics C 05, no. 01 (February 1994): 95–112. http://dx.doi.org/10.1142/s012918319400009x.

Abstract

Genetic algorithms are search or classification algorithms based on natural models. They present a high degree of internal parallelism. We developed two versions, differing in the way the population is organized and we studied and compared their characteristics and performances when applied to the optimization of multidimensional function problems. All the implementations are realized on transputer networks.

30

Liu, Qun and Xiaobing Li. "A New Parallel Item-Based Collaborative Filtering Algorithm Based on Hadoop". Journal of Software 10, no. 4 (April 2015): 416–26. http://dx.doi.org/10.17706/jsw.10.4.416-426.

31

Ahmed, Rafid, Md Sazzadul Islam and Jia Uddin. "Optimizing Apple Lossless Audio Codec Algorithm using NVIDIA CUDA Architecture". International Journal of Electrical and Computer Engineering (IJECE) 8, no. 1 (February 1, 2018): 70. http://dx.doi.org/10.11591/ijece.v8i1.pp70-75.

Abstract

As the majority of compression algorithms are implemented for CPU architectures, the primary focus of our work was to exploit the opportunities of GPU parallelism in audio compression. This paper presents an implementation of the Apple Lossless Audio Codec (ALAC) algorithm using NVIDIA's Compute Unified Device Architecture (CUDA) framework. The core idea was to identify the areas where data parallelism could be applied and to use the CUDA parallel programming model to execute the identified parallel components on CUDA's Single Instruction Multiple Thread (SIMT) model. The dataset was retrieved from the European Broadcasting Union's Sound Quality Assessment Material (SQAM). Faster execution of the algorithm led to a reduction in execution time when applied to audio coding for large audio files. This paper also presents the reduction in power usage due to running the parallel components on the GPU. Experimental results reveal that we achieve about 80-90% speedup through CUDA on the identified components over the CPU implementation while saving CPU power consumption.

32

Ivanov, Ivelin. "Parallel processing and parallel algorithms". ACM SIGACT News 33, no. 4 (December 2002): 12–14. http://dx.doi.org/10.1145/601819.601825.

33

Aharoni, Gad, Dror G. Feitelson and Amnon Barak. "A run-time algorithm for managing the granularity of parallel functional programs". Journal of Functional Programming 2, no. 4 (October 1992): 387–405. http://dx.doi.org/10.1017/s0956796800000484.

Abstract

We present an on-line (run-time) algorithm that manages the granularity of parallel functional programs. The algorithm exploits useful parallelism when it exists, and ignores ineffective parallelism in programs that produce many small tasks. The idea is to balance the amount of local work with the cost of distributing the work. This is achieved by ensuring that for every parallel task spawned, an amount of work that equals the cost of the spawn is performed locally. We analyse several cases and compare the algorithm to the optimal execution. In most cases the algorithm competes well with the optimal algorithm, even though the optimal algorithm has information about the future evolution of the computation that is not available to the on-line algorithm. This is quite remarkable considering we have chosen extreme cases that have contradicting optimal executions. Moreover, we show that no other on-line algorithm can be consistently better than it. We also present experimental results that demonstrate the effectiveness of the algorithm.
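
The balancing rule stated in this abstract (local work equal to the spawn cost is performed for every task spawned) can be mimicked with a very small bookkeeping loop. This is a loose sketch under strong assumptions and not the paper's algorithm: tasks arrive as a flat list with known costs, whereas the paper deals with programs whose tasks unfold at run time, and the names `schedule`, `task_costs` and `spawn_cost` are hypothetical.

```python
def schedule(task_costs, spawn_cost):
    """Split tasks into (run_locally, spawned) under a pay-before-you-spawn rule.

    A task is spawned only when at least spawn_cost units of local work have
    been performed since that work was last "spent" on a spawn.
    """
    local, spawned = [], []
    credit = 0.0  # local work performed and not yet charged to a spawn
    for cost in task_costs:
        if credit >= spawn_cost:
            spawned.append(cost)
            credit -= spawn_cost
        else:
            local.append(cost)
            credit += cost
    return local, spawned
```

With many unit-cost tasks and a spawn cost of 10, this rule runs ten tasks locally for each one it spawns, so most fine-grained work stays local. After a single task of cost 100, it has enough credit to spawn the next ten tasks.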


34

Hutchinson, D. and B. M. S. Khalaf. "Parallel algorithms for solving initial value problems: front broadening and embedded parallelism". Parallel Computing 17, no. 9 (November 1991): 957–68. http://dx.doi.org/10.1016/s0167-8191(05)80041-9.


35

Zhang, Xiaohuan, Dan Zhang, Zhen Wang and Yu Xin. "Multiple-Devices-Process Integrated Scheduling Algorithm with Time-Selective Strategy for Process Sequence". Complexity 2020 (October 17, 2020): 1–12. http://dx.doi.org/10.1155/2020/8898536.

Abstract

Current algorithms for Multiple-Devices-Process integrated scheduling problems do not consider the parallelism between parallel processes. They also ignore the influence of the first processing procedure on the second, which results in poor tightness between serial processes and poor parallelism between parallel processes, and ultimately degrades the product scheduling results. We propose a Multiple-Devices-Process integrated scheduling algorithm with a time-selective strategy for the process sequence. The proposed Multiple-Devices-Process sequencing strategy determines the scheduling order of the processes and improves the tightness between serial processes. This paper presents a method to determine the quasi-scheduling time point of the multi-equipment process, the time-selective strategy of Multiple-Devices-Process, and the time-selective adjustment strategy of Multiple-Devices-Process, so that the first and second processing procedures cooperate with each other. This improves the tightness of the serial processes and the parallelism of the parallel processes, and thereby shortens the product processing time.


36

NARAYANASWAMI, CHANDRASEKHAR and WILLIAM RANDOLPH FRANKLIN. "DETERMINATION OF MASS PROPERTIES OF POLYGONAL CSG OBJECTS IN PARALLEL". International Journal of Computational Geometry & Applications 01, no. 04 (December 1991): 381–403. http://dx.doi.org/10.1142/s0218195991000268.

Abstract

A parallel algorithm for determining the mass properties of objects represented in the Constructive Solid Geometry (CSG) scheme that uses polygons as primitives is presented. The algorithm exploits the fact that integration of local information around the vertices of the evaluated polygon is sufficient for the determination of its mass properties, i.e., determination of the edges and the complete topology of the evaluated polygon is not necessary. This reduces interprocessor communication and makes it suitable for parallel processing. The algorithm uses data parallelism, spatial partitioning and parallel sorting for parallelization. Tuple-sets on which simple operations have to be performed are identified. The elements of tuple-sets are distributed among the processors for parallelization. The uniform grid spatial partitioning technique is used to generate sub-problems that can be done in parallel and to reduce the cardinality of some of the tuple-sets generated in the algorithm. Parallel sorting is used to sort the tuple-sets between the data-parallel phases of the algorithm.


37

Aliaga, José I., Rocío Carratalá-Sáez and Enrique S. Quintana-Ortí. "Parallel Solution of Hierarchical Symmetric Positive Definite Linear Systems". Applied Mathematics and Nonlinear Sciences 2, no. 1 (June 22, 2017): 201–12. http://dx.doi.org/10.21042/amns.2017.1.00017.

Abstract

We present a prototype task-parallel algorithm for the solution of hierarchical symmetric positive definite linear systems via the ℋ-Cholesky factorization that builds upon the parallel programming standards and associated runtimes for OpenMP and OmpSs. In contrast with previous efforts, our proposal decouples the numerical aspects of the linear algebra operation from the complexities associated with high performance computing. Our experiments make an exhaustive analysis of the efficiency attained by different parallelization approaches that exploit either task-parallelism or loop-parallelism via a runtime. Alternatively, we also evaluate a solution that leverages multi-threaded parallelism via the parallel implementation of the Basic Linear Algebra Subroutines (BLAS) in Intel MKL.


38

Belazi, Akram, Héctor Migallón, Daniel Gónzalez-Sánchez, Jorge Gónzalez-García, Antonio Jimeno-Morenilla and José-Luis Sánchez-Romero. "Enhanced Parallel Sine Cosine Algorithm for Constrained and Unconstrained Optimization". Mathematics 10, no. 7 (April 3, 2022): 1166. http://dx.doi.org/10.3390/math10071166.

Abstract

The main idea of the sine cosine algorithm (SCA) is a sine- and cosine-based vacillation outwards from or towards the best solution. The first main contribution of this paper is an enhanced version of the SCA, called the ESCA algorithm. The superiority of the proposed algorithm over a set of state-of-the-art algorithms in terms of solution accuracy and convergence speed is demonstrated by experimental tests. When these algorithms are transferred to the business sector, they must meet time requirements that depend on the industrial process. If these time requirements are not met, an efficient solution is to speed the algorithms up by designing parallel versions. The second major contribution of this work is the design of several parallel algorithms for efficiently exploiting current multicore processor architectures. First, one-level synchronous and asynchronous parallel ESCA algorithms are designed. They have two advantages: they retain the proposed algorithm’s behavior and provide excellent parallel performance by combining coarse-grained parallelism with fine-grained parallelism. Moreover, the parallel scalability of the proposed algorithms is further improved by employing a two-level parallel strategy. Indeed, the experimental results suggest that the one-level parallel ESCA algorithms reduce the computing time, on average, by 87.4% and 90.8%, respectively, using 12 physical processing cores. The two-level parallel algorithms provide extra reductions of the computing time by 91.4%, 93.1%, and 94.5% with 16, 20, and 24 processing cores, including physical and logical cores. Comparison analysis is carried out on 30 unconstrained benchmark functions and three challenging engineering design problems. The experimental outcomes show that the proposed ESCA algorithm behaves outstandingly well in terms of exploration and exploitation behaviors, local optima avoidance, and convergence speed toward the optimum. The overall performance of the proposed algorithm is statistically validated using three non-parametric statistical tests, namely Friedman, Friedman aligned, and Quade tests.
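
The "vacillation outwards or towards the best solution" mentioned in this abstract can be sketched with the position update of the basic sine cosine algorithm. The sketch below is a sequential version of the plain SCA, not the paper's enhanced ESCA or its parallel variants. The function names, the sphere test objective, the parameter defaults and the choice to update the best solution immediately after each agent moves are all assumptions made for illustration.

```python
import math
import random


def sphere(x):
    """Simple test objective (an assumption): sum of squares, minimum at 0."""
    return sum(v * v for v in x)


def sca_minimize(objective, dim, bounds, agents=20, iterations=200, a=2.0, seed=0):
    """Basic sine cosine algorithm; returns (best, best_value, history)."""
    rng = random.Random(seed)
    lo, hi = bounds
    population = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(agents)]
    best = min(population, key=objective)[:]
    best_value = objective(best)
    history = [best_value]
    for t in range(iterations):
        r1 = a - t * (a / iterations)  # step amplitude shrinks linearly to 0
        for x in population:
            for j in range(dim):
                r2 = rng.uniform(0.0, 2.0 * math.pi)
                r3 = rng.uniform(0.0, 2.0)
                r4 = rng.random()
                # Switch between the sine and cosine branch with equal probability.
                wave = math.sin(r2) if r4 < 0.5 else math.cos(r2)
                x[j] += r1 * wave * abs(r3 * best[j] - x[j])
                x[j] = min(max(x[j], lo), hi)  # keep the agent inside the bounds
            value = objective(x)
            if value < best_value:
                best, best_value = x[:], value
        history.append(best_value)
    return best, best_value, history


best, value, history = sca_minimize(sphere, dim=3, bounds=(-10.0, 10.0))
print(value)
```

Because each agent only reads the shared best solution and writes its own position, the inner loop over agents is the natural place to distribute work across cores, which is the kind of coarse-grained parallelism the abstract refers to.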


39

Zhang, Bin, Jia Jin Le and Mei Wang. "Effective ACPS-Based Rescheduling of Parallel Batch Processing Machines with MapReduce". Applied Mechanics and Materials 575 (June 2014): 820–24. http://dx.doi.org/10.4028/www.scientific.net/amm.575.820.

Abstract

MapReduce is a highly efficient distributed and parallel computing framework that allows users to readily manage large clusters for parallel computing. For the Big Data search problem in a distributed computing environment based on the MapReduce architecture, we propose an ant colony parallel search algorithm (ACPSMR). It takes advantage of the group intelligence of the ant colony algorithm, with its global parallel search and heuristic scheduling capabilities, to address the low efficiency of multi-task parallel batch scheduling in MapReduce. We also extend the HDFS design in the MapReduce architecture so that it integrates effectively with MapReduce. The algorithm can then make the best of the scalability and high parallelism of MapReduce. The simulation results show that the new algorithm can take advantage of cloud computing to achieve good efficiency when mining Big Data.


40

Imperatore, Pasquale and Eugenio Sansosti. "Multithreading Based Parallel Processing for Image Geometric Coregistration in SAR Interferometry". Remote Sensing 13, no. 10 (May 18, 2021): 1963. http://dx.doi.org/10.3390/rs13101963.

Abstract

Within the framework of multi-temporal Synthetic Aperture Radar (SAR) interferometric processing, image coregistration is a fundamental operation that might be extremely time-consuming. This paper explores the possibility of addressing fast and accurate SAR image geometric coregistration, with sub-pixel accuracy and in the presence of a complex 3-D object scene, by exploiting the parallelism offered by shared-memory architectures. An efficient and scalable processor is proposed by designing a parallel algorithm incorporating thread-level parallelism for solving the inherent computationally intensive problem. The adopted functional scheme is first mathematically framed and then investigated in detail in terms of its computational structures. Subsequently, a parallel version of the algorithm is designed, according to a fork-join model, by suitably taking into account the granularity of the decomposition, load-balancing, and different scheduling strategies. The developed parallel algorithm implements parallelism at the thread-level by using OpenMP (Open Multi-Processing) and it is specifically targeted at shared-memory multiprocessors. The parallel performance of the implemented multithreading-based SAR image coregistration prototype processor is experimentally investigated and quantitatively assessed by processing high-resolution X-band COSMO-SkyMed SAR data and using two different multicore architectures. The effectiveness of the developed multithreaded prototype solution in fully benefitting from the computing power offered by multicore processors has successfully been demonstrated via a suitable experimental performance analysis conducted in terms of parallel speedup and efficiency. The demonstrated scalable performance and portability of the developed parallel processor confirm its potential for operational use in the interferometric SAR data processing at large scales.
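
The fork-join decomposition and the speedup and efficiency metrics named in this abstract can be sketched as follows. This is a structural illustration only: the paper uses OpenMP, whereas the sketch swaps in Python's `concurrent.futures` thread pool, and `process_block`, `fork_join`, `block_size` and the timings in the usage lines are hypothetical. Pure-Python threads share the interpreter lock, so the sketch shows the pattern rather than a real performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def process_block(rows):
    """Placeholder per-block kernel (hypothetical): sum each row."""
    return [sum(row) for row in rows]


def fork_join(rows, workers=4, block_size=2):
    """Split rows into blocks, process the blocks concurrently, then join."""
    # block_size sets the granularity of the decomposition.
    blocks = [rows[i:i + block_size] for i in range(0, len(rows), block_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Fork: one task per block. map() returns results in block order.
        partial = list(pool.map(process_block, blocks))
    # Join: flatten the per-block results back into one list.
    return [value for block in partial for value in block]


def speedup(t_serial, t_parallel):
    """Parallel speedup: serial run time divided by parallel run time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, workers):
    """Parallel efficiency: speedup divided by the number of workers."""
    return speedup(t_serial, t_parallel) / workers


print(fork_join([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]))
# Hypothetical timings, not taken from the paper: 120 s serial, 20 s on 8 workers.
print(speedup(120.0, 20.0), efficiency(120.0, 20.0, 8))
```

Smaller blocks give the pool more freedom to balance load at the price of more scheduling overhead, which is the granularity trade-off the abstract alludes to.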


41

CHANG, WENG-LONG, MINYI GUO and JESSE WU. "Solving the Independent-set Problem in a DNA-Based Supercomputer Model". Parallel Processing Letters 15, no. 04 (December 2005): 469–79. http://dx.doi.org/10.1142/s0129626405002386.

Abstract

In this paper, it is demonstrated how the DNA (DeoxyriboNucleic Acid) operations presented by Adleman and Lipton can be used to develop the parallel genetic algorithm that solves the independent-set problem. The advantage of the genetic algorithm is the huge parallelism inherent in DNA based computing. Furthermore, this work represents obvious evidence for the ability of DNA based parallel computing to solve NP-complete problems.


42

Shang, Yizi, Guiming Lu, Ling Shang and Guangqian Wang. "Parallel processing on block-based Gauss-Jordan algorithm for desktop grid". Computer Science and Information Systems 8, no. 3 (2011): 739–59. http://dx.doi.org/10.2298/csis100907026s.

Abstract

Two kinds of parallelism exist in the block-based Gauss-Jordan (BbGJ) algorithm: intra-step and inter-step parallelism. However, the existing parallel paradigm of the BbGJ algorithm targets only intra-step parallelism, and so cannot meet the requirement of simultaneously dispatching as many tasks as possible to the computing nodes of a desktop grid platform that exploits thousands of volunteer computing resources. To overcome this problem, this paper presents a hybrid parallel paradigm for desktop grid platforms that exploits all the parallelizable parts of the BbGJ algorithm. Volatility is well known to be the key issue of desktop grid platforms, and faults are unavoidable during program execution, so a version of the BbGJ algorithm adapted to a desktop grid platform should take volatility into consideration. To this end, the paper adopts a multi-copy distribution strategy and a multi-queue based task preemption method to ensure that key tasks are executed on time, and thus that the whole set of tasks finishes in a shorter period of time.


43

Zmejev, D. N. and N. N. Levchenko. "Aspects of Creating Parallel Programs in Dataflow Programming Paradigm". Informacionnye Tehnologii 28, no. 11 (November 17, 2022): 597–606. http://dx.doi.org/10.17587/it.28.597-606.

Abstract

The imperative programming paradigm is the main one for creating sequential and parallel programs for the vast majority of modern computers, including supercomputers. A feature of the imperative paradigm is the sequence of commands. This feature is an obstacle to the creation of efficient parallel programs, since parallelism is achieved at the expense of additional code. One of the solutions to the problem of overhead for parallel computing is the creation of such a computing model and the architecture of the system that implements it, for which the parallel execution of the algorithm is an immanent property. This model is a dataflow computing model with a dynamically formed context and the architecture of the parallel dataflow computing system "Buran". A complete transition to dataflow systems is hampered, among other things, by the conceptual difference between the dataflow programming paradigm and the imperative one. The article compares these two paradigms. First, parallel data processing is an inherent property of a dataflow program. Second, the dataflow program consists of three elements: a set of initial data, a program code, and a parameterizable distribution function. And third, a conceptually different approach to the algorithmization of the task — the data themselves store information about who should process them (in traditional programs, on the contrary, the command stores information about what data should be processed). The article also presents the structure of a dataflow program and the route for creating a dataflow algorithm. The translation of basic algorithmic constructions (following, branching, loops) is considered on the example of simple problems.
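
The contrast drawn in this abstract, where the data carry the information about who should process them, can be illustrated with a toy token-driven interpreter. It is a generic dataflow sketch under simplifying assumptions (two-input nodes, a single queue, no context and no distribution function) and does not model the "Buran" architecture. The names `run_dataflow`, `nodes` and the token layout are hypothetical.

```python
import operator
from collections import defaultdict, deque


def run_dataflow(nodes, initial_tokens):
    """Run a toy dataflow program.

    nodes: {name: (binary_operation, [(destination, port), ...])}
    A destination of None marks a program output.
    Each token is (destination, port, value): the data item itself
    names the node that must process it.
    """
    pending = defaultdict(dict)  # operands waiting at each node, keyed by port
    queue = deque(initial_tokens)
    outputs = []
    while queue:
        dest, port, value = queue.popleft()
        if dest is None:
            outputs.append(value)
            continue
        pending[dest][port] = value
        if len(pending[dest]) == 2:  # both operands arrived: the node fires
            operation, targets = nodes[dest]
            operands = pending.pop(dest)
            result = operation(operands[0], operands[1])
            for target, target_port in targets:
                queue.append((target, target_port, result))
    return outputs


# (2 + 3) * (4 - 1), with the initial tokens in an arbitrary order.
program = {
    "add": (operator.add, [("mul", 0)]),
    "sub": (operator.sub, [("mul", 1)]),
    "mul": (operator.mul, [(None, 0)]),
}
tokens = [("add", 0, 2), ("sub", 1, 1), ("add", 1, 3), ("sub", 0, 4)]
print(run_dataflow(program, tokens))
```

No instruction order is written anywhere in `program`: a node runs as soon as its operands are present, so `add` and `sub` could fire on different processors without extra code.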


44

Xu, Wencai. "Efficient Distributed Image Recognition Algorithm of Deep Learning Framework TensorFlow". Journal of Physics: Conference Series 2066, no. 1 (November 1, 2021): 012070. http://dx.doi.org/10.1088/1742-6596/2066/1/012070.

Abstract

Deep learning requires training on massive data to gain the ability to deal with unfamiliar data in the future, but it is not easy to get a good model from training on massive data. Because of the requirements of deep learning tasks, deep learning frameworks have emerged. This article studies an efficient distributed image recognition algorithm for the deep learning framework TensorFlow. It examines TensorFlow itself and the theory of its parallel execution, which lays a theoretical foundation for the design and implementation of a TensorFlow distributed parallel optimization algorithm. This paper designs and implements a more efficient TensorFlow distributed parallel algorithm, with different optimization algorithms for TensorFlow data parallelism and model parallelism. Through multiple sets of comparative experiments, this paper verifies the effectiveness of the two optimization algorithms in improving the speed of TensorFlow distributed parallel iteration. The experimental results show that all 12 sets of experiments reached a stable model accuracy, and the accuracy of each set is above 97%. The distributed algorithm, used with a suitable deep learning framework such as TensorFlow, can therefore effectively reduce model training time without reducing the accuracy of the final model.


45

Zhang, Jun, Yong Ping Gao, Yue Shun He and Xue Yuan Wang. "Algorithm Improvement of Two-Way Merge Sort Based on OpenMP". Applied Mechanics and Materials 701-702 (December 2014): 24–29. http://dx.doi.org/10.4028/www.scientific.net/amm.701-702.24.

Abstract

The two-way merge sort algorithm has good time efficiency and is widely used. Its speed and efficiency can be improved by exploiting its inherent parallelism through the parallel processing capacity of multi-core processors and the convenient programming interface of OpenMP. The time complexity is improved to O(n log₂ n / TNUM), which is inversely proportional to the number of parallel threads (TNUM). The experimental results show that the improved two-way merge sort algorithm is much more efficient than the traditional one.
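
A chunked two-way merge sort of the kind this abstract describes can be sketched as follows. The paper's implementation uses OpenMP, whereas this sketch swaps in Python's `concurrent.futures` thread pool to show the same structure: sort `tnum` chunks concurrently, then merge the sorted runs pairwise. The function names and the chunking scheme are assumptions, and CPython threads will not deliver the speedup reported for OpenMP.

```python
from concurrent.futures import ThreadPoolExecutor


def merge(left, right):
    """Two-way merge of two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(values):
    """Sequential top-down two-way merge sort."""
    if len(values) <= 1:
        return list(values)
    mid = len(values) // 2
    return merge(merge_sort(values[:mid]), merge_sort(values[mid:]))


def parallel_merge_sort(values, tnum=4):
    """Sort up to `tnum` chunks concurrently, then merge the runs pairwise."""
    if not values:
        return []
    tnum = max(1, min(tnum, len(values)))
    size = -(-len(values) // tnum)  # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=tnum) as pool:
        runs = list(pool.map(merge_sort, chunks))
        while len(runs) > 1:
            pairs = [(runs[i], runs[i + 1]) for i in range(0, len(runs) - 1, 2)]
            merged = list(pool.map(lambda pair: merge(*pair), pairs))
            if len(runs) % 2:  # an odd run is carried over to the next round
                merged.append(runs[-1])
            runs = merged
    return runs[0]


print(parallel_merge_sort([5, 3, 8, 1, 9, 2, 7], tnum=3))
```

In this sketch the final merge round still walks all n elements on a single worker, so the 1/TNUM scaling quoted in the abstract should be read as a best case for the chunk-sorting phase.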


46

Zhou, Jin-Liang, Shu-Chuan Chu, Ai-Qing Tian, Yan-Jun Peng and Jeng-Shyang Pan. "Intelligent Neural Network with Parallel Salp Swarm Algorithm for Power Load Prediction". 網際網路技術學刊 23, no. 4 (July 2022): 643–57. http://dx.doi.org/10.53106/160792642022072304001.

Abstract

The neural network runs slowly and lacks accuracy in power load forecasting, so it is optimized using a meta-heuristic algorithm. Salp Swarm Algorithm (SSA) is a novel meta-heuristic algorithm that simulates the salp foraging process. In this paper, a parallel salp swarm algorithm (PSSA) is proposed to improve the performance of SSA. It not only improves local development capabilities, but also accelerates global exploration. Through 23 test functions, PSSA performs better than other algorithms and can effectively explore the whole search space. Finally, PSSA is used to optimize the weights as well as the thresholds of the neural network. Using the optimized neural network to predict the power load in a certain region, the results show that PSSA can better optimize the neural network and increase its prediction accuracy.


47

Nieblas, Carlos I. "Parallel Matching Pursuit Algorithm to Decomposition of Heart Sound Using Gabor Dictionaries". International Journal of Signal Processing Systems 9, no. 4 (December 2021): 22–28. http://dx.doi.org/10.18178/ijsps.9.4.22-28.

Abstract

To improve the performance of the matching pursuit algorithm, we propose a Parallel Matching Pursuit (PMP) algorithm to decompose phonocardiogram sounds of long duration. The main goal is to demonstrate the performance of PMP compared with the traditional iterative matching pursuit algorithm when decomposing normal and pathological heart sounds from phonocardiograms (PCG) using a Gabor dictionary. The architecture is implemented in open-source Java SE 8 using a concurrency library, and multi-threading reduces the computational cost by up to 83% compared with traditional matching pursuit. Because the Java language is widely used to create web components and enterprise solutions, the main idea of this research is to lay the groundwork for implementing PMP on web platforms focused on monitoring heart sounds. This implementation allows exploring and applying iterative algorithms or sparse approximations that require processing long audio signals with low processing time.


48

Li, Jianpo, Geng-Chen Li, Shu-Chuan Chu, Min Gao and Jeng-Shyang Pan. "Modified Parallel Tunicate Swarm Algorithm and Application in 3D WSNs Coverage Optimization". 網際網路技術學刊 23, no. 2 (March 2022): 227–44. http://dx.doi.org/10.53106/160792642022032302004.

Abstract

As Wireless Sensor Networks (WSNs) are applied more and more extensively in today’s society and their importance keeps growing, the node layout of sensors has also begun to attract attention. In practice, the coverage of WSNs in 3D space is particularly important. Therefore, it is worth investigating an efficient way to find the maximum coverage of WSNs. In this paper, a Modified Parallel Tunicate Swarm Algorithm (MPTSA) is proposed based on modified parallelism, which can improve the convergence of the algorithm and its global optimal solution. Next, the proposed MPTSA is implemented and tested on 23 benchmark functions to verify the algorithm’s performance. Finally, a WSN layout scheme based on MPTSA is proposed to improve the coverage of the whole network. Experimental results show that, compared with the traditional PSO (Particle Swarm Optimization), improved PSO (PPSO and APSO), GBMO (Gases Brownian Motion Optimization) and traditional TSA, the MPTSA family of algorithms shows better performance in WSN layout.


49

Yoshizoe, Kazuki, Akihiro Kishimoto, Tomoyuki Kaneko, Haruhiro Yoshimoto and Yutaka Ishikawa. "Scalable Distributed Monte-Carlo Tree Search". Proceedings of the International Symposium on Combinatorial Search 2, no. 1 (August 19, 2021): 180–87. http://dx.doi.org/10.1609/socs.v2i1.18194.

Abstract

Monte-Carlo Tree Search (MCTS) is remarkably successful in two-player games, but parallelizing MCTS has been notoriously difficult to scale well, especially in distributed environments. For a distributed parallel search, transposition-table driven scheduling (TDS) is known to be efficient in several domains. We present a massively parallel MCTS algorithm, that applies the TDS parallelism to the Upper Confidence bound Applied to Trees (UCT) algorithm, which is the most representative MCTS algorithm. To drastically decrease communication overhead, we introduce a reformulation of UCT called Depth-First UCT. The parallel performance of the algorithm is evaluated on clusters using up to 1,200 cores in artificial game-trees. We show that this approach scales well, achieving 740-fold speedups in the best case.
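
For readers unfamiliar with UCT, the selection rule it applies at each tree node can be sketched with the standard UCB1 score. This shows only the sequential selection step, not the paper's Depth-First UCT reformulation or its TDS-based distribution. The dictionary layout of a child, the exploration constant `c` and the example statistics are assumptions.

```python
import math


def uct_score(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: mean reward plus an exploration bonus."""
    if visits == 0:
        return math.inf  # unvisited children are always tried first
    mean = total_reward / visits
    return mean + c * math.sqrt(math.log(parent_visits) / visits)


def select_child(children, parent_visits, c=math.sqrt(2)):
    """Pick the child with the highest UCB1 score."""
    return max(
        children,
        key=lambda ch: uct_score(ch["reward"], ch["visits"], parent_visits, c),
    )


# Hypothetical statistics: "a" has the better mean, "b" has fewer visits.
children = [
    {"name": "a", "reward": 6.0, "visits": 10},
    {"name": "b", "reward": 2.0, "visits": 4},
]
print(select_child(children, parent_visits=14)["name"])
print(select_child(children, parent_visits=14, c=0.0)["name"])
```

With the exploration term enabled the less-visited child wins. With `c=0.0` the rule degenerates to picking the best mean. Every selection reads visit counts that other workers keep updating, which hints at why communication overhead is a central concern when the tree is distributed.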


50

Chen, Chunlei, Li He, Huixiang Zhang, Hao Zheng and Lei Wang. "On the Accuracy and Parallelism of GPGPU-Powered Incremental Clustering Algorithms". Computational Intelligence and Neuroscience 2017 (2017): 1–12. http://dx.doi.org/10.1155/2017/2519782.

Abstract

Incremental clustering algorithms play a vital role in various applications such as massive data analysis and real-time data processing. Typical application scenarios of incremental clustering place high demands on the computing power of the hardware platform. Parallel computing is a common solution to meet this demand. Moreover, the General Purpose Graphic Processing Unit (GPGPU) is a promising parallel computing device. Nevertheless, incremental clustering algorithms face a dilemma between clustering accuracy and parallelism when they are powered by GPGPU. We formally analyzed the cause of this dilemma. First, we formalized concepts relevant to incremental clustering, such as evolving granularity. Second, we formally proved two theorems. The first theorem proves the relation between clustering accuracy and evolving granularity. Additionally, this theorem analyzes the upper and lower bounds of different-to-same mis-affiliation. Fewer occurrences of such mis-affiliation mean higher accuracy. The second theorem reveals the relation between parallelism and evolving granularity. Smaller work-depth means superior parallelism. Through the proofs, we conclude that the accuracy of an incremental clustering algorithm is negatively related to evolving granularity while parallelism is positively related to the granularity. Thus the contradictory relations cause the dilemma. Finally, we validated the relations through a demo algorithm. Experimental results verified the theoretical conclusions.
