Journal articles on the topic "Test parallelization"


Consult the top 50 journal articles for research on the topic "Test parallelization".


1

Rauchwerger, L., and D. A. Padua. "The LRPD test: speculative run-time parallelization of loops with privatization and reduction parallelization." IEEE Transactions on Parallel and Distributed Systems 10, no. 2 (1999): 160–80. http://dx.doi.org/10.1109/71.752782.

2

Son, Changhwan, Wooyeol Park, HyeongGyun Kim, KyungSook Han, and Changwoo Pyo. "Parallelization of CUSUM Test in a CUDA Environment." KIISE Transactions on Computing Practices 21, no. 7 (July 15, 2015): 476–81. http://dx.doi.org/10.5626/ktcp.2015.21.7.476.

3

Kong, X., D. Klappholz, and K. Psarris. "The I test: an improved dependence test for automatic parallelization and vectorization." IEEE Transactions on Parallel and Distributed Systems 2, no. 3 (July 1991): 342–49. http://dx.doi.org/10.1109/71.86109.

4

Debbi, Aimad Eddine, and Haddi Bakhti. "Incremental Banerjee test conditions committing for robust parallelization framework." Turkish Journal of Electrical Engineering & Computer Sciences 26, no. 5 (September 28, 2018): 2595–604. http://dx.doi.org/10.3906/elk-1712-374.

5

Zhang, Lei, Guo Xin Zhang, Yi Liu, and Hai Lin Pan. "Parallelization Development Method and Realization for a FEM Simulation Program." Applied Mechanics and Materials 405-408 (September 2013): 3169–72. http://dx.doi.org/10.4028/www.scientific.net/amm.405-408.3169.

Abstract:
Parallel solution of equation systems is crucial for finite element analysis. The paper discusses the key technology of FEM parallelization and gives a parallelization strategy for the FEM program. It then discusses the data structure of Aztec, which consists of Krylov subspace iterative solvers and preconditioners, and how to call it. Finally, it presents the implementation and testing of the Aztec-based parallel finite element program. Test results show that the method is highly efficient, which makes the SapTis software more powerful, flexible, and adaptable.
6

Moon, Sungdo, Byoungro So, and Mary W. Hall. "Combining Compile-Time and Run-Time Parallelization." Scientific Programming 7, no. 3-4 (1999): 247–60. http://dx.doi.org/10.1155/1999/490628.

Abstract:
This paper demonstrates that significant improvements to automatic parallelization technology require that existing systems be extended in two ways: (1) they must combine high-quality compile-time analysis with low-cost run-time testing; and (2) they must take control flow into account during analysis. We support this claim with the results of an experiment that measures the safety of parallelization at run time for loops left unparallelized by the Stanford SUIF compiler’s automatic parallelization system. We present results of measurements on programs from two benchmark suites – SPECFP95 and NAS sample benchmarks – which identify inherently parallel loops in these programs that are missed by the compiler. We characterize remaining parallelization opportunities, and find that most of the loops require run-time testing, analysis of control flow, or some combination of the two. We present a new compile-time analysis technique that can be used to parallelize most of these remaining loops. This technique is designed to not only improve the results of compile-time parallelization, but also to produce low-cost, directed run-time tests that allow the system to defer binding of parallelization until run-time when safety cannot be proven statically. We call this approach predicated array data-flow analysis. We augment array data-flow analysis, which the compiler uses to identify independent and privatizable arrays, by associating predicates with array data-flow values. Predicated array data-flow analysis allows the compiler to derive “optimistic” data-flow values guarded by predicates; these predicates can be used to derive a run-time test guaranteeing the safety of parallelization.
7

Depolli, Matjaž, Roman Trobec, and Bogdan Filipič. "Asynchronous Master-Slave Parallelization of Differential Evolution for Multi-Objective Optimization." Evolutionary Computation 21, no. 2 (May 2013): 261–91. http://dx.doi.org/10.1162/evco_a_00076.

Abstract:
In this paper, we present AMS-DEMO, an asynchronous master-slave implementation of DEMO, an evolutionary algorithm for multi-objective optimization. AMS-DEMO was designed for solving time-intensive problems efficiently on both homogeneous and heterogeneous parallel computer architectures. The algorithm is used as a test case for the asynchronous master-slave parallelization of multi-objective optimization that has not yet been thoroughly investigated. Selection lag is identified as the key property of the parallelization method, which explains how its behavior depends on the type of computer architecture and the number of processors. It is arrived at analytically and from the empirical results. AMS-DEMO is tested on a benchmark problem and a time-intensive industrial optimization problem, on homogeneous and heterogeneous parallel setups, providing performance results for the algorithm and an insight into the parallelization method. A comparison is also performed between AMS-DEMO and generational master-slave DEMO to demonstrate how the asynchronous parallelization method enhances the algorithm and what benefits it brings compared to the synchronous method.
8

Bukáček, Michal. "On SOMA Parallelization with Android Devices." Journal of Advanced Engineering and Computation 3, no. 3 (September 30, 2019): 441. http://dx.doi.org/10.25073/jaec.201933.247.

Abstract:
Today's world is full of new, small, personal handheld devices: smartphones and tablets. These machines always have less power than desktop computers or even the mainframes that were left behind. Their computational power can be increased by joining them into a group that works on one common task. To check and demonstrate the feasibility of grouping mobile devices in this way, the SOMA algorithm was chosen. Well-known test functions, for example De Jong, Rosenbrock, Rastrigin, and Schwefel, are used and their extremes (minima) are located. The goal is to test how quickly these mobile devices can find the extremes of multidimensional functions. The advantages and disadvantages of this swarm linking are shown. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium provided the original work is properly cited.
9

Samardzic, Aleksandar, Dusan Starcevic, and Milan Tuba. "An implementation of ray tracing algorithm for the multiprocessor machines." Yugoslav Journal of Operations Research 16, no. 1 (2006): 125–35. http://dx.doi.org/10.2298/yjor0601125s.

Abstract:
Ray tracing is an algorithm for generating photo-realistic pictures of 3D scenes, given the scene description, lighting conditions, and viewing parameters as inputs. The algorithm is inherently well suited to parallelization, and the simplest parallelization scheme targets shared-memory parallel machines (multiprocessors). This paper presents two implementations of the algorithm developed by the authors for such machines, one using the POSIX threads API and the other using the OpenMP API. The paper also presents results of rendering some test scenes with these implementations and discusses the efficiency of the parallel version of the algorithm.
10

Martynenko, S. I. "Potentialities of the Robust Multigrid Technique." Computational Methods in Applied Mathematics 10, no. 1 (2010): 87–94. http://dx.doi.org/10.2478/cmam-2010-0004.

Abstract:
The present paper discusses the parallelization of the robust multigrid technique (RMT) and a possible way of applying it to unstructured grids. As opposed to the classical multigrid methods, the RMT is a trivial method of parallelization on coarse grids independent of the smoothing iterations. Estimates of the minimum speed-up and parallelism efficiency are given. An almost perfect load balance is demonstrated in a 3D illustrative test. To overcome the geometric nature of the technique, the RMT is used as a preconditioner in solving PDEs on unstructured grids. The procedure for generating auxiliary structured grids is considered in detail.
11

Badwaik, Jayesh, Matthieu Boileau, David Coulette, Emmanuel Franck, Philippe Helluy, Christian Klingenberg, Laura Mendoza, and Herbert Oberlin. "Task-Based Parallelization of an Implicit Kinetic Scheme." ESAIM: Proceedings and Surveys 63 (2018): 60–77. http://dx.doi.org/10.1051/proc/201863060.

Abstract:
In this paper we present and implement the Palindromic Discontinuous Galerkin (PDG) method in dimensions higher than one. The method has already been presented and tested in [4] in the one-dimensional context. The PDG method is a general implicit high order method for approximating systems of conservation laws. It relies on a kinetic interpretation of the conservation laws containing stiff relaxation terms. The kinetic system is approximated with an asymptotic-preserving high order DG method. We describe the parallel implementation of the method, based on the StarPU runtime library. Then we apply it to preliminary test cases.
12

Hutzschenreuter, Daniel, Frank Härtig, and Markus Schmidt. "An SQP method for Chebyshev and hole-pattern fitting with geometrical elements." Journal of Sensors and Sensor Systems 7, no. 1 (February 6, 2018): 57–67. http://dx.doi.org/10.5194/jsss-7-57-2018.

Abstract:
A customized sequential quadratic program (SQP) method for the solution of minimax-type fitting applications in coordinate metrology is presented. This area increasingly requires highly efficient and accurate algorithms, as modern three-dimensional geometry measurement systems provide large and computationally intensive data sets for fitting calculations. In order to meet these requirements, approaches for optimization and parallelization of the SQP method are provided. The implementation is verified with medium (500 thousand points) and large (up to 13 million points) test data sets. A relative accuracy of the results in the range of 1 × 10^−14 is observed. With four-CPU parallelization, the associated calculation time has been less than 5 s.
13

Zhang, Lei, Guo Xin Zhang, Hai Lin Pan, and Xiao Kai Du. "Parallelization Research of SapTis-Software of Multi-Field Simulation and Nonlinear Analysis of Complex Structures." Applied Mechanics and Materials 444-445 (October 2013): 1192–96. http://dx.doi.org/10.4028/www.scientific.net/amm.444-445.1192.

Abstract:
SapTis is software for simulation and nonlinear analysis of complex structures based on the finite element method (FEM). This paper discusses the key technology of FEM parallelization and then gives the parallelization strategy of SapTis. SapTis is parallelized in three ways: fine-grained parallelism for the PCG solver on the OpenMP platform; the other Krylov subspace solvers and preconditioners on the MPI platform; and the sparse direct solvers on the OpenMP platform. Test results show that all three parallel implementations are highly efficient and that each has its own characteristics. In addition, a GPU solver is still in development, which will make the SapTis software more powerful, flexible, and adaptable.
14

Boman, Björn. "Parallelization: the Fourth Leg of Cultural Globalization Theory." Integrative Psychological and Behavioral Science 55, no. 2 (January 20, 2021): 354–70. http://dx.doi.org/10.1007/s12124-021-09600-4.

Abstract:
Extending Pieterse’s (1996) tripartite cultural globalization theory consisting of homogenization, hybridization and polarization, the current article outlines a set of exemplifications and justifications of a fourth theoretical underpinning labeled parallelization. The theory implies that at a global scale, crucial events that appear paradoxical or contradictory occur at the same time, such as carbon emissions due to growth-fixated global capitalism, while the causes of carbon emissions lead to greater resilience against the consequences of carbon emissions as wealth accumulates. Other examples discussed are large-scale migration flows which lead to increased segregation in host societies while integration of migrants occurs as a parallel process; secularization vis-à-vis the resurgence of religions; and clear indications that the biological component of cognitive abilities decreases due to fertility patterns in many locations around the globe, while IQ test scores have risen as a consequence of various environmental factors.
15

Dinh, Van Quang, and Yves Marechal. "GPU-based parallelization for bubble mesh generation." COMPEL - The international journal for computation and mathematics in electrical and electronic engineering 36, no. 4 (July 3, 2017): 1184–97. http://dx.doi.org/10.1108/compel-11-2016-0476.

Abstract:
Purpose: In FEM computations, the mesh quality improves the accuracy of the approximation solution and reduces the computation time. The dynamic bubble system meshing technique can provide high-quality meshes, but the packing process is time-consuming. This paper aims to improve the running time of the bubble meshing by using the advantages of parallel computing on graphics processing unit (GPU). Design/methodology/approach: This paper is based on the analysis of the processing time on CPU. A massively parallel computing-based CUDA architecture is proposed to improve the bubble displacement and database updating. Constraints linked to hardware considerations are taken into account. Finally, speedup factors are provided on test cases and real scale examples. Findings: The numerical experiments show that the parallel implementation reaches a speedup of 35 compared to the serial implementation. Research limitations/implications: This contribution is so far limited to two-dimensional (2D) geometries, although the extension to three dimensions (3D) is straightforward regarding the meshing technique itself and the GPU implementation. The authors’ works are based on a CUDA environment which is widely used by developers. C/C++ and Java were the programming languages used. Other languages may of course lead to slightly different implementations. Practical implications: This approach makes it possible to use the bubble meshing technique for both initial design and optimization, as excellent meshes can be built in a few seconds. Originality/value: Compared to previous works, this contribution shows that the scalability of the bubble meshing technique needs to solve two key issues: reach a T(N) global cost of the implementation and reach a very fast size map interpolation strategy.
16

Sun, Yu Qiang, A. Ling Yin, Xiao Kang Wang, and Qiao Ying Liu. "Parallel Study of Integrated Test in Software Testing Process." Advanced Materials Research 468-471 (February 2012): 2459–62. http://dx.doi.org/10.4028/www.scientific.net/amr.468-471.2459.

Abstract:
Different test modules in an assembly test system differ significantly, and there are dependences between them; the dependence theory of parallel computing therefore makes it possible to parallelize the integrated test system. In this paper, we analyze the control flows of each program and the relations between modules, derive the dependences between the modules, transform them into a control flow graph, and finally obtain the dependences of the actual test cases. On this basis, and following the implementation plan from earlier parallelization research, we arrange a parallel order for the modules of the integrated test system, which effectively improves the efficiency of software testing and greatly reduces test time.
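
The idea of deriving a parallel order from module dependences can be illustrated with a minimal sketch. It is not the authors' method: it assumes the dependences are already known and form an acyclic graph, and the module names (`db`, `auth`, `report`, `ui`) and the helper `parallel_batches` are hypothetical. Modules in the same batch have no unmet dependences and could be tested concurrently.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def parallel_batches(dependencies):
    """Group modules into batches that can be tested in parallel.

    `dependencies` maps each module to the set of modules it depends on.
    Every module in a batch depends only on modules from earlier batches.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)  # unlocks the modules that were waiting on these
    return batches


# Hypothetical module graph: each key depends on the modules in its value set.
deps = {"ui": {"auth", "db"}, "auth": {"db"}, "report": {"db"}, "db": set()}
print(parallel_batches(deps))  # [['db'], ['auth', 'report'], ['ui']]
```

Each batch must finish before the next one starts, so the number of batches bounds how much test time parallel execution can save.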
17

He, Jiandong, and Juanmian Lei. "A GPU-Accelerated TLSPH Algorithm for 3D Geometrical Nonlinear Structural Analysis." International Journal of Computational Methods 16, no. 07 (July 26, 2019): 1850114. http://dx.doi.org/10.1142/s0219876218501141.

Abstract:
In this paper, we developed a GPU-parallelized Total Lagrangian Formulation of Smoothed Particle Hydrodynamics (TLSPH) algorithm for 3D geometrical nonlinear structure analysis. The code was developed using NVIDIA CUDA C++. Both the TLSPH and GPU parallelization algorithms are described in detail. Compared to the traditional FEM method for structure analysis, the TLSPH method is much easier to implement and parallelize. In addition, as a meshless method, TLSPH does not require meshing the domain. Also, the computational cost of TLSPH is much lower than that of the Weakly Compressible Smoothed Particle Hydrodynamics (WCSPH) method. By introducing GPU acceleration, we have significantly improved the code performance. Two benchmark test cases for 3D geometrical nonlinear structure analysis are carried out. The simulation results are compared with analysis results and the data obtained by Abaqus, which is widely used FEM-based software for structure analysis. In order to show the efficiency of GPU parallelization, a serial code based on the same TLSPH method is also developed as a reference. Results show that GPU parallelization clearly accelerates the code. In summary, the GPU-parallelized TLSPH method shows the potential to become an alternative way to deal with 3D geometrical nonlinear structure analysis.
18

Zhao, Ran, Chao Li, Xiaowei Guo, Sijiang Fan, Yi Wang, and Canqun Yang. "A Block Iteration with Parallelization Method for the Greedy Selection in Radial Basis Functions Based Mesh Deformation." Applied Sciences 9, no. 6 (March 18, 2019): 1141. http://dx.doi.org/10.3390/app9061141.

Abstract:
The greedy algorithm is one of the important point selection methods in radial basis function based mesh deformation. However, for large-scale meshes, conventional greedy selection is time-consuming and results in performance penalties. To accelerate the computational procedure of the point selection, a block iteration with parallelization method is proposed in this paper. With the block iteration method, the computational complexities of three steps in the greedy selection are all reduced from O(n^3) to O(n^2). In addition, the parallelization of two steps in the greedy selection separates boundary points into sub-cores, efficiently accelerating the procedure. Specifically, three typical models, a three-dimensional undulating fish, the ONERA M6 wing, and a three-dimensional super-cavitating hydrofoil, are taken as the test cases to validate the proposed method, and the results show that it improves performance by a factor of 17.41 compared to the conventional method.
19

Ono, Kenji, and Takanori Uchida. "High-Performance Parallel Simulation of Airflow for Complex Terrain Surface." Modelling and Simulation in Engineering 2019 (February 3, 2019): 1–10. http://dx.doi.org/10.1155/2019/5231839.

Abstract:
It is important to develop a reliable and high-throughput simulation method for predicting airflows in the installation planning phase of windmill power plants. This study proposes a two-stage mesh generation approach to reduce the meshing cost and introduces a hybrid parallelization scheme for atmospheric fluid simulations. The meshing approach splits mesh generation into two stages: in the first stage, the meshing parameters that uniquely determine the mesh distribution are extracted, and in the second stage, a mesh system is generated in parallel via an in situ approach using the parameters obtained in the initialization phase of the simulation. The proposed two-stage approach is flexible since an arbitrary number of processes can be selected at run time. An efficient OpenMP-MPI hybrid parallelization scheme using a middleware that provides a framework of parallel codes based on the domain decomposition method is also developed. The preliminary results of the meshing and computing performance show excellent scalability in the strong scaling test.
20

Zhang, Dejian, Bingqing Lin, Jiefeng Wu, and Qiaoying Lin. "GP-SWAT (v1.0): a two-level graph-based parallel simulation tool for the SWAT model." Geoscientific Model Development 14, no. 10 (September 30, 2021): 5915–25. http://dx.doi.org/10.5194/gmd-14-5915-2021.

Abstract:
High-fidelity and large-scale hydrological models are increasingly used to investigate the impacts of human activities and climate change on water availability and quality. However, the detailed representations of real-world systems and processes contained in these models inevitably lead to prohibitively high execution times, ranging from minutes to days. Such models become computationally prohibitive or even infeasible when large iterative model simulations are involved. In this study, we propose a generic two-level (i.e., watershed- and subbasin-level) model parallelization schema to reduce the run time of computationally expensive model applications through a combination of model spatial decomposition and the graph-parallel Pregel algorithm. Taking the Soil and Water Assessment Tool (SWAT) as an example, we implemented a generic tool named GP-SWAT, enabling watershed-level and subbasin-level model parallelization on a Spark computer cluster. We then evaluated GP-SWAT in two sets of experiments to demonstrate the ability of GP-SWAT to accelerate single and iterative model simulations and to run in different environments. In each test set, GP-SWAT was applied for the parallel simulation of four synthetic hydrological models with different input/output (I/O) burdens. The single-model parallelization results showed that GP-SWAT can obtain a 2.3–5.8-times speedup. For multiple simulations with subbasin-level parallelization, GP-SWAT yielded a remarkable speedup of 8.34–27.03 times. In both cases, the speedup ratios increased with an increasing computation burden. The experimental results indicate that GP-SWAT can effectively solve the high-computational-demand problems of the SWAT model. In addition, as a scalable and flexible tool, it can be run in diverse environments, from a commodity computer running the Microsoft Windows operating system to a Spark cluster consisting of a large number of computational nodes. Moreover, it is possible to apply this generic tool to other subbasin-based hydrological models or even acyclic models in other domains to alleviate I/O demands and to optimize model computational performance.
21

Vögler, A., S. Shelyag, M. Schüssler, F. Cattaneo, T. Emonet, and T. Linde. "Simulation of Solar Magnetoconvection." Symposium - International Astronomical Union 210 (2003): 157–67. http://dx.doi.org/10.1017/s0074180900133339.

Abstract:
We present a new 3D MHD code for the simulation of solar magnetoconvection. The code is designed for use on parallel computers and in the choice of methods emphasis has been laid on efficient parallelization. We give a description of the numerical methods and discuss the non-local and non-grey treatment of the radiative transfer. Test calculations underlining the importance of non-grey effects and first results of the simulation of a solar plage region are shown.
22

Tang, Peiyi, and Pen-Chung Yew. "Interprocedural Induction Variable Analysis." International Journal of Foundations of Computer Science 14, no. 03 (June 2003): 405–23. http://dx.doi.org/10.1142/s0129054103001819.

Abstract:
Induction variable analysis is an important part of the symbolic analysis in parallelizing compilers. Induction variables can be formed by for or [Formula: see text] loops within procedures or loops of recursive procedure calls. This paper presents an algorithm to find induction variables in formal parameters of procedures caused by recursive procedure calls. The compile-time knowledge of induction variables in formal parameters is essential to summarize array sections to be used for data dependence test and parallelization.
23

Chafik, Sanaa, and Cherki Daoui. "A Modified Value Iteration Algorithm for Discounted Markov Decision Processes." Journal of Electronic Commerce in Organizations 13, no. 3 (July 2015): 47–57. http://dx.doi.org/10.4018/jeco.2015070104.

Abstract:
As many real applications need a large number of states, the classical methods are intractable for solving large Markov Decision Processes. The decomposition technique, based on the topology of each state in the associated graph, and the parallelization technique are very useful methods for coping with this problem. In this paper, the authors propose a Modified Value Iteration algorithm that adds the parallelism technique. They test their implementation on artificial data using OpenMP, which offers a significant speed-up.
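
To show where parallelism enters value iteration, here is a minimal sketch of the classical synchronous algorithm, not the authors' modified version. Within one sweep, each state's Bellman backup reads only the previous value vector, so the backups are independent and can be mapped over a worker pool. A Python thread pool stands in for OpenMP here purely to show the structure (in standard CPython it will not speed up CPU-bound code), and the two-state MDP, the function names, and the data layout are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def backup(state, values, transitions, rewards, gamma):
    """Bellman optimality backup for one state, using the previous values."""
    return max(
        rewards[state][action]
        + gamma * sum(prob * values[nxt] for nxt, prob in outcomes)
        for action, outcomes in enumerate(transitions[state])
    )


def value_iteration(transitions, rewards, gamma=0.9, tol=1e-8, workers=4):
    """Synchronous value iteration; the per-state backups of a sweep run in a pool.

    transitions[s][a] is a list of (next_state, probability) pairs and
    rewards[s][a] is the immediate reward (hypothetical layout).
    """
    n = len(transitions)
    values = [0.0] * n
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            old = values  # every backup in this sweep reads the same vector
            new = list(pool.map(
                lambda s: backup(s, old, transitions, rewards, gamma), range(n)))
            if max(abs(a - b) for a, b in zip(new, old)) < tol:
                return new
            values = new


# Toy MDP (hypothetical): state 0 can stay (reward 0) or move to state 1
# (reward 1); state 1 can only stay (reward 2).
transitions = [[[(0, 1.0)], [(1, 1.0)]], [[(1, 1.0)]]]
rewards = [[0.0, 1.0], [2.0]]
print([round(v, 6) for v in value_iteration(transitions, rewards, gamma=0.5)])
```

For this toy problem the values approach 3 for state 0 and 4 for state 1, which matches solving the two Bellman equations by hand with a discount factor of 0.5.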
24

Liebrock, Lorie M., and Ken Kennedy. "Automatic Data Distribution for Composite Grid Applications." Scientific Programming 6, no. 1 (1997): 95–113. http://dx.doi.org/10.1155/1997/174748.

Abstract:
Problem topology is the key to efficient parallelization support for partially regular applications. Specifically, problem topology provides the information necessary for automatic data distribution and regular application optimization of a large class of partially regular applications. Problem topology is the connectivity of the problem. This research focuses on composite grid applications and strives to take advantage of their partial regularity in the parallelization and compilation process. Composite grid problems arise in important application areas, e.g., reactor and aerodynamic simulation. Related physical phenomena are inherently parallel and their simulations are computationally intensive. We present algorithms that automatically determine data distributions for composite grid problems. Our algorithm's alignment and distribution specifications may be used as input to a High Performance Fortran program to apply the mapping for execution of the simulation code. These algorithms eliminate the need for user-specified data distribution for this large class of complex topology problems. We test the algorithms using a number of topological descriptions from aerodynamic and water-cooled nuclear reactor simulations. Speedup-bound predictions with and without communication, based on the automatically generated distributions, indicate that significant speedups are possible using these algorithms.
25

Kańka, Tomasz, Tomasz Kryjak, and Marek Gorgon. "FPGA Implementation of Multi-scale Pedestrian Detection in Thermal Images." Image Processing & Communications 21, no. 3 (September 1, 2016): 55–67. http://dx.doi.org/10.1515/ipc-2016-0016.

Abstract:
In this paper an embedded vision system for human silhouette detection in thermal images is presented. A reprogrammable device (FPGA – Field Programmable Gate Array) is used as the computing platform. The detection algorithm is based on a sliding window approach, in which the window content is compared with a probabilistic template. Moreover, detection in four scales is supported. On the test database used, the proposed method obtained 97% accuracy, with an average of one false detection per frame. Thanks to parallelization and pipelining, real-time processing of 720 × 480 @ 50 fps and 1280 × 720 @ 50 fps video streams was achieved. The system has been practically verified in a test setup with a thermal camera.
26

Rosenberg, Duane, Pablo D. Mininni, Raghu Reddy, and Annick Pouquet. "GPU Parallelization of a Hybrid Pseudospectral Geophysical Turbulence Framework Using CUDA." Atmosphere 11, no. 2 (February 8, 2020): 178. http://dx.doi.org/10.3390/atmos11020178.

Abstract:
An existing hybrid MPI-OpenMP scheme is augmented with a CUDA-based fine grain parallelization approach for multidimensional distributed Fourier transforms, in a well-characterized pseudospectral fluid turbulence code. Basics of the hybrid scheme are reviewed, and heuristics provided to show a potential benefit of the CUDA implementation. The method draws heavily on the CUDA runtime library to handle memory management and on the cuFFT library for computing local FFTs. The manner in which the interfaces to these libraries are constructed, and ISO bindings utilized to facilitate platform portability, are discussed. CUDA streams are implemented to overlap data transfer with cuFFT computation. Testing with a baseline solver demonstrated significant aggregate speed-up over the hybrid MPI-OpenMP solver by offloading to GPUs on an NVLink-based test system. While the batch streamed approach provided little benefit with NVLink, we saw a performance gain of 30 % when tuned for the optimal number of streams on a PCIe-based system. It was found that strong GPU scaling is nearly ideal, in all cases. Profiling of the CUDA kernels shows that the transform computation achieves 15% of the attainable peak FlOp-rate based on a roofline model for the system. In addition to speed-up measurements for the fiducial solver, we also considered several other solvers with different numbers of transform operations and found that aggregate speed-ups are nearly constant for all solvers.
27

Jannach, Dietmar, Thomas Schmitz, and Kostyantyn Shchekotykhin. "Parallel Model-Based Diagnosis on Multi-Core Computers." Journal of Artificial Intelligence Research 55 (April 12, 2016): 835–87. http://dx.doi.org/10.1613/jair.5001.

Abstract:
Model-Based Diagnosis (MBD) is a principled and domain-independent way of analyzing why a system under examination is not behaving as expected. Given an abstract description (model) of the system's components and their behavior when functioning normally, MBD techniques rely on observations about the actual system behavior to reason about possible causes when there are discrepancies between the expected and observed behavior. Due to its generality, MBD has been successfully applied in a variety of application domains over the last decades. In many application domains of MBD, testing different hypotheses about the reasons for a failure can be computationally costly, e.g., because complex simulations of the system behavior have to be performed. In this work, we therefore propose different schemes of parallelizing the diagnostic reasoning process in order to better exploit the capabilities of modern multi-core computers. We propose and systematically evaluate parallelization schemes for Reiter's hitting set algorithm for finding all or a few leading minimal diagnoses using two different conflict detection techniques. Furthermore, we perform initial experiments for a basic depth-first search strategy to assess the potential of parallelization when searching for one single diagnosis. Finally, we test the effects of parallelizing "direct encodings" of the diagnosis problem in a constraint solver.
28

Browne, Nigel P. A., and Marcus V. dos Santos. "Adaptive Representations for Improving Evolvability, Parameter Control, and Parallelization of Gene Expression Programming." Applied Computational Intelligence and Soft Computing 2010 (2010): 1–19. http://dx.doi.org/10.1155/2010/409045.

Abstract:
Gene Expression Programming (GEP) is a genetic algorithm that evolves linear chromosomes encoding nonlinear (tree-like) structures. In the original GEP algorithm, the genome size is problem specific and is determined through trial and error. In this work, a method for adaptive control of the genome size is presented. The approach introduces mutation, transposition, and recombination operators that enable a population of heterogeneously structured chromosomes, something the original GEP algorithm does not support. This permits crossbreeding between normally incompatible individuals, speciation within a population, increases the evolvability of the representations, and enhances parallel GEP. To test our approach, an assortment of problems were used, including symbolic regression, classification, and parameter optimization. Our experimental results show that our approach provides a solution for the problem of self-adaptive control of the genome size of GEP's representation.
29

Batycky, R. P. P., M. Förster, M. R. R. Thiele, and K. Stüben. "Parallelization of a Commercial Streamline Simulator and Performance on Practical Models." SPE Reservoir Evaluation & Engineering 13, no. 03 (June 7, 2010): 383–90. http://dx.doi.org/10.2118/118684-pa.

Abstract:
We present the parallelization of a commercial streamline simulator to multicore architectures based on the OpenMP programming model and its performance on various field examples. This work is a continuation of recent work by Gerritsen et al. (2009) in which a research streamline simulator was extended to parallel execution. We identified that the streamline-transport step represents approximately 40-80% of the total run time. It is exactly this step that is straightforward to parallelize owing to the independent solution of each streamline that is at the heart of streamline simulation. Because we are working with an existing large serial code, we used specialty software to quickly and easily identify variables that required particular handling for implementing the parallel extension. Minimal rewrite to existing code was required to extend the streamline-transport step to OpenMP. As part of this work, we also parallelized additional run-time code, including the gravity-line solver and some simple routines required for constructing the pressure matrix. Overall, the run-time fraction of code parallelized ranged from 0.50 to 0.83, depending on the transport physics being considered. We tested our parallel simulator on a variety of large models including SPE 10; Forties, a UK oil/water model; Judy Creek, a Canadian waterflood/water-alternating-gas (WAG) model; and a South American black-oil model. We noted overall speedup factors from 1.8 to 3.3x for eight threads. In terms of real time, this implies that large-scale streamline simulation models as tested here can be simulated in less than 4 hours. We found speedup results to be reasonable when compared with Amdahl's ideal scaling law. Beyond eight threads, we observed minimal speedups because of memory bandwidth limits on our test machine.
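As a rough cross-check of the figures in this abstract, Amdahl's law, S = 1 / ((1 - p) + p / n), gives the ideal speedup for a parallelized run-time fraction p on n threads. The sketch below evaluates it for the reported fractions 0.50 and 0.83 at eight threads. The function name is illustrative and does not come from the paper, and the calculation ignores overheads such as the memory-bandwidth limit the authors mention.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Ideal speedup under Amdahl's law: S = 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Parallelized run-time fractions quoted in the abstract, at eight threads.
for p in (0.50, 0.83):
    print(f"p = {p:.2f}, 8 threads: ideal speedup = {amdahl_speedup(p, 8):.2f}x")
```

Under these assumptions the ideal values are about 1.78x and 3.65x, which can be set beside the reported range of 1.8 to 3.3x.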
30

Koguciuk, Daniel. "Parallel RANSAC for Point Cloud Registration." Foundations of Computing and Decision Sciences 42, no. 3 (September 1, 2017): 203–17. http://dx.doi.org/10.1515/fcds-2017-0010.

Abstract:
In this paper, a project and implementation of the parallel RANSAC algorithm in CUDA architecture for point cloud registration are presented. At the beginning, a serial state-of-the-art method with several heuristic improvements from the literature compared to basic RANSAC is introduced. Subsequently, its algorithmic parallelization and CUDA implementation details are discussed. The comparative test has proven a significant program execution acceleration. The result is the finding of the local coordinate system of the object in the scene in near real-time conditions. The source code is shared on the Internet as a part of the Heuros system.
31

GUSATTO, ÉDER, JOSÉ C. M. MOMBACH, FERNANDO P. CERCATO, and GERSON H. CAVALHEIRO. "AN EFFICIENT PARALLEL ALGORITHM TO EVOLVE SIMULATIONS OF THE CELLULAR POTTS MODEL." Parallel Processing Letters 15, no. 01n02 (March 2005): 199–208. http://dx.doi.org/10.1142/s0129626405002155.

Abstract:
Applications of the cellular Potts model to investigate cellular structures are becoming widely spread in the scientific literature. Despite its realism and generality, the standard Monte Carlo algorithm used to evolve this model in the scientific literature lacks computational efficiency. As an alternative we introduce the Random Walker algorithm that is a modified Monte Carlo procedure of simpler parallelization. We test it in cell sorting and foam coarsening simulations obtaining velocity increase factors of 10 and 3 times, respectively, in relation to the standard algorithm. The results obtained with these simulations are equivalent to those obtained with the standard algorithm.
32

Abramov, Sergey, Vladimir Roganov, Valeriy Osipov, and German Matveev. "Implementation of the LAMMPS package using T-system with an Open Architecture." Informatics and Automation 20, no. 4 (August 11, 2021): 971–99. http://dx.doi.org/10.15622/ia.20.4.8.

Abstract:
Supercomputer applications are usually implemented in the C, C++, and Fortran programming languages using different versions of the Message Passing Interface library. The "T-system" project (OpenTS) studies the issues of automatic dynamic parallelization of programs. In practical terms, the implementation of applications in a mixed (hybrid) style is relevant, when one part of the application is written in the paradigm of automatic dynamic parallelization of programs and does not use any primitives of the MPI library, and the other part of it is written using the Message Passing Interface library. In this case, a library that is part of the T-system, called DMPI (Dynamic Message Passing Interface), is used. It is therefore necessary to evaluate the effectiveness of the MPI implementation available in the T-system. The purpose of this work is to examine the effectiveness of the DMPI implementation in the T-system. In a classic MPI application, 0% of the code is implemented using automatic dynamic parallelization of programs and 100% of the code is implemented in the form of a regular Message Passing Interface program. For comparative analysis, the code is first executed on the standard Message Passing Interface, for which it was originally written, and then it is executed using the DMPI library taken from the developed T-system. By comparing the effectiveness of the approaches, the performance losses and the prospects for using a hybrid programming style are evaluated. As a result of the experimental studies conducted for different types of computational problems, it was possible to confirm that the efficiency losses are negligible. This made it possible to formulate the direction of further work on the T-system and the most promising options for building hybrid applications. Thus, this article presents the results of the comparative tests of the LAMMPS application using OpenMPI and using OpenTS DMPI.
The test results confirm the effectiveness of the DMPI implementation in the OpenTS parallel programming environment.
33

García-Feal, Orlando, José González-Cao, Moncho Gómez-Gesteira, Luis Cea, José Domínguez, and Arno Formella. "An Accelerated Tool for Flood Modelling Based on Iber." Water 10, no. 10 (October 16, 2018): 1459. http://dx.doi.org/10.3390/w10101459.

Abstract:
This paper presents Iber+, a new parallel code based on the numerical model Iber for two-dimensional (2D) flood inundation modelling. The new implementation, which is coded in C++ and takes advantage of the parallelization functionalities both on CPUs (central processing units) and GPUs (graphics processing units), was validated using different benchmark cases and compared, in terms of numerical output and computational efficiency, with other well-known hydraulic software packages. Depending on the complexity of the specific test case, the new parallel implementation can achieve speedups up to two orders of magnitude when compared with the standard version. The speedup is especially remarkable for the GPU parallelization that uses Nvidia CUDA (compute unified device architecture). The efficiency is as good as the one provided by some of the most popular hydraulic models. We also present the application of Iber+ to model an extreme flash flood that took place in the Spanish Pyrenees in October 2012. The new implementation was used to simulate 24 h of real time in roughly eight minutes of computing time, while the standard version needed more than 15 h. This huge improvement in computational efficiency opens up the possibility of using the code for real-time forecasting of flood events in early-warning systems, in order to help decision making under hazardous events that need a fast intervention to deploy countermeasures.
34

García-Feal, Orlando, Luis Cea, José González-Cao, José Manuel Domínguez, and Moncho Gómez-Gesteira. "IberWQ: A GPU Accelerated Tool for 2D Water Quality Modeling in Rivers and Estuaries." Water 12, no. 2 (February 4, 2020): 413. http://dx.doi.org/10.3390/w12020413.

Abstract:
Numerical models are useful tools to analyze water quality by computing the concentration of physical, chemical and biological parameters. The present work introduces a two-dimensional depth-averaged model that computes the most relevant and frequent parameters used to evaluate water quality. High performance computing (HPC) techniques based on graphic processing unit (GPU) parallelization have been applied to improve the efficiency of the package, providing speed-ups of two orders of magnitude in a standard PC. Several test cases were analyzed to show the capabilities and efficiency of the model to evaluate the environmental status of rivers and non-stratified estuaries. IberWQ will be freely available through the package Iber.
35

Uribe-Hurtado, Ana Lorena, Mauricio Orozco-Alzate, and Eduardo-Jose Villegas-Jaramillo. "Leave-one-out evaluation of the nearest feature line and the rectified nearest feature line segment classifiers using Multi-core architectures." Ingeniería y Ciencia 14, no. 27 (June 2018): 75–99. http://dx.doi.org/10.17230/ingciencia.14.27.4.

Abstract:
In this paper we present the parallelization of the leave-one-out test: a reproducible test that is, in general, computationally expensive. Parallelization was implemented on multi-core multi-threaded architectures, using the Flynn Single Instruction Multiple Data taxonomy. This technique was used for the preprocessing and processing stages of two classification algorithms that are oriented to enrich the representation in small sample cases: the nearest feature line (NFL) algorithm and the rectified nearest feature line segment (RNFLS) algorithm. Results show an acceleration of up to 18.17 times with the smallest dataset and 29.91 times with the largest one, using the most costly algorithm (RNFLS) whose complexity is O(n^4). The paper also shows the pseudo-codes of the serial and parallel algorithms using, in the latter case, a notation that describes the way the parallelization was carried out as a function of the threads.
36

Ikhsanudin, Rakhmad, and Edi Winarko. "Parallelization of Hybrid Content Based and Collaborative Filtering Method in Recommendation System with Apache Spark." IJCCS (Indonesian Journal of Computing and Cybernetics Systems) 13, no. 2 (April 30, 2019): 149. http://dx.doi.org/10.22146/ijccs.38596.

Abstract:
Collaborative filtering is a popular method used in recommendation systems. Improvements are made in order to increase the accuracy of the recommendations. One way to do this is to combine it with a content-based method. However, the hybrid method falls short in terms of scalability. The main aim of this research is to solve the problem faced by recommendation systems that use the hybrid collaborative filtering and content-based method by applying parallelization on the Apache Spark platform. Based on the test results, the speedup of the hybrid collaborative filtering and content-based method on an Apache Spark cluster with 2 worker nodes is 1.003, which then increases to 2.913 on a cluster with 4 worker nodes. The speedup increases further to 5.85 on a cluster with 7 worker nodes.
37

Zeng, Yao Yuan, Zheng Hua Wang, and Wen Tao Zhao. "Application of Parallel Computation in Numerical Simulation of Laser Propulsion." Applied Mechanics and Materials 130-134 (October 2011): 3027–31. http://dx.doi.org/10.4028/www.scientific.net/amm.130-134.3027.

Abstract:
For the problem of numerical simulation of laser propulsion in three dimensions with multiple subdomains, a domain decomposition strategy based on a message passing mechanism is applied in this paper to realize parallelization. The cell-centered finite volume scheme is used to solve the Euler equations. A five-step Runge-Kutta scheme of the explicit integral model is used for time advancement. The spatial discretization of the inviscid fluid is estimated by a high-order Godunov-type scheme. We test several different examples on a cluster system, and the results show that the smallest speedup is more than 5.19 when the degree of parallelism is 8. In short, parallel computation is an inevitable choice for accelerating the study of the mechanism of laser propulsion.
38

MÁRQUEZ, A., C. GIL, R. BAÑOS, and J. GÓMEZ. "IMPROVING THE PERFORMANCE OF MULTI-OBJECTIVE EVOLUTIONARY ALGORITHMS USING THE ISLAND PARALLEL MODEL." Parallel Processing Letters 17, no. 02 (June 2007): 127–39. http://dx.doi.org/10.1142/s0129626407002922.

Abstract:
Recently, the research interest in multi-objective optimization has increased remarkably. Most of the proposed methods use a population of solutions that are simultaneously improved trying to approximate them to the Pareto-optimal front. When the population size increases, the quality of the solutions tends to be better, but the runtime is higher. This paper presents how to apply parallel processing to enhance the convergence to the Pareto-optimal front, without increasing the runtime. In particular, we present an island-based parallelization of five multi-objective evolutionary algorithms: NSGAII, SPEA2, PESA, msPESA, and a new hybrid version we propose. Experimental results in some test problems denote that the quality of the solutions tends to improve when the number of islands increases.
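The Pareto-optimal front that these algorithms try to approximate can be illustrated with a minimal non-dominated filter. This is only a sketch, assuming every objective is minimized. It is not the selection procedure of NSGAII, SPEA2, PESA, or msPESA, and the function names and sample points are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates (quadratic-time scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective population (both objectives minimized).
population = [(1, 5), (2, 3), (3, 4), (4, 1), (2, 6)]
print(pareto_front(population))
```

In an island model, each island would evolve its own population, and a filter of this kind could be applied when merging the islands' results.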
39

Elhadef, Mourad, Kaouther Abrougui, Shantanu Das, and Amiya Nayak. "A PARALLEL PROBABILISTIC SYSTEM-LEVEL FAULT DIAGNOSIS APPROACH FOR LARGE MULTIPROCESSOR SYSTEMS." Parallel Processing Letters 16, no. 01 (March 2006): 63–79. http://dx.doi.org/10.1142/s0129626406002472.

Abstract:
In this paper, we present a system-level fault identification algorithm, using a parallel genetic algorithm, for diagnosing faulty nodes in large heterogeneous systems. The algorithm is based on a probabilistic model where each individual node fails with an a priori probability p. The assumptions concerning test outcomes are the same as in the PMC model, that is, fault-free testers always give correct test outcomes and faulty testers are totally unpredictable. The parallel diagnosis algorithm was implemented and simulated on randomly generated large systems. The proposed parallelization is intended to speed up the performance of the evolutionary diagnosis approach, hence reducing the computation time by evolving various sub-populations in parallel. Simulation results are provided showing that the parallel diagnosis did improve the efficiency of the evolutionary diagnosis approach, in that it allowed faster diagnosis of faulty situations, making it a viable alternative to existing techniques of diagnosis. Moreover, the evolutionary approach still provides good results even when extreme non-diagnosable faulty situations are considered.
40

Hückelheim, Jan, Paul Hovland, Michelle Mills Strout, and Jens-Dominik Müller. "Reverse-mode algorithmic differentiation of an OpenMP-parallel compressible flow solver." International Journal of High Performance Computing Applications 33, no. 1 (June 29, 2017): 140–54. http://dx.doi.org/10.1177/1094342017712060.

Abstract:
Reverse-mode algorithmic differentiation (AD) is an established method for obtaining adjoint derivatives of computer simulation applications. In computational fluid dynamics (CFD), adjoint derivatives of a cost function output such as drag or lift with respect to design parameters such as surface coordinates or geometry control points are a key ingredient for shape optimization, uncertainty quantification and flow control. The computational cost of CFD applications and their derivatives makes it essential to use high-performance computing hardware efficiently, including multi- and many-core architectures. Nevertheless, OpenMP is not supported in most AD tools, and previously shown methods achieve poor scalability of the derivative code. We present the AD of an OpenMP-parallelized finite volume compressible flow solver for unstructured meshes. Our approach enables us to reuse the parallelization of the original code in the computation of adjoint derivatives. The method works by identifying code segments that can be differentiated in reverse-mode without changing their memory access pattern. The OpenMP parallelization is integrated into the derivative code during the build process in a way that is robust to modifications of the original code and independent of the OpenMP support of the differentiation tool. We show the scalability of our adjoint CFD solver on test cases ranging from thousands to millions of finite volume mesh cells on CPUs with up to 16 threads as well as on an Intel XeonPhi card with 236 threads. We demonstrate that our approach is more practical to implement for production-sized CFD codes and produces more efficient adjoint derivative code than previously shown AD methods.
41

Momeni, Zahra, and Mohammad Saniee Abadeh. "MapReduce-Based Parallel Genetic Algorithm for CpG-Site Selection in Age Prediction." Genes 10, no. 12 (November 25, 2019): 969. http://dx.doi.org/10.3390/genes10120969.

Abstract:
Genomic biomarkers such as DNA methylation (DNAm) are employed for age prediction. In recent years, several studies have suggested the association between changes in DNAm and its effect on human age. The high dimensional nature of this type of data significantly increases the execution time of modeling algorithms. To mitigate this problem, we propose a two-stage parallel algorithm for selection of age related CpG-sites. The algorithm first attempts to cluster the data into similar age ranges. In the next stage, a parallel genetic algorithm (PGA), based on the MapReduce paradigm (MR-based PGA), is used for selecting age-related features of each individual age range. In the proposed method, the execution of the algorithm for each age range (data parallel), the evaluation of chromosomes (task parallel) and the calculation of the fitness function (data parallel) are performed using a novel parallel framework. In this paper, we consider 16 different healthy DNAm datasets that are related to the human blood tissue and that contain the relevant age information. These datasets are combined into a single unioned set, which is in turn randomly divided into two sets of train and test data with a ratio of 7:3, respectively. We build a Gradient Boosting Regressor (GBR) model on the selected CpG-sites from the train set. To evaluate the model accuracy, we compared our results with state-of-the-art approaches that used these datasets, and observed that our method performs better on the unseen test dataset with a Mean Absolute Deviation (MAD) of 3.62 years, and a correlation (R2) of 95.96% between age and DNAm. In the train data, the MAD and R2 are 1.27 years and 99.27%, respectively. Finally, we evaluate our method in terms of the effect of parallelization in computation time. The algorithm without parallelization requires 4123 min to complete, whereas the parallelized execution on 3 computing machines having 32 processing cores each, only takes a total of 58 min. 
This shows that our proposed algorithm is both efficient and scalable.
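The run times quoted in this abstract can be turned into a speedup and a parallel efficiency. The sketch below uses the standard definitions (speedup = serial time / parallel time, efficiency = speedup / number of cores) and assumes that all 3 x 32 = 96 cores were in use. The helper names are illustrative and are not taken from the paper.

```python
def speedup(serial_time: float, parallel_time: float) -> float:
    """Ratio of serial run time to parallel run time."""
    return serial_time / parallel_time


def efficiency(serial_time: float, parallel_time: float, workers: int) -> float:
    """Speedup divided by the number of workers (1.0 would be ideal scaling)."""
    return speedup(serial_time, parallel_time) / workers


serial_min, parallel_min = 4123, 58   # run times in minutes, from the abstract
cores = 3 * 32                        # 3 machines with 32 cores each (assumed fully used)

print(f"speedup    = {speedup(serial_min, parallel_min):.1f}x")
print(f"efficiency = {efficiency(serial_min, parallel_min, cores):.2f}")
```

Under that assumption the speedup is about 71x, which corresponds to an efficiency of roughly 0.74 on 96 cores.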
42

Praga, A., D. Cariolle, and L. Giraud. "Pangolin v1.0, a conservative 2-D transport model for large scale parallel calculation." Geoscientific Model Development Discussions 7, no. 4 (July 18, 2014): 4527–76. http://dx.doi.org/10.5194/gmdd-7-4527-2014.

Abstract:
To exploit the possibilities of parallel computers, we designed a large-scale bidimensional atmospheric transport model named Pangolin. As the basis for a future chemistry-transport model, a finite-volume approach was chosen both for mass preservation and to ease parallelization. To overcome the pole restriction on time-steps for a regular latitude-longitude grid, Pangolin uses a quasi-area-preserving reduced latitude-longitude grid. The features of the regular grid are exploited to improve parallel performances and a custom domain decomposition algorithm is presented. To assess the validity of the transport scheme, its results are compared with state-of-the-art models on analytical test cases. Finally, parallel performances are shown in terms of strong scaling and confirm the efficient scalability up to a few hundred cores.
43

Zhang, Jianfei. "A PETSc-Based Parallel Implementation of Finite Element Method for Elasticity Problems." Mathematical Problems in Engineering 2015 (2015): 1–7. http://dx.doi.org/10.1155/2015/147286.

Abstract:
Starting a parallel code from scratch is not a good choice for parallel programming finite element analysis of elasticity problems because we cannot make full use of our existing serial code and the programming work is painful for developers. PETSc provides libraries for various numerical methods that can give us more flexibility in migrating our serial application code to a parallel implementation. We present the approach to parallelize the existing finite element code within the PETSc framework. Our approach permits users to easily implement the formation and solution of linear system arising from finite element discretization of elasticity problem. The main PETSc subroutines are given for the main parallelization step and the corresponding code fragments are listed. Cantilever examples are used to validate the code and test the performance.
44

Lemarchand, Laurent, Damien Massé, Pascal Rebreyend, and Johan Håkansson. "Multiobjective Optimization for Multimode Transportation Problems." Advances in Operations Research 2018 (June 7, 2018): 1–13. http://dx.doi.org/10.1155/2018/8720643.

Abstract:
We propose modelling for a facilities localization problem in the context of multimode transportation. The applicative goal is to locate service facilities such as schools or hospitals while optimizing the different transportation modes to these facilities. We formalize the School Problem and solve it first exactly using an adapted ϵ-constraint multiobjective method. Because of the size of the instances considered, we have also explored the use of heuristic methods based on evolutionary multiobjective frameworks, namely, NSGA2 and a modified version of PAES. Those methods are mixed with an original local search technique to provide better results. Numerical comparisons of solutions sets quality are made using the hypervolume metric. Based on the results for test-cases that can be solved exactly, efficient implementation for PAES and NSGA2 allows execution times comparison for large instances. Results show good performances for the heuristic approaches as compared to the exact algorithm for small test-cases. Approximate methods present a scalable behavior on largest problem instances. A master/slave parallelization scheme also helps to reduce execution times significantly for the modified PAES approach.
45

Praga, A., D. Cariolle, and L. Giraud. "Pangolin v1.0, a conservative 2-D advection model towards large-scale parallel calculation." Geoscientific Model Development 8, no. 2 (February 9, 2015): 205–20. http://dx.doi.org/10.5194/gmd-8-205-2015.

Abstract:
To exploit the possibilities of parallel computers, we designed a large-scale bidimensional atmospheric advection model named Pangolin. As the basis for a future chemistry-transport model, a finite-volume approach for advection was chosen to ensure mass preservation and to ease parallelization. To overcome the pole restriction on time steps for a regular latitude–longitude grid, Pangolin uses a quasi-area-preserving reduced latitude–longitude grid. The features of the regular grid are exploited to reduce the memory footprint and enable effective parallel performances. In addition, a custom domain decomposition algorithm is presented. To assess the validity of the advection scheme, its results are compared with state-of-the-art models on algebraic test cases. Finally, parallel performances are shown in terms of strong scaling and confirm the efficient scalability up to a few hundred cores.
46

Soner, Seren, and Can Ozturan. "Generating Multibillion Element Unstructured Meshes on Distributed Memory Parallel Machines." Scientific Programming 2015 (2015): 1–10. http://dx.doi.org/10.1155/2015/437480.

Abstract:
We present a parallel mesh generator called PMSH that is developed as a wrapper code around the open source sequential Netgen mesh generator. Parallelization of the mesh generator is carried out in five stages: (i) generation of a coarse volume mesh; (ii) partitioning of the coarse mesh; (iii) refinement of coarse surface mesh to produce fine surface submeshes; (iv) remeshing of each fine surface submesh to get a final fine mesh; (v) matching of partition boundary vertices followed by global vertex numbering. A new integer based barycentric coordinate method is developed for matching distributed partition boundary vertices. This method does not have precision related problems of floating point coordinate based vertex matching. Test results obtained on an SGI Altix ICE X system with 8192 cores confirm that our approach does indeed enable us to generate multibillion element meshes in a scalable way.
47

Baravykaitė, Milda, and Raimondas Čiegis. "AN IMPLEMENTATION OF A PARALLEL GENERALIZED BRANCH AND BOUND TEMPLATE." Mathematical Modelling and Analysis 12, no. 3 (September 30, 2007): 277–89. http://dx.doi.org/10.3846/1392-6292.2007.12.277-289.

Abstract:
Branch and bound (BnB) is a general algorithm to solve optimization problems. We present a template implementation of the BnB paradigm. A BnB template is implemented using C++ object oriented paradigm. MPI is used for underlying communications. A paradigm of domain decomposition (data parallelization) is used to construct a parallel algorithm. To obtain a better load balancing, the BnB template has the load balancing module that allows the redistribution of search spaces among the processors at run time. A parallel version of user's algorithm is obtained automatically. A new derivative-free global optimization algorithm is proposed for solving nonlinear global optimization problems. It is based on the BnB algorithm and its implementation is done by using the developed BnB algorithm template library. The robustness of the new algorithm is demonstrated by solving a selection of test problems.
48

Vo, Anh Vu, Debra F. Laefer, and Jonathan Byrne. "Optimizing Urban LiDAR Flight Path Planning Using a Genetic Algorithm and a Dual Parallel Computing Framework." Remote Sensing 13, no. 21 (November 4, 2021): 4437. http://dx.doi.org/10.3390/rs13214437.

Abstract:
This paper introduces a genetic algorithm (GA) and a beam tracing algorithm incorporated within a dual parallel computing framework to optimize urban aerial laser scanning (ALS) missions to maximize vertical façade data capture, as needed for many three-dimensional reconstruction and modeling workflows. The optimization employs a low-density point cloud from the site of interest as a spatial representation of the urban scene. The GA is suitable for LiDAR flight path optimization due to its capability of handling open-ended problems that have many solutions. However, GAs require evaluating a very large number of candidates. The use of an initial point cloud allows realistic modeling of the urban environment in the optimization at the cost of high data input volumes. To cope with the computational and data demands, a dual parallel computing framework was devised. The parallel computing framework consists of two layers of parallelization. In the upper layer, multiple evaluators work in parallel and in conjunction with a main multi-threading GA optimizer to perform GA operations and evaluate the flight paths. In the lower layer, to evaluate assigned flight paths, each evaluator distributes its data and computation to multiple executors, which can reside on multiple physical nodes of a distributed-memory computing cluster. In addition to parallelism, the data partitioning on the lower layer allows out-of-core computation. Namely, data partitions are efficiently transferred between disks and memory so that only relevant subsets of data are kept in the main memory. The objective of the proposed method is threefold: (1) search for flight paths that yield the highest numbers of vertical points, (2) create a means to explicitly consider the detailed spatial configuration of urban environments, and (3) assure that the proposed optimization strategy is fast and can scale to large problem sizes. 
Multiple experiments were conducted and demonstrated the success of the proposed method. Converged results were achieved after dozens of generations within two hours. Two flight paths identified by the GA as the most and the least optimal candidates were deployed in real flight missions. The optimal flight path captured 16% more vertical points than the least optimal one, slightly higher than the 13% predicted. Both layers of parallelization were efficient: 13.1/16 for the lower layer and 3.2/4 for the upper layer. The two complementary layers of parallelization allowed flexible and efficient use of distributed computing resources to reduce the runtime. The scalability of the proposed approach was successfully demonstrated up to a data size of 460 million points. The optimization results were realistic and aligned well with the test flight results.
49

Wang, Jian Jun. "Development of Parallel Program of Two-Dimensional Flow-Sediment Mathematical Model for Long River Reach." Advanced Materials Research 779-780 (September 2013): 1562–66. http://dx.doi.org/10.4028/www.scientific.net/amr.779-780.1562.

Abstract:
To meet the technical requirements for systematic regulation of long reaches of the Yangtze River, a parallel program for the computational core of the TK-2DC software was developed, mainly covering the parallelization of the flow convection-diffusion model and the sediment convection-diffusion model. The program is parallelized with MPI. The whole parallel computing part adopts peer mode, and the equation solution method and special points receive dedicated processing to obtain better parallel speedup and parallel efficiency. The parallel program was tested on a channel regulation project in the Jingjiang reach of the middle Yangtze River. Test results show that parallel efficiency with 8 CPUs is up to 80% and parallel speedup can reach 4.98. The computational efficiency of flow-sediment simulation for long river reaches is greatly improved.
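As a rough illustration of what a peer-mode decomposition might look like, the sketch below splits a one-dimensional run of grid cells into contiguous blocks, one per process. It assumes that "peer mode" means every process runs the same code on its own sub-domain; the abstract does not describe the actual partitioning used in TK-2DC. The function `block_range` and the sizes are hypothetical, and no MPI calls are made.

```python
def block_range(n_cells: int, n_ranks: int, rank: int) -> range:
    """Return the contiguous cell indices owned by `rank`.

    Cells are split as evenly as possible; the first `n_cells % n_ranks`
    ranks receive one extra cell each.
    """
    if not 0 <= rank < n_ranks:
        raise ValueError("rank must satisfy 0 <= rank < n_ranks")
    base, extra = divmod(n_cells, n_ranks)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return range(start, stop)


# Hypothetical example: 10 cells along the reach shared by 4 peer processes.
parts = [block_range(10, 4, r) for r in range(4)]
for r, cells in enumerate(parts):
    print(f"rank {r}: cells {cells.start}..{cells.stop - 1}")
```

In an MPI program, each process would call such a helper with its own rank and exchange boundary values with its neighbours; that exchange is omitted here.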
50

Campos, Carmen, and Jose E. Roman. "NEP." ACM Transactions on Mathematical Software 47, no. 3 (June 25, 2021): 1–29. http://dx.doi.org/10.1145/3447544.

Abstract:
SLEPc is a parallel library for the solution of various types of large-scale eigenvalue problems. Over the past few years, we have been developing a module within SLEPc, called NEP, that is intended for solving nonlinear eigenvalue problems. These problems can be defined by means of a matrix-valued function that depends nonlinearly on a single scalar parameter. We do not consider the particular case of polynomial eigenvalue problems (which are implemented in a different module in SLEPc) and focus here on rational eigenvalue problems and other general nonlinear eigenproblems involving square roots or any other nonlinear function. The article discusses how the NEP module has been designed to fit the needs of applications and provides a description of the available solvers, including some implementation details such as parallelization. Several test problems coming from real applications are used to evaluate the performance and reliability of the solvers.
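For orientation, the nonlinear eigenvalue problem described in the abstract above is usually written as follows. The symbols $T$, $\lambda$, $x$, $A_i$, and $f_i$ are standard notation assumed here and do not appear in the abstract; the second line shows the common "split" form, in which constant matrices multiply scalar nonlinear functions.

```latex
% Find a scalar \lambda and a nonzero vector x such that
T(\lambda)\,x = 0, \qquad x \neq 0, \qquad
T : \Omega \subseteq \mathbb{C} \to \mathbb{C}^{n \times n}.

% Split form: constant matrices A_i and scalar nonlinear functions f_i
% (for example rational functions or square roots)
T(\lambda) = \sum_{i=1}^{m} A_i \, f_i(\lambda).
```

The polynomial case, where every $f_i$ is a monomial in $\lambda$, is the one the abstract says is handled by a different SLEPc module.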