Journal articles on the topic 'Parallel programming; Sequential'

Consult the top 50 journal articles for your research on the topic 'Parallel programming; Sequential.'

1

Cheang, Sin Man, Kwong Sak Leung, and Kin Hong Lee. "Genetic Parallel Programming: Design and Implementation." Evolutionary Computation 14, no. 2 (June 2006): 129–56. http://dx.doi.org/10.1162/evco.2006.14.2.129.

Abstract:
This paper presents a novel Genetic Parallel Programming (GPP) paradigm for evolving parallel programs running on a Multi-Arithmetic-Logic-Unit (Multi-ALU) Processor (MAP). The MAP is a Multiple Instruction-streams, Multiple Data-streams (MIMD), general-purpose register machine that can be implemented on modern Very Large-Scale Integrated Circuits (VLSIs) in order to evaluate genetic programs at high speed. For human programmers, writing parallel programs is more difficult than writing sequential programs. However, experimental results show that GPP evolves parallel programs with less computational effort than that of their sequential counterparts. It creates a new approach to evolving a feasible problem solution in parallel program form and then serializing it into a sequential program if required. The effectiveness and efficiency of GPP are investigated using a suite of 14 well-studied benchmark problems. Experimental results show that GPP speeds up evolution substantially.
2

Baravykaite, M., and R. Šablinskas. "THE TEMPLATE PROGRAMMING OF PARALLEL ALGORITHMS." Mathematical Modelling and Analysis 7, no. 1 (June 30, 2002): 11–20. http://dx.doi.org/10.3846/13926292.2002.9637173.

Abstract:
Parallel programming tools and packages are evolving rapidly. However, the complexity of parallel thinking keeps many algorithms out of reach for the end user, and in most cases only expert programmers venture into parallel programming and debugging. In this paper we extend the template-programming ideas of [3] to a class of problems that can be solved using the general master-slave paradigm. The template is suitable for problems of coarse-grain and middle-grain granularity. In fact, it can be applied to any problem P that is decomposable into a set of tasks, P = t0 ∪ t1 ∪ … ∪ tN; the most effective applications are those where all ti are independent. Template programming places some requirements on the sequential version of the user program. The main program must consist of several code blocks: data initialization, computation of one task ti, and processing of the result. The user also has to define the corresponding data structures: the initial data, the data of one task, and the result data. These requirements do not force the existing sequential code to be rewritten, only organized into logical parts. Once these requirements (and naming conventions) are fulfilled, the parallel version of the code is obtained automatically by compiling and linking the code with the Master-Slave Template library. In this paper we introduce the idea of template programming and describe the layered structure of the Master-Slave Template library. We show how the user has to adjust the sequential code to obtain a valid parallel version of the initial program. We also give examples: the prime number search problem and the Mandelbrot set calculation problem.
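To make the master-slave template concrete, here is a minimal sketch in Python (an illustration of the pattern only; the paper's template is a library that the user's sequential code is compiled and linked against). The three user-supplied blocks mirror the requirements above: data initialization, computation of one task ti, and result processing. The prime-counting task is a stand-in for the paper's prime number search example.

```python
# Minimal master-slave sketch: the user supplies data initialization,
# the computation of one task t_i, and result processing; the "template"
# (here a process pool) handles distribution to the slaves.
from multiprocessing import Pool

def init_data():
    # data initialization: decompose the problem into independent tasks t_i
    return [(lo, lo + 1000) for lo in range(2, 100002, 1000)]

def compute_task(task):
    # computation of one task t_i: count primes in [lo, hi)
    lo, hi = task
    return sum(1 for n in range(lo, hi)
               if n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1)))

def process_results(partials):
    # result processing: combine the partial results
    return sum(partials)

if __name__ == "__main__":
    tasks = init_data()
    with Pool() as pool:                          # master distributes tasks
        partials = pool.map(compute_task, tasks)  # slaves compute each t_i
    print(process_results(partials))              # primes below ~100000
```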
3

del Rio Astorga, David, Manuel F. Dolz, Luis Miguel Sánchez, J. Daniel García, Marco Danelutto, and Massimo Torquati. "Finding parallel patterns through static analysis in C++ applications." International Journal of High Performance Computing Applications 32, no. 6 (March 9, 2017): 779–88. http://dx.doi.org/10.1177/1094342017695639.

Abstract:
Since the ‘free lunch’ of processor performance is over, parallelism has become the new trend in hardware and architecture design. However, parallel resources deployed in data centers are underused in many cases, given that sequential programming is still deeply rooted in current software development. To address this problem, new methodologies and techniques for parallel programming have been progressively developed. For instance, parallel frameworks, offering programming patterns, allow expressing concurrency in applications to better exploit parallel hardware. Nevertheless, a large portion of production software, from a broad range of scientific and industrial areas, is still developed sequentially. Considering that these software modules contain thousands, or even millions, of lines of code, an extremely large amount of effort is needed to identify parallel regions. To pave the way in this area, this paper presents Parallel Pattern Analyzer Tool, a software component that aids the discovery and annotation of parallel patterns in source codes. This tool simplifies the transformation of sequential source code to parallel. Specifically, we provide support for identifying Map, Farm, and Pipeline parallel patterns and evaluate the quality of the detection for a set of different C++ applications.
4

GAVA, FRÉDÉRIC. "A MODULAR IMPLEMENTATION OF DATA STRUCTURES IN BULK-SYNCHRONOUS PARALLEL ML." Parallel Processing Letters 18, no. 01 (March 2008): 39–53. http://dx.doi.org/10.1142/s0129626408003211.

Abstract:
A functional data-parallel language called BSML has been designed for programming Bulk-Synchronous Parallel algorithms. Many sequential algorithms do not have parallel counterparts, and many non-computer-science researchers do not want to deal with parallel programming. In sequential programming environments, common data structures are often provided through reusable libraries to simplify the development of applications. A parallel representation of such data structures is thus a solution for writing parallel programs without suffering the disadvantages of all the features of a parallel language. In this paper we describe a modular implementation in BSML of some data structures and show how those data types can address the needs of many potential users of parallel machines who have so far been deterred by the complexity of parallelizing code.
5

LOOGEN, RITA, YOLANDA ORTEGA-MALLÉN, and RICARDO PEÑA-MARÍ. "Parallel functional programming in Eden." Journal of Functional Programming 15, no. 3 (May 2005): 431–75. http://dx.doi.org/10.1017/s0956796805005526.

Abstract:
Eden extends the non-strict functional language Haskell with constructs to control parallel evaluation of processes. Although processes are defined explicitly, communication and synchronisation issues are handled in a way transparent to the programmer. In order to offer effective support for parallel evaluation, Eden's coordination constructs override the inherently sequential demand-driven (lazy) evaluation strategy of its computation language Haskell. Eden is a general-purpose parallel functional language suitable for developing sophisticated skeletons – which simplify parallel programming immensely – as well as for exploiting more irregular parallelism that cannot easily be captured by a predefined skeleton. The paper gives a comprehensive description of Eden, its semantics, its skeleton-based programming methodology – which is applied in three case studies – its implementation and performance. Furthermore it points at many additional results that have been achieved in the context of the Eden project.
6

Li, Xiang, Fei Li, and Chang Hao Wang. "Research of Parallel Processing Technology Based on Multi-Core." Applied Mechanics and Materials 182-183 (June 2012): 639–43. http://dx.doi.org/10.4028/www.scientific.net/amm.182-183.639.

Abstract:
In this paper, five typical multi-core processors are compared in terms of threading, cache, inter-core interconnect, and other characteristics. Two multi-core programming environments and some new programming languages are introduced. Thread-level speculation (TLS) and transactional memory (TM) are introduced to address the parallelization of sequential programs. TLS automatically analyzes a sequential program, speculates on the parts that can be executed in parallel, and then automatically generates parallel code. TM systems provide an efficient and easy mechanism for parallel programming on multi-core processors. Typical TM systems such as TCC, UTM, LogTM, LogTM-SE, and SigTM are introduced. Combining TLS and TM can further improve sequential programs running on multi-core processors. Typical TM systems extended to support TLS, such as TCC, TTM, PTT, and STMlite, are introduced.
7

Jézéquel, J. M., F. Bergheul, and F. André. "Programming massively parallel architectures with sequential object oriented languages." Future Generation Computer Systems 10, no. 1 (April 1994): 59–70. http://dx.doi.org/10.1016/0167-739x(94)90051-5.

8

Chen, Zhong. "Parallel Iterative Methods for Nonlinear Programming Problems." Advanced Materials Research 159 (December 2010): 105–10. http://dx.doi.org/10.4028/www.scientific.net/amr.159.105.

Abstract:
In this paper, we present two parallel multiplicative algorithms for convex programming. If the objective function is differentiable and convex on the positive orthant of ℝⁿ, has compact level sets, and has a locally Lipschitz continuous gradient, we prove that these algorithms converge to a solution of the minimization problem. The proofs essentially use results on sequential methods shown by Eggermont [1].
9

Boekkooi-Timminga, Ellen. "The Construction of Parallel Tests From IRT-Based Item Banks." Journal of Educational Statistics 15, no. 2 (June 1990): 129–45. http://dx.doi.org/10.3102/10769986015002129.

Abstract:
The construction of parallel tests from IRT-based item banks is discussed. Tests are considered to be parallel whenever their information functions are identical. Simultaneous and sequential parallel test construction methods based on the use of 0–1 programming are examined for the Rasch and 3-parameter logistic model. Sequential methods construct the tests one after another; simultaneous methods construct them all at the same time. A heuristic procedure is used for solving the 0–1 programming problems. Satisfactory results are obtained, both in terms of the CPU-time needed and differences between the information functions of the parallel tests selected.
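As a sketch of how such a 0-1 programming model can look (a simplified formulation for illustration, not the exact model used in the paper): let x_ij ∈ {0,1} indicate that item i is assigned to test j, let I_i(θ) be the information function of item i, and let T(θ_k) be the target information at ability points θ_k. Building parallel tests then amounts to keeping every test's information function close to the target:

```latex
\begin{aligned}
\min_{x,\,y}\ \ & y \\
\text{s.t. }\ & \Bigl|\textstyle\sum_i I_i(\theta_k)\,x_{ij} - T(\theta_k)\Bigr| \le y
  \quad \text{for all tests } j \text{ and points } \theta_k,\\
& \textstyle\sum_j x_{ij} \le 1 \quad \text{for all items } i \text{ (no item reused across tests)},\\
& x_{ij} \in \{0,1\},\ \ y \ge 0.
\end{aligned}
```

Simultaneous construction solves one such model with all tests' variables at once; sequential construction fixes the selected items of each finished test before solving for the next.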
10

Qawasmeh, Ahmad, Salah Taamneh, Ashraf H. Aljammal, Nabhan Hamadneh, Mustafa Banikhalaf, and Mohammad Kharabsheh. "Parallelism exploration in sequential algorithms via animation tool." Multiagent and Grid Systems 17, no. 2 (August 23, 2021): 145–58. http://dx.doi.org/10.3233/mgs-210347.

Abstract:
Different high performance techniques, such as profiling, tracing, and instrumentation, have been used to tune and enhance the performance of parallel applications. However, these techniques do not show how to explore the potential of parallelism in a given application. Animating and visualizing the execution process of a sequential algorithm provide a thorough understanding of its usage and functionality. In this work, an interactive web-based educational animation tool was developed to assist users in analyzing sequential algorithms to detect parallel regions regardless of the used parallel programming model. The tool simplifies algorithms’ learning, and helps students to analyze programs efficiently. Our statistical t-test study on a sample of students showed a significant improvement in their perception of the mechanism and parallelism of applications and an increase in their willingness to learn algorithms and parallel programming.
11

PELÁEZ, IGNACIO, FRANCISCO ALMEIDA, and DANIEL GONZÁLEZ. "HIGH LEVEL PARALLEL SKELETONS FOR DYNAMIC PROGRAMMING." Parallel Processing Letters 18, no. 01 (March 2008): 133–47. http://dx.doi.org/10.1142/s0129626408003272.

Abstract:
Dynamic Programming is an important problem-solving technique used for solving a wide variety of optimization problems. Dynamic Programming programs are commonly designed as individual applications and software tools are usually tailored to specific classes of recurrences and methodologies. That contrasts with some other algorithmic techniques where a single generic program may solve all the instances. We have developed a general skeleton tool providing support for a wide range of dynamic programming methodologies on different parallel architectures. Genericity, flexibility and efficiency are basic issues of the design strategy. Parallelism is supplied to the user in a transparent manner through a common sequential interface. A set of test problems representative of different classes of Dynamic Programming formulations has been used to validate our skeleton on an IBM-SP.
12

Ciobanu, Gabriel. "A Programming Perspective of the Membrane Systems." International Journal of Computers Communications & Control 1, no. 3 (July 1, 2006): 13. http://dx.doi.org/10.15837/ijccc.2006.3.2291.

Abstract:
We present an operational semantics of membrane systems, using an appropriate notion of configurations and sets of inference rules corresponding to the three stages of an evolution step in membrane systems: the maximal parallel rewriting step, parallel communication of objects through membranes, and parallel membrane dissolving. We define various arithmetical operations over multisets in the framework of membrane systems, indicating their complexity and presenting the membrane systems which implement the arithmetic operations. Finally, we discuss and compare various sequential and parallel software simulators of membrane systems, emphasizing their specific features.
13

Haveraaen, Magne. "Machine and Collection Abstractions for User-Implemented Data-Parallel Programming." Scientific Programming 8, no. 4 (2000): 231–46. http://dx.doi.org/10.1155/2000/485607.

Abstract:
Data parallelism has appeared as a fruitful approach to the parallelisation of compute-intensive programs. Data parallelism has the advantage of mimicking the sequential (and deterministic) structure of programs as opposed to task parallelism, where the explicit interaction of processes has to be programmed. In data parallelism data structures, typically collection classes in the form of large arrays, are distributed on the processors of the target parallel machine. Trying to extract distribution aspects from conventional code often runs into problems with a lack of uniformity in the use of the data structures and in the expression of data dependency patterns within the code. Here we propose a framework with two conceptual classes, Machine and Collection. The Machine class abstracts hardware communication and distribution properties. This gives a programmer high-level access to the important parts of the low-level architecture. The Machine class may readily be used in the implementation of a Collection class, giving the programmer full control of the parallel distribution of data, as well as allowing normal sequential implementation of this class. Any program using such a collection class will be parallelisable, without requiring any modification, by choosing between sequential and parallel versions at link time. Experiments with a commercial application, built using the Sophus library which uses this approach to parallelisation, show good parallel speed-ups, without any adaptation of the application program being needed.
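A rough Python sketch of the two-class idea (illustrative only; the paper's setting is the C++-based Sophus library, and the class and method names here are invented): the Collection delegates all distribution to a Machine object, so a sequential and a parallel Machine are interchangeable without touching client code.

```python
# Sketch of the Machine/Collection split: Collection.map delegates data
# placement and execution to a Machine, so swapping the sequential machine
# for a parallel one requires no change to the program using the Collection.
from concurrent.futures import ThreadPoolExecutor

class SequentialMachine:
    def map_chunks(self, fn, chunks):
        return [[fn(x) for x in chunk] for chunk in chunks]

class ThreadedMachine:
    def map_chunks(self, fn, chunks):
        with ThreadPoolExecutor() as ex:
            return list(ex.map(lambda chunk: [fn(x) for x in chunk], chunks))

class Collection:
    def __init__(self, data, machine, nchunks=4):
        data = list(data)
        k = max(1, len(data) // nchunks)
        self.chunks = [data[i:i + k] for i in range(0, len(data), k)]
        self.machine = machine

    def map(self, fn):
        results = self.machine.map_chunks(fn, self.chunks)
        return [x for chunk in results for x in chunk]

# the same client code runs sequentially or in parallel:
print(Collection(range(10), SequentialMachine()).map(lambda x: x * x))
print(Collection(range(10), ThreadedMachine()).map(lambda x: x * x))
```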
14

MASSINGILL, BERNA L. "EXPERIMENTS WITH PROGRAM PARALLELIZATION USING ARCHETYPES AND STEPWISE REFINEMENT." Parallel Processing Letters 09, no. 04 (December 1999): 487–98. http://dx.doi.org/10.1142/s0129626499000451.

Abstract:
Parallel programming continues to be difficult and error-prone, whether starting from specifications or from an existing sequential program. This paper presents (1) a methodology for parallelizing sequential applications and (2) experiments in applying the methodology. The methodology is based on the use of stepwise refinement together with what we call parallel programming archetypes (briefly, abstractions that capture common features of classes of programs), in which most of the work of parallelization is done using familiar sequential tools and techniques, and those parts of the process that cannot be addressed with sequential tools and techniques are addressed with formally-justified transformations. The experiments consist of applying the methodology to sequential application programs, and they provide evidence that the methodology produces correct and reasonably efficient programs at reasonable human-effort cost. Of particular interest is the fact that the aspect of the methodology that is most completely formally justified is the aspect that in practice was the most trouble-free.
15

Siow, C. L., Jaswar, and Efi Afrizal. "Computational Fluid Dynamic Using Parallel Loop of Multi-Cores Processor." Applied Mechanics and Materials 493 (January 2014): 80–85. http://dx.doi.org/10.4028/www.scientific.net/amm.493.80.

Abstract:
Computational Fluid Dynamics (CFD) software is often used to study fluid flow and the motion of structures in fluids. CFD normally requires large arrays and a large amount of computer memory, which leads to long execution times. However, innovations in computer hardware such as multi-core processors provide an alternative way to improve this performance. This paper discusses loop parallelization on multi-core processors for optimizing sequential looping CFD code. The loop parallelization was achieved by applying multi-tasking or multi-threading code to the original CFD code, which was developed by one of the authors based on the Reynolds-Averaged Navier-Stokes (RANS) method. The CFD program was developed in the Microsoft Visual Basic (VB) programming language. In the early stage, the whole CFD code was constructed as a sequential flow before being modified to a parallel flow using VB's multi-threading library. For the comparison, fluid flow around the hull of a round-shaped FPSO was selected to benchmark the performance of both versions of the code. Executed results of this self-developed code, such as the pressure distribution around the hull, are also presented in this paper.
16

SHEERAN, MARY. "Functional and dynamic programming in the design of parallel prefix networks." Journal of Functional Programming 21, no. 1 (December 6, 2010): 59–114. http://dx.doi.org/10.1017/s0956796810000304.

Abstract:
A parallel prefix network of width n takes n inputs, a1, a2, …, an, and computes each yi = a1 ○ a2 ○ ⋯ ○ ai for 1 ≤ i ≤ n, for an associative operator ○. This is one of the fundamental problems in computer science, because it gives insight into how parallel computation can be used to solve an apparently sequential problem. As parallel programming becomes the dominant programming paradigm, parallel prefix or scan is proving to be a very important building block of parallel algorithms and applications. There are many different parallel prefix networks, with different properties such as number of operators, depth and allowed fanout from the operators. In this paper, ideas from functional programming are combined with search to enable a deep exploration of parallel prefix network design. Networks that improve on the best known previous results are generated. It is argued that precise modelling in a functional programming language, together with simple visualization of the networks, gives a new, more experimental, approach to parallel prefix network design, improving on the manual techniques typically employed in the literature. The programming idiom that marries search with higher order functions may well have wider application than the network generation described here.
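For reference, the specification above is easy to restate in code. The following Python sketch is a Sklansky-style divide-and-conquer scan (a textbook construction, not one of the networks generated in the paper): it has O(log n) depth, and the combine step is exactly where the fanout the authors discuss appears, since the last left-half prefix feeds every right-half output.

```python
# Sklansky-style parallel prefix: scan both halves recursively, then
# combine the last left-half prefix into every right-half prefix.
# Depth is O(log n); the combine step is where the fanout appears.
import operator

def scan(xs, op):
    if len(xs) == 1:
        return xs[:]
    mid = len(xs) // 2
    left, right = scan(xs[:mid], op), scan(xs[mid:], op)
    carry = left[-1]
    return left + [op(carry, y) for y in right]

print(scan([1, 2, 3, 4, 5], operator.add))  # [1, 3, 6, 10, 15]
```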
17

Sitsylitsyn, Yuriy. "Methods and tools for teaching parallel and distributed computing in universities: a systematic review of the literature." SHS Web of Conferences 75 (2020): 04017. http://dx.doi.org/10.1051/shsconf/20207504017.

Abstract:
As computer hardware becomes more and more parallel, there is a need for software engineers who are experienced in developing parallel programs, not merely in "parallelizing" sequential designs. Teaching students parallelism in introductory computer science courses is a very important step towards building the competencies of future software engineers. We conducted a review of "teaching parallel and distributed computing" and "parallel programming" publications in the Scopus database, published in English between 2008 and 2019. After quality assessment, 26 articles were included in the analysis. The review shows that the main vehicle for teaching parallel and distributed computing is a lab course using the C++ programming language and the MPI library.
18

Wang, Ping, and Xiaoping Wu. "OpenMP Programming for a Global Inverse Model." Scientific Programming 10, no. 3 (2002): 253–61. http://dx.doi.org/10.1155/2002/620712.

Abstract:
The objective of our investigation is to establish robust inverse algorithms to convert GRACE gravity and ICESat altimetry mission data into global current and past surface mass variations. To assess separation of global sources of change and to evaluate spatio-temporal resolution and accuracy statistically from full posterior covariance matrices, a high performance version of a global simultaneous grid inverse algorithm is essential. One means to accomplish this is to implement a general, well-optimized, parallel global model on massively parallel supercomputers. In our present work, an efficient parallel version of a global inverse program has been implemented on the Origin 2000 using the OpenMP programming model. In this paper, porting a sequential global code to a shared-memory computing system is discussed; several efficient strategies to optimize the code are reported; well-optimized scientific libraries are used; detailed parallel implementation of the global model is reported; performance data of the code are analyzed. Scaling performance on a shared-memory system is also discussed. The parallel version software gives good speedup and dramatically reduces total data processing time.
19

Venter, Gerhard, and Garret N. Vanderplaats. "Using a Filter-Based Sequential Quadratic Programming Algorithm in a Parallel Environment." Journal of Aerospace Computing, Information, and Communication 6, no. 12 (December 2009): 635–48. http://dx.doi.org/10.2514/1.43224.

20

BERNARD, THOMAS A. M., CLEMENS GRELCK, and CHRIS R. JESSHOPE. "ON THE COMPILATION OF A LANGUAGE FOR GENERAL CONCURRENT TARGET ARCHITECTURES." Parallel Processing Letters 20, no. 01 (March 2010): 51–69. http://dx.doi.org/10.1142/s0129626410000053.

Abstract:
The challenge of programming many-core architectures efficiently and effectively requires models and methods to co-design chip architectures and their software tool chain, using an approach that is both vertical and general. In this paper, we present compilation schemes for a general model of concurrency captured in a parallel language designed for system-level programming and as a target for higher level compilers. We also expose the challenges of integrating these transformation rules into a sequential-oriented compiler. Moreover, we discuss resource mapping inherent to those challenges. Our aim has been to reuse as much of the existing sequential compiler technology as possible in order to harness decades of prior research in compiling sequential languages.
21

Baghdad Science Journal. "Parallel Computing for Sorting Algorithms." Baghdad Science Journal 11, no. 2 (June 1, 2014): 292–302. http://dx.doi.org/10.21123/bsj.11.2.292-302.

Abstract:
The expanding use of multi-processor supercomputers has made a significant impact on the speed and size of many problems. The adoption of the standard Message Passing Interface (MPI) protocol has enabled programmers to write portable and efficient codes across a wide variety of parallel architectures. Sorting is one of the most common operations performed by a computer. Because sorted data are easier to manipulate than randomly ordered data, many algorithms require sorted data. Sorting is of additional importance to parallel computing because of its close relation to the task of routing data among processes, which is an essential part of many parallel algorithms. In this paper, sequential sorting algorithms, parallel implementations of many sorting methods using the MPICH.NT.1.2.3 library under the C++ programming language, and comparisons between the parallel and sequential implementations are presented. These methods are then applied in the image processing field: a median filter has been built based on the presented algorithms. As the parallel platform was unavailable, the time is computed in terms of the number of computation steps and communication steps.
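The overall shape of many of the parallel sorts benchmarked here is "sort the pieces concurrently, then merge". A generic Python sketch of that structure (an illustration only; the paper's implementations use the MPICH.NT.1.2.3 MPI library under C++ and distribute the chunks across processes):

```python
# Generic "sort the pieces, then merge" sketch: chunks are sorted
# concurrently in separate processes, and the sorted runs are merged.
from concurrent.futures import ProcessPoolExecutor
from heapq import merge

def parallel_sort(data, nworkers=4):
    k = max(1, len(data) // nworkers)
    chunks = [data[i:i + k] for i in range(0, len(data), k)]
    with ProcessPoolExecutor(nworkers) as ex:
        runs = list(ex.map(sorted, chunks))   # concurrent chunk sorts
    return list(merge(*runs))                 # k-way merge of sorted runs

if __name__ == "__main__":
    import random
    xs = [random.randint(0, 999) for _ in range(10000)]
    assert parallel_sort(xs) == sorted(xs)
```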
22

Susungi, Adilla, and Claude Tadonki. "Intermediate Representations for Explicitly Parallel Programs." ACM Computing Surveys 54, no. 5 (June 2021): 1–24. http://dx.doi.org/10.1145/3452299.

Abstract:
While compilers generally support parallel programming languages and APIs, their internal program representations are mostly designed from the sequential programs standpoint (exceptions include source-to-source parallel compilers, for instance). This makes the integration of compilation techniques dedicated to parallel programs more challenging. In addition, parallelism has various levels and different targets, each of them with specific characteristics and constraints. With the advent of multi-core processors and general purpose accelerators, parallel computing is now a common and pervasive consideration. Thus, software support to parallel programming activities is essential to make this technical transition more realistic and beneficial. The case of compilers is fundamental as they deal with (parallel) programs at a structural level, thus the need for intermediate representations. This article surveys and discusses attempts to provide intermediate representations for the proper support of explicitly parallel programs. We highlight the gap between available contributions and their concrete implementation in compilers and then exhibit possible future research directions.
23

Larrabee, Allan R. "The P4 Parallel Programming System, the Linda Environment, and Some Experiences with Parallel Computation." Scientific Programming 2, no. 3 (1993): 23–35. http://dx.doi.org/10.1155/1993/817634.

Abstract:
The first digital computers consisted of a single processor acting on a single stream of data. In this so-called "von Neumann" architecture, computation speed is limited mainly by the time required to transfer data between the processor and memory. This limiting factor has been referred to as the "von Neumann bottleneck". The concern that the miniaturization of silicon-based integrated circuits will soon reach theoretical limits of size and gate times has led to increased interest in parallel architectures and also spurred research into alternatives to silicon-based implementations of processors. Meanwhile, sequential processors continue to be produced that have increased clock rates and an increase in memory locally available to a processor, and an increase in the rate at which data can be transferred to and from memories, networks, and remote storage. The efficiency of compilers and operating systems is also improving over time. Although such characteristics limit maximum performance, a large improvement in the speed of scientific computations can often be achieved by utilizing more efficient algorithms, particularly those that support parallel computation. This work discusses experiences with two tools for large grain (or "macro task") parallelism.
24

Seinstra, F. J., and D. Koelma. "User transparency: a fully sequential programming model for efficient data parallel image processing." Concurrency and Computation: Practice and Experience 16, no. 6 (April 2, 2004): 611–44. http://dx.doi.org/10.1002/cpe.765.

25

Bała, Piotr, Terry Clark, and L. Ridgway Scott. "Application of Pfortran and Co-Array Fortran in the Parallelization of the GROMOS96 Molecular Dynamics Module." Scientific Programming 9, no. 1 (2001): 61–68. http://dx.doi.org/10.1155/2001/829792.

Abstract:
After at least a decade of parallel tool development, parallelization of scientific applications remains a significant undertaking. Typically parallelization is a specialized activity supported only partially by the programming tool set, with the programmer involved with parallel issues in addition to sequential ones. The details of concern range from algorithm design down to low-level data movement details. The aim of parallel programming tools is to automate the latter without sacrificing performance and portability, allowing the programmer to focus on algorithm specification and development. We present our use of two similar parallelization tools, Pfortran and Cray's Co-Array Fortran, in the parallelization of the GROMOS96 molecular dynamics module. Our parallelization started from the GROMOS96 distribution's shared-memory implementation of the replicated algorithm, but used little of that existing parallel structure. Consequently, our parallelization was close to starting with the sequential version. We found the intuitive extensions to Pfortran and Co-Array Fortran helpful in the rapid parallelization of the project. We present performance figures for both the Pfortran and Co-Array Fortran parallelizations showing linear speedup within the range expected by these parallelization methods.
26

Legalov, Alexander I., Ivan V. Matkovskii, Mariya S. Ushakova, and Darya S. Romanova. "Dynamically Changing Parallelism with the Asynchronous Sequential Data Flows." Modeling and Analysis of Information Systems 27, no. 2 (June 24, 2020): 164–79. http://dx.doi.org/10.18255/1818-1015-2020-2-164-179.

Abstract:
A statically typed version of the data-driven functional parallel computing model is proposed. It enables a representation of dynamically changing parallelism by means of asynchronous serial data flows. We consider the features of the syntax and semantics of the statically typed data-driven functional parallel programming language Smile, which supports asynchronous sequential flows. Our main idea is to apply Hoare's concept of communicating sequential processes to computation control based on data readiness. It is assumed that on data readiness a control signal is emitted to inform the processes about the occurrence of certain events. The special feature of our approach is that the model is extended with special asynchronous containers that can generate events on their partial filling. These containers are the stream and the swarm, each of which has its own specifics. A stream is used to process data of identical type. The data come sequentially and asynchronously at arbitrary moments in time. The number of incoming data elements is initially unknown, so processing completes on the end-of-stream signal. A swarm is used to contain independent data of the same type and may be used for performing massively parallel operations. Unlike a stream, the swarm's size is fixed and known in advance. General principles of operations on asynchronous sequential flows with an arbitrary order of data arrival are described. The use of streams and swarms in various situations is considered. We propose language constructions for operating on swarms and streams and describe the specifics of their application. We provide sample functions to illustrate the different approaches to describing parallelism: recursive processing of asynchronous flows, processing of flows in an arbitrary or predefined order of operations, direct access and access by reference to the elements of streams and swarms, and pipelining of calculations. We give a preliminary assessment of the parallelism, which depends on the ratio of the rates of data arrival and processing. The proposed methods can be used in the development of future languages and toolkits for architecture-independent parallel programming.
27

Tejedor, Enric, Yolanda Becerra, Guillem Alomar, Anna Queralt, Rosa M. Badia, Jordi Torres, Toni Cortes, and Jesús Labarta. "PyCOMPSs: Parallel computational workflows in Python." International Journal of High Performance Computing Applications 31, no. 1 (July 27, 2016): 66–82. http://dx.doi.org/10.1177/1094342015594678.

Abstract:
The use of the Python programming language for scientific computing has been gaining momentum in the last years. The fact that it is compact and readable and its complete set of scientific libraries are two important characteristics that favour its adoption. Nevertheless, Python still lacks a solution for easily parallelizing generic scripts on distributed infrastructures, since the current alternatives mostly require the use of APIs for message passing or are restricted to embarrassingly parallel computations. In that sense, this paper presents PyCOMPSs, a framework that facilitates the development of parallel computational workflows in Python. In this approach, the user programs her script in a sequential fashion and decorates the functions to be run as asynchronous parallel tasks. A runtime system is in charge of exploiting the inherent concurrency of the script, detecting the data dependencies between tasks and spawning them to the available resources. Furthermore, we show how this programming model can be built on top of a Big Data storage architecture, where the data stored in the backend is abstracted and accessed from the application in the form of persistent objects.
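The programming model described, where calling a decorated function spawns an asynchronous task and the runtime resolves dependencies, can be emulated in miniature with the standard library. In the sketch below, the names `task` and `wait_on` are illustrative stand-ins, not PyCOMPSs' actual API, and no dependency detection is performed:

```python
# Miniature emulation of the decorate-and-run task model: calling a
# decorated function submits it asynchronously and returns a future;
# a synchronisation point collects the results.
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor()

def task(fn):                    # illustrative decorator, not PyCOMPSs' API
    def submit(*args, **kwargs):
        return _pool.submit(fn, *args, **kwargs)
    return submit

def wait_on(futures):            # illustrative synchronisation point
    return [f.result() for f in futures]

@task
def increment(x):
    return x + 1

futures = [increment(i) for i in range(8)]  # tasks spawn, not run inline
print(wait_on(futures))                     # [1, 2, 3, 4, 5, 6, 7, 8]
```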
28

Xue, Xiao Guang, Guo Xi Li, Jing Zhong Gong, and Bao Zhong Wu. "Parallel Control for Structural Dynamic Topological Optimization Problems Based on MMA." Advanced Materials Research 538-541 (June 2012): 2872–77. http://dx.doi.org/10.4028/www.scientific.net/amr.538-541.2872.

Abstract:
This paper presents a parallel control method for the structural dynamic topology optimization problem. The necessity and feasibility of parallel control during the optimization iteration process are discussed. A parallel control algorithm based on the traditional sequential programming method is constructed by introducing MMA. The proposed method mitigates the drawbacks of low computational efficiency and local convergence in FEA, as illustrated in the results discussion section at the end of the paper.
29

Brown, Christopher, Vladimir Janjic, M. Goli, and J. McCall. "Programming Heterogeneous Parallel Machines Using Refactoring and Monte–Carlo Tree Search." International Journal of Parallel Programming 48, no. 4 (June 10, 2020): 583–602. http://dx.doi.org/10.1007/s10766-020-00665-z.

Abstract:
This paper presents a new technique for introducing and tuning parallelism for heterogeneous shared-memory systems (comprising a mixture of CPUs and GPUs), using a combination of algorithmic skeletons (such as farms and pipelines), Monte–Carlo tree search for deriving mappings of tasks to available hardware resources, and refactoring tool support for applying the patterns and mappings in an easy and effective way. Using our approach, we demonstrate easily obtainable, significant and scalable speedups on a number of case studies, with speedups of up to 41x over the sequential code on a 24-core machine with one GPU. We also demonstrate that the speedups obtained by mappings derived by the MCTS algorithm are within 5–15% of the best-obtained manual parallelisation.
30

Iakymchuk, Roman, Maria Barreda Vayá, Stef Graillat, José I. Aliaga, and Enrique S. Quintana-Ortí. "Reproducibility of parallel preconditioned conjugate gradient in hybrid programming environments." International Journal of High Performance Computing Applications 34, no. 5 (June 17, 2020): 502–18. http://dx.doi.org/10.1177/1094342020932650.

Abstract:
The Preconditioned Conjugate Gradient method is often employed for the solution of linear systems of equations arising in numerical simulations of physical phenomena. While widely used, the solver is also known for its lack of accuracy when computing the residual. In this article, we propose two algorithmic solutions that originate from the ExBLAS project to enhance the accuracy of the solver as well as to ensure its reproducibility in a hybrid MPI + OpenMP tasks programming environment. One is based on ExBLAS and preserves every bit of information until the final rounding, while the other relies upon floating-point expansions and, hence, expands the intermediate precision. Instead of converting the entire solver into its ExBLAS-related implementation, we identify those parts that violate reproducibility/non-associativity, secure them, and combine this with the sequential executions. These algorithmic strategies are reinforced with programmability suggestions to ensure deterministic executions. Finally, we verify these approaches on two modern HPC systems: both versions deliver a reproducible number of iterations, residuals, direct errors, and vector-solutions for an overhead of less than 37.7% on 768 cores.
31

Ramon-Cortes, Cristian, Ramon Amela, Jorge Ejarque, Philippe Clauss, and Rosa M. Badia. "AutoParallel: Automatic parallelisation and distributed execution of affine loop nests in Python." International Journal of High Performance Computing Applications 34, no. 6 (July 14, 2020): 659–75. http://dx.doi.org/10.1177/1094342020937050.

Abstract:
The latest improvements in programming languages and models have focused on simplicity and abstraction, lifting Python to the top of the list of programming languages. However, there is still room for improvement in shielding users from having to deal directly with distributed and parallel computing issues. This paper proposes and evaluates AutoParallel, a Python module to automatically find an appropriate task-based parallelisation of affine loop nests and execute them in parallel on a distributed computing infrastructure. It is based on sequential programming and requires one single annotation (in the form of a Python decorator) so that anyone with intermediate-level programming skills can scale up an application to hundreds of cores. The evaluation demonstrates that AutoParallel goes one step further in easing the development of distributed applications. On the one hand, the programmability evaluation highlights the benefits of using a single Python decorator instead of manually annotating each task and its parameters or, even worse, having to develop the parallel code explicitly (e.g., using OpenMP or MPI). On the other hand, the performance evaluation demonstrates that AutoParallel is capable of automatically generating task-based workflows from sequential Python code while achieving the same performance as manually taskified versions of established state-of-the-art algorithms (i.e., Cholesky, LU, and QR decompositions). Finally, AutoParallel is also capable of automatically building data blocks to increase the tasks' granularity, freeing the user from creating the data chunks and redesigning the algorithm. For advanced users, we believe that this feature can be useful as a baseline for designing blocked algorithms.
32

Nguyễn, Trường Huy. "Automata Technique for The LCS Problem." Journal of Computer Science and Cybernetics 35, no. 1 (March 18, 2019): 21–37. http://dx.doi.org/10.15625/1813-9663/35/1/13293.

Abstract:
In this paper, we introduce two algorithms, one sequential and one parallel, that are efficient in practice for computing the length of a longest common subsequence of two strings using an automata technique. For two input strings of lengths m and n with m ≤ n, the parallel algorithm uses k processors (k ≤ m) and runs in O(n) time in the worst case, where k is an upper estimate of the length of a longest common subsequence of the two strings. These results are based on the Knapsack Shaking approach proposed by P. T. Huy et al. in 2002. Experimental results show that, for an alphabet of size 256, our sequential and parallel algorithms are about 65.85 and 3.41m times faster, respectively, than the standard dynamic programming algorithm proposed by Wagner and Fischer in 1974.
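The baseline the authors measure against is the classic dynamic program of Wagner and Fischer. For reference, its O(mn)-time, O(m)-space length-only version (the standard textbook algorithm, not the automata-based method of the paper):

```python
# Wagner-Fischer baseline: O(mn) time, O(m) space, length-only LCS.
def lcs_length(a, b):
    m = len(b)
    prev = [0] * (m + 1)
    for ch in a:
        cur = [0] * (m + 1)
        for j in range(1, m + 1):
            if ch == b[j - 1]:
                cur[j] = prev[j - 1] + 1      # extend a common subsequence
            else:
                cur[j] = max(prev[j], cur[j - 1])
        prev = cur
    return prev[m]

print(lcs_length("ABCBDAB", "BDCABA"))  # 4 (e.g. "BCBA")
```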
33

Kaber, Sidi-Mahmoud, Amine Loumi, and Philippe Parnaudeau. "Parallel Solution of Linear Systems." East Asian Journal on Applied Mathematics 6, no. 3 (July 20, 2016): 278–89. http://dx.doi.org/10.4208/eajam.210715.250316a.

Abstract:
Computational scientists generally seek more accurate results in shorter times, and to achieve this a knowledge of evolving programming paradigms and hardware is important. In particular, optimising solvers for linear systems is a major challenge in scientific computation, and numerical algorithms must be modified or new ones created to fully use the parallel architecture of new computers. Parallel space discretisation solvers for Partial Differential Equations (PDE) such as Domain Decomposition Methods (DDM) are efficient and well documented. At first glance, parallelisation seems to be inconsistent with inherently sequential time evolution, but parallelisation is not limited to space directions. In this article, we present a new and simple method for time parallelisation, based on partial fraction decomposition of the inverse of some special matrices. We discuss its application to the heat equation and some limitations, in associated numerical experiments.
34

Vasilev, Vladimir S., Alexander I. Legalov, and Sergey V. Zykov. "The System for Transforming the Code of Dataflow Programs into Imperative." Modeling and Analysis of Information Systems 28, no. 2 (June 11, 2021): 198–214. http://dx.doi.org/10.18255/1818-1015-2021-2-198-214.

Abstract:
Functional dataflow programming languages are designed for creating parallel portable programs. The source code of such programs is translated into a set of graphs that reflect information and control dependencies. The main way of executing them is interpretation, which does not allow calculations to be performed efficiently on real parallel computing systems and leads to poor performance. To run programs directly on existing computing systems, specific optimization and transformation methods are needed that take into account the features of both the programming language and the architecture of the system. Currently the most common architecture is the von Neumann architecture, and parallel programming for it is in most cases carried out using imperative languages with a static type system. For different architectures of parallel computing systems, there are various approaches to writing parallel programs. Transforming dataflow parallel programs into imperative programs makes it possible to form a framework of imperative code fragments that directly express sequential calculations. This framework can later be adapted to a specific parallel architecture. The paper considers an approach to performing this type of transformation, which consists in identifying fragments of dataflow parallel programs as templates that are subsequently replaced by equivalent fragments in imperative languages. The proposed transformation methods allow generating program code to which various optimizing transformations can then be applied, including parallelization that takes the target architecture into account.
35

EBERBACH, EUGENIUSZ. "SEMAL: A COST LANGUAGE BASED ON THE CALCULUS OF SELF-MODIFIABLE ALGORITHMS." International Journal of Software Engineering and Knowledge Engineering 04, no. 03 (September 1994): 391–408. http://dx.doi.org/10.1142/s0218194094000192.

Abstract:
The design, specification, and preliminary implementation of the SEMAL language, based upon the Calculus of Self-modifiable Algorithms model of computation is presented. A Calculus of Self-modifiable Algorithms is a universal theory for parallel and intelligent systems, integrating different styles of programming, and applied to a wealth of domains of future generation computers. It has some features from logic, rule-based, procedural, functional, and object-oriented programming. It has been designed to be a relatively universal tool for AI similar to the way Hoare’s Communicating Sequential Processes and Milner’s Calculus of Communicating Systems are basic theories for parallel systems. The formal basis of this approach is described. The model is used to derive a new programming paradigm, so-called cost languages and new computer architectures cost-driven computers. As a representative of cost languages, the SEMAL language is presented.
36

Gnatowski, Andrzej, and Teodor Niżyński. "A Parallel Algorithm for Scheduling a Two-Machine Robotic Cell in Bicycle Frame Welding Process." Applied Sciences 11, no. 17 (August 31, 2021): 8083. http://dx.doi.org/10.3390/app11178083.

Abstract:
Welding frames with differing geometries is one of the most crucial stages in the production of high-end bicycles. This paper proposes a parallel algorithm and a mixed integer linear programming formulation for scheduling a two-machine robotic welding station. The time complexity of the introduced parallel method is O(log² n) on an n³-processor Exclusive Read Exclusive Write Parallel Random-Access Machine (EREW PRAM), where n is the problem size. The algorithm is designed to take advantage of modern graphics cards to significantly accelerate the computations. To present the benefits of the parallelization, the algorithm is compared to the state-of-the-art sequential method and a solver-based approach. Experimental results show an impressive speedup for larger problem instances: up to 314 on a single Graphics Processing Unit (GPU), compared to a single-threaded CPU execution of the sequential algorithm.
37

Rek, Václav, and Ivan Němec. "Parallel Computation on Multicore Processors Using Explicit Form of the Finite Element Method and C++ Standard Libraries." Strojnícky casopis – Journal of Mechanical Engineering 66, no. 2 (November 1, 2016): 67–78. http://dx.doi.org/10.1515/scjme-2016-0020.

Abstract:
In this paper, we introduce the modifications needed to existing sequential code written in the C or C++ programming language for the calculation of various kinds of structures using the explicit form of the Finite Element Method (Dynamic Relaxation Method, Explicit Dynamics) in the NEXX system. The NEXX system is the core of the engineering software NEXIS, Scia Engineer, RFEM and RENEX. It supports multithreaded execution, which can now be expressed at the level of the native C++ programming language using standard libraries. Thanks to the high degree of abstraction that the contemporary C++ programming language provides, a library created in this way can be highly generalized for other uses of parallelism in computational mechanics.
38

Zhang, Jing, and Bai Lin Li. "Study on Multidisciplinary Design Optimization of 3-RRS Parallel Robot." Applied Mechanics and Materials 214 (November 2012): 919–23. http://dx.doi.org/10.4028/www.scientific.net/amm.214.919.

Abstract:
The paper applies the idea of multidisciplinary design optimization to the design of a robot system. The main idea of collaborative optimization is introduced, and the collaborative optimization framework of the 3-RRS parallel robot is analyzed. Using genetic algorithms and Sequential Quadratic Programming, collaborative optimization of the working stroke, driving performance, and hydraulic components is carried out. The numerical results indicate that collaborative optimization can be successfully applied to complex robot systems, laying a foundation for solving more complex mechanical systems.
39

Sen, S. K., Hongwei Du, and D. W. Fausett. "A center of a polytope: An expository review and a parallel implementation." International Journal of Mathematics and Mathematical Sciences 16, no. 2 (1993): 209–24. http://dx.doi.org/10.1155/s0161171293000262.

Abstract:
The solution space of the rectangular linear system Ax = b, subject to x ≥ 0, is called a polytope. An attempt is made to provide a deeper geometric insight, with numerical examples, into the condensed paper by Lord et al. [1], which presents an algorithm to compute a center of a polytope. The algorithm is readily adapted for either sequential or parallel computer implementation. The computed center provides an initial feasible solution (interior point) of a linear programming problem.
40

Radhakrishnan, Hari, Damian W. I. Rouson, Karla Morris, Sameer Shende, and Stavros C. Kassinos. "Using Coarrays to Parallelize Legacy Fortran Applications: Strategy and Case Study." Scientific Programming 2015 (2015): 1–12. http://dx.doi.org/10.1155/2015/904983.

Abstract:
This paper summarizes a strategy for parallelizing a legacy Fortran 77 program using the object-oriented (OO) and coarray features that entered Fortran in the 2003 and 2008 standards, respectively. OO programming (OOP) facilitates the construction of an extensible suite of model-verification and performance tests that drive the development. Coarray parallel programming facilitates a rapid evolution from a serial application to a parallel application capable of running on multicore processors and many-core accelerators in shared and distributed memory. We delineate 17 code modernization steps used to refactor and parallelize the program and study the resulting performance. Our initial studies were done using the Intel Fortran compiler on a 32-core shared memory server. Scaling behavior was very poor, and profile analysis using TAU showed that the bottleneck in the performance was due to our implementation of a collective, sequential summation procedure. We were able to improve the scalability and achieve nearly linear speedup by replacing the sequential summation with a parallel, binary tree algorithm. We also tested the Cray compiler, which provides its own collective summation procedure. Intel provides no collective reductions. With Cray, the program shows linear speedup even in distributed-memory execution. We anticipate similar results with other compilers once they support the new collective procedures proposed for Fortran 2015.
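The key fix reported, replacing the sequential summation with a binary-tree reduction, can be seen in a few lines. This Python sketch shows the shape of the computation (the paper's version is a parallel collective over Fortran coarray images; here the point is just that the number of combining rounds drops from O(p) to O(log p)):

```python
# Binary-tree reduction: pairwise-combine partial sums so the number of
# combining rounds is O(log p) instead of the O(p) steps of the
# sequential loop that bottlenecked the original code.
def tree_reduce(values, op):
    values = list(values)
    while len(values) > 1:
        paired = [op(values[i], values[i + 1])
                  for i in range(0, len(values) - 1, 2)]
        if len(values) % 2:          # odd element carries to the next round
            paired.append(values[-1])
        values = paired
    return values[0]

print(tree_reduce([1, 2, 3, 4, 5], lambda a, b: a + b))  # 15
```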
41

AHN, JOONSEON, and TAISOOK HAN. "AN ANALYTICAL METHOD FOR PARALLELIZATION OF RECURSIVE FUNCTIONS." Parallel Processing Letters 10, no. 01 (March 2000): 87–98. http://dx.doi.org/10.1142/s012962640000010x.

Abstract:
Programming with parallel skeletons is an attractive framework because it encourages programmers to develop efficient and portable parallel programs. However, extracting parallelism from sequential specifications and constructing efficient parallel programs using the skeletons are still difficult tasks. In this paper, we propose an analytical approach to transforming recursive functions on general recursive data structures into compositions of parallel skeletons. Using static slicing, we have defined a classification of subexpressions based on their data-parallelism. Then, skeleton-based parallel programs are generated from the classification. To extend the scope of parallelization, we have adopted more general parallel skeletons which do not require the associativity of argument functions. In this way, our analytical method can parallelize recursive functions with complex data flows.
42

AHN, JOONSEON, and TAISOOK HAN. "AN ANALYTICAL METHOD FOR PARALLELIZATION OF RECURSIVE FUNCTIONS." Parallel Processing Letters 10, no. 04 (December 2000): 359–70. http://dx.doi.org/10.1142/s0129626400000330.

Abstract:
Programming with parallel skeletons is an attractive framework because it encourages programmers to develop efficient and portable parallel programs. However, extracting parallelism from sequential specifications and constructing efficient parallel programs using the skeletons are still difficult tasks. In this paper, we propose an analytical approach to transforming recursive functions on general recursive data structures into compositions of parallel skeletons. Using static slicing, we have defined a classification of subexpressions based on their data-parallelism. Then, skeleton-based parallel programs are generated from the classification. To extend the scope of parallelization, we have adopted more general parallel skeletons which do not require the associativity of argument functions. In this way, our analytical method can parallelize recursive functions with complex data flows.
43

Basu, Debaleena, and Aditya Murthy. "Parallel programming of saccades in the macaque frontal eye field: are sequential motor plans coactivated?" Journal of Neurophysiology 123, no. 1 (January 1, 2020): 107–19. http://dx.doi.org/10.1152/jn.00545.2018.

Abstract:
We use sequences of saccadic eye movements to continually explore our visual environments. Previous behavioral studies have established that saccades in a sequence may be programmed in parallel by the oculomotor system. In this study, we tested the neural correlates of parallel programming of saccade sequences in the frontal eye field (FEF), using single-unit electrophysiological recordings from macaques performing a sequential saccade task. It is known that FEF visual neurons instantiate target selection whereas FEF movement neurons undertake saccade preparation, where the activity corresponding to a saccade vector gradually ramps up. The question of whether FEF movement neurons are involved in concurrent processing of saccade plans is as yet unresolved. In the present study, we show that, when a peripheral target is foveated after a sequence of two saccades, presaccadic activity of FEF movement neurons for the second saccade can be activated while the first is still underway. Moreover, the onset of movement activity varied parametrically with the behaviorally measured time available for parallel programming. Although at central fixation the coactivated FEF movement activity may vectorially encode either the retinotopic location of the second target with respect to the fixation point or the remapped location of the second target with respect to the first, our evidence suggests the possibility of early encoding of the remapped second saccade vector. Taken together, the results indicate that movement neurons, although located terminally in the FEF visual-motor spectrum, can accomplish concurrent processing of multiple saccade plans, leading to rapid execution of saccade sequences. NEW & NOTEWORTHY The execution of purposeful sequences underlies much of goal-directed behavior. How different brain areas accomplish sequencing is poorly understood. Using a modified double-step task to generate a rapid sequence of two saccades, we demonstrate that downstream movement neurons in the frontal eye field (FEF), a prefrontal oculomotor area, allow for coactivation of the first and second movement plans that constitute the sequence. These results provide fundamental insights into the neural control of action sequencing.
APA, Harvard, Vancouver, ISO, and other styles
44

Sajan, Priya P., and S. S. Kumar. "GVF snake algorithm-a parallel approach." International Journal of Engineering & Technology 7, no. 1.1 (December 21, 2017): 101. http://dx.doi.org/10.14419/ijet.v7i1.1.9206.

Full text
Abstract:
Multicore architecture is an emerging computer technology in which multiple processing elements act as independent processing cores sharing a common memory. Digital image segmentation is a widely used medical imaging application for extracting regions of interest. The GVF active contour is a region-based segmentation technique that extracts curved and irregularly shaped regions by diffusing gradient vectors under the influence of internal and external forces. It requires prior knowledge of the geometric position and anatomical structures to locate the specific region defined within an image domain, and it involves complex mathematical calculations that consume considerable CPU processing time, which can adversely affect the overall efficiency of the process. With the advancements in multicore technology, this processing delay can be reduced by parallelizing the computation of the GVF field over the region of interest to be segmented. OpenMP is a shared-memory parallel programming construct that implements multicore parallelism with extensive and powerful APIs supporting the functionality required to attain parallelism. This article provides a high-level overview of OpenMP, its effectiveness, and the ease with which parallelism can be added to existing sequential methods using instruction-, data-, and loop-level parallelism. Performance is compared between sequential versions of the program written in Matlab, Java, and C and the proposed parallelized OpenMP version; results are also compared across operating systems such as Windows and Linux.
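As a rough illustration of the loop-level OpenMP parallelism the abstract describes — a hedged sketch, not the authors' code; the update rule, field layout, and parameter names are simplified assumptions — one Jacobi-style iteration of a GVF diffusion update can be parallelized as follows:

```cpp
#include <vector>

// One Jacobi-style iteration of a simplified GVF diffusion update on a
// W x H scalar field u (row-major). mu is the regularization weight;
// b and c hold precomputed data-attachment terms. All names and the
// update rule are illustrative simplifications.
void gvf_iterate(std::vector<float>& u, const std::vector<float>& b,
                 const std::vector<float>& c, int W, int H, float mu) {
    std::vector<float> out = u;  // borders are left unchanged
    #pragma omp parallel for collapse(2)
    for (int y = 1; y < H - 1; ++y)
        for (int x = 1; x < W - 1; ++x) {
            int i = y * W + x;
            // Discrete Laplacian of u at pixel i.
            float lap = u[i - 1] + u[i + 1] + u[i - W] + u[i + W] - 4.0f * u[i];
            out[i] = u[i] + mu * lap - b[i] * u[i] + c[i];
        }
    u.swap(out);
}
```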
APA, Harvard, Vancouver, ISO, and other styles
45

van Soest, A. J. Knoek, and L. J. R. Richard Casius. "The Merits of a Parallel Genetic Algorithm in Solving Hard Optimization Problems." Journal of Biomechanical Engineering 125, no. 1 (February 1, 2003): 141–46. http://dx.doi.org/10.1115/1.1537735.

Full text
Abstract:
A parallel genetic algorithm for optimization is outlined, and its performance on both mathematical and biomechanical optimization problems is compared to a sequential quadratic programming algorithm, a downhill simplex algorithm and a simulated annealing algorithm. When high-dimensional non-smooth or discontinuous problems with numerous local optima are considered, only the simulated annealing and the genetic algorithm, which are both characterized by a weak search heuristic, are successful in finding the optimal region in parameter space. The key advantage of the genetic algorithm is that it can easily be parallelized at negligible overhead.
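The "negligible overhead" of parallelizing a genetic algorithm comes from the independence of fitness evaluations, which can be farmed out to workers with no synchronization beyond a barrier per generation. A minimal sketch of that evaluation step (assuming a user-supplied objective; this is not the authors' implementation):

```cpp
#include <cmath>
#include <vector>

// Evaluate a population of candidate parameter vectors in parallel.
// Evaluations are independent, so the loop needs no synchronization
// beyond the implicit barrier at the end -- hence the low overhead
// when each fitness call is expensive (e.g., a biomechanical simulation).
std::vector<double> evaluate(const std::vector<std::vector<double>>& pop,
                             double (*fitness)(const std::vector<double>&)) {
    std::vector<double> f(pop.size());
    #pragma omp parallel for schedule(dynamic)
    for (long i = 0; i < static_cast<long>(pop.size()); ++i)
        f[i] = fitness(pop[i]);
    return f;
}

// Example objective: the multimodal Rastrigin test function, a standard
// stand-in for the non-smooth problems with many local optima discussed above.
double rastrigin(const std::vector<double>& x) {
    const double two_pi = 6.283185307179586;
    double s = 10.0 * x.size();
    for (double xi : x) s += xi * xi - 10.0 * std::cos(two_pi * xi);
    return s;
}
```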
APA, Harvard, Vancouver, ISO, and other styles
46

McSorley, Eugene, Iain D. Gilchrist, and Rachel McCloy. "The role of fixation disengagement in the parallel programming of sequences of saccades." Experimental Brain Research 237, no. 11 (September 17, 2019): 3033–45. http://dx.doi.org/10.1007/s00221-019-05641-9.

Full text
Abstract:
One of the core mechanisms involved in the control of saccade responses to selected target stimuli is disengagement from the current fixation location, so that the next saccade can be executed. To carry out everyday visual tasks, we make multiple eye movements that can be programmed in parallel. However, the role of disengagement in the parallel programming of saccades has not been examined. It is well established that the need for disengagement slows down saccadic response time. This may be important in allowing the system to program accurate eye movements and may have a role to play in the control of multiple eye movements, but as yet this remains untested. Here, we report two experiments that examine whether fixation disengagement reduces saccade latencies when task completion demands multiple saccade responses. A saccade-contingent paradigm was employed: participants were asked to execute saccadic eye movements to a series of seven targets while we manipulated when these targets were shown. This both promotes fixation disengagement and controls the extent to which parallel programming can occur. We found that trial duration decreased as more targets were made available prior to fixation; this resulted both from a reduction in the number of saccades executed and from shorter saccade latencies. This supports the view that, even when fixation disengagement is not required, parallel programming of multiple sequential saccadic eye movements is still present. By comparison with previously published data, we demonstrate a substantial speeding of response times (a "gap effect") in these conditions and that parallel programming is attenuated.
APA, Harvard, Vancouver, ISO, and other styles
47

Romanova, D. S., and S. Yu Smogluk. "POSSIBILITY OF USING ELEMENTS OF MATHEMATICAL LIBRARY FOR A PARALLEL LANGUAGE IN CONSTRUCTION MATRIX PROGRAMS." Vestnik komp'iuternykh i informatsionnykh tekhnologii, no. 193 (July 2020): 55–64. http://dx.doi.org/10.14489/vkit.2020.07.pp.055-064.

Full text
Abstract:
Today, owing to the difficulty of further improving computing performance, parallel programming continues to evolve. There are many languages in which parallel programs can be written. One of them is the functional-threading parallel programming language Pifagor, which is quite specific: it allows a program to be written with maximal parallelism, and it is designed to solve the portability problem of parallel programs. Tools and a library of functions continue to be developed for this language. This study is devoted to developing elements of the mathematical library and to finding the most effective parallel mathematical algorithms. The following methods are considered and used in the work: sequential, recursive (left and right recursion), factorization, and pairwise comparison. As a result of the study, a number of mathematical functions were developed, and the possibility of using these functions in programs for multiplying large matrices was investigated. The work demonstrates the effectiveness of the developed simple functions, implemented by different methods, in matrix multiplication programs. Prospects for further work in this direction are noted, in particular analyzing the possibility of using artificial-intelligence methods to increase efficiency and to facilitate the development of parallel programs operating on large matrices.
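Pifagor itself is not widely available, so the following hypothetical C++ sketch renders two of the methods named above — left recursion (a sequential chain) and pairwise comparison (a balanced reduction tree of logarithmic depth) — as they would appear inside a matrix multiplication:

```cpp
#include <cstddef>
#include <vector>

// Left-recursive (sequential) dot product: the additions form a chain
// of depth O(n), leaving nothing to evaluate in parallel.
double dot_seq(const double* a, const double* b, std::size_t n) {
    return n == 0 ? 0.0 : a[0] * b[0] + dot_seq(a + 1, b + 1, n - 1);
}

// Pairwise (tree) dot product: the two halves are independent, so the
// addition tree has depth O(log n) and exposes parallelism.
double dot_pair(const double* a, const double* b, std::size_t n) {
    if (n == 1) return a[0] * b[0];
    std::size_t h = n / 2;
    return dot_pair(a, b, h) + dot_pair(a + h, b + h, n - h);
}

// C = A * B for n x n matrices, with B supplied transposed (Bt) so that
// each output element is one pairwise dot product over contiguous memory.
void matmul(const std::vector<double>& A, const std::vector<double>& Bt,
            std::vector<double>& C, std::size_t n) {
    for (std::size_t r = 0; r < n; ++r)
        for (std::size_t col = 0; col < n; ++col)
            C[r * n + col] = dot_pair(&A[r * n], &Bt[col * n], n);
}
```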
APA, Harvard, Vancouver, ISO, and other styles
48

COLLINS, ALEXANDER, CHRISTIAN FENSCH, and HUGH LEATHER. "AUTO-TUNING PARALLEL SKELETONS." Parallel Processing Letters 22, no. 02 (May 16, 2012): 1240005. http://dx.doi.org/10.1142/s0129626412400051.

Full text
Abstract:
Parallel skeletons are a structured parallel programming abstraction that provide programmers with a predefined set of algorithmic templates that can be combined, nested and parameterized with sequential code to produce complex programs. The implementation of these skeletons is currently a manual process, requiring human expertise to choose suitable implementation parameters that provide good performance. This paper presents an empirical exploration of the optimization space of the FastFlow parallel skeleton framework. We performed this using a Monte Carlo search of a random subset of the space, for a representative set of platforms and programs. The results show that the space is program and platform dependent, non-linear, and that automatic search achieves a significant average speedup in program execution time of 1.6× over a human expert. An exploratory data analysis of the results shows a linear dependence between two of the parameters, and that another two parameters have little effect on performance. These properties are then used to reduce the size of the space by a factor of 6, reducing the cost of the search. This provides a starting point for automatically optimizing parallel skeleton programs without the need for human expertise, and with a large improvement in execution time compared to that achievable using human expert tuning.
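The search procedure is simple to reproduce in outline. A hedged sketch of a Monte Carlo auto-tuner over a skeleton parameter space (the Config fields and the run_skeleton cost model are hypothetical stand-ins, not FastFlow's actual API):

```cpp
#include <iostream>
#include <random>

// Hypothetical tuning parameters for one skeleton instantiation; the
// real FastFlow space explored in the paper is larger.
struct Config { int workers; int queue_len; bool pinning; };

// Stand-in cost model for illustration only; a real tuner would build
// the skeleton with this Config and time an actual run.
double run_skeleton(const Config& c) {
    double t = 1.0 / c.workers + 0.002 * c.workers;  // parallelism vs. overhead
    if (c.queue_len < 64) t += 0.05;                 // tiny queues stall
    return c.pinning ? 0.95 * t : t;
}

// Monte Carlo search: sample random points of the space, keep the best.
Config tune(int samples, std::mt19937& rng) {
    std::uniform_int_distribution<int> w(1, 64), q(16, 4096);
    std::bernoulli_distribution flip(0.5);
    Config best{1, 512, false};
    double best_t = run_skeleton(best);
    for (int s = 0; s < samples; ++s) {
        Config c{w(rng), q(rng), flip(rng)};
        double t = run_skeleton(c);
        if (t < best_t) { best_t = t; best = c; }
    }
    return best;
}

int main() {
    std::mt19937 rng(42);
    Config c = tune(1000, rng);
    std::cout << c.workers << ' ' << c.queue_len << ' ' << c.pinning << '\n';
}
```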
APA, Harvard, Vancouver, ISO, and other styles
49

PARK, SUNGWOO, and HYEONSEUNG IM. "Type-safe higher-order channels with channel locality." Journal of Functional Programming 19, no. 1 (January 2009): 107–42. http://dx.doi.org/10.1017/s0956796808006989.

Full text
Abstract:
As a means of transmitting not only data but also code encapsulated within functions, higher-order channels provide an advanced form of task parallelism in parallel computations. In the presence of mutable references, however, they pose a safety problem because references may be transmitted to remote threads where they are no longer valid. This paper presents an ML-like parallel language with type-safe higher-order channels. By type safety, we mean that no value written to a channel contains references, or equivalently, that no reference escapes via a channel from the thread where it is created. The type system uses a typing judgment that is capable of deciding whether the value to which a term evaluates contains references or not. The use of such a typing judgment also makes it easy to achieve another desirable feature of channels, channel locality, that associates every channel with a unique thread for serving all values addressed to it. Our type system permits mutable references in sequential computations and also ensures that mutable references never interfere with parallel computations. Thus, it provides both flexibility in sequential programming and ease of implementing parallel computations.
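The paper's type system is formulated for an ML-like language, but the core restriction — only values containing no mutable references may be written to a channel — can be loosely approximated elsewhere. A hypothetical C++ sketch using a compile-time "mobility" check (an analogy only, far coarser than the paper's typing judgment):

```cpp
#include <functional>
#include <mutex>
#include <queue>
#include <type_traits>

// Rough compile-time stand-in for "contains no mutable references":
// reject pointer and reference types outright.
template <typename T>
constexpr bool is_mobile_v =
    !std::is_pointer_v<T> && !std::is_reference_v<T>;

// A channel that only accepts "mobile" values, enforced at compile time,
// so no reference can escape its creating thread via the channel.
template <typename T>
class Channel {
    static_assert(is_mobile_v<T>, "references may not escape via a channel");
    std::queue<T> q_;
    std::mutex m_;
public:
    void send(T v) {
        std::lock_guard<std::mutex> g(m_);
        q_.push(std::move(v));
    }
    bool try_recv(T& out) {
        std::lock_guard<std::mutex> g(m_);
        if (q_.empty()) return false;
        out = std::move(q_.front());
        q_.pop();
        return true;
    }
};

int main() {
    // Higher-order use: functions (code) travel through the channel.
    Channel<std::function<int(int)>> ch;
    ch.send([](int x) { return x + 1; });
    // Channel<int*> bad;  // would fail to compile: pointers are not mobile
    std::function<int(int)> f;
    if (ch.try_recv(f)) return f(41) - 42;  // returns 0
}
```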
APA, Harvard, Vancouver, ISO, and other styles