
Journal articles on the topic 'Parallel and distributed multi-Level programming'

Consult the top 50 journal articles for your research on the topic 'Parallel and distributed multi-Level programming.'


1

Zhunissov, N. M., A. T. Bayaly, and E. T. Satybaldy. "THE POSSIBILITIES OF USING PARALLEL PROGRAMMING USING PYTHON." Q A Iasaýı atyndaǵy Halyqaralyq qazaq-túrіk ýnıversıtetіnіń habarlary (fızıka matematıka ınformatıka serııasy) 28, no. 1 (March 30, 2024): 105–14. http://dx.doi.org/10.47526/2024-1/2524-0080.09.

Abstract:
This article explores the development of parallel software applications using the Python programming language. Parallel programming is becoming increasingly important in information technology as multi-core processors and distributed computing become more common. Python provides developers with a variety of tools and libraries for creating parallel applications, including threads, processes, and asynchronous programming. The article covers the basics of parallel programming in Python, including the principles of thread and process management, error handling, synchronization mechanisms, and resource management. It also considers asynchronous programming with the asyncio library, which allows asynchronous tasks to be handled efficiently. In addition, the article addresses the optimization and profiling of parallel applications, explores distributed parallel programming using third-party libraries and frameworks, and emphasizes the importance of testing and debugging in the context of parallel programming. Research and experiments in parallel programming with Python help developers create high-performance, efficient applications that make effective use of multi-core systems and distributed computing. The article offers an in-depth study of how suitable Python is for teaching parallel programming to inexperienced students. The results show that there are obstacles that prevent Python from maintaining its advantages in the transition from sequential to parallel programming.
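The thread-based and asyncio approaches surveyed in this abstract can be sketched with Python's standard library alone; the function names below are illustrative, not taken from the article:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

def word_count(text):
    # A small task; for CPU-bound work, ProcessPoolExecutor sidesteps the GIL.
    return len(text.split())

def run_threaded(chunks):
    # Threads: fan a list of tasks out over a pool and collect the results.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(word_count, chunks))

async def fetch(name, delay):
    # asyncio: cooperative concurrency, suited to I/O-bound tasks.
    await asyncio.sleep(delay)
    return name

async def run_async():
    # gather preserves argument order regardless of completion order.
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))

print(run_threaded(["one two", "three four five"]))  # [2, 3]
print(asyncio.run(run_async()))                      # ['a', 'b']
```

The same pool interface works for processes via `ProcessPoolExecutor`, which is the usual way to exploit multiple cores for CPU-bound Python code.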
2

Deshpande, Ashish, and Martin Schultz. "Efficient Parallel Programming with Linda." Scientific Programming 1, no. 2 (1992): 177–83. http://dx.doi.org/10.1155/1992/829092.

Abstract:
Linda is a coordination language invented by David Gelernter at Yale University, which when combined with a computation language (like C) yields a high-level parallel programming language for MIMD machines. Linda is based on a virtual shared associative memory containing objects called tuples. Skeptics have long claimed that Linda programs could not be efficient on distributed memory architectures. In this paper, we address this claim by discussing C-Linda's performance in solving a particular scientific computing problem, the shallow water equations, and make comparisons with alternatives available on various shared and distributed memory parallel machines.
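Linda's model rests on three tuple-space operations: `out` deposits a tuple, `in` withdraws a matching tuple, and `rd` reads one without removing it. A toy in-process sketch of that associative memory (not C-Linda's distributed implementation) might look like:

```python
import threading

class TupleSpace:
    """Toy tuple space sketching Linda's out/in/rd; None in a template is a wildcard."""
    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, tup):
        # Deposit a tuple into the space and wake any waiting readers.
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def _match(self, template):
        for t in self._tuples:
            if len(t) == len(template) and all(
                    p is None or p == v for p, v in zip(template, t)):
                return t
        return None

    def in_(self, template):
        # Withdraw a matching tuple, blocking until one is available.
        with self._cond:
            while (t := self._match(template)) is None:
                self._cond.wait()
            self._tuples.remove(t)
            return t

    def rd(self, template):
        # Read a matching tuple without removing it, blocking until available.
        with self._cond:
            while (t := self._match(template)) is None:
                self._cond.wait()
            return t
```

A producer might `out(("task", 7))` while a worker blocks in `in_(("task", None))`; the efficiency question the paper addresses is precisely how to make this associative matching fast on distributed memory.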
3

RAUBER, THOMAS, and GUDULA RÜNGER. "A DATA RE-DISTRIBUTION LIBRARY FOR MULTI-PROCESSOR TASK PROGRAMMING." International Journal of Foundations of Computer Science 17, no. 02 (April 2006): 251–70. http://dx.doi.org/10.1142/s0129054106003814.

Abstract:
Multiprocessor task (M-task) programming is a suitable parallel programming model for coding application problems with an inherent modular structure. An M-task can be executed on a group of processors of arbitrary size, concurrently to other M-tasks of the same application program. The data of a multiprocessor task program usually include composed data structures, like vectors or arrays. For distributed memory machines or cluster platforms, those composed data structures are distributed within one or more processor groups. Thus, a concise parallel programming model for M-tasks requires a standardized distributed data format for composed data structures. Additionally, functions for data re-distribution with respect to different data distributions and different processor group layouts are needed to glue program parts together. In this paper, we present a data re-distribution library which extends the M-task programming with Tlib, a library providing operations to split processor groups and to map M-tasks to processor groups.
4

Aversa, R., B. Di Martino, N. Mazzocca, and S. Venticinque. "A Skeleton Based Programming Paradigm for Mobile Multi-Agents on Distributed Systems and Its Realization within the MAGDA Mobile Agents Platform." Mobile Information Systems 4, no. 2 (2008): 131–46. http://dx.doi.org/10.1155/2008/745406.

Abstract:
Parallel programming effort can be reduced by using high-level constructs such as algorithmic skeletons. Within the MAGDA toolset, which supports the programming and execution of mobile-agent-based distributed applications, we provide a skeleton-based parallel programming environment based on the specialisation of Algorithmic Skeleton Java interfaces and classes. Their implementation includes mobile agent features for execution on heterogeneous systems, such as clusters of workstations and PCs, and supports reliability and dynamic workload balancing. The user can thus develop a parallel, mobile-agent-based application simply by specialising a given set of classes and methods and using a set of added functionalities.
5

Gorodnyaya, Lidia. "FUNCTIONAL PROGRAMMING FOR PARALLEL COMPUTING." Bulletin of the Novosibirsk Computing Center. Series: Computer Science, no. 45 (2021): 29–48. http://dx.doi.org/10.31144/bncc.cs.2542-1972.2021.n45.p29-48.

Abstract:
The paper is devoted to modern trends in the application of functional programming to the problems of organizing parallel computations. Functional programming is considered as a meta-paradigm for solving the problems of developing multi-threaded programs for multiprocessor complexes and distributed systems, as well as for solving the problems associated with rapid IT development. The semantic and pragmatic principles of functional programming and consequences of these principles are described. The paradigm analysis of programming languages and systems is used, which allows assessing their similarities and differences. Taking into account these features is necessary when predicting the course of application processes, as well as when planning the study and organization of program development. There are reasons to believe that functional programming is capable of improving program performance through its adaptability to modeling and prototyping. A variety of features and characteristics inherent in the development and debugging of long-lived parallel computing programs is shown. The author emphasizes the prospects of functional programming as a universal technique for solving complex problems burdened with difficult to verify and poorly compatible requirements. A brief outline of the requirements for a multiparadigm parallel programming language is given.
6

Spahi, Enis, and D. Altilar. "ITU-PRP: Parallel and Distributed Computing Middleware for Java Developers." International Journal of Business & Technology 3, no. 1 (November 2014): 2–13. http://dx.doi.org/10.33107/ijbte.2014.3.1.01.

Abstract:
ITU-PRP provides a parallel programming framework for Java developers with which they can adapt sequential application code to run on a distributed multi-host parallel environment. Developers implement parallel models, such as Loop Parallelism, Divide and Conquer, Master-Slave and Fork-Join, with the help of an API library provided by the framework. The resulting parallel applications are submitted to a middleware called the Parallel Running Platform (PRP), on which the resources for parallel processing are organized and the processing is performed. The middleware creates Task Plans (TPs) according to the application's parallel model and assigns the best available host resources in order to perform fast parallel processing. Task Plans are created dynamically in real time according to the actual utilization status or availability of resources, instead of being predefined or preconfigured. ITU-PRP achieves better efficiency in parallel processing over big data sets and distributes the divided base data to multiple hosts to be operated on with coarse-grained parallelism. Under this model, distributed parallel tasks operate independently with minimal interaction until processing ends.
7

Городняя, Лидия Васильевна. "Perspectives of Functional Programming of Parallel Computations." Russian Digital Libraries Journal 24, no. 6 (January 26, 2022): 1090–116. http://dx.doi.org/10.26907/1562-5419-2021-24-6-1090-1116.

Abstract:
The article is devoted to the results of an analysis of modern trends in functional programming, considered as a metaparadigm for solving the problems of organizing parallel computations and multithreaded programs for multiprocessor complexes and distributed systems. Taking into account the multi-paradigm nature of parallel programming, a paradigm analysis of functional programming languages and systems is used. This makes it possible to reduce the complexity of the problems being solved by decomposing programs into autonomously developed components and to evaluate their similarities and differences. Consideration of such features is necessary when predicting the course of application processes, as well as when planning the study and organizing the development of programs. There is reason to believe that functional programming can improve program performance. A variety of paradigmatic characteristics inherent in the preparation and debugging of long-lived parallel computing programs are shown.
8

LUKE, EDWARD A., and THOMAS GEORGE. "Loci: a rule-based framework for parallel multi-disciplinary simulation synthesis." Journal of Functional Programming 15, no. 3 (May 2005): 477–502. http://dx.doi.org/10.1017/s0956796805005514.

Abstract:
We present a rule-based framework for the development of scalable parallel high performance simulations for a broad class of scientific applications (with particular emphasis on continuum mechanics). We take a pragmatic approach to our programming abstractions by implementing structures that are used frequently and have common high performance implementations on distributed memory architectures. The resulting framework borrows heavily from rule-based systems for relational database models, however limiting the scope to those parts that have obvious high performance implementation. Using our approach, we demonstrate predictable performance behavior and efficient utilization of large scale distributed memory architectures on problems of significant complexity involving multiple disciplines.
9

TRINDER, P. W. "Special Issue High Performance Parallel Functional Programming." Journal of Functional Programming 15, no. 3 (May 2005): 351–52. http://dx.doi.org/10.1017/s0956796805005496.

Abstract:
Engineering high-performance parallel programs is hard: not only must a correct, efficient and inherently-parallel algorithm be developed, but the computations must be effectively and efficiently coordinated across multiple processors. It has long been recognised that ideas and approaches drawn from functional programming may be particularly applicable to parallel and distributed computing (e.g. Wegner 1971). There are several reasons for this suitability. Concurrent stateless computations are much easier to coordinate, high-level coordination abstractions reduce programming effort, and declarative notations are amenable to reasoning, i.e. to optimising transformations, derivation and performance analysis.
10

POGGI, AGOSTINO, and PAOLA TURCI. "AN AGENT BASED LANGUAGE FOR THE DEVELOPMENT OF DISTRIBUTED SOFTWARE SYSTEMS." International Journal on Artificial Intelligence Tools 05, no. 03 (September 1996): 347–66. http://dx.doi.org/10.1142/s0218213096000237.

Abstract:
This paper presents a concurrent object-oriented language, called CUBL, that seems to be suitable for the development and maintenance of multi-agent systems. The language is based on objects, called c_units, that act in parallel and communicate with each other through synchronous and asynchronous message passing; it allows the distribution of a program, that is, of its objects, over a network of UNIX workstations. The language has been enriched with an agent architecture that offers some of the more important features for agent-oriented programming and some advantages over other implemented agent architectures. In particular, this architecture allows the development of systems where agents communicate with each other through a high-level agent communication language and can change their behavior during their lifetime.
11

Liu, Rui Tao, and Xiu Jian Lv. "MapReduce-Based Ant Colony Optimization Algorithm for Multi-Dimensional Knapsack Problem." Applied Mechanics and Materials 380-384 (August 2013): 1877–80. http://dx.doi.org/10.4028/www.scientific.net/amm.380-384.1877.

Abstract:
This paper uses the MapReduce parallel programming model to parallelize the Ant Colony Optimization (ACO) algorithm and puts forward a MapReduce-based improved ACO for the Multi-dimensional Knapsack Problem (MKP). A variety of techniques, such as changing the timing of the probability calculation, roulette selection, crossover, and mutation, are applied to remedy the drawbacks of ACO, and the complexity of the algorithm is greatly reduced. The algorithm is applied in a distributed, parallel fashion to solve large-scale MKP instances in cloud computing. Simulation results show that the algorithm improves on the long search times of the ant colony algorithm and its processing power for large-scale problems.
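The map/reduce split the paper relies on can be sketched with a toy stand-in: each "map" call plays one ant constructing a random greedy knapsack solution, and the "reduce" step keeps the best value found. The items, capacity, and greedy construction below are illustrative, not the paper's pheromone-based ACO:

```python
import random
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

ITEMS = [(10, 5), (8, 4), (6, 3), (5, 3)]  # illustrative (value, weight) pairs
CAPACITY = 8

def map_reduce(inputs, mapper, reducer, workers=4):
    # Map phase in parallel, then group emitted pairs by key, then reduce.
    groups = defaultdict(list)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for pairs in pool.map(mapper, inputs):
            for key, value in pairs:
                groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}

def ant_mapper(seed):
    # One "ant": a random greedy knapsack fill standing in for ACO's
    # pheromone-guided solution construction.
    rng = random.Random(seed)
    weight = value = 0
    for i in rng.sample(range(len(ITEMS)), len(ITEMS)):
        v, w = ITEMS[i]
        if weight + w <= CAPACITY:
            weight, value = weight + w, value + v
    return [("best", value)]

def best_reducer(key, values):
    return max(values)

result = map_reduce(range(20), ant_mapper, best_reducer)
```

In a real deployment each mapper would run on a different cluster node, which is what lets the approach scale to large MKP instances.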
12

Di Martino, Beniamino, Sergio Briguglio, Gregorio Vlad, and Giuliana Fogaccia. "Workload Decomposition Strategies for Shared Memory Parallel Systems with OpenMP." Scientific Programming 9, no. 2-3 (2001): 109–22. http://dx.doi.org/10.1155/2001/891073.

Abstract:
A crucial issue in parallel programming (for both distributed and shared memory architectures) is work decomposition. The work decomposition task can be accomplished without large programming effort with the use of high-level parallel programming languages, such as OpenMP. However, particular care must still be paid to achieving performance goals. In this paper we introduce and compare two decomposition strategies, in the framework of shared memory systems, applied to a particle-in-cell case-study application. A number of different implementations of them, based on the OpenMP language, are discussed with regard to time efficiency, memory occupancy, and program restructuring effort.
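Two standard ways of decomposing loop iterations across workers are block and cyclic distribution, corresponding to OpenMP's `schedule(static)` and `schedule(static, 1)`. They can be sketched as pure partitioning functions (the paper's comparison concerns a particle-in-cell code, not this toy):

```python
def block_partition(n, p):
    # Contiguous chunks per worker, like OpenMP's schedule(static).
    base, extra = divmod(n, p)
    parts, start = [], 0
    for rank in range(p):
        size = base + (1 if rank < extra else 0)
        parts.append(list(range(start, start + size)))
        start += size
    return parts

def cyclic_partition(n, p):
    # Round-robin assignment, like OpenMP's schedule(static, 1); it balances
    # load better when iteration cost varies systematically with the index.
    return [list(range(rank, n, p)) for rank in range(p)]
```

Block partitioning favours cache locality; cyclic partitioning favours load balance, which is the kind of trade-off the paper measures.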
13

Klilou, Abdessamad, and Assia Arsalane. "Parallel implementation of pulse compression method on a multi-core digital signal processor." International Journal of Electrical and Computer Engineering (IJECE) 10, no. 6 (December 1, 2020): 6541. http://dx.doi.org/10.11591/ijece.v10i6.pp6541-6548.

Abstract:
The pulse compression algorithm is widely used in radar applications. It requires huge processing power in order to be executed in real time; therefore, its processing must be distributed across multiple processing units. The present paper proposes a real-time platform based on the multi-core digital signal processor (DSP) C6678 from Texas Instruments (TI). The objective of this paper is the optimization of the parallel implementation of the pulse compression algorithm over the eight cores of the C6678 DSP. Two parallelization approaches were implemented. The first approach is based on the Open Multi-Processing (OpenMP) programming interface, a software interface that helps execute different sections of a program on a multi-core processor. The second approach is an optimized method that we have proposed in order to distribute the processing and synchronize the eight cores of the C6678 DSP. The proposed method gives the best performance: a parallel efficiency of 94% was obtained when all eight cores were activated.
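Pulse compression is essentially cross-correlation of each received echo with the transmitted pulse, and independent echoes can be farmed out to separate cores. A minimal sketch in plain Python with a thread pool (not the C6678 DSP implementation, which works on complex samples via FFTs):

```python
from concurrent.futures import ThreadPoolExecutor

def matched_filter(echo, pulse):
    # Pulse compression: cross-correlate the received echo with the
    # transmitted pulse; the correlation peak marks the target's delay.
    return [sum(echo[i + j] * pulse[j] for j in range(len(pulse)))
            for i in range(len(echo) - len(pulse) + 1)]

def compress_all(echoes, pulse, workers=8):
    # Echoes are independent, so they can be distributed across workers,
    # much as the paper distributes them across the DSP's eight cores.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda e: matched_filter(e, pulse), echoes))
```

On the DSP, the hard part the paper optimizes is precisely the distribution and synchronization around this embarrassingly parallel core.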
14

Czarnul, Paweł, Jerzy Proficz, and Krzysztof Drypczewski. "Survey of Methodologies, Approaches, and Challenges in Parallel Programming Using High-Performance Computing Systems." Scientific Programming 2020 (January 29, 2020): 1–19. http://dx.doi.org/10.1155/2020/4176794.

Abstract:
This paper provides a review of contemporary methodologies and APIs for parallel programming, with representative technologies selected in terms of target system type (shared memory, distributed, and hybrid), communication patterns (one-sided and two-sided), and programming abstraction level. We analyze representatives in terms of many aspects including programming model, languages, supported platforms, license, optimization goals, ease of programming, debugging, deployment, portability, level of parallelism, constructs enabling parallelism and synchronization, features introduced in recent versions indicating trends, support for hybridity in parallel execution, and disadvantages. Such detailed analysis has led us to the identification of trends in high-performance computing and of the challenges to be addressed in the near future. It can help to shape future versions of programming standards, select technologies best matching programmers’ needs, and avoid potential difficulties while using high-performance computing systems.
15

Amela, Ramon, Cristian Ramon-Cortes, Jorge Ejarque, Javier Conejero, and Rosa M. Badia. "Executing linear algebra kernels in heterogeneous distributed infrastructures with PyCOMPSs." Oil & Gas Science and Technology – Revue d’IFP Energies nouvelles 73 (2018): 47. http://dx.doi.org/10.2516/ogst/2018047.

Abstract:
Python is a popular programming language due to the simplicity of its syntax, while still achieving good performance even though it is an interpreted language. Its adoption by multiple scientific communities has led to the emergence of a large number of libraries and modules, which has helped put Python at the top of the list of programming languages [1]. Task-based programming has been proposed in recent years as an alternative parallel programming model. PyCOMPSs follows this approach for Python, and this paper presents its extensions to combine task-based parallelism and thread-level parallelism. We also present how PyCOMPSs has been adapted to support heterogeneous architectures, including the Xeon Phi and GPUs. Results obtained with linear algebra benchmarks demonstrate that significant performance can be obtained with a few lines of Python.
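The task-based pattern PyCOMPSs automates can be sketched with standard-library futures; this shows the general idea only, not PyCOMPSs' actual API or its distributed scheduling:

```python
from concurrent.futures import ThreadPoolExecutor

def task_based_sum(data, block_size=4):
    # Chunk the data, launch one task per block, then reduce the partial
    # results. PyCOMPSs builds and schedules such a task graph automatically
    # from annotated functions, potentially across distributed hosts.
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(sum, block) for block in blocks]
        return sum(f.result() for f in futures)
```

The linear algebra kernels in the paper follow the same shape: block the matrices, spawn a task per block operation, and let the runtime resolve the dependencies.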
16

Jones, Jeff. "Mechanisms Inducing Parallel Computation in a Model of Physarum polycephalum Transport Networks." Parallel Processing Letters 25, no. 01 (March 2015): 1540004. http://dx.doi.org/10.1142/s0129626415400046.

Abstract:
The giant amoeboid organism true slime mould Physarum polycephalum dynamically adapts its body plan in response to changing environmental conditions and its protoplasmic transport network is used to distribute nutrients within the organism. These networks are efficient in terms of network length and network resilience and are parallel approximations of a range of proximity graphs and plane division problems. The complex parallel distributed computation exhibited by this simple organism has since served as an inspiration for intensive research into distributed computing and robotics within the last decade. P. polycephalum may be considered as a spatially represented parallel unconventional computing substrate, but how can this ‘computer’ be programmed? In this paper we examine and catalogue individual low-level mechanisms which may be used to induce network formation and adaptation in a multi-agent model of P. polycephalum. These mechanisms include those intrinsic to the model (particle sensor angle, rotation angle, and scaling parameters) and those mediated by the environment (stimulus location, distance, angle, concentration, engulfment and consumption of nutrients, and the presence of simulated light irradiation, repellents and obstacles). The mechanisms induce a concurrent integration of chemoattractant and chemorepellent gradients diffusing within the 2D lattice upon which the agent population resides, stimulating growth, movement, morphological adaptation and network minimisation. Chemoattractant gradients, and their modulation by the engulfment and consumption of nutrients by the model population, represent an efficient outsourcing of spatial computation. 
The mechanisms may prove useful in understanding the search strategies and adaptation of distributed organisms within their environment, in understanding the minimal requirements for complex adaptive behaviours, and in developing methods of spatially programming parallel unconventional computers and robotic devices.
17

BAHGAT, REEM, OSAMA MOSTAFA, and GEORGE A. PAPADOPOULOS. "CONCURRENT ABDUCTIVE LOGIC PROGRAMMING IN PANDORA." International Journal on Artificial Intelligence Tools 10, no. 03 (September 2001): 387–406. http://dx.doi.org/10.1142/s021821300100057x.

Abstract:
The extension of logic programming with abduction (ALP) allows a form of hypothetical reasoning. The advantages of abduction lie in the ability to reason with incomplete information and the enhancement of the declarative representation of problems. On the other hand, concurrent logic programming is a framework which explores AND-parallelism and/or OR-parallelism in logic programs in order to efficiently execute them on multi-processor / distributed machines. The aim of our work is to study a way to model abduction within the framework of concurrent logic programming, thus taking advantage of the latter's potential for parallel and/or distributed execution. In particular, we describe Abductive Pandora, a syntactic sugar on top of the concurrent logic programming language Pandora, which provides the user with an abductive behavior for a concurrent logic program. Abductive Pandora programs are then transformed into Pandora programs which support the concurrent abductive behavior through a simple programming technique while at the same time taking advantage of the underlying Pandora machine infrastructure.
18

Nielsen, Ida M. B., and Curtis L. Janssen. "Multicore Challenges and Benefits for High Performance Scientific Computing." Scientific Programming 16, no. 4 (2008): 277–85. http://dx.doi.org/10.1155/2008/450818.

Abstract:
Until recently, performance gains in processors were achieved largely by improvements in clock speeds and instruction level parallelism. Thus, applications could obtain performance increases with relatively minor changes by upgrading to the latest generation of computing hardware. Currently, however, processor performance improvements are realized by using multicore technology and hardware support for multiple threads within each core, and taking full advantage of this technology to improve the performance of applications requires exposure of extreme levels of software parallelism. We will here discuss the architecture of parallel computers constructed from many multicore chips as well as techniques for managing the complexity of programming such computers, including the hybrid message-passing/multi-threading programming model. We will illustrate these ideas with a hybrid distributed memory matrix multiply and a quantum chemistry algorithm for energy computation using Møller–Plesset perturbation theory.
19

Huang, Xiaobing, Tian Zhao, and Yu Cao. "PIR." International Journal of Multimedia Data Engineering and Management 5, no. 3 (July 2014): 1–27. http://dx.doi.org/10.4018/ijmdem.2014070101.

Abstract:
Multimedia Information Retrieval (MIR) is a problem domain that includes programming tasks such as salient feature extraction, machine learning, indexing, and retrieval. There are a variety of implementations and algorithms for these tasks in different languages and frameworks, which are difficult to compose and reuse due to the interface and language incompatibility. Due to this low reusability, researchers often have to implement their experiments from scratch and the resulting programs cannot be easily adapted to parallel and distributed executions, which is important for handling large data sets. In this paper, we present Pipeline Information Retrieval (PIR), a Domain Specific Language (DSL) for multi-modal feature manipulation. The goal of PIR is to unify the MIR programming tasks by hiding the programming details under a flexible layer of domain specific interface. PIR optimizes the MIR tasks by compiling the DSL programs into pipeline graphs, which can be executed using a variety of strategies (e.g. sequential, parallel, or distributed execution). The authors evaluated the performance of PIR applications on single machine with multiple cores, local cluster, and Amazon Elastic Compute Cloud (EC2) platform. The result shows that the PIR programs can greatly help MIR researchers and developers perform fast prototyping on single machine environment and achieve nice scalability on distributed platforms.
20

Sapaty, P. S. "A language to comprehend and manage distributed worlds." Mathematical machines and systems 3 (2022): 9–27. http://dx.doi.org/10.34121/1028-9763-2022-3-9-27.

Abstract:
The paper presents and discusses the details of the Spatial Grasp Language (SGL), including its philosophy, methodology, syntax, semantics, and implementation in distributed systems. As a key element of the developed Spatial Grasp Model and Technology, SGL has been used in numerous applications and publications, including seven books. This inspired us to devote the current paper exclusively to the main features of this language and a comparison with other languages, as a tribute to its impact on the Distributed Management Project at the National Academy of Sciences, with strong international participation and support. The comparison with other programming languages shows the high level, simplicity, and compactness of the obtained solutions, which are explained by the fact that SGL operates directly on distributed networked bodies in a holistic, parallel, self-navigating and pattern-matching mode. This effectively turns upside down the established practice and opinion that parallel and distributed computing is essentially more complex than sequential programming. In comparison with specialized battle management languages oriented on command and control of military campaigns, the SGL programming example shows high clarity and efficiency for expressing campaigns with spatial movement and maintenance of the needed resources, also confirming its universal applicability to extended applications in similar areas. The relation of SGL to natural languages is shown, where the globally recursive SGL organization can be directly used for describing and analyzing natural language structures of any volume and complexity, just by adding new rules to the SGL definition. SGL can also effectively substitute for natural languages for a high-level and quick definition of complex spatial problems and their solutions due to its power, compactness, and formula-like recursive nature.
21

Silva, Luis M., JoÃo Gabriel Silva, and Simon Chapple. "Implementation and Performance of DSMPI." Scientific Programming 6, no. 2 (1997): 201–14. http://dx.doi.org/10.1155/1997/452521.

Abstract:
Distributed shared memory (DSM) has been recognized as an alternative programming model to exploit the parallelism in distributed memory systems because it provides a higher level of abstraction than simple message passing. DSM combines the simple programming model of shared memory with the scalability of distributed memory machines. This article presents DSMPI, a parallel library that runs atop MPI and provides a DSM abstraction. It provides an easy-to-use programming interface, is fully portable, and supports heterogeneity. For the sake of flexibility, it supports different coherence protocols and models of consistency. We present performance results taken on a network of workstations and on a Cray T3D which show that DSMPI can be competitive with MPI for some applications.
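The inversion DSM performs, a shared read/write interface implemented by message passing underneath, can be sketched in-process with an owner thread serving requests over queues. This toy ignores DSMPI's coherence protocols, consistency models, and MPI transport entirely:

```python
import queue
import threading

class ToyDSM:
    """Shared read/write facade whose storage is reached only by messages
    to an owner thread -- the pattern DSMPI layers on top of MPI."""
    def __init__(self):
        self._store = {}
        self._requests = queue.Queue()
        threading.Thread(target=self._serve, daemon=True).start()

    def _serve(self):
        # Owner loop: apply each request to the private store, then reply.
        while True:
            op, key, value, reply = self._requests.get()
            if op == "write":
                self._store[key] = value
                reply.put(None)
            else:
                reply.put(self._store.get(key))

    def write(self, key, value):
        reply = queue.Queue()
        self._requests.put(("write", key, value, reply))
        reply.get()  # block until the owner has applied the write

    def read(self, key):
        reply = queue.Queue()
        self._requests.put(("read", key, None, reply))
        return reply.get()
```

Callers see plain `read`/`write` semantics even though every access is really a message exchange, which is the abstraction gain the article describes.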
22

Fan, Wenfei, Tao He, Longbin Lai, Xue Li, Yong Li, Zhao Li, Zhengping Qian, et al. "GraphScope." Proceedings of the VLDB Endowment 14, no. 12 (July 2021): 2879–92. http://dx.doi.org/10.14778/3476311.3476369.

Abstract:
GraphScope is a system and a set of language extensions that enable a new programming interface for large-scale distributed graph computing. It generalizes previous graph processing frameworks (e.g., Pregel, GraphX) and distributed graph databases (e.g., JanusGraph, Neptune) in two important ways: by exposing a unified programming interface to a wide variety of graph computations such as graph traversal, pattern matching, iterative algorithms and graph neural networks within a high-level programming language; and by supporting the seamless integration of a highly optimized graph engine in a general-purpose data-parallel computing system. A GraphScope program is a sequential program composed of declarative data-parallel operators, and can be written using standard Python development tools. The system automatically handles the parallelization and distributed execution of programs on a cluster of machines. It outperforms current state-of-the-art systems by enabling a separate optimization (or family of optimizations) for each graph operation in one carefully designed coherent framework. We describe the design and implementation of GraphScope and evaluate system performance using several real-world applications.
23

Ramon-Cortes, Cristian, Ramon Amela, Jorge Ejarque, Philippe Clauss, and Rosa M. Badia. "AutoParallel: Automatic parallelisation and distributed execution of affine loop nests in Python." International Journal of High Performance Computing Applications 34, no. 6 (July 14, 2020): 659–75. http://dx.doi.org/10.1177/1094342020937050.

Abstract:
Recent improvements in programming languages and models have focused on simplicity and abstraction, leading Python to the top of the list of programming languages. However, there is still room for improvement in shielding users from dealing directly with distributed and parallel computing issues. This paper proposes and evaluates AutoParallel, a Python module to automatically find an appropriate task-based parallelisation of affine loop nests and execute them in parallel in a distributed computing infrastructure. It is based on sequential programming and contains one single annotation (in the form of a Python decorator) so that anyone with intermediate-level programming skills can scale up an application to hundreds of cores. The evaluation demonstrates that AutoParallel goes one step further in easing the development of distributed applications. On the one hand, the programmability evaluation highlights the benefits of using a single Python decorator instead of manually annotating each task and its parameters or, even worse, having to develop the parallel code explicitly (e.g., using OpenMP or MPI). On the other hand, the performance evaluation demonstrates that AutoParallel is capable of automatically generating task-based workflows from sequential Python code while achieving the same performance as manually taskified versions of established state-of-the-art algorithms (i.e., Cholesky, LU, and QR decompositions). Finally, AutoParallel is also capable of automatically building data blocks to increase task granularity, freeing the user from creating the data chunks and re-designing the algorithm. For advanced users, we believe that this feature can be useful as a baseline to design blocked algorithms.
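The single-annotation idea can be conveyed with a hypothetical decorator that farms independent iterations out to a pool; this is a toy stand-in, not AutoParallel's real annotation, which analyses affine loop nests at the source level and emits PyCOMPSs tasks:

```python
from concurrent.futures import ThreadPoolExecutor

def auto_parallel(func):
    # Hypothetical stand-in: apply func to each element of a list in
    # parallel, so the caller keeps writing sequential-looking code.
    def wrapper(items):
        with ThreadPoolExecutor() as pool:
            return list(pool.map(func, items))
    return wrapper

@auto_parallel
def square(x):
    return x * x
```

The appeal measured in the paper is exactly this: one decorator on otherwise sequential code, instead of explicit OpenMP or MPI programming.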
APA, Harvard, Vancouver, ISO, and other styles
24

Yang, Guoshan, Xumin Liu, Wei Zhang, and Zhengxiong Ma. "Multi-stage planning and calculation method of power system considering energy storage selection." Journal of Physics: Conference Series 2814, no. 1 (August 1, 2024): 012037. http://dx.doi.org/10.1088/1742-6596/2814/1/012037.

Full text
Abstract:
Abstract To meet the flexibility needs of new power systems, this paper proposes a multi-stage stochastic generation-transmission-energy storage integrated planning method that considers short-period flexibility demand. The model can accurately reflect the relationship between short-period operation and long-period uncertainty and can correctly evaluate the value of flexibility investment. Based on this, the paper proposes a parallel processing solution that separates the problem into subproblems through column generation and decomposition, enabling simulation calculations on a distributed computing platform for long-period planning. Case calculations demonstrate the solution speed of the proposed method and show that it can effectively solve problems that cannot be handled by existing methods.
APA, Harvard, Vancouver, ISO, and other styles
25

FRÜHWIRTH, THOM. "Parallelism, concurrency and distribution in constraint handling rules: A survey." Theory and Practice of Logic Programming 18, no. 5-6 (May 23, 2018): 759–805. http://dx.doi.org/10.1017/s1471068418000078.

Full text
Abstract:
Abstract Constraint Handling Rules (CHR) is both an effective concurrent declarative programming language and a versatile computational logic formalism. In CHR, guarded reactive rules rewrite a multi-set of constraints. Concurrency is inherent, since rules can be applied to the constraints in parallel. In this comprehensive survey, we give an overview of the concurrent, parallel as well as distributed CHR semantics, standard and more exotic, that have been proposed over the years at various levels of refinement. These semantics range from the abstract to the concrete. They are related by formal soundness results. Their correctness is proven as a correspondence between parallel and sequential computations. On the more practical side, we present common concise example CHR programs that have been widely used in experiments and benchmarks. We review parallel and distributed CHR implementations in software as well as hardware. The experimental results obtained show a parallel speed-up for unmodified sequential CHR programs. The software implementations are available online for free download and we give the web links. Due to its high level of abstraction, the CHR formalism can also be used to implement and analyse models for concurrency. To this end, the Software Transaction Model, the Actor Model, Colored Petri Nets and the Join-Calculus have been faithfully encoded in CHR. Finally, we identify and discuss commonalities of the approaches surveyed and indicate what problems are left open for future research.
APA, Harvard, Vancouver, ISO, and other styles
26

Cook, Sebastien, and Paulo Garcia. "Arbitrarily Parallelizable Code: A Model of Computation Evaluated on a Message-Passing Many-Core System." Computers 11, no. 11 (November 18, 2022): 164. http://dx.doi.org/10.3390/computers11110164.

Full text
Abstract:
The number of processing elements per solution is growing. From embedded devices now employing (often heterogeneous) multi-core processors, across many-core scientific computing platforms, to distributed systems comprising thousands of interconnected processors, parallel programming of one form or another is now the norm. Understanding how to efficiently parallelize code, however, is still an open problem, and the difficulties are exacerbated across heterogeneous processing, and especially at run time, when it is sometimes desirable to change the parallelization strategy to meet non-functional requirements (e.g., load balancing and power consumption). In this article, we investigate the use of a programming model based on series-parallel partial orders: computations are expressed as directed graphs that expose parallelization opportunities and necessary sequencing by construction. This programming model is suitable as an intermediate representation for higher-level languages. We then describe a model of computation for such a programming model that maps such graphs into a stack-based structure more amenable to hardware processing. We describe the formal small-step semantics for this model of computation and use this formal description to show that the model can be arbitrarily parallelized, at compile and runtime, with correct execution guaranteed by design. We empirically support this claim and evaluate parallelization benefits using a prototype open-source compiler, targeting a message-passing many-core simulation. We empirically verify the correctness of arbitrary parallelization, supporting the validity of our formal semantics, analyze the distribution of operations within cores to understand the implementation impact of the paradigm, and assess execution time improvements when five micro-benchmarks are automatically and randomly parallelized across 2 × 2 and 4 × 4 multi-core configurations, resulting in execution time decrease by up to 95% in the best case.
APA, Harvard, Vancouver, ISO, and other styles
27

Finnerty, Patrick, Yoshiki Kawanishi, Tomio Kamada, and Chikara Ohta. "Supercharging the APGAS Programming Model with Relocatable Distributed Collections." Scientific Programming 2022 (September 21, 2022): 1–27. http://dx.doi.org/10.1155/2022/5092422.

Full text
Abstract:
In this article, we present our relocatable distributed collection library. Building on top of the APGAS for Java library, we provide a number of useful intranode parallel patterns as well as the features necessary to support the distributed nature of the computation through clearly identified methods. In particular, the transfer of distributed collections’ entries between processes is supported via an integrated relocation system. This enables dynamic load-balancing capabilities, making it possible for programs to adapt to uneven or evolving cluster performance. The system we developed makes it possible to dynamically control the distribution and the data flow of distributed programs through high-level abstractions. Programmers using our library can, therefore, write complex distributed programs combining computation and communication phases through a consistent API. We evaluate the performance of our library against two programs taken from well-known Java benchmark suites, demonstrating superior programmability and obtaining better performance on one benchmark and reasonable overhead on the second. Finally, we demonstrate the ease and benefits of load balancing on a more complex application, which uses the various features of our library extensively.
APA, Harvard, Vancouver, ISO, and other styles
28

Liu, Gao Ming, Xiao Bo Wang, Wei Song, and Chong Zhan Li. "The Hybrid Parallel Algorithm of Online Verification Based on P2P." Applied Mechanics and Materials 373-375 (August 2013): 1251–55. http://dx.doi.org/10.4028/www.scientific.net/amm.373-375.1251.

Full text
Abstract:
With the gradually expanding scale of large interconnected power systems, especially as UHV synchronous grids are successfully put into operation, traditional centralized computing encounters a hardware computing-power bottleneck. A hybrid parallel algorithm for online checking of protection settings based on P2P is proposed. Peer-to-peer communication using P2P network technology achieves fast inter-regional information exchange. The design of MPI+OpenMP hybrid parallel programming models and algorithms is introduced in detail. Two levels of parallelism in online checking, at the process level and the thread level, are achieved through parallel analysis of the online checking task. Finally, the hybrid parallel algorithm was tested and compared on a P2P-based distributed parallel computing platform. The results show that the proposed algorithm is correct and effective.
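The two-level process/thread decomposition described above can be caricatured in a single-machine sketch. Thread pools stand in for both the MPI ranks (one per grid region) and the OpenMP threads inside each rank, and the pass/fail rule in `check_line` is a placeholder, not a real protection-setting criterion.

```python
from concurrent.futures import ThreadPoolExecutor

def check_line(line_id):
    # Thread-level task: verify one protection setting.
    # Placeholder rule for illustration only.
    return line_id, line_id % 7 != 0

def check_region(lines, threads=4):
    # In the paper, a region is handled by one MPI process; within it,
    # OpenMP threads check the region's lines in parallel. A thread
    # pool stands in for both levels here.
    with ThreadPoolExecutor(threads) as pool:
        return dict(pool.map(check_line, lines))

print(check_region([1, 7, 14, 15]))
```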
APA, Harvard, Vancouver, ISO, and other styles
29

Seki, Kazuhiro, Ryota Jinno, and Kuniaki Uehara. "Parallel Distributed Trajectory Pattern Mining Using Hierarchical Grid with MapReduce." International Journal of Grid and High Performance Computing 5, no. 4 (October 2013): 79–96. http://dx.doi.org/10.4018/ijghpc.2013100106.

Full text
Abstract:
This paper proposes a new approach to trajectory pattern mining, which attempts to discover frequent movement patterns from the trajectories of moving objects. For dealing with a large volume of trajectory data, traditional approaches quantize them by a grid with a fixed resolution. However, an appropriate resolution often varies across different areas of trajectories. Simply increasing the resolution cannot capture broad patterns and consumes unnecessarily large computational resources. To solve the problem, the authors propose a hierarchical grid-based approach with quadtree search. The approach initially searches for frequent patterns with a coarse grid and drills down into a finer grid level to discover more minute patterns. The algorithm is naturally parallelized and implemented in the MapReduce programming model to accelerate the computation. The authors’ evaluative experiments on real-world data show the effectiveness of the authors’ approach in mining complex patterns with lower computational cost than the previous work.
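The hierarchical grid can be encoded as quadtree keys, where extending a key drills a coarse cell into finer sub-cells. This is a minimal sketch of the indexing idea only, not the authors' MapReduce implementation.

```python
def quad_key(x, y, depth, x0=0.0, y0=0.0, size=1.0):
    # Hierarchical grid cell id: one quadrant digit per level (quadtree).
    # A short key addresses a coarse cell; a longer key with the same
    # prefix addresses one of its finer sub-cells, so frequent coarse
    # cells can be drilled into without re-indexing the other points.
    key = ""
    for _ in range(depth):
        size /= 2.0
        qx = x >= x0 + size
        qy = y >= y0 + size
        key += str(int(qx) + 2 * int(qy))
        if qx:
            x0 += size
        if qy:
            y0 += size
    return key

print(quad_key(0.6, 0.6, 1))  # '3'  (coarse grid)
print(quad_key(0.6, 0.6, 2))  # '30' (same prefix, finer cell)
```

Because the coarse key is always a prefix of the fine key, the drill-down step only re-examines trajectories whose coarse cell was frequent.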
APA, Harvard, Vancouver, ISO, and other styles
30

Otto, Steve W. "Parallel Array Classes and Lightweight Sharing Mechanisms." Scientific Programming 2, no. 4 (1993): 203–16. http://dx.doi.org/10.1155/1993/393409.

Full text
Abstract:
We discuss a set of parallel array classes, MetaMP, for distributed-memory architectures. The classes are implemented in C++ and interface to the PVM or Intel NX message-passing systems. An array class implements a partitioned array as a set of objects distributed across the nodes – a "collective" object. Object methods hide the low-level message-passing and implement meaningful array operations. These include transparent guard strips (or sharing regions) that support finite-difference stencils, reductions and multibroadcasts for support of pivoting and row operations, and interpolation/contraction operations for support of multigrid algorithms. The concept of guard strips is generalized to an object implementation of lightweight sharing mechanisms for finite element method (FEM) and particle-in-cell (PIC) algorithms. The sharing is accomplished through the mechanism of weak memory coherence and can be efficiently implemented. The price of the efficient implementation is memory usage and the need to explicitly specify the coherence operations. An intriguing feature of this programming model is that it maps well to both distributed-memory and shared-memory architectures.
APA, Harvard, Vancouver, ISO, and other styles
31

Sahu, Muktikanta, Rupjit Chakraborty, and Gopal Krishna Nayak. "A task-level parallelism approach for process discovery." International Journal of Engineering & Technology 7, no. 4 (September 20, 2018): 2446. http://dx.doi.org/10.14419/ijet.v7i4.14748.

Full text
Abstract:
Building process models from the available data in event logs is the primary objective of process discovery. The Alpha algorithm is one of the popular algorithms for discovering a process model from the event logs in process mining. The steps involved in the Alpha algorithm are computationally rigorous, and this problem is further compounded by exponentially increasing event log data. In this work, we have exploited task parallelism in the Alpha algorithm for process discovery by using the MPI programming model. The proposed work is based on the distributed memory parallelism available in MPI programming for performance improvement. Independent and computationally intensive steps in the Alpha algorithm are identified and task parallelism is exploited. The execution times of serial as well as parallel implementations of the Alpha algorithm are measured and used for calculating the extent of speedup achieved. The maximum and minimum speedups obtained are 3.97x and 3.88x respectively, with an average speedup of 3.94x.
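One of the independent, computationally intensive steps of the Alpha algorithm is extracting the directly-follows relation from the log; a thread-pool sketch of that task decomposition is below (the paper uses MPI processes with distributed memory rather than threads).

```python
from concurrent.futures import ThreadPoolExecutor

def directly_follows(traces):
    # One Alpha-algorithm task: extract the direct-succession
    # relation (a > b) from a chunk of traces.
    pairs = set()
    for trace in traces:
        pairs.update(zip(trace, trace[1:]))
    return pairs

def parallel_directly_follows(log, workers=4):
    # Split the event log into chunks and process them as parallel
    # tasks, then union the partial relations.
    chunks = [log[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(workers) as pool:
        return set().union(*pool.map(directly_follows, chunks))

log = [("a", "b", "c"), ("a", "c", "b")]
print(sorted(parallel_directly_follows(log)))
```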
APA, Harvard, Vancouver, ISO, and other styles
32

KAKULAVARAPU, PRASAD, OLIVIER C. MAQUELIN, JOSÉ NELSON AMARAL, and GUANG R. GAO. "DYNAMIC LOAD BALANCERS FOR A MULTITHREADED MULTIPROCESSOR SYSTEM." Parallel Processing Letters 11, no. 01 (March 2001): 169–84. http://dx.doi.org/10.1142/s0129626401000506.

Full text
Abstract:
Designing multi-processor systems that deliver a reasonable price-performance ratio using off-the-shelf processor and compiler technologies is a major challenge. For an important class of applications, it is critical to exploit fine-grain parallelism to achieve reasonable performance. In such parallel systems it is essential to efficiently manage communication latencies, bandwidth, and synchronization overheads. In this paper we study load balancing strategies for the runtime system of a multi-threaded system. EARTH (Efficient Architecture for Running Threads) is a multi-threaded programming and execution model that supports fine-grain, non-preemptive threads in a distributed memory environment. We describe the design and implementation of a set of dynamic load balancing algorithms, and study their performance in divide-and-conquer, regular, and irregular applications. Our experimental study on the distributed memory multi-processor IBM SP-2 indicates that a randomized load balancer performs as well as, and often better than, history-based load balancers.
APA, Harvard, Vancouver, ISO, and other styles
33

Haveraaen, Magne. "Machine and Collection Abstractions for User-Implemented Data-Parallel Programming." Scientific Programming 8, no. 4 (2000): 231–46. http://dx.doi.org/10.1155/2000/485607.

Full text
Abstract:
Data parallelism has appeared as a fruitful approach to the parallelisation of compute-intensive programs. Data parallelism has the advantage of mimicking the sequential (and deterministic) structure of programs as opposed to task parallelism, where the explicit interaction of processes has to be programmed. In data parallelism data structures, typically collection classes in the form of large arrays, are distributed on the processors of the target parallel machine. Trying to extract distribution aspects from conventional code often runs into problems with a lack of uniformity in the use of the data structures and in the expression of data dependency patterns within the code. Here we propose a framework with two conceptual classes, Machine and Collection. The Machine class abstracts hardware communication and distribution properties. This gives a programmer high-level access to the important parts of the low-level architecture. The Machine class may readily be used in the implementation of a Collection class, giving the programmer full control of the parallel distribution of data, as well as allowing normal sequential implementation of this class. Any program using such a collection class will be parallelisable, without requiring any modification, by choosing between sequential and parallel versions at link time. Experiments with a commercial application, built using the Sophus library which uses this approach to parallelisation, show good parallel speed-ups, without any adaptation of the application program being needed.
APA, Harvard, Vancouver, ISO, and other styles
34

DATTOLO, ANTONINA, and VINCENZO LOIA. "DISTRIBUTED INFORMATION AND CONTROL IN A CONCURRENT HYPERMEDIA-ORIENTED ARCHITECTURE." International Journal of Software Engineering and Knowledge Engineering 10, no. 03 (June 2000): 345–69. http://dx.doi.org/10.1142/s0218194000000158.

Full text
Abstract:
The market for parallel and distributed computing systems keeps growing. Technological advances in processor power, networking, telecommunication and multimedia are stimulating the development of applications requiring parallel and distributed computing. An important research problem in this area is the need to find a robust bridge between the decentralisation of knowledge sources in information-based systems and the distribution of computational power. Consequently, the attention of the research community has been directed towards high-level, concurrent, distributed programming. This work proposes a new hypermedia framework based on the metaphor of the actor model. The storage and run-time layers are represented entirely as communities of independent actors that cooperate in order to accomplish common goals, such as version management or user adaptivity. These goals involve fundamental and complex hypermedia issues, which, thanks to the distribution of tasks, are treated in an efficient and simple way.
APA, Harvard, Vancouver, ISO, and other styles
35

Piotrowski, M., G. A. McGilvary, M. Mewissen, A. D. Lloyd, T. Forster, L. Mitchell, P. Ghazal, J. Hill, and T. M. Sloan. "Exploiting Parallel R in the Cloud with SPRINT." Methods of Information in Medicine 52, no. 01 (2013): 80–90. http://dx.doi.org/10.3414/me11-02-0039.

Full text
Abstract:
Summary Background: Advances in DNA Microarray devices and next-generation massively parallel DNA sequencing platforms have led to an exponential growth in data availability, but the arising opportunities require adequate computing resources. High Performance Computing (HPC) in the Cloud offers an affordable way of meeting this need. Objectives: Bioconductor, a popular tool for high-throughput genomic data analysis, is distributed as add-on modules for the R statistical programming language, but R has no native capabilities for exploiting multiprocessor architectures. SPRINT is an R package that enables easy access to HPC for genomics researchers. This paper investigates: setting up and running SPRINT-enabled genomic analyses on Amazon’s Elastic Compute Cloud (EC2), the advantages of submitting applications to EC2 from different parts of the world and, if resource underutilization can improve application performance. Methods: The SPRINT parallel implementations of correlation, permutation testing, partitioning around medoids and the multi-purpose papply have been benchmarked on data sets of various sizes on Amazon EC2. Jobs have been submitted from both the UK and Thailand to investigate monetary differences. Results: It is possible to obtain good, scalable performance but the level of improvement is dependent upon the nature of the algorithm. Resource underutilization can further improve the time to result. End-user’s location impacts on costs due to factors such as local taxation. Conclusions: Although not designed to satisfy HPC requirements, Amazon EC2 and cloud computing in general provide an interesting alternative and new possibilities for smaller organisations with limited funds.
APA, Harvard, Vancouver, ISO, and other styles
36

Huh, Joonmoo, and Deokwoo Lee. "Effective On-Chip Communication for Message Passing Programs on Multi-Core Processors." Electronics 10, no. 21 (November 3, 2021): 2681. http://dx.doi.org/10.3390/electronics10212681.

Full text
Abstract:
Shared memory is the most popular parallel programming model for multi-core processors, while message passing is generally used for large distributed machines. However, as the number of cores on a chip increases, the relative merits of shared memory versus message passing change, and we argue that message passing becomes a viable, high-performing parallel programming model. To demonstrate this hypothesis, we compare a shared memory architecture with a new message passing architecture on a suite of applications tuned for each system independently. Perhaps surprisingly, the fundamental behaviors of the applications studied in this work, when optimized for both models, are very similar to each other, and both could execute efficiently on multicore architectures despite many implementations being different from each other. Furthermore, if hardware is tuned to support message passing by supporting bulk message transfer and the elimination of unnecessary coherence overheads, and if effective support is available for global operations, then some applications would perform much better on a message passing architecture. Leveraging our insights, we design a message passing architecture that supports both memory-to-memory and cache-to-cache messaging in hardware. With the new architecture, message passing is able to outperform its shared memory counterparts on many of the applications due to the unique advantages of the message passing hardware as compared to cache coherence. In the best case, message passing achieves up to a 34% increase in speed over its shared memory counterpart, and it achieves an average 10% increase in speed. In the worst case, message passing is slowed down in two applications, CG (conjugate gradient) and FT (Fourier transform), because it could not handle their unique data sharing patterns as well as its shared memory counterpart.
Overall, our analysis demonstrates the importance of considering message passing as a high performing and hardware-supported programming model on future multicore architectures.
APA, Harvard, Vancouver, ISO, and other styles
37

Sujit R Wakchaure, Et al. "MR-AT: Map Reduce based Apriori Technique for Sequential Pattern Mining using Big Data in Hadoop." International Journal on Recent and Innovation Trends in Computing and Communication 11, no. 9 (November 5, 2023): 4258–67. http://dx.doi.org/10.17762/ijritcc.v11i9.9877.

Full text
Abstract:
One of the most well-known and widely implemented data mining methods is the Apriori algorithm, which mines frequent item sets. The effectiveness of the Apriori algorithm has been improved by a number of algorithms introduced on both parallel and distributed platforms in recent years. They are distinguished from one another by the method of load balancing, memory system, method of data decomposition, and data layout used in their implementation. The majority of the issues that arise with distributed frameworks are associated with the operating costs of managing distributed systems and the absence of high-level parallel programming languages. In addition, when using grid computing, there is always a possibility that a node will fail, causing the task to be re-executed multiple times. The MapReduce approach developed by Google can be used to solve these kinds of issues. MapReduce is a programming model applied to large-scale distributed processing of data on large clusters of commodity computers; it is effective, scalable, and easy to use, and it is also utilised in cloud computing. This research paper presents an enhanced version of the Apriori algorithm, referred to as Improved Parallel and Distributed Apriori (IPDA). It is based on Hadoop MapReduce, a scalable environment for analysing Big Data. Through the regional generation of split-frequent data and the early elimination of infrequent data, the primary objective of the proposed work is to reduce the enormous demands placed on available resources as well as the communication overhead incurred whenever frequent data are retrieved.
The paper presents test results demonstrating that IPDA outperforms traditional Apriori and parallel and distributed Apriori in terms of the time required and the number of rules created across various minimum support values.
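The map/reduce split behind MapReduce-based Apriori can be sketched for one candidate-counting pass. This is a minimal single-process illustration of the counting step only, not the IPDA algorithm itself.

```python
from collections import Counter
from itertools import combinations

def map_phase(transactions, k):
    # Map: emit (candidate itemset, 1) for every size-k itemset found
    # in a transaction; in MapReduce, mappers run on separate splits.
    for t in transactions:
        for itemset in combinations(sorted(t), k):
            yield itemset, 1

def reduce_phase(pairs, min_support):
    # Reduce: sum counts per candidate and keep only frequent itemsets.
    counts = Counter()
    for itemset, c in pairs:
        counts[itemset] += c
    return {i: n for i, n in counts.items() if n >= min_support}

transactions = [{"a", "b"}, {"a", "c"}, {"a", "b"}]
print(reduce_phase(map_phase(transactions, 2), 2))  # {('a', 'b'): 2}
```

A full Apriori pass would iterate this pair of phases for increasing k, pruning candidates whose subsets were infrequent at the previous level.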
APA, Harvard, Vancouver, ISO, and other styles
38

Cieplak, Tomasz, Tomasz Rymarczyk, and Grzegorz Kłosowski. "USING MICROSERVICES ARCHITECTURE AS ANALYTICAL SYSTEM FOR ELECTRICAL IMPEDANCE TOMOGRAPHY IMAGING." Informatics Control Measurement in Economy and Environment Protection 8, no. 1 (February 28, 2018): 52–55. http://dx.doi.org/10.5604/01.3001.0010.8652.

Full text
Abstract:
Image reconstruction using the EIT method has been found useful in many areas of medical, industrial and environmental applications. Prior work shows that computational systems used for image reconstruction employ parallel and distributed computation with multi-tier as well as monolithic architectures. The aim of our research is to define an analytical system architecture that can combine a variety of image reconstruction algorithms with their representations in different programming languages. Based on examples described in various proceedings and research papers, a microservices architecture appears to be an interesting alternative to the monolithic one.
APA, Harvard, Vancouver, ISO, and other styles
39

Liu, Zonglin, and Olaf Stursberg. "Distributed control of networked systems with coupling constraints." at - Automatisierungstechnik 67, no. 12 (November 18, 2019): 1007–18. http://dx.doi.org/10.1515/auto-2019-0085.

Full text
Abstract:
Abstract This paper proposes algorithms for the distributed solution of control problems for networked systems with coupling constraints. This type of problem is practically relevant, e.g., for subsystems which share common resources, or need to go through a bottleneck, while considering non-convex state constraints. Centralized solution schemes, which typically first cast the non-convexities into mixed-integer formulations that are then solved by mixed-integer programming, suffer from high computational complexity for larger numbers of subsystems. The distributed solution proposed in this paper decomposes the centralized problem into a set of small subproblems to be solved in parallel. By iterating over the subproblems and exchanging information either among all subsystems, or within subsets selected by a coordinator, locally optimal solutions of the global problem are determined. The paper shows for two instances of distributed algorithms that feasibility as well as continuous cost reduction over the iterations up to termination can be guaranteed, while the solution times are considerably shorter than for the centralized problem. These properties are illustrated for a multi-vehicle motion problem.
APA, Harvard, Vancouver, ISO, and other styles
40

Pryadko, S. A., A. S. Krutogolova, A. S. Uglyanitsa, and A. E. Ivanov. "Multi-core processors use for numerical problems solutions." Radio industry (Russia) 30, no. 4 (December 23, 2020): 98–105. http://dx.doi.org/10.21778/2413-9599-2020-30-4-98-105.

Full text
Abstract:
Problem statement. The use of programming technologies on modern multicore systems is an integral part of any enterprise whose activities involve multitasking or the need to perform a large number of calculations within a certain time. The article discusses the development of such technologies aimed at increasing the speed of solving various problems, for example, numerical modeling. Objective. To search for alternative ways of increasing calculation speed by increasing the number of processors. As an example of how calculation speed scales with the number of processors, the well-known heat-transfer equation is taken, and classical numerical schemes for its solution are given. The use of explicit and implicit schemes is compared, including their suitability for parallelization of calculations. Results. The article describes systems with shared and distributed memory, their possible use for solving various problems, and recommendations for their use. Practical implications. Parallel computing helps to solve many problems in various fields, as it reduces the time required to solve partial differential equations.
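The explicit scheme mentioned above is the textbook case of an easily parallelized update; a minimal 1-D sketch using the standard FTCS discretization (not code from the article) shows why.

```python
def heat_step(u, r=0.25):
    # One explicit (FTCS) time step for the 1-D heat equation with
    # fixed boundary values; r = alpha*dt/dx**2 must satisfy r <= 0.5
    # for stability. Every interior point depends only on the previous
    # time level, so all updates are independent and can run in
    # parallel -- unlike implicit schemes, whose linear solve couples
    # the unknowns.
    return [u[0]] + [
        u[i] + r * (u[i - 1] - 2 * u[i] + u[i + 1])
        for i in range(1, len(u) - 1)
    ] + [u[-1]]

print(heat_step([0.0, 1.0, 0.0]))  # [0.0, 0.5, 0.0]
```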
APA, Harvard, Vancouver, ISO, and other styles
41

Mochurad, L. I., and M. V. Mamchur. "PARALLEL AND DISTRIBUTED COMPUTING TECHNOLOGIES FOR AUTONOMOUS VEHICLE NAVIGATION." Radio Electronics, Computer Science, Control, no. 4 (January 3, 2024): 111. http://dx.doi.org/10.15588/1607-3274-2023-4-11.

Full text
Abstract:
Context. Autonomous vehicles are becoming increasingly popular, and one of the important modern challenges in their development is ensuring their effective navigation in space and movement within designated lanes. This paper examines a method of spatial orientation for vehicles using computer vision and artificial neural networks. The research focused on the navigation system of an autonomous vehicle, which incorporates the use of modern distributed and parallel computing technologies. Objective. The aim of this work is to enhance modern autonomous vehicle navigation algorithms through parallel training of artificial neural networks and to determine the optimal combination of technologies and nodes of devices to increase speed and enable real-time decision-making capabilities in spatial navigation for autonomous vehicles. Method. The research establishes that the utilization of computer vision and neural networks for road lane segmentation proves to be an effective method for spatial orientation of autonomous vehicles. For multi-core computing systems, the application of parallel programming technology, OpenMP, for neural network training on processors with varying numbers of parallel threads increases the algorithm’s execution speed. However, the use of CUDA technology for neural network training on a graphics processing unit significantly enhances prediction speeds compared to OpenMP. Additionally, the feasibility of employing PyTorch Distributed Data Parallel (DDP) technology for training the neural network across multiple graphics processing units (nodes) simultaneously was explored. This approach further improved prediction execution times compared to using a single graphics processing unit. Results. 
An algorithm for training and prediction of an artificial neural network was developed using two independent nodes, each equipped with separate graphics processing units, and their synchronization for exchanging training results after each epoch, employing PyTorch Distributed Data Parallel (DDP) technology. This approach allows for scalable computations across a higher number of resources, significantly expediting the model training process. Conclusions. The conducted experiments have affirmed the effectiveness of the proposed algorithm, warranting the recommendation of this research for further advancement in autonomous vehicles and enhancement of their navigational capabilities. Notably, the research outcomes can find applications in various domains, encompassing automotive manufacturing, logistics, and urban transportation infrastructure. The obtained results are expected to assist future researchers in understanding the most efficient hardware and software resources to employ for implementing AI-based navigation systems in autonomous vehicles. Prospects for future investigations may encompass refining the accuracy of the proposed parallel algorithm without compromising its efficiency metrics. Furthermore, there is potential for experimental exploration of the proposed algorithm in more intricate practical scenarios of diverse nature and dimensions.
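The per-epoch synchronization described above is, at its core, a gradient all-reduce. A toy stand-in in plain Python simulates what PyTorch DDP performs with NCCL all-reduce across nodes, so the idea stays runnable without GPUs.

```python
def all_reduce_mean(grads_per_worker):
    # Average the gradients each node computed on its own data shard.
    # PyTorch DDP does this after every backward pass via NCCL
    # all-reduce; here the "nodes" are just lists of per-parameter
    # gradients.
    n = len(grads_per_worker)
    return [sum(g) / n for g in zip(*grads_per_worker)]

# Two simulated nodes, each with gradients for two parameters:
print(all_reduce_mean([[1.0, 2.0], [3.0, 4.0]]))  # [2.0, 3.0]
```

After the averaged gradients are applied, every node holds identical model weights, which is what lets DDP scale training across devices without diverging replicas.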
APA, Harvard, Vancouver, ISO, and other styles
42

Piña, Johan S., Simon Orozco-Arias, Nicolas Tobón-Orozco, Leonardo Camargo-Forero, Reinel Tabares-Soto, and Romain Guyot. "G-SAIP: Graphical Sequence Alignment Through Parallel Programming in the Post-Genomic Era." Evolutionary Bioinformatics 19 (January 2023): 117693432211505. http://dx.doi.org/10.1177/11769343221150585.

Full text
Abstract:
A common task in bioinformatics is to compare DNA sequences to identify similarities between organisms at the sequence level. One approach to such comparison is the dot-plot, a 2-dimensional graphical representation for analyzing DNA or protein alignments. Dot-plot alignment software existed before the sequencing revolution, and there is now an ongoing limitation when dealing with large sequences, resulting in very long execution times. High-Performance Computing (HPC) techniques have been successfully used in many applications to reduce computing times, but so far, very few applications for graphical sequence alignment using HPC have been reported. Here, we present G-SAIP (Graphical Sequence Alignment in Parallel), a software capable of spawning multiple distributed processes on CPUs over a supercomputing infrastructure, speeding up dot-plot generation by up to 1.68× compared with the current fastest tools. This improves the efficiency of comparative structural genomic analysis and phylogenetics, which benefit from pairwise genome alignments, as well as repetitive structure identification and assembly quality checking.
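Dot-plot rows are mutually independent, which is what makes the computation embarrassingly parallel. A minimal sketch follows, with a thread pool standing in for the distributed CPU processes G-SAIP spawns over a cluster.

```python
from concurrent.futures import ThreadPoolExecutor

def dot_plot_row(args):
    # One dot-plot row: mark matches of residue i of seq_a against
    # every residue of seq_b.
    i, seq_a, seq_b = args
    return [1 if seq_a[i] == c else 0 for c in seq_b]

def dot_plot(seq_a, seq_b, workers=4):
    # Rows are independent, so they can be distributed freely; a
    # thread pool stands in for G-SAIP's supercomputing processes.
    tasks = [(i, seq_a, seq_b) for i in range(len(seq_a))]
    with ThreadPoolExecutor(workers) as pool:
        return list(pool.map(dot_plot_row, tasks))

print(dot_plot("GATT", "GAT"))
```

Real tools additionally window and filter the matrix to suppress noise; this sketch shows only the exact-match matrix whose rows parallelize.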
APA, Harvard, Vancouver, ISO, and other styles
43

He, Yueshun, Wei Zhang, Ping Du, and Qiaohe Yang. "A Novel Strategy for Retrieving Large Scale Scene Images Based on Emotional Feature Clustering." International Journal of Pattern Recognition and Artificial Intelligence 34, no. 08 (November 14, 2019): 2054019. http://dx.doi.org/10.1142/s0218001420540191.

Full text
Abstract:
Because of their complex data structure, images can convey rich information, and so they are widely used in many fields. Although images offer great convenience, handling such data consumes considerable time and multi-dimensional storage space, a disadvantage that is especially evident when users need to retrieve images from large-scale image datasets. To retrieve large-scale image data effectively, a scene-image retrieval strategy based on the MapReduce parallel programming model is proposed. The proposed strategy first investigates how to effectively store large-scale scene images under a Hadoop cluster parallel processing architecture. Second, a distributed MeanShift feature-clustering algorithm is introduced to cluster the emotional features of scene images. Finally, several experiments are conducted to verify the effectiveness and efficiency of the proposed strategy in terms of retrieval accuracy, speedup ratio, efficiency, and data scalability.
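As a rough illustration of the clustering step, here is a minimal single-machine mean-shift on 1-D feature values. The paper's version is distributed over MapReduce and operates on image emotion features, so everything below (data, bandwidth, mode merging) is an assumption for demonstration only:

```python
# Compact mean-shift sketch with a flat kernel: each point is repeatedly
# shifted to the mean of its neighbours within a bandwidth; points that end
# up at the same fixed point belong to the same cluster.

def mean_shift(points, bandwidth=1.0, iters=50):
    modes = list(points)
    for _ in range(iters):
        modes = [
            sum(q for q in points if abs(q - m) <= bandwidth)
            / max(1, sum(1 for q in points if abs(q - m) <= bandwidth))
            for m in modes
        ]
    # Collapse near-identical modes into cluster centres.
    centres = []
    for m in sorted(modes):
        if not centres or abs(m - centres[-1]) > bandwidth / 2:
            centres.append(round(m, 3))
    return centres

centres = mean_shift([1.0, 1.2, 0.9, 5.0, 5.1, 4.8])
```

In a MapReduce setting, the neighbour sums in the inner loop are exactly the quantities a mapper can compute per data partition and a reducer can aggregate.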
APA, Harvard, Vancouver, ISO, and other styles
44

Cutler, Joseph W., Christopher Watson, Emeka Nkurumeh, Phillip Hilliard, Harrison Goldstein, Caleb Stanford, and Benjamin C. Pierce. "Stream Types." Proceedings of the ACM on Programming Languages 8, PLDI (June 20, 2024): 1412–36. http://dx.doi.org/10.1145/3656434.

Full text
Abstract:
We propose a rich foundational theory of typed data streams and stream transformers, motivated by two high-level goals. First, the type of a stream should be able to express complex sequential patterns of events over time. And second, it should describe the internal parallel structure of the stream, to support deterministic stream processing on parallel and distributed systems. To these ends, we introduce stream types, with operators capturing sequential composition, parallel composition, and iteration, plus a core calculus λST of transformers over typed streams that naturally supports a number of common streaming idioms, including punctuation, windowing, and parallel partitioning, as first-class constructions. λST exploits a Curry-Howard-like correspondence with an ordered variant of the Logic of Bunched Implication to program with streams compositionally and uses Brzozowski-style derivatives to enable an incremental, prefix-based operational semantics. To illustrate the programming style supported by the rich types of λST, we present a number of examples written in Delta, a prototype high-level language design based on λST.
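Of the streaming idioms the abstract lists, windowing is easy to illustrate outside the typed calculus. This plain-Python generator sketch shows a tumbling (fixed-size) window and makes no claim about λST's actual typing or semantics:

```python
# Tumbling-window transformer: group an event stream into fixed-size batches,
# emitting each full batch as one downstream event and flushing any remainder.

def tumbling_window(stream, size):
    window = []
    for event in stream:
        window.append(event)
        if len(window) == size:
            yield tuple(window)
            window = []
    if window:                # flush the final partial window
        yield tuple(window)

batches = list(tumbling_window(range(7), 3))
```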
APA, Harvard, Vancouver, ISO, and other styles
45

Masood Abdulqader, Dildar, Subhi R. M. Zeebaree, Rizgar R. Zebari, Mohammed A. M.Sadeeq, Umed H. Jader, and Mohammed Mahmood Delzy. "Parallel Processing Distributed-Memory Approach Influences on Performance of Multicomputer-Multicore Systems Using Single-Process Single-Thread." Wasit Journal of Engineering Sciences 12, no. 1 (January 5, 2024): 30–40. http://dx.doi.org/10.31185/ejuow.vol12.iss1.533.

Full text
Abstract:
Based on client/server architecture concepts, this research suggests a method for creating a multicomputer-multicore distributed-memory system that can be implemented on distributed-shared-memory systems. The specific design and its implementation depend on both the number of participating computers and the number of processors in each of them. The suggested system has two primary phases: monitoring and managing the programmes that may be executed on multiple distributed multi-core architectures with 2, 4, and 8 CPUs to perform a given job. The network may contain a single client and an unlimited number of servers. The implementation phase relies on three separate scenarios covering most of the design space. The suggested system can determine the start time, duration, CPU use, kernel time, user time, waiting time, and end time for each server in the system. Single-Process Single-Thread (SPST) is considered a possible situation while developing User Programmes (UPs). The findings confirmed that more processing power (more servers and more processors on each server) increases the speed at which tasks can be solved: a 2.877-fold gain in task processing speed was observed across three different SPST UP scenarios. The system is implemented in the C# programming language.
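The per-server quantities the system reports (user time, kernel time, duration) can be sampled in Python via os.times(). This sketch, with an illustrative CPU-bound workload, is not the paper's C# implementation:

```python
# Measure user-mode CPU time, kernel-mode CPU time, and wall-clock duration
# around a task, using os.times() and time.perf_counter().

import os
import time

def measure(fn):
    t0, w0 = os.times(), time.perf_counter()
    fn()
    t1, w1 = os.times(), time.perf_counter()
    return {
        "user": t1.user - t0.user,       # CPU time spent in user mode
        "kernel": t1.system - t0.system, # CPU time spent in kernel mode
        "duration": w1 - w0,             # wall-clock duration
    }

stats = measure(lambda: sum(i * i for i in range(200_000)))
```

A server process would collect such a record per job and ship it back to the client for aggregation.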
APA, Harvard, Vancouver, ISO, and other styles
46

Skoneczny, Szymon. "Cellular automata-based modelling and simulation of biofilm structure on multi-core computers." Water Science and Technology 72, no. 11 (August 14, 2015): 2071–81. http://dx.doi.org/10.2166/wst.2015.426.

Full text
Abstract:
The article presents a mathematical model of biofilm growth for aerobic biodegradation of a toxic carbonaceous substrate. Modelling of biofilm growth has fundamental significance in numerous processes of biotechnology and in the mathematical modelling of bioreactors. A process following double-substrate kinetics with substrate inhibition and proceeding in a biofilm has not previously been modelled by means of cellular automata. Each process in the proposed model, i.e. diffusion of substrates, uptake of substrates, growth and decay of microorganisms, and biofilm detachment, is simulated in a discrete manner. It was shown that for a flat biofilm of constant thickness, the results of the presented model agree with those of a continuous model. The primary outcome of the study was to propose a mathematical model of biofilm growth; however, a considerable amount of focus was also placed on the development of efficient algorithms for its solution. Two parallel algorithms were created, differing in the way computations are distributed. Computer programs were created using the OpenMP Application Programming Interface for the C++ programming language. Simulations of biofilm growth were performed on three high-performance computers, and the speed-up coefficients of the computer programs were compared. Both algorithms enabled a significant reduction of computation time. This is important, inter alia, in the modelling and simulation of bioreactor dynamics.
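The parallel-for decomposition such OpenMP programs rely on can be sketched in Python: one explicit diffusion step over a 1-D substrate profile, with the index range split into static chunks the way `#pragma omp parallel for` would split it. Grid values and the diffusion coefficient are illustrative:

```python
# One explicit diffusion step on a 1-D concentration profile, decomposed into
# independent index chunks (static scheduling). Each chunk reads only the old
# array, so chunks could run on separate threads without races.

def diffuse_chunk(u, lo, hi, d=0.2):
    """New values for interior cells lo..hi-1 (each chunk is independent)."""
    return [u[i] + d * (u[i - 1] - 2 * u[i] + u[i + 1]) for i in range(lo, hi)]

def diffuse_step(u, n_chunks=2):
    interior = range(1, len(u) - 1)
    size = -(-len(interior) // n_chunks)  # ceil division, static scheduling
    chunks = [(1 + k * size, min(1 + (k + 1) * size, len(u) - 1))
              for k in range(n_chunks)]
    new_interior = [v for lo, hi in chunks for v in diffuse_chunk(u, lo, hi)]
    return [u[0]] + new_interior + [u[-1]]  # fixed boundary values

u = diffuse_step([1.0, 0.0, 0.0, 0.0, 1.0])
```

Writing into a fresh array rather than updating in place is what makes the chunks embarrassingly parallel, the same property the cellular-automata update exploits.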
APA, Harvard, Vancouver, ISO, and other styles
47

Kolumbet, Vadim, and Olha Svynchuk. "MULTIAGENT METHODS OF MANAGEMENT OF DISTRIBUTED COMPUTING IN HYBRID CLUSTERS." Advanced Information Systems 6, no. 1 (April 6, 2022): 32–36. http://dx.doi.org/10.20998/2522-9052.2022.1.05.

Full text
Abstract:
Modern information technologies include the use of server systems, virtualization technologies, communication tools for distributed computing, and software and hardware solutions for data processing and storage centers. Among the most effective such complexes for managing heterogeneous computing resources are hybrid Grids: a distributed computing infrastructure that combines resources of different types and provides collective access to and sharing of these resources. The article considers a multi-agent system that integrates computation-management approaches for a computational cluster Grid system whose nodes have a complex hybrid structure. The hybrid cluster includes computing modules that support different parallel programming technologies and differ in their computational characteristics. The novelty and practical significance of the methods and tools presented in the article lie in a significant increase in the functionality of the Grid cluster computing-management system for distributing and dividing Grid resources across different levels of tasks, and in the ability to embed intelligent computation-management tools in problem-oriented applications. The use of multi-agent systems for task planning in Grid systems addresses two main problems: scalability and adaptability. The methods and techniques used today do not sufficiently solve these complex problems. Thus, improving the effectiveness of methods and tools for managing problem-oriented distributed computing in a cluster Grid system, integrated with traditional meta-schedulers and the local resource managers of Grid nodes, is a scientific task that aligns with the trends of scalability and adaptability.
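One small piece of such resource brokering, greedy assignment of tasks to heterogeneous nodes by earliest finish time, can be sketched as follows. Node names, speeds, and task costs are invented for illustration and do not reflect the paper's multi-agent protocol:

```python
# Greedy list scheduling over heterogeneous nodes: each task (largest first)
# goes to the node that can finish it earliest, given per-node speed factors.

def schedule(tasks, speeds):
    finish = {node: 0.0 for node in speeds}          # busy-until time per node
    placement = {}
    for task, cost in sorted(tasks.items(), key=lambda kv: -kv[1]):
        node = min(finish, key=lambda n: finish[n] + cost / speeds[n])
        finish[node] += cost / speeds[node]
        placement[task] = node
    return placement, max(finish.values())           # mapping and makespan

plan, makespan = schedule(
    {"t1": 4.0, "t2": 2.0, "t3": 2.0},               # task -> work units
    {"cpu-node": 1.0, "gpu-node": 2.0},              # node -> speed factor
)
```

An agent-based scheduler replaces the global `finish` table with bids from node agents, but the earliest-finish-time criterion is the same.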
APA, Harvard, Vancouver, ISO, and other styles
48

Lima, Lícia S. C., Tiago A. O. Demay, Leonardo N. Rosa, André Filipe M. Batista, and Luciano Silva. "Enhancing Supercomputing Education through a Low-Cost Cluster: A Case Study at Insper." International Journal of Computer Architecture Education 12, no. 2 (December 1, 2023): 11–19. http://dx.doi.org/10.5753/ijcae.2023.4824.

Full text
Abstract:
High-Performance Computing (HPC) and parallel programming present intricate challenges due to the sophisticated interplay between advanced hardware and software components. This paper delineates a case study of a cost-effective cluster comprising 24 Upboards engineered to bolster a project-based Supercomputing course. The project, named UpCluster, serves as a pragmatic, cost-efficient platform for experiential learning, mitigating the abstraction often associated with theoretical constructs. The curriculum encompasses various topics, including distributed computing, parallel computing, algorithm analysis, and the Message Passing Interface (MPI). The team meticulously documented the cluster infrastructure, providing a comprehensive guide for the configuration and utilization of the Single Board Computer cluster with Kubernetes and MPI operators. Students engaged in practical experimentation, developing scalable algorithms and gaining valuable insights into the challenges and opportunities associated with distributed computing. These experiences fostered a deeper appreciation for the complexities and potential of distributed computing. The primary objective of this study is to demonstrate the efficacy of the cost-effective cluster in augmenting high-performance computing education. By providing a practical learning environment, the UpCluster complements theoretical instruction and empowers students to acquire practical skills in the design of large-scale distributed systems with multi-core nodes. Furthermore, the paper discusses this low-cost cluster's potential impact and applications in HPC education. The insights from the study may benefit academic departments and institutions seeking to develop analogous project-based courses focused on high-performance computing for graduate students.
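A typical exercise in such a course is a collective reduction. The sketch below shows the binary-tree pattern behind MPI_Reduce without MPI itself; a real assignment would use mpi4py's comm.reduce across the cluster nodes:

```python
# Binary-tree reduction: in each round, half of the "ranks" send their partial
# sum to a partner, so a cluster-wide sum completes in O(log P) communication
# steps instead of P - 1 sequential additions.

def tree_reduce(values):
    vals = dict(enumerate(values))      # rank -> local partial result
    step = 1
    while step < len(values):
        for rank in range(0, len(values), 2 * step):
            partner = rank + step
            if partner in vals:         # partner "sends" its value upstream
                vals[rank] += vals.pop(partner)
        step *= 2
    return vals[0]                      # root rank ends up with the total

total = tree_reduce([1, 2, 3, 4, 5, 6, 7, 8])
```

Tracing which ranks communicate in each round is a useful paper exercise before running the real MPI version on hardware.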
APA, Harvard, Vancouver, ISO, and other styles
49

Sun, Ling, and Dali Gao. "Security Attitude Prediction Model of Secret-Related Computer Information System Based on Distributed Parallel Computing Programming." Mathematical Problems in Engineering 2022 (March 25, 2022): 1–13. http://dx.doi.org/10.1155/2022/3141568.

Full text
Abstract:
In recent years, there has been an upward trend in the number of leaked secrets. Among the causes, connecting secret-related computers or networks to the Internet in violation of regulations, cross-use of mobile storage media, and poor security management of secret-related intranets are the main reasons for leaks. Therefore, it is of great significance to study the physical isolation and protection technology of classified information systems. Physical isolation is an important part of the protection of classified information systems, as it cuts off the possibility of unauthorized outflow of information from the network environment. To achieve physical isolation of the network environment and build a safe and reliable network, it is necessary to continuously improve the level of network construction and strengthen network management capabilities. At present, physical isolation technology mainly relies on security products such as firewalls, intrusion detection, illegal-outreach detection, host monitoring, and auditing. This study analyzes network security systems such as intrusion detection, network scanning, and firewalls. Establishing a model based on network security vulnerabilities and compensating for the hidden dangers caused by such holes yields, in general, a passive security system. In a network, the driver of network behavior, human behavior, needs to be constrained and monitored according to the requirements of the security management system. Accordingly, this study proposes a security monitoring system for computer information networks involving classified computers. The system can analyze, monitor, manage, and process the network behavior of terminal computer hosts in the local area network, in order to reduce security risks in the network system. Based on the evaluation value sequence, an initial prediction value sequence is obtained by a sliding adaptive triple exponential smoothing method. A time-varying weighted Markov chain is used for error prediction, the initial prediction value is corrected, and the accuracy of security situation prediction is improved. According to the security protection requirements of secret-related information systems, a complete, safe, reliable, and controllable security protection system is constructed, and existing security risks and loopholes in secret-related information systems are eliminated to the greatest extent possible. This enables the confidentiality, integrity, and availability of confidential data and information in the computer information system to be reliably protected.
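The smoothing step named in the abstract builds on triple exponential smoothing. Below is Brown's classic one-parameter variant, without the paper's sliding adaptive window or Markov-chain error correction, both of which are outside this sketch:

```python
# Brown's triple exponential smoothing: three cascaded exponential averages
# yield a quadratic-trend forecast m steps ahead.

def triple_smooth_forecast(series, alpha=0.5, m=1):
    s1 = s2 = s3 = series[0]            # a common initialization choice
    for x in series:
        s1 = alpha * x + (1 - alpha) * s1
        s2 = alpha * s1 + (1 - alpha) * s2
        s3 = alpha * s2 + (1 - alpha) * s3
    a = 3 * s1 - 3 * s2 + s3
    b = (alpha / (2 * (1 - alpha) ** 2)) * (
        (6 - 5 * alpha) * s1 - 2 * (5 - 4 * alpha) * s2 + (4 - 3 * alpha) * s3)
    c = (alpha ** 2 / (1 - alpha) ** 2) * (s1 - 2 * s2 + s3)
    return a + b * m + 0.5 * c * m * m  # forecast m steps ahead

# On a constant series the trend terms vanish and the forecast is the constant.
f = triple_smooth_forecast([5.0, 5.0, 5.0, 5.0])
```

In the paper's scheme, the residual between such a forecast and the observed value is what the time-varying weighted Markov chain then corrects.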
APA, Harvard, Vancouver, ISO, and other styles
50

Averbuch, A., E. Gabber, S. Itzikowitz, and B. Shoham. "On the Parallel Elliptic Single/Multigrid Solutions about Aligned and Nonaligned Bodies Using the Virtual Machine for Multiprocessors." Scientific Programming 3, no. 1 (1994): 13–32. http://dx.doi.org/10.1155/1994/895737.

Full text
Abstract:
Parallel elliptic single/multigrid solutions around an aligned and nonaligned body are presented and implemented on two multi-user and single-user shared memory multiprocessors (Sequent Symmetry and MOS) and on a distributed memory multiprocessor (a Transputer network). Our parallel implementation uses the Virtual Machine for Multi-Processors (VMMP), a software package that provides a coherent set of services for explicitly parallel application programs running on diverse multiple instruction multiple data (MIMD) multiprocessors, both shared memory and message passing. VMMP is intended to simplify parallel program writing and to promote portable and efficient programming. Furthermore, it ensures high portability of application programs by implementing the same services on all target multiprocessors. The performance of our algorithm is investigated in detail. It is seen to fit the above architectures well when the number of processors is less than the maximal number of grid points along the axes. In general, the efficiency in the nonaligned case is higher than in the aligned case. Alignment overhead is observed to be up to 200% in the shared-memory case and up to 65% in the message-passing case. We have demonstrated that when using VMMP, the portability of the algorithms is straightforward and efficient.
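One building block of the elliptic solvers discussed above is smoothing. This sketch shows a weighted Jacobi sweep for the 1-D Poisson equation; full multigrid adds restriction, coarse-grid correction, and prolongation, and the grid size and boundary values here are illustrative:

```python
# Weighted Jacobi sweep for -u'' = f on a uniform 1-D grid with spacing h.
# Each interior point is relaxed toward the average of its neighbours; the
# sweep reads only the old array, so it parallelizes trivially.

def jacobi_sweep(u, f, h, weight=2/3):
    new = u[:]
    for i in range(1, len(u) - 1):
        jac = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
        new[i] = (1 - weight) * u[i] + weight * jac
    return new

# Smooth toward the solution of -u'' = 0 with u(0) = 0, u(1) = 1,
# whose exact solution is the straight line u(x) = x.
n = 5
h = 1 / (n - 1)
u = [0.0, 0.0, 0.0, 0.0, 1.0]
f = [0.0] * n
for _ in range(100):
    u = jacobi_sweep(u, f, h)
```

Weighted Jacobi damps high-frequency error quickly, which is why multigrid pairs a few such sweeps with a coarse-grid solve for the smooth error components.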
APA, Harvard, Vancouver, ISO, and other styles