Journal articles on the topic "Shared-Memory Machines"

To see other types of publications on this topic, follow the link: Shared-Memory Machines.

Consult the top 50 journal articles for your research on the topic "Shared-Memory Machines".

Next to every work in the list of references there is an "Add to bibliography" button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of a publication as a .pdf file and read its abstract online, where these are available in the metadata.

Browse journal articles across a wide range of disciplines and compile your bibliography correctly.

1

Sun, Xian-He, and Jianping Zhu. "Performance considerations of shared virtual memory machines." IEEE Transactions on Parallel and Distributed Systems 6, no. 11 (1995): 1185–94. http://dx.doi.org/10.1109/71.476190.

2

Barton, Christopher, Călin Caşcaval, George Almási, Yili Zheng, Montse Farreras, Siddhartha Chatterjee, and José Nelson Amaral. "Shared memory programming for large scale machines." ACM SIGPLAN Notices 41, no. 6 (June 11, 2006): 108–17. http://dx.doi.org/10.1145/1133255.1133995.

3

Bonomo, John P., and Wayne R. Dyksen. "Pipelined iterative methods for shared memory machines." Parallel Computing 11, no. 2 (August 1989): 187–99. http://dx.doi.org/10.1016/0167-8191(89)90028-8.

4

Zaki, Mohammed J. "Parallel Sequence Mining on Shared-Memory Machines." Journal of Parallel and Distributed Computing 61, no. 3 (March 2001): 401–26. http://dx.doi.org/10.1006/jpdc.2000.1695.

5

Bircsak, John, Peter Craig, RaeLyn Crowell, Zarka Cvetanovic, Jonathan Harris, C. Alexander Nelson, and Carl D. Offner. "Extending OpenMP for NUMA Machines." Scientific Programming 8, no. 3 (2000): 163–81. http://dx.doi.org/10.1155/2000/464182.

Abstract:
This paper describes extensions to OpenMP that implement data placement features needed for NUMA architectures. OpenMP is a collection of compiler directives and library routines used to write portable parallel programs for shared-memory architectures. Writing efficient parallel programs for NUMA architectures, which have characteristics of both shared-memory and distributed-memory architectures, requires that a programmer control the placement of data in memory and the placement of computations that operate on that data. Optimal performance is obtained when computations occur on processors that have fast access to the data needed by those computations. OpenMP -- designed for shared-memory architectures -- does not by itself address these issues. The extensions to OpenMP Fortran presented here have been mainly taken from High Performance Fortran. The paper describes some of the techniques that the Compaq Fortran compiler uses to generate efficient code based on these extensions. It also describes some additional compiler optimizations, and concludes with some preliminary results.
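The placement extensions described above are Fortran directives specific to the Compaq compiler and are not reproduced here. As a rough illustration of the underlying idea, the sketch below uses the portable "first-touch" idiom in C with OpenMP: initializing data with the same static schedule that later computes on it tends to place each page on the node of the thread that uses it, which is the effect the paper's directives make explicit.

```c
/* Hedged sketch: not the paper's Fortran directives, just the portable
 * first-touch idiom that approximates their effect on many NUMA systems. */
#include <stdlib.h>

#define N 10000000L

int main(void) {
    double *a = malloc(N * sizeof *a);
    double *b = malloc(N * sizeof *b);
    if (!a || !b) return 1;

    /* First touch: each thread's chunk is faulted in on its local node. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < N; i++) { a[i] = 0.0; b[i] = 1.0; }

    /* Same static schedule: computation lands where its data was placed. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < N; i++) a[i] += 2.0 * b[i];

    free(a);
    free(b);
    return 0;
}
```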
6

STRATULAT, S., and D. J. EVANS. "VIRTUAL SHARED MEMORY MACHINES—AN APPLICATION OF PVM." Parallel Algorithms and Applications 7, no. 1-2 (January 1995): 143–60. http://dx.doi.org/10.1080/10637199508915528.

7

FANTOZZI, CARLO, ANDREA PIETRACAPRINA, and GEPPINO PUCCI. "A GENERAL PRAM SIMULATION SCHEME FOR CLUSTERED MACHINES." International Journal of Foundations of Computer Science 14, no. 06 (December 2003): 1147–64. http://dx.doi.org/10.1142/s0129054103002230.

Abstract:
We present a general deterministic scheme to implement a shared memory abstraction on any distributed-memory machine which exhibits a clustered structure. More specifically, we develop a memory distribution strategy and an access protocol for the Decomposable BSP (D-BSP), a generic machine model whose bandwidth/latency parameters can be instantiated to closely reflect the characteristics of machines that admit a hierarchical decomposition into independent clusters. Our scheme achieves provably optimal slowdown for those machines where delays due to latency dominate over those due to bandwidth limitations. For machines where this is not the case, the slowdown is a mere logarithmic factor away from the natural bandwidth-based lower bound.
8

HABBAS, ZINEB, MICHAËL KRAJECKI, and DANIEL SINGER. "SHARED MEMORY IMPLEMENTATION OF CONSTRAINT SATISFACTION PROBLEM RESOLUTION." Parallel Processing Letters 11, no. 04 (December 2001): 487–501. http://dx.doi.org/10.1142/s0129626401000749.

Abstract:
Many problems in Computer Science, especially in Artificial Intelligence, can be formulated as Constraint Satisfaction Problems (CSP). This paper presents a parallel implementation of the Forward-Checking algorithm for solving a binary CSP over finite domains. Its main contribution is to use a simple decomposition strategy in order to distribute the search tree dynamically among machines. The feasibility and benefit of this approach are studied for a Shared Memory model. An implementation is drafted using the newly emerging standard OpenMP library for shared memory, thus controlling load balancing. We mainly highlight satisfactory efficiencies without using any tricky load balancing policy. All the experiments were carried out on the Silicon Graphics Origin 2000 parallel machine.
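The decomposition strategy in the abstract is easy to picture in code. The sketch below is not the authors' implementation: it parallelizes a toy backtracking search (N-queens standing in for a binary CSP, with simple pruning standing in for constraint propagation) by handing the root-level branches of the search tree to OpenMP threads under a dynamic schedule, which is the kind of simple load balancing the paper relies on.

```c
/* Hedged sketch, not the paper's code: distribute root-level subtrees of a
 * backtracking search across threads; dynamic scheduling balances the load. */
#include <stdio.h>

#define N 12

static long solve(int row, int cols, int diag1, int diag2) {
    if (row == N) return 1;
    long count = 0;
    for (int c = 0; c < N; c++) {
        int bit = 1 << c;
        /* Prune values already ruled out by earlier rows (a stand-in for
         * the forward-checking domain filtering in the paper). */
        if ((cols & bit) || (diag1 & (1 << (row + c))) ||
            (diag2 & (1 << (row - c + N - 1))))
            continue;
        count += solve(row + 1, cols | bit,
                       diag1 | (1 << (row + c)),
                       diag2 | (1 << (row - c + N - 1)));
    }
    return count;
}

int main(void) {
    long total = 0;
    /* Each value of the first variable becomes an independent subtree. */
    #pragma omp parallel for schedule(dynamic) reduction(+ : total)
    for (int c = 0; c < N; c++)
        total += solve(1, 1 << c, 1 << c, 1 << (N - 1 - c));
    printf("solutions: %ld\n", total);
    return 0;
}
```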
9

Choi, Yoonseo, and Hwansoo Han. "Shared heap management for memory-limited java virtual machines." ACM Transactions on Embedded Computing Systems 7, no. 2 (February 2008): 1–32. http://dx.doi.org/10.1145/1331331.1331337.

10

Limaye, Ajay C. "Parallel MP2-energy evaluation: Simulated shared memory approach on distributed memory parallel machines." Journal of Computational Chemistry 18, no. 4 (March 1997): 552–61. http://dx.doi.org/10.1002/(sici)1096-987x(199703)18:4<552::aid-jcc8>3.0.co;2-s.

11

CHONG, FREDERIC T., and ANANT AGARWAL. "SHARED MEMORY VERSUS MESSAGE PASSING FOR ITERATIVE SOLUTION OF SPARSE, IRREGULAR PROBLEMS." Parallel Processing Letters 09, no. 01 (March 1999): 159–70. http://dx.doi.org/10.1142/s0129626499000177.

Abstract:
The benefits of hardware support for shared memory versus those for message passing are difficult to evaluate without an in-depth study of real applications on a common platform. We evaluate the communication mechanisms of the MIT Alewife machine, a multiprocessor which provides integrated cache-coherent shared memory, message passing, and DMA. We perform this evaluation with "best-effort" implementations which solve several sparse, irregular benchmark problems with a preconditioned conjugate gradient sparse matrix solver (ICCG). We find that machines with fast global memory operations do not need message passing or bulk transfer to support our irregular problems. This is primarily due to three reasons. First, a 5-to-1 ratio between global and local cache misses makes memory copies in bulk communication expensive relative to communication via shared memory. Second, although message passing has synchronization semantics superior to shared memory for data-driven computation, efficient shared memory can overcome this handicap by using global read-modify-writes to change from the traditional owner-computes model to a producer-computes model. Third, bulk transfers can result in high processor idle times in irregular applications.
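The owner-computes versus producer-computes distinction in the second point can be made concrete. In the hypothetical sketch below (C with OpenMP atomics standing in for Alewife's global read-modify-writes), each producer applies its contribution to the shared result directly, so no per-owner gather phase is needed; the edge list and values are invented for illustration.

```c
/* Hedged sketch of producer-computes: each thread pushes its contribution
 * with a read-modify-write instead of the owner pulling it later. */
#include <stdio.h>

#define NNODES 4
#define NEDGES 5

typedef struct { int src, dst; double w; } Edge;

int main(void) {
    Edge edges[NEDGES] = {{0,1,1.0},{1,2,2.0},{2,3,0.5},{3,0,1.5},{0,2,2.5}};
    double x[NNODES] = {1, 2, 3, 4}, y[NNODES] = {0};

    #pragma omp parallel for
    for (int e = 0; e < NEDGES; e++) {
        double contrib = edges[e].w * x[edges[e].src];
        /* Global read-modify-write at the point of production. */
        #pragma omp atomic
        y[edges[e].dst] += contrib;
    }
    for (int i = 0; i < NNODES; i++) printf("y[%d] = %g\n", i, y[i]);
    return 0;
}
```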
12

Tu, Peng, and David Padua. "Array privatization for shared and distributed memory machines (extended abstract)." ACM SIGPLAN Notices 28, no. 1 (January 1993): 64–67. http://dx.doi.org/10.1145/156668.156692.

13

BENNETT, KARIN REMINGTON, and GRAEME FAIRWEATHER. "A PARALLEL BOUNDARY VALUE ODE CODE FOR SHARED-MEMORY MACHINES." International Journal of High Speed Computing 04, no. 02 (June 1992): 71–86. http://dx.doi.org/10.1142/s012905339200002x.

14

Wilkinson, M. H. F., Hui Gao, W. H. Hesselink, J. E. Jonker, and A. Meijster. "Concurrent Computation of Attribute Filters on Shared Memory Parallel Machines." IEEE Transactions on Pattern Analysis and Machine Intelligence 30, no. 10 (October 2008): 1800–1813. http://dx.doi.org/10.1109/tpami.2007.70836.

15

Mahmoudi, Ramzi, Mohamed Akil, and Mohamed Hédi Bedoui. "Concurrent computation of topological watershed on shared memory parallel machines." Parallel Computing 69 (November 2017): 78–97. http://dx.doi.org/10.1016/j.parco.2017.08.010.

16

Lang, Bruno. "Efficient eigenvalue and singular value computations on shared memory machines." Parallel Computing 25, no. 7 (July 1999): 845–60. http://dx.doi.org/10.1016/s0167-8191(99)00021-6.

17

Cierniak, Michał, and Wei Li. "Unifying data and control transformations for distributed shared-memory machines." ACM SIGPLAN Notices 30, no. 6 (June 1995): 205–17. http://dx.doi.org/10.1145/223428.207145.

18

LOWENTHAL, DAVID K., VINCENT W. FREEH, and GREGORY R. ANDREWS. "Efficient support for fine-grain parallelism on shared-memory machines." Concurrency: Practice and Experience 10, no. 3 (March 1998): 157–73. http://dx.doi.org/10.1002/(sici)1096-9128(199803)10:3<157::aid-cpe293>3.0.co;2-x.

19

Katz, Randy H., and Wei Hong. "The performance of disk arrays in shared-memory database machines." Distributed and Parallel Databases 1, no. 2 (April 1993): 167–98. http://dx.doi.org/10.1007/bf01264050.

20

Deshpande, Ashish, and Martin Schultz. "Efficient Parallel Programming with Linda." Scientific Programming 1, no. 2 (1992): 177–83. http://dx.doi.org/10.1155/1992/829092.

Abstract:
Linda is a coordination language invented by David Gelernter at Yale University, which when combined with a computation language (like C) yields a high-level parallel programming language for MIMD machines. Linda is based on a virtual shared associative memory containing objects called tuples. Skeptics have long claimed that Linda programs could not be efficient on distributed memory architectures. In this paper, we address this claim by discussing C-Linda's performance in solving a particular scientific computing problem, the shallow water equations, and make comparisons with alternatives available on various shared and distributed memory parallel machines.
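Linda's tuple-space operations (out to deposit a tuple, in to withdraw a matching one) can be emulated in miniature. The following is not C-Linda; it is a small pthreads sketch of a tuple space holding a single tuple shape, showing how a blocking in() serializes access the way Linda's associative matching does.

```c
/* Hedged sketch, not C-Linda: emulate out()/in() for one tuple shape
 * ("count", n) with a mutex and condition variable. */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  avail = PTHREAD_COND_INITIALIZER;
static int have_tuple = 0, value = 0;

static void ts_out(int v) {               /* Linda: out("count", v) */
    pthread_mutex_lock(&lock);
    value = v; have_tuple = 1;
    pthread_cond_signal(&avail);
    pthread_mutex_unlock(&lock);
}

static int ts_in(void) {                  /* Linda: in("count", ?v) */
    pthread_mutex_lock(&lock);
    while (!have_tuple) pthread_cond_wait(&avail, &lock);
    have_tuple = 0;
    int v = value;
    pthread_mutex_unlock(&lock);
    return v;
}

static void *worker(void *arg) {
    (void)arg;
    int v = ts_in();          /* withdraw the tuple ... */
    ts_out(v + 1);            /* ... update and reinsert it */
    return NULL;
}

int main(void) {
    pthread_t t[4];
    ts_out(0);
    for (int i = 0; i < 4; i++) pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++) pthread_join(t[i], NULL);
    printf("count = %d\n", ts_in());      /* prints 4 */
    return 0;
}
```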
21

Axtmann, Michael, Sascha Witt, Daniel Ferizovic, and Peter Sanders. "Engineering In-place (Shared-memory) Sorting Algorithms." ACM Transactions on Parallel Computing 9, no. 1 (March 31, 2022): 1–62. http://dx.doi.org/10.1145/3505286.

Abstract:
We present new sequential and parallel sorting algorithms that now represent the fastest known techniques for a wide range of input sizes, input distributions, data types, and machines. Somewhat surprisingly, part of the speed advantage is due to the additional feature of the algorithms to work in-place, i.e., they do not need a significant amount of space beyond the input array. Previously, the in-place feature often implied performance penalties. Our main algorithmic contribution is a blockwise approach to in-place data distribution that is provably cache-efficient. We also parallelize this approach taking dynamic load balancing and memory locality into account. Our new comparison-based algorithm, In-place Parallel Super Scalar Samplesort (IPS⁴o), combines this technique with branchless decision trees. By taking cases with many equal elements into account and by adapting the distribution degree dynamically, we obtain a highly robust algorithm that outperforms the best previous in-place parallel comparison-based sorting algorithms by almost a factor of three. That algorithm also outperforms the best comparison-based competitors regardless of whether we consider in-place or not-in-place, parallel or sequential settings. Another surprising result is that IPS⁴o even outperforms the best (in-place or not in-place) integer sorting algorithms in a wide range of situations. In many of the remaining cases (often involving near-uniform input distributions, small keys, or a sequential setting), our new In-place Parallel Super Scalar Radix Sort (IPS²Ra) turns out to be the best algorithm. Claims to have the, in some sense, "best" sorting algorithm can be found in many papers, which cannot all be true. Therefore, we base our conclusions on an extensive experimental study involving a large part of the cross product of 21 state-of-the-art sorting codes, 6 data types, 10 input distributions, 4 machines, 4 memory allocation strategies, and input sizes varying over 7 orders of magnitude. This confirms the claims made about the robust performance of our algorithms while revealing major performance problems in many competitors outside the concrete set of measurements reported in the associated publications. This is particularly true for integer sorting algorithms, giving one reason to prefer comparison-based algorithms for robust general-purpose sorting.
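Two ingredients named above, splitter-based distribution and branchless decision trees, can be illustrated compactly. The real IPS⁴o is a sophisticated C++ implementation; the toy C sketch below only shows how an element is classified into one of 2^L buckets by walking a heap-ordered splitter tree with arithmetic instead of data-dependent branches.

```c
/* Toy sketch of one IPS4o ingredient, not the real implementation:
 * branchless classification against a heap-ordered tree of splitters. */
#include <stdio.h>

#define LEVELS 2                   /* 2 levels -> 4 buckets   */
#define NSPLIT ((1 << LEVELS) - 1) /* 3 splitters, heap order */

static int classify(int v, const int tree[NSPLIT]) {
    int j = 1;                               /* root at index 1 */
    for (int l = 0; l < LEVELS; l++)
        j = 2 * j + (v > tree[j - 1]);       /* comparison as arithmetic */
    return j - (1 << LEVELS);                /* bucket index in 0..3 */
}

int main(void) {
    /* Heap order for bucket boundaries 10 | 20 | 30: root 20, then 10, 30. */
    int tree[NSPLIT] = {20, 10, 30};
    int data[] = {5, 15, 25, 35};
    for (int i = 0; i < 4; i++)
        printf("%d -> bucket %d\n", data[i], classify(data[i], tree));
    return 0;
}
```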
22

Ninomiya, Takashi, Kentaro Torisawa, and Jun'ichi Tsujii. "An Agent-based Parallel HPSG Parser for Shared-memory Parallel Machines." Journal of Natural Language Processing 8, no. 1 (2001): 21–47. http://dx.doi.org/10.5715/jnlp.8.21.

23

Holt, Chris, Jaswinder Pal Singh, and John Hennessy. "Application and architectural bottlenecks in large scale distributed shared memory machines." ACM SIGARCH Computer Architecture News 24, no. 2 (May 1996): 134–45. http://dx.doi.org/10.1145/232974.232988.

24

Bilardi, Gianfranco, and Alexandru Nicolau. "Adaptive Bitonic Sorting: An Optimal Parallel Algorithm for Shared-Memory Machines." SIAM Journal on Computing 18, no. 2 (April 1989): 216–28. http://dx.doi.org/10.1137/0218014.

25

Tuszyński, Jaroslaw, and Rainald Löhner. "Parallelizing the construction of indirect access arrays for shared-memory machines." Communications in Numerical Methods in Engineering 14, no. 8 (August 1998): 773–81. http://dx.doi.org/10.1002/(sici)1099-0887(199808)14:8<773::aid-cnm186>3.0.co;2-5.

26

Bozkus, Zeki, Larry Meadows, Steven Nakamoto, Vincent Schuster, and Mark Young. "PGHPF – An Optimizing High Performance Fortran Compiler for Distributed Memory Machines." Scientific Programming 6, no. 1 (1997): 29–40. http://dx.doi.org/10.1155/1997/705102.

Abstract:
High Performance Fortran (HPF) is the first widely supported, efficient, and portable parallel programming language for shared and distributed memory systems. HPF is realized through a set of directive-based extensions to Fortran 90. It enables application developers and Fortran end-users to write compact, portable, and efficient software that will compile and execute on workstations, shared memory servers, clusters, traditional supercomputers, or massively parallel processors. This article describes a production-quality HPF compiler for a set of parallel machines. Compilation techniques such as data and computation distribution, communication generation, run-time support, and optimization issues are elaborated as the basis for an HPF compiler implementation on distributed memory machines. The performance of this compiler on benchmark programs demonstrates that high efficiency can be achieved executing HPF code on parallel architectures.
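HPF's directives (e.g., distributing an array BLOCK-wise) instruct the compiler to derive the data partitioning, the owner-computes loop bounds, and the communication automatically. The hand-written C/MPI sketch below only illustrates what such generated code amounts to for a trivial block-distributed loop; it is not output of the PGHPF compiler.

```c
/* Illustrative only: roughly what an HPF compiler derives from a BLOCK
 * distribution plus the owner-computes rule, written out by hand. */
#include <stdio.h>
#include <mpi.h>

#define N 1000

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int lo = rank * N / size, hi = (rank + 1) * N / size;
    double a[N] = {0};                /* only [lo,hi) is really "owned" */

    for (int i = lo; i < hi; i++)     /* owner-computes: local iterations */
        a[i] = 2.0 * i;

    double local = 0, global = 0;
    for (int i = lo; i < hi; i++) local += a[i];
    /* The compiler would generate collective communication like this. */
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0) printf("sum = %g\n", global);
    MPI_Finalize();
    return 0;
}
```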
27

Mather, David. "Extended Memory: Early Calculating Engines and Historical Computer Simulations." Leonardo 39, no. 3 (June 2006): 237–43. http://dx.doi.org/10.1162/leon.2006.39.3.237.

Abstract:
When framed within cognitive theory's extended mind hypothesis, Charles Babbage's 19th-century calculating machines illustrate a distinction between accuracy and flexibility. These properties affect how historical data and memory are organized, providing conceptual linkages for mind-machine integration. The distinction between accuracy and flexibility is also apparent in present-day computer simulations that use historical scenarios, such as virtual-reality software designed for the Bloody Sunday Inquiry, history-based video games and other art and entertainment software applications. These contemporary examples share one important feature of extended mind: the incorporation of history or personal memory into a shared memory system.
28

TANIAR, DAVID, and J. WENNY RAHAYU. "PARALLEL SORT-HASH OBJECT-ORIENTED COLLECTION JOIN ALGORITHMS FOR SHARED-MEMORY MACHINES." Parallel Algorithms and Applications 17, no. 2 (January 2002): 85–126. http://dx.doi.org/10.1080/10637190208941435.

29

Adeli, H., and S. L. Hung. "A Concurrent Adaptive Conjugate Gradient Learning Algorithm On Mimd Shared-Memory Machines." International Journal of Supercomputing Applications 7, no. 2 (June 1993): 155–66. http://dx.doi.org/10.1177/109434209300700206.

30

Hung, S. L., and H. Adeli. "A parallel genetic/neural network learning algorithm for MIMD shared memory machines." IEEE Transactions on Neural Networks 5, no. 6 (1994): 900–909. http://dx.doi.org/10.1109/72.329686.

31

Braham, Yosra, Yaroub Elloumi, Mohamed Akil, and Mohamed Hedi Bedoui. "Parallel computation of Watershed Transform in weighted graphs on shared memory machines." Journal of Real-Time Image Processing 17, no. 3 (July 18, 2018): 527–42. http://dx.doi.org/10.1007/s11554-018-0804-x.

32

Aubry, R., G. Houzeaux, M. Vázquez, and J. M. Cela. "Some useful strategies for unstructured edge-based solvers on shared memory machines." International Journal for Numerical Methods in Engineering 85, no. 5 (December 29, 2010): 537–61. http://dx.doi.org/10.1002/nme.2973.

33

Kim, Hyung Tae, and Kyung Chan Jin. "Multi-Application and Large Shared Memory in a Mechatronic System for Massive Computation." Applied Mechanics and Materials 307 (February 2013): 18–22. http://dx.doi.org/10.4028/www.scientific.net/amm.307.18.

Abstract:
Recent mechatronic systems, such as inspection machines or 3D imaging apparatuses, acquire and compute massive data for final results. A host in the mechatronic system is commonly composed of multiple hardware devices which interface with high-speed external signals. The host and the devices usually have large memory, so efficient data management is important for data storage and transfer. In our software structure, each device is managed by its respective application, and a large shared memory (LSM) is allocated in the host for the massive data. The shared memory is accessible from the device applications. Actions of the mechatronic system are driven by combining and broadcasting events through an inter-process communication (IPC) mechanism. The model with LSM and IPC was applied to a 3D RF imaging system. We expect the proposed model can also be applied to machine vision with large images and to engineering simulation with hardware accelerators.
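A minimal sketch of the LSM idea, assuming a POSIX system (not the paper's software): one process creates a named shared memory region with shm_open() and mmap(), and any cooperating device application that maps the same name sees the same data. The region name and size below are invented for illustration; event broadcasting over IPC is only indicated in a comment.

```c
/* Hedged sketch: a large shared region that multiple device applications
 * can map.  "/lsm_demo" and the 64 MiB size are made-up examples. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define LSM_NAME "/lsm_demo"
#define LSM_SIZE (64UL << 20)   /* 64 MiB stand-in for the "large" region */

int main(void) {
    int fd = shm_open(LSM_NAME, O_CREAT | O_RDWR, 0600);
    if (fd < 0 || ftruncate(fd, LSM_SIZE) != 0) { perror("shm"); return 1; }

    unsigned char *lsm = mmap(NULL, LSM_SIZE, PROT_READ | PROT_WRITE,
                              MAP_SHARED, fd, 0);
    if (lsm == MAP_FAILED) { perror("mmap"); return 1; }

    /* Any process that shm_open()s the same name and mmap()s it sees the
     * same bytes; IPC events (pipes, message queues, etc.) would announce
     * when a new frame of data is ready. */
    lsm[0] = 42;
    printf("shared region mapped at %p\n", (void *)lsm);

    munmap(lsm, LSM_SIZE);
    close(fd);
    shm_unlink(LSM_NAME);       /* remove the name when done */
    return 0;
}
```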
34

Abdul Hamid, Nor Asilah Wati, and Paul Coddington. "Comparison of MPI Benchmark Programs on Shared Memory and Distributed Memory Machines (Point-to-Point Communication)." International Journal of High Performance Computing Applications 24, no. 4 (June 7, 2010): 469–83. http://dx.doi.org/10.1177/1094342010371106.

35

Tang, Hong, Kai Shen, and Tao Yang. "Program transformation and runtime support for threaded MPI execution on shared-memory machines." ACM Transactions on Programming Languages and Systems 22, no. 4 (July 2000): 673–700. http://dx.doi.org/10.1145/363911.363920.

36

Pai, Prasad, and Tate T. H. Tsang. "Parallel computations of turbulent diffusion in convective boundary layers on shared-memory machines." Atmospheric Environment. Part A. General Topics 26, no. 13 (September 1992): 2425–35. http://dx.doi.org/10.1016/0960-1686(92)90372-r.

37

Tang, Hong, Kai Shen, and Tao Yang. "Compile/run-time support for threaded MPI execution on multiprogrammed shared memory machines." ACM SIGPLAN Notices 34, no. 8 (August 1999): 107–18. http://dx.doi.org/10.1145/329366.301114.

38

Kandemir, M., J. Ramanujam, and A. Choudhary. "Compiler Algorithms for Optimizing Locality and Parallelism on Shared and Distributed-Memory Machines." Journal of Parallel and Distributed Computing 60, no. 8 (August 2000): 924–65. http://dx.doi.org/10.1006/jpdc.2000.1639.

39

Silva, Luis M., João Gabriel Silva, and Simon Chapple. "Implementation and Performance of DSMPI." Scientific Programming 6, no. 2 (1997): 201–14. http://dx.doi.org/10.1155/1997/452521.

Abstract:
Distributed shared memory has been recognized as an alternative programming model to exploit the parallelism in distributed memory systems because it provides a higher level of abstraction than simple message passing. DSM combines the simple programming model of shared memory with the scalability of distributed memory machines. This article presents DSMPI, a parallel library that runs atop MPI and provides a DSM abstraction. It provides an easy-to-use programming interface, is fully portable, and supports heterogeneity. For the sake of flexibility, it supports different coherence protocols and models of consistency. We present some performance results taken in a network of workstations and in a Cray T3D which show that DSMPI can be competitive with MPI for some applications.
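DSMPI's own interface is not reproduced in the abstract, so the sketch below instead uses standard MPI-3 one-sided windows to show the flavor of a DSM-style abstraction layered on MPI: any rank can update a "shared" location without the owner posting a matching receive.

```c
/* Hedged sketch: not DSMPI, just standard MPI-3 one-sided operations
 * giving shared-memory-like access on top of message passing. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double local = 0.0;                 /* each rank exposes one double */
    MPI_Win win;
    MPI_Win_create(&local, sizeof(double), sizeof(double),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_fence(0, win);
    if (rank != 0) {                    /* every rank updates rank 0's word */
        double one = 1.0;
        MPI_Accumulate(&one, 1, MPI_DOUBLE, 0, 0, 1, MPI_DOUBLE,
                       MPI_SUM, win);
    }
    MPI_Win_fence(0, win);

    if (rank == 0) printf("shared counter = %g\n", local);
    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```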
40

Mavriplis, Dimitri J. "Parallel Performance Investigations of an Unstructured Mesh Navier-Stokes Solver." International Journal of High Performance Computing Applications 16, no. 4 (November 2002): 395–407. http://dx.doi.org/10.1177/109434200201600403.

Abstract:
The implementation and performance of a hybrid OpenMP/MPI parallel communication strategy for an unstructured mesh computational fluid dynamics code is described. The solver is cache efficient and fully vectorizable, and is parallelized using a two-level hybrid MPI-OpenMP implementation suitable for shared and/or distributed memory architectures, as well as clusters of shared memory machines. Parallelism is obtained through domain decomposition for both communication models. Single processor computational rates as well as scalability curves are given on various architectures. For the architectures studied in this work, the OpenMP or hybrid OpenMP/MPI communication strategies achieved no appreciable performance benefit over an exclusive MPI communication strategy.
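A generic skeleton of such a two-level hybrid scheme (not the paper's solver) looks as follows in C: MPI ranks own subdomains, OpenMP threads share the work within a rank, and only the master thread communicates. The loop body is a stand-in for the real per-subdomain computation.

```c
/* Hedged sketch of a two-level hybrid MPI+OpenMP program. */
#include <stdio.h>
#include <omp.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int provided, rank;
    /* Request thread support so OpenMP regions may coexist with MPI. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double local = 0.0, global = 0.0;
    #pragma omp parallel for reduction(+ : local)
    for (int i = 0; i < 1000000; i++)
        local += 1e-6;                 /* per-rank threaded work */

    /* Only the master thread communicates (FUNNELED semantics). */
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("global = %g (threads per rank: %d)\n",
               global, omp_get_max_threads());
    MPI_Finalize();
    return 0;
}
```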
41

PAGE, DANIEL R. "PARALLEL ALGORITHM FOR SECOND–ORDER RESTRICTED WEAK INTEGER COMPOSITION GENERATION FOR SHARED MEMORY MACHINES." Parallel Processing Letters 23, no. 03 (September 2013): 1350010. http://dx.doi.org/10.1142/s0129626413500102.

Abstract:
In 2012, Page presented a sequential combinatorial generation algorithm for generalized types of restricted weak integer compositions called second-order restricted weak integer compositions. Second-order restricted weak integer compositions cover various types of restricted weak integer compositions of n parts such as integer compositions, bounded compositions, and part-wise integer compositions. In this paper, we present a parallel algorithm that derives from our parallelization of Page's sequential algorithm, with a focus on load balancing for shared memory machines.
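Page's algorithm itself is not reproduced here; the sketch below only illustrates the load-balancing idea for this family of problems: fix the value of the first part and let each OpenMP thread process (here, merely count) the weak compositions in its assigned subtree, using a dynamic schedule because subtree sizes vary.

```c
/* Hedged sketch, not Page's algorithm: distribute subtrees of the
 * composition tree over threads by the value of the first part. */
#include <stdio.h>

static long count_rest(int n, int parts) {  /* weak compositions of n */
    if (parts == 1) return 1;               /* last part takes the rest */
    long total = 0;
    for (int v = 0; v <= n; v++)
        total += count_rest(n - v, parts - 1);
    return total;
}

int main(void) {
    const int n = 20, k = 6;
    long total = 0;
    /* Dynamic schedule: subtrees differ in size, so balance on the fly. */
    #pragma omp parallel for schedule(dynamic) reduction(+ : total)
    for (int first = 0; first <= n; first++)
        total += count_rest(n - first, k - 1);
    printf("weak compositions of %d into %d parts: %ld\n", n, k, total);
    return 0;
}
```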
42

Lowenthal, David K., and Vincent W. Freeh. "Architecture-independent parallelism for both shared- and distributed-memory machines using the Filaments package." Parallel Computing 26, no. 10 (August 2000): 1297–323. http://dx.doi.org/10.1016/s0167-8191(00)00038-7.

43

Löhner, Rainald. "Renumbering strategies for unstructured-grid solvers operating on shared-memory, cache-based parallel machines." Computer Methods in Applied Mechanics and Engineering 163, no. 1-4 (September 1998): 95–109. http://dx.doi.org/10.1016/s0045-7825(98)00005-x.

44

Huh, Joonmoo, and Deokwoo Lee. "Effective On-Chip Communication for Message Passing Programs on Multi-Core Processors." Electronics 10, no. 21 (November 3, 2021): 2681. http://dx.doi.org/10.3390/electronics10212681.

Abstract:
Shared memory is the most popular parallel programming model for multi-core processors, while message passing is generally used for large distributed machines. However, as the number of cores on a chip increases, the relative merits of shared memory versus message passing change, and we argue that message passing becomes a viable, high-performing parallel programming model. To demonstrate this hypothesis, we compare a shared memory architecture with a new message passing architecture on a suite of applications tuned for each system independently. Perhaps surprisingly, the fundamental behaviors of the applications studied in this work, when optimized for both models, are very similar to each other, and both could execute efficiently on multicore architectures despite many implementations being different from each other. Furthermore, if hardware is tuned to support message passing by supporting bulk message transfer and the elimination of unnecessary coherence overheads, and if effective support is available for global operations, then some applications would perform much better on a message passing architecture. Leveraging our insights, we design a message passing architecture that supports both memory-to-memory and cache-to-cache messaging in hardware. With the new architecture, message passing is able to outperform its shared memory counterparts on many of the applications due to the unique advantages of the message passing hardware as compared to cache coherence. In the best case, message passing achieves up to a 34% speedup over its shared memory counterpart, and it achieves a 10% speedup on average. In the worst case, message passing is slower in two applications, CG (conjugate gradient) and FT (Fourier transform), because it could not handle their unique data sharing patterns as well as the shared memory versions. Overall, our analysis demonstrates the importance of considering message passing as a high-performing, hardware-supported programming model on future multicore architectures.
45

Rappleye, Jason, Martins Innus, Charles M. Weeks, and Russ Miller. "SnB version 2.2: an example of crystallographic multiprocessing." Journal of Applied Crystallography 35, no. 3 (May 16, 2002): 374–76. http://dx.doi.org/10.1107/s0021889802005782.

Abstract:
The computer program SnB implements a direct-methods algorithm, known as Shake-and-Bake, which optimizes trial structures consisting of randomly positioned atoms. Although large Shake-and-Bake applications require significant amounts of computing time, the algorithm can be easily implemented in parallel in order to decrease the real time required to achieve a solution. By using a master-worker model, SnB version 2.2 is amenable to all of the prevalent modern parallel-computing platforms, including (i) shared-memory multiprocessor machines, such as the SGI Origin2000, (ii) distributed-memory multiprocessor machines, such as the IBM SP, and (iii) collections of workstations, including Beowulf clusters. A linear speedup in the processing of a fixed number of trial structures can be obtained on each of these platforms.
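The master-worker organization described above is a standard MPI pattern. The sketch below is not the SnB source; the tags, trial count, and process_trial() stand-in are invented. The master hands out one trial index at a time, so faster workers automatically receive more trials, which is what yields near-linear speedup for a fixed number of trials.

```c
/* Hedged sketch of the master-worker pattern.  Assumes at least as many
 * trials as workers; all names and constants are made up. */
#include <stdio.h>
#include <mpi.h>

#define NTRIALS  100
#define TAG_WORK 1
#define TAG_STOP 2

static double process_trial(int t) { return (double)t; } /* stand-in */

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {                      /* master: deal out trial indices */
        int next = 0;
        double best = -1.0, r;
        MPI_Status st;
        for (int w = 1; w < size; w++) {  /* prime each worker with a trial */
            MPI_Send(&next, 1, MPI_INT, w, TAG_WORK, MPI_COMM_WORLD);
            next++;
        }
        for (int done = 0; done < NTRIALS; done++) {
            MPI_Recv(&r, 1, MPI_DOUBLE, MPI_ANY_SOURCE, TAG_WORK,
                     MPI_COMM_WORLD, &st);
            if (r > best) best = r;
            int tag = (next < NTRIALS) ? TAG_WORK : TAG_STOP;
            MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, tag, MPI_COMM_WORLD);
            if (tag == TAG_WORK) next++;
        }
        printf("best figure of merit: %g\n", best);
    } else {                              /* worker: loop until told to stop */
        int trial;
        MPI_Status st;
        for (;;) {
            MPI_Recv(&trial, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == TAG_STOP) break;
            double r = process_trial(trial);
            MPI_Send(&r, 1, MPI_DOUBLE, 0, TAG_WORK, MPI_COMM_WORLD);
        }
    }
    MPI_Finalize();
    return 0;
}
```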
46

Traversa, Fabio Lorenzo, Chiara Ramella, Fabrizio Bonani, and Massimiliano Di Ventra. "Memcomputing NP-complete problems in polynomial time using polynomial resources and collective states." Science Advances 1, no. 6 (July 2015): e1500031. http://dx.doi.org/10.1126/sciadv.1500031.

Abstract:
Memcomputing is a novel non-Turing paradigm of computation that uses interacting memory cells (memprocessors for short) to store and process information on the same physical platform. It was recently proven mathematically that memcomputing machines have the same computational power as nondeterministic Turing machines. Therefore, they can solve NP-complete problems in polynomial time and, using the appropriate architecture, with resources that only grow polynomially with the input size. The reason for this computational power stems from properties inspired by the brain and shared by any universal memcomputing machine, in particular intrinsic parallelism and information overhead, namely, the capability of compressing information in the collective state of the memprocessor network. We show an experimental demonstration of an actual memcomputing architecture that solves the NP-complete version of the subset sum problem in only one step and is composed of a number of memprocessors that scales linearly with the size of the problem. We have fabricated this architecture using standard microelectronic technology so that it can be easily realized in any laboratory setting. Although the particular machine presented here is eventually limited by noise, and will thus require error-correcting codes to scale to an arbitrary number of memprocessors, it represents the first proof of concept of a machine capable of working with the collective state of interacting memory cells, unlike the present-day single-state machines built using the von Neumann architecture.
47

Burn, G. L. "Implementing the evaluation transformer model of reduction on parallel machines." Journal of Functional Programming 1, no. 3 (July 1991): 329–66. http://dx.doi.org/10.1017/s0956796800000137.

Abstract:
The evaluation transformer model of reduction generalizes lazy evaluation in two ways: it can start the evaluation of expressions before their first use, and it can evaluate expressions further than weak head normal form. Moreover, the amount of evaluation required of an argument to a function may depend on the amount of evaluation required of the function application. It is a suitable candidate model for implementing lazy functional languages on parallel machines. In this paper we explore the implementation of lazy functional languages on parallel machines, both shared and distributed memory architectures, using the evaluation transformer model of reduction. We will see that the same code can be produced for both styles of architecture, and the definition of the instruction set is virtually the same for each style. The essential difference is that a distributed memory architecture has one extra node type for non-local pointers, and instructions which involve the value of such nodes need their definitions extended to cover this new type of node. To make our presentation accessible, we base our description on a variant of the well-known G-machine, an abstract machine for executing lazy functional programs.
48

Chan, Albert, Frank Dehne, and Ryan Taylor. "CGMGRAPH/CGMLIB: Implementing and Testing CGM Graph Algorithms on PC Clusters and Shared Memory Machines." International Journal of High Performance Computing Applications 19, no. 1 (February 2005): 81–97. http://dx.doi.org/10.1177/1094342005051196.

49

Samardzic, Aleksandar, Dusan Starcevic, and Milan Tuba. "An implementation of ray tracing algorithm for the multiprocessor machines." Yugoslav Journal of Operations Research 16, no. 1 (2006): 125–35. http://dx.doi.org/10.2298/yjor0601125s.

Abstract:
Ray tracing is an algorithm for generating photo-realistic pictures of 3D scenes, given a scene description, lighting conditions, and viewing parameters as inputs. The algorithm is inherently convenient for parallelization, and the simplest parallelization scheme targets shared-memory parallel machines (multiprocessors). This paper presents two implementations of the algorithm developed by the authors for such machines, one using the POSIX threads API and another using the OpenMP API. The paper also presents results of rendering some test scenes using these implementations and discusses the efficiency of our parallel algorithm versions.
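Since every pixel is independent, the simplest shared-memory scheme mentioned above is a parallel loop over scanlines. The sketch below is not the authors' code; trace_ray() is a made-up stand-in for the shading computation, and the dynamic schedule balances scenes whose regions differ in cost.

```c
/* Hedged sketch of scanline-parallel ray tracing with OpenMP. */
#include <stdio.h>

#define W 640
#define H 480

static float trace_ray(int x, int y) {       /* stand-in "renderer" */
    return (float)(x ^ y) / (float)(W + H);
}

int main(void) {
    static float image[H][W];
    /* Each scanline is independent; dynamic scheduling absorbs the
     * varying cost of different image regions. */
    #pragma omp parallel for schedule(dynamic)
    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++)
            image[y][x] = trace_ray(x, y);
    printf("corner sample: %f\n", image[0][0]);
    return 0;
}
```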
50

Munier, Badri, Muhammad Aleem, Muhammad Arshad Islam, Muhammad Azhar Iqbal, and Waqar Mehmood. "A Fast Implementation of Minimum Spanning Tree Method and Applying it to Kruskal’s and Prim’s Algorithms." Sukkur IBA Journal of Computing and Mathematical Sciences 1, no. 1 (June 30, 2017): 58. http://dx.doi.org/10.30537/sjcms.v1i1.8.

Abstract:
In the last decade, application developers attained improved performance merely by employing machines based on higher-clocked processors. However, in 2003 multi-core processors emerged and eradicated the old processor-manufacturing approach based on increasing processors' clock frequencies. After the emergence of the new parallel processor architectures, serial applications must be re-engineered into parallel versions to exploit the computing power of the existing hardware. In this paper, we present an efficient parallel implementation of a minimum spanning tree algorithm to take advantage of the computing power of multi-core machines. Computer network routing, civil infrastructure planning, and cluster analysis are typical use cases of the spanning tree problem. The experimental results show that the proposed algorithm is scalable for different machine and graph sizes. The methodology is simple and can easily be implemented using different shared-memory parallel programming models.
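The paper's exact scheme for Kruskal's and Prim's algorithms is not reproduced here. As an illustration of a typical shared-memory MST kernel, the sketch below instead shows a Borůvka-style step: scanning the edge list in parallel to find each current component's cheapest outgoing edge. The graph and component labels are invented for illustration.

```c
/* Hedged sketch of one parallel MST building block (Boruvka-style step),
 * not the paper's implementation. */
#include <stdio.h>

#define NV 6
#define NE 7

typedef struct { int u, v; double w; } Edge;

int main(void) {
    Edge e[NE] = {{0,1,4},{1,2,1},{0,2,3},{3,4,2},{4,5,6},{3,5,5},{2,3,7}};
    int comp[NV] = {0, 0, 0, 1, 1, 1};  /* two current components */
    int best[2] = {-1, -1};             /* cheapest outgoing edge per comp */

    #pragma omp parallel for
    for (int i = 0; i < NE; i++) {
        int cu = comp[e[i].u], cv = comp[e[i].v];
        if (cu == cv) continue;          /* internal edge: skip */
        for (int c = 0; c < 2; c++) {
            if (c != cu && c != cv) continue;
            #pragma omp critical         /* simple sketch; real codes use CAS */
            if (best[c] < 0 || e[i].w < e[best[c]].w) best[c] = i;
        }
    }
    for (int c = 0; c < 2; c++)
        printf("component %d: edge %d-%d (w=%g)\n",
               c, e[best[c]].u, e[best[c]].v, e[best[c]].w);
    return 0;
}
```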