Journal articles on the topic 'Exascale systems'

Consult the top 50 journal articles for your research on the topic 'Exascale systems.'

Abstracts are reproduced below where available in the metadata.

1. Coteus, P. W., J. U. Knickerbocker, C. H. Lam, and Y. A. Vlasov. "Technologies for exascale systems." IBM Journal of Research and Development 55, no. 5 (September 2011): 14:1–14:12. http://dx.doi.org/10.1147/jrd.2011.2163967.

2. Rumley, Sebastien, Dessislava Nikolova, Robert Hendry, Qi Li, David Calhoun, and Keren Bergman. "Silicon Photonics for Exascale Systems." Journal of Lightwave Technology 33, no. 3 (February 1, 2015): 547–62. http://dx.doi.org/10.1109/jlt.2014.2363947.

3. Jensen, David, and Arun Rodrigues. "Embedded Systems and Exascale Computing." Computing in Science & Engineering 12, no. 6 (November 2010): 20–29. http://dx.doi.org/10.1109/mcse.2010.95.

4. Tahmazli-Khaligova, Firuza. "Challenges of Using Big Data in Distributed Exascale Systems." Azerbaijan Journal of High Performance Computing 3, no. 2 (December 29, 2020): 245–54. http://dx.doi.org/10.32010/26166127.2020.3.2.245.254.

Abstract: A traditional high-performance computing system can process huge data volumes, and the nature of events in classic HPC is static. A distributed exascale system has a different nature: processing big data there raises a new challenge, because the system's dynamic and interactive character changes process status and system elements. This paper discusses how the big data attributes of volume, velocity, and variety influence the dynamic and interactive nature of a distributed exascale system. To investigate this effect, the work proposes a Markov chain model whose transition matrix captures system status and memory sharing, allowing the convergence of the two systems to be analyzed and the influence of each on the other to be explored.
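
The transition-matrix idea can be made concrete with a toy example. The sketch below is illustrative only (the two states and probabilities are our assumptions, not values from the paper): it iterates a Markov chain over a node's 'static' and 'dynamic event' states until the distribution converges.

```python
# Illustrative sketch (hypothetical values): a two-state Markov chain for a
# compute node that is either "static" or perturbed by a dynamic/interactive
# event. The transition matrix P stands in for the paper's status/memory-
# sharing matrix.
import numpy as np

P = np.array([
    [0.95, 0.05],   # static -> static, static -> dynamic event
    [0.60, 0.40],   # dynamic event -> static, dynamic event -> dynamic event
])

state = np.array([1.0, 0.0])    # start in the static state
for _ in range(100):            # iterate the chain toward its fixed point
    state = state @ P

print("approximate stationary distribution:", state)
```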

5. Alexander, Francis J., James Ang, Jenna A. Bilbrey, Jan Balewski, Tiernan Casey, Ryan Chard, Jong Choi, et al. "Co-design Center for Exascale Machine Learning Technologies (ExaLearn)." International Journal of High Performance Computing Applications 35, no. 6 (September 27, 2021): 598–616. http://dx.doi.org/10.1177/10943420211029302.

Abstract: Rapid growth in data, computational methods, and computing power is driving a remarkable revolution in what variously is termed machine learning (ML), statistical learning, computational learning, and artificial intelligence. In addition to highly visible successes in machine-based natural language translation, playing the game Go, and self-driving cars, these new technologies also have profound implications for computational and experimental science and engineering, as well as for the exascale computing systems that the Department of Energy (DOE) is developing to support those disciplines. Not only do these learning technologies open up exciting opportunities for scientific discovery on exascale systems, they also appear poised to have important implications for the design and use of exascale computers themselves, including high-performance computing (HPC) for ML and ML for HPC. The overarching goal of the ExaLearn co-design project is to provide exascale ML software for use by Exascale Computing Project (ECP) applications, other ECP co-design centers, and DOE experimental facilities and leadership class computing facilities.

6. Ismayilova, Nigar. "Challenges of Using the Fuzzy Approach in Exascale Computing Systems." Azerbaijan Journal of High Performance Computing 4, no. 2 (December 31, 2021): 198–205. http://dx.doi.org/10.32010/26166127.2021.4.2.198.205.

Abstract: This paper studies opportunities for using fuzzy set theory to construct an appropriate load-balancing model for exascale distributed systems. The occurrence of dynamic and interactive events in multicore computing systems leads to uncertainty, and because fuzzy-logic-based solutions can manage uncertain environments, several approaches and open challenges arise for developing load-balancing models in exascale computing systems.
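
As a minimal sketch of how a fuzzy membership function could feed a load balancer (our illustration with hypothetical thresholds and node loads, not the paper's model), a node's load factor is mapped to a degree of 'overloaded' in [0, 1], and nodes are ranked by that degree:

```python
# Illustrative sketch (assumptions, not the paper's model): a triangular fuzzy
# membership function mapping a node's load factor to a degree of "overloaded",
# which a scheduler could use to rank migration candidates under uncertainty.
def overloaded(load: float, low: float = 0.5, high: float = 0.9) -> float:
    """Degree in [0, 1] to which a node counts as overloaded."""
    if load <= low:
        return 0.0
    if load >= high:
        return 1.0
    return (load - low) / (high - low)

nodes = {"n1": 0.42, "n2": 0.75, "n3": 0.97}   # hypothetical load factors
ranked = sorted(nodes, key=lambda n: overloaded(nodes[n]), reverse=True)
print(ranked)  # nodes most in need of shedding work come first
```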

7. Klasky, S. A., H. Abbasi, M. Ainsworth, J. Choi, M. Curry, T. Kurc, Q. Liu, et al. "Exascale Storage Systems the SIRIUS Way." Journal of Physics: Conference Series 759 (October 2016): 012095. http://dx.doi.org/10.1088/1742-6596/759/1/012095.

8. Stepanenko, Sergey, and Vasiliy Yuzhakov. "Exascale supercomputers. Architectural outlines." Program Systems: Theory and Applications 4, no. 4 (November 15, 2013): 61–90. http://dx.doi.org/10.12737/2418.

Abstract: Architectural aspects of exascale supercomputers are explored. Parameters of the computing environment and interconnect are evaluated. It is shown that reaching exascale performance requires hybrid systems. Processor elements of such systems comprise CPU cores and arithmetic accelerators, implementing the MIMD and SIMD computing disciplines, respectively. Efficient exascale hybrid systems require fundamentally new applications and architectural efficiency-scaling solutions, including: 1) process-aware structural reconfiguring of hybrid processor elements by varying the number of MIMD cores and the SIMD cores communicating with them, to attain the highest performance and efficiency possible under given conditions; 2) application of conflict-free sets of sources and receivers and/or decomposition of the computation into subprocesses allocated to environment elements in accordance with their features and communication topology, to minimize communication time; 3) application of topological redundancy methods to preserve the topology and overall performance achieved by the above solutions in case of element failure, i.e. to provide fault-tolerant efficiency scaling. Application of these solutions is illustrated by running molecular dynamics tests and the NPB LU benchmark. The resulting architecture displays dynamic adaptability to program features, which in turn ensures the efficiency of using exascale supercomputers.

9. Abdullayev, Fakhraddin. "Resource Discovery in Distributed Exascale Systems Using a Multi-Agent Model: Categorization of Agents Based on Their Characteristics." Azerbaijan Journal of High Performance Computing 6, no. 1 (June 30, 2023): 113–20. http://dx.doi.org/10.32010/26166127.2023.6.1.113.120.

Abstract: Resource discovery is a crucial component in high-performance computing (HPC) systems. This paper presents a multi-agent model for resource discovery in distributed exascale systems. Agents are categorized based on resource types and behavior-specific characteristics. The model enables efficient identification and acquisition of memory, process, file, and IO resources. Through a comprehensive exploration, we highlight the potential of our approach in addressing resource discovery challenges in exascale computing systems, paving the way for optimized resource utilization and enhanced system performance.

10. Shalf, John, Dan Quinlan, and Curtis Janssen. "Rethinking Hardware-Software Codesign for Exascale Systems." Computer 44, no. 11 (November 2011): 22–30. http://dx.doi.org/10.1109/mc.2011.300.

11. Vazirov, Etibar. "Machine Learning-Based Modeling for Performance Improvement in Exascale Systems." Azerbaijan Journal of High Performance Computing 3, no. 2 (December 29, 2020): 223–33. http://dx.doi.org/10.32010/26166127.2020.3.2.223.233.

Abstract: The combination of heterogeneous resources within exascale architectures promises revolutionary compute capability for scientific applications. Data will be available about the current progress of jobs, the status of hardware and software, and memory and network resource usage. This provisional information is invaluable for learning to predict where applications may face dynamic and interactive behavior when resource failures occur. This paper proposes building a scalable framework that analyzes performance information collected from these sources to develop new statistical footprints of resource usage for HPC applications, predict the causes of failure, and provide new capabilities for recovering from application failures. Applying HPC capabilities at exascale risks substantial scientific unproductiveness in computational procedures; in that sense, integrating machine learning into exascale computation is an encouraging way to obtain large performance gains and an opportunity to leap ahead a generation of simulation improvements.

12. Bakhishov, Ulphat. "Defining Parameters for the Oscillation Model of Load Flow of Global Activities in a Fully Distributed Exascale System." Azerbaijan Journal of High Performance Computing 4, no. 1 (June 30, 2021): 126–31. http://dx.doi.org/10.32010/26166127.2021.4.1.126.131.

Abstract: Distributed exascale computing systems are envisioned as HPC systems capable of performing one exaflop of operations per second in a dynamic and interactive environment without central managers. In such an environment each node must manage its own load, and basic rules of load distribution are needed for all nodes so that load distribution can be optimized without central coordination. This paper proposes an oscillation model for load distribution in fully distributed exascale systems, defines parameters for the model, and outlines future work.

13. Esmaeili Bidhendi, Zohreh, Pouria Fakhri, and Ehsan Mousavi Khaneghah. "Challenges of Using Unstructured P2P Systems to Support Distributed Exascale Computing." Azerbaijan Journal of High Performance Computing 2, no. 1 (June 30, 2019): 3–6. http://dx.doi.org/10.32010/26166127.2019.2.1.3.6.

14. Wulff, Eric, Maria Girone, and Joosep Pata. "Hyperparameter optimization of data-driven AI models on HPC systems." Journal of Physics: Conference Series 2438, no. 1 (February 1, 2023): 012092. http://dx.doi.org/10.1088/1742-6596/2438/1/012092.

Abstract: In the European Center of Excellence in Exascale Computing "Research on AI- and Simulation-Based Engineering at Exascale" (CoE RAISE), researchers develop novel, scalable AI technologies towards exascale. This work exercises high-performance computing resources to perform large-scale hyperparameter optimization using distributed training on multiple compute nodes, as part of RAISE's work on data-driven use cases that leverage the AI and HPC cross-methods developed within the project. In response to the demand for parallelizable and resource-efficient hyperparameter optimization methods, advanced hyperparameter search algorithms, including Random Search, Hyperband, and ASHA, are benchmarked and compared in terms of both accuracy and accuracy per compute resources spent. As an example use case, a graph neural network model known as MLPF, developed for machine-learned particle-flow reconstruction, acts as the base model for optimization. Results show that hyperparameter optimization significantly increased the performance of MLPF and that this would not have been possible without access to large-scale high-performance computing resources. It is also shown that, in the case of MLPF, the ASHA algorithm combined with Bayesian optimization gives the largest performance increase per compute resources spent among the investigated algorithms.
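
To illustrate how Hyperband/ASHA-style searches spend less compute on bad trials, the sketch below implements plain synchronous successive halving in pure Python. The objective function, trial count, and budget schedule are hypothetical stand-ins; the paper's runs use asynchronous promotion and distributed training on HPC nodes.

```python
# Illustrative sketch (not CoE RAISE code): synchronous successive halving,
# the idea underlying Hyperband/ASHA. Trials get a small budget, the worse
# half is stopped, and survivors get a doubled budget. train_and_eval is a
# hypothetical stand-in for one budget-limited training run returning a loss.
import random

def train_and_eval(params: dict, budget: int) -> float:
    # Placeholder objective: pretend learning rates near 0.01 do better,
    # with noise that shrinks as the budget grows. Replace with real training.
    return abs(params["lr"] - 0.01) + random.random() / budget

trials = [{"lr": 10 ** random.uniform(-4, -1)} for _ in range(16)]
budget = 1
while len(trials) > 1:
    scored = sorted(trials, key=lambda p: train_and_eval(p, budget))
    trials = scored[: len(scored) // 2]   # keep the better half
    budget *= 2                           # promote survivors to a larger budget

print("best configuration:", trials[0])
```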

15. Anzt, Hartwig, Erik Boman, Rob Falgout, Pieter Ghysels, Michael Heroux, Xiaoye Li, Lois Curfman McInnes, et al. "Preparing sparse solvers for exascale computing." Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 378, no. 2166 (January 20, 2020): 20190053. http://dx.doi.org/10.1098/rsta.2019.0053.

Abstract: Sparse solvers provide essential functionality for a wide variety of scientific applications. Highly parallel sparse solvers are essential for continuing advances in high-fidelity, multi-physics and multi-scale simulations, especially as we target exascale platforms. This paper describes the challenges, strategies and progress of the US Department of Energy Exascale Computing project towards providing sparse solvers for exascale computing platforms. We address the demands of systems with thousands of high-performance node devices where exposing concurrency, hiding latency and creating alternative algorithms become essential. The efforts described here are works in progress, highlighting current success and upcoming challenges. This article is part of a discussion meeting issue ‘Numerical algorithms for high-performance computational science’.
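
A typical kernel underneath the solvers discussed above is sparse matrix-vector multiplication (SpMV). As a minimal illustration (our sketch, not code from the project), here it is over the compressed sparse row (CSR) format; the indirect access to x in the inner loop is exactly the irregular memory traffic that makes exposing concurrency and hiding latency difficult at exascale.

```python
# Illustrative sketch (not from the paper): sparse matrix-vector multiply in
# compressed sparse row (CSR) form, the workhorse of iterative sparse solvers.
def csr_spmv(indptr, indices, data, x):
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]   # indirect, cache-unfriendly read
    return y

# 2x2 example: [[4, 1], [0, 3]] @ [1, 2] -> [6, 6]
print(csr_spmv([0, 2, 3], [0, 1, 1], [4.0, 1.0, 3.0], [1.0, 2.0]))
```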

16. Mousavi Khaneghah, Ehsan, and Araz R. Aliev. "Challenges of Influence Dynamic and Interactive Events on Resource Discovery Functionality Outside of Distributed Exascale Systems." Azerbaijan Journal of High Performance Computing 3, no. 2 (December 29, 2020): 164–80. http://dx.doi.org/10.32010/26166127.2020.3.2.164.180.

Abstract: Resource discovery in exascale systems should support the occurrence of dynamic behavior in each element involved in the resource discovery process. Dynamic and interactive events in the accountable computational element create challenges in executing resource discovery activities such as continuing the response to a request, granting access rights, and allocating resources to a process; without management and control of such events in the accountable computational element, these activities will fail. This paper first examines the function of resource discovery in the accountable computational element and then analyzes the effects of dynamic and interactive events on that function. The purpose is to analyze the use of traditional resource discovery in exascale distributed systems and to investigate the factors that resource discovery management must consider to be applicable in such systems.

17. Panda, Dhabaleswar, Xiao-Yi Lu, and Hari Subramoni. "Networking and communication challenges for post-exascale systems." Frontiers of Information Technology & Electronic Engineering 19, no. 10 (October 2018): 1230–35. http://dx.doi.org/10.1631/fitee.1800631.

18. Bougeret, Marin, Henri Casanova, Yves Robert, Frédéric Vivien, and Dounia Zaidouni. "Using group replication for resilience on exascale systems." International Journal of High Performance Computing Applications 28, no. 2 (October 2013): 210–24. http://dx.doi.org/10.1177/1094342013505348.

19. Mirtaheri, Seyedeh Leili, and Lucio Grandinetti. "Dynamic load balancing in distributed exascale computing systems." Cluster Computing 20, no. 4 (May 19, 2017): 3677–89. http://dx.doi.org/10.1007/s10586-017-0902-8.

20. Canal, Ramon, Carles Hernandez, Rafa Tornero, Alessandro Cilardo, Giuseppe Massari, Federico Reghenzani, William Fornaciari, et al. "Predictive Reliability and Fault Management in Exascale Systems." ACM Computing Surveys 53, no. 5 (October 15, 2020): 1–32. http://dx.doi.org/10.1145/3403956.

21. Dauwe, Daniel, Sudeep Pasricha, Anthony A. Maciejewski, and Howard Jay Siegel. "Resilience-Aware Resource Management for Exascale Computing Systems." IEEE Transactions on Sustainable Computing 3, no. 4 (October 1, 2018): 332–45. http://dx.doi.org/10.1109/tsusc.2018.2797890.

22. Pleiter, Dirk. "HPC Systems in the Next Decade – What to Expect, When, Where." EPJ Web of Conferences 245 (2020): 11004. http://dx.doi.org/10.1051/epjconf/202024511004.

Abstract: HPC systems have seen impressive growth in performance over many years, and the next milestone is expected soon with the deployment of exascale systems in 2021. In this paper, we provide an overview of the exascale challenges from a computer architecture perspective and explore technological and other constraints. The analysis of upcoming architectural options and emerging technologies allows expectations to be set for application developers, who will have to cope with heterogeneous architectures, increasingly diverse compute technologies, and deeper memory and storage hierarchies. Finally, we discuss needs resulting from changing science and engineering workflows, which must be addressed by making HPC systems available as part of more open e-infrastructures that also provide other compute and storage services.

23. Sohrabi, Zeinab, and Ehsan Mousavi Khaneghah. "Challenges of Using Live Process Migration in Distributed Exascale Systems." Azerbaijan Journal of High Performance Computing 3, no. 2 (December 29, 2020): 151–63. http://dx.doi.org/10.32010/26166127.2020.3.2.151.163.

Abstract: Virtual-machine-based process migration mechanisms have the potential to be used in distributed exascale systems because they can resume process execution and support environments with heterogeneous computational units. Their ability to reduce process suspension time through live migration makes them suitable for transferring processes in distributed exascale systems to prevent the failure of related process activity. However, the performance function of a VM-based process migrator cannot manage dynamic and interactive events, their effects on the mechanism's operation, or the shift in the basic unit of system activity from the process to the global activity. This paper examines the challenges that dynamic and interactive events pose to VM-based process migrators by analyzing their performance function.

24. Hoekstra, Alfons G., Simon Portegies Zwart, and Peter V. Coveney. "Multiscale modelling, simulation and computing: from the desktop to the exascale." Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 377, no. 2142 (February 18, 2019): 20180355. http://dx.doi.org/10.1098/rsta.2018.0355.

Abstract: This short contribution introduces a theme issue dedicated to ‘Multiscale modelling, simulation and computing: from the desktop to the exascale’. It holds a collection of articles presenting cutting-edge research in generic multiscale modelling and multiscale computing, and applications thereof on high-performance computing systems. The special issue starts with a position paper to discuss the paradigm of multiscale computing in the face of the emerging exascale, followed by a review and critical assessment of existing multiscale computing environments. This theme issue provides a state-of-the-art account of generic multiscale computing, as well as exciting examples of applications of such concepts in domains ranging from astrophysics, via material science and fusion, to biomedical sciences. This article is part of the theme issue ‘Multiscale modelling, simulation and computing: from the desktop to the exascale’.

25. Davis, Andrew, Aleksander Dubas, and Ruben Otin. "Enabling validated exascale nuclear science." EPJ Web of Conferences 245 (2020): 09001. http://dx.doi.org/10.1051/epjconf/202024509001.

Abstract: The field of fusion energy is about to enter the ITER era: for the first time we will have access to a device capable of producing 500 MW of fusion power, with plasmas lasting more than 300 seconds and core temperatures in excess of 100-200 million K. Engineering simulation for fusion sits in an awkward position, with a mixture of commercial and licensed tools in use, often with email-driven transfer of data. To address the engineering simulation challenges of the future, the community must move to a much more tightly coupled simulation ecosystem, with a set of tools that can scale to take advantage of current petascale and upcoming exascale systems to address the design challenges of the ITER era.

26. Gholamrezaie, Faezeh, and Azar Feyziyev. "A Mechanism for Using the Flushing Process Migration Mechanism in Distributed Exascale Systems." Azerbaijan Journal of High Performance Computing 5, no. 2 (July 1, 2022): 3–32. http://dx.doi.org/10.32010/26166127.2022.5.1.3.32.

Abstract: Dynamic and interactive events affect the elements that make up the computing system manager, increasing the time required to run a user program or changing how these elements operate. Such changes either lengthen the execution time of a scientific program or leave the system unable to execute it at all. Using an analysis of the migration process and vector algebra, this work seeks to enable the flushing process migration mechanism to support distributed exascale systems despite dynamic and interactive events. The paper investigates the flushing process migration management mechanism in distributed exascale systems and the impact of dynamic and interactive events on the computational system.

27. Alfian Amrizal, Muhammad, Atsuya Uno, Yukinori Sato, Hiroyuki Takizawa, and Hiroaki Kobayashi. "Energy-Performance Modeling of Speculative Checkpointing for Exascale Systems." IEICE Transactions on Information and Systems E100.D, no. 12 (2017): 2749–60. http://dx.doi.org/10.1587/transinf.2017pap0002.

28. Kerbyson, Darren, Abhinav Vishnu, Kevin Barker, and Adolfy Hoisie. "Codesign Challenges for Exascale Systems: Performance, Power, and Reliability." Computer 44, no. 11 (November 2011): 37–43. http://dx.doi.org/10.1109/mc.2011.298.

29. Di Girolamo, Alessandro, Federica Legger, Panos Paparrigopoulos, Alexei Klimentov, Jaroslava Schovancová, Valentin Kuznetsov, Mario Lassnig, et al. "Operational Intelligence for Distributed Computing Systems for Exascale Science." EPJ Web of Conferences 245 (2020): 03017. http://dx.doi.org/10.1051/epjconf/202024503017.

Abstract: In the near future, large scientific collaborations will face unprecedented computing challenges. Processing and storing exabyte datasets require a federated infrastructure of distributed computing resources. The current systems have proven to be mature and capable of meeting the experiment goals, by allowing timely delivery of scientific results. However, a substantial amount of interventions from software developers, shifters and operational teams is needed to efficiently manage such heterogeneous infrastructures. A wealth of operational data can be exploited to increase the level of automation in computing operations by using adequate techniques, such as machine learning (ML), tailored to solve specific problems. The Operational Intelligence project is a joint effort from various WLCG communities aimed at increasing the level of automation in computing operations. We discuss how state-of-the-art technologies can be used to build general solutions to common problems and to reduce the operational cost of the experiment computing infrastructure.

30. Del Ben, Mauro, Felipe H. da Jornada, Andrew Canning, Nathan Wichmann, Karthik Raman, Ruchira Sasanka, Chao Yang, Steven G. Louie, and Jack Deslippe. "Large-scale GW calculations on pre-exascale HPC systems." Computer Physics Communications 235 (February 2019): 187–95. http://dx.doi.org/10.1016/j.cpc.2018.09.003.

31. Shetty, Nayana. "A Comprehensive Review on Power Efficient Fault Tolerance Models in High Performance Computation Systems." Journal of Soft Computing Paradigm 3, no. 3 (August 7, 2021): 135–48. http://dx.doi.org/10.36548/jscp.2021.3.001.

Abstract: For the purpose of high-performance computation, several machines have been developed at the exascale level. These machines can perform at least one exaflop, or 10^18 calculations per second (a billion billion). The universe and nature can be understood in a better manner, and certain challenging computational issues addressed, by using these machines. However, these machines face certain obstacles. Because exascale machines encompass a huge quantity of components, they may experience frequent failures, and resilience is challenging. A high progress rate must be maintained for applications by incorporating some form of fault tolerance in the system. Power management has to be performed by operating the system in a parallel manner, and all layers, including the fault tolerance layer, must adhere to the system's power limit. Huge energy bills may be expected on installation of exascale machines due to their high power consumption, so the energy profile of the various fault tolerance models must be analyzed. Parallel recovery, message logging, and checkpoint/restart fault tolerance models for rollback recovery are evaluated in this paper. For execution with failures, the most energy-efficient solution is provided by parallel recovery when programs with various programming models are used; execution also completes faster with parallel recovery than with the other techniques. An analytical model is used to explore these models and their behavior at extreme scales.
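
For a taste of this kind of analytical modeling, the sketch below uses the classic Young/Daly first-order result, not necessarily the model from the paper: the checkpoint interval that minimizes expected waste grows with the square root of the product of checkpoint cost and mean time between failures. The numbers are hypothetical.

```python
# Illustrative sketch (the classic Young/Daly approximation, not necessarily
# the paper's analytical model): the checkpoint interval minimizing expected
# wasted time, given checkpoint cost C and system MTBF M (same time units).
import math

def optimal_checkpoint_interval(C: float, M: float) -> float:
    """Young's approximation: tau_opt = sqrt(2 * C * M)."""
    return math.sqrt(2.0 * C * M)

# Hypothetical exascale-ish numbers: 5-minute checkpoints, 2-hour MTBF.
C, M = 5 * 60.0, 2 * 3600.0
print(f"checkpoint every {optimal_checkpoint_interval(C, M) / 60:.1f} minutes")
```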

32. Snytnikov, Aleksey Vladimirovich, and Galina Gennadievna Lazareva. "Computational methods of continuum mechanics for exaflop computer systems." Computational Technologies (Вычислительные технологии), no. 5 (November 24, 2021): 81–94. http://dx.doi.org/10.25743/ict.2021.26.5.007.

Abstract: The article deals with applied issues that arise when exascale computing is used to solve applied problems. Based on a review of work in this area, the most pressing issues related to exascale calculations are highlighted, with particular attention to software features, algorithms, and numerical methods for exaflop supercomputers, and the requirements for such programs and algorithms are formulated. From a review of existing approaches to achieving high performance, the main fundamentally different and non-overlapping directions for improving the performance of calculations are identified, and the need for applicability criteria for computational algorithms on exaflop supercomputers is raised. Currently the only criterion in use demands the absence of a significant drop in efficiency in the transition from a petaflop calculation to a ten-petaflop calculation; where such calculations are not possible, simulation modelling can be carried out. Examples are given of developing new, and adapting existing, algorithms and numerical methods for solving problems of continuum mechanics, and the fundamental difference between algorithms specially designed for exascale machines and algorithms adapted to exaflops is shown. An analysis of publications shows that in continuum mechanics the prevailing approach is not the development of new numerical methods and algorithms but the adaptation of existing ones to the architecture of exaflop supercomputers. The most popular applications are also analyzed; the most relevant in this area is computational fluid dynamics, because hydrodynamic applications form a rich and diverse field, and the number of publications indicates that high-performance computing is now both available and in demand.

33. Goz, David, Georgios Ieronymakis, Vassilis Papaefstathiou, Nikolaos Dimou, Sara Bertocco, Francesco Simula, Antonio Ragagnin, Luca Tornatore, Igor Coretti, and Giuliano Taffoni. "Performance and Energy Footprint Assessment of FPGAs and GPUs on HPC Systems Using Astrophysics Application." Computation 8, no. 2 (April 17, 2020): 34. http://dx.doi.org/10.3390/computation8020034.

Abstract: New challenges in Astronomy and Astrophysics (AA) are urging the need for many exceptionally computationally intensive simulations. “Exascale” (and beyond) computational facilities are mandatory to address the size of theoretical problems and data coming from the new generation of observational facilities in AA. Currently, the High-Performance Computing (HPC) sector is undergoing a profound phase of innovation, in which the primary challenge to the achievement of the “Exascale” is the power consumption. The goal of this work is to give some insights about performance and energy footprint of contemporary architectures for a real astrophysical application in an HPC context. We use a state-of-the-art N-body application that we re-engineered and optimized to exploit the heterogeneous underlying hardware fully. We quantitatively evaluate the impact of computation on energy consumption when running on four different platforms. Two of them represent the current HPC systems (Intel-based and equipped with NVIDIA GPUs), one is a micro-cluster based on ARM-MPSoC, and one is a “prototype towards Exascale” equipped with ARM-MPSoCs tightly coupled with FPGAs. We investigate the behavior of the different devices where the high-end GPUs excel in terms of time-to-solution while MPSoC-FPGA systems outperform GPUs in power consumption. Our experience reveals that considering FPGAs for computationally intensive application seems very promising, as their performance is improving to meet the requirements of scientific applications. This work can be a reference for future platform development for astrophysics applications where computationally intensive calculations are required.

34. Sun, Zhiwei, Anthony Skjellum, Lee Ward, and Matthew L. Curry. "A Lightweight Data Location Service for Nondeterministic Exascale Storage Systems." ACM Transactions on Storage 10, no. 3 (July 2014): 1–22. http://dx.doi.org/10.1145/2629451.

35. Dong, Xiangyu, Yuan Xie, Naveen Muralimanohar, and Norman P. Jouppi. "Hybrid checkpointing using emerging nonvolatile memories for future exascale systems." ACM Transactions on Architecture and Code Optimization 8, no. 2 (July 2011): 1–29. http://dx.doi.org/10.1145/1970386.1970387.

36. Byna, Suren, M. Scot Breitenfeld, Bin Dong, Quincey Koziol, Elena Pourmal, Dana Robinson, Jerome Soumagne, Houjun Tang, Venkatram Vishwanath, and Richard Warren. "ExaHDF5: Delivering Efficient Parallel I/O on Exascale Computing Systems." Journal of Computer Science and Technology 35, no. 1 (January 2020): 145–60. http://dx.doi.org/10.1007/s11390-020-9822-9.

37. Purandare, Devashish R., Daniel Bittman, and Ethan L. Miller. "Analysis and Workload Characterization of the CERN EOS Storage System." ACM SIGOPS Operating Systems Review 56, no. 1 (June 14, 2022): 55–61. http://dx.doi.org/10.1145/3544497.3544507.

Abstract: Modern, large-scale scientific computing runs on complex exascale storage systems that support even more complex data workloads. Understanding the data access and movement patterns is vital for informing the design of future iterations of existing systems and next-generation systems. Yet we are lacking in publicly available traces and tools to help us understand even one system in depth, let alone correlate long-term cross-system trends.

38. Turner, John A., James Belak, Nathan Barton, Matthew Bement, Neil Carlson, Robert Carson, Stephen DeWitt, et al. "ExaAM: Metal additive manufacturing simulation at the fidelity of the microstructure." International Journal of High Performance Computing Applications 36, no. 1 (January 2022): 13–39. http://dx.doi.org/10.1177/10943420211042558.

Abstract: Additive manufacturing (AM), or 3D printing, of metals is transforming the fabrication of components, in part by dramatically expanding the design space, allowing optimization of shape and topology. However, although the physical processes involved in AM are similar to those of welding, a field with decades of experimental, modeling, simulation, and characterization experience, qualification of AM parts remains a challenge. The availability of exascale computational systems, particularly when combined with data-driven approaches such as machine learning, enables topology and shape optimization as well as accelerated qualification by providing process-aware, locally accurate microstructure and mechanical property models. We describe the physics components comprising the Exascale Additive Manufacturing simulation environment and report progress using highly resolved melt pool simulations to inform part-scale finite element thermomechanics simulations, drive microstructure evolution, and determine constitutive mechanical property relationships based on those microstructures using polycrystal plasticity. We report on implementation of these components for exascale computing architectures, as well as the multi-stage simulation workflow that provides a unique high-fidelity model of process–structure–property relationships for AM parts. In addition, we discuss verification and validation through collaboration with efforts such as AM-Bench, a set of benchmark test problems under development by a team led by the National Institute of Standards and Technology.

39. Cha, Myung-Hoon, Sang-Min Lee, Hong-Yeon Kim, and Young-Kyun Kim. "Effective metadata management in exascale file system." Journal of Supercomputing 75, no. 11 (August 22, 2019): 7665–89. http://dx.doi.org/10.1007/s11227-019-02974-8.

40. Varghese, Anish, Bob Edwards, Gaurav Mitra, and Alistair P. Rendell. "Programming the Adapteva Epiphany 64-core network-on-chip coprocessor." International Journal of High Performance Computing Applications 31, no. 4 (August 27, 2015): 285–302. http://dx.doi.org/10.1177/1094342015599238.

Abstract: Energy efficiency is the primary impediment in the path to exascale computing. Consequently, the high-performance computing community is increasingly interested in low-power high-performance embedded systems as building blocks for large-scale high-performance systems. The Adapteva Epiphany architecture integrates low-power RISC cores on a 2D mesh network and promises up to 70 GFLOPS/Watt of theoretical performance. However, with just 32 KB of memory per eCore for storing both data and code, programming the Epiphany system presents significant challenges. In this paper we evaluate the performance of a 64-core Epiphany system with a variety of basic compute and communication micro-benchmarks. Further, we implemented two well known application kernels, 5-point star-shaped heat stencil with a peak performance of 65.2 GFLOPS and matrix multiplication with 65.3 GFLOPS in single precision across 64 Epiphany cores. We discuss strategies for implementing high-performance computing application kernels on such memory constrained low-power devices and compare the Epiphany with competing low-power systems. With future Epiphany revisions expected to house thousands of cores on a single chip, understanding the merits of such an architecture is of prime importance to the exascale initiative.
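
For a sense of the arithmetic being benchmarked, the 5-point heat stencil above reduces to a few lines of NumPy. This sketch shows only the numerical update (grid size and coefficient are arbitrary choices of ours); the Epiphany implementation must additionally tile the grid so each core's working set fits in its 32 KB of local memory and exchange halo rows over the on-chip mesh.

```python
# Illustrative sketch (not the Epiphany implementation): one Jacobi sweep of
# the 5-point heat stencil the authors benchmark, written with NumPy slices.
import numpy as np

def heat_step(u: np.ndarray, alpha: float = 0.1) -> np.ndarray:
    """Return u after one explicit 5-point stencil sweep (fixed boundary)."""
    v = u.copy()
    v[1:-1, 1:-1] = u[1:-1, 1:-1] + alpha * (
        u[:-2, 1:-1] + u[2:, 1:-1] + u[1:-1, :-2] + u[1:-1, 2:]
        - 4.0 * u[1:-1, 1:-1]
    )
    return v

grid = np.zeros((64, 64))
grid[32, 32] = 100.0            # hot spot in the middle of the domain
for _ in range(10):
    grid = heat_step(grid)
print(grid[30:35, 30:35].round(2))
```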

41. Kim, Jeong-Joon. "Erasure-Coding-Based Storage and Recovery for Distributed Exascale Storage Systems." Applied Sciences 11, no. 8 (April 7, 2021): 3298. http://dx.doi.org/10.3390/app11083298.

Abstract: Various techniques have been used in distributed file systems for data availability and stability. Typically, data are stored using replication, but because of its poor space efficiency, the erasure-coding (EC) technique has been utilized more recently. EC improves on the space efficiency of replication, but it introduces performance degradation factors such as encoding and decoding cost and input/output (I/O) degradation. This study therefore proposes a buffering and combining technique in which the many I/O requests generated during encoding in an EC-based distributed file system are combined and processed as one. It also proposes four recovery measures (disk I/O load distribution, random block layout, multi-thread-based parallel recovery, and a matrix recycle technique) to distribute the disk I/O load generated during decoding.
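
The recovery step that these measures accelerate can be shown with the simplest erasure code, single-parity XOR; production systems use Reed-Solomon-style codes with several parity blocks, so this toy sketch is our simplification rather than the paper's scheme.

```python
# Illustrative sketch (single-parity XOR coding): k data blocks are stored
# with one parity block, and any single lost block can be rebuilt by XOR-ing
# the survivors -- the basic decode step the recovery measures speed up.
def xor_blocks(blocks):
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

data = [b"exa-", b"scal", b"e!!!"]          # k = 3 equal-sized data blocks
parity = xor_blocks(data)                   # stored as the 4th block

lost = data.pop(1)                          # lose one data block
rebuilt = xor_blocks(data + [parity])       # XOR of survivors restores it
assert rebuilt == lost
print(rebuilt)
```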

42. Nair, R., S. F. Antao, C. Bertolli, P. Bose, J. R. Brunheroto, T. Chen, C. Y. Cher, et al. "Active Memory Cube: A processing-in-memory architecture for exascale systems." IBM Journal of Research and Development 59, no. 2/3 (March 2015): 17:1–17:14. http://dx.doi.org/10.1147/jrd.2015.2409732.

43. Getov, Vladimir, Adolfy Hoisie, and Harvey J. Wasserman. "Codesign for Systems and Applications: Charting the Path to Exascale Computing." Computer 44, no. 11 (November 2011): 19–21. http://dx.doi.org/10.1109/mc.2011.334.

44. Losada, Nuria, Patricia González, María J. Martín, George Bosilca, Aurélien Bouteiller, and Keita Teranishi. "Fault tolerance of MPI applications in exascale systems: The ULFM solution." Future Generation Computer Systems 106 (May 2020): 467–81. http://dx.doi.org/10.1016/j.future.2020.01.026.

45. Acer, Seher, Ariful Azad, Erik G. Boman, Aydın Buluç, Karen D. Devine, SM Ferdous, Nitin Gawande, et al. "EXAGRAPH: Graph and combinatorial methods for enabling exascale applications." International Journal of High Performance Computing Applications 35, no. 6 (September 30, 2021): 553–71. http://dx.doi.org/10.1177/10943420211029299.

Abstract: Combinatorial algorithms in general and graph algorithms in particular play a critical enabling role in numerous scientific applications. However, the irregular memory access nature of these algorithms makes them one of the hardest algorithmic kernels to implement on parallel systems. With tens of billions of hardware threads and deep memory hierarchies, the exascale computing systems in particular pose extreme challenges in scaling graph algorithms. The codesign center on combinatorial algorithms, ExaGraph, was established to design and develop methods and techniques for efficient implementation of key combinatorial (graph) algorithms chosen from a diverse set of exascale applications. Algebraic and combinatorial methods have a complementary role in the advancement of computational science and engineering, including playing an enabling role on each other. In this paper, we survey the algorithmic and software development activities performed under the auspices of ExaGraph from both a combinatorial and an algebraic perspective. In particular, we detail our recent efforts in porting the algorithms to manycore accelerator (GPU) architectures. We also provide a brief survey of the applications that have benefited from the scalable implementations of different combinatorial algorithms to enable scientific discovery at scale. We believe that several applications will benefit from the algorithmic and software tools developed by the ExaGraph team.

46. Shahzad, Faisal, Markus Wittmann, Moritz Kreutzer, Thomas Zeiser, Georg Hager, and Gerhard Wellein. "A Survey of Checkpoint/Restart Techniques on Distributed Memory Systems." Parallel Processing Letters 23, no. 04 (December 2013): 1340011. http://dx.doi.org/10.1142/s0129626413400112.

Abstract: The road to exascale computing poses many challenges for the High Performance Computing (HPC) community. Each step on the exascale path is mainly the result of a higher level of parallelism of the basic building blocks (i.e., CPUs, memory units, networking components, etc.). The reliability of each of these basic components does not increase at the same rate as the rate of hardware parallelism. This results in a reduction of the mean time to failure (MTTF) of the whole system. A fault tolerance environment is thus indispensable to run large applications on such clusters. Checkpoint/Restart (C/R) is the classic and most popular method to minimize failure damage. Its ease of implementation makes it useful, but typically it introduces significant overhead to the application. Several efforts have been made to reduce the C/R overhead. In this paper we compare various C/R techniques for their overheads by implementing them on two different categories of applications. These approaches are based on parallel-file-system (PFS)-level checkpoints (synchronous/asynchronous) and node-level checkpoints. We utilize the Scalable Checkpoint/Restart (SCR) library for the comparison of node-level checkpoints. For asynchronous PFS-level checkpoints, we use the Damaris library, the SCR asynchronous feature, and application-based checkpointing via dedicated threads. Our baseline for overhead comparison is the naïve application-based synchronous PFS-level checkpointing method. A 3D lattice-Boltzmann (LBM) flow solver and a Lanczos eigenvalue solver are used as prototypical applications in which all the techniques considered here may be applied.
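
The survey's baseline, naive synchronous application-level checkpointing to a parallel file system, reduces to a loop like the single-process sketch below (the file name and state layout are hypothetical). The techniques the paper compares differ mainly in where the checkpoint is written (node-local versus PFS) and whether the write overlaps computation.

```python
# Illustrative sketch (naive synchronous application-level checkpointing,
# simplified to one process): the solver periodically serializes its state;
# on restart it resumes from the newest checkpoint instead of from step 0.
import os
import pickle

CKPT = "solver.ckpt"   # hypothetical checkpoint file name

def load_state():
    if os.path.exists(CKPT):
        with open(CKPT, "rb") as f:
            return pickle.load(f)
    return {"step": 0, "field": [0.0] * 1024}   # fresh start

state = load_state()
for step in range(state["step"], 10_000):
    state["field"] = [x + 1.0 for x in state["field"]]   # stand-in for a sweep
    state["step"] = step + 1
    if state["step"] % 1000 == 0:                        # synchronous checkpoint
        with open(CKPT + ".tmp", "wb") as f:
            pickle.dump(state, f)
        os.replace(CKPT + ".tmp", CKPT)                  # atomic publish
```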

47. Cappello, Franck. "Fault Tolerance in Petascale/Exascale Systems: Current Knowledge, Challenges and Research Opportunities." International Journal of High Performance Computing Applications 23, no. 3 (July 20, 2009): 212–26. http://dx.doi.org/10.1177/1094342009106189.

Abstract: The emergence of petascale systems and the promise of future exascale systems have reinvigorated the community interest in how to manage failures in such systems and ensure that large applications, lasting several hours or tens of hours, are completed successfully. Most of the existing results for several key mechanisms associated with fault tolerance in high-performance computing (HPC) platforms follow the rollback-recovery approach. Over the last decade, these mechanisms have received a lot of attention from the community with different levels of success. Unfortunately, despite their high degree of optimization, existing approaches do not fit well with the challenging evolutions of large-scale systems. There is room and even a need for new approaches. Opportunities may come from different origins: diskless checkpointing, algorithmic-based fault tolerance, proactive operation, speculative execution, software transactional memory, forward recovery, etc. The contributions of this paper are as follows: (1) we summarize and analyze the existing results concerning the failures in large-scale computers and point out the urgent need for drastic improvements or disruptive approaches for fault tolerance in these systems; (2) we sketch most of the known opportunities and analyze their associated limitations; (3) we extract and express the challenges that the HPC community will have to face for addressing the stringent issue of failures in HPC systems.

48. Filiposka, Sonja, Anastas Mishev, and Carlos Juiz. "Current prospects towards energy-efficient top HPC systems." Computer Science and Information Systems 13, no. 1 (2016): 151–71. http://dx.doi.org/10.2298/csis150228063f.

Abstract: Ever since the start of the green HPC initiative, a new design constraint has appeared on the horizon for top supercomputer designers. Today's top HPC systems must not only boast exascale performance but must reach the new exaflops frontiers with as little power consumption as possible. The goals of this paper are to present the current status of the top supercomputers from both the performance and the power consumption points of view. Using current and available historical information from the Top500 and Green500 lists, we identify the most promising design options and how they perform when combined. The presented results reveal the main challenges that should become the focus of future research.

49. Kale, Laxmikant. "Programming Models at Exascale: Adaptive Runtime Systems, Incomplete Simple Languages, and Interoperability." International Journal of High Performance Computing Applications 23, no. 4 (September 11, 2009): 344–46. http://dx.doi.org/10.1177/1094342009347497.

50. Heroux, Michael A. "Software Challenges for Extreme Scale Computing: Going From Petascale to Exascale Systems." International Journal of High Performance Computing Applications 23, no. 4 (September 30, 2009): 437–39. http://dx.doi.org/10.1177/1094342009347711.
