Academic literature on the topic 'Exascale systems'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Exascale systems.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Exascale systems"

1

Coteus, P. W., J. U. Knickerbocker, C. H. Lam, and Y. A. Vlasov. "Technologies for exascale systems." IBM Journal of Research and Development 55, no. 5 (September 2011): 14:1–14:12. http://dx.doi.org/10.1147/jrd.2011.2163967.

2

Rumley, Sebastien, Dessislava Nikolova, Robert Hendry, Qi Li, David Calhoun, and Keren Bergman. "Silicon Photonics for Exascale Systems." Journal of Lightwave Technology 33, no. 3 (February 1, 2015): 547–62. http://dx.doi.org/10.1109/jlt.2014.2363947.

3

Jensen, David, and Arun Rodrigues. "Embedded Systems and Exascale Computing." Computing in Science & Engineering 12, no. 6 (November 2010): 20–29. http://dx.doi.org/10.1109/mcse.2010.95.

4

Tahmazli-Khaligova, Firuza. "Challenges of Using Big Data in Distributed Exascale Systems." Azerbaijan Journal of High Performance Computing 3, no. 2 (December 29, 2020): 245–54. http://dx.doi.org/10.32010/26166127.2020.3.2.245.254.

Abstract:
In a traditional high-performance computing system, it is possible to process a huge volume of data, and the nature of events is static. A distributed exascale system has a different nature: processing big data in such a system evokes a new challenge, because its dynamic and interactive character changes the status of processes and system elements. This paper discusses how the big data attributes of volume, velocity, and variety influence the dynamic and interactive nature of a distributed exascale system. To investigate this effect, the work suggests a Markov chain model whose transition matrix identifies system status and memory sharing, which lets us analyze the convergence of the two systems. As a result, the mutual influence of the two systems is explored.
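To make the paper's Markov-chain idea concrete, the toy sketch below builds a hypothetical three-state transition matrix for a process in a distributed exascale system and iterates it to its stationary distribution; the states and probabilities are invented for illustration, not taken from the paper.

```python
import numpy as np

# Hypothetical states of a process: running, migrating, waiting.
# Row i holds the one-step probabilities of moving from state i.
P = np.array([[0.80, 0.15, 0.05],
              [0.30, 0.50, 0.20],
              [0.25, 0.25, 0.50]])

dist = np.array([1.0, 0.0, 0.0])  # start in the "running" state

# Power iteration: a regular chain converges to the stationary
# distribution pi satisfying pi = pi @ P, the kind of convergence
# the abstract's transition-matrix analysis relies on.
for step in range(1000):
    new_dist = dist @ P
    if np.allclose(new_dist, dist, atol=1e-12):
        break
    dist = new_dist

print(f"stationary distribution after {step} steps: {dist}")
```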
5

Alexander, Francis J., James Ang, Jenna A. Bilbrey, Jan Balewski, Tiernan Casey, Ryan Chard, Jong Choi, et al. "Co-design Center for Exascale Machine Learning Technologies (ExaLearn)." International Journal of High Performance Computing Applications 35, no. 6 (September 27, 2021): 598–616. http://dx.doi.org/10.1177/10943420211029302.

Abstract:
Rapid growth in data, computational methods, and computing power is driving a remarkable revolution in what variously is termed machine learning (ML), statistical learning, computational learning, and artificial intelligence. In addition to highly visible successes in machine-based natural language translation, playing the game Go, and self-driving cars, these new technologies also have profound implications for computational and experimental science and engineering, as well as for the exascale computing systems that the Department of Energy (DOE) is developing to support those disciplines. Not only do these learning technologies open up exciting opportunities for scientific discovery on exascale systems, they also appear poised to have important implications for the design and use of exascale computers themselves, including high-performance computing (HPC) for ML and ML for HPC. The overarching goal of the ExaLearn co-design project is to provide exascale ML software for use by Exascale Computing Project (ECP) applications, other ECP co-design centers, and DOE experimental facilities and leadership class computing facilities.
6

Ismayilova, Nigar. "Challenges of Using the Fuzzy Approach in Exascale Computing Systems." Azerbaijan Journal of High Performance Computing 4, no. 2 (December 31, 2021): 198–205. http://dx.doi.org/10.32010/26166127.2021.4.2.198.205.

Abstract:
This paper studies opportunities for using fuzzy set theory to construct an appropriate load-balancing model in exascale distributed systems. The occurrence of dynamic and interactive events in multicore computing systems leads to uncertainty. As fuzzy logic-based solutions allow the management of uncertain environments, several approaches and useful challenges exist for the development of load-balancing models in exascale computing systems.
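As a rough illustration of the fuzzy approach described above, the sketch below maps a node's crisp load value to memberships in fuzzy "low/medium/high" sets, the kind of input a fuzzy load-balancing rule base would consume. The membership breakpoints and the offload rule are invented for the example.

```python
def triangular(x, a, b, c):
    """Triangular membership function: peaks at b, zero outside (a, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzify_load(load):
    """Map a crisp node load in [0, 1] to fuzzy set memberships."""
    return {
        "low":    triangular(load, -0.4, 0.0, 0.5),
        "medium": triangular(load,  0.2, 0.5, 0.8),
        "high":   triangular(load,  0.5, 1.0, 1.4),
    }

# A toy rule: offload work from nodes whose "high" membership dominates.
for node, load in {"n0": 0.15, "n1": 0.55, "n2": 0.92}.items():
    m = fuzzify_load(load)
    decision = "offload" if m["high"] > max(m["low"], m["medium"]) else "keep"
    print(node, {k: round(v, 2) for k, v in m.items()}, decision)
```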
7

Klasky, S. A., H. Abbasi, M. Ainsworth, J. Choi, M. Curry, T. Kurc, Q. Liu, et al. "Exascale Storage Systems the SIRIUS Way." Journal of Physics: Conference Series 759 (October 2016): 012095. http://dx.doi.org/10.1088/1742-6596/759/1/012095.

8

Stepanenko, Sergey, and Vasiliy Yuzhakov. "Exascale supercomputers. Architectural outlines." Program systems: theory and applications 4, no. 4 (November 15, 2013): 61–90. http://dx.doi.org/10.12737/2418.

Abstract:
Architectural aspects of exascale supercomputers are explored. Parameters of the computing environment and interconnect are evaluated. It is shown that reaching exascale performance requires hybrid systems. Processor elements of such systems comprise CPU cores and arithmetic accelerators, implementing the MIMD and SIMD computing disciplines, respectively. Efficient exascale hybrid systems require fundamentally new applications and architectural efficiency scaling solutions, including: 1) process-aware structural reconfiguring of hybrid processor elements by varying the number of MIMD cores and the SIMD cores communicating with them, to attain the highest performance and efficiency possible under given conditions; 2) application of conflict-free sets of sources and receivers and/or decomposition of the computation into subprocesses and their allocation to environment elements in accordance with their features and communication topology, to minimize communication time; 3) application of topological redundancy methods to preserve, in case of element failure, the topology and overall performance achieved by the above communication time minimization solutions, thus maintaining the efficiency reached by the reconfiguring and communication minimization solutions, i.e., providing fault-tolerant efficiency scaling. Application of these solutions is illustrated by running molecular dynamics tests and the NPB LU benchmark. The resulting architecture displays dynamic adaptability to program features, which in turn ensures the efficiency of using exascale supercomputers.
9

Abdullayev, Fakhraddin. "Resource Discovery in Distributed Exascale Systems Using a Multi-Agent Model: Categorization of Agents Based on Their Characteristics." Azerbaijan Journal of High Performance Computing 6, no. 1 (June 30, 2023): 113–20. http://dx.doi.org/10.32010/26166127.2023.6.1.113.120.

Abstract:
Resource discovery is a crucial component in high-performance computing (HPC) systems. This paper presents a multi-agent model for resource discovery in distributed exascale systems. Agents are categorized based on resource types and behavior-specific characteristics. The model enables efficient identification and acquisition of memory, process, file, and IO resources. Through a comprehensive exploration, we highlight the potential of our approach in addressing resource discovery challenges in exascale computing systems, paving the way for optimized resource utilization and enhanced system performance.
10

Shalf, John, Dan Quinlan, and Curtis Janssen. "Rethinking Hardware-Software Codesign for Exascale Systems." Computer 44, no. 11 (November 2011): 22–30. http://dx.doi.org/10.1109/mc.2011.300.


Dissertations / Theses on the topic "Exascale systems"

1

Deveci, Mehmet. "Load-Balancing and Task Mapping for Exascale Systems." The Ohio State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=osu1429199721.

2

Bentria, Dounia. "Combining checkpointing and other resilience mechanisms for exascale systems." Thesis, Lyon, École normale supérieure, 2014. http://www.theses.fr/2014ENSL0971/document.

Abstract:
In this thesis, we are interested in scheduling and optimization problems in probabilistic contexts. The contributions come in two parts. The first part is dedicated to optimizing different fault-tolerance mechanisms for very-large-scale machines that are subject to a probability of failure; the second part is devoted to minimizing the expected sensor data acquisition cost when evaluating a query expressed as a tree of disjunctive Boolean operators applied to Boolean predicates. In the first chapter, we present the related work of the first part and introduce some new general results that are useful for resilience on exascale systems. In the second chapter, we study a unified model for several well-known checkpoint/restart protocols. The proposed model is generic enough to encompass both extremes of the checkpoint/restart space, from coordinated approaches to a variety of uncoordinated checkpoint strategies. We propose a detailed analysis of several scenarios, including some of the most powerful currently available HPC platforms, as well as anticipated exascale designs. In the third, fourth, and fifth chapters, we study the combination of different fault-tolerance mechanisms (replication, fault prediction, and detection of silent errors) with the traditional checkpoint/restart mechanism. We evaluated several models using simulations; our results show that these models are useful for a set of application models in the context of future exascale systems. In the second part of the thesis, we study the problem of minimizing the expected sensor data acquisition cost when evaluating a query expressed as a tree of disjunctive Boolean operators applied to Boolean predicates. The problem is to determine the order in which predicates should be evaluated so as to shortcut part of the query evaluation and minimize the expected cost. In the sixth chapter, we present the related work of the second part, and in the seventh chapter, we study the problem for queries expressed in disjunctive normal form. We consider the more general case where each data stream can appear in multiple predicates, and we study two models: one where each predicate can access a single stream, and one where each predicate can access multiple streams.
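Checkpoint/restart analyses like the ones in this thesis usually start from the Young/Daly first-order approximation of the optimal checkpoint period, T_opt ≈ √(2Cμ) for checkpoint cost C and platform MTBF μ. A minimal sketch with hypothetical exascale-ish numbers:

```python
from math import sqrt

def young_daly_period(checkpoint_cost, mtbf):
    """First-order optimal checkpoint period (Young/Daly approximation)."""
    return sqrt(2.0 * checkpoint_cost * mtbf)

# Hypothetical figures: 60 s to write a checkpoint, 2 h platform MTBF.
C, mu = 60.0, 2 * 3600.0
T = young_daly_period(C, mu)

# First-order expected waste: checkpoint overhead C/T plus, per failure,
# an average of half a period of lost work, (T/2)/mu.
waste = C / T + T / (2 * mu)
print(f"optimal period ~ {T:.0f} s, expected waste ~ {waste:.1%}")
```

Shrinking the MTBF from hours toward minutes, as exascale projections do, drives the optimal period down and the waste up, which is why the thesis combines checkpointing with replication, fault prediction, and silent-error detection.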
3

Shalf, John Marshall. "Advanced System-Scale and Chip-Scale Interconnection Networks for Ultrascale Systems." Thesis, Virginia Tech, 2010. http://hdl.handle.net/10919/36134.

Abstract:
The path towards realizing next-generation petascale and exascale computing is increasingly dependent on building supercomputers with unprecedented numbers of processors. Given the rise of multicore processors, the number of network endpoints both on-chip and off-chip is growing exponentially, with systems in 2018 anticipated to contain thousands of processing elements on-chip and billions of processing elements system-wide. To prevent the interconnect from dominating the overall cost of future systems, there is a critical need for scalable interconnects that capture the communication requirements of target ultrascale applications. It is therefore essential to understand high-end application communication characteristics across a broad spectrum of computational methods, and to utilize that insight to tailor interconnect designs to the specific requirements of the underlying codes. This work makes several unique contributions towards attaining that goal. First, it presents communication traces for a number of high-end applications whose computational methods include finite-difference, lattice-Boltzmann, particle-in-cell, sparse linear algebra, particle-mesh Ewald, and FFT-based solvers. It then introduces the fit-tree approach for designing network infrastructure that is tailored to application requirements: a fit-tree minimizes the component count of an interconnect without impacting application performance compared to a fully connected network. The last section introduces Hybrid Flexibly Assignable Switch Topology (HFAST), a methodology for reconfigurable networks that implements fit-tree solutions. HFAST uses both passive (circuit) and active (packet) commodity switch components in a unique way to dynamically reconfigure interconnect wiring to suit the topological requirements of scientific applications. Overall, the exploration points to several promising directions for practically addressing both the on-chip and off-chip interconnect requirements of future ultrascale systems.
4

Maroñas, Marcos. "On the design and development of programming models for exascale systems." Doctoral thesis, Universitat Politècnica de Catalunya, 2021. http://hdl.handle.net/10803/671783.

Abstract:
High Performance Computing (HPC) systems have been evolving over time to adapt to the scientific community's requirements. We are currently approaching the Exascale era. Exascale systems will incorporate a large number of nodes, each of them containing many computing resources. Moreover, not only the computing resources but also the memory hierarchies are becoming deeper and more complex. Overall, Exascale systems will present several challenges in terms of performance, programmability, and fault tolerance. Regarding programmability, the more complex a system architecture is, the harder it is to exploit the system properly. Programmability is closely related to performance, because the performance a system can deliver is useless if users are not able to write programs that obtain it. This stresses the importance of programming models as tools to easily write programs that can reach the peak performance of the system. Finally, it is well known that more components lead to more errors. The combination of long executions with a low Mean Time To Failure (MTTF) may jeopardize application progress. Thus, all the efforts made to improve performance become pointless if applications hardly ever finish. To prevent that, we must apply fault tolerance. The main goal of this thesis is to enable non-expert users to exploit complex Exascale systems. To that end, we have enhanced state-of-the-art parallel programming models to cope with three key Exascale challenges: programmability, performance, and fault tolerance. The first set of contributions focuses on the efficient management of modern multicore/manycore processors. We propose a new kind of task that combines the key advantages of tasks with the key advantages of worksharing techniques. The use of this new task type alleviates granularity issues, thereby enhancing performance in several scenarios. We also propose the introduction of dependences in the taskloop construct so that programmers can easily apply blocking techniques. Finally, we extend the taskloop construct to support the creation of the new kind of tasks instead of regular tasks. The second set of contributions focuses on the efficient management of modern memory hierarchies, with a focus on NUMA domains. By using the information that users provide in the dependence annotations, we build a system that tracks data location. Later, we use this information to take scheduling decisions that maximize data locality. Our last set of contributions focuses on fault tolerance. We propose a programming model that provides application-level checkpoint/restart in an easy and portable way. Our programming model offers a set of compiler directives to abstract users from system-level nuances. It then leverages state-of-the-art libraries to deliver high performance and includes several redundancy schemes.
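The granularity problem this abstract tackles can be illustrated outside the OpenMP-style tasking runtimes it targets. The Python analogue below (chunk size and worker count are arbitrary choices for the example) shows the knob a combined task/worksharing construct turns: creating one task per block of iterations instead of one per element, so scheduling overhead is amortized over useful work.

```python
from concurrent.futures import ThreadPoolExecutor

data = list(range(1_000_000))
CHUNK = 10_000  # granularity knob: larger chunks mean fewer, coarser tasks

def process_chunk(lo):
    # One task covers a whole block of iterations, amortizing the
    # per-task scheduling cost, in the spirit of worksharing tasks.
    return [x * x for x in data[lo:lo + CHUNK]]

with ThreadPoolExecutor(max_workers=8) as pool:
    blocks = pool.map(process_chunk, range(0, len(data), CHUNK))
    results = [y for block in blocks for y in block]
```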
5

Subasi, Omer. "Reliability for exascale computing : system modelling and error mitigation for task-parallel HPC applications." Doctoral thesis, Universitat Politècnica de Catalunya, 2016. http://hdl.handle.net/10803/397670.

Abstract:
As high performance computing (HPC) systems continue to grow, their fault rate increases. Applications running on these systems have to deal with rates on the order of hours or days. Furthermore, some studies for future Exascale systems predict the rates to be on the order of minutes. As a result, efficient fault tolerance solutions are needed to be able to tolerate frequent failures. A fault tolerance solution for future HPC and Exascale systems must be low-cost, efficient and highly scalable. It should have low overhead in fault-free execution and provide fast restart because long-running applications are expected to experience many faults during the execution. Meanwhile task-based dataflow parallel programming models (PM) are becoming a popular paradigm in HPC applications at large scale. For instance, we see the adaptation of task-based dataflow parallelism in OpenMP 4.0, OmpSs PM, Argobots and Intel Threading Building Blocks. In this thesis we propose fault-tolerance solutions for task-parallel dataflow HPC applications. Specifically, first we design and implement a checkpoint/restart and message-logging framework to recover from errors. We then develop performance models to investigate the benefits of our task-level frameworks when integrated with system-wide checkpointing. Moreover, we design and implement selective task replication mechanisms to detect and recover from silent data corruptions in task-parallel dataflow HPC applications. Finally, we introduce a runtime-based coding scheme to detect and recover from memory errors in these applications. Considering the span of all of our schemes, we see that they provide a fairly high failure coverage where both computation and memory is protected against errors.
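One of the mechanisms the abstract names, selective task replication for silent-data-corruption (SDC) detection, boils down to running a chosen task twice and accepting the result only when the replicas agree. A minimal sketch (the task body and error injection are invented for illustration, not the thesis's runtime):

```python
import random

def run_task(payload):
    result = sum(payload)          # stand-in for the task's real computation
    if random.random() < 0.01:     # inject a rare silent error
        result += 1
    return result

def replicated_run(payload, max_retries=3):
    """Duplicate-and-compare: re-execute until two replicas agree."""
    for _ in range(max_retries):
        a, b = run_task(payload), run_task(payload)
        if a == b:                 # matching replicas: accept the result
            return a
    raise RuntimeError("silent data corruption could not be resolved")

print(replicated_run(list(range(100))))
```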
6

Gkikas, Nikolaos. "Data Transfer and Management through the IKAROS framework : Adopting an asynchronous non-blocking event driven approach to implement the Elastic-Transfer's IMAP client-server connection." Thesis, KTH, Radio Systems Laboratory (RS Lab), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-166740.

Abstract:
Given the current state of input/output (I/O) and storage devices in petascale systems, incremental solutions would be ineffective when implemented in exascale environments. According to "The International Exascale Software Roadmap" by Dongarra et al., existing I/O architectures are not sufficiently scalable, especially because current shared file systems have limitations when used in large-scale environments. These limitations are: bandwidth does not scale economically to large-scale systems; I/O traffic on the high-speed network can impact on and be influenced by other unrelated jobs; and I/O traffic on the storage server can impact on and be influenced by other unrelated jobs. Future applications on exascale computers will require I/O bandwidth proportional to their computational capabilities. To avoid these limitations, C. Filippidis, C. Markou, and Y. Cotronis proposed the IKAROS framework. In this thesis project, the capabilities of the publicly available elastic-transfer (eT) module, which was directly derived from IKAROS, are expanded. The eT uses Google's Gmail service as a utility for efficient meta-data management. Gmail is based on the IMAP protocol, and the existing version of the eT framework implements the Internet Message Access Protocol (IMAP) client-server connection through the "Inbox" module from the Node Package Manager (NPM) of the Node.js programming language. This module was used as a proof of concept, but in a production environment this implementation undermines the system's scalability, and it allocates the system's resources inefficiently when a large number of concurrent requests arrive at the eT's meta-data server (MDS) at the same time. This thesis solves the problem by adopting an asynchronous, non-blocking, event-driven approach to implement the IMAP client-server connection. This was done by integrating and modifying the "Imap" NPM module from the NPM repository to suit the eT framework. Additionally, since the JavaScript Object Notation (JSON) format has become one of the most widespread data-interchange formats, eT's meta-data scheme is appropriately modified to make the system's meta-data easily parsed as JSON objects. This feature creates a framework with wider compatibility and interoperability with external systems. The evaluation and operational behavior of the new module were tested through a set of data transfer experiments over a wide area network environment. These experiments were performed to ensure that the changes in the system's architecture did not affect its performance.
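The eT framework itself is written in Node.js, but the asynchronous, non-blocking, event-driven pattern the thesis adopts is language-independent. The Python asyncio sketch below (request counts and latencies are made up) shows the idea: many concurrent metadata requests are multiplexed on a single event loop instead of each blocking a thread.

```python
import asyncio

async def fetch_metadata(request_id):
    # Simulated non-blocking network round-trip to a metadata server;
    # while one request awaits I/O, the event loop services the others.
    await asyncio.sleep(0.1)
    return {"id": request_id, "status": "ok"}

async def main():
    replies = await asyncio.gather(*(fetch_metadata(i) for i in range(1000)))
    print(len(replies), "requests served concurrently")

asyncio.run(main())
```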
7

Mirtaheri, Seyedeh Leili, and Lucio Grandinetti. "Optimized dynamic load balancing in distributed exascale computing systems." Thesis, 2016. http://hdl.handle.net/10955/1370.

Abstract:
Doctoral thesis in Operations Research, Cycle XXVII, academic year 2015-2016.
The dynamic nature of new-generation scientific problems requires rethinking the traditional, static management of computing resources in exascale computing systems, so as to support the dynamic and unpredictable resource requests of scientific programs. Achieving this requires a dynamic load-balancing model that manages the system load efficiently based on the programs' requests. Currently, distributed exascale systems with heterogeneous resources are the branch of distributed computing systems best placed to support scientific programs with dynamic and interactive resource requests. In this thesis, distributed exascale systems are regarded as operational, real distributed systems, and a dynamic load-balancing model for the distributed control of load across the nodes of a distributed exascale computing system is presented. The dominant paradigm in this model is derived from operations research, and a request-aware approach replaces the command-based approach in managing the system load. The evaluation results show a significant performance improvement from the proposed load-balancing mechanism compared with common distributed load-balancing mechanisms.
Università della Calabria
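The thesis's model is drawn from operations research, but the baseline behaviour of any dynamic load balancer can be sketched with a greedy rule: send each arriving job to the currently least-loaded node. The toy below (node count and job costs are invented) is that baseline, not the thesis's mechanism:

```python
import heapq

def assign_jobs(node_count, job_costs):
    """Greedy dynamic balancing: each job goes to the least-loaded node."""
    heap = [(0.0, n) for n in range(node_count)]  # (current load, node id)
    placement = {}
    for job, cost in enumerate(job_costs):
        load, node = heapq.heappop(heap)          # least-loaded node right now
        placement[job] = node
        heapq.heappush(heap, (load + cost, node))
    return placement, max(load for load, _ in heap)

placement, makespan = assign_jobs(4, [5, 3, 8, 2, 7, 4, 6, 1])
print(placement, makespan)
```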
8

Tiwari, Manasi. "Communication Overlapping Krylov Subspace Methods for Distributed Memory Systems." Thesis, 2022. https://etd.iisc.ac.in/handle/2005/5990.

Abstract:
Many high performance computing applications in computational fluid dynamics, electromagnetics, etc. need to solve a linear system of equations $Ax=b$. For linear systems where $A$ is generally large and sparse, Krylov Subspace Methods (KSMs) are used. In this thesis, we propose communication overlapping KSMs. We start with the Conjugate Gradient (CG) method, which is used when $A$ is sparse symmetric positive definite. Recent variants of CG include the Pipelined CG (PIPECG) method, which overlaps the allreduce in CG with independent computations, i.e., one Preconditioner (PC) application and one Sparse Matrix Vector Product (SPMV). As we move towards the exascale era, the time for global synchronization and communication in the allreduce increases with the large number of cores available in exascale systems, and the allreduce time becomes the performance bottleneck, leading to poor scalability of CG. It therefore becomes necessary to reduce the number of allreduces in CG and to adequately overlap the larger allreduce time with more independent computations than PIPECG provides. Towards this goal, we have developed PIPECG-OATI (PIPECG-One Allreduce per Two Iterations), which reduces the number of allreduces from three per iteration to one per two iterations and overlaps it with two PCs and two SPMVs. For better scalability with more overlapping, we also developed the Pipelined s-step CG method, which reduces the number of allreduces to one per s iterations and overlaps it with s PCs and s SPMVs. We compared our methods with state-of-the-art CG variants on a variety of platforms and demonstrated that our method gives 2.15x - 3x speedup over the existing methods. We have also generalized our research beyond the parallelization of CG on multi-node CPU systems in two directions. First, we developed communication overlapping variants of KSMs other than CG, including the Conjugate Residual (CR), Minimum Residual (MINRES), and BiConjugate Gradient Stabilised (BiCGStab) methods for matrices with different properties. The pipelined variants give up to 1.9x, 2.5x, and 2x speedup over the state-of-the-art MINRES, CR, and BiCGStab methods, respectively. Second, we developed communication overlapping CG variants for GPU accelerated nodes, where we proposed and implemented three hybrid CPU-GPU execution strategies for the PIPECG method. The first two strategies achieve task parallelism and the last achieves data parallelism. Our experiments on GPUs showed that our methods give 1.45x - 3x average speedup over existing CPU and GPU-based implementations. The third method gives up to 6.8x speedup for problems that cannot fit in GPU memory. We also implemented GPU-related optimizations for the PIPECG-OATI method and show performance improvements over other GPU implementations of PCG and PIPECG on multiple nodes with multiple GPUs.
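The reductions this thesis overlaps are easiest to see in textbook CG. In the plain NumPy sketch below (a serial illustration only, not the thesis's PIPECG), each vector dot product becomes a global allreduce on a distributed-memory machine, while the sparse matrix-vector product is the independent computation that pipelined variants use to hide the allreduce latency.

```python
import numpy as np

def cg(A, b, x0, tol=1e-8, maxiter=1000):
    """Textbook Conjugate Gradient for symmetric positive definite A."""
    x = x0.copy()
    r = b - A @ x
    p = r.copy()
    rs = r @ r                    # dot product -> allreduce at scale
    for _ in range(maxiter):
        Ap = A @ p                # SPMV: independent work usable for overlap
        alpha = rs / (p @ Ap)     # dot product -> allreduce at scale
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r            # dot product -> allreduce at scale
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(cg(A, b, np.zeros(2)))
```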

Books on the topic "Exascale systems"

1

Reiz, Severin, Benjamin Uekermann, Philipp Neumann, Hans-Joachim Bungartz, and Wolfgang E. Nagel. Software for Exascale Computing - SPPEXA 2016-2019. Springer International Publishing AG, 2020.

2

Bungartz, Hans-Joachim. Software for Exascale Computing - SPPEXA 2016-2019. Springer Nature, 2020.

3

Vetter, Jeffrey S. Contemporary High Performance Computing: From Petascale Toward Exascale. Taylor & Francis Group, 2017.

4

Vetter, Jeffrey S. Contemporary High Performance Computing: From Petascale toward Exascale. Chapman and Hall/CRC, 2013.

5

Williams, Timothy J., Tjerk P. Straatsma, and Katerina B. Antypas. Exascale Scientific Applications: Scalability and Performance Portability. Taylor & Francis Group, 2017.


Book chapters on the topic "Exascale systems"

1

Djemame, Karim, and Hamish Carr. "Exascale Computing Deployment Challenges." In Economics of Grids, Clouds, Systems, and Services, 211–16. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-63058-4_19.

2

Flajslik, Mario, Eric Borch, and Mike A. Parker. "Megafly: A Topology for Exascale Systems." In Lecture Notes in Computer Science, 289–310. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-92040-5_15.

3

Casanova, Henri, Frédéric Vivien, and Dounia Zaidouni. "Using Replication for Resilience on Exascale Systems." In Computer Communications and Networks, 229–78. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-20943-2_4.

4

Mhedheb, Yousri, and Achim Streit. "Energy Efficient Runtime Framework for Exascale Systems." In Lecture Notes in Computer Science, 32–44. Cham: Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-46079-6_3.

5

Zhao, Jisheng, Colleen Bertoni, Jeffrey Young, Kevin Harms, Vivek Sarkar, and Brice Videau. "HIPLZ: Enabling Performance Portability for Exascale Systems." In Euro-Par 2022: Parallel Processing Workshops, 197–210. Cham: Springer Nature Switzerland, 2023. http://dx.doi.org/10.1007/978-3-031-31209-0_15.

6

Kogge, Peter M. "Updating the Energy Model for Future Exascale Systems." In Lecture Notes in Computer Science, 323–39. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-20119-1_24.

7

Royuela, Sara, Alejandro Duran, Maria A. Serrano, Eduardo Quiñones, and Xavier Martorell. "A Functional Safety OpenMP* for Critical Real-Time Embedded Systems." In Scaling OpenMP for Exascale Performance and Portability, 231–45. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-65578-9_16.

8

Bobák, Martin, Ondrej Habala, and Ladislav Hluchý. "Exascale Flood Modelling in Environment Supporting Urgent Computing." In Advances in Natural Computation, Fuzzy Systems and Knowledge Discovery, 384–91. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-32591-6_41.

9

Aliev, Araz R., and Nigar T. Ismayilova. "Graph-Based Load Balancing Model for Exascale Computing Systems." In 11th International Conference on Theory and Application of Soft Computing, Computing with Words and Perceptions and Artificial Intelligence - ICSCCW-2021, 229–36. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-030-92127-9_33.

10

Oeste, Sebastian, Marc-André Vef, Mehmet Soysal, Wolfgang E. Nagel, André Brinkmann, and Achim Streit. "ADA-FS—Advanced Data Placement via Ad hoc File Systems at Extreme Scales." In Software for Exascale Computing - SPPEXA 2016-2019, 29–59. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-47956-5_4.


Conference papers on the topic "Exascale systems"

1

Varela, Maria Ruiz, Kurt B. Ferreira, and Rolf Riesen. "Fault-tolerance for exascale systems." In 2010 IEEE International Conference On Cluster Computing Workshops and Posters (CLUSTER WORKSHOPS). IEEE, 2010. http://dx.doi.org/10.1109/clusterwksp.2010.5613081.

2

Jouppi, Norman Paul. "Resilience Challenges for Exascale Systems." In 2009 24th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems (DFT). IEEE, 2009. http://dx.doi.org/10.1109/dft.2009.52.

3

Kash, J. A., P. Pepeljugoski, F. E. Doany, C. L. Schow, D. M. Kuchta, L. Schares, R. Budd, et al. "Communication technologies for exascale systems." In SPIE OPTO: Integrated Optoelectronic Devices, edited by Alexei L. Glebov and Ray T. Chen. SPIE, 2009. http://dx.doi.org/10.1117/12.815329.

4

McLaren, Moray. "CMOS nanophotonics for exascale systems." In 2010 International Conference on Green Computing (Green Comp). IEEE, 2010. http://dx.doi.org/10.1109/greencomp.2010.5598267.

5

Bergman, Keren, Sébastien Rumley, Noam Ophir, Dessislava Nikolova, Robert Hendry, Qi Li, Kishore Padmara, Ke Wen, and Lee Zhu. "Silicon Photonics for Exascale Systems." In Optical Fiber Communication Conference. Washington, D.C.: OSA, 2014. http://dx.doi.org/10.1364/ofc.2014.m3e.1.

6

Courteille, F., and J. Eaton. "Programming Perspectives for Pre-exascale Systems." In Second EAGE Workshop on High Performance Computing for Upstream. Netherlands: EAGE Publications BV, 2015. http://dx.doi.org/10.3997/2214-4609.201414023.

7

Parsons, Mark. "Exascale computing - An impossible challenge?" In 2011 NASA/ESA Conference on Adaptive Hardware and Systems (AHS). IEEE, 2011. http://dx.doi.org/10.1109/ahs.2011.5963975.

8

Feng, Rui, Peng Zhang, and Yuefan Deng. "Network Design Considerations for Exascale Supercomputers." In Parallel and Distributed Computing and Systems. Calgary, AB, Canada: ACTAPRESS, 2012. http://dx.doi.org/10.2316/p.2012.789-001.

9

Lankes, Stefan. "Revisiting co-scheduling for upcoming ExaScale systems." In 2015 International Conference on High Performance Computing & Simulation (HPCS). IEEE, 2015. http://dx.doi.org/10.1109/hpcsim.2015.7237117.

10

Agullo, Emmanuel, George Bosilca, Berenger Bramas, Cedric Castagnede, Olivier Coulaud, Eric Darve, Jack Dongarra, et al. "Abstract: Matrices Over Runtime Systems at Exascale." In 2012 SC Companion: High Performance Computing, Networking, Storage and Analysis (SCC). IEEE, 2012. http://dx.doi.org/10.1109/sc.companion.2012.167.


Reports on the topic "Exascale systems"

1

Hendrickson, Bruce A. Scientific Discovery on Exascale Systems. Office of Scientific and Technical Information (OSTI), June 2015. http://dx.doi.org/10.2172/1198987.

2

Stearley, Jon R., Rolf E. Riesen, James H. Laros III, Kurt Brian Ferreira, Kevin Thomas Tauke Pedretti, Ron A. Oldfield, and Ronald Brian Brightwell. Redundant computing for exascale systems. Office of Scientific and Technical Information (OSTI), December 2010. http://dx.doi.org/10.2172/1011662.

3

Beckman, Pete, Ron Brightwell, Maya Gokhale, Bronis R. de Supinski, Steven Hofmeyr, Sriram Krishnamoorthy, Mike Lang, Barney Maccabe, John Shalf, and Marc Snir. Exascale Operating Systems and Runtime Software Report. Office of Scientific and Technical Information (OSTI), December 2012. http://dx.doi.org/10.2172/1471119.

4

Riesen, Rolf E., Patrick G. Bridges, Jon R. Stearley, James H. Laros III, Ron A. Oldfield, Dorian Arnold, Kevin Thomas Tauke Pedretti, Kurt Brian Ferreira, and Ronald Brian Brightwell. Keeping checkpoint/restart viable for exascale systems. Office of Scientific and Technical Information (OSTI), September 2011. http://dx.doi.org/10.2172/1029780.

5

Long, Darrell E., and Ethan L. Miller. Dynamic Non-Hierarchical File Systems for Exascale Storage. Office of Scientific and Technical Information (OSTI), February 2015. http://dx.doi.org/10.2172/1170868.

6

Ferreira, Kurt Brian. Fault Survivability of Lightweight Operating Systems for exascale. Office of Scientific and Technical Information (OSTI), September 2014. http://dx.doi.org/10.2172/1459775.

7

Choudhary, Alok, Nagiza Samatova, Kesheng Wu, and Wei-keng Liao. Scalable and Power Efficient Data Analytics for Hybrid Exascale Systems. Office of Scientific and Technical Information (OSTI), March 2015. http://dx.doi.org/10.2172/1173060.

8

Xie, Yuan. Blackcomb2: Hardware-Software Co-design for Nonvolatile Memory in Exascale Systems. Office of Scientific and Technical Information (OSTI), April 2018. http://dx.doi.org/10.2172/1485357.

9

Brown, Forrest B., Brian Christopher Kiedrowski, Jeffrey S. Bull, and Lawrence James Cox. MCNP 2020: Preparing LANL Monte Carlo for Exascale Computer Systems (White Paper). Office of Scientific and Technical Information (OSTI), April 2015. http://dx.doi.org/10.2172/1177983.

10

Mudge, Trevor. BLACKCOMB2: Hardware-software co-design for non-volatile memory in exascale systems. Office of Scientific and Technical Information (OSTI), December 2017. http://dx.doi.org/10.2172/1413470.
