Academic literature on the topic 'Modern multi-core systems'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Modern multi-core systems.'

Journal articles on the topic "Modern multi-core systems"

1

Bucaioni, Alessio, Saad Mubeen, Federico Ciccozzi, Antonio Cicchetti, and Mikael Sjödin. "Modelling multi-criticality vehicular software systems: evolution of an industrial component model." Software and Systems Modeling 19, no. 5 (April 30, 2020): 1283–302. http://dx.doi.org/10.1007/s10270-020-00795-5.

Abstract:
Software in modern vehicles consists of multi-criticality functions, where a function can be safety-critical with stringent real-time requirements, less critical from the vehicle operation perspective but still with real-time requirements, or not critical at all. Next-generation autonomous vehicles will require higher computational power to run multi-criticality functions, and such power can only be provided by parallel computing platforms such as multi-core architectures. However, current model-based software development solutions and related modelling languages have not been designed to deal effectively with challenges specific to multi-core, such as core interdependency and controlled allocation of software to hardware. In this paper, we report on the evolution of the Rubus Component Model for the modelling, analysis, and development of vehicular software systems with multi-criticality for deployment on multi-core platforms. Our goal is to provide a lightweight and technology-preserving transition from model-based software development for single-core to multi-core. This is achieved by evolving the Rubus Component Model to capture explicit concepts for multi-core and parallel hardware and for expressing variable criticality of software functions. The paper illustrates these contributions through an industrial application in the vehicular domain.
2

Chen, Kuo Yi, Fuh Gwo Chen, and Jr Shian Chen. "A Cost-Effective Hardware Approach for Measuring Power Consumption of Modern Multi-Core Processors." Applied Mechanics and Materials 110-116 (October 2011): 4569–73. http://dx.doi.org/10.4028/www.scientific.net/amm.110-116.4569.

Abstract:
Advanced VLSI technology allows multiple processor cores to be built within a single chip. With decreasing prices, multi-core processors are widely deployed in both server and desktop systems. The workload of a multi-threaded application can be distributed across cores by multiple threads, so that the threads run concurrently and maximize the overall execution speed of the application. Moreover, in line with the current trend toward green computing, most modern multi-core processors support dynamic frequency tuning. These power-level tuning techniques are based on Dynamic Voltage and Frequency Scaling (DVFS). To evaluate the performance of various power-saving approaches, an appropriate technique for measuring the power consumption of multi-core processors is important. However, most approaches estimate CPU power consumption only from CMOS power consumption data and CPU frequency. These approaches estimate only the dynamic power consumption of multi-core processors; static power consumption is not included. In this study, a hardware approach for measuring the power consumption of multi-core processors is proposed. With it, the power consumption of a CPU can be measured precisely, and the performance of CPU power-saving approaches can be evaluated well.
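The gap this abstract describes between frequency-based estimates and total power can be illustrated with a minimal sketch. It assumes the textbook CMOS switching-power approximation (activity factor × capacitance × voltage² × frequency) and a simple voltage × leakage-current term for static power; neither formula is stated in the abstract, and every name and number below is hypothetical rather than taken from the paper.

```python
def dynamic_power(alpha, capacitance_f, voltage_v, freq_hz):
    """Textbook CMOS switching power in watts: alpha * C * V^2 * f.

    alpha is the activity factor (fraction of capacitance switched per cycle).
    This is an assumed model, not the estimator used in the cited paper.
    """
    return alpha * capacitance_f * voltage_v ** 2 * freq_hz


def static_power(voltage_v, leakage_current_a):
    """Static (leakage) power in watts, modeled simply as V * I_leak."""
    return voltage_v * leakage_current_a


def underestimate_fraction(alpha, capacitance_f, voltage_v, freq_hz,
                           leakage_current_a):
    """Fraction of total power missed by a dynamic-only estimate."""
    p_dyn = dynamic_power(alpha, capacitance_f, voltage_v, freq_hz)
    p_stat = static_power(voltage_v, leakage_current_a)
    return p_stat / (p_dyn + p_stat)


if __name__ == "__main__":
    # Hypothetical operating point: 1 nF effective capacitance, 1.0 V, 2 GHz.
    missed = underestimate_fraction(0.5, 1e-9, 1.0, 2e9, 0.25)
    print(f"dynamic-only estimate misses {missed:.0%} of total power")
```

Because lowering frequency and voltage under DVFS shrinks the dynamic term faster than the leakage term in this simplified model, the share of power that a dynamic-only estimate misses grows at low-power operating points, which is one reason a direct hardware measurement can be attractive.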
3

Pryadko, S. A., A. S. Krutogolova, A. S. Uglyanitsa, and A. E. Ivanov. "Multi-core processors use for numerical problems solutions." Radio industry (Russia) 30, no. 4 (December 23, 2020): 98–105. http://dx.doi.org/10.21778/2413-9599-2020-30-4-98-105.

Abstract:
Problem statement. The use of programming technologies on modern multicore systems is an integral part of any enterprise whose activities involve multitasking or require a large number of calculations within a limited time. The article discusses the development of such technologies aimed at increasing the speed of solving various problems, for example, numerical modeling. Objective. To search for alternative ways to increase the speed of calculations by increasing the number of processors. As an example of how calculation speed depends on the number of processors, the well-known heat-transfer equation is taken, and classical numerical schemes for its solution are given. Explicit and implicit schemes are compared, including with respect to how readily the calculations can be parallelized. Results. The article describes systems with shared and distributed memory, describes their possible use for solving various problems, and provides recommendations for their use. Practical implications. Parallel computing helps to solve many problems in various fields, as it reduces the time required to solve partial differential equations.
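As an illustration of why explicit schemes for the heat equation lend themselves to parallelization, the following is a minimal sketch of the classical forward-time, centered-space (FTCS) update for the 1D equation u_t = a·u_xx with fixed boundary values. It is not taken from the article: the function names are hypothetical, and the parameter r = a·Δt/Δx² with the standard stability condition r ≤ 1/2 is the usual textbook formulation.

```python
def heat_explicit_step(u, r):
    """One explicit (FTCS) time step for the 1D heat equation.

    u : list of temperatures on a uniform grid; the end points are held fixed.
    r : a * dt / dx**2; the explicit scheme is stable only for 0 < r <= 0.5.
    """
    if not 0 < r <= 0.5:
        raise ValueError("explicit scheme requires 0 < r <= 0.5 for stability")
    new_u = list(u)
    # Each interior point depends only on values from the previous time step,
    # so the iterations of this loop are independent and could be split
    # across cores without synchronization inside a step.
    for i in range(1, len(u) - 1):
        new_u[i] = u[i] + r * (u[i - 1] - 2 * u[i] + u[i + 1])
    return new_u


def solve_heat(u0, r, steps):
    """Advance the initial profile u0 by the given number of explicit steps."""
    u = list(u0)
    for _ in range(steps):
        u = heat_explicit_step(u, r)
    return u


if __name__ == "__main__":
    initial = [0.0, 0.0, 1.0, 0.0, 0.0]  # a single hot point in the middle
    print(solve_heat(initial, 0.25, 10))
```

An implicit scheme, by contrast, couples all unknowns of a time step through a linear system, so parallelizing it means parallelizing the linear solve rather than simply partitioning the grid.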
4

Burns, Ethan, Seth Lemons, Wheeler Ruml, and Rong Zhou. "Suboptimal and Anytime Heuristic Search on Multi-Core Machines." Proceedings of the International Conference on Automated Planning and Scheduling 19 (October 16, 2009): 42–49. http://dx.doi.org/10.1609/icaps.v19i1.13375.

Abstract:
In order to scale with modern processors, planning algorithms must become multi-threaded. In this paper, we present parallel shared-memory algorithms for two problems that underlie many planning systems: suboptimal and anytime heuristic search. We extend a recently-proposed approach for parallel optimal search to the suboptimal case, providing two new pruning rules for bounded suboptimal search. We also show how this new approach can be used for parallel anytime search. Using temporal logic, we prove the correctness of our framework, and in an empirical comparison on STRIPS planning, grid pathfinding, and sliding tile puzzle problems using an 8-core machine, we show that it yields faster search performance than previous proposals.
5

Zhao, Huatao, Xiao Luo, Chen Zhu, Takahiro Watanabe, and Tianbo Zhu. "Behavior-aware cache hierarchy optimization for low-power multi-core embedded systems." Modern Physics Letters B 31, no. 19-21 (July 27, 2017): 1740067. http://dx.doi.org/10.1142/s021798491740067x.

Abstract:
In modern embedded systems, the increasing number of cores requires efficient cache hierarchies to ensure data throughput, but such cache hierarchies are restricted by their bloated size and by interfering accesses, which lead to both performance degradation and wasted energy. In this paper, we propose a behavior-aware cache hierarchy (BACH) that optimally allocates multi-level cache resources to many cores and greatly improves the efficiency of the cache hierarchy, resulting in low energy consumption. BACH takes full advantage of the explored application behaviors and runtime cache resource demands as the basis for cache allocation, so that the cache hierarchy can be optimally configured to meet runtime demand. BACH was implemented on the GEM5 simulator. The experimental results show that the energy consumption of a three-level cache hierarchy can be reduced by 5.29% to 27.94% compared with other key approaches, while the performance of the multi-core system even improves slightly when hardware overhead is taken into account.
6

Hanafi Por, Porya Soltani, Abbas Ramazani, and Mojtaba Hosseini Toodeshki. "Temperature and performance evaluation of multiprocessors chips by optimal control method." Bulletin of Electrical Engineering and Informatics 12, no. 2 (April 1, 2023): 749–59. http://dx.doi.org/10.11591/eei.v12i2.4291.

Abstract:
Multi-core processors now support all modern electronic devices. However, temperature and performance management is one of the most critical issues in the design of today’s microprocessors. In this paper, we propose a framework that uses an optimal control method based on fan speed and frequency control of the multi-core processor. The goal is to optimize performance while avoiding violation of an expected temperature. Our proposed method uses a high-precision thermal and power model for multi-core processors. The method is validated on an asymmetric ODROID-XU4 multi-core processor. The experimental results show that the proposed method achieves an adequate trade-off between performance and temperature control.
7

Chen, Yong Heng, Wan Li Zuo, and Feng Lin He. "Optimization Strategy of Bidirectional Join Enumeration in Multi-Core CPUS." Applied Mechanics and Materials 44-47 (December 2010): 383–87. http://dx.doi.org/10.4028/www.scientific.net/amm.44-47.383.

Abstract:
Most query optimizers in contemporary database systems use System-R’s bottom-up dynamic programming (DP) method to find the optimal query execution plan (QEP) without evaluating redundant sub-plans. As modern microprocessors employ multiple cores to accelerate computation, parallel optimization algorithms have been proposed to parallelize the bottom-up DP query optimization process. However, the top-down DP method can derive upper bounds on the costs of the plans it generates, which are not available to the typical bottom-up DP method, since that method generates and costs all subplans before considering larger containing plans. This paper combines the enhancements of the two approaches and proposes a comprehensive and practical graph-traversal-driven algorithm, referred to here as DPbid, for parallelizing query optimization on multi-core processor architectures. We have implemented this search strategy, and experimental results show that it effectively reduces optimization time compared with known existing algorithms.
8

Sibai, Fadi N., and Ali El-Moursy. "Performance evaluation and comparison of parallel conjugate gradient on modern multi-core accelerator and massively parallel systems." International Journal of Parallel, Emergent and Distributed Systems 29, no. 1 (February 6, 2013): 38–67. http://dx.doi.org/10.1080/17445760.2012.762774.

9

Filman, Robert E., and Paul H. Morris. "Compiling Knowledge-Based Systems to Ada: The PrkAda Core." International Journal on Artificial Intelligence Tools 06, no. 03 (September 1997): 341–64. http://dx.doi.org/10.1142/s0218213097000190.

Abstract:
This paper describes the implementation of PrkAda, a system for delivering, in Ada, Artificial Intelligence and object-oriented applications developed using the ProKappa system. (ProKappa is a modern, multi-paradigm knowledge-based system development tool. It includes facilities for dynamic object management, rule-based processing, daemons, and graphical developer and end-user interfaces. ProKappa is a successor system to KEE.) Creating PrkAda required creating a run-time, Ada-language, object-system "core," and developing a compiler to Ada from ProTalk (ProKappa's high-level, backtracking-based language). We describe the PrkAda ProTalk compiler in a companion paper [5]. This paper concentrates on the issues involved in implementing an AI application delivery core, particularly with respect to Ada, including:
• Automatic storage management (garbage collection) without either the cooperation of the compiler or access to the run-time stack
• Dynamic (weak) typing in a strongly-typed language
• Dynamic objects (objects that can change their slots and parentage as the program is executing)
• Dynamic function binding in a language designed to preclude "self-modifying programs"
• Implementation trade-offs in object-oriented knowledge-based systems development environments
10

Rudenko, O., I. Domanov, and V. Kravchenko. "TECHNICAL APPROACH IN EVALUATING OF THE CHARACTERISTICS OF MULTI-CORE BALANCED NONQUADDED CABLES FOR DIGITAL COMMUNICATION SYSTEMS." Наукові праці Державного науково-дослідного інституту випробувань і сертифікації озброєння та військової техніки, no. 4 (August 19, 2020): 107–17. http://dx.doi.org/10.37701/dndivsovt.4.2020.12.

Abstract:
The article proposes a technical approach to evaluating the characteristics of multi-core balanced nonquadded cables for digital communication systems. It presents a list of current standards of the International Electrotechnical Commission, together with the national standards of Ukraine and the international standards created on their basis. These standards define the general requirements for communication cables and provide methods for verifying the design, structural dimensions, materials of elements, marking, and packaging. In addition, the provisions of these standards define methods for testing the electrical characteristics, survivability, and resistance of the cable to climatic and mechanical factors. The proposed list of cable characteristics (electrical, survivability, and resistance to environmental influences) is sufficient to assess the cable for compliance with the stated requirements. The list of characteristics considered is not exhaustive and can be supplemented or changed depending on the customer's requirements. Using the example of the Swedish-made AESA 9500 automatic measuring system for measuring electrical characteristics, the article considers how the process of evaluating cable characteristics can be optimized through modern automatically controlled measuring systems, which will significantly reduce the time and cost of evaluation. The approach presented in the article can be used when testing cable planned for delivery to the Armed Forces of Ukraine.

Dissertations / Theses on the topic "Modern multi-core systems"

1

Marsh, Gregory J. "Evaluation of High Performance Financial Messaging on Modern Multi-core Systems." The Ohio State University, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=osu1269621500.

2

M'sirdi, Soukayna Raja. "Modular Avionics Software Integration on Multi-Core COTS : certification-Compliant Methodology and Timing Analysis Metrics for Legacy Software Reuse in Modern Aerospace Systems." Thesis, Toulouse, INPT, 2017. http://www.theses.fr/2017INPT0039/document.

Abstract:
Interference arising in multicores is undesirable in critical real-time systems, particularly in aeronautics, where the determinism of a system's timing behavior must be formally proven at design time so that the system can be certified and considered operational. The goal of this thesis is to propose an approach for the software integration of IMA applications on multicore processors, without requiring any modification of the software and hardware platforms, while complying with as many current certification requirements and key avionics concepts as possible, such as spatial and temporal partitioning and incremental certification. One objective of the thesis is to adhere as closely as possible to current industrial integration processes, so as to maximize the chances that the resulting contributions will be reused within the avionics industry. A second, minor objective is to keep to a minimum the adaptation phase for the different profiles involved in the software integration process. Finally, a third objective is to help optimize the time spent on timing verification, which can be difficult and time-consuming, as well as on architectural choices, so as to reduce the time-to-market and also optimize the design of the system under development. The major contribution of this thesis is the proposal of two complete software/hardware integration strategies on multicore for IMA applications. One of the two processes complies with the major current certification constraints, which makes it a potentially usable strategy for the most critical DAL A aerospace applications; the second offers a design that is as optimized as possible in terms of reducing embedded weight, mass, and energy consumption.
Each strategy is said to be complete because it contains: - a static timing analysis that bounds inter-core interference and allows WCET upper bounds to be derived reliably; - a constraint programming (CP) problem formulation for the automatic, optimized allocation of software to hardware; the resulting configuration is correct by construction because the CP problem exploits the timing analysis mentioned above to perform a timing verification on each configuration tested; - a CP problem formulation for automatic, optimized schedule generation; the resulting configuration is correct by construction because the process exploits the timing analysis mentioned above to perform a timing verification on each configuration tested.
Interference in multicores is undesirable for hard real-time systems, especially in the aerospace industry, where timing predictability and deadline enforcement in a system's runtime behavior must be ensured beforehand in order to be granted acceptance by certification authorities. The goal of this thesis is to propose an approach for multi-core integration of legacy IMA software, without any hardware or software modification, which complies as much as possible with current incremental certification and key IMA concepts such as robust time and space partitioning. The motivations of this thesis are to stick as closely as possible to the current IMA software integration process in order to maximize the chances that the avionics industry will accept the contributions of this thesis, but also because the current process has long been proven efficient on aerospace systems currently in use. Another motivation is to minimize the extra effort needed to provide certification authorities with the timing-related verification information required when seeking approval. As a secondary goal, depending on the possibilities, the contributions should offer design optimization features and help reduce the time-to-market by automating some steps of the design and verification process. This thesis proposes two complete methodologies for IMA integration on multi-core COTS. Each offers different advantages and has different drawbacks, and therefore each may suit its own, complementary situations. One fits all avionics and certification requirements of incremental verification and robust partitioning and therefore fits up to DAL A applications, while the other offers maximum Size, Weight and Power (SWaP) optimization and fits either up to DAL C applications, multipartition applications or non-IMA applications.
The methodologies are said to be "complete" because this thesis provides all necessary metrics to go through all steps of the software integration process. More specifically, this includes, for each strategy: - a static timing analysis for safely upper-bounding inter-core interference, and deriving the corresponding WCET upper bounds for each task; - a Constraint Programming (CP) formulation for automated software/hardware allocation; the resulting allocation is correct by construction since the CP process incorporates the proposed timing analysis mentioned earlier; - a CP formulation for automated schedule generation; the resulting schedule is correct by construction since the CP process incorporates the proposed timing analysis mentioned earlier.
3

Cordes, Daniel Alexander [Verfasser], Peter [Akademischer Betreuer] Marwedel, and Albert [Gutachter] Cohen. "Automatic parallelization for embedded multi-core systems using high level cost models / Daniel Alexander Cordes. Betreuer: Peter Marwedel. Gutachter: Albert Cohen." Dortmund : Universitätsbibliothek Dortmund, 2013. http://d-nb.info/1104738082/34.

4

Jain, Rahul. "Machine learned machines : reinforcement learning exploration for architecture co-optimization." Thesis, 2017. http://localhost:8080/iit/handle/2074/7461.

5

Msirdi, Soukayna raja. "Modular Avionics Software Integration on Multi-Core COTS : certification-Compliant Methodology and Timing Analysis Metrics for Legacy Software Reuse in Modern Aerospace Systems." Phd thesis, 2017. http://oatao.univ-toulouse.fr/18732/1/MSIRDI.pdf.

Abstract:
Interference in multicores is undesirable for hard real-time systems, especially in the aerospace industry, where timing predictability and deadline enforcement in a system's runtime behavior must be ensured beforehand in order to be granted acceptance by certification authorities. The goal of this thesis is to propose an approach for multi-core integration of legacy IMA software, without any hardware or software modification, which complies as much as possible with current incremental certification and key IMA concepts such as robust time and space partitioning. The motivations of this thesis are to stick as closely as possible to the current IMA software integration process in order to maximize the chances that the avionics industry will accept the contributions of this thesis, but also because the current process has long been proven efficient on aerospace systems currently in use. Another motivation is to minimize the extra effort needed to provide certification authorities with the timing-related verification information required when seeking approval. As a secondary goal, depending on the possibilities, the contributions should offer design optimization features and help reduce the time-to-market by automating some steps of the design and verification process. This thesis proposes two complete methodologies for IMA integration on multi-core COTS. Each offers different advantages and has different drawbacks, and therefore each may suit its own, complementary situations. One fits all avionics and certification requirements of incremental verification and robust partitioning and therefore fits up to DAL A applications, while the other offers maximum Size, Weight and Power (SWaP) optimization and fits either up to DAL C applications, multipartition applications or non-IMA applications.
The methodologies are said to be "complete" because this thesis provides all necessary metrics to go through all steps of the software integration process. More specifically, this includes, for each strategy: - a static timing analysis for safely upper-bounding inter-core interference, and deriving the corresponding WCET upper bounds for each task; - a Constraint Programming (CP) formulation for automated software/hardware allocation; the resulting allocation is correct by construction since the CP process incorporates the proposed timing analysis mentioned earlier; - a CP formulation for automated schedule generation; the resulting schedule is correct by construction since the CP process incorporates the proposed timing analysis mentioned earlier.
6

Alzahrani, Ali Saeed. "Design of multi-core dataflow cryptprocessor." Thesis, 2018. https://dspace.library.uvic.ca//handle/1828/9972.

Abstract:
Embedded multi-core systems are implemented as systems-on-chip that rely on packet store-and-forward networks-on-chip for communications. These systems use neither buses nor a global clock. Instead, routers are used to move data between the cores, and each core uses its own local clock. This implies concurrent asynchronous computing. Implementing algorithms in such systems is greatly facilitated by dataflow concepts. In this work, we propose a methodology for implementing algorithms on dataflow platforms. The methodology can be applied to multi-threaded platforms, multi-core platforms, or a combination of the two. This methodology is based on a novel dataflow graph representation of the algorithm. We applied the proposed methodology to obtain a novel dataflow multi-core computing model for the secure hash algorithm-3 (SHA-3). The resulting hardware was implemented in FPGA to verify the performance parameters. The proposed model of computation has advantages such as flexible I/O timing in terms of scheduling policy, execution of tasks as soon as possible, and a self-timed, event-driven system. In other words, I/O timing and correctness of algorithm evaluation are dissociated in this work. The main advantage of this proposal is the ability to dynamically obfuscate algorithm evaluation to thwart side-channel attacks without having to redesign the system. This has important implications for cryptographic applications. The dissertation also proposes four countermeasure techniques against side-channel attacks for SHA-3 hashing. The countermeasure techniques are based on choosing stochastic or deterministic input data scheduling strategies. Extensive simulations of the SHA-3 algorithm and the proposed countermeasure approaches were performed using object-oriented MATLAB models to verify and validate the effectiveness of the techniques. The design immunity for the proposed countermeasures is assessed.
Graduate
2020-11-19
7

Wang, Shao-Chumg, and 王紹仲. "Evaluation and Design of Programming Models for Heterogeneous Multi-Core Systems." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/36187539989392119287.

8

Nagendra, Gulur Dwarakanath. "Multi-Core Memory System Design : Developing and using Analytical Models for Performance Evaluation and Enhancements." Thesis, 2015. http://etd.iisc.ac.in/handle/2005/4007.

Abstract:
Memory system design is increasingly influencing modern multi-core architectures from both performance and power perspectives. Both main memory latency and bandwidth have improved at a rate that is slower than the increase in processor core count and speed. Off-chip memory, primarily built from DRAM, has received significant attention in terms of architecture and design for higher performance. These performance improvement techniques include sophisticated memory access scheduling, use of multiple memory controllers, mitigating the impact of DRAM refresh cycles, and so on. At the same time, new non-volatile memory technologies have become increasingly viable in terms of performance and energy. These alternative technologies offer different performance characteristics as compared to traditional DRAM. With the advent of 3D stacking, on-chip memory in the form of 3D stacked DRAM has opened up avenues for addressing the bandwidth and latency limitations of off-chip memory. Stacked DRAM is expected to offer abundant capacity (100s of MBs to a few GBs) at higher bandwidth and lower latency. Researchers have proposed to use this capacity as an extension to main memory, or as a large last-level DRAM cache. When leveraged as a cache, stacked DRAM provides opportunities and challenges for improving cache hit rate, access latency, and off-chip bandwidth. Thus, designing off-chip and on-chip memory systems for multi-core architectures is complex, compounded by the myriad architectural, design and technological choices, combined with the characteristics of application workloads. Applications have inherent spatial locality and access parallelism that influence the memory system response in terms of latency and bandwidth. In this thesis, we construct an analytical model of the off-chip main memory system to comprehend this diverse space and to study the impact of memory system parameters and workload characteristics from latency and bandwidth perspectives.
Our model, called ANATOMY, uses a queuing network formulation of the memory system parameterized with workload characteristics to obtain a closed form solution for the average miss penalty experienced by the last-level cache. We validate the model across a wide variety of memory configurations on four-core, eight-core and sixteen-core architectures. ANATOMY is able to predict memory latency with average errors of 8.1%, 4.1% and 9.7% over quad-core, eight-core and sixteen-core configurations respectively. Further, ANATOMY identifies better-performing design points accurately, thereby allowing architects and designers to explore the more promising design points in greater detail. We demonstrate the extensibility and applicability of our model by exploring a variety of memory design choices such as the impact of clock speed, benefit of multiple memory controllers, the role of banks and channel width, and so on. We also demonstrate ANATOMY’s ability to capture architectural elements such as memory scheduling mechanisms and impact of DRAM refresh cycles. In all of these studies, ANATOMY provides insight into sources of memory performance bottlenecks and is able to quantitatively predict the benefit of redressing them. An insight from the model suggests that the provisioning of multiple small row-buffers in each DRAM bank achieves better performance than the traditional one (large) row-buffer per bank design. Multiple row-buffers also enable newer performance improvement opportunities such as intra-bank parallelism between data transfers and row activations, and smart row-buffer allocation schemes based on workload demand. Our evaluation (both using the analytical model and detailed cycle-accurate simulation) shows that the proposed DRAM re-organization achieves significant speed-up as well as energy reduction. Next we examine the role of on-chip stacked DRAM caches at improving performance by reducing the load on off-chip main memory. We extend ANATOMY to cover DRAM caches.
ANATOMY-Cache takes into account all the key parameters/design issues governing DRAM cache organization, namely where the cache metadata is stored and accessed, the role of cache block size and set associativity, and the impact of block size on row-buffer hit rate and off-chip bandwidth. Yet the model is kept simple and provides a closed form solution for the average miss penalty experienced by the last-level SRAM cache. ANATOMY-Cache is validated against detailed architecture simulations and shown to have latency estimation errors of 10.7% and 8.8% on average in quad-core and eight-core configurations respectively. An interesting insight from the model suggests that under high load, it is better to bypass the congested DRAM cache and leverage the available idle main memory bandwidth. We use this insight to propose a refresh reduction mechanism that virtually eliminates refresh overhead in DRAM caches. We implement a low-overhead hardware mechanism to record accesses to recent DRAM cache pages and refresh only these pages. Older cache pages are considered invalid and serviced from the (idle) main memory. This technique achieves average refresh reduction of 90% with resulting memory energy savings of 9% and overall performance improvement of 3.7%. Finally, we propose a new DRAM cache organization that achieves higher cache hit rate, lower latency and lower off-chip bandwidth demand. Called the Bi-Modal Cache, our cache organization brings three independent improvements together: (i) it enables parallel tag and data accesses, (ii) it eliminates a large fraction of tag accesses entirely by use of a novel way locator and (iii) it improves cache space utilization by organizing the cache sets as a combination of some big blocks (512B) and some small blocks (64B). The Bi-Modal Cache reduces hit latency by use of the way locator and parallel tag and data accesses.
It improves hit rate by leveraging the cache capacity efficiently – blocks with low spatial reuse are allocated in the cache at 64B granularity thereby reducing both wasted off-chip bandwidth as well as cache internal fragmentation. Increased cache hit rate leads to reduction in off-chip bandwidth demand. Through detailed simulations, we demonstrate that the Bi-Modal Cache achieves overall performance improvement of 10.8%, 13.8% and 14.0% in quad-core, eight-core and sixteen-core workloads respectively over an aggressive baseline.
9

Dwarakanath, Nagendra Gulur. "Multi-Core Memory System Design : Developing and using Analytical Models for Performance Evaluation and Enhancements." Thesis, 2015. http://etd.iisc.ernet.in/2005/3935.

Abstract:
Memory system design is increasingly influencing modern multi-core architectures from both performance and power perspectives. Both main memory latency and bandwidth have improved at a rate that is slower than the increase in processor core count and speed. Off-chip memory, primarily built from DRAM, has received significant attention in terms of architecture and design for higher performance. These performance improvement techniques include sophisticated memory access scheduling, use of multiple memory controllers, mitigating the impact of DRAM refresh cycles, and so on. At the same time, new non-volatile memory technologies have become increasingly viable in terms of performance and energy. These alternative technologies offer different performance characteristics as compared to traditional DRAM. With the advent of 3D stacking, on-chip memory in the form of 3D stacked DRAM has opened up avenues for addressing the bandwidth and latency limitations of off-chip memory. Stacked DRAM is expected to offer abundant capacity (100s of MBs to a few GBs) at higher bandwidth and lower latency. Researchers have proposed to use this capacity as an extension to main memory, or as a large last-level DRAM cache. When leveraged as a cache, stacked DRAM provides opportunities and challenges for improving cache hit rate, access latency, and off-chip bandwidth. Thus, designing off-chip and on-chip memory systems for multi-core architectures is complex, compounded by the myriad architectural, design and technological choices, combined with the characteristics of application workloads. Applications have inherent spatial locality and access parallelism that influence the memory system response in terms of latency and bandwidth. In this thesis, we construct an analytical model of the off-chip main memory system to comprehend this diverse space and to study the impact of memory system parameters and workload characteristics from latency and bandwidth perspectives.
Our model, called ANATOMY, uses a queuing network formulation of the memory system parameterized with workload characteristics to obtain a closed form solution for the average miss penalty experienced by the last-level cache. We validate the model across a wide variety of memory configurations on four-core, eight-core and sixteen-core architectures. ANATOMY is able to predict memory latency with average errors of 8.1%, 4.1% and 9.7% over quad-core, eight-core and sixteen-core configurations respectively. Further, ANATOMY identifies better performing design points accurately thereby allowing architects and designers to explore the more promising design points in greater detail. We demonstrate the extensibility and applicability of our model by exploring a variety of memory design choices such as the impact of clock speed, benefit of multiple memory controllers, the role of banks and channel width, and so on. We also demonstrate ANATOMY’s ability to capture architectural elements such as memory scheduling mechanisms and impact of DRAM refresh cycles. In all of these studies, ANATOMY provides insight into sources of memory performance bottlenecks and is able to quantitatively predict the benefit of redressing them. An insight from the model suggests that the provisioning of multiple small row-buffers in each DRAM bank achieves better performance than the traditional one (large) row-buffer per bank design. Multiple row-buffers also enable newer performance improvement opportunities such as intra-bank parallelism between data transfers and row activations, and smart row-buffer allocation schemes based on workload demand. Our evaluation (both using the analytical model and detailed cycle-accurate simulation) shows that the proposed DRAM re-organization achieves significant speed-up as well as energy reduction. Next we examine the role of on-chip stacked DRAM caches at improving performance by reducing the load on off-chip main memory. We extend ANATOMY to cover DRAM caches.
ANATOMY-Cache takes into account all the key parameters/design issues governing DRAM cache organization namely, where the cache metadata is stored and accessed, the role of cache block size and set associativity and the impact of block size on row-buffer hit rate and off-chip bandwidth. Yet the model is kept simple and provides a closed form solution for the average miss penalty experienced by the last-level SRAM cache. ANATOMY-Cache is validated against detailed architecture simulations and shown to have latency estimation errors of 10.7% and 8.8% on average in quad-core and eight-core configurations respectively. An interesting insight from the model suggests that under high load, it is better to bypass the congested DRAM cache and leverage the available idle main memory bandwidth. We use this insight to propose a refresh reduction mechanism that virtually eliminates refresh overhead in DRAM caches. We implement a low-overhead hardware mechanism to record accesses to recent DRAM cache pages and refresh only these pages. Older cache pages are considered invalid and serviced from the (idle) main memory. This technique achieves average refresh reduction of 90% with resulting memory energy savings of 9% and overall performance improvement of 3.7%. Finally, we propose a new DRAM cache organization that achieves higher cache hit rate, lower latency and lower off-chip bandwidth demand. Called the Bi-Modal Cache, our cache organization brings three independent improvements together: (i) it enables parallel tag and data accesses, (ii) it eliminates a large fraction of tag accesses entirely by use of a novel way locator and (iii) it improves cache space utilization by organizing the cache sets as a combination of some big blocks (512B) and some small blocks (64B). The Bi-Modal Cache reduces hit latency by use of the way locator and parallel tag and data accesses.
It improves hit rate by leveraging the cache capacity efficiently – blocks with low spatial reuse are allocated in the cache at 64B granularity thereby reducing both wasted off-chip bandwidth as well as cache internal fragmentation. Increased cache hit rate leads to reduction in off-chip bandwidth demand. Through detailed simulations, we demonstrate that the Bi-Modal Cache achieves overall performance improvement of 10.8%, 13.8% and 14.0% in quad-core, eight-core and sixteen-core workloads respectively over an aggressive baseline.
10

Kaushik, Anirudh Mohan. "Accelerating Mixed-Abstraction SystemC Models on Multi-Core CPUs and GPUs." Thesis, 2014. http://hdl.handle.net/10012/8370.

Abstract:
Functional verification is a critical part of the hardware design cycle, and it accounts for nearly two-thirds of the overall development time. With increasing complexity of hardware designs and shrinking time-to-market constraints, the time and resources spent on functional verification have increased considerably. To mitigate the increasing cost of functional verification, research and academia have been engaged in proposing techniques for improving the simulation of hardware designs, which is a key technique used in the functional verification process. However, the proposed techniques for accelerating the simulation of hardware designs do not leverage the performance benefits offered by multiprocessors/multi-core and heterogeneous processors available today. With the growing ubiquity of powerful heterogeneous computing systems, which integrate multi-processor/multi-core systems with heterogeneous processors such as GPUs, it is important to utilize these computing systems to address the functional verification bottleneck. In this thesis, I propose a technique for accelerating SystemC simulations across multi-core CPUs and GPUs. In particular, I focus on accelerating simulation of SystemC models that are described at both the Register-Transfer Level (RTL) and Transaction Level (TL) abstractions. The main contributions of this thesis are: (1) a methodology for accelerating the simulation of mixed-abstraction SystemC models defined at the RTL and TL abstractions on multi-core CPUs and GPUs, and (2) an open-source static framework for parsing, analyzing, and performing source-to-source translation of identified portions of a SystemC model for execution on multi-core CPUs and GPUs.

Books on the topic "Modern multi-core systems"

1

Song, Dong, and Theodore W. Berger. Hippocampal memory prosthesis. Oxford University Press, 2018. http://dx.doi.org/10.1093/oso/9780199674923.003.0055.

Abstract:
Damage to the hippocampus and surrounding regions of the medial temporal lobe can result in a permanent loss of the ability to form new long-term memories. Hippocampal memory prosthesis is designed to restore this ability. The animal model described here is the memory-dependent, delayed nonmatch-to-sample (DNMS) task in rats, and the core of the prosthesis is a biomimetic multi-input, multi-output (MIMO) nonlinear dynamical model that predicts hippocampal output (CA1) signals based on input (CA3) signals. When hippocampal CA1 function is pharmacologically blocked, successful DNMS behavior is abolished. However, when MIMO model predictions are used to re-instate CA1 memory-related activities with electrical stimulation, successful DNMS behavior and long-term memory function are restored. The hippocampal memory prosthesis has been successfully implemented in rodents and nonhuman primates, but the current system requires major advances before it can approach a working prosthesis. Looking forward, a deeper knowledge of neural coding will provide further insights.

Book chapters on the topic "Modern multi-core systems"

1

Liu, Shaoshan, and Jean-Luc Gaudiot. "Synchronization Mechanisms on Modern Multi-core Architectures." In Advances in Computer Systems Architecture, 290–303. Berlin, Heidelberg: Springer Berlin Heidelberg, 2007. http://dx.doi.org/10.1007/978-3-540-74309-5_28.

2

Varbanescu, Ana Lucia, Rob V. van Nieuwpoort, Pieter Hijma, Henri E. Bal, Rosa M. Badia, and Xavier Martorell. "Programming Models for Multicore and Many-Core Computing Systems." In Programming multi-core and many-core computing systems, 29–58. Hoboken, NJ, USA: John Wiley & Sons, Inc., 2017. http://dx.doi.org/10.1002/9781119332015.ch2.

3

Chattopadhyay, Sudipta. "MESS: Memory Performance Debugging on Embedded Multi-core Systems." In Model Checking Software, 105–25. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-23404-5_8.

4

Ahmed, Jameel, Mohammed Yakoob Siyal, Shaheryar Najam, and Zohaib Najam. "Challenges and Issues in Modern Computer Architectures." In Fuzzy Logic Based Power-Efficient Real-Time Multi-Core System, 23–29. Singapore: Springer Singapore, 2016. http://dx.doi.org/10.1007/978-981-10-3120-5_3.

5

Borkowski, Jeffrey, Lotfi Belblidia, and Oliver Tsaoi. "S3R Advanced Training Simulator Core Model: Implementation and Validation." In Springer Proceedings in Physics, 789–99. Singapore: Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-99-1023-6_68.

Abstract:
Modern training simulator core models are required to replicate plant data for neutronic response. Replication is required such that reactivity manipulation on the simulator properly trains the operator for reactivity manipulation at the plant. This paper discusses advanced models which perform this function in real-time using S3R, the real-time, time-dependent core model of the Studsvik Core Management System (CMS). This paper also discusses the coupled multi-physics of the Reactor Coolant System (RCS) model, using RELAP5 as a prototype. Finally, this paper discusses the implementation of S3R under the control of a server-based executive environment and instructor station, essential for training simulator applications.
6

Brandenburg, Jens, and Benno Stabernack. "A Generic and Non-intrusive Profiling Methodology for SystemC Multi-core Platform Simulation Models." In Architecture of Computing Systems – ARCS 2012, 135–46. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-28293-5_12.

7

Lai, Wallace Wai-Lok. "Underground Utilities Imaging and Diagnosis." In Urban Informatics, 415–38. Singapore: Springer Singapore, 2021. http://dx.doi.org/10.1007/978-981-15-8983-6_24.

Abstract:
The invisible and congested world of underground utilities (UU) is an indispensable mystery to the general public because their existence is invisible until problems happen. Their growth aligns with the continuous development of cities and the ever-increasing demand for energy and quality of life. To satisfy a variety of modern requirements like emergency or routine repair, safe dig and excavation, monitoring, maintenance, and upscaling of the network, two basic tasks are always required. They are mapping and imaging (where?), and diagnosis (how healthy?). This chapter gives a review of the current state of the art of these two core topics, and their levels of expected survey accuracy, and looks forward to future trends of research and development (Sects. 24.1 and 24.2). From the point of view of physics, a large range of survey technologies is central to imaging and diagnosis, having originated from electromagnetic- and acoustic-based near-surface geophysical and nondestructive testing methods. To date, survey technologies have been further extended by multi-disciplinary task forces in various disciplines (Sect. 24.3). First, it involves sending and retrieving mechanical robots to survey the internal confined spaces of utilities using careful system control and seamless communication electronics. Secondly, the captured data and signals of various kinds are positioned, processed, and in the future, pattern-recognized with a database to robustly trace the location and diagnose the conditions of any particular type of utilities. Thirdly, such a pattern-recognized database of various types of defects can be regarded as a learning process through repeated validation in the laboratory, simulation, and ground-truthing in the field. This chapter is concluded by briefly introducing the human-factor or psychological and cognitive biases, which are in most cases neglected in any imaging and diagnostic work (Sect. 24.4).
In short, the very challenging nature and large demand for utility imaging and diagnostics have been gradually evolving from the traditional visual inspection to a new era of multi-disciplinary surveying and engineering professions and even towards the psychological part of human–machine interaction.
8

Curry, Edward, Edo Osagie, Niki Pavlopoulou, Dhaval Salwala, and Adegboyega Ojo. "A Best Practice Framework for Centres of Excellence in Big Data and Artificial Intelligence." In The Elements of Big Data Value, 177–210. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-68176-0_8.

Abstract:
This chapter presents a best practice framework for the operation of Big Data and Artificial Intelligence Centres of Excellence (BDAI CoE). The goal of the framework is to foster collaboration and share best practices among existing centres and support the establishment of new Centres of Excellence (CoEs) within Europe. The framework was developed following a phased design science process, starting from a literature review to create an initial framework which was enhanced with the findings of a multi-case study of existing successful CoEs. Each case study involved an in-depth analysis and a series of in-depth interviews with leadership personnel of existing CoEs. The resulting best practice framework models a CoE using open systems theory that comprises input (environment), transformation (CoE) and output (impact). The framework conceptualises the internal operation of the CoE as a set of high-level capabilities including strategy, governance, structure, funding, and people and culture. The core capabilities of the CoE include business development, collaboration, research support services, technical infrastructure, experimentation/demonstration platforms, Intellectual Property (IP) and data protection, education and public engagement, policy outreach, technology and knowledge transfer, and performance and impact assessment. In this chapter we describe the best practice framework for CoEs in big data and AI, including objectives, environment, strategic and operational capabilities, and impact. The chapter outlines how the framework can be used by a CoE to support its strategic direction and operational decisions over time, and how a new CoE can use it in the start-up phase. Based on the analysis of the case studies, the chapter explores the critical success factors of a CoE as defined by a survey of CoE managers. Finally, the chapter concludes with a summary.
9

Pahikkala, Tapio, Antti Airola, Thomas Canhao Xu, Pasi Liljeberg, Hannu Tenhunen, and Tapio Salakoski. "On Parallel Online Learning for Adaptive Embedded Systems." In Advances in Systems Analysis, Software Engineering, and High Performance Computing, 262–81. IGI Global, 2014. http://dx.doi.org/10.4018/978-1-4666-6034-2.ch011.

Abstract:
This chapter considers parallel implementation of the online multi-label regularized least-squares machine-learning algorithm for embedded hardware platforms. The authors focus on the following properties required in real-time adaptive systems: learning in online fashion, that is, the model improves with new data but does not require storing it; the method can fully utilize the computational abilities of modern embedded multi-core computer architectures; and the system efficiently learns to predict several labels simultaneously. They demonstrate on a hand-written digit recognition task that the online algorithm converges faster, with respect to the amount of training data processed, to an accurate solution than a stochastic gradient descent based baseline. Further, the authors show that their parallelization of the method scales well on a quad-core platform. Moreover, since Network-on-Chip (NoC) has been proposed as a promising candidate for future multi-core architectures, they implement a NoC system consisting of 16 cores. The proposed machine learning algorithm is evaluated on the NoC platform. Experimental results show that, by optimizing the cache behaviour of the program, cache/memory efficiency can improve significantly. Results from the chapter provide a guideline for designing future embedded multi-core machine learning devices.
10

Ahmadinia, Ali, and Ahmed Saeed. "Secure Embedded Systems." In Cyber-Physical Systems for Next-Generation Networks, 207–21. IGI Global, 2018. http://dx.doi.org/10.4018/978-1-5225-5510-0.ch010.

Abstract:
As computing devices have become an almost integral part of our lives, security of systems and protection of sensitive data are emerging as very important issues. This is particularly evident for embedded systems, which are often deployed in unprotected environments while being constrained by limited resources. Security and trust have also become important considerations in the design of virtually all modern embedded systems as they are utilized in critical and sensitive applications such as in transportation, national infrastructure, military equipment, banking systems, and medical devices. The increase in software content and network connectivity has made them vulnerable to fast spreading software-based attacks such as viruses and worms, which were hitherto primarily the concern of personal computers, servers, and the internet. This chapter discusses the basic concepts, types of security attacks, and existing preventive measures in the field of embedded systems and multi-core systems.

Conference papers on the topic "Modern multi-core systems"

1

Kloda, Tomasz, Marco Solieri, Renato Mancuso, Nicola Capodieci, Paolo Valente, and Marko Bertogna. "Deterministic Memory Hierarchy and Virtualization for Modern Multi-Core Embedded Systems." In 2019 IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS). IEEE, 2019. http://dx.doi.org/10.1109/rtas.2019.00009.

2

Kalamkar, Dhiraj D., Joshua D. Trzasko, Srinivas Sridharan, Mikhail Smelyanskiy, Daehyun Kim, Armando Manduca, Yunhong Shu, Matt A. Bernstein, Bharat Kaul, and Pradeep Dubey. "High Performance Non-uniform FFT on Modern X86-based Multi-core Systems." In 2012 IEEE International Symposium on Parallel & Distributed Processing (IPDPS). IEEE, 2012. http://dx.doi.org/10.1109/ipdps.2012.49.

3

Galter, Diana, Sergey Biryuchinskiy, and Konstantin Melnikov. "An optical data transmission channel in single-chip multi-core systems." In 2012 IV International Congress on Ultra Modern Telecommunications and Control Systems and Workshops (ICUMT 2012). IEEE, 2012. http://dx.doi.org/10.1109/icumt.2012.6459647.

4

Eisenman, Assaf, Lucy Cherkasova, Guilherme Magalhaes, Qiong Cai, and Sachin Katti. "Parallel Graph Processing on Modern Multi-core Servers: New Findings and Remaining Challenges." In 2016 IEEE 24th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS). IEEE, 2016. http://dx.doi.org/10.1109/mascots.2016.66.

5

Kirk, Richard O., Gihan R. Mudalige, Istvan Z. Reguly, Steven A. Wright, Matt J. Martineau, and Stephen A. Jarvis. "Achieving Performance Portability for a Heat Conduction Solver Mini-Application on Modern Multi-core Systems." In 2017 IEEE International Conference on Cluster Computing (CLUSTER). IEEE, 2017. http://dx.doi.org/10.1109/cluster.2017.122.

6

Pérez Rodríguez, Javier, Patrick Meumeu Yomsi, Yilian Ribot González, and Luis Javier Pente Lam. "B-TSP: An Advanced Power Safe Management Strategy for modern Multi-core Platforms under Thermal-Aware Design." In RTNS 2023: The 31st International Conference on Real-Time Networks and Systems. New York, NY, USA: ACM, 2023. http://dx.doi.org/10.1145/3575757.3593659.

7

Krzywaniak, Adam, Jerzy Proficz, and Paweł Czarnul. "Analyzing energy/performance trade-offs with power capping for parallel applications on modern multi and many core processors." In 2018 Federated Conference on Computer Science and Information Systems. IEEE, 2018. http://dx.doi.org/10.15439/2018f177.

8

Barladyan, Boris, Lev Shapiro, Kurban Mallachiev, Alexey Khoroshilov, et al. "Multi-windows Rendering Using Software OpenGL in Avionics Embedded Systems." In 29th International Conference on Computer Graphics, Image Processing and Computer Vision, Visualization Systems and the Virtual Environment GraphiCon'2019. Bryansk State Technical University, 2019. http://dx.doi.org/10.30987/graphicon-2019-2-28-31.

Abstract:
Modern airplane cockpit design tends to use large displays instead of many separate indicators. A large display should combine information about flight navigation and the state of aircraft equipment. Information coming from a wide variety of devices should be displayed simultaneously, so multi-window rendering is vitally important. Its implementation must be embedded in the real-time operating system that controls the aircraft. The paper considers the development of a Safety Critical Compositor for multi-window rendering for OpenGL SC 1.0.1 software. It works under JetOS, a real-time operating system newly designed for aircraft. The development is based on extensions designed to work in multi-core systems, in addition to the standard JetOS partitioning services.
9

Dufour, Christian, Guillaume Dumur, Jean-Nicolas Paquin, and Jean Belanger. "A multi-core pc-based simulator for the hardware-in-the-loop testing of modern train and ship traction systems." In 2008 13th International Power Electronics and Motion Control Conference (EPE/PEMC 2008). IEEE, 2008. http://dx.doi.org/10.1109/epepemc.2008.4635476.

10

Lu, Yi, Kai Liu, and W. N. Dawes. "Fast High Order Large Eddy Simulations on Many Core Computing Systems for Turbomachinery." In ASME Turbo Expo 2016: Turbomachinery Technical Conference and Exposition. American Society of Mechanical Engineers, 2016. http://dx.doi.org/10.1115/gt2016-57468.

Abstract:
The overall aim of our research is to enable overnight high fidelity LES for realistic industry problems on affordable computing resource. We have adopted a “3E” approach: high spatial discretization Efficiency on general unstructured meshes, high Efficiency accurate time integration and high computing Efficiency on modern low cost HPC hardware. Our approach is centered on high order Flux Reconstruction with local time stepping: the STEFR algorithm [1]. In this paper, an offload-mode version of this code is described, targeted at a heterogeneous many-core computing system based on low cost commodity hardware (Intel PHI cards). Three key techniques are introduced to achieve high FLOP rates and optimal usage of the non-equilibrium memory of both the CPU and the many-core coprocessor: three levels of parallelization, multi-level nonequilibrium mesh partition and an asynchronous computing structure. A series of high order LES runs for a high lift low pressure turbine blade and a transonic turbine blade, with different orders of accuracy, both fully wall-resolved and wall-modelled, were performed, analyzed and presented. This work demonstrates that the high order STEFR method has the potential to support overnight LES for realistic industrial problems on affordable computing resource.

Reports on the topic "Modern multi-core systems"

1

Mohammadi, N., D. Corrigan, A. A. Sappin, and N. Rayner. Evidence for a Neoarchean to earliest-Paleoproterozoic mantle metasomatic event prior to formation of the Mesoproterozoic-age Strange Lake REE deposit, Newfoundland and Labrador, and Quebec, Canada. Natural Resources Canada/CMSS/Information Management, 2022. http://dx.doi.org/10.4095/330866.

Abstract:
A complete suite of bulk major- and trace-elements measurements combined with macroscopic/microscopic observations and mineralogy guided by scanning electron microscope-energy dispersive spectrometry (SEM-EDS) analyses were applied on Nekuashu (2.55 Ga) and Pelland (2.32 Ga) intrusions in northern Canada, near the Strange Lake rare earth elements (REE) deposit, to evaluate their magmatic evolution and possible relations to the Mesoproterozoic Strange Lake Peralkaline Complex (SLPC). These Neoarchean to earliest-Paleoproterozoic intrusions, part of the Core Zone in southeastern Churchill Province, comprise mainly hypersolvus suites, including hornblendite, gabbro, monzogabbro/monzodiorite, monzonite, syenite/augite-syenite, granodiorite, and mafic diabase/dyke. However, the linkage of the suites and their petrogenesis are poorly understood. Geochemical evidence suggests a combination of 'intra-crustal multi-stage differentiation', mainly controlled by fractional crystallization (to generate mafic to felsic suites), and 'accumulation' (to form hornblendite suite) was involved in the evolution history of this system. Our model proposes that hornblendite and mafic to felsic intrusive rocks of both intrusions share a similar basaltic parent magma, generated from melting of a hydrous metasomatized mantle source that triggered an initial REE and incompatible element enrichment that prepared the ground for the subsequent enrichment in the SLPC. Geochemical signature of the hornblendite suite is consistent with a cumulate origin and its formation during the early stages of the magma evolution; however, the remaining suites were mainly controlled by 'continued fractional crystallization' processes, producing more evolved suites: gabbronorite/hornblende-gabbro → monzogabbro/monzodiorite → monzonite → syenite/augite-syenite.
In this proposed model, the hydrous mantle-derived basaltic magma was partly solidified to form the mafic suites (gabbronorite/hornblende-gabbro) by early-stage plagioclase-pyroxene-amphibole fractionation in the deep crust while settling of the early crystallized hornblende (+pyroxene) led to the formation of the hornblendite cumulates. The subsequent fractionation of plagioclase, pyroxene, and amphibole from the residual melt produced the more intermediate suites of monzogabbro/monzodiorite. The evolved magma ascended upward into the shallow crust to form monzonite by K-feldspar fractionation. The residual melt then intruded at shallower depth to form syenite/augite-syenite with abundant microcline crystals. The granodiorite suite was probably generated from lower crustal melts associated with the mafic end members. Later mafic diabase/dykes were likely generated by further partial melting of the same source at depth that were injected into the other suites.