Selected scholarly literature on the topic "CPU-GPU Partitioning"

Consult the list of current articles, books, theses, conference proceedings, and other scholarly sources relevant to the topic "CPU-GPU Partitioning".

For each source, a bibliographic citation can be generated in the style you need: APA, MLA, Harvard, Chicago, Vancouver, etc. You can also download the full text of a publication as a .pdf and read its abstract online when one is present in the metadata.

Journal articles on the topic "CPU-GPU Partitioning"

1

Benatia, Akrem, Weixing Ji, Yizhuo Wang, and Feng Shi. "Sparse matrix partitioning for optimizing SpMV on CPU-GPU heterogeneous platforms." International Journal of High Performance Computing Applications 34, no. 1 (2019): 66–80. http://dx.doi.org/10.1177/1094342019886628.

Full text of the source
Abstract:
The sparse matrix–vector multiplication (SpMV) kernel dominates the computing cost in numerous applications. Most of the existing studies dedicated to improving this kernel have been targeting just one type of processing unit, mainly multicore CPUs or graphics processing units (GPUs), and have not explored the potential of the recent, rapidly emerging, CPU-GPU heterogeneous platforms. To take full advantage of these heterogeneous systems, the input sparse matrix has to be partitioned on different available processing units. The partitioning problem is more challenging with the existence of many …
ABNT, Harvard, Vancouver, APA, and other citation styles
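The partitioning idea in this entry — splitting the rows of a sparse matrix between CPU and GPU so the work matches each device's throughput — can be sketched in a few lines. This is a generic illustration, not the authors' algorithm; `split_rows` and its throughput parameters are hypothetical names for the sketch.

```python
def split_rows(row_ptr, cpu_gflops, gpu_gflops):
    """Split the rows of a CSR matrix into a GPU part and a CPU part so
    that the nonzeros are divided roughly in proportion to device
    throughput.  row_ptr is the CSR row-pointer array (len = nrows + 1),
    so row_ptr[i] is the number of nonzeros in rows [0, i).
    Returns cut: rows [0, cut) go to the GPU, rows [cut, nrows) to the CPU."""
    total_nnz = row_ptr[-1]
    gpu_share = gpu_gflops / (cpu_gflops + gpu_gflops)
    target = gpu_share * total_nnz
    # Smallest row boundary whose prefix nonzero count reaches the target.
    cut = 0
    while cut < len(row_ptr) - 1 and row_ptr[cut] < target:
        cut += 1
    return cut
```

With a CPU rated at 1 GFLOP/s and a GPU at 3 GFLOP/s, roughly three quarters of the nonzeros land on the GPU; real schemes additionally model transfer cost and per-format kernel efficiency.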
2

Narayana, Divyaprabha Kabbal, and Sudarshan Tekal Subramanyam Babu. "Optimal task partitioning to minimize failure in heterogeneous computational platform." International Journal of Electrical and Computer Engineering (IJECE) 15, no. 1 (2025): 1079–88. http://dx.doi.org/10.11591/ijece.v15i1.pp1079-1088.

Abstract:
The increased energy consumption by heterogeneous cloud platforms surges the carbon emissions and reduces system reliability, thus making workload scheduling an extremely challenging process. The dynamic voltage-frequency scaling (DVFS) technique provides an efficient mechanism for improving the energy efficiency of cloud platforms; however, employing DVFS reduces reliability and increases the failure rate of resource scheduling. Most of the current workload scheduling methods have failed to optimize the energy and reliability together under a central processing unit–graphical processing unit …
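The energy–reliability tension this abstract describes is often captured with a simple analytical model: lowering the DVFS frequency cuts dynamic energy but stretches execution time and, under a commonly used exponential fault-rate model, raises the transient-fault rate. The sketch below is illustrative only — the constants and the function name are assumptions, not this paper's formulation.

```python
import math

def energy_and_reliability(work, f, f_min=0.4, p_static=0.1, lam0=1e-6, d=3.0):
    """Toy DVFS trade-off model (all constants illustrative).
    work: execution time at full frequency f = 1.0; f in (f_min, 1.0]."""
    time = work / f                          # execution time scales as 1/f
    energy = (p_static + f ** 3) * time      # static + dynamic (~f^3) power
    # Transient-fault rate grows exponentially as frequency/voltage drop.
    lam = lam0 * 10 ** (d * (1.0 - f) / (1.0 - f_min))
    reliability = math.exp(-lam * time)      # Poisson fault arrivals
    return energy, reliability
```

Running the model at f = 1.0 versus f = 0.5 shows the trade-off the paper targets: the lower frequency saves energy but lowers the probability of fault-free completion.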
3

Yang, Huijing, and Tingwen Yu. "Two novel cache management mechanisms on CPU-GPU heterogeneous processors." Research Briefs on Information and Communication Technology Evolution 7 (June 15, 2021): 1–8. http://dx.doi.org/10.56801/rebicte.v7i.113.

Abstract:
Heterogeneous multicore processors that take full advantage of CPUs and GPUs within the same chip raise an emerging challenge for sharing a series of on-chip resources, particularly Last-Level Cache (LLC) resources. Since the GPU core has good parallelism and memory latency tolerance, the majority of the LLC space is utilized by GPU applications. Under the current cache management policies, the LLC sharing of CPU applications can be remarkably decreased due to the existence of GPU workloads, thus seriously affecting the overall performance. To alleviate the unfair contention within CPUs and GPUs for …
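A minimal model of the static way-partitioning idea behind such cache-management policies: each client (CPU or GPU) may only evict within its own quota of ways, so a streaming GPU workload cannot flush the CPU's working set out of the shared LLC. This is a toy single-set LRU sketch under assumed names, not either paper's mechanism.

```python
from collections import OrderedDict

class WayPartitionedSet:
    """One cache set whose ways are statically split between clients.
    Each client evicts only within its own ways (LRU per client)."""

    def __init__(self, cpu_ways, gpu_ways):
        self.quota = {"cpu": cpu_ways, "gpu": gpu_ways}
        # tag -> None, kept in LRU order (least recent first)
        self.lines = {"cpu": OrderedDict(), "gpu": OrderedDict()}

    def access(self, client, tag):
        """Return True on hit, False on miss (the line is then filled)."""
        lines = self.lines[client]
        if tag in lines:                      # hit: refresh LRU position
            lines.move_to_end(tag)
            return True
        if len(lines) >= self.quota[client]:  # miss: evict own LRU way only
            lines.popitem(last=False)
        lines[tag] = None
        return False
```

In the usage below, 100 streaming GPU accesses leave the CPU's two resident lines untouched — the effect unpartitioned LLCs fail to provide.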
4

Narayana, Divyaprabha Kabbal, and Sudarshan Tekal Subramanyam Babu. "Optimal task partitioning to minimize failure in heterogeneous computational platform." International Journal of Electrical and Computer Engineering (IJECE) 15 (February 1, 2025): 1079–88. https://doi.org/10.11591/ijece.v15i1.pp1079-1088.

Abstract:
The increased energy consumption by heterogeneous cloud platforms surges the carbon emissions and reduces system reliability, thus making workload scheduling an extremely challenging process. The dynamic voltage-frequency scaling (DVFS) technique provides an efficient mechanism for improving the energy efficiency of cloud platforms; however, employing DVFS reduces reliability and increases the failure rate of resource scheduling. Most of the current workload scheduling methods have failed to optimize the energy and reliability together under a central processing unit …
5

Fang, Juan, Mengxuan Wang, and Zelin Wei. "A memory scheduling strategy for eliminating memory access interference in heterogeneous system." Journal of Supercomputing 76, no. 4 (2020): 3129–54. http://dx.doi.org/10.1007/s11227-019-03135-7.

Abstract:
Multiple CPUs and GPUs are integrated on the same chip to share memory, and access requests between cores are interfering with each other. Memory requests from the GPU seriously interfere with the CPU memory access performance. Requests between multiple CPUs are intertwined when accessing memory, and its performance is greatly affected. The difference in access latency between GPU cores increases the average latency of memory accesses. In order to solve the problems encountered in the shared memory of heterogeneous multi-core systems, we propose a step-by-step memory scheduling strategy …
6

Merrill, Duane, and Andrew Grimshaw. "High Performance and Scalable Radix Sorting: A Case Study of Implementing Dynamic Parallelism for GPU Computing." Parallel Processing Letters 21, no. 02 (2011): 245–72. http://dx.doi.org/10.1142/s0129626411000187.

Abstract:
The need to rank and order data is pervasive, and many algorithms are fundamentally dependent upon sorting and partitioning operations. Prior to this work, GPU stream processors have been perceived as challenging targets for problems with dynamic and global data-dependences such as sorting. This paper presents: (1) a family of very efficient parallel algorithms for radix sorting; and (2) our allocation-oriented algorithmic design strategies that match the strengths of GPU processor architecture to this genre of dynamic parallelism. We demonstrate multiple factors of speedup (up to 3.8x) compared …
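Radix sorting is built from stable counting/partitioning passes over successive digit groups; the paper's contribution is mapping these passes efficiently onto GPU stream processors. A serial sketch of the pass structure (illustrative only, for nonnegative integer keys):

```python
def radix_sort(keys, bits_per_pass=4, key_bits=32):
    """Least-significant-digit radix sort: one stable partitioning pass
    per digit group.  Serial version, only to show the pass structure
    that GPU implementations parallelize with counting and prefix sums."""
    mask = (1 << bits_per_pass) - 1
    for shift in range(0, key_bits, bits_per_pass):
        # Stable scatter into 2^bits_per_pass buckets by the current digit.
        buckets = [[] for _ in range(1 << bits_per_pass)]
        for k in keys:
            buckets[(k >> shift) & mask].append(k)
        keys = [k for b in buckets for k in b]
    return keys
```

Each pass is stable, so after the final (most significant) digit pass the whole sequence is sorted.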
7

Vilches, Antonio, Rafael Asenjo, Angeles Navarro, Francisco Corbera, Rubén Gran, and María Garzarán. "Adaptive Partitioning for Irregular Applications on Heterogeneous CPU-GPU Chips." Procedia Computer Science 51 (2015): 140–49. http://dx.doi.org/10.1016/j.procs.2015.05.213.

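Adaptive partitioning schemes of this kind typically let each device repeatedly claim a chunk of the remaining iterations, sized from its observed throughput, so the faster device naturally ends up with more work. The simulation below is a generic sketch of that pattern, not the paper's scheduler; all names and constants are assumptions.

```python
import heapq

def adaptive_schedule(n_iters, rates, chunk_frac=0.1, min_chunk=4):
    """Simulate dynamic chunked self-scheduling across devices.
    rates: device name -> iterations per ms (assumed measured online).
    Returns how many iterations each device ends up executing."""
    remaining = n_iters
    done = {dev: 0 for dev in rates}
    # Event queue of (time the device becomes free, device name).
    pq = [(0.0, dev) for dev in rates]
    heapq.heapify(pq)
    while remaining > 0:
        t, dev = heapq.heappop(pq)
        # Chunk shrinks as the loop drains, bounded below for low overhead.
        chunk = min(remaining, max(min_chunk, int(chunk_frac * remaining)))
        remaining -= chunk
        done[dev] += chunk
        heapq.heappush(pq, (t + chunk / rates[dev], dev))
    return done
```

With a GPU nine times faster than the CPU, the GPU claims the bulk of the iterations without any static split being chosen up front.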
8

Sung, Hanul, Hyeonsang Eom, and HeonYoung Yeom. "The Need of Cache Partitioning on Shared Cache of Integrated Graphics Processor between CPU and GPU." KIISE Transactions on Computing Practices 20, no. 9 (2014): 507–12. http://dx.doi.org/10.5626/ktcp.2014.20.9.507.

9

Wang, Shunjiang, Baoming Pu, Ming Li, Weichun Ge, Qianwei Liu, and Yujie Pei. "State Estimation Based on Ensemble DA–DSVM in Power System." International Journal of Software Engineering and Knowledge Engineering 29, no. 05 (2019): 653–69. http://dx.doi.org/10.1142/s0218194019400023.

Abstract:
This paper investigates the state estimation problem of power systems. A novel, fast and accurate state estimation algorithm is presented to solve this problem based on the one-dimensional denoising autoencoder and deep support vector machine (1D DA–DSVM). Besides, for further reducing the computation burden, a partitioning method is presented to divide the power system into several sub-networks and the proposed algorithm can be applied to each sub-network. A hybrid computing architecture of Central Processing Unit (CPU) and Graphics Processing Unit (GPU) is employed in the overall state estimation …
10

Park, Sungwoo, Seyeon Oh, and Min-Soo Kim. "cuMatch: A GPU-based Memory-Efficient Worst-case Optimal Join Processing Method for Subgraph Queries with Complex Patterns." Proceedings of the ACM on Management of Data 3, no. 3 (2025): 1–28. https://doi.org/10.1145/3725398.

Abstract:
Subgraph queries are widely used but face significant challenges due to complex patterns such as negative and optional edges. While worst-case optimal joins have proven effective for subgraph queries with regular patterns, no method has been proposed that can process queries involving complex patterns in a single multi-way join. Existing CPU-based and GPU-based methods experience intermediate data explosion when processing complex patterns following regular patterns. In addition, GPU-based methods struggle with issues of wasted GPU memory and redundant computation. In this paper, we propose cuMatch …

Theses and dissertations on the topic "CPU-GPU Partitioning"

1

Öhberg, Tomas. "Auto-tuning Hybrid CPU-GPU Execution of Algorithmic Skeletons in SkePU." Thesis, Linköpings universitet, Programvara och system, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-149605.

Abstract:
The trend in computer architectures has for several years been heterogeneous systems consisting of a regular CPU and at least one additional, specialized processing unit, such as a GPU. The different characteristics of the processing units and the requirement of multiple tools and programming languages makes programming of such systems a challenging task. Although there exist tools for programming each processing unit, utilizing the full potential of a heterogeneous computer still requires specialized implementations involving multiple frameworks and hand-tuning of parameters. To fully exploit …
2

Thomas, Béatrice. "Adéquation Algorithme Architecture pour la gestion des réseaux électriques." Electronic Thesis or Diss., université Paris-Saclay, 2024. http://www.theses.fr/2024UPASG104.

Abstract:
The growth of decentralized renewable generation required by the energy transition will make managing the electrical grid more complex. A rich body of literature proposes decentralizing grid management to avoid overloading the central operator during real-time operation. Decentralization, however, worsens scalability problems in the preliminary simulations used to validate the performance and robustness of the management scheme or the sizing of the future grid. This thesis follows an algorithm-architecture co-design ("Adéquation Algorithme Architecture") approach for a peer-to-peer market …
3

Li, Cheng-Hsuan, and 李承軒. "Weighted LLC Latency-Based Run-Time Cache Partitioning for Heterogeneous CPU-GPU Architecture." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/33311478280299879988.

Abstract:
Master's thesis, National Taiwan University, Computer Science and Information Engineering, ROC year 102 (2013). Integrating the CPU and GPU on the same chip has become the development trend for microprocessor design. In integrated CPU-GPU architecture, utilizing the shared last-level cache (LLC) is a critical design issue due to the pressure on shared resources and the different characteristics of CPU and GPU applications. Because of the latency-hiding capability provided by the GPU and the huge discrepancy in concurrent executing threads between the CPU and GPU, LLC partitioning can no longer be achieved by simply minimizing the overall cache misses as in homogeneous …
4

Mishra, Ashirbad. "Efficient betweenness Centrality Computations on Hybrid CPU-GPU Systems." Thesis, 2016. http://etd.iisc.ac.in/handle/2005/2718.

Abstract:
Analysis of networks is quite interesting, because they can be interpreted for several purposes. Various features require different metrics to measure and interpret them. Measuring the relative importance of each vertex in a network is one of the most fundamental building blocks in network analysis. Betweenness Centrality (BC) is one such metric that plays a key role in many real world applications. BC is an important graph analytics application for large-scale graphs. However it is one of the most computationally intensive kernels to execute, and measuring centrality in billion-scale graphs is …
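Betweenness centrality is usually computed with Brandes' algorithm, which hybrid CPU-GPU implementations parallelize across source vertices. A compact sequential version for unweighted graphs (the standard algorithm, not the thesis's hybrid scheme); values count ordered source-target pairs, so halve them for the usual undirected convention:

```python
from collections import deque

def betweenness(adj):
    """Brandes' betweenness centrality for an unweighted graph.
    adj: dict mapping vertex -> list of neighbours."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Phase 1: BFS from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in adj}; sigma[s] = 1
        dist = {v: -1 for v in adj}; dist[s] = 0
        preds = {v: [] for v in adj}
        order, q = [], deque([s])
        while q:
            v = q.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    q.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Phase 2: accumulate dependencies in reverse BFS order.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc
```

GPU variants run the per-source BFS and accumulation phases as data-parallel kernels, which is why the source loop is the natural partitioning axis for hybrid systems.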

Book chapters on the topic "CPU-GPU Partitioning"

1

Clarke, David, Aleksandar Ilic, Alexey Lastovetsky, and Leonel Sousa. "Hierarchical Partitioning Algorithm for Scientific Computing on Highly Heterogeneous CPU + GPU Clusters." In Euro-Par 2012 Parallel Processing. Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-32820-6_49.

2

Saba, Issa, Eishi Arima, Dai Liu, and Martin Schulz. "Orchestrated Co-scheduling, Resource Partitioning, and Power Capping on CPU-GPU Heterogeneous Systems via Machine Learning." In Architecture of Computing Systems. Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-21867-5_4.

3

Fei, Xiongwei, Kenli Li, Wangdong Yang, and Keqin Li. "CPU-GPU Computing." In Innovative Research and Applications in Next-Generation High Performance Computing. IGI Global, 2016. http://dx.doi.org/10.4018/978-1-5225-0287-6.ch007.

Abstract:
Heterogeneous and hybrid computing has been heavily studied in the field of parallel and distributed computing in recent years. It can work on a single computer, or in a group of computers connected by a high-speed network. The former is the topic of this chapter. Its key points are how to cooperatively use devices that are different in performance and architecture to satisfy various computing requirements, and how to make the whole program achieve the best performance possible when executed. CPUs and GPUs have fundamentally different design philosophies, but combining their characteristics …
4

"Topology-Aware Load-Balance Schemes for Heterogeneous Graph Processing." In Advances in Computer and Electrical Engineering. IGI Global, 2018. http://dx.doi.org/10.4018/978-1-5225-3799-1.ch005.

Abstract:
Inspired by the insights presented in Chapters 2, 3, and 4, in this chapter the authors present the KCMAX (K-Core MAX) and the KCML (K-Core Multi-Level) frameworks: novel k-core-based graph partitioning approaches that produce unbalanced partitions of complex networks that are suitable for heterogeneous parallel processing. Then they use KCMAX and KCML to explore the configuration space for accelerating BFSs on large complex networks in the context of TOTEM, a BSP heterogeneous GPU + CPU HPC platform. They study the feasibility of the heterogeneous computing approach by systematically studying …
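The k-core decomposition that KCMAX/KCML-style partitioners build on can be computed by iteratively peeling minimum-degree vertices. A generic quadratic-time sketch (not the chapter's framework):

```python
def core_numbers(adj):
    """Core number of every vertex by min-degree peeling.
    adj: dict vertex -> list of neighbours (undirected graph).
    O(n^2) for clarity; linear-time bucket variants exist."""
    deg = {v: len(adj[v]) for v in adj}
    core = {}
    remaining = set(adj)
    k = 0
    while remaining:
        v = min(remaining, key=lambda u: deg[u])  # peel a min-degree vertex
        k = max(k, deg[v])        # core number never decreases during peeling
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                deg[w] -= 1
    return core
```

Dense high-core regions can then be routed to the GPU and the sparse periphery to the CPU, which is the unbalanced-partitioning idea the chapter explores.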

Conference papers on the topic "CPU-GPU Partitioning"

1

Qiu, Jingbao, Huawei Zhai, Xiaodong Yuan, and Licheng Cui. "CPU-GPU Heterogeneous Stencil Computation Algorithm Based on Dynamic Hybrid Fragmentation Partitioning." In 2024 6th International Conference on Frontier Technologies of Information and Computer (ICFTIC). IEEE, 2024. https://doi.org/10.1109/icftic64248.2024.10913103.

2

Goodarzi, Bahareh, Martin Burtscher, and Dhrubajyoti Goswami. "Parallel Graph Partitioning on a CPU-GPU Architecture." In 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). IEEE, 2016. http://dx.doi.org/10.1109/ipdpsw.2016.16.

3

Cho, Younghyun, Florian Negele, Seohong Park, Bernhard Egger, and Thomas R. Gross. "On-the-fly workload partitioning for integrated CPU/GPU architectures." In PACT '18: International conference on Parallel Architectures and Compilation Techniques. ACM, 2018. http://dx.doi.org/10.1145/3243176.3243210.

4

Kim, Dae Hee, Rakesh Nagi, and Deming Chen. "Thanos: High-Performance CPU-GPU Based Balanced Graph Partitioning Using Cross-Decomposition." In 2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC). IEEE, 2020. http://dx.doi.org/10.1109/asp-dac47756.2020.9045588.

5

Wang, Xin, and Wei Zhang. "Cache locking vs. partitioning for real-time computing on integrated CPU-GPU processors." In 2016 IEEE 35th International Performance Computing and Communications Conference (IPCCC). IEEE, 2016. http://dx.doi.org/10.1109/pccc.2016.7820644.

6

Fang, Juan, Shijian Liu, and Xibei Zhang. "Research on Cache Partitioning and Adaptive Replacement Policy for CPU-GPU Heterogeneous Processors." In 2017 16th International Symposium on Distributed Computing and Applications to Business, Engineering and Science (DCABES). IEEE, 2017. http://dx.doi.org/10.1109/dcabes.2017.12.

7

Wachter, Eduardo Weber, Geoff V. Merrett, Bashir M. Al-Hashimi, and Amit Kumar Singh. "Reliable mapping and partitioning of performance-constrained openCL applications on CPU-GPU MPSoCs." In ESWEEK'17: THIRTEENTH EMBEDDED SYSTEM WEEK. ACM, 2017. http://dx.doi.org/10.1145/3139315.3157088.

8

Xiao, Chunhua, Wei Ran, Fangzhu Lin, and Lin Zhang. "Dynamic Fine-Grained Workload Partitioning for Irregular Applications on Discrete CPU-GPU Systems." In 2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom). IEEE, 2021. http://dx.doi.org/10.1109/ispa-bdcloud-socialcom-sustaincom52081.2021.00148.

9

Magalhães, W. F., H. M. Gomes, L. B. Marinho, G. S. Aguiar, and P. Silveira. "Investigating Mobile Edge-Cloud Trade-Offs of Object Detection with YOLO." In VII Symposium on Knowledge Discovery, Mining and Learning. Sociedade Brasileira de Computação - SBC, 2019. http://dx.doi.org/10.5753/kdmile.2019.8788.

Abstract:
With the advent of smart IoT applications empowered with AI, together with the democratization of mobile devices, moving the computation from cloud to edge is a natural trend in both academia and industry. A major challenge in this direction is enabling the deployment of Deep Neural Networks (DNNs), which usually demand lots of computational resources (i.e. memory, disk, CPU/GPU, and power), in resource limited edge devices. Among the possible strategies to tackle this challenge are: (i) running the entire DNN on the edge device (sometimes not feasible), (ii) distributing the computation between …
10

Negrut, Dan, Toby Heyn, Andrew Seidl, Dan Melanz, David Gorsich, and David Lamb. "ENABLING COMPUTATIONAL DYNAMICS IN DISTRIBUTED COMPUTING ENVIRONMENTS USING A HETEROGENEOUS COMPUTING TEMPLATE." In 2024 NDIA Michigan Chapter Ground Vehicle Systems Engineering and Technology Symposium. National Defense Industrial Association, 2024. http://dx.doi.org/10.4271/2024-01-3314.

Abstract:
This paper describes a software infrastructure made up of tools and libraries designed to assist developers in implementing computational dynamics applications running on heterogeneous and distributed computing environments. Together, these tools and libraries compose a so called Heterogeneous Computing Template (HCT). The underlying theme of the solution approach embraced by HCT is that of partitioning the domain of interest into a number of sub-domains that are each managed by a separate core/accelerator (CPU/GPU) pair. The five components at the …