Dissertations / Theses on the topic 'GPU-CPU'

Author: Grafiati

Published: 25 May 2024

Last updated: 31 July 2025

Consult the top 50 dissertations / theses for your research on the topic 'GPU-CPU.'

1

Fang, Zhuowen. "Java GPU vs CPU Hashing Performance." Thesis, Mittuniversitetet, Avdelningen för informationssystem och -teknologi, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-33994.

Abstract:

In recent years, public interest in blockchain technology has grown steadily since it was introduced in 2008, primarily because of its ability to create an immutable ledger for storing information that never will or can be changed. The chain is an expanding structure, and the act of nodes adding blocks to it is called mining, which is regulated by a consensus mechanism. In the most widely used consensus mechanism, Proof of Work, this process is based on computationally heavy guessing of block hashes. Today, several prominent ways of performing this guessing have been developed, thanks
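The hash guessing that Proof of Work relies on can be illustrated with a minimal Python sketch. It assumes SHA-256 and a difficulty expressed as a number of leading zero hex digits; the function name `mine`, the payload, and the nonce encoding are hypothetical and are not taken from the thesis.

```python
import hashlib

def mine(block_data: bytes, difficulty: int, max_nonce: int = 10_000_000):
    """Search for a nonce whose SHA-256 digest starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    for nonce in range(max_nonce):
        # Hash the block contents together with the candidate nonce.
        digest = hashlib.sha256(block_data + str(nonce).encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
    return None  # no valid nonce found within the search budget

# Illustrative payload only; a real block would carry a header and transactions.
result = mine(b"example block", difficulty=3)
```

Each guess is independent of the others, which is why this kind of search is a natural candidate for both multi-threaded CPU and GPU execution.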

2

Dollinger, Jean-François. "A framework for efficient execution on GPU and CPU+GPU systems." Thesis, Strasbourg, 2015. http://www.theses.fr/2015STRAD019/document.

Abstract:

The technological barriers that semiconductor manufacturers ran into in the early 2000s put an end to the rapid growth in performance of sequential compute units. The current trend is to multiply the number of processor cores per socket and to make increasing use of GPU cards for highly parallel computations. The complexity of recent architectures makes it difficult to estimate a program's performance statically. We describe a reliable and accurate method for predicting the execution time of parallel loop nests on GPUs, based on ...

3

Gjermundsen, Aleksander. "CPU and GPU Co-processing for Sound." Thesis, Norges teknisk-naturvitenskapelige universitet, Institutt for datateknikk og informasjonsvitenskap, 2010. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-11794.

Abstract:

When using voice communications, one of the problematic phenomena that can occur is participants hearing an echo of their own voice. Acoustic echo cancellation (AEC) is used to remove this echo, but can be computationally demanding. The recent OpenCL standard allows high-level programs to be run on multi-core CPUs as well as Graphics Processing Units (GPUs) and custom accelerators. This opens up new possibilities for offloading computations, which is especially important for real-time applications. Although many algorithms for image- and video-processing have been studied on the GPU, aud

4

CARLOS, EDUARDO TELLES. "HYBRID FRUSTUM CULLING USING CPU AND GPU." PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 2009. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=31453@1.

Abstract:

PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO

One of the oldest problems in computer graphics has been visibility determination. Several algorithms have been developed to make ever larger and more detailed models feasible. Among these algorithms, frustum culling stands out; its role is to remove objects that are not visible to the observer. This algorithm, very common in many applications, has been improved over the years in order to speed up its execution even further. Although it is treated as a well-solved problem in computer graphics, some points still ...

5

Farooqui, Naila. "Runtime specialization for heterogeneous CPU-GPU platforms." Diss., Georgia Institute of Technology, 2015. http://hdl.handle.net/1853/54915.

Abstract:

Heterogeneous parallel architectures such as those composed of CPUs and GPUs are a tantalizing compute fabric for performance-hungry developers. While these platforms enable order-of-magnitude performance increases for many data-parallel application domains, there remain several open challenges: (i) the distinct execution models inherent in the heterogeneous devices present on such platforms drive the need to dynamically match workload characteristics to the underlying resources, (ii) the complex architecture and programming models of such systems require substantial application knowledge and e

6

Smith, Michael Shawn. "Performance Analysis of Hybrid CPU/GPU Environments." PDXScholar, 2010. https://pdxscholar.library.pdx.edu/open_access_etds/300.

Abstract:

We present two metrics to assist the performance analyst to gain a unified view of application performance in a hybrid environment: GPU Computation Percentage and GPU Load Balance. We analyze the metrics using a matrix multiplication benchmark suite and a real scientific application. We also extend an experiment management system to support GPU performance data and to calculate and store our GPU Computation Percentage and GPU Load Balance metrics.

7

Wong, Henry Ting-Hei. "Architectures and limits of GPU-CPU heterogeneous systems." Thesis, University of British Columbia, 2008. http://hdl.handle.net/2429/2529.

Abstract:

As we continue to be able to put an increasing number of transistors on a single chip, the answer to the perpetual question of what the best processor we could build with the transistors is remains uncertain. Past work has shown that heterogeneous multiprocessor systems provide benefits in performance and efficiency. This thesis explores heterogeneous systems composed of a traditional sequential processor (CPU) and highly parallel graphics processors (GPU). This thesis presents a tightly-coupled heterogeneous chip multiprocessor architecture for general-purpose non-graphics computation and a

8

Gummadi, Deepthi. "Improving GPU performance by regrouping CPU-memory data." Thesis, Wichita State University, 2014. http://hdl.handle.net/10057/10959.

Abstract:

For fast and effective analysis of large complex systems, high-performance computing is essential. The NVIDIA Compute Unified Device Architecture (CUDA)-assisted central processing unit (CPU) / graphics processing unit (GPU) computing platform has proven its potential for high-performance computing. In CPU/GPU computing, the original data and instructions are copied from CPU main memory to GPU global memory. Inside the GPU, it would be beneficial to keep the data in shared memory (shared only by the threads of a block) rather than in global memory (shared by all threads). However, shared m

9

Chen, Wei. "Dynamic Workload Division in GPU-CPU Heterogeneous Systems." The Ohio State University, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=osu1364250106.

10

Ben, Romdhanne Bilel. "Simulation des réseaux à grande échelle sur les architectures de calculs hétérogènes." Thesis, Paris, ENST, 2013. http://www.theses.fr/2013ENST0088/document.

Abstract:

Simulation is an essential step in the evolution of networked systems. The scalability and efficiency of simulation tools are key to the objectivity of the results obtained, given the growing complexity of new wireless networks. Discrete-event simulation is perfectly suited to scaling up; however, existing software architectures do not take advantage of recent advances in computer hardware such as parallel processors and graphics coprocessors. In this context, the objective of this thesis is to propose ...

11

Sundberg, Andreas. "Skapa digitalt fingeravtryck med hjälp av CPU och GPU." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-12851.

Abstract:

Digital fingerprinting is a technique used to create targeted advertising and to prevent fraud. There are many fingerprinting techniques, such as using cookies, IP addresses, and JavaScript. Many of these techniques are easy to evade, for example by turning off cookies or changing IP address, which makes it harder to detect the user. This work investigates whether it is possible to identify computers by measuring how long it takes a computer to execute a few scripts on the CPU and the GPU. To answer the question, six different scripts were created ...

12

Krishnasamy, Ezhilmathi. "Hybrid CPU-GPU Parallel Simulations of 3D Front Propagation." Thesis, Linköpings universitet, Hållfasthetslära, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-114935.

Abstract:

This master's thesis studies GPU-enabled parallel implementations of the 3D Parallel Marching Method (PMM). 3D PMM is aimed at solving the non-linear static Jacobi-Hamilton equations, which have real-world applications such as the study of geological foldings, where each layer of the Earth's crust is considered as a front propagating over time. Using parallel computer architectures, fast simulations can be achieved, leading to less time consumption and a quicker understanding of the inner Earth, and enabling early exploration of oil and gas reserves. Currently 3D PMM is implemented in shared memor

13

Lindqvist, Sebastian. "Performance Evaluation of Boids on the GPU and CPU." Thesis, Blekinge Tekniska Högskola, Institutionen för kreativa teknologier, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-15970.

Abstract:

Context. Agent-based models are used to simulate complex systems by using multiple agents that follow a set of rules. One such model is the boid model, which is used to simulate movements of synchronized groups of animals. Executing agent-based models partially or fully on the GPU has previously been shown to increase performance, opening up the possibility for larger simulations. However, few articles have previously compared a full GPU implementation of the boid model with a multi-threaded CPU implementation. Objectives. The objectives of this thesis are to find how parallel execution of boid mode

14

Venkatasubramanian, Sundaresan. "Tuned and asynchronous stencil kernels for CPU/GPU systems." Thesis, Atlanta, Ga. : Georgia Institute of Technology, 2009. http://hdl.handle.net/1853/29728.

Abstract:

Thesis (M. S.)--Computing, Georgia Institute of Technology, 2009.
Committee Chair: Vuduc, Richard; Committee Member: Kim, Hyesoon; Committee Member: Vetter, Jeffrey. Part of the SMARTech Electronic Thesis and Dissertation Collection.

15

Lind, Eric, and Velasquez Ävelin Pantigoso. "A performance comparison between CPU and GPU in TensorFlow." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-260240.

Abstract:

The fast-growing field of Machine Learning has in recent years become more common, as it has gone from a restricted research area to being in general use. Frameworks such as TensorFlow have been developed to scale and analyze artificial neural networks, which are used in the area of Machine Learning called Deep Learning. This paper studies how well the TensorFlow framework performs with regard to time and memory allocation on the CPU and GPU, since these are often the constraining resources. Three neural networks have been used to measure ho

16

Lagerhult, Christopher. "Smartphone CPU : An Energy efficient alternative to the GPU." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-397426.

17

Ospici, Matthieu. "Modèles de programmation et d'exécution pour les architectures parallèles et hybrides. Applications à des codes de simulation pour la physique." Phd thesis, Université de Grenoble, 2013. http://tel.archives-ouvertes.fr/tel-00934266.

Abstract:

In this thesis we are interested in large hybrid parallel architectures, that is, parallel architectures that combine general-purpose processors (Intel Xeon, for example) and accelerator processors (Nvidia GPUs). Efficient exploitation of these hybrid clusters for high-performance computing is at the heart of our work. The heterogeneity of the computing resources within hybrid clusters raises many issues when one wants to exploit them efficiently with large existing scientific applications. Two main issues ...

18

Norgren, David. "Implementing and Evaluating CPU/GPU Real-Time Ray Tracing Solutions." Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-32076.

Abstract:

Ray tracing is a popular algorithm used to simulate the behavior of light and is commonly used to render images with high levels of visual realism. Modern multicore CPUs and many-core GPUs can take advantage of the parallel nature of ray tracing to accelerate the rendering process and produce new images in real-time. For non-specialized hardware however, such implementations are often limited to low screen resolutions, simple scene geometry and basic graphical effects. In this work, a C++ framework was created to investigate how the ray tracing algorithm can be implemented and accelerated on t

19

Sandgren, Julius. "Transfer Time Reduction of Data Transfers between CPU and GPU." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-205272.

Abstract:

In real-time video processing, data transfer between CPU and GPU is a time-critical action; time spent transferring data is processing time lost. Several variants of standard transfer methods were developed and evaluated on nine computers, and two smart decision algorithms were designed to help choose the fastest method for each occasion. Results showed that the standard transfer methods can be beaten; by using the designed decision algorithms, transfer times between CPU and GPU (both ways) can be reduced by a factor of 7 compared to always using the standard methods.
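The idea of choosing the fastest transfer method for each occasion can be sketched as a simple measure-and-select routine. This is only an illustration under stated assumptions, not the decision algorithms designed in the thesis: `pick_fastest`, `direct_copy`, and `chunked_copy` are hypothetical names, and the stand-in methods copy host memory only instead of performing real CPU-GPU transfers.

```python
import time
from typing import Callable, Dict

def pick_fastest(methods: Dict[str, Callable[[bytes], object]],
                 payload: bytes, repeats: int = 5) -> str:
    """Time each candidate method on `payload` and return the name of the fastest."""
    best_name, best_time = "", float("inf")
    for name, transfer in methods.items():
        start = time.perf_counter()
        for _ in range(repeats):
            transfer(payload)
        elapsed = (time.perf_counter() - start) / repeats  # mean time per call
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    return best_name

# Hypothetical stand-ins; real candidates would wrap actual host-to-device copies.
def direct_copy(data: bytes) -> bytearray:
    return bytearray(data)  # one bulk copy

def chunked_copy(data: bytes, chunk: int = 4096) -> bytes:
    # Copy in fixed-size pieces and reassemble.
    return b"".join(data[i:i + chunk] for i in range(0, len(data), chunk))

candidates = {"direct": direct_copy, "chunked": chunked_copy}
chosen = pick_fastest(candidates, bytes(1 << 20))
```

In practice the measurement would be repeated per payload size and per machine, since the abstract reports that the best method differs between occasions.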

20

Erik, Liljeqvist. "Evaluating a CPU/GPU Implementation for Real-Time Ray Tracing." Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-35768.

21

Svantesson, David, and Martin Eklund. "A naive implementation of Topological Sort on GPU : A comparative study between CPU and GPU performance." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-186417.

Abstract:

Topological sorting is a graph problem encountered in various areas of computer science. Many graph problems have benefited from execution on a GPU rather than a CPU due to the GPU's capability for parallelism. The purpose of this report is to determine if topological sorting may benefit from a naive implementation on the GPU compared to the CPU. This is accomplished by constructing a parallel implementation using the CUDA platform by NVIDIA for GPGPU programming. The runtime of this implementation running on several different graphs is compared to a sequential implementation in C run
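For readers unfamiliar with the problem, a sequential topological sort can be sketched in a few lines of Python. The sketch below uses Kahn's algorithm, which is an assumption chosen for illustration; the abstract does not say which sequential algorithm the report uses, and the function name and example graph are hypothetical.

```python
from collections import deque

def topological_sort(num_nodes, edges):
    """Return one topological order of a DAG given as (src, dst) edge pairs."""
    adjacency = [[] for _ in range(num_nodes)]
    in_degree = [0] * num_nodes
    for src, dst in edges:
        adjacency[src].append(dst)
        in_degree[dst] += 1

    # Nodes with no incoming edges can be emitted immediately.
    ready = deque(n for n in range(num_nodes) if in_degree[n] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in adjacency[node]:
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                ready.append(nxt)

    if len(order) != num_nodes:
        raise ValueError("graph contains a cycle")
    return order

order = topological_sort(4, [(0, 1), (0, 2), (1, 3), (2, 3)])
```

All nodes that are in the ready queue at the same time are independent of each other, which is the property a parallel implementation can exploit.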

22

Zhang, Junchi. "GPU computing of Heat Equations." Digital WPI, 2015. https://digitalcommons.wpi.edu/etd-theses/515.

Abstract:

There is an increasing amount of evidence in scientific research and industrial engineering indicating that the graphics processing unit (GPU) has a higher efficiency and a stronger ability than the CPU to process certain computations. The heat equation is one of the most well-known partial differential equations, with well-developed theory and applications in engineering. Thus, we chose in this report to use the heat equation to numerically solve for the heat distributions at different time points using both GPU and CPU programs. The heat equation with three different boundary conditions (Dirich

23

Vekterli, Tor Brede. "Parallelization of Artificial Spiking Neural Networks on the CPU and GPU." Thesis, Norwegian University of Science and Technology, Department of Computer and Information Science, 2009. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-9838.

Abstract:

Conventional artificial neural networks have traditionally faced inherent problems with efficient parallelization of neuron processing. Recent research has shown how artificial spiking neural networks can, with the introduction of biologically plausible synaptic conduction delays, be fully parallelized regardless of their network topology. This, in conjunction with the influx of fast, massively parallel desktop-level computing hardware, leaves the field of efficient, large-scale spiking neural network simulations potentially open to even those with no access to supercomputers or large comput

24

Enmyren, Johan. "A Skeleton Programming Library for Multicore CPU and Multi-GPU Systems." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-60319.

Abstract:

This report presents SkePU, a C++ template library which provides a simple and unified interface for specifying data-parallel computations with the help of skeletons on GPUs using CUDA and OpenCL. The interface is also general enough to support other architectures, and SkePU implements both a sequential CPU and a parallel OpenMP back end. It also supports multi-GPU systems. Benchmarks show that copying data between the host and the GPU is often a bottleneck. Therefore a container which uses lazy memory copying has been implemented to avoid unnecessary memory transfers. SkePU was evaluated with

25

Berthou, Gautier. "Implementation of an object-detection algorithm on a CPU+GPU target." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-206178.

Abstract:

Systems like autonomous vehicles may require real time embedded image processing under hardware constraints. This paper provides directions to design time and resource efficient Haar cascade detection algorithms. It also reviews some software architecture and hardware aspects. The considered algorithms were meant to be run on platforms equipped with a CPU and a GPU under power consumption limitations. The main aim of the project was to design and develop real time underwater object detection algorithms. However the concepts that are presented in this paper are generic and can be applied to oth

26

Concha, Ramírez Francisca Andrea. "FADRA: A CPU-GPU framework for astronomical data reduction and Analysis." Tesis, Universidad de Chile, 2016. http://repositorio.uchile.cl/handle/2250/140769.

Abstract:

Master of Science, specialization in Computer Science

This thesis lays the foundations of FADRA: Framework for Astronomical Data Reduction and Analysis. The FADRA framework was designed to be efficient, simple to use, modular, extensible, and open source. Nowadays astronomy is inseparable from computing, but some of the most widely used software today was developed three decades ago and is not designed to cope with current big data paradigms. The world of astronomical software must evolve not only toward practices that understand and embrace the big data era, but ...

27

Öhberg, Tomas. "Auto-tuning Hybrid CPU-GPU Execution of Algorithmic Skeletons in SkePU." Thesis, Linköpings universitet, Programvara och system, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-149605.

Abstract:

The trend in computer architectures has for several years been heterogeneous systems consisting of a regular CPU and at least one additional, specialized processing unit, such as a GPU. The different characteristics of the processing units and the requirement of multiple tools and programming languages make programming of such systems a challenging task. Although there exist tools for programming each processing unit, utilizing the full potential of a heterogeneous computer still requires specialized implementations involving multiple frameworks and hand-tuning of parameters. To fully exploit t

28

Vivanloc, Vincent. "Rendu distribué sur grappe de CPU/GPU et effets d'éclairage global." Toulouse 3, 2008. http://thesesups.ups-tlse.fr/823/.

Abstract:

Virtual prototyping and project review support require realistic rendering in real time. This leads to two lines of research: on the one hand, real-time rendering of indirect lighting effects, and on the other hand, distributed real-time rendering at high resolution. Simulating global illumination effects improves the quality of a synthetic image produced by rasterization. We focused on indirect lighting and specular reflections. For low-frequency lighting, the rendering of indirect lighting can be updated in real time. For a wider range of ...

29

Trichy, Ravi Vignesh. "Runtime Systems and Scheduling Support for High-End CPU-GPU Architectures." The Ohio State University, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=osu1338324367.

30

He, Guanlin. "Parallel algorithms for clustering large datasets on CPU-GPU heterogeneous architectures." Electronic Thesis or Diss., université Paris-Saclay, 2022. http://www.theses.fr/2022UPASG062.

Abstract:

Clustering, which consists of forming natural groupings of data, is a fundamental and difficult task in machine learning and data mining. Many clustering methods have been proposed in the past, among which k-means clustering is commonly used because of its simplicity and speed. Spectral clustering is a more recent approach that generally achieves better clustering quality than k-means. However, classical spectral clustering algorithms suffer from a lack ...

31

Kankatala, Sriram. "Performance Analysis of kNN on large datasets using CUDA & Pthreads : Comparing between CPU & GPU." Thesis, Blekinge Tekniska Högskola, Institutionen för kommunikationssystem, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-10830.

Abstract:

Several organizations have large databases that are growing rapidly day by day and need to be maintained regularly. Content-based searches are similarity searches based on certain features obtained from various multimedia data. For applications such as multimedia content retrieval, data mining, and pattern recognition, performing nearest neighbor search on multidimensional data is a challenging task. The important factors in k-nearest-neighbor (kNN) search are search speed and accuracy. Implementation of kNN on the GPU has been an ongoing research topic for the last few years, focusi

32

Topcu, Tumer. "Data Parallelism For Ray Casting Large Scenes On A Cpu-gpu Cluster." Master's thesis, METU, 2008. http://etd.lib.metu.edu.tr/upload/12609494/index.pdf.

Abstract:

In the last decade, the computational power, memory bandwidth and programmability of graphics processing units (GPUs) have rapidly evolved. Therefore, much research has been performed on using GPUs in advanced graphics rendering. Because of its high degree of parallelism, ray tracing has been one of the first algorithms studied on GPUs. However, the rendering of large scenes with ray tracing can easily exceed the GPU's memory capacity. The algorithm proposed in this work uses a data parallel approach where the scene is partitioned and assigned to CPU-GPU couples in a cluster to

33

Sharma, Vishist. "Sparse-Matrix support for the SkePU library for portable CPU/GPU programming." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-129687.

Abstract:

In this thesis work we have extended the SkePU framework by designing a new container data structure for the representation of generic two-dimensional sparse matrices. Computation on matrices is an integral part of many scientific and engineering problems. Sometimes it is unnecessary to perform costly operations on zero entries of the matrix. If the number of zeroes is relatively large, then a requirement for a more efficient data structure arises. Beyond the sparse matrix representation, we propose an algorithm to judge the condition where computation on sparse matrices is more beneficial in ter

34

Ferenczi, Daniel. "Användning av Dynamisk Arbetslastbalansering mellan CPU och GPU för att Simulera Rök." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-11025.

35

Pinto, Vinícius Garcia. "Escalonamento por roubo de tarefas em sistemas Multi-CPU e Multi-GPU." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2013. http://hdl.handle.net/10183/71270.

Abstract:

In recent years, one of the alternatives adopted to increase the performance of high-performance computing systems has been the use of hybrid architectures. These architectures consist of multicore processors and specialized coprocessors, such as GPUs. These coprocessors act as accelerators for some types of operations. On the other hand, current parallel programming tools and models are not suited to hybrid scenarios, producing applications with poor portability. Task parallelism, considered a generic and high-level programming paradigm, can be ...

36

Mestre, Nuno Roberto Pereira. "Comparação do desempenho do FDTD com implementação em CPU e em GPU." Master's thesis, Universidade de Aveiro, 2012. http://hdl.handle.net/10773/10939.

Abstract:

Master's degree in Computer and Telematics Engineering

Finite-Difference Time-Domain (FDTD) is a method used in computational electromagnetics to simulate the propagation of electromagnetic waves in media whose characteristics may not be uniform. It is a method with countless applications, and as such it is advantageous for its performance to be increased, preferably by resorting to low-cost computing systems. The purpose of this dissertation is to take advantage of two emerging and relatively low-cost technologies to increase the performance of FDTD in one and two dimensions. These technologies ...

37

Ansaloni, Pietro. "Analisi di immagini con trasformata Ranklet: ottimizzazioni computazionali su CPU e GPU." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2013. http://amslaurea.unibo.it/5037/.

38

Wen, Hao. "IMPROVING PERFORMANCE AND ENERGY EFFICIENCY FOR THE INTEGRATED CPU-GPU HETEROGENEOUS SYSTEMS." VCU Scholars Compass, 2018. https://scholarscompass.vcu.edu/etd/5664.

Abstract:

Current heterogeneous CPU-GPU architectures integrate general-purpose CPUs and highly thread-level parallelized GPUs (Graphics Processing Units) on the same die. This dissertation focuses on improving the energy efficiency and performance of heterogeneous CPU-GPU systems. Leakage energy has become an increasingly large fraction of total energy consumption, making it important to reduce leakage energy to improve overall energy efficiency. Caches occupy a large on-chip area, which makes them good targets for leakage energy reduction. For the CPU cache, we study how to reduce the cache leakag

39

Giuntoli, Guido. "Hybrid CPU/GPU implementation for the FE2 multi-scale method for composite problems." Doctoral thesis, Universitat Politècnica de Catalunya, 2020. http://hdl.handle.net/10803/668824.

Abstract:

This thesis aims to develop a High-Performance Computing implementation to solve large composite materials problems through the use of the FE2 multi-scale method. Previous works have not been able to scale the FE2 strategy to real size problems with mesh resolutions of more than 10K elements at the macro-scale and 100^3 elements at the micro-scale. The latter is due to the computational requirements needed to carry out these calculations. This work identifies the most computationally intensive parts of the FE2 algorithm and ports several parts of the micro-scale computations to GPUs. The cas

40

Barrientos, Rojel Ricardo Javier. "Búsqueda por Similitud en Espacios Métricos Sobre Plataformas Multi-Core (CPU y GPU)." Tesis, Universidad de Chile, 2011. http://www.repositorio.uchile.cl/handle/2250/102738.

41

Sajjapongse, Kittisak. "Hierarchical scheduling and uniform access programming frameworks for heterogeneous CPU-GPU computing clusters." Thesis, University of Missouri - Columbia, 2016. http://pqdtopen.proquest.com/#viewpdf?dispub=10178997.

Full text

Abstract:

<p> The advance of the GPU hardware architecture has made GPUs attractive devices for general-purpose computing. Modern GPUs are equipped with an increasing number of cores, a flexible memory hierarchy, and a large memory capacity. While the computational power of modern GPU devices has allowed their introduction in high-performance computing (HPC) clusters and the efficient processing of ever larger workloads, existing software components for HPC clusters still offer basic support for hardware heterogeneity and often cause performance limitations in the presence of GPU devices. In particular,

42

Wahlberg, Björn. "Att procedurellt generera ett 2D landskap parallellt på GPU vs seriellt på CPU." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-18759.

Abstract:

Procedurally generated content (PCG) appears very often in games nowadays, largely to increase a game's replayability. Some popular examples of games that use PCG are Terraria (2011) and Minecraft (2011). As hardware becomes more and more powerful, the demands on games that use these techniques also increase, since content can be generated in real time. But is there untapped potential in the graphics card? The trend of increasing processor clock frequencies has slowed recently, replaced instead by a larger number of cores. Here, parallelization of pr

43

Fauzia, Naznin. "Characterization of Data Locality Potential of CPU and GPU Applications through Dynamic Analysis." The Ohio State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=osu1420759839.

44

Van Winkle, Scott E. "Dynamic Bandwidth and Laser Scaling for CPU-GPU Heterogenous Network-on-Chip Architectures." Ohio University / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1500992706350957.

45

Xue, Weicheng. "CPU/GPU Code Acceleration on Heterogeneous Systems and Code Verification for CFD Applications." Diss., Virginia Tech, 2021. http://hdl.handle.net/10919/102073.

Abstract:

Computational Fluid Dynamics (CFD) applications usually involve intensive computations, which can be accelerated through using open accelerators, especially GPUs due to their common use in the scientific computing community. In addition to code acceleration, it is important to ensure that the code and algorithm are implemented numerically correctly, which is called code verification. This dissertation focuses on accelerating research CFD codes on multi-CPUs/GPUs using MPI and OpenACC, as well as the code verification for turbulence model implementation using the method of manufactured solution

46

Said, Issam. "Apports des architectures hybrides à l'imagerie profondeur : étude comparative entre CPU, APU et GPU." Thesis, Paris 6, 2015. http://www.theses.fr/2015PA066531/document.

Abstract:

Oil companies rely on HPC to accelerate depth imaging algorithms. CPU clusters and hardware accelerators are widely adopted by the industry. Graphics processing units (GPUs), with high compute power and high memory bandwidth, have attracted keen interest. However, deploying applications such as Reverse Time Migration (RTM) on these architectures has some limitations, notably a reduced memory capacity and frequent communications between the CPU and the GPU that present a possible bottleneck

47

Said, Issam. "Apports des architectures hybrides à l'imagerie profondeur : étude comparative entre CPU, APU et GPU." Electronic Thesis or Diss., Paris 6, 2015. http://www.theses.fr/2015PA066531.

48

Sjölander, Erik. "Krypteringsalgoritmer i OpenCL : AES-256 och ECC ElGamal." Thesis, Linköpings universitet, Institutionen för systemteknik, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-81660.

Abstract:

In recent years, graphics cards have undergone a transformation from rendering units into devices capable of general-purpose computation, much like an ordinary processor. With languages such as OpenCL, graphics cards become powerful units that can be used efficiently for large computations. The goal of this thesis project was to show encryption algorithms that are well suited to acceleration with OpenCL on graphics cards. A further goal was to show that a program does not need extensive rewriting to work in OpenCL. Two encryption algorithms were ported to run on graphics cards. The first algorithm, AES-256, was test

49

Löfgren, Robin, and Kristoffer Dahl. "Beräkningar med GPU vs CPU : En jämförelsestudie av beräkningseffektivitet med avseende på energi- och tidsförbrukning." Thesis, Linnaeus University, School of Computer Science, Physics and Mathematics, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-5782.

Abstract:

This thesis is a comparative study of computational efficiency, in terms of energy and time consumption, between graphics cards and processors in personal computers and the PlayStation 3. The problem is studied to make the public aware that part of the energy problem associated with computation can be addressed by increasing the energy efficiency of the computing units. The study was carried out in an exploratory manner and examines the relationship between processors and graphics cards and which performs best in which context. Performance tests are carried out with the molecular computation program

50

Chavez, Daniel. "Parallelizing Map Projection of Raster Data on Multi-core CPU and GPU Parallel Programming Frameworks." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-190883.

Abstract:

Map projections lie at the core of geographic information systems and numerous projections are used today. The reprojection between different map projections is recurring in a geographic information system and it can be parallelized with multi-core CPUs and GPUs. This thesis implements a parallel analytic reprojection algorithm of raster data in C/C++ with the parallel programming frameworks Pthreads, C++11 STL threads, OpenMP, Intel TBB, CUDA and OpenCL. The thesis compares the execution times from the different implementations on small, medium and large raster data sets, where OpenMP had the
