Dissertations / Theses on the topic 'Heterogeneous computing'
Consult the top 50 dissertations / theses for your research on the topic 'Heterogeneous computing.'
Lu, Howard J. (Howard Jason). "Heterogeneous multithreaded computing." Thesis, Massachusetts Institute of Technology, 1995. http://hdl.handle.net/1721.1/36584.
Jackson, Robert Owen. "Heterogeneous parallel computing." Thesis, University of Birmingham, 1999. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.366162.
Fagg, Graham Edward. "Enabling technologies for parallel heterogeneous computing." Thesis, University of Reading, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.266150.
Scogland, Thomas R. "Runtime Adaptation for Autonomic Heterogeneous Computing." Diss., Virginia Tech, 2014. http://hdl.handle.net/10919/71315.
Ph. D.
Shum, Kam Hong. "Adaptive parallelism for computing on heterogeneous clusters." Thesis, University of Cambridge, 1996. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.627563.
Lee, Jaekyu. "Shared resource management for efficient heterogeneous computing." Diss., Georgia Institute of Technology, 2013. http://hdl.handle.net/1853/50217.
Herdman, Andy. "The readying of applications for heterogeneous computing." Thesis, University of Warwick, 2017. http://wrap.warwick.ac.uk/102343/.
Sarjanoja, S. (Sampsa). "BM3D image denoising using heterogeneous computing platforms." Master's thesis, University of Oulu, 2015. http://urn.fi/URN:NBN:fi:oulu-201504141380.
Noise removal is one of the central problems in digital image processing, and it is usually addressed at an early stage of the signal-processing pipeline. Noise enters images in many different ways, and its presence is unavoidable. Many image-processing algorithms work better when their input is already as error-free as possible. To keep image-processing latency low across different computing platforms, it is important that denoising also runs quickly. Driven by the entertainment industry, the computing power of graphics cards has multiplied. Today's graphics processors consist of several hundred or even thousands of compute units. These compute units can be used for general-purpose computing through the OpenCL and CUDA programming interfaces. Parallel computation on many compute units enables large performance gains in applications where the data being processed is independent or only loosely dependent. The use of graphics processors for general-purpose computing is also becoming more common in mobile devices. In addition, photography today is most popular precisely on mobile devices. This Master's thesis investigates how the state-of-the-art denoising technique, block-matching and three-dimensional filtering (BM3D), can be implemented on heterogeneous computing platforms. The performance of the presented implementations is evaluated through comparisons with existing implementations. The presented implementations gain significantly from parallel computing. The comparisons also illustrate common pitfalls in using GPU computing for complex image-processing algorithms.
Elteir, Marwa Khamis. "A MapReduce Framework for Heterogeneous Computing Architectures." Diss., Virginia Tech, 2012. http://hdl.handle.net/10919/28786.
Ph. D.
Lee, Young Choon. "Problem-centric scheduling for heterogeneous computing systems." Thesis, The University of Sydney, 2007. http://hdl.handle.net/2123/9321.
Sai, Ranga Prashanth C. "Algorithms for task scheduling in heterogeneous computing environments." Auburn, Ala., 2006. http://repo.lib.auburn.edu/2006%20Fall/SAI_RANGA_58.pdf.
Chen, Keping. "Self-Organised computing in a dynamic heterogeneous environment." Thesis, University of Manchester, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.492715.
Rodrigues, Gabriel Siqueira. "Autonomic goal-driven deployment in heterogeneous computing environments." Repositório Institucional da UnB, 2016. http://repositorio.unb.br/handle/10482/23185.
We see a growing interest in computing applications that rely on heterogeneous computing environments, such as the Internet of Things (IoT). Such applications are intended to execute on a broad range of devices with different available computing resources. To handle limited heterogeneity, such as two possible types of graphics processor in a desktop computer, we can use simple approaches, for example a deployment-time script that chooses the right software library to copy to a folder. These simple approaches are centralized and created at design time. They require one specialist or team to control the entire space of variability. However, such approaches do not scale to highly heterogeneous environments. In highly dynamic and heterogeneous environments it is hard to predict the computing environment at design time, which likely makes the correct configuration for each environment undecidable at design time. In our work, we propose GoalD: a method that allows autonomous deployment of systems by reflecting on the goals of the system and its computing environment. By autonomous deployment, we mean that the system can find the correct set of components for the target computing environment without human intervention. We evaluate our approach on the filling station advisor case study, in which an application advises a driver where to refuel or recharge their vehicle. We design the application with variability at the requirements, architecture, and deployment levels, which allows the designed application to be executed on different devices. For scenarios with different environments, it was possible to plan the deployment autonomously. Additionally, the scalability of the algorithm that plans the deployment was evaluated in a simulated environment. Results show that, using the approach, it is possible to autonomously plan the deployment of a system with thousands of components in a few seconds.
Aji, Ashwin M. "Programming High-Performance Clusters with Heterogeneous Computing Devices." Diss., Virginia Tech, 2015. http://hdl.handle.net/10919/52366.
Ph. D.
Bijanapalli, Chakri Ramakrishna. "Enabling the use of Heterogeneous Computing for Bioinformatics." Thesis, Virginia Tech, 2013. http://hdl.handle.net/10919/23866.
Master of Science
Winkleblack, Scott Kenneth. "ReGen: Optimizing Genetic Selection Algorithms for Heterogeneous Computing." DigitalCommons@CalPoly, 2014. https://digitalcommons.calpoly.edu/theses/1236.
Fanfarillo, Alessandro. "Parallel programming techniques for heterogeneous exascale computing platforms." Doctoral thesis, Università degli Studi di Roma "Tor Vergata", 2014. http://hdl.handle.net/2108/202339.
Chiesi, Matteo. "Heterogeneous Multi-core Architectures for High Performance Computing." Doctoral thesis, Alma Mater Studiorum - Università di Bologna, 2014. http://amsdottorato.unibo.it/6469/1/strutt.pdf.
Chiesi, Matteo. "Heterogeneous Multi-core Architectures for High Performance Computing." Doctoral thesis, Alma Mater Studiorum - Università di Bologna, 2014. http://amsdottorato.unibo.it/6469/.
Vasanta, Harikrishna. "Secure, privacy assured mechanisms for heterogeneous contextual environments." Thesis, Queensland University of Technology, 2006. https://eprints.qut.edu.au/16177/1/Harikrishna_Vasanta_Thesis.pdf.
Vasanta, Harikrishna. "Secure, privacy assured mechanisms for heterogeneous contextual environments." Queensland University of Technology, 2006. http://eprints.qut.edu.au/16177/.
Grewe, Dominik. "Mapping parallel programs to heterogeneous multi-core systems." Thesis, University of Edinburgh, 2014. http://hdl.handle.net/1842/8852.
Gelado, Fernández Isaac. "On the programmability of heterogeneous massively-parallel computing systems." Doctoral thesis, Universitat Politècnica de Catalunya, 2010. http://hdl.handle.net/10803/6031.
This dissertation aims to increase the programmability of CPU-accelerator systems without introducing major performance penalties. The key insight is that general-purpose application programmers tend to favor programmability at the cost of system performance. This is illustrated by the tendency to use high-level programming languages, such as C++, to ease the task of programming at the cost of minor performance penalties. Moreover, many general-purpose applications are currently being developed using interpreted languages, such as Java, C# or Python, which raise the abstraction level even further while introducing relatively large performance overheads. This dissertation likewise raises the level of abstraction for accelerators to improve programmability, and investigates hardware and software mechanisms to implement these high-level abstractions efficiently, without introducing major performance overheads.
Heterogeneous parallel systems typically implement separate memories for CPUs and accelerators, although commodity systems might use a shared memory at the cost of lower performance. However, in these commodity shared memory systems, coherence between accelerator and CPUs is not guaranteed. This system architecture implies that CPUs can only access system memory, and accelerators can only access their own local memory. This dissertation assumes separate system and accelerator memory and shows that low-level abstractions for these disjoint address spaces are the source of poor programmability of heterogeneous parallel systems.
A first consequence of having separate system and accelerator memories is the set of data transfer models currently used for heterogeneous parallel systems. In this dissertation two data transfer paradigms are identified: per-call and double-buffered. In both models, data structures used by accelerators are allocated in both system and accelerator memories. The models differ in how data is moved between accelerator and system memories. The per-call model transfers the input data needed by accelerators before accelerator calls, and transfers back the output data produced by accelerators on accelerator call return. The per-call model is quite simple, but might impose unacceptable performance penalties due to data transfer overheads. The double-buffered model aims to overlap data communication with CPU and accelerator computation. This model requires relatively complex code because of parallel execution and the need for synchronization between data communication and processing tasks. The extra code required for data transfers in these two models is necessary due to the lack of by-reference parameter passing to accelerators. This dissertation presents a novel accelerator-hosted data transfer model. In this model, data used by accelerators is hosted in the accelerator memory, so when the CPU accesses this data, it is effectively accessing the accelerator memory. Such a model cleanly supports by-reference parameter passing to accelerator calls, removing the need for explicit data transfers.
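The difference between the per-call and double-buffered models can be illustrated with a rough timing model. The sketch below is not taken from the dissertation: the function names, the per-chunk stage times, and the assumption that copy-in, compute, and copy-out can each proceed concurrently on different chunks are all hypothetical simplifications.

```python
def per_call_time(n_chunks, t_in, t_compute, t_out):
    """Per-call model: each accelerator call copies its input, computes,
    and copies its output back, strictly one step after another."""
    return n_chunks * (t_in + t_compute + t_out)


def double_buffered_time(n_chunks, t_in, t_compute, t_out):
    """Idealized overlapped model (an assumption, not a measurement):
    once the pipeline is full, a chunk completes every max(stage) time
    units, because transfers and computation run concurrently."""
    if n_chunks == 0:
        return 0
    fill = t_in + t_compute + t_out           # first chunk sees no overlap
    steady = (n_chunks - 1) * max(t_in, t_compute, t_out)
    return fill + steady


if __name__ == "__main__":
    # Hypothetical stage times, in arbitrary units, for 10 chunks of data.
    print(per_call_time(10, 2, 5, 1))
    print(double_buffered_time(10, 2, 5, 1))
```

Under these assumptions the overlapped schedule is bounded by its slowest stage, which is why it pays off most when transfer and compute times are comparable. The synchronization code that the abstract describes as the cost of this model is exactly what the timing formula hides.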
The second consequence of separate system and accelerator memories is that current programming models export separate virtual system and accelerator address spaces to application programmers. This dissertation identifies the double-pointer problem as a direct consequence of these separate virtual memory spaces. The double-pointer problem is that data structures used by both accelerators and CPUs are referenced by different virtual memory addresses (pointers) in the CPU and accelerator code. The double-pointer problem requires programmers to add extra code to ensure that both pointers contain consistent values (e.g., when reallocating a data structure). Keeping system and accelerator pointers consistent might penalize accelerator performance and increase the accelerator memory requirements when pointers are embedded within data structures (e.g., a linked list). For instance, the double-pointer problem doubles the number of global memory accesses in GPU code that reconstructs a linked list. This dissertation argues that a unified virtual address space that includes both system and accelerator memories is an efficient solution to the double-pointer problem. Moreover, such a unified virtual address space cleanly complements the accelerator-hosted data model previously discussed.
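The double-pointer problem can be illustrated with a toy model in which each address space is a dictionary from addresses to nodes. The sketch below is a hypothetical illustration, not the dissertation's code: `copy_list_to_device` and the integer "addresses" are invented for the example. It shows why a linked list copied into a separate address space needs every embedded pointer rewritten.

```python
def copy_list_to_device(host_mem, head, device_base):
    """Copy a linked list between two disjoint toy address spaces.

    host_mem maps a host address to (value, next_host_address_or_None).
    Returns (device_mem, device_head) with every embedded pointer
    translated to its device address.
    """
    # Pass 1: walk the list and choose a device address for each node.
    mapping = {}
    addr, next_free = head, device_base
    while addr is not None:
        mapping[addr] = next_free
        next_free += 1
        addr = host_mem[addr][1]

    # Pass 2: copy each node, rewriting its host pointer into a device one.
    device_mem = {}
    for host_addr, dev_addr in mapping.items():
        value, nxt = host_mem[host_addr]
        device_mem[dev_addr] = (value, None if nxt is None else mapping[nxt])
    return device_mem, mapping.get(head)


if __name__ == "__main__":
    host = {100: ("a", 104), 104: ("b", 108), 108: ("c", None)}
    print(copy_list_to_device(host, 100, 0))
```

With a unified virtual address space the same addresses would be valid on both sides, so the translation table and the second pass would not be needed.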
This dissertation introduces the Non-Uniform Accelerator Memory Access (NUAMA) architecture as a hardware implementation of the accelerator-hosted data transfer model and the unified virtual address space. In NUAMA, an Accelerator Memory Collector (AMC) is included within the system memory controller to identify memory requests for accelerator-hosted data. The AMC buffers and coalesces such memory requests to transfer data efficiently from the CPU to the accelerator memory. NUAMA also implements a hybrid L2 cache memory. The L2 cache in NUAMA follows a write-through/write-non-allocate policy for accelerator-hosted data. This policy ensures that the contents of the accelerator memory are updated eagerly and, therefore, when the accelerator is called, most of the data has already been transferred. The eager update of the accelerator memory contents effectively overlaps data communication and CPU computation. A write-back/write-allocate policy is used for the data hosted by the system memory, so the performance of applications that do not use accelerators is not affected. In NUAMA, accelerator-hosted data is identified using a TLB-assisted mechanism. The page table entries are extended with a bit, which is set for those memory pages that are hosted by the accelerator memory. NUAMA increases the average bandwidth requirements for the L2 cache memory and the interconnection network between the CPU and accelerators, but the instantaneous bandwidth requirements, which are the limiting factor, are lower than in traditional DMA-based architectures. The NUAMA architecture is compared to traditional DMA systems using cycle-accurate simulations. Experimental results show that NUAMA and traditional DMA-based architectures perform equally well. However, the application source code complexity of NUAMA is much lower than in DMA-based architectures.
A software implementation of the accelerator-hosted model and the unified virtual address space is also explored. This dissertation presents the Asymmetric Distributed Shared Memory (ADSM) model. ADSM maintains a shared logical memory space for CPUs to access data in the accelerator physical memory but not vice versa. The asymmetry allows light-weight implementations that avoid common pitfalls of symmetrical distributed shared memory systems. ADSM allows programmers to assign data structures to performance-critical methods. When a method is selected for accelerator execution, its associated data objects are allocated within the shared logical memory space, which is hosted in the accelerator physical memory and transparently accessible by the methods executed on CPUs. ADSM reduces programming effort for heterogeneous parallel computing systems and enhances application portability. The design and implementation of an ADSM run-time, called GMAC, on top of CUDA in a GNU/Linux environment is presented. Experimental results show that applications written in ADSM and running on top of GMAC achieve performance comparable to their counterparts using programmer-managed data transfers. This dissertation presents the GMAC system, evaluates different design choices, and further suggests additional architectural support that will likely allow GMAC to achieve higher application performance than the current CUDA model.
Finally, the execution model of heterogeneous parallel systems is considered. Accelerator execution is abstracted in different ways in existing programming models. This dissertation explores three approaches implemented by existing programming models. OpenCL and the NVIDIA CUDA driver API use file descriptor semantics to abstract accelerators: user processes access accelerators through descriptors. This approach increases the complexity of using accelerators because accelerator descriptors are needed in any call involving the accelerator (e.g., memory allocations or passing a parameter to the accelerator). The IBM Cell SDK abstracts accelerators as separate execution threads. This approach requires adding the code needed to create new execution threads and synchronization primitives in order to use accelerators. Finally, the NVIDIA CUDA run-time API abstracts accelerators as Remote Procedure Calls (RPC). This approach is fundamentally incompatible with ADSM, because it assumes separate virtual address spaces for accelerator and CPU code. The Heterogeneous Parallel Execution (HPE) model is presented in this dissertation. This model extends the execution thread abstraction to incorporate different execution modes. Execution modes define the capabilities (e.g., accessible virtual address space, code ISA, etc.) of the code being executed. In this execution model, accelerator calls are implemented as execution mode switches, analogously to system calls. Accelerator calls in HPE are synchronous, unlike in CUDA, OpenCL and the IBM Cell SDK. Synchronous accelerator calls provide full compatibility with the existing sequential execution model provided by most operating systems. Moreover, abstracting accelerator calls as execution mode switches allows applications that use accelerators to run on systems without accelerators. In these systems, the execution mode switch falls back to an emulation layer, which emulates the accelerator execution on the CPU.
This dissertation further presents different design and implementation choices for the HPE model, in GMAC. The necessary hardware support for an efficient implementation of this model is also presented. Experimental results show that HPE introduces a low execution-time overhead while offering a clean and simple programming interface to applications.
Banino-Rokkones, Cyril. "Algorithmic and Scheduling Techniques for Heterogeneous and Distributed Computing." Doctoral thesis, Norwegian University of Science and Technology, Department of Computer and Information Science, 2007. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-1462.
The computing and communication resources of high performance computing systems are becoming heterogeneous, are exhibiting performance fluctuations and are failing in an unforeseeable manner. The Master-Slave (MS) paradigm, which decomposes the computational load into independent tasks, is well suited to operating in these environments due to its loose synchronization requirements. The application tasks can be computed in any order, by any slave, and can be resubmitted in case of slave failures. Although the MS paradigm naturally adapts to dynamic and unreliable environments, it nevertheless suffers from a lack of scalability.
This thesis provides models, techniques and scheduling strategies that improve the scalability and performance of MS applications. In particular, we claim that deploying multiple masters may be necessary to achieve scalable performance. We address the problem of finding the most profitable locations on a heterogeneous Grid for hosting a given number of master processes, such that the total task throughput of the system is maximized. Further, we provide distributed scheduling strategies that adapt better to system load fluctuations than traditional MS techniques. Our strategies are especially efficient when communication is expensive compared to computation (which constitutes the difficult case).
Furthermore, this thesis also investigates the suitability of MS scheduling techniques for the parallelization of stencil code applications. These applications are usually parallelized with domain decomposition methods, which are highly scalable but rather impractical for dealing with heterogeneous, dynamic and unreliable environments. Our experimental results with two scientific applications show that traditional MS tasking techniques can successfully be applied to stencil code applications when the master is used to control the parallel execution. If the master is used as a data access point, then deploying multiple masters becomes necessary to achieve scalable performance.
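The basic MS behaviour described in this abstract (independent tasks, any order, any worker, resubmission on failure) can be sketched as a small deterministic simulation. This is a hypothetical illustration rather than the thesis's scheduler: `run_master`, the round-robin dispatch, and the `fails` set that injects first-attempt failures are all assumptions made for the example.

```python
from collections import deque


def run_master(tasks, workers, fails):
    """Single-master dispatch loop with resubmission.

    tasks   : list of independent task ids
    workers : list of worker names, served round-robin (an assumption)
    fails   : set of (worker, task) pairs that fail on their first attempt
    Returns (results, attempts), where results maps task -> worker that
    completed it and attempts counts every dispatch, including failed ones.
    """
    queue = deque(tasks)
    results, already_failed = {}, set()
    attempts = 0
    while queue:
        task = queue.popleft()
        worker = workers[attempts % len(workers)]
        attempts += 1
        if (worker, task) in fails and (worker, task) not in already_failed:
            already_failed.add((worker, task))
            queue.append(task)        # resubmit: tasks can run in any order
            continue
        results[task] = worker
    return results, attempts


if __name__ == "__main__":
    print(run_master([0, 1, 2, 3], ["w0", "w1"], {("w1", 1)}))
```

Every dispatch in this sketch passes through one loop, which hints at the scalability limit the abstract attributes to a single master and at why the thesis studies multiple masters.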
Krommydas, Konstantinos. "Towards Enhancing Performance, Programmability, and Portability in Heterogeneous Computing." Diss., Virginia Tech, 2017. http://hdl.handle.net/10919/77582.
Ph. D.
Helal, Ahmed Elmohamadi Mohamed. "Automated Runtime Analysis and Adaptation for Scalable Heterogeneous Computing." Diss., Virginia Tech, 2020. http://hdl.handle.net/10919/96607.
Doctor of Philosophy
Current supercomputers integrate a massive number of heterogeneous compute units with varying speed, computational throughput, memory bandwidth, and memory access latency. This trend represents a major challenge to end users, as their applications have been designed from the ground up primarily to exploit homogeneous CPUs. While heterogeneous systems can deliver several orders of magnitude speedup compared to traditional CPU-based systems, end users need extensive software and hardware expertise as well as significant time and effort to utilize all the available compute resources efficiently. To streamline such a daunting process, this dissertation presents automated frameworks for analyzing and modeling performance on parallel architectures and for transforming the execution of user applications at runtime. The proposed frameworks incorporate domain knowledge and adapt to the input data and the underlying hardware using novel static and dynamic analyses. The experimental results show the efficacy of the introduced frameworks across many important application domains, such as computational fluid dynamics (CFD) and computer-aided design (CAD). In particular, the adaptive execution approach on heterogeneous systems achieves up to an order-of-magnitude speedup over the optimized parallel implementations.
Daga, Mayank. "Architecture-Aware Mapping and Optimization on Heterogeneous Computing Systems." Thesis, Virginia Tech, 2011. http://hdl.handle.net/10919/32535.
Master of Science
Adurti, Devi Abhiseshu, and Mohit Battu. "Optimization of Heterogeneous Parallel Computing Systems using Machine Learning." Thesis, Blekinge Tekniska Högskola, Institutionen för datavetenskap, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-21834.
Srivatsan, Siddhartha Eluppai. "Integrating heterogeneous computing resources to form a campus grid." [Gainesville, Fla.] : University of Florida, 2009. http://purl.fcla.edu/fcla/etd/UFE0024690.
Vella, Kevin J. "Seamless parallel computing on heterogeneous networks of multiprocessor workstations." Thesis, University of Kent, 1998. https://kar.kent.ac.uk/21580/.
Di, Giovanni Pasquale. "Enhancing Ubiquitous Computing Environments Through Composition of Heterogeneous Services." Doctoral thesis, Universita degli studi di Salerno, 2016. http://hdl.handle.net/10556/2231.
In recent years, substantial advancements in Information and Communication Technologies have enabled the development of original software solutions that can help with problems people face in their daily activities. Among the technical advancements that have fostered the development of such innovative applications, the gradual transition from stand-alone and centralized architectures to distributed ones and the explosive growth in the area of mobile communication have played a central role. The profitable combination of these advancements has led to the rise of so-called Mobile Information Systems. Unfortunately, realizing such systems is very challenging, and several aspects have to be taken into account during the design and development of both the front and back ends of the proposed solution. Within this context, in this thesis we investigate two main aspects: 1) the elicitation of requirements and the design of usable mobile User Interfaces and 2) the information exchange in a back end combining heterogeneous services, more specifically services based on the standards of the World Wide Web Consortium (W3C) and the Open Geospatial Consortium (OGC). In particular, we develop a methodology to support the design of mobile solutions when usability requirements play a key role in the success of the whole system. We also present a solution for the seamless integration of services developed according to different standards, with a specific focus on the proper management of geospatial metadata in a W3C standards-oriented infrastructure. The result of our investigation is an extension of a key W3C standard for metadata retrieval to support OGC metadata. The case study considered in our work is a Mobile Information System to be used by a community of farmers in Sri Lanka. [edited by Author]
XII n.s.
Ma, Liang. "Low power and high performance heterogeneous computing on FPGAs." Doctoral thesis, Politecnico di Torino, 2019. http://hdl.handle.net/11583/2727228.
Ribeiro, Tiago Filipe Rodrigues. "Developing and evaluating clopencl applications for heterogeneous clusters." Master's thesis, Instituto Politécnico de Bragança, Escola Superior de Tecnologia e Gestão, 2012. http://hdl.handle.net/10198/7948.
Lu, Kai. "Decentralized load balancing in heterogeneous computational grids." Thesis, The University of Sydney, 2007. http://hdl.handle.net/2123/9382.
Padhye, Mohini. "Coordinating heterogeneous web services through handhelds using SyD." unrestricted, 2004. http://etd.gsu.edu/theses/available/etd-12062004-125228/.
Title from title screen. Sushil K. Prasad, committee chair; Anu Bourgeois, Alex Zelikovsky, committee members. Description based on contents viewed Feb. 26, 2007. Includes bibliographical references (p. 55-59). Source code: p. 75-123.
Li, Yue. "Edge computing-based access network selection for heterogeneous wireless networks." Thesis, Rennes 1, 2017. http://www.theses.fr/2017REN1S042/document.
Telecommunication networks have evolved from 1G to 4G over the past decades. One of the typical characteristics of the 4G network is the coexistence of heterogeneous radio access technologies, which gives end users the ability to connect to them and to switch between them with new-generation mobile devices. However, selecting the right network is not an easy task for mobile users, since access network conditions change rapidly. Moreover, video streaming is becoming the major data service over the mobile network, where content providers and network operators should cooperate to guarantee the quality of video delivery. To cope with this context, the thesis concerns the design of a novel approach for making an optimal network selection decision and an architecture for improving the performance of adaptive streaming in a heterogeneous network. Firstly, we introduce an analytical model (i.e., a linear discrete-time system) to describe the network selection procedure considering one traffic class. Then, we consider the design of a selection strategy based on foundations from linear optimal control theory, with the objective of maximizing network resource utilization while meeting the constraints of the supported services. Computer simulations with MATLAB are carried out to validate the efficiency of the proposed mechanism. Based on the same principle, we extend this model to a general analytical model describing the network selection procedures in heterogeneous network environments with multiple traffic classes. The proposed model was then used to derive a scalable mechanism based on control theory, which not only helps to steer traffic dynamically to the most appropriate access network but also helps to block the residual traffic when the network is congested, by dynamically adjusting the access probabilities. We discuss the advantages of a seamless integration with the ANDSF.
A prototype is also implemented in ns-3. Simulation results show that the proposed scheme prevents network congestion and demonstrate the effectiveness of the controller design, which can maximize network resource allocation by making the network workload converge to the targeted network occupancy. Thereafter, we focus on enhancing the performance of DASH in a mobile network environment for users who have a single access network. We introduce a novel architecture based on MEC. The proposed adaptation mechanism, running as an MEC service, can modify the manifest files in real time, responding to network congestion and dynamic demand, thus driving clients towards selecting more appropriate quality/bitrate video representations. We have developed a virtualized testbed to run the experiment with our proposed scheme. The results demonstrate its QoE benefits compared to traditional, purely client-driven bitrate adaptation approaches, since our scheme notably improves both the achieved MOS and fairness in the face of congestion. Finally, we extend the proposed MEC-based architecture to support the DASH service in a multi-access heterogeneous network in order to maximize the QoE and fairness of mobile users. In this scenario, our scheme should help users select both video quality and access network, and we formulate this as an optimization problem. This optimization problem can be solved with the IBM CPLEX tool. However, this tool is time-consuming and not scalable. Therefore, we introduce a heuristic algorithm that finds a sub-optimal solution with less complexity. We then implement a testbed to conduct the experiment, and the results demonstrate that our proposed algorithm achieves similar overall QoE and fairness while taking far less time than the IBM CPLEX tool.
Janjic, Vladimir. "Load balancing of irregular parallel applications on heterogeneous computing environments." Thesis, University of St Andrews, 2012. http://hdl.handle.net/10023/2540.
Kao, Yi-Hsuan. "Optimizing task assignment for collaborative computing over heterogeneous network devices." Thesis, University of Southern California, 2016. http://pqdtopen.proquest.com/#viewpdf?dispub=10124490.
The Internet of Things promises to enable a wide range of new applications involving sensors, embedded devices and mobile devices. Unlike traditional cloud computing, where centralized, powerful servers offer high-quality computing service, the era of the Internet of Things offers abundant computational resources distributed over the network. These devices are not as powerful as servers, but are easier to access, with faster setup and short-range communication. However, because of energy, computation, and bandwidth constraints on smart things and other edge devices, it will be imperative to collaboratively run computationally intensive applications that a single device cannot support on its own. As many IoT applications, such as data processing, can be divided into multiple tasks, we study the problem of assigning such tasks to multiple devices, taking into account their abilities and the costs and latencies associated with both task computation and data communication over the network.
A system that leverages collaborative computing over the network faces a highly variable run-time environment. For example, the resources released by a device may suddenly decrease due to state changes in local processes, or the channel quality may degrade due to mobility. Hence, such a system has to learn the available resources, be aware of changes, and flexibly adapt its task assignment strategy to make efficient use of these resources.
We take a step-by-step approach to achieve these goals. First, we assume that the amount of resources is deterministic and known. We formulate a task assignment problem that aims to minimize the application latency (system response time) subject to a single cost constraint, so that the available resources are not overused. Second, we consider that each device has its own cost budget, and our new multi-constrained formulation attributes the cost to each device separately. Moving a step further, we assume that the amounts of resources are stochastic processes with known distributions, and solve a stochastic optimization with a strong QoS constraint. That is, instead of providing a guarantee on the average latency, our task assignment strategy guarantees that p% of the time the latency is less than t, where p and t are arbitrary numbers. Finally, we assume that the amounts of run-time resources are unknown and stochastic, and design online algorithms that learn the unknown information within a limited amount of time and make competitive task assignments.
We aim to develop algorithms that make decisions efficiently at run-time. That is, the computational complexity should be as low as possible so that running the algorithm does not incur considerable overhead. For optimizations based on a known resource profile, we show that these problems are NP-hard and propose polynomial-time approximation algorithms with performance guarantees, where the performance loss caused by the sub-optimal strategy is bounded. For the online learning formulations, we propose lightweight algorithms for both stationary and non-stationary environments and show their competitiveness by comparing their performance with the optimal offline policy (solved by assuming the resource profile is known).
We perform comprehensive numerical evaluations, including simulations based on trace data measured at application run-time, and validate our analysis of the algorithms' complexity and performance against the numerical results. In particular, we compare our algorithms with existing heuristics and show that in some cases the performance loss incurred by a heuristic is considerable due to its sub-optimal strategy. Hence, we conclude that, to efficiently leverage the distributed computational resources over the network, it is essential to formulate an optimization problem that captures the practical scenarios well, and to provide an algorithm that is light in complexity and suggests a good assignment strategy with a performance guarantee.
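As a rough illustration of the first formulation described in this abstract (minimize latency subject to a single cost constraint), the following sketch enumerates every assignment for a tiny instance. The latency and cost tables, the serial-execution latency model, and the function names are all hypothetical and are not taken from the thesis, which proposes polynomial-time approximation algorithms rather than exhaustive search. The second helper checks a "p% of the time the latency is less than t" condition empirically on a list of sampled latencies.

```python
from itertools import product


def best_assignment(latency, cost, budget):
    """Exhaustively search assignments of tasks to devices.

    latency[i][j] and cost[i][j] are the (hypothetical) latency and cost of
    running task i on device j. Tasks are assumed to run one after another,
    so the total latency is a plain sum. Returns (assignment, total_latency),
    or (None, inf) if no assignment fits within the budget.
    """
    n_tasks = len(latency)
    n_devices = len(latency[0])
    best, best_latency = None, float("inf")
    for assignment in product(range(n_devices), repeat=n_tasks):
        total_cost = sum(cost[i][j] for i, j in enumerate(assignment))
        if total_cost > budget:
            continue  # violates the single cost constraint
        total_latency = sum(latency[i][j] for i, j in enumerate(assignment))
        if total_latency < best_latency:
            best, best_latency = assignment, total_latency
    return best, best_latency


def meets_latency_guarantee(samples, t, p):
    """Return True if at least p percent of the sampled latencies are below t."""
    within = sum(1 for s in samples if s < t)
    return 100.0 * within / len(samples) >= p


# Toy instance: 3 tasks, 2 devices (device 1 is faster but more expensive).
latency = [[4, 1], [3, 1], [5, 1]]
cost = [[1, 3], [1, 2], [1, 4]]
print(best_assignment(latency, cost, budget=6))
```

The search space grows as (number of devices) raised to the number of tasks, which is why exhaustive enumeration is only usable for toy instances like this one.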
Schultek, Brian Robert. "Design and Implementation of the Heterogeneous Computing Device Management Architecture." University of Dayton / OhioLINK, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1417801414.
Pagani, Marco. "Enabling Predictable Hardware Acceleration in Heterogeneous SoC-FPGA Computing Platforms." Thesis, Lille 1, 2020. http://www.theses.fr/2020LIL1I016.
Modern computing platforms for embedded systems are evolving towards heterogeneous architectures comprising different types of processing elements and accelerators. This evolution is driven by the steadily increasing computational demand of modern cyber-physical systems. These systems need to acquire large amounts of data from multiple sensors and process them to perform the required control and monitoring tasks. These requirements translate into the need to execute complex computing workloads, such as machine learning, encryption, and advanced signal processing algorithms, within the timing constraints imposed by the physical world. Heterogeneous systems can meet this computational demand with a high level of energy efficiency by distributing the computational workload among the different processing elements.
This thesis contributes to the development of system support for real-time systems on heterogeneous platforms by presenting novel methodologies and techniques for enabling predictable hardware acceleration on SoC-FPGA platforms. The first part of this thesis presents a framework designed to support the development of real-time applications on SoC-FPGAs, leveraging hardware acceleration and logic resource “virtualization” through dynamic partial reconfiguration. The proposed framework is based on a device model that matches the capabilities of modern SoC-FPGA devices, and it is centered around a custom scheduling infrastructure designed to guarantee bounded response times. This characteristic is crucial for making dynamic hardware acceleration viable for safety-critical applications. The second part of this thesis presents a full implementation of the proposed framework on Linux. This implementation allows developers to build predictable applications that leverage the large number of software systems available on GNU/Linux while relying on dynamic FPGA-based hardware acceleration for heavy computations.
Finally, the last part of this thesis introduces a reservation mechanism for the AMBA AXI bus aimed at improving the predictability of hardware accelerators by regulating bus contention through bandwidth reservation.
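The abstract does not say how the bandwidth reservation works. As a minimal sketch of the general idea only, the following assumes a generic budget-per-period scheme: each accelerator may issue at most `budget` bus transactions in every window of `period` time units, and further requests are refused until the next replenishment. The class name, parameters, and integer time model are hypothetical and are not taken from the thesis or from the AMBA AXI specification.

```python
class BandwidthReservation:
    """Toy budget/period regulator for one bus master (hypothetical model)."""

    def __init__(self, budget, period):
        self.budget = budget      # transactions allowed per period
        self.period = period      # length of one replenishment window
        self.remaining = budget   # transactions left in the current window
        self.window_start = 0     # start time of the current window

    def request(self, now, size=1):
        """Return True if a transaction of `size` units may be issued at time `now`."""
        # Replenish the budget once one or more full periods have elapsed.
        if now - self.window_start >= self.period:
            elapsed_periods = (now - self.window_start) // self.period
            self.window_start += elapsed_periods * self.period
            self.remaining = self.budget
        if size <= self.remaining:
            self.remaining -= size
            return True
        return False  # budget exhausted: the master must wait for the next window


# Example: at most 2 transactions in every window of 10 time units.
regulator = BandwidthReservation(budget=2, period=10)
print([regulator.request(t) for t in (0, 1, 2, 10)])
```

Capping each master's traffic per window bounds the interference any one accelerator can impose on the others, which is the property a predictability analysis relies on.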
Brown, Grant Donald. "Application Of Heterogeneous Computing Techniques To Compartmental Spatiotemporal Epidemic Models." Diss., University of Iowa, 2015. https://ir.uiowa.edu/etd/1554.
Cumming, Benjamin Donald. "Modelling sea water intrusion in coastal aquifers using heterogeneous computing." Thesis, Queensland University of Technology, 2012. https://eprints.qut.edu.au/61038/1/Benjamin_Cumming_Thesis.pdf.
Faticanti, Francescomaria. "Resource Allocation Strategies in Highly Distributed and Heterogeneous Computing Systems." Doctoral thesis, Università degli studi di Trento, 2021. http://hdl.handle.net/11572/321482.
Kerr, Andrew. "A model of dynamic compilation for heterogeneous compute platforms." Diss., Georgia Institute of Technology, 2012. http://hdl.handle.net/1853/47719.
Raman, Pirabhu. "GEMS Gossip-Enabled Monitoring Service for heterogeneous distributed systems /." [Gainesville, Fla.] : University of Florida, 2002. http://purl.fcla.edu/fcla/etd/UFE0000598.
Chang, He. "Server selection for heterogeneous cloud video services." HKBU Institutional Repository, 2017. http://repository.hkbu.edu.hk/etd_oa/419.
Rafique, Muhammad Mustafa. "An Adaptive Framework for Managing Heterogeneous Many-Core Clusters." Diss., Virginia Tech, 2011. http://hdl.handle.net/10919/29119.
Ph. D.
Huang, Jun. "Heterogeneity-aware approaches to optimizing performance of computing and communication tasks." Auburn, Ala., 2005. http://repo.lib.auburn.edu/2005%20Fall/Dissertation/HUANG_JUN_28.pdf.
Porter, N. Wayne. "Resource usage for adaptive C4I models in a heterogeneous computing environment." Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 1999. http://handle.dtic.mil/100.2/ADA366190.
"June 1999". Thesis advisor(s): Debra Hensgen, William G. Kemple. Includes bibliographical references (p. 175-179). Also available online.
Shan, Meijuan. "Distributed object-oriented parallel computing on heterogeneous workstation clusters using Java." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp04/mq43403.pdf.