Dissertations / Theses on the topic 'Heterogenous scheduling'




Consult the top 50 dissertations / theses for your research on the topic 'Heterogenous scheduling.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.

1

Durdak, Yavuz. "The Air Cargo Scheduling Problem With Heterogenous Fleet." Master's thesis, METU, 2013. http://etd.lib.metu.edu.tr/upload/12615358/index.pdf.

Full text
Abstract:
In this study, we consider the Air Cargo Scheduling Problem based on a real-life application. The aim is to move cargo and passengers that have different priorities and delivery time windows from a number of origin airports to destination airports by means of a transportation system. The system has predefined carrier routes and a heterogeneous fleet of aircraft. The problem is formulated as a heterogeneous-vehicle, multi-commodity, pick-up and delivery network flow problem with a large set of system-specific constraints. The proposed model determines the set of movement requirements assigned to each route leg and the number and type of aircraft assigned to each route in a reasonable amount of time. The model is tested with real and generated data, and the results are compared with the current methodology under different scenarios. The model produced better results in a short amount of time compared to the current methodology.
2

Hernandez, Jesus Israel. "Reactive scheduling of DAG applications on heterogeneous and dynamic distributed computing systems." Thesis, University of Edinburgh, 2008. http://hdl.handle.net/1842/2336.

Full text
Abstract:
Emerging technologies enable a set of distributed resources across a network to be linked together and used in a coordinated fashion to solve a particular parallel application. Such applications are often abstracted as directed acyclic graphs (DAGs), in which vertices represent application tasks and edges represent data dependencies between tasks. Effective scheduling mechanisms for DAG applications are essential to exploit the tremendous potential of computational resources. The core issue is that the availability and performance of resources, which are already by their nature heterogeneous, can be expected to vary dynamically, even during the course of an execution. In this thesis, we first consider the problem of scheduling DAG task graphs onto heterogeneous resources with changeable capabilities. We propose a list-scheduling heuristic approach, the Global Task Positioning (GTP) scheduling method, which addresses the problem by allowing rescheduling and migration of tasks in response to significant variations in resource characteristics. We observed from experiments with GTP that in an execution with relatively frequent migration, it may be that, over time, the results of some task have been copied to several other sites, and so a subsequent migrated task may have several possible sources for each of its inputs. Some of these copies may now be more quickly accessible than the original, due to dynamic variations in communication capabilities. To exploit this observation, we extended our model with a Copying Management (CM) function, resulting in a new version, the Global Task Positioning with copying facilities (GTP/c) system. The idea is to reuse such copies, in subsequent migration of placed tasks, in order to reduce the impact of migration cost on makespan. Finally, we believe that fault tolerance is an important issue in heterogeneous and dynamic computational environments, as the availability of resources cannot be guaranteed. To address the problem of processor failure, we propose a rewinding mechanism which rewinds the progress of the application to a previous state, thereby preserving the execution in spite of the failed processor(s). We evaluate our mechanisms through simulation, since this allows us to generate repeatable patterns of resource performance variation. We use a standard benchmark set of DAGs, comparing performance against that of competing algorithms from the scheduling literature.
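The GTP family of heuristics described above builds on classic list scheduling for DAGs. As a point of reference only, the following minimal Python sketch shows the HEFT-style pattern such heuristics start from: rank tasks by an upward rank and map each one to the processor with the earliest estimated finish time. The DAG, costs and processor names are invented for illustration; the sketch omits the rescheduling, migration, copying and rewinding machinery that the thesis actually contributes.

```python
# Minimal list-scheduling sketch in the spirit of GTP/HEFT-style heuristics.
# All task names, costs, and the DAG below are illustrative, not from the thesis.
from collections import defaultdict

# DAG: task -> list of (successor, communication cost if placed on another processor)
succ = {"A": [("B", 4), ("C", 2)], "B": [("D", 3)], "C": [("D", 1)], "D": []}
pred = defaultdict(list)
for t, outs in succ.items():
    for s, c in outs:
        pred[s].append((t, c))

# Estimated computation cost of each task on each (heterogeneous) processor
cost = {"A": {"p0": 5, "p1": 3}, "B": {"p0": 6, "p1": 9},
        "C": {"p0": 4, "p1": 2}, "D": {"p0": 7, "p1": 5}}

procs = {"p0": 0.0, "p1": 0.0}          # time at which each processor becomes free
finish, placed = {}, {}

def upward_rank(t):
    """Average cost of t plus the heaviest remaining path to an exit task."""
    avg = sum(cost[t].values()) / len(cost[t])
    return avg + max((c + upward_rank(s) for s, c in succ[t]), default=0.0)

for t in sorted(cost, key=upward_rank, reverse=True):      # priority order
    best = None
    for p, free_at in procs.items():
        # data from predecessors on other processors pays the edge cost
        ready = max((finish[u] + (0 if placed[u] == p else c) for u, c in pred[t]),
                    default=0.0)
        eft = max(free_at, ready) + cost[t][p]              # earliest finish time
        if best is None or eft < best[0]:
            best = (eft, p)
    finish[t], placed[t] = best[0], best[1]
    procs[best[1]] = best[0]

print(placed, finish)   # processor chosen for each task and its finish time
```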
3

Binotto, Alécio Pedro Delazari. "A dynamic scheduling runtime and tuning system for heterogeneous multi and many-core desktop platforms." Biblioteca Digital de Teses e Dissertações da UFRGS, 2011. http://hdl.handle.net/10183/34768.

Full text
Abstract:
A modern personal computer can now be considered as a one-node heterogeneous cluster that simultaneously processes several applications' tasks. It can be composed of asymmetric Processing Units (PUs), like the multi-core Central Processing Unit (CPU), the many-core Graphics Processing Units (GPUs) - which have become one of the main co-processors contributing towards high performance computing - and other PUs. This way, a powerful heterogeneous execution platform is built on a desktop for data-intensive calculations. In the perspective of this thesis, to improve the performance of applications and exploit such heterogeneity, the distribution of the workload over the PUs plays a key role in such systems. This issue presents challenges, since the execution cost of a task on a PU is non-deterministic and can be affected by a number of parameters not known a priori, like the problem size domain and the precision of the solution, among others. Within this scope, this doctoral research introduces a context-aware runtime and performance tuning system based on a compromise between reducing the execution time of the applications - due to appropriate dynamic scheduling of high-level tasks - and the cost of computing such a scheduling, applied on a platform composed of a CPU and GPUs. This approach combines a model for a first scheduling, based on an off-line task performance profile benchmark, with a runtime model that keeps track of the tasks' real execution times and efficiently schedules new instances of the high-level tasks dynamically over the CPU/GPU execution platform. For that, a set of heuristics to schedule tasks over one CPU and one GPU is proposed, together with a generic and efficient scheduling strategy that considers several processing units. The proposed approach is applied in a case study using a CPU-GPU execution platform for computing iterative solvers for Systems of Linear Equations, using a stencil code specially designed to exploit the characteristics of modern GPUs. The solution uses the number of unknowns as the main parameter for the assignment decision. By scheduling tasks to the CPU and to the GPU, a performance gain of 21.77% is achieved in comparison to the static assignment of all tasks to the GPU (which is done by current programming models, such as OpenCL and CUDA for Nvidia), with a scheduling error of only 0.25% compared to exhaustive search.
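As a rough illustration of the profile-plus-runtime idea summarized above (and not the thesis's actual runtime), the sketch below seeds per-device cost estimates from an offline profile keyed by the number of unknowns, refines them with measured execution times, and dispatches each task to the device with the lowest current estimate. All numbers and function names are assumptions.

```python
# Sketch of a profile-seeded, online-refined CPU/GPU dispatcher (illustrative only).
import bisect, random

# Offline profile: measured seconds per device for a few problem sizes (unknowns).
profile = {
    "cpu": [(1_000, 0.002), (10_000, 0.03), (100_000, 0.5)],
    "gpu": [(1_000, 0.004), (10_000, 0.01), (100_000, 0.08)],
}
estimates = {d: dict(p) for d, p in profile.items()}   # refined at runtime

def predict(device, unknowns):
    """Piecewise-linear interpolation over the known problem sizes."""
    pts = sorted(estimates[device].items())
    sizes = [s for s, _ in pts]
    i = bisect.bisect_left(sizes, unknowns)
    if i == 0:
        return pts[0][1]
    if i == len(pts):
        return pts[-1][1]
    (s0, t0), (s1, t1) = pts[i - 1], pts[i]
    return t0 + (t1 - t0) * (unknowns - s0) / (s1 - s0)

def run_task(device, unknowns):                         # stand-in for the real kernel
    return predict(device, unknowns) * random.uniform(0.8, 1.2)

def schedule(unknowns):
    device = min(estimates, key=lambda d: predict(d, unknowns))
    measured = run_task(device, unknowns)               # real execution time
    # blend the measurement into the model so later decisions improve
    old = estimates[device].get(unknowns, measured)
    estimates[device][unknowns] = 0.5 * old + 0.5 * measured
    return device

for n in (2_000, 50_000, 50_000, 120_000):
    print(n, "->", schedule(n))
```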
4

Neeracher, Matthias. "Scheduling for heterogeneous opportunistic workstation clusters /." [S.l.] : [s.n.], 1998. http://e-collection.ethbib.ethz.ch/show?type=diss&nr=12906.

Full text
5

Wen, Yuan. "Multi-tasking scheduling for heterogeneous systems." Thesis, University of Edinburgh, 2017. http://hdl.handle.net/1842/23469.

Full text
Abstract:
Heterogeneous platforms play an increasingly important role in modern computer systems. They combine high performance with low power consumption. From mobiles to supercomputers, we see an increasing number of computer systems that are heterogeneous. The most well-known heterogeneous systems, CPU+GPU platforms, have been widely used in recent years. As they become more mainstream, serving multiple tasks from multiple users is an emerging challenge. A good scheduler can greatly improve performance. However, indiscriminately allocating tasks based on availability leads to poor performance. As modern GPUs have a large number of hardware resources, most tasks cannot efficiently utilize all of them. Concurrent task execution on the GPU is a promising solution; however, indiscriminately running tasks in parallel causes a slowdown. This thesis focuses on scheduling OpenCL kernels. A runtime framework is developed to determine where to schedule OpenCL kernels. It predicts the best-fit device by using a machine-learning-based classifier, then schedules the kernels accordingly to either the CPU or the GPU. To improve GPU utilization, a kernel-merging approach is proposed. Kernels are merged if their predicted co-execution can provide better performance than sequential execution. A machine-learning-based classifier is developed to find the best kernel pairs for co-execution on the GPU. Finally, a runtime framework is developed to schedule kernels separately on either the CPU or the GPU, and to run kernels in pairs if their co-execution can improve performance. The approaches developed in this thesis significantly improve system performance and outperform all existing techniques.
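A toy illustration of the device-prediction step described above: a classifier trained on static kernel features predicts whether a kernel is a better fit for the CPU or the GPU. The features, training data and the use of scikit-learn are assumptions for the sketch; the thesis's models, feature sets and kernel-merging classifier are more elaborate.

```python
# Toy illustration of device prediction from kernel features (not the thesis's model).
# Assumes scikit-learn is available; features and training data are invented.
from sklearn.tree import DecisionTreeClassifier

# Features per kernel: [compute-to-memory ratio, data size (MB), branch fraction]
X_train = [
    [8.0, 256, 0.02],   # compute-heavy, large, regular  -> GPU
    [6.5, 128, 0.05],   # -> GPU
    [0.8, 4, 0.40],     # memory-light, branchy          -> CPU
    [1.2, 2, 0.35],     # -> CPU
    [7.0, 512, 0.01],   # -> GPU
    [0.5, 8, 0.50],     # -> CPU
]
y_train = ["gpu", "gpu", "cpu", "cpu", "gpu", "cpu"]

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

def best_fit_device(kernel_features):
    return clf.predict([kernel_features])[0]

# A runtime would dispatch each incoming kernel to its predicted device, and would
# pair GPU kernels for co-execution only if a second classifier predicts a speedup.
print(best_fit_device([5.5, 300, 0.03]))   # likely "gpu"
print(best_fit_device([0.9, 3, 0.45]))     # likely "cpu"
```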
6

Kreaseck, Barbara. "Dynamic autonomous scheduling on heterogeneous systems /." Diss., Connect to a 24 p. preview or request complete full text in PDF format. Access restricted to UC campuses, 2003. http://wwwlib.umi.com/cr/ucsd/fullcit?p3102539.

Full text
7

Tzeng, Stanley. "Scheduling on Manycore and Heterogeneous Graphics Processors." Thesis, University of California, Davis, 2014. http://pqdtopen.proquest.com/#viewpdf?dispub=3602240.

Full text
Abstract:

Through custom software schedulers that distribute work differently from built-in hardware schedulers, data-parallel and heterogeneous architectures can be retargeted towards irregular task-parallel graphics workloads. This dissertation examines the role of a GPU scheduler and how it may schedule complicated workloads onto the GPU for efficient parallel processing. It examines the scheduler through three different properties of workloads: granularity, irregularity, and dependency. It then moves on to heterogeneous architectures and examines how scheduling decisions differ when scheduling for discrete versus heterogeneous chips. The dissertation concludes with future work in scheduling for both discrete and heterogeneous architectures.

8

Lee, Young Choon. "Problem-centric scheduling for heterogeneous computing systems." Thesis, The University of Sydney, 2007. http://hdl.handle.net/2123/9321.

Full text
Abstract:
This project addresses key scheduling problems in heterogeneous computing environments. Heterogeneous computing systems (HCSs) have received increased attention since the 1990s, particularly over the past 10 years with the popularity of grid computing systems. These computing environments consist of a variety of resources interconnected by a high-speed network. Many parallel and distributed applications can take advantage of this computing platform; however, resource heterogeneity and dynamism impose scheduling restrictions. It is extremely difficult for a single scheduling scheme to efficiently and effectively handle the application scenarios that are required in grid computing environments. What further complicates the issue is that computing environments are controlled by different administrative authorities. Thus, application diversity, and resource heterogeneity and dynamism, point to the need to develop a set of scheduling algorithms to manage these scenarios. The thesis describes a number of key application and system models, and extensively discusses the characteristics of traditional multiprocessor scheduling and grid scheduling. The application models can be broadly classified as independent and precedence-constrained. The coupling of resources in our HCS model can be tight or loose; while static scheduling is applied to tightly coupled platforms, dynamic scheduling is adopted on loosely coupled platforms. The thesis presents the scheduling schemes that we have developed to address various challenging scheduling issues, and sets out and interprets the experimental results from our performance evaluation study. The data indicate that our novel scheduling algorithms, which appropriately incorporate application and system characteristics into their scheduling, deliver significantly better performance than previous approaches.
9

Okolo, Benjamin Uchenna. "Joint Routing and Scheduling in Heterogeneous Wireless Network." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2017.

Find full text
Abstract:
The idea of the Internet-of-Things (IoT) heralds a future where most things used in people's daily lives are connected, possibly in one network, and share information among themselves. As a result, IoT holds the promise of revolutionizing the daily life of its potential users, but it also presents a challenge for engineers, who need to ensure good connectivity between heterogeneous devices in such a network. The IoT vision will deal with devices that are based on different, heterogeneous technologies. This thesis presents a novel technique for routing and scheduling in a heterogeneous network, based on a realistic physical layer model. The routing and scheduling techniques presented in this thesis follow some existing routing and scheduling algorithms, and the simulation results obtained show the performance of the scheme to be very efficient.
10

Planas, Carbonell Judit. "Programming models and scheduling techniques for heterogeneous architectures." Doctoral thesis, Universitat Politècnica de Catalunya, 2015. http://hdl.handle.net/10803/327036.

Full text
Abstract:
There is a clear trend nowadays to use heterogeneous high-performance computers, as they offer considerably greater computing power than homogeneous CPU systems. Extending traditional CPU systems with specialized units (accelerators such as GPGPUs) has become a revolution in the HPC world. Both the traditional performance-per-Watt and the performance-per-Euro ratios have been increased with the use of such systems. Heterogeneous machines can adapt better to different application requirements, as each architecture type offers different characteristics. Thus, in order to maximize application performance on these platforms, applications should be divided into several portions according to their execution requirements. These portions should then be scheduled to the device that best fits their requirements. Hence, heterogeneity introduces complexity in application development, up to the point of reaching the programming wall: on the one hand, source codes must be adapted to fit new architectures and, on the other, resource management becomes more complicated. For example, multiple memory spaces require explicit data movements, and additional synchronizations are needed between code portions that run on different units. For all these reasons, efficient programming and code maintenance in heterogeneous systems is extremely complex and expensive. Although several approaches have been proposed for accelerator programming, like CUDA or OpenCL, these models do not solve the aforementioned programming challenges, as they expose low-level hardware characteristics to the programmer. Therefore, programming models should be able to hide all this accelerator programming complexity by providing a homogeneous development environment. In this context, this thesis contributes in two key aspects: first, it proposes a general design to efficiently manage the execution of heterogeneous applications and, second, it presents several scheduling mechanisms to spread application execution among all the units of the system to maximize performance and resource utilization. The first contribution proposes an asynchronous design to manage execution, data movements and synchronizations on accelerators. This approach has been developed in two steps: first, a semi-asynchronous proposal and then a fully-asynchronous proposal, in order to fit contemporary hardware restrictions. The experimental results on different multi-accelerator systems showed that these approaches could reach the maximum expected performance. Even when compared to native, hand-tuned codes, they obtained the same results and outperformed the native versions in selected cases. The second contribution presents four different scheduling strategies. They focus on and combine different aspects of heterogeneous programming to minimize application execution time, for example, minimizing the amount of data shared between memory spaces, or maximizing resource utilization by scheduling each portion of code on the unit that fits it best. The experiments were performed on different heterogeneous platforms, including CPUs, GPGPUs and Intel Xeon Phi devices. As these tests show, it is particularly interesting to analyze how all these scheduling strategies can impact application performance. Three general conclusions can be extracted: first, application performance is not guaranteed across new hardware generations, so source codes must be updated periodically as hardware evolves.
Second, the most efficient way to run an application on a heterogeneous platform is to divide it into smaller portions and pick the unit that best fits each portion, so that system resources can cooperate to execute the application. Finally, and probably most importantly, the requirements derived from the first and second conclusions can be implemented inside runtime frameworks, so that the complexity of programming heterogeneous architectures is completely hidden from the programmer.
11

Mathirajan, M. "Heuristic Scheduling Algorithms For Parallel Heterogeneous Batch Processors." Thesis, Indian Institute of Science, 2000. http://hdl.handle.net/2005/196.

Full text
Abstract:
In the last decade, market pressures for a greater variety of products forced a gradual shift from continuous manufacturing to batch manufacturing in various industries. Consequently, batch scheduling problems have attracted the attention of researchers in production and operations management. This thesis addresses the scheduling of parallel non-identical batch processors in the presence of dynamic job arrivals, incompatible job-families and non-identical job sizes. This problem abstracts the scheduling of heat-treatment furnace operations of castings in a steel foundry. The problem is of considerable interest in this sector, as a large proportion of the total production time is spent in heat-treatment processing. This problem is also encountered in other industrial settings, such as the burn-in operation in the final testing stage of semiconductor manufacturing, and the manufacturing of steel, ceramics, aircraft parts, footwear, etc. A detailed literature review and personal communications with experts revealed that this class of batch scheduling problems has not been addressed hitherto. A major concern in the management of foundries is to maximize throughput and reduce flow time and work-in-process inventories. Therefore we have chosen the primary scheduling objective to be the utilization of batch processors, and as secondary objectives the minimization of overall flow time and of the weighted average waiting time per job. This formulation can be considered an extension of the problems studied by Dobson and Nambinadom (1992), Uzsoy (1995), Zee et al. (1997) and Mehta and Uzsoy (1998). Our effort to carefully catalogue the large number of variants of deterministic batch scheduling problems led us to the development of a taxonomy and notation. Not surprisingly, we are able to show that our problem is NP-hard and is therefore in the company of many scheduling problems that are difficult to solve. Initially, two heuristic algorithms were developed: a mathematical-programming-based heuristic algorithm (MPHA) and a greedy heuristic algorithm. Due to the computational overheads in the implementation of MPHA when compared with the greedy heuristic, we chose to focus on the latter as the primary scheduling methodology. Preliminary experimentation led us to the observation that the performance of greedy heuristics depends critically on the selection of job-families. So eight variants of the greedy heuristic, which differ mainly in the decision on "job-family selection", were proposed. These eight heuristics form two sets, {A1, A2, A3, A4} and the modified {MA1, MA2, MA3, MA4}, which differ only in how the "job-family" index, weighted shortest processing time, is computed. For evaluating the performance of the eight heuristics, computational experiments were carried out. The analysis of the experimental data is presented from two perspectives. The goal of the first perspective was to evaluate the absolute quality of the solutions obtained by the proposed heuristic algorithms when compared with estimated optimal solutions. The second perspective was to compare the relative performance of the proposed heuristics. The test problems generated were designed to reflect real-world scheduling problems that we have observed in the steel-casting industry. Three important problem parameters for the test set generation are the number of jobs [n], job-priority [P], and job-family [F]. We considered 5 different levels for n, 2 different levels for P and 2 different levels for F.
The test set reflects that (i) job sizes vary uniformly, (ii) there are two batch processors, and (iii) there are five incompatible job-families with different processing times. 15 problem instances were generated for each combination of (n, P, F). Out of the many procedures available in the literature for estimating the optimal value of combinatorial optimization problems, we used the procedure based on the Weibull distribution, as discussed in Rardin and Uzsoy (2001). For each of the 300 randomly generated problem instances, 15 feasible solutions (i.e., values of the average utilization of batch processors (AUBP)) were obtained, using a random decision rule for the first two stages and a best-fit heuristic for the last stage of the scheduling problem. These 15 feasible solutions were used to estimate the optimal value, and are expected to provide the estimated optimal value of the problem instance with a very high probability. Both the average and the worst-case performance of the heuristics indicated that the heuristic algorithms A3 and A4 on average yielded better utilization than the estimated optimal value. This indicates that the Weibull-based technique may have yielded conservative estimates of the optimal value. Further, the other heuristic algorithms found inferior solutions when compared with the estimated optimal value, but the deviations were very small. From this, we may infer that all the proposed heuristic algorithms are acceptable. The relative evaluation of the heuristics was in terms of both computational effort and solution quality. It was clear that the computational burden is, on average, low enough to run all the proposed heuristics on each problem instance and select the best solution. Also, it is observed that any algorithm from the first set {A1, A2, A3, A4} takes more computational time than any one from the second set {MA1, MA2, MA3, MA4}. Regarding solution quality, the following inferences were made: (i) in general, the heuristic algorithms are sensitive to the choice of problem factors with respect to all the scheduling objectives; (ii) the three algorithms A3, MA4 and MA1 are observed to be superior with respect to the scheduling objectives of maximizing the average utilization of batch processors (AUBP), minimizing overall flow time (OFT) and minimizing weighted average waiting time (WAWT), respectively. Further, the heuristic algorithm MA1 turns out to be the best choice if we trade off all three objectives AUBP, OFT and WAWT. Finally, we carried out simple sensitivity-analysis experiments in order to understand the influence of some scheduling parameters on the performance of the heuristic algorithms. These involved one-at-a-time changes in (1) the job-size distribution, (2) the capacities of the batch processors and (3) the processing times of the job-families. From the analyses it appears that changes in these input parameters do have an influence. The results of the sensitivity analyses can be used to guide the selection of a heuristic for a particular combination of input parameters. For example, if we have to pick a single heuristic algorithm, then MA1 is the best choice when considering the performance and the robustness indicated by the sensitivity analysis. In summary, this thesis examined a problem arising in the scheduling of heat-treatment operations in the steel-casting industry. This problem was abstracted to a class of deterministic batch scheduling problems.
We analyzed the computational complexity of this problem and showed that it is NP-hard and therefore unlikely to admit a scalable exact method. Eight variants of a fast greedy heuristic were designed to solve the scheduling problem of interest. Extensive computational experiments were carried out to compare the performance of the heuristics with estimated optimal values (using the Weibull technique) and also to assess their relative effectiveness; this showed that the heuristics are capable of consistently obtaining near-(estimated) optimal solutions with a very low computational burden for the solution of large-scale problems. Finally, a comprehensive sensitivity analysis was carried out to study the influence of a few parameters, by changing them one at a time, on the performance of the heuristic algorithms. This type of analysis gives users some confidence in the robustness of the proposed heuristics.
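For readers unfamiliar with this class of problems, the sketch below shows the general shape of a greedy heuristic for parallel, non-identical batch processors with incompatible job families: pick a family by a weighted-processing-time style index, fill a batch up to the chosen processor's capacity, and assign it to the earliest-available processor. The data and the selection rule are invented; this is not a reproduction of the thesis's A1-A4 or MA1-MA4 heuristics.

```python
# Greedy batch-scheduling sketch for parallel, non-identical batch processors with
# incompatible job families (illustrative only).
jobs = [  # (job id, family, size, priority weight)
    (1, "F1", 30, 2), (2, "F1", 25, 1), (3, "F2", 40, 3),
    (4, "F2", 20, 1), (5, "F1", 15, 2), (6, "F2", 35, 2),
]
family_time = {"F1": 6.0, "F2": 9.0}       # batch processing time per family
processors = {"B1": {"cap": 60, "free": 0.0}, "B2": {"cap": 80, "free": 0.0}}

pending = {f: [j for j in jobs if j[1] == f] for f in family_time}
schedule = []

while any(pending.values()):
    # family selection: smallest processing time per unit of waiting weighted work
    fam = min((f for f in pending if pending[f]),
              key=lambda f: family_time[f] / sum(j[3] for j in pending[f]))
    proc = min(processors, key=lambda p: processors[p]["free"])   # earliest available
    cap = processors[proc]["cap"]
    # best-fit style fill: largest jobs first until the capacity is exhausted
    batch, used = [], 0
    for job in sorted(pending[fam], key=lambda j: -j[2]):
        if used + job[2] <= cap:
            batch.append(job)
            used += job[2]
    if not batch:          # safety: no remaining job of this family fits
        break
    pending[fam] = [j for j in pending[fam] if j not in batch]
    start = processors[proc]["free"]
    processors[proc]["free"] = start + family_time[fam]
    schedule.append((proc, fam, [j[0] for j in batch], start, used / cap))

for entry in schedule:
    print(entry)   # (processor, family, job ids, start time, batch utilization)
```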
12

Pineau, Jean-François. "Communication-aware scheduling on heterogeneous master-worker platforms." PhD thesis, École normale supérieure de Lyon - ENS LYON, 2008. http://tel.archives-ouvertes.fr/tel-00530131.

Full text
Abstract:
The work presented in this thesis deals with various scheduling techniques for independent tasks on large-scale distributed master-worker platforms, when task communication times are taken into account through realistic models. The contributions of this thesis lie at three levels: 1) Parallel algorithmics: we established the complexity of scheduling independent tasks on a heterogeneous platform when communications are modelled with a one-port model, considering several sources of heterogeneity and several objective functions; 2) Matrix product: we computed the theoretical bound on the minimum communication volume needed to perform a matrix product whose data are centralized and where the memory of the workers is limited, and we defined an efficient memory-sharing algorithm that induces a communication volume close to the theoretical bound; we then extended this algorithm to heterogeneous platforms; 3) Scheduling: for applications consisting of a very large number of independent tasks with identical characteristics, we studied, in steady state, how to minimize the delay of each application when several of them compete for the computing resources, and how to minimize the power consumption of the platform when a single application is deployed.
13

Sai, Ranga Prashanth C. "Algorithms for task scheduling in heterogeneous computing environments." Auburn, Ala., 2006. http://repo.lib.auburn.edu/2006%20Fall/SAI_RANGA_58.pdf.

Full text
14

Lyerly, Robert Frantz. "Automatic Scheduling of Compute Kernels Across Heterogeneous Architectures." Thesis, Virginia Tech, 2014. http://hdl.handle.net/10919/78130.

Full text
Abstract:
The world of high-performance computing has shifted from increasing single-core performance to extracting performance from heterogeneous multi- and many-core processors due to the power, memory and instruction-level parallelism walls. All trends point towards increased processor heterogeneity as a means of increasing application performance, from smartphones to servers. These various architectures are designed for different types of applications: traditional "big" CPUs (like the Intel Xeon) are optimized for low latency, while other architectures (such as the NVidia Tesla K20x) are optimized for high throughput. These architectures have different tradeoffs and different performance profiles, meaning fantastic performance gains for the right types of applications. However, applications that are ill-suited for a given architecture may experience significant slowdown; therefore, it is imperative that applications are scheduled onto the correct processor. In order to perform this scheduling, applications must be analyzed to determine their execution characteristics. Traditionally this application-to-hardware mapping was determined statically by the programmer. However, this requires intimate knowledge of the application and underlying architecture, and precludes load-balancing by the system. We demonstrate and empirically evaluate a system for automatically scheduling compute kernels by extracting program characteristics and applying machine learning techniques. We develop a machine learning process that is system-agnostic and works for a variety of contexts (e.g. embedded, desktop/workstation, server). Finally, we perform scheduling in a workload-aware and workload-adaptive manner for these compute kernels.
Master of Science
15

Scogland, Thomas R. "Runtime Adaptation for Autonomic Heterogeneous Computing." Diss., Virginia Tech, 2014. http://hdl.handle.net/10919/71315.

Full text
Abstract:
Heterogeneity is increasing across all levels of computing, with the rise of accelerators such as GPUs, FPGAs, and other coprocessors into everything from cell phones to supercomputers. More quietly it is increasing with the rise of NUMA systems, hierarchical caching, OS noise, and a myriad of other factors. As heterogeneity becomes a fact of life, efficiently managing heterogeneous compute resources is becoming a critical, and ever more complex, task. The focus of this dissertation is to lay the foundation for an autonomic system for heterogeneous computing, employing runtime adaptation to improve performance portability and performance consistency while maintaining or increasing programmability. We investigate heterogeneity arising from a myriad of factors, grouped into the dimensions of locality and capability. This work has resulted in runtime schedulers capable of automatically detecting and mitigating heterogeneity in physically homogeneous systems through MPI and adaptive coscheduling for physically heterogeneous accelerator based systems as well as a synthesis of the two to address multiple levels of heterogeneity as a coherent whole. We also discuss our current work towards the next generation of fine-grained scheduling and synchronization across heterogeneous platforms in the design of a highly-scalable and portable concurrent queue for many-core systems. Each component addresses aspects of the urgent need for automated management of the extreme and ever expanding complexity introduced by heterogeneity.
Ph. D.
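One simple form of the runtime adaptation discussed above is adaptive work splitting between devices. The sketch below is illustrative only and not the dissertation's scheduler: it re-divides each iteration of a data-parallel loop between a CPU and a GPU in proportion to the throughput observed in the previous iteration; a real runtime would launch the two parts concurrently rather than back to back.

```python
# Adaptive split of a data-parallel loop across two devices, driven by observed
# throughput (illustrative sketch only; device costs below are invented).
import time

def process_chunk(device, items):
    """Stand-in for real work; pretend the 'gpu' handles ~4x more items per second."""
    per_item = 1e-6 if device == "gpu" else 4e-6
    time.sleep(per_item * items)

ratio = 0.5                                   # fraction of each iteration given to the GPU
for it in range(5):
    n = 200_000
    n_gpu = int(n * ratio)
    n_cpu = n - n_gpu

    t0 = time.perf_counter(); process_chunk("gpu", n_gpu); t_gpu = time.perf_counter() - t0
    t0 = time.perf_counter(); process_chunk("cpu", n_cpu); t_cpu = time.perf_counter() - t0

    thr_gpu = n_gpu / t_gpu if t_gpu > 0 else 0.0
    thr_cpu = n_cpu / t_cpu if t_cpu > 0 else 0.0
    # next iteration: give each device a share proportional to its measured throughput
    ratio = thr_gpu / (thr_gpu + thr_cpu)
    print(f"iter {it}: GPU share -> {ratio:.2f}")
```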
16

Nemirovsky, Daniel A. "Improving heterogeneous system efficiency : architecture, scheduling, and machine learning." Doctoral thesis, Universitat Politècnica de Catalunya, 2017. http://hdl.handle.net/10803/461499.

Full text
Abstract:
Computer architects are beginning to embrace heterogeneous systems as an effective method to utilize increases in transistor densities for executing a diverse range of workloads under varying performance and energy constraints. As heterogeneous systems become more ubiquitous, architects will need to develop novel CPU scheduling techniques capable of exploiting the diversity of computational resources. In recognizing hardware diversity, state-of-the-art heterogeneous schedulers are able to produce significant performance improvements over their predecessors and enable more flexible system designs. Nearly all of these, however, are unable to efficiently identify the mapping schemes which will result in the highest system performance. Accurately estimating the performance of applications on different heterogeneous resources can provide a significant advantage to heterogeneous schedulers for identifying a performance maximizing mapping scheme to improve system performance. Recent advances in machine learning techniques including artificial neural networks have led to the development of powerful and practical prediction models for a variety of fields. As of yet, however, no significant leaps have been taken towards employing machine learning for heterogeneous scheduling in order to maximize system throughput. The core issue we approach is how to understand and utilize the rise of heterogeneous architectures, benefits of heterogeneous scheduling, and the promise of machine learning techniques with respect to maximizing system performance. We present studies that promote a future computing model capable of supporting massive hardware diversity, discuss the constraints faced by heterogeneous designers, explore the advantages and shortcomings of conventional heterogeneous schedulers, and pioneer applying machine learning to optimize mapping and system throughput. The goal of this thesis is to highlight the importance of efficiently exploiting heterogeneity and to validate the opportunities that machine learning can offer for various areas in computer architecture.
17

Banino-Rokkones, Cyril. "Algorithmic and Scheduling Techniques for Heterogeneous and Distributed Computing." Doctoral thesis, Norwegian University of Science and Technology, Department of Computer and Information Science, 2007. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-1462.

Full text
Abstract:

The computing and communication resources of high performance computing systems are becoming heterogeneous, are exhibiting performance fluctuations and are failing in an unforeseeable manner. The Master-Slave (MS) paradigm, which decomposes the computational load into independent tasks, is well-suited for operating in these environments due to its loose synchronization requirements. The application tasks can be computed in any order, by any slave, and can be resubmitted in case of slave failures. Although the MS paradigm naturally adapts to dynamic and unreliable environments, it nevertheless suffers from a lack of scalability.

This thesis provides models, techniques and scheduling strategies that improve the scalability and performance of MS applications. In particular, we claim that deploying multiple masters may be necessary to achieve scalable performance. We address the problem of finding the most profitable locations on a heterogeneous Grid for hosting a given number of master processes, such that the total task throughput of the system is maximized. Further, we provide distributed scheduling strategies that better adapt to system load fluctuations than traditional MS techniques. Our strategies are especially efficient when communication is expensive compared to computation (which constitutes the difficult case).

Furthermore, this thesis also investigates the suitability of MS scheduling techniques for the parallelization of stencil code applications. These applications are usually parallelized with domain decomposition methods, which are highly scalable but rather impractical for dealing with heterogeneous, dynamic and unreliable environments. Our experimental results with two scientific applications show that traditional MS tasking techniques can successfully be applied to stencil code applications when the master is used to control the parallel execution. If the master is used as a data access point, then deploying multiple masters becomes necessary to achieve scalable performance.
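The multi-master question above can be pictured with a small model: given per-node compute rates and per-node dispatch capacities, estimate the steady-state task throughput of a candidate set of master locations and keep the best set. The sketch below does this exhaustively on an invented five-node platform; the thesis develops much richer platform models and distributed strategies.

```python
# Choosing master locations to maximize steady-state task throughput
# (illustrative sketch only; all platform data below are invented).
import itertools

compute = {"n1": 4.0, "n2": 3.0, "n3": 6.0, "n4": 2.0, "n5": 5.0}    # tasks/s per node
dispatch = {"n1": 8.0, "n2": 12.0, "n3": 5.0, "n4": 10.0, "n5": 7.0}  # tasks/s a master can serve
dist = {  # symmetric hop distances between nodes
    frozenset(p): d for p, d in {
        ("n1", "n2"): 1, ("n1", "n3"): 2, ("n1", "n4"): 3, ("n1", "n5"): 2,
        ("n2", "n3"): 1, ("n2", "n4"): 2, ("n2", "n5"): 3,
        ("n3", "n4"): 1, ("n3", "n5"): 2, ("n4", "n5"): 1,
    }.items()
}

def throughput(masters):
    """Each slave pulls tasks from its closest master; a master's output is capped
    by its dispatch capacity; a node hosting a master contributes no slave compute."""
    load = {m: 0.0 for m in masters}
    for node, rate in compute.items():
        if node in masters:
            continue
        closest = min(masters, key=lambda m: dist[frozenset((node, m))])
        load[closest] += rate
    return sum(min(load[m], dispatch[m]) for m in masters)

k = 2
best = max(itertools.combinations(compute, k), key=throughput)
print(best, throughput(best))
```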

18

Rosenvinge, Einar Magnus. "Online Task Scheduling on Heterogeneous Clusters : An Experimental Study." Thesis, Norwegian University of Science and Technology, Department of Computer and Information Science, 2004. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-278.

Full text
Abstract:

We study the problem of scheduling applications composed of a large number of tasks on heterogeneous clusters. Tasks are identical, independent from each other, and can hence be computed in any order. The goal is to execute all the tasks as quickly as possible. We use the Master-Worker paradigm, where tasks are maintained by the master, which hands out batches of a variable number of tasks to requesting workers. We introduce a new scheduling strategy, the Monitor strategy, and compare it to other strategies suggested in the literature. An image filtering application, known as matched filtering, has been used to compare the different strategies. Our implementation involves data-staging techniques in order to circumvent the possible bottleneck incurred by the master, and multi-threading to prevent possible processor idleness.
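A minimal master-worker skeleton in the spirit of the batching strategies compared above (not the Monitor strategy itself): workers request work, and the master hands each one a batch sized to its most recently observed processing rate. Task counts, worker speeds and the sizing rule are all invented.

```python
# Master-worker sketch with per-worker adaptive batch sizes (illustrative only).
import threading, time

TOTAL_TASKS = 1000
remaining = TOTAL_TASKS
lock = threading.Lock()
rate = {}                                   # observed tasks/s per worker

def grab_batch(worker, target_seconds=0.05):
    """Give each worker roughly target_seconds of work based on its measured rate."""
    global remaining
    with lock:
        if remaining == 0:
            return 0
        size = max(1, int(rate.get(worker, 100) * target_seconds))
        size = min(size, remaining)
        remaining -= size
        return size

def worker(name, speed):
    while True:
        batch = grab_batch(name)
        if batch == 0:
            return
        start = time.perf_counter()
        time.sleep(batch / speed)           # stand-in for computing the batch
        rate[name] = batch / (time.perf_counter() - start)

threads = [threading.Thread(target=worker, args=(f"w{i}", s))
           for i, s in enumerate((500, 1500, 3000))]   # heterogeneous worker speeds
for t in threads: t.start()
for t in threads: t.join()
print("all", TOTAL_TASKS, "tasks done; observed rates:", {k: round(v) for k, v in rate.items()})
```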

19

Kumar, Suraj. "Scheduling of Dense Linear Algebra Kernels on Heterogeneous Resources." Thesis, Bordeaux, 2017. http://www.theses.fr/2017BORD0572/document.

Full text
Abstract:
Due to the massive computational power of accelerators such as GPUs and the Xeon Phi, multicore machines equipped with accelerators are becoming popular in High Performance Computing (HPC). The added complexity has led to the development of different task-based runtime systems, which allow computations to be expressed as graphs of tasks and rely on runtime systems to schedule those tasks among all resources of the platform. The real challenge is to design efficient schedulers for such runtimes that make effective use of all resources. Developing good schedulers, even for a single hybrid node, and analyzing them can thus have a strong impact on the performance of current HPC systems. We consider the problem of scheduling dense linear algebra applications on fully hybrid platforms made of CPUs and GPUs. The relative performance of CPU and GPU depends highly on the sub-routine: for instance, GPUs are much more efficient at processing matrix-matrix multiplications than matrix factorizations. In this thesis, we analyze the performance of static and dynamic scheduling strategies and we propose a set of intermediate strategies, by adding static (resp. dynamic) features into dynamic (resp. static) strategies. A resource-centric dynamic scheduler, HeteroPrio, based on the affinity between tasks and resources, was recently proposed for a set of small independent tasks on two types of resources. We extend and analyze this scheduler for general task graphs, first on two types of resources and then on more than two types. Additionally, we provide approximation ratios and worst-case examples of HeteroPrio for a set of independent tasks on different platform sizes.
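The core intuition behind HeteroPrio for independent tasks, as summarized above, can be sketched as follows: GPUs pick the tasks they accelerate the most, while CPUs pick the tasks that lose the least by staying on the CPU. The task set and timings below are invented, and the sketch leaves out the extensions to general task graphs and the approximation analysis the thesis provides.

```python
# Sketch of the affinity ordering at the heart of HeteroPrio (illustrative only).
import heapq

tasks = {           # task id -> (cpu time, gpu time), all values invented
    "gemm1": (10.0, 1.0), "gemm2": (9.0, 1.1), "potrf": (4.0, 2.5),
    "trsm":  (6.0, 1.5), "syrk": (7.0, 1.2), "small": (1.0, 0.9),
}
accel = {t: c / g for t, (c, g) in tasks.items()}      # GPU acceleration factor

# Two priority views over the same task set.
gpu_order = [(-accel[t], t) for t in tasks]            # most accelerated first
cpu_order = [(accel[t], t) for t in tasks]             # least accelerated first
heapq.heapify(gpu_order); heapq.heapify(cpu_order)

done, busy_until = set(), {"cpu0": 0.0, "cpu1": 0.0, "gpu0": 0.0}

def next_task(order):
    while order:
        _, t = heapq.heappop(order)
        if t not in done:
            done.add(t)
            return t
    return None

while len(done) < len(tasks):
    worker = min(busy_until, key=busy_until.get)       # next idle resource
    is_gpu = worker.startswith("gpu")
    t = next_task(gpu_order if is_gpu else cpu_order)
    if t is None:
        break
    busy_until[worker] += tasks[t][1 if is_gpu else 0]

print(busy_until)   # the makespan is the largest finish time
```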
20

Villebonnet, Violaine. "Scheduling and Dynamic Provisioning for Energy Proportional Heterogeneous Infrastructures." Thesis, Lyon, 2016. http://www.theses.fr/2016LYSEN057/document.

Full text
Abstract:
The increasing number of data centers raises serious concerns regarding their energy consumption. These infrastructures are often over-provisioned and contain servers that are not fully utilized. The problem is that inactive servers can consume as much as 50% of their peak power consumption. This thesis proposes a novel approach for building data centers so that their energy consumption is proportional to the actual load. We propose an original infrastructure named BML, for "Big, Medium, Little", composed of heterogeneous computing resources: from low-power processors to classical servers. The idea is to take advantage of their different characteristics in terms of energy consumption, performance, and switch-on reactivity to adjust the composition of the infrastructure according to load evolutions. We define a generic methodology to compute the most energy-proportional combinations of machines based on hardware profiling data. We focus on web applications whose load varies over time and design a scheduler that dynamically reconfigures the infrastructure, with application migrations and machine switch-on and switch-off, to minimize the infrastructure energy consumption according to the current application requirements. We have developed two different dynamic provisioning algorithms which take into account the time and energy overheads of the different reconfiguration actions in the decision process. We demonstrate through simulations based on experimentally acquired hardware profiles that we achieve important energy savings compared to classical data center infrastructures and management.
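The provisioning step described above can be illustrated with a small brute-force search: given profiled capacity and power for each machine type, pick the combination that covers the current load with the least power. The machine profiles below are invented and the search is exhaustive; the thesis's BML methodology and its treatment of reconfiguration costs are considerably more involved.

```python
# Picking the most energy-efficient machine combination for a given load
# (illustrative sketch of the idea behind energy-proportional provisioning).
from itertools import product

machine_types = {          # type -> (capacity in requests/s, power in watts, available units)
    "little": (150, 6, 8),     # low-power boards
    "medium": (600, 45, 4),
    "big":    (2500, 180, 2),  # classical servers
}

def best_combination(load):
    best = None
    ranges = [range(avail + 1) for _, _, avail in machine_types.values()]
    for counts in product(*ranges):
        cap = sum(n * machine_types[t][0] for n, t in zip(counts, machine_types))
        if cap < load:
            continue
        watts = sum(n * machine_types[t][1] for n, t in zip(counts, machine_types))
        if best is None or watts < best[0]:
            best = (watts, dict(zip(machine_types, counts)))
    return best

for load in (100, 900, 3000, 6000):
    print(load, "req/s ->", best_combination(load))
```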
21

Lynch, Gerard. "Parallel job scheduling on heterogeneous networks of multiprocessor workstations." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape7/PQDD_0006/MQ45952.pdf.

Full text
22

Podobas, Artur, Mats Brorsson, and Vladimir Vlassov. "Exploring heterogeneous scheduling using the task-centric programming model." KTH, Programvaru- och datorsystem, SCS, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-120436.

Full text
Abstract:
Computer architecture technology is moving towards more heterogeneous solutions, which will contain a number of processing units with different capabilities that may increase the performance of the system as a whole. However, with increased performance comes increased complexity; complexity that is now barely handled in homogeneous multiprocessing systems. The present study tries to solve a small piece of the heterogeneous puzzle: how can we exploit all system resources in a performance-effective and user-friendly way? Our proposed solution includes a run-time system capable of using a variety of different heterogeneous components while providing the user with the already familiar task-centric programming model interface. Furthermore, when dealing with non-uniform workloads, we show that traditional approaches based on centralized or work-stealing queue algorithms do not work well, and we propose a scheduling algorithm based on trend analysis to distribute work in a performance-effective way across resources.

APA, Harvard, Vancouver, ISO, and other styles
23

Teller, Justin Stevenson. "Scheduling Tasks on Heterogeneous Chip Multiprocessors with Reconfigurable Hardware." The Ohio State University, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=osu1211985748.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

BAJAJ, RASHMI. "EFFICIENT TASK SCHEDULING ALGORITHM FOR NETWORK OF HETEROGENEOUS WORKSTATIONS." University of Cincinnati / OhioLINK, 2001. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1000734538.

Full text
APA, Harvard, Vancouver, ISO, and other styles
25

Palli, Kiran Kumar. "Scheduling DAGs for minimum finish time and power consumption on heterogeneous processors." Auburn, Ala., 2005. http://repo.lib.auburn.edu/2005%20Summer/master's/PALLI_KIRANKUMAR_13.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Tumanov, Alexey. "Scheduling with Space-Time Soft Constraints In Heterogeneous Cloud Datacenters." Research Showcase @ CMU, 2016. http://repository.cmu.edu/dissertations/865.

Full text
Abstract:
Heterogeneity in modern datacenters is on the rise, in hardware resource characteristics, in workload characteristics, and in dynamic characteristics (e.g., a memory-resident copy of input data). As a result, which machines are assigned to a given job can have a significant impact. For example, a job may run faster on the same machine as its input data or with a given hardware accelerator, while still being runnable on other machines, albeit less efficiently. Heterogeneity takes on more complex forms as sets of resources differ in the level of performance they deliver, even if they consist of identical individual units, such as with rack-level locality. We refer to this as combinatorial heterogeneity. Mixes of jobs with strict SLOs on completion time and increasingly available runtime estimates in production datacenters deepen the challenge of matching the right resources to the right workloads at the right time. In this dissertation, we hypothesize that it is possible and beneficial to simultaneously leverage all of this information in the form of declaratively specified space-time soft constraints. To accomplish this, we first design and develop our principal building block: a novel Space-Time Request Language (STRL). It enables the expression of jobs' preferences and flexibility in a general, extensible way by using a declarative, composable, intuitive algebraic expression structure. Second, building on the generality of STRL, we propose an equally general STRL Compiler that automatically compiles STRL expressions into Mixed Integer Linear Programming (MILP) problems that can be aggregated and solved to maximize the overall value of shared cluster resources. These theoretical contributions form the foundation for the system we architect, called TetriSched, that instantiates our conceptual contributions: (a) declarative soft constraints, (b) space-time soft constraints, (c) combinatorial constraints, (d) orderless global scheduling, and (e) in situ preemption. We also propose a set of mechanisms that extend the scope and the practicality of TetriSched's deployment by analyzing and improving on its scalability, enabling and studying the efficacy of preemption, and featuring a set of runtime mis-estimation handling mechanisms to address runtime prediction inaccuracy. In collaboration with Microsoft, we adapt some of these ideas as we design and implement a heterogeneity-aware resource reservation system called Aramid with support for ordinal placement preferences, targeting deployment in production clusters at Microsoft scale. A combination of simulation and real cluster experiments with synthetic and production-derived workloads, a range of workload intensities, degrees of burstiness, preference strengths, and input inaccuracies supports our hypothesis that leveraging space-time soft constraints (a) significantly improves scheduling quality and (b) is possible to achieve in a practical deployment.
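STRL itself is not reproduced here; purely as a loose analogy to space-time soft constraints, the sketch below encodes a job's alternatives (machine count, duration, and a value that decays if the deadline is missed) and picks the highest-value alternative that fits the currently free capacity. All names and numbers are invented.

    import dataclasses

    @dataclasses.dataclass
    class Option:
        machines: int      # how many machines this alternative needs
        duration: float    # hours it would run
        base_value: float  # value if it completes by the deadline

    def option_value(opt, start, deadline, late_decay=0.5):
        """Full value if the option finishes by the deadline, decayed value otherwise."""
        return opt.base_value if start + opt.duration <= deadline else opt.base_value * late_decay

    def choose(options, free_machines, now, deadline):
        feasible = [o for o in options if o.machines <= free_machines]
        return max(feasible, key=lambda o: option_value(o, now, deadline), default=None)

    # A job that prefers 4 accelerator-class machines for 1 h but can fall back to 8 generic ones for 3 h.
    prefs = [Option(machines=4, duration=1.0, base_value=100.0),
             Option(machines=8, duration=3.0, base_value=100.0)]
    print(choose(prefs, free_machines=8, now=0.0, deadline=2.0))   # picks the alternative that meets the deadline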
APA, Harvard, Vancouver, ISO, and other styles
27

Park, Yongwon Baskiyar Sanjeev. "Dynamic task scheduling onto heterogeneous machines using Support Vector Machine." Auburn, Ala, 2008. http://repo.lib.auburn.edu/EtdRoot/2008/SPRING/Computer_Science_and_Software_Engineering/Thesis/Park_Yong_50.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Al-Sinayyid, Ali. "JOB SCHEDULING FOR STREAMING APPLICATIONS IN HETEROGENEOUS DISTRIBUTED PROCESSING SYSTEMS." OpenSIUC, 2020. https://opensiuc.lib.siu.edu/dissertations/1868.

Full text
Abstract:
The colossal amounts of data generated daily are increasing exponentially at a never-before-seen pace. A variety of applications, including stock trading, banking systems, health care, the Internet of Things (IoT), and social media networks, among others, have created an unprecedented volume of real-time stream data estimated to reach billions of terabytes in the near future. As a result, we are currently living in the so-called Big Data era and witnessing a transition to the so-called IoT era. Enterprises and organizations are tackling the challenge of interpreting the enormous amount of raw data streams to achieve an improved understanding of data, and thus make efficient and well-informed decisions (i.e., data-driven decisions). Researchers have designed distributed data stream processing systems that can directly process data in near real-time. To extract valuable information from raw data streams, analysts need to create and implement data stream processing applications structured as directed acyclic graphs (DAGs). The infrastructure of distributed data stream processing systems, as well as the various requirements of stream applications, imposes new challenges. Cluster heterogeneity in a distributed environment results in different cluster resources for task execution and data transmission, which makes optimal scheduling an NP-complete problem. Scheduling streaming applications plays a key role in optimizing system performance, particularly in maximizing the frame rate, or how many instances of data sets can be processed per unit of time. The scheduling algorithm must consider data locality, resource heterogeneity, and communication and computation latencies. The latencies associated with the bottleneck from computation or transmission need to be minimized when mapped to the heterogeneous and distributed cluster resources. Recent work on task scheduling for distributed data stream processing systems has a number of limitations. Most current schedulers are not designed to manage heterogeneous clusters. They also lack the ability to consider both task and machine characteristics in scheduling decisions. Furthermore, current default schedulers do not allow the user to control data locality aspects in application deployment. In this thesis, we investigate the problem of scheduling streaming applications on a heterogeneous cluster environment and develop the maximum throughput scheduler algorithm (MT-Scheduler) for streaming applications. The proposed algorithm uses a dynamic programming technique to efficiently map the application topology onto a heterogeneous distributed system based on computing and data transfer requirements, while also taking into account the capacity of the underlying cluster resources. The proposed approach maximizes the system throughput by identifying and minimizing the time incurred at the computing/transfer bottleneck. The MT-Scheduler supports scheduling applications that are structured as a DAG, such as Amazon Timestream, Google Millwheel, and Twitter Heron. We conducted experiments using three Storm microbenchmark topologies in both simulated and real Apache Storm environments. To evaluate performance, we compared the proposed MT-Scheduler with the simulated round-robin and the default Storm scheduler algorithms. The results indicated that the MT-Scheduler outperforms the default round-robin approach in terms of both average system latency and throughput.
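The MT-Scheduler itself is not reproduced here; as a simplified illustration of the bottleneck idea, the sketch below maps a linear chain of streaming operators onto an ordered list of heterogeneous machines (contiguous groups only, communication costs ignored) and uses a small dynamic program to minimize the slowest stage, since steady-state throughput is the reciprocal of that bottleneck time. Operator costs and machine speeds are invented.

    from functools import lru_cache

    work = [4.0, 2.0, 6.0, 3.0]   # hypothetical per-tuple cost of each operator in the chain
    speed = [1.0, 2.0, 1.5]       # hypothetical relative speeds of three machines

    def min_bottleneck(work, speed):
        """Smallest achievable max stage time when operators are split into contiguous groups."""
        n, m = len(work), len(speed)
        prefix = [0.0]
        for w in work:
            prefix.append(prefix[-1] + w)

        @lru_cache(maxsize=None)
        def solve(i, k):
            # Best bottleneck for operators i.. assigned to machines k.. (a machine may get none).
            if i == n:
                return 0.0
            if k == m - 1:                    # last machine takes everything that is left
                return (prefix[n] - prefix[i]) / speed[k]
            best = float("inf")
            for j in range(i, n + 1):         # machine k runs operators i..j-1
                stage = (prefix[j] - prefix[i]) / speed[k]
                best = min(best, max(stage, solve(j, k + 1)))
            return best

        return solve(0, 0)

    print(min_bottleneck(work, speed))        # throughput would be 1 / this bottleneck time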
APA, Harvard, Vancouver, ISO, and other styles
29

Nguyen, Duc Anh. "A Heterogeneous Platform Description Model For Real-Time Scheduling Tools." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-429193.

Full text
Abstract:
Real-time system analysis tools and WCET analyzers both need information about the hardware to obtain their results. However, current hardware models are either too abstract or too detailed. Therefore, this thesis proposes a new hardware description model that strikes a balance between the two extremes; it is easy to use for users without hardware expertise while not compromising too much accuracy. Along with the new model, an accompanying application is implemented to help the user work with the new model more easily through its drag-and-drop interface. Furthermore, the application can function as a WCET analyzer by connecting to the Gem5 simulator, which facilitates simulating and measuring the execution time of programs.
APA, Harvard, Vancouver, ISO, and other styles
30

Vellore, Suriyakumar Avinankumar. "Statically configured heterogeneous SMT processor." Diss., Online access via UMI, 2009.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
31

Katayama, Fabio Massaaki. "O problema da troca de mensagens de diferentes tamanhos em redes multi-aglomerados." Universidade de São Paulo, 2006. http://www.teses.usp.br/teses/disponiveis/45/45134/tde-01082007-002437/.

Full text
Abstract:
Com o aumento no uso de aglomerados e grades de computadores, cresce o interesse no estudo de comunicações entre processadores. Em um computador paralelo dedicado, ou em uma rede local homogênea, o tempo de comunicação é geralmente modelado de forma similar, independente de quais processadores estão se comunicando. Em uma rede onde os links entre os computadores são heterogêneos, computadores mais próximos tendem a apresentar menor latência e maior largura de banda do que computadores distantes. Além disso, a largura de banda agregada é diferente dependendo do número de conexões simultâneas existentes entre dois aglomerados distantes. Neste trabalho estudaremos a troca completa de mensagens de tamanhos diferentes entre aglomerados interligados por backbones. Proporemos um novo algoritmo de comunicação baseado em algoritmos conhecidos, apresentaremos simulações de escalonamentos dos algoritmos estudados para esta rede multi-aglomerado e analisaremos os resultados destas simulações.
The growth in popularity of clusters and computational grids has increased interest in the study of interprocessor communication. The communication time in a dedicated parallel computer or in a local homogeneous network is modeled in a similar way, regardless of which processors are communicating. In a network with heterogeneous links, nearby computers generally have lower latency and larger bandwidth than wide-area computers. In addition, the aggregated bandwidth depends on the number of simultaneous connections between two wide-area clusters. In this work we study the complete exchange of messages of different sizes between clusters interconnected by a backbone. We propose a new communication algorithm based on known algorithms, present scheduling simulations of the studied algorithms in this multi-cluster network, and analyze the results of these simulations.
APA, Harvard, Vancouver, ISO, and other styles
32

Pang, Yihan. "Leveraging Processor-diversity For Improved Performance In Heterogeneous-ISA Systems." Thesis, Virginia Tech, 2019. http://hdl.handle.net/10919/95299.

Full text
Abstract:
The purpose of this thesis is to investigate the effectiveness of executing High Performance Computing (HPC) workloads on multiprocessors with heterogeneous Instruction Set Architecture (ISA) cores. ISA-heterogeneity in processor designs provides a unique dimension for researchers to explore performance benefits through diversity in design choices. Additionally, each application has a natural preference for one processor in a selected group of processors (we define this term as processor-preference), and processor-preference is highly affected by processor design choices. Thus, a system with heterogeneous-ISA cores offers an intriguing design perspective, packing heterogeneous-ISA cores in the same processor or system so that they compensate for each other under dynamic workload scenarios. This thesis considers dynamically migrating applications with different processor-preferences across ISA-different cores to exploit the potential of this idea. With SIMD instructions getting more attention from chip designers, this thesis also presents the necessary modifications for a general compiler/run-time infrastructure to transform the dynamic program state of SIMD regions at run-time from one ISA format to another for cross-ISA migration and execution. Lastly, this thesis presents a processor-preference-aware scheduling policy that makes dynamic cross-ISA migration decisions that improve overall system throughput compared to homogeneous-ISA systems. This thesis prototypes a heterogeneous-ISA system using an Intel Xeon Gold 5118 x86-64 server and a Cavium ThunderX ARMv8 server and evaluates the effectiveness of our infrastructure and scheduling policy. Our results reveal that heterogeneous-ISA systems that are processor-preference-aware and capable of cross-ISA execution migration can yield throughput gains of up to 36% compared to traditional homogeneous-ISA systems.
Master of Science
The author of this thesis has a family full of non-engineers. To persuade family members that the work of this thesis is meaningful, aka the author is not procrastinating in school, the author decided to draw an analogy between processors and cars. Suppose in an alternative universe, cars (systems) can be powered by engines (processors) that uses two different fuel-sources (ISAs): gasoline or electric (single-ISA) processors but not both (heterogeneous-ISA). Car manufacturers (chip designers) can build engines with different design choices (processors with varying design options): engines combined with turbochargers for gasoline-powered cars, high-performance batteries combined with energy-efficient batteries for electric-powered cars (added extended instruction sets, CPU designs that target vastly different use cases, etc.). However, each design choice is limited to improving performance for a specific type of fuel-source based engine. For example, having battery alternatives has no performance impact on gasoline-powered engines. As time passes by, car manufacturers have exhausted options to make a drastic improvement to their existing engine designs (limited performance gains in recent chips). To tackle this problem, in this thesis, the author first examined the usage of cars: driving on the road (running applications). The author's study found that no single engine is suitable for all routes (no single processor is good for all workloads), and cars powered by different fuel-source based engines showed a significant diversity in performance (application performance varies drastically between systems with processors built on different ISAs). Gasoline-powered cars perform well on high-speed roads, whereas electric-powered cars perform well on low-speed roads. Unfortunately, in real life, a person's commute (a workload of applications) consists of a mixture of high-speed roads and low-speed roads, and one cannot know the exact percentage of each kind of path they travel (exact application composition in a workload) beforehand. Therefore it is challenging for a person to make the correct car selection for the incoming commute (choose the right system for a workload). This thesis tries to solve this commuting problem by building a car that has multiple engines fitted to suit different road needs (systems with processors that have vastly different use cases). This thesis looks at a particular dimension of combining various fuel-powered engines in the same car (a system with heterogeneous-ISA processors). The author believes that adding diversity in fuel-powered engine selections provide an exciting dimension in car design choices (adding ISA-heterogeneity in processors provide a unique dimension in system design). Thus, this thesis focuses on estimating a theoretical multi fuel-powered car's performance by combining two different fuel-powered cars into a single mega-car using some framework (Popcorn Linux). This framework allows this mega-car to be driven by a combined fuel source with fuel intake freely transfer between fuel-sources (cross-ISA migration and execution) based on road conditions (application encountered). Based on the evaluation of this new prototype, the author finds that in a real-life scenario (workload with mixed application combination), cars with multiple fuel-source based engines have better performance than two single fuel-source based cars (systems with heterogeneous-ISAs processors perform better than systems with homogeneous-ISAs processors). 
The author hopes that this study can help build the foundation for the development of hybrid cars (a system with heterogeneous ISAs in the same processor) in the future, as well as, for now, the possibility of modifying an existing car into a mega-car with multiple engines suited for different road needs to improve commute performance. Ultimately, this thesis is not about cars. The author hopes that by explaining the research done in this paper through cars, general audiences can understand what this work is trying to investigate and what solution it provides. In this work, we investigate the potential of a system with heterogeneous-ISA processors. This thesis prototypes one such system and finds that heterogeneous-ISA systems have performance benefits over traditional homogeneous-ISA systems across a series of experimental evaluations.
APA, Harvard, Vancouver, ISO, and other styles
33

Rehn-Sonigo, Veronika. "Multi-criteria Mapping and Scheduling of Workflow Applications onto Heterogeneous Platforms." Phd thesis, Ecole normale supérieure de lyon - ENS LYON, 2009. http://tel.archives-ouvertes.fr/tel-00424118.

Full text
Abstract:
The work presented in this thesis deals with the mapping and scheduling of workflow applications onto heterogeneous platforms. In this context, we focus on three different types of applications:
Replica placement in hierarchical networks - In this kind of application, several clients issue requests to a few servers and the question is: where should replicas be placed in the network so that all requests can be processed? We discuss and compare several replica placement policies in hierarchical networks subject to server capacity, quality of service and bandwidth constraints. The client requests are known beforehand, while the number and location of the servers are to be determined. The traditional approach in the literature is to force all requests of a client to be processed by the closest server in the hierarchical network. We introduce and study two new policies. One major contribution of this work is to assess the impact of these new policies on the total replication cost. Another important goal is to assess the impact of server heterogeneity, from both a theoretical and a practical perspective. We establish several new complexity results and present several efficient polynomial-time heuristics.
Pipeline workflow applications - We consider workflow applications that can be expressed as linear graphs. An example of this application type is digital image processing, where images are processed in steady-state mode. Several antagonistic criteria must be optimized, such as throughput and latency (or a combination of both), as well as latency and reliability (i.e., the probability that the computation succeeds) of the application. While simple polynomial algorithms can be found for fully homogeneous platforms, the problem becomes NP-hard when tackling heterogeneous platforms. We present a linear programming formulation for this latter problem. Furthermore, we introduce several efficient polynomial-time bi-criteria heuristics, whose relative performance is evaluated through extensive simulations. In a case study, we present simulations and experimental results (programmed in MPI) for the application graph of the JPEG encoder on a computing cluster.
Complex streaming applications - We consider the execution of applications structured as trees of operators, i.e., the steady-state application of one or several operator trees to multiple data objects that must be continuously updated at various locations in the network. A first goal is to provide the user with a set of processors that should be bought or rented to guarantee that a minimum steady-state throughput of the application is achieved. We then extend our model to multiple applications: several concurrent applications are executed at the same time in a network, and one must ensure that all applications can reach their required throughput. Another contribution of this work is to provide complexity results for various instances of the problem. The third contribution is the design of several polynomial-time heuristics for both application models. A primary objective of the heuristics for concurrent applications is the reuse of intermediate results that are shared among different applications.
APA, Harvard, Vancouver, ISO, and other styles
34

Karaduman, Gulsah. "Scheduling Approaches For Parameter Sweep Applications In A Heterogeneous Distributed Environment." Master's thesis, METU, 2010. http://etd.lib.metu.edu.tr/upload/12612562/index.pdf.

Full text
Abstract:
In this thesis, the focus is on the development of scheduling algorithms for Sim-PETEK, a framework for parallel and distributed execution of simulations. Since it is especially designed for running parameter sweep applications in a heterogeneous distributed computational environment, multi-round and adaptive scheduling approaches are followed. Five different scheduling algorithms are designed and evaluated for scheduling purposes of Sim-PETEK. The development of these algorithms is arranged in such a way that each newly developed algorithm extends the previously developed and evaluated ones. Evaluation of the scheduling algorithms is handled by running a Wireless Sensor Network (WSN) simulation over Sim-PETEK in a heterogeneous distributed computational system formed in TUBITAK UEKAE ILTAREN. This evaluation not only compares the scheduling algorithms but also rates them in terms of the optimality principle of divisible load theory, which states that in order to obtain optimal processing time all the processors used in the computation must stop at the same time. Furthermore, this study adapts a scheduling approach from the literature, which uses statistical calibration, to Sim-PETEK and assesses it against the most effective of the five previously evaluated algorithms. The approach found to be the most efficient is utilized as the Sim-PETEK scheduler.
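The optimality principle quoted above has a simple closed form in the idealized case of a perfectly divisible workload with negligible communication overhead: give each processor a share proportional to its speed, so that all of them finish at the same instant. The sketch below shows only this idealized calculation, not Sim-PETEK's actual multi-round scheduler; the speeds are invented.

    def equal_finish_shares(total_work, speeds):
        """Split a divisible workload so that every processor finishes at the same time.

        With share_i = total * s_i / sum(s), processor i finishes at share_i / s_i
        = total / sum(s), which is identical for all processors."""
        s = sum(speeds)
        return [total_work * v / s for v in speeds]

    speeds = [1.0, 2.5, 4.0]                    # hypothetical relative simulation rates
    shares = equal_finish_shares(1500, speeds)
    print(shares)                               # [200.0, 500.0, 800.0]
    print([sh / sp for sh, sp in zip(shares, speeds)])   # the same finish time everywhere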
APA, Harvard, Vancouver, ISO, and other styles
35

Pop, Traian. "Analysis and Optimisation of Distributed Embedded Systems with Heterogeneous Scheduling Policies." Doctoral thesis, Linköping : Department of Computer and Information Science, Linköpings universitet, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-8934.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Pop, Traian. "Scheduling and Optimisation of Heterogeneous Time/Event-Triggered Distributed Embedded Systems." Licentiate thesis, Linköping : Univ, 2003. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-5691.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Pradhan, Shristi Nhuchhe. "Scheduling and power allocation for interference mitigation in heterogeneous cellular networks." Thesis, University of British Columbia, 2014. http://hdl.handle.net/2429/45988.

Full text
Abstract:
The wireless industry is confronted with an exponentially increasing demand for ubiquitous wireless coverage and higher data rates. Recent studies have shown that the spectral efficiency of a point-to-point link in cellular networks is approaching its theoretical limit. This demands an increase in node density in order to further improve network capacity. However, today's networks already have dense deployments, and high inter-cell interference severely limits the cell-splitting gains. Moreover, the associated high capital and operational expenditure further limits the deployment of high-power macro nodes. In this thesis, we investigate Heterogeneous Networks (HetNets), a new paradigm for increasing cellular capacity and coverage to meet the forecasted explosion of data traffic. HetNets consist of low-power nodes, such as pico and femto cells, overlaid on a macrocell network. Nevertheless, the deployment of a large number of small cells overlaying macrocells presents new technical challenges. We focus on interference management issues in HetNets and present user scheduling and power allocation schemes for interference mitigation. We investigate the performance of these techniques through analytical modeling and propose improved solutions using results from the model and computer simulations. First, we propose a scheme to jointly minimize network outage probability and power consumption. Second, we propose a scheme to jointly maximize network throughput and minimize power consumption. Both schemes guarantee Quality of Service (QoS) provisioning in HetNets. We analyze the intrinsic trade-offs between network performance parameters, i.e., outage and power consumption, and throughput and power consumption, using a multi-objective optimization approach. Different user scheduling schemes are adopted, such as best user selection, proportional fairness, and round-robin. Third, we propose an energy-efficient power allocation method and analyze its performance with guaranteed QoS provisioning. For all the proposed algorithms and schemes, we provide extensive simulation-based results.
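Of the user-scheduling schemes named above, proportional fairness is the least self-explanatory; as a generic textbook-style illustration (not the thesis's exact formulation), the sketch below serves, in each slot, the user with the largest ratio of instantaneous achievable rate to exponentially averaged served rate. Rates and the averaging constant are invented.

    def proportional_fair(instant_rates, avg_rates, beta=0.1):
        """One scheduling slot: return the chosen user and the updated average rates."""
        chosen = max(range(len(instant_rates)),
                     key=lambda u: instant_rates[u] / max(avg_rates[u], 1e-9))
        new_avgs = [(1 - beta) * avg + beta * (r if u == chosen else 0.0)
                    for u, (r, avg) in enumerate(zip(instant_rates, avg_rates))]
        return chosen, new_avgs

    # Hypothetical achievable rates (Mbit/s) for a macro user and two picocell users over 3 slots.
    slots = [[5.0, 20.0, 8.0], [6.0, 2.0, 9.0], [5.5, 18.0, 1.0]]
    avgs = [1.0, 1.0, 1.0]
    for rates in slots:
        user, avgs = proportional_fair(rates, avgs)
        print("scheduled user", user, "average served rates", [round(a, 2) for a in avgs])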
APA, Harvard, Vancouver, ISO, and other styles
38

AULUCK, NITIN. "REAL-TIME SCHEDULING ALGORITHMS FOR PRECEDENCE RELATED TASKS ON HETEROGENEOUS MULTIPROCESSORS." University of Cincinnati / OhioLINK, 2005. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1109288052.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Rehn-Sonigo, Véronika. "Multi-criteria Mapping and Scheduling of Workflow Applications onto Heterogeneous Platforms." Lyon, École normale supérieure (sciences), 2009. http://www.theses.fr/2009ENSL0518.

Full text
Abstract:
Les travaux présentés dans cette thèse portent sur le placement et l'ordonnancement d'applications de flux de données sur des plates-formes hétérogènes. Dans ce contexte, nous nous concentrons sur trois types différents d'applications : Placement de répliques dans les réseaux hiérarchiques - Dans ce type d'application, plusieurs clients émettent des requêtes à quelques serveurs et la question est : où doit-on placer des répliques dans le réseau afin que toutes les requêtes puissent être traitées. Nous discutons et comparons plusieurs politiques de placement de répliques dans des réseaux hiérarchiques en respectant des contraintes de capacité de serveur, de qualité de service et de bande-passante. Les requêtes des clients sont connues a priori, tandis que le nombre et la position des serveurs sont à déterminer. L'approche traditionnelle dans la littérature est de forcer toutes les requêtes d'un client à être traitées par le serveur le plus proche dans le réseau hiérarchique. Nous introduisons et étudions deux nouvelles politiques. Une principale contribution de ce travail est l'évaluation de l'impact de ces nouvelles politiques sur le coût total de replication. Un autre but important est d'évaluer l'impact de l'hétérogénéité des serveurs, d'une perspective à la fois théorique et pratique. Nous établissons plusieurs nouveaux résultats de complexité, et nous présentons plusieurs heuristiques efficaces en temps polynomial. Applications de flux de données - Nous considérons des applications de flux de données qui peuvent être exprimées comme des graphes linéaires. Un exemple pour ce type d'application est le traitement numérique d'images, où les images sont traitées en régime permanent. Plusieurs critères antagonistes doivent être optimisés, tels que le débit et la latence (ou une combinaison) ainsi que la latence et la fiabilité (i. E. La probabilité que le calcul soit réussi) de l'application. Bien qu'il soit possible de trouver des algorithmes polynomiaux simples pour les plates-formes entièrement homogènes, le problème devient NP-difficile lorsqu'on s'attaque à des plates-formes hétérogènes. Nous présentons une formulation en programme linéaire pour ce dernier problème. De plus nous introduisons plusieurs heuristiques bi-critères efficaces en temps polynomial, dont la performance relative est évaluée par des simulations extensives. Dans une étude de cas, nous présentons des simulations et des résultats expérimentaux (programmés en MPI) pour le graphe d'application de l'encodeur JPEG sur une grappe de calcul. Applications complexes de streaming - Considérons l'exécution d'applications organisées en arbres d'opérateurs, i. E. L'application en régime permanent d'un ou plusieurs arbres d'opérateurs à données multiples qui doivent être mis à jour continuellement à différents endroits du réseau. Un premier but est de fournir à l'utilisateur un ensemble de processeurs qui doit être acheté ou loué pour garantir que le débit minimum de l'application en régime permanent soit atteint. Puis nous étendons notre modèle aux applications multiples : plusieurs applications concurrentes sont exécutées en même temps dans un réseau, et on doit assurer que toutes les applications puissent atteindre leur débit requis. Une autre contribution de ce travail est d'apporter des résultats de complexité pour des instances variées du problème. La troisième contribution est l'élaboration de plusieurs heuristiques polynomiales pour les deux modèles d'application. 
Un objectif premier des heuristiques pour applications concurrentes est la réutilisation des résultats intermédiaires qui sont partagés parmi différentes applications
The results summarized in this document deal with the mapping and scheduling of workow applications on heterogeneous platforms. In this context, we focus on three different types of streaming applications: Replica placement in tree networks - In this kind of application, clients are issuing requests to some servers and the question is where to place replicas in the network such that all requests can be processed. We discuss and compare several policies to place replicas in tree networks, subject to server capacity, Quality of Service (QoS) and bandwidth constraints. The client requests are known beforehand, while the number and location of the servers have to be determined. The standard approach in the literature is to enforce that all requests of a client be served by the closest server in the tree. We introduce and study two new policies. One major contribution of this work is to assess the impact of these new policies on the total replication cost. Another important goal is to assess the impact of server heterogeneity, both from a theoretical and a practical perspective. We establish several new complexity results, and provide several efficient polynomial heuristics for NP-complete instances of the problem. Pipeline workflow applications - We consider workflow applications that can be expressed as linear pipeline graphs. An example for this application type is digital image processing, where images are treated in steady-state mode. Several antagonist criteria should be optimized, such as throughput and latency (or a combination) as well as latency and reliability (i. E. , the probability that the computation will be successful) of the application. While simple polynomial algorithms can be found for fully homogeneous platforms, the problem becomes NP-hard when tackling heterogeneous platforms. We present an integer linear programming formulation for this latter problem. Furthermore, we provide several efficient polynomial bi-criteria heuristics, whose relative performances are evaluated through extensive simulation. As a case-study, we provide simulations and MPI experimental results for the JPEG encoder application pipeline on a cluster of workstations. Complex streaming applications - We consider the execution of applications structured as trees of operators, i. E. , the application of one or several trees of operators in steady-state to multiple data objects that are continuously updated at various locations in a network. A first goal is to provide the user with a set of processors that should be bought or rented in order to ensure that the application achieves a minimum steady-state throughput, and with the objective of minimizing platform cost. We then extend our model to multiple applications: several concurrent applications are executed at the same time in a network, and one has to ensure that all applications can reach their application throughput. Another contribution of this work is to provide complexity results for different instances of the basic problem, as well as integer linear program formulations of various problem instances. The third contribution is the design of several polynomial-time heuristics, for both application models. One of the primary objectives of the heuristics for concurrent applications is to reuse intermediate results shared by multiple applications
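As a small illustration of the standard closest-server policy discussed above, the sketch below checks whether a given placement of replicas in a tree can absorb all client requests without exceeding per-server capacities when every client is served by the first replica on its path to the root. The tree, capacities and request counts are invented, and the QoS and bandwidth constraints are ignored.

    def closest_policy_feasible(parent, replicas, capacity, requests):
        """parent: child -> parent node; replicas: nodes holding a replica;
        capacity: replica -> max requests; requests: client -> request count."""
        load = {r: 0 for r in replicas}
        for client, req in requests.items():
            node = client
            while node is not None and node not in replicas:
                node = parent.get(node)        # walk towards the root
            if node is None:
                return False                   # no replica above this client
            load[node] += req
        return all(load[r] <= capacity[r] for r in replicas)

    # A toy tree: root -> {a, b}, a -> {c1, c2}, b -> {c3}; clients are the leaves.
    parent = {"a": "root", "b": "root", "c1": "a", "c2": "a", "c3": "b"}
    print(closest_policy_feasible(parent, {"a", "root"}, {"a": 10, "root": 5},
                                  {"c1": 6, "c2": 3, "c3": 4}))   # True: a serves 9, root serves 4

The new policies studied in the thesis relax exactly this closest-server restriction, so the check above only covers the baseline case.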
APA, Harvard, Vancouver, ISO, and other styles
40

Aditi, Tanzim F. "A novel scheduling framework for heterogeneous traffic in the smart grid." Thesis, 2015. http://hdl.handle.net/1959.13/1309565.

Full text
Abstract:
Masters Research - Master of Philosophy (MPhil)
For a resource-constrained network, an efficient packet scheduling technique ensures the maximum possible resource utilization while maintaining acceptable network performance at a given time. The scheduling process is closely associated with quality of service (QoS), which provides customers a certain level of quality assurance while receiving services from the network. For a packet scheduler, the scheduling task becomes even more challenging in a multi-service environment, where the network has to allocate its limited available resources to multiple contending devices/applications with diverse QoS requirements. Hence, optimised distribution of network resources, while still meeting the QoS demands of each traffic class, is the key feature of an efficient packet scheduler. In recent times, the rise of machine-to-machine (M2M) communications, such as the Smart Grid, has further increased the necessity and importance of QoS management and enforcement due to their mission-critical nature. The co-existence of mission-critical M2M traffic and conventional multimedia traffic requires per-packet QoS treatment due to their diverse delay and priority requirements. Consequently, the prevalent practice in conventional networks of supporting session-based QoS management is inadequate and a new QoS paradigm is required. To address the aforementioned challenges, this thesis proposes a novel scheduling framework for dynamic resource allocation, where the allocated resource is a function of the dynamic queue sizes of heterogeneous traffic classes and their projected delays. The delay is managed in such a manner that delay-sensitive traffic is given priority over delay-tolerant traffic, while still meeting the delay bound and maximum queue size. Thus, the framework is able to efficiently handle both base and peak traffic loads and provides guaranteed QoS in a multi-service communications network. The performance of the proposed scheduling framework is first validated with a MATLAB-based Monte-Carlo simulation model under a generic packet networking environment and compared with two conventional scheduling algorithms. The corresponding results show that the proposed scheduling algorithm achieves significantly better performance compared to the other two schemes. The simulation model is then extended to emulate a multi-service Smart Grid environment to demonstrate the performance in a real-life scenario. To validate the proof-of-concept (PoC), an IEEE 802.16-based WiMAX network was used, which serves both M2M traffic and non-M2M traffic simultaneously. The corresponding results are consistent with those obtained with the generic networking model and demonstrate the applicability of the proposed scheduling framework in a challenging environment such as the Smart Grid.
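Purely as an illustration of allocating capacity as a function of queue sizes and projected delays (the thesis's scheduler is more elaborate), the sketch below weights each traffic class by how close its projected delay is to the class delay bound and splits the slot capacity by those weights, so that delay-sensitive traffic is favoured as it approaches its bound. Class names and numbers are invented.

    def allocate(capacity, queues, service_rates, delay_bounds):
        """Split `capacity` (packets per slot) across traffic classes by urgency.

        Projected delay of a class = queue length / recent service rate; the closer it
        gets to the class delay bound, the larger the class's share of this slot."""
        urgency = {}
        for c in queues:
            projected = queues[c] / max(service_rates[c], 1e-9)
            urgency[c] = queues[c] * min(projected / delay_bounds[c], 1.0) + 1e-6
        total = sum(urgency.values())
        return {c: capacity * u / total for c, u in urgency.items()}

    # Hypothetical classes: delay-sensitive metering (M2M) traffic vs. delay-tolerant multimedia.
    queues = {"metering": 40, "multimedia": 200}
    rates = {"metering": 50.0, "multimedia": 400.0}   # packets served per second recently
    bounds = {"metering": 0.5, "multimedia": 5.0}     # delay bounds in seconds
    print(allocate(100, queues, rates, bounds))       # metering gets the larger share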
APA, Harvard, Vancouver, ISO, and other styles
41

Rashid, Faraan. "Low-Feedback Opportunistic Scheduling Schemes for Wireless Networks with Heterogenous Users." Thesis, 2012. http://hdl.handle.net/10754/237211.

Full text
Abstract:
Efficient implementation of resource sharing strategies in a multi-user wireless environment can improve the performance of a network significantly. In this thesis we study various scheduling strategies for wireless networks and address the problem of opportunistically scheduling transmissions using channel-aware schemes. First, we propose a scheme that can handle users with asymmetric channel conditions and is opportunistic in the sense that it exploits the multi-user diversity of the network. The scheme requires the users to have a priori knowledge of their channel distributions. The associated overhead is limited, meaning it offers a reduced feedback load that does not scale with the increasing number of users. The main technique used to shrink the feedback load is a contention-based distributed implementation of a splitting algorithm that does not require explicit feedback to the scheduler from every user. The users find the best among themselves, in a distributed manner, requiring only a ternary broadcast feedback from the scheduler at the end of each mini-slot. In addition, the scheme can also handle fairness constraints in time and throughput to various degrees. Next, we propose another opportunistic scheduler that offers most of the benefits of the previously proposed scheme but is more practical because it can also handle heterogeneous users whose channel distributions are unknown. This new scheme actually reduces the complexity and is also more robust to changing traffic patterns. Finally, we extend both schemes to the scenario where there are fixed thresholds; this enables us to handle opportunistic scheduling in practical systems that can only transmit over a finite number of discrete rates, with the additional benefit that a full feedback session, even from the selected user, is never required.
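The distributed splitting idea can be illustrated with a small centralized simulation: users whose normalized channel gain falls inside the current window contend in a mini-slot, the scheduler's broadcast amounts to idle / success / collision, and the window is adjusted until exactly one user remains. This is a generic sketch with invented parameters, not the thesis's precise scheme.

    import random

    def splitting_select(gains, lo=0.5, hi=1.0, max_minislots=32):
        """Simulate threshold splitting with ternary feedback; returns the best user's index."""
        for _ in range(max_minislots):
            contenders = [u for u, g in enumerate(gains) if lo <= g < hi]
            if len(contenders) == 1:                     # success: best user identified
                return contenders[0]
            if not contenders:                           # idle: the best user lies below lo
                lo, hi = max(0.0, lo - (hi - lo)), lo
            else:                                        # collision: keep only the upper half
                lo = (lo + hi) / 2
        return None

    random.seed(1)
    gains = [random.random() for _ in range(8)]          # hypothetical normalized channel gains
    best = splitting_select(gains)
    print(best, round(gains[best], 3), gains[best] == max(gains))   # the selected user has the best channel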
APA, Harvard, Vancouver, ISO, and other styles
42

Liu, Yih-fang, and 林藝芳. "Probability-based Scheduling Optimization for Multiple Multicast on Heterogenous Networks of Workstations." Thesis, 2001. http://ndltd.ncl.edu.tw/handle/58536557884395162079.

Full text
Abstract:
Master's thesis
National Chung Cheng University
Graduate Institute of Computer Science and Information Engineering
89
In recent years, networks of workstations/PCs (so-called NOWs) have become appealing vehicles for cost-effective parallel computing. Due to the commodity nature of workstations and networking equipment, LAN environments are gradually becoming heterogeneous. The diverse sources of heterogeneity in NOW systems pose a challenge for the design of efficient communication algorithms for this class of systems. Some efficient algorithms were designed for multiple multicast in heterogeneous NOW systems (so-called HNOW systems) in [13]. Although the simulation results demonstrate the performance advantage of those algorithms on systems of up to 100 nodes, the communication model those algorithms are based on does not incorporate contention among the communications. Such contention appears in real HNOW systems and has a considerable effect on performance. We modified the communication model in [13] to incorporate contention among the communications. We also designed new algorithms based on the new model. These algorithms are not only based on Fast-Edge-First (FEF) and Earliest-Completion-First (ECF), but also consider the effect of contention among the communications. The simulation results of the algorithms based on the new model demonstrate a performance advantage of more than 20 percent for multiple multicast over the algorithms based on the old model in [13].
APA, Harvard, Vancouver, ISO, and other styles
43

Ho-Hsuan, Lee. "Periodic Job Scheduling in Heterogeneous Environments." 2007. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0001-3001200712481400.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Raravi, Gurulingesh. "Real-Time Scheduling on Heterogeneous Multiprocessors." Tese, 2014. http://hdl.handle.net/10216/74684.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Lee, Ho-Hsuan, and 李龢軒. "Periodic Job Scheduling in Heterogeneous Environments." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/85708605947180643608.

Full text
Abstract:
Master's thesis
National Taiwan University
Graduate Institute of Computer Science and Information Engineering
95
This paper considers a scheduling problem for periodic jobs in a heterogeneous environment. There are m heterogeneous processors and n identical jobs. Jobs become available one at a time, periodically. Each job is assigned to a processor. The goal is to minimize the sum of the completion times of all jobs. We propose a Minimum-Completion-First (MCF) algorithm for scheduling identical, periodic jobs on heterogeneous processors. We show that MCF is optimal under the restriction that the number of jobs is smaller than the number of time units the fastest processor needs to process a job. We also conduct experiments to illustrate that MCF produces excellent schedules in general cases.
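A literal reading of the Minimum-Completion-First rule gives a one-line greedy (the thesis's model details and optimality proof are not reproduced here): when a job is released, place it on the processor that yields the smallest completion time for that job. The numbers below are invented.

    def mcf_schedule(n_jobs, period, proc_times):
        """Assign identical periodic jobs (one released every `period` time units) to
        heterogeneous processors, always choosing the minimum completion time."""
        free_at = [0.0] * len(proc_times)        # when each processor next becomes idle
        total_completion = 0.0
        for k in range(n_jobs):
            release = k * period
            # Completion time if job k were placed on processor p:
            completions = [max(free_at[p], release) + proc_times[p]
                           for p in range(len(proc_times))]
            p = min(range(len(proc_times)), key=lambda i: completions[i])
            free_at[p] = completions[p]
            total_completion += completions[p]
        return total_completion

    # 6 jobs released every 2 time units; the processors need 3, 5 and 9 units per job.
    print(mcf_schedule(6, 2, [3, 5, 9]))         # sum of completion times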
APA, Harvard, Vancouver, ISO, and other styles
46

Raravi, Gurulingesh. "Real-Time Scheduling on Heterogeneous Multiprocessors." Doctoral thesis, 2014. https://repositorio-aberto.up.pt/handle/10216/72091.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Raravi, Gurulingesh. "Real-Time Scheduling on Heterogeneous Multiprocessors." Tese, 2013. https://repositorio-aberto.up.pt/handle/10216/72091.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Weissman, Jon. "Scheduling parallel computations in a heterogeneous environment /." 1995. http://www.lib.virginia.edu/etd/diss/SEAS/ComputerScience/1995/1995_06.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

Lin, Ting-Chou, and 林廷舟. "Job Dispatching and Scheduling under Heterogeneous Clusters." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/92984254838389866682.

Full text
Abstract:
Master's thesis
National Taiwan University
Graduate Institute of Networking and Multimedia
103
Many enterprises and institutes are building private clouds by establishing their own data centers. In such data centers, the physical machines can differ due to annual upgrades, but the number of machines is fixed most of the time. In such a heterogeneous environment, scheduling jobs with different resource requirements and characteristics so as to meet different timing constraints is important. In this paper, we propose a cloud resource management framework that dynamically adjusts the number of computation nodes for every job in the system.
APA, Harvard, Vancouver, ISO, and other styles
50

Mariano, Artur Miguel Matos. "Scheduling (ir)regular applications on heterogeneous platforms." Master's thesis, 2012. http://hdl.handle.net/1822/28068.

Full text
Abstract:
Master's dissertation in Informatics Engineering
Current computational platforms have become increasingly heterogeneous and parallel over the last years, as a consequence of incorporating accelerators whose architectures are parallel and different from the CPU. As a result, several frameworks were developed to aid in programming these platforms, mainly targeting better productivity ratios. In this context, the GAMA framework is being developed by the research group involved in this work, targeting both regular and irregular algorithms to run efficiently on heterogeneous platforms. Scheduling is a key issue for GAMA-like frameworks. The state-of-the-art solutions for scheduling on heterogeneous platforms are efficient for regular applications but lack adequate mechanisms for irregular ones. The scheduling of irregular applications is particularly complex due to the unpredictability of, and the differences in, the execution time of their constituent computational tasks. This dissertation comprises the design and validation of a dynamic scheduler's model and implementation, to simultaneously address regular and irregular algorithms. The devised scheduling mechanism is validated within the GAMA framework when running relevant scientific algorithms, which include SAXPY, the Fast Fourier Transform and two n-Body solvers. The proposed mechanism is validated regarding its efficiency in finding good scheduling decisions and the efficiency and scalability of GAMA when using it. The results show that the model of the devised dynamic scheduler is capable of working on heterogeneous systems with high efficiency, finding good scheduling decisions in the general tested cases. It not only reaches the scheduling decision that reflects the real capacity of the devices in the platform, but also enables GAMA to achieve more than 100% efficiency, as defined in [3], when running a relevant scientific irregular algorithm. Under the designed scheduling model, GAMA was also able to beat efficient CPU and GPU libraries for SAXPY, an important scientific algorithm. GAMA's scalability under the devised dynamic scheduler was also demonstrated, as it properly leveraged the platform's computational resources in trials with one central quad-core CPU chip and two GPU accelerators.
As plataformas computacionais actuais tornaram-se cada vez mais heterogéneas e paralelas nos últimos anos, como consequência de integrarem aceleradores cujas arquitecturas são paralelas e distintas do CPU. Como resultado, várias frameworks foram desenvolvidas para programar estas plataformas, com o objectivo de aumentar os níveis de produtividade de programação. Neste sentido, a framework GAMA está a ser desenvolvida pelo grupo de investigação envolvido nesta tese, tendo como objectivo correr eficientemente algoritmos regulares e irregulares em plataformas heterogéneas. Um aspecto chave no contexto de frameworks congéneres ao GAMA é o escalonamento. As soluções que compõem o estado da arte de escalonamento em plataformas heterogéneas são eficientes para aplicaçóes regulares, mas ineficientes para aplicações irregulares. O escalonamento destas é particularmente complexo devido à imprevisibilidade e ás diferenças no tempo de computação das tarefas computacionais que as compõem. Esta dissertação propõe o design e validação de um modelo de escalonamento e respectiva implementação, que endereça tanto aplicações regulares como irregulares. O mecanismo de escalonamento desenvolvido é validado na framework GAMA, executando algoritmos científicos relevantes, que incluem a SAXPY, a Transformada Rápida de Fourier e dois algoritmos de resolução do problema n-Corpos. O mecanismo proposto é validado quanto à sua eficiência em encontrar boas decisões de escalonamento e quanto à eficiência e escalabilidade do GAMA, quando fazendo uso do mesmo. Os resultados obtidos mostram que o modelo de escalonamento proposto é capaz de executar em plataformas heterogéneas com alto grau de eficiência, uma vez que encontra boas decisões de escalonamento na generalidade dos casos testados. Além de atingir a decisão de escalonamento que melhor representa o real poder computacional dos dispositivos na plataforma, também permite ao GAMA atingir mais de 100% de eficiência tal como definida em [3], executando um importante algoritmo científico irregular. Integrando o modelo de escalonamento desenvolvido, o GAMA superou ainda bibliotecas eficientes para CPU e GPU na execução do SAXPY, um importante algoritmo científico. Foi também provada a escalabilidade do GAMA sob o modelo desenvolvido, que aproveitou da melhor forma os recursos computacionais disponíveis, em testes para um CPU-chip de 4 núcleos e dois GPUs.
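The abstracts above do not spell out the scheduler's internals, so the sketch below only illustrates one common approach to the same problem: split each batch of work between CPU and GPU in proportion to their observed throughput on earlier batches, adapting as the measurements change, which matters most for irregular workloads. Device names and numbers are invented.

    class AdaptiveSplitter:
        """Keep an exponentially smoothed throughput estimate per device and split
        each new batch of work items proportionally to those estimates."""

        def __init__(self, devices, alpha=0.3):
            self.throughput = {d: 1.0 for d in devices}   # items/s, optimistic initial guess
            self.alpha = alpha

        def split(self, n_items):
            total = sum(self.throughput.values())
            return {d: int(round(n_items * t / total)) for d, t in self.throughput.items()}

        def report(self, device, items_done, seconds):
            measured = items_done / max(seconds, 1e-9)
            self.throughput[device] = ((1 - self.alpha) * self.throughput[device]
                                       + self.alpha * measured)

    splitter = AdaptiveSplitter(["cpu", "gpu"])
    for step in range(3):
        shares = splitter.split(1000)
        # Pretend the GPU processed its share four times faster than the CPU this step.
        splitter.report("cpu", shares["cpu"], shares["cpu"] / 100.0)
        splitter.report("gpu", shares["gpu"], shares["gpu"] / 400.0)
        print(step, shares, {d: round(t, 1) for d, t in splitter.throughput.items()})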
APA, Harvard, Vancouver, ISO, and other styles