Dissertations / Theses on the topic 'Heterogeneous Architecture Design'

Consult the top 44 dissertations / theses for your research on the topic 'Heterogeneous Architecture Design.'

1

Pessolano, Francesco. "Heterogeneous clustered processors: organization and design." Thesis, London South Bank University, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.325819.

2

Cong, Minh Thanh. "Hardware accelerated simulation and automatic design of heterogeneous architecture." Electronic Thesis or Diss., Université de Rennes (2023-....), 2023. https://ged.univ-rennes1.fr/nuxeo/site/esupversions/1ae038b9-380e-4e42-bcd4-fa3a28cb34b0.

Abstract:
La conception de plates-formes de système sur puce hétérogènes est complexe avec de nombreuses combinaisons possibles. La simulation détaillée de différentes solutions est nécessaire pour déterminer le meilleur design. Les environnements de simulation existants (tels que gem5) sont limités car purement logiciels et ne prennent pas en compte les architectures hétérogènes. Pour pallier ces limitations, l'utilisation de composants reprogrammables FPGA pour accélérer la simulation est motivée. Notre travail est divisé en deux parties. La première partie est d'ordre expérimental et a étudié une approche de conception d'architectures hétérogènes en se concentrant sur la simulation de modèles de performance de composants de l'architecture (accélérateurs matériels et cœurs de processeurs) sur FPGA. La seconde partie est méthodologique et concerne un flot pour déterminer la meilleure microarchitecture en termes de rapport performance/consommation d'énergie. Ce flot combine un simulateur logiciel d'architecture et une méthode d'optimisation d'hyperparamètres pour trouver la meilleure combinaison de parallélisme, stratégies de déroulage de boucles et interfaces de mémoire. Les expérimentations ont été menées sur différents problèmes pour déterminer les solutions les plus optimales en termes d'efficacité énergétique
The design of heterogeneous system-on-chip platforms is complex with many possible combinations. Detailed simulation of different solutions is necessary to determine the best design. Existing simulation environments (such as gem5) are limited as they are purely software based and do not take into account heterogeneous architectures. To address these limitations, the use of reprogrammable FPGA components to accelerate simulation is motivated. Our work is divided into two parts. The first part is experimental and studied an approach to design heterogeneous architectures focusing on simulating performance models of architecture components (hardware accelerators and processor cores) on FPGA. The second part is methodological and concerns a flow to determine the best microarchitecture in terms of performance to energy consumption ratio. This flow combines a software architecture simulator and a hyperparameter optimization method to find the best combination of parallelism, loop unrolling strategies, and memory interfaces. Experiments were conducted on different problems to determine the most optimal solutions in terms of energy efficiency
3

Schultek, Brian Robert. "Design and Implementation of the Heterogeneous Computing Device Management Architecture." University of Dayton / OhioLINK, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1417801414.

4

Huang, Zhe. "Design of heterogeneous P2P video-on-demand systems /." View abstract or full-text, 2008. http://library.ust.hk/cgi/db/thesis.pl?ECED%202008%20HUANG.

5

McClure, Bruce Davis. "Design of an adaptive computing architecture for managing interactions in heterogeneous defence networks /." [St. Lucia, Qld.], 2002. http://www.library.uq.edu.au/pdfserve.php?image=thesisabs/absthe17146.pdf.

6

Keir, Paul. "Design and implementation of an array language for computational science on a heterogeneous multicore architecture." Thesis, University of Glasgow, 2012. http://theses.gla.ac.uk/3645/.

Abstract:
The packing of multiple processor cores onto a single chip has become a mainstream solution to fundamental physical issues relating to the microscopic scales employed in the manufacture of semiconductor components. Multicore architectures provide lower clock speeds per core, while aggregate floating-point capability continues to increase. Heterogeneous multicore chips, such as the Cell Broadband Engine (CBE) and modern graphics chips, also address the related issue of an increasing mismatch between high processor speeds, and huge latency to main memory. Such chips tackle this memory wall by the provision of addressable caches; increased bandwidth to main memory; and fast thread context switching. An associated cost is often reduced functionality of the individual accelerator cores; and the increased complexity involved in their programming. This dissertation investigates the application of a programming language supporting the first-class use of arrays; and capable of automatically parallelising array expressions; to the heterogeneous multicore domain of the CBE, as found in the Sony PlayStation 3 (PS3). The language is a pre-existing and well-documented proper subset of Fortran, known as the ‘F’ programming language. A bespoke compiler, referred to as E , is developed to support this aim, and written in the Haskell programming language. The output of the compiler is in an extended C++ dialect known as Offload C++, which targets the PS3. A significant feature of this language is its use of multiple, statically typed, address spaces. By focusing on generic, polymorphic interfaces for both the generated and hand constructed code, a number of interesting design patterns relating to the memory locality are introduced. A suite of medium-sized (100-700 lines), real-world benchmark programs are used to evaluate the performance, correctness, and scalability of the compiler technology. Absolute speedup values, well in excess of one, are observed for all of the programs. 
The work ultimately demonstrates that an array language can significantly reduce the effort expended to utilise a parallel heterogeneous multicore architecture, while retaining high performance. A substantial, related advantage in using standard ‘F’ is that any Fortran compiler can create debuggable, and competitively performing serial programs.
7

Okuya, Yujiro. "CAD Modification Techniques for Design Reviews on Heterogeneous Interactive Systems." Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLS450.

Abstract:
Les revues de design industriel bénéficient des nouvelles technologies interactives pour devenir plus réalistes, immersives et collaboratives. Toutefois, la modification des données de conception (CAO) est toujours effectuées depuis un espace de travail traditionnel par des ingénieurs qualifiés. Des problèmes de communication entre les différents experts peuvent apparaitre lors des réunions de revue de projet et engendrer des erreurs d’interprétation des modifications. J’estime que les processus actuels de révision de la conception impliquant itérativement des discussions sur la conception et un ajustement des modèles 3d devraient fusionner. Cela pourrait réduire le nombre d’itérations de correction sur les modèles durant le cycle de développement en facilitant lesdiscussions et en permettant à des utilisateurs non spécialistes CAO de modifier les données. Dans cette thèse, j’ai commencé par interviewer des ingénieurs de l‘industrie et j’ai esquissé un scénario de revue de conception dans lequel tous les membres d’un même projet peuvent générer et comparer plusieurs alternatives de conception depuis des systèmes interactifs adaptés pour répondre aux besoins de leurs différents expertises. J’ai d’abord conçu un système de couplage entre un environnement interactif temps réel et des données de CAO (RV-CAO) capable de modifier et de mettre à jour au format CAO natif. J’ai ensuite proposé des techniques d’interaction pour permettre à des utilisateurs non experts en CAO de modifier les données CAO paramétriques en utilisant des systèmes depuis un système CAVE et un mur d’image. Pour le système CAVE, j’ai créé ShapeGuide, une métaphore d’interaction basée forme permettant aux utilisateurs de générer et de choisir parmi des alternatives de conception en agissant indirectement sur les valeurs des paramètres d’un modèle CAO. 
J’ai étudié comment ShapeGuide peut affecter la qualité d’une tâche de modification de données CAO par rapport à un réglage de valeur de paramètre basée sur un défilement unidimensionnel. Les résultats ont montré que ShapeGuide permettait une modification plus rapide, plus efficace et préférée par les utilisateurs. Pour l’interaction depuis un mur d’images, j’ai créé ShapeCompare, qui permet à plusieurs utilisateurs de générer et de comparer plusieurs alternatives de design. J’ai étudié comment ShapeCompare affecte la collaboration entre experts par rapport à une technique de visualisation adaptée aux écrans standard. Les résultats ont montré qu’avec ShapeCompare, des paires de participants effectuaient plus rapidement une tâche de résolution de contraintes multiples et utilisaient plus d’instructions déictiques. Les résultats présentés décrivent des propositions de nouvelles pratiques de révision de conception, se basant sur l’utilisation d’interactions immersives et de murs d’images, qui permettent la modification directe des données de conception d’origine par tous les membres du projet quelle que soit leur expertise en CAO
Industrial design reviews benefit from emerging interactive technologies to become more realistic, immersive and collaborative. However, the modification of design data is still managed in a traditional workspace: Computer-Aided Design (CAD) systems on a workstation. As only engineers can apply modifications in such a workspace after the design review meeting, miscommunication between various experts could occur, resulting in unnecessary iterations. I argue that the two current stages of design reviews, design discussion and design adjustment, should merge. This could reduce the number of iterations, facilitate discussions and empower non-CAD experts to modify CAD data. In this dissertation, I started by interviewing engineers in the automotive industry and outlined a new design review scenario in which project members can generate and compare several design alternatives on heterogeneous systems that support the needs of various experts. Based on the scenario, I first designed a VR-CAD system that can update the native format of CAD data in highly configurable interactive systems. I then explored interaction techniques for non-CAD experts to modify parametric CAD data with 3D and 2D interactive systems: a CAVE system and a wall-sized display. For the CAVE system, I created ShapeGuide, which allows users to generate and switch between design alternatives of CAD data with a shape-based 3D interaction. I investigated how ShapeGuide affects a CAD data modification task compared to a standard one-dimensional scroll for parameter manipulation. Results showed that ShapeGuide was faster and more efficient than the scroll technique and was preferred by users. For the wall-sized display, I created ShapeCompare, which allows users to generate and distribute multiple design alternatives of CAD data using touch interaction. I investigated how ShapeCompare affects the collaboration among experts compared to a visualization technique suitable for standard screens. 
Results showed that pairs of participants performed a constraint-solving task faster and used more deictic instructions with ShapeCompare. The presented findings describe proposals for new design review practices using immersive systems and a wall-sized display, allowing direct modification of the original CAD data by all project members regardless of their CAD expertise.
8

Beu, Jesse Garrett. "Design of heterogeneous coherence hierarchies using manager-client pairing." Diss., Georgia Institute of Technology, 2013. http://hdl.handle.net/1853/47710.

Abstract:
Over the past ten years, the architecture community has witnessed the end of single-threaded performance scaling and a subsequent shift in focus toward multicore and manycore processing. While this is an exciting time for architects, with many new opportunities and design spaces to explore, this brings with it some new challenges. One area that is especially impacted is the memory subsystem. Specifically, the design, verification, and evaluation of cache coherence protocols becomes very challenging as cores become more numerous and more diverse. This dissertation examines these issues and presents Manager-Client Pairing as a solution to the challenges facing next-generation coherence protocol design. By defining a standardized coherence communication interface and permissions checking algorithm, Manager-Client Pairing enables coherence hierarchies to be constructed and evaluated quickly without the high design-cost previously associated with hierarchical composition. Further, Manager-Client Pairing also allows for verification composition, even in the presence of protocol heterogeneity. As a result, this rapid development of diverse protocols is ensured to be bug-free, enabling architects to focus on performance optimization, rather than debugging and correctness concerns, while comparing diverse coherence configurations for use in future heterogeneous systems.
9

Somers, Marc Steven. "Impact of Webpage Access on the Design of Single-Chip Heterogeneous Multiprocessors." Thesis, Virginia Tech, 2007. http://hdl.handle.net/10919/32107.

Abstract:
Mobile devices are currently designed similarly to embedded systems, where performance is derived from a specification that allows the device to interact in a periodic manner with the environment. However, as mobile devices increasingly interact with the Internet, they exhibit a different style of computing that does not fit the embedded system model. At the same time, a mobile device designer needs to consider many different issues such as the number and types of processors, scheduling strategies, applications, power consumption, and dimensions of the device, which increase the total number of design decisions at an alarming rate. This research shows that, with a more realistic model of mobile devices based on webpage-based benchmarks, customization can allow specialized architectures to improve performance by up to 70 percent over a homogeneous multiprocessor composed of general-purpose processors, with a 25 percent additional improvement over the next best architecture when individual user preferences are also considered. Webpage access, including user profiling for individual utilization, is clearly a significant factor in the design of mobile devices, and thus should be included in future benchmarks based upon webpage content and webpage access patterns. When new evaluation techniques are developed, new design strategies can be discovered and employed.
Master of Science
10

Botero, Oscar. "Heterogeneous RFID framework design, analysis and evaluation." Phd thesis, Institut National des Télécommunications, 2012. http://tel.archives-ouvertes.fr/tel-00714120.

Abstract:
The Internet of Things paradigm establishes interaction and communication with a huge number of actors. The concept is not a new-from-scratch one; rather, it combines a vast number of technologies and protocols, along with adaptations of pre-existing elements, to offer new services and applications. One of the key technologies of the Internet of Things is Radio Frequency Identification, abbreviated RFID. This technology proposes a set of solutions that allow tracking and tracing persons, animals and practically any item wirelessly. Considering the Internet of Things concept, multiple technologies need to be linked in order to provide interactions that lead to the implementation of services and applications. The challenge is that these technologies are not necessarily compatible with, or designed to work with, other technologies. Within this context, the main objective of this thesis is to design a heterogeneous framework that will permit the interaction of diverse devices such as RFID, sensors and actuators in order to provide new applications and services. To this end, our first contribution is the design and analysis of an integration architecture for heterogeneous devices. In the second contribution, we propose an evaluation model for RFID topologies and an optimization tool that assists in the RFID network planning process. Finally, in our last contribution, we implemented a simplified version of the framework using embedded hardware; performance metrics are provided, as well as the detailed configuration of the test platform.
11

Chung, Haera. "Optimal Network Topologies and Resource Mappings for Heterogeneous Networks-on-Chip." PDXScholar, 2013. https://pdxscholar.library.pdx.edu/open_access_etds/997.

Abstract:
Communication has become a bottleneck for modern microprocessors and multi-core chips because metal wires don't scale. The problem becomes worse as the number of components increases and chips become bigger. Traditional Systems-on-Chips (SoCs) interconnect architectures are based on shared-bus communication, which can carry only one communication transaction at a time. This limits the communication bandwidth and scalability. Networks-on-Chip (NoC) were proposed as a promising solution for designing large and complex SoCs. The NoC paradigm provides better scalability and reusability for future SoCs, however, long-distance multi-hop communication through traditional metal wires suffers from both high latency and power consumption. A radical solution to address this challenge is to add long-range, low power, and high-bandwidth single-hop links between distant cores. The use of optical or on-chip RF wireless links has been explored in this context. However, all previous work has focused on regular mesh-based metal wire fabrics that were expanded with one or two additional link types only for long-distance communication. In this thesis we address the following main research questions to address the above-mentioned challenges: (1) What library of different link types would represent an optimum in the design space? (2) How would these links be used to design an application-specific NoC architecture? (3) How would applications use the resulting NoC architecture efficiently? We hypothesize that networks with a higher degree of heterogeneity, i.e., three or more link types, will improve the network throughput and consume less energy compared to traditional NoC architectures. 
In order to verify our hypothesis and to address the research challenges, we design and analyze optimal heterogeneous networks under different realistic traffic models by considering different cost and performance trade-offs in a comprehensive technology-agnostic simulation framework that uses metaheuristic optimization techniques. As opposed to related work, our heterogeneous links can be placed anywhere in the network, which allows the entire search space to be explored. The resulting application-specific networks are then analyzed by using complex network techniques, such as community detection and small-worldness, to understand how heterogeneous link types are used to improve the NoC's performance and cost. Next, we use the application-specific networks as a target architecture for other applications. The goal is to evaluate the performance of our new NoCs for applications they have not been designed for by finding optimal resource allocations. Our results show that there is an optimal number of heterogeneous link types for each set of constraints and that networks with three or more heterogeneous link types provide significantly higher throughput along with lower energy consumption compared to both homogeneous link type and regular 2D mesh networks under three different traffic scenarios. Our evolved networks with three different technology-driven link types, namely metal wires, wireless, and optical links, provide 15% more throughput and fourteen times less energy consumption compared to a homogeneous link type network. When ten different abstract link types are used in the design, 12% more throughput and 52% less energy consumption are obtained compared to networks with three different technology-driven link types. This shows that heterogeneous NoC designs based on traditional metal wires, wireless, and optical links occupy a non-optimal spot in the entire design space. 
Our results further show that heterogeneous NoCs scale up significantly better in terms of performance and cost compared to mesh networks. We uncovered that network communities evolve robustly and that heterogeneous link types are efficiently establishing inter- and intra-subnet connections depending on their link type properties. We also show that mapping an application on our application-specific NoC architecture provides on average 45% more throughput at 70% less energy consumption compared to regular 2D mesh networks. The NoCs are therefore not only good for the application they were designed for, but for a broad range of other applications as well.
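The abstract's central idea, adding long-range single-hop links to a regular mesh, can be illustrated with a small sketch. The code below is not the thesis's simulation framework: the function names (`mesh_links`, `average_hops`, `add_link`), the 4 x 4 mesh size and the corner-to-corner link are all assumptions chosen for illustration, and hop count stands in as a crude proxy for latency, ignoring energy and link-type properties.

```python
from collections import deque
from itertools import product


def mesh_links(n):
    """Return adjacency sets of an n x n 2D mesh; nodes are (row, col) tuples."""
    adj = {node: set() for node in product(range(n), repeat=2)}
    for r, c in adj:
        # Connect each node to its right and lower neighbour, if present.
        for dr, dc in ((0, 1), (1, 0)):
            nb = (r + dr, c + dc)
            if nb in adj:
                adj[(r, c)].add(nb)
                adj[nb].add((r, c))
    return adj


def average_hops(adj):
    """Mean shortest-path length in hops over all ordered pairs of distinct nodes."""
    total, pairs = 0, 0
    for src in adj:
        # Breadth-first search gives hop distances from src.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


def add_link(adj, a, b):
    """Return a copy of adj with one extra bidirectional single-hop link a <-> b."""
    new = {node: set(nbs) for node, nbs in adj.items()}
    new[a].add(b)
    new[b].add(a)
    return new


if __name__ == "__main__":
    mesh = mesh_links(4)
    # Hypothetical long-range link between two opposite corners.
    augmented = add_link(mesh, (0, 0), (3, 3))
    print("mesh:", average_hops(mesh))
    print("mesh + long-range link:", average_hops(augmented))
```

A metaheuristic search of the kind the abstract mentions would repeatedly propose such link placements and score them; here a single hand-picked link is enough to show the average hop count dropping.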
12

Roßbach, André Christian. "Evaluation of Software Architectures in the Automotive Domain for Multicore Targets in regard to Architectural Estimation Decisions at Design Time." Master's thesis, Universitätsbibliothek Chemnitz, 2015. http://nbn-resolving.de/urn:nbn:de:bsz:ch1-qucosa-163372.

Abstract:
In this decade, the emerging multicore technology will hit the automotive industry. The increasing complexity of multicore systems will make manual verification of safety and real-time constraints impossible. For this reason, dedicated methods and tools are absolutely necessary in order to deal with the upcoming multicore issues. Many research projects on new hardware platforms and software frameworks for the automotive industry are currently running, because the paradigms of the “High-Performance Computing” and “Server/Desktop” domains cannot easily be adapted to embedded systems. One of the difficulties is estimating early on whether a hardware platform is suitable for a software architecture design, but hardly any research work tackles this. This thesis presents a procedure to evaluate the plausibility of software architecture estimations and decisions at the design stage. This includes an analysis technique for multicore systems, an underlying graph model to represent the multicore system, and an evaluation of simulation tools. This can guide the software architect in designing a multicore system with full consideration of all relevant parameters and issues.
In den nächsten Jahren wird die aufkommende Multicore-Technologie auf die Automobil-Branche zukommen. Die wachsende Komplexität der Multicore-Systeme lässt es nicht mehr zu, die Verifikation von Sicherheits- und Echtzeit-Anforderungen manuell auszuführen. Daher sind spezielle Methoden und Werkzeuge zwingend notwendig, um gerade mit den bevorstehenden Multicore-Problemfällen richtig umzugehen. Heutzutage laufen viele Forschungsprojekte für neue Hardware-Plattformen und Software-Frameworks für die Automobil-Industrie, weil die Paradigmen des “High-Performance Computings” und der “Server/Desktop-Domäne” nicht einfach so für die Eingebetteten Systeme angewendet werden können. Einer der Problemfälle ist das frühe Erkennen, ob die Hardware-Plattform für die Software-Architektur ausreicht, aber nur wenige Forschungs-Arbeiten berücksichtigen das. Diese Arbeit zeigt ein Vorgehens-Model auf, welches ermöglicht, dass Software-Architektur Abschätzungen und Entscheidungen bereits zur Entwurfszeit bewertet werden können. Das beinhaltet eine Analyse Technik für Multicore-Systeme, ein grundsätzliches Graphen-Model, um ein Multicore-System darzustellen, und eine Simulatoren Evaluierung. Dies kann den Software-Architekten helfen, ein Multicore System zu entwerfen, welches alle wichtigen Parameter und Problemfälle berücksichtigt
13

Cornevaux-Juignet, Franck. "Hardware and software co-design toward flexible terabits per second traffic processing." Thesis, Ecole nationale supérieure Mines-Télécom Atlantique Bretagne Pays de la Loire, 2018. http://www.theses.fr/2018IMTA0081/document.

Abstract:
La fiabilité et la sécurité des réseaux de communication nécessitent des composants efficaces pour analyser finement le trafic de données. La diversification des services ainsi que l'augmentation des débits obligent les systèmes d'analyse à être plus performants pour gérer des débits de plusieurs centaines, voire milliers de Gigabits par seconde. Les solutions logicielles communément utilisées offrent une flexibilité et une accessibilité bienvenues pour les opérateurs du réseau mais ne suffisent plus pour répondre à ces fortes contraintes dans de nombreux cas critiques.Cette thèse étudie des solutions architecturales reposant sur des puces programmables de type Field-Programmable Gate Array (FPGA) qui allient puissance de calcul et flexibilité de traitement. Des cartes équipées de telles puces sont intégrées dans un flot de traitement commun logiciel/matériel afin de compenser les lacunes de chaque élément. Les composants du réseau développés avec cette approche innovante garantissent un traitement exhaustif des paquets circulant sur les liens physiques tout en conservant la flexibilité des solutions logicielles conventionnelles, ce qui est unique dans l'état de l'art.Cette approche est validée par la conception et l'implémentation d'une architecture de traitement de paquets flexible sur FPGA. Celle-ci peut traiter n'importe quel type de paquet au coût d'un faible surplus de consommation de ressources. Elle est de plus complètement paramétrable à partir du logiciel. La solution proposée permet ainsi un usage transparent de la puissance d'un accélérateur matériel par un ingénieur réseau sans nécessiter de compétence préalable en conception de circuits numériques
The reliability and security of communication networks require efficient components to finely analyze data traffic. Service diversification and throughput increases force network operators to constantly improve analysis systems in order to handle throughputs of hundreds, even thousands, of Gigabits per second. Commonly used solutions are software-oriented; they offer flexibility and accessibility that network operators welcome, but they can no longer meet these strong constraints in many critical cases. This thesis studies architectural solutions based on programmable chips such as Field-Programmable Gate Arrays (FPGAs), which combine computation power and processing flexibility. Boards equipped with such chips are integrated into a common software/hardware processing flow in order to compensate for the shortcomings of each element. Network components developed with this innovative approach ensure exhaustive processing of the packets transmitted on physical links while keeping the flexibility of conventional software solutions, which is unique in the state of the art. This approach is validated by the design and implementation of a flexible packet processing architecture on FPGA. It is able to process any packet type at the cost of a slight increase in resource consumption. It is moreover fully configurable from the software side. With the proposed solution, network engineers can transparently use the processing power of a hardware accelerator without needing prior knowledge of digital circuit design.
14

Nakov, Stojce. "On the design of sparse hybrid linear solvers for modern parallel architectures." Thesis, Bordeaux, 2015. http://www.theses.fr/2015BORD0298/document.

Abstract:
Dans le contexte de cette thèse, nous nous focalisons sur des algorithmes pour l’algèbre linéaire numérique, plus précisément sur la résolution de grands systèmes linéaires creux. Nous mettons au point des méthodes de parallélisation pour le solveur linéaire hybride MaPHyS. Premièrement nous considerons l'aproche MPI+threads. Dans MaPHyS, le premier niveau de parallélisme consiste au traitement indépendant des sous-domaines. Le second niveau est exploité grâce à l’utilisation de noyaux multithreadés denses et creux au sein des sous-domaines. Une telle implémentation correspond bien à la structure hiérarchique des supercalculateurs modernes et permet un compromis entre les performances numériques et parallèles du solveur. Nous démontrons la flexibilité de notre implémentation parallèle sur un ensemble de cas tests. Deuxièmement nous considérons un approche plus innovante, où les algorithmes sont décrits comme des ensembles de tâches avec des inter-dépendances, i.e., un graphe de tâches orienté sans cycle (DAG). Nous illustrons d’abord comment une première parallélisation à base de tâches peut être obtenue en composant des librairies à base de tâches au sein des processus MPI illustrer par un prototype d’implémentation préliminaire de notre solveur hybride. Nous montrons ensuite comment une approche à base de tâches abstrayant entièrement le matériel peut exploiter avec succès une large gamme d’architectures matérielles. À cet effet, nous avons implanté une version à base de tâches de l’algorithme du Gradient Conjugué et nous montrons que l’approche proposée permet d’atteindre une très haute performance sur des architectures multi-GPU, multicoeur ainsi qu’hétérogène
In the context of this thesis, our focus is on numerical linear algebra, more precisely on the solution of large sparse systems of linear equations. We focus on designing efficient parallel implementations of MaPHyS, a hybrid linear solver based on domain decomposition techniques. First, we investigate the MPI+threads approach. In MaPHyS, the first level of parallelism arises from the independent treatment of the various subdomains. The second level is exploited through the use of multi-threaded dense and sparse linear algebra kernels at the subdomain level. Such a hybrid implementation of a hybrid linear solver suitably matches the hierarchical structure of modern supercomputers and enables a trade-off between the numerical and parallel performance of the solver. We demonstrate the flexibility of our parallel implementation on a set of test examples. Secondly, we follow a more disruptive approach where the algorithms are described as sets of tasks with data inter-dependencies, which leads to a directed acyclic graph (DAG) representation. The tasks are handled by a runtime system. We illustrate how a first task-based parallel implementation can be obtained by composing task-based parallel libraries within MPI processes, through a preliminary prototype implementation of our hybrid solver. We then show how a task-based approach fully abstracting the hardware architecture can successfully exploit a wide range of modern hardware architectures. We implemented a full task-based Conjugate Gradient algorithm and showed that the proposed approach leads to very high performance on multi-GPU, multicore and heterogeneous architectures.
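For readers unfamiliar with the Conjugate Gradient algorithm named in the abstract, the following is a minimal serial sketch of the textbook method for a small, dense, symmetric positive definite system. It is not the thesis's task-based implementation and does not use MaPHyS; the helper names (`matvec`, `dot`, `conjugate_gradient`) and the tolerance defaults are assumptions. In a task-based version, each matrix-vector product, dot product and vector update in the loop would typically become a task whose data dependencies form the DAG handed to the runtime system.

```python
def matvec(A, x):
    """Dense matrix-vector product; A is a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A, starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x, with x = 0
    p = list(r)            # initial search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break          # residual norm small enough
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)                        # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        # New direction, conjugate to the previous ones with respect to A.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


if __name__ == "__main__":
    # Small illustrative system (assumed, not from the thesis).
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    print(conjugate_gradient(A, b))
```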
15

Nguyen, Thi Khanh Hong. "Conception faible consommation d'un système de détection de chute." Thesis, Nice, 2015. http://www.theses.fr/2015NICE4093/document.

Abstract:
De nos jours, la détection de chute est un défi pour la santé, notamment pour la surveillance des personnes âgées. Le but de cette thèse est de concevoir un système de détection de chute basée sur une surveillance par caméra et d’étudier les aspects algorithmiques et architecturaux. Notre système se compose de quatre modules : la segmentation d’objet, le filtrage, l’extraction de caractéristiques et la reconnaissance qui permettent en plus de la détection de chute d’identifier leur type afin de définir un niveau d’alerte. En premier lieu, différents algorithmes ont été étudiés et comparés comme le Background Subtraction-Neural Network; le Background Subtraction-Template Matching (BGS-TM); le Background Subtraction-Hidden Markov Model ; et le Gaussian Mixture Model. Le BGS/TM présentant le meilleur taux de reconnaissance a alors été retenu. Une nouvelle base de donnée DTU-HBU a été construite et classifiée selon différentes actions : chute, non-chute (assis, couché, rampant, etc.) selon trois angles de caméra (face, côtés et de biais). Le second objectif fut de définir une méthode de conception permettant de sélectionner les architectures présentant la meilleure performance. Un premier travail fut de définir des modèles de la consommation et du temps d’exécution pour différentes cibles (processeur, FPGA). A titre d’exemple, la plateforme ZYNQ a été considérée. Les modèles proposés présentent un taux erreur inférieur à 3,5%. Une méthodologie de conception DSE basée sur deux techniques de parallélisme (Intra-task et inter-task) et couplant le taux de reconnaissance (ACC) a été définie. Les résultats obtenus montrent que l’ACC atteint 98,3% pour une énergie de 29,5 mJ/f
Nowadays, fall detection is a major challenge in public health care, especially for elderly people living alone and for rehabilitation patients in hospitals. This thesis explores a camera-based Fall Detection System from both an algorithmic and an architectural point of view. Our system includes four modules (Object Segmentation, Filter, Feature Extraction and Recognition) and raises an urgent alarm when it detects different kinds of falls. Firstly, different algorithms for the Fall Detection System are proposed and their efficiency is compared: Background Subtraction-Neural Network, Background Subtraction-Template Matching (BGS/TM), Background Subtraction-Hidden Markov Model, and Gaussian Mixture Model. The selected BGS/TM, with 91.67% recall, 100% precision and 95.65% accuracy, is then implemented on the ZYNQ platform. Moreover, a DUT-HBU database, classified by action (fall, non-fall) in three camera directions, is used to evaluate the efficiency of this system. Secondly, to explore low-cost architectures for this system, new power consumption and execution time models for the processor core and the FPGA are defined according to the different configurations of architecture and applications. The error rates of the proposed models do not exceed 3.5%. The models are then extended to hardware/software architectures to explore low-cost architectures by defining a suitable Design Space Exploration methodology. Two parallelization techniques, based on intra-task and inter-task static scheduling, are applied to improve the accuracy and the power consumption of the system; the accuracy reaches 98.3% with an energy per frame of 29.5 mJ/f.
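The recall, precision and accuracy figures quoted for BGS/TM follow the usual confusion-matrix definitions. The sketch below shows how such figures are computed. The counts (11 true positives, 1 false negative, 0 false positives, 11 true negatives) are an assumption: they are simply the smallest counts that reproduce the reported percentages, and the abstract does not state the actual test-set size.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (recall, precision, accuracy) from confusion-matrix counts."""
    recall = tp / (tp + fn)              # share of real falls that were detected
    precision = tp / (tp + fp)           # share of raised alarms that were real falls
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return recall, precision, accuracy


# Hypothetical counts, chosen only because they match the reported figures.
recall, precision, accuracy = classification_metrics(tp=11, fp=0, fn=1, tn=11)
print(f"recall={recall:.2%} precision={precision:.2%} accuracy={accuracy:.2%}")
# prints: recall=91.67% precision=100.00% accuracy=95.65%
```

Any multiple of these counts gives the same percentages, so the example says nothing about how many sequences the thesis actually evaluated.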
16

Kotsopoulos, Konstantinos. "Managing Next Generation Networks (NGNs) based on the Service-Oriented Architechture (SOA) : design, development and testing of a message-based network management platform for the integration of heterogeneous management systems." Thesis, University of Bradford, 2010. http://hdl.handle.net/10454/5264.

Abstract:
Next Generation Networks (NGNs) aim to provide a unified network infrastructure to offer multimedia data and telecommunication services through IP convergence. NGNs utilize multiple broadband, QoS-enabled transport technologies, creating a converged packet-switched network infrastructure, where service-related functions are separated from the transport functions. This requires significant changes in the way networks are managed to handle the complexity and heterogeneity of NGNs. This thesis proposes a Service Oriented Architecture (SOA) based management framework that integrates heterogeneous management systems in a loosely coupled manner. The key benefit of the proposed management architecture is the reduction of complexity through service and data integration. A network management middleware layer that merges low-level management functionality with higher-level management operations is proposed to resolve the problem of heterogeneity. A prototype was implemented using Web Services, and a testbed was developed using trouble ticket systems as the management application to demonstrate the functionality of the proposed framework. Test results show the correct functioning of the system. The thesis also concludes that the proposed framework fulfils the principles behind the SOA philosophy.
17

Papapostolou, Apostolia. "Indoor localization and mobility management in the emerging heterogeneous wireless networks." Phd thesis, Institut National des Télécommunications, 2011. http://tel.archives-ouvertes.fr/tel-00997657.

Abstract:
Over the last few decades, we have been witnessing a tremendous evolution in mobile computing, wireless networking and hand-held devices. In the future communication networks, users are anticipated to become even more mobile demanding for ubiquitous connectivity to different applications which will be preferably aware of their context. Admittedly, location information as part of their context is of paramount importance from both application and network perspectives. From application or user point of view, service provision can upgrade if adaptation to the user's context is enabled. From network point of view, functionalities such as routing, handoff management, resource allocation and others can also benefit if user's location can be tracked or even predicted. Within this context, we focus our attention on indoor localization and handoff prediction which are indispensable components towards the ultimate success of the envisioned pervasive communication era. While outdoor positioning systems have already proven their potential in a wide range of commercial applications, the path towards a successful indoor location system is recognized to be much more difficult, mainly due to the harsh indoor characteristics and requirement for higher accuracy. Similarly, handoff management in the future heterogeneous wireless networks is much more challenging than in traditional homogeneous networks. Handoff schemes must be seamless for meeting strict Quality of Service (QoS) requirements of the future applications and functional despite the diversity of operation features of the different technologies. In addition, handoff decisions should be flexible enough to accommodate user preferences from a wide range of criteria offered by all technologies. The main objective of this thesis is to devise accurate, time and power efficient location and handoff management systems in order to satisfy better context-aware and mobile applications. 
For indoor localization, the potential of Wireless Local Area Network (WLAN) and Radio Frequency Identification (RFID) technologies as standalone location sensing technologies is first studied by testing several algorithms and metrics in a real experimental testbed or by extensive simulations, and their shortcomings are also identified. Their integration in a common architecture is then proposed in order to combine their key benefits and overcome their limitations. The performance superiority of the synergetic system over the standalone counterparts is validated via extensive analysis. Regarding the handoff management task, we show that context awareness can also enhance the network functionality. Consequently, two such schemes, which utilize information obtained from localization systems, are proposed. The first scheme relies on an RFID tag deployment, like our RFID positioning architecture, and, by following the WLAN scene analysis positioning concept, predicts the next network layer location, i.e. the next point of attachment to the network. The second scheme relies on an integrated RFID and Wireless Sensor/Actuator Network (WSAN) deployment for tracking the users' physical location and subsequently for predicting their next handoff point at both link and network layers. Being independent of the underlying principal wireless access technology, both schemes can be easily implemented in heterogeneous networks. Performance evaluation results demonstrate the advantages of the proposed schemes over the standard protocols regarding prediction accuracy, time latency and energy savings.
18

Freeman, Robert Steven. "Neutral Parametric Canonical Form for 2D and 3D Wireframe CAD Geometry." BYU ScholarsArchive, 2015. https://scholarsarchive.byu.edu/etd/5688.

Abstract:
The challenge of interoperability is to retain model integrity when different software applications exchange and interpret model data. Transferring CAD data between heterogeneous CAD systems is a challenge because of differences in feature representation. A 1999 study by the National Institute of Standards and Technology (NIST) made a conservative estimate that inadequate interoperability costs the automotive industry $1 billion per year. One critical part of eliminating the high costs due to poor interoperability is a neutral format between heterogeneous CAD systems. An effective neutral CAD format should include a current-state data store, be associative, include the union of CAD features across an arbitrary number of CAD systems, maintain design history, maintain referential integrity, and support multi-user collaboration. This research has focused on extending an existing synchronous collaborative CAD software tool to allow for a neutral, current-state data store. This has been accomplished by creating a Neutral Parametric Canonical Form (NPCF), which defines the neutral data structure for many basic CAD features to enable translation between heterogeneous CAD systems. The initial architecture developed begins to define a new standard for storing CAD features neutrally. The NPCFs for many features have been implemented in a multi-user interoperability program and work between the NX and CATIA CAD systems. The 2D point, 2D line, 2D arc, 2D circle, 2D spline, 3D point, extrude, and revolve NPCFs are specifically defined. Complex models have successfully been modeled and exchanged in real time and have validated the NPCF approach. Multiple users can be in the same part at the same time in different CAD systems and create and update models in real time.
19

Törtei, Dániel. "Co-design of architectures and algorithms for mobile robot localization and model-based detection of obstacles." Thesis, Toulouse 3, 2016. http://www.theses.fr/2016TOU30294/document.

Abstract:
Un véhicule autonome ou un robot mobile est équipé d'un système de navigation qui doit comporter plusieurs briques fonctionnelles pour traiter de perception, localisation, planification de trajectoires et locomotion. Dès que ce robot ou ce véhicule se déplace dans un environnement humain dense, il exécute en boucle et en temps réel plusieurs fonctions pour envoyer des consignes aux moteurs, pour calculer sa position vis-à-vis d'un repère de référence connu, et pour détecter de potentiels obstacles sur sa trajectoire; du fait de la richesse sémantique des images et du faible coût des caméras, ces fonctions exploitent souvent la vision. Les systèmes embarqués sur ces machines doivent alors intégrer des cartes assez puissantes pour traiter des données visuelles en temps réel. Par ailleurs, les contraintes d'autonomie de ces plateformes imposent de très faibles consommations énergétiques. Cette thèse propose des architectures de type SOPC (System on Programmable Chip) conçues par une méthodologie de co-design matériel/logiciel pour exécuter de manière efficace les fonctions de localisation et de détection des obstacles à partir de la vision. Les résultats obtenus sont équivalents ou meilleurs que l'état de l'art, concernant la gestion de la carte locale d'amers pour l'odométrie-visuelle par une approche EKF-SLAM, et le rapport vitesse d'exécution sur précision pour ce qui est de la détection d'obstacles par identification dans les images d'objets (piétons, voitures...) sur la base de modèles appris au préalable
An autonomous mobile platform is endowed with a navigational system which must contain multiple functional bricks: perception, localization, path planning and motion control. As soon as such a robot or vehicle moves in a crowded environment, it continuously loops over several tasks in real time: sending reference values to the motors' actuators, calculating its position with respect to a known reference frame, and detecting potential obstacles on its path. Thanks to the semantic richness of images and the low cost of visual sensors, these tasks often exploit visual cues. The embedded systems running on these mobile platforms thus demand the additional integration of high-speed embeddable processing systems capable of handling abundant visual sensor input in real time. Moreover, constraints on the autonomy of the mobile platform impose low power consumption. This thesis proposes SOPC (System on a Programmable Chip) architectures for efficient embedding of vision-based localization and obstacle detection tasks in a navigational pipeline by making use of the software/hardware co-design methodology. The obtained results are equivalent to or better than the state of the art, both for EKF-SLAM-based visual odometry (regarding the management of the local map size with seven-dimensional landmarks) and for model-based detection-by-identification obstacle detection (regarding the algorithmic precision over execution speed metric).
20

Staves, Daniel Robert. "Associative CAD References in the Neutral Parametric Canonical Form." BYU ScholarsArchive, 2016. https://scholarsarchive.byu.edu/etd/6222.

Abstract:
Due to the multiplicity of computer-aided engineering applications present in industry today, interoperability between programs has become increasingly important. A survey conducted among top engineering companies found that 82% of respondents reported using 3 or more CAD formats during the design process. A 1999 study by the National Institute of Standards and Technology (NIST) estimated that inadequate interoperability between the OEM and its suppliers cost the US automotive industry over $1 billion per year, with the majority spent fixing data after translations. The Neutral Parametric Canonical Form (NPCF) prototype standard developed by the NSF Center for e-Design, BYU Site offers a solution to the translation problem by storing feature data in a CAD-neutral format to offer higher-fidelity parametric transfer between CAD systems. This research has focused on expanding the definitions of the NPCF to enforce data integrity and to support associativity between features in order to preserve design intent through the neutralization process. The NPCF data structure schema was defined to support associativity while maintaining data integrity. Neutral definitions of new features were added, including multiple types of coordinate systems, planes and axes. Previously defined neutral features were expanded to support new functionality, and the software architecture was redefined to support new CAD systems. Complex models have successfully been created and exchanged by multiple people in real time to validate the approach of preserving associativity, and support for a new CAD system, PTC Creo, was added.
21

Xypolitidis, Benard, and Rudin Shabani. "Architectural Design Space Exploration of Heterogeneous Manycores." Thesis, Högskolan i Halmstad, Akademin för informationsteknologi, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-29528.

Abstract:
Exploring the benefits of heterogeneous architectures is becoming more desirable due to the migration from single-core to manycore architectural systems. A fast way to explore heterogeneity is through an architectural design space exploration (ADSE) tool, which gives the designer the option to explore design alternatives before the actual implementation. Heracles Designer is an ADSE tool which allows the user to modify large aspects of the architecture. At present, Heracles Designer is equipped with a single type of processing core, a MIPS CPU. We have extended the Heracles System in order to enable the system to model heterogeneity. Our system is called the Heterogeneous Heracles System (HHS), where a different type of processing core, the OpenRISC CPU, is interfaced into the Heracles System. Test programs are executed on both the MIPS and OpenRISC CPUs, which have provided promising results. In order to provide the designer with the option to modify the system architecture without changing the source code, a GUI named ADSET was created. ADSET provides the designer with the ability to modify the core settings, memory system configuration and network topology configuration. In the HHS the MIPS core can only execute basic instructions, while the OpenRISC can execute more advanced instructions, giving a designer the option to explore the effects of heterogeneity based on the big little architectural concept. The results of our work provide an infrastructure on how to integrate different types of processing cores into the HHS.
22

Le, Tung Thanh. "Optimizing Network-on-Chip Designs for Heterogeneous Many-Core Architectures." Thesis, University of Louisiana at Lafayette, 2019. http://pqdtopen.proquest.com/#viewpdf?dispub=10981900.

Abstract:

On-chip interconnection networks are shifting from multicore to manycore systems and are becoming heterogeneous, integrating modules of various sizes and shapes from different vendors. Each module has different properties, such as routers and link width. From a system designer's perspective, laying out metal-wired links among interconnection modules for communication becomes impractical, as it increases the design cost in terms of communication complexity and power leakage on these links. All links could be replaced with wireless or optical links for high performance and reduced latency, but this comes at a high cost. Therefore, we formulate an optimization model to minimize the cost (communication links between subnets) and maximize the data flows in the network-on-chip.

Because the optimization model relies on optimizers such as CPLEX and Gurobi to reach the best possible solutions, its solution time for large sets of problems is not acceptable. Hence, we present a mincostflow-based heuristic algorithm (LINCA) that minimizes the number of hybrid routers required by the application-specific traffic in manycore systems. LINCA guarantees the performance of hybrid networks-on-chip. Its results are validated against the manycore system architecture. Our evaluation shows that LINCA can significantly reduce the cost of using hybrid routers (communication links) in manycore systems. It reduces cost by 84 percent on average across a variety of applications, compared with deploying hybrid routers throughout the network without the optimization model. However, we observed that the solution time of LINCA increases exponentially for large-scale networks. We then propose an efficient predictive framework for optimized reconfiguration of the on-chip interconnection network.

The predictive model is built based on the optimization model and learning-based algorithms. As we wish to reduce the communication complexity of the interconnection links in the entire on-chip network, our objective is to minimize those links corresponding to the application-specific traffic demands. Thereby, the overall power dissipation can be mitigated. We believe that our approach will be an essential step when scaling out.

23

Prasad, Rohit <1991>. "Integrated Programmable-Array accelerator to design heterogeneous ultra-low power manycore architectures." Doctoral thesis, Alma Mater Studiorum - Università di Bologna, 2022. http://amsdottorato.unibo.it/9983/1/PhD_thesis__20_January_2022_.pdf.

Abstract:
There is an ever-increasing demand for energy efficiency (EE) in rapidly evolving Internet-of-Things end nodes. This pushes researchers and engineers to develop solutions that provide both Application-Specific Integrated Circuit-like EE and Field-Programmable Gate Array-like flexibility. One such solution is Coarse Grain Reconfigurable Array (CGRA). Over the past decades, CGRAs have evolved and are competing to become mainstream hardware accelerators, especially for accelerating Digital Signal Processing (DSP) applications. Due to the over-specialization of computing architectures, the focus is shifting towards fitting an extensive data representation range into fewer bits, e.g., a 32-bit space can represent a more extensive data range with floating-point (FP) representation than an integer representation. Computation using FP representation requires numerous encodings and leads to complex circuits for the FP operators, decreasing the EE of the entire system. This thesis presents the design of an EE ultra-low-power CGRA with native support for FP computation by leveraging an emerging paradigm of approximate computing called transprecision computing. We also present the contributions in the compilation toolchain and system-level integration of CGRA in a System-on-Chip, to envision the proposed CGRA as an EE hardware accelerator. Finally, an extensive set of experiments using real-world algorithms employed in near-sensor processing applications are performed, and results are compared with state-of-the-art (SoA) architectures. It is empirically shown that our proposed CGRA provides better results w.r.t. SoA architectures in terms of power, performance, and area.
24

CHATHA, KARAMVIR SINGH. "SYSTEM-LEVEL COSYNTHESIS OF TRANSFORMATIVE APPLICATIONS FOR HETEROGENEOUS HARDWARE-SOFTWARE ARCHITECTURES." University of Cincinnati / OhioLINK, 2001. http://rave.ohiolink.edu/etdc/view?acc_num=ucin990822809.

25

Savas, Süleyman. "Utilizing Heterogeneity in Manycore Architectures for Streaming Applications." Licentiate thesis, Högskolan i Halmstad, Centrum för forskning om inbyggda system (CERES), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-33792.

Abstract:
In the last decade, we have seen a transition from single-core to manycore in computer architectures due to performance requirements and limitations in power consumption and heat dissipation. The first manycores had homogeneous architectures consisting of a few identical cores. However, the applications, which are executed on these architectures, usually consist of several tasks requiring different hardware resources to be executed efficiently. Therefore, we believe that utilizing heterogeneity in manycores will increase the efficiency of the architectures in terms of performance and power consumption. However, development of heterogeneous architectures is more challenging and the transition from homogeneous to heterogeneous architectures will increase the difficulty of efficient software development due to the increased complexity of the architecture. In order to increase the efficiency of hardware and software development, new hardware design methods and software development tools are required. Additionally, there is a lack of knowledge on the performance of applications when executed on manycore architectures. The transition began with a shift from single-core architectures to homogeneous multicore architectures consisting of a few identical cores. It now continues with a shift from homogeneous architectures with identical cores to heterogeneous architectures with different types of cores specialized for different purposes. However, this transition has increased the complexity of architectures and hence the complexity of software development and execution. In order to decrease the complexity of software development, new software tools are required. Additionally, there is a lack of knowledge on what kind of heterogeneous manycore design is most efficient for different applications and what are the performances of these applications when executed on current commercial manycores. 
This thesis studies manycore architectures in order to reveal possible uses of heterogeneity in manycores and facilitate the choice of architecture for software and hardware developers. It defines a taxonomy for manycore architectures that is based on the levels of heterogeneity they contain and discusses the benefits and drawbacks of these levels. Additionally, it evaluates several applications, a dataflow language (CAL), a source-to-source compilation framework (Cal2Many), and a commercial manycore architecture (Epiphany). The compilation framework takes implementations written in the dataflow language as input and generates code targeting different manycore platforms. Based on these evaluations, the thesis identifies the bottlenecks of the architecture. It finally presents a methodology for developing heterogeneous manycore architectures which target specific application domains. Our studies show that using different types of cores in manycore architectures has the potential to increase the performance of streaming applications. If we add specialized hardware blocks to a core, the performance easily increases by 15x for the target application while the core size increases by 40-50%, which can be optimized further. Other results show that dataflow languages, together with software development tools, decrease software development effort significantly (25-50%) while having a small impact (2-17%) on performance.
HiPEC (High Performance Embedded Computing)
NGES (Towards Next Generation Embedded Systems: Utilizing Parallelism and Reconfigurability)
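To put the reported 15x speedup and 40-50% core size increase in perspective, a small worked calculation follows. It assumes the baseline core has performance 1 and area 1, and it uses performance per unit area as the figure of merit; that metric is an illustrative assumption and is not taken from the thesis.

```python
def performance_per_area(speedup, area_growth):
    """Speedup divided by relative area, with the baseline core normalized to 1."""
    return speedup / (1.0 + area_growth)


# Bounds quoted in the abstract: 15x speedup, 40% to 50% larger core.
for growth in (0.40, 0.50):
    gain = performance_per_area(15.0, growth)
    print(f"area +{growth:.0%}: {gain:.1f}x performance per unit area")
```

Under these assumptions the specialized core delivers roughly 10 to 10.7 times the performance per unit area of the baseline core for the target application.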
26

Masing, Leonard Jannik [Verfasser], and J. [Akademischer Betreuer] Becker. "Prototyping Methodologies and Design of Communication-centric Heterogeneous Many-core Architectures / Leonard Jannik Masing ; Betreuer: J. Becker." Karlsruhe : KIT-Bibliothek, 2020. http://d-nb.info/1223027937/34.

27

Masing, Leonard [Verfasser], and J. [Akademischer Betreuer] Becker. "Prototyping Methodologies and Design of Communication-centric Heterogeneous Many-core Architectures / Leonard Jannik Masing ; Betreuer: J. Becker." Karlsruhe : KIT-Bibliothek, 2020. http://d-nb.info/1223027937/34.

28

Daniel, Tertei. "Co-design of architectures and algorithms for mobile robot localization and model-based detection of obstacles." Phd thesis, Univerzitet u Novom Sadu, Fakultet tehničkih nauka u Novom Sadu, 2016. http://www.cris.uns.ac.rs/record.jsf?recordId=101781&source=NDLTD&language=en.

Abstract:
This thesis proposes SoPC (System on a Programmable Chip) architectures for efficient embedding of vision-based localization and obstacle detection tasks in a navigational pipeline on autonomous mobile robots. The obtained results are equivalent to or better than the state of the art. For localization, an efficient hardware architecture that supports EKF-SLAM's local map management with seven-dimensional landmarks in real time is developed. For obstacle detection, a novel method of object recognition is proposed: a detection-by-identification framework based on a single detection window scale. This framework allows adequate algorithmic precision and execution speeds on embedded hardware platforms.
Ova teza bavi se dizajnom SoPC (engl. System on a Programmable Chip) arhitektura i algoritama za efikasnu implementaciju zadataka lokalizacije i detekcije prepreka baziranih na viziji u kontekstu autonomne robotske navigacije. Za lokalizaciju, razvijena je efikasna računarska arhitektura za EKF-SLAM algoritam, koja podržava skladištenje i obradu sedmodimenzionalnih orijentira lokalne mape u realnom vremenu. Za detekciju prepreka je predložena nova metoda prepoznavanja objekata u slici putem prozora detekcije fiksne dimenzije, koja omogućava veću brzinu izvršavanja algoritma detekcije na namenskim računarskim platformama.
29

Barrère, François. "Conception de reseaux locaux heterogenes : le prototype campus." Toulouse 3, 1987. http://www.theses.fr/1987TOU30154.

Abstract:
This work addresses the integration of workstations that are heterogeneous in both hardware and software into an Ethernet network. It presents the factors leading to the design of couplers, the software integration of a communication coupler, and a proposed architecture for the implementation
30

Dou, Hsiang-Lin, and 竇祥霖. "The design and application of Heterogeneous Hierarchical classifier architecture." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/28401448668474033813.

Abstract:
碩士
國防大學中正理工學院
資訊科學研究所
96
In recent years, data mining techniques have become a popular research topic. With rapid advances in data collection and storage technology, a shortage of data is no longer the problem. The challenge we face is how to discover knowledge from data. In this paper, we first survey some fundamental data mining techniques, such as the self-organizing map, the artificial neural network, and the support vector machine. Taking advantage of these techniques, we develop an ensemble classification model, the Heterogeneous Hierarchical Classifier (HHC); this model can deal with binary and multi-class classification problems. In addition, we propose an adaptive model, the Adaptive Heterogeneous Hierarchical Classifier (AHHC), through which we can automatically obtain classifiers with high classification accuracy. We have applied the models to two critical, real-world problems, namely intrusion detection and rainfall intensity classification. For intrusion detection problems, HHC can efficiently perform classification tasks and accurately identify the Normal, DoS, Probing and U2R events. The classification accuracies for the four types of events are 99.81%, 99.85%, 91.31%, and 86.14%, respectively. For rainfall intensity classification problems, AHHC can achieve various goals by setting different fitness functions. Experimental results show that the proposed model achieves high accuracy for rainfall intensity retrieval and outperforms previously published methods.
31

Asmussen, Nils. "A New System Architecture for Heterogeneous Compute Units." 2018. https://tud.qucosa.de/id/qucosa%3A34886.

Abstract:
The ongoing trend towards more heterogeneous systems forces us to rethink the design of systems. In this work, I study a new system design that considers heterogeneous compute units (general-purpose cores with different instruction sets, DSPs, FPGAs, fixed-function accelerators, etc.) from the beginning instead of as an afterthought. The goal is to treat all compute units (CUs) as first-class citizens, enabling (1) isolation and secure communication between all types of CUs, (2) a direct interaction of all CUs, removing the conventional CPU from the critical path, and (3) access to operating system (OS) services such as file systems and network stacks for all CUs. To study this system design, I am using a hardware/software co-design based on two key ideas: 1) introduce a new hardware component next to each CU, used by the OS as the CUs' common interface, and 2) let the OS kernel control applications remotely from a different CU. The hardware component is called the data transfer unit (DTU) and offers the minimal set of features needed to reach the stated goals: secure message passing and memory access. The OS is called M³; it runs its kernel on a dedicated CU and runs the OS services and applications on the remaining CUs. The kernel is responsible for establishing DTU-based communication channels between services and applications. After a channel has been set up, services and applications communicate directly without involving the kernel. This approach makes it possible to support arbitrary CUs as the aforementioned first-class citizens, ranging from fixed-function accelerators to complex general-purpose cores.
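The division of labour described in the abstract (the kernel sets up a channel, then the endpoints talk directly) can be pictured with a toy software model. This is only a sketch under stated assumptions: the real DTU is a hardware component, and the class names, method names and message format below are hypothetical and not taken from M³.

```python
class DTU:
    """Toy stand-in for a per-CU data transfer unit (hypothetical model)."""

    def __init__(self, cu_name):
        self.cu_name = cu_name
        self.send_endpoints = {}   # channel id -> receiving DTU
        self.inbox = []            # received (sender name, payload) pairs

    def send(self, channel_id, payload):
        # A CU can only send over channels that were configured for it.
        peer = self.send_endpoints.get(channel_id)
        if peer is None:
            raise PermissionError(f"{self.cu_name}: no channel {channel_id}")
        peer.inbox.append((self.cu_name, payload))


class Kernel:
    """Toy kernel: its only job here is to establish channels."""

    def __init__(self):
        self._next_channel_id = 0

    def establish_channel(self, sender, receiver):
        channel_id = self._next_channel_id
        self._next_channel_id += 1
        sender.send_endpoints[channel_id] = receiver
        return channel_id


kernel = Kernel()
app = DTU("accelerator-app")        # hypothetical CU names
fs_service = DTU("fs-service")

channel = kernel.establish_channel(app, fs_service)
app.send(channel, "read /data")     # the kernel is not involved in this step
print(fs_service.inbox)
```

In this model an attempt to send on a channel the kernel never configured raises an error, which loosely mirrors the isolation goal stated in the abstract.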
32

Liu, An-Ting, and 劉安庭. "Design the Smart Switch for DNS-like Heterogeneous Network Based on SDN Architecture." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/04656476945420139781.

Abstract:
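Master's thesis
National Sun Yat-sen University
Department of Electrical Engineering (Graduate Institute)
Academic year 104 (ROC calendar)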
In today's Internet of Things (IoT), advances in communication technology have produced many wired and wireless communication standards, so the heterogeneous network gateway plays an important role in IoT networks. However, current heterogeneous network gateways do not manage data routing efficiently. The Internet today uses IP-based packets as the core technique for all of its services, so connecting becomes harder as the number of devices grows. Modbus is a standard protocol used in the IoT sensor layer to support data monitoring, and it is often applied on devices with simple physical network interfaces. If nodes with the same node number connect to a heterogeneous network gateway that supports the Modbus protocol, their node IDs conflict. Furthermore, such a gateway adapts poorly to different network architectures because the packet-forwarding control plane and data plane are tightly coupled. To cope with these problems, this thesis uses a software-defined networking (SDN) architecture to design a DNS-like smart switch for heterogeneous networks. Separating the control and data planes through OpenFlow makes the switch more flexible, because devices and administrators can design and deploy it under different network architectures. The Slave ID problem in Modbus heterogeneous networks is solved by the smart routing of the smart switch. A node connected to the smart switch must be registered with it and is managed by its node name, so the user does not need to know the node's position or IP address. Data exchange uses the Find Path by ID and Find Path by Name methods proposed in this thesis together with non-fixed strings, which improves the connection experience substantially.
Finally, we implement the node registration flow, the client connection flow, and client data access through the OpenFlow module in NS3, and then show the results of our algorithm using Wireshark.
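The name-based registration idea in this abstract can be pictured with a small sketch. It assumes a controller-side table keyed by node name; the class names, the gateway labels, and the `find_path_by_name` method are hypothetical illustrations and are not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeAddress:
    gateway: str   # hypothetical label for the gateway the node sits behind
    slave_id: int  # Modbus slave ID, unique only within one gateway


class NameRegistry:
    """Hypothetical controller-side table that maps node names to addresses."""

    def __init__(self):
        self._by_name = {}

    def register(self, name, gateway, slave_id):
        # Names must be unique; slave IDs may repeat across gateways.
        if name in self._by_name:
            raise ValueError(f"node name already registered: {name}")
        self._by_name[name] = NodeAddress(gateway, slave_id)

    def find_path_by_name(self, name):
        # The client supplies only a name, never an IP address or slave ID.
        return self._by_name[name]


registry = NameRegistry()
registry.register("boiler-temp", gateway="gw-a", slave_id=1)
registry.register("pump-flow", gateway="gw-b", slave_id=1)  # same slave ID, other gateway
```

Because lookups go through the name, two nodes that share slave ID 1 behind different gateways no longer collide from the client's point of view.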
33

Lee, Chi-Ming, and 李齊明. "DeAr: An Efficient and Flexible Digital Signal Processor Design for Heterogeneous System Architecture." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/99514245159186996150.

Abstract:
Master's thesis
National Tsing Hua University
Department of Electrical Engineering
Academic year 104 (ROC calendar)
The evolution of wireless communication protocols drives the quest for power-efficient and flexible computing in embedded digital signal processors (DSPs), but the popular DSP architectures, very-long-instruction-word (VLIW) and application-specific instruction set processor (ASIP), represent opposite extremes with regard to power efficiency and flexibility. To this end, we present DeAr: Dual-thread Architecture DSP, which manipulates a multi-banked register file that enables simultaneous multi-threading (SMT), as well as a transport-triggered bus that exploits the data forwarding mechanism in its compact datapath. In addition, a novel scheduling algorithm that leverages the compact hardware to achieve both high throughput and flexible computation is presented. To go beyond a single-core DSP, we also propose a system integration framework and a compilation tool for DeAr based on Heterogeneous System Architecture (HSA), a promising standard for multi-core architectures promoted by several leading semiconductor companies, including AMD, ARM, and MediaTek. Compared with VLIW and ASIP, respectively, DeAr saves 20.3%–13.1% and 31.8%–2.2% of power and 36.1%–31.5% and 28.2%–5.7% of area in the benchmark suite aimed at wireless communication, and 20.3%–13.1% and 31.8%–2.2% of power and 36.1%–31.5% and 28.2%–5.7% of area in the benchmark suite of general digital signal processing kernels.
34

Guevara, Marisabel Alejandra. "Coordinating the Design and Management of Heterogeneous Datacenter Resources." Diss., 2014. http://hdl.handle.net/10161/8667.

Abstract:

Heterogeneous design presents an opportunity to improve energy efficiency but raises a challenge in management. Whereas prior work separates the two, we coordinate heterogeneous design and management. We present a market-based resource allocation mechanism that navigates the performance and power trade-offs of heterogeneous architectures. Given this management framework, we explore a design space of heterogeneous processors and show a 12x reduction in response time violations when equipping a datacenter with three processor types over a homogeneous system that consumes the same power. To better understand trade-offs in large heterogeneous design spaces, we explore dozens of design strategies and present a risk taxonomy that classifies the reasons why a deployed system may underperform relative to design targets. We propose design strategies that explicitly mitigate risk, such as a strategy that minimizes the coefficient of variation in performance. In our experiments, we find that risk-aware design accounts for more than 70% of the strategies that produce systems with the best service quality. We also present a new datacenter management mechanism that fairly allocates processors to latency-sensitive applications. Tasks express value for performance using sophisticated piecewise-linear utility functions. With fairness in market allocations, we show how datacenters can mitigate envy amongst latency-sensitive users. We quantify the price of fairness and detail efficiency-fairness trade-offs. Finally, we extend the market to fairly allocate heterogeneous processors.
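The coefficient of variation mentioned in this abstract is the standard deviation divided by the mean. A minimal sketch of using it to rank designs by performance variability follows; the sample values and design names are hypothetical and do not come from the dissertation.

```python
from statistics import mean, pstdev


def coefficient_of_variation(samples):
    """Population standard deviation divided by the mean (dimensionless)."""
    return pstdev(samples) / mean(samples)


# Hypothetical per-workload performance samples for two candidate designs.
designs = {
    "design_a": [100.0, 100.0, 100.0, 100.0],
    "design_b": [60.0, 140.0, 60.0, 140.0],
}

# A risk-averse strategy would prefer the design whose performance varies least.
least_variable = min(designs, key=lambda name: coefficient_of_variation(designs[name]))
```

Both hypothetical designs have the same mean performance, so only the spread separates them; this is the kind of distinction a variability-minimizing strategy is meant to capture.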


Dissertation
35

Rodrigues, Cristiano António Azevedo. "Heterogeneous fault tolerance architecture based on Arm and RISC-V processors." Master's thesis, 2019. http://hdl.handle.net/1822/64956.

Abstract:
Master's dissertation in Industrial Electronics and Computers Engineering
Safety-critical systems deployed in harsh environments rely on fault tolerance and redundancy techniques to keep operating even in the presence of faults. Although there are effective techniques to mitigate faults in a single component, they are not enough to protect the system against faults that strike multiple components simultaneously. Such faults trigger the same error in the faulty redundant components, which makes the resulting errors invisible to and undetectable by fault-tolerance mechanisms. To overcome this problem, design diversity is applied in fault-tolerant systems to mitigate Common-Mode Failure (CMF) and build a more robust and reliable system. Although several FPGA-based fault tolerance architectures are available in the literature, to the best of our knowledge none of them aims both at hardening heterogeneous processors and at applying design diversity at the processor level. To address this gap in the current state of the art, this dissertation proposes a novel heterogeneous fault tolerance architecture, Lock-V, which enables design diversity at the processor architecture level. It addresses CMF and provides both error detection and error recovery to mitigate errors triggered by interactions with the external environment, e.g., radiation. To eliminate CMF, Lock-V explores an implementation based on two different processing units, a hard-core Arm Cortex-A9 and a soft-core RISC-V-based processor, leveraging design diversity through ISA heterogeneity. To implement fault tolerance, Lock-V proposes a hybrid Dual-Core Lockstep (DCLS) solution in which error detection is done in hardware, by an FPGA accelerator, while error recovery is performed in software using a rollback technique. After Lock-V was deployed on a Zynq-7000 SoC, over 45000 faults were injected.
The results of this fault injection show that an application running on the Lock-V architecture is protected against CMF, thanks to processor design diversity, and also against 97% of the triggered errors. Nevertheless, implementing Lock-V involves some tradeoffs. It uses 79% of the look-up tables (LUTs) and 34% of the flip-flops (FFs) available on the Zedboard FPGA platform. On the software side, Lock-V increases the memory footprint by 8% and the execution overhead by around 12% in the worst-case scenario, tested in the absence of errors. Although all redundancy has its cost, Lock-V proved able to provide a system with design diversity and fault tolerance capabilities.
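The detect-then-roll-back pattern described in this abstract can be sketched in software. This is only an illustration under stated assumptions: in the abstract the comparison is done by an FPGA accelerator and the two cores are an Arm Cortex-A9 and a RISC-V soft core, whereas here both "cores" are plain Python functions, and `run_lockstep`, `core_a`, `core_b`, and the injected bit flip are hypothetical.

```python
import copy


def run_lockstep(step_a, step_b, state, n_steps, max_retries=3):
    """Run two implementations in lockstep; on mismatch, retry from the checkpoint."""
    recoveries = 0
    for _ in range(n_steps):
        checkpoint = copy.deepcopy(state)  # saved state used for rollback
        for _attempt in range(max_retries + 1):
            out_a = step_a(copy.deepcopy(checkpoint))
            out_b = step_b(copy.deepcopy(checkpoint))
            if out_a == out_b:  # comparator: outputs agree, commit the step
                state = out_a
                break
            recoveries += 1  # outputs differ: discard both and roll back
        else:
            raise RuntimeError("mismatch persisted after all retries")
    return state, recoveries


def core_a(s):
    s["acc"] += 1
    return s


# Hypothetical fault schedule: one transient fault on the second call of core_b.
faults = iter([False, True, False, False])


def core_b(s):
    s["acc"] += 1
    if next(faults, False):
        s["acc"] ^= 0x4  # simulated single bit flip
    return s


final_state, recovered = run_lockstep(core_a, core_b, {"acc": 0}, n_steps=3)
```

A transient fault in one core makes the outputs disagree, so the step is re-executed from the checkpoint. A fault that corrupted both outputs identically would pass the comparator, which is the common-mode case that design diversity is meant to make unlikely.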
36

Cheng, Chen Wei, and 陳威成. "The Design of A Semantic Interoperable Architecture for Heterogeneous Healthcare System based on the IHE." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/68360787233388248336.

Abstract:
Master's thesis
Chang Gung University
Graduate Institute of Information Management
Academic year 94 (ROC calendar)
Continuous, patient-centered healthcare service is the common objective of the healthcare industry. The only way to reach this objective is to integrate medical information and share medical resources across hospitals effectively. At present, however, the universal problem between systems in different hospitals is medical information interoperability. The emergence of Web Services and IHE addresses the interoperability problem of heterogeneous healthcare systems. IHE provides ideal specifications for designing an integrated healthcare system, and Web Services supply a suitable design technology for those specifications. Combining IHE and Web Services can overcome the difficult interoperability problem of heterogeneous medical information. However, WSDL lacks the capability to describe Web Services semantically, and UDDI supports only keyword search of Web Services. The search keyword of an IHE Services requester and the registry keyword of an IHE Services provider may be inconsistent, which makes it hard for hospital developers to discover each other's public IHE Services. To realize an inter-hospital medical information integration environment based on IHE, this research uses Web Services technology as the design methodology for IHE integration profiles and takes advantage of OWL and OWL-S to build a semantic interoperable architecture for heterogeneous healthcare systems, so that developers across hospitals can discover and integrate these public IHE Services successfully.
37

Parshotam, Chiba Chetan. "Design and implementation of a fault management service for heterogeneous networks using Tina Network Resource architecture." Thesis, 2006. http://hdl.handle.net/10539/264.

Abstract:
Master of Science in Engineering - Engineering
Faults are unavoidable and cause downtime and degradation in large, complex communication networks. Fault management capabilities are therefore critical for rectifying these faults and improving network reliability. Current communication networks are moving towards a distributed computing environment that enables them to transport heterogeneous multimedia information across end-to-end connections, so an advanced fault management system is required. Fault management provides information on the status of the network by locating, detecting, identifying, isolating, and correcting network problems, thereby increasing network reliability. The TINA (Telecommunication Information Networking Architecture) standards define a Network Resource Architecture (NRA) that provides a framework for a transport network capable of transporting heterogeneous multimedia information across heterogeneous networks. TINA also defines a Management Architecture that follows the functional area organization defined in the OSI (Open Systems Interconnection) Management Framework, namely fault, configuration, accounting, performance, and security management (FCAPS). The aim of this project is to use the TINA NRA and Management Architecture concepts and principles to design and implement a distributed Fault Management Service for heterogeneous networks. The design presented here uses TINA's fault management specifications, together with UML modelling tools, to develop this Fault Management Service. The design incorporates CORBA and SNMP to provide distributed management functionality capable of supporting fault management across heterogeneous networks. The generic nature of the fault management service is tested on the SATINA Trial platform, which consists of both an ATM network and an IP MPLS network.
The report concludes that the Fault Management Service is applicable to any connection-oriented network that is modeled using the TINA NRA specification and principles.
38

Huang, Guan-Ying, and 黃冠穎. "The Design and Implementation of Symmetric Message Passing and Management Mechanism in Heterogeneous Multi-Core Architecture." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/85765309524604460715.

Abstract:
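Master's thesis
National Cheng Kung University
Institute of Computer and Communication Engineering
Academic year 98 (ROC calendar)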
This thesis designs and implements a symmetric Inter-Process Communication (IPC) mechanism for an embedded system platform with a heterogeneous multi-core architecture. The goal is to provide a symmetric architecture through which processes running on different processor cores can communicate and exchange data. With this symmetric IPC mechanism, processes can communicate effectively, share data, execute cooperatively, and synchronize, which improves the utilization of system resources. The symmetric IPC is designed on the basis of a shared memory architecture. It comprises fundamental functions for sharing data by accessing shared memory modules connected by different buses, a DMA mechanism suited to transferring large data blocks, a high-performance message passing mechanism based on hardware interrupt support, and message management functions that make the best of the system capacity. To meet the various requirements of inter-process communication and to improve efficiency, the design supports sending and receiving messages in both blocking and non-blocking modes. To help application development, the design also includes application programming interfaces (APIs) for using this mechanism. For evaluation, the symmetric IPC mechanism is implemented on PAC Duo as the embedded system platform. PAC Duo is a heterogeneous multi-core system-on-chip (SoC) developed by the Industrial Technology Research Institute (ITRI), consisting of one ARM processor and two advanced DSP processors (PACDSP). The implementation assumes that the Linux operating system runs on the ARM side and that the real-time kernel µC/OS-II and a dataflow kernel run on the two PACDSPs, respectively. Function modules for message passing and management are embedded into the system kernels, and an API function library is provided for application development.
The performance evaluation results show that this symmetric IPC mechanism meets the requirements of applications under both real-time and non-real-time conditions.
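The difference between the blocking and non-blocking modes mentioned in this abstract can be illustrated with a short sketch. It assumes a bounded in-process queue as a stand-in for a shared-memory mailbox; the `send` and `receive` helpers are hypothetical and are not the API of the thesis, which targets PAC Duo hardware.

```python
import queue
import threading

# Stand-in for a shared-memory mailbox between two cores (hypothetical).
mailbox = queue.Queue(maxsize=8)


def send(box, msg, blocking=True):
    """Blocking send waits for a free slot; non-blocking returns False when full."""
    try:
        box.put(msg, block=blocking)
        return True
    except queue.Full:
        return False


def receive(box, blocking=True, timeout=None):
    """Blocking receive waits for a message; non-blocking returns None when empty."""
    try:
        return box.get(block=blocking, timeout=timeout)
    except queue.Empty:
        return None


# Non-blocking poll on an empty mailbox returns immediately.
empty_poll = receive(mailbox, blocking=False)

# Blocking receive waits until the producer thread has delivered a message.
producer = threading.Thread(target=send, args=(mailbox, "frame-0"))
producer.start()
first = receive(mailbox, timeout=1.0)
producer.join()
```

Non-blocking calls suit a real-time task that must not stall, while blocking calls let a task sleep until data arrives, which may be why a design would offer both.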
39

Lee, Jeng-Ling, and 李政霖. "The Design and Implementation of Heterogeneous Systems Integration Based on Service Oriented Architecture- SAP Exchange Infrastructure." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/23123935644835443143.

Abstract:
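Master's thesis
Tatung University
Department of Information Management (Graduate Program)
Academic year 98 (ROC calendar)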
EAI (Enterprise Application Integration) is not a new concept. In recent years, Microsoft, IBM, SAP and other major software companies have promoted the concept of SOA, which has increased the importance of EAI. SOA and Web Services are undoubtedly paving the way for the development of EAI. EAI, with its good encapsulation, interoperability and universality, is essential for resolving the complexity and compatibility problems of enterprise application systems, and Web Services, based on HTTP, SOAP, WSDL, UDDI and XML, allow programs written in different languages on different platforms to communicate with each other in a standard way. By describing and analyzing the application of Web Services to EAI, this research points out an effective way to implement enterprise application integration. It also describes the heterogeneity of enterprise application systems and discusses SOA-based architecture and the related techniques. In addition, the research draws on an IT transformation project at a manufacturing company, in which multiple heterogeneous systems coexist, to introduce the concepts of EAI and SOA. After adopting SAP XI as the interface technology, the company effectively integrated its various systems and achieved the integration and optimization of its business systems. As its products are updated and continue to change, SAP has upgraded its product strategy and promotes its EAI products to fit the SOA architecture, with full support for global standard protocols. Because SAP is a leader in the global enterprise applications and solutions market, this study is intended as a reference for companies or customers who will use SAP's products, adopt an SOA architecture, or implement EAI.
40

Shu, Sheng-Jie, and 許勝傑. "Performance, Power and Thermal Analysis of Program Execution in Different Architecture Design Based on OpenCL Simulator for Heterogeneous Multicore Processors." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/72uud6.

Abstract:
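Master's thesis
National Chiao Tung University
Institute of Computer Science and Engineering
Academic year 105 (ROC calendar)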
In recent years, heterogeneous hardware architectures, which integrate various kinds of processors and/or FPGAs into a single chip, have become the main development trend. Open Computing Language (OpenCL) was proposed to support the development of efficient applications for heterogeneous platforms. Applications developed in OpenCL can be portable across operating systems while offering high performance and low power consumption. The physical behavior of a heterogeneous platform is considerably more complicated than that of a conventional platform, because the devices on a chip affect one another. Developing applications for a heterogeneous platform therefore requires considering the system-on-chip as a whole. However, studies that analyze how different OpenCL application designs execute on different heterogeneous platforms are still lacking. Moreover, a simulator that can evaluate OpenCL applications and hardware designs at the pre-silicon stage is also lacking. In this thesis, we propose a simulation platform that provides performance, power and thermal information for OpenCL applications executed on a heterogeneous platform design. The proposed simulation platform computes the power consumption and thermal state of each function block in a processor based on the results of cycle-accurate performance simulation. Using this performance, power and thermal information, the workload of the devices on a chip can be reorganized for optimal system behavior.
41

Roßbach, André Christian. "Evaluation of Software Architectures in the Automotive Domain for Multicore Targets in regard to Architectural Estimation Decisions at Design Time." Master's thesis, 2014. https://monarch.qucosa.de/id/qucosa%3A20224.

Abstract:
In this decade, the emerging multicore technology will hit the automotive industry. The increasing complexity of multicore systems will make manual verification of safety and realtime constraints impossible. For this reason, dedicated methods and tools are necessary to deal with the upcoming multicore issues. Many research projects on new hardware platforms and software frameworks for the automotive industry are currently running, because the paradigms of "High-Performance Computing" and the "Server/Desktop Domain" cannot easily be adapted to embedded systems. One of the difficulties is estimating early on whether a hardware platform is suitable for a software architecture design, but hardly any research work tackles that. This thesis presents a procedure for evaluating the plausibility of software architecture estimations and decisions at the design stage. This includes an analysis technique for multicore systems, an underlying graph model to represent the multicore system, and an evaluation of simulation tools. This can guide the software architect in designing a multicore system with full consideration of all relevant parameters and issues.

Contents
List of Figures
List of Tables
List of Abbreviations
1. Introduction
  1.1. Motivation
  1.2. Scope
  1.3. Goal and Tasks
  1.4. Structure of the Thesis
I. Multicore Technology
2. Fundamentals
  2.1. Automotive Domains
  2.2. Embedded System
    2.2.1. Realtime
    2.2.2. Runtime Predictions
    2.2.3. Multicore Processor Architectures
  2.3. Development of Automotive Embedded Systems
    2.3.1. Applied V-Model
    2.3.2. System Description and System Implementation
  2.4. Software Architecture
  2.5. Model Description of Software Structures
    2.5.1. Design Domains of Multicore Systems
    2.5.2. Software Structure Components
3. Trend and State of the Art of Multicore Research, Technology and Market
  3.1. The Importance of Multicore Technology
  3.2. Multicore Technology for the Automotive Industry
    3.2.1. High-Performance Computing versus Embedded Systems
    3.2.2. The Trend for the Automotive Industry
    3.2.3. Examples of Multicore Hardware Platforms
  3.3. Approaches for Upcoming Multicore Problems
    3.3.1. Migration from Single-Core to Multicore
    3.3.2. Correctness-by-Construction
    3.3.3. AUTOSAR Multicore System
  3.4. Software Architecture Simulators
    3.4.1. Justification for Simulation Tools
    3.4.2. System Model Simulation Software
  3.5. Current Software Architecture Research Projects
  3.6. Portrait of the current Situation
  3.7. Summary of the Multicore Trend
II. Identification of Multicore System Parameters
4. Project Analysis to Identify Crucial Parameters
  4.1. Analysis Procedure
    4.1.1. Question Catalogue
    4.1.2. Three Domains of Investigation
  4.2. Analysed Projects
    4.2.1. Project 1: Online Camera Calibration
    4.2.2. Project 2: Power Management
    4.2.3. Project 3: Battery Management
  4.3. Results of Project Analysis
    4.3.1. Ratio of Parameter Influence
    4.3.2. General Influences of Parameters
5. Abstract System Model
  5.1. Requirements for the System-Model
  5.2. Simulation Tool Model Evaluation
    5.2.1. System Model of PRECISION PRO
    5.2.2. System Model of INCHRON
    5.2.3. System Model of SymTA/S
    5.2.4. System Model of Timing Architects
    5.2.5. System Model of AMALTHEA
  5.3. Concept of Abstract System Model
    5.3.1. Components of the System Model
    5.3.2. Software Function-Graph
    5.3.3. Hardware Architecture-Graph
    5.3.4. Specification-Graph for Mapping
6. Testcase Implementation
  6.1. Example Test-System
    6.1.1. Simulated Test-System
    6.1.2. Testcases
  6.2. Result of Tests
    6.2.1. Processor Core Runtime Execution
    6.2.2. Communication
    6.2.3. Memory Access
  6.3. Summary of Multicore System Parameters Identification
III. Evaluation of Software Architectures
7. Estimation-Procedure
  7.1. Estimation-Procedure in a Nutshell
  7.2. Steps of Estimation-Procedure
    7.2.1. Project Analysis
    7.2.2. Timing and Memory Requirements
    7.2.3. System Modelling
    7.2.4. Software Architecture Simulation
    7.2.5. Results of a Validated Software Architecture
    7.2.6. Feedback of Partly Implemented System
8. Implementation and Simulation
  8.1. Example Project Analysis – Online Camera Calibration
    8.1.1. Example Project Choice
    8.1.2. OCC Timing Requirements Analysis
  8.2. OCC Modelling
    8.2.1. OCC Software Function-Graph
    8.2.2. OCC Hardware Architecture
    8.2.3. OCC Mapping – Specification-Graph
  8.3. Simulation of the OCC Model with Tool Support
    8.3.1. Tasks for Tool Setup
    8.3.2. PRECISION PRO
    8.3.3. INCHRON
    8.3.4. SymTA/S
    8.3.5. Timing Architects
    8.3.6. AMALTHEA
  8.4. System Optimisation Possibilities
  8.5. OCC Implementation Results
9. Results of the Estimation-Procedure Evaluation
  9.1. Tool-Evaluation Results
  9.2. Findings of Estimation, Simulation and ECU-Behavior
    9.2.1. System-Specific Issues
    9.2.2. Communication Issues
    9.2.3. Memory Issues
    9.2.4. Timing Issues
  9.3. Summary of the Software Architecture Evaluation
10. Summary and Outlook
  10.1. Summary
  10.2. Usability of the Estimation-Procedure
  10.3. Outlook and Future Work
11. Bibliography
IV. Appendices
A. Appendices
  A.1. Embedded Multicore Technology Research Projects
    A.1.1. Simulation Software
    A.1.2. Multicore Software Research Projects
  A.2. Testcase Implementation Results
    A.2.1. Function Block Processor Core Executions
    A.2.2. Memory Access Mechanism
    A.2.3. Memory Access Timings of Different Datatypes
    A.2.4. Inter-Processor Communication
  A.3. Further OCC System Description
    A.3.1. OCC Timing Requirements of the FB
    A.3.2. INCHRON Validation Results
  A.4. Detailed System Optimisation
    A.4.1. Optimisation through Hardware Alternation
    A.4.2. Optimisation through Mapping Alternation
    A.4.3. Optimisation of Execution Timings
B. Estimation-Procedure Engineering Paper
  B.1. Components and Scope of Software Architecture
  B.2. Estimation-Procedure in a Nutshell
  B.3. Project Analysis
    B.3.1. System level analysis
    B.3.2. Communication Domain
    B.3.3. Processor Core Domain
    B.3.4. Memory Domain
    B.3.5. Timing and Memory Requirements
  B.4. System Modelling
    B.4.1. Function Model
    B.4.2. Function-Graph
    B.4.3. Possible ECU Target
    B.4.4. Architecture-Graph
    B.4.5. Software Architecture Mapping
    B.4.6. Domain Specific Decision Guide
  B.5. Software Architecture Simulation
  B.6. Results of a Simulated Software Architecture
  B.7. Feedback of Partly Implemented System for Software Architecture Improvement
  B.8. Benefits of the Estimation-Procedure
42

Peng, Yi-Wen, and 彭弈文. "Parallel Algorithm Design for CPU-GPU Heterogeneous Architectures." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/s4qc47.

Abstract:
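Doctoral thesis (博士)
National Taiwan University of Science and Technology (國立臺灣科技大學)
Department of Electronic Engineering (電子工程系)
Academic year 106 (ROC calendar)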
Recently, general-purpose computing on graphics processing units (GPGPU) has attracted a great deal of attention. As the technology advances, modern GPUs are used to accelerate general-purpose applications such as DNA sequence alignment, voice recognition, traffic simulation, and thermal simulation. By combining the best features of CPUs and GPUs, we can achieve even further computational gains. This paradigm, known as heterogeneous computing, aims to fully utilize both CPU and GPU architectures to execute a wide range of applications efficiently. In the era of big data, large amounts of data are processed by data mining and machine learning algorithms to obtain valuable information. Big data is collected day to day from many different resources and services, and huge quantities of data are produced every day by and about people, things, and their interactions. To explore cutting-edge topics in big data, we have to deal with high-dimensional data and complex graphs. Hence, GPUs are used to accelerate the algorithms for data analysis. In this thesis, we use two applications to investigate the design of GPU-accelerated algorithms and examine their performance on high-dimensional data and complex graphs. For high-dimensional data, we use k-dominant skyline queries to examine the performance of GPUs. The k-dominant skyline queries retrieve the preference points from databases, which is essential for many applications involving multi-criteria analysis. The challenge for k-dominant skyline queries is that the computational cost grows rapidly as the number of dimensions grows. In this thesis, we focus on partitioning and balancing the workload of k-dominant skyline queries. For complex graphs, maximal clique enumeration (MCE) is used to examine the performance of the GPU-accelerated algorithm. Maximal cliques are useful in many applications, such as social graph analysis and bioinformatics analysis. Because these applications commonly deal with situations involving massive data, it is crucial to devise an efficient algorithm that solves the MCE problem for large datasets. In this thesis, we accelerate maximal clique enumeration using the GPU and focus on handling the irregular behaviors.
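To make the k-dominant skyline query mentioned in the abstract concrete, the following is a minimal CPU-only sketch of the usual definition from the skyline literature: a point p k-dominates q if there are k dimensions on which p is no worse than q and p is strictly better on at least one of them. It assumes that smaller values are preferred in every dimension. The function names and the sample data are hypothetical, and the quadratic pairwise scan is only an illustration, not the partitioned GPU algorithm developed in the thesis.

```python
from typing import List, Sequence

Point = Sequence[float]


def k_dominates(p: Point, q: Point, k: int) -> bool:
    """Return True if p k-dominates q (assumption: smaller is better)."""
    # Number of dimensions on which p is at least as good as q.
    not_worse = sum(a <= b for a, b in zip(p, q))
    # Number of dimensions on which p is strictly better than q.
    strictly_better = sum(a < b for a, b in zip(p, q))
    # A set of k "not worse" dimensions containing one strict improvement
    # exists exactly when both counts are large enough.
    return not_worse >= k and strictly_better >= 1


def k_dominant_skyline(points: List[Point], k: int) -> List[Point]:
    """Naive O(n^2 * d) scan: keep points that no other point k-dominates."""
    return [
        q
        for j, q in enumerate(points)
        if not any(k_dominates(p, q, k) for i, p in enumerate(points) if i != j)
    ]


# Hypothetical 3-dimensional data set.
points = [(1, 9, 9), (2, 2, 2), (9, 1, 1), (3, 3, 3)]
full_skyline = k_dominant_skyline(points, 3)  # k = d: ordinary skyline
two_dominant = k_dominant_skyline(points, 2)  # stricter filter for k < d
```

With k equal to the number of dimensions the query reduces to the ordinary skyline. Lowering k makes domination easier, so fewer points survive. Every pair has to be compared in this naive form, and the work per pair grows with the number of dimensions, which is consistent with the cost concern raised in the abstract.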
43

Melia, Telemaco. "IP Converged Heterogeneous Mobility in 4G networks - Network-side Handover Management Strategies." Doctoral thesis, 2007. http://hdl.handle.net/11858/00-1735-0000-0006-B625-F.

44

Aldegheri, Stefano. "A model-based design flow for embedded vision applications on heterogeneous architectures." Doctoral thesis, 2020. http://hdl.handle.net/11562/1017960.

Abstract:
The ability to gather information from images comes naturally to humans and is one of the principal inputs for understanding the external world. Computer vision (CV) is the process of extracting such knowledge from the visual domain algorithmically. The computational power required to process this information is very high. Until recently, the only feasible way to meet non-functional requirements such as performance was to develop custom hardware, which is costly, time-consuming, and cannot be reused for general purposes. The recent introduction of low-power, low-cost heterogeneous embedded boards, in which CPUs are combined with heterogeneous accelerators such as GPUs, DSPs, and FPGAs, can combine the hardware efficiency needed for non-functional requirements with the flexibility of software development. Embedded vision is the term for the application of CV algorithms in the embedded field, which usually requires satisfying not only functional requirements but also non-functional requirements such as real-time performance, power, and energy efficiency. Rapid prototyping, early algorithm parametrization, testing, and validation of complex embedded video applications for such heterogeneous architectures are very challenging tasks. This thesis presents a comprehensive framework that: 1) Is based on a model-based paradigm. Unlike the standard state-of-the-art approaches, which require designers to model the algorithm manually in a programming language, the proposed approach allows rapid prototyping, algorithm validation, and parametrization in a model-based design environment (i.e., Matlab/Simulink). The framework relies on a multi-level design and verification flow by which the high-level model is semi-automatically refined towards the final automatic synthesis into the target hardware device. 2) Relies on a polyglot parallel programming model. The proposed model combines different programming languages and environments such as C/C++, OpenMP, PThreads, OpenVX, OpenCV, and CUDA to best exploit different levels of parallelism while guaranteeing a semi-automatic customization. 3) Optimizes the application performance and energy efficiency through a novel algorithm for the mapping and scheduling of the application tasks on the heterogeneous computing elements of the device. This algorithm, called exclusive earliest finish time (XEFT), takes into consideration the possible multiple implementations of a task for different computing elements (e.g., a task primitive for the CPU and an equivalent parallel implementation for the GPU). It introduces and takes advantage of the notion of exclusive overlap between primitives to improve the load balancing. This thesis is the result of three years of research activity, during which all the incremental steps made to compose the framework were tested on real case studies.
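As a rough illustration of what mapping tasks with several implementations onto heterogeneous computing elements involves, the sketch below shows a plain greedy earliest-finish-time rule. This is not XEFT: the exclusive-overlap notion from the abstract is not modelled, communication costs are ignored, and the task names, cost figures, and function name are all hypothetical. It assumes the tasks are supplied in a valid topological order.

```python
from typing import Dict, List, Tuple

# (computing element, start time, finish time)
Slot = Tuple[str, float, float]


def greedy_eft_schedule(
    tasks: List[str],
    preds: Dict[str, List[str]],
    cost: Dict[str, Dict[str, float]],
) -> Dict[str, Slot]:
    """Place each task on the element where it would finish earliest.

    tasks: task names in topological order (assumption).
    preds: predecessors of each task.
    cost:  cost[task][element] is the run time of the implementation of
           `task` available for `element`; a missing key means there is
           no implementation for that element.
    """
    element_free: Dict[str, float] = {}  # time at which each element is idle
    schedule: Dict[str, Slot] = {}
    for task in tasks:
        # A task may start only after all of its predecessors have finished.
        ready = max((schedule[p][2] for p in preds.get(task, [])), default=0.0)
        best: Slot = ("", 0.0, float("inf"))
        for element, run_time in cost[task].items():
            start = max(ready, element_free.get(element, 0.0))
            finish = start + run_time
            if finish < best[2]:
                best = (element, start, finish)
        schedule[task] = best
        element_free[best[0]] = best[2]
    return schedule


# Hypothetical vision pipeline: two filters have both CPU and GPU versions.
tasks = ["load", "blur", "edges", "merge"]
preds = {"blur": ["load"], "edges": ["load"], "merge": ["blur", "edges"]}
cost = {
    "load": {"cpu": 2.0},
    "blur": {"cpu": 6.0, "gpu": 2.0},
    "edges": {"cpu": 3.0, "gpu": 2.0},
    "merge": {"cpu": 1.0},
}
schedule = greedy_eft_schedule(tasks, preds, cost)
makespan = max(finish for _, _, finish in schedule.values())
```

In this toy instance the rule sends "blur" to the GPU and keeps "edges" on the CPU, because the GPU is already busy when "edges" becomes ready. That is the kind of load-balancing decision a mapping and scheduling algorithm has to make when each task has more than one implementation.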