Dissertations on the topic "Heterogeneous embedded system"
Browse the top 50 dissertations for research on the topic "Heterogeneous embedded system".
Fischaber, Scott Johan. "Memory-centric system level design of heterogeneous embedded DSP systems." Thesis, Queen's University Belfast, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.491885.
Peterson, Thomas. "Dynamic Allocation for Embedded Heterogeneous Memory : An Empirical Study." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-223904.
Embedded systems are ubiquitous and contribute to our standard of living in many respects by providing functionality within larger systems. To operate, embedded systems require well-functioning hardware and software, as well as interfaces between them. These three must be continually reworked as new useful technologies for embedded systems emerge. One change these systems are currently undergoing is experimentation with new memory management techniques for RAM, as new non-volatile RAM technologies have been developed. These memories often exhibit asymmetric read and write latencies, which motivates a memory design based on several different non-volatile RAMs. As a consequence of these properties and memory designs, there is a need to find memory allocation techniques that minimize the resulting latencies. This document addresses the problem of memory allocation on heterogeneous memories through an empirical study. In the first part of the study, allocation techniques based on a linked list, a bitmap, and a buddy system were studied. On this basis, it was concluded that the linked list was superior to the alternatives. Memory architectures with multiple memory banks were then devised, and several strategies for selecting a memory bank were developed. These strategies were based on size-based thresholds and on the utilization of the different memory banks. The evaluation of these strategies did not yield any major conclusions, but it showed that different strategies suited different application behaviors.
Pop, Traian. "Analysis and Optimisation of Distributed Embedded Systems with Heterogeneous Scheduling Policies." Doctoral thesis, Linköping : Department of Computer and Information Science, Linköpings universitet, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-8934.
Hegde, Sridhar. "FUNCTIONAL ENHANCEMENT AND APPLICATIONS DEVELOPMENT FOR A HYBRID, HETEROGENEOUS SINGLE-CHIP MULTIPROCESSOR ARCHITECTURE." UKnowledge, 2004. http://uknowledge.uky.edu/gradschool_theses/252.
Souza, Jeckson Dellagostin. "A reconfigurable heterogeneous multicore system with homogeneous ISA." Biblioteca Digital de Teses e Dissertações da UFRGS, 2016. http://hdl.handle.net/10183/140321.
Given the large diversity of embedded applications found in current portable devices, one must exploit both thread- and instruction-level parallelism for energy and performance reasons. While MPSoCs (Multiprocessor Systems-on-Chip) are widely used for this purpose, they fall short in software productivity, since they comprise different ISAs (Instruction Set Architectures) that must be programmed separately. General-purpose multicores, on the other hand, implement a single ISA but are composed of a homogeneous set of very power-hungry superscalar processors. In this dissertation, we show how a reconfigurable unit can be used effectively to provide a number of different heterogeneous configurations while still sustaining the same ISA, reaching high performance at low energy cost. To ensure ISA compatibility, we use a binary translation mechanism that transforms code to be executed on the fabric at run time. Using representative benchmarks, we show that one version of the heterogeneous system can outperform its homogeneous counterpart by 59% in performance and 10% in energy on average, with EDP (Energy-Delay Product) improvements in almost every scenario. Furthermore, this work proposes and evaluates six schedulers for the heterogeneous system: two static algorithms, which allocate threads to the first free core, where they run for the entire execution; an Instruction Count (IC) Driven scheduler, which reallocates threads at synchronization points according to their instruction count; a Feedback scheduler, which uses data from inside the reconfigurable unit to reallocate threads; the PC-Feedback scheduler, which adds a reuse mechanism to the Feedback scheduler; and an Oracle scheduler, which is capable of deciding the best possible thread allocation. We show that the static algorithm can reach high performance in applications with high parallelism; for uniform performance across all applications, however, the Feedback and PC-Feedback algorithms are better suited.
Patel, Hiren Dhanji. "HEMLOCK: HEterogeneous ModeL Of Computation Kernel for SystemC." Thesis, Virginia Tech, 2003. http://hdl.handle.net/10919/9632.
Master of Science
Bergenhem, Carl, and Magnus Jonsson. "Two Protocols with Heterogeneous Real-Time Services for High-Performance Embedded Networks." Högskolan i Halmstad, Centrum för forskning om inbyggda system (CERES), 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-21296.
Swegert, Eric B. "RTOS Tutorials for a Heterogeneous Class of Senior and Beginning Graduate Students." University of Cincinnati / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1367934958.
Gantel, Laurent. "Hardware and software architecture facilitating the operation by the industry of dynamically adaptable heterogeneous embedded systems." Phd thesis, Université de Cergy Pontoise, 2014. http://tel.archives-ouvertes.fr/tel-01019909.
Robino, Francesco. "A model-based design approach for heterogeneous NoC-based MPSoCs on FPGA." Licentiate thesis, KTH, Elektroniksystem, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-145521.
QC 20140609
Abdallah, Fadel. "Optimization and Scheduling on Heterogeneous CPU/FPGA Architecture with Communication Delays." Thesis, Université de Lorraine, 2017. http://www.theses.fr/2017LORR0301.
The domain of embedded systems has become increasingly attractive in recent years with the development of increasingly computationally demanding applications, to which traditional processor-based architectures (either single- or multi-core) cannot always respond in terms of performance. While multiprocessor or multicore architectures have now become widespread, it is often necessary to add dedicated processing circuits to them, based in particular on reconfigurable circuits, to meet specific needs and strong constraints, especially when real-time processing is required. This work presents a study of scheduling problems on reconfigurable heterogeneous architectures based on general-purpose processors (CPUs) and programmable circuits (FPGAs). The main objective is to run an application, given as a Data Flow Graph (DFG), on a heterogeneous CPU/FPGA architecture so as to minimize the total running time, or makespan (Cmax). In this thesis, we consider two case studies: a scheduling case that takes intercommunication delays into account and in which the FPGA device can perform a single task at a time, and another case that takes parallelism in the FPGA into account, where the FPGA can perform several tasks in parallel while respecting the area constraint. For the first case, we propose two new optimization approaches based on genetic algorithms: GAA (Genetic Algorithm Approach) and MGAA (Modified Genetic Algorithm Approach). We also compare these algorithms to a Branch & Bound method. The proposed approaches (GAA and MGAA) offer a very good compromise between the quality of the solutions obtained (with respect to the makespan criterion) and the computational time required to solve large-scale problems, unlike the proposed Branch & Bound method and the other exact methods found in the literature.
For the second case, we first implemented an updated method based on genetic algorithms to solve the temporal partitioning problem in an FPGA circuit using dynamic reconfiguration. This method provides good solutions in a reasonable running time. We then improved our previous MGAA approach to obtain a new approach called MGA (Multithreaded Genetic Algorithm), which allows us to provide solutions to the partitioning problem. In addition, we proposed an algorithm based on simulated annealing, called MSA (Multithreaded Simulated Annealing). These two approaches, both based on metaheuristic methods, provide approximate solutions to the scheduling and partitioning problems on a heterogeneous computing system within a reasonable time.
Diarra, Rokiatou. "Automatic Parallelization for Heterogeneous Embedded Systems." Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLS485.
Recent years have seen an increase in heterogeneous architectures combining multi-core CPUs with accelerators such as GPUs, FPGAs, and Intel Xeon Phi. GPUs can achieve significant performance for certain categories of applications. Nevertheless, achieving this performance with low-level APIs (e.g. CUDA, OpenCL) requires rewriting the sequential code, having a good knowledge of the GPU architecture, and applying complex optimizations that are sometimes not portable. On the other hand, directive-based programming models (e.g. OpenACC, OpenMP) offer a high-level abstraction of the underlying hardware, thus simplifying code maintenance and improving productivity. They allow users to accelerate their sequential codes on GPUs by simply inserting directives. OpenACC/OpenMP compilers have the daunting task of applying the necessary optimizations based on the user-provided directives and generating efficient code that takes advantage of the GPU architecture. Although OpenACC/OpenMP compilers are mature and able to apply some optimizations automatically, the generated code may not achieve the expected speedup, as the compilers do not have a full view of the whole application. Thus, there is generally a significant performance gap between codes accelerated with OpenACC/OpenMP and those hand-optimized with CUDA/OpenCL. To help programmers efficiently speed up their legacy sequential codes on GPUs with directive-based models, and to broaden the impact of OpenMP/OpenACC in both academia and industry, several research issues are discussed in this dissertation. We investigated the OpenACC and OpenMP programming models and proposed an effective application parallelization methodology based on directive-based programming approaches. Our application porting experience revealed that it is insufficient to simply insert OpenMP/OpenACC offloading directives to inform the compiler that a particular code region must be compiled for GPU execution.
It is essential to combine offloading directives with loop parallelization constructs. Although current compilers are mature and perform several optimizations, the user may provide them with more information through the clauses of loop parallelization constructs in order to obtain optimized code. We also reveal the challenge of choosing good loop schedules. The default loop schedule chosen by the compiler may not produce the best performance, so the user has to try different loop schedules manually to improve performance. We demonstrate that the OpenMP and OpenACC programming models can achieve the best performance with less programming effort, but OpenMP/OpenACC compilers quickly reach their limits when the offloaded region is compute- or memory-bound and contains several nested loops. In such cases, low-level languages may be used. We also discuss the pointer aliasing problem in GPU codes and propose two static analysis tools that automatically perform source-level type qualifier insertion and scalar promotion to solve aliasing issues.
Valente, Frederico Miguel Goulão. "Static analysis on embedded heterogeneous multiprocessor systems." Master's thesis, Universidade de Aveiro, 2008. http://hdl.handle.net/10773/2180.
Hines, Kenneth J. "Coordination-centric debugging for heterogeneous distributed embedded systems /." Thesis, Connect to this title online; UW restricted, 2000. http://hdl.handle.net/1773/6914.
Butko, Anastasiia. "Techniques de simulation rapide quasi cycle-précise pour l'exploration d'architectures multicoeur." Thesis, Montpellier, 2015. http://www.theses.fr/2015MONTS144/document.
Since computational needs grow steeply each year, HPC technology has become a driving force for numerous scientific and consumer areas. The most powerful supercomputers have progressed from TFLOPS to PFLOPS over the last ten years. However, their extremely high power consumption, and therefore high cost, has pushed researchers to explore more energy-efficient technologies, such as the use of low-power embedded SoCs. The evolution of emerging manycore systems, forecast to feature hundreds of cores by the end of the decade, calls for efficient solutions for design space exploration and debugging. Available industrial and academic simulators differ in their simulation speed/accuracy trade-offs. Cycle-approximate simulators are popular and attractive for architectural exploration. Although they enable flexible and detailed architecture evaluation, cycle-approximate simulators entail slow simulation speeds, which limits their applicability to systems with hundreds of cores. This calls for alternative approaches capable of providing high simulation speed while preserving the accuracy that is crucial to architectural exploration. In this thesis, we evaluate cycle-approximate simulation techniques for fast and accurate exploration of multi- and manycore architectures. Aiming to significantly reduce simulation time while preserving accuracy at the cycle-approximate level, we propose a hybrid trace-oriented approach to enable flexible manycore architecture simulation. We design a set of simulation techniques to overcome the main weaknesses of the trace-oriented approach. The trace synchronization technique manages the control and data dependencies arising from the abstraction of processor cores. The trace replication technique is proposed to simulate manycore architectures using a finite set of pre-collected traces.
The computation phase scaling technique is designed to enable flexible switching between multiple processor models without considering microarchitectural differences, while taking the computation speed ratio into account. Based on the proposed simulation environment, we explore several manycore architectures in terms of performance and energy-efficiency trade-offs.
Eriksson, Jonas. "Partitioning methodology validation for embedded systems design." Thesis, Linköpings universitet, Programvara och system, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-129332.
Vincenzo, Stoico. "A Model-Driven Approach for modeling Heterogeneous Embedded Systems." Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-44199.
Pop, Traian. "Scheduling and Optimisation of Heterogeneous Time/Event-Triggered Distributed Embedded Systems." Licentiate thesis, Linköping : Univ, 2003. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-5691.
Lifa, Adrian Alin. "Hardware/Software Codesign of Embedded Systems with Reconfigurable and Heterogeneous Platforms." Doctoral thesis, Linköpings universitet, Programvara och system, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-117637.
Nikov, Kris. "Power modelling and analysis on heterogeneous embedded systems : a systematic approach." Thesis, University of Bristol, 2018. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.743036.
Leija, Antonio M. "AN INVESTIGATION INTO PARTITIONING ALGORITHMS FOR AUTOMATIC HETEROGENEOUS COMPILERS." DigitalCommons@CalPoly, 2015. https://digitalcommons.calpoly.edu/theses/1546.
Nam, HyunSuk. "Security-driven Design Optimization of Mixed Cryptographic Implementations in Distributed, Reconfigurable, and Heterogeneous Embedded Systems." Diss., The University of Arizona, 2017. http://hdl.handle.net/10150/624287.
Mendoza, Cervantes Francisco [Verfasser]. "A Problem-Oriented Approach for Dynamic Verification of Heterogeneous Embedded Systems / Francisco Mendoza Cervantes." Karlsruhe : KIT Scientific Publishing, 2014. http://www.ksp.kit.edu.
Oliva, Venegas Yaset. "High level modeling of run-time managers for the design of heterogeneous embedded systems." Rennes, INSA, 2012. http://www.theses.fr/2012ISAR0017.
To cope with the ever-increasing difficulty of designing embedded systems, designers have to consider new methods and tools that raise the level of abstraction of the description. These modern systems are usually composed of multiple complex digital signal processing functions that are often implemented in heterogeneous (software and hardware) blocks. In this context, the current trend is to incorporate a kernel that manages these different processing blocks in a flexible and dynamic manner. As these blocks become too complex to handle, specific services can be added to the kernel to manage them. This is particularly the case when the functions may run on multiple heterogeneous processors or in reconfigurable hardware. Considering all these aspects, the first part of this thesis contributes to a design tool and proposes a methodology for exploring the structure of an embedded system. The tool allows the three basic elements of a system (the application, the architecture, and the operating system, or kernel) to be specified from high-level models. The methodology consists of specifying, simulating, and analyzing these three basic elements. The exploration process is performed iteratively until a satisfactory solution is reached. The second part of the thesis focuses on an extension of a core model to dynamically manage the migration of tasks between different blocks (processors or reconfigurable areas). The proposed service is designed to manage shared-memory architectures containing a kernel that supports a master/slave configuration. This Offloading service is part of the kernel model and adds new features (task migration, heterogeneous task management, and smart placement).
Dekkiche, Djamila. "Programming methodologies for ADAS applications in parallel heterogeneous architectures." Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLS388/document.
Computer Vision (CV) is crucial for understanding and analyzing the driving scene in order to build more intelligent Advanced Driver Assistance Systems (ADAS). However, implementing CV-based ADAS in a real automotive environment is not straightforward. Indeed, CV algorithms combine the challenges of high computing performance and algorithm accuracy. To meet these requirements, new heterogeneous circuits are being developed. They consist of several processing units with different parallel computing technologies, such as GPUs and dedicated accelerators. To better exploit the performance of such architectures, different languages are required depending on the underlying parallel execution model. In this work, we investigate various parallel programming methodologies based on a complex case study of stereo vision. We introduce the relevant features and limitations of each approach. We evaluate the programming tools employed mainly in terms of computation performance and programming productivity. The feedback from this research is crucial for the development of future CV algorithms suited to parallel architectures, with the best compromise between computing performance, algorithm accuracy, and programming effort.
Ringenson, Josefin. "Efficiency of CNN on Heterogeneous Processing Devices." Thesis, Linköpings universitet, Programvara och system, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-155034.
Jädal, Thomas, and Dissel Dirk Postol. "Dynamic Bandwidth Allocation for Wireless Nodes with Software Defined Networking in Heterogeneous Networks with Embedded Systems." Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-44115.
Motta, Rodrigo Bittencourt. "Reduzindo o consumo de energia em MPSoCs heterogêneos via clock gating." Biblioteca Digital de Teses e Dissertações da UFRGS, 2008. http://hdl.handle.net/10183/15312.
In this work we present an architecture that enables the generation of bus-based, scalable heterogeneous Multiprocessor Systems-on-Chip (MPSoCs) supporting different memory organizations. Intertask communication is specified by means of a shared memory structure that ensures collision avoidance and promotes energy savings through dynamic triggering of clock gating. We also introduce a Dynamic Core Freezing (DCF) technique, which boosts energy savings by taking advantage of processor idle cycles during memory accesses. Moreover, the combination of memory organizations allows the architecture to support easy task migration by saving the task context in the shared data memory. We also present a high-level simulator, based on the proposed architecture, created to quantify the energy savings enabled by the clock gating and DCF techniques. The simulator accepts as input execution trace files of Java applications, from which it generates a new file that maps the instructions found in the trace file to different instruction classes. In this way, we can model different processor architectures, using the mapping file to simulate the MPSoC. The simulator also lets us experiment with different memory organizations to estimate their impact on the executed instructions, bus contention, and energy consumption. As a case study, we modeled different versions of a Java processor in order to experiment with different execution patterns over different memory organizations. Experiments based on a synthetic application running on an MPSoC containing different versions of a Java processor show a large improvement in energy efficiency with a minimal area cost. In addition, we present experiments based on applications from the SPECjvm98 benchmark, which show the impact on energy efficiency when the application type changes.
Finally, the experiments show a large improvement in energy efficiency when the DCF technique is applied to the MPSoC memories.
Wolvers, Adrianus Hendrikus Cornelis. "Integrating requirements authoring and design tools for heterogeneous and multicore embedded systems : Using the iFEST Tool Integration Framework." Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-18712.
Today there is a wide variety of tools that can be applied in each phase of the system development life cycle. Each tool uses its own underlying metamodel. These metamodels can vary considerably in both size and complexity, which makes them difficult to integrate. One solution to this problem is to build a tool integration framework based on a single, common metamodel. The goal of the iFEST project is to specify and develop a tool integration framework for tools used in the development of heterogeneous and multi-core embedded systems. This framework is called the iFEST Tool Integration Framework, or iFEST IF. iFEST IF uses web services based on a standard called OSLC (Open Services for Lifecycle Collaboration), together with specifications that allow the tools in the tool chain to communicate with each other. To validate the framework, a case study named "Wind Turbine" was carried out with several embedded systems tools. Tools used to design, implement, and test a wind turbine controller were integrated into a prototype tool chain. To process and handle internal data through web services, a tool adapter is needed. This work describes the development of a tool adapter for the requirements management module of HP Application Lifecycle Management (ALM), one of the tools used in the wind turbine case study. The challenges that arose during the development of the tool adapter have been generalized. These challenges indicate that, even though a tool integration framework exists, tool integration remains a difficult task. This is especially true when tools were not developed with tool integration in mind from the start.
ARTEMIS iFEST
Radhakrishnan, Swarnalatha. "Heterogeneous multi-pipeline application specific instruction-set processor design and implementation." University of New South Wales, Computer Science and Engineering, 2006. http://handle.unsw.edu.au/1959.4/29161.
Wahab, Muhammad Abdul. "Hardware support for the security analysis of embedded softwares : applications on information flow control and malware analysis." Thesis, CentraleSupélec, 2018. http://www.theses.fr/2018CSUP0003.
Information flow control (also known as Dynamic Information Flow Tracking, DIFT) allows a user to detect several types of software attacks, such as buffer overflows or SQL injections. In this thesis, a solution based on the ARM Cortex-A9 processor family is proposed. Our approach relies on ARM CoreSight components, which are able to trace software as it is executed by the processor, in order to perform information flow tracking. The DIFT coprocessor proposed in this thesis is implemented in an Artix-7 FPGA, embedded in a Zynq System-on-Chip (SoC) provided by Xilinx. It is shown that using ARM CoreSight components does not add latency overhead, while giving a better communication time between the ARM processor and the DIFT coprocessor.
Saussard, Romain. "Méthodologies et outils de portage d’algorithmes de traitement d’images sur cibles hardware mixte." Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLS176/document.
Car manufacturers increasingly provide Advanced Driver Assistance Systems (ADAS) based on cameras and image processing algorithms. To embed ADAS applications, semiconductor companies offer heterogeneous architectures. These Systems-on-Chip (SoCs) are composed of several processors with different capabilities on the same chip. However, with the increasing complexity of such systems, it becomes more and more difficult for an automotive actor to choose a SoC that can execute a given ADAS application while meeting real-time constraints. In addition, embedding algorithms on this type of hardware is not trivial: one needs to determine how to spread the computational load between the different processors, in other words the mapping of the computational load. In response to this issue, we defined in this thesis a global methodology to study the embeddability of image processing algorithms for real-time execution. This methodology predicts the embeddability of a given image processing algorithm on several heterogeneous SoCs by automatically exploring the possible mappings. It is based on three major contributions: the modeling of an algorithm and its real-time constraints, the characterization of a heterogeneous SoC, and a performance prediction approach that can address different types of architectures.
Arras, Paul-Antoine. "Ordonnancement d'applications à flux de données pour les MPSoC embarqués hybrides comprenant des unités de calcul programmables et des accélérateurs matériels." Thesis, Bordeaux, 2015. http://www.theses.fr/2015BORD0031/document.
Although numerous electronic devices are nowadays able to play video content in real time and offer high-quality reproduction, video decoding in embedded systems has not yet become a trivial process. As a matter of fact, recent codecs such as H.264 and HEVC exhibit such complexity that resorting to a mixed software-hardware architecture is almost unavoidable. However, programming this kind of platform efficiently is well known to be tricky. This thesis addresses the issue of developing streaming applications for hybrid embedded targets and executing them efficiently, and proposes several contributions. The first is an extension of the classical list-scheduling heuristics to take memory constraints into account. The second is a dataflow execution model compatible with most existing models and with a large set of hardware platforms, as well as a dynamic scheduler. Lastly, numerous developments have been carried out on a real-world architecture from STMicroelectronics to demonstrate the feasibility of the approach.
Vodel, Matthias. "Funkstandardübergreifende Kommunikation in Mobilen Ad Hoc Netzwerken." Doctoral thesis, Universitätsbibliothek Chemnitz, 2010. http://nbn-resolving.de/urn:nbn:de:bsz:ch1-201001164.
Negreiros, Ângelo Lemos Vidal de. "Desenvolvimento e Avaliação de Simulação Distribuída para Projeto de Sistemas Embarcados com Ptolemy." Universidade Federal da Paraíba, 2014. http://tede.biblioteca.ufpb.br:8080/handle/tede/6106.
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
Nowadays, embedded systems have a huge amount of computational power and, consequently, high complexity. It is quite common to find different applications being executed on embedded systems. Embedded system design demands methods and tools that allow simulation and verification in an efficient and practical way. This work proposes the development and evaluation of a solution for modeling and simulating embedded systems with heterogeneous Models of Computation in a distributed way, through the integration of Ptolemy II and the High Level Architecture (HLA), a middleware for distributed discrete event simulation, in order to create an environment for high-performance execution of large-scale heterogeneous models. Experimental results demonstrated that non-distributed simulation, as well as distributed simulation with only a few machines (one, two, or three computers), can be infeasible in some situations. The feasibility of integrating the two technologies was also demonstrated, along with the advantages of using them in many different scenarios. This conclusion was possible because the experiments captured data during the simulation: execution time, data exchanged, and CPU usage. One of the experiments demonstrated that a speedup factor of 4 was achieved when a model with 4,000 actors was distributed across 8 different machines, in an experiment that used up to 16 machines. Furthermore, the experiments also showed that the use of HLA indeed presents great advantages, although with certain limitations.
Bouhadiba, Tayeb Sofiane. "42, Une approche à composants pour le prototypage virtuel des systèmes embarqués hétérogènes." Grenoble, 2010. https://theses.hal.science/tel-00539648.
The work presented in this thesis deals with virtual prototyping of heterogeneous embedded systems. The complexity of these systems makes it difficult to find an optimal solution. Hence, engineers usually run simulations, which require a virtual prototype of the system. Virtual prototyping of an embedded system aims at providing an executable model of it, in order to study its functional as well as its non-functional aspects. Our contribution is the definition of a new component-based approach for the virtual prototyping of embedded systems, called 42. 42 is not a new language for the design of embedded systems; it is a tool for describing components and assemblies for embedded systems at the system level. Virtual prototyping of embedded systems must take into account their heterogeneous aspect. Following Ptolemy, several approaches propose a catalog of MoCCs (Models of Computation and Communication) and a framework for hierarchically combining them in order to model heterogeneity. As in Ptolemy, 42 allows components and MoCCs to be organized hierarchically. However, the MoCCs in 42 are described by means of programs manipulating a small set of basic primitives to activate components and to manage their communication. A component-based approach like 42 requires a formalism for specifying components. 42 proposes several means for specifying components. We present these means and give particular attention to 42 control contracts. 42 is designed independently of any language or formalism and may be used jointly with existing approaches. We provide a proof of concept to demonstrate the interest of using 42 and its control contracts with existing approaches.
Endo, Fernando Akira. "Génération dynamique de code pour l'optimisation énergétique." Thesis, Université Grenoble Alpes (ComUE), 2015. http://www.theses.fr/2015GREAM044/document.
In computing systems, energy consumption is limiting the performance growth experienced in the last decades. Consequently, computer architecture and software development paradigms will have to change if we want to avoid performance stagnation in the next decades. In this new scenario, new architectural and micro-architectural designs can offer the possibility to increase the energy efficiency of hardware, thanks to hardware specialization, such as heterogeneous configurations of cores, new computing units and accelerators. On the other hand, with this new trend, software development must cope with the lack of performance portability to ever-changing hardware and with the increasing gap between the performance that programmers can extract and the maximum achievable performance of the hardware. To address this issue, this thesis contributes a methodology and proof of concept of a run-time auto-tuning framework for embedded systems. The proposed framework can both adapt code to a micro-architecture unknown prior to compilation and explore auto-tuning possibilities that are input-dependent. In order to study the capability of the proposed approach to adapt code to different micro-architectural configurations, I developed a simulation framework of heterogeneous in-order and out-of-order ARM cores. Validation experiments demonstrated average absolute timing errors around 7 % when compared to real ARM Cortex-A8 and A9, and relative energy/performance estimations within 6 % for the Dhrystone 2.1 benchmark when compared to Cortex-A7 and A15 (big.LITTLE) CPUs. An important component of the run-time auto-tuning framework is a run-time code generation tool, called deGoal. It defines a low-level dynamic DSL for computing kernels. During this thesis, I ported deGoal to the ARM Thumb-2 ISA and added new features for run-time auto-tuning.
A preliminary validation on ARM processors showed that deGoal can on average generate machine code of equivalent or higher quality compared to programs written in C, including manually vectorized code. The methodology and proof of concept of run-time auto-tuning in embedded processors were developed around two kernel-based applications, extracted from the PARSEC 3.0 suite and its hand-vectorized version PARVEC. In the favorable application, average speedups of 1.26 and 1.38 were obtained on real and simulated cores, respectively, going up to 1.79 and 2.53 (all run-time overheads included). I also demonstrated through simulations that run-time auto-tuning of SIMD instructions for in-order cores can outperform the reference vectorized code run on similar out-of-order cores, with an average speedup of 1.03 and an energy efficiency improvement of 39 %. The unfavorable application was chosen to show that the proposed approach has negligible overheads when better kernel versions cannot be found. When both applications run on real hardware, the run-time auto-tuning performance is on average only 6 % away from the performance obtained by the best statically found kernel implementations.
Gómez, Cárdenas Carlos Ernesto. "Une approche multi-vue pour la modélisation système de propriétés fonctionnelles et non-fonctionnelles." Phd thesis, Université Nice Sophia Antipolis, 2013. http://tel.archives-ouvertes.fr/tel-00931001.
Fan, Ren-Feng, and 樊人鳳. "Image Processing Using A Heterogeneous Dual-core Embedded System." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/59878207299560183374.
淡江大學
航空太空工程學系碩士班
101
As technology develops and is widely applied to 3C products, shortening program processing time has become a very important issue. This thesis investigates the efficiency of image processing using a DSP collaborating with an ARM core, for potential applications to onboard vision navigation of unmanned aerial vehicles (UAVs). Conventionally, a UAV with vision navigation either transmits the images to the ground station for processing or processes the images on the CPU. By comparing the processing time of ARM+DSP with that of ARM only, we are able to conclude which method is more efficient. Specifically, this thesis illustrates step by step how to install operating systems on the ARM core and how to do image processing using the DSP in collaboration with the ARM core. Sample programs are provided for readers. Images of various sizes are provided as examples to show the efficiency of our proposal. This thesis potentially contributes to the vision navigation of UAVs at Tamkang University.
Chen, I.-Hua, and 陳怡樺. "Full System Emulation of Embedded Heterogeneous Multicores Based on QEMU." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/jvgd77.
國立清華大學
資訊工程學系所
106
The emerging edge computing is poised to move computing and intelligence to the network's edge so as to be close to the data sources for fast responses and reduced network traffic. In edge computing, edge devices need to encompass a wide variety of applications or services, from data preprocessing and intelligence inference to multimedia human interfaces. Many such applications are well suited for special-purpose hardware accelerators. With the increasing number of accelerators on edge devices, a promising architecture for edge devices is an asymmetric heterogeneous multicore that incorporates one or more microcontrollers to offload accelerator scheduling and interrupt handling from the main CPU, as exemplified in the NVIDIA Deep Learning Accelerator (NVDLA). To develop such computing systems, virtual platforms such as QEMU are often used. Unfortunately, QEMU only supports symmetric homogeneous multicore systems. In this paper, we tackle the challenging problem of supporting asymmetric heterogeneous multicore systems on QEMU by considering two possible implementation strategies: one-process and multi-process. The two strategies are implemented and compared qualitatively and quantitatively.
Li, Bo-Jhen, and 李柏箴. "Embedded system design of heterogeneous network with IoT and M2M." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/3ruesc.
國立虎尾科技大學
資訊工程系碩士班
105
Following the trend of Industry 4.0, a future factory will evolve from ‘automated manufacturing’ to ‘smart-automated manufacturing’. This paper therefore aims to create a system that integrates traditional precision machinery with the Internet of Things (IoT) and Machine to Machine (M2M) communication. Based on the concept of Industry 4.0, the ultimate goal is to integrate 3D printers with the smart factory. In the smart 3D printer factory, the internal production line is divided into five stations, using an embedded development platform as the gateway that connects the heterogeneous network and performs protocol translation. The MQTT communication protocol is used to establish communication between the stations, forming a heterogeneous network of Modbus, UART, and Wi-Fi. Because most traditional precision machines are expensive and not smart at all, this research designs an IoT module that is embedded in the machines at the five stations, making the traditional machines smart. M2M and IoT network connections link the stations together and let them communicate. Connecting the smart production line with any smart handheld device then enables customers to customize their products from design and manufacturing to shipment. Moreover, factory staff can monitor the production line via a computer program. The loss caused by machine failure or crashes can therefore be reduced, with fewer staff needed for data collection and analysis. In the end, a smart manufacturing factory that is ‘customized’ and ‘unmanned’ is realized, as an ultimate goal of smart manufacturing in the era of Industry 4.0.
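The protocol translation performed by such a gateway can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: it assumes the station answers with a standard Modbus RTU "read holding registers" (function 0x03) response, and the topic layout `factory/<station>/modbus/<unit>`, the JSON payload shape, and the function names are all hypothetical. Actually publishing the message would need an MQTT client library, which is deliberately left out.

```python
import json
import struct


def crc16_modbus(data: bytes) -> int:
    """Standard Modbus RTU CRC-16 (initial value 0xFFFF, reflected polynomial 0xA001)."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


def modbus_to_mqtt(frame: bytes, station: str):
    """Translate one Modbus RTU read-holding-registers response into (topic, payload)."""
    if len(frame) < 5:
        raise ValueError("frame too short")
    body = frame[:-2]
    received_crc = struct.unpack("<H", frame[-2:])[0]  # CRC is sent low byte first
    if crc16_modbus(body) != received_crc:
        raise ValueError("CRC mismatch")
    unit, function, byte_count = body[0], body[1], body[2]
    if function != 0x03 or byte_count != len(body) - 3 or byte_count % 2:
        raise ValueError("not a valid read-holding-registers response")
    # Register values are 16-bit big-endian words.
    registers = list(struct.unpack(f">{byte_count // 2}H", body[3:]))
    topic = f"factory/{station}/modbus/{unit}"      # hypothetical topic layout
    payload = json.dumps({"registers": registers})  # hypothetical payload shape
    return topic, payload


# Build a sample response (unit 1, two registers: 42 and 256) and translate it.
sample_body = bytes([0x01, 0x03, 0x04, 0x00, 0x2A, 0x01, 0x00])
sample_frame = sample_body + struct.pack("<H", crc16_modbus(sample_body))
print(modbus_to_mqtt(sample_frame, "station1"))
```

Keeping the translation as a pure function, separate from serial I/O and from the MQTT client, makes the gateway logic easy to test without any hardware attached.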
Tsai, Chih-hsiang, and 蔡智翔. "Design and Implementations of H.264/AVC Encoder on Heterogeneous Dual Core Embedded System." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/89409856967177130909.
國立雲林科技大學
電子與資訊工程研究所
99
In recent years, as a wide variety of mobile phones has been introduced, competition among mobile phone manufacturers has become intense. To achieve better performance, dual-core processors have been adopted. Multimedia capability is a major feature that consumers pay close attention to, so multimedia phones have begun to support high-definition video recording. However, the resulting volume of data is too large for the limited storage capacity of mobile phones. To reduce the size of recorded data, the phone must support the H.264/AVC (MPEG-4 Part 10/Advanced Video Coding) compression format. H.264/AVC is now one of the most widespread compression techniques. Compared with earlier compression standards (MPEG-4, H.263, and MPEG-2), H.264/AVC can save approximately 50% of the data volume. However, its complex coding techniques make its computational load higher than that of the earlier standards. This paper proposes an H.264/AVC Baseline Profile encoder system design based on a heterogeneous dual-core platform (DaVinci) with a 297 MHz ARM (Advanced RISC Machines) core and a 594 MHz DSP (Digital Signal Processor). To make full use of both cores, the DSP handles the bulk of the computation and the ARM core handles the remaining operations. The DSP code is further optimized internally by exploiting SRAM (Static Random Access Memory) and EDMA (Enhanced Direct Memory Access), together with DSP assembly code and fast algorithms. With the overall performance improved in this way, the proposed design achieves 25 fps (frames per second) at QVGA (Quarter Video Graphics Array, 320x240) resolution.
In comparison with the paper "Implementation and Optimization of Real-Time H.264/AVC Main Profile Encoder on DM648 DSP" [29], published at ICSAP (International Conference on Signal Acquisition and Processing) in April 2009, this paper achieves 65 to 74 fps (frames per second) at QCIF (Quarter Common Intermediate Format, 176x144) resolution. In summary, the frame rate achieved in this paper is about 2.2 times higher than the QCIF performance of 32 to 35 fps reported in [29].
Chen, Yen-Lin, and 陳彥霖. "DYNAMIC TASKS DISPATCHING MANAGEMENT FOR H.264 ENCODER ON HETEROGENEOUS DUAL-CORE EMBEDDED SYSTEM." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/22412563819914193549.
大同大學
資訊工程學系(所)
98
This paper shows a mechanism for using a dual-core embedded system to process multimedia data and how resources are managed to serve the whole process. Current partitioning methodologies for asymmetric dual-core systems typically perform software partitioning at compile time, assigning tasks to either the RISC (Reduced Instruction Set Computer) core or the DSP (Digital Signal Processor) only and leaving the other processor idle. If both processors each handle part of the job, the whole process becomes faster and saves considerable time and power. In this paper we propose a dynamic dispatching method to manage and schedule tasks between the RISC and DSP cores, dispatching tasks to both. We use TI’s DaVinci DM6446 as our example to show the improvement in system performance. The paper also presents the experimental results and the issues that remain for future work.
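The idea of dispatching tasks to both cores at run time, rather than fixing the partition at compile time, can be illustrated with a simple greedy policy. The sketch below is not the method of the thesis, whose algorithm the abstract does not describe. It assumes each task has a known cost on each core, and the task names and costs are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    cost_risc: float  # assumed execution time on the RISC core
    cost_dsp: float   # assumed execution time on the DSP core


def dispatch(tasks):
    """Greedy run-time dispatch: give each task to the core that would finish it earliest."""
    free_at = {"RISC": 0.0, "DSP": 0.0}  # time at which each core becomes idle
    schedule = []
    for task in tasks:
        finish = {
            "RISC": free_at["RISC"] + task.cost_risc,
            "DSP": free_at["DSP"] + task.cost_dsp,
        }
        core = min(finish, key=finish.get)  # on a tie, RISC wins (first key)
        free_at[core] = finish[core]
        schedule.append((task.name, core, finish[core]))
    return schedule, max(free_at.values())


# Hypothetical encoder stages with made-up costs (arbitrary time units).
tasks = [
    Task("motion_estimation", 8.0, 3.0),
    Task("transform", 4.0, 2.0),
    Task("entropy_coding", 3.0, 5.0),
    Task("deblocking", 4.0, 3.0),
]
schedule, makespan = dispatch(tasks)
print(schedule)
print("makespan:", makespan)  # 7.0, versus 13.0 on the DSP alone or 19.0 on the RISC alone
```

With these invented costs, using both cores finishes in 7 time units, while running everything on one core takes 13 (DSP) or 19 (RISC). The sketch ignores data dependencies and inter-core transfer costs, which a real dispatcher would have to account for.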
VINCO, Sara. "Reuse and Integration of Heterogeneous Components for Efficient Embedded Software Generation." Doctoral thesis, 2013. http://hdl.handle.net/11562/555550.
Nowadays, our world is more and more saturated with computing and communication capability that is integrated with human users and with the environment. Electronic systems are no longer stand-alone entities isolated from physical reality. On the contrary, electronic systems permeate home automation and health care management, together with everyday human life. As a result, modern embedded systems are highly heterogeneous, as they are composed of a mix of analog and digital HW, as well as embedded SW. Furthermore, the tight bond with the physical environment means that physical evolution must be taken into account during the design and verification phases. In this context, reuse is a very difficult task, as the components to integrate are highly heterogeneous. On the other hand, reuse is a winning approach to save design cost and time. Indeed, top-down approaches make it possible to optimize and configure each step of design, but any time a component must be added or changed, the whole design flow must be repeated. This thesis focuses on three main techniques for supporting reuse and integration in the context of embedded systems: co-simulation, interface generation and homogeneous formal representation. Co-simulation and interface generation preserve the high level of heterogeneity of the system, by allowing communication between heterogeneous components with a framework (co-simulation) or via generation of the necessary interfaces. A computational model is then proposed to model the heterogeneity in a homogeneous way. The starting heterogeneous components are automatically converted to the computational model, with transformations that preserve the starting behavior, thus supporting a fully bottom-up flow. The homogeneous description is then used as a starting point for automatic generation of code for simulation and validation of the integrated system, or for efficient execution as SW on massively parallel architectures.
The whole flow is supported by automatic code generation tools, that enhance the effectiveness of the proposed approach.
Neill, Richard W. "Heterogeneous Cloud Systems Based on Broadband Embedded Computing." Thesis, 2013. https://doi.org/10.7916/D8HH6JG1.
Qiu, Meikang. "Time and power optimization for heterogeneous parallel embedded systems /." 2007. http://proquest.umi.com/pqdweb?did=1296105641&sid=1&Fmt=2&clientId=10361&RQT=309&VName=PQD.
Liu, Jian-Hong, and 劉建宏. "A Micro-Kernel for Embedded Systems with Heterogeneous Multiprocessors." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/62080472684263138746.
國立成功大學
電機工程學系碩博士班
92
This thesis presents how to build a kernel based on a micro-kernel architecture on an SoC with heterogeneous multiprocessors. The micro-kernel architecture is based on a message-passing mechanism. There are four essential parts in the kernel, namely inter-process communication, the hardware interrupt handler, inter-processor communication, and the scheduler. Inter-process communication is responsible for message passing between processes. When a hardware interrupt is triggered, the hardware interrupt handler must handle it. Inter-processor communication handles the communication between processors. Choosing the next process to execute is the scheduler’s duty. The kernel described in this thesis has been implemented on a reference design of the TI TMS320DSC25, a heterogeneous multiprocessor SoC containing an ARM7TDMI core and a C5409 DSP core. The ARM7 is a 32-bit general-purpose processor, while the C5409 is a 16-bit digital signal processor. The ARM processor and the DSP each run their own copy of the kernel independently. Except for the hardware-dependent functionality, the two copies of the kernel are designed with the same structure and provide the same service functions through the same application program interfaces. By executing the micro-kernel on the DSC25, different processes executing on different processors can communicate or request services via inter-processor communication. Finally, every critical section of the kernel, whichever processor it is running on, completes in a bounded time.
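The message-passing structure described above can be modeled in a few lines. The following is a toy sketch only: it runs in a single Python process rather than on separate ARM and DSP cores, it omits interrupt handling and bounded-time guarantees, and the class, method, and process names are hypothetical rather than the thesis's API.

```python
from collections import deque


class MicroKernel:
    """Toy model: processes interact only by sending messages through the kernel."""

    def __init__(self):
        self.mailboxes = {}   # process name -> queue of pending (sender, payload)
        self.handlers = {}    # process name -> function run on each message
        self.ready = deque()  # round-robin queue of processes with pending mail

    def spawn(self, name, handler):
        self.mailboxes[name] = deque()
        self.handlers[name] = handler

    def send(self, sender, receiver, payload):
        """Communication primitive: enqueue a message and mark the receiver ready."""
        self.mailboxes[receiver].append((sender, payload))
        if receiver not in self.ready:
            self.ready.append(receiver)

    def run(self):
        """Scheduler: pick the next ready process and deliver exactly one message."""
        while self.ready:
            name = self.ready.popleft()
            sender, payload = self.mailboxes[name].popleft()
            self.handlers[name](self, sender, payload)
            if self.mailboxes[name] and name not in self.ready:
                self.ready.append(name)


log = []


def dsp_filter(kernel, sender, payload):
    # Stands in for a service running on the DSP: process the data and reply.
    log.append(("dsp", payload))
    kernel.send("dsp_filter", sender, payload * 2)


def arm_app(kernel, sender, payload):
    # Stands in for an application on the ARM core receiving the result.
    log.append(("arm", payload))


kernel = MicroKernel()
kernel.spawn("arm_app", arm_app)
kernel.spawn("dsp_filter", dsp_filter)
kernel.send("arm_app", "dsp_filter", 21)
kernel.run()
print(log)  # [('dsp', 21), ('arm', 42)]
```

Because the requester addresses the service only by name, the same call shape could serve both same-processor and cross-processor requests, which matches the abstract's point that both kernel copies expose the same service interfaces.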
Sousa, Luís Miguel Mendes Pimentel Alves de. "Runtime Management of Heterogeneous Compute Resources in Embedded Systems." Master's thesis, 2021. https://hdl.handle.net/10216/137152.
Ku, Chun-Wei, and 古君葳. "Heterogeneous Sensing Fusion for Safety Critical Embedded Real-time Systems." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/d73267.
國立臺灣大學
資訊工程學研究所
106
Nowadays, the bulk of road collisions are caused by human unawareness or distraction. Because the safety of the driver and of others matters most, ADAS has been developed to provide enhanced vehicle systems for safety and better driving. AEBS, an important part of ADAS, has become a hot research topic. Computer vision, together with radar and lidar, is at the forefront of the technologies that enable the evolution of AEBS. Since the cost of long-range radar and lidar is very high, we want to use a camera-based system to construct AEBS. Instead of using a single monocular camera, we propose a heterogeneous camera-based system that uses sensor fusion to combine the strengths of cameras with different fields of view (FoV). We also use a heuristic false-positive removal method to reduce the false positives caused by the sensor fusion method, and we optimize the sensor fusion method because of the limited computing resources of embedded systems. As a result, the recall of YOLO can be increased by up to 10% through our heterogeneous camera-based system.
Liao, Han-chiang, and 廖翰強. "Real-time on-line Task Scheduling for Heterogeneous Multi-core Embedded Systems." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/ya9ata.
國立臺灣科技大學
電機工程系
99
This paper explores real-time scheduling problems for heterogeneous multi-core systems. Taking precedence constraints into account, we test the performance of heterogeneous dual-core systems under varying schedulers, protocols, preemption points, and context-switch overheads. For heterogeneous multi-core systems, we discuss system performance under varying dispatchers, migration costs, and task structures. We also propose an efficient algorithm to reduce the number of preemptions in heterogeneous multi-core systems.