Dissertations / Theses on the topic 'System-on-chips'

Consult the top 45 dissertations / theses for your research on the topic 'System-on-chips.'

1

Ludewig, Ralf. "Integrierte Architektur für das Testen und Debuggen von System-on-Chips /." Aachen : Shaker, 2006. http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&doc_number=014632870&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA.

2

An, Xin. "High level design and control of adaptive multiprocessor system-on-chips." Thesis, Grenoble, 2013. http://www.theses.fr/2013GRENM023/document.

Abstract:
The design of modern embedded systems is getting more and more complex, as more functionality is integrated into these systems. At the same time, in order to meet the computational requirements while keeping power consumption low, MPSoCs have emerged as the main solution for such embedded systems. Furthermore, embedded systems are becoming more and more adaptive, as adaptivity can bring a number of benefits, such as software flexibility and energy efficiency. This thesis targets the safe design of such adaptive MPSoCs. First, each system configuration must be analyzed with respect to its functional and non-functional properties. We present an abstract design and analysis framework, which allows for faster and more cost-effective implementation decisions. This framework is intended as an intermediate reasoning support for system-level software/hardware co-design environments. It can prune the design space to the largest extent, and identify candidate design solutions in a fast and efficient way. In the framework, we use an abstract clock-based encoding to model system behaviors. Different mapping and scheduling scenarios of applications on MPSoCs are analyzed via clock traces representing system simulations. Among the properties of interest are functional behavioral correctness, temporal performance and energy consumption. Second, the reconfiguration management of adaptive MPSoCs must be addressed. We are especially interested in MPSoCs implemented on reconfigurable hardware architectures (i.e., FPGA fabrics), which provide good flexibility and computational efficiency for adaptive MPSoCs. We propose a general design framework based on the discrete controller synthesis (DCS) technique to address this issue. The main advantage of this technique is that it allows the automatic synthesis of a controller w.r.t. a given specification of control objectives. In the framework, the system reconfiguration behavior is modeled in terms of synchronous parallel automata. 
The reconfiguration management computation problem w.r.t. multiple objectives regarding, e.g., resource usage, performance and power consumption is encoded as a DCS problem. The existing BZR programming language and the Sigali tool are employed to perform DCS and generate a controller that satisfies the system requirements. Finally, we investigate two different ways of combining the two proposed design frameworks for adaptive MPSoCs. First, they are combined to construct a complete design flow for adaptive MPSoCs. Second, they are combined to show how the run-time manager designed with the second framework can be integrated into the first framework, so that high-level simulations can be performed to assess the run-time manager.
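As a rough illustration of the clock-trace idea in this abstract, the sketch below (with an invented trace format and invented power numbers, not the thesis's actual encoding or tooling) represents each processor's activity as a binary trace over a common abstract clock and derives timing and energy estimates from it:

```python
# Illustrative clock-trace analysis: 1 = processor busy, 0 = idle at a tick.
# The power numbers and trace format are invented for this sketch.

def analyze(traces, p_active=1.0, p_idle=0.1, tick_ns=10):
    """Estimate makespan (ns) and energy (arbitrary units) from clock traces."""
    makespan = max(
        (max((i + 1 for i, t in enumerate(tr) if t), default=0) for tr in traces),
        default=0,
    )
    energy = sum(p_active if t else p_idle
                 for tr in traces for t in tr[:makespan])
    return makespan * tick_ns, energy

# Two mappings of the same workload on two processors:
unbalanced = [[1, 1, 1, 0], [1, 0, 0, 0]]   # finishes at tick 3
balanced   = [[1, 1, 0, 0], [1, 1, 0, 0]]   # finishes at tick 2
print(analyze(unbalanced))  # longer makespan, extra idle energy
print(analyze(balanced))
```

Comparing candidate mappings by such trace-derived metrics is the kind of early pruning the framework performs before committing to a detailed implementation.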
3

Bai, Xiaoliang. "Modeling and testing for signal integrity in nanometer system-on-chips /." Diss., Connect to a 24 p. preview or request complete full text in PDF format. Access restricted to UC campuses, 2003. http://wwwlib.umi.com/cr/ucsd/fullcit?p3112828.

4

Chen, Li. "Software-based self-test and diagnosis for processors and system-on-chips /." Diss., Connect to a 24 p. preview or request complete full text in PDF format. Access restricted to UC campuses, 2003. http://wwwlib.umi.com/cr/ucsd/fullcit?p3090436.

5

Rech, Paolo. "Soft Errors Induced By Neutrons and Alpha Particles in System on Chips." Doctoral thesis, Università degli studi di Padova, 2010. http://hdl.handle.net/11577/3421895.

Abstract:
This thesis presents a new low-cost test setup for radiation testing of Systems on Chips composed of functional modules of different natures. Particular attention is given to radiation experiment results for embedded SRAM cores, embedded logic cores and embedded microprocessor cores, highlighting the dissimilar test protocols required to characterize their sensitivity to radiation. The main issues when testing a System on Chip are the cores' reduced accessibility and the physical constraints test facilities may impose on the test setup. Manufacturers heavily employ Design for Testability techniques, based on built-in test structures, to enable exhaustive device testing while minimizing application costs. We reused some of the Design for Testability built-in structures to characterize in depth the cores composing the System on Chip and the overall chip behaviour when exposed to radiation. Our strategy can be applied to any kind of integrated core, and we also present some guidelines on how built-in structures may be fruitfully applied to radiation experiments. Moreover, the monolithic shape of our test board makes it easy to mount in most available particle accelerator chambers and radiation test facilities. As the test structures are built in, and thanks to an efficient interface strategy that takes advantage of both the JTAG and Wrapper standards, tests are performed at high frequency, thus avoiding the underestimation of Single Event Transients, but without the need for high-speed connections between a host PC and the DUT, drastically reducing the overall setup costs. This thesis also shows and discusses the results gained during massive radiation experiment campaigns on a System on Chip manufactured by STMicroelectronics in a 90 nm CMOS technology. As the device is meant to be part of a complex automotive design, it may be affected by ground-level radiation. We therefore exposed the chips to both neutron and alpha particle fluxes. 
With our low-cost setup we measured the SRAM core's cross section to alphas and neutrons, and found that the former is higher than the latter. We also characterized the microprocessor's behaviour when exposed to alphas. The static test showed that register flip-flops have a higher radiation-induced error rate than the code and user RAM. This result is of great importance, and should be taken into account when building a fault-injection platform. To understand how the corruption of the different memory resources affects code execution, we designed different benchmark codes and performed a dynamic test. Results demonstrate that, in a typical application, bit-flips in the code RAM are clearly predominant with respect to those in registers. Moreover, we show that code RAM and register bits are not always critical, and their corruption does not necessarily propagate to the outputs. Finally, we considered the efficiency and costs of hardening techniques. In particular, we studied how Design For Manufacturing layout modifications and Triple Module Redundancy affect the radiation sensitivity of microprocessors. We considered chips built with different Design For Manufacturing maturity levels, and experimental results demonstrate that a higher level of optimization enhances resilience to alpha radiation. Hardening techniques, however, come at a cost. The decision on which hardening technique to adopt when building a complex device is a hard-earned trade-off between cost, performance and, of course, reliability. Mitigation strategies for a product therefore depend on its requirements and on its mission environment.
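The cross-section comparison reported above follows the standard definition used in radiation testing; a minimal sketch (with invented numbers, not the dissertation's measurements) is:

```python
# Cross section sigma = observed errors / particle fluence; dividing by
# the bit count gives a per-bit cross section. All numbers below are
# invented for illustration.

def cross_section(errors, fluence_cm2, n_bits=None):
    """Return the device cross section in cm^2 (per bit if n_bits is given)."""
    sigma = errors / fluence_cm2
    return sigma / n_bits if n_bits else sigma

# Hypothetical SRAM campaign: alphas cause more upsets per unit fluence
sigma_alpha   = cross_section(errors=420, fluence_cm2=1e7,  n_bits=2**20)
sigma_neutron = cross_section(errors=35,  fluence_cm2=1e10, n_bits=2**20)
assert sigma_alpha > sigma_neutron   # the trend the abstract reports
```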
6

Sunwoo, John Stroud Charles E. "Built-In Self-Test of programmable resources in microcontroller based System-on-Chips." Auburn, Ala., 2005. http://repo.lib.auburn.edu/2005%20Fall/Thesis/SUNWOO_JOHN_31.pdf.

7

Ludewig, Ralf [Verfasser]. "Integrierte Architektur für das Testen und Debuggen von System-on-Chips / Ralf Ludewig." Aachen : Shaker, 2006. http://d-nb.info/118658789X/34.

8

SEU, GIOVANNI PIETRO. "Exploiting All-Programmable System on Chips for Closed-Loop Real-Time Neural Interfaces." Doctoral thesis, Università degli studi di Genova, 2019. http://hdl.handle.net/11567/943352.

Abstract:
High-density microelectrode arrays (HDMEAs) feature thousands of recording electrodes in a single chip with an area of a few square millimeters. The obtained electrode density is comparable to, and even higher than, the typical density of neuronal cells in cortical cultures. Commercially available HDMEA-based acquisition systems are able to record the neural activity of the whole array at the same time with sub-millisecond resolution. These devices are a very promising tool and are increasingly used in neuroscience to tackle fundamental questions regarding the complex dynamics of neural networks. Even if electrical or optical stimulation is generally an available feature of such systems, they lack the capability of creating a closed loop between the biological neural activity and the artificial system. Stimuli are usually sent in an open-loop manner, thus violating the inherent working basis of neural circuits, which in nature are constantly reacting to the external environment. This prevents unravelling the real mechanisms behind the behavior of neural networks. The primary objective of this PhD work is to overcome this limitation by creating a fully reconfigurable processing system capable of providing real-time feedback to the ongoing neural activity recorded with HDMEA platforms. The potential of modern heterogeneous FPGAs has been exploited to realize the system. In particular, the Xilinx Zynq All Programmable System on Chip (APSoC) has been used. The device features reconfigurable logic, specialized hardwired blocks, and a dual-core ARM-based processor; the synergy of these components makes it possible to achieve high processing performance while maintaining a high level of flexibility and adaptivity. The developed system has been embedded in an acquisition and stimulation setup featuring the following platforms: • 3Brain BioCam X, a state-of-the-art HDMEA-based acquisition platform capable of recording in parallel from 4096 electrodes at 18 kHz per electrode. 
• PlexStim™ Electrical Stimulator System, able to generate electrical stimuli with custom waveforms on 16 different output channels. • Texas Instruments DLP® LightCrafter™ Evaluation Module, capable of projecting 608x684-pixel images with a refresh rate of 60 Hz; it provides the optical stimulation. All the features of the system, such as band-pass filtering and spike detection on all the recorded channels, have been validated by means of ex vivo experiments. Very low latency has been achieved while processing the whole input data stream in real time. In the case of electrical stimulation the total latency is below 2 ms; when optical stimuli are needed, the total latency is somewhat higher, being 21 ms in the worst case. The final setup is ready to be used to infer cellular properties by means of closed-loop experiments. As a proof of concept, it has been successfully used for the clustering and classification of retinal ganglion cells (RGCs) in mouse retina. For this experiment, the light-evoked spikes from thousands of RGCs were correctly recorded and analyzed in real time. Around 90% of the total clusters were classified as ON- or OFF-type cells. In addition to the closed-loop system, a denoising prototype has been developed. The main idea is to exploit oversampling techniques to reduce the thermal noise recorded by HDMEA-based acquisition systems. The prototype is capable of processing all the input signals from the BioCam X in real time, and it is currently being tested to evaluate the performance in terms of signal-to-noise-ratio improvement.
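The spike-detection stage mentioned above can be sketched in a few lines. This is a generic negative-threshold detector (my own simplification, not the thesis's FPGA implementation), with the refractory window set to 18 samples, i.e. 1 ms at the 18 kHz rate quoted in the abstract:

```python
import random

# Generic negative-threshold spike detector with a refractory window
# (illustrative software model, not the thesis's FPGA pipeline).

def detect_spikes(samples, threshold, refractory=18):
    """Return indices where the signal drops below -threshold, enforcing
    a refractory gap (in samples) between consecutive detections."""
    spikes, last = [], -refractory
    for i, s in enumerate(samples):
        if s < -threshold and i - last >= refractory:
            spikes.append(i)
            last = i
    return spikes

# Synthetic trace: Gaussian noise plus two large negative deflections
random.seed(0)
trace = [random.gauss(0, 5) for _ in range(200)]
for pos in (50, 120):
    trace[pos] -= 80
print(detect_spikes(trace, threshold=40))  # → [50, 120]
```

In a real-time pipeline this logic runs per channel after band-pass filtering, which is why a hardware implementation on reconfigurable logic is attractive at 4096 channels.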
9

Zhao, Yi. "Fault modeling and on-line testing for deep-submicron noise interference in system-on-chips /." Diss., Connect to a 24 p. preview or request complete full text in PDF format. Access restricted to UC campuses, 2004. http://wwwlib.umi.com/cr/ucsd/fullcit?p3127634.

10

Liu, Meng. "Real-Time Communication over Wormhole-Switched On-Chip Networks." Doctoral thesis, Mälardalens högskola, Inbyggda system, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-35316.

Abstract:
In a modern industrial system, the requirement on computational capacity has increased dramatically, in order to support a higher number of functionalities, to process a larger amount of data or to make faster and safer run-time decisions. Instead of using a traditional single-core processor, where threads can only be executed sequentially, multi-core and many-core processors are gaining more and more attention nowadays. In a multi-core processor, software programs can be executed in parallel, which can boost computational performance. Many-core processors are specialized multi-core processors with a larger number of cores, designed to achieve a higher degree of parallel processing. An on-chip communication bus is the central intersection used for data exchange between cores, memory and I/O in most multi-core processors. As the number of cores increases, more contention can occur on the communication bus, which becomes a bottleneck for overall performance. Therefore, in order to reduce contention on the communication bus, a many-core processor typically employs a Network-on-Chip (NoC) for data exchange. Real-time embedded systems have been widely utilized for decades. In addition to the correctness of functionality, timeliness is an important factor in such systems. Violation of specific timing requirements can result in performance degradation or even fatal problems. When executing real-time applications on many-core processors, the timeliness of the NoC, as the communication subsystem, is essential as well. Unfortunately, many real-time system designs over-provision resources to guarantee the fulfillment of timing requirements, which can lead to significant resource waste. For example, analysis of a NoC design may conclude that the network is already saturated (i.e. accepting more traffic could violate the requirements), while in reality the network still has the capacity to admit more traffic. 
In this thesis, we target such resource-wasting problems related to the design and analysis of NoCs used in real-time systems. We propose a number of solutions to improve the schedulability of real-time traffic over wormhole-switched NoCs, in order to further improve the resource utilization of the whole system. The solutions focus mainly on two aspects: (1) providing more accurate and efficient timing analyses; (2) proposing more cost-effective scheduling methods.
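A classic flavour of the timing analyses discussed above is the response-time iteration for priority-arbitrated wormhole flows. The sketch below is a textbook-style model with invented parameters, not the thesis's refined analysis:

```python
from math import ceil

# Zero-contention wormhole latency plus a standard response-time iteration
# accounting for higher-priority interference (illustrative parameters).

def basic_latency(flits, hops, d_router=3, d_flit=1):
    """Header routing at each hop, then the remaining flits pipeline through."""
    return hops * d_router + flits * d_flit

def response_time(flow, higher_prio):
    """Iterate R = C + sum_j ceil(R / T_j) * C_j to a fixed point.
    Flows are (C, T, D) = (zero-contention latency, period, deadline)."""
    C, T, D = flow
    R = C
    while True:
        nxt = C + sum(ceil(R / Tj) * Cj for (Cj, Tj, Dj) in higher_prio)
        if nxt == R:
            return R          # fixed point: worst-case latency bound
        if nxt > D:
            return None       # exceeds deadline: deemed unschedulable
        R = nxt

f_hi = (basic_latency(flits=8, hops=3), 100, 100)
f_lo = (basic_latency(flits=16, hops=4), 150, 150)
print(response_time(f_lo, higher_prio=[f_hi]))  # → 45
```

The pessimism of such bounds is exactly what makes a network look saturated while capacity remains, which motivates the tighter analyses the thesis proposes.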
11

Väyrynen, Mikael. "Fault-Tolerant Average Execution Time Optimization for General-Purpose Multi-Processor System-On-Chips." Thesis, Linköping University, Department of Computer and Information Science, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-17705.

Abstract:

Due to developments in semiconductor technology, fault tolerance is important not only for safety-critical systems but also for general-purpose (non-safety-critical) systems. However, instead of guaranteeing that deadlines are always met, for general-purpose systems it is important to minimize the average execution time (AET) while ensuring fault tolerance. For a given job and a soft (transient) error-free probability, we define mathematical formulas for the AET using voting (active replication), rollback-recovery with checkpointing (RRC) and a combination of these (CRV), where bus communication overhead is included. Further, for a given multi-processor system-on-chip (MPSoC), we define integer linear programming (ILP) models that minimize the AET, including bus communication overhead, by: (1) selecting the number of checkpoints when using RRC or a combination in which RRC is included, (2) finding the number of processors and the job-to-processor assignment when using voting or a combination in which voting is used, and (3) selecting the fault-tolerance scheme (voting, RRC or CRV) for each job. Experiments demonstrate significant savings in AET.
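A much-simplified version of the RRC average-execution-time trade-off can be worked through numerically. This toy model ignores bus communication overhead and the voting/CRV schemes the thesis also covers, and all numbers are invented:

```python
# Toy AET model for rollback-recovery with checkpointing (RRC).
# T: error-free execution time, c: per-checkpoint overhead,
# P: probability the whole job runs error-free. With n equal segments,
# each segment passes with probability P**(1/n) and is retried until it
# does, so its expected number of executions is 1 / P**(1/n).

def aet_rrc(T, c, P, n):
    q = P ** (1.0 / n)
    return n * (T / n + c) / q

def best_n(T, c, P, n_max=50):
    """Checkpoint count minimizing the expected (average) execution time."""
    return min(range(1, n_max + 1), key=lambda n: aet_rrc(T, c, P, n))

# More checkpoints shrink the re-executed segments, but each costs c:
print(best_n(T=1000, c=5, P=0.5))  # → 12
```

The ILP models in the thesis make this kind of choice jointly with processor assignment and scheme selection, rather than per job in isolation.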

12

Parri, Jonathan. "A Framework for Selection and Integration of Custom Instructions for Hybrid System-on-Chips." Thesis, University of Ottawa (Canada), 2010. http://hdl.handle.net/10393/28739.

Abstract:
Traditionally, common processor augmentation solutions have involved the addition of coprocessors or the datapath integration of custom instructions within extensible processors as Instruction Set Extensions (ISEs). Rarely is the hybrid option of using both techniques explored. Much research already exists concerning the mutually exclusive identification and selection of custom hardware blocks by hardware/software partitioning techniques. The question remains of how to best select and use this hardware in a system where coprocessors and datapath augmentations are both possible and mutually inclusive. Here, a system with both types of custom instructions is denoted a hybrid SoC. In this work, both the coprocessor and internal datapath custom-instruction design decisions are modeled within a design space exploration framework created to facilitate hybrid SoC development. We explore how to best select and integrate these instructions using available metrics and traditional combinatorial optimization techniques, while packaging these ideas into a complete toolchain framework. This framework is integrated into industry design-flow tools in an attempt to achieve significant performance gains over existing methodologies.
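One common way to cast the selection step described above is as a 0/1 knapsack over an area budget. The sketch below uses hypothetical candidates and numbers, not the framework's actual metrics or solver:

```python
# Custom-instruction selection as a 0/1 knapsack: maximise saved cycles
# subject to an area budget (all candidates and numbers are invented).

def select_ises(candidates, area_budget):
    """candidates: (name, saved_cycles, area) tuples; area in integer units.
    Returns (best_saving, chosen_names) via classic knapsack DP."""
    dp = [(0, [])] * (area_budget + 1)
    for name, gain, area in candidates:
        for a in range(area_budget, area - 1, -1):
            prev_gain, prev_set = dp[a - area]
            if prev_gain + gain > dp[a][0]:
                dp[a] = (prev_gain + gain, prev_set + [name])
    return dp[area_budget]

candidates = [
    ("mac", 900, 40),        # datapath ISE: multiply-accumulate
    ("crc32", 600, 25),      # coprocessor-style block
    ("fft_bfly", 1100, 70),  # large but high-gain unit
    ("popcount", 150, 5),
]
print(select_ises(candidates, area_budget=100))
```

A hybrid SoC adds coupling costs (bus latency for coprocessors, decode pressure for datapath ISEs) on top of this plain formulation, which is what makes the joint selection problem interesting.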
13

Qi, Ji. "System-level design automation and optimisation of network-on-chips in terms of timing and energy." Thesis, University of Southampton, 2015. https://eprints.soton.ac.uk/386210/.

Abstract:
As system complexity constantly increases, traditional bus-based architectures are less adaptable to the increasing design demands. Specifically in on-chip digital system designs, Network-on-Chip (NoC) architectures are promising platforms that provide distributed multi-core co-operation and inter-communication. Since the design cost and time cycles of NoC systems grow rapidly with higher integration, system-level Design Automation (DA) techniques are used to abstract models at early design stages for functional validation and performance prediction. Yet precise abstraction and efficient simulation are critical challenges for modern DA techniques to improve design efficiency. This thesis makes several contributions to address these challenges. We first extended a backbone simulator, NIRGAM, to offer accurate system-level models and performance estimates. A case study of developing a one-to-one transmission system using asynchronous FIFOs as buffers, in both the NIRGAM simulator and a synthesised gate-level design, is given to validate the model accuracy by comparing their power and timing performance. Our second contribution improves DA techniques by proposing a novel method to efficiently emulate non-rectangular NoC topologies in NIRGAM and to generate accurate energy and timing performance. The proposed method uses time-regulated models to emulate virtual non-rectangular topologies on top of a regular mesh. The performance accuracy of the virtual topologies is validated by comparison with the corresponding real NoC topologies. The third contribution of our research is a novel task-mapping scheme that generates optimal mappings to tile-based NoC networks with accurate performance prediction and increased execution speed. A novel Non-Linear Programming (NLP) based mapping problem is formulated and solved by a modified Branch and Bound (BB) algorithm. The proposed method predicts the performance of optimised mappings and compares it with NIRGAM simulations for accuracy validation.
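The mapping optimisation described above can be miniaturised into a toy branch-and-bound search. This sketch is my own illustration on a 2x2 mesh, not the thesis's NLP formulation: it minimises total traffic-weighted hop distance.

```python
# Toy branch-and-bound task mapping on a 2x2 mesh NoC: minimise
# sum over communicating task pairs of traffic * Manhattan distance.
# (Illustrative only; the thesis formulates a richer NLP problem.)

def hops(tile_a, tile_b, cols=2):
    (ya, xa), (yb, xb) = divmod(tile_a, cols), divmod(tile_b, cols)
    return abs(ya - yb) + abs(xa - xb)

def map_tasks(n_tasks, edges, n_tiles=4):
    """edges: {(task_a, task_b): traffic}. Tasks are branched in index
    order; partial mappings costing >= the incumbent are pruned."""
    best = [float("inf"), None]

    def branch(assign, used, cost):
        if cost >= best[0]:
            return                       # bound: prune this subtree
        if len(assign) == n_tasks:
            best[:] = [cost, assign[:]]  # new incumbent
            return
        t = len(assign)                  # next task to place
        for tile in range(n_tiles):
            if tile not in used:
                # cost added by edges between t and already-placed tasks
                extra = sum(w * hops(assign[a + b - t], tile)
                            for (a, b), w in edges.items()
                            if t in (a, b) and a + b - t < t)
                branch(assign + [tile], used | {tile}, cost + extra)

    branch([], set(), 0)
    return best

# Heavy 0-1 and 1-2 traffic should end up on neighbouring tiles:
edges = {(0, 1): 10, (1, 2): 10, (2, 3): 1, (0, 3): 1}
print(map_tasks(4, edges))
```

Pruning on a partial-cost lower bound is the essence of the BB approach; the thesis's modified algorithm additionally handles the non-linear objective and predicts performance for the resulting mappings.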
14

Huang, Jia [Verfasser], Alois [Akademischer Betreuer] Knoll, and Petru [Akademischer Betreuer] Eles. "Towards an Integrated Framework for Reliability-Aware Embedded System Design on Multiprocessor System-on-Chips / Jia Huang. Gutachter: Alois Knoll ; Petru Eles. Betreuer: Alois Knoll." München : Universitätsbibliothek der TU München, 2014. http://d-nb.info/1063724333/34.

15

Yang, Xiaokun. "A High Performance Advanced Encryption Standard (AES) Encrypted On-Chip Bus Architecture for Internet-of-Things (IoT) System-on-Chips (SoC)." FIU Digital Commons, 2016. http://digitalcommons.fiu.edu/etd/2477.

Abstract:
With industry expectations of billions of Internet-connected things, commonly referred to as the IoT, we see a growing demand for high-performance on-chip bus architectures with the following attributes: small scale, low energy, high security, and highly configurable structures for integration, verification, and performance estimation. Our research thus mainly focuses on addressing these key problems and finding the balance among all these requirements that often work against each other. First of all, we proposed a low-cost and low-power System-on-Chips (SoCs) architecture (IBUS) that can frame data transfers differently. The IBUS protocol provides two novel transfer modes – the block and state modes, and is also backward compatible with the conventional linear mode. In order to evaluate the bus performance automatically and accurately, we also proposed an evaluation methodology based on the standard circuit design flow. Experimental results show that the IBUS based design uses the least hardware resource and reduces energy consumption to a half of an AMBA Advanced High-Performance Bus (AHB) and Advanced eXensible Interface (AXI). Additionally, the valid bandwidth of the IBUS based design is 2.3 and 1.6 times, respectively, compared with the AHB and AXI based implementations. As IoT advances, privacy and security issues become top tier concerns in addition to the high performance requirement of embedded chips. To leverage limited resources for tiny size chips and overhead cost for complex security mechanisms, we further proposed an advanced IBUS architecture to provide a structural support for the block-based AES algorithm. Our results show that the IBUS based AES-encrypted design costs less in terms of hardware resource and dynamic energy (60.2%), and achieves higher throughput (x1.6) compared with AXI. 
Effectively dealing with automation in design and verification for mixed-signal integrated circuits is a critical problem, particularly when the bus architecture is new. Therefore, we further proposed a configurable and synthesizable IBUS design methodology. The flexible structure, together with bus wrappers, direct memory access (DMA), an AES engine, a memory controller, several mixed-signal verification intellectual properties (VIPs), and bus performance models (BPMs), forms the basis for integrated circuit design, allowing engineers to integrate application-specific modules and other peripherals to create complex SoCs.
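The cycle-count advantage of framing a burst with a single header, as block-style transfer modes do, can be sketched with a toy model. The framing overheads below are illustrative assumptions for the sketch, not the actual IBUS, AHB, or AXI protocols:

```python
def linear_mode_cycles(n_beats: int) -> int:
    """Hypothetical linear mode: every data beat carries its own
    address/control phase (1 cycle) plus a data phase (1 cycle)."""
    return n_beats * 2

def block_mode_cycles(n_beats: int) -> int:
    """Hypothetical block mode: one header frame (address + length,
    2 cycles) followed by back-to-back data beats."""
    return 2 + n_beats

def valid_bandwidth(n_beats: int, cycles: int) -> float:
    """Fraction of bus cycles that move useful data."""
    return n_beats / cycles

burst = 16
lin = linear_mode_cycles(burst)      # 32 cycles for a 16-beat burst
blk = block_mode_cycles(burst)       # 18 cycles for the same burst
print(valid_bandwidth(burst, lin))   # 0.5
print(valid_bandwidth(burst, blk))   # ~0.89
```

A metric of this shape (useful beats per bus cycle) is one way to make "valid bandwidth" comparisons between framing schemes concrete.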
APA, Harvard, Vancouver, ISO, and other styles
16

Jacob, Kabakci Nisha [Verfasser], Georg [Akademischer Betreuer] Sigl, Georg [Gutachter] Sigl, and Sebastian [Gutachter] Steinhorst. "Hardware Trojans and their Security Impact on Reconfigurable System-on-Chips / Nisha Jacob Kabakci ; Gutachter: Georg Sigl, Sebastian Steinhorst ; Betreuer: Georg Sigl." München : Universitätsbibliothek der TU München, 2020. http://d-nb.info/1220319899/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
17

Hirmer, Katrin [Verfasser], Klaus [Akademischer Betreuer] Hofmann, and Dirk [Akademischer Betreuer] Killat. "Interference-Aware Integration of Mixed-Signal Designs and Ultra High Voltage Pulse Generators for System-on-Chips / Katrin Hirmer ; Klaus Hofmann, Dirk Killat." Darmstadt : Universitäts- und Landesbibliothek Darmstadt, 2019. http://d-nb.info/1199006408/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
18

Hirmer, Katrin [Verfasser], Klaus [Akademischer Betreuer] Hofmann, and Dirk [Akademischer Betreuer] Killat. "Interference-Aware Integration of Mixed-Signal Designs and Ultra High Voltage Pulse Generators for System-on-Chips / Katrin Hirmer ; Klaus Hofmann, Dirk Killat." Darmstadt : Universitäts- und Landesbibliothek Darmstadt, 2019. http://d-nb.info/1199006408/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
19

Santos, André Flores dos. "Análise do uso de redundância em circuitos gerados por síntese de alto nível para FPGA programado por SRAM sob falhas transientes." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2017. http://hdl.handle.net/10183/178392.

Full text
Abstract:
Este trabalho consiste no estudo e análise da suscetibilidade a efeitos da radiação em projetos de circuitos gerados por ferramenta de Síntese de Alto Nível para FPGAs (Field Programmable Gate Array), ou seja, circuitos programáveis e sistemas em chip, do inglês System-on-Chip (SOC). Através de um injetor de falhas por emulação usando o ICAP (Internal Configuration Access Port) localizado dentro do FPGA é possível injetar falhas simples ou acumuladas do tipo SEU (Single Event Upset), definidas como perturbações que podem afetar o funcionamento correto do dispositivo através da inversão de um bit por uma partícula carregada. SEU está dentro da classificação de SEEs (Single Event Effects), efeitos transitórios em tradução livre, podem ocorrer devido a penetração de partículas de alta energia do espaço e do sol (raios cósmicos e solares) na atmosfera da Terra que colidem com átomos de nitrogênio e oxigênio resultando na produção de partículas carregadas, na grande maioria nêutrons. Dentro deste contexto além de analisar a suscetibilidade de projetos gerados por ferramenta de Síntese de Alto Nível, torna-se relevante o estudo de técnicas de redundância como TMR (Triple Modular Redundance) para detecção, correção de erros e comparação com projetos desprotegidos verificando a confiabilidade. Os resultados mostram que no modo de injeção de falhas simples os projetos com redundância TMR demonstram ser efetivos. Na injeção de falhas acumuladas o projeto com múltiplos canais apresentou melhor confiabilidade do que o projeto desprotegido e com redundância de canal simples, tolerando um maior número de falhas antes de ter seu funcionamento comprometido.
This work consists of the study and analysis of the susceptibility to radiation effects of circuit designs generated by a High-Level Synthesis tool for Field Programmable Gate Arrays (FPGAs), that is, programmable circuits and Systems-on-Chip (SoCs). Through an emulation-based fault injector using the ICAP (Internal Configuration Access Port) located inside the FPGA, it is possible to inject single or accumulated faults of the SEU (Single Event Upset) type, defined as disturbances that can affect the correct functioning of the device through the inversion of a bit by a charged particle. SEUs fall within the classification of SEEs (Single Event Effects), which can occur due to the penetration of high-energy particles from space and from the sun (cosmic and solar rays) into the Earth's atmosphere; these collide with atoms of nitrogen and oxygen, resulting in the production of charged particles, mostly neutrons. In this context, in addition to analyzing the susceptibility of designs generated by a High-Level Synthesis tool, it becomes relevant to study redundancy techniques such as TMR (Triple Modular Redundancy) for error detection and correction, and to compare them with unprotected designs to verify reliability. The results show that in the single fault injection mode, the TMR-protected designs prove to be effective. In the accumulated fault injection mode, the multichannel design presented better reliability than both the unprotected design and the design with single-channel redundancy, tolerating a greater number of faults before having its operation compromised.
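The SEU model and TMR masking discussed above can be sketched in a few lines: a generic single-bit-flip injector and a bitwise majority voter, not the thesis's ICAP-based emulation framework:

```python
import random

def inject_seu(word: int, width: int = 32, rng=random) -> int:
    """Flip one randomly chosen bit, emulating a Single Event Upset."""
    return word ^ (1 << rng.randrange(width))

def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority voter: each output bit takes the value held
    by at least two of the three replicas."""
    return (a & b) | (a & c) | (b & c)

value = 0xDEADBEEF
corrupted = inject_seu(value)                      # one replica upset
assert tmr_vote(corrupted, value, value) == value  # single fault masked
```

Accumulated faults are what eventually defeat this scheme: once two replicas hold upsets in the same bit position, the majority vote itself is wrong, which matches the abstract's observation that protection degrades as faults accumulate.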
APA, Harvard, Vancouver, ISO, and other styles
20

Feki, Anis. "Conception d’une mémoire SRAM en tension sous le seuil pour des applications biomédicales et les nœuds de capteurs sans fils en technologies CMOS avancées." Thesis, Lyon, INSA, 2015. http://www.theses.fr/2015ISAL0018/document.

Full text
Abstract:
L’émergence des circuits complexes numériques, ou System-On-Chip (SOC), pose notamment la problématique de la consommation énergétique. Parmi les blocs fonctionnels significatifs à ce titre, apparaissent les mémoires et en particulier les mémoires statiques (SRAM). La maîtrise de la consommation énergétique d’une mémoire SRAM inclue la capacité à rendre la mémoire fonctionnelle sous très faible tension d’alimentation, avec un objectif agressif de 300 mV (inférieur à la tension de seuil des transistors standard CMOS). Dans ce contexte, les travaux de thèse ont concerné la proposition d’un point mémoire SRAM suffisamment performant sous très faible tension d’alimentation et pour les nœuds technologiques avancés (CMOS bulk 28nm et FDSOI 28nm). Une analyse comparative des architectures proposées dans l’état de l’art a permis d’élaborer deux points mémoire à 10 transistors avec de très faibles impacts de courant de fuite. Outre une segmentation des ports de lecture, les propositions reposent sur l’utilisation de périphéries adaptées synchrones avec notamment une solution nouvelle de réplication, un amplificateur de lecture de données en mode tension et l’utilisation d’une polarisation dynamique arrière du caisson SOI (Body Bias). Des validations expérimentales s’appuient sur des circuits en technologies avancées. Enfin, une mémoire complète de 32kb (1024x32) a été soumise à fabrication en 28 FDSOI. Ce circuit embarque une solution de test (BIST) capable de fonctionner sous 300mV d’alimentation. Après une introduction générale, le 2ème chapitre du manuscrit décrit l’état de l’art. Le chapitre 3 présente les nouveaux points mémoire. Le 4ème chapitre décrit l’amplificateur de lecture avec la solution de réplication. Le chapitre 5 présente l’architecture d’une mémoire ultra basse tension ainsi que le circuit de test embarqué. 
Les travaux ont donné lieu au dépôt de 4 propositions de brevet, deux conférences internationales, un article de journal international est accepté et un autre vient d’être soumis
Emergence of large Systems-on-Chip introduces the challenge of power management. Of the various embedded blocks, static random access memories (SRAMs) are among the main contributors to power consumption. Scaling down the power supply is one way to act positively on power consumption. One aggressive target is to enable the operation of SRAMs at Ultra-Low-Voltage (ULV), i.e. as low as 300 mV (lower than the threshold voltage of standard CMOS transistors). The present work concerned the proposal of SRAM bitcells able to operate at ULV in advanced technology nodes (either CMOS bulk 28 nm or FDSOI 28 nm). A benchmark of published state-of-the-art architectures led to the proposal of two flavors of 10-transistor bitcells, solving the limitations due to leakage current and parasitic power consumption. Segmented read-ports have been used along with the required synchronous peripheral circuitry, including original replica assistance, a dedicated unbalanced sense amplifier for ULV operation, and dynamic forward back-biasing of SOI boxes. Experimental test chips are provided in the previously mentioned technologies. A complete memory cut of 32 kbits (1024x32) has been designed with an embedded BIST block, able to operate at ULV. After a general introduction, the manuscript presents the state of the art in chapter 2. The new 10T bitcells are presented in chapter 3. The sense amplifier along with the replica assistance is the core of chapter 4. The memory cut in FDSOI 28 nm is detailed in chapter 5. Results of the PhD have been disseminated through 4 patent proposals, 2 papers in international conferences, a first paper accepted in an international journal, and a second paper submitted to an international journal.
APA, Harvard, Vancouver, ISO, and other styles
21

Larsson, Anders. "Test Optimization for Core-based System-on-Chip." Doctoral thesis, Linköping : Department of Computer and Information Science, Linköpings universitet, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-15182.

Full text
APA, Harvard, Vancouver, ISO, and other styles
22

Tambara, Lucas Antunes. "Caracterização de circuitos programáveis e sistemas em chip sob radiação." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2013. http://hdl.handle.net/10183/86477.

Full text
Abstract:
Este trabalho consiste em um estudo acerca dos efeitos da radiação em circuitos programáveis e sistemas em chip, do inglês System-on-Chip (SoC), baseados em FPGAs (Field-Programmable Gate Array). Dentre os diversos efeitos que podem ensejar falhas nos circuitos integrados, destacam-se a ocorrência de Single Event Effects (SEEs), Efeitos Transitórios em tradução livre, e a Dose Total Ionizante, do inglês Total Ionizing Dose (TID). SEEs podem ocorrer em razão da incidência de nêutrons originários de interações de raios cósmicos com a atmosfera terrestre, íons pesados provenientes do espaço e prótons originários do Sol (vento solar) e dos cinturões de Van Allen. A Dose Total Ionizante diz respeito à exposição prolongada de um circuito integrado à radiação ionizante e cuja consequência é a alteração das características elétricas de partes do dispositivo em razão das cargas elétricas induzidas pela radiação e acumuladas nas interfaces dos semicondutores. Dentro desse contexto, este trabalho descreve em detalhes a caracterização do SoC-FPGA baseado em memória FLASH e de sinais mistos SmartFusion A2F200-FG484, da empresa Microsemi, quando exposto à radiação (SEEs e TID) através do uso da técnica de Redundância Diversificada visando a detecção de erros. Também, uma arquitetura que utiliza um esquema baseado em Redundância Modular Tripla e Diversificada é testada através da sua implementação no FPGA baseado em memória SRAM da família Spartan-6, modelo LX45, da empresa Xilinx, visando a detecção e correção de erros causados pela radiação (SEEs). Os resultados obtidos mostram que os diversos blocos funcionais que compõe SoC SmartFusion apresentam diferentes níveis de tolerância à radiação e que o uso das técnicas de Redundância Modular Tripla e Redundância Diversificada em conjunto mostrou-se extremamente eficiente no que se refere a tolerância a SEEs.
This work consists in a study about radiation effects in programmable circuits and Systems-on-Chip (SoCs) based on FPGAs (Field-Programmable Gate Arrays). Single Event Effects (SEEs) and Total Ionizing Dose (TID) are the two main effects caused by radiation incidence, and both can lead to failures in integrated circuits. SEEs are due to the incidence of neutrons derived from the interaction of cosmic rays with the terrestrial atmosphere, as well as heavy ions coming from space and protons from the solar wind and the Van Allen belts. Total Ionizing Dose regards the prolonged exposure of an integrated circuit to ionizing radiation, which deviates the standard electrical characteristics of the device due to radiation-induced electrical charges accumulated in the semiconductors' interfaces. In this context, this work describes in detail the characterization of Microsemi's mixed-signal SoC-FPGA SmartFusion A2F200-FG484 when exposed to radiation (SEEs and TID), using a Diverse Redundancy approach for error detection. In addition, an architecture using a Diversified Triple Modular Redundancy scheme was tested against SEEs through its implementation in a Xilinx Spartan-6 LX45 FPGA, aiming at error detection and correction. The results obtained show that the several functional blocks of SmartFusion have different radiation tolerance levels and that the use of Triple Modular Redundancy together with Diversified Redundancy proved to be extremely efficient in terms of SEE tolerance.
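Diverse redundancy for error detection, as used in the SmartFusion characterization, can be illustrated schematically: two differently implemented channels compute the same function, and any mismatch flags a detected error. The summation function below is a placeholder standing in for an arbitrary protected computation, not the actual design:

```python
def sum_impl_a(xs):
    """Channel A: straightforward iterative accumulation."""
    total = 0
    for x in xs:
        total += x
    return total

def sum_impl_b(xs):
    """Channel B: a diverse implementation of the same function,
    using pairwise (tree) reduction instead of a running sum."""
    xs = list(xs)
    while len(xs) > 1:
        xs = [xs[i] + xs[i + 1] for i in range(0, len(xs) - 1, 2)] + \
             (xs[-1:] if len(xs) % 2 else [])
    return xs[0] if xs else 0

def run_with_detection(xs):
    """Run both diverse channels; a mismatch is a detected error."""
    a, b = sum_impl_a(xs), sum_impl_b(xs)
    return a, a == b   # (result, ok-flag)

result, ok = run_with_detection([1, 2, 3, 4, 5])
assert result == 15 and ok
```

The point of diversity is that a radiation-induced fault is unlikely to corrupt both structurally different channels in the same way, so the comparison catches it; unlike TMR, two channels can only detect, not correct.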
APA, Harvard, Vancouver, ISO, and other styles
23

Samii, Soheil. "Power Modeling and Scheduling of Tests for Core-based System Chips." Thesis, Linköping University, Department of Computer and Information Science, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-2863.

Full text
Abstract:

The technology today makes it possible to integrate a complete system on a single chip, called a "System-on-Chip" (SOC). Nowadays SOC designers use previously designed hardware modules, called cores, together with their user-defined logic (UDL), to form a complete system on a single chip. The manufacturing process may result in defective chips, for instance due to the base material, and therefore testing chips after production is important in order to ensure fault-free chips.

The testing time for a chip will affect its final cost. Thus it is important to minimize the testing time for each chip. For core-based SOCs this can be done by testing several cores at the same time, instead of testing the cores sequentially. However, this will result in higher activity in the chip, and hence higher power consumption. Due to several factors in the manufacturing process, there are limits on the power consumption of a chip. Therefore, the power limits should be carefully considered when planning the testing of a chip; otherwise it can be damaged during test due to overheating. This leads to the problem of minimizing testing time under such power constraints.

In this thesis we discuss test power modeling and its application to SOC testing. We present previous work in this area and conclude that current power modeling techniques in SOC testing are rather pessimistic. We therefore propose a more accurate power model that is based on the analysis of the test data. Furthermore, we present techniques for test pattern reordering, with the objective of partitioning the test power consumption into low parts and high parts.

The power model is included in a tool for SOC test architecture design and test scheduling, where the scheduling heuristic is designed for SOCs with fixed-width test bus architectures. Several experiments have been conducted in order to evaluate the proposed approaches. The results show that, by using the presented power modeling techniques in test scheduling algorithms, we get lower testing times and thus lower test costs.
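The problem of minimizing test time under a power constraint can be illustrated with a minimal greedy heuristic: start the longest pending core test whose power fits under the chip's limit, and advance time only when nothing else fits. This is a generic sketch with made-up test times and power values, not the thesis's scheduling algorithm:

```python
def schedule_tests(tests, power_limit):
    """Greedy power-constrained test scheduler.
    `tests` is a list of (name, test_time, power) tuples; assumes each
    test individually fits under `power_limit`."""
    pending = sorted(tests, key=lambda t: -t[1])   # longest test first
    running = []                                   # (end_time, power, name)
    t, schedule = 0, {}
    while pending or running:
        used = sum(p for _, p, _ in running)
        started = None
        for name, dur, power in pending:
            if used + power <= power_limit:        # fits under the limit
                started = (name, dur, power)
                break
        if started:
            pending.remove(started)
            running.append((t + started[1], started[2], started[0]))
            schedule[started[0]] = t               # record start time
        else:
            # advance to the earliest completion, retire finished tests
            t = min(end for end, _, _ in running)
            running = [r for r in running if r[0] > t]
    makespan = max(schedule[n] + d for n, d, _ in tests)
    return schedule, makespan

# Three hypothetical core tests under a power limit of 5 units:
sched, total = schedule_tests(
    [("core1", 4, 3), ("core2", 3, 2), ("core3", 2, 2)], power_limit=5)
print(sched, total)   # core1 and core2 run in parallel; core3 waits
```

Running all three cores sequentially would take 9 time units; the power-aware parallel schedule finishes in 5, which is the trade-off the abstract describes.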

APA, Harvard, Vancouver, ISO, and other styles
24

Andersson, Dickfors Robin, and Nick Grannas. "OBJECT DETECTION USING DEEP LEARNING ON METAL CHIPS IN MANUFACTURING." Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-55068.

Full text
Abstract:
Designing cutting tools for the turning industry and providing optimal cutting parameters is of importance both for the client and for the company's own research. By examining the metal chips that form in the turning process, operators can recommend optimal cutting parameters. Instead of doing manual classification of metal chips that come from the turning process, an automated approach to chip detection and classification is preferred. This thesis aims to evaluate if such an approach is possible using either a Convolutional Neural Network (CNN) or CNN feature extraction coupled with machine learning (ML). The thesis started with a research phase where we reviewed existing state-of-the-art CNNs, image processing, and ML algorithms. From the research, we implemented our own object detection algorithm, and we chose to implement two CNNs, AlexNet and VGG16. A third CNN was designed and implemented with our specific task in mind. The three models were tested against each other, both as standalone image classifiers and as feature extractors coupled with a ML algorithm. Because the chips were inside a machine, different angles and light setups had to be tested to evaluate which setup provided the optimal image for classification. A top view of the cutting area was found to be the optimal angle, with light focused both below the cutting area and in the chip disposal tray. The smaller proposed CNN, with three convolutional layers, three pooling layers, and two dense layers, was found to rival both AlexNet and VGG16 both as a standalone classifier and as a feature extractor. The proposed model was designed with a limited system in mind and is therefore better suited for such systems while still having a high accuracy. The classification accuracy of the proposed model as a standalone classifier was 92.03%, compared to the state-of-the-art classifier AlexNet with an accuracy of 92.20%, and VGG16 with an accuracy of 91.88%.
When used as feature extractors, all three models paired best with the Random Forest algorithm, but the differences in accuracy between the feature extractors are not significant. The proposed feature extractor combined with Random Forest had an accuracy of 82.56%, compared to AlexNet with an accuracy of 81.93%, and VGG16 with 79.14% accuracy.
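The feature-extractor-plus-classifier pipeline evaluated in the thesis can be sketched with stdlib stand-ins: average pooling in place of a CNN backbone, and a nearest-centroid rule in place of Random Forest. This is purely illustrative of the pipeline's shape, not the thesis's models:

```python
def extract_features(image, pool=2):
    """Toy stand-in for a CNN feature extractor: average-pool the
    image (a 2-D list of pixel intensities) into coarse blocks and
    flatten them into a feature vector."""
    h, w = len(image), len(image[0])
    feats = []
    for i in range(0, h, pool):
        for j in range(0, w, pool):
            block = [image[y][x]
                     for y in range(i, min(i + pool, h))
                     for x in range(j, min(j + pool, w))]
            feats.append(sum(block) / len(block))
    return feats

def nearest_centroid(train, labels, sample):
    """Minimal classifier standing in for Random Forest: pick the label
    whose class-mean feature vector is closest to the sample."""
    buckets = {}
    for vec, lab in zip(train, labels):
        bucket = buckets.setdefault(lab, [[0.0] * len(vec), 0])
        bucket[0] = [s + v for s, v in zip(bucket[0], vec)]
        bucket[1] += 1
    means = {lab: [s / c for s in sums] for lab, (sums, c) in buckets.items()}
    return min(means,
               key=lambda lab: sum((x - y) ** 2
                                   for x, y in zip(means[lab], sample)))
```

Swapping classifiers on top of a fixed extractor, as done here, is exactly the kind of experiment the thesis runs when comparing Random Forest against other ML back-ends.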
DIGICOGS
APA, Harvard, Vancouver, ISO, and other styles
25

Peterson, Mackenzie. "The Effect of the Antecedent Dry Conditions on Nitrogen Removal for a Modified Bioretention System." Scholar Commons, 2016. http://scholarcommons.usf.edu/etd/6567.

Full text
Abstract:
Eutrophication is defined as the ‘over enrichment’ of a water body from nutrients, resulting in uncontrolled growth of primary producers, leading to periods of oxygen depletion from decomposition of the algal organic matter. According to the 2010 Water Infrastructure Needs and Investment (a U.S. Congressional Report), 40% of U.S. water bodies are contaminated with pollutants, including nutrients. Non-point sources of nutrient pollution are a major cause of this reduction in water quality. One way to decrease eutrophication is to manage nutrients found in stormwater runoff, before they reach a receiving water body. Bioretention cells containing an internal water storage zone (IWSZ) have been shown to remove higher amounts of nitrogen than conventional cells (without an IWSZ). The IWSZ contains an organic carbon substrate, usually derived from wood chips submerged in water, which supports the biochemical process of denitrification. Characteristics of wood chips that affect nitrogen removal include carbon content (%), leaching of dissolved organic carbon (DOC), and wood chip size and type. However, there is limited information on how the intermittent hydraulic loading that is associated with these field systems impacts their performance. Accordingly, the overall goal of this research is to improve understanding of the effect that the antecedent dry conditions (ADC) have on the performance of a field scale bioretention cell modified to contain an IWSZ. The nine different types of wood chips used in laboratory and field studies identified in the literature were categorized as hardwood and softwood. Literature showed that total organic carbon (TOC) leached from softwood chips is almost double the TOC measured from the hardwood chips, 138.3 and 70.3 mg/L, respectively. The average observed nitrogen removal for softwood chips was found to be greater than the removal for the average of the hardwood chips (75.2% and 63.0%, respectively). 
Literature also suggests that larger wood chip size may limit the availability of the carbon for the denitrifying organisms and provides less surface area for biofilm growth. A field study conducted for this research compared the performance of a modified bioretention system designed to enhance denitrification, through the addition of an IWSZ, with a conventional system that does not contain an IWSZ. Fourteen storm events were completed from January 2016 to July 2016 by replicating storm events previously completed in the laboratory using hydraulic loading rates (HLR) of 6.9 cm/h, 13.9 cm/h, and 4.1 cm/h. The goal was to have results from storm events with ADCs of two, four, and eight days, with varying durations of hydraulic loading of two, four, and six hours. Synthetic stormwater, simulating nitrogen levels common in urban runoff, was used as the system's influent to assist in running a controlled experiment. The resultant ADCs ranged from 0 to 33 days, with the average ADC being 9 days. The fourteen sets of influent samples were averaged to obtain mean influent concentrations for the synthetic stormwater. These values were used when calculating the percent nitrogen removal for the four measured nitrogen species (NOx – N, NH4+ – N, organic N, and TN). The field storm events were separated into three groups based on HLR and duration to eliminate the effects of both variables on nitrogen removal in these results, since the focus is the ADC. For the low HLR (4.1 cm/hr), there were four storm events (ADCs of 4 to 33 days); as the ADC increased, greater percentages of ammonium – nitrogen, organic nitrogen, and total nitrogen were removed. For nitrate/nitrite – nitrogen, the percent removal was rather consistent for all four storm events, not significantly increasing or decreasing with changes in the ADC. There were five storm events (ADCs of 0 to 28 days) tested with the median HLR (6.9 cm/hr); nitrogen removal for all four species increased as the ADC increased.
The increase was significant (p < 0.05) for nitrate/nitrite – nitrogen. The third group also contained five storm events (ADCs from 0 to 11 days) that were tested with the highest HLR (13.9 cm/hr). Removal of ammonium – nitrogen, nitrate/nitrite – nitrogen, and total nitrogen all increased with the ADC, while organic nitrogen removal decreased with increasing ADC. As a result, this research concluded that the difference in HLR affects the nitrogen removal efficiency, but overall, increasing the ADC increased nitrogen removal for NOx – N, NH4+ – N, organic N, and TN.
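The percent-removal metric used throughout these results follows directly from mean influent and measured effluent concentrations; a minimal sketch (the concentrations below are made up for illustration, not the study's data):

```python
def percent_removal(influent_mg_l: float, effluent_mg_l: float) -> float:
    """Percent of a nitrogen species removed across the bioretention
    cell, from influent and effluent concentrations (mg/L as N)."""
    return (influent_mg_l - effluent_mg_l) / influent_mg_l * 100.0

# Illustrative concentrations for one hypothetical storm event:
tn_in, tn_out = 2.0, 0.6                          # total nitrogen, mg/L as N
print(round(percent_removal(tn_in, tn_out), 1))   # 70.0
```

Computing this per species (NOx – N, NH4+ – N, organic N, TN) and per storm event is what allows the removal-vs-ADC trends described above to be compared across the three HLR groups.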
APA, Harvard, Vancouver, ISO, and other styles
26

Rajamanikkam, Chidhambaranathan. "Understanding Security Threats of Emerging Computing Architectures and Mitigating Performance Bottlenecks of On-Chip Interconnects in Manycore NTC System." DigitalCommons@USU, 2019. https://digitalcommons.usu.edu/etd/7453.

Full text
Abstract:
Emerging computing architectures, such as neuromorphic computing and third-party intellectual property (3PIP) cores, have attracted significant attention in the recent past. Neuromorphic computing introduces an unorthodox non-von Neumann architecture that mimics the abstract behavior of neuron activity in the human brain. Such architectures can execute complex applications, such as image processing and object recognition, more efficiently in terms of performance and energy than traditional microprocessors. However, the hardware security aspects of neuromorphic computing have received little attention at this nascent stage. 3PIP cores, on the other hand, may contain covertly inserted malicious functional behavior that can inflict a range of harms at the system/application levels. This dissertation examines the impact of various threat models that emerge from neuromorphic architectures and 3PIP cores. Near-Threshold Computing (NTC) serves as an energy-efficient paradigm by aggressively operating all computing resources with a supply voltage close to the threshold voltage, at the cost of performance. Therefore, an STC system is scaled to a many-core NTC system to reclaim the lost performance. However, the interconnect performance in a many-core NTC system poses a significant bottleneck that hinders the performance of the system. This dissertation analyzes the interconnect performance and, further, proposes a novel technique to boost the interconnect performance of many-core NTC systems.
APA, Harvard, Vancouver, ISO, and other styles
27

Chen, Yi-Jung, and 陳依蓉. "System Synthesis for Multi-Processor System-on-Chips." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/98033304483594207429.

Full text
Abstract:
Doctoral dissertation
National Taiwan University
Graduate Institute of Computer Science and Information Engineering
98
Multi-core architecture is attractive for applications with significant parallelism, since multiple processing elements (PEs) are put on a single die to support parallel execution. However, multi-core architecture also stresses the memory system with concurrent memory accesses from different PEs. As the number of cores on a chip increases, the main memory bandwidth requirement also grows. Therefore, it is important to have a memory-aware design when designing Multi-Processor System-on-Chips (MPSoCs). In this thesis, we propose memory-aware MPSoC synthesis methods for MPSoCs with two different architectures: (a) MPSoCs with the traditional 2-Dimensional (2D) CPU-DRAM connection, and (b) MPSoCs with 3-Dimensional (3D) stacked DRAMs. For MPSoCs with the traditional 2D CPU-DRAM connection, the main memory bandwidth is limited due to pin limitations. To maximize system performance, it is important to simultaneously consider the PE and on-chip memory architecture design under limited on-chip resources. That is, on one hand, we want to allocate as many PEs as possible to fully utilize the available task parallelism in the target applications, and on the other hand, we need to incorporate a significant amount of on-chip memory to alleviate the memory bottleneck. However, in a traditional MPSoC design flow, memory and computation components are often considered independently. To tackle this problem, we develop the first PE and memory co-synthesis framework for MPSoCs with 2D CPU-DRAM connections. The goal of the algorithm is to simultaneously synthesize the allocation of PE and on-chip memory modules so that system performance is maximized subject to the resource constraint. In MPSoCs with stacked DRAMs, the 3D die-stacking technology utilizes Through-Silicon Vias (TSVs) to integrate processing cores and DRAMs on the same chip. Moreover, the TSVs, which can be placed densely, provide high DRAM bandwidth for the system.
Therefore, to utilize the high DRAM bandwidth, each PE can have a local DRAM memory controller (DMC) so that it can directly access the DRAM module stacked on top of the PE. This forms a distributed memory interface for the CPU-DRAM connection in MPSoCs with stacked DRAMs. However, a DMC occupies a significant share of the transistor budget, which can be traded for enlarging the capacity of high-speed local SRAM. Moreover, TSVs need extra manufacturing cost and have an adverse impact on chip yields. Therefore, the distributed memory interface, including the number of allocated DMCs and the vertical bus width of each DMC, should be designed carefully. To tackle this problem, in this thesis, we propose the first algorithm to synthesize the DMC allocation and vertical bus allocation for MPSoCs with stacked DRAMs. The goal of the proposed algorithm is to find a proper distributed memory interface design for the given task set so that the total number of TSVs in the system is minimized while the user-defined performance constraint is met.
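The DMC/vertical-bus allocation problem (minimize TSV count subject to a performance constraint) can be sketched as an exhaustive search over small design spaces. The performance estimator passed in below is a hypothetical stand-in for the thesis's performance model, and the TSV accounting is deliberately simplified:

```python
def allocate_dmcs(num_pes, bus_widths, tsv_per_wire, perf, perf_target):
    """Exhaustively pick (number of DMCs, vertical bus width) that
    minimizes the TSV count while meeting a performance constraint.
    `perf(n_dmc, width)` is a user-supplied performance estimator."""
    best = None
    for n_dmc in range(1, num_pes + 1):
        for width in bus_widths:
            if perf(n_dmc, width) < perf_target:
                continue                      # violates the constraint
            tsvs = n_dmc * width * tsv_per_wire
            if best is None or tsvs < best[0]:
                best = (tsvs, n_dmc, width)
    return best   # (tsv_count, n_dmc, width), or None if infeasible

# Toy performance model: aggregate bandwidth ~ DMC count x bus width.
result = allocate_dmcs(num_pes=4, bus_widths=[32, 64, 128],
                       tsv_per_wire=1, perf=lambda n, w: n * w,
                       perf_target=128)
print(result)   # cheapest feasible (tsv_count, n_dmc, width) point
```

Real instances replace the brute-force loop with a heuristic, since task-dependent performance models make the design space far larger than this sketch, but the objective and constraint have the same shape.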
APA, Harvard, Vancouver, ISO, and other styles
28

Wang, Shen-Min, and 王勝民. "Chips Pileup Cause Analysis System Based On Knowledge Integration." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/36219724414444598207.

Full text
Abstract:
Master's thesis
Chung Hua University
Department of Computer Science and Information Engineering, Master's Program
93
With the development of computer hardware, the semiconductor industry has become one of the mainstream industries in the world. During the semiconductor production process, delivery of goods is often delayed due to materials piling up between machines, which causes great losses for semiconductor manufacturers. In order to increase the competitiveness of the semiconductor industry, pileup cause analysis is important. Nowadays, operations of production lines usually follow a fixed procedure, and pileup cause analysis is done by a set of rules predefined by a single experienced engineer. Exceptional conditions are hard to detect on-line, and the pileup analysis rules are also hard to modify to reflect changes in operation status. In this thesis, a knowledge-based approach is proposed to analyze pileup causes in semiconductor production. A multi-expert knowledge acquisition tool is developed to extract pileup cause analysis rules from various experienced engineers. A knowledge-based Pileup Cause Analysis System (abbreviated as PICAS) that utilizes the extracted rules is also developed to analyze pileup causes on-line. With knowledge extracted from multiple production-line experts, PICAS is more objective, avoiding problems that might arise from relying on a single inexperienced engineer. With the knowledge-based approach, analysis rules are easy to modify to reflect on-line situations, which makes pileup cause analysis more realistic and timely. On-line experiments show that greater satisfaction was gained with the new approach.
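A knowledge-based analyzer of this kind can be sketched as a small forward-chaining rule engine: rules gathered from several engineers fire against observed line conditions until no new conclusions appear. The rule contents below are hypothetical examples, not actual PICAS rules:

```python
def diagnose(facts, rules):
    """Tiny forward-chaining inference sketch. Each rule is a pair
    (set_of_required_facts, concluded_fact); rules fire repeatedly
    until the fact base stops growing."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

# Hypothetical rules merged from several engineers' knowledge:
rules = [
    ({"queue_growing", "machine_down"}, "cause: machine outage"),
    ({"queue_growing", "lot_priority_changed"}, "cause: dispatch policy"),
]
derived = diagnose({"queue_growing", "machine_down"}, rules)
print(derived)
```

Keeping the rules as data, as here, is what makes such a system easy to update when multiple experts contribute or when line conditions change, in contrast to hard-coded procedures.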
APA, Harvard, Vancouver, ISO, and other styles
29

Malave-Bonet, Javier. "A Benchmarking Platform For Network-On-Chip (NOC) Multiprocessor System-On-Chips." Thesis, 2010. http://hdl.handle.net/1969.1/ETD-TAMU-2010-12-8662.

Full text
Abstract:
Network-on-Chip (NOC) based designs have garnered significant attention from both researchers and industry over the past several years. The analysis of these designs has focused on broad topics such as NOC component micro-architecture, fault-tolerant communication, and system memory architecture. Nonetheless, the design of low-latency, high-bandwidth, low-power, and area-efficient NOCs is extremely complex due to the conflicting nature of these design objectives. Benchmarks are an indispensable tool in the design process, providing thorough measurement and fair comparison between designs in order to achieve optimal results (i.e. performance, cost, quality of service). This research proposes a benchmarking platform called NoCBench for evaluating the performance of Network-on-Chip designs. Although previous research has proposed standard guidelines to develop benchmarks for Network-on-Chip, this work moves forward and proposes a SystemC-based simulation platform for system-level design exploration. It provides an initial set of synthetic benchmarks for on-chip network interconnection validation along with an initial set of standardized processing cores, NOC components, and system-wide services. The benchmarks were constructed using synthetic applications described by Task Graphs For Free (TGFF) task graphs extracted from the E3S benchmark suite. Two benchmarks were used for characterization: Consumer and Networking. They are characterized based on throughput and latency. Case studies show how they can be used to evaluate metrics beyond throughput and latency (i.e. traffic distribution). The contribution of this work is two-fold: 1) This study provides a methodology for benchmark creation and characterization using NoCBench that evaluates important metrics in NOC design (i.e. end-to-end packet delay, throughput).
2) The developed full-system simulation platform provides a complete environment for further benchmark characterization on NOC-based MPSoCs as well as system-level design space exploration.
APA, Harvard, Vancouver, ISO, and other styles
30

"Test architecture design and optimization for three-dimensional system-on-chips." 2010. http://library.cuhk.edu.hk/record=b5894366.

Full text
Abstract:
Jiang, Li.
"October 2010."
Thesis (M.Phil.)--Chinese University of Hong Kong, 2010.
Includes bibliographical references (leaves 71-76).
Abstracts in English and Chinese.
Abstract --- p.i
Acknowledgement --- p.ii
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Three Dimensional Integrated Circuit --- p.1
Chapter 1.1.1 --- 3D ICs --- p.1
Chapter 1.1.2 --- Manufacture --- p.3
Chapter 1.2 --- Test Architecture Design and Optimization for SoCs --- p.4
Chapter 1.2.1 --- Test Wrapper --- p.4
Chapter 1.2.2 --- Test Access Mechanism --- p.6
Chapter 1.2.3 --- Test Architecture Optimization and Test Scheduling --- p.7
Chapter 1.3 --- Thesis Motivation and Organization --- p.9
Chapter 2 --- On Test Time and Routing Cost --- p.12
Chapter 2.1 --- Introduction --- p.12
Chapter 2.2 --- Preliminaries and Motivation --- p.13
Chapter 2.3 --- Problem Formulation --- p.17
Chapter 2.3.1 --- Test Cost Model --- p.17
Chapter 2.3.2 --- Routing Model --- p.17
Chapter 2.3.3 --- Problem Definition --- p.19
Chapter 2.4 --- Proposed Algorithm --- p.22
Chapter 2.4.1 --- Outline of The Proposed Algorithm --- p.22
Chapter 2.4.2 --- SA-Based Core Assignment --- p.24
Chapter 2.4.3 --- Heuristic-Based TAM Width Allocation --- p.25
Chapter 2.4.4 --- Fast routing Heuristic --- p.28
Chapter 2.5 --- Experiments --- p.29
Chapter 2.5.1 --- Experimental Setup --- p.29
Chapter 2.5.2 --- Experimental Results --- p.31
Chapter 2.6 --- Conclusion --- p.34
Chapter 3 --- Pre-bond-Test-Pin Constrained Test Wire Sharing --- p.37
Chapter 3.1 --- Introduction --- p.37
Chapter 3.2 --- Preliminaries and Motivation --- p.38
Chapter 3.2.1 --- Prior Work in SoC Testing --- p.38
Chapter 3.2.2 --- Prior Work in Testing 3D ICs --- p.39
Chapter 3.2.3 --- Test-Pin-Count Constraint in 3D IC Pre-Bond Testing --- p.40
Chapter 3.2.4 --- Motivation --- p.41
Chapter 3.3 --- Problem Formulation --- p.43
Chapter 3.3.1 --- Test Architecture Design under Pre-Bond Test-Pin-Count Constraint --- p.44
Chapter 3.3.2 --- Thermal-aware Test Scheduling for Post-Bond Test --- p.45
Chapter 3.4 --- Layout-Driven Test Architecture Design and Optimization --- p.46
Chapter 3.4.1 --- Scheme 1: TAM Wire Reuse with Fixed Test Architectures --- p.46
Chapter 3.4.2 --- Scheme 2: TAM Wire Reuse with Flexible Pre-bond Test Architecture --- p.52
Chapter 3.5 --- Thermal-Aware Test Scheduling for Post-Bond Test --- p.53
Chapter 3.5.1 --- Thermal Cost Function --- p.54
Chapter 3.5.2 --- Test Scheduling Algorithm --- p.55
Chapter 3.6 --- Experimental Results --- p.56
Chapter 3.6.1 --- Experimental Setup --- p.56
Chapter 3.6.2 --- Results and Discussion --- p.58
Chapter 3.7 --- Conclusion --- p.59
Chapter 3.8 --- Acknowledgement --- p.60
Chapter 4 --- Conclusion and Future Work --- p.69
Bibliography --- p.70
APA, Harvard, Vancouver, ISO, and other styles
31

Barnes, Christopher J. "A Dynamically Configurable Discrete Event Simulation Framework for Many-Core System-on-Chips." 2010. http://hdl.handle.net/1805/2222.

Full text
Abstract:
Indiana University-Purdue University Indianapolis (IUPUI)
Industry trends indicate that many-core heterogeneous processors will be the next-generation answer to Moore's law and to demands for reduced power consumption. Thus, both academia and industry are focused on the challenges presented by many-core heterogeneous processor designs. In many cases, researchers use discrete event simulators to research and validate new computer architecture innovations. However, there is a lack of dynamically configurable discrete event simulation environments for the testing and development of many-core heterogeneous processors. To fulfill this need, we present Mhetero, a retargetable framework for cycle-accurate simulation of heterogeneous many-core processors along with the cycle-accurate simulation of their associated network-on-chip communication infrastructure. Mhetero is the result of research into dynamically configurable and highly flexible simulation tools with which users are free to produce custom instruction sets and communication methods in a highly modular design environment. In this thesis we will discuss our approach to dynamically configurable discrete event simulation and present several experiments performed using the framework to exemplify how Mhetero, and similarly constructed simulators, may be used for future innovations.
APA, Harvard, Vancouver, ISO, and other styles
32

Chung, I.-Chun, and 鍾逸駿. "The Study on Intelligent Control of a Cervical-Lumbar Traction System by DSP Chips." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/32559762311692654642.

Full text
Abstract:
Master's thesis
National Cheng Kung University
Department of Systems and Naval Mechatronic Engineering
92
The purpose of this study is to propose an intelligent cervical-lumbar traction system for rehabilitation, because the population with chronic skeletal neuromuscular diseases is increasing dramatically. Traditional traction systems have two major shortcomings: traction control is based on an open-loop manual structure, and a single fixed EMG setting is applied in therapy for all patients. In this study, a learning EMG knowledge base is constructed which automatically updates a patient's data after each therapy session. To achieve intelligent control, two fuzzy controllers implemented on DSP chips are designed. An EMG fuzzy controller infers a more suitable EMG level from the constructed EMG knowledge base; according to the inferred EMG, the traction weight is adapted to each patient's rehabilitation. A traction fuzzy controller accurately controls a DC motor to achieve safe and comfortable therapy. The system not only combines myoelectric signal feedback with automatic control of the traction weight, avoiding muscle injury or increased pain after traction caused by improper manipulation, but also provides a graphical user interface for greater convenience. This increases the value of the product itself and offers many inspirations for future related research.
APA, Harvard, Vancouver, ISO, and other styles
33

LIU, CHIA-YIN, and 劉佳音. "Thermal-aware Memory System Design Automation Method for Multi-Processor System-on-Chips with 3D-stacked Hybrid Memories." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/kzn2cx.

Full text
Abstract:
Master's thesis
National Chi Nan University
Department of Computer Science and Information Engineering
106
Stacking memories on Multi-Processor System-on-Chips (MPSoCs) with Through-Silicon Vias (TSVs) provides high speed and wide bandwidth to overcome the memory wall problem, and supports heterogeneous integration by integrating resources vertically. However, such systems are prone to thermal management problems because of the stacked power density. In terms of resource allocation, the logic-layer area occupied by stacked DRAM and TSV bundles can be traded against local memories to improve system performance. In this paper, we propose a thermal-aware hardware/software co-design synthesis algorithm that optimizes performance with limited resources under a thermal constraint by considering the choice of stacked DRAM or SRAM, the allocation of DRAM memory controllers (DMCs) when DRAM is chosen, the configuration of TSV bundles, the resource trade-off in the logic layer, and thermal-aware task and data co-allocation. Compared to a stacked-SRAM-only configuration with a thermal-aware software-only method, the proposed method achieves a 149% performance improvement on average, and a 138% improvement on average compared to a stacked-DRAM-only configuration. The system temperature under the proposed method is kept well within the given thermal constraint of 85°C.
APA, Harvard, Vancouver, ISO, and other styles
34

Nahvi, Yawar M. "Transmission-gate based variation tolerant active clock deskewing for deep submicron system on chips (SoCs)." 2007. http://proquest.umi.com/pqdweb?did=1240710771&sid=7&Fmt=2&clientId=39334&RQT=309&VName=PQD.

Full text
Abstract:
Thesis (M.S.)--State University of New York at Buffalo, 2007.
Title from PDF title page (viewed on July 06, 2007). Available through UMI ProQuest Digital Dissertations. Thesis adviser: Ramalingam, Sridhar. Includes bibliographical references.
APA, Harvard, Vancouver, ISO, and other styles
35

蔡宜璋. "Fast FPGA prototyping of block-matching operations for video coding using system-on-programmable chips." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/20185273350619798693.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Hirmer, Katrin. "Interference-Aware Integration of Mixed-Signal Designs and Ultra High Voltage Pulse Generators for System-on-Chips." Phd thesis, 2019. https://tuprints.ulb.tu-darmstadt.de/9118/1/2019-10-21_Hirmer_Katrin.pdf.

Full text
Abstract:
The interference-aware implementation of system-on-chips (SoCs) including ultra high voltage pulse generators and mixed-signal devices, which are for example used in rectifiers or gate drivers, enables the continuous miniaturization of system electronics. Square wave signals with high amplitudes and slew rates can interfere significantly with monolithically integrated low voltage electronics. The prediction of these interferences on SoCs prior to fabrication is essential to take countermeasures. This helps to ensure the functionality of the system and reduces development costs. The main objective of this work is to develop a model which can predict the influences of high voltage pulses on circuits with low supply voltages by simulations. The integration of this model into the conventional design flow of integrated circuits enables SPICE simulations without any additional license fees. The investigations within this thesis allow deriving recommendations for the integration of high voltage pulses and low voltage circuitry within a SoC. Two SoCs have been fabricated in a silicon-on-insulator process. These can be used to emit light from an electroluminescent device as well as driving a capacitive sensor at the same time. The implemented ultra high voltage pulse generator can deliver pulses with up to ±300 V at slew rates of up to 99.56 V/µs. It is able to drive capacitive loads of 10 nF at frequencies of up to 5 kHz. At the same time, a spread spectrum clock generator (SSCG) with a resolution of 9 bit can excite the capacitive sensor with a bandwidth of 10.14 MHz and an attenuation of 33.17 dB with a 5 V power supply. During the switching operation of the ultra high voltage pulse generator, deviations of the operating frequency of the SSCG can be observed. These can mostly be explained by substrate coupling. To verify the coupling mechanism, on the one hand, relevant impedances of the substrate network are measured and compared to calculated values within this thesis. 
On the other hand, the coupling of the high voltage pulse generator to the substrate as well as the influences of variations of the substrate potential on low voltage designs are recorded by measurement. To predict the interferences on mixed-signal devices, a substrate netlist can be extracted with the help of the SoC layout. The parameters of the components within the substrate equivalent circuit can be analytically calculated by using geometric dimensions extracted from the layout of the SoC. The substrate netlist can be simulated along with the post-layout of the integrated components. The modeling of the supply voltage as well as the packaging is of great importance for the simulation. The investigations of this thesis result in recommendations for the implementation of SoCs with ultra high voltage pulse generators and mixed-signal devices. They include considerations for the circuit implementation, the layout as well as the package selection. For the fabricated SoCs, the frequency change of the SSCG can be reduced by 77.35 %.
APA, Harvard, Vancouver, ISO, and other styles
37

Cheng, Adriel. "Verification of systems-on-chips using genetic evolutionary test techniques from a software applications perspective." Thesis, 2010. http://hdl.handle.net/2440/62335.

Full text
Abstract:
This thesis examines verification of system-on-a-chip (SoC) designs using a software applications test methodology that is enhanced by genetic evolutionary test generation and functional coverage. The verification methodology facilitates application-based testing using behavioural simulations before the chip is fabricated. The goal of the methodology is to verify commonly used real-life functionalities of the SoC earlier in the design process, so as to uncover design bugs that are most critical to actual SoC usage when the SoC is employed in its intended end product. The verification methodology is based on a test building blocks approach, whereby many different components of various SoC application use-cases are extracted into building blocks, and then recomposed with other components to construct a greater variety and range of test cases for verifying the SoC. An important facet of the methodology is to address automated creation of these software application test cases in an effective and efficient manner. The goal is to maximise test coverage, and hence bug detection likelihood, using minimal verification resources and effort. To this end, test generation techniques employing single- and multi-objective genetic algorithms and evolutionary strategies are devised in this thesis. Using coverage and test size to drive test generation, test suites are continually evolved to enhance SoC verification, thereby achieving automated coverage-driven verification. Another enhancement for test generation is to select the input test creation parameters in an analytical manner. A technique using Markov chains is developed to model and analyse the test generation method, and by doing so, test parameters can be selected to achieve desired verification characteristics and outcomes with greater likelihood. To quantify verification effectiveness, a functional coverage method is formulated. 
The coverage method monitors attributes of the SoC design during testing. The combinations of attribute values indicate the application functionalities carried out. To address the coverage space explosion phenomenon for such combinatorial methods and facilitate the coverage measurement process, partial order domains and trajectory checking techniques from the formal verification field of symbolic trajectory evaluation are adopted. The contributions of this thesis are a verification platform and associated tool-suite that incorporates the software applications test methodology, algorithmic test generation, and functional coverage techniques.
Thesis (Ph.D.) -- University of Adelaide, School of Electrical and Electronic Engineering, 2010
APA, Harvard, Vancouver, ISO, and other styles
38

Lee, Chang Joo 1975. "DRAM-aware prefetching and cache management." Thesis, 2010. http://hdl.handle.net/2152/ETD-UT-2010-12-2492.

Full text
Abstract:
Main memory system performance is crucial for high performance microprocessors. Even though the peak bandwidth of main memory systems has increased through improvements in the microarchitecture of Dynamic Random Access Memory (DRAM) chips, conventional on-chip memory systems of microprocessors do not fully take advantage of it. This results in underutilization of the DRAM system, in other words, many idle cycles on the DRAM data bus. The main reason for this is that conventional on-chip memory system designs do not fully take into account important DRAM characteristics. Therefore, the high bandwidth of DRAM-based main memory systems cannot be realized and exploited by the processor. This dissertation identifies three major performance-related characteristics that can significantly affect DRAM performance and makes a case for DRAM characteristic-aware on-chip memory system design. We show that on-chip memory resource management policies (such as prefetching, buffer, and cache policies) that are aware of these DRAM characteristics can significantly enhance entire system performance. The key idea of the proposed mechanisms is to send out to the DRAM system useful memory requests that can be serviced with low latency or in parallel with other requests rather than requests that are serviced with high latency or serially. Our evaluations demonstrate that each of the proposed DRAM-aware mechanisms significantly improves performance by increasing DRAM utilization for useful data. We also show that when employed together, the performance benefit of each mechanism is achieved additively: they work synergistically and significantly improve the overall system performance of both single-core and Chip MultiProcessor (CMP) systems.
APA, Harvard, Vancouver, ISO, and other styles
39

WU, CHENG-EN, and 吳承恩. "Thermal-aware Task and Data Placement for Optimizing the Performance of Multi-Processor System-on-Chips with 3D-stacked Memories Architecture." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/ft7bs9.

Full text
Abstract:
Master's thesis
National Chi Nan University
Department of Computer Science and Information Engineering
105
The architecture of Multi-Processor Systems-on-Chips (MPSoCs) with 3D-stacked memories is considered one of the most promising ways to mitigate the memory wall problem of an MPSoC. However, the increasing power density of a 3D IC makes the MPSoC more likely to run into thermally emergent conditions. Studies also show that devices that are vertically aligned in a 3D IC have strong thermal correlation, and that devices farther from the heat sink have more difficulty dissipating heat. To ease the thermal emergency problem of a 3D IC, the power consumption of each vertically aligned device should be managed to avoid overheating. This management can be achieved by software design techniques, such as thermal-aware task scheduling and data placement, or by hardware design techniques, such as allocating SRAM and DRAM layers according to the thermal limit. In this paper, we focus on the design of the software architecture and propose a method that performs thermal-aware task and data placement synergistically for the target architecture. Different from existing thermal-aware task scheduling and data placement methods that consider only stacked cores or only stacked memories, the synergistic method designed in this paper considers the heterogeneity of cores and memory elements to optimize system performance under the thermal constraint. Experimental results show that, compared to a performance-aware method, our method loses only 10% of system performance while staying below the thermal constraint. Compared to data-only placement and task-only placement, our method improves system performance by 5% and 5.3%, respectively.
APA, Harvard, Vancouver, ISO, and other styles
40

Cheng, Cheng-Hsiang, and 鄭丞翔. "The Design of CMOS System-on-Chips (SoCs) and Closed-Loop Neuromodulation Systems for Human Epileptic Seizure and Parkinson’s Disease Control." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/x4655k.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Chen, Po-Lin, and 陳柏霖. "Fast Test Integration: Toward Plug-and-Play Embedded At-speed Test Framework for Multiple Clock Domains in System-On-Chips Based on IEEE Standard 1500." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/68087992473847969806.

Full text
Abstract:
Doctoral dissertation
National Tsing Hua University
Department of Electrical Engineering
99
As process technology advances and chip speeds continue to rise, timing-related delay defects, such as resistive shorts, resistive opens, and signal integrity problems, have gradually come to dominate chip test quality. However, conventional test methods targeting stuck-at faults cannot effectively detect timing-related delay defects and therefore cannot verify whether a chip can operate at the functional clock speed required by its design specifications. Relying on traditional stuck-at testing alone not only reduces test efficiency but also adds test overhead and cost. To address the difficulty of back-end testing caused by timing-related delay defects, the semiconductor industry developed delay fault testing, also called at-speed testing, for circuits operating in a single clock domain. Rapid progress in process technology has also enabled System-on-Chip (SoC) design strategies that integrate multiple high-performance cores with multiple clock domains on one chip. Clock Domain Crossing (CDC) data transfers mean that delay testing must be extended from single-clock-domain tests to multiple-clock-domain and cross-clock-domain tests in order to guarantee overall delay test quality. Delay testing of high-speed, multiple-clock-domain SoCs, however, is beyond the capability of conventional low-end Automatic Test Equipment (ATE), so embedded Design-for-Testability (DfT) support for delay testing has become both important and popular. This is especially true under the SoC design paradigm, where all logic circuits are embedded within the chip and cannot be controlled or observed through chip-level I/Os, making multiple-clock-domain delay testing inside the chip very difficult. Although the modular SoC test standard proposed by the Institute of Electrical and Electronics Engineers (IEEE Standard 1500) enables effective testing of embedded logic cores in an SoC, it provides no solution for delay testing. Therefore, developing an embedded DfT scheme under the existing IEEE 1500 standard, with low area overhead, robustness, and support for multiple-clock-domain delay testing, is a crucial step toward improving the quality and reliability of delay testing while reducing test cost for SoCs.
APA, Harvard, Vancouver, ISO, and other styles
42

Surendran, Sudhakar. "A Systematic Approach To Synthesis Of Verification Test-Suites For Modular SoC Designs." Thesis, 2006. http://hdl.handle.net/2005/397.

Full text
Abstract:
SoCs (System on Chips) are complex designs with heterogeneous modules (CPU, memory, etc.) integrated in them. Verification is one of the important stages in designing an SoC. Verification is the process of checking if the transformation from architectural specification to design implementation is correct. Verification involves creating the following components: (i) a testplan that identifies the conditions to be verified, (ii) a testcase that generates the stimuli to verify the conditions identified, and (iii) a test-bench that applies the stimuli and monitors the output from the design. Verification consumes up to 70% of the total design time. This is largely due to the complex and manual nature of the verification task. To reduce the time spent in verifying the design, the components used for verification can be generated automatically or created at an abstract level (to reduce the complexity) and reused. In this work we present a methodology to synthesize testcases from reusable code segments and abstract specifications. Our methodology consists of the following major steps: (i) identifying the structure of testcases, (ii) identifying code segments of testcases that can be reused from one SoC to another, (iii) identifying properties of an SoC and its modules that can be used to synthesize the SoC specific code segments of the testcase, and (iv) proposing a synthesizer that uses the code segments, the properties and the abstract specification to synthesize testcases. We discuss two specific classes of testcases. These are testcases for verifying the memory modules and the testcases for verifying the data transfer modules. These are considered since they form a significantly large subset of the device functionality. We implement a prototype testcase generator and also present an example to illustrate the use of the methodology for each of these classes. 
The use of our methodology enables (i) the creation of testcases automatically that are correct by construction and (ii) reuse of the testcase code segments from one SoC to another. Some of the properties (of the modules and the SoC) presented in our work can be easily made part of the architectural specification, and hence, can further reduce the effort needed to create them.
APA, Harvard, Vancouver, ISO, and other styles
43

Narayanasetty, Bhargavi. "Analysis of high performance interconnect in SoC with distributed switches and multiple issue bus protocols." Thesis, 2011. http://hdl.handle.net/2152/ETD-UT-2011-05-3325.

Full text
Abstract:
In a System on a Chip (SoC), interconnect is the factor limiting Performance, Power, Area and Schedule (PPAS). Distributed crossbar switches, also called Switching Central Resources (SCRs), are often used to implement high performance interconnect in a SoC – Network on a Chip (NoC). Multiple issue bus protocols like AXI (from ARM) and VBUSM (from TI) are used in paths critical to the performance of the whole chip. Experimental analysis of the effects on PPAS of architectural modifications to the SCRs is carried out, using synthesis tools and Texas Instruments (TI) in-house power estimation tools. The effects of scaling of SCR sizes are discussed in this report. These results provide a quick means of estimation for architectural changes in the early design phase. Apart from SCR design, the other major domain of concern is deadlocks. Deadlocks are situations where network resources are suspended waiting for each other. In this report, various kinds of deadlocks are classified and their respective mitigations in such networks are provided. These analyses are necessary to qualify distributed SCR interconnect, which uses multiple issue protocols, across all scenarios of transactions. The entire analysis in this report is carried out using a flagship product of Texas Instruments. This ASIC SoC is a complex wireless base station developed in 2010-2011, having 20 major cores. Since the parameters of crossbar switches with multiple issue bus protocols are commonly used in SoCs across the semiconductor industry, this report provides a strong basis for architectural/design selection and validation of all such high performance device interconnects. This report can be used as a seed for the development of an interface tool for architects. For a given architecture, the tool suggests architectural modifications, and reports deadlock situations. 
This new tool will aid architects in closing design problems and provide a competitive specification very early in the design cycle. A working algorithm for the tool development is included in this report.
APA, Harvard, Vancouver, ISO, and other styles
44

Runge, Armin. "Advances in Deflection Routing based Network on Chips." Doctoral thesis, 2017. https://nbn-resolving.org/urn:nbn:de:bvb:20-opus-149700.

Full text
Abstract:
The progress made in semiconductor chip production in recent years enables a multitude of cores on a single die. However, due to further decreasing structure sizes, fault tolerance and energy consumption will represent key challenges. Furthermore, an efficient communication infrastructure is indispensable due to the high degree of parallelism in such systems. The predominant communication system in such highly parallel systems is a Network on Chip (NoC). The focus of this thesis is on NoCs which are based on deflection routing. In this context, contributions are made to two domains: fault tolerance and dimensioning of the optimal link width. Both aspects are essential for the application of reliable, energy-efficient, deflection-routing-based NoCs. It is expected that future semiconductor systems will have to cope with high fault probabilities. The inherently high connectivity of most NoC topologies can be exploited to tolerate the breakdown of links and other components. In this thesis, a fault-tolerant router architecture has been developed, which stands out for its interconnection architecture and its method for overcoming complex fault situations. The presented simulation results show that all data packets arrive at their destination, even at high fault probabilities. In contrast to routing-table-based architectures, the hardware costs of the architecture presented herein are lower and, in particular, independent of the number of components in the network. Besides fault tolerance, hardware costs and energy efficiency are of great importance. The utilized link width has a decisive influence on these aspects. In particular, in deflection-routing-based NoCs, over- and under-sizing of the link width leads to unnecessarily high hardware costs and poor performance, respectively. In the second part of this thesis, the optimal link width of deflection-routing-based NoCs is investigated. Additionally, a method to reduce the link width is introduced. 
Simulation and synthesis results show that the method presented herein allows a significant reduction in hardware costs at comparable performance.
APA, Harvard, Vancouver, ISO, and other styles
45

Basavaraj, T. "NoC Design & Optimization of Multicore Media Processors." Thesis, 2013. http://etd.iisc.ernet.in/2005/3296.

Full text
Abstract:
Network on Chips[1][2][3][4] are critical elements of modern System on Chip(SoC) as well as Chip Multiprocessor(CMP)designs. Network on Chips (NoCs) help manage high complexity of designing large chips by decoupling computation from communication. SoCs and CMPs have a multiplicity of communicating entities like programmable processing elements, hardware acceleration engines, memory blocks as well as off-chip interfaces. With power having become a serious design constraint[5], there is a great need for designing NoC which meets the target communication requirements, while minimizing power using all the tricks available at the architecture, microarchitecture and circuit levels of the de-sign. This thesis presents a holistic, QoS based, power optimal design solution of a NoC inside a CMP taking into account link microarchitecture and processor tile configurations. Guaranteeing QoS by NoCs involves guaranteeing bandwidth and throughput for connections and deterministic latencies in communication paths. Label Switching based Network-on-Chip(LS-NoC) uses a centralized LS-NoC Management framework that engineers traffic into QoS guaranteed routes. LS-NoC uses label switching, enables band-width reservation, allows physical link sharing and leverages advantages of both packet and circuit switching techniques. A flow identification algorithm takes into account band-width available in individual links to establish QoS guaranteed routes. LS-NoC caters to the requirements of streaming applications where communication channels are fixed over the lifetime of the application. The proposed NoC framework inherently supports heterogeneous and ad-hoc SoC designs. A multicast, broadcast capable label switched router for the LS-NoC has been de-signed, verified, synthesized, placed and routed and timing analyzed. A 5 port, 256 bit data bus, 4 bit label router occupies 0.431 mm2 in 130nm and delivers peak band-width of80Gbits/s per link at312.5MHz. 
The LS router is estimated to consume 43.08 mW. Bandwidth and latency guarantees of the LS-NoC have been demonstrated on streaming applications such as HiperLAN/2 and an Object Recognition Processor, on Constant Bit Rate traffic patterns, and on video decoder traffic representing Variable Bit Rate traffic. The LS-NoC was found to have a figure of merit competitive with state-of-the-art NoCs providing QoS. We envision the use of the LS-NoC in general-purpose CMPs where applications demand deterministic latencies and hard bandwidth requirements. Design variables for interconnect exploration include wire width, wire spacing, repeater size and spacing, degree of pipelining, supply voltage, threshold voltage, activity and coupling factors. An optimal link configuration, in terms of the number of pipeline stages for a given link length and desired operating frequency, is arrived at. Optimal configurations of all links in the NoC are identified, and a power-performance optimal NoC is presented. We present a latency, power and performance trade-off study of NoCs using link microarchitecture exploration, along with the design and implementation of a framework for such a design space exploration study. The trade-off study varies microarchitectural (e.g. pipelining) and circuit-level (e.g. frequency and voltage) parameters. A SystemC-based NoC exploration framework is used to explore the impact of various architectural and microarchitectural parameters of NoC elements on the power and performance of the NoC. The framework enables the designer to choose from a variety of architectural options such as topology and routing policy, and allows experimentation with various microarchitectural options for individual links, such as length, wire width, pitch, pipelining, supply voltage and frequency. The framework also supports a flexible traffic generation and communication model. Latency, power and throughput results from using this framework to study a 4x4 CMP are presented.
The framework is used to study NoC designs of a CMP using different classes of parallel computing benchmarks [6]. One key finding is that the average latency of a link can be reduced, up to a point, by increasing pipeline depth, since deeper pipelining enables link operation at higher frequencies. There exists an optimum degree of pipelining that minimizes the energy-delay product of the link. In a 2D Torus, when the longest link is pipelined by 4 stages, the least latency (1.56 times the minimum) is achieved while power (40% of max) and throughput (64% of max) remain nominal. Frequency scaling experiments show power variations of up to 40%, 26.6% and 24% in the 2D Torus, Reduced 2D Torus and Tree-based NoC, respectively, between pipeline configurations achieving the same frequency at constant voltage. In some cases, switching to a deeper pipelining configuration can actually reduce power, as the links can then be designed with smaller repeaters. We also find that the overall performance of the interconnection networks is determined by the lengths of the links needed to support the communication patterns; thus, the mesh performs best among the three topologies (Mesh, Torus and Folded Torus) considered in the case studies. The effects of communication overheads on the performance, power and energy of a multiprocessor chip are studied using L1 and L2 cache sizes as primary exploration parameters, with accurate modelling of the interconnect, processors, and on-chip and off-chip memory. On-chip and off-chip communication times have a significant impact on the execution time and energy efficiency of CMPs. Larger caches imply larger tile areas, which result in longer inter-tile communication links and latencies, adversely impacting communication time. Smaller caches potentially incur more misses and more frequent off-tile communication.
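The existence of an optimum pipelining degree can be illustrated with a toy first-order model (not the thesis's circuit-level model): shortening the wire segments between registers lets repeaters shrink, so wire energy falls roughly as 1/n, while each extra stage adds one register's energy and delay. Every constant below is invented for illustration only, in arbitrary units.

```python
def link_edp(n_stages, wire_energy=16.0, reg_energy=1.0,
             wire_delay=8.0, reg_delay=1.0):
    """Toy energy-delay product of a link split into n pipeline stages.

    Assumed first-order behaviour (illustrative, arbitrary units):
    - repeated wire segments need smaller repeaters as they shorten,
      so wire energy scales ~ 1/n_stages;
    - each stage adds one pipeline register's energy and delay.
    """
    energy = wire_energy / n_stages + n_stages * reg_energy
    latency = wire_delay + n_stages * reg_delay
    return energy * latency

# Sweep candidate pipeline depths and pick the EDP-optimal one.
depths = range(1, 9)
best = min(depths, key=link_edp)
```

With these particular constants the sweep bottoms out at three stages; the point is only that the product of a falling energy term and a rising delay term has an interior minimum, matching the observation above that there is an optimum degree of pipelining.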
Energy-efficient tile design is a configuration exploration and trade-off study using different cache sizes and tile areas to identify a power-performance optimal configuration for the CMP. Trade-offs are explored using a detailed, cycle-accurate multicore simulation framework that includes superscalar processor cores, cache-coherent memory hierarchies, on-chip point-to-point communication networks and a detailed interconnect model including pipelining and latency. Sapphire, a detailed multiprocessor execution environment integrating SESC, Ruby and DRAMSim, was used to run applications from the Splash2 benchmark suite (64K-point FFT). Link latencies are estimated for a 16-core CMP simulation on Sapphire. Each tile has a single processor, L1 and L2 caches, and a router. Different L1 and L2 sizes lead to different tile clock speeds, tile miss rates and tile areas, and hence different interconnect latencies. Simulations across various L1 and L2 sizes indicate that the tile configuration that maximizes energy efficiency is related to minimizing communication time. Experiments also indicate different optimal tile configurations for performance, energy and energy efficiency. A clustered interconnection network, communication-aware cache bank mapping, and thread mapping to physical cores are also explored as potential energy-saving solutions. Results indicate that ignoring link latencies can lead to large errors in estimates of program completion times, of up to 17%. Performance-optimal configurations are achieved at smaller L1 and moderate L2 cache sizes, due to higher operating frequencies, shorter link lengths and comparatively less communication. Using the minimal L1 cache size to operate at the highest frequency may not always be the performance-power optimal choice: larger L1 sizes, despite a drop in frequency, offer an energy advantage due to fewer misses and hence less communication. Clustered tile placement experiments for FFT show a considerable performance-per-watt improvement (1.2%).
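The kind of sweep described above can be mimicked with a toy analytical model. The thesis uses cycle-accurate simulation; every formula and constant here is an invented stand-in, meant only to show how a larger L1 trades lower miss rates against a slower clock and longer links, and why the performance-optimal and energy-optimal sizes need not coincide.

```python
import math

def tile_metrics(l1_kb):
    """Toy model of one tile configuration; all constants are invented.
    Larger L1 -> fewer misses, but a slower clock and longer links."""
    freq = 2.0 / (1.0 + 0.05 * math.log2(l1_kb / 8))  # GHz, drops with size
    miss_rate = 0.10 * (8 / l1_kb) ** 0.3             # misses per instruction
    link_cycles = 4 + l1_kb / 8                       # bigger tile, longer links
    cpi = 1.0 + miss_rate * (20 + link_cycles)        # miss penalty incl. NoC hop
    time = cpi / freq                                 # ns per instruction
    power = 1.0 + 0.01 * l1_kb + 0.5 * freq           # W: leakage + dynamic
    return time, time * power                         # (delay, energy) per instr

sizes = [8, 16, 32, 64, 128]
perf_best = min(sizes, key=lambda s: tile_metrics(s)[0])
energy_best = min(sizes, key=lambda s: tile_metrics(s)[1])
```

Under this model the size that minimizes time per instruction differs from the size that minimizes energy per instruction, echoing the finding above that performance, energy and energy efficiency each favour a different tile configuration.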
Remapping the L2 banks most accessed by a process to the same core or neighbouring cores, after communication traffic analysis, offers power and performance advantages. Remapped processes and banks under clustered tile placement show a performance-per-watt improvement of 5.25% and an energy reduction of 2.53%. This suggests that processors could execute a program in multiple modes, for example minimum energy or maximum performance.
APA, Harvard, Vancouver, ISO, and other styles