Rozprawy doktorskie na temat „Reconfigurable Hardware Architecture”
Utwórz poprawne odniesienie w stylach APA, MLA, Chicago, Harvard i wielu innych
Sprawdź 48 najlepszych rozpraw doktorskich naukowych na temat „Reconfigurable Hardware Architecture”.
Przycisk „Dodaj do bibliografii” jest dostępny obok każdej pracy w bibliografii. Użyj go – a my automatycznie utworzymy odniesienie bibliograficzne do wybranej pracy w stylu cytowania, którego potrzebujesz: APA, MLA, Harvard, Chicago, Vancouver itp.
Możesz również pobrać pełny tekst publikacji naukowej w formacie „.pdf” i przeczytać adnotację do pracy online, jeśli odpowiednie parametry są dostępne w metadanych.
Przeglądaj rozprawy doktorskie z różnych dziedzin i twórz odpowiednie bibliografie.
Gelb, Benjamin S. "A timeshared, runtime reconfigurable hardware co-processing architecture". Thesis, Massachusetts Institute of Technology, 2009. http://hdl.handle.net/1721.1/53147.
Pełny tekst źródłaIncludes bibliographical references (leaves 73-74).
The constant desire for increased performance in microprocessor systems has led to the need for specialized hardware cores to accelerate specific computational tasks. In this thesis, we explore the potential of using FPGA partial reconfiguration to create a platform for customized hardware cores that may be loaded on demand, at runtime, and replaced when not in use. We implement two new software tools, bitparse and bitrender, to demonstrate the bitstream relocation technique. Further, we present a functional microprocessor system coupled with a runtime reprogramable peripheral synthesized on a Xilinx Virtex-5 FPGA and discuss its performance implications.
by Benjamin S. Gelb.
M.Eng.
Peterkin, Raymond. "A reconfigurable hardware architecture for VPN MPLS based services". Thesis, University of Ottawa (Canada), 2006. http://hdl.handle.net/10393/27283.
Pełny tekst źródłaDiniz, Claudio Machado. "Dedicated and reconfigurable hardware accelerators for high efficiency video coding standard". reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2015. http://hdl.handle.net/10183/118394.
Pełny tekst źródłaThe demand for ultra-high resolution video (beyond 1920x1080 pixels) led to the need of developing new and more efficient video coding standards to provide high compression efficiency. The High Efficiency Video Coding (HEVC) standard, published in 2013, reaches double compression efficiency (or 50% reduction in size of coded video) compared to the most efficient video coding standard at that time, and most used in the market, the H.264/AVC (Advanced Video Coding) standard. HEVC reaches this result at the cost of high computational effort of the tools included in the encoder and decoder. The increased computational effort of HEVC standard and the power limitations of current silicon fabrication technologies makes it essential to develop hardware accelerators for compute-intensive computational kernels of HEVC application. Hardware accelerators provide higher performance and energy efficiency than general purpose processors for specific applications. An HEVC application analysis conducted in this work identified the most compute-intensive kernels of HEVC, namely the Fractional-pixel Interpolation Filter, the Deblocking Filter and the Sum of Absolute Differences calculation. A run-time analysis on Interpolation Filter indicates a great potential of power/energy saving by adapting the hardware accelerator to the varying workload. This thesis introduces new contributions in the field of dedicated and reconfigurable hardware accelerators for HEVC standard. Dedicated hardware accelerators for the Fractional Pixel Interpolation Filter, the Deblocking Filter and the Sum of Absolute Differences calculation are herein proposed, designed and evaluated. The interpolation filter hardware architecture achieves throughput similar to the state of the art, while reducing hardware area by 50%. Our deblocking filter hardware architecture also achieves similar throughput compared to state of the art with a 5X to 6X reduction in gate count and 3X reduction in power dissipation. The thesis also does a new comparative analysis of Sum of Absolute Differences processing elements, in which various architecture design alternatives with different area, performance and power results were introduced. A novel reconfigurable interpolation filter hardware architecture for HEVC standard was developed, and it provides 57% design-time area reduction and run-time power/energy adaptation in a picture-by-picture basis, compared to the state-of-the-art. Additionally a run-time accelerator binding scheme is proposed for tile-based mixed-grained reconfigurable architectures, which reduces the communication overhead, compared to first-fit strategy with datapath reusing scheme, by up to 44% (23% on average) for different number of tiles and internal tile organizations. This run-time accelerator binding scheme is aware of the underlying architecture to bind datapaths in an efficient way, to avoid and minimize inter-tile communications. The new dedicated and reconfigurable hardware accelerators and techniques proposed in this thesis enable next-generation video coding standard implementations beyond HEVC with improved area, performance, and power efficiency.
Kung, Ling-Pei 1961. "Obtaining performance and programmability using reconfigurable hardware for media processing". Thesis, Massachusetts Institute of Technology, 2002. http://hdl.handle.net/1721.1/61855.
Pełny tekst źródłaIncludes bibliographical references (p. 127-132).
An imperative requirement in the design of a reconfigurable computing system or in the development of a new application on such a system is performance gains. However, such developments suffer from long-and-difficult programming process, hard-to-predict performance gains, and limited scope of applications. To address these problems, we need to understand reconfigurable hardware's capabilities and limitations, its performance advantages and disadvantages, re-think reconfigurable system architectures, and develop new tools to explore its utility. We begin by examining performance contributors at the system level. We identify those from general-purpose and those from dedicated components. We propose an architecture by integrating reconfigurable hardware within the general-purpose framework. This is to avoid and minimize dedicated hardware and organization for programmability. We analyze reconfigurable logic architectures and their performance limitations. This analysis leads to a theory that reconfigurable logic can never be clocked faster than a fixed-logic design based on the same fabrication technology. Though highly unpredictable, we can obtain a quick upper bound estimate on the clock speed based on a few parameters. We also analyze microprocessor architectures and establish an analytical performance model. We use this model to estimate performance bounds using very little information on task properties. These bounds help us to detect potential memory-bound tasks. For a compute-bound task, we compare its performance upper bound with the upper bound on reconfigurable clock speed to further rule out unlikely speedup candidates.
(cont.) These performance estimates require very few parameters, and can be quickly obtained without writing software or hardware codes. They can be integrated with design tools as front end tools to explore speedup opportunities without costly trials. We believe this will broaden the applicability of reconfigurable computing.
by Ling-Pei Kung.
Ph.D.
Balasubramanian, Karthikeyan. "Reconfigurable System-on-Chip Architecture for Neural Signal Processing". Diss., Temple University Libraries, 2011. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/144255.
Pełny tekst źródłaPh.D.
Analyzing the brain's behavior in terms of its neuronal activity is the fundamental purpose of Brain-Machine Interfaces (BMIs). Neuronal activity is often assumed to be encoded in the rate of neuronal action potential spikes. Successful performance of a BMI system is tied to the efficiency of its individual processing elements such as spike detection, sorting and decoding. To achieve reliable operation, BMIs are equipped with hundreds of electrodes at the neural interface. While a single electrode/tetrode communicates with up to four neurons at a given instant of time, a typical interface communicates with an ensemble of hundreds or even thousands of neurons. However, translation of these signals (data) into usable information for real-time BMIs is bottlenecked due to the lack of efficient real-time algorithms and real-time hardware that can handle massively parallel channels of neural data. The research presented here addresses this issue by developing real-time neural processing algorithms that can be implemented in reconfigurable hardware and thus, can be scaled to handle thousands of channels in parallel. The developed reconfigurable system serves as an evaluation platform for investigating the fundamental design tradeoffs in allocating finite hardware resources for a reliable BMI. In this work, the generic architectural layout needed to process neural signals in a massive scale is discussed. A System-on-Chip design with embedded system architecture is presented for FPGA hardware realization that features (a) scalability (b) reconfigurability, and (c) real-time operability. A prototype design incorporating a dual processor system and essential neural signal processing routines such as real-time spike detection and sorting is presented. Two kinds of spike detectors, a simple threshold-based and non-linear energy operator-based, were implemented. To achieve real-time spike sorting, a fuzzy logic-based spike sorter was developed and synthesized in the hardware. Furthermore, a real-time kernel to monitor the high-level interactions of the system was implemented. The entire system was realized in a platform FPGA (Xilinx Virtex-5 LX110T). The system was tested using extracellular neural recordings from three different animals, a owl monkey, a macaque and a rat. Operational performance of the system is demonstrated for a 300 channel neural interface. Scaling the system to 900 channels is trivial.
Temple University--Theses
Lomonaco, Michael John. "CRYPTARRAY A SCALABLE AND RECONFIGURABLE ARCHITECTURE FOR CRYPTOGRAPHIC APPLICATIONS". Master's thesis, University of Central Florida, 2004. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/4394.
Pełny tekst źródłaM.S.
Department of Electrical and Computer Engineering
Engineering and Computer Science
Electrical and Computer Engineering
Avakian, Annie. "Reducing Cache Access Time in Multicore Architectures Using Hardware and Software Techniques". University of Cincinnati / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1335461322.
Pełny tekst źródłaSilva, Antonio Carlos Fernandes da. "ChipCflow: tool for convert C code in a static dataflow architecture in reconfigurable hardware". Universidade de São Paulo, 2015. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-30062015-141638/.
Pełny tekst źródłaExiste uma crescente busca por softwares e arquiteturas alternativas. Essa busca acontece pois houveram avanços na tecnologia do hardware, e estes avanços devem ser complementados por inovações nas metodologias de projetos, testes e verificação para que haja um uso eficaz da tecnologia. Os software e arquiteturas alternativas, geralmente são modelos que exploram o paralelismo das aplicações, ao contrário do modelo de Von Neumann. Dentre as arquiteturas alternativas de alto desempenho, tem-se a arquitetura a fluxo de dados. Nesse tipo de arquitetura, o processo de execução de programas é determinado pela disponibilidade dos dados, logo o paralelismo está embutido na própria natureza do sistema. O modelo a fluxo de dados possui a vantagem de expressar o paralelismo de maneira intrínseca, eliminando a necessidade do programador explicitar em seu código os trechos onde deve haver paralelismo. As arquiteturas a fluxo de dados voltaram a ser uma área de pesquisa devido aos avanços do hardware, em particular, os avanços da Computação Reconfigurável e dos Field Programmable Gate Arrays (FPGAs).Nesta tese é descrita uma ferramenta de conversão de código que visa a geração de aplicações utilizando uma arquitetura a fluxo de dados estática. Também é descrito o projeto ChipCflow, cuja ferramenta de conversão de código, descrita nesta tese, é parte integrante. A especificação do algoritmo a ser convertido é feita em linguagem C e convertida para uma linguagem de descrição de hardware, respeitando o modelo proposto pelo ChipCflow. Os resultados alcançados visam a prova de conceito da conversão de código de uma linguagem de alto nível para uma arquitetura a fluxo de dados a ser configurada em FPGA.
Robinson, Kylan Thomas. "An integrated development environment for the design and simulation of medium-grain reconfigurable hardware". Pullman, Wash. : Washington State University, 2010. http://www.dissertations.wsu.edu/Thesis/Spring2010/k_robinson_041510.pdf.
Pełny tekst źródłaTitle from PDF title page (viewed on June 22, 2010). "School of Electrical Engineering and Computer Science." Includes bibliographical references (p. 75-76).
Das, Satyajit. "Architecture and Programming Model Support for Reconfigurable Accelerators in Multi-Core Embedded Systems". Thesis, Lorient, 2018. http://www.theses.fr/2018LORIS490/document.
Pełny tekst źródłaEmerging trends in embedded systems and applications need high throughput and low power consumption. Due to the increasing demand for low power computing and diminishing returns from technology scaling, industry and academia are turning with renewed interest toward energy efficient hardware accelerators. The main drawback of hardware accelerators is that they are not programmable. Therefore, their utilization can be low is they perform one specific function and increasing the number of the accelerators in a system on chip (SoC) causes scalability issues. Programmable accelerators provide flexibility and solve the scalability issues. Coarse-Grained Reconfigurable Array (CGRA) architecture consisting of several processing elements with word level granularity is a promising choice for programmable accelerator. Inspired by the promising characteristics of programmable accelerators, potentials of CGRAs in near threshold computing platforms are studied and an end-to-end CGRA research framework is developed in this thesis. The major contributions of this framework are: CGRA design, implementation, integration in a computing system, and compilation for CGRA. First, the design and implementation of a CGRA named Integrated Programmable Array (IPA) is presented. Next, the problem of mapping applications with control and data flow onto CGRA is formulated. From this formulation, several efficient algorithms are developed using internal resources of a CGRA, with a vision for low power acceleration. The algorithms are integrated into an automated compilation flow. Finally, the IPA accelerator is augmented in PULP - a Parallel Ultra-Low-Power Processing-Platform to explore heterogeneous computing
Werner, Stefan [Verfasser]. "Hybrid architecture for hardware-accelerated query processing in semantic web databases based on runtime reconfigurable FPGAs / Stefan Werner". Lübeck : Zentrale Hochschulbibliothek Lübeck, 2017. http://d-nb.info/1143986946/34.
Pełny tekst źródłaAstolfi, Vitor Fiorotto. "ChipCflow - em hardware dinamicamente reconfigurável". Universidade de São Paulo, 2009. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-05032010-203142/.
Pełny tekst źródłaIn recent years, reconfigurable computing has become increasingly more advanced, especially in hardware that uses Field-Programmable Gate Arrays. However, the increase of performance in FPGAs accumulated the gap between design capacity and technology for the development of the design. Imperative high-level programming languages such as C are more appropriate for the development of complex algorithms than hardware description languages (HDL). For this reason, many ANSI C-like programming tools for the development of hardware came to existence. The ChipCflow project, of which this project is part, is one of these tools. The execution of algorithms through this tool will be completely directed by data flow, according to the dynamic model found on Dataflow Architectures, taking advantage of its natural high levels of parallelism and the characteristics of the partially reconfigurable hardware. In this project, the objective is a proof of concept for the creation of instances, in the form of operators, of a ChipCflow algorithm on a partially reconfigurable hardware, taking as reference the Xilinx Virtex boards
Qian, Wenchao. "Energy-efficientSpatio-temporalComputing Framework". Case Western Reserve University School of Graduate Studies / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=case1459257723.
Pełny tekst źródłaLloyd, G. Scott. "Accelerated Large-Scale Multiple Sequence Alignment with Reconfigurable Computing". BYU ScholarsArchive, 2011. https://scholarsarchive.byu.edu/etd/2729.
Pełny tekst źródłaEl-Hassan, Fadi. "Hardware Architecture of an XML/XPath Broker/Router for Content-Based Publish/Subscribe Data Dissemination Systems". Thèse, Université d'Ottawa / University of Ottawa, 2014. http://hdl.handle.net/10393/30660.
Pełny tekst źródłaBollengier, Théotime. "Du prototypage à l’exploitation d’overlays FPGA". Thesis, Brest, École nationale supérieure de techniques avancées Bretagne, 2018. http://www.theses.fr/2018ENTA0003/document.
Pełny tekst źródłaDue to their reconfigurable capability and the performance they offer, FPGAs are good candidates for accelerating applications in the cloud. However, FPGAs have some features that hinder their use in the Cloud as well as their adoption by customers : first, FPGA programming is done at low level and requires some expertise that usual Cloud clients do not necessarily have. Secondly, FPGAs do not have native mechanisms allowing them to easily fit in the dynamic execution model of the Cloud.In this work, we propose to use overlay architectures to facilitate FPGA adoption, integration, and operation in the Cloud. Overlays are reconfigurable architectures synthesized on FPGA. As hardware abstraction layers placed between the FPGA and applications, overlays allow to raise the abstraction level of the execution model presented to applications and users, as well as to implement mechanisms making them fit in a Cloud infrastructure.This work presents a vertical approach addressing all aspects of overlay operation in the Cloud as reconfigurable accelerators programmable by tenants : from designing and implementing overlays, integrating them on commercial FPGA platforms, setting up their operating mechanisms, to developping their programming tools. The environment developped in this work is complete, modular and extensible, it is partially based on several existing tools, and demonstrate the feasibility of our approach
Oliveira, Tiago de. "Desenvolvimento de uma arquitetura multiprocessada e reconfigurável para a síntese de redes de Petri em hardware /". Ilha Solteira : [s.n.], 2008. http://hdl.handle.net/11449/100361.
Pełny tekst źródłaBanca: Aledir Silveira Pereira
Banca: Alexandre Cesar Rodrigues da Silva
Banca: Furio Damiani
Banca: Paulo Romero Martins Maciel
Resumo: O objetivo desta tese é o desenvolvimento de uma arquitetura multiprocessada e reconfiguravel que permita a implementação física de sistemas de controle descritos por meio de Redes de Petri coloridas de arcos constantes T-temporizadas e que possuam pro- babilidade de disparo nas transições. A arquitetura pode ser utilizada para implementar sistemas de controle (e n~ao para a avaliacao das propriedades da Rede de Petri), permi- tindo a implementacao física por meio de mapeamento tecnologico diretamente no nível comportamental, sem a necessidade de se utilizar um processo de síntese de alto nível para descrever o sistema em equações booleanas e tabelas de transição de estados. A arquitetura é composta por um arranjo de blocos de configuracao denominados BCERPs, por blocos reconfiguráveis denominados BCGNs e por um sistema de comunicacão, implementado por um conjunto de roteadores. Os blocos BCERPs podem ser configurados para implementar as transições da Rede de Petri e seus respectivos lugares de entrada. Blocos BCGNs são utilizados pelos blocos BCERPs para a geração de numeros pseudo-aleatorios. Estes numeros podem definir a probabilidade de disparo das transições e tambem podem ser usados no processo de resolução de conflito, que ocorre quando uma transição possuir um ou mais lugares de entrada compartilhados com outras transições. O sistema de comunicacão possui uma topologia de grelha, tendo como principal função o roteamento e armazenamento de pacotes entre os blocos de configuração. Os roteadores e blocos de configuração BCERPs e BCGNs foram descritos em VHDL e implementados em FPGAs.
Abstract: The goal of this thesis is to develop a reconfigurable multiprocessed architecture that allows the physical implementation of systems described by T-timed colored Petri nets with constant arcs having transitions with firing probabilities. The architecture can be used to implement control systems (not to evaluation Petri net properties). With this architecture, physical implementation of systems can be achieved through technology mapping directly from behavioral level, without the need to go through an expensive high level synthesis process to describe the system into boolean equations and state transition tables. The architecture comprises an array of configuration blocks named BCERPs; reconfigurable blocks named BCGNs; and a communication system implemented using a set of routers. BCERP blocks can be configured to implement Petri net transitions as well as the corresponding input places. BCGN blocks are used by BCERPs for pseudo random number generation. These numbers can define transitions firing probabilities. They can also be used for conflit resolution, which happens when two or more transitions share one or more input places. The communication system presents a grid topology. Its main functions are packet storage and routing among configuration blocks. The routers, BCGNs and BCERPs configuration blocks were described in VHDL and implemented in FPGAs.
Doutor
Oliveira, Tiago de [UNESP]. "Desenvolvimento de uma arquitetura multiprocessada e reconfigurável para a síntese de redes de Petri em hardware". Universidade Estadual Paulista (UNESP), 2008. http://hdl.handle.net/11449/100361.
Pełny tekst źródłaConselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
O objetivo desta tese é o desenvolvimento de uma arquitetura multiprocessada e reconfiguravel que permita a implementação física de sistemas de controle descritos por meio de Redes de Petri coloridas de arcos constantes T-temporizadas e que possuam pro- babilidade de disparo nas transições. A arquitetura pode ser utilizada para implementar sistemas de controle (e n~ao para a avaliacao das propriedades da Rede de Petri), permi- tindo a implementacao física por meio de mapeamento tecnologico diretamente no nível comportamental, sem a necessidade de se utilizar um processo de síntese de alto nível para descrever o sistema em equações booleanas e tabelas de transição de estados. A arquitetura é composta por um arranjo de blocos de configuracao denominados BCERPs, por blocos reconfiguráveis denominados BCGNs e por um sistema de comunicacão, implementado por um conjunto de roteadores. Os blocos BCERPs podem ser configurados para implementar as transições da Rede de Petri e seus respectivos lugares de entrada. Blocos BCGNs são utilizados pelos blocos BCERPs para a geração de numeros pseudo-aleatorios. Estes numeros podem definir a probabilidade de disparo das transições e tambem podem ser usados no processo de resolução de conflito, que ocorre quando uma transição possuir um ou mais lugares de entrada compartilhados com outras transições. O sistema de comunicacão possui uma topologia de grelha, tendo como principal função o roteamento e armazenamento de pacotes entre os blocos de configuração. Os roteadores e blocos de configuração BCERPs e BCGNs foram descritos em VHDL e implementados em FPGAs.
The goal of this thesis is to develop a reconfigurable multiprocessed architecture that allows the physical implementation of systems described by T-timed colored Petri nets with constant arcs having transitions with firing probabilities. The architecture can be used to implement control systems (not to evaluation Petri net properties). With this architecture, physical implementation of systems can be achieved through technology mapping directly from behavioral level, without the need to go through an expensive high level synthesis process to describe the system into boolean equations and state transition tables. The architecture comprises an array of configuration blocks named BCERPs; reconfigurable blocks named BCGNs; and a communication system implemented using a set of routers. BCERP blocks can be configured to implement Petri net transitions as well as the corresponding input places. BCGN blocks are used by BCERPs for pseudo random number generation. These numbers can define transitions firing probabilities. They can also be used for conflit resolution, which happens when two or more transitions share one or more input places. The communication system presents a grid topology. Its main functions are packet storage and routing among configuration blocks. The routers, BCGNs and BCERPs configuration blocks were described in VHDL and implemented in FPGAs.
Brunie, Nicolas. "Contribution à l'arithmétique des ordinateurs et applications aux systèmes embarqués". Thesis, Lyon, École normale supérieure, 2014. http://www.theses.fr/2014ENSL0894/document.
Pełny tekst źródłaIn the last decades embedded systems have been challenged with more and more application variety, each time more constrained. This implies an ever growing need for performances and energy efficiency in arithmetic units. This work studies solutions ranging from hardware to software to improve arithmetic support in embedded systems. Some of these solutions were integrated in Kalray's MPPA processor. The first part of this work focuses on floating-Point arithmetic support in the MPPA. It starts with the design of a floating-Point unit (FPU) based on the classical FMA (Fused Multiply-Add) operator. The improvements we suggest, implement and evaluate include a mixed precision FMA, a 3-Operand add and a 2D scalar product, each time with a single rounding and support for subnormal numbers. It then considers the implementation of division and square root. The FPU is reused and modified to optimize the software implementations of those primitives at a lower cost. Finally, this first part opens up on the development of a code generator designed for the implementation of highly optimized mathematical libraries in different contexts (architecture, accuracy, latency, throughput). The second part studies a reconfigurable coprocessor, a hardware operator that could be dynamically modified to adapt on the fly to various applicative needs. It intends to provide performance close to ASIC implementation, with some of the flexibility of software. One of the addressed challenges is the integration of such a reconfigurable coprocessor into the low power embedded cluster of the MPPA. Another is the development of a software framework targeting the coprocessor and allowing design space exploration. The last part of this work leaves micro-Architecture considerations to study the efficient use of parallel arithmetic resources. It presents an improvement of regular architectures (Single Instruction Multiple Data), like those found in graphic processing units (GPU), for the execution of divergent control flow graphs
Foucher, Clément. "Méthodologie de conception pour la virtualisation et le déploiement d'applications parallèles sur plateforme reconfigurable matériellement". Phd thesis, Université Nice Sophia Antipolis, 2012. http://tel.archives-ouvertes.fr/tel-00777511.
Pełny tekst źródłaAlmeida, Manoel Aranda de. "Sistema embarcado reconfigurável de forma estática por programação genética utilizando hardware evolucionário híbrido". Universidade Federal de São Carlos, 2016. https://repositorio.ufscar.br/handle/ufscar/8000.
Pełny tekst źródłaApproved for entry into archive by Marina Freitas (marinapf@ufscar.br) on 2016-10-20T18:27:58Z (GMT) No. of bitstreams: 1 DissMAA.pdf: 3325891 bytes, checksum: 1b4744d48d74943990bed42753cc4b4c (MD5)
Approved for entry into archive by Marina Freitas (marinapf@ufscar.br) on 2016-10-20T18:28:04Z (GMT) No. of bitstreams: 1 DissMAA.pdf: 3325891 bytes, checksum: 1b4744d48d74943990bed42753cc4b4c (MD5)
Made available in DSpace on 2016-10-20T18:28:13Z (GMT). No. of bitstreams: 1 DissMAA.pdf: 3325891 bytes, checksum: 1b4744d48d74943990bed42753cc4b4c (MD5) Previous issue date: 2016-03-04
Não recebi financiamento
The use of technology based on Field Programmable Gate Arrays (FPGAs), a reconfigurable technology, has become a frequent object of study. This technique is feasible and a promising application in the development of embedded systems, however, the difficulty in finding a flexible and efficient way to perform such an application is their bigger problem. In this work, a virtual and reconfigurable architecture (AVR) in FPGA for hardware applications is presented using a Genetic Programming Software on the development of an optimal reconfiguration for this AVR, in order to build a hardware capable of performing a given task in an embedded system. This proposal is a simple, flexible and efficient way to achieve appropriate applications in embedded systems, when compared to other reconfigurable hardware techniques. The representation of phenotype of the proposed evolutionary system is based on a bi-dimensional network function elements (EF). The GPLAB tool for MATLAB is used in Genetic Programming, and the solution found by this procedure is converted into a memory mapping to represent the best solution, where it is used to reconfigure the hardware. In the tests, GPLAB found results for logic circuits in a few generations, and for image filters containing efficient solutions, where there was little hardware occupation, especially memory, in the cases this has been presented, with a reduced chromosome size, shows a proposal efficiency.
O uso da tecnologia baseada em Field Programmable Gate Arrays (FPGAs), de forma reconfigurável, para a solução de diversos problemas atuais, tem se tornado um frequente objeto de estudo. Essa técnica é de aplicação viável e promissora na elaboração de sistemas embarcados, porém, a dificuldade em encontrar uma forma flexível e eficiente de realizar tal aplicação é o seu maior problema. Neste trabalho, é apresentada uma arquitetura virtual e reconfigurável (AVR) em FPGA para aplicações em hardware, utilizando um software de Programação Genética na elaboração de uma reconfiguração ótima para esta AVR, de forma a construir um hardware capaz de efetuar uma determinada tarefa em um sistema embarcado. Esta proposta é uma forma simples, flexível e eficiente de realizar aplicações adequadas em sistemas embarcados, quando comparada a outras técnicas de hardware reconfigurável. A representação do fenótipo no sistema evolutivo proposto se baseia em uma rede de elementos de função (EF) bidimensional. A ferramenta GPLAB, para MATLAB, é usada na Programação Genética, e a solução encontrada por esta é convertida em um mapeamento de memória com o cromossomo da melhor solução, onde este é usado para reconfigurar o hardware. Nos testes realizados, a GPLAB encontrou resultados para circuitos lógicos em poucas gerações, e para filtros de imagem encontrou soluções eficientes, onde ocorreu pouca ocupação de hardware, principalmente da memória nos casos apresentados, apresentando um cromossomo de tamanho reduzido, o que demonstra uma boa eficiência da proposta.
Zhu, Yiqun. "An investigation into hardware architectures of reconfigurable convolutional decoders". Thesis, University of Sheffield, 2003. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.403256.
Pełny tekst źródłaMoss, Duncan J. M. "FPGA Architectures for Low Precision Machine Learning". Thesis, The University of Sydney, 2017. http://hdl.handle.net/2123/18182.
Pełny tekst źródłaFröhlich, Dominik. "Object-Oriented Development for Reconfigurable Architectures". Doctoral thesis, Technische Universitaet Bergakademie Freiberg Universitaetsbibliothek "Georgius Agricola", 2009. http://nbn-resolving.de/urn:nbn:de:bsz:105-802464.
Pełny tekst źródłaHussain, Hanaa Mohammad. "Dynamically and partially reconfigurable hardware architectures for high performance microarray bioinformatics data analysis". Thesis, University of Edinburgh, 2012. http://hdl.handle.net/1842/7645.
Pełny tekst źródłaFarag, Mohammed Morsy Naeem. "Architectural Enhancements to Increase Trust in Cyber-Physical Systems Containing Untrusted Software and Hardware". Diss., Virginia Tech, 2012. http://hdl.handle.net/10919/29084.
Pełny tekst źródłaPh. D.
Parris, Matthew. "OPTIMIZING DYNAMIC LOGIC REALIZATIONS FOR PARTIAL RECONFIGURATION OF FIELD PROGRAMMABLE GATE ARRAYS". Master's thesis, University of Central Florida, 2008. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/4128.
Pełny tekst źródłaM.S.Cp.E.
School of Electrical Engineering and Computer Science
Engineering and Computer Science
Computer Engineering MSCpE
Ló, Thiago Berticelli. "Virtualização de hardware e exploração da memória de contexto em arquiteturas reconfiguráveis". reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2012. http://hdl.handle.net/10183/66195.
Pełny tekst źródłaReconfigurable architectures have shown to be a potential solution to the problem of increasing complexity found in embedded systems. However, in order to achieve significant performance gains, large quantities of redundant functional units are generally necessary, with a corresponding increase in the area occupied by these units. This thesis explores the design space with the objective of reducing both area and energy consumption, and presents two hardware virtualization techniques, similar to reconfigurable pipeline stages, which achieve a reduction in area of more than 94%. The use of context memory in reconfigurable architectures has a significant impact in terms of area and energy, as is clearly demonstrated by initial experimental results. Two novel context memory architectures are presented: the first approach is being based on an exploration of the balance point between memory port width and number of accesses, in order to reduce the energy consumed during fetching of the configuration bytes; the second approach presents a configuration management mechanism using hardware linked lists, and that allows segmented access to configuration settings. Both approaches demonstrate energy reduction of up to 98% and can be adopted in both partial and atomic reconfiguration architectures.
Fazzoletto, Emilio. "Characterization of Partial and Run-Time Reconfigurable FPGAs". Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-202724.
Pełny tekst źródłaFPGA-baserade system har tidigare främst använts för snabb och kostnadseffektiv konstruktion av prototyper vid framtagandet av applikationsspecika integrerade kretsar (ASIC). På senare år har användandet av FPGA:er i inbyggda system för implementation av hårdvaruacceleratorers såväl som huvudsaklig beräkningsenhet ökat. Denna ökning har möjliggjorts mycket tack vare den utveckling som har skett av rekonfigurerbara integrerade kretsar: från de mer traditionella Complex Programmable Logic Devices (CPLD) till helt CMOS-baserade FPGA:er. Nu inleds en ny era för FPGA-baserade system tack vare möjligheten att under körning rekonfigurera delar av FPGA:n genom så kallad partial run-time reconguration(RTR) - en teknik som redan idag finns tillgänglig i produkter på marknaden. Tidigare forskning visar att användandet av en RTR-baserad hårdvaruarkitektur kan ha en positiv effekt med avseende på prestanda såväl som strömförbrukning. Att använda RTR-baserad hårdvara innebär dock flera utmaningar: En ej försumbar rekonfigurationstid måste tas i beaktning, så även den icke-deterministiska exekveringstiden som en rekonfiguration kan innebära. Vidare måste anpassningar av mjukvaran göras för att fungera med en hårdvaruplattform som förändras över tid. Denna uppsats syftar till att undersöka prestandan hos ett modernt RTRbaserat SoC (Xilinx Zynq 7020) med fokus på rekonfigurationstider och dess förutsägbarhet, prestanda ökning, begränsningar samt nödvändiga kompromisser som denna arkitektur innebär. Huruvida en applikation kan dra nytta av en RTR-baserad arkitektur eller inte kan vara svårt att avgöra. Den insamlade datan som presenteras i denna rapport kan dock fungera som stöd för hårdvarukonstruktörer som önskar använda en RTR-baserad plattform.
Afonso, George. "Vers une nouvelle génération de systèmes de test et de simulation avionique dynamiquement reconfigurables". Phd thesis, Université des Sciences et Technologie de Lille - Lille I, 2013. http://tel.archives-ouvertes.fr/tel-00921874.
Pełny tekst źródłaLalevée, André. "Towards highly flexible hardware architectures for high-speed data processing : a 100 Gbps network case study". Thesis, Ecole nationale supérieure Mines-Télécom Atlantique Bretagne Pays de la Loire, 2017. http://www.theses.fr/2017IMTA0054/document.
Pełny tekst źródłaThe increase in both size and diversity of applications regarding modern networks is making traditional computing architectures limited. Indeed, purely software architectures can not sustain typical throughputs, while purely hardware ones severely lack the flexibility needed to adapt to the diversity of applications. Thus, the investigation of programmable hardware, such as Field Programmable Gate Arrays (FPGAs), has been done. These architectures are indeed usually considered as a good tradeoff between performance and flexibility, mainly thanks to the Dynamic Partial Reconfiguration (DPR), which allows to reconfigure a part of the design during run-time.However, this technique can have several drawbacks, especially regarding the storing of the configuration files, called bitstreams. To solve this issue, bitstream relocation can be deployed, which allows to decrease the number of configuration files required. However, this technique is long, error-prone, and requires specific knowledge inFPGAs. A fully automated design flow has been developped to ease the use of this technique. In order to provide flexibility regarding the sequence of treatments to be done on our architecture, a flexible and high-throughput communication structure is required. Thus, a Network-on-Chips study and characterization has been done accordingly to network processing and bitstream relocation properties. Finally, a case study has been developed in order to validate our approach
Marques, Nicolas. "Méthodologie et architecture adaptative pour le placement efficace de tâches matérielles de tailles variables sur des partitions reconfigurables". Thesis, Université de Lorraine, 2012. http://www.theses.fr/2012LORR0139/document.
Pełny tekst źródłaFPGA-based reconfigurable architectures can deliver appropriate solutions for several applications as they allow for changing the performance of a part of the FPGA while the rest of the circuit continues to run normally. These architectures, despite their improvements, still suffer from their lack of adaptability when confronted with applications consisting of variable size material tasks. This heterogeneity may cause wrong placements leading to a sub-optimal use of resources and therefore a decrease in the system performances. The contribution of this thesis focuses on the problematic of variable size material task placement and reconfigurable region effective generation. A methodology and an intermediate layer between the FPGA and the application are proposed to allow for the effective placement of variable size material tasks on reconfigurable partitions of a predefined size. To approve the method, we suggest an architecture based on the use of partial reconfiguration in order to adapt the transcoding of one video compression format to another in a flexible and effective way. A study on the reconfigurable region partitioning for the entropy encoder material tasks (CAVLC / VLC) is proposed in order to show the contribution of partitioning. Then an assessment of the gain obtained and of the method additional costs is submitted
Gonsales, Alex Dias. "Projeto de uma Nova Arquitetura de FPGA para aplicações BIST e DSP". reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2002. http://hdl.handle.net/10183/12010.
Pełny tekst źródłaDigital electronic systems have been increasingly used in a large spectrum of applications, such as communication, voice processing, instrumentation, biomedicine, and multimedia. Most of these applications require some kind of signal processing. Most of this task is usually performed by a digital block. Moreover, these complex systems are composed of different kinds of circuits, such as RAM (Random Access Memory) and ROM (Read Only Memory) memories, complex datapaths and control parts. This way, the test of such systems is ever more important. Likewise, the increasingly complexity of the circuits to be tested requires more complex testers (external test), making the latter more expensive. An approach to address this problem is to embbed the test functions onto the chip to be tested itself. Nevertheless, this approach may bring a prohibitive cost in terms of area on silicon. However, if the test and the signal processing functions are not required to run in parallel, then it is possible to use the same reconfigurable area to implement these functions one after another. Thus, this work proposes an optimized reconfigurable architecture to implement this kind of circuits (digital signal processing and test). This approach intends to decrease the occupied area in comparison to a dedicated and also to a comercial reconfigurable device implementation. To validate these ideas, the proposed architecture is described using a hardware description language and some test and digital signal processing applications are mapped and simulated on this architecture. In this work an estimative of the occupied area by the three approaches (dedicated, comercial reconfigurable device, and the new proposed architecture) as well as a comparison analysis between them are performed. Likewise, a delay estimate is performed and the maximum operation frequency is evaluated.
Imran, Naveed. "Autonomous Recovery of Reconfigurable Logic Devices using Priority Escalation of Slack". Doctoral diss., University of Central Florida, 2013. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/5949.
Pełny tekst źródłaPh.D.
Doctorate
Electrical Engineering and Computer Science
Engineering and Computer Science
Electrical Engineering
Pasca, Bogdan Mihai. "Calcul flottant haute performance sur circuits reconfigurables". Phd thesis, Ecole normale supérieure de lyon - ENS LYON, 2011. http://tel.archives-ouvertes.fr/tel-00654121.
Pełny tekst źródłaLiu, Yue-qu, i 劉岳衢. "Reconfigurable Design and Implementation of Modular-Construction Based FFT Hardware Architecture". Thesis, 2018. http://ndltd.ncl.edu.tw/handle/m9kw97.
Pełny tekst źródła國立中山大學
電機工程學系研究所
106
In the 3GPP-LTE communication standard, it defines many kinds of Fast Fourier Transform(FFT) sizes. So, we design a high performance FFT architecture which makes good use of modular design construction and reconfigurable design to achieve easily connection between every 2 stages. This design can be suitable for any requirement. In the 4-stage module, it can support 48 modes which perform 2-2187 FFT points. It also supports 32 modes defined in 3GPP-LTE communication standard. Each module contains two parts. (1) Reconfigurable Computing Kernel(RC-CK):We employ radix-32 and radix-23 bases and suitably utilize the hardware reuse property. Without extra of hardware resource (ex:multipliers or adders), it can execute six types of different radix of FFT kernel operations. (2) Reconfigurable First-in First-Out(RC-FIFO):We develop a high efficient design method for supporting many FFT points. The FIFO plan is easily managed and suitably located to maximize the hardware storage usage. In addition, we propose Section-based Twiddle Factor Generator(STFG) to support multi-FFT points. It can reduce the area cost and satisfy any communication systems effectively. In the chip implementation, the core area is only 0.318 mm2 by using TSMC 40-nm CMOS technology. The maximal operating frequency is 350 MHz and power dissipation in average is 44.2 mW. As compared with other state-of-the-arts, our proposed work has the best performance and support many FFT points. Most important of all, the proposed hardware architecture has the better scalability. In the future, we can support the undefined specifications of the 5th generation wireless system only by increasing/decreasing the module.
Wang, Ching-Shun, i 王靖順. "Reconfigurable Hardware Architecture Design and Implementation for AI Deep Learning Accelerator". Thesis, 2019. http://ndltd.ncl.edu.tw/cgi-bin/gs32/gsweb.cgi/login?o=dnclcdr&s=id=%22107NCHU5441107%22.&searchmode=basic.
Pełny tekst źródła國立中興大學
電機工程學系所
107
This paper proposes the Convolution Neural Network hardware accelerator architecture with 288PE to achieve 230.4GOPS@400Mhz. To verify the hardware function, the hardware is implemented at 100MHz in units of 72PE owing to the limitation of FPGA resources. The proposed CNN hardware accelerator is Layer-based architecture which can be reconfigured the layer parameters to suitable for different CNN architectures. The proposed architecture is based on operating three Rows Input feature map and then generate a Row Output feature map. The proposed architecture uses 322KB On-Chip Memory to store Input feature map, Bias, Kernel, and Output feature map to improve the efficiency of Data reuse and reduce bandwidth utilization. In this paper, the Max-pooling layer after the Convolution layer can be combined to reduce the bandwidth of DRAM.
Chen, Wen-Chieh, i 陳文杰. "A new hardware-efficient algorithm and reconfigurable architecture for image contrast enhancement". Thesis, 2012. http://ndltd.ncl.edu.tw/handle/ws6ckf.
Pełny tekst źródła國立臺北科技大學
電腦與通訊研究所
100
Contrast enhancement is crucial when generating high quality images for image processing applications such as digital image or video photography, LCD processing, and medical image analysis. In order to achieve real-time performance for high-definition video applications, it is necessary to design efficient contrast enhancement hardware architecture to meet the needs of real-time processing. In this paper, we propose a novel hardware-oriented contrast enhancement algorithm which can be implemented effectively for hardware design. In order to be considered for hardware implementation, approximation techniques are proposed to reduce these complex computations during performance of the contrast enhancement algorithm. The proposed hardware-oriented contrast enhancement algorithm achieves good image quality by measuring the results of qualitative and quantitative analyses. To decrease hardware cost and improve hardware utilization for real-time performance, a reduction in circuit area is proposed through use of parameter-controlled reconfigurable architecture. The experiment results show that the proposed hardware-oriented contrast enhancement algorithm can provide an average frame rate of 48.23 fps at high definition resolution 1920×1080. This means that the proposed hardware architecture can run in real-time.
CHANG, YU-WEI, i 張祐維. "Hardware/Software Co-Design of Reconfigurable Architecture for 2D-to-3D Conversion". Thesis, 2019. http://ndltd.ncl.edu.tw/handle/9rsz4y.
Pełny tekst źródła國立臺南大學
資訊工程學系碩士班
107
3D images are amazing, but the cost of using multiple photographic lenses for photography is very expensive. Therefore, there is a 2D-to-3D technology, and the software is superior in image processing, but if the software resources are insufficient due to lots of conversion schedules.At that time, it is necessary to assist with hardware. In the early stage of computer science design, software and hardware are separated, and they are integrated in the final stage.We can see lots of implementations of hardware/software Co-Design is already applied to business, and there are also many good methods for image edge detection and depth calculation. In order to accelerate the efficiency of 2D-to-3D, we proposed a hardware/software co-design of reconfigurable architecture for 2D-to-3D conversion that make more efficient for 2D-to-3D . Depth Image Based Rendering (DIBR) is used in 2D to 3D system to generate 3D image synthesis using original 2D images and depth information to generate 3D or multiple perspective virtual images. In this paper, We uses Xilinx Spartan 7 to do as a hardware device for a platform of hardware/software co-design.
Biswas, Prasenjit. "Hardware Consolidation Of Systolic Algorithms On A Coarse Grained Runtime Reconfigurable Architecture". Thesis, 2011. https://etd.iisc.ac.in/handle/2005/2108.
Pełny tekst źródłaBiswas, Prasenjit. "Hardware Consolidation Of Systolic Algorithms On A Coarse Grained Runtime Reconfigurable Architecture". Thesis, 2011. http://etd.iisc.ernet.in/handle/2005/2108.
Pełny tekst źródłaLiu, Xiaobin. "ENERGY EFFICIENCY EXPLORATION OF COARSE-GRAIN RECONFIGURABLE ARCHITECTURE WITH EMERGING NONVOLATILE MEMORY". 2015. https://scholarworks.umass.edu/masters_theses_2/159.
Pełny tekst źródłaAlle, Mythri. "Compiling For Coarse-Grained Reconfigurable Architectures Based On Dataflow Execution Paradigm". Thesis, 2012. https://etd.iisc.ac.in/handle/2005/2453.
Pełny tekst źródłaAlle, Mythri. "Compiling For Coarse-Grained Reconfigurable Architectures Based On Dataflow Execution Paradigm". Thesis, 2012. http://etd.iisc.ernet.in/handle/2005/2453.
Pełny tekst źródłaChang, Shao-Hsuan, i 張紹宣. "Design and Implementation of an ALU Cluster Intellectual Property as a Reconfigurable Hardware Accelerator for Media Streaming Architecture". Thesis, 2006. http://ndltd.ncl.edu.tw/handle/66721172009467596171.
Pełny tekst źródła國立交通大學
電信工程系所
94
There are more and more portable systems such as mobiles, MP3 player, PDA, and other entertainment systems in today’s life. The functionality and complexity of them thus increase much higher than old-time ones. Therefore, having a great deal ability of multimedia operation is important for portable systems. However, it is tough to have enough amounts of multimedia operations from conventional hardware architecture. This results from the poor match between conventional architecture and features of media applications. It hence leads to inefficient memory access that induces performance degression. The worst case is unable to meet the real time requirement. According, this thesis designs an operational unit, ALU cluster, that is referenced from Stanford’s stream processor architecture and thus matches to media applications to provide necessary processing requirements for media applications. Besides, considering the issues of convenient usage in the future and rapid integration of real multimedia applications, we wrap ALU cluster as an AMBA-compatible IP by adding designed interface. Then, it is possible to exploit other existing IP and peripherals in the AMBA platform and truly treats our design as hardware accelerator for real multimedia applications. This thesis is finished with a synthesizable soft IP. The designed interface is verified by ARM-series baseboard. This ensures that the interface conforms to AMBA specification.
"Scalable Register File Architecture for CGRA Accelerators". Master's thesis, 2016. http://hdl.handle.net/2286/R.I.40738.
Pełny tekst źródłaDissertation/Thesis
Masters Thesis Computer Science 2016
Mohammadi, Mahnaz. "An Accelerator for Machine Learning Based Classifiers". Thesis, 2017. http://etd.iisc.ac.in/handle/2005/4245.
Pełny tekst źródłaFröhlich, Dominik. "Object-Oriented Development for Reconfigurable Architectures". Doctoral thesis, 2001. https://tubaf.qucosa.de/id/qucosa%3A22579.
Pełny tekst źródła