Contents

  1. Theses

Academic literature on the topic "Reconfigurable Hardware Architecture"

Create a correct reference in APA, MLA, Chicago, Harvard, and several other citation styles

Browse thematic lists of journal articles, books, theses, conference reports, and other academic sources on the topic "Reconfigurable Hardware Architecture".

Next to each source in the list of references there is an "Add to bibliography" button. Click it, and we will automatically generate a bibliographic reference for the selected source in your preferred citation style: APA, MLA, Harvard, Vancouver, Chicago, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever this information is included in the source's metadata.

Theses on the topic "Reconfigurable Hardware Architecture"

1

Gelb, Benjamin S. "A timeshared, runtime reconfigurable hardware co-processing architecture." Thesis, Massachusetts Institute of Technology, 2009. http://hdl.handle.net/1721.1/53147.

Abstract:
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Includes bibliographical references (leaves 73-74). The constant desire for increased performance in microprocessor systems has led to the need for specialized hardware cores to accelerate specific computational tasks. In this thesis, we explore the potential of using FPGA partial reconfiguration to create a platform for customized hardware cores that may be loaded on demand, at runtime, and replaced when not in use. We implement two new software tools, bitparse and bitrender, to demonstrate the bitstream relocation technique. Further, we present a functional microprocessor system coupled with a runtime-reprogrammable peripheral synthesized on a Xilinx Virtex-5 FPGA and discuss its performance implications.
2

Peterkin, Raymond. "A reconfigurable hardware architecture for VPN MPLS based services." Thesis, University of Ottawa (Canada), 2006. http://hdl.handle.net/10393/27283.

Abstract:
Internet applications are becoming increasingly resource intensive and perform poorly in the presence of significant congestion. Increased bandwidth cannot provide long-term congestion relief, so Internet traffic must be prioritized and efficiently routed. Multiprotocol Label Switching (MPLS) [12] provides the means to process traffic quickly and reserve resources for applications with specific requirements. However, MPLS must provide the same resilience mechanisms as ATM [18] over SONET [46] to become an acceptable alternative for assigning and switching label switched paths (LSPs). This thesis proposes a reconfigurable architecture and a prototype of a hardware processor for MPLS to improve its overall performance. Establishing LSPs and label management are the central tasks of the processor, which is used to describe LSPs and perform packet switching. A significant subset of RSVP-TE is implemented in the processor to provide the necessary mechanisms of a signaling protocol. Functionality is also available for Traffic Engineering (TE), allowing a user to configure the allocation of resources available for MPLS. The processor is designed to interact with software so it can become part of an embedded system. Results and analysis for the processor are provided, describing its resource usage and performance. Resource-intensive tasks are identified through the analysis, and determinations are made about the worst-case performance improvement over software implementations, which depends on the number of LSPs considered and the network size.
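
For readers unfamiliar with MPLS forwarding, the sketch below illustrates the basic label-swapping step that such a processor accelerates: an incoming (port, label) pair is looked up in a label-forwarding table and rewritten before the packet is sent on. This is a generic software illustration, not the thesis's RSVP-TE hardware processor; the table contents and function names are hypothetical.

```python
# Minimal illustration of the label-swapping step of MPLS forwarding.
# This is a generic software sketch, not the hardware processor described
# in the thesis; table contents and field names are hypothetical.

# Label-forwarding table: (in_port, in_label) -> (out_port, out_label)
lfib = {
    (0, 100): (2, 201),   # LSP A: swap label 100 -> 201, forward on port 2
    (1, 117): (3, 305),   # LSP B: swap label 117 -> 305, forward on port 3
}

def switch_packet(in_port, in_label, payload):
    """Swap the MPLS label and pick the outgoing port for one packet."""
    try:
        out_port, out_label = lfib[(in_port, in_label)]
    except KeyError:
        return None  # no LSP established for this label: drop
    return (out_port, out_label, payload)

print(switch_packet(0, 100, b"data"))  # (2, 201, b'data')
```
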
3

Diniz, Claudio Machado. "Dedicated and reconfigurable hardware accelerators for high efficiency video coding standard." Biblioteca Digital de Teses e Dissertações da UFRGS, 2015. http://hdl.handle.net/10183/118394.

Abstract:
The demand for ultra-high resolution video (beyond 1920x1080 pixels) led to the need to develop new and more efficient video coding standards that provide high compression efficiency. The High Efficiency Video Coding (HEVC) standard, published in 2013, reaches double the compression efficiency (or a 50% reduction in the size of the coded video) compared to the most efficient and most widely used video coding standard at that time, H.264/AVC (Advanced Video Coding). HEVC reaches this result at the cost of a high computational effort for the tools included in the encoder and decoder. The increased computational effort of the HEVC standard and the power limitations of current silicon fabrication technologies make it essential to develop hardware accelerators for the compute-intensive kernels of the HEVC application. Hardware accelerators provide higher performance and energy efficiency than general-purpose processors for specific applications. An analysis of the HEVC application conducted in this work identified its most compute-intensive kernels, namely the Fractional-Pixel Interpolation Filter, the Deblocking Filter, and the Sum of Absolute Differences calculation. A run-time analysis of the Interpolation Filter indicates a great potential for power/energy savings by adapting the hardware accelerator to the varying workload. This thesis introduces new contributions in the field of dedicated and reconfigurable hardware accelerators for the HEVC standard. Dedicated hardware accelerators for the Fractional-Pixel Interpolation Filter, the Deblocking Filter, and the Sum of Absolute Differences calculation are proposed, designed, and evaluated. The interpolation filter hardware architecture achieves throughput similar to the state of the art while reducing hardware area by 50%. The deblocking filter hardware architecture also achieves throughput similar to the state of the art, with a 5X to 6X reduction in gate count and a 3X reduction in power dissipation. The thesis also presents a new comparative analysis of Sum of Absolute Differences processing elements, introducing several architecture design alternatives with different area, performance, and power results. A novel reconfigurable interpolation filter hardware architecture for the HEVC standard provides a 57% design-time area reduction and run-time power/energy adaptation on a picture-by-picture basis, which state-of-the-art interpolation filter architectures do not support. Additionally, a run-time accelerator binding scheme is proposed for tile-based mixed-grained reconfigurable architectures, which reduces communication overhead by up to 44% (23% on average) compared to a first-fit strategy with datapath reuse, for different numbers of tiles and internal tile organizations. This binding scheme is aware of the underlying architecture and binds datapaths efficiently, avoiding and minimizing inter-tile communication. The dedicated and reconfigurable hardware accelerators and techniques proposed in this thesis enable implementations of next-generation video coding standards beyond HEVC with improved area, performance, and power efficiency.
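
As context for the Sum of Absolute Differences kernel named above, the following minimal Python sketch shows the textbook SAD metric that motion-estimation hardware evaluates over candidate blocks. It reflects only the standard definition, not the processing-element designs compared in the thesis.

```python
# Textbook Sum of Absolute Differences (SAD) between a candidate block and a
# reference block, the metric the abstract's processing elements compute.
# Plain-Python sketch of the standard definition, not the thesis's hardware.

def sad(block_a, block_b):
    """SAD over two equally sized 2-D blocks of pixel samples."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

cur = [[10, 12], [11, 13]]   # 2x2 block from the current frame
ref = [[ 9, 12], [14, 10]]   # co-located candidate in the reference frame
print(sad(cur, ref))         # 1 + 0 + 3 + 3 = 7
```
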
4

Kung, Ling-Pei 1961. "Obtaining performance and programmability using reconfigurable hardware for media processing." Thesis, Massachusetts Institute of Technology, 2002. http://hdl.handle.net/1721.1/61855.

Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2002. Includes bibliographical references (p. 127-132). An imperative requirement in the design of a reconfigurable computing system, or in the development of a new application on such a system, is performance gains. However, such developments suffer from a long and difficult programming process, hard-to-predict performance gains, and a limited scope of applications. To address these problems, we need to understand reconfigurable hardware's capabilities and limitations, its performance advantages and disadvantages, re-think reconfigurable system architectures, and develop new tools to explore its utility. We begin by examining performance contributors at the system level and identify those from general-purpose and those from dedicated components. We propose an architecture that integrates reconfigurable hardware within the general-purpose framework, avoiding and minimizing dedicated hardware and organization for the sake of programmability. We analyze reconfigurable logic architectures and their performance limitations. This analysis leads to a theory that reconfigurable logic can never be clocked faster than a fixed-logic design based on the same fabrication technology. Though clock speed is highly unpredictable, we can obtain a quick upper-bound estimate based on a few parameters. We also analyze microprocessor architectures and establish an analytical performance model. We use this model to estimate performance bounds using very little information on task properties. These bounds help us detect potential memory-bound tasks. For a compute-bound task, we compare its performance upper bound with the upper bound on reconfigurable clock speed to further rule out unlikely speedup candidates. These performance estimates require very few parameters and can be obtained quickly without writing software or hardware code. They can be integrated with design tools as front-end tools to explore speedup opportunities without costly trials. We believe this will broaden the applicability of reconfigurable computing.
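
The thesis's own analytical model is not reproduced in the abstract, but the flavor of such quick estimates can be conveyed with a generic roofline-style bound: a task can run no faster than its compute time or its memory-transfer time, whichever is larger. The sketch below is only this textbook argument under made-up numbers, not the model developed in the thesis.

```python
# Hedged illustration of the kind of quick bound the abstract describes:
# classify a task as compute-bound or memory-bound from a few parameters.
# This is the generic roofline-style argument, not the thesis's own model;
# all numbers below are made-up examples.

def runtime_lower_bound(ops, bytes_moved, peak_ops_per_s, peak_bytes_per_s):
    """A task can finish no faster than its compute time or its memory time."""
    compute_time = ops / peak_ops_per_s
    memory_time = bytes_moved / peak_bytes_per_s
    bound = max(compute_time, memory_time)
    kind = "memory-bound" if memory_time > compute_time else "compute-bound"
    return bound, kind

# Example: 1e9 operations touching 8e8 bytes on a machine with
# 1e11 ops/s peak compute and 1e10 bytes/s peak memory bandwidth.
print(runtime_lower_bound(1e9, 8e8, 1e11, 1e10))  # (0.08, 'memory-bound')
```
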
5

Balasubramanian, Karthikeyan. "Reconfigurable System-on-Chip Architecture for Neural Signal Processing." Diss., Temple University Libraries, 2011. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/144255.

Abstract:
Ph.D., Electrical Engineering. Analyzing the brain's behavior in terms of its neuronal activity is the fundamental purpose of Brain-Machine Interfaces (BMIs). Neuronal activity is often assumed to be encoded in the rate of neuronal action potential spikes. Successful performance of a BMI system is tied to the efficiency of its individual processing elements, such as spike detection, sorting, and decoding. To achieve reliable operation, BMIs are equipped with hundreds of electrodes at the neural interface. While a single electrode/tetrode communicates with up to four neurons at a given instant of time, a typical interface communicates with an ensemble of hundreds or even thousands of neurons. However, translation of these signals (data) into usable information for real-time BMIs is bottlenecked by the lack of efficient real-time algorithms and real-time hardware that can handle massively parallel channels of neural data. The research presented here addresses this issue by developing real-time neural processing algorithms that can be implemented in reconfigurable hardware and thus can be scaled to handle thousands of channels in parallel. The developed reconfigurable system serves as an evaluation platform for investigating the fundamental design tradeoffs in allocating finite hardware resources for a reliable BMI. In this work, the generic architectural layout needed to process neural signals at a massive scale is discussed. A System-on-Chip design with an embedded system architecture is presented for FPGA hardware realization that features (a) scalability, (b) reconfigurability, and (c) real-time operability. A prototype design incorporating a dual-processor system and essential neural signal processing routines, such as real-time spike detection and sorting, is presented. Two kinds of spike detectors, a simple threshold-based detector and a nonlinear energy operator-based detector, were implemented. To achieve real-time spike sorting, a fuzzy logic-based spike sorter was developed and synthesized in the hardware. Furthermore, a real-time kernel to monitor the high-level interactions of the system was implemented. The entire system was realized in a platform FPGA (Xilinx Virtex-5 LX110T). The system was tested using extracellular neural recordings from three different animals: an owl monkey, a macaque, and a rat. Operational performance of the system is demonstrated for a 300-channel neural interface; scaling the system to 900 channels is trivial.
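
The two spike detectors mentioned above are simple enough to illustrate in a few lines. The sketch below shows a plain amplitude threshold and the nonlinear energy operator psi[n] = x[n]^2 - x[n-1]*x[n+1] applied to a toy trace; it is a software reference illustration, not the FPGA implementation, and the threshold values are arbitrary.

```python
# Software sketch of the two spike detectors the abstract names: a simple
# amplitude threshold and the nonlinear energy operator (NEO). This is an
# illustrative reference implementation, not the FPGA design; the threshold
# values are arbitrary.

def threshold_detect(x, thr):
    """Indices where the raw sample crosses an amplitude threshold."""
    return [n for n, v in enumerate(x) if abs(v) > thr]

def neo_detect(x, thr):
    """Indices where NEO psi[n] = x[n]^2 - x[n-1]*x[n+1] exceeds thr."""
    return [n for n in range(1, len(x) - 1)
            if x[n] * x[n] - x[n - 1] * x[n + 1] > thr]

signal = [0.1, 0.0, 0.2, 3.5, -2.8, 0.3, 0.1]  # toy trace with one spike
print(threshold_detect(signal, 1.0))  # [3, 4]
print(neo_detect(signal, 1.0))        # [3, 4]
```
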
6

Lomonaco, Michael John. "CRYPTARRAY: A Scalable and Reconfigurable Architecture for Cryptographic Applications." Master's thesis, University of Central Florida, 2004. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/4394.

Abstract:
Cryptography is increasingly viewed as a critical technology to fulfill the requirements of security and authentication for information exchange between Internet applications. However, software implementations of cryptographic applications are unable to support the quality of service, from a bandwidth perspective, required by most Internet applications. As a result, various hardware implementations, from Application-Specific Integrated Circuits (ASICs) and Field-Programmable Gate Arrays (FPGAs) to programmable processors, have been proposed to improve this inadequate quality of service. Although these implementations provide performance considered better than that of software implementations, they still fall short of addressing the bandwidth requirements of most cryptographic applications in the context of the Internet, for two major reasons: (i) the majority of these architectures sacrifice flexibility for performance in order to reach the performance level needed for cryptographic applications, which can be detrimental considering that cryptographic standards and algorithms are still evolving; and (ii) these architectures do not consider the consequences of technology scaling in general, and interconnect-related problems in particular. As a result, this thesis proposes an architecture that attempts to address the requirements of cryptographic applications by overcoming the obstacles described in (i) and (ii). To this end, we propose a new reconfigurable, two-dimensional, scalable architecture, called CRYPTARRAY, in which bus-based communication is replaced by distributed shared-memory communication. At the physical level, the length of the wires is kept to a minimum. CRYPTARRAY is organized as a chessboard in which the dark and light squares represent Processing Elements (PEs) and memory blocks respectively. The granularity and resource composition of the PEs is specifically designed to support the computing operations encountered in cryptographic algorithms in general, and symmetric algorithms in particular. Communication can occur only between neighboring PEs through locally shared memory blocks. Because of the chessboard layout, the architecture can be reconfigured to allow computation to proceed as a pipelined wave in any direction. This organization offers a high computational density in terms of datapath resources and a large number of distributed storage resources that easily support a high degree of parallelism and pipelining. Experimental prototyping of a small array on FPGA chips shows that this architecture can run at 80.9 MHz, producing 26,968,716 outputs per second in static reconfiguration mode and 20,226,537 outputs per second in dynamic reconfiguration mode. M.S., Department of Electrical and Computer Engineering.
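
The chessboard organization described above is easy to visualize: cells alternate between processing elements and shared-memory blocks, so every PE has only memory blocks as direct neighbors. The toy sketch below renders such a layout; it is purely illustrative of the described topology, not the CRYPTARRAY implementation, and the assignment of PEs to even-parity cells is an assumption.

```python
# Toy rendering of the chessboard organization the abstract describes: cells
# alternate between processing elements (PE) and shared-memory blocks (M), so
# every PE touches only memory blocks as direct neighbors. Purely an
# illustration of the layout, not the CRYPTARRAY implementation.

def cell_kind(row, col):
    # Assumption: PEs sit on even-parity cells, memory blocks on odd-parity.
    return "PE" if (row + col) % 2 == 0 else "M "

rows, cols = 4, 6
for r in range(rows):
    print(" ".join(cell_kind(r, c) for c in range(cols)))

# A PE at (r, c) exchanges data only through its neighboring memory blocks,
# which is what lets a computation sweep across the array as a pipelined
# wave in any direction.
```
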
7

Avakian, Annie. "Reducing Cache Access Time in Multicore Architectures Using Hardware and Software Techniques." University of Cincinnati / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1335461322.

8

Silva, Antonio Carlos Fernandes da. "ChipCflow: tool for convert C code in a static dataflow architecture in reconfigurable hardware." Universidade de São Paulo, 2015. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-30062015-141638/.

Abstract:
A growing search for alternative architectures and software has been noted in recent years. This search is driven by advances in hardware technology, and such advances must be complemented by innovations in design methodologies and test and verification techniques in order to use the technology effectively. Alternative architectures and software generally exploit the parallelism of applications, unlike the Von Neumann model. Among high-performance alternative architectures is the dataflow architecture. In this kind of architecture, program execution is driven by data availability, so parallelism is intrinsic to the system. Dataflow architectures have again become a prominent research area due to hardware advances, in particular the advances in Reconfigurable Computing and Field Programmable Gate Arrays (FPGAs). The ChipCflow project is a tool for executing algorithms as dynamic dataflow graphs on FPGAs. This thesis describes the development of a code conversion tool that generates applications for a static dataflow architecture, as well as the ChipCflow project of which the conversion tool is part. The algorithm to be converted is specified in C and converted into a hardware description language, following the model proposed by the ChipCflow project. The results are a proof of concept of converting high-level language code into a dataflow architecture to be configured on an FPGA.
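
To make the static dataflow idea concrete, the toy sketch below represents the expression y = (a + b) * (c - d) as a graph whose operator nodes fire as soon as their input tokens are available, so execution order follows data availability rather than program order. It is a conceptual illustration only, not output of the ChipCflow conversion tool.

```python
# Toy illustration of the dataflow execution model behind the abstract: an
# operator node fires as soon as all of its input tokens are available,
# so parallelism follows data availability rather than program order.
# This is a conceptual sketch, not output of the ChipCflow conversion tool.

# Dataflow graph for y = (a + b) * (c - d): node -> (operation, inputs)
graph = {
    "n1": ("+", ["a", "b"]),
    "n2": ("-", ["c", "d"]),
    "n3": ("*", ["n1", "n2"]),
}
ops = {"+": lambda x, y: x + y, "-": lambda x, y: x - y, "*": lambda x, y: x * y}

def run(graph, tokens):
    """Fire any node whose input tokens are present until nothing changes."""
    fired = True
    while fired:
        fired = False
        for node, (op, ins) in graph.items():
            if node not in tokens and all(i in tokens for i in ins):
                tokens[node] = ops[op](*(tokens[i] for i in ins))
                fired = True
    return tokens

print(run(graph, {"a": 2, "b": 3, "c": 10, "d": 4})["n3"])  # (2+3)*(10-4) = 30
```
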
9

Robinson, Kylan Thomas. "An integrated development environment for the design and simulation of medium-grain reconfigurable hardware." Pullman, Wash. : Washington State University, 2010. http://www.dissertations.wsu.edu/Thesis/Spring2010/k_robinson_041510.pdf.

Abstract:
Thesis (M.S. in computer engineering)--Washington State University, May 2010. Title from PDF title page (viewed on June 22, 2010). "School of Electrical Engineering and Computer Science." Includes bibliographical references (p. 75-76).
10

Das, Satyajit. "Architecture and Programming Model Support for Reconfigurable Accelerators in Multi-Core Embedded Systems." Thesis, Lorient, 2018. http://www.theses.fr/2018LORIS490/document.

Abstract:
Emerging trends in embedded systems and applications call for high throughput and low power consumption. Due to the increasing demand for low-power computing and diminishing returns from technology scaling, industry and academia are turning with renewed interest toward energy-efficient hardware accelerators. The main drawback of hardware accelerators is that they are not programmable: their utilization can be low if they perform only one specific function, and increasing the number of accelerators in a system on chip (SoC) causes scalability issues. Programmable accelerators provide flexibility and solve the scalability issues. The Coarse-Grained Reconfigurable Array (CGRA) architecture, consisting of several processing elements with word-level granularity, is a promising choice for a programmable accelerator. Inspired by the promising characteristics of programmable accelerators, this thesis studies the potential of CGRAs in near-threshold computing platforms and develops an end-to-end CGRA research framework. The major contributions of this framework are CGRA design, implementation, integration in a computing system, and compilation for the CGRA. First, the design and implementation of a CGRA named Integrated Programmable Array (IPA) is presented. Next, the problem of mapping applications with control and data flow onto the CGRA is formulated. From this formulation, several efficient algorithms are developed that use the internal resources of the CGRA, with a vision for low-power acceleration. The algorithms are integrated into an automated compilation flow. Finally, the IPA accelerator is integrated into PULP, a Parallel Ultra-Low-Power Processing Platform, to explore the benefits of this kind of ultra-low-power heterogeneous platform.
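
Mapping an application onto a CGRA or binding accelerators to reconfigurable tiles is, at its simplest, a placement problem. The sketch below shows the kind of first-fit baseline such binding schemes are commonly compared against: each incoming datapath is placed on the first tile with enough free processing elements, with no regard for inter-tile communication. It is an illustrative sketch under assumed tile capacities, not the thesis's architecture-aware binding algorithm.

```python
# Minimal sketch of a first-fit accelerator-to-tile binding baseline: each
# incoming datapath is placed on the first tile that still has free
# processing elements. Illustrative only; an architecture-aware scheme such
# as the one described in the abstract would also minimize inter-tile
# communication when choosing tiles.

def first_fit_bind(datapath_sizes, tile_capacity, num_tiles):
    """Return a list mapping each datapath index to a tile, or None if full."""
    free = [tile_capacity] * num_tiles
    binding = []
    for size in datapath_sizes:
        tile = next((t for t in range(num_tiles) if free[t] >= size), None)
        if tile is None:
            return None  # no tile can host this datapath
        free[tile] -= size
        binding.append(tile)
    return binding

# Four datapaths needing 3, 2, 4 and 1 PEs on two tiles of 5 PEs each.
print(first_fit_bind([3, 2, 4, 1], tile_capacity=5, num_tiles=2))  # [0, 0, 1, 1]
```
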