Journal articles on the topic 'FPGA design'

Author: Grafiati

Published: 4 June 2021

Last updated: 25 April 2022

Consult the top 50 journal articles for your research on the topic 'FPGA design.'

1

Lee, Hanho, and Gerald E. Sobelman. "VLSI Design of Digit-Serial FPGA Architecture." Journal of Circuits, Systems and Computers 13, no. 01 (February 2004): 17–52. http://dx.doi.org/10.1142/s021812660400126x.

Abstract:

This paper presents a novel application-specific field-programmable gate array (FPGA) architecture that supports efficient implementation of digit-serial DSP architectures on a digit-wide basis. Digit-serial DSP designs have been an effective implementation method for FPGAs. To efficiently realize a digit-serial DSP design on FPGAs, one must create an FPGA architecture optimized for those types of systems. We examine the various circuits used in digit-serial DSP designs to extract the key features that should be reflected in the new FPGA architecture. We explain the design methodology, layout and implementation of the new digit-serial FPGA architecture. Digit-serial DSP designs using the digit-serial FPGA (DS-FPGA) are compared to those implemented on Xilinx FPGAs. We have estimated that DS-FPGA designs are about 2.5 to 3 times more efficient in area and faster than the equivalent digit-serial DSP architectures implemented using Xilinx FPGAs.

2

Oliveira, Duarte L., Marius Strum, and Sandro S. Sato. "Burst-Mode Asynchronous Controllers on FPGA." International Journal of Reconfigurable Computing 2008 (2008): 1–10. http://dx.doi.org/10.1155/2008/926851.

Abstract:

FPGAs have been mainly used to design synchronous circuits. Asynchronous design on FPGAs is difficult because the resulting circuit may suffer from hazard problems. We propose a method that implements a popular class of asynchronous circuits, known as burst mode, on FPGAs based on look-up table architectures. We present two conditions that, if satisfied, guarantee essential hazard-free implementation on any LUT-based FPGA. By doing that, besides all the intrinsic advantages of asynchronous over synchronous circuits, they also take advantage of the shorter design time and lower cost associated with FPGA designs.

3

Wu, Chang Fu. "Analysis and Realization of Critical Points on Hardware Design of FPGA." Advanced Materials Research 950 (June 2014): 133–38. http://dx.doi.org/10.4028/www.scientific.net/amr.950.133.

Abstract:

FPGAs are an important class of devices that can realize many functions. With the development of communication technology and computer science, more and more technologies are being invented and more and more hardware design technologies are being sifted out. As a result, ASIC-based hardware design is often not well suited to realizing new theories. As a newer kind of device, the FPGA has many advantages, including powerful functionality, a shorter design cycle, lower cost, greater flexibility, and more intelligent design tools. FPGA hardware design is receiving more and more attention, so it is worthwhile to analyze it. Hardware design for an FPGA depends on the FPGA device. On the market, Altera and Xilinx FPGAs are frequently used by engineers. This dissertation therefore analyzes and realizes the critical points in hardware design based on Xilinx FPGAs. The critical points described include the power supply, impedance matching, and clock circuit design. Many tools are used for hardware design, including Altium Designer, Protel, Cadence, and others. Compared with the other tools, Cadence has more advantages, so it is used in this dissertation as the design tool for hardware design analysis and realization. With the help of Cadence, a hardware design and a signal transmission simulation are analyzed. With the development of microelectronics technology and computer science, FPGA hardware design will receive more and more attention.

4

Trinh, Nguyen, Anh Le Thi Kim, Hung Nguyen, and Linh Tran. "Algorithmic TCAM on FPGA with data collision approach." Indonesian Journal of Electrical Engineering and Computer Science 22, no. 1 (April 1, 2021): 89. http://dx.doi.org/10.11591/ijeecs.v22.i1.pp89-96.

Abstract:

Content addressable memory (CAM) and ternary content addressable memory (TCAM) are specialized high-speed memories for data searching. CAM and TCAM have many applications in network routing, packet forwarding and Internet data centers. These types of memories have drawbacks in power dissipation and area. As field-programmable gate arrays (FPGAs) are increasingly used for network acceleration applications, the demand to integrate TCAM and CAM on FPGAs is increasing. Because most FPGAs do not support native TCAM and CAM hardware, methods of implementing algorithmic TCAM using FPGA resources have been proposed in recent years. Algorithmic TCAM on FPGA has the advantages of FPGAs' low power consumption and high integration scalability. This paper proposes a scalable algorithmic TCAM design on FPGA. The design uses memory blocks to negate the power dissipation issue and data collision to save area. The paper also presents a design of a 256 x 104-bit algorithmic TCAM on an Intel Cyclone V FPGA, and evaluates the performance and applicability of the design at large scale and in future developments.
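
For readers unfamiliar with ternary matching, the lookup behaviour that a TCAM provides can be sketched in software. This is only an illustration of ternary match semantics, not the memory-block and data-collision design proposed in the paper; the `(value, care_mask)` encoding, the first-match priority rule, and the names `Entry` and `tcam_lookup` are assumptions made for this sketch.

```python
from typing import List, Optional, Tuple

# Hypothetical encoding: each entry is (value, care_mask).
# A care_mask bit of 1 means "this bit must match"; 0 means "don't care".
Entry = Tuple[int, int]


def tcam_lookup(table: List[Entry], key: int) -> Optional[int]:
    """Return the index of the first entry that matches key, or None.

    A hardware TCAM compares all entries in parallel; this loop models
    only the result, assuming lower indices have higher priority.
    """
    for index, (value, care_mask) in enumerate(table):
        # XOR exposes the differing bits; the mask hides don't-care bits.
        if (key ^ value) & care_mask == 0:
            return index
    return None


table = [
    (0b10100000, 0b11110000),  # matches 1010xxxx
    (0b10000000, 0b11000000),  # matches 10xxxxxx
    (0b00000000, 0b00000000),  # matches any key (default entry)
]
```

With this table, a key of `0b10101111` hits the first entry, while `0b10011111` falls through to the second, less specific one.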

5

Cahill, Eli, Brad Hutchings, and Jeffrey Goeders. "Approaches for FPGA Design Assurance." ACM Transactions on Reconfigurable Technology and Systems 15, no. 3 (September 30, 2022): 1–29. http://dx.doi.org/10.1145/3491233.

Abstract:

Field-Programmable Gate Arrays (FPGAs) are widely used for custom hardware implementations, including in many security-sensitive industries, such as defense, communications, transportation, medical, and more. Compiling source hardware descriptions to FPGA bitstreams requires the use of complex computer-aided design (CAD) tools. These tools are typically proprietary and closed-source, and it is not possible to easily determine that the produced bitstream is equivalent to the source design. In this work, we present various FPGA design flows that leverage pre-synthesizing or pre-implementing parts of the design, combined with open-source synthesis tools, bitstream-to-netlist tools, and commercial equivalence-checking tools, to verify that a produced hardware design is equivalent to the designer’s source design. We evaluate these different design flows on several benchmark circuits and demonstrate that they are effective at detecting malicious modifications made to the design during compilation. We compare our proposed design flows with baseline commercial design flows and measure the overheads to area and runtime.

6

Hosseinghorban, Ali, and Akash Kumar. "A Partial-Reconfiguration-Enabled HW/SW Co-Design Benchmark for LTE Applications." Electronics 11, no. 7 (March 22, 2022): 978. http://dx.doi.org/10.3390/electronics11070978.

Abstract:

Rapid and continuous evolution in telecommunication standards and applications has increased the demand for a platform with high parallelization capability, high flexibility, and low power consumption. FPGAs are known platforms that can provide all these requirements. However, the evaluation of approaches, architectures, and scheduling policies in this era requires a suitable and open-source benchmark suite that runs on FPGA. This paper harnesses high-level synthesis tools to implement high-performance, resource-efficient, and easy-maintenance kernels for FPGAs. We provide various implementations of each kernel of PHY-Bench and WiBench, which are the most well-known benchmark suites for telecommunication applications on FPGAs. We analyze the execution time and power consumption of different kernels on ARM processors and FPGA. We have made all sources and documentation public for the benefit of the research community. The codes are flexible, and all kernels can easily be regenerated for different sizes. The results show that the FPGA can increase the speed by up to 19.4 times. Furthermore, we show that the power consumption of the FPGA can be reduced by up to 45% by partially reconfiguring a kernel that fits the size of the input data instead of using a large kernel that supports all inputs. We also show that partial reconfiguration can improve the execution time for processing a sub-frame in the uplink application by 33% compared to an FPGA-based approach without partial reconfiguration.

7

Yu, Hoyoung, Hansol Lee, Sangil Lee, Youngmin Kim, and Hyung-Min Lee. "Recent Advances in FPGA Reverse Engineering." Electronics 7, no. 10 (October 12, 2018): 246. http://dx.doi.org/10.3390/electronics7100246.

Abstract:

In this paper, we review recent advances in reverse engineering with an emphasis on FPGA devices and experimentally verified advantages and limitations of reverse engineering tools. The paper first introduces essential components for programming Xilinx FPGAs (Xilinx, San Jose, CA, USA), such as Xilinx Design Language (XDL), XDL Report (XDLRC), and bitstream. Then, reverse engineering tools (Debit, BIL, and Bit2ncd), which extract the bitstream from the external memory to the FPGA and utilize it to recover the netlist, are reviewed, and their limitations are discussed. This paper also covers supplementary tools (Rapidsmith) that can adjust the FPGA design flow to support reverse engineering. Finally, reverse engineering projects for non-Xilinx products, such as Lattice FPGAs (Icestorm) and Altera FPGAs (QUIP), are introduced to compare the reverse engineering capabilities by various commercial FPGA products.

8

Coli, Vincent J. "FPGA design technology." Microprocessors and Microsystems 17, no. 7 (September 1993): 383–89. http://dx.doi.org/10.1016/0141-9331(93)90060-k.

9

Heinz, Carsten, Jaco Hofmann, Jens Korinth, Lukas Sommer, Lukas Weber, and Andreas Koch. "The TaPaSCo Open-Source Toolflow." Journal of Signal Processing Systems 93, no. 5 (May 2021): 545–63. http://dx.doi.org/10.1007/s11265-021-01640-8.

Abstract:

The integration of FPGA-based accelerators into a complete heterogeneous system is a challenging task faced by many researchers and engineers, especially now that FPGAs enjoy increasing popularity as implementation platforms for efficient, application-specific accelerators for domains such as signal processing, machine learning and intelligent storage. To lighten the burden of system integration from the developers of accelerators, the open-source TaPaSCo framework presented in this work provides an automated toolflow for the construction of heterogeneous many-core architectures from custom processing elements, and a simple, uniform programming interface to utilize spatially distributed, parallel computation on FPGAs. TaPaSCo aims to increase the scalability and portability of FPGA designs through automated design space exploration, greatly simplifying the scaling of hardware designs and facilitating iterative growth and portability across FPGA devices and families. This work describes TaPaSCo with its primary design abstractions and shows how TaPaSCo addresses portability and extensibility of FPGA hardware designs for systems-on-chip. A study of successful projects using TaPaSCo shows its versatility and can serve as inspiration and reference for future users, with more details on the usage of TaPaSCo presented in an in-depth case study and a short overview of the workflow.

10

Zhang, Qian Li, Fang Yu, Yan Li, Ming Li, Yan Zhao, and Liang Chen. "Architecture-Specific Mapping Tool for SOI-Based FPGA." Advanced Materials Research 159 (December 2010): 438–43. http://dx.doi.org/10.4028/www.scientific.net/amr.159.438.

Abstract:

This paper addresses several key issues in the design of the mapping tool used for the FPGA application implementation in our SRAM-based FPGAs fabricated in a 0.5 micron SOI-CMOS process, with particular emphasis on FPGA architecture interrelated mapping step and packing method for CAD tool. Considering the routability and testability of the FPGA and the CAD tool, the algorithm combines the FPGA structure with the object netlist, mapping the basic elements into basic building blocks in order to reduce the resource usage. The result is proven in extensive test circuits used in our FPGA design.

11

Reddy, Naresh Kumar, and N. Suresh. "An Efficient approach for Design and Testing of FPGA Programming using LabVIEW." International Journal of Reconfigurable and Embedded Systems (IJRES) 4, no. 3 (November 1, 2015): 192. http://dx.doi.org/10.11591/ijres.v4.i3.pp192-200.

Abstract:

Programming of Field Programmable Gate Arrays (FPGAs) has long been the domain of engineers with VHDL or Verilog expertise. FPGAs have caught the attention of algorithm developers and communication researchers, who want to use FPGAs to instantiate systems or implement DSP algorithms. These efforts, however, are often stifled by the complexities of programming FPGAs. RTL programming in either VHDL or Verilog generally does not offer the high level of abstraction needed to represent the world of signal flow graphs and complex signal processing algorithms. This paper describes FPGA programs written in a graphical language, rather than Verilog or VHDL, with the help of LabVIEW, and describes the features of the LabVIEW FPGA environment.

12

Mbongue, Joel Mandebi, Danielle Tchuinkou Kwadjo, Alex Shuping, and Christophe Bobda. "Deploying Multi-tenant FPGAs within Linux-based Cloud Infrastructure." ACM Transactions on Reconfigurable Technology and Systems 15, no. 2 (June 30, 2022): 1–31. http://dx.doi.org/10.1145/3474058.

Abstract:

Cloud deployments now increasingly exploit Field-Programmable Gate Array (FPGA) accelerators as part of virtual instances. While cloud FPGAs are still essentially single-tenant, the growing demand for efficient hardware acceleration paves the way to FPGA multi-tenancy. It then becomes necessary to explore architectures, design flows, and resource management features that aim at exposing multi-tenant FPGAs to cloud users. In this article, we discuss a hardware/software architecture that supports provisioning space-shared FPGAs in Kernel-based Virtual Machine (KVM) clouds. The proposed hardware/software architecture introduces an FPGA organization that improves hardware consolidation and supports hardware elasticity with minimal data movement overhead. It also relies on VirtIO to decrease communication latency between hardware and software domains. Prototyping the proposed architecture with a Virtex UltraScale+ FPGA demonstrated near specification maximum frequency for on-chip data movement and high throughput in virtual instance access to hardware accelerators. We demonstrate similar performance compared to single-tenant deployment while increasing FPGA utilization, which is one of the goals of virtualization. Overall, our FPGA design achieved about 2× higher maximum frequency than the state of the art and a bandwidth reaching up to 28 Gbps on 32-bit data width.

13

Magyari, Alexander, and Yuhua Chen. "FPGA Remote Laboratory Using IoT Approaches." Electronics 10, no. 18 (September 11, 2021): 2229. http://dx.doi.org/10.3390/electronics10182229.

Abstract:

Field-Programmable Gate Arrays (FPGAs) are relatively high-end devices that are not easily shared between multiple users. In this work, we achieved a remotely accessible FPGA framework using accessible Internet of Things (IoT) approaches. We sought to develop a method for students to receive the same level of educational quality in a remote environment that they would receive in a typical, in-person course structure for a university-level digital design course. Keeping cost in mind, we are able to combine the functionality of an entry-level FPGA and a Raspberry Pi Zero to provide IoT access for laboratory work. Previous works in this field allow only one user to access an FPGA at a time, which requires students to schedule time slots. Our design is unique in that it gives multiple users the ability to interact simultaneously with one individual top-level design on an FPGA. This novel design benefits classroom presentations, collaboration and debugging, and eliminates the need to restrict student access to a time slot for FPGA access. Further, our hardware wrapper is lightweight, utilizing less than 1% of tested FPGA chips, allowing it to be integrated with resource-heavy designs. The application is meant to scale with large designs; the number of users who can interact with the remote design does not depend on the complexity of the design. Further, the number of users who can interact with a single project is limited only by the bandwidth restrictions imposed by Google Firebase, which is far beyond any practical number of users for simultaneous access.

14

Landmann, Christoph, and Rolf Kall. "Graphical Hardware Description as a High-Level Design Entry Method for FPGA-Based Data Acquisition Systems." Key Engineering Materials 613 (May 2014): 296–306. http://dx.doi.org/10.4028/www.scientific.net/kem.613.296.

Abstract:

Probably one of the most significant developments in the field of software-defined multifunction data acquisition systems and devices is the employment of FPGA (Field-Programmable Gate Array) technology, resulting in a tremendous digital processing potential close to the I/O pin. FPGA technology is based on reconfigurable semiconductor devices which can be employed as processing targets in heterogeneous computing architectures for a variety of data acquisition applications. They can primarily be characterized by generic properties, such as deterministic execution, inherent parallelism, fast processing speed and high availability, stability and reliability. Therefore FPGAs are particularly suitable for use in “intelligent” data acquisition applications that require either in-line digital signal co-processing or real-time system emulation in the field of advanced control, protocol aware communication, hardware-in-the-loop (HIL) as well as RF and wireless test. From the perspective of a domain expert however, primarily being focused on developing applications and algorithms, simple and intuitive design entry methods and tools are required that facilitate the FPGA configuration and design entry process. Traditional FPGA design entry methods and commercially available tools assume a comprehensive knowledge of hardware description languages (HDL), such as VHDL or Verilog®, and implement a process or function at register-level. In contrast, graphical hardware description languages for FPGAs, such as the integrated development environment NI LabVIEW® with FPGA module extension, abstract the design process by means of graphical objects, I/O nodes and interconnecting wires that represent the FPGA’s IP and implement processes, timing, I/O integration and data flow.
This paper discusses the advantages of graphical system design for FPGAs over text-based alternatives, introduces interfaces for the integration of 3rd party IP, all backed up by a detailed illustration of a COTS FPGA-based multifunction DAQ target compared to a traditional DAQ architecture.

15

Singh, Rachna, and Arvind Rajawat. "Analytical Model for High–Level Area Estimation of FPGA Design." International Journal of Embedded and Real-Time Communication Systems 7, no. 2 (July 2016): 35–44. http://dx.doi.org/10.4018/ijertcs.2016070103.

Abstract:

FPGAs have been used as a target platform because they are increasingly interesting for system design and because, due to rapid technological progress, ever larger devices are commercially affordable. These trends make FPGAs an alternative in application areas where extensive data processing plays an important role. Consequently, the desire emerges for early performance estimation in order to quantify the FPGA approach. A mathematical model is presented that estimates the maximum number of LUTs consumed by the hardware synthesized for different FPGAs using LLVM. The motivation behind this research work is to design an area modeling approach for FPGA-based implementation at an early stage of design. The equation-based area estimation model permits immediate and accurate estimation of resources. Two important criteria used to judge the quality of the results were estimation accuracy and runtime. Experimental results show that the estimation error is in the range of 1.33% to 7.26% for Spartan 3E, 1.6% to 5.63% for Virtex-2pro and 2.3% to 6.02% for Virtex-5.

16

Pirzada, Syed Jahanzeb Hussain, Abid Murtaza, Tongge Xu, and Liu Jianwei. "A Reconfigurable Model-Based Design for Rapid Prototyping on FPGA." International Journal of Computer Theory and Engineering 12, no. 3 (2020): 80–84. http://dx.doi.org/10.7763/ijcte.2020.v12.1268.

Abstract:

Digital design methodologies are evolving with the increasing use of digital systems in daily life. The Model Based Design (MBD) methodology provides a unique methodology for the design and implementation of digital systems on Field Programmable Gate Arrays (FPGAs). Recently, a lot of research effort has been put into exploiting new methodologies for designing and prototyping digital systems on FPGAs. FPGA hardware supports prototyping, which provides a means of verifying a design at an early stage of the development cycle. This helps to evaluate design trade-offs by testing the design in real time on hardware. Making prototypes is a common practice in research-oriented projects. However, it requires additional development time, which increases the time to market of the product. This paper illustrates the use of reconfigurable MBD for rapid prototyping of digital systems on Microsemi ACTEL FPGAs to improve the design cycle and time-to-market of a product. The model is simulated to verify the functionality of the design at system level, and high-level code is generated from the MBD toolset embedded in MATLAB for hardware implementation. Then, High-Level Synthesis (HLS) is performed on the generated code, which converts this high-level code into Verilog-HDL suitable for hardware implementation on FPGA. Hence, this work presents a methodology and its analysis for the design of digital systems using high-level synthesis on Microsemi ACTEL FPGAs.

17

Gehrer, Stefan, and Georg Sigl. "Area-Efficient PUF-Based Key Generation on System-on-Chips with FPGAs." Journal of Circuits, Systems and Computers 25, no. 01 (November 15, 2015): 1640002. http://dx.doi.org/10.1142/s0218126616400028.

Abstract:

Physically unclonable functions (PUFs) are an innovative way to generate device unique keys using uncontrollable production tolerances. In this work, we present a method to use PUFs on modern FPGA-based system-on-chips (SoCs). The processor system part of the SoC is used to configure the FPGA part. We propose a reconfigurable PUF design that can be changed by using the partial reconfiguration (PR) feature of modern FPGAs. Multiple ring oscillator PUF (RO PUF) designs are loaded on the same logic blocks of the FPGA in order to make use of different resources, i.e., sources of entropy, on the FPGA. Their frequencies are read out individually and the differences between neighbored oscillators are used to generate a bit response. The responses of each design can be concatenated to a larger response vector that can be used to generate a cryptographic key. We present an implementation that is able to decrease the needed resources by 87.5% on a Xilinx Zynq.
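
The response-generation step described in this abstract (comparing the frequencies of neighbouring ring oscillators, then concatenating the responses of several designs) can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not say whether oscillator pairs overlap or how ties are resolved, so the overlapping-pair scheme, the strict `>` comparison, and the names `ro_puf_bits` and `concatenate_responses` are hypothetical choices. The frequency values are made-up sample inputs, not measurements from the paper.

```python
from typing import Iterable, List, Sequence


def ro_puf_bits(frequencies: Sequence[float]) -> List[int]:
    """Derive one bit per pair of neighbouring oscillators.

    Assumption: overlapping pairs (0,1), (1,2), ... are compared, and a
    bit is 1 when the left oscillator is strictly faster than the right.
    """
    return [1 if left > right else 0
            for left, right in zip(frequencies, frequencies[1:])]


def concatenate_responses(responses: Iterable[Sequence[int]]) -> List[int]:
    """Join the bit responses of several PUF designs into one vector."""
    combined: List[int] = []
    for response in responses:
        combined.extend(response)
    return combined


# Illustrative frequencies (MHz) for two designs loaded on the same logic.
design_a = ro_puf_bits([100.0, 101.5, 99.0, 99.2])
design_b = ro_puf_bits([98.7, 98.1, 98.4, 97.9])
response_vector = concatenate_responses([design_a, design_b])
```

In practice the raw response vector would still need error correction before it could serve as a stable cryptographic key; that step is outside this sketch.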

18

Du, Changdao, and Yoshiki Yamaguchi. "High-Level Synthesis Design for Stencil Computations on FPGA with High Bandwidth Memory." Electronics 9, no. 8 (August 8, 2020): 1275. http://dx.doi.org/10.3390/electronics9081275.

Abstract:

Due to performance and energy requirements, FPGA-based accelerators have become a promising solution for high-performance computations. Meanwhile, with the help of high-level synthesis (HLS) compilers, FPGA can be programmed using common programming languages such as C, C++, or OpenCL, thereby improving design efficiency and portability. Stencil computations are significant kernels in various scientific applications. In this paper, we introduce an architecture design for implementing stencil kernels on state-of-the-art FPGA with high bandwidth memory (HBM). Traditional FPGAs are usually equipped with external memory, e.g., DDR3 or DDR4, which limits the design space exploration in the spatial domain of stencil kernels. Therefore, many previous studies mainly relied on exploiting parallelism in the temporal domain to eliminate the bandwidth limitations. In our approach, we scale-up the design performance by considering both the spatial and temporal parallelism of the stencil kernel equally. We also discuss the design portability among different HLS compilers. We use typical stencil kernels to evaluate our design on a Xilinx U280 FPGA board and compare the results with other existing studies. By adopting our method, developers can take broad parallelization strategies based on specific FPGA resources to improve performance.
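
To make the terms "spatial" and "temporal" concrete, a stencil kernel can be written as a plain software reference. The sketch below assumes a 1D three-point averaging stencil with fixed boundary values; the kernel choice and the names `stencil_step` and `stencil_run` are illustrative and are not taken from the paper, whose HLS architecture is not reproduced here.

```python
from typing import List, Sequence


def stencil_step(grid: Sequence[float]) -> List[float]:
    """Apply one time step of a 3-point averaging stencil.

    The loop over i is the spatial domain: every interior point depends
    only on the previous time step, so these updates are independent.
    Boundary points are kept fixed (an assumption of this sketch).
    """
    out = list(grid)
    for i in range(1, len(grid) - 1):
        out[i] = (grid[i - 1] + grid[i] + grid[i + 1]) / 3.0
    return out


def stencil_run(grid: Sequence[float], steps: int) -> List[float]:
    """Iterate the stencil; the loop over steps is the temporal domain."""
    current = list(grid)
    for _ in range(steps):
        current = stencil_step(current)
    return current
```

An accelerator can unroll the inner loop to process many grid points per cycle (spatial parallelism, which demands memory bandwidth) or chain several time steps in a pipeline (temporal parallelism, which reuses data on chip).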

19

Biookaghazadeh, Saman, Pravin Kumar Ravi, and Ming Zhao. "Toward Multi-FPGA Acceleration of the Neural Networks." ACM Journal on Emerging Technologies in Computing Systems 17, no. 2 (April 2021): 1–23. http://dx.doi.org/10.1145/3432816.

Abstract:

High-throughput and low-latency Convolutional Neural Network (CNN) inference is increasingly important for many cloud- and edge-computing applications. FPGA-based acceleration of CNN inference has demonstrated various benefits compared to other high-performance devices such as GPGPUs. Current FPGA CNN-acceleration solutions are based on a single FPGA design, which are limited by the available resources on an FPGA. In addition, they can only accelerate conventional 2D neural networks. To address these limitations, we present a generic multi-FPGA solution, written in OpenCL, which can accelerate more complex CNNs (e.g., C3D CNN) and achieve a near linear speedup with respect to the available single-FPGA solutions. The design is built upon the Intel Deep Learning Accelerator architecture, with three extensions. First, it includes updates for better area efficiency (up to 25%) and higher performance (up to 24%). Second, it supports 3D convolutions for more challenging applications such as video learning. Third, it supports multi-FPGA communication for higher inference throughput. The results show that utilizing multiple FPGAs can linearly increase the overall bandwidth while maintaining the same end-to-end latency. In addition, the design can outperform other FPGA 2D accelerators by up to 8.4 times and 3D accelerators by up to 1.7 times.

20

Ali, Moustafa. "Safety Critical FPGA Design." International Conference on Electrical Engineering 9, no. 9th (May 1, 2014): 1. http://dx.doi.org/10.21608/iceeng.2014.30555.

21

Kamboh, Hamid M., and Shoab A. Khan. "High Throughput Filter Architecture for Optimal FPGA-Based Implementations." Journal of Circuits, Systems and Computers 22, no. 05 (May 9, 2013): 1350034. http://dx.doi.org/10.1142/s0218126613500345.

Abstract:

Modern field programmable gate arrays (FPGAs) offer built-in support for efficient implementation of signal processing algorithms in the form of specialized embedded blocks such as high speed carry chains, specialized shift registers, adders, multiply accumulators (MAC) and block memories. These dedicated elements provide increased computational power and are used for efficient implementation of computationally intensive algorithms. This paper proposes a novel algorithm and architecture for the design and implementation of high performance intermediate frequency (IF) filters on FPGAs. In this research, we have proposed innovative design methodologies for generation of optimal feed forward and recursive architectures to be mapped on a family of FPGAs. Keeping in perspective the limited number of registers within the embedded blocks, the new methodology applies transformations to achieve higher throughput by applying various optimizations to the design algorithm. Implementation options include systolic MAC, transpose direct form MAC, canonic signed digit and distributed arithmetic based filters to suit the most economical FPGA implementation. The paper demonstrates the methodology and shows its applicability by synthesizing the designs and comparing the results to a number of traditional architectures and intellectual property cores. Using Xilinx Virtex-5 FPGA, our results show a throughput improvement between 7% and 30% with an average improvement of 16% over traditional implementations of these designs.

22

Streit, Franz-Josef, Paul Krüger, Andreas Becher, Stefan Wildermann, and Jürgen Teich. "Design and Evaluation of a Tunable PUF Architecture for FPGAs." ACM Transactions on Reconfigurable Technology and Systems 15, no. 1 (March 31, 2022): 1–27. http://dx.doi.org/10.1145/3491237.

Abstract:

FPGA-based Physical Unclonable Functions (PUF) have emerged as a viable alternative to permanent key storage by turning effects of inaccuracies during the manufacturing process of a chip into a unique, FPGA-intrinsic secret. However, many fixed PUF designs may suffer from unsatisfactory statistical properties in terms of uniqueness, uniformity, and robustness. Moreover, a PUF signature may alter over time due to aging or changing operating conditions, rendering a PUF insecure in the worst case. As a remedy, we propose CHOICE, a novel class of FPGA-based PUF designs with tunable uniqueness and reliability characteristics. By the use of addressable shift registers available on an FPGA, we show that a wide configuration space for adjusting a device-specific PUF response is obtained without any sacrifice of randomness. In particular, we demonstrate the concept of address-tunable propagation delays, whereby we are able to increase or decrease the probability of obtaining “1”s in the PUF response. Experimental evaluations on a group of six 28 nm Xilinx Artix-7 FPGAs show that CHOICE PUFs provide a large range of configurations to allow a fine-tuning to an average uniqueness between 49% and 51%, while simultaneously achieving bit error rates below 1.5%, thus outperforming state-of-the-art PUF designs. Moreover, with only a single FPGA slice per PUF bit, CHOICE is one of the smallest PUF designs currently available for FPGAs. It is well-known that signal propagation delays are affected by temperature, as the operating temperature impacts the internal currents of transistors that ultimately make up the circuit. We therefore comprehensively investigate how temperature variations affect the PUF response and demonstrate how the tunability of CHOICE enables us to determine configurations that show a high robustness to such variations.
As a case study, we present a cryptographic key generation scheme based on CHOICE PUF responses as device-intrinsic secret and investigate the design objectives resource costs, performance, and temperature robustness to show the practicability of our approach.
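
The uniqueness and bit-error-rate figures quoted in this abstract are usually computed from Hamming distances between PUF responses. The sketch below uses the commonly cited definitions (mean inter-device distance for uniqueness, mean intra-device distance for bit error rate); the paper may define them differently, and the bit strings are made-up illustration data, not measurements.

```python
from itertools import combinations

def hamming_fraction(a, b):
    """Fraction of bit positions in which two equal-length responses differ."""
    if len(a) != len(b):
        raise ValueError("responses must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def uniqueness(responses):
    """Mean pairwise (inter-device) fractional Hamming distance; the ideal is 0.5."""
    pairs = list(combinations(responses, 2))
    return sum(hamming_fraction(a, b) for a, b in pairs) / len(pairs)

def bit_error_rate(reference, remeasurements):
    """Mean (intra-device) fractional Hamming distance to a reference response."""
    total = sum(hamming_fraction(reference, r) for r in remeasurements)
    return total / len(remeasurements)

# Made-up 8-bit responses from three hypothetical devices.
devices = ["10110010", "01100111", "11010100"]
# Made-up repeated readings of the first device.
rereads = ["10110010", "10110011"]

print(f"uniqueness: {uniqueness(devices):.3f}")
print(f"bit error rate: {bit_error_rate(devices[0], rereads):.4f}")
```

A uniqueness near 50% means responses from different devices look uncorrelated, which is why the 49% to 51% range reported above is close to ideal.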

23

Wu, Chi-Feng, and Cheng-Wen Wu. "Testing and Diagnosing Dynamic Reconfigurable FPGA." VLSI Design 10, no. 3 (January 1, 2000): 321–33. http://dx.doi.org/10.1155/2000/79281.

Abstract:

Dynamic reconfigurable field-programmable logic arrays (FPGAs) are receiving notable attention because of their much shorter reconfiguration time as compared with traditional FPGAs. The short reconfiguration time is vital to applications such as reconfigurable computing and emulation. We show in this paper that testing and diagnosis of the FPGA also can take advantage of its dynamic reconfigurability. We first propose an efficient methodology for testing the interconnects of the FPGA, then present several universal test and diagnosis approaches which cover all functional units of the FPGA. Experimental results show that our approach significantly reduces the testing time, without additional cost for diagnosis.

24

Ghaffari, Alireza, and Yvon Savaria. "CNN2Gate: An Implementation of Convolutional Neural Networks Inference on FPGAs with Automated Design Space Exploration." Electronics 9, no. 12 (December 21, 2020): 2200. http://dx.doi.org/10.3390/electronics9122200.

Abstract:

Convolutional Neural Networks (CNNs) have a major impact on our society because of the numerous services they provide. These services include, but are not limited to, image classification, video analysis, and speech recognition. Recently, the number of studies that use FPGAs to implement CNNs has been increasing rapidly. This is due to the lower power consumption and easy reconfigurability offered by these platforms. Because of the research efforts put into topics such as architecture, synthesis, and optimization, new challenges are arising in integrating suitable hardware solutions with high-level machine learning software libraries. This paper introduces an integrated framework (CNN2Gate), which supports compilation of a CNN model for an FPGA target. CNN2Gate is capable of parsing CNN models from several popular high-level machine learning libraries, such as Keras, Pytorch, Caffe2, etc. CNN2Gate extracts the computation flow of layers, in addition to weights and biases, and applies a “given” fixed-point quantization. Furthermore, it writes this information in the proper format for the FPGA vendor’s OpenCL synthesis tools, which are then used to build and run the project on the FPGA. CNN2Gate performs design-space exploration and automatically fits the design on different FPGAs with limited logic resources. This paper reports results of automatic synthesis and design-space exploration of AlexNet and VGG-16 on various Intel FPGA platforms.
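
The fixed-point quantization step mentioned in this abstract can be illustrated generically. The following is a minimal sketch of signed fixed-point quantization with saturation; the 8-bit word with 5 fractional bits is an assumed format chosen for illustration, and the function names are hypothetical rather than part of CNN2Gate.

```python
def quantize_fixed_point(x, total_bits=8, frac_bits=5):
    """Map a real value to a signed fixed-point integer code, saturating at the range limits."""
    scale = 1 << frac_bits
    q_min = -(1 << (total_bits - 1))
    q_max = (1 << (total_bits - 1)) - 1
    # Python's round() rounds exact halves to the nearest even integer.
    q = round(x * scale)
    return max(q_min, min(q_max, q))

def dequantize(q, frac_bits=5):
    """Recover the real value represented by a fixed-point integer code."""
    return q / (1 << frac_bits)

weights = [0.3, -1.27, 10.0]  # made-up example weights
codes = [quantize_fixed_point(w) for w in weights]
print(codes, [dequantize(q) for q in codes])
```

With these assumed parameters the representable range is -4.0 to 3.96875 in steps of 1/32, so the out-of-range weight 10.0 saturates instead of wrapping around.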

25

Kumar, Kandagatla Ravi, Cheeli Priyadarshini, Kanakam Bhavani, Ankam Varun Sundar Kumar, and Palanki Naga Nanda Sai. "Design of High Speed and Low Area Confined Multiplier on FPGA." Revista Gestão Inovação e Tecnologias 11, no. 4 (July 22, 2021): 2736–46. http://dx.doi.org/10.47059/revistageintec.v11i4.2315.

Abstract:

Technology plays a major role in the modern world, and developments in electronics in particular have had a large impact on quality of life. Among advanced applications, DSP ranks first. Multipliers are the most basic elements widely used in Digital Signal Processing (DSP) applications, so the design of the multiplier is the main factor in the performance of the device. Using RTL simulation and a Field Programmable Gate Array (FPGA), we compare the performance of a serial multiplier with an advanced multiplier. In this project, many single-bit adders are removed and replaced with multiplexers, so that the design occupies fewer FPGA slices. The multiplier architecture results in significant reductions in FPGA resources, latency, area, and power. The multiplication approaches are simulated at RTL in the Xilinx ISE simulator and synthesized in Xilinx ISE 14.7. Finally, the design is implemented on a Spartan 3E FPGA.

26

Sauvage, Laurent, Maxime Nassar, Sylvain Guilley, Florent Flament, Jean-Luc Danger, and Yves Mathieu. "Exploiting Dual-Output Programmable Blocks to Balance Secure Dual-Rail Logics." International Journal of Reconfigurable Computing 2010 (2010): 1–12. http://dx.doi.org/10.1155/2010/375245.

Abstract:

FPGA design of side-channel analysis countermeasures using unmasked dual-rail with precharge logic appears to be a great challenge. Indeed, the robustness of such a solution relies on careful differential placement and routing whereas both FPGA layout and FPGA EDA tools are not developed for such purposes. However, assessing the security level which can be achieved with them is an important issue, as it is directly related to the suitability to use commercial FPGA instead of proprietary custom FPGA for this kind of protection. In this article, we experimentally gave evidence that differential placement and routing of an FPGA implementation can be done with a granularity fine enough to improve the security gain. However, so far, this gain turned out to be lower for FPGAs than for ASICs. The solutions demonstrated in this article exploit the dual-output of modern FPGAs to achieve a better balance of dual-rail interconnections. However, we expect that an in-depth analysis of routing resources power consumption could still help reduce the interconnect differential leakage.

27

Chen, Bo. "Thermal Design of a Satellite Borne FPGA." Applied Mechanics and Materials 52-54 (March 2011): 1411–14. http://dx.doi.org/10.4028/www.scientific.net/amm.52-54.1411.

Abstract:

Thermal design and analysis of a satellite-borne FPGA is described in this paper. Thermal-conductive glue, vias, and an aluminum bar were applied to the FPGA and to the PCB under the FPGA to help conduct the heat of the FPGA to the heat sink. The results of finite element analysis showed that the case temperature of the FPGA decreased from 132.5°C to 55.4°C and the junction temperature decreased from 136.1°C to 59.0°C after the thermal design, which meets the thermal design requirements.
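
The reported temperatures can be sanity-checked with the standard steady-state relation T_junction = T_case + θ_jc · P, where θ_jc is the junction-to-case thermal resistance and P the dissipated power. Neither θ_jc nor P is given in the abstract, so the sketch below only compares the junction-to-case rise before and after the redesign.

```python
# Temperatures in degrees Celsius, taken from the abstract.
before = {"case": 132.5, "junction": 136.1}
after = {"case": 55.4, "junction": 59.0}

def junction_to_case_rise(temps):
    """Junction-to-case temperature rise, equal to theta_jc * P in steady state."""
    return temps["junction"] - temps["case"]

rise_before = junction_to_case_rise(before)
rise_after = junction_to_case_rise(after)
case_drop = before["case"] - after["case"]

print(f"rise before: {rise_before:.1f} C, rise after: {rise_after:.1f} C")
print(f"case temperature reduction: {case_drop:.1f} C")
```

The rise is the same in both cases, which is what one would expect if the dissipated power and the package are unchanged and the redesign only improves the path from the case to the heat sink.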

28

Menzel, Johannes, Christian Plessl, and Tobias Kenter. "The Strong Scaling Advantage of FPGAs in HPC for N-body Simulations." ACM Transactions on Reconfigurable Technology and Systems 15, no. 1 (March 31, 2022): 1–30. http://dx.doi.org/10.1145/3491235.

Abstract:

N-body methods are one of the essential algorithmic building blocks of high-performance and parallel computing. Previous research has shown promising performance for implementing n-body simulations with pairwise force calculations on FPGAs. However, to avoid challenges with accumulation and memory access patterns, the presented designs calculate each pair of forces twice, along with both force sums of the involved particles. Also, they require large problem instances with hundreds of thousands of particles to reach their respective peak performance, limiting the applicability for strong scaling scenarios. This work addresses both issues by presenting a novel FPGA design that uses each calculated force twice and overlaps data transfers and computations in a way that reaches peak performance even for small problem instances, outperforming previous single precision results even in double precision, and scaling linearly over multiple interconnected FPGAs. For a comparison across architectures, we provide an equally optimized CPU reference, which for large problems actually achieves higher peak performance per device. However, given the strong scaling advantages of the FPGA design, in parallel setups with a few thousand particles per device, the FPGA platform achieves the highest performance and power efficiency.
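
Using each calculated force twice relies on Newton's third law: the force on particle j from particle i is the negative of the force on i from j. The sketch below is a plain software illustration of that idea for gravitational accelerations, with the gravitational constant set to 1 and an optional softening term as assumptions; it is not the FPGA design described in the paper.

```python
def accelerations(positions, masses, softening=0.0):
    """Gravitational accelerations (G = 1), evaluating each particle pair once."""
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Vector from particle i to particle j.
            d = [positions[j][k] - positions[i][k] for k in range(3)]
            r2 = sum(c * c for c in d) + softening
            inv_r3 = r2 ** -1.5
            for k in range(3):
                f = d[k] * inv_r3
                acc[i][k] += masses[j] * f  # pull on i towards j
                acc[j][k] -= masses[i] * f  # equal and opposite pull on j
    return acc

positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
masses = [1.0, 2.0]
print(accelerations(positions, masses))
```

The inner loop runs over n(n-1)/2 pairs instead of n(n-1), at the cost of two accumulations per pair, which is the accumulation pattern the abstract describes as challenging in hardware.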

29

Zhang, Xinyi, Yawen Wu, Peipei Zhou, Xulong Tang, and Jingtong Hu. "Algorithm-hardware Co-design of Attention Mechanism on FPGA Devices." ACM Transactions on Embedded Computing Systems 20, no. 5s (October 31, 2021): 1–24. http://dx.doi.org/10.1145/3477002.

Abstract:

Multi-head self-attention (attention mechanism) has been employed in a variety of fields such as machine translation, language modeling, and image processing due to its superiority in feature extraction and sequential data analysis. This benefits from the large number of parameters and the sophisticated model architecture behind the attention mechanism. To efficiently deploy the attention mechanism on resource-constrained devices, existing works propose to reduce the model size by building a customized smaller model or compressing a big standard model. A customized smaller model is usually optimized for a specific task and requires effort in model parameter exploration. Model compression reduces model size without hurting the robustness of the model architecture, and can be efficiently applied to different tasks. The compressed weights in the model are usually regularly shaped (e.g. rectangular) but the dimension sizes vary (e.g. they differ in rectangle height and width). Such a compressed attention mechanism can be efficiently deployed on CPU/GPU platforms, as their memory and computing resources can be flexibly assigned on demand. However, for Field Programmable Gate Arrays (FPGAs), the data buffer allocation and computing kernel are fixed at run time to achieve maximum energy efficiency. After compression, weights are much smaller and different in size, which leads to inefficient utilization of the FPGA on-chip buffer. Moreover, the different weight heights and widths may lead to inefficient FPGA computing kernel execution. Due to the large number of weights in the attention mechanism, building a unique buffer and computing kernel for each compressed weight on FPGA is not feasible. In this work, we jointly consider the compression impact on buffer allocation and the required computing kernel while compressing the attention mechanism. A novel structural pruning method with memory footprint awareness is proposed and the associated accelerator on FPGA is designed.
The experimental results show that our work can compress Transformer (an attention mechanism based model) by 95x. The developed accelerator can fully utilize the FPGA resource, processing the sparse attention mechanism with the run-time throughput performance of 1.87 Tops in ZCU102 FPGA.

30

Arayacheeppreecha, Pancheewa, Suree Pumrin, and Boonchuay Supmonchai. "1-D Integer Transform for HEVC Encoder Using DSP Slices on FPGA." Applied Mechanics and Materials 781 (August 2015): 151–54. http://dx.doi.org/10.4028/www.scientific.net/amm.781.151.

Abstract:

This paper presents an FPGA architecture for the 1-D integer transform of the latest video coding standard, High Efficiency Video Coding (HEVC). The design employs hard multipliers in dedicated DSP slices, which are already embedded into an FPGA die, to gain high throughput and save general-purpose LUTs. The proposed architecture supports 4x4, 8x8, 16x16, and 32x32 transforms. A multiplier sharing scheme is introduced to reduce the total number of required DSP slices so that the design fits onto a Spartan-3A FPGA. The design can reach a maximum throughput of 1,692 Msamples/s irrespective of the transform size, which is enough to encode 8K (7680x4320) videos at 30 fps. This work is pioneering research that utilizes the dedicated multipliers on FPGAs in the design of the HEVC transform.
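
As a rough illustration of what a 1-D integer transform computes, the sketch below applies the 4-point HEVC core transform using the well-known even/odd (partial butterfly) decomposition, which needs 6 multiplications instead of 16. Intermediate scaling and rounding shifts are omitted, and this is not the paper's multiplier sharing scheme. The throughput check assumes 4:2:0 chroma subsampling (1.5 samples per pixel), which the abstract does not state.

```python
# 4-point HEVC core transform matrix (integer approximation of the DCT).
HEVC_4 = [
    [64, 64, 64, 64],
    [83, 36, -36, -83],
    [64, -64, -64, 64],
    [36, -83, 83, -36],
]

def transform4_direct(x):
    """Reference: plain matrix-vector product (16 multiplications)."""
    return [sum(c * v for c, v in zip(row, x)) for row in HEVC_4]

def transform4_butterfly(x):
    """Same result via even/odd decomposition (6 multiplications)."""
    e0, e1 = x[0] + x[3], x[1] + x[2]  # even part
    o0, o1 = x[0] - x[3], x[1] - x[2]  # odd part
    return [64 * (e0 + e1), 83 * o0 + 36 * o1, 64 * (e0 - e1), 36 * o0 - 83 * o1]

residual = [10, -3, 7, 2]  # made-up input samples
print(transform4_butterfly(residual))

# Required sample rate for 8K at 30 fps, assuming 4:2:0 subsampling.
required_samples_per_s = 7680 * 4320 * 30 * 3 // 2
print(required_samples_per_s / 1e6, "Msamples/s")
```

Under the 4:2:0 assumption the required rate is about 1,493 Msamples/s, which is below the 1,692 Msamples/s quoted in the abstract.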

31

Guo, Shuaizhi, Tianqi Wang, Linfeng Tao, Teng Tian, Zikun Xiang, and Xi Jin. "RP-Ring: A Heterogeneous Multi-FPGA Accelerator." International Journal of Reconfigurable Computing 2018 (2018): 1–14. http://dx.doi.org/10.1155/2018/6784319.

Abstract:

To reduce the cost of designing new specialized FPGA boards as a direct-summation MOND (Modified Newtonian Dynamics) simulator, we propose a new heterogeneous architecture built from existing FPGA boards, which is called RP-ring (reconfigurable processor ring). This design can be expanded conveniently with any available FPGA board and requires only low communication bandwidth between FPGA boards. The communication protocol is simple and can be implemented with limited hardware/software resources. In order to avoid overall performance loss caused by the slowest board, we build a mathematical model to decompose the workload among FPGAs. The division of workload is based on the logic resources, memory access bandwidth, and communication bandwidth of each FPGA chip. Our accelerator can achieve a speedup of two orders of magnitude compared with a CPU implementation.

32

Provelengios, George, Daniel Holcomb, and Russell Tessier. "Mitigating Voltage Attacks in Multi-Tenant FPGAs." ACM Transactions on Reconfigurable Technology and Systems 14, no. 2 (July 29, 2021): 1–24. http://dx.doi.org/10.1145/3451236.

Abstract:

Recent research has exposed a number of security issues related to the use of FPGAs in embedded system and cloud computing environments. Circuits that deliberately waste power can be carefully crafted by a malicious cloud FPGA user and deployed to cause denial-of-service and fault injection attacks. The main defense strategy used by FPGA cloud services involves checking user-submitted designs for circuit structures that are known to aggressively consume power. Unfortunately, this approach is limited by an attacker’s ability to conceive new designs that defeat existing checkers. In this work, our contributions are twofold. We evaluate a variety of circuit power wasting techniques that typically are not flagged by design rule checks imposed by FPGA cloud computing vendors. The efficiencies of five power wasting circuits, including our new design, are evaluated in terms of power consumed per logic resource. We then show that the source of voltage attacks based on power wasters can be identified. Our monitoring approach localizes the attack and suppresses the clock signal for the target region within 21 μs, which is fast enough to stop an attack before it causes a board reset. All experiments are performed using a state-of-the-art Intel Stratix 10 FPGA.

33

JOHNSTON, S. P., G. PRASAD, L. MAGUIRE, and T. M. MCGINNITY. "AN FPGA HARDWARE/SOFTWARE CO-DESIGN TOWARDS EVOLVABLE SPIKING NEURAL NETWORKS FOR ROBOTICS APPLICATION." International Journal of Neural Systems 20, no. 06 (December 2010): 447–61. http://dx.doi.org/10.1142/s0129065710002541.

Abstract:

This paper presents an approach that permits the effective hardware realization of a novel Evolvable Spiking Neural Network (ESNN) paradigm on Field Programmable Gate Arrays (FPGAs). The ESNN possesses a hybrid learning algorithm that consists of a Spike Timing Dependent Plasticity (STDP) mechanism fused with a Genetic Algorithm (GA). The design and implementation direction utilizes the latest advancements in FPGA technology to provide a partitioned hardware/software co-design solution. The approach achieves the maximum FPGA flexibility obtainable for the ESNN paradigm. The algorithm was applied as an embedded intelligent system robotic controller to solve an autonomous navigation and obstacle avoidance problem.

34

Sankar, Deepa, Lakshmi Syamala, Babu Chembathu Ayyappan, and Mathew Kallarackal. "FPGA-Based Cost-Effective and Resource Optimized Solution of Predictive Direct Current Control for Power Converters." Energies 14, no. 22 (November 16, 2021): 7669. http://dx.doi.org/10.3390/en14227669.

Abstract:

Recent advances in power converter applications with highly demanding control goals require the efficient implementation of superior control strategies. However, the real-time application of such control strategies demands high computational power that necessitates efficient digital controllers like field programmable gate array (FPGA). The inherent parallelism offered by FPGAs minimizes the execution time and exhibits an excellent cost-performance trade-off. In addition, rapid advancements in FPGA technology with a broad portfolio of intellectual property (IP) cores, design tools, and robust embedded processors resulted in a design paradigm shift. This article proposes a low-cost solution for the resource-optimized implementation of dynamic, highly accurate, and computationally intensive finite state-predictive direct current control (FS-PDCC). The challenges for implementing complex control algorithms for power converters are discussed in detail, and the control is implemented in Intel’s low-cost non-volatile FPGA-MAX®10. An efficient design methodology using finite state machine (FSM) is adopted to achieve time/resource-efficient implementation. The parallel and pipelined architecture of FPGA provides better resource utilization with high execution speed. The experimental results prove the efficiency of FPGA-based cost-effective solutions that offer superior performance with better output quality.

35

Zhou, Zhimei, Yong Wan, Yin Liu, Xiaoyan Guo, Qilin Yin, and Chen Feng. "The advancement of cluster based FPGA place & route technic." MATEC Web of Conferences 309 (2020): 01014. http://dx.doi.org/10.1051/matecconf/202030901014.

Abstract:

As one of the core components of electronic hardware systems, Field Programmable Gate Array (FPGA) device design technology continues to advance under the guidance of electronic information technology policies and has made a huge contribution to information technology applications. However, with the advancement of chip technology and the continuous upgrading of information technology, the functions that FPGAs need to perform are becoming more and more complicated. How to perform layout design efficiently and make full use of chip resources has become an important problem to be solved and optimized in FPGA design. The FPGA itself is not limited to a specific function. It contains internal resources such as memory, protocol modules, clock modules, high-speed interface modules, and digital signal processing blocks. By programming its programmable logic units and interconnects, a blank FPGA device can be turned into a high-performance system application with complex functions. The layout and routing technology based on cluster logic unit blocks can combine the above resources to give full play to their performance advantages, and its importance is self-evident. Starting from the traditional FPGA implementation flow, this paper analyzes several advantages of layout and routing technology based on cluster logic blocks and summarizes the corresponding design method and flow.

36

Fu, Qing Qing, and Zheng Bin Liang. "The Design of Textile Image Processing System." Applied Mechanics and Materials 148-149 (December 2011): 250–53. http://dx.doi.org/10.4028/www.scientific.net/amm.148-149.250.

Abstract:

To address the drawbacks of high cost, complicated circuitry, and inadequate use of resources in DSP-plus-FPGA structures for textile image processing, an image processing system based on Nios II in an FPGA is designed. The FPGA is the core of the system. A Nios II processor is created in the FPGA. The video image is acquired by a CCD and processed in the FPGA. The results show that the system is characterized by small size, low cost, high integration, high stability, and flexibility.

37

Akkar, Hanan A. R., and Huthaifa Salman Khairy. "Design of a Field Programmable Gate Array for Swarm Intelligent Controller Based on a Portable Robotic System." Journal of Cases on Information Technology 23, no. 2 (April 2021): 65–75. http://dx.doi.org/10.4018/jcit.20210401.oa6.

Abstract:

Portable robots are considered important devices in many areas, such as medical applications, space research, and emergency situations. The robots complete tasks efficiently and effectively without any human interaction. The most important advantages of portable robots are their small size and very high speed in problem processing with relatively high accuracy and efficiency compared with constant devices. In this paper, the authors discuss the applications of robot systems based on a swarm intelligent controller and a field programmable gate array (FPGA). A component-oriented FPGA design platform is proposed for robot system integration because FPGAs are known to be power-efficient hardware platforms. From the results, they found that FPGA and swarm intelligence are very efficient in robotic systems and are used in a wide area of applications.

38

Fang, Jian, Yvo T. B. Mulder, Jan Hidders, Jinho Lee, and H. Peter Hofstee. "In-memory database acceleration on FPGAs: a survey." VLDB Journal 29, no. 1 (October 26, 2019): 33–59. http://dx.doi.org/10.1007/s00778-019-00581-w.

Abstract:

While FPGAs have seen prior use in database systems, in recent years interest in using FPGA to accelerate databases has declined in both industry and academia for the following three reasons. First, specifically for in-memory databases, FPGAs integrated with conventional I/O provide insufficient bandwidth, limiting performance. Second, GPUs, which can also provide high throughput, and are easier to program, have emerged as a strong accelerator alternative. Third, programming FPGAs required developers to have full-stack skills, from high-level algorithm design to low-level circuit implementations. The good news is that these challenges are being addressed. New interface technologies connect FPGAs into the system at main-memory bandwidth and the latest FPGAs provide local memory competitive in capacity and bandwidth with GPUs. Ease of programming is improving through support of shared coherent virtual memory between the host and the accelerator, support for higher-level languages, and domain-specific tools to generate FPGA designs automatically. Therefore, this paper surveys using FPGAs to accelerate in-memory database systems targeting designs that can operate at the speed of main memory.

39

Chen, Deming, Jason Cong, and Peichen Pan. "FPGA Design Automation: A Survey." Foundations and Trends® in Electronic Design Automation 1, no. 3 (2006): 195–334. http://dx.doi.org/10.1561/1000000003.

40

Shand, D. "Tools galore [FPGA design tools]." Electronics Systems and Software 3, no. 3 (June 1, 2005): 30–33. http://dx.doi.org/10.1049/ess:20050304.

41

Chen, X. T., W. K. Huang, N. Park, F. J. Meyer, and F. Lombardi. "Design verification of FPGA implementations." IEEE Design & Test of Computers 16, no. 2 (1999): 66–73. http://dx.doi.org/10.1109/54.765205.

42

Al_Dujaili, Mohammed Jawad, and Aws Majeed Al_Awadi. "Chirplet signal design by FPGA." International Journal of Electrical and Computer Engineering (IJECE) 11, no. 3 (June 1, 2021): 2120. http://dx.doi.org/10.11591/ijece.v11i3.pp2120-2127.

Abstract:

The ever-expanding growth of the electronics and communications industries presents new challenges for researchers. One of these challenges is the generation of a signal with the required bandwidth over a specific time frame, which is used in a variety of contexts, particularly radar systems. To improve the range resolution in the radar along with better SNR, it is necessary to reduce the signal bandwidth and increase the peak power. There are some restrictions for narrowband signals, such as power limitation, pulse shaping, and the production of unwanted harmonics. As a solution, pulse compression techniques are suggested. Pulse compression is a process that modulates the transmitted pulse to achieve a wideband signal; at the receiver, the received signal is then correlated with the transmitted pulse to achieve narrowband representations of the data. Chirp is the most common signal used in pulse compression. The chirp signal is produced using linear frequency modulation. In this study, we attempted to add an amplitude modulation to the chirp signal and evaluate its performance by implementation on an FPGA. The resulting signal is called a chirplet, and simulation will show that it enhances target detection and image quality in imaging radars such as SAR.
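
Pulse compression as described here (a frequency-modulated pulse correlated against itself at the receiver) can be sketched in a few lines. The example below builds a linear-FM chirp, applies a Gaussian envelope as one possible form of the amplitude modulation (an assumption, since the abstract does not specify the envelope), and locates a delayed echo with a matched filter. All parameter values are made up for illustration.

```python
import cmath
import math

def chirp(n, f0, f1, fs):
    """Complex linear-FM chirp sweeping f0..f1 Hz over n samples at sample rate fs."""
    rate = (f1 - f0) / (n / fs)  # sweep rate in Hz per second
    return [cmath.exp(2j * math.pi * (f0 * t + 0.5 * rate * t * t))
            for t in (i / fs for i in range(n))]

def chirplet(n, f0, f1, fs, sigma_frac=0.25):
    """Chirp multiplied by a Gaussian envelope (assumed amplitude modulation)."""
    mid, sigma = (n - 1) / 2, sigma_frac * n
    return [math.exp(-0.5 * ((i - mid) / sigma) ** 2) * s
            for i, s in enumerate(chirp(n, f0, f1, fs))]

def matched_filter(rx, ref):
    """Magnitude of the correlation of rx with ref at each non-negative lag."""
    m = len(ref)
    return [abs(sum(rx[lag + j] * ref[j].conjugate() for j in range(m)))
            for lag in range(len(rx) - m + 1)]

ref = chirplet(64, 0.0, 200.0, 1000.0)
delay = 40  # echo delay in samples (made up)
rx = [0j] * delay + ref + [0j] * 30
out = matched_filter(rx, ref)
peak_lag = max(range(len(out)), key=out.__getitem__)
print("estimated delay:", peak_lag)
```

The correlation peak appears at the lag where the echo aligns with the reference, and its height equals the energy of the reference pulse.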

43

Cho, Mannhee, and Youngmin Kim. "FPGA-Based Convolutional Neural Network Accelerator with Resource-Optimized Approximate Multiply-Accumulate Unit." Electronics 10, no. 22 (November 19, 2021): 2859. http://dx.doi.org/10.3390/electronics10222859.

Abstract:

Convolutional neural networks (CNNs) are widely used in modern applications for their versatility and high classification accuracy. Field-programmable gate arrays (FPGAs) are considered to be suitable platforms for CNNs based on their high performance, rapid development, and reconfigurability. Although many studies have proposed methods for implementing high-performance CNN accelerators on FPGAs using optimized data types and algorithm transformations, accelerators can be optimized further by investigating more efficient uses of FPGA resources. In this paper, we propose an FPGA-based CNN accelerator using multiple approximate accumulation units based on a fixed-point data type. We implemented the LeNet-5 CNN architecture, which performs classification of handwritten digits using the MNIST handwritten digit dataset. The proposed accelerator was implemented, using a high-level synthesis tool on a Xilinx FPGA. The proposed accelerator applies an optimized fixed-point data type and loop parallelization to improve performance. Approximate operation units are implemented using FPGA logic resources instead of high-precision digital signal processing (DSP) blocks, which are inefficient for low-precision data. Our accelerator model achieves 66% less memory usage and approximately 50% reduced network latency, compared to a floating point design and its resource utilization is optimized to use 78% fewer DSP blocks, compared to general fixed-point designs.

44

Lei, Neng Fang. "The Design of Digital Communication System Based on FPGA." Applied Mechanics and Materials 687-691 (November 2014): 3093–96. http://dx.doi.org/10.4028/www.scientific.net/amm.687-691.3093.

Abstract:

The paper investigates the Multi-FPGA systems structure, including interconnect and configuration structure of the systems. After analyzing several common FPGA systems interconnection structure, we present its features. The paper analyses the feasibility and technical advantages for FPGA systems to use in digital array radar receiver design, and discusses their specific implementation, and designs a set of IF receiver with twenty reception elements.

45

Sun, Qi, Tinghuan Chen, Siting Liu, Jianli Chen, Hao Yu, and Bei Yu. "Correlated Multi-objective Multi-fidelity Optimization for HLS Directives Design." ACM Transactions on Design Automation of Electronic Systems 27, no. 4 (July 31, 2022): 1–27. http://dx.doi.org/10.1145/3503540.

Abstract:

High-level synthesis (HLS) tools have gained great attention in recent years because they free engineers from complicated and laborious hardware description language writing and facilitate the implementation of modern applications (e.g., deep learning models) on Field-programmable Gate Arrays (FPGAs) by using high-level languages and HLS directives. However, finding good HLS directives is challenging, due to the time-consuming design processes, the balances among different design objectives, and the diverse fidelities (accuracies of data) of the performance values between the consecutive FPGA design stages. To find good HLS directives, a novel automatic optimization algorithm is proposed to explore the Pareto designs of the multiple objectives while making full use of the data with different fidelities from different FPGA design stages. Firstly, a non-linear Gaussian process (GP) is proposed to model the relationships among the different FPGA design stages. Secondly, for the first time, the GP model is enhanced as correlated GP (CGP) by considering the correlations between the multiple design objectives, to find better Pareto designs. Furthermore, we extend our model to a deep version, deep CGP (DCGP), by using a deep neural network to improve the kernel functions in Gaussian process models, to improve the characterization capability of the models, and to learn better feature representations. We test our design method on some public benchmarks (including general matrix multiplication and sparse matrix-vector multiplication) and the deep learning-based object detection model iSmart2 on FPGA. Experimental results show that our methods outperform the baselines significantly and facilitate the deep learning designs on FPGA.
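
The notion of Pareto designs used in this abstract can be made concrete with a small filter. The sketch below keeps the design points that no other point beats on every objective, assuming all objectives are minimized; the (latency, resource) pairs are made-up numbers, and the Gaussian-process modelling from the paper is not reproduced here.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Return the points that are not dominated by any other point (all objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (latency, resource usage) results for five directive settings.
designs = [(100, 20), (80, 35), (120, 15), (90, 40), (110, 25)]
print(pareto_front(designs))
```

In this made-up set, (90, 40) is dominated by (80, 35) and (110, 25) by (100, 20), so three trade-off points remain.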

46

Jang, Seojin, Wei Liu, Sangun Park, and Yongbeom Cho. "Automatic RTL Generation Tool of FPGAs for DNNs." Electronics 11, no. 3 (January 28, 2022): 402. http://dx.doi.org/10.3390/electronics11030402.

Abstract:

With the increasing use of multi-purpose artificial intelligence of things (AIOT) devices, embedded field-programmable gate arrays (FPGA) represent excellent platforms for deep neural network (DNN) acceleration on edge devices. FPGAs possess the advantages of low latency and high energy efficiency, but the scarcity of FPGA development resources challenges the deployment of DNN-based edge devices. Register-transfer level programming, hardware verification, and precise resource allocation are needed to build a high-performance FPGA accelerator for DNNs. These tasks present a challenge and are time consuming for even experienced hardware developers. Therefore, we propose an automated, collaborative design process employing an automatic design space exploration tool; an automatic DNN engine enables the tool to reshape and parse a DNN model from software to hardware. We also introduce a long short-term memory (LSTM)-based model to predict performance and generate a DNN model that suits the developer requirements automatically. We demonstrate our design scheme with three FPGAs: a zcu104, a zcu102, and a Cyclone V SoC (system on chip). The results show that our hardware-based edge accelerator exhibits superior throughput compared with the most advanced edge graphics processing unit.

47

Quang, Nguyen Khanh, and Nguyen Ho Quang. "FPGA Technology and Sequential Finite State Machine Method." Hue University Journal of Science: Natural Science 127, no. 1D (December 10, 2018): 55. http://dx.doi.org/10.26459/hueuni-jns.v127i1d.5073.

Abstract:

The implementation of complex control algorithms on an FPGA (field programmable gate array) is still at a basic level. There is no fixed method to develop algorithms on these devices because of their general characteristics. Therefore, design engineers are still looking for good approaches to optimize the implementation of algorithms on FPGAs [1-7]. This paper presents and demonstrates a sequential finite state machine design method that can solve the issue of optimal usage of the limited resources on an FPGA.

48

Papaphilippou, Philippos, Jiuxi Meng, Nadeen Gebara, and Wayne Luk. "Hipernetch: High-Performance FPGA Network Switch." ACM Transactions on Reconfigurable Technology and Systems 15, no. 1 (March 31, 2022): 1–31. http://dx.doi.org/10.1145/3477054.

Abstract:

We present Hipernetch, a novel FPGA-based design for performing high-bandwidth network switching. FPGAs have recently become more popular in data centers due to their promising capabilities for a wide range of applications. With the recent surge in transceiver bandwidth, they could further benefit the implementation and refinement of network switches used in data centers. Hipernetch replaces the crossbar with a “combined parallel round-robin arbiter”. Unlike a crossbar, the combined parallel round-robin arbiter is easy to pipeline, and does not require centralised iterative scheduling algorithms that try to fit too many steps in a single or a few FPGA cycles. The result is a network switch implementation on FPGAs operating at a high frequency and with a low port-to-port latency. Our proposed Hipernetch architecture additionally provides a competitive switching performance approaching output-queued crossbar switches. Our implemented Hipernetch designs exhibit a throughput that exceeds 100 Gbps per port for switches of up to 16 ports, reaching an aggregate throughput of around 1.7 Tbps.
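For readers unfamiliar with the term, the basic idea of round-robin arbitration mentioned in this abstract can be sketched in a few lines of Python. This is a minimal software model of a plain single round-robin arbiter, not the paper's "combined parallel round-robin arbiter"; the function name `round_robin_grant` and its parameters are assumptions made for illustration only.

```python
def round_robin_grant(requests, last_grant):
    """Pick the next requester after `last_grant`, wrapping around.

    requests:   list of booleans, one per input port (True = port is requesting).
    last_grant: index granted in the previous cycle, or -1 if none yet.
    Returns the granted index, or None if no port is requesting.
    """
    n = len(requests)
    # Scan ports starting just after the previous winner so that
    # every requesting port is served within n grants.
    for offset in range(1, n + 1):
        idx = (last_grant + offset) % n
        if requests[idx]:
            return idx
    return None


if __name__ == "__main__":
    # Hypothetical 4-port example: ports 0, 2 and 3 request continuously.
    requests = [True, False, True, True]
    last = -1
    for cycle in range(4):
        last = round_robin_grant(requests, last)
        print(f"cycle {cycle}: grant port {last}")
```

Rotating the search start past the previous winner is what gives each requester a bounded wait, which is the fairness property a round-robin arbiter is chosen for.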

49

Cao, Da, Shun Xiang Wu, and Long Jiang Su. "I²C-Bus Design Based on FPGA." Advanced Materials Research 179-180 (January 2011): 528–33. http://dx.doi.org/10.4028/www.scientific.net/amr.179-180.528.

Abstract:

This paper introduces an interface design between a field-programmable gate array (FPGA) and I2C-bus devices. Programmed in VHDL, the design uses general-purpose FPGA I/O ports to generate the I2C bus interface signal timing, achieving data communication between the FPGA and I2C-bus devices. The design is verified through simulation, and an application example is given: a hardware design connecting the FPGA to the I2C-bus EEPROM chip AT24C02.

50

Goswami, Pingakshya, and Dinesh Bhatia. "Congestion Prediction in FPGA Using Regression Based Learning Methods." Electronics 10, no. 16 (August 18, 2021): 1995. http://dx.doi.org/10.3390/electronics10161995.

Abstract:

Design closure in general VLSI physical design flows and FPGA physical design flows is an important and time-consuming problem. Routing itself can consume as much as 70% of the total design time. Accurate congestion estimation during the early stages of the design flow can help alleviate last-minute routing-related surprises. This paper describes a methodology for a post-placement, machine learning-based routing congestion prediction model for FPGAs. Routing congestion is modeled as a regression problem. We describe the methods for generating training data, feature extraction, training, regression models, validation, and deployment. We tested our prediction model using the ISPD 2016 FPGA benchmarks. Our prediction method reports a very accurate localized congestion value in each channel around a configurable logic block (CLB). The localized congestion is predicted in both the vertical and horizontal directions. We demonstrate the effectiveness of our model on completely unseen designs that are not part of the training data set. The generated results show significant improvement in accuracy, measured as mean absolute error, and in prediction time when compared against the latest state-of-the-art works.
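As a rough illustration of what "modeled as a regression problem" and "measured as mean absolute error" mean here, the sketch below fits a one-feature least-squares line and scores it with MAE. It is not the authors' model or feature set: the feature (a hypothetical per-channel pin density), the congestion values, and the function names are all invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Made-up training data: hypothetical pin density -> congestion value.
    train_x = [0.1, 0.2, 0.4, 0.5, 0.8]
    train_y = [0.15, 0.22, 0.45, 0.48, 0.85]
    slope, intercept = fit_line(train_x, train_y)

    # Made-up "unseen" samples used only to score the fitted line.
    test_x = [0.3, 0.6]
    test_y = [0.33, 0.62]
    predictions = [slope * x + intercept for x in test_x]
    print("MAE on held-out samples:", mean_absolute_error(test_y, predictions))
```

Scoring on samples that were not used for fitting mirrors the abstract's point about evaluating on unseen designs.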
