To view other types of publications on this topic, follow the link: FPGA resources.

Journal articles on the topic "FPGA resources"

Format your source in APA, MLA, Chicago, Harvard, and other styles

Select a source type:

Consult the top 50 journal articles for your research on the topic "FPGA resources."

Next to every entry in the list of references you will find an "Add to bibliography" button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scholarly publication in .pdf format and read the online abstract of the work, if these are available in the metadata.

Browse journal articles across a wide range of disciplines and compile your bibliography correctly.

1

Caffarena, Gabriel, Juan A. López, Gerardo Leyva, Carlos Carreras, and Octavio Nieto-Taladriz. "Architectural Synthesis of Fixed-Point DSP Datapaths Using FPGAs." International Journal of Reconfigurable Computing 2009 (2009): 1–14. http://dx.doi.org/10.1155/2009/703267.

Full text of the source
Abstract:
We address the automatic synthesis of DSP algorithms using FPGAs. Optimized fixed-point implementations are obtained by considering (i) a multiple wordlength approach; (ii) a complete datapath formed of wordlength-wise resources (i.e., functional units, multiplexers, and registers); (iii) an FPGA-wise resource usage metric that enables an efficient distribution of logic fabric and embedded DSP resources. The paper shows (i) the benefits of applying a multiple wordlength approach to the implementation of fixed-point datapaths and (ii) the benefits of a judicious use of embedded FPGA resources. The use of a complete fixed-point datapath leads to improvements of up to 35%, and the careful mapping of operations to FPGA resources (logic fabric and embedded blocks), enabled by the proposed resource usage metric, leads to improvements of up to 54%.
Styles: APA, Harvard, Vancouver, ISO, etc.
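To make the multiple-wordlength idea in the entry above concrete, here is a minimal Python sketch of per-signal fixed-point quantization, where each signal in a small FIR datapath gets its own (integer bits, fraction bits) format. The formats and the three-tap example are illustrative assumptions, not the datapath or the resource usage metric used in the paper.

```python
def quantize(x, int_bits, frac_bits):
    """Round x to a signed fixed-point format with the given integer/fraction bits."""
    scale = 1 << frac_bits
    lo = -(1 << (int_bits + frac_bits - 1))      # most negative representable code
    hi = (1 << (int_bits + frac_bits - 1)) - 1   # most positive representable code
    code = max(lo, min(hi, round(x * scale)))    # round and saturate
    return code / scale

# Multiple-wordlength FIR tap: each signal carries its own format (illustrative values).
coeffs   = [quantize(c, 2, 10) for c in (0.25, 0.5, 0.25)]            # coefficients: 12-bit
samples  = [quantize(s, 2, 6)  for s in (0.7, -0.3, 0.1)]             # inputs: 8-bit
products = [quantize(c * s, 3, 12) for c, s in zip(coeffs, samples)]  # products: 15-bit
result   = quantize(sum(products), 4, 12)                             # accumulator: 16-bit
print(result)
```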
2

Guo, Shuaizhi, Tianqi Wang, Linfeng Tao, Teng Tian, Zikun Xiang, and Xi Jin. "RP-Ring: A Heterogeneous Multi-FPGA Accelerator." International Journal of Reconfigurable Computing 2018 (2018): 1–14. http://dx.doi.org/10.1155/2018/6784319.

Full text of the source
Abstract:
To reduce the cost of designing new specialized FPGA boards as a direct-summation MOND (Modified Newtonian Dynamics) simulator, we propose a new heterogeneous architecture built from existing FPGA boards, called RP-ring (reconfigurable processor ring). This design can be expanded conveniently with any available FPGA board and requires only quite low communication bandwidth between FPGA boards. The communication protocol is simple and can be implemented with limited hardware/software resources. In order to avoid overall performance loss caused by the slowest board, we build a mathematical model to decompose the workload among FPGAs. The workload division is based on the logic resources, memory access bandwidth, and communication bandwidth of each FPGA chip. Our accelerator can achieve a two-orders-of-magnitude speedup compared with a CPU implementation.
Styles: APA, Harvard, Vancouver, ISO, etc.
3

Liu, Huiqun, Kai Zhu, and D. F. Wong. "FPGA Partitioning with Complex Resource Constraints." VLSI Design 11, no. 3 (January 1, 2000): 219–35. http://dx.doi.org/10.1155/2000/12198.

Full text of the source
Abstract:
In this paper, we present an algorithm for circuit partitioning with complex resource constraints in large FPGAs. Traditional partitioning methods estimate the capacity of an FPGA device by counting the number of logic blocks; however, this is no longer accurate given the increasingly diverse resource types in new FPGA architectures. We first propose a network-flow-based method to optimally check whether a circuit or subcircuit is feasible for a set of available heterogeneous resources. The feasibility-checking procedure is then integrated into an FM-based algorithm for circuit partitioning. An incremental flow technique is employed for efficient implementation. Experimental results on the MCNC benchmark circuits show that our partitioning algorithm not only yields good results but is also efficient. Our algorithm for partitioning with complex resource constraints is applicable both to multiple-FPGA designs (e.g., logic emulation systems) and to partitioning-based placement algorithms for a single large hierarchical FPGA (e.g., Actel's ES6500 FPGA family).
Styles: APA, Harvard, Vancouver, ISO, etc.
4

Ullah, Anees, Ali Zahir, Noaman A. Khan, Waleed Ahmad, Alexis Ramos, and Pedro Reviriego. "BPR-TCAM—Block and Partial Reconfiguration based TCAM on Xilinx FPGAs." Electronics 9, no. 2 (February 19, 2020): 353. http://dx.doi.org/10.3390/electronics9020353.

Full text of the source
Abstract:
Field Programmable Gate Array (FPGA) based Ternary Content Addressable Memories (TCAMs) are widely used in high-speed networking applications. However, TCAMs are not present on state-of-the-art FPGAs and need to be emulated on SRAM-based memories (i.e., LUTRAMs and Block RAMs), which requires a large amount of FPGA resources. In this paper, we present an efficient methodology to implement FPGA-based TCAMs with significant resource savings compared to existing schemes. The proposed methodology exploits the fracturable nature of Look Up Tables (LUTs) and the built-in slice carry chains for simultaneous mapping of two rules and their matching logic to a single FPGA slice. Multiple slices can be stacked together to build deeper and wider TCAMs in a modular way. The combination of all these techniques results in significant savings in resource utilization compared to existing approaches.
Styles: APA, Harvard, Vancouver, ISO, etc.
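As a behavioral reference for what BPR-TCAM emulates in hardware, the sketch below models a ternary match in Python: each rule is a (value, mask) pair in which mask bits set to 0 mean "don't care", and the lookup returns the index of the first (highest-priority) matching rule. The encoding and priority convention are generic TCAM behavior and are assumptions here; the paper's contribution is the slice-level mapping, which this model does not capture.

```python
def tcam_lookup(rules, key):
    """Return the index of the first rule whose cared-about bits match the key."""
    for idx, (value, mask) in enumerate(rules):
        if (key & mask) == (value & mask):   # compare only the bits the mask cares about
            return idx
    return None  # no rule matched

# Example with 8-bit keys; mask 0xF0 means "match the upper nibble only".
rules = [
    (0b1010_0000, 0b1111_0000),  # rule 0: key must start with 1010
    (0b0000_0001, 0b0000_0001),  # rule 1: any odd key
]
print(tcam_lookup(rules, 0b1010_0111))  # -> 0
print(tcam_lookup(rules, 0b0100_0011))  # -> 1
```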
5

Cho, Mannhee, and Youngmin Kim. "FPGA-Based Convolutional Neural Network Accelerator with Resource-Optimized Approximate Multiply-Accumulate Unit." Electronics 10, no. 22 (November 19, 2021): 2859. http://dx.doi.org/10.3390/electronics10222859.

Full text of the source
Abstract:
Convolutional neural networks (CNNs) are widely used in modern applications for their versatility and high classification accuracy. Field-programmable gate arrays (FPGAs) are considered to be suitable platforms for CNNs based on their high performance, rapid development, and reconfigurability. Although many studies have proposed methods for implementing high-performance CNN accelerators on FPGAs using optimized data types and algorithm transformations, accelerators can be optimized further by investigating more efficient uses of FPGA resources. In this paper, we propose an FPGA-based CNN accelerator using multiple approximate accumulation units based on a fixed-point data type. We implemented the LeNet-5 CNN architecture, which performs classification of handwritten digits using the MNIST handwritten digit dataset. The proposed accelerator was implemented, using a high-level synthesis tool on a Xilinx FPGA. The proposed accelerator applies an optimized fixed-point data type and loop parallelization to improve performance. Approximate operation units are implemented using FPGA logic resources instead of high-precision digital signal processing (DSP) blocks, which are inefficient for low-precision data. Our accelerator model achieves 66% less memory usage and approximately 50% reduced network latency, compared to a floating point design and its resource utilization is optimized to use 78% fewer DSP blocks, compared to general fixed-point designs.
Styles: APA, Harvard, Vancouver, ISO, etc.
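The accelerator above replaces DSP-block MACs with approximate units built from logic. One common flavor of approximation is to truncate low-order product bits before accumulation; the Python sketch below shows that generic idea only, and the truncation scheme is an assumption rather than the exact unit proposed in the paper.

```python
def approx_mac(acc, a, b, drop_bits=4):
    """Accumulate a*b after truncating the lowest drop_bits of the product.

    a, b, acc are integers standing for fixed-point values; the truncation models
    a cheaper, logic-only multiplier that ignores low-order partial products.
    """
    product = a * b
    truncated = (product >> drop_bits) << drop_bits  # zero out the low-order bits
    return acc + truncated

pairs = [(57, 23), (-12, 88), (101, -5)]
exact = sum(a * b for a, b in pairs)
approx = 0
for a, b in pairs:
    approx = approx_mac(approx, a, b)
print(exact, approx, exact - approx)  # small error in exchange for cheaper hardware
```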
6

Alonso, Tobias, Lucian Petrica, Mario Ruiz, Jakoba Petri-Koenig, Yaman Umuroglu, Ioannis Stamelos, Elias Koromilas, Michaela Blott, and Kees Vissers. "Elastic-DF: Scaling Performance of DNN Inference in FPGA Clouds through Automatic Partitioning." ACM Transactions on Reconfigurable Technology and Systems 15, no. 2 (June 30, 2022): 1–34. http://dx.doi.org/10.1145/3470567.

Full text of the source
Abstract:
Customized compute acceleration in the datacenter is key to the wider roll-out of applications based on deep neural network (DNN) inference. In this article, we investigate how to maximize the performance and scalability of field-programmable gate array (FPGA)-based pipeline dataflow DNN inference accelerators (DFAs) automatically on computing infrastructures consisting of multi-die, network-connected FPGAs. We present Elastic-DF, a novel resource partitioning tool and associated FPGA runtime infrastructure that integrates with the DNN compiler FINN. Elastic-DF allocates FPGA resources to DNN layers and layers to individual FPGA dies to maximize the total performance of the multi-FPGA system. In the resulting Elastic-DF mapping, the accelerator may be instantiated multiple times, and each instance may be segmented across multiple FPGAs transparently, whereby the segments communicate peer-to-peer through 100 Gbps Ethernet FPGA infrastructure, without host involvement. When applied to ResNet-50, Elastic-DF provides a 44% latency decrease on Alveo U280. For MobileNetV1 on Alveo U200 and U280, Elastic-DF enables a 78% throughput increase, eliminating the performance difference between these cards and the larger Alveo U250. Elastic-DF also increases operating frequency in all our experiments, on average by over 20%. Elastic-DF therefore increases performance portability between different sizes of FPGA and increases the critical throughput per cost metric of datacenter inference.
Styles: APA, Harvard, Vancouver, ISO, etc.
7

Wang, Gui Tang, Rui Huang Wang, Feng Wang, and Wen Juan Liu. "An Implementation and Improvement of Fast Two-Dimensional Median Filtering." Applied Mechanics and Materials 55-57 (May 2011): 95–100. http://dx.doi.org/10.4028/www.scientific.net/amm.55-57.95.

Full text of the source
Abstract:
This paper discusses a conventional fast median filtering algorithm for FPGA implementation. An improved approach, a quasi-median filtering algorithm, is proposed to reduce the occupancy of FPGA resources while preserving the quality of median filtering. Through detailed analysis and comparison of simulation and experimental results, we conclude that the improvement achieves good filtering results while reducing FPGA resource utilization. This is of value for designs that would otherwise require more FPGA resources.
Styles: APA, Harvard, Vancouver, ISO, etc.
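For readers who want to see the trade-off described above, a common quasi-median is the separable approximation "median of the three row medians", which needs far fewer comparators than an exact 3×3 median. The Python sketch below contrasts the two; it is a generic illustration, since the paper does not spell out its exact quasi-median structure.

```python
def median3(a, b, c):
    """Median of three values using three comparisons."""
    return max(min(a, b), min(max(a, b), c))

def exact_median9(window):
    """Exact median of a 3x3 window (9 values): full sort."""
    return sorted(window)[4]

def quasi_median9(window):
    """Quasi-median: median of the three row medians -- far cheaper in hardware."""
    rows = [window[0:3], window[3:6], window[6:9]]
    return median3(*(median3(*row) for row in rows))

window = [12, 200, 14, 13, 11, 15, 16, 10, 14]   # 3x3 neighborhood with one noise spike
print(exact_median9(window), quasi_median9(window))
```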
8

Pérez, Ignacio, and Miguel Figueroa. "A Heterogeneous Hardware Accelerator for Image Classification in Embedded Systems." Sensors 21, no. 8 (April 9, 2021): 2637. http://dx.doi.org/10.3390/s21082637.

Full text of the source
Abstract:
Convolutional neural networks (CNN) have been extensively employed for image classification due to their high accuracy. However, inference is a computationally-intensive process that often requires hardware acceleration to operate in real time. For mobile devices, the power consumption of graphics processors (GPUs) is frequently prohibitive, and field-programmable gate arrays (FPGA) become a solution to perform inference at high speed. Although previous works have implemented CNN inference on FPGAs, their high utilization of on-chip memory and arithmetic resources complicate their application on resource-constrained edge devices. In this paper, we present a scalable, low power, low resource-utilization accelerator architecture for inference on the MobileNet V2 CNN. The architecture uses a heterogeneous system with an embedded processor as the main controller, external memory to store network data, and dedicated hardware implemented on reconfigurable logic with a scalable number of processing elements (PE). Implemented on a XCZU7EV FPGA running at 200 MHz and using four PEs, the accelerator infers with 87% top-5 accuracy and processes an image of 224×224 pixels in 220 ms. It consumes 7.35 W of power and uses less than 30% of the logic and arithmetic resources used by other MobileNet FPGA accelerators.
Styles: APA, Harvard, Vancouver, ISO, etc.
9

Sauvage, Laurent, Maxime Nassar, Sylvain Guilley, Florent Flament, Jean-Luc Danger, and Yves Mathieu. "Exploiting Dual-Output Programmable Blocks to Balance Secure Dual-Rail Logics." International Journal of Reconfigurable Computing 2010 (2010): 1–12. http://dx.doi.org/10.1155/2010/375245.

Full text of the source
Abstract:
FPGA design of side-channel analysis countermeasures using unmasked dual-rail with precharge logic appears to be a great challenge. Indeed, the robustness of such a solution relies on careful differential placement and routing whereas both FPGA layout and FPGA EDA tools are not developed for such purposes. However, assessing the security level which can be achieved with them is an important issue, as it is directly related to the suitability to use commercial FPGA instead of proprietary custom FPGA for this kind of protection. In this article, we experimentally gave evidence that differential placement and routing of an FPGA implementation can be done with a granularity fine enough to improve the security gain. However, so far, this gain turned out to be lower for FPGAs than for ASICs. The solutions demonstrated in this article exploit the dual-output of modern FPGAs to achieve a better balance of dual-rail interconnections. However, we expect that an in-depth analysis of routing resources power consumption could still help reduce the interconnect differential leakage.
Styles: APA, Harvard, Vancouver, ISO, etc.
10

Trinh, Nguyen, Anh Le Thi Kim, Hung Nguyen, and Linh Tran. "Algorithmic TCAM on FPGA with data collision approach." Indonesian Journal of Electrical Engineering and Computer Science 22, no. 1 (April 1, 2021): 89. http://dx.doi.org/10.11591/ijeecs.v22.i1.pp89-96.

Full text of the source
Abstract:
Content addressable memory (CAM) and ternary content addressable memory (TCAM) are specialized high-speed memories for data searching. CAM and TCAM have many applications in network routing, packet forwarding, and Internet data centers. These types of memories have drawbacks in power dissipation and area. As field-programmable gate arrays (FPGAs) are increasingly used for network acceleration applications, the demand to integrate TCAM and CAM on FPGA is growing. Because most FPGAs do not support native TCAM and CAM hardware, methods of implementing algorithmic TCAM using FPGA resources have been proposed in recent years. Algorithmic TCAMs on FPGA benefit from the FPGA's low power consumption and high integration scalability. This paper proposes a scalable algorithmic TCAM design on FPGA. The design uses memory blocks to mitigate the power dissipation issue and data collision to save area. The paper also presents a design of a 256 × 104-bit algorithmic TCAM on an Intel Cyclone V FPGA and evaluates the performance and applicability of the design at large scale and in future developments.
Styles: APA, Harvard, Vancouver, ISO, etc.
11

Kyriakos, Angelos, Elissaios-Alexios Papatheofanous, Charalampos Bezaitis, and Dionysios Reisis. "Resources and Power Efficient FPGA Accelerators for Real-Time Image Classification." Journal of Imaging 8, no. 4 (April 15, 2022): 114. http://dx.doi.org/10.3390/jimaging8040114.

Full text of the source
Abstract:
A plethora of image- and video-related applications involve complex processes that impose the need for hardware accelerators to achieve real-time performance. Among these, notable applications include Machine Learning (ML) tasks using Convolutional Neural Networks (CNNs) that detect objects in image frames. Aiming to contribute to CNN accelerator solutions, the current paper focuses on the design of Field-Programmable Gate Arrays (FPGAs) for CNNs of limited feature space to improve performance, power consumption and resource utilization. The proposed design approach targets designs that can utilize the logic and memory resources of a single FPGA device and mainly benefits edge, mobile and on-board satellite (OBC) computing, especially their image-processing-related applications. This work exploits the proposed approach to develop an FPGA accelerator for vessel detection on a Xilinx Virtex 7 XC7VX485T FPGA device (Advanced Micro Devices, Inc., Santa Clara, CA, USA). The resulting architecture operates on RGB images of size 80×80 or sliding windows; it is trained on the "Ships in Satellite Imagery" dataset and, by achieving a frequency of 270 MHz, completing inference in 0.687 ms and consuming 5 W, it validates the approach.
Styles: APA, Harvard, Vancouver, ISO, etc.
12

Gehrer, Stefan, and Georg Sigl. "Area-Efficient PUF-Based Key Generation on System-on-Chips with FPGAs." Journal of Circuits, Systems and Computers 25, no. 01 (November 15, 2015): 1640002. http://dx.doi.org/10.1142/s0218126616400028.

Full text of the source
Abstract:
Physically unclonable functions (PUFs) are an innovative way to generate device-unique keys using uncontrollable production tolerances. In this work, we present a method to use PUFs on modern FPGA-based system-on-chips (SoCs). The processor system part of the SoC is used to configure the FPGA part. We propose a reconfigurable PUF design that can be changed by using the partial reconfiguration (PR) feature of modern FPGAs. Multiple ring oscillator PUF (RO PUF) designs are loaded onto the same logic blocks of the FPGA in order to make use of different resources, i.e., sources of entropy, on the FPGA. Their frequencies are read out individually, and the differences between neighboring oscillators are used to generate a bit response. The responses of each design can be concatenated into a larger response vector that can be used to generate a cryptographic key. We present an implementation that is able to decrease the required resources by 87.5% on a Xilinx Zynq.
Styles: APA, Harvard, Vancouver, ISO, etc.
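To clarify the response-generation step described in the entry above, here is a minimal Python model: each ring oscillator's measured frequency is compared with its neighbor, and the sign of the difference yields one response bit; the concatenated bits form key material. The frequency values are made-up illustrative numbers, and real designs add error correction before deriving a key.

```python
def ro_puf_response(frequencies):
    """Derive one bit per neighboring oscillator pair from frequency differences."""
    bits = []
    for f_a, f_b in zip(frequencies, frequencies[1:]):  # neighboring pairs
        bits.append(1 if f_a > f_b else 0)              # sign of the difference
    return bits

# Illustrative counter readings (Hz) for five ring oscillators on one logic region.
measured = [201_337_400, 201_339_150, 201_336_900, 201_338_020, 201_337_010]
response = ro_puf_response(measured)
print(response)                                       # e.g. [0, 1, 0, 1]
key_material = int("".join(map(str, response)), 2)    # concatenate responses into a vector
print(bin(key_material))
```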
13

Rawski, Mariusz. "Modified Distributed Arithmetic Concept for Implementations Targeted at Heterogeneous FPGAs." International Journal of Electronics and Telecommunications 56, no. 4 (November 1, 2010): 345–50. http://dx.doi.org/10.2478/v10177-010-0045-9.

Full text of the source
Abstract:
Distributed Arithmetic (DA) plays an important role in designing digital signal processing modules for FPGA architectures. It allows replacing multiply-and-accumulate (MAC) operations with combinational blocks. The quality of implementations based on DA strongly depends on the efficiency of methods that map the combinational DA block onto FPGA resources. Since modern FPGAs have a heterogeneous structure, there is a need for quality algorithms targeting these structures and for flexible architecture exploration aiding appropriate mapping. The paper presents a modification of the DA concept that allows for very efficient implementation in heterogeneous FPGA architectures.
Styles: APA, Harvard, Vancouver, ISO, etc.
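As background for the entry above, the classic DA trick replaces the multiply-and-accumulate in an inner product y = Σ c_k·x_k with a precomputed table: a LUT indexed by the j-th bits of all inputs stores the partial sum of the selected coefficients, and the result is accumulated bit-serially with shifts. The Python sketch below shows this textbook form for unsigned inputs only (two's-complement sign handling is omitted); it is not the modified DA structure proposed in the paper.

```python
def build_da_lut(coeffs):
    """Precompute sum(c_k for selected k) for every bit pattern over the inputs."""
    k = len(coeffs)
    return [sum(c for i, c in enumerate(coeffs) if (pattern >> i) & 1)
            for pattern in range(1 << k)]

def da_inner_product(coeffs, xs, bits=8):
    """Bit-serial distributed-arithmetic inner product for unsigned bits-wide inputs."""
    lut = build_da_lut(coeffs)
    acc = 0
    for j in range(bits):                          # one LUT lookup per input bit position
        pattern = sum(((x >> j) & 1) << i for i, x in enumerate(xs))
        acc += lut[pattern] << j                   # shift-accumulate replaces multipliers
    return acc

coeffs, xs = [3, -5, 7, 2], [17, 200, 33, 90]
assert da_inner_product(coeffs, xs) == sum(c * x for c, x in zip(coeffs, xs))
print(da_inner_product(coeffs, xs))
```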
14

Dandekar, Omkar, William Plishker, Shuvra S. Bhattacharyya, and Raj Shekhar. "Multiobjective Optimization for Reconfigurable Implementation of Medical Image Registration." International Journal of Reconfigurable Computing 2008 (2008): 1–17. http://dx.doi.org/10.1155/2008/738174.

Full text of the source
Abstract:
In real-time signal processing, a single application often has multiple computationally intensive kernels that can benefit from acceleration using custom or reconfigurable hardware platforms, such as field-programmable gate arrays (FPGAs). For adaptive utilization of resources at run time, FPGAs with capabilities for dynamic reconfiguration are emerging. In this context, it is useful for designers to derive sets of efficient configurations that trade off application performance with fabric resources. Such sets can be maintained at run time so that the best available design tradeoff is used. Finding a single, optimized configuration is difficult, and generating a family of optimized configurations suitable for different run-time scenarios is even more challenging. We present a novel multiobjective wordlength optimization strategy developed through FPGA-based implementation of a representative computationally intensive image processing application: medical image registration. Tradeoffs between FPGA resources and implementation accuracy are explored, and Pareto-optimized wordlength configurations are systematically identified. We also compare search methods for finding Pareto-optimized design configurations and demonstrate the applicability of search based on evolutionary techniques for identifying superior multiobjective tradeoff curves. We demonstrate feasibility of this approach in the context of FPGA-based medical image registration; however, it may be adapted to a wide range of signal processing applications.
Styles: APA, Harvard, Vancouver, ISO, etc.
15

Перепелицын, Артём Евгеньевич. "Метод разработки мультипараметризируемых проектов программируемой логики" [A method for developing multi-parameterizable programmable logic designs]. Aerospace Technic and Technology, no. 2 (April 26, 2018): 64–70. http://dx.doi.org/10.32620/aktt.2018.2.09.

Full text of the source
Abstract:
A classification of the project flexibility mechanisms provided by the VHDL language is proposed. The results of an analysis of the FPGA resources required for the implementation of arithmetic blocks are presented. The peculiarities of implementing fixed-point arithmetic operations on FPGAs are analyzed. Analytical relations between the number of logic elements of an Altera FPGA and the input data width of the arithmetic blocks are given, along with the results of an experimental study of the FPGA resources required to implement parameterizable arithmetic blocks. It is demonstrated that, when built-in hardware multipliers are available, the amount of FPGA resources required for an integer multiplier changes abruptly as one additional bit is added to the input data width. Definitions of static parameterization, through-parameterization, and multiparameterization of FPGA projects are given. Ways of developing widely parameterizable FPGA projects are discussed, and it is recommended to use parameterization to maximize the efficiency of FPGA resource usage. A technique for developing multiparameterized FPGA-based projects is proposed, together with a technique for estimating the amount of FPGA resources required to implement a multiparameterized project. A practical example of the use of the described method is given; the example includes the FPGA implementation of multiply-accumulate operations.
Styles: APA, Harvard, Vancouver, ISO, etc.
16

Irfan, Muhammad, Zahid Ullah, and Ray C. C. Cheung. "Zi-CAM: A Power and Resource Efficient Binary Content-Addressable Memory on FPGAs." Electronics 8, no. 5 (May 27, 2019): 584. http://dx.doi.org/10.3390/electronics8050584.

Full text of the source
Abstract:
Content-addressable memory (CAM) is a type of associative memory, which returns the address of a given search input in one clock cycle. Many designs are available to emulate CAM functionality inside re-configurable hardware, field-programmable gate arrays (FPGAs), using static random-access memory (SRAM) and flip-flops. FPGA-based CAMs are becoming popular due to the rapid growth of software defined networks (SDNs), which use CAM for packet classification. Emulated CAM designs consume much dynamic power owing to the high amount of switching activity and computation involved in finding the address of the search key. In this paper, we present a power- and resource-efficient binary CAM architecture, Zi-CAM, which consumes less power and uses fewer resources than the available architectures of SRAM-based CAM on FPGAs. Zi-CAM consists of two main blocks: a RAM block (RB), which is activated when there is a sequence of repeating zeros in the input search word, and a lookup table (LUT) block (LB), which is activated otherwise. Zi-CAM is implemented on a Xilinx Virtex-6 FPGA for a size of 64 × 36, improving power consumption and hardware cost by 30% and 32%, respectively, compared to the available FPGA-based CAMs.
Styles: APA, Harvard, Vancouver, ISO, etc.
17

Jang, Seojin, Wei Liu, Sangun Park, and Yongbeom Cho. "Automatic RTL Generation Tool of FPGAs for DNNs." Electronics 11, no. 3 (January 28, 2022): 402. http://dx.doi.org/10.3390/electronics11030402.

Full text of the source
Abstract:
With the increasing use of multi-purpose artificial intelligence of things (AIOT) devices, embedded field-programmable gate arrays (FPGA) represent excellent platforms for deep neural network (DNN) acceleration on edge devices. FPGAs possess the advantages of low latency and high energy efficiency, but the scarcity of FPGA development resources challenges the deployment of DNN-based edge devices. Register-transfer level programming, hardware verification, and precise resource allocation are needed to build a high-performance FPGA accelerator for DNNs. These tasks present a challenge and are time consuming for even experienced hardware developers. Therefore, we propose an automated, collaborative design process employing an automatic design space exploration tool; an automatic DNN engine enables the tool to reshape and parse a DNN model from software to hardware. We also introduce a long short-term memory (LSTM)-based model to predict performance and generate a DNN model that suits the developer requirements automatically. We demonstrate our design scheme with three FPGAs: a zcu104, a zcu102, and a Cyclone V SoC (system on chip). The results show that our hardware-based edge accelerator exhibits superior throughput compared with the most advanced edge graphics processing unit.
Styles: APA, Harvard, Vancouver, ISO, etc.
18

Skhiri, Rym, Virginie Fresse, Jean Paul Jamont, Benoit Suffran, and Jihene Malek. "From FPGA to Support Cloud to Cloud of FPGA: State of the Art." International Journal of Reconfigurable Computing 2019 (December 5, 2019): 1–17. http://dx.doi.org/10.1155/2019/8085461.

Full text of the source
Abstract:
Field Programmable Gate Arrays (FPGAs) draw significant attention from both industry and academia by accelerating computationally expensive applications and achieving low power consumption. FPGAs are interesting due to the flexibility and reconfigurability of the device. Cloud computing has become a major trend towards the dematerialization of infrastructure and computing resources. It provides "unlimited" storage capacities and a large number of data and applications that make collaboration easier between multiple (not domain-specific) designers. Many papers in the literature have surveyed the cloud and FPGAs separately and, more precisely, their services and challenges. The acceleration of applications by FPGAs and the unlimited capacities of the cloud are expected to become more and more pervasive. As more and more FPGAs are deployed in traditional clouds, it is appropriate to clarify what the cloud FPGA is and which drawbacks of using FPGAs locally are resolved. We present a survey of the cloud FPGA works that have been proposed to exploit the advantages of using FPGAs in the cloud. We classify these studies into three services to highlight their benefits and limitations. This survey aims at motivating further research in cloud FPGA.
Styles: APA, Harvard, Vancouver, ISO, etc.
19

Singh, Sanjay, Anil Kumar Saini, Ravi Saini, A. S. Mandal, Chandra Shekhar, and Anil Vohra. "Area Optimized FPGA-Based Implementation of The Sobel Compass Edge Detector." ISRN Machine Vision 2013 (March 7, 2013): 1–6. http://dx.doi.org/10.1155/2013/820216.

Full text of the source
Abstract:
This paper presents a new FPGA-resource-optimized hardware architecture for real-time edge detection using the Sobel compass operator. The architecture uses a single processing element to compute the gradient for all directions. This greatly economizes on FPGA resource usage (more than a 40% reduction) while maintaining real-time video frame rates. The measured performance of the architecture is 50 fps for standard PAL-size video and 200 fps for CIF-size video. The use of pipelining further improved the performance (185 fps for PAL-size video and 740 fps for CIF-size video) without a significant increase in FPGA resources.
Styles: APA, Harvard, Vancouver, ISO, etc.
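For readers unfamiliar with the Sobel compass operator mentioned above, the sketch below is a small Python behavioral model: the eight directional kernels are generated by rotating the outer ring of the north kernel, a single routine plays the role of the shared processing element, and the maximum response over all directions is taken as the edge strength. It is a reference model only, not the paper's pipelined architecture.

```python
# Outer-ring positions of a 3x3 kernel, listed clockwise starting at the top-left corner.
RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
NORTH_RING = [1, 2, 1, 0, -1, -2, -1, 0]   # Sobel "north" kernel written along the ring

def compass_kernels():
    """All eight Sobel compass kernels, obtained by rotating the north kernel's ring."""
    kernels = []
    for r in range(8):
        k = [[0, 0, 0] for _ in range(3)]
        for (row, col), v in zip(RING, NORTH_RING[r:] + NORTH_RING[:r]):
            k[row][col] = v
        kernels.append(k)
    return kernels

def compass_response(window):
    """Single 'processing element': apply each kernel to a 3x3 window, keep the max."""
    responses = [sum(k[r][c] * window[r][c] for r in range(3) for c in range(3))
                 for k in compass_kernels()]
    return max(responses)

window = [[10, 10, 10], [10, 10, 10], [90, 90, 90]]   # bright row below the center pixel
print(compass_response(window))
```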
20

Shashidhara, K. S., and H. C. Srinivasaiah. "Implementation of 1024-point FFT Soft-Core to Characterize Power and Resource Parameters in Artix-7, Kintex-7, Virtex-7, and Zynq-7000 FPGAs." European Journal of Engineering Research and Science 4, no. 9 (September 16, 2019): 81–88. http://dx.doi.org/10.24018/ejers.2019.4.9.1515.

Full text of the source
Abstract:
This paper presents an implementation of a 1024-point Fast Fourier Transform (FFT). The MATLAB Simulink environment is used to implement the complex 1024-point FFT. The FFT is implemented on four different FPGAs: Artix-7, Kintex-7, Virtex-7, and Zynq-7000. A comparative study of power and resource consumption, the design parameters of prime concern, has been carried out. The results show that the Artix-7 FPGA consumes the least power, 3.402 W, when compared with the contemporary devices mentioned above. The resource consumption remains the same across all the devices. The resource estimation on each FPGA is carried out, and the results are presented for the 1024-point FFT implementation. This comprehensive analysis provides deep insight with respect to power and resources. The synthesis and implementation results, such as the RTL schematic, I/O planning, and floor planning, are generated and analyzed for all the above devices.
Styles: APA, Harvard, Vancouver, ISO, etc.
21

Shashidhara, K. S., and H. C. Srinivasaiah. "Implementation of 1024-point FFT Soft-Core to Characterize Power and Resource Parameters in Artix-7, Kintex-7, Virtex-7, and Zynq-7000 FPGAs." European Journal of Engineering and Technology Research 4, no. 9 (September 16, 2019): 81–88. http://dx.doi.org/10.24018/ejeng.2019.4.9.1515.

Full text of the source
Abstract:
This paper presents an implementation of a 1024-point Fast Fourier Transform (FFT). The MATLAB Simulink environment is used to implement the complex 1024-point FFT. The FFT is implemented on four different FPGAs: Artix-7, Kintex-7, Virtex-7, and Zynq-7000. A comparative study of power and resource consumption, the design parameters of prime concern, has been carried out. The results show that the Artix-7 FPGA consumes the least power, 3.402 W, when compared with the contemporary devices mentioned above. The resource consumption remains the same across all the devices. The resource estimation on each FPGA is carried out, and the results are presented for the 1024-point FFT implementation. This comprehensive analysis provides deep insight with respect to power and resources. The synthesis and implementation results, such as the RTL schematic, I/O planning, and floor planning, are generated and analyzed for all the above devices.
Styles: APA, Harvard, Vancouver, ISO, etc.
22

Minhas, Umar Ibrahim, Roger Woods, and Georgios Karakonstantis. "Evaluation of Static Mapping for Dynamic Space-Shared Multi-task Processing on FPGAs." Journal of Signal Processing Systems 93, no. 5 (February 13, 2021): 587–602. http://dx.doi.org/10.1007/s11265-020-01633-z.

Full text of the source
Abstract:
Whilst FPGAs have been used in cloud ecosystems, it is still extremely challenging to achieve high compute density when mapping heterogeneous multi-tasks on shared resources at runtime. This work addresses this by treating the FPGA resource as a service and employing multi-task processing at the high level, design space exploration, and static off-line partitioning in order to allow more efficient mapping of heterogeneous tasks onto the FPGA. In addition, a new, comprehensive runtime functional simulator is used to evaluate the effect of various spatial and temporal constraints on both the existing and new approaches when varying system design parameters. A comprehensive suite of real high performance computing tasks was implemented on a Nallatech 385 FPGA card; the results show that our approach can provide on average 2.9× and 2.3× higher system throughput for compute and mixed intensity tasks, while being 0.2× lower for memory intensive tasks due to external memory access latency and bandwidth limitations. The work has been extended by introducing a novel scheduling scheme to enhance temporal utilization of resources when using the proposed approach. Additional results for large queues of mixed intensity tasks (compute and memory) show that the proposed partitioning and scheduling approach can provide higher than 3× system speedup over previous schemes.
Styles: APA, Harvard, Vancouver, ISO, etc.
23

Roy, Kalapi, Bingzhong (David) Guan, and Carl Sechen. "A Sea-of-Gates Style FPGA Placement Algorithm." VLSI Design 4, no. 4 (January 1, 1996): 293–307. http://dx.doi.org/10.1155/1996/92380.

Full text of the source
Abstract:
Field Programmable Gate Arrays (FPGAs) have a pre-defined chip boundary with fixed cell locations and routing resources. Placement objectives for flexible architectures (e.g., the standard cell design style), such as minimization of chip area, do not reflect the primary placement goals for FPGAs. For FPGAs, the layout tools must seek 100% routability within the architectural constraints. Routability and congestion estimates must be made directly based on the demand and availability of routing resources for detailed routing of the particular FPGA. We present a hierarchical placement approach consisting of two phases: a global placement phase followed by a detailed placement phase. The global placement phase minimizes congestion estimates of the global routing regions and satisfies all constraints at a coarser level. The detailed placer seeks to maximize the routability of the FPGA by considering factors which cause congestion at the detailed routing level and to satisfy all of the constraints precisely. Despite having limited knowledge about the gate-level architectural details, we have achieved a 90% reduction in the number of unrouted nets in comparison to an industrial tool (the only other tool) developed specifically for this architecture.
Styles: APA, Harvard, Vancouver, ISO, etc.
24

Zhou, Zhimei, Yong Wan, Yin Liu, Xiaoyan Guo, Qilin Yin, and Chen Feng. "The advancement of cluster based FPGA place & route technic." MATEC Web of Conferences 309 (2020): 01014. http://dx.doi.org/10.1051/matecconf/202030901014.

Full text of the source
Abstract:
As one of the core components of electronic hardware systems, Field Programmable Gate Array (FPGA) device design technology continues to advance under the guidance of electronic information technology policies and has made a huge contribution to information technology applications. However, with the advancement of chip technology and the continuous upgrading of information technology, the functions that FPGAs need to perform are more and more complicated. How to perform layout design efficiently and make full use of chip resources has become an important problem to be solved and optimized in FPGA design. The FPGA itself is not limited to a specific function. It contains internal functions such as memory, protocol modules, clock modules, high-speed interface modules, and digital signal processing, and it can be programmed through logic modules such as programmable logic unit blocks and interconnects. Blank FPGA devices are designed into high-performance system applications with complex functions. Placement and routing technology based on clustered logic blocks can combine the above resources to give full play to their performance advantages, and its importance is self-evident. Based on the traditional FPGA implementation flow, this paper analyzes several advantages of cluster-based logic block placement and routing technology and generalizes the corresponding design method and flow.
Styles: APA, Harvard, Vancouver, ISO, etc.
25

Khurshid, Burhan, and Roohie Naaz. "Cost Effective Implementation of Fixed Point Adders for LUT based FPGAs using Technology Dependent Optimizations." Electronics ETF 19, no. 1 (July 22, 2015): 14. http://dx.doi.org/10.7251/els1519014k.

Full text of the source
Abstract:
Modern day field programmable gate arrays (FPGAs) have very large and versatile logic resources, resulting in the migration of their application domain from prototype designing to low and medium volume production designing. Unfortunately, most of the work pertaining to FPGA implementations does not focus on the technology dependent optimizations that can implement a desired functionality with reduced cost. In this paper we consider the mapping of simple ripple carry fixed-point adders (RCA) on look-up table (LUT) based FPGAs. The objective is to transform the given RCA Boolean network into an optimized circuit netlist that can implement the desired functionality with minimum cost. We particularly focus on 6-input LUTs that are inherent in all the modern day FPGAs. Technology dependent optimizations are carried out to utilize this FPGA primitive efficiently and the result is compared against various adder designs. The implementation targets the XC5VLX30-3FF324 device from the Xilinx Virtex-5 FPGA family. The cost of the circuit is expressed in terms of the resources utilized, critical path delay and the amount of on-chip power dissipated. Our implementation results show a reduction in resource usage by at least 50%, an increase in speed by at least 10% and a reduction in dynamic power dissipation by at least 30%. All this is achieved without any technology independent (architectural) modification.
Styles: APA, Harvard, Vancouver, ISO, etc.
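To make the ripple-carry structure concrete, the Python sketch below models an n-bit RCA as a chain of full adders, which is the kind of Boolean network the paper then maps onto 6-input LUTs and the slice carry chain. The bit-level model is generic; the packing of bit positions into LUTs (and the paper's optimized netlist) is not represented here.

```python
def full_adder(a, b, cin):
    """One full-adder cell: returns (sum, carry-out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

def ripple_carry_add(x, y, width=8):
    """Add two width-bit numbers by rippling the carry through full-adder cells."""
    carry, total = 0, 0
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total, carry          # width-bit sum and final carry-out

a, b = 0b1011_0101, 0b0110_1110
assert ripple_carry_add(a, b) == ((a + b) & 0xFF, (a + b) >> 8)
print(ripple_carry_add(200, 100))
```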
26

Biookaghazadeh, Saman, Pravin Kumar Ravi, and Ming Zhao. "Toward Multi-FPGA Acceleration of the Neural Networks." ACM Journal on Emerging Technologies in Computing Systems 17, no. 2 (April 2021): 1–23. http://dx.doi.org/10.1145/3432816.

Full text of the source
Abstract:
High-throughput and low-latency Convolutional Neural Network (CNN) inference is increasingly important for many cloud- and edge-computing applications. FPGA-based acceleration of CNN inference has demonstrated various benefits compared to other high-performance devices such as GPGPUs. Current FPGA CNN-acceleration solutions are based on a single FPGA design, which are limited by the available resources on an FPGA. In addition, they can only accelerate conventional 2D neural networks. To address these limitations, we present a generic multi-FPGA solution, written in OpenCL, which can accelerate more complex CNNs (e.g., C3D CNN) and achieve a near linear speedup with respect to the available single-FPGA solutions. The design is built upon the Intel Deep Learning Accelerator architecture, with three extensions. First, it includes updates for better area efficiency (up to 25%) and higher performance (up to 24%). Second, it supports 3D convolutions for more challenging applications such as video learning. Third, it supports multi-FPGA communication for higher inference throughput. The results show that utilizing multiple FPGAs can linearly increase the overall bandwidth while maintaining the same end-to-end latency. In addition, the design can outperform other FPGA 2D accelerators by up to 8.4 times and 3D accelerators by up to 1.7 times.
Styles: APA, Harvard, Vancouver, ISO, etc.
27

Morales-Sandoval, Miguel, Luis Armando Rodriguez Flores, Rene Cumplido, Jose Juan Garcia-Hernandez, Claudia Feregrino, and Ignacio Algredo. "A Compact FPGA-Based Accelerator for Curve-Based Cryptography in Wireless Sensor Networks." Journal of Sensors 2021 (January 6, 2021): 1–13. http://dx.doi.org/10.1155/2021/8860413.

Full text of the source
Abstract:
The main topic of this paper is low-cost public key cryptography in wireless sensor nodes. Security in embedded systems, for example, in sensor nodes based on field programmable gate array (FPGA), demands low cost but still efficient solutions. Sensor nodes are key elements in the Internet of Things paradigm, and their security is a crucial requirement for critical applications in sectors such as military, health, and industry. To address these security requirements under the restrictions imposed by the available computing resources of sensor nodes, this paper presents a low-area FPGA-prototyped hardware accelerator for scalar multiplication, the most costly operation in elliptic curve cryptography (ECC). This cryptoengine is provided as an enabler of robust cryptography for security services in the IoT, such as confidentiality and authentication. The compact property in the proposed hardware design is achieved by implementing a novel digit-by-digit computing approach applied at the finite field and curve level algorithms, in addition to hardware reusing, the use of embedded memory blocks in modern FPGAs, and a simpler control logic. Our hardware design targets elliptic curves defined over binary fields generated by trinomials, uses fewer area resources than other FPGA approaches, and is faster than software counterparts. Our ECC hardware accelerator was validated under a hardware/software codesign of the Diffie-Hellman key exchange protocol (ECDH) deployed in the IoT MicroZed FPGA board. For a scalar multiplication in the sect233 curve, our design requires 1170 FPGA slices and completes the computation in 128820 clock cycles (at 135.31 MHz), with an efficiency of 0.209 kbps/slice. In the codesign, the ECDH protocol is executed in 4.1 ms, 17 times faster than a MIRACL software implementation running on the embedded processor Cortex A9 in the MicroZed. The FPGA-based accelerator for binary ECC presented in this work is the one with the least amount of hardware resources compared to other FPGA designs in the literature.
Styles: APA, Harvard, Vancouver, ISO, etc.
28

Sadruddin, Salman, and Arshad Aziz. "Reduced Precision Redundancy for Satellite Telecommand Receiver Module on FPGA." Chinese Journal of Engineering 2013 (September 24, 2013): 1–8. http://dx.doi.org/10.1155/2013/453872.

Full text of the source
Abstract:
A novel and highly efficient design of a software defined radiation tolerant baseband module for a LEO satellite telecommand receiver using FPGA is presented. FPGAs in space are subject to single event upsets (SEUs) due to high radiation environment. Traditionally, triple modular redundancy (TMR) is used for mitigating Single Event Upsets (SEUs). The drawback of using TMR is that it consumes a lot of hardware resources and requires more power. Reduced precision redundancy (RPR) can be a viable alternative of TMR in digital systems for arithmetic operations. This paper uses the combination of RPR and TMR for mitigating SEUs. The designed module consumes less resources on FPGA and has bit error rate (BER) identical to theoretical results, apart from degradation due to implementation losses. An improved Costas loop and timing recovery algorithm are implemented for achieving carrier recovery and bit synchronization. The hybrid approach mitigates SEUs while consuming 26% less resources than a customary TMR protected receiver.
Styles: APA, Harvard, Vancouver, ISO, etc.
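The reduced precision redundancy idea used above can be summarized with a small behavioral model: a full-precision unit is checked against a cheap reduced-precision copy, and if the two disagree by more than the reduced copy's worst-case rounding error (suggesting an upset corrupted the main unit), the reduced result is used instead. The toy multiply and the threshold below are illustrative assumptions, not the receiver's actual datapath.

```python
def reduced(value, drop_bits):
    """Cheap low-precision copy of an operand: keep only the high-order bits."""
    return (value >> drop_bits) << drop_bits

def rpr_protect(full_result, a, b, drop_bits=6):
    """Accept the full-precision product unless it strays too far from the RPR estimate."""
    estimate = reduced(a, drop_bits) * reduced(b, drop_bits)       # reduced-precision multiply
    tolerance = (abs(a) + abs(b) + (1 << drop_bits)) << drop_bits  # worst-case rounding gap
    if abs(full_result - estimate) <= tolerance:
        return full_result            # normal case: keep the exact answer
    return estimate                   # upset detected: fall back to the coarse answer

a, b = 1234, 5678
print(rpr_protect(a * b, a, b))                  # exact product passes the check
print(rpr_protect((a * b) ^ (1 << 20), a, b))    # a flipped high bit triggers the fallback
```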
29

Quang, Nguyen Khanh, and Nguyen Ho Quang. "FPGA Technology and Sequential Finite State Machine Method." Hue University Journal of Science: Natural Science 127, no. 1D (December 10, 2018): 55. http://dx.doi.org/10.26459/hueuni-jns.v127i1d.5073.

Full text of the source
Abstract:
The implementation of complex control algorithms on FPGAs (field programmable gate arrays) is still at a basic level. There is no fixed method for developing algorithms on these devices because of their general characteristics. Therefore, design engineers are still searching for good approaches to optimize the implementation of algorithms on FPGAs [1-7]. This paper presents and demonstrates a sequential finite state machine design method that can solve the issue of optimal usage of the limited resources on an FPGA.
Styles: APA, Harvard, Vancouver, ISO, etc.
30

Kalistru, I. I., M. A. Borodin, A. S. Rybkin, and R. A. Gladko. "Methods for implementing the Kuznyechik algorithm on FPGAs." Radio industry 28, no. 3 (August 29, 2018): 64–70. http://dx.doi.org/10.21778/2413-9599-2018-28-3-64-70.

Full text of the source
Abstract:
Increased volumes and speed of data transmission over computer networks, and also the need to protect the transmitted data, require accordingly to increase the speed of cryptographic data processing. One of the ways to achieve high performance is implementation of FPGAs-based cryptographic equipment. Therewith, to cut the cost of equipment, it is important that encryption modules shall consume a minimum possible hardware resources. The work aims to find the most compact high-speed solution for FPGA-based Kuznyechik block cipher. Several methods for hardware implementation of linear transformation, which is used in Kuznyechik cipher, have been reviewed. Various aspects of implementation of these methods taking into account the architecture of target FPGAs are investigated. We also consider aspects of the FPGA implementation of nonlinear transformation, which is used in Kuznyechik block cipher. Resource consumption by various implemented solutions of linear transformation has been estimated. A relatively compact high-speed implemented solution of Kuznyechik block cipher has been obtained and tested on the real equipment. The achieved values of speed for iterative and fully pipelined implementations of the algorithm have been presented.
Styles: APA, Harvard, Vancouver, ISO, etc.
31

Chochaev, R. Zh, D. A. Zheleznikov, G. A. Ivanova, S. V. Gavrilov, and V. I. Enns. "FPGA Routing Architecture Estimation Models and Methods." Proceedings of Universities. Electronics 25, no. 5 (October 2020): 410–22. http://dx.doi.org/10.24151/1561-5405-2020-25-5-410-422.

Full text of the source
Abstract:
The problem of analyzing and evaluating the structure of FPGA routing resources at early stages of the design flow is of great interest to researchers. Until now, the dominant approach has been to run the full design flow (logic synthesis, placement, routing) on a set of test circuits and then estimate various parameters for each FPGA architecture being analyzed. Despite its high accuracy, this approach has a long runtime and requires a lot of computing resources, as well as CAD tools tuned to the analyzed FPGA architecture. Modern FPGAs contain more than a million logic gates; therefore, the application of such an approach is inefficient. Today, more attention is paid to the development of various models that allow the structure of the routing resources to be evaluated at early stages without using benchmark circuits. In this work, an overview of the existing models and methods for analyzing the structure of FPGA routing resources is presented. A comparison of the methods and models has been performed, and their efficiency and applicability to the design of domestic FPGAs have been estimated. It has been found that the best approach for analyzing arbitrary structures of FPGA routing resources is the development and application of mixed methods. This makes it possible to obtain accurate models as well as to significantly reduce development and market entry time.
Styles: APA, Harvard, Vancouver, ISO, etc.
32

Gnad, Dennis R. E., Cong Dang Khoa Nguyen, Syed Hashim Gillani, and Mehdi B. Tahoori. "Voltage-Based Covert Channels Using FPGAs." ACM Transactions on Design Automation of Electronic Systems 26, no. 6 (June 28, 2021): 1–25. http://dx.doi.org/10.1145/3460229.

Full text of the source
Abstract:
Field Programmable Gate Arrays ( FPGAs ) are increasingly used in cloud applications and being integrated into Systems-on-Chip. For these systems, various side-channel attacks on cryptographic implementations have been reported, motivating one to apply proper countermeasures. Beyond cryptographic implementations, maliciously introduced covert channel receivers and transmitters can allow one to exfiltrate other secret information from the FPGA. In this article, we present a fast covert channel on FPGAs, which exploits the on-chip power distribution network. This can be achieved without any logical connection between the transmitter and receiver blocks. Compared to a recently published covert channel with an estimated 4.8 Mbit/s transmission speed, we show 8 Mbit/s transmission and reduced errors from around 3% to less than 0.003%. Furthermore, we demonstrate proper transmissions of word-size messages and test the channel in the presence of noise generated from other residing tenants’ modules in the FPGA. When we place and operate other co-tenant modules that require 85% of the total FPGA area, the error rate increases to 0.02%, depending on the platform and setup. This error rate is still reasonably low for a covert channel. Overall, the transmitter and receiver work with less than 3–5% FPGA LUT resources together. We also show the feasibility of other types of covert channel transmitters, in the form of synchronous circuits within the FPGA.
Styles: APA, Harvard, Vancouver, ISO, etc.
33

Gothandaraman, Akila, Gregory D. Peterson, G. Lee Warren, Robert J. Hinde, and Robert J. Harrison. "A Pipelined and Parallel Architecture for Quantum Monte Carlo Simulations on FPGAs." VLSI Design 2010 (February 28, 2010): 1–8. http://dx.doi.org/10.1155/2010/946486.

Full text of the source
Abstract:
Recent advances in Field-Programmable Gate Array (FPGA) technology make reconfigurable computing using FPGAs an attractive platform for accelerating scientific applications. We develop a deeply pipelined and parallel architecture for Quantum Monte Carlo simulations using FPGAs. Quantum Monte Carlo simulations enable us to obtain the structural and energetic properties of atomic clusters. We experiment with different pipeline structures for each component of the design and develop a deeply pipelined architecture that provides the best performance in terms of achievable clock rate, while at the same time has a modest use of the FPGA resources. We discuss the details of the pipelined and generic architecture that is used to obtain the potential energy and wave function of a cluster of atoms.
Styles: APA, Harvard, Vancouver, ISO, etc.
34

Farooq, Umer, Husain Parvez, Habib Mehrez, and Zied Marrakchi. "Exploration of Heterogeneous FPGA Architectures." International Journal of Reconfigurable Computing 2011 (2011): 1–18. http://dx.doi.org/10.1155/2011/121404.

Full text of the source
Abstract:
Mesh-based heterogeneous FPGAs are commonly used in industry and academia due to their area, speed, and power benefits over their homogeneous counterparts. These FPGAs contain a mixture of logic blocks and hard blocks where hard blocks are arranged in fixed columns as they offer an easy and compact layout. However, the placement of hard-blocks in fixed columns can potentially lead to underutilization of logic and routing resources and this problem is further aggravated with increase in the types of hard-blocks. This work explores and compares different floor-planning techniques of mesh-based FPGA to determine their effect on the area, performance, and power of the architecture. A tree-based architecture is also presented; unlike mesh-based architecture, the floor-planning of heterogeneous tree-based architecture does not affect its routing requirements due to its hierarchical structure. Both mesh and tree-based architectures are evaluated for three sets of benchmark circuits. Experimental results show that a more flexible floor-planning in mesh-based FPGA gives better results as compared to the column-based floor-planning. Also it is shown that compared to different floor-plannings of mesh-based FPGA, tree-based architecture gives better area, performance, and power results.
Styles: APA, Harvard, Vancouver, ISO, etc.
35

Garcia, Paulo, Deepayan Bhowmik, Robert Stewart, Greg Michaelson, and Andrew Wallace. "Optimized Memory Allocation and Power Minimization for FPGA-Based Image Processing." Journal of Imaging 5, no. 1 (January 1, 2019): 7. http://dx.doi.org/10.3390/jimaging5010007.

Full text of the source
Abstract:
Memory is the biggest limiting factor to the widespread use of FPGAs for high-level image processing, which require complete frame(s) to be stored in situ. Since FPGAs have limited on-chip memory capabilities, efficient use of such resources is essential to meet performance, size and power constraints. In this paper, we investigate allocation of on-chip memory resources in order to minimize resource usage and power consumption, contributing to the realization of power-efficient high-level image processing fully contained on FPGAs. We propose methods for generating memory architectures, from both Hardware Description Languages and High Level Synthesis designs, which minimize memory usage and power consumption. Based on a formalization of on-chip memory configuration options and a power model, we demonstrate how our partitioning algorithms can outperform traditional strategies. Compared to commercial FPGA synthesis and High Level Synthesis tools, our results show that the proposed algorithms can result in up to 60% higher utilization efficiency, increasing the sizes and/or number of frames that can be accommodated, and reduce frame buffers’ dynamic power consumption by up to approximately 70%. In our experiments using Optical Flow and MeanShift Tracking, representative high-level algorithms, data show that partitioning algorithms can reduce total power by up to 25% and 30%, respectively, without impacting performance.
Styles: APA, Harvard, Vancouver, ISO, etc.
36

Jaquenod, Guillermo A., Javier Valls, and Javier Siman. "Efficient FPGA Hardware Reuse in a Multiplierless Decimation Chain." International Journal of Reconfigurable Computing 2014 (2014): 1–5. http://dx.doi.org/10.1155/2014/546264.

Full text of the source
Abstract:
In digital communications, a typical reception chain requires many stages of digital signal processing for filtering and sample rate reduction. For satellite on-board applications, this need is tightly constrained by the very limited hardware resources available in space-qualified FPGAs. This short paper focuses on the implementation of a dual chain of 14 stages of cascaded half-band filters plus 2:1 decimators for complex signals (in-phase and quadrature) with minimal hardware resources, using a small portion of a UT6325 Aeroflex FPGA, as part of a receiver designed for a low-data-rate command and telemetry channel.
Styles: APA, Harvard, Vancouver, ISO, etc.
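To illustrate the signal-processing structure described above (though not the paper's hardware-reuse scheme), the Python sketch below cascades identical half-band FIR filters, each followed by a 2:1 decimator. The 7-tap half-band coefficients are a standard textbook choice and stand in for whatever filters the actual receiver uses; only 4 of the 14 stages are run to keep the demo short.

```python
import math

HALF_BAND = [c / 32 for c in (-1, 0, 9, 16, 9, 0, -1)]  # textbook half-band lowpass, DC gain 1

def fir(signal, taps):
    """Plain FIR convolution (zero-padded at the edges)."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if 0 <= n - k < len(signal):
                acc += h * signal[n - k]
        out.append(acc)
    return out

def decimation_chain(signal, stages):
    """Cascade of (half-band filter + keep-every-second-sample) stages."""
    for _ in range(stages):
        signal = fir(signal, HALF_BAND)[::2]   # filter, then 2:1 decimation
    return signal

x = [math.cos(2 * math.pi * 0.01 * n) for n in range(4096)]   # slowly varying tone
y = decimation_chain(x, 4)          # 4 of the paper's 14 stages, for a quick demo
print(len(x), len(y))               # 4096 -> 256 samples (rate reduced by 2**4)
```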
37

Tippetts, Beau, Dah Jye Lee, Kirt Lillywhite, and James K. Archibald. "Hardware-Efficient Design of Real-Time Profile Shape Matching Stereo Vision Algorithm on FPGA." International Journal of Reconfigurable Computing 2014 (2014): 1–12. http://dx.doi.org/10.1155/2014/945926.

Full text of the source
Abstract:
A variety of platforms, such as micro-unmanned vehicles, are limited in the amount of computational hardware they can support due to weight and power constraints. An efficient stereo vision algorithm implemented on an FPGA would be able to minimize payload and power consumption in micro-unmanned vehicles, while providing 3D information and still leaving computational resources available for other processing tasks. This work presents a hardware design of the efficient profile shape matching stereo vision algorithm. Hardware resource usage is presented for the targeted micro-UV platform, Helio-copter, which uses the Xilinx Virtex 4 FX60 FPGA. Less than a fifth of the resources on this FPGA were used to produce dense disparity maps for image sizes up to 450 × 375, with the ability to scale up easily by increasing BRAM usage. A comparison is given with the census transform-based stereo vision FPGA implementation by Jin et al. in terms of accuracy, speed performance, and resource usage. Results show that the profile shape matching algorithm is an efficient real-time stereo vision algorithm for hardware implementation on resource-limited systems such as micro-unmanned vehicles.
Styles: APA, Harvard, Vancouver, ISO, etc.
38

Chen, Qianqiao, Vaibhawa Mishra, Jose Nunez-Yanez, and Georgios Zervas. "Reconfigurable Network Stream Processing on Virtualized FPGA Resources." International Journal of Reconfigurable Computing 2018 (2018): 1–11. http://dx.doi.org/10.1155/2018/8785903.

Повний текст джерела
Анотація:
The software defined network and network function virtualization are proposed to address the network ossification issue in current Internet infrastructure. Network functions and services are implemented as software applications to increase the programmability of network. However, involving general purpose processors in data plane restricts the bandwidth of network services. Therefore, to keep both the bandwidth and flexibility, a FPGA platform is suggested as a reconfigurable platform to deliver high bandwidth virtual network functions on data plane. In this paper, the FPGA resource has been virtualized by interconnecting partial reconfigurable regions to deliver high bandwidth reconfigurable processing on network streams. With the help of partial reconfiguration technology, network functions on our platform can be configured without affecting other functions on the same FPGA device. The on-chip interconnect system is further evaluated by comparing with existing network-on-chip system. A reconfiguration process is also proposed and demonstrated that it can be performed on our platform. The process can happen in the real time of network services and it is able to keep the original function working during the download of partial bitstream.
39

Pathan, Aneela, Tayab D. Memon, Fareesa K. Sohu, and Muhammad A. Rajput. "Analysis of Existing and Proposed 3-Bit and Multi-Bit Multiplier Algorithms for FIR Filters and Adaptive Channel Equalizers on FPGA." Quaid-e-Awam University Research Journal of Engineering Science & Technology 19, no. 1 (June 30, 2021): 81–89. http://dx.doi.org/10.52584/qrj.1901.12.

Abstract:
Different multiplication algorithms have different performance characteristics. Some are good at speed, while others consume less area when implemented on hardware such as the Field Programmable Gate Array (FPGA), the advanced implementation technology for DSP systems. The eminent parallel and sequential multiplication algorithms include Shift-and-Add, Wallace Tree, Booth, and Array. Multiplier optimization attempts have also been reported for the adders used for partial product addition. In this paper, analogous to conventional multipliers, two new multiplication algorithms implemented on FPGA are presented and compared with conventional algorithms, both stand-alone and by using them in the implementation of FIR filters and an adaptive channel equalizer based on the LMS algorithm. The work is carried out on a Spartan-6 FPGA and may be extended to any type of FPGA. Results are compared in terms of resource utilization, power consumption, and maximum achieved frequency. The results show that for a small coefficient length such as 3 bits, the proposed algorithms work very well in terms of achieved frequency, consumed power, and even resource utilization. For lengths greater than 3 bits, the pipelined multiplier achieves a much better frequency than the proposed and conventional ones, and the Booth multiplier consumes fewer resources in terms of lookup tables.
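
As a bit-level reference for two of the conventional algorithms named above, the Python sketch below implements unsigned shift-and-add multiplication and radix-2 Booth-recoded multiplication for narrow operands. It is a software model for checking the arithmetic only, not the proposed multipliers or their FPGA mappings.

def shift_add_mul(a: int, b: int, bits: int = 3) -> int:
    """Unsigned shift-and-add: add the shifted multiplicand for every set bit of b."""
    product = 0
    for i in range(bits):
        if (b >> i) & 1:
            product += a << i
    return product

def booth_mul(m: int, q: int, bits: int = 3) -> int:
    """Radix-2 Booth recoding for signed operands: each bit pair (q[i-1], q[i])
    selects +m, -m or 0 as the partial product at weight 2**i."""
    product, prev = 0, 0                      # implicit q[-1] = 0
    for i in range(bits):
        cur = (q >> i) & 1                    # two's-complement bit of the multiplier
        product += (prev - cur) * m * (1 << i)
        prev = cur
    return product

assert shift_add_mul(5, 6, bits=3) == 30
assert booth_mul(3, -2, bits=3) == -6 and booth_mul(-4, 3, bits=3) == -12
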
40

Du, Changdao, and Yoshiki Yamaguchi. "High-Level Synthesis Design for Stencil Computations on FPGA with High Bandwidth Memory." Electronics 9, no. 8 (August 8, 2020): 1275. http://dx.doi.org/10.3390/electronics9081275.

Abstract:
Due to performance and energy requirements, FPGA-based accelerators have become a promising solution for high-performance computations. Meanwhile, with the help of high-level synthesis (HLS) compilers, FPGAs can be programmed using common programming languages such as C, C++, or OpenCL, thereby improving design efficiency and portability. Stencil computations are significant kernels in various scientific applications. In this paper, we introduce an architecture design for implementing stencil kernels on a state-of-the-art FPGA with high bandwidth memory (HBM). Traditional FPGAs are usually equipped with external memory, e.g., DDR3 or DDR4, which limits the design space exploration in the spatial domain of stencil kernels. Therefore, many previous studies mainly relied on exploiting parallelism in the temporal domain to overcome the bandwidth limitations. In our approach, we scale up the design performance by considering both the spatial and temporal parallelism of the stencil kernel equally. We also discuss the design portability among different HLS compilers. We use typical stencil kernels to evaluate our design on a Xilinx U280 FPGA board and compare the results with other existing studies. By adopting our method, developers can take broad parallelization strategies based on specific FPGA resources to improve performance.
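
As a software reference for the kernel class being accelerated, the NumPy sketch below applies a 5-point Jacobi stencil and chains several sweeps back to back, which mirrors in software what temporal parallelism does with cascaded stencil stages in hardware. Grid size, boundary values, and the chosen depth are arbitrary example values, not the paper's HLS design.

import numpy as np

def jacobi_step(grid: np.ndarray) -> np.ndarray:
    """One 5-point (2D Jacobi) stencil sweep over the interior cells."""
    out = grid.copy()
    out[1:-1, 1:-1] = 0.25 * (grid[:-2, 1:-1] + grid[2:, 1:-1] +
                              grid[1:-1, :-2] + grid[1:-1, 2:])
    return out

def stencil_pipeline(grid: np.ndarray, temporal_depth: int) -> np.ndarray:
    """Temporal parallelism in hardware chains `temporal_depth` stencil stages back to back;
    in software we simply iterate, which produces the same result."""
    for _ in range(temporal_depth):
        grid = jacobi_step(grid)
    return grid

g = np.zeros((64, 64))
g[0, :] = 100.0                                     # hot top boundary
print(stencil_pipeline(g, temporal_depth=8)[1:4, 32])
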
41

Singh, Rachna, and Arvind Rajawat. "Analytical Model for High–Level Area Estimation of FPGA Design." International Journal of Embedded and Real-Time Communication Systems 7, no. 2 (July 2016): 35–44. http://dx.doi.org/10.4018/ijertcs.2016070103.

Abstract:
FPGAs have been used as a target platform because they are increasingly attractive for system design and, due to rapid technological progress, ever larger devices are commercially affordable. These trends make FPGAs an alternative in application areas where extensive data processing plays an important role. Consequently, the desire emerges for early performance estimation in order to quantify the FPGA approach. A mathematical model is presented that estimates the maximum number of LUTs consumed by the hardware synthesized for different FPGAs using LLVM. The motivation behind this research work is to design an area modeling approach for FPGA-based implementation at an early stage of design. The equation-based area estimation model permits immediate and accurate estimation of resources. Two important criteria used to judge the quality of the results were estimation accuracy and runtime. Experimental results show that the estimation error is in the range of 1.33% to 7.26% for Spartan 3E, 1.6% to 5.63% for Virtex-2 Pro, and 2.3% to 6.02% for Virtex-5.
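
The abstract does not reproduce the estimation equation, so the sketch below only illustrates the general shape of an equation-based LUT estimator: a weighted sum of operation counts extracted from the LLVM representation plus a constant overhead. Every coefficient, operation name, and function name here is a hypothetical placeholder rather than the authors' fitted model.

# Hypothetical per-operation LUT costs: illustrative numbers only.
OP_LUT_COST = {"add32": 32, "sub32": 32, "mul32": 600, "cmp32": 16, "mux32": 16}

def estimate_luts(op_counts, overhead=150):
    """Estimate LUT usage as overhead + sum(cost[op] * count[op]) over the LLVM IR operations."""
    return overhead + sum(OP_LUT_COST[op] * n for op, n in op_counts.items())

# e.g. a small datapath with 8 adders, 2 multipliers and 4 multiplexers:
print(estimate_luts({"add32": 8, "mul32": 2, "mux32": 4}))
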
42

Piróg, S., R. Stala, and Ł. Stawiarski. "Power electronic converter for photovoltaic systems with the use of FPGA-based real-time modeling of single phase grid-connected systems." Bulletin of the Polish Academy of Sciences: Technical Sciences 57, no. 4 (December 1, 2009): 345–54. http://dx.doi.org/10.2478/v10175-010-0137-9.

Abstract:
The paper presents a method for the investigation of grid-connected systems with a renewable energy source. The method enables fast prototyping of control systems and power converter components by real-time simulation of the system. Components of the system, such as the energy source (PV array), converters, filters, sensors, and control algorithms, are modeled in an FPGA IC. Testing the system before its practical application reduces cost and time-to-market. FPGA devices are commonly used for digital control. The resources of the FPGAs used for preliminary testing can be sufficient for modeling the complete system. Debugging tools for FPGAs enable observation of many signals of the analyzed power system (as a result of the control), with very advanced triggering tools. Compared with classical simulation tools, the presented method of simulation with the use of a hardware model of the power system gives better possibilities for verification of control algorithms such as MPPT or anti-islanding.
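
Because MPPT is mentioned as one of the algorithms verified on the real-time FPGA model, a compact perturb-and-observe MPPT step is sketched below in Python as a behavioural reference. The choice of the perturb-and-observe variant, the variable names, and the step size are assumptions made for this sketch; the paper does not specify which MPPT method it verifies.

def perturb_and_observe(v, i, state):
    """One step of a perturb-and-observe MPPT loop (illustrative only)."""
    p = v * i
    if p < state["p_prev"]:
        state["step"] = -state["step"]       # power dropped: reverse the perturbation direction
    state["p_prev"] = p
    state["v_ref"] += state["step"]          # perturb the voltage reference
    return state["v_ref"]

state = {"p_prev": 0.0, "v_ref": 30.0, "step": 0.5}
# feed measured PV voltage/current once per control period:
print(perturb_and_observe(30.0, 5.0, state))
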
43

Mukhanbet, A. A., E. S. Nurakhov, and B. S. Daribayev. "Implementation of a number recognition algorithm built using a neural network on the BASYS3 FPGA panel." Bulletin of the National Engineering Academy of the Republic of Kazakhstan 82, no. 4 (December 15, 2021): 86–96. http://dx.doi.org/10.47533/2020.1606-146x.119.

Abstract:
In recent years, several field-programmable gate array (FPGA) based accelerators for the CNN inference phase have been introduced. FPGAs are widely used in portable devices. They can be programmed to achieve higher concurrency and provide better performance. The power consumption of an FPGA is lower than that of a GPU for the same workload. These reasons make the FPGA suitable for implementing the CNN inference phase. FPGAs can provide throughput comparable to GPUs while achieving low power consumption, which is very important for portable devices. To implement the CNN inference phase effectively on an FPGA, the design should have high parallelism, and the hardware resources used should be minimized to reduce area and power consumption. In this work, an algorithm for recognizing handwritten digits is implemented with the help of a neural network. A special architecture is created to implement the neural network at the hardware level. The runtime performance and power consumption are comparable to those of a CPU and a GPU.
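
As a loose illustration of the arithmetic such a digit-recognition network reduces to on an FPGA, the NumPy sketch below evaluates a single integer-only fully connected layer with ReLU and a right-shift rescaling. The 8-bit quantization scheme, layer sizes, and random weights are placeholders and do not describe the architecture implemented on the Basys3 board.

import numpy as np

def int8_dense(x_q, w_q, b_q, shift=7):
    """Integer-only dense layer followed by ReLU; the quantization scheme is illustrative."""
    acc = w_q.astype(np.int32) @ x_q.astype(np.int32) + b_q
    acc = np.maximum(acc, 0)                            # ReLU
    return np.clip(acc >> shift, 0, 127).astype(np.int8)

rng = np.random.default_rng(0)
x = rng.integers(-128, 127, size=784, dtype=np.int8)        # flattened 28x28 digit
w = rng.integers(-128, 127, size=(10, 784), dtype=np.int8)  # 10 output classes
b = rng.integers(-1000, 1000, size=10, dtype=np.int32)
print(int8_dense(x, w, b))
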
44

ESMAEILDOUST, MOHAMMAD, and ALI ZAKEROLHOSSEINI. "ROUTING AWARE PLACEMENT ALGORITHM AND EFFICIENT FREE SPACE MANAGEMENT FOR RECONFIGURABLE SYSTEMS." Journal of Circuits, Systems and Computers 19, no. 06 (October 2010): 1217–34. http://dx.doi.org/10.1142/s0218126610006839.

Abstract:
In partially reconfigurable devices such as FPGAs, logic resources and communication channels can be reconfigured without affecting other sections of the device. This allows parallel execution of multiple tasks on an FPGA. Due to the limited resources on an FPGA, effective management is required for efficient execution of tasks. We present a new approach for managing FPGA logic resources as well as communication channels in online task placement. The approach creates communication channels between tasks, and between tasks and I/O elements, without requiring extra computation overhead. We present a fast algorithm for searching maximal empty rectangles (MERs) for management of the FPGA area. We then present an exact routing algorithm to find the distance and create an exact path between two sets of tasks. A new fitting strategy based on the rate of communication between tasks is also presented. The results indicate that the proposed strategy, as well as its combination with other known strategies, can improve the quality of placement.
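
The Python sketch below conveys the flavour of communication-aware online placement on a 2D reconfigurable area: it enumerates positions where a rectangular task fits and picks the feasible position closest to a communicating peer. It is a simplified model with invented names (feasible_positions, place_near) and is not the MER search or the exact routing algorithm proposed in the paper.

import numpy as np

def feasible_positions(occ: np.ndarray, w: int, h: int):
    """All top-left corners where a w x h task fits on the occupancy grid (True = used)."""
    H, W = occ.shape
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            if not occ[y:y + h, x:x + w].any():
                yield (x, y)

def place_near(occ, w, h, peer_xy):
    """Communication-aware fit: among feasible positions, pick the one closest (Manhattan)
    to the task we communicate with, then mark the region as occupied."""
    best = min(feasible_positions(occ, w, h),
               key=lambda p: abs(p[0] - peer_xy[0]) + abs(p[1] - peer_xy[1]),
               default=None)
    if best is not None:
        x, y = best
        occ[y:y + h, x:x + w] = True
    return best

fpga = np.zeros((20, 30), dtype=bool)     # abstract 20x30 reconfigurable area
fpga[0:8, 0:10] = True                    # an already running task
print(place_near(fpga, w=6, h=5, peer_xy=(0, 0)))
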
45

Zhang, Xinyi, Yawen Wu, Peipei Zhou, Xulong Tang, and Jingtong Hu. "Algorithm-hardware Co-design of Attention Mechanism on FPGA Devices." ACM Transactions on Embedded Computing Systems 20, no. 5s (October 31, 2021): 1–24. http://dx.doi.org/10.1145/3477002.

Abstract:
Multi-head self-attention (the attention mechanism) has been employed in a variety of fields such as machine translation, language modeling, and image processing due to its superiority in feature extraction and sequential data analysis. This benefit comes from the large number of parameters and the sophisticated model architecture behind the attention mechanism. To efficiently deploy the attention mechanism on resource-constrained devices, existing works propose to reduce the model size by building a customized smaller model or compressing a big standard model. A customized smaller model is usually optimized for a specific task and needs effort in model parameter exploration. Model compression reduces model size without hurting the robustness of the model architecture and can be efficiently applied to different tasks. The compressed weights in the model are usually regularly shaped (e.g., rectangular), but their dimension sizes vary (e.g., they differ in rectangle height and width). Such a compressed attention mechanism can be efficiently deployed on CPU/GPU platforms, as their memory and computing resources can be flexibly assigned on demand. However, for Field Programmable Gate Arrays (FPGAs), the data buffer allocation and computing kernel are fixed at run time to achieve maximum energy efficiency. After compression, weights are much smaller and different in size, which leads to inefficient utilization of the FPGA on-chip buffer. Moreover, the different weight heights and widths may lead to inefficient FPGA computing kernel execution. Due to the large number of weights in the attention mechanism, building a unique buffer and computing kernel for each compressed weight on the FPGA is not feasible. In this work, we jointly consider the compression impact on buffer allocation and the required computing kernel while compressing the attention mechanism. A novel structural pruning method with memory footprint awareness is proposed, and the associated accelerator on FPGA is designed. The experimental results show that our work can compress the Transformer (an attention-mechanism-based model) by 95x. The developed accelerator can fully utilize the FPGA resources, processing the sparse attention mechanism with a run-time throughput of 1.87 TOPS on a ZCU102 FPGA.
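
To picture why structurally pruned weights stay regularly shaped but differently sized, the NumPy sketch below drops whole rows of a weight matrix by L2 norm, so the survivor remains a dense rectangle of reduced height. The keep ratio, matrix size, and function name are arbitrary; this is not the paper's memory-footprint-aware pruning method or its accelerator.

import numpy as np

def prune_rows(weight: np.ndarray, keep_ratio: float) -> np.ndarray:
    """Structured pruning sketch: drop whole output rows with the smallest L2 norm,
    so the surviving weight stays a dense rectangle (FPGA-friendly), just smaller."""
    n_keep = max(1, int(round(weight.shape[0] * keep_ratio)))
    norms = np.linalg.norm(weight, axis=1)
    keep = np.sort(np.argsort(norms)[-n_keep:])      # keep the strongest rows, preserve order
    return weight[keep]

rng = np.random.default_rng(1)
w_q = rng.standard_normal((512, 512))                # e.g. one projection matrix of one head
print(prune_rows(w_q, keep_ratio=0.25).shape)        # -> (128, 512)
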
46

Zgheib, Grace, and Iyad Ouaiss. "Enhanced Technology Mapping for FPGAs with Exploration of Cell Configurations." Journal of Circuits, Systems and Computers 24, no. 03 (February 10, 2015): 1550039. http://dx.doi.org/10.1142/s0218126615500395.

Abstract:
In state-of-the-art field-programmable gate arrays (FPGAs), logic circuits are synthesized and mapped onto clusters of look-up tables. However, arithmetic operations benefit from an existing dedicated adder along with a carry chain used to ensure fast carry propagation. This carry chain is a dedicated wire available in the architecture of the FPGA and is as such independent of the external programmable routing resources. In this paper, we propose a variable-structure Boolean matching technology mapper with embedded decomposition techniques to map non-arithmetic logic functions onto carry chains. Previously synthesized and mapped logic functions are adapted so that their outputs are routed using the dedicated carry chains instead of the external programmable interconnects. The experimental results show a reduction in the used routing resources as well as in circuit area when using this Boolean matching-based mapper on the Altera Stratix-III FPGA.
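
A toy version of the Boolean matching step is shown below: the Python snippet checks whether a 3-input function can be mapped onto the full-adder carry (majority) cell by permuting and optionally inverting its inputs. Real technology mappers handle richer transformations, decomposition, and the actual Stratix-III cell; the restricted matching here is a simplification for illustration only.

from itertools import permutations, product

def truth_table(f, n=3):
    return tuple(f(*bits) for bits in product((0, 1), repeat=n))

carry = lambda a, b, c: (a & b) | (a & c) | (b & c)       # majority = carry-out of a full adder
TARGET = truth_table(carry)

def matches_carry(f):
    """Simplified Boolean matching: can f be mapped onto the carry (majority) cell
    by permuting and optionally inverting its inputs? (No output inversion here.)"""
    for perm in permutations(range(3)):
        for neg in product((0, 1), repeat=3):
            def g(a, b, c, perm=perm, neg=neg):
                v = (a, b, c)
                return f(*[v[perm[i]] ^ neg[i] for i in range(3)])
            if truth_table(g) == TARGET:
                return True
    return False

print(matches_carry(lambda a, b, c: (a | b) & (a | c) & (b | c)))   # True: majority, factored form
print(matches_carry(lambda a, b, c: a ^ b ^ c))                     # False: parity is not majority
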
47

Amar, Hebibi, Arres Bartil, and Lahcene Ziet. "Comparison of two new methods for implementa BPSK modulator using FPGA." Indonesian Journal of Electrical Engineering and Computer Science 19, no. 2 (August 1, 2020): 819. http://dx.doi.org/10.11591/ijeecs.v19.i2.pp819-827.

Abstract:
The design of electronic systems has become largely dependent on FPGA applications. This is due to the flexibility offered by reconfigurable computing and the reduced time needed to develop digital signal processing solutions. In this article, we present the theoretical background of BPSK modulation and two hardware designs of the BPSK system: the first built with the help of Matlab/Simulink relying on System Generator, and the second with the Xilinx ISE Verilog Hardware Description Language flow. The goal is to show the differences between them in terms of efficiency, duration of development, and how many FPGA resources are used. For the proposed system, we aimed at employing a moderately sized, low-cost FPGA. The Atlys development board by Digilent, based on a Xilinx Spartan-6 LX45 FPGA, is used to configure, develop, and run the system.
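
For reference, the floating-point Python sketch below captures what both hardware designs ultimately implement: map bits to +/-1 symbols and multiply by a carrier. The carrier frequency, sample rate, and symbol rate are arbitrary example values, and the code is unrelated to the System Generator or Verilog sources used in the paper.

import numpy as np

def bpsk_modulate(bits, fc=10e3, fs=100e3, sps=None):
    """Map bits {0,1} -> symbols {+1,-1} and multiply by a carrier (floating-point model
    of what the FPGA design does with a lookup-table or DDS carrier)."""
    symbols = 1.0 - 2.0 * np.asarray(bits, dtype=float)      # 0 -> +1, 1 -> -1
    sps = sps or int(fs // 1e3)                               # samples per symbol (1 kbaud here)
    baseband = np.repeat(symbols, sps)
    t = np.arange(baseband.size) / fs
    return baseband * np.cos(2 * np.pi * fc * t)

waveform = bpsk_modulate([1, 0, 1, 1, 0])
print(waveform.shape)
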
48

Luo, Yawen, and Yuhua Chen. "FPGA-Based Acceleration on Additive Manufacturing Defects Inspection." Sensors 21, no. 6 (March 18, 2021): 2123. http://dx.doi.org/10.3390/s21062123.

Abstract:
Additive manufacturing (AM) has gained increasing attention over the past years due to its fast prototyping, easier modification, and the possibility of devices with complex internal textures when compared to traditional manufacturing processes. However, internal defects can occur during AM processes, and real-time inspection is required to minimize costs by either aborting the processing or repairing the defect. In order to perform the defect inspection, the defect database NEU-DET is first used for training. Then, a convolutional neural network (CNN) is applied to perform defect classification. For real-time purposes, Field Programmable Gate Arrays (FPGAs) are utilized for acceleration. A binarized neural network (BNN) is proposed to best fit the FPGA bit operations. Finally, for images labeled with defects, the selective search and non-maximum suppression algorithms are implemented to help locate the coordinates of defects. Experiments show that the BNN model on NEU-DET can achieve 97.9% accuracy in identifying whether an image is defective or defect-free. As for the image classification speed, the FPGA-based BNN module can process one image within 0.5 s. The BNN design is modularized and can be duplicated in parallel to fully utilize logic gates and memory resources in FPGAs. It is clear that the proposed FPGA-based BNN can perform real-time defect inspection with high accuracy, and it can easily scale up to larger FPGA implementations.
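
Of the post-processing steps mentioned, non-maximum suppression is easy to show compactly: the NumPy sketch below keeps the highest-scoring box and discards overlapping candidates above an IoU threshold. The boxes, scores, and threshold are made-up example values, and this is a generic NMS routine rather than the implementation used in the paper.

import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Plain non-maximum suppression over [x1, y1, x2, y2] boxes."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        if order.size == 1:
            break
        rest = order[1:]
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_thresh]        # drop candidates that overlap too much
    return keep

boxes = np.array([[10, 10, 50, 50], [12, 12, 52, 52], [100, 100, 140, 150]], dtype=float)
print(nms(boxes, np.array([0.9, 0.8, 0.7])))   # -> [0, 2]
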
49

Ghaffari, Alireza, and Yvon Savaria. "CNN2Gate: An Implementation of Convolutional Neural Networks Inference on FPGAs with Automated Design Space Exploration." Electronics 9, no. 12 (December 21, 2020): 2200. http://dx.doi.org/10.3390/electronics9122200.

Abstract:
Convolutional Neural Networks (CNNs) have a major impact on our society because of the numerous services they provide. These services include, but are not limited to, image classification, video analysis, and speech recognition. Recently, the number of research efforts that utilize FPGAs to implement CNNs has been increasing rapidly. This is due to the lower power consumption and easy reconfigurability offered by these platforms. Because of the research efforts put into topics such as architecture, synthesis, and optimization, new challenges are arising for integrating suitable hardware solutions with high-level machine learning software libraries. This paper introduces an integrated framework (CNN2Gate), which supports compilation of a CNN model for an FPGA target. CNN2Gate is capable of parsing CNN models from several popular high-level machine learning libraries, such as Keras, Pytorch, Caffe2, etc. CNN2Gate extracts the computation flow of layers, in addition to weights and biases, and applies a "given" fixed-point quantization. Furthermore, it writes this information in the proper format for the FPGA vendor's OpenCL synthesis tools, which are then used to build and run the project on the FPGA. CNN2Gate performs design-space exploration and automatically fits the design on different FPGAs with limited logic resources. This paper reports results of automatic synthesis and design-space exploration of AlexNet and VGG-16 on various Intel FPGA platforms.
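
The "given" fixed-point quantization step can be pictured in a few lines of NumPy: scale the weights, round, and saturate to the chosen word length. The Q2.6 format, sample values, and function name below are arbitrary choices for this sketch and are not taken from CNN2Gate.

import numpy as np

def to_fixed_point(w, int_bits=2, frac_bits=6):
    """Apply a fixed-point quantization (Q2.6 here): scale, round, saturate, and return
    both the integer codes and the dequantized approximation."""
    scale = 2 ** frac_bits
    lo, hi = -2 ** (int_bits + frac_bits - 1), 2 ** (int_bits + frac_bits - 1) - 1
    codes = np.clip(np.round(w * scale), lo, hi).astype(np.int16)
    return codes, codes.astype(np.float32) / scale

w = np.array([0.7312, -1.05, 0.0049, 3.2])
codes, approx = to_fixed_point(w)
print(codes, approx)          # note the saturation of the out-of-range value 3.2
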
50

Hace, Aleš. "The Improved Division-Less MT-Type Velocity Estimation Algorithm for Low-Cost FPGAs." Electronics 8, no. 3 (March 25, 2019): 361. http://dx.doi.org/10.3390/electronics8030361.

Abstract:
Advanced motion control applications require smooth and highly accurate high-bandwidth velocity feedback, which is usually provided by an incremental encoder. Furthermore, high sampling rates are also demanded in order to achieve cutting-edge system performance. Such control system performance with high accuracy can easily be achieved by FPGA-based controllers. On the other hand, the well-known MT method for velocity estimation has been well proven in practice. However, its complexity, which stems from the arithmetic division inherent in the calculation part of the method, prevents its holistic implementation as a single-chip solution on the small, low-cost FPGAs that are suitable for practical, optimized control systems. In order to overcome this obstacle, we proposed a division-less MT-type algorithm that consumes only minimal FPGA resources, which makes it suitable for modern cost-optimized FPGAs. In this paper, we present new results. The recursive discrete algorithm has been further optimized in order to improve the accuracy of the velocity estimation. The novel algorithm has also been implemented on an experimental FPGA board and validated by practical experiments. The enhanced algorithm design resulted in improved practical performance.
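
For orientation, the classic MT estimate that the paper starts from can be written in a few lines: count m1 encoder pulses over a window whose exact length is measured as m2 ticks of a fast clock, then divide. The sketch below, with an assumed encoder resolution and clock frequency, makes explicit the division that the proposed recursive algorithm removes; it is not the division-less algorithm itself.

import math

def mt_velocity(m1_pulses, m2_clocks, f_clk=100e6, ppr=4096):
    """Classic M/T estimate: m1 encoder counts over a window whose true length is
    m2 ticks of the high-frequency clock; note the division the paper eliminates."""
    window_s = m2_clocks / f_clk
    return 2.0 * math.pi * m1_pulses / (ppr * window_s)

print(mt_velocity(m1_pulses=37, m2_clocks=1_000_000))   # rad/s over a roughly 10 ms window
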