Journal articles on the topic 'Multi-FPGA Boards'

Consult the top 50 journal articles for your research on the topic 'Multi-FPGA Boards.'

1

Guo, Shuaizhi, Tianqi Wang, Linfeng Tao, Teng Tian, Zikun Xiang, and Xi Jin. "RP-Ring: A Heterogeneous Multi-FPGA Accelerator." International Journal of Reconfigurable Computing 2018 (2018): 1–14. http://dx.doi.org/10.1155/2018/6784319.

Abstract:
To reduce the cost of designing new specialized FPGA boards for use as a direct-summation MOND (Modified Newtonian Dynamics) simulator, we propose a new heterogeneous architecture built from existing FPGA boards, called RP-ring (reconfigurable processor ring). This design can be expanded conveniently with any available FPGA board and requires only low communication bandwidth between FPGA boards. The communication protocol is simple and can be implemented with limited hardware/software resources. To avoid overall performance loss caused by the slowest board, we build a mathematical model to decompose the workload among FPGAs. The workload division is based on the logic resources, memory access bandwidth, and communication bandwidth of each FPGA chip. Our accelerator achieves a speedup of two orders of magnitude over a CPU implementation.
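The idea of dividing work so that the slowest board does not hold back the ring can be illustrated with a minimal sketch. This is not the mathematical model from the paper: the function names are hypothetical, and treating each board's effective rate as the minimum of its compute, memory, and link rates is an assumption made only for illustration.

```python
from typing import List


def effective_throughput(compute: float, memory: float, link: float) -> float:
    """Assumed model: a board is limited by its tightest bottleneck."""
    return min(compute, memory, link)


def split_workload(total_items: int, throughputs: List[float]) -> List[int]:
    """Split total_items across boards in proportion to their throughput.

    Proportional shares let all boards finish at roughly the same time,
    so no single slow board dominates the overall run time.
    """
    if total_items < 0 or not throughputs or any(t <= 0 for t in throughputs):
        raise ValueError("need a non-negative item count and positive throughputs")
    total = sum(throughputs)
    exact = [total_items * t / total for t in throughputs]
    shares = [int(x) for x in exact]  # round each share down first
    # Hand the leftover items to the boards with the largest fractional parts.
    leftover = total_items - sum(shares)
    by_fraction = sorted(range(len(shares)),
                         key=lambda i: exact[i] - shares[i],
                         reverse=True)
    for i in by_fraction[:leftover]:
        shares[i] += 1
    return shares
```

For example, `split_workload(1000, [1.0, 3.0])` gives the board that is three times faster three quarters of the items.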
2

Stęplewski, Wojciech, Mateusz Mroczkowski, Radoslav Darakchiev, Konrad Futera, and Grażyna Kozioł. "New technologies of multi-layered printed circuit boards, intended of rapid-design electronic modules." Circuit World 41, no. 3 (August 3, 2015): 121–24. http://dx.doi.org/10.1108/cw-03-2015-0008.

Abstract:
Purpose – The purpose of this study was to apply embedded components technology and innovative printed circuit board (PCB) concepts to electronic modules containing field-programmable gate array (FPGA) devices with a large number of pins (e.g. Virtex 6, FF1156/RF1156 package, 1,156 pins). Design/methodology/approach – In the multi-layered boards, embedded passive components that support FPGA device input/output (I/O), such as blocking capacitors and pull-up resistors, were used. These modules can be used in rapid design of electronic devices. In the study, the MC16T FaradFlex material was used for the inner capacitive layer. The Ohmega-Ply RCM 25 Ω/sq material was used to manufacture pull-up resistors for high-frequency pins. The embedded components were connected to the pins of the FPGA component by using plated-through holes for capacitors and blind vias for resistors. A technique for board-to-board joining using castellated terminations is also described. Findings – Fully functional modules for assembly of the FPGA were manufactured. The achieved resistance of the embedded micro resistors, which are as small as the smallest surface-mount device components currently in use (01005), was within the required tolerance of 10 per cent. The obtained tolerance of the capacitors was less than 3 per cent. The use of embedded components made it possible to replace the pull-up resistors and blocking capacitors and shortened the signal path from the I/O of the FPGA. Correct connection to the castellated terminations with a very small pitch was also obtained. In further planned studies, this will allow the creation of a full signal distribution system from the FPGA without the use of unreliable plug connectors in aviation and space technology. Originality/value – This study developed and manufactured several innovative concepts of signal distribution from printed circuit boards. The signal distribution solutions were integrated with embedded components, which allowed for a significant reduction in the signal path. This study allows us to build the target object, which is the module for rapid design of the FPGA device. Using a pre-designed module would reduce the time needed to develop an FPGA-based device, as a significant part of the necessary work (mainly designing the signal and power fan-out) will already have been done during the module development.
3

Hammad, Saifullah, and Muhammad Hasnain. "Highly Expandable Reconfigurable Platform using Multi-FPGA based Boards." International Journal of Computer Applications 51, no. 12 (August 30, 2012): 15–20. http://dx.doi.org/10.5120/8094-1674.

4

Caba, Julián, María Díaz, Jesús Barba, Raúl Guerra, Jose A. de la Torre, and Sebastián López. "FPGA-Based On-Board Hyperspectral Imaging Compression: Benchmarking Performance and Energy Efficiency against GPU Implementations." Remote Sensing 12, no. 22 (November 13, 2020): 3741. http://dx.doi.org/10.3390/rs12223741.

Abstract:
Remote-sensing platforms, such as Unmanned Aerial Vehicles, are characterized by a limited power budget and low-bandwidth downlinks. Therefore, handling hyperspectral data in this context can jeopardize the operational time of the system. FPGAs have traditionally been regarded as the most power-efficient computing platforms. However, there is little experimental evidence to support this claim, which is especially critical since the actual behavior of solutions based on reconfigurable technology is highly dependent on the type of application. In this work, a highly optimized implementation of an FPGA accelerator of the novel HyperLCA algorithm has been developed and thoughtfully analyzed in terms of performance and power efficiency. In this regard, a modification of the aforementioned lossy compression solution has also been proposed so that it can be executed efficiently on FPGA devices using fixed-point arithmetic. Single and multi-core versions of the reconfigurable computing platforms are compared with three GPU-based implementations of the algorithm on as many NVIDIA computing boards: Jetson Nano, Jetson TX2 and Jetson Xavier NX. Results show that the single-core version of our FPGA-based solution fulfils the real-time requirements of a real-life hyperspectral application using a mid-range Xilinx Zynq-7000 SoC chip (XC7Z020-CLG484). Performance levels of the custom hardware accelerator are above the figures obtained by the Jetson Nano and TX2 boards, and power efficiency is higher for smaller sizes of the image block to be processed. To close the performance gap between our proposal and the Jetson Xavier NX, a multi-core version is proposed. The results demonstrate that a solution based on several instances of the FPGA hardware compressor core achieves levels of performance similar to those of the state-of-the-art GPU, with better efficiency in terms of processed frames per watt.
5

Zabołotny, Wojciech Marek. "Versatile DMA Engine for High-Energy Physics Data Acquisition Implemented with High-Level Synthesis." Electronics 12, no. 4 (February 9, 2023): 883. http://dx.doi.org/10.3390/electronics12040883.

Abstract:
FPGA-based cards for data concentration and readout are often used in data acquisition (DAQ) systems for high-energy physics experiments. The DMA engines implemented in FPGA enable efficient data transfer to the processing system’s memory. This paper presents a versatile DMA engine. It may be used in systems with FPGA-equipped PCIe boards hosted in a server and MPSoC-based systems with programmable logic connected directly to the AXI system bus. The core part of the engine is implemented in HLS to simplify further development and modifications. The design is modular and may be easily integrated with the user’s DAQ logic, assuming it delivers the data via a standard AXI-Stream interface. The engine and accompanying software are designed with flexibility in mind. They offer a simple single-packet mode for debugging and a high-performance multi-packet mode fully utilizing the computational power of the processing system. The number of used DAQ cards and the amount of memory used for the DMA buffer may be modified in the runtime without rebooting the system. That is particularly useful in the development and test setups. This paper also presents the development and testing methodology. The whole design is open-source and available in public repositories.
6

Takeda, Kosuke. "Software-based data acquisition system for Level-1 end-cap muon trigger in Atlas Run-3." EPJ Web of Conferences 214 (2019): 01036. http://dx.doi.org/10.1051/epjconf/201921401036.

Abstract:
In 2019, the ATLAS experiment at CERN is planning an upgrade in order to cope with the higher luminosity requirements. In this upgrade, the installation of the new muon chambers for the end-cap muon system will be carried out. Muon track reconstruction performance can be improved, and fake triggers can be reduced. It is also necessary to develop a readout system for the trigger data of the Level-1 end-cap muon trigger. We have decided to develop a software-based data acquisition system. Therefore, we have implemented SiTCP technology, which connects an FPGA with the network, on the FPGAs of the new trigger processor boards. Due to this implementation, the new DAQ system can take advantage of the latest developments in the computing industry. This new readout system architecture is based on multi-process software and can assemble events at a rate of 100 kHz. For data collection, a 10 Gbit Ethernet network switch is used. Moreover, we have optimized these processes to send data to the following system without any error. Therefore, the built events can be sent with an average throughput of approximately 211 Mbps. Our newly developed readout system is very generic, flexible for modifications and extensions, and easy to debug. This paper will present the details of the new software-based DAQ system and report the development status for ATLAS Run-3.
7

Dumez-Viou, Cédric, Rodolphe Weber, and Philippe Ravier. "Multi-Level Pre-Correlation RFI Flagging for Real-Time Implementation on UniBoard." Journal of Astronomical Instrumentation 05, no. 04 (December 2016): 1641019. http://dx.doi.org/10.1142/s2251171716410191.

Abstract:
Because of the denser active use of the spectrum, and because of the higher sensitivity of radio telescopes, radio frequency interference (RFI) mitigation has become a sensitive topic for current and future radio telescope designs. Even though quite sophisticated approaches have been proposed in recent years, the majority of operational RFI mitigation procedures are based on post-correlation flagging of corrupted data. Moreover, given the huge amount of data delivered by current and next-generation radio telescopes, all these RFI detection procedures have to be at least automatic and, if possible, real-time. In this paper, the implementation of a real-time pre-correlation RFI detection and flagging procedure on generic high-performance computing platforms based on field programmable gate arrays (FPGAs) is described, simulated and tested. One of these boards, UniBoard, developed under a Joint Research Activity in the RadioNet FP7 European programme, is based on eight FPGAs interconnected by a high-speed transceiver mesh. With Altera® Stratix IV FPGAs, it provides up to 4 TMACs and a 160 Gbps data rate for the input data stream. The proposed concept is to continuously monitor the data quality at different stages in the digital preprocessing pipeline between the antennas and the correlator, at the station level and the core level. In this way, the detectors are applied at stages where different time–frequency resolutions can be achieved and where the interference-to-noise ratio (INR) is at its maximum, right before any dilution of RFI characteristics by subsequent channelizations or signal recombinations. The detection decisions could be linked to an RFI statistics database or could be attached to the data for later-stage flagging. Considering the high in–out data rate in the pre-correlation stages, only real-time and go-through detectors (i.e. no iterative processing) can be implemented. In this paper, a real-time and adaptive detection scheme is described. An ongoing case study has been set up with the Electronic Multi-Beam Radio Astronomy Concept (EMBRACE) radio telescope facility at Nançay Observatory. The objective is to evaluate the performance of this concept in terms of hardware complexity, detection efficiency and additional RFI metadata rate cost. The UniBoard implementation scheme is described.
8

CUHADAROGLU, Burak, and H. Gökhan İLK. "Design and implementation of a low cost, high performance ionizing radiation source detection and source direction finding system." Communications Faculty of Sciences University of Ankara Series A2-A3 Physical Sciences and Engineering 63, no. 2 (December 30, 2021): 93–117. http://dx.doi.org/10.33769/aupse.942315.

Abstract:
This study shows the design, implementation, and test results of a low-cost portable radiation-detector system that relies on a directionally designed multi-detector probe working in Geiger-Müller counting mode with a single-chip solution. The proposed system can detect the ionizing radiation source, count gamma events, and show the direction and angle of the gamma source relative to the position of the device. The radiation direction finding (RDF) system consists of a radiation probe and electronic sections that are mounted in a metal box. The probe has a cast lead housing with 8 directional slots for the optically isolated PIN diode arrays, where each array consists of 4 BPW 34 PIN diodes connected in parallel. The lead housing also blocks incident rays from unintended directions and provides directional sensing for the PIN diodes. The metal box contains 8 low-noise amplifier and pulse-shaping detector boards, one assigned to each channel of the PIN diode arrays, a signal inverter board, a step-up high-voltage board, a 12 V battery, and a parallel-processing FPGA board with embedded VHDL software that can process all 8 channels simultaneously and execute the direction estimation algorithm. The system also has an adjustable detector bias voltage, and the applied voltage can be displayed on a seven-segment display located on the front of the unit, so that different models of PIN diodes can be used and tested with different bias voltage levels. It also has an HMI touch screen unit and user interface for displaying the Cpm or Cps values of each channel, a 360-degree scale showing the direction of the source with its pointer, and an indicator showing the direction of the source numerically in degrees. The system works as a gamma detector, and the source direction can also be detected within a ±45° interval. The success rate of the system within this interval is 99.22%. The detector was tested with low- to high-energy gamma sources (241Am, 9.761 μCi, 59.54 keV; 137Cs, 3.7 MBq, 661 keV; and 60Co, μCi, 1173 and 1332 keV) and showed a good level of sensitivity in gamma ray detection. The major outcome of this study, and its major contribution to the literature, is therefore the design and production details of a hand-held detector and source direction locator prototype, which is a light, portable and compact system.
9

Yang, Changqing, Dawei Zhou, and Li Lu. "Multi-channel High-speed Data Acquisition System Based on Improved SPI Communication." Journal of Physics: Conference Series 2404, no. 1 (December 1, 2022): 012028. http://dx.doi.org/10.1088/1742-6596/2404/1/012028.

Abstract:
A multichannel high-speed data acquisition system based on improved SPI communication is designed to meet the requirements of synchronous acquisition of current and voltage parameters by master and slave control boards. The main architecture of this system is a master-slave FPGA board and an AD acquisition card, and the program is developed on the Vivado platform. Building on traditional SPI communication, this paper puts forward an improved SPI communication scheme. By increasing the number of MISO data lines, the data transmission rate between the master and slave control boards is greatly increased, and the FPGA expansion port resources are fully utilized. In addition, a data conversion module is added to the system so that the acquisition results can be observed more intuitively. Alongside high-speed data acquisition, the measurement accuracy within the specified voltage input range is high. The experimental results show that the system can realize multichannel high-speed data acquisition.
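A small sketch can show why adding MISO lines raises the transfer rate: with several data lines, several bits move on each clock edge. The bit-to-lane mapping below (bit i goes to lane i mod lanes, most significant bit first) and all function names are assumptions for illustration; the abstract does not say how the authors distribute bits across lines.

```python
from typing import List


def clocks_per_word(width: int, lanes: int) -> int:
    """Clock cycles needed to shift one word over `lanes` parallel data lines."""
    return -(-width // lanes)  # ceiling division


def stripe(word: int, width: int, lanes: int) -> List[List[int]]:
    """Split a word into per-lane bit sequences, most significant bit first."""
    bits = [(word >> (width - 1 - i)) & 1 for i in range(width)]
    return [bits[lane::lanes] for lane in range(lanes)]


def merge(lane_bits: List[List[int]], width: int) -> int:
    """Rebuild the word on the receiving side from the per-lane sequences."""
    lanes = len(lane_bits)
    word = 0
    for i in range(width):
        word = (word << 1) | lane_bits[i % lanes][i // lanes]
    return word
```

Under this mapping a 16-bit sample needs 16 clocks on a single MISO line but only 4 clocks on four lines, and `merge(stripe(w, 16, 4), 16)` returns `w` unchanged.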
10

Turki, Mariem, Zied Marrakchi, Habib Mehrez, and Mohamed Abid. "Frequency Optimization Objective during System Prototyping on Multi-FPGA Platform." International Journal of Reconfigurable Computing 2013 (2013): 1–12. http://dx.doi.org/10.1155/2013/853510.

Abstract:
Multi-FPGA hardware prototyping is becoming increasingly important in the system on chip design cycle. However, after partitioning the design on the multi-FPGA platform, the number of inter-FPGA signals is greater than the number of physical connections available on the prototyping board. Therefore, these signals should be time-multiplexed which lowers the system frequency. The way in which the design is partitioned affects the number of inter-FPGA signals. In this work, we propose a set of constraints to be taken into account during the partitioning task. Then, the resulting inter-FPGA signals are routed with an iterative routing algorithm in order to obtain the best multiplexing ratio. Indeed, signals are grouped and then routed using the intra-FPGA routing algorithm: Pathfinder. This algorithm is adapted to deal with the inter-FPGA routing problem. Many scenarios are proposed to obtain the most optimized results in terms of prototyping system frequency. Using this technique, the system frequency is improved by an average of 12.8% compared to constructive routing algorithm.
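The link between signal count, multiplexing, and system frequency described in this abstract can be illustrated with a rough sketch. The function names are hypothetical, and the frequency model (multiplexing clock divided by the ratio, ignoring synchronization overhead) is a simplifying assumption, not the cost model used in the paper.

```python
def multiplexing_ratio(num_signals: int, num_wires: int) -> int:
    """Signals that must share each physical inter-FPGA wire."""
    if num_signals < 0 or num_wires <= 0:
        raise ValueError("need a non-negative signal count and at least one wire")
    return max(1, -(-num_signals // num_wires))  # ceiling division, at least 1


def system_frequency_mhz(mux_clock_mhz: float, ratio: int) -> float:
    """Assumed model: one system cycle needs `ratio` multiplexing-clock cycles."""
    return mux_clock_mhz / ratio
```

With 1000 cut signals and 128 wires the ratio is 8, so a 400 MHz multiplexing clock would allow at most about 50 MHz system frequency under this model. A partition that cuts fewer signals lowers the ratio and raises that bound.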
11

Zhang, Cheng Chang, Li Sheng Yang, Xiao Ping Hu, Hong Yang, and Ping Li. "A New Clock Synchronization Method for Multi-FPGA Systems." Advanced Materials Research 204-210 (February 2011): 907–10. http://dx.doi.org/10.4028/www.scientific.net/amr.204-210.907.

Abstract:
A novel clock synchronization scheme is proposed in this paper. It uses the internal delay-locked loop (DLL) to compensate for the delay introduced by board-level feedback and combines this with a conventional external clock tree scheme to achieve system clock synchronization.
12

Jain, Sushil Chandra, Anshul Kumar, and Shashi Kumar. "Hybrid Multi-FPGA Board Evaluation by Permitting Limited Multi-Hop Routing." Design Automation for Embedded Systems 8, no. 4 (December 2003): 309–26. http://dx.doi.org/10.1023/b:daem.0000013065.87652.df.

13

NUHA, MUHAMMAD ULIN, HARI ARIEF DHARMAWAN, and SETYAWAN PURNOMO SAKTI. "Desain ADC SAR 10-Bit Dua Kanal Simultan menggunakan Board FPGA Altera DE10." ELKOMIKA: Jurnal Teknik Energi Elektrik, Teknik Telekomunikasi, & Teknik Elektronika 10, no. 1 (January 14, 2022): 16. http://dx.doi.org/10.26760/elkomika.v10i1.16.

Abstract:
The design of a simultaneous multi-channel ADC (Analog to Digital Converter) architecture on a controller device can reduce the number of program instructions (tasks) that must be executed by the microprocessor and can be used to perform simultaneous measurements. This paper describes the design of a simultaneous two-channel 10-bit SAR (Successive Approximation Register) ADC using an Altera DE10 FPGA (Field Programmable Gate Array) board. The FPGA is configured in VHDL (VHSIC Hardware Description Language, where VHSIC stands for Very High Speed Integrated Circuit) to function as a two-channel SAR logic circuit. Test results show that the ADC_1 and ADC_2 channels have average errors of 1.05% and 0.90%, accuracies of 98.95% and 99.09%, and linearity with correlation coefficients of 0.9999 and 0.9999. One ADC conversion takes 104 μs, which gives a sampling rate of 9.6 kS/s. The power consumed is 842 mW. The two SAR ADC channels that were built can work simultaneously. Keywords: ADC, simultaneous two-channel, FPGA, SAR, VHDL
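The successive-approximation principle named in this abstract, and the relation between its reported conversion time and sampling rate, can be sketched as follows. This is a generic software model of a SAR loop with an ideal comparator and DAC, not the authors' VHDL design; the function names are hypothetical.

```python
def sar_convert(v_in: float, v_ref: float, bits: int = 10) -> int:
    """Binary-search the input voltage, deciding one bit per step (MSB first)."""
    code = 0
    for bit in range(bits - 1, -1, -1):
        trial = code | (1 << bit)
        # Ideal DAC output for the trial code; keep the bit if the input is at least that high.
        if v_in >= trial * v_ref / (1 << bits):
            code = trial
    return code


def sample_rate_hz(conversion_time_s: float) -> float:
    """One sample per conversion, so the rate is the reciprocal of the conversion time."""
    return 1.0 / conversion_time_s
```

A 104 μs conversion time corresponds to roughly 9615 samples per second, which matches the 9.6 kS/s figure reported in the abstract.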
14

Zhao, Zhi Peng, Jun Wang, Chao Yun Mai, and Yu Xi Zhang. "A Multi-Channel Large-Capacity Acquisition and Storage System." Applied Mechanics and Materials 687-691 (November 2014): 3427–32. http://dx.doi.org/10.4028/www.scientific.net/amm.687-691.3427.

Abstract:
In this paper, a multi-channel large-capacity acquisition and storage system is prototyped to meet the urgent need for acquisition and analysis of multi-channel analog signals. Composed of an FPGA-based acquisition board and storage board, the system can acquire 4 channels of signals and store them in a 512 GB flash array. Gigabit Ethernet is used to connect the system to a PC, which ensures high throughput between them. Ping-pong operation and pipelining are applied to hide the flash programming time and improve the access speed. The system is portable, stable, and expandable, and could be used in various areas.
15

Chen, Yonghao, Tianrui Li, Xiaojie Chen, ZhiGang Cai, and Tao Su. "High-Frequency Systolic Array-Based Transformer Accelerator on Field Programmable Gate Arrays." Electronics 12, no. 4 (February 6, 2023): 822. http://dx.doi.org/10.3390/electronics12040822.

Abstract:
The systolic array is frequently used in accelerators for neural networks, including Transformer models that have recently achieved remarkable progress in natural language processing (NLP) and machine translation. Due to the constraints of FPGA EDA (Field Programmable Gate Array Electronic Design Automation) tools and the limitations of design methodology, existing systolic array accelerators for FPGA deployment often cannot achieve high frequency. In this work, we propose a well-designed high-frequency systolic array for an FPGA-based Transformer accelerator, which is capable of performing the Multi-Head Attention (MHA) block and the position-wise Feed-Forward Network (FFN) block, reaching 588 MHz and 474 MHz for different array size, achieving a frequency improvement of 1.8× and 1.5× on a Xilinx ZCU102 board, while drastically saving resources compared to similar recent works and pushing the utilization of each DSP slice to a higher level. We also propose a semi-automatic design flow with constraint-generating tools as a general solution for FPGA-based high-frequency systolic array deployment.
16

Liu, Dongmei, Weiyuan Zhu, Yanhui Wang, Ziyi Chang, Kaikai Xie, and Shun Wang. "Power quality transient disturbances detection system based on db5 wavelet." Journal of Physics: Conference Series 2564, no. 1 (August 1, 2023): 012010. http://dx.doi.org/10.1088/1742-6596/2564/1/012010.

Abstract:
To improve the real-time performance and accuracy of transient disturbance localization in the power grid, this paper designs an FPGA-based transient disturbance detection system that exploits the good local time-frequency analysis capability and multi-resolution analysis characteristics of wavelet decomposition, using db5 (Daubechies 5) wavelets. The system acquires the original disturbance signal through an external high-speed AD/DA board, implements the wavelet transform algorithm on the EP4CE10F17C8 FPGA, and then feeds the result back to the high-speed AD/DA board to output the location of the singularities in the high-frequency detail signal that correspond to the start and end moments of the disturbance. Simulation and actual test results together show that the system can accurately and quickly locate the start and end moments of typical single disturbances such as voltage sag and short interruption, with a measurement deviation within 3 ms, meeting the real-time requirements.
17

Davidson, Kyle, and Joey Bray. "Understanding Digital Radio Frequency Memory Performance in Countermeasure Design." Applied Sciences 10, no. 12 (June 15, 2020): 4123. http://dx.doi.org/10.3390/app10124123.

Abstract:
This paper describes the design, implementation, and testing of a novel multi-function software defined Radio Frequency (RF) system designed for small airborne drone applications. The system was created using an inexpensive Field Programmable Gate Array (FPGA) to combine a coherent linear frequency modulated radar transmitter and receiver, with a Digital Radio Frequency Memory (DRFM) jammer for use with a common RF aperture in simultaneous operation. The system was implemented on a Xilinx Kintex-7 FPGA with a wideband analogue-to-digital/ digital-to-analogue (ADC/DAC) converter mezzanine board and tested using hardware-in-the-loop mode to validate its performance. This is the first known account of an integrated multifunction electronic attack and radar system on a single chip, capable of performing a simultaneous, not time shared, operation.
18

Śmigielski, Grzegorz. "Numerical control system based on a programmable logic device." MATEC Web of Conferences 357 (2022): 01005. http://dx.doi.org/10.1051/matecconf/202235701005.

Abstract:
The article presents a numerical control system for a multi-axis machine built using a programmable logic device. The system consists of a PC, a motor controller (based on an FPGA), power stages and motors. The complete controller code was written in VHDL and implemented on a Xilinx Spartan 3 board. The PC program was created using LabVIEW. Logical and functional tests of the system have been carried out. The use of a programmable device enables quick configuration according to the requirements set by the user. The main advantage of the FPGA is the option to extend the design with further modules, such as an incremental encoder support module, PWM, FOC, etc. In these applications, the advantage of the programmable system becomes visible: due to its nature, it is built for processing or generating signals in parallel.
19

Xu, Peng, Zhihua Xiao, Xianglong Wang, Lei Chen, Chao Wang, and Fengwei An. "A Multi-Core Object Detection Coprocessor for Multi-Scale/Type Classification Applicable to IoT Devices." Sensors 20, no. 21 (October 31, 2020): 6239. http://dx.doi.org/10.3390/s20216239.

Abstract:
Power efficiency is becoming a critical aspect of IoT devices. In this paper, we present a compact object-detection coprocessor with multiple cores for multi-scale/type classification. This coprocessor can process scalable block sizes for multi-shape detection windows and is compatible with frame-image sizes up to 2048 × 2048 for multi-scale classification. A memory-reuse strategy that requires only one dual-port SRAM for storing the feature vector of one row of blocks is developed to save memory usage. Finally, a prototype platform is implemented on the Intel DE4 development board with the Stratix IV device. The power consumption of each core in the FPGA is only 80.98 mW.
20

Handzlik, Adam, and Andrzej Jabłonski. "Large Data Stream Processing - Embedded Systems Design Challenges." International Journal of Electronics and Telecommunications 56, no. 2 (June 1, 2010): 107–10. http://dx.doi.org/10.2478/v10177-010-0013-4.

Abstract:
The following paper describes an application of reconfigurable hardware architectures for processing huge data streams. Radar, sonar and high-speed internet networks are typical sources of data that require extreme computing power and resources to enable real-time acquisition, processing and management. An approach to monitoring a real-time multi-gigabit internet network is described as a practical application of an FPGA-based board designed for fast data processing.
21

Pacini, Tommaso, Emilio Rapuano, Gianmarco Dinelli, and Luca Fanucci. "A Multi-Cache System for On-Chip Memory Optimization in FPGA-Based CNN Accelerators." Electronics 10, no. 20 (October 15, 2021): 2514. http://dx.doi.org/10.3390/electronics10202514.

Abstract:
In recent years, FPGAs have demonstrated remarkable performance and contained power consumption for the on-the-edge inference of Convolutional Neural Networks. One of the main challenges in implementing this class of algorithms on board an FPGA is resource management, especially with regard to memory. This work presents a multi-cache system that allows for noticeably shrinking the required on-chip memory with a negligible variation of timing performance and power consumption. The presented methods have been applied to the CloudScout CNN, which was developed to perform cloud detection directly on board the satellite, thus representing a relevant case study for on the edge applications. The system was validated and characterized on a Xilinx ZCU106 Evaluation Board. The result is a 64.48% memory saving if compared to an alternative hardware accelerator developed for the same algorithm, with comparable performance in terms of inference time and power consumption. The paper also presents a detailed analysis of the hardware accelerator power consumption, focusing on the impact of data transfer between the accelerator and the external memory. Further investigation shows that the proposed strategies allow the implementation of the accelerator on FPGAs with a smaller size, guaranteeing benefits in terms of power consumption and hardware costs. A broader evaluation about the applicability of the presented methods to other models demonstrates valuable results in terms of memory saving with respect to other works reported in the literature.
22

Haziq Ishak, Mohammad, Mohd Syafiq Mispan, Wong Yan Chiew, Muhammad Raihaan Kamaruddin, and Mikhail Aleksandrovich Korobkov. "Secure lightweight obfuscated delay-based physical unclonable function design on FPGA." Bulletin of Electrical Engineering and Informatics 11, no. 2 (April 1, 2022): 1075–83. http://dx.doi.org/10.11591/eei.v11i2.3265.

Abstract:
The internet of things (IoT) describes the network of physical objects equipped with sensors and other technologies to exchange data with other devices over the Internet. Due to its inherent flexibility, field-programmable gate array (FPGA) has become a viable platform for IoT development. However, various security threats such as FPGA bitstream cloning and intellectual property (IP) piracy have become a major concern for this device. Physical unclonable function (PUF) is a promising hardware fingerprinting technology to solve the above problems. Several PUFs have been proposed, including the implementation of reconfigurable-XOR PUF (R-XOR PUF) and multi-PUF (MPUF) on the FPGA. However, these proposed PUFs have drawbacks, such as high delay imbalances caused by routing constraints. Therefore, in this study, we explore relative placement method to implement the symmetric routing in the obfuscated delay-based PUF on the FPGA board. The delay analysis result proves that our method to implement the symmetric routing was successful. Therefore, our work has achieved good PUF quality with uniqueness of 48.75%, reliability of 99.99%, and uniformity of 52.5%. Moreover, by using the obfuscation method, which is an Arbiter-PUF combined with a random challenge permutation technique, we reduced the vulnerability of Arbiter-PUF against machine learning attacks to 44.50%.
23

Elouaret, Tarek, Sylvain Colomer, Frédéric De Melo, Nicolas Cuperlier, Olivier Romain, Lounis Kessal, and Stéphane Zuckerman. "Implementation of a Bio-Inspired Neural Architecture for Autonomous Vehicles on a Multi-FPGA Platform." Sensors 23, no. 10 (May 10, 2023): 4631. http://dx.doi.org/10.3390/s23104631.

Abstract:
Autonomous vehicles require efficient self-localisation mechanisms and cameras are the most common sensors due to their low cost and rich input. However, the computational intensity of visual localisation varies depending on the environment and requires real-time processing and energy-efficient decision-making. FPGAs provide a solution for prototyping and estimating such energy savings. We propose a distributed solution for implementing a large bio-inspired visual localisation model. The workflow includes (1) an image processing IP that provides pixel information for each visual landmark detected in each captured image, (2) an implementation of N-LOC, a bio-inspired neural architecture, on an FPGA board and (3) a distributed version of N-LOC with evaluation on a single FPGA and a design for use on a multi-FPGA platform. Comparisons with a pure software solution demonstrate that our hardware-based IP implementation yields up to 9× lower latency and 7× higher throughput (frames/second) while maintaining energy efficiency. Our system has a power footprint as low as 2.741 W for the whole system, which is up to 5.5–6× less than what Nvidia Jetson TX2 consumes on average. Our proposed solution offers a promising approach for implementing energy-efficient visual localisation models on FPGA platforms.
24

Peng, Xian Min, Qiang Li, Gui Chuan Zhang, Qing Lin Liu, and Bing Xu. "Research on Dual Core Mechanism of Wireless Telemetry System." Advanced Materials Research 791-793 (September 2013): 2131–35. http://dx.doi.org/10.4028/www.scientific.net/amr.791-793.2131.

Abstract:
In high-precision multi-channel wireless telemetry systems, a dual-core framework is used to complete a variety of tasks such as system collection and control, parameter assignment, data storage, and wireless command parsing. In this framework, one FPGA controls the collection board and the other performs the core control. This paper explores the working mechanism of the dual core and the mutual communication mechanism, which has been successfully applied to an actual system.
25

Drożdż, Michał, and Tomasz Kryjak. "FPGA Implementation of Multi-scale Face Detection Using HOG Features and SVM Classifier." Image Processing & Communications 21, no. 3 (September 1, 2016): 27–44. http://dx.doi.org/10.1515/ipc-2016-0014.

Abstract:
In this paper an FPGA-based embedded vision system for face detection is presented. The sliding detection window, the HOG+SVM algorithm and multi-scale image processing were used and are described extensively. The applied computation parallelization made it possible to obtain real-time processing of a 1280 × 720 @ 50 Hz video stream. The presented module has been verified on the Zybo development board with a Zynq SoC device from Xilinx. It can be used in a vast number of vision systems, including driver fatigue monitoring.
26

Wu, Jin, Dan Yu Wu, Fan Jiang, Yang Yu, Lei Zhou, Xin Yu Liu, and De Xin Wu. "An 8-Bit 1.72-Gsample/s Two Channel TimeInterleaved Analog-to-Digital Converter Based on PCB Circuit Board." Applied Mechanics and Materials 336-338 (July 2013): 1525–31. http://dx.doi.org/10.4028/www.scientific.net/amm.336-338.1525.

Abstract:
This paper reports an 8-bit 1.72-Gsample/s time-interleaved analog-to-digital converter (TIADC) based on a PCB with Field Programmable Gate Array (FPGA) technology. The system integrates two independently designed 8-bit 0.9-Gsample/s ADC chips in parallel, a commercial FPGA and a multi-phase clock distributor circuit. To increase the system's performance, an online calibration method is proposed to calibrate the mismatch errors in the TIADC. The FPGA is shown to be effective in removing the offset and gain mismatch; the clock distributor circuit is used as the time delay for each sub-ADC chip to eliminate the sampling-time error. The hardware design features are also described in detail. The ADC chip is fabricated in 0.35 µm SiGe BiCMOS. Finally, experimental results reveal that the proposed system can operate at up to 1.72 GSps. At this sampling frequency (1.72 GHz), the system achieves a spurious-free dynamic range (SFDR) larger than 40 dBc. The effective number of bits (ENOB) is larger than 5.5 bits from DC to the Nyquist frequency, and the power dissipation is 2.28 W.
27

Prasad Acharya, G., and M. Asha Rani. "Online Self-testable Multi-core System using Dynamic Partial Reconfiguration of FPGA." International Journal of Reconfigurable and Embedded Systems (IJRES) 6, no. 3 (May 28, 2018): 160. http://dx.doi.org/10.11591/ijres.v6.i3.pp160-168.

Abstract:
This paper presents a novel and efficient method of designing an online self-testable multi-core system. Testing of a Core Under Test (CoUT) in a massively multi-core system can be carried out while the system is operational, by assigning the functionality of the CoUT to one of the non-functioning/idle and pre-tested cores. The methodology presented in this paper has been implemented on a test setup that demonstrates the Dynamic Partial Reconfiguration (DPR) feature of the latest FPGAs on a Zynq-7 XC702 evaluation board. The simulation results obtained from the experimental setup show that the utilization of a multi-core system can be significantly improved by effectively utilizing the idle core(s) to back up CoUT(s) for online test without significant hardware overhead or test latency.
28

Guo, Sen, San Feng Chen, and Yong Sheng Liang. "Global Shared Memory Design for Multi-GPU Graphics Cards on Personal Supercomputer." Applied Mechanics and Materials 263-266 (December 2012): 1236–41. http://dx.doi.org/10.4028/www.scientific.net/amm.263-266.1236.

Abstract:
When programming CUDA or OpenCL on multi-GPU systems, programmers usually expect the GPUs in the same system to communicate quickly with each other. For instance, they hope that a device memory copy from GPU1's memory to GPU2's memory can be done inside the graphics card, without using PCIe, which is relatively slow. In this paper, we propose adding a multi-channel memory to the multi-GPU board, used only for transferring data between different GPUs. This multi-channel memory should have multiple interfaces, including one common interface shared by different GPUs, which is connected to an FPGA arbitration circuit, and several other interfaces, each connected independently to a dedicated GPU's frame buffer. To distinguish it from the shared memory of a streaming multiprocessor, we call this memory Global Shared Memory. We analyze the expected performance improvement with this global shared memory for the case of accelerating computed tomography algebraic reconstruction on multiple GPUs.
29

Zhang, Duo Li, Xue Peng Yang, and Yu Kun Song. "Design and Implementation of a Large Points FFT Acceleration Unit in Multi-Processor System Based on FPGA." Applied Mechanics and Materials 347-350 (August 2013): 1793–98. http://dx.doi.org/10.4028/www.scientific.net/amm.347-350.1793.

Abstract:
This paper introduces the design and implementation of a large-point FFT acceleration unit for a multi-processor system based on FPGA. It introduces the radix-2 DIT-FFT algorithm and the features of the hardware platform. It analyzes the FFT acceleration unit as a whole and divides it into several modules. Then the hardware implementations of the modules are discussed. The FFT acceleration unit is part of a multi-processor image processing system, and the whole system is tested by software and hardware co-verification. Results are compared between the software simulation and a Virtex-6 evaluation board from Xilinx. The results agree within the error range. The FFT acceleration unit meets the needs of the whole system and the design is successful.
30

Mouri Zadeh Khaki, Ahmad, Ebrahim Farshidi, Sawal Hamid MD Ali, and Masuri Othman. "An FPGA-Based 16-Bit Continuous-Time 1-1 MASH ΔΣ TDC Employing Multirating Technique." Electronics 8, no. 11 (November 5, 2019): 1285. http://dx.doi.org/10.3390/electronics8111285.

Abstract:
An all-digital voltage-controlled oscillator (VCO)-based second-order multi-stage noise-shaping (MASH) ΔΣ time-to-digital converter (TDC) is presented in this paper. The prototype of the proposed TDC was implemented on an Altera Stratix IV FPGA board. In order to improve the performance over conventional TDCs, a multirating technique is employed in this work in which higher sampling rate is used for higher stages. Experimental results show that the multirating technique had a significant influence on improving signal-to-noise ratio (SNR), from 43.09 dB without multirating to 61.02 dB with multirating technique (a gain of 17.93 dB) by quadrupling the sampling rate of the second stage. As the proposed design works in the time-domain and does not consist of any loop and calibration block, no time-to-voltage conversion is needed which results in low complexity and power consumption. A built-in oscillator and phase-locked loops (PLLs) of the FPGA board are utilized to generate sampling clocks at different frequencies. Therefore, no external clock needs to be applied to the proposed TDC. Two cases with different sampling rates were examined by the proposed design to demonstrate the capability of the technique. It can be implied that, by employing multirating technique and increasing sampling frequency, higher SNR can be achieved.
31

Bae, Jina, Junhee Lee, and Hyoungsik Nam. "Variable Clock and EM Signal Generation Scheme for Foveation-Based Driving OLED Head-Mounted Displays." Electronics 10, no. 5 (February 25, 2021): 538. http://dx.doi.org/10.3390/electronics10050538.

Abstract:
An image processing pipeline and multi-output shift register of a foveation-based driving scheme are proposed for the realization of immersive head-mounted displays in 2019. In addition, this paper describes a variable clock generation circuit to manipulate output waveforms of shift registers in the foveated display. The EM circuit for OLED displays is also introduced to support the control signal to keep OLEDs of pixels from emitting light during the compensation. Especially, the EM circuit consists of only four TFTs and one capacitor and gives rise to pulses of variable widths corresponding to the resolution of a driven display area. A variable clock generation scheme is verified with 60 Hz 1440 × 2560 monitor, eye-tracker, PSoC board and FPGA board. An EM circuit is simulated by SPICE for 9600 lines and 120 Hz foveated displays.
32

Saber, Mohamed, and Esam Hagras. "Parallel multi-layer selector S-Box based on lorenz chaotic system with FPGA implementation." Indonesian Journal of Electrical Engineering and Computer Science 19, no. 2 (August 1, 2020): 784. http://dx.doi.org/10.11591/ijeecs.v19.i2.pp784-792.

Abstract:
The substitution box (S-Box) is the main block in the encryption system, which replaces the non-encrypted data by dynamic, secure and hidden data. An S-Box can be designed based on complex nonlinear chaotic systems, presented in recent papers as chaotic S-Boxes. The hardware implementation of these chaotic systems suffers from long processing time (low speed) and high power consumption, since it requires a large number of non-linear computational models. In this paper, we present a high-speed FPGA implementation of Parallel Multi-Layer Selector Substitution Boxes based on the Lorenz Chaotic System (PMLS S-Box). The proposed PMLS chaotic S-Box is modeled using Xilinx System Generator (XSG) in 32-bit fixed-point format, and the architecture is implemented on a Xilinx Spartan-6 X6SLX45 board. The maximum frequency of the proposed PMLS chaotic S-Box is 381.764 MHz, with a power dissipation of 77 mW. Compared to other chaotic S-Box systems, the proposed one achieves a higher frequency and lower power consumption. In addition, the proposed PMLS chaotic S-Box is analyzed with standard S-Box tests: bijectivity, nonlinearity, strict avalanche criterion, differential probability, and bit independence criterion. The results of the five standard tests indicate that the proposed S-Box can effectively resist cryptanalysis attacks and is suitable for secure communications.
33

De Lucia, Erika. "Status of the KLOE-2 Inner Tracker." EPJ Web of Conferences 166 (2018): 00003. http://dx.doi.org/10.1051/epjconf/201816600003.

Abstract:
KLOE-2 at the DAΦNE Φ-factory is the main experiment of the INFN Laboratori Nazionali di Frascati (LNF) and is the first high-energy experiment using the GEM technology with a cylindrical geometry, a novel idea developed at LNF. Four concentric cylindrical triple-GEM detectors compose the Inner Tracker, inserted around the interaction region and before the inner wall of the pre-existing KLOE Drift Chamber to improve the resolution on decay vertices close to the interaction point. State-of-the-art solutions have been expressly developed or tuned for this project: single-mask GEM etching, multi-layer XV patterned readout, PEEK spacer grid, GASTONE front-end board, a custom 64-channel ASIC with digital output, and the Global Interface Board for data collection, with a configurable FPGA architecture and Gigabit Ethernet. Alignment and calibration of a cylindrical GEM detector was never done before and represents one of the challenging activities of the experiment. The Inner Tracker detector construction, operation, calibration and performance obtained with cosmic-ray muons and Bhabha scattering events will be reported.
34

Dasari, Manikanta Swamy, Venkatesan Mani, and Subbarao Mopidevi. "Fuel Cell-Based High-Gain Boost Converter Fed Single-Phase Multi-level Inverter Controlled by FPGA controller." Journal of New Materials for Electrochemical Systems 24, no. 3 (September 30, 2021): 208–17. http://dx.doi.org/10.14447/jnmes.v24i3.a09.

Abstract:
Multi-level inverters (MLIs) are widely used in grid-connected applications, and they are likewise preferred for connecting various RES to the grid and for meeting load demand. This article presents a fuel cell-based high-gain boost converter with a single-phase five-level inverter controlled by CB-PWM techniques. A fuel cell produces very little power, so a high-gain boost converter is used to raise the fuel cell output to the required power level. The boost converter produces the required DC output power, which is converted to the required AC power by a single-phase five-level inverter. The MLI switches are controlled using MC-PWM techniques, and the inverter behavior is observed in terms of THD, output ripple, and settling time. The entire work is carried out in the MATLAB/Simulink tool, and a small prototype is built using an FPGA board.
35

Li, Chao, Rui Xu, Yong Lv, Yonghui Zhao, and Weipeng Jing. "Edge Real-Time Object Detection and DPU-Based Hardware Implementation for Optical Remote Sensing Images." Remote Sensing 15, no. 16 (August 10, 2023): 3975. http://dx.doi.org/10.3390/rs15163975.

Abstract:
The accuracy of current deep learning algorithms has certainly increased. However, deploying deep learning networks on edge devices with limited resources is challenging due to their inherent depth and high parameter count. Here, we propose an improved YOLO model based on an attention mechanism and receptive field (RFA-YOLO), applying the MobileNeXt network as the backbone to reduce parameters and complexity, and adopting the Receptive Field Block (RFB) and Efficient Channel Attention (ECA) modules to improve the detection accuracy of multi-scale and small objects. Meanwhile, an FPGA-based model deployment solution is proposed to implement parallel acceleration and low-power deployment of the detection algorithm model, which achieves real-time object detection for optical remote sensing images. We implement the proposed DPU and Vitis AI-based object detection algorithms with FPGA deployment to meet low power consumption and real-time performance requirements. Experimental results on the DIOR dataset demonstrate the effectiveness and superiority of our RFA-YOLO model for object detection. Moreover, to evaluate the performance of the proposed hardware implementation, it was implemented on a Xilinx ZCU104 board. Results of the hardware and software simulation experiments show that our DPU-based hardware implementation is more power efficient than central processing units (CPUs) and graphics processing units (GPUs), and has the potential to be applied to onboard processing systems with limited resources and power consumption.
36

Bertolucci, Matteo, Riccardo Cassettari, and Luca Fanucci. "On the Frequency Carrier Offset and Symbol Timing Estimation for CCSDS 131.2-B-1 High Data-Rate Telemetry Receivers." Sensors 21, no. 9 (April 21, 2021): 2915. http://dx.doi.org/10.3390/s21092915.

Abstract:
In recent years there have been significant developments in satellite transmitter technology to follow the rapid innovation of sensors on-board new satellites. The CCSDS 131.2-B-1 standard for telemetry downlink, released in 2012, is part of the next generation of standards that aims to support the increased data-rate caused by these improvements in resolution. As a result of its relative novelty, this standard currently lacks in-depth analysis by researchers, but it is also strongly supported by the European Space Agency (ESA) for future missions. For these reasons, it seems important to evaluate how major receiver sub-components, such as timing recovery and carrier frequency correction, can be designed and implemented in new receivers that support this standard. The timing error detectors (TED) and frequency error detectors (FED) were therefore studied on the specific peculiarities of CCSDS 131.2-B-1 in its usual environment of Low Earth Orbit (LEO). Estimators have been evaluated highlighting performances, trade-offs and peculiarities of each one with respect to corresponding architectural choices. Finally, a receiver architecture derived from the paper considerations is proposed in the aim of supporting very different mission scenarios. Specifically, the realized architecture employs a parallel feedforward estimator for the timing recovery section and a novel multi-algorithm feedback frequency correction loop to efficiently cover both low symbol rates (5 Mbaud) and high data-rates (up to 500 Mbaud). This solution represents a good trade-off to support these scenarios in a very compact footprint by pushing the clock frequency to the FPGA limit. The FPGA resources occupation on a Zynq Ultrascale+ RFSoC XCZU28DR FPGA is 5202 LUT, 4851 FF, 5 BRAM, and 21 DSP for the timing recovery part, while the frequency recovery section occupies 1723 LUT, 1511 FF, 2.5 BRAM and 32 DSP.
37

Al-Shueli, Assad I. "Artificial TMAP Signal Generator Based on One-Bit Sigma Delta Modulator for MEC Test." Mathematical Modelling of Engineering Problems 10, no. 1 (February 28, 2023): 282–88. http://dx.doi.org/10.18280/mmep.100133.

Abstract:
The current study proposes an artificial nerve signal generator with three electrode channels and time delay, built with only an active low-pass filter and a Field Programmable Gate Array (FPGA). An XC3S100E FPGA board was adopted to design a novel, real-time artificial transmembrane action potential (TMAP) generator. Implementing the one-bit Sigma Delta Modulation (SDM) method with a linearization technique could be very expensive and complicated with other currently available technology options, such as a microcontroller, due to the fabrication requirements. This technique could be employed efficiently in many biomedical applications and tests, especially in evaluating a multi-electrode cuff system without requiring real nerve spikes from animal models, saving time, effort, and cost. Moreover, this approach further improves signal performance, such as resolution, accuracy, differential linearity and monotonicity of the DAC, and settling time of the signal waveform, without extra hardware or cost. In addition, the design includes three channels of uncorrelated white Gaussian noise, one for each TMAP signal, to simulate the real nerve noise environment that surrounds the electrode cuff in experimental cases. Noise channels with adjustable level are added to the nerve spikes; three sets of taps are employed to provide the three noise sources. The Xilinx ISE platform and the Mentor Graphics ModelSim tool are used to simulate and implement the design, and the results produced are compared. The power consumption of the conventional technology is 31 mW, which is significantly higher than the 21 mW consumed by the modern technology. Furthermore, compared to the usual approach, the proposed solution saves around 10% of the FPGA resources. This technique can be reused in other applications where delay time, low-amplitude voltage signals, and a multichannel generator are needed.
Moreover, the system offers easily adjustable waveform, noise level, conduction velocity and time delay to suit different applications and tests.
38

Komeylian, Somayeh, and Christopher Paolini. "Implementation of the Digital QS-SVM-Based Beamformer on an FPGA Platform." Sensors 23, no. 3 (February 3, 2023): 1742. http://dx.doi.org/10.3390/s23031742.

Abstract:
To address practical challenges in establishing and maintaining robust wireless connectivity such as multi-path effects, low latency, size reduction, and high data rate, we have deployed the digital beamformer, as a spatial filter, by using the hybrid antenna array at an operating frequency of 10 GHz. The proposed digital beamformer utilizes a combination of the two well-established beamforming techniques of minimum variance distortionless response (MVDR) and linearly constrained minimum variance (LCMV). In this case, the MVDR beamforming method updates weight vectors on the FPGA board, while the LCMV beamforming technique performs nullsteering in directions of interference signals in the real environment. The most well-established machine learning technique of support vector machine (SVM) for the Direction of Arrival (DoA) estimation is limited to problems with linearly-separable datasets. To overcome the aforementioned constraint, the quadratic surface support vector machine (QS-SVM) classifier with a small regularizer has been used in the proposed beamformer for the DoA estimation in addition to the two beamforming techniques of LCMV and MVDR. In this work, we have assumed that five hybrid array antennas and three sources are available, at which one of the sources transmits the signal of interest. The QS-SVM-based beamformer has been deployed on the FPGA board for spatially filtering two signals from undesired directions and passing only one of the signals from the desired direction. The simulation results have verified the strong performance of the QS-SVM-based beamformer in suppressing interference signals, which are accompanied by placing deep nulls with powers less than −10 dB in directions of interference signals, and transferring the desired signal. 
Furthermore, we have verified that the performance of the QS-SVM-based beamformer yields other advantages including average latency time in the order of milliseconds, performance efficiency of more than 90%, and throughput of nearly 100%.
39

Li, Wenhao, Qisheng Zhang, Qimao Zhang, Feng Guo, Shuaiqing Qiao, Shiyang Liu, Yueyun Luo, Yuefeng Niu, and Xing Heng. "Development of a distributed hybrid seismic–electrical data acquisition system based on the Narrowband Internet of Things (NB-IoT) technology." Geoscientific Instrumentation, Methods and Data Systems 8, no. 2 (August 12, 2019): 177–86. http://dx.doi.org/10.5194/gi-8-177-2019.

Abstract:
The ambiguity of geophysical inversions that are based on a single geophysical method is a long-standing problem in geophysical exploration. Therefore, multi-method geophysical prospecting has become a popular topic. In multi-method geophysical prospecting, the joint inversion of seismic and electric data has been extensively researched for decades. However, the methods used for hybrid seismic–electric data acquisition that form the base for multi-method geophysical prospecting techniques have not yet been explored in detail. In this work, we developed a distributed, high-precision, hybrid seismic–electrical data acquisition system using advanced Narrowband Internet of Things (NB-IoT) technology. The system was equipped with a hybrid data acquisition board, a high-performance embedded motherboard based on field-programmable gate array, an advanced RISC machine, and host software. The data acquisition board used an ADS1278 24-bit analog-to-digital converter and FPGA-based digital filtering techniques to perform high-precision data acquisition. The equivalent input noise of the data acquisition board was only 0.5 µV with a sampling rate of 1000 samples per second and front-end gain of 40 dB. The multiple data acquisition stations of our system were synchronized using oven-controlled crystal oscillators and global positioning system technologies. Consequently, the clock frequency error of the system was less than 10⁻⁹ Hz at 1 Hz after calibration, and the synchronization accuracy of the data acquisition stations was ±200 ns. The use of sophisticated NB-IoT technologies allowed long-distance wireless communication between the control center and the data acquisition stations. In validation experiments, it was found that our system was operationally stable and reliable, produced highly accurate data, and was functionally flexible and convenient.
Furthermore, using this system, it is also possible to monitor the real-time quality of data acquisition processes. We believe that the results obtained in this study will drive the advancement of prospective integrated seismic–electrical technologies and promote the use of IoT technologies in geophysical instrumentation.
40

Lee, Jaeheum, Jason K. Eshraghian, Sungjin Kim, Kamran Eshraghian, and Kyoungrok Cho. "Quantized Convolutional Neural Network Implementation on a Parallel-Connected Memristor Crossbar Array for Edge AI Platforms." Journal of Nanoscience and Nanotechnology 21, no. 3 (March 1, 2021): 1854–61. http://dx.doi.org/10.1166/jnn.2021.18925.

Abstract:
There are many challenges in the hardware implementation of a neural network using nanoscale memristor crossbar arrays where the use of analog cells is concerned. Multi-state or analog cells introduce more stringent noise margins, which are difficult to adhere to in light of variability. We propose a potential solution using a 1-bit memristor that stores binary values “0” or “1” with their memristive states, denoted as a high-resistance state (HRS) and a low-resistance state (LRS). In addition, we propose a new architecture consisting of four parallel 1-bit memristors at each crosspoint on the array. The four 1-bit memristors connected in parallel represent 5 decimal values according to the number of activated memristors. This is then mapped to a synaptic weight, which corresponds to the state of an artificial neuron in a neural network. We implement a convolutional neural network (CNN) model on a framework (TensorFlow) using an equivalent quantized weight mapping model that demonstrates learning results almost identical to a high-precision CNN model. This radix-5 CNN is mapped to hardware on the proposed parallel-connected memristor crossbar array. Also, we propose a method for negative weight representation on a memristor crossbar array. Then, we verify the CNN hardware on an edge-AI (e-AI) platform, developed on a field-programmable gate array (FPGA). In this e-AI platform, we represent five weights per crosspoint using CLB logic. We test the learning results of the CNN hardware using an e-AI platform with a dataset consisting of 4×4 images in three classes. We verify the functionality of our radix-5 CNN implementation, showing comparable classification accuracy to high-precision use cases, with reduction of the area of the memristor crossbar array by half, all verified on an FPGA. Implementing the CNN model on the FPGA board can contribute to the practical use of edge-AI.
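The five-level mapping described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the normalisation of weights to the range [0, 1], and the choice to activate the first devices in the list are all assumptions made for illustration, and the paper's negative-weight scheme is not modelled.

```python
from typing import List

N_PARALLEL = 4  # 1-bit memristors per crosspoint, as stated in the abstract


def weight_to_states(w: float) -> List[int]:
    """Quantize a weight in [0, 1] (assumed range) to one of 5 levels.

    Returns four binary states: 1 = activated (LRS), 0 = not activated (HRS).
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    level = int(w * N_PARALLEL + 0.5)  # nearest level, ties rounded up: 0..4
    return [1] * level + [0] * (N_PARALLEL - level)


def states_to_weight(states: List[int]) -> float:
    """Recover the quantized weight from the number of activated devices."""
    return sum(states) / N_PARALLEL


if __name__ == "__main__":
    for w in (0.0, 0.3, 0.5, 0.8, 1.0):
        s = weight_to_states(w)
        print(w, s, states_to_weight(s))
```

Only the count of activated devices matters in this sketch, which is why four binary cells give exactly five distinguishable values (0 to 4 devices activated).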
41

Barrios, Yubal, Antonio Sánchez, Raúl Guerra, and Roberto Sarmiento. "Hardware Implementation of the CCSDS 123.0-B-2 Near-Lossless Compression Standard Following an HLS Design Methodology." Remote Sensing 13, no. 21 (October 31, 2021): 4388. http://dx.doi.org/10.3390/rs13214388.

Abstract:
The increase in the use of high-resolution imaging sensors on-board satellites motivates the use of on-board image compression, mainly due to restrictions in terms of both hardware (computational and storage resources) and downlink bandwidth with the ground. This work presents a compression solution based on the CCSDS 123.0-B-2 near-lossless compression standard for multi- and hyperspectral images, which deals with the high amount of data acquired by these next-generation sensors. The proposed approach has been developed following an HLS design methodology, accelerating design time and obtaining good system performance. The compressor comprises two main stages, a predictor and a hybrid encoder, designed in Band-Interleaved by Line (BIL) order and optimized to achieve a trade-off between throughput and logic resource utilization. This solution has been mapped on a Xilinx Kintex UltraScale XCKU040 FPGA, targeting AVIRIS images, reaching a throughput of 12.5 MSamples/s and consuming only 7% of the LUTs and around 14% of the dedicated memory blocks available in the device. To the best of our knowledge, this is the first fully-compliant hardware implementation of the CCSDS 123.0-B-2 near-lossless compression standard available in the state of the art.
APA, Harvard, Vancouver, ISO, and other styles
42

Yu, Zheqi, Pedro Machado, Adnan Zahid, Amir M. Abdulghani, Kia Dashtipour, Hadi Heidari, Muhammad A. Imran, and Qammer H. Abbasi. "Energy and Performance Trade-Off Optimization in Heterogeneous Computing via Reinforcement Learning." Electronics 9, no. 11 (November 2, 2020): 1812. http://dx.doi.org/10.3390/electronics9111812.

Abstract:
This paper suggests an optimisation approach in heterogeneous computing systems to balance energy power consumption and efficiency. The work proposes a power measurement utility for a reinforcement learning (PMU-RL) algorithm to dynamically adjust the resource utilisation of heterogeneous platforms in order to minimise power consumption. A reinforcement learning (RL) technique is applied to analyse and optimise the resource utilisation of field programmable gate array (FPGA) control state capabilities, which is built for a simulation environment with a Xilinx ZYNQ multi-processor systems-on-chip (MPSoC) board. In this study, the balance operation mode for improving power consumption and performance is established to dynamically change the programmable logic (PL) end work state. It is based on an RL algorithm that can quickly discover the optimization effect of PL on different workloads to improve energy efficiency. The results demonstrate a substantial reduction of 18% in energy consumption without affecting the application’s performance. Thus, the proposed PMU-RL technique has the potential to be considered for other heterogeneous computing platforms.
43

Yang, Dan, Xuhan Xu, Tianyang Chen, Yanhao Chen, and Junjie Zhang. "Low Latency TOE with Double-Queue Structure for 10Gbps Ethernet on FPGA." Sensors 23, no. 10 (May 12, 2023): 4690. http://dx.doi.org/10.3390/s23104690.

Abstract:
The TCP protocol is a connection-oriented and reliable transport layer communication protocol which is widely used in network communication. With the rapid development and popular application of data center networks, high-throughput, low-latency, and multi-session network data processing has become an immediate need for network devices. If only a traditional software protocol stack is used for processing, it will occupy a large amount of CPU resources and affect network performance. To address the above issues, this paper proposes a double-queue storage structure for a 10G TCP/IP hardware offload engine based on FPGA. Furthermore, a TOE reception transmission delay theoretical analysis model for interaction with the application layer is proposed, so that the TOE can dynamically select the transmission channel based on the interaction results. After board-level verification, the TOE supports 1024 TCP sessions with a reception rate of 9.5 Gbps and a minimum transmission latency of 600 ns. When the TCP packet payload length is 1024 bytes, the latency performance of TOE’s double-queue storage structure improves by at least 55.3% compared to other hardware implementation approaches. When compared with software implementation approaches, the latency performance of TOE is only 3.2% of the software approaches.
44

Mizukami, Atsushi. "ATLAS Level-1 Endcap Muon Trigger for Run 3." EPJ Web of Conferences 245 (2020): 01002. http://dx.doi.org/10.1051/epjconf/202024501002.

Abstract:
The Large Hadron Collider is expected to operate with a centre-of-mass energy of 14 TeV and an instantaneous luminosity of 2.0 × 10^34 cm^-2 s^-1 for Run 3 scheduled from 2021 to 2024. In order to cope with the high event rate, an upgrade of the ATLAS trigger system is required. The level-1 endcap muon trigger system identifies muons with high transverse momentum by combining data from fast muon trigger detectors, called Thin Gap Chambers on the Big Wheel. Inner muon detectors (the Small Wheel and the Tile Calorimeter) coincidence was introduced to reduce fake muon contamination. In the ongoing Phase-1 upgrade the present Small Wheel is replaced with the New Small Wheel and additional Resistive Plate Chambers are installed in the inner region of the ATLAS muon spectrometer for the endcap muon trigger. Precision track information from the new detectors can be used as part of the muon trigger logic to enhance the performance significantly. The trigger processor board, Sector Logic, has been upgraded to handle the additional data from the new detectors. The new Sector Logic board has a modern FPGA to make use of Multi-Gigabit transceiver technology, which is used to receive data from the new detectors. The readout system for trigger data has also been re-designed to minimize the use of custom electronics and instead use commercial computers and network switches, by using TCP/IP for the data transfer. The new readout system uses a software-based data-handling. This paper describes the development of the level-1 endcap muon trigger and its readout system for Run 3.
45

Khamlich, Salaheddine, Fathallah Khamlich, Issam Atouf, and Mohamed Benrabh. "Performance evaluation and implementations of MFCC, SVM and MLP algorithms in the FPGA board." International journal of electrical and computer engineering systems 12, no. 3 (August 27, 2021): 139–53. http://dx.doi.org/10.32985/ijeces.12.3.3.

Abstract:
One of the most difficult speech recognition tasks is accurate recognition of human-to-human communication. Advances in deep learning over the last few years have produced major improvements in speech recognition on the representative Switchboard conversational corpus. Word error rates that just a few years ago were 14% have dropped to 8.0%, then 6.6% and most recently 5.8%, and are now believed to be within striking range of human performance. This raises two issues: what is human performance, and how far down can we still drive speech recognition error rates? The main objective of this article is the development of a comparative study of the performance of Automatic Speech Recognition (ASR) algorithms using a database made up of a set of signals created by female and male speakers of different ages. We will also develop techniques for the Software and Hardware implementation of these algorithms and test them in an embedded electronic card based on a reconfigurable circuit (Field Programmable Gate Array, FPGA). We will present an analysis of the results of classifications for the best Support Vector Machine (SVM) architectures and Artificial Neural Networks of Multi-Layer Perceptron (MLP). Following our analysis, we created Nios II processors and we tested their operations as well as their characteristics. The characteristics of each processor are specified in this article (cost, size, speed, power consumption and complexity). At the end of this work, we physically implemented the architecture of the Mel Frequency Cepstral Coefficients (MFCC) extraction algorithm as well as the classification algorithm that provided the best results.
46

Barrios, Yubal, Alfonso Rodríguez, Antonio Sánchez, Arturo Pérez, Sebastián López, Andrés Otero, Eduardo de la Torre, and Roberto Sarmiento. "Lossy Hyperspectral Image Compression on a Reconfigurable and Fault-Tolerant FPGA-Based Adaptive Computing Platform." Electronics 9, no. 10 (September 26, 2020): 1576. http://dx.doi.org/10.3390/electronics9101576.

Abstract:
This paper describes a novel hardware implementation of a lossy multispectral and hyperspectral image compressor for on-board operation in space missions. The compression algorithm is a lossy extension of the Consultative Committee for Space Data Systems (CCSDS) 123.0-B-1 lossless standard that includes a bit-rate control stage, which in turn manages the losses the compressor may introduce to achieve higher compression ratios without compromising the recovered image quality. The algorithm has been implemented using High-Level Synthesis (HLS) techniques to increase design productivity by raising the abstraction level. The proposed lossy compression solution is deployed onto ARTICo3, a dynamically reconfigurable multi-accelerator architecture, obtaining a run-time adaptive solution that enables user-selectable performance (i.e., load more hardware accelerators to transparently increase throughput), power consumption, and fault tolerance (i.e., group hardware accelerators to transparently enable hardware redundancy). The whole compression solution is tested on a Xilinx Zynq UltraScale+ Field-Programmable Gate Array (FPGA)-based MPSoC using different input images, from multispectral to ultraspectral. For images acquired by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS), the proposed implementation renders an execution time of approximately 36 s when 8 accelerators are compressing concurrently at 100 MHz, which in turn uses around 20% of the LUTs and 17% of the dedicated memory blocks available in the target device. In this scenario, a speedup of 15.6× is obtained in comparison with a pure software version of the algorithm running in an ARM Cortex-A53 processor.
47

Gu, Haoyu, Wei Su, Baolin Zhao, Hao Zhou, and Xianxue Liu. "A Design Methodology of Digital Control System for MEMS Gyroscope Based on Multi-Objective Parameter Optimization." Micromachines 11, no. 1 (January 9, 2020): 75. http://dx.doi.org/10.3390/mi11010075.

Abstract:
This paper presents a novel multi-objective parameter optimization method based on the genetic algorithm (GA) and adaptive moment estimation (Adam) algorithm for the design of a closed-loop control system for the sense mode of a Microelectromechanical systems (MEMS) gyroscope. The proposed method can improve the immunity of the control system to fabrication tolerances and external noise. The design procedure starts by deriving a parameterized model of the closed-loop of the sense mode. The loop parameters are then optimized by the GA. Finally, the ensemble of optimized loop parameters is tested by Monte Carlo analysis to obtain a robust optimal solution. Simultaneously, the Adam-least mean square (LMS) demodulator, which is appropriate for the demodulation of very noisy signals, is also presented. Compared with the traditional method, the time consumption of the design process is reduced significantly. The digital control system is implemented on a printed circuit board based on an embedded Field Programmable Gate Array (FPGA). The experimental results show that the optimized control loop has achieved a better performance: the system bandwidth in the open-loop and optimal closed-loop control systems is about 23 Hz and 101 Hz, respectively. Compared to a non-optimized closed-loop system, the bias instability reduced from 0.0015°/s to 7.52 × 10^-4 °/s, the scale factor increased from 17.7 mV/(°/s) to 23 mV/(°/s) and the non-linearity of the scale factor reduced from 0.008452% to 0.006156%.
48

Zhang, Ling, Xuefei Yang, Zhenlong Wan, Dingxin Cao, and Yingcheng Lin. "A Real-Time FPGA Implementation of Infrared and Visible Image Fusion Using Guided Filter and Saliency Detection." Sensors 22, no. 21 (November 4, 2022): 8487. http://dx.doi.org/10.3390/s22218487.

Abstract:
Taking advantage of the functional complementarity between infrared and visible light sensor imaging, pixel-level real-time image fusion based on infrared and visible light images of different resolutions is a promising strategy for visual enhancement, which has demonstrated tremendous potential for autonomous driving, military reconnaissance, video surveillance, etc. Great progress has been made in this field in recent years, but the fusion speed and quality of visual enhancement are still not satisfactory. Herein, we propose a multi-scale FPGA-based image fusion technology with substantially improved visual enhancement capability and fusion speed. Specifically, the source images are first decomposed into three distinct layers using a guided filter and saliency detection, which are the detail layer, saliency layer and background layer. The fusion weight map of the saliency layer is subsequently constructed using an attention mechanism. Afterwards, a weighted fusion strategy is used for saliency layer fusion and detail layer fusion, while a weighted average fusion strategy is used for the background layer fusion, followed by the incorporation of image enhancement technology to improve the fused image contrast. Finally, a high-level synthesis tool is used to design the hardware circuit. The method in the present study is thoroughly tested on an XCZU15EG board, which could not only effectively improve the image enhancement capability in glare and smoke environments, but also achieve fast real-time image fusion at 55 FPS for infrared and visible images with a resolution of 640 × 470.
49

Nikolaidis, Dimitris, Panos Groumas, Christos Kouloumentas, and Hercules Avramopoulos. "Novel Benes Network Routing Algorithm and Hardware Implementation." Technologies 10, no. 1 (January 25, 2022): 16. http://dx.doi.org/10.3390/technologies10010016.

Abstract:
Benes/Clos networks constitute a particularly important part of interconnection networks and have been used in numerous areas, such as multi-processor systems, data centers and on-chip networks. They have also attracted great interest in the field of optical communications due to the increasing popularity of optical switches based on these architectures. There are numerous algorithms aimed at routing these types of networks, with varying degrees of utility. Linear algorithms, such as Sun Tsu and Opferman, were historically the first attempt to standardize the routing procedure of these types of networks. They require matrix-based calculations, which are very demanding in terms of resources and in some cases involve backtracking, which impairs their efficiency. Parallel solutions, such as Lee’s algorithm, were introduced later and provide a different answer that satisfies the requirements of high-performance networks. They are, however, extremely complex and demand even more resources. In both cases, hardware implementations reflect their algorithmic characteristics. In this paper, we attempt to design an algorithm that is simple enough to be implemented on a small field programmable gate array board while simultaneously efficient enough to be used in practical scenarios. The design itself is of a generic nature; therefore, its behavior across different sizes (8 × 8, 16 × 16, 32 × 32, 64 × 64) is examined. The platform of implementation is a medium range FPGA specifically selected to represent the average hardware prototyping device. In the end, an overview of the algorithm’s imprint on the device is presented alongside other approaches, which include both hard and soft computing techniques.
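To give a sense of scale for the network sizes examined in this abstract, the sketch below computes the dimensions of a standard N × N Benes network built from 2 × 2 switches: 2·log2(N) − 1 stages with N/2 switches per stage. This is the textbook construction and is assumed here; the abstract does not state how the authors' implementation is organised, and the function name is hypothetical.

```python
def benes_dimensions(n):
    """Return (stages, switches_per_stage, total_switches) for an n x n Benes network.

    Assumes the textbook construction from 2x2 switches; n must be a power of two.
    """
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 2")
    log2_n = n.bit_length() - 1  # exact log2 for powers of two
    stages = 2 * log2_n - 1
    switches_per_stage = n // 2
    return stages, switches_per_stage, stages * switches_per_stage


if __name__ == "__main__":
    # The four sizes named in the abstract.
    for size in (8, 16, 32, 64):
        print(size, benes_dimensions(size))
```

Each 2 × 2 switch needs one control bit (straight or crossed), so under this assumption the total switch count is also the number of bits a routing algorithm must produce per permutation.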
50

Meyer, Marius, Tobias Kenter, and Christian Plessl. "Multi-FPGA Designs and Scaling of HPC Challenge Benchmarks via MPI and Circuit-Switched Inter-FPGA Networks." ACM Transactions on Reconfigurable Technology and Systems, January 9, 2023. http://dx.doi.org/10.1145/3576200.

Abstract:
While FPGA accelerator boards and their respective high-level design tools are maturing, there is still a lack of multi-FPGA applications, libraries, and not least, benchmarks and reference implementations towards sustained HPC usage of these devices. As in the early days of GPUs in HPC, for workloads that can reasonably be decoupled into loosely coupled working sets, multi-accelerator support can be achieved by using standard communication interfaces like MPI on the host side. However, for performance and productivity, some applications can profit from a tighter coupling of the accelerators. FPGAs offer unique opportunities here when extending the dataflow characteristics to their communication interfaces. In this work, we extend the HPCC FPGA benchmark suite by multi-FPGA support and three missing benchmarks that particularly characterize or stress inter-device communication: b_eff, PTRANS, and LINPACK. With all benchmarks implemented for current boards with Intel and Xilinx FPGAs, we established a baseline for multi-FPGA performance. Additionally, for the communication-centric benchmarks, we explored the potential of direct FPGA-to-FPGA communication with a circuit-switched inter-FPGA network that is currently only available for one of the boards. The evaluation with parallel execution on up to 26 FPGA boards makes use of one of the largest academic FPGA installations.