Bibliography on the topic "Hardware-Software Codesign Accelerators"

Lists of current articles, books, dissertations, conference papers, and other scholarly sources on the topic "Hardware-Software Codesign Accelerators".

Journal articles on the topic "Hardware-Software Codesign Accelerators"

1

Xiao, Chunhua, Lei Zhang, Yuhua Xie, Weichen Liu, and Duo Liu. "Hardware/Software Adaptive Cryptographic Acceleration for Big Data Processing". Security and Communication Networks 2018 (August 27, 2018): 1–24. http://dx.doi.org/10.1155/2018/7631342.

Abstract:
Along with the explosive growth of network data, security is becoming increasingly important for web transactions. The SSL/TLS protocol has been widely adopted as one of the effective solutions for sensitive access. Although OpenSSL provides a freely available implementation of the SSL/TLS protocol, the crypto functions, such as symmetric key ciphers, are extremely compute-intensive operations. These expensive computations, when implemented in software, may not be able to keep up with the increasing need for speed and secure connections. Although there are many excellent works with the objective of SSL/TLS hardware acceleration, they focus on the dedicated hardware design of accelerators. Hardly any of them presented how to utilize them efficiently. In fact, for some application scenarios, the performance improvement may not be comparable with AES-NI, due to the invocation cost induced by hardware engines. Therefore, we propose to take full advantage of both accelerators and CPUs for secure HTTP access in big data. We not only propose optimal strategies such as data aggregation to advance the contribution of hardware crypto engines, but also present an Adaptive Crypto System based on Accelerators (ACSA) with software and hardware codesign. ACSA is able to adopt a crypto mode adaptively and dynamically according to the request character and system load. Through the establishment of 40 Gbps networking on a TAISHAN Web Server, we evaluated the system performance in real applications with a high workload. For the encryption algorithm 3DES, which is not supported in AES-NI, we could get about 12 times acceleration with accelerators. For typical AES encryption supported by instruction acceleration, we could get a 52.39% bandwidth improvement compared with hardware-only encryption and a 20.07% improvement compared with AES-NI. Furthermore, the user can adjust the trade-off between CPU occupation and encryption performance through the MM strategy, to free CPUs according to the working requirements.
2

Kritikakou, Angeliki, Francky Catthoor, George S. Athanasiou, Vasilios Kelefouras, and Costas Goutis. "Near-Optimal Microprocessor and Accelerators Codesign with Latency and Throughput Constraints". ACM Transactions on Architecture and Code Optimization 10, no. 2 (May 2013): 1–25. http://dx.doi.org/10.1145/2459316.2459317.

3

Hernández, Mario, Juan M. Cebrián, José M. Cecilia, and José M. García. "Offloading strategies for Stencil kernels on the KNC Xeon Phi architecture: Accuracy versus performance". International Journal of High Performance Computing Applications 34, no. 2 (November 7, 2017): 199–207. http://dx.doi.org/10.1177/1094342017738352.

Abstract:
The ever-increasing computational requirements of HPC and service provider applications are becoming a great challenge for hardware and software designers. These requirements are reaching levels where isolated development in either field is not enough to deal with such a challenge. A holistic view of computational thinking is therefore the only way to succeed in real scenarios. However, this is not a trivial task, as it requires, among other things, hardware–software codesign. On the hardware side, most high-throughput computers are designed aiming for heterogeneity, where accelerators (e.g. Graphics Processing Units (GPUs), Field-Programmable Gate Arrays (FPGAs), etc.) are connected through a high-bandwidth bus, such as PCI-Express, to the host CPUs. Applications, either via programmers, compilers, or runtime, should orchestrate data movement, synchronization, and so on among devices with different compute and memory capabilities. This increases the programming complexity and may reduce the overall application performance. This article evaluates different offloading strategies to leverage heterogeneous systems, based on several cards with the first-generation Xeon Phi coprocessors (Knights Corner). We use an 11-point 3-D Stencil kernel that models heat dissipation as a case study. Our results reveal substantial performance improvements when using several accelerator cards. Additionally, we show that computing an approximate result by reducing the communication overhead can yield 23% performance gains for double-precision data sets.
4

Morales-Sandoval, Miguel, Luis Armando Rodriguez Flores, Rene Cumplido, Jose Juan Garcia-Hernandez, Claudia Feregrino, and Ignacio Algredo. "A Compact FPGA-Based Accelerator for Curve-Based Cryptography in Wireless Sensor Networks". Journal of Sensors 2021 (January 6, 2021): 1–13. http://dx.doi.org/10.1155/2021/8860413.

Abstract:
The main topic of this paper is low-cost public key cryptography in wireless sensor nodes. Security in embedded systems, for example, in sensor nodes based on field programmable gate array (FPGA), demands low cost but still efficient solutions. Sensor nodes are key elements in the Internet of Things paradigm, and their security is a crucial requirement for critical applications in sectors such as military, health, and industry. To address these security requirements under the restrictions imposed by the available computing resources of sensor nodes, this paper presents a low-area FPGA-prototyped hardware accelerator for scalar multiplication, the most costly operation in elliptic curve cryptography (ECC). This cryptoengine is provided as an enabler of robust cryptography for security services in the IoT, such as confidentiality and authentication. The compact property in the proposed hardware design is achieved by implementing a novel digit-by-digit computing approach applied at the finite field and curve level algorithms, in addition to hardware reusing, the use of embedded memory blocks in modern FPGAs, and a simpler control logic. Our hardware design targets elliptic curves defined over binary fields generated by trinomials, uses fewer area resources than other FPGA approaches, and is faster than software counterparts. Our ECC hardware accelerator was validated under a hardware/software codesign of the Diffie-Hellman key exchange protocol (ECDH) deployed in the IoT MicroZed FPGA board. For a scalar multiplication in the sect233 curve, our design requires 1170 FPGA slices and completes the computation in 128820 clock cycles (at 135.31 MHz), with an efficiency of 0.209 kbps/slice. In the codesign, the ECDH protocol is executed in 4.1 ms, 17 times faster than a MIRACL software implementation running on the embedded processor Cortex A9 in the MicroZed. 
The FPGA-based accelerator for binary ECC presented in this work is the one with the least amount of hardware resources compared to other FPGA designs in the literature.
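The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is not taken from the paper: it assumes that "efficiency" means throughput in kbit/s divided by the slice count, with throughput taken as the 233 bits of the sect233 field processed per scalar multiplication. Under that assumed definition the quoted numbers are mutually consistent, giving a latency of about 0.952 ms and roughly 0.209 kbps/slice. All names in the code are illustrative.

```python
# Figures quoted in the abstract (scalar multiplication on sect233).
FIELD_BITS = 233          # field size of the sect233 curve, in bits
CLOCK_CYCLES = 128_820    # clock cycles per scalar multiplication
CLOCK_HZ = 135.31e6       # reported clock frequency in Hz
SLICES = 1170             # FPGA slices used by the design


def latency_seconds(cycles: int, clock_hz: float) -> float:
    """Time for one operation: cycle count divided by clock frequency."""
    return cycles / clock_hz


def efficiency_kbps_per_slice(bits: int, cycles: int,
                              clock_hz: float, slices: int) -> float:
    """Assumed definition: throughput in kbit/s divided by area in slices."""
    throughput_kbps = bits / latency_seconds(cycles, clock_hz) / 1e3
    return throughput_kbps / slices


latency_ms = latency_seconds(CLOCK_CYCLES, CLOCK_HZ) * 1e3
efficiency = efficiency_kbps_per_slice(FIELD_BITS, CLOCK_CYCLES,
                                       CLOCK_HZ, SLICES)
print(f"latency = {latency_ms:.3f} ms, "
      f"efficiency = {efficiency:.3f} kbps/slice")
```

The same helper can be used to compare other FPGA designs on an equal footing, provided their cycle counts, clock rates and slice counts are reported for the same curve.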
5

Ahmed, O., S. Areibi, K. Chattha, and B. Kelly. "PCIU: Hardware Implementations of an Efficient Packet Classification Algorithm with an Incremental Update Capability". International Journal of Reconfigurable Computing 2011 (2011): 1–21. http://dx.doi.org/10.1155/2011/648483.

Abstract:
Packet classification plays a crucial role for a number of network services such as policy-based routing, firewalls, and traffic billing, to name a few. However, classification can be a bottleneck in the above-mentioned applications if not implemented properly and efficiently. In this paper, we propose PCIU, a novel classification algorithm, which improves upon previously published work. PCIU provides lower preprocessing time, lower memory consumption, ease of incremental rule update, and reasonable classification time compared to state-of-the-art algorithms. The proposed algorithm was evaluated and compared to RFC and HiCut using several benchmarks. Results obtained indicate that PCIU outperforms these algorithms in terms of speed, memory usage, incremental update capability, and preprocessing time. The algorithm, furthermore, was improved and made more accessible for a variety of applications through implementation in hardware. Two such implementations are detailed and discussed in this paper. The results indicate that a hardware/software codesign approach results in a slower, but easier to optimize and improve within time constraints, PCIU solution. A hardware accelerator based on an ESL approach using Handel-C, on the other hand, resulted in a 31x speed-up over a pure software implementation running on a state of the art Xeon processor.
6

Pedram, Ardavan, Andreas Gerstlauer, and Robert A. van de Geijn. "Algorithm, Architecture, and Floating-Point Unit Codesign of a Matrix Factorization Accelerator". IEEE Transactions on Computers 63, no. 8 (August 1, 2014): 1854–67. http://dx.doi.org/10.1109/tc.2014.2315627.

7

Acer, Seher, Ariful Azad, Erik G. Boman, Aydın Buluç, Karen D. Devine, SM Ferdous, Nitin Gawande, et al. "EXAGRAPH: Graph and combinatorial methods for enabling exascale applications". International Journal of High Performance Computing Applications 35, no. 6 (September 30, 2021): 553–71. http://dx.doi.org/10.1177/10943420211029299.

Abstract:
Combinatorial algorithms in general and graph algorithms in particular play a critical enabling role in numerous scientific applications. However, the irregular memory access nature of these algorithms makes them one of the hardest algorithmic kernels to implement on parallel systems. With tens of billions of hardware threads and deep memory hierarchies, the exascale computing systems in particular pose extreme challenges in scaling graph algorithms. The codesign center on combinatorial algorithms, ExaGraph, was established to design and develop methods and techniques for efficient implementation of key combinatorial (graph) algorithms chosen from a diverse set of exascale applications. Algebraic and combinatorial methods have a complementary role in the advancement of computational science and engineering, including playing an enabling role on each other. In this paper, we survey the algorithmic and software development activities performed under the auspices of ExaGraph from both a combinatorial and an algebraic perspective. In particular, we detail our recent efforts in porting the algorithms to manycore accelerator (GPU) architectures. We also provide a brief survey of the applications that have benefited from the scalable implementations of different combinatorial algorithms to enable scientific discovery at scale. We believe that several applications will benefit from the algorithmic and software tools developed by the ExaGraph team.
8

Kumar, Rakesh, Alejandro Martínez, and Antonio González. "Efficient Power Gating of SIMD Accelerators Through Dynamic Selective Devectorization in an HW/SW Codesigned Environment". ACM Transactions on Architecture and Code Optimization 11, no. 3 (October 27, 2014): 1–23. http://dx.doi.org/10.1145/2629681.

9

Karl, Patrick, Jonas Schupp, Tim Fritzmann, and Georg Sigl. "Post-Quantum Signatures on RISC-V with Hardware Acceleration". ACM Transactions on Embedded Computing Systems, January 6, 2023. http://dx.doi.org/10.1145/3579092.

Abstract:
CRYSTALS-Dilithium and Falcon are digital signature algorithms based on cryptographic lattices, that are considered secure even if large-scale quantum computers will be able to break conventional public-key cryptography. Both schemes have been selected for standardization in the NIST post-quantum competition. In this work, we present a RISC-V HW/SW codesign that aims to combine the advantages of software- and hardware implementations, i.e. flexibility and performance. It shows the use of flexible hardware accelerators, which have been previously used for Public-Key Encryption (PKE) and Key-Encapsulation Mechanism (KEM), for post-quantum signatures. It is optimized for Dilithium as a generic signature scheme but also accelerates applications that require fast verification of Falcon’s compact signatures. We provide a comparison with previous works showing that for Dilithium and Falcon, cycle counts are significantly reduced, such that our design is faster than previous software implementations or other HW/SW codesigns. In addition to that, we present a compact Globalfoundries 22 nm ASIC design that runs at 800 MHz. By using hardware acceleration, energy consumption for Dilithium is reduced by up to 92.2%, and up to 67.5% for Falcon’s signature verification.
10

Bahadori, Milad, Kimmo Järvinen, Tilen Marc, and Miha Stopar. "Speed Reading in the Dark: Accelerating Functional Encryption for Quadratic Functions with Reprogrammable Hardware". IACR Transactions on Cryptographic Hardware and Embedded Systems, July 9, 2021, 1–27. http://dx.doi.org/10.46586/tches.v2021.i3.1-27.

Abstract:
Functional encryption is a new paradigm for encryption where decryption does not give the entire plaintext but only some function of it. Functional encryption has great potential in privacy-enhancing technologies but suffers from excessive computational overheads. We introduce the first hardware accelerator that supports functional encryption for quadratic functions. Our accelerator is implemented on a reprogrammable system-on-chip following the hardware/software codesign methodology. We benchmark our implementation for two privacy-preserving machine learning applications: (1) classification of handwritten digits from the MNIST database and (2) classification of clothes images from the Fashion MNIST database. In both cases, classification is performed with encrypted images. We show that our implementation offers speedups of over 200 times compared to a published software implementation and permits applications which are unfeasible with software-only solutions.

Doctoral dissertations on the topic "Hardware-Software Codesign Accelerators"

1

Sredojević, Ranko Radovin. "Template-based hardware-software codesign for high-performance embedded numerical accelerators". Thesis, Massachusetts Institute of Technology, 2013. http://hdl.handle.net/1721.1/84895.

Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2013.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 129-132).
Sophisticated algorithms for control, state estimation and equalization have tremendous potential to improve performance and create new capabilities in embedded and mobile systems. Traditional implementation approaches are not well suited for porting these algorithmic solutions into practical implementations within embedded system constraints. Most of the technical challenges arise from design approach that manipulates only one level in the design stack, thus being forced to conform to constraints imposed by other levels without question. In tightly constrained environments, like embedded and mobile systems, such approaches have a hard time efficiently delivering and delivering efficiency. In this work we offer a solution that cuts through all the design stack layers. We build flexible structures at the hardware, software and algorithm level, and approach the solution through design space exploration. To do this efficiently we use a template-based hardware-software development flow. The main incentive for template use is, as in software development, to relax the generality vs. efficiency/performance type tradeoffs that appear in solutions striving to achieve run-time flexibility. As a form of static polymorphism, templates typically incur very little performance overhead once the design is instantiated, thus offering the possibility to defer many design decisions until later stages when more is known about the overall system design. However, simply including templates into design flow is not sufficient to result in benefits greater than some level of code reuse. In our work we propose using templates as flexible interfaces between various levels in the design stack. As such, template parameters become the common language that designers at different levels of design hierarchy can use to succinctly express their assumptions and ideas. Thus, it is of great benefit if template parameters map directly and intuitively into models at every level. 
To showcase the approach we implement a numerical accelerator for embedded Model Predictive Control (MPC) algorithm. While most of this work and design flow are quite general, their full power is realized in search for good solutions to a specific problem. This is best understood in direct comparison with recent works on embedded and high-speed MPC implementations. The controllers we generate outperform published works by a handsome margin in both speed and power consumption, while taking very little time to generate.
by Ranko Radovin Sredojević.
Ph.D.
2

Fons, Lluís Mariano. "Hardware accelerators for embedded fingerprint-based personal recognition systems". Doctoral thesis, Universitat Rovira i Virgili, 2012. http://hdl.handle.net/10803/83493.

Abstract:
The development of automatic biometrics-based personal recognition systems is a reality in the current technological age. Not only those operations demanding stringent security levels but also many daily use consumer applications request the existence of computational platforms in charge of recognizing the identity of one individual based on the analysis of his/her physiological and/or behavioural characteristics. The state of the art points out two main open problems in the implementation of such applications: on the one hand, the needed reliability improvement in terms of recognition accuracy, overall security and real-time performance; and on the other hand, the cost reduction of those physical platforms in charge of the processing. This work aims at finding the proper system architecture able to address those limitations of current personal recognition applications. Embedded system solutions based on hardware-software co-design techniques and programmable (and run-time reconfigurable) logic devices under FPGAs or SOPCs are proven to be an efficient alternative to those existing multiprocessor systems based on HPCs, GPUs or PC platforms in the development of that kind of high-performance applications at low cost.
3

Ramesh, Chinthala. "Hardware-Software Co-Design Accelerators for Sparse BLAS". Thesis, 2017. http://etd.iisc.ac.in/handle/2005/4276.

Abstract:
Sparse Basic Linear Algebra Subroutines (Sparse BLAS) is an important library. Sparse BLAS includes three levels of subroutines: Level 1, Level 2 and Level 3. Level 1 Sparse BLAS routines do computations over a sparse vector and a sparse/dense vector. Level 2 deals with sparse matrix and vector operations. Level 3 deals with sparse matrix and dense matrix operations. The computations of these Sparse BLAS routines on General Purpose Processors (GPPs) not only suffer from low utilization of hardware resources but also take more compute time than the workload, due to the poor data locality of sparse vector/matrix storage formats. In the literature, tremendous efforts have been put into software to improve the performance of these Sparse BLAS routines on GPPs. GPPs are best suited to applications with high data locality, whereas Sparse BLAS routines operate on applications with low data locality; hence, GPP performance is poor. Various Custom Function Units (Hardware Accelerators) have been proposed in the literature and have proved to be more efficient than software that tried to accelerate Sparse BLAS subroutines. Though existing hardware accelerators improved Sparse BLAS performance compared to software Sparse BLAS routines, there is still a lot of scope to improve these accelerators. This thesis describes both the existing software and hardware-software co-designs (HW/SW co-design) and identifies the limitations of these existing solutions. We propose a new sparse data representation called Sawtooth Compressed Row Storage (SCRS) and corresponding SpMV and SpMM algorithms. SCRS based SpMV and SpMM perform better than existing software solutions. Even though SCRS based SpMV and SpMM algorithms perform better than existing solutions, they still could not reach theoretical peak performance. The knowledge gained from the study of the limitations of these existing solutions, including the proposed SCRS based SpMV and SpMM, is used to propose new HW/SW co-designs. Software accelerators are limited by the hardware properties of GPPs and GPUs themselves; hence, we propose HW/SW co-designs to accelerate a few basic Sparse BLAS operations (SpVV and SpMV). Our proposed Parallel Sparse BLAS HW/SW co-design achieves near theoretical peak performance with reasonable hardware resources.
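As background for the data-locality problem this abstract describes, here is a minimal sketch of sparse matrix-vector multiplication (SpMV) over the conventional Compressed Row Storage layout. It is not the thesis's SCRS format, whose details the abstract does not give; the function names and the small test matrix are illustrative assumptions. The indirect read `x[col_idx[k]]` in the inner loop is the irregular access pattern that tends to hurt cache behaviour on general-purpose processors.

```python
from typing import List, Tuple


def dense_to_crs(matrix: List[List[float]]) -> Tuple[List[float], List[int], List[int]]:
    """Convert a dense row-major matrix to plain CRS (values, col_idx, row_ptr)."""
    values: List[float] = []
    col_idx: List[int] = []
    row_ptr: List[int] = [0]
    for row in matrix:
        for j, v in enumerate(row):
            if v != 0.0:
                values.append(v)
                col_idx.append(j)
        # row_ptr[i + 1] marks where row i ends in `values`
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def spmv_crs(values: List[float], col_idx: List[int],
             row_ptr: List[int], x: List[float]) -> List[float]:
    """Compute y = A @ x for a matrix A stored in CRS."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # Indirect, data-dependent read of x: the source of poor locality.
            y[i] += values[k] * x[col_idx[k]]
    return y


# Hypothetical 3x3 example matrix and vector.
A = [[4.0, 0.0, 0.0],
     [0.0, 0.0, 2.0],
     [1.0, 0.0, 3.0]]
vals, cols, ptrs = dense_to_crs(A)
y = spmv_crs(vals, cols, ptrs, [1.0, 2.0, 3.0])
print(y)
```

Only the nonzero entries are stored and multiplied, so the work scales with the number of nonzeros rather than with the full matrix size.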

Book chapters on the topic "Hardware-Software Codesign Accelerators"

1

Ben Othman, Slim, Ahmed Karim Ben Salem, and Slim Ben Saoud. "Performance Analysis of FPGA Architectures based Embedded Control Applications". In Reconfigurable Embedded Control Systems, 274–310. IGI Global, 2011. http://dx.doi.org/10.4018/978-1-60960-086-0.ch011.

Abstract:
The performance of Systems on Chip (SoC), and of Field Programmable Gate Arrays (FPGAs) in particular, is continually increasing. Due to the growing complexity of modern embedded control systems, the need for higher-performance digital devices is evident. Recent FPGA technology makes it possible to include processor cores in the FPGA chip, which ensures more flexibility for digital controllers. Indeed, greater functionality of hardware and system software, Real-Time (RT) platforms and distributed subsystems are demanded. In this chapter, a design concept of an FPGA-based controller with Hardware/Software (Hw/Sw) codesign is proposed. It is applied to electrical machine drives. Different MultiProcessor SoC (MPSoC) architectures with Hw peripherals are discussed for implementation on FPGA-based embedded processor cores. Hw accelerators are considered in the design to enhance the controller speed performance and reduce power consumption. Test and validation of this control system are performed on an RT motor emulator implemented on the same FPGA. Experimental results, obtained on a real prototyping platform, are given in order to analyze the performance and efficiency of the discussed architecture designs in supporting hard RT constraints.

Conference papers on the topic "Hardware-Software Codesign Accelerators"

1

Temam, O. "Hardware neural network accelerators". In 2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS). IEEE, 2013. http://dx.doi.org/10.1109/codes-isss.2013.6659008.

2

Glint, Tom, Kailash Prasad, Jinay Dagli, Krishil Gandhi, Aryan Gupta, Vrajesh Patel, Neel Shah, and Joycee Mekie. "Hardware-Software Codesign of DNN Accelerators Using Approximate Posit Multipliers". In ASPDAC '23: 28th Asia and South Pacific Design Automation Conference. New York, NY, USA: ACM, 2023. http://dx.doi.org/10.1145/3566097.3567866.

3

Pasricha, Sudeep. "EDAML 2022 Invited Speaker 10: Hardware/Software Codesign for Optical Deep Learning Accelerators". In 2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). IEEE, 2022. http://dx.doi.org/10.1109/ipdpsw55747.2022.00203.

4

Vogel, Pirmin, Andrea Marongiu, and Luca Benini. "Lightweight virtual memory support for many-core accelerators in heterogeneous embedded SoCs". In 2015 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS). IEEE, 2015. http://dx.doi.org/10.1109/codesisss.2015.7331367.

5

Wang, Xuan, Chao Wang, and Xuehai Zhou. "Work-in-Progress: WinoNN: Optimising FPGA-based Neural Network Accelerators using Fast Winograd Algorithm". In 2018 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS). IEEE, 2018. http://dx.doi.org/10.1109/codesisss.2018.8525909.

6

Gong, Lei, Chao Wang, Xi Li, and Xuehai Zhou. "WiderFrame: An Automatic Customization Framework for Building CNN Accelerators on FPGAs: Work-in-Progress". In 2020 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS). IEEE, 2020. http://dx.doi.org/10.1109/codesisss51650.2020.9244024.

7

Colucci, Alessio, Alberto Marchisio, Beatrice Bussolino, Voitech Mrazek, Maurizio Martina, Guido Masera, and Muhammad Shafique. "A Fast Design Space Exploration Framework for the Deep Learning Accelerators: Work-in-Progress". In 2020 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS). IEEE, 2020. http://dx.doi.org/10.1109/codesisss51650.2020.9244038.

8

Conti, Francesco, Andrea Marongiu, and Luca Benini. "Synthesis-friendly techniques for tightly-coupled integration of hardware accelerators into shared-memory multi-core clusters". In 2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS). IEEE, 2013. http://dx.doi.org/10.1109/codes-isss.2013.6658992.

9

Qin, Yunji, Lei Gong, Zhendong Zheng, and Chao Wang. "Work-in-Progress: BloCirNN: An Efficient Software/hardware Codesign Approach for Neural Network Accelerators with Block-Circulant Matrix". In 2022 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS). IEEE, 2022. http://dx.doi.org/10.1109/codes-isss55005.2022.00010.

10

Yi, Changjae, Donghyun Kang, and Soonhoi Ha. "Hardware-Software Codesign of a CNN Accelerator". In 2022 25th Euromicro Conference on Digital System Design (DSD). IEEE, 2022. http://dx.doi.org/10.1109/dsd57027.2022.00054.
