Academic literature on the topic 'Domain-Specific Accelerator'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Domain-Specific Accelerator.'

Journal articles on the topic "Domain-Specific Accelerator"

1

Cong, Jason, Mohammad Ali Ghodrat, Michael Gill, Beayna Grigorian, and Glenn Reinman. "Architecture Support for Domain-Specific Accelerator-Rich CMPs." ACM Transactions on Embedded Computing Systems 13, no. 4s (July 2014): 1–26. http://dx.doi.org/10.1145/2584664.

2

Sotiriou-Xanthopoulos, Efstathios, Sotirios Xydis, Kostas Siozios, George Economakos, and Dimitrios Soudris. "A Framework for Interconnection-Aware Domain-Specific Many-Accelerator Synthesis." ACM Transactions on Embedded Computing Systems 16, no. 1 (November 3, 2016): 1–26. http://dx.doi.org/10.1145/2983624.

3

Feng, Fan, Li Li, Kun Wang, Yuxiang Fu, Guoqiang He, and Hongbing Pan. "Design and Application Space Exploration of a Domain-Specific Accelerator System." Electronics 7, no. 4 (March 29, 2018): 45. http://dx.doi.org/10.3390/electronics7040045.

4

Sunny, Febin P., Asif Mirza, Mahdi Nikdast, and Sudeep Pasricha. "ROBIN: A Robust Optical Binary Neural Network Accelerator." ACM Transactions on Embedded Computing Systems 20, no. 5s (October 31, 2021): 1–24. http://dx.doi.org/10.1145/3476988.

Abstract:
Domain specific neural network accelerators have garnered attention because of their improved energy efficiency and inference performance compared to CPUs and GPUs. Such accelerators are thus well suited for resource-constrained embedded systems. However, mapping sophisticated neural network models on these accelerators still entails significant energy and memory consumption, along with high inference time overhead. Binarized neural networks (BNNs), which utilize single-bit weights, represent an efficient way to implement and deploy neural network models on accelerators. In this paper, we present a novel optical-domain BNN accelerator, named ROBIN, which intelligently integrates heterogeneous microring resonator optical devices with complementary capabilities to efficiently implement the key functionalities in BNNs. We perform detailed fabrication-process variation analyses at the optical device level, explore efficient corrective tuning for these devices, and integrate circuit-level optimization to counter thermal variations. As a result, our proposed ROBIN architecture possesses the desirable traits of being robust, energy-efficient, low latency, and high throughput, when executing BNN models. Our analysis shows that ROBIN can outperform the best-known optical BNN accelerators and many electronic accelerators. Specifically, our energy-efficient ROBIN design exhibits energy-per-bit values that are ∼4× lower than electronic BNN accelerators and ∼933× lower than a recently proposed photonic BNN accelerator, while a performance-efficient ROBIN design shows ∼3× and ∼25× better performance than electronic and photonic BNN accelerators, respectively.
5

Belda, María José, Katzalin Olcoz, Fernando Castro, and Francisco Tirado. "Optimization of a line detection algorithm for autonomous vehicles on a RISC-V with accelerator." Journal of Computer Science and Technology 22, no. 2 (October 17, 2022): e10. http://dx.doi.org/10.24215/16666038.22.e10.

Abstract:
In recent years, autonomous vehicles have attracted the attention of many research groups, both in academia and business, including researchers from leading companies such as Google, Uber and Tesla. This type of vehicle is equipped with systems that are subject to very strict requirements, essentially aimed at performing safe operations (both for potential passengers and pedestrians) as well as carrying out the processing needed for decision making in real time. In many instances, general-purpose processors alone cannot ensure that these safety, reliability and real-time requirements are met, so it is common to implement heterogeneous systems by including accelerators. This paper explores the acceleration of a line detection application in the autonomous car environment using a heterogeneous system consisting of a general-purpose RISC-V core and a domain-specific accelerator. In particular, the application is analyzed to identify the most computationally intensive parts of the code and it is adapted accordingly for more efficient processing. Furthermore, the code is executed on the aforementioned hardware platform to verify that the execution effectively meets the existing requirements in autonomous vehicles, experiencing a 3.7x speedup with respect to running without an accelerator.
6

Vretenar, M., A. Mamaras, G. Bisoffi, and P. Foka. "Production of radioisotopes for cancer imaging and treatment with compact linear accelerators." Journal of Physics: Conference Series 2420, no. 1 (January 1, 2023): 012104. http://dx.doi.org/10.1088/1742-6596/2420/1/012104.

Abstract:
Accelerator-produced radioisotopes are widely used in modern medicine, for imaging, for cancer therapy, and for combinations of therapy and diagnostics (theragnostics). Clinical trials are well advanced for several radioisotope-based treatments that might open the way to a strong demand for specific accelerator systems dedicated to radioisotope production. While cyclotrons are the standard tool in this domain, we explore here alternative options using linear accelerators. Compared to cyclotrons, linacs have the advantage of modularity, compactness, and reduced beam loss with lower shielding requirements. Although in general more expensive than cyclotrons, linacs are competitive in cost for production of low-energy proton beams, or of intense beams of heavier particles. After a review of radioisotopes of potential interest, in particular produced with low-energy protons or helium, this paper presents two linac-based isotope production systems. The first is a compact RFQ-based system for PET (Positron Emission Tomography) isotopes, and the second is an alpha-particle linac for production of alpha-emitters. The accelerator systems are described, together with calculations of production yields for different targets.
7

Huang, Shizhen, Enhao Tang, Shun Li, Xiangzhan Ping, and Ruiqi Chen. "Hardware-friendly compression and hardware acceleration for transformer: A survey." Electronic Research Archive 30, no. 10 (2022): 3755–85. http://dx.doi.org/10.3934/era.2022192.

Abstract:
The transformer model has recently been a milestone in artificial intelligence. The algorithm has enhanced the performance of tasks such as Machine Translation and Computer Vision to a level previously unattainable. However, the transformer model has a strong performance but also requires a high amount of memory overhead and enormous computing power. This significantly hinders the deployment of an energy-efficient transformer system. Due to the high parallelism, low latency, and low power consumption of field-programmable gate arrays (FPGAs) and application specific integrated circuits (ASICs), they demonstrate higher energy efficiency than Graphics Processing Units (GPUs) and Central Processing Units (CPUs). Therefore, FPGA and ASIC are widely used to accelerate deep learning algorithms. Several papers have addressed the issue of deploying the Transformer on dedicated hardware for acceleration, but there is a lack of comprehensive studies in this area. Therefore, we summarize the transformer model compression algorithm based on the hardware accelerator and its implementation to provide a comprehensive overview of this research domain. This paper first introduces the transformer model framework and computation process. Secondly, a discussion of hardware-friendly compression algorithms based on self-attention and Transformer is provided, along with a review of a state-of-the-art hardware accelerator framework. Finally, we considered some promising topics in transformer hardware acceleration, such as a high-level design framework and selecting the optimum device using reinforcement learning.
8

Fang, Jian, Yvo T. B. Mulder, Jan Hidders, Jinho Lee, and H. Peter Hofstee. "In-memory database acceleration on FPGAs: a survey." VLDB Journal 29, no. 1 (October 26, 2019): 33–59. http://dx.doi.org/10.1007/s00778-019-00581-w.

Abstract:
While FPGAs have seen prior use in database systems, in recent years interest in using FPGA to accelerate databases has declined in both industry and academia for the following three reasons. First, specifically for in-memory databases, FPGAs integrated with conventional I/O provide insufficient bandwidth, limiting performance. Second, GPUs, which can also provide high throughput, and are easier to program, have emerged as a strong accelerator alternative. Third, programming FPGAs required developers to have full-stack skills, from high-level algorithm design to low-level circuit implementations. The good news is that these challenges are being addressed. New interface technologies connect FPGAs into the system at main-memory bandwidth and the latest FPGAs provide local memory competitive in capacity and bandwidth with GPUs. Ease of programming is improving through support of shared coherent virtual memory between the host and the accelerator, support for higher-level languages, and domain-specific tools to generate FPGA designs automatically. Therefore, this paper surveys using FPGAs to accelerate in-memory database systems targeting designs that can operate at the speed of main memory.
9

Hosseini, Morteza, and Tinoosh Mohsenin. "Binary Precision Neural Network Manycore Accelerator." ACM Journal on Emerging Technologies in Computing Systems 17, no. 2 (April 2021): 1–27. http://dx.doi.org/10.1145/3423136.

Abstract:
This article presents a low-power, programmable, domain-specific manycore accelerator, Binarized neural Network Manycore Accelerator (BiNMAC), which adopts and efficiently executes binary precision weight/activation neural network models. Such networks have compact models in which weights are constrained to only 1 bit and can be packed several in one memory entry that minimizes memory footprint to its finest. Packing weights also facilitates executing single instruction, multiple data with simple circuitry that allows maximizing performance and efficiency. The proposed BiNMAC has light-weight cores that support domain-specific instructions, and a router-based memory access architecture that helps with efficient implementation of layers in binary precision weight/activation neural networks of proper size. With only 3.73% and 1.98% area and average power overhead, respectively, novel instructions such as Combined Population-Count-XNOR, Patch-Select, and Bit-based Accumulation are added to the instruction set architecture of the BiNMAC, each of which replaces execution cycles of frequently used functions with 1 clock cycle that otherwise would have taken 54, 4, and 3 clock cycles, respectively. Additionally, customized logic is added to every core to transpose 16×16-bit blocks of memory on a bit-level basis, that expedites reshaping intermediate data to be well-aligned for bitwise operations. A 64-cluster architecture of the BiNMAC is fully placed and routed in 65-nm TSMC CMOS technology, where a single cluster occupies an area of 0.53 mm² with an average power of 232 mW at 1-GHz clock frequency and 1.1 V. The 64-cluster architecture takes 36.5 mm² area and, if fully exploited, consumes a total power of 16.4 W and can perform 1,360 Giga Operations Per Second (GOPS) while providing full programmability.
To demonstrate its scalability, four binarized case studies including ResNet-20 and LeNet-5 for high-performance image classification, as well as a ConvNet and a multilayer perceptron for low-power physiological applications were implemented on BiNMAC. The implementation results indicate that the population-count instruction alone can expedite the performance by approximately 5×. When other new instructions are added to a RISC machine with existing population-count instruction, the performance is increased by 58% on average. To compare the performance of the BiNMAC with other commercial-off-the-shelf platforms, the case studies with their double-precision floating-point models are also implemented on the NVIDIA Jetson TX2 SoC (CPU+GPU). The results indicate that, within a margin of ∼2.1%–9.5% accuracy loss, BiNMAC on average outperforms the TX2 GPU by approximately 1.9× (or 7.5× with fabrication technology scaled) in energy consumption for image classification applications. On low power settings and within a margin of ∼3.7%–5.5% accuracy loss compared to ARM Cortex-A57 CPU implementation, BiNMAC is roughly 9.7×–17.2× (or 38.8×–68.8× with fabrication technology scaled) more energy efficient for physiological applications while meeting the application deadline.
10

Gundi, Noel Daniel, Pramesh Pandey, Sanghamitra Roy, and Koushik Chakraborty. "Implementing a Timing Error-Resilient and Energy-Efficient Near-Threshold Hardware Accelerator for Deep Neural Network Inference." Journal of Low Power Electronics and Applications 12, no. 2 (June 6, 2022): 32. http://dx.doi.org/10.3390/jlpea12020032.

Abstract:
Increasing processing requirements in the Artificial Intelligence (AI) realm have led to the emergence of domain-specific architectures for Deep Neural Network (DNN) applications. Tensor Processing Unit (TPU), a DNN accelerator by Google, has emerged as a front runner outclassing its contemporaries, CPUs and GPUs, in performance by 15×–30×. TPUs have been deployed in Google data centers to cater to the performance demands. However, a TPU’s performance enhancement is accompanied by a mammoth power consumption. In the pursuit of lowering the energy utilization, this paper proposes PREDITOR, a low-power TPU operating in the Near-Threshold Computing (NTC) realm. PREDITOR uses mathematical analysis to mitigate the undetectable timing errors by boosting the voltage of the selective multiplier-and-accumulator units at specific intervals to enhance the performance of the NTC TPU, thereby ensuring a high inference accuracy at low voltage. PREDITOR offers up to 3×–5× improved performance in comparison to the leading-edge error mitigation schemes with a minor loss in accuracy.

Dissertations / Theses on the topic "Domain-Specific Accelerator"

1

Babecki, Christopher. "A Memory-Array Centric Reconfigurable Hardware Accelerator for Security Applications." Case Western Reserve University School of Graduate Studies / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=case1427381331.

2

Ouedraogo, Ganda Stéphane. "Automatic synthesis of hardware accelerator from high-level specifications of physical layers for flexible radio." Thesis, Rennes 1, 2014. http://www.theses.fr/2014REN1S183/document.

Abstract:
The Internet of Things (IoT) aims at connecting billions of communicating devices through an internet-like network. To this end, access to these things is expected to be performed via wireless technologies without using any predefined infrastructure or standards. This technology requires defining and implementing smart radio nodes capable of adapting to different physical-layer communication protocols. In this thesis, we define a design flow for such smart nodes, starting from their high-level specification down to their implementation in FPGA fabrics. The flow aims at improving the programmability of waveforms by using executable and synthesizable high-level specifications. It relies on High-Level Synthesis (HLS) for rapid prototyping of the waveform functional blocks, as well as on the dataflow model of computation of radio waveforms. Its entry point is a Domain-Specific Language (DSL) that enables modeling a waveform at a high level while inserting implementation constraints for reconfigurable architectures such as FPGAs. The flow includes a compiler whose purpose is to generate synthesizable RTL source code and synthesis scripts. The final waveform consists of a datapath and a control unit implemented as a Hierarchical Finite State Machine (HFSM).
3

Membarth, Richard [Verfasser]. "Code Generation for GPU Accelerators from a Domain-Specific Language for Medical Imaging / Richard Membarth." München : Verlag Dr. Hut, 2013. http://d-nb.info/1037287142/34.

4

Patrick, Ardhe, and Sebastian Karlsson. "The crypto catalyst." Thesis, Linnéuniversitetet, Institutionen för marknadsföring (MF), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-75548.

Abstract:
In a scope where continuous innovations are seen each day, cyberspace can be seen as the tech entrepreneurs’ playground for delivering new solutions to customers. Digital start-ups that interact through cyberspace operate with little to no restriction despite having limited resources. In 2008 a man named Satoshi Nakamoto developed a new technology called blockchain. The new breed of firms providing blockchain solutions has been painted as living in a borderless world with few technical restrictions. Exploring the effects that blockchain has on their internationalisation led us to study the early internationalisation of blockchain born globals and their business ecosystem. The deductive and qualitative approach gave results from four different companies that were involved in blockchain technology. By using previous theory on internationalisation and a deductive approach, a conceptual synthesis was developed. The synthesis was later applied to the case companies to observe the results. The findings show that implementing blockchain in the core offering has resulted in an accelerated internationalisation. The major factor contributing to this quick internationalisation is the spread of knowledge between buyers and sellers through cyberspace. However, the authors were unable to find a relationship between the accelerated internationalisation and the extent to which a firm has implemented blockchain in its core offering. The findings have given the authors prominent answers to the research question and have highlighted the complexity of the subject. The authors conclude the thesis by displaying the importance of cyberspace in the business ecosystem, how it attracts customers, and the importance of the company’s business model. Blockchain technology proved to have effects on the process of internationalisation due to superior technological performance, but also its hype.

Book chapters on the topic "Domain-Specific Accelerator"

1

Van Baalen, Jeffrey, and Steven Roach. "Using Decision Procedures to Accelerate Domain-Specific Deductive Synthesis Systems." In Logic-Based Program Synthesis and Transformation, 61–80. Berlin, Heidelberg: Springer Berlin Heidelberg, 1999. http://dx.doi.org/10.1007/3-540-48958-4_4.

2

Droschinsky, Andre, Lina Humbeck, Oliver Koch, Nils M. Kriege, Petra Mutzel, and Till Schäfer. "Graph-Based Methods for Rational Drug Design." In Lecture Notes in Computer Science, 76–96. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-21534-6_5.

Abstract:
Rational drug design deals with computational methods to accelerate the development of new drugs. Among other tasks, it is necessary to analyze huge databases of small molecules. Since a direct relationship between the structure of these molecules and their effect (e.g., toxicity) can be assumed in many cases, a wide set of methods is based on the modeling of the molecules as graphs with attributes. Here, we discuss our results concerning structural molecular similarity searches and molecular clustering and put them into the wider context of graph similarity search. In particular, we discuss algorithms for computing graph similarity w.r.t. maximum common subgraphs and their extension to domain specific requirements.
3

Joshi, Deepak, and Michael E. Hahn. "Electromyogram and Inertial Sensor Signal Processing in Locomotion and Transition Classification." In Computational Tools and Techniques for Biomedical Signal Processing, 195–211. IGI Global, 2017. http://dx.doi.org/10.4018/978-1-5225-0660-7.ch009.

Abstract:
Signal processing in biomedical engineering is essentially required for classification while serving mainly two aims. The first is noise removal and the second is signal representation. Signal representation deals with transforming the signal in such a way that the signal is most informative in that particular domain for the application at hand. This chapter will describe signal processing methods such as the spectrogram, with specific applications to locomotion and transition classification using Electromyography (EMG) data. A wavelet analysis application on foot acceleration signals for automatic identification of toe off in locomotion and the ramp transition is also shown. Finally, EMG and accelerometer performance across different time windows of a gait cycle in locomotion and transition classification is presented, with an emphasis on fusing the data from both sensors for better classification.
4

Joshi, Deepak, and Michael E. Hahn. "Electromyogram and Inertial Sensor Signal Processing in Locomotion and Transition Classification." In Data Analytics in Medicine, 762–78. IGI Global, 2020. http://dx.doi.org/10.4018/978-1-7998-1204-3.ch041.

Abstract:
Signal processing in biomedical engineering is essentially required for classification while serving mainly two aims. The first is noise removal and the second is signal representation. Signal representation deals with transforming the signal in such a way that the signal is most informative in that particular domain for the application at hand. This chapter will describe signal processing methods such as the spectrogram, with specific applications to locomotion and transition classification using Electromyography (EMG) data. A wavelet analysis application on foot acceleration signals for automatic identification of toe off in locomotion and the ramp transition is also shown. Finally, EMG and accelerometer performance across different time windows of a gait cycle in locomotion and transition classification is presented, with an emphasis on fusing the data from both sensors for better classification.
5

Bouguettaya, Athman, Boualem Benatallah, Brahim Medjahed, Mourad Ouzzani, and Lily Hendra. "Adaptive Web-Based Database Communities." In Information Modeling for Internet Applications, 277–98. IGI Global, 2003. http://dx.doi.org/10.4018/978-1-59140-050-9.ch013.

Abstract:
The evolution into the global information infrastructure and the concomitant increase in the available information on the Web are offering a powerful distribution vehicle for organizations that need to coordinate the use of multiple information sources. However, the technology to organize, search, integrate, and evolve these sources has not kept pace with the rapid growth of the available information space. In this chapter, we present our work in the WebFINDIT project. WebFINDIT aims to achieve the scalable integration and efficient querying of Web-accessible databases through the incremental data-driven discovery and formation of interrelationships between information sources. WebFINDIT uses an ontological organization of the information space to filter interactions and accelerate service searches. More precisely, the information space is organized as domain-specific groups. Each group forms a database community to represent the domain of interest of the related databases. Additionally, WebFINDIT provides a monitoring mechanism to dynamically alter relationships between different database communities. This is achieved by using distributed agents that work as background processes. They continually gather and evaluate information about the intercommunity relationships to recommend changes. A prototype has been fully implemented in the context of a healthcare application.
6

S., Umamaheswari, Sangeetha D., C. Mouliganth, and Vignesh E. M. "KidNet." In Deep Learning Applications and Intelligent Decision Making in Engineering, 114–29. IGI Global, 2021. http://dx.doi.org/10.4018/978-1-7998-2108-3.ch004.

Abstract:
Kidney cancer is one of the 10 most common cancers in both men and women. The lifetime risk for one developing kidney cancer is about 1.6%. The rate of kidney cancer diagnosis has been rising since the 1990s due to the use of newer imaging tests such as CT scans. The kidneys are deep inside the body and hence small kidney tumours cannot be seen or felt during a physical examination. Existing work on kidney tumour diagnosis uses traditional machine learning and image processing techniques to find and classify the images. Deep learning systems do not require this domain-specific knowledge. The kidney tumour diagnosis system uses deep learning and convolutional neural networks to classify CT images. A deep learning neural network model named KidNet has been implemented. It has been trained using labelled kidney CT images. To achieve acceleration during the training phase, GPUs have been used. The network when trained with abdominal CT images achieved 86.1% accuracy, and the one trained with cropped portion of kidney images achieved 89.6% accuracy.

Conference papers on the topic "Domain-Specific Accelerator"

1

Ozkan, M. Akif, Oliver Reiche, Frank Hannig, and Jurgen Teich. "FPGA-based accelerator design from a domain-specific language." In 2016 26th International Conference on Field Programmable Logic and Applications (FPL). IEEE, 2016. http://dx.doi.org/10.1109/fpl.2016.7577357.

2

Durelli, Gianluca C., Fabrizio Spada, Christian Pilato, and Marco D. Santambrogio. "Scala-Based Domain-Specific Language for Creating Accelerator-Based SoCs." In 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). IEEE, 2016. http://dx.doi.org/10.1109/ipdpsw.2016.169.

3

Khavari Tavana, Mohammad, Amey Kulkarni, Abbas Rahimi, Tinoosh Mohsenin, and Houman Homayoun. "Energy-efficient mapping of biomedical applications on domain-specific accelerator under process variation." In ISLPED'14: International Symposium on Low Power Electronics and Design. New York, NY, USA: ACM, 2014. http://dx.doi.org/10.1145/2627369.2627654.

4

Kim, Soyeon, Sanghoon Kang, Donghyeon Han, Sangyeob Kim, Sangjin Kim, and Hoi-jun Yoo. "An Energy-Efficient GAN Accelerator with On-chip Training for Domain Specific Optimization." In 2020 IEEE Asian Solid-State Circuits Conference (A-SSCC). IEEE, 2020. http://dx.doi.org/10.1109/a-sscc48613.2020.9336128.

5

Sun, Baohua, Lin Yang, Patrick Dong, Wenhan Zhang, Jason Dong, and Charles Young. "Ultra Power-Efficient CNN Domain Specific Accelerator with 9.3TOPS/Watt for Mobile and Embedded Applications." In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). IEEE, 2018. http://dx.doi.org/10.1109/cvprw.2018.00219.

6

Malik, Maria, Farnoud Farahmand, Paul Otto, Nima Akhlaghi, Tinoosh Mohsenin, Siddhartha Sikdar, and Houman Homayoun. "Architecture Exploration for Energy-Efficient Embedded Vision Applications: From General Purpose Processor to Domain Specific Accelerator." In 2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI). IEEE, 2016. http://dx.doi.org/10.1109/isvlsi.2016.112.

7

Sun, Baohua, Lin Yang, Wenhan Zhang, Patrick Dong, Charles Young, Jason Dong, and Michael Lin. "Demonstration of Applications in Computer Vision and NLP on Ultra Power-Efficient CNN Domain Specific Accelerator with 9.3TOPS/Watt." In 2019 IEEE International Conference on Multimedia & Expo Workshops (ICMEW). IEEE, 2019. http://dx.doi.org/10.1109/icmew.2019.00115.

8

Hannig, Frank. "Resource-aware computing on domain-specific accelerators." In the 10th Workshop. New York, New York, USA: ACM Press, 2013. http://dx.doi.org/10.1145/2443608.2443616.

9

Delcambre, Lois, Susan Price, Marianne Lykke Nielsen, Timothy Tolle, Vibeke Luk, and Mathew Weaver. "Accelerated indexing in a domain-specific digital library." In the 2006 national conference. New York, New York, USA: ACM Press, 2006. http://dx.doi.org/10.1145/1146598.1146691.

10

Limaye, Ankur, and Tosiron Adegbija. "DOSAGE: Generating Domain-Specific Accelerators for Resource-Constrained Computing." In 2021 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED). IEEE, 2021. http://dx.doi.org/10.1109/islped52811.2021.9502501.


Reports on the topic "Domain-Specific Accelerator"

1

Mehmood, Hamid. Bibliometrics of Water Research: A Global Snapshot. United Nations University Institute for Water, Environment and Health, May 2019. http://dx.doi.org/10.53328/eybt8774.

Abstract:
This report examines the various dimensions of global water-related research over the 2012-2017 period, using extensive bibliographic data. The review covers trends in water-related publications and citations, the relative importance of water-related research in the overall body of scientific research, flows of water-related knowledge between countries and the dynamics of water research publishing opportunities. In summary, it shows that less than 50% of all countries are publishing water-related research, that China and the USA are the two top publishers, and that China’s publishing rate has been growing steadily over the study period. More than 70% of water-related publications originating in the USA are being cited globally, while China’s water research output appears to be primarily internally cited at present. Analysis of the global water knowledge flows suggests that research is hardly addressing a range of regional water challenges. Countries with protracted water problems – for example in infrastructure, environment, agriculture, energy solutions – do not seem to be at the forefront of water research production or knowledge transfer. Instead, global water research is reliant on Western, particularly US-produced, scientific outputs. A disconnect is also observed between the percentage increase in publications and the number of citations, suggesting low quality or a narrow focus of many publications. Among other factors, this may reflect the pressure on researchers to contribute a certain number of publications per year, or the progressively increasing role of grey literature in scientific discourse that ‘diverts’ some citation flow.
Analysis of the number of research publications per million people suggests that water research does not necessarily emerge as a reaction to water scarcity in a specific country, but may be driven by the traditional economic value of water supply, geopolitical location, a focus on regional development - including cross-border water management - or development aid spending, or globally applicable research in water management. The proportion of water research in the overall research output of a country is small, including for some of the top-publishing countries. The number of water-related journals that create opportunities for publishing water research has grown dramatically in absolute terms since 2000, and is now close to 2100 journals. The metrics used in this report are based on readily available bibliographic data. They can be further focused to better understand a specific thematic domain, geographical region or country, or to analyze a different period. To help accelerate solutions to global and national water challenges that many of these research papers are highlighting, the water research community needs to look beyond the research ‘box’ and identify ways to measure the development impact of water research programmes, rather than ‘impact’ based solely on academic impact measured in citations. The research findings, learning and knowledge in these research publications need to be conveyed in a practical way to the real users of this knowledge – stakeholders who are beyond research circles.
2

Dubcovsky, Jorge, Tzion Fahima, and Ann Blechl. Molecular characterization and deployment of the high-temperature adult plant stripe rust resistance gene Yr36 from wheat. United States Department of Agriculture, November 2013. http://dx.doi.org/10.32747/2013.7699860.bard.

Abstract:
Stripe rust, caused by Puccinia striiformis f. sp. tritici, is one of the most destructive fungal diseases of wheat. Virulent races that appeared within the last decade caused drastic cuts in yields. The incorporation of genetic resistance against this pathogen is the most cost-effective and environmentally friendly solution to this problem. However, race-specific seedling resistance genes provide only a temporary solution because fungal populations rapidly evolve to overcome this type of resistance. In contrast, high-temperature adult plant (HTAP) resistance genes provide a broad-spectrum resistance that is partial and more durable. The cloning of the first wheat HTAP stripe rust resistance gene Yr36 (Science 2009, 323:1357), funded by our previous (2007-2010) BARD grant, provided us for the first time with an entry point for understanding the mechanism of broad-spectrum resistance. Two paralogous copies of this gene are tightly linked at the Yr36 locus (WKS1 and WKS2). The main objectives of the current study were to characterize the Yr36 (WKS) resistance mechanism and to identify and characterize alternative WKS genes in wheat and wild relatives. We report here that the protein coded by Yr36, designated WKS1, which has a novel architecture with a functional kinase and a lipid-binding START domain, is localized to the chloroplast. Our results suggest that the presence of the START domain may affect the kinase activity. We have found that the WKS1 was over-expressed on leaf necrosis in wheat transgenic plants. When the isolated WKS1.1 splice variant transcript was transformed into susceptible wheat it conferred resistance to stripe rust, but the truncated variant WKS1.2 did not confer resistance. WKS1.1 and WKS1.2 showed different lipid-binding profiles. WKS1.1 enters the chloroplast membrane, while WKS1.2 is only attached outside of the chloroplast membrane.
The ascorbate peroxidase (APX) activity of the recombinant protein of TmtAPX was found to be reduced by WKS1.1 protein in vitro. The WKS1.1 mature protein in the chloroplast is able to phosphorylate TmtAPX protein in vivo. WKS1.1 induced cell death by suppressing APX activity and reducing the ability of the cell to detoxify reactive oxygen. The decrease of APX activity reduces the ability of the plant to detoxify the reactive H2O2 and is the possible mechanism underlying the accelerated cell death observed in the transgenic plants overexpressing WKS1.1 and in the regions surrounding a stripe rust infection in the wheat plants carrying the natural WKS1.1 gene. WKS2 is a nonfunctional paralog of WKS1 in wild emmer wheat, probably due to a retrotransposon insertion close to the alternative splicing site. In some other wild relatives of wheat, such as Aegilops comosa, there is only one copy of this gene, highly similar to WKS2, which is lacking the retrotransposon insertion. WKS2 gene present in wheat and WKS2-Ae from A. showed a different pattern of alternative splice variants, regardless of the presence of the retrotransposon insertion. Susceptible Bobwhite transformed with WKS2-Ae (without retrotransposon insertion in intron 10), which derived from Aegilops comosa, conferred resistance to stripe rust in wheat. The expression of WKS2-Ae in transgenic plants is up-regulated by temperature and pathogen infection. Combination of WKS1 and WKS2-Ae shows improved stripe rust resistance in WKS1×WKS2-Ae F1 hybrid plants. The obtained results show that WKS1 protein is accelerating programmed cell death observed in the regions surrounding a stripe rust infection in the wheat plants carrying the natural or transgenic WKS1 gene. Furthermore, characterization of the epistatic interactions of Yr36 and Yr18 demonstrated that these two genes have additive effects and can therefore be combined to increase partial resistance to this devastating pathogen of wheat.
These achievements may have a broad impact on wheat breeding efforts attempting to protect wheat yields against one of the most devastating wheat pathogens.