A selection of scholarly literature on the topic "HW-Aware NAS"

Browse the lists of current articles, books, dissertations, conference papers, and other scholarly sources on the topic "HW-Aware NAS".


Journal articles on the topic "HW-Aware NAS":

1

Benmeziane, Hadjer, Hamza Ouarnoughi, Kaoutar El Maghraoui, and Smail Niar. "Multi-Objective Hardware-Aware Neural Architecture Search with Pareto Rank-Preserving Surrogate Models." ACM Transactions on Architecture and Code Optimization, January 11, 2023. http://dx.doi.org/10.1145/3579853.

Abstract:
Deep learning (DL) models such as convolutional neural networks (ConvNets) are being deployed to solve various computer vision and natural language processing tasks at the edge. It is a challenge to find the right DL architecture that simultaneously meets the accuracy, power and performance budgets of such resource-constrained devices. Hardware-aware Neural Architecture Search (HW-NAS) has recently gained steam by automating the design of efficient DL models for a variety of target hardware platforms. However, such algorithms require excessive computational resources: thousands of GPU days are required to evaluate and explore an architecture search space such as FBNet [45]. State-of-the-art approaches propose using surrogate models to predict architecture accuracy and hardware performance to speed up HW-NAS. Existing approaches use independent surrogate models to estimate each objective, resulting in non-optimal Pareto fronts. In this paper, HW-PR-NAS, a novel Pareto rank-preserving surrogate model for edge computing platforms, is presented. Our model integrates a new loss function that ranks the architectures according to their Pareto rank, regardless of the actual values of the various objectives. We employ a simple yet effective surrogate model architecture that can be generalized to any standard DL model. We then present an optimized evolutionary algorithm that uses and validates our surrogate model. Our approach has been evaluated on seven edge hardware platforms from various classes, including ASIC, FPGA, GPU and multi-core CPU. The evaluation results show that HW-PR-NAS achieves up to a 2.5x speedup compared to state-of-the-art methods while achieving 98% near the actual Pareto front.
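The core idea above, ordering candidate architectures by Pareto rank rather than by the raw values of each objective, can be made concrete with a small illustrative sketch (not the authors' code; the objectives and values below are hypothetical). Pareto ranks over minimized objectives such as (top-1 error, latency) follow from iterative non-dominated sorting:

```python
# Illustrative sketch, not the authors' code: Pareto ranks over objectives
# to be minimized (e.g., top-1 error and latency) via iterative
# non-dominated sorting; rank 0 is the best (non-dominated) front.

def dominates(a, b):
    """a dominates b: no worse in every objective, strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_ranks(points):
    remaining = dict(enumerate(points))
    ranks, rank = {}, 0
    while remaining:
        front = [i for i, p in remaining.items()
                 if not any(dominates(q, p) for j, q in remaining.items() if j != i)]
        for i in front:
            ranks[i] = rank
            del remaining[i]
        rank += 1
    return [ranks[i] for i in range(len(points))]

# Hypothetical (top-1 error, latency in ms) for four candidate architectures.
archs = [(0.24, 5.1), (0.22, 7.9), (0.30, 3.2), (0.26, 8.5)]
print(pareto_ranks(archs))  # -> [0, 0, 0, 1]
```

A rank-preserving surrogate only needs to reproduce this ordering, not the underlying objective values, which is what the abstract means by ranking "regardless of the actual values of the various objectives".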
2

Chitty-Venkata, Krishna Teja, and Arun K. Somani. "Neural Architecture Search Survey: A Hardware Perspective." ACM Computing Surveys, April 13, 2022. http://dx.doi.org/10.1145/3524500.

Abstract:
We review the problem of automating the hardware-aware architectural design process of Deep Neural Networks (DNNs). The field of Convolutional Neural Network (CNN) algorithm design has led to advancements in many fields such as computer vision, virtual reality, and autonomous driving. The end-to-end design process of a CNN is a challenging and time-consuming task, as it requires expertise in multiple areas such as signal and image processing, neural networks, and optimization. At the same time, several hardware platforms, general- and special-purpose, have equally contributed to the training and deployment of these complex networks in different settings. Hardware-Aware Neural Architecture Search (HW-NAS) automates the architectural design process of DNNs to alleviate human effort and generate efficient models that achieve acceptable accuracy-performance tradeoffs. The goal of this paper is to provide insights and understanding of HW-NAS techniques for various hardware platforms (MCU, CPU, GPU, ASIC, FPGA, ReRAM, DSP, and VPU), followed by the co-search methodologies of neural algorithm and hardware accelerator specifications.
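As a hedged illustration of how a hardware metric typically enters the search objective in this literature (a common scalarization in the style of MnasNet, not a method proposed by this survey), accuracy and measured latency can be folded into a single reward:

```python
# Hedged illustration, not from this survey: a MnasNet-style scalarized
# reward that folds a hardware metric (measured latency) into the search
# objective. The numbers below are hypothetical.

def hw_aware_reward(accuracy, latency_ms, target_ms=10.0, w=-0.07):
    # w < 0 penalizes architectures slower than the latency target
    return accuracy * (latency_ms / target_ms) ** w

print(hw_aware_reward(0.762, 12.0))  # over budget  -> reward below accuracy
print(hw_aware_reward(0.751, 8.0))   # under budget -> reward above accuracy
```

Hard constraints can instead be enforced by simply discarding candidates whose estimated cost exceeds the platform's budget.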

Dissertations on the topic "HW-Aware NAS":

1

Bouzidi, Halima. "Efficient Deployment of Deep Neural Networks on Hardware Devices for Edge AI." Electronic Thesis or Diss., Valenciennes, Université Polytechnique Hauts-de-France, 2024. http://www.theses.fr/2024UPHF0006.

Abstract:
Neural Networks (NN) have become a leading force in today's digital landscape. Inspired by the human brain, their intricate design allows them to recognize patterns, make informed decisions, and even predict forthcoming scenarios with impressive accuracy. NN are widely deployed in Internet of Things (IoT) systems, further elevating interconnected devices' capabilities by empowering them to learn and auto-adapt in real-time contexts. However, the proliferation of data produced by IoT sensors makes it difficult to send them to a centralized cloud for processing. This is where the allure of edge computing becomes captivating. Processing data closer to where it originates, at the edge, reduces latency, enables real-time decisions with less effort, and efficiently manages network congestion.

Integrating NN on edge devices for IoT systems enables more efficient and responsive solutions, ushering in a new age of self-sustaining Edge AI. However, deploying NN on resource-constrained edge devices presents a myriad of challenges: (i) The inherent complexity of neural network architectures, which requires significant computational and memory capabilities. (ii) The limited power budget of IoT devices makes NN inference prone to rapid energy depletion, drastically reducing system utility. (iii) The hurdle of ensuring harmony between NN and HW designs as they evolve at different rates. (iv) The lack of adaptability to the dynamic runtime environment and the intricacies of input data.

Addressing these challenges, this thesis aims to establish innovative methods that extend conventional NN design frameworks, notably Neural Architecture Search (NAS). By integrating HW and runtime contextual features, our methods aspire to enhance NN performance while abstracting away the need for the human-in-the-loop. Firstly, we incorporate HW properties into the NAS by tailoring the design of NN to clock frequency variations (DVFS) to minimize the energy footprint. Secondly, we leverage dynamicity within NN from a design perspective, culminating in a comprehensive Hardware-aware Dynamic NAS with DVFS features. Thirdly, we explore the potential of Graph Neural Networks (GNN) at the edge by developing a novel HW-aware NAS with distributed computing features on heterogeneous MPSoCs. Fourthly, we address SW/HW co-optimization on heterogeneous MPSoCs by proposing an innovative scheduling strategy that leverages NN adaptability and parallelism across computing units. Fifthly, we explore the prospect of ML4ML (Machine Learning for Machine Learning) by introducing techniques to estimate NN performance on edge devices using neural architectural features and ML-based predictors. Finally, we develop an end-to-end self-adaptive evolutionary HW-aware NAS framework that progressively learns the importance of NN parameters to effectively guide the search process toward Pareto optimality.

Our methods contribute to elaborating an end-to-end design framework for neural networks on edge hardware devices. They enable leveraging multiple optimization opportunities at both the software and hardware levels, thus improving the performance and efficiency of Edge AI.
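The ML4ML direction mentioned above, predicting NN performance on edge devices from architectural features, can be sketched as follows (a minimal, hypothetical example, not the thesis's implementation; the features and data are synthetic):

```python
# Minimal, hypothetical sketch of an ML-based latency predictor (not the
# thesis's implementation): regress device latency from simple architectural
# features. Features, targets, and coefficients below are synthetic.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical per-architecture features: [depth, avg_width, MFLOPs, Mparams]
X = rng.uniform(low=[8, 16, 50, 1], high=[40, 512, 600, 30], size=(200, 4))
# Synthetic "measured" latency in ms, roughly FLOPs-driven, for demo only.
y = 0.02 * X[:, 2] + 0.1 * X[:, 3] + rng.normal(0.0, 0.5, size=200)

predictor = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
candidate = np.array([[20, 128, 300, 10]])
print(f"predicted latency: {predictor.predict(candidate)[0]:.2f} ms")
```

Once trained on a set of profiled architectures, such a predictor replaces costly on-device measurements inside the search loop.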

Book chapters on the topic "HW-Aware NAS":

1

Benmeziane, Hadjer, Kaoutar El Maghraoui, Hamza Ouarnoughi, and Smail Niar. "Pareto Rank-Preserving Supernetwork for Hardware-Aware Neural Architecture Search." In Frontiers in Artificial Intelligence and Applications. IOS Press, 2023. http://dx.doi.org/10.3233/faia230276.

Abstract:
In neural architecture search (NAS), training every sampled architecture is very time-consuming and should be avoided. Weight-sharing is a promising solution to speed up the evaluation process. However, training the supernetwork incurs many discrepancies between the actual ranking and the predicted one. Additionally, efficient deep-learning engineering processes require incorporating realistic hardware-performance metrics into the NAS evaluation process, also known as hardware-aware NAS (HW-NAS). In HW-NAS, estimating task-specific performance and hardware efficiency are both required. This paper proposes a supernetwork training methodology that preserves the Pareto ranking between its different subnetworks, resulting in more efficient and accurate neural networks for a variety of hardware platforms. The results show a 97% near Pareto front approximation in less than 2 GPU days of search, which provides a 2x speedup compared to state-of-the-art methods. We validate our methodology on NAS-Bench-201, DARTS, and ImageNet. Our optimal model achieves 77.2% accuracy (+1.7% compared to the baseline) with an inference time of 3.68 ms on an Edge GPU for ImageNet, which yields a 2.3x speedup. The training implementation can be found at https://github.com/IHIaadj/PRP-NAS.
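As an illustrative sketch of what a Pareto rank-preserving training signal might look like (an assumption for illustration, not the paper's implementation), a pairwise margin ranking loss can train a scorer to order subnetworks consistently with their Pareto ranks:

```python
# Illustrative sketch (not the paper's implementation): a pairwise margin
# ranking loss that trains a scorer to order two subnetworks consistently
# with their Pareto ranks (lower rank = better front).

import torch
import torch.nn.functional as F

def rank_preserving_loss(score_a, score_b, rank_a, rank_b, margin=0.1):
    # target = +1 if subnetwork a should score higher than b, else -1
    target = (rank_a < rank_b).float() * 2 - 1
    return F.margin_ranking_loss(score_a, score_b, target, margin=margin)

# Hypothetical predicted scores and Pareto ranks for one sampled pair.
loss = rank_preserving_loss(torch.tensor([0.8]), torch.tensor([0.3]),
                            torch.tensor([0]), torch.tensor([1]))
print(loss.item())  # 0.0: the pair is already ordered consistently
```

A pairwise objective of this kind only penalizes misordered pairs, which matches the goal of preserving the ranking rather than predicting exact objective values.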

Conference papers on the topic "HW-Aware NAS":

1

Benmeziane, Hadjer, Kaoutar El Maghraoui, Hamza Ouarnoughi, Smail Niar, Martin Wistuba, and Naigang Wang. "Hardware-Aware Neural Architecture Search: Survey and Taxonomy." In Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21). California: International Joint Conferences on Artificial Intelligence Organization, 2021. http://dx.doi.org/10.24963/ijcai.2021/592.

Abstract:
There is no doubt that making AI mainstream by bringing powerful, yet power-hungry, deep neural networks (DNNs) to resource-constrained devices would require an efficient co-design of algorithms, hardware, and software. The increased popularity of DNN applications deployed on a wide variety of platforms, from tiny microcontrollers to data centers, has resulted in multiple questions and challenges related to constraints introduced by the hardware. In this survey on hardware-aware neural architecture search (HW-NAS), we present some of the existing answers proposed in the literature for the following questions: "Is it possible to build an efficient DL model that meets the latency and energy constraints of tiny edge devices?" and "How can we reduce the trade-off between the accuracy of a DL model and its ability to be deployed on a variety of platforms?". The survey provides a new taxonomy of HW-NAS and assesses the hardware cost estimation strategies. We also highlight the challenges and limitations of existing approaches and potential future directions. We hope that this survey will help fuel research towards efficient deep learning.
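One family of hardware cost estimation strategies assessed in surveys of this kind is lookup-table (LUT) estimation, where total latency is approximated as the sum of per-operator latencies measured once on the target device. A minimal sketch, with hypothetical table entries:

```python
# Illustrative sketch of a lookup-table (LUT) latency estimator; the operator
# names and latencies below are hypothetical. Total latency is approximated
# as the sum of per-operator latencies measured once on the target device.

OP_LATENCY_MS = {
    ("conv3x3", 32, 112): 0.41,
    ("mbconv6_5x5", 64, 56): 0.35,
    ("pool", 64, 56): 0.03,
}

def estimate_latency_ms(arch):
    """arch: list of (op_name, channels, resolution) layer descriptors."""
    return sum(OP_LATENCY_MS[layer] for layer in arch)

candidate = [("conv3x3", 32, 112), ("mbconv6_5x5", 64, 56), ("pool", 64, 56)]
print(f"{estimate_latency_ms(candidate):.2f} ms")  # -> 0.79 ms
```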
