
Contents

Journal articles
Theses

Selected scientific literature on the topic "Architecture parallelisante"

Author: Grafiati

Published: 25 January 2025

Last modified: 5 July 2025


Browse the list of current articles, books, theses, conference proceedings, and other scholarly sources relevant to the topic "Architecture parallelisante".


Journal articles on the topic "Architecture parallelisante"

1

Szénási, Sándor. "Solving the inverse heat conduction problem using NVLink capable Power architecture." PeerJ Computer Science 3 (November 20, 2017): e138. http://dx.doi.org/10.7717/peerj-cs.138.

Abstract:

The accurate knowledge of Heat Transfer Coefficients is essential for the design of precise heat transfer operations. The determination of these values requires Inverse Heat Transfer Calculations, which are usually based on heuristic optimisation techniques, like Genetic Algorithms or Particle Swarm Optimisation. The main bottleneck of these heuristics is the high computational demand of the cost function calculation, which is usually based on heat transfer simulations producing the thermal history of the workpiece at given locations. This Direct Heat Transfer Calculation is a well parallelisable process, making it feasible to implement an efficient GPU kernel for this purpose. This paper presents a novel step forward: based on the special requirements of the heuristics solving the inverse problem (executing hundreds of simulations in a parallel fashion at the end of each iteration), it is possible to gain a higher level of parallelism using multiple graphics accelerators. The results show that this implementation (running on 4 GPUs) is about 120 times faster than a traditional CPU implementation using 20 cores. The latest developments of the GPU-based High Power Computations area were also analysed, like the new NVLink connection between the host and the devices, which tries to solve the long time existing data transfer handicap of GPU programming.


2

Taygan, Ugur, and Adnan Ozsoy. "Performance analysis and GPU parallelisation of ECO object tracking algorithm." New Trends and Issues Proceedings on Advances in Pure and Applied Sciences, no. 12 (April 30, 2020): 109–18. http://dx.doi.org/10.18844/gjpaas.v0i12.4991.

Abstract:

The classification and tracking of objects has gained popularity in recent years due to the variety and importance of their application areas. Although object classification does not necessarily have to be real time, object tracking is often intended to be carried out in real time. While the object tracking algorithm mainly focuses on robustness and accuracy, the speed of the algorithm may degrade significantly. Due to their parallelisable nature, the use of GPUs and other parallel programming tools is increasing in object tracking applications. In this paper, we run experiments on the Efficient Convolution Operators object tracking algorithm, in order to detect its time-consuming parts, which are the bottlenecks of the algorithm, and investigate the possibility of GPU parallelisation of the bottlenecks to improve the speed of the algorithm. Finally, the candidate methods are implemented and parallelised using the Compute Unified Device Architecture.

Keywords: Object tracking, parallel programming.


3

Rice, J. E., and K. B. Kent. "Case studies in determining the optimal field programmable gate array design for computing highly parallelisable problems." IET Computers & Digital Techniques 3, no. 3 (2009): 247. http://dx.doi.org/10.1049/iet-cdt.2008.0042.


4

Dyubele, Sithembiso, and Duncan Anthony Coulter. "A Hybrid Agent-Oriented Stochastic Diffusion Search and Beam Search Architecture." International Conference on Intelligent and Innovative Computing Applications 2024 (November 30, 2024): 214–24. https://doi.org/10.59200/iconic.2024.023.

Abstract:

Various swarm intelligence-based algorithms have been developed and explored over the years. These algorithms include particle swarm optimisation, spider monkey optimisation, artificial bee colony algorithm, ant colony optimisation, and bacterial foraging optimisation, among many others. However, according to the reviewed literature, classical or traditional optimisation methods are confronted with difficulties when scaling up to real-world optimisation problems; therefore, there is a need to develop efficient and robust computational algorithms that can solve problems numerically, irrespective of their sizes. Inspired by nature-inspired swarm intelligence algorithms, this study has created a hybrid-based algorithm utilising Stochastic Diffusion Search (SDS) and Beam Search algorithms. In this s[...] ability to operate as a multi-agent population-based global search and utilised to initialise, update, and maintain a list of candidate regions in the search space. In addition, it is responsible for recruiting agents for those regions in the search space. A variation of the knapsack problem was employed to test the created hybrid model. In this problem, constraints were established, as discussed later in the paper (in section 3.5). The results discussed in section 4 indicated that the algorithm found a better solution in the search space. The results also showed a strong and consistent beam after a series of iterations during the simulation. The specific improvements observed with the hybrid algorithm are that, because it is implemented as an actor-oriented system, it is completely parallelised, every actor is independent of every other actor and can be run automatically, on its own individual green thread but can, in fact, be run on another computer.
This parallelisability, composability, and the resulting distributable nature of the new algorithm are the main advantages over the standard implementation of either stochastic diffusion search or beam search, neither of which are parallelised by default. Its implication is based on the fact that the pace of improvement in available computing power has levelled off following several decades of sustained growth characterised by Moore's law. The hybrid algorithm is highly parallelisable, making it easy to take advantage of multiple cores on a single computer or multiple machine instances in a cloud computing scenario. Therefore, this is a modern version of both component algorithms within the proposed hybrid approach as it translates better into environments where it is easier to scale outwards rather than upwards.


5

Haveraaen, Magne. "Machine and Collection Abstractions for User-Implemented Data-Parallel Programming." Scientific Programming 8, no. 4 (2000): 231–46. http://dx.doi.org/10.1155/2000/485607.

Abstract:

Data parallelism has appeared as a fruitful approach to the parallelisation of compute-intensive programs. Data parallelism has the advantage of mimicking the sequential (and deterministic) structure of programs as opposed to task parallelism, where the explicit interaction of processes has to be programmed. In data parallelism data structures, typically collection classes in the form of large arrays, are distributed on the processors of the target parallel machine. Trying to extract distribution aspects from conventional code often runs into problems with a lack of uniformity in the use of the data structures and in the expression of data dependency patterns within the code. Here we propose a framework with two conceptual classes, Machine and Collection. The Machine class abstracts hardware communication and distribution properties. This gives a programmer high-level access to the important parts of the low-level architecture. The Machine class may readily be used in the implementation of a Collection class, giving the programmer full control of the parallel distribution of data, as well as allowing normal sequential implementation of this class. Any program using such a collection class will be parallelisable, without requiring any modification, by choosing between sequential and parallel versions at link time. Experiments with a commercial application, built using the Sophus library which uses this approach to parallelisation, show good parallel speed-ups, without any adaptation of the application program being needed.


6

Peakman, Aiden, Thomas Bennett, Kerr Fitzgerald, Robert Gregg, and Glyn Rossiter. "NEXUS FRAMEWORK FOR WHOLE-CORE FUEL PERFORMANCE: CURRENT APPLICATIONS AND FUTURE TRENDS." EPJ Web of Conferences 247 (2021): 12001. http://dx.doi.org/10.1051/epjconf/202124712001.

Abstract:

Current industry practice in fuel licensing often relies on thermo-mechanical modeling of a fuel rod with an artificially constructed bounding power history. The benefit of this approach is that it is computationally efficient; however, the drawbacks are that 1) such an approach is not always conservative, for instance when modelling phenomena related to late onset pellet-clad gap closure; and 2) it can poorly estimate available safety margins for fuel operating at high local power densities and/or to high burnup. For these reasons NNL developed an in-house whole-core fuel performance framework – NEXUS – to enable modelling of all fuel rods in the core using the ENIGMA fuel performance code and computed power histories from core simulation packages (currently limited to PARCS or SIMULATE). One of the main objectives was to create a tool that was both computationally efficient and user friendly. The former was achieved by making use of parallelisable architecture, while the latter was achieved by minimising necessary user input and providing tools for easy interrogation of the fuel performance output. NEXUS has been applied to several LWR operational scenarios, which we summarise in this paper, including steady-state operation of an ABWR, and a rod ejection accident in a small modular soluble boron free PWR and a GWe-class PWR. We also summarise current development activities related to integrating NNL’s in-house fuel performance Monte Carlo uncertainty analysis software CASINO into the NEXUS framework.


7

Jiang, Xinyu, Chenfei Ma, and Kianoush Nazarpour. "Posture-invariant myoelectric control with self-calibrating random forests." Frontiers in Neurorobotics 18 (December 4, 2024). https://doi.org/10.3389/fnbot.2024.1462023.

Abstract:

Introduction: Myoelectric control systems translate different patterns of electromyographic (EMG) signals into the control commands of diverse human-machine interfaces via hand gesture recognition, enabling intuitive control of prosthesis and immersive interactions in the metaverse. The effect of arm position is a confounding factor leading to the variability of EMG characteristics. Developing a model with its characteristics and performance invariant across postures, could largely promote the translation of myoelectric control into real world practice. Methods: Here we propose a self-calibrating random forest (RF) model which can (1) be pre-trained on data from many users, then one-shot calibrated on a new user and (2) self-calibrate in an unsupervised and autonomous way to adapt to varying arm positions. Results: Analyses on data from 86 participants (66 for pre-training and 20 in real-time evaluation experiments) demonstrate the high generalisability of the proposed RF architecture to varying arm positions. Discussion: Our work promotes the use of simple, explainable, efficient and parallelisable model for posture-invariant myoelectric control.


8

Tang, Duowei, Peter Kuppens, Luc Geurts, and Toon van Waterschoot. "End-to-end speech emotion recognition using a novel context-stacking dilated convolution neural network." EURASIP Journal on Audio, Speech, and Music Processing 2021, no. 1 (2021). http://dx.doi.org/10.1186/s13636-021-00208-5.

Abstract:

Amongst the various characteristics of a speech signal, the expression of emotion is one of the characteristics that exhibits the slowest temporal dynamics. Hence, a performant speech emotion recognition (SER) system requires a predictive model that is capable of learning sufficiently long temporal dependencies in the analysed speech signal. Therefore, in this work, we propose a novel end-to-end neural network architecture based on the concept of dilated causal convolution with context stacking. Firstly, the proposed model consists only of parallelisable layers and is hence suitable for parallel processing, while avoiding the inherent lack of parallelisability occurring with recurrent neural network (RNN) layers. Secondly, the design of a dedicated dilated causal convolution block allows the model to have a receptive field as large as the input sequence length, while maintaining a reasonably low computational cost. Thirdly, by introducing a context stacking structure, the proposed model is capable of exploiting long-term temporal dependencies hence providing an alternative to the use of RNN layers. We evaluate the proposed model in SER regression and classification tasks and provide a comparison with a state-of-the-art end-to-end SER model. Experimental results indicate that the proposed model requires only 1/3 of the number of model parameters used in the state-of-the-art model, while also significantly improving SER performance. Further experiments are reported to understand the impact of using various types of input representations (i.e. raw audio samples vs log mel-spectrograms) and to illustrate the benefits of an end-to-end approach over the use of hand-crafted audio features. Moreover, we show that the proposed model can efficiently learn intermediate embeddings preserving speech emotion information.


Theses on the topic "Architecture parallelisante"

1

Louetsi, Kenelm. "Un environnement de développement d'applications sur un processeur à beaucoup de cœurs parallélisant." Electronic Thesis or Diss., Perpignan, 2024. http://www.theses.fr/2024PERP0024.

Abstract:

Digital objects of the future (domestic robots, autonomous vehicles, automatic spacecraft, ...) will need both computing power and safety. The Little Big Processor (LBP) is suitable for this challenge: it has an innovative approach to parallelism which offers the advantages of computing power while guaranteeing a certain determinism of execution. This execution determinism brings a level of operational safety essential in most devices interacting with the world and humans. In the present thesis we created a development environment for LBP, with a compiler, a loader and a debugger. These tools are classic but in this case, they will have to be adapted to the implementation of parallelized OpenMP applications for LBP. Following the creation of the development environment, we defined a deterministic parallel model for embedded bare-metal. This model has been evaluated on an embedded bare-metal platform, and this allowed us to confirm that it is possible to have a deterministic parallel execution which keeps the performance speedups from parallelism.
