Selected scientific literature on the topic "Accélération graphique"
Consult the list of current articles, books, theses, conference proceedings and other scholarly sources on the topic "Accélération graphique".
Journal articles on the topic "Accélération graphique"
Van Straaten, A., P. Sabatier, J. Feydy, and A.-S. Jannot. "Accélération des calculs à l'aide de cartes graphiques pour la détection de signaux de pharmacovigilance sur le Système national des données de santé : le package survivalGPU". Revue d'Épidémiologie et de Santé Publique 71 (March 2023): 101467. http://dx.doi.org/10.1016/j.respe.2023.101467.
Theses on the topic "Accélération graphique"
Boyer, Vincent. "Pour une palette graphique performante : accélération d'algorithmes fondamentaux". Paris 8, 2001. http://www.theses.fr/2001PA081842.
Maneval, Daniel. "Conception d'un formalisme de pouvoir d'arrêt équivalent et accélération graphique : des simulations Monte Carlo plus efficaces en protonthérapie". Doctoral thesis, Université Laval, 2019. http://hdl.handle.net/20.500.11794/34601.
In radiotherapy, treatment planning is the optimization of the beam ballistics to deliver the prescribed dose to the treated lesions while minimizing the collateral dose received by healthy tissue. The dose calculation algorithm is at the heart of this numerical simulation; it must be both precise and computationally efficient. The antagonism between these two features has led to the development of fast analytical algorithms whose dosimetric accuracy has now reached its limit. The accuracy of the dose calculation is particularly important in proton therapy, where it is needed to fully exploit the ballistic potential of protons. Monte Carlo proton transport is the most accurate method but also the least efficient. This thesis deals with the development of a Monte Carlo dose calculation platform efficient enough to be considered for routine clinical use.

The main objective of the project is to accelerate Monte Carlo proton transport without compromising the precision of the dose deposition. Two lines of research were pursued. The first was to establish a new variance reduction technique, the equivalent restricted stopping power formalism (Leq formalism), which improves the algorithmic time complexity of the transport step from linear (O(n)) in current Monte Carlo codes to constant (O(1)). The second focused on the use of graphics processing units to improve the execution speed of Monte Carlo proton transport. The developed platform, named pGPUMCD, transports protons on graphics processors through a voxelized geometry, combining condensed-history and discrete interaction techniques. Inelastic interactions with sub-threshold energy transfers are modeled as continuous proton slowing-down using the Leq formalism, with energy straggling taken into account; elastic interactions are based on multiple Coulomb scattering. The discrete interactions are the inelastic interactions and the nuclear elastic and non-elastic proton-nucleus interactions. pGPUMCD is compared to Geant4 and the implemented physical processes are validated one by one. For dose calculation in a clinical context, 27 materials are defined for tissue segmentation from the CT scan.

The dosimetric accuracy of the Leq formalism is better than 0.31% for various materials ranging from water to gold. Its intrinsic efficiency gain factors are greater than 30, and between 100 and 630 for a similar dosimetric accuracy. Combined with the GPU acceleration, the efficiency gain exceeds 10⁵. Dose differences between pGPUMCD and Geant4 are smaller than 1% in the Bragg peak region and below 3% in its distal fall-off for the different simulation configurations, with homogeneous phantoms and clinical cases. In addition, 99.5% of the dose points pass the 1% criterion and the predicted ranges match those of Geant4 to within 0.1%. The computing times of pGPUMCD are below 0.5 seconds per million transported protons, compared to several hours with Geant4. These dosimetric and efficiency performances make pGPUMCD a good candidate for use in a clinical dose planning environment, with the expected medical benefit of better control of the delivered dose, allowing significant reductions in treatment margins and toxicity.
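To give a concrete (and deliberately simplified) picture of the computation pGPUMCD parallelizes, the sketch below advances a batch of protons through water with a toy continuous-slowing-down step. The Bragg-Kleeman range fit and the straggling width are textbook approximations chosen only to make the example self-contained; this is neither the Leq formalism nor the thesis's code.

```python
import numpy as np

# Toy condensed-history transport: each proton loses energy continuously,
# with Gaussian straggling around the mean loss. Constants are approximate
# Bragg-Kleeman parameters for water (range R = ALPHA * E**P), not thesis data.
ALPHA, P = 0.0022, 1.77  # cm and MeV units

def stopping_power(E):
    """Mean stopping power dE/dx (MeV/cm) implied by the range fit."""
    return E ** (1.0 - P) / (ALPHA * P)

def csda_step(E, step_cm, rng):
    """One step: mean energy loss plus a crude Gaussian straggling term."""
    mean_loss = stopping_power(E) * step_cm
    return np.maximum(E - rng.normal(mean_loss, 0.1 * mean_loss), 0.0)

rng = np.random.default_rng(0)
E = np.full(100_000, 150.0)        # 1e5 protons at 150 MeV
for _ in range(160):               # ~16 cm of water in 1 mm steps
    alive = E > 1.0                # stop tracking below 1 MeV
    E[alive] = csda_step(E[alive], 0.1, rng)
```

Each proton evolves independently of the others, which is why a one-thread-per-proton mapping onto the GPU suits this transport loop so well.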
Delestrac, Paul. "Advanced Profiling Techniques For Evaluating GPU Computing Efficiency Executing ML Applications". Electronic Thesis or Diss., Université de Montpellier (2022-....), 2024. http://www.theses.fr/2024UMONS014.
The rising complexity of Artificial Intelligence (AI) applications significantly increases the demand for computing power to execute and train Machine Learning (ML) models, thus boosting the energy consumption of data centers. GPUs, enhanced by developments like tensor cores (2017), have become the preferred architecture. Building more efficient ML computing systems relies on a deep understanding of the limits of both parts of a tightly coupled hardware/software paradigm. However, the high abstraction level of ML frameworks and the closed-source, proprietary design of state-of-the-art GPU architectures obscure the execution process and make performance evaluation tedious.

The main goal of this thesis is to provide new methodologies to evaluate performance and energy bottlenecks of GPU-accelerated ML workloads. Existing profiling solutions are limited in three ways. First, ML framework profiling tools are designed to assist the development of ML models but give no insight into the runtime execution of the ML framework itself; while they provide high-level metrics on GPU device execution, these metrics can be misleading and overestimate the utilization of GPU resources. Second, lower-level profiling tools provide access to performance counters and insights on how to optimize GPU kernels, but cannot capture the efficiency of the host/device interactions occurring at a higher level. Finally, when evaluating energy bottlenecks, the aforementioned profiling tools cannot provide a detailed breakdown of the energy consumed by modern GPUs during ML training. To tackle these shortcomings, this thesis makes three key contributions, organized as a top-down analysis of GPU-accelerated ML workloads.

First, we analyze ML frameworks' runtime execution on a CPU-GPU tandem. We propose a new profiling methodology that leverages data from an ML framework's profiler, and we use it to provide new insights into the runtime execution of inference for three ML models. Our results show that GPU kernels must run long enough to hide the runtime overhead of the ML framework and thereby increase GPU utilization; this push for longer kernel execution, however, leads to the use of bigger batches of data, seemingly driving the need for more GPU memory.

Second, we analyze the utilization of GPU resources during ML training. We propose a new profiling methodology that combines high-level and low-level profilers to provide new insights into the utilization of the GPU's inner components. Our experiments on two modern GPUs suggest that bigger GPU memory helps enhance throughput and high-level utilization. However, our results also suggest that a plateau has been reached, eliminating the push for bigger batches. Furthermore, we observe that the fastest GPU cores (tensor cores) are idle most of the time, and that the tested workloads are now limited by kernels that do not use these cores. Our results thus suggest that the current GPU paradigm is reaching a saturation point.

Finally, we analyze the energy consumption of GPUs during ML training. We propose an energy model and a calibration methodology that uses microbenchmarks to provide a breakdown of GPU energy consumption, and we implement and validate this approach on a modern NVIDIA GPU. Our results suggest that data movement is responsible for most of the energy consumption (up to 84% of the dynamic energy consumption of the GPU).
This further motivates the push for newer architectures that optimize memory accesses (e.g., processing in/near memory, vectorized architectures). This thesis thus provides a comprehensive analysis of the performance and energy bottlenecks of GPU-accelerated ML workloads. We believe our contributions uncover some of the limitations of current GPU architectures and motivate the need for more advanced profiling techniques to design more efficient ML accelerators, and we hope that this work will inspire future research in this direction.
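As a minimal illustration of the two profiling levels the thesis combines, the sketch below uses an ML framework's own profiler (PyTorch's torch.profiler) on a placeholder model; in a methodology like the thesis's, such framework-level traces would be cross-checked against lower-level hardware counters (e.g., NVIDIA's vendor tools), not shown here. The model, batch size and iteration count are arbitrary assumptions.

```python
import torch
from torch.profiler import profile, ProfilerActivity

# Placeholder workload; requires a CUDA-capable GPU.
model = torch.nn.Sequential(
    torch.nn.Linear(4096, 4096), torch.nn.ReLU(), torch.nn.Linear(4096, 10)
).cuda()
x = torch.randn(256, 4096, device="cuda")

# Framework-level trace: per-operator CPU time vs CUDA kernel time.
with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA]) as prof:
    with torch.no_grad():
        for _ in range(10):
            model(x)

# Kernels that are short relative to host-side overhead show up as small CUDA
# totals next to large CPU totals -- the "misleading utilization" effect
# described in the abstract above.
print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
```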
Rubez, Gaëtan. "Accélération des calculs en Chimie théorique : l'exemple des processeurs graphiques". Thesis, Reims, 2018. http://www.theses.fr/2018REIMS002/document.
In this research work we are interested in the use of the manycore technology of graphics cards within approaches coming from the field of Theoretical Chemistry, and we argue that Theoretical Chemistry needs to be able to take advantage of graphics cards. We show both the feasibility and the limits of using graphics cards in theoretical chemistry through two GPU ports of different approaches.

We first base our work on a GPU implementation of the NCIplot program, distributed since 2011 by Julia Contreras-García, which implements the NCI methodology published in 2010. The NCI approach proves to be an ideal candidate for graphics cards, as shown by our analysis of the NCIplot program and by the performance achieved by our GPU implementations. Our best implementation (VHY) achieves speed-ups of up to 100 over the NCIplot program, and we currently distribute it freely as the cuNCI program.

The second GPU-accelerated work is based on GAMESS-US, a free competitor of GAUSSIAN developed internationally, which implements many quantum methods. We were interested in the simultaneous use of the DFTB, FMO and PCM methods. This setting is less favorable to graphics cards; nevertheless, we were able to accelerate the part offloaded to two K20X graphics cards.
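The quantity at the heart of the NCI method is the reduced density gradient, s(r) = |∇ρ(r)| / (2(3π²)^(1/3) ρ(r)^(4/3)), evaluated independently at every point of a grid, which is exactly what makes the approach GPU-friendly. A minimal NumPy sketch on a synthetic density follows; it is not cuNCI's code, and a real calculation would build ρ from promolecular or wavefunction data.

```python
import numpy as np

def reduced_density_gradient(rho, spacing):
    """s = |grad rho| / (2 (3 pi^2)^(1/3) rho^(4/3)) on a regular grid."""
    gx, gy, gz = np.gradient(rho, spacing)
    c = 2.0 * (3.0 * np.pi ** 2) ** (1.0 / 3.0)
    return np.sqrt(gx**2 + gy**2 + gz**2) / (c * rho ** (4.0 / 3.0))

# Synthetic density: two Gaussian "atoms" on a 64^3 grid (placeholder data).
ax = np.linspace(-4.0, 4.0, 64)
X, Y, Z = np.meshgrid(ax, ax, ax, indexing="ij")
rho = (np.exp(-((X - 1) ** 2 + Y**2 + Z**2))
       + np.exp(-((X + 1) ** 2 + Y**2 + Z**2)))
s = reduced_density_gradient(rho, ax[1] - ax[0])
```

Every grid point is independent of the others, so the computation maps directly onto one GPU thread per point.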
Cunat, Christophe. "Accélération matérielle pour le rendu de scènes multimédia vidéo et 3D". PhD thesis, Télécom ParisTech, 2004. http://tel.archives-ouvertes.fr/tel-00077593.
This thesis addresses the composition of visual objects that may be of different natures (video sequences, still images, 3D synthetic objects, etc.). The computing power required to perform this composition, however, remains prohibitive without dedicated hardware accelerators and becomes critical in the context of a portable terminal.
An algorithmic and architectural review of the different domains is carried out in order to highlight both their points of convergence and their differences. Three (interdependent) lines of reflection are then discussed, concerning data representation, data access and the organization of processing.
These reflections are then applied to the concrete case of a portable terminal for labiophony: a telephony application in which the speaker's face is reconstructed from a triangle mesh and texture mapping. A single image-compositing architecture capable of handling these visual objects uniformly is then defined. Finally, the synthesis of this operator on a prototyping platform allows a comparison with existing solutions, most of which appeared during the course of this thesis.
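To make the texture-mapping step concrete, here is a minimal software rasterizer for one triangle, interpolating UV coordinates with barycentric weights and doing a nearest-neighbour texture lookup. It is a didactic stand-in for the per-pixel operation such a compositor accelerates, not the architecture designed in the thesis.

```python
import numpy as np

def raster_triangle(img, tex, xy, uv):
    """Fill one 2D triangle in img, sampling tex via barycentric UVs."""
    (x0, y0), (x1, y1), (x2, y2) = xy
    area = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)  # signed area * 2
    th, tw = tex.shape[:2]
    for y in range(int(min(y0, y1, y2)), int(max(y0, y1, y2)) + 1):
        for x in range(int(min(x0, x1, x2)), int(max(x0, x1, x2)) + 1):
            # Barycentric coordinates of the pixel.
            w1 = ((x - x0) * (y2 - y0) - (x2 - x0) * (y - y0)) / area
            w2 = ((x1 - x0) * (y - y0) - (x - x0) * (y1 - y0)) / area
            w0 = 1.0 - w1 - w2
            if min(w0, w1, w2) < 0.0:
                continue  # pixel lies outside the triangle
            u = w0 * uv[0][0] + w1 * uv[1][0] + w2 * uv[2][0]
            v = w0 * uv[0][1] + w1 * uv[1][1] + w2 * uv[2][1]
            img[y, x] = tex[int(v * (th - 1)), int(u * (tw - 1))]

img = np.zeros((64, 64, 3), dtype=np.uint8)
tex = np.random.randint(0, 255, (16, 16, 3), dtype=np.uint8)
raster_triangle(img, tex,
                [(5, 5), (60, 10), (30, 55)],        # screen coordinates
                [(0, 0), (1, 0), (0.5, 1)])          # UV coordinates
```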
Mena, morales Valentin. "Approche de conception haut-niveau pour l'accélération matérielle de calcul haute performance en finance". Thesis, Ecole nationale supérieure Mines-Télécom Atlantique Bretagne Pays de la Loire, 2017. http://www.theses.fr/2017IMTA0018/document.
The need for resources in High Performance Computing (HPC) is generally met by scaling up server farms, to the detriment of the energy consumption of such a solution. Accelerating HPC applications on heterogeneous platforms, such as FPGAs or GPUs, offers a better architectural compromise, as they can reduce the energy consumption of a deployed system. A change of programming paradigm is therefore needed to support this heterogeneous acceleration, which translates into an increased level of programming complexity for software experts. This is most notably the case for developers in quantitative finance, where applications constantly evolve and grow in complexity to stay competitive and comply with legislative changes, putting even more pressure on the programmability of acceleration solutions. In this context, the use of high-level development and design flows, such as High-Level Synthesis (HLS) for programming FPGAs, is not enough; a domain-specific approach can help reach performance requirements without impairing the programmability of accelerated applications.

We propose in this thesis a high-level design approach that relies on OpenCL as a heterogeneous programming standard, using a recent implementation of OpenCL for Altera FPGAs. Four main contributions are proposed: (1) an initial study of the integration of hardware computing cores into a software library for quantitative finance (QuantLib); (2) an exploration of different architectures and their respective performance, as well as the design of a dedicated architecture for the pricing of American options and their implied volatility, based on a high-level design flow; (3) a detailed characterization of an Altera OpenCL platform, from elementary operators, memory accesses and control overlays up to the communication links it is made of; (4) a compilation flow specific to the quantitative finance domain, relying on the aforementioned characterization and on the description of the considered financial applications (option pricing).
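For reference, the financial kernel named in contribution (2), American option pricing, can be stated compactly as a Cox-Ross-Rubinstein binomial tree with an early-exercise test at each node. The plain-Python sketch below is only a functional reference for what an accelerated kernel computes, not the thesis's OpenCL design; the parameters are arbitrary.

```python
import math

def american_put(S0, K, r, sigma, T, steps=500):
    """Price an American put on a CRR binomial tree."""
    dt = T / steps
    u = math.exp(sigma * math.sqrt(dt))   # up factor
    d = 1.0 / u                           # down factor
    p = (math.exp(r * dt) - d) / (u - d)  # risk-neutral up probability
    disc = math.exp(-r * dt)
    # Payoffs at maturity, node j = number of up-moves.
    values = [max(K - S0 * u**j * d**(steps - j), 0.0)
              for j in range(steps + 1)]
    # Backward induction with the early-exercise test.
    for i in range(steps - 1, -1, -1):
        for j in range(i + 1):
            cont = disc * (p * values[j + 1] + (1 - p) * values[j])
            exercise = K - S0 * u**j * d**(i - j)
            values[j] = max(cont, exercise)
    return values[0]

print(american_put(S0=100.0, K=100.0, r=0.05, sigma=0.2, T=1.0))
```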
Epstein, Emric. "Utilisation de miroirs dans un système de reconstruction interactif". Thesis, 2004. http://hdl.handle.net/1866/16668.
Testo completoNassiri, Moulay Ali. "Les algorithmes de haute résolution en tomographie d'émission par positrons : développement et accélération sur les cartes graphiques". Thèse, 2015. http://hdl.handle.net/1866/12353.
Positron emission tomography (PET) is a molecular imaging modality that uses radiotracers labeled with positron-emitting isotopes in order to quantify many biological processes. The clinical applications of this modality are largely in oncology, but it has the potential to become a reference exam for many diseases in cardiology, neurology and pharmacology, since it intrinsically offers functional information on cellular metabolism with good sensitivity. The principal limitations of this modality are its limited spatial resolution and the limited accuracy of its quantification. To overcome these limits, recent PET systems use a huge number of small detectors with better performance, and image reconstruction is done using accurate algorithms such as iterative stochastic algorithms. As a consequence, the reconstruction time becomes too long for clinical use, so the acquired data are compressed and accelerated versions of the iterative stochastic algorithms, which are generally non-convergent, are used to perform the reconstruction, compromising the obtained performance. In order to make complex reconstruction algorithms usable in clinical applications on the new PET systems, many previous studies have aimed to accelerate these algorithms on GPU devices. In this thesis, we therefore joined the researchers' effort to develop, and introduce into routine clinical use, accurate reconstruction algorithms that improve the spatial resolution and the accuracy of quantification in PET.

We first worked to develop new strategies for accelerating list-mode reconstruction on GPU devices. This acquisition mode offers many advantages over histogram mode, such as motion correction, the possibility of using time-of-flight (TOF) information to improve quantification accuracy, and the possibility of using temporal basis functions to perform 4D reconstruction and extract kinetic parameters with better accuracy directly from the acquired data. One of the main obstacles limiting the use of list-mode reconstruction in clinical routine, however, is its relatively long reconstruction time. To overcome this obstacle we:

- developed a new strategy to accelerate on GPU devices the fully 3D list-mode ordered-subset expectation-maximization (LM-OSEM) algorithm, including the calculation of the sensitivity matrix that accounts for the patient-specific attenuation and normalization corrections. The reported reconstruction times are not only compatible with clinical use of 3D LM-OSEM algorithms, but also let us envision fast reconstructions for advanced PET applications such as real-time dynamic studies and parametric image reconstructions;

- developed and implemented on GPU a multigrid/multiframe approach to an expectation-maximization algorithm for list-mode acquisitions (MGMF-LMEM), the objective being to accelerate the gold-standard LMEM (list-mode expectation-maximization) algorithm, which converges slowly. The GPU-based MGMF-LMEM algorithm processes data at a rate close to one million events per second per iteration and permits near real-time reconstructions for large acquisitions or low-count acquisitions such as gated studies.

Moreover, for clinical use, quantification is often done from acquired data organized in sinograms, which are generally compressed in order to accelerate reconstruction.
However, previous work has shown that this compression decreases the accuracy of quantification and the spatial resolution. Since ordered-subset expectation-maximization (OSEM) is the most widely used sinogram-based reconstruction algorithm in the clinic, we parallelized and implemented the attenuation-weighted line-of-response OSEM (AW-LOR-OSEM) algorithm, which reconstructs PET images from sinograms without any data compression and incorporates the attenuation and normalization corrections in the sensitivity matrices as weight factors. We compared two implementation strategies: in the first, the system matrix (SM) is calculated on the fly during the reconstruction, while the second uses a more accurate pre-calculated SM. The results show that computational efficiency is about twice as high for the on-the-fly implementation as for the pre-calculated one, but the reported reconstruction times are compatible with clinical use for both strategies.
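The algorithms named above share one multiplicative update, x_{k+1} = x_k / (Aᵀ1) · Aᵀ(y / (A x_k)), where A is the system matrix and y the measured coincidences; OSEM simply applies it to ordered subsets of the data. A toy dense-matrix MLEM in NumPy follows (a real PET system matrix is enormous and sparse, hence the on-the-fly versus pre-calculated comparison in the thesis):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((2000, 400))       # toy system matrix: LORs x voxels
x_true = rng.random(400)
y = rng.poisson(A @ x_true)       # noisy measured counts per LOR

sens = A.T @ np.ones(A.shape[0])  # sensitivity image A^T 1
x = np.ones(400)                  # uniform initial image
for _ in range(50):               # MLEM iterations
    proj = A @ x                  # forward projection
    ratio = np.divide(y, proj, out=np.zeros_like(proj), where=proj > 0)
    x *= (A.T @ ratio) / sens     # multiplicative EM update
```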
Choquette, Guillaume-Olivier. "Une approche conceptuelle pour l'interprétation des graphiques en cinématique au secondaire". Thesis, 2008. http://hdl.handle.net/1866/3451.
The goal of this study is to determine whether a guided-inquiry laboratory (as a teaching complement based on a conceptual approach) allows secondary five students to understand kinematics notions better than an expository laboratory does. It belongs to a series of college and university studies on teaching approaches that use laboratories to convey physics concepts in mechanics (McDermott, 1996; Beichner, 1994). The guided-inquiry laboratory is associated with a conceptual approach based on qualitative reasoning, whereas the expository laboratory is associated with a traditional approach to teaching physics. The Test of Understanding Graphs in Kinematics (TUG-K) (Beichner, 1994) and individual interviews were used to evaluate the understanding of kinematics concepts. The study first shows that a guided-inquiry approach is an effective method for teaching most kinematics notions. Comparing the results from two groups of 38 students, it indicates that a conceptual-approach laboratory is better than an expository laboratory for students' long-term understanding of acceleration notions.