Contents
Academic literature on the topic 'Optimisations pour GPU'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Optimisations pour GPU.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Press on it, and we will generate automatically the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as pdf and read online its abstract whenever available in the metadata.
Journal articles on the topic "Optimisations pour GPU"
Ramesh, Vishal Avinash, Ehsan Nikbakht Jarghouyeh, Ahmed Saleh Alraeeini, and Amin Al-Fakih. "Optimisation Investigation and Bond-Slip Behaviour of High Strength PVA-Engineered Geopolymer Composite (EGC) Cured in Ambient Temperatures." Buildings 13, no. 12 (December 4, 2023): 3020. http://dx.doi.org/10.3390/buildings13123020.
Full textAddonizio, Maria Luisa, and Luigi Fusco. "Adhesion and Barrier Properties Analysis of Silica-Like Thin Layer on Polyethylene Naphthalate Substrates for Thin Film Solar Cells." Advances in Science and Technology 74 (October 2010): 113–18. http://dx.doi.org/10.4028/www.scientific.net/ast.74.113.
Full textWang, Hao, Ce Yu, Bo Zhang, Jian Xiao, and Qi Luo. "HCGrid: A convolution-based gridding framework for radio astronomy in hybrid computing environments." Monthly Notices of the Royal Astronomical Society, December 10, 2020. http://dx.doi.org/10.1093/mnras/staa3800.
Full textMarignier, Augustin, Jason D. McEwen, Ana M. G. Ferreira, and Thomas D. Kitching. "Posterior sampling for inverse imaging problems on the sphere in seismology and cosmology." RAS Techniques and Instruments, January 2, 2023. http://dx.doi.org/10.1093/rasti/rzac010.
Full textDissertations / Theses on the topic "Optimisations pour GPU"
Romera, Thomas. "Adéquation algorithme architecture pour flot optique sur GPU embarqué." Electronic Thesis or Diss., Sorbonne université, 2023. http://www.theses.fr/2023SORUS450.
Full textThis thesis focus on the optimization and efficient implementation of pixel motion (optical flow) estimation algorithms on embedded graphics processing units (GPUs). Two iterative algorithms have been studied: the Total Variation - L1 (TV-L1) method and the Horn-Schunck method. The primary objective of this work is to achieve real-time processing, with a target frame processing time of less than 40 milliseconds, on low-power platforms, while maintaining acceptable image resolution and flow estimation quality for the intended applications. Various levels of optimization strategies have been explored. High-level algorithmic transformations, such as operator fusion and operator pipelining, have been implemented to maximize data reuse and enhance spatial/temporal locality. Additionally, GPU-specific low-level optimizations, including the utilization of vector instructions and numbers, as well as efficient memory access management, have been incorporated. The impact of floating-point number representation (single-precision versus half-precision) has also been investigated. The implementations have been assessed on Nvidia's Jetson Xavier, TX2, and Nano embedded platforms in terms of execution time, power consumption, and optical flow accuracy. Notably, the TV-L1 method exhibits higher complexity and computational intensity compared to Horn-Schunck. The fastest versions of these algorithms achieve a processing rate of 0.21 nanoseconds per pixel per iteration in half-precision on the Xavier platform, representing a 22x time reduction over efficient and parallel CPU versions. Furthermore, energy consumption is reduced by a factor of x5.3. Among the tested boards, the Xavier embedded platform, being both the most powerful and the most recent, consistently delivers the best results in terms of speed and energy efficiency. Operator merging and pipelining have proven to be instrumental in improving GPU performance by enhancing data reuse. This data reuse is made possible through GPU Shared memory, which is a small, high-speed memory that enables data sharing among threads within the same GPU thread block. While merging multiple iterations yields performance gains, it is constrained by the size of the Shared memory, necessitating trade-offs between resource utilization and speed. The adoption of half-precision numbers accelerates iterative algorithms and achieves superior optical flow accuracy within the same time frame compared to single-precision counterparts. Half-precision implementations converge more rapidly due to the increased number of iterations possible within a given time window. Specifically, the use of half-precision numbers on the best GPU architecture accelerates execution by up to x2.2 for TV-L1 and x3.7 for Horn-Schunck. This work underscores the significance of both GPU-specific optimizations for computer vision algorithms, along with the use and study of reduced floating point numbers. They pave the way for future enhancements through new algorithmic transformations, alternative numerical formats, and hardware architectures. This approach can potentially be extended to other families of iterative algorithms
Chrétien, Benjamin. "Optimisation semi-infinie sur GPU pour le contrôle corps-complet de robots." Thesis, Montpellier, 2016. http://www.theses.fr/2016MONTT315/document.
Full textA humanoid robot is a complex system with numerous degrees of freedom, whose behavior is subject to the nonlinear equations of motion. As a result, planning its motion is a difficult task from a computational perspective.In this thesis, we aim at developing a method that can leverage the computing power of GPUs in the context of optimization-based whole-body motion planning. We first exhibit the properties of the optimization problem, and show that several avenues can be exploited in the context of parallel computing. Then, we present our approach of the dynamics computation, suitable for highly-parallel processing architectures. Next, we propose a many-core GPU implementation of the motion planning problem. Our approach computes the constraints and their gradients in parallel, and feeds the result to a nonlinear optimization solver running on the CPU. Because each constraint and its gradient can be evaluated independently for each time interval, we end up with a highly parallelizable problem that can take advantage of GPUs. We also propose a new parametrization of contact forces adapted to our optimization problem. Finally, we investigate the extension of our work to model predictive control
Delevacq, Audrey. "Métaheuristiques pour l'optimisation combinatoire sur processeurs graphiques (GPU)." Thesis, Reims, 2013. http://www.theses.fr/2013REIMS011/document.
Full textSeveral combinatorial optimization problems are NP-hard and can only be solved optimally by exact algorithms for small instances. Metaheuristics have proved to be effective in solving many of these problems by finding approximate solutions in a reasonable time. However, dealing with large instances, they may require considerable computation time and amount of memory space to be efficient in the exploration of the search space. Therefore, the interest devoted to their deployment on high performance computing architectures has increased over the past years. Existing parallelization approaches generally follow the message-passing and shared-memory computing paradigms which are suitable for traditional architectures based on microprocessors, also called CPU (Central Processing Unit). However, research in the field of parallel computing is rapidly evolving and new architectures emerge, including hardware accelerators which offloads the CPU of some of its tasks. Among them, graphics processors or GPUs (Graphics Processing Units) have a massively parallel architecture with great potential but also imply new algorithmic and programming challenges. In fact, existing parallelization models of metaheuristics are generally unsuited to computing environments like GPUs. Few works have tackled this subject without providing a comprehensive and fundamental view of it.The general purpose of this thesis is to propose a framework for the effective implementation of metaheuristics on parallel architectures based on GPUs. It begins with a state of the art describing existing works on GPU parallelization of metaheuristics and general classifications of parallel metaheuristics. An original taxonomy is then designed to classify identified implementations and to formalize GPU parallelization strategies in a coherent methodological framework. This thesis also aims to validate this taxonomy by exploiting its main components to propose original parallelization strategies specifically tailored to GPU architectures. Several effective implementations based on Ant Colony Optimization and Iterated Local Search metaheuristics are thus proposed for solving the Travelling Salesman Problem. A structured and thorough experimental study is conducted to evaluate and compare the performance of approaches on criteria related to solution quality and computing time reduction
Claustre, Jonathan. "Modèle particulaire 2D et 3D sur GPU pour plasma froid magnétisé : Application à un filtre magnétique." Phd thesis, Université Paul Sabatier - Toulouse III, 2012. http://tel.archives-ouvertes.fr/tel-00796690.
Full textCodol, Jean-Marie. "Hybridation GPS/Vision monoculaire pour la navigation autonome d'un robot en milieu extérieur." Thesis, Toulouse, INSA, 2012. http://www.theses.fr/2012ISAT0060/document.
Full textWe are witnessing nowadays the importation of ICT (Information and Communications Technology) in robotics. These technologies will give birth, in upcoming years, to the general public service robotics. This future, if realised, shall be the result of many research conducted in several domains: mechatronics, telecommunications, automatics, signal and image processing, artificial intelligence ... One particularly interesting aspect in mobile robotics is hence the simultaneous localisation and mapping problem. Consequently, to access certain informations, a mobile robot has, in many cases, to map/localise itself inside its environment. The following question is then posed: What precision can we aim for in terms of localisation? And at what cost?In this context, one of the objectives of many laboratories indulged in robotics research, and where results impact directly the industry, is the positioning and mapping of the environment. These latter tasks should be precise, adapted everywhere, integrated, low-cost and real-time. The prediction sensors are inexpensive ones, such as a standard GPS (of metric precision), and a set of embeddable payload sensors (e.g. video cameras). These type of sensors constitute the main support in our work.In this thesis, we shed light on the localisation problem of a mobile robot, which we choose to handle with a probabilistic approach. The procedure is as follows: we first define our "variables of interest" which are a set of random variables, and then we describe their distribution laws and their evolution models. Afterwards, we determine a cost function in such a manner to build up an observer (an algorithmic class where the objective is to minimize the cost function).Our contribution consists of using brute GPS measures (brute measures or raw datas are measures issued from code and phase correlation loops, called pseudo-distance measures of code and phase, respectively) for a low-cost navigation, which is precise in an external suburban environment. By implementing the so-called "whole" property of GPS phase ambiguities, we expand the navigation to achieve a GPS-RTK (Real-Time Kinematic) system in a precise and low-cost local differential mode.Our propositions has been validated through experimentations realized on our robotic demonstrator
Crestetto, Anaïs. "Optimisation de méthodes numériques pour la physique des plasmas : application aux faisceaux de particules chargées." Phd thesis, Université de Strasbourg, 2012. http://tel.archives-ouvertes.fr/tel-00735569.
Full textHeiries, Vincent. "Optimisation d'une chaîne de réception pour signaux de radionavigation à porteuse à double décalage (BOC) retenus pour les systèmes GALILEO et GPS modernisé." Toulouse, ISAE, 2007. http://www.theses.fr/2007ESAE0018.
Full textJaeger, Julien. "Transformations source-à-source pour l'optimisation de codes irréguliers et multithreads." Phd thesis, Université de Versailles-Saint Quentin en Yvelines, 2012. http://tel.archives-ouvertes.fr/tel-00842177.
Full textPetit, Eric. "Vers un partitionnement automatique d'applications en codelets spéculatifs pour les systèmes hétérogènes à mémoires distribuées." Phd thesis, Université Rennes 1, 2009. http://tel.archives-ouvertes.fr/tel-00445512.
Full textAhmed, Bacha Adda Redouane. "Localisation multi-hypothèses pour l'aide à la conduite : conception d'un filtre "réactif-coopératif"." Thesis, Evry-Val d'Essonne, 2014. http://www.theses.fr/2014EVRY0051/document.
Full text“ When we use information from one source,it's plagiarism;Wen we use information from many,it's information fusion ”This work presents an innovative collaborative data fusion approach for ego-vehicle localization. This approach called the Optimized Kalman Particle Swarm (OKPS) is a data fusion and an optimized filtering method. Data fusion is made using data from a low cost GPS, INS, Odometer and a Steering wheel angle encoder. This work proved that this approach is both more appropriate and more efficient for vehicle ego-localization in degraded sensors performance and highly nonlinear situations. The most widely used vehicle localization methods are the Bayesian approaches represented by the EKF and its variants (UKF, DD1, DD2). The Bayesian methods suffer from sensitivity to noises and instability for the highly non-linear cases. Proposed for covering the Bayesian methods limitations, the Multi-hypothesis (particle based) approaches are used for ego-vehicle localization. Inspired from monte-carlo simulation methods, the Particle Filter (PF) performances are strongly dependent on computational resources. Taking advantages of existing localization techniques and integrating metaheuristic optimization benefits, the OKPS is designed to deal with vehicles high nonlinear dynamic, data noises and real time requirement. For ego-vehicle localization, especially for highly dynamic on-road maneuvers, a filter needs to be robust and reactive at the same time. The OKPS filter is a new cooperative-reactive localization algorithm inspired by dynamic Particle Swarm Optimization (PSO) metaheuristic methods. It combines advantages of the PSO and two other filters: The Particle Filter (PF) and the Extended Kalman filter (EKF). The OKPS is tested using real data collected using a vehicle equipped with embedded sensors. Its performances are tested in comparison with the EKF, the PF and the Swarm Particle Filter (SPF). The SPF is an interesting particle based hybrid filter combining PSO and particle filtering advantages; It represents the first step of the OKPS development. The results show the efficiency of the OKPS for a high dynamic driving scenario with damaged and low quality GPS data