Selection of scholarly literature on the topic "Algorithmes GPU"

Cite a source in APA, MLA, Chicago, Harvard, and other citation styles

Select a source type:

Consult the lists of current articles, books, dissertations, reports, and other scholarly sources on the topic "Algorithmes GPU".

Next to every work in the bibliography there is an "Add to bibliography" option. Use it, and the bibliographic reference for the chosen work will be formatted automatically in the required citation style (APA, MLA, Harvard, Chicago, Vancouver, etc.).

You can also download the full text of the scholarly publication as a PDF and read an online abstract of the work, provided the relevant parameters are included in its metadata.

Journal articles on the topic "Algorithmes GPU"

1

Boulay, Thomas, Nicolas Gac, Ali Mohammad-Djafari and Julien Lagoutte. „Algorithmes de reconnaissance NCTR et parallélisation sur GPU“. Traitement du signal 30, no. 6 (28.04.2013): 309–42. http://dx.doi.org/10.3166/ts.30.309-342.

2

Rios-Willars, Ernesto, Jennifer Velez-Segura and María Magdalena Delabra-Salinas. „Enhancing Multiple Sequence Alignment with Genetic Algorithms: A Bioinformatics Approach in Biomedical Engineering“. Revista Mexicana de Ingeniería Biomédica 45, no. 2 (01.05.2024): 62–77. http://dx.doi.org/10.17488/rmib.45.2.4.

Abstract:
This study aimed to create a genetic information processing technique for the problem of multiple alignment of genetic sequences in bioinformatics. The objective was to take advantage of the computer hardware's capabilities and analyze the results obtained regarding quality, processing time, and the number of evaluated functions. The methodology was based on developing a genetic algorithm in Java, which resulted in four different versions: Gp1, Gp2, Gp3, and Gp4. A set of genetic sequences was processed, and the results were evaluated by analyzing numerical behavior profiles. The research found that algorithms that maintained diversity in the population produced better-quality solutions, and parallel processing reduced processing time. It was observed that the time required to perform the process decreased, according to the generated performance profile. The study concluded that conventional computer equipment can produce excellent results when processing genetic information if algorithms are optimized to exploit hardware resources. The computational effort of the hardware used is directly related to the number of evaluated functions. Additionally, the comparison method based on the determination of the performance profile is highlighted as a strategy for comparing the algorithm results across different metrics of interest, which can guide the development of more efficient genetic information processing techniques.
3

SOMAN, JYOTHISH, KISHORE KOTHAPALLI and P. J. NARAYANAN. „SOME GPU ALGORITHMS FOR GRAPH CONNECTED COMPONENTS AND SPANNING TREE“. Parallel Processing Letters 20, no. 04 (December 2010): 325–39. http://dx.doi.org/10.1142/s0129626410000272.

Abstract:
Graphics Processing Units (GPUs) are application-specific accelerators that offer a high performance-to-cost ratio and are widely available and used, which makes them a ubiquitous accelerator. The computing paradigm built on them is the general-purpose computing on the GPU (GPGPU) model. Owing to its graphics lineage, the GPU is better suited to data-parallel, data-regular algorithms; its hardware architecture is less suitable for data-parallel but data-irregular algorithms such as graph connected components and list ranking. In this paper, we present results that show how to use GPUs efficiently for graph algorithms that are known to have irregular data access patterns. We consider two fundamental graph problems: finding the connected components and finding a spanning tree. These two problems find applications in several graph-theoretical problems. We arrive at efficient GPU implementations for both, focusing on minimising irregularity at both the algorithmic and the implementation level. Our implementation achieves a speedup of 11-16 times over the corresponding best sequential implementation.
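The authors' method is built on hooking and pointer jumping; as a rough illustration of the irregular, edge-parallel access pattern such graph algorithms produce on a GPU, a minimal label-propagation kernel (a simplified stand-in, not the paper's algorithm) could look like the following CUDA sketch:

    #include <cstdio>
    #include <cuda_runtime.h>

    // One thread per edge: pull the smaller component label across the edge.
    // 'changed' is set whenever any label shrinks, so the host keeps iterating.
    __global__ void propagate(const int* src, const int* dst, int* label,
                              int numEdges, int* changed) {
        int e = blockIdx.x * blockDim.x + threadIdx.x;
        if (e >= numEdges) return;
        int a = label[src[e]], b = label[dst[e]];
        if (a < b)      { atomicMin(&label[dst[e]], a); *changed = 1; }
        else if (b < a) { atomicMin(&label[src[e]], b); *changed = 1; }
    }

    int main() {
        // Tiny graph: edges (0-1), (1-2), (3-4) -> components {0,1,2} and {3,4}.
        int hSrc[] = {0, 1, 3}, hDst[] = {1, 2, 4}, hLabel[] = {0, 1, 2, 3, 4};
        int *dSrc, *dDst, *dLabel, *dChanged;
        cudaMalloc(&dSrc, sizeof hSrc);  cudaMalloc(&dDst, sizeof hDst);
        cudaMalloc(&dLabel, sizeof hLabel); cudaMalloc(&dChanged, sizeof(int));
        cudaMemcpy(dSrc, hSrc, sizeof hSrc, cudaMemcpyHostToDevice);
        cudaMemcpy(dDst, hDst, sizeof hDst, cudaMemcpyHostToDevice);
        cudaMemcpy(dLabel, hLabel, sizeof hLabel, cudaMemcpyHostToDevice);

        int hChanged = 1;
        while (hChanged) {                       // iterate until labels stabilise
            hChanged = 0;
            cudaMemcpy(dChanged, &hChanged, sizeof(int), cudaMemcpyHostToDevice);
            propagate<<<1, 256>>>(dSrc, dDst, dLabel, 3, dChanged);
            cudaMemcpy(&hChanged, dChanged, sizeof(int), cudaMemcpyDeviceToHost);
        }
        cudaMemcpy(hLabel, dLabel, sizeof hLabel, cudaMemcpyDeviceToHost);
        for (int v = 0; v < 5; ++v) printf("vertex %d -> component %d\n", v, hLabel[v]);
        return 0;
    }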
4

Schnös, Florian, Dirk Hartmann, Birgit Obst and Glenn Glashagen. „GPU accelerated voxel-based machining simulation“. International Journal of Advanced Manufacturing Technology 115, no. 1-2 (08.05.2021): 275–89. http://dx.doi.org/10.1007/s00170-021-07001-w.

Abstract:
The simulation of subtractive manufacturing processes has a long history in engineering. Corresponding predictions are utilized for planning, validation and optimization, e.g., of CNC-machining processes. With the rise of flexible robotic machining and the advancements of computational and algorithmic capability, the simulation of the coupled machine-process behaviour for complex machining processes and large workpieces is within reach. These simulations require fast material removal predictions and analysis with high spatial resolution for multi-axis operations. Within this contribution, we propose to leverage voxel-based concepts introduced in the computer graphics industry to accelerate material removal simulations. Corresponding schemes are well suited for massive parallelization. By leveraging the computational power offered by modern graphics hardware, the computational performance of high spatial accuracy volumetric voxel-based algorithms is further improved. They now allow for very fast and accurate volume removal simulation and analysis of machining processes. Within this paper, a detailed description of the data structures and algorithms is provided along with a detailed benchmark for common machining operations.
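As a hedged illustration of the voxel-based material removal idea (not the paper's data structures, which are more elaborate than a dense grid), a CUDA sketch that cuts a spherical tool out of a block of stock might look like this:

    #include <cstdio>
    #include <cuda_runtime.h>

    // Mark every voxel whose centre lies inside a spherical cutter as "removed".
    // A real machining simulation would use a sparse/hierarchical voxel structure;
    // this dense version only illustrates the embarrassingly parallel update.
    __global__ void removeMaterial(unsigned char* solid, int nx, int ny, int nz,
                                   float voxelSize, float cx, float cy, float cz,
                                   float radius) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= nx * ny * nz) return;
        int x = i % nx, y = (i / nx) % ny, z = i / (nx * ny);
        float dx = (x + 0.5f) * voxelSize - cx;
        float dy = (y + 0.5f) * voxelSize - cy;
        float dz = (z + 0.5f) * voxelSize - cz;
        if (dx * dx + dy * dy + dz * dz <= radius * radius)
            solid[i] = 0;                      // voxel cut away by the tool
    }

    int main() {
        const int n = 64;                      // 64^3 workpiece, 1 mm voxels
        size_t bytes = (size_t)n * n * n;
        unsigned char* dSolid;
        cudaMalloc(&dSolid, bytes);
        cudaMemset(dSolid, 1, bytes);          // start from a full block of stock
        int threads = 256, blocks = (n * n * n + threads - 1) / threads;
        // One tool position; a tool path would simply replay this kernel per step.
        removeMaterial<<<blocks, threads>>>(dSolid, n, n, n, 1.0f, 32.0f, 32.0f, 64.0f, 10.0f);
        cudaDeviceSynchronize();
        printf("material removal step done\n");
        cudaFree(dSolid);
        return 0;
    }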
5

Zatolokin, Y. A., E. I. Vatutin and V. S. Titov. „ALGORITHMIC OPTIMIZATION OF SOFTWARE IMPLEMENTATION OF ALGORITHMS FOR MULTIPLYING DENSE REAL MATRICES ON GRAPHICS PROCESSORS WITH OPENGL TECHNOLOGY SUPPORT“. Proceedings of the Southwest State University 21, no. 5 (28.10.2017): 6–15. http://dx.doi.org/10.21869/2223-1560-2017-21-5-06-15.

Abstract:
The article states the problem of matrix multiplication. It shows that, although the problem is simply formulated, solving it efficiently may require both heuristic methods and a set of algorithmic modifications, namely algorithmic and high-level software optimizations that take the particular problem into account and increase multiplication performance. These include a comparative performance analysis with and without GPU-specific optimizations, which showed that computations that do not optimize accesses to global GPU memory achieve low processing performance, whereas optimizing the data distribution across global and local GPU memory reduces computation time and increases real performance. To compare the developed software implementations for the OpenGL and CUDA technologies, identical calculations were performed on identical GPUs, showing higher real performance when using CUDA cores. Specific performance values measured for the multi-threaded GPU implementation are given for all of the described optimizations. The most effective approach is shown to be caching sub-blocks of the matrices (tiles) in the GPU's on-chip local memory, which, with a specialized software implementation, delivers 275.3 GFLOP/s on a GeForce GTX 960M GPU.
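The tiling optimization credited with the 275.3 GFLOP/s result is the standard technique of caching sub-blocks of the matrices in on-chip shared (local) memory. A minimal CUDA sketch of that idea, independent of the article's own OpenGL and CUDA implementations, is shown below:

    #include <cstdio>
    #include <cuda_runtime.h>

    #define TILE 16

    // C = A * B for square n x n matrices. Each block caches a TILE x TILE
    // sub-block ("tile") of A and B in on-chip shared memory, the optimisation
    // the abstract credits with most of the speed-up.
    __global__ void matmulTiled(const float* A, const float* B, float* C, int n) {
        __shared__ float As[TILE][TILE];
        __shared__ float Bs[TILE][TILE];
        int row = blockIdx.y * TILE + threadIdx.y;
        int col = blockIdx.x * TILE + threadIdx.x;
        float acc = 0.0f;
        for (int t = 0; t < n; t += TILE) {
            As[threadIdx.y][threadIdx.x] =
                (row < n && t + threadIdx.x < n) ? A[row * n + t + threadIdx.x] : 0.0f;
            Bs[threadIdx.y][threadIdx.x] =
                (t + threadIdx.y < n && col < n) ? B[(t + threadIdx.y) * n + col] : 0.0f;
            __syncthreads();
            for (int k = 0; k < TILE; ++k)
                acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
            __syncthreads();
        }
        if (row < n && col < n) C[row * n + col] = acc;
    }

    int main() {
        const int n = 512;
        size_t bytes = n * n * sizeof(float);
        float *A, *B, *C;                       // unified memory keeps the host code short
        cudaMallocManaged(&A, bytes); cudaMallocManaged(&B, bytes); cudaMallocManaged(&C, bytes);
        for (int i = 0; i < n * n; ++i) { A[i] = 1.0f; B[i] = 2.0f; }
        dim3 block(TILE, TILE), grid((n + TILE - 1) / TILE, (n + TILE - 1) / TILE);
        matmulTiled<<<grid, block>>>(A, B, C, n);
        cudaDeviceSynchronize();
        printf("C[0] = %.1f (expected %.1f)\n", C[0], 2.0f * n);
        cudaFree(A); cudaFree(B); cudaFree(C);
        return 0;
    }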
6

MERRILL, DUANE, and ANDREW GRIMSHAW. „HIGH PERFORMANCE AND SCALABLE RADIX SORTING: A CASE STUDY OF IMPLEMENTING DYNAMIC PARALLELISM FOR GPU COMPUTING“. Parallel Processing Letters 21, no. 02 (June 2011): 245–72. http://dx.doi.org/10.1142/s0129626411000187.

Abstract:
The need to rank and order data is pervasive, and many algorithms are fundamentally dependent upon sorting and partitioning operations. Prior to this work, GPU stream processors have been perceived as challenging targets for problems with dynamic and global data-dependences such as sorting. This paper presents: (1) a family of very efficient parallel algorithms for radix sorting; and (2) our allocation-oriented algorithmic design strategies that match the strengths of GPU processor architecture to this genre of dynamic parallelism. We demonstrate multiple factors of speedup (up to 3.8x) compared to state-of-the-art GPU sorting. We also reverse the performance differentials observed between GPU and multi/many-core CPU architectures by recent comparisons in the literature, including those with 32-core CPU-based accelerators. Our average sorting rates exceed 1B 32-bit keys/sec on a single GPU microprocessor. Our sorting passes are constructed from a very efficient parallel prefix scan "runtime" that incorporates three design features: (1) kernel fusion for locally generating and consuming prefix scan data; (2) multi-scan for performing multiple related, concurrent prefix scans (one for each partitioning bin); and (3) flexible algorithm serialization for avoiding unnecessary synchronization and communication within algorithmic phases, allowing us to construct a single implementation that scales well across all generations and configurations of programmable NVIDIA GPUs.
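The building block the abstract refers to is the parallel prefix scan. As a simplified, single-block illustration (the paper's scan runtime is multi-block, fused with digit extraction and scattering, and far more elaborate), an inclusive Hillis-Steele scan in CUDA looks roughly like this:

    #include <cstdio>
    #include <cuda_runtime.h>

    // Inclusive prefix scan of up to blockDim.x elements within one thread block
    // (Hillis-Steele). Radix sort uses such scans per digit bucket to compute
    // the scatter offsets of every key; a full sort layers multi-block scans,
    // digit extraction and scatter passes on top of this primitive.
    __global__ void blockScan(const unsigned int* in, unsigned int* out, int n) {
        extern __shared__ unsigned int tmp[];
        int i = threadIdx.x;
        tmp[i] = (i < n) ? in[i] : 0;
        __syncthreads();
        for (int offset = 1; offset < blockDim.x; offset <<= 1) {
            unsigned int add = (i >= offset) ? tmp[i - offset] : 0;
            __syncthreads();
            tmp[i] += add;
            __syncthreads();
        }
        if (i < n) out[i] = tmp[i];
    }

    int main() {
        const int n = 8;
        unsigned int h[n] = {3, 1, 7, 0, 4, 1, 6, 3}, *dIn, *dOut;
        cudaMalloc(&dIn, n * sizeof(unsigned int));
        cudaMalloc(&dOut, n * sizeof(unsigned int));
        cudaMemcpy(dIn, h, n * sizeof(unsigned int), cudaMemcpyHostToDevice);
        blockScan<<<1, n, n * sizeof(unsigned int)>>>(dIn, dOut, n);
        cudaMemcpy(h, dOut, n * sizeof(unsigned int), cudaMemcpyDeviceToHost);
        for (int i = 0; i < n; ++i) printf("%u ", h[i]);   // 3 4 11 11 15 16 22 25
        printf("\n");
        return 0;
    }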
7

Gremse, Felix, Andreas Höfter, Lukas Razik, Fabian Kiessling and Uwe Naumann. „GPU-accelerated adjoint algorithmic differentiation“. Computer Physics Communications 200 (March 2016): 300–311. http://dx.doi.org/10.1016/j.cpc.2015.10.027.

8

Rapaport, D. C. „GPU molecular dynamics: Algorithms and performance“. Journal of Physics: Conference Series 2241, no. 1 (01.03.2022): 012007. http://dx.doi.org/10.1088/1742-6596/2241/1/012007.

Abstract:
A previous study of MD algorithms designed for GPU use is extended to cover more recent developments in GPU architecture. Algorithm modifications are described, together with extensions to more complex systems. New measurements include the effects of increased parallelism on GPU performance, as well as comparisons with multiple-core CPUs using multitasking based on CPU threads and message passing. The results show that the GPU retains a significant performance advantage.
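As a hedged sketch of the thread-per-particle mapping typical of GPU molecular dynamics (the paper's production algorithms use neighbour and cell lists rather than this naive all-pairs loop), a Lennard-Jones force kernel could be written as:

    #include <cstdio>
    #include <cuda_runtime.h>

    // Naive O(N^2) Lennard-Jones force evaluation: one thread accumulates the
    // force on one particle. Production MD codes replace the inner loop with
    // cell/neighbour lists, but the thread-per-particle mapping is the same.
    __global__ void ljForces(const float4* pos, float4* force, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;
        float3 f = make_float3(0.f, 0.f, 0.f);
        float4 pi = pos[i];
        for (int j = 0; j < n; ++j) {
            if (j == i) continue;
            float dx = pi.x - pos[j].x, dy = pi.y - pos[j].y, dz = pi.z - pos[j].z;
            float r2 = dx * dx + dy * dy + dz * dz;
            float inv2 = 1.0f / r2, inv6 = inv2 * inv2 * inv2;
            float fmag = 24.0f * inv2 * inv6 * (2.0f * inv6 - 1.0f); // eps = sigma = 1
            f.x += fmag * dx; f.y += fmag * dy; f.z += fmag * dz;
        }
        force[i] = make_float4(f.x, f.y, f.z, 0.f);
    }

    int main() {
        const int n = 1024;
        float4 *pos, *force;
        cudaMallocManaged(&pos, n * sizeof(float4));
        cudaMallocManaged(&force, n * sizeof(float4));
        for (int i = 0; i < n; ++i)                  // particles on a coarse line
            pos[i] = make_float4(1.2f * i, 0.f, 0.f, 0.f);
        ljForces<<<(n + 255) / 256, 256>>>(pos, force, n);
        cudaDeviceSynchronize();
        printf("force on particle 0: %f\n", force[0].x);
        cudaFree(pos); cudaFree(force);
        return 0;
    }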
9

Mikhayluk, M. V., and A. M. Trushin. „Spheres Collision Detection Algorithms on GPU“. PROGRAMMNAYA INGENERIA 8, no. 8 (15.08.2017): 354–58. http://dx.doi.org/10.17587/prin.8.354-358.

10

Matei, Adrian, Cristian Lupașcu and Ion Bica. „On GPU Implementations of Encryption Algorithms“. Journal of Military Technology 2, no. 2 (18.12.2019): 29–34. http://dx.doi.org/10.32754/jmt.2019.2.04.


Dissertations on the topic "Algorithmes GPU"

1

Ballage, Marion. „Algorithmes de résolution rapide de problèmes mécaniques sur GPU“. Thesis, Toulouse 3, 2017. http://www.theses.fr/2017TOU30122/document.

Abstract:
Dans le contexte de l'analyse numérique en calcul de structures, la génération de maillages conformes sur des modèles à géométrie complexe conduit à des tailles de modèles importantes, et amène à imaginer de nouvelles approches éléments finis. Le temps de génération d'un maillage est directement lié à la complexité de la géométrie, augmentant ainsi considérablement le temps de calcul global. Les processeurs graphiques (GPU) offrent de nouvelles opportunités pour le calcul en temps réel. L'architecture grille des GPU a été utilisée afin d'implémenter une méthode éléments finis sur maillage cartésien. Ce maillage est particulièrement adapté à la parallélisation souhaitée par les processeurs graphiques et permet un gain de temps important par rapport à un maillage conforme à la géométrie. Les formulations de la méthode des éléments finis ainsi que de la méthode des éléments finis étendue ont été reprises afin d'être adaptées à notre méthode. La méthode des éléments finis étendus permet de prendre en compte la géométrie et les interfaces à travers un choix adéquat de fonctions d'enrichissement. Cette méthode discrétise par exemple sans mailler explicitement les fissures, et évite surtout de remailler au cours de leur propagation. Des adaptations de cette méthode sont faites afin de ne pas avoir besoin d'un maillage conforme à la géométrie. La géométrie est définie implicitement par une fonction surfaces de niveau, ce qui permet une bonne approximation de la géométrie et des conditions aux limites sans pour autant s'appuyer sur un maillage conforme. La géométrie est représentée par une fonction surfaces de niveau que nous appelons la densité. La densité est supérieure à 0.5 à l'intérieur du domaine de calcul et inférieure à 0.5 à l'extérieur. Cette fonction densité, définie par ses valeurs aux points noeuds du maillage, est interpolée à l'intérieur de chaque élément. Une méthode d'intégration adaptée à cette représentation géométrique est proposée. En effet, certains éléments sont coupés par la fonction surfaces de niveau et l'intégration de la matrice de raideur ne doit se faire que sur la partie pleine de l'élément. La méthode de quadrature de Gauss qui permet d'intégrer des polynômes de manière exacte n'est plus adaptée. Nous proposons d'utiliser une méthode de quadrature avec des points d'intégration répartis sur une grille régulière et dense. L'intégration peut s'avérer coûteuse en temps de calcul, c'est pour cette raison que nous proposons une technique d'apprentissage donnant la matrice élémentaire de rigidité en fonction des valeurs de la fonction surfaces de niveau aux sommets de l'élément considéré. Cette méthode d'apprentissage permet de grandes améliorations du temps de calcul des matrices élémentaires. Les résultats obtenus après analyse par la méthode des éléments finis standard ou par la méthode des éléments finis sur maillage cartésien ont une taille qui peut croître énormément selon la complexité des modèles, ainsi que la précision des schémas de résolution. Dans un contexte de programmation sur processeurs graphiques, où la mémoire est limitée, il est intéressant d'arriver à compresser ces données. Nous nous sommes intéressés à la compression des modèles et des résultats éléments finis par la transformée en ondelettes. La compression mise en place aidera aussi pour les problèmes de stockage en réduisant la taille des fichiers générés, et pour la visualisation des données
Generating a conformal mesh on complex geometries leads to large model sizes in structural finite element simulations. The meshing time is directly linked to the geometry complexity and can contribute significantly to the total turnaround time. Graphics processing units (GPUs) are highly parallel programmable processors, delivering real performance gains on computationally complex, large problems. GPUs are used to implement a new finite element method on a Cartesian mesh. A Cartesian mesh is well adapted to the parallelism needed by GPUs and reduces the meshing time to almost zero. The novel method relies on the finite element method and the extended finite element formulation. The extended finite element method was introduced in the field of fracture mechanics. It consists in enriching the basis functions to take care of the geometry and the interface. This method does not need a conformal mesh to represent cracks and avoids remeshing during their propagation. Our method is based on the extended finite element method, with an implicitly defined geometry, which allows for a good approximation of the geometry and boundary conditions without a conformal mesh. To represent the model on a Cartesian grid, we use a level set representing a density. This density is greater than 0.5 inside the domain, less than 0.5 outside, and equal to 0.5 on the boundary. A new integration technique adapted to this geometrical representation is proposed. For the elements cut by the level set, only the part full of material has to be integrated, so the Gauss quadrature is no longer adequate. We introduce a quadrature method with integration points on a dense Cartesian grid. In order to reduce the computational effort, a learning approach is then considered to form the elementary stiffness matrices as functions of the density values at the vertices of the elements. This learning method greatly reduces the computation time of the stiffness matrices. Results obtained after analysis by the standard finite element method or by the novel finite element method can have a large storage size, depending on the model complexity and the accuracy of the resolution scheme. Because of the limited memory of graphics processing units, the result data are compressed. We compress the model and the finite element results with a wavelet transform. The compression helps with storage issues and also with data visualization.
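A minimal CUDA sketch of the integration idea described above, sampling the nodal density on a dense grid of quadrature points and keeping only samples with density above 0.5, is given below (a 2D illustration under simplified assumptions, not the thesis implementation):

    #include <cstdio>
    #include <cuda_runtime.h>

    // For each quadrilateral element, sample the nodal "density" level set on a
    // dense regular grid of quadrature points and keep only points where the
    // interpolated density exceeds 0.5 (inside the part). The returned fill
    // fraction weights the element stiffness integral.
    __global__ void fillFraction(const float* nodalDensity, float* fraction,
                                 int numElems, int samplesPerDir) {
        int e = blockIdx.x * blockDim.x + threadIdx.x;
        if (e >= numElems) return;
        float d0 = nodalDensity[4 * e + 0], d1 = nodalDensity[4 * e + 1];
        float d2 = nodalDensity[4 * e + 2], d3 = nodalDensity[4 * e + 3];
        int inside = 0;
        for (int a = 0; a < samplesPerDir; ++a)
            for (int b = 0; b < samplesPerDir; ++b) {
                float u = (a + 0.5f) / samplesPerDir;   // sample point in [0,1]^2
                float v = (b + 0.5f) / samplesPerDir;
                float d = (1 - u) * (1 - v) * d0 + u * (1 - v) * d1
                        + (1 - u) * v * d2 + u * v * d3;  // bilinear interpolation
                if (d > 0.5f) ++inside;
            }
        fraction[e] = (float)inside / (samplesPerDir * samplesPerDir);
    }

    int main() {
        // One element fully inside, one element cut by the boundary.
        float h[8] = {1.f, 1.f, 1.f, 1.f,   0.9f, 0.9f, 0.1f, 0.1f};
        float *dDens, *dFrac;
        cudaMalloc(&dDens, sizeof h); cudaMalloc(&dFrac, 2 * sizeof(float));
        cudaMemcpy(dDens, h, sizeof h, cudaMemcpyHostToDevice);
        fillFraction<<<1, 32>>>(dDens, dFrac, 2, 16);
        float frac[2];
        cudaMemcpy(frac, dFrac, sizeof frac, cudaMemcpyDeviceToHost);
        printf("fill fractions: %.2f %.2f\n", frac[0], frac[1]);  // ~1.00 and ~0.50
        return 0;
    }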
2

Luong, Thé Van. „Métaheuristiques parallèles sur GPU“. Thesis, Lille 1, 2011. http://www.theses.fr/2011LIL10058/document.

Abstract:
Les problèmes d'optimisation issus du monde réel sont souvent complexes et NP-difficiles. Leur modélisation est en constante évolution en termes de contraintes et d'objectifs, et leur résolution est coûteuse en temps de calcul. Bien que des algorithmes approchés telles que les métaheuristiques (heuristiques génériques) permettent de réduire la complexité de leur résolution, ces méthodes restent insuffisantes pour traiter des problèmes de grande taille. Au cours des dernières décennies, le calcul parallèle s'est révélé comme un moyen incontournable pour faire face à de grandes instances de problèmes difficiles d'optimisation. La conception et l'implémentation de métaheuristiques parallèles sont ainsi fortement influencées par l'architecture parallèle considérée. De nos jours, le calcul sur GPU s'est récemment révélé efficace pour traiter des problèmes coûteux en temps de calcul. Cette nouvelle technologie émergente est considérée comme extrêmement utile pour accélérer de nombreux algorithmes complexes. Un des enjeux majeurs pour les métaheuristiques est de repenser les modèles existants et les paradigmes de programmation parallèle pour permettre leurdéploiement sur les accélérateurs GPU. De manière générale, les problèmes qui se posent sont la répartition des tâches entre le CPU et le GPU, la synchronisation des threads, l'optimisation des transferts de données entre les différentes mémoires, les contraintes de capacité mémoire, etc. La contribution de cette thèse est de faire face à ces problèmes pour la reconception des modèles parallèles des métaheuristiques pour permettre la résolution des problèmes d'optimisation à large échelle sur les architectures GPU. Notre objectif est de repenser les modèles parallèles existants et de permettre leur déploiement sur GPU. Ainsi, nous proposons dans ce document une nouvelle ligne directrice pour la construction de métaheuristiques parallèles efficaces sur GPU. Le défi de cette thèse porte sur la conception de toute la hiérarchie des modèles parallèles sur GPU. Pour cela, des approches très efficaces ont été proposées pour l'optimisation des transferts de données entre le CPU et le GPU, le contrôle de threads, l'association entre les solutions et les threads, ou encore la gestion de la mémoire. Les approches proposées ont été expérimentées de façon exhaustive en utilisant cinq problèmes d'optimisation et quatre configurations GPU. En comparaison avec une exécution sur CPU, les accélérations obtenues vont jusqu'à 80 fois plus vite pour des grands problèmes d'optimisation combinatoire et jusqu'à 2000 fois plus vite pour un problème d'optimisation continue. Les différents travaux liés à cette thèse ont fait l'objet d'une douzaine publications comprenant la revue IEEE Transactions on Computers
Real-world optimization problems are often complex and NP-hard. Their modeling is continuously evolving in terms of constraints and objectives, and their resolution is CPU time-consuming. Although near-optimal algorithms such as metaheuristics (generic heuristics) make it possible to reduce the temporal complexity of their resolution, they fail to tackle large problems satisfactorily. Over the last decades, parallel computing has been revealed as an unavoidable way to deal with large problem instances of difficult optimization problems. The design and implementation of parallel metaheuristics are strongly influenced by the computing platform. GPU computing has recently been revealed effective to deal with time-intensive problems. This new emerging technology is believed to be extremely useful to speed up many complex algorithms. One of the major issues for metaheuristics is to rethink existing parallel models and programming paradigms to allow their deployment on GPU accelerators. Generally speaking, the major issues we have to deal with are: the distribution of data processing between CPU and GPU, the thread synchronization, the optimization of data transfer between the different memories, the memory capacity constraints, etc. The contribution of this thesis is to deal with such issues for the redesign of parallel models of metaheuristics to allow solving of large scale optimization problems on GPU architectures. Our objective is to rethink the existing parallel models and to enable their deployment on GPUs. Thereby, we propose in this document a new generic guideline for building efficient parallel metaheuristics on GPU. Our challenge is to come out with the GPU-based design of the whole hierarchy of parallel models. To this end, very efficient approaches are proposed for CPU-GPU data transfer optimization, thread control, mapping of solutions to GPU threads, and memory management. These approaches have been exhaustively experimented using five optimization problems and four GPU configurations. Compared to a CPU-based execution, experiments report up to 80-fold acceleration for large combinatorial problems and up to 2000-fold speed-up for a continuous problem. The different works related to this thesis have been accepted in a dozen publications, including the IEEE Transactions on Computers journal.
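As a small illustration of the "one solution or neighbour per GPU thread" mapping discussed in the thesis (a toy quadratic pseudo-Boolean objective is assumed here for illustration, not one of the five problems actually studied), a neighbourhood-evaluation kernel might look like this:

    #include <cstdio>
    #include <cuda_runtime.h>

    #define N 16   // solution length (kept tiny for the example)

    // One GPU thread evaluates one neighbour of the current binary solution:
    // thread k flips bit k and computes the full quadratic objective
    // f(x) = sum_ij Q[i][j] * x_i * x_j. This is the "one thread = one neighbour"
    // mapping; the CPU then just selects the best move.
    __global__ void evalNeighbors(const float* Q, const int* x, float* fitness) {
        int k = blockIdx.x * blockDim.x + threadIdx.x;
        if (k >= N) return;
        int y[N];
        for (int i = 0; i < N; ++i) y[i] = x[i];
        y[k] = 1 - y[k];                                   // flip one bit
        float f = 0.0f;
        for (int i = 0; i < N; ++i)
            for (int j = 0; j < N; ++j)
                f += Q[i * N + j] * y[i] * y[j];
        fitness[k] = f;
    }

    int main() {
        float hQ[N * N]; int hx[N];
        for (int i = 0; i < N * N; ++i) hQ[i] = (i % 7) - 3.0f;   // arbitrary instance
        for (int i = 0; i < N; ++i) hx[i] = i % 2;                // current solution
        float* dQ; int* dx; float* dFit;
        cudaMalloc(&dQ, sizeof hQ); cudaMalloc(&dx, sizeof hx);
        cudaMalloc(&dFit, N * sizeof(float));
        cudaMemcpy(dQ, hQ, sizeof hQ, cudaMemcpyHostToDevice);
        cudaMemcpy(dx, hx, sizeof hx, cudaMemcpyHostToDevice);
        evalNeighbors<<<1, N>>>(dQ, dx, dFit);
        float hFit[N];
        cudaMemcpy(hFit, dFit, sizeof hFit, cudaMemcpyDeviceToHost);
        int best = 0;
        for (int k = 1; k < N; ++k) if (hFit[k] > hFit[best]) best = k;
        printf("best move: flip bit %d (fitness %.1f)\n", best, hFit[best]);
        return 0;
    }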
3

Viard, Thomas. „Algorithmes de visualisation des incertitudes en géomodélisation sur GPU“. Thesis, Vandoeuvre-les-Nancy, INPL, 2010. http://www.theses.fr/2010INPL042N/document.

Abstract:
En géosciences, la majeure partie du sous-sol est inaccessible à toute observation directe. Seules des informations parcellaires ou imprécises sont donc disponibles lors de la construction ou de la mise à jour de modèles géologiques ; de ce fait, les incertitudes jouent un rôle fondamental en géomodélisation. La théorie des problèmes inverses et les méthodes de simulations stochastiques fournissent un cadre théorique permettant de générer un ensemble de représentations plausibles du sous-sol, également nommées réalisations. En pratique, la forte cardinalité de l'ensemble des réalisations limite significativement tout traitement ou interprétation sur le modèle géologique.L'objectif de cette thèse est de fournir au géologue des algorithmes de visualisation permettant d'explorer, d'analyser et de communiquer les incertitudes spatiales associées à de larges ensembles de réalisations. Nos contributions sont les suivantes : (1) Nous proposons un ensemble de techniques dédiées à la visualisation des incertitudes pétrophysiques. Ces techniques reposent sur une programmation sur carte graphique (GPU) et utilisent une architecture garantissant leur interopérabilité ; (2) Nous proposons deux techniques dédiées à la visualisation des incertitudes structurales, traitant aussi bien les incertitudes géométriques que les incertitudes topologiques (existence de la surface ou interactions avec d'autres surfaces) ; (3) Nous évaluons la qualité des algorithmes de visualisation des incertitudes par le biais de deux études sur utilisateurs, portant respectivement sur la perception des méthodes statiques et par animation. Ces études apportent un éclairage nouveau sur la manière selon laquelle l'incertitude doit être représentée
Most of the subsurface is inaccessible to direct observation in geosciences. Consequently, only local or imprecise data are available when building or updating a geological model; uncertainties are therefore central to geomodeling. The inverse problem theory and the stochastic simulation methods provide a framework for the generation of large sets of likely representations of the subsurface, also termed realizations. In practice, however, the size of the set of realizations severely impacts further interpretation or processing of the geological model. This thesis aims at providing visualization algorithms to expert geologists that allow them to explore, analyze and communicate the spatial uncertainties associated with large sets of realizations. Our contributions are: (1) We propose a set of techniques dedicated to petrophysical uncertainty visualization, based on a GPU programming approach that maintains their interoperability; (2) We propose two techniques dedicated to structural uncertainty visualization that can handle both geometrical and topological uncertainties (e.g., the existence of the surface or its relationships with other surfaces); (3) We assess the quality of our uncertainty visualization algorithms through two user studies, which respectively focus on the perception of static and animated methods. These studies bring new elements on how uncertainty should be represented.
4

Lefèbvre, Matthieu. „Algorithmes sur GPU pour la simulation numérique en mécanique des fluides“. Paris 13, 2012. http://scbd-sto.univ-paris13.fr/intranet/edgalilee_th_2012_lefebvre.pdf.

Abstract:
La simulation numérique en mécanique des fluides nécessite souvent une importante puissance de calcul. Pour accélérer ces simulations l’utilisation de GPU est envisagée. Cette thèse s’intéresse tout d’abord à l’étude d’algorithmes de mécanique des fluides numérique en maillage structuré. La structuration du maillage amène un rangement mémoire naturellement convenable pour effectuer des simulations sur GPU. Ceci permet de dégager des principes d’utilisation sans avoir à considérer le bruit occasionné par la définition d’une structure de données spécifique. Par la suite nous nous concentrons sur les algorithmes de mécanique des fluides numériques sur maillage non structuré et trois stratégies algorithmiques sont présentées. La première de ces techniques est une réorganisation visant à rendre les données consécutives en mémoire, au prix d’une copie des données coûteuse à la fois en temps et en mémoire. Une deuxième technique résultant en un partitionnement fin du domaine de calcul est développée, elle permet d’utiliser les mémoires cache des GPU modernes de manière plus efficace. La troisième de ces techniques est le raffinement générique. Le maillage initial est composé d’éléments grossiers raffinés de façon identique. Cette algorithme permet d’accéder aux données de façon consécutive et apporte des performances accrues
Numerical simulations in fluid mechanics require tremendous computational power; GPU computing is one of the newest approaches to accelerate such simulations. On one hand, this thesis studies the case of fluid mechanics algorithms on structured meshes. The mesh structuration naturally brings well-suited memory arrangements and allows us to derive guidelines for using GPUs in numerical simulations. On the other hand, we examine the case of fluid mechanics on unstructured meshes with the help of three different algorithmic strategies. The first of these techniques is a reorganisation that produces consecutive data accesses, but at the cost of data copies that are expensive both in time and in memory. The second technique, a cell partitioning approach, is developed and allows extensive use of modern GPUs' cache memories. The third technique is generic refinement: the initial mesh is made of coarse elements refined in exactly the same way in order to produce consecutive memory accesses. This approach brings significant performance improvements for fluid mechanics simulations on unstructured meshes.
5

Marin, Manuel. „GPU-enhanced power flow analysis“. Thesis, Perpignan, 2015. http://www.theses.fr/2015PERP0041.

Abstract:
Cette thèse propose un large éventail d'approches afin d'améliorer différents aspects de l'analyse des flux de puissance avec comme fils conducteur l'utilisation du processeurs graphiques (GPU). Si les GPU ont rapidement prouvés leurs efficacités sur des applications régulières pour lesquelles le parallélisme de données était facilement exploitable, il en est tout autrement pour les applications dites irrégulières. Ceci est précisément le cas de la plupart des algorithmes d'analyse de flux de puissance. Pour ce travail, nous nous inscrivons dans cette problématique d'optimisation de l'analyse de flux de puissance à l'aide de coprocesseur de type GPU. L'intérêt est double. Il étend le domaine d'application des GPU à une nouvelle classe de problème et/ou d'algorithme en proposant des solutions originales. Il permet aussi à l'analyse des flux de puissance de rester pertinent dans un contexte de changements continus dans les systèmes énergétiques, et ainsi d'en faciliter leur évolution. Nos principales contributions liées à la programmation sur GPU sont: (i) l'analyse des différentes méthodes de parcours d'arbre pour apporter une réponse au problème de la régularité par rapport à l'équilibrage de charge ; (ii) l'analyse de l'impact du format de représentation sur la performance des implémentations d'arithmétique floue. Nos contributions à l'analyse des flux de puissance sont les suivantes: (ii) une nouvelle méthode pour l'évaluation de l'incertitude dans l'analyse des flux de puissance ; (ii) une nouvelle méthode de point fixe pour l'analyse des flux de puissance, problème que l'on qualifie d'intrinsèquement parallèle
This thesis addresses the utilization of Graphics Processing Units (GPUs) for improving the Power Flow (PF) analysis of modern power systems. Currently, GPUs are challenged by applications exhibiting an irregular computational pattern, as is the case of most known methods for PF analysis. At the same time, the PF analysis needs to be improved in order to cope with new requirements of efficiency and accuracy coming from the Smart Grid concept. The relevance of GPU-enhanced PF analysis is twofold. On one hand, it expands the application domain of GPUs to a new class of problems. On the other hand, it consistently increases the computational capacity available for power system operation and design. The present work attempts to achieve that in two complementary ways: (i) by developing novel GPU programming strategies for available PF algorithms, and (ii) by proposing novel PF analysis methods that can exploit the numerous features present in GPU architectures. Specific contributions on GPU computing include: (i) a comparison of two programming paradigms, namely regularity and load-balancing, for implementing the so-called treefix operations; (ii) a study of the impact of the representation format on performance and accuracy for fuzzy interval algebraic operations; and (iii) the utilization of architecture-specific design as a novel strategy to improve the performance scalability of applications. Contributions on PF analysis include: (i) the design and evaluation of a novel method for uncertainty assessment, based on the fuzzy interval approach; and (ii) the development of an intrinsically parallel method for PF analysis, which is not affected by Amdahl's law.
6

Van, Luong Thé. „Métaheuristiques parallèles sur GPU“. Phd thesis, Université des Sciences et Technologie de Lille - Lille I, 2011. http://tel.archives-ouvertes.fr/tel-00638820.

Abstract:
Real-world optimization problems are often complex and NP-hard. Their modeling is continuously evolving in terms of constraints and objectives, and their resolution is CPU time-consuming. Although near-optimal algorithms such as metaheuristics (generic heuristics) make it possible to reduce the temporal complexity of their resolution, they fail to tackle large problems satisfactorily. Over the last decades, parallel computing has been revealed as an unavoidable way to deal with large problem instances of difficult optimization problems. The design and implementation of parallel metaheuristics are strongly influenced by the computing platform. GPU computing has recently been revealed effective to deal with time-intensive problems. This new emerging technology is believed to be extremely useful to speed up many complex algorithms. One of the major issues for metaheuristics is to rethink existing parallel models and programming paradigms to allow their deployment on GPU accelerators. Generally speaking, the major issues we have to deal with are: the distribution of data processing between CPU and GPU, the thread synchronization, the optimization of data transfer between the different memories, the memory capacity constraints, etc. The contribution of this thesis is to deal with such issues for the redesign of parallel models of metaheuristics to allow solving of large scale optimization problems on GPU architectures. Our objective is to rethink the existing parallel models and to enable their deployment on GPUs. Thereby, we propose in this document a new generic guideline for building efficient parallel metaheuristics on GPU. Our challenge is to come out with the GPU-based design of the whole hierarchy of parallel models. To this end, very efficient approaches are proposed for CPU-GPU data transfer optimization, thread control, mapping of solutions to GPU threads, and memory management. These approaches have been exhaustively experimented using five optimization problems and four GPU configurations. Compared to a CPU-based execution, experiments report up to 80-fold acceleration for large combinatorial problems and up to 2000-fold speed-up for a continuous problem. The different works related to this thesis have been accepted in a dozen publications, including the IEEE Transactions on Computers journal.
7

Buatois, Luc. „Algorithmes sur GPU de visualisation et de calcul pour des maillages non-structurés“. Phd thesis, Institut National Polytechnique de Lorraine - INPL, 2008. http://tel.archives-ouvertes.fr/tel-00331935.

Abstract:
The most recent algorithms for digital geometry processing and for numerical simulation of the CFD (Computational Fluid Dynamics) type now use new kinds of grids composed of arbitrary polyhedra, in other words strongly unstructured grids. In CFD simulations, these grids can carry scalar or vector fields that represent physical quantities (for example density, porosity, permeability). This thesis addresses the definition of new visualization and computation tools on such grids. For visualization, this raises both the problem of storage and that of adapting the algorithms to a varying geometry and topology. For computation, it raises the problem of solving large unstructured sparse linear systems. To tackle these problems, the relentless growth in the parallel computing power of graphics processors over recent years provides us with new tools. However, using these GPUs requires defining new algorithms adapted to their specific parallel programming models. Our contributions are the following: (1) a generic visualization method that exploits the computing power of GPUs to extract isosurfaces from large, strongly unstructured grids; (2) a cell classification method that accelerates isosurface extraction by pre-selecting only the intersected cells; (3) a temporal interpolation algorithm for isosurfaces, which makes it possible to visualize the evolution of isosurfaces continuously in time; (4) a massively parallel algorithm for solving large unstructured sparse linear systems on the GPU. Its originality lies in its adaptation to matrices with arbitrary sparsity patterns, which makes it applicable to any sparse system, including those arising from strongly unstructured meshes.
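The core operation behind contribution (4), solving large sparse systems with arbitrary sparsity patterns, is the sparse matrix-vector product. A minimal CSR SpMV kernel in CUDA (one thread per row, a simplification of the thesis' solver) is sketched below:

    #include <cstdio>
    #include <cuda_runtime.h>

    // Sparse matrix-vector product y = A*x with A stored in CSR format,
    // one thread per row. Iterative solvers (Jacobi, CG, ...) for large
    // unstructured systems spend most of their time in exactly this kernel;
    // only the storage pattern, not the mesh, matters here.
    __global__ void spmvCsr(const int* rowPtr, const int* colIdx, const float* val,
                            const float* x, float* y, int numRows) {
        int row = blockIdx.x * blockDim.x + threadIdx.x;
        if (row >= numRows) return;
        float sum = 0.0f;
        for (int k = rowPtr[row]; k < rowPtr[row + 1]; ++k)
            sum += val[k] * x[colIdx[k]];
        y[row] = sum;
    }

    int main() {
        // 3x3 example:  [2 1 0; 1 2 1; 0 1 2] * [1 1 1]^T = [3 4 3]^T
        int hRowPtr[] = {0, 2, 5, 7};
        int hColIdx[] = {0, 1, 0, 1, 2, 1, 2};
        float hVal[] = {2, 1, 1, 2, 1, 1, 2};
        float hX[] = {1, 1, 1}, hY[3];
        int *dRowPtr, *dColIdx; float *dVal, *dX, *dY;
        cudaMalloc(&dRowPtr, sizeof hRowPtr); cudaMalloc(&dColIdx, sizeof hColIdx);
        cudaMalloc(&dVal, sizeof hVal); cudaMalloc(&dX, sizeof hX); cudaMalloc(&dY, sizeof hY);
        cudaMemcpy(dRowPtr, hRowPtr, sizeof hRowPtr, cudaMemcpyHostToDevice);
        cudaMemcpy(dColIdx, hColIdx, sizeof hColIdx, cudaMemcpyHostToDevice);
        cudaMemcpy(dVal, hVal, sizeof hVal, cudaMemcpyHostToDevice);
        cudaMemcpy(dX, hX, sizeof hX, cudaMemcpyHostToDevice);
        spmvCsr<<<1, 32>>>(dRowPtr, dColIdx, dVal, dX, dY, 3);
        cudaMemcpy(hY, dY, sizeof hY, cudaMemcpyDeviceToHost);
        printf("y = %.0f %.0f %.0f\n", hY[0], hY[1], hY[2]);
        return 0;
    }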
8

Chakroun, Imen. „Algorithmes Branch and Bound parallèles hétérogènes pour environnements multi-coeurs et multi-GPU“. Phd thesis, Université des Sciences et Technologie de Lille - Lille I, 2013. http://tel.archives-ouvertes.fr/tel-00841965.

Abstract:
Branch and Bound (B&B) algorithms are attractive for the exact resolution of combinatorial optimization problems (COPs) through the exploration of a tree-shaped search space. Nevertheless, these algorithms are extremely time-consuming for large problem instances (e.g., Taillard's benchmarks for the Flow-Shop problem), even when computational grids are used [Mezmaz et al., IEEE IPDPS'2007]. The massively parallel computing offered by today's heterogeneous computing platforms [TOP500] is required to process such instances efficiently. The challenge is then to exploit all the underlying levels of parallelism and therefore to rethink the parallel models of B&B algorithms accordingly. In this thesis, we revisit the design and implementation of these algorithms for solving large COPs on (large) multi-core and multi-GPU computing platforms. The Flow-Shop scheduling problem (FSP) is considered as a case study. A preliminary experimental study on some large FSP instances revealed that the search tree is highly irregular (in shape and size) and very large (billions of billions of nodes), and that the bound evaluation operator is exorbitantly expensive (about 97% of the B&B time). Consequently, our first contribution is a GPU approach with a single CPU core (GB&B) in which only the evaluation operator is executed on the GPU. The approach addresses two challenges: thread divergence and the optimization of the GPU's hierarchical memory management. Compared with a sequential version, speedups of up to 100 are obtained on an Nvidia Tesla C2050. The performance analysis of GB&B showed that the overhead induced by data transfers between the CPU and the GPU is high. The goal of the second contribution is therefore to extend the approach (LL-GB&B) in order to minimize the CPU-GPU communication latency. This is achieved through a fine-grained GPU parallelization of the branching and pruning operators. The major challenge addressed here is thread divergence, caused by the highly irregular nature of the explored tree mentioned above. Compared with a sequential execution, LL-GB&B reaches speedups of up to 160 for the largest instances. The third contribution studies the combined use of GPUs and multi-core processors. Two scenarios were explored, leading to two approaches: a concurrent one (RLL-GB&B) and a cooperative one (PLL-GB&B). In the first case, the exploration process is carried out simultaneously by the GPU and the CPU cores. In the cooperative approach, the CPU cores prepare and transfer the subproblems using CUDA streaming while the GPU performs the exploration. The combined use of multi-core and GPU showed that RLL-GB&B is not beneficial, whereas PLL-GB&B yields an improvement of up to 36% over LL-GB&B. Since computational grids such as Grid5000 have recently been equipped with GPUs (at some sites), the fourth contribution of this thesis deals with combining GPU and multi-core computing with large-scale distributed computing. To this end, the different proposed approaches were brought together in a heterogeneous meta-algorithm that automatically selects the algorithm to deploy depending on the target hardware configuration. This meta-algorithm is coupled with the B&B@Grid approach proposed in [Mezmaz et al., IEEE IPDPS'2007]. B&B@Grid distributes the work units (search subspaces encoded as intervals) among the grid nodes, while the meta-algorithm chooses and locally deploys a parallel B&B algorithm on the received intervals. The combined approach allowed us to solve the 20x20 Taillard instances efficiently and to optimality.
9

Mansouri, Abdelkhalek. „Generic heuristics on GPU to superpixel segmentation and application to optical flow estimation“. Thesis, Bourgogne Franche-Comté, 2020. http://www.theses.fr/2020UBFCA012.

Abstract:
Déterminer des clusters dans des nuages de points et apparier des graphes sont des tâches primordiales en informatique, analyse de donnée, traitement d’image, généralement modélisées par des problèmes d’optimisation de classe NP-difficile. Avec l’avènement des multiprocesseurs à bas coût, l’accélération des procédures heuristiques pour ces tâches devient possible et nécessaire. Nous proposons des implantations parallèles sur système GPU (graphics processing unit) pour des algorithmes génériques appliqués ici à la segmentation d’image en superpixels et au problème du flot optique. Le but est de fournir des algorithmes génériques basés sur des structures de données décentralisées et aisément adaptables à différents problèmes d’optimisation sur des graphes et plateformes parallèles.Les algorithmes parallèles proposés sur GPU incluent le classique k-means et le calcul de forêt couvrante minimum pour la segmentation en superpixels. Ils incluent également un algorithme de recherche locale parallèle et un algorithme mémétique à base de population de solutions appliqués à l’estimation du flot optique via des appariements de superpixels. Tandis que les opérations sur les données exploitent le GPU, l’algorithme mémétique opère en tant que coalition de processus exécutés en parallèle sur le CPU multi-cœur et requérant des ressources GPU. Les images sont des nuages de points de l’espace euclidien 3D (domaine espace-intensité), et aussi des graphes auxquels sont associés des grilles de processeurs. Les kernels GPU exécutent des transformations en parallèle sous contrôle du CPU qui a un rôle réduit de détection des conditions d’arrêt et de séquencement des transformations.La contribution présentée est composée de deux grandes parties. Dans une première partie, nous présentons des outils pour la segmentation en superpixels. Une implémentation parallèle de l’algorithme des k-means est présentée et appliquée aux données 3D. Elle est basée sur une subdivision cellulaire de l’espace 3D qui permet des recherches de plus proche voisin en parallèle en temps optimal constant pour des distributions bornées. Nous présentons également une application de l’algorithme parallèle de calcul de forêt couvrante de Boruvka à la segmentation superpixel de type ligne de partage-des-eaux (watershed). Dans une deuxième partie, en se basant sur les superpixels générés, des procédures parallèles de mise en correspondance sont dérivées pour l’estimation du flot optique avec prise en compte des discontinuités. Ces méthodes incluent des heuristiques de construction et d’amélioration, telles que le winner-take-all et la recherche locale parallèle, et leur intégration dans une métaheuristique à base de population. Diverses combinaisons d’exécution sont présentées et évaluées en comparaison avec des algorithmes de l’état de l’art performants
Finding clusters in point clouds and matching graphs to graphs are recurrent tasks in computer science, data analysis and image processing, most often modeled as NP-hard optimization problems. With the development and accessibility of cheap multiprocessors, accelerating the heuristic procedures for these tasks becomes possible and necessary. We propose parallel implementations on GPU (graphics processing unit) systems for some generic algorithms, applied here to image superpixel segmentation and to the optical flow problem. The aim is to provide generic algorithms based on standard decentralized data structures that are easy to improve and to customize for many optimization problems and parallel platforms. The proposed parallel algorithm implementations include the classical k-means algorithm and an application of minimum spanning forest computation for superpixel segmentation. They also include a parallel local search procedure and a population-based memetic algorithm applied to optical flow estimation based on superpixel matching. While data operations fully exploit the GPU, the memetic algorithm operates like a coalition of processes executed in parallel on the multi-core CPU and requesting GPU resources. Images are point clouds in 3D Euclidean space (space-gray value domain), and are also graphs to which processor grids are assigned. GPU kernels execute parallel transformations under CPU control, whose limited role only consists in evaluating stopping criteria and sequencing transformations. The presented contribution contains two main parts. Firstly, we present tools for superpixel segmentation. A parallel implementation of the k-means algorithm is presented, with application to 3D data. It is based on a cellular grid subdivision of 3D space that allows closest-point searches in constant optimal time for bounded distributions. We present an application of the parallel Boruvka minimum spanning tree algorithm to compute watershed minimum spanning forests. Secondly, based on the generated superpixels and segmentation, we derive parallel optimization procedures for optical flow estimation with edge-aware filtering. The method includes construction and improvement heuristics, such as winner-take-all and parallel local search, and their embedding into a population-based metaheuristic framework. The algorithms are presented and evaluated in comparison with state-of-the-art algorithms.
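As a hedged sketch of the thread-per-point mapping used in the parallel k-means (the brute-force centroid loop below stands in for the thesis' cellular-grid nearest-neighbour search), the assignment step could be written as:

    #include <cstdio>
    #include <cfloat>
    #include <cuda_runtime.h>

    // k-means assignment step: one thread finds the nearest centroid of one 3D
    // point (the space-gray-value domain of the thesis). The centroid-update
    // step and the outer iteration loop would follow on the host or in further
    // kernels.
    __global__ void assignClusters(const float3* points, const float3* centroids,
                                   int* assignment, int numPoints, int k) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= numPoints) return;
        float3 p = points[i];
        float bestDist = FLT_MAX; int best = 0;
        for (int c = 0; c < k; ++c) {
            float dx = p.x - centroids[c].x, dy = p.y - centroids[c].y, dz = p.z - centroids[c].z;
            float d = dx * dx + dy * dy + dz * dz;
            if (d < bestDist) { bestDist = d; best = c; }
        }
        assignment[i] = best;
    }

    int main() {
        const int n = 6, k = 2;
        float3 hPts[n] = {{0,0,0},{1,0,0},{0,1,0},{9,9,9},{8,9,9},{9,8,9}};
        float3 hCtr[k] = {{0,0,0},{9,9,9}};
        float3 *dPts, *dCtr; int* dAsg;
        cudaMalloc(&dPts, sizeof hPts); cudaMalloc(&dCtr, sizeof hCtr);
        cudaMalloc(&dAsg, n * sizeof(int));
        cudaMemcpy(dPts, hPts, sizeof hPts, cudaMemcpyHostToDevice);
        cudaMemcpy(dCtr, hCtr, sizeof hCtr, cudaMemcpyHostToDevice);
        assignClusters<<<1, 32>>>(dPts, dCtr, dAsg, n, k);
        int hAsg[n];
        cudaMemcpy(hAsg, dAsg, sizeof hAsg, cudaMemcpyDeviceToHost);
        for (int i = 0; i < n; ++i) printf("point %d -> cluster %d\n", i, hAsg[i]);
        return 0;
    }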
10

Legrand, Hélène. „Algorithmes parallèles pour le traitement rapide de géométries 3D“. Electronic Thesis or Diss., Paris, ENST, 2017. http://www.theses.fr/2017ENST0053.

Abstract:
Au cours des vingt dernières années, les principaux concepts du traitement du signal ont trouvé leur homologue pour le cas de la géométrie numérique, et en particulier des modèles polygonaux de surfaces 3D. Ces traitements requièrent néanmoins un temps de calcul non négligeable lorsqu’on les applique sur des modèles de taille conséquente. Cette charge de calcul devient un frein important dans le contexte actuel, où les quantités massives de données 3D générées à chaque seconde peuvent potentiellement nécessiter l’application d’un sous-ensemble de ces opérateurs. La capacité à exécuter des opérateurs de traitement géométrique en un temps très court représente alors un verrou important pour les systèmes de conception, capture et restitution 3D dynamiques. Dans ce contexte, on cherche à accélérer de plusieurs ordres de grandeur certains algorithmes de traitement géométrique actuels, et à reformuler ou approcher ces algorithmes afin de diminuer leur complexité ou de les adapter à un environnement parallèle. Dans cette thèse, nous nous appuyons sur un objet compact et efficace permettant d’analyser les surfaces 3D à plusieurs échelles : les quadriques d’erreurs. En particulier, nous proposons de nouveaux algorithmes haute performance, maintenant à la surface des quadriques d’erreur représentatives de la géométrie. Un des principaux défis tient ici à la génération des structures adaptées en parallèle, afin d’exploiter les processeurs parallèles à grain fin que sont les GPU, la principale source de puissance disponible dans un ordinateur moderne
Over the last twenty years, the main signal processing concepts have been adapted for digital geometry, in particular for 3D polygonal meshes. However, the processing time required for large models is significant. This computational load becomes an obstacle in the current context, where the massive amounts of data that are generated every second may need to be processed with several operators. The ability to run geometry processing operators with strong time constraints is a critical challenge in dynamic 3D systems. In this context, we seek to speed up some of the current algorithms by several orders of magnitude, and to reformulate or approximate them in order to reduce their complexity or make them parallel. In this thesis, we are building on a compact and effective object to analyze 3D surfaces at different scales: error quadrics. In particular, we propose new high performance algorithms that maintain error quadrics on the surface to represent the geometry. One of the main challenges lies in the effective generation of the right structures for parallel processing, in order to take advantage of the GPU.
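A minimal CUDA sketch of maintaining error quadrics on the surface, accumulating the classic plane quadric p p^T of each triangle into its three vertices with atomics, is given below (an illustration of the data layout only, not the thesis pipeline):

    #include <cstdio>
    #include <cmath>
    #include <cuda_runtime.h>

    // Accumulate, per vertex, the Garland-Heckbert error quadric
    // Q = sum over incident triangles of p * p^T, where p = (a,b,c,d) is the
    // triangle's plane. Only the 10 unique coefficients of the symmetric 4x4
    // matrix are stored. Concurrent triangles update shared vertices with atomicAdd.
    __global__ void accumulateQuadrics(const float3* verts, const int3* tris,
                                       float* quadrics /* 10 floats per vertex */,
                                       int numTris) {
        int t = blockIdx.x * blockDim.x + threadIdx.x;
        if (t >= numTris) return;
        int3 tri = tris[t];
        float3 v0 = verts[tri.x], v1 = verts[tri.y], v2 = verts[tri.z];
        float3 e1 = {v1.x - v0.x, v1.y - v0.y, v1.z - v0.z};
        float3 e2 = {v2.x - v0.x, v2.y - v0.y, v2.z - v0.z};
        float a = e1.y * e2.z - e1.z * e2.y;       // plane normal = e1 x e2
        float b = e1.z * e2.x - e1.x * e2.z;
        float c = e1.x * e2.y - e1.y * e2.x;
        float len = sqrtf(a * a + b * b + c * c);
        if (len == 0.0f) return;                   // degenerate triangle
        a /= len; b /= len; c /= len;
        float d = -(a * v0.x + b * v0.y + c * v0.z);
        float p[4] = {a, b, c, d};
        int v[3] = {tri.x, tri.y, tri.z};
        for (int k = 0; k < 3; ++k) {
            float* Q = quadrics + 10 * v[k];
            int idx = 0;
            for (int i = 0; i < 4; ++i)
                for (int j = i; j < 4; ++j)        // upper triangle of p * p^T
                    atomicAdd(&Q[idx++], p[i] * p[j]);
        }
    }

    int main() {
        float3 hV[3] = {{0,0,0},{1,0,0},{0,1,0}};
        int3 hT[1] = {{0,1,2}};
        float3* dV; int3* dT; float* dQ;
        cudaMalloc(&dV, sizeof hV); cudaMalloc(&dT, sizeof hT);
        cudaMalloc(&dQ, 3 * 10 * sizeof(float));
        cudaMemset(dQ, 0, 3 * 10 * sizeof(float));
        cudaMemcpy(dV, hV, sizeof hV, cudaMemcpyHostToDevice);
        cudaMemcpy(dT, hT, sizeof hT, cudaMemcpyHostToDevice);
        accumulateQuadrics<<<1, 32>>>(dV, dT, dQ, 1);
        float hQ[10];
        cudaMemcpy(hQ, dQ, sizeof hQ, cudaMemcpyDeviceToHost);
        printf("quadric diagonal of vertex 0: %.2f %.2f %.2f %.2f\n",
               hQ[0], hQ[4], hQ[7], hQ[9]);
        return 0;
    }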

Books on the topic "Algorithmes GPU"

1

Xu, Guochang. GPS: Theory, algorithms, and applications. Berlin: Springer, 2003.

2

Shu ju jie gou: Yong mian dui xiang fang fa yu C++ yu yan miao shu. 2nd ed. Bei jing: Qing hua da xue chu ban she, 2007.

3

Baúto, João, Rui Neves and Nuno Horta. Parallel Genetic Algorithms for Financial Pattern Discovery Using GPUs. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-73329-6.

4

Pilley, H. Robert. GPS-based airport operations: Requirements, analysis & algorithms : engineering source book. Deering, NH: DSDC, 1994.

5

Gulati, Kanupriya. Hardware acceleration of EDA algorithms: Custom ICs, FPGAs and GPUs. New York: Springer, 2010.

6

Chen, Dewang, and Ruijun Cheng. Intelligent Processing Algorithms and Applications for GPS Positioning Data of Qinghai-Tibet Railway. Berlin, Heidelberg: Springer Berlin Heidelberg, 2019. http://dx.doi.org/10.1007/978-3-662-58970-0.

7

Tu jie zi liao jie gou: Shi yong Python. Xinbei Shi: Bo shuo wen hua gu fen you xian gong si, 2017.

8

Shu ju jie gou yu suan fa fen xi: C yu yan miao shu. Beijing Shi: Ji xie gong ye chu ban she, 2004.

9

Jet Propulsion Laboratory (U.S.), ed. A fully redundant double difference algorithm for obtaining minimum variance estimates from GPS observations. Pasadena, Calif: National Aeronautics and Space Administration, Jet Propulsion Laboratory, California Institute of Technology, 1986.

10

Melbourne, William G. A fully redundant double difference algorithm for obtaining minimum variance estimates from GPS observations. Pasadena, Calif: National Aeronautics and Space Administration, Jet Propulsion Laboratory, California Institute of Technology, 1986.


Book chapters on the topic "Algorithmes GPU"

1

Ou, Zhixin, Juan Chen, Yuyang Sun, Tao Xu, Guodong Jiang, Zhengyuan Tan and Xinxin Qi. „AOA: Adaptive Overclocking Algorithm on CPU-GPU Heterogeneous Platforms“. In Algorithms and Architectures for Parallel Processing, 253–72. Cham: Springer Nature Switzerland, 2023. http://dx.doi.org/10.1007/978-3-031-22677-9_14.

Abstract:
Although GPUs have been used to accelerate various convolutional neural network algorithms with good performance, the demand for performance improvement is still continuously increasing. CPU/GPU overclocking technology brings opportunities for further performance improvement in CPU-GPU heterogeneous platforms. However, CPU/GPU overclocking inevitably increases the power of the CPU/GPU, which is not conducive to energy conservation, energy efficiency optimization, or even system stability. How to effectively constrain the total energy to remain roughly unchanged during CPU/GPU overclocking is a key issue in designing adaptive overclocking algorithms. There are two key factors in solving this issue. Firstly, the dynamic power upper bound must be set to reflect the real-time behavior characteristics of the program so that the algorithm can better meet the constraint of unchanged total energy; secondly, instead of independently overclocking on the CPU and GPU sides, coordinated CPU-GPU overclocking must be considered to adapt to the real-time load balance for higher performance improvement and better energy constraints. This paper proposes an Adaptive Overclocking Algorithm (AOA) on CPU-GPU heterogeneous platforms to achieve the goal of performance improvement while the total energy remains roughly unchanged. AOA uses the function F_k to describe the variable power upper bound and introduces the load imbalance factor W to realize coordinated CPU-GPU overclocking. Through the verification of several types of convolutional neural network algorithms on two CPU-GPU heterogeneous platforms (Intel® Xeon E5-2660 & NVIDIA® Tesla K80; Intel® Core™ i9-10920X & NVIDIA® GeForce RTX 2080Ti), AOA achieves an average of 10.7% performance improvement and 4.4% energy savings. To verify the effectiveness of AOA, we compare AOA with other methods including automatic boost, the highest overclocking and static optimal overclocking.
2

Wijs, Anton, and Muhammad Osama. „A GPU Tree Database for Many-Core Explicit State Space Exploration“. In Tools and Algorithms for the Construction and Analysis of Systems, 684–703. Cham: Springer Nature Switzerland, 2023. http://dx.doi.org/10.1007/978-3-031-30823-9_35.

Abstract:
Various techniques have been proposed to accelerate explicit-state model checking with GPUs, but none address the compact storage of states, or if they do, at the cost of losing completeness of the checking procedure. We investigate how to implement a tree database to store states as binary trees in GPU memory. We present fine-grained parallel algorithms to find and store trees, experiment with a number of GPU-specific configurations, and propose a novel hashing technique, called Cleary-Cuckoo hashing, which enables the use of Cleary compression on GPUs. We are the first to assess the effectiveness of using a tree database, and Cleary compression, on GPUs. Experiments show processing speeds of up to 131 million states per second.
3

Reinders, James, Ben Ashbaugh, James Brodman, Michael Kinsner, John Pennycook and Xinmin Tian. "Programming for GPUs". In Data Parallel C++, 353–85. Berkeley, CA: Apress, 2020. http://dx.doi.org/10.1007/978-1-4842-5574-2_15.

Abstract:
Over the last few decades, Graphics Processing Units (GPUs) have evolved from specialized hardware devices capable of drawing images on a screen to general-purpose devices capable of executing complex parallel kernels. Nowadays, nearly every computer includes a GPU alongside a traditional CPU, and many programs may be accelerated by offloading part of a parallel algorithm from the CPU to the GPU.
4

Osama, Muhammad, Anton Wijs and Armin Biere. "SAT Solving with GPU Accelerated Inprocessing". In Tools and Algorithms for the Construction and Analysis of Systems, 133–51. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-72016-2_8.

Abstract:
Since 2013, the leading SAT solvers in the SAT competition all use inprocessing, which, unlike preprocessing, interleaves search with simplifications. However, applying inprocessing frequently can still be a bottleneck, particularly for hard or large formulas. In this work, we introduce the first attempt to parallelize inprocessing on GPU architectures. As memory is a scarce resource on GPUs, we present new space-efficient data structures and devise a data-parallel garbage collector. It runs in parallel on the GPU to reduce memory consumption and improve memory access locality. Our new parallel variable elimination algorithm is twice as fast as previous work. In our experiments, the new solver ParaFROST solves many benchmarks faster on the GPU than its sequential counterparts.
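For readers unfamiliar with variable elimination, the sequential sketch below shows the underlying simplification on a CNF formula in Python: a variable is eliminated by resolution if the resolvents do not outnumber the clauses removed. This is only the textbook rule for illustration, not the data-parallel GPU version developed in the chapter.

    def resolve(c1, c2, var):
        """Resolvent of two clauses on var; returns None if it is a tautology."""
        resolvent = (set(c1) | set(c2)) - {var, -var}
        return None if any(-lit in resolvent for lit in resolvent) else frozenset(resolvent)

    def eliminate_variable(clauses, var):
        """Try to eliminate var by resolution; keep the original formula if it would grow."""
        pos = [c for c in clauses if var in c]
        neg = [c for c in clauses if -var in c]
        rest = [c for c in clauses if var not in c and -var not in c]
        resolvents = {r for p in pos for n in neg
                      if (r := resolve(p, n, var)) is not None}
        if len(resolvents) > len(pos) + len(neg):
            return clauses                      # elimination would grow the formula; skip
        return rest + sorted(resolvents, key=sorted)

    # Example: eliminate x1 from (x1 or x2) and (not x1 or x3) -> (x2 or x3)
    cnf = [frozenset({1, 2}), frozenset({-1, 3})]
    print(eliminate_variable(cnf, 1))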
5

Osama, Muhammad, and Anton Wijs. "Hitching a Ride to a Lasso: Massively Parallel On-The-Fly LTL Model Checking". In Tools and Algorithms for the Construction and Analysis of Systems, 23–43. Cham: Springer Nature Switzerland, 2024. http://dx.doi.org/10.1007/978-3-031-57249-4_2.

Abstract:
The need for massively parallel algorithms, suitable to exploit the computational power of hardware such as graphics processing units, is ever increasing. In this paper, we propose a new algorithm for the on-the-fly verification of Linear-Time Temporal Logic (LTL) formulae [45] that is aimed at running on such devices. We prove its correctness and termination guarantee, and experimentally compare a GPU implementation with state-of-the-art LTL model checkers. Our new GPU LTL-checking algorithm is up to 150× faster at proving the correctness of a system than LTSmin running on a 32-core high-end CPU, and it makes more economical use of the available memory.
6

Yang, Kaifeng, and Michael Affenzeller. "Surrogate-assisted Multi-objective Optimization via Genetic Programming Based Symbolic Regression". In Lecture Notes in Computer Science, 176–90. Cham: Springer Nature Switzerland, 2023. http://dx.doi.org/10.1007/978-3-031-27250-9_13.

Abstract:
Surrogate-assisted optimization algorithms are a commonly used technique for solving expensive-evaluation problems, in which a regression model is built to replace an expensive function. For some acquisition functions, the only requirement on the regression model is that it provide predictions. Other acquisition functions, however, also require the regression model to estimate the “uncertainty” of a prediction rather than merely provide it. Unfortunately, very few statistical modeling techniques can do this; examples include Kriging/Gaussian processes and the recently proposed genetic programming-based (GP-based) symbolic regression with Kriging (GP2). Another option is to use a bootstrapping technique in GP-based symbolic regression to estimate a prediction and its corresponding uncertainty. This paper proposes to use GP-based symbolic regression and its variants to solve multi-objective optimization problems (MOPs) within the framework of a surrogate-assisted multi-objective optimization algorithm (SMOA). Kriging and random forest are also compared with GP-based symbolic regression and GP2. Experimental results demonstrate that surrogate models using the GP2 strategy can improve SMOA's performance.
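The bootstrapping idea mentioned above can be illustrated independently of GP: train an ensemble on resampled data and take the spread of the ensemble's predictions as the uncertainty estimate. The sketch below does this with a trivial polynomial fit standing in for a symbolic-regression model; the model choice, ensemble size and data are illustrative assumptions, not the chapter's method.

    import numpy as np

    def bootstrap_predict(x_train, y_train, x_query, n_models=50, degree=2, seed=0):
        """Fit n_models polynomial models on bootstrap resamples; return mean and std of predictions."""
        rng = np.random.default_rng(seed)
        preds = []
        for _ in range(n_models):
            idx = rng.integers(0, len(x_train), len(x_train))   # sample with replacement
            coeffs = np.polyfit(x_train[idx], y_train[idx], degree)
            preds.append(np.polyval(coeffs, x_query))
        preds = np.array(preds)
        return preds.mean(axis=0), preds.std(axis=0)            # prediction and its uncertainty

    x = np.linspace(0, 1, 30)
    y = np.sin(2 * np.pi * x) + 0.1 * np.random.default_rng(1).normal(size=30)
    mean, std = bootstrap_predict(x, y, np.array([0.25, 0.5, 0.75]))
    print(mean, std)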
7

Vasconcelos, Cristina N., Asla Sá, Paulo Cezar Carvalho and Marcelo Gattass. "Lloyd’s Algorithm on GPU". In Advances in Visual Computing, 953–64. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008. http://dx.doi.org/10.1007/978-3-540-89639-5_91.

8

Xu, Guochang, and Yan Xu. "Applications of GPS Theory and Algorithms". In GPS, 313–40. Berlin, Heidelberg: Springer Berlin Heidelberg, 2016. http://dx.doi.org/10.1007/978-3-662-50367-6_10.

9

Martens, Jan, Jan Friso Groote, Lars van den Haak, Pieter Hijma and Anton Wijs. "A Linear Parallel Algorithm to Compute Bisimulation and Relational Coarsest Partitions". In Formal Aspects of Component Software, 115–33. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-90636-8_7.

Abstract:
The most efficient way to calculate strong bisimilarity is by finding the relational coarsest partition of a transition system. We provide the first linear-time algorithm to calculate strong bisimulation using parallel random access machines (PRAMs). More precisely, with n states, m transitions and |Act| ≤ m action labels, we provide an algorithm for max(n, m) processors that calculates strong bisimulation in time O(n + |Act|) and space O(n + m). The best-known PRAM algorithm has time complexity O(n log n) on a smaller number of processors, making it less suitable for massively parallel devices such as GPUs. An implementation on a GPU shows that the linear time bound is achievable on contemporary hardware.
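As background for the abstract above, the sketch below shows the classical sequential refinement loop for the relational coarsest partition: repeatedly split blocks according to which blocks their states can reach. It is a naive quadratic illustration for an unlabeled transition system; the chapter's contribution is the linear-time PRAM/GPU formulation, which this sketch does not attempt to reproduce.

    def coarsest_partition(states, transitions):
        """Naive partition refinement for strong bisimulation on an unlabeled transition system.

        states: iterable of states; transitions: set of (source, target) pairs.
        """
        partition = [set(states)]
        changed = True
        while changed:
            changed = False
            block_of = {s: i for i, block in enumerate(partition) for s in block}
            new_partition = []
            for block in partition:
                # Signature of a state: the set of blocks it has a transition into.
                groups = {}
                for s in block:
                    sig = frozenset(block_of[t] for (u, t) in transitions if u == s)
                    groups.setdefault(sig, set()).add(s)
                new_partition.extend(groups.values())
                if len(groups) > 1:
                    changed = True
            partition = new_partition
        return partition

    # Two states that can step to a sink are bisimilar; the sink ends up in its own block.
    print(coarsest_partition({"p", "q", "sink"}, {("p", "sink"), ("q", "sink")}))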
10

Şakar, Ömer, Mohsen Safari, Marieke Huisman and Anton Wijs. "Alpinist: An Annotation-Aware GPU Program Optimizer". In Tools and Algorithms for the Construction and Analysis of Systems, 332–52. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-030-99527-0_18.

Abstract:
GPU programs are widely used in industry. To obtain the best performance, a typical development process involves the manual or semi-automatic application of optimizations prior to compiling the code. To avoid the introduction of errors, we can augment GPU programs with (pre- and postcondition-style) annotations to capture functional properties. However, keeping these annotations correct when optimizing GPU programs is labor-intensive and error-prone. This paper introduces Alpinist, an annotation-aware GPU program optimizer. It applies frequently used GPU optimizations, but besides transforming code, it also transforms the annotations. We evaluate Alpinist, in combination with the VerCors program verifier, to automatically optimize a collection of verified programs and re-verify them.

Conference papers on the topic "Algorithmes GPU"

1

Konobrytskyi, Dmytro, Thomas Kurfess, Joshua Tarbutton and Tommy Tucker. "GPGPU Accelerated 3-Axis CNC Machining Simulation". In ASME 2013 International Manufacturing Science and Engineering Conference collocated with the 41st North American Manufacturing Research Conference. American Society of Mechanical Engineers, 2013. http://dx.doi.org/10.1115/msec2013-1096.

Abstract:
GPUs (Graphics Processing Units), traditionally used for 3D graphics calculations, have recently gained the ability to perform general-purpose calculations through GPGPU (General-Purpose GPU) technology. Moreover, GPUs can be much faster than CPUs (Central Processing Units) by executing hundreds or even thousands of commands concurrently. This parallel processing allows the GPU to achieve extremely high performance, but it also requires highly parallel algorithms that can provide enough work on each clock cycle. This work formulates a methodology for selecting a geometry representation and a data structure suitable for parallel processing on the GPU. The methodology is then used to design a 3-axis CNC milling simulation algorithm accelerated with GPGPU technology. The developed algorithm is validated by performing an experimental machining simulation and evaluating the performance results. The experimental simulation shows the importance of optimization and of using algorithms that provide enough work to the GPU. The test configuration also demonstrates almost an order-of-magnitude difference between CPU and GPU performance results.
2

Tarashima, Shuhei, Satoshi Someya and Koji Okamoto. "Acceleration of Recursive Cross-Correlation PIV Using Multiple GPUs". In ASME/JSME 2011 8th Thermal Engineering Joint Conference. ASMEDC, 2011. http://dx.doi.org/10.1115/ajtec2011-44442.

Abstract:
A large number of PIV algorithms and systems have been proposed, many of which are highly sophisticated in terms of accuracy and spatial and temporal resolution. However, a general problem with PIV is the time cost of computing vector fields from images, which often imposes specific constraints on the measurement methods. In this paper, focusing on recursive direct cross-correlation PIV with window deformation, which is one of the most popular PIV algorithms, we propose a technique to accelerate PIV processing using a single Graphics Processing Unit (single-GPU) and multiple GPUs (multi-GPU). In the single-GPU case, we show that PIV data can be processed over 100 times faster than using a CPU alone and that about 30 PIV image pairs per second can be processed for certain image sizes. The scalability of the algorithm used is also discussed. In the multi-GPU case, an image-split method and a method that runs single-GPU-based PIV processing in parallel are measured. We show that the benefit of multiple GPUs can be observed beyond a certain amount of image data with either of these two methods. Data transfer between the CPU and the GPUs is shown to be a bottleneck as the number of GPUs increases.
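To illustrate the per-window computation that dominates the cost discussed above, the sketch below estimates the displacement of one interrogation window by direct cross-correlation with NumPy. The window size and search range are arbitrary, and the recursive window-deformation and GPU acceleration of the paper are not shown.

    import numpy as np

    def window_displacement(img_a, img_b, x, y, win=16, search=4):
        """Direct cross-correlation of one interrogation window; returns (dx, dy) at integer precision."""
        a = img_a[y:y + win, x:x + win].astype(float)
        a = a - a.mean()
        best, best_dx, best_dy = -np.inf, 0, 0
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                b = img_b[y + dy:y + dy + win, x + dx:x + dx + win].astype(float)
                b = b - b.mean()
                score = np.sum(a * b)               # correlation score for this shift
                if score > best:
                    best, best_dx, best_dy = score, dx, dy
        return best_dx, best_dy

    rng = np.random.default_rng(0)
    frame_a = rng.random((64, 64))
    frame_b = np.roll(frame_a, shift=(1, 2), axis=(0, 1))     # particles moved by (dy=1, dx=2)
    print(window_displacement(frame_a, frame_b, x=24, y=24))  # expected (2, 1)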
3

Bergmann, Ryan M., and Jasmina L. Vujić. "Monte Carlo Neutron Transport on GPUs". In 2014 22nd International Conference on Nuclear Engineering. American Society of Mechanical Engineers, 2014. http://dx.doi.org/10.1115/icone22-30148.

Abstract:
GPUs have gradually increased in computational power from the small, job-specific boards of the early 90s to the programmable powerhouses of today. Compared to CPUs, they have a higher aggregate memory bandwidth, much higher floating-point operations per second (FLOPS), and lower energy consumption per FLOP. Because one of the main obstacles in exascale computing is power consumption, many new supercomputing platforms are gaining much of their computational capacity by incorporating GPUs into their compute nodes. Since CPU-optimized parallel algorithms are not directly portable to GPU architectures (at least not without losing substantial performance), transport codes need to be rewritten in order to execute efficiently on GPUs. Unless this is done, we cannot take full advantage of these new supercomputers for reactor simulations. In this work, we attempt to efficiently map the Monte Carlo transport algorithm onto the GPU while preserving its benefits, namely, very few physical and geometrical simplifications. Regularizing memory access and introducing parallel-efficient search and sorting algorithms are the main factors in completing the task.
4

Bulavintsev, Vadim, and Dmitry Zhdanov. "Method for Adaptation of Algorithms to GPU Architecture". In 31st International Conference on Computer Graphics and Vision. Keldysh Institute of Applied Mathematics, 2021. http://dx.doi.org/10.20948/graphicon-2021-3027-930-941.

Abstract:
We propose a generalized method for adapting and optimizing algorithms for efficient execution on modern graphics processing units (GPUs). The method consists of several steps. First, build a control flow graph (CFG) of the algorithm. Next, transform the CFG into a tree of loops and merge non-parallelizable loops into parallelizable ones. Finally, map the resulting loop tree to the tree of GPU computational units, unrolling the algorithm's loops as necessary for the match. The mapping should be performed bottom-up, from the lowest GPU architecture levels to the highest, to minimize off-chip memory access and maximize register file usage. The method provides the programmer with a convenient and robust mental framework and strategy for GPU code optimization. We demonstrate the method by adapting the DPLL backtracking search algorithm for the Boolean satisfiability problem (SAT) to a GPU. The resulting GPU version of DPLL outperforms the CPU version in raw tree search performance sixfold for regular Boolean satisfiability problems and twofold for irregular ones.
5

Vulcan, Alexandru Mihai, Radu Nicolae Pietraru and Maximilian Nicolae. "VISUAL TOOL FOR LEARNING GPU PROGRAMMING". In eLSE 2019. Carol I National Defence University Publishing House, 2019. http://dx.doi.org/10.12753/2066-026x-19-057.

Abstract:
Graphics Processing Units (GPUs) are unanimously considered powerful computational resources, and general-purpose computing on GPUs (GPGPU) is the de facto infrastructure for most of today's computationally intensive problems that researchers all over the globe deal with. High Performance Computing (HPC) facilities use state-of-the-art GPUs. Many domains, such as deep learning, machine learning, and computational finance, use GPUs to decrease execution time. GPUs are widely used in data centers for high-performance computing, where virtualization techniques are used to optimize resource utilization (e.g., GPU cloud computing). The GPU programming model requires all data to be stored in global memory before it is used, which limits the size of the problem a single GPU can handle. A system utilizing a cluster of GPUs would offer a higher degree of parallelism and would also remove the memory limitation imposed by a single GPU; these are just a few of the issues a programmer needs to handle. However, the proportion of specialists able to program such processors efficiently is very small. One important reason for this situation is the steepness of the GPU programming learning curve, due to the complex parallel architecture of the processor. Therefore, the tool presented in this article aims to provide visual support for a better understanding of execution on the GPU. With it, programmers can easily observe the trace of the parallel execution of their own algorithms and, from that, determine unused GPU capacity that could be better exploited.
6

Mazhar, Hammad, Andrew Seidl, Rebecca Shotwell, Marco B. Quadrelli, Dan Negrut and Abhinandan Jain. "Granular Dynamics Simulation on Multiple GPUs Using Domain Decomposition". In ASME 2012 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. American Society of Mechanical Engineers, 2012. http://dx.doi.org/10.1115/detc2012-71121.

Abstract:
This paper describes the software infrastructure needed to enable massive multi-body simulation using multiple GPUs. Utilizing a domain decomposition approach, a large system made up of billions of bodies can be split into self-contained subdomains which are then transferred to different GPUs and solved in parallel. Parallelism is enabled on multiple levels, first on the CPU through OpenMP and second on the GPU through NVIDIA CUDA (Compute Unified Device Architecture). This heterogeneous software infrastructure can be extended to networks of computers using MPI (Message Passing Interface), as each subdomain is self-contained. This paper discusses the implementation of the spatial subdivision algorithm used for subdomain creation, along with the algorithms used for collision detection and constraint solution.
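The spatial subdivision mentioned above can be illustrated in a few lines of Python: bodies are binned into subdomains by their coordinates, and each bin would then be handed to a different GPU (or MPI rank). The grid shape and domain size are arbitrary here, and the halo exchange and collision handling of the actual infrastructure are omitted.

    import numpy as np

    def decompose(positions, domain_size, grid=(2, 2, 1)):
        """Assign each body to a subdomain index on a regular grid over the simulation box."""
        cell = np.asarray(domain_size) / np.asarray(grid)
        ijk = np.clip((positions // cell).astype(int), 0, np.asarray(grid) - 1)
        flat = ijk[:, 0] + grid[0] * (ijk[:, 1] + grid[1] * ijk[:, 2])
        return {d: np.where(flat == d)[0] for d in range(int(np.prod(grid)))}

    rng = np.random.default_rng(0)
    bodies = rng.random((1000, 3)) * [10.0, 10.0, 5.0]
    subdomains = decompose(bodies, domain_size=(10.0, 10.0, 5.0))
    for dom, idx in subdomains.items():
        print(f"subdomain {dom}: {len(idx)} bodies")   # each index list would go to its own GPU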
7

Morris, Christopher, Njiru Mwaura, David Schneider, FNU Tabish, Duncan Carpenter, Nathan Clark and Anjali Sandip. "Graphics Processing Units’ Accelerated Navier-Stokes Solvers for Unstructured Meshes: A Literature Review". In ASME 2023 International Mechanical Engineering Congress and Exposition. American Society of Mechanical Engineers, 2023. http://dx.doi.org/10.1115/imece2023-112786.

Abstract:
Recent advances in graphics processing unit (GPU) computational hardware have provided opportunities to develop large-scale numerical models with complex geometric configurations at reduced run time for wide-ranging applications, including computational fluid dynamics. However, utilizing the GPU hardware at its maximum capacity remains challenging, and there is a need to develop numerical solution algorithms that alleviate this bottleneck. This study reviews recent developments in GPU-based numerical solution algorithms for the incompressible Navier-Stokes equations on unstructured meshes. The literature review aims to promote the creation and development of robust, large-scale, complex-geometry numerical models with applications including, but not limited to, computational fluid dynamics.
8

Romanelli, G., L. Mangani, E. Casartelli, A. Gadda and M. Favale. "Implementation of Explicit Density-Based Unstructured CFD Solver for Turbomachinery Applications on Graphical Processing Units". In ASME Turbo Expo 2015: Turbine Technical Conference and Exposition. American Society of Mechanical Engineers, 2015. http://dx.doi.org/10.1115/gt2015-43396.

Abstract:
For the aerodynamic design of multistage compressors and turbines, Computational Fluid Dynamics (CFD) plays a fundamental role: it allows the complex behaviour of turbomachinery components to be characterized with high fidelity. Together with the availability of more and more powerful computing resources, current trends pursue the adoption of such high-fidelity tools and state-of-the-art technology even in the preliminary design phases. Within such a framework, Graphical Processing Units (GPUs) offer further growth potential, allowing a significant reduction of CFD process turn-around times at relatively low cost. The target of the present work is to illustrate the design and implementation of an explicit density-based RANS coupled solver for the efficient and accurate numerical simulation of multi-dimensional, time-dependent, compressible fluid flows on polyhedral unstructured meshes. The solver has been developed within the object-oriented OpenFOAM framework, using OpenCL bindings to interface the CPU and the GPU and MPI to interface multiple GPUs. The overall structure of the code, the numerical strategies adopted, and the algorithms implemented are specifically designed to best exploit the huge computational peak power offered by modern GPUs, by minimizing memory transfers between CPUs and GPUs and potential branch-divergence occurrences. This has a significant impact on the speedup factor and is especially challenging within a polyhedral unstructured mesh framework. Specific tools for turbomachinery applications, such as the Arbitrary Mesh Interface (AMI) and the mixing plane (MP), are implemented within the GPU context. The credibility of the proposed CFD solver is assessed by tackling a number of benchmark test problems, including the Rotor 67 axial compressor, the C3X stator blade with conjugate heat transfer, and the Aachen multi-stage turbine. An average GPU speedup factor of approximately S ≃ 50 with respect to the CPU is achieved (single precision, both GPU and CPU in the 100 USD price range). Preliminary parallel scalability tests run on multiple GPUs show a parallel efficiency factor of approximately E ≃ 75%.
9

Vanka, S. Pratap, Aaron F. Shinn and Kirti C. Sahu. "Computational Fluid Dynamics Using Graphics Processing Units: Challenges and Opportunities". In ASME 2011 International Mechanical Engineering Congress and Exposition. ASMEDC, 2011. http://dx.doi.org/10.1115/imece2011-65260.

Abstract:
A new paradigm for computing fluid flows is the use of Graphics Processing Units (GPUs), which have recently become very powerful and convenient to use. In the past three years, we have implemented five different fluid flow algorithms on GPUs and have obtained significant speed-ups over a single CPU; typically, it is possible to achieve a factor of 50–100 over a single CPU. In this review paper, we describe our experiences with the various algorithms developed and the speeds achieved.
10

Bruel, Pedro, Marcos Amarís and Alfredo Goldman. "Autotuning GPU Compiler Parameters Using OpenTuner". In XVI Simpósio em Sistemas Computacionais de Alto Desempenho. Sociedade Brasileira de Computação - SBC, 2015. http://dx.doi.org/10.5753/wscad.2015.14268.

Abstract:
In this paper we implement an autotuner for the compilation flags of GPU algorithms using the OpenTuner framework. An autotuner is a program that finds a combination of algorithms, or a configuration of an algorithm, that optimizes the solution of a given problem instance or set of instances. We analyse the performance gained after autotuning compilation flags for parallel algorithms on three GPU devices, and show that it is possible to improve upon the high-level optimizations of the CUDA compiler. One of the experimental settings achieved a 30% speedup.
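The paper uses the OpenTuner framework; to keep this note self-contained, the sketch below illustrates the same idea as plain random search over a small space of nvcc flag combinations, timing a benchmark binary. The flag list, the source file kernel.cu, and the benchmark command are placeholders of this sketch, not the paper's search space or harness.

    import itertools, random, subprocess, time

    # Illustrative flag values to explore; a real search space would come from the compiler manual.
    SPACE = {
        "--use_fast_math": ["", "--use_fast_math"],
        "-O": ["-O0", "-O2", "-O3"],
        "--maxrregcount": ["", "--maxrregcount=32", "--maxrregcount=64"],
    }

    def build_and_run(flags, source="kernel.cu", binary="./kernel"):
        """Compile with the given flags and return the benchmark's wall-clock time (placeholder paths)."""
        cmd = ["nvcc", source, "-o", binary] + [f for f in flags if f]
        subprocess.run(cmd, check=True)
        start = time.perf_counter()
        subprocess.run([binary], check=True)
        return time.perf_counter() - start

    def random_search(trials=20, seed=0):
        rng = random.Random(seed)
        configs = list(itertools.product(*SPACE.values()))
        best_cfg, best_time = None, float("inf")
        for cfg in rng.sample(configs, min(trials, len(configs))):
            t = build_and_run(cfg)
            if t < best_time:
                best_cfg, best_time = cfg, t
        return best_cfg, best_time

    if __name__ == "__main__":
        print(random_search())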

Reports by organizations on the topic "Algorithmes GPU"

1

Mniszewski, Susan, Stan Moore, Sam Reeve, Stuart Slattery, Damien Lebrun-Grandie, Shane Fogerty and Steve Plimpton. Algorithmic and GPU enhancements for molecular dynamics in Cabana and LAMMPS. Office of Scientific and Technical Information (OSTI), March 2022. http://dx.doi.org/10.2172/1856126.

2

Jimoh, Mujeeb B. Performance Testing of GPU-Based Approximate Matching Algorithm on Network Traffic. Fort Belvoir, VA: Defense Technical Information Center, March 2015. http://dx.doi.org/10.21236/ada620807.

3

Kolev, T. CEED-MS36: High-order algorithmic developments and optimizations for large-scale GPU-accelerated simulations. Office of Scientific and Technical Information (OSTI), March 2021. http://dx.doi.org/10.2172/1845639.

4

Lever, James, Allan Delaney, Laura Ray, E. Trautman, Lynette Barna and Amy Burzynski. Autonomous GPR surveys using the polar rover Yeti. Engineer Research and Development Center (U.S.), March 2022. http://dx.doi.org/10.21079/11681/43600.

Abstract:
The National Science Foundation operates stations on the ice sheets of Antarctica and Greenland to investigate Earth's climate history, life in extreme environments, and the evolution of the cosmos. Understandably, logistics costs dominate budgets due to the remote locations and harsh environments involved. Currently, manual ground-penetrating radar (GPR) surveys must precede vehicle travel across polar ice sheets to detect subsurface crevasses or other voids. This exposes the crew to the risks of undetected hazards. We have developed an autonomous rover, Yeti, specifically to conduct GPR surveys across polar ice sheets. It is a simple four-wheel-drive, battery-powered vehicle that executes autonomous surveys via GPS waypoint following. We describe here three recent Yeti deployments, two in Antarctica and one in Greenland. Our key objective was to demonstrate the operational value of a rover to locate subsurface hazards. Yeti operated reliably at −30 °C, and it has good oversnow mobility and adequate GPS accuracy for waypoint following and hazard georeferencing. It has acquired data on hundreds of crevasse encounters to improve our understanding of heavily crevassed traverse routes and to develop automated crevasse-detection algorithms. Importantly, it helped to locate a previously undetected buried building at the South Pole. Yeti can improve safety by decoupling survey personnel from the consequences of undetected hazards. It also enables higher-quality systematic surveys to improve hazard-detection probabilities, increase assessment confidence, and build datasets for understanding the evolution of these regions. Yeti has demonstrated that autonomous vehicles have great potential to improve the safety and efficiency of polar logistics.
5

Suess, Matthias, Demetrios Matsakis and Charles A. Greenhall. Simulating Future GPS Clock Scenarios with Two Composite Clock Algorithms. Fort Belvoir, VA: Defense Technical Information Center, November 2010. http://dx.doi.org/10.21236/ada547035.

6

Jade Morton, Yu T. Developing Signal Processing Algorithms for Weak GPS Signal Acquisition in Urban Environment. Fort Belvoir, VA: Defense Technical Information Center, September 2004. http://dx.doi.org/10.21236/ada426847.

7

Mathew, Jijo K., Christopher M. Day, Howell Li and Darcy M. Bullock. Curating Automatic Vehicle Location Data to Compare the Performance of Outlier Filtering Methods. Purdue University, 2021. http://dx.doi.org/10.5703/1288284317435.

Abstract:
Agencies use a variety of technologies and data providers to obtain travel time information. The best-quality data can be obtained from second-by-second tracking of vehicles, but such data presents many challenges in terms of privacy, storage requirements, and analysis. More frequently, agencies collect or purchase segment travel times based upon some type of matching of vehicles between two spatially distributed points. Typical methods for that data collection involve license plate re-identification, Bluetooth, Wi-Fi, or some type of rolling DSRC identifier. One of the challenges in each of these sampling techniques is to employ filtering that removes outliers associated with trip chaining without removing important features in the data associated with incidents or traffic congestion. This paper describes a curated data set that was developed from high-fidelity GPS trajectory data. The curated data contained 31,621 vehicle observations spanning 42 days; 2550 observations had travel times greater than 3 minutes more than normal. From this baseline data set, outliers were determined using GPS waypoints to establish whether the vehicle left the route. Two performance measures were identified for evaluating three outlier-filtering algorithms: the proportion of true samples rejected and the proportion of outliers correctly identified. The effectiveness of the three methods over 10-minute sampling windows was also evaluated. The curated data set has been archived in a digital repository and is available online for others to test outlier-filtering algorithms.
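The two performance measures described above can be computed directly once each observation carries a ground-truth label and a filter decision; the small sketch below shows that calculation on made-up data (the labels and flags here are illustrative, not the report's data set).

    def filter_performance(is_outlier, is_rejected):
        """Proportion of true samples rejected and proportion of outliers correctly identified."""
        true_idx = [i for i, o in enumerate(is_outlier) if not o]
        outlier_idx = [i for i, o in enumerate(is_outlier) if o]
        true_rejected = sum(is_rejected[i] for i in true_idx) / len(true_idx)
        outliers_caught = sum(is_rejected[i] for i in outlier_idx) / len(outlier_idx)
        return true_rejected, outliers_caught

    # Made-up example: 8 observations, 3 of them outliers from trip chaining.
    ground_truth = [False, False, True, False, True, False, True, False]
    filter_flags = [False, True, True, False, True, False, False, False]
    print(filter_performance(ground_truth, filter_flags))   # approximately (0.2, 0.667)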
8

Cheng, Peng, James V. Krogmeier, Mark R. Bell, Joshua Li and Guangwei Yang. Detection and Classification of Concrete Patches by Integrating GPR and Surface Imaging. Purdue University, 2021. http://dx.doi.org/10.5703/1288284317320.

Abstract:
This research considers the detection, location, and classification of patches in concrete and asphalt-on-concrete pavements using data taken from ground penetrating radar (GPR) and the WayLink 3D Imaging System. In particular, the project seeks to develop a patching table for “inverted-T” patches. A number of deep neural net methods were investigated for patch detection from 3D elevation and image observation, but the success was inconclusive, partly because of a dearth of training data. Later, a method based on thresholding IRI values computed on a 12-foot window was used to localize pavement distress, particularly as seen by patch settling. This method was far more promising. In addition, algorithms were developed for segmentation of the GPR data and for classification of the ambient pavement and the locations and types of patches found in it. The results so far are promising but far from perfect, with a relatively high rate of false alarms. The two project parts were combined to produce a fused patching table. Several hundred miles of data was captured with the Waylink System to compare with a much more limited GPR dataset. The primary dataset was captured on I-74. A software application for MATLAB has been written to aid in automation of patch table creation.
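The window-thresholding step mentioned above, which flags likely patch settlement where roughness over a fixed-length window exceeds a threshold, can be sketched as follows. The roughness proxy, window length in samples, and threshold value are illustrative stand-ins for the report's IRI computation.

    def flag_rough_windows(profile, window=12, threshold=2.5):
        """Flag window start indices whose mean absolute slope exceeds a threshold (IRI stand-in)."""
        slopes = [abs(profile[i + 1] - profile[i]) for i in range(len(profile) - 1)]
        flagged = []
        for start in range(0, len(slopes) - window + 1):
            roughness = sum(slopes[start:start + window]) / window
            if roughness > threshold:
                flagged.append(start)
        return flagged

    # Synthetic elevation profile with a settled patch (abrupt dips) around samples 30-45.
    elevation = [0.0] * 30 + [-5.0, 0.0] * 8 + [0.0] * 30
    print(flag_rough_windows(elevation))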
9

Lee, W. S., Victor Alchanatis and Asher Levi. Innovative yield mapping system using hyperspectral and thermal imaging for precision tree crop management. United States Department of Agriculture, January 2014. http://dx.doi.org/10.32747/2014.7598158.bard.

Abstract:
Original objectives and revisions – The original overall objective was to develop, test and validate a prototype yield mapping system for unit area to increase yield and profit for tree crops. Specific objectives were: (1) to develop a yield mapping system for a static situation, using hyperspectral and thermal imaging independently, (2) to integrate hyperspectral and thermal imaging for improved yield estimation by combining thermal images with hyperspectral images to improve fruit detection, and (3) to expand the system to a mobile platform for a stop-measure-and-go situation. There were no major revisions to the overall objective; however, several revisions were made to the specific objectives. The revised specific objectives were: (1) to develop a yield mapping system for a static situation, using color and thermal imaging independently, (2) to integrate color and thermal imaging for improved yield estimation by combining thermal images with color images to improve fruit detection, and (3) to expand the system to an autonomous mobile platform for a continuous-measure situation. Background, major conclusions, solutions and achievements – Yield mapping is considered an initial step for applying precision agriculture technologies. Although many yield mapping systems have been developed for agronomic crops, mapping the yield of tree crops remains a difficult task. In this project, an autonomous immature fruit yield mapping system was developed. The system could detect and count the number of fruit at early growth stages of citrus fruit so that farmers could apply site-specific management based on the maps. There were two sub-systems, a navigation system and an imaging system. Robot Operating System (ROS) was the backbone for developing the navigation system using an unmanned ground vehicle (UGV). An inertial measurement unit (IMU), wheel encoders and a GPS were integrated using an extended Kalman filter to provide reliable and accurate localization information. A LiDAR was added to support simultaneous localization and mapping (SLAM) algorithms. The color camera on a Microsoft Kinect was used to detect citrus trees, and a new machine vision algorithm was developed to enable autonomous navigation in the citrus grove. A multimodal imaging system, which consisted of two color cameras and a thermal camera, was carried by the vehicle for video acquisition. A novel image registration method was developed for combining color and thermal images and matching fruit in both images, which achieved pixel-level accuracy. A new Color-Thermal Combined Probability (CTCP) algorithm was created to effectively fuse information from the color and thermal images to classify potential image regions into fruit and non-fruit classes. Algorithms were also developed to integrate image registration, information fusion and fruit classification and detection into a single step for real-time processing. The imaging system achieved a precision rate of 95.5% and a recall rate of 90.4% on immature green citrus fruit detection, which was a great improvement compared to previous studies. Implications – The development of the immature green fruit yield mapping system will help farmers make early decisions for planning operations and marketing so that high yield and profit can be achieved.