Dissertations / Theses on the topic 'Optimisations for GPU'
Romera, Thomas. "Adéquation algorithme architecture pour flot optique sur GPU embarqué." Electronic Thesis or Diss., Sorbonne université, 2023. http://www.theses.fr/2023SORUS450.
This thesis focuses on the optimization and efficient implementation of pixel motion (optical flow) estimation algorithms on embedded graphics processing units (GPUs). Two iterative algorithms have been studied: the Total Variation - L1 (TV-L1) method and the Horn-Schunck method. The primary objective of this work is to achieve real-time processing, with a target frame processing time of less than 40 milliseconds, on low-power platforms, while maintaining acceptable image resolution and flow estimation quality for the intended applications. Various levels of optimization strategies have been explored. High-level algorithmic transformations, such as operator fusion and operator pipelining, have been implemented to maximize data reuse and enhance spatial/temporal locality. Additionally, GPU-specific low-level optimizations, including the use of vector instructions and vector data types, as well as efficient memory access management, have been incorporated. The impact of floating-point number representation (single precision versus half precision) has also been investigated. The implementations have been assessed on Nvidia's Jetson Xavier, TX2, and Nano embedded platforms in terms of execution time, power consumption, and optical flow accuracy. Notably, the TV-L1 method exhibits higher complexity and computational intensity than Horn-Schunck. The fastest versions of these algorithms achieve a processing rate of 0.21 nanoseconds per pixel per iteration in half precision on the Xavier platform, representing a 22x time reduction over efficient and parallel CPU versions. Furthermore, energy consumption is reduced by a factor of 5.3. Among the tested boards, the Xavier embedded platform, being both the most powerful and the most recent, consistently delivers the best results in terms of speed and energy efficiency. Operator merging and pipelining have proven to be instrumental in improving GPU performance by enhancing data reuse. This data reuse is made possible through GPU shared memory, a small, high-speed memory that enables data sharing among threads within the same GPU thread block. While merging multiple iterations yields performance gains, it is constrained by the size of the shared memory, necessitating trade-offs between resource utilization and speed. The adoption of half-precision numbers accelerates iterative algorithms and achieves superior optical flow accuracy within the same time frame compared to single-precision counterparts. Half-precision implementations converge more rapidly due to the increased number of iterations possible within a given time window. Specifically, the use of half-precision numbers on the best GPU architecture accelerates execution by up to 2.2x for TV-L1 and 3.7x for Horn-Schunck. This work underscores the significance of GPU-specific optimizations for computer vision algorithms, together with the use and study of reduced-precision floating-point numbers. They pave the way for future enhancements through new algorithmic transformations, alternative numerical formats, and hardware architectures. This approach can potentially be extended to other families of iterative algorithms.
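As a concrete illustration of the kind of optimization this abstract discusses (a per-pixel iterative update with reduced-precision storage), the following minimal CUDA sketch performs one Horn-Schunck fixed-point update step with the flow fields stored in half precision and the arithmetic done in single precision. It is not the thesis code: the array names, image size, smoothing parameter, and the half-storage/float-compute choice are illustrative assumptions, and the neighborhood averages are assumed to be precomputed by a separate pass (operator fusion would combine the two).

// Minimal sketch (not the thesis implementation): one Horn-Schunck fixed-point
// update step with half-precision storage and single-precision arithmetic.
// Array names, sizes and the smoothing parameter alpha2 are illustrative assumptions.
#include <cuda_fp16.h>
#include <cstdio>

__global__ void hs_update(const __half* Ix, const __half* Iy, const __half* It,
                          const __half* uAvg, const __half* vAvg,
                          __half* u, __half* v, int w, int h, float alpha2) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;
    int i = y * w + x;
    float ix = __half2float(Ix[i]), iy = __half2float(Iy[i]), it = __half2float(It[i]);
    float ua = __half2float(uAvg[i]), va = __half2float(vAvg[i]);
    // Classical Horn-Schunck update; uAvg/vAvg are the local flow averages,
    // assumed precomputed by a previous pass.
    float num = ix * ua + iy * va + it;
    float den = alpha2 + ix * ix + iy * iy;
    u[i] = __float2half(ua - ix * num / den);
    v[i] = __float2half(va - iy * num / den);
}

int main() {
    const int w = 640, h = 480;
    const size_t n = size_t(w) * h;
    __half *Ix, *Iy, *It, *uAvg, *vAvg, *u, *v;
    cudaMallocManaged(&Ix, n * sizeof(__half));   cudaMallocManaged(&Iy, n * sizeof(__half));
    cudaMallocManaged(&It, n * sizeof(__half));   cudaMallocManaged(&uAvg, n * sizeof(__half));
    cudaMallocManaged(&vAvg, n * sizeof(__half)); cudaMallocManaged(&u, n * sizeof(__half));
    cudaMallocManaged(&v, n * sizeof(__half));
    for (size_t i = 0; i < n; ++i) {
        Ix[i] = __float2half(0.1f); Iy[i] = __float2half(0.2f); It[i] = __float2half(0.05f);
        uAvg[i] = vAvg[i] = __float2half(0.f);
    }
    dim3 block(16, 16), grid((w + 15) / 16, (h + 15) / 16);
    hs_update<<<grid, block>>>(Ix, Iy, It, uAvg, vAvg, u, v, w, h, 0.1f);
    cudaDeviceSynchronize();
    printf("u[0] = %f\n", __half2float(u[0]));
    return 0;
}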
Fumero, Alfonso Juan José. "Accelerating interpreted programming languages on GPUs with just-in-time compilation and runtime optimisations." Thesis, University of Edinburgh, 2017. http://hdl.handle.net/1842/28718.
Hopson, Benjamin Thomas Ken. "Techniques of design optimisation for algorithms implemented in software." Thesis, University of Edinburgh, 2016. http://hdl.handle.net/1842/20435.
Full textLuong, Thé Van. "Métaheuristiques parallèles sur GPU." Thesis, Lille 1, 2011. http://www.theses.fr/2011LIL10058/document.
Real-world optimization problems are often complex and NP-hard. Their modeling is continuously evolving in terms of constraints and objectives, and their resolution is CPU time-consuming. Although near-optimal algorithms such as metaheuristics (generic heuristics) make it possible to reduce the temporal complexity of their resolution, they fail to tackle large problems satisfactorily. Over the last decades, parallel computing has been revealed as an unavoidable way to deal with large instances of difficult optimization problems. The design and implementation of parallel metaheuristics are strongly influenced by the computing platform. GPU computing has recently proven effective for dealing with time-intensive problems, and this emerging technology is believed to be extremely useful for speeding up many complex algorithms. One of the major issues for metaheuristics is to rethink existing parallel models and programming paradigms to allow their deployment on GPU accelerators. Generally speaking, the major issues we have to deal with are: the distribution of data processing between CPU and GPU, thread synchronization, the optimization of data transfers between the different memories, memory capacity constraints, etc. The contribution of this thesis is to deal with such issues in the redesign of parallel models of metaheuristics to allow the solving of large-scale optimization problems on GPU architectures. Our objective is to rethink the existing parallel models and to enable their deployment on GPUs. Thereby, we propose in this document a new generic guideline for building efficient parallel metaheuristics on GPU. Our challenge is to come up with the GPU-based design of the whole hierarchy of parallel models. For this purpose, very efficient approaches are proposed for CPU-GPU data transfer optimization, thread control, mapping of solutions to GPU threads, and memory management. These approaches have been exhaustively experimented using five optimization problems and four GPU configurations. Compared to a CPU-based execution, experiments report up to 80-fold acceleration for large combinatorial problems and up to 2000-fold speed-up for a continuous problem. The different works related to this thesis have been accepted in a dozen publications, including the IEEE Transactions on Computers journal.
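One common parallel model mentioned above maps each candidate solution (or neighbor) of a metaheuristic to a GPU thread. The sketch below illustrates that idea on a TSP 2-opt neighborhood, one thread evaluating the cost change of one candidate move, with the CPU then picking the best move; the instance, the symmetric distance matrix, and the tour are illustrative assumptions, not material from the thesis.

// Minimal sketch (not taken from the thesis): evaluating a full 2-opt neighborhood
// of a TSP tour on GPU, one thread per candidate move (i, j).
#include <cstdio>

__global__ void eval_2opt(const float* d, const int* tour, float* delta, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;   // first cut position
    int j = blockIdx.y * blockDim.y + threadIdx.y;   // second cut position
    if (i < 1 || j >= n - 1 || i >= j) return;
    int a = tour[i - 1], b = tour[i], c = tour[j], e = tour[j + 1];
    // Cost change if the segment tour[i..j] is reversed (symmetric distances).
    delta[i * n + j] = d[a * n + c] + d[b * n + e] - d[a * n + b] - d[c * n + e];
}

int main() {
    const int n = 512;
    float *d, *delta; int *tour;
    cudaMallocManaged(&d, n * n * sizeof(float));
    cudaMallocManaged(&delta, n * n * sizeof(float));
    cudaMallocManaged(&tour, n * sizeof(int));
    for (int i = 0; i < n; ++i) {
        tour[i] = i;
        for (int j = 0; j < n; ++j)
            d[i * n + j] = (i == j) ? 0.f : 1.f + float((i + j) % 97);  // symmetric toy distances
    }
    for (int k = 0; k < n * n; ++k) delta[k] = 0.f;
    dim3 block(16, 16), grid((n + 15) / 16, (n + 15) / 16);
    eval_2opt<<<grid, block>>>(d, tour, delta, n);
    cudaDeviceSynchronize();
    // The CPU side of the metaheuristic then selects the best (most negative) move.
    float best = 0.f; int bi = 0, bj = 0;
    for (int i = 1; i < n - 1; ++i)
        for (int j = i + 1; j < n - 1; ++j)
            if (delta[i * n + j] < best) { best = delta[i * n + j]; bi = i; bj = j; }
    printf("best move: reverse [%d..%d], delta = %f\n", bi, bj, best);
    return 0;
}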
Chrétien, Benjamin. "Optimisation semi-infinie sur GPU pour le contrôle corps-complet de robots." Thesis, Montpellier, 2016. http://www.theses.fr/2016MONTT315/document.
A humanoid robot is a complex system with numerous degrees of freedom, whose behavior is subject to the nonlinear equations of motion. As a result, planning its motion is a difficult task from a computational perspective. In this thesis, we aim at developing a method that can leverage the computing power of GPUs in the context of optimization-based whole-body motion planning. We first exhibit the properties of the optimization problem, and show that several avenues can be exploited in the context of parallel computing. Then, we present our approach of the dynamics computation, suitable for highly-parallel processing architectures. Next, we propose a many-core GPU implementation of the motion planning problem. Our approach computes the constraints and their gradients in parallel, and feeds the result to a nonlinear optimization solver running on the CPU. Because each constraint and its gradient can be evaluated independently for each time interval, we end up with a highly parallelizable problem that can take advantage of GPUs. We also propose a new parametrization of contact forces adapted to our optimization problem. Finally, we investigate the extension of our work to model predictive control.
Van, Luong Thé. "Métaheuristiques parallèles sur GPU." Phd thesis, Université des Sciences et Technologie de Lille - Lille I, 2011. http://tel.archives-ouvertes.fr/tel-00638820.
Full textDelevacq, Audrey. "Métaheuristiques pour l'optimisation combinatoire sur processeurs graphiques (GPU)." Thesis, Reims, 2013. http://www.theses.fr/2013REIMS011/document.
Several combinatorial optimization problems are NP-hard and can only be solved optimally by exact algorithms for small instances. Metaheuristics have proved to be effective in solving many of these problems by finding approximate solutions in a reasonable time. However, when dealing with large instances, they may require considerable computation time and memory space to explore the search space efficiently. Therefore, the interest devoted to their deployment on high-performance computing architectures has increased over the past years. Existing parallelization approaches generally follow the message-passing and shared-memory computing paradigms, which are suitable for traditional architectures based on microprocessors, also called CPUs (Central Processing Units). However, research in the field of parallel computing is rapidly evolving and new architectures are emerging, including hardware accelerators which offload some tasks from the CPU. Among them, graphics processors, or GPUs (Graphics Processing Units), have a massively parallel architecture with great potential but also imply new algorithmic and programming challenges. In fact, existing parallelization models of metaheuristics are generally unsuited to computing environments like GPUs. Few works have tackled this subject while providing a comprehensive and fundamental view of it. The general purpose of this thesis is to propose a framework for the effective implementation of metaheuristics on parallel architectures based on GPUs. It begins with a state of the art describing existing works on GPU parallelization of metaheuristics and general classifications of parallel metaheuristics. An original taxonomy is then designed to classify identified implementations and to formalize GPU parallelization strategies in a coherent methodological framework. This thesis also aims to validate this taxonomy by exploiting its main components to propose original parallelization strategies specifically tailored to GPU architectures. Several effective implementations based on Ant Colony Optimization and Iterated Local Search metaheuristics are thus proposed for solving the Travelling Salesman Problem. A structured and thorough experimental study is conducted to evaluate and compare the performance of the approaches on criteria related to solution quality and computing time reduction.
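To make the Ant Colony Optimization approach mentioned above more concrete, here is a minimal CUDA sketch of the pheromone-update step for the TSP: evaporation over all edges, then deposition along each ant's tour. It is an illustrative sketch rather than the parallelization strategies actually proposed in the thesis; the problem size, dummy tours, and constants are assumptions.

// Minimal sketch (not the thesis implementation): ACO pheromone update for the TSP.
// One thread per directed edge for evaporation; one thread per ant for deposition.
#include <cstdio>

__global__ void evaporate(float* tau, int n, float rho) {
    int e = blockIdx.x * blockDim.x + threadIdx.x;
    if (e < n * n) tau[e] *= (1.f - rho);          // pheromone evaporation
}

__global__ void deposit(float* tau, const int* tours, const float* lengths,
                        int nAnts, int n) {
    int a = blockIdx.x * blockDim.x + threadIdx.x; // one thread per ant
    if (a >= nAnts) return;
    float q = 1.f / lengths[a];                    // amount deposited by this ant
    const int* t = tours + a * n;
    for (int k = 0; k < n; ++k) {
        int i = t[k], j = t[(k + 1) % n];
        atomicAdd(&tau[i * n + j], q);             // several ants may share an edge
        atomicAdd(&tau[j * n + i], q);
    }
}

int main() {
    const int n = 256, nAnts = 128;
    float *tau, *lengths; int *tours;
    cudaMallocManaged(&tau, n * n * sizeof(float));
    cudaMallocManaged(&lengths, nAnts * sizeof(float));
    cudaMallocManaged(&tours, nAnts * n * sizeof(int));
    for (int e = 0; e < n * n; ++e) tau[e] = 1.f;
    for (int a = 0; a < nAnts; ++a) {
        lengths[a] = 100.f + a;                                 // placeholder tour lengths
        for (int k = 0; k < n; ++k) tours[a * n + k] = (k + a) % n;  // dummy tours
    }
    evaporate<<<(n * n + 255) / 256, 256>>>(tau, n, 0.1f);
    deposit<<<(nAnts + 127) / 128, 128>>>(tau, tours, lengths, nAnts, n);
    cudaDeviceSynchronize();
    printf("tau[0][1] = %f\n", tau[1]);
    return 0;
}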
Quinto, Michele Arcangelo. "Méthode de reconstruction adaptive en tomographie par rayons X : optimisation sur architectures parallèles de type GPU." Thesis, Grenoble, 2013. http://www.theses.fr/2013GRENT109/document.
Tomography reconstruction from projection data is an inverse problem widely used in the medical imaging field. With a sufficiently large number of projections over the required angle, FBP (filtered backprojection) algorithms allow fast and accurate reconstructions. However, in the cases of limited views (low-dose imaging) and/or limited angle (specific constraints of the setup), the data available for inversion are not complete, the problem becomes more ill-conditioned, and the results show significant artifacts. In these situations, an alternative approach to reconstruction, based on a discrete model of the problem, consists in using an iterative algorithm or a statistical modeling of the problem to compute an estimate of the unknown object. These methods are classically based on a volume discretization into a set of voxels and provide 3D maps of densities. Computation time and memory storage are their main disadvantages. Moreover, whatever the application, the volumes are segmented for quantitative analysis. Numerous segmentation methods, with different interpretations of the contours and various minimized energy functionals, are available, and the results can depend on their use. This thesis presents a novel approach of tomographic reconstruction performed simultaneously with the segmentation of the different materials of the object. The reconstruction process is no longer based on a regular grid of pixels (resp. voxels) but on a mesh composed of non-regular triangles (resp. tetrahedra) adapted to the shape of the studied object. After an initialization step, the method runs in three main steps: reconstruction, segmentation and adaptation of the mesh, which alternate iteratively until convergence. Iterative reconstruction algorithms used in a conventional way have been adapted and optimized to run on irregular grids of triangular or tetrahedral elements. For segmentation, two methods, one based on a parametric approach (snake) and the other on a geometric approach (level set), have been implemented to handle mono- and multi-material objects. The adaptation of the mesh to the content of the estimated image is based on the previously segmented contours, which makes the mesh progressively coarser from the edges to the limits of the reconstruction domain. At the end of the process, the result is a classical tomographic image in gray levels, but its representation by a mesh adapted to its content provides a corresponding segmentation. The results show that the method provides reliable reconstructions and drastically decreases memory storage. In this context, the projection operators have been implemented on a parallel architecture, namely the GPU. A first 2D version shows the feasibility of the full process, and an optimized version of the 3D operators provides more efficient computations.
O'Connell, Jonathan F. "A dynamic programming model to solve optimisation problems using GPUs." Thesis, Cardiff University, 2017. http://orca.cf.ac.uk/97930/.
Pospíchal, Petr. "Akcelerace genetického algoritmu s využitím GPU." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2009. http://www.nusl.cz/ntk/nusl-236783.
Full textAvramidis, Eleftherios. "Optimisation and computational methods to model the oculomotor system with focus on nystagmus." Thesis, University of Exeter, 2015. http://hdl.handle.net/10871/18291.
Claustre, Jonathan. "Modèle particulaire 2D et 3D sur GPU pour plasma froid magnétisé : Application à un filtre magnétique." Phd thesis, Université Paul Sabatier - Toulouse III, 2012. http://tel.archives-ouvertes.fr/tel-00796690.
Full textBachmann, Etienne. "Imagerie ultrasonore 2D et 3D sur GPU : application au temps réel et à l'inversion de forme d'onde complète." Thesis, Toulouse 3, 2016. http://www.theses.fr/2016TOU30133/document.
While the most important progress in ultrasound imaging has been closely linked to the quality of instrumentation, the advent of computing science revolutionized this discipline by introducing growing possibilities in data processing to obtain a better picture. In addition, GPUs, the main components of graphics cards, deliver thanks to their architecture a significantly higher processing speed than processors, including for scientific computing. The goal of this work is to take the best benefit of this new computing tool by targeting two complementary applications. The first one is to enable real-time imaging with a better quality than other sonographic imaging techniques, thanks to the parallelization of the FTIM (Fast Topological IMaging) process. The second one is to introduce quantitative imaging, and more particularly the reconstruction of the wavespeed map of an unknown medium, using Full Waveform Inversion.
GASPARETTO, THOMAS. "Development of a computing farm for Cloud computing on GPU - Development and optimisation of data-analysis methodologies for the Cherenkov Telescope Array." Doctoral thesis, Università degli Studi di Trieste, 2020. http://hdl.handle.net/11368/2963769.
The research activity was focused on the creation of simulation and analysis pipelines to be used at different levels in the context of the Cherenkov Telescope Array. The work consists of two main parts: the first one is dedicated to the reconstruction of the events coming from the Monte Carlo simulations using the ctapipe library, whereas the second part is devoted to the estimation of the future performance of CTA in the observation of violent phenomena such as those generating Gamma Ray Bursts and Gravitational Waves. The low-level reconstruction of the raw data was done with a pipeline which uses the ImPACT analysis, a template-based technique with templates derived from Monte Carlo simulations; ImPACT was used to obtain angular and energy resolution plots, but also fully profiled to find its bottlenecks, debugged and sped up. The code was used to analyse data from different telescope layouts and refactored to analyse data for the prototype of the LST-1 telescope, working in "mono mode" instead of the standard stereo mode. The analysis was re-implemented in order to try all the templates massively on the GPU in one single step. The implementation is done using the PyTorch library, developed for Deep Learning. The estimation of the performance of the telescopes in the so-called "divergent pointing mode" was investigated: in this scenario the telescopes have a slightly different pointing direction with respect to the parallel configuration, so that the final hyper field of view of the whole system is larger than with parallel pointing. The reconstruction code in ctapipe was adapted to this particular observation mode. The creation of a 3D displayer, done using VTK, helped in understanding the code and in fixing it accordingly. The extragalactic sky model for the First CTA Data Challenge was created by selecting sources from different catalogues. The goal of the DC-1 was to enable the CTA Consortium Science Working Groups to derive science benchmarks for the CTA Key Science Projects and get more people involved in the analyses. In order to do the simulations and analysis for the GRB and GW Consortium papers, a pipeline was created around the ctools library: it is made of two parts handled by configuration files, which take care both of the specific task to do (background simulation, model creation, and the simulation part which performs the detection and estimates the significance) and of the job submission. The research was done for 14 months (with 5 months covered by an additional scholarship from the French Embassy) at the Laboratoire d'Annecy de Physique des Particules (LAPP) in Annecy (France) under a joint-supervision program based on the mandatory research period abroad foreseen in the scholarship, funded by the European Social Fund.
Tanner, Michael. "BOR2G : Building Optimal Regularised Reconstructions with GPUs (in cubes)." Thesis, University of Oxford, 2017. https://ora.ox.ac.uk/objects/uuid:1928c996-d913-4d7e-8ca5-cf247f90aa0f.
Chernoglazov, Alexander Igorevich. "Improving Visualisation of Large Multi-Variate Datasets: New Hardware-Based Compression Algorithms and Rendering Techniques." Thesis, University of Canterbury. Computer Science and Software Engineering, 2012. http://hdl.handle.net/10092/7004.
Full textBeaugnon, Ulysse. "Efficient code generation for hardware accelerators by refining partially specified implementation." Thesis, Paris Sciences et Lettres (ComUE), 2019. http://www.theses.fr/2019PSLEE050.
Compilers looking for an efficient implementation of a function must find which optimizations are the most beneficial. This is a complex problem, especially in the early steps of the compilation process. Each decision may impact the transformations available in subsequent steps. We propose to represent the compilation process as the progressive refinement of a partially specified implementation. All potential decisions are exposed upfront and commute. This allows for making the most discriminative decisions first and for building a performance model aware of which optimizations may be applied in subsequent steps. We apply this approach to the generation of efficient GPU code for linear algebra and yield performance competitive with hand-tuned libraries.
He, Guanlin. "Parallel algorithms for clustering large datasets on CPU-GPU heterogeneous architectures." Electronic Thesis or Diss., université Paris-Saclay, 2022. http://www.theses.fr/2022UPASG062.
Clustering, which aims at achieving natural groupings of data, is a fundamental and challenging task in machine learning and data mining. Numerous clustering methods have been proposed in the past, among which k-means is one of the most famous and commonly used methods due to its simplicity and efficiency. Spectral clustering is a more recent approach that usually achieves higher clustering quality than k-means. However, classical algorithms of spectral clustering suffer from a lack of scalability due to their high complexities in terms of number of operations and memory space requirements. This scalability challenge can be addressed by applying approximation methods or by employing parallel and distributed computing. The objective of this thesis is to accelerate spectral clustering and make it scalable to large datasets by combining representatives-based approximation with parallel computing on CPU-GPU platforms. Considering different scenarios, we propose several parallel processing chains for large-scale spectral clustering. We design optimized parallel algorithms and implementations for each module of the proposed chains: parallel k-means on CPU and GPU, parallel spectral clustering on GPU using a sparse storage format, parallel filtering of data noise on GPU, etc. Our various experiments reach high performance and validate the scalability of each module and of the complete chains.
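As a small illustration of the parallel k-means module mentioned above, the following CUDA sketch implements only the assignment step: one thread per point, nearest centroid by squared Euclidean distance. The data, dimensions, and number of clusters are illustrative assumptions, the centroid-update step and the convergence loop are omitted, and this is not the thesis implementation.

// Minimal sketch (not the thesis code): k-means assignment step on GPU.
#include <cstdio>
#include <cfloat>

__global__ void assign_points(const float* data, const float* centroids,
                              int* labels, int nPoints, int dim, int k) {
    int p = blockIdx.x * blockDim.x + threadIdx.x;
    if (p >= nPoints) return;
    float bestDist = FLT_MAX; int best = 0;
    for (int c = 0; c < k; ++c) {
        float dist = 0.f;
        for (int d = 0; d < dim; ++d) {
            float diff = data[p * dim + d] - centroids[c * dim + d];
            dist += diff * diff;                   // squared Euclidean distance
        }
        if (dist < bestDist) { bestDist = dist; best = c; }
    }
    labels[p] = best;
}

int main() {
    const int nPoints = 100000, dim = 8, k = 16;
    float *data, *centroids; int *labels;
    cudaMallocManaged(&data, nPoints * dim * sizeof(float));
    cudaMallocManaged(&centroids, k * dim * sizeof(float));
    cudaMallocManaged(&labels, nPoints * sizeof(int));
    for (int i = 0; i < nPoints * dim; ++i)
        data[i] = float((i * 1103515245u + 12345u) % 1000) / 1000.f;  // synthetic data
    for (int i = 0; i < k * dim; ++i) centroids[i] = float(i % 10) / 10.f;
    assign_points<<<(nPoints + 255) / 256, 256>>>(data, centroids, labels, nPoints, dim, k);
    cudaDeviceSynchronize();
    printf("label of point 0: %d\n", labels[0]);
    return 0;
}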
Mokos, Athanasios Dorotheos. "Multi-phase modelling of violent hydrodynamics using Smoothed Particle Hydrodynamics (SPH) on Graphics Processing Units (GPUs)." Thesis, University of Manchester, 2014. https://www.research.manchester.ac.uk/portal/en/theses/multiphase-modelling-of-violent-hydrodynamics-using-smoothed-particle-hydrodynamics-sph-on-graphics-processing-units-gpus(a82b8187-f81a-400b-8bd2-9a74c502a953).html.
Monnier, Nicolas. "ExaSKA : Parallelization on a High Performance Computing server for the exascale radiotelescope SKA." Electronic Thesis or Diss., université Paris-Saclay, 2023. http://www.theses.fr/2023UPASG036.
Radio interferometers simulate a large telescope via a network of antennas. Imaging reconstructs an image of the observed sky from the signals received by the antennas: the image lies in the spatial domain, while the data, the visibilities, lie in the Fourier domain. This reconstruction problem is "ill-posed" because the measurements do not cover the entire Fourier plane and are corrupted by the effects of signal propagation in the Earth's atmosphere. Iterative algorithms use a priori information about the sky for reconstruction but require interpolation of visibilities onto a uniform grid to use fast Fourier transform algorithms. In the "backward" model, the interpolation, called gridding, spreads visibilities onto a uniform grid using a convolution kernel. In the "forward" model, the interpolation, called degridding, is the adjoint operation that gathers information over an area centered on the visibility position. The processing and storage of visibilities are computationally expensive due to the extremely large data rates generated by radio telescopes, especially with the new generation of interferometers. Image reconstruction is a major challenge due to the high computational cost of the interpolation operators, gridding and degridding, and of the reconstruction algorithms, which are a bottleneck. This thesis focuses on reducing the computation time of imaging methods along two axes: the algorithmic aspect and the hardware implementation with fine-grained and coarse-grained parallelization. A method for reducing the computational cost of the gridding and degridding operators is presented by merging them into a single operator, named Grid to Grid (G2G), which relies on the succession of the two operators as well as on the approximation of the visibility coordinates on the Fourier grid. CPU and GPU implementations of this method show that G2G significantly reduces the computational cost and memory footprint without penalizing reconstruction quality. The oversampling factor serves as a balance between reducing the computational cost and interpolation accuracy. A multi-core multi-node distribution of the DDFacet imaging framework on an HPC server is also presented. Parallelization is divided into several levels: multi-core parallelization for shared-memory systems, based on the independence of calculations between facets, and multi-node parallelization for distributed-memory systems, based on the independence of the gridding and degridding calculations between different observation frequencies. These two levels of parallelization significantly reduce execution time; the acceleration is not linear, which allows a trade-off between acceleration and the computational resources used.
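The gridding operation described above lends itself naturally to a one-thread-per-visibility GPU formulation. The following CUDA sketch shows classical convolutional gridding with atomic accumulation onto a uniform grid; it is not the G2G operator from the thesis, and the grid size, kernel support, and Gaussian kernel are illustrative assumptions.

// Minimal sketch (not the thesis G2G operator): convolutional gridding on GPU,
// one thread per visibility; real/imaginary parts are interleaved in the grid.
#include <cstdio>
#include <cmath>

__global__ void grid_visibilities(const float* u, const float* v,
                                  const float* visRe, const float* visIm,
                                  float* grid, int nVis, int gridSize, int support) {
    int t = blockIdx.x * blockDim.x + threadIdx.x;
    if (t >= nVis) return;
    float gu = u[t], gv = v[t];                    // visibility coordinates in grid units
    int iu = int(roundf(gu)), iv = int(roundf(gv));
    for (int dy = -support; dy <= support; ++dy) {
        for (int dx = -support; dx <= support; ++dx) {
            int x = iu + dx, y = iv + dy;
            if (x < 0 || y < 0 || x >= gridSize || y >= gridSize) continue;
            float r2 = (x - gu) * (x - gu) + (y - gv) * (y - gv);
            float w = expf(-r2);                   // Gaussian gridding kernel (one common choice)
            atomicAdd(&grid[2 * (y * gridSize + x)],     w * visRe[t]);
            atomicAdd(&grid[2 * (y * gridSize + x) + 1], w * visIm[t]);
        }
    }
}

int main() {
    const int nVis = 1 << 20, gridSize = 1024, support = 3;
    float *u, *v, *re, *im, *grid;
    cudaMallocManaged(&u, nVis * sizeof(float));   cudaMallocManaged(&v, nVis * sizeof(float));
    cudaMallocManaged(&re, nVis * sizeof(float));  cudaMallocManaged(&im, nVis * sizeof(float));
    cudaMallocManaged(&grid, 2 * size_t(gridSize) * gridSize * sizeof(float));
    for (int i = 0; i < nVis; ++i) {
        u[i] = float(i % gridSize); v[i] = float((i / gridSize) % gridSize);
        re[i] = 1.f; im[i] = 0.f;                  // synthetic unit visibilities
    }
    cudaMemset(grid, 0, 2 * size_t(gridSize) * gridSize * sizeof(float));
    grid_visibilities<<<(nVis + 255) / 256, 256>>>(u, v, re, im, grid, nVis, gridSize, support);
    cudaDeviceSynchronize();
    printf("grid(0,0) real part = %f\n", grid[0]);
    return 0;
}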
Zhang, Naiyu. "Cellular GPU Models to Euclidean Optimization Problems : Applications from Stereo Matching to Structured Adaptive Meshing and Traveling Salesman Problem." Thesis, Belfort-Montbéliard, 2013. http://www.theses.fr/2013BELF0215/document.
The work presented in this PhD studies and proposes cellular computation parallel models able to address different types of NP-hard optimization problems defined in the Euclidean space, and their implementation on the Graphics Processing Unit (GPU) platform. The goal is both to deal with large size problems and to provide substantial acceleration factors by massive parallelism. The field of applications concerns vehicle embedded systems for stereovision as well as transportation problems in the plane, such as vehicle routing problems. The main characteristic of the cellular model is that it decomposes the plane into an appropriate number of cellular units, each responsible for a constant part of the input data, and such that each cell corresponds to a single processing unit. Hence, the number of processing units and the required memory increase linearly with the optimization problem size, which makes the model able to deal with very large size problems. The effectiveness of the proposed cellular models has been tested on the GPU parallel platform on four applications. The first application is a stereo-matching problem. It concerns color stereovision. The problem input is a stereo image pair, and the output a disparity map that represents depths in the 3D scene. The goal is to implement and compare GPU/CPU winner-takes-all local dense stereo-matching methods dealing with CFA (color filter array) image pairs. The second application focuses on the possible GPU improvements able to reach near real-time stereo-matching computation. The third and fourth applications deal with a cellular GPU implementation of the self-organizing map neural network in the plane. The third application concerns structured mesh generation according to the disparity map to allow 3D surface compressed representation. The fourth application addresses large size Euclidean traveling salesman problems (TSP) with up to 33708 cities. In all applications, GPU implementations allow substantial acceleration factors over CPU versions as the problem size increases, for similar or higher quality results. The GPU speedup over the CPU was about 20 times for the CFA image pairs, with a GPU computation time of about 0.2 s for a small image pair from the Middlebury database. The near real-time stereovision algorithm takes about 0.017 s for a small image pair, which is one of the fastest records in the Middlebury benchmark with moderate quality. The structured mesh generation is evaluated on the Middlebury data set to gauge the GPU acceleration factor and the quality obtained. The acceleration factor of the GPU parallel self-organizing map over the CPU version, on the largest TSP problem with 33708 cities, is about 30 times.
Cui, Beibei. "Image processing applications in object detection and graph matching : from Matlab development to GPU framework." Thesis, Bourgogne Franche-Comté, 2020. http://www.theses.fr/2020UBFCA002.
Automatically finding correspondences between object features in images is of main interest for several applications, such as object detection and tracking, flow velocity estimation, identification, registration, and many derived tasks. In this thesis, we address feature correspondence within the general framework of graph matching optimization, with the principal aim of contributing, in the final step, to the design of new parallel algorithms and their implementation on GPU (Graphics Processing Unit) systems. Graph matching problems can have many variants, depending on the assumptions of the application at hand. We observed a gap between applications based on local cost objective functions and those with higher-order cost functions, which evaluate similarity between edges of the graphs, or hyperedges when considering hypergraphs. The former class provides convolution-based algorithms that already have parallel GPU implementations, whereas the latter class puts the emphasis on geometric inter-feature relationships, transforming the correspondence problem into a purely geometric problem stated in a high-dimensional space, generally modeled as an integer quadratic program, for which we did not find GPU implementations available yet. Two complementary approaches were adopted in order to contribute to addressing higher-order geometric graph matching on GPU. Firstly, we study different variants of feature correspondence problems using the Matlab platform, in order to reuse and provide state-of-the-art solution methods, as well as the experimental protocols and input data necessary for a GPU platform, with tools for evaluation and comparison against existing sequential algorithms, most of the time developed in the Matlab framework. The first part of this work thus concerns three contributions, respectively to a background and frame difference application, to the feature extraction problem from images for local correspondences, and to the general graph matching problem, all based on the combination of methods derived from the Matlab environment. Secondly, and based on the results of the Matlab developments, we propose a new GPU framework written in CUDA C++ specifically dedicated to geometric graph matching but providing new parallel algorithms with lower computational complexity, such as the self-organizing map in the plane, derived parallel clustering algorithms, and a distributed local search method. These parallel algorithms are then evaluated and compared to the state-of-the-art methods available for graph matching, following the same experimental protocol. This GPU platform constitutes our final and main proposal to contribute to bridging the gap between GPU development and higher-order graph matching.
Codol, Jean-Marie. "Hybridation GPS/Vision monoculaire pour la navigation autonome d'un robot en milieu extérieur." Thesis, Toulouse, INSA, 2012. http://www.theses.fr/2012ISAT0060/document.
We are witnessing nowadays the introduction of ICT (Information and Communications Technology) into robotics. In the coming years, these technologies will give birth to general-public service robotics. This future, if realised, will be the result of much research conducted in several domains: mechatronics, telecommunications, automatic control, signal and image processing, artificial intelligence, etc. One particularly interesting aspect in mobile robotics is hence the simultaneous localisation and mapping problem. To access certain information, a mobile robot has, in many cases, to map or localise itself inside its environment. The following question is then posed: what precision can we aim for in terms of localisation, and at what cost? In this context, one of the objectives of many laboratories involved in robotics research, whose results directly impact industry, is the positioning and mapping of the environment. These tasks should be precise, applicable everywhere, integrated, low-cost and real-time. The prediction sensors are inexpensive ones, such as a standard GPS (of metric precision), together with a set of embeddable payload sensors (e.g. video cameras). These types of sensors constitute the main support in our work. In this thesis, we shed light on the localisation problem of a mobile robot, which we choose to handle with a probabilistic approach. The procedure is as follows: we first define our "variables of interest", which are a set of random variables, and then we describe their distribution laws and their evolution models. Afterwards, we determine a cost function in such a way as to build an observer (an algorithmic class whose objective is to minimize the cost function). Our contribution consists in using raw GPS measurements (raw data are measurements issued from the code and phase correlation loops, called code and phase pseudo-range measurements, respectively) for low-cost navigation that is precise in an outdoor suburban environment. By exploiting the integer property of GPS phase ambiguities, we extend the navigation to achieve a GPS-RTK (Real-Time Kinematic) system in a precise and low-cost local differential mode. Our propositions have been validated through experiments carried out on our robotic demonstrator.
Mantell, Rosemary Genevieve. "Accelerated sampling of energy landscapes." Thesis, University of Cambridge, 2017. https://www.repository.cam.ac.uk/handle/1810/267990.
Weber, Bruno. "Optimisation de code Galerkin discontinu sur ordinateur hybride : application à la simulation numérique en électromagnétisme." Thesis, Strasbourg, 2018. http://www.theses.fr/2018STRAD046/document.
In this thesis, we present the evolutions made to the Discontinuous Galerkin solver Teta-CLAC – resulting from the IRMA-AxesSim collaboration – during the HOROCH project (2015-2018). This solver allows solving the Maxwell equations in 3D and in parallel on a large number of OpenCL accelerators. The goal of the HOROCH project was to perform large-scale simulations on a complete digital human body model. This model is composed of 24 million hexahedral cells in order to perform calculations in the frequency band of connected objects, going from 1 to 3 GHz (Bluetooth). The applications are numerous: telephony and accessories, sport (connected shirts), medicine (probes: capsules, patches), etc. The changes thus made include, among others: optimization of OpenCL kernels for CPUs in order to make the best use of hybrid architectures; StarPU runtime experimentation; the design of an integration scheme using local time steps; and many optimizations allowing the solver to process simulations of several million cells.
Crestetto, Anaïs. "Optimisation de méthodes numériques pour la physique des plasmas : application aux faisceaux de particules chargées." Phd thesis, Université de Strasbourg, 2012. http://tel.archives-ouvertes.fr/tel-00735569.
Lalami, Mohamed Esseghir. "Contribution à la résolution de problèmes d'optimisation combinatoire : méthodes séquentielles et parallèles." Phd thesis, Université Paul Sabatier - Toulouse III, 2012. http://tel.archives-ouvertes.fr/tel-00748546.
Full textBramas, Bérenger. "Optimization and parallelization of the boundary element method for the wave equation in time domain." Thesis, Bordeaux, 2016. http://www.theses.fr/2016BORD0022/document.
The time-domain BEM for the wave equation in acoustics and electromagnetism is used to simulate the propagation of a wave with a discretization in time. It allows obtaining several frequency-domain results with one solve. In this thesis, we investigate the implementation of an efficient TD-BEM solver using different approaches. We describe the context of our study and the TD-BEM formulation expressed as a sparse linear system composed of multiple interaction/convolution matrices. This system is naturally computed using the sparse matrix-vector product (SpMV). We work on the limits of the SpMV kernel by looking at matrix reordering and at the behavior of our SpMV kernels using vectorization (SIMD) on CPUs and an advanced blocking layout on Nvidia GPUs. We show that this operator is not appropriate for our problem, and we then propose to reorder the original computation to get a special matrix structure. This new structure is called a slice matrix and is computed with a custom matrix/vector product operator. We present an optimized implementation of this operator on CPUs and Nvidia GPUs, for which we describe advanced blocking schemes. The resulting solver is parallelized with a hybrid strategy over heterogeneous nodes and relies on a new heuristic to balance the work among the processing units. Due to the quadratic complexity of this matrix approach, we study the use of the fast multipole method (FMM) for our time-domain BEM solver. We investigate the parallelization of the general FMM algorithm using several paradigms in both shared and distributed memory, and we explain how modern runtime systems are well-suited to express the FMM computation. Finally, we investigate the implementation and the parametrization of an FMM kernel specific to our TD-BEM, and we provide preliminary results.
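To illustrate the baseline operator whose limits are discussed above, here is a minimal scalar CSR sparse matrix-vector product in CUDA, one thread per row. It is a generic SpMV sketch, not the custom slice-matrix operator the thesis ultimately proposes, and the tiny example matrix is an illustrative assumption.

// Minimal sketch (not the thesis solver): scalar CSR SpMV on GPU, one thread per row.
#include <cstdio>
#include <cstring>

__global__ void spmv_csr(const int* rowPtr, const int* colIdx, const float* val,
                         const float* x, float* y, int nRows) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= nRows) return;
    float sum = 0.f;
    for (int k = rowPtr[row]; k < rowPtr[row + 1]; ++k)
        sum += val[k] * x[colIdx[k]];              // accumulate one matrix row
    y[row] = sum;
}

int main() {
    // 3x3 example matrix:  [1 0 2; 0 3 0; 4 0 5]
    const int nRows = 3;
    int hRowPtr[] = {0, 2, 3, 5}, hColIdx[] = {0, 2, 1, 0, 2};
    float hVal[] = {1, 2, 3, 4, 5}, hX[] = {1, 1, 1};
    int *rowPtr, *colIdx; float *val, *x, *y;
    cudaMallocManaged(&rowPtr, sizeof(hRowPtr)); cudaMallocManaged(&colIdx, sizeof(hColIdx));
    cudaMallocManaged(&val, sizeof(hVal));       cudaMallocManaged(&x, sizeof(hX));
    cudaMallocManaged(&y, nRows * sizeof(float));
    memcpy(rowPtr, hRowPtr, sizeof(hRowPtr)); memcpy(colIdx, hColIdx, sizeof(hColIdx));
    memcpy(val, hVal, sizeof(hVal));          memcpy(x, hX, sizeof(hX));
    spmv_csr<<<1, 32>>>(rowPtr, colIdx, val, x, y, nRows);
    cudaDeviceSynchronize();
    printf("y = [%g, %g, %g]\n", y[0], y[1], y[2]);  // expected [3, 3, 9]
    return 0;
}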
Balestra, Julien. "Caractérisation de la source des séismes par inversion des données sismologiques et géodésiques : mécanismes au foyer, optimisation des modèles de vitesse, distribution du glissement cosismique." Thesis, Université Côte d'Azur (ComUE), 2017. http://www.theses.fr/2017AZUR4020/document.
Studies of the earthquake source are based on observations of seismic ground motions. They also depend on the quality and the density of measurements. In this work we present studies of the determination of the focal mechanisms of the main aftershocks of the Les Saintes (MW 6.4, 2004) earthquake, and of the determination of the coseismic slip of the L'Aquila (MW 6.3, 2009), the Miyagi-Oki (MW 7.2, 2005), and the Sanriku-Oki (MW 7.3, 2011) earthquakes. These studies were based on two inversion methods. Different kinds of data were available (strong motion, broadband teleseismic, GPS and InSAR) depending on the earthquake studied. But the multiplicity of data is not sufficient to describe the rupture process well. There are other difficulties, such as the modeling of strong-motion data. Seismic velocity models are used to describe the characteristics of the layers crossed by seismic waves. The quality of the modeling depends on the pertinence of these seismic velocity models. The description of the rupture process also depends on the non-uniqueness of the best solution given by global inversion methods. We propose two procedures to take these two classic issues into account. First, we developed a velocity-model exploration procedure to obtain optimized 1D velocity models in order to improve the strong-motion modeling of the L'Aquila earthquake. Then we developed a procedure to build an average rupture model from the combined results of several joint inversions, which was applied to the L'Aquila, the Miyagi-Oki, and the Sanriku-Oki earthquakes. This thesis presents all these works and addresses the issues raised.
Heiries, Vincent. "Optimisation d'une chaîne de réception pour signaux de radionavigation à porteuse à double décalage (BOC) retenus pour les systèmes GALILEO et GPS modernisé." Toulouse, ISAE, 2007. http://www.theses.fr/2007ESAE0018.
Petit, Eric. "Vers un partitionnement automatique d'applications en codelets spéculatifs pour les systèmes hétérogènes à mémoires distribuées." Phd thesis, Université Rennes 1, 2009. http://tel.archives-ouvertes.fr/tel-00445512.
Full textBesch, Guillaume. "Optimisation du contrôle glycémique en chirurgie cardiaque : variabilité glycémique, compliance aux protocoles de soins, et place des incrétino-mimétiques." Thesis, Bourgogne Franche-Comté, 2017. http://www.theses.fr/2017UBFCE016/document.
Stress hyperglycemia and glycemic variability are associated with increased morbidity and mortality in cardiac surgery patients. Intravenous (IV) insulin therapy using complex dynamic protocols is the gold-standard treatment for stress hyperglycemia. While the optimal blood glucose target range remains a matter of debate, blood glucose control using IV insulin therapy protocols has become part of good clinical practice during the postoperative period, but implies a significant increase in nurse workload. In the 1st part of the thesis, we aimed at checking nurse compliance with the insulin therapy protocol used in our Cardiac Surgery Intensive Care Unit 7 years after its implementation. Major deviations were observed, and simple corrective measures restored a high level of nurse compliance. In the 2nd part of this thesis, we aimed at assessing whether blood glucose variability could be related to poor outcome in transcatheter aortic valve implantation (TAVI) patients, as reported in more invasive cardiac surgery procedures. The analysis of data from patients who underwent TAVI in our institution and were included in the multicenter France and France-2 registries suggested that increased glycemic variability is associated with a higher rate of major adverse events occurring between the 3rd and the 30th day after TAVI, regardless of hyperglycemia. In the 3rd part of this thesis, we conducted a randomized controlled phase II/III trial to investigate the clinical effectiveness of IV exenatide in perioperative blood glucose control after coronary artery bypass graft surgery. Intravenous exenatide failed to improve blood glucose control and to decrease glycemic variability, but allowed delaying the start of insulin infusion and lowering the insulin dose required. Moreover, IV exenatide could allow a significant decrease in nurse workload. The ancillary analysis of this trial suggested that IV exenatide neither provided a cardioprotective effect against myocardial ischemia-reperfusion injuries nor improved the left ventricular function. Strategies aiming at improving nurse compliance with insulin therapy protocols and at reducing blood glucose variability could be suitable to improve blood glucose control in cardiac surgery patients. The use of GLP-1 analogues in cardiac surgery patients needs to be investigated further.
Jaeger, Julien. "Transformations source-à-source pour l'optimisation de codes irréguliers et multithreads." Phd thesis, Université de Versailles-Saint Quentin en Yvelines, 2012. http://tel.archives-ouvertes.fr/tel-00842177.
Full textXia, Liang. "Towards optimal design of multiscale nonlinear structures : reduced-order modeling approaches." Thesis, Compiègne, 2015. http://www.theses.fr/2015COMP2230/document.
High-performance heterogeneous materials have been increasingly used nowadays for their advantageous overall characteristics, resulting in superior structural mechanical performance. The pronounced heterogeneities of materials have a significant impact on the structural behavior, such that one needs to account for both the material's microscopic heterogeneities and the constituent behaviors to achieve reliable structural designs. Meanwhile, the fast progress of material science and the latest development of 3D printing techniques make it possible to generate more innovative, lightweight, and structurally efficient designs through controlling the composition and the microstructure of material at the microscopic scale. In this thesis, we have made first attempts towards topology optimization design of multiscale nonlinear structures, including design of highly heterogeneous structures, material microstructural design, and simultaneous design of structure and materials. We have primarily developed a multiscale design framework, constituted of two key ingredients: multiscale modeling for structural performance simulation and topology optimization for structural design. With regard to the first ingredient, we employ the first-order computational homogenization method FE2 to bridge structural and material scales. With regard to the second ingredient, we apply the Bi-directional Evolutionary Structural Optimization (BESO) method to perform topology optimization. In contrast to the conventional nonlinear design of homogeneous structures, this design framework provides an automatic design tool for nonlinear highly heterogeneous structures whose underlying material model is governed directly by the realistic microstructural geometry and the microscopic constitutive laws. Note that the FE2 method is extremely expensive in terms of computing time and storage requirements. The dilemma of heavy computational burden is even more pronounced when it comes to topology optimization: not only is it required to solve the time-consuming multiscale problem once, but for many different realizations of the structural topology. Meanwhile, we note that the optimization process requires multiple design loops involving similar or even repeated computations at the microscopic scale. For these reasons, we introduce to the design framework a third ingredient: reduced-order modeling (ROM). We develop an adaptive surrogate model using snapshot Proper Orthogonal Decomposition (POD) and Diffuse Approximation to substitute the microscopic solutions. The surrogate model is initially built from the first design iteration and updated adaptively in the subsequent design iterations. This surrogate model has shown promising performance in terms of reduced computing cost and modeling accuracy when applied to the design framework for nonlinear elastic cases. For more severe material nonlinearity, we directly employ an established method, the potential-based Reduced Basis Model Order Reduction (pRBMOR). The key idea of pRBMOR is to approximate the internal variables of the dissipative material by a precomputed reduced basis obtained from snapshot POD. To drastically accelerate the computing procedure, pRBMOR has been implemented with parallelization on modern Graphics Processing Units (GPUs). The implementation of pRBMOR with GPU acceleration enables us to realize the design of multiscale elastoviscoplastic structures using the previously developed design framework in realistic computing time and with affordable memory requirements. We have so far assumed a fixed material microstructure at the microscopic scale. The remaining part of the thesis is dedicated to the simultaneous design of both the macroscopic structure and the microscopic materials. In the previously established multiscale design framework, topology variables and volume constraints are then defined at both scales.
Selmi, Ikhlas. "Optimisation de l'infrastructure d'un système de positionnement indoor à base de transmetteurs GNSS." Electronic Thesis or Diss., Evry, Institut national des télécommunications, 2013. http://www.theses.fr/2013TELE0024.
In order to make the GNSS positioning service continuous and available when going from an outdoor to an indoor environment, pseudolite- and repeater-based systems have been developed. A new system called repealite is a combination of both pseudolites and repeaters. It is based on transmitting a single signal through a set of transmitters (thus creating the local constellation). In order to avoid interference between the repealite signals and to distinguish between them at the receiver's end, each signal is shifted by a specific delay. The research carried out in this PhD aims at optimizing two aspects of the repealite-based system. Firstly, we need to mitigate the effect of the interference caused on the satellite signals received outdoors. So we decided to design new codes characterized by low interference levels with outdoor signals. Secondly, we worked on the infrastructure part in order to simplify it and make it easier to install: this is mainly achieved through the use of optical fibers. In the first part, we study the codes and the modulation techniques currently used in GNSS systems. Then, we propose a few codes having an interference level equivalent to that of the GPS (obtained when correlating two GPS codes). These new codes are compatible with the GPS L1 or the Glonass G1 bands. In a second step, we focus on the modulation techniques and create the so-called IMBOC (Indoor Modified Binary Offset Carrier), which aims at minimizing the interference levels with outdoor signals. With this modulation, we propose new IMBOC codes capable of much lower interference levels than the GPS reference. In order to evaluate the performance of the proposed codes, we carried out a theoretical study, simulations and experimental tests. The interference gain reached about 20 dB on the GPS band and 16 dB on the Glonass one. The proposed codes are divided into two categories: those reserved for the repealite system (using a single code) and families of codes suited to pseudolite-based systems. Finally, we generated the IMBOC signals modulated by the new codes and tested the real interference induced on an outdoor receiver tracking the satellite signals. In the second part, we use optical fibers to replace the coaxial cables used to transmit signals from the GNSS-like signal generator to the repealites. In addition, the initial delay needed for each repealite is added by propagating the signals through rolls of fibers. Indeed, optical fiber offers advantages such as lightness, flexibility and low power loss, which make it suitable for simplifying the infrastructure of the system. In order to evaluate the real delays of these various fibers, we develop an estimation method based on phase-shift measurements (between two sinusoidal signals) and statistical analysis of the series of measurements. This method should have uncertainties lower than one centimeter in order to ensure sub-meter precision (in absolute positioning with the repealite positioning system). In order to validate this method, we compare it to a GNSS-based calibration approach. Finally, we carry out a few positioning tests with the repealite positioning system deployed in a typical indoor environment. These tests deal with absolute and relative positioning and give an idea of the system's performance.
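The interference level between two spreading codes, as discussed above, is typically characterized by their cross-correlation. The following CUDA sketch evaluates the circular cross-correlation of two +/-1 code sequences at every phase shift, one thread per shift; the random placeholder sequences and the 1023-chip length are illustrative assumptions, not the proposed IMBOC codes.

// Minimal sketch (not the thesis method): circular cross-correlation of two code
// sequences on GPU, used here as a simple interference metric between codes.
#include <cstdio>
#include <cstdlib>
#include <cmath>

__global__ void cross_corr(const float* a, const float* b, float* corr, int len) {
    int shift = blockIdx.x * blockDim.x + threadIdx.x;
    if (shift >= len) return;
    float acc = 0.f;
    for (int i = 0; i < len; ++i)
        acc += a[i] * b[(i + shift) % len];        // circular correlation at this lag
    corr[shift] = acc / len;
}

int main() {
    const int len = 1023;                          // GPS C/A codes are 1023 chips long
    float *a, *b, *corr;
    cudaMallocManaged(&a, len * sizeof(float));
    cudaMallocManaged(&b, len * sizeof(float));
    cudaMallocManaged(&corr, len * sizeof(float));
    srand(42);
    for (int i = 0; i < len; ++i) {                // random +/-1 placeholder sequences
        a[i] = (rand() & 1) ? 1.f : -1.f;
        b[i] = (rand() & 1) ? 1.f : -1.f;
    }
    cross_corr<<<(len + 255) / 256, 256>>>(a, b, corr, len);
    cudaDeviceSynchronize();
    float worst = 0.f;
    for (int s = 0; s < len; ++s) worst = fmaxf(worst, fabsf(corr[s]));
    printf("worst-case normalized cross-correlation: %f (%.1f dB)\n", worst, 20.f * log10f(worst));
    return 0;
}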
Seznec, Mickaël. "From the algorithm to the targets, optimization flow for high performance computing on embedded GPUs." Electronic Thesis or Diss., université Paris-Saclay, 2021. http://www.theses.fr/2021UPASG074.
Current digital processing algorithms require more computing power to achieve more accurate results and to process larger data. In the meantime, hardware architectures are becoming more specialized, with highly efficient accelerators designed for specific tasks. In this context, the deployment path from the algorithm to the implementation becomes increasingly complex. It is, therefore, crucial to determine how algorithms can be modified to take advantage of new hardware capabilities. Our study focused on graphics processing units (GPUs), a massively parallel processor. Our algorithmic work was done in the context of radio astronomy and optical flow estimation and consisted in finding the best adaptation of the software to the hardware. At the level of a mathematical operator, we modified the traditional image convolution algorithm to use the matrix units and showed that its performance doubles for large convolution kernels. At a broader method level, we evaluated linear solvers for the combined local-global optical flow to find the most suitable one on GPU. With additional optimizations, such as iteration fusion or memory buffer re-utilization, the method is twice as fast as the initial implementation, running at 60 frames per second on an embedded platform (30 W). Finally, we also pointed out the interest of this hardware-aware algorithm design method in the context of deep neural networks. For that, we showed the hybridization of a convolutional neural network for optical flow estimation with a pre-trained image classification network, MobileNet, initially designed for efficient image classification on low-power platforms.
Selmi, Ikhlas. "Optimisation de l'infrastructure d'un système de positionnement indoor à base de transmetteurs GNSS." Phd thesis, Institut National des Télécommunications, 2013. http://tel.archives-ouvertes.fr/tel-00919772.
Full textAhmed, Bacha Adda Redouane. "Localisation multi-hypothèses pour l'aide à la conduite : conception d'un filtre "réactif-coopératif"." Thesis, Evry-Val d'Essonne, 2014. http://www.theses.fr/2014EVRY0051/document.
"When we use information from one source, it's plagiarism; when we use information from many, it's information fusion." This work presents an innovative collaborative data fusion approach for ego-vehicle localization. This approach, called the Optimized Kalman Particle Swarm (OKPS), is a data fusion and optimized filtering method. Data fusion is performed using data from a low-cost GPS, an INS, an odometer and a steering-wheel angle encoder. This work proved that this approach is both more appropriate and more efficient for vehicle ego-localization in situations of degraded sensor performance and strong nonlinearity. The most widely used vehicle localization methods are the Bayesian approaches represented by the EKF and its variants (UKF, DD1, DD2). The Bayesian methods suffer from sensitivity to noise and instability in highly non-linear cases. Proposed to overcome the limitations of Bayesian methods, multi-hypothesis (particle-based) approaches are used for ego-vehicle localization. Inspired by Monte Carlo simulation methods, the Particle Filter (PF) has a performance that is strongly dependent on computational resources. Taking advantage of existing localization techniques and integrating the benefits of metaheuristic optimization, the OKPS is designed to deal with the vehicle's highly nonlinear dynamics, data noise and real-time requirements. For ego-vehicle localization, especially for highly dynamic on-road maneuvers, a filter needs to be robust and reactive at the same time. The OKPS filter is a new cooperative-reactive localization algorithm inspired by dynamic Particle Swarm Optimization (PSO) metaheuristic methods. It combines the advantages of the PSO and of two other filters: the Particle Filter (PF) and the Extended Kalman Filter (EKF). The OKPS is tested using real data collected with a vehicle equipped with embedded sensors. Its performance is tested in comparison with the EKF, the PF and the Swarm Particle Filter (SPF). The SPF is an interesting particle-based hybrid filter combining the advantages of PSO and particle filtering; it represents the first step of the OKPS development. The results show the efficiency of the OKPS for a highly dynamic driving scenario with degraded and low-quality GPS data.
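Since the OKPS builds on Particle Swarm Optimization, a minimal CUDA sketch of a basic PSO velocity/position update (one thread per particle) is given below as background. The toy 2D cost function, fixed pseudo-random numbers, and constants are illustrative assumptions; this is not the OKPS filter itself.

// Minimal sketch (not the OKPS): one step of a basic PSO, one thread per particle.
#include <cstdio>

__device__ float cost(float x, float y) { return x * x + y * y; }  // toy objective

__global__ void pso_step(float* px, float* py, float* vx, float* vy,
                         float* bx, float* by, float* bcost,
                         float gx, float gy, const float* r, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    const float w = 0.7f, c1 = 1.5f, c2 = 1.5f;    // inertia and acceleration weights
    vx[i] = w * vx[i] + c1 * r[2 * i] * (bx[i] - px[i]) + c2 * r[2 * i + 1] * (gx - px[i]);
    vy[i] = w * vy[i] + c1 * r[2 * i] * (by[i] - py[i]) + c2 * r[2 * i + 1] * (gy - py[i]);
    px[i] += vx[i]; py[i] += vy[i];
    float c = cost(px[i], py[i]);
    if (c < bcost[i]) { bcost[i] = c; bx[i] = px[i]; by[i] = py[i]; }  // personal best
}

int main() {
    const int n = 1024;
    float *px, *py, *vx, *vy, *bx, *by, *bcost, *r;
    cudaMallocManaged(&px, n * sizeof(float)); cudaMallocManaged(&py, n * sizeof(float));
    cudaMallocManaged(&vx, n * sizeof(float)); cudaMallocManaged(&vy, n * sizeof(float));
    cudaMallocManaged(&bx, n * sizeof(float)); cudaMallocManaged(&by, n * sizeof(float));
    cudaMallocManaged(&bcost, n * sizeof(float)); cudaMallocManaged(&r, 2 * n * sizeof(float));
    for (int i = 0; i < n; ++i) {
        px[i] = bx[i] = float(i % 100) - 50.f; py[i] = by[i] = float(i % 77) - 38.f;
        vx[i] = vy[i] = 0.f; bcost[i] = px[i] * px[i] + py[i] * py[i];
        r[2 * i] = 0.3f; r[2 * i + 1] = 0.6f;      // fixed values; cuRAND would normally be used
    }
    float gx = 0.f, gy = 0.f;                      // global best, here assumed known
    for (int it = 0; it < 10; ++it)
        pso_step<<<(n + 255) / 256, 256>>>(px, py, vx, vy, bx, by, bcost, gx, gy, r, n);
    cudaDeviceSynchronize();
    printf("particle 0 position after 10 steps: (%f, %f)\n", px[0], py[0]);
    return 0;
}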
Adnan, S. "Ultra-wideband antenna design for microwave imaging applications. Design, optimisation and development of ultra-wideband antennas for microwave near-field sensing tools, and study the matching and radiation purity of these antennas within near field environment." Thesis, University of Bradford, 2012. http://hdl.handle.net/10454/5750.
Adnan, Shahid. "Ultra-wideband antenna design for microwave imaging applications : design, optimisation and development of ultra-wideband antennas for microwave near-field sensing tools, and study the matching and radiation purity of these antennas within near field environment." Thesis, University of Bradford, 2012. http://hdl.handle.net/10454/5750.
Full textTurcanu, Vasile. "Valorisation des granulats recyclés dans les bétons soumis au gel/dégel sans saturation (classes d’exposition F et R)." Mémoire, Université de Sherbrooke, 2017. http://hdl.handle.net/11143/10479.
Full textDubois, Clémence. "Optimisation du traitement du cancer du sein Triple-Négatif : développement des modèles de culture cellulaire en trois dimensions, efficacité de l'Olaparib (anti-PARP1) en combinaison avec la radiothérapie et chimiorésistance instaurée par les protéines Multi Drug Résistance." Thesis, Université Clermont Auvergne (2017-2020), 2018. http://www.theses.fr/2018CLFAS018/document.
Full textBreast cancer is a very complex and heterogeneous disease. Among the different molecular subtypes, Triple-Negative (TN) breast cancers are particularly aggressive and of poor prognosis. TN tumours are characterized by a lack of estrogen receptor (ER) expression, a lack of progesterone receptor (PR) expression, the absence of Human Epidermal growth factor Receptor 2 (HER2) overexpression, and frequent mutations of the BRCA1/2 genes ("BRCAness" phenotype). In the absence of effective targeted therapies, many candidate treatments, including poly-ADP-ribose polymerase inhibitors (anti-PARPs), are currently under development in preclinical and clinical studies. Based on the synthetic lethality concept, anti-PARPs specifically target the BRCAness properties of TN tumours. In this context, this work focused on the development of diagnostic tools for the optimization of TN tumour treatment with anti-PARPs. Firstly, 3D cell cultures formed with the liquid-overlay technique, together with the associated cytotoxicity assays, were developed from the TN breast cancer cell lines MDA-MB-231 and SUM1315. These two spheroid models were then optimized and standardized in a synthetic culture medium called OPTIPASS (BIOPASS). Secondly, the efficacy of a co-treatment combining the anti-PARP1 Olaparib, at low and high doses, with fractionated radiotherapy (5x2 Gy) was analyzed on the two cell lines MDA-MB-231 and SUM1315 cultured in 2D and 3D conditions. These experiments clearly demonstrated a potentiating effect of Olaparib on radiotherapy (i) at low doses of this anti-PARP (5 μM or less), (ii) over the long term and (iii) with the maximum fractionation (5x2 Gy). In addition, these two TN cell lines showed a heterogeneous sensitivity to the co-treatment. An in silico transcriptomic analysis revealed very different profiles for these highly metastatic and highly aggressive cell lines. Notably, the SUM1315 cell line presented a neuronal commitment, suggesting its cerebral metastatic origin. These promising results could open up new perspectives for the treatment of brain metastases of TN tumours, which are very common in this subtype. Thirdly, in order to better characterize the mode of action of Olaparib on these spheroid models, a fluorescent derivative of Olaparib, Ola-FL, was synthesized and characterized. The analysis of Ola-FL penetration and distribution in MDA-MB-231 and SUM1315 spheroids showed a rapid and homogeneous distribution of the compound, as well as its persistence after 3 h of incubation, throughout the depth of the spheroids and especially in the central hypoxic zones. Finally, the analysis of the co-expression of two major Multidrug Resistance (MDR) pumps, MRP7 and P-gp, after treatment of the two TN lines with Olaparib revealed, in 2D cultures, a relay-type expression of MRP7 and P-gp. On spheroids treated with a low dose of Olaparib over the long term (10 days), a basal expression of MRP7 and an overexpression of P-gp were detected in the peripheral residual cells of the spheroids. These results clearly highlight the involvement of these efflux pumps in Olaparib resistance mechanisms in these aggressive tumours. Altogether, the results from modelling the action of Olaparib on MDA-MB-231 and SUM1315 spheroids suggest a greater efficacy at low dose and over the long term, especially in the hypoxic zones of the spheroids; this parameter is probably at the origin of its potentiating effect with radiotherapy
Lalami, Mohamed Esseghir. "Contribution à la résolution de problèmes d'optimisation combinatoire : méthodes séquentielles et parallèles." Phd thesis, Toulouse 3, 2012. http://thesesups.ups-tlse.fr/1916/.
Full textCombinatorial optimization problems are difficult problems whose solution by exact methods can be time-consuming or impractical. The use of heuristics permits good-quality solutions to be obtained in reasonable time. Heuristics are also very useful for the development of exact methods based on branch-and-bound techniques. The first part of this thesis concerns the Multiple Knapsack Problem (MKP). We propose a heuristic called RCH which yields good solutions for the MKP. This approach is compared with the MTHM heuristic and the CPLEX solver. The second part of this thesis concerns the parallel implementation of exact methods for solving combinatorial optimization problems, such as knapsack problems, on GPU architectures. A parallel implementation of the branch-and-bound method via CUDA for knapsack problems is proposed. Experimental results show a speedup of 51 for difficult problems using an Nvidia Tesla C2050 (448 cores). A CPU-GPU implementation of the simplex method for solving linear programming problems is also proposed; it offers a speedup of around 12.7 on a Tesla C2050 board. Finally, we propose a multi-GPU implementation of the simplex algorithm via CUDA, with an efficiency of 96.5% when passing from one GPU to two GPUs
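Since the abstract does not reproduce the RCH heuristic or the CUDA kernels, the following minimal, sequential Python sketch only illustrates the underlying technique: a depth-first branch-and-bound search for the 0/1 knapsack problem pruned with the classic greedy (fractional) upper bound; the item data and function names are assumptions for illustration.

# Minimal sequential sketch of branch and bound for the 0/1 knapsack problem,
# using the fractional (greedy) relaxation as an upper bound. This illustrates
# the general technique only, not the RCH heuristic or the GPU implementation.

def fractional_bound(items, capacity, value_so_far):
    """Upper bound: fill the remaining capacity greedily, allowing a fractional item."""
    bound = value_so_far
    for value, weight in items:          # items are pre-sorted by value/weight ratio
        if weight <= capacity:
            capacity -= weight
            bound += value
        else:
            bound += value * capacity / weight
            break
    return bound

def knapsack_branch_and_bound(items, capacity):
    items = sorted(items, key=lambda it: it[0] / it[1], reverse=True)
    best = 0

    def dfs(i, remaining, value):
        nonlocal best
        best = max(best, value)
        if i == len(items):
            return
        # Prune the subtree if even the optimistic bound cannot beat the incumbent.
        if fractional_bound(items[i:], remaining, value) <= best:
            return
        value_i, weight_i = items[i]
        if weight_i <= remaining:                      # branch: take item i
            dfs(i + 1, remaining - weight_i, value + value_i)
        dfs(i + 1, remaining, value)                   # branch: skip item i

    dfs(0, capacity, 0)
    return best

# Usage on a tiny instance: (value, weight) pairs with capacity 10.
print(knapsack_branch_and_bound([(10, 5), (40, 4), (30, 6), (50, 3)], 10))  # -> 90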
Marak, Laszlo. "On continuous maximum flow image segmentation algorithm." Phd thesis, Université Paris-Est, 2012. http://tel.archives-ouvertes.fr/tel-00786914.
Full textBahi, Mouad. "High Performance by Exploiting Information Locality through Reverse Computing." Phd thesis, Université Paris Sud - Paris XI, 2011. http://tel.archives-ouvertes.fr/tel-00768574.
Full textZehendner, Elisabeth. "Operations management at container terminals using advanced information technologies." Phd thesis, Ecole Nationale Supérieure des Mines de Saint-Etienne, 2013. http://tel.archives-ouvertes.fr/tel-00972071.
Full textWatson, Francis Maurice. "Better imaging for landmine detection : an exploration of 3D full-wave inversion for ground-penetrating radar." Thesis, University of Manchester, 2016. https://www.research.manchester.ac.uk/portal/en/theses/better-imaging-for-landmine-detection-an-exploration-of-3d-fullwave-inversion-for-groundpenetrating-radar(720bab5f-03a7-4531-9a56-7121609b3ef0).html.
Full textBistaffa, Filippo. "Constraint Optimisation Techniques for Real-World Applications." Doctoral thesis, 2016. http://hdl.handle.net/11562/939118.
Full textConstraint optimisation represents a fundamental technique that has been successfully employed in Multi-Agent Systems (MAS) in order to face a number of multi-agent coordination challenges. In this thesis we focus on Coalition Formation (CF), one of the key approaches for coordination in MAS. CF aims at the formation of groups that maximise a particular objective function (e.g., arranging shared rides among multiple agents in order to minimise travel costs). Specifically, we discuss a special case of CF known as Graph-Constrained CF (GCCF), where a network connecting the agents constrains the formation of coalitions. We focus on this type of problem because, in many real-world applications, agents may be connected by a communication network or may only trust certain peers in their social network. In particular, the contributions of this thesis are the following. We propose a novel representation of this problem and we design an efficient solution algorithm, CFSS. We evaluate CFSS on GCCF scenarios such as collective energy purchasing and social ridesharing using realistic data (i.e., energy consumption profiles from households in the UK, GeoLife for spatial data, and Twitter as the social network). Results show that CFSS outperforms state-of-the-art GCCF approaches both in terms of runtime and scalability. CFSS is the first algorithm that provides solutions with good quality guarantees for large-scale GCCF instances with thousands of agents (i.e., more than 2700). In addition, we address the problem of computing the transfer or payment to each agent to ensure it is fairly rewarded for its contribution to its coalition. This aspect of CF, denoted as payment computation, is of utmost importance in scenarios characterised by agents with rational behaviours, such as collective energy purchasing and social ridesharing. In this perspective, we propose PRF, the first method to compute payments in large-scale GCCF scenarios that are also stable in a game-theoretic sense. Finally, we provide an alternative method for the solution of GCCF by exploiting the close relation between such problems and Constraint Optimisation Problems (COPs). We consider Bucket Elimination (BE), one of the most important algorithmic frameworks to solve COPs, and we propose CUBE, a highly parallel GPU implementation of the most computationally intensive operations of BE. CUBE adopts an efficient memory layout that results in a high computational throughput. In addition, CUBE is not limited by the amount of memory of the GPU and, hence, it can cope with problems of a realistic nature. CUBE has been tested on the SPOT5 dataset, which contains several satellite management problems modelled as COPs. Moreover, we use CUBE to solve COP-GCCF, the first COP formalisation of GCCF that results in a linear number of constraints with respect to the number of agents. This property is crucial to ensure the scalability of our approach. Results show that COP-GCCF produces significant improvements with respect to state-of-the-art algorithms when applied to a realistic graph topology (i.e., Twitter), both in terms of runtime and memory. Overall, this thesis provides a novel perspective on important techniques in the context of MAS (such as CF and constraint optimisation), making it possible to solve realistic problems involving thousands of agents for the first time.
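To make the bucket elimination step concrete, here is a small CPU-only Python sketch of the core operation that a GPU version such as CUBE would parallelise, namely joining the cost tables that mention a variable and maximising that variable out; the toy problem, domain sizes and function names are assumptions for illustration and say nothing about CUBE's actual memory layout.

# CPU-only sketch of the core bucket elimination operation: join all cost tables
# that mention a variable, then maximise that variable out. The toy problem and
# the binary domains are illustrative assumptions.
import itertools
import numpy as np

DOMAIN = 2  # every variable is binary in this toy example

def join_and_eliminate(tables, var):
    """tables: list of (scope, ndarray), where scope is a tuple of variable names
    and the ndarray has one axis (of size DOMAIN) per scope variable. Returns the
    table set obtained by joining the bucket of `var` and maximising `var` out."""
    in_bucket = [t for t in tables if var in t[0]]
    rest = [t for t in tables if var not in t[0]]
    # Union of scopes, with the eliminated variable placed on the last axis.
    scope = sorted({v for s, _ in in_bucket for v in s} - {var}) + [var]
    joined = np.zeros([DOMAIN] * len(scope))
    for s, arr in in_bucket:
        # Broadcast each table over the joint scope (the data-parallel part on a GPU).
        axes = [scope.index(v) for v in s]
        expanded = np.zeros([DOMAIN] * len(scope))
        for idx in itertools.product(range(DOMAIN), repeat=len(s)):
            full = [slice(None)] * len(scope)
            for axis, value in zip(axes, idx):
                full[axis] = value
            expanded[tuple(full)] = arr[idx]
        joined += expanded
    new_table = joined.max(axis=-1)           # maximise the bucket variable out
    return rest + [(tuple(scope[:-1]), new_table)]

# Usage: maximise f(x, y) + g(y, z) over y, leaving a table over (x, z).
f = (("x", "y"), np.array([[1.0, 4.0], [2.0, 0.0]]))
g = (("y", "z"), np.array([[3.0, 0.0], [1.0, 5.0]]))
print(join_and_eliminate([f, g], "y"))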
Grenier, Julie. "Optimisation de l'utilisation des techniques de modélisation dans le passage de l'étape pré-clinique à clinique du développement d'un médicament." Thèse, 2008. http://hdl.handle.net/1866/6686.
Full text(5929916), Sudhir B. Kylasa. "HIGHER ORDER OPTIMIZATION TECHNIQUES FOR MACHINE LEARNING." Thesis, 2019.
Find full textFirst-order methods such as Stochastic Gradient Descent are the methods of choice for solving non-convex optimization problems in machine learning. These methods primarily rely on the gradient of the loss function to estimate the descent direction. However, they have a number of drawbacks, including convergence to saddle points (as opposed to minima), slow convergence, and sensitivity to parameter tuning. In contrast, second-order methods, which use curvature information in addition to the gradient, have been shown to achieve faster convergence rates in theory. When used in the context of machine learning applications, they offer faster (quadratic) convergence, stability with respect to parameter tuning, and robustness to problem conditioning. In spite of these advantages, first-order methods are commonly used because of their simplicity of implementation and low per-iteration cost. The need to generate and use curvature information in the form of a dense Hessian matrix makes each iteration of second-order methods more expensive.
In this work, we address three key problems associated with second-order methods: (i) what is the best way to incorporate curvature information into the optimization procedure; (ii) how do we reduce the operation count of each iteration of a second-order method while maintaining its superior convergence properties; and (iii) how do we leverage high-performance computing platforms to significantly accelerate second-order methods. To answer the first question, we propose and validate the use of Fisher information matrices in second-order methods to significantly accelerate convergence. The second question is answered through the use of statistical sampling techniques that suitably sample matrices to reduce per-iteration cost without impacting convergence. The third question is addressed through the use of graphics processing units (GPUs) in distributed platforms to deliver state-of-the-art solvers.
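As a loose, CPU-only illustration of the ideas behind (i) and (ii) above (not the thesis solvers themselves), the Python sketch below performs Newton-type steps for logistic regression in which the curvature matrix is a Fisher/Gauss-Newton approximation built from a random subsample of the data; the subsample size, damping term and synthetic data are assumptions.

# Illustrative sketch of a subsampled Fisher/Gauss-Newton step for logistic regression.
# Not the thesis solvers: the subsample size, damping and toy data are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def subsampled_fisher_newton_step(w, X, y, sample_size=256, damping=1e-3):
    n, d = X.shape
    # Full gradient of the logistic loss (one cheap pass over the data).
    p = sigmoid(X @ w)
    grad = X.T @ (p - y) / n
    # Curvature from a random subsample only: Fisher / Gauss-Newton approximation
    # H ~ (1/|S|) X_S^T diag(p(1-p)) X_S, which keeps the per-iteration cost low.
    idx = rng.choice(n, size=min(sample_size, n), replace=False)
    Xs, ps = X[idx], sigmoid(X[idx] @ w)
    H = Xs.T @ (Xs * (ps * (1.0 - ps))[:, None]) / len(idx) + damping * np.eye(d)
    # Newton-type update: solve H s = grad rather than inverting H explicitly.
    return w - np.linalg.solve(H, grad)

# Usage on synthetic data: the loss should decrease over a few steps.
X = rng.normal(size=(2000, 10))
true_w = rng.normal(size=10)
y = (sigmoid(X @ true_w) > rng.uniform(size=2000)).astype(float)
w = np.zeros(10)
for _ in range(5):
    w = subsampled_fisher_newton_step(w, X, y)
    p = sigmoid(X @ w)
    loss = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    print(f"loss = {loss:.4f}")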
Through our work, we show that our solvers are capable of significant improvements over state-of-the-art optimization techniques for training machine learning models. We demonstrate improvements in terms of training time (over an order of magnitude in wall-clock time), generalization properties of learned models, and robustness to problem conditioning.