Academic literature on the topic "Vectorisation (informatique)"
Browse the thematic lists of articles, books, theses, conference proceedings, and other scholarly sources on the topic "Vectorisation (informatique)".
Theses on the topic "Vectorisation (informatique)"
Maini, Jean-Luc. "Vectorisation, segmentation de schémas éléctroniques et reconnaissance de composants". Le Havre, 2000. http://www.theses.fr/2000LEHA0008.
Peou, Kenny. "Computing Tools for HPDA : a Cache-Oblivious and SIMD Approach". Electronic Thesis or Diss., université Paris-Saclay, 2021. http://www.theses.fr/2021UPASG105.
This work presents three contributions to the fields of CPU vectorization and machine learning. The first contribution is an algorithm for computing an average with half-precision floating-point values. Because this work was performed with limited hardware support for half precision, we use an existing software library to emulate half-precision computation. This allows us to compare the numerical precision of our algorithm to that of various commonly used algorithms. Finally, we perform runtime performance benchmarks using single- and double-precision floating-point values in order to anticipate the potential gains from applying CPU vectorization to half-precision values. Overall, we find that our algorithm has slightly worse best-case numerical performance in exchange for significantly better worst-case numerical performance, all while providing runtime performance similar to that of other algorithms. The second contribution is a fixed-point computational library designed specifically for CPU vectorization. Existing libraries rely on compiler auto-vectorization, which fails to vectorize arithmetic multiplication and division operations. In addition, these two operations require cast operations, which reduce vectorizability and have a real computational cost. To alleviate this, we present a fixed-point data storage format that does not require any cast operations to perform arithmetic operations. We also present a number of benchmarks comparing our implementation to existing libraries and report the CPU vectorization speedup on a number of architectures. Overall, we find that our fixed-point format allows runtime performance equal to or better than that of all compared libraries. The final contribution is a neural network inference engine designed to perform experiments varying the numerical data types used in the inference computation. This inference engine allows layer-specific control of which data types are used to perform inference.
We use this level of control to perform experiments to determine how aggressively it is possible to reduce the numerical precision used in inferring the PVANet neural network. In the end, we determine that a combination of the standardized float16 and bfloat16 data types is sufficient for the entire inference
Gallet, Camille. "Étude de transformations et d’optimisations de code parallèle statique ou dynamique pour architecture "many-core"". Electronic Thesis or Diss., Paris 6, 2016. http://www.theses.fr/2016PA066747.
From the 1960s to the present, the evolution of supercomputers has undergone three revolutions: (i) the arrival of transistors to replace triodes, (ii) the appearance of vector computation, and (iii) clusters. Clusters currently consist of standard processors that have benefited from increased computing power via higher clock frequencies, the proliferation of cores on the chip, and the expansion of computing units (SIMD instruction sets). A recent example involving a large number of cores and wide (512-bit) vector units is the Intel Xeon Phi coprocessor. To maximize computing performance on these chips by better exploiting these SIMD instructions, it is necessary to reorganize the body of the loop nests, taking into account irregular aspects (control flow and data flow). To this end, this thesis proposes to extend the transformation named Deep Jam to extract the regularity of an irregular code and facilitate vectorization. This thesis presents our extension and its application to a multi-material hydrodynamics mini-application, HydroMM. These studies show that it is possible to achieve a significant performance gain on irregular codes
Mercadier, Darius. "Usuba, Optimizing Bitslicing Compiler". Electronic Thesis or Diss., Sorbonne université, 2020. http://www.theses.fr/2020SORUS180.
Bitslicing is a technique commonly used in cryptography to implement high-throughput, parallel, and constant-time symmetric primitives. However, writing, optimizing, and protecting bitsliced implementations by hand are tedious tasks, requiring knowledge of cryptography, CPU microarchitectures, and side-channel attacks. The resulting programs tend to be hard to maintain due to their high complexity. To overcome those issues, we propose Usuba, a high-level domain-specific language for writing symmetric cryptographic primitives. Usuba allows developers to write high-level specifications of ciphers without worrying about the actual parallelization: a Usuba program is a scalar description of a cipher, from which the Usuba compiler (Usubac) automatically produces vectorized bitsliced code. When targeting high-end Intel CPUs, Usubac applies several domain-specific optimizations, such as interleaving and custom instruction-scheduling algorithms. We are thus able to match the throughputs of hand-tuned assembly and C implementations of several widely used ciphers. Furthermore, in order to protect cryptographic implementations on embedded devices against side-channel attacks, we extend our compiler in two ways. First, we integrate into Usubac state-of-the-art techniques in higher-order masking to generate implementations that are provably secure against power-analysis attacks. Second, we implement a backend for SKIVA, a custom 32-bit CPU enabling the combination of countermeasures against power-based and timing-based leakage, as well as fault injection
Meas-Yedid, Vannary. "Analyse de cartes scannées : interprétation de la planche de vert : contribution à l'analyse d'images en cartographie". Paris 5, 1998. http://www.theses.fr/1998PA05S003.
Kirschenmann, Wilfried. "Vers des noyaux de calcul intensif pérennes". Phd thesis, Université de Lorraine, 2012. http://tel.archives-ouvertes.fr/tel-00844673.
Naouai, Mohamed. "Localisation et reconstruction du réseau routier par vectorisation d'image THR et approximation des contraintes de type "NURBS"". Phd thesis, Université de Strasbourg, 2013. http://tel.archives-ouvertes.fr/tel-00994333.
Soyez-Martin, Claire. "From semigroup theory to vectorization : recognizing regular languages". Electronic Thesis or Diss., Université de Lille (2022-....), 2023. http://www.theses.fr/2023ULILB052.
The pursuit of optimizing regular expression validation has been a long-standing challenge, spanning several decades. Over time, substantial progress has been made through a vast range of approaches, from ingenious new algorithms to intricate low-level optimizations. Cutting-edge tools have harnessed these optimization techniques to continually push the boundaries of efficient execution. One notable advancement is the integration of vectorization, a method that leverages low-level parallelism to process data in batches, resulting in significant performance enhancements. While there has been extensive research on designing hand-tailored algorithms for particular languages, these solutions often lack generalizability, as the underlying methodology cannot be applied indiscriminately to any regular expression, which makes it difficult to integrate into existing tools. This thesis provides a theoretical framework in which it is possible to generate vectorized programs for regular expressions corresponding to rational expressions in a given class. To do so, we rely on the algebraic theory of automata, which provides tools to process letters in parallel. These tools also allow for a deeper understanding of the underlying regular language, which gives access to some properties that are useful when producing vectorized algorithms. The contribution of this thesis is twofold. First, it provides implementations and preliminary benchmarks to study the potential efficiency of algorithms using algebra and vectorization. Second, it gives algorithms that construct vectorized programs for languages in specific classes of rational expressions, namely first-order logic and its subset restricted to two variables
Haine, Christopher. "Kernel optimization by layout restructuring". Thesis, Bordeaux, 2017. http://www.theses.fr/2017BORD0639/document.
Careful data layout design is crucial for achieving high performance, as today's processors waste a considerable amount of time stalled by memory transactions; in particular, spatial and temporal locality have to be optimized. However, data layout transformations are an area left largely unexplored by state-of-the-art compilers, due to the difficulty of evaluating the possible performance gains of transformations. Moreover, optimizing data layout is time-consuming and error-prone, and layout transformations are too numerous to be tried by hand in the hope of discovering a high-performance version. We propose to guide application programmers through data layout restructuring with extensive feedback, firstly by providing a comprehensive multidimensional description of the initial layout, built via analysis of memory traces collected from the application binary, ultimately aiming at pinpointing problematic strides at the instruction level, independently of the input language. We choose to focus on layout transformations, translatable to C formalism to aid user understanding, that we apply and assess on a case study composed of two representative multithreaded real-life applications, a cardiac wave simulation and a lattice QCD simulation, with different inputs and parameters. The performance prediction of different transformations matches (within 5%) hand-optimized layout code