Tesis sobre el tema "Cache partagé"
Crea una cita precisa en los estilos APA, MLA, Chicago, Harvard y otros
Consulte los 17 mejores tesis para su investigación sobre el tema "Cache partagé".
Junto a cada fuente en la lista de referencias hay un botón "Agregar a la bibliografía". Pulsa este botón, y generaremos automáticamente la referencia bibliográfica para la obra elegida en el estilo de cita que necesites: APA, MLA, Harvard, Vancouver, Chicago, etc.
También puede descargar el texto completo de la publicación académica en formato pdf y leer en línea su resumen siempre que esté disponible en los metadatos.
Explore tesis sobre una amplia variedad de disciplinas y organice su bibliografía correctamente.
Liu, Hao. "Protocoles scalables de cohérence des caches pour processeurs manycore à espace d'adressage partagé visant la basse consommation". Thesis, Paris 6, 2016. http://www.theses.fr/2016PA066059/document.
Texto completoThe TSAR architecture (Tera-Scale ARchitecture) developed jointly by Lip6 Bull and CEA-LETI is a CC-NUMA manycore architecture which is scalable up to 1024 cores. The DHCCP cache coherence protocol in the TSAR architecture is a global directory protocol using the write-through policy in the L1 cache for scalability purpose, but this write policy causes a high power consumption which we want to reduce. Currently the biggest semiconductors companies, such as Intel or AMD, use the MESI MOESI protocols in their multi-core processors. These protocols use the write-back policy to reduce the high power consumption due to writes. However, the complexity of implementation and the sharp increase in the coherence traffic when the number of processors increases limits the scalability of these protocols beyond a few dozen cores. In this thesis, we propose a new cache coherence protocol using a hybrid method to process write requests in the L1 private cache : for exclusive lines, the L1 cache controller chooses the write-back policy in order to modify locally the lines as well as eliminate the write traffic for exclusive lines. For shared lines, the L1 cache controller uses the write-through policy to simplify the protocol and in order to guarantee the scalability. We also optimized the current solution for the TLB coherence problem in the TSAR architecture. The new method which is called CC-TLB not only improves the performance, but also reduces the energy consumption. Finally, this thesis introduces a new micro cache between the core and the L1 cache, which allows to reduce the number of accesses to the instruction cache, in order to save energy
Liu, Hao. "Protocoles scalables de cohérence des caches pour processeurs manycore à espace d'adressage partagé visant la basse consommation". Electronic Thesis or Diss., Paris 6, 2016. http://www.theses.fr/2016PA066059.
Texto completoThe TSAR architecture (Tera-Scale ARchitecture) developed jointly by Lip6 Bull and CEA-LETI is a CC-NUMA manycore architecture which is scalable up to 1024 cores. The DHCCP cache coherence protocol in the TSAR architecture is a global directory protocol using the write-through policy in the L1 cache for scalability purpose, but this write policy causes a high power consumption which we want to reduce. Currently the biggest semiconductors companies, such as Intel or AMD, use the MESI MOESI protocols in their multi-core processors. These protocols use the write-back policy to reduce the high power consumption due to writes. However, the complexity of implementation and the sharp increase in the coherence traffic when the number of processors increases limits the scalability of these protocols beyond a few dozen cores. In this thesis, we propose a new cache coherence protocol using a hybrid method to process write requests in the L1 private cache : for exclusive lines, the L1 cache controller chooses the write-back policy in order to modify locally the lines as well as eliminate the write traffic for exclusive lines. For shared lines, the L1 cache controller uses the write-through policy to simplify the protocol and in order to guarantee the scalability. We also optimized the current solution for the TLB coherence problem in the TSAR architecture. The new method which is called CC-TLB not only improves the performance, but also reduces the energy consumption. Finally, this thesis introduces a new micro cache between the core and the L1 cache, which allows to reduce the number of accesses to the instruction cache, in order to save energy
Parrinello, Emanuele. "Fundamental Limits of Shared-Cache Networks". Electronic Thesis or Diss., Sorbonne université, 2021. http://www.theses.fr/2021SORUS491.
Texto completoIn the context of communication networks, the emergence of predictable content has brought to the fore the use of caching as a fundamental ingredient for handling the exponential growth in data volumes. This thesis aims at providing the fundamental limits of shared-cache networks where the communication to users is aided by a small set of caches. Our shared-cache model, not only captures heterogeneous wireless cellular networks, but it can also represent a model for users requesting multiple files simultaneously, and it can be used as a simple yet effective way to deal with the so-called subpacketization bottleneck of coded caching. Furthermore, we will also see how our techniques developed for caching networks can find application in the context of heterogeneous coded distributed computing
Malik, Adeel. "Stochastic Coded Caching Networks : a Study of Cache-Load Imbalance and Random User Activity". Electronic Thesis or Diss., Sorbonne université, 2022. https://accesdistant.sorbonne-universite.fr/login?url=https://theses-intra.sorbonne-universite.fr/2022SORUS045.pdf.
Texto completoIn this thesis, we elevate coded caching from their purely information-theoretic framework to a stochastic setting where the stochasticity of the networks originates from the heterogeneity in users’ request behaviors. Our results highlight that stochasticity in the cache-aided networks can lead to the vanishing of the gains of coded caching. We determine the exact extent of the cache-load imbalance bottleneck of coded caching in stochastic networks, which has never been explored before. Our work provides techniques to mitigate the impact of this bottleneck for the scenario where the user-to-cache state associations are restricted by proximity constraints between users and helper nodes (i.e., shared-cache setting) as well as for the scenario where user-to-cache state associations strategies are considered, as a design parameter (i.e., subpacketization-constrained setting)
Gao, Yang. "Contrôleur de cache générique pour une architecture manycore massivement parallèle à mémoire partagée cohérente". Paris 6, 2011. http://www.theses.fr/2011PA066296.
Texto completoHardy, Damien. "Analyse pire cas pour processeur multi-cœurs disposant de caches partagés". Phd thesis, Université Rennes 1, 2010. http://tel.archives-ouvertes.fr/tel-00557058.
Texto completoHardy, Damien. "Analyse pire cas pour processeur multi-coeurs disposant de caches partagés". Rennes 1, 2010. http://www.theses.fr/2010REN1S143.
Texto completoHard real-time systems are subject to timing constraints and failure to respect them can cause economic, ecological or human disasters. The validation process which guarantees the safety of such software, by ensuring the respect of these constraints in all situations including the worst case, is based on the knowledge of the worst case execution time of each task. However, determining the worst case execution time is a difficult problem for modern architectures because of complex hardware mechanisms that could cause significant execution time variability. This document focuses on the analysis of the worst case timing behavior of cache hierarchies, to determine their contribution to the worst case execution time. Several approaches are proposed to predict and improve the worst case execution time of tasks running on multicore processors with a cache hierarchy in which some cache levels are shared between cores
Wan, Kai. "Limites fondamentales de stockage pour les réseaux de diffusion de liens partagés et les réseaux de combinaison". Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLS217/document.
Texto completoIn this thesis, we investigated the coded caching problem by building the connection between coded caching with uncoded placement and index coding, and leveraging the index coding results to characterize the fundamental limits of coded caching problem. We mainly analysed the caching problem in shared-link broadcast model and in combination networks. In the first part of this thesis, for cache-aided shared-link broadcast networks, we considered the constraint that content is placed uncoded within the caches. When the cache contents are uncoded and the user demands are revealed, the caching problem can be connected to an index coding problem. We derived fundamental limits for the caching problem by using tools for the index coding problem. A novel index coding achievable scheme was first derived based on distributed source coding. This inner bound was proved to be strictly better than the widely used “composite (index) coding” inner bound by leveraging the ignored correlation among composites and the non-unique decoding. For the centralized caching problem, an outer bound under the constraint of uncoded cache placement is proposed based on the “acyclic index coding outer bound”. This outer bound is proved to be achieved by the cMAN scheme when the number of files is not less than the number of users, and by the proposed novel index coding achievable scheme otherwise. For the decentralized caching problem, this thesis proposes an outer bound under the constraint that each user stores bits uniformly and independently at random. This outer bound is achieved by dMAN when the number of files is not less than the number of users, and by our proposed novel index coding inner bound otherwise. In the second part of this thesis, we considered the centralized caching problem in two-hop relay networks, where the server communicates with cache-aided users through some intermediate relays. Because of the hardness of analysis on the general networks, we mainly considered a well-known symmetric relay networks, combination networks, including H relays and binom{H}{r} users where each user is connected to a different r-subset of relays. We aimed to minimize the max link-load for the worst cases. We derived outer and inner bounds in this thesis. For the outer bound, the straightforward way is that each time we consider a cut of x relays and the total load transmitted to these x relays could be outer bounded by the outer bound for the shared-link model including binom{x}{r} users. We used this strategy to extend the outer bounds for the shared-link model and the acyclic index coding outer bound to combination networks. In this thesis, we also tightened the extended acyclic index coding outer bound in combination networks by further leveraging the network topology and joint entropy of the various random variables. For the achievable schemes, there are two approaches, separation and non-separation. In the separation approach, we use cMAN cache placement and multicast message generation independent of the network topology. We then deliver cMAN multicast messages based on the network topology. In the non-separation approach, we design the placement and/or the multicast messages on the network topology. We proposed four delivery schemes on separation approach. On non-separation approach, firstly for any uncoded cache placement, we proposed a delivery scheme by generating multicast messages on network topology. Moreover, we also extended our results to more general models, such as combination networks with cache-aided relays and users, and caching systems in more general relay networks. Optimality results were given under some constraints and numerical evaluations showed that our proposed schemes outperform the state-of-the-art
Dumas, Julie. "Représentation dynamique de la liste des copies pour le passage à l'échelle des protocoles de cohérence de cache". Thesis, Université Grenoble Alpes (ComUE), 2017. http://www.theses.fr/2017GREAM093/document.
Texto completoCache coherence protocol scalability problem for parallel architecture is also a problem for on chip architecture, following the emergence of manycores architectures. There are two protocol classes : snooping and directory-based.Protocols based on snooping, which send coherence information to all caches, generate a lot of messages whose few are useful.On the other hand, directory-based protocols send messages only to caches which need them. The most obvious implementation uses a full bit vector whose size depends only on the number of cores. This bit vector represents the sharing set. To scale, a coherence protocol must produce a reasonable number of messages and limit hardware ressources used by the coherence and in particular for the sharing set.To evaluate and compare protocols and their sharing set, we first propose a method based on trace injection in a high-level cache model. This method enables a very fast architectural exploration of cache coherence protocols.We also propose a new dynamic sharing set for cache coherence protocols, which is scalable. With 64 cores, 93% of cache blocks are shared by up to 8 cores.Futhermore, knowing that the operating system looks to place communicating tasks close to each other. Our dynamic sharing set takes advantage from these two observations by using a bit vector for a subset of copies and a linked list. The bit vector corresponds to a rectangle which stores the exact sharing set. The position and shape of this rectangle evolve over application's lifetime. Several algorithms for coherent rectangle placement are proposed and evaluated. Finally, we make a comparison with sharing sets from the state of the art
Busseuil, Rémi. "Exploration d'architecture d'accélérateurs à mémoire distribuée". Thesis, Montpellier 2, 2012. http://www.theses.fr/2012MON20218/document.
Texto completoAlthough the accelerators market is dominated by heterogeneous MultiProcessor Systems-on-Chip (MPSoC), i.e. with different specialized processors, a growing interest is put on another type of MPSoC, composed by an array of identical processors. Even if these processors achieved lower performance to power ratio, the better flexibility and programmability of these homogeneous MPSoC allow an easier adaptation to the load, and offer a wider space of configurations. In this context, this thesis exposes the development of a scalable homogeneous MPSoC – i.e. with linear performance scaling – and different kind of adaptive mechanisms and programming model on it.This architecture is based on an array of MicroBlaze-like processors, each having its own memory, and connected through a 2D NoC. A modular RTOS was build on top of it. Thanks to a complex communication stack, different adaptive mechanisms were made: a “redirected data” task migration mechanism, reducing the impact of the migration mechanism for data-flow applications, and a “remote execution” mechanism. Instead of migrate the instruction code from a memory to another, this last consists in only migrate the execution, keeping the code in its initial memory. The different experiments shows faster reactivity but lower performance of this mechanism compared to migration.This development naturally led to the creation of a shared memory programming model. To achieve this, a scalable hardware/software memory consistency and cache coherency mechanism has been made, through the PThread library development. Experiments show the advantage of using NoC based homogeneous MPSoC with a brand programming model
Kaci, Ania. "Conception d'une architecture extensible pour le calcul massivement parallèle". Thesis, Paris Est, 2016. http://www.theses.fr/2016PESC1044.
Texto completoIn response to the growing demand for performance by a wide variety of applications (eg, financial modeling, sub-atomic simulation, bioinformatics, etc.), computer systems become more complex and increase in size (number of computing components, memory and storage capacity). The increased complexity of these systems results in a change in their architecture towards a heterogeneous computing technologies and programming models. The harmonious management of this heterogeneity, resource optimization and minimization of consumption are major technical challenges in the design of future computer systems.This thesis addresses a field of this complexity by focusing on shared memory subsystems where all processors share a common address space. Work will focus on the implementation of a cache coherence and memory consistency on an extensible architecture and methodology for validation of this implementation.In our approach, we selected processors 64-bit ARM and generic co-processor (GPU, DSP, etc.) as components of computing, shared memory protocols AMBA / ACE and AMBA / ACE-Lite and associated architecture "CoreLink CCN" as a starting solution. Generalization and parameterization of this architecture and its validation in the simulation environment GEM5 are the backbone of this thesis.The results at the end of the thesis, tend to demonstrate the achievement of objectives
Lottiaux, Renaud. "Gestion globale de la mémoire physique d'une grappe pour un système à image unique : : mise en oeuvre dans le système Gobelins". Rennes 1, 2001. http://www.theses.fr/2001REN10097.
Texto completoAbousabea, Emad Mohamed Abd Elrahman. "Optimization algorithms for video service delivery". Thesis, Evry, Institut national des télécommunications, 2012. http://www.theses.fr/2012TELE0030/document.
Texto completoThe aim of this thesis is to provide optimization algorithms for accessing video services either in unmanaged or managed ways. We study recent statistics about unmanaged video services like YouTube and propose suitable optimization techniques that could enhance files accessing and reduce their access costs. Moreover, this cost analysis plays an important role in decision making about video files caching and hosting periods on the servers. Under managed video services called IPTV, we conducted experiments for an open-IPTV collaborative architecture between different operators. This model is analyzed in terms of CAPEX and OPEX costs inside the domestic sphere. Moreover, we introduced a dynamic way for optimizing the Minimum Spanning Tree (MST) for multicast IPTV service. In nomadic access, the static trees could be unable to provide the service in an efficient manner as the utilization of bandwidth increases towards the streaming points (roots of topologies). Finally, we study reliable security measures in video streaming based on hash chain methodology and propose a new algorithm. Then, we conduct comparisons between different ways used in achieving reliability of hash chains based on generic classifications
Preud'Homme, Thomas. "Communication inter-cœurs optimisée pour le parallélisme de flux". Phd thesis, Université Pierre et Marie Curie - Paris VI, 2013. http://tel.archives-ouvertes.fr/tel-00931833.
Texto completoLi, Haifeng. "Traitement de la variabilité et développement de systèmes robustes pour la reconnaissance de l'écriture manuscrite en-ligne". Paris 6, 2002. http://www.theses.fr/2002PA066230.
Texto completoAbousabea, Emad Mohamed Abd Elrahman. "Optimization algorithms for video service delivery". Electronic Thesis or Diss., Evry, Institut national des télécommunications, 2012. http://www.theses.fr/2012TELE0030.
Texto completoThe aim of this thesis is to provide optimization algorithms for accessing video services either in unmanaged or managed ways. We study recent statistics about unmanaged video services like YouTube and propose suitable optimization techniques that could enhance files accessing and reduce their access costs. Moreover, this cost analysis plays an important role in decision making about video files caching and hosting periods on the servers. Under managed video services called IPTV, we conducted experiments for an open-IPTV collaborative architecture between different operators. This model is analyzed in terms of CAPEX and OPEX costs inside the domestic sphere. Moreover, we introduced a dynamic way for optimizing the Minimum Spanning Tree (MST) for multicast IPTV service. In nomadic access, the static trees could be unable to provide the service in an efficient manner as the utilization of bandwidth increases towards the streaming points (roots of topologies). Finally, we study reliable security measures in video streaming based on hash chain methodology and propose a new algorithm. Then, we conduct comparisons between different ways used in achieving reliability of hash chains based on generic classifications
Etcheverry, Arnaud. "Simulation de la dynamique des dislocations à très grande échelle". Thesis, Bordeaux, 2015. http://www.theses.fr/2015BORD0263/document.
Texto completoThis research work focuses on bringing performances in 3D dislocation dynamics simulation, to run efficiently on modern computers. First of all, we introduce some algorithmic technics, to reduce the complexity in order to target large scale simulations. Second of all, we focus on data structure to take into account both memory hierachie and algorithmic data access. On one side we build this adaptive data structure to handle dynamism of data and on the other side we use an Octree to combine hierachie decompostion and data locality in order to face intensive arithmetics with force field computation and collision detection. Finnaly, we introduce some parallel aspects of our simulation. We propose a classical hybrid parallelism, with task based openMP threads and domain decomposition technics for MPI