Dissertations / Theses on the topic 'Memory management'

Consult the top 50 dissertations / theses for your research on the topic 'Memory management.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses across a wide variety of disciplines and organise your bibliography correctly.

1

Panthulu, Pradeep. "Intelligent Memory Management Heuristics." Thesis, University of North Texas, 2003. https://digital.library.unt.edu/ark:/67531/metadc4399/.

Full text
Abstract:
Automatic memory management is crucial in the implementation of runtime systems, even though it induces significant computational overhead. In this thesis I explore the use of statistical properties of the directed graph describing the set of live data to decide between garbage collection and heap expansion in a memory management algorithm that combines dynamic-array-represented heaps with a mark-and-sweep garbage collector, with the goal of enhancing its performance. The sampling method that predicts the density and distribution of useful data is implemented as a partial marking algorithm. The algorithm randomly marks the nodes of the directed graph representing the live data at different depths with a variable probability factor p. Using the information gathered by the partial marking algorithm in the current step and the knowledge gathered in previous iterations, the proposed empirical formula predicts with reasonable accuracy the density of live nodes on the heap, in order to decide between garbage collection and heap expansion. The resulting heuristics are tested empirically and shown to improve overall execution performance significantly in the context of the Jinni Prolog compiler's runtime system.
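For illustration only, the sampling idea admits a small sketch; the node layout, probability, and decision threshold below are invented, and the estimator assumes tree-shaped reachability rather than the thesis's actual data structures:

```c
/* Hedged sketch of probabilistic partial marking (not the thesis's code).
 * Each outgoing edge is followed with probability p, and every visited
 * node is weighted by 1/p^depth, a Horvitz-Thompson style estimate of
 * the live-node count (assumes tree-shaped reachability). */
#include <stdio.h>
#include <stdlib.h>

#define MAX_CHILDREN 4

typedef struct Node {
    struct Node *child[MAX_CHILDREN];
    int marked;
} Node;

static double live_estimate;

static void partial_mark(Node *n, double p, double weight)
{
    if (n == NULL || n->marked)
        return;
    n->marked = 1;                 /* a real collector clears marks later */
    live_estimate += weight;       /* this node stands for 'weight' nodes */
    for (int i = 0; i < MAX_CHILDREN; i++)
        if ((double)rand() / RAND_MAX < p)      /* sample this edge */
            partial_mark(n->child[i], p, weight / p);
}

/* Decide between collection and heap expansion: a dense heap would be
 * collected for little gain, so expansion is preferred instead. */
int should_expand_heap(Node **roots, int nroots, double p,
                       size_t heap_cells, double density_threshold)
{
    live_estimate = 0.0;
    for (int i = 0; i < nroots; i++)
        partial_mark(roots[i], p, 1.0);   /* roots are always visited */
    return live_estimate / (double)heap_cells > density_threshold;
}

int main(void)
{
    Node a = {0}, b = {0}, c = {0};
    a.child[0] = &b;
    b.child[0] = &c;
    Node *roots[] = { &a };
    printf("expand heap? %d\n", should_expand_heap(roots, 1, 0.5, 100, 0.5));
    return 0;
}
```
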
2

Mårtensson, Henrik. "Memory Management of Manycore Systems." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-107879.

Full text
Abstract:
This thesis project is part of the MANY project hosted by ITEA2. The objective of MANY is to provide developers with tools for developing software on multicore and manycore platforms, as well as to provide a knowledge base about software on manycore. This thesis project has the following objectives: to investigate the complex subject of effectively managing system memory in a manycore environment, to propose a memory management technique for implementation in OSE, and to investigate the Tilera manycore processor TILEPro64 and Enea OSE in order to be able to continue the ongoing project of porting OSE to the TILEPro64. Several memory management techniques were investigated for managing memory access on all tiers of the system. Some of these techniques require modifications to hardware while some are made directly in software. The porting of OSE to the TILEPro64 processor was continued and contributions were made to the Hardware Abstraction Layer of OSE.
3

Stojanovic, Marta. "Automatic memory management in Java." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2001. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp05/MQ65392.pdf.

Full text
4

Sánchez, Navarro Francisco Jesús. "Smart memory management through locality analysis." Doctoral thesis, Universitat Politècnica de Catalunya, 2001. http://hdl.handle.net/10803/5965.

Full text
Abstract:
Cache memories were incorporated in microprocessors early on and represent the most common solution to deal with the gap between processor and memory speeds. However, many studies point out that cache storage capacity is often wasted, which has a direct impact on processor performance. Although a cache is designed to exploit different types of locality, all memory references are handled in the same way, ignoring particular locality behaviors. This restricted use of per-access locality information can limit the effectiveness of the cache. In this thesis we show how a data locality analysis can help the researcher understand where and why cache misses occur, and we then propose different techniques that make use of this information in order to improve the performance of the cache memory. We propose techniques in which locality information obtained by the locality analyzer is passed from the compiler to the hardware through the ISA to guide the management of memory accesses.

We have developed a static data locality analysis. This analysis is based on reuse vectors and performs the three typical steps: reuse, volume, and interference analysis. Compared with previous works, both the volume and interference analyses have been improved by using profile information as well as a more precise interference analysis. The proposed data locality analyzer has been inserted as another pass in a research compiler. Results show that for numerical applications the analysis is very accurate and the computing overhead is low. This analysis is the basis for all other parts of the thesis. In addition, for some proposals in the last part of the thesis we have used a data locality analysis based on cache miss equations. This analysis, although more time consuming, is more accurate and more appropriate for set-associative caches. The use of two different locality analyzers also shows that the architectural proposals of this thesis are independent of the particular locality analysis.

After showing the accuracy of the analysis, we have used it to study the locality behavior exhibited by the SPECfp95 programs. This kind of analysis is necessary before proposing any new technique, since it helps the researcher understand why cache misses occur. We show that with the proposed analysis we can study the locality of a program very accurately and detect where the hot spots are, as well as the reason for these misses. This study of the locality behavior of different programs is the basis and motivation for the different techniques proposed in this thesis to improve memory performance.

Thus, using the data locality analysis and based on the results obtained after analyzing the locality behavior of a set of programs, we propose to use this analysis in order to guide three different techniques: (i) management of multi-module caches, (ii) software prefetching for modulo scheduled loops, and (iii) instruction scheduling for clustered VLIW architectures.

The first use of the proposed data locality analysis is to manage a novel cache organization. This cache supports bypassing and/or is composed of different modules, each one oriented to exploiting a particular type of locality. The main difference of this cache with respect to previous proposals is that the decision of whether to cache a block, and in which module a newly fetched block is allocated, is controlled by some bits in memory instructions (locality hints). These hints are set at compile time using the proposed locality analysis. Thus, the management complexity of this cache is kept low, since no additional hardware is required. Results show that smaller caches with smart management can perform as well as (or better than) bigger conventional caches.
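As an illustration of the hint mechanism, the sketch below simulates a toy direct-mapped cache in which each access carries a compiler-assigned bypass hint; the cache geometry, hint encoding, and access pattern are invented for the example, and the thesis's multi-module design is far richer:

```c
/* Toy direct-mapped cache where each memory instruction carries a
 * compiler-assigned locality hint (illustrative; real hints would live
 * in spare ISA bits). Bypassed accesses never allocate a line, so a
 * no-reuse stream cannot evict a small, heavily reused table. */
#include <stdio.h>
#include <stdint.h>

#define LINES     64
#define LINE_BITS 6                 /* 64-byte lines */

typedef enum { HINT_CACHE, HINT_BYPASS } Hint;

typedef struct { uint64_t tag; int valid; } Line;
static Line cache[LINES];

int access_mem(uint64_t addr, Hint hint)
{
    uint64_t block = addr >> LINE_BITS;
    unsigned idx = block % LINES;
    int hit = cache[idx].valid && cache[idx].tag == block;
    if (!hit && hint == HINT_CACHE) {     /* allocate only when hinted */
        cache[idx].valid = 1;
        cache[idx].tag = block;
    }
    return hit;
}

int main(void)
{
    int hits = 0;
    for (int rep = 0; rep < 100; rep++) {
        for (uint64_t a = 0; a < 64; a++)       /* reused table: cache it */
            hits += access_mem(0x10000 + a * 64, HINT_CACHE);
        for (uint64_t a = 0; a < 4096; a++)     /* one-shot stream: bypass */
            hits += access_mem(0x100000 + a * 64, HINT_BYPASS);
    }
    printf("hits: %d\n", hits);   /* the table hits on every repeat */
    return 0;
}
```
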

We have also used the locality analysis to study the interaction between software pipelining and software prefetching. Software pipelining has been shown to be a very effective scheduling technique for loops (mainly in numerical applications for VLIW processors). The most popular scheme for software pipelining is called modulo scheduling. Many works on modulo scheduling can be found in the literature, but almost all of them make a critical assumption: they consider an optimistic behavior of the cache (in other words, they use the hit latency when a memory instruction is scheduled). Thus, the results they present ignore the effect of stalls due to dependences on memory instructions. In this part of the thesis we show that this assumption can lead to schedules whose performance is rather low when a real memory is considered. We therefore propose an algorithm to schedule memory instructions in modulo scheduled loops. We have studied different software prefetching strategies and finally propose an algorithm that performs prefetching based on the locality analysis and the shape of the loop dependence graph. The results obtained show that the proposed scheme outperforms other heuristic approaches, since it achieves a better trade-off between compute and stall time.

Finally, the last use of the locality analysis studied in this thesis is to guide an instruction scheduler for a clustered VLIW architecture. Clustered architectures are becoming a common trend in the design of embedded/DSP processors. Typically, the core of these processors is based on a VLIW design which partitions both the register file and the functional units. In this work we go a step further and also partition the cache memory. Then, both inter-register and inter-memory communications have to be taken into account. We propose an algorithm that performs both graph partitioning and instruction scheduling in a single step instead of doing them sequentially, which is shown to be more effective. This algorithm is improved by adding an analysis based on the cache miss equations in order to guide the scheduling of memory instructions across clusters, with the aim of reducing not only inter-register communications but also cache misses.
5

Hanai, Ryo. "Memory management for real-time applications." 京都大学 (Kyoto University), 2007. http://hdl.handle.net/2433/135980.

Full text
6

Wilk, Daniel. "Hierarchical application-oriented physical memory management." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1997. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp01/MQ29419.pdf.

Full text
7

Mohapatra, Dushmanta. "Coordinated memory management in virtualized environments." Diss., Georgia Institute of Technology, 2015. http://hdl.handle.net/1853/54454.

Full text
Abstract:
Two recent advances are the primary motivating factors for the research in my dissertation. First, virtualization is no longer confined to powerful server-class machines. It has already been introduced into smartphones and will be a part of other high-end embedded systems like automobiles in the near future. Second, more and more resource-intensive and latency-sensitive applications are being used on devices that are rather resource constrained, and introducing virtualization into the software stack just exacerbates the resource allocation issue. The focus of my research is on memory management in virtualized environments. Existing memory-management mechanisms were designed for server-class machines, and their implementations are geared towards the applications running primarily on data centers and cloud setups. In these setups, appropriate load balancing and achieving a fair division of resources are the goals, over-provisioning may be the norm, and the latency involved in resource management mechanisms may not be a big concern. But in the case of smartphones and other handheld devices, applications like media streaming and social networking are prevalent, which are both resource intensive and latency sensitive. Moreover, the bursty nature of their memory requirements results in spikes in the memory needs of the virtual machines. As over-provisioning is not an option in these domains, fast and effective memory-management mechanisms are necessary. The overall thesis of my dissertation is: with appropriate design and implementation, it is possible to achieve inter-VM memory management with a latency comparable to the latency involved in intra-VM memory management mechanisms like ‘malloc’. Towards realizing and validating this goal, I have made the following research contributions through my dissertation: (1) I analyzed the memory requirement patterns of prevalent applications, which exhibit bursty behavior, and showcased the need for fast memory-management mechanisms. (2) I designed and implemented a coordinated memory-management mechanism in a Xen-based virtualized setup, based on the split-driver principle. (3) I analyzed this mechanism and did a comparative evaluation against parallel memory-management mechanisms. (4) I analyzed the extent of interference from the schedulers in the operation of the mechanism and implemented constructs that help reduce the interference and latency. (5) Based on my analysis, I revised the implementation of the mechanism to one in which the Xen hypervisor plays a more significant and active role in coordination, and I did a detailed analysis to showcase the latency improvements due to this design change. (6) In order to validate my hypothesis, I did a comparative analysis of inter-VM and intra-VM memory-management mechanisms as the final part of my dissertation.
8

Feeley, Michael Joseph. "Global memory management for workstation networks /." Thesis, Connect to this title online; UW restricted, 1996. http://hdl.handle.net/1773/6997.

Full text
9

Yang, Shufan. "Memory interconnect management on a chip multiprocessor." Thesis, University of Manchester, 2010. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.520682.

Full text
10

Österlund, Erik. "Automatic memory management system for automatic parallelization." Thesis, Linnéuniversitetet, Institutionen för datavetenskap, fysik och matematik, DFM, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-13693.

Full text
Abstract:
With Moore’s law coming to an end and the era of multiprocessor chips emerging, the need for ways of dealing with the essential problems of concurrency is becoming pressing. Automatic parallelization for imperative languages and pure functions in functional programming languages both try to prove independence statically. This thesis argues that independence is dynamic in nature. Static analysis for automatic parallelization has failed to achieve anything but trivial optimizations. This thesis shows a new approach where dynamic analysis of the system is provided at very low cost using a garbage collector that has to go through all live cells anyway. Immutable sub-graphs of objects that cannot change state are found. Their methods become pure functions that can be parallelized. The garbage collector implemented is a kind of replicating collector. It is about three times faster than Boehm's collector at garbage collection, is fully concurrent, and provides the dynamic analysis almost for free.
11

Zhang, Yang. "Dynamic Memory Management for the Loci Framework." MSSTATE, 2004. http://sun.library.msstate.edu/ETD-db/theses/available/etd-04062004-215627/.

Full text
Abstract:
Resource management is a critical part of high-performance computing software. While management of processing resources to increase performance is the most critical, efficient management of memory resources plays an important role in solving large problems. This thesis research seeks to create an effective dynamic memory management scheme for a declarative data-parallel programming system. In such systems, some sort of automatic resource management is a requirement. This research explores such opportunities using the Loci framework. We believe there exists an automatic memory management scheme for such declarative data-parallel systems that provides a good compromise between memory utilization and performance. In addition to basic memory management, this research also seeks to develop methods that take advantage of the cache memory subsystem and explore the balance between memory utilization and parallel communication costs in such declarative data-parallel frameworks.
12

Wilhelmsson, Jesper. "Efficient memory management for message-passing concurrency." Licentiate thesis, Uppsala : Univ. : Dept. of Information Technology, Univ, 2005. http://www.it.uu.se/research/reports/lic/2005-001/.

Full text
13

Elsweiler, David. "Supporting human memory in personal information management." Thesis, University of Strathclyde, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.488520.

Full text
Abstract:
Personal Information Management (PIM) describes the processes by which an individual acquires, organises, and re-finds information. Studies have shown that people find PIM challenging and many struggle to manage the volume and diversity of information that they accumulate.
14

Hand, Steven Michael. "Providing quality of service in memory management." Thesis, University of Cambridge, 1999. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.624317.

Full text
15

Ellims, M. "Memory management in the Smalltalk-80 system." Thesis, University of Canterbury. Computer Science, 1987. http://hdl.handle.net/10092/9376.

Full text
Abstract:
This work presents an examination of the memory management area of the Smalltalk-80 system. Two implementations of this system were completed. The first used virtual memory managed in an object-oriented manner; its performance and related factors are examined in detail. The second was based wholly in RAM and was used to examine in detail the factors that affect the performance of the system. Two areas of the RAM-based system are examined in detail. The first is the logical manner in which the memory of the system is structured and its effect on the performance of the system. The second is the way in which object reference counts are decremented; this has potentially the largest effect on the system's running speed.
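The effect of the decrement strategy can be illustrated with a hedged sketch of deferred reference counting; the single-reference object layout, queue size, and batching policy below are invented and do not reflect Smalltalk-80's actual scheme:

```c
/* Sketch of deferred reference-count decrementing (illustrative only).
 * Instead of recursively freeing at each decrement, decrements are
 * queued and processed in batches, smoothing pause times. */
#include <stdio.h>
#include <stdlib.h>

typedef struct Obj {
    int rc;
    struct Obj *ref;          /* single outgoing reference, for brevity */
} Obj;

#define QCAP 1024
static Obj *defer_q[QCAP];
static int qlen;

void rc_dec_deferred(Obj *o)  /* cheap fast path on every pointer kill */
{
    if (o && qlen < QCAP)     /* a real system flushes when the queue fills */
        defer_q[qlen++] = o;
}

void process_deferred(void)   /* batched slow path */
{
    while (qlen > 0) {
        Obj *o = defer_q[--qlen];
        if (--o->rc == 0) {
            rc_dec_deferred(o->ref);  /* child's decrement is queued too */
            free(o);
        }
    }
}

int main(void)
{
    Obj *b = malloc(sizeof *b); b->rc = 1; b->ref = NULL;
    Obj *a = malloc(sizeof *a); a->rc = 1; a->ref = b;
    rc_dec_deferred(a);       /* drop the last reference to a */
    process_deferred();       /* frees a, then b, in one batch */
    printf("batch processed\n");
    return 0;
}
```
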
16

Hagelin, Martin. "Optimizing Memory Management with Object-Local Heaps." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-259345.

Full text
Abstract:
The large discrepancy between the speed of caches and main memory makes efficient memory management more important than ever. This report gives an overview of the current state of the art in memory management and how it can be used to make the most of hardware features like cache hierarchies and hardware prefetchers. Pooled allocation, automatic splitting of objects, pointer compression, and copying garbage collection are identified as four promising areas, and it is noted that few if any existing systems offer functionality for all four. Furthermore, the report describes the design and implementation of a new software package called Object-Local Heaps (OLH). Object-Local Heaps includes a pooled allocator and a compacting garbage collector to avoid memory fragmentation, as well as structure splitting and pointer compression to conserve memory, better utilize caches, and improve the performance of hardware prefetchers. The main contribution lies in showing how these separate ideas can be combined. The system is tested and evaluated using micro-benchmarks. It is found that Object-Local Heaps can increase throughput by an order of magnitude when iterating over linked structures, compared to an implementation in pure C that is subject to some fragmentation. The throughput when accessing individual fields in objects of two or more fields can also be more than doubled compared to a program that iterates over a standard C++ vector. Additional overhead renders the system unsuitable for search trees and similar structures, however, as they require multiple random accesses throughout pools. A red-black tree from the C++ standard template library is more than twice as fast as an equivalent using Object-Local Heaps. Pooled allocation is identified as the most worthwhile feature to integrate in a production language; structure splitting and pointer compression have to be applied more carefully and might not be suitable for the general case. Some of the shortcomings of Object-Local Heaps could be overcome by using hierarchical memory layouts for search trees, others by leveraging compiler support to reduce latency in general. In conclusion, this work shows that automatic memory management can provide opportunities for significant performance gains as well as safety and convenience.
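Two of the four ideas, pooled allocation and pointer compression, can be shown in a minimal sketch; the pool layout and 32-bit index scheme below are invented for the example and are not OLH's actual design:

```c
/* Hedged sketch of pooled allocation plus pointer compression. Objects
 * of one type live in a contiguous pool and reference each other with
 * 32-bit indices instead of 64-bit pointers, halving link size and
 * keeping traversals sequential and cache friendly. */
#include <assert.h>
#include <stdio.h>
#include <stdint.h>

#define POOL_CAP 1024
#define NIL 0xFFFFFFFFu

typedef struct {
    int32_t  value;
    uint32_t next;            /* compressed "pointer": pool index */
} Node;

static Node pool[POOL_CAP];
static uint32_t pool_len;

static uint32_t node_new(int32_t value, uint32_t next)
{
    assert(pool_len < POOL_CAP);
    pool[pool_len] = (Node){ value, next };
    return pool_len++;
}

int main(void)
{
    /* Build the list 2 -> 1 -> 0. Nodes are adjacent in memory, so
     * iteration is sequential and prefetcher friendly. */
    uint32_t head = NIL;
    for (int32_t i = 0; i < 3; i++)
        head = node_new(i, head);

    for (uint32_t n = head; n != NIL; n = pool[n].next)
        printf("%d ", pool[n].value);
    printf("\n");
    return 0;
}
```
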
17

Hassan, Ahmad. "Software management of hybrid main memory systems." Thesis, Queen's University Belfast, 2016. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.706689.

Full text
Abstract:
Power and energy efficiency have become major concerns for modern computing systems. Main memory is a key energy consumer and a critical component of system design. Dynamic Random Access Memory (DRAM) is the de facto technology for main memory in modern computing systems. However, DRAM is unlikely to scale beyond 22nm, which restricts the amount of main memory available to a system. Moreover, DRAM consumes significant static energy in both active and idle states due to continuous leakage and refresh power. Non-Volatile Memory (NVM) technology is emerging as a compelling main memory technology due to its high density and low leakage power. Current NVM devices have higher read and write access latencies than DRAM. Unlike DRAM, NVM technology is characterized by asymmetric read and write latencies, with writes suffering more than reads. Moreover, NVM suffers from higher dynamic access energy and reduced durability compared to DRAM. This dissertation proposes to leverage a hybrid memory architecture, consisting of both DRAM and NVM, with the aim of reducing energy. Application-level data management policies are proposed that decide whether to place data in DRAM or NVM. With careful data placement, hybrid memory exhibits the latency and dynamic energy of DRAM in the common case, while rarely exposing the latency and high dynamic energy of NVM. Moreover, main memory capacity is increased by NVM without expending the static energy of DRAM.
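A placement policy of this general kind might look like the hedged sketch below, routing write-hot data to DRAM and read-mostly data to NVM; the profile structure and threshold are invented for illustration and are not the dissertation's actual policies:

```c
/* Illustrative placement policy for a hybrid DRAM/NVM main memory.
 * Write-heavy data goes to DRAM, since NVM writes are slow, energy
 * hungry, and wear the device; read-mostly data tolerates NVM. */
#include <stdio.h>

typedef enum { PLACE_DRAM, PLACE_NVM } Placement;

typedef struct {
    double writes_per_sec;    /* profiled or estimated write rate */
    double reads_per_sec;
} AccessProfile;

Placement choose_memory(const AccessProfile *p, double write_threshold)
{
    /* Hot-write data pays NVM's asymmetric write cost repeatedly,
     * so it earns a slot in scarce DRAM. */
    if (p->writes_per_sec > write_threshold)
        return PLACE_DRAM;
    return PLACE_NVM;         /* read-mostly: NVM read latency is tolerable */
}

int main(void)
{
    AccessProfile log_buffer = { .writes_per_sec = 5e5, .reads_per_sec = 1e3 };
    AccessProfile lookup_tbl = { .writes_per_sec = 10,  .reads_per_sec = 2e5 };
    printf("log buffer -> %s\n",
           choose_memory(&log_buffer, 1e4) == PLACE_DRAM ? "DRAM" : "NVM");
    printf("lookup table -> %s\n",
           choose_memory(&lookup_tbl, 1e4) == PLACE_DRAM ? "DRAM" : "NVM");
    return 0;
}
```
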
18

Lescouet, Alexis. "Memory management for operating systems and runtimes." Electronic Thesis or Diss., Institut polytechnique de Paris, 2021. http://www.theses.fr/2021IPPAS008.

Full text
Abstract:
During the last decade, the need for computational power has increased due to the emergence and fast evolution of fields such as data analysis or artificial intelligence. This tendency is also reinforced by the growing number of services and end-user devices. Due to physical constraints, the trend for new hardware has shifted from an increase in processor frequency to an increase in the number of cores per machine. This new paradigm requires software to adapt, making the ability to manage such a parallelism the cornerstone of many parts of the software stack. Directly concerned by this change, operating systems have evolved to include complex rules each pertaining to different hardware configurations. However, more often than not, resources management units are responsible for one specific resource and make a decision in isolation. Moreover, because of the complexity and fast evolution rate of hardware, operating systems, not designed to use a generic approach have trouble keeping up. Given the advance of virtualization technology, we propose a new approach to resource management in complex topologies using virtualization to add a small software layer dedicated to resources placement in between the hardware and a standard operating system. Similarly, in user space applications, parallelism is an important lever to attain high performances, which is why high performance computing runtimes, such as MPI, are built to increase parallelism in applications. The recent changes in modern architectures combined with fast networks have made overlapping CPU-bound computation and network communication a key part of parallel applications. While some degree of overlap might be attained manually, this is often a complex and error prone procedure. Our proposal automatically transforms blocking communications into non blocking ones to increase the overlapping potential. To this end, we use a separate communication thread responsible for handling communications and a memory protection mechanism to track memory accesses in communication buffers. This guarantees both progress for these communications and the largest window during which communication and computation can be processed in parallel
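The overlap being automated can be shown with the standard MPI non-blocking pattern; the sketch below demonstrates the effect manually (buffer size and the computation are invented), whereas the thesis performs this transformation automatically and uses memory protection rather than programmer discipline to keep the buffer safe:

```c
/* Manual communication/computation overlap with standard MPI:
 * MPI_Isend starts the transfer, the loop computes while the message
 * is in flight, and MPI_Wait completes it. The send buffer must not
 * be touched before MPI_Wait returns. Run with: mpirun -np 2 */
#include <mpi.h>

#define N 1000000

static double buf[N], other[N];

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Request req;
        MPI_Isend(buf, N, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, &req);
        for (int i = 0; i < N; i++)            /* overlapped computation */
            other[i] = other[i] * 2.0 + 1.0;   /* does not touch buf */
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    } else if (rank == 1) {
        MPI_Recv(buf, N, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
    MPI_Finalize();
    return 0;
}
```
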
19

Ha, Viet Hai. "Optimization of memory management on distributed machine." PhD thesis, Institut National des Télécommunications, 2012. http://tel.archives-ouvertes.fr/tel-00814630.

Full text
Abstract:
In order to explore further the capabilities of parallel computing architectures such as grids, clusters, multi-processors and, more recently, clouds and multi-cores, an easy-to-use parallel language is an important challenge. From the programmer's point of view, OpenMP is very easy to use, thanks to its ability to support incremental parallelization, its features for dynamically setting the number of threads, and its scheduling strategies. However, as it was initially designed for shared memory systems, OpenMP is usually limited on distributed memory systems to intra-node computations. Many attempts have tried to port OpenMP to distributed systems. The most prominent approaches mainly focus on exploiting the capabilities of a special network architecture and therefore cannot provide an open solution. Others are based on an already available software solution such as DMS, MPI or Global Array and, as a consequence, meet difficulties in becoming a fully compliant and high-performance implementation of OpenMP. As yet another attempt to build an OpenMP-compliant implementation for distributed memory systems, CAPE − which stands for Checkpointing Aided Parallel Execution − has been developed with the following idea: when reaching a parallel section, the master thread is dumped and its image is sent to the slaves; then, each slave executes a different thread; at the end of the parallel section, the slave threads extract and return to the master thread the list of all modifications that have been performed locally; the master incorporates these modifications and resumes its execution. In order to prove the feasibility of this paradigm, the first version of CAPE was implemented using complete checkpoints. However, preliminary analysis showed that the large amount of data transferred between threads and the extraction of the list of modifications from complete checkpoints lead to weak performance. Furthermore, this version was restricted to parallel problems satisfying Bernstein's conditions, i.e. it did not meet the requirements of shared data. This thesis presents the approaches we proposed to improve CAPE's performance and to overcome the restrictions on shared data. First, we developed DICKPT, which stands for Discontinuous Incremental Checkpointing, an incremental checkpointing technique that supports the ability to save incremental checkpoints discontinuously during the execution of a process. Based on DICKPT, the execution speed of the new version of CAPE was significantly increased. For example, the time to compute a large matrix-matrix product on a desktop cluster became very similar to the execution time of the same optimized MPI program. Moreover, the speedup associated with this new version for various numbers of threads is quite linear for different problem sizes. On the side of shared data, we proposed UHLRC, which stands for Updated Home-based Lazy Release Consistency, a modified version of the Home-based Lazy Release Consistency (HLRC) memory model, to make it more appropriate to the characteristics of CAPE. Prototypes and algorithms to implement the synchronization and the OpenMP data-sharing clauses and directives are also specified. These two contributions ensure the ability for CAPE to respect shared-data behavior.
20

Ha, Viet Hai. "Optimization of memory management on distributed machine." Electronic Thesis or Diss., Evry, Institut national des télécommunications, 2012. http://www.theses.fr/2012TELE0042.

Full text
21

Faur, Andrei. "Memory Profiling Techniques." Thesis, Linköpings universitet, Programvara och system, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-79598.

Full text
Abstract:
Memory profiling is an important technique which aids program optimization and can even help track down bugs. The main problem with current memory profiling techniques and tools is that they slow down the target software considerably, making them inadequate for mainline integration. Ideally, the user would be able to monitor memory consumption without having to worry about the rest of the software being affected in any way. This thesis provides a comparison of existing techniques and tools, along with the description of a memory profiler implementation which tries to strike a balance between the information it is able to retrieve and the influence it has on the target software.
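A minimal counting profiler of the low-overhead kind weighed here can be sketched as follows; the traced_malloc/traced_free wrappers and the header layout are invented for the example and are not the thesis's implementation:

```c
/* Hedged sketch of a low-overhead allocation profiler: call sites use
 * traced_malloc/traced_free (e.g. via a project-wide macro) and pay
 * only a header write and a few counter updates per allocation. */
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>

/* Size header padded to max alignment so user data stays aligned. */
typedef union { size_t size; max_align_t pad; } Header;

static size_t live_bytes, peak_bytes, total_allocs;

void *traced_malloc(size_t size)
{
    Header *h = malloc(sizeof(Header) + size);
    if (!h) return NULL;
    h->size = size;
    live_bytes += size;
    total_allocs++;
    if (live_bytes > peak_bytes) peak_bytes = live_bytes;
    return h + 1;             /* user data starts after the header */
}

void traced_free(void *p)
{
    if (!p) return;
    Header *h = (Header *)p - 1;
    live_bytes -= h->size;
    free(h);
}

int main(void)
{
    void *a = traced_malloc(1024);
    void *b = traced_malloc(4096);
    traced_free(a);
    printf("allocs=%zu live=%zu peak=%zu\n",
           total_allocs, live_bytes, peak_bytes);
    traced_free(b);
    return 0;
}
```
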
22

Upadhayaya, Niraj. "Memory management and optimization using distributed shared memory systems for high performance computing clusters." Thesis, University of the West of England, Bristol, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.421743.

Full text
23

Cao, Haian. "Memory address management for digital signal processors (DSPs)." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape7/PQDD_0007/MQ43147.pdf.

Full text
24

Kim, Jinwoo. "Memory hierarchy management through off-line computational learning." Diss., Georgia Institute of Technology, 2003. http://hdl.handle.net/1853/8194.

Full text
25

Mapp, Glenford Ezra. "An object-oriented approach to virtual memory management." Thesis, University of Cambridge, 1991. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.387015.

Full text
26

Borg, Andrew. "Coarse grain memory management in real-time systems." Thesis, University of York, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.437621.

Full text
27

Favela, Jesus. "Organizational memory management for large-scale systems development." Thesis, Massachusetts Institute of Technology, 1993. http://hdl.handle.net/1721.1/12238.

Full text
28

Modzelewski, Kevin (Kevin Paul). "Scalable memory management using a distributed buddy allocator." Thesis, Massachusetts Institute of Technology, 2010. http://hdl.handle.net/1721.1/61002.

Full text
Abstract:
The recent rise of multicore processors has forced us to reexamine old computer science problems in a new light. As multicores turn into manycores, we need to visit these problems yet again to find solutions that will work on these drastically different architectures. This thesis presents the design of a new page allocator algorithm based on a new distributed buddy allocator algorithm, one which is made with future processor architectures in mind. The page allocator is a vital and heavily-used part of an operating system, and making this more scalable is a necessary step to build a scalable operating system. This design was implemented in the fos [34] research operating system, and evaluated on 8- and 16-core machines. The results show that this design has comparable performance with Linux for small core counts, and with its better scalability, surpasses the performance of Linux at higher core counts.
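For reference, the sketch below shows a minimal single-threaded binary buddy allocator; the heap size, order range, and list-based buddy lookup are invented for illustration, and the thesis's contribution lies precisely in distributing and scaling an allocator of this kind across many cores:

```c
/* Minimal binary buddy allocator (illustrative, single-threaded).
 * Blocks are powers of two; a freed block merges with its buddy,
 * which is found by XOR-ing the block offset with the block size. */
#include <stdio.h>
#include <string.h>

#define MAX_ORDER 20                    /* 1 MiB heap */
static unsigned char heap[1 << MAX_ORDER];

typedef struct Block { struct Block *next; } Block;
static Block *free_list[MAX_ORDER + 1]; /* one free list per order */

static void push(int order, void *p) {
    ((Block *)p)->next = free_list[order];
    free_list[order] = p;
}

void buddy_init(void) {
    memset(free_list, 0, sizeof free_list);
    push(MAX_ORDER, heap);              /* one block spanning the heap */
}

void *buddy_alloc(int order) {
    int o = order;
    while (o <= MAX_ORDER && !free_list[o]) o++;  /* find a big-enough block */
    if (o > MAX_ORDER) return NULL;
    Block *b = free_list[o];
    free_list[o] = b->next;
    while (o > order) {                 /* split down to the requested size */
        o--;
        push(o, (unsigned char *)b + (1 << o));   /* free the upper half */
    }
    return b;
}

void buddy_free(void *p, int order) {
    size_t off = (unsigned char *)p - heap;
    while (order < MAX_ORDER) {
        size_t buddy_off = off ^ ((size_t)1 << order);
        Block **cur = &free_list[order];          /* look for the buddy */
        while (*cur && (unsigned char *)*cur != heap + buddy_off)
            cur = &(*cur)->next;
        if (!*cur) break;               /* buddy in use: stop merging */
        *cur = (*cur)->next;            /* unlink buddy and merge */
        if (buddy_off < off) off = buddy_off;
        order++;
    }
    push(order, heap + off);
}

int main(void) {
    buddy_init();
    void *a = buddy_alloc(6);           /* 64 bytes */
    void *b = buddy_alloc(12);          /* 4 KiB */
    printf("a=%p b=%p\n", a, b);
    buddy_free(a, 6);
    buddy_free(b, 12);
    printf("coalesced back to one %d-byte block: %s\n", 1 << MAX_ORDER,
           free_list[MAX_ORDER] ? "yes" : "no");
    return 0;
}
```
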
29

Beebee, William S. (William Scripps) 1977. "Region-based memory management for real-time Java." Thesis, Massachusetts Institute of Technology, 2001. http://hdl.handle.net/1721.1/86801.

Full text
30

Makridis, Odysseus. "Energy-Efficient Flash Memory Management in Embedded Systems." Thesis, KTH, Maskinkonstruktion (Inst.), 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-176276.

Full text
Abstract:
In today’s embedded systems there is a growing need for energy-efficient solutions, as the power consumption of applications and hardware keeps growing. That is why GEODES was initiated, a project in which several companies and universities share the main objective of developing power-efficient solutions for embedded software. This master's thesis was carried out at ENEA and is part of GEODES. The focus of the work is flash memory management and how this type of software can be made power efficient. The first part of the thesis is a theoretical study of flash memory management and power efficiency in flash systems. This study identified and evaluated the existing solutions in this area. The theoretical foundation was laid so that an implementation of an energy-efficient flash management system could be realized. A requirement specification was also developed, covering both basic functionality and energy optimization. After the theoretical study, a power-efficient design was developed, containing techniques like data separation, intelligent garbage collection, and the use of a cache memory. More basic functionality like error code correction, bad block management, and wear levelling was also included. The implementation was based on this design and the result was a driver layer for a flash memory; however, no power management techniques were included due to time restrictions. The driver was implemented on an i.MX31 board in an OSE 5.4 environment; this module (memory device driver - MDD) controlled a 128 M x 8-bit NAND flash memory and included functionalities such as error code correction, bad block management, and status read. The MDD utilized an internal RAM buffer and the i.MX31 board's NAND flash controller. It executed all the operations (read, write and erase) and used the spare area for metadata writes.
31

Holk, Eric. "Region-based memory management for expressive GPU programming." Thesis, Indiana University, 2016. http://pqdtopen.proquest.com/#viewpdf?dispub=10132089.

Full text
Abstract:

Over the last decade, graphics processing units (GPUs) have seen their use broaden from purely graphical tasks to general purpose computation. The increased programmability required by demanding graphics applications has proven useful for a number of non-graphical problems as well. GPUs' high memory bandwidth and floating point performance make them attractive for general computation workloads, yet these benefits come at the cost of added complexity. One particular problem is the fact that GPUs and their associated high performance memory typically lie on discrete cards that are separated from the host CPU by the PCI-Express bus. This requires programmers to carefully manage the transfer of data between the CPU and GPU memory so that the right data is in the right place at the right time. Programmers must design data structures with serialization in mind in order to efficiently move data across the PCI bus. In practice, this leads to programmers working with only simple data structures such as one or two-dimensional arrays and the applications that can be easily expressed in terms of these structures. CPU programmers have long had access to richer data structures, such as trees or first class procedures, which enable new and simpler approaches to solving certain problems.

This thesis explores the use of RBMM to overcome these data movement challenges. RBMM is a technique in which data is assigned to regions and these regions can then be operated on as a unit. One of the first uses of regions was to amortize the cost of deallocation. Many small objects would be allocated in a single region and the region could be deallocated as a single operation independent of the number of items in the region. In this thesis, regions are used as the unit of data movement between the CPU and GPU. Data structures are assigned to a region and thus the runtime system does not have to be aware of the internal layout of a data structure. The runtime system can simply move the entire region from one device to another, keeping the internal layout intact and allowing code running on either device to operate on the data in the same way.
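The basic region mechanism can be illustrated with a minimal arena sketch; the Region struct and bump allocation below are a generic reconstruction under stated assumptions, not Harlan's runtime, which additionally moves regions between CPU and GPU memory:

```c
/* Minimal region (arena) allocator. Objects are bump-allocated inside
 * one contiguous buffer, so the whole region can be freed - or, as in
 * Harlan's design, shipped to another device - as a single unit,
 * independent of how many objects it contains. */
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <stdalign.h>

typedef struct {
    char  *base;
    size_t used, cap;
} Region;

Region region_new(size_t cap)
{
    return (Region){ malloc(cap), 0, cap };
}

void *region_alloc(Region *r, size_t n)
{
    /* round up so every object stays maximally aligned */
    n = (n + alignof(max_align_t) - 1) & ~(alignof(max_align_t) - 1);
    if (r->used + n > r->cap) return NULL;   /* sketch: no growth */
    void *p = r->base + r->used;
    r->used += n;
    return p;
}

void region_free(Region *r)   /* one free, regardless of object count */
{
    free(r->base);
    r->base = NULL;
    r->used = r->cap = 0;
}

int main(void)
{
    Region r = region_new(1 << 20);
    int *xs = region_alloc(&r, 1000 * sizeof *xs);
    int *ys = region_alloc(&r, 1000 * sizeof *ys);
    for (int i = 0; i < 1000; i++) { xs[i] = i; ys[i] = 2 * i; }
    printf("region holds %zu bytes in one buffer\n", r.used);
    region_free(&r);          /* both arrays go away together */
    return 0;
}
```
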

These ideas are explored through a new programming language called Harlan. Harlan is designed to simplify programming GPUs and other data parallel processors. It provides kernel expressions as its fundamental mechanism for parallelism. Kernels function similarly to a parallel map or zipWith operation from other functional programming languages. For example, the expression (kernel ([x xs] [y ys]) (+ x y)) evaluates to a vector where each element is the sum of the corresponding elements in xs and ys. Kernels can have arbitrary body expressions that can even include kernels, thereby supporting nested data parallelism. Harlan uses a region-based memory system to enable higher level programming features such as trees and ADTs and even first class procedures. Like all data in Harlan, first class procedures are device-independent, so a procedure created in GPU code can be applied in CPU code and vice-versa.

Besides providing the design and description of the implementation of Harlan, this thesis includes a type safety proof for a small model of Harlan's region system as well as a number of small application case studies. The type safety proof provides formal support that Harlan ensures programs will have the right data in the right place at the right time. The application case studies show that Harlan and the ideas embodied within it are useful both for a number of traditional applications and for problems that are difficult to express in previous GPU programming languages. The design and implementation of Harlan, its proof of type safety and the set of application case studies together show that region-based memory management is an effective way of enabling high level features in languages targeting CPU/GPU systems and other machines with disjoint memories.

APA, Harvard, Vancouver, ISO, and other styles
32

Wu, Jiesheng. "Communication and memory management in networked storage systems." The Ohio State University, 2004. http://rave.ohiolink.edu/etdc/view?acc_num=osu1095696917.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Herrmann, Edward C. "Threaded Dynamic Memory Management in Many-Core Processors." University of Cincinnati / OhioLINK, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1277132326.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Muthu, Srinivas. "A Context-Aware Approach to Android Memory Management." University of Toledo / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1449665506.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Lin, Jiang. "Thermal modeling and management of DRAM memory systems." [Ames, Iowa : Iowa State University], 2008.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
36

Sim, Jae Woong. "Architecting heterogeneous memory systems with 3D die-stacked memory." Diss., Georgia Institute of Technology, 2015. http://hdl.handle.net/1853/53835.

Full text
Abstract:
The main objective of this research is to efficiently enable 3D die-stacked memory and heterogeneous memory systems. 3D die-stacking is an emerging technology that allows for large amounts of in-package high-bandwidth memory storage. Die-stacked memory has the potential to provide extraordinary performance and energy benefits for computing environments, from data-intensive to mobile computing. However, incorporating die-stacked memory into computing environments requires innovations across the system stack, from hardware to software. This dissertation presents several architectural innovations to practically deploy die-stacked memory into a variety of computing systems.

First, this dissertation proposes using die-stacked DRAM as a hardware-managed cache in a practical and efficient way. The proposed DRAM cache architecture employs two novel techniques: hit-miss speculation and self-balancing dispatch. The proposed techniques virtually eliminate the hardware overhead of maintaining a multi-megabyte SRAM structure when scaling to gigabytes of stacked DRAM caches, and improve overall memory bandwidth utilization.

Second, this dissertation proposes a DRAM cache organization that provides a high level of reliability for die-stacked DRAM caches in a cost-effective manner. The proposed DRAM cache uses error-correcting codes (ECCs), strong checksums (CRCs), and dirty data duplication to detect and correct a wide range of stacked DRAM failures, from traditional bit errors to large-scale row, column, bank, and channel failures, within the constraints of commodity, non-ECC DRAM stacks. With only a modest performance degradation compared to a DRAM cache with no ECC support, the proposed organization can correct all single-bit failures and 99.9993% of all row, column, and bank failures.

Third, this dissertation proposes architectural mechanisms to use large, fast, on-chip memory structures as part of memory (PoM) seamlessly through the hardware. The proposed design achieves the performance benefit of on-chip memory caches without sacrificing a large fraction of total memory capacity to serve as a cache. To achieve this, PoM implements the ability to dynamically remap regions of memory based on their access patterns and expected performance benefits.

Lastly, this dissertation explores a new usage model for die-stacked DRAM involving a hybrid of caching and virtual memory support. In the common case, where the system's physical memory is not over-committed, die-stacked DRAM operates as a cache to provide performance and energy benefits to the system. However, when the workload's active memory demands exceed the capacity of the physical memory, the proposed scheme dynamically converts the stacked DRAM cache into a fast swap device to avoid the otherwise grievous performance penalty of swapping to disk.
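As a purely illustrative reading of hit-miss speculation, a small predictor can guess whether a DRAM cache lookup will hit before the in-stack tag check completes, so that a predicted miss can start the off-package memory access early. The sketch below uses 2-bit saturating counters and assumed sizes; it is not the dissertation's actual mechanism:

#include <stdint.h>

/* Toy hit/miss speculation table: 2-bit saturating counters indexed by a
 * hash of the page address. A predicted miss lets the controller launch
 * the off-package access in parallel with the DRAM-cache tag check. */

#define PRED_ENTRIES 4096
static uint8_t pred[PRED_ENTRIES];        /* 0..3; >= 2 means "predict hit" */

static inline unsigned pred_idx(uint64_t paddr)
{
    uint64_t page = paddr >> 12;
    return (unsigned)((page ^ (page >> 13)) & (PRED_ENTRIES - 1));
}

int predict_hit(uint64_t paddr) { return pred[pred_idx(paddr)] >= 2; }

void train(uint64_t paddr, int was_hit)
{
    uint8_t *c = &pred[pred_idx(paddr)];
    if (was_hit) { if (*c < 3) (*c)++; }
    else         { if (*c > 0) (*c)--; }
}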
APA, Harvard, Vancouver, ISO, and other styles
37

Krishnajith, Anaththa Pathiranage Dhanushka. "Memory management and parallelization of data intensive all-to-all comparison in shared-memory systems." Thesis, Queensland University of Technology, 2014. https://eprints.qut.edu.au/79187/1/Anaththa%20Pathiranage%20Dhanushka_Krishnajith_Thesis.pdf.

Full text
Abstract:
This thesis presents a novel program parallelization technique incorporating dynamic and static scheduling. It utilizes a problem-specific pattern developed from prior knowledge of the targeted problem abstraction. Suitable for solving complex parallelization problems such as data-intensive all-to-all comparison constrained by memory, the technique delivers more robust and faster task scheduling compared to state-of-the-art techniques. Good performance is achieved by the technique in data-intensive bioinformatics applications.
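The memory-constrained all-to-all pattern can be illustrated with a simple blocked loop in which at most two blocks of the dataset are resident at any time. This C sketch uses hypothetical load_block and compare_blocks helpers and says nothing about the thesis's actual scheduling strategy:

#include <stddef.h>

/* Tiled all-to-all comparison sketch: the dataset is split into blocks so
 * the memory footprint stays bounded regardless of dataset size. */

enum { NBLOCKS = 100 };

extern void load_block(size_t b, void *buf);               /* hypothetical I/O */
extern void compare_blocks(const void *x, const void *y);  /* hypothetical */

void all_to_all(void *bufA, void *bufB)
{
    for (size_t i = 0; i < NBLOCKS; i++) {
        load_block(i, bufA);
        compare_blocks(bufA, bufA);          /* pairs within block i */
        for (size_t j = i + 1; j < NBLOCKS; j++) {
            load_block(j, bufB);             /* only two blocks resident */
            compare_blocks(bufA, bufB);      /* pairs across blocks i and j */
        }
    }
}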
APA, Harvard, Vancouver, ISO, and other styles
38

Henschen, Katharina. "The monument : the Shoah and German memory." Thesis, City University London, 2002. http://openaccess.city.ac.uk/7648/.

Full text
Abstract:
The aim of this research project is to analyse forms of remembrance and memory of the Shoah in Germany in its and their political and cultural formations. The underlying question driving the research project is Adorno's famous essay 'What does it mean: coming-to-terms with the past?'. The thesis deals with historical-philosophical reflections and the historical-literary perspective on the complex process of remembering the Shoah in Germany and its monumental manifestations in the form of the planned Holocaust Memorial in Berlin. The research project sets out to critique and analyse a body of artistic, literary and philosophical works that engage with the problematics of remembering and re-presenting the Shoah. It explores these critical questions against the backdrop of the changed social and historical conditions of the reunited Germany and makes reference to the debates of the 1990s, the planned Holocaust Memorial in Berlin and the wider context of post-Holocaust discourse. The first chapter delivers an exegetical reading of Walter Benjamin's texts in order to open up new interpretative perspectives for an understanding of the issues at stake. Benjamin's notions of 'history' and 'memory' serve as ideas for a comparative analysis of the problematics of memory in the country of the perpetrators and for the possibilities of future memory. The second chapter discusses the decision-making process for a national, central 'Memorial to the murdered Jews of Europe' in Berlin; it explores the winning designs of the competition and their respective implications on what constitutes the memory of the Shoah in Germany. The decision for a central memorial and the implications of the chosen design are measured against the backdrop of the debates of the 1990s and the politics of a re-united Germany. The third chapter discusses the different attempts of literary and (historical-)philosophical reflection on the occurrence of the Shoah in the writings of Thomas Mann, Hannah Arendt, Karl Jaspers and Alexander and Margarete Mitscherlich. The chapter questions the political positioning and action of the author Martin Walser, as a representative of the generation of perpetrators, to the process of working-through and coming-to-terms. It critically examines Walser's speech of October 1998 and places the speech in the historical context of coming-to-terms in post-war Germany. The thesis demonstrates that the choice of design for the planned Holocaust Memorial correlates with the status of politics in the united Germany. It is argued here that the focus on what it is that needs to be worked through and come to terms with has shifted during the post-Holocaust discourse. The thesis demonstrates that the questions at stake in the most recent debates are the workings-through of a younger generation that confronts part of a horrifying family history. The thesis argues for the necessity of memory and remembrance in the future.
APA, Harvard, Vancouver, ISO, and other styles
39

Cruz, Eduardo Henrique Molina da. "Online thread and data mapping using the memory management unit." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2016. http://hdl.handle.net/10183/139109.

Full text
Abstract:
As thread-level parallelism increases in modern architectures due to larger numbers of cores per chip and chips per system, the complexity of their memory hierarchies also increases. Such memory hierarchies include several private or shared cache levels, and Non-Uniform Memory Access nodes with different access times. One important challenge for these architectures is the data movement between cores, caches, and main memory banks, which occurs when a core performs a memory transaction. In this context, the reduction of data movement is an important goal for future architectures to keep performance scaling and to decrease energy consumption. One of the solutions to reduce data movement is to improve memory access locality through sharing-aware thread and data mapping. State-of-the-art mapping mechanisms try to increase locality by keeping threads that share a high volume of data close together in the memory hierarchy (sharing-aware thread mapping), and by mapping data close to where its accessing threads reside (sharing-aware data mapping). Many approaches focus on either thread mapping or data mapping, but perform them only separately, losing opportunities to improve performance. Some mechanisms rely on execution traces to perform a static mapping, which have a high overhead and cannot be used if the behavior of the application changes between executions. Other approaches use sampling or indirect information about the memory access pattern, resulting in imprecise memory access information. In this thesis, we propose novel solutions to identify an optimized sharing-aware mapping that make use of the memory management unit of processors to monitor memory accesses. Our solutions work online, in parallel to the execution of the application, and detect the memory access pattern for both thread and data mappings. With this information, the operating system can perform sharing-aware thread and data mapping during the execution of the application, without any prior knowledge of its behavior. Since they work directly in the memory management unit, our solutions are able to track most memory accesses performed by the parallel application with a very low overhead. They can be implemented in architectures with hardware-managed TLBs with little additional hardware, and some can be implemented in architectures with software-managed TLBs without any hardware changes. Our solutions have a higher accuracy than previous mechanisms because they have access to more accurate information about the memory access behavior. To demonstrate the benefits of our proposed solutions, we evaluate them with a wide variety of applications using a full system simulator, a real machine with software-managed TLBs, and a trace-driven evaluation on two real machines with hardware-managed TLBs. In the experimental evaluation, our proposals were able to reduce execution time by up to 39%. The improvements were due to a substantial reduction in cache misses and inter-chip interconnection traffic.
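One way such MMU-based monitoring is commonly sketched is to record, on each TLB event, which thread touched which page, and to accumulate a thread-pair sharing matrix for the scheduler. The fragment below is an illustrative sketch with assumed sizes and names, not the thesis's implementation:

#include <stdint.h>

/* Sharing-detection sketch: pages touched by multiple threads add weight
 * to a thread-pair sharing matrix; a mapping policy can later place
 * heavily sharing threads on nearby cores. */

#define NTHREADS 64
#define NPAGES   (1 << 20)

static uint64_t touched[NPAGES];             /* bitmask of threads per page */
static uint32_t share[NTHREADS][NTHREADS];   /* estimated sharing per pair */

void on_tlb_event(unsigned tid, uint64_t vaddr)
{
    uint64_t page = (vaddr >> 12) & (NPAGES - 1);
    uint64_t prev = touched[page];
    for (unsigned t = 0; t < NTHREADS; t++) {
        if (((prev >> t) & 1) && t != tid) {
            share[tid][t]++;                 /* symmetric update */
            share[t][tid]++;
        }
    }
    touched[page] |= 1ull << tid;
}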
APA, Harvard, Vancouver, ISO, and other styles
40

Mohnen, Markus. "Optimising the memory management of higher order functional programs." Aachen : RWTH, Fachgruppe Informatik, 1997. http://deposit.d-nb.de/cgi-bin/dokserv?idn=970713789.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Zhang, Q. "Memory management architecture for next generation networks traffic managers." Thesis, Queen's University Belfast, 2012. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.557859.

Full text
Abstract:
The trend of moving conventional IP networks towards Next Generation Networks (NGNs) has highlighted the need for more sophisticated Traffic Managers (TMs) to guarantee better network performance and Quality of Service (QoS); these have to be scalable to support increasing link bandwidth and to cater for more diverse emerging applications. Current TM solutions, though, are limited and not flexible enough to support new TM functionality or QoS with increasing diversity at faster speeds. This thesis investigates efficient and flexible memory management architectures that are critical in determining scalability and the upper limits of TM performance. The approach presented takes advantage of current FPGA technology, which now offers a high density of computational resources and flexible memory configurations, leading to what the author contends to be an ideal, programmable platform for distributed network management. The thesis begins with a survey of current TM solutions and their underlying technologies/architectures, the outcome of which indicates that memory and memory interfacing are the major factors in determining the scalability and upper limits of TM performance. An analysis of the implementation cost for a new TM with the capability of integrated queuing and scheduling further highlights the need to develop a more effective memory management architecture. A new on-demand Queue Manager (QM) architecture for a programmable TM is then proposed that can dynamically map the ongoing active flows to a limited number of physical queues. Compared to traditional QMs, it consumes much less memory resources, leading to a more scalable and efficient TM solution. Based on an analysis of the effect of varying Internet traffic on the proposed QM, a more robust and resilient QM architecture is derived that achieves higher scalability and performance by adapting its functionality to the changing network conditions.
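The on-demand idea can be illustrated by a table that assigns one of a small pool of physical queues to a flow only while that flow is active. The C sketch below uses an assumed hash-table layout and omits the release path that returns a drained flow's queue to the pool; it is not the thesis's design:

#include <stdint.h>

/* On-demand queue mapping sketch: only currently active flows occupy one
 * of the few instantiated physical queues; idle flows hold nothing.
 * Flow id 0 is reserved as the empty-slot marker. */

#define NQUEUES 64               /* physical queues actually instantiated */
#define TABSIZE 1024             /* flow-id -> queue lookup table */

typedef struct { uint32_t flow; int q; } slot_t;
static slot_t tab[TABSIZE];
static int    free_q[NQUEUES], free_top;

void qm_init(void) { for (int i = 0; i < NQUEUES; i++) free_q[free_top++] = i; }

/* Return the physical queue for a flow, allocating one on its first packet. */
int qm_lookup(uint32_t flow)
{
    unsigned h = (flow * 2654435761u) % TABSIZE;
    while (tab[h].flow && tab[h].flow != flow)
        h = (h + 1) % TABSIZE;               /* linear probing */
    if (!tab[h].flow) {                      /* newly active flow */
        if (!free_top) return -1;            /* pool exhausted: fall back */
        tab[h].flow = flow;
        tab[h].q = free_q[--free_top];
    }
    return tab[h].q;
}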
APA, Harvard, Vancouver, ISO, and other styles
42

Sinha, Aman. "Memory management and transaction scheduling for large-scale databases /." Digital version accessible at:, 1999. http://wwwlib.umi.com/cr/utexas/main.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Jiang, Song. "Efficient caching algorithms for memory management in computer systems." W&M ScholarWorks, 2004. https://scholarworks.wm.edu/etd/1539623446.

Full text
Abstract:
As disk performance continues to lag behind that of memory systems and processors, fully utilizing memory to reduce disk accesses is a highly effective effort to improve entire-system performance. Furthermore, to serve the applications running on a computer in distributed systems, not only the local memory but also the memory on remote servers must be effectively managed to minimize I/O operations. The critical challenges in effective memory cache management include: (1) insightfully understanding and quantifying the locality inherent in the memory access requests; (2) effectively utilizing the locality information in replacement algorithms; (3) intelligently placing and replacing data in the multi-level caches of a distributed system; (4) ensuring that the overheads of the proposed schemes are acceptable.

This dissertation provides solutions and makes unique and novel contributions in application locality quantification, general replacement algorithms, low-cost replacement policy, thrashing protection, as well as multi-level cache management in a distributed system. First, the dissertation proposes a new method to quantify locality strength and to accurately identify the data with strong locality. It also provides a new replacement algorithm, which significantly outperforms existing algorithms. Second, considering the extremely low-cost requirements on replacement policies in virtual memory management, the dissertation proposes a policy meeting the requirements and considerably exceeding the performance of existing policies. Third, the dissertation provides an effective scheme to protect the system from thrashing when running memory-intensive applications. Finally, the dissertation provides a multi-level block placement and replacement protocol in a distributed client-server environment, exploiting non-uniform locality strengths in the I/O access requests.

The methodology used in this study includes careful application behavior characterization, system requirement analysis, algorithm design, trace-driven simulation, and system implementation. A main conclusion of the work is that there is still much room for innovation and significant performance improvement for the seemingly mature and stable policies that have been broadly used in current operating system design.
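As a minimal illustration of locality quantification, one simple signal a replacement policy can use is the interval between successive references to a block: short intervals suggest strong locality, long intervals suggest an eviction candidate. The sketch below records this per-block reuse recency; the dissertation's actual metrics and algorithms are more refined:

#include <stdint.h>

/* Per-block reuse recency sketch. Blocks re-referenced after a short
 * virtual-time interval have strong locality and should be protected;
 * blocks with long intervals are better eviction candidates. */

#define NBLOCKS 65536
static uint64_t last_ref[NBLOCKS];
static uint64_t now;

/* Returns the elapsed virtual time since the block's previous access. */
uint64_t reuse_interval(uint32_t block)
{
    uint64_t prev = last_ref[block];
    last_ref[block] = ++now;
    return prev ? now - prev : UINT64_MAX;   /* first touch: no locality yet */
}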
APA, Harvard, Vancouver, ISO, and other styles
44

Ananthanarayanan, R. (Rajagopal). "High performance distributed shared memory." Diss., Georgia Institute of Technology, 1997. http://hdl.handle.net/1853/8129.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Huffman, Michael John. "JDiet: Footprint Reduction for Memory-constrained Systems." DigitalCommons@CalPoly, 2009. https://digitalcommons.calpoly.edu/theses/108.

Full text
Abstract:
Main memory remains a scarce computing resource. Even though main memory is becoming more abundant, software applications are inexorably engineered to consume as much memory as is available. For example, expert systems, scientific computing, data mining, and embedded systems commonly suffer from the lack of main memory availability. This thesis introduces JDiet, an innovative memory management system for Java applications. The goal of JDiet is to provide the developer with a highly configurable framework to reduce the memory footprint of a memory-constrained system, enabling it to operate on much larger working sets. Inspired by buffer management techniques common in modern database management systems, JDiet frees main memory by evicting non-essential data to a disk-based store. A buffer retains a fixed amount of managed objects in main memory. As non-resident objects are accessed, they are swapped from the store to the buffer using an extensible replacement policy. While the Java virtual machine naïvely delegates virtual memory management to the operating system, JDiet empowers the system designer to select both the managed data and replacement policy. Guided by compile-time configuration, JDiet performs aspect-oriented bytecode engineering, requiring no explicit coupling to the source or compiled code. The results of an experimental evaluation of the effectiveness of JDiet are reported. A JDiet-enabled XML DOM parser is capable of parsing and processing over 200% larger input documents by sacrificing less than an order of magnitude in performance.
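The buffer-management scheme described above can be sketched as a fixed pool of in-memory frames backed by a disk store, with a pluggable victim-selection hook standing in for the replacement policy. The C fragment below uses invented names and is only a structural sketch, not JDiet's API (JDiet itself is implemented in Java):

/* Buffer-manager sketch: accessing a non-resident object swaps it in from
 * a disk store, evicting a victim chosen by an extensible policy hook. */

#define FRAMES 128

typedef struct { int id; char payload[496]; } obj_t;
static obj_t frame[FRAMES];
static int   resident[FRAMES];           /* object id per frame, -1 if free */

extern int  pick_victim(void);                    /* replacement policy hook */
extern void store_write(const obj_t *o);          /* hypothetical disk store */
extern void store_read(int id, obj_t *o);

void buf_init(void) { for (int f = 0; f < FRAMES; f++) resident[f] = -1; }

obj_t *pin(int id)
{
    for (int f = 0; f < FRAMES; f++)              /* already resident? */
        if (resident[f] == id) return &frame[f];

    int v = pick_victim();                        /* evict, then swap in */
    if (resident[v] >= 0) store_write(&frame[v]);
    store_read(id, &frame[v]);
    resident[v] = id;
    return &frame[v];
}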
APA, Harvard, Vancouver, ISO, and other styles
46

Yoon, Myungchul. "Development and analysis of weak memory consistency models to accelerate shared memory multiprocessor systems /." Digital version accessible at:, 1998. http://wwwlib.umi.com/cr/utexas/main.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Akritidis, Periklis. "Practical memory safety for C." Thesis, University of Cambridge, 2011. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.609600.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Ramaswamy, Subramanian. "Active management of Cache resources." Diss., Georgia Institute of Technology, 2008. http://hdl.handle.net/1853/24663.

Full text
Abstract:
This dissertation addresses two sets of challenges facing processor design as the industry enters the deep sub-micron region of semiconductor design. The first set of challenges relates to the memory bottleneck. As the focus shifts from scaling processor frequency to scaling the number of cores, performance growth demands increasing die area. Scaling the number of cores also places a concurrent area demand in the form of larger caches. While on-chip caches occupy 50-60% of area and consume 20-30% of energy expended on-chip, their performance and energy efficiencies are less than 15% and 1% respectively for a range of benchmarks! The second set of challenges is posed by transistor leakage and process variation (inter-die and intra-die) at future technology nodes. Leakage power is anticipated to increase exponentially and sharply lower defect-free yield with successive technology generations. For performance scaling to continue, cache efficiencies have to improve significantly. This thesis proposes and evaluates a broad family of such improvements.

This dissertation first contributes a model for cache efficiencies and finds them to be extremely low: performance efficiencies less than 15% and energy efficiencies on the order of 1%. Studying the sources of inefficiency leads to a framework for efficiency improvement based on two interrelated strategies. The approach for improving energy efficiency primarily relies on sizing the cache to match the application memory footprint during a program phase while powering down all remaining cache sets. Importantly, the sized cache is fully functional, with no references to inactive sets. Improving performance efficiency primarily relies on cache shaping, i.e., changing the placement function and thereby the manner in which memory shares the cache. Sizing and shaping are applied at different phases of the design cycle: i) post-manufacturing & offline, ii) at compile-time, and iii) at run-time. This thesis proposes and explores techniques at each phase, collectively realizing a repertoire of techniques for future memory system designers. The proposals combine HW-SW techniques and are demonstrated to provide substantive improvements with modest overheads.
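Sizing and shaping can be illustrated on a simple set-index computation: sizing restricts the index to a power-of-two number of active sets so the remaining sets can be powered down, and shaping perturbs the placement function by hashing in higher address bits. The C sketch below models both on a software cache model and is illustrative only, not the dissertation's hardware design:

#include <stdint.h>

/* Sizing: only the first active_sets sets (a power of two) are indexable,
 * so inactive sets can be powered down while the cache stays functional.
 * Shaping: the placement function XORs in hashed tag bits so addresses
 * spread differently over the active sets. */

static unsigned active_sets = 1024;   /* resized per program phase */

unsigned cache_index(uint64_t addr)
{
    uint64_t line = addr >> 6;                        /* 64-byte lines */
    unsigned idx  = (unsigned)(line & (active_sets - 1));
    unsigned hash = (unsigned)((line >> 10) & (active_sets - 1));
    return idx ^ hash;                                /* shaped placement */
}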
APA, Harvard, Vancouver, ISO, and other styles
49

Álvarez, Martí Lluc. "Transparent management of scratchpad memories in shared memory programming models." Doctoral thesis, Universitat Politècnica de Catalunya, 2015. http://hdl.handle.net/10803/333334.

Full text
Abstract:
Cache-coherent shared memory has traditionally been the favorite memory organization for chip multiprocessors thanks to its high programmability. In this organization the cache hierarchy is in charge of moving the data and keeping it coherent between all the caches, enabling the usage of shared memory programming models where the programmer does not need to carry out any data management operation. Unfortunately, performing all the data management operations in hardware causes severe problems, the primary concerns being the power consumption of the caches and the amount of coherence traffic in the interconnection network. A good solution is to introduce ScratchPad Memories (SPMs) alongside the cache hierarchy, forming a hybrid memory hierarchy. SPMs are more power-efficient than caches and do not generate coherence traffic, but they degrade programmability. In particular, SPMs require the programmer to partition the data, to program data transfers, and to keep coherence between different copies of the data. A promising solution to exploit the benefits of the SPMs without harming programmability is to allow programmers to use shared memory programming models and to automatically generate code that manages the SPMs. Unfortunately, current compilers and runtime systems encounter serious limitations in automatically generating code for hybrid memory hierarchies from shared memory programming models.

This thesis proposes to transparently manage the SPMs of hybrid memory hierarchies in shared memory programming models. In order to achieve this goal, this thesis proposes a combination of hardware and compiler techniques to manage the SPMs in fork-join programming models and a set of runtime system techniques to manage the SPMs in task programming models. The proposed techniques make it possible to program hybrid memory hierarchies with these two well-known and easy-to-use forms of shared memory programming models, capitalizing on the benefits of hybrid memory hierarchies in power consumption and network traffic without harming programmability.

The first contribution of this thesis is a hardware/software co-designed coherence protocol to transparently manage the SPMs of hybrid memory hierarchies in fork-join programming models. The solution allows the compiler to always generate code to manage the SPMs with tiling software caches, even in the presence of unknown memory aliasing hazards between memory references to the SPMs and to the cache hierarchy. On the software side, the compiler generates a special form of memory instruction for memory references with possible aliasing hazards. On the hardware side, the special memory instructions are diverted to the correct copy of the data using a set of directories that track what data is mapped to the SPMs.

The second contribution of this thesis is a set of runtime system techniques to manage the SPMs of hybrid memory hierarchies in task programming models. The proposed runtime system techniques exploit the characteristics of these programming models to map the data specified in the task dependences to the SPMs. Different policies are proposed to mitigate the communication costs of the data transfers, overlapping them with other execution phases such as the task scheduling phase or the execution of the previous task. The runtime system can also reduce the number of data transfers by using a task scheduler that exploits data locality in the SPMs. In addition, the proposed techniques are combined with mechanisms that reduce the impact of fine-grained tasks, such as hardware runtime systems or large SPM sizes.

The accomplishment of this thesis is that hybrid memory hierarchies can be programmed with fork-join and task programming models. Consequently, architectures with hybrid memory hierarchies can be exposed to the programmer as a shared memory multiprocessor, taking advantage of the benefits of the SPMs while maintaining the programming simplicity of shared memory programming models.
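The tiling software cache mentioned in the first contribution can be illustrated with a strip-mined loop that stages data through an SPM-resident buffer. In the C sketch below, memcpy stands in for DMA transfers and all names are assumptions, not the thesis's generated code:

#include <string.h>

/* Software-cache sketch for an SPM: copy a tile of a large shared array
 * into scratchpad storage, compute on it, and write it back. */

#define TILE 1024
static double spm_buf[TILE];                 /* stands in for the SPM */

void scale_array(double *global, long n, double k)
{
    for (long base = 0; base < n; base += TILE) {
        long len = (n - base < TILE) ? n - base : TILE;
        memcpy(spm_buf, global + base, len * sizeof(double)); /* DMA in  */
        for (long i = 0; i < len; i++)
            spm_buf[i] *= k;                                  /* compute */
        memcpy(global + base, spm_buf, len * sizeof(double)); /* DMA out */
    }
}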
APA, Harvard, Vancouver, ISO, and other styles
50

Bolignano, Pauline. "Formal models and verification of memory management in a hypervisor." Thesis, Rennes 1, 2017. http://www.theses.fr/2017REN1S026/document.

Full text
Abstract:
A hypervisor is a software layer which virtualizes hardware resources, allowing several guest operating systems to run simultaneously on the same machine. Since the hypervisor manages the access to resources, a bug can be critical for the guest OSes. In this thesis, we focus on the memory isolation properties of a type 1 hypervisor, which virtualizes memory using Shadow Page Tables. More precisely, we present a low-level and a high-level model of the hypervisor, and we formally prove that guest OSes cannot access or tamper with private data of other guests, unless they have the authorization to do so. We use the language and the proof assistant developed by Prove & Run. There are many optimizations in the low-level model, which make the data structures and algorithms complex. It is therefore difficult to reason on such a model. To circumvent this issue, we design an abstract model in which it is easier to reason. We prove properties on the abstract model, and we prove its correspondence with the low-level model, in such a way that properties proved on the abstract model also hold for the low-level model. The correspondence proof is valid only for low-level states which respect some properties. We prove that these properties are invariants of the low-level system. The proof can be divided into three parts: the proof of invariant preservation on the low level, the proof of correspondence between the abstract and low-level models, and the proof of the security properties on the abstract level.
APA, Harvard, Vancouver, ISO, and other styles