Dissertations / Theses on the topic 'Software distributed shared memory'

The top 50 dissertations and theses on the topic 'Software distributed shared memory' are listed below.

1

Radovic, Zoran. "Software Techniques for Distributed Shared Memory." Doctoral thesis, Uppsala University, Department of Information Technology, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-6058.

Abstract:

In large multiprocessors, the access to shared memory is often nonuniform, and may vary as much as ten times for some distributed shared-memory architectures (DSMs). This dissertation identifies another important nonuniform property of DSM systems: nonuniform communication architecture, NUCA. High-end hardware-coherent machines built from large nodes, or from chip multiprocessors, are typical NUCA systems, since they have a lower penalty for reading recently written data from a neighbor's cache than from a remote cache. This dissertation identifies node affinity as an important property for scalable general-purpose locks. Several software-based hierarchical lock implementations exploiting NUCAs are presented and evaluated. NUCA-aware locks are shown to be almost twice as efficient for contended critical sections compared to traditional lock implementations.
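To illustrate the hierarchical-lock idea this abstract describes, here is a minimal two-level lock sketch: threads first compete with neighbours on their own node, so at most one thread per node ever contends for the global lock. The node count and all names are hypothetical; this is not the dissertation's implementation.

```c
/* Sketch of a two-level (NUCA-aware) lock: threads first win a cheap
 * node-local lock, so at most one thread per node contends globally.
 * Node ids and the node count are invented for illustration. */
#include <pthread.h>

#define NODES 4

typedef struct {
    pthread_mutex_t global;          /* inter-node lock              */
    pthread_mutex_t local[NODES];    /* one intra-node lock per node */
} hier_lock_t;

void hier_init(hier_lock_t *l) {
    pthread_mutex_init(&l->global, NULL);
    for (int i = 0; i < NODES; i++)
        pthread_mutex_init(&l->local[i], NULL);
}

/* Acquire: win the node-local lock first, then the global one. */
void hier_acquire(hier_lock_t *l, int node) {
    pthread_mutex_lock(&l->local[node]);
    pthread_mutex_lock(&l->global);
}

/* Release in reverse order.  A full NUCA-aware (cohort) lock would
 * instead hand the global lock directly to a waiter on the same node,
 * exploiting the node affinity the abstract identifies. */
void hier_release(hier_lock_t *l, int node) {
    pthread_mutex_unlock(&l->global);
    pthread_mutex_unlock(&l->local[node]);
}
```

Under contention, the intra-node hand-off keeps the critical-section data in a neighbour's cache, which is the cheap case on a NUCA machine.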

The shared-memory “illusion” provided by some large DSM systems may be implemented using either hardware, software or a combination thereof. A software-based implementation can enable cheap cluster hardware to be used, but typically suffers from poor and unpredictable performance characteristics.

This dissertation advocates a new software-hardware trade-off design point based on a new combination of techniques. The two low-level techniques, fine-grain deterministic coherence and synchronous protocol execution, as well as profile-guided protocol flexibility, are evaluated in isolation as well as in a combined setting using all-software implementations. Finally, a minimum of hardware trap support is suggested to further improve the performance of coherence protocols across cluster nodes. It is shown that all these techniques combined could result in a fairly stable performance on par with hardware-based coherence.

2

Johnson, Kirk Lauritz. "High-performance all-software distributed shared memory." Thesis, Massachusetts Institute of Technology, 1996. http://hdl.handle.net/1721.1/37185.

Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1996.
Includes bibliographical references (p. 165-172).
by Kirk Lauritz Johnson.
Ph.D.
3

Cheung, Wang-leung Benny. "Large object space support for software distributed shared memory." E-thesis via HKUTO, 2005. http://sunzi.lib.hku.hk/hkuto/record/B31601741.

4

張宏亮 and Wang-leung Benny Cheung. "Migrating-home protocol for software distributed shared-memory system." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2000. http://hub.hku.hk/bib/B31222377.

5

Cheung, Wang-leung Benny, and 張宏亮. "Large object space support for software distributed shared memory." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2005. http://hub.hku.hk/bib/B31601741.

6

Norgren, Magnus. "Software Distributed Shared Memory Using the VIPS Coherence Protocol." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-256975.

Abstract:
A coherent global address space in a distributed system enables shared memory programming at a much larger scale than on a single multicore processor. The solution is to implement a software distributed shared memory (SWDSM) system, since hardware support at this scale is non-existent. However, traditional approaches to coherence in SWDSM systems (centralized via 'active' home-node directories) are inherently unfit for such a scenario. Instead, it is crucial to make decisions locally and avoid the long latency imposed by both the network and software message-handlers. This thesis investigates the performance of an SWDSM system with a novel and completely distributed coherence protocol that minimizes the long-latency communication common in coherence protocols. More specifically, we propose an approach suitable for data-race-free programs, based on self-invalidation and self-downgrade and inspired by the VIPS cache coherence protocol. This thesis exploits the distributed nature of self-invalidation and self-downgrade by using a passive data classification directory that requires no message-handlers, thereby incurring no extra latency when issuing coherence requests. The result is an SWDSM system called SVIPS, which maximizes local decision making and allows high parallel performance with minimal coherence traffic between nodes in a distributed system.
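The self-invalidation/self-downgrade idea the abstract describes can be sketched in a few lines: at a release, a node writes its dirty data back (self-downgrade); at an acquire, it discards its own cached copies (self-invalidation), so the next read refetches fresh data with no directory messages. The word-granularity "cache" and all names below are invented for illustration; SVIPS itself works at a coarser granularity inside a real runtime.

```c
/* Sketch of self-invalidation / self-downgrade coherence for a
 * data-race-free program.  The home "memory" and per-node cache are
 * invented for illustration, not taken from the thesis. */
#define WORDS 8

static int memory[WORDS];              /* home copy (shared)          */

typedef struct {
    int val[WORDS];
    int valid[WORDS];
    int dirty[WORDS];
} node_cache_t;

int cache_read(node_cache_t *c, int addr) {
    if (!c->valid[addr]) {             /* miss: fetch from home       */
        c->val[addr] = memory[addr];
        c->valid[addr] = 1;
    }
    return c->val[addr];
}

void cache_write(node_cache_t *c, int addr, int v) {
    c->val[addr] = v;
    c->valid[addr] = 1;
    c->dirty[addr] = 1;
}

/* Self-downgrade at release: write dirty words back to the home copy. */
void release_sync(node_cache_t *c) {
    for (int a = 0; a < WORDS; a++)
        if (c->dirty[a]) { memory[a] = c->val[a]; c->dirty[a] = 0; }
}

/* Self-invalidation at acquire: forget everything cached locally so
 * the next read refetches; no message-handlers are ever invoked. */
void acquire_sync(node_cache_t *c) {
    for (int a = 0; a < WORDS; a++)
        c->valid[a] = 0;
}
```

Because all coherence actions are taken by the node itself at synchronization points, no remote node ever has to service an invalidation request, which is the source of the latency savings the abstract claims.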
7

Cheung, Wang-leung Benny. "Migrating-home protocol for software distributed shared-memory system /." Hong Kong : University of Hong Kong, 2000. http://sunzi.lib.hku.hk/hkuto/record.jsp?B22030116.

8

Lo, Adley Kam Wing. "Tolerating latency in software distributed shared memory systems through multithreading." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp01/MQ34040.pdf.

9

Atukorala, G. S. "Porting a distributed operating system to a shared memory parallel computer." Thesis, University of Bath, 1990. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.256756.

10

Chan, Charles Quoc Cuong. "Tolerating latency in software distributed shared memory systems through non-binding prefetching." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp01/MQ34036.pdf.

11

Qin, Xiaohan. "On the use and performance of communication primitives in software controlled cache-coherent cluster architectures /." Thesis, Connect to this title online; UW restricted, 1997. http://hdl.handle.net/1773/6925.

12

Khalil, Mohamed Abdalla. "Integrative monitoring and control framework based on software distributed shared memory non-locking model." Thesis, Nottingham Trent University, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.403094.

13

Zeffer, Håkan. "Towards Low-Complexity Scalable Shared-Memory Architectures." Doctoral thesis, Uppsala University, Department of Information Technology, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-7135.

Abstract:

Plentiful research has addressed low-complexity software-based shared-memory systems since the idea was first introduced more than two decades ago. However, software-coherent systems have not been very successful in the commercial marketplace. We believe there are two main reasons for this: lack of performance and/or lack of binary compatibility.

This thesis studies multiple aspects of how to design future binary-compatible high-performance scalable shared-memory servers while keeping the hardware complexity at a minimum. It starts with a software-based distributed shared-memory system relying on no specific hardware support and gradually moves towards architectures with simple hardware support.

The evaluation is made in a modern chip-multiprocessor environment with both high-performance compute workloads and commercial applications. It shows that implementing the coherence-violation detection in hardware while solving the interchip coherence in software allows for high-performing binary-compatible systems with very low hardware complexity. Our second-generation hardware-software hybrid performs on par with, and often better than, traditional hardware-only designs.

Based on our results, we conclude that it is not only possible to design simple systems while maintaining performance and the binary-compatibility envelope, but often possible to get better performance than in traditional, more complex designs.

We also explore two new techniques for evaluating a new shared-memory design throughout this work: adjustable simulation fidelity and statistical multiprocessor cache modeling.

14

Govindaswamy, Kirthilakshmi. "An API for adaptive loop scheduling in shared address space architectures." Master's thesis, Mississippi State : Mississippi State University, 2003. http://sun.library.msstate.edu/ETD-db/theses/available/etd-07082003-122028/restricted/kirthi%5Fthesis.pdf.

15

Parastatidis, Savas. "Run-time support for parallel object-oriented computing : the NIP lazy task creation technique and the NIP object-based software distributed shared memory." Thesis, University of Newcastle Upon Tyne, 2000. http://hdl.handle.net/10443/1768.

Abstract:
Advances in hardware technologies combined with decreased costs have started a trend towards massively parallel architectures that utilise commodity components. It is thought unreasonable to expect software developers to manage the high degree of parallelism that is made available by these architectures. This thesis argues that a new programming model is essential for the development of parallel applications and presents a model which embraces the notions of object-orientation and implicit identification of parallelism. The new model allows software engineers to concentrate on development issues, using the object-oriented paradigm, whilst being freed from the burden of explicitly managing parallel activity. To support the programming model, the semantics of an execution model are defined and implemented as part of a run-time support system for object-oriented parallel applications. Details of the novel techniques from the run-time system, in the areas of lazy task creation and object-based, distributed shared memory, are presented. The tasklet construct for representing potentially parallel computation is introduced and further developed by this thesis. Three caching techniques that take advantage of memory access patterns exhibited in object-oriented applications are explored. Finally, the performance characteristics of the introduced run-time techniques are analysed through a number of benchmark applications.
16

Hussain, Shahid, and Hassan Shabbir. "Directory scalability in multi-agent based systems." Thesis, Blekinge Tekniska Högskola, Avdelningen för programvarusystem, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-3110.

Abstract:
Simulation is one approach to analysing and modelling complex real-world problems. Multi-agent based systems provide a platform to develop simulations based on the concept of agent-oriented programming. In multi-agent systems, the local interaction between agents contributes to the emergence of global phenomena observed in the results of simulation runs. In MABS systems, interaction is a common aspect of all agents performing their tasks. To interact with each other, agents require yellow-page services from the platform to search for other agents. As more and more agents perform searches on this yellow-page directory, performance decreases due to a central bottleneck. In this thesis, we have investigated multiple solutions to this problem. The most promising solution is to integrate distributed shared memory with the directory system. With our proposed solution, empirical analysis shows a statistically significant increase in the performance of the directory service. We expect this result to make a considerable contribution to the state of the art in multi-agent platforms.
17

Kinawi, Husam. "Optimistic distributed shared memory." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape7/PQDD_0012/NQ38454.pdf.

18

Auld, Philip Ragner. "Broadcast distributed shared memory." W&M ScholarWorks, 2001. https://scholarworks.wm.edu/etd/1539623374.

Abstract:
Distributed shared memory (DSM) provides the illusion of shared memory processing to programs running on physically distributed systems. Many of these systems are connected by a broadcast medium network such as Ethernet. In this thesis, we develop a weakly coherent model for DSM that takes advantage of hardware-level broadcast. We define the broadcast DSM model (BDSM) to provide fine-grained sharing of user-defined locations. Additionally, since extremely weak DSM models are difficult to program, BDSM provides effective synchronization operations that allow it to function as a stronger memory. We show speedup results for a test suite of parallel programs and compare them to MPI versions.

To overcome the potential for message loss using broadcast on an Ethernet segment, we have developed a reliable broadcast protocol called the Pipelined Broadcast Protocol (PBP). This protocol provides the illusion of a series of FIFO pipes among member processes, on top of Ethernet broadcast operations. We discuss two versions of the PBP protocol and their implementations. Comparisons to TCP show the predicted benefits of using broadcast. PBP also shows strong throughput results, nearing the maximum of our 10Base-T hardware.

By combining weak DSM and hardware broadcast, we developed a system that provides performance comparable to a common message-passing system, MPI. For our test programs with all-to-all communication patterns, we actually see better performance than MPI. We show that using broadcast to perform DSM updates can be a viable alternative to message passing for parallel and distributed computation on a single Ethernet segment.
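The FIFO-pipe guarantee a protocol like PBP layers over lossy Ethernet broadcast boils down to per-sender sequence numbers: deliver in order, buffer out-of-order frames, and flag gaps so a retransmission can be requested. The sketch below shows only that receiver-side logic; the window size and all names are hypothetical, not PBP's actual format.

```c
/* Receiver side of a sequence-numbered reliable broadcast: frames are
 * delivered strictly in order per sender; a gap means a lost frame
 * that must be retransmitted.  Structure and limits are invented. */
#define WINDOW 16

typedef struct {
    int next_seq;            /* next sequence number to deliver       */
    int buffered[WINDOW];    /* out-of-order frames, indexed by seq   */
    int present[WINDOW];
    int delivered;           /* count of in-order deliveries          */
} bcast_rx_t;

/* Returns 1 if the frame (plus any buffered successors) was delivered,
 * 0 if it was buffered out of order (a gap exists; request resend). */
int bcast_receive(bcast_rx_t *r, int seq) {
    if (seq < r->next_seq) return 1;          /* duplicate: ignore    */
    if (seq != r->next_seq) {                 /* gap: buffer frame    */
        r->buffered[seq % WINDOW] = seq;
        r->present[seq % WINDOW] = 1;
        return 0;
    }
    r->next_seq++; r->delivered++;            /* in-order delivery    */
    while (r->present[r->next_seq % WINDOW] &&
           r->buffered[r->next_seq % WINDOW] == r->next_seq) {
        r->present[r->next_seq % WINDOW] = 0; /* drain the buffer     */
        r->next_seq++; r->delivered++;
    }
    return 1;
}
```

A real implementation would also bound the window against the senders and carry payloads; the point here is only how FIFO order is recovered on top of an unreliable broadcast medium.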
19

Ananthanarayanan, R. (Rajagopal). "High performance distributed shared memory." Diss., Georgia Institute of Technology, 1997. http://hdl.handle.net/1853/8129.

20

Godfrey, Andrew. "Distributed shared memory for virtual environments." Master's thesis, University of Cape Town, 1997. http://hdl.handle.net/11427/9516.

Abstract:
Bibliography: leaves 71-77.
This work investigated making virtual environments easier to program by designing a suitable distributed shared memory system. To be usable, the system must keep latency to a minimum, as virtual environments are very sensitive to it. The resulting design is push-based and non-consistent. Another requirement is that the system should be scalable, over large distances and over large numbers of participants. The latter is hard to achieve with current network protocols, and a proposal was made for a more scalable multicast addressing system than is used in the Internet protocol. Two sample virtual environments were developed to test the ease of use of the system. This showed that the basic concept is sound, but that more support is needed. The next step should be to extend the language and add compiler support, which will enhance ease of use and allow numerous optimisations. This can be improved further by providing system-supported containers.
21

Carter, Nicholas P. (Nicholas Parks). "Processor mechanisms for software shared memory." Thesis, Massachusetts Institute of Technology, 1999. http://hdl.handle.net/1721.1/79969.

Abstract:
Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1999.
Includes bibliographical references (p. 169-171).
by Nicholas Parks Carter.
Ph.D.
22

Girard, Gabriel. "Views and consistencies in distributed shared memory." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape3/PQDD_0009/NQ59232.pdf.

23

Marurngsith, Worawan. "Simulation modelling of distributed-shared memory multiprocessors." Thesis, University of Edinburgh, 2006. http://hdl.handle.net/1842/870.

Abstract:
Distributed shared memory (DSM) systems have been recognised as a compelling platform for parallel computing due to their programming advantages and scalability. DSM systems allow applications to access data in a logically shared address space by abstracting away the distinction of physical memory location. As the location of data is transparent, the sources of overhead caused by accessing distant memories are difficult to analyse. This memory locality problem has been identified as crucial to DSM performance. Many researchers have investigated the problem using simulation as a tool for conducting experiments, resulting in the progressive evolution of DSM systems. Nevertheless, both the diversity of architectural configurations and the rapid advance of DSM implementations impose constraints on simulation model design in two respects: the limitation of the simulation framework on model extensibility, and the lack of verification applicability during a simulation run, which delays the verification process. This thesis studies simulation modelling techniques for memory locality analysis of various DSM systems implemented on top of a cluster of symmetric multiprocessors. The thesis presents a simulation technique to promote model extensibility and proposes a technique for verification applicability, called Specification-based Parameter Model Interaction (SPMI). The proposed techniques have been implemented in a new interpretation-driven simulator called DSiMCLUSTER on top of a discrete-event simulation (DES) engine known as HASE. Experiments have been conducted to determine which factors are most influential on the degree of locality and whether the stability of performance can be maximised. DSiMCLUSTER has been validated against a SunFire 15K server, matching cache-miss results to within an average of ±6%, with a worst case of less than 15% difference.
These results confirm that the techniques used in developing DSiMCLUSTER contribute to achieving both (a) a highly extensible simulation framework that can keep up with the ongoing innovation of DSM architectures, and (b) verification applicability, resulting in an efficient framework for memory-analysis experiments on DSM architectures.
24

Kotselidis, Christos-Efthymios. "Exploiting distributed software transactional memory." Thesis, University of Manchester, 2011. https://www.research.manchester.ac.uk/portal/en/theses/exploiting-distributed-software-transactional-memory(33765e72-93ce-4802-bdfe-c92b72f73890).html.

Abstract:
Over the past years, research and development in computer architecture has shifted from uni-processor systems to multi-core architectures. This transition has created new incentives in software development, because for software to scale it has to be highly parallel. Traditional synchronization primitives based on mutual-exclusion locking are challenging to use and therefore are only efficiently employed by a minority of expert programmers. Transactional Memory (TM) is an alternative parallel programming model aiming to alleviate the problems that arise from the use of explicit synchronization mechanisms. In TM, lock-guarded code is replaced by memory transactions which comply with the ACI (atomicity, consistency, isolation) principles. The simplicity of the programming model that TM proposes has led to major research efforts by academia and industry to produce high-performance TM implementations. The majority of these TM systems, however, focus on shared-memory Chip MultiProcessors (CMPs), leaving the area of distributed systems unexplored. This thesis explores Transactional Memory in the distributed systems domain, and more specifically on small-scale clusters. A variety of novel distributed transactional coherence protocols are proposed and evaluated, against complex TM-oriented benchmarks, in the context of distributed Java Virtual Machines (JVMs), an area that has received much attention over the last decade due to its applicability in the enterprise domain. The implemented Distributed Software Transactional Memory (DiSTM) system, proposed in this thesis, is a JVM clustering solution that employs software transactional memory as its synchronization mechanism. Due to its modular design and ease of programming, it allows the addition of new protocols in a fairly easy manner. Finally, DiSTM is highly portable, as it runs on top of off-the-shelf JVMs and requires no changes to existing Java source code.
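DiSTM itself is a Java/JVM system, but the core transactional-memory mechanism the abstract relies on, buffering writes and validating reads at commit time, can be illustrated with a minimal single-threaded sketch. All names and sizes below are invented; a real STM would also take a lock or use atomics during commit.

```c
/* Minimal sketch of software transactional memory: reads record a
 * per-word version, writes are buffered, and commit aborts if any
 * read word changed in the meantime.  Purely illustrative. */
#define TM_WORDS 8
#define TM_MAX   8

static int tm_mem[TM_WORDS];
static int tm_ver[TM_WORDS];           /* bumped on every commit      */

typedef struct {
    int raddr[TM_MAX], rver[TM_MAX], nreads;
    int waddr[TM_MAX], wval[TM_MAX], nwrites;
} txn_t;

int tm_read(txn_t *t, int a) {
    for (int i = 0; i < t->nwrites; i++)   /* read-your-own-write     */
        if (t->waddr[i] == a) return t->wval[i];
    t->raddr[t->nreads] = a;
    t->rver[t->nreads++] = tm_ver[a];
    return tm_mem[a];
}

void tm_write(txn_t *t, int a, int v) {
    t->waddr[t->nwrites] = a;
    t->wval[t->nwrites++] = v;
}

/* Commit: abort (return 0) if any read word changed since it was
 * read; otherwise publish the write set and bump versions. */
int tm_commit(txn_t *t) {
    for (int i = 0; i < t->nreads; i++)
        if (tm_ver[t->raddr[i]] != t->rver[i]) return 0;
    for (int i = 0; i < t->nwrites; i++) {
        tm_mem[t->waddr[i]] = t->wval[i];
        tm_ver[t->waddr[i]]++;
    }
    return 1;
}
```

An aborted transaction is simply retried from the start, which is what lets the programmer write lock-free-looking code while the runtime resolves conflicts.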
25

Zeffer, Håkan. "Hardware–Software Tradeoffs in Shared-Memory Implementations." Licentiate thesis, Uppsala universitet, Avdelningen för datorteknik, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-86369.

Abstract:
Shared-memory architectures represent a class of parallel computer systems commonly used in the commercial and technical market. While shared-memory servers typically come in a large variety of configurations and sizes, advances in semiconductor technology have set the trend towards multiple cores per die and multiple threads per core. Software-based distributed shared-memory proposals were given much attention in the 90s, but their promise of short time to market and low cost could not make up for their unstable performance; hence, these systems seldom made it to the market. However, with the trend towards chip multiprocessors, multiple hardware threads per core and the increased cost of connecting multiple chips together to form large-scale machines, software coherence in one form or another might be a good intra-chip coherence solution. This thesis shows that data locality, software flexibility and minimal processor support for read and write coherence traps can offer good performance, while removing the hard limit on scalability. Our aggressive fine-grained software-only distributed shared-memory system exploits key application properties, such as locality and sharing patterns, to outperform a hardware-only machine on some benchmarks. On average, the software system is 11 percent slower than the hardware system when run on identical node and interconnect hardware. A detailed full-system simulation study of dual-core CMPs, with multiple hardware threads per core and minimal processor support for coherence traps, shows a system on average one percent slower than its hardware-only counterpart when some flexibility is taken into account. Finally, a functional full-system simulation study of an adaptive coherence-batching scheme shows that the number of coherence misses can be reduced by up to 60 percent and bandwidth consumption by up to 22 percent for both commercial and scientific applications.
26

Ruppert, Eric. "The consensus power of shared-memory distributed systems." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape3/PQDD_0028/NQ49848.pdf.

27

Hsieh, Wilson Cheng-Yi. "Dynamic computation migration in distributed shared memory systems." Thesis, Massachusetts Institute of Technology, 1995. http://hdl.handle.net/1721.1/36635.

Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1995.
Vita.
Includes bibliographical references (p. 123-131).
by Wilson Cheng-Yi Hsieh.
Ph.D.
28

Rakamarić, Zvonimir. "Modular verification of shared-memory concurrent system software." Thesis, University of British Columbia, 2011. http://hdl.handle.net/2429/32572.

Abstract:
Software is large, complex, and error-prone. According to the US National Institute of Standards and Technology, software bugs cost the US economy an estimated $60 billion each year. The trend in hardware design of switching to multi-core architectures makes software development even more complex. Cutting software development costs and ensuring higher reliability of software is of global interest and a grand challenge. This is especially true of the system software that is the foundation beneath all general-purpose application programs. The verification of system software poses particular challenges: system software is typically written in a low-level programming language with dynamic memory allocation and pointer manipulation, and system software is also highly concurrent, with shared-memory communication being the main concurrent programming paradigm. Available verification tools usually perform poorly when dealing with the aforementioned challenges. This thesis addresses these problems by enabling precise and scalable verification of low-level, shared-memory, concurrent programs. The main contributions concern the interrelated concepts of memory, modularity, and concurrency. First, because programs use huge amounts of memory, memory is usually modeled very imprecisely in order to scale to big programs. This imprecise modeling renders most tools almost useless in the memory-intensive parts of code. This thesis describes a scalable, yet precise, memory model that offers on-demand precision only when necessary. Second, modularity is the key to scalability, but it often comes with a price: a user must manually provide module specifications, making the verification process more tedious. This thesis proposes a light-weight technique for automatically inferring an important family of specifications to make the verification process more automatic.
Third, the number of program behaviors explodes in the presence of concurrency, greatly increasing the complexity of the verification task. This explosion is especially severe with shared-memory concurrency. The thesis presents a static context-bounded analysis that combines a number of techniques to successfully solve this problem. We have implemented the above contributions in the verification tools developed as part of this thesis. We have applied the tools to real-life system software, and we are already finding critical, previously undiscovered bugs.
29

Costa, Prats Juan José. "Efficient openMP over sequentially consistent distributed shared memory systems." Doctoral thesis, Universitat Politècnica de Catalunya, 2011. http://hdl.handle.net/10803/81012.

Abstract:
Nowadays, clusters are one of the most used platforms in high-performance computing, and most programmers use the Message Passing Interface (MPI) library to program their applications on these distributed platforms to get maximum performance, although it is a complex task. On the other hand, OpenMP has been established as the de facto standard for programming applications on shared memory platforms, because it is easy to use and obtains good performance without too much effort. So, could it be possible to join both worlds? Could programmers use the ease of OpenMP on distributed platforms? Many researchers think so. One of the developed ideas is distributed shared memory (DSM), a software layer on top of a distributed platform that gives an abstract shared memory view to applications. Even though it seems a good solution, it also has some inconveniences: the memory coherence between the nodes in the platform is difficult to maintain (complex management, scalability issues, high overhead and others), and the latency of remote-memory accesses can be orders of magnitude greater than on a shared bus due to the interconnection network. This research therefore improves the performance of OpenMP applications executed on distributed memory platforms using a DSM with sequential consistency, thoroughly evaluating the results on the NAS parallel benchmarks. The vast majority of DSM designs use a relaxed consistency model because it avoids some major problems in the area. In contrast, we use a sequential consistency model because we think that exposing the potential problems that are otherwise hidden may allow solutions to be found and applied to both models. The main idea behind this work is that both runtimes, OpenMP and the DSM layer, should cooperate to achieve good performance; otherwise they interfere with each other, trashing the final performance of applications.
We develop three contributions to improve the performance of these applications: (a) a technique to avoid false sharing at runtime, (b) a technique to mimic MPI behaviour, where produced data is forwarded to its consumers, and (c) a mechanism to avoid network congestion due to DSM coherence messages. The NAS Parallel Benchmarks are used to test the contributions. The results of this work show that false sharing is a relative problem that depends on the application. Another result is that moving the data flow out of the critical path and using techniques that forward data as early as possible, similar to MPI, benefits final application performance. Additionally, this data movement is usually concentrated at single points and affects application performance due to the limited bandwidth of the network; it is therefore necessary to provide mechanisms that distribute this data over the computation time, using an otherwise idle network. Finally, the results show that the proposed contributions improve the performance of OpenMP applications in this kind of environment.
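False sharing, the first problem the abstract attacks, occurs when logically unrelated data from different threads lands on the same coherence unit (a cache line, or a whole page in a page-based DSM), so every write ping-pongs that unit between nodes. The classic static remedy, shown below purely as an illustration, is to pad each thread's data out to its own unit; the thesis's own technique works at runtime and is different. The 64-byte line size is an assumption.

```c
/* Padding each per-thread counter to a full coherence unit so writes
 * by different threads never falsely share one unit.  In a page-based
 * DSM the same idea applies with the page size instead of 64 bytes. */
#include <stddef.h>

#define LINE 64   /* assumed coherence-unit (cache line) size */

struct padded_counter {
    long value;
    char pad[LINE - sizeof(long)];   /* fill the rest of the line */
};

/* One counter per thread; adjacent entries now live on distinct
 * lines, so concurrent increments cause no coherence ping-pong.  */
struct padded_counter counters[4];
```

The trade-off is wasted memory per datum, which is why runtime detection, as pursued in this thesis, is attractive when sharing patterns are not known statically.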
30

Mohindra, Ajay. "Issues in the design of distributed shared memory systems." Diss., Georgia Institute of Technology, 1993. http://hdl.handle.net/1853/9123.

31

Gupta, Sandeep K. (Sandeep Kumar). "Protocol optimizations for the CRL distributed shared memory system." Thesis, Massachusetts Institute of Technology, 1996. http://hdl.handle.net/1721.1/41004.

32

Adler, Joseph (Joseph Adam). "Implementing distributed shared memory on an extensible operating system." Thesis, Massachusetts Institute of Technology, 1997. http://hdl.handle.net/1721.1/42805.

Abstract:
Thesis (S.B. and M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1997.
Includes bibliographical references (p. 86-91).
by Joseph Adler.
S.B.and M.Eng.
APA, Harvard, Vancouver, ISO, and other styles
33

Holsapple, Stephen Alan. "DSM64: A DISTRIBUTED SHARED MEMORY SYSTEM IN USER-SPACE." DigitalCommons@CalPoly, 2012. https://digitalcommons.calpoly.edu/theses/725.

Full text
Abstract:
This paper presents DSM64: a lazy release consistent software distributed shared memory (SDSM) system built entirely in user-space. The DSM64 system is capable of executing threaded applications implemented with pthreads on a cluster of networked machines without any modifications to the target application. The DSM64 system features a centralized memory manager [1] built atop Hoard [2, 3]: a fast, scalable, and memory-efficient allocator for shared-memory multiprocessors. I present an SDSM system written in C++ for Linux operating systems, and discuss a straightforward approach to implementing SDSM systems in a Linux environment using system-provided tools and concepts available entirely in user-space. I show that the SDSM system presented in this paper is capable of resolving page faults over a local area network in as little as 2 milliseconds. In my analysis, I present the following. I compare the performance characteristics of a matrix multiplication benchmark using various memory coherency models, and demonstrate that the benchmark using an LRC model performs orders of magnitude quicker than the same application using a stricter coherency model. I show the effect of the coherency model on memory access patterns and memory contention, and compare the effects of different locking strategies on execution speed and memory access patterns. Lastly, I provide a comparison of the DSM64 system to a non-networked version using a system-provided allocator.
APA, Harvard, Vancouver, ISO, and other styles
34

Akay, Mehmet Fatih Katsinis Constantine. "Contention resolution and memory load balancing algorithms on distributed shared memory multiprocessors /." Philadelphia, Pa. : Drexel University, 2005. http://dspace.library.drexel.edu/handle/1860/510.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Silcock, Jackie, and mikewood@deakin edu au. "Programmer friendly and efficient distributed shared memory integrated into a distributed operating system." Deakin University. School of Computing and Mathematics, 1998. http://tux.lib.deakin.edu.au./adt-VDU/public/adt-VDU20051114.110240.

Full text
Abstract:
Distributed Shared Memory (DSM) provides programmers with a shared memory environment in systems where memory is not physically shared. Clusters of Workstations (COWs), an often untapped source of computing power, are characterised by a very low cost/performance ratio. The combination of Clusters of Workstations (COWs) with DSM provides an environment in which the programmer can use the well known approaches and methods of programming for physically shared memory systems and parallel processing can be carried out to make full use of the computing power and cost advantages of the COW. The aim of this research is to synthesise and develop a distributed shared memory system as an integral part of an operating system in order to provide application programmers with a convenient environment in which the development and execution of parallel applications can be done easily and efficiently, and which does this in a transparent manner. Furthermore, in order to satisfy our challenging design requirements we want to demonstrate that the operating system into which the DSM system is integrated should be a distributed operating system. In this thesis a study into the synthesis of a DSM system within a microkernel and client-server based distributed operating system which uses both strict and weak consistency models, with a write-invalidate and write-update based approach for consistency maintenance is reported. Furthermore a unique automatic initialisation system which allows the programmer to start the parallel execution of a group of processes with a single library call is reported. The number and location of these processes are determined by the operating system based on system load information. The DSM system proposed has a novel approach in that it provides programmers with a complete programming environment in which they are easily able to develop and run their code or indeed run existing shared memory code. 
A set of demanding DSM system design requirements is presented, and the incentives for placing the DSM system within a distributed operating system, in particular in the memory management server, are reported. The new DSM system centres on an event-driven set of cooperating and distributed entities, and a detailed description of the events, and of the reactions to these events that make up the operation of the DSM system, is then presented. This is followed by a pseudocode form of the detailed design of the main modules and activities of the primitives used in the proposed DSM system. Quantitative results of performance tests and qualitative results showing the ease of programming and use of the RHODOS DSM system are reported. A study of five different applications is given, together with the results of tests carried out on these applications and a discussion of those results. A discussion of how RHODOS’ DSM allows programmers to write shared memory code in an easy to use and familiar environment, and a comparative evaluation of RHODOS DSM against other DSM systems, is presented. In particular, the ease of use and transparency of the DSM system have been demonstrated by the ease with which a moderately inexperienced undergraduate programmer was able to convert, write and run applications for the testing of the DSM system. Furthermore, the tests performed using physically shared memory show that the latter is indistinguishable from distributed shared memory; this is further evidence that the DSM system is fully transparent. This study clearly demonstrates that the aim of the research has been achieved: it is possible to develop a programmer-friendly and efficient DSM system fully integrated within a distributed operating system.
It is clear from this research that client-server and microkernel based distributed operating system integrated DSM makes shared memory operations transparent and almost completely removes the involvement of the programmer beyond classical activities needed to deal with shared memory. The conclusion can be drawn that DSM, when implemented within a client-server and microkernel based distributed operating system, is one of the most encouraging approaches to parallel processing since it guarantees performance improvements with minimal programmer involvement.
APA, Harvard, Vancouver, ISO, and other styles
36

Argile, Andrew Duncan Stuart. "Distributed processing in decision support systems." Thesis, Nottingham Trent University, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.259647.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Silva, João André Almeida e. "Partial replication in distributed software transactional memory." Master's thesis, Faculdade de Ciências e Tecnologia, 2013. http://hdl.handle.net/10362/10769.

Full text
Abstract:
Dissertação para obtenção do Grau de Mestre em Engenharia Informática
Distributed software transactional memory (DSTM) is emerging as an interesting alternative for distributed concurrency control. Usually, DSTM systems resort to data distribution and full replication techniques in order to provide scalability and fault tolerance. Nevertheless, distribution does not provide support for fault tolerance, and full replication limits the system’s total storage capacity. In this context, partial data replication arises as an intermediate solution that combines the best of the previous two while trying to mitigate their disadvantages. This strategy has been explored by the distributed databases research field, but has been little addressed in the context of transactional memory and, to the best of our knowledge, it has never before been incorporated into a DSTM system for a general-purpose programming language. Thus, we defend the claim that it is possible to combine both full and partial data replication in such systems. Accordingly, we developed a prototype of a DSTM system combining full and partial data replication for Java programs. We built on an existing DSTM framework and extended it with support for partial data replication. With the proposed framework, we implemented a partially replicated DSTM. We evaluated the proposed system using known benchmarks, and the evaluation showcases the existence of scenarios where partial data replication can be advantageous, e.g., in scenarios with small amounts of transactions modifying fully replicated data. The results of this thesis show that we were able to sustain our claim by implementing a prototype that effectively combines full and partial data replication in a DSTM system. The modularity of the presented framework allows the easy implementation of its various components, and it provides a non-intrusive interface to applications.
Fundação para a Ciência e Tecnologia - (FCT/MCTES) in the scope of the research project PTDC/EIA-EIA/113613/2009 (Synergy-VM)
APA, Harvard, Vancouver, ISO, and other styles
38

Chaiken, David Lars. "Mechanisms and interfaces for software-extended coherent shared memory." Thesis, Massachusetts Institute of Technology, 1994. http://hdl.handle.net/1721.1/34090.

Full text
Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1994.
Includes bibliographical references (p. 140-146).
by David L. Chaiken.
Ph.D.
APA, Harvard, Vancouver, ISO, and other styles
39

Farook, Mohammad. "Fast lock-free linked lists in distributed shared memory systems." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp01/MQ32107.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Gull, Aarron. "Cherub : a hardware distributed single shared address space memory architecture." Thesis, City University London, 1993. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.356981.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Dai, Donglai. "Designing efficient communication subsystems for distributed shared memory (DSM) systems /." The Ohio State University, 1999. http://rave.ohiolink.edu/etdc/view?acc_num=osu1488186329503887.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

Ramesh, Bharath. "Samhita: Virtual Shared Memory for Non-Cache-Coherent Systems." Diss., Virginia Tech, 2013. http://hdl.handle.net/10919/23687.

Full text
Abstract:
Among the key challenges of computing today are the emergence of many-core architectures and the resulting need to effectively exploit explicit parallelism. Indeed, programmers are striving to exploit parallelism across virtually all platforms and application domains. The shared memory programming model effectively addresses the parallelism needs of mainstream computing (e.g., portable devices, laptops, desktop, servers), giving rise to a growing ecosystem of shared memory parallel techniques, tools, and design practices. However, to meet the extreme demands for processing and memory of critical problem domains, including scientific computation and data intensive computing, computing researchers continue to innovate in the high-end distributed memory architecture space to create cost-effective and scalable solutions. The emerging distributed memory architectures are both highly parallel and increasingly heterogeneous. As a result, they do not present the programmer with a cache-coherent view of shared memory, either across the entire system or even at the level of an individual node. Furthermore, it remains an open research question which programming model is best for the heterogeneous platforms that feature multiple traditional processors along with accelerators or co-processors. Hence, we have two contradicting trends. On the one hand, programming convenience and the presence of shared memory call for a shared memory programming model across the entire heterogeneous system. On the other hand, increasingly parallel and heterogeneous nodes lacking cache-coherent shared memory call for a message passing model. In this dissertation, we present the architecture of Samhita, a distributed shared memory (DSM) system that addresses the challenge of providing shared memory for non-cache-coherent systems. We define regional consistency (RegC), the memory consistency model implemented by Samhita.
We present performance results for Samhita on several computational kernels and benchmarks, on both cluster supercomputers and heterogeneous systems. The results demonstrate the promising potential of Samhita and the RegC model, and include the largest scale evaluation by a significant margin for any DSM system reported to date.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
43

Upadhayaya, Niraj. "Memory management and optimization using distributed shared memory systems for high performance computing clusters." Thesis, University of the West of England, Bristol, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.421743.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Shao, Cheng. "Multi-writer consistency conditions for shared memory objects." Texas A&M University, 2007. http://hdl.handle.net/1969.1/85806.

Full text
Abstract:
Regularity is a shared memory consistency condition that has received considerable attention, notably in connection with quorum-based shared memory. Lamport's original definition of regularity assumed a single-writer model, however, and is not well defined when each shared variable may have multiple writers. In this thesis, we address this need by formally extending the notion of regularity to a multi-writer model. We have shown that the extension is not trivial. While there exist various ways to extend the single-writer definition, the resulting definitions will have different strengths. Specifically, we give several possible definitions of regularity in the presence of multiple writers. We then present a quorum-based algorithm to implement each of the proposed definitions and prove them correct. We study the relationships between these definitions and a number of other well-known consistency conditions, and give a partial order describing the relative strengths of these consistency conditions. Finally, we provide a practical context for our results by studying the correctness of two well-known algorithms for mutual exclusion under each of our proposed consistency conditions.
APA, Harvard, Vancouver, ISO, and other styles
45

Li, Zongpeng. "Non-blocking implementations of Queues in asynchronous distributed shared-memory systems." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2001. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp05/MQ62967.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Kalvaitis, Timothy Elmer. "Distributed shared memory for real time hardware in the loop simulation." Thesis, Massachusetts Institute of Technology, 1994. http://hdl.handle.net/1721.1/35972.

Full text
Abstract:
Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1994.
Includes bibliographical references (p. 95-96).
by Timothy Elmer Kalvaitis.
M.S.
APA, Harvard, Vancouver, ISO, and other styles
47

Zhang, Kai. "Compiling for software distributed-shared memory systems." Thesis, 2000. http://hdl.handle.net/1911/17392.

Full text
Abstract:
In this thesis, we explore the use of software distributed shared memory (SDSM) as a target communication layer for parallelizing compilers. We explore how to effectively exploit compiler-derived knowledge of sharing and communication patterns for regular access patterns to improve their performance on SDSM systems. We introduce two novel optimization techniques: compiler-restricted consistency which reduces the cost of false sharing, and compiler-managed communication buffers which, when used together with compiler-restricted consistency, reduce the cost of fragmentation. We focus on regular applications with wavefront computation and tightly-coupled sharing due to carried data dependence. Along with other types of compiler-assisted SDSM optimizations such as compiler-controlled eager update, our integrated compiler and run-time support provides speedups for wavefront computations on SDSM that rival those achieved previously only for loosely synchronous style applications. (Abstract shortened by UMI.)
APA, Harvard, Vancouver, ISO, and other styles
48

Rajamani, Karthick. "Automatic data aggregation for software distributed shared memory systems." Thesis, 1997. http://hdl.handle.net/1911/17126.

Full text
Abstract:
Software Distributed Shared Memory (DSM) provides a shared-memory abstraction on distributed memory hardware, making a parallel programmer's task easier. Unfortunately, software DSM is less efficient than the direct use of the underlying message-passing hardware. The chief reason for this is that hand-coded and compiler-generated message-passing programs typically achieve better data aggregation in their messages than programs using software DSM. Software DSM has poorer data aggregation because the system lacks the knowledge of the application's behavior that a programmer or compiler analysis can provide. We propose four new techniques to perform automatic data aggregation in software DSM. Our techniques use run-time analysis of past data-fetch accesses made by a processor, to aggregate data movement for future accesses. They do not need any additional compiler support. We implemented our techniques in the TreadMarks software DSM system. We used a test suite of four applications--3D-FFT, Barnes-Hut, Ilink and Shallow. For these applications we obtained 40% to 66% reduction in message counts which resulted in 6% to 19% improvement in execution times.
APA, Harvard, Vancouver, ISO, and other styles
49

Wang, Hsiao-Hsi, and 王孝熙. "On the Coherence Problems for Software Distributed Shared Memory." Thesis, 1993. http://ndltd.ncl.edu.tw/handle/05733960280342675094.

Full text
Abstract:
Doctorate
National Chiao Tung University
Institute of Computer Science and Information Engineering
82
Software distributed shared memory (DSM) provides a convenient and effective solution for programming parallel applications on distributed systems. However, the performance of current implementations suffers from the large overhead of enforcing memory coherence. In this thesis, we propose two coherence schemes to solve memory coherence problems. The first scheme is based on the observation that the performance of distributed shared memory depends on the memory coherence algorithms and the access characteristics of the shared data. Therefore, we propose an efficient directory-based coherence scheme using multiple coherence algorithms with a self-adjusting feature. This method can dynamically choose a suitable coherence algorithm for each shared variable class. The second scheme is based on the observation that there exists a false-sharing problem that causes an unnecessarily large number of coherence faults. Various memory consistency models have been proposed in order to eliminate the effects of network traffic and memory latency. We present a hybrid approach that combines relaxed memory consistency models with a compiler strategy to solve the memory coherence and false-sharing problems for DSM. Experimental results show that this hybrid approach produces fewer coherence faults and is effective in reducing the memory coherence overhead of DSM.
APA, Harvard, Vancouver, ISO, and other styles
50

Wong, H'sien Jin. "Integrating software distributed shared memory and message passing programming." Phd thesis, 2010. http://hdl.handle.net/1885/151533.

Full text
Abstract:
Software Distributed Shared Memory (SDSM) systems provide programmers with a shared memory programming environment across distributed memory architectures. In contrast to the message passing programming environment, an SDSM can resolve data dependencies within the application without the programmer having to explicitly specify the required communications. Such ease-of-use is, however, provided at a cost to performance. This thesis considers how the SDSM programming model can be combined with the message passing programming model with the goal of avoiding these additional costs when a message passing solution is straightforward. To pursue the above goal a new SDSM library, named Danui, has been developed. The SDSM manages a shared address space in units of pages. Consistency for these pages is maintained using a variety of home-based protocols. In total, the SDSM includes seven different barrier implementations, five of which support home-migration. Danui is designed with portability and modularity in mind. It is written using the standard Message Passing Interface (MPI) to perform all underlying SDSM related communications. MPI was used both to permit the subsequent exploration of user-level MPI/SDSM programming, and also to enable the SDSM library to exploit optimised MPI implementations on a variety of cluster interconnects. A detailed analysis of the costs associated with the various SDSM operations is given. This analysis characterizes each SDSM operation based upon a number of factors. For example, the size of an SDSM page, the number of modifications made to shared memory, the choice of barrier used etc. This information is then used as a guide to determine which parts of an SDSM program would benefit most if replaced by direct calls to MPI. The integration of the shared memory and message passing (SMMP) programming models within Danui is discussed in detail. 
It is shown that a naïve integration of the SDSM and MPI programming models leads to memory consistency problems when MPI transfers occur to or from memory that is part of the software-managed shared address space. These problems can, however, be overcome by noting the semantics of the shared memory and message passing paradigms, and introducing a few extensions to the MPI interface.
APA, Harvard, Vancouver, ISO, and other styles
We offer discounts on all premium plans for authors whose works are included in thematic literature selections. Contact us to get a unique promo code!

To the bibliography