Dissertations / Theses on the topic 'Distributed shared memory'


Create an accurate reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Distributed shared memory.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Kinawi, Husam. "Optimistic distributed shared memory." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape7/PQDD_0012/NQ38454.pdf.

2

Auld, Philip Ragner. "Broadcast distributed shared memory." W&M ScholarWorks, 2001. https://scholarworks.wm.edu/etd/1539623374.

Abstract:
Distributed shared memory (DSM) provides the illusion of shared memory processing to programs running on physically distributed systems. Many of these systems are connected by a broadcast medium network such as Ethernet. In this thesis, we develop a weakly coherent model for DSM that takes advantage of hardware-level broadcast. We define the broadcast DSM model (BDSM) to provide fine-grained sharing of user-defined locations. Additionally, since extremely weak DSM models are difficult to program, BDSM provides effective synchronization operations that allow it to function as a stronger memory. We show speedup results for a test suite of parallel programs and compare them to MPI versions.

To overcome the potential for message loss when using broadcast on an Ethernet segment, we have developed a reliable broadcast protocol, called the Pipelined Broadcast Protocol (PBP). This protocol provides the illusion of a series of FIFO pipes among member processes, on top of Ethernet broadcast operations. We discuss two versions of the PBP protocol and their implementations. Comparisons to TCP show the predicted benefits of using broadcast. PBP also shows strong throughput results, nearing the maximum of our 10Base-T hardware.

By combining weak DSM and hardware broadcast, we developed a system that provides performance comparable to a common message-passing system, MPI. For our test programs that have all-to-all communication patterns, we actually see better performance than MPI. We show that using broadcast to perform DSM updates can be a viable alternative to message passing for parallel and distributed computation on a single Ethernet segment.
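The FIFO-pipe illusion PBP provides over lossy Ethernet broadcast rests on a standard building block: per-sender sequence numbers, a receiver-side reordering buffer, and retransmission requests for gaps. A minimal sketch of the receiver side (class and method names are illustrative, not PBP's actual interface):

```python
class FifoReceiver:
    """Reordering buffer for one sender: delivers broadcast packets in
    sequence-number order and drops duplicates, so a lossy, reordering
    broadcast medium looks like a reliable FIFO pipe."""
    def __init__(self):
        self.next_seq = 0      # next sequence number to deliver
        self.buffer = {}       # out-of-order packets held back
        self.delivered = []    # in-order stream handed to the application

    def on_packet(self, seq, payload):
        if seq < self.next_seq or seq in self.buffer:
            return             # duplicate (e.g., a retransmission we already have)
        self.buffer[seq] = payload
        while self.next_seq in self.buffer:   # deliver any contiguous run
            self.delivered.append(self.buffer.pop(self.next_seq))
            self.next_seq += 1

    def missing(self):
        """Sequence numbers to request again (gaps below the highest buffered packet)."""
        if not self.buffer:
            return []
        return [s for s in range(self.next_seq, max(self.buffer))
                if s not in self.buffer]
```

A real protocol would also bound the buffer and acknowledge delivery so the sender can reclaim retransmission state.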
3

Ananthanarayanan, R. (Rajagopal). "High performance distributed shared memory." Diss., Georgia Institute of Technology, 1997. http://hdl.handle.net/1853/8129.

4

Radovic, Zoran. "Software Techniques for Distributed Shared Memory." Doctoral thesis, Uppsala University, Department of Information Technology, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-6058.

Abstract:

In large multiprocessors, the access to shared memory is often nonuniform and may vary by as much as a factor of ten in some distributed shared-memory (DSM) architectures. This dissertation identifies another important nonuniform property of DSM systems: nonuniform communication architecture, NUCA. High-end hardware-coherent machines built from large nodes, or from chip multiprocessors, are typical NUCA systems, since they have a lower penalty for reading recently written data from a neighbor's cache than from a remote cache. This dissertation identifies node affinity as an important property for scalable general-purpose locks. Several software-based hierarchical lock implementations exploiting NUCAs are presented and evaluated. NUCA-aware locks are shown to be almost twice as efficient for contended critical sections as traditional lock implementations.
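The hierarchical-lock idea can be illustrated with a two-level sketch in which threads contend on a per-node lock before touching the global one. This is a simplified stand-in, not the dissertation's actual lock algorithms, which additionally hand the lock preferentially to same-node waiters:

```python
import threading

class HierarchicalLock:
    """Two-level lock sketch: threads first contend on their node's local
    lock, so at most one thread per node ever touches the global lock.
    This keeps most contention node-local, the basic NUCA-affinity idea."""
    def __init__(self, num_nodes):
        self.global_lock = threading.Lock()
        self.node_locks = [threading.Lock() for _ in range(num_nodes)]

    def acquire(self, node_id):
        self.node_locks[node_id].acquire()  # cheap, node-local contention
        self.global_lock.acquire()          # only one contender per node

    def release(self, node_id):
        self.global_lock.release()
        self.node_locks[node_id].release()
```

Filtering contention through the per-node lock reduces cross-node lock traffic; a production lock would add backoff and fair same-node handoff.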

The shared-memory “illusion” provided by some large DSM systems may be implemented in hardware, in software, or in a combination of the two. A software-based implementation can enable cheap cluster hardware to be used, but typically suffers from poor and unpredictable performance characteristics.

This dissertation advocates a new software-hardware trade-off design point based on a new combination of techniques. The two low-level techniques, fine-grain deterministic coherence and synchronous protocol execution, as well as profile-guided protocol flexibility, are evaluated both in isolation and in a combined setting using all-software implementations. Finally, a minimum of hardware trap support is suggested to further improve the performance of coherence protocols across cluster nodes. It is shown that all these techniques combined could result in fairly stable performance on par with hardware-based coherence.

5

Godfrey, Andrew. "Distributed shared memory for virtual environments." Master's thesis, University of Cape Town, 1997. http://hdl.handle.net/11427/9516.

Abstract:
Bibliography: leaves 71-77.
This work investigated making virtual environments easier to program by designing a suitable distributed shared memory system. To be usable, the system must keep latency to a minimum, as virtual environments are very sensitive to it. The resulting design is push-based and non-consistent. Another requirement is that the system should be scalable, both over large distances and over large numbers of participants. The latter is hard to achieve with current network protocols, and a proposal was made for a multicast addressing system more scalable than the one used in the Internet Protocol. Two sample virtual environments were developed to test the ease of use of the system. This showed that the basic concept is sound, but that more support is needed. The next step should be to extend the language and add compiler support, which will enhance ease of use and allow numerous optimisations. This can be improved further by providing system-supported containers.
6

Girard, Gabriel. "Views and consistencies in distributed shared memory." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape3/PQDD_0009/NQ59232.pdf.

7

Marurngsith, Worawan. "Simulation modelling of distributed-shared memory multiprocessors." Thesis, University of Edinburgh, 2006. http://hdl.handle.net/1842/870.

Abstract:
Distributed shared memory (DSM) systems have been recognised as a compelling platform for parallel computing due to their programming advantages and scalability. DSM systems allow applications to access data in a logically shared address space by abstracting away the distinction between physical memory locations. Because the location of data is transparent, the sources of overhead caused by accessing distant memories are difficult to analyse. This memory locality problem has been identified as crucial to DSM performance. Many researchers have investigated the problem using simulation as a tool for conducting experiments, resulting in the progressive evolution of DSM systems. Nevertheless, both the diversity of architectural configurations and the rapid advance of DSM implementations impose constraints on simulation model design in two respects: the limitations a simulation framework places on model extensibility, and the lack of verification applicability during a simulation run, which delays the verification process. This thesis studies simulation modelling techniques for memory locality analysis of various DSM systems implemented on top of a cluster of symmetric multiprocessors. The thesis presents a simulation technique to promote model extensibility and proposes a technique for verification applicability, called Specification-based Parameter Model Interaction (SPMI). The proposed techniques have been implemented in a new interpretation-driven simulation called DSiMCLUSTER, built on top of a discrete-event simulation (DES) engine known as HASE. Experiments have been conducted to determine which factors are most influential on the degree of locality and to determine whether the stability of performance can be maximised. DSiMCLUSTER has been validated against a SunFire 15K server and achieves similar cache-miss results, within ±6% on average and with a worst case of less than 15% difference.
These results confirm that the techniques used in developing DSiMCLUSTER contribute towards achieving both (a) a highly extensible simulation framework that can keep up with the ongoing innovation of DSM architectures, and (b) verification applicability, resulting in an efficient framework for memory-analysis experiments on DSM architectures.
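A discrete-event simulation engine of the kind DSiMCLUSTER builds on (HASE) follows the standard DES pattern: a time-ordered event queue driving callbacks, with the clock jumping from event to event. A minimal, generic sketch of that pattern, with illustrative names and an assumed 5-unit remote-fetch latency, not HASE's actual API:

```python
import heapq

class Simulator:
    """Minimal discrete-event simulation loop: events are (time, action)
    pairs kept in a priority queue; the clock jumps from event to event."""
    def __init__(self):
        self.now = 0.0
        self._queue = []
        self._seq = 0          # tie-breaker so equal-time events stay FIFO

    def schedule(self, delay, action):
        heapq.heappush(self._queue, (self.now + delay, self._seq, action))
        self._seq += 1

    def run(self):
        while self._queue:
            self.now, _, action = heapq.heappop(self._queue)
            action()

# Example: model a memory access that misses locally and fetches remotely.
sim = Simulator()
log = []
def remote_fetch_done():
    log.append(('fetch_done', sim.now))
def local_miss():
    log.append(('miss', sim.now))
    sim.schedule(5.0, remote_fetch_done)   # assumed 5-unit remote latency
sim.schedule(1.0, local_miss)
sim.run()
```

Component models (caches, interconnect, memory) then become objects whose methods schedule further events, which is how locality effects can be measured per event.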
8

Johnson, Kirk Lauritz. "High-performance all-software distributed shared memory." Thesis, Massachusetts Institute of Technology, 1996. http://hdl.handle.net/1721.1/37185.

Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1996.
Includes bibliographical references (p. 165-172).
by Kirk Lauritz Johnson.
Ph.D.
9

Ruppert, Eric. "The consensus power of shared-memory distributed systems." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape3/PQDD_0028/NQ49848.pdf.

10

Hsieh, Wilson Cheng-Yi. "Dynamic computation migration in distributed shared memory systems." Thesis, Massachusetts Institute of Technology, 1995. http://hdl.handle.net/1721.1/36635.

Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1995.
Vita.
Includes bibliographical references (p. 123-131).
by Wilson Cheng-Yi Hsieh.
Ph.D.
11

Cheung, Wang-leung Benny. "Large object space support for software distributed shared memory." Click to view the E-thesis via HKUTO, 2005. http://sunzi.lib.hku.hk/hkuto/record/B31601741.

12

張宏亮 and Wang-leung Benny Cheung. "Migrating-home protocol for software distributed shared-memory system." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2000. http://hub.hku.hk/bib/B31222377.

13

Cheung, Wang-leung Benny, and 張宏亮. "Large object space support for software distributed shared memory." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2005. http://hub.hku.hk/bib/B31601741.

14

Holsapple, Stephen Alan. "DSM64: A DISTRIBUTED SHARED MEMORY SYSTEM IN USER-SPACE." DigitalCommons@CalPoly, 2012. https://digitalcommons.calpoly.edu/theses/725.

Abstract:
This paper presents DSM64: a lazy release consistent software distributed shared memory (SDSM) system built entirely in user-space. The DSM64 system is capable of executing threaded applications implemented with pthreads on a cluster of networked machines without any modifications to the target application. The DSM64 system features a centralized memory manager [1] built atop Hoard [2, 3]: a fast, scalable, and memory-efficient allocator for shared-memory multiprocessors. I present an SDSM system written in C++ for Linux operating systems, and discuss a straightforward approach to implementing SDSM systems in a Linux environment using system-provided tools and concepts available entirely in user-space. I show that the SDSM system presented in this paper is capable of resolving page faults over a local area network in as little as 2 milliseconds. In my analysis, I compare the performance characteristics of a matrix multiplication benchmark under various memory coherency models, and demonstrate that the benchmark performs orders of magnitude faster under an LRC model than under a stricter coherency model. I show the effect of the coherency model on memory access patterns and memory contention, compare the effects of different locking strategies on execution speed and memory access patterns, and, lastly, provide a comparison of the DSM64 system to a non-networked version using a system-provided allocator.
15

Costa, Prats Juan José. "Efficient openMP over sequentially consistent distributed shared memory systems." Doctoral thesis, Universitat Politècnica de Catalunya, 2011. http://hdl.handle.net/10803/81012.

Abstract:
Nowadays, clusters are one of the most widely used platforms in High Performance Computing, and most programmers use the Message Passing Interface (MPI) library to program applications for these distributed platforms and obtain their maximum performance, although it is a complex task. On the other hand, OpenMP has become the de facto standard for programming applications on shared memory platforms, because it is easy to use and obtains good performance without too much effort. So, could both worlds be joined? Could programmers use the ease of OpenMP on distributed platforms? Many researchers think so, and one of the ideas developed to this end is distributed shared memory (DSM): a software layer on top of a distributed platform that gives applications an abstract shared-memory view. Although it seems a good solution, it also has drawbacks: the memory coherence between the nodes of the platform is difficult to maintain (complex management, scalability issues, high overhead, and so on), and the latency of remote-memory accesses can be orders of magnitude greater than on a shared bus because of the interconnection network. This research therefore improves the performance of OpenMP applications executed on distributed memory platforms using a DSM with sequential consistency, thoroughly evaluating the results of the NAS Parallel Benchmarks. The vast majority of existing DSMs use a relaxed consistency model because it avoids some major problems in the area. In contrast, we use a sequential consistency model because we believe that exposing potential problems that would otherwise remain hidden may lead to solutions applicable to both models. The main idea behind this work is that the two runtimes, OpenMP and the DSM layer, should cooperate to achieve good performance; otherwise they interfere with each other, degrading the final performance of applications.
We develop three contributions to improve the performance of these applications: (a) a technique to avoid false sharing at runtime, (b) a technique to mimic MPI behaviour, in which produced data is forwarded to its consumers, and (c) a mechanism to avoid network congestion due to DSM coherence messages. The NAS Parallel Benchmarks are used to test the contributions. The results of this work show that false sharing is a relative problem that depends on each application. Another result is the importance of moving data flow off the critical path: techniques that forward data as early as possible, as MPI does, benefit final application performance. Additionally, this data movement is usually concentrated at single points and affects application performance because of the limited bandwidth of the network; it is therefore necessary to provide mechanisms that distribute this data throughout the computation, using an otherwise idle network. Finally, the results show that the proposed contributions improve the performance of OpenMP applications in this kind of environment.
16

Mohindra, Ajay. "Issues in the design of distributed shared memory systems." Diss., Georgia Institute of Technology, 1993. http://hdl.handle.net/1853/9123.

17

Norgren, Magnus. "Software Distributed Shared Memory Using the VIPS Coherence Protocol." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-256975.

Abstract:
A coherent global address space in a distributed system enables shared memory programming at a much larger scale than in a single multicore processor. The solution is to implement a software distributed shared memory (SWDSM) system, since hardware support at this scale is non-existent. However, traditional approaches to coherence in SWDSM systems (centralized via 'active' home-node directories) are inherently unfit for such a scenario. Instead, it is crucial to make decisions locally and avoid the long latency imposed by both the network and software message-handlers. This thesis investigates the performance of an SWDSM system with a novel, completely distributed coherence protocol that minimizes the long-latency communication common in coherence protocols. More specifically, we propose an approach suitable for data-race-free programs, based on self-invalidation and self-downgrade and inspired by the VIPS cache coherence protocol. The thesis exploits the distributed nature of self-invalidation and self-downgrade by using a passive data classification directory that requires no message-handlers, thereby incurring no extra latency when issuing coherence requests. The result is an SWDSM system called SVIPS, which maximizes local decision making and achieves high parallel performance with minimal coherence traffic between nodes in a distributed system.
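The self-invalidation/self-downgrade discipline can be sketched for a data-race-free program: at a release, a writer pushes its dirty data back to shared memory (self-downgrade); at an acquire, a reader drops its clean cached copies (self-invalidate), so no invalidation messages or message-handlers are needed. A toy in-process model, not the SVIPS implementation (which also classifies data to avoid needless invalidation):

```python
class Node:
    """Toy node cache obeying self-invalidation/self-downgrade.
    `shared` stands in for the global address space."""
    def __init__(self, shared):
        self.shared = shared
        self.cache = {}        # addr -> locally cached value
        self.dirty = set()     # addrs written since the last release

    def read(self, addr):
        if addr not in self.cache:
            self.cache[addr] = self.shared.get(addr)  # miss: fetch
        return self.cache[addr]

    def write(self, addr, value):
        self.cache[addr] = value
        self.dirty.add(addr)

    def release(self):
        # self-downgrade: publish dirty data, no messages to other nodes
        for addr in self.dirty:
            self.shared[addr] = self.cache[addr]
        self.dirty.clear()

    def acquire(self):
        # self-invalidate: drop clean copies so later reads refetch
        self.cache = {a: v for a, v in self.cache.items() if a in self.dirty}
```

Between synchronizations a node may read stale cached values, which is permitted for data-race-free programs; correctness is restored at each acquire/release pair.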
18

Gupta, Sandeep K. (Sandeep Kumar). "Protocol optimizations for the CRL distributed shared memory system." Thesis, Massachusetts Institute of Technology, 1996. http://hdl.handle.net/1721.1/41004.

19

Adler, Joseph (Joseph Adam). "Implementing distributed shared memory on an extensible operating system." Thesis, Massachusetts Institute of Technology, 1997. http://hdl.handle.net/1721.1/42805.

Abstract:
Thesis (S.B. and M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1997.
Includes bibliographical references (p. 86-91).
by Joseph Adler.
S.B. and M.Eng.
20

Cheung, Wang-leung Benny. "Migrating-home protocol for software distributed shared-memory system /." Hong Kong : University of Hong Kong, 2000. http://sunzi.lib.hku.hk/hkuto/record.jsp?B22030116.

21

Akay, Mehmet Fatih Katsinis Constantine. "Contention resolution and memory load balancing algorithms on distributed shared memory multiprocessors /." Philadelphia, Pa. : Drexel University, 2005. http://dspace.library.drexel.edu/handle/1860/510.

22

Ramesh, Bharath. "Samhita: Virtual Shared Memory for Non-Cache-Coherent Systems." Diss., Virginia Tech, 2013. http://hdl.handle.net/10919/23687.

Abstract:
Among the key challenges of computing today are the emergence of many-core architectures and the resulting need to effectively exploit explicit parallelism. Indeed, programmers are striving to exploit parallelism across virtually all platforms and application domains. The shared memory programming model effectively addresses the parallelism needs of mainstream computing (e.g., portable devices, laptops, desktops, servers), giving rise to a growing ecosystem of shared memory parallel techniques, tools, and design practices. However, to meet the extreme processing and memory demands of critical problem domains, including scientific computation and data-intensive computing, computing researchers continue to innovate in the high-end distributed memory architecture space to create cost-effective and scalable solutions. The emerging distributed memory architectures are both highly parallel and increasingly heterogeneous. As a result, they do not present the programmer with a cache-coherent view of shared memory, either across the entire system or even at the level of an individual node. Furthermore, it remains an open research question which programming model is best for the heterogeneous platforms that feature multiple traditional processors along with accelerators or co-processors. Hence, we have two conflicting trends. On the one hand, programming convenience and the presence of shared memory call for a shared memory programming model across the entire heterogeneous system. On the other hand, increasingly parallel and heterogeneous nodes lacking cache-coherent shared memory call for a message passing model. In this dissertation, we present the architecture of Samhita, a distributed shared memory (DSM) system that addresses the challenge of providing shared memory for non-cache-coherent systems. We define regional consistency (RegC), the memory consistency model implemented by Samhita.
We present performance results for Samhita on several computational kernels and benchmarks, on both cluster supercomputers and heterogeneous systems. The results demonstrate the promising potential of Samhita and the RegC model, and include, by a significant margin, the largest-scale evaluation of any DSM system reported to date.
Ph. D.
23

Argile, Andrew Duncan Stuart. "Distributed processing in decision support systems." Thesis, Nottingham Trent University, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.259647.

24

Silcock, Jackie. "Programmer friendly and efficient distributed shared memory integrated into a distributed operating system." Deakin University. School of Computing and Mathematics, 1998. http://tux.lib.deakin.edu.au./adt-VDU/public/adt-VDU20051114.110240.

Abstract:
Distributed Shared Memory (DSM) provides programmers with a shared memory environment in systems where memory is not physically shared. Clusters of Workstations (COWs), an often untapped source of computing power, are characterised by a very low cost/performance ratio. Combining COWs with DSM provides an environment in which the programmer can use the well-known approaches and methods of programming for physically shared memory systems, and in which parallel processing can make full use of the computing power and cost advantages of the COW. The aim of this research is to synthesise and develop a distributed shared memory system as an integral part of an operating system, in order to provide application programmers with a convenient environment in which parallel applications can be developed and executed easily and efficiently, and to do this in a transparent manner. Furthermore, in order to satisfy our challenging design requirements, we want to demonstrate that the operating system into which the DSM system is integrated should be a distributed operating system. This thesis reports a study into the synthesis of a DSM system within a microkernel- and client-server-based distributed operating system, using both strict and weak consistency models, with a write-invalidate- and write-update-based approach to consistency maintenance. Furthermore, a unique automatic initialisation system is reported which allows the programmer to start the parallel execution of a group of processes with a single library call; the number and location of these processes are determined by the operating system based on system load information. The proposed DSM system takes a novel approach in that it provides programmers with a complete programming environment in which they can easily develop and run their own code, or indeed run existing shared memory code.
A set of demanding DSM system design requirements is presented, and the incentives for placing the DSM system within a distributed operating system, and in particular within the memory management server, are reported. The new DSM system is based on an event-driven set of cooperating, distributed entities, and a detailed description of the events, and of the reactions to them that make up the operation of the DSM system, is presented. This is followed by the detailed design, in pseudocode form, of the main modules and of the activities of the primitives used in the proposed DSM system. Quantitative results of performance tests, and qualitative results showing the ease of programming and use of the RHODOS DSM system, are reported. A study of five different applications is given, together with the results of tests carried out on these applications and a discussion of those results. A discussion of how RHODOS’ DSM allows programmers to write shared memory code in an easy-to-use and familiar environment, and a comparative evaluation of RHODOS DSM against other DSM systems, is presented. In particular, the ease of use and transparency of the DSM system have been demonstrated by the ease with which a moderately inexperienced undergraduate programmer was able to convert, write and run applications for the testing of the DSM system. Furthermore, the tests performed using physically shared memory show that the latter is indistinguishable from distributed shared memory; this is further evidence that the DSM system is fully transparent. This study clearly demonstrates that the aim of the research has been achieved: it is possible to develop a programmer-friendly and efficient DSM system fully integrated within a distributed operating system.
It is clear from this research that client-server and microkernel based distributed operating system integrated DSM makes shared memory operations transparent and almost completely removes the involvement of the programmer beyond classical activities needed to deal with shared memory. The conclusion can be drawn that DSM, when implemented within a client-server and microkernel based distributed operating system, is one of the most encouraging approaches to parallel processing since it guarantees performance improvements with minimal programmer involvement.
25

Farook, Mohammad. "Fast lock-free linked lists in distributed shared memory systems." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp01/MQ32107.pdf.

26

Lo, Adley Kam Wing. "Tolerating latency in software distributed shared memory systems through multithreading." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp01/MQ34040.pdf.

27

Gull, Aarron. "Cherub : a hardware distributed single shared address space memory architecture." Thesis, City University London, 1993. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.356981.

28

Dai, Donglai. "Designing efficient communication subsystems for distributed shared memory (DSM) systems /." The Ohio State University, 1999. http://rave.ohiolink.edu/etdc/view?acc_num=osu1488186329503887.

29

Shao, Cheng. "Multi-writer consistency conditions for shared memory objects." Texas A&M University, 2007. http://hdl.handle.net/1969.1/85806.

Abstract:
Regularity is a shared memory consistency condition that has received considerable attention, notably in connection with quorum-based shared memory. Lamport's original definition of regularity assumed a single-writer model, however, and is not well defined when each shared variable may have multiple writers. In this thesis, we address this need by formally extending the notion of regularity to a multi-writer model. We show that the extension is not trivial: while there are various ways to extend the single-writer definition, the resulting definitions have different strengths. Specifically, we give several possible definitions of regularity in the presence of multiple writers. We then present a quorum-based algorithm implementing each of the proposed definitions and prove it correct. We study the relationships between these definitions and a number of other well-known consistency conditions, and give a partial order describing the relative strengths of these consistency conditions. Finally, we provide a practical context for our results by studying the correctness of two well-known algorithms for mutual exclusion under each of our proposed consistency conditions.
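For intuition, quorum-based multi-writer registers are commonly built by tagging each write with a (timestamp, writer-id) pair and having reads return the value with the largest tag seen in a majority; which consistency condition such a construction satisfies is exactly the kind of question the thesis formalizes. A toy in-process sketch in the style of ABD-type algorithms (the replica layout and quorum choices are illustrative, not the thesis's algorithms):

```python
class QuorumRegister:
    """Majority-quorum sketch of a multi-writer register. Writers tag
    values with (timestamp, writer_id) so concurrent writes are totally
    ordered; any two majorities intersect, so a read sees the latest
    completed write. In-process dicts stand in for networked replicas."""
    def __init__(self, num_replicas):
        self.replicas = [{'tag': (0, 0), 'value': None}
                         for _ in range(num_replicas)]
        self.majority = num_replicas // 2 + 1

    def write(self, writer_id, value):
        # phase 1: learn the highest timestamp from a (write) majority
        tags = [r['tag'] for r in self.replicas[:self.majority]]
        ts = max(t for t, _ in tags) + 1
        # phase 2: store under the new tag at that majority
        for r in self.replicas[:self.majority]:
            if (ts, writer_id) > r['tag']:
                r.update(tag=(ts, writer_id), value=value)

    def read(self):
        # query a (read) majority; it overlaps every write majority
        acked = self.replicas[-self.majority:]
        return max(acked, key=lambda r: r['tag'])['value']
```

The write and read quorums are deliberately different slices of the replica list to show that correctness rests only on their intersection.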
30

Upadhayaya, Niraj. "Memory management and optimization using distributed shared memory systems for high performance computing clusters." Thesis, University of the West of England, Bristol, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.421743.

31

Yellajyosula, Kiran S. "Distributed dispatchers for partially clairvoyant schedulers." Morgantown, W. Va. : [West Virginia University Libraries], 2003. http://etd.wvu.edu/templates/showETD.cfm?recnum=3104.

Abstract:
Thesis (M.S.)--West Virginia University, 2003.
Title from document title page. Document formatted into pages; contains ix, 63 p. : ill. (some col.). Includes abstract. Includes bibliographical references (p. 60-63).
32

Li, Zongpeng. "Non-blocking implementations of Queues in asynchronous distributed shared-memory systems." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2001. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp05/MQ62967.pdf.

33

Atukorala, G. S. "Porting a distributed operating system to a shared memory parallel computer." Thesis, University of Bath, 1990. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.256756.

34

Kalvaitis, Timothy Elmer. "Distributed shared memory for real time hardware in the loop simulation." Thesis, Massachusetts Institute of Technology, 1994. http://hdl.handle.net/1721.1/35972.

Abstract:
Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1994.
Includes bibliographical references (p. 95-96).
by Timothy Elmer Kalvaitis.
M.S.
35

Vassenkov, Phillip. "Contech: a shared memory parallel program analysis framework." Thesis, Georgia Institute of Technology, 2013. http://hdl.handle.net/1853/50379.

Abstract:
We are in the era of multicore machines, where we must exploit thread-level parallelism for programs to run better, smarter, faster, and more efficiently. To increase instruction-level parallelism, processors and compilers perform heavy dataflow analyses between instructions. However, little work has been done on inter-thread dataflow analysis. In order to pave the way and find new ways to conserve resources across a variety of domains (i.e., execution speed, chip die area, power efficiency, and computational throughput), we propose a novel framework, termed Contech, to facilitate the analysis of multithreaded programs in terms of their communication and execution patterns. We focus the scope on shared memory programs rather than message passing programs, since it is more difficult to analyze the communication and execution patterns of such programs. Discovering the patterns of shared memory programs has the potential to allow general-purpose computing machines to turn architectural tricks on or off according to application-specific features. Our design of Contech is modular in nature, so we can glean a large variety of information from an architecturally independent representation of the program under examination.
APA, Harvard, Vancouver, ISO, and other styles
36

Lam, King-tin, and 林擎天. "Efficient shared object space support for distributed Java virtual machine." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2012. http://hub.hku.hk/bib/B47752877.

Full text
Abstract:
Given the popularity of Java, extending the standard Java virtual machine (JVM) to become cluster-aware effectively brings the vision of transparent horizontal scaling of applications to fruition. With a set of cluster-wide JVMs orchestrated as a virtually single system, thread-level parallelism in Java is no longer confined to one multiprocessor. An unmodified multithreaded Java application running on such a Distributed JVM (DJVM) can scale out transparently, tapping into the vast computing power of the cluster. While this notion creates an easy-to-use and powerful parallel programming paradigm, research on DJVMs has remained largely at the proof-of-concept stage where successes were proven using trivial scientific computing workloads only. Real-life Java applications with commercial server workloads have not been well-studied on DJVMs. Their characteristics, including complex and sometimes huge object graphs, irregular access patterns, and frequent synchronization, are key scalability hurdles. To design a scalable DJVM for real-life applications, we identify three major unsolved issues calling for a top-to-bottom overhaul of traditional systems. First, we need a more time- and space-efficient cache coherence protocol to support fine-grained object sharing over the distributed shared heap. The recent prevalence of concurrent data structures with heavy use of volatile fields has added complications to the matter. Second, previous generations of DJVMs lack true support for memory-intensive applications. While the network-wide aggregated physical memory can be huge, mutual sharing of huge object graphs like Java collections may cause nodes to eventually run out of local heap space because the cached copies of remote objects, linked by active references, cannot be arbitrarily discarded. Third, thread affinity, which determines the overall communication cost, is vital to DJVM performance.
Data access locality can be improved by collocating highly-correlated threads, via dynamic thread migration. Tracking inter-thread correlations trades profiling costs for reduced object misses. Unfortunately, profiling techniques like active correlation tracking used in page-based DSMs would entail prohibitively high overheads and low accuracy when ported to fine-grained object-based DJVMs. This dissertation presents technical contributions towards all these problems. We use a dual-protocol approach to address the first problem. Synchronized (lock-based) and volatile accesses are handled by a home-based lazy release consistency (HLRC) protocol and a sequential consistency (SC) protocol respectively. The two protocols’ metadata are maintained in a conflict-free, memory-efficient manner. With further techniques like hierarchical passing of lock ownerships, the overall communication overheads of fine-grained distributed object sharing are pruned to a minimal level. For the second problem, we develop a novel uncaching mechanism to safely break a huge active object graph. When a JVM instance runs low on free memory, it initiates an uncaching policy, which eagerly assigns nulls to selected reference fields, thus detaching some older or less useful cached objects from the root set for reclamation. Careful orchestration is made between uncaching, local garbage collection and the coherence protocol to avoid possible data races. Lastly, we devise lightweight sampling-based profiling methods to derive inter-thread correlations, and a profile-guided thread migration policy to boost the system performance. Extensive experiments have demonstrated the effectiveness of all our solutions.
published_or_final_version
Computer Science
Doctoral
Doctor of Philosophy
APA, Harvard, Vancouver, ISO, and other styles
37

Anevlavis, Ioannis. "A Study of Page-Based Memory Allocation Policies for the Argo Distributed Shared Memory System." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-437735.

Full text
Abstract:
Software distributed shared memory (DSM) systems have been one of the main areas of research in the high-performance computing community. One of the many implementations of such systems is Argo, a page-based, user-space DSM built on top of MPI. Researchers have dedicated considerable effort to making Argo easier to use and to alleviating some of the shortcomings that hurt its performance and scaling. However, several issues are left to be addressed, one of them concerning the simplistic distribution of pages across the nodes of a cluster. Since Argo works at page granularity, the page-based memory allocation, or placement of pages in a distributed system, is of significant importance to performance, since it determines the extent of remote memory accesses. To ensure high performance, it is essential to employ memory allocation policies that allocate data in distributed memory modules intelligently, thus reducing latencies and increasing memory bandwidth. In this thesis, we incorporate several page placement policies into Argo and evaluate their impact on performance with a set of benchmarks ported to that programming model.
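The abstract above concerns policies that map pages to home nodes. Two classic page placement policies can be sketched in a few lines (a hedged illustration of the general technique, not Argo's actual allocator): cyclic placement interleaves consecutive pages round-robin across nodes, while block placement gives each node one contiguous chunk of the page range.

```python
PAGE_SIZE = 4096  # bytes; a typical page size, assumed for illustration

def cyclic_home(addr, num_nodes, page_size=PAGE_SIZE):
    """Cyclic (round-robin) placement: consecutive pages land on consecutive nodes."""
    return (addr // page_size) % num_nodes

def block_home(addr, num_nodes, total_pages, page_size=PAGE_SIZE):
    """Block placement: the page range is split into num_nodes contiguous chunks."""
    pages_per_node = -(-total_pages // num_nodes)  # ceiling division
    return (addr // page_size) // pages_per_node

# With 4 nodes: page 0 -> node 0, page 1 -> node 1 under cyclic placement;
# with 16 total pages, page 15 -> node 3 under block placement.
print(cyclic_home(0, 4), cyclic_home(4096, 4), block_home(15 * 4096, 4, 16))
```

The trade-off the thesis evaluates follows directly from these mappings: cyclic placement spreads bandwidth demand evenly, while block placement keeps spatially adjacent pages on one node, which helps when a node mostly touches its own region.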
APA, Harvard, Vancouver, ISO, and other styles
38

Chan, Charles Quoc Cuong. "Tolerating latency in software distributed shared memory systems through non-binding prefetching." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp01/MQ34036.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

劉宗國 and Chung-kwok Albert Lau. "The doubly-linked list protocol family for distributed shared memory multiprocessor systems." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1996. http://hub.hku.hk/bib/B3121325X.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Richard, Golden George III. "Techniques for process recovery in message passing and distributed shared memory systems /." The Ohio State University, 1995. http://rave.ohiolink.edu/etdc/view?acc_num=osu1487862399451108.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Mandal, Manas. "Efficient distributed shared memory using mapped segmentation and reusable single-assignment variables /." The Ohio State University, 1995. http://rave.ohiolink.edu/etdc/view?acc_num=osu1487864485229781.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

Lau, Chung-kwok Albert. "The doubly-linked list protocol family for distributed shared memory multiprocessor systems /." Hong Kong : University of Hong Kong, 1996. http://sunzi.lib.hku.hk/hkuto/record.jsp?B17590553.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Ji, Qing Song, and 紀清松. "Dynamic distributed shared memory system." Thesis, 1994. http://ndltd.ncl.edu.tw/handle/53044595193361533471.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

HUANG, JIN-YUAN, and 黃進源. "Coherence of distributed shared memory." Thesis, 1990. http://ndltd.ncl.edu.tw/handle/92302775635354532235.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Reinhardt, Steven K. "Mechanisms for distributed shared memory." 1996. http://catalog.hathitrust.org/api/volumes/oclc/36943929.html.

Full text
Abstract:
Thesis (Ph. D.)--University of Wisconsin--Madison, 1996.
Typescript. eContent provider-neutral record in process. Description based on print version record. Includes bibliographical references (leaves 113-123).
APA, Harvard, Vancouver, ISO, and other styles
46

Chen, Feng-Lin, and 陳烽霖. "Fault-Tolerant Distributed Shared Memory Systems." Thesis, 1995. http://ndltd.ncl.edu.tw/handle/79200425584823292350.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Keleher, Peter John. "Lazy release consistency for distributed shared memory." Thesis, 1995. http://hdl.handle.net/1911/16837.

Full text
Abstract:
A software distributed shared memory (DSM) system allows shared memory parallel programs to execute on networks of workstations. This thesis presents a new class of protocols that has lower communication requirements than previous DSM protocols, and can consequently achieve higher performance. The lazy release consistent protocols achieve this reduction in communication by piggybacking consistency information on top of existing synchronization transfers. Some of the protocols also improve performance by speculatively moving data. We evaluate the impact of these features by comparing the performance of a software DSM using lazy protocols with that of a DSM using previous eager protocols. We found that seven of our eight applications performed better on the lazy system, and four of the applications showed performance speedups of at least 18%. As part of this comparison, we show that the cost of executing the slightly more complex code of the lazy protocols is far less important than the reduction in communication requirements. We also compare the lazy performance with that of a hardware supported shared memory system that uses processors and caches similar to those of the workstations running our DSM. Our DSM system was able to approach, and in one case even surpass, the performance of the hardware system for applications with coarse-grained parallelism, but the hardware system performed significantly better for programs with fine-grained parallelism. Overall, the results indicate that DSMs using lazy protocols have become a viable alternative for high-performance parallel processing.
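The key mechanism in the abstract above is piggybacking consistency information (write notices) on synchronization transfers: pages are only invalidated at lock acquire, and only for intervals the acquirer has not yet seen. A minimal, hypothetical sketch of that idea using vector clocks (class and field names are ours, not Keleher's, and real protocols fetch diffs lazily on the subsequent page fault):

```python
class LrcNode:
    """Toy lazy-release-consistency participant: tracks intervals per node."""
    def __init__(self, node_id, num_nodes):
        self.id = node_id
        self.vc = [0] * num_nodes   # highest interval seen from each node
        self.log = []               # write notices: (node, interval, dirty_pages)
        self.invalid = set()        # pages to re-fetch lazily on next access
        self.dirty = set()          # pages written in the current interval

    def write(self, page):
        self.dirty.add(page)

    def release(self):
        # Close the current interval and log which pages it modified.
        self.vc[self.id] += 1
        self.log.append((self.id, self.vc[self.id], frozenset(self.dirty)))
        self.dirty = set()

    def acquire(self, releaser):
        # Piggyback only the write notices this acquirer has not yet seen.
        for node, interval, pages in releaser.log:
            if interval > self.vc[node]:
                self.vc[node] = interval
                self.invalid |= pages   # invalidate now, move data lazily later
```

A second acquire from the same releaser carries no new notices, which is precisely the communication reduction the thesis measures against eager protocols.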
APA, Harvard, Vancouver, ISO, and other styles
48

Zhang, Kai. "Compiling for software distributed-shared memory systems." Thesis, 2000. http://hdl.handle.net/1911/17392.

Full text
Abstract:
In this thesis, we explore the use of software distributed shared memory (SDSM) as a target communication layer for parallelizing compilers. We explore how to effectively exploit compiler-derived knowledge of sharing and communication patterns for regular access patterns to improve their performance on SDSM systems. We introduce two novel optimization techniques: compiler-restricted consistency which reduces the cost of false sharing, and compiler-managed communication buffers which, when used together with compiler-restricted consistency, reduce the cost of fragmentation. We focus on regular applications with wavefront computation and tightly-coupled sharing due to carried data dependence. Along with other types of compiler-assisted SDSM optimizations such as compiler-controlled eager update, our integrated compiler and run-time support provides speedups for wavefront computations on SDSM that rival those achieved previously only for loosely synchronous style applications. (Abstract shortened by UMI.)
APA, Harvard, Vancouver, ISO, and other styles
49

Ueng, Jyh-Chang, and 翁志昌. "Multi-threading for Distributed Shared Memory Systems." Thesis, 2000. http://ndltd.ncl.edu.tw/handle/75109053685415923880.

Full text
Abstract:
Doctoral
National Cheng Kung University
Department of Electrical Engineering
88
This dissertation investigates the support of multi-threading for DSM systems to improve programming flexibility, latency masking, thread migration, load balancing, and dynamic resource utilization. Several techniques are proposed to enhance the efficiency of multi-threading, in addition to providing the basic multi-threading facility. The methods include a transparent programming model, affinity scheduling, and idle-reducing synchronization mechanisms. Furthermore, this thesis proposes a dependency-driven load balancing strategy for balancing workload among nodes. Dynamic resource utilization is also made possible by supporting dynamic node reconfiguration, which is based on thread migration. In addition to the theoretical investigation, we have implemented a prototype multi-threaded DSM system called Cohesion that incorporates all of the proposed techniques. Experiments have been carried out to evaluate the effectiveness of multi-threading as well as the efficiency of the proposed techniques. The results show that employing the overlapping technique for multi-threading can improve application performance by up to 18% in our test programs. Moreover, our load balancing facility reduces processor idleness when the workload on the nodes of the system is unequal. The support of node reconfiguration achieves much improvement in performance when computers are dynamically added to the system.
APA, Harvard, Vancouver, ISO, and other styles
50

Chang, Chen-Yea, and 張振亞. "Distributed Shared Memory Management:an Object Oriented Approach." Thesis, 1993. http://ndltd.ncl.edu.tw/handle/05880254505201157090.

Full text
APA, Harvard, Vancouver, ISO, and other styles
We offer discounts on all premium plans for authors whose works are included in thematic literature selections. Contact us to get a unique promo code!

To the bibliography