Dissertations / Theses: 'Parallel programming (Computer science)'

1

Gamble, James Graham. "Explicit parallel programming." Thesis, This resource online, 1990. http://scholar.lib.vt.edu/theses/available/etd-06082009-171019/.

2

Roe, Paul. "Parallel programming using functional languages." Thesis, Connect to e-thesis, 1991. http://theses.gla.ac.uk/1052.

3

Handler, Caroline. "Parallel process placement." Thesis, Rhodes University, 1989. http://hdl.handle.net/10962/d1002033.

Abstract:

This thesis investigates methods of automatic allocation of processes to available processors in a given network configuration. The research described covers the investigation of various algorithms for optimal process allocation. Among those researched were an algorithm which used a branch and bound technique, an algorithm based on graph theory, and a heuristic algorithm involving cluster analysis. These have been implemented and tested in conjunction with the gathering of performance statistics during program execution, for use in improving subsequent allocations. The system has been implemented on a network of loosely-coupled microcomputers using multi-port serial communication links to simulate a transputer network. The concurrent programming language occam has been implemented, replacing the explicit process allocation constructs with an automatic placement algorithm. This enables the source code to be completely separated from hardware considerations.

4

Bergstrom, Lars. "Parallel functional programming with mutable state." Thesis, The University of Chicago, 2013. http://pqdtopen.proquest.com/#viewpdf?dispub=3568360.

Abstract:

Immutability greatly simplifies the implementation of parallel languages. In the absence of mutable state the language implementation is free to perform parallel operations with fewer locks and fewer restrictions on scheduling and data replication. In the Manticore project, we have achieved nearly perfect speedups across both Intel and AMD manycore machines on a variety of benchmarks using this approach.

There are parallel stateful algorithms, however, that exhibit significantly better performance than the corresponding parallel algorithm without mutable state. For example, in many search problems, the same problem configuration can be reached through multiple execution paths. Parallel stateful algorithms share the results of evaluating the same configuration across threads, but parallel mutation-free algorithms are required to either duplicate work or thread their state through a sequential store. Additionally, in algorithms where each parallel task mutates an independent portion of the data, non-conflicting mutations can be performed in parallel. The parallel state-free algorithm will have to merge each of those changes individually, which is a sequential operation at each step.

In this dissertation, we extend Manticore with two techniques that address these problems while preserving its current scalability. Memoization, also known as function caching, is a technique that stores previously returned values from functions, making them available to parallel threads of execution that call that same function with those same values. We have taken this deterministic technique and combined it with a high-performance implementation of a dynamically sized, parallel hash table to provide scalable performance. We have also added mutable state along with two execution models (one of which is deterministic) that allow the user to share arbitrary results across parallel threads, both of which preserve the ability to reason locally about the behavior of code.

For both of these techniques, we present a detailed description of their implementations, examine a set of relevant benchmarks, and specify their semantics.
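
As an illustration of the memoization idea in this abstract, the following is a minimal Python sketch, not Manticore's implementation. The class name `MemoTable` and the example function `collatz_steps` are hypothetical, and a dictionary guarded by a single lock stands in for the dynamically sized parallel hash table the abstract mentions.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class MemoTable:
    """Memoizes a pure function so that parallel callers can share results.

    Assumption: a dict behind one coarse lock replaces the scalable
    parallel hash table described in the abstract.
    """

    def __init__(self, fn):
        self._fn = fn
        self._cache = {}
        self._lock = threading.Lock()

    def __call__(self, *args):
        with self._lock:
            if args in self._cache:
                return self._cache[args]
        # Compute outside the lock. Two threads may duplicate work here,
        # which is harmless because the function is pure.
        value = self._fn(*args)
        with self._lock:
            # Keep the first stored value so every caller sees the same one.
            return self._cache.setdefault(args, value)


def collatz_steps(n):
    """Number of Collatz steps needed to reach 1 from n."""
    if n == 1:
        return 0
    nxt = n // 2 if n % 2 == 0 else 3 * n + 1
    return 1 + memo_collatz(nxt)  # recursive calls go through the memo table


memo_collatz = MemoTable(collatz_steps)

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(memo_collatz, range(1, 11)))

print(results)
```

Because the memoized function is pure, the result does not depend on which thread fills a cache entry first, which is the sense in which the technique stays deterministic.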

5

Lee, I.-Ting Angelina. "Memory abstractions for parallel programming." Thesis, Massachusetts Institute of Technology, 2012. http://hdl.handle.net/1721.1/75636.

Abstract:

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012.
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 156-163).
A memory abstraction is an abstraction layer between the program execution and the memory that provides a different "view" of a memory location depending on the execution context in which the memory access is made. Properly designed memory abstractions help ease the task of parallel programming by mitigating the complexity of synchronization or admitting more efficient use of resources. This dissertation describes five memory abstractions for parallel programming: (i) cactus stacks that interoperate with linear stacks, (ii) efficient reducers, (iii) reducer arrays, (iv) ownership-aware transactions, and (v) location-based memory fences. To demonstrate the utility of memory abstractions, my collaborators and I developed Cilk-M, a dynamically multithreaded concurrency platform which embodies the first three memory abstractions. Many dynamic multithreaded concurrency platforms incorporate cactus stacks to support multiple stack views for all the active children simultaneously. The use of cactus stacks, albeit essential, forces concurrency platforms to trade off between performance, memory consumption, and interoperability with serial code due to their incompatibility with linear stacks. This dissertation proposes a new strategy to build a cactus stack using thread-local memory mapping (or TLMM), which enables Cilk-M to satisfy all three criteria simultaneously. A reducer hyperobject allows different branches of a dynamic multithreaded program to maintain coordinated local views of the same nonlocal variable. With reducers, one can use nonlocal variables in a parallel computation without restructuring the code or introducing races. This dissertation introduces memory-mapped reducers, which admit much more efficient access compared to existing implementations. When used in large quantity, reducers incur unnecessarily high overhead in execution time and space consumption. This dissertation describes support for reducer arrays, which offer the same functionality as an array of reducers with significantly less overhead. Transactional memory is a high-level synchronization mechanism, designed to be easier to use and more composable than fine-grain locking. This dissertation presents ownership-aware transactions, the first transactional memory design that provides provable safety guarantees for "open-nested" transactions. On architectures that implement memory models weaker than sequential consistency, programs communicating via shared memory must employ memory fences to ensure correct execution. This dissertation examines the concept of location-based memory fences, which, unlike traditional memory fences, incur latency only when synchronization is necessary.
by I-Ting Angelina Lee.
Ph.D.
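
The reducer idea described in this abstract (each parallel branch keeps a local view of a nonlocal variable, and the views are combined afterwards) can be sketched in Python as follows. This is only an illustration of the concept, not Cilk-M's memory-mapped reducers; the function `parallel_reduce` and its parameters are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def parallel_reduce(chunks, identity, update, combine):
    """Run `update` over each chunk on a private view, then merge the views.

    identity: zero-argument function returning a fresh empty view
    update:   (view, item) -> new view, applied inside one branch
    combine:  associative (left_view, right_view) -> merged view
    """

    def work(chunk):
        view = identity()  # private local view, so no races between branches
        for item in chunk:
            view = update(view, item)
        return view

    with ThreadPoolExecutor() as pool:
        views = list(pool.map(work, chunks))  # map keeps the chunk order

    # Merging in chunk order gives the same result as a serial run,
    # provided `combine` is associative.
    return reduce(combine, views, identity())


squares = parallel_reduce(
    chunks=[[1, 2], [3, 4], [5]],
    identity=list,
    update=lambda view, x: view + [x * x],
    combine=lambda left, right: left + right,
)
print(squares)
```

List concatenation is associative but not commutative, so the example also shows why the views are merged in the serial order of the branches.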

6

Child, Christopher H. T. "Approximate dynamic programming with parallel stochastic planning operators." Thesis, City University London, 2011. http://openaccess.city.ac.uk/1109/.

Abstract:

This thesis presents an approximate dynamic programming (ADP) technique for environment modelling agents. The agent learns a set of parallel stochastic planning operators (P-SPOs) by evaluating changes in its environment in response to actions, using an association rule mining approach. An approximate policy is then derived by iteratively improving state value aggregation estimates attached to the operators using the P-SPOs as a model in a Dyna-Q-like architecture. Reinforcement learning and dynamic programming are powerful techniques for automated agent decision making in stochastic environments. Dynamic programming is effective when there is a known environment model, while reinforcement learning is effective when a model is not available. The techniques derive a policy: a mapping from each environment state to an action which optimizes the long term reward the agent receives. The standard methods become less effective as the state space for the environment increases because they require values to be associated with each state, the storage and processing of which is exponential in the number of state variables. Resolving this “curse of dimensionality” is an important topic of research amongst all communities working on this problem. Two key methods are to: (i) derive an estimate of the value (approximate dynamic programming) using function approximation or state aggregation; or (ii) build a model of the environment from experience. This thesis presents a method of combining these approaches by exploiting structure in the state transition and value functions captured in a set of planning operators which are learnt through experience in the environment. Standard planning operators define the deterministic changes that occur in an environment in response to an action.
This work presents Parallel Stochastic Planning Operators (P-SPOs), a novel form of planning operator providing a structured model of the state transition function in environments which are both non-deterministic and for which changes can occur outside the influence of actions. Next, an automated method for extracting P-SPOs from observations in an environment is explored using an adaptation of association rule mining. Finally, methods of relating the state transition structure encapsulated in the P-SPOs to state values, using the operators to store state value aggregation estimates, are evaluated. The framework described provides a method by which approximate dynamic programming can be applied by designers of AI agents and AI planning systems for which they have minimal prior knowledge. The framework and P-SPO based implementations are tested against standard techniques in two benchmark stochastic environments: a “slippery gripper” block painting robot; and a “predator-prey” agent environment. Experimental results show that an agent using a P-SPO-based approach is able to learn an accurate model of its environment if successor state variables exhibit conditional independence, and an approximate model in the non-independent case. Results also demonstrate that the agent’s ability to generalise to previously unseen states using the model allows it to form an improved policy over an agent employing a standard Dyna-Q based technique. Finally, an approximate policy stored in state aggregation estimates attached to operators is shown to be optimal in experiments for which the P-SPO set contains sufficient information for effective aggregations to be formed.

7

Lewis, E. Christopher. "Achieving robust performance in parallel programming languages /." Thesis, Connect to this title online; UW restricted, 2001. http://hdl.handle.net/1773/6996.

8

Ding, Weiren. "Selsyn-C : a self-synchronizing parallel programming language." Thesis, McGill University, 1992. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=22494.

Abstract:

In this thesis we report the design and implementation of a new self-scheduling parallel programming language, SELSYN-C. As parallel processors become more accessible to a broad range of programmers, the development of simple-to-use and effective programming languages becomes increasingly important. Our approach to the challenge of parallel programming language design and implementation is two-fold: (1) the design of simple extensions to C that are both easy to use for the programmer, and useful for effective compilation, and (2) the design of efficient and effective scheduling strategies that can be automatically supported by a compiler and associated run-time environment.
We outline our approach by presenting: (1) our motivation, (2) an overview of the extensions to C that form the SELSYN-C programming language, and (3) the development of a new scheduling mechanism that can be used to effectively compile SELSYN-C programs for a real parallel processor, the BBN Butterfly GP-1000. Different scheduling strategies for this mechanism were studied via several experimental tests and the results of these experiments are reported.
A source-to-source compiler supporting the SELSYN-C language has been implemented. Included in this thesis is a description of both the compiler and associated run-time environment.

9

Vaudin, John. "A unified programming system for a multi-paradigm parallel architecture." Thesis, University of Warwick, 1991. http://wrap.warwick.ac.uk/108849/.

Abstract:

Real time image understanding and image generation require very large amounts of computing power. A possible way to meet these requirements is to make use of the power available from parallel computing systems. However, parallel machines exhibit performance which is highly dependent on the algorithms being executed. Both image understanding and image generation involve the use of a wide variety of algorithms. A parallel machine suited to some of these algorithms may be unsuited to others. This thesis describes a novel heterogeneous parallel architecture optimised for image based applications. It achieves its performance by combining two different forms of parallel architecture, namely fine grain SIMD and coarse grain MIMD, into a single architecture. In this way it is possible to match the most appropriate computing resource to each algorithm in a given application. As important as the architecture itself is a method for programming it. This thesis describes a novel multi-paradigm programming language based on C++, which allows programs which make use of both control and data parallelism to be expressed in a single coherent framework, based on object oriented programming. To demonstrate the utility of both the architecture and the programming system, two applications, one from the field of image understanding and the other from image generation, are examined. These applications combine some novel algorithms with other novel implementation approaches to provide the most effective mapping onto this architecture.

10

Dazzi, Patrizio. "Tools and models for high level parallel and Grid programming." Thesis, IMT Alti Studi Lucca, 2008. http://e-theses.imtlucca.it/12/1/Dazzi_phdthesis.pdf.

Abstract:

When algorithmic skeletons were first introduced by Cole in the late 1980s (50), the idea had almost immediate success. The skeletal approach has proved to be effective when application algorithms can be expressed in terms of skeleton composition. However, despite both their effectiveness and the progress made in skeletal systems design and implementation, algorithmic skeletons remain absent from mainstream practice. Cole and other researchers, respectively in (51) and (19), focused on the problem. They recognized the issues affecting skeletal systems and stated a set of principles that have to be tackled in order to make them more effective and to take skeletal programming into the parallel mainstream. In this thesis we propose tools and models for addressing some of the issues of skeletal programming environments. We describe three novel approaches aimed at enhancing skeleton-based systems from different angles. First, we present a model we conceived that allows customization of algorithmic skeletons by exploiting the macro data-flow abstraction. Then we present two results about the exploitation of metaprogramming techniques for the run-time generation and optimization of macro data-flow graphs. In particular, we show how to generate and how to optimize macro data-flow graphs according both to programmer-provided non-functional requirements and to execution platform features. The last result we present is Behavioural Skeletons, an approach aimed at addressing the limitations of skeletal programming environments when used for the development of component-based Grid applications. We validated all the approaches by conducting several tests, performed using a set of tools we developed.

11

Brandis, Robert Craig. "IPPM : Interactive parallel program monitor." Full text open access at:, 1986. http://content.ohsu.edu/u?/etd,111.

12

Lee, ChuanChe. "Parallel programming on General Block Min Max Criterion." CSUSB ScholarWorks, 2006. https://scholarworks.lib.csusb.edu/etd-project/3065.

Abstract:

The purpose of the thesis is to develop a parallel implementation of the General Block Min Max Criterion (GBMM). This thesis deals with two kinds of parallel overheads: Redundant Calculations Parallel Overhead (RCPO) and Communication Parallel Overhead (CPO).

13

Chau, Genghis. "Cilkpride : always-on visualizations for parallel programming." Thesis, Massachusetts Institute of Technology, 2017. http://hdl.handle.net/1721.1/112834.

Abstract:

Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (page 59).
Parallel programming is an increasingly important way for programmers to squeeze more performance out of their programs. Parallelization is error-prone, however, and programmers often forget to run error checkers and performance analyzers regularly. This thesis presents Cilkpride, an IDE plug-in that uses always-on visualizations to show programmers information on their parallel program directly inside their IDE. Cilkpride runs a race checker and program profiler every time code is changed and immediately displays output to make programmers always aware of parallelization errors and performance bottlenecks. Programmers can then react and fix these issues quickly. To evaluate the system, we asked students who had taken MIT's 6.172 class, a performance engineering course, to use Cilkpride. Students found Cilkpride useful, helping them find races and bottlenecks.
by Genghis Chau.
M. Eng.

14

Wu, Jing 1964. "A parallel flow analysis method on structured programming languages." Thesis, McGill University, 1995. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=23951.

Abstract:

Ideally, compilers should produce object code that is at least as efficient as hand-written code. The key step toward this goal is developing techniques by which a compiler can derive the information that can help in optimization. This is the concern of flow analysis. Recently, with the emergence of parallel computer systems, both parallelization of the compiler and production of code for parallel processors have become crucial issues for compiler development (1,2,3).
This thesis presents new methods of compiler flow analysis for modern computer languages running on a uniprocessor or multiprocessor. These methods allow flow analysis information to be extracted directly from a high-level representation of the source programs, even in parallel. To achieve this goal, we develop and utilize the Extended Abstract Syntax Tree (EAST), and the Symbol Table Data Relational Tree (STDRT) representations, to perform our flow analysis based on these structures. By these approaches, the compiler is able to keep the most useful information and apply this information during various optimization stages. We also introduce several scheduling algorithms for parallelizing the flow analysis phase. An experimental compiler and its results support the usefulness of these methods.

15

Li, Li. "Model-based automatic performance diagnosis of parallel computations /." view abstract or download file of text, 2007. http://proquest.umi.com/pqdweb?did=1335366371&sid=1&Fmt=2&clientId=11238&RQT=309&VName=PQD.

Abstract:

Thesis (Ph. D.)--University of Oregon, 2007.
Typescript. Includes vita and abstract. Includes bibliographical references (leaves 119-123). Also available for download via the World Wide Web; free to University of Oregon users.

16

Girimaji, Sanjay. "Data-parallel programming with multiple inheritance on the connection machine." FIU Digital Commons, 1990. https://digitalcommons.fiu.edu/etd/3940.

Abstract:

The demand for computers is oriented toward faster computers, and newer computers are being built with more than one CPU. These computers require sophisticated software to program them. One approach to programming multiple-CPU machines is through the use of object-oriented programming techniques. An example of such an approach is the use of C* on the Connection Machine. Though C* supports many of the object-oriented concepts, it does not support the concept of software reuse through inheritance. This thesis introduces a new language called C*++, an extension of the C* language to support inheritance. We also discuss the issues involved in the implementation of multiple inheritance in programming languages. This thesis describes the differences between C*++ and C*. It also discusses the various issues involved in the design and implementation of the translator from C*++ to C*. It also illustrates the advantages of programming in C*++ through an example. Since C*++ is designed to support software reuse, which allows users to create quality software in a shorter time, it is anticipated that C*++ will have widespread use in programming the Connection Machine.

17

Coffin, Michael Howard. "Par: An approach to architecture-independent parallel programming." Diss., The University of Arizona, 1990. http://hdl.handle.net/10150/185150.

Abstract:

This dissertation addresses the problem of writing portable programs for parallel computers, including shared memory, distributed, and non-uniform memory access architectures. The basis of our approach is to separate the expression of the algorithm from the machine-dependent details that are necessary to achieve good performance. The method begins with a statement of the algorithm in a classic, explicitly parallel, manner. This basic program is then annotated to specify architecture-dependent details such as scheduling and mapping. These ideas have been cast in terms of a programming language, Par, which provides flexible facilities for a range of programming styles, from shared memory to message passing. Par is used to specify both the algorithm and the implementation of the annotations.

18

Alahmadi, Marwan Ibrahim. "Optimizing data parallelism in applicative languages." Diss., Georgia Institute of Technology, 1990. http://hdl.handle.net/1853/8457.

19

Huck, Kevin A. "Knowledge support for parallel performance data mining /." Connect to title online (Scholars' Bank) Connect to title online (ProQuest), 2009. http://hdl.handle.net/1794/10087.

20

Kuper, Lindsey. "Lattice-based data structures for deterministic parallel and distributed programming." Thesis, Indiana University, 2015. http://pqdtopen.proquest.com/#viewpdf?dispub=3726443.

Abstract:

Deterministic-by-construction parallel programming models guarantee that programs have the same observable behavior on every run, promising freedom from bugs caused by schedule nondeterminism. To make that guarantee, though, they must sharply restrict sharing of state between parallel tasks, usually either by disallowing sharing entirely or by restricting it to one type of data structure, such as single-assignment locations.

I show that lattice-based data structures, or LVars, are the foundation for a guaranteed-deterministic parallel programming model that allows a more general form of sharing. LVars allow multiple assignments that are inflationary with respect to a given lattice. They ensure determinism by allowing only inflationary writes and "threshold" reads that block until a lower bound is reached. After presenting the basic LVars model, I extend it to support event handlers, which enable an event-driven programming style, and non-blocking "freezing" reads, resulting in a quasi-deterministic model in which programs behave deterministically modulo exceptions.

I demonstrate the viability of the LVars model with LVish, a Haskell library that provides a collection of lattice-based data structures, a work-stealing scheduler, and a monad in which LVar computations run. LVish leverages Haskell's type system to index such computations with effect levels to ensure that only certain LVar effects can occur, hence statically enforcing determinism or quasi-determinism. I present two case studies of parallelizing existing programs using LVish: a k-CFA control flow analysis, and a bioinformatics application for comparing phylogenetic trees.

Finally, I show how LVar-style threshold reads apply to the setting of convergent replicated data types (CvRDTs), which specify the behavior of eventually consistent replicated objects in a distributed system. I extend the CvRDT model to support deterministic, strongly consistent threshold queries. The technique generalizes to any lattice, and hence any CvRDT, and allows deterministic observations to be made of replicated objects before the replicas' states converge.
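
The two LVar operations named in this abstract, inflationary writes and blocking threshold reads, can be illustrated with a small Python sketch. It is not LVish (which is a Haskell library) and omits handlers and freezing; the class `MaxLVar` is a hypothetical example over the lattice of integers ordered by `<=`, with `max` as the join.

```python
import threading


class MaxLVar:
    """A single lattice variable whose state can only grow (join = max)."""

    def __init__(self, bottom=0):
        self._state = bottom
        self._cond = threading.Condition()

    def put(self, value):
        """Inflationary write: the new state is the join of old state and value."""
        with self._cond:
            self._state = max(self._state, value)
            self._cond.notify_all()

    def get(self, threshold):
        """Threshold read: block until the state is at or above `threshold`.

        Only the threshold is returned, never the exact state, so the
        observation is the same under every schedule of the writers.
        """
        with self._cond:
            self._cond.wait_for(lambda: self._state >= threshold)
            return threshold


lv = MaxLVar()
seen = []

# The reader starts first and blocks until some write reaches the threshold.
reader = threading.Thread(target=lambda: seen.append(lv.get(5)))
reader.start()

writers = [threading.Thread(target=lv.put, args=(v,)) for v in (3, 7, 5)]
for w in writers:
    w.start()
for t in writers + [reader]:
    t.join()

print(seen)
```

Whatever order the three writes happen in, the reader observes the same value, which is the property the abstract relies on for determinism.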

21

Pinder, Robert William 1977. "Applications of genetic programming to parallel system optimization." Thesis, Massachusetts Institute of Technology, 2000. http://hdl.handle.net/1721.1/86507.

Abstract:

Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2000.
Includes bibliographical references (p. 81-84).
by Robert William Pinder.
M.Eng.

22

Huang, Kai 1980. "Data-race detection in transactions-everywhere parallel programming." Thesis, Massachusetts Institute of Technology, 2003. http://hdl.handle.net/1721.1/16964.

Abstract:

Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2003.
Includes bibliographical references (p. 69-72).
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
This thesis studies how to perform dynamic data-race detection in programs using "transactions everywhere", a new methodology for shared-memory parallel programming. Since the conventional definition of a data race does not make sense in the transactions-everywhere methodology, this thesis develops a new definition based on a weak assumption about the correctness of the target program's parallel-control flow, which is made in the same spirit as the assumption underlying the conventional definition. This thesis proves, via a reduction from the problem of 3cnf-formula satisfiability, that data-race detection in the transactions-everywhere methodology is an NP-complete problem. In view of this result, it presents an algorithm that approximately detects data races. The algorithm never reports false negatives. When a possible data race is detected, the algorithm outputs simple information that allows the programmer to efficiently resolve the root of the problem. The algorithm requires running time that is worst-case quadratic in the size of a graph representing all the scheduling constraints in the target program.
by Kai Huang.
M.Eng.

23

Watkins, Rees Collyer. "Algorithmic skeletons as a method of parallel programming." Thesis, Rhodes University, 1993. http://hdl.handle.net/10962/d1004889.

Abstract:

A new style of abstraction for program development, based on the concept of algorithmic skeletons, has been proposed in the literature. The programmer is offered a variety of independent algorithmic skeletons, each of which describes the structure of a particular style of algorithm. The appropriate skeleton is used by the system to mould the solution. Parallel programs are particularly appropriate for this technique because of their complexity. This thesis investigates algorithmic skeletons as a method of hiding the complexities of parallel programming from the user, and for guiding them towards efficient solutions. To explore this approach, this thesis describes the implementation and benchmarking of the divide and conquer and task queue paradigms as skeletons. All but one category of problem, as implemented in this thesis, scale well over eight processors. The rate of speed up tails off when there are significant communication requirements. The results show that, with some user knowledge, efficient parallel programs can be developed using this method. The evaluation explores methods for fine tuning some skeleton programs to achieve increased efficiency.
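
To make the divide and conquer skeleton mentioned in this abstract concrete, here is a minimal Python sketch. It is an assumption-laden illustration, not the thesis's implementation: the function `divide_and_conquer`, its parameter names, and the `parallel_depth` cut-off are all hypothetical. The user supplies only the four problem-specific functions; the skeleton owns the parallel structure.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def divide_and_conquer(problem, is_trivial, solve, divide, combine, parallel_depth=0):
    """Generic divide and conquer; the top `parallel_depth` levels run in threads."""
    if is_trivial(problem):
        return solve(problem)
    parts = divide(problem)

    def recurse(part):
        return divide_and_conquer(
            part, is_trivial, solve, divide, combine, max(parallel_depth - 1, 0)
        )

    if parallel_depth > 0:
        # One short-lived pool per level, sized to the number of parts, so a
        # parent waiting for its children can never starve them of workers.
        with ThreadPoolExecutor(max_workers=len(parts)) as pool:
            results = list(pool.map(recurse, parts))
    else:
        results = [recurse(part) for part in parts]
    return combine(results)


# Merge sort expressed as an instance of the skeleton.
data = [5, 2, 9, 1, 7, 3, 8]
sorted_data = divide_and_conquer(
    data,
    is_trivial=lambda xs: len(xs) <= 1,
    solve=lambda xs: list(xs),
    divide=lambda xs: [xs[: len(xs) // 2], xs[len(xs) // 2 :]],
    combine=lambda parts: list(heapq.merge(*parts)),
    parallel_depth=2,
)
print(sorted_data)
```

The `parallel_depth` parameter is one example of the kind of tuning knob the abstract alludes to: below that depth the skeleton falls back to sequential recursion to limit overhead.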

24

Bakken, David Edward. "Supporting fault-tolerant parallel programming in Linda." Diss., The University of Arizona, 1994. http://hdl.handle.net/10150/186872.

Abstract:

As people are becoming increasingly dependent on computerized systems, the need for these systems to be dependable is also increasing. However, programming dependable systems is difficult, especially when parallelism is involved. This is due in part to the fact that very few high-level programming languages support both fault-tolerance and parallel programming. This dissertation addresses this problem by presenting FT-Linda, a high-level language for programming fault-tolerant parallel programs. FT-Linda is based on Linda, a language for programming parallel applications whose most notable feature is a distributed shared memory called tuple space. FT-Linda extends Linda by providing support to allow a program to tolerate failures in the underlying computing platform. The distinguishing features of FT-Linda are stable tuple spaces and atomic execution of multiple tuple space operations. The former is a type of stable storage in which tuple values are guaranteed to persist across failures, while the latter allows collections of tuple operations to be executed in an all-or-nothing fashion despite failures and concurrency. Example FT-Linda programs are given for both dependable systems and parallel applications. The design and implementation of FT-Linda are presented in detail. The key technique used is the replicated state machine approach to constructing fault-tolerant distributed programs. Here, tuple space is replicated to provide failure resilience, and the replicas are sent a message describing the atomic sequence of tuple space operations to perform. This strategy allows an efficient implementation in which only a single multicast message is needed for each atomic sequence of tuple space operations. An implementation of FT-Linda for a network of workstations is also described. FT-Linda is being implemented using Consul, a communication substrate that supports fault-tolerant distributed programming. 
Consul is built in turn with the x-kernel, an operating system kernel that provides support for composing network protocols. Each of the components of the implementation has been built and tested.
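
The tuple space model and the idea of executing several tuple operations as one all-or-nothing step can be sketched in Python as below. This is a single-process illustration only: it models atomicity with respect to concurrent threads, not the stable storage, replication, or failure handling that FT-Linda provides. The class `TupleSpace`, the method `atomic_update`, and the use of `None` as a wildcard are hypothetical choices, not FT-Linda syntax.

```python
import threading


class TupleSpace:
    """A tiny in-memory tuple space; None in a pattern matches any value."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def _find(self, pattern):
        for tup in self._tuples:
            if len(tup) == len(pattern) and all(
                p is None or p == v for p, v in zip(pattern, tup)
            ):
                return tup
        return None

    def out(self, tup):
        """Deposit a tuple."""
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def in_(self, pattern):
        """Withdraw a matching tuple, blocking until one exists."""
        with self._cond:
            self._cond.wait_for(lambda: self._find(pattern) is not None)
            tup = self._find(pattern)
            self._tuples.remove(tup)
            return tup

    def atomic_update(self, pattern, fn):
        """Withdraw a match and deposit fn(match) as one indivisible step."""
        with self._cond:
            self._cond.wait_for(lambda: self._find(pattern) is not None)
            tup = self._find(pattern)
            self._tuples.remove(tup)
            self._tuples.append(fn(tup))
            self._cond.notify_all()


space = TupleSpace()
space.out(("count", 0))


def bump(times):
    for _ in range(times):
        space.atomic_update(("count", None), lambda t: (t[0], t[1] + 1))


workers = [threading.Thread(target=bump, args=(100,)) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()

final = space.in_(("count", None))
print(final)
```

Since the withdraw and the deposit happen under one lock, no other thread ever sees the space without the counter tuple, which mirrors the all-or-nothing behaviour described in the abstract for the concurrency case.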

25

Wong, Chi-Kin. "Reusable template library for parallel patterns." [Gainesville, Fla.] : University of Florida, 2002. http://purl.fcla.edu/fcla/etd/UFE0000618.

26

Auvil, Loretta Sue. "Problem specific environments for parallel scientific computing." Thesis, This resource online, 1992. http://scholar.lib.vt.edu/theses/available/etd-12042009-020030/.

27

Dick, Andrew J. "Object-oriented distributed and parallel I/O streams." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp03/MQ39189.pdf.

28

Gopinath, Prabha Shankar. "Programming and execution of object-based, parallel, hard, real-time applications /." The Ohio State University, 1988. http://rave.ohiolink.edu/etdc/view?acc_num=osu1487592050228249.

29

Oladele, Jean-David G. "Implementation of a Parallel Program, Program Generator." [Gainesville, Fla.] : University of Florida, 2002. http://purl.fcla.edu/fcla/etd/UFE0000583.

30

Ngo, Ton Anh. "The role of performance models in parallel programming and languages /." Thesis, Connect to this title online; UW restricted, 1997. http://hdl.handle.net/1773/6990.

31

Yue, Kwok B. (Kwok Bun). "Semaphore Solutions for General Mutual Exclusion Problems." Thesis, University of North Texas, 1988. https://digital.library.unt.edu/ark:/67531/metadc331970/.

Abstract:

Automatic generation of starvation-free semaphore solutions to general mutual exclusion problems is discussed. A reduction approach is introduced for recognizing edge-solvable problems, together with an O(N^2) algorithm for graph reduction, where N is the number of nodes. An algorithm for the automatic generation of starvation-free edge-solvable solutions is presented. The solutions are proved to be very efficient. For general problems, there are two ways to generate efficient solutions. One associates a semaphore with every node, the other with every edge. They are both better than the standard monitor-like solutions. Besides strong semaphores, solutions using weak semaphores, weaker semaphores and generalized semaphores are also considered. Basic properties of semaphore solutions are also discussed. Tools describing the dynamic behavior of parallel systems, as well as performance criteria for evaluating semaphore solutions are elaborated.

32

Kopek, Christopher Vincent. "Parallel intrusion detection systems for high speed networks using the divided data parallel method." Electronic thesis, 2007. http://dspace.zsr.wfu.edu/jspui/handle/10339/191.

33

Zhang, Yuan. "Static analyses and optimizations for parallel programs with synchronization." Access to citation, abstract and download form provided by ProQuest Information and Learning Company; downloadable PDF file, 169 p, 2008. http://proquest.umi.com/pqdweb?did=1601517931&sid=6&Fmt=2&clientId=8331&RQT=309&VName=PQD.

34

Kim, Eunkee. "Implementation patterns for parallel program and a case study." [Gainesville, Fla.] : University of Florida, 2002. http://purl.fcla.edu/fcla/etd/UFE0000552.

35

Valiveti, Natana. "Parallel computational geometry on Analog Hopfield Networks." Dissertation, Carleton University (Computer Science), Ottawa, 1992.

36

Velusamy, Vijay. "Adapting Remote Direct Memory Access based file system to parallel Input-/Output." Master's thesis, Mississippi State : Mississippi State University, 2003. http://library.msstate.edu/etd/show.asp?etd=etd-11112003-092209.

37

Landry, Kenneth D. "Instructional footprinting : a basis for exploiting concurrency through instructional decomposition and code motion /." Diss., This resource online, 1993. http://scholar.lib.vt.edu/theses/available/etd-06062008-165834/.

38

Chattopadhyay, Vaishali. "Distributed parallel computation using standard ML." Online access for everyone, 2007. http://www.dissertations.wsu.edu/Thesis/Fall2007/v_chattopadhyay_111607.pdf.

39

Deitz, Steven J. "High-level programming language abstractions for advanced and dynamic parallel computations /." Thesis, Connect to this title online; UW restricted, 2005. http://hdl.handle.net/1773/6967.

40

Chamberlain, Bradford L. "The design and implementation of a region-based parallel programming language /." Thesis, Connect to this title online; UW restricted, 2001. http://hdl.handle.net/1773/6953.

41

Berrios, Joseph Stephen. "Using wait-free synchronization to increase system reliability and performance." [Gainesville, Fla.]: University of Florida, 2002. http://purl.fcla.edu/fcla/etd/UFE0000506.

42

Chronaki, Catherine Eleftherios. "Parallelism in declarative languages /." Online version of thesis, 1990. http://hdl.handle.net/1850/10793.

43

Hakansson, Carolyn Ann. "The design and implementation of a parallel prolog opcode-interpreter on a multiprocessor architecture /." Full text open access at:, 1987. http://content.ohsu.edu/u?/etd,146.

44

Kondo, Boubacar. "An investigation of parallel algorithms developed for graph problems and their implementation on parallel computers." Virtual Press, 1991. http://liblink.bsu.edu/uhtbin/catkey/770951.

Abstract:

With the recent development of VLSI (Very Large Scale Integration) technology, research has increased considerably on the development of efficient parallel algorithms for solutions of practical graph problems. Varieties of algorithms have already been implemented on different models of parallel computers. But not too much is known yet about the question of which model of parallel computer will efficiently and definitely fit every graph problem. In this investigation the study will focus on a comparative analysis of speedup and efficiency of parallel algorithms with parallel model of computation, and with respect to some sequential algorithms.
Department of Computer Science

45

Wendelborn, Andrew Lawrence. "Data flow implementations of a lucid-like programming language." Title page, contents and summary only, 1985. http://web4.library.adelaide.edu.au/theses/09PH/09phw471.pdf.

46

Montagne, Euripides. "Program structures and computer architectures for parallel processing." Thesis, McGill University, 1985. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=65949.

47

馬家駒 and Ka-kui Ma. "Transparent process migration for parallel Java computing." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2001. http://hub.hku.hk/bib/B31226474.

48

Ma, Ka-kui. "Transparent process migration for parallel Java computing /." Hong Kong : University of Hong Kong, 2001. http://sunzi.lib.hku.hk/hkuto/record.jsp?B23589371.

49

Lindhult, Johan. "Operational Semantics for PLEX : A Basis for Safe Parallelization." Licentiate thesis, Västerås : School of Innovation, Design and Engineering, Mälardalen University, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-631.

50

Pereira, Marcio Machado 1959. "Scheduling and serialization techniques for transactional memories." [s.n.], 2015. http://repositorio.unicamp.br/jspui/handle/REPOSIP/275547.

Abstract:

Advisors: Guido Costa Souza de Araújo, José Nelson Amaral
Thesis (doctorate) - Universidade Estadual de Campinas, Instituto de Computação
Abstract: In the last few years, Transactional Memories (TMs) have been shown to be a parallel programming model that can effectively combine performance improvement with ease of programming. Moreover, the recent introduction of (H)TM-based ISA extensions, by major microprocessor manufacturers, also seems to endorse TM as a programming model for today's parallel applications. One of the central issues in designing Software TM (STM) systems is to identify mechanisms or heuristics that can minimize contention arising from conflicting transactions. Although a number of mechanisms have been proposed to tackle contention, such techniques have a limited scope, because conflict is avoided by either interrupting or serializing transaction execution, thus considerably impacting performance. This work explores a complementary approach to boost the performance of STM through the use of schedulers. A TM scheduler is a software component that decides when a particular transaction should be executed. Their effectiveness is very sensitive to the accuracy of the metrics used to predict transaction behaviour, particularly in high-contention scenarios. This work proposes a new Dynamic Transaction Scheduler (DTS) to select a transaction to execute next, based on a new policy that rewards success and an improved metric that measures the amount of effective work performed by a transaction. Hardware TMs (HTM) are an interesting mechanism to implement TM as they integrate the support for transactions at the lowest, most efficient, architectural level. On the other hand, for some applications, HTMs can have their performance hindered by the lack of scalability and by limitations in cache store capacity. This work presents an extensive performance study of the implementation of HTM in the Haswell generation of Intel x86 core processors. It evaluates the strengths and weaknesses of this new architecture by exploring several dimensions in the space of TM application characteristics. This detailed performance study provides insights on the constraints imposed by the Intel's Transaction Synchronization Extension (Intel's TSX) and introduces a simple, but efficient, serialization policy for guaranteeing forward progress on top of the best-effort Intel's HTM which was critical to achieving performance
Doctorate
Computer Science
Doctor of Computer Science
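
The abstract above mentions a serialization policy that guarantees forward progress on top of best-effort hardware transactions. The thesis's actual policy is not given here, so the following is only a generic sketch of the common retry-then-serialize pattern, under stated assumptions: `Aborted`, `run_transaction`, `attempt`, `fallback`, and `max_retries` are hypothetical names, and the "transaction" is simulated by an ordinary Python callable that may raise `Aborted`.

```python
import threading


class Aborted(Exception):
    """Raised by a (simulated) best-effort transaction attempt that fails."""


# Single global lock used only on the serialized fallback path (assumption).
_fallback_lock = threading.Lock()


def run_transaction(attempt, fallback, max_retries=3):
    """Try `attempt` up to `max_retries` times, then serialize.

    `attempt` models a best-effort transaction that may raise Aborted.
    `fallback` performs the same work non-speculatively; it runs under a
    global lock, so it always completes and forward progress is guaranteed.
    """
    for _ in range(max_retries):
        try:
            return attempt()
        except Aborted:
            continue  # speculative attempt failed, retry
    with _fallback_lock:
        return fallback()
```

On real hardware the speculative path would also have to check that the fallback lock is free, so that a serialized execution and a concurrent transaction cannot interleave; the simulation omits this.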
