Dissertations / Theses on the topic 'Speculative Architecture'

Consult the top 50 dissertations / theses for your research on the topic 'Speculative Architecture.'

1

Xekalakis, Polychronis. "Mixed speculative multithreaded execution models." Thesis, University of Edinburgh, 2010. http://hdl.handle.net/1842/3282.

Abstract:
The current trend toward chip multiprocessor architectures has placed great pressure on programmers and compilers to generate thread-parallel programs. Improved execution performance can no longer be obtained via traditional single-thread instruction level parallelism (ILP), but, instead, via multithreaded execution. One notable technique that facilitates the extraction of parallel threads from sequential applications is thread-level speculation (TLS). This technique allows programmers/compilers to generate threads without checking for inter-thread data and control dependences, which are then transparently enforced by the hardware. Most prior work on TLS has concentrated on thread selection and on mechanisms to efficiently support the main TLS operations, such as squashes, data versioning, and commits. This thesis seeks to enhance TLS functionality by combining it with other speculative multithreaded execution models. The main idea is that TLS already requires extensive hardware support, which, when slightly augmented, can accommodate other speculative multithreaded techniques. Recognizing that for different applications, or even program phases, the application bottlenecks may differ, it is reasonable to assume that the more versatile a system is, the more efficiently it will be able to execute the given program. As mentioned above, generating thread-parallel programs is hard, and TLS has been suggested as an execution model that can speculatively exploit thread-level parallelism (TLP) even when thread independence cannot be guaranteed by the programmer/compiler. Alternatively, the helper threads (HT) execution model has been proposed, where subordinate threads are executed in parallel with a main thread in order to improve the execution efficiency (i.e., ILP) of the latter. Yet another execution model, runahead execution (RA), has also been proposed, where subordinate versions of the main thread are dynamically created especially to cope with long-latency operations, again with the aim of improving the execution efficiency of the main thread (ILP). Each one of these multithreaded execution models works best for different applications and application phases. We combine these three models into a single execution model and single hardware infrastructure such that the system can dynamically adapt to find the most appropriate multithreaded execution model. More specifically, TLS is favored whenever successful parallel execution of instructions in multiple threads (i.e., TLP) is possible, and the system can seamlessly transition at run-time to the other models otherwise. In order to understand the tradeoffs involved, we also develop a performance model that allows one to quantitatively attribute overall performance gains to either TLP or ILP in such a combined multithreaded execution model. Experimental results show that our combined execution model achieves speedups of up to 41.2%, with an average of 10.2%, over an existing state-of-the-art TLS system and speedups of up to 35.2%, with an average of 18.3%, over a flavor of runahead execution for a subset of the SPEC2000 Integer benchmark suite.

We then investigate how a common ILP-enhancing microarchitectural feature, namely branch prediction, interacts with TLS. We show that branch prediction for TLS is even more important than it is for single core machines. Unfortunately, branch prediction for TLS systems is also inherently harder. Code partitioning and re-executions of squashed threads pollute the branch history, making it harder for predictors to be accurate. We thus propose to augment the hardware so as to accommodate Multi-Path (MP) execution within the existing TLS protocol. Under the MP execution model, all paths following a number of hard-to-predict conditional branches are followed. MP execution thus removes branches that would otherwise have been mispredicted, helping the processor exploit more ILP. We show that with only minimal hardware support, one can combine these two execution models into a unified one, which can achieve far better performance than both TLS and MP execution. Experimental results show that our combined execution model achieves speedups of up to 20.1%, with an average of 8.8%, over an existing state-of-the-art TLS system and speedups of up to 125%, with an average of 29.0%, when compared with multi-path execution for a subset of the SPEC2000 Integer benchmark suite.

Finally, since systems that support speculative multithreading usually treat all threads equally, they are energy-inefficient. This inefficiency stems from the fact that speculation occasionally fails and, thus, power is spent on threads that will have to be discarded. We propose a profitability-based power allocation scheme, where we "steal" power from non-profitable threads and use it to speed up more useful ones. We evaluate our techniques for a state-of-the-art TLS system and show that, with minimal hardware support, we achieve improvements in the energy-delay product (ED) of up to 25.5%, with an average of 18.9%, for a subset of the SPEC2000 Integer benchmark suite.
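
The squash-and-commit discipline that underpins all of these TLS variants can be made concrete with a small sketch. The Python toy below is illustrative only: the two-phase structure and the value-based violation check are simplifying assumptions, not the thesis's hardware protocol. Tasks execute optimistically against a memory snapshot, then commit in program order; a task whose speculative reads turn out to be stale is squashed and re-executed.

```python
# Toy illustration of the TLS squash/commit discipline.

def execute(task, memory):
    """Run `task`, recording the values it read and buffering its writes."""
    reads, writes = {}, {}
    def load(addr):
        value = writes.get(addr, memory.get(addr, 0))  # forward own writes
        reads.setdefault(addr, value)
        return value
    def store(addr, value):
        writes[addr] = value
    task(load, store)
    return reads, writes

def run_tls(tasks):
    memory, squashes = {}, 0
    speculative = [execute(t, dict(memory)) for t in tasks]  # optimistic pass
    for task, (reads, writes) in zip(tasks, speculative):
        if any(memory.get(a, 0) != v for a, v in reads.items()):
            squashes += 1                          # stale input: squash ...
            reads, writes = execute(task, memory)  # ... and re-execute
        memory.update(writes)                      # commit in program order
    return memory, squashes

# Four loop iterations increment a shared counter: every iteration after
# the first speculated on a stale value and is squashed exactly once.
tasks = [lambda ld, st: st("x", ld("x") + 1) for _ in range(4)]
print(run_tls(tasks))   # ({'x': 4}, 3)
```

The usage example is the pathological case: every iteration after the first is squashed once. A real TLS system bounds this cost with the thread-selection and profitability-based policies the abstract describes.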
2

Lindskog, Ellen. "Danvikens Hospital - A speculative investigation." Thesis, KTH, Arkitektur, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-298799.

Abstract:
This thesis project is an investigation of time and endurance, temporality and change. It is an intuitive journey through the past. By collecting, redrawing and speculating I have explored the past and present of a site and a building. The hospital institution Danvikens Hospital dates back to the 16th century, and the current building from 1719 can be seen as the first monumental hospital building in Sweden. The site went through irreversible infrastructural changes at the beginning of the 20th century, with the excavation of the Hammarby canal. Following this, the building has had short-lived identities as an archive, a workshop and a hotel. This project works with findings from the past, using textile to create a new identity and life for the building.
3

Weate, Jeremy. "Phenomenology and difference : the body, architecture and race." Thesis, University of Warwick, 1998. http://wrap.warwick.ac.uk/2472/.

Abstract:
The aim of the thesis is to consider the position of phenomenology in contemporary thought in order to argue that only on its terms can a political ontology of difference be thought. To inaugurate this project I begin by questioning Heidegger's relation to phenomenology. I take issue with the way that Heidegger privileges time over space in "Being and Time". In this way, the task of the thesis is clarified as the need to elaborate a spatio-temporal phenomenology. After re-situating Heidegger's failure in this respect within a Kantian background, I suggest that the phenomenological grounding of difference must work through the body. I contend that the body is the ontological site of both the subject and the object. I use Whitehead and Merleau-Ponty to explore the ramifications of this thesis. I suggest first of all that architecture should be grounded ontologically in the body, and as such avoids being a 'master discourse'. Secondly, by theorising the body and world as reciprocally transformative, my reading of Merleau-Ponty emphasises the ways in which his thinking opens up a phenomenology of embodied difference. It is on the basis of these themes that I develop this thinking in the direction of race, exploring the dialectics of visibility and invisibility in the work of Frantz Fanon and James Baldwin. I argue that embodied difference attests to variations in the agent's freedom to act in the world. If freedom is understood through Merleau-Ponty as being the embodied ground of historicity, we must ask after unfreedom. I suggest that the "flesh" ontology of a pre-thetic community should be rethought as a regulative ideal, the ideal of a justice that can never be given. In this light, phenomenology becomes as much a poetics. Beyond being thought of as conservative, phenomenology henceforth unleashes the possibility of thinking a transformative embodied agency.
4

Li, Wentong. "High Performance Architecture using Speculative Threads and Dynamic Memory Management Hardware." Thesis, University of North Texas, 2007. https://digital.library.unt.edu/ark:/67531/metadc5150/.

Abstract:
With the advances in very large scale integration (VLSI) technology, hundreds of billions of transistors can be packed into a single chip. With the increased hardware budget, how to take advantage of available hardware resources becomes an important research area. Some researchers have shifted from the control-flow von Neumann architecture back to dataflow architectures in order to explore scalable designs leading to multi-core systems with several hundred processing elements. In this dissertation, I address how the performance of modern processing systems can be improved while attempting to reduce hardware complexity and energy consumption. My research described here tackles both central processing unit (CPU) performance and memory subsystem performance. More specifically, I describe my research related to the design of an innovative decoupled multithreaded architecture that can be used in multi-core processor implementations. I also address how memory management functions can be off-loaded from processing pipelines to further improve system performance and eliminate cache pollution caused by runtime management functions.
5

Li, Wentong, and Krishna M. Kavi. "High performance architecture using speculative threads and dynamic memory management hardware." [Denton, Tex.] : University of North Texas, 2007. http://digital.library.unt.edu/permalink/meta-dc-5150.

6

Weingarten, Lauren Ariel. "Building on the edge of reason : the Institute for Speculative Science, M.I.T." Thesis, Massachusetts Institute of Technology, 1989. http://hdl.handle.net/1721.1/79003.

Abstract:
Thesis (M. Arch.)--Massachusetts Institute of Technology, Dept. of Architecture, 1989.
This thesis sprang from a fascination with and respect for speculative thought in science. One need walk no further than down the "infinite corridor" to feel the pulse of experimentation that has earned M.I.T. the reputation of being one of the principal research institutes of the world. But even with the commitment to research found at M.I.T., market demands point toward specialization. This creates an environment where speculative science in its classic sense cannot occur. It is the aim of this project to reconnect contemporary higher science with ancient ideas of a unified world view. To do this, the scientists at M.I.T. will be provided with a physically different environment from the everyday scientific workplace, with its gadgetry and budgetary constraints, if even for a short time. The site where I propose to locate the Institute for Speculative Science is removed from Boston but is close enough to remain in the scientists' frame of reference. Located between Quincy and Thompson Island in Boston Harbor, the Institute for Speculative Science builds on the tension between built and natural states: urban to exurban; mainland to island; rock to water. This project will occupy the space between the mainland (Quincy) and the surrounding islands, currently treated as an orphan of the big city. From the vantage point of the Institute for Speculative Science, you can see the Boston skyline but Boston cannot see you. You walk on a natural and wild beach, but this beach is littered with debris, relics of Boston's recent industrial past. The models and drawings in this book are scaled representations of the proposed project; they are, however, drawn and made as worlds in their own right, which the participant in this book can experience. To communicate the idea of a building on the edge of reason, I have sought to bring to light a project that you feel before you understand.
by Lauren Ariel Weingarten.
M.Arch.
7

Estill, Alexander Clayton. "Vitruvian Delight: Customization within the Speculative Model." University of Cincinnati / OhioLINK, 2005. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1129233879.

8

Khan, Salman. "Putting checkpoints to work in thread level speculative execution." Thesis, University of Edinburgh, 2010. http://hdl.handle.net/1842/4676.

Abstract:
With the advent of Chip Multi Processors (CMPs), improving performance relies on the programmers/compilers to expose thread level parallelism to the underlying hardware. Unfortunately, this is a difficult and error-prone process for programmers, while state-of-the-art compiler techniques are unable to provide significant benefits for many classes of applications. An interesting alternative is offered by systems that support Thread Level Speculation (TLS), which relieve the programmer and compiler from checking for thread dependencies and instead use the hardware to enforce them. Unfortunately, data misspeculation results in a high cost, since all the intermediate results have to be discarded and threads have to roll back to the beginning of the speculative task. For this reason, intermediate checkpointing of the state of the TLS threads has been proposed. When a violation does occur, we now have to roll back only to a checkpoint before the violating instruction and not to the start of the task. However, previous work omits study of the microarchitectural details and implementation issues that are essential for effective checkpointing. Further, checkpoints have only been proposed and evaluated for a narrow class of benchmarks. This thesis studies checkpoints on a state-of-the-art TLS system running a variety of benchmarks. The mechanisms required for checkpointing and their associated costs are described. Hardware modifications required for making checkpointed execution efficient in time and power are proposed and evaluated. Further, the need for accurately identifying suitable points for placing checkpoints is established. Various techniques for identifying these points are analysed in terms of both effectiveness and viability. This includes an extensive evaluation of data dependence prediction techniques. The results show that checkpointing thread-level speculative execution results in consistent power savings and, for many benchmarks, leads to speedups as well.
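
A small worked sketch shows why checkpoint placement matters. In the Python toy below, the uniform checkpoint interval and the abstract instruction-count model are assumptions for illustration, not the thesis's placement policies: a violation forces re-execution only from the last checkpoint at or before the violating instruction.

```python
# Toy model of rollback cost in a checkpointed TLS task: without
# checkpoints, a violation discards the whole task; with periodic
# checkpoints, only the work since the last safe checkpoint is lost.

def reexecuted_instructions(task_length, violation_at, interval):
    """Instructions that must be re-run after one violation."""
    checkpoints = range(0, task_length, interval)   # checkpointed indices
    last_safe = max(c for c in checkpoints if c <= violation_at)
    return violation_at - last_safe + 1

print(reexecuted_instructions(512, 470, 512))  # 471: roll back to task start
print(reexecuted_instructions(512, 470, 64))   # 23: roll back to index 448
```

Each checkpoint also consumes storage, time and power, which is why the thesis argues for accurately identifying a few suitable points (e.g., guided by data dependence prediction) rather than checkpointing at a blind fixed interval.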
9

Ioannou, Nikolas. "Complementing user-level coarse-grain parallelism with implicit speculative parallelism." Thesis, University of Edinburgh, 2012. http://hdl.handle.net/1842/7900.

Abstract:
Multi-core and many-core systems are the norm in contemporary processor technology and are expected to remain so for the foreseeable future. Parallel programming is thus here to stay, and programmers have to embrace it if they are to exploit such systems for their applications. Programs using parallel programming primitives like PThreads or OpenMP often exploit coarse-grain parallelism, because it offers a good trade-off between programming effort and performance gain. Some parallel applications show limited or no scaling beyond a certain number of cores. Given the abundant number of cores expected in future many-cores, several cores would remain idle in such cases while execution performance stagnates. This thesis proposes using cores that do not contribute to performance improvement for running implicit fine-grain speculative threads. In particular, we present a many-core architecture and protocols that allow applications with coarse-grain explicit parallelism to further exploit implicit speculative parallelism within each thread. We show that complementing parallel programs with implicit speculative mechanisms offers significant performance improvements for a large and diverse set of parallel benchmarks. Implicit speculative parallelism frees the programmer from the additional effort of explicitly partitioning the work into finer and properly synchronized tasks. Our results show that, for a many-core comprising 128 cores supporting implicit speculative parallelism in clusters of 2 or 4 cores, performance improves on top of the highest scalability point by 44% on average for the 4-core cluster and by 31% on average for the 2-core cluster. We also show that this approach often leads to better performance and energy efficiency compared to existing alternatives such as Core Fusion and Turbo Boost. Moreover, we present a dynamic mechanism to choose the number of explicit and implicit threads, which performs within 6% of the static oracle selection of threads. To improve energy efficiency, processors allow for Dynamic Voltage and Frequency Scaling (DVFS), which enables changing their performance and power consumption on the fly. We evaluate the amenability of the proposed explicit-plus-implicit threads scheme to traditional power management techniques for multithreaded applications and identify room for improvement. We thus augment prior schemes and introduce a novel multithreaded power management scheme that accounts for implicit threads and aims to minimize the energy-delay-squared product (ED²). Our scheme comprises two components: a "local" component that tries to adapt to the different program phases on a per-explicit-thread basis, taking into account implicit thread behavior, and a "global" component that augments the local components with information regarding inter-thread synchronization. Experimental results show a reduction in ED² of 8% compared to having no power management, with an average reduction in power of 15% that comes at a minimal loss of performance of less than 3% on average.
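
As a quick sanity check of how those last three figures fit together, assume the simple model E = P·D, so that ED² is proportional to P·D³. The arithmetic below is an illustration under that assumed model, not a calculation taken from the thesis:

```python
# Relative power and delay under the proposed power manager, per the
# abstract: ~15% less power at <=3% performance loss. Under the assumed
# model ED^2 ~ P * D^3, this lands near the reported 8% ED^2 reduction.
P, D = 0.85, 1.03
print(1 - P * D**3)   # ~0.07, i.e. roughly a 7-8% ED^2 reduction
```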
10

Duan, Kewei. "Resource-oriented architecture based scientific workflow modelling." Thesis, University of Bath, 2016. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.698986.

Abstract:
This thesis studies the feasibility and methodology of applying state-of-the-art computer technology to scientific workflow modelling within a collaborative environment. The collaborative environment implies that the people involved include scientists and engineers from disciplines other than computer science. The objective of this research is to provide a systematic, web-based methodology for lowering the barriers raised by the heterogeneity of multiple institutions, multiple platforms and geographically distributed resources that is implied in the collaborative environment of scientific workflow.
11

Jimborean, Alexandra. "Adapting the polytope model for dynamic and speculative parallelization." PhD thesis, Université de Strasbourg, 2012. http://tel.archives-ouvertes.fr/tel-00733850.

Abstract:
In this thesis, we present a Thread-Level Speculation (TLS) framework whose main feature is to speculatively parallelize a sequential loop nest in various ways, to maximize performance. We perform code transformations by applying the polyhedral model that we adapted for speculative and runtime code parallelization. For this purpose, we designed a parallel code pattern which is patched by our runtime system according to the profiling information collected on some execution samples. We show on several benchmarks that our framework yields good performance on codes which could not be handled efficiently by previously proposed TLS systems.
12

Harmon, Justin L. "The Normative Architecture of Reality: Towards an Object-Oriented Ethics." UKnowledge, 2016. http://uknowledge.uky.edu/philosophy_etds/9.

Abstract:
The fact-value distinction has structured and still structures ongoing debates in metaethics, and all of the major positions in the field (expressivism, cognitivist realism, and moral error theory) subscribe to it. In contrast, I claim that the fact-value distinction is a contingent product of our intellectual history and a prime object for questioning. The most forceful reason for rejecting the distinction is that it presupposes a problematic understanding of the subject-object divide whereby one tends to view humans as the sole source of normativity in the world. My dissertation aims to disclose the background against which human ethical praxis is widely seen as a unique and special phenomenon among other phenomena. I show that ethical norms, as delimited by utilitarianism, deontology, virtue ethics, etc., derive from an originary proto-ethical normativity at the heart of the real itself. Every object, human and nonhuman, presents itself as a bottomless series of cues or conditions of appropriateness that determine adequate and inadequate ways of relating to it. That is, objects demand something from other objects if they are to be related to; they condition other objects by soliciting a change in disposition, perception, or sense, and for this reason are sources of normativity in and unto themselves. Ethical norms, or values, are the human expression of the adequacy conditions with which all objects show themselves. In the post-Kantian landscape it is widely thought that human finitude constitutes the origin of ethical norms. Consequently, the world is divided up into morally relevant agents (humans) on one side, and everything else on the other. Adopting a deflationary view of agency, I argue that human-human and human-world relations differ from other relations in degree rather than kind. Thus, instead of a fact-value distinction, value is inextricably bound up with the factual itself. The critical upshot of my project is that traditional subject-oriented ethical theories have served to conceal the real demands of non-human objects (such as animals, plants, microorganisms, and artificially intelligent machines) in favor of specifically human interests. Such theories have also been leveraged frequently in exclusionary practices with respect to different groups within the human community (e.g. women and those of non-European descent) based on arbitrary criteria or principles.
13

Altringer, Beth. "The intended and actual impacts of mega-events : an international comparative study on mega-event hosting and a speculative review of South Africa's preparations for the 2010 Football World Cup." Master's thesis, University of Cape Town, 2006. http://hdl.handle.net/11427/5602.

14

Madriles, Gimeno Carles. "Mitosis based speculative multithreaded architectures." Doctoral thesis, Universitat Politècnica de Catalunya, 2012. http://hdl.handle.net/10803/124709.

Abstract:
In the last decade, industry made a right-hand turn and shifted towards multi-core processor designs, also known as Chip Multi-Processors (CMPs), in order to provide further performance improvements under a reasonable power budget, design complexity, and validation cost. Over the years, several processor vendors have come out with multi-core chips in their product lines, and they have become mainstream, with the number of cores increasing in each processor generation. Multi-core processors improve the performance of applications by exploiting Thread Level Parallelism (TLP), while the Instruction Level Parallelism (ILP) exploited by each individual core is limited. These architectures are very efficient when multiple threads are available for execution. However, single-thread sections of code (single-thread applications and serial sections of parallel applications) pose important constraints on the benefits achieved by parallel execution, as pointed out by Amdahl's law. Parallel programming, even with the help of recently proposed techniques like transactional memory, has proven to be a very challenging task. On the other hand, automatically partitioning applications into threads may be a straightforward task in regular applications, but becomes much harder for irregular programs, where compilers usually fail to discover sufficient TLP. In this scenario, two main directions have been followed in the research community to benefit from multi-core platforms: Speculative Multithreading (SpMT) and non-speculative clustered architectures. The former splits a sequential application into speculative threads, while the latter partitions the instructions among the cores based on data dependences but avoids a large degree of speculation. Despite the large amount of research on both of these approaches, the techniques proposed so far have shown marginal performance improvements. In this thesis we propose novel schemes to speed up sequential or lightly threaded applications on multi-core processors that effectively address the main unresolved challenges of previous approaches. In particular, we propose a SpMT architecture, called Mitosis, that leverages a powerful software value prediction technique to manage inter-thread dependences, based on pre-computation slices (p-slices). Thanks to the accuracy and low cost of this technique, Mitosis is able to effectively parallelize applications even in the presence of frequent dependences among threads. We also propose a novel architecture, called Anaphase, that combines the best of SpMT schemes and clustered architectures. Anaphase effectively exploits ILP, TLP and Memory Level Parallelism (MLP), thanks to its unique fine-grain thread decomposition algorithm that adapts to the available parallelism in the application.
15

Abeydeera, Maleen Hasanka (Weeraratna Patabendige Maleen Hasanka). "Optimizing throughput architectures for speculative parallelism." Thesis, Massachusetts Institute of Technology, 2017. http://hdl.handle.net/1721.1/111930.

Abstract:
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 57-62).
Throughput-oriented architectures, like GPUs, use a large number of simple cores and rely on application-level parallelism, using multithreading to keep the cores busy. These architectures work well when parallelism is plentiful but work poorly when it is not. Therefore, it is important to combine these techniques with other hardware support for parallelizing challenging applications. Recent work has shown that speculative parallelism is plentiful for a large class of applications that have traditionally been hard to parallelize. However, adding hardware support for speculative parallelism to a throughput-oriented system leads to a severe pathology: aborted work consumes scarce resources and hurts the throughput of useful work. This thesis develops a technique to optimize throughput-oriented architectures for speculative parallelism: tasks should be prioritized according to how speculative they are. This focuses resources on work that is more likely to commit, reducing aborts and using speculation resources more efficiently. We identify two on-chip resources where this prioritization is most likely to help: the core pipeline and the memory controller. First, this thesis presents speculation-aware multithreading (SAM), a simple policy that modifies a multithreaded processor pipeline to prioritize instructions from less speculative tasks. Second, we modify the on-chip memory controller to prioritize requests issued by tasks that are earlier in the conflict resolution order. We evaluate SAM on systems with up to 64 SMT cores. With SAM, 8-threaded in-order cores outperform single-threaded cores by 2.41x on average, while a speculation-oblivious policy yields a 1.91x speedup. SAM also reduces wasted work by 43%. Unlike at the core, we find little performance benefit from prioritizing requests at the memory controller. The reason is that speculative execution works as a very effective prefetching mechanism, and most requests, even those from tasks that are ultimately aborted, do end up being useful.
by Weeraratna Patabendige Maleen Hasanka Abeydeera.
S.M.
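
The core of the SAM policy is simple enough to sketch in a few lines. In this Python toy, the thread records and the one-instruction-per-cycle issue model are assumptions for illustration, not the thesis's pipeline: each cycle, the pipeline issues from the ready hardware thread whose task is earliest in the conflict-resolution order, i.e., least speculative.

```python
# Sketch of a speculation-aware issue policy in the spirit of SAM.

def sam_issue(threads):
    """Pick the least speculative ready thread, or None if all stalled."""
    ready = [t for t in threads if t["ready"]]
    return min(ready, key=lambda t: t["timestamp"]) if ready else None

threads = [
    {"id": 0, "timestamp": 17, "ready": True},   # deeply speculative
    {"id": 1, "timestamp": 4,  "ready": True},   # earliest: likely to commit
    {"id": 2, "timestamp": 9,  "ready": False},  # stalled on a cache miss
]
print(sam_issue(threads)["id"])   # -> 1
```

A speculation-oblivious pipeline would instead rotate round-robin over threads 0 and 1, spending half its issue slots on work that is more likely to be aborted.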
16

Runberger, Jonas. "Architectural Prototypes II : Reformations, Speculations and Strategies in the Digital Design Field." Doctoral thesis, KTH, Projektkommunikation, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-95188.

Abstract:
This doctoral thesis is situated within the digital design field of architecture, and is a continuation of the licentiate thesis Architectural Prototypes: Modes of Design Development and Architectural Practice, presented at the KTH School of Architecture in 2008. The doctoral thesis investigates the current status of the digital design field of architecture, and identifies a number of related discourses. Within this field, it identifies a period of formation, which in recent years has turned into a process of reformation. It contributes to this ongoing reformation by proposing two alternate areas of future practice and research within the field. A speculative approach is considered to be important for a continued mode of exploration within the field, and is suggested as a way to bring new scope to the digital design field. A number of key terms from the field of science fiction studies have been investigated to support the construction of a speculative framework for further development. A strategic approach is regarded as crucial to how the new design potentials that have emerged within the digital design field can be implemented in general architectural practice, and to further informing the field itself. Key concepts have been imported from the field of strategic management in the formulation of a framework for digital design strategies. The notion of the prototype, as explored in the previous licentiate thesis, resurfaces as a prototypical approach, which could be equally employed in the speculative approach and the strategic approach. The doctoral thesis is also situated within the field of research-by-design, in the way architectural design projects have been facilitated as contextualized experiments, selected, documented and aligned in regard to terminology, and analyzed through a series of design project enquiries.
17

Wamhoff, Jons-Tobias. "Exploiting Speculative and Asymmetric Execution on Multicore Architectures." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2015. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-163250.

Abstract:
The design of microprocessors is undergoing radical changes that affect the performance and reliability of hardware and will have a high impact on software development. Future systems will depend on a deep collaboration between software and hardware to cope with current and predicted system design challenges. Instead of higher frequencies, the number of processor cores per chip is growing. Eventually, processors will be composed of cores that run at different speeds or support specialized features to accelerate critical portions of an application. Performance improvements in software will only result from increasing parallelism and introducing asymmetric processing. At the same time, substantial enhancements in the energy efficiency of hardware are required to make use of the increasing transistor density. Unfortunately, the downscaling of transistor size and power will degrade the reliability of the hardware, which must be compensated for by software. In this thesis, we present new algorithms and tools that exploit speculative and asymmetric execution to address the performance and reliability challenges of multicore architectures. Our solutions facilitate both the assimilation of software to the changing hardware properties and the adjustment of hardware to the software it executes. We use speculation based on transactional memory to improve the synchronization of multi-threaded applications. We show that shared memory synchronization must not only be scalable to large numbers of cores but also robust, such that it can guarantee progress in the presence of hardware faults. Therefore, we streamline transactional memory for better throughput and add fault tolerance mechanisms with reduced overhead by speculating optimistically on an error-free execution. If hardware faults are present, they can manifest either as single event upsets or as crashes and misbehavior of threads. We address the former by applying transactions to checkpoint and replicate the state such that threads can correct and continue their execution. The latter is tackled by extending the synchronization such that it can tolerate crashes and misbehavior of other threads. We improve the efficiency of transactional memory by enabling a lightweight thread that always wins conflicts and significantly reduces the overheads. Further performance gains are possible by exploiting the asymmetric properties of applications. We introduce an asymmetric instrumentation of transactional code paths to enable applications to adapt to the underlying hardware. With explicit frequency control of individual cores, we show how applications can expose their possibly asymmetric computing demand and dynamically adjust the hardware to make more efficient usage of the available resources.
18

Black, Michael David. "Applying perceptrons to speculation in computer architecture." College Park, Md. : University of Maryland, 2007. http://hdl.handle.net/1903/6725.

Abstract:
Thesis (Ph. D.) -- University of Maryland, College Park, 2007.
Thesis research directed by: Electrical Engineering. Title from t.p. of PDF. Includes bibliographical references. Published by UMI Dissertation Services, Ann Arbor, Mich. Also available in paper.
19

Subramanian, Suvinay. "Architectural techniques to unlock ordered and nested speculative parallelism." Thesis, Massachusetts Institute of Technology, 2018. https://hdl.handle.net/1721.1/121729.

Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 129-144).
Current multicores suffer from two major limitations: they can only exploit a fraction of the parallelism available in applications, and they are very hard to program. This is because they are limited to programs with coarse-grained tasks that synchronize infrequently. However, many applications have abundant parallelism when divided into small tasks (of a few tens to hundreds of instructions each). Current systems cannot exploit this fine-grained parallelism because synchronization and task management overheads overwhelm the benefits of parallelism. This thesis presents novel techniques that tackle the scalability and programmability issues of current multicores. First, Swarm is a parallel architecture that makes fine-grained parallelism practical by leveraging order as a general synchronization primitive. Swarm programs consist of tasks with programmer-specified order constraints. Swarm hardware provides support for fine-grained task management, and executes tasks speculatively and out of order to scale. Second, Fractal extends Swarm to harness nested speculative parallelism, which is crucial to scale large, complex applications and to compose parallel speculative algorithms. Third, Amalgam makes more efficient use of speculation resources by splitting and merging address set signatures to create fixed-size units of speculative work. Amalgam can improve performance and reduce implementation costs. Together, these techniques unlock abundant fine-grained parallelism in applications from a broad set of domains, including graph analytics, databases, machine learning, and discrete-event simulation. At 256 cores, our system is 40x to 512x faster than a single-core system and outperforms state-of-the-art software-only parallel algorithms by one to two orders of magnitude. Besides achieving near-linear scalability, the resulting programs are almost as simple as their sequential counterparts, as they do not use explicit synchronization.
by Suvinay Subramanian.
Ph. D.
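
The ordered-task model that Swarm and Fractal build on has a compact sequential reference semantics, sketched below in Python. The enqueue interface and the toy shortest-path-style example are illustrative assumptions; Swarm executes such tasks speculatively in hardware, but the committed result must match this in-order run.

```python
# Sequential reference semantics for programmer-ordered tasks: tasks
# carry timestamps, may enqueue child tasks, and are (logically)
# executed in timestamp order.
import heapq

def run_ordered(initial_tasks):
    pq = list(initial_tasks)            # entries: (timestamp, seq, fn, args)
    heapq.heapify(pq)
    seq = len(pq)                       # tiebreaker so fns are never compared
    def enqueue(ts, fn, *args):
        nonlocal seq
        heapq.heappush(pq, (ts, seq, fn, args)); seq += 1
    while pq:
        ts, _, fn, args = heapq.heappop(pq)
        fn(enqueue, ts, *args)          # a task may create child tasks

# Example: a toy timestamp-ordered graph traversal.
dist = {}
graph = {0: [(1, 3), (2, 1)], 1: [], 2: [(1, 1)]}
def visit(enqueue, ts, node):
    if node not in dist:                # earliest-timestamp visit wins
        dist[node] = ts
        for nbr, w in graph[node]:
            enqueue(ts + w, visit, nbr)
run_ordered([(0, 0, visit, (0,))])
print(dist)                             # {0: 0, 2: 1, 1: 2}
```

Because order is the only synchronization primitive, the task bodies need no locks; the hardware's job is to extract parallelism while preserving this timestamp order.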
20

Dou, Jialin. "A compiler cost model for speculative multithreading chip-multiprocessor architectures." Thesis, University of Edinburgh, 2006. http://hdl.handle.net/1842/24532.

Abstract:
This thesis proposes a novel compiler static cost model of speculative multithreaded execution that can be used to predict the resulting performance. This model attempts to predict the expected speedups, or slowdowns, of the candidate speculative sections based on an estimation of the combined run-time effects of the various speculation overheads, taking into account the scheduling restrictions of most speculative execution environments. The model is based on estimating the likely execution duration of threads and considers all the possible permutations of these threads when scheduled on a multiprocessor. The proposed cost model was implemented in a research compiler framework. The model seamlessly uses the compiler's intermediate representation and integrates with the control and data flow analyses. The resulting framework was tested and evaluated on a collection of SPEC benchmarks, which include large real-world scientific and engineering applications. The framework was found to be very stable and efficient, with moderate compilation times. Initially, the proposed framework is evaluated on a number of loops that suffer mainly from load imbalance and thread dispatch and commit overheads. Experimental results show that the framework can identify on average 68% of the loops that cause slowdowns and on average 97% of the loops that lead to speedups. In fact, the framework predicts the speedups or slowdowns with an error of less than 20% for an average of 44% of the loops across the benchmarks, and with an error of less than 50% for an average of 84% of the loops. Overall, the framework leads to a performance improvement of 5% on average, and as high as 38%, over a naïve approach that attempts to speculatively parallelize all the loops considered. The proposed framework is also evaluated on loops that may suffer from data dependence violations. Experimental results with all loops show that prediction accuracy is lower when loops with violations are included. Nevertheless, accuracy is still very high for a static model.
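
To make the flavor of such a cost model concrete, here is a deliberately simplified Python sketch. The round-robin schedule, the single-squash penalty term and all parameter values are assumptions for illustration; Dou's model is considerably more detailed and enumerates thread permutations.

```python
# Toy static cost model for a speculatively parallelized loop: predicted
# speedup = sequential cycles / longest per-core schedule, where each
# thread also pays dispatch/commit overheads plus an expected one-shot
# re-execution cost if it is squashed. All constants are illustrative.

def predicted_speedup(thread_cycles, n_cores, dispatch, commit, p_squash):
    seq = sum(thread_cycles)
    cores = [0] * n_cores
    for i, c in enumerate(thread_cycles):        # naive round-robin schedule
        cores[i % n_cores] += c + dispatch + commit + p_squash * c
    return seq / max(cores)

# A balanced loop is predicted to speed up; an imbalanced one to slow down:
print(round(predicted_speedup([100] * 8, 4, 10, 10, 0.05), 2))           # 3.2
print(round(predicted_speedup([100, 900, 50, 40], 4, 10, 10, 0.30), 2))  # 0.92
```

A compiler armed with such a prediction can decline to speculate on loops like the second one, which is precisely the filtering that yields the average gain over the naïve parallelize-everything approach reported above.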
21

Yiapanis, Paraskevas. "High performance optimizations in runtime speculative parallelization for multicore architectures." Thesis, University of Manchester, 2013. https://www.research.manchester.ac.uk/portal/en/theses/high-performance-optimizations-in-runtime-speculative-parallelization-for-multicore-architectures(1d980957-a08d-49f0-8f7f-d5c35bae38b8).html.

Abstract:
Thread-Level Speculation (TLS) overcomes limitations intrinsic to conservative compile-time auto-parallelizing tools by extracting parallel threads optimistically and only ensuring the absence of data dependence violations at runtime. A significant barrier to adopting TLS (implemented in software) is the overhead associated with maintaining speculative state. Previous TLS limit studies observe that, on future multi-core systems, more cores are likely to sit idle than traditional TLS would be able to harness. This thesis describes a novel compact version management data structure optimized for space overhead when using a small number of TLS threads. Furthermore, two novel software runtime parallelization systems were developed that utilize this compact data structure. The first one, MiniTLS, is optimized for fast recovery in the case of misspeculation by parallelizing the recovery procedure. The second one, Lector, is optimized for performance by using lightweight helper threads, alongside TLS threads, to establish whether speculation can be withdrawn, thereby avoiding speculative overheads. Facilitated by the novel compact representation, MiniTLS reduces the space overhead over state-of-the-art software TLS systems by between 96% on 2 threads and 40% on 32 threads. MiniTLS and Lector were applied to seven Java benchmarks, performing on average 7x and 8.2x faster, respectively, than the sequential versions and on average 1.7x faster than the current state-of-the-art in software TLS for 32 threads.
22

Perais, Arthur. "Increasing the performance of superscalar processors through value prediction." Thesis, Rennes 1, 2015. http://www.theses.fr/2015REN1S070/document.

Abstract:
Although currently available general purpose microprocessors feature more than 10 cores, many programs remain mostly sequential. This can either be due to an inherent property of the algorithm used by the program, to the program being old and written during the uni-processor era, or simply to time-to-market constraints, as writing and validating parallel code is known to be hard. Moreover, even for parallel programs, the performance of the sequential part quickly becomes the limiting improvement factor as more cores are made available to the application, as expressed by Amdahl's law. Consequently, increasing sequential performance remains a valid approach in the multi-core era. Unfortunately, conventional means to do so (increasing the out-of-order window size and issue width) are major contributors to the complexity and power consumption of the chip. In this thesis, we revisit a previously proposed technique that aimed to improve performance in an orthogonal fashion: Value Prediction (VP). Instead of increasing the execution engine aggressiveness, VP improves the utilization of existing resources by increasing the available Instruction Level Parallelism. In particular, we address the three main issues preventing VP from being implemented. First, we propose to remove validation and recovery from the execution engine, and do them in order at commit. Second, we propose a new execution model that executes some instructions in order either before or after the out-of-order engine. This reduces pressure on said engine and allows its aggressiveness to be reduced. As a result, port requirements on the Physical Register File and overall complexity decrease. Third, we propose a prediction scheme that mimics the instruction fetch scheme: Block-Based Prediction. This allows predicting several instructions per cycle with a single read, hence a single port on the predictor array. These three propositions form a possible implementation of Value Prediction that is both realistic and efficient.
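
A minimal sketch of the kind of predictor involved may help. The Python toy below implements a last-value predictor with confidence counters, trained in order at commit as the thesis proposes; the table layout, threshold, and API are illustrative assumptions, and Perais's predictors and block-based organization are substantially more sophisticated.

```python
# Last-value prediction with a confidence counter per static instruction.
# predict() is consulted at fetch/rename; train() runs in order at commit,
# where a confident misprediction would trigger a pipeline squash.

class LastValuePredictor:
    def __init__(self, threshold=3):
        self.table = {}                  # pc -> (last_value, confidence)
        self.threshold = threshold

    def predict(self, pc):
        value, conf = self.table.get(pc, (None, 0))
        return value if conf >= self.threshold else None  # use only if confident

    def train(self, pc, actual):
        value, conf = self.table.get(pc, (None, 0))
        self.table[pc] = (actual, conf + 1 if actual == value else 0)

vp = LastValuePredictor()
for actual in [7, 7, 7, 7, 7, 9]:        # a stable value, then a break
    pred = vp.predict(0x40)
    print(pred, actual)                   # predictions appear once confident
    vp.train(0x40, actual)
```

The last iteration is the expensive case, a confident but wrong prediction, which is why the thesis moves validation and recovery out of the execution engine and performs them in order at commit.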
23

McKellar, Elizabeth. "Architectural practice for speculative building in late seventeenth century London." Thesis, Royal College of Art, 1992. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.281699.

Abstract:
Architectural practice is the study of how people produce architecture: the ways in which they build, the manner in which they organize themselves to do so and the methods by which buildings are both conceived and physically realised. This thesis is concerned with investigating what has been seen as the watershed period between medieval and modern practices. It particularly examines whether the picture of late 17th century development given by John Summerson in 'Georgian London' (1945), still the standard work on the subject, is correct. In order to do this, new evidence has been used from the Court of Chancery concerning building and property disputes. The first section, 'Development Practice', investigates where and how development was carried out. It shows how the development system was made possible through the freehold/leasehold distinction in English law, which allowed separate interests to exist in the same piece of land. It proves that development was undertaken not primarily by aristocrats, as Summerson thought, but by a new breed of businessmen and entrepreneurs working largely on credit. The next section, 'Design Practice', examines the design process for the realisation of these projects. It shows that although the antecedents of the new houses being produced were classical, this was not matched by a parallel transformation in design procedures or the understanding of form. Only a very limited use was made of drawings, and where they were used, it is argued, this was mainly for contractual or economic purposes. This section challenges conventional notions about the adoption of classicism in this country and its use and transmission here. In the final section, 'Building Practice', the role of the craftsman is examined and is shown to be far more entrepreneurial than conventional interpretations have allowed, with some craftsmen operating as master builders contracting for all trades. It is shown that the new classical house, with its regular, standardized parts, was perfectly suited to the design, construction and development systems of the day, and that building was a far more capitalistic and commercialized activity by this date than has previously been thought.
24

Chiu, Virginia. "Architectural support for commutativity in hardware speculation." Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/106029.

Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 57-60).
Hardware speculative execution schemes (e.g., hardware transactional memory (HTM)) enjoy low run-time overheads but suffer from limited concurrency because they detect conflicts at the level of reads and writes. By contrast, software speculation schemes can reduce conflicts by exploiting the fact that many operations on shared data are semantically commutative: they produce semantically equivalent results when reordered. However, software techniques often incur unacceptable run-time overheads. To bridge this dichotomy, this thesis presents CommTM, an HTM that exploits semantic commutativity. CommTM extends the coherence protocol and conflict detection scheme to allow multiple cores to perform user-defined commutative operations on shared data concurrently and without conflicts. CommTM preserves transactional guarantees and can be applied to arbitrary HTMs. This thesis details CommTM's implementation and presents its evaluation. The evaluation uses a series of micro-benchmarks that covers commonly used operations and a suite of full transactional memory applications. We see that CommTM scales many operations that serialize on conventional HTMs, such as counter increments, priority updates, and top-K set insertions. As a result, at 128 cores on full applications, CommTM outperforms a conventional eager-lazy HTM by up to 3.4x and reduces or eliminates aborts.
by Virginia Chiu.
M. Eng.
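
The gap between the two conflict-detection disciplines can be illustrated with a toy check. In the Python sketch below, the transaction records, operation labels, and `commutes` set are assumptions for illustration, not CommTM's coherence-protocol mechanism: two transactions that both increment a counter conflict under read/write detection but not under commutativity-aware detection.

```python
# Toy contrast between read/write conflict detection and
# commutativity-aware conflict detection.

def rw_conflict(a, b):
    """Conventional HTM: any write that overlaps another access conflicts."""
    return bool(a["writes"] & (b["reads"] | b["writes"]) or
                b["writes"] & (a["reads"] | a["writes"]))

def comm_conflict(a, b, commutes):
    """Ignore conflicts where both sides used the same commutative op."""
    shared = (a["reads"] | a["writes"]) & (b["reads"] | b["writes"])
    for addr in shared:
        if addr not in a["writes"] and addr not in b["writes"]:
            continue                        # read-read never conflicts
        op_a, op_b = a["ops"].get(addr), b["ops"].get(addr)
        if not (op_a is not None and op_a == op_b and op_a in commutes):
            return True
    return False

# Two transactions that each do `counter += 1`:
t1 = {"reads": {"ctr"}, "writes": {"ctr"}, "ops": {"ctr": "add"}}
t2 = {"reads": {"ctr"}, "writes": {"ctr"}, "ops": {"ctr": "add"}}
print(rw_conflict(t1, t2))              # True: serializes on a plain HTM
print(comm_conflict(t1, t2, {"add"}))   # False: increments commute
```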
25

Abou, Dib Marwan Joseph. "Design for speculation : volatile, temporal, in-transit." Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/103455.

Abstract:
Thesis: M. Arch. in Real Estate Development, Massachusetts Institute of Technology, Department of Architecture, 2016.
Thesis: S.M. in Real Estate Development, Massachusetts Institute of Technology, Program in Real Estate Development in conjunction with the Center for Real Estate, 2016.
Cataloged from PDF version of thesis.
Includes bibliographical references (page 97).
The thesis project is a reaction to the alarming rate of building and development depreciation caused by foreign investment in the Middle Eastern city of Dubai. The intervention looks at how architects, developers, and planners can counteract this phenomenon by designing for speculation in order to mitigate future crises or successes. Understanding the economic terms of "creative destruction" and "planned obsolescence" is imperative to help structure such a proposal. Though such terms were once attributed to industrial products such as cars and electronics, they are today applicable in the context of Dubai and similar cities worldwide. Architecture and real estate products have fallen victim to this capitalist phenomenon. The project is framed as an architectural reaction to the world's increasing capability to make and accumulate, in conjunction with a growing desire to be transient and global. Has architecture become a mere toy product which can be changed around as it becomes obsolete? Rather than be destroyed, how can architecture morph and be updated into something new? Architects are not in complete control of consumer wants and needs; these, too, continue to change at a dynamic pace. I argue that a synchronized system that can reflect flexibility is integral to maintaining equilibrium in the urban economic model today. The project design is an infrastructure capable of harnessing capital inflow and outflow while withstanding volatility, temporality and a population in transit. Dubai is the core case study, and the thesis explores how such a generic system can adapt to cities such as Miami, New York City and Juba.
by Marwan Joseph Abou Dib.
M. Arch. in Real Estate Development
S.M. in Real Estate Development
26

Powers, Aaron. "Stimulation, speculation, simulation : the architecture of the captured city that the corporation gave us." Thesis, Massachusetts Institute of Technology, 2020. https://hdl.handle.net/1721.1/129933.

Abstract:
Thesis: M. Arch., Massachusetts Institute of Technology, Department of Architecture, February 2020.
Cataloged from student-submitted thesis.
Includes bibliographical references (pages 114-115).
For thousands of years, architecture built itself from a history of its own understanding of itself. A retracing over time, architecture was inherited from its past. Over time, sets of ideals were built collectively, with the city as their manifesto. It could be argued that the architecture built was in response to an architecture that already existed. The production of making architecture made cities and, further, showcased architecture as a way of thinking. Diagrammatic thinking produces diagrammatic architecture. Orthographic thinking produced orthographic architecture [1]. Machines have learned to compound a retracing of time by creating systems of algorithms that recursively produce patterns of understanding through their recordings of human behavior. The model is based on recording the past to predict the future. Increasingly, our cities have been recorded non-stop through various and ubiquitous sensing devices. The city is then re-represented afterward, through the eyes and interpretation of the machine. Or rather, by the eyes of the machine by the men who made the machine see the way they want the machine to see, building a proxy city, a representation of the real world, to help make choices in the distant and near future. How might we begin to imagine architecture as collective intelligence within this new system? Imagining architecture as a type of metadata: mined through vision, camera, and surveillance technology, the connecting of various strands of metadata produced surveillance capitalism's abilities. Imagine, for a second, the other connections that mining this figural metadata could produce: processing the knowns and unknowns, speculating on the possible city in Rumsfeldian ways, knowing from these systems that they have the capability to find patterns and order not seen before. When past behavior is the basis of predicting future behavior, how might we revisit the city that was, to forge the city to come?
by Aaron Powers.
M. Arch.
27

Pilla, Mauricio Lima. "RST: Reuse through Speculation on Traces." Biblioteca Digital de Teses e Dissertações da UFRGS, 2004. http://hdl.handle.net/10183/5888.

Abstract:
In this thesis, we present a novel approach that combines the reuse and the prediction of dynamic instruction sequences, called Reuse through Speculation on Traces (RST). Our technique allows the dynamic identification of instruction traces that are redundant or predictable, and the reuse (speculative or not) of these traces. RST addresses an issue present in Dynamic Trace Memoization (DTM): traces not being reused because some of their inputs are not ready at the time of the reuse test. In previous studies, such traces were measured to be about 69% of all reusable traces. One of the main advantages of RST over simply combining a value prediction technique with an unrelated reuse technique is that RST does not require extra tables to store the values to be predicted; applying reuse and value prediction in unrelated mechanisms at the same time may require a prohibitive amount of table storage. In RST, the values are already stored in the Trace Memoization Table, and reading them incurs no extra cost compared with a non-speculative trace reuse technique: the input context of each trace (the input values of all instructions in the trace) already stores the values for the reuse test, and these same values may also be used for prediction. Our main contributions include: (i) a speculative trace reuse framework that can be adapted to different processor architectures; (ii) a specification of the modifications needed in a superscalar, superpipelined processor to implement our mechanism; (iii) a study of implementation issues related to this architecture; (iv) a study of the performance limits of our technique; (v) a performance study of a realistic, constrained implementation of RST; and (vi) simulation tools, modeling a superscalar, superpipelined processor in detail, that can be used in other studies. In a constrained architecture with realistic confidence estimation, our RST technique achieves average speedups (harmonic means) of 1.29 over the baseline architecture without reuse and 1.09 over a non-speculative trace reuse technique (DTM).
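To make the reuse-test idea concrete, here is a minimal Python sketch of how a trace memoization entry's stored input context can double as the value-prediction source, as RST proposes; the entry layout and register names are illustrative assumptions, not the dissertation's implementation.

    class TraceEntry:
        def __init__(self, input_ctx, output_ctx):
            self.input_ctx = input_ctx    # {register: value} at trace entry
            self.output_ctx = output_ctx  # {register: value} at trace exit

    def try_reuse(entry, regfile, ready):
        """Return (reused, speculative). Reuse the trace if every ready
        input matches; inputs that are not yet ready are predicted to
        equal the stored context instead of blocking reuse."""
        speculative = False
        for reg, expected in entry.input_ctx.items():
            if ready[reg]:
                if regfile[reg] != expected:
                    return False, False        # reuse test fails outright
            else:
                speculative = True             # predict reg == expected
        return True, speculative

    # A matching, partially ready context triggers *speculative* reuse;
    # the prediction is validated once the late input becomes ready.
    entry = TraceEntry({"r1": 4, "r2": 7}, {"r3": 11})
    reused, spec = try_reuse(entry, {"r1": 4, "r2": 0}, {"r1": True, "r2": False})
    assert reused and spec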
APA, Harvard, Vancouver, ISO, and other styles
28

Kully, Deborah Grace. "Speculating on architecture : morality, the new real estate, and the bourgeois apartment industry in late nineteenth-century France." Thesis, Massachusetts Institute of Technology, 2011. http://hdl.handle.net/1721.1/63061.

Full text
Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Architecture, 2011.
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 216-232).
The topic of architecture as a commodity, something that can be possessed and traded, has been largely ignored within the discipline of architectural history, or even written off altogether as an inevitable consequence of modern capitalism. But the history of the commodification of architecture is by no means as simple as it may seem. It has its roots in Haussmann's Paris, and the speculative property market of the 1860s, where we see, for the first time, a complex intermingling of new mortgage structures and residential typologies, the use of standardization, and the proliferation of discourses concerning apartment decoration. The project also treats reactions expressed by architects, aesthetic theorists, and religious and political figures over the course of the Third Republic against speculation practices and their architectural effects. The changes brought by property's increased circulation (the very idea of apartments designed for unknown future occupants) were compounded by the perception of a real estate market held in the grips of commodity culture. The possibility that anyone could own property was unsettling for some political and religious authorities; perhaps even more so was the sense of an assault on the way in which property had traditionally stood as a representation of individuality. Speculative architecture brought about a separation of the subject (the particular owner of an apartment) from its object (the apartment unit now rendered ubiquitous). The powerful critique of modern capitalism and the ostensible ill effects on private life that emerged from all of this was bound up in liberal and Catholic ideologies, as I argue in my dissertation. I look at a set of figures from vastly different professions who, perforce, collectively developed and implemented rules governing finances, architecture, decoration, and, ultimately, human conduct. These include developers like the Saint-Simonian Émile Péreire, whose experimental Crédit Mobilier sponsored standard models for residential architecture, democratized credit, and underwrote the design and construction of thousands of new apartments. These also include taste-makers like Charles Blanc, director of the Académie des beaux-arts, whose works included decoration manuals. And finally, these include politicians such as Frédéric Le Play, the Catholic modernist and proto-sociologist who insisted on the connection between private property and morality, and Jules Simon, the conservative republican who linked the security of the family to that of the nation state. The reactionary moralization of design, to be detected in Catholic dogma, metaphysical philosophy, and the republican politics of the time, stands as one of the great unacknowledged precedents for the proselytizing ideology of architectural modernism at the dawn of the twentieth century.
by Deborah Grace Kully.
Ph.D.
APA, Harvard, Vancouver, ISO, and other styles
29

Kalaitzidis, Kleovoulos. "Advanced speculation to increase the performance of superscalar processors." Thesis, Rennes 1, 2020. http://www.theses.fr/2020REN1S007.

Full text
Abstract:
Even in the multicore era, making single cores faster is paramount to achieving high performance, given the existence of programs that are either inherently sequential or expose non-negligible sequential parts. Sequential performance has essentially been improving with the scaling of the processor structures that enable instruction-level parallelism (ILP). However, as modern microarchitectures continue to extract more ILP by employing larger instruction windows, true data dependencies remain a major performance bottleneck. Value Prediction (VP) and Load-Address Prediction (LAP) are two developing techniques that make it possible to overcome this obstacle and harvest more ILP by executing instructions in a data-wise speculative manner. This thesis proposes mechanisms related to VP and LAP that lead to effectively higher performance improvements. First, VP is examined in an ISA-aware manner, which discloses the impact of certain ISA particularities on the anticipated speedup. Second, a novel binary-based VP model, VSEP, is introduced; it exploits certain value patterns that, although frequently encountered, cannot be captured by previous models. VSEP improves the obtained speedup by 19% and, by virtue of its structure, mitigates the cost of predicting values wider than 64 bits. Adapting this approach to perform LAP makes it possible to predict the memory addresses of 48% of the committed loads. Finally, a microarchitecture that carefully leverages this LAP mechanism can execute 32% of the committed loads early.
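As a point of reference for how such predictors gate speculation, the following is a minimal last-value predictor with saturating confidence counters, written in Python; it is a generic stand-in, not VSEP itself, and the threshold values are illustrative assumptions. A LAP mechanism follows the same pattern, with predicted load addresses in place of result values.

    class ValuePredictor:
        def __init__(self, threshold=3, max_conf=7):
            self.table = {}              # pc -> [last_value, confidence]
            self.threshold = threshold
            self.max_conf = max_conf

        def predict(self, pc):
            entry = self.table.get(pc)
            if entry and entry[1] >= self.threshold:
                return entry[0]          # confident enough to speculate
            return None                  # otherwise execute normally

        def train(self, pc, actual):
            entry = self.table.setdefault(pc, [actual, 0])
            if entry[0] == actual:
                entry[1] = min(entry[1] + 1, self.max_conf)   # reinforce
            else:
                entry[0], entry[1] = actual, 0                # mispredict: reset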
APA, Harvard, Vancouver, ISO, and other styles
30

Wamhoff, Jons-Tobias [Verfasser], Christof [Akademischer Betreuer] Fetzer, and Pascal [Akademischer Betreuer] Felber. "Exploiting Speculative and Asymmetric Execution on Multicore Architectures / Jons-Tobias Wamhoff. Gutachter: Christof Fetzer ; Pascal Felber. Betreuer: Christof Fetzer." Dresden : Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2015. http://d-nb.info/106909675X/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Loew, Jason. "Quantifying the impacts of disabling speculation and relaxing the scheduling loop in multithreaded processors." Diss., online access via UMI, 2006.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
32

Perombelon, Brice Désiré Jude. "Prioritising indigenous representations of geopower : the case of Tulita, Northwest Territories, Canada." Thesis, University of Oxford, 2018. http://ora.ox.ac.uk/objects/uuid:71e14c26-d00a-4320-a385-df74715c45c8.

Full text
Abstract:
Recent calls from progressive, subaltern and postcolonial geopoliticians to move geopolitical scholarship away from its Western ontological bases have argued that more ethnographic studies centred on peripheral and dispossessed geographies need to be undertaken in order to integrate peripheralised agents and agencies in dominant ontologies of geopolitics. This thesis follows these calls. Through empirical data collected during a period of five months of fieldwork undertaken between October 2014 and March 2015, it investigates the ways through which an Indigenous community of the Canadian Arctic, Tulita (located in the Northwest Territories' Sahtu region) represents geopower. It suggests a semiotic reading of these representations in order to take the agency of other-than/more-than-human beings into account. In doing so, it identifies the ontological bases through which geopolitics can be indigenised. Drawing from Dene animist ontologies, it indeed introduces the notion of a place-contingent speculative geopolitics. Two overarching argumentative lines are pursued. First, this thesis contends that geopower operates through metamorphic refashionings of the material forms of, and signs associated with, space and place. Second, it infers from this that through this transformational process, geopower is able to create the conditions for alienating but also transcending experiences and meanings of place to emerge. It argues that this movement between conflictual and progressive understandings is dialectical in nature. In addition to its conceptual suggestions, this thesis makes three empirical contributions. First, it confirms that settler geopolitical narratives of sovereignty assertion in the North cannot be disentangled from capitalist and industrial political-economic processes. Second, it shows that these processes, and the geopolitical visions that subtend them, are materialised in space via the extension of the urban fabric into Indigenous lands. Third, it demonstrates that by assembling space ontologically in particular ways, geopower establishes (and entrenches) a geopolitical distinction between living/sovereign (or governmentalised) spaces and nonliving/bare spaces (or spaces of nothingness).
APA, Harvard, Vancouver, ISO, and other styles
33

Kainth, Haresh S. "A data dependency recovery system for a heterogeneous multicore processor." Thesis, University of Derby, 2014. http://hdl.handle.net/10545/313343.

Full text
Abstract:
Multicore processors often increase the performance of applications but, with their deeper pipelines, have proven increasingly difficult to improve. In an attempt to deliver enhanced performance at lower power, semiconductor manufacturers have progressively adopted chip-multicore processors. Existing research has relied on a very common technique known as thread-level speculation, which attempts to compute results before the actual result is known. However, thread-level speculation impacts operation latency and circuit timing, and confounds data cache behavior and code generation in the compiler. We describe a software framework, codenamed Lyuba, that handles low-level data hazards and automatically recovers the application from them, without programmer intervention, for an asymmetric chip-multicore processor. Determining the correct execution of multiple threads when data hazards occur on conventional symmetric chip-multicore processors is a significant, ongoing challenge, yet there has been very little focus on the use of asymmetric (heterogeneous) processors with applications that have complex data dependencies. The purpose of this thesis is to: (i) define the development of a software framework for an asymmetric (heterogeneous) chip-multicore processor; (ii) present an optimal software control of hardware for distributed processing and recovery from violations; and (iii) provide performance results for five applications using three datasets. Applications with a small dataset showed an improvement of 17% and those with a larger dataset an improvement of 16%, giving an overall 11% improvement in performance.
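The detect-and-recover pattern the abstract describes can be sketched in a few lines of Python; this is a hedged illustration of the general idea, not Lyuba's actual API. A chunk of work executes speculatively against shared state, and per-key write counters reveal whether a concurrent writer invalidated its inputs.

    def run_with_recovery(compute, reads, versions, shared):
        """compute: callable over shared state; reads: keys it depends on;
        versions: per-key write counters maintained by the runtime."""
        while True:
            snapshot = {k: versions[k] for k in reads}
            result = compute(shared)                 # speculative execution
            if all(versions[k] == snapshot[k] for k in reads):
                return result                        # no hazard: commit
            # a writer bumped a version mid-flight: squash and re-execute

    versions, shared = {"x": 0}, {"x": 5}
    print(run_with_recovery(lambda s: s["x"] * 2, ["x"], versions, shared))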
APA, Harvard, Vancouver, ISO, and other styles
34

Tonchev, Anton. "Door, Passage, Courtyard: Shifting Perspective in Gamla Stan." Thesis, KTH, Arkitektur, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-281363.

Full text
Abstract:
A historical study of the urban texture of Gamla Stan shows how public space has been appropriated for private needs. Streets were built over, closed off, or turned into private courtyards, some of which have started to disappear, becoming completely internalized. This process of space appropriation was one-directional until the early 1900s, when the fear of losing structures across town led the authorities to set a precedent and reverse the process by removing specific houses from the urban texture. My project builds on this approach with a changed set of rules: the re-examination of all the hidden, internal, private spaces and their re-introduction to public life. My set of criteria is rooted in research on the elements that constitute the borderline between Gamla Stan's public and private realms: doors, passages, and courtyards. Based on that, I limited my intervention techniques to the removal of three elements: fences, structures, and doors. The last has two sub-categories, "the removed wall" (turned into a new door) and "the removed lock" (opening an existing door). Having established the parameters of my work, I tested this speculation in a specific case: a cluster of four blocks on the west side of Gamla Stan. Using the rule that a door must be the beginning of a corridor path leading to an open court, and drawing on historical knowledge of the location of past public spaces, I surgically removed later additions of lesser architectural or historical quality. The result is a new interconnected, accessible network. Until now one was restricted to walking along the streets and alleys, and around the buildings, of Gamla Stan. With this intervention, people can walk through the buildings and into the reclaimed spaces, shifting their perception of the urban texture. The new alternative, total system of navigation turns solid into permeable/perforated. Alley City has become Corridor City.
APA, Harvard, Vancouver, ISO, and other styles
35

Chuzeville, Sylvain. "Vie, œuvre et carrière de Jean-Antoine Morand, peintre et architecte à Lyon au XVIIIe." Thesis, Lyon 2, 2012. http://www.theses.fr/2012LYO20076/document.

Full text
Abstract:
Born in 1727 in Briançon, Jean-Antoine Morand was 14 years old when he embarked on an artistic career following his father's death. Having settled in Lyon, he established his own painting workshop in 1748. Receiving public and private commissions and working regularly for the theatre, he specialized in trompe-l'œil painting and stage setting, including theatre machinery. In the late 1750s, encouraged by Soufflot, he turned to architecture and city planning, as various aspects of his first career had prepared him to. As a self-taught architect, Morand suffered from a lack of legitimacy, which he tried to remedy by seeking public recognition. But his successes, in particular the privately funded construction of a bridge across the Rhône, were not enough. Morand's career was torn between entrepreneurial pride and institutional ambition, and his public image suffered from the perceived opposition between land speculation and the promotion of the public good. This concerns above all his great work: a project to extend Lyon onto the left bank of the Rhône, as part of a general plan giving the city a circular form. Morand built little, and almost nothing remains of his pictorial work. There survives, however, a rich private archive, on which this thesis draws to bring to light the intentions, relations, and psychology of an otherwise little-known architect.
APA, Harvard, Vancouver, ISO, and other styles
36

Svahn, Garreau Hélène. "I originalets tjänst : Om framställandet och bevarandet av kalkmåleri i svenska kyrkorum mellan 1850 och 1980." Doctoral thesis, KTH, Arkitekturens historia och teori, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-171078.

Full text
Abstract:
There are approximately 1300 completely or partially preserved medieval churches in Sweden. Many of these have remains of kalkmåleri (mural paintings at least partially created in lime) from the 12th throughout the 17th century. This dissertation discusses the enactments that formed the revival of this kalkmåleri between 1850 and 1980, with a focus on restoration and conservation. The decorative and monumental paintings that were created at the same time are also discussed. The study is divided into two sections: one concentrates on the mural paintings and the networks that made their (re-)enactment possible, and the second is a case study that examines kalkmåleri in four medieval churches; Vendel and Ed north of Stockholm, and Floda and Vadsbro south of Stockholm. To come close to the paintings, an eclectic methodology with analysis of written and depicted sources, interviews, and studies in situ of the paintings through mapping and analysis of taken samples was designed. The objectives were to investigate the formation of kalkmåleri as phenomena, significant concepts, and conservation practices throughout time and space. Theoretical inspiration was taken from Actor-Network-Theory, critical discourse analysis, and speculative realism. Throughout the study the kalkmåleri is thus seen to have agency. The weave of enactments stemming from different professions and thought collectives that formed the paintings was made visible by following the actors. Some of these enactments were analyzed: i.e. the aesthetic shaping of the room, as religious and iconographic images, historical documents, art, style, technical, or hybrid objects. The latter refers to conservation that did not entirely rely on science, humanist scholarship, craftsmanship, or artistic creativity. Thus conservation is seen as a hybrid activity. Three periods of conservation principles were explored: stylistic restoration, original conservation, and precautionary conservation, which were related to what was perceived as the authentic original. Furthermore some Swedish "traditions" are discussed: that no institute for technical studies of art was formed, the use of "Curman’s principles", restricted retouching from the 1960s onward, and the use of gomma pane for cleaning. Finally appendices are included containing terminology, an index of conservators, and a DVD with mapping, chemical analysis, and photographs.

Research funding: R&D funds from Riksantikvarieämbetet (the Swedish National Heritage Board), Brandförsäkringsverket's foundation for research on built heritage, the Elna Bengtsson fund, and the Tyréns foundation.

A year of study at Columbia University was made possible with support from the Fulbright Commission, the Erik & Lily Philipson memorial fund, and the Axelson Johnson foundation.

APA, Harvard, Vancouver, ISO, and other styles
37

Valiukas, Tadas. "Kompiliatorių optimizavimas IA-64 architektūroje." Master's thesis, Lithuanian Academic Libraries Network (LABT), 2014. http://vddb.library.lt/obj/LT-eLABa-0001:E.02~2009~D_20140701_180746-19336.

Full text
Abstract:
After performance optimization of the traditional x86 architecture began to reach its limits, Intel started to develop the new IA-64 architecture, based on EPIC (Explicitly Parallel Instruction Computing). Its main feature allows up to six instructions to be executed in a single CPU cycle. The architecture also includes features that allow efficient solutions to code-optimization problems found in traditional architectures. However, compiler optimization algorithms were long tuned for traditional architectures only, so to exploit the new architecture, ways must be found to improve existing compilers. One way is to tune the values of the compiler's internal optimization parameters to IA-64, which is the goal of this work. To reach it, the features of IA-64 are analyzed, applied experimentally to real-life code examples, and evaluated for their impact on execution performance. Based on these results, the compiler's internal parameters are examined, and a dedicated compiler benchmarking program is used to find the best set of parameter values for this architecture. This set is then tried with application programs. The resulting parameter values should allow the compiler to generate more efficient code for the IA-64 architecture.
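The search this thesis performs can be pictured with a small Python driver that sweeps candidate parameter values, rebuilds a benchmark, and keeps the fastest set; the parameter names, compiler command, and benchmark file below are hypothetical placeholders, not the flags of any particular compiler.

    import itertools, subprocess, time

    PARAM_SPACE = {                       # hypothetical internal parameters
        "--spec-threshold=": [2, 4, 8],
        "--unroll-limit=": [16, 32, 64],
    }

    def run_benchmark(flags):
        subprocess.run(["cc", "-O2", *flags, "bench.c", "-o", "bench"], check=True)
        start = time.perf_counter()
        subprocess.run(["./bench"], check=True)
        return time.perf_counter() - start

    best = None
    for values in itertools.product(*PARAM_SPACE.values()):
        flags = [f"{name}{v}" for name, v in zip(PARAM_SPACE, values)]
        elapsed = run_benchmark(flags)
        if best is None or elapsed < best[0]:
            best = (elapsed, flags)       # keep the fastest parameter set
    print("best parameter set:", best[1])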
APA, Harvard, Vancouver, ISO, and other styles
38

Hung, Ming-Yu, and 洪明郁. "Compiler supports for Optimizing Speculative Multithreading Architecture." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/34974431939945341349.

Full text
Abstract:
Master's thesis
National Tsing Hua University
Department of Computer Science
Academic year 92 (2003)
With the progress of VLSI technology, more and more features are being added to a single processor. Speculative multithreading (SpMT) architecture is one of them. It combines speculation support and multithreading in one processor, and it can exploit thread-level parallelism that cannot be identified statically. Speedup can be obtained by speculatively executing, in parallel, threads that are extracted from a sequential program. However, performance degradation can occur if the threads are highly dependent: a recovery mechanism is activated when a speculative thread violates sequential semantics, and the recovery action usually incurs a very high penalty because all live threads must be squashed before the recovery code runs. It is therefore essential for SpMT to quantify the degree of dependence and to turn off speculation if the degree of loop-carried dependence exceeds a certain threshold. This thesis presents a technique that quantitatively computes loop-carried dependences; this information can be used to decide whether loop iterations should be executed in parallel by speculative threads. The technique has two steps. First, probabilistic points-to analysis estimates the probabilities of points-to relationships when programs contain pointer references; in this way, the degree of dependence between loop iterations is computed quantitatively. Second, experimental results show that compiler-directed thread-level speculation based on this information always leads the architecture to the right decision on the experimental platform, the SImulator for Multithreaded Computer Architectures (SIMCA), which can be made to model an SpMT architecture by inserting SIMCA-specific instructions.
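The decision rule can be made concrete with a short Python sketch, using illustrative probabilities, a hypothetical threshold, and the simplifying assumption that store/load pairs conflict independently: the compiler estimates the probability of a loop-carried dependence from probabilistic points-to results and enables speculative threading only when that probability is low.

    def loop_carried_dependence_prob(pairs):
        """pairs: (p_store, p_load) probabilities that a store in iteration
        i and a load in iteration i+1 reference the same location."""
        p_no_conflict = 1.0
        for p_store, p_load in pairs:
            p_no_conflict *= 1.0 - p_store * p_load
        return 1.0 - p_no_conflict

    THRESHOLD = 0.2    # hypothetical squash-cost break-even point
    p = loop_carried_dependence_prob([(0.9, 0.1), (0.3, 0.2)])
    print(f"P(dependence) = {p:.2f}:", "speculate" if p < THRESHOLD else "serialize")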
APA, Harvard, Vancouver, ISO, and other styles
39

Chen, HsuanYu, and 陳宣宇. "Speculative Execution of Non-Blocking Multithreaded Architecture." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/77229340248370372217.

Full text
Abstract:
Master's thesis
Fu Jen Catholic University
Department of Computer Science and Information Engineering
Academic year 95 (2006)
In the past decade, CPU speed has been increasing linearly, but memory access speed has not kept up with this improvement. In modern architectures, increasing processor performance requires increasing instruction-level parallelism (ILP); in multithreaded architectures, thread-level parallelism (TLP) complements ILP. In this thesis, we evaluate a modern processor that decouples memory accesses to alleviate this gap, using a non-blocking multithreaded model together with the dataflow paradigm. We provide both clock-cycles-per-instruction (CPI) and instructions-per-clock-cycle (IPC) evaluations of a multithreaded architecture using speculative execution. The current architectural paradigm is shifting from high-performance to high-throughput processing, using on-chip processors or distributed components. The main reasons to experiment with speculative execution are the diminishing potential of existing techniques to extract parallelism from a single program, and current technology trends that allow multiple independent threads to execute. The architecture studied here has been evaluated previously and shown to outperform MIPS-like architectures. In this study, we implement speculative execution of multiple threads on this unique architecture, using hand-coded examples for speculative and non-speculative execution. Using a few benchmarks, we present the IPC and CPI improvement over non-speculative execution. Some of the benchmarks use I-structures, which are unique to dataflow architectures; others do not. All of the benchmarks showed a speedup of about 1.3. Without speculation, programs can only be divided conservatively into non-speculative threads whose mutual exclusion and independence are guaranteed; with speculation, threads are extracted aggressively, and mutual exclusion and dependences are enforced so the threads can run in parallel. This can increase the performance of programs with high probability, as this research demonstrates for a non-blocking multithreaded architecture using several architectural simulators.
APA, Harvard, Vancouver, ISO, and other styles
40

TSENG, YU-MING, and 曾淯銘. "Loop Transformation and Instruction Scheduling Techniques for Timing Speculative Architecture." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/as8mgc.

Full text
Abstract:
Master's thesis
National Chung Cheng University
Graduate Institute of Computer Science and Information Engineering
Academic year 106 (2017)
Traditional processor designs incorporate voltage and frequency guardbands to ensure correct execution under worst-case conditions. As transistor density increases and manufacturing processes advance, increasingly costly guardbands are required to deal with the impact of environmental variability. Timing speculation can relax the tight constraints of worst-case design by allowing occasional errors, which are detected and corrected later by an error-resilience mechanism; a program's performance may suffer, however, owing to timing errors. The thesis consists of three parts. First, we analyze program behaviors in a simulator and observe what influences the number of timing errors. Second, we propose a loop transformation technique for Timing Speculative Architectures that reduces the number of timing errors by up to 37%. Third, we propose an instruction scheduling technique that rearranges the instructions in a program so that the Timing Speculative Architecture and the Adaptive Voltage Scaling technique cooperate better, reducing the number of timing errors by up to 45%. These techniques are implemented in the LLVM compiler infrastructure to generate the optimized programs automatically.
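As a rough illustration of the scheduling idea, here is a Python sketch under the assumption that timing errors become more likely when instructions exercising long circuit paths issue back to back; the long-path classification is hypothetical. A list scheduler can interleave short-path work between such instructions while respecting dependences.

    def schedule(instrs, deps, long_path):
        """instrs: ids in program order; deps: id -> set of prerequisites;
        long_path: ids believed to trigger timing-critical paths."""
        scheduled, out, prev_long = set(), [], False
        while len(out) < len(instrs):
            ready = [i for i in instrs
                     if i not in scheduled and deps[i] <= scheduled]
            # prefer a short-path instruction right after a long-path one
            pick = next((i for i in ready if not (prev_long and i in long_path)),
                        ready[0])
            out.append(pick)
            scheduled.add(pick)
            prev_long = pick in long_path
        return out

    order = schedule([0, 1, 2, 3], {0: set(), 1: {0}, 2: set(), 3: {2}}, {0, 2})
    print(order)   # long-path ops 0 and 2 are separated when dependences allow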
APA, Harvard, Vancouver, ISO, and other styles
41

"Adaptive modern and speculative urbanism: the architecture of the Crédit Foncier d'Extrême-Orient (C.F.E.O.) in Hong Kong and China's treaty ports, 1907-1959." 2013. http://library.cuhk.edu.hk/record=b5884271.

Full text
Abstract:
Lau, Leung Kwok Prudence.
Thesis (Ph.D.)--Chinese University of Hong Kong, 2013.
Includes bibliographical references (leaves ).
Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Abstracts also in Chinese.
APA, Harvard, Vancouver, ISO, and other styles
42

Yuan, Yi. "A microprocessor performance and reliability simulation framework using the speculative functional-first methodology." Thesis, 2011. http://hdl.handle.net/2152/ETD-UT-2011-12-4848.

Full text
Abstract:
With the high complexity of modern day microprocessors and the slow speed of cycle-accurate simulations, architects are often unable to adequately evaluate their designs during the architectural exploration phases of chip design. This thesis presents the design and implementation of the timing partition of the cycle-accurate, microarchitecture-level SFFSim-Bear simulator. SFFSim-Bear is an implementation of the speculative functional-first (SFF) methodology, and utilizes a hybrid software-FPGA platform to accelerate simulation throughput. The timing partition, implemented in FPGA, features throughput-oriented, latency-tolerant designs to cope with the challenges of the hybrid platform. Furthermore, a fault injection framework is added to this implementation that allows designers to study the reliability aspects of their processors. The result is a simulator that is fast, accurate, flexible, and extensible.
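A fault-injection hook of the kind added to SFFSim-Bear can be pictured with a few lines of Python; this is a single-event-upset model with illustrative names, whereas the actual framework operates on the FPGA-hosted timing partition.

    import random

    def inject_bit_flip(regfile, width=64, rng=random):
        """Flip one random bit in one random architectural register."""
        reg = rng.choice(sorted(regfile))
        bit = rng.randrange(width)
        regfile[reg] ^= 1 << bit          # model a single-event upset
        return reg, bit

    regs = {"r1": 0x1234, "r2": 0xFFFF}
    print(inject_bit_flip(regs, rng=random.Random(42)))
    print({k: hex(v) for k, v in regs.items()})   # observe masking/propagation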
APA, Harvard, Vancouver, ISO, and other styles
43

Wamhoff, Jons-Tobias. "Exploiting Speculative and Asymmetric Execution on Multicore Architectures." Doctoral thesis, 2014. https://tud.qucosa.de/id/qucosa%3A28598.

Full text
Abstract:
The design of microprocessors is undergoing radical changes that affect the performance and reliability of hardware and will have a high impact on software development. Future systems will depend on a deep collaboration between software and hardware to cope with the current and predicted system design challenges. Instead of higher frequencies, the number of processor cores per chip is growing. Eventually, processors will be composed of cores that run at different speeds or support specialized features to accelerate critical portions of an application. Performance improvements of software will only result from increasing parallelism and introducing asymmetric processing. At the same time, substantial enhancements in the energy efficiency of hardware are required to make use of the increasing transistor density. Unfortunately, the downscaling of transistor size and power will degrade the reliability of the hardware, which must be compensated by software. In this thesis, we present new algorithms and tools that exploit speculative and asymmetric execution to address the performance and reliability challenges of multicore architectures. Our solutions facilitate both the assimilation of software to the changing hardware properties as well as the adjustment of hardware to the software it executes. We use speculation based on transactional memory to improve the synchronization of multi-threaded applications. We show that shared memory synchronization must not only be scalable to large numbers of cores but also robust such that it can guarantee progress in the presence of hardware faults. Therefore, we streamline transactional memory for a better throughput and add fault tolerance mechanisms with a reduced overhead by speculating optimistically on an error-free execution. If hardware faults are present, they can manifest either in a single event upset or crashes and misbehavior of threads. We address the former by applying transactions to checkpoint and replicate the state such that threads can correct and continue their execution. The latter is tackled by extending the synchronization such that it can tolerate crashes and misbehavior of other threads. We improve the efficiency of transactional memory by enabling a lightweight thread that always wins conflicts and significantly reduces the overheads. Further performance gains are possible by exploiting the asymmetric properties of applications. We introduce an asymmetric instrumentation of transactional code paths to enable applications to adapt to the underlying hardware. With explicit frequency control of individual cores, we show how applications can expose their possibly asymmetric computing demand and dynamically adjust the hardware to make a more efficient usage of the available resources.
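The optimistic execution pattern underlying this work can be reduced to a toy global-clock software transaction in Python: execute speculatively on a private copy, then commit only if no other writer got in first. This is a deliberately simplified sketch, not the dissertation's streamlined transactional memory.

    import threading

    _clock_lock = threading.Lock()
    _version = 0

    def atomic(update, state):
        """Retry `update` on a private copy until it commits cleanly."""
        global _version
        while True:
            start = _version               # optimistic snapshot of the clock
            local = dict(state)            # speculative private copy
            update(local)
            with _clock_lock:
                if _version == start:      # no concurrent commit: publish
                    state.update(local)
                    _version += 1
                    return
            # conflict detected: squash the speculation and retry

    counter = {"n": 0}
    atomic(lambda s: s.update(n=s["n"] + 1), counter)
    print(counter)                         # {'n': 1}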
APA, Harvard, Vancouver, ISO, and other styles
44

Ranganathan, Nitya. "Control flow speculation for distributed architectures." 2009. http://hdl.handle.net/2152/6586.

Full text
Abstract:
As transistor counts, power dissipation, and wire delays increase, the microprocessor industry is transitioning from chips containing large monolithic processors to multi-core architectures. The granularity of cores determines the mechanisms for branch prediction, instruction fetch and map, data supply, instruction execution, and completion. Accurate control flow prediction is essential for high performance processors with large instruction windows and high-bandwidth execution. This dissertation considers cores with very large granularity, such as TRIPS, as well as cores with extremely small granularity, such as TFlex, and explores control flow speculation issues in such processors. Both TRIPS and TFlex are distributed block-based architectures and require control speculation mechanisms that can work in a distributed environment while supporting efficient block-level prediction, misprediction detection, and recovery. This dissertation aims at providing efficient control flow prediction techniques for distributed block-based processors. First, we discuss simple exit predictors inspired by branch predictors and describe the design of the TRIPS prototype block predictor. Area and timing trade-offs in the predictor implementation are presented. We report the predictor misprediction rates from the prototype chip for the SPEC benchmark suite. Next, we look at the performance bottlenecks in the prototype predictor and present a detailed analysis of exit and target predictors using basic prediction components inspired from branch predictors. This study helps in understanding what types of predictors are effective for exit and target prediction. Using the results of our prediction analysis, we propose novel hardware techniques to improve the accuracy of block prediction. To understand whether exit prediction is inherently more difficult than branch prediction, we measure the correlation among branches in basic blocks and hyperblocks and examine the loss in correlation due to hyperblock construction. Finally, we propose block predictors for TFlex, a fully distributed architecture that uses composable lightweight processors. We describe various possible designs for distributed block predictors and a classification scheme for such predictors. We present results for predictors from each of the design points for distributed prediction.
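For readers unfamiliar with block-level prediction, the following Python sketch shows the basic shape of an exit predictor analogous to a branch predictor: it hashes the block address with a global history of previous exit identifiers and remembers the last exit taken for that index. The table size and hash are illustrative; this is not the TRIPS prototype design.

    class ExitPredictor:
        def __init__(self, bits=10):
            self.mask = (1 << bits) - 1
            self.table = {}          # index -> last exit id seen (0..7)
            self.history = 0         # global exit-history register

        def _index(self, block_pc):
            return (block_pc ^ self.history) & self.mask

        def predict(self, block_pc):
            return self.table.get(self._index(block_pc), 0)

        def update(self, block_pc, actual_exit):
            self.table[self._index(block_pc)] = actual_exit
            self.history = ((self.history << 3) | actual_exit) & self.mask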
APA, Harvard, Vancouver, ISO, and other styles
45

Dodd, Samuel Tommy. "Merchandising the postwar model house at the Parade of Homes." Thesis, 2009. http://hdl.handle.net/2152/ETD-UT-2009-08-345.

Full text
Abstract:
The Parade of Homes began in 1948 as a novel form of sales merchandising and publicity. The model house, on display at the Parade of Homes, was a powerful advertising tool employed by postwar merchant-builders to sell modern design to a new market of informed consumers and second-time homeowners. Using House & Home as a primary source, I contextualize the postwar housing industry and the merchandising efforts of builders. Then, through an examination of the 1955 Parade of Homes in Houston, Texas, I analyze the early Parade of Homes events and the language of domestic modernism that they showcased.
APA, Harvard, Vancouver, ISO, and other styles
46

Carlerbäck, Johansson Linnea. "The living square : Speculating publicness." Thesis, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-171712.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Chen, Pei-yuan, and 陳沛源. "Front-End Policy based on Speculation Condition for Simultaneous Multithreading Architecture." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/vegfgr.

Full text
Abstract:
Master's thesis
Tatung University
Department of Computer Science and Engineering
Academic year 95 (2006)
For modern wide-issue superscalar processors, a high-performance instruction fetch unit is the key component for keeping the powerful execution engine operating at full speed. The performance of a front-end mechanism is measured by both its instruction delivery rate and its speculation accuracy: a good front-end engine should fetch and dispatch many instructions on the correct execution path within a reasonable clock cycle time. Things differ somewhat in a Simultaneous Multithreading (SMT) architecture because there are multiple active contexts inside the CPU. If we can extract information about the future speculation conditions of each thread, the front-end fetch engine can prefer threads with highly predictable execution paths and avoid wasting resources and energy on mis-speculated paths. In this thesis, we focus on improving the front-end engine of an SMT processor. We present a supplementary structure called the Sequential Trace Table (STT) that provides a look-ahead into the future speculation conditions of each thread, and we use this information to improve fetch-prioritization policies.
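The proposed policy can be summarized with a small Python sketch: each cycle, the fetch engine favors the thread whose upcoming path looks most predictable, using a per-thread confidence score that a structure like the STT could supply, with an ICOUNT-style tiebreak. The scores and the tiebreak rule here are illustrative assumptions, not the thesis's exact policy.

    def pick_fetch_thread(threads):
        """threads: {tid: (lookahead_confidence, instructions_in_flight)}."""
        return max(threads, key=lambda t: (threads[t][0], -threads[t][1]))

    threads = {0: (0.95, 40), 1: (0.60, 12), 2: (0.95, 25)}
    print(pick_fetch_thread(threads))   # 2: equally confident, less congested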
APA, Harvard, Vancouver, ISO, and other styles
48

Khosrow-Khavar, Farzad. "Assigning cost to branches for speculation control in superscalar processors." 2005. http://hdl.handle.net/1828/580.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

Kelly, Daniel R. "Arithmetic data value speculation." Thesis, 2011. http://hdl.handle.net/2440/70234.

Full text
Abstract:
Arithmetic approximation decreases the latency of an arithmetic circuit by shortening the critical path delay or the sampling period, so that the result is not guaranteed to be correct for every input combination. An acceptable compromise between circuit latency and the average probability of correctness therefore drives the circuit design. Two methods of arithmetic approximation are: temporal incompleteness, where circuits sample the result before the critical path delay has elapsed (overclocking); and logical incompleteness, where circuits use simplified logic, so that most input cases are calculated correctly but the slowest cases are calculated incorrectly. Arithmetic data value speculation (ADVS) is a speculation scheme based on arithmetic approximation, used to increase the throughput of a general-purpose processor. ADVS is similar to branch prediction: an arithmetic instruction is issued to both an exact arithmetic unit and an approximate arithmetic unit, the latter providing an approximate result faster than its exact counterpart. The approximate result is forwarded to dependent operations so they may be speculatively issued. When the exact result is eventually known, it is compared with the approximate result, and the pipeline is flushed if they differ. This thesis, "Arithmetic Data Value Speculation", presents work in the fields of digital arithmetic and computer architecture. A summary of current probabilistic arithmetic methods from the literature is provided, and novel designs of approximate integer arithmetic units are presented, including results from logic synthesis. A case study demonstrates approximate arithmetic units used to increase the average throughput of benchmark programs by speculatively issuing dependent operations in a RISC processor. The average correctness of the approximate arithmetic units is shown to be highly data dependent; results vary depending on the benchmarks being run. In addition, the average correctness when running benchmarks is consistently higher than for random inputs. Simulations show that many arithmetic operations are often repeated within the same benchmark, leading to high variation in correctness. Speculative gains from one operation can be offset by speculation losses due to repeated incorrect approximation by another approximate unit, so typical throughput gains through speculation in a general-purpose processor pipeline are low. The minimum threshold correctness for an approximate arithmetic unit used for speculation is shown to be approximately 95%. Logic synthesis is used to determine power, area, and timing information for approximate units implemented from the novel algorithms; these show a reduction in arithmetic cycle latency for integer operations, at the expense of 50% more leakage and area and 90% more dynamic power. Value speculation can be complemented by result caching: repeated pipeline flushes can be avoided if the correct result is known before speculation, the average operation latency can be reduced, and caching can be used for operations that are difficult to approximate.
Thesis (Ph.D.) -- University of Adelaide, School of Electrical and Electronic Engineering, 2011
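One common logically incomplete design, a segmented adder whose carries do not cross segment boundaries, illustrates the speculation loop in a few lines of Python; the 8-bit segment size is an illustrative choice, not a design taken from the thesis.

    def approx_add(a, b, width=32, segment=8):
        """Add segment-wise, dropping carries between segments."""
        mask, result = (1 << segment) - 1, 0
        for lo in range(0, width, segment):
            s = ((a >> lo) & mask) + ((b >> lo) & mask)
            result |= (s & mask) << lo      # carry out of the segment is lost
        return result

    a, b = 0x000000FF, 0x00000001
    guess = approx_add(a, b)                # speculatively forwarded result
    exact = (a + b) & 0xFFFFFFFF
    if guess != exact:                      # mispeculation: flush dependents
        print(f"flush: {guess:#x} != {exact:#x}")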
APA, Harvard, Vancouver, ISO, and other styles
50

Qadah, Thamir. "High-performant, Replicated, Queue-oriented Transaction Processing Systems on Modern Computing Infrastructures." Thesis, 2021.

Find full text
Abstract:
With the shifting landscape of computing hardware architectures and the emergence of new computing environments (e.g., large main-memory systems, hundreds of CPUs, distributed and virtualized cloud-based resources), state-of-the-art designs of transaction processing systems that rely on conventional wisdom suffer from lost performance optimization opportunities. This dissertation challenges conventional wisdom to rethink the design and implementation of transaction processing systems for modern computing environments.

We start by tackling the vertical hardware scaling challenge and propose a deterministic approach to transaction processing on emerging multi-socket, many-core, shared-memory architectures to harness their unprecedented available parallelism. Our proposed priority-based, queue-oriented transaction processing architecture eliminates the transaction contention footprint and uses speculative execution to improve the throughput of centralized deterministic transaction processing systems. We build QueCC and demonstrate up to two orders of magnitude better performance over the state-of-the-art.

We further tackle the horizontal scaling challenge and propose a distributed queue-oriented transaction processing engine that relies on queue-oriented communication to eliminate the traditional overhead of commitment protocols for multi-partition transactions. We build Q-Store, and demonstrate up to 22x improvement in system throughput over the state-of-the-art deterministic transaction processing systems.

Finally, we propose a generalized framework for designing distributed and replicated deterministic transaction processing systems. We introduce the concept of speculative replication to hide the latency overhead of replication. We prototype the speculative replication protocol in QR-Store and perform an extensive experimental evaluation using standard benchmarks. We show that QR-Store can achieve a throughput of 1.9 million replicated transactions per second in under 200 milliseconds with a replication overhead of 8%-25% compared to non-replicated configurations.
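The queue-oriented planning/execution split can be sketched in Python as follows; this is a toy model of the idea, with the priority order fixed by batch position, and none of it is QueCC's actual code.

    from collections import defaultdict

    def plan(batch):
        """batch: list of transactions, each a list of (key, op) where op
        maps an old value to a new value; batch order = priority order."""
        queues = defaultdict(list)
        for txn in batch:
            for key, op in txn:
                queues[key].append(op)     # order fixed at planning time
        return queues

    def execute(queues, store):
        for key, ops in queues.items():    # queues are independent, so
            for op in ops:                 # they may be drained in parallel
                store[key] = op(store.get(key, 0))

    store = {}
    execute(plan([[("x", lambda v: v + 1)],
                  [("x", lambda v: v * 10), ("y", lambda v: v + 5)]]), store)
    print(store)                           # {'x': 10, 'y': 5}, deterministic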
APA, Harvard, Vancouver, ISO, and other styles
