Dissertations / Theses: 'Interconnect architectures'

1

Venkatesan, Raguraman. "Multilevel interconnect architectures for gigascale integration (GSI)." Diss., Georgia Institute of Technology, 2003. http://hdl.handle.net/1853/13370.

2

Meng, Wang. "Verifying Deadlock-Freedom for Advanced Interconnect Architectures." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-171922.

Abstract:

Modern advanced interconnects, such as those governed by the ARM AMBA AXI protocol, can suffer fatal deadlocks in the connections between Masters and Slaves if transactions are not properly arranged. Prior research has examined deadlock problems in on-chip bus systems and proposed methods to avoid them. This project aims to verify both the situations that can cause deadlock and the countermeasures against them. In this thesis, the ARM AMBA AXI protocol and the countermeasures are modelled in NuSMV. Based on these models, we verified that non-trivial cycles of transactions can cause deadlocks and that certain bus techniques mitigate deadlock problems efficiently. The results from model checking several instances of the protocol and the corresponding countermeasures show that the techniques can indeed avoid deadlocks.
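
Illustrative note: the role of cyclic transaction dependencies can be sketched with a small Python search for a cycle in a wait-for graph. This is not the NuSMV model from the thesis. The node names (`M0`, `S1`, and so on), the function `find_cycle`, and the graph encoding (an edge A -> B means A cannot progress until B does) are all assumptions made for this example.

```python
from typing import Dict, List, Optional


def find_cycle(wait_for: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle in a wait-for graph as a list of nodes, or None.

    An edge A -> B means "A cannot make progress until B does".
    A cycle in this graph is the classic necessary condition for deadlock.
    """
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    colour: Dict[str, int] = {node: WHITE for node in wait_for}
    path: List[str] = []                  # nodes on the current DFS path

    def visit(node: str) -> Optional[List[str]]:
        colour[node] = GREY
        path.append(node)
        for nxt in wait_for.get(node, []):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: we closed a cycle.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for node in list(wait_for):
        if colour[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


if __name__ == "__main__":
    # Hypothetical scenario: two masters and two slaves waiting on each other.
    cyclic = {"M0": ["S1"], "S1": ["M1"], "M1": ["S0"], "S0": ["M0"]}
    acyclic = {"M0": ["S0"], "M1": ["S0"], "S0": []}
    print(find_cycle(cyclic))
    print(find_cycle(acyclic))
```

A model checker such as NuSMV explores reachable states of the protocol rather than a fixed graph, so this sketch only shows the cycle condition itself, not how a given arrangement of transactions reaches it.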

3

Cook, Jason Todd. "Interconnect Thermal Management of High Power Packaged Electronic Architectures." Thesis, Georgia Institute of Technology, 2004. http://hdl.handle.net/1853/5013.

Abstract:

Packaged microelectronic technology provides an efficient means of connecting high performance chips to PCBs. As area array bump density increases, joule heating will play an important role in chip and interconnect reliability. Joule heating, in addition to chip heating, can significantly reduce the clock speed and I/O while increasing noise, electromigration, and leakage power. Direct cooling of the solder bumps is an innovative approach to removing heat from packaged chips with high heat dissipation. This could be used in conjunction with top surface mounted thermal management devices to maximize heat removal. The solder bumps leave a small gap between the packaged chip and PCB, which can be utilized for incorporating a thermal management scheme. Since space is very limited, fans and conventional heat sinks are not practical solutions. Jet impingement presents a unique solution for cooling solder bumps. It has been shown that micro jets can effectively cool the top surface of laptop computer processors. They can also be used to cool the solder bumps and bottom of the chip. Micro jets are easily implemented into the PCB without compromising the electrical leads powering the chip. A prototype printed wiring board containing micro jets was built, and a dummy plastic ball grid array packaged chip with an embedded heating element was attached on top. A mini compressor supplied the pressure and flow rates needed to push air through the micro jet holes. The pressure, flow rate, and temperatures were measured and analyzed. A numerical model was created based on the results of the experiments. Both the experiments and the model show the effectiveness of interconnect cooling.

4

Chen, Hongyu. "On-chip interconnect architectures: perspectives of layout, circuits, and systems." Diss., University of California, San Diego, 2006. http://wwwlib.umi.com/cr/ucsd/fullcit?p3237549.

Abstract:

Thesis (Ph. D.)--University of California, San Diego, 2006.
Title from first page of PDF file (viewed December 12, 2006). Available via ProQuest Digital Dissertations. Vita. Includes bibliographical references (p. 131-137).

5

Bhide, Kanchan P. "DESIGN ENHANCEMENT AND INTEGRATION OF A PROCESSOR-MEMORY INTERCONNECT NETWORK INTO A SINGLE-CHIP MULTIPROCESSOR ARCHITECTURE." UKnowledge, 2004. http://uknowledge.uky.edu/gradschool_theses/253.

Abstract:

This thesis involves modeling, design, Hardware Description Language (HDL) design capture, synthesis, implementation and HDL virtual prototype simulation validation of an interconnect network for a Hybrid Data/Command Driven Computer Architecture (HDCA) system. The HDCA is a single-chip shared memory multiprocessor architecture system. Various candidate processor-memory interconnect topologies that may meet the requirements of the HDCA system are studied and evaluated for use within the HDCA system. It is determined that the Crossbar network topology best meets the HDCA system requirements, and it is therefore used as the processor-memory interconnect network of the HDCA system. The design capture, synthesis, implementation and HDL simulation are done in VHDL using the XILINX ISE 6.2.3i and ModelSim 5.7g CAD software. The design is validated by testing it individually against a set of possible test cases; it is then integrated into the HDCA system and validated against two different applications. The inclusion of the crossbar switch in the HDCA architecture involved major modifications to the HDCA system and some minor changes in the design of the switch. Virtual Prototype testing of the HDCA executing applications with the crossbar interconnect revealed proper functioning of the interconnect and HDCA. Inclusion of the interconnect into the HDCA now allows it to implement dynamic node level reconfigurability and multiple forking functionality.

6

Bhaduri, Debayan. "Tools and Techniques for Evaluating Reliability Trade-offs for Nano-Architectures." Thesis, Virginia Tech, 2002. http://hdl.handle.net/10919/9918.

Abstract:

It is expected that nano-scale devices and interconnections will introduce an unprecedented level of defects in the substrates, and architectural designs need to accommodate the uncertainty inherent at such scales. This consideration motivates the search for new architectural paradigms based on redundancy-based defect-tolerant designs. However, redundancy is not always a solution to the reliability problem, and often too much or too little redundancy may cause degradation in reliability. The key challenge is in determining the granularity at which defect tolerance is designed, and the level of redundancy needed to achieve a specific level of reliability. Analytical probabilistic models to evaluate such reliability-redundancy trade-offs are error prone and cumbersome, and do not scale well for complex networks of gates. In this thesis we develop different tools and techniques that can evaluate the reliability measures of combinational circuits, and can be used to analyze reliability-redundancy trade-offs for different defect-tolerant architectural configurations. In particular, we have developed two tools, one of which is based on probabilistic model checking and is named NANOPRISM, and another MATLAB based tool called NANOLAB. We also illustrate the effectiveness of our reliability analysis tools by pointing out certain anomalies which are counter-intuitive but can be easily discovered by these tools, thereby providing better insight into defect-tolerant design decisions. We believe that these tools will help further research and pedagogical interests in this area, expedite the reliability analysis process and enhance the accuracy of establishing reliability-redundancy trade-off points.
Master of Science
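
Illustrative note: the observation that redundancy can lower reliability shows up even in the textbook case of triple modular redundancy (TMR). The Python sketch below is not taken from NANOPRISM or NANOLAB; the function name `tmr_reliability` and its parameters are hypothetical. It assumes independent replica failures and, as a simplification, that a faulty majority voter always corrupts the output.

```python
def tmr_reliability(p: float, voter: float = 1.0) -> float:
    """Probability that a TMR unit produces the correct output.

    p     -- probability that a single replica is correct
    voter -- probability that the majority voter itself is correct
    Assumes independent failures; a faulty voter is assumed to always
    corrupt the output (a pessimistic simplification).
    """
    # At least two of the three replicas are correct:
    # 3 * p^2 * (1 - p) + p^3 = 3p^2 - 2p^3
    majority_ok = 3 * p**2 - 2 * p**3
    return voter * majority_ok


if __name__ == "__main__":
    for p, v in [(0.9, 1.0), (0.9, 0.9), (0.4, 1.0)]:
        r = tmr_reliability(p, v)
        verdict = "helps" if r > p else "hurts"
        print(f"replica={p:.2f} voter={v:.2f} -> TMR={r:.4f} ({verdict})")
```

Under these assumptions, TMR improves on a single replica only when the replicas are individually better than 0.5 and the voter is reliable enough; otherwise the redundant unit is worse than the unit it replaces.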

7

Solkowski, Tomasz. "Multimedia workstation architecture with ATM interconnect." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1997. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp01/MQ28851.pdf.

8

Nousias, Ioannis. "Reconfigurable instruction cell architecture : reconfiguration and interconnects." Thesis, University of Edinburgh, 2009. http://hdl.handle.net/1842/11222.

9

Dines, Julian A. B. "Optoelectronic computing : interconnects, architectures and a systems demonstrator." Thesis, Heriot-Watt University, 1997. http://hdl.handle.net/10399/647.

10

Dennison, Larry R. (Larry Robert). "The reliable router : an architecture for fault tolerant interconnect." Thesis, Massachusetts Institute of Technology, 1996. http://hdl.handle.net/1721.1/11001.

Abstract:

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1996.
Includes bibliographical references (p. 152-154).
by Larry R. Dennison.
Ph.D.

11

Hassan, Abu S. M. (Abu Saleem Mahmudul). "Testing of board interconnects using boundary scan architecture." Thesis, McGill University, 1989. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=74304.

Abstract:

The testing of printed circuit board (PCB) interconnects is a complex task that requires an enormous amount of resources. With the increasing use of new technologies like surface mounting technology (SMT), testing PCB interconnects using the available techniques, like in-circuit testing and functional testing, is becoming very difficult. To make testing manageable, it must be considered earlier in the design process. This is known as 'design for testability' (DFT). A hierarchical DFT approach known as boundary scan architecture has recently become an increasingly attractive solution for PCB interconnect testing problems. This framework provides a scan path for electronic access to the interconnect test points, thus removing the need for access through electro-mechanical contacts known as a 'bed of nails'.
In the recent past, several researchers have proposed different schemes for PCB interconnect testing based on the boundary scan architecture.
In this dissertation, a new approach, based on the concept of built-in self-test (BIST), is developed using the boundary scan architecture for PCB interconnect testing. BIST, at the component level, generally consists of incorporating additional circuitry on the chip to generate test patterns and to compact the response of the circuit under test into a reference signature. For the PCB level BIST, the board is considered as the unit under test. A family of BIST schemes are developed for board interconnect testing utilizing the properties of the boundary scan architecture. The BIST approach has removed the dependence on automatic test equipment (ATE) for generation of test vector sets and analysis of output data sets. Techniques are developed for the generation of test vector sets which require very simple test generation hardware. Test vector sets are shown to be independent of the order of the input/output (I/O) scan cells in the boundary scan chain and of the structural complexity of the interconnects under test. Response compaction techniques proposed in the schemes are such that fault detection and diagnosis can be done independent of the topological information about the interconnects. These response compaction techniques can be implemented within each boundary scan cell or outside the boundary scan chain, providing a trade-off in terms of test time and hardware complexity. The various uses of the boundary scan architecture make the proposed schemes more attractive and advantageous than the existing approaches for board interconnect testing.
Moreover, a family of interconnect testing schemes is proposed for a partial boundary scan environment. Partial boundary scan environment refers to a board with a mix of boundary scan and non-boundary scan components. Such an environment is more complex compared to a complete boundary scan environment. The proposed schemes are BIST-able despite the inherently complex test environment. However, fault coverage is limited because of the reduced accessibility of the partial boundary scan environment.
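
Illustrative note: a classic way to generate interconnect test vectors with very simple hardware is a counting sequence, in which every net is driven with a unique binary code. The Python sketch below shows that general idea only; it is not one of the BIST schemes developed in the thesis. All function names are hypothetical, and the wired-AND short model is an assumption chosen for the example.

```python
from typing import List


def counting_sequence_codes(num_nets: int) -> List[List[int]]:
    """Assign each net a unique code, avoiding all-zeros and all-ones.

    Net i gets the binary value i + 1. The width is the smallest w with
    2**w >= num_nets + 2, so neither 00..0 nor 11..1 is ever used and a
    net stuck at 0 or 1 cannot reproduce its expected code.
    """
    if num_nets < 1:
        raise ValueError("need at least one net")
    width = (num_nets + 1).bit_length()
    return [[((net + 1) >> bit) & 1 for bit in reversed(range(width))]
            for net in range(num_nets)]


def parallel_test_vectors(codes: List[List[int]]) -> List[List[int]]:
    """Transpose per-net codes into vectors applied to all nets at once."""
    return [list(column) for column in zip(*codes)]


def wired_and_response(codes: List[List[int]], a: int, b: int) -> List[int]:
    """Response seen on nets a and b if they are shorted (wired-AND model)."""
    return [x & y for x, y in zip(codes[a], codes[b])]


def short_is_detected(codes: List[List[int]], a: int, b: int) -> bool:
    """A short is detected if either net's response differs from its code."""
    shorted = wired_and_response(codes, a, b)
    return shorted != codes[a] or shorted != codes[b]


if __name__ == "__main__":
    codes = counting_sequence_codes(6)
    for step, vector in enumerate(parallel_test_vectors(codes)):
        print(f"vector {step}: {vector}")
    print(all(short_is_detected(codes, a, b)
              for a in range(6) for b in range(a + 1, 6)))
```

Because the codes are pairwise distinct, any two shorted nets return a response that differs from at least one of their expected codes, and the number of vectors grows only logarithmically with the number of nets.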

12

Debnath, Kapil. "Photonic crystal cavity based architecture for optical interconnects." Thesis, University of St Andrews, 2013. http://hdl.handle.net/10023/3870.

Abstract:

Today's information and communication industry is confronted with a serious bottleneck due to the prohibitive energy consumption and limited transmission bandwidth of electrical interconnects. Silicon photonics offers an alternative by transferring data optically and thereby eliminating the restriction of electrical interconnects over distance and bandwidth. Due to the inherent advantage of using the same material as that used for the electronic circuitry, silicon photonics also promises high volume and low cost production plus the possibility of integration with electronics. In this thesis, I introduce an all-silicon optical interconnect architecture that promises very high integration density along with very low energy consumption. The basic building block of this architecture is a vertically coupled photonic crystal cavity-waveguide system. This vertically coupled system acts as a highly wavelength selective filter. By suitably designing the waveguide and the cavity, at resonance wavelength of the cavity, large drop in transmission can be achieved. By locally modulating the material index of the cavity electrically, the resonance wavelength of the cavity can be tuned to achieve modulation in the transmission of the waveguide. The detection scheme also utilizes the same vertically coupled system. By creating crystal defects in silicon in the cavity region, wavelength selective photodetection can be achieved. This unique vertical coupling scheme also allows us to cascade multiple modulators and detectors coupled to a single waveguide, thus offering huge channel scalability and design and fabrication simplicity. During this project, I have implemented this vertical coupling scheme to demonstrate modulation with extremely low operating energy (0.6 fJ/bit). Furthermore, I have demonstrated cascadeability and multichannel operation by using a comb laser as the source that simultaneously drives five channels. 
For photodetection, I have realized one of the smallest wavelength selective detectors, with a responsivity of 0.108 A/W at 10 V reverse bias and a dark current of 9.4 nA. By cascading such detectors I have also demonstrated a two-channel demultiplexer.

13

Balakrishnan, Anant. "Analysis and optimization of global interconnects for many-core architectures." Thesis, Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/39632.

Abstract:

The objective of this thesis is to develop circuit-aware interconnect technology optimization for network-on-chip based many-core architectures. The dimensions of global interconnects in many-core chips are optimized for maximum bandwidth density and minimum delay, taking into account network-on-chip router latency and size effects of copper. The optimal dimensions thus obtained are used to characterize different network-on-chip topologies based on wiring area utilization, maximum core-to-core channel width, aggregate chip bandwidth and worst-case latency. Finally, the advantages of many-core many-tier chips are evaluated for different network-on-chip topologies. The area occupied by a router within a core is shown to be the bottleneck to achieving higher performance in network-on-chip based architectures.

14

Karkar, Ammar Jallawi Mahmood. "Interconnects architectures for many-core era using surface-wave communication." Thesis, University of Newcastle upon Tyne, 2016. http://hdl.handle.net/10443/3380.

Abstract:

Networks-on-chip (NoCs) is a communication paradigm that has emerged to address on-chip communication challenges and to satisfy interconnection demands for chip-multiprocessors (CMPs). Nonetheless, there is continuous demand for even higher computational power, which is leading to a relentless downscaling of CMOS technology to enable the integration of many cores. However, technology downscaling favours gates over wires in terms of latency and power consumption. Consequently, this has led to the era of many-core processors, where power consumption and performance are governed by inter-core communications rather than core computation. Therefore, NoCs need to evolve from being merely metal-based implementations, which threaten to be a performance and power bottleneck for many-core efficiency and scalability. To overcome such intensified inter-core communication challenges, this thesis proposes a novel interconnect technology: the surface-wave interconnect (SWI). This new RF-based on-chip interconnect has notable characteristics compared to cutting-edge on-chip interconnects in terms of CMOS compatibility, high speed signal propagation, low power dissipation, and massive signal fan-out. Nonetheless, the realization of the SWI requires investigations at different levels of abstraction, such as the device integration and RF engineering levels. The aim of this thesis is to address the networking and system level challenges and highlight the potential of this interconnect. This should encourage further research at other levels of abstraction. Two specific system-level challenges crucial in future many-core systems are tackled in this study: across-the-chip global communication and one-to-many communication. This thesis makes four major contributions towards this aim. The first is reducing the NoC average hop count, which would otherwise increase packet latency exponentially, by proposing a novel hybrid interconnect architecture. This hybrid architecture not only utilizes both regular metal wires and the SWI, but also exploits the merits of both bus and NoC architectures in terms of connectivity compared to other general-purpose on-chip interconnect architectures. The second contribution addresses global communication issues by developing a distance-based weighted-round-robin arbitration (DWA) algorithm. This technique prioritizes global communication to be sent via SWI short-cuts, which offer more efficient power dissipation and faster across-the-chip signal propagation. Results obtained using a cycle-accurate simulator demonstrate the effectiveness of the proposed system architecture in terms of significant power reduction, considerable average delay reduction and higher throughput compared to a regular NoC. The third contribution is in handling multicast communications, which are normally associated with traffic overload, hotspots and deadlocks and therefore increase, by an order of magnitude, the power consumption and latency. This has been achieved by proposing novel routing and centralized arbitration schemes that exploit the SWI's remarkable fan-out features. The evaluation demonstrates drastic improvements in the effectiveness of the proposed architecture in terms of power consumption (2-10x) and performance (22x) but with negligible hardware overheads (2%). The fourth contribution is to further explore multicast contention handling in a flexible decentralized manner, where original techniques such as stretch-multicast and ID-tagging flow control have been developed. A comparison of these techniques shows that the decentralized approach is superior to the centralized approach at low traffic loads, while the latter outperforms the former near and after NoC saturation.

15

Azadeh, Mohammad. "Current mode processing and architecture for optoelectronically interconnected arrays." Thesis, Connect to this title online; UW restricted, 2000. http://hdl.handle.net/1773/6104.

16

Neel, Brian. "High Performance Shared Memory Networking in Future Many-core Architectures Using Optical Interconnects." Ohio University / OhioLINK, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1397488118.

17

Chung, Kee Shik. "ILP-SIMD : an instruction parallel SIMD architecture with short-wire interconnects." Diss., Georgia Institute of Technology, 2000. http://hdl.handle.net/1853/15455.

18

HUANG, RENQIU. "PHYSICAL AWARE HIGH LEVEL SYNTHESIS AND INTERCONNECT FOR FPGAs." University of Cincinnati / OhioLINK, 2006. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1147616884.

19

Sato, Ken-ichi, Hiroshi Hasegawa, and Yuto Iwai. "A large-scale photonic node architecture that utilizes interconnected OXC subsystems." Optical Society of America, 2013. http://hdl.handle.net/2237/21044.

20

Apsel, Alyssa Beth. "Optoelectronic receivers in silicon on sapphire CMOS architecture and design for efficient parallel interconnects." Available to US Hopkins community, 2002. http://wwwlib.umi.com/dissertations/dlnow/3068111.

21

Kaplan, Adam Blake. "Architectural integration of RF-Interconnect to enhance on-chip communication for many-core chip multiprocessors." Diss., Restricted to subscribing institutions, 2008. http://proquest.umi.com/pqdweb?did=1780840191&sid=1&Fmt=2&clientId=1564&RQT=309&VName=PQD.

22

Li, Hui. "Design methods for energy-efficient silicon photonic interconnects on chip." Thesis, Lyon, 2016. http://www.theses.fr/2016LYSEC059/document.

Abstract:

Silicon photonics is an emerging technology considered one of the key solutions for future-generation on-chip interconnects, providing several prospective advantages such as low transmission latency and high bandwidth. However, it still encounters challenges in energy efficiency. Different topologies, physical layouts, and architectures provide various interconnect options for on-chip communication. This leads to a large variation in optical losses, which are one of the predominant factors in power consumption. In addition, silicon photonic devices are highly sensitive to temperature variation. Under a given chip activity, this leads to a lower laser efficiency and a drift of the wavelengths of optical devices (on-chip lasers and microring resonators (MRs)), which in turn results in a higher Bit Error Ratio (BER) and consequently reduces the energy efficiency of optical interconnects. In this thesis, we work on design methodologies for energy-efficient on-chip silicon photonic interconnects that take topology/layout, thermal variation, and architecture into account.

23

Zhu, Lingbo. "Controlled Fabrication of Aligned Carbon Nanotube Architectures for Microelectronics Packaging Applications." Diss., Georgia Institute of Technology, 2007. http://hdl.handle.net/1853/19739.

Abstract:

This thesis is devoted to the fabrication of carbon nanotube structures for microelectronics packaging applications with an emphasis on fundamental studies of nanotube growth and assembly, wetting of nanotube structures, and nanotube-based composites. A CVD process is developed that allows controlled growth of a variety of CNT structures, such as CNT films, bundles, and stacks. Use of an Al2O3 support enhances the Fe catalyst activity, increasing the CNT growth rate by nearly two orders of magnitude under the same growth conditions. By introducing a trace amount of weak oxidants into the CVD chamber during CNT growth, aligned CNT ends can be opened and/or functionalized, depending on the selection of oxidants. By varying the growth temperature, CNT growth can be performed in a gas diffusion- or kinetics-controlled regime. To overcome the challenges that impede implementation of CNTs in circuitry, a CNT transfer process was proposed to assemble aligned CNT structures (films, stacks & bundles) at low temperature, which ensures compatibility with current microelectronics fabrication sequences and technology. Field emission and electrical testing of the as-assembled CNT devices indicate good electrical contact between CNTs and solder and a very low contact resistance across CNT/solder interfaces. For attachment of CNTs and other applications (e.g. composites), wetting of nanotube structures was studied. Two model surfaces with two-tier scale roughness were fabricated by controlled growth of CNT arrays followed by coating with fluorocarbon layers formed by plasma polymerization to study roughness geometric effects on superhydrophobicity. Due to the hydrophobicity of nanotube structures, electrowetting was investigated to reduce the hydrophobicity of aligned CNTs by controllably reducing the interfacial tension between carbon nanotubes (CNTs) and liquids. Electrowetting can greatly reduce the contact angle of liquids on the surfaces of aligned CNT films. 
However, contact angle saturation still occurs. Variable frequency microwave (VFM) radiation can greatly improve the CNT/epoxy interfacial bonding strength. Compared to composites cured by thermal heating, VFM-cured composites demonstrate higher CNT/matrix interfacial bonding strength, which is reflected in composite negative thermal expansion. The improved CNT/epoxy interface enhances the thermal conductivity of the composites by 26-30%.

24

Wit, Mark Stuart de. "The MINT architecture : a design for providing quality of service support in desktop-level interconnects." Thesis, University of Glasgow, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.434028.

25

Cloonan, Thomas J. "A high bit-rate packet switch architecture with advanced electronic packaging and free-space optical interconnects." Thesis, Heriot-Watt University, 1993. http://hdl.handle.net/10399/1461.

26

McKenny, Martin. "An investigation into the performance of the RapidIO interconnect architecture in the context of a generic routing device." Thesis, University of Glasgow, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.412933.

27

Weinberg, Gil 1967. "Interconnected musical networks : bringing expression and thoughtfulness to collaborative group playing." Thesis, Massachusetts Institute of Technology, 2003. http://hdl.handle.net/1721.1/28287.

Abstract:

Thesis (Ph. D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2003.
Includes bibliographical references (p. 211-219).
Music today is more ubiquitous, accessible, and democratized than ever. Thanks to technologies such as high-end home studios, audio compression, and digital distribution, music now surrounds us in everyday life, almost every piece of music is a few minutes of download away, and almost any western musician, novice or expert, can compose, perform and distribute their music directly to their listeners from their home studios. But at the same time these technologies lead to some concerning social effects on the culture of consuming and creating music. Although music is available to more people, in more locations, and for longer periods of time, most listeners experience it in an incidental, unengaged, or utilitarian manner. On the creation side, home studios promote a private and isolated practice of music making where hardly any musical instruments or even musicians are needed, and where the value of live group interaction is marginal. My thesis work attempts to use technology to address the very effects it has created by developing tools and applications that address two main challenges: (1) facilitating engaged and thoughtful as well as intuitive and expressive musical experiences for novices and children; (2) enhancing the inherent social attributes of music making by connecting to and intensifying the roots of music as a collaborative social ritual. My approach for addressing the first challenge is to study and model music cognition and education theories and to design algorithms that bridge the thoughtful and the expressive, allowing novices and children access to meaningful and engaging musical experiences. In order to address the latter challenge I have decided to employ the digital network, a promising candidate for bringing a unique added value to the musical experience of collaborative group playing. I have chosen to address both challenges by embedding cognitive and educational concepts in newly designed interconnected instruments and applications, which led to the development of a number of such Interconnected Musical Networks (IMNs): live performance systems that allow players to influence, share, and shape each other's music in real-time. In my thesis I discuss the concepts, motivations, and aesthetics of IMNs and review a number of historical and current technological landmarks that led the way to the development of the field. I then suggest a comprehensive theoretical framework for artistic interdependency, based on which I developed a set of instruments and activities in an effort to turn IMNs into an expressive and intuitive art form that provides meaningful learning experiences, engaging collaborative interactions, and worthy music.
by Gil Weinberg.
Ph.D.

28

Gruenwald, Benjamin Charles. "Toward Verifiable Adaptive Control Systems: High-Performance and Robust Architectures." Scholar Commons, 2018. https://scholarcommons.usf.edu/etd/7676.

Abstract:

In this dissertation, new model reference adaptive control architectures are presented with stability, performance, and robustness considerations, to address challenges related to the verification of adaptive control systems. The challenges associated with the transient performance of adaptive control systems are first addressed using two new approaches that improve the transient performance. Specifically, the first approach is predicated on a novel controller architecture, which involves added terms in the update law entitled artificial basis functions. These terms are constructed through a gradient optimization procedure to minimize the system error between an uncertain dynamical system and a given reference model during the learning phase of an adaptive controller. The second approach is an extension of the first one and minimizes the effect of the system uncertainties more directly in the transient phase. In addition, this approach uses a varying gain to enforce performance bounds on the system error and is further generalized to adaptive control laws with nonlinear reference models. Another challenge in adaptive control systems is to achieve system stability and a prescribed level of performance in the presence of actuator dynamics. It is well-known that if the actuator dynamics do not have sufficiently high bandwidth, their presence cannot be practically neglected in the design since they limit the achievable stability of adaptive control laws. Another major contribution of this dissertation is to address this challenge. In particular, first a linear matrix inequalities-based hedging approach is proposed, where this approach modifies the ideal reference model dynamics to allow for correct adaptation that is not affected by the presence of actuator dynamics. 
The stability limits of this approach are computed using linear matrix inequalities, revealing the fundamental stability interplay between the parameters of the actuator dynamics and the allowable system uncertainties. In addition, these computations are used to depict the feasible region of the actuator parameters, so that robustness to variation in the parameters is addressed. Furthermore, the convergence properties of the modified reference model to the ideal reference model are analyzed. Generalizations and applications of the proposed approach are then provided. Finally, to improve upon this linear matrix inequalities-based hedging approach, a new adaptive control architecture using expanded reference models is proposed. It is shown that the expanded reference model trajectories follow the trajectories of the ideal reference model more closely than those of the hedging approach and that, through the augmentation of a command governor architecture, asymptotic convergence to the ideal reference model can be guaranteed. To provide additional robustness against possible uncertainties in the actuator bandwidths, an estimation of the actuator bandwidths is incorporated. Lastly, the challenge presented by the unknown physical interconnection of large-scale modular systems is addressed. First, a decentralized adaptive architecture is proposed in an active-passive modular framework. Specifically, this architecture is based on a set-theoretic model reference adaptive control approach that allows for command following of the active module in the presence of module-level system uncertainties and unknown physical interconnections between both active and passive modules. The key feature of this framework is that the system error trajectories of the active modules are contained within a priori, user-defined compact sets, thereby enforcing strict performance guarantees.
This architecture is then extended such that performance guarantees are enforced on not only the actuated portion (active module) of the interconnected dynamics but also the unactuated portion (passive module). For each proposed adaptive control architecture, a system theoretic approach is included to analyze the closed-loop stability properties using tools from Lyapunov stability, linear matrix inequalities, and matrix mathematics. Finally, illustrative numerical examples are included to elucidate the proposed approaches.

29

Puche, Lara José. "Novel Cache Hierarchies with Photonic Interconnects for Chip Multiprocessors." Doctoral thesis, Universitat Politècnica de València, 2021. http://hdl.handle.net/10251/165254.

Abstract:

[ES] Los procesadores multinúcleo actuales cuentan con recursos compartidos entre los diferentes núcleos. Dos de estos recursos compartidos, la cache de último nivel y el ancho de banda de memoria principal, pueden convertirse en cuellos de botella para el rendimiento. Además, con el crecimiento del número de núcleos que implementan los diseños más recientes, la red dentro del chip también se convierte en un cuello de botella que puede afectar negativamente al rendimiento, ya que las redes tradicionales pueden encontrar limitaciones a su escalabilidad en el futuro cercano. Prácticamente la totalidad de los diseños actuales implementan jerarquías de memoria que se comunican mediante rápidas redes de interconexión. Esta organización es eficaz dado que permite reducir el número de accesos que se realizan a memoria principal y la latencia media de acceso a memoria. Las caches, la red de interconexión y la memoria principal, conjuntamente con otras técnicas conocidas como la prebúsqueda, permiten reducir las enormes latencias de acceso a memoria principal, limitando así el impacto negativo ocasionado por la diferencia de rendimiento existente entre los núcleos de cómputo y la memoria. Sin embargo, compartir los recursos mencionados es fuente de diferentes problemas y retos, siendo uno de los principales el manejo de la interferencia entre aplicaciones. Hacer un uso eficiente de la jerarquía de memoria y las caches, así como contar con una red de interconexión apropiada, es necesario para sostener el crecimiento del rendimiento en los diseños tanto actuales como futuros. Esta tesis analiza y estudia los principales problemas e inconvenientes observados en estos dos recursos: la cache de último nivel y la red dentro del chip. En primer lugar, se estudia la escalabilidad de las tradicionales redes dentro del chip con topología de malla, así como esta puede verse comprometida en próximos diseños que cuenten con mayor número de núcleos. 
Los resultados de este estudio muestran que, a mayor número de núcleos, el impacto negativo de la distancia entre núcleos en la latencia puede afectar seriamente al rendimiento del procesador. Como solución a este problema, en esta tesis proponemos una red de interconexión óptica modelada en un entorno de simulación detallado, que supone una solución viable a los problemas de escalabilidad observados en los diseños tradicionales. A continuación, esta tesis dedica un esfuerzo importante a identificar y proponer soluciones a los principales problemas de diseño de las jerarquías de memoria actuales como son, por ejemplo, el sobredimensionado del espacio de cache privado, la existencia de réplicas de datos y la rigidez e incapacidad de adaptación de las estructuras de cache. Aunque bien conocidos, estos problemas y sus efectos adversos en el rendimiento pueden ser evitados en procesadores de alto rendimiento gracias a la enorme capacidad de la cache de último nivel que este tipo de procesadores típicamente implementan. Sin embargo, en procesadores de bajo consumo, no existe la posibilidad de contar con tales capacidades y hacer un uso eficiente del espacio disponible es crítico para mantener el rendimiento. Como solución a estos problemas en procesadores de bajo consumo, proponemos una novedosa organización de jerarquía de cache de dos niveles que utiliza una red de interconexión óptica. Los resultados obtenidos muestran que, comparado con diseños convencionales, el consumo de energía estática en la arquitectura propuesta es un 60 % menor, pese a que los resultados de rendimiento presentan valores similares. Por último, hemos extendido la arquitectura propuesta para dar soporte tanto a aplicaciones paralelas como secuenciales. Los resultados obtenidos con esta nueva arquitectura muestran un ahorro de hasta el 78 % de energía estática en la ejecución de aplicaciones paralelas.
[CA] Els processadors multinucli actuals compten amb recursos compartits entre els diferents nuclis. Dos d'aquests recursos compartits, la memòria d’últim nivell i l'ample de banda de memòria principal, poden convertir-se en colls d'ampolla per al rendiment. A mes, amb el creixement del nombre de nuclis que implementen els dissenys mes recents, la xarxa dins del xip també es converteix en un coll d'ampolla que pot afectar negativament el rendiment, ja que les xarxes tradicionals poden trobar limitacions a la seva escalabilitat en el futur proper. Pràcticament la totalitat dels dissenys actuals implementen jerarquies de memòria que es comuniquen mitjançant rapides xarxes d’interconnexió. Aquesta organització es eficaç ates que permet reduir el nombre d'accessos que es realitzen a memòria principal i la latència mitjana d’accés a memòria. Les caches, la xarxa d’interconnexió i la memòria principal, conjuntament amb altres tècniques conegudes com la prebúsqueda, permeten reduir les enormes latències d’accés a memòria principal, limitant així l'impacte negatiu ocasionat per la diferencia de rendiment existent entre els nuclis de còmput i la memòria. No obstant això, compartir els recursos esmentats és font de diversos problemes i reptes, sent un dels principals la gestió de la interferència entre aplicacions. Fer un us eficient de la jerarquia de memòria i les caches, així com comptar amb una xarxa d’interconnexió apropiada, es necessari per sostenir el creixement del rendiment en els dissenys tant actuals com futurs. Aquesta tesi analitza i estudia els principals problemes i inconvenients observats en aquests dos recursos: la memòria cache d’últim nivell i la xarxa dins del xip. En primer lloc, s'estudia l'escalabilitat de les xarxes tradicionals dins del xip amb topologia de malla, així com aquesta es pot veure compromesa en propers dissenys que compten amb major nombre de nuclis. 
Els resultats d'aquest estudi mostren que, a major nombre de nuclis, l'impacte negatiu de la distància entre nuclis en la latència pot afectar seriosament al rendiment del processador. Com a solució a aquest problema, en aquesta tesi proposem una xarxa d’interconnexió òptica modelada en un entorn de simulació detallat, que suposa una solució viable als problemes d'escalabilitat observats en els dissenys tradicionals. A continuació, aquesta tesi dedica un esforç important a identificar i proposar solucions als principals problemes de disseny de les jerarquies de memòria actuals com son, per exemple, el sobredimensionat de l'espai de memòria cache privat, l’existència de repliques de dades i la rigidesa i incapacitat d’adaptació de les estructures de memòria cache. Encara que ben coneguts, aquests problemes i els seus efectes adversos en el rendiment poden ser evitats en processadors d'alt rendiment gracies a l'enorme capacitat de la memòria cache d’últim nivell que aquest tipus de processadors típicament implementen. No obstant això, en processadors de baix consum, no hi ha la possibilitat de comptar amb aquestes capacitats, i fer un us eficient de l'espai disponible es torna crític per mantenir el rendiment. Com a solució a aquests problemes en processadors de baix consum, proposem una nova organització de jerarquia de dos nivells de memòria cache que utilitza una xarxa d’interconnexió òptica. Els resultats obtinguts mostren que, comparat amb dissenys convencionals, el consum d'energia estàtica en l'arquitectura proposada és un 60 % menor, malgrat que els resultats de rendiment presenten valors similars. Per últim, hem estes l'arquitectura proposada per donar suport tant a aplicacions paral·leles com seqüencials. Els resultats obtinguts amb aquesta nova arquitectura mostren un estalvi de fins al 78 % d'energia estàtica en l’execució d'aplicacions paral·leles.
[EN] Current multicores face the challenge of sharing resources among the different processor cores. Two main shared resources act as major performance bottlenecks in current designs: the off-chip main memory bandwidth and the last level cache. Additionally, as the core count grows, the network on chip is also becoming a potential performance bottleneck, since traditional designs may face scalability issues in the near future. Memory hierarchies connected through fast interconnects are implemented in almost every current design, as they reduce the number of off-chip accesses and the overall latency, respectively. Main memory, caches, and interconnection resources, together with other widely used techniques like prefetching, help alleviate the huge memory access latencies and limit the impact of the core-memory speed gap. However, sharing these resources brings several concerns, one of the most challenging being the management of inter-application interference. Since almost every running application needs to access main memory, all of them are exposed to interference from other co-runners on their way to the memory controller. For this reason, making efficient use of the available cache space, together with achieving fast and scalable interconnects, is critical to sustain performance in current and future designs. This dissertation analyzes and addresses the most important shortcomings of two major shared resources: the Last Level Cache (LLC) and the Network on Chip (NoC). First, we study the scalability of both electrical and optical NoCs for future multicores and many-cores. To perform this study, we model optical interconnects in a cycle-accurate multicore simulation framework. A proper model is required; otherwise, important performance deviations may be observed in the evaluation results. The study reveals that, as the core count grows, the effect of distance on the end-to-end latency can negatively impact processor performance.
In contrast, the study also shows that silicon nanophotonics are a viable solution to the mentioned latency problems. This dissertation is also motivated by important design concerns related to current memory hierarchies, like the oversizing of private cache space, data replication overheads, and lack of flexibility regarding sharing of cache structures. These issues, which can be overcome in high performance processors by virtue of huge LLCs, can compromise performance in low power processors. To address these issues, we propose a more efficient cache hierarchy organization that leverages optical interconnects. The proposed architecture is conceived as an optically interconnected two-level cache hierarchy composed of multiple cache modules that can be dynamically turned on and off independently. Experimental results show that, compared to conventional designs, static energy consumption is reduced by up to 60% while achieving similar performance results. Finally, we extend the proposal to support both sequential and parallel applications. This extension is required since the proposal adapts to the dynamic cache space needs of the running applications, and the behavior of multithreaded applications differs widely from that of single-threaded programs. In addition, coherence management is also addressed, which is challenging since each cache module can be assigned to any core at a given time in the proposed approach. For parallel applications, the evaluation shows that the proposal achieves up to 78% static energy savings. In summary, this thesis tackles major challenges originating from the sharing of on-chip caches and communication resources in current multicores, and proposes new cache hierarchy organizations leveraging optical interconnects to address them. The proposed organizations reduce both static and dynamic energy consumption compared to conventional approaches while achieving similar performance, which results in better energy efficiency.
Puche Lara, J. (2021). Novel Cache Hierarchies with Photonic Interconnects for Chip Multiprocessors [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/165254

30

Rajamanikkam, Chidhambaranathan. "Understanding Security Threats of Emerging Computing Architectures and Mitigating Performance Bottlenecks of On-Chip Interconnects in Manycore NTC System." DigitalCommons@USU, 2019. https://digitalcommons.usu.edu/etd/7453.

Abstract:

Emerging computing architectures, such as neuromorphic computing and third-party intellectual property (3PIP) cores, have attracted significant attention in the recent past. Neuromorphic computing introduces an unorthodox non-von Neumann architecture that mimics the abstract behavior of neuron activity in the human brain. Such architectures can execute more complex applications, such as image processing and object recognition, more efficiently in terms of performance and energy than traditional microprocessors. However, the study of the hardware security aspects of neuromorphic computing is still at a nascent stage. 3PIP cores, on the other hand, may contain covertly inserted malicious functional behavior that can inflict a range of harms at the system and application levels. This dissertation examines the impact of various threat models that emerge from neuromorphic architectures and 3PIP cores. Near-Threshold Computing (NTC) serves as an energy-efficient paradigm by aggressively operating all computing resources with a supply voltage close to the threshold voltage, at the cost of performance. Therefore, the STC system is scaled to a many-core NTC system to reclaim the lost performance. However, interconnect performance in a many-core NTC system poses a significant bottleneck that hinders overall system performance. This dissertation analyzes the interconnect performance and, further, proposes a novel technique to boost the interconnect performance of many-core NTC systems.

31

Siboni, Didier. "La gestion de service sur les réseaux hétérogènes interconnectes : utilisation des techniques d'intelligence artificielle et architectures hybrides." Versailles-St Quentin en Yvelines, 1997. http://www.theses.fr/1997VERS0005.

Abstract:

The ever-increasing complexity of interconnected communication networks makes their management increasingly difficult. Expert systems have been widely used to solve this type of problem, but they have shown their limits in accounting for changes in network configurations and for the often scarce expertise that characterizes this application domain. In this thesis, we propose an intelligent, integrated fault management system that cooperatively uses the learning techniques of neural networks and case-based reasoning together with expert systems. We propose an overall organization of the fault management process in three main stages, each grouping specialized management functions. The first is problem detection, which requires filtering and diagnosis functions for fault-type or performance-type problems. The second is measuring the impact of problems on quality of service, which requires modeling quality of service at the level of the various services provided by the constituent elements of the link chains. The third is problem resolution by means of resource reconfiguration and reallocation functions, corrective and preventive maintenance, and problem history management or incident ticket management. We study the various artificial intelligence techniques involved in the fault management process, such as model-based expert systems, neural networks, and case-based reasoning. We establish a correspondence between the different stages and the artificial intelligence techniques in order to propose a modular architecture built on top of an open network management platform. An operational alarm correlation prototype using the model-based expert system technique was developed and integrated into the HP OpenView platform.
A prototype corresponding to the filtering module, using the neural network technique, was developed and tested with positive results on real data. A problem qualification and correction module using the case-based reasoning learning technique on a base of incident tickets was studied.

32

Young, Jeffrey. "Dynamic partitioned global address spaces for high-efficiency computing." Thesis, Atlanta, Ga. : Georgia Institute of Technology, 2008. http://hdl.handle.net/1853/26467.

Abstract:

Thesis (M. S.)--Electrical and Computer Engineering, Georgia Institute of Technology, 2009.
Committee Chair: Yalamanchili, Sudhakar; Committee Member: Riley, George; Committee Member: Schimmel, David. Part of the SMARTech Electronic Thesis and Dissertation Collection.

33

Arsenault, Patrick. "Cross-talk analysis of a 12 channel 2.5 GBs VCSEL array based parallel optical interconnect for a multi-stage scalable router architecture." Thesis, McGill University, 2002. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=29571.

Abstract:

To support and keep pace with Internet growth, new routers based on optical multi-stage architectures are emerging. These routers consist of multiple shelves interconnected with parallel optical interconnects. This thesis analyzes the inter-channel crosstalk of a state-of-the-art 1 x 12 VCSEL and PIN array based parallel optical interconnect operating at 2.5 Gb/s. The crosstalk properties of the parallel optical interconnect impact the optical power link budget and the scalability of these multi-stage routers.
To study the crosstalk properties of the optical interconnect, a special test set-up and detailed test procedures were created to analyse the bit error rate and jitter performance of the parallel optical interconnect in multi-channel operation. The results obtained from the pre-defined experiments confirmed the degradation of the interconnect performance due to inter-channel crosstalk. This performance penalty also limits system scalability, especially when it is combined with the inherent crosstalk properties of the optical redirection boxes. The sources of inter-channel crosstalk were also determined. Finally the system optical link budgets were adjusted and rough system scalability limits were obtained.

34

Polster, Robert. "Architecture of Silicon Photonic Links." Thesis, Paris 11, 2015. http://www.theses.fr/2015PA112177/document.

Abstract:

Les futurs calculateurs de haute performance (HPC) devront faire face à deux défis majeurs : la densité de la bande passante d'interconnexion et les problématiques de consommation d'énergie. La photonique silicium est aujourd’hui perçue comme une solution solide pour aborder ces questions, tant du fait de ses performances que de sa viabilité économique en raison de sa compatibilité directe avec la microélectronique CMOS. Actuellement, une tendance de fond conduit à remplacer les interconnexions métalliques par des liens optiques ; cette évolution a été initiée sur des liaisons grandes distances mais atteint actuellement le niveau des liaisons entre cartes électroniques et pourrait conduire à moyen terme à l’intégration de liens optiques au sein mêmes des circuits intégrés électroniques. La prochaine étape est en effet envisagée pour l'interconnexion des processeurs au sein de puces multi-cœurs en positionnant les liens photoniques sur un même support de silicium (« interposer »). Plusieurs travaux ont démontré la possibilité d'intégrer tous les éléments nécessaires pour la réalisation de liaisons optiques sur un substrat de silicium ouvrant des perspectives de co-intégration optique et électronique très riches. Dans ce contexte, la première contribution de cette thèse est l'optimisation d'un lien de photonique de silicium en terme d'efficacité énergétique par bit (à minimiser). L'optimisation que nous avons conduite a pris en compte une modélisation de la consommation d'énergie pour le laser de la liaison, celle de l’étape dé-sérialisation des données, du résonateur en anneau considéré comme modulateur optique et des circuits de réception (« front-end ») et de décision. Les résultats ont montré que les principales contributions à la consommation de puissance au sein d’un lien optique sont la puissance consommée par le laser et les circuits d’alimentation du modulateur électro-optique.
En considérant des paramètres de consommation extraits de simulations numériques et de travaux publiés dans des publications récentes, le débit optimal identifié se trouve dans la plage comprise entre 8 Gbits/seconde et 22 Gbits/seconde selon le nœud technologique CMOS utilisé (65nm à 28nm FD SOI). Il est également apparu qu’une diminution de la consommation de puissance statique du modulateur utilisé pourrait encore ramener ce débit optimal en-dessous de 8 Gbits/seconde. Afin de vérifier ces résultats, un circuit intégré récepteur de liaison optique a été conçu et fabriqué en se basant sur un débit de fonctionnement de 8 Gbits/seconde. Le récepteur utilise une technique d’entrelacement temporel destinée à réduire la vitesse d'horloge nécessaire et à éviter potentiellement l’étape de dé-sérialisation dédiée des informations.
Future high performance computer (HPC) systems will face two major challenges: interconnection bandwidth density and power consumption. Silicon photonic technology has recently been proposed as a cost-effective solution to tackle these issues. Currently, copper interconnections are being replaced by optical links at rack and board level in HPCs and data centers. The next step is the interconnection of multi-core processors, which are placed in the same package on silicon interposers and form the basic building blocks of these computers. Several works have demonstrated the possibility of integrating all elements needed for the realization of short optical links on a silicon substrate. The first contribution of this thesis is the optimization of a silicon photonic link for highest energy efficiency in terms of energy per bit. The optimization provides energy consumption models for the laser, the serialization and deserialization stages, a ring resonator used as modulator together with its supporting circuitry, a receiver front-end, and a decision stage. The optimization shows that the main power consumers in optical links are the laser and the modulator's supporting circuitry. Using consumption parameters either gathered by design and simulation or found in recent publications, the optimal bit rate is found to lie between 8 Gbps and 22 Gbps, depending on the CMOS technology used. Nevertheless, if the static power consumption of modulators is reduced, the optimal bit rate could decrease even below 8 Gbps. To apply the results of the optimization, an optical link receiver was designed and fabricated. It is designed to run at a bit rate of 8 Gbps. The receiver uses time interleaving to reduce the required clock speed and alleviate the need for a dedicated deserialization stage. The front-end was adapted for a wide dynamic input range.
In order to take advantage of it, a fast mechanism is proposed to find the optimal threshold voltage to distinguish ones from zeros. Furthermore, optical clock channels are explored. Using silicon photonics, a clock can be distributed to several processors with very low skew. This opens the possibility of clocking all chips synchronously, relaxing the requirements for the buffers that are needed within the communication channels. The thesis contributes to this research direction by presenting two novel optical clock receivers. Clock distribution inside chips is a major power consumer; with small adaptations, the clock receivers could also be used inside on-chip clocking trees.

35

Potluri, Sreeram. "Enabling Efficient Use of MPI and PGAS Programming Models on Heterogeneous Clusters with High Performance Interconnects." The Ohio State University, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=osu1397797221.

36

Reehal, Gursharan Kaur. "Designing Low Power and High Performance Network-on-Chip Communication Architectures for Nanometer SoCs." The Ohio State University, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=osu1340022240.

37

Wu, Jiesheng. "Communication and memory management in networked storage systems." The Ohio State University, 2004. http://rave.ohiolink.edu/etdc/view?acc_num=osu1095696917.

38

Anterola, Jeremy K. "Intelligent adaptive environments : proposal for inclusive, interactive design enabling the creation of an interconnected public open space on the Iron Horse trestle interurban-railroad-subway [St. Louis, Missouri]." Manhattan, Kan. : Kansas State University, 2009. http://hdl.handle.net/2097/1493.

39

Tonchev, Anton. "Door, Passage, Courtyard: Shifting Perspective in Gamla Stan." Thesis, KTH, Arkitektur, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-281363.

Abstract:

A historical study of the urban texture of Gamla Stan shows how public space has been appropriated for private needs. Streets were built over, closed off or turned into private courtyards, some of which have started to disappear, being completely internalized. This process of space appropriation was one-directional until the early 1900s, when the fear of losing structures across town made authorities create a precedent and reverse the process by removing specific houses from the urban texture. This approach is based on a set of rules which I changed when making my project: the re-examination of all the hidden, internal, private spaces and their re-introduction to public life. My set of criteria is rooted in a study of the elements that constitute the borderline between the public and private realms in Gamla Stan: doors, passages and courtyards. Based on that, I limited my intervention techniques to the removal of three elements: fences, structures, and doors. The last one has two sub-categories: "the removed wall" (turned into a new door) and "the removed lock" (opening an existing door). Having established the parameters of my work, I tested this speculation in a specific case scenario: a cluster of four blocks on the west side of Gamla Stan. Using the rule that a door must be the beginning of a corridor path that leads to an open court, and having the historical knowledge of the location of past public spaces, I surgically removed later additions of lesser architectural or historical quality. The result is a new interconnected, accessible network. Until now, one was restricted to walking along the streets and alleys, and around buildings in Gamla Stan. With this intervention, people can walk through the buildings and into the reclaimed spaces, thus shifting their perception of the urban texture. The new alternative, total system of navigation turns solid into permeable/perforated. Alley City has become Corridor City.

40

Bui, Thanh Thi Thanh. "Interconnect architectures for dynamically partially reconfigurable systems." Thesis, 2017. http://hdl.handle.net/2440/113589.

Abstract:

Dynamically partially reconfigurable FPGAs (Field-Programmable Gate Arrays) allow hardware modules to be placed and removed at runtime while other parts of the system keep working. Because of their potential benefits, they have been the topic of a great deal of research over the last decade. To exploit the partial reconfiguration capability of FPGAs, there is a need for an efficient communication infrastructure that automatically adapts as modules are added to and removed from the system. Many bus and network-on-chip (NoC) architectures have been proposed to exploit this capability on FPGA technology. However, few realizations have been reported in the public literature to demonstrate or compare their performance in real-world applications. While partial reconfiguration can offer many benefits, it is still rarely exploited in practical applications. Few full realizations of partially reconfigurable systems in current FPGA technologies have been published. More application experiments are required to understand the benefits and limitations of implementing partially reconfigurable systems and to guide their further development. The motivation of this thesis is to fill this research gap by providing empirical evidence of the costs and benefits of different interconnect architectures. The results will provide a baseline for future research and will be directly useful for circuit designers who must make a well-reasoned choice between the alternatives. This thesis contains the results of experiments to compare different NoC and bus interconnect architectures for FPGA-based designs in general and for dynamically partially reconfigurable systems in particular. These two interconnect schemes are implemented and evaluated in terms of performance, area and power consumption using FFT (Fast Fourier Transform) and ANN (Artificial Neural Network) systems as benchmarks.
Conclusions drawn from these results include recommendations concerning the interconnect approach for different kinds of applications. It is found that a NoC provides much better performance than a single channel bus and similar performance to a multi-channel bus in both parallel and parallel-pipelined FFT systems. This suggests that a NoC is a better choice for systems with multiple simultaneous communications like the FFT. Bus-based interconnect achieves better performance and consume less area and power than NoCbased scheme for the fully-connected feed-forward NN system. This suggests buses are a better choice for systems that do not require many simultaneous communications or systems with broadcast communications like a fully-connected feed-forward NN. Results from the experiments with dynamic partial reconfiguration demonstrate that buses have the advantages of better resource utilization and smaller reconfiguration time and memory than NoCs. However, NoCs are more flexible and expansible. They have the advantage of placing almost all of the communication infrastructure in the dynamic reconfiguration region. This means that different applications running on the FPGA can use different interconnection strategies without the overhead of fixed bus resources in the static region. Another objective of the research is to examine the partial reconfiguration process and reconfiguration overhead with current FPGA technologies. Partial reconfiguration allows users to efficiently change the number of running PEs to choose an optimal powerperformance operating point at the minimum cost of reconfiguration. However, this brings drawbacks including resource utilization inefficiency, power consumption overhead and decrease in system operating frequency. The experimental results report a 50% of resource utilization inefficiency with a power consumption overhead of less than 5% and a decrease in frequency of up to 32% compared to a static implementation. 
The results also show that most of the drawbacks of partial reconfiguration implementation come from the restrictions and limitations of partial reconfiguration design flow. If these limitations can be addressed, partial reconfiguration should still be considered with its potential benefits.
Thesis (Ph.D.) -- University of Adelaide, School of Electrical and Electronic Engineering, 2018

41

Chen, Yi-Chiao, and 陳意喬. "High-Performance Deadlock-Free ID Assignment for Advanced Interconnect Architectures." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/98720923319134048396.

Abstract:

Master's
National Tsing Hua University
Department of Computer Science
Academic year 102
In a modern System on Chip (SoC) design, hundreds of cores and Intellectual Properties (IPs) can be integrated into a single chip. To be suitable for high-performance interconnects, designers increasingly adopt advanced interconnect protocols which support novel mechanisms of parallel accessing including outstanding transactions and out-of-order completion of transactions. To implement those novel mechanisms, a master tags an ID to each transaction to decide in-order or out-of-order properties. However, these advanced protocols may lead to a deadlock problem that does not occur in traditional protocols. To prevent the deadlock problem, current solutions stall suspicious transactions and in certain cases, many such stalls can cause serious performance penalty. In this paper, we propose a novel ID assignment mechanism which guarantees the issued transactions to be deadlock-free and results in significant reduction in the number of stalls. Our experimental results show encouraging performance improvement compared to previous works with little hardware overhead.

42

Masud, Muhammad Imran. "FPGA routing structures : a novel switch block and depopulated interconnect matrix architectures." Thesis, 2000. http://hdl.handle.net/2429/10309.

Abstract:

Field-Programmable Gate Arrays (FPGAs) are integrated circuits which can be programmed to implement virtually any digital circuit. This programmability provides a low-risk, low-turnaround-time option for implementing digital circuits. This programmability comes at a cost, however. Typically, circuits implemented on FPGAs are three times as slow and have only one tenth the density of circuits implemented using more conventional techniques. Much of this area and speed penalty is due to the programmable routing structures contained in the FPGA. By optimizing these routing structures, significant performance and density improvements are possible. In this thesis, we focus on the optimization of two of these routing structures. First, we focus on a switch block, which is a programmable switch connecting fixed routing tracks. A typical FPGA contains several hundred switch blocks; thus optimization of these blocks is very important. We present a novel switch block that, when used in a realistic FPGA architecture, is more efficient than all previously proposed switch blocks. Through experiments, we show that the new switch block results in up to 13% fewer transistors in the routing fabric compared to the best previous switch block architectures, with virtually no effect on the speed of the FPGA. Second, we focus on the logic block Interconnect Matrix, which is a programmable switch connecting logic elements. We show that we can create a smaller, faster Interconnect Matrix by removing switches from the matrix. We also show, however, that removing switches in this way places additional constraints on the other FPGA routing structures. Through experiments, we show that, after compensating for the reduced flexibility of the Interconnect Matrix, the overall effect on the FPGA density and speed is negligible.

43

Li, Katherine Shu-Min, and 李淑敏. "Interconnect-Centric Oscillation Ring Architectures and Algorithms for SoC Testability and Yield Enhancement." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/a6p467.

Abstract:

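Doctoral
National Chiao Tung University
Department of Electronics Engineering and Institute of Electronics
Academic year 94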
Interconnects play a dominant role in deep-submicron and nanotechnologies. As a result, testability and yield problems of interconnects attract increasing attention. A paradigm shift in interconnect-related problems is indispensable to cope with two major challenges as technology advances into nanometer territory. First, the ever increasing design complexity of gigascale integration renders testability (detection and diagnosability) and yield enhancement inevitable. Second, the complicated physical effects inherent in the scaling of nanoscale technology make crosstalk noise (crosstalk-induced glitch faults and crosstalk-induced delay) inevitable, and thus signal integrity and delay faults can no longer be ignored. This research targets testability and yield enhancement, with test time reduction, at the design stage through our proposed Oscillation Ring (OR) test mechanism. The oscillation ring test mechanism makes interconnects detectable and diagnosable through a systematic graph modeling approach. As a relatively novel methodology, the OR mechanism for system-level interconnects should be compliant with IEEE Std. 1500. Thus, it is desirable to consider test architectures and algorithms for interconnect testing for System on Chip (SoC) under IEEE Std. 1500, and to develop interconnect-centric computer-aided-design tools covering design, detection, and diagnosis. To handle the first challenge, the ever increasing design complexity of gigascale integration, we integrate our proposed oscillation ring test techniques into a signal-integrity-aware router. We propose an integrated multilevel full-chip routing algorithm that improves testability and diagnosability, manufacturability, and signal integrity for yield enhancement. Two major issues are addressed. (1) An oscillation ring test and diagnosis scheme for interconnects, based on IEEE Std.
1500, is integrated into the multilevel routing framework to achieve testability enhancement. We augment the traditional multilevel framework by introducing a preprocessing stage of Interconnect Oscillation Ring Detection (IORT) that analyzes the oscillation ring structure for better resource estimation before the coarsening stage, and a postprocessing (final) stage of Interconnect Oscillation Ring Diagnosis (IORD) after uncoarsening that improves testability to achieve 100% interconnect fault coverage and maximal diagnosability. (2) We present a heuristic to balance routing congestion, and the goals of this router include minimizing multiple-fault probability, reducing crosstalk effects, and improving yield for both chemical-mechanical-polishing (CMP) and optical-proximity-correction (OPC) induced manufacturability problems. Experimental results on the MCNC benchmark circuits demonstrate that the proposed OR method achieves 100% fault coverage and the optimal diagnosis resolution for interconnects, and the multilevel congestion-driven routing algorithm effectively balances the routing density to achieve 100% routing completion. Experimental results show that our method significantly improves routing quality for testability and yield enhancement. The second challenge concerns signal integrity: crosstalk-induced faults have a significant impact on interconnect performance as technology advances into the nanometer era. Crosstalk is a phenomenon of parasitic capacitance caused by continuous scaling. It directly influences the reliability, manufacturability and yield of VLSI circuits.
(1) We present buffer planning techniques for designing and analyzing crosstalk noise together with performance during floorplanning, and show theoretically and experimentally that our interconnect-aware floorplanner outperforms currently available ones by simultaneously considering crosstalk and timing; this preliminary work lays the groundwork for IORT and IORD. (2) There are two types of crosstalk: crosstalk-induced glitch and crosstalk-induced delay. We analyze and design the detection of crosstalk faults for interconnect buses, and show experimentally that the unified detection scheme for crosstalk-induced glitch and crosstalk-induced delay is feasible and effective. This scheme is based on a built-in pulse detector with an adjustable threshold voltage, and we show that this design works well under process variations. Furthermore, the pulse detector in the unified crosstalk detection scheme is embedded into IEEE Std. 1500 wrapper compliant cells so that the oscillation ring test for interconnects can handle the delay fault, which poses challenges to system performance. (3) We study detection and diagnosis problems for interconnects. We show a class of oscillation ring approximation algorithms for an interconnect detection and diagnosis problem and prove that the oscillation ring mechanism with an IEEE Std. 1500 compliant test architecture guarantees 100% fault detection (by IORT) and the optimal diagnosis resolution (by IORD) not only under the fault models of traditional stuck-at and open faults, but also delay and crosstalk glitch faults. Solutions to the interconnect problems by applying the oscillation ring methodology pave the way for developing a novel integrated multilevel routing framework with a congestion metric for routing as mentioned above.
(4) Finally, the oscillation ring test method has been successfully modified and applied to synchronous sequential circuits to facilitate at-speed testing for delay fault detection in addition to traditional stuck-at and open fault models. In summary, both testability and signal integrity issues have significant impact on interconnect design and test. In my PhD dissertation, interconnect-centric oscillation ring architectures and algorithms targeted at SoC testability and yield enhancement are proposed to deal with system-level interconnect test and diagnosis, a full-chip integrated multilevel router framework, and RTL (register transfer level) synchronous sequential circuits for at-speed testability.

44

Palaniappan, Arun. "Modeling, Optimization and Power Efficiency Comparison of High-speed Inter-chip Electrical and Optical Interconnect Architectures in Nanometer CMOS Technologies." Thesis, 2010. http://hdl.handle.net/1969.1/ETD-TAMU-2010-12-8618.

Abstract:

Inter-chip input-output (I/O) communication bandwidth demand, which rapidly scaled with integrated circuit scaling, has leveraged equalization techniques to operate reliably on band-limited channels at additional power and area complexity. High-bandwidth inter-chip optical interconnect architectures have the potential to address this increasing I/O bandwidth. Considering future tera-scale systems, power dissipation of the high-speed I/O link becomes a significant concern. This work presents a design flow for the power optimization and comparison of high-speed electrical and optical links at a given data rate and channel type in 90 nm and 45 nm CMOS technologies. The electrical I/O design framework combines statistical link analysis techniques, which are used to determine the link margins at a given bit-error rate (BER), with circuit power estimates based on normalized transistor parameters extracted with a constant current density methodology to predict the power-optimum equalization architecture, circuit style, and transmit swing at a given data rate and process node for three different channels. The transmitter output swing is scaled to operate the link at optimal power efficiency. Under consideration for optical links are a near-term architecture consisting of discrete vertical-cavity surface-emitting lasers (VCSEL) with p-i-n photodetectors (PD) and three long-term integrated photonic architectures that use waveguide metal-semiconductor-metal (MSM) photodetectors and either electro-absorption modulator (EAM), ring resonator modulator (RRM), or Mach-Zehnder modulator (MZM) sources. The normalized transistor parameters are applied to jointly optimize the transmitter and receiver circuitry to minimize total optical link power dissipation for a specified data rate and process technology at a given BER. 
Analysis results show that low loss channel characteristics and minimal circuit complexity, together with scaling of transmitter output swing, allow electrical links to achieve excellent power efficiency at high data rates. While the high-loss channel is primarily limited by severe frequency dependent losses to 12 Gb/s, the critical timing path of the first tap of the decision feedback equalizer (DFE) limits the operation of low-loss channels above 20 Gb/s. Among the optical links, the VCSEL-based link is limited by its bandwidth and maximum power levels to a data rate of 24 Gb/s, whereas EAM and RRM are both attractive integrated photonic technologies capable of scaling data rates past 30 Gb/s, achieving excellent power efficiency in the 45 nm node, and are primarily limited by coupling and device insertion losses. While MZM offers robust operation due to its wide optical bandwidth, significant improvements in power efficiency must be achieved for it to become applicable to high density applications.

45

Nagpal, Rahul. "Compiler-Assisted Energy Optimization For Clustered VLIW Processors." Thesis, 2008. http://hdl.handle.net/2005/684.

Abstract:

Clustered architecture processors are preferred for embedded systems because centralized register file architectures scale poorly in terms of clock rate, chip area, and power consumption. Although clustering helps by improving clock speed, reducing energy consumption of the logic, and making the design simpler, it introduces extra overheads by way of inter-cluster communication. This communication happens over long wires having high load capacitance, which leads to delay in execution and significantly high energy consumption. Inter-cluster communication also introduces many short idle cycles, thereby significantly increasing the overall leakage energy consumption in the functional units. The trend towards miniaturization of devices (and the associated reduction in threshold voltage) makes energy consumption in interconnects and functional units even worse and limits the usability of clustered architectures in smaller technologies. In the past, study of leakage energy management at the architectural level has mostly focused on storage structures such as cache. Relatively little work has been done on architecture level leakage energy management in functional units in the context of superscalar processors and energy efficient scheduling in the context of VLIW architectures. In the absence of any high level model for interconnect energy estimation, the primary focus of research in the context of interconnects has been to reduce the latency of communication and to evaluate various inter-cluster communication models. To the best of our knowledge, there has been no such work in the past from the point of view of energy efficiency targeting clustered VLIW architectures and specifically focusing on smaller technologies. Technological advancements now permit design of interconnects and functional units with varying performance and power modes.
In this thesis we propose scheduling algorithms that aggregate the scheduling slack of instructions and communication slack of data values to exploit the low power modes of interconnects and functional units. We also propose a high level model for estimation of interconnect delay and energy (in contrast to the low-level circuit level models proposed earlier) that makes it possible to carry out architectural and compiler optimizations specifically targeting the interconnect. Finally we present a synergistic combination of these algorithms that simultaneously saves energy in functional units and interconnects to improve the usability of clustered architectures by achieving better overall energy-performance trade-offs. Our compiler assisted leakage energy management scheme for functional units reduces the energy consumption of functional units by approximately 15% and 17% in the context of a 2-clustered and a 4-clustered VLIW architecture respectively, with negligible performance degradation over and above that offered by a hardware-only scheme. The interconnect energy optimization scheme improves the energy consumption of interconnects on average by 41% and 46% for a 2-clustered and a 4-clustered machine respectively, with 2% and 1.5% performance degradation. The combined scheme obtains slightly better energy benefit in functional units and 37% and 43% energy benefit in interconnect with slightly higher performance degradation. Even with conservative estimates of the contribution of functional units and interconnects to overall processor energy consumption, the proposed combined scheme obtains on average 8% and 10% improvement in overall energy delay product with 3.5% and 2% performance degradation for a 2-clustered and a 4-clustered machine respectively. We present a detailed experimental evaluation of the proposed schemes using the Trimaran compiler infrastructure.
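
The energy-delay-product figures quoted at the end of the abstract can be unpacked with a short calculation. The sketch below is not taken from the thesis: it assumes that the energy delay product is simply energy multiplied by execution time, and that "performance degradation" means a relative increase in execution time. The function name `implied_energy_ratio` is hypothetical. Under those assumptions it derives the overall energy ratio implied by the reported numbers.

```python
def implied_energy_ratio(edp_improvement: float, slowdown: float) -> float:
    """Return E_new / E_old implied by an EDP improvement and a slowdown.

    Assumes EDP = energy * delay, so
    EDP_new / EDP_old = (E_new / E_old) * (D_new / D_old),
    with EDP_new / EDP_old = 1 - edp_improvement
    and  D_new / D_old     = 1 + slowdown.
    """
    return (1.0 - edp_improvement) / (1.0 + slowdown)


# Figures quoted in the abstract: (EDP improvement, performance degradation).
reported = {
    "2-clustered": (0.08, 0.035),
    "4-clustered": (0.10, 0.02),
}

for machine, (edp_gain, slowdown) in reported.items():
    ratio = implied_energy_ratio(edp_gain, slowdown)
    print(f"{machine}: energy ratio {ratio:.3f}, "
          f"implied energy saving {100 * (1 - ratio):.1f}%")
```

Read this way, the reported figures would correspond to an overall energy saving of roughly 11 to 12% on both machines; a different definition of performance degradation would shift that estimate.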

46

Khun, Jush Farshad. "Architectural enhancement for message passing interconnects." Thesis, 2008. http://hdl.handle.net/1828/1225.

Abstract:

Research in high-performance architecture has been focusing on achieving more computing power to solve computationally-intensive problems. Advancements in the processor industry are not applicable in applications that need several hundred or thousand-fold improvement in performance. The parallel architecture approach promises to provide more computing power and scalability. Cluster computing, consisting of low-cost and high-performance processors, has been an alternative to proprietary and expensive supercomputer platforms. As in any other parallel system, communication overhead (including hardware, software, and network) adversely affects the computation performance in a cluster environment. Therefore, decreasing this overhead is the main concern in such environments. Communication overhead is the key obstacle to reaching hardware performance limits and is mostly associated with software overhead, a significant portion of which is attributed to message copying. Message copying is largely caused by a lack of knowledge of the next received message, which can be dealt with through speculation. To reduce this copying overhead and advance toward a finer granularity, architectural extensions comprising a specialized network cache and instructions to manage the operations of these extensions were introduced. In order to investigate the effectiveness of the proposed architectural enhancement, a simulation environment was established by expanding an existing single-thread infrastructure to one that can run MPI applications. Then the proposed extensions were implemented, along with the MPI functions, on top of the SimpleScalar infrastructure. Further, two techniques were proposed to achieve zero-copy data transfer in message passing environments: two policies that determine when a message is to be bound and sent to the data cache. These policies are called Direct to Cache Transfer (DTCT) and lazy DTCT.
The simulations showed that by using the proposed network extension along with the DTCT techniques fewer data cache misses were encountered as compared to when the DTCT techniques were not used. This involved a study of the possible overhead and cache pollution introduced by the operating system and the communications stack, as exemplified by Linux, TCP/IP and M-VIA. Then these effects on the proposed extensions were explored. Ultimately, this enabled a comparison of the performance achieved by applications running on a system incorporating the proposed extension with the performance of the same applications running on a standard system. The results showed that the proposed approach could improve the performance of MPI applications by 15 to 20%. Moreover, data transfer mechanisms and the associated components in the CELL BE processor were studied. For this, two general data transfer methods were explored involving the PUT and GET functions, demonstrating that the SPE-initiated DMA data transfers are faster than the corresponding PPE-initiated DMAs. The main components of each data transfer were also investigated. In the SPE-initiated GET function, the main component is data delivery. However, the PPE-initiated GET function shows a long DMA issue time as well as a lengthy gap in receiving successive messages. It was demonstrated that the main components of the SPE-initiated PUT function are data delivery and latency (that is, the time to receive the first byte), and the main components in the PPE-initiated PUT function are the DMA issue time and latency. Further, an investigation revealed that memory-management overhead is comparable to the data transfer time; therefore, this calls for techniques to hide the unavoidable overhead in order to reach high-throughput communication in MPI implementation in the Cell BE processor.

47

Grecu, Cristian. "SoC interconnect architecture design and evaluation under timing constraints." Thesis, 2003. http://hdl.handle.net/2429/15310.

Abstract:

System on chip design steadily evolves toward different non-overlapping abstraction levels. Very different competence and design tools will be needed at each level. One specific level of abstraction will deal with interconnect technologies, with a pronounced trend towards networks on chip. It is projected that, within five years, the large majority of end-user SoC products will consist of heterogeneous embedded processors, built on multi-processor SoC platforms (MP-SoC). There is a tremendous amount of research required to characterize the various topologies and their effectiveness for different application domains. A common issue with all network-on-chip topologies is communication latency. Due to the increase of global wire delay with technology scaling, pipelining is required to hide the latency associated with the exchange of data across the chip. The building blocks of a network-on-chip are intelligent switches, which provide a data transport mechanism across the chip. Their design is critical due to different architectural and circuit level trade-offs. This work is novel in that it addresses the issues of quantifying the delay of different pipeline stages in an on-chip topology, and evaluates the effectiveness of a given topology in forthcoming technology nodes.

48

Kao, Chih-Heng, and 高智恆. "Re-configurable Hybrid Interconnect Architecture for a Multicore System." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/32935897780557569240.

Abstract:

Master's
National Chiao Tung University
Department of Electronics Engineering and Institute of Electronics
Academic year 102
To achieve superior performance of parallel applications on a Multi-Processor System on Chip (MPSoC), effective and efficient interconnect design has been a critical research topic in System on Chip architecture. Conventional interconnect designs target specific applications with fixed configurations. However, a fixed interconnect architecture cannot be applied efficiently to applications with different characteristics. To address this issue, this thesis proposes a novel architecture and implementation of a Reconfigurable Hybrid Interconnect (RHI) for MPSoC. By exploiting known traffic characteristics and applying proper configurations, the proposed RHI demonstrates an average latency reduction of 25% to 40%.

49

Koo, Minsuk. "Energy Efficient Neuromorphic Computing: Circuits, Interconnects and Architecture." Thesis, 2020.

Abstract:

Neuromorphic computing has gained tremendous interest because of its ability to overcome the limitations of traditional signal processing algorithms in data intensive applications such as image recognition, video analytics, or language translation. The new computing paradigm is built with the goal of achieving high energy efficiency, comparable to biological systems.

To achieve such energy efficiency, there is a need to explore new neuro-mimetic devices, circuits, and architecture, along with new learning algorithms. To that effect, we propose two main approaches:

First, we explore an energy-efficient hardware implementation of a bio-plausible Spiking Neural Network (SNN). The key highlights of our proposed system for SNNs are 1) addressing connectivity issues arising from Network On Chip (NOC)-based SNNs, and 2) proposing stochastic CMOS binary SNNs using a biased random number generator (BRNG). On-chip Power Line Communication (PLC) is proposed to address the connectivity issues in NOC-based SNNs. PLC can use the on-chip power lines, augmented with a low-overhead receiver and transmitter, to communicate data between neurons that are spatially far apart. We also propose a CMOS 'stochastic-bit' with on-chip stochastic Spike Timing Dependent Plasticity (sSTDP) based learning for memory-compressed binary SNNs. A chip was fabricated in a 90 nm CMOS process to demonstrate memory-efficient reconfigurable on-chip learning using sSTDP training.

Second, we explored coupled oscillatory systems for distance computation and convolution operation. Recent research on nano-oscillators has shown the possibility of using coupled oscillator networks as a core computing primitive for analog/non-Boolean computations. Spin-torque oscillator (STO) can be an attractive candidate for such oscillators because it is CMOS compatible, highly integratable, scalable, and frequency/phase tunable. Based on these promising features, we propose a new coupled-oscillator based architecture for hybrid spintronic/CMOS hardware that computes multi-dimensional norm. The hybrid system composed of an array of four injection-locked STOs and a CMOS detector is experimentally demonstrated. Energy and scaling analysis shows that the proposed STO-based coupled oscillatory system has higher energy efficiency compared to the CMOS-based system, and an order of magnitude faster computation speed in distance computation for high dimensional input vectors.

50

Lin, Shiuann-Shiuh, and 林炫旭. "Exact Solution for Net Assignment Problem in Partial Crossbar Interconnect Architecture." Thesis, 2000. http://ndltd.ncl.edu.tw/handle/13614866725628807242.

Abstract:

Master's
National Tsing Hua University
Department of Computer Science
Academic year 88
In this thesis, we study the net assignment problem in the partial crossbar interconnection architecture [1,4]. Net assignment of two-terminal nets in this interconnection structure is guaranteed to be completed in polynomial time. However, net assignment of multi-terminal nets is NP-complete. A previous paper [1] proposed a simple heuristic to perform net assignment for multi-terminal nets. Its results showed that it failed to route all nets in many cases. A net assignment algorithm that does not guarantee an exact solution is inadequate, because a failure to interconnect the FPGAs results either in the failure of the whole mapping to the computing engine or in redoing previous steps, e.g., partitioning of circuits. Therefore, we propose an exact algorithm to solve the net assignment problem. The exact algorithm will find a solution if one exists. However, the exact algorithm may take exponential time. Accordingly, a two-phase approach is taken in this research. A time-efficient heuristic method [14,15] is called first. The exact solver is called only if the heuristic fails to deliver a solution.
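
The two-phase idea in this abstract (try a fast heuristic, fall back to an exact search only on failure) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the thesis's algorithm: the partial crossbar is modelled as a fixed number of pins from every FPGA to every crossbar chip, a net is a tuple of FPGA names, the heuristic is plain first-fit, and the exact solver is simple backtracking. The names `greedy_assign`, `exact_assign` and `two_phase_assign` are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Net = Tuple[str, ...]              # FPGAs touched by one net (assumed model)
Usage = Dict[Tuple[str, int], int]  # (fpga, crossbar) -> pins already used


def _fits(net: Net, xbar: int, used: Usage, capacity: int) -> bool:
    # A net fits a crossbar if each of its FPGAs has a free pin to it.
    return all(used.get((fpga, xbar), 0) < capacity for fpga in net)


def _place(net: Net, xbar: int, used: Usage, delta: int) -> None:
    # Reserve (delta=+1) or release (delta=-1) one pin per FPGA of the net.
    for fpga in net:
        used[(fpga, xbar)] = used.get((fpga, xbar), 0) + delta


def greedy_assign(nets: Sequence[Net], n_xbars: int,
                  capacity: int) -> Optional[List[int]]:
    """Phase 1: first-fit heuristic; returns None if it gets stuck."""
    used: Usage = {}
    result: List[int] = []
    for net in nets:
        for xbar in range(n_xbars):
            if _fits(net, xbar, used, capacity):
                _place(net, xbar, used, +1)
                result.append(xbar)
                break
        else:
            return None  # no crossbar can take this net
    return result


def exact_assign(nets: Sequence[Net], n_xbars: int,
                 capacity: int) -> Optional[List[int]]:
    """Phase 2: backtracking search; finds a solution if one exists,
    at worst in exponential time."""
    used: Usage = {}
    result: List[int] = []

    def search(i: int) -> bool:
        if i == len(nets):
            return True
        for xbar in range(n_xbars):
            if _fits(nets[i], xbar, used, capacity):
                _place(nets[i], xbar, used, +1)
                result.append(xbar)
                if search(i + 1):
                    return True
                result.pop()                     # undo and try next crossbar
                _place(nets[i], xbar, used, -1)
        return False

    return result if search(0) else None


def two_phase_assign(nets: Sequence[Net], n_xbars: int,
                     capacity: int) -> Optional[List[int]]:
    """Call the heuristic first; run the exact solver only if it fails."""
    solution = greedy_assign(nets, n_xbars, capacity)
    if solution is None:
        solution = exact_assign(nets, n_xbars, capacity)
    return solution


# Toy instance: 2 crossbars, 1 pin per FPGA per crossbar.
nets = [("A", "B"), ("D", "E"), ("B", "C"), ("C", "D")]
print(greedy_assign(nets, 2, 1))     # None: first-fit gets stuck on the last net
print(two_phase_assign(nets, 2, 1))  # [0, 1, 1, 0]
```

In this toy instance the first-fit pass places the first two nets on crossbar 0 and then cannot place the last one, while the backtracking pass revisits the second net and finds a complete assignment. The point is only the control flow: the expensive solver runs solely when the cheap one fails.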
