Ready-made bibliography on the topic "Cache Coherence Problem"

Browse lists of current articles, books, dissertations, abstracts, and other scholarly sources on the topic "Cache Coherence Problem".

Journal articles on the topic "Cache Coherence Problem"

1

Shmeylin, B. Z., and E. A. Alekseeva. "The Problem of Providing Cache Coherence in Multiprocessor Systems with Many Processors." Issues of radio electronics, no. 5 (May 20, 2018): 47–53. http://dx.doi.org/10.21778/2218-5453-2018-5-47-53.

Abstract:
This paper addresses directory management for coherence maintenance in multiprocessor systems with a large number of processors (MSLP). In such systems, maintaining the coherence of processor caches becomes significantly more complicated because of increased traffic on the memory buses and more complex interprocessor communication. This problem is solved in various ways. We propose using Bloom filters, which accelerate the test of whether an element belongs to a set. Here the filters are used to establish that a processor belongs to some subset of the processors and to determine whether a processor in that subset holds a given cache line. The paper discusses in detail the processes of writing and reading data shared between processors, as well as the replacement of data from private caches. It also shows how cache line addresses and processor numbers are removed from the Bloom filters. The proposed system significantly speeds up cache coherence operations in the MSLP compared to conventional systems. In terms of performance and additional hardware and software costs, the proposed system is not inferior to the most efficient comparable systems, and on some applications it significantly exceeds them.
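The abstract mentions both membership tests and removal of entries from Bloom filters. A minimal sketch of one common way to support both is a counting Bloom filter. This is an illustration only: the class name, the slot and hash counts, and the use of SHA-256 are assumptions, and the paper's actual filter design is not described in the abstract.

```python
import hashlib


class CountingBloomFilter:
    """Bloom filter with small counters so that entries can be removed."""

    def __init__(self, num_slots=64, num_hashes=3):
        self.num_slots = num_slots
        self.num_hashes = num_hashes
        self.counters = [0] * num_slots

    def _slots(self, key):
        # Derive num_hashes slot indices from salted SHA-256 digests.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_slots

    def add(self, key):
        for slot in self._slots(key):
            self.counters[slot] += 1

    def remove(self, key):
        # Only call this for keys that were added earlier.
        for slot in self._slots(key):
            if self.counters[slot] > 0:
                self.counters[slot] -= 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.counters[slot] > 0 for slot in self._slots(key))


# Hypothetical use: one filter per processor subset, keyed by cache-line address.
sharers = CountingBloomFilter()
sharers.add(0x1F40)
present_after_add = sharers.might_contain(0x1F40)
sharers.remove(0x1F40)
present_after_remove = sharers.might_contain(0x1F40)
```

A filter like this can report false positives but never false negatives, so a directory built on it may send a few unnecessary coherence messages but should not miss a real sharer.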
2

Journal, Baghdad Science. "Cache Coherence Protocol Design and Simulation Using IES (Invalid Exclusive read/write Shared) State." Baghdad Science Journal 14, no. 1 (March 5, 2017): 219–30. http://dx.doi.org/10.21123/bsj.14.1.219-230.

Abstract:
To improve the efficiency with which processors in recent multiprocessor systems handle data, cache memories are used to access data instead of main memory, which reduces access latency. In such systems, when different processors in a shared-memory architecture have their own caches, difficulties arise in maintaining consistency between those caches. A cache coherence protocol is therefore very important in this kind of system. MSI, MESI, MOSI, and MOESI are well-known protocols that solve the cache coherence problem. In this research we propose merging two states of the MESI cache coherence protocol, Exclusive and Modified, into a single state that serves both read and write requests and is exclusive to them. The proposed protocol also removes the write-back to main memory from another processor that holds a line in the Modified state when that line is invalidated by a write to the same address, because in all cases only the latest value written matters. Where the write-back served to protect data from loss, preprocessing steps in the IES protocol keep the data in main memory when it is evicted from the cache. All of this increases processor efficiency by reducing accesses to main memory.
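As background for the protocol families named in this abstract, here is a minimal sketch of the textbook MSI state machine for a single cache line. It is not the IES protocol proposed in the paper. The event names (`pr_read`, `bus_write`, and so on) and the helper `access` are hypothetical labels chosen for this illustration.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    SHARED = "S"
    INVALID = "I"


# (current state, event) -> next state for one cache line in one cache.
# "pr_*" events come from the local processor, "bus_*" events are snooped.
MSI_TRANSITIONS = {
    (State.INVALID, "pr_read"): State.SHARED,
    (State.INVALID, "pr_write"): State.MODIFIED,
    (State.SHARED, "pr_read"): State.SHARED,
    (State.SHARED, "pr_write"): State.MODIFIED,
    (State.MODIFIED, "pr_read"): State.MODIFIED,
    (State.MODIFIED, "pr_write"): State.MODIFIED,
    (State.INVALID, "bus_read"): State.INVALID,
    (State.INVALID, "bus_write"): State.INVALID,
    (State.SHARED, "bus_read"): State.SHARED,
    (State.SHARED, "bus_write"): State.INVALID,
    (State.MODIFIED, "bus_read"): State.SHARED,    # owner supplies the data
    (State.MODIFIED, "bus_write"): State.INVALID,  # owner supplies the data
}


def access(states, cpu, op):
    """Apply a "read" or "write" by `cpu`; `states` holds one State per cache."""
    # Simplification: every access is broadcast. On a hit this leaves the
    # other caches unchanged, so the resulting states are still correct.
    for other in range(len(states)):
        if other != cpu:
            states[other] = MSI_TRANSITIONS[(states[other], "bus_" + op)]
    states[cpu] = MSI_TRANSITIONS[(states[cpu], "pr_" + op)]
    return states


line = [State.INVALID, State.INVALID, State.INVALID]
access(line, 0, "read")    # cache 0 loads the line
access(line, 1, "read")    # cache 1 shares it
access(line, 1, "write")   # cache 1 takes ownership, cache 0 is invalidated
after_write = list(line)
access(line, 2, "read")    # cache 2 reads, the owner drops back to SHARED
```

MESI adds an Exclusive state to this table so that a cache holding the only clean copy can write without a bus transaction.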
3

أنيس القردوح, عبدالحميد الكواش, and عبدالمحسن البنداق. "Simulation Cache Coherence Protocols in Multicore Processors." Journal of Pure & Applied Sciences 21, no. 4 (October 3, 2022): 285–89. http://dx.doi.org/10.51984/jopas.v21i4.2239.

Abstract:
The cache coherence problem is the challenge of keeping multiple caches synchronized when one of the processors updates its local copy of data that is shared among multiple caches. This paper discusses several varieties of cache coherence protocols along with their pros and cons, and uses simulation to address this problem and to compare two protocols used to solve it: the directory-based protocol and the snooping protocol. Simulation results show that snooping-based systems are appropriate for high-bandwidth systems, while directory-based cache coherence protocols are suitable for lower-bandwidth systems.
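The bandwidth trade-off in this abstract can be made concrete with a deliberately crude cost model. It counts how many other caches must handle a coherence message when one processor writes a shared line. The function names and the example numbers are assumptions for illustration and are not taken from the paper's simulation.

```python
def caches_disturbed_snooping(num_caches, sharers, writer):
    # Every other cache snoops the broadcast, whether or not it holds the line.
    return num_caches - 1


def caches_disturbed_directory(num_caches, sharers, writer):
    # The directory forwards invalidations only to the other recorded sharers.
    return len(set(sharers) - {writer})


# Hypothetical system: 16 caches, line shared by caches 0 and 3, cache 0 writes.
snoop_cost = caches_disturbed_snooping(16, {0, 3}, writer=0)
directory_cost = caches_disturbed_directory(16, {0, 3}, writer=0)
```

Under this model, snooping cost grows with the number of caches while directory cost grows with the number of sharers, which is one reason snooping tends to need a higher-bandwidth interconnect.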
4

Jalil, Luma Fayeq, Maha Abdul kareem H. Al-Rawi, and Abeer Diaa Al-Nakshabandi. "Cache coherence protocol design using VMSI (Valid Modified Shared Invalid) states." Journal of University of Human Development 3, no. 1 (March 31, 2017): 274. http://dx.doi.org/10.21928/juhd.v3n1y2017.pp274-281.

Abstract:
In this research we propose the design of a new cache coherence protocol, named VMSI, to solve the coherence problem, that is, the inconsistency of data between caches that arises in recent multiprocessor systems through read and write operations. The main purpose of this protocol is to increase processor efficiency by reducing traffic between processor and memory. This is achieved by removing the write-back to main memory when shared caches are read or written, because the protocol relies on a directory inside the cache that contains all the data representing a subset of main memory.
5

Guo, Yu Feng, Ming Zhang, and Rui Gong. "I/O Coherence Faulty Tolerance Method for Multi-Core Processor Based on Retry." Applied Mechanics and Materials 427-429 (September 2013): 2830–33. http://dx.doi.org/10.4028/www.scientific.net/amm.427-429.2830.

Abstract:
The I/O consistency problem is one of the key issues that multi-core processor design must face. As the number of cores increases and the cache hierarchy becomes more complicated, the probability that I/O coherence packets are blocked increases, which significantly decreases I/O system efficiency. An I/O coherence maintenance method based on retransmission is proposed to improve the reliability of the I/O coherence protocol. Experimental results demonstrate that this method effectively enhances the robustness of the I/O coherence protocol.
6

Zhao, Jia, and Watanabe. "Router-integrated Cache Hierarchy Design for Highly Parallel Computing in Efficient CMP Systems." Electronics 8, no. 11 (November 17, 2019): 1363. http://dx.doi.org/10.3390/electronics8111363.

Abstract:
In current Chip Multi-Processor (CMP) systems, data sharing in the cache hierarchy is a critical issue that costs many clock cycles for maintaining data coherence. As the number of integrated cores increases, the single shared cache serves too many processing threads to maintain shared data efficiently. In this work, an enhanced router network is integrated within the private cache level to quickly interconnect shared-data accesses from different threads. All shared data in the private cache level can be classified into seven access types by experimental pattern analysis. Both shared accesses and thread-crossed accesses can then be rapidly detected and handled in the proposed router network. As a result, the access latency of the private cache is decreased, and the conventional coherence traffic problem is alleviated. The process in the proposed path is composed of three steps. First, the target accesses are detected by exploring the router network. Then, the proposed replacement logic handles those accesses to maintain data coherence. Finally, those accesses are delivered by the proposed data deliverer. Thus, the harmful data sharing accesses are resolved within the first chip layer of the 3D-IC structure. The proposed system is also implemented in a cycle-precise simulation platform, and experimental results show that our model can improve the Instructions Per Cycle (IPC) of on-chip execution by up to 31.85 percent, while energy consumption is reduced by about 17.61 percent compared to the base system.
7

Zhu, Wei, and Xiaoyang Zeng. "Decision Tree-Based Adaptive Reconfigurable Cache Scheme." Algorithms 14, no. 6 (June 1, 2021): 176. http://dx.doi.org/10.3390/a14060176.

Abstract:
Applications have different preferences for caches, sometimes even within different running phases. Caches with fixed parameters may compromise the performance of a system. To solve this problem, we propose a real-time adaptive reconfigurable cache based on the decision tree algorithm, which can optimize the average memory access time of the cache without modifying the cache coherence protocol. By monitoring the application's running state, the cache associativity is periodically tuned to the optimal value, which is determined by the decision tree model. This paper implements the proposed decision tree-based adaptive reconfigurable cache in the GEM5 simulator and designs the key modules using Verilog HDL. The simulation results show that the proposed cache reduces the average memory access time compared with other adaptive algorithms.
8

Tian, Yong Hong, and Guang Jian Chen. "A Review of Researches on Cache Coherence Protocols for Multi-Core Processor." Advanced Materials Research 933 (May 2014): 740–43. http://dx.doi.org/10.4028/www.scientific.net/amr.933.740.

Abstract:
A multi-core processor integrates two or more computing cores in a single processor to enhance computational capability. Much previous research has focused on the CMP (chip multi-processor), the most typical multi-core processor structure. The design of cache coherence, in particular, is one of the primary problems in CMP research. In this paper, cache coherence protocols for CMPs are presented in full, along with their advantages and disadvantages. Finally, some cutting-edge issues of cache coherence protocols are addressed.
9

Alkowaileet, Wail Y., David Carrillo-Cisneros, Robert V. Lim, and Isaac D. Scherson. "NUMA-Aware Multicore Matrix Multiplication." Parallel Processing Letters 24, no. 04 (December 2014): 1450006. http://dx.doi.org/10.1142/s0129626414500066.

Abstract:
A user-level scheduling scheme, along with a specific data alignment, for matrix multiplication in cache-coherent Non-Uniform Memory Access (ccNUMA) architectures is presented. Addressing the data locality problem that can occur in such systems potentially alleviates memory bottlenecks. We show experimentally that a thread scheduler that is agnostic of the data placement (e.g., OpenMP 3.1) on a ccNUMA machine produces a high number of cache misses. To overcome this memory contention problem, we show how proper memory mapping and scheduling tune an existing matrix multiplication implementation, reducing the number of cache misses by 67% and consequently reducing the computation time by up to 22%. Finally, we show a relationship between cache misses and the gained speedup as a novel figure of merit to measure the quality of the method.
10

Chong, Frederic T., and Anant Agarwal. "Shared Memory versus Message Passing for Iterative Solution of Sparse, Irregular Problems." Parallel Processing Letters 09, no. 01 (March 1999): 159–70. http://dx.doi.org/10.1142/s0129626499000177.

Abstract:
The benefits of hardware support for shared memory versus those for message passing are difficult to evaluate without an in-depth study of real applications on a common platform. We evaluate the communication mechanisms of the MIT Alewife machine, a multiprocessor which provides integrated cache-coherent shared memory, message passing, and DMA. We perform this evaluation with "best-effort" implementations which solve several sparse, irregular benchmark problems with a preconditioned conjugate gradient sparse matrix solver (ICCG). We find that machines with fast global memory operations do not need message passing or bulk transfer to support our irregular problems. This is primarily due to three reasons. First, a 5-to-1 ratio between global and local cache misses makes memory copies in bulk communication expensive relative to communication via shared memory. Second, although message passing has synchronization semantics superior to shared memory for data-driven computation, efficient shared memory can overcome this handicap by using global read-modify-writes to change from the traditional owner-computes model to a producer-computes model. Third, bulk transfers can result in high processor idle times in irregular applications.

Dissertations on the topic "Cache Coherence Problem"

1

Archibald, James K. "The cache coherence problem in shared-memory multiprocessors." Thesis, 1987. http://hdl.handle.net/1773/6955.

2

Arora, Pooja. "Cache Coherence in Multi Processors Architecture." Thesis, 2015. http://dspace.dtu.ac.in:8080/jspui/handle/repository/14290.

Abstract:
An appropriate solution to the well-known cache coherence problem in shared-memory multiprocessor systems is crucial for improving system performance and scalability. In this paper we survey various cache coherence mechanisms in shared-memory multiprocessors. Various hardware-based and software-based protocols, including recent ones, are investigated in depth. We conclude that, among presently available protocols, hardware-based cache coherence protocols perform better than software-based ones, but they add implementation cost. Because software-based cache coherence protocols are more economical, they deserve more attention and show great promise for future work. The project builds on a thorough study of the MESI protocol and of MARSSx86 (Micro Architectural and System Simulator), an open-source simulator whose code is freely available, and enhances the performance of the system. With the level 2 cache shared, we reduced the existing Invalid-to-Invalid transitions at the level 1 data cache to zero at user level with dual cores, and reduced them to a great extent for the FERRET, SWAPTIONS, and CANNEAL programs of the PARSEC (Princeton Application Repository for Shared-Memory Computers) benchmark. The experimental results show that with dual cores we increased cycles per second for these PARSEC programs, and with quad cores we increased commits per second. With eight cores we increased commits per second for FERRET and cycles per second for SWAPTIONS and CANNEAL. With the level 2 cache private, we also improved system performance in terms of cycles per second and commits per second by modifying the existing Invalid-to-Invalid (II) transition in the MESI protocol code of the MARSSx86 simulator. In doing so we reduced Invalid-to-Invalid transitions to zero for the PARSEC programs on dual cores. Experiments show that for the quad-core configuration we reduced Invalid-to-Invalid transitions by 99% on average. For the eight-core configuration, Invalid-to-Invalid transitions decreased by 99% with CANNEAL, 99% with FERRET, and 70% with SWAPTIONS. As the experimental results show, we reduce the bus traffic and improve system performance.
Mr. Manoj Kumar, Associate Professor, Department of Computer Engineering, Delhi Technological University, 2011-2012
3

Huang, Wen-Giang, and 黃文強. "Two new protocols for cache coherence problem." Thesis, 1989. http://ndltd.ncl.edu.tw/handle/80242125272378726036.


Books on the topic "Cache Coherence Problem"

1

Tomašević, Milo, and Veljko Milutinović, eds. The Cache-coherence problem in shared-memory multiprocessors: Hardware solutions. Los Alamitos, Calif: IEEE Computer Society Press, 1993.

2

Tartalja, Igor. The cache coherence problem in shared-memory multiprocessors: Software solutions. Los Alamitos, Calif: IEEE Computer Society Press, 1996.

3

Tomasevic, Milo, and Veljko Milutinovic, eds. The Cache-Coherence Problem in Shared-Memory Multiprocessors: Hardware Solutions. IEEE Computer Society, 1993.

4

Milutinovic, Veljko, and Milo Tomasevic, eds. The Cache-Coherence Problem in Shared-Memory Multiprocessors: Hardware Solutions. Institute of Electrical & Electronics Enginee, 1999.


Book chapters on the topic "Cache Coherence Problem"

1

Cortes, Toni, Sergi Girona, and Jesús Labarta. "Avoiding the cache-coherence problem in a parallel/distributed file system." In High-Performance Computing and Networking, 860–69. Berlin, Heidelberg: Springer Berlin Heidelberg, 1997. http://dx.doi.org/10.1007/bfb0031657.

2

Moorthi, M. Narayana, and R. Manjula. "Challenges Faced in Enhancing the Performance and Scalability in Parallel Computing Architecture." In Advances in Computer and Electrical Engineering, 252–69. IGI Global, 2016. http://dx.doi.org/10.4018/978-1-4666-9479-8.ch010.

Abstract:
Nowadays, the architecture of high-performance systems is improving, with more and more processor cores on a chip. This brings both benefits and challenges. The benefit is that more tasks run simultaneously, which reduces the running time of a program or application. The challenges include: what the maximum number of cores on a given chip is; how existing and future software will make use of all the cores; which parallel programming language to choose; what memory and cache coherence issues arise as the number of cores increases; how to solve power and performance issues; how the cores are connected and how they communicate to solve a single problem; and workload distribution and load balancing issues in terms of scalability. There is a practical limit to the speedup and scalability achievable with the number of cores on a chip, which needs to be analyzed. This chapter therefore gives an introduction to and overview of parallel computing and the challenges faced in enhancing performance and scalability in parallel computing architectures.
3

"Design Issues of a Cooperative Cache with no Coherence Problems." In High Performance Mass Storage and Parallel I/O. IEEE, 2009. http://dx.doi.org/10.1109/9780470544839.ch18.


Conference papers on the topic "Cache Coherence Problem"

1

Cheng, L., and A. A. Sawchuk. "Optical solutions for cache memories in parallel computers." In OSA Annual Meeting. Washington, D.C.: Optica Publishing Group, 1993. http://dx.doi.org/10.1364/oam.1993.mzz.1.

Abstract:
A cache is a high-speed memory located between processors and main memory to fill the speed gap between them. In a loosely coupled parallel computer, each processor has its own cache memory. The processor accesses information from its cache memory, which stores information obtained from the main memory through an interconnection network. A major challenge in this system is to keep the data in all the caches consistent with that in main memory. This is referred to as the cache coherence problem. One solution is to have a bus between the caches, and supply each cache with a controller which listens to the bus and updates the cache whenever a change is made in main memory. This approach is complicated and is limited by the bandwidth of the bus [1]. In a tightly coupled parallel computer, the cache memory can be shared by all processors and there is no cache coherence problem. However, with conventional VLSI implementation, only one processor can access the cache memory at a given time and the performance degrades dramatically [2]. We present optical solutions to the above cache problems. We describe an optical bus which updates the multiple caches in loosely coupled parallel computers to eliminate the bandwidth limitation. Optical or optoelectronic cache memory is proposed for shared cache in tightly coupled parallel computers to allow parallel access. We examine potential architectures and devices for both cases.
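The bus-listening scheme described in this abstract can be sketched as a small write-through, write-update simulation. The `Bus` and `Cache` classes, the `snooping` flag, and the address `0x10` are hypothetical names chosen for this illustration. The sketch models an ordinary electrical snooping bus, not the optical bus the paper proposes.

```python
class Bus:
    """Shared bus that forwards every write to all attached caches."""

    def __init__(self, memory):
        self.memory = memory
        self.caches = []

    def attach(self, cache):
        self.caches.append(cache)

    def broadcast_write(self, source, addr, value):
        self.memory[addr] = value  # write-through to main memory
        for cache in self.caches:
            if cache is not source:
                cache.snoop(addr, value)


class Cache:
    def __init__(self, bus, snooping=True):
        self.lines = {}
        self.bus = bus
        self.snooping = snooping
        bus.attach(self)

    def read(self, addr):
        if addr not in self.lines:  # miss: fetch from main memory
            self.lines[addr] = self.bus.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.broadcast_write(self, addr, value)

    def snoop(self, addr, value):
        # Update the local copy only if this cache already holds the line.
        if self.snooping and addr in self.lines:
            self.lines[addr] = value


def run(snooping):
    memory = {0x10: 1}
    bus = Bus(memory)
    a, b = Cache(bus, snooping), Cache(bus, snooping)
    b.read(0x10)         # b caches the old value
    a.write(0x10, 2)     # a changes the shared location
    return b.read(0x10)  # value b observes afterwards


stale = run(snooping=False)  # without a controller, b keeps the old copy
fresh = run(snooping=True)   # with snooping, b sees the update
```

Every write occupies the bus in this model, which illustrates why the abstract identifies bus bandwidth as the limiting factor.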
2

Mittal, Shaily, and Nitin. "A New Approach to Directory Based Solution for Cache Coherence Problem." In 2014 3rd International Conference on Eco-friendly Computing and Communication Systems (ICECCS). IEEE, 2014. http://dx.doi.org/10.1109/eco-friendly.2014.77.

3

Li, Qiang. "SCI Lamp for Multimedia Applications." In ASME 1994 International Computers in Engineering Conference and Exhibition and the ASME 1994 8th Annual Database Symposium collocated with the ASME 1994 Design Technical Conferences. American Society of Mechanical Engineers, 1994. http://dx.doi.org/10.1115/cie1994-0474.

Abstract:
Multimedia technology has become widely available and has tremendous potential. As computers get faster every day, multimedia applications are constantly reaching new ground. However, a fundamental problem of multimedia systems is the interprocessor/intermachine communication speed. In this paper, we introduce a platform based on the Scalable Coherent Interface (SCI, ANSI/IEEE Std 1596). The system can be physically distributed but logically closely coupled. All processors/machines in an SCI system share physical memory, and cache coherence is maintained even among remote processors. The interprocessor communication bandwidth can be as high as 1 Gbyte/sec. We discuss the features of SCI-based systems for multimedia applications.
4

Cortes, Toni, Sergi Girona, and Jesús Labarta. "Design issues of a cooperative cache with no coherence problems." In the fifth workshop. New York, New York, USA: ACM Press, 1997. http://dx.doi.org/10.1145/266220.266224.

5

Hytopoulos, Evangelos, Mark D. Kremenetsky, Ramesh Andra, Richard Sun, and Stan Posey. "Scalability Studies on a cc-NUMA Computer Architecture for Large Automotive Simulations." In ASME 1998 International Mechanical Engineering Congress and Exposition. American Society of Mechanical Engineers, 1998. http://dx.doi.org/10.1115/imece1998-0994.

Abstract:
The maturity of Computational Fluid Dynamics methods and the increasing computational power of today's computers have allowed the automotive industry to incorporate CFD technology in several stages of the design process. As the application of CFD technology moves from component-level analysis to the system level, the complexity and size of the models increase continuously. Successful simulation requires synergy between CAD, grid generation, and solvers. The requirement for a shorter design cycle has put severe limitations on the turnaround time of numerical simulations. The time required for a) mesh generation (around bodies of complex geometry, such as the geometry of a complete car), and b) obtaining numerical solutions (for flows with complex physics) has traditionally been the pacing item in CFD applications. Unstructured grid generation techniques and parallel algorithms have been instrumental in making such calculations affordable. Availability of these algorithms in commercial packages has proliferated in the last few years, and parallel performance has become a very important factor in the selection of such methods for production work. Although extensive research has been devoted to determining the optimum parallel paradigm, in practice the best parallel performance can be obtained only when the algorithms and paradigms take into consideration the architectural design of the target computer that they are intended for. The present paper addresses the issues related to the porting and optimization of a commercial code (Fluent/UNS) on a cache-coherent (cc) Non Uniform Memory Architecture (NUMA). Issues related to the message passing system and the memory-to-processor affinity are investigated using both a sample CFD code and Fluent/UNS. The scalability of the code when applied to the front-end cooling simulation of a development prototype family sedan is presented and discussed. Since speed and accuracy are the ultimate goals of using CAE in the design process, the model preparation time, grid generation process, and solution time are discussed. A comparison with available experimental data is presented and discussed.