Journal articles on the topic 'Prefetching'

Consult the top 50 journal articles for your research on the topic 'Prefetching.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles in a wide variety of disciplines and organise your bibliography correctly.

1

Gu, Ji, and Hui Guo. "Reducing Power and Energy Overhead in Instruction Prefetching for Embedded Processor Systems." International Journal of Handheld Computing Research 2, no. 4 (October 2011): 42–58. http://dx.doi.org/10.4018/jhcr.2011100103.

Abstract:
Instruction prefetching is an effective way to improve the performance of pipelined processors. However, existing instruction prefetching schemes increase performance at a significant energy cost, making them unsuitable for embedded and ubiquitous systems where both high performance and low energy consumption are demanded. This paper proposes reducing the energy overhead of instruction prefetching by using a simple hardware/software design and an efficient prefetching operation scheme. Two approaches are investigated: Decoded Loop Instruction Cache based Prefetching (DLICP), which is most effective for loop-intensive applications, and DLICP enhanced with the popular existing Next Line Prefetching (NLP) for applications with a moderate number of loops. The experimental results show that both DLICP and the enhanced DLICP deliver improved performance at a much reduced energy overhead.
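For context, the Next Line Prefetching (NLP) baseline that DLICP is combined with is the simplest prefetching policy there is: whenever line i is fetched, line i+1 is fetched as well. A minimal sketch in C of a direct-mapped instruction-cache model with next-line prefetch (sizes and names are illustrative, not the paper's hardware):

```c
#include <stdbool.h>
#include <stdint.h>

#define LINE_SIZE 32   /* bytes per instruction-cache line (illustrative) */
#define NUM_LINES 256  /* direct-mapped: one tag per set */

static uint32_t tags[NUM_LINES];
static bool     valid[NUM_LINES];

/* Bring one line into the cache model. */
static void fetch_line(uint32_t addr)
{
    uint32_t line = addr / LINE_SIZE;
    uint32_t set  = line % NUM_LINES;
    tags[set]  = line;
    valid[set] = true;
}

/* Demand access with next-line prefetch: every access to line i also
 * fetches line i+1, hiding the miss that sequential instruction flow
 * would otherwise take. Returns true on a hit. */
bool access_with_nlp(uint32_t pc)
{
    uint32_t line = pc / LINE_SIZE;
    uint32_t set  = line % NUM_LINES;
    bool hit = valid[set] && tags[set] == line;

    if (!hit)
        fetch_line(pc);             /* demand fetch */
    fetch_line(pc + LINE_SIZE);     /* next-line prefetch */
    return hit;
}
```

The energy question the paper raises is visible even in this toy model: the extra fetch happens on every access, whether or not line i+1 is ever executed.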
2

Jiang, Yong, Shu Wu Zhang, and Jie Liu. "Optimized Prefetching Scheme to Support VCR-Like Operations in P2P VoD Applications with Guided Seeks." Applied Mechanics and Materials 719-720 (January 2015): 756–66. http://dx.doi.org/10.4028/www.scientific.net/amm.719-720.756.

Abstract:
In Peer-to-Peer (P2P) Video-on-Demand (VoD) streaming systems, supporting free VCR operations is challenging. Prefetching is a good way to improve the user experience of VCR interactivity. However, most existing P2P VoD prefetching schemes are aimed at popular videos with large amounts of log data, and do not consider situations where videos are unpopular or popular videos are in the initial phase of their release. In these situations, such schemes cannot support user VCR interactivity very well. To address this issue, we propose a new optimized prefetching scheme, called the Hybrid Anchor Scheme (HAS), in which fixed anchors and dynamic anchors are merged. The dynamic anchors are generated based on association rules and segment popularity. By combining sequential prefetching weighted by segment importance with prefetching over several rounds, we implement HAS effectively. Extensive simulations validate that the proposed prefetching scheme provides shorter seek latency than other prefetching schemes.
3

Chen, Yong, Huaiyu Zhu, Philip C. Roth, Hui Jin, and Xian-He Sun. "Global-aware and multi-order context-based prefetching for high-performance processors." International Journal of High Performance Computing Applications 25, no. 4 (March 8, 2011): 355–70. http://dx.doi.org/10.1177/1094342010394386.

Abstract:
Data prefetching is widely used in high-end computing systems to accelerate data accesses and to bridge the increasing performance gap between processor and memory. Context-based prefetching has become a primary focus of study in recent years due to its general applicability. However, current context-based prefetchers only adopt context analysis of a single order, which suffers from low prefetching coverage and thus limits the overall prefetching effectiveness. Also, existing approaches usually consider the context of the address stream from a single instruction but not the context of the address stream from all instructions, which further limits context-based prefetching effectiveness. In this study, we propose a new context-based prefetcher called the Global-aware and Multi-order Context-based (GMC) prefetcher. The GMC prefetcher uses multi-order, local and global context analysis to increase prefetching coverage while maintaining prefetching accuracy. In extensive simulation testing of the SPEC CPU2006 benchmarks with an enhanced CMP$im simulator, the proposed GMC prefetcher was shown to outperform existing prefetchers and to reduce data-access latency effectively. The average Instructions Per Cycle (IPC) improvement of the SPEC CINT2006 and CFP2006 benchmarks with GMC prefetching was over 55% and 44%, respectively.
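At order one, "context" is just the most recent miss address, and the prefetcher is a table mapping it to the address that followed it last time. A toy order-1, single-stream sketch in C (structure and names are mine; GMC itself layers multiple context orders and a global, cross-instruction view on top of this idea):

```c
#include <stdint.h>
#include <stddef.h>

#define TABLE_SIZE 4096

/* One learned pair: "last time the miss stream showed `context`,
 * the next miss was `next`". */
struct ctx_entry {
    uint64_t context;
    uint64_t next;
};

static struct ctx_entry table[TABLE_SIZE];
static uint64_t last_miss;   /* the order-1 context */

static size_t hash_addr(uint64_t a) { return (a >> 6) % TABLE_SIZE; }

/* Called on every cache miss. Returns an address worth prefetching,
 * or 0 when this context has not been observed before. */
uint64_t on_miss(uint64_t addr)
{
    /* Learn: record that `last_miss` was followed by `addr`. */
    struct ctx_entry *e = &table[hash_addr(last_miss)];
    e->context = last_miss;
    e->next    = addr;

    /* Predict: what followed `addr` the last time we saw it? */
    struct ctx_entry *p = &table[hash_addr(addr)];
    uint64_t prediction = (p->context == addr) ? p->next : 0;

    last_miss = addr;
    return prediction;
}
```

Higher orders key the table on the last two, three, or more misses; the coverage gains the paper reports come from consulting several such orders plus the global miss stream at once.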
4

Chen, Yunliang, Fangyuan Li, Bo Du, Junqing Fan, and Ze Deng. "A Quantitative Analysis on Semantic Relations of Data Blocks in Storage Systems." Journal of Circuits, Systems and Computers 24, no. 08 (August 12, 2015): 1550118. http://dx.doi.org/10.1142/s0218126615501182.

Abstract:
In the big data era, there are and will be ever greater demands and higher requirements for data organization and management in storage systems. Previous studies on the semantic relations of blocks have laid an essential theoretical basis for storage systems. Revealing the semantic relationships of blocks has great scientific value and promising applications. However, traditional approaches provide only a qualitative description of block correlations, and little research has been done on unifying and quantifying the semantic relationships of blocks. In this paper, a uniform design-aided gene expression programming (UGEP) method is proposed to address this issue. After analyzing historical access streams, a quantitative description system for the semantic relationships of blocks is established. These semantic relationships can also be used for prefetching in storage systems, meaning the next accessed block can be calculated by the prefetching model. Simulation tests on four real system traces indicate that, compared with sequential prefetching and no prefetching, the quantitative semantic relationship model based on UGEP can greatly improve the performance of the storage system. The experimental results show that in the Cello-92 cases, compared with the no-prefetching and sequential-prefetching methods, prefetching based on UGEP reduces average I/O response time by 23.7% and 17.9%, and increases the hit ratio by 16% and 8%, respectively.
5

Callahan, David, Ken Kennedy, and Allan Porterfield. "Software prefetching." ACM SIGPLAN Notices 26, no. 4 (April 2, 1991): 40–52. http://dx.doi.org/10.1145/106973.106979.

6

Callahan, David, Ken Kennedy, and Allan Porterfield. "Software prefetching." ACM SIGOPS Operating Systems Review 25, Special Issue (April 2, 1991): 40–52. http://dx.doi.org/10.1145/106974.106979.

7

Callahan, David, Ken Kennedy, and Allan Porterfield. "Software prefetching." ACM SIGARCH Computer Architecture News 19, no. 2 (April 2, 1991): 40–52. http://dx.doi.org/10.1145/106975.106979.

8

Gschwind, Michael K., and Thomas J. Pietsch. "Vector prefetching." ACM SIGARCH Computer Architecture News 23, no. 5 (December 15, 1995): 1–7. http://dx.doi.org/10.1145/218328.218329.

9

Kimbrel, Tracy. "Interleaved Prefetching." Algorithmica 32, no. 1 (January 2002): 107–22. http://dx.doi.org/10.1007/s00453-001-0066-y.

10

Cantin, Jason F., Mikko H. Lipasti, and James E. Smith. "Stealth prefetching." ACM SIGOPS Operating Systems Review 40, no. 5 (October 20, 2006): 274–82. http://dx.doi.org/10.1145/1168917.1168892.

11

Cantin, Jason F., Mikko H. Lipasti, and James E. Smith. "Stealth prefetching." ACM SIGPLAN Notices 41, no. 11 (November 2006): 274–82. http://dx.doi.org/10.1145/1168918.1168892.

12

Cantin, Jason F., Mikko H. Lipasti, and James E. Smith. "Stealth prefetching." ACM SIGARCH Computer Architecture News 34, no. 5 (October 20, 2006): 274–82. http://dx.doi.org/10.1145/1168919.1168892.

13

Setia, Sonia, Jyoti, and Neelam Duhan. "Neural Network Based Prefetching Control Mechanism." International Journal of Engineering and Advanced Technology 9, no. 2 (December 30, 2019): 1361–66. http://dx.doi.org/10.35940/ijeat.a5089.129219.

Abstract:
An important issue that limits the use of the internet is long web access delays. The most efficient way to address this problem is prefetching, an effective technique for reducing user-perceived latency: it predicts and fetches in advance the web pages, corresponding to clients' requests, that will be accessed in the future. This paper is an attempt to dynamically monitor network bandwidth, for which a neural network-based model has been developed. Generally, the prediction is based on historical information that the server maintains, in chronological order, for each web page it serves. Prefetching is a speculative technique: if predictions are incorrect, it adds extra traffic to the network, seriously degrading network performance. Therefore, there is a critical need for a mechanism that can analyze the network bandwidth of the system before prefetching is done. Based on network conditions, this model not only indicates whether prefetching should be done but also gives the number of pages to prefetch in advance so that network bandwidth can be effectively utilized. The proposed control mechanism has been validated using the NS-2 simulator, and the various adverse effects of prefetching on response time and bandwidth utilization have thereby been reduced.
14

Liu, Jin, Chuang Hu, Ming Hu, and Yi-li Gong. "File prefetching with multiple prefetching points in multithreading environment." Journal of Computer Applications 32, no. 6 (April 27, 2013): 1713–16. http://dx.doi.org/10.3724/sp.j.1087.2012.01713.

15

Schreiber, Daniel, Andreas Göb, Erwin Aitenbichler, and Max Mühlhäuser. "Reducing User Perceived Latency with a Proactive Prefetching Middleware for Mobile SOA Access." International Journal of Web Services Research 8, no. 1 (January 2011): 68–85. http://dx.doi.org/10.4018/jwsr.2011010104.

Abstract:
Network latency is one of the most critical factors for the usability of mobile SOA applications. This paper introduces prefetching and caching enhancements for an existing SOA framework for mobile applications to reduce user-perceived latency. Latency reduction is achieved by proactively sending data to the mobile device that will most likely be requested at a later time. This additional data is piggybacked onto responses to actual requests and injected into a client-side cache, so that it can be used without an additional connection. The prefetching is done automatically using a sequence prediction algorithm. The benefits of the prefetching and caching enhancements were evaluated for different network settings, and a reduction of user-perceived latency of up to 31% was found in a typical scenario. In contrast to other prefetching solutions, our piggybacking approach also significantly increases the battery lifetime of the mobile device.
16

Verma, Santhosh, and David M. Koppelman. "The Interaction and Relative Effectiveness of Hardware and Software Data Prefetch." Journal of Circuits, Systems and Computers 21, no. 02 (April 2012): 1240002. http://dx.doi.org/10.1142/s0218126612400026.

Abstract:
A major performance limiter in modern processors is the long latencies caused by data cache misses. Both compiler- and hardware-based prefetching schemes help hide these latencies and so improve performance. Compiler techniques infer memory access patterns through code analysis, and insert appropriate prefetch instructions. Hardware prefetching techniques work independently from the compiler by monitoring an access stream, detecting patterns in this stream and issuing prefetches based on these patterns. This paper looks at the interplay between compiler and hardware architecture-based prefetching techniques. Does either technique make the other one unnecessary? First, compilers' ability to achieve good results without extreme expertise is evaluated by preparing binaries with no prefetch, one-flag prefetch (no tuning), and expertly tuned prefetch. From runs of SPECcpu2006 binaries, we find that expertise avoids minor slowdown in a few benchmarks and provides substantial speedup in others. We compare software schemes to hardware prefetching schemes and our simulations show software alone substantially outperforms hardware alone on about half of a selection of benchmarks. While hardware matches or exceeds software in a few cases, software is better on average. Analysis reveals that in many cases hardware is not prefetching access patterns that it is capable of recognizing, due to some irregularities in the observed miss sequence. Hardware outperforms software on address sequences that the compiler would not guess. In general, while software is better at prefetching individual loads, hardware partly compensates for this by identifying more loads to prefetch. Using the two schemes together provides further benefits, but less than the sum of the contributions of each alone.
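The software half of this comparison ultimately reduces to the compiler emitting prefetch instructions a fixed distance ahead of a predictable access stream. A hand-written C equivalent using GCC/Clang's __builtin_prefetch; the distance of 16 elements is an illustrative guess, and tuning it per loop and per machine is exactly the expertise the paper measures:

```c
#include <stddef.h>

#define PF_DIST 16   /* prefetch distance in elements; workload-specific */

/* Sum an array while prefetching PF_DIST elements ahead, mimicking the
 * prefetch instructions a compiler inserts for this regular stream. */
double sum_with_prefetch(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + PF_DIST < n)
            __builtin_prefetch(&a[i + PF_DIST], /*rw=*/0, /*locality=*/3);
        s += a[i];
    }
    return s;
}
```

A hardware stride prefetcher would catch this same stream on its own; the paper's point is that the interesting cases are the ones where only one of the two mechanisms succeeds.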
17

Quan, Guocong, Atilla Eryilmaz, Jian Tan, and Ness Shroff. "Prefetching and Caching for Minimizing Service Costs." ACM SIGMETRICS Performance Evaluation Review 48, no. 3 (March 5, 2021): 77–78. http://dx.doi.org/10.1145/3453953.3453970.

Abstract:
In practice, prefetching data strategically has been used to improve caching performance. The idea is that data items can either be cached upon request (traditional approach) or prefetched into the cache before the requests actually occur. The caching and prefetching operations compete for the limited cache space, whose size is typically much smaller than the number of data items. A key challenge is to design an optimal prefetching and caching policy, assuming that the future requests can be predicted to a certain extent. This is a non-trivial challenge even under the idealized assumption that future requests are precisely known.
18

He, Yongjun, Jiacheng Lu, and Tianzheng Wang. "CoroBase." Proceedings of the VLDB Endowment 14, no. 3 (November 2020): 431–44. http://dx.doi.org/10.14778/3430915.3430932.

Abstract:
Data stalls are a major overhead in main-memory database engines due to the use of pointer-rich data structures. Lightweight coroutines ease the implementation of software prefetching to hide data stalls by overlapping computation and asynchronous data prefetching. Prior solutions, however, mainly focused on (1) individual components and operations and (2) intra-transaction batching that requires interface changes, breaking backward compatibility. It was not clear how they apply to a full database engine and how much end-to-end benefit they bring under various workloads. This paper presents CoroBase, a main-memory database engine that tackles these challenges with a new coroutine-to-transaction paradigm. Coroutine-to-transaction models transactions as coroutines and thus enables inter-transaction batching, avoiding application changes but retaining the benefits of prefetching. We show that on a 48-core server, CoroBase can perform close to 2x better for read-intensive workloads and remain competitive for workloads that inherently do not benefit from software prefetching.
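The interleaving that CoroBase's coroutines automate can be seen in its manual ancestor, group prefetching: issue prefetches for a whole batch of pointers, let the loads overlap, then consume them. A sketch of that underlying pattern in C on a hypothetical hash-chain lookup (this is the prior intra-batch style the paper builds on, not CoroBase's API):

```c
#include <stdint.h>

struct node { uint64_t key; uint64_t value; struct node *next; };

#define BATCH 8

/* Probe BATCH hash-chain heads together: first put every head's cache
 * line in flight, then walk the chains, so the memory latencies overlap
 * instead of being paid one after another. */
void batched_lookup(struct node *heads[BATCH], const uint64_t keys[BATCH],
                    uint64_t out[BATCH])
{
    for (int i = 0; i < BATCH; i++)
        __builtin_prefetch(heads[i], 0, 3);   /* loads in flight */

    for (int i = 0; i < BATCH; i++) {         /* consume once warmed */
        out[i] = 0;
        for (struct node *n = heads[i]; n; n = n->next)
            if (n->key == keys[i]) { out[i] = n->value; break; }
    }
}
```

Coroutine-to-transaction moves this batching across whole transactions, which is how the paper avoids changing the application-facing interface.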
19

Zhang, Dan, Xiaoyu Ma, and Derek Chiou. "Worklist-Directed Prefetching." IEEE Computer Architecture Letters 16, no. 2 (July 1, 2017): 170–73. http://dx.doi.org/10.1109/lca.2016.2627571.

20

Sridharan, Aswinkumar, Biswabandan Panda, and Andre Seznec. "Band-Pass Prefetching." ACM Transactions on Architecture and Code Optimization 14, no. 2 (July 21, 2017): 1–27. http://dx.doi.org/10.1145/3090635.

21

Bhattacharjee, Abhishek. "Translation-Triggered Prefetching." ACM SIGOPS Operating Systems Review 51, no. 2 (April 4, 2017): 63–76. http://dx.doi.org/10.1145/3093315.3037705.

22

Bhattacharjee, Abhishek. "Translation-Triggered Prefetching." ACM SIGPLAN Notices 52, no. 4 (May 12, 2017): 63–76. http://dx.doi.org/10.1145/3093336.3037705.

23

Bhattacharjee, Abhishek. "Translation-Triggered Prefetching." ACM SIGARCH Computer Architecture News 45, no. 1 (May 11, 2017): 63–76. http://dx.doi.org/10.1145/3093337.3037705.

24

Wang, Zhenlin, Doug Burger, Kathryn S. McKinley, Steven K. Reinhardt, and Charles C. Weems. "Guided region prefetching." ACM SIGARCH Computer Architecture News 31, no. 2 (May 2003): 388–98. http://dx.doi.org/10.1145/871656.859663.

25

Shashidhara, D. N., D. N. Chandrappa, and C. Puttamadappa. "An Efficient Content Prefetching Method for Cloud Based Mobile Adhoc Network." Journal of Computational and Theoretical Nanoscience 17, no. 9 (July 1, 2020): 4162–66. http://dx.doi.org/10.1166/jctn.2020.9038.

Abstract:
Recently, Mobile Ad-hoc Networks (MANETs) have emerged as a very important research area for provisioning services to remote clients through the internet, cloud computing, and cellular networks. This work focuses on improving image access in MANETs. Various methods have been presented recently for reducing data access cost and query latency. A number of challenges need to be addressed, such as caching, content prefetching, shared access environments, and highly dynamic node mobility. As mobile ad-hoc networks are growing rapidly thanks to their capability of forming provisional networks without any predefined infrastructure, improving throughput (i.e., access rate) and reducing bit error rate (BER) (i.e., query latency) have been major concerns and requirements. This work aims to build an efficient technique for prefetching geographically distributed content to enhance the access rate and reduce query latency. In addition, our model can minimize the processing time and cost of the content prefetching operation. Experimental results show that the proposed content prefetching method improves bit error rate (BER) and throughput performance.
26

Banu, J. Saira, and M. Rajasekhara Babu. "Exploring Vectorization and Prefetching Techniques on Scientific Kernels and Inferring the Cache Performance Metrics." International Journal of Grid and High Performance Computing 7, no. 2 (April 2015): 18–36. http://dx.doi.org/10.4018/ijghpc.2015040102.

Abstract:
Performance improvement in modern processors is stagnating due to the power wall and memory wall problems. In general, the power wall problem is addressed by various vectorization design techniques, while the memory wall problem is mitigated through prefetching techniques. In this paper, vectorization is achieved through the Single Instruction Multiple Data (SIMD) registers of current processors. They provide architectural optimization by reducing the number of instructions in the pipeline and by minimizing the utilization of the multi-level memory hierarchy, and they offer an economical computing platform compared to Graphics Processing Units (GPUs) for compute-intensive applications. This paper explores software prefetching via Streaming SIMD Extensions (SSE) instructions to mitigate the memory wall problem. This work quantifies the effect of vectorization and prefetching in a Matrix-Vector Multiplication (MVM) kernel with dense and sparse structure. Both prefetching and vectorization reduce data and instruction cache pressure, thereby improving cache performance. The Intel VTune Amplifier is used to show the cache performance improvements in the kernel. Finally, experimental results demonstrate promising performance of the matrix kernel on an Intel Haswell processor. However, effective utilization of SIMD registers remains a programming challenge for developers.
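Concretely, the two techniques the paper combines look like this with SSE intrinsics: 4-wide multiply-accumulate for the vectorization, _mm_prefetch for the software prefetching. A dense matrix-vector multiply sketch in C (the 16-float lookahead and the layout are illustrative choices, not the paper's kernel; n is assumed to be a multiple of 4):

```c
#include <xmmintrin.h>  /* SSE: _mm_loadu_ps, _mm_mul_ps, _mm_prefetch */

/* y = A * x for a dense row-major n x n matrix, n a multiple of 4.
 * Each row is processed four floats at a time while the line holding
 * row[j + 16] is prefetched; prefetches past the end of the row are
 * harmless hints. */
void mvm_sse(const float *A, const float *x, float *y, int n)
{
    for (int i = 0; i < n; i++) {
        const float *row = &A[(long)i * n];
        __m128 acc = _mm_setzero_ps();
        for (int j = 0; j < n; j += 4) {
            _mm_prefetch((const char *)&row[j + 16], _MM_HINT_T0);
            __m128 a = _mm_loadu_ps(&row[j]);
            __m128 v = _mm_loadu_ps(&x[j]);
            acc = _mm_add_ps(acc, _mm_mul_ps(a, v));
        }
        float t[4];
        _mm_storeu_ps(t, acc);
        y[i] = t[0] + t[1] + t[2] + t[3];  /* horizontal sum of the lanes */
    }
}
```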
27

Roy, Bishwa Ranjan, Purnendu Das, and Nurulla Mansur Barbhuiya. "PP-Bridge: Establishing a Bridge between the Prefetching and Cache Partitioning." International Journal on Recent and Innovation Trends in Computing and Communication 11, no. 9 (October 30, 2023): 897–906. http://dx.doi.org/10.17762/ijritcc.v11i9.8982.

Abstract:
Modern computer processors are equipped with multiple cores, each boasting its own dedicated cache memory, while collectively sharing a generously sized Last Level Cache (LLC). To ensure equitable utilization of the LLC space and bolster system security, partitioning techniques have been introduced to allocate the shared LLC space among the applications running on different cores. This partition dynamically adapts to the requirements of these applications. Prefetching plays a vital role in enhancing cache performance by proactively loading data into the cache before it is explicitly requested by a core. Each core employs prefetch engines to decide which data blocks to fetch preemptively. However, a haphazard prefetcher may bring in more data blocks than necessary, leading to cache pollution and a subsequent degradation in system performance. To maximize the benefits of prefetching, it is essential to keep cache pollution to a minimum. Intriguingly, our research has uncovered that when existing prefetching techniques are combined with partitioning methods, they tend to exacerbate cache pollution within the LLC, resulting in a noticeable decline in system performance. In this paper, we present a novel approach aimed at mitigating cache pollution when combining prefetching with partitioning techniques.
28

Zhang, Kai, and Chao Tian. "Fundamental Limits of Coded Caching: From Uncoded Prefetching to Coded Prefetching." IEEE Journal on Selected Areas in Communications 36, no. 6 (June 2018): 1153–64. http://dx.doi.org/10.1109/jsac.2018.2844958.

29

Rochberg, David, and Garth Gibson. "Prefetching over a network." ACM SIGMETRICS Performance Evaluation Review 25, no. 3 (December 1997): 29–36. http://dx.doi.org/10.1145/270900.270906.

30

Joseph, Doug, and Dirk Grunwald. "Prefetching using Markov predictors." ACM SIGARCH Computer Architecture News 25, no. 2 (May 1997): 252–63. http://dx.doi.org/10.1145/384286.264207.

31

Angermann, Michael. "Analysis of speculative prefetching." ACM SIGMOBILE Mobile Computing and Communications Review 6, no. 2 (April 2002): 13–17. http://dx.doi.org/10.1145/565702.565706.

32

Aggarwal, Aneesh. "Software caching vs. prefetching." ACM SIGPLAN Notices 38, no. 2 supplement (February 15, 2003): 157–62. http://dx.doi.org/10.1145/773039.512450.

33

Jiménez, Víctor, Francisco J. Cazorla, Roberto Gioiosa, Alper Buyuktosunoglu, Pradip Bose, Francis P. O'Connell, and Bruce G. Mealey. "Adaptive Prefetching on POWER7." ACM Transactions on Parallel Computing 1, no. 1 (October 3, 2014): 1–25. http://dx.doi.org/10.1145/2588889.

34

Patterson, R. H., G. A. Gibson, E. Ginting, D. Stodolsky, and J. Zelenka. "Informed prefetching and caching." ACM SIGOPS Operating Systems Review 29, no. 5 (December 3, 1995): 79–95. http://dx.doi.org/10.1145/224057.224064.

35

Joseph, D., and D. Grunwald. "Prefetching using Markov predictors." IEEE Transactions on Computers 48, no. 2 (1999): 121–33. http://dx.doi.org/10.1109/12.752653.

36

Kratzer, Klaus, Hartmut Wedekind, and Georg Zörntlein. "Prefetching—a performance analysis." Information Systems 15, no. 4 (January 1990): 445–52. http://dx.doi.org/10.1016/0306-4379(90)90047-s.

37

Li, Jun, Xiaofei Xu, Zhigang Cai, Jianwei Liao, Kenli Li, Balazs Gerofi, and Yutaka Ishikawa. "Pattern-Based Prefetching with Adaptive Cache Management Inside of Solid-State Drives." ACM Transactions on Storage 18, no. 1 (February 28, 2022): 1–25. http://dx.doi.org/10.1145/3474393.

Abstract:
This article proposes a pattern-based prefetching scheme with the support of adaptive cache management at the flash translation layer of solid-state drives (SSDs). It works inside of SSDs, independently of the operating system and transparently to users. Specifically, it first mines frequent block access patterns that reflect the correlation among the observed I/O requests. Then, it compares the requests in the current time window with the identified patterns to direct the prefetching of data into the SSD cache. More importantly, to maximize cache use efficiency, we build a mathematical model that adaptively determines the cache partition on the basis of I/O workload characteristics, for separately buffering the prefetched data and the written data. Experimental results show that our proposal improves average read latency by 1.8%–36.5% over conventional SSD-internal prefetching schemes, without noticeably increasing write latency.
38

Fang, Juan, and Hong Bo Zhang. "An Improved Architecture for Multi-Core Prefetching." Advanced Materials Research 505 (April 2012): 253–56. http://dx.doi.org/10.4028/www.scientific.net/amr.505.253.

Abstract:
The “memory wall” problem has become a bottleneck for processor performance, and on-chip multiprocessors (CMPs) aggravate memory access latency, so many hardware prefetching techniques, e.g., Future Execution, have been proposed to meet this challenge. This paper introduces runahead execution (another hardware prefetching technique, first used on single-core processors) and Future Execution, then proposes some improvements to Future Execution and presents results and analysis for data from the SPEC2000 benchmarks.
39

Qian, Cheng, Bruce Childers, Libo Huang, Hui Guo, and Zhiying Wang. "CGAcc: A Compressed Sparse Row Representation-Based BFS Graph Traversal Accelerator on Hybrid Memory Cube." Electronics 7, no. 11 (November 7, 2018): 307. http://dx.doi.org/10.3390/electronics7110307.

Abstract:
Graph traversal is widely used in map routing, social network analysis, causal discovery and many more applications. Because it is a memory-bound process, graph traversal puts significant pressure on the memory subsystem. Due to poor spatial locality and the increasing size of today’s datasets, graph traversal consumes an ever-larger part of application execution time. One way to mitigate this cost is memory prefetching, which issues requests from the processor to the memory in anticipation of needing certain data. However, traditional prefetching does not work well for graph traversal due to data dependencies, the parallel nature of graphs and the need to move vast amounts of data from memory to the caches. In this paper, we propose a compressed sparse row representation-based graph accelerator on the Hybrid Memory Cube (HMC), called CGAcc. CGAcc combines Compressed Sparse Row (CSR) graph representation with in-memory prefetching and processing to improve the performance of graph traversal. Our approach integrates the prefetching and processing in the logic layer of a 3D stacked Dynamic Random-Access Memory (DRAM) architecture, based on Micron’s HMC. We selected HMC to implement CGAcc because it can provide quite high bandwidth and low access latency. Furthermore, this device has multiple DRAM layers connected to internal logic to control memory access and perform rudimentary computation. Using the CSR representation, CGAcc deploys prefetchers in the HMC to exploit the short transaction latency between the logic and DRAM layers. By doing this, it can also avoid large data movement costs. At runtime, CGAcc pipelines the prefetching to fetch data from DRAM arrays to improve memory-level parallelism. To further reduce access latency, several optimized internal caches are also introduced to hold the prefetched data to be Processed In-Memory (PIM). A comprehensive evaluation shows the effectiveness of CGAcc. Experimental results showed that, compared to a conventional HMC main memory equipped with a stream prefetcher, CGAcc achieved an average 3.51× speedup with moderate hardware cost.
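For reference, CSR stores a graph in two arrays: row_ptr[v] and row_ptr[v+1] delimit the slice of col_idx holding v's neighbors, so BFS becomes a chain of dependent indirections (frontier, then row_ptr, then col_idx, then dist), which is precisely the access pattern CGAcc prefetches in the HMC's logic layer. A plain CPU-side sketch of that traversal in C (names are mine, not the accelerator's):

```c
#include <stdlib.h>

/* Level-order BFS over a CSR graph: dist[v] receives the hop count
 * from src, or -1 if v is unreachable. The dependent loads on
 * row_ptr/col_idx/dist are what an in-memory prefetcher targets. */
void bfs_csr(int n, const int *row_ptr, const int *col_idx,
             int src, int *dist)
{
    int *frontier = malloc((size_t)n * sizeof *frontier);
    int head = 0, tail = 0;

    for (int v = 0; v < n; v++) dist[v] = -1;
    dist[src] = 0;
    frontier[tail++] = src;

    while (head < tail) {
        int v = frontier[head++];
        for (int e = row_ptr[v]; e < row_ptr[v + 1]; e++) {
            int u = col_idx[e];
            if (dist[u] == -1) {          /* first visit */
                dist[u] = dist[v] + 1;
                frontier[tail++] = u;
            }
        }
    }
    free(frontier);
}
```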
40

Selvam, S. "An Effective Techniques Using Apriori and Logistic Methods in Cloud Computing." IARS' International Research Journal 11, no. 2 (August 29, 2021): 35–39. http://dx.doi.org/10.51611/iars.irj.v11i2.2021.167.

Abstract:
This paper presents an initiative data prefetching scheme on the storage servers in distributed file systems for cloud computing. After analyzing data piggybacked from the client system, the server forwards prefetched data to the client machine. To put this technique to work, information about client nodes is piggybacked onto real client I/O requests and then forwarded to the relevant storage server. Next, two prediction algorithms are proposed to forecast future block access operations, directing what data should be fetched on storage servers in advance. Finally, the prefetched data can be pushed from the storage server to the relevant client device. Through a series of evaluation experiments with a group of application benchmarks, we have demonstrated that the presented initiative prefetching technique can help distributed file systems for cloud environments achieve better I/O performance. In particular, resource-limited client machines in the cloud are not responsible for predicting I/O access operations, which can certainly contribute to preferable system performance on them.
41

Pamnani, Sumitkumar N., Deepak N. Agarwal, Gang Qu, and Donald Yeung. "Low Power System Design by Combining Software Prefetching and Dynamic Voltage Scaling." Journal of Circuits, Systems and Computers 16, no. 05 (October 2007): 745–67. http://dx.doi.org/10.1142/s0218126607003964.

Abstract:
Performance-enhancement techniques improve CPU speed at the cost of other valuable system resources such as power and energy. Software prefetching is one such technique, tolerating memory latency for high performance. In this article, we quantitatively study this technique's impact on system performance and power/energy consumption. First, we demonstrate that software prefetching achieves an average of 36% performance improvement with 8% additional energy consumption and 69% higher power consumption on six memory-intensive benchmarks. Then we combine software prefetching with an (unrealistic) static voltage scaling technique to show that this performance gain can be converted to an average of 48% energy saving. This suggests that it is promising to build low power systems with techniques traditionally known for performance enhancement. We thus propose a practical online profiling based dynamic voltage scaling (DVS) algorithm. The algorithm monitors the system's performance and adapts the voltage level accordingly to save energy while maintaining the observed system performance. Our proposed online profiling DVS algorithm achieves 38% energy saving without any significant performance loss.
42

Thilaganga, V., and S. Selvam. "A Prefetching Technique using Apriori and Logistic Methods for the DFS in Cloud." Asian Journal of Engineering and Applied Technology 6, no. 1 (May 5, 2017): 10–13. http://dx.doi.org/10.51983/ajeat-2017.6.1.817.

Abstract:
This paper presents an initiative data prefetching scheme on the storage servers in distributed file systems for cloud computing. After analyzing data piggybacked from the client system, the server forwards prefetched data to the client machine. To put this technique to work, information about client nodes is piggybacked onto real client I/O requests and then forwarded to the relevant storage server. Next, two prediction algorithms are proposed to forecast future block access operations, directing what data should be fetched on storage servers in advance. Finally, the prefetched data can be pushed from the storage server to the relevant client device. Through a series of evaluation experiments with a group of application benchmarks, we have demonstrated that the presented initiative prefetching technique can help distributed file systems for cloud environments achieve better I/O performance. In particular, resource-limited client machines in the cloud are not responsible for predicting I/O access operations, which can certainly contribute to preferable system performance on them.
43

Natarajan, Ragavendra, Vineeth Mekkat, Wei-Chung Hsu, and Antonia Zhai. "Effectiveness of Compiler-Directed Prefetching on Data Mining Benchmarks." Journal of Circuits, Systems and Computers 21, no. 02 (April 2012): 1240006. http://dx.doi.org/10.1142/s0218126612400063.

Abstract:
For today's increasingly power-constrained multicore systems, integrating simpler and more energy-efficient in-order cores becomes attractive. However, since in-order processors lack complex hardware support for tolerating long-latency memory accesses, developing compiler technologies to hide such latencies becomes critical. Compiler-directed prefetching has been demonstrated to be effective on some applications. On the application side, a large class of data-centric applications has emerged to explore the underlying properties of the explosively growing data. These applications, in contrast to traditional benchmarks, are characterized by substantial thread-level parallelism, complex and unpredictable control flow, as well as intensive and irregular memory access patterns. These applications are expected to be the dominating workloads on future microprocessors. Thus, in this paper, we investigated the effectiveness of compiler-directed prefetching on data mining applications in in-order multicore systems. Our study reveals that although properly inserted prefetch instructions can often effectively reduce memory access latencies for data mining applications, the compiler is not always able to exploit this potential. Compiler-directed prefetching can become inefficient in the presence of complex control flow, irregular memory access patterns, and architecture-dependent behaviors. The integration of multithreaded execution onto a single die makes it even more difficult for the compiler to insert prefetch instructions, since optimizations that are effective for single-threaded execution may or may not be effective in multithreaded execution. Thus, compiler-directed prefetching must be judiciously deployed to avoid creating performance bottlenecks that otherwise do not exist. Our experiences suggest that dynamic performance tuning techniques that adjust to the behaviors of a program can potentially facilitate the deployment of aggressive optimizations in data mining applications.
44

Keshava, Kausthub, Alain Jean-Marie, and Sara Alouf. "Optimal Prefetching in Random Trees." Mathematics 9, no. 19 (October 1, 2021): 2437. http://dx.doi.org/10.3390/math9192437.

Abstract:
We propose and analyze a model for optimizing the prefetching of documents, in the situation where the connection between documents is discovered progressively. A random surfer moves along the edges of a random tree representing possible sequences of documents, which is known to a controller only up to depth d. A quantity k of documents can be prefetched between two movements. The question is to determine which nodes of the known tree should be prefetched so as to minimize the probability of the surfer moving to a node not prefetched. We analyzed the model with the tools of Markov decision process theory. We formally identified the optimal policy in several situations, and we identified it numerically in others.
45

Chen, Juan. "Energy-Constrained Software Prefetching Optimization." Journal of Software 17, no. 7 (2006): 1650. http://dx.doi.org/10.1360/jos171650.

46

Wedekind, H., and George Zoerntlein. "Prefetching in realtime database applications." ACM SIGMOD Record 15, no. 2 (June 15, 1986): 215–26. http://dx.doi.org/10.1145/16856.16876.

47

Curewitz, Kenneth M., P. Krishnan, and Jeffrey Scott Vitter. "Practical prefetching via data compression." ACM SIGMOD Record 22, no. 2 (June 1993): 257–66. http://dx.doi.org/10.1145/170036.170077.

48

Master, Neal, Aditya Dua, Dimitrios Tsamis, Jatinder Pal Singh, and Nicholas Bambos. "Adaptive Prefetching in Wireless Computing." IEEE Transactions on Wireless Communications 15, no. 5 (May 2016): 3296–310. http://dx.doi.org/10.1109/twc.2016.2519882.

49

Kimbrel, Tracy, Pei Cao, Edward W. Felten, Anna R. Karlin, and Kai Li. "Integrated parallel prefetching and caching." ACM SIGMETRICS Performance Evaluation Review 24, no. 1 (May 15, 1996): 262–63. http://dx.doi.org/10.1145/233008.233052.

50

Vitter, Jeffrey Scott, and P. Krishnan. "Optimal prefetching via data compression." Journal of the ACM 43, no. 5 (September 1996): 771–93. http://dx.doi.org/10.1145/234752.234753.
