Dissertations on the topic "Automatic dynamic memory management"
Consult the top 43 dissertations for your research on the topic "Automatic dynamic memory management".
Österlund, Erik. "Automatic memory management system for automatic parallelization." Thesis, Linnéuniversitetet, Institutionen för datavetenskap, fysik och matematik, DFM, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-13693.
Stojanovic, Marta. "Automatic memory management in Java." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2001. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp05/MQ65392.pdf.
Doddapaneni, Srinivas P. "Automatic dynamic decomposition of programs on distributed memory machines." Diss., Georgia Institute of Technology, 1997. http://hdl.handle.net/1853/8158.
Zhang, Yang. "Dynamic Memory Management for the Loci Framework." MSSTATE, 2004. http://sun.library.msstate.edu/ETD-db/theses/available/etd-04062004-215627/.
Van, Vleet Taylor. "Dynamic cache-line sizes /." Thesis, Connect to this title online; UW restricted, 2000. http://hdl.handle.net/1773/6899.
Herrmann, Edward C. "Threaded Dynamic Memory Management in Many-Core Processors." University of Cincinnati / OhioLINK, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1277132326.
Burrell, Tiffany. "System Identification in Automatic Database Memory Tuning." Scholar Commons, 2010. https://scholarcommons.usf.edu/etd/1583.
Ma, Yuke. "Design, Test and Implement a Reflective Scheduler with Task Partitioning Support of a Grid." Thesis, Cranfield University, 2008. http://hdl.handle.net/1826/3510.
Li, Wentong. "High performance architecture using speculative threads and dynamic memory management hardware." [Denton, Tex.] : University of North Texas, 2007. http://digital.library.unt.edu/permalink/meta-dc-5150.
Li, Wentong. "High Performance Architecture using Speculative Threads and Dynamic Memory Management Hardware." Thesis, University of North Texas, 2007. https://digital.library.unt.edu/ark:/67531/metadc5150/.
Li, Bo. "Modeling and Runtime Systems for Coordinated Power-Performance Management." Diss., Virginia Tech, 2019. http://hdl.handle.net/10919/87064.
Повний текст джерелаPh. D.
System efficiency on high-performance computing (HPC) systems is key to meeting the power budget of exascale supercomputers. Techniques for adjusting the performance of different system components can help accomplish this goal by dynamically controlling system performance according to application behavior. In this dissertation, we focus on three techniques: adjusting CPU performance, adjusting memory performance, and varying the number of threads for running parallel applications. First, we profile the performance and energy consumption of different HPC applications on both Intel systems with accelerators and IBM BG/Q systems. We explore the performance and energy trade-offs of these techniques and provide optimization insights. Furthermore, we propose a parallel performance model that accurately captures the impact of these techniques on performance in terms of job completion time. We present an approximation approach for performance prediction, with up to 7% and 17% prediction error on Intel x86 and IBM BG/Q systems respectively across 19 HPC applications. Thereafter, we apply the performance model in a runtime system designed to improve performance under a given power budget. Our runtime strategy achieves up to 20% performance improvement over the baseline method.
Shalan, Mohamed A. "Dynamic memory management for embedded real-time multiprocessor system-on-a-chip." Diss., Available online, Georgia Institute of Technology, 2003. http://etd.gatech.edu/theses/available/etd-11252003-131621/unrestricted/shalanmohameda200312.pdf.
Vincent Mooney, Committee Chair; John Barry, Committee Member; James Hamblen, Committee Member; Karsten Schwan, Committee Member; Linda Wills, Committee Member. Includes bibliography.
Xia, Xiuxian. "Dynamic power distribution management for all electric aircraft." Thesis, Cranfield University, 2011. http://dspace.lib.cranfield.ac.uk/handle/1826/6285.
Kim, Seyeon. "Node-oriented dynamic memory management for real-time systems on ccNUMA architecture systems." Thesis, University of York, 2013. http://etheses.whiterose.ac.uk/5712/.
Peterson, Thomas. "Dynamic Allocation for Embedded Heterogeneous Memory : An Empirical Study." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-223904.
Embedded systems are ubiquitous and contribute to our standard of living in many respects by providing functionality in larger systems. To operate, embedded systems require well-functioning hardware and software, as well as interfaces between the two. All three must be continually reworked as new, useful technologies for embedded systems are developed. One change these systems are currently undergoing is experimentation with new RAM memory-management techniques, as new non-volatile RAMs have been developed. These memories often exhibit asymmetric read and write latencies, which motivates a memory design based on several different non-volatile RAMs. As a consequence of these properties and memory designs, there is a need to find memory-allocation techniques that minimize the latencies incurred. This thesis addresses the problem of memory allocation on heterogeneous memories through an empirical study. In the first part of the study, allocation techniques based on a linked list, a bitmap, and a buddy system were examined; on this basis, the linked list was concluded to be superior to the alternatives. Thereafter, memory architectures with multiple memory banks were devised, along with several strategies for selecting a memory bank. These strategies were based on size thresholds and on the utilization of the different banks. The evaluation of these strategies did not yield any major conclusions, but showed that different strategies suited different application behaviors.
Gazi, Boran. "Dynamic buffer management policy for shared memory packet switches by employing per-queue thresholds." Thesis, Northumbria University, 2007. http://nrl.northumbria.ac.uk/3695/.
Huang, Jipeng. "Efficient Context Sensitivity for Dynamic Analyses via Calling Context Uptrees and Customized Memory Management." The Ohio State University, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=osu1397231571.
Vijayakumar, Smita. "A Framework for Providing Automatic Resource and Accuracy Management in a Cloud Environment." The Ohio State University, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=osu1274194090.
Young, Jeffrey. "Dynamic partitioned global address spaces for high-efficiency computing." Thesis, Atlanta, Ga. : Georgia Institute of Technology, 2008. http://hdl.handle.net/1853/26467.
Committee Chair: Yalamanchili, Sudhakar; Committee Member: Riley, George; Committee Member: Schimmel, David. Part of the SMARTech Electronic Thesis and Dissertation Collection.
Ramaswamy, Lakshmish Macheeri. "Towards Efficient Delivery of Dynamic Web Content." Diss., Georgia Institute of Technology, 2005. http://hdl.handle.net/1853/7646.
Sinha, Udayan Prabir. "Memory Management Error Detection in Parallel Software using a Simulated Hardware Platform." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-219606.
Memory-management errors in parallel software executing on multicore architectures can be difficult to detect and costly to fix. Examples of such errors are use of uninitialized memory, memory leaks, and data being overwritten by a process that does not own it. If memory-management errors could be detected at an early stage, for example by using a simulator run before the software is delivered and integrated into a product, significant cost savings could be achieved. This thesis investigates and develops methods for detecting use of uninitialized memory in software running on a virtual platform. The virtual platform contains models of parts of the digital baseband and radio hardware found in an Ericsson radio base station. The models are bit-accurate representations of the corresponding hardware blocks, including processors and peripherals, and the platform is used by Ericsson for software development and integration. There are tools, such as Memcheck (Valgrind) and MemorySanitizer and AddressSanitizer (Clang), that can be used to detect memory-management errors. The properties of such tools have been investigated, and algorithms for detecting memory-management errors have been developed for a specific processor and its instruction set. The algorithms have been implemented in a virtual platform, and the requirements and design considerations reflecting the application-specific instruction set of the chosen processor have been addressed. A prototype for presenting memory-management errors, which points out the source-code lines and the call stack at the locations where errors were found, has been developed using a debugger.
An experiment using a purpose-built program was used to evaluate the error-detection capability of the algorithms implemented in the virtual platform, and to compare it with that of Memcheck. For the program used, the algorithms implemented in the virtual platform detect all known errors except one. The algorithms also report false positives; these reports are mainly a result of the current implementation having limited knowledge of the operating system running on the simulated processor.
Gangadharappa, Tejus A. "Designing Support For MPI-2 Programming Interfaces On Modern InterConnects." The Ohio State University, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=osu1243908626.
Green, Craig Elkton. "Composite thermal capacitors for transient thermal management of multicore microprocessors." Diss., Georgia Institute of Technology, 2012. http://hdl.handle.net/1853/44772.
Zhu, Yong. "Routing, Resource Allocation and Network Design for Overlay Networks." Diss., Georgia Institute of Technology, 2006. http://hdl.handle.net/1853/14017.
Saxena, Abhinav. "Knowledge-Based Architecture for Integrated Condition Based Maintenance of Engineering Systems." Diss., Georgia Institute of Technology, 2007. http://hdl.handle.net/1853/16125.
Frampton, Daniel John. "An Investigation into Automatic Dynamic Memory Management Strategies using Compacting Collection." Thesis, 2003. http://hdl.handle.net/1885/39951.
Pai, Sreepathi. "Efficient Dynamic Automatic Memory Management And Concurrent Kernel Execution For General-Purpose Programs On Graphics Processing Units." Thesis, 2014. http://etd.iisc.ernet.in/handle/2005/2609.
Most GPU programs access data in GPU memory for performance. Manually inserting the data transfers that move data to and from this GPU memory is an error-prone and tedious task. In this work, we develop a software coherence mechanism to fully automate all data transfers between the CPU and GPU without any assistance from the programmer. Our mechanism uses compiler analysis to identify potential stale-data accesses and uses a runtime to initiate transfers as necessary. This avoids the redundant transfers exhibited by all other existing automatic memory management proposals for general-purpose programs. We integrate our automatic memory manager into the X10 compiler and runtime, and find that it not only results in smaller and simpler programs, but also eliminates redundant memory transfers. Tested on eight programs ported from the Rodinia benchmark suite, it achieves (i) a 1.06x speedup over hand-tuned manual memory management, and (ii) a 1.29x speedup over another recently proposed compiler-runtime automatic memory management system. Compared to other existing runtime-only (ADSM) and compiler-only (OpenMPC) proposals, it also transfers 2.2x to 13.3x less data on average.
Each new generation of GPUs vastly increases the resources available to GPGPU programs. GPU programming models (like CUDA) were designed to scale to use these resources. However, we find that CUDA programs actually do not scale to utilize all available resources, with over 30% of resources going unused on average for programs of the Parboil2 suite. Current GPUs therefore allow concurrent execution of kernels to improve utilization. We study concurrent execution of GPU kernels using multiprogrammed workloads on current NVIDIA Fermi GPUs. On two-program workloads from Parboil2 we find concurrent execution is often no better than serialized execution. We identify lack of control over resource allocation to kernels as a major serialization bottleneck. We propose transformations that convert CUDA kernels into elastic kernels which permit fine-grained control over their resource usage. We then propose several elastic-kernel aware runtime concurrency policies that offer significantly better performance and concurrency than the current CUDA policy. We evaluate our proposals on real hardware using multiprogrammed workloads constructed from benchmarks in the Parboil2 suite. On average, our proposals increase system throughput (STP) by 1.21x and improve the average normalized turnaround time (ANTT) by 3.73x for two-program workloads over the current CUDA concurrency implementation.
Recent NVIDIA GPUs use a FIFO policy in their thread block scheduler (TBS) to schedule thread blocks of concurrent kernels. We show that FIFO leaves performance to chance, resulting in significant loss of performance and fairness. To improve performance and fairness, we propose using the Shortest Remaining Time First (SRTF) policy instead. Since SRTF requires an estimate of runtime (i.e., execution time), we introduce Structural Runtime Prediction, which uses the grid structure of GPU programs to predict runtimes. Using a novel Staircase model of GPU kernel execution, we show that kernel runtime can be predicted by profiling only the first few thread blocks. We evaluate an online predictor based on this model on benchmarks from ERCBench and find that predictions made after the execution of a single thread block are between 0.48x and 1.08x of actual runtime. We implement the SRTF policy for concurrent kernels using this predictor and evaluate it on two-program workloads from ERCBench. SRTF improves STP by 1.18x and ANTT by 2.25x over FIFO. Compared to MPMax, a state-of-the-art resource allocation policy for concurrent kernels, SRTF improves STP by 1.16x and ANTT by 1.3x. To improve fairness, we also propose SRTF/Adaptive, which controls the resource usage of concurrently executing kernels to maximize fairness. SRTF/Adaptive improves STP by 1.12x, ANTT by 2.23x and fairness by 2.95x compared to FIFO. Overall, our implementation of SRTF achieves STP within 12.64% of Shortest Job First (SJF, an oracle optimal scheduling policy), bridging 49% of the gap between FIFO and SJF.
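The prediction-plus-SRTF idea in the abstract above can be illustrated with a small sketch. This is a hypothetical toy model, not code from the thesis: the staircase-style estimate treats a kernel as executing in waves of thread blocks, and the scheduler picks the kernel with the least predicted remaining time. All names and numbers are invented for illustration.

```python
import math

def predict_runtime(first_block_time, num_blocks, blocks_per_wave):
    """Staircase-style estimate: thread blocks execute in 'waves',
    so total runtime is roughly one wave's time per wave."""
    waves = math.ceil(num_blocks / blocks_per_wave)
    return first_block_time * waves

def srtf_pick(kernels, now):
    """Pick the kernel with the shortest predicted remaining time.
    Each kernel carries a predicted total runtime and the time it
    started executing (None if not yet started)."""
    def remaining(k):
        if k["start"] is None:
            return k["predicted"]
        return max(k["predicted"] - (now - k["start"]), 0.0)
    return min(kernels, key=remaining)

# Two concurrent kernels: A is long-running, B is short.
a = {"name": "A", "predicted": predict_runtime(2.0, 100, 8), "start": 0.0}
b = {"name": "B", "predicted": predict_runtime(1.0, 16, 8), "start": None}
print(srtf_pick([a, b], now=5.0)["name"])  # B (2.0 remaining vs. A's 21.0)
```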
Quinane, Luke. "An Examination of Deferred Reference Counting and Cycle Detection." Thesis, 2003. http://hdl.handle.net/1885/42030.
SUN, SHOU-LIANG, and 孫守亮. "Dynamic Memory Management for Xen Virtualization Platforms." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/s4xhyj.
National Pingtung University (國立屏東大學)
Master's program, Department of Computer Science and Information Engineering. ROC academic year 105.
Virtual memory is a standard mechanism in modern operating systems that provides extended memory space on disk for memory-overcommit situations (i.e., systems with insufficient memory). Although a sufficient amount of memory can thus be provided, performance is impaired because enabling virtual memory generates a large number of disk I/O requests. In virtualization environments, moreover, these I/O requests may degrade other virtual machines (VMs), because VMs share the disk of the same physical machine (i.e., disk contention). In this thesis, we propose a dynamic memory management approach for Xen virtualization systems, called critical amount guaranteed memory allocation (CAGMA), to expand or shrink the allocated memory of a VM dynamically with a guaranteed amount of available memory. Under CAGMA, a critical memory amount is calculated for each VM periodically, as well as whenever a swapping event occurs or virtual memory is enabled. The allocated memory of each VM is then adjusted according to its critical memory amount, so that the number of I/O requests generated for virtual memory is greatly reduced and the performance degradation problem is prevented. CAGMA has been implemented in Xen 4.2.2, and a series of experiments produced encouraging results.
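The allocation policy described in the abstract (guarantee each VM its critical amount, then distribute the rest) might look roughly like the following sketch. This is an illustrative approximation, not the CAGMA implementation: the working-set estimate, the 10% safety margin, and the proportional split of the surplus are all assumptions made here for the example.

```python
def critical_amount(working_set, margin=0.1):
    # Critical amount = estimated working set plus a safety margin,
    # so the guest avoids enabling virtual memory (swapping).
    return working_set * (1.0 + margin)

def adjust_allocations(demands, total):
    """demands: {vm: estimated working set (MB)} -> {vm: allocation (MB)}.
    Each VM gets at least its critical amount when the host has enough
    memory; any surplus is split in proportion to demand. If the host
    cannot cover all critical amounts, everyone scales down."""
    crit = {vm: critical_amount(d) for vm, d in demands.items()}
    need = sum(crit.values())
    if need >= total:
        return {vm: c * total / need for vm, c in crit.items()}
    surplus = total - need
    all_demand = sum(demands.values())
    return {vm: c + surplus * demands[vm] / all_demand
            for vm, c in crit.items()}
```

For example, with working sets of 1000 MB and 3000 MB on an 8000 MB host, the VMs end up with 2000 MB and 6000 MB respectively: each critical amount is guaranteed and the surplus follows demand.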
Hsiao, Kuang-tse, and 蕭光哲. "Customized Dynamic Memory Management for Embedded Systems." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/31795309229125915660.
National Cheng Kung University (國立成功大學)
Institute of Computer and Communication Engineering. ROC academic year 97.
More and more multimedia and network-service applications are being ported to embedded systems. These applications usually depend on dynamic memory allocation for their execution because of the inherent unpredictability of their processing. Many general dynamic memory management policies and implementations are available and achieve good performance in general-purpose systems. However, a common, general-purpose memory management mechanism may not suit embedded platforms because of their resource constraints; customized dynamic memory management is a more suitable solution. When customization of dynamic memory management is integrated into the porting or development process, it becomes an effective way to obtain efficient memory usage and better performance at low cost. This thesis presents the design and implementation of customized dynamic memory management for embedded system platforms. It has two parts: configurable dynamic memory management mechanisms for applications, and supporting tools that help select among them. Given several dynamic memory management mechanisms, an application developer may choose suitable ones based on the pattern of memory requests. The supporting tools provide information on memory usage by collecting data about memory requests at run time and analyzing the collected data. The data are collected from both the application side and the operating-system side, so the chosen mechanism can balance these two aspects and improve performance. The customized dynamic memory management of this thesis is implemented in the Zinix micro-kernel operating system running on a TI DaVinci EVM. The implementation includes memory management policies in a function library used by application software, and page management policies in the operating system.
To help choose a suitable memory management policy for application software, a tool that analyzes the memory usage of the application and suggests a suitable choice is also implemented. The implementation is tested with H.264 decoder software and 3D object rendering software. The results are promising as predicted, and the efficient memory usage makes the test programs execute faster.
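A policy-suggestion tool of the kind described above could be as simple as a heuristic over the recorded allocation trace. The sketch below is a hypothetical illustration, not the thesis's tool: the 90% dominance threshold and the two policy names are invented for the example.

```python
from collections import Counter

def suggest_policy(request_sizes, dominance=0.9):
    """If one request size dominates the trace, a fixed-size pool
    allocator is cheap and fragmentation-free; otherwise fall back
    to a general free-list allocator."""
    if not request_sizes:
        return "free-list"
    size, count = Counter(request_sizes).most_common(1)[0]
    if count / len(request_sizes) >= dominance:
        return f"pool({size})"
    return "free-list"

# A codec that mostly allocates fixed-size frame buffers:
print(suggest_policy([64] * 95 + [128] * 5))  # pool(64)
```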
Ke, Dian-Chia, and 柯典嘉. "Dynamic Memory Management on uC/OS-II." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/11987121996736990793.
National Taiwan University (臺灣大學)
Graduate Institute of Computer Science and Information Engineering. ROC academic year 98.
There is an increasing demand for more memory to satisfy the complex execution of applications, even in embedded and real-time systems. More memory is usually provided by paging or virtual memory. However, page faults in such systems significantly impact memory-access performance and result in unpredictable response times, so reducing the page fault rate is critical to improving system performance. In this thesis, a dynamic memory management scheme based on paging is proposed for μC/OS-II. Memory pages are allocated to tasks according to task priorities, so pages of high-priority tasks are more likely to be kept in memory, while those of low-priority tasks are replaced more readily. In this way, pages are prevented from being re-allocated frequently, resulting in a lower page fault rate. In my experiments, the page fault rate was reduced by 66%. More experiments under different scenarios were also designed to test how performance is influenced. The results show that the prioritized scheme improves the overall page fault rate, especially with a wide memory-access range, a larger number of tasks, and a low aging strategy. I believe dynamic memory management with task priority can effectively improve the overall performance of many embedded and real-time systems.
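The priority-aware replacement idea above can be sketched in a few lines. This is an illustrative toy, not the thesis's scheme; it only assumes the μC/OS-II convention that a numerically smaller priority value means a higher-priority task, so a larger value marks a better eviction candidate.

```python
def pick_victim(pages, now):
    """Choose a page to evict: prefer pages owned by low-priority
    tasks (larger task_prio value in uC/OS-II), and among a task's
    pages the least recently used one."""
    return max(pages, key=lambda p: (p["task_prio"], now - p["last_use"]))

pages = [
    {"id": 1, "task_prio": 5,  "last_use": 9},   # high-priority task
    {"id": 2, "task_prio": 20, "last_use": 9},   # low priority, recent
    {"id": 3, "task_prio": 20, "last_use": 2},   # low priority, stale
]
print(pick_victim(pages, now=10)["id"])  # 3
```

High-priority tasks' pages are therefore only evicted once no lower-priority pages remain, which is the mechanism the abstract credits for the lower page fault rate.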
Hsu, Chao-Hung, and 徐肇鴻. "Dynamic Memory Optimization and Parallelism Management for OpenCL." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/78409180200517991120.
National Chiao Tung University (國立交通大學)
Institute of Computer Science and Engineering. ROC academic year 102.
Recently, multiprocessor platforms have become the trend for achieving high performance. They may be categorized into homogeneous and heterogeneous multiprocessor platforms. For applications with large concurrency, such as digital signal processing and linear-algebra matrix operations, executing on heterogeneous multiprocessors usually achieves higher performance than on homogeneous multiprocessors. However, it is difficult and tedious to program applications for heterogeneous multiprocessors. OpenCL (Open Computing Language), released by the Khronos Group, is a programming standard for heterogeneous multiprocessors and provides portability across heterogeneous multiprocessor platforms. OpenCL supports three types of devices: CPUs (central processing units), GPUs (graphics processing units), and accelerators. Our research focuses on platforms with CPUs and GPUs, because GPUs are now in widespread use. On such a platform, two programming issues may significantly affect GPU computing performance. One is workload distribution, including parallelizing the application into work-items and distributing work-items into work-groups. The other is the use of the GPU memory hierarchy. To fully exploit the characteristics of GPUs, programmers must be not only proficient at parallel programming but also familiar with the hardware specification. Therefore, in this thesis, we propose a compilation pass that automatically optimizes OpenCL kernels. The input is a naïve kernel that is functionally correct but not optimized for performance. Our compilation pass transforms the input kernel with optimizations including kernel function analysis, work-group rearrangement, memory coalescing, and work-item merging.
In addition, our framework is implemented on a runtime system, so it can dynamically adjust the optimization parameters according to the hardware specification. Although optimization performed at runtime incurs execution-time overhead, the overhead is covered by the massive kernel computation or input data in most cases. The experimental results on our benchmarks demonstrate that applications gain a 1.3x speedup on average. In summary, we design and implement an optimization pass for OpenCL that takes the hardware specification of the target platform into account, in a runtime compiler framework based on LLVM.
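The memory-coalescing optimization mentioned in this abstract rests on a simple property: a warp's loads combine into one memory transaction only when consecutive work-items touch consecutive words. The check below is a generic illustration of that property, not code from the thesis.

```python
def is_coalesced(addresses, word=4):
    """True if consecutive work-items access consecutive `word`-byte
    addresses, i.e. the accesses can merge into one transaction."""
    return all(b - a == word for a, b in zip(addresses, addresses[1:]))

tids = range(8)
# Row-major indexing by thread id coalesces; strided indexing does not.
print(is_coalesced([4 * t for t in tids]))        # A[tid]       -> True
print(is_coalesced([4 * 128 * t for t in tids]))  # A[tid*128]   -> False
```

A coalescing pass like the one described would rewrite the second access pattern into the first, for example by transposing the data layout or remapping work-item indices.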
Hong, Wan-Zen, and 洪婉荏. "Development of Automatic Deployment and Dynamic Resource Adjustment for Virtual Machine Management." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/13435496510826130135.
National Chung Hsing University (國立中興大學)
Department of Computer Science and Engineering. ROC academic year 99.
With the extensive use of virtualization technology in cloud computing environments, in particular IaaS (Infrastructure as a Service), effectively managing a cluster of virtual machines has become a pressing problem. Beyond adopting a public-cloud management platform, virtual machine management under the limited resources of a private cloud is a challenge. This thesis focuses on IaaS management issues in the private cloud environment and provides integrated resource adjustment. Management is implemented with the libvirt API, and each physical machine is assumed to be equipped with KVM and QEMU. A menu-style model is developed for virtual machine configuration, so that users save time and make fewer operational mistakes. SNMP is employed to regularly monitor CPU and memory usage on both physical and virtual machines. Based on resource consumption, we automatically adjust the memory allocation of each virtual machine: when its memory usage exceeds 80%, extra memory is added if available. If the physical machine cannot supply the additional memory demanded by its virtual machines, migration to another physical machine takes place automatically. Experimental results show that with the proposed mechanism, the load on virtual and physical machines can be balanced and resources can be utilized efficiently.
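The decision rule in this abstract (grow the VM past 80% usage, migrate when the host has nothing left to give) can be sketched as below. The 80% threshold comes from the abstract; the 256 MB growth step and the function shape are assumptions made for illustration, and a real implementation would act through libvirt calls rather than return labels.

```python
def manage_vm(vm, host_free, threshold=0.8, step=256):
    """Decide the action for one VM from monitored usage.
    vm = {"mem": allocated MB, "used": MB in use};
    host_free = spare MB on the physical machine."""
    if vm["used"] / vm["mem"] <= threshold:
        return ("ok", 0)
    if host_free >= step:
        return ("grow", step)      # balloon up on this host
    return ("migrate", 0)          # no spare RAM: move the VM away

print(manage_vm({"mem": 1024, "used": 900}, host_free=4096))  # ('grow', 256)
print(manage_vm({"mem": 1024, "used": 900}, host_free=0))     # ('migrate', 0)
```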
Ramashekar, Thejas. "Automatic Data Allocation, Buffer Management And Data Movement For Multi-GPU Machines." Thesis, 2013. http://etd.iisc.ernet.in/handle/2005/2627.
Huetter, RJ. "RHmalloc : a very large, highly concurrent dynamic memory manager." Thesis, 2005. http://hdl.handle.net/10453/37375.
Dynamic memory management (DMM) is a fundamental aspect of computing, directly affecting the capability, performance and reliability of virtually every system in existence today. Yet oddly, fifty years of research into DMM have not taken memory capacity into account, falling significantly behind hardware trends. Comparatively little research on scalable DMM has been conducted – on the order of ten papers exist on this topic – all of which focus on CPU scalability only; the largest heap reported in the literature to date is 600MB. By contrast, symmetric multiprocessor (SMP) machines with terabytes of memory are now commercially available. The contribution of our research is the formal exploration, design, construction and proof of a general-purpose, high-performance dynamic memory manager that scales indefinitely with respect to both CPU and memory – one that can predictably manage a heap of arbitrary size, on any SMP machine with an arbitrary number of CPUs, without a priori knowledge. We begin by recognizing the scattered inconsistency of the literature surrounding this topic. To ensure clarity, we present a simplified introduction, followed by a catalog of the fundamental techniques. We discuss the melting pot of engineering trade-offs, so as to establish a sound basis from which to tackle the issue at hand – large-scale DMM. We review both the history and the state of the art, from which significant insight into this topic is to be found. We then explore the problem space and suggest a workable solution. Our proposal, known as RHmalloc, is based on the novel perspective that a highly scalable heap can be viewed as an unbounded set of finite-sized sub-heaps, where each sub-heap may be concurrently shared by any number of threads, such that a suitable sub-heap can be found in O(1) time and an allocation from a suitable sub-heap is also O(1).
Testing the design properties of RHmalloc, we show by extrapolation that RHmalloc will scale to at least 1,024 CPUs and 1PB, and we theoretically prove that DMM scales indefinitely with respect to both CPU and memory. Most importantly, the approach taken in the scalability proof alludes to a general analysis and design technique for systems of this nature.
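The "unbounded set of finite sub-heaps" perspective can be caricatured in a few lines. This sketch is not the RHmalloc design; it only illustrates the shape of the idea, with a hash-style O(1) sub-heap pick and growth of the sub-heap set on demand, under the assumption that no request exceeds the sub-heap size.

```python
import threading

class SubHeap:
    """A finite arena with its own lock; any number may exist."""
    def __init__(self, size):
        self.free = size
        self.lock = threading.Lock()

class SubHeapSet:
    def __init__(self, subheap_size=1 << 20):
        self.subheap_size = subheap_size
        self.subheaps = [SubHeap(subheap_size)]

    def alloc(self, nbytes):
        # O(1) pick: hash the calling thread onto a sub-heap, so
        # threads spread across sub-heaps instead of one hot lock.
        sh = self.subheaps[threading.get_ident() % len(self.subheaps)]
        with sh.lock:
            if sh.free >= nbytes:
                sh.free -= nbytes
                return sh
        # Local sub-heap exhausted: extend the (unbounded) set.
        sh = SubHeap(self.subheap_size)
        self.subheaps.append(sh)
        with sh.lock:
            sh.free -= nbytes
        return sh
```

Because capacity grows by adding sub-heaps rather than resizing one structure, no operation depends on total heap size, which is the intuition behind the memory-scalability claim.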
Kim, Yŏng-jin. "Hybrid approaches to solve dynamic fleet management problems." Thesis, 2003. http://wwwlib.umi.com/cr/utexas/fullcit?p3116412.
Lee, Chang Joo 1975. "DRAM-aware prefetching and cache management." Thesis, 2010. http://hdl.handle.net/2152/ETD-UT-2010-12-2492.
Sartor, Jennifer Bedke. "Exploiting language abstraction to optimize memory efficiency." Thesis, 2010. http://hdl.handle.net/2152/ETD-UT-2010-08-1919.
Jacob, Joseph 1971. "Automatic scheduling and dynamic load sharing of parallel computations on heterogeneous workstation clusters." Thesis, 1995. http://hdl.handle.net/1957/34688.
Graduation date: 1996
Ha, Chi Yuan, and 哈䆊遠. "Energy Saving Designs with Joint Considerations of Non-Volatile Memory Allocation and Dynamic Power Management." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/kyzq6n.
Chang Gung University (長庚大學)
Department of Computer Science and Information Engineering. ROC academic year 105.
To extract more computing power from a hardware system, vendors and researchers have focused on increasing clock rates and packing more components into a chip, but these technologies have also led to high energy consumption and overheating problems. For energy saving, the Dynamic Power Management (DPM) technique can temporarily turn off some components in a chip or change the operating modes of some parts of a system. However, the price of DPM energy saving is the long latency of resuming the system to active mode, which might violate application performance requirements and cause extra energy consumption during the resumption period. This work exploits Non-Volatile Memory (NVM) to reduce system suspend and resume times, making our DPM algorithm more practical and effective for energy-saving designs on Internet of Things (IoT) and wearable devices. The major challenge of this work is the co-design of task scheduling and memory allocation for energy saving and/or better performance, which renews the system synthesis problem with a more flexible computing and memory architecture. By exploiting application behaviors, this work uses NVM to keep critical data and binaries during system idle periods, so as to reduce the system resumption time and save further energy. Real-time analysis is provided to guarantee the performance of applications on the target system. Experiments with the NVMain simulator and the WCET benchmark suite were conducted to evaluate the performance of our algorithm. Experimental results show that our solution saves significantly more energy than the other solutions.
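The trade-off this abstract describes is the classic DPM break-even test: sleeping only pays off when the predicted idle period is long enough to recoup the energy spent suspending and resuming. The sketch below states that test generically; the power and energy figures are invented, and the thesis's contribution, keeping critical state in NVM, effectively shrinks the transition cost and hence the break-even time.

```python
def break_even_time(p_idle, p_sleep, e_transition):
    """Minimum idle duration for which entering sleep saves energy:
    the power saved per unit time must repay the transition energy."""
    return e_transition / (p_idle - p_sleep)

def should_suspend(predicted_idle, p_idle=100.0, p_sleep=5.0,
                   e_transition=1900.0):
    # Units are arbitrary (e.g. mW and mJ); values are illustrative.
    return predicted_idle > break_even_time(p_idle, p_sleep, e_transition)

print(break_even_time(100.0, 5.0, 1900.0))  # 20.0
```

With these numbers the break-even point is 20 time units, so a 25-unit idle window justifies suspending while a 10-unit one does not; halving `e_transition` via fast NVM resume would halve that threshold.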
Chandan, G. "Effective Automatic Computation Placement and Data Allocation for Parallelization of Regular Programs." Thesis, 2014. http://hdl.handle.net/2005/3111.
Jindal, Prachee. "Compiler Assisted Energy Management For Sensor Network Nodes." Thesis, 2008. http://hdl.handle.net/2005/819.
Cérat, Benjamin. "Étude de cas sur l’ajout de vecteurs d’enregistrements typés dans Gambit Scheme." Thèse, 2014. http://hdl.handle.net/1866/11984.
In order to optimize the in-memory representation of Scheme records in the Gambit compiler, we introduce a type-annotation system on record fields. We also introduce flat vectors of records containing an abbreviated representation of those records. These vectors omit the header and the reference to the type descriptor on contained records, and use a type tree spanning the whole memory to recover the type as needed from an internal pointer. The new functionality is implemented through changes to the Gambit runtime. We add new primitives to the language and modify the existing architecture to handle the new data types correctly, in a way that is transparent to the user. To do so, we modify the garbage collector to account for the existence of internal references and of heterogeneous records whose fields may not be word-aligned and need not be boxed. We also have to automatically and systematically update the type tree to reflect the live vectors. To assess our implementation's performance, we run a series of benchmarks. We measure significant gains in allocation time and space with both typed records and contained records. We also measure a minor overhead in access costs on typed fields and a major loss on accesses to the type descriptor of contained records.
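The type-recovery step this abstract relies on, finding an object's type from a pointer into its interior, can be sketched with a sorted map over object start addresses. This is a hypothetical illustration of the lookup, not Gambit's type tree: names, the flat-list representation, and the binary search stand in for whatever balanced structure the runtime actually maintains.

```python
import bisect

class TypeTree:
    """Maps any address inside a registered object to that object's
    type, by binary search over object start addresses."""
    def __init__(self):
        self.starts, self.ends, self.types = [], [], []

    def register(self, start, size, typ):
        # Keep the three parallel lists sorted by start address.
        i = bisect.bisect(self.starts, start)
        self.starts.insert(i, start)
        self.ends.insert(i, start + size)
        self.types.insert(i, typ)

    def type_of(self, addr):
        # Rightmost object starting at or before addr; check it
        # actually covers addr (an internal pointer into the object).
        i = bisect.bisect_right(self.starts, addr) - 1
        if i >= 0 and addr < self.ends[i]:
            return self.types[i]
        return None
```

This is why the abstract reports the type tree must be updated whenever record vectors are allocated or collected: a stale tree would resolve internal pointers to the wrong type.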