Cache memory

Model
Digital Document
Publisher
Florida Atlantic University
Description
The use of cache memories in multiprocessor systems increases overall system performance. Caches reduce the amount of network traffic and provide a solution to the memory contention problem. However, caches introduce memory consistency problems: the existence of multiple cache copies of a memory block will result in an inconsistent view of memory if one processor changes a value in its associated cache. Cache coherence protocols are algorithms, implemented in software or hardware, designed to maintain memory consistency. With the increased complexity of some of the more recent protocols, testing these protocols for correctness requires more elaborate techniques. In this thesis, correctness analysis of a selected group of representative cache coherence protocols was performed using Petri nets as a modeling and analysis tool. First, the Petri net graphs for these protocols were designed. These graphs were built by following the logical and coherence actions performed by the protocols in response to the different processors' requests that threaten memory consistency. Correctness analysis was then performed on these graphs.
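The invalidation behavior such protocols must keep consistent can be sketched as a simple write-invalidate (MSI-style) state machine. The thesis does not name the specific protocols it analyzes, so the states and functions below are illustrative assumptions, not its actual models.

```python
# Minimal sketch of a write-invalidate (MSI-style) coherence protocol,
# a representative example of the class of protocols such analyses cover.
from enum import Enum

class State(Enum):
    MODIFIED = "M"
    SHARED = "S"
    INVALID = "I"

class Cache:
    def __init__(self):
        self.state = State.INVALID

def write(writer, caches):
    """A processor write invalidates all other cached copies."""
    for c in caches:
        if c is not writer:
            c.state = State.INVALID   # coherence action: invalidate
    writer.state = State.MODIFIED

def read(reader, caches):
    """A read forces any modified copy back to the shared state."""
    for c in caches:
        if c.state is State.MODIFIED:
            c.state = State.SHARED    # write back, then share
    reader.state = State.SHARED

caches = [Cache() for _ in range(3)]
write(caches[0], caches)
assert caches[1].state is State.INVALID   # no stale copies remain
read(caches[1], caches)
assert caches[0].state is State.SHARED    # modified copy downgraded
```

A Petri net model of such a protocol would assign places to these states and transitions to the read and write actions, so that reachability analysis can check that no two caches hold conflicting copies.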
Model
Digital Document
Publisher
Florida Atlantic University
Description
In this thesis, a delta service extends a mobile file system cache in order to minimize the amount of data transferred over wireless communication links. Network bandwidth stands as one of the resource limitations shaping the design of mobile computer applications. At the mobile file system service level, caching and compression provide resource conservation in distributed applications. This thesis proposes a delta service to enhance the caching services characteristic of mobile computer file systems. Well established as mechanisms for sequence comparison and software configuration management, file deltas are applicable to mobile computer and distributed file system caching environments. The study of the delta service uses a trace-driven simulation methodology incorporating traces obtained in a real-world distributed environment. A mobile computer client cache model corroborates existing studies regarding suitable cache sizes for disconnected client operation. A delta service model extends the client cache model at various cache sizes in order to gauge the bandwidth savings on the link obtained by the delta service.
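As a rough illustration of why transferring deltas conserves link bandwidth, the sketch below computes a file delta with Python's difflib. The diff algorithm and file contents are assumptions for illustration, not the mechanism used in the thesis.

```python
# Illustrative delta computation: when only one line of a cached file
# changes, the delta is far smaller than the whole file.
import difflib

old = ["line %d\n" % i for i in range(100)]   # cached copy on the client
new = list(old)
new[50] = "line 50 changed\n"                 # server-side update

delta = list(difflib.unified_diff(old, new, lineterm="\n"))

# Transferring the delta instead of the whole file saves bandwidth:
delta_bytes = sum(len(line) for line in delta)
full_bytes = sum(len(line) for line in new)
assert delta_bytes < full_bytes
```

In a mobile setting the client would apply the received delta to its cached copy, paying only the delta's size on the wireless link instead of the full file transfer.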
Model
Digital Document
Publisher
Florida Atlantic University
Description
Caches are used in shared memory multiprocessors to reduce the effective memory latency as well as network and memory bandwidth requirements. However, spreading data across the caches gives rise to the cache coherence problem. In this thesis, a new directory-based cache coherence scheme, called the cache-vector protocol, is proposed and evaluated. The proposed scheme entails a low memory overhead yet delivers performance very close to that of the scheme proposed by Censier and Feautrier (3), which offers the best performance of all the directory-based schemes. The performance of the cache-vector protocol is evaluated using trace-driven simulation. A figure of merit that takes into account the average memory latency, network traffic, and hardware overhead is introduced and used as the basis of comparison between the two schemes. The simulation results indicate that the cache-vector protocol is a viable solution to the cache coherence problem in large-scale multiprocessors.
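For context, the Censier and Feautrier baseline keeps a full presence-bit map per memory block, one bit per cache, which is the memory overhead the cache-vector protocol reduces. A minimal sketch of such a full-map directory entry, with names chosen for illustration only:

```python
# Sketch of a full-map directory entry in the style of Censier and
# Feautrier: one presence bit per cache plus a dirty bit. The
# cache-vector protocol itself lowers this overhead; its encoding is
# in the thesis, so only the baseline idea is shown here.
class DirectoryEntry:
    def __init__(self, num_caches):
        self.presence = [False] * num_caches  # which caches hold the block
        self.dirty = False

    def record_read(self, cache_id):
        self.presence[cache_id] = True

    def record_write(self, cache_id):
        # Invalidate all other copies before granting exclusive access.
        invalidated = [i for i, held in enumerate(self.presence)
                       if held and i != cache_id]
        self.presence = [i == cache_id for i in range(len(self.presence))]
        self.dirty = True
        return invalidated  # each entry costs one invalidation message

entry = DirectoryEntry(4)
entry.record_read(0)
entry.record_read(2)
assert entry.record_write(1) == [0, 2]   # two copies invalidated
assert entry.presence == [False, True, False, False]
```

The figure of merit in the abstract weighs exactly the quantities visible here: the invalidation messages (network traffic), the delay they impose (memory latency), and the size of the presence map (hardware overhead).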
Model
Digital Document
Publisher
Florida Atlantic University
Description
This thesis introduces two allocation schemes for cache memory in multiprogramming environments. The proposed schemes, called static and dynamic cache partitioning, are slight variations of the schemes proposed by Thiebaut and Stone. We developed a trace-driven simulation program to study the proposed schemes and compare their performance to that of the cache sharing and cache flushing schemes. Furthermore, we propose a new replacement technique that uses a heuristic to detect loop structures in the reference patterns. Initially, the technique uses the Least Recently Used (LRU) strategy. Once a loop has been detected, all instructions that would harm performance if stored in the cache are dynamically excluded from being cached. The LRU strategy resumes as soon as the end of the loop has been detected. We have also developed a simulation program to compare the performance of this scheme to that of other related ones and demonstrate its effectiveness. The results show that our scheme outperforms the others, especially when the system's references are loop dominated.
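The replacement behavior described above can be sketched as an LRU cache with a bypass set for addresses excluded during a detected loop. The loop-detection heuristic itself is the thesis's contribution and is not reproduced here; the `bypass` set simply stands in for its output.

```python
# LRU replacement with a bypass hook: addresses placed in `bypass`
# (e.g. the body of a loop too large to fit) are never cached, so they
# cannot evict useful lines. The detection heuristic is assumed external.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()
        self.bypass = set()          # addresses excluded from caching

    def access(self, addr):
        """Return True on a hit, False on a miss."""
        if addr in self.bypass:      # excluded reference: never cached
            return addr in self.store
        if addr in self.store:
            self.store.move_to_end(addr)       # mark most recently used
            return True
        if len(self.store) >= self.capacity:
            self.store.popitem(last=False)     # evict least recently used
        self.store[addr] = True
        return False

cache = LRUCache(2)
assert cache.access("a") is False   # cold miss, "a" cached
assert cache.access("a") is True    # hit
cache.access("b")
cache.access("c")                   # capacity reached: "a" evicted
assert cache.access("a") is False   # miss again
```

A loop one block larger than the cache makes pure LRU miss on every reference; excluding part of the loop body via `bypass` lets the rest of it stay resident, which is the effect the proposed technique exploits.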
Model
Digital Document
Publisher
Florida Atlantic University
Description
With the growth of the Internet and increasing network traffic, latency has become a major issue. On the entrepreneur side, bandwidth is a bottleneck. Web caching helps to resolve both of these issues. In recent years, many proxy caching algorithms, and benchmarks to test them, have been implemented. In this thesis, some of the existing proxy caching algorithms and related benchmarks are examined. It has been observed that most of the benchmarks do not provide the developer or entrepreneur a customized environment in which to debug or deploy a proxy caching algorithm. Hence, this thesis implements a platform-independent, easily extensible Test Bed that can be tailored to the needs of both developers and entrepreneurs. The thesis also implements two standard caching algorithms. To illustrate the application of the Test Bed, these algorithms are run on the Test Bed and the results obtained are analyzed. Some of the results are then compared to existing behavioral patterns.
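A pluggable test-bed loop of the kind described might look like the following sketch, where the `access` interface, the `run_trace` driver, and the FIFO policy are illustrative assumptions rather than the Test Bed's actual design.

```python
# Sketch of an extensible test-bed driver: any caching policy exposing
# access(url) -> bool (True = hit) can be benchmarked on a request trace.
def run_trace(policy, trace):
    hits = sum(policy.access(url) for url in trace)
    return hits / len(trace)          # hit ratio over the trace

class FIFO:
    """One standard policy: first-in, first-out eviction."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = []

    def access(self, url):
        if url in self.queue:
            return True               # hit: FIFO does not reorder
        if len(self.queue) >= self.capacity:
            self.queue.pop(0)         # evict the oldest entry
        self.queue.append(url)
        return False

trace = ["a", "b", "a", "c", "a", "b"]
ratio = run_trace(FIFO(2), trace)
assert abs(ratio - 1 / 6) < 1e-9      # one hit in six requests
```

Keeping the driver separate from the policy is what makes such a test bed extensible: a new algorithm is benchmarked by supplying another class with the same `access` interface, without touching the trace-replay code.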
Model
Digital Document
Publisher
Florida Atlantic University
Description
Multiple threads can run concurrently on multiple cores in a multicore system and improve the performance/power ratio. However, effective core allocation in multicore and manycore systems is very challenging. In this thesis, we propose an effective and scalable core allocation strategy for multicore systems that achieves optimal core utilization by reducing both internal and external fragmentation. Our proposed strategy helps spread the servicing cores evenly across the chip to facilitate better heat dissipation. We introduce a multi-stage power management scheme that reduces total power consumption by managing the power states of the cores. We simulate three multicore systems, with 16, 32, and 64 cores, respectively, using synthetic workloads. Experimental results show that our proposed strategy outperforms the Square-shaped, Rectangle-shaped, L-shaped, and Hybrid (contiguous and non-contiguous) schemes in terms of fragmentation and completion time. Among these strategies, ours also provides a better heat dissipation mechanism.
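One plausible reading of "evenly spreading the servicing cores" is a farthest-free-core placement on the chip grid. The sketch below is a simplification under that assumption, not the thesis's actual algorithm, which also manages fragmentation and power states.

```python
# Heat-aware placement sketch on a 4x4 core grid: each allocation picks
# the free core farthest (Manhattan distance) from every busy core, so
# active cores tend to spread out rather than cluster, aiding heat
# dissipation. A simplification of the strategy described above.
def farthest_free_core(side, busy):
    cores = [(x, y) for x in range(side) for y in range(side)]
    free = [c for c in cores if c not in busy]
    if not busy:
        return free[0]                # first allocation: any corner works
    # Maximize the distance to the nearest busy core.
    return max(free, key=lambda c: min(abs(c[0] - b[0]) + abs(c[1] - b[1])
                                       for b in busy))

busy = set()
for _ in range(3):
    busy.add(farthest_free_core(4, busy))

# The first allocations land in opposite corners rather than adjacent cells.
assert (0, 0) in busy and (3, 3) in busy
```

A contiguous scheme (Square-shaped, Rectangle-shaped, L-shaped) would instead pack the three cores into one corner region, concentrating the generated heat in one area of the chip.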