
Bank-aware Dynamic Cache Partitioning for Multicore Architectures
A Seminar Report
by
Anoop S S
105102
Department of Computer Science & Engineering
College of Engineering Trivandrum
Kerala - 695016
2010-11


Abstract
As Chip-Multiprocessor (CMP) systems have become the predominant topology for leading microprocessors, critical components of the system are now integrated on a single chip. This enables sharing of computation resources that was not previously possible. In addition, the virtualization of these computational resources exposes the system to a mix of diverse and competing workloads. Cache is a resource of primary concern, as it can be dominant in controlling overall throughput. In order to prevent destructive interference between divergent workloads, the last level of cache must be partitioned. Many solutions have been proposed in the past, but most of them assume either simplified cache hierarchies with no realistic restrictions or complex cache schemes that are difficult to integrate into a real design. To address this problem, a dynamic partitioning strategy based on realistic last-level cache designs of CMP processors was proposed. A cycle-accurate, full-system simulator based on Simics and GEMS is used to evaluate the partitioning scheme on an 8-core DNUCA CMP system. Results for an 8-core system show that the proposed scheme provides on average a 70 percent reduction in misses compared to non-partitioned shared caches, and a 25 percent reduction in misses compared to statically and equally partitioned (private) caches.

3 Bank-Aware Cache Partitioning
This section provides details about the application profiling mechanism, followed by the partitioning algorithm for assigning cache capacity to each core. Finally, the cache partition allocation algorithm for placing the cache partitions on the CMP-baseline system is described.

3.1 Cache Profiling of Applications
In order to dynamically profile the cache requirements of each core, a cache miss prediction model based on Mattson's stack distance algorithm is implemented. Mattson's stack algorithm (MSA) was initially proposed by Mattson et al. for reducing the simulation time of trace-driven caches by determining the miss ratios of all possible cache sizes with a single pass through the trace. The basic idea of the algorithm was later used for efficient trace-driven simulations of a set-associative cache. More recently, hardware-based MSA algorithms have been proposed for CMP system resource management.

MSA is based on the inclusion property of the commonly used Least Recently Used (LRU) cache replacement policy. Specifically, during any sequence of memory accesses, the content of an N-sized cache is a subset of the content of any cache larger than N. To create a profile for a K-way set-associative cache, K+1 counters are needed, named Counter1 to CounterK+1. Every time there is an access to the monitored cache, only the counter that corresponds to the LRU stack distance at which the access took place is incremented. Counters Counter1 up to CounterK correspond to the Most Recently Used (MRU) through the LRU position in the stack, respectively. If an access touches an address in a cache block that was in the i-th position of the LRU stack, Counteri is incremented. Finally, if the access ends up being a miss, CounterK+1 is incremented.
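The MSA counter update described above can be sketched in software. The following is a minimal single-set sketch, not the paper's hardware implementation: the trace, the block addresses, and the helper names are illustrative. By the LRU inclusion property, the hit counters from one pass give the miss count for every possible number of ways.

```python
def msa_profile(trace, K):
    """One pass over a block-address trace for a single set of a
    K-way LRU cache. Returns K+1 counters: counters[i] (0 <= i < K)
    counts hits at LRU stack position i (0 = MRU); counters[K]
    counts misses (stack distance > K, or cold misses)."""
    stack = []                       # LRU stack: index 0 = MRU
    counters = [0] * (K + 1)
    for block in trace:
        if block in stack:
            depth = stack.index(block)   # stack distance of this access
            counters[depth] += 1
            stack.pop(depth)             # reuse: move block back to MRU
        else:
            counters[K] += 1             # miss in the monitored K-way cache
            if len(stack) == K:
                stack.pop()              # evict the LRU block
        stack.insert(0, block)           # accessed block becomes MRU
    return counters

def misses_for_ways(counters, w):
    """By LRU inclusion, a cache with only w ways misses every access
    whose stack distance exceeds w: sum counters[w] .. counters[K]."""
    return sum(counters[w:])

trace = [1, 2, 3, 1, 4, 1, 2, 5, 1, 2]
counters = msa_profile(trace, K=4)
for w in range(1, 5):
    # e.g. with 1 way, all 10 accesses miss; with all 4 ways,
    # only the 5 misses recorded in counters[K] remain
    print(w, misses_for_ways(counters, w))
```

The single pass is what makes the scheme cheap enough for online profiling: one set of K+1 counters yields the miss ratio for every candidate partition size at once.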

1 Introduction
Chip Multiprocessors (CMP) have gradually become an attractive architecture for leveraging system integration by providing capabilities on a single die that would previously have occupied many chips across multiple small systems. This integration has brought abundant on-chip resources that can now be shared at a finer granularity among the multiple cores. Such sharing, though, has introduced chip-level contention, and the need for effective resource management policies is more important than ever.
To efficiently exploit these resources, systems require multiple program contexts, and virtualization has become a key player in this arena. Many small and/or low-utilization servers can now be easily consolidated on a single physical machine, allowing higher utilization of the available resources with significant energy reductions. Such consolidation presents both opportunities and pitfalls to computer architects seeking to best manage these once-isolated resources on large CMP designs.
In such virtualization environments, workloads tend to place dissimilar demands on shared resources and therefore, due to resource contention, are much more likely to destructively interfere in an unfair way. Consequently, contention for shared resources becomes the key performance bottleneck in CMPs. Shared resources include, but are not limited to: main memory bandwidth, main memory capacity, cache capacity, cache bandwidth, memory subsystem interconnection bandwidth, and system power.
Among these resources, several studies have identified the shared last-level cache (here, the L2) of CMPs as a major source of performance loss and execution inconsistency. As a solution, most of the proposed techniques control this contention by partitioning the L2 cache capacity and allocating specific portions of it to each core or execution thread. Both static and dynamic partitioning schemes are available that use workload profiling information to decide on a cache capacity assignment for each core/thread. All of the above techniques are usually based on high-level monitoring of system characteristics, since low-level, activity-based algorithms such as LRU replacement fail to provide a strong barrier among workloads competing for shared resources.
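To illustrate how profiling information can drive a capacity assignment, the sketch below greedily grants ways to whichever core's MSA-style counters predict the largest miss reduction. The greedy policy, the counter values, and all names here are illustrative assumptions, not the paper's actual partitioning algorithm.

```python
def greedy_partition(miss_counters, total_ways):
    """miss_counters: one list of K+1 MSA counters per core, where
    counters[i] counts hits at stack position i and counters[K] counts
    misses. Repeatedly grant one way to the core whose predicted miss
    count drops the most (illustrative greedy policy)."""
    n = len(miss_counters)
    alloc = [0] * n

    def misses(core, w):
        # misses with w ways = accesses at stack distance > w
        return sum(miss_counters[core][w:])

    for _ in range(total_ways):
        # marginal gain of giving core c one more way; a core already
        # holding K ways gets a sentinel gain of -1 and is skipped
        best = max(
            range(n),
            key=lambda c: (misses(c, alloc[c]) - misses(c, alloc[c] + 1))
            if alloc[c] < len(miss_counters[c]) - 1 else -1,
        )
        alloc[best] += 1
    return alloc

# Two hypothetical cores sharing 4 ways: core 0 reuses data heavily
# at shallow stack depths, core 1 barely benefits from extra capacity.
print(greedy_partition([[5, 3, 1, 0, 10], [1, 1, 1, 1, 1]], 4))
```

Because MSA gives the miss count for every candidate size in one pass, the marginal gain of each additional way is just the counter at the next stack position, which is what makes this style of greedy assignment cheap to re-evaluate at runtime.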
