A Reading List of CMP Research Papers

 

 

Novel Cache Architectures

Adaptive Set Pinning: Managing Shared Caches in Chip Multiprocessors [ASPLOS’08]

Zero-Content Augmented Caches [ICS’09]

Zero-Value Caches: Cancelling Loads that Return Zero [PACT’09]

Multi-Execution: Multicore Caching for Data-Similar Executions [ISCA’09]

A Novel Architecture of the 3D Stacked MRAM L2 Cache for CMPs [HPCA’09]

Optimizing Communication and Capacity in a 3D Stacked Reconfigurable Cache Hierarchy [HPCA’09]

Adaptive Line Placement with the Set Balancing Cache [MICRO’09]

 

 

 

Novel Networks-on-Chip

Technology-Driven, Highly-Scalable Dragonfly Topology [ISCA’08]

Express Cube Topologies for On-Chip Interconnects [HPCA’09]

CMP Network-on-Chip Overlaid With Multi-Band RF-Interconnect [HPCA’08]

Power Reduction of CMP Communication Networks via RF-Interconnects [MICRO’08]

A Low-Radix and Low-Diameter 3D Interconnection Network Design [HPCA’09]

Firefly: Illuminating Future Network-on-Chip with Nanophotonics [ISCA’09]

Phastlane: A Rapid Transit Optical Routing Network [ISCA’09]

Application-Aware Prioritization Mechanisms for On-Chip Networks [MICRO’09]

 

New Memory Technology

3D-Stacked Memory Architectures for Multi-Core Processors [ISCA’08]

Architecting Phase Change Memory as a Scalable DRAM Alternative [ISCA’09]

A Durable and Energy Efficient Main Memory Using Phase Change Memory Technology [ISCA’09]

Scalable High Performance Main Memory System Using Phase-Change Memory Technology [ISCA’09]

Exploring Phase Change Memory and 3D Die-Stacking for Power/Thermal Friendly, Fast and Durable Memory Architectures [PACT’09]

Characterizing and Mitigating the Impact of Process Variations on Phase Change based Memory Systems [MICRO’09]

Enhancing Lifetime and Security of PCM-Based Main Memory with Start-Gap Wear Leveling [MICRO’09]

Extending the Effectiveness of 3D-Stacked DRAM Caches with an Adaptive Multi-Queue Policy [MICRO’09]

 

 

Reliability

Mixed-Mode Multicore Reliability [ASPLOS’09]

Memory Mapped ECC: Low-Cost Error Protection for Last Level Caches [ISCA’09]

Flexible Cache Error Protection using an ECC FIFO [SC’09]

Configurable Isolation: Building High Availability Systems with Commodity Multicore Processors [ISCA’07]

COVERT: Configurable Virtual Redundancy for Transparent High Availability on Commodity Software [ASPLOS’08]

Implementing High Availability Memory with a Duplication Cache [MICRO’08]

Improving Cache Lifetime Reliability at Ultra-low Voltages [MICRO’09]

 

 

HW-Based CMP Cache Management

Cooperative Caching for Chip Multiprocessors [ISCA’06]

ASR: Adaptive Selective Replication for CMP Caches [MICRO’06]

Cooperative Cache Partitioning for Chip Multiprocessors [ICS’07]

Distributed Cooperative Caching [PACT’08]

Adaptive Spill-Receive for Robust High-Performance Caching in CMPs [HPCA’09]

Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches [MICRO’06]

Adaptive Insertion Policies for Managing Shared Caches [PACT’08]

PIPP: Promotion/Insertion Pseudo-Partitioning of Multi-Core Shared Caches [ISCA’09]

Pseudo-LIFO: The Foundation of a New Family of Replacement Policies for Last-level Caches [MICRO’09]

 

 

SW-Based CMP Cache Management

Managing Distributed, Shared L2 Caches through OS-Level Page Allocation [MICRO’06]

An OS-Based Alternative to Full Hardware Coherence on Tiled CMPs [HPCA’08]

PageNUCA: Selected Policies for Page-grain Locality Management in Large Shared Chip-multiprocessor Caches [HPCA’09]

Dynamic Hardware-Assisted Software-Controlled Page Placement to Manage Capacity Allocation and Sharing within Large Caches [HPCA’09]

Reactive NUCA: Near-Optimal Block Placement and Replication in Distributed Caches [ISCA’09]

Taming Single-Thread Program Performance on Many Distributed On-Chip L2 Caches [ICPP’08]

SOS: A Software-Oriented Distributed Shared Cache Management Approach for Chip Multiprocessors [PACT’09]

Improving Hardware Cache Performance Through Software-Controlled Object-Level Cache Partitioning [PACT’09]

SHARP Control: Controlled Shared Cache Management in Chip Multiprocessors [MICRO’09]

 

 

Combinational Optimization

A Novel Migration-Based NUCA Design for Chip Multiprocessors [ICS’08]

Dynamic Cache Clustering for Chip Multiprocessors [ICS’09]

A Case for Integrated Processor-Cache Partitioning in Chip Multiprocessors [SC’09]

Achieving Predictable Performance through Better Memory Controller Placement in Many-Core CMPs [ISCA’09]

Feedback Driven Threading: Power-Efficient and High-Performance Execution of Multithreaded Workloads on CMPs [ASPLOS’08]

 

 

Core Microarchitecture

iCFP: Tolerating All Level Cache Misses in In-Order Processors [HPCA’09]

Decoupled Store Completion/Silent Deterministic Replay: Enabling Scalable Data Memory for CPR/CFP Processors [ISCA’09]

CPROB: Checkpoint Processing with Opportunistic Minimal Recovery [PACT’09]

 

 

Heterogeneous CMPs

Accelerating Critical Section Execution with Asymmetric Multi-Core Architectures [ASPLOS’09]

Flextream: Adaptive Compilation of Streaming Applications for Heterogeneous Architectures [PACT’09]

Qilin: Exploiting Parallelism on Heterogeneous Multiprocessors with Adaptive Mapping [MICRO’09]

 

 

Power & Thermal

Voltage Emergency Prediction: Using Signatures to Reduce Operating Margins [HPCA’09]

Thread Motion: Fine-Grained Power Management for Multi-Core Systems [ISCA’09]

 

 

Helper Threads

A Helper Thread Based EDP Reduction Scheme for Adapting Application Execution in CMPs [IPDPS’08]

Dynamic Performance Tuning for Speculative Threads [ISCA’09]

 

 

Process Variation

Process Variation Tolerant 3T1D-Based Cache Architectures [MICRO’07]

Variation-Aware Application Scheduling and Power Management for Chip Multiprocessors [ISCA’08]

Variation-Tolerant Non-Uniform 3D Cache Management in Die Stacked Multicore Processor [MICRO’09]

 

 

Compiler Assistance

Synchronization Optimizations for Efficient Execution on Multi-Cores [ICS’09]

Data Layout Transformation for Enhancing Locality on NUCA Chip Multiprocessors [PACT’09]