SAFARI Research Group at ETH Zurich and Carnegie Mellon University's repositories
GPGPUSim-Ramulator
The source code for GPGPUSim+Ramulator simulator. In this version, GPGPUSim uses Ramulator to simulate the DRAM. This simulator is used to produce some of the results in our SIGMETRICS 2019 paper: Ghose et al., "Demystifying Complex Workload-DRAM Interactions: An Experimental Study" at https://arxiv.org/pdf/1902.07609.pdf.
IMPICA
This is a processing-in-memory simulator which models 3D-stacked memory within gem5. Also includes the workloads used for IMPICA (In-Memory PoInter Chasing Accelerator), an ICCD 2016 paper by Hsieh et al. at https://users.ece.cmu.edu/~omutlu/pub/in-memory-pointer-chasing-accelerator_iccd16.pdf
Shifted-Hamming-Distance
Source code for the Shifted Hamming Distance (SHD) filtering mechanism for sequence alignment. Described in the Bioinformatics journal paper (2015) by Xin et al. at http://users.ece.cmu.edu/~omutlu/pub/shifted-hamming-distance_bioinformatics15_proofs.pdf
Apollo
Apollo is an assembly polishing algorithm that attempts to correct the errors in an assembly. It can take multiple set of reads in a single run and polish the assemblies of genomes of any size. Described in the Bioinformatics journal paper (2020) by Firtina et al. at https://people.inf.ethz.ch/omutlu/pub/apollo-technology-independent-genome-assembly-polishing_bioinformatics20.pdf
Cache-Memory-Hog
Cache and main memory hog programs. These are programs with specific access patterns to evict the already existing cache blocks of various applications. These programs were designed to demonstrate that application performance is nearly linearly correlated with cache access rate (as shown in Section 3.1 of Subramanian et al. "The Application Slowdown Model" @ https://users.ece.cmu.edu/~omutlu/pub/application-slowdown-model_micro15.pdf)
BEER
BEER determines an ECC code's parity-check matrix based on the uncorrectable errors it can cause. BEER targets Hamming codes that are used for DRAM on-die ECC but can be extended to apply to other linear block codes (e.g., BCH, Reed-Solomon). BEER is described in the 2020 MICRO paper by Patel et al.: https://arxiv.org/abs/2009.07985.
Shouji
Shouji is fast and accurate pre-alignment filter for banded sequence alignment calculation. Described in the Bioinformatics journal paper (2019) by Alser et al. at https://academic.oup.com/bioinformatics/advance-article-pdf/doi/10.1093/bioinformatics/btz234/28533771/btz234.pdf
CROW
Source code for the architectural and circuit-level simulators used for modeling the CROW (Copy-ROW DRAM) mechanism proposed in our ISCA 2019 paper "CROW: A Low-Cost Substrate for Improving DRAM Performance, Energy Efficiency, and Reliability". Paper is at: https://people.inf.ethz.ch/omutlu/pub/CROW-DRAM-substrate-for-performance-energy-reliability_isca19.pdf.
SMASH
SMASH is a hardware-software cooperative mechanism that enables highly-efficient indexing and storage of sparse matrices. The key idea of SMASH is to compress sparse matrices with a hierarchical bitmap compression format that can be accelerated from hardware. Described by Kanellopoulos et al. (MICRO '19) https://people.inf.ethz.ch/omutlu/pub/SMASH-sparse-matrix-software-hardware-acceleration_micro19.pdf
Pythia-HDL
Implementation of Pythia: A Customizable Hardware Prefetching Framework Using Online Reinforcement Learning in Chisel HDL. To know more, please read the paper that appeared in MICRO 2021 by Bera et al. (https://arxiv.org/pdf/2109.12021.pdf).
CLRDRAM
Circuit-level model for the Capacity-Latency Reconfigurable DRAM (CLR-DRAM) architecture. This repository contains the SPICE models of the CLR-DRAM architecture and the baseline architecture used in our ISCA 2020 paper "Luo et al., CLR-DRAM: A Low-Cost DRAM Architecture Enabling Dynamic Capacity-Latency Trade-Off": https://people.inf.ethz.ch/omutlu/pub/CLR-DRAM_capacity-latency-reconfigurable-DRAM_isca20.pdf
COVIDHunter
COVIDHunter š¦ :construction:: An accurate and flexible COVID-19 outbreak simulation model that forecasts the strength of future mitigation measures and the numbers of cases, hospitalizations, and deaths for a given day, while considering the potential effect of environmental conditions. Described by Alser et al. (preliminary version at https://arxiv.org/abs/2102.03667 and https://doi.org/10.1101/2021.02.06.21251265).
RamulatorSharp
RamulatorSharp is a fast and flexible memory subsystem simulator implemented in C# and it can easily run on Linux, OS X, and Windows. The simulator contains the implementation of the Low-Cost Inter-Linked Subarrays (HPCA 2016) and ChargeCache (HPCA 2016) in addition to other features present in the C++ version of Ramulator: https://users.ece.cmu.edu/~omutlu/pub/lisa-dram_hpca16.pdf https://users.ece.cmu.edu/~omutlu/pub/chargecache_low-latency-dram_hpca16.pdf
DRAM-Voltage-Study
Experimental study and analysis on the effect of using a wide range of different supply voltage values on the reliability, latency, and retention characteristics of DDR3L DRAM SO-DIMMs
HARP
HARP is a memory error profiling algorithm (i.e., for identifying error-prone cells) designed for use with memory chips that use on-die error-correcting codes (ECC). This tool uses Monte-Carlo simulation to evaluate HARP and other error profilers. HARP and this tool are described in the 2021 MICRO paper by Patel et al.: https://arxiv.org/abs/2109.12697.
DIVA-DRAM
This repository provides characterization data collected over 96 DDR3 SO-DIMMs, related to the following paper: Lee et al., "Design-Induced Latency Variation in Modern DRAM Chips: Characterization, Analysis, and Latency Reduction Mechanisms", SIGMETRICS 2017. https://people.inf.ethz.ch/omutlu/pub/DIVA-low-latency-DRAM_sigmetrics17-paper.pdf
optimal-seed-solver
Optimal Seed Solver (OSS) is a dynamic-programming algorithm that finds the optimal seeds of a read, which renders the minimum total seed frequency. It is described by Xin et al. at http://arxiv.org/pdf/1506.08235v1.pdf.
MIG-7-PHY-DDR3-Controller
A DDR3 Controller that uses the Xilinx MIG-7 PHY to interface with DDR3 devices.
SNP-Selective-Hiding
An optimization-based mechanism :dna: :closed_lock_with_key: to selectively hide the minimum number of overlapping SNPs among the family members :family_man_woman_girl_boy: who participated in the genomic studies (i.e. GWAS). Our goal is to distort the dependencies among the family members in the original database for achieving better privacy without significantly degrading the data utility.
PDNspot
PDNspot is a versatile framework that enables the modeling and architectural exploration of power delivery networks (PDNs) of modern processors. PDNspot evaluates the effect of multiple PDN parameters, TDP, and workloads on the metrics of interest. Described in the MICRO 2020 paper by Jawad Haj-Yahya et al. at https://people.inf.ethz.ch/omutlu/pub/FlexWatts-HybridPowerDeliveryNetwork_micro20.pdf
Register-Interval
LTRF's register-interval creation algorithm divides the control flow graph (CFG) of a GPU application into some register-intervals which have two main characteristics: 1) register-intervals have only one entry-point in CFG, and 2) they have a limited number of registers. This algorithm is part of ASPLOS2018 paper by Sadrosadati et al. at https://people.inf.ethz.ch/omutlu/pub/LTRF-latency-tolerant-GPU-register-file_asplos18.pdf
DRAM-Latency-Variation-Study
Latency characterization data collected from 30 real DRAM SO-DIMMs. You can find the background and analysis on the data in our SIGMETRICS'16 paper "Understanding Latency Variation in Modern DRAM Chips: Experimental Characterization, Analysis, and Optimization".