SAFARI Research Group at ETH Zurich and Carnegie Mellon University

SAFARI Research Group at ETH Zurich and Carnegie Mellon University's repositories

GPGPUSim-Ramulator

The source code for the GPGPUSim+Ramulator simulator. In this version, GPGPUSim uses Ramulator to simulate the DRAM. This simulator was used to produce some of the results in our SIGMETRICS 2019 paper: Ghose et al., "Demystifying Complex Workload-DRAM Interactions: An Experimental Study" at https://arxiv.org/pdf/1902.07609.pdf.

Language: C++ | License: NOASSERTION | 43 8 1

IMPICA

A processing-in-memory simulator that models 3D-stacked memory within gem5. It also includes the workloads used for IMPICA (In-Memory PoInter Chasing Accelerator), an ICCD 2016 paper by Hsieh et al. at https://users.ece.cmu.edu/~omutlu/pub/in-memory-pointer-chasing-accelerator_iccd16.pdf

Language: C | 43 7 3

Mosaic

Source code of the simulator used in the Mosaic paper from MICRO 2017: "Mosaic: A GPU Memory Manager with Application-Transparent Support for Multiple Page Sizes" https://people.inf.ethz.ch/omutlu/pub/mosaic-application-transparent-multiple-page-sizes-for-GPUs_micro17.pdf

Language: C++ | 40 10 7

VAMPIRE

An open-source DRAM power model based on extensive experimental characterization of real DRAM modules. Described in the SIGMETRICS 2018 paper by Ghose et al. (https://people.inf.ethz.ch/omutlu/pub/VAMPIRE-DRAM-power-characterization-and-modeling_sigmetrics18_pomacs18.pdf)

Language: C++ | License: MIT | 34 9 3

Shifted-Hamming-Distance

Source code for the Shifted Hamming Distance (SHD) filtering mechanism for sequence alignment. Described in the Bioinformatics journal paper (2015) by Xin et al. at http://users.ece.cmu.edu/~omutlu/pub/shifted-hamming-distance_bioinformatics15_proofs.pdf

Language: C | License: GPL-2.0 | 30 26 1

Apollo

Apollo is an assembly polishing algorithm that attempts to correct the errors in an assembly. It can take multiple sets of reads in a single run and polish the assemblies of genomes of any size. Described in the Bioinformatics journal paper (2020) by Firtina et al. at https://people.inf.ethz.ch/omutlu/pub/apollo-technology-independent-genome-assembly-polishing_bioinformatics20.pdf

Language: C++ | License: GPL-3.0 | 27 5 7

Cache-Memory-Hog

Cache and main memory hog programs. These programs use specific access patterns to evict the cache blocks already held by other applications. They were designed to demonstrate that application performance is nearly linearly correlated with cache access rate (as shown in Section 3.1 of Subramanian et al., "The Application Slowdown Model", at https://users.ece.cmu.edu/~omutlu/pub/application-slowdown-model_micro15.pdf).

Language: C | 19 0 0

MemBen

Benchmark suite containing cache-filtered traces for use with Ramulator. These include some of the workloads used in our SIGMETRICS 2019 paper: Ghose et al., "Demystifying Complex Workload-DRAM Interactions: An Experimental Study" at https://arxiv.org/pdf/1902.07609.pdf.

License: MIT | 19 5 1

BEER

BEER determines an ECC code's parity-check matrix based on the uncorrectable errors it can cause. BEER targets Hamming codes that are used for DRAM on-die ECC but can be extended to apply to other linear block codes (e.g., BCH, Reed-Solomon). BEER is described in the 2020 MICRO paper by Patel et al.: https://arxiv.org/abs/2009.07985.

Language: C++ | License: MIT | 17 4 0
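
To make the role of the parity-check matrix in the BEER description concrete, here is a minimal sketch using the textbook Hamming(7,4) code. The matrix `H` and the helper names `syndrome` and `correct` are illustrative assumptions; this is not BEER's algorithm and not a matrix taken from any DRAM chip. It only shows that which bit a decoder flips after an uncorrectable (two-bit) error depends on the columns of `H`, which is the kind of dependence the description refers to.

```python
# Parity-check matrix H of the textbook Hamming(7,4) code (an illustrative
# choice, not a matrix recovered from a real chip). Column j (1-indexed) is
# the binary representation of j, least-significant bit in the first row.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def syndrome(word):
    """Return H * word^T over GF(2) as a tuple of 3 bits."""
    return tuple(sum(h * w for h, w in zip(row, word)) % 2 for row in H)

def correct(word):
    """Flip the single bit that the syndrome points at, if any."""
    s = syndrome(word)
    position = s[0] + 2 * s[1] + 4 * s[2]  # 0 means "no error detected"
    fixed = list(word)
    if position:
        fixed[position - 1] ^= 1
    return fixed

codeword = [1, 1, 1, 0, 0, 0, 0]           # columns 1, 2, 3 of H sum to zero
single_error = [1, 1, 1, 0, 1, 0, 0]       # bit 5 flipped: corrected properly
double_error = [1, 1, 0, 0, 0, 0, 0]       # two bits flipped in the zero word
print(correct(single_error) == codeword)   # single-bit error is repaired
print(correct(double_error))               # decoder flips a third bit instead
```

With a different `H`, the same two-bit error would be "corrected" into a different word, so observed miscorrections carry information about the matrix.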

Shouji

Shouji is a fast and accurate pre-alignment filter for banded sequence alignment calculation. Described in the Bioinformatics journal paper (2019) by Alser et al. at https://academic.oup.com/bioinformatics/advance-article-pdf/doi/10.1093/bioinformatics/btz234/28533771/btz234.pdf

Language: VHDL | License: GPL-3.0 | 16 7 0

CROW

Source code for the architectural and circuit-level simulators used for modeling the CROW (Copy-ROW DRAM) mechanism proposed in our ISCA 2019 paper "CROW: A Low-Cost Substrate for Improving DRAM Performance, Energy Efficiency, and Reliability". Paper is at: https://people.inf.ethz.ch/omutlu/pub/CROW-DRAM-substrate-for-performance-energy-reliability_isca19.pdf.

Language: C++ | 15 6 1

SMASH

SMASH is a hardware-software cooperative mechanism that enables highly efficient indexing and storage of sparse matrices. The key idea of SMASH is to compress sparse matrices with a hierarchical bitmap compression format that can be accelerated in hardware. Described by Kanellopoulos et al. (MICRO '19) at https://people.inf.ethz.ch/omutlu/pub/SMASH-sparse-matrix-software-hardware-acceleration_micro19.pdf

Language: C | 14 5 0
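
The general idea of a hierarchical bitmap can be sketched in a few lines. The two-level layout, the block size of 4, and the function names `compress` and `decompress` below are assumptions made for illustration; the actual SMASH format and its hardware support are defined in the paper and the repository, not here.

```python
def compress(row, block=4):
    """Encode a dense list as (top bitmap, per-block bitmaps, nonzero values)."""
    top, leaves, values = [], [], []
    for start in range(0, len(row), block):
        chunk = row[start:start + block]
        bits = [1 if x != 0 else 0 for x in chunk]
        if any(bits):
            top.append(1)          # block contains at least one nonzero
            leaves.append(bits)    # second level: which elements are nonzero
            values.extend(x for x in chunk if x != 0)
        else:
            top.append(0)          # an all-zero block costs a single bit
    return top, leaves, values

def decompress(top, leaves, values, length, block=4):
    """Rebuild the dense list from the two bitmap levels."""
    out, leaf_iter, value_iter = [], iter(leaves), iter(values)
    for flag in top:
        if flag:
            out.extend(next(value_iter) if b else 0 for b in next(leaf_iter))
        else:
            out.extend([0] * block)
    return out[:length]

row = [0, 0, 0, 0, 5, 0, 0, 7, 0, 0, 0, 0, 0, 3]
top, leaves, values = compress(row)
print(top, leaves, values)
print(decompress(top, leaves, values, len(row)) == row)
```

The top-level bitmap lets a reader skip whole empty blocks without touching the second level, which is the property that makes such a layout attractive for sparse data.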

Pythia-HDL

Chisel HDL implementation of Pythia: A Customizable Hardware Prefetching Framework Using Online Reinforcement Learning. To learn more, see the MICRO 2021 paper by Bera et al. (https://arxiv.org/pdf/2109.12021.pdf).

Language: Scala | License: MIT | 13 6 0

CLRDRAM

Circuit-level model for the Capacity-Latency Reconfigurable DRAM (CLR-DRAM) architecture. This repository contains the SPICE models of the CLR-DRAM architecture and the baseline architecture used in our ISCA 2020 paper: Luo et al., "CLR-DRAM: A Low-Cost DRAM Architecture Enabling Dynamic Capacity-Latency Trade-Off" at https://people.inf.ethz.ch/omutlu/pub/CLR-DRAM_capacity-latency-reconfigurable-DRAM_isca20.pdf

Language: AGS Script | License: MIT | 12 5 0

GRIM

Source code of the processing-in-memory simulator used in the GRIM-Filter paper published at BMC Genomics in 2018: "GRIM-Filter: Fast Seed Location Filtering in DNA Read Mapping using Processing-in-Memory Technologies" (preliminary version at https://arxiv.org/pdf/1711.01177.pdf)

Language: C | 11 0 0

COVIDHunter

COVIDHunter 🦠: An accurate and flexible COVID-19 outbreak simulation model that forecasts the strength of future mitigation measures and the numbers of cases, hospitalizations, and deaths for a given day, while considering the potential effect of environmental conditions. Described by Alser et al. (preliminary version at https://arxiv.org/abs/2102.03667 and https://doi.org/10.1101/2021.02.06.21251265).

Language: Swift | License: MIT | 9 5 1

RamulatorSharp

RamulatorSharp is a fast and flexible memory subsystem simulator implemented in C#; it runs on Linux, OS X, and Windows. The simulator implements Low-Cost Inter-Linked Subarrays (HPCA 2016) and ChargeCache (HPCA 2016), in addition to other features present in the C++ version of Ramulator: https://users.ece.cmu.edu/~omutlu/pub/lisa-dram_hpca16.pdf https://users.ece.cmu.edu/~omutlu/pub/chargecache_low-latency-dram_hpca16.pdf

Language: C# | License: BSD-3-Clause | 9 6 0

DRAM-Voltage-Study

Experimental study and analysis of how a wide range of supply voltage values affects the reliability, latency, and retention characteristics of DDR3L DRAM SO-DIMMs.

Language: AGS Script | 7 7 0

HARP

HARP is a memory error profiling algorithm (i.e., for identifying error-prone cells) designed for use with memory chips that use on-die error-correcting codes (ECC). This tool uses Monte-Carlo simulation to evaluate HARP and other error profilers. HARP and this tool are described in the 2021 MICRO paper by Patel et al.: https://arxiv.org/abs/2109.12697.

Language: C++ | License: MIT | 7 4 0

DIVA-DRAM

This repository provides characterization data collected from 96 DDR3 SO-DIMMs, related to the following paper: Lee et al., "Design-Induced Latency Variation in Modern DRAM Chips: Characterization, Analysis, and Latency Reduction Mechanisms", SIGMETRICS 2017. https://people.inf.ethz.ch/omutlu/pub/DIVA-low-latency-DRAM_sigmetrics17-paper.pdf

Language: AGS Script | 6 7 0

optimal-seed-solver

Optimal Seed Solver (OSS) is a dynamic-programming algorithm that finds the optimal seeds of a read, i.e., the seeds that yield the minimum total seed frequency. It is described by Xin et al. at http://arxiv.org/pdf/1506.08235v1.pdf.

Language: C++ | 6 27 0
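
As a rough illustration of the kind of problem described for optimal-seed-solver, the sketch below uses a plain dynamic program to split a read into `k` contiguous seeds so that the summed seed frequency is minimal. It is a simplified stand-in, not the OSS algorithm from the paper: the function name `best_seeds`, the requirement that the seeds tile the whole read, and the toy frequency table are all assumptions made for this example.

```python
from functools import lru_cache

def best_seeds(read, k, freq, min_len=1):
    """Split `read` into k contiguous seeds minimizing the summed freq(seed)."""
    n = len(read)
    if n < k * min_len:
        raise ValueError("read is too short for k seeds of min_len")

    @lru_cache(maxsize=None)
    def solve(end, parts):
        # Best (cost, cut positions) for splitting read[:end] into `parts` seeds.
        if parts == 1:
            return freq(read[:end]), ()
        best = (float("inf"), ())
        for cut in range(min_len * (parts - 1), end - min_len + 1):
            cost, cuts = solve(cut, parts - 1)
            total = cost + freq(read[cut:end])
            if total < best[0]:
                best = (total, cuts + (cut,))
        return best

    cost, cuts = solve(n, k)
    bounds = (0,) + cuts + (n,)
    return cost, [read[a:b] for a, b in zip(bounds, bounds[1:])]

# Hypothetical seed frequencies; a real mapper would look these up in an index.
table = {"A": 100, "C": 100, "AC": 10, "ACG": 3, "TAC": 2,
         "GTAC": 4, "ACGT": 5, "ACGTA": 1, "CGTAC": 1}
cost, seeds = best_seeds("ACGTAC", 2, lambda s: table.get(s, 50))
print(cost, seeds)
```

Memoizing on `(end, parts)` keeps the search polynomial in the read length instead of enumerating every possible set of cut positions.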

MIG-7-PHY-DDR3-Controller

A DDR3 Controller that uses the Xilinx MIG-7 PHY to interface with DDR3 devices.

Language: Verilog | 5 5 0

UHMEM

A cycle-accurate simulator that models a hybrid memory subsystem consisting of multiple memory technologies. Described in the CLUSTER 2017 paper by Li et al. (https://people.inf.ethz.ch/omutlu/pub/utility-based-hybrid-memory-management_cluster17.pdf)

Language: C# | 5 5 1

LEAP

Language: C++ | 4 6 1

SMLA

This simulator models Simultaneous Multi-Layer Access (SMLA) and 3D-stacked DRAM, based on the TACO 2016 paper at https://users.ece.cmu.edu/~omutlu/pub/smla_high-bandwidth-3d-stacked-memory_taco16.pdf

Language: C++ | 4 4 0

SNP-Selective-Hiding

An optimization-based mechanism to selectively hide the minimum number of overlapping SNPs among family members who participated in genomic studies (e.g., GWAS). The goal is to distort the dependencies among the family members in the original database to achieve better privacy without significantly degrading data utility.

Language: MATLAB | 4 3 0

PDNspot

PDNspot is a versatile framework that enables the modeling and architectural exploration of power delivery networks (PDNs) of modern processors. PDNspot evaluates the effect of multiple PDN parameters, TDP, and workloads on the metrics of interest. Described in the MICRO 2020 paper by Jawad Haj-Yahya et al. at https://people.inf.ethz.ch/omutlu/pub/FlexWatts-HybridPowerDeliveryNetwork_micro20.pdf

Language: Python | 3 4 0

Register-Interval

LTRF's register-interval creation algorithm divides the control flow graph (CFG) of a GPU application into register-intervals, which have two main characteristics: 1) each register-interval has only one entry point in the CFG, and 2) each uses a limited number of registers. This algorithm is part of the ASPLOS 2018 paper by Sadrosadati et al. at https://people.inf.ethz.ch/omutlu/pub/LTRF-latency-tolerant-GPU-register-file_asplos18.pdf

Language: C++ | 3 0 0

DRAM-Latency-Variation-Study

Latency characterization data collected from 30 real DRAM SO-DIMMs. You can find the background and analysis on the data in our SIGMETRICS'16 paper "Understanding Latency Variation in Modern DRAM Chips: Experimental Characterization, Analysis, and Optimization".

License: BSD-3-Clause | 2 6 0

BurstLink

0 0 0