BaeHeeSun / LCMat

LCMat (Loss Curvature Matching for Dataset Selection and Condensation, AISTATS 2023)

Official PyTorch implementation of "Loss-Curvature Matching for Dataset Selection and Condensation" (AISTATS 2023) by Seungjae Shin*, HeeSun Bae*, Donghyeok Shin, Weonyoung Joo, and Il-Chul Moon.

Overview

LCMat identifies the optimal reduced dataset by matching the worst-case loss-curvature gap between the original dataset and the reduced dataset. During dataset reduction, it aims for generalization over a local region around the model parameters. Our implementation builds largely on the DeepCore code base. We thank its authors for making their code available.

Here, $\theta$ denotes the model parameters and $\rho$ denotes the maximum size of the perturbation. LCMat matches the loss curvature of the training dataset, $T$, and the reduced dataset, $S$, by reducing the gap between the curvature of $\mathcal{L}(T)$ and that of $\mathcal{L}(S)$.

By considering the sharpness of the loss difference, LCMat (right) identifies a reduced dataset $S$ whose loss landscape matches that of the original $T$, whereas the subset selected by Craig (left) does not match the loss curvature of $T$.
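The idea of comparing subsets by how much their loss gap to $T$ can grow inside a $\rho$-ball can be illustrated with a toy sketch. This is not the code in this repository: it assumes a 1-D linear model with squared loss, approximates the worst-case perturbation by a grid search, and all function and variable names are hypothetical.

```python
# Toy illustration only (not the LCMat implementation).
# Assumed setting: 1-D linear model y = theta * x with mean squared error.

def loss(data, theta):
    """Mean squared error of y = theta * x over a list of (x, y) pairs."""
    return sum((theta * x - y) ** 2 for x, y in data) / len(data)


def loss_gap(full, subset, theta):
    """Absolute loss difference between the full set T and a subset S."""
    return abs(loss(full, theta) - loss(subset, theta))


def worst_case_gap_increase(full, subset, theta, rho, steps=200):
    """Grid-search estimate of max over |eps| <= rho of
    loss_gap(theta + eps) - loss_gap(theta)."""
    base = loss_gap(full, subset, theta)
    worst = 0.0
    for i in range(steps + 1):
        eps = -rho + 2 * rho * i / steps  # sweep eps from -rho to +rho
        worst = max(worst, loss_gap(full, subset, theta + eps) - base)
    return worst


# Full dataset T generated by y = 2x, and two candidate subsets of size 2.
T = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
S_a = [(1.0, 2.0), (2.0, 4.0)]  # only small x: much flatter loss than T
S_b = [(1.0, 2.0), (4.0, 8.0)]  # spans the range of T: curvature close to T

theta, rho = 2.0, 0.5
score_a = worst_case_gap_increase(T, S_a, theta, rho)
score_b = worst_case_gap_increase(T, S_b, theta, rho)
print(f"S_a: {score_a:.2f}, S_b: {score_b:.2f}")  # prints "S_a: 1.25, S_b: 0.25"
```

Both subsets have zero loss gap at $\theta = 2$, so a criterion that only looks at the loss at $\theta$ cannot tell them apart. The worst-case gap inside the $\rho$-ball separates them, because the loss of `S_b` curves almost like that of `T` while the loss of `S_a` is much flatter.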

Setup

We recommend running this code with Python 3.8. Install the required libraries as follows:

pip install -r requirements.txt

Reproduce

To reproduce the results of LCMat-S, we provide bash scripts that run main.py, located at:

/bash/LCMat_XXX.sh

Here, XXX is the dataset name. Results are written to the result/ directory.

You can also reproduce the cross-architecture generalization results by running cross_network_generalization.py.

We will also release the code of LCMat-C soon.

Thank you for your interest in our paper!
