Accepted by NeurIPS 2022 Datasets and Benchmarks Track
Please visit our website for more dataset information and for interactive viewing of our dataset.
We also provide decompressed sample data (mesh files in obj format) under example_data/. You can use MeshLab to view them.
We provide a compressed version of our dataset, together with a python decompressor script that you can run to locally decompress it. Proceed as follows (this assumes you have conda installed):
- Download the Breaking Bad dataset from Dataverse and unzip the files.
To reproduce the main results in the paper, you only need to download the everyday and artifact subsets as well as data_split.tar.gz. For the other subset, we split the zip file into 4 parts because of the per-file size limit on Dataverse. Refer to here for how to unzip split zip files. Make sure the unzipped dataset looks like
$DATA_ROOT/
├──── data_split/
│     ├──── everyday.train.txt
│     ├──── everyday.val.txt
│     ├──── artifact.train.txt
│     ├──── artifact.val.txt
│     ├──── other.train.txt
│     ├──── other.val.txt
├──── everyday_compressed/
│     ├──── BeerBottle/
│     │     ├──── 3f91158956ad7db0322747720d7d37e8/
│     │     │     ├──── compressed_data.npz
│     │     │     ├──── compressed_mesh.obj
│     │     │     ├──── mode_0/
│     │     │     │     ├──── compressed_fracture.npy
│     │     │     ├──── ...
│     │     │     ├──── mode_19/
│     │     │     ├──── fractured_0/
│     │     │     ├──── ...
│     │     │     ├──── fractured_79/
│     │     ├──── 6da7fa9722b2a12d195232a03d04563a/
│     │     ├──── 2927d6c8438f6e24fe6460d8d9bd16c6/
│     │     ├──── ...
│     ├──── Bottle/
│     │     ├──── 1/
│     │     ├──── 1b64b36bf7ddae3d7ad11050da24bb12/
│     │     ├──── 1c79735033726294724d5ee7f09ab66b/
│     │     ├──── ...
│     ├──── Bowl/
│     ├──── ...
├──── artifact_compressed/
│     ├──── 39084_sf/
│     ├──── 39085_sf/
│     ├──── 39086_sf/
│     ├──── ...
├──── other_compressed/
│     ├──── 32770_sf/
│     ├──── 34783_sf/
│     ├──── 34784_sf/
│     ├──── ...
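Before running the decompressor, it can help to confirm that the unzipped layout matches the tree above. The following is a minimal sketch, not part of the released tooling: the helper name missing_entries is hypothetical, and it only checks the top-level folders and split files named in this README.

```python
from pathlib import Path

# Subset names as listed in this README.
SUBSETS = ("everyday", "artifact", "other")


def missing_entries(data_root, subsets=SUBSETS):
    """Return the expected paths (relative to data_root) that do not exist.

    Hypothetical helper: it checks only the <subset>_compressed folders and
    the data_split/<subset>.{train,val}.txt files shown in the tree above.
    """
    root = Path(data_root)
    expected = []
    for subset in subsets:
        expected.append(root / f"{subset}_compressed")
        for split in ("train", "val"):
            expected.append(root / "data_split" / f"{subset}.{split}.txt")
    return [p.relative_to(root).as_posix() for p in expected if not p.exists()]
```

Calling missing_entries("/path/to/data", subsets=("everyday", "artifact")) returns an empty list when everything needed for the main results is in place, and otherwise lists what is missing.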
- Clone this repository
git clone git@github.com:Breaking-Bad-Dataset/Breaking-Bad-Dataset.github.io.git breaking-bad-dataset
- Navigate to the repository
cd breaking-bad-dataset/
- Install dependencies
conda create -n breaking-bad python=3.8
conda activate breaking-bad
conda install numpy scipy tqdm
conda install -c conda-forge igl
- Run decompressor script
python decompress.py --data_root $DATA_ROOT --subset $SUBSET --category $CATEGORY
where $DATA_ROOT is the path to the Breaking Bad dataset folder.
$SUBSET is the name of the subset you want to process, i.e. one of ['everyday', 'artifact', 'other']. You can also input all to decompress the entire dataset, which is very time-consuming and takes ~1T disk storage.
$CATEGORY is only used for the everyday subset and specifies the object category you want to decompress, e.g. Bottle, Bowl. You can also input all to decompress all the categories.
For example, to decompress the Bottle category in the everyday subset, run
python decompress.py --data_root $DATA_ROOT --subset everyday --category Bottle
To decompress the artifact subset, run
python decompress.py --data_root $DATA_ROOT --subset artifact
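If you only need a few categories of the everyday subset, the commands above can be scripted. This is a rough sketch under a few assumptions: it is run from the repository root so that decompress.py resolves, it uses only the three flags shown above, and the helper names decompress_command and decompress_categories are hypothetical.

```python
import subprocess
import sys


def decompress_command(data_root, subset, category=None):
    """Build the argument list for one decompress.py call (flags from this README)."""
    cmd = [sys.executable, "decompress.py",
           "--data_root", str(data_root), "--subset", subset]
    if category is not None:
        # --category is only meaningful for the everyday subset.
        cmd += ["--category", category]
    return cmd


def decompress_categories(data_root, categories, dry_run=True):
    """Decompress several everyday categories, one call per category.

    With dry_run=True the commands are printed instead of executed.
    """
    commands = [decompress_command(data_root, "everyday", c) for c in categories]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # stop at the first failure
    return commands
```

For instance, decompress_categories("/path/to/data", ["Bottle", "Bowl"]) prints the two commands; pass dry_run=False to actually run them.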
After decompressing everything, the structure of the dataset will be
$DATA_ROOT/
├──── data_split/
├──── everyday/ (~60G)
│     ├──── BeerBottle/
│     │     ├──── 3f91158956ad7db0322747720d7d37e8/
│     │     │     ├──── mode_0/
│     │     │     │     ├──── piece_0.obj
│     │     │     │     ├──── ...
│     │     │     │     ├──── piece_n.obj
│     │     │     ├──── ...
│     │     │     ├──── mode_19/
│     │     │     │     ├──── piece_0.obj
│     │     │     │     ├──── ...
│     │     │     │     ├──── piece_n.obj
│     │     │     ├──── fracture_0/
│     │     │     │     ├──── piece_0.obj
│     │     │     │     ├──── ...
│     │     │     │     ├──── piece_n.obj
│     │     │     ├──── ...
│     │     │     ├──── fracture_79/
│     │     │     │     ├──── piece_0.obj
│     │     │     │     ├──── ...
│     │     │     │     ├──── piece_n.obj
│     ├──── ...
├──── artifact/ (~40G)
├──── other/ (~900G)
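Once a shape is decompressed, each fracture folder holds one obj file per piece. As an illustrative sketch (the helper name count_pieces is hypothetical, and it relies only on the piece_*.obj naming shown in the tree above), the number of pieces per fracture of a single object can be tallied like this:

```python
from pathlib import Path


def count_pieces(object_dir):
    """Map each fracture folder of one object to its number of piece_*.obj files.

    object_dir is a decompressed object folder, for example
    $DATA_ROOT/everyday/BeerBottle/<object id>/.
    """
    counts = {}
    for fracture_dir in sorted(Path(object_dir).iterdir()):
        if fracture_dir.is_dir():
            counts[fracture_dir.name] = len(list(fracture_dir.glob("piece_*.obj")))
    return counts
```

The returned dictionary is keyed by folder name, so it works regardless of how many fracture folders a given shape has.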
We release the code for reproducing our benchmark results here.
In the initial release of our Breaking Bad Dataset, some fractures contain small chip-like pieces (see examples with 6 or 8 pieces in our gallery). The imbalance in shape volumes causes difficulty in model learning. For example, if we sample 1,000 points per piece, the point density of small and large pieces will be very different.
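To make the density imbalance concrete, here is a small arithmetic sketch. The volume fractions are made up for illustration, and density is measured as points per unit volume, which is only a rough proxy for how sparsely a large piece is covered compared to a small one.

```python
def point_densities(piece_volumes, points_per_piece=1000):
    """Points per unit volume when every piece gets the same number of points."""
    return [points_per_piece / v for v in piece_volumes]


# Hypothetical two-piece fracture: one large piece and one small chip,
# given as fractions of the total shape volume.
volumes = [0.99, 0.01]
densities = point_densities(volumes)

# The chip is sampled roughly 99 times more densely than the large piece.
ratio = max(densities) / min(densities)
print(f"density ratio between smallest and largest piece: {ratio:.1f}")
```

With equal point budgets, the density ratio equals the inverse of the volume ratio, so the smaller the chip, the larger the mismatch.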
As an attempt to solve this issue, we create a volume-constrained version of our dataset.
In the fracture simulator, we require each piece to have a volume of at least 1/40 of the total shape volume, and use rejection sampling to generate valid samples.
We release the volume-constrained version of the everyday and artifact subsets under the same repo.
Note that, due to the constraint, some shapes cannot generate 100 valid fractures.
We also benchmark the baselines on this version of the data; see the last part of the section.
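The volume constraint and rejection sampling can be illustrated with a toy sketch. This is not the fracture simulator: toy_fracture is a hypothetical stand-in that just draws random volume fractions, and the function names and the attempt budget are assumptions. Only the 1/40 threshold comes from the text above.

```python
import random

MIN_FRACTION = 1 / 40  # minimum piece volume as a fraction of the whole shape


def toy_fracture(num_pieces, rng):
    """Hypothetical stand-in for the simulator: random volume fractions summing to 1."""
    cuts = sorted(rng.random() for _ in range(num_pieces - 1))
    edges = [0.0] + cuts + [1.0]
    return [b - a for a, b in zip(edges, edges[1:])]


def sample_valid_fractures(num_pieces, target, max_attempts, seed=0):
    """Rejection sampling: keep only fractures whose smallest piece is large enough."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(max_attempts):
        fractions = toy_fracture(num_pieces, rng)
        if min(fractions) >= MIN_FRACTION:
            accepted.append(fractions)
            if len(accepted) == target:
                break
    # May hold fewer than `target` samples if the attempt budget runs out.
    return accepted
```

Because rejected samples are simply discarded, a fixed attempt budget can end with fewer than the target number of fractures, which mirrors why some shapes do not reach 100 valid fractures.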
If you find this dataset useful, please consider citing our paper:
@inproceedings{sellan2022breaking,
title = {Breaking Bad: A Dataset for Geometric Fracture and Reassembly},
author = {Sell{\'a}n, Silvia and Chen, Yun-Chun and Wu, Ziyi and Garg, Animesh and Jacobson, Alec},
booktitle = {Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
year = {2022}
}