Implementation of SANAS (see the paper on arXiv), a model able to dynamically adapt the architecture of a Deep Neural Network at test time for efficient sequence classification.
- Create an environment with Python 3.6 and install the dependencies:

```
pip install -r requirements.txt
```
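For example, with conda (the environment name and the choice of conda are assumptions; any environment manager works):

```bash
# Hypothetical setup -- conda is one option among others.
conda create -n sanas python=3.6
conda activate sanas
pip install -r requirements.txt
```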
- Download the Speech Commands v0.01 archive.
- Extract the dataset and pass the extracted folder path as the `root_path` argument (defaults to `./data/speech_commands_v0.01`).
- The implementation of the Speech Commands data processing is based on honk, credit goes to the authors!
- Speech Commands dataset paper on arXiv.
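For convenience, the archive can be fetched and unpacked from the TensorFlow-hosted location (URL assumed from the Speech Commands v0.01 release; verify before use). Note that the tarball extracts its files directly, without a top-level folder:

```bash
# Assumed download URL for Speech Commands v0.01 -- check it is still valid.
wget http://download.tensorflow.org/data/speech_commands_v0.01.tar.gz
mkdir -p data/speech_commands_v0.01
tar -xzf speech_commands_v0.01.tar.gz -C data/speech_commands_v0.01
```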
```
python main.py with adam speech_commands gru kwscnn static=True use_mongo=False ex_path=<path_to_save_location>/runs
```
- If no `ex_path` is specified, logs and models will be saved under `./runs`.
- Create a JSON file containing the required connection information:

```
{
  "user": "Me",
  "passwd": "MySecurePassword",
  "host": "localhost",
  "port": "27017",
  "db": "sanas",
  "collection": "runs"
}
```
```
python main.py with adam speech_commands gru kwscnn static=True mongo_config_path=<path_to_config>/mongo_config.json
```

`mongo_config_path` defaults to `./resources/mongo_credentials.json`.
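As a hedged illustration (not the repository's code), a credentials file shaped like the example above can be turned into a MongoDB connection, assuming `pymongo` is available:

```python
# Illustrative sketch only -- not the repository's code.
import json

from pymongo import MongoClient

with open('resources/mongo_credentials.json') as f:  # the default path
    cfg = json.load(f)

# Build a standard MongoDB URI from the credential fields.
uri = 'mongodb://{user}:{passwd}@{host}:{port}'.format(**cfg)
client = MongoClient(uri)
runs = client[cfg['db']][cfg['collection']]  # e.g. the sanas.runs collection
print(runs.count_documents({}))  # sanity check: number of logged runs
```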
```
python main.py with adam speech_commands gru kwscnn static=True use_visdom=False
```
- Visdom will connect to `localhost:8097` by default. To specify the server, create a config file:

```
{
  "server": "http://localhost",
  "port": 8097
}
```
```
python main.py with adam speech_commands gru kwscnn static=True visdom_config_path=<path_to_config>/vis_config.json
```
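For reference, a minimal sketch (not the repository's code) of how such a config file maps onto the visdom client API:

```python
# Illustrative sketch only -- not the repository's code.
import json

from visdom import Visdom

with open('vis_config.json') as f:  # hypothetical path
    cfg = json.load(f)

# Connect to the Visdom server described by the config file.
viz = Visdom(server=cfg['server'], port=cfg['port'])
assert viz.check_connection(), 'Could not reach the Visdom server'
```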
The `__getitem__(self, idx)` method of a dataset should return a tuple `(x, y)` with:

- `x` of size `seq_len x feature_dims`. For example, `feature_dims` for traditional images is `(C, H, W)`.
- `y` of size `seq_len`.
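As an illustration, a toy dataset satisfying this contract could look like the sketch below (hypothetical class and random data, not the repository's Speech Commands loader):

```python
import torch
from torch.utils.data import Dataset

class ToySequenceDataset(Dataset):
    """Toy example of the expected (x, y) contract -- not the repo's loader."""

    def __init__(self, n_sequences=100, seq_len=8, feature_dims=(3, 32, 32), n_classes=10):
        # x: (n_sequences, seq_len, C, H, W); y: one label per timestep.
        self.x = torch.randn(n_sequences, seq_len, *feature_dims)
        self.y = torch.randint(0, n_classes, (n_sequences, seq_len))

    def __len__(self):
        return self.x.size(0)

    def __getitem__(self, idx):
        # x of size seq_len x feature_dims, y of size seq_len.
        return self.x[idx], self.y[idx]
```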
It is possible to use the `PadCollate` class in the dataloader to pad each sequence to the length of the longest one in the sampled batch.
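The exact signature of `PadCollate` is repository-specific; as a generic illustration of the idea, a padding collate function for this data layout could look like:

```python
# Generic sketch of batch padding -- not necessarily PadCollate's implementation.
import torch
from torch.utils.data import DataLoader

def pad_collate(batch):
    # batch: list of (x, y) tuples, x of shape (seq_len, *feature_dims).
    max_len = max(x.size(0) for x, _ in batch)
    xs, ys = [], []
    for x, y in batch:
        pad = max_len - x.size(0)
        xs.append(torch.cat([x, x.new_zeros(pad, *x.shape[1:])]))  # zero-pad features
        ys.append(torch.cat([y, y.new_zeros(pad)]))                # zero-pad labels
    return torch.stack(xs), torch.stack(ys)

# Usage: loader = DataLoader(dataset, batch_size=32, collate_fn=pad_collate)
```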