bert contrastive-learning distilbert electra huggingface-transformers language-model math-word-problem nlp pytorch roberta

Solving Math Word Problems Using Language Models and Contrastive Loss 📖

Details 📃

The code provided is for the best model: RoBERTa with NeuralSim Loss. Other models based on BERT, DistilBERT, etc., can be used with the same codebase by cloning the Hugging Face model repo into the pretrained_models folder (for example, https://huggingface.co/distilbert-base-uncased/tree/main) and adding [num] and [NUM] to the vocab file like below:

[PAD]
[num]
[NUM]
[unused3]
[unused4]
[unused5]
[unused6]
...

The data folder contains the dataset files. The pretrained_models folder is for storing the modified Hugging Face models (like RoBERTa), and the src folder contains the code.
Logs and checkpoints will be saved in a folder named output that will be created automatically during training/testing.
The files train_cl.sh and train_ft.sh are scripts to run the training/testing as explained in the Running experiments section below.
The files run_ft.py and run_cl.py also contain code.

Requirements⚡

Python 3
PyTorch 1.8 (with CUDA)
Transformers 4.9.1

Running experiments ▶️

Download pytorch_model.bin from https://huggingface.co/roberta-base/tree/main and place it in pretrained_models/roberta-base/.
To run the contrastive loss step (including testing), run ./train_cl.sh (make sure it is executable on the filesystem).
```
./train_cl.sh
```
To run the fine-tuning step (including testing), run ./train_ft.sh (make sure it is executable on the filesystem).
```
./train_ft.sh
```
Parameters like learning rate, epochs, batch size, etc., can be changed in train_cl.sh and train_ft.sh.
Checkpoints and logs will be saved in output/.
If you only want to test the model (can be done only after a few checkpoints are saved during training), use the --only_test option in train_cl.sh or train_ft.sh.

Team members ✏️

Chris Francis (cfrancis@ucsd.edu), Harshil Jain (hjain@ucsd.edu), Rohit Ramaprasad (rramaprasad@ucsd.edu), Sai Sree Harsha (ssreeharsha@ucsd.edu)

References 🔖

[1] Li, Z., Zhang, W., Yan, C., Zhou, Q., Li, C., Liu, H., & Cao, Y. (2021). Seeking patterns, not just memorizing procedures: Contrastive learning for solving math word problems. arXiv preprint arXiv:2110.08464.

Our code is based on the PyTorch implementation of the work by Li et al [1].

About

Solving Math Word Problems Using Language Models and Contrastive Loss

bert contrastive-learning distilbert electra huggingface-transformers language-model math-word-problem nlp pytorch roberta

Languages

Language:Python 98.9%Language:Shell 1.1%