itsnamgyu / reasoning-teacher

Official code for "Large Language Models Are Reasoning Teachers", ACL 2023

Home Page: https://arxiv.org/abs/2212.10071

Some issues encountered in fine-tuning the T5 model

Carrot-r opened this issue

Hi, thank you very much for your outstanding work in the field of chain-of-thought reasoning. While reproducing the project, I ran into some questions that I hope you can help me resolve. When I fine-tuned the T5 model using the custom_train.py file, the console produced this output:
Epoch 1: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 439/439 [10:16<00:00, 1.40s/it, loss=0.784, v_num=2, accuracy=0.022]
INFO:root:------------------------------VALIDATION_EXAMPLES-------------------------------
███████████████████████████████████████████████████████████████████████████████████████| 42/42 [05:42<00:00, 8.15s/it]
INFO:root:[
  [
    {
      "sample_index": 7473,
      "completion_index": 0,
      "question": "Janet\u2019s ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes muffins for her friends every day with four. She sells the remainder at the farmers' market daily for $2 per fresh duck egg. How much in dollars does she make every day at the farmers' market?",
      "answer": "18",
      "prompt": "Janet\u2019s ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes muffins for her friends every day with four. She sells the remainder at the farmers' market daily for $2 per fresh duck egg. How much in dollars does she make every day at the farmers' market?",
      "completion": "Janet's ducks lay 16 eggs per day. She eats three for breakfast every morning. That means she eats 8 eggs in total every morning. She bakes muffins for her friends every day with four. That means she bakes 8 muffins in total every day. She sells the remainder at the farmers' market daily for $2 per fresh duck egg. That means she makes $2 per day at the farmers' market. --> 2"
    }
  ],
  [
    {
      "sample_index": 7474,
      "completion_index": 0,
      "question": "A robe takes 2 bolts of blue fiber and half that much white fiber. How many bolts in total does it take?",
      "answer": "3",
      "prompt": "A robe takes 2 bolts of blue fiber and half that much white fiber. How many bolts in total does it take?",
      "completion": "A robe takes 2 bolts of blue fiber and half that much white fiber. That means that the robe takes 2 + 1/2 = 4 bolts of white fiber. Now we can add up all the bolts. 4 + 4 = 12 bolts in total. --> 12"
    }
  ],
  [
    {
      "sample_index": 7475,
      "completion_index": 0,
      "question": "Josh decides to try flipping a house. He buys a house for $80,000 and then puts in $50,000 in repairs. This increased the value of the house by 150%. How much profit did he make?",
      "answer": "70000",
      "prompt": "Josh decides to try flipping a house. He buys a house for $80,000 and then puts in $50,000 in repairs. This increased the value of the house by 150%. How much profit did he make?",
      "completion": "Josh buys a house for $80,000. This increases the value of the house by 150%. This increases the value of the house by $80,000 by $30,000, which is $15,000. Josh then puts in $50,000 in repairs. This increases the value of the house by $50,000 by $30,000, which is $15,000. This increases the value of the house by $15,000, which is $15,000. Josh's profit is $15,000 - $15,000 = $15,000 - $15,000 = $15,000 - $15,000 = $15,000 - $120,000 = $120,000 --> 120,000"
    }
  ],
  [
    {
      "sample_index": 7476,
      "completion_index": 0,
      "question": "James decides to run 3 sprints 3 times a week. He runs 60 meters each sprint. How many total meters does he run a week?",
      "answer": "540",
      "prompt": "James decides to run 3 sprints 3 times a week. He runs 60 meters each sprint. How many total meters does he run a week?",
      "completion": "James runs 3 sprints 3 times a week. That means he runs 3 * 60 = 240 meters in a week. --> 240"
    }
  ],
  [
    {
      "sample_index": 7477,
      "completion_index": 0,
      "question": "Every day, Wendi feeds each of her chickens three cups of mixed chicken feed, containing seeds, mealworms and vegetables to help keep them healthy. She gives the chickens their feed in three separate meals. In the morning, she gives her flock of chickens 15 cups of feed. In the afternoon, she gives her chickens another 25 cups of feed. How many cups of feed does she need to give her chickens in the final meal of the day if the size of Wendi's flock is 20 chickens?",
      "answer": "20",
      "prompt": "Every day, Wendi feeds each of her chickens three cups of mixed chicken feed, containing seeds, mealworms and vegetables to help keep them healthy. She gives the chickens their feed in three separate meals. In the morning, she gives her flock of chickens 15 cups of feed. In the afternoon, she gives her chickens another 25 cups of feed. How many cups of feed does she need to give her chickens in the final meal of the day if the size of Wendi's flock is 20 chickens?",
      "completion": "In the morning, Wendi gives her flock of chickens 15 cups of feed. In the afternoon, she gives her chickens another 25 cups of feed. This means that in total, Wendi gives her flock of chickens 15 + 15 + 25 = 45 cups of feed. In the final meal of the day, Wendi needs to give her chickens 45 + 45 = 105 cups of feed. --> 105"
    }
  ]
]
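
As background for question 1 below: each completion ends with a "--> <answer>" marker, and the accuracy value in the progress bar above presumably comes from comparing that marker against the gold "answer" field. A minimal sketch of such a check, assuming this delimiter convention (the function names here are hypothetical, not the repo's actual code):

```python
import re


def extract_predicted_answer(completion: str) -> str:
    """Take the text after the final '-->' marker as the predicted answer."""
    _, sep, tail = completion.rpartition("-->")
    if not sep:
        return ""  # no marker found
    # Keep only digits, signs, and decimal points, e.g. "120,000" -> "120000".
    return re.sub(r"[^0-9.\-]", "", tail)


def validation_accuracy(examples: list[dict]) -> float:
    """Fraction of examples whose extracted answer matches the gold answer."""
    correct = sum(
        extract_predicted_answer(ex["completion"]) == ex["answer"]
        for ex in examples
    )
    return correct / len(examples)
```
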
1. The training console prints some "VALIDATION_EXAMPLES" during each training epoch. Am I right to understand that, after each epoch, these "VALIDATION_EXAMPLES" questions are used to test how well the training went? If so, how were these "VALIDATION_EXAMPLES" selected? What rules were used to pick them?
2. What is the difference between the "question" field and the "prompt" field in "VALIDATION_EXAMPLES"? My guess is that, to test the fine-tuned model, we feed it the "prompt" field and it outputs the "completion" field. If that is the case, is the "prompt" field missing "let's think step by step"?
3. If I fine-tune the T5 model, does the data I use only need to come from the B_text-davinci-002__C_zs_cot directory? And if the data needs to be modified or augmented, should the changes be made only in that directory?

Looking forward to your reply!!

Thanks for your interest in our work :)

  1. These were selected without any deep thought; we simply wanted to sanity-check the training process.
  2. The "question" is the original question from the benchmark dataset, and the "prompt" is the actual text prompt that is fed to the language model (during both training and inference).
  3. You should look into the data preprocessing pipeline to see where you want to make your own modifications; I don't remember the details exactly. I suspect that modifying the JSON file (from which the code fetches the training data) is the easiest way, but I wouldn't recommend it for running and keeping track of multiple experiments. See the sketch after this list.
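
For illustration, if the fine-tuning data is stored as JSON records with "prompt" and "completion" fields (as the validation examples above suggest), an offline edit might look like the minimal sketch below. The file path and the filtering rule are assumptions for illustration, not the repo's actual layout:

```python
import json
from pathlib import Path

# Hypothetical path; check the repo's data preprocessing pipeline for
# the actual file layout under B_text-davinci-002__C_zs_cot.
data_path = Path("data/B_text-davinci-002__C_zs_cot/train.json")
records = json.loads(data_path.read_text())

# Example modification: drop records with an empty completion.
records = [r for r in records if r.get("completion", "").strip()]

# Write to a new file so the original data stays intact across experiments.
out_path = data_path.with_name("train_filtered.json")
out_path.write_text(json.dumps(records, indent=2))
```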

Thank you for your prompt reply!
Regarding question 2, I would like to confirm whether the contents of "question" and "prompt" should be the same, since they are identical in the console output. But given that the "prompt" is fed to the model, does the program automatically append "let's think step by step"? Could you point me to where this is implemented in the code?

Yep, the formatting of the prompt depends on where it is used. Formatting is implemented in src/data/format.py.
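
For a rough picture of what that can look like: in the zero-shot CoT setting, the reasoning trigger is appended when prompting the teacher model, while a fine-tuned student is typically fed the plain question (which is why "question" and "prompt" match in the validation examples above). The following is a minimal sketch under those assumptions; the actual format keys and template strings in src/data/format.py may differ:

```python
def format_prompt(question: str, mode: str) -> str:
    """Illustrative sketch only; see src/data/format.py for the real logic."""
    if mode == "zs_cot":
        # Zero-shot CoT (the "C_zs_cot" setting): append the trigger so the
        # teacher model generates a step-by-step rationale.
        return f"Q: {question}\nA: Let's think step by step."
    # Fine-tuned students are trained and evaluated on the plain question,
    # so no trigger needs to be appended at inference time.
    return question
```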