Job failed because the message got too long
upupzealot opened this issue · comments
{
  "plugins": {
    "dataCollect": {
      "package": "@pipcook/plugins-object-detection-pascalvoc-data-collect",
      "params": {
        "url": "https://zhijiansha.oss-cn-hangzhou.aliyuncs.com/deep-learning/output.zip"
      }
    },
    "dataAccess": {
      "package": "@pipcook/plugins-coco-data-access"
    },
    "modelDefine": {
      "package": "@pipcook/plugins-pytorch-yolov5-model-define"
    },
    "modelTrain": {
      "package": "@pipcook/plugins-pytorch-yolov5-model-train",
      "params": {
        "epochs": 300
      }
    },
    "modelEvaluate": {
      "package": "@pipcook/plugins-pytorch-yolov5-model-evaluate"
    }
  }
}
BTW, to locate the cause of this issue, I set epochs to 10 as a test, and the training went well.
The SIGKILL seems to be caused by costa using up the memory, after which the OS killed the process. We need to figure out a way to optimize the memory consumption.
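For anyone trying to confirm that a job really died from SIGKILL rather than from an ordinary error, here is a minimal Python sketch. It is not Pipcook or costa code: `describe_exit` is a hypothetical helper, and the child process is just a stand-in that kills itself. It assumes a POSIX system, where `subprocess` reports death by signal as a negative return code. Note that SIGKILL alone does not prove an out-of-memory kill; the kernel log would be needed to confirm that.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a short diagnosis (hypothetical helper)."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, subprocess reports death-by-signal as the negated signal number.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            return f"killed by unknown signal {-returncode}"
        if name == "SIGKILL":
            # SIGKILL is what the OS out-of-memory killer sends, but it can
            # also come from other sources, so this is only a hint.
            return "killed by SIGKILL (possibly the OS out-of-memory killer)"
        return f"killed by {name}"
    return f"exited with error code {returncode}"


if __name__ == "__main__":
    # Stand-in for a training job: a child process that sends SIGKILL to itself.
    child = subprocess.run(
        [sys.executable, "-c", "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"]
    )
    print(describe_exit(child.returncode))
```

If the real job reports SIGKILL only with 300 epochs and not with 10, that would be consistent with memory growing over the course of training.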