microsoft / winfile

Original Windows File Manager (winfile) with enhancements

oomkilled

jentur-zabbeJ-8basdy opened this issue · comments

The training process is OOM-killed (screenshot attached in the original issue).

The default script:
`set -x
export BS=${BS:-16}
export MEMCAP=${MEMCAP:-0}
export GPUNUM=${GPUNUM:-1}

export MODLE_PATH="facebook/opt-${MODEL}"
model_name_or_path=./opt6.7b

# HF_DATASETS_OFFLINE=1 TRANSFORMERS_OFFLINE=1
torchrun \
--nproc_per_node ${GPUNUM} \
--master_port 19198 \
train_gemini_opt.py \
--mem_cap ${MEMCAP} \
--model_name_or_path ${model_name_or_path} \
--batch_size ${BS} `

Environment

Version: torch 1.12+cu113
deepspeed: 0.7.7
Memory: 80G
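
For context, here is a rough back-of-envelope estimate of the model-state memory for this run. It is only a sketch: the parameter count is inferred from the `opt6.7b` directory name, and the 16 bytes per parameter figure is an assumption (fp16 weights and gradients plus fp32 master weights and two Adam moments), not something measured on this machine. Activations and framework overhead are not counted.

```shell
#!/usr/bin/env bash
# Rough estimate only; every constant below is an assumption, not a measurement.
params=6700000000     # ~6.7B parameters, inferred from the "opt6.7b" directory name
bytes_per_param=16    # assumed: fp16 weights (2) + fp16 grads (2) + fp32 master weights (4) + Adam moments (4 + 4)
host_mem_g=80         # memory figure reported in the Environment section

total_bytes=$((params * bytes_per_param))
est_gib=$((total_bytes / 1073741824))   # integer division, bytes -> GiB

echo "estimated model states: ~${est_gib} GiB, reported memory: ${host_mem_g}G"
if [ "$est_gib" -gt "$host_mem_g" ]; then
  echo "model states alone may exceed the reported memory"
fi
```

If this estimate is roughly right, the model states by themselves would not fit in 80G, which could explain the OOM kill regardless of batch size.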

Originally posted by @iMountTai in hpcaitech/ColossalAI#2772
