ZJULearning / pixel_link

Implementation of our paper 'PixelLink: Detecting Scene Text via Instance Segmentation' in AAAI2018

Why does the program run on the CPU?

Pro-flynn opened this issue

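When I run the program on my own data, why does it only run on the CPU? It uses 550% CPU, while only 150 MB is used on the GPU.
I hope someone who has hit this pitfall before can give me a hint. Thanks!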

Below are the settings in my /scripts/train.sh:

set -x
set -e
export CUDA_VISIBLE_DEVICES=0
IMG_PER_GPU=32

TRAIN_DIR=$/pixel_link_info

OLD_IFS="$IFS"
IFS=","
gpus=($CUDA_VISIBLE_DEVICES)
IFS="$OLD_IFS"
NUM_GPUS=${#gpus[@]}

BATCH_SIZE=$(expr $NUM_GPUS \* $IMG_PER_GPU)

DATASET=thaiid
DATASET_DIR=$/tmp

CUDA_VISIBLE_DEVICES=0 python train_pixel_link.py \
    --train_dir=${TRAIN_DIR} \
    --num_gpus=${NUM_GPUS} \
    --learning_rate=1e-3 \
    --gpu_memory_fraction=-1 \
    --train_image_width=512 \
    --train_image_height=512 \
    --batch_size=${BATCH_SIZE} \
    --dataset_dir=${DATASET_DIR} \
    --dataset_name=${DATASET} \
    --dataset_split_name=train \
    --max_number_of_steps=100 \
    --checkpoint_path=${CKPT_PATH} \
    --using_moving_average=1 \
    2>&1 | tee -a ${TRAIN_DIR}/log.log
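
The GPU count and batch size that the script passes to train_pixel_link.py can be checked on their own. This is only a minimal sketch, assuming the values from the post (CUDA_VISIBLE_DEVICES=0, IMG_PER_GPU=32). The helper name count_gpus is hypothetical, and it uses portable sh field splitting instead of the bash array in the script.

```shell
#!/bin/sh
# Hypothetical helper: count the entries in a comma-separated device
# list such as "0" or "0,1,2" (the format of CUDA_VISIBLE_DEVICES).
count_gpus() {
    old_ifs=$IFS
    IFS=,
    set -- $1            # left unquoted on purpose so it splits on commas
    IFS=$old_ifs
    echo "$#"            # number of fields = number of listed devices
}

# Values assumed from the posted script.
export CUDA_VISIBLE_DEVICES=0
IMG_PER_GPU=32

NUM_GPUS=$(count_gpus "$CUDA_VISIBLE_DEVICES")
BATCH_SIZE=$((NUM_GPUS * IMG_PER_GPU))

echo "num_gpus=$NUM_GPUS batch_size=$BATCH_SIZE"
```

With the single device listed in the post, this prints num_gpus=1 batch_size=32.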

How did you solve this problem? Please help me.
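
One thing worth ruling out, though nothing in this thread confirms it as the cause: in older TensorFlow 1.x releases, the pip package tensorflow was CPU-only and GPU support came from the separate tensorflow-gpu package. An environment with only the former would train on the CPU. Below is a rough sketch of that check. The function name tf_build is hypothetical, and the patterns assume pip freeze style output.

```shell
#!/bin/sh
# Hypothetical helper: read a `pip freeze` style listing on stdin and
# report which TensorFlow package it contains: gpu, cpu, or none.
tf_build() {
    listing=$(cat)
    if printf '%s\n' "$listing" | grep -qiE '^tensorflow-gpu[=@ ]'; then
        echo gpu
    elif printf '%s\n' "$listing" | grep -qiE '^tensorflow[=@ ]'; then
        echo cpu
    else
        echo none
    fi
}

# Usage, inside the environment used for training:
#   pip freeze | tf_build
```

If this reports cpu, the installed package would be the first thing to look at. If it reports gpu, the cause lies elsewhere, for example in the driver or CUDA setup.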

@Pro-xiaowen Are you using a conda environment? In my case, the code fully used the 20GB GPU and then also fully used the CPU, and I don't know why.

@Pro-xiaowen The problem is that it also fully uses my CPU. Have you faced this problem?

I have another question: after about 3000 iterations, my loss is roughly 0.5 to 0.6 (the pretrained model is PixelLink VGG 2s) and it does not drop any further.
Have you trained successfully?
Could you give me some advice or ideas for getting out of this situation?

@Pro-xiaowen My loss is around 0.4, but it detects empty boxes and I don't know why.

@Pro-xiaowen Can you share how you set up the dataset? I think the problem may come from my dataset.

@Pro-xiaowen Thank you for sharing. I will try it.