princeton-nlp / SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

No response when training on a custom dataset

BarryC7 opened this issue

I slightly modified the supervised example to:

python train.py --model_name_or_path bert-base-uncased --train_file data/train1_CSE.csv --output_dir result/my-sup-simcse-bert-base-uncased --num_train_epochs 1 --per_device_train_batch_size 64 --max_seq_length 512 --evaluation_strategy steps --metric_for_best_model stsb_spearman --load_best_model_at_end --eval_steps 25 --pooler_type cls --overwrite_output_dir --temp 0.05 --do_train --do_eval --fp16

It shows:
INFO - main - PyTorch: setting up devices
WARNING - main - Process rank: -1, device: cuda:0, n_gpu: 1 distributed training: False, 16-bits training: True
INFO - main - for parameters

Does anyone know if this is normal? After the output above, nothing else is printed. I waited for a while and then checked the task manager: the process seemed unresponsive, and both CPU and GPU usage were low. The same thing happens even with the default supervised example.

I tried two environments:
Python 3.9, transformers 4.2.1, torch 2.0.1+cu118
Python 3.8, transformers 4.2.1, torch 1.7.1+cu110 (the CUDA version installed on my machine is 11.8)
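
When a run goes silent like this, one way to see where it is stuck is to have Python dump every thread's stack after a timeout. The following is only a minimal sketch using the standard-library faulthandler module. It is not part of SimCSE, and the helper names watch_for_hang and stop_watching are hypothetical, made up for illustration.

```python
import faulthandler
import sys


def watch_for_hang(timeout_s, out=sys.stderr):
    """Dump the stack of every thread to `out` each time `timeout_s`
    seconds pass while the process is still running.

    `out` must be a real file object (it needs a file descriptor).
    Hypothetical helper, not a SimCSE or transformers function.
    """
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=out)


def stop_watching():
    """Cancel the watchdog once the slow section has finished."""
    faulthandler.cancel_dump_traceback_later()


# Assumed usage: call watch_for_hang(120) near the top of the training
# script's entry point, then call stop_watching() once training starts.
```

If the dump keeps showing the same frame (for example inside dataset loading or tokenization), that is the step that is slow or blocked, rather than the trainer itself.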
