baidu-research / tripmaster


Cannot use tripmaster TMSuperviseLearner to go through complete learning pipeline

YvonneYang1234 opened this issue

I cannot get tripmaster's TMSuperviseLearner to run through the complete learning pipeline. After the "Task" module runs, the program exits without any error or warning. With the Pangu package, however, the pipeline runs to completion.
The log is as follows:

[2023-03-10 02:00:39] DEBUG: Logging queue listener started!
[2023-03-10 02:01:10] INFO: 1 samples loaded

My Application is a subclass of "TMStandaloneApp". My LearningSystem is a subclass of "TMSystem". My Learner is a subclass of "TMSuperviseLearner", which is imported from tripmaster.core.components.operator.supervise.

My config YAML is as follows:

config:
io:
input:
task:
train_sample_ratio_for_eval: 0
serialize:
save: false
path: ${job.startup_path}/doc_hoia_task_data.pkl
load: false

problem:
  train_sample_ratio_for_eval:  0
  serialize:
    save: false
    path: ${job.startup_path}/doc_hoia_problem_data.pkl
    load: false

launcher:
type: local
strategies:
local:

job:
ray_tune: false

startup_path: ""
testing: false
test:
validate: False
sample_num: 10
epoch_num: 10
batching:
type: fixed_size
strategies:
fixed_size:
batch_size: 1
drop_last: False
# parallel: single
dataloader:
worker_num: 0 # load data using multi-process
pin_memory: false
timeout: 0
resource_allocation_range: 10000
drop_last: False
train_eval_sampling_ratio: 0
resource:
computing:
cpu_per_trial: 1
cpus: 4
gpu_per_trial: 0
gpus: 0
memory:
inferencing_memory_limit: 1000
learning_memory_limit: 1000
distributed: "no"
metric_logging:
type: tableprint
strategies:
tableprint: { }
tensorboard:
path: "metrics"

system:
serialize:
save: true
path: ${job.startup_path}/doc_hoia.system.pkl

load: false

task:
evaluator: {} # define raw evaluator?
tp_modeler:

sentence_processor:
  require_words: True
  provide_words: True
  add_bert_tokens: True
  spacy_language_pack: "en_core_web_sm"
  tokenizer:
    pretrained_tokenizer_path: "ernie-3.0-base-zh" #"ernie-3.0-mini-zh"

problem:
evaluator:
machine:
arch:
pretrained:
model_path: "ernie-3.0-base-zh" #"ernie-3.0-mini-zh"
voc_size: null
decoder:
all_copy: true
anno_hidden_size: 768
arc_hidden_size: 128
beam_size: 1
cross_attn: false
dropout: 0
input_size: 768
rel_hidden_size: 768
edge_embedding_dims: 128
label2id_path: ${job.startup_path}/label2id.yaml
loss:
interpolation: 0.5
alpha: 1.0
beta: 1.0
lamb: 1.0
evaluator:
average: "weighted"
num_edge_types: 67
learner:
optimizer:
strategy:
epochs: 1
algorithm:
pretrained_embedding:
lr: 5e-5
decoder:
lr: 1e-4

  lr_scheduler:
    gamma: 0.9
  gradient_clip_val: 1.
modelselector:
  stage: "problem"
  channel: "dev"
  metric: "span_split_prediction.Acc"
  # metric: "span_type_prediction.Accuracy"
  better: "max"
  save_prefix: "best"
evaluator_trigger:
  interval: 1

repo:
server: "http://public.bcc-bdbl.baidu.com:8000"
local_dir: ${job.startup_path}/pangu

I think your System class should subclass TMSuperviseSystem rather than TMSystem.
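
A quick way to confirm which base class a system actually inherits from is to inspect its method resolution order. The sketch below is illustrative only: `TMSystem` and `TMSuperviseSystem` are local stand-ins, not imports from tripmaster (the real import path for `TMSuperviseSystem` is not given in this thread), and the assumption that `TMSuperviseSystem` derives from `TMSystem` is mine. `LearningSystemBefore`, `LearningSystemAfter`, and `has_base_named` are hypothetical names.

```python
# Stand-in classes for illustration only. In a real project these would be
# imported from tripmaster; the import path is not shown in this thread.
class TMSystem:
    """Stand-in for the generic system base class."""


class TMSuperviseSystem(TMSystem):
    """Stand-in for the supervised system base class.

    Assumption: it derives from TMSystem. This is not confirmed by the thread.
    """


class LearningSystemBefore(TMSystem):
    """The setup described in the issue: subclassing TMSystem directly."""


class LearningSystemAfter(TMSuperviseSystem):
    """The setup suggested in the reply: subclassing TMSuperviseSystem."""


def has_base_named(cls, base_name):
    """Return True if any ancestor of cls (excluding cls itself) is named base_name."""
    return any(base.__name__ == base_name for base in cls.__mro__[1:])


print(has_base_named(LearningSystemBefore, "TMSuperviseSystem"))  # False
print(has_base_named(LearningSystemAfter, "TMSuperviseSystem"))   # True
```

Running the same check on your own LearningSystem class, with the real tripmaster classes imported, should show whether the suggested base class is in its ancestry.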