chiayewken / Span-ASTE

Code Implementation of "Learning Span-Level Interactions for Aspect Sentiment Triplet Extraction".

The path 'model_path' cannot be found

lilililisa1998 opened this issue · comments

Excuse me, where is the "model_path" file generated? What is model.tar.gz?

Hi, can you give more information to help debug your case? For example, are you following these training steps? https://github.com/chiayewken/Span-ASTE#model-training

Excuse me, I ran into the same issue while debugging the process.
In main.py, after train() returns, accessing self.model_path raises an exception saying there is no model.tar.gz.
I can't figure out which part of the code generates model.tar.gz :(

Hi, we recently updated to a new and simpler API which uses wrapper.py and not main.py, can you try the training steps in this notebook? https://github.com/chiayewken/Span-ASTE/blob/main/demo.ipynb

from wrapper import SpanModel

# save_dir, random_seed, and the data paths are placeholders to fill in
model = SpanModel(save_dir=save_dir, random_seed=random_seed)
model.fit(path_train, path_dev)

model.predict(path_in=path_test, path_out=path_pred)
results = model.score(path_pred, path_test)

And here is another question from reading your paper; I would be grateful if you could help with it :D
In the mention module, you reduce the number of candidates under the supervision of the ATE and OTE tasks, but you give only one equation to calculate the score of the candidates. I don't understand this step well. Does it mean that, with the labeled data, we just compare which span is most similar to the gold labels in the dataset?

Thanks deeply for your reply!

With demo.ipynb I can now run the training process successfully. But I want to make some changes based on your code, so I can't just work with the API :( By the way, thanks for your help!

Hi, it is actually less complicated to modify the model code based on the new API in wrapper.py, as you just need to modify the config.jsonnet and the underlying span_model.py. The old API in main.py has the same logic, but is not as clean.

Firstly, we calculate the logits for each span (Opinion/Target/Invalid).
Based on equation 4, we can select the top candidates for target terms.
Separately, also based on equation 4, we can select the top candidates for opinion terms.
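The two steps above can be sketched as follows. This is a simplified illustration, not the repository's actual code: the feed-forward scorer is reduced to a single random linear layer, and the span features, class order (Invalid/Target/Opinion), and candidate count `k` are all assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 10 candidate spans, each with a pooled feature vector.
num_spans, dim = 10, 16
span_feats = rng.normal(size=(num_spans, dim))

# A feed-forward scorer produces one logit per class for each span:
# 0 = Invalid, 1 = Target (aspect), 2 = Opinion.
W = rng.normal(size=(dim, 3))
logits = span_feats @ W  # shape (num_spans, 3)

def top_candidates(logits, class_idx, k):
    """Keep the k spans whose logit for the given class is highest."""
    scores = logits[:, class_idx]
    return np.argsort(-scores)[:k]

# Target and opinion candidates are pruned separately from the same logits.
target_candidates = top_candidates(logits, class_idx=1, k=3)
opinion_candidates = top_candidates(logits, class_idx=2, k=3)
```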

Hey, thanks for your reply last time!
But I couldn't understand well how the type of a span or span pair is decided with a feed-forward network.
Does it mean the type is assigned randomly in the first training epoch, and by repeating this process we gradually learn the true type of the data?
I still have another problem. When using BERT embeddings, in the span representation part, do we need to do something with the dataset? BERT tokenization may split some words into multiple tokens, so the labels in the dataset would no longer line up.

Thank you a lot, and looking forward to your reply~

Hi, the span type (aspect or opinion) is not random as it is supervised by the mention loss which is included in the overall loss.
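To make the supervision concrete, here is a minimal sketch of a mention loss as a cross-entropy between the span classifier's logits and the gold span labels. The label scheme (0=Invalid, 1=Target, 2=Opinion) and the toy logits are assumptions for illustration, not values from the paper.

```python
import numpy as np

# Each candidate span has a gold label from the annotations, so the
# span-type predictions are supervised rather than random.
logits = np.array([[2.0, 0.1, -1.0],   # span scored mostly Invalid
                   [0.2, 1.5, 0.1],    # span scored mostly Target
                   [-0.5, 0.0, 2.2]])  # span scored mostly Opinion
gold = np.array([0, 1, 2])

def cross_entropy(logits, labels):
    # Numerically stable log-softmax, then mean negative log-likelihood.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

mention_loss = cross_entropy(logits, gold)
```

Minimizing this term (as part of the overall loss) is what pushes the feed-forward network toward the true span types over training.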

Regarding how to align the original tokens with the BERT tokens, we use the helper PretrainedTransformerMismatchedEmbedder class from AllenNLP, which can match the tokens. Hence, the output token representations will correspond to the original labels. (https://docs.allennlp.org/main/api/modules/token_embedders/pretrained_transformer_mismatched_embedder/)

Hi, I would like to ask about the alignment of the original tokens with the BERT tokens. How does the pretrained transformer mismatched embedder do this? Can it be used to match other embeddings? I cannot decipher the steps from the AllenNLP documentation.

Hi, based on the sentence processing function, the sentence is converted into a TextField where each input word is converted to a list of wordpiece ids by the token indexer (BERT tokenizer).
Because each input word can be converted to one or more ids by the token indexer, the mismatched embedder will aggregate the list of wordpiece BERT embeddings into a single embedding for each word.
By default, the aggregation method is to average the wordpiece embeddings to get the single word embedding.
Hence, the output word embeddings will match the word-level annotations in the dataset.
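The aggregation step above can be sketched in a few lines. The words, wordpiece offsets, and embeddings here are made-up toy values; the real offsets come from the tokenizer, and AllenNLP handles this internally.

```python
import numpy as np

# Toy example: "playing" is split into wordpieces ["play", "##ing"],
# so one word can own a span of wordpiece embeddings.
words = ["the", "playing", "field"]
# offsets[i] = (start, end) wordpiece indices for word i (end inclusive)
offsets = [(0, 0), (1, 2), (3, 3)]
wordpiece_embs = np.arange(4 * 2, dtype=float).reshape(4, 2)  # 4 pieces, dim 2

# Average each word's wordpiece embeddings into one word embedding,
# so the output lines up with the word-level labels in the dataset.
word_embs = np.stack([
    wordpiece_embs[start:end + 1].mean(axis=0) for start, end in offsets
])
```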

The tokenizer and transformer are specified in the config.jsonnet.
If you want to use other transformer models, you may be able to just change the model name if it is available in HuggingFace transformers (e.g. roberta-base).
If you want to implement your own embedding method, you need to define a custom embedder class; for example, we have some extra example code which demonstrates how to manipulate the transformer layer representations.

Excuse me, I am running into the model_path issue as well.
The problem is as follows (screenshot of the error attached).

Could you give me some suggestions? Thanks deeply for your time!

Hi, can you please provide more details on the error? For example, the exact commands used, as well as environment and hardware details. There may be an error message higher up which caused the model saving to fail. If possible, you can also try the new API, which is easier to use.

Hi, thanks again Ken. I looked through the AllenNLP tutorial and somehow managed to understand the part you mentioned earlier. Regarding BERT's last layer (the 12th layer), it is followed by next-sentence prediction, which comes before the span representation aggregation. Correct me if I'm wrong :) I am referring to another post (https://medium.com/analytics-vidhya/understanding-bert-architecture-3f35a264b187)

Hi, although BERT uses next-sentence and masked token prediction in the pre-training phase, these objectives are not used in the fine-tuning phase for ASTE. Instead, we input the sequence to BERT to obtain the contextualized token representations, which are then used to compute the span representations and so on here.
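A common way span-based models (including span-level ASTE models) build span representations from the contextualized token representations is to combine the boundary token vectors with a learned span-width embedding; the dimensions and the concatenation scheme below are assumptions for illustration, not the repository's exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# token_reprs stands in for BERT's contextualized output for one sentence.
seq_len, dim, max_width = 8, 4, 5
token_reprs = rng.normal(size=(seq_len, dim))
width_embs = rng.normal(size=(max_width + 1, 2))  # learned width embedding

def span_repr(start, end):
    """Concatenate start token, end token, and width embedding (inclusive ends)."""
    width = end - start + 1
    return np.concatenate([token_reprs[start],
                           token_reprs[end],
                           width_embs[width]])

rep = span_repr(2, 4)  # representation for the span covering tokens 2..4
```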