chiayewken / Span-ASTE

Code Implementation of "Learning Span-Level Interactions for Aspect Sentiment Triplet Extraction".

The path 'model_path' cannot be found

lilililisa1998 opened this issue · comments

Excuse me, where is the "model_path" file generated? What is model.tar.gz?

Hi, can you give more information to help debug your case? For example, are you following these training steps? https://github.com/chiayewken/Span-ASTE#model-training

Excuse me, I ran into the same issue while debugging the process.
In main.py, after train() returns, accessing self.model_path raises an exception saying there is no model.tar.gz.
I can't figure out which part of the code generates model.tar.gz :(

Hi, we recently updated to a new and simpler API which uses wrapper.py and not main.py, can you try the training steps in this notebook? https://github.com/chiayewken/Span-ASTE/blob/main/demo.ipynb

from wrapper import SpanModel

# save_dir, random_seed, and the data paths are placeholders to fill in
model = SpanModel(save_dir=save_dir, random_seed=random_seed)
model.fit(path_train, path_dev)

model.predict(path_in=path_test, path_out=path_pred)
results = model.score(path_pred, path_test)

And here is another question from reading your paper; I would be grateful if you could help with it :D
In the mention module, you reduce the number of candidates under the supervision of the ATE and OTE tasks, but you give only one equation to calculate the score of the candidates. I don't understand this step well. Does it mean that, with the labeled data, we just compare which span is most similar to the gold labels in the dataset?

Thanks deeply for your reply!

With demo.ipynb I can now run the training process successfully. But I want to make some changes based on your code, so I can't just work with the API :( By the way, thanks for your help!

Hi, it is actually less complicated to modify the model code based on the new API in wrapper.py, as you just need to modify the config.jsonnet and the underlying span_model.py. The old API in main.py has the same logic, but is not as clean.

Firstly, we calculate the logits for each span (Opinion/Target/Invalid).
Based on equation 4, we can select the top candidates for target terms.
Separately, also based on equation 4, we can select the top candidates for opinion terms.
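The two steps above can be sketched as follows. This is a simplified illustration, not the repository's actual code: the feed-forward scorer is reduced to a single random linear layer, and the span features, class order (Invalid/Target/Opinion), and candidate count `k` are all assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 10 candidate spans, each with a pooled feature vector.
num_spans, dim = 10, 16
span_feats = rng.normal(size=(num_spans, dim))

# A feed-forward scorer produces one logit per class for each span:
# 0 = Invalid, 1 = Target (aspect), 2 = Opinion.
W = rng.normal(size=(dim, 3))
logits = span_feats @ W  # shape (num_spans, 3)

def top_candidates(logits, class_idx, k):
    """Keep the k spans whose logit for the given class is highest."""
    scores = logits[:, class_idx]
    return np.argsort(-scores)[:k]

# Target and opinion candidates are pruned separately from the same logits.
target_candidates = top_candidates(logits, class_idx=1, k=3)
opinion_candidates = top_candidates(logits, class_idx=2, k=3)
```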

Hey, thanks for your reply last time!
But I couldn't understand well how the type of a span or span pair is decided with a feed-forward network.
Does it mean the type is assigned randomly in the first training epoch, and by repeating this process we gradually learn the true type of the data?
I still have another problem. When using BERT embeddings, in the span representation part, do we need to do something with the dataset? BERT tokenization may split some words into multiple tokens, so the labels in the dataset would no longer line up.

Thank you a lot, and looking forward to your reply~

Hi, the span type (aspect or opinion) is not random as it is supervised by the mention loss which is included in the overall loss.
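To make the supervision concrete, here is a minimal sketch of a mention loss as a cross-entropy between the span classifier's logits and the gold span labels. The label scheme (0=Invalid, 1=Target, 2=Opinion) and the toy logits are assumptions for illustration, not values from the paper.

```python
import numpy as np

# Each candidate span has a gold label from the annotations, so the
# span-type predictions are supervised rather than random.
logits = np.array([[2.0, 0.1, -1.0],   # span scored mostly Invalid
                   [0.2, 1.5, 0.1],    # span scored mostly Target
                   [-0.5, 0.0, 2.2]])  # span scored mostly Opinion
gold = np.array([0, 1, 2])

def cross_entropy(logits, labels):
    # Numerically stable log-softmax, then mean negative log-likelihood.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

mention_loss = cross_entropy(logits, gold)
```

Minimizing this term (as part of the overall loss) is what pushes the feed-forward network toward the true span types over training.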

Regarding how to align the original tokens with the BERT tokens, we use the helper PretrainedTransformerMismatchedEmbedder class from AllenNLP, which can match the tokens. Hence, the output token representations will correspond to the original labels. (https://docs.allennlp.org/main/api/modules/token_embedders/pretrained_transformer_mismatched_embedder/)

Hi, I would like to ask about the alignment of the original tokens with the BERT tokens. How does the pretrained transformer mismatched embedder do this? Can it be used to match other embeddings? I cannot decipher the steps from the AllenNLP documentation.

Hi, based on the sentence processing function, the sentence is converted into a TextField where each input word is converted to a list of wordpiece ids by the token indexer (BERT tokenizer).
Because each input word can be converted to one or more ids by the token indexer, the mismatched embedder will aggregate the list of wordpiece BERT embeddings into a single embedding for each word.
By default, the aggregation method is to average the wordpiece embeddings to get the single word embedding.
Hence, the output word embeddings will match the word-level annotations in the dataset.
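The aggregation step above can be sketched in a few lines. The words, wordpiece offsets, and embeddings here are made-up toy values; the real offsets come from the tokenizer, and AllenNLP handles this internally.

```python
import numpy as np

# Toy example: "playing" is split into wordpieces ["play", "##ing"],
# so one word can own a span of wordpiece embeddings.
words = ["the", "playing", "field"]
# offsets[i] = (start, end) wordpiece indices for word i (end inclusive)
offsets = [(0, 0), (1, 2), (3, 3)]
wordpiece_embs = np.arange(4 * 2, dtype=float).reshape(4, 2)  # 4 pieces, dim 2

# Average each word's wordpiece embeddings into one word embedding,
# so the output lines up with the word-level labels in the dataset.
word_embs = np.stack([
    wordpiece_embs[start:end + 1].mean(axis=0) for start, end in offsets
])
```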

The tokenizer and transformer are specified in the config.jsonnet.
If you want to use other transformer models, you may be able to just change the model name if it is available in HuggingFace transformers (e.g. roberta-base).
If you want to implement your own embedding method, you need to define a custom embedder class; for example, we have some extra example code which demonstrates how to manipulate the transformer layer representations.

Excuse me, I am running into the model_path issue as well.
The problem is as follows (screenshot of the error attached).

Could you give me some suggestions? Thanks deeply for your time!

Hi, can you please provide more details on the error? For example, the exact commands used, as well as environment and hardware details. There may be an error message higher up which caused the model saving to fail. If possible, you can also try the new API, which is easier to use.

Hi, thanks again Ken. I looked through the AllenNLP tutorial and somehow managed to understand the part you mentioned earlier. Regarding BERT's last layer (the 12th layer), it is followed by next-sentence prediction, which comes before the span representation aggregation. Correct me if I'm wrong :) I am referring to another post (https://medium.com/analytics-vidhya/understanding-bert-architecture-3f35a264b187)

Hi, although BERT uses next-sentence and masked token prediction in the pre-training phase, these objectives are not used in the fine-tuning phase for ASTE. Instead, we input the sequence to BERT to obtain the contextualized token representations, which are then used to compute the span representations and so on here.
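A common way span-based models (including span-level ASTE models) build span representations from the contextualized token representations is to combine the boundary token vectors with a learned span-width embedding; the dimensions and the concatenation scheme below are assumptions for illustration, not the repository's exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# token_reprs stands in for BERT's contextualized output for one sentence.
seq_len, dim, max_width = 8, 4, 5
token_reprs = rng.normal(size=(seq_len, dim))
width_embs = rng.normal(size=(max_width + 1, 2))  # learned width embedding

def span_repr(start, end):
    """Concatenate start token, end token, and width embedding (inclusive ends)."""
    width = end - start + 1
    return np.concatenate([token_reprs[start],
                           token_reprs[end],
                           width_embs[width]])

rep = span_repr(2, 4)  # representation for the span covering tokens 2..4
```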