Should I apply softmax or log_softmax to my token_scores?
poteminr opened this issue
```python
embedded_text_input = self.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
embedded_text_input = self.dropout(F.leaky_relu(embedded_text_input))
token_scores = F.log_softmax(self.feedforward(embedded_text_input), dim=-1)
# or
# token_scores = self.feedforward(embedded_text_input)
loss, output_tags = self.apply_crf(token_scores, labels, attention_mask, batch_size=batch_size)
```
Should I apply softmax before passing token_scores to the CRF?
No, you don't have to. The CRF is already a (very large) softmax over all possible tag sequences.
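To see why the normalization doesn't matter, note that `log_softmax` along the tag dimension only subtracts a per-position constant (the log-sum-exp over tags) from every emission score. That constant shifts the score of *every* tag sequence by the same amount, so it cancels between the numerator and the partition function in the CRF likelihood. Here's a small sketch that checks this numerically with a brute-force linear-chain CRF (the function and variable names are illustrative, not from any particular CRF library):

```python
import itertools
import numpy as np

def crf_nll(emissions, transitions, tags):
    """Negative log-likelihood of a linear-chain CRF, computed by
    brute-force enumeration over all tag sequences (fine for tiny inputs)."""
    T, C = emissions.shape

    def seq_score(seq):
        s = emissions[0, seq[0]]
        for t in range(1, T):
            s += transitions[seq[t - 1], seq[t]] + emissions[t, seq[t]]
        return s

    # log-partition: log-sum-exp of scores over all C**T tag sequences
    all_scores = np.array(
        [seq_score(seq) for seq in itertools.product(range(C), repeat=T)]
    )
    log_z = np.logaddexp.reduce(all_scores)
    return log_z - seq_score(tags)

rng = np.random.default_rng(0)
T, C = 4, 3
emissions = rng.normal(size=(T, C))   # raw feedforward outputs ("logits")
transitions = rng.normal(size=(C, C))
tags = [0, 2, 1, 1]

# log_softmax per token = subtract a per-position constant from every tag
log_softmaxed = emissions - np.logaddexp.reduce(emissions, axis=1, keepdims=True)

nll_raw = crf_nll(emissions, transitions, tags)
nll_ls = crf_nll(log_softmaxed, transitions, tags)
assert np.isclose(nll_raw, nll_ls)  # the CRF loss is identical either way
```

So passing raw logits or log-softmaxed scores gives the same CRF loss; applying `log_softmax` first is just wasted computation.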
Thank you!