MIND-Lab / OCTIS

OCTIS: Comparing Topic Models is Simple! A Python package to optimize and evaluate topic models (accepted at the EACL 2021 demo track)

ETM training leading to NaN loss

PearlSikka opened this issue

  • OCTIS version: 1.10.4
  • Python version: 3.7.13
  • Operating System: Windows

Description

I'm running a topic model on tweets using the ETM model. During training, the loss becomes NaN in the first epoch, so training does not progress to further epochs. The ETM model is being trained with default parameters.

```python
from octis.models.ETM import ETM

model = ETM(num_topics=10)  # default parameters
output = model.train_model(dataset)  # dataset is a preprocessed OCTIS Dataset
```

Output:

```
Epoch: 1 .. batch: 20/25 .. LR: 0.005 .. KL_theta: nan .. Rec_loss: nan .. NELBO: nan
```


![tm_fail](https://user-images.githubusercontent.com/70057374/177056948-277f8d0f-9b57-4884-ab60-c79827ff5b8b.png)

Hello,
this is an issue related to the original implementation of ETM. We took the model and integrated it into OCTIS. Looking at a related issue in the original repo (adjidieng/ETM#3), it seems that lowering the learning rate could help. The other two parameters (bow_norm and activation_function) are fine at their defaults.
Otherwise, you can try a different model; e.g., CTM seems to work well on short texts such as tweets.
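
For reference, switching to CTM in OCTIS is essentially a one-line change; a minimal sketch, assuming the same preprocessed dataset object:

```python
from octis.models.CTM import CTM

# CTM pairs contextualized document embeddings with a neural topic model,
# which tends to cope better with short, sparse documents such as tweets
model = CTM(num_topics=10)
output = model.train_model(dataset)
```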

Let me know if it helps,

Silvia

Thank you, Silvia, for your quick response. I tried training ETM with a lower learning rate as well, but it still shows NaN loss. Maybe I can use the CTM model instead. Thanks again!

Okay, then I'll close the issue. Feel free to re-open it or open a new issue if you have other questions.