Context-aware DeepFM not learning
BianchiGiulia opened this issue
Hello!
I'm trying to run DeepFM with a custom subsample of the MIND dataset.
My files are:
- mind_small15.inter
- mind_small15.itememb
- mind_small15.useremb
I have very limited computing power, so I interrupted the training once I realized the model was not learning. Below are the validation scores of the first two epochs. As you can see, every evaluation gets worse. Any suggestions on what I might be doing wrong? Or is the model just too complex for my data (the user and item embeddings are TF-IDF vectors of the news titles)?
Any help is much appreciated,
Thank you in advance.
my model:
import torch
from torch import nn
from recbole.model.context_aware_recommender import DeepFM

class DeepFMCustom(DeepFM):
    def __init__(self, config, dataset):
        super(DeepFMCustom, self).__init__(config, dataset)
        pretrained_user_emb = dataset.get_preload_weight('uid')
        pretrained_item_emb = dataset.get_preload_weight('iid')
        self.user_embedding = nn.Embedding.from_pretrained(torch.from_numpy(pretrained_user_emb))
        self.item_embedding = nn.Embedding.from_pretrained(torch.from_numpy(pretrained_item_emb))
CONFIG:
# CONTEXT-AWARE RECOMMENDER
config_dict = {
'epochs': 10,
'data_path': '/Users/giulia/Desktop/tesi/',
'dataset': 'mind_small15', #or 'test_run'
'additional_feat_suffix': ['useremb', 'itememb'],
'load_col': {
'inter': ['user_id', 'item_id', 'label'],
'useremb': ['uid', 'user_emb'],
'itememb': ['iid', 'item_emb']},
'USER_ID_FIELD': 'user_id',
'ITEM_ID_FIELD': 'item_id',
'alias_of_user_id': ['uid'], #List of fields’ names, which will be remapped into the same index system with USER_ID_FIELD
'alias_of_item_id': ['iid'],
'preload_weight': {'uid': 'user_emb', 'iid': 'item_emb'},
'eval_args': {
'split': {'RS': [0.8, 0.1, 0.1]},
'group_by': 'user',
'order': 'RO', #there is no timestamp column
'mode': 'labeled'},
'model': DeepFMCustom,
'mlp_hidden_size': [32],
'dropout_prob': 0.1,
'learning_rate': 0.001,
'train_neg_sample_args': {'distribution': 'uniform', 'sample_num': 4, 'dynamic': True,'candidate_num': 3000},
#'neg_sampling': {'uniform': 1},
'device': device,
'embedding_size': 32,
'train_batch_size': 32,
'eval_batch_size': 32,
'l2_reg': 0.001,
'early_stopping_patience': 5,
'early_stopping_metric': 'AUC',
'checkpoint_dir': './saved',
'log_level': 'DEBUG',
'seed': 42,
'reproducibility': True,
'metrics': ["AUC", "MAE", "RMSE", "LogLoss"]
}
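For context, the additional feature files declared via `additional_feat_suffix` follow RecBole's atomic file format: tab-separated columns with typed headers, where an embedding column is a `float_seq` of space-separated values. A minimal sketch of what `mind_small15.useremb` would look like (the IDs and values here are made up for illustration):

```text
uid:token	user_emb:float_seq
u1	0.12 0.00 0.37 0.05
u2	0.00 0.41 0.09 0.33
```

The `preload_weight: {'uid': 'user_emb', ...}` entry then tells RecBole to expose these vectors through `dataset.get_preload_weight('uid')`, as used in the custom model above.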
INFO ON DATASET:
mind_small15
The number of users: 86339
Average actions of users: 182.97368482012556
The number of items: 26938
Average actions of items: 586.4640457363478
The number of inters: 15797582
The sparsity of the dataset: 99.320767816568%
Remain Fields: ['user_id', 'item_id', 'label', 'uid', 'user_emb', 'iid', 'item_emb']
TRAINING LOG:
Tue 30 Apr 2024 17:53:06 INFO [Training]: train_batch_size = [32] train_neg_sample_args: [{'distribution': 'uniform', 'sample_num': 1, 'dynamic': False, 'alpha': 1.0, 'candidate_num': 0}]
Tue 30 Apr 2024 17:53:06 INFO [Evaluation]: eval_batch_size = [32] eval_args: [{'split': {'RS': [0.8, 0.1, 0.1]}, 'order': 'RO', 'group_by': 'user', 'mode': {'valid': 'labeled', 'test': 'labeled'}}]
Tue 30 Apr 2024 20:12:11 INFO epoch 0 training [time: 8335.17s, train loss: 282374.3796]
Tue 30 Apr 2024 20:12:48 INFO epoch 0 evaluating [time: 37.40s, valid_score: 0.342700]
Tue 30 Apr 2024 20:12:48 INFO valid result:
auc : 0.3427 mae : 0.5572 rmse : 0.7153 logloss : 3.1073
Tue 30 Apr 2024 20:12:53 INFO Saving current: ./saved/DeepFMCustom-Apr-30-2024_17-53-15.pth
Tue 30 Apr 2024 22:21:00 INFO epoch 1 training [time: 7687.66s, train loss: 292717.1044]
Tue 30 Apr 2024 22:21:36 INFO epoch 1 evaluating [time: 36.03s, valid_score: 0.237500]
Tue 30 Apr 2024 22:21:36 INFO valid result:
auc : 0.2375 mae : 0.5547 rmse : 0.726 logloss : 6.0022
Hi!
Maybe I can help here: have you tried setting freeze=False?

From the PyTorch from_pretrained documentation:

    freeze (bool, optional) – If True, the tensor does not get updated in the learning process. Equivalent to embedding.weight.requires_grad = False. Default: True

Additionally, your model inherits from DeepFM and only adds the self.user_embedding and self.item_embedding attributes, which from a quick look are not used in the DeepFM model.
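To make the freeze behavior concrete, here is a small standalone sketch (the pretrained tensor is a random stand-in for `dataset.get_preload_weight('uid')`):

```python
import torch
from torch import nn

# Stand-in for the pretrained TF-IDF weights loaded from the dataset
pretrained = torch.randn(4, 8)

frozen = nn.Embedding.from_pretrained(pretrained)                 # default freeze=True
trainable = nn.Embedding.from_pretrained(pretrained, freeze=False)

print(frozen.weight.requires_grad)     # False: gradients never flow into these weights
print(trainable.weight.requires_grad)  # True: the weights can update during training
```

So with the default, the copied-in embeddings stay fixed for the whole run; `freeze=False` lets the optimizer fine-tune them.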
Hey! Thank you for your prompt reply!
I thought any context-aware algorithm could just use pre-trained embeddings as described in the docs, but a quick experiment with and without those embeddings gave exactly the same metric results, so I guess you're right and I should dive deeper into the original code 😅, or just choose a different algorithm.
Thank you so much!!