zerohd4869 / SACL

The repository for ACL 2023 paper "Supervised Adversarial Contrastive Learning for Emotion Recognition in Conversations", and SemEval@ACL 2023 paper "UCAS-IIE-NLP at SemEval-2023 Task 12: Enhancing Generalization of Multilingual BERT for Low-resource Sentiment Analysis"


Hello! I have a few questions for you about the code:

WISH-WEI opened this issue · comments

Hello! I have a few questions for you about the code:

  1. Which part of the code implements the CAT processing?
  2. If I want to swap in a different feature processor, on what basis should emb_names be modified?

I would greatly appreciate any advice you could give!

Is the pgd function unused? Is adversary_flag set to True in the official runs?


  1. The processing of CAT consists of two parts: 1) generating an adversarial perturbation r, which can be found in the code block guarded by adversary_flag=True (with at_method defaulting to FGM); 2) adding r to the weights of the hidden layers that model the context, then performing gradient updates on the original optimization objective. This second part is mainly reflected in the setting of the emb_names variable, which specifies where the perturbation r takes effect. As shown in Figure 2(a) of the paper, all parameterized hidden layers related to the input signal u in the multi-channel network it passes through are included in emb_names.

  2. The selection of emb_names: if the input is context-dependent, you can model the context with a sequence network such as an LSTM/GRU and follow our settings directly; if the input is context-independent, e.g. using BERT to encode a single sentence, you can simply place the perturbation on the embedding layer or on the lowest 1-2 hidden layers.
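The two-part procedure above (perturb the parameters named in emb_names, take a second backward pass, then restore) can be sketched in PyTorch. This is an illustrative FGM-style helper, not the repository's actual code; the class name FGM, the Tiny model, and the emb_names value are assumptions made for the demo:

```python
import torch
import torch.nn as nn


class FGM:
    """FGM-style adversarial helper: adds r = epsilon * g / ||g|| to the
    weights of every parameter whose name matches an entry in emb_names,
    and restores the original weights afterwards."""

    def __init__(self, model, emb_names=("rnn",), epsilon=1.0):
        self.model = model
        self.emb_names = emb_names
        self.epsilon = epsilon
        self.backup = {}

    def attack(self):
        for name, param in self.model.named_parameters():
            if param.requires_grad and any(e in name for e in self.emb_names):
                self.backup[name] = param.data.clone()  # save original weights
                norm = torch.norm(param.grad)
                if norm != 0 and not torch.isnan(norm):
                    # perturb along the normalized gradient direction
                    param.data.add_(self.epsilon * param.grad / norm)

    def restore(self):
        for name, param in self.model.named_parameters():
            if name in self.backup:
                param.data.copy_(self.backup[name])
        self.backup = {}


if __name__ == "__main__":
    torch.manual_seed(0)

    class Tiny(nn.Module):
        """A toy context-dependent model: a GRU models the context."""
        def __init__(self):
            super().__init__()
            self.rnn = nn.GRU(4, 4, batch_first=True)
            self.out = nn.Linear(4, 2)

        def forward(self, x):
            h, _ = self.rnn(x)
            return self.out(h[:, -1])

    model = Tiny()
    fgm = FGM(model, emb_names=("rnn",), epsilon=0.5)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    x, y = torch.randn(3, 5, 4), torch.randint(0, 2, (3,))

    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()                                   # gradients of the clean loss
    fgm.attack()                                      # perturb only the "rnn" parameters
    adv_loss = nn.functional.cross_entropy(model(x), y)
    adv_loss.backward()                               # accumulate adversarial gradients
    fgm.restore()                                     # remove the perturbation
    opt.step()                                        # update the unperturbed weights
```

Note that restore() must run before optimizer.step() so the update is applied to the original, unperturbed weights.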

Is the pgd function unused? Is adversary_flag set to True in the official runs?

For the results in the paper, CAT is implemented on top of FGM: in the official runs, adversary_flag is set to True and at_method is set to FGM, as detailed in the corresponding run script. CAT adapts traditional AT methods to context-dependent scenarios and can be used in conjunction with other adversarial methods such as PGD. FGM has the advantages of simple implementation, few hyperparameters, and fast training.
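For reference, PGD would replace FGM's single perturbation step with several projected steps. A minimal sketch of one inner step follows; the function name and default values are illustrative, not taken from the repository:

```python
import torch


def pgd_step(r, grad, alpha=0.3, epsilon=1.0):
    """One PGD inner step: move the accumulated perturbation r along the
    normalized gradient by step size alpha, then project r back into the
    L2 ball of radius epsilon. FGM is the special case of a single,
    unprojected step with alpha = epsilon."""
    norm = torch.norm(grad)
    if norm != 0 and not torch.isnan(norm):
        r = r + alpha * grad / norm      # ascend along the gradient direction
    r_norm = torch.norm(r)
    if r_norm > epsilon:
        r = epsilon * r / r_norm         # project back into the epsilon-ball
    return r
```

Iterating this step a few times per batch yields a stronger perturbation than FGM at the cost of extra forward/backward passes, which is why FGM's speed and simplicity can be attractive in practice.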

Thank you for your interest in our work. If you run into any questions while using the code, please feel free to reach out; we will do our best to help. If there is no timely response here, we recommend contacting me by email at hudou@iie.ac.cn.