Trusted-AI / adversarial-robustness-toolbox

Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams

Home Page: https://adversarial-robustness-toolbox.readthedocs.io/en/latest/

Implement HuggingFace Language Modeling Estimators

f4str opened this issue

Is your feature request related to a problem? Please describe.
The next step of integrating HuggingFace into ART is to add support for the language modeling estimators. This involves creating ART estimator wrappers for the HuggingFace text models. These estimators should support

Describe the solution you'd like
A new module, art.estimators.language_modeling, will be created to hold all of the new HuggingFace language modeling estimators.

A new estimator will be created for each language modeling task (e.g., masked LM, sequence classification, next sentence prediction, etc.), because the expected input and output differ from task to task. Each estimator will be named after its task (e.g., HuggingFaceMaskedLM, HuggingFaceSequenceClassificationLM, HuggingFaceNextSequencePredictionLM, etc.).

Each estimator will take in a HuggingFace model and the corresponding tokenizer, so the model and tokenizer are coupled in the same wrapper. This is the simplest approach, since the tokenizer is specific to the text model and is not very useful on its own for ART's use cases.
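A minimal sketch of what coupling the model and tokenizer in one wrapper could look like. The class name `CoupledLMEstimator` and its `predict` method are hypothetical and not part of ART, and the toy tokenizer and model are stand-ins so the example runs without downloading anything:

```python
from typing import Any, Callable, Dict, List


class CoupledLMEstimator:
    """Hypothetical wrapper (not an ART class) that keeps a model and its tokenizer together."""

    def __init__(
        self,
        model: Callable[..., Any],
        tokenizer: Callable[[List[str]], Dict[str, Any]],
    ) -> None:
        self.model = model
        self.tokenizer = tokenizer

    def predict(self, texts: List[str]) -> Any:
        # Tokenize raw text with the tokenizer that matches the model,
        # then pass the encoded inputs to the model as keyword arguments.
        encoded = self.tokenizer(texts)
        return self.model(**encoded)


# Toy stand-ins (assumptions for illustration only, not HuggingFace objects).
def toy_tokenizer(texts: List[str]) -> Dict[str, Any]:
    vocab = {"hello": 1, "world": 2}
    # Words outside the toy vocabulary map to id 0.
    return {"input_ids": [[vocab.get(w, 0) for w in t.lower().split()] for t in texts]}


def toy_model(input_ids: List[List[int]]) -> List[int]:
    # The "model output" here is just the sum of token ids per sequence.
    return [sum(ids) for ids in input_ids]


estimator = CoupledLMEstimator(toy_model, toy_tokenizer)
print(estimator.predict(["hello world", "hello there"]))  # prints [3, 1]
```

Callers only ever pass raw text, so they cannot pair a model with the wrong tokenizer. A real HuggingFace tokenizer and model would probably need a thin adapter to set padding and tensor options before being passed in.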

Describe alternatives you've considered
The tokenizer can be made its own standalone module that is passed in to the ART wrapper. However, the tokenizer by itself is not very useful since it is dependent on the model (BERT, GPT-2, T5, etc.) and adds unnecessary complexity to creating the language model. If needed, the tokenizer can always be decoupled from the model and made standalone at a later point.

Additional context
The names of the module and estimators are not finalized and are open to suggestions.

Hi @f4str, incorporating NLP into ART isn't a bad idea! I hope the goal will be to backdoor or poison these LLMs in order to better understand their potential vulnerabilities and flaws? Because if the goal is simply to add HuggingFace models that are based on pre-trained models that are themselves vulnerable ...

In short, one technical point to bear in mind: a standalone tokenizer is, I think, less useful for ART's use cases because it is specific to a particular HuggingFace model, and decoupling it may add unnecessary complexity to the estimator creation process.

Possible improvements: a mechanism for dynamically selecting the appropriate tokenizer based on the specified model, and automatic model loading to streamline model preparation. The loading could integrate with ART's tuning capabilities so that new HuggingFace models and tasks, which change on an almost monthly or quarterly basis, can be supported without disrupting ART's existing structure.
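A rough sketch of the dynamic selection idea, assuming the `transformers` library is installed when loading is actually requested. The task keys and both helper functions are hypothetical. The `AutoTokenizer` and `AutoModelFor...` classes are existing `transformers` classes that pick the concrete implementation from the checkpoint name:

```python
from typing import Any, Tuple

# Hypothetical task keys mapped to names of existing transformers Auto classes.
TASK_TO_AUTO_CLASS = {
    "masked_lm": "AutoModelForMaskedLM",
    "sequence_classification": "AutoModelForSequenceClassification",
    "next_sentence_prediction": "AutoModelForNextSentencePrediction",
}


def resolve_auto_class_name(task: str) -> str:
    """Return the Auto class name for a task, or raise ValueError if unsupported."""
    try:
        return TASK_TO_AUTO_CLASS[task]
    except KeyError:
        raise ValueError(f"Unsupported task: {task!r}") from None


def load_model_and_tokenizer(model_name: str, task: str) -> Tuple[Any, Any]:
    """Load a model and its matching tokenizer from the same checkpoint name."""
    # Imported lazily so the mapping above is usable without transformers installed.
    import transformers

    auto_class = getattr(transformers, resolve_auto_class_name(task))
    model = auto_class.from_pretrained(model_name)
    # AutoTokenizer selects the tokenizer class that belongs to the checkpoint.
    tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
    return model, tokenizer
```

For example, `load_model_and_tokenizer("bert-base-uncased", "masked_lm")` would download that checkpoint and return a matched pair. Supporting a new task would then only mean adding one entry to the mapping.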

Thanks! :)