mayerantoine / clinical-adapter

GA Tech CS7643 Group Project implementing adapter-transformers for clinical information extraction and classification

clinical-adapter

GA Tech CS7643 Group Project implementing adapter-transformers for clinical entity extraction and assertion classification tasks.

Overleaf Report (read-only link): https://www.overleaf.com/read/nckbrhtkgsbc#8e823a

Example Overleaf (for reference): https://www.overleaf.com/project/5f5ec061aa94370001943266

Project Summary: Advancements in natural language processing (NLP) and natural language understanding (NLU) offer new and exciting applications in healthcare and public health. In particular, extracting key pieces of information from various types of health records and assessing the certainty of clinical statements are important tasks with applications in the medical industry, public health, and several fields of research. These domains, however, currently face a shortage of resources and techniques for efficiently solving the disparate and complex tasks involved in evaluating health records. One solution is transfer learning: leveraging pre-trained models from the Bidirectional Encoder Representations from Transformers (BERT) family and further fine-tuning them on healthcare-domain data to build task-specific models. Fully fine-tuning multiple models on specific tasks or subtasks, however, still requires substantial resources. In recent years, several new approaches to transfer and multitask learning using "adapter transformers" have been proposed. These approaches serve as parameter-efficient fine-tuning techniques, reducing the number of trainable parameters and the storage footprint of models. This project explores parameter-efficient fine-tuning with adapters and evaluates its application to multitask learning on two linked NLU tasks over healthcare records: clinical entity extraction and clinical assertion classification.
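To make the adapter idea concrete, here is a minimal sketch using the adapters library installed below (the model checkpoint, adapter name, and label count are illustrative placeholders; the project's actual setup lives in run_experiment.py):

from adapters import AutoAdapterModel, SeqBnConfig

# Load a BERT backbone with adapter support (checkpoint name is a placeholder;
# the project can also use a ClinicalBERT variant, selected via config.yaml).
model = AutoAdapterModel.from_pretrained("bert-base-uncased")

# Sequential bottleneck adapter, matching config.yaml (SeqBnConfig, reduction_factor: 64).
adapter_config = SeqBnConfig(reduction_factor=64)
model.add_adapter("clinical_ner", config=adapter_config)

# Token-classification head for entity extraction (num_labels is a placeholder).
model.add_tagging_head("clinical_ner", num_labels=7)

# Freeze the backbone and train only the adapter (and head) weights.
model.train_adapter("clinical_ner")
model.set_active_adapters("clinical_ner")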

To run on Colab

Upload run_experiment.py, utils.py, and config.yaml. Make sure you have access to the i2b2 data and upload it to Colab as well. You can change the configuration in the config file.

!pip install -q spacy
!pip install -q evaluate
!pip install -q datasets
!pip install -q accelerate
!pip install https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.4.0/en_ner_bc5cdr_md-0.4.0.tar.gz
!pip install -Uq adapters
!pip install -q seqeval
!pip install -q wandb
!python run_experiment.py

Config YAML file

Train:
  task: ast # ner or ast
  model: bert # bert or clinicalbert
  finetune: head # head or full
  lr: 0.00001 # 1e-5
  epochs: 2
  batch: 16
  weight_decay: 0.002
  adapter: False # True or False
  adapter_method: SeqBnConfig # SeqBnConfig or DoubleSeqBnConfig; see https://docs.adapterhub.ml/overview.html
  reduction_factor: 64
  logging_steps: 500
  hd: cpu # cpu, intel, or arm
  wandb: False # True or False
  wandb_api_key: # set your own API key here

data:
  i2b2: all # all or beth_and_partners
  frac: 0.1
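For reference, here is a minimal sketch of how a script like run_experiment.py could read this file with PyYAML (the variable names below are illustrative; the authoritative loading code is in run_experiment.py):

import yaml

# Parse config.yaml into a nested dict with top-level keys "Train" and "data".
with open("config.yaml") as f:
    cfg = yaml.safe_load(f)

train_cfg = cfg["Train"]
data_cfg = cfg["data"]

task = train_cfg["task"]           # "ner" or "ast"
lr = float(train_cfg["lr"])        # 1e-5 by default
use_adapter = train_cfg["adapter"] # True enables adapter fine-tuning
frac = data_cfg["frac"]            # fraction of the i2b2 data to use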

To run hyperparameter tuning on wandb

Change the hyperparameters defined in run_experiment.ipynb in the constant sweep_configuration, as illustrated in the sketch below. The current code only supports tuning hyperparameters that are included in the config file. When changing hyperparameters, make sure that each key in the sweep configuration matches the corresponding key in config.yaml, and that hyperparameters with int or float values are given values of the correct type.
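As an illustration, a sweep_configuration along these lines would tune lr and batch (the metric name, project name, and training function are placeholder assumptions; the real constant is defined in run_experiment.ipynb):

import wandb

# Keys under "parameters" must match the keys in config.yaml (e.g. lr, batch),
# and their values must have the right type (float for lr, int for batch).
sweep_configuration = {
    "method": "grid",
    "metric": {"name": "eval_f1", "goal": "maximize"},  # metric name is a placeholder
    "parameters": {
        "lr": {"values": [1e-5, 3e-5, 5e-5]},
        "batch": {"values": [16, 32]},
    },
}

sweep_id = wandb.sweep(sweep=sweep_configuration, project="clinical-adapter")
wandb.agent(sweep_id, function=train_fn)  # train_fn: your training entry point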
