aadarshsingh191198 / glossary_builder

DEBUG : Definition Extraction for Building Useful Glossaries

This project explores definition extraction, which is split into two tasks:

  1. Deciding whether a sentence contains a definition.
  2. Tagging the tokens of a sentence that contains a definition.
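
The two tasks above can be sketched as a pair of functions. This is only an illustration: the cue-phrase regex, the function names and the TERM/DEF/O labels are assumptions made for this example and stand in for the trained models, which the project keeps elsewhere.

```python
import re
from typing import List, Tuple

# Hypothetical cue phrases; a trained classifier would replace this heuristic.
CUE = re.compile(
    r"\b(is|are)\s+(a|an|the)\b|\brefers to\b|\bis defined as\b",
    re.IGNORECASE,
)


def contains_definition(sentence: str) -> bool:
    """Task 1: decide whether the sentence contains a definition (toy rule)."""
    return CUE.search(sentence) is not None


def tag_tokens(sentence: str) -> List[Tuple[str, str]]:
    """Task 2: label each whitespace token as TERM, DEF or O (toy rule)."""
    match = CUE.search(sentence)
    if match is None:
        return [(token, "O") for token in sentence.split()]
    before = sentence[:match.start()].split()   # text left of the cue
    cue = match.group(0).split()                # the cue phrase itself
    after = sentence[match.end():].split()      # text right of the cue
    return (
        [(token, "TERM") for token in before]
        + [(token, "O") for token in cue]
        + [(token, "DEF") for token in after]
    )
```

Only sentences accepted by the first function would normally be passed to the second, so the tagger does not have to deal with sentences that define nothing.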

Datasets

  1. WCL corpus [http://lcl.uniroma1.it/wcl/]
  2. DEFT corpus [https://github.com/adobe-research/deft_corpus]

The source code for training the models is in notebooks, and the models pretrained on these datasets are in models.

The end-to-end pipeline is in the definition_extractor.py script.
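
As a rough idea of the last step of such a pipeline, the sketch below turns tagged sentences into glossary entries. It does not reproduce definition_extractor.py: the function name, the TERM/DEF/O tag set and the input shape are assumptions made for this example.

```python
from typing import Dict, Iterable, List, Tuple

# One tagged sentence: a list of (token, tag) pairs, tag in {"TERM", "DEF", "O"}.
Tagged = List[Tuple[str, str]]


def build_glossary(tagged_sentences: Iterable[Tagged]) -> Dict[str, str]:
    """Collect term -> definition pairs from tagged sentences."""
    glossary: Dict[str, str] = {}
    for tagged in tagged_sentences:
        term = " ".join(token for token, tag in tagged if tag == "TERM")
        definition = " ".join(token for token, tag in tagged if tag == "DEF")
        if term and definition:
            # Keep the first definition seen for a term; later ones are ignored.
            glossary.setdefault(term, definition)
    return glossary
```

Keeping the first definition per term is an arbitrary choice for the example; a real glossary builder might rank or merge competing definitions instead.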

Setting up the server - Ubuntu 16.04 with a Tesla K80 GPU

  1. Install Python 3.6
       sudo add-apt-repository ppa:deadsnakes/ppa
       sudo apt-get update
       sudo apt-get install python3.6

  2. Install CUDA 10.1 - Reference
       wget --header="Connection: keep-alive" "https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_10.1.105-1_amd64.deb" -c -O 'cuda-repo-ubuntu1604_10.1.105-1_amd64.deb'
       sudo dpkg -i cuda-repo-ubuntu1604_10.1.105-1_amd64.deb
       sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
       sudo apt-get update
       sudo apt-get install cuda

  3. Install conda - Reference 1 and 2
       wget https://repo.anaconda.com/archive/Anaconda3-2020.02-Linux-x86_64.sh
       bash Anaconda3-2020.02-Linux-x86_64.sh
       source ~/.bashrc

  4. Install Git LFS - Reference
       curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
       sudo apt-get install git-lfs

  5. Clone the GitHub repository (the traditional git clone didn't appear to work) - SO Answer
       git lfs clone https://github.com/aadarshsingh191198/glossary_builder.git

  6. Create a conda environment
       conda create --name tf1_15 python=3.6
       conda activate tf1_15

  7. Install TensorFlow 1.15 - Why use conda for tensorflow?
       conda install tensorflow==1.15

  8. Install the remaining dependencies
       pip install -r requirements.txt

  9. Install spaCy's small English model
       python -m spacy download "en_core_web_sm"

  10. Run the server
       python app.py

  11. Set up the production server - Reference
       sudo apt-get install nginx-full
       sudo /etc/init.d/nginx start

       # Remove the default configuration file
       sudo rm /etc/nginx/sites-enabled/default

       # Create a new site configuration file
       sudo touch /etc/nginx/sites-available/flask_project
       sudo ln -s /etc/nginx/sites-available/flask_project /etc/nginx/sites-enabled/flask_project

       # Edit the configuration file
       sudo nano /etc/nginx/sites-enabled/flask_project

       # Copy and paste the following configuration
       server {
           location / {
               proxy_pass http://0.0.0.0:8000;
           }
       }
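
Once the steps above are done, a short smoke check can confirm that the key packages import and that something is listening on the port nginx forwards to. The sketch below uses only the standard library. The function names are hypothetical, and it assumes the app started by app.py listens on port 8000, as the proxy_pass line suggests.

```python
import importlib
import socket
from typing import Dict, Iterable


def module_versions(names: Iterable[str]) -> Dict[str, str]:
    """Try to import each module and report its version, or 'missing'."""
    report: Dict[str, str] = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "missing"
        else:
            # Not every module exposes __version__, so fall back to 'unknown'.
            report[name] = str(getattr(module, "__version__", "unknown"))
    return report


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example usage inside the tf1_15 environment (port 8000 is an assumption):
#   print(module_versions(["tensorflow", "spacy"]))
#   print(port_is_open("127.0.0.1", 8000))
```

If the port check fails while app.py is running, the app is probably bound to a different port than the one in the nginx configuration.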

Other reference links:

  1. Using conda with pip - 1 and 2
  2. Exposing GCE ports
  3. Increasing GPU quota on GCE. Side note: check out the comment section too.
