geoffreyweal / Literature_Mining_Tutorial

This repository contains all the notebooks and information required for the iDM Literature Mining Tutorial.

Literature Mining Tutorial for the Centre of Integrated Data Materials Science (iDM)

Author: Dr. Geoffrey Weal [1,2,3] (geoffrey.weal@vuw.ac.nz)

Advisers: Dr. Paul Hume [1,2,3], Dr. Daniel Packwood [3,4], Prof. Justin Hodgkiss [1,2,3]

Editors: Dr. Daniel Packwood [3,4], Dr. Chayanit (First) Wechwithayakhlung [3,4], Olivia Sato [1,2], Dr. Caitlin Casey-Stevens [1,2]

  1. Victoria University of Wellington, Wellington, New Zealand
  2. The MacDiarmid Institute for Advanced Materials and Nanotechnology, Wellington, New Zealand
  3. Centre of Integrated Data Materials Science, Yoshida-Ushinomiya-cho, Sakyo-ku, Kyoto, Japan
  4. Institute for Integrated Cell-Material Sciences, Kyoto University, Yoshida-Ushinomiya-cho, Sakyo-ku, Kyoto, Japan

About

This tutorial will show you how to mine data from scientific papers using Python. We will use scripts for gathering papers, extracting and analysing text, and for highlighting keywords.

No prior programming knowledge is required for this tutorial.

Google Colab and Python

Google Colab is an online tool for creating notebooks in which you can write Python, Julia, and R code alongside notes explaining what the code does.

We will be running this tutorial in Google Colab, using Python to run commands. Python is a programming language that contains lots of helpful packages for mining data from scientific papers.

Instructions

Click on each notebook below that you want to open in Google Colab. These notebooks are designed to be completed in order.

RECOMMENDED: Right-click each link as you reach it and open it in a new browser tab.

  1. Gathering Papers in Bulk
  2. Extracting Text from PDF using Python
  3. Highlighting Keywords in PDFs using Python
  4. Putting It All Together
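
To give a feel for the keyword step before you open the notebooks, here is a minimal sketch of counting and marking keywords in text that has already been extracted from a paper. It uses only the Python standard library. The function names `count_keywords` and `mark_keywords`, the sample text, and the keyword list are illustrative assumptions and are not taken from the notebooks, which may take a different approach.

```python
import re
from collections import Counter


def count_keywords(text, keywords):
    """Count case-insensitive, whole-word occurrences of each keyword.

    Hypothetical helper for illustration only.
    """
    counts = Counter()
    for keyword in keywords:
        # \b anchors the match to word boundaries, so "solar cell"
        # does not match inside "solar cells".
        pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
        counts[keyword] = len(pattern.findall(text))
    return counts


def mark_keywords(text, keywords, marker="**"):
    """Return the text with every keyword occurrence wrapped in `marker`.

    Hypothetical helper for illustration only.
    """
    # Try longer keywords first so multi-word phrases win over shorter ones.
    ordered = sorted(keywords, key=len, reverse=True)
    alternatives = "|".join(re.escape(keyword) for keyword in ordered)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(lambda match: marker + match.group(0) + marker, text)


# Made-up sample text standing in for text extracted from a paper.
sample_text = (
    "Perovskite solar cells are efficient. The perovskite layer absorbs "
    "light; solar cell stability remains a challenge."
)
keywords = ["perovskite", "solar cell", "stability"]

print(count_keywords(sample_text, keywords))
print(mark_keywords(sample_text, keywords))
```

Whole-word matching is a deliberate simplification: it avoids false hits inside longer words but also misses plurals, so a real keyword list would need to include each variant you care about.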

Issues

If you run into any issues when running this tutorial on your own, feel free to get in touch at geoffrey.weal@vuw.ac.nz.

The Centre of Integrated Data Materials Science (iDM)

The Centre of Integrated Data Materials Science (iDM) is a new relationship between the Institute for Integrated Cell-Material Sciences (iCeMS) at Kyoto University and the MacDiarmid Institute for Advanced Materials and Nanotechnology, involving universities from across New Zealand. The iDM aims to deepen the paradigm of data-driven materials science and to establish a next-generation materials development process. See the following websites for more information:

License: GNU Affero General Public License v3.0

