albert-pang / web-of-science-downloader

WOS Downloader App

Description

The WOS Downloader App automates downloading publication data from the Web of Science (WOS) platform. Given a WOS search link, it retrieves the matching publication records so that academics and data analysts do not have to export them by hand.

Features

  • Automated downloading of publication metadata, including the full record and cited references.
  • Support for data export in tab-delimited file format.
  • Streamlined, user-friendly interface for seamless user experience.
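
Because the export is tab-delimited, a downloaded file can be read with Python's standard csv module. The sketch below is illustrative and is not part of the app: the function name load_wos_records is hypothetical, and the column headers depend on what the export actually contains.

```python
import csv
from pathlib import Path


def load_wos_records(path):
    """Read a tab-delimited export into a list of dicts keyed by column header."""
    # utf-8-sig strips a leading byte-order mark if the file has one.
    with Path(path).open(newline="", encoding="utf-8-sig") as handle:
        # QUOTE_NONE keeps quote characters inside titles as literal text.
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return list(reader)
```

Each returned row is a dictionary, so a field can be looked up by its column header, for example row["TI"] if the file has a TI column.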

Framework and Technologies

The app leverages the following frameworks and technologies:

  • Python: For scripting and automation.
  • Streamlit: To construct the interactive web interface.
  • Selenium: For programmatically navigating the Web of Science website and extracting data.
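
As a rough illustration of the Selenium part, the sketch below shows how Selenium 4 can be pointed at a specific Chrome binary and Chromedriver and told where to save downloads. It is not taken from app.py: the function names chrome_prefs and make_driver and all three path arguments are assumptions made for this example.

```python
from pathlib import Path


def chrome_prefs(download_dir):
    """Chrome preferences that send downloads to download_dir without a prompt."""
    return {
        "download.default_directory": str(Path(download_dir).resolve()),
        "download.prompt_for_download": False,
    }


def make_driver(chrome_binary, chromedriver_path, download_dir):
    """Start Chrome through Selenium 4 with an explicit browser and driver."""
    # Imported here so chrome_prefs can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    options.binary_location = str(chrome_binary)  # hypothetical path to Chrome
    options.add_experimental_option("prefs", chrome_prefs(download_dir))
    service = Service(executable_path=str(chromedriver_path))  # hypothetical path
    return webdriver.Chrome(service=service, options=options)
```

Passing the browser and driver paths explicitly is one way to keep Selenium from picking up a different Chrome installation on the machine.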

Installation

Prerequisites

  • Install Anaconda to manage packages and environments for Python.
  • Ensure Python is installed on your system (Python 3.9 and above is recommended).

Setup

  1. Clone the repository to your local machine.
  2. Open Anaconda Prompt and navigate to the cloned repository's directory.
  3. Create a new environment from the wos.yml file by executing conda env create -f wos.yml. This installs all necessary dependencies into the environment.
  4. Activate the newly created environment with conda activate wos.
  5. Please ensure that you download the specified versions of Chrome and Chromedriver via the provided Google Drive link. This application is configured to work with particular versions of both to avoid mismatches between the browser and the webdriver, which could lead to unexpected errors.
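
To illustrate the kind of mismatch step 5 guards against, here is a minimal sketch that compares the major version numbers found in two version strings. It is not part of the app; the function names are hypothetical, and comparing only the major version is a simplifying rule of thumb rather than the exact compatibility rule.

```python
import re


def major_version(version_output):
    """Return the leading major version number found in a version string."""
    match = re.search(r"(\d+)\.\d+", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return int(match.group(1))


def versions_match(chrome_output, chromedriver_output):
    """True when Chrome and Chromedriver report the same major version."""
    return major_version(chrome_output) == major_version(chromedriver_output)
```

For example, versions_match("Google Chrome 120.0.6099.109", "ChromeDriver 120.0.6099.109") is true, while a browser on 121 paired with a driver on 120 is reported as a mismatch. The version strings here are made up for the example.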

Folder Structure

Below is the folder structure of the WOS Downloader:

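  • /Chrome: Folder containing the required version of Chrome.
  • /Chromedriver: Folder containing the matching version of Chromedriver.
  • app.py: The main Python script that runs the WOS Downloader.
  • launcher.py: The script used to launch app.py.
  • wos.yml: The Conda environment file used to create the wos environment.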
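
The contents of launcher.py are not reproduced here. As a hedged sketch of how a launcher script can start a Streamlit app such as app.py, the example below builds a python -m streamlit run command with the current interpreter. The function names build_launch_command and launch are hypothetical and the real launcher may work differently.

```python
import subprocess
import sys
from pathlib import Path


def build_launch_command(app_path="app.py"):
    """Build the command that starts a Streamlit app with the current interpreter."""
    return [sys.executable, "-m", "streamlit", "run", str(Path(app_path))]


def launch(app_path="app.py"):
    """Start the Streamlit app and block until it exits; returns the exit code."""
    return subprocess.call(build_launch_command(app_path))
```

Using sys.executable means the app runs with the same Python interpreter that ran the launcher, which matters when the wos environment is the one that has Streamlit installed.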

Usage

  1. To start the WOS Downloader App, run the launcher.py script, which launches app.py (make sure you're using a dedicated terminal with the wos environment activated).
  2. The Streamlit UI (Homepage) should open in a pop-up window.
  3. You can access the app through your web browser at http://localhost:8501 as long as the script is running.
  4. Input your search link or URL (WOS) into the app's search bar.
  5. Initiate the search by pressing Start, and sign in to WOS if prompted (this is needed when you are on a network outside the University's).
  6. Find the downloaded files in your Downloads folder.
  7. If you need to do another search, just input your next link and press Start again.

Important Notice for External Network Users:

When using the WOS Downloader App from a network outside the University, end your session and log out of any active WOS services before initiating a download with this application. Otherwise, the app may land on a page that detects residual activity from the earlier session, which can cause errors.

In the event that you encounter an error, please take the following steps:

  1. Refresh the WOS Downloader App interface.
  2. Confirm that you have completed the End your session and log out process from all WOS services.
  3. Proceed with your download again.

Acknowledgements

  • Thanks to Streamlit for powering the interactive web interface.
  • Gratitude to Selenium for enabling comprehensive web scraping capabilities.
