drskalman / scientific-paper-pdf-rename

A simple tool for renaming scientific paper files according to their titles.

Scientific Papers PDF Rename

A simple command-line tool to rename ALL your scientific papers.

Now you can organize and rename ALL those scientific papers that you downloaded at some point but never had the time to sort and rename one by one in your directories.

Introduction

Scientific Paper PDF Rename (sci_rename for short) is a command-line tool that helps you rename all your scientific papers according to their titles. The tool scans the PDF file of each paper to find its title and renames the PDF file accordingly.

Approach

sci_rename looks for the title in two different places. First, it tries to find the title in the PDF metadata. When PDF files are carefully created, the title of the paper can be included in the PDF metadata. Unfortunately, this practice is not as frequent as one would expect. To overcome that, we rely on the following assumption: the title of a scientific paper is usually set in the largest font in the text. Therefore, sci_rename also scans the first page of the paper and tries to identify a sentence of bounded length with the largest font size. If it finds a potential title with both approaches (i.e., metadata and largest font), it asks the user to choose between them.
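The largest-font heuristic described above can be sketched as follows. This is only an illustration, not the actual code of sci_rename: the function names (`first_page_spans`, `largest_font_title`, `title_candidates`) and the length bounds are hypothetical, and the helper that reads the PDF assumes that PyMuPDF's `page.get_text("dict")` is available in the installed version.

```python
from typing import Iterable, List, Optional, Tuple

Span = Tuple[str, float]  # (text, font size in points)


def first_page_spans(pdf_path: str) -> List[Span]:
    """Collect (text, size) pairs from the first page using PyMuPDF.

    Assumption: PyMuPDF is installed and provides page.get_text("dict").
    """
    import fitz  # PyMuPDF, imported lazily so the rest runs without it

    doc = fitz.open(pdf_path)
    try:
        layout = doc[0].get_text("dict")
    finally:
        doc.close()
    return [
        (span["text"], span["size"])
        for block in layout["blocks"]
        for line in block.get("lines", [])  # image blocks have no "lines"
        for span in line["spans"]
    ]


def largest_font_title(
    spans: Iterable[Span], min_chars: int = 5, max_chars: int = 300
) -> Optional[str]:
    """Return the text set in the largest font whose length is within bounds.

    The bounds (hypothetical values) skip drop caps or logos that are too
    short and oversized body text that is too long to be a title.
    """
    cleaned = [(text.strip(), round(size, 1)) for text, size in spans if text.strip()]
    # Try font sizes from largest to smallest.
    for size in sorted({s for _, s in cleaned}, reverse=True):
        candidate = " ".join(text for text, s in cleaned if s == size)
        if min_chars <= len(candidate) <= max_chars:
            return candidate
    return None


def title_candidates(metadata_title: Optional[str], font_title: Optional[str]) -> List[str]:
    """Return the distinct, non-empty candidates; more than one means the user must choose."""
    result: List[str] = []
    for title in (metadata_title, font_title):
        title = (title or "").strip()
        if title and title not in result:
            result.append(title)
    return result
```

With this sketch, the metadata title would come from `fitz.open(path).metadata.get("title")`, and a prompt would only be needed when `title_candidates` returns two entries.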

Dependencies

Requires Python>=3.7 and PyMuPDF==1.18.14

Installation

  1. Clone the project to your machine.

    $ git clone https://github.com/heversonbr/scientific-paper-pdf-rename.git
  2. Enter the created directory.

    $ cd scientific-paper-pdf-rename
  3. Set up a Python virtual environment to keep the required packages separate from your globally installed packages.

    $ python3 -m venv .venv
  4. Activate your virtual environment.

    $ source .venv/bin/activate
  5. Check the installed packages and update pip, if required.

    $ pip list
    $ python3 -m pip install --upgrade pip
  6. Install the dependencies.

    $ python3 -m pip install -r requirements.txt

Usage

You can run the script to rename a single file or all files in a target directory as follows.

all files in directory
$ python3 sci_rename.py <target_directory>

or

single file
$ python3 sci_rename.py <pdf_filename.pdf>

The output will be in a directory called auto_renamed. Use the examples directory to test the renaming tool if you wish.
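As a rough illustration of the two invocation modes above, the sketch below shows one way a script could resolve its argument into a list of PDF files and turn a detected title into a safe file name. It is not taken from sci_rename.py: the function names, the character filtering, and the 120-character limit are all assumptions made for this example.

```python
import re
from pathlib import Path
from typing import List


def collect_pdfs(target: str) -> List[Path]:
    """Return the PDF files for a target that is either a directory or a single file."""
    path = Path(target)
    if path.is_dir():
        # Only direct children; matching on suffix is case-insensitive.
        return sorted(p for p in path.iterdir() if p.is_file() and p.suffix.lower() == ".pdf")
    if path.is_file() and path.suffix.lower() == ".pdf":
        return [path]
    return []


def title_to_filename(title: str, max_len: int = 120) -> str:
    """Turn a paper title into a file name (hypothetical naming scheme)."""
    cleaned = re.sub(r"[^\w\s-]", "", title)        # drop punctuation such as / : ?
    cleaned = re.sub(r"\s+", "_", cleaned.strip())  # whitespace runs become underscores
    return (cleaned[:max_len] or "untitled") + ".pdf"
```

For example, `title_to_filename("Attention Is All You Need")` yields `Attention_Is_All_You_Need.pdf`, which could then be written into the auto_renamed directory.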

Feedback

All feedback is welcome. The tool has only been tested on macOS and Linux. I wrote this simple tool to avoid the hell of looking for a specific paper that I had downloaded earlier and that got lost among the many other papers I had downloaded. I hope it helps.

License: GNU Affero General Public License v3.0


Languages

Python 93.5%, Shell 6.5%