researchhub-backend

This repository contains the Django backend for https://www.researchhub.com/.

Setup

GENERAL

Install the flake8 linter in your IDE

Create a keys file in config

touch src/config/keys.py

Add the following to keys.py (fill in the blanks)

SECRET_KEY = ''
AWS_ACCESS_KEY_ID = ''
AWS_SECRET_ACCESS_KEY = ''
INFURA_PROJECT_ID = ''
INFURA_PROJECT_SECRET = ''
INFURA_RINKEBY_ENDPOINT = f'https://rinkeby.infura.io/v3/{INFURA_PROJECT_ID}'

Add local config files by copying the files from src/config to src/config_local. Ask a team member to provide the keys.

Set executable permissions on scripts

chmod -R u+x scripts/

Install git hooks

./scripts/install-hooks

DATABASE

Create a db file in config

touch src/config/db.py

Add the following

NAME = 'researchhub'
HOST = 'localhost'
PORT = 5432
USER = 'rh_developer' # replace as needed
PASS = 'not_secure'   # replace as needed
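
As a rough illustration of how these values might be consumed, the sketch below assembles a Django-style DATABASES setting from them. The helper name build_databases is hypothetical, and the project's real settings module may wire the values up differently.

```python
# Values mirroring src/config/db.py (repeated here so the sketch is self-contained).
NAME = 'researchhub'
HOST = 'localhost'
PORT = 5432
USER = 'rh_developer'
PASS = 'not_secure'


def build_databases(name, host, port, user, password):
    """Return a Django-style DATABASES dict for a PostgreSQL backend.

    Hypothetical helper: the actual settings file may be organized differently.
    """
    return {
        'default': {
            'ENGINE': 'django.db.backends.postgresql',
            'NAME': name,
            'HOST': host,
            'PORT': port,
            'USER': user,
            'PASSWORD': password,
        }
    }


DATABASES = build_databases(NAME, HOST, PORT, USER, PASS)
```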

Create a local postgres db called researchhub. Alternatively, to use docker for local development, run the following:

# https://docs.docker.com/samples/library/postgres/
docker run \
  --rm \
  --name researchhub_db \
  --env POSTGRES_DB=researchhub \
  --env POSTGRES_USER=rh_developer \
  --env POSTGRES_PASSWORD=not_secure \
  --volume "$(pwd)"/database:/var/lib/postgresql/data \
  --publish 5432:5432 \
  --detach \
  postgres:12

ENVIRONMENT

The project environment is managed using Pipenv.

The project uses Python 3.6.12, so you will need to install it (for example, with pyenv).

If you're installing on macOS 11.x, an additional step is required: install the right version of Python with extra build flags and a patch (note the Python version in the command):

CFLAGS="-I$(brew --prefix openssl)/include -I$(brew --prefix bzip2)/include -I$(brew --prefix readline)/include -I$(xcrun --show-sdk-path)/usr/include" LDFLAGS="-L$(brew --prefix openssl)/lib -L$(brew --prefix readline)/lib -L$(brew --prefix zlib)/lib -L$(brew --prefix bzip2)/lib" pyenv install --patch 3.6.12 < <(curl -sSL https://github.com/python/cpython/commit/8ea6353.patch\?full_index\=1)

After installing Python, run the following commands from the src directory:

# installs the project environment and packages
pipenv install

# activates the environment and enters shell
pipenv shell

In general, when adding new packages, follow these steps:

# add a package to the project environment
pipenv install package_name

# update requirements.txt, which is used by Elastic Beanstalk
pipenv lock --requirements >| requirements.txt

REDIS (Required)

Make sure to run redis-server in a separate terminal.
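
Before moving on, it can help to confirm that Postgres and Redis are actually accepting connections. The sketch below is one way to check, using only the standard library. The function name port_is_open is hypothetical, port 5432 comes from db.py above, and port 6379 is assumed because it is the redis-server default.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == '__main__':
    # 5432 is the Postgres port from db.py; 6379 is assumed (Redis default).
    for label, port in (('postgres', 5432), ('redis', 6379)):
        status = 'up' if port_is_open('localhost', port) else 'down'
        print(f'{label}: {status}')
```

This only verifies that something is listening on the port, not that credentials or the database name are correct.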

ELASTICSEARCH (Optional)

In a new shell, run the Docker image script below (make sure redis-server is running in the background).

# Let this run for ~30 minutes in the background before terminating; be patient :)
./start-es.sh

Back in the python virtual environment, build the indices

python manage.py search_index --rebuild

Optionally, start Kibana for Elastic dev tools

./start-kibana.sh

To view Elasticsearch queries via the API, add DEBUG_TOOLBAR = True to keys.py. Then, visit an API URL such as http://localhost:8000/api/search/paper/?publish_date__gte=2022-01-01
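
If you want to try several filters, a small helper can build such URLs for you. This is a minimal sketch: search_url is a hypothetical name, and publish_date__gte is the only filter taken from this README, so any other filter names would need to be checked against the API.

```python
from urllib.parse import urlencode

# Search endpoint from the example URL above.
BASE_URL = 'http://localhost:8000/api/search/paper/'


def search_url(**filters):
    """Build a paper search URL from keyword filters."""
    query = urlencode(filters)
    return f'{BASE_URL}?{query}' if query else BASE_URL


print(search_url(publish_date__gte='2022-01-01'))
# prints http://localhost:8000/api/search/paper/?publish_date__gte=2022-01-01
```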

ETHEREUM (Optional)

Create a wallet file in config

touch src/config/wallet.py

Add the following to wallet.py (fill in the blanks)

KEYSTORE_FILE = ''
KEYSTORE_PASSWORD = ''

Add the keystore file to the config directory

Ask a team member for the file or create one from MyEtherWallet https://www.myetherwallet.com/create-wallet

Make sure you have added the Infura keys to keys.py (see the GENERAL section above).

DEVELOPMENT

This section contains some helpful commands for development.

Run these from the src directory, inside pipenv shell, as described in the ENVIRONMENT section.

Update the database schema:

python manage.py makemigrations
python manage.py migrate

Run a development server and make the API available at http://localhost:8000/api/:

# create a superuser and retrieve an authentication token
python manage.py createsuperuser --username=<username> --email=<email>
python manage.py drf_create_token <email>

# run the development server
python manage.py runserver

# query the API
curl --silent \
  --header 'Authorization: Token <token>' \
  http://localhost:8000/api/
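
The same authenticated request can be made from Python. The sketch below uses only the standard library. The names build_request and fetch_json are hypothetical, and it assumes the API root responds with JSON.

```python
import json
import urllib.request

API_ROOT = 'http://localhost:8000/api/'


def build_request(token, url=API_ROOT):
    """Create a request carrying the same header as the curl example."""
    return urllib.request.Request(
        url,
        headers={'Authorization': f'Token {token}'},
    )


def fetch_json(token, url=API_ROOT):
    """Send the request and decode the body (assumed to be JSON)."""
    with urllib.request.urlopen(build_request(token, url)) as response:
        return json.loads(response.read().decode('utf-8'))


# Usage, with the development server running:
#   fetch_json('<token>')
```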

Run the test suite:

# run all tests
# Note: Add --keepdb flag to speed up the process of running tests locally
python manage.py test

# run tests for the paper app, excluding ones that require AWS secrets
python manage.py test paper --exclude-tag=aws

# run a specific test example:
python manage.py test note.tests.test_note_api.NoteTests.test_create_workspace_note --keepdb

Run a Celery worker in the background for async tasks:

celery -A researchhub worker -l info

Run Celery beat in the background for periodic tasks (requires a Celery worker to be running):

celery -A researchhub beat -l info

Both celery commands in one (for development only)

celery -A researchhub worker -l info -B

Google Auth

Ask a team member for the Google CLIENT_ID and SECRET, then run the following SQL (substituting those values) to seed the data that Google login needs:

-- five columns need five values; key is left as an empty string
insert into socialaccount_socialapp (provider, name, client_id, secret, key)
values ('google', 'Google', '<CLIENT_ID>', '<SECRET>', '');

insert into django_site (domain, name) values ('http://google.com', 'google.com');

insert into socialaccount_socialapp_sites (socialapp_id, site_id) values (1, 1);

(Make sure the IDs in the last query match the rows created by the first two queries.)

Seeding hub data

There's a CSV file at /misc/hub_hub.csv that you can use to seed hub data.

If the import fails because your DB tool treats empty acronym and description fields as nulls, temporarily alter the hub_hub table to allow null values in those columns and import the CSV. Then execute update hub_hub set acronym = coalesce(acronym, ''), description = coalesce(description, ''); to replace the nulls with empty strings without overwriting existing values, and alter the table to disallow nulls again.
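
If you would rather pre-process the file than alter the table, Python's csv module already reads empty fields as empty strings rather than nulls. The sketch below shows that behavior and also fills in columns that are missing from short rows. The function name normalize_rows and the sample data are hypothetical. Only the acronym and description column names come from this README.

```python
import csv
import io


def normalize_rows(csv_text, columns=('acronym', 'description')):
    """Parse CSV text and make sure the given columns are never None."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        for column in columns:
            # DictReader yields '' for empty fields and None for fields
            # that are missing entirely (short rows).
            if row.get(column) is None:
                row[column] = ''
        rows.append(row)
    return rows


# Hypothetical sample; the real hub_hub.csv has its own columns.
sample = 'name,acronym,description\nBiology,,\nPhysics\n'
for row in normalize_rows(sample):
    print(row['name'], repr(row['acronym']), repr(row['description']))
```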

Then run this from pipenv shell:

python manage.py create-categories
python manage.py migrate-hubs
python manage.py categorize-hubs

Seeding paper data

From your terminal, follow these steps:

cd src
pipenv shell
python manage.py shell_plus # enters Python shell within pipenv shell

from paper.tasks import pull_crossref_papers, pull_papers
pull_crossref_papers()
pull_papers()
