s2t2 / truth-collection-2023

truth-collection-2023

Setup

Obtain a username and password for the Truth Social network (i.e. TRUTH_USERNAME and TRUTH_PASSWORD).

Create a Google Cloud project and a BigQuery dataset (i.e. DATASET_ADDRESS). Then obtain service account credentials and download the resulting JSON key file into this repo as "google-credentials.json" (this filename is excluded from version control).

Set up a local ".env" file with credentials:

TRUTH_USERNAME="_______"
TRUTH_PASSWORD="_______"

# BigQuery Credentials:
GOOGLE_APPLICATION_CREDENTIALS="/Users/path/to/truth-collection-2023/google-credentials.json"
DATASET_ADDRESS="your-project.truth_2023_development"
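
One way an application might read and sanity-check these settings at startup is sketched below. It uses only the standard library and assumes the variables have already been exported into the process environment (for example by a dotenv loader, not shown). The helper names `load_settings` and `split_dataset_address` are hypothetical and are not taken from this repo.

```python
import os

# Names taken from the ".env" example above.
REQUIRED_VARS = [
    "TRUTH_USERNAME",
    "TRUTH_PASSWORD",
    "GOOGLE_APPLICATION_CREDENTIALS",
    "DATASET_ADDRESS",
]


def load_settings(environ=None):
    """Return the required settings as a dict, or raise if any are missing or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}


def split_dataset_address(address):
    """Split 'project.dataset' into (project, dataset) at the last dot."""
    project, sep, dataset = address.rpartition(".")
    if not sep or not project or not dataset:
        raise ValueError(f"Expected 'project.dataset', got {address!r}")
    return project, dataset
```

Failing early with a list of the missing names tends to be easier to debug than an authentication error that surfaces later during collection.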

Set up a local environment:

conda create -n truth-env python=3.10
conda activate truth-env

Install packages:

pip install -r requirements.txt

Usage

Connect to BigQuery:

python -m app.bq_service

Connect to the social network:

python -m app.truth_service

Timeline Collection

First, migrate the timeline statuses table:

DESTRUCTIVE=false python -m app.bq_migrate.timeline_statuses
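
The DESTRUCTIVE variable reaches Python as a string, so a migration script has to convert it to a boolean somehow. The following is a minimal sketch of one way to do that, not the repo's actual code. It assumes that only the literal value "true" (case-insensitive) enables destructive mode and that the default is non-destructive. `is_destructive` is a hypothetical helper name.

```python
import os


def is_destructive(environ=None):
    """Return True only when DESTRUCTIVE is explicitly set to 'true'."""
    environ = os.environ if environ is None else environ
    # Assumption: anything other than "true" (including unset) means non-destructive.
    return environ.get("DESTRUCTIVE", "false").strip().lower() == "true"
```

With a check like this, a typo such as DESTRUCTIVE=ture falls back to the safe, non-destructive path.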

Collect timeline statuses for a given user:

python -m app.bq_collect.timeline_statuses
# COLLECTION_USERNAME="abc123" python -m app.bq_collect.timeline_statuses

Collect timeline statuses for all previously collected and mentioned users, using either the sequential or the threaded variant:

python -m app.bq_collect.all_timelines
# USERS_LIMIT=5 python -m app.bq_collect.all_timelines
python -m app.bq_collect.all_timelines_threaded
# USERS_LIMIT=5 MAX_THREADS=3 python -m app.bq_collect.all_timelines_threaded
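
To illustrate how USERS_LIMIT and MAX_THREADS could interact in a threaded collector, here is a rough sketch built on the standard library's `concurrent.futures`. It is not the repo's implementation: `collect_timeline` is a hypothetical stand-in for the real per-user collection step, `collect_all` and `int_from_env` are hypothetical helpers, and the default of 3 threads is an assumption.

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed


def collect_timeline(username):
    """Hypothetical stand-in for fetching and storing one user's timeline."""
    return f"collected:{username}"


def int_from_env(name, default=None, environ=None):
    """Read an optional integer setting such as USERS_LIMIT or MAX_THREADS."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    return int(raw) if raw else default


def collect_all(usernames, users_limit=None, max_threads=3):
    """Collect timelines for up to users_limit users using a pool of worker threads."""
    if users_limit is not None:
        usernames = usernames[:users_limit]
    results = {}
    with ThreadPoolExecutor(max_workers=max_threads) as executor:
        # Map each future back to its username so results can be keyed by user.
        futures = {executor.submit(collect_timeline, name): name for name in usernames}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results
```

A caller might combine the pieces as `collect_all(usernames, int_from_env("USERS_LIMIT"), int_from_env("MAX_THREADS", 3))`. Threads suit this kind of job because the work is dominated by waiting on network requests rather than by CPU time.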

Testing

pytest
