justadreamer / tghistory

Import Telegram message history into a PostgreSQL DB

tghistory

Import Telegram message history into a PostgreSQL DB and run an R analysis producing a report using knitr from an R Markdown document.

Accompanying blog articles:

  1. http://logic-explained.blogspot.com/2020/03/telegram-chats-history-analysis.html
  2. https://logic-explained.blogspot.com/2020/07/telegram-chats-history-analysis-with-r.html

Setup

1. Register a Telegram app at my.telegram.org/apps, obtain API_ID, API_HASH
2. Build and install TDLib
3. Set up a PostgreSQL DB in a location of your choice.
4. Create config.yml

cp config.yml.example config.yml

5. Set values in the config.yml (this file is used in both Python and R):

  • DB connection parameters
  • app_id, api_hash obtained from Telegram (see step 1)
  • your phone number to connect to your Telegram account
  • list of chat titles for which you would like to fetch history

6. Install pipenv (on a Mac: brew install pipenv).
7. On a Mac, for psycopg to install correctly, export these variables before running pipenv install:

export LDFLAGS="-L/usr/local/opt/openssl/lib"
export CPPFLAGS="-I/usr/local/opt/openssl/include"

8. Run pipenv install to install dependencies and create a venv.

9. Install R

10. Launch R and install packages by issuing the command:

install.packages(c("tidyverse", "lubridate", "odbc", "kableExtra", "config"))

You will be prompted to select a CRAN mirror; the first one will do.
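As a rough illustration of step 5, the settings in config.yml could be sanity-checked before the first run. This is only a sketch: the key names below (db_host, db_port, db_name, db_user, phone, chats) are hypothetical placeholders, since the real names are defined in config.yml.example, and the dict stands in for whatever a YAML parser would return after reading the file.

```python
# Hypothetical key names; check config.yml.example for the real ones.
REQUIRED_KEYS = (
    "db_host", "db_port", "db_name", "db_user",  # DB connection parameters
    "app_id", "api_hash",                        # obtained from Telegram (step 1)
    "phone",                                     # phone number of the account
    "chats",                                     # chat titles to fetch
)


def missing_keys(settings):
    """Return the required keys that are absent or empty in settings."""
    return [k for k in REQUIRED_KEYS if settings.get(k) in (None, "", [])]


def build_dsn(settings):
    """Build a libpq-style keyword/value connection string.

    Values containing spaces would need quoting; this sketch skips that.
    """
    return "host={db_host} port={db_port} dbname={db_name} user={db_user}".format(
        **settings
    )


# Stand-in for the parsed contents of config.yml (all values are made up).
settings = {
    "db_host": "localhost",
    "db_port": 5432,
    "db_name": "tghistory",
    "db_user": "postgres",
    "app_id": 12345,
    "api_hash": "0123456789abcdef",
    "phone": "+10000000000",
    "chats": ["Example chat"],
}

print("missing:", missing_keys(settings))
print("dsn:", build_dsn(settings))
```

A check like this catches an empty api_hash or chat list early, before the Telegram login and the DB connection are attempted.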

Run

The script ./run-analytics.sh automates the following steps.

1. Start the DB server. The DB can be empty: the schema will be initialized on the first run of the script. For example, if postgresdb is the directory where you keep the DB:

pg_ctl -D postgresdb start

2. Switch to the virtualenv created by pipenv: pipenv shell

3. Then run: python extract_thistory.py

4. When prompted, enter your code to authenticate with Telegram.

5. Then run R analysis using:

R -e "rmarkdown::render('tghistory_analytics.Rmd')"
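The steps above can also be sketched as a small Python driver. This is not the content of run-analytics.sh, only an illustration under a few assumptions: the DB directory is postgresdb as in step 1, and pipenv run is used in place of the interactive pipenv shell so the extraction can be launched non-interactively. The function names build_steps and run_steps are made up for this example.

```python
import shlex
import subprocess


def build_steps(db_dir="postgresdb",
                script="extract_thistory.py",
                report="tghistory_analytics.Rmd"):
    """Return the commands from the Run section, in order, as argument lists."""
    return [
        ["pg_ctl", "-D", db_dir, "start"],               # step 1: start the DB server
        ["pipenv", "run", "python", script],             # steps 2-4: extract history
        ["R", "-e", f"rmarkdown::render('{report}')"],   # step 5: render the report
    ]


def run_steps(steps, runner=subprocess.run):
    """Run each command in turn, stopping at the first failure."""
    for cmd in steps:
        # check=True raises CalledProcessError on a non-zero exit status.
        runner(cmd, check=True)


# Print the commands rather than running them, so the sketch is safe to try.
for cmd in build_steps():
    print(shlex.join(cmd))
```

Calling run_steps(build_steps()) would execute the sequence. Because subprocess.run inherits the terminal by default, the Telegram code prompt from step 4 should still appear as usual.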

Queries

queries.sql contains sample queries and comments.

R analysis

With R installed, you can compile the R Markdown analysis:

R -e "rmarkdown::render('tghistory_analytics.Rmd')"

The output is written to tghistory_analytics.pdf.

Languages

Python 98.5%, Shell 1.5%