idofrizler / FollowerData

Twitter statistics - Fun with Kusto!

Demo

redacted_dashboard.png

Installation

Twitter setup

  1. Go to https://developer.twitter.com/ and create your own app.
  2. From the 'Keys and tokens' page, under Authentication Tokens, copy your Bearer Token.
  3. In the environment where you run the code, set an environment variable named TwitterToken to the copied value.
  4. Choose a Twitter username to query (it doesn't have to be your own!) and put it in the USER_ALIAS field in main.py.
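
As a rough illustration of step 3, here is a minimal sketch of reading the token from the TwitterToken environment variable and turning it into a standard bearer Authorization header. The function name build_auth_headers is hypothetical and not taken from main.py; the actual code may read the token differently.

```python
import os


def build_auth_headers(env=os.environ):
    """Return HTTP headers carrying the bearer token stored in TwitterToken.

    build_auth_headers is a hypothetical helper, not part of this repo.
    """
    token = env.get("TwitterToken")
    if not token:
        # Fail early with a clear message instead of sending an empty token.
        raise RuntimeError("Set the TwitterToken environment variable first")
    return {"Authorization": f"Bearer {token}"}
```

Failing early when the variable is missing makes a forgotten setup step easier to spot than an authentication error from the API.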

Kusto setup

  1. Go to https://dataexplorer.azure.com/freecluster/ and create yourself a free Kusto cluster.
  2. Inside it, create a database called FollowerData.
  3. Then, go to "My Cluster (Preview)" and copy the "Cluster URI" and "Data ingestion URI" values into kusto_handler.py (as KUSTO_URI and KUSTO_INGEST_URI, respectively).
  4. In Kusto-commands.txt you'll find control commands to create four tables: Users, Follows, Tweets, Likes. Copy those.
  5. On the Kusto website, go to "Query", paste those commands, and run them one by one.

Getting the data

  1. Run main.py; it will print messages as it queries Twitter and ingests the results into Kusto. Some of these API calls may get rate-limited; when that happens, the code automatically sleeps and retries until it finishes querying.
  2. You can also collect data on other users and cross-reference it with yours.
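
To make the sleep-and-retry behaviour from step 1 concrete, here is a minimal sketch of one way to do it. It is not the code from main.py: call_with_retry and the fetch callable are hypothetical, and the sketch assumes a rate-limited call returns HTTP 429 with the x-rate-limit-reset header (epoch seconds) that the Twitter API sends, using lowercase header keys.

```python
import time


def call_with_retry(fetch, sleep=time.sleep, now=time.time, max_attempts=5):
    """Call fetch() until it is no longer rate-limited, then return its body.

    fetch is a hypothetical callable returning (status_code, headers, body).
    """
    for _ in range(max_attempts):
        status, headers, body = fetch()
        if status != 429:
            return body
        # x-rate-limit-reset holds the epoch second at which the window resets;
        # fall back to a 60-second wait if the header is missing.
        reset_at = float(headers.get("x-rate-limit-reset", now() + 60))
        # Always wait at least one second before trying again.
        sleep(max(reset_at - now(), 1))
    raise RuntimeError("still rate-limited after retries")
```

Passing sleep and now as parameters keeps the helper easy to try out without actually waiting for a rate-limit window to expire.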

Query the data

  1. In the Query tab, you can explore the data however you wish. Kusto's query language, KQL, is rich, and you can also join between tables. See the Kusto docs for details.

Visualize the data

  1. On the Kusto website, go to "Dashboards (Preview)".
  2. Once you have your queries, you can add tiles as you wish.
  3. Kusto-commands.txt contains some suggested queries (you'll still need to pick a visualization for each one when you build the tile).

Alternative (not tested yet)

  1. I also added a template, dashboard-twitter-stats.json, that you can edit and upload as a pre-cooked dashboard.
    1. It is not tested, but I believe you'll need to edit clusterUri with your cluster's address before uploading.
    2. Also, you'll need to edit the <your_user_id> and <user-alias> params referenced in the queries (you should be able to do that after uploading the dashboard, from the website).
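
If you would rather make both edits before uploading, here is a minimal sketch of a script for it. Like the template itself, it is untested against the real file: it assumes clusterUri appears as a JSON key with a string value and that the two placeholders appear verbatim inside string values. patch_dashboard is a hypothetical helper, not part of this repo.

```python
import json


def patch_dashboard(template_text, cluster_uri, user_id, user_alias):
    """Return the dashboard JSON with the cluster URI and placeholders filled in."""

    def patch(node, key=None):
        if isinstance(node, dict):
            return {k: patch(v, k) for k, v in node.items()}
        if isinstance(node, list):
            return [patch(item) for item in node]
        if isinstance(node, str):
            if key == "clusterUri":
                # Assumption: every clusterUri should point at your own cluster.
                return cluster_uri
            return (node.replace("<your_user_id>", user_id)
                        .replace("<user-alias>", user_alias))
        return node  # numbers, booleans and null stay unchanged

    return json.dumps(patch(json.loads(template_text)), indent=2)
```

You could read dashboard-twitter-stats.json, pass its text through patch_dashboard, and write the result to a new file for upload.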

Notes

  1. The current code only queries up to your most recent 3200 tweets. This is a Twitter API limitation. It can be worked around by adding time filters to the main query, but that is not implemented in the current version.
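
One way the time-filter idea could be approached is to split the period of interest into consecutive windows and query each one separately. The sketch below only generates the windows; time_windows is a hypothetical helper, and it assumes the timestamps would be passed as the start_time and end_time parameters of the Twitter API v2 user tweet timeline endpoint. Whether this actually gets past the 3200-tweet cap has not been verified here.

```python
from datetime import datetime, timedelta, timezone

ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def time_windows(start, end, step):
    """Yield (start_time, end_time) ISO-8601 string pairs covering [start, end)."""
    cursor = start
    while cursor < end:
        # Clip the last window so it never runs past the requested end.
        window_end = min(cursor + step, end)
        yield cursor.strftime(ISO_FORMAT), window_end.strftime(ISO_FORMAT)
        cursor = window_end


# Example: monthly-ish windows over the first two months of 2022 (UTC).
windows = list(time_windows(
    datetime(2022, 1, 1, tzinfo=timezone.utc),
    datetime(2022, 3, 1, tzinfo=timezone.utc),
    timedelta(days=31),
))
```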
