timhutton / twitter-archive-parser

Python code to parse a Twitter archive and output in various ways

Feature Request: Followers & Followings

matt22207 opened this issue · comments

Any chance of parsing the Followers & Followings lists? They are currently just unfriendly URLs:

https://twitter.com/intent/user?user_id=XXXXXXXXXXXXX

but it would be good to resolve those URLs into at least usernames, plus any other public info available for each user that would help us reconstruct our lists if those Twitter URLs stop working.
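For reference, the numeric ID in those intent URLs can be pulled out with the standard library alone. This is just a sketch (the helper name is made up, not part of the parser):

```python
from urllib.parse import urlparse, parse_qs

def user_id_from_intent_url(url):
    """Extract the numeric user ID from a twitter.com intent URL.
    (Hypothetical helper, shown only to illustrate the URL format.)"""
    query = parse_qs(urlparse(url).query)
    # parse_qs maps each key to a list of values; take the first user_id, if any
    return query.get("user_id", [None])[0]

print(user_id_from_intent_url("https://twitter.com/intent/user?user_id=12345"))
# → 12345
```

Resolving that ID back to a username is the hard part, since it needs some external lookup.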

Thanks!

I made a script to parse user IDs and map them to handles. It doesn't need a login or access to Twitter's API, because it uses the TweeterID web service to look up the handles. It also finds some of the handles in the archive itself (by looking in mentions and retweets). Sometimes it also finds display names and links, but it can't look up the bio or profile picture yet.
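The two lookup strategies can be sketched roughly like this. Note that the tweet JSON key layout and the TweeterID endpoint/form field shown here are assumptions for illustration, not code taken from the actual script:

```python
import urllib.request

def handles_from_mentions(tweets):
    """Map user IDs to handles using mention entities found in the archive.
    Assumes each tweet dict carries entities.user_mentions with
    id_str/screen_name keys, as in Twitter's classic tweet JSON."""
    mapping = {}
    for tweet in tweets:
        for mention in tweet.get("entities", {}).get("user_mentions", []):
            mapping[mention["id_str"]] = mention["screen_name"]
    return mapping

def tweeterid_request(user_id):
    """Build (but don't send) a POST request to the TweeterID web service,
    which resolves a numeric ID to a handle without any Twitter login.
    The endpoint URL and 'input' form field are assumptions."""
    return urllib.request.Request(
        "https://tweeterid.com/ajax.php",   # assumed endpoint
        data=f"input={user_id}".encode(),   # assumed form field
        headers={"User-Agent": "archive-parser-sketch"},
    )
```

Scanning mentions first keeps the number of web-service calls down, since every ID already seen in the archive needs no external lookup.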

Currently, it just writes the mappings into a JSON file, but you might want to use it already anyway, in case Twitter goes down even faster than expected...

The script is available in the userids branch in my fork of this project:
https://github.com/flauschzelle/twitter-archive-parser/tree/userids

@lenaschimmel and I are working on integrating it into the main parser script and will probably make a pull request to the main project here later. But integrating it properly might take a few days, so if you're in a hurry, feel free to use my version in the meantime :)

I've been working on getting full user data via the API (without needing a developer key). There's a get_follows script in my fork which demonstrates this.
https://github.com/press-rouch/twitter-archive-parser