rahulkumaran / tweet_scrapper

Scrape the Twitter frontend API without authentication or restrictions.

Home Page: http://www.shirishkadam.com/

Tweet Scrapper

License: GPL v3

Twitter's API is annoying to work with and has lots of limitations. Luckily, their frontend (JavaScript) has its own API, which I reverse-engineered. No API rate limits. No restrictions. Extremely fast.

You can use this library to get the text of any user's Tweets trivially. Follow the creator's blog at shirishkadam.com for updates on progress.

This project is inspired by Kenneth Reitz's similar project, kennethreitz/twitter-scraper, which is limited to Python 3.6 and above.

Getting Started

$ pip install tweetscrape
$ python -m tweetscrape.twitter_scrape -u "@5hirish" -p 3
$ python -m tweetscrape.twitter_scrape -s "#Python" -p 4
$ python -m tweetscrape.twitter_scrape -s "Avengers Infinity War" -p 2

Usage

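from tweetscrape.profile_tweets import TweetScrapperProfile

# Scrape tweets from the profile "@5hirish"; the second argument is
# presumably the page count, matching the CLI's -p option.
tweet_scrapper = TweetScrapperProfile("@5hirish", 1)
tweets = tweet_scrapper.get_profile_tweets()

# Each tweet prints as a multi-line record of its fields.
for tweet in tweets:
    print(tweet)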

Read more on tweetscrape usage here. The loop above prints each tweet in the following format:

Id: 1056176020368191488	Type: tweet	Time: 1540646960000
Author: 5hirish	AuthorId: 428808036
ReTweeter: None
Associated Tweet: 1056176020368191488
Text:   I've completed 7 Pull Requests for #Hacktoberfest! https://hacktoberfest.digitalocean.com/stats/5hirish  Always wanted to contribute to #OpenSource, thanks to @digitalocean initiative #Hacktoberfest finally got around doing it. Will keep it up.
Links: ['https://t.co/J42KiNKGMG']
Hastags: ['#Hacktoberfest', '#OpenSource', '#Hacktoberfest']
Mentions: ['@digitalocean']
Replies: 0	Favorites: 3	Retweets: 1

Id: 1055883061513084928	Type: tweet	Time: 1540577113000
Author: wesmckinn	AuthorId: 115494880
ReTweeter: 5hirish
Associated Tweet: 1055883061513084928
Text:   TFW someone asks "Any update?" or "When is this feature going to be implemented?" on an open source issue tracker.
Links: []
Hastags: []
Mentions: []
Replies: 5	Favorites: 84	Retweets: 11

Id: 1055151881377406976	Type: tweet	Time: 1540402786000
Author: justinkan	AuthorId: 28917111
ReTweeter: 5hirish
Associated Tweet: 1055151881377406976
Text:   1/ Actually I’ve learned a lot from @ROWGHANI that’s worth sharing. First, your job as CEO is to: make sure there’s $ in the bank, define the company’s mission, hire the senior team, and do maybe one thing you enjoy (sales, product, etc)
Links: []
Hastags: []
Mentions: ['@ROWGHANI']
Replies: 43	Favorites: 2793	Retweets: 700
....
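The `Time` field in the sample output appears to be a Unix timestamp in milliseconds. As a minimal sketch under that assumption, it can be converted to a readable UTC date with the standard library. The helper name `tweet_time_to_datetime` is hypothetical and not part of tweetscrape.

```python
from datetime import datetime, timezone


def tweet_time_to_datetime(time_ms: int) -> datetime:
    """Convert a millisecond epoch timestamp to a timezone-aware UTC datetime.

    Assumes the value is milliseconds since 1970-01-01 UTC, which is how the
    `Time` values in the sample output appear to be encoded.
    """
    return datetime.fromtimestamp(time_ms / 1000, tz=timezone.utc)


# `Time` value copied from the first record in the sample output.
posted = tweet_time_to_datetime(1540646960000)
print(posted.isoformat())  # 2018-10-27T13:29:20+00:00
```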

Requirements

Python package dependencies are listed in requirements.txt.

Features

  • Extracts user tweets with all metadata
  • Extracts external links, hashtags and mentions from a tweet
  • Extracts reply, favorite and retweet counts of a tweet
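To illustrate what extracting links, hashtags and mentions from tweet text involves, here is a rough sketch using regular expressions. It is not how tweetscrape does it internally (the library reads the data from Twitter's frontend responses), and the function name `extract_entities` and the patterns are assumptions made for this example only.

```python
import re

# Simplified patterns for illustration; real tweets have more edge cases.
LINK_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"(?<!\w)#\w+")
MENTION_RE = re.compile(r"(?<!\w)@\w+")


def extract_entities(text: str) -> dict:
    """Return the links, hashtags and mentions found in a tweet's text."""
    links = LINK_RE.findall(text)
    # Remove links first so URL fragments are not mistaken for hashtags.
    text_without_links = LINK_RE.sub(" ", text)
    return {
        "links": links,
        "hashtags": HASHTAG_RE.findall(text_without_links),
        "mentions": MENTION_RE.findall(text_without_links),
    }


sample_text = (
    "I've completed 7 Pull Requests for #Hacktoberfest! "
    "https://hacktoberfest.digitalocean.com/stats/5hirish  "
    "Always wanted to contribute to #OpenSource, thanks to @digitalocean "
    "initiative #Hacktoberfest finally got around doing it."
)
entities = extract_entities(sample_text)
print(entities["hashtags"])  # ['#Hacktoberfest', '#OpenSource', '#Hacktoberfest']
print(entities["mentions"])  # ['@digitalocean']
```

Unlike the library's output, this sketch returns the URL as it appears in the text rather than the shortened t.co link.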

Cool stuff using Tweetscrape

I have added a few examples (Jupyter Notebooks) using this library to do some cool stuff.

  • Tweet generator using Markov Chain
  • Gensim Topic Modeling using Latent Dirichlet Allocation model
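For readers unfamiliar with the first example, a word-level Markov chain generator can be sketched in a few lines. This is a generic illustration rather than the code from the notebook; `build_chain`, `generate` and the `sample_tweets` list are hypothetical names and data.

```python
import random
from collections import defaultdict


def build_chain(texts):
    """Map each word to the list of words that follow it in the corpus."""
    chain = defaultdict(list)
    for text in texts:
        words = text.split()
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def generate(chain, start, max_words=10, rng=None):
    """Walk the chain from `start`, picking a random successor at each step."""
    rng = rng or random.Random()
    words = [start]
    # Stop at the length limit or when the last word has no known successor.
    while len(words) < max_words and chain.get(words[-1]):
        words.append(rng.choice(chain[words[-1]]))
    return " ".join(words)


# Placeholder corpus; in practice this would be the text of scraped tweets.
sample_tweets = [
    "open source is fun",
    "open source is hard work",
    "hard work pays off",
]
chain = build_chain(sample_tweets)
print(generate(chain, "open", rng=random.Random(0)))
```

Repeated successors are kept in each list on purpose, so more frequent word pairs are proportionally more likely to be chosen.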

TODO

  • Extract tweets from a Twitter user's profile
  • Extract tweets from Twitter search
  • Extract tweets from a Twitter thread, given the thread link
  • Extract the quoted tweet along with a tweet

Contributions

Please see the contributing documentation for some tips on getting started.

Maintainers

About

License: GNU General Public License v3.0

