NandanDesai / TwitterScraper4J

A Java library that scrapes Twitter to fetch publicly available information.

TwitterScraper4J

! Shutdown Notice !

Twitter has shut down its legacy version (the non-JavaScript website). This library was designed to fetch content by scraping that non-JavaScript version of the site, so now that legacy Twitter is gone, the library no longer works. Refer to this Issue for more info.

Description

This is a Java library that lets you fetch public Twitter data without using the official API.

Pros

  • The JAR file for the entire library is just 430 KB.
  • Unlike the official Twitter API, TwitterScraper4J doesn't have any rate limits.
  • Unlike the official Twitter API, TwitterScraper4J doesn't require you to generate tokens, keys, etc. It's a plug-and-play library.
  • Can fetch around 3200 (or sometimes more) tweets for any public account.
  • Get basic profile information.
  • Search for users.
  • Search for tweets (with keywords, hashtags, etc.).
  • Get the full followers list.
  • Get the full following list.
  • Stream tweets (this is not as good as the official Twitter API and is still experimental).

Cons

  • Cannot get the number of likes and retweets.
  • Cannot get all the replies for a tweet.
  • Cannot download videos attached to tweets.
  • If multiple images are attached to a tweet, then this library can fetch only the first image.

Getting Started

The JAR file is available in the Releases section. Download it, add it to your Java project, and start using it.

For Gradle projects, it is more convenient to add the library as a dependency through JitPack. First, add the following to your root build.gradle:

allprojects {
	repositories {
		...
		maven { url 'https://jitpack.io' }
	}
}

Next, add the dependency:

dependencies {
        implementation 'com.github.NandanDesai:TwitterScraper4J:v1.2.1-beta'
}

Code Examples

  • Getting Profile information like name, description, location, number of followers, verified status, profile picture etc.

    TwitterScraper scraper = TwitterScraper.builder().build();
    Profile profile = scraper.getProfile("realDonaldTrump");
  • Getting the user's timeline

    TwitterScraper twitterScraper = TwitterScraper.builder().build();
    List<Tweet> tweets = twitterScraper.getUserTimeline("realDonaldTrump");

    for (Tweet tweet : tweets) {
        System.out.println(tweet);
    }
  • Search a user

    TwitterScraper scraper = TwitterScraper.builder().build();
    List<User> users = scraper.searchUser("Narendra Modi");
    for (User result : users) {
        System.out.println(result);
    }
  • Get worldwide trends

    TwitterScraper scraper = TwitterScraper.builder().build();
    System.out.println(scraper.getWorldwideTrends());
  • Get all followers list

    TwitterScraper twitterScraper = TwitterScraper.builder().build();
    Iterator<List<User>> it=twitterScraper.getAllFollowers("realDonaldTrump");
    while(it.hasNext()){
        List<User> users=it.next();
        for(User user:users){
            System.out.println(user);
        }
        Thread.sleep(1000);
    }
  • Fetch around 3200 tweets for a given profile

    TwitterScraper twitterScraper = TwitterScraper.builder().build();
    Iterator<List<Tweet>> it=twitterScraper.getAllTweets("realDonaldTrump");
    while(it.hasNext()){
        List<Tweet> tweets=it.next();
        for(Tweet tweet:tweets){
            System.out.println(tweet);
        }
        Thread.sleep(1000);
    }
  • Get a stream of tweets for a particular keyword or hashtag (EXPERIMENTAL feature)

    TwitterScraper twitterScraper = TwitterScraper.builder().build();
    TweetStream stream = twitterScraper.getTweetStream("Kashmir");
    stream.setStreamListener(new TweetStreamListener() {
        @Override
        public void onPageRefresh(List<Tweet> tweets) {
            for (Tweet tweet : tweets) {
                System.out.println(tweet);
            }
        }
    });
    stream.start();
  • To use a proxy

    Proxy proxy = new Proxy(Proxy.Type.SOCKS, new InetSocketAddress(<ip address>, <port>));
    TwitterScraper twitterScraper = TwitterScraper.builder().proxy(proxy).build();
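
The proxy snippet above leaves the address and port as placeholders. As a minimal sketch using only the standard java.net classes, the proxy object could be built as shown here. The class name ProxyExample, the helper socksProxy, and the address 127.0.0.1:9050 are illustrative assumptions, not part of TwitterScraper4J.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;

public class ProxyExample {

    /** Builds a SOCKS proxy descriptor without performing a DNS lookup. */
    public static Proxy socksProxy(String host, int port) {
        // createUnresolved defers name resolution until a connection is made.
        return new Proxy(Proxy.Type.SOCKS, InetSocketAddress.createUnresolved(host, port));
    }

    public static void main(String[] args) {
        // 127.0.0.1:9050 is only a placeholder for a local SOCKS proxy.
        Proxy proxy = socksProxy("127.0.0.1", 9050);
        System.out.println(proxy.type()); // prints SOCKS
    }
}
```

The resulting Proxy object is the kind of value passed to the builder's proxy(...) call in the example above.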

This library offers a lot more. Refer to the tests for other examples.
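
The followers and tweets examples above both walk an Iterator<List<...>> page by page with a pause between pages. If you only need the first N items, that loop can be wrapped in a small helper. The following is a sketch under stated assumptions: PagedFetch and collect are hypothetical names that are not part of TwitterScraper4J, and a plain in-memory iterator stands in for the scraper so that the code runs on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class PagedFetch {

    /**
     * Drains a page-by-page iterator into one list, stopping once maxItems
     * have been collected and sleeping delayMillis between pages.
     */
    public static <T> List<T> collect(Iterator<List<T>> pages, int maxItems, long delayMillis) {
        List<T> all = new ArrayList<>();
        while (pages.hasNext() && all.size() < maxItems) {
            for (T item : pages.next()) {
                if (all.size() == maxItems) {
                    break;
                }
                all.add(item);
            }
            // Pause only if another page is going to be read.
            if (delayMillis > 0 && all.size() < maxItems && pages.hasNext()) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and return what we have so far.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return all;
    }

    public static void main(String[] args) {
        // Stand-in for a paged result: three pages of strings (hypothetical data).
        Iterator<List<String>> pages = Arrays.asList(
                Arrays.asList("t1", "t2"),
                Arrays.asList("t3", "t4"),
                Arrays.asList("t5")).iterator();
        List<String> firstThree = collect(pages, 3, 0);
        System.out.println(firstThree); // prints [t1, t2, t3]
    }
}
```

With the real library, the iterator returned by getAllTweets or getAllFollowers would presumably be passed in place of the stand-in, with a delay such as the 1000 ms used in the examples above.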

To-do List

  • Getting media links
  • Complete 'retweet with comment'
  • Timestamps
  • Add proxy support
  • Getting a stream of tweets containing a given keyword (EXPERIMENTAL)
  • Followers of a given user
  • Friends of a given user

License

MIT
