JustAnotherArchivist / snscrape

A social networking service scraper in Python

All Twitter scrapes are failing: `blocked (404)`

JustAnotherArchivist opened this issue · comments

With the exception of twitter-trends, all Twitter scrapes are failing since sometime in the past hour. This is likely connected to Twitter as a whole getting locked behind a login wall since earlier today. There is no known workaround at this time, and it's not known whether this will be fixable.

So sad :-(
My research project depends heavily on this library, and I pay tribute to your effort in maintaining it.

Twitter disabled its public web site today (2023-06-30) and now requires users to log in; prior to this date Twitter was publicly accessible. Would it be possible to automate the login as well by providing a username and password to snscrape, i.e. logging in to Twitter and simulating a logged-in session before calling the GraphQL API?

I do not think the developer would do this, as he has said that auth will never be added as a feature: see #270.
Let's see our great developer's solution; I hope it won't take long.

Before using this library, I had started doing manual scraping myself using Puppeteer, and I had automated the sign-in part (even through 2FA). The issue is that if you sign in frequently within a short period of time, you get blocked by Twitter and cannot sign in again for a certain amount of time. So I'm not sure what the ideal setup would be in this case...

Don't nuke this one as off-topic: A Twitter employee says it's temporary:

https://twitter.com/AqueelMiq/status/1674843555486134272
"this is a temporary restriction, we will re-enable logged out twitter access in the near future"

Can I use my personal OAuth key with snscrape for Twitter?

Elon talked about it too 💀 https://twitter.com/elonmusk/status/1674942336583757825

Musk referred to EXTREME scraping, indicating that scrapers may no longer be functional after the changes. Let's see how it plays out.

Can I edit the twitter.py module with my own bearer key or even an OAuth login key, locally on the computer where I installed the snscrape module, since the change would only affect my local copy? Thanks.

Hello,

This may or may not help. Here's a route to access Tweets without logging in (it contains a further iframe to platform.twitter.com):
https://cdn.embedly.com/widgets/media.html?type=text%2Fhtml&key=a19fcc184b9711e1b4764040d3dc5c07&schema=twitter&url=https://twitter.com/elonmusk/status/1674865731136020505

Would combining this with a pre-existing list of Tweets allow data scraping to continue? Alternatively, users could build the tweet list using a Google search, e.g. for Tesla tweets: "site:twitter.com/tesla/status", or via another cached list (e.g. the Wayback Machine: https://web.archive.org/web/*/https://twitter.com/tesla/status*).

If I'm off the mark, I apologise, but I thought I'd pass this on, on the off chance it may help, at least as a temporary measure.

Just a note to @JustAnotherArchivist - thank you for the hard work you have put into this library - it is very much appreciated

Ben
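A minimal sketch of the tweet-list idea above (my illustration, not code from the thread): given a list of status URLs harvested from a Google "site:" search or the Wayback Machine, extract the numeric tweet IDs for later lookup.

```python
import re

# Regex for the numeric status ID inside a Twitter status URL.
STATUS_ID_RE = re.compile(r"twitter\.com/[^/]+/status/(\d+)")

def extract_tweet_ids(urls):
    """Pull unique tweet IDs out of a list of status URLs, preserving order."""
    seen, ids = set(), []
    for url in urls:
        m = STATUS_ID_RE.search(url)
        if m and m.group(1) not in seen:
            seen.add(m.group(1))
            ids.append(m.group(1))
    return ids

urls = [
    "https://twitter.com/tesla/status/1652193613223436289",
    "https://web.archive.org/web/2023/https://twitter.com/tesla/status/1652193613223436289",
]
print(extract_tweet_ids(urls))  # ['1652193613223436289']
```

Note the Wayback URL still matches, since the original status URL is embedded in it, and duplicates collapse to one ID.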

URL: https://cdn.syndication.twimg.com/tweet-result

CODE:

import requests

# Public syndication endpoint that returns a single tweet as JSON.
url = "https://cdn.syndication.twimg.com/tweet-result"

querystring = {"id": "1652193613223436289", "lang": "en"}

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/114.0",
    "Accept": "*/*",
    "Accept-Language": "en-US,en;q=0.5",
    "Origin": "https://platform.twitter.com",
    "Referer": "https://platform.twitter.com/",
    "Sec-Fetch-Dest": "empty",
    "Sec-Fetch-Mode": "cors",
    "Sec-Fetch-Site": "cross-site",
}

# A plain GET is enough; the original Insomnia export also sent an empty
# request body, which is unnecessary on a GET.
response = requests.get(url, params=querystring, headers=headers)
response.raise_for_status()
print(response.text)

Generated by Insomnia
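The response above is JSON, so it can be reduced to a few fields. A hedged sketch: the key names used here (id_str, text, user.screen_name, created_at) are assumptions based on observed syndication responses, not a documented schema, and may change.

```python
import json

def summarize_tweet(payload: str) -> dict:
    """Reduce a tweet-result JSON payload to a few fields.

    The key names are assumptions from observed responses, not a
    documented schema; missing keys simply come back as None.
    """
    data = json.loads(payload)
    return {
        "id": data.get("id_str"),
        "text": data.get("text"),
        "user": (data.get("user") or {}).get("screen_name"),
        "created_at": data.get("created_at"),
    }

# Fabricated sample standing in for a live response.
sample = '{"id_str": "1652193613223436289", "text": "hello", "user": {"screen_name": "tesla"}}'
print(summarize_tweet(sample))
```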

This seems to be working; the problem might be the rate limit and stability. More tests are needed.

It does not allow you to see all the accounts followed by a user either; would there be a solution for that? It would help me.

https://twitter.com/elonmusk/status/1675187969420828672

😂

@ElonMusk
To address extreme levels of data scraping & system manipulation, we’ve applied the following temporary limits:

  • Verified accounts are limited to reading 6000 posts/day
  • Unverified accounts to 600 posts/day
  • New unverified accounts to 300/day

My IP was banned even though I was using a proxy that changes the IP dynamically. What options do we have now?

@JustAnotherArchivist Will the scrapers be working anytime soon? Also, I want to thank you for your hard work on these scrapers.

Hi guys, I'm new to GitHub and coding, but maybe this is helpful:

https://twitter.com/iam4x/status/1675194767854956546?s=20

This hasn't worked for a long time.

What about using Selenium first to log in, and after that using sntwitter to get the tweets?
The question here is how to link the Selenium session with sntwitter.

lol this seems to be working,
na never mind, besides it was fun for some minutes, it messes up the rest of the features so no lol after all

What about using Selenium first to log in, and after that using sntwitter to get the tweets? The question here is how to link the Selenium session with sntwitter.

The beauty of snscrape is that it doesn't require authentication. If we're going to have to start using login/auth and tools like Selenium, then it should be spun off into another project and not snscrape. Also, using any form of auth gives Twitter another way to ban mass collection, which is the use case for many users of snscrape.

Hi! :)) It works great! Is there perhaps any way to scrape repost and comment data as well? I need a mapping of tweet spread for my master's thesis, but what companies are doing lately with their APIs (like Twitter or Reddit) is terrible....

You are describing my situation. I need the comments for the same purpose; please let me know when you find a solution. My submission is in September.

What about using Selenium first to log in, and after that using sntwitter to get the tweets? The question here is how to link the Selenium session with sntwitter.

The beauty of snscrape is that it doesn't require authentication. If we're going to have to start using login/auth and tools like Selenium, then it should be spun off into another project and not snscrape. Also, using any form of auth gives Twitter another way to ban mass collection, which is the use case for many users of snscrape.

So you would rather have it completely stop working for all other use cases as well?

It would be great if snscrape added a new scraper, e.g. a TwitterProfileScraperSyn, that grabs tweet data from the still publicly available syndication profile feeds. The syndication feed shows 20 tweets, which is good enough for many applications.

Great!

Is there any other param I can put in the querystring besides the tweet id?
I want to get tweets from specific users, but can't find which params to use.

  • Auth will not be added, as has been mentioned at least twice now.
  • Yes, if the syndication feeds are the only remaining option, I will switch to that or add a separate scraper for them. The thing is that I don't want to have to (read: don't have time to) change everything again in two days when Elon has another one of his brilliant ideas, so I'm waiting for the dust to settle down a bit.
What about using Selenium first to log in, and after that using sntwitter to get the tweets? The question here is how to link the Selenium session with sntwitter.

The beauty of snscrape is that it doesn't require authentication. If we're going to have to start using login/auth and tools like Selenium, then it should be spun off into another project and not snscrape. Also, using any form of auth gives Twitter another way to ban mass collection, which is the use case for many users of snscrape.

So you would rather have it completely stop working for all other use cases as well?

Yes (for Twitter), and I expressed why, and so has JustAnotherArchivist: #issuecomment-1616774736 / #270

May I please ask how we can get a specific user's tweets from a start time to an end time right now? I'm really in a hurry and currently have no clues....

And this one seems to have no param for a screen name. Do we have other URLs?
https://cdn.syndication.twimg.com/tweet-result

Thank you for all your help, and great praise to the author @JustAnotherArchivist.

Broken by Musk.

I hope a solution will be found soon. I really need this lib for my final studies project; otherwise I could fail...

Does anyone know if someone's working on a snscrape fork that implements login/auth for Twitter?

Really appreciate your work, JustAnotherArchivist; thank you for all you do. Hoping Elon pulls back some of the restrictions and we can have snscrape working as before! Best wishes.

@pleblira This library uses the snscrape classes for User and Tweet and supports auth:
https://github.com/vladkens/twscrape

What's the URL to see the user profile? Sorry if it's a dumb question, but I could not find any reference on the net.

@nbrahmani You can try

https://syndication.twitter.com/srv/timeline-profile/screen-name/[username]

For example, https://syndication.twitter.com/srv/timeline-profile/screen-name/elonmusk

User info will be stored inside the <script id="__NEXT_DATA__"> tag. The tag itself is server-side rendered, so you can use requests with BeautifulSoup (assuming Python) to extract the data you need. You can get the user profile and up to the 20 most recent tweets from that user.
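The extraction step described above can be sketched with only the standard library (re + json in place of BeautifulSoup). Since live responses can no longer be relied on, the sketch runs against a toy HTML snippet standing in for the server-rendered page:

```python
import json
import re

def parse_next_data(html: str) -> dict:
    """Extract and decode the JSON inside <script id="__NEXT_DATA__" ...>."""
    m = re.search(
        r'<script id="__NEXT_DATA__"[^>]*>(.*?)</script>', html, re.DOTALL
    )
    if not m:
        raise ValueError("__NEXT_DATA__ script tag not found")
    return json.loads(m.group(1))

# Toy page standing in for the timeline-profile HTML.
page = (
    '<html><script id="__NEXT_DATA__" type="application/json">'
    '{"props": {"pageProps": {"headerProps": {"screenName": "elonmusk"}}}}'
    "</script></html>"
)
data = parse_next_data(page)
print(data["props"]["pageProps"]["headerProps"]["screenName"])  # elonmusk
```

The headerProps path here mirrors the JSON output quoted later in this thread; other keys in a real response are not guaranteed.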

Unfortunately, this endpoint has been dead for the last 16-17 hours.

@nickchen120235

I tried this, but I need the Twitter Blue status of a user, and this does not return that.

@nickchen120235

I tried this, but I need the Twitter Blue status of a user, and this does not return that.

There's a boolean, is_blue_verified or something similar, in the user key IIRC. Maybe that's what you need?

@nickchen120235
I tried this, but I need the Twitter Blue status of a user, and this does not return that.

There's a boolean, is_blue_verified or something similar, in the user key IIRC. Maybe that's what you need?

As far as I can see, it does not have that boolean. I get the following output:

{"props":{"pageProps":{"contextProvider":{"features":{},"scribeData":{"client_version":null,"dnt":false,"widget_id":"embed-0","widget_origin":"","widget_frame":"","widget_partner":"","widget_site_screen_name":"","widget_site_user_id":"","widget_creator_screen_name":"","widget_creator_user_id":"","widget_iframe_version":"bb06567:1687853948269","widget_data_source":"screen-name:elonmusk","session_id":""},"messengerContext":{"embedId":"embed-0"},"hasResults":true,"lang":"en","theme":"light"},"lang":"en","maxHeight":null,"showHeader":true,"hideBorder":false,"hideFooter":false,"hideScrollBar":false,"transparent":false,"timeline":{"entries":[]},"headerProps":{"screenName":"elonmusk"}},"__N_SSP":true},"page":"/timeline-profile/screen-name/[screenName]","query":{"screenName":"elonmusk"},"buildId":"vn5fUacsNpP-nIkFRlFf6","assetPrefix":"https://platform.twitter.com","isFallback":false,"gssp":true,"customServer":true}

@nbrahmani Sorry for the confusion 😓

As I mentioned earlier, this endpoint is dead, so it's no longer returning the correct response.

If it were working, the info you need would be in the user key in one of the entries.

Hello guys, hello @JustAnotherArchivist, any update on the issue?

AFAIK

  1. The login wall is still there.
  2. Single embedded tweets work, but embedded timelines don't. (You can try at https://publish.twitter.com)
  3. Authentication won't be implemented anyway.

Can we get the IDs of the posts generated by a specific profile? If a single embedded tweet is working, a for-loop through all the IDs would work in the interim. Thank you!
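The interim loop suggested above could start from URL construction like this. A sketch only: the endpoint, its parameters, and its availability are all subject to change, and the actual fetching (e.g. with requests, plus a polite delay between calls) is left out.

```python
from urllib.parse import urlencode

# Syndication endpoint from earlier in this thread.
BASE = "https://cdn.syndication.twimg.com/tweet-result"

def tweet_result_urls(tweet_ids, lang="en"):
    """Yield one syndication URL per tweet ID, ready for a fetch loop."""
    for tweet_id in tweet_ids:
        yield f"{BASE}?{urlencode({'id': tweet_id, 'lang': lang})}"

for url in tweet_result_urls(["1652193613223436289"]):
    print(url)
```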

What is this code for? @prasunshrestha

As of now it seems to be possible to view public tweets without logging in, and the Wayback Machine can save tweet pages again.
Current snscrape scraping methods still return 404, so it's likely that the API endpoints or something else have changed.

Can't confirm anything more than that for now.

Yes, it is a different endpoint which only returns the single requested tweet, no replies or the replied-to tweet.

Yes, it is a different endpoint which only returns the single requested tweet, no replies or the replied-to tweet.

Is it already implemented? If yes, which version should I update to?

No, my previous comment still applies.

The thing is that I don't want to have to (read: don't have time to) change everything again in two days when Elon has another one of his brilliant ideas, so I'm waiting for the dust to settle down a bit.

Yes, it is a different endpoint which only returns the single requested tweet, no replies or the replied-to tweet.

I would be happy if I could just get the content of the tweet. Please correct me if I am wrong, but the only way I see at the moment (without authentication) is through embedded tweets. As a result, if I could get the post IDs, I could get the content. Is that possible at the moment?

No, my previous comment still applies.

The thing is that I don't want to have to (read: don't have time to) change everything again in two days when Elon has another one of his brilliant ideas, so I'm waiting for the dust to settle down a bit.

Okay, thank you brother. I hope it won't take a long time; I really need this for my project.

Yes, it is a different endpoint which only returns the single requested tweet, no replies or the replied-to tweet.

Ah, indeed, I didn't notice there are no replies. As for the replied-to tweet, I see there's an in_reply_to_status_id_str field to get the replied-to tweet, quoted_status_id_str to get the quoted tweet, and conversation_id_str to get the conversation root tweet (not sure), so it could be solved with another request, I suppose, if needed. Yet, well, it might stay this way, or it might not. We can only observe for now.
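Assuming those fields appear in the payload, the follow-up-request idea can be sketched as a small helper that collects whichever related-tweet IDs a tweet carries. The field names come from the comment above; whether they are always populated is not guaranteed.

```python
def related_tweet_ids(tweet: dict) -> dict:
    """Map the related-tweet ID fields, if present, to readable labels."""
    fields = {
        "replied_to": "in_reply_to_status_id_str",
        "quoted": "quoted_status_id_str",
        "conversation_root": "conversation_id_str",
    }
    return {label: tweet[key] for label, key in fields.items() if tweet.get(key)}

# Fabricated example: a tweet replying to tweet "1".
tweet = {"id_str": "2", "in_reply_to_status_id_str": "1", "conversation_id_str": "1"}
print(related_tweet_ids(tweet))  # {'replied_to': '1', 'conversation_root': '1'}
```

Each returned ID could then be fed back into the single-tweet endpoint as a separate request.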

zedeus/nitter#919 (comment)

Maybe this could provide some help?

Disclaimer: this may be broken by the Tesla guy at any time, so proceed with caution.

While I was collecting data for my thesis, my Twitter developer account was suddenly closed. Right now I have little time left, and I desperately need my Twitter data. I couldn't get the scraper running here; is it my fault?

https://twitter.com/TitterDaily/status/1676624363787894784?s=20

👀

No, it does not work, unfortunately

Hi, thanks for the code. This works amazingly. However, instead of scraping a single tweet using the tweet id, I want to get multiple tweets based on a search string, with start and end dates and a limit on how many tweets to scrape. What changes do I have to make to the query string? I searched online but could not find documentation on this. I made an assignment for students at a university that depended on the snscrape library. Any guidance is appreciated.

Any update?

Hi, thanks for the code. This works amazingly. However, instead of scraping a single tweet using the tweet id, I want to get multiple tweets based on a search string, with start and end dates and a limit on how many tweets to scrape. What changes do I have to make to the query string? I searched online but could not find documentation on this. I made an assignment for students at a university that depended on the snscrape library. Any guidance is appreciated.

Yes please, same question. And is it possible to scrape within a period of dates?

AFAIK:

  1. As of now, Twitter allows accessing only the tweet content (not the associated replies) via the TweetResultByRestId GraphQL API, without logging in. That API requires you to have the tweet id beforehand.
  2. To access the "search by query" endpoint, you would either have to log in before running any scraping task or use the official Twitter API (the free tier does not allow read requests; the $100 Basic tier allows reading tweets with a monthly cap of 10,000).
  3. Logging in and querying the search/usertweet endpoints may result in your account getting banned. They are also rate-limited at 50 requests per 15 minutes per endpoint.

I'm not going to refer to any other tools that allow authentication support, but you'll find some online that build on the snscrape and twint models to support user authentication and GraphQL endpoint querying.

NOTE: Anybody attempting to do so should research and understand the risks and liabilities associated with scraping with an authenticated user.
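The 50-requests-per-15-minutes figure mentioned above can at least be respected client-side with a simple sliding-window limiter. A minimal sketch (generic code, not part of snscrape; the limit and window values are taken from the comment above and may change):

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls per `window` seconds (sliding window)."""

    def __init__(self, limit: int = 50, window: float = 15 * 60.0):
        self.limit = limit
        self.window = window
        self._calls: deque = deque()  # timestamps of calls still inside the window

    def delay(self, now: float) -> float:
        """Seconds to wait before the next call is allowed at time `now`."""
        while self._calls and now - self._calls[0] >= self.window:
            self._calls.popleft()
        if len(self._calls) < self.limit:
            return 0.0
        # The oldest call must age out of the window before we may proceed.
        return self.window - (now - self._calls[0])

    def record(self, now: float) -> None:
        self._calls.append(now)

    def acquire(self) -> None:
        """Block until a call is permitted, then record it."""
        now = time.monotonic()
        wait = self.delay(now)
        if wait > 0:
            time.sleep(wait)
            now += wait
        self.record(now)
```

Calling `acquire()` before each request blocks just long enough to stay under the cap for one endpoint.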

Hello,
This may or may not help. Here's a route to access Tweets without logging in (contains further iframe to platform.twitter.com): https://cdn.embedly.com/widgets/media.html?type=text%2Fhtml&key=a19fcc184b9711e1b4764040d3dc5c07&schema=twitter&url=https://twitter.com/elonmusk/status/1674865731136020505

When I use this code on a retweet, it ONLY returns the original tweeter's username in the JSON. Is there any way to return the retweeter's username?

For example, putting this Tweet ID into the code (1499332226177286144, a retweet by AaronBell4NUL) returns information about the original tweet (1499056854688841732, a tweet by NewsNBC). AaronBell4NUL is not reported in the JSON that's returned, even though I entered his Tweet ID. However, I'd like to be able to enter the retweet's Tweet ID and return AaronBell4NUL.

Here is what I have gathered about what snscrape twitter scraping may look like in the future, if anyone could confirm, deny, or add details that would be awesome.

There is some optimism that it is possible for twitter users to be queried by screen name and that their syndication feeds can be captured. These syndication feeds would consist of up to 20 tweets per user (no retweets or replies) and wouldn't be subject to rate limiting? These updates are pending until the dust on the Twitter changes settles a bit.

https://github.com/zedeus/nitter/pull/927/commits

Nitter has updated endpoints, could be useful for people to mess around with.

I'm curious if this is going to just be a never ending cycle of cat and mouse.

Current state of twitter sucks.

Hello @JustAnotherArchivist, any update or news about the issue?


Yet, keyword scraping is not possible, as far as I understand.


waiting for an answer guys

When will the API endpoints be free from the lockdown? Any idea?

Nitter keyword scraping is possible now. I just tried.

Yeah, I saw, it got implemented in zedeus/nitter@67203a4.

Looks like things have been fairly stable for a few days now. I'm not sure yet when I'll have time to implement the necessary changes, possibly on the weekend.

Thanks for your work @JustAnotherArchivist, you are really saving our lives.

Is there a way to donate for your work ? @JustAnotherArchivist

It seems that the Nitter keyword search shows results for the last 10 days only. Maybe a limit of the new endpoint? Hope it will be lifted.

zedeus/nitter#938


Hey @JustAnotherArchivist, any update on this? Thanks!

I literally have my final year project depending on this module. Please save us @JustAnotherArchivist

If you for some reason absolutely need to use snscrape for getting a single tweet, and you don't mind getting it by tweet ID purely in code, you can do it like this:

from snscrape.base import ScraperException
from snscrape.modules.twitter import (
    Tweet,
    _TwitterAPIType,
    TwitterTweetScraper,
    TwitterTweetScraperMode,
)


def get_items(self):
    variables = {
        "tweetId": str(self._tweetId),
        "includePromotedContent": True,
        "withCommunity": True,
        "withVoice": True,
        # !!! these fields may be deprecated
        # "with_rux_injections": False,
        # "withQuickPromoteEligibilityTweetFields": True,
        # "withBirdwatchNotes": False,
        # "withV2Timeline": True,
    }
    features = {
        "responsive_web_graphql_exclude_directive_enabled": True,
        "verified_phone_label_enabled": False,
        "creator_subscriptions_tweet_preview_api_enabled": False,
        "responsive_web_graphql_timeline_navigation_enabled": True,
        "responsive_web_graphql_skip_user_profile_image_extensions_enabled": False,
        "tweetypie_unmention_optimization_enabled": True,
        "responsive_web_edit_tweet_api_enabled": True,
        "graphql_is_translatable_rweb_tweet_is_translatable_enabled": True,
        "view_counts_everywhere_api_enabled": True,
        "longform_notetweets_consumption_enabled": True,
        "tweet_awards_web_tipping_enabled": False,
        "freedom_of_speech_not_reach_fetch_enabled": True,
        "standardized_nudges_misinfo": True,
        "tweet_with_visibility_results_prefer_gql_limited_actions_policy_enabled": False,
        "longform_notetweets_rich_text_read_enabled": True,
        "longform_notetweets_inline_media_enabled": False,
        "responsive_web_enhance_cards_enabled": False,
        "responsive_web_twitter_article_tweet_consumption_enabled": False,  # new?
        "responsive_web_media_download_video_enabled": True,  # new?
        # !!! these fields may be deprecated
        # "rweb_lists_timeline_redesign_enabled": False,
        # "vibe_api_enabled": True,
        # "interactive_text_enabled": True,
        # "blue_business_profile_image_shape_enabled": True,
        # "responsive_web_text_conversations_enabled": False,
    }
    fieldToggles = {
        "withArticleRichContentState": True,
        "withAuxiliaryUserLabels": True,
    }
    params = {
        "variables": variables,
        "features": features,
        "fieldToggles": fieldToggles,  # seems optional
    }
    url = "https://twitter.com/i/api/graphql/3HC_X_wzxnMmUBRIn3MWpQ/TweetResultByRestId"
    if self._mode is TwitterTweetScraperMode.SINGLE:
        obj = self._get_api_data(url, _TwitterAPIType.GRAPHQL, params=params)
        if not obj["data"]["tweetResult"]:
            return
        yield self._graphql_timeline_tweet_item_result_to_tweet(
            obj["data"]["tweetResult"]["result"], tweetId=self._tweetId
        )


# replace this method
TwitterTweetScraper.get_items = get_items


def get_using_snscrape(tweet_id: int) -> Tweet | None:
    print("Sending API request...")
    try:
        for tweet in TwitterTweetScraper(tweet_id).get_items():
            print("Response: %r." % tweet)
            return tweet
        print("No response from public API.")
    except ScraperException:
        print("Scraping failed.")

Just pass a tweet ID to the get_using_snscrape function and it will return a Tweet instance if there's anything to return, or None otherwise. Obviously, it is not the best way to do it, but it works at least. You can also adapt the results of #996 (comment) to your needs.


There is no way to get tweets by username with since/until date filtering, is there?

Hello, I'm still having this problem (CRITICAL:snscrape.base:Errors: blocked (404)). Do you have any solution?

It seems that Twitter has once again discontinued access to these APIs, like UserByRestId, UserTweets, ...

@JustAnotherArchivist if you are implementing snscrape using the GraphQL API as Nitter did, will snscrape also encounter the following issue with keyword search?:

#996 (comment)

Does anyone know if there is a forked snscrape library that uses Twitter login, or something similar to this what works for keyword search and can scrape historical data?

@leockl Almost certainly, yes.


Hello @JustAnotherArchivist, is a fix for the blocked (404) error going to be implemented soon?

Is there a library which uses twitter login to scrape?

Update: It seems profiles and the tweets on profiles are available to the public without login again.

@JustAnotherArchivist have you seen that? Do the endpoints work again?


FR????

Thanks for spotting! Yes, you can view a profile again, but it is very strange. When I google an account and try to access it, it does not work. When I click on a tweet shown in Google results and from there click on the account, the account's tweets are shown. But strangely, only tweets from before 2023, and not in chronological order. I hardly think this is a "stable" feature of Twitter, but perhaps a sign that they are currently tweaking some things regarding how one can view a profile without being signed in.


I do think it is related to the User-Agent. Twitter doesn't want to lose its search results on Google, since that could damage its influence.
In this case, maybe snscrape could change its User-Agent to Google's crawler to bypass the restrictions.


Ah yes, you're right. If you first go to a single tweet, then traverse to a profile, it will work. But the tweets shown also seem to be "top" tweets or something, not the latest. Certainly worth a look at the Twitter API endpoints to see what this is all about.

Your User Agent doesn't change like that. But it isn't based on the Referer header either (which would have behaviour like you described, different results from direct navigation vs from web search results). Rather, one of the API endpoints for profile timelines is accessible again, but the page URL still isn't. So if you already have the Twitter website open and click on a profile name, it just triggers that API request and works, but if you open a profile page directly (or refresh it after such an API load), you get the login wall.

Interesting development, yes. There are some complications with implementing this (since snscrape sometimes has to load the profile page to fetch a token), but that can be worked around. The results are very poor though, and the 'Replies' tab (which the twitter-profile scraper is/was using) as well as tweet threads (twitter-tweet with scroll or recurse mode) are still inaccessible.

Also, I've been too busy with life, so I haven't had time to implement any of the changes yet, and I can't currently give any ETA either.

Syndication for Twitter profile works again with the latest tweets:

https://syndication.twitter.com/srv/timeline-profile/screen-name/nypost?showReplies=true

@nerra0pos is there a Python library which works to scrape this Syndication for Twitter site?

Hello, does the profile scraper still work?


I don't think so, but you can wait for confirmation of my answer.

Hello @JustAnotherArchivist, is there still no solution or implementation for this error? Are there any updates?


Notes about the syndication timeline-profile endpoint above:

  • The returned data, once the page's JavaScript has fully loaded, is application/json embedded in the HTML, within <script id="__NEXT_DATA__" type="application/json">{…}</script>
  • Only loads about 20 tweets total.

While useless for pulling old data (without crawling through that awful, obfuscated JavaScript), new data could be pulled occasionally and consecutively for sources such as newspapers, publishers, etc.; no login is needed, nor any login-flow issue.
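Since the tweets sit inside that __NEXT_DATA__ script tag, they can be pulled out without executing any JavaScript by extracting and parsing the embedded JSON. A minimal sketch (the page would be fetched with an ordinary GET of the syndication URL above; the key layout inside the parsed dict is undocumented and not shown here):

```python
import json
import re

# The syndication page embeds its data as:
#   <script id="__NEXT_DATA__" type="application/json">{...}</script>
NEXT_DATA_RE = re.compile(
    r'<script id="__NEXT_DATA__" type="application/json">(.*?)</script>',
    re.DOTALL,
)


def extract_next_data(html: str) -> dict:
    """Return the parsed __NEXT_DATA__ JSON from a syndication profile page."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        raise ValueError("no __NEXT_DATA__ script tag found")
    return json.loads(match.group(1))
```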

Does anyone know if the syndication streams are subject to rate limiting?


Sometimes, the tweets in syndication are trimmed.
Do you know of a working endpoint or page (without login) to get the full tweet, given the ID and the username?