I am getting a 404 block
ihabpalamino opened this issue · comments
Describe the bug
Error retrieving https://twitter.com/i/api/graphql/7jT5GT59P8IFjgxwqnEdQw/SearchTimeline?variables=%7B%22rawQuery%22%3A%22%28from%3ANone%29%22%2C%22count%22%3A20%2C%22product%22%3A%22Latest%22%2C%22withDownvotePerspective%22%3Afalse%2C%22withReactionsMetadata%22%3Afalse%2C%22withReactionsPerspective%22%3Afalse%7D&features=%7B%22rweb_lists_timeline_redesign_enabled%22%3Afalse%2C%22blue_business_profile_image_shape_enabled%22%3Afalse%2C%22responsive_web_graphql_exclude_directive_enabled%22%3Atrue%2C%22verified_phone_label_enabled%22%3Afalse%2C%22creator_subscriptions_tweet_preview_api_enabled%22%3Afalse%2C%22responsive_web_graphql_timeline_navigation_enabled%22%3Atrue%2C%22responsive_web_graphql_skip_user_profile_image_extensions_enabled%22%3Afalse%2C%22tweetypie_unmention_optimization_enabled%22%3Atrue%2C%22vibe_api_enabled%22%3Atrue%2C%22responsive_web_edit_tweet_api_enabled%22%3Atrue%2C%22graphql_is_translatable_rweb_tweet_is_translatable_enabled%22%3Atrue%2C%22view_counts_everywhere_api_enabled%22%3Atrue%2C%22longform_notetweets_consumption_enabled%22%3Atrue%2C%22tweet_awards_web_tipping_enabled%22%3Afalse%2C%22freedom_of_speech_not_reach_fetch_enabled%22%3Afalse%2C%22standardized_nudges_misinfo%22%3Atrue%2C%22tweet_with_visibility_results_prefer_gql_limited_actions_policy_enabled%22%3Afalse%2C%22interactive_text_enabled%22%3Atrue%2C%22responsive_web_text_conversations_enabled%22%3Afalse%2C%22longform_notetweets_rich_text_read_enabled%22%3Afalse%2C%22longform_notetweets_inline_media_enabled%22%3Afalse%2C%22responsive_web_enhance_cards_enabled%22%3Afalse%2C%22responsive_web_twitter_blue_verified_badge_is_enabled%22%3Atrue%7D: blocked (404)
4 requests to https://twitter.com/i/api/graphql/7jT5GT59P8IFjgxwqnEdQw/SearchTimeline?variables=%7B%22rawQuery%22%3A%22%28from%3ANone%29%22%2C%22count%22%3A20%2C%22product%22%3A%22Latest%22%2C%22withDownvotePerspective%22%3Afalse%2C%22withReactionsMetadata%22%3Afalse%2C%22withReactionsPerspective%22%3Afalse%7D&features=%7B%22rweb_lists_timeline_redesign_enabled%22%3Afalse%2C%22blue_business_profile_image_shape_enabled%22%3Afalse%2C%22responsive_web_graphql_exclude_directive_enabled%22%3Atrue%2C%22verified_phone_label_enabled%22%3Afalse%2C%22creator_subscriptions_tweet_preview_api_enabled%22%3Afalse%2C%22responsive_web_graphql_timeline_navigation_enabled%22%3Atrue%2C%22responsive_web_graphql_skip_user_profile_image_extensions_enabled%22%3Afalse%2C%22tweetypie_unmention_optimization_enabled%22%3Atrue%2C%22vibe_api_enabled%22%3Atrue%2C%22responsive_web_edit_tweet_api_enabled%22%3Atrue%2C%22graphql_is_translatable_rweb_tweet_is_translatable_enabled%22%3Atrue%2C%22view_counts_everywhere_api_enabled%22%3Atrue%2C%22longform_notetweets_consumption_enabled%22%3Atrue%2C%22tweet_awards_web_tipping_enabled%22%3Afalse%2C%22freedom_of_speech_not_reach_fetch_enabled%22%3Afalse%2C%22standardized_nudges_misinfo%22%3Atrue%2C%22tweet_with_visibility_results_prefer_gql_limited_actions_policy_enabled%22%3Afalse%2C%22interactive_text_enabled%22%3Atrue%2C%22responsive_web_text_conversations_enabled%22%3Afalse%2C%22longform_notetweets_rich_text_read_enabled%22%3Afalse%2C%22longform_notetweets_inline_media_enabled%22%3Afalse%2C%22responsive_web_enhance_cards_enabled%22%3Afalse%2C%22responsive_web_twitter_blue_verified_badge_is_enabled%22%3Atrue%7D failed, giving up.
Errors: blocked (404), blocked (404), blocked (404), blocked (404)
127.0.0.1 - - [03/Jul/2023 13:05:51] "POST /scrape-tweets2 HTTP/1.1" 500 -
Traceback (most recent call last):
File "C:\Users\HP Probook\PycharmProjects\firstproject\venv\lib\site-packages\flask\app.py", line 2213, in __call__
return self.wsgi_app(environ, start_response)
File "C:\Users\HP Probook\PycharmProjects\firstproject\venv\lib\site-packages\flask\app.py", line 2193, in wsgi_app
response = self.handle_exception(e)
File "C:\Users\HP Probook\PycharmProjects\firstproject\venv\lib\site-packages\flask\app.py", line 2190, in wsgi_app
response = self.full_dispatch_request()
File "C:\Users\HP Probook\PycharmProjects\firstproject\venv\lib\site-packages\flask\app.py", line 1486, in full_dispatch_request
rv = self.handle_user_exception(e)
File "C:\Users\HP Probook\PycharmProjects\firstproject\venv\lib\site-packages\flask\app.py", line 1484, in full_dispatch_request
rv = self.dispatch_request()
File "C:\Users\HP Probook\PycharmProjects\firstproject\venv\lib\site-packages\flask\app.py", line 1469, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
File "C:\Users\HP Probook\PycharmProjects\firstproject\TweetsSraper.py", line 29, in scrape_tweets
for i, tweet in enumerate(scraper.get_items()):
File "C:\Users\HP Probook\PycharmProjects\firstproject\venv\lib\site-packages\snscrape\modules\twitter.py", line 1699, in get_items
for obj in self._iter_api_data('https://twitter.com/i/api/graphql/7jT5GT59P8IFjgxwqnEdQw/SearchTimeline', _TwitterAPIType.GRAPHQL, params, paginationParams, cursor = self._cursor, instructionsPath = ['data', 'search_by_raw_query', 'search_timeline', 'timeline', 'instructions']):
File "C:\Users\HP Probook\PycharmProjects\firstproject\venv\lib\site-packages\snscrape\modules\twitter.py", line 867, in _iter_api_data
obj = self._get_api_data(endpoint, apiType, reqParams, instructionsPath = instructionsPath)
File "C:\Users\HP Probook\PycharmProjects\firstproject\venv\lib\site-packages\snscrape\modules\twitter.py", line 838, in _get_api_data
r = self._get(endpoint, params = params, headers = self._apiHeaders, responseOkCallback = functools.partial(self._check_api_response, apiType = apiType, instructionsPath = instructionsPath))
File "C:\Users\HP Probook\PycharmProjects\firstproject\venv\lib\site-packages\snscrape\base.py", line 272, in _get
return self._request('GET', *args, **kwargs)
File "C:\Users\HP Probook\PycharmProjects\firstproject\venv\lib\site-packages\snscrape\base.py", line 268, in _request
raise ScraperException(msg)
snscrape.base.ScraperException: 4 requests to https://twitter.com/i/api/graphql/7jT5GT59P8IFjgxwqnEdQw/SearchTimeline?variables=%7B%22rawQuery%22%3A%22%28from%3ANone%29%22%2C%22count%22%3A20%2C%22product%22%3A%22Latest%22%2C%22withDownvotePerspective%22%3Afalse%2C%22withReactionsMetadata%22%3Afalse%2C%22withReactionsPerspective%22%3Afalse%7D&features=%7B%22rweb_lists_timeline_redesign_enabled%22%3Afalse%2C%22blue_business_profile_image_shape_enabled%22%3Afalse%2C%22responsive_web_graphql_exclude_directive_enabled%22%3Atrue%2C%22verified_phone_label_enabled%22%3Afalse%2C%22creator_subscriptions_tweet_preview_api_enabled%22%3Afalse%2C%22responsive_web_graphql_timeline_navigation_enabled%22%3Atrue%2C%22responsive_web_graphql_skip_user_profile_image_extensions_enabled%22%3Afalse%2C%22tweetypie_unmention_optimization_enabled%22%3Atrue%2C%22vibe_api_enabled%22%3Atrue%2C%22responsive_web_edit_tweet_api_enabled%22%3Atrue%2C%22graphql_is_translatable_rweb_tweet_is_translatable_enabled%22%3Atrue%2C%22view_counts_everywhere_api_enabled%22%3Atrue%2C%22longform_notetweets_consumption_enabled%22%3Atrue%2C%22tweet_awards_web_tipping_enabled%22%3Afalse%2C%22freedom_of_speech_not_reach_fetch_enabled%22%3Afalse%2C%22standardized_nudges_misinfo%22%3Atrue%2C%22tweet_with_visibility_results_prefer_gql_limited_actions_policy_enabled%22%3Afalse%2C%22interactive_text_enabled%22%3Atrue%2C%22responsive_web_text_conversations_enabled%22%3Afalse%2C%22longform_notetweets_rich_text_read_enabled%22%3Afalse%2C%22longform_notetweets_inline_media_enabled%22%3Afalse%2C%22responsive_web_enhance_cards_enabled%22%3Afalse%2C%22responsive_web_twitter_blue_verified_badge_is_enabled%22%3Atrue%7D failed, giving up.
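A note on reading the log above: the `variables` parameter of the failing URL is percent-encoded JSON, so decoding it shows the search query that was actually sent. The sketch below uses only the Python standard library on a shortened copy of that URL (the long `features` parameter is left out for brevity, and `decode_variables` is a hypothetical helper name, not part of snscrape). It suggests the query was `(from:None)`, i.e. the `username` form field was probably missing from the request. That may be a separate problem from the `blocked (404)` responses themselves.

```python
import json
from urllib.parse import parse_qs, urlsplit

# Shortened copy of the failing URL from the log; the `features` parameter is omitted.
FAILING_URL = (
    "https://twitter.com/i/api/graphql/7jT5GT59P8IFjgxwqnEdQw/SearchTimeline"
    "?variables=%7B%22rawQuery%22%3A%22%28from%3ANone%29%22%2C%22count%22%3A20"
    "%2C%22product%22%3A%22Latest%22%2C%22withDownvotePerspective%22%3Afalse"
    "%2C%22withReactionsMetadata%22%3Afalse%2C%22withReactionsPerspective%22%3Afalse%7D"
)


def decode_variables(url):
    """Return the JSON object stored in the URL's `variables` query parameter."""
    query = parse_qs(urlsplit(url).query)  # parse_qs also undoes the percent-encoding
    return json.loads(query["variables"][0])


variables = decode_variables(FAILING_URL)
print(variables["rawQuery"])
```

If the printed query contains `from:None`, it is worth checking that the client really sends a `username` field in the POST form data.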
How to reproduce
It was working fine, but it no longer works.
Expected behaviour
My code:
from datetime import datetime
from flask import Flask, request, jsonify
import snscrape.modules.twitter as sntwitter
import pandas as pd
import json
import re

app = Flask(__name__)


@app.route('/scrape-tweets2', methods=['POST'])
def scrape_tweets():
    # Read the search parameters from the POST form data.
    Username = request.form.get('username')
    SINCE = request.form.get('since')
    UNTIL = request.form.get('until')
    PLATFORM_NAME = request.form.get('plateform')

    # Restrict the search to a date range only when both bounds are given.
    if SINCE and UNTIL:
        since_date = datetime.strptime(SINCE, "%Y-%m-%d")
        until_date = datetime.strptime(UNTIL, "%Y-%m-%d")
        date_range = f" since:{since_date.strftime('%Y-%m-%d')} until:{until_date.strftime('%Y-%m-%d')}"
    else:
        date_range = ""

    scraper = sntwitter.TwitterSearchScraper(f"(from:{Username}){date_range}")
    tweets = []
    for i, tweet in enumerate(scraper.get_items()):
        # tweet.media holds media objects, not strings, so check the type.
        if tweet.media is not None and any(isinstance(medium, sntwitter.Video) for medium in tweet.media):
            view_count = tweet.viewCount
        else:
            view_count = "Not a video tweet"
        data = {
            "id_post": tweet.id,
            "Date": tweet.date.strftime("%Y-%m-%d"),
            "Heure": tweet.date.strftime("%H:%M:%S"),
            "content": tweet.content,
            "username": tweet.user.username,
            "likecount": tweet.likeCount,
            "shares": tweet.retweetCount,
            "comments": tweet.replyCount,
            "platformname": PLATFORM_NAME,
            "postUrl": tweet.url
        }
        tweets.append(data)
        # Stop after roughly 800 tweets.
        if i > 800:
            break

    tweet_df = pd.DataFrame(tweets, columns=["id_post", "Date", "Heure", "content", "username", "likecount", "shares",
                                             "comments", "platformname", "postUrl"])
    tweet_df.to_csv('tweeter.csv', sep=";", encoding='utf-8', index=False)
    tweet_json = tweet_df.to_json(orient='records', indent=4, force_ascii=False)
    # Strip control characters before returning the JSON.
    clean_insta_json = re.sub(r"[\x00-\x1F\x7F-\x9F]", "", tweet_json)
    response = jsonify(json.loads(clean_insta_json))
    response.headers['Content-Type'] = 'application/json'
    return response


if __name__ == '__main__':
    app.run(debug=True)
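One possible hardening of the route above, as a minimal sketch: build the search string in a small helper that rejects a missing username, so a request without the `username` field fails early instead of searching for `from:None`. The helper name `build_search_query` and its error message are hypothetical, and it reproduces only the query format used in the code above. It would not by itself resolve the `blocked (404)` responses from Twitter.

```python
from datetime import datetime


def build_search_query(username, since=None, until=None):
    """Build the search string used in the route, rejecting a missing username."""
    if not username or not username.strip():
        raise ValueError("username is required")
    query = f"(from:{username.strip()})"
    if since and until:
        # strptime raises ValueError if a date is not in YYYY-MM-DD format.
        since_date = datetime.strptime(since, "%Y-%m-%d")
        until_date = datetime.strptime(until, "%Y-%m-%d")
        query += f" since:{since_date:%Y-%m-%d} until:{until_date:%Y-%m-%d}"
    return query


print(build_search_query("someuser", "2023-01-01", "2023-02-01"))
```

In the Flask route, the `ValueError` could be caught and turned into a 400 response, so the client sees a clear message rather than a 500.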
Screenshots and recordings
No response
Operating system
Windows 11
Python version: output of python3 --version
3.9.13
snscrape version: output of snscrape --version
snscrape-0.6.2.20230321.dev39+gc3b216c
Scraper
TwitterSearchScraper
How are you using snscrape?
Module (import snscrape.modules.something in Python code)
Backtrace
No response
Log output
No response
Dump of locals
No response
Additional context
No response
Same here: one day I was running simple queries, and the next I was getting blocked.
You can use the Twitter API to work around this, but you will hit a limit on the number of tweets you can scrape.
My guess is that the owner of Twitter stopped unlimited queries to keep AI from being trained on his social network. (Many articles I have read say so.)
Same here. Has anybody managed to come up with a solution?