twitter scrapper error
mahajnay opened this issue
Hi all,
I am running the following code with twitterscraper:
from twitterscraper import query_tweets
import datetime as dt
import pandas as pd
begin_date = dt.date(2020,3,1)
end_date = dt.date(2021,11,1)
limit = 100
lang = 'english'
tweets = query_tweets('vaccinesideeffects', begindate = begin_date, enddate = end_date, limit = limit, lang = lang)
df = pd.DataFrame(t.__dict__ for t in tweets)
df = df['text']
df
I get the error below:
AttributeError Traceback (most recent call last)
in
----> 1 from twitterscraper import query_tweets
2 import datetime as dt
3 import pandas as pd
4
5 begin_date = dt.date(2020,3,1)
~/opt/anaconda3/lib/python3.8/site-packages/twitterscraper/__init__.py in
11
12
---> 13 from twitterscraper.query import query_tweets
14 from twitterscraper.query import query_tweets_from_user
15 from twitterscraper.query import query_user_info
~/opt/anaconda3/lib/python3.8/site-packages/twitterscraper/query.py in
74 yield start + h * i
75
---> 76 proxies = get_proxies()
77 proxy_pool = cycle(proxies)
78
~/opt/anaconda3/lib/python3.8/site-packages/twitterscraper/query.py in get_proxies()
47 soup = BeautifulSoup(response.text, 'lxml')
48 table = soup.find('table',id='proxylisttable')
---> 49 list_tr = table.find_all('tr')
50 list_td = [elem.find_all('td') for elem in list_tr]
51 list_td = list(filter(None, list_td))
AttributeError: 'NoneType' object has no attribute 'find_all'
Same for me
Same issue here
It tries to grab the table with id='proxylisttable' from https://free-proxy-list.net,
but no table with that id exists on the page, so soup.find returns None.
You need to remove the id argument on line 48 of query.py, changing
table = soup.find('table',id='proxylisttable')
to
table = soup.find('table')
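The failure mode above (a lookup that silently returns nothing, followed by a crash further down) can be made explicit. Below is a minimal sketch, not the twitterscraper code: it uses only the standard library's html.parser instead of BeautifulSoup, takes the first table on the page whatever its id, and raises a clear error if no rows are found. The names FirstTableParser and parse_proxies are hypothetical, and the sketch assumes the first two columns of the table are the IP address and the port.

```python
from html.parser import HTMLParser


class FirstTableParser(HTMLParser):
    """Collect the cell text of every row in the first <table> of a page."""

    def __init__(self):
        super().__init__()
        self.rows = []          # finished rows, each a list of cell strings
        self._in_table = False  # True while inside the first table
        self._done = False      # True once the first table has closed
        self._row = None        # row currently being built
        self._cell = None       # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if self._done:
            return
        if tag == "table":
            self._in_table = True
        elif self._in_table and tag == "tr":
            self._row = []
        elif self._row is not None and tag in ("td", "th"):
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag == "table" and self._in_table:
            self._in_table = False
            self._done = True


def parse_proxies(html):
    """Return 'ip:port' strings from the first table in html.

    Assumption: column 0 is the IP address and column 1 is the port.
    """
    parser = FirstTableParser()
    parser.feed(html)
    if not parser.rows:
        # Fail loudly here instead of with an AttributeError later on.
        raise ValueError("no table rows found; the page layout may have changed")
    # Header rows are skipped because their second cell is not a number.
    return [f"{row[0]}:{row[1]}" for row in parser.rows
            if len(row) >= 2 and row[1].isdigit()]
```

With a check like this, a change in the page layout surfaces as a readable ValueError at the point of parsing rather than as 'NoneType' object has no attribute 'find_all' at import time.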
I fixed this error using pandas:
import pandas as pd
...
def get_proxies():
    resp = requests.get(PROXY_URL)
    # read_html returns a list of all tables on the page; take the first one
    df = pd.read_html(resp.text)[0]
    list_ip = list(df['IP Address'].values)
    list_ports = list(df['Port'].values.astype(str))
    # join each (ip, port) pair into an "ip:port" string
    list_proxies = [':'.join(elem) for elem in zip(list_ip, list_ports)]
    return list_proxies
However, this still does not work.
list_of_tweets = query_tweets("Trump OR Clinton", 10)
fails with:
Exception: Traceback (most recent call last):
File "/Users/rmartin/Desktop/Envs/crypto_env/lib/python3.9/site-packages/billiard/pool.py", line 1265, in mark_as_worker_lost
raise WorkerLostError(
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 11 (SIGSEGV) Job: 0.
Same error here on Python 3.9
It tries to grab the table with id='proxylisttable' from https://free-proxy-list.net,
but it doesn't exist. You need to remove the id argument on line 48, changing
table = soup.find('table',id='proxylisttable')
to
table = soup.find('table')
thanks, it solved my problem