Post Text
yeamusic21 opened this issue · comments
Would it be possible to include the post text when scraping user posts? I'm interested in grabbing the number of hashtags used over the last X (say 12) posts. For example, say over the last 12 posts, on average there are 5 hashtags used per post. If the Post object included the post text, I think I could get what I'm looking for. This isn't really an issue, more of a request or suggestion for your consideration.
Thank you for your suggestion. You can get the caption of the post with:
>>> user = InstagramUser('github', sessionid=id)
>>> sample_post = user.posts[0]
>>> sample_post.caption
I will try to add the hashtags of the post in the next release.
Thank you
Amazing! You rock @yogeshwaran01 👍 💯 🥇
@yeamusic21 You can get the hashtags and text of the post with the following script:
>>> from instagramy import InstagramPost
>>> post = InstagramPost('CNQDEkxr8eM', sessionid=sessionid)
>>> data = post.post_data
>>> post_text = data['edge_media_to_caption']['edges'][0]['node']['text']
# If the line above raises a KeyError (or an IndexError on an empty 'edges' list), the post text is empty
>>> post_text
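A minimal sketch of wrapping that lookup so a post without a caption yields an empty string instead of an exception. The helper name `caption_text` and the sample dictionaries are made up for illustration; only the key path comes from the script above, and the idea that a missing caption shows up as an empty 'edges' list is an assumption.

```python
def caption_text(post_data: dict) -> str:
    """Return the caption text from a post_data-style dict, or '' if absent."""
    try:
        return post_data["edge_media_to_caption"]["edges"][0]["node"]["text"]
    except (KeyError, IndexError):
        # A missing key or an empty 'edges' list is treated as "no caption".
        return ""


# Hand-written sample data that mimics the key path used in the thread.
sample = {"edge_media_to_caption": {"edges": [{"node": {"text": "Hello #world"}}]}}
no_caption = {"edge_media_to_caption": {"edges": []}}

print(repr(caption_text(sample)))
print(repr(caption_text(no_caption)))
```

With a real post, `post.post_data` would be passed in place of the sample dictionaries.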
@yogeshwaran01 Wow! Awesome! I will try this out and let you know how it goes. 👍 💯
@yogeshwaran01 - Errors out with sessionid :-(
>>> from instagramy import InstagramUser, InstagramPost
>>> import re
>>> import os
>>> import json
>>> import pickle
>>> from time import sleep
>>> postID = 'CM3a6y5B4ju'
>>> post2 = InstagramPost(postID, sessionid=os.environ.get("INSTAGRAM_SESSIONID"))
Traceback (most recent call last):
  File "C:\Users\Redacted\Desktop\Personal\Redacted\Redacted\Redacted\lib\site-packages\instagramy\InstagramPost.py", line 52, in __init__
    self.post_data = data["entry_data"]["PostPage"][0]["graphql"][
KeyError: 'graphql'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Redacted\Desktop\Personal\Redacted\Redacted\Redacted\lib\site-packages\instagramy\InstagramPost.py", line 56, in __init__
    raise RedirectionError
instagramy.core.exceptions.RedirectionError: Instagram Redirects you to login page, Try After Sometime or Reboot your PC Provide the sessionid to Login
>>>
Works with the following revisions:
- no session id
- [0] added after ['edges']
>>> from instagramy import InstagramPost
>>> import re
>>> import os
>>> import json
>>> import pickle
>>> from time import sleep
>>> postID = 'CM3a6y5B4ju'
>>> post2 = InstagramPost(postID)
>>> pdata = post2.post_data
>>> len(re.findall(r"#\w+", pdata['edge_media_to_caption']['edges'][0]['node']['text']))
7
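The same count can be averaged over the last N posts, which is what the original request asked for. This is only a sketch: `count_hashtags`, `average_hashtags`, and the sample captions are illustrative names and data, and it assumes the captions have already been collected as plain strings, newest first.

```python
import re

# Same pattern as in the snippet above: '#' followed by word characters.
HASHTAG_RE = re.compile(r"#\w+")


def count_hashtags(text: str) -> int:
    """Count hashtag-like tokens in one caption."""
    return len(HASHTAG_RE.findall(text))


def average_hashtags(captions, last_n=12):
    """Average hashtag count over the first last_n captions (assumed newest first)."""
    recent = list(captions)[:last_n]
    if not recent:
        return 0.0
    return sum(count_hashtags(c) for c in recent) / len(recent)


# Made-up captions for illustration only.
captions = ["Launch day #news #release", "No tags here", "#a #b #c #d"]
print(average_hashtags(captions))
```

Posts without a caption can be passed as empty strings so that they still count toward the average.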
@yeamusic21 Yes, you're right, I forgot to add [0]. It is now updated.
The instagramy.core.exceptions.RedirectionError is due to an incorrect session_id, or Instagram has changed the session_id.
@yogeshwaran01 I'm using Python 3.7.1 and it seems like my code is different than your latest code.
My Instagramy code in instagramy/InstagramPost.py:
def __init__(self, post_id: str, sessionid=None):
    self.post_id = post_id
    self.url = f"https://www.instagram.com/p/{post_id}/"
    self.sessionid = sessionid
    data = self.get_json()
    try:
        self.post_data = data["entry_data"]["PostPage"][0]["graphql"][
            "shortcode_media"
        ]
    except KeyError:
        raise RedirectionError
Latest Instagramy code in instagramy/InstagramPost.py:
def __init__(self, post_id: str, sessionid=None, from_cache=False):
    self.post_id = post_id
    self.url = f"https://www.instagram.com/p/{post_id}/"
    self.sessionid = sessionid
    cache = Cache("post")
    if from_cache:
        if cache.is_exists(post_id):
            self.post_data = cache.read_cache(post_id)
        else:
            data = self.get_json()
            cache.make_cache(
                post_id,
                data["entry_data"]["PostPage"][0]["graphql"]["shortcode_media"],
            )
            self.post_data = data["entry_data"]["PostPage"][0]["graphql"]["shortcode_media"]
    else:
        data = self.get_json()
        cache.make_cache(
            post_id, data["entry_data"]["PostPage"][0]["graphql"]["shortcode_media"]
        )
        try:
            self.post_data = data["entry_data"]["PostPage"][0]["graphql"][
                "shortcode_media"
            ]
        except KeyError:
            raise RedirectionError
    if sessionid:
        try:
            self.viewer = Viewer(data=data["config"]["viewer"])
        except UnboundLocalError:
            self.viewer = None
    else:
        self.viewer = None
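For readers comparing the two versions, the caching branch follows a read-through pattern: use the cached copy if requested and present, otherwise fetch and store. The sketch below shows that pattern in isolation. `DictCache`, `load_post_data`, and `fake_fetch` are hypothetical stand-ins written for this example and are not instagramy's `Cache` class or its API.

```python
class DictCache:
    """Hypothetical in-memory cache used only for this sketch."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def put(self, key, value):
        self._store[key] = value


def load_post_data(post_id, fetch, cache, from_cache=False):
    """Return post data, consulting the cache first only when from_cache is True."""
    if from_cache:
        cached = cache.get(post_id)
        if cached is not None:
            return cached
    # Cache miss, or caching not requested: fetch fresh data and store it.
    data = fetch(post_id)
    cache.put(post_id, data)
    return data


calls = []


def fake_fetch(post_id):
    """Stand-in for a network request; records each call."""
    calls.append(post_id)
    return {"id": post_id}


cache = DictCache()
first = load_post_data("abc", fake_fetch, cache)                    # fetches and stores
second = load_post_data("abc", fake_fetch, cache, from_cache=True)  # served from the cache
print(len(calls))
```

Keeping the fetch in one place means the cached path never touches a fetched payload at all, so there is no undefined variable to guard against later.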
@yeamusic21 Yes, instagramy got a new release today, 4.3. It adds a new caching feature; please check it out.
Update the package to the latest version:
pip install instagramy --upgrade