PRAW incorrectly uses the "after" parameter to paginate Mod Notes API queries.
iargue opened this issue · comments
Describe the Bug
The Reddit API does not support the "after" parameter on queries to /api/mod/notes.
The default listing generator uses the "after" parameter to paginate results.
Generator.py
if self._listing.after and self._listing.after != self.params.get("after"):
    self.params["after"] = self._listing.after
else:
    self._exhausted = True
The "after" parameter is set as part of the listing class.
class ModNoteListing(Listing):
    """Special Listing for handling :class:`.ModNote` lists."""
    CHILD_ATTRIBUTE = "mod_notes"
    @property
    def after(self) -> Optional[Any]:
        """Return the next attribute or None."""
        if not getattr(self, "has_next_page", True):
            return None
        return getattr(self, "end_cursor", None)
As a result, the Reddit API ignores the "after" parameter and returns a response identical to the first query. When PRAW receives a second response with the same 'end_cursor' value, it ends the query. This means that PRAW can retrieve at most 100 mod notes.
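The failure mode described above can be reproduced without credentials using a toy simulation. This is only a sketch: `fake_mod_notes_endpoint`, `paginate`, the 250-note data set, and the index-based cursor are all hypothetical stand-ins (the real endpoint returns an opaque cursor such as 'MTY2MDIzMTM3MDk5Mw=='). The one assumption taken from this report is that the endpoint ignores "after" and honors "before".

```python
from typing import Any, Dict, List

# Hypothetical in-memory stand-in for /api/mod/notes (not real Reddit data).
ALL_NOTES = [f"note_{i}" for i in range(250)]
PAGE_SIZE = 100


def fake_mod_notes_endpoint(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return one page. Honors 'before' only; 'after' is silently ignored."""
    start = int(params.get("before", 0))  # 'after' is never read
    page = ALL_NOTES[start:start + PAGE_SIZE]
    end = start + len(page)
    return {
        "mod_notes": page,
        "end_cursor": str(end),
        "has_next_page": end < len(ALL_NOTES),
    }


def paginate(cursor_param: str) -> List[str]:
    """Mimic the generator loop, sending the cursor under `cursor_param`."""
    params: Dict[str, Any] = {"limit": PAGE_SIZE}
    collected: List[str] = []
    exhausted = False
    while not exhausted:
        response = fake_mod_notes_endpoint(params)
        collected.extend(response["mod_notes"])
        # Same rule as ModNoteListing: no cursor once has_next_page is false.
        cursor = response["end_cursor"] if response["has_next_page"] else None
        # Same exhaustion check as the generator snippet above.
        if cursor and cursor != params.get(cursor_param):
            params[cursor_param] = cursor
        else:
            exhausted = True
    return collected


unique_with_after = len(set(paginate("after")))
total_with_before = len(paginate("before"))
print(unique_with_after, total_with_before)
```

In this simulation, sending the cursor as "after" never gets past the first 100 distinct notes, while sending it as "before" walks through all 250.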
Desired Result
PRAW should record the 'end_cursor' value from each Mod Notes API response and transmit it as "before" in the next query. This will properly collect the next page of results from the API.
I do not have the Python knowledge to provide a best-practice fix. Below is my hack that correctly retrieves all user notes.
listing.py
class ModNoteListing(Listing):
    """Special Listing for handling :class:`.ModNote` lists."""
    CHILD_ATTRIBUTE = "mod_notes"
    @property
    def before(self) -> Optional[Any]:
        """Return the next attribute or None."""
        if not getattr(self, "has_next_page", True):
            return None
        return getattr(self, "end_cursor", None)
generator.py
def _next_batch(self):
    if self._exhausted:
        raise StopIteration()
    self._listing = self._reddit.get(self.url, params=self.params)
    self._listing = self._extract_sublist(self._listing)
    self._list_index = 0
    if not self._listing:
        raise StopIteration()
    if hasattr(self._listing, "after"):
        if self._listing.after and self._listing.after != self.params.get("after"):
            self.params["after"] = self._listing.after
        else:
            self._exhausted = True
    elif hasattr(self._listing, "before"):
        if self._listing.before and self._listing.before != self.params.get("before"):
            self.params["before"] = self._listing.before
        else:
            self._exhausted = True
    else:
        self._exhausted = True
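One possible way to avoid the duplicated hasattr branches is to let each listing class name the query parameter that carries its cursor. The following is a minimal sketch of that idea, not PRAW's actual code: the `Listing` and `ModNoteListing` classes here are simplified stand-ins, and `CURSOR_PARAM`, `cursor`, and `advance` are hypothetical names introduced only for illustration.

```python
from typing import Any, Dict, Optional


class Listing:
    """Simplified stand-in for a listing (assumption, not PRAW's class)."""

    # Hypothetical attribute: the query parameter that carries the cursor.
    CURSOR_PARAM = "after"

    def __init__(self, data: Dict[str, Any]) -> None:
        self._data = data

    @property
    def cursor(self) -> Optional[str]:
        """Return the cursor for the next page, or None."""
        return self._data.get("after")


class ModNoteListing(Listing):
    """Stand-in for the mod notes listing, which paginates via 'before'."""

    CURSOR_PARAM = "before"

    @property
    def cursor(self) -> Optional[str]:
        if not self._data.get("has_next_page", True):
            return None
        return self._data.get("end_cursor")


def advance(params: Dict[str, Any], listing: Listing) -> bool:
    """Store the next cursor in params; return True when exhausted."""
    name = listing.CURSOR_PARAM
    cursor = listing.cursor
    if cursor and cursor != params.get(name):
        params[name] = cursor
        return False
    return True


params: Dict[str, Any] = {"limit": 100}
page = ModNoteListing({"end_cursor": "abc", "has_next_page": True})
exhausted = advance(params, page)
print(exhausted, params)
```

With this shape, the generator would need a single branch, and any future listing type with a differently named cursor parameter would only have to override the class attribute.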
Relevant Logs
DEBUG:prawcore:Params: {'subreddit': Subreddit(display_name='test'), 'user': 'TestUser', 'limit': 1024, 'raw_json': 1}
DEBUG:prawcore:Response: 200 (5089 bytes)
DEBUG:prawcore:Params: {'subreddit': Subreddit(display_name='test'), 'user': 'testUser', 'limit': 1024, 'after': 'MTY2MDIzMTM3MDk5Mw==', 'raw_json': 1}
DEBUG:prawcore:Response: 200 (5089 bytes)
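For anyone trying to reproduce these logs, debug output of this kind can be enabled with the standard library's logging module before making any requests. This is a sketch that assumes the "praw" and "prawcore" logger names; the formatter string is chosen to match the layout of the lines above.

```python
import logging

# Print DEBUG records as "LEVEL:logger:message", matching the logs above.
handler = logging.StreamHandler()
handler.setLevel(logging.DEBUG)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

for logger_name in ("praw", "prawcore"):
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    logger.addHandler(handler)
```

Two consecutive requests that differ only in the "after" parameter but return the same response size are the symptom to look for.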
Code to reproduce the bug
for note in reddit.subreddit("test").mod.notes.redditors(userName, limit=None):
    print(note)
My code example does not include the Reddit() initialization to prevent credential leakage.
Yes
This code has previously worked as intended.
No
Operating System/Environment
Windows 10
Python Version
Python 3.10
PRAW Version
Version: 7.6.1
Prawcore Version
Version: 2.3.0
Anything else?
The target user must have more than 100 mod notes for pagination to be required.