praw-dev / praw

PRAW, an acronym for "Python Reddit API Wrapper", is a Python package that allows for simple access to Reddit's API.

Home Page: http://praw.readthedocs.io/

PRAW incorrectly uses the "after" parameter to paginate Mod Note API queries.

iargue opened this issue

Describe the Bug

The Reddit API does not support the "after" parameter on queries to /api/mod/notes.

The default listing generator uses the "after" parameter to paginate results.

generator.py

    if self._listing.after and self._listing.after != self.params.get("after"):
        self.params["after"] = self._listing.after
    else:
        self._exhausted = True

The value sent as "after" comes from the listing class, whose after property returns the response's 'end_cursor'.

    class ModNoteListing(Listing):
        """Special Listing for handling :class:`.ModNote` lists."""
    
        CHILD_ATTRIBUTE = "mod_notes"
    
        @property
        def after(self) -> Optional[Any]:
            """Return the next attribute or None."""
            if not getattr(self, "has_next_page", True):
                return None
            return getattr(self, "end_cursor", None)

As a result, the Reddit API ignores the "after" parameter and returns the same response as the first query. When PRAW receives a second response with an identical 'end_cursor' value, it stops paginating. PRAW can therefore retrieve at most 100 mod notes.
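
The failure mode can be reproduced without network access. The following is a minimal simulation, not PRAW code: `endpoint_ignoring_after` is a hypothetical stand-in for /api/mod/notes that ignores "after" as described in this report, and `paginate_with_after` mirrors the generator check quoted earlier. The data set, the page size of 100, and the cursor string are assumptions made for illustration.

```python
from typing import Any, Dict, List

# Hypothetical data set: 250 notes, served 100 per page (assumed page size).
ALL_NOTES = [f"note-{i}" for i in range(250)]
PAGE_SIZE = 100


def endpoint_ignoring_after(params: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for the API: 'after' is ignored, so page one is always returned."""
    return {
        "mod_notes": ALL_NOTES[:PAGE_SIZE],
        "end_cursor": "cursor-after-page-1",
        "has_next_page": len(ALL_NOTES) > PAGE_SIZE,
    }


def paginate_with_after() -> List[str]:
    """Mirror the quoted generator logic, sending the cursor as 'after'."""
    params: Dict[str, Any] = {"limit": PAGE_SIZE}
    collected: List[str] = []
    exhausted = False
    while not exhausted:
        response = endpoint_ignoring_after(params)
        collected.extend(response["mod_notes"])
        # Same rule as the after property: end_cursor unless there is no next page.
        after = response["end_cursor"] if response["has_next_page"] else None
        if after and after != params.get("after"):
            params["after"] = after
        else:
            # The repeated cursor is mistaken for the end of the listing.
            exhausted = True
    return collected


notes = paginate_with_after()
unique_notes = set(notes)
```

In this simulation only 100 of the 250 notes are ever seen: the second request returns the first page again, and the unchanged cursor stops the loop.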

Desired Result

PRAW should record the 'end_cursor' value from each Mod Notes API response and send it as "before" in the next query. This correctly retrieves the next page of results from the API.
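
A minimal sketch of the requested behavior, assuming (as this report states) that the endpoint accepts the previous 'end_cursor' as "before". The names `endpoint_honoring_before` and `iter_mod_notes`, the integer-string cursors, and the page size are hypothetical and are not part of PRAW or the Reddit API.

```python
from typing import Any, Callable, Dict, Iterator, List

ALL_NOTES: List[str] = [f"note-{i}" for i in range(250)]  # hypothetical data
PAGE_SIZE = 100  # assumed page size


def endpoint_honoring_before(params: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for the API: resumes from the offset encoded in 'before'."""
    start = int(params.get("before", "0"))
    end = min(start + PAGE_SIZE, len(ALL_NOTES))
    return {
        "mod_notes": ALL_NOTES[start:end],
        "end_cursor": str(end),
        "has_next_page": end < len(ALL_NOTES),
    }


def iter_mod_notes(
    fetch: Callable[[Dict[str, Any]], Dict[str, Any]]
) -> Iterator[str]:
    """Yield every note, passing each page's end_cursor back as 'before'."""
    params: Dict[str, Any] = {"limit": PAGE_SIZE}
    while True:
        response = fetch(params)
        yield from response["mod_notes"]
        cursor = response.get("end_cursor")
        # Stop when the server reports no further page or the cursor stalls.
        if not response.get("has_next_page") or not cursor:
            return
        if cursor == params.get("before"):
            return
        params["before"] = cursor


notes = list(iter_mod_notes(endpoint_honoring_before))
```

With the cursor sent as "before", the simulated endpoint advances on every request and all 250 notes are collected in three requests.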

I do not have the Python knowledge to provide a best-practice fix. Below is my hack, which correctly returns all user notes.

listing.py

    class ModNoteListing(Listing):
        """Special Listing for handling :class:`.ModNote` lists."""

        CHILD_ATTRIBUTE = "mod_notes"

        @property
        def before(self) -> Optional[Any]:
            """Return the next attribute or None."""
            if not getattr(self, "has_next_page", True):
                return None
            return getattr(self, "end_cursor", None)

generator.py

    def _next_batch(self):
        if self._exhausted:
            raise StopIteration()

        self._listing = self._reddit.get(self.url, params=self.params)
        self._listing = self._extract_sublist(self._listing)
        self._list_index = 0

        if not self._listing:
            raise StopIteration()

        if hasattr(self._listing, "after"):
            if self._listing.after and self._listing.after != self.params.get("after"):
                self.params["after"] = self._listing.after
            else:
                self._exhausted = True
        elif hasattr(self._listing, "before"):
            if self._listing.before and self._listing.before != self.params.get("before"):
                self.params["before"] = self._listing.before
            else:
                self._exhausted = True
        else:
            self._exhausted = True

Relevant Logs

DEBUG:prawcore:Params: {'subreddit': Subreddit(display_name='test'), 'user': 'TestUser', 'limit': 1024, 'raw_json': 1}
DEBUG:prawcore:Response: 200 (5089 bytes)
DEBUG:prawcore:Params: {'subreddit': Subreddit(display_name='test'), 'user': 'testUser', 'limit': 1024, 'after': 'MTY2MDIzMTM3MDk5Mw==', 'raw_json': 1}
DEBUG:prawcore:Response: 200 (5089 bytes)
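
Request logs like the ones above can be captured with Python's standard logging module. A minimal sketch, assuming the prawcore logger name that appears in the log lines; the format string is chosen to match the output shown and is otherwise arbitrary.

```python
import logging

# Send prawcore's debug records (request params, response sizes) to stderr.
handler = logging.StreamHandler()
handler.setLevel(logging.DEBUG)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

logger = logging.getLogger("prawcore")
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)
```

Run this before making any requests so that each outgoing query and its parameters are printed.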

Code to reproduce the bug

for note in reddit.subreddit("test").mod.notes.redditors(userName, limit=None):
    print(note)

My code example does not include the Reddit() initialization to prevent credential leakage.

Yes

This code has previously worked as intended.

No

Operating System/Environment

Windows 10

Python Version

Python 3.10

PRAW Version

Version: 7.6.1

Prawcore Version

Version: 2.3.0

Anything else?

The target user must have more than 100 mod notes for pagination to be required.