Tatsh / patreon-archiver

Download Patreon content.

Home Page: https://patreon-archiver.rtfd.io/

NoneType in save_images function

kingraaa opened this issue · comments

Hi, I have just come across this app and thought I would give it a go to download some guitar tutorials I am currently paying for. These come as posts with .txt files (Guitar tab) and embedded Vimeo files (Tutorials)

I downloaded the latest master version (built as 0.0.6), and installed with pip install patreon-archiver-master.zip.

At first I could not get it working, but I assume that is a problem with my environment variables setup. When I call the app directly with C:\Program` Files\Python311\Scripts\patreon-archiver.exe -o "D:\Patreon\Tatsh-Patreon-Archiver\DL2" CAMPAIGN-ID, it starts fine and downloads image posts and text posts. However, after a while it stops with a TypeError: 'NoneType' object is not subscriptable exception. I have included the stack trace below:

Image file: https://www.patreon.com/posts/xxxxx
Image file: https://www.patreon.com/posts/xxxxx
Text_Only: https://www.patreon.com/posts/xxxxx
Image file: https://www.patreon.com/posts/xxxxx
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Program Files\Python311\Scripts\patreon-archiver.exe\__main__.py", line 7, in <module>
  File "C:\Users\user\AppData\Roaming\Python\Python311\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\AppData\Roaming\Python\Python311\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "C:\Users\user\AppData\Roaming\Python\Python311\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\AppData\Roaming\Python\Python311\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\site-packages\patreon_archiver\main.py", line 139, in main
    media_uris.extend(x for x in process_posts(posts, session)
  File "C:\Program Files\Python311\Lib\site-packages\patreon_archiver\main.py", line 139, in <genexpr>
    media_uris.extend(x for x in process_posts(posts, session)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\site-packages\patreon_archiver\main.py", line 67, in process_posts
    yield from save_images(session, post)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\site-packages\patreon_archiver\main.py", line 36, in save_images
    pdd['attributes']['post_metadata']['image_order'], start=1):
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
TypeError: 'NoneType' object is not subscriptable

Is this a problem with the setup on my end? I have not used yt-dlp before and only had it installed as a dependency when installing the patreon-archiver ZIP file. Though logic tells me that is not the issue as it seems to be a problem with the save_images function.

Any help would be appreciated - happy to provide more info if needed. My Python skills aren't great, but I will try to follow along and see if I can narrow down the issue.

Cheers

As I mentioned earlier, my Python skills aren't great, but I added a try-except statement around the contents of the save_images function and put a debugger breakpoint inside the except block. I have manually edited the args here to remove any URLs in case they should not be linked here - happy to provide them if needed though, it's nothing dodgy.
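For anyone who wants to inspect a failure like this without editing the installed package, here is a minimal sketch of the same idea using the standard library's pdb.post_mortem. The helper name call_with_post_mortem and its debugger parameter are made up for illustration and are not part of patreon-archiver.

```python
import pdb
import sys


def call_with_post_mortem(func, *args, debugger=pdb.post_mortem, **kwargs):
    """Call func; if it raises, open a debugger at the failing frame, then re-raise.

    Hypothetical helper for illustration only. The debugger parameter exists
    so that a non-interactive callable can be swapped in (for example in tests).
    """
    try:
        return func(*args, **kwargs)
    except Exception:
        # sys.exc_info()[2] is the traceback of the exception being handled.
        debugger(sys.exc_info()[2])
        raise


# Example (interactive, so left commented out). A generator only runs when it
# is consumed, so wrap it in list() to trigger the error inside the helper:
# call_with_post_mortem(lambda: list(some_generator_function(...)))
```

Inside the debugger, the `a` command prints the arguments of the current function, which is how the dump below was produced.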

The args involved are as follows:

(Pdb) a
session = <requests.sessions.Session object at 0x000001FEB0632910>
pdd = {
    'attributes': {
        'change_visibility_at': None,
        'comment_count': 2,
        'content': 'Hey guys!\xa0<p><br></p><p>This will be up first thing tomorrow as usual, bit late for me to edit again now, thanks again for supporting me here on Patreon, really hope you guys like the content</p>',
        'current_user_can_comment': True,
        'current_user_can_delete': False,
        'current_user_can_view': True,
        'current_user_has_liked': False,
        'embed': None,
        'has_ti_violation': False,
        'image': {
            'height': 2880,
            'large_url': 'VALID-URL',
            'thumb_square_large_url': 'URL',
            'thumb_square_url': 'URL',
            'thumb_url': 'URL',
            'url': 'URL',
            'width': 5120
        },
        'is_paid': False,
        'like_count': 1,
        'meta_image_url': 'URL',
        'min_cents_pledged_to_view': 1,
        'patreon_url': 'URL',
        'pledge_url': '/bePatron?patAmt=0.01&c=CAMPAIGN-ID',
        'post_file': {
            'name': 'patreon.jpg',
            'progress': {},
            'url': 'URL'
        },
        'post_metadata': None,
        'post_type': 'image_file',
        'published_at': '2018-09-26T21:47:19.000+00:00',
        'teaser_text': 'Riot Van Lesson w/Tabs - Arctic Monkeys',
        'title': 'Riot Van Lesson w/Tabs - Arctic Monkeys',
        'upgrade_url': '/join/URL',
        'url': 'URL',
        'was_posted_by_campaign_owner': True
    },
    'id': '21662551',
    'relationships': {
        'access_rules': {
            'data': [{'id': '352184', 'type': 'access-rule'}]
        },
        'attachments': {
            'data': []
        },
        'audio': {
            'data': None
        },
        'campaign': {
            'data': {
                'id': '1786474',
                'type': 'campaign'
            },
            'links': {
                'related': 'https://www.patreon.com/api/campaigns/CAMPAIGNID'
            }
        },
        'images': {
            'data': [{'id': '12117762', 'type': 'media'}]
        },
        'media': {
            'data': [{'id': '12117762', 'type': 'media'}]
        },
        'poll': {
            'data': None
        },
        'ti_checks': {
            'data': []
        },
        'user': {
            'data': {
                'id': '11425844',
                'type': 'user'
            },
            'links': {
                'related': 'https://www.patreon.com/api/user/11425844'
            }
        },
        'user_defined_tags': {
            'data': []
        }
    },
    'type': 'post'
}
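
In the dump above, post_metadata is None even though post_type is 'image_file'. The following is a minimal reproduction of the first traceback, assuming only those keys matter. The dict is a trimmed stand-in for the real pdd, and image_order_of is a hypothetical name used just for this sketch.

```python
# Trimmed-down stand-in for the pdd value shown above; only the keys
# involved in the failing expression are kept.
pdd = {
    'attributes': {
        'post_metadata': None,
        'post_type': 'image_file',
    },
    'id': '21662551',
}


def image_order_of(post):
    # Same subscript chain as the failing line in the first traceback.
    return post['attributes']['post_metadata']['image_order']


try:
    image_order_of(pdd)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)  # 'NoneType' object is not subscriptable
```

So pdd itself is a normal dict; the None that the error refers to is the value stored under 'post_metadata'.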

EDIT: I type "n" to go to the next step in the debugger and get the same pdd value. I then type "n" again which is where the TypeError problem appears. The posts attribute here is enormously long (174,000 characters). Maybe this has something to do with it?

Though maybe not - as the error is referencing a 'NoneType', which this clearly is not!

I have added a snippet below, obviously not the whole 174000 character variable though

(Pdb) n
TypeError: 'NoneType' object is not iterable
> c:\program files\python311\lib\site-packages\patreon_archiver\main.py(71)process_posts()
-> yield from save_images(session, post)
(Pdb) a
posts = {'data': [{'attributes': {'change_visibility_at': None, 'comment_count': 0, 'content': ...... all 174000 chars ...... 
f'}, 'meta': {'pagination': {'cursors': {'next': '01zou3apnLhxa7rNb3PM_WIsMf'}, 'total': 604}}}
session = <requests.sessions.Session object at 0x000001C915BC4D10>
(Pdb) n
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Program Files\Python311\Scripts\patreon-archiver.exe\__main__.py", line 7, in <module>
  File "C:\Users\user\AppData\Roaming\Python\Python311\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\AppData\Roaming\Python\Python311\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "C:\Users\user\AppData\Roaming\Python\Python311\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\AppData\Roaming\Python\Python311\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\site-packages\patreon_archiver\main.py", line 143, in main
    media_uris.extend(x for x in process_posts(posts, session)
  File "C:\Program Files\Python311\Lib\site-packages\patreon_archiver\main.py", line 143, in <genexpr>
    media_uris.extend(x for x in process_posts(posts, session)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\site-packages\patreon_archiver\main.py", line 71, in process_posts
    yield from save_images(session, post)
TypeError: 'NoneType' object is not iterable

It's getting late here, so I can't look at this much more until after work tomorrow. If you have any suggestions, I'm all ears.

commented

Let's not worry about your second post for now.

I pushed a change to handle when post_metadata is None. This should fix the issue in your first post.
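
For readers hitting the same error on an older build, here is a rough sketch of what such a guard could look like. It is not the actual commit; the helper name image_order and the placeholder values are assumptions made for illustration.

```python
def image_order(pdd):
    """Return the post's image_order list, or [] when it is missing.

    Hypothetical helper: treats a post_metadata of None (or a missing
    image_order key) as "no images to order" instead of raising TypeError.
    """
    metadata = pdd['attributes'].get('post_metadata') or {}
    return metadata.get('image_order') or []


# A post like the one in the first report: post_metadata is None.
post_without_metadata = {'attributes': {'post_metadata': None}}
# A post that does carry an image_order (the values are placeholders).
post_with_metadata = {
    'attributes': {'post_metadata': {'image_order': ['id-a', 'id-b']}}
}

for post in (post_without_metadata, post_with_metadata):
    # enumerate(..., start=1) mirrors the loop visible in the traceback.
    for index, media_id in enumerate(image_order(post), start=1):
        print(index, media_id)
```

With this guard a post that has no metadata yields nothing from the loop, so the download run can continue with the next post.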

commented

Please try with latest master.

Tried it and this bug is fixed, cheers. (Did reveal a different problem, I'll make another post)