lorenzodifuccia / safaribooks

Download and generate EPUB of your favorite books from O'Reilly Learning (aka Safari Books Online) library.

images in pulled book are not showing

digitalw00t opened this issue · comments

None of the images in the books I pull show up in the EPUB. In the same directory I do see an OEBPS folder with an images folder inside it, and there are image files in there. All of them are 30k in size, and the Ubuntu image viewer says they are all an "unknown" image type.
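
One way to narrow down a symptom like this is to look at the first bytes of the saved files. The sketch below is only a diagnostic aid, not part of safaribooks: the names `sniff` and `scan` are hypothetical, and the idea that the files might be something other than image data (for example an HTML page) is an assumption, not something confirmed in this thread.

```python
from pathlib import Path

# Leading "magic" bytes of a few common image formats.
SIGNATURES = {
    "png": (b"\x89PNG\r\n\x1a\n",),
    "jpeg": (b"\xff\xd8\xff",),
    "gif": (b"GIF87a", b"GIF89a"),
}


def sniff(data: bytes) -> str:
    """Guess what kind of content a file holds from its first bytes."""
    for name, prefixes in SIGNATURES.items():
        if data.startswith(prefixes):
            return name
    # An HTML page saved under an image file name would start like this.
    head = data.lstrip()[:64].lower()
    if head.startswith((b"<!doctype html", b"<html")):
        return "html"
    return "unknown"


def scan(folder: str = "OEBPS/images") -> dict:
    """Map each file name in the folder to its sniffed content type."""
    return {
        path.name: sniff(path.read_bytes()[:256])
        for path in sorted(Path(folder).iterdir())
        if path.is_file()
    }
```

Calling `scan()` from the book's output directory lists each file with its guessed type. If every entry comes back as `html` or `unknown`, the downloads most likely did not return image data.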

commented

The asset_base_url parameter the script receives is outdated, which is why the images are not being found. The fix is a small change to the script: get the new URL (open a book on the website, right-click an image, then copy the image URL) and replace the use of asset_base_url in the code with the new format.

As @victormeloufrgs said, changing the script this way solves the problem. It is not ideal, but it helps:

            new_base_url = "https://learning.oreilly.com/api/v2/epubs/urn:orm:book:9781492086888/files/assets"
            if "images" in next_chapter and len(next_chapter["images"]):
                self.images.extend(urljoin(new_base_url, img_url)
                                   for img_url in next_chapter['images'])

You only need to change the book ID in the URL to the ID of the book you want.

My working solution, based on the snippet from @RenanSPLopes.
The asset_base_url was slightly different, and I needed to substitute book_id as a parameter:

# Images
asset_base_url = "https://learning.oreilly.com/api/v2/epubs/urn:orm:book:%s/files/" % self.book_id
if "images" in next_chapter and len(next_chapter["images"]):
    self.images.extend(urljoin(asset_base_url, img_url)
                        for img_url in next_chapter['images'])
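
The two snippets in this thread use different base URLs (`.../files/assets` and `.../files/`), yet both are reported to work. A likely reason is how `urljoin` treats a base without a trailing slash: it drops the last path segment before appending a relative path. The sketch below illustrates that with a made-up relative path, `images/cover.png`. The actual values in `next_chapter["images"]` are not shown in this thread, so that path and the helper name `image_urls` are assumptions.

```python
from urllib.parse import urljoin

# Base URL format taken from the snippets above; %s is the book ID.
API_ROOT = "https://learning.oreilly.com/api/v2/epubs/urn:orm:book:%s/files/"


def image_urls(book_id, image_paths):
    """Resolve each (assumed relative) image path against the book's files URL."""
    base = API_ROOT % book_id
    return [urljoin(base, path) for path in image_paths]


book_id = "9781492086888"           # book ID from the first snippet
with_slash = API_ROOT % book_id     # ".../files/"
without_slash = with_slash + "assets"  # ".../files/assets"
relative = "images/cover.png"       # hypothetical image path

# Without a trailing slash, urljoin replaces the last segment ("assets"),
# so both bases resolve to the same final URL.
print(urljoin(with_slash, relative))
print(urljoin(without_slash, relative))
```

Note that if an image path starts with `/` or is already a full URL, `urljoin` ignores the path of the base, so this only applies to relative paths.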

It looks like the images are in there now, but I'm having another issue. I'll put that in a separate issue report. Closing this one. Thanks for the assist.