xlcnd / isbnlib

python library to validate, clean, transform and get metadata of ISBN strings (for devs).

Invalid JSON for some books (extra double quotes in publisher)

toukovk opened this issue

For some books the JSON output is invalid because the publisher value is wrapped in an extra pair of double quotes. Maybe the metadata source includes the quotes in its publisher field?

Example

# Print out the versions
/# pip list
argparse (1.2.1)
chardet (2.0.1)
colorama (0.2.5)
html5lib (0.999)
isbnlib (3.10.1)
isbntools (4.3.23)
pip (1.5.4)
requests (2.2.1)
setuptools (3.3)
six (1.5.2)
urllib3 (1.7.1)
wheel (0.24.0)
wsgiref (0.1.2)

# json output (has extra double quotes for publisher)
/# isbn_meta 9780596003302 json
{"type": "book",
     "title": "Unix Power Tools",
    "author": [{"name": "Shelley Powers"}, {"name": "Jerry Peek"}, {"name": "Tim O'Reilly"}, {"name": "Mike Loukides"}, {"name": "Michael Kosta Loukides"}],
      "year": "2003",
"identifier": [{"type": "ISBN", "id": "9780596003302"}],
 "publisher": ""O'Reilly Media, Inc.""}

# default output (so the source seems to have double quotes for publisher)
/# isbn_meta 9780596003302
Type:      BOOK
Title:     Unix Power Tools
Author:    Shelley Powers
Author:    Jerry Peek
Author:    Tim O'Reilly
Author:    Mike Loukides
Author:    Michael Kosta Loukides
ISBN:      9780596003302
Year:      2003
Publisher: "O'Reilly Media, Inc."
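The invalid output can be confirmed with a small check. This is only a sketch: the string is the json output copied from the report above, and `is_valid_json` is a hypothetical helper name, not part of isbnlib or isbntools.

```python
import json

# JSON output of `isbn_meta 9780596003302 json`, copied from the report above.
broken = '''{"type": "book",
     "title": "Unix Power Tools",
    "author": [{"name": "Shelley Powers"}, {"name": "Jerry Peek"}, {"name": "Tim O'Reilly"}, {"name": "Mike Loukides"}, {"name": "Michael Kosta Loukides"}],
      "year": "2003",
"identifier": [{"type": "ISBN", "id": "9780596003302"}],
 "publisher": ""O'Reilly Media, Inc.""}'''


def is_valid_json(text):
    """Return True if `text` parses as JSON, False otherwise (hypothetical helper)."""
    try:
        json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


print(is_valid_json(broken))
```

The parser reads `""` as an empty publisher string and then stops at the unexpected `O`, so the whole document is rejected.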

Btw thanks for the utility :)

commented

Hi!

Yes, this is a recurring problem with the metadata provider (Google Books); see the output of https://www.googleapis.com/books/v1/volumes?q=isbn:9780596003302&fields=items/volumeInfo(title,subtitle,authors,publisher,publishedDate,language,industryIdentifiers)&maxResults=1. In any case, I have already changed isbnlib to fix it.

Thanks.
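For anyone hitting the same problem with provider data in their own code, one possible workaround is sketched below. It is not the change made in isbnlib (the thread does not show that change); `clean_publisher` and the `record` dict are hypothetical names used for illustration. The idea is to strip one enclosing pair of double quotes from the field and to serialize with `json.dumps`, which escapes any quotes that remain.

```python
import json


def clean_publisher(value):
    """Strip whitespace and one enclosing pair of double quotes, if present.

    Hypothetical helper; not taken from isbnlib.
    """
    value = value.strip()
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        value = value[1:-1]
    return value


# Illustrative record using the values from the report above.
record = {
    "type": "book",
    "title": "Unix Power Tools",
    "year": "2003",
    "publisher": '"O\'Reilly Media, Inc."',  # quoted, as the provider returned it
}

record["publisher"] = clean_publisher(record["publisher"])

# json.dumps escapes embedded quotes, so the result always parses.
text = json.dumps(record)
print(text)
```

Building the output with `json.dumps` rather than string formatting means that a publisher with quotes in the middle of its name would also round-trip without breaking the JSON.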