internetarchive / openlibrary

One webpage for every book ever published!

Home Page:https://openlibrary.org

ImportBot ignoring publishers field when resolving editions

cdrini opened this issue

Problem

Related: #8977

ImportBot seems to think a title match is enough to consider an incoming record a match for an existing edition, even when the publishers differ :/

E.g. https://openlibrary.org/books/OL46219062M/Spoon_River_Anthology?m=diff&b=2

Note I have since undone this import.

Reproducing the bug

  1. copydocs this record into your local environment: https://openlibrary.org/books/OL46219062M/Spoon_River_Anthology
  2. Try to import this record:
{
    "title": "Spoon River Anthology",
    "source_records": ["standard_ebooks:edgar-lee-masters/spoon-river-anthology"],
    "publishers": ["Standard Ebooks"],
    "publish_date": "2022",
    "authors": [{"name": "Edgar Lee Masters"}],
    "description": "<p><i>Spoon River Anthology</i> is a collection of short poems that reveals the true nature of the citizens of a fictional small town in Illinois. Each poem is a candid autobiography of a now-deceased resident that lies in the Oak Hill Cemetery\u2014often exposing their darkest secrets.</p> <p><a href=\"https://standardebooks.org/ebooks/edgar-lee-masters\">Edgar Lee Masters</a> was raised in Lewistown, Illinois and based these stories on the gossip he heard there. The book was a commercial success, but was banned from schools and libraries in the area due to the real-life citizens knowing exactly whom each poem was written about. As the years have passed and more generations now lie in Oak Hill Cemetery, Lewistown has forgiven Masters, and he\u2019s now celebrated there.</p> <p>The poems were originally published in the literary magazine <i>Reedy\u2019s Mirror</i> under the pseudonym Webster Ford. The first book edition was published in 1915 and contained 209 poems. Masters added 35 new poems, including the epilogue, for the 1916 edition, which is the edition that this Standard Ebooks edition is based on.</p>",
    "subjects": ["American poetry", "Poetry"],
    "identifiers": {
        "standard_ebooks": ["edgar-lee-masters/spoon-river-anthology"]
    },
    "languages": ["eng"],
    "cover": "https://standardebooks.org/ebooks/edgar-lee-masters/spoon-river-anthology/downloads/cover.jpg"
}
  • Expected behavior: A new edition is created.
  • Actual behavior: The import is attached to the existing edition (OL46219062M) instead!
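
The expected behavior can be sketched as a matching rule. This is only an illustration, not Open Library's actual matching code: `normalize` and `should_attach_to_existing` are hypothetical names, and the existing edition's publisher is a placeholder because this issue does not state it.

```python
def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(value.lower().split())


def should_attach_to_existing(incoming: dict, existing: dict) -> bool:
    """Hypothetical rule: a title match alone is not enough.

    The import is attached to the existing edition only when the titles
    match AND the two records share at least one publisher.
    """
    if normalize(incoming.get("title", "")) != normalize(existing.get("title", "")):
        return False
    incoming_publishers = {normalize(p) for p in incoming.get("publishers", [])}
    existing_publishers = {normalize(p) for p in existing.get("publishers", [])}
    return bool(incoming_publishers & existing_publishers)


incoming = {"title": "Spoon River Anthology", "publishers": ["Standard Ebooks"]}
# Placeholder publisher: the issue does not say who published OL46219062M.
existing = {"title": "Spoon River Anthology", "publishers": ["Some Other Publisher"]}

print(should_attach_to_existing(incoming, existing))  # False, so a new edition is expected
```

In this sketch a record with no publishers never matches; how such records should be handled is a separate design decision.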

Context

  • Browser (Chrome, Safari, Firefox, etc):
  • OS (Windows, Mac, etc):
  • Logged in (Y/N): Y
  • Environment (prod, dev, local): prod

Notes from this Issue's Lead

Proposal & constraints

Related files

Stakeholders

@scottbarnes

Note: Before making a new branch or updating an existing one, please ensure your branch is up to date.

This is mentioned in the linked PR, but it is worth mentioning here as well. To get the local environment in a position to replicate this issue, the following should work:

docker compose exec web bash
PYTHONPATH=. ./scripts/copydocs.py /works/OL31830589W?v=2 /authors/OL7636796A?v=1 /books/OL46219062M?v=3

Then try to import the record mentioned in (2) above under "Reproducing the bug".
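
For anyone scripting the import step, here is a rough sketch of building the request in Python. It rests on assumptions not stated in this issue: that the local dev server listens on localhost:8080 and accepts JSON records via POST at /api/import, and that an authenticated session is supplied separately. `build_import_request` is a hypothetical helper, and the record is a trimmed copy of the one under "Reproducing the bug".

```python
import json
import urllib.request

# Assumptions: host, port, and endpoint path. Adjust for your setup.
IMPORT_URL = "http://localhost:8080/api/import"


def build_import_request(record: dict, url: str = IMPORT_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the record as JSON."""
    body = json.dumps(record).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Trimmed copy of the record; the description, subjects, and cover are omitted.
record = {
    "title": "Spoon River Anthology",
    "source_records": ["standard_ebooks:edgar-lee-masters/spoon-river-anthology"],
    "publishers": ["Standard Ebooks"],
    "publish_date": "2022",
    "authors": [{"name": "Edgar Lee Masters"}],
}

request = build_import_request(record)
print(request.get_method(), request.full_url)
# Sending it needs a logged-in session (not shown):
# urllib.request.urlopen(request)
```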

Some more examples:

{"title": "Leaves of Grass", "source_records": ["standard_ebooks:walt-whitman/leaves-of-grass"], "publishers": ["Standard Ebooks"], "publish_date": "2024", "authors": [{"name": "Walt Whitman"}], "description": "<p><a href=\"https://standardebooks.org/ebooks/walt-whitman\">Walt Whitman</a> consciously set out to forge a personal path for himself as a poet. Inspired by contemporaries like <a href=\"https://standardebooks.org/ebooks/ralph-waldo-emerson\">Emerson</a> who expressed a need for a new, uniquely American style of poetry, Whitman eschewed conventions he saw as outdated or undemocratic. Setting aside traditional rhyme, meter, and even brevity, Whitman favored a style that was declarative, direct, and maximalist. For subject matter he focused on the common individual, as democratic representative of all humanity, and the natural world of which humanity exists as an integral part. \u201cSong of Myself\u201d is perhaps the most well-known exemplar of this aesthetic.</p> <p>Whitman\u2019s poetic career took an abrupt turn during the American Civil War, and his poems from that time draw on his experiences volunteering at military hospitals. These, coupled with his elegy for President Lincoln after his assassination (\u201cWhen Lilacs Last in the Dooryard Bloom\u2019d\u201d), helped to cement Whitman\u2019s position as a particularly American voice.</p> <p>Among Whitman\u2019s recurring themes are the embracing of sensual pleasures, including frank acknowledgments of homosexuality. This latter aspect drove several contemporary critics to reject his work as indecent. Threats of censorship and outright banning encouraged his supporters to speak more publicly in defense of his work, however, and Whitman is now considered to be one of America\u2019s most important poets.</p> <p><i>Leaves of Grass</i> was continually edited and extended over most of Whitman\u2019s life. Months before his death, he announced that the next edition would be the complete and definitive one. Referred to now as the \u201cdeathbed edition,\u201d it was published in 1892 by Whitman\u2019s literary executors, and is the basis for this ebook.</p>", "subjects": ["American poetry -- 19th century", "Poetry"], "identifiers": {"standard_ebooks": ["walt-whitman/leaves-of-grass"]}, "languages": ["eng"], "cover": "https://standardebooks.org/ebooks/walt-whitman/leaves-of-grass/downloads/cover.jpg"}
{"title": "The Secret of the Old Mill", "source_records": ["standard_ebooks:franklin-w-dixon/the-secret-of-the-old-mill"], "publishers": ["Standard Ebooks"], "publish_date": "2024", "authors": [{"name": "Franklin W. Dixon"}], "description": "<p>During a hike of the countryside, the Hardy boys and their pals learn that an abandoned mill has recently gotten new owners. When the boys go to look, the new owners claim to be producing a new breakfast food, field a few questions, and then shoo the boys away. But that\u2019s enough to make the Hardy boys suspicious. Are the men to be trusted? Are they merely surly scientists, or are they covering up a darker secret?</p> <p>This is the third book of the Hardy boys series, first published in 1927. It was rewritten in 1962; this Standard Ebook contains the original 1927 text.</p>", "subjects": ["Brothers -- Juvenile fiction", "Mystery and detective stories", "Counterfeits and counterfeiting -- Juvenile fiction", "Hardy Boys (Fictitious characters) -- Juvenile fiction", "Adventure", "Children\u2019s", "Fiction", "Mystery"], "identifiers": {"standard_ebooks": ["franklin-w-dixon/the-secret-of-the-old-mill"]}, "languages": ["eng"], "cover": "https://standardebooks.org/ebooks/franklin-w-dixon/the-secret-of-the-old-mill/downloads/cover.jpg"}

^ These should each create a new record as well, as additional test cases!
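
Each example is meant to be a single JSON object on one line, so a quick check before importing can catch a record that got wrapped or truncated when copied. The sketch below is illustrative only: `check_records` and the `REQUIRED_FIELDS` checklist are assumptions made for this example, not an Open Library schema, and the sample lines are trimmed versions of the records above.

```python
import json

# Assumed checklist of fields relevant to this issue, not an official schema.
REQUIRED_FIELDS = ("title", "source_records", "publishers", "authors")


def check_records(lines):
    """Parse one JSON record per line and report missing fields per title."""
    problems = {}
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)  # raises json.JSONDecodeError on a broken line
        missing = [field for field in REQUIRED_FIELDS if not record.get(field)]
        if missing:
            problems[record.get("title", f"line {number}")] = missing
    return problems


# Trimmed versions of the two records above (descriptions and covers omitted).
sample_lines = [
    '{"title": "Leaves of Grass", "source_records": ["standard_ebooks:walt-whitman/leaves-of-grass"], "publishers": ["Standard Ebooks"], "authors": [{"name": "Walt Whitman"}]}',
    '{"title": "The Secret of the Old Mill", "source_records": ["standard_ebooks:franklin-w-dixon/the-secret-of-the-old-mill"], "publishers": ["Standard Ebooks"], "authors": [{"name": "Franklin W. Dixon"}]}',
]

print(check_records(sample_lines))  # {} when every record parses and is complete
```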