learn-awesome / learndb

Curated learning resources with topics, formats, difficulty levels, expert reviews and metadata tags

Merge duplicate items

nileshtrivedi opened this issue · comments

commented

Hi, can you please elaborate on this issue with an example?

commented

@Abdul-Hafeez-Galib When I search for "Sapiens", 3 results are shown:

image

This is because 3 items exist in db/items.json for this title. We allow one item to have multiple links, so that's not the problem. But this is an example of duplicates in the dataset.

We need to create a standalone Node.js script, scripts/remove_duplicates.js, that updates db/items.json by merging duplicate items into one. It should have a --dry-run option because, ideally, we don't want to remove items that have been linked from external sites.

So, should I merge these duplicates by adding the multiple links to a single book?

commented

An item is an abstract thing: the set of ideas presented in Sapiens. Each item can have multiple concrete representations (links), e.g. book, video, summary, podcast, or course. So yes, you need to merge all the fields, such as description, links, reviews, and tags, into a single item and delete the duplicate items.
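The merge step described above might be sketched as follows. This is an illustration under assumptions, not the project's actual implementation: it assumes `links`, `reviews`, and `tags` are arrays, that `description` is a string, and that the first item in a group is the one to keep. The helper names `unionOf` and `mergeGroup` are hypothetical, and keeping the longest description is an arbitrary choice.

```javascript
// Sketch only: field names and types are assumptions based on the fields
// named in this thread (description, links, reviews, tags).

// Concatenate several arrays, dropping entries that serialize identically.
// Missing (undefined) arrays are treated as empty.
function unionOf(arrays) {
  const seen = new Set();
  const out = [];
  for (const arr of arrays) {
    for (const value of arr || []) {
      const key = JSON.stringify(value);
      if (!seen.has(key)) {
        seen.add(key);
        out.push(value);
      }
    }
  }
  return out;
}

// Merge a group of duplicate items into one. The first item is the base,
// so its other fields (for example an id) survive the merge.
function mergeGroup(group) {
  const base = group[0];
  // Keep the longest description found in the group (arbitrary rule).
  const description = group
    .map((item) => item.description || "")
    .reduce((best, d) => (d.length > best.length ? d : best), "");
  return {
    ...base,
    description,
    links: unionOf(group.map((item) => item.links)),
    reviews: unionOf(group.map((item) => item.reviews)),
    tags: unionOf(group.map((item) => item.tags)),
  };
}
```

Because entries are compared by their JSON serialization, two link objects with the same keys in a different order would both be kept; comparing on a single field such as a URL may be a better fit once the real schema is known.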