RUCDM / KB4Rec

This is the data for KB4Rec

Number of users doesn't match in Amazon-Book

liulu1998 opened this issue

The number of users I count in Amazon-Book doesn't match the number shown in your table.

I used the Books file (ratings only, 22,507,155 ratings; 873.81 MB), downloaded from Amazon Reviews 2014.

My source code:

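import numpy as np
import pandas as pd

# The ratings-only file has no header row, so name the columns explicitly.
df = pd.read_csv("./ratings_Books.csv", header=None)
df.columns = ["user", "item", "rating", "timestamp"]

# Count distinct users and items, and the total number of rating rows.
print(f"number of user: {len(np.unique(df.user))}")
print(f"number of item: {len(np.unique(df.item))}")
print(f"number of interaction: {len(df)}")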

The output is:

number of user: 8026324
number of item: 2330066
number of interaction: 22507155

As shown above, the number of users I count is 8,026,324, which doesn't match the 3,468,412 shown in your table, while the other numbers match.

Am I using the wrong version of Amazon-Book?

I checked, and there should indeed be 8,026,324 users in this dataset. I'm sorry for the mistake.
I think you are using the right version, since the numbers of items and interactions match those in the paper.
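The check described in this thread (count users, items, and interactions, then compare them with the published figures) can be sketched as follows. This is only an illustration, not code from the repository: it swaps pandas for the standard-library `csv` module so that a large file is streamed row by row, the helper names `count_stats` and `mismatches` are hypothetical, and the three-row sample is made up for demonstration.

```python
import csv
import io


def count_stats(lines):
    """Count distinct users, distinct items, and rows in a headerless
    ratings CSV with columns: user, item, rating, timestamp."""
    users, items, interactions = set(), set(), 0
    for user, item, _rating, _timestamp in csv.reader(lines):
        users.add(user)
        items.add(item)
        interactions += 1
    return {"users": len(users), "items": len(items), "interactions": interactions}


def mismatches(actual, expected):
    """Return {name: (actual, expected)} for every statistic that differs."""
    return {k: (actual[k], expected[k]) for k in expected if actual[k] != expected[k]}


# Made-up sample rows (not taken from the real dataset).
sample = io.StringIO(
    "u1,b1,5.0,1000\n"
    "u2,b1,4.0,1001\n"
    "u1,b2,3.0,1002\n"
)
stats = count_stats(sample)
print(stats)

# Compare against a (deliberately wrong) expected user count.
print(mismatches(stats, {"users": 3, "items": 2, "interactions": 3}))
```

For the real file, the same function could be called on an open file handle, for example `with open("./ratings_Books.csv", newline="") as f: count_stats(f)`, and the result compared with the counts quoted above (8,026,324 users, 2,330,066 items, 22,507,155 interactions).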

@RichardHGL
Thanks for your reply.