Number of users doesn't match in Amazon-Book
liulu1998 opened this issue
Liu Lu(刘陆) commented
The number of users in Amazon-Book doesn't match the number shown in your table.
I used the Books file (ratings only, 22,507,155 ratings, 873.81 MB), downloaded from Amazon Reviews 2014.
My source code:
import numpy as np
import pandas as pd

# The ratings-only CSV has no header row, so the columns are named explicitly.
df = pd.read_csv("./ratings_Books.csv", header=None)
df.columns = ["user", "item", "rating", "timestamp"]
print(f"number of user: {len(np.unique(df.user))}")
print(f"number of item: {len(np.unique(df.item))}")
print(f"number of interaction: {len(df)}")
The output is:
number of user: 8026324
number of item: 2330066
number of interaction: 22507155
As shown above, the number of users I count is 8,026,324, not the 3,468,412 shown in your table; the other numbers match.
Am I using the wrong version of Amazon-Book?
Gaole He (何高乐) commented
I checked, and this dataset should have 8,026,324 users. I'm sorry for the mistake.
I think you are using the right version, since the numbers of items and interactions match the paper.
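For anyone who wants to repeat this check without loading all 22.5M rows into memory, here is a minimal sketch that streams the file with the standard-library `csv` module instead of pandas. It assumes the same headerless `user,item,rating,timestamp` layout used above. The `count_stats` helper and the sample rows are hypothetical and only for illustration; they are not taken from the real dataset.

```python
import csv
import io


def count_stats(lines):
    """Count unique users, unique items, and rows in a headerless ratings CSV.

    `lines` is any iterable of CSV text lines, such as an open file object.
    """
    users, items, interactions = set(), set(), 0
    for row in csv.reader(lines):
        if not row:  # skip blank lines
            continue
        users.add(row[0])  # column 0: user ID
        items.add(row[1])  # column 1: item ID
        interactions += 1
    return len(users), len(items), interactions


# Made-up sample rows in the assumed layout (not real Amazon data).
SAMPLE = io.StringIO(
    "U1,B1,5.0,1355616000\n"
    "U1,B2,4.0,1355616000\n"
    "U2,B1,3.0,1362787200\n"
    "U3,B3,5.0,1368057600\n"
)

n_users, n_items, n_interactions = count_stats(SAMPLE)
print(f"number of user: {n_users}")
print(f"number of item: {n_items}")
print(f"number of interaction: {n_interactions}")
```

To run it on the real file, pass an open file handle instead of the sample, for example `with open("./ratings_Books.csv", newline="") as f: count_stats(f)`. The two sets still hold every distinct ID, so memory use grows with the number of unique users and items rather than with the number of ratings.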
Liu Lu(刘陆) commented
@RichardHGL
Thanks for your reply.