Chapter 03 Yelp Dataset has a Typo
amancioandre opened this issue
Andre Moraes commented
Hi everyone,
Chapter 3 fails to load the Yelp data because of a typo on the last line of the dataset, where the review field is cut off and never closes its quote:
Line Review
73357: "1","Capital City Transfer han
Passing the nrows argument with the number of rows minus 1 fixed it for me:
train_reviews = pd.read_csv(args.raw_train_dataset_csv, header=None, names = ['rating', 'review'], nrows=73356)
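Or (note that error_bad_lines was deprecated in pandas 1.3 and removed in 2.0; on newer versions the equivalent is on_bad_lines='skip'):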
train_reviews = pd.read_csv(args.raw_train_dataset_csv, header=None, names = ['rating', 'review'], error_bad_lines=False)
Or just append a closing " to the end of this line.
Still, it would be nice to fix this typo in the dataset itself.
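For anyone who prefers to patch the file rather than work around it in read_csv, here is a minimal sketch of the "append a quote" option. The helper name close_unterminated_quote is hypothetical, and it assumes the CSV escapes embedded quotes by doubling them (""), so a well-formed file has an even number of " characters. It has not been checked against the actual dataset, so try it on a copy first.

```python
from pathlib import Path


def close_unterminated_quote(csv_path):
    """Append a closing double quote if the file's quotes are unbalanced.

    Hypothetical helper: assumes embedded quotes are escaped as "",
    so a well-formed file contains an even number of '"' characters.
    Returns True if the file was modified, False otherwise.
    """
    path = Path(csv_path)
    text = path.read_text(encoding="utf-8")

    # Even quote count: nothing is left open, so leave the file alone.
    if text.count('"') % 2 == 0:
        return False

    # Drop any trailing newline, close the open quote, end with a newline.
    stripped = text.rstrip("\r\n")
    path.write_text(stripped + '"\n', encoding="utf-8")
    return True
```

After running it once on the raw training CSV, the original read_csv call should work without nrows or error_bad_lines. A second run returns False and leaves the file unchanged.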