mosaicml / streaming

A Data Streaming Library for Efficient Neural Network Training

Home Page: https://streaming.docs.mosaicml.com

Use IndexError instead of ValueError in __getitem__

keaganlong opened this issue · comments

raise ValueError(f'Invalid sample index `{index}`: 0 <= {index} < {self.num_samples}')

Hello, I'm curious whether it would be more customary to raise Python's built-in IndexError, rather than ValueError, when an index is out of bounds in __getitem__.
One consequence of the current behavior is that iterating over Spanner/LocalDataset/etc. ends in a ValueError. Python allows an object that defines __getitem__ without __iter__ to be iterated over, but this fallback protocol relies on catching IndexError to know when to stop.
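To illustrate the point, here is a minimal sketch of Python's fallback iteration protocol. The two classes are hypothetical stand-ins written for this example, not the library's actual Spanner or LocalDataset; they only mimic the bounds check quoted above, and the returned sample values are made up.

```python
class ValueErrorDataset:
    """Hypothetical stand-in (not the library's class): raises ValueError when out of bounds."""

    def __init__(self, num_samples):
        self.num_samples = num_samples

    def __getitem__(self, index):
        if not 0 <= index < self.num_samples:
            raise ValueError(
                f'Invalid sample index `{index}`: 0 <= {index} < {self.num_samples}')
        return index * 10  # placeholder "sample" for illustration


class IndexErrorDataset(ValueErrorDataset):
    """Same hypothetical stand-in, but raising IndexError as proposed in this issue."""

    def __getitem__(self, index):
        if not 0 <= index < self.num_samples:
            raise IndexError(
                f'Invalid sample index `{index}`: 0 <= {index} < {self.num_samples}')
        return index * 10


def collect(dataset):
    """Iterate via the __getitem__ fallback and report what happens."""
    try:
        return list(dataset)
    except ValueError as err:
        return f'ValueError: {err}'


if __name__ == '__main__':
    # Neither class defines __iter__, so Python calls __getitem__ with 0, 1, 2, ...
    # and treats IndexError (and only IndexError) as the end of the sequence.
    print(collect(IndexErrorDataset(3)))
    print(collect(ValueErrorDataset(3)))
```

With IndexError, iteration stops cleanly after the last valid index; with ValueError, the exception escapes from the loop once the index reaches num_samples.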

I see, that makes sense. It wouldn't impact current workflows anyway, since an error is raised either way, but IndexError would be more Pythonic. If you're up for it, could you submit a small PR? I'd be happy to review it, so feel free to tag me.