Protocol to match Decord / VideoReader

johndpope opened this issue 5 months ago

🚀 Feature

When training on video datasets with PyTorch, I should be able to open an mp4 and get its frames by index, the way Decord's VideoReader allows. For example:


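import torch
# Import path assumed from the PyTorchVideo package layout.
from pytorchvideo.data.encoded_video import EncodedVideo


class VideoReader(EncodedVideo):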
    def __init__(self, uri):
        super().__init__(uri)
        self._num_frame = int(self.duration * self.video_meta['fps'])
    
    def __len__(self):
        return self._num_frame

    def __getitem__(self, idx):
        if isinstance(idx, slice):
            return self.get_batch(range(*idx.indices(len(self))))
        if idx < 0:
            idx += self._num_frame
        if idx >= self._num_frame or idx < 0:
            raise IndexError("Index out of range")
        return self.get_frame_by_index(idx)

    def get_frame_by_index(self, frame_index):
        fps = self.video_meta['fps']
        time_sec = frame_index / fps
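        clip = self.get_clip(start_sec=time_sec, end_sec=time_sec + 1 / fps)['video']
        # Assumes get_clip returns a (C, T, H, W) tensor, so index the time axis
        # rather than the channel axis to get the first frame.
        return clip[:, 0]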

    def get_batch(self, indices):
        frames = [self.get_frame_by_index(idx) for idx in indices]
        return torch.stack(frames)

    def seek(self, frame_index):
        # EncodedVideo does not support non-sequential access like seek.
        # This is a placeholder if you need a similar method.
        pass

    @classmethod
    def from_path(cls, uri):
        # Construct the reader directly; calling cls.from_path here would recurse forever.
        return cls(uri)
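
The indexing behaviour requested above (len, negative indices, slices, batches) does not depend on the decoder, so it can be sketched on its own. The following is only an illustration under assumptions: `FrameSequence`, `frame_window`, and the `fetch` callable are hypothetical names, not part of PyTorchVideo or Decord, and the stand-in backend returns time windows instead of decoded pixels.

```python
from typing import Callable, Iterable, Tuple


class FrameSequence:
    """Hypothetical helper: Decord-style indexing over any clip-fetching function."""

    def __init__(self, num_frames: int, fps: float,
                 fetch: Callable[[float, float], object]):
        self._num_frames = num_frames
        self._fps = fps
        self._fetch = fetch  # called as fetch(start_sec, end_sec)

    def __len__(self) -> int:
        return self._num_frames

    def frame_window(self, index: int) -> Tuple[float, float]:
        """Return the [start_sec, end_sec) interval covered by one frame."""
        return index / self._fps, (index + 1) / self._fps

    def __getitem__(self, idx):
        # Slices are expanded to explicit indices and served as a batch.
        if isinstance(idx, slice):
            return self.get_batch(range(*idx.indices(len(self))))
        # Negative indices count from the end, as with lists.
        if idx < 0:
            idx += len(self)
        if not 0 <= idx < len(self):
            raise IndexError("frame index out of range")
        return self._fetch(*self.frame_window(idx))

    def get_batch(self, indices: Iterable[int]) -> list:
        return [self[i] for i in indices]


# Stand-in backend (assumption): echo the requested window instead of decoding.
reader = FrameSequence(num_frames=4, fps=2.0, fetch=lambda start, end: (start, end))
```

With a real decoder, `fetch` would wrap something like `get_clip` and `get_batch` would stack the resulting tensors; keeping the index handling separate makes it testable without a video file.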

NOTE: Please look at the existing list of Issues tagged with the label 'enhancement'. Only open a new issue if you do not see your feature request there.

Motivation

Pitch

NOTE: we only consider adding new features if they are useful for many users.