rishikksh20 / ViViT-pytorch

Implementation of ViViT: A Video Vision Transformer


How to do the implementation with my Dataset

vaibhavsah opened this issue · comments

Hi

I am a newbie to PyTorch and I want to use this model for my thesis. Could you please explain how to run this code on my dataset? It would also help if the Readme file were updated with simple steps to run and evaluate the model on a custom dataset.

@rishikksh20 @aarti9
In my dataset I have 6 classes to identify. When using the code in the develop branch, I get an error saying the loss function does not support multi-class classification. Please help me out here.

RuntimeError: multi-target not supported at /pytorch/aten/src/THCUNN/generic/ClassNLLCriterion.cu:15

Please help me out.
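For context, this error means the loss received targets with an extra dimension. `nn.CrossEntropyLoss` (and the `NLLLoss` it wraps) expects a 1-D long tensor of class indices, one per sample, not a one-hot or column-shaped target. A minimal sketch of the wrong and right target shapes (the numbers are made up):

```python
import torch
import torch.nn as nn

loss_fn = nn.CrossEntropyLoss()
logits = torch.randn(4, 6)               # batch of 4 samples, 6 classes

# Wrong: a (4, 1) column of labels triggers "multi-target not supported"
bad_targets = torch.tensor([[0], [2], [5], [1]])

# Right: a 1-D long tensor of class indices, shape (4,)
targets = bad_targets.squeeze(1)         # tensor([0, 2, 5, 1])
loss = loss_fn(logits, targets)
```

Squeezing (or building the label as a scalar index in the first place) is usually enough to clear this error.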

Thanks @aarti9. You may connect with me on sah.vaibhav@gmail.com
I have also mailed you about the issue.

As the code was not working, after a bit of research I tried updating the loss function calculation to
loss_func = nn.CrossEntropyLoss(weight=class_weights.to(device))

Where class_weights are calculated as:
tensor([0.0045, 0.0042, 0.0048, 0.0038, 0.0070, 0.0065])

But this also didn't work. It seems there is some issue with the dimensions of the target variable in the train_epoch function. Please help me fix it.
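For what it's worth, per-class weights are typically derived from inverse class frequency; a weight tensor alone won't fix a target-shape error, but here is a sketch of how such weights are usually built and used (the per-class counts are hypothetical):

```python
import torch
import torch.nn as nn

# Hypothetical sample counts for the 6 classes
counts = torch.tensor([220.0, 240.0, 210.0, 260.0, 140.0, 155.0])

# Inverse-frequency weights, normalised to sum to 1
weights = 1.0 / counts
weights = weights / weights.sum()

loss_fn = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(8, 6)
targets = torch.randint(0, 6, (8,))   # 1-D long targets, as the loss requires
loss = loss_fn(logits, targets)
```

Note the targets are still a flat `(N,)` long tensor; the `weight` argument only rescales each class's contribution to the loss.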

Earlier I used exactly that, as it was already in your code. The thing is that I have made some changes in the dataset preprocessing to read the videos from a dataframe, and I'm not sure whether that is causing the errors.

import numpy as np
import torch
from torch.utils import data

class DatasetProcessing(data.Dataset):
    def __init__(self, df, root_dir):
        super(DatasetProcessing, self).__init__()
        # List of all video paths: root_dir/<class_label>/<video_name>
        video_list = df["Video"].apply(lambda x: root_dir + '/' + x)
        self.video_list = np.asarray(video_list)
        self.df = df

    def __getitem__(self, index):
        # The raw videos must sit in per-class folders so that the parent
        # folder name matches the output class label
        video_label = self.video_list[index].split('/')[-2]

        video_frames, len_ = get_frames(self.video_list[index], n_frames=15)
        video_frames = np.asarray(video_frames, dtype=np.float32) / 255.0

        class_list = ['Run', 'Walk', 'Wave', 'Sit', 'Turn', 'Stand']
        # np.where(class_list == video_label) compares a Python list to a
        # string, which is always False and yields an empty index array;
        # list.index() returns the scalar class id instead
        label = class_list.index(video_label)

        d = torch.as_tensor(video_frames)
        # nn.CrossEntropyLoss needs an integer (long) class index per
        # sample, not a float array
        l = torch.as_tensor(label, dtype=torch.long)
        return (d, l)

    def __len__(self):
        return self.video_list.shape[0]

This is the file I have created on Colab. Colab Sheet
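As a sanity check on the batch shapes, a toy stand-in for the dataset (random frames instead of `get_frames`, made-up dimensions) shows what a `DataLoader` should yield when each item is `(frames, scalar_long_label)`:

```python
import torch
from torch.utils.data import Dataset, DataLoader

# Toy stand-in for DatasetProcessing: random frames, scalar long labels
class ToyVideoDataset(Dataset):
    def __init__(self, n_videos=10, n_frames=15, n_classes=6):
        self.n_videos = n_videos
        self.n_frames = n_frames
        self.n_classes = n_classes

    def __getitem__(self, index):
        frames = torch.rand(self.n_frames, 3, 64, 64)   # (T, C, H, W)
        label = torch.tensor(index % self.n_classes)    # 0-dim long tensor
        return frames, label

    def __len__(self):
        return self.n_videos

loader = DataLoader(ToyVideoDataset(), batch_size=4)
frames, labels = next(iter(loader))
# frames collates to (4, 15, 3, 64, 64); labels to (4,), dtype torch.int64
```

If `labels` comes out with shape `(N, 1)` instead of `(N,)`, the per-item label is not a scalar, and that is exactly what produces the multi-target error in the loss.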

@aarti9 I changed my approach: I divided my test and train data into folders and used your develop branch from scratch. But a new issue came up, saying that ViViT has no attribute evaluate. What's so strange here? Please acknowledge.

AttributeError: 'ViViT' object has no attribute 'evaluate'


@aarti9 Instead of model.evaluate(), I called the standalone evaluate function. That did the trick to run the code.
But for any number of epochs, or any parameter values, the accuracy output is fixed at 21.09%, which seems strange.
There may be a fix needed in how the accuracy is calculated. Can you please check at your end what the issue could be?
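For reference, ViViT is a plain nn.Module, so there is no Keras-style .evaluate method; a standalone evaluation loop is the usual pattern. A minimal sketch (the model and loader names are placeholders, not the repo's actual API):

```python
import torch

def evaluate(model, data_loader, device="cpu"):
    """Return top-1 accuracy of `model` over `data_loader`."""
    model.eval()                    # disable dropout / batch-norm updates
    correct, total = 0, 0
    with torch.no_grad():
        for frames, labels in data_loader:
            frames = frames.to(device)
            labels = labels.to(device).view(-1)   # ensure shape (N,)
            logits = model(frames)
            preds = logits.argmax(dim=1)          # predicted class per sample
            correct += (preds == labels).sum().item()
            total += labels.numel()
    return correct / max(total, 1)
```

An accuracy frozen at one value across all epochs often means the comparison is between mismatched shapes (broadcasting silently inflates the count) or the model always predicts the majority class; taking `argmax` over the class dimension and comparing against flat `(N,)` labels, as above, rules out the former.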

@vaibhavsah Hello, have you fixed the problem of the 21.09% accuracy? I also faced some issues training ViViT on a custom dataset. Could you please share the Colab sheet again? The link is broken. Thanks.

@2000222 Sorry, I couldn't find a solution either. I had to switch to a basic Transformer model.