ajabri / videowalk

Repository for "Space-Time Correspondence as a Contrastive Random Walk" (NeurIPS 2020)

Home Page: http://ajabri.github.io/videowalk


CUDNN_STATUS_INTERNAL_ERROR with batch size 40 on 8x2080RTX

vadimkantorov opened this issue

@ajabri Have you encountered this stack trace? pytorch/pytorch#51382

With batch size 35 everything works; with batch size 40 even the first iteration fails. It is using 7 GB of the 11 GB of GPU memory just before the exception.

This is surprising: I cannot fit 2x the batch size on 4x the number of GPUs... What was the memory size of your 2080RTX? Also 11 GB, or a larger one?

Okay, it seems that my CUDA device 7 is faulty :( pytorch/pytorch#51382 (comment)
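
For anyone hitting the same error, here is a rough sketch of how a single bad GPU could be isolated by probing each device on its own. It is not code from this repository: `conv_probe` and `find_faulty_devices` are hypothetical helper names, and the tensor sizes are arbitrary. It assumes PyTorch with CUDA is installed and that a faulty device fails on a small cuDNN convolution, which may not hold for every hardware fault.

```python
from typing import Callable, Dict, Iterable


def conv_probe(device_index: int) -> None:
    """Run one small convolution forward/backward pass on the given GPU."""
    # Imported lazily so find_faulty_devices can be used with a custom probe.
    import torch

    device = torch.device("cuda", device_index)
    conv = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1).to(device)
    x = torch.randn(4, 3, 64, 64, device=device)
    conv(x).sum().backward()
    # CUDA calls are asynchronous; synchronize so errors surface here.
    torch.cuda.synchronize(device)


def find_faulty_devices(
    device_indices: Iterable[int],
    probe: Callable[[int], None] = conv_probe,
) -> Dict[int, str]:
    """Return {device index: first line of the error} for devices whose probe fails."""
    failures: Dict[int, str] = {}
    for index in device_indices:
        try:
            probe(index)
        except RuntimeError as err:  # PyTorch reports CUDA/cuDNN failures as RuntimeError
            lines = str(err).splitlines()
            failures[index] = lines[0] if lines else "RuntimeError"
    return failures


# Example (needs GPUs):
#   import torch
#   print(find_faulty_devices(range(torch.cuda.device_count())))
```

One caveat: a CUDA error can leave the whole process in a bad state, so later probes in the same process may fail even on healthy devices. Running the probe once per device in a fresh process, with `CUDA_VISIBLE_DEVICES` set to a single index, is likely the more reliable variant.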