ncullen93 / torchsample

High-Level Training, Data Augmentation, and Utilities for Pytorch

`fit_loader` vs. `fit_generator`

githubnemo opened this issue · comments

I feel that fit_loader is a special case of Keras' fit_generator, and I found myself in a situation where I was missing the latter.

Is there a reason why fit_loader is not implemented as syntactic sugar on top of fit_generator, and why fit_generator is missing completely?

fit_loader is the same thing as fit_generator... pytorch just calls them "loaders" while Keras calls them "generators". The Keras implementation of fit_generator iterates through the generator and calls train_on_batch. I'm actually working on a rewrite, so if you have any specific requests let me know.
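
Roughly, that Keras loop boils down to something like the following simplified sketch (it ignores callbacks, validation data, and multiprocessing; model.train_on_batch is Keras' own per-batch method):

def fit_generator_sketch(model, generator, steps_per_epoch, epochs):
    # pull one (x, y) batch at a time from the generator and train on it
    for epoch in range(epochs):
        for step in range(steps_per_epoch):
            x, y = next(generator)
            model.train_on_batch(x, y)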

I don't think they are equal. Generators are a very generic concept in Python: I can easily modify an already existing loader with a generator, but the reverse doesn't hold.

For example, I stumbled upon a problem with torchsample where I had a secondary loss that did not use any labels, but torchsample forces me to supply labels for every defined loss function, otherwise that loss is not executed. This would have been easy to work around with generators:

def mnist_modified():
    # wrap the existing DataLoader and duplicate each batch so that
    # every defined loss function receives a target tensor
    for data, target in mnist_train_loader:
        yield [data, data], [target, target]

But with fit_loader I would have to implement a whole loader instead.
Generators are the more generic approach. I agree that it is convenient to have fit_loader, but it can be implemented on top of fit_generator, and the latter should exist since it is the more generic of the two methods, as the sketch below illustrates.
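
To make the "syntactic sugar" point concrete, here is a rough sketch of how the two methods could relate (this is not torchsample's actual API; train_on_batch stands in for whatever per-batch training step the trainer uses):

def fit_generator(trainer, generator, steps_per_epoch, num_epoch=100):
    # generic loop: pull (inputs, targets) pairs from any python generator
    for epoch in range(num_epoch):
        for _ in range(steps_per_epoch):
            inputs, targets = next(generator)
            trainer.train_on_batch(inputs, targets)

def fit_loader(trainer, loader, num_epoch=100):
    # a DataLoader is just one particular generator: restart it each
    # epoch and hand the fresh iterator to the generic method
    for _ in range(num_epoch):
        fit_generator(trainer, iter(loader), len(loader), num_epoch=1)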

At the risk of drifting off-topic: I like the fact that torchsample provides much of Keras' functionality, but I think it takes away a lot of the flexibility when defining losses. I would have liked torchsample to be a bit less convenient but a bit more flexible. I would be perfectly content with torchsample demanding that I supply several values in my training function (val_loss, loss, ...) as long as it lets me define the information flow of my data and the loss function. The compile and fit functions feel very static to me.

commented

+1

In my use case I am creating [X, Y] batches on the fly from some in-memory objects, and I cannot generate all possible [X, Y] outcomes ahead of time because they would not fit in memory.

It seems like pytorch loaders are designed for fixed datasets that fit in memory. A fit_generator method would be applicable in my use case, where data loaders are not enough.
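
For illustration, such an on-the-fly generator could look like the following sketch (make_example is a hypothetical helper that builds one (X, Y) pair from a single in-memory object):

import random
import torch

def batch_generator(objects, make_example, batch_size):
    # build (X, Y) batches on the fly from in-memory objects, without
    # materialising every possible (X, Y) pair ahead of time
    while True:
        sampled = random.sample(objects, batch_size)
        xs, ys = zip(*(make_example(obj) for obj in sampled))
        yield torch.stack(xs), torch.stack(ys)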

commented

@ncullen93 Would you be interested in merging a PR if I were to code it?

commented

@githubnemo I found another keras-like wrapper library that has fit_generator and that I can recommend -> https://pytoune.org/