oscarknagg / few-shot

Repository for few-shot learning machine learning projects


Best practice for evaluation code

sangwoomo opened this issue · comments

Hi, thank you for your great implementation!

I found that your Keras-like fit function is elegant, but since the evaluation code is implemented as a callback that runs after each epoch, it is unclear how to load a trained model and check its performance.

Can you share your evaluation code?

I only used the callback when evaluating, so I don't have any other evaluation code. This is the function that the evaluate callback uses:

def evaluate(model: Module, dataloader: DataLoader, prepare_batch: Callable, metrics: List[Union[str, Callable]],
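For intuition, a minimal loop matching that signature might look like the sketch below. This is illustrative only, not the repository's actual implementation: in the few-shot setting the per-batch forward pass and metric computation are more involved, but the overall structure (no gradients, iterate the dataloader, accumulate metrics into a dict) is the same.

import torch

# Illustrative sketch of an evaluate-style loop for a plain classifier:
# run the model over every batch without gradients and accumulate accuracy.
def evaluate_sketch(model, dataloader, prepare_batch, metrics=('accuracy',)):
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for batch in dataloader:
            x, y = prepare_batch(batch)
            y_pred = model(x)
            correct += (y_pred.argmax(dim=-1) == y).sum().item()
            total += y.numel()
    return {'accuracy': correct / total}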

You can call it on its own if you want to evaluate a pre-trained model. Here's the outline of an evaluation script.

import torch
from torch.utils.data import DataLoader
from few_shot.eval import evaluate
from few_shot.datasets import OmniglotDataset
from few_shot.core import NShotTaskSampler

args = your_args

# Rebuild the model and load the trained weights from disk
model = ModelClass()
model.load_state_dict(torch.load('path/to/model/weights.pt'))
model.eval()

# Sample n-shot, k-way, q-query tasks from the Omniglot evaluation split
evaluation = OmniglotDataset('evaluation')
dataloader = DataLoader(
    evaluation,
    batch_sampler=NShotTaskSampler(evaluation, args.eval_batches, n=args.n, k=args.k, q=args.q,
                                   num_tasks=args.meta_batch_size),
    num_workers=8
)

prepare_batch = your_prepare_batch_function
metrics = evaluate(model, dataloader, prepare_batch, metrics=['accuracy'])
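If you don't already have a prepare_batch function, here is a hedged sketch of what one could look like for an n-shot, k-way, q-query episode. The make_prepare_batch helper and its label construction are illustrative stand-ins; I believe the repository ships a similar helper (prepare_nshot_task in few_shot.core), which is preferable if it matches your setup. Note that x contains both support and query samples, while y labels only the q query samples per class, which is the convention the repository's episode functions appear to use.

import torch

# Hypothetical stand-in for your_prepare_batch_function: move the episode to
# the GPU and build one label per query sample, grouped by class
# ([0]*q + [1]*q + ... + [k-1]*q).
def make_prepare_batch(k, q, device='cuda', dtype=torch.double):
    def prepare_batch(batch):
        x, _ = batch
        x = x.to(device, dtype=dtype)
        y = torch.arange(k, device=device).repeat_interleave(q)
        return x, y
    return prepare_batch

prepare_batch = make_prepare_batch(k=args.k, q=args.q)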

Hi @oscarknagg, in this evaluation code I don't see the forward and backward passes for the support set of a new task. Could you explain how we can adapt to a new task?

@oscarknagg Thank you for your great contribution. I ran the code step by step and I don't see evaluate.py ever being run. Am I mistaken?