deepset-ai / FARM

:house_with_garden: Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.

Home Page: https://farm.deepset.ai

Few-shot learner with embedding layer

rodrigoheck opened this issue · comments

Question

Instead of fine-tuning a transformer (which is a little overkill for my task), I am trying to implement a few-shot learner using Keras. To do that, I first extract the embeddings from a model previously fine-tuned on a similar dataset (by following this code). Then I use these embeddings as input to my Keras Sequential model, which classifies them into a set of predetermined categories. But my performance is very poor (close to random guessing) and I can't figure out why. Maybe I am doing something wrong, or the embeddings shouldn't be used for this purpose?
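As a quick way to test whether the embeddings carry any signal at all, here is a minimal, dependency-free sketch of the overall idea with the Keras network swapped out for a nearest-centroid few-shot classifier. The toy vectors and labels below are hypothetical; in practice the embeddings would come from the extraction code linked above. If even this simple baseline beats random on the real embeddings, the problem lies in the downstream model or data pipeline rather than in the embeddings themselves.

```python
import math
from collections import defaultdict

def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

class NearestCentroidFewShot:
    """Few-shot classifier: one centroid per class, cosine similarity at inference."""

    def fit(self, embeddings, labels):
        by_label = defaultdict(list)
        for emb, lab in zip(embeddings, labels):
            by_label[lab].append(emb)
        self.centroids = {lab: centroid(vs) for lab, vs in by_label.items()}
        return self

    def predict(self, embedding):
        # Pick the class whose centroid is most similar to the query embedding.
        return max(self.centroids, key=lambda lab: cosine(embedding, self.centroids[lab]))

# Toy "embeddings" standing in for vectors extracted from the fine-tuned model.
train_embs = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1],   # class "greeting"
              [0.0, 0.1, 1.0], [0.1, 0.0, 0.9]]   # class "farewell"
train_labels = ["greeting", "greeting", "farewell", "farewell"]

clf = NearestCentroidFewShot().fit(train_embs, train_labels)
print(clf.predict([0.95, 0.15, 0.05]))  # close to the "greeting" centroid
```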

I appreciate any input :)

Hey @rodrigoheck, in general this should work.

I guess you have seen the paper To Tune or Not to Tune discussing this matter.


So regarding your problem: the code you pointed to looks generally fine and should work better than random, so I suspect some other part of the pipeline went wrong. Once that is fixed, you could experiment with a different extraction_strategy or extraction_layer.
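To make concrete what different extraction strategies mean, here is a dependency-free sketch of two common pooling approaches over per-token output vectors. The function names and toy vectors are illustrative, not FARM's actual API; the point is only that "CLS pooling" takes the first token's vector while "mean pooling" averages across all tokens, and the two can give quite different sentence embeddings.

```python
def cls_token_pooling(token_vectors):
    """Use the first token's vector (e.g. BERT's [CLS]) as the sentence embedding."""
    return token_vectors[0]

def mean_pooling(token_vectors):
    """Average each dimension across all token vectors."""
    n = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(v[i] for v in token_vectors) / n for i in range(dim)]

# Toy per-token output vectors from one transformer layer (3 tokens, dim 2).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(cls_token_pooling(tokens))  # first token only
print(mean_pooling(tokens))       # average over all three tokens
```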

  • Did you try classifying the data by finetuning the whole model with success?
  • Can you be sure that text, embedding and label correspond correctly?
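For the second bullet, a quick dependency-free sanity check (with hypothetical argument names) can catch the most common alignment bugs, e.g. a shuffle applied to one list but not the others, or embeddings of inconsistent dimensionality:

```python
def check_alignment(texts, embeddings, labels):
    """Fail early if the three parallel lists have drifted out of sync."""
    if not (len(texts) == len(embeddings) == len(labels)):
        raise ValueError(
            f"Length mismatch: {len(texts)} texts, "
            f"{len(embeddings)} embeddings, {len(labels)} labels"
        )
    dims = {len(e) for e in embeddings}
    if len(dims) > 1:
        raise ValueError(f"Inconsistent embedding dimensions: {sorted(dims)}")
    # Print the first few rows so text/label pairs can be spot-checked by eye.
    for text, emb, label in list(zip(texts, embeddings, labels))[:3]:
        print(f"{label!r:12} {text[:40]!r:45} dim={len(emb)}")

check_alignment(
    texts=["hello there", "see you later"],
    embeddings=[[0.1, 0.2], [0.3, 0.4]],
    labels=["greeting", "farewell"],
)
```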

Hope that helps in figuring out what's going on.

Thanks for your input, @Timoeller!

You are indeed right, and I probably made some mistake when creating the pipeline. I did it again and now it is working fine.

Let me ask one more question: is it possible for an Inferencer loaded for a classification task to also provide the embeddings, or must I load the Inferencer again for that specific task?

Thanks!

Nice, glad to hear it works now.

Yes, you can do that: you can fine-tune a transformer on a classification task and afterwards use an embedding Inferencer to get embeddings from the fine-tuned model. The fine-tuning should indeed help you create better embeddings of the data (to be used as an index or for clustering?).