Adding additional information for a classification task

Question

danarte opened this issue 2 years ago

Hello,
What is the best way to add extra information to each sequence for a classification task, for example genome location or annotation data?

This repository is great: I was able to adapt it to my data (a set of sequences) and got much better classification performance than with other models. I believe I could improve the classification further (and perhaps publish the results if they are good enough) if I could feed additional data to the classifier.

What would be the best way to build such a model? Should I simply append the data to the sequence (it comes in different formats: IDs, ints, floats, characters...)? Write a custom task? Train a classifier and then wrap it inside a bigger model that also takes the additional information? Or is there some other method?
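To make the "wrap it inside a bigger model" option concrete, here is a rough, framework-free sketch of what I have in mind: encode the extra fields as numbers, concatenate them with a pooled sequence embedding, and pass the fused vector to a small classification head. Everything in it is an assumption for illustration only: the field names (`chrom`, `position`), the helper functions, the embedding values, and the untrained logistic head are hypothetical and not part of this repository.

```python
import math

def encode_metadata(chrom, position, chrom_vocab, genome_length):
    """Turn hypothetical annotation fields into a flat list of floats.

    Categorical values are one-hot encoded; the numeric position is
    scaled to [0, 1] so it is comparable in magnitude to the embedding.
    """
    one_hot = [1.0 if chrom == c else 0.0 for c in chrom_vocab]
    scaled_position = position / genome_length
    return one_hot + [scaled_position]

def fuse(sequence_embedding, metadata_features):
    """Late fusion: concatenate the sequence embedding with the metadata."""
    return list(sequence_embedding) + list(metadata_features)

def predict_proba(features, weights, bias):
    """Logistic-regression head applied to the fused feature vector."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Stand-in for the pooled output of the sequence model (made-up values).
embedding = [0.2, -0.1, 0.4, 0.05]

# Hypothetical vocabulary and genome length, chosen only for the example.
chrom_vocab = ["chr1", "chr2", "chrX"]
metadata = encode_metadata("chr2", 5_000_000, chrom_vocab, 100_000_000)

fused = fuse(embedding, metadata)

# Untrained head: all-zero weights, so the output is exactly 0.5.
weights = [0.0] * len(fused)
probability = predict_proba(fused, weights, 0.0)
```

In a real setup the head would be trained jointly with (or on top of) the sequence model, but the concatenation step would look the same.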

I'm not very familiar with the Hugging Face framework, so I feel a bit lost looking for a straightforward solution of the kind I would use with less complex models.

PS: It would be great (and a guaranteed publication) if I could still use the visualization/importance feature you showed in your example while including the additional data.