How to mine fine-tuning samples from specified corpora

Question

glacierck opened this issue 4 months ago · comments

How can the system be extended so that fine-tuning samples are generated from a given set of corpus documents, rather than being fabricated blindly?
For example, when generating fine-tuning samples for disease diagnosis, I would like them to be based on the cases in the uploaded real diagnosis reports.
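
To make the request concrete, here is a minimal sketch of the behaviour I have in mind. It is not this project's API: `chunk_text`, `build_prompt`, `mine_samples`, and the `generate` callable (a stand-in for whatever LLM call the system makes) are all hypothetical names, and the prompt wording is only an assumption. The idea is that every sample is produced from a chunk of an uploaded document and keeps a pointer back to that chunk.

```python
from typing import Callable, Dict, List


def chunk_text(text: str, max_chars: int = 500) -> List[str]:
    """Group paragraphs into chunks of roughly max_chars characters.

    A single paragraph longer than max_chars is kept whole.
    """
    chunks: List[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks


def build_prompt(chunk: str) -> str:
    """Build a prompt that restricts the model to the given source text (hypothetical wording)."""
    return (
        "Using ONLY the case text below, write one question and its answer.\n"
        "Do not add facts that are not in the text.\n\n"
        f"Case text:\n{chunk}"
    )


def mine_samples(
    docs: Dict[str, str],
    generate: Callable[[str], str],  # hypothetical stand-in for the LLM call
    max_chars: int = 500,
) -> List[dict]:
    """Produce one sample per chunk, recording which document chunk it came from."""
    samples = []
    for doc_id, text in docs.items():
        for i, chunk in enumerate(chunk_text(text, max_chars)):
            samples.append(
                {
                    "source": f"{doc_id}#chunk{i}",
                    "context": chunk,
                    "output": generate(build_prompt(chunk)),
                }
            )
    return samples
```

With something like this, each generated sample could be traced back to (and checked against) the diagnosis report it was derived from, which is the constraint I am asking about.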