Motivated by the poor performance of large language models (LLMs) on highly compositional reasoning questions, we conducted a quantitative evaluation on two datasets: geographic location and kinship relations.
The results show that LLMs are deficient in both deductive and inductive reasoning, and they suggest two possible remedies: supplying the model with explicit logic rules and using specifically designed prompting.