How does everyone prepare their own dataset?
YowFung opened this issue · comments
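Hi everyone, I'm still a beginner. How do you all prepare your own datasets?
I can't even get the program to run yet. I downloaded some files following README.md (shown in the screenshot), but I don't know where to put them or how to rename them.
The code uses absolute paths in many places, which I assume all need to be changed to my own paths. How exactly should I change them, and where can I find the corresponding files?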
/data2/wangshuhe/gpt3_ner/gpt3-data/ontonotes5_mrc
/data2/wangshuhe/gpt3_ner/gpt3-data/ontonotes5_mrc/test.100.simcse.dev.32.knn.jsonl
/data2/wangshuhe/gpt3_ner/gpt3-data/conll_mrc/
/data2/wangshuhe/gpt3_ner/gpt3-data/conll_mrc/mrc-ner.test.100
/data2/wangshuhe/gpt3_ner/gpt3-data/conll_mrc/test.100.simcse.32.knn.jsonl
/data2/wangshuhe/gpt3_ner/gpt3-data/conll_mrc/test.random.32.knn.jsonl
/data2/wangshuhe/gpt3_ner/gpt3-data/conll_mrc/low_resource
/data2/wangshuhe/gpt3_ner/gpt3-data/conll_mrc/low_resource/test.10000.simcse.32.knn.jsonl
/data2/wangshuhe/gpt3_ner/gpt3-data/conll_mrc/100-results/
/data2/wangshuhe/gpt3_ner/gpt3-data/conll_mrc/100-results/openai.32.knn.sequence.fullprompt
/data2/wangshuhe/gpt3_ner/gpt3-data/conll_mrc/100-results/openai.32.entity.knn.sequence.fullprompt
/data2/wangshuhe/gpt3_ner/gpt3-data/conll_mrc/100-results/openai.32.entity.rectify.knn.sequence.fullprompt
/data2/wangshuhe/gpt3_ner/gpt3-data/conll_mrc/100-results/openai.32.knn.sequence.fullprompt.verified
/nfs1/shuhe/gpt3-ner/features/conll03
/nfs1/shuhe/gpt3-ner/gpt3-data/en_conll_2003/test.100.verify.knn.jsonl
/nfs1/shuhe/gpt3-ner/gpt3-data/en_conll_2003/test.verify.knn.jsonl
/nfs1/shuhe/gpt3-ner/gpt3-data/en_conll_2003/mrc-ner.train.dev
/nfs1/shuhe/gpt3-ner/gpt3-data/en_conll_2003/text-3/
/nfs1/shuhe/gpt3-ner/gpt3-data/en_conll_2003/text-3/openai.17.knn.train.dev.sequence.fullprompt
/nfs1/shuhe/gpt3-ner/gpt3-data/en_conll_2003/text-full/openai.15.knn.train.dev.sequence.fullprompt
/nfs1/shuhe/gpt3-ner/gpt3-data/en_conll_bert
/nfs1/shuhe/gpt3-ner/gpt3-data/en_conll
/nfs1/shuhe/gpt3-ner/gpt3-data/en_conll/results.tmp
/nfs1/shuhe/gpt3-ner/origin_data/conll03_mrc
/nfs1/shuhe/gpt3-nmt/sup-simcse-roberta-large
/nfs1/shuhe/gpt3-nmt/data/en-fr/dev.en
/nfs/shuhe/gpt3-ner/gpt3-data/conll_mrc/
/nfs/shuhe/gpt3-ner/gpt3-data/conll_mrc/start_word_embedding
/nfs/shuhe/gpt3-ner/gpt3-data/conll_mrc/start_word_embedding/test.100.full.knn.jsonl
/nfs/shuhe/gpt3-ner/gpt3-data/conll_mrc/start_word_embedding_sorted
/nfs/shuhe/gpt3-ner/gpt3-data/conll_mrc/start_word_embedding_sorted/test.full.knn.jsonl
/nfs/shuhe/gpt3-ner/gpt3-data/conll_mrc/low_resource
/nfs/shuhe/gpt3-ner/gpt3-data/conll_mrc/low_resource/low_resource_1_knn/test.simcse.knn.jsonl
/nfs/shuhe/gpt3-ner/gpt3-data/ontonotes5_mrc/
/nfs/shuhe/gpt3-ner/gpt3-data/zh_onto4/
/nfs/shuhe/gpt3-ner/gpt3-data/zh_onto4/test.embedding.knn.jsonl
/nfs/shuhe/gpt3-ner/gpt3-data/zh_onto4/start_word_embedding
/nfs/shuhe/gpt3-ner/gpt3-data/zh_onto4/start_word_embedding/test.mrc.knn.jsonl
/nfs/shuhe/gpt3-ner/gpt3-data/zh_msra
/nfs/shuhe/gpt3-ner/gpt3-data/zh_msra/test.embedding.knn.jsonl
/nfs/shuhe/gpt3-ner/gpt3-data/ace2004/
/nfs/shuhe/gpt3-ner/gpt3-data/ace2005/
/nfs/shuhe/gpt3-ner/gpt3-data/genia/
/nfs/shuhe/gpt3-ner/models/text2vec-base-chinese
/home/wangshuhe/gpt-ner/openai_access/low_resource_data/conll_en
/home/wangshuhe/gpt-ner/openai_access/low_resource_data/conll_en/test.8.embedding.knn.jsonl
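One possible way to handle hard-coded paths like the ones listed above is to map each server-specific prefix onto a single local data directory. The following is only a minimal sketch: the prefixes are copied from the list above, but the local root `/my/data`, the helper name `remap_path`, and the choice to merge all prefixes into one directory are assumptions, not something the repository prescribes. Files that share a relative path under different original prefixes may in fact differ, so the merged layout should be checked by hand.

```python
from pathlib import PurePosixPath

# Server-specific prefixes copied from the path list above.
# Mapping all of them onto one local root is an assumption of this sketch.
REMOTE_PREFIXES = (
    "/data2/wangshuhe/gpt3_ner",
    "/nfs1/shuhe/gpt3-ner",
    "/nfs1/shuhe/gpt3-nmt",
    "/nfs/shuhe/gpt3-ner",
    "/home/wangshuhe/gpt-ner",
)


def remap_path(path: str, local_root: str) -> str:
    """Replace a known remote prefix with local_root (hypothetical helper).

    Paths that match no known prefix are returned unchanged.
    """
    # Try longer prefixes first so the most specific one wins.
    for prefix in sorted(REMOTE_PREFIXES, key=len, reverse=True):
        if path == prefix or path.startswith(prefix + "/"):
            rest = path[len(prefix):].lstrip("/")
            return str(PurePosixPath(local_root) / rest)
    return path


if __name__ == "__main__":
    # "/my/data" is a placeholder for wherever the downloaded files are stored.
    print(remap_path(
        "/nfs/shuhe/gpt3-ner/gpt3-data/zh_msra/test.embedding.knn.jsonl",
        "/my/data",
    ))
```

With such a mapping, each absolute path in the scripts could be replaced by its remapped value, and the downloaded files would be placed under the same relative layout (for example `gpt3-data/conll_mrc/` beneath the local root).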
Hi, I'm a beginner and have recently been trying to reproduce the code. Could we discuss it?
I'm a beginner too. Could we discuss it?
Of course!