martiansideofthemoon / style-transfer-paraphrase

Official code and data repository for our EMNLP 2020 long paper "Reformulating Unsupervised Style Transfer as Paraphrase Generation" (https://arxiv.org/abs/2010.05700).

Home Page: http://style.cs.umass.edu


COHA datasets - request example of end-to-end training

GenTxt opened this issue

I have read the information under 'Custom Datasets', but I'm unclear how it applies to COHA files, which are single-line texts in the following format:

@@10133

" But , oh , I do n't like those people . They do n't like us . They 're dead , they do n't care , they do n't even feel foolish , " Albany said . I felt mad enough @ @ @ @ @ @ @ @ @ @ hotly as she met his eyes . " etc.
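To make the question concrete, here is a rough sketch of how I imagine a raw COHA text would need to be cleaned first. It assumes the leading `@@10133` is a document ID and that the runs of `@ @ @ ...` are masked spans where the text should be cut; the name `coha_fragments` and both regular expressions are my own guesses, not anything from the repository.

```python
import re

# Assumption: each file starts with a document ID such as "@@10133".
HEADER_RE = re.compile(r"^@@\d+\s*")
# Assumption: a run of space-separated "@" symbols marks a masked span.
MASK_RE = re.compile(r"@(?:\s+@)+")


def coha_fragments(raw: str) -> list[str]:
    """Drop the ID header and cut the text at every masked span.

    Returns the surviving text fragments with whitespace normalized.
    """
    text = HEADER_RE.sub("", raw.strip())
    pieces = MASK_RE.split(text)
    return [" ".join(piece.split()) for piece in pieces if piece.strip()]


if __name__ == "__main__":
    sample = (
        '@@10133\n\n" They do n\'t like us . " Albany said . '
        'I felt mad enough @ @ @ @ @ @ @ @ @ @ hotly as she met his eyes . "'
    )
    for fragment in coha_fragments(sample):
        print(fragment)
```

The fragments would presumably still need to be split into sentences so that each line is one instance, but I am not sure which splitter was used for the released COHA data.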

I would like to train another COHA model. Is it possible to provide an end-to-end example that explains how to do this?

Should I collect the COHA text files into groups and merge them into train.txt, dev.txt, and test.txt?

I assume the train.label, dev.label, and test.label files in the example COHA folders can be reused with the labels modified, e.g. 1890s-1900s --> 1940s-1950s. Is that correct?
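In other words, I am picturing something like the sketch below. It assumes each `.txt` file holds one instance per line and each `.label` file holds one label per line in the same order; that layout, the function name `write_splits`, the split fractions, and the folder name `datasets/coha_1940s` are all my assumptions rather than something I found in the repository.

```python
import random
from pathlib import Path


def write_splits(sentences, label, out_dir, dev_frac=0.05, test_frac=0.05, seed=0):
    """Shuffle sentences and write train/dev/test .txt files with parallel .label files.

    Assumption: every line in a .label file is the style label for the
    line with the same index in the matching .txt file.
    """
    items = list(sentences)
    random.Random(seed).shuffle(items)

    n_dev = int(len(items) * dev_frac)
    n_test = int(len(items) * test_frac)
    splits = {
        "dev": items[:n_dev],
        "test": items[n_dev:n_dev + n_test],
        "train": items[n_dev + n_test:],
    }

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, lines in splits.items():
        text = "".join(line + "\n" for line in lines)
        labels = "".join(label + "\n" for _ in lines)
        (out / f"{name}.txt").write_text(text, encoding="utf-8")
        (out / f"{name}.label").write_text(labels, encoding="utf-8")

    # Return the number of instances written to each split.
    return {name: len(lines) for name, lines in splits.items()}


if __name__ == "__main__":
    demo = [f"sentence number {i} ." for i in range(100)]
    # Hypothetical label string and output folder.
    print(write_splits(demo, "1940s-1950s", "datasets/coha_1940s"))
```

If the label files are expected to look different from this, that is exactly the part I would like the end-to-end example to clarify.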

Thanks