CatalogueLegacies / simGeorge

Making a GPT-2 inflected model that writes descriptions a bit like Mary Dorothy George

simGeorge: background and license

This repository hosts the code and settings for creating a GPT-2 inflected language model that writes catalogue descriptions in the "voice" of Mary Dorothy George.

The code and settings can be run via Max Woolf's Colaboratory notebook (see 'How To Make Custom AI-Generated Text With GPT-2') or with a local install of the Python package gpt-2-simple, though the local route has not been tested.
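For readers trying the local route, the following is a minimal sketch of what a gpt-2-simple run might look like. It is not the repository's own configuration: the model size, run name, step count, sample length and temperature are all illustrative assumptions, and the function name `finetune_and_sample` is hypothetical. Only the corpus file name comes from this README, and the file is assumed to have been downloaded into the working directory beforehand.

```python
"""Sketch: fine-tune GPT-2 on the CurV corpus with gpt-2-simple.

Every value in SETTINGS except the corpus file name is an assumption
made for illustration, not a setting taken from this repository.
"""
from pathlib import Path

SETTINGS = {
    "dataset": "CurV-corpus-27Jan2019.txt",  # corpus file named in this README
    "model_name": "124M",                    # assumed: smallest GPT-2 checkpoint
    "run_name": "simGeorge",                 # hypothetical run name
    "steps": 1000,                           # assumed number of training steps
}


def finetune_and_sample(settings=SETTINGS, n_samples=3):
    """Fine-tune on the corpus, then return a list of generated descriptions."""
    # Check for the corpus first so a missing file fails fast,
    # before TensorFlow and gpt-2-simple are loaded.
    if not Path(settings["dataset"]).is_file():
        raise FileNotFoundError(
            f"Corpus not found: {settings['dataset']} (download it from Zenodo first)"
        )

    import gpt_2_simple as gpt2  # requires: pip install gpt-2-simple

    # Fetch the base model only if it is not already in ./models.
    if not (Path("models") / settings["model_name"]).is_dir():
        gpt2.download_gpt2(model_name=settings["model_name"])

    sess = gpt2.start_tf_sess()
    gpt2.finetune(
        sess,
        dataset=settings["dataset"],
        model_name=settings["model_name"],
        run_name=settings["run_name"],
        steps=settings["steps"],
    )
    # return_as_list=True returns the samples instead of printing them.
    return gpt2.generate(
        sess,
        run_name=settings["run_name"],
        nsamples=n_samples,
        length=150,        # assumed sample length in tokens
        temperature=0.7,   # assumed sampling temperature
        return_as_list=True,
    )
```

Calling `finetune_and_sample()` from a Python session would start training and return a list of generated texts. Like the local install itself, this sketch has not been tested against the corpus, and fine-tuning without a GPU is likely to be slow.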

The dataset used is CurV-corpus-27Jan2019.txt at http://doi.org/10.5281/zenodo.3245037.

Unless otherwise stated, these materials are licensed under the GNU General Public License v3.0.

This work is based on data created during the project 'Curatorial Voice: legacy descriptions of art objects and their contemporary uses', and is associated with the project 'Legacies of Catalogue Descriptions and Curatorial Voice: Opportunities for Digital Scholarship'.

Two papers and two datasets have emerged from this work:

  • James Baker and Andrew Salway, ‘Curatorial labour, voice, and legacy: Mary Dorothy George and the Catalogue of Political and Personal Satires, 1930-1954’, Historical Research (forthcoming 2020)
  • Andrew Salway and James Baker, ‘Investigating Curatorial Voice with Corpus Linguistic Techniques: the case of Dorothy George and applications in museological practice’, Museum & Society (2020).
  • Baker, James, & Salway, Andrew. (2019). Corpus Linguistic Analysis of the BMSatire Descriptions corpus [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3245017
  • Baker, James, & Salway, Andrew. (2019). Creation of the BMSatire Descriptions corpus (Version v1.0). Zenodo. http://doi.org/10.5281/zenodo.3245037

All data are derived from text written by M. Dorothy George and published between 1935 and 1954 as volumes 5 to 11 of the Catalogue of Political and Personal Satires Preserved in the Department of Prints and Drawings in the British Museum. This text is published in lightly edited form by the British Museum via ResearchSpace as linked open data at https://public.researchspace.org/sparql. The data, text and images available via this service are published under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license (Research Space, 2016; accessed 10 September 2018).
