gpt-3 vietnamese vietnamese-nlp lora nlp poem-generator

Vietnamese Poem Generation & The Prospect Of Cross-Language Poem-To-Poem Translation 📜🖋️

Poetry generation has been a challenging task in the field of Natural Language Processing, as it requires the model to understand the nuances of language, sentiment, and style. In this paper, we propose using Large Language Models to generate Vietnamese poems of various genres from natural language prompts, thereby facilitating an intuitive process with enhanced content control.

Our best-performing model, the GPT-3 Babbage variant, achieves a score of 0.8 on a custom evaluation metric tailored to the "luc bat" genre of Vietnamese poetry. We also explore paraphrasing poems into normal text prompts and using those prompts as inputs, which obtains a relatively high score of 0.781 in the "luc bat" genre. This experiment shows the potential for cross-language poem-to-poem translation, with translated poems as the inputs, while maintaining complete control over the generated content.

Dataset

The original dataset is a collection of 171,188 Vietnamese poems in five genres: luc-bat, 5-chu, 7-chu, 8-chu, and 4-chu. Download here.

For more detail, refer to the Acknowledgments section.

We also created our own datasets for prompt-based generation in the resource/dataset folder.

Pre-evaluation

We trained a custom BERT-based genre classifier, with an accuracy of 99.7%, to identify the genre of a poem before scoring. For more detail, refer to our vietnamese-poem-classifier. This is helpful in the blind test (where the genre is not specified).

The training code is in this repo. To train the classifier, run:

python poem_classifier_training.py

Evaluation

We use a custom function to score the quality of a poem, based solely on its conformity to the rigid rules of the various types of Vietnamese poems. It uses three criteria, Length (L), Tone (T), and Rhyme (R), combined as follows: score = L/10 + 3T/10 + 6R/10
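The weighting in the formula above can be illustrated with a minimal sketch. It only shows how the three sub-scores are combined; how each sub-score is computed is not covered here. The function name `poem_score` and the assumption that each sub-score lies in [0, 1] are illustrative and not taken from the repository.

```python
def poem_score(length: float, tone: float, rhyme: float) -> float:
    """Combine three sub-scores as score = L/10 + 3T/10 + 6R/10.

    Assumption (not from the repository): each sub-score is a value
    in [0, 1], so the combined score is also in [0, 1].
    """
    for name, value in (("length", length), ("tone", tone), ("rhyme", rhyme)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    # Weights 1, 3 and 6 sum to 10, so rhyme contributes 60% of the score.
    return (1 * length + 3 * tone + 6 * rhyme) / 10


if __name__ == "__main__":
    # A poem with correct line lengths but only half of the tone and
    # rhyme constraints satisfied.
    print(poem_score(length=1.0, tone=0.5, rhyme=0.5))
```

Because rhyme carries the largest weight, a poem with perfect line lengths and tones but no correct rhymes would score at most 0.4 under this weighting.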

Table 1: Result comparison of models

Models	Luc Bat	Blind	7 Chu	8 Chu	5 Chu	4 Chu
text-to-poem
ChatGPT (zero-shot)	0.440	0.345	0.292	0.197	0.284	0.238
Davinci (1000 samples)	0.580	-	-	-	-	-
BLOOM (20k samples)	0.678	0.596	0.367	0.279	0.480	0.440
Babbage (20k samples)	0.718	-	-	-	-	-
Babbage	0.805	0.795	0.661	0.500	0.382	0.392
poem-to-poem
Babbage	0.781	-	-	-	-	-

Currently, the Luc Bat genre scores highest due to its sheer sample size. The model also tends to generate Luc Bat when the genre is not specified, so it scores very high on the blind test as well.

Inference

The open-source version uses a LoRA adapter for BLOOM-7b1 in 8-bit and can run on Colab. You can try it here, though it will probably run out of memory and crash. It used to run fine, but newer library versions conflict with each other.

Citation

@misc{huynh2024vietnamese,
      title={Vietnamese Poem Generation & The Prospect Of Cross-Language Poem-To-Poem Translation}, 
      author={Triet Minh Huynh and Quan Le Bao},
      year={2024},
      eprint={2401.01078},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Acknowledgments

This project was inspired by the evaluation method from fsoft-ailab's SP-GPT2 Poem-Generator.

The dataset is also taken from their repo.

About

Generate Vietnamese poems with natural language prompts 📜🖋️

MIT License

Languages

Jupyter Notebook 81.5%, Python 18.5%