datamllab / awsome-LLM-generated-text-detection

Awesome-LLM-Generated-Text-Detection-Materials

A curated, but probably biased and incomplete, list of LLM-generated text detection resources.

If you want to contribute to this list, feel free to open a pull request. You can also contact Ruixiang Tang from the Data Lab at Rice University by email: rt39@rice.edu.

What is LLM-generated Text Detection?

The emergence of large language models (LLMs) has resulted in LLM-generated text that is highly sophisticated and almost indistinguishable from text written by humans. However, this has also sparked concerns about the potential misuse of such text, such as spreading misinformation and disrupting the education system.

We group existing methods into two categories: black-box detection and white-box detection. Black-box detection methods are limited to API-level access to LLMs. They rely on collecting text samples from both human and machine sources to train a classification model that discriminates between LLM-generated and human-written text.
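
To make the black-box recipe concrete, here is a minimal sketch: collect labeled samples from both sources, then fit a classifier. It assumes a bag-of-words naive Bayes model written with the standard library only. The class name `NaiveBayesDetector`, the `tokenize` helper, and the four toy samples are all invented for illustration and are not taken from any paper in this list; real detectors typically use far more data and stronger models.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberately simple assumption)."""
    return text.lower().split()


class NaiveBayesDetector:
    """Multinomial naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def _log_score(self, text, c):
        # Log prior for the class plus smoothed log likelihood of each known word.
        score = math.log(self.doc_counts[c] / sum(self.doc_counts.values()))
        denom = sum(self.word_counts[c].values()) + len(self.vocab)
        for tok in tokenize(text):
            if tok in self.vocab:  # words never seen in training are skipped
                score += math.log((self.word_counts[c][tok] + 1) / denom)
        return score

    def predict(self, text):
        return max(self.classes, key=lambda c: self._log_score(text, c))


# Hypothetical toy corpus: in practice the machine samples would be
# collected by querying an LLM through its API.
human_texts = ["honestly i have no clue lol", "my cat knocked the mug off again"]
machine_texts = ["as an ai language model i cannot", "in conclusion it is important to note"]

detector = NaiveBayesDetector().fit(
    human_texts + machine_texts,
    ["human"] * len(human_texts) + ["machine"] * len(machine_texts),
)
print(detector.predict("it is important to note that"))
print(detector.predict("lol my cat again"))
```

The detector never touches the LLM's internals. It only needs text samples, which is why external parties can build this kind of detector.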

An alternative is white-box detection. In this scenario, the detector has full access to the LLM and can control the model's generation behavior for traceability purposes. In practice, black-box detectors are commonly built by external entities, whereas white-box detection is generally carried out by LLM developers.
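
One way to picture "controlling generation behavior for traceability" is a statistical watermark. The sketch below is an assumption chosen for illustration, not a method prescribed by this list: a toy generator only emits tokens from a pseudo-random "green" half of the vocabulary, seeded by the previous token, and the detector counts green tokens and computes a z-score. The vocabulary, the uniform-sampling stand-in for a model, and the names `green_list`, `generate`, and `z_score` are all hypothetical.

```python
import hashlib
import math
import random

VOCAB = [f"w{i}" for i in range(50)]  # hypothetical toy vocabulary
GAMMA = 0.5  # fraction of the vocabulary marked green at each step


def green_list(prev_token):
    """Pick the green subset of VOCAB, seeded by a hash of the previous token."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    return set(random.Random(seed).sample(VOCAB, int(GAMMA * len(VOCAB))))


def generate(n_tokens, watermark, rng):
    """Stand-in for an LLM: samples uniformly, optionally from green tokens only."""
    tokens = ["w0"]
    for _ in range(n_tokens):
        candidates = sorted(green_list(tokens[-1])) if watermark else VOCAB
        tokens.append(rng.choice(candidates))
    return tokens


def z_score(tokens):
    """How far the green-token count sits above what unmarked text would give."""
    n = len(tokens) - 1
    hits = sum(tok in green_list(prev) for prev, tok in zip(tokens, tokens[1:]))
    return (hits - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))


marked = generate(200, watermark=True, rng=random.Random(0))
plain = generate(200, watermark=False, rng=random.Random(0))
print(f"watermarked z = {z_score(marked):.2f}")
print(f"unmarked z    = {z_score(plain):.2f}")
```

A large z-score (say, above 4) suggests the text came from the watermarked generator. Only a party who controls sampling can embed such a signal, which matches the observation that white-box detection is generally carried out by LLM developers.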

Table of Contents

Review Paper

Black-Box Detection

White-Box Detection

About