data_cleaning_exp

History Update Problem

Experiment I: OpenRefine Data Cleaning Process

Generate six versions of data cleaning processes.

Experiment II:

Pipeline:

data input
prompt types:
[1]. zero_shot: data cleaning objectives, requirements
[2]. example_based: data cleaning objectives, example repairs, requirements
[3]. example_sample: data cleaning objectives, example repairs, sample rows, requirements
[4]. profile_example_sample: data cleaning objectives, example repairs, sample rows, profiling results, requirements
For each type of prompt, log the LLM's responses:
-- zero_shot
-- example_based
-- example_sample
-- profile_example_sample
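
The four prompt types differ only in which components they include, so they can be assembled from one table. The following is a minimal sketch of that idea, not the repository's actual code: the names `PROMPT_COMPONENTS`, `build_prompt`, and `log_response`, the component keys, the section header format, and the one-JSONL-file-per-prompt-type log layout are all assumptions made for illustration.

```python
import json
from pathlib import Path

# Components per prompt type, in the order listed in the pipeline.
# The key names are illustrative, not taken from the repository.
PROMPT_COMPONENTS = {
    "zero_shot": ["objectives", "requirements"],
    "example_based": ["objectives", "example_repairs", "requirements"],
    "example_sample": [
        "objectives", "example_repairs", "sample_rows", "requirements",
    ],
    "profile_example_sample": [
        "objectives", "example_repairs", "sample_rows",
        "profiling_results", "requirements",
    ],
}


def build_prompt(prompt_type, parts):
    """Join the components that belong to `prompt_type` into one prompt."""
    sections = []
    for name in PROMPT_COMPONENTS[prompt_type]:
        # Each component becomes a labeled section (assumed format).
        sections.append(f"[{name}]\n{parts[name]}")
    return "\n\n".join(sections)


def log_response(log_dir, prompt_type, prompt, response):
    """Append one prompt/response pair to <log_dir>/<prompt_type>.jsonl."""
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    path = log_dir / f"{prompt_type}.jsonl"
    record = {"prompt_type": prompt_type, "prompt": prompt, "response": response}
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return path
```

In use, one would loop over `PROMPT_COMPONENTS`, send each built prompt to whichever LLM client the experiment uses, and pass the returned text to `log_response`. Keeping one log file per prompt type makes the responses easy to compare side by side.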

Check the Python scripts in the LLM's responses.
Question: How does the quality of the response reflect the quality of the prompt?
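
One possible first step for checking the scripts is to pull the Python code out of each logged response and confirm that it at least parses. The sketch below assumes the responses wrap code in Markdown-style fenced blocks; `extract_python_blocks` and `check_syntax` are hypothetical helper names, and a syntax check is only a lower bound on quality, not a test that the cleaning is correct.

```python
import ast
import re

# Three backticks, built at runtime so this snippet can itself be fenced.
FENCE = "`" * 3

# Matches fenced blocks tagged python/py or left untagged (an assumption
# about how the LLM formats its answers).
FENCE_RE = re.compile(FENCE + r"(?:python|py)?[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_python_blocks(response):
    """Return the code inside each fenced block of an LLM response."""
    return [block.strip() for block in FENCE_RE.findall(response)]


def check_syntax(code):
    """Return (True, "") if `code` parses as Python, else (False, reason)."""
    try:
        ast.parse(code)  # parses only; nothing is imported or executed
    except SyntaxError as exc:
        return False, f"line {exc.lineno}: {exc.msg}"
    return True, ""
```

Because `ast.parse` never runs the script, this check is safe to apply to untrusted LLM output. Judging whether a script actually meets the data cleaning objectives would still require running it on the input data and comparing the result against the expected repairs.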
