gururise / AlpacaDataCleaned

Alpaca dataset from Stanford, cleaned and curated

gururise/AlpacaDataCleaned Issues

Chinese sft data
Updated 5 months ago
How to format dataset fields in model prompt?
Closed 5 months ago, 1 comment
Where is the 9k cleaned alpaca data in the paper Alpagasus?
Updated a year ago, 2 comments
Is there a boost in performance for full fine-tuning versus LoRA?
Closed a year ago, 2 comments
The MNLI score in lm-evaluation-harness
Updated a year ago
Is the "alpaca_data_cleaned_archive.json" file having all cleaned data?
Closed a year ago, 2 comments
PIQA dataset's metric
Updated a year ago
Command to run the evaluation
Updated a year ago
Identify code snippet in "input" fields
Updated a year ago, 1 comment
Evaluation Metric
Updated a year ago, 8 comments
Contributing to the dataset curation with Argilla and the Alpaca Garbage collector
Closed a year ago, 2 comments
Diffs as data
Closed a year ago, 1 comment
80% of math outputs are wrong
Closed a year ago, 1 comment
Correct or potentially to be cleaned?
Closed a year ago, 6 comments
Separate instructions by functionality
Closed a year ago
Any chance we could improve the dataset beyond fixing?
Updated a year ago, 41 comments
good job
Closed a year ago, 1 comment
What about starting a crowdfunding campaign to collect money to run the examples against GPT-4?
Updated a year ago, 5 comments
How are you going about cleaning?
Updated a year ago, 4 comments
overall approach
Closed a year ago
Idea about better cleaning
Closed a year ago, 3 comments
Incorrect key string in alpaca_data_cleaned.json
Closed a year ago
Hosting your dataset on the Hugging Face Hub
Closed a year ago, 3 comments
ModuleNotFoundError: No module named 'utils'
Closed a year ago, 1 comment
Adding scripts for data cleaning
Closed a year ago, 3 comments