Error running inference on Llama3 model
vinven7 opened this issue · comments
Vineeth Venugopal commented
When I run inference on a Llama3 model finetuned using Ludwig, I keep getting this error:
_handle_missing_values(dataset_cols, feature, missing_value_strategy, computed_fill_value, backend)
1756 logger.warning(
1757 f"DROP_ROW missing value strategy applied. Dropped {len_before_dropped_rows - len_after_dropped_rows} "
1758 f"samples out of {len_before_dropped_rows} from column {feature[COLUMN]}. The rows containing these "
1759 f"samples will ultimately be dropped from the dataset."
1760 )
1761 else:
-> 1762 raise ValueError(f"Invalid missing value strategy {missing_value_strategy}")
ValueError: Invalid missing value strategy fill_with_const
Here is my training script:
import logging
import yaml
from ludwig.api import LudwigModel

qlora_fine_tuning_config = yaml.safe_load(
    """
    model_type: llm
    base_model: meta-llama/Meta-Llama-3-8B-Instruct
    input_features:
      - name: Prompt
        type: text
        preprocessing:
          max_sequence_length: 256
    output_features:
      - name: Response
        type: text
        preprocessing:
          max_sequence_length: 150
    prompt:
      template: >-
        ### Prompt: {Prompt}
        ### responses :
    quantization:
      bits: 4
    generation:
      temperature: 0.1
      max_new_tokens: 150
    preprocessing:
      split:
        probabilities:
          - 1.0
          - 0.0
          - 0.0
    adapter:
      type: lora
    trainer:
      type: finetune
      epochs: 10
      batch_size: 1
      eval_batch_size: 1
      enable_gradient_checkpointing: true
      gradient_accumulation_steps: 16
      learning_rate: 0.00001
      optimizer:
        type: paged_adam
        params:
          eps: 1.e-8
          betas:
            - 0.9
            - 0.999
          weight_decay: 0
      learning_rate_scheduler:
        warmup_fraction: 0.03
        reduce_on_plateau: 0
    """
)
new_model = LudwigModel(config=qlora_fine_tuning_config, logging_level=logging.INFO)
results = new_model.train(dataset=train_df)
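For reference, train_df is a pandas DataFrame whose column names match the feature names in the config. A minimal, purely illustrative stand-in (the rows below are made up):

import pandas as pd

# Hypothetical training data: the "Prompt" and "Response" columns must match the
# input/output feature names declared in the config above.
train_df = pd.DataFrame(
    {
        "Prompt": ["What is the boiling point of water at sea level?"],
        "Response": ["Water boils at 100 degrees Celsius at sea level."],
    }
)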
And for inference:
new_model.predict(_test_df.loc[0:1])
Here is the full trace:
4 def predict(index):
----> 5 test_predictions = new_model.predict(_test_df.loc[index:index])[0]
7 completion = oclient.chat.completions.create(
8 model="gpt-3.5-turbo",
9 temperature = 0.1,
(...)
42 ]
43 )
44 results = completion.choices[0].message.content
File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/api.py:1141, in LudwigModel.predict(self, dataset, data_format, split, batch_size, generation_config, skip_save_unprocessed_output, skip_save_predictions, output_directory, return_type, callbacks, **kwargs)
1139 start_time = time.time()
1140 logger.debug("Preprocessing")
-> 1141 dataset, _ = preprocess_for_prediction( # TODO (Connor): Refactor to use self.config_obj
1142 self.config_obj.to_dict(),
1143 dataset=dataset,
1144 training_set_metadata=self.training_set_metadata,
1145 data_format=data_format,
1146 split=split,
1147 include_outputs=False,
1148 backend=self.backend,
1149 callbacks=self.callbacks + (callbacks or []),
1150 )
1152 logger.debug("Predicting")
1153 with self.backend.create_predictor(self.model, batch_size=batch_size) as predictor:
File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/data/preprocessing.py:2334, in preprocess_for_prediction(config, dataset, training_set_metadata, data_format, split, include_outputs, backend, callbacks)
2332 training_set, test_set, validation_set, training_set_metadata = processed
2333 else:
-> 2334 processed = data_format_processor.preprocess_for_prediction(
2335 config, dataset, features, preprocessing_params, training_set_metadata, backend, callbacks
2336 )
2337 dataset, training_set_metadata, new_hdf5_fp = processed
2338 training_set_metadata = training_set_metadata.copy()
File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/data/preprocessing.py:276, in DataFramePreprocessor.preprocess_for_prediction(config, dataset, features, preprocessing_params, training_set_metadata, backend, callbacks)
273 if isinstance(dataset, pd.DataFrame):
274 dataset = backend.df_engine.from_pandas(dataset)
--> 276 dataset, training_set_metadata = build_dataset(
277 config,
278 dataset,
279 features,
280 preprocessing_params,
281 mode="prediction",
282 metadata=training_set_metadata,
283 backend=backend,
284 callbacks=callbacks,
285 )
286 return dataset, training_set_metadata, None
File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/data/preprocessing.py:1271, in build_dataset(config, dataset_df, features, global_preprocessing_parameters, mode, metadata, backend, random_seed, skip_save_processed_input, callbacks)
1269 for feature_config in feature_configs:
1270 preprocessing_parameters = feature_name_to_preprocessing_parameters[feature_config[NAME]]
-> 1271 handle_missing_values(dataset_cols, feature_config, preprocessing_parameters, backend)
1273 # Happens after missing values are handled to avoid NaN casting issues.
1274 logger.debug("cast columns")
File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/data/preprocessing.py:1703, in handle_missing_values(dataset_cols, feature, preprocessing_parameters, backend)
1701 missing_value_strategy = preprocessing_parameters["missing_value_strategy"]
1702 computed_fill_value = preprocessing_parameters.get("computed_fill_value")
-> 1703 _handle_missing_values(dataset_cols, feature, missing_value_strategy, computed_fill_value, backend)
File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/data/preprocessing.py:1762, in _handle_missing_values(dataset_cols, feature, missing_value_strategy, computed_fill_value, backend)
1756 logger.warning(
1757 f"DROP_ROW missing value strategy applied. Dropped {len_before_dropped_rows - len_after_dropped_rows} "
1758 f"samples out of {len_before_dropped_rows} from column {feature[COLUMN]}. The rows containing these "
1759 f"samples will ultimately be dropped from the dataset."
1760 )
1761 else:
-> 1762 raise ValueError(f"Invalid missing value strategy {missing_value_strategy}")
ValueError: Invalid missing value strategy fill_with_const
- 3.0
- 0.10.3
aliarabat commented
I encountered the same error as reported above
Vineeth Venugopal commented
I didn't quite solve the problem, but I found a workaround:
Since fine-tuning had already completed, I loaded the model with Hugging Face (base model + fine-tuned PEFT adapter) and ran inference on that instead.
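A minimal sketch of that workaround using transformers + peft (assuming the LoRA adapter weights from the Ludwig run are available locally or on the Hub; the adapter path below is a placeholder, and the prompt is rebuilt to mirror the training template):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "meta-llama/Meta-Llama-3-8B-Instruct"
# Placeholder: point this at the LoRA adapter produced by the Ludwig run
# (or at the Hugging Face Hub repo it was uploaded to).
ADAPTER_PATH = "path/to/lora_adapter"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, ADAPTER_PATH)
model.eval()

# Rebuild the same prompt template used during fine-tuning.
prompt = f"### Prompt: {_test_df.loc[0, 'Prompt']} ### responses :"
inputs = tokenizer(prompt, return_tensors="pt").to(base.device)
with torch.no_grad():
    output_ids = model.generate(
        **inputs, max_new_tokens=150, do_sample=True, temperature=0.1
    )
# Decode only the newly generated tokens.
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))

This sidesteps Ludwig's preprocessing path (where the ValueError above is raised), since generation goes directly through transformers.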
aliarabat commented
Sounds great! Could you please explain how you fine-tuned the model using Ludwig, or did you use a different approach?
Thanks in advance!