ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models

Home Page: http://ludwig.ai

Error running inference on Llama3 model

vinven7 opened this issue

When I run inference on a Llama3 model finetuned using Ludwig, I keep getting this error:

set_cols, feature, missing_value_strategy, computed_fill_value, backend)
   1756         logger.warning(
   1757             f"DROP_ROW missing value strategy applied. Dropped {len_before_dropped_rows - len_after_dropped_rows} "
   1758             f"samples out of {len_before_dropped_rows} from column {feature[COLUMN]}. The rows containing these "
   1759             f"samples will ultimately be dropped from the dataset."
   1760         )
   1761 else:
-> 1762     raise ValueError(f"Invalid missing value strategy {missing_value_strategy}")

ValueError: Invalid missing value strategy fill_with_const

Here is my training script:

import logging

import yaml
from ludwig.api import LudwigModel

qlora_fine_tuning_config = yaml.safe_load(
"""
  model_type: llm
  base_model:   meta-llama/Meta-Llama-3-8B-Instruct


  input_features:
    - name: Prompt
      type: text
      preprocessing:
           max_sequence_length: 256

  output_features:
    - name: Response
      type: text
      preprocessing:
           max_sequence_length: 150

  prompt:
    template: >-

      ### Prompt: {Prompt}

      ### responses : 
      
  
  quantization:
    bits: 4

  generation:
    temperature: 0.1
    max_new_tokens: 150
    
  preprocessing:
    split:
       probabilities:
        - 1.0
        - 0.0
        - 0.0
  
  adapter:
    type: lora

  trainer:
    type: finetune
    epochs: 10
    batch_size: 1
    eval_batch_size: 1
    enable_gradient_checkpointing: true
    gradient_accumulation_steps: 16
    learning_rate: 0.00001
    optimizer:
      type: paged_adam
      params:
        eps: 1.e-8
        betas:
          - 0.9
          - 0.999
        weight_decay: 0
    learning_rate_scheduler:
      warmup_fraction: 0.03
      reduce_on_plateau: 0
  """
  )

new_model = LudwigModel(config=qlora_fine_tuning_config, logging_level=logging.INFO)
results = new_model.train(dataset=train_df)

And for inference:

new_model.predict(_test_df.loc[0:1])

Here is the full trace:

     4 def predict(index):
----> 5         test_predictions = new_model.predict(_test_df.loc[index:index])[0]
      7         completion = oclient.chat.completions.create(
      8         model="gpt-3.5-turbo",
      9         temperature = 0.1,
   (...)
     42         ]
     43         )
     44         results = completion.choices[0].message.content

File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/api.py:1141, in LudwigModel.predict(self, dataset, data_format, split, batch_size, generation_config, skip_save_unprocessed_output, skip_save_predictions, output_directory, return_type, callbacks, **kwargs)
   1139 start_time = time.time()
   1140 logger.debug("Preprocessing")
-> 1141 dataset, _ = preprocess_for_prediction(  # TODO (Connor): Refactor to use self.config_obj
   1142     self.config_obj.to_dict(),
   1143     dataset=dataset,
   1144     training_set_metadata=self.training_set_metadata,
   1145     data_format=data_format,
   1146     split=split,
   1147     include_outputs=False,
   1148     backend=self.backend,
   1149     callbacks=self.callbacks + (callbacks or []),
   1150 )
   1152 logger.debug("Predicting")
   1153 with self.backend.create_predictor(self.model, batch_size=batch_size) as predictor:

File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/data/preprocessing.py:2334, in preprocess_for_prediction(config, dataset, training_set_metadata, data_format, split, include_outputs, backend, callbacks)
   2332         training_set, test_set, validation_set, training_set_metadata = processed
   2333 else:
-> 2334     processed = data_format_processor.preprocess_for_prediction(
   2335         config, dataset, features, preprocessing_params, training_set_metadata, backend, callbacks
   2336     )
   2337     dataset, training_set_metadata, new_hdf5_fp = processed
   2338     training_set_metadata = training_set_metadata.copy()

File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/data/preprocessing.py:276, in DataFramePreprocessor.preprocess_for_prediction(config, dataset, features, preprocessing_params, training_set_metadata, backend, callbacks)
    273 if isinstance(dataset, pd.DataFrame):
    274     dataset = backend.df_engine.from_pandas(dataset)
--> 276 dataset, training_set_metadata = build_dataset(
    277     config,
    278     dataset,
    279     features,
    280     preprocessing_params,
    281     mode="prediction",
    282     metadata=training_set_metadata,
    283     backend=backend,
    284     callbacks=callbacks,
    285 )
    286 return dataset, training_set_metadata, None

File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/data/preprocessing.py:1271, in build_dataset(config, dataset_df, features, global_preprocessing_parameters, mode, metadata, backend, random_seed, skip_save_processed_input, callbacks)
   1269 for feature_config in feature_configs:
   1270     preprocessing_parameters = feature_name_to_preprocessing_parameters[feature_config[NAME]]
-> 1271     handle_missing_values(dataset_cols, feature_config, preprocessing_parameters, backend)
   1273 # Happens after missing values are handled to avoid NaN casting issues.
   1274 logger.debug("cast columns")

File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/data/preprocessing.py:1703, in handle_missing_values(dataset_cols, feature, preprocessing_parameters, backend)
   1701 missing_value_strategy = preprocessing_parameters["missing_value_strategy"]
   1702 computed_fill_value = preprocessing_parameters.get("computed_fill_value")
-> 1703 _handle_missing_values(dataset_cols, feature, missing_value_strategy, computed_fill_value, backend)

File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/data/preprocessing.py:1762, in _handle_missing_values(dataset_cols, feature, missing_value_strategy, computed_fill_value, backend)
   1756         logger.warning(
   1757             f"DROP_ROW missing value strategy applied. Dropped {len_before_dropped_rows - len_after_dropped_rows} "
   1758             f"samples out of {len_before_dropped_rows} from column {feature[COLUMN]}. The rows containing these "
   1759             f"samples will ultimately be dropped from the dataset."
   1760         )
   1761 else:
-> 1762     raise ValueError(f"Invalid missing value strategy {missing_value_strategy}")

ValueError: Invalid missing value strategy fill_with_const
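
The last frames show that handle_missing_values reads both missing_value_strategy and computed_fill_value from the per-feature preprocessing parameters. One possibility, not confirmed in this thread, is that a feature reaches prediction with a fill_with_* strategy but no stored fill value. The sketch below is a diagnostic for that guess. It assumes the training set metadata is a dict keyed by feature name with a "preprocessing" sub-dict, which is an assumption about Ludwig's internal layout. The helper name features_missing_fill_value is hypothetical, and the example data is made up.

```python
from typing import Any, Dict, List


def features_missing_fill_value(metadata: Dict[str, Any]) -> List[str]:
    """List features that use a fill_with_* strategy but store no fill value.

    Assumes (not confirmed by the thread) that `metadata` maps feature names
    to dicts containing a "preprocessing" dict.
    """
    missing = []
    for name, feature_meta in metadata.items():
        # Skip entries that do not look like per-feature metadata.
        if not isinstance(feature_meta, dict):
            continue
        prep = feature_meta.get("preprocessing")
        if not isinstance(prep, dict):
            continue
        strategy = str(prep.get("missing_value_strategy", ""))
        if strategy.startswith("fill_with") and prep.get("computed_fill_value") is None:
            missing.append(name)
    return missing


# Toy metadata for illustration only; it is not taken from a real run.
example_metadata = {
    "Prompt": {
        "preprocessing": {
            "missing_value_strategy": "fill_with_const",
            "computed_fill_value": "<UNK>",
        }
    },
    "Response": {
        "preprocessing": {"missing_value_strategy": "fill_with_const"}
    },
    "some_scalar_entry": 1,
}

print(features_missing_fill_value(example_metadata))
```

With the model from the training script, the same helper could be pointed at new_model.training_set_metadata (the attribute that appears in the api.py frame of the trace) to see which features, if any, are flagged.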

-3.0

  • 0.10.3

I encountered the same error as reported above.

I didn't quite solve the problem, but I found a workaround:
since the model had completed fine-tuning,
I loaded it with Hugging Face and ran inference there (base model + fine-tuned PEFT adapter).
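
A minimal sketch of what that workaround might look like, using the transformers and peft libraries. The commenter did not share their code, so everything here is an assumption: the helper names (build_prompt, load_finetuned_model, generate) are hypothetical, the adapter directory is a placeholder for wherever Ludwig saved the adapter weights, and the prompt string only approximates the template from the training config. device_map="auto" additionally requires the accelerate package.

```python
def build_prompt(user_prompt: str) -> str:
    """Approximate the prompt template from the training config (an assumption)."""
    return f"### Prompt: {user_prompt}\n\n### responses : "


def load_finetuned_model(base_model_name: str, adapter_dir: str):
    """Load the base model and attach the fine-tuned LoRA adapter."""
    # Imported lazily so build_prompt works without these packages installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_name)
    base = AutoModelForCausalLM.from_pretrained(
        base_model_name, torch_dtype=torch.float16, device_map="auto"
    )
    # adapter_dir must contain the adapter config and weights saved after training.
    model = PeftModel.from_pretrained(base, adapter_dir)
    model.eval()
    return tokenizer, model


def generate(tokenizer, model, user_prompt: str, max_new_tokens: int = 150) -> str:
    """Generate a response and return only the newly generated text."""
    import torch

    inputs = tokenizer(build_prompt(user_prompt), return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=True, temperature=0.1
        )
    # Drop the prompt tokens so only the completion is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example usage (placeholder adapter path; not run here):
# tokenizer, model = load_finetuned_model(
#     "meta-llama/Meta-Llama-3-8B-Instruct", "path/to/ludwig/adapter"
# )
# print(generate(tokenizer, model, "Hello"))
```

This bypasses Ludwig's prediction preprocessing entirely, so the missing value handling that raises the ValueError is never invoked. The generation settings mirror the temperature and max_new_tokens values from the config above.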

Sounds great! Could you please explain how you fine-tuned the model: did you use Ludwig, or a different approach?
Thanks in advance.