[Errno 36] File name too long exception when running predict

Question

karamata opened this issue 4 months ago · comments

I'm running into this problem and don't know how to work around it. Can anyone help, please?

...
OSError                                   Traceback (most recent call last)
<ipython-input-4-fc68552c3e50> in <cell line: 145>()
    143 )
    144 
--> 145 predictions, prediction_results = model.predict(dataset=eval_df, skip_save_predictions=False, output_directory="predictions")
    146 
    147 predictions.head()

9 frames
/usr/local/lib/python3.10/dist-packages/fsspec/implementations/local.py in _open(self)
    318         if self.f is None or self.f.closed:
    319             if self.autocommit or "w" not in self.mode:
--> 320                 self.f = open(self.path, mode=self.mode)
    321                 if self.compression:
    322                     compress = compr[self.compression]

OSError: [Errno 36] File name too long: '/content/predictions/product_name_probabilities_Ống_thoát_nước__xifong__dùng_cho_chậu_rửa_trong_nhà_bếp___nắp_ống_bằng_inox__thân_ống_và_đường_ống_bằng_nhựa__lõi_ống_dẫn_bằng_thép__bộ_gồm_nắp_đậy__ống_dẫn__khớp_nối__hàng_mới_100.csv'

Arnav Garg · Answer 1 · Sat Feb 10 2024 06:51:34 GMT+0800 (China Standard Time)

Hi @karamata! Are you able to share the code you're using to generate predictions? It looks like product_name_probabilities_Ống_thoát_nước__xifong__dùng_cho_chậu_rửa_trong_nhà_bếp___nắp_ống_bằng_inox__thân_ống_và_đường_ống_bằng_nhựa__lõi_ống_dẫn_bằng_thép__bộ_gồm_nắp_đậy__ống_dẫn__khớp_nối__hàng_mới_100.csv is too long to be used as a file name. Since this is an Errno 36, the error comes from the host machine's filesystem rather than from Ludwig itself.
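To make the diagnosis above concrete, here is a minimal sketch of how one might check a file name against the filesystem limit. It assumes the common Linux limit of 255 bytes per path component (the exact limit on the Colab host is not stated in this thread), and the helper names `filename_bytes` and `is_too_long` are made up for illustration. The point it shows is that the limit is counted in bytes, not characters, so accented Vietnamese characters, which take two or three bytes each in UTF-8, use up the budget faster than the visible length suggests.

```python
import os

# Assumption: 255 bytes per path component, the usual limit on Linux
# filesystems such as ext4. The limit on a given host may differ.
ASSUMED_NAME_MAX = 255


def filename_bytes(path: str) -> int:
    """Return the UTF-8 byte length of the last component of `path`."""
    return len(os.path.basename(path).encode("utf-8"))


def is_too_long(path: str, limit: int = ASSUMED_NAME_MAX) -> bool:
    """True if the file name alone exceeds `limit` bytes."""
    return filename_bytes(path) > limit


if __name__ == "__main__":
    # A made-up name: 100 copies of a 3-byte character plus ".csv".
    name = "/content/predictions/" + "\u1ed1" * 100 + ".csv"
    print(len(os.path.basename(name)), "characters")
    print(filename_bytes(name), "bytes")
    print("too long:", is_too_long(name))
```

Passing the path from the error message to `is_too_long` would show whether it is the byte length, rather than the character count, that crosses the limit.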

Tran Phu · Answer 2 · Sun Feb 11 2024 21:49:35 GMT+0800 (China Standard Time)

Hi @arnavgarg1, please see my code below. I'm running it on Colab.

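import logging

import numpy as np
from ludwig.api import LudwigModel
from sklearn.model_selection import train_test_split

# `dataset` is a pandas DataFrame loaded earlier (not shown in this thread).

config = {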
  "input_features": [
    {
      "name": "col1",
      "type": "category",
      "preprocessing": {
        "missing_value_strategy": "drop_row"
      },
    },
    {
      "name": "col2",
      "type": "category",
      "preprocessing": {
        "missing_value_strategy": "drop_row"
      },
    },
    {
      "name": "col3",
      "type": "category",
      "preprocessing": {
        "missing_value_strategy": "drop_row"
      },
    },
    {
      "name": "col4",
      "type": "category",
      "preprocessing": {
        "missing_value_strategy": "drop_row"
      },
    },
    {
      "name": "col5",
      "type": "category",
      "preprocessing": {
        "missing_value_strategy": "drop_row"
      },
    },
    {
      "name": "col6",
      "type": "category",
      "preprocessing": {
        "missing_value_strategy": "fill_with_const",
        "fill_value": "",
      },
    },
    {
      "name": "col7",
      "type": "category",
      "preprocessing": {
        "missing_value_strategy": "fill_with_const",
        "fill_value": "",
      },
    },
    {
      "name": "col8",
      "type": "category",
      "preprocessing": {
        "missing_value_strategy": "fill_with_const",
        "fill_value": "",
      },
    },
    {
      "name": "col9",
      "type": "sequence",
      "preprocessing": {
        "encoder": {
          "type": "embed",
          "reduce_output": None,
        },
      },
    },
  ],
  "output_features": [
    {
      "name": "product_name",
      "type": "category",
      "preprocessing": {
        "missing_value_strategy": "drop_row"
      },
    },
    {
      "name": "product_type",
      "type": "category",
      "preprocessing": {
        "missing_value_strategy": "fill_with_const",
        "fill_value": "",
      },
    },
    {
      "name": "output1",
      "type": "category",
      "preprocessing": {
        "missing_value_strategy": "fill_with_const",
        "fill_value": "",
      },
    },
    {
      "name": "output2",
      "type": "category",
      "preprocessing": {
        "missing_value_strategy": "fill_with_const",
        "fill_value": "",
      },
    },
  ],
  "trainer": {
    "epochs": 2,
    "batch_size": 512,
    "max_batch_size": 512,
    "learning_rate": 0.01,
    "use_mixed_precision": True,
  }
}

dataset = dataset.sample(frac=1).reset_index()

train_df, test_df = train_test_split(dataset, train_size=0.9)

model = LudwigModel(config=config, logging_level=logging.INFO)

train_stats, preprocessed_data, output_directory = model.train(training_set=train_df, test_set=test_df)

np.random.seed(13)
eval_df = test_df.sample(n=1000)

test_stats, predictions, output_directory = model.evaluate(
  eval_df,
  collect_predictions=True,
  collect_overall_stats=True,
  # skip_save_eval_stats=True,
  # skip_save_predictions=True,
  # output_directory="test_results",
  # return_type="dict"
)

predictions, prediction_results = model.predict(dataset=eval_df)

predictions.head()
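The thread does not record a confirmed fix. Judging only from the path in the error message, the file name appears to be built from the output feature name (`product_name`) plus one class label, so long product names lead to long file names. One possible workaround, offered as a sketch rather than a Ludwig-endorsed approach, is to shorten the labels before training. The helper `shorten_label` and its 64-byte budget are hypothetical choices for illustration.

```python
import hashlib


def shorten_label(label: str, max_bytes: int = 64) -> str:
    """Return `label` cut to at most `max_bytes` UTF-8 bytes.

    Labels that already fit are returned unchanged. Longer labels keep a
    readable prefix and get a short hash suffix so that different long
    labels sharing a prefix are unlikely to collide.
    """
    if max_bytes < 16:
        raise ValueError("max_bytes must be at least 16")
    encoded = label.encode("utf-8")
    if len(encoded) <= max_bytes:
        return label
    digest = hashlib.sha1(encoded).hexdigest()[:8]
    keep = max_bytes - len(digest) - 1  # leave room for "_" + digest
    # errors="ignore" drops a multi-byte character split by the byte slice.
    prefix = encoded[:keep].decode("utf-8", errors="ignore")
    return f"{prefix}_{digest}"


if __name__ == "__main__":
    long_label = "\u1ed1ng_tho\u00e1t_n\u01b0\u1edbc_" * 20
    short = shorten_label(long_label)
    print(short, len(short.encode("utf-8")), "bytes")
```

With the code above, this could be applied to the whole DataFrame before the train/test split, for example `dataset["product_name"] = dataset["product_name"].map(shorten_label)`, while keeping a lookup from short labels back to the full names. Alternatively, since the traceback shows `skip_save_predictions=False` being passed to `model.predict`, leaving that argument at `True` would presumably avoid writing the per-class CSV files at all, though this is not verified in the thread.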