patrickzib / SFA

Scalable Time Series Data Analytics

Use New Dataset

wwjd1234 opened this issue

I have my data in CSV format.
How do I get the code to read in the CSV files?

The only way I have been able to get it to work so far is the following:

Open the CBF file in Notepad, paste in my data, then save.
I did this for both the train and the test file.

Hi,

  1. First, create a new dataset folder at main/resources/datasets/univariate/NEW_DS, where NEW_DS is any name you like.

  2. Next, add two files to the new folder, named NEW_DS_TEST and NEW_DS_TRAIN.

  3. You can then add the NEW_DS string to an existing test such as UCRClassificationTest, in public static String[] datasets = new String[]{ ...}

  4. Alternatively, you could create a new test that just loads your new dataset and applies the classifier you intend to use. Something similar to:

  @Test
  public void testSingleDataset() throws IOException {

    // the relative path to the datasets
    ClassLoader classLoader = SFAWordsTest.class.getClassLoader();

    File dir = new File(classLoader.getResource("datasets/univariate/").getFile());
    String dataset = "CBF"; // replace with the name of your dataset folder, e.g. "NEW_DS"

    for (File train : new File(dir.getAbsolutePath() + "/" + dataset).listFiles()) {
      if (train.getName().toUpperCase().endsWith("TRAIN")) {
        // Rename only the file name, so a "TRAIN" elsewhere in the path is left alone
        File test = new File(train.getParentFile(), train.getName().replaceFirst("TRAIN", "TEST"));

        // Load the train/test splits
        TimeSeries[] testSamples = TimeSeriesLoader.loadDataset(test);
        TimeSeries[] trainSamples = TimeSeriesLoader.loadDataset(train);

        // The WEASEL-classifier
        Classifier w = new WEASELClassifier();
        Classifier.Score scoreW = w.eval(trainSamples, testSamples);
        System.out.println(dataset + ";" + scoreW.toString());
      }
    }
  }
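
Steps 1 and 2 can also be done in code rather than by pasting into the CBF files. The following is only a minimal sketch. DatasetInstaller and its install method are hypothetical helpers, not part of SFA. It assumes your CSV rows already have the same layout as the CBF files (which the copy-and-paste workaround suggests), and it does not check whether the loader accepts your delimiter.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper (not part of SFA): copies two CSV files into the dataset layout. */
final class DatasetInstaller {

  private DatasetInstaller() {}

  /**
   * Creates univariateDir/name/ and copies the two CSV files into it
   * as name_TRAIN and name_TEST. Returns the dataset folder.
   */
  static Path install(Path univariateDir, String name, Path trainCsv, Path testCsv)
      throws IOException {
    Path datasetDir = univariateDir.resolve(name);
    Files.createDirectories(datasetDir); // no error if the folder already exists

    // The file contents are copied unchanged; only the file names follow the convention
    Files.copy(trainCsv, datasetDir.resolve(name + "_TRAIN"), StandardCopyOption.REPLACE_EXISTING);
    Files.copy(testCsv, datasetDir.resolve(name + "_TEST"), StandardCopyOption.REPLACE_EXISTING);
    return datasetDir;
  }
}
```

For example, DatasetInstaller.install(Paths.get("main/resources/datasets/univariate"), "NEW_DS", Paths.get("my_train.csv"), Paths.get("my_test.csv")), where the two CSV file names are placeholders for your own files. Because the test above looks the folder up through the class loader, you may need to rebuild the project so the new files end up on the classpath.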