ankitapasad / layerwise-analysis

Layer-wise analysis of self-supervised pre-trained speech representations

CCA analysis of custom model

sausage-333 opened this issue

Hello, I read your work recently and thought it was really great.

I would like to run the CCA evaluations (both word and phoneme) on my custom model.
Is that supported?

Best regards,
Kangwook

Hi Kangwook,

Appreciate your interest in our work.

To use a custom model, you ideally need a module that extracts its representations. Unfortunately, that option is not directly available, but there are two ways to do this:

Option 1: Have the same directory structure

Using your implementation, extract representations from the custom model for the sample utterance IDs saved in step 2b.

1a: Modify the save location:

Save the representations as layer_*.npy following the same directory structure. Example: $save_dir_pth/hubert_small/librispeech_dev-clean_sample1/contextualized/word_level/0/layer_*.npy

1b: Or, modify the directory name in the evaluation code:

Both in the code and in the script.
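For option 1a, a minimal sketch of writing the per-layer files could look like the following. It is not taken from the repository: the helper name `save_layer_representations`, the model directory name `my_custom_model`, the 0-based layer numbering, and the array layout (one 2-D array per layer, rows as items and columns as feature dimensions) are all assumptions, so check them against what the evaluation code actually reads.

```python
from pathlib import Path

import numpy as np


def save_layer_representations(layer_reps, out_dir):
    """Write one layer_<i>.npy file per layer into out_dir.

    layer_reps is a sequence with one 2-D array per layer. The layout
    (rows = items, columns = feature dimensions) and the 0-based layer
    index in the file name are assumptions, not confirmed by the repo.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for layer_idx, reps in enumerate(layer_reps):
        path = out_dir / f"layer_{layer_idx}.npy"
        np.save(path, np.asarray(reps))
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Hypothetical stand-in for real model outputs: 3 layers, 5 items, 8 dims.
    rng = np.random.default_rng(0)
    fake_reps = [rng.standard_normal((5, 8)) for _ in range(3)]

    # Mirrors the example path above; "save_dir" stands in for $save_dir_pth
    # and "my_custom_model" is a placeholder for the custom model's name.
    target = (
        Path("save_dir")
        / "my_custom_model"
        / "librispeech_dev-clean_sample1"
        / "contextualized"
        / "word_level"
        / "0"
    )
    for p in save_layer_representations(fake_reps, target):
        print(p)
```

The arrays passed in should be the representations extracted for the sample utterance IDs from step 2b, so that they line up with the rest of the pipeline.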

Option 2: Add the model to the codebase

Add the relevant modules to the ModelLoader, DataLoader, and FeatExtractor classes in model_utils.py.

I hope this helps! I would appreciate it if you could share which route you chose, and whether you would like to open a PR for option 2.