xyutao / fscil

Official repository for Few-Shot Class-Incremental Learning (FSCIL)

Some questions about the detailed implementation of your great work

ScutQi opened this issue

Thanks for your great work. I am interested in it and am attempting to implement it in PyTorch, but I have run into several problems along the way. I would appreciate it if you could answer my questions.
Q1: At session t=1, how do you initialize the centroid vector of each NG node? Do you use k-means or random initialization?
Q2: When calculating the anchor loss, you extract a subgraph of G(t). Is there any restriction on this subgraph? Since G(t) has many subgraphs, which one should be chosen to calculate the anchor loss?
Thank you very much!

Thanks for your attention to our work. Here are my answers:
A1: At the initial session, we randomly pick N feature vectors from the training feature vector set to initialize the NG nodes.
A2: The anchor loss is calculated on the subgraph whose NG nodes were learned at previous sessions. These nodes are assigned the labels of the old classes (i.e., c \in \bigcup_{i=1}^{t-1} L(i)) and should be stabilized to avoid forgetting.
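
For reference, a minimal PyTorch sketch of the random initialization described in A1 (this is not the authors' code; `features`, `labels`, and `num_nodes` are assumed placeholders):

```python
import torch

def init_ng_nodes(features: torch.Tensor, labels: torch.Tensor, num_nodes: int):
    """Randomly pick `num_nodes` feature vectors (and their labels) from the
    base-session training feature set to serve as the initial NG nodes.

    features: (N, D) feature vectors extracted by the backbone
    labels:   (N,)   class labels of those feature vectors
    """
    perm = torch.randperm(features.size(0))[:num_nodes]
    centroids = features[perm].clone()    # z_j: initial centroid vectors
    node_labels = labels[perm].clone()    # c_j: labels attached to the nodes
    return centroids, node_labels
```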

You may treat G(t) as a combination of two subgraphs G_o(t) and G_n(t), which store the NG nodes for the old and new classes, respectively.
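
To illustrate the split of G(t) into G_o(t) and G_n(t), here is a minimal sketch of stabilizing only the old-class subgraph. The simple squared-distance penalty below is a placeholder assumption, not the paper's exact anchor-loss formula:

```python
import torch

def anchor_loss(old_centroids_now: torch.Tensor,
                old_centroids_saved: torch.Tensor) -> torch.Tensor:
    """Penalize drift of the old-class NG nodes (subgraph G_o(t)).

    old_centroids_now:   current centroid vectors z_j of the nodes in G_o(t)
    old_centroids_saved: the same nodes' centroids stored at session t-1

    NOTE: a plain squared-distance stabilizer is used here for illustration;
    the actual anchor loss in the paper is defined differently.
    """
    return ((old_centroids_now - old_centroids_saved) ** 2).sum(dim=1).mean()

# G(t) as a combination of the two subgraphs:
# all_centroids = torch.cat([old_centroids, new_centroids], dim=0)
# Only the old-class centroids contribute to the anchor loss;
# the new-class nodes in G_n(t) are free to adapt to D(t).
```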

Thank you for answering my questions!
I have two other questions:
Q1: How do you fine-tune the pre-trained network? For example, if we use the base class set (60 classes) to train the Quicknet at session t=1, the number of neurons in the output layer is 60. At session t=2, when you use the new class set (5 classes) to fine-tune the network, do you add 5 new neurons directly to the output layer and then train the new network on the new set?
Q2: When session t>1, how do you update z_j and c_j of NG node v_j? As I understand it, if z_j and c_j are updated according to the same rule as in session t=1 (for every NG node, find the nearest f in F(t) and use f and f's label as the pseudo image and label), then the c_j of all NG nodes will become new-class labels when t>1, because D(t-1) is unseen at session t. Is my understanding right?
I would appreciate it if you could reply.

A1: Yes, we simply add 5 new neurons directly to the output layer and fine-tune the entire network with the new-class training set.
A2: At session t>1, we do not need to update the old NG nodes' z_j and c_j, since we aim to stabilize these nodes using the anchor loss. We only assign new NG nodes new z_j and c_j from D(t).
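
A minimal sketch of these two points, assuming a `torch.nn.Linear` classification head (names and the number of new nodes per session are illustrative, not from the paper):

```python
import torch
import torch.nn as nn

def expand_classifier(old_fc: nn.Linear, num_new_classes: int) -> nn.Linear:
    """Add `num_new_classes` output neurons to the classification layer,
    keeping the already-trained weights and biases for the old classes."""
    new_fc = nn.Linear(old_fc.in_features, old_fc.out_features + num_new_classes)
    with torch.no_grad():
        new_fc.weight[:old_fc.out_features] = old_fc.weight
        new_fc.bias[:old_fc.out_features] = old_fc.bias
    return new_fc

def add_new_ng_nodes(new_features: torch.Tensor, new_labels: torch.Tensor,
                     num_new_nodes: int):
    """At session t > 1, the old NG nodes keep their z_j and c_j (they are
    stabilized by the anchor loss); only the new nodes receive z_j and c_j,
    drawn here from the new-class features of D(t)."""
    perm = torch.randperm(new_features.size(0))[:num_new_nodes]
    return new_features[perm].clone(), new_labels[perm].clone()
```

After expanding the head, the entire network is fine-tuned on the new-class training set, as stated in A1.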

Thanks for your reply!