Explanation for NUM_ZERO, ORTHO and ORTHO_v2

Question

haofanwang opened this issue 3 months ago · comments

Frank (Haofan) Wang commented 3 months ago

As the title says, I cannot find any details about these inference tricks in the paper. In particular, you use different settings for the fidelity and extreme style modes.

Here is my understanding; I am not sure whether it is correct.
(1) For NUM_ZERO, you append some zero tokens so that a query is able to discard the ID information (perhaps to keep the background uncontaminated? But this only works implicitly).
(2) For ORTHO or ORTHO_v2, you compute the projection of id_hidden_states onto the hidden states, and then orthogonal = id_hidden_states - projection gives more disentangled ID information. Is this an experimental finding?
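
To make the two guesses above concrete, here is a minimal pure-Python sketch of how I read them. It is not the repository's code: the function names `attend` and `orthogonal_component`, the single-query setup, and the toy vectors are all my own assumptions, and the real implementation may project along a different dimension or scale things differently.

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend(query, keys, values, num_zero=0):
    """Single-query softmax attention over (keys, values).

    num_zero (hypothetical stand-in for NUM_ZERO) appends that many
    all-zero key/value tokens. Their logit is always 0, so a query whose
    logits against the ID tokens are negative shifts its weight onto them,
    and because their values are zero the ID contribution shrinks.
    """
    key_dim, value_dim = len(query), len(values[0])
    keys = keys + [[0.0] * key_dim for _ in range(num_zero)]
    values = values + [[0.0] * value_dim for _ in range(num_zero)]

    logits = [dot(query, key) for key in keys]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(logit - peak) for logit in logits]
    total = sum(weights)
    return [
        sum(w * v[j] for w, v in zip(weights, values)) / total
        for j in range(value_dim)
    ]


def orthogonal_component(id_hidden, hidden):
    """Remove from id_hidden its projection onto hidden.

    Assumes hidden is not the zero vector (otherwise this divides by zero).
    """
    scale = dot(id_hidden, hidden) / dot(hidden, hidden)
    return [i - scale * h for i, h in zip(id_hidden, hidden)]


# A query that "dislikes" the single ID token (logit = -4).
query = [-4.0, 0.0]
id_keys = [[1.0, 0.0]]
id_values = [[1.0, 1.0]]

without_zero = attend(query, id_keys, id_values)            # forced to take the ID value
with_zero = attend(query, id_keys, id_values, num_zero=4)   # can fall back on zero tokens

# What is left of the ID feature after removing the part parallel to hidden.
residual = orthogonal_component([3.0, 4.0], [1.0, 0.0])
```

Without zero tokens the softmax has only one token to attend to, so the query receives the full ID value no matter how poorly it matches; with zero tokens it can almost ignore it. In the second part, `residual` has no component along `hidden`, which is what I mean by "more disentangled".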