In this work, we describe the speaker diarization system based on EEND-vector clustering that we implemented for the ISCSLP 2022 CSSD Challenge. To obtain competitive results, we investigated several aspects, including:
(1) a method for simulating natural conversational speech data; (2) using a Conformer as the encoder to capture local features; (3) several loss function tricks; (4) model averaging; and (5) spectral clustering for clustering the embeddings. On the MagicData-RAMC dataset, our system achieved a CDER of 21.9 on the dev set and 24.5 on the test set.