CONO

Unsupervised cross-domain image retrieval based on noisy pseudo-labels

Authors: Xiaoyun Ren, Xingbo Zhao, Qianwen Lu, Qingchuan Tao, and Yongxiang Li

Abstract

With the explosive growth of various image types, large-scale cross-domain image retrieval has garnered increasing attention. This paper tackles a relatively underexplored problem in this domain: fully unsupervised cross-domain image retrieval (FUCIR), which operates without category annotations or correspondence relationships between domains. Existing single-domain unsupervised image retrieval methods often generate pseudo-labels based on intrinsic relationships within the data to guide the learning process. However, this approach is not directly applicable to cross-domain learning, where the inevitable noise in pseudo-labels not only causes significant overfitting but also leads to unreliable correspondences. These issues severely limit the effectiveness of pseudo-label-based methods in FUCIR and have yet to be fully addressed. To overcome these challenges, we introduce CONO, a novel method that learns cross-domain representations using Self-supervision Pseudo-labels Annotation (SPA) and Robust Representation Learning (RRL). CONO employs SPA to produce high-quality pseudo-labels that capture the intrinsic relationships within the data, while RRL extracts discriminative and domain-invariant representations from these imperfect pseudo-labels. We demonstrate the effectiveness of our method through extensive experiments on three benchmark datasets, comparing it with five state-of-the-art methods.

Framework

Requirements

pip install -r requirements.txt

Datasets

The OfficeHome dataset can be downloaded from https://www.hemanthdv.org/officeHomeDataset.html.

The Office31 and image_CLEF datasets can be downloaded from https://github.com/jindongwang/transferlearning/tree/master/data.

The datasets should be organized in the following directory structure:

  --officehome
       --Art
       --Clipart
       --Real_World
       --Product
  --office31
       --amazon
       --dslr
       --webcam
       --...
  --...
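Before training, it can help to confirm that the folders are where the scripts expect them. The following is a minimal sketch, not part of the repository: the function name `missing_domains` and the idea of a single data root are assumptions for illustration, and only the dataset and domain names listed above are checked.

```python
from pathlib import Path

# Domain folders taken from the layout above. Any dataset not spelled out
# there (for example image_CLEF) is deliberately left out of this check.
EXPECTED_DOMAINS = {
    "officehome": ["Art", "Clipart", "Real_World", "Product"],
    "office31": ["amazon", "dslr", "webcam"],
}


def missing_domains(data_root):
    """Return 'dataset/domain' entries that are not directories under data_root.

    `data_root` is a hypothetical path to the folder that contains
    `officehome` and `office31`; the repository may use a different location.
    """
    root = Path(data_root)
    missing = []
    for dataset, domains in EXPECTED_DOMAINS.items():
        for domain in domains:
            if not (root / dataset / domain).is_dir():
                missing.append(f"{dataset}/{domain}")
    return missing
```

An empty list means every listed domain folder was found under the given root.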

Quick Training

Warm-up to obtain pseudo-labels

python train_setup1.py

At the end of the SPA step, noisy pseudo-labels and the corresponding image features of the dataset are obtained. RRL is then trained on this noisy data. (You need to enter UCRP_TRIAN2 to properly start python train_setup2.py.)

Training with noisy pseudo-labels

python train_setup2.py

Comparison with the State-of-the-Art Methods

We evaluated CONO and other methods on unsupervised cross-domain image retrieval across three datasets. The detailed comparison results for the three datasets are shown below.

Ablation Study

The ablation study verifies the contribution of each component. The Lcon and Lr terms in SPA and RRL are ablated to assess each module. Here we report the results on the representative image_CLEF dataset, as shown in the table.

We also compared three mainstream clustering methods and found K-means to be the most advantageous. The results of the three clustering methods on the office31 dataset are shown below.
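To illustrate how clustering can turn unlabeled features into pseudo-labels, here is a dependency-free sketch of plain K-means. It is not the SPA implementation from this repository: the function name `kmeans_pseudo_labels`, the random initialization, and the fixed iteration count are assumptions, and in practice the feature vectors would come from the trained network rather than hand-written lists.

```python
import random


def kmeans_pseudo_labels(features, k, iters=20, seed=0):
    """Cluster feature vectors with plain K-means; return one cluster id per vector.

    The cluster ids act as pseudo-labels. They are noisy by nature, because
    nothing guarantees that a cluster matches a true category.
    """
    rng = random.Random(seed)
    # Initialize centers from k distinct samples (an illustrative choice).
    centers = [list(c) for c in rng.sample(features, k)]
    labels = [0] * len(features)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(x, centers[j])),
            )
            for x in features
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [x for x, lab in zip(features, labels) if lab == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

The returned ids are arbitrary: two runs may number the same clusters differently, so only the grouping is meaningful, not the id values themselves.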

Example of Retrievals

The table compares the retrieval results of CONO and CoDA on examples from the OfficeHome dataset. This comparison aims to underscore the reliability and superiority of our approach in cross-domain image retrieval. For instance, consider a scenario where an image of a pen from the Art domain is used as the query to search for similar images.
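The retrieval step in such a scenario can be sketched as ranking gallery features from another domain by their similarity to the query feature. This is a minimal illustration under stated assumptions: cosine similarity is assumed as the metric (the text above does not name one), and `cosine` and `retrieve` are hypothetical helper names, not functions from this repository.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(query, gallery, top_k=3):
    """Return indices of the top_k gallery vectors most similar to the query."""
    ranked = sorted(
        range(len(gallery)),
        key=lambda i: cosine(query, gallery[i]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy example: the query stands in for a feature from one domain,
# the gallery for features from another domain.
print(retrieve([1.0, 0.0], [[0.0, 1.0], [1.0, 0.1], [1.0, 1.0]], top_k=2))  # [1, 2]
```

With domain-invariant features, images of the same category from different domains should land near the top of this ranking.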
