om-ai-lab / awesome-RSVLM

Collection of Remote Sensing Vision-Language Models

Awesome Remote Sensing Vision-Language Models & Papers

Collection of Remote Sensing Vision-Language models and papers

To add your work to this repo, feel free to submit a request or contact me at zilun.zhang@zju.edu.cn

Paper List

  • EarthVQA: Towards Queryable Earth via Relational Reasoning-Based Remote Sensing Visual Question Answering (2023.12) [pdf]

    • Junjue Wang, Zhuo Zheng, Zihang Chen, Ailong Ma, and Yanfei Zhong
  • A Prior Instruction Representation Framework for Remote Sensing Image-text Retrieval (2023.10) [pdf]

    • Jiancheng Pan, Qing Ma, Cong Bai
  • A Fine-Grained Semantic Alignment Method Specific to Aggregate Multi-Scale Information for Cross-Modal Remote Sensing Image Retrieval (2023.10) [pdf]

    • Fuzhong Zheng, Xu Wang, Luyao Wang, Xiong Zhang, Hongze Zhu, Long Wang and Haisu Zhang
  • Multilanguage Transformer for Improved Text to Remote Sensing Image Retrieval (2023.10) [pdf]

    • Mohamad M. Al Rahhal, Yakoub Bazi, Norah A. Alsharif, Laila Bashmal, Naif Alajlan, Farid Melgani
  • A Fusion Encoder with Multi-Task Guidance for Cross-Modal Text–Image Retrieval in Remote Sensing (2023.09) [pdf]

    • Xiong Zhang, Weipeng Li, Xu Wang, Luyao Wang, Fuzhong Zheng, Long Wang and Haisu Zhang
  • Parameter-Efficient Transfer Learning for Remote Sensing Image-Text Retrieval (2023.09) [pdf]

    • Yuan Yuan, Yang Zhan, Zhitong Xiong
  • Hypersphere-based remote sensing cross-modal text–image retrieval via curriculum learning (2023.09) [pdf]

    • Weihang Zhang, Jihao Li, Shuoke Li, Jialiang Chen, Wenkai Zhang, Xin Gao, Xian Sun
  • RS5M: A Large Scale Vision-Language Dataset for Remote Sensing Vision-Language Foundation Model (2023.06) [pdf]

    • Zilun Zhang, Tiancheng Zhao, Yulong Guo, Jianwei Yin
  • RemoteCLIP: A Vision Language Foundation Model for Remote Sensing (2023.06) [pdf]

    • Fan Liu, Delong Chen, Zhangqingyun Guan, Xiaocong Zhou, Jiale Zhu, Jun Zhou
  • Reducing Semantic Confusion: Scene-aware Aggregation Network for Remote Sensing Cross-modal Retrieval (2023.06) [pdf]

    • Jiancheng Pan, Qing Ma, Cong Bai
  • Vision-Language Models in Remote Sensing: Current Progress and Future Trends (2023.05) [pdf]

    • Congcong Wen, Yuan Hu, Xiang Li, Zhenghang Yuan, Xiao Xiang Zhu
  • MCRN: A Multi-source Cross-modal Retrieval Network for remote sensing (2022.12) [pdf]

    • Zhiqiang Yuan, Wenkai Zhang, Changyuan Tian, Yongqiang Mao, Ruixue Zhou, Hongqi Wang, Kun Fu, Xian Sun
  • RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing Data (2022.10) [pdf]

    • Yang Zhan, Zhitong Xiong, Yuan Yuan
  • Learning to Evaluate Performance of Multi-modal Semantic Localization (2022.09) [pdf]

    • Zhiqiang Yuan, Wenkai Zhang, Chongyang Li, Zhaoying Pan, Yongqiang Mao, Jialiang Chen, Shouke Li, Hongqi Wang, Xian Sun
  • Knowledge-Aware Cross-Modal Text-Image Retrieval for Remote Sensing Images (2022.09) [pdf]

    • Li Mi, Siran Li, Christel Chappuis, Devis Tuia
  • CLIP-RS: A Cross-modal Remote Sensing Image Retrieval Based on CLIP, a Northern Virginia Case Study (2022.05) [pdf]

    • Larissa Djoufack Basso
  • Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval (2022.04) [pdf]

    • Zhiqiang Yuan, Wenkai Zhang, Kun Fu, Xuan Li, Chubo Deng, Hongqi Wang, Xian Sun
  • Remote Sensing Cross-Modal Text-Image Retrieval Based on Global and Local Information (2022.04) [pdf]

    • Zhiqiang Yuan, Wenkai Zhang, Changyuan Tian, Xuee Rong, Zhengyuan Zhang, Hongqi Wang, Kun Fu, Xian Sun
  • Fine tuning CLIP with Remote Sensing (Satellite) images and captions (2021.10) [pdf]

    • Arto, Dev Vidhani, Goutham, Mayank Bhaskar, Sujit Pal

About

License: MIT License