2024 CVPR Accepted Paper List
A repository that organizes the list of papers accepted to CVPR 2024.
I plan to keep it updated, but currently only the title and authors of each paper are listed.
I've organized it into README.md, .xlsx, and .csv files, so feel free to use whichever format you're comfortable with.
I'll upload the code when it's complete, and issues and contributions are always welcome.
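Until the official code is uploaded, here is a minimal sketch of how the Authors column could be split into names and affiliations. It is not the repository's own code. It assumes each author is written as `Name (Affiliation)`, that authors are separated by `·`, and that `()` or `(None)` means no affiliation is recorded, which matches the rows in the table. The `load_papers` helper and the `Title` and `Authors` CSV header names are hypothetical; adjust them to the actual .csv file.

```python
import csv
from typing import Dict, List, Optional, Tuple

Author = Tuple[str, Optional[str]]


def parse_authors(cell: str) -> List[Author]:
    """Split an Authors cell into (name, affiliation) pairs."""
    authors: List[Author] = []
    for entry in cell.split("·"):
        entry = entry.strip()
        if not entry:
            continue
        # Split at the first " (" so nested parentheses stay in the affiliation.
        name, sep, rest = entry.partition(" (")
        affiliation = rest[:-1] if sep and rest.endswith(")") else rest
        affiliation = affiliation.strip()
        # The list uses "()" or "(None)" when no affiliation is recorded.
        if affiliation in ("", "None"):
            authors.append((name.strip(), None))
        else:
            authors.append((name.strip(), affiliation))
    return authors


def load_papers(path: str) -> List[Dict[str, object]]:
    # Assumption: the CSV has "Title" and "Authors" header columns.
    with open(path, newline="", encoding="utf-8") as f:
        return [
            {"title": row["Title"], "authors": parse_authors(row["Authors"])}
            for row in csv.DictReader(f)
        ]


example = (
    "Meng Cao (International Digital Economy Academy (IDEA))"
    " · Jun Wang (None) · Jinguo Luo ()"
)
for name, affiliation in parse_authors(example):
    print(name, "|", affiliation)
```

Splitting at the first ` (` rather than with a regular expression keeps affiliations that contain their own parentheses intact.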
# | Title | Authors |
---|---|---|
1 | Privacy-Preserving Face Recognition Using Trainable Feature Subtraction | Yuxi Mi (Fudan University) · Zhong Zhizhou (Fudan University) · Yuge Huang (Tencent Youtu Lab) · Jiazhen Ji (Tencent Youtu Lab) · Jianqing Xu (HIT) · Jun Wang (None) · ShaoMing Wang (WeChat Pay Lab33) · Shouhong Ding (Tencent Youtu Lab) · Shuigeng Zhou (Fudan University) |
2 | LORS: Low-rank Residual Structure for Parameter-Efficient Network Stacking | Jialin Li (Tencent) · Qiang Nie (Tencent Youtu Lab) · Weifu Fu (Tencent Youtu Lab) · Yuhuan Lin (Tencent Youtu Lab) · Guangpin Tao (Tencent YoutuLab) · Yong Liu (Tencent Youtu Lab) · Chengjie Wang (Shanghai Jiao Tong University) |
3 | VideoMAC: Video Masked Autoencoders Meet ConvNets | Gensheng Pei (Nanjing University of Science and Technology) · Tao Chen (None) · Xiruo Jiang (None) · Huafeng Liu (Nanjing University of Science and Technology) · Zeren Sun (Nanjing University of Science and Technology) · Yazhou Yao (Nanjing University of Science and Technology) |
4 | Discovering Syntactic Interaction Clues for Human-Object Interaction Detection | Jinguo Luo () · Weihong Ren (Harbin Institute of Technology, Shenzhen) · Weibo Jiang (Harbin Institute of Technology) · Xi'ai Chen (Shenyang Institute of Automation, Chinese Academy of Sciences) · Qiang Wang (Shenyang University) · Zhi Han (Shenyang Institute of Automation, Chinese Academy of Sciences) · Honghai LIU (Harbin Institute of Technology, Shenzhen) |
5 | LiDAR-Net: A Real-scanned 3D Point Cloud Dataset for Indoor Scenes | Yanwen Guo (Nanjing University) · Yuanqi Li (Nanjing University) · Dayong Ren (nanjing university) · Xiaohong Zhang (None) · Jiawei Li (Nanjing University) · Liang Pu (None) · Changfeng Ma (Nanjing University) · xiaoyu zhan () · Jie Guo (Nanjing University) · Mingqiang Wei (Nanjing University of Aeronautics and Astronautics) · Yan Zhang (None) · Piaopiao Yu (Nanjing University) · Shuangyu Yang (Nanjing University) · Donghao Ji (nanjing university) · Huisheng Ye (Nanjing University) · Hao Sun (nanjing university) · Yansong Liu (nanjing university) · Yinuo Chen (Nanjing University) · Jiaqi Zhu (nanjing university) · Hongyu Liu (nanjing university) |
6 | A Dual-Augmentor Framework for Domain Generalization in 3D Human Pose Estimation | Qucheng Peng (University of Central Florida) · Ce Zheng (University of Central Florida) · Chen Chen () |
7 | Real-Time Exposure Correction via Collaborative Transformations and Adaptive Sampling | Ziwen Li (Huazhong University of Science and Technology) · Feng Zhang (Huazhong University of Science and Technology) · Meng Cao (International Digital Economy Academy (IDEA)) · Jinpu Zhang (Huazhong University of Science and Technology) · Yuanjie Shao (Huazhong University of Science and Technology) · Yuehuan Wang (Huazhong University of Science and Technology) · Nong Sang (Huazhong University of Science and Technology) |
8 | RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback | Tianyu Yu (Tsinghua University, Tsinghua University) · Yuan Yao (Tsinghua University) · Haoye Zhang (Tsinghua University, Tsinghua University) · Taiwen He (Tsinghua University, Tsinghua University) · Yifeng Han (Zhejiang University) · Ganqu Cui (Tsinghua University, Tsinghua University) · Jinyi Hu (Tsinghua University, Tsinghua University) · Zhiyuan Liu (Tsinghua University) · Hai-Tao Zheng (Tsinghua University, Tsinghua University) · Maosong Sun (Tsinghua University, Tsinghua University) |
9 | CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition | Feng Lu (Tsinghua University) · Xiangyuan Lan (Peng Cheng Laboratory) · Lijun Zhang (University of Chinese Academy of Sciences) · Dongmei Jiang (Peng Cheng Laboratory) · Yaowei Wang (Pengcheng Laboratory) · Chun Yuan (Tsinghua University, Tsinghua University) |
10 | From a Bird’s Eye View to See: Joint Camera and Subject Registration without the Camera Calibration | Zekun Qian (Tianjin University) · Ruize Han (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Chinese Academy of Sciences) · Wei Feng (Tianjin University) · Song Wang (University of South Carolina) |
11 | SplattingAvatar: Realistic Real-Time Human Avatars with Mesh-Embedded Gaussian Splatting | Zhijing Shao (The Hong Kong University of Science and Technology (Guangzhou)) · Wang Zhaolong (Tsinghua University) · Zhuang Li (Prometheus Vision Technology Co., Ltd.) · Duotun Wang (The Hong Kong University of Science and Technology (Guangzhou)) · Xiangru Lin () · Yu Zhang (Prometheus Vision Technology Co., Ltd.) · Mingming Fan (Hong Kong University of Science and Technology) · Zeyu Wang (The Hong Kong University of Science and Technology (Guangzhou)) |
12 | Explaining CLIP's performance disparities on data from blind/low vision users | Daniela Massiceti (Microsoft Research) · Camilla Longden (Microsoft Research, Cambridge) · Agnieszka Słowik (Microsoft) · Samuel Wills (World Bank) · Martin Grayson (Research, Microsoft) · Cecily Morrison (Microsoft Research) |
13 | UniPAD: A Universal Pre-training Paradigm for Autonomous Driving | Honghui Yang (Zhejiang University) · Sha Zhang (None) · Di Huang (University of Sydney) · Xiaoyang Wu (The University of Hong Kong) · Haoyi Zhu (University of Science and Technology of China) · Tong He (Shanghai AI Lab) · SHIXIANG TANG (The Chinese University of Hong Kong) · Hengshuang Zhao (The University of Hong Kong) · Qibo Qiu (Zhejiang Lab) · Binbin Lin (Zhejiang University) · Xiaofei He (Zhejiang University) · Wanli Ouyang (University of Sydney) |
14 | Generating Handwritten Mathematical Expressions From Symbol Graphs: An End-to-End Pipeline | yu chen (Beijing Waiyan Online Digital Technology Co., Ltd) · Fei Gao (Hangzhou Institute of Technology, Xidian University) · YanguangZhang (Hangzhou Dianzi University) · Maoying Qiao (University of Technology Sydney) · Nannan Wang (Xidian University) |
15 | Frequency-aware Event-based Video Deblurring for Real-World Motion Blur | Taewoo Kim () · Hoonhee Cho (None) · Kuk-Jin Yoon (KAIST) |
16 | Multi-agent Long-term 3D Human Pose Forecasting via Interaction-aware Trajectory Conditioning | Jaewoo Jeong (Korea Advanced Institute of Science & Technology) · Daehee Park (Korea Advanced Institute of Science and Technology) · Kuk-Jin Yoon (KAIST) |
17 | Quantifying Task Priority for Multi-Task Optimization | Wooseong Jeong (The Korea Advanced Institute of Science and Technology (KAIST)) · Kuk-Jin Yoon (KAIST) |
18 | CFPL-FAS: Class Free Prompt Learning for Generalizable Face Anti-spoofing | A Liu (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Shuai Xue (Beijing Institute of Technology) · Gan Jianwen (Macao University of Science and Technology) · Jun Wan () · Yanyan Liang (Macau University of Science and Technology) · Jiankang Deng (Huawei) · Sergio Escalera (Computer Vision Center) · Zhen Lei (Institute of Automation, Chinese Academy of Sciences) |
19 | Stratified Avatar Generation from Sparse Observations | Han Feng (Wuhan University) · Wenchao Ma (None) · Quankai Gao (University of Southern California) · Xianwei Zheng (Wuhan University) · Nan Xue (None) · Huijuan Xu (Pennsylvania State University--University Park) |
20 | EVS-assisted joint Deblurring, Rolling-Shutter Correction and Video Frame Interpolation through Sensor Inverse Modeling | Rui Jiang (OMNIVISION) · Fangwen Tu (OMNIVISION) · Yixuan Long (OMNIVISION TECHNOLOGIES SINGAPORE PTE. LTD.) · Aabhaas Vaish (OmniVision Technologies) · Bowen Zhou (OmniVision Technologies) · Qinyi Wang (OmniVision) · Wei Zhang (OMNIVISION) · Yuntan Fang (OmniVision Technologies) · Luis Capel (Omnivision Technologies) · Bo Mu (OMNIVISION) · Tiejundai (OmniVision Technologies, USA) · Andreas Suess (OMNIVISION) |
21 | Diffusion-ES: Generative Evolutionary Search with Diffusion Models for Trajectory Optimization | Brian Yang (School of Computer Science, Carnegie Mellon University) · Huangyuan Su (Computer Science, School of Engineering and Applied Sciences, Harvard University) · Nikolaos Gkanatsios (Carnegie Mellon University) · Tsung-Wei Ke (CMU, Carnegie Mellon University) · Ayush Jain (Carnegie Mellon University) · Jeff Schneider (Carnegie Mellon University) · Katerina Fragkiadaki (CMU) |
22 | Towards Robust Learning to Optimize with Theoretical Guarantees | Qingyu Song (The Chinese University of Hong Kong) · Wei Lin (The Chinese University of Hong Kong) · Juncheng Wang (Hong Kong Baptist University) · Hong Xu (CUHK) |
23 | Exploring Vision Transformers for 3D Human Motion-Language Models with Motion Patches | Qing Yu (LY Corporation) · Mikihiro Tanaka (LY Corporation) · Kent Fujiwara (LY Corporation) |
24 | Test-Time Domain Generalization for Face Anti-Spoofing | Qianyu Zhou (Shanghai Jiao Tong University) · Ke-Yue Zhang (Tencent) · Taiping Yao (Tencent Youtu Lab) · Xuequan Lu (La Trobe University) · Shouhong Ding (Tencent Youtu Lab) · Lizhuang Ma (Dept. of Computer Sci. & Eng., Shanghai Jiao Tong University) |
25 | StraightPCF: Straight Point Cloud Filtering | Dasith de Silva Edirimuni (Deakin University) · Xuequan Lu (La Trobe University) · Gang Li (Deakin University) · Lei Wei (Deakin University) · Antonio Robles-Kelly (Defence Science and Technology Group (DST), Deakin University) · Hongdong Li (Australian National University) |
26 | BA-SAM: Scalable Bias-Mode Attention Mask for Segment Anything Model | song yiran (None) · Qianyu Zhou (Shanghai Jiao Tong University) · Xiangtai Li (Nanyang Technological University) · Deng-Ping Fan (ETH Zurich) · Xuequan Lu (La Trobe University) · Lizhuang Ma (Dept. of Computer Sci. & Eng., Shanghai Jiao Tong University) |
27 | BlockGCN: Redefine Topology Awareness for Skeleton-Based Action Recognition | Yuxuan Zhou (Universität Mannheim) · Xudong Yan (City University of Macao) · Zhi-Qi Cheng (Carnegie Mellon University) · Yan Yan (Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences) · Qi Dai (Microsoft Research Asia) · Xian-Sheng Hua (Terminus Group) |
28 | Class Tokens Infusion for Weakly Supervised Semantic Segmentation | Sung-Hoon Yoon (None) · Hoyong Kwon (Korea Advanced Institute of Science & Technology) · Hyeonseong Kim (Korea Advanced Institute of Science and Technology) · Kuk-Jin Yoon (KAIST) |
29 | Towards Robust 3D Object Detection with LiDAR and 4D Radar Fusion in Various Weather Conditions | Yujeong Chae (KAIST) · Hyeonseong Kim (Korea Advanced Institute of Science and Technology) · Kuk-Jin Yoon (KAIST) |
30 | T4P: Test-Time Training of Trajectory Prediction via Masked Autoencoder and Actor-specific Token Memory | Daehee Park (Korea Advanced Institute of Science and Technology) · Jaeseok Jeong (KAIST) · Sung-Hoon Yoon (None) · Jaewoo Jeong (Korea Advanced Institute of Science & Technology) · Kuk-Jin Yoon (KAIST) |
31 | Snapshot Lidar: Fourier embedding of phasors for single-image depth reconstruction | Sarah Friday (Dartmouth College) · Yunzi Shi (Dartmouth College) · Yaswanth Kumar Cherivirala (Univ. of Michigan/NVIDIA) · Vishwanath Saragadam (University of California, Riverside) · Adithya Pediredla (Dartmouth College) |
32 | G3DR: Generative 3D Reconstruction in ImageNet | Pradyumna Reddy () · Ismail Elezi (Technische Universität München) · Jiankang Deng (Huawei) |
33 | Human Gaussian Splatting: Real-time Rendering of Animatable Avatars | Arthur Moreau (Mines Paris - PSL University) · Jifei Song (Huawei Technologies Ltd.) · Helisa Dhamo (None) · Richard Shaw (Huawei Technologies Ltd.) · Yiren Zhou (Huawei Technologies Ltd.) · Eduardo Pérez-Pellitero (Huawei Noah's Ark Lab (UK)) |
34 | The Neglected Tails in Vision-Language Models | Shubham Parashar (Texas A&M University - College Station) · Tian Liu (Texas A&M University - College Station) · Zhiqiu Lin (Carnegie Mellon University) · Xiangjue Dong (Texas A&M University - College Station) · Yanan Li (Zhejiang Lab) · James Caverlee (Texas A&M University) · Deva Ramanan (Carnegie Mellon University) · Shu Kong (Texas A&M University) |
35 | HouseCat6D - A Large-Scale Multi-Modal Category Level 6D Object Pose Dataset with Household Objects in Realistic Scenarios | HyunJun Jung (Technische Universität München) · Shun-Cheng Wu (Technical University Munich) · Patrick Ruhkamp (Technical University Munich) · Guangyao Zhai (Technical University of Munich) · Hannah Schieber (Technische Universität München/Friedrich-Alexander Universität Erlangen-Nürnberg) · Giulia Rizzoli (University of Padua) · Pengyuan Wang (Technische Universität München) · Hongcheng Zhao (Technische Universität München) · Lorenzo Garattoni (Toyota Motor Europe) · Sven Meier (Toyota Motors Europe NV/SA) · Daniel Roth (Technische Universität München) · Nassir Navab (TU Munich) · Benjamin Busam (None) |
36 | Cross-spectral Gated-RGB Stereo Depth Estimation | Samuel Brucker (Mercedes Benz Research & Development) · Stefanie Walz (Mercedes-Benz AG) · Mario Bijelic (Princeton University) · Felix Heide (Department of Computer Science, Princeton University) |
37 | DiffAvatar: Simulation-Ready Garment Optimization with Differentiable Simulation | Yifei Li (Massachusetts Institute of Technology) · Hsiaoyu Chen (Meta) · Egor Larionov (Meta) · Nikolaos Sarafianos (Meta Reality Labs Research) · Wojciech Matusik (Massachusetts Institute of Technology) · Tuur Stuyck (Meta) |
38 | Turb-Seg-Res: A Segment-then-Restore Pipeline for Dynamic Videos with Atmospheric Turbulence | Ripon Saha (Arizona State University) · Dehao Qin (Clemson University) · Nianyi Li (None) · Jinwei Ye (None) · Suren Jayasuriya (Arizona State University) |
39 | Region-Based Representations Revisited | Michal Shlapentokh-Rothman (University of Illinois, Urbana Champaign) · Ansel Blume (University of Illinois Urbana Champaign) · Yao Xiao (University of Illinois at Urbana-Champaign) · Yuqun Wu (Department of Computer Science) · Sethuraman T V (Department of Computer Science) · Heyi Tao (University of Illinois at Urbana-Champaign) · Jae Yong Lee (University of Illinois at Urbana-Champaign) · Wilfredo Torres-Calderon (Reconstruct) · Yu-Xiong Wang (None) · Derek Hoiem (University of Illinois at Urbana-Champaign) |
40 | DIMAT: Decentralized Iterative Merging-And-Training for Deep Learning Models | Nastaran Saadati (Iowa State University) · Minh Pham (New York University) · Nasla Saleem (Iowa State University) · Joshua R. Waite (Iowa State University) · Aditya Balu (Iowa State University) · Zhanhong Jiang (Iowa State University) · Chinmay Hegde (New York University) · Soumik Sarkar (Iowa State University) |
41 | GLiDR: Topologically Regularized Graph Generative Network for Sparse LiDAR Point Clouds | Prashant Kumar (Indian Institute of Technology Delhi) · Kshitij Bhat (None) · Vedang Nadkarni (Birla Institute of Technology and Science (BITS Pilani)) · Prem Kalra (Indian Institute of Technology, Delhi) |
42 | Learning the 3D Fauna of the Web | Zizhang Li (Zhejiang University) · Dor Litvak (University of Texas at Austin) · Ruining Li (University of Oxford) · Yunzhi Zhang (Stanford University) · Tomas Jakab (University of Oxford) · Christian Rupprecht (University of Oxford) · Shangzhe Wu (Stanford University) · Andrea Vedaldi (University of Oxford) · Jiajun Wu (Stanford University) |
43 | Hearing Anything Anywhere | Mason Wang (Stanford University) · Ryosuke Sawata (Sony Research) · Samuel Clarke (Stanford University) · Ruohan Gao (Stanford University) · Shangzhe Wu (Stanford University) · Jiajun Wu (Stanford University) |
44 | IBD-SLAM: Learning Image-Based Depth Fusion for Generalizable SLAM | Minghao Yin (The University of Hong Kong) · Shangzhe Wu (Stanford University) · Kai Han (The University of Hong Kong) |
45 | MaskINT: Video Editing via Interpolative Non-autoregressive Masked Transformers | Haoyu Ma (University of California, Irvine) · Shahin Mahdizadehaghdam (Meta) · Bichen Wu (Facebook) · Zhipeng Fan (Facebook) · Yuchao Gu (None) · Wenliang Zhao (Meta Inc) · Lior Shapira (Meta) · Xiaohui Xie (University of California, Irvine) |
46 | Single-view Scene Point Cloud Human Grasp Generation | Yan-Kang Wang (SUN YAT-SEN UNIVERSITY) · Chengyi Xing (Stanford University) · Yi-Lin Wei (SUN YAT-SEN UNIVERSITY) · Xiao-Ming Wu (SUN YAT-SEN UNIVERSITY) · Wei-Shi Zheng (SUN YAT-SEN UNIVERSITY) |
47 | Boosting Self-Supervision for Single-View Scene Completion via Knowledge Distillation | Keonhee Han (Fraunhofer IIS) · Dominik Muhle (Technical University of Munich) · Felix Wimbauer (Technical University of Munich) · Daniel Cremers (Technical University Munich) |
48 | Fully Convolutional Slice-to-Volume Reconstruction for Single-Stack MRI | Sean I. Young (Harvard Medical School / MIT) · Yaël Balbastre (Massachusetts General Hospital, Harvard Medical School) · Bruce Fischl (Massachusetts General Hospital, Harvard University) · Polina Golland (Massachusetts Institute of Technology) · Juan Iglesias (Harvard University) |
49 | Seeing the World through Your Eyes | Hadi Alzayer (University of Maryland) · Kevin Zhang (University of Maryland, College Park) · Brandon Y. Feng (Massachusetts Institute of Technology) · Christopher Metzler (University of Maryland, College Park) · Jia-Bin Huang (University of Maryland, College Park) |
50 | Grounded Text-to-Image Synthesis with Attention Refocusing | Quynh Phung (University of Maryland, College Park) · Songwei Ge (University of Maryland, College Park) · Jia-Bin Huang (University of Maryland, College Park) |
51 | TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion | Yu-Ying Yeh (University of California, San Diego) · Jia-Bin Huang (University of Maryland, College Park) · Changil Kim (Facebook) · Lei Xiao (None) · Thu Nguyen-Phuoc (Reality Labs Research, Meta) · Numair Khan (None) · Cheng Zhang (Facebook) · Manmohan Chandraker (UC San Diego) · Carl Marshall (Reality Labs Research) · Zhao Dong (Meta RL Research) · Zhengqin Li (Facebook) |
52 | Simple but Effective Text-to-Video Generation with Grid Diffusion Models | Taegyeong Lee (Ulsan National Institute of Science and Technology) · Soyeong Kwon (Ulsan National Institute of Science and Technology) · Taehwan Kim (UNIST) |
53 | Single Mesh Diffusion Models with Field Latents for Texture Generation | Thomas W. Mitchel (PlayStation) · Carlos Esteves (Google) · Ameesh Makadia (Google Research) |
54 | VecFusion: Vector Font Generation with Diffusion | Vikas Thamizharasan (Adobe Systems) · Difan Liu (None) · Shantanu Agarwal (Balbix) · Matthew Fisher (Adobe Research) · Michaël Gharbi (Massachusetts Institute of Technology) · Oliver Wang (Adobe Research) · Alec Jacobson (University of Toronto and Adobe Systems) · Evangelos Kalogerakis (UMass Amherst) |
55 | Bayes' Rays: Uncertainty Quantification for Neural Radiance Fields | Leili Goli (University of Toronto) · Cody Reading (Simon Fraser University) · Silvia Sellán (University of Toronto) · Alec Jacobson (University of Toronto and Adobe Systems) · Andrea Tagliasacchi (Simon Fraser University) |
56 | YolOOD: Utilizing Object Detection Concepts for Multi-Label Out-of-Distribution Detection | Alon Zolfi (Ben-Gurion University of the Negev) · Guy AmiT (Ben-Gurion University of the Negev) · Amit Baras () · Satoru Koda (Fujitsu Limited) · Ikuya Morikawa (Fujitsu Research) · Yuval Elovici (Ben Gurion University of the Negev) · Asaf Shabtai (Ben-Gurion University of the Negev) |
57 | Learning to Localize Sound Sources from Mixtures without Prior Source Knowledge | Dongjin Kim (Kyung Hee University) · Sung Jin Um (Kyung Hee University) · Sangmin Lee (University of Illinois Urbana-Champaign) · Jung Uk Kim (Kyung Hee University) |
58 | Modeling Multimodal Social Interactions: New Challenges and Baselines with Densely Aligned Representations | Sangmin Lee (University of Illinois Urbana-Champaign) · Bolin Lai (Georgia Institute of Technology) · Fiona Ryan (Georgia Institute of Technology) · Bikram Boote (University of Illinois, Urbana Champaign) · James Rehg (None) |
59 | SynFog: A Photo-realistic Synthetic Fog Dataset based on End-to-end Imaging Simulation for Advancing Real-World Defogging in Autonomous Driving | Yiming Xie (Shenzhen International Graduate School, Tsinghua University) · Henglu Wei (Tsinghua University, Tsinghua University) · Zhenyi Liu (Stanford University) · Xiaoyu Wang (Department of Automation, Tsinghua University) · Xiangyang Ji (Tsinghua University) |
60 | Infinigen Indoors: Photorealistic Indoor Scenes using Procedural Generation | Alexander Raistrick (Princeton University) · Lingjie Mei (Princeton University) · Karhan Kayan (Princeton University) · David Yan (Princeton University) · Yiming Zuo (Princeton University) · Beining Han (Department of Computer Science, Princeton University) · Hongyu Wen (Princeton University) · Meenal Parakh (Princeton University) · Stamatis Alexandropoulos (Princeton University) · Lahav Lipson (Princeton University) · Zeyu Ma (Princeton university) · Jia Deng (Princeton University) |
61 | Kandinsky Conformal Prediction: Efficient Calibration of Image Segmentation Algorithms | Joren Brunekreef (Netherlands Cancer Institute) · Eric Marcus (Netherlands Cancer Institute) · Ray Sheombarsing (None) · Jan-Jakob Sonke (Netherlands Cancer Institute) · Jonas Teuwen (Netherlands Cancer Institute) |
62 | Describing Differences in Image Sets with Natural Language | Lisa Dunlap (University of California, Berkeley) · Yuhui Zhang (Stanford University) · Xiaohan Wang (Zhejiang University) · Ruiqi Zhong (University of California Berkeley) · Trevor Darrell (Electrical Engineering & Computer Science Department) · Jacob Steinhardt (University of California Berkeley) · Joseph Gonzalez (University of California - Berkeley) · Serena Yeung (Stanford) |
63 | SwitchLight: Co-design of Physics-driven Architecture and Pre-training Framework for Human Portrait Relighting | Hoon Kim (Beeble Inc.) · Minje Jang (Beeble Inc.) · Wonjun Yoon (Beeble Inc.) · Jisoo Lee (Beeble Inc.) · Donghyun Na (Beeble Inc.) · Sanghyun Woo (New York University) |
64 | Causal Mode Multiplexer: A Novel Framework for Unbiased Multispectral Pedestrian Detection | Taeheon Kim (Korea Advanced Institute of Science & Technology) · Sebin Shin (KAIST) · Youngjoon Yu (Korea Advanced Institute of Science and Technology (KAIST)) · Hak Gu Kim (Chung-Ang University) · Yong Man Ro (Korea Advanced Institute of Science and Technology) |
65 | FedSelect: Personalized Federated Learning with Customized Selection of Parameters for Fine-Tuning | Rishub Tamirisa (AI@UIUC) · Chulin Xie (University of Illinois, Urbana Champaign) · Wenxuan Bao (University of Illinois Urbana Champaign) · Andy Zhou (Lapis Labs) · Ron Arel (Lapis Lapis, UIUC) · Aviv Shamsian (Bar-Ilan University) |
66 | ReCoRe: Regularized Contrastive Representation Learning of World Model | Rudra Poudel (Toshiba Europe, Cambridge, UK) · Harit Pandya (Toshiba Europe) · Stephan Liwicki (Toshiba Europe Ltd) · Roberto Cipolla (University of Cambridge) |
67 | Gaussian Splatting SLAM | Hidenobu Matsuki (Imperial College London) · Riku Murai (Imperial College London) · Paul Kelly (Imperial College London) · Andrew J. Davison (Imperial College London) |
68 | PrPSeg: Universal Proposition Learning for Panoramic Renal Pathology Segmentation | Ruining Deng (Vanderbilt University) · Quan Liu (Vanderbilt University) · Can Cui (Vanderbilt University) · Tianyuan Yao (Vanderbilt University) · Jialin Yue (Vanderbilt University) · Juming Xiong (Vanderbilt University) · Lining yu (Vanderbilt University) · Yifei Wu (Vanderbilt University) · Mengmeng Yin (Vanderbilt University) · Yu Wang (Vanderbilt University Medical Center) · Shilin Zhao (Vanderbilt University) · Yucheng Tang (NVIDIA) · Haichun Yang (Vanderbilt University Medical School) · Yuankai Huo (Vanderbilt University) |
69 | See, Say, and Segment: Correcting False Premises with LMMs | Tsung-Han Wu (University of California, Berkeley) · Giscard Biamby (University of California, Berkeley) · David Chan (University of California Berkeley) · Lisa Dunlap (University of California, Berkeley) · Ritwik Gupta (Defense Innovation Unit) · Xudong Wang (Electrical Engineering & Computer Science Department, University of California Berkeley) · Trevor Darrell (Electrical Engineering & Computer Science Department) · Joseph Gonzalez (University of California - Berkeley) |
70 | On Scaling up a Multilingual Vision and Language Model | Xi Chen (Google) · Josip Djolonga (Google) · Piotr Padlewski (Google) · Basil Mustafa (Google) · Soravit Changpinyo (Google Research) · Jialin Wu (Google) · Carlos Riquelme Ruiz (Google) · Sebastian Goodman (Google) · Xiao Wang (Google DeepMind) · Yi Tay (Google) · Siamak Shakeri (Research, Google) · Mostafa Dehghani (Google DeepMind) · Daniel Salz (Google) · Mario Lučić (Google) · Michael Tschannen (Google DeepMind) · Arsha Nagrani (Google ) · Hexiang Hu (Google Deepmind) · Mandar Joshi (Google DeepMind) · Bo Pang (Google) · Ceslee Montgomery (Google) · Paulina Pietrzyk (Google) · Marvin Ritter (Google DeepMind) · AJ Piergiovanni (Google) · Matthias Minderer (Google) · Filip Pavetic (Google) · Austin Waters (Google) · Gang Li (Google) · Ibrahim Alabdulmohsin (Google) · Lucas Beyer (Google Brain/DM Zürich) · Julien Amelot (Research, Google) · Kenton Lee (Google Research) · Andreas Steiner (Google) · Yang Li (Google) · Daniel Keysers (Google) · Anurag Arnab (Google) · Yuanzhong Xu (Google) · Keran Rong (Google Deepmind) · Alexander Kolesnikov (Google) · Mojtaba Seyedhosseini (Google) · Anelia Angelova (Google) · Xiaohua Zhai (Google) · Neil Houlsby (Google) · Radu Soricut (Google) |
71 | Versatile Medical Image Segmentation Learned from Multi-Source Datasets via Model Self-Disambiguation | Xiaoyang Chen (University of Pennsylvania, University of Pennsylvania) · Hao Zheng (University of Pennsylvania, University of Pennsylvania) · Yuemeng LI (University of Pennsylvania) · Yuncong Ma (University of Pennsylvania, University of Pennsylvania) · Liang Ma (University of Pennsylvania, University of Pennsylvania) · Hongming Li (University of Pennsylvania, University of Pennsylvania) · Yong Fan (University of Pennsylvania, University of Pennsylvania) |
72 | Scaling Up Dynamic 3D Human-Scene Interaction Modelling | Nan Jiang (Peking University) · Zhiyuan Zhang (Department of Automation, Tsinghua University) · Hongjie Li (Peking University) · Xiaoxuan Ma () · Zan Wang (None) · Yixin Chen (BIGAI) · Tengyu Liu (None) · Yixin Zhu (Peking University) · Siyuan Huang (Beijing Institute of General Artificial Intelligence) |
73 | Making Large Multimodal Models Understand Arbitrary Visual Prompts | Mu Cai (Department of Computer Science, University of Wisconsin, Madison) · Haotian Liu (University of Wisconsin-Madison) · Siva Mustikovela (Heidelberg University) · Gregory P. Meyer (Cruise) · Yuning Chai (Cruise) · Dennis Park (Toyota Research Institute) · Yong Jae Lee (Department of Computer Sciences, University of Wisconsin - Madison) |
74 | ID-Blau: Image Deblurring by Implicit Diffusion-based reBLurring AUgmentation | Jia-Hao Wu (National Yang Ming Chiao Tung University) · Fu-Jen Tsai (National Tsinghua University) · Yan-Tsung Peng (National Chengchi University) · Charles Tsai (Qualcomm Inc, QualComm) · Chia-Wen Lin (National Tsing Hua University) · Yen-Yu Lin (National Yang Ming Chiao Tung University) |
75 | Do Vision and Language Encoders Represent the World Similarly? | Mayug Maniparambil (ML Labs, Dublin City University) · Raiymbek Akshulakov (University of California, Berkeley) · YASSER ABDELAZIZ DAHOU DJILALI (Technology Innovation Institute) · Mohamed El Amine Seddik (Technology Innovation Institute) · Sanath Narayan (Technology Innovation Institute) · Karttikeya Mangalam (University of California Berkeley) · Noel O'Connor (Dublin City University) |
76 | PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics | Tianyi Xie (University of California, Los Angeles) · Zeshun Zong (University of California, Los Angeles) · Yuxing Qiu (UCLA & LightSpeed Studios) · Xuan Li (None) · Yutao Feng (Zhejiang University) · Yin Yang (University of Utah) · Chenfanfu Jiang (University of California, Los Angeles) |
77 | Looking Similar, Sounding Different: Leveraging Counterfactual Cross-Modal Pairs for Audiovisual Representation Learning | Nikhil Singh (Massachusetts Institute of Technology) · Chih-Wei Wu (Netflix) · Iroro Orife (Netflix) · Kalayeh (None) |
78 | DL3DV-10K: A Large-Scale Scene Dataset for Deep Learning-based 3D Vision | Lu Ling (Purdue University) · Yichen Sheng (Purdue University) · Zhi Tu (Purdue University) · Wentian Zhao (Adobe Systems) · Cheng Xin (Rutgers University) · Kun Wan (Adobe Inc.) · Lantao Yu (Adobe Inc.) · Qianyu Guo (None) · Zixun Yu (Purdue University) · Yawen Lu (Purdue University; Rochester Institute of Tech) · Xuanmao Li (Huazhong University of Science and Technology) · Xingpeng Sun (Purdue University) · Rohan Ashok (Purdue University) · Aniruddha Mukherjee (Purdue University) · Hao Kang (Wormpex AI Research) · Xiangrui Kong (Purdue University) · Gang Hua (Wormpex AI Research) · Tianyi Zhang (Purdue University) · Bedrich Benes (Purdue University) · Aniket Bera (Purdue University) |
79 | Restricted Memory Banks Improve Video Object Segmentation: A Revisit | Junbao Zhou (None) · Ziqi Pang (UIUC) · Yu-Xiong Wang (None) |
80 | Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models | Hongjie Wang (Princeton University) · Difan Liu (None) · Yan Kang (None) · Yijun Li (Adobe Research) · Zhe Lin (Adobe Research) · Niraj Jha (Princeton University) · Yuchen Liu (None) |
81 | Zero-TPrune: Zero-Shot Token Pruning through Leveraging of the Attention Graph in Pre-Trained Transformers | Hongjie Wang (Princeton University) · Bhishma Dedhia (Princeton University) · Niraj Jha (Princeton University) |
82 | ViVid-1-to-3: Novel View Synthesis with Video Diffusion Models | Jeong-gi Kwak (Korea University) · Erqun Dong (University of British Columbia) · Yuhe Jin (University of British Columbia) · Hanseok Ko (Korea University) · Shweta Mahajan (University of British Columbia) · Kwang Moo Yi (University Of British Columbia) |
83 | UVEB: A Large-scale Benchmark and Baseline Towards Real-World Underwater Video Enhancement | yaofeng xie (Ocean University of China) · Lingwei Kong (Sanya Oceanographic Institution, Ocean University of China) · Kai Chen (Sanya Oceanographic Institution, Ocean University of China) · Zheng Ziqiang (Hong Kong University of Science and Technology) · Xiao Yu (Sanya Oceanographic Institution, Ocean University of China) · Zhibin Yu (Sanya Oceanographic Institution, Ocean university of China) · Bing Zheng (Sanya Oceanographic Institution, Ocean University of China) |
84 | Lookahead Exploration with Neural Radiance Representation for Continuous Vision-Language Navigation | Zihan Wang (None) · Xiangyang Li (Institute of Computing Technology, Chinese Academy of Sciences) · Jiahao Yang (Institute of Computing Technology, Chinese Academy of Sciences) · Yeqi Liu (Institute of Computing Technology, Chinese Academy of Sciences) · Junjie Hu (University of Wisconsin, Madison) · Ming Jiang (Indiana University) · Shuqiang Jiang (Institute of Computing Technology, Chinese Academy of Sciences) |
85 | In Search of a Data Transformation That Accelerates Neural Field Training | Junwon Seo (None) · Sangyoon Lee (Pohang University of Science and Technology) · Kwang In Kim (Pohang University of Science and Technology) · Jaeho Lee (POSTECH) |
86 | Advancing Saliency Ranking with Human Fixations: Dataset, Models and Benchmarks | Bowen Deng (Computer Vision Laboratory University of Nottingham) · Siyang Song (University of Leicester) · Andrew French (University of Nottingham) · Denis Schluppeck (University of Nottingham) · Michael Pound (University of Nottingham) |
87 | Single-View Refractive Index Tomography with Neural Fields | Brandon Zhao (California Institute of Technology) · Aviad Levis (California Institute of Technology) · Liam Connor (California Institute of Technology) · Pratul P. Srinivasan (Google Research) · Katherine Bouman (California Institute of Technology) |
88 | TULIP: A Multi-camera 3D Dataset for Precision Assessment of Parkinson's Disease | Kyungdo Kim (Duke University) · Sihan Lyu (Duke University) · Sneha Mantri (Duke University) · Timothy DUNN (Duke University) |
89 | Generative Unlearning for Any Identity | Juwon Seo (Kyung Hee University) · Sung-Hoon Lee (Kyung Hee University) · Tae-Young Lee (Kyung Hee University) · SeungJun Moon (KLleon) · Gyeong-Moon Park (Kyung Hee University) |
90 | GSNeRF: Generalizable Semantic Neural Radiance Fields with Enhanced 3D Scene Understanding | Zi-Ting Chou (National Taiwan University) · Sheng-Yu Huang (National Taiwan University) · I-Jieh Liu (National Taiwan University) · Yu-Chiang Frank Wang (NVIDIA) |
91 | 4SAVED - Four Seasons Autonomous Vehicle Environment Dataset | Daniel Kent (Michigan State University) · Mohammed Alyaqoub (Michigan State University) · Xiaohu Lu (Michigan State University) · Sayed Khatounabadi (Michigan State University) · Kookjin Sung (Michigan State University) · Cole Scheller (Michigan State University) · Alexander Dalat (University of Michigan - Ann Arbor) · Xinwei Guo (Michigan State University) · Asma Bin Thabit (Michigan State University) · Roberto Muntaner Whitley (Michigan State University) · Hayder Radha (Michigan State University) |
92 | Exploiting Inter-sample and Inter-feature Relations in Dataset Distillation | Wenxiao Deng (None) · Wenbin Li (Nanjing University) · Tianyu Ding (Microsoft) · Lei Wang (University of Wollongong) · Hongguang Zhang (Systems Engineering Institute, AMS) · Kuihua Huang (National University of Defense Technology) · Jing Huo (Nanjing University) · Yang Gao (Nanjing University) |
93 | DREAM: Diffusion Rectification and Estimation-Adaptive Models | Jinxin Zhou (Ohio State University, Columbus) · Tianyu Ding (Microsoft) · Tianyi Chen (Microsoft) · Jiachen Jiang (Ohio State University, Columbus) · Ilya Zharkov (Microsoft) · Zhihui Zhu (Ohio State University, Columbus) · Luming Liang (Microsoft) |
94 | ProMotion: Prototypes As Motion Learners | Yawen Lu (Purdue University; Rochester Institute of Tech) · Dongfang Liu (Rochester Institute of Technology) · Qifan Wang (Meta AI) · Cheng Han (Rochester Institute of Technology) · Yiming Cui (University of Florida) · Zhiwen Cao (Purdue University) · Xueling Zhang (Rochester Institute of Technology) · Yingjie Victor Chen (Purdue University) · Heng Fan (University of North Texas) |
95 | EgoGen: An Egocentric Synthetic Data Generator | Gen Li (ETH Zurich) · Kaifeng Zhao (ETHZ - ETH Zurich) · Siwei Zhang (None) · Xiaozhong Lyu (Department of Computer Science, ETHZ - ETH Zurich) · Mihai Dusmanu (Microsoft) · Yan Zhang (ETH Zurich) · Marc Pollefeys (ETH Zurich / Microsoft) · Siyu Tang (ETH Zurich) |
96 | Collaborating Foundation models for Domain Generalized Semantic Segmentation | Mohammed-Yasser BENIGMIM (Telecom Paris) · Subhankar Roy (University of Aberdeen) · Slim Essid (Télécom Paris) · Vicky Kalogeiton (Ecole polytechnique) · Stéphane Lathuilière (Télécom ParisTech) |
97 | Make Pixels Dance: High-Dynamic Video Generation | Yan Zeng (ByteDance) · Guoqiang Wei (ByteDance) · Jiani Zheng (None) · Jiaxin Zou (ByteDance Ltd.) · Yang Wei (East China Normal University) · Yuchen Zhang (ByteDance Research) · Hang Li (ByteDance Technology) |
98 | Synthesize Step-by-Step: Tools, Templates and LLMs as Data Generators for Reasoning-Based Chart VQA | Zhuowan Li (Johns Hopkins University) · Bhavan Jasani (Amazon) · Peng Tang (Amazon) · Shabnam Ghadar (Amazon) |
99 | Learning Dense Visual Correspondence for Category-level Garment Manipulation | Ruihai Wu (Peking University) · Haoran Lu (Peking University) · Yiyan Wang (Beijing Institute of Technology) · Yubo Wang (Peking University) · Hao Dong (None) |
100 | Multiagent Multitraversal Multimodal Self-Driving: Open MARS Dataset | Yiming Li (New York University) · Zhiheng Li (New York University) · Nuo Chen (New York University) · Moonjun Gong (New York University) · Zonglin Lyu (New York University) · Zehong Wang (New York University) · Peili Jiang (New York University) · Chen Feng (New York University) |
101 | Diffusion Time-step Curriculum for One Image to 3D Generation | YI Xuanyu (National Technological University) · Zike Wu (Nanyang Technological University) · Qingshan Xu (Nanyang Technological University) · Pan Zhou (Sea Group) · Joo Lim (I2R, A*STAR) · Hanwang Zhang (Nanyang Technological University) |
102 | Consistent3D: Towards Consistent High-Fidelity Text-to-3D Generation with Deterministic Sampling Prior | Zike Wu (Nanyang Technological University) · Pan Zhou (Sea Group) · YI Xuanyu (National Technological University) · Xiaoding Yuan (Johns Hopkins University) · Hanwang Zhang (Nanyang Technological University) |
103 | SyncMask: Synchronized Attentional Masking for Fashion-centric Vision-Language Pretraining | Chull Hwan Song (Dealicious Inc) · Taebaek Hwang (None) · Jooyoung Yoon (Dealicious Inc) · Shunghyun Choi (None) · Yeong Hyeon Gu (Sejong University) |
104 | TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding | Shuhuai Ren (None) · Linli Yao (Peking University) · Shicheng Li (Peking University) · Xu Sun (Peking University) · Lu Hou (Huawei Technologies Ltd.) |
105 | EFHQ: Multi-purpose ExtremePose-Face-HQ dataset | Trung Dao (VinAI) · Duc H Vu (Miami University of Ohio) · Cuong Pham (Posts & Telecommunications Institute of Technology and VinAI Research) · Anh Tran (None) |
106 | OpenStreetView-5M: The Many Roads to Global Visual Geolocation | Guillaume Astruc (ENPC, Ecole Nationale des Ponts et Chaussées) · Nicolas Dufour (Ecole Nationale des Ponts et Chaussées) · Ioannis Siglidis (Ecole Nationale des Ponts et Chaussées) · Constantin Aronssohn (ENPC, Ecole Nationale des Ponts et Chaussées) · Nacim Bouia (Ecole Normale Superieure) · Stephanie Fu (University of California, Berkeley) · Romain Loiseau (Ecole Nationale des Ponts et Chaussées) · Van Nguyen Nguyen (Ecole des Ponts ParisTech) · Charles Raude (ENPC, Ecole Nationale des Ponts et Chaussées) · Elliot Vincent (Imagine (LIGM) - Willow (Inria)) · Lintao XU (Université Gustave Eiffel) · Hongyu Zhou (Ecole Nationale des Ponts et Chaussées) · Loic Landrieu (ENPC) |
107 | Learnable Earth Parser: Discovering 3D Prototypes in Aerial Scans | Romain Loiseau (Ecole Nationale des Ponts et Chaussées) · Elliot Vincent (Imagine (LIGM) - Willow (Inria)) · Mathieu Aubry (ENPC) · Loic Landrieu (ENPC) |
108 | Masked Autoencoders for Microscopy are Scalable Learners of Cellular Biology | Oren Kraus (Recursion) · Kian Kenyon-Dean (Recursion Pharma) · Saber Saberian (Recursion Pharma) · Maryam Fallah (Recursion Pharmaceuticals) · Peter McLean (Recursion) · Jess Leung (Recursion) · Vasudev Sharma (Recursion) · Ayla Khan (University of Utah) · Jia Balakrishnan (Recursion Pharmaceuticals) · Safiye Celik (Recursion) · Dominique Beaini (Valence Labs) · Maciej Sypetkowski (Valence Labs) · Chi Cheng (Boston University, Boston University) · Kristen Morse (Recursion) · Maureen Makes (University of Utah) · Ben Mabey (None) · Berton Earnshaw (University of Utah) |
109 | Learning CNN on ViT: A Hybrid Model to Explicitly Class-specific Boundaries for Domain Adaptation | Ba Ngo (Chonnam National University) · Nhat-Tuong Do-Tran (National Yang Ming Chiao Tung University) · Tuan-Ngoc Nguyen (FPT Telecom) · Hae-Gon Jeon (None) · Tae Jong Choi (Chonnam National University) |
110 | AssistGUI: Task-Oriented Desktop Graphical User Interface Automation | Difei Gao (None) · Lei Ji (Research, Microsoft) · Zechen Bai (Show Lab, National University of Singapore) · Mingyu Ouyang (National University of Singapore) · Peiran Li (National University of Singapore) · Dongxing Mao (SUTD) · Qin WU (National University of Singapore) · Weichen Zhang (National University of Singapore) · Peiyi Wang (National University of Singapore) · Xiangwu Guo (South China University of Technology) · Hengxu Wang (National University of Singapore) · Luowei Zhou (Google) · Mike Zheng Shou (National University of Singapore) |
111 | Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression | Hancheng Ye (Fudan University) · Chong Yu (Fudan University NVIDIA Corporation) · Peng Ye (Fudan University) · Renqiu Xia (Shanghai Jiao Tong University) · Bo Zhang (Shanghai AI Laboratory) · Yansong Tang () · Jiwen Lu (Tsinghua University) · Tao Chen (None) |
112 | SuperPrimitive: Scene Reconstruction at a Primitive Level | Kirill Mazur (Imperial College London) · Gwangbin Bae (Imperial College London) · Andrew J. Davison (Imperial College London) |
113 | Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer | Zhen Zhao (East China Normal University) · Jingqun Tang (Bytedance) · Chunhui Lin (Bytedance) · Binghong Wu (Bytedance) · Can Huang (Bytedance) · Hao Liu (Bytedance Inc.) · Xin Tan (East China Normal University) · Zhizhong Zhang (East China Normal University) · Yuan Xie (East China Normal University) |
114 | Building a Strong Pre-Training Baseline for Universal 3D Large-Scale Perception | Haoming Chen (East China Normal University) · Zhizhong Zhang (East China Normal University) · Yanyun Qu (Xiamen University) · Ruixin Zhang (Tencent Youtu Lab) · Xin Tan (East China Normal University) · Yuan Xie (East China Normal University) |
115 | QN-Mixer: A Quasi-Newton MLP-Mixer Model for Sparse-View CT Reconstruction | Ishak Ayad (ETIS & AGM, CY Cergy Paris University, ENSEA, CNRS) · Nicolas Larue (None) · Maï K. Nguyen (ETIS, CY Cergy Paris University, ENSEA, CNRS) |
116 | BodyMAP - Jointly Predicting Body Mesh and 3D Applied Pressure Map for People in Bed | Abhishek Tandon (Carnegie Mellon University) · Anujraaj Goyal (Carnegie Mellon University) · Henry M. Clever (NVIDIA) · Zackory Erickson (Carnegie Mellon University) |
117 | PeerAiD: Improving Adversarial Distillation from a Specialized Peer Tutor | Jaewon Jung (Seoul National University) · Hongsun Jang (Seoul National University) · Jaeyong Song (Seoul National University) · Jinho Lee (Seoul National University) |
118 | Dexterous Grasp Transformer | Guo-Hao Xu (Sun Yat-sen University) · Yi-Lin Wei (SUN YAT-SEN UNIVERSITY) · Dian Zheng (None) · Xiao-Ming Wu (SUN YAT-SEN UNIVERSITY) · Wei-Shi Zheng (SUN YAT-SEN UNIVERSITY) |
119 | CuVLER: Enhanced Unsupervised Object Discoveries through Exhaustive Self-Supervised Transformers | Shahaf Arica (Technion - Israel Institute of Technology) · Or Rubin (Technion - Israel Institute of Technology) · Sapir Gershov (Technion - Israel Institute of Technology) · Shlomi Laufer (Technion) |
120 | Beyond Text: Frozen Large Language Models in Visual Signal Comprehension | Lei Zhu (Peking University) · Fangyun Wei (None) · Yanye Lu (Peking University) |
121 | Rethinking Boundary Discontinuity Problem for Oriented Object Detection | Hang Xu (Hangzhou Dianzi University) · Xinyuan Liu (Institute of Computing Technology, Chinese Academy of Sciences) · Haonan Xu (ICT, Chinese Academy of Sciences) · Yike Ma (Chinese Academy of Sciences) · Zunjie Zhu (Hangzhou Dianzi University) · Chenggang Yan (Hangzhou Dianzi University, Tsinghua University) · Feng Dai (ICT, Chinese Academy of Sciences) |
122 | LAA-Net: Localized Artifact Attention Network for High-Quality Deepfakes Detection | Dat NGUYEN (University of Luxembourg) · Nesryne Mejri (SnT, University of Luxembourg) · Inder Pal Singh (University of Luxemburg) · Polina Kuleshova (University of Luxemburg) · Marcella Astrid (University of Luxemburg) · Anis Kacem (University of Luxemburg) · Enjie Ghorbel (CRISTAL laboratory, ENSI, University of Manouba) · Djamila Aouada (None) |
123 | Adaptive Random Feature Regularization on Fine-tuning Deep Neural Networks | Shin'ya Yamaguchi (Kyoto University) · Sekitoshi Kanai (NTT) · Kazuki Adachi (NTT) · Daiki Chijiwa (NTT, The University of Tokyo) |
124 | Transcriptomics-guided Slide Representation Learning in Computational Pathology | Guillaume Jaume (Harvard University) · Lukas Oldenburg (Brigham and Women's Hospital, Harvard Medical School) · Anurag Vaidya (Massachusetts Institute of Technology) · Richard J. Chen (Harvard University) · Drew F. K. Williamson (Massachusetts General Hospital, Harvard University) · Thomas Peeters (Harvard University) · Andrew Song (Brigham and Women's hospital) · Faisal Mahmood (Harvard University) |
125 | Modeling Dense Multimodal Interactions Between Biological Pathways and Histology for Survival Prediction | Guillaume Jaume (Harvard University) · Anurag Vaidya (Massachusetts Institute of Technology) · Richard J. Chen (Harvard University) · Drew F. K. Williamson (Massachusetts General Hospital, Harvard University) · Paul Pu Liang (Carnegie Mellon University) · Faisal Mahmood (Harvard University) |
126 | InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models | Jiun Tian Hoe (Nanyang Technological University) · Xudong Jiang (Nanyang Technological University) · Chee Seng Chan (Universiti Malaya) · Yap-peng Tan (Nanyang Technological University) · Weipeng Hu (Nanyang Technological University) |
127 | A Vision Check-up for Language Models | Pratyusha Sharma (Massachusetts Institute of Technology) · Tamar Rott Shaham (MIT) · Manel Baradad (Massachusetts Institute of Technology) · Stephanie Fu (University of California, Berkeley) · Adrian Rodriguez-Munoz (Massachusetts Institute of Technology) · Shivam Duggal (Massachusetts Institute of Technology) · Phillip Isola (None) · Antonio Torralba (MIT) |
128 | Analyzing and Improving the Training Dynamics of Diffusion Models | Tero Karras (NVIDIA) · Miika Aittala (NVIDIA) · Jaakko Lehtinen (Aalto University & NVIDIA) · Janne Hellsten (NVIDIA) · Timo Aila (NVIDIA) · Samuli Laine (NVIDIA) |
129 | Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images | Chaoqin Huang (Shanghai Jiao Tong University) · Aofan Jiang (Shanghai Jiao Tong University) · Jinghao Feng (Shanghai Jiao Tong University) · Ya Zhang (Shanghai Jiao Tong University) · Xinchao Wang (National University of Singapore) · Yanfeng Wang (Shanghai Jiao Tong University) |
130 | Adaptive Dilated Convolution from Frequency View | Linwei Chen (Beijing Institute of Technology) · Lin Gu (RIKEN / the University of Tokyo) · Dezhi Zheng (None) · Ying Fu (None) |
131 | NRDF: Neural Riemannian Distance Fields for Learning Articulated Pose Priors | Yannan He (University of Tübingen) · Garvita Tiwari (University of Tuebingen and MPI-Saarbrucken) · Tolga Birdal () · Jan Lenssen (Saarland Informatics Campus, Max-Planck Institute) · Gerard Pons-Moll (University of Tübingen) |
132 | DRESS: Instructing Large Vision-Language Models to Align and Interact with Humans via Natural Language Feedback | Yangyi Chen (School of Computer Science, University of Illinois at Urbana-Champaign) · Karan Sikka (SRI International) · Michael Cogswell (SRI International) · Heng Ji (University of Illinois, Urbana-Champaign) · Ajay Divakaran (SRI International) |
133 | Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models | Yabin Zhang (The Hong Kong Polytechnic University) · Wenjie Zhu (None) · Hui Tang (Hong Kong University of Science and Technology) · Zhiyuan Ma (None) · Kaiyang Zhou (Hong Kong Baptist University) · Lei Zhang (The Hong Kong Polytechnic University) |
134 | Evaluating Transferability in Retrieval Tasks: An Approach Using MMD and Kernel Methods | Mengyu Dai (Florida State University) · Amir Hossein Raffiee (SalesForce.com) · Aashish Jain (Salesforce) · Joshua Correa (SalesForce.com) |
135 | CodedEvents: Optimal Point-Spread-Function Engineering for 3D-Tracking with Event Cameras | Sachin Shah (University of Maryland, College Park) · Matthew Chan (Department of Computer Science, University of Maryland, College Park) · Haoming Cai (University of Maryland, College Park) · Jingxi Chen (University of Maryland College Park) · Sakshum Kulshrestha (University of Maryland, College Park) · Chahat Deep Singh (University of Maryland, College Park) · Yiannis Aloimonos (University of Maryland, College Park) · Christopher Metzler (University of Maryland, College Park) |
136 | Question Aware Vision Transformer for Multimodal Reasoning | Roy Ganz (Technion - Israel Institute of Technology, Technion) · Yair Kittenplon (Amazon) · Aviad Aberdam (Amazon AWS AI) · Elad Ben Avraham (Amazon) · Oren Nuriel (Amazon) · Shai Mazor (Amazon) · Ron Litman (Amazon AI Labs) |
137 | Binding Touch to Everything: Learning Unified Multimodal Tactile Representations | Fengyu Yang (Yale University) · Chao Feng () · Ziyang Chen (University of Michigan) · Hyoungseob Park (Yale University) · Daniel Wang (Yale University) · Yiming Dou (University of Michigan - Ann Arbor) · Ziyao Zeng (Yale University) · xien chen (Yale University) · Suchisrit Gangopadhyay (Yale University) · Andrew Owens (University of Michigan) · Alex Wong (Yale University) |
138 | From Isolated Islands to Pangea: Unifying Semantic Space for Human Action Understanding | Yonglu Li (Shanghai Jiaotong University) · Xiaoqian Wu (None) · Xinpeng Liu (Shanghai Jiao Tong University) · Zehao Wang (None) · Yiming Dou (University of Michigan - Ann Arbor) · Yikun Ji (Shanghai Jiaotong University) · Junyi Zhang () · Yixing Li (Shanghai Jiao Tong University) · Xudong LU (The Chinese University of Hong Kong) · Jingru Tan (Central South University) · Cewu Lu (Shanghai Jiao Tong University) |
139 | Your Image is My Video: Reshaping the Receptive Field via Image-To-Video Differentiable AutoAugmentation and Fusion | Sofia Casarin (Free University of Bozen) · Cynthia Ugwu (Free University of Bozen) · Sergio Escalera (Computer Vision Center) · Oswald Lanz (Free University of Bozen-Bolzano) |
140 | Dr. Bokeh: DiffeRentiable Occlusion-aware Bokeh Rendering | Yichen Sheng (Purdue University) · Zixun Yu (Purdue University) · Lu Ling (Purdue University) · Zhiwen Cao (Adobe Systems) · Xuaner Zhang (Adobe) · Xin Lu (Adobe Inc.) · Ke Xian (Nanyang Technological University) · Haiting Lin (Adobe Systems) · Bedrich Benes (Purdue University) |
141 | DyBluRF: Dynamic Neural Radiance Fields from Blurry Monocular Video | Huiqiang Sun (None) · Xingyi Li (Huazhong University of Science and Technology) · Liao Shen (Huazhong University of Science and Technology) · Xinyi Ye (School of Artificial Intelligence and Automation, Huazhong University of Science and Technology) · Ke Xian (Nanyang Technological University) · Zhiguo Cao () |
142 | S-DyRF: Reference-Based Stylized Radiance Fields for Dynamic Scenes | Xingyi Li (Huazhong University of Science and Technology) · Zhiguo Cao () · Yizheng Wu (Nanyang Technological University) · Kewei Wang (Huazhong University of Science and Technology) · Ke Xian (Nanyang Technological University) · Zhe Wang (Sensetime Group Limited) · Guosheng Lin (Nanyang Technological University) |
143 | ViewFusion: Towards Multi-View Consistency via Interpolated Denoising | Xianghui Yang (University of Sydney) · Gil Avraham (Amazon) · Yan Zuo (Amazon) · Sameera Ramasinghe (Amazon) · Loris Bazzani (Amazon) · Anton van den Hengel (University of Adelaide) |
144 | Boosting Image Restoration via Priors from Pre-trained Models | Xiaogang Xu (Zhejiang Lab) · Shu Kong (Texas A&M University) · Tao Hu (National University of Singapore) · Zhe Liu (Zhejiang Lab) · Hujun Bao (Zhejiang University) |
145 | SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery | Xin Guo (Ant Group) · Jiangwei Lao (Ant Group) · Bo Dang (Wuhan University) · Yingying Zhang (Hikvision Research Institute) · Lei Yu (antgroup) · Lixiang Ru (Wuhan University) · Liheng Zhong (Ant Group) · Ziyuan Huang (National University of Singapore) · Kang Wu (Wuhan University) · Dingxiang Hu (mybank) · HUIMEI HE (Ant Group) · Jian Wang (Institute of automation, Chinese academy of science) · Jingdong Chen (Ant Group) · Ming Yang (Ant Group) · Yongjun Zhang (None) · Yansheng Li (Wuhan University) |
146 | AVFF: Audio-Visual Feature Fusion for Video Deepfake Detection | Trevine Oorloff (University of Maryland, College Park) · Surya Koppisetti (Reality Defender Inc) · Nicolo Bonettini (Reality Defender) · Divyaraj Solanki (Reality Defender Inc.) · Ben Colman (Reality Defender) · Yaser Yacoob (University of Maryland, College Park) · Ali Shahriyari (Reality Defender) · Gaurav Bharaj (Flawless AI) |
147 | CAT-Seg: Cost Aggregation for Open-vocabulary Semantic Segmentation | Seokju Cho (Korea University) · Heeseong Shin (Korea University) · Sunghwan Hong (Korea University) · Anurag Arnab (Google) · Paul Hongsuck Seo (Google) · Seungryong Kim (Korea University) |
148 | What, when, and where? -- Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions | Brian Chen (Samsung) · Nina Shvetsova (None) · Andrew Rouditchenko (Massachusetts Institute of Technology) · Daniel Kondermann (Heidelberg University, Ruprecht-Karls-Universität Heidelberg) · Samuel Thomas (IBM Research) · Shih-Fu Chang (Columbia University) · Rogerio Feris (International Business Machines) · James Glass (Massachusetts Institute of Technology) · Hilde Kuehne (University of Bonn MIT-IBM Watson AI Lab) |
149 | Grounding Everything: Emerging Localization Properties in Vision-Language Transformers | Walid Bousselham (Johann Wolfgang Goethe Universität Frankfurt am Main) · Felix Petersen (Stanford University) · Vittorio Ferrari (Synthesia) · Hilde Kuehne (University of Bonn MIT-IBM Watson AI Lab) |
150 | Source-Free Domain Adaptation with Frozen Multimodal Foundation Model | Song Tang (University of Shanghai for Science and Technology) · Wenxin Su (University of Shanghai for Science and Technology) · Mao Ye (University of Electronic Science and Technology of China) · Xiatian Zhu (University of Surrey) |
151 | Correspondence-Free Non-Rigid Point Set Registration Using Unsupervised Clustering Analysis | Mingyang Zhao (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Jiang Jingen (Shandong University) · Lei Ma (Peking University) · Shiqing Xin (Shandong University) · Gaofeng Meng (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Dong-Ming Yan (Institute of Automation, Chinese Academy of Sciences) |
152 | Self-Supervised Dual Contouring | Ramana Sundararaman (École Polytechnique) · Roman Klokov (École Polytechnique) · Maks Ovsjanikov (Ecole Polytechnique, France) |
153 | Improving Training Efficiency of Diffusion Models via Multi-Stage Framework and Tailored Multi-Decoder Architectures | Huijie Zhang (University of Michigan - Ann Arbor) · Yifu Lu (University of Michigan - Ann Arbor) · Ismail Alkhouri (Michigan State University; University of Michigan) · Saiprasad Ravishankar (Michigan State University) · Dogyoon Song (University of Michigan - Ann Arbor) · Qing Qu (University of Michigan) |
154 | CustomListener: Text-guided Responsive Interaction for User-friendly Listening Head Generation | Xi Liu (University of Electronic Science and Technology of China) · Ying Guo (Meituan) · Cheng Zhen (Meituan) · Tong Li (Meituan) · Yingying Ao (Meituan) · Pengfei Yan (Meituan) |
155 | Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-Visual Segmentation | Qi Yang (School of Artificial Intelligence, University of Chinese Academy of Sciences) · Xing Nie (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Tong Li (Meituan) · Gaopengfei (Beijing SanKuai Online Technology Co., Ltd.) · Ying Guo (Meituan) · Cheng Zhen (Meituan) · Pengfei Yan (Meituan) · Shiming Xiang (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) |
156 | FISBe: A real-world benchmark dataset for instance segmentation of long-range thin filamentous structures | Lisa Mais (Max Delbrück Center for Molecular Medicine) · Peter Hirsch (Max Delbrück Center for Molecular Medicine) · Claire Managan (HHMI Janelia Research Campus) · Ramya Kandarpa (Environmental Resources Management (ERM)) · Josef Rumberger (Max Delbrück Center for Molecular Medicine) · Annika Reinke (German Cancer Research Center) · Lena Maier-Hein (German Cancer Research Center (DKFZ)) · Gudrun Ihrke (HHMI Janelia Research Campus) · Dagmar Kainmueller (Universität Potsdam) |
157 | LEDITS++: Limitless Image Editing using Text-to-Image Models | Manuel Brack (Technische Universität Darmstadt) · Felix Friedrich (TU Darmstadt, Hessian.AI) · Katharina Kornmeier (Align Technology) · Linoy Tsaban (Hugging Face) · Patrick Schramowski (TU Darmstadt) · Kristian Kersting (TU Darmstadt) · Apolinário Passos (Universidade de Brasília) |
158 | A General and Efficient Training for Transformer via Token Expansion | Wenxuan Huang (East China Normal University) · Yunhang Shen (Tencent) · Jiao Xie (Xiamen University) · Baochang Zhang (Beihang University) · Gaoqi He (East China Normal University) · Ke Li (Tencent) · Xing Sun (Tencent YouTu Lab) · Shaohui Lin (East China Normal University) |
159 | Permutation Equivariance of Transformers and Its Applications | Hengyuan Xu (None) · Liyao Xiang (Shanghai Jiaotong University) · Hangyu Ye (Shanghai Jiaotong University) · Dixi Yao (University of Toronto) · Pengzhi Chu (Shanghai Jiaotong University) · Baochun Li (University of Toronto) |
160 | AiOS: All-in-One-Stage 3D Wholebody Mesh Recovery | Qingping SUN (City University of Hong Kong) · Yanjun Wang (Shanghai Jiao Tong University) · Ailing Zeng (IDEA) · Wanqi Yin (SenseTime Research) · Chen Wei (SenseTime International PTE. LTD.) · Wenjia Wang (University of Hong Kong) · Haiy Mei (None) · Chi LEUNG (City University of Hong Kong) · Ziwei Liu (Nanyang Technological University) · Lei Yang (The Chinese University of Hong Kong) · Zhongang Cai (Nanyang Technological University) |
161 | LaneCPP: Continuous 3D Lane Detection using Physical Priors | Maximilian Pittner (Bosch) · Joel Janai (Robert Bosch GmbH, Bosch) · Alexandru Paul Condurache (None) |
162 | Viewpoint-Aware Visual Grounding in 3D Scenes | Xiangxi Shi (Oregon State University) · Zhonghua Wu (SenseTime) · Stefan Lee (Oregon State University) |
163 | C3: High-performance and low-complexity neural compression from a single image or video | Hyunjik Kim (DeepMind) · Matthias Bauer (Google DeepMind) · Lucas Theis (Google) · Jonathan Richard Schwarz (Harvard University) · Emilien Dupont (DeepMind) |
164 | Making Vision Transformers Truly Shift-Equivariant | Renan A. Rojas-Gomez (University of Illinois at Urbana Champaign) · Teck-Yian Lim (DSO National Laboratories) · Minh Do (VinUniversity) · Raymond A. Yeh (Purdue University) |
165 | Uncertainty-Aware Source-Free Adaptive Image Super-Resolution with Wavelet Augmentation Transformer | Yuang Ai (Institute of Automation, Chinese Academy of Sciences) · Xiaoqiang Zhou (University of Science and Technology of China) · Huaibo Huang (None) · Lei Zhang (The Hong Kong Polytechnic University) · Ran He (None) |
166 | Multimodal Prompt Perceiver: Empower Adaptiveness, Generalizability and Fidelity for All-in-One Image Restoration | Yuang Ai (Institute of Automation, Chinese Academy of Sciences) · Huaibo Huang (None) · Xiaoqiang Zhou (University of Science and Technology of China) · Jiexiang Wang (University of Science and Technology of China) · Ran He (None) |
167 | Task-Driven Exploration: Decoupling and Inter-Task Feedback for Joint Moment Retrieval and Highlight Detection | Jin Yang (Xi'an Jiaotong University) · Ping Wei (None) · Huan Li (Xi'an Jiaotong University) · Ziyang Ren (Xi'an Jiaotong University) |
168 | Data-Free Quantization via Pseudo-label Filtering | Chunxiao Fan (Hefei University of Technology) · Ziqi Wang (Hefei University of Technology) · Dan Guo (Hefei University of Technology) · Meng Wang (Hefei University of Technology) |
169 | UniGS: Unified Representation for Image Generation and Segmentation | Lu Qi (University of California, Merced) · Lehan Yang (University of Sydney) · Weidong Guo (Tencent) · Yu Xu (University of Waterloo) · Bo Du (Wuhan University) · Varun Jampani (Google Research) · Ming-Hsuan Yang (University of California at Merced) |
170 | Named Entity Driven Zero-Shot Image Manipulation | Zhida Feng (Wuhan University of Science and Technology) · Li Chen (Wuhan University of Science and Technology) · Jing Tian (National University of Singapore) · Jiaxiang Liu (Baidu) · Shikun Feng (Baidu) |
171 | NeISF: Neural Incident Stokes Field for Geometry and Material Estimation | Chenhao Li () · Taishi Ono (Sony Europe Ltd.) · Takeshi Uemori (Sony Semiconductor Solutions Corporation) · Hajime Mihara (Sony Semiconductor Solutions Corporation) · Alexander Gatto (Sony Semiconductor Solutions Europe) · Hajime Nagahara (Osaka University) · Yusuke Moriuchi (Sony Semiconductor Solutions Corporation) |
172 | When Visual Grounding Meets Gigapixel-level Large-scale Scenes: Benchmark and Approach | TAO MA (Peking University) · Bing Bai (Qiyuan Lab) · Haozhe Lin (None) · Heyuan Wang (Peking University) · Yu Wang (Qiyuan Lab) · Lin Luo (Peking University) · Lu Fang (Tsinghua University, Tsinghua University) |
173 | Scaling Up Video Summarization Pretraining with Large Language Models | Dawit Argaw Argaw (None) · Seunghyun Yoon (Adobe Research) · Fabian Caba Heilbron (Adobe Research) · Hanieh Deilamsalehy (None) · Trung Bui (Adobe Research) · Zhaowen Wang (Adobe Research) · Franck Dernoncourt (Adobe Systems) · Joon Chung (KAIST) |
174 | CoDe: An Explicit Content Decoupling Framework for Image Restoration | Enxuan Gu (Dalian University of Technology) · Hongwei Ge (Dalian University of Technology) · Yong Guo (Max-Planck Institute for Informatics) |
175 | Move Anything with Layered Scene Diffusion | Jiawei Ren (Nanyang Technological University) · Mengmeng Xu (Meta AI) · Jui-Chieh Wu (Meta) · Ziwei Liu (Nanyang Technological University) · Tao Xiang (University of Surrey) · Antoine Toisoul (Meta) |
176 | Coupled Laplacian Eigenmaps for Locally-Aware 3D Rigid Point Cloud Matching | Matteo Bastico (Mines Paris - PSL) · Etienne Decencière (Mines Paris) · Laurent Corté (Mines ParisTech) · Yannick TILLIER (Mines ParisTech) · David Ryckelynck (Mines Paris PSL University) |
177 | One-Class Face Anti-spoofing via Spoof Cue Map-Guided Feature Learning | Pei-Kai Huang (Department of Computer Science, National Tsing Hua University) · Cheng-Hsuan Chiang (National Tsing Hua University) · Tzu-Hsien Chen (National Tsing Hua University) · Jun-Xiong Chong (National Tsing Hua University) · Tyng-Luh Liu (IIS/Academia Sinica) · Chiou-Ting Hsu (National Tsing Hua University) |
178 | Rich Human Feedback for Text-to-Image Generation | Youwei Liang (University of California, San Diego) · Junfeng He (Google) · Gang Li (Google) · Peizhao Li (GE HealthCare) · Arseniy Klimovskiy (Google) · Nicholas Carolan (Google) · Jiao Sun (University of Southern California) · Jordi Pont-Tuset (Google Research) · Sarah Young (Google) · Feng Yang (Google Research) · Junjie Ke (None) · Krishnamurthy Dvijotham (Google DeepMind) · Katherine Collins (University of Cambridge) · Yiwen Luo (Research, Google) · Yang Li (Google) · Kai Kohlhoff (Google Research) · Deepak Ramachandran (Google) · Vidhya Navalpakkam (Research, Google) |
179 | Towards Accurate and Robust Architectures via Neural Architecture Search | Yuwei Ou (Sichuan University) · Yuqi Feng (Sichuan University) · Yanan Sun (Sichuan University) |
180 | FINER: Flexible spectral-bias tuning in Implicit NEural Representation by Variable-periodic Activation Functions | Zhen Liu (Nanjing University) · Hao Zhu () · Qi Zhang (Tencent AI Lab) · Jingde Fu (Nanjing University) · Weibing Deng (nanjing university) · Zhan Ma (Nanjing University) · Yanwen Guo (Nanjing University) · Xun Cao (Nanjing University) |
181 | Open-World Semantic Segmentation Including Class Similarity | Matteo Sodano (Institute of Photogrammetry and Robotics, University of Bonn (Germany)) · Federico Magistri (Rheinische Friedrich-Wilhelms Universität Bonn) · Lucas Nunes (University of Bonn) · Jens Behley (University of Bonn) · Cyrill Stachniss (University of Bonn) |
182 | Regressor-Segmenter Mutual Prompt Learning for Crowd Counting | Mingyue Guo (University of Chinese Academy of Sciences) · Li Yuan (Peking University) · Zhaoyi Yan (PengCheng Laboratory) · Binghui Chen (Alibaba Group) · Yaowei Wang (Pengcheng Laboratory) · Qixiang Ye (University of Chinese Academy of Sciences) |
183 | HEAL-SWIN: A Vision Transformer On The Sphere | Oscar Carlsson (Chalmers University of Technology) · Jan E. Gerken (Chalmers University of Technology) · Hampus Linander (Chalmers University of Technology) · Heiner Spiess (Technische Universität Berlin) · Fredrik Ohlsson (Umea University) · Christoffer Petersson (Zenseact) · Daniel Persson (Chalmers University of Technology) |
184 | KPConvX: Modernizing Kernel Point Convolution with Kernel Attention | Hugues Thomas (Apple Inc.) · Yao-Hung Hubert Tsai (Apple) · Timothy Barfoot (University of Toronto) · Jian Zhang (Apple) |
185 | Multimodal Industrial Anomaly Detection by Crossmodal Feature Mapping | Alex Costanzino (University of Bologna) · Pierluigi Zama Ramirez (University of Bologna) · Giuseppe Lisanti (University of Bologna) · Luigi Di Stefano (University of Bologna) |
186 | Probing and Mitigating Intersectional Social Biases in Vision-Language Models with Counterfactual Examples | Phillip Howard (Intel Labs) · Avinash Madasu (None) · Tiep Le (Intel) · Gustavo Lujan-Moreno (Intel) · Anahita Bhiwandiwalla (Intel) · Vasudev Lal (None) |
187 | SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution | Zhixuan Liang (The University of Hong Kong) · Yao Mu (The University of Hong Kong) · Hengbo Ma (None) · Masayoshi Tomizuka (University of California, Berkeley) · Mingyu Ding (UC Berkeley) · Ping Luo (The University of Hong Kong) |
188 | Equivariant plug-and-play image reconstruction | Matthieu Terris (INRIA) · Thomas Moreau (INRIA) · Nelly Pustelnik (CNRS) · Julián Tachella (CNRS) |
189 | Batch Normalization Alleviates the Spectral Bias in Coordinate Networks | Zhicheng Cai (Nanjing University) · Hao Zhu () · Qiu Shen (Nanjing University) · Xinran Wang (Nanjing University) · Xun Cao (Nanjing University) |
190 | Neighbor Relations Matter in Video Scene Detection | Jiawei Tan () · Hongxing Wang (Chongqing University) · Jiaxin Li (Chongqing University) · Zhilong Ou (Chongqing University) · Zhangbin Qian (Chongqing University) |
191 | QUADify: Extracting Meshes with Pixel-level Details and Materials from Images | Maximilian Frühauf (ETH Zurich & Disney Research |
192 | DiVa-360: The Dynamic Visual Dataset for Immersive Neural Fields | Cheng-You Lu (University of Technology Sydney) · Peisen Zhou (Brown University) · Angela Xing (Brown University) · Chandradeep Pokhariya (International Institute of Information Technology, Hyderabad, International Institute of Information Technology Hyderabad) · Arnab Dey (Université de Nice-Sophia Antipolis) · Ishaan Shah (International Institute of Information Technology, Hyderabad, International Institute of Information Technology Hyderabad) · Rugved Mavidipalli (Brown University) · Dylan Hu (Brown University) · Andrew Comport (CNRS) · Kefan Chen (Brown University) · Srinath Sridhar (None) |
193 | MANUS: Markerless Grasp Capture using Articulated 3D Gaussians | Chandradeep Pokhariya (International Institute of Information Technology, Hyderabad, International Institute of Information Technology Hyderabad) · Ishaan Shah (International Institute of Information Technology, Hyderabad, International Institute of Information Technology Hyderabad) · Angela Xing (Brown University) · Zekun Li (Tencent AI Lab) · Kefan Chen (Brown University) · Avinash Sharma (International Institute of Information Technology Hyderabad) · Srinath Sridhar (None) |
194 | APISR: Anime Production Inspired Real-World Anime Super-Resolution | Boyang Wang (University of Michigan - Ann Arbor) · Fengyu Yang (Yale University) · Xihang Yu (University of Michigan - Ann Arbor) · Chao Zhang (Zhejiang University) · Hanbin Zhao (Zhejiang University) |
195 | GALA: Generating Animatable Layered Assets from a Single Scan | Taeksoo Kim (Seoul National University) · Byungjun Kim (Seoul National University) · Shunsuke Saito (Reality Labs Research) · Hanbyul Joo (None) |
196 | Selective, Interpretable and Motion Consistent Privacy Attribute Obfuscation for Action Recognition | Filip Ilic (Technische Universität Graz) · He Zhao (York University) · Thomas Pock (Graz University of Technology) · Richard P. Wildes (York University) |
197 | On the Estimation of Image-matching Uncertainty in Visual Place Recognition | Mubariz Zaffar (Delft University of Technology) · Liangliang Nan (Delft University of Technology) · Julian F. P. Kooij (Delft University of Technology) |
198 | MULDE: Multiscale Log-Density Estimation via Denoising Score Matching for Video Anomaly Detection | Jakub Micorek (Technische Universität Graz) · Horst Possegger (Graz University of Technology) · Dominik Narnhofer (Technische Universität Graz) · Horst Bischof (Graz University of Technology) · Mateusz Kozinski (Technische Universität Graz) |
199 | URHand: Universal Relightable Hands | Zhaoxi Chen (Nanyang Technological University) · Gyeongsik Moon (None) · Kaiwen Guo (Google) · Chen Cao (Facebook) · Stanislav Pidhorskyi (Meta) · Tomas Simon (Meta) · Rohan Joshi (Facebook) · Yuan Dong (Facebook) · Yichen Xu (Meta platforms inc) · Bernardo Pires (Meta Platforms Inc.) · He Wen (Meta Platforms, Inc.) · Lucas Evans (Meta) · Bo Peng (Meta Platforms Inc.) · Julia Buffalini (Meta) · Autumn Trimble (Meta) · Kevyn McPhail (Meta) · Melissa Schoeller (Meta Platforms Inc) · Shoou-I Yu (Reality Labs Research, Meta) · Javier Romero (None) · Michael Zollhoefer (Meta) · Yaser Sheikh (Meta) · Ziwei Liu (Nanyang Technological University) · Shunsuke Saito (Reality Labs Research) |
200 | Pose-Transformed Equivariant Network for 3D Point Trajectory Prediction | Ruixuan Yu (Shandong University) · Jian Sun (Xi'an Jiaotong University) |
201 | LeGO: Leveraging a Surface Deformation Network for Animatable Stylized Face Generation with One Example | Soyeon Yoon (Korea Advanced Institute of Science & Technology) · Kwan Yun (Korea Advanced Institute of Science & Technology) · Kwanggyoon Seo (KAIST) · Sihun Cha (Korea Advanced Institute of Science and Technology) · Jung Eun Yoo (Korea Advanced Institute of Science & Technology) · Junyong Noh (Korea Advanced Institute of Science and Technology) |
202 | StyleCineGAN: Landscape Cinemagraph Generation using a Pre-trained StyleGAN | Jongwoo Choi (Visual Media Lab, KAIST) · Kwanggyoon Seo (KAIST) · Amirsaman Ashtari (MD Anderson Cancer Center) · Junyong Noh (Korea Advanced Institute of Science and Technology) |
203 | How to Train Neural Field Representations: A Comprehensive Study and Benchmark | Samuele Papa (University of Amsterdam) · Riccardo Valperga (University of Amsterdam) · David Knigge (University of Amsterdam) · Miltiadis Kofinas (University of Amsterdam) · Phillip Lippe (University of Amsterdam) · Jan-Jakob Sonke (Netherlands Cancer Institute) · Efstratios Gavves () |
204 | Open-vocabulary object 6D pose estimation | Jaime Corsetti (Fondazione Bruno Kessler & University of Trento) · Davide Boscaini (Fondazione Bruno Kessler) · Changjae Oh (Queen Mary University London) · Andrea Cavallaro (EPFL - EPF Lausanne) · Fabio Poiesi (Fondazione Bruno Kessler) |
205 | Amodal Ground Truth and Completion in the Wild | Guanqi Zhan (VGG, University of Oxford) · Chuanxia Zheng (University of Oxford) · Weidi Xie (Shanghai Jiaotong University) · Andrew Zisserman (University of Oxford) |
206 | Person in Place: Generating Associative Skeleton-Guidance Maps for Human-Object Interaction Image Editing | ChangHee Yang (LG Electronics) · Chan Hee Kang (Sogang University) · Kyeongbo Kong (Pusan National University) · Hanni Oh (Sogang University) · Suk-Ju Kang (Sogang University) |
207 | HUNTER: Unsupervised Human-centric 3D Detection via Transferring Knowledge from Synthetic Instances to Real Scenes | Yichen Yao (ShanghaiTech University) · Zimo Jiang (ShanghaiTech University) · YUJING SUN (the University of Hong Kong, University of Hong Kong) · Zhencai Zhu (Innovation Academy for Microsatellites) · Xinge Zhu (The Chinese University of Hong Kong) · Runnan Chen (None) · Yuexin Ma (ShanghaiTech University) |
208 | Diff-BGM: A Diffusion Model for Video Background Music Generation | Sizhe Li (Peking University) · Yiming Qin (Peking University) · Minghang Zheng (Peking University) · Xin Jin (Beijing Electronic Science and Technology Institute) · Yang Liu (Peking University) |
209 | Distilling Semantic Priors from SAM to Efficient Image Restoration Models | Quan Zhang (Tsinghua University, Tsinghua University) · Xiaoyu Liu (None) · Wei Li (Huawei Noah's Ark Lab) · Hanting Chen (Huawei Technologies Ltd.) · Junchao Liu (Huawei Noah's Ark Lab) · Jie Hu (Huawei Technologies Ltd.) · Zhiwei Xiong (None) · Chun Yuan (Tsinghua University, Tsinghua University) · Yunhe Wang (Huawei Noah's Ark Lab) |
210 | Time-, Memory- and Parameter-Efficient Visual Adaptation | Otniel-Bogdan Mercea (University of Tübingen) · Alexey Gritsenko (Google) · Cordelia Schmid (Inria / Google) · Anurag Arnab (Google) |
211 | MeshPose: Unifying DensePose and 3D Body Mesh reconstruction | Eric-Tuan Le (University College London) · Antonios Kakolyris (Snap Inc.) · Petros Koutras (Snap Inc.) · Himmy Tam (Snap Inc.) · Efstratios Skordos (Snap Inc.) · George Papandreou (Snap Inc.) · Riza Alp Guler (Snap Inc.) · Iasonas Kokkinos (Snap Inc.) |
212 | OpenBias: Open-set Bias Detection in Generative Models | Moreno D'Incà (University of Trento) · Elia Peruzzo (University of Trento) · Massimiliano Mancini (University of Trento) · Dejia Xu (University of Texas at Austin) · Vidit Goel (Snap Inc.) · Xingqian Xu (University of Illinois, Urbana Champaign) · Zhangyang Wang (University of Texas at Austin) · Humphrey Shi (U of Oregon |
213 | Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models | Pablo Marcos-Manchón (Universitat de Barcelona) · Roberto Alcover () · Juan SanMiguel (Universidad Autónoma de Madrid) · Jose M. Martinez (Universidad Autónoma de Madrid) |
214 | LAENeRF: Local Appearance Editing for Neural Radiance Fields | Lukas Radl (Graz University of Technology) · Michael Steiner (Technische Universität Graz) · Andreas Kurz (Technische Universität Graz) · Markus Steinberger (Technische Universität Graz) |
215 | ZONE: Zero-Shot Instruction-Guided Local Editing | Shanglin Li (Beijing University of Aeronautics and Astronautics) · Bohan Zeng (Beijing University of Aeronautics and Astronautics) · Yutang Feng (Beijing University of Aeronautics and Astronautics) · Sicheng Gao (Bayerische Julius-Maximilians-Universität Würzburg) · Xuhui Liu (Beihang University) · Jiaming Liu (Xiaohongshu) · Li Lin (Xiamen University) · Xu Tang (Shanghaitech University) · Yao Hu (Zhejiang University, Tsinghua University) · Jianzhuang Liu (Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences) · Baochang Zhang (Beihang University) |
216 | Universal Semi-Supervised Domain Adaptation by Mitigating Common-Class Bias | Wenyu Zhang (I2R, ASTAR) · Qingmu Liu (National University of Singapore) · Felix Cong (National University of Singapore) · Mohamed Ragab (Institute for Infocomm Research , ASTAR) · Chuan-Sheng Foo (Centre for Frontier AI Research, A*STAR) |
217 | SceneTex: High-Quality Texture Synthesis for Indoor Scenes via Diffusion Priors | Dave Chen (Technische Universität München) · Haoxuan Li (Technische Universität München) · Hsin-Ying Lee (Snap Inc.) · Sergey Tulyakov (Snap Inc.) · Matthias Nießner (Technical University of Munich) |
218 | CADTalk: An Algorithm and Benchmark for Semantic Commenting of CAD Programs | Haocheng Yuan (University of Edinburgh) · Jing Xu (University of Edinburgh, University of Edinburgh) · Hao Pan (Microsoft Research) · Adrien Bousseau (INRIA) · Niloy J. Mitra (University College London) · Changjian Li (University of Edinburgh) |
219 | Looking 3D: Anomaly Detection with 2D-3D Alignment | Ankan Kumar Bhunia (The University of Edinburgh) · Changjian Li (University of Edinburgh) · Hakan Bilen (University of Edinburgh) |
220 | Self-Supervised Class-Agnostic Motion Prediction with Spatial and Temporal Consistency Regularizations | Kewei Wang (Huazhong University of Science and Technology) · Yizheng Wu (Nanyang Technological University) · Jun Cen (None) · Zhiyu Pan (None) · Xingyi Li (Huazhong University of Science and Technology) · Zhe Wang (Sensetime Group Limited) · Zhiguo Cao () · Guosheng Lin (Nanyang Technological University) |
221 | Autoregressive Queries for Adaptive Tracking with Spatio-Temporal Transformers | Jinxia Xie (Guangxi Normal University) · Bineng Zhong (Guangxi Normal University) · Zhiyi Mo (Wuzhou university) · Shengping Zhang (Harbin Institute of Technology) · Liangtao Shi (Guangxi Normal University) · Shuxiang Song (Guangxi Normal University) · Rongrong Ji (Xiamen University) |
222 | IDGuard: Robust, General, Identity-centric POI Proactive Defense Against Face Editing Abuse | Yunshu Dai (SUN YAT-SEN UNIVERSITY) · Jianwei Fei (Nanjing University of Information Science and Technology) · Fangjun Huang (SUN YAT-SEN UNIVERSITY) |
223 | HiPose: Hierarchical Binary Surface Encoding and Correspondence Pruning for RGB-D 6DoF Object Pose Estimation | Yongliang Lin (Zhejiang University) · Yongzhi Su (Apple) · Praveen Nathan (German Research Center for AI) · Sandeep Inuganti (German Research Center for AI) · Yan Di (Technische Universität München) · Martin Sundermeyer (None) · Fabian Manhardt (Google) · Didier Stricker (Universität Kaiserslautern) · Jason Rambach (None) · Yu Zhang (Zhejiang University) |
224 | SemCity: Semantic Scene Generation with Triplane Diffusion | Jumin Lee (Korea Advanced Institute of Science and Technology) · Sebin Lee (Korea Advanced Institute of Science and Technology (KAIST)) · Changho Jo (Neosapience) · Woobin Im (Korea Advanced Institute of Science and Technology) · Ju-hyeong Seon (Korea Advanced Institute of Science & Technology) · Sung-Eui Yoon (KAIST) |
225 | Tri-Perspective View Decomposition for Geometry-Aware Depth Completion | Zhiqiang Yan (Nanjing University of Science and Technology) · Yuankai Lin (Huazhong University of Science and Technology) · Kun Wang (Nanjing University of Science and Technology) · Yupeng Zheng (Institute of Automation,Chinese Academy of Sciences) · Yufei Wang (Northwest Polytechnical University Xi'an) · Zhenyu Zhang (None) · Jun Li (Nanjing University of Science and Technology) · Jian Yang (Nanjing University of Science and Technology) |
226 | Improving Depth Completion via Depth Feature Upsampling | Yufei Wang (Northwest Polytechnical University Xi'an) · Ge Zhang (Northwest Polytechnical University Xi'an) · Shaoqian Wang (Northwest Polytechnical University Xi'an) · Bo Li (None) · Qi Liu (Northwest Polytechnical University Xi'an) · Le Hui (Nanjing University Of Science And Technology) · Yuchao Dai (Northwestern Polytechnical University) |
227 | Decompose-and-Compose: A Compositional Approach to Mitigating Spurious Correlation | Fahimeh HosseiniNoohdani (Sharif University of Technology) · Parsa Hosseini (Sharif University of Technology) · Aryan Yazdan Parast (Sharif University of Technology) · Hamidreza Araghi (Sharif University of Technology) · Mahdieh Baghshah (Sharif University of Technology) |
228 | WildlifeMapper: Aerial Image Analysis for Multi-Species Detection and Identification | Satish Kumar (None) · Bowen Zhang (University of California, Santa Barbara) · Chandrakanth Gudavalli (University of California, Santa Barbara) · Connor Levenson (University of California, Santa Barbara) · Lacey Hughey (Smithsonian National Zoo and Conservation Biology Institute) · Jared Stabach (Smithsonian Conservation Biology Institute) · Irene Amoke (Kenya Wildlife Trust) · Gordon Ojwang (University of Groningen) · Joseph Mukeka (Wildlife Research and Training Institute) · Howard Frederick (Tanzania Wildlife Research Institute) · Stephen Mwiu (Wildlife Research and Training Institute) · Joseph Ochieng Ogutu (Universität Hohenheim) · B S Manjunath (UC Santa Barbara) |
229 | The devil is in the fine-grained details: Evaluating open-vocabulary object detectors for fine-grained understanding | Lorenzo Bianchi (CNR-ISTI) · Fabio Carrara (CNR-ISTI) · Nicola Messina (Institute of Information Science and Technologies - National Research Council (ISTI-CNR)) · Claudio Gennaro (CNR) · Fabrizio Falchi (CNR) |
230 | EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars | Nikita Drobyshev (Meta) · Antoni Bigata Casademunt (Imperial College London) · Konstantinos Vougioukas (Facebook) · Zoe Landgraf (Facebook) · Stavros Petridis (Facebook) · Maja Pantic (Facebook) |
231 | Text-Guided Variational Image Generation for Industrial Anomaly Detection and Segmentation | LEE MIN GYU (Chung-Ang University, LGCNS) · Jongwon Choi (Chung-Ang University) |
232 | Poly Kernel Inception Network for Remote Sensing Detection | Xinhao Cai (Nanjing University of Science and Technology) · Qiuxia Lai (Communication University of China) · Yuwei Wang (Nanjing University of Science and Technology) · Wenguan Wang (Zhejiang University) · Zeren Sun (Nanjing University of Science and Technology) · Yazhou Yao (Nanjing University of Science and Technology) |
233 | Initialization Matters for Adversarial Transfer Learning | Andong Hua (University of California, Santa Barbara) · Jindong Gu (University of Oxford) · Zhiyu Xue (University of California, Santa Barbara) · Nicholas Carlini (None) · Eric Wong (University of Pennsylvania) · Yao Qin (University of California, Santa Barbara) |
234 | RNb-NeuS: Reflectance and Normal-based Multi-View 3D Reconstruction | Baptiste Brument (IRIT, University of Toulouse, France) · Robin Bruneau (University of Copenhagen) · Yvain Queau (CNRS) · Jean Mélou (IRIT) · Francois Lauze (Department of Computer Science, University of Copenhagen) · Jean-Denis Durou (IRIT) · Lilian Calvet (OR-X, Balgrist Hospital, University of Zurich) |
235 | OpticalDR: A Deep Optical Imaging Model for Privacy-Protective Depression Recognition | Yuchen Pan (Harbin Institute of Technology) · Junjun Jiang (Harbin Institute of Technology) · Kui Jiang (Harbin Institute of Technology) · Zhihao Wu (Harbin Institute of Technology, Shenzhen) · Keyuan Yu (Harbin Institute of Technology) · Xianming Liu (Harbin Institute of Technology) |
236 | Optimizing Diffusion Noise Can Serve As Universal Motion Priors | Korrawe Karunratanakul (ETH Zurich) · Konpat Preechakul (University of California, Berkeley) · Emre Aksan (Google) · Thabo Beeler (Google) · Supasorn Suwajanakorn (Vidyasirimedhi Institute of Science and Technology) · Siyu Tang (ETH Zurich) |
237 | 3D Geometry-aware Deformable Gaussian Splatting for Dynamic View Synthesis | Zhicheng Lu (Northwest Polytechnical University Xi'an) · xiang guo (Northwest Polytechnical University Xi'an) · Le Hui (Nanjing University Of Science And Technology) · Tianrui Chen (Northwest Polytechnical University Xi'an) · Min Yang (None) · Xiao Tang (None) · feng zhu (None) · Yuchao Dai (Northwestern Polytechnical University) |
238 | Space-time Diffusion Features for Zero-shot Text-driven Motion Transfer | Rafail Fridman (Weizmann Institute of Science) · Danah Yatim (Weizmann Institute of Science) · Omer Bar-Tal (Weizmann Institute of Science) · Yoni Kasten (NVIDIA Research) · Tali Dekel (Weizmann Institute of Science) |
239 | DiVAS: Video and Audio Synchronization with Dynamic Frame Rates | Clara Maria Fernandez Labrador (Disney Research) · Mertcan Akcay (Disney Research) · Eitan Abecassis (Walt Disney Company) · Joan Massich (Disney Research) · Christopher Schroers (Disney Research |
240 | Stationary Representations: Optimally Approximating Compatibility and Implications for Improved Model Replacements | Niccolò Biondi (University of Florence, it) · Federico Pernici (University of Florence Italy) · Simone Ricci (University of Florence) · Alberto Bimbo (Universita di Firenze) |
241 | RankED: Addressing Imbalance and Uncertainty in Edge Detection Using Ranking-based Losses | bedrettin cetinkaya (Middle East Technical University) · Sinan Kalkan (Middle East Technical University) · Emre Akbas (METU) |
242 | DualAD: Disentangling the Dynamic and Static World for End-to-End Driving | Simon Doll (Eberhard-Karls-Universität Tübingen) · Niklas Hanselmann (Mercedes Benz Research & Development) · Lukas Schneider (Mercedes Benz Research & Development) · Richard Schulz (Mercedes Benz AG) · Marius Cordts (Mercedes-Benz AG) · Markus Enzweiler (Esslingen University of Applied Sciences) · Hendrik Lensch (University of Tübingen) |
243 | Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives | Kristen Grauman (University of Texas at Austin) · Andrew Westbury (Facebook AI Research) · Lorenzo Torresani (Facebook) · Kris Kitani (Carnegie Mellon University) · Jitendra Malik (University of California at Berkeley) · Triantafyllos Afouras (University of Oxford) · Kumar Ashutosh (None) · Vijay Baiyya (University of Louisiana at Lafayette) · Siddhant Bansal (University of Bristol, UK) · Bikram Boote (University of Illinois, Urbana Champaign) · Eugene Byrne (Meta) · Zachary Chavis (University of Minnesota) · Joya Chen (National University of Singapore) · Feng Cheng (University of North Carolina at Chapel Hill) · Fu-Jen Chu (Facebook) · Sean Crane (School of Computer Science, Carnegie Mellon University) · Avijit Dasgupta (IIIT Hyderabad) · Jing Dong (Meta) · Maria Escobar (Universidad de Los Andes) · Cristhian Forigua (Universidad de Los Andes) · Abrham Gebreselasie (Carnegie Mellon University) · Sanjay Haresh (Qualcomm Inc, QualComm) · Jing Huang (Facebook) · Md Mohaiminul Islam (UNC Chapel Hill) · Suyog Jain (PathAI) · Rawal Khirodkar (Meta) · Devansh Kukreja (Carnegie Mellon University) · Kevin Liang (FAIR at Meta) · Jia-Wei Liu (National University of Singapore) · Sagnik Majumder (UT Austin & Meta AI) · Yongsen Mao (Simon Fraser University) · Miguel Martin (Meta Platforms, Inc.) · Effrosyni Mavroudi () · Tushar Nagarajan (Meta) · Francesco Ragusa (None) · Santhosh Kumar Ramakrishnan (University of Texas, Austin) · Luigi Seminara (University of Catania) · Arjun Somayazulu (University of Texas at Austin) · Yale Song (Meta) · Shan Su (University of Pennsylvania) · Zihui Xue (None) · Edward Zhang (University of Pennsylvania, University of Pennsylvania) · Jinxu Zhang (University of Pennsylvania, University of Pennsylvania) · Angela Castillo (Universidad de Los Andes) · Changan Chen (University of Texas at Austin) · Fu Xinzhu (National University of Singapore) · Ryosuke Furuta (The University of Tokyo) · Cristina González (Universidad de Los Andes) · Gupta (None) · Jiabo Hu (Facebook) · Yifei Huang (The University of Tokyo) · Yiming Huang (University of Pennsylvania) · Weslie Khoo (Indiana University) · Anush Kumar (Torc Robotics) · Robert Kuo (Facebook) · Sach Lakhavani (None) · Miao Liu (META AI) · Mi Luo (The University of Texas at Austin) · Zhengyi Luo (Carnegie Mellon University) · Brighid Meredith (meta) · Austin Miller (Meta) · Oluwatumininu Oguntola (University of North Carolina at Chapel Hill) · Xiaqing Pan (Meta) · Penny Peng (Meta) · Shraman Pramanick (None) · Merey Ramazanova (KAUST) · Fiona Ryan (Georgia Institute of Technology) · Wei Shan (University of North Carolina at Chapel Hill) · Kiran Somasundaram (None) · Chenan Song (national university of singapore, National University of Singapore) · Audrey Southerland (Georgia Institute of Technology) · Masatoshi Tateno (AIST, National Institute of Advanced Industrial Science and Technology) · Huiyu Wang (Facebook) · Yuchen Wang (Indiana University) · Takuma Yagi (None) · Mingfei Yan (None) · Xitong Yang (Meta) · Zecheng Yu (University of Tokyo) · Shengxin Zha (Meta GenAI) · Chen Zhao (King Abdullah University of Science and Technology (KAUST)) · Ziwei Zhao (Indiana University) · Zhifan Zhu (University of Bristol) · Jeff Zhuo (University of North Carolina at Chapel Hill) · Pablo ARBELAEZ (Universidad de los Andes) · Gedas Bertasius (UNC Chapel Hill) · Dima Damen () · Jakob Engel (Research, Meta Reality Labs) · Giovanni Maria Farinella (University of Catania, Italy) · Antonino Furnari (University of Catania) · Bernard Ghanem (KAUST) · Judy Hoffman (Georgia Institute of Technology) · C.V. Jawahar (IIIT-Hyderabad) · Richard Newcombe (Meta, Reality Labs Research) · Hyun Soo Park (The University of Minnesota) · James Rehg (None) · Yoichi Sato (University of Tokyo) · Manolis Savva (Simon Fraser University) · Jianbo Shi (None) · Mike Zheng Shou (National University of Singapore) · Michael Wray (University of Bristol) |
244 | SIGNeRF: Scene Integrated Generation for Neural Radiance Fields | Jan-Niklas Dihlmann (Eberhard-Karls-Universität Tübingen) · Andreas Engelhardt (University of Tübingen) · Hendrik Lensch (University of Tübingen) |
245 | SHINOBI: SHape and Illumination using Neural Object decomposition via BRDF optimization and Inverse rendering from unconstrained Image collections | Andreas Engelhardt (University of Tübingen) · Amit Raj (Google) · Mark Boss (Stability AI) · Yunzhi Zhang (Stanford University) · Abhishek Kar (Google) · Yuanzhen Li (Massachusetts Institute of Technology) · Ricardo Martin-Brualla (Google) · Jonathan T. Barron (Google) · Deqing Sun (Google) · Hendrik Lensch (University of Tübingen) · Varun Jampani (Google Research) |
246 | Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement | Ziyu Wang (Shanghai Jiao Tong University) · Yue Xu (Shanghai Jiao Tong University) · Cewu Lu (Shanghai Jiao Tong University) · Yonglu Li (Shanghai Jiaotong University) |
247 | DyMVHumans: A Multi-View Video Benchmark for High-Fidelity Dynamic Human Modeling | Xiaoyun Zheng (Peking University Shenzhen Graduate School) · Liwei Liao (Peking University) · Xufeng Li (Cityu) · Jianbo Jiao (University of Birmingham) · Rongjie Wang (PengCheng Laboratory) · Feng Gao (Peking University) · Shiqi Wang (City University of Hong Kong) · Ronggang Wang (Peking University Shenzhen Graduate School) |
248 | Learning Group Activity Features Through Person Attribute Prediction | Chihiro Nakatani (None) · Hiroaki Kawashima (University of Hyogo) · Norimichi Ukita (None) |
249 | Low-power, Continuous Remote Behavioral Localization with Event Cameras | Friedhelm Hamann (TU Berlin) · Suman Ghosh (TU Berlin) · Ignacio Juarez Martinez (University of Oxford) · Tom Hart (Oxford Brookes University) · Alex Kacelnik (University of Oxford) · Guillermo Gallego (TU Berlin-ECDF-SCIoI) |
250 | A Unified Framework for Microscopy Defocus Deblur with Multi-Pyramid Transformer and Contrastive Learning | Yuelin Zhang (The Chinese University of Hong Kong) · Pengyu Zheng (The Chinese University of Hong Kong) · Wanquan Yan (The Chinese University of Hong Kong) · Chengyu Fang (Tsinghua University, Tsinghua University) · Shing Shin Cheng (The Chinese University of Hong Kong) |
251 | Neural Directional Encoding for Efficient and Accurate View-Dependent Appearance Modeling | Liwen Wu (Computer Science and Engineering Department, University of California, San Diego) · Sai Bi (Adobe Systems) · Zexiang Xu (Adobe Research) · Fujun Luan (Adobe Systems) · Kai Zhang (Adobe Systems) · Iliyan Georgiev (Adobe) · Kalyan Sunkavalli (Adobe Research) · Ravi Ramamoorthi (None) |
252 | What, How, and When Should Object Detectors Update in Continually Changing Test Domains? | Jayeon Yoo (Seoul National University) · Dongkwan Lee (Seoul National University) · Inseop Chung (Seoul National University) · Donghyun Kim (MIT-IBM Watson AI Lab) · Nojun Kwak (Seoul National University) |
253 | Rethinking FID: Towards a Better Evaluation Metric for Image Generation | Sadeep Jayasumana (Google) · Srikumar Ramalingam (Google) · Andreas Veit (Google) · Daniel Glasner (Google) · Ayan Chakrabarti (Google) · Sanjiv Kumar (Google) |
254 | MarkovGen: Structured Prediction for Efficient Text-to-Image Generation | Sadeep Jayasumana (Google) · Daniel Glasner (Google) · Srikumar Ramalingam (Google) · Andreas Veit (Google) · Ayan Chakrabarti (Google) · Sanjiv Kumar (Google) |
255 | UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory | Haiwen Diao (Dalian University of Technology) · Bo Wan (KU Leuven) · Ying Zhang (Tencent) · Xu Jia (Dalian University of Technology) · Huchuan Lu (Dalian University of Technology) · Long Chen (HKUST) |
256 | L-MAGIC: Language Model Assisted Generation of Images with Consistency | zhipeng cai (Intel) · Matthias Mueller (None) · Reiner Birkl (Intel Corporation) · Diana Wofk (Intel) · Shao-Yen Tseng (Intel) · JunDa Cheng (Huazhong University of Science and Technology) · Gabriela Ben Melech Stan (Intel) · Vasudev Lal (None) · Michael Paulitsch (Intel) |
257 | Partial-to-Partial Shape Matching with Geometric Consistency | Viktoria Ehm (Technische Universität München) · Maolin Gao (None) · Paul Roetzer (None) · Marvin Eisenberger (Technical University Munich) · Daniel Cremers (Technical University Munich) · Florian Bernard (University of Bonn) |
258 | Learning from One Continuous Video Stream | Joao Carreira (DeepMind) · Michael King (Fit) · Viorica Patraucean (DeepMind) · Dilara Gokay (Google DeepMind) · Catalin Ionescu (Google) · Yi Yang (DeepMind) · Daniel Zoran (DeepMind) · Joseph Heyward (Google) · Carl Doersch (DeepMind) · Yusuf Aytar (Google DeepMind) · Dima Damen () · Andrew Zisserman (University of Oxford) |
259 | Purified and Unified Steganographic Network | GuoBiao Li (Fudan University) · Sheng Li (Fudan University) · Zicong Luo (Fudan University) · Zhenxing Qian (Fudan University) · Xinpeng Zhang (Fudan University) |
260 | Audio-Visual Segmentation via Unlabeled Frame Exploitation | Jinxiang Liu (Shanghai Jiao Tong University) · Yikun Liu (Shanghai Jiaotong University) · Ferenas (None) · Chen Ju () · Ya Zhang (Shanghai Jiao Tong University) · Yanfeng Wang (Shanghai Jiao Tong University) |
261 | Artist-Friendly Relightable and Animatable Neural Heads | Yingyan Xu (Department of Computer Science, ETHZ - ETH Zurich) · Prashanth Chandran (None) · Sebastian Weiss (DisneyResearch |
262 | Neural Redshift: Random Networks are not Random Functions | Damien Teney (Idiap Research Institute) · Armand Nicolicioiu (ETHZ - ETH Zurich) · Valentin Hartmann (EPFL) · Ehsan Abbasnejad (University of Adelaide) |
263 | MMA-Diffusion: MultiModal Attack on Diffusion Models | Yijun Yang (The Chinese University of Hong Kong) · Ruiyuan Gao (Department of Computer Science and Engineering, The Chinese University of Hong Kong) · Xiaosen Wang (Huazhong University of Science and Technology) · Tsung-Yi Ho (Department of Computer Science and Engineering, The Chinese University of Hong Kong) · Xu Nan (Institute of Automation, Chinese Academy of Sciences) · Qiang Xu (The Chinese University of Hong Kong) |
264 | Parameter Efficient Self-Supervised Geospatial Domain Adaptation | Linus Scheibenreif (University of St.Gallen) · Michael Mommert (Stuttgart University of Applied Sciences) · Damian Borth (University of St.Gallen) |
265 | Towards Open-Vocabulary HOI Detection via Conditional Multi-level Decoding and Fine-grained Semantic Enhancement | Ting Lei (Peking University) · Shaofeng Yin (Peking University) · Yang Liu (Peking University) |
266 | TiNO-Edit: Timestep and Noise Optimization for Robust Diffusion-Based Image Editing | Sherry X. Chen (University of California, Santa Barbara) · Yaron Vaxman (cloudinary) · Elad Ben Baruch (Cloudinary) · David Asulin (Cloudinary Ltd.) · Aviad Moreshet (Cloudinary) · Kuo-Chin Lien (Layer AI) · Misha Sra (University of California, Santa Barbara) · Pradeep Sen (UC Santa Barbara) |
267 | Zero-Painter: Training-Free Layout Control for Text-to-Image Synthesis | Marianna Ohanyan (Picsart ) · Hayk Manukyan (Picsart AI Research) · Zhangyang Wang (University of Texas at Austin) · Shant Navasardyan (Picsart AI Research) · Humphrey Shi (U of Oregon |
268 | Improving Semantic Correspondence with Viewpoint-Guided Spherical Maps | Octave Mariotti (University of Edinburgh) · Oisin Mac Aodha (University of Edinburgh) · Hakan Bilen (University of Edinburgh) |
269 | DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception | Yibo Wang (Tsinghua University) · Ruiyuan Gao (Department of Computer Science and Engineering, The Chinese University of Hong Kong) · Kai Chen (The Hong Kong University of Science and Technology) · Kaiqiang Zhou (Huawei Technologies Ltd.) · Yingjie CAI (The Chinese University of Hong Kong) · Lanqing Hong (Huawei Technologies Ltd.) · Zhenguo Li (Huawei) · Lihui Jiang (Huawei Technologies Ltd.) · Dit-Yan Yeung (Hong Kong University of Science and Technology) · Qiang Xu (The Chinese University of Hong Kong) · Kai Zhang (Shenzhen International Graduate School, Tsinghua University) |
270 | ChAda-ViT : Channel Adaptive Attention for Joint Representation Learning of Heterogeneous Microscopy Images | Nicolas Bourriez (Ecole Normale Supérieure de Paris) · Ihab Bendidi (Ecole Normale Superieure) · Cohen Ethan (Ecole Normale Supérieure de Paris) · Gabriel Watkinson (Ecole Normale Supérieure de Paris) · Maxime Sanchez (IBENS) · Guillaume Bollot (Synsight company) · Auguste Genovesio (Ecole Normale Supérieure de Paris) |
271 | Self-Discovering Interpretable Diffusion Latent Directions for Responsible Text-to-Image Generation | Hang Li (University of Munich) · Chengzhi Shen (Technische Universität München) · Philip H.S. Torr (University of Oxford) · Volker Tresp (Ludwig-Maximilians-Universität München) · Jindong Gu (University of Oxford) |
272 | GDA: Generalized Diffusion for Robust Test-time Adaptation | Yun-Yun Tsai (Columbia University) · Fu-Chen Chen (Amazon Lab126) · Albert Chen (Amazon) · Junfeng Yang (Columbia University) · Che-Chun Su (Amazon) · Min Sun (None) · Cheng-Hao Kuo (Amazon) |
273 | RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D | Lingteng Qiu (None) · Guanying Chen (The Chinese University of Hong Kong, Shenzhen) · Xiaodong Gu (Alibaba Group) · Qi Zuo (Alibaba Group) · Mutian Xu (None) · Yushuang Wu (The Chinese University of Hong Kong (Shenzhen)) · Weihao Yuan (Alibaba Group) · Zilong Dong (Alibaba Group) · Liefeng Bo (None) · Xiaoguang Han (The Chinese University of Hong Kong, Shenzhen) |
274 | GPLD3D: Latent Diffusion of 3D Shape Generative Models by Enforcing Geometric and Physical Priors | Yuan Dong (Alibaba Group) · Qi Zuo (Alibaba Group) · Xiaodong Gu (Alibaba Group) · Weihao Yuan (Alibaba Group) · zhengyi zhao (Alibaba Group) · Zilong Dong (Alibaba Group) · Liefeng Bo (None) · Qixing Huang (University of Texas at Austin) |
275 | IPoD: Implicit Field Learning with Point Diffusion for Generalizable 3D Object Reconstruction from Single RGB-D Images | Yushuang Wu (The Chinese University of Hong Kong (Shenzhen)) · Luyue Shi (The Chinese University of Hong Kong, Shenzhen) · Junhao Cai (Hong Kong University of Science and Technology) · Weihao Yuan (Alibaba Group) · Lingteng Qiu (None) · Zilong Dong (Alibaba Group) · Liefeng Bo (None) · Shuguang Cui (The Chinese University of Hong Kong, Shenzhen) · Xiaoguang Han (The Chinese University of Hong Kong, Shenzhen) |
276 | DAVE -- A Detect-and-Verify Paradigm for Low-Shot Counting | Jer Pelhan (University of Ljubljana) · Alan Lukezic (University of Ljubljana) · Vitjan Zavrtanik (University of Ljubljana) · Matej Kristan (University of Ljubljana) |
277 | RCBEVDet: Radar-camera Fusion in Bird’s Eye View for 3D Object Detection | Zhiwei Lin (Peking University) · Zhe Liu (University of Electronic Science and Technology of China) · Zhongyu Xia (Peking University) · Xinhao Wang (Peking University) · Yongtao Wang (Peking University) · Shengxiang Qi (Chongqing Changan Automobile Co., Ltd) · Yang Dong (Chongqing Changan Automobile Co., Ltd.) · Nan Dong (changan) · Le Zhang (University of Electronic Science and Technology of China) · Ce Zhu (University of Electronic Science and Technology of China) |
278 | Inversion-Free Image Editing with Natural Language | Sihan Xu (University of Michigan - Ann Arbor) · Yidong Huang (University of Michigan - Ann Arbor) · Jiayi Pan (University of California, Berkeley) · Ziqiao Ma (University of Michigan) · Joyce Chai (University of Michigan) |
279 | GROUNDHOG: Grounding Large Language Models to Holistic Segmentation | Yichi Zhang (University of Michigan) · Ziqiao Ma (University of Michigan) · Xiaofeng Gao (Amazon AGI) · Suhaila Shakiah (Amazon) · Qiaozi Gao (Amazon) · Joyce Chai (University of Michigan) |
280 | Make Me a BNN: A Simple Strategy for Estimating Bayesian Uncertainty from Pre-trained Models | Gianni Franchi (ENSTA Paris) · Olivier Laurent (Université Paris-Saclay) · Maxence Leguéry (ENSTA Paris) · Andrei Bursuc (valeo.ai) · Andrea Pilzer (NVIDIA) · Angela Yao (National University of Singapore) |
281 | NeRFDeformer: NeRF Transformation from a Single View via 3D Scene Flows | Zhenggang Tang (UIUC) · Jason Ren (Apple) · Xiaoming Zhao (UIUC) · Bowen Wen (NVIDIA) · Jonathan Tremblay (NVIDIA) · Stan Birchfield (NVIDIA) · Alexander G. Schwing (UIUC) |
282 | GaussianAvatar: Efficient Animatable Human Modeling from Monocular Video Using Gaussians-on-Mesh | Jing Wen (University of Illinois Urbana-Champaign) · Xiaoming Zhao (UIUC) · Jason Ren (Apple) · Alexander G. Schwing (UIUC) · Shenlong Wang (University of Illinois, Urbana Champaign) |
283 | Frequency Decoupling for Motion Magnification via Multi-Level Isomorphic Architecture | Fei Wang (Hefei University of Technology) · Dan Guo (Hefei University of Technology) · Kun Li (Hefei University of Technology) · Zhun Zhong (University of Trento) · Meng Wang (Hefei University of Technology) |
284 | Know Your Neighbors: Improving Single-View Reconstruction via Spatial Vision-Language Reasoning | Rui Li (None) · Tobias Fischer (Swiss Federal Institute of Technology) · Mattia Segu (ETH Zurich - Swiss Federal Institute of Technology) · Marc Pollefeys (ETH Zurich / Microsoft) · Luc Van Gool (ETH Zurich) · Federico Tombari (Google, TUM) |
285 | FairRAG: Fair Human Generation via Fair Retrieval Augmentation | Robik Shrestha (Rochester Institute of Technology) · Yang Zou (Amazon) · Qiuyu Chen (Amazon) · Zhiheng Li (Amazon AGI) · Yusheng Xie (Amazon) · Siqi Deng (Amazon) |
286 | Convolutional Prompting meets Language Models for Continual Learning | ANURAG Roy (IIT Kharagpur) · Riddhiman Moulick (Indian Institute of Technology Kharagpur) · Vinay Verma Verma (None) · Saptarshi Ghosh (Indian Institute of Technology Kharagpur) · Abir Das (Indian Institute of Technology Kharagpur) |
287 | CURSOR: Scalable Mixed-Order Hypergraph Matching with CUR Decomposition | Qixuan Zheng (City University of Hong Kong) · Ming Zhang (Hong Kong Applied Science and Technology Research Institute (ASTRI)) · Hong Yan (City University of Hong Kong) |
288 | GenFlow: Generalizable Recurrent Flow for 6D Pose Refinement of Novel Objects | Sungphill Moon (Naver Labs) · Hyeontae Son (Naver Labs) · Dongcheol Hur (NAVER LABS) · Sangwook Kim (Naver Labs) |
289 | Towards Understanding Cross and Self-Attention in Stable Diffusion for Text-Guided Image Editing | Bingyan Liu (South China University of Technology) · Chengyu Wang (Alibaba Group) · Tingfeng Cao (South China University of Technology) · Kui Jia (South China University of Technology) · Jun Huang (University of Science and Technology of China) |
290 | TeTriRF: Temporal Tri-Plane Radiance Fields for Efficient Free-Viewpoint Video | Minye Wu (KU Leuven) · Zehao Wang (KU Leuven) · Georgios Kouros (Department of Electrical Engineering, KU Leuven, Belgium, KU Leuven) · Tinne Tuytelaars (KU Leuven) |
291 | Universal Robustness via Median Random Smoothing for Real-World Super-Resolution | Zakariya Chaouai (Paris-Saclay University, CEA, List) · Mohamed Tamaazousti (CEA/LIST) |
292 | CAGE: Controllable Articulation GEneration | Jiayi Liu (None) · Hou In Ivan Tam (Simon Fraser University) · Ali Mahdavi Amiri (Simon Fraser University) · Manolis Savva (Simon Fraser University) |
293 | IIRP-Net: Iterative Inference Residual Pyramid Network for Enhanced Image Registration | Tai Ma (East China Normal University) · zhangsuwei (East China Normal University) · Jiafeng Li (East China Normal University) · Ying Wen (East China Normal University) |
294 | PromptKD: Unsupervised Prompt Distillation for Vision-Language Models | Zheng Li (Nankai University) · Xiang Li (Nankai University) · xinyi fu (Ant group) · Xin Zhang (Nankai University) · Weiqiang Wang (University of Southern California) · Shuo Chen (RIKEN) · Jian Yang (Nankai University) |
295 | LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching | Yixun Liang (Hong Kong University of Science and Technology) · Xin Yang (None) · Jiantao Lin (Hong Kong University of Science and Technology) · Haodong LI (Hong Kong University of Science and Technology) · Xiaogang Xu (Zhejiang Lab) · Ying-Cong Chen (The Hong Kong University of Science and Technology) |
296 | What Moves Together Belongs Together | Jenny Seidenschwarz (Department of Informatics, Technische Universität München) · Aljoša Ošep (Carnegie Mellon University) · Francesco Ferroni () · Simon Lucey (University of Adelaide) · Laura Leal-Taixe (NVIDIA) |
297 | SatSynth: Augmenting Image-Mask Pairs through Diffusion Models for Aerial Semantic Segmentation | Aysim Toker (Technical University Munich) · Marvin Eisenberger (Technical University Munich) · Daniel Cremers (Technical University Munich) · Laura Leal-Taixe (NVIDIA) |
298 | DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving | Chen Min (Peking University) · Dawei Zhao (Defense Innovation Institute) · Liang Xiao (Defense Innovation Institute) · Jian Zhao () · Xinli Xu (Hong Kong University of Science and Technology) · Zheng Zhu (Tsinghua University) · Lei Jin (Beijing University of Posts and Telecommunications) · Jianshu Li (Ant Group) · Yulan Guo (SUN YAT-SEN UNIVERSITY) · Junliang Xing (Tsinghua University) · Liping Jing (Beijing Jiaotong University) · Yiming Nie (National University of Defense Technology) · Bin Dai (National University of Defense Technology) |
299 | Accurate Training Data for Occupancy Map Prediction in Automated Driving using Evidence Theory | Jonas Kälble (Robert Bosch GmbH) · Sascha Wirges (Robert Bosch GmbH, Bosch) · Maxim Tatarchenko (Bosch) · Eddy Ilg (None) |
300 | HPL-ESS: Hybrid Pseudo-Labeling for Unsupervised Event-based Semantic Segmentation | Linglin Jing (Loughborough University) · Yiming Ding (Fudan University) · Yunpeng Gao (Northwest Polytechnical University Xi'an) · Zhigang Wang (Shanghai AI Lab) · Xu Yan (None) · Dong Wang (Shanghai AI Laboratory) · Gerald Schaefer (Loughborough University) · Hui Fang (Loughborough University) · Bin Zhao (Northwest Polytechnical University Xi'an) · Xuelong Li (Northwestern Polytechnical University) |
301 | Accurate Spatial Gene Expression Prediction by integrating Multi-resolution features | Youngmin Chung (Sung Kyun Kwan University) · Ji Hun Ha (Sung Kyun Kwan University) · Kyeong Chan Im (Sungkyunkwan University) · Joo Sang Lee (Sungkyunkwan University) |
302 | SAM-6D: Segment Anything Model Meets Zero-Shot 6D Object Pose Estimation | Jiehong Lin (South China University of Technology) · lihua liu (South China University of Technology) · Dekun Lu (South China University of Technology) · Kui Jia (South China University of Technology) |
303 | Characteristics Matching Based Hash Codes Generation for Efficient Fine-grained Image Retrieval | Zhen-Duo Chen (Shandong University) · Li-Jun Zhao (Shandong University) · Zi-Chao Zhang (Shandong University) · Xin Luo (Shandong University) · Xin-Shun Xu (Shandong University) |
304 | MULAN: A Multi Layer Annotated Dataset for Controllable Text-to-Image Generation | Petru-Daniel Tudosiu (King's College London, University of London) · Yongxin Yang (Queen Mary University of London) · Shifeng Zhang (Huawei Technologies Ltd.) · Fei Chen (Huawei Noah's Ark Lab) · Steven McDonagh (University of Edinburgh) · Gerasimos Lampouras (Huawei Technologies Ltd.) · Ignacio Iacobacci (Huawei Noah's Ark Lab) · Sarah Parisot (Huawei) |
305 | Accelerating Diffusion Sampling with Optimized Time Steps | Shuchen Xue (Academy of Mathematics and Systems Science, Chinese Academy of Sciences) · Zhaoqiang Liu (University of Electronic Science and Technology of China) · Fei Chen (Huawei Noah's Ark Lab) · Shifeng Zhang (Huawei Technologies Ltd.) · Tianyang Hu (Huawei Noah's Ark Lab) · Enze Xie (Huawei Noah's Ark Lab) · Zhenguo Li (Huawei) |
306 | DAP: A Dynamic Adversarial Patch for Evading Person Detectors | Amira Guesmi (New York University, Abu Dhabi) · Ruitian Ding (New York University) · Muhammad Abdullah Hanif (New York University, Abu Dhabi) · Ihsen Alouani (The Queen's University Belfast) · Muhammad Shafique (New York University) |
307 | Adaptive Fusion of Single-View and Multi-View Depth for Autonomous Driving | JunDa Cheng (Huazhong University of Science and Technology) · Wei Yin (Shenzhen DJI Sciences and Technologies Ltd.) · Kaixuan Wang (Hong Kong University of Science and Technology) · Xiaozhi Chen (DJI Innovations) · Shijie Wang (Huazhong University of Science and Technology) · Xin Yang (Huazhong University of Science and Technology) |
308 | Language-driven Grasp Detection | An Vuong (FPT Software - AIC Lab) · Minh VU (Automation & Control Institute, TU Wien) · Baoru Huang (University College London, University of London) · Nghia Nguyen (Hanoi University of Science and Technology) · Hieu Le (FPT Software AI Center) · Thieu Vo (Ton Duc Thang University) · Anh Nguyen (University of Liverpool) |
309 | SelfPose3d: Self-Supervised Multi-Person Multi-View 3d Pose Estimation | Keqi Chen (Université de Strasbourg) · vinkle srivastav (University of Strasbourg) · Nicolas Padoy (University of Strasbourg) |
310 | FMA-Net: Flow Guided Dynamic Filtering and Iterative Feature Refinement with Multi-Attention for Joint Video Super-Resolution and Deblurring | Geunhyuk Youk (None) · Jihyong Oh (Chung-Ang University) · Munchurl Kim (Korea Advanced Institute of Science and Technology) |
311 | VBench: Comprehensive Benchmark Suite for Video Generative Models | Ziqi Huang (Nanyang Technological University) · Yinan He (Sensetime Research) · Jiashuo Yu (Shanghai AI Laboratory) · Fan Zhang (None) · Chenyang Si (Sea AI Lab) · Yuming Jiang (Nanyang Technological University) · Yuanhan Zhang (Nanyang Technological University) · Tianxing Wu (Nanyang Technological University) · Jin Qingyang (Nanyang Technological University) · Nattapol Chanpaisit (Nanyang Technological University) · Yaohui Wang (Shanghai AI Laboratory) · Xinyuan Chen (Shanghai Artificial Intelligence Laboratory) · Limin Wang (Nanjing University) · Dahua Lin (The Chinese University of Hong Kong) · Yu Qiao (Shanghai Artificial Intelligence Laboratory) · Ziwei Liu (Nanyang Technological University) |
312 | Transferable Structural Sparse Adversarial Attack Via Exact Group Sparsity Training | Di Ming (Chongqing University of Technology) · Peng Ren (Chongqing University of Technology) · Yunlong Wang (IQVIA) · Xin Feng (Chongqing University of Technology) |
313 | LaRE²: Latent Reconstruction Error Based Method for Diffusion-Generated Image Detection | Yunpeng Luo (Tencent Youtu Lab) · Junlong Du (Tencent YouTu Lab) · Ke Yan () · Shouhong Ding (Tencent Youtu Lab) |
314 | Groupwise Query Specialization and Quality-Aware Multi-Assignment for Transformer-based Visual Relationship Detection | Jongha Kim (Korea University) · Jihwan Park (Korea University) · Jinyoung Park (Korea University) · Jinyoung Kim (Korea University) · Sehyung Kim (Korea University) · Hyunwoo J. Kim (Korea University) |
315 | EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling | Haiyang Liu (the university of tokyo) · Zihao Zhu (Keio University) · Giorgio Becherini (Max Planck Institute for Intelligent Systems, Max-Planck Institute) · YICHEN PENG (Japan Advanced Institute of Science and Technology, Tokyo Institute of Technology) · Mingyang Su (Tsinghua University, Tsinghua University) · YOU ZHOU (Huawei Technologies Ltd.) · Xuefei Zhe (City University of Hong Kong) · Naoya Iwamoto (Huawei Technologies Japan K.K.) · Bo Zheng (Huawei Technologies Japan) · Michael J. Black (University of Tübingen) |
316 | Discriminative Probing and Tuning for Text-to-Image Generation | Leigang Qu (National University of Singapore) · Wenjie Wang (National University of Singapore) · Yongqi Li (Hong Kong Polytechnic University) · Hanwang Zhang (Nanyang Technological University) · Liqiang Nie (Harbin Institute of Technology (Shenzhen)) · Tat-seng Chua (National University of Singapore) |
317 | CARZero: Cross-Attention Alignment for Radiology Zero-Shot Classification | Haoran Lai (University of Science and Technology of China) · Qingsong Yao (University of the Chinese Academy of Sciences) · Zihang Jiang (University of Science and Technology of China) · Rongsheng Wang (University of Science and Technology of China) · Zhiyang He (Xunfei Healthcare Technology Co., Ltd.) · Xiaodong Tao (Xunfei Healthcare Co. Ltd) · S Kevin Zhou (University of Science and Technology of China) |
318 | Adaptive Softassign via Hadamard-Equipped Sinkhorn | Binrui Shen (Xi'an Jiaotong-Liverpool University) · Qiang Niu (Xi'an Jiaotong-Liverpool University) · Shengxin Zhu (Beijing Normal University) |
319 | Ranking Distillation for Open-Ended Video Question Answering with Insufficient Labels | Tianming Liang (Sun Yat-sen University) · Chaolei Tan (SUN YAT-SEN UNIVERSITY) · Beihao Xia (Huazhong University of Science and Technology) · Wei-Shi Zheng (SUN YAT-SEN UNIVERSITY) · Jian-Fang Hu (SUN YAT-SEN UNIVERSITY) |
320 | SocialCircle: Learning the Angle-based Social Interaction Representation for Pedestrian Trajectory Prediction | Conghao Wong (None) · Beihao Xia (Huazhong University of Science and Technology) · Ziqian Zou (Huazhong University of Science and Technology) · Yulong Wang (Huazhong Agricultural University) · Xinge You (Huazhong University of Science and Technology) |
321 | Edge-Aware 3D Instance Segmentation Network with Intelligent Semantic Prior | Wonseok Roh (Korea University) · Hwanhee Jung (Korea University) · Giljoo Nam (Meta) · Jinseop Yeom (Korea University) · Hyunje Park (Korea University) · Sang Ho Yoon (KAIST) · Sangpil Kim (Korea University) |
322 | SD-DiT: Unleashing the Power of Self-supervised Discrimination in Diffusion Transformer | Rui Zhu (Chinese University of Hong Kong (Shenzhen)) · Yingwei Pan (None) · Yehao Li (JD AI Research) · Ting Yao (JD AI Research) · Zhenglong Sun (The Chinese University of Hong Kong, Shenzhen) · Tao Mei (JD Explore Academy) · Chang-Wen Chen (The Hong Kong Polytechnic University) |
323 | FreeMan: Towards benchmarking 3D human pose estimation under Real-World Conditions | Jiong WANG (Fudan University) · Fengyu Yang (Chinese University of Hong Kong(Shenzhen)) · Bingliang Li (The Chinese University of Hong Kong (Shenzhen)) · Wenbo Gou (Carnegie Mellon University) · Danqi Yan (The Chinese University of Hong Kong Shenzhen) · Ailing Zeng (IDEA) · Yijun Gao (Tencent Turing Lab) · Junle Wang (Tencent) · Yanqing Jing (Tencent) · Ruimao Zhang (The Chinese University of Hong Kong (Shenzhen)) |
324 | Towards Detailed and Robust 3D Clothed Human Reconstruction with High-Frequency and Low-Frequency Information of Parametric Body Models | Yifan Yang (South China University of Technology) · Dong Liu (South China University of Technology) · Shuhai Zhang (South China University of Technology) · Zeshuai Deng (SCUT) · Zixiong Huang (South China University of Technology) · Mingkui Tan (South China University of Technology) |
325 | Not All Prompts Are Secure: A Switchable Backdoor Attack against Pre-trained Models | Sheng Yang () · Jiawang Bai (None) · Kuofeng Gao (Tsinghua University, Tsinghua University) · Yong Yang (Tencent Security) · Yiming Li (Zhejiang University) · Shu-Tao Xia (Shenzhen International Graduate School, Tsinghua University) |
326 | One-dimensional Adapter to Rule Them All: Concepts, Diffusion Models and Erasing Applications | Mengyao Lyu (Tsinghua University) · Yuhong Yang () · Haiwen Hong (Alibaba Group) · Hui Chen (Tsinghua University, Tsinghua University) · Xuan Jin (University of Science and Technology of China) · Yuan He (Alibaba Group) · Hui Xue (Zhejiang University, Tsinghua University) · Jungong Han (Aberystwyth University) · Guiguang Ding (Tsinghua University) |
327 | VTQA: Visual Text Question Answering via Entity Alignment and Cross-Media Reasoning | Kang chenkang (Huaihai Institute of Technology) · Xiangqian Wu (Harbin Institute of Technology) |
328 | Text Grouping Adapter: Adapting Pre-trained Text Detector for Layout Analysis | Tianci Bi (Xi'an Jiaotong University) · Xiaoyi Zhang (Research, Microsoft) · Zhizheng Zhang (Microsoft Research) · Wenxuan Xie (Microsoft Research Asia) · Cuiling Lan (Microsoft) · Yan Lu (Microsoft Research Asia) · Nanning Zheng (Xi'an Jiaotong University) |
329 | AEROBLAD²E: Training-Free Detection of Latent Diffusion Images Using Autoencoder Reconstruction Error | Jonas Ricker (Ruhr University Bochum) · Denis Lukovnikov (Ruhr University Bochum) · Asja Fischer (Ruhr-Universität Bochum) |
330 | Unsupervised Semantic Segmentation Through Depth-Guided Feature Correlation and Sampling | Leon Sick (None) · Dominik Engel (Ulm University) · Pedro Hermosilla (Technische Universität Wien) · Timo Ropinski (Ulm University) |
331 | Residual Denoising Diffusion Models | Jiawei Liu (Shenyang Institute of Automation, Chinese Academy of Sciences) · Qiang Wang (Shenyang University) · Huijie Fan (None) · Yinong Wang (University of Hong Kong) · Yandong Tang (Shenyang Institute of Automation) · Liangqiong Qu (The University of Hong Kong) |
332 | Modality-agnostic Domain Generalizable Medical Image Segmentation by Multi-Frequency in Multi-Scale Attention | Ju-Hyeon Nam (Inha University) · Nur Suriza Syazwany (Inha University) · Su Kim (Inha University) · Sang-Chul Lee (Inha University) |
333 | Improving Unsupervised Hierarchical Representation with Reinforcement Learning | Ruyi An (Nanyang Technological University) · Yewen Li (Nanyang Technological University) · Xu He (Huawei Technologies Ltd.) · Pengjie Gu (Nanyang Technological University) · Mengchen Zhao (South China University of Technology) · Dong Li (Huawei Technologies Ltd.) · Jianye Hao (Tianjin University) · Bo An (Nanyang Technological University) · Chaojie Wang (Skywork AI) · Mingyuan Zhou (The University of Texas at Austin) |
334 | M³-UDA: A New Benchmark for Unsupervised Domain Adaptive Fetal Cardiac Structure Detection | Bin Pu (Hong Kong University of Science and Technology) · Liwen Wang () · Jiewen Yang (Hong Kong University of Science and Technology) · He Guannan (Sichuan University) · Xingbo Dong (Anhui University) · Shengli Li (Shenzhen Maternity and Child Healthcare Hospital) · Ying Tan (Shenzhen Maternity and Child Healthcare Hospital) · Ming Chen (Harbin Red Cross Central Hospital) · Zhe Jin (Anhui University) · Kenli Li (Hunan University) · Xiaomeng Li (The Hong Kong University of Science and Technology) |
335 | ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding | Le Xue (None) · Ning Yu (Salesforce Research) · Shu Zhang (SalesForce.com) · Artemis Panagopoulou (University of Pennsylvania) · Junnan Li (None) · Roberto Martín-Martín (University of Texas at Austin) · Jiajun Wu (Stanford University) · Caiming Xiong (Salesforce Research) · Ran Xu (SalesForce.com) · Juan Carlos Niebles (Salesforce Research) · Silvio Savarese (Salesforce) |
336 | HIVE: Harnessing Human Feedback for Instructional Visual Editing | Shu Zhang (SalesForce.com) · Xinyi Yang (Salesforce Research) · Yihao Feng (Salesforce Research) · Can Qin (Northeastern University) · Chia-Chih Chen (Salesforce) · Ning Yu (Salesforce Research) · Zeyuan Chen (SalesForce.com) · Huan Wang (SalesForce.com) · Silvio Savarese (Salesforce) · Stefano Ermon (Stanford University) · Caiming Xiong (Salesforce Research) · Ran Xu (SalesForce.com) |
337 | NeRSP: Neural 3D Reconstruction for Reflective Objects with Sparse Polarized Images | YUFEI HAN (None) · Heng Guo (Beijing University of Posts and Telecommunications) · Koki Fukai (Osaka University) · Hiroaki Santo (Osaka University) · Boxin Shi (None) · Fumio Okura (Osaka University) · Zhanyu Ma (Beijing University of Post and Telecommunication) · Yunpeng Jia (Beijing University of Posts and Telecommunications) |
338 | Learning by Correction: Efficient Tuning Task for Zero-Shot Generative Vision-Language Reasoning | Rongjie Li (SIST, ShanghaiTech University) · Yu Wu (ShanghaiTech University) · Xuming He (ShanghaiTech University) |
339 | Test-Time Backdoor Defense via Detecting and Repairing | Jiyang Guan (Institute of Automation, Chinese Academy of Sciences) · Jian Liang (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Ran He (None) |
340 | Improving Generalized Zero-Shot Learning by Exploring the Diverse Semantics from External Class Names | Yapeng Li (Wuhan University) · Yong Luo (Wuhan University) · Zengmao Wang (Wuhan University) · Bo Du (Wuhan University) |
341 | Cross-dimension Affinity Distillation for 3D EM Neuron Segmentation | Xiaoyu Liu (None) · Miaomiao Cai (University of Science and Technology of China) · Yinda Chen (University of Science and Technology of China) · Yueyi Zhang (None) · Te Shi (Institute of Artificial Intelligence, Hefei Comprehensive National Science Center) · Ruobing Zhang (Suzhou Institute of Biomedical Engineering and Technology) · Xuejin Chen (University of Science and Technology of China) · Zhiwei Xiong (None) |
342 | Flow-Guided Online Stereo Rectification for Wide Baseline Stereo | Anush Kumar (Torc Robotics) · Fahim Mannan () · Omid Hosseini Jafari (Torc Robotics) · Shile Li (Torc Robotics) · Felix Heide (Department of Computer Science, Princeton University) |
343 | PDF: A Probability-Driven Framework for Open World 3D Point Cloud Semantic Segmentation | Jinfeng Xu (Huazhong University of Science and Technology) · Siyuan Yang (HUST) · Xianzhi Li (Huazhong University of Science and Technology) · Yuan Tang (Huazhong University of Science and Technology) · yixue Hao (Huazhong University of Science and Technology) · Long Hu (Huazhong University of Science and Technology) · Min Chen (South China University of Technology) |
344 | Circuit Design and Efficient Simulation of Quantum Inner Product and Empirical Studies of Its Effect on Near-Term Hybrid Quantum-Classic Machine Learning | Hao Xiong (Shanghai Jiao Tong University) · Yehui Tang (Shanghai Jiaotong University) · Xinyu Ye (Shanghai Jiaotong University) · Junchi Yan (Shanghai Jiao Tong University) |
345 | WateRF: Robust Watermarks in Radiance Fields for Protection of Copyrights | Youngdong Jang (Korea University) · Dong In Lee (Korea University) · MinHyuk Jang (Korea University) · Jong Wook Kim (Korea University) · Feng Yang (Google Research) · Sangpil Kim (Korea University) |
346 | Style Blind Domain Generalized Semantic Segmentation via Covariance Alignment and Semantic Consistence Contrastive Learning | Woo-Jin Ahn (Korea University) · Geun-Yeong Yang (Korea University) · Hyunduck Choi (Chonnam National University) · Myo-Taeg Lim (Korea University) |
347 | DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data | Chengxiang Fan (Zhejiang University) · Muzhi Zhu (Zhejiang University) · Hao Chen (Zhejiang University) · Yang Liu (Zhejiang University) · Weijia Wu (None) · Huaqi Zhang (Hangzhou VIVO Information Technology Co., Ltd) · Chunhua Shen (Zhejiang University) |
348 | Self-Distilled Masked Auto-Encoders are Efficient Video Anomaly Detectors | Nicolae Ristea (University Politehnica of Bucharest) · Florinel Croitoru (University of Bucharest) · Radu Tudor Ionescu (None) · Marius Popescu (University of Bucharest) · Fahad Shahbaz Khan (Inception Institute of Artificial Intelligence) · Mubarak Shah (University of Central Florida) |
349 | Learning Continuous 3D Words for Text-to-Image Generation | Ta-Ying Cheng (Department of Computer Science, University of Oxford) · Matheus Gadelha (Adobe Systems) · Thibault Groueix (Adobe Systems) · Matthew Fisher (Adobe Research) · Radomir Mech (University of Calgary) · Andrew Markham (University of Oxford) · Niki Trigoni (University of Oxford) |
350 | NOPE: Novel Object Pose Estimation from a Single Image | Van Nguyen Nguyen (Ecole des Ponts ParisTech) · Thibault Groueix (Adobe Systems) · Georgy Ponimatkin (CIIRC, Czech Technical University, Czech Technical University of Prague) · Yinlin Hu (Magic Leap) · Renaud Marlet (INRIA) · Mathieu Salzmann (EPFL) · Vincent Lepetit (Ecole des Ponts ParisTech) |
351 | TutteNet: Injective 3D Deformations by Composition of 2D Mesh Deformations | Bo Sun (University of Texas, Austin) · Thibault Groueix (Adobe Systems) · Chen Song (University of Texas at Austin) · Qixing Huang (University of Texas at Austin) · Noam Aigerman (Université de Montréal) |
352 | Unmixing before Fusion: A Generalized Paradigm for Multi-modality-based Hyperspectral Image Synthesis | Yang Yu (None) · Erting Pan (Wuhan University) · Xinya Wang (Wuhan University) · Yuheng Wu (Wuhan University) · Xiaoguang Mei (Wuhan University) · Jiayi Ma (Wuhan University) |
353 | ModaVerse: Efficiently Transforming Modalities with LLMs | Xinyu Wang (University of Adelaide) · Bohan Zhuang (Monash University) · Qi Wu (University of Adelaide) |
354 | ConsistNet: Enforcing 3D Consistency for Multi-view Images Diffusion | Jiayu Yang (Australian National University) · Ziang Cheng (Australian National University) · Yunfei Duan (Tencent Game) · Pan Ji (Tencent XR Vision Labs) · Hongdong Li (Australian National University) |
355 | ProTeCt: Prompt Tuning for Taxonomic Open Set Classification | Tz-Ying Wu (University of California, San Diego) · Chih-Hui Ho (University of California San Diego) · Nuno Vasconcelos (University of California San Diego) |
356 | Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians | Yuelang Xu (Tsinghua University, Tsinghua University) · Benwang Chen (Tsinghua University, Tsinghua University) · Zhe Li (Tsinghua University) · Hongwen Zhang (Beijing Normal University) · Lizhen Wang (Tsinghua University, Tsinghua University) · Zerong Zheng (Tsinghua University) · Yebin Liu (Tsinghua University) |
357 | CPLIP: Zero-Shot Learning for Histopathology with Comprehensive Vision-Language Alignment | Sajid Javed (Khalifa University of Science and Technology) · Arif Mahmood (Information Technology University, Lahore) · IYYAKUTTI IYAPPAN GANAPATHI (Khalifa University of Science, Technology and Research) · Fayaz Ali (Khalifa University of Science, Technology and Research) · Naoufel Werghi (Khalifa University) · Mohammed Bennamoun (University of Western Australia) |
358 | 3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation | zhang songchun (None) · Yibo Zhang (None) · Quan Zheng (Institute of Software, Chinese Academy of Sciences) · Rui Ma (Jilin University) · Wei Hua (Zhejiang Lab) · Hujun Bao (Zhejiang University) · Weiwei Xu (Zhejiang University) · Changqing Zou (Zhejiang University) |
359 | CoDeF: Content Deformation Fields for Temporally Consistent Video Processing | Hao Ouyang (Department of Computer Science and Engineering, Hong Kong University of Science and Technology) · Qiuyu Wang (None) · Yuxi Xiao (Wuhan University) · Qingyan Bai (Tsinghua University) · Juntao Zhang (Hong Kong University of Science and Technology) · Kecheng Zheng (Ant Group) · Xiaowei Zhou (None) · Qifeng Chen (Hong Kong University of Science and Technology) · Yujun Shen (The Chinese University of Hong Kong) |
360 | Self-correcting LLM-controlled Diffusion | Tsung-Han Wu (University of California, Berkeley) · Long Lian (University of California, Berkeley) · Joseph Gonzalez (University of California - Berkeley) · Boyi Li (UC Berkeley / NVIDIA) · Trevor Darrell (Electrical Engineering & Computer Science Department) |
361 | Dr.Hair: Reconstructing Scalp-Connected Hair Strands without Pre-training via Differentiable Rendering of Line Segments | Yusuke Takimoto (Huawei Technologies Japan K.K.) · Hikari Takehara (Huawei Technologies Japan K.K.) · Hiroyuki Sato (Huawei Technologies Japan K.K.) · Zihao Zhu (Keio University) · Bo Zheng (Huawei Technologies Japan) |
362 | HanDiffuser: Text-to-Image Generation With Realistic Hand Appearances | Supreeth Narasimhaswamy (State University of New York, Stony Brook) · Uttaran Bhattacharya (Adobe Inc.) · Xiang Chen (Adobe Research) · Ishita Dasgupta (Department of Computer Science, University of Massachusetts at Amherst) · Saayan Mitra (Adobe Research) · Minh Hoai (State University of New York, Stony Brook) |
363 | GenN2N: Generative NeRF2NeRF Translation | Xiangyue Liu () · Han Xue (Tsinghua University) · Kunming Luo (Hong Kong University of Science and Technology) · Ping Tan (Hong Kong University of Science and Technology) · Li Yi () |
364 | UFineBench: Towards Text-based Person Retrieval with Ultra-fine Granularity | Jialong Zuo (None) · Hanyu Zhou (Huazhong University of Science and Technology) · Ying Nie (Huawei Noah's Ark Lab) · Feng Zhang (Huazhong University of Science and Technology) · Tianyu Guo (Peking University) · Nong Sang (Huazhong University of Science and Technology) · Yunhe Wang (Huawei Noah's Ark Lab) · Changxin Gao (Huazhong University of Science and Technology) |
365 | Building Optimal Neural Architectures using Interpretable Knowledge | Keith Mills (University of Alberta) · Fred Han (Huawei Technologies Ltd.) · Mohammad Salameh (Huawei Technologies Ltd.) · Shengyao Lu (University of Alberta) · CHUNHUA ZHOU (Huawei Technologies Ltd.) · Jiao He (huawei) · Fengyu Sun (Tongji University) · Di Niu (University of Alberta) |
366 | Real-time 3D-aware Portrait Video Relighting | Ziqi Cai (Chinese Academy of Sciences & Beijing Jiaotong University) · Kaiwen Jiang (None) · Shu-Yu Chen (Chinese Academy of Sciences) · Yu-Kun Lai (Cardiff University) · Hongbo Fu (City University of Hong Kong) · Boxin Shi (None) · Lin Gao (None) |
367 | Adding Universal Compatibility of Plugins for Upgraded Diffusion Model | Lingmin Ran (National University of Singapore) · Xiaodong Cun (Tencent AI Lab) · Jia-Wei Liu (National University of Singapore) · Rui Zhao (None) · Song Zijie (Fudan University) · Xintao Wang (Tencent) · Jussi Keppo (National University of Singapore) · Mike Zheng Shou (National University of Singapore) |
368 | Exploiting Style Latent Flows for Generalizing Video Deepfake Detection | Jongwook Choi (Chung-Ang University) · Taehoon Kim (Chung-Ang University) · Yonghyun Jeong (NAVER) · Seungryul Baek (UNIST) · Jongwon Choi (Chung-Ang University) |
369 | OakInk2: A Dataset of Embodied Hands-Object Manipulation in Long-Horizon Complex Task Completion | Xinyu Zhan (Shanghai Jiaotong University) · Lixin Yang (Shanghai Jiao Tong University) · Yifei Zhao (Shanghai Jiaotong University) · Kangrui Mao (Shanghai Jiao Tong University) · Hanlin Xu (Shanghai Jiaotong University) · Zenan Lin (South China University of Technology) · Kailin Li (Shanghai Jiaotong University) · Cewu Lu (Shanghai Jiao Tong University) |
370 | Unsigned Orthogonal Distance Fields: An Accurate Neural Implicit Representation for Diverse 3D Shapes | YuJie Lu (Donghua University, Shanghai) · Long Wan (Donghua University, Shanghai) · Nayu Ding (Donghua University, Shanghai) · Yulong Wang (Donghua University, Shanghai) · Shuhan Shen (Institute of Automation, Chinese Academy of Sciences) · Shen Cai (Donghua University) · Lin Gao (None) |
371 | EmoVIT: Revolutionizing Emotion Insights with Visual Instruction Tuning | Hongxia Xie (National Chiao Tung University) · Chu-Jun Peng (National Yang Ming Chiao Tung University) · Yu-Wen Tseng (Department of computer science and informational engineering, National Taiwan University) · Hung-Jen Chen (National Yang Ming Chiao Tung University) · Chan-Feng Hsu (National Chiao Tung University) · Hong-Han Shuai (National Yang Ming Chiao Tung University) · Wen-Huang Cheng (National Taiwan University) |
372 | Investigating and Mitigating the Side Effects of Noisy Views for Self-Supervised Clustering Algorithms in Practical Multi-View Scenarios | Jie Xu (University of Electronic Science and Technology of China) · Yazhou Ren (University of Electronic Science and Technology of China) · Xiaolong Wang (University of Electronic Science and Technology of China) · Lei Feng (Nanyang Technological University) · Zheng Zhang (Harbin Institute of Technology) · Gang Niu (RIKEN) · Xiaofeng Zhu (University of Electronic Science and Technology of China) |
373 | REACTO: Reconstructing Articulated Objects from a Single Video | Chaoyue Song (Nanyang Technological University) · Jiacheng Wei (Nanyang Technological University) · Chuan-Sheng Foo (Centre for Frontier AI Research, ASTAR) · Guosheng Lin (Nanyang Technological University) · Fayao Liu (Institute for Infocomm Research, ASTAR) |
374 | Modular Blind Video Quality Assessment | Wen (None) · Mu Li (Harbin Institute of Technology (Shenzhen)) · Yabin ZHANG (Bytedance) · Yiting Liao (Bytedance) · Junlin Li (ByteDance Inc.) · Li zhang (Bytedance Inc.) · Kede Ma (City University of Hong Kong) |
375 | SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing | Zeyinzi Jiang (Alibaba Group) · Chaojie Mao (Alibaba Group) · Yulin Pan (Alibaba Group, China) · Zhen Han (Alibaba Group) · Jingfeng Zhang (Alibaba Group) |
376 | Rotation-Agnostic Image Representation Learning for Digital Pathology | Saghir Alfasly (Mayo Clinic) · Abubakr Shafique (Mayo Clinic) · Peyman Nejat (Mayo Clinic) · Jibran Khan (Luther College) · Areej Alsaafin (Mayo Clinic) · Ghazal Alabtah (Mayo Clinic) · Hamid Tizhoosh (None) |
377 | MedBN: Robust Test Time Adaptation against Malicious Test Samples | Hyejin Park (Pohang University of Science and Technology (POSTECH)) · Jeongyeon Hwang (Pohang University of Science and Technology) · Sunung Mun (Pohang University of Science and Technology) · Sangdon Park (POSTECH) · Jungseul Ok (POSTECH) |
378 | Augmented Identity Distraction for Face Anonymization | Zhenzhong Kuang (Hangzhou Dianzi University) · Xiaochen Yang (Hangzhou Dianzi University) · Yingjie Shen (Hangzhou Dianzi University) · Chao Hu (Hangzhou Dianzi University) · Jun Yu (Hangzhou Dianzi University) |
379 | Continual Motion Prediction Learning Framework via Meta-Representation Learning and Optimal Memory Buffer Retention Strategy | Dae Jun Kang (None) · Dongsuk Kum (Korea Advanced Institute of Science and Technology) · Sanmin Kim (KAIST) |
380 | Robust Synthetic-to-Real Transfer for Stereo Matching | Jiawei Zhang (Beijing University of Aeronautics and Astronautics) · Jiahe Li (Beijing University of Aeronautics and Astronautics) · Lei Huang (Beihang University) · Xiaohan Yu (Macquarie University) · Lin Gu (RIKEN / the University of Tokyo) · Jin Zheng (Beijing University of Aeronautics and Astronautics) · Xiao Bai (Beijing University of Aeronautics and Astronautics) |
381 | Watermark-embedded Adversarial Examples for Copyright Protection against Diffusion Models | Peifei Zhu (None) · Tsubasa Takahashi (LY Corporation) · Hirokatsu Kataoka (LY Corporation) |
382 | Revisiting Counterfactual Problems in Referring Expression Comprehension | Zhihan Yu (Beijing University of Posts and Telecommunications) · Ruifan Li (Beijing University of Posts and Telecommunications) |
383 | Driving-Video Dehazing with Non-Aligned Regularization for Safety Assistance | Junkai Fan (Nanjing University of Science and Technology) · Jiangwei Weng (Nanjing University of Science and Technology) · Kun Wang (Nanjing University of Science and Technology) · Yijun Yang (None) · Jianjun Qian (Nanjing University of Science and Technology) · Jun Li (Nanjing University of Science and Technology) · Jian Yang (Nanjing University of Science and Technology) |
384 | High-Fidelity Hair Modeling from a Monocular Video | Keyu Wu (Zhejiang University) · LINGCHEN YANG (ETHZ - ETH Zurich) · Zhiyi Kuang (Zhejiang University) · Yao Feng (None) · Xutao Han (Zhejiang University) · Yuefan Shen (Zhejiang University) · Hongbo Fu (City University of Hong Kong) · Kun Zhou (Zhejiang University) · Youyi Zheng (Zhejiang University) |
385 | JDEC: JPEG Decoding via Enhanced Continuous Cosine Coefficients | Woo Kyoung Han (Korea University) · Sunghoon Im (DGIST) · Jaedeok Kim (NVIDIA) · Kyong Hwan Jin (Korea University) |
386 | On the Diversity and Realism of Distilled Dataset: An Efficient Dataset Distillation Paradigm | Peng Sun (None) · Bei Shi (Northwestern Polytechnical University, Northwest Polytechnical University Xi'an) · David Yu (None) · Tao Lin (Westlake University) |
387 | NC-TTT: A Noise Contrastive Approach for Test-Time Training | David OSOWIECHI (École de Technologie Supérieure, ETS Montreal) · Gustavo Vargas Hakim (École de technologie supérieure, Université du Québec) · Mehrdad Noori (École de technologie supérieure, Université du Québec) · Milad Cheraghalikhani (École de technologie supérieure, Université du Québec) · Ali Bahri (École de technologie supérieure, Université du Québec) · Moslem Yazdanpanah (École de technologie supérieure, Université du Québec) · Ismail Ben Ayed (ETS Montreal) · Christian Desrosiers (École de technologie supérieure) |
388 | FreeKD: Knowledge Distillation via Semantic Frequency Prompt | Yuan Zhang (Peking University) · Tao Huang (The University of Sydney) · Jiaming Liu (Peking University) · Tao Jiang (Zhejiang University) · Kuan Cheng (Peking University) · Shanghang Zhang (Peking University) |
389 | Cloud-Device Collaborative Learning for Multimodal Large Language Models | Guanqun Wang (Peking University) · Jiaming Liu (Peking University) · Chenxuan Li (Peking university) · Yuan Zhang (Peking University) · Ma Junpeng (Peking University) · Xinyu Wei (Peking University) · Kevin Zhang (Peking University) · Maurice Chong (Peking University) · Renrui Zhang (MMLab of CUHK & Shanghai AI Laboratory) · Yijiang Liu (Nanjing University) · Shanghang Zhang (Peking University) |
390 | Adaptive Distribution Masked Autoencoders for Continual Test-Time Adaptation | Jiaming Liu (Peking University) · Ran Xu (None) · Senqiao Yang (Harbin Institute of Technology) · Renrui Zhang (MMLab of CUHK & Shanghai AI Laboratory) · Qizhe Zhang (Peking University) · Zehui Chen (University of Science and Technology of China) · Yandong Guo (OPPO Research Institute) · Shanghang Zhang (Peking University) |
391 | FC-GNN: Recovering Reliable and Accurate Correspondences from Interferences | Haobo Xu (None) · Jun Zhou (Shanghai Jiaotong University) · Hua Yang (Shanghai Jiaotong University) · Renjie Pan (Shanghai Jiaotong University) · Cunyan Li (Shanghai Jiaotong University) |
392 | Misalignment-Robust Frequency Distribution Loss for Image Transformation | Zhangkai Ni (Tongji University) · Juncheng Wu (Tongji University) · Zian Wang (Tongji University) · Wenhan Yang (Peng Cheng Lab) · Hanli Wang (Tongji University) · Lin Ma (Meituan) |
393 | ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions | Chunlong Xia (Baidu) · Xinliang Wang (Baidu) · Feng Lv (Baidu) · Xin Hao (Beijing Institute of Technology) · Yifeng Shi (Baidu) |
394 | CORES: Convolutional Response-based Score for Out-of-distribution Detection | Keke Tang (Guangzhou University) · Chao Hou (Guangzhou University) · Weilong Peng (None) · Runnan Chen (None) · Peican Zhu (Northwest Polytechnical University Xi'an) · Wenping Wang (Texas A&M University - College Station) · Zhihong Tian (Guangzhou University) |
395 | CAPE: CAM as a Probabilistic Ensemble for Enhanced DNN Interpretation | Townim Chowdhury (None) · Kewen Liao (Australian Catholic University) · Vu Minh Hieu Phan (University of Adelaide) · Minh-Son To (Flinders University of South Australia) · Yutong Xie (University of Adelaide) · Kevin Hung (Royal Adelaide Hospital) · David Ross (University of South Australia) · Anton van den Hengel (University of Adelaide) · Johan Verjans (University of Adelaide) · Zhibin Liao (University of Adelaide) |
396 | Enhancing Visual Continual Learning with Language-Guided Supervision | Bolin Ni (Institute of Automation, Chinese Academy of Sciences) · Hongbo Zhao (Institute of Automation, Chinese Academy of Sciences) · Chenghao Zhang (Alibaba Group) · Ke Hu (Institute of Automation, Chinese Academy of Sciences) · Gaofeng Meng (Institute of Automation, Chinese Academy of Sciences) · Zhaoxiang Zhang (Institute of Automation, Chinese Academy of Sciences) · Shiming Xiang (Institute of Automation, Chinese Academy of Sciences) |
397 | On The Vulnerability of Efficient Vision Transformers to Adversarial Computation Attacks | Navaneet K L (University of California, Davis) · Soroush Abbasi Koohpayegani (University of California, Davis) · Essam Sleiman (Harvard University) · Hamed Pirsiavash (University of California, Davis) |
398 | Relation Rectification in Diffusion Model | Yinwei Wu (National University of Singapore) · Xingyi Yang (National University of Singapore) · Xinchao Wang (National University of Singapore) |
399 | Teeth-SEG: An Efficient Instance Segmentation Framework for Orthodontic Treatment based on Anthropic Prior Knowledge | Bo Zou (Computer Science, Tsinghua University) · Shaofeng Wang (Capital Medical University) · Hao Liu (Tsinghua University) · Gaoyue Sun (Imperial College London) · Yajie Wang (Tsinghua University) · Zuo FeiFei (LargeV .Inc) · Chengbin Quan (Tsinghua University) · Youjian Zhao (Tsinghua University) |
400 | ATTA: Label-Free Accuracy Estimation for Test-Time Adaptation | Taeckyung Lee (KAIST) · Sorn Chottananurak (KAIST) · Taesik Gong (Bell Labs) · Sung-Ju Lee (Korea Advanced Institute of Science & Technology) |
401 | Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding | Sicong Leng (Nanyang Technological University) · Hang Zhang (Sichuan University) · Guanzheng Chen (SUN YAT-SEN UNIVERSITY) · Xin Li (Alibaba Group) · Shijian Lu (Nanyang Technological University) · Chunyan Miao (School of Computer Science and Engineering, Nanyang Technological University) · Lidong Bing (Alibaba DAMO Academy) |
402 | Spurious Generalization Indicators - A Sanity Check on Shape Bias, Spectral Bias, and the Critical Band | Paul Gavrikov (Offenburg University) · Janis Keuper (Institute for Machine Learning and Analytics, Offenburg University) |
403 | Mining Supervision for Dynamic Regions in Self-Supervised Monocular Depth Estimation | Hoang Chuong Nguyen (Australian National University) · Tianyu Wang (Australian National University) · Jose M. Alvarez (NVIDIA) · Miaomiao Liu (Australian National University) |
404 | Bayesian Differentiable Physics for Cloth Digitalization | Deshan Gong (University of Leeds) · Ningtao Mao (University of Leeds) · He Wang (None) |
405 | Multi-Scale Video Anomaly Detection by Multi-Grained Spatio-Temporal Representation Learning | Menghao Zhang (Beijing University of Posts and Telecommunications) · Jingyu Wang (Beijing University of Post and Telecommunication, Tsinghua University) · Qi Qi (Beijing University of Posts and Telecommunications) · Haifeng Sun (Beijing University of Posts and Telecommunications) · Zirui Zhuang (Beijing University of Posts and Telecommunications) · Pengfei Ren (Beijing University of Posts and Telecommunications) · Ruilong Ma (Beijing University of Posts and Telecommunications) · Jianxin Liao (Beijing University of Posts and Telecommunications) |
406 | Dynamic Support Information Mining for Category-Agnostic Pose Estimation | Pengfei Ren (Beijing University of Posts and Telecommunications) · Yuanyuan Gao (Beijing University of Posts and Telecommunications) · Haifeng Sun (Beijing University of Posts and Telecommunications) · Qi Qi (Beijing University of Posts and Telecommunications) · Jingyu Wang (Beijing University of Post and Telecommunication, Tsinghua University) · Jianxin Liao (Beijing University of Posts and Telecommunications) |
407 | Photo-SLAM: Real-time Simultaneous Localization and Photorealistic Mapping for Monocular, Stereo, and RGB-D Cameras | Huajian Huang (The Hong Kong University of Science and Technology) · Longwei Li (SUN YAT-SEN UNIVERSITY) · Hui Cheng (SUN YAT-SEN UNIVERSITY) · Sai-Kit Yeung (The Hong Kong University of Science and Technology) |
408 | Image-Text Co-Decomposition for Text-Supervised Semantic Segmentation | Ji-Jia Wu (Department of computer science and informational engineering, National Taiwan University) · Andy Chang (National Yang Ming Chiao Tung University) · Chieh-Yu Chuang (National Chiao Tung University) · Chun-Pei Chen (National Chiao Tung University) · Yu-Lun Liu (National Yang Ming Chiao Tung University) · Min-Hung Chen (NVIDIA) · Hou-Ning Hu (MediaTek Inc.) · Yung-Yu Chuang (Department of computer science and informational engineering, National Taiwan University) · Yen-Yu Lin (National Yang Ming Chiao Tung University) |
409 | PartDistill: 3D Shape Part Segmentation by Vision-Language Model Distillation | Ardian Umam (None) · Cheng-Kun Yang (Department of computer science and informational engineering, National Taiwan University) · Min-Hung Chen (NVIDIA) · Jen-Hui Chuang (None) · Yen-Yu Lin (National Yang Ming Chiao Tung University) |
410 | HoloVIC: Large-scale Dataset and Benchmark for Multi-Sensor Holographic Intersection and Vehicle-Infrastructure Cooperative | CONG MA (Senseauto Research) · Qiao Lei (SenseAuto Research) · Chengkai Zhu (SenseAuto Research) · Kai Liu (SenseAuto Research) · Zelong Kong (SenseAuto Research) · Liqing (SenseAuto) · Xueqi Zhou (Beijing Sensetime Technology Development Co., Ltd.) · Yuheng KAN (Zhejiang University) · Wei Wu (Tsinghua University) |
411 | NeLF-Pro: Neural Light Field Probes | Zinuo You (ETH Zurich) · Andreas Geiger (University of Tübingen) · Anpei Chen (Department of Computer Science, ETHZ - ETH Zurich) |
412 | 360Loc: A Dataset and Benchmark for Omnidirectional Visual Localization with Cross-device Queries | Huajian Huang (The Hong Kong University of Science and Technology) · Changkun Liu (Hong Kong University of Science and Technology) · Yipeng Zhu (Hong Kong University of Science and Technology) · Hui Cheng (SUN YAT-SEN UNIVERSITY) · Tristan Braud (Hong Kong University of Science and Technology) · Sai-Kit Yeung (The Hong Kong University of Science and Technology) |
413 | Enhancing the Power of OOD Detection via Sample-Aware Model Selection | Feng Xue (Shanghai Jiaotong University) · Zi He (HuNan University) · Yuan Zhang (Beijing Normal University) · Chuanlong Xie (Beijing Normal University) · Zhenguo Li (Huawei) · Falong Tan (Hunan University) |
414 | Beyond First-Order Tweedie: Solving Inverse Problems using Latent Diffusion | Litu Rout (University of Texas at Austin) · Yujia Chen (Google) · Abhishek Kumar (Google DeepMind) · Constantine Caramanis (University of Texas, Austin) · Sanjay Shakkottai (University of Texas, Austin) · Wen-Sheng Chu (Google Research) |
415 | Anomaly Score: Evaluating Generative Models and Individual Generated Images based on Complexity and Vulnerability | Jaehui Hwang (Yonsei University) · Junghyuk Lee (Yonsei University) · Jong-Seok Lee (Yonsei University) |
416 | Learning Occupancy for Monocular 3D Object Detection | Liang Peng (FABU Inc) · Junkai Xu (Zhejiang University) · Haoran Cheng (College of Computer Science and Technology, Zhejiang University) · Zheng Yang (Fabu Inc) · Xiaopei Wu (Zhejiang University) · Wei Qian (Fabu Inc.) · Wenxiao Wang (Zhejiang University) · Boxi Wu (Zhejiang University) · Deng Cai (Zhejiang University) |
417 | EventEgo3D: 3D Human Motion Capture from Egocentric Event Streams | Christen Millerdurai (Max Planck Institute for Informatics) · Hiroyasu Akada (Max Planck Institute for Informatics) · Jian Wang (Max Planck Institute for Informatics) · Diogo Luvizon (Saarland Informatics Campus, Max-Planck Institute) · Christian Theobalt (MPI Informatik) · Vladislav Golyanik (MPI for Informatics) |
418 | LUWA Dataset: Learning Lithic Use-Wear Analysis on Microscopic Images | Jing Zhang (New York University) · Irving Fang (New York University) · Hao Wu (New York University) · Akshat Kaushik (New York University) · Alice Rodriguez (New York University) · Hanwen Zhao (New York University) · Juexiao Zhang (New York University) · Zhuo Zheng (Stanford University) · Radu Iovita (New York University) · Chen Feng (New York University) |
419 | Dynamic LiDAR Re-simulation using Compositional Neural Fields | Hanfeng Wu (None) · Xingxing Zuo (Caltech) · Stefan Leutenegger (Department of Informatics, Technische Universität München) · Or Litany (NVIDIA / Technion) · Konrad Schindler (ETH Zurich) · Shengyu Huang (None) |
420 | Deep Imbalanced Regression via Hierarchical Classification Adjustment | Haipeng Xiong (National University of Singapore) · Angela Yao (National University of Singapore) |
421 | All Rivers Run to the Sea: Private Learning with Asymmetric Flows | Yue Niu (USC) · Ramy E. Ali (Samsung) · Saurav Prakash (University of Illinois at Urbana-Champaign) · Salman Avestimehr (University of Southern California) |
422 | OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM | Yutao Hu (University of Hong Kong) · Tianbin (None) · Quanfeng Lu (Shanghai AI Laboratory) · Wenqi Shao (The Chinese University of Hong Kong) · Junjun He (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences) · Yu Qiao (Shanghai Artificial Intelligence Laboratory) · Ping Luo (The University of Hong Kong) |
423 | ID-like Prompt Learning for Few-Shot Out-of-Distribution Detection | Yichen Bai (None) · Zongbo Han (Tianjin University) · Bing Cao (Tianjin University) · Xiaoheng Jiang (Zhengzhou University) · Qinghua Hu (Tianjin University) · Changqing Zhang (Tianjin University) |
424 | Task-Customized Mixture of Adapters for General Image Fusion | Pengfei Zhu (Tianjin University) · Yang Sun (Tianjin University) · Bing Cao (Tianjin University) · Qinghua Hu (Tianjin University) |
425 | FineParser: A Fine-grained Spatio-temporal Action Parser for Human-centric Action Quality Assessment | Jinglin Xu (University of Science and Technology Beijing) · Sibo Yin (Peking University) · Guohao Zhao (Peking University) · Zishuo Wang (None) · Yuxin Peng (Peking University) |
426 | FineSports: A Multi-person Hierarchical Sports Video Dataset for Fine-grained Action Understanding | Jinglin Xu (University of Science and Technology Beijing) · Guohao Zhao (Peking University) · Sibo Yin (Peking University) · Wenhao Zhou (University of Science and Technology Beijing) · Yuxin Peng (Peking University) |
427 | ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object | Chenshuang Zhang (Korea Advanced Institute of Science and Technology) · Fei Pan (University of Michigan - Ann Arbor) · Junmo Kim (Korea Advanced Institute of Science and Technology) · In So Kweon (Korea Advanced Institute of Science and Technology) · Chengzhi Mao (Columbia University) |
428 | SVDinsTN: A Tensor Network Paradigm for Efficient Structure Search from Regularized Modeling Perspective | Yu-Bang Zheng (Southwest Jiaotong University) · Xile Zhao (University of Electronic Science and Technology of China) · Junhua Zeng (RIKEN) · Chao Li (RIKEN) · Qibin Zhao (RIKEN) · Heng-Chao Li (Southwest Jiaotong University) · Ting-Zhu Huang (University of Electronic Science and Technology of China) |
429 | Super-Resolution Reconstruction from Bayer-Pattern Spike Streams | Yanchen Dong (Peking University) · Ruiqin Xiong (Peking University) · Jian Zhang (None) · Zhaofei Yu (Peking University) · Xiaopeng Fan (Harbin Institute of Technology) · Shuyuan Zhu (University of Electronic Science and Technology of China) · Tiejun Huang (Peking University) |
430 | Sketch in VR, Make it Real: Rapid 3D Model Generation using VR 3D Sketching | Tianrun Chen (Zhejiang University) · Chaotao Ding (Huzhou university) · Shangzhan Zhang () · Chunan Yu (Huzhou University) · Ying Zang (Huzhou University) · Zejian Li (Zhejiang University) · Sida Peng (None) · Lingyun Sun (Zhejiang University) |
431 | Multi-View Attentive Contextualization for Multi-View 3D Object Detection | Xianpeng Liu (North Carolina State University) · Ce Zheng (University of Central Florida) · Ming Qian (None) · Nan Xue (None) · Chen Chen () · Zhebin Zhang (OPPO) · Chen Li (Innopeak Technology Inc.) · Tianfu Wu () |
432 | MAPSeg: Unified Unsupervised Domain Adaptation for Heterogeneous Medical Image Segmentation Based on 3D Masked Autoencoding and Pseudo-Labeling | Xuzhe Zhang (Columbia University) · Yuhao Wu (Duke University) · Elsa Angelini (Télécom ParisTech) · Ang Li (University of Maryland, College Park) · Jia Guo (Columbia University) · Jerod Rasmussen (University of California, Irvine) · Thomas O'Connor (University of Rochester) · Pathik Wadhwa (University of California, Irvine) · Andrea Jackowski (None) · Hai Li (Duke University) · Jonathan Posner (Duke University) · Andrew Laine (Columbia University) · Yun Wang (Duke University) |
433 | Enhancing Multi-modal Cooperation via Sample-level Modality Valuation | Yake Wei (Renmin University of China) · Ruoxuan Feng (Renmin University of China) · Zihe Wang (Renmin University of China) · Di Hu (Renmin University of China) |
434 | Passive Snapshot Coded Aperture Dual-Pixel RGB-D Imaging | Bhargav Ghanekar (Rice University) · Salman Siddique Khan (Rice University) · Pranav Sharma (Qualcomm Inc, QualComm) · Shreyas Singh (Indian Institute of Technology, Madras) · Vivek Boominathan (Rice University) · Kaushik Mitra (Indian Institute of Technology, Madras, Dhirubhai Ambani Institute Of Information and Communication Technology) · Ashok Veeraraghavan (William Marsh Rice University) |
435 | Nearest Is Not Dearest: Towards Practical Defense against Quantization-conditioned Backdoor Attacks | Boheng Li (Wuhan University) · Yishuo Cai (Central South University) · Haowei Li (Wuhan University) · Feng Xue (ZJU-Hangzhou Global Scientific and Technological Innovation Center) · Zhifeng Li (Tencent) · Yiming Li (Zhejiang University) |
436 | FSC: Few-point Shape Completion | Xianzu Wu (Jianghan University) · Xianfeng Wu (Jianghan University) · Tianyu Luan (State University of New York at Buffalo) · Yajing Bai (Jianghan University) · Zhongyuan Lai (Jianghan University) · Junsong Yuan (State University of New York at Buffalo) |
437 | BEM: Balanced and Entropy-based Mix for Long-Tailed Semi-Supervised Learning | Hongwei Zheng (Meituan) · Linyuan Zhou (meituan) · Han Li (Shanghai Jiaotong University) · Jinming Su (Meituan) · Xiaoming Wei (Meituan) · Xu Xiaoming (meituan) |
438 | Desigen: A Pipeline for Controllable Design Template Generation | Haohan Weng (South China University of Technology) · Danqing Huang (Microsoft) · YU QIAO (Central South University) · Hu Zheng (Keio University, Tokyo Institute of Technology) · Chin-Yew Lin (Microsoft) · Tong Zhang (South China University of Technology) · C. L. Philip Chen (South China University of Technology) |
439 | Large Language Models are Good Prompt Learners for Low-Shot Image Classification | Zhaoheng Zheng (None) · Jingmin Wei (University of Southern California) · Xuefeng Hu (University of Southern California) · Haidong Zhu (University of Southern California) · Ram Nevatia (None) |
440 | Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding | Zhihao Yuan (None) · Jinke Ren (The Chinese University of Hong Kong, Shenzhen) · Chun-Mei Feng (None) · Hengshuang Zhao (The University of Hong Kong) · Shuguang Cui (The Chinese University of Hong Kong, Shenzhen) · Zhen Li (The Chinese University of Hong Kong, Shenzhen) |
441 | Deformable One-shot Face Stylization via DINO Semantic Guidance | Yang Zhou (Shenzhen University) · Zichong Chen (Shenzhen University) · Hui Huang (Shenzhen University) |
442 | MoCha-Stereo: Motif Channel Attention Network for Stereo Matching | Ziyang Chen (Guizhou University) · Wei Long (None) · He Yao (None) · Yongjun Zhang (None) · Bingshu Wang (Northwest Polytechnical University Xi'an) · Yongbin Qin (Guizhou University) · Jia Wu (Monash University) |
443 | Active Open-Vocabulary Recognition: Let Intelligent Moving Mitigate CLIP Limitations | Lei Fan (Northwestern University) · Jianxiong Zhou (Northwestern University) · Xiaoying Xing (Northwestern University) · Ying Wu (Northwestern University) |
444 | ShapeMaker: Self-Supervised Joint Shape Canonicalization, Segmentation, Retrieval and Deformation | Yan Di (Technische Universität München) · Chenyangguang Zhang (Tsinghua University) · Chaowei Wang (Northwestern Polytechnical University, Northwest Polytechnical University Xi'an) · Ruida Zhang (Department of Automation, Tsinghua University) · Guangyao Zhai (Technical University of Munich) · Yanyan Li (Technical University Munich) · Bowen Fu (Technische Universität München) · Xiangyang Ji (Tsinghua University) · Shan Gao (Northwest Polytechnical University Xi'an) |
445 | MAP: MAsk-Pruning for Source-Free Model Intellectual Property Protection | Boyang Peng (Tongji University) · Sanqing Qu (Tongji University) · Yong Wu (Tongji University) · Tianpei Zou (Tongji University) · Lianghua He (Tongji University) · Alois Knoll (Technical University Munich) · Guang Chen (Tongji University) · Changjun Jiang (Tongji University) |
446 | EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI | Tai Wang (Shanghai AI Laboratory) · Xiaohan Mao (Shanghai Jiaotong University) · Chenming Zhu (The Chinese University Of Hong Kong, Shenzhen) · Runsen Xu (The Chinese University of Hong Kong) · Ruiyuan Lyu (Shanghai AI Laboratory) · Peisen Li (Tsinghua University) · Xiao Chen (The Chinese University of Hong Kong) · Wenwei Zhang (None) · Kai Chen (Shanghai AI Laboratory) · Tianfan Xue (The Chinese University of Hong Kong) · Xihui Liu (The University of Hong Kong) · Cewu Lu (Shanghai Jiao Tong University) · Dahua Lin (The Chinese University of Hong Kong) · Jiangmiao Pang (Shanghai AI Laboratory) |
447 | CycleINR: Cycle Implicit Neural Representation for Arbitrary-Scale Volumetric Super-Resolution of Medical Data | Wei Fang (Alibaba Group) · Yuxing Tang (Alibaba Group) · Heng Guo (Alibaba Group) · Mingze Yuan (Peking University) · Tony C. W. MOK (Alibaba DAMO Academy) · Ke Yan (Alibaba DAMO Academy) · Jiawen Yao (Alibaba Group) · Xin Chen (Guangzhou First People's Hospital) · Zaiyi Liu (Guangdong General Hospital) · Le Lu (Alibaba Group) · Ling Zhang (Alibaba Group) · Minfeng Xu (Alibaba Group) |
448 | Point Transformer V3: Simpler, Faster, Stronger | Xiaoyang Wu (The University of Hong Kong) · Li Jiang (Max Planck Institute for Informatics) · Peng-Shuai Wang (Peking University) · Zhijian Liu (Massachusetts Institute of Technology) · Xihui Liu (The University of Hong Kong) · Yu Qiao (Shanghai Artificial Intelligence Laboratory) · Wanli Ouyang (University of Sydney) · Tong He (Shanghai AI Lab) · Hengshuang Zhao (The University of Hong Kong) |
449 | Effective Video Mirror Detection with Inconsistent Motion Cues | Alex Warren (Swansea University) · Ke Xu (City University of Hong Kong) · Jiaying Lin (City University of Hong Kong) · Gary Tam (Swansea University) · Rynson W.H. Lau (City University of Hong Kong) |
450 | Unsupervised Salient Instance Detection | Xin Tian (Huawei Technologies Ltd.) · Ke Xu (City University of Hong Kong) · Rynson W.H. Lau (City University of Hong Kong) |
451 | Color Shift Estimation-and-Correction for Image Enhancement | Yiyu Li (City University of Hong Kong) · Ke Xu (City University of Hong Kong) · Gerhard Hancke Hancke (None) · Rynson W.H. Lau (City University of Hong Kong) |
452 | SeD: Semantic-Aware Discriminator for Image Super-Resolution | Bingchen Li (University of Science and Technology of China) · Xin Li (None) · Hanxin Zhu (University of Science and Technology of China) · YEYING JIN (National University of Singapore) · Ruoyu Feng (University of Science and Technology of China) · Zhizheng Zhang (Microsoft Research) · Zhibo Chen (University of Science and Technology of China) |
453 | CAT-DM: Controllable Accelerated Virtual Try-on with Diffusion Model | Jianhao Zeng (Tianjin University) · Dan Song (Tianjin University) · Weizhi Nie (Tianjin University) · Hongshuo Tian (Tianjin University) · Tongtong Wang (Tencent LightSpeed Studio) · Anan Liu (Tianjin University) |
454 | Density-guided Translator Boosts Synthetic-to-Real Unsupervised Domain Adaptive Segmentation of 3D Point Clouds | Zhimin Yuan (School of Informatics Xiamen University) · Wankang Zeng (Xiamen University) · Yanfei Su (Xiamen University) · Weiquan Liu (Xiamen University) · Ming Cheng (Xiamen University) · Yulan Guo (SUN YAT-SEN UNIVERSITY) · Cheng Wang (Xiamen University) |
455 | Fine-grained Bipartite Concept Factorization for Clustering | Chong Peng (None) · Pengfei Zhang (Qingdao University) · Yongyong Chen (Harbin Institute of Technology (Shenzhen)) · zhao kang (University of Electronic Science and Technology of China) · Chenglizhao Chen (China University of Petroleum) · Qiang Cheng (University of Kentucky) |
456 | RepKPU: Point Cloud Upsampling with Kernel Point Representation and KernelPoint-to-Displacement Generation | Yi Rong (Nanjing University) · Haoran Zhou (Nanjing University) · Kang Xia (nanjing university) · Cheng Mei (nanjing university) · Jiahao Wang () · Tong Lu (Nanjing University) |
457 | PanoContext-Former: Panoramic Total Scene Understanding with a Transformer | Yuan Dong (Alibaba Group) · Chuan Fang (Hong Kong University of Science and Technology) · Liefeng Bo (None) · Zilong Dong (Alibaba Group) · Ping Tan (Hong Kong University of Science and Technology) |
458 | Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection | Yicheng Xiao (Tsinghua University, Tsinghua University) · Zhuoyan Luo (Tsinghua University) · Yong Liu (None) · Yue Ma (Tsinghua University, Tsinghua University) · Hengwei Bian (Carnegie Mellon University) · Yatai Ji (None) · Yujiu Yang (Tsinghua University) · Xiu Li (Tsinghua University) |
459 | OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning | Lingyi Hong (Fudan University) · Shilin Yan (Fudan University) · Renrui Zhang (MMLab of CUHK & Shanghai AI Laboratory) · Wanyun Li (Fudan University) · Xinyu Zhou (None) · Pinxue Guo (Fudan University) · Kaixun Jiang (Fudan University) · Yiting Cheng (None) · Jinglun Li (None) · Zhaoyu Chen (Fudan University) · Wenqiang Zhang (None) |
460 | Zero-Shot Structure-Preserving Diffusion Model for High Dynamic Range Tone Mapping | Ruoxi Zhu (Fudan University) · Shusong Xu (Alibaba Group) · Peiye Liu (Alibaba Group) · Sicheng Li (Alibaba Group) · Yanheng Lu (Alibaba Group) · Dimin Niu (Alibaba Group) · Zihao Liu (Alibaba Group) · Zihao Meng (Alibaba Group) · Li Zhiyong (Alibaba Group) · Xinhua Chen (Fudan University) · Yibo Fan (Fudan University) |
461 | Structured Gradient-based Interpretations via Norm-Regularized Adversarial Training | Shizhan Gong (Department of Computer Science and Engineering, The Chinese University of Hong Kong) · Qi Dou (The Chinese University of Hong Kong) · Farzan Farnia (The Chinese University of Hong Kong) |
462 | A Category Agnostic Model for Visual Rearrangement | Yuyi Liu (Institute of Computing Technology,University of the Chinese Academy of Sciences) · Xinhang Song (None) · Weijie Li (Alibaba Group) · XIAOHAN Wang (Xi'an Jiaotong University) · Shuqiang Jiang (Institute of Computing Technology, Chinese Academy of Sciences) |
463 | BiPer: Binary Neural Networks using a Periodic Function | Edwin Vargas (None) · Claudia Correa (Universidad Industrial de Santander) · Carlos Hinojosa (KAUST) · TBD TBD (None) |
464 | Attention Calibration for Disentangled Text-to-Image Personalization | Yanbing Zhang (East China University of Science and Technology) · Mengping Yang (East China University of Science and Technology) · Qin Zhou (East China University of Science and Technology) · Zhe Wang (East China University of Science and Technology) |
465 | G-NeRF: Geometry-enhanced Novel View Synthesis from Single-View Images | Zixiong Huang (South China University of Technology) · Qi Chen (The University of Adelaide) · Libo Sun (University of Adelaide) · Yifan Yang (South China University of Technology) · Naizhou Wang (CVTE research) · Qi Wu (University of Adelaide) · Mingkui Tan (South China University of Technology) |
466 | UDiFF: Generating Conditional Unsigned Distance Fields with Optimal Wavelet Diffusion | Junsheng Zhou (Tsinghua University) · Weiqi Zhang (Tsinghua University) · Baorui Ma (BAAI) · Kanle Shi (Kuaishou Technology) · Yu-Shen Liu (None) · Zhizhong Han (Wayne State University) |
467 | Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers | Tsai-Shien Chen (University of California, Merced) · Aliaksandr Siarohin (Snap Inc.) · Willi Menapace (University of Trento) · Ekaterina Deyneka (Snap Inc.) · Hsiang-wei Chao (Snap Inc.) · Byung Jeon (Snap Inc.) · Yuwei Fang (Snap Inc.) · Hsin-Ying Lee (Snap Inc.) · Jian Ren (Snap Inc.) · Ming-Hsuan Yang (University of California at Merced) · Sergey Tulyakov (Snap Inc.) |
468 | BIVDiff: A Training-free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models | Fengyuan Shi (Nanjing University) · Jiaxi Gu (Huawei Noah's Ark Lab) · Hang Xu (Huawei Noah's Ark Lab) · Songcen Xu (Huawei Noah's Ark Lab) · Wei Zhang (Huawei Technologies Ltd.) · Limin Wang (Nanjing University) |
469 | CLIP-Driven Open-Vocabulary 3D Scene Graph Generation via Cross-Modality Contrastive Learning | Lianggangxu Chen (East China Normal University) · Xuejiao Wang (East China Normal University) · Jiale Lu (East China Normal University) · Shaohui Lin (East China Normal University) · Changbo Wang (East China Normal University) · Gaoqi He (East China Normal University) |
470 | Lane2Seq: Towards Unified Lane Detection via Sequence Generation | Kunyang Zhou (Southeast University) |
471 | Depth-Aware Concealed Crop Detection in Dense Agricultural Scenes | Liqiong Wang (China Three Gorges University) · Jinyu Yang (University of Birmingham) · Yanfu Zhang (College of William and Mary) · Fangyi Wang (China Three Gorges University) · Feng Zheng (Southern University of Science and Technology) |
472 | FedKTL: An Upload-Efficient Knowledge Transfer Scheme With a Pre-trained Generator in Heterogeneous Federated Learning | Jianqing Zhang (Shanghai Jiao Tong University & Tsinghua University) · Yang Liu (Tsinghua University, Tsinghua University) · Yang Hua (Queen's University Belfast) · Jian Cao (Shanghai Jiaotong University) |
473 | EASE-DETR: Easing the Competition among Object Queries | Yulu Gao (Beijing University of Aeronautics and Astronautics) · Yifan Sun (Baidu Research) · Xudong Ding (Beijing University of Aeronautics and Astronautics) · Chuyang Zhao (Beijing University of Aeronautics and Astronautics) · Si Liu (Beihang University) |
474 | MS-DETR: Efficient DETR Training with Mixed Supervision | Chuyang Zhao (Beijing University of Aeronautics and Astronautics) · Yifan Sun (Baidu Research) · Wenhao Wang (None) · Qiang Chen (Baidu) · Errui Ding (Baidu Inc.) · Yi Yang (Zhejiang University) · Jingdong Wang (Baidu) |
475 | A Specialized Dataset for Traffic Scene Perception | Peng-Tao Jiang (vivo Mobile Communication Co., Ltd) · Yuqi Yang (Nankai University) · Yang Cao (Hong Kong University of Science and Technology) · Qibin Hou (Nankai University) · Ming-Ming Cheng (Nankai University, Tsinghua University) · Chunhua Shen (Zhejiang University) |
476 | DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations | Tianhao Qi (University of Science and Technology of China) · Shancheng Fang (University of Science and Technology of China) · Yanze Wu (ByteDance Inc.) · Hongtao Xie (University of Science and Technology of China) · Jiawei Liu (Institute of automation, Chinese academy of sciences) · Lang chen (ByteDance) · Qian HE (Institute of Remote Sensing Application, Chinese Academic of Sciences) · Yongdong Zhang (University of Science and Technology of China) |
477 | Tackling the Singularities at the Endpoints of Time Intervals in Diffusion Models | Pengze Zhang (Sun Yat-sen University) · Hubery Yin (Tencent) · Chen Li (Tencent) · Xiaohua Xie (SUN YAT-SEN UNIVERSITY) |
478 | Pseudo Label Refinery for Unsupervised Domain Adaptation on Cross-dataset 3D Object Detection | Zhanwei Zhang (None) · Minghao Chen (Zhejiang University) · Shuai Xiao (Alibaba Group) · Liang Peng (FABU Inc) · Hengjia Li (FABU Inc) · Binbin Lin (Zhejiang University) · Ping Li (Hangzhou Dianzi University) · Wenxiao Wang (Zhejiang University) · Boxi Wu (Zhejiang University) · Deng Cai (Zhejiang University) |
479 | Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling | Zhe Li (Tsinghua University) · Zerong Zheng (Tsinghua University) · Lizhen Wang (Tsinghua University, Tsinghua University) · Yebin Liu (Tsinghua University) |
480 | MVHumanNet: A Large-scale Dataset of Multi-view Daily Dressing Human Captures | Zhangyang Xiong () · Chenghong Li (The Chinese University of Hong Kong, Shenzhen) · Kenkun Liu (The Chinese University of Hong Kong (Shenzhen)) · Hongjie Liao (Chinese University of Hong Kong, Shenzhen) · Jianqiao HU (The Chinese University of Hong Kong, Shenzhen) · Junyi Zhu (The Chinese University of Hongkong, Shenzhen) · Shuliang Ning (The Chinese University of HongKong, ShenZhen) · Lingteng Qiu (None) · Chongjie Wang (The Chinese University of Hong Kong ,Shenzhen) · Shijie Wang (The Chinese University of Hong Kong, Shenzhen) · Shuguang Cui (The Chinese University of Hong Kong, Shenzhen) · Xiaoguang Han (The Chinese University of Hong Kong, Shenzhen) |
481 | PeVL: Pose-Enhanced Vision-Language Model for Fine-Grained Human Action Recognition | Haosong Zhang (School of Computer Science and Engineering, Nanyang Technological University) · Mei Leong (, ASTAR) · Liyuan Li (I2R, ASTAR) · Weisi Lin (Nanyang Technological University) |
482 | Curriculum Point Prompting for Weakly-Supervised Referring Segmentation | Qiyuan Dai (ShanghaiTech University) · Sibei Yang (None) |
483 | Bi-Causal: Group Activity Recognition via Bidirectional Causality | Youliang Zhang (Wuhan University) · Wenxuan Liu (Wuhan University of Technology) · danni xu (National University of Singapore) · Zhuo Zhou (Wuhan University) · Zheng Wang (Wuhan University) |
484 | Targeted Representation Alignment for Open-World Semi-Supervised Learning | Ruixuan Xiao (Zhejiang University) · Lei Feng (Nanyang Technological University) · Kai Tang (Zhejiang University) · Junbo Zhao (Zhejiang University) · Yixuan Li (University of Wisconsin Madison) · Gang Chen (College of Computer Science and Technology, Zhejiang University) · Haobo Wang (Zhejiang University) |
485 | MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception | Yiran Qin (The Chinese University of Hong Kong (Shenzhen)) · Enshen Zhou (Shanghai AI Laboratory) · Qichang Liu (Shanghai AI Laboratory) · Zhenfei Yin (University of Sydney) · Lu Sheng (Beihang University) · Ruimao Zhang (The Chinese University of Hong Kong (Shenzhen)) · Yu Qiao (Shanghai Artificial Intelligence Laboratory) · Jing Shao (Shanghai AI Laboratory) |
486 | Transfer CLIP for Generalizable Image Denoising | Jun Cheng (Huazhong University of Science and Technology) · Dong Liang (Huazhong University of Science and Technology) · Shan Tan (Huazhong University of Science and Technology) |
487 | Modality-Collaborative Test-Time Adaptation for Action Recognition | Baochen Xiong (Institute of Automation, Chinese Academy of Sciences; Peng Cheng Lab) · Xiaoshan Yang (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Yaguang Song (Peng Cheng Laboratory) · Yaowei Wang (Pengcheng Laboratory) · Changsheng Xu (None) |
488 | Segment Any Events via Weighted Adaptation of Pivotal Tokens | Zhiwen Chen (Xidian University) · Zhiyu Zhu (City University of Hong Kong) · Yifan Zhang (City University of Hong Kong) · Junhui Hou (City University of Hong Kong) · Guangming Shi (Xidian University) · Jinjian Wu (Xidian University) |
489 | Generalizable Face Landmarking Guided by Conditional Face Warping | Jiayi Liang (Beijing Institute of Technology) · Haotian Liu (Beijing Institute of Technology) · Hongteng Xu (Renmin University of China) · Dixin Luo (Beijing Institute of Technology) |
490 | Behind the Veil: Enhanced Indoor 3D Scene Reconstruction with Occluded Surfaces Completion | Su Sun (Purdue University) · Henry Zhao (Bosch Research) · Yuliang Guo (Bosch US Research) · Ruoyu Wang (Bosch) · Xinyu Huang (Robert Bosch Research NA) · Yingjie Victor Chen (Purdue University) · Liu Ren (Bosch Research) |
491 | Correcting Diffusion Generation through Resampling | Yujian Liu (University of California, Santa Barbara) · Yang Zhang (International Business Machines) · Tommi Jaakkola (Massachusetts Institute of Technology) · Shiyu Chang (UC Santa Barbara) |
492 | Random Entangled Tokens for Adversarially Robust Vision Transformer | Huihui Gong (University of Sydney) · Minjing Dong (City University of Hong Kong) · Siqi Ma (University of New South Wales) · Seyit Camtepe (CSIRO) · Surya Nepal (, CSIRO) · Chang Xu (University of Sydney) |
493 | PIE-NeRF: Physics-based Interactive Elastodynamics with NeRF | Yutao Feng (Zhejiang University) · Yintong Shang (University of Utah) · Xuan Li (None) · Tianjia Shao (Zhejiang University) · Chenfanfu Jiang (University of California, Los Angeles) · Yin Yang (University of Utah) |
494 | BilevelPruning: Unified Dynamic and Static Channel Pruning for Convolutional Neural Networks | Shangqian Gao (University of Pittsburgh) · Yanfu Zhang (College of William and Mary) · Feihu Huang (Nanjing University of Aeronautics and Astronautics) · Heng Huang (University of Pittsburgh) |
495 | VideoCon: Robust Video-Language Alignment via Contrast Captions | Hritik Bansal (University of California, Los Angeles) · Yonatan Bitton (Google) · Idan Szpektor (Google) · Kai-Wei Chang (University of California, Los Angeles) · Aditya Grover (University of California, Los Angeles) |
496 | SecondPose: SE(3)-Consistent Dual-Stream Feature Fusion for Category-Level Pose Estimation | Yamei Chen (Technische Universität München) · Yan Di (Technische Universität München) · Guangyao Zhai (Technical University of Munich) · Fabian Manhardt (Google) · Chenyangguang Zhang (Tsinghua University) · Ruida Zhang (Department of Automation, Tsinghua University, Tsinghua University) · Federico Tombari (Google, TUM) · Nassir Navab (TU Munich) · Benjamin Busam (None) |
497 | MOHO: Learning Single-view Hand-held Object Reconstruction with Multi-view Occlusion-Aware Supervision | Chenyangguang Zhang (Tsinghua University) · Guanlong Jiao (Tsinghua University, Tsinghua University) · Yan Di (Technische Universität München) · Gu Wang (Tsinghua University) · Ziqin Huang (Tsinghua University, Tsinghua University) · Ruida Zhang (Department of Automation, Tsinghua University, Tsinghua University) · Fabian Manhardt (Google) · Bowen Fu (Technische Universität München) · Federico Tombari (Google, TUM) · Xiangyang Ji (Tsinghua University) |
498 | KP-RED: Exploiting Semantic Keypoints for Joint 3D Shape Retrieval and Deformation | Ruida Zhang (Department of Automation, Tsinghua University, Tsinghua University) · Chenyangguang Zhang (Tsinghua University) · Yan Di (Technische Universität München) · Fabian Manhardt (Google) · Xingyu Liu (Tsinghua University, Tsinghua University) · Federico Tombari (Google, TUM) · Xiangyang Ji (Tsinghua University) |
499 | RCooper: A Real-world Large-scale Dataset for Roadside Cooperative Perception | Ruiyang Hao (Institute for AI Industry Research, Tsinghua University) · Siqi Fan (Institute for AI Industry Research, Tsinghua University) · Yingru Dai (Tsinghua University, Tsinghua University) · Zhenlin Zhang (China Automotive Innovation Corporation) · Chenxi Li (CAIC) · YuntianWang (China Automotive Innovation Corporation) · Haibao Yu (University of Hong Kong) · Wenxian Yang (Tsinghua University, Tsinghua University) · Jirui Yuan (Tsinghua University, Tsinghua University) · Zaiqing Nie (Tsinghua University, Tsinghua University) |
500 | Enhancing Visual Document Understanding with Contrastive Learning in Large Visual-Language Models | Xin Li (Tencent Youtu Lab) · Yunfei Wu (Tencent YouTu Lab) · Xinghua Jiang (None) · ZhiHao Guo (Tencent YOUTU Lab) · Mingming Gong (Tencent YouTu Lab) · Haoyu Cao (Tencent Youtu Lab) · Yinsong Liu (Tencent Youtu Lab) · Deqiang Jiang (Tencent YouTu Lab) · Xing Sun (Tencent YouTu Lab) |
501 | HumMUSS: Human Motion Understanding using State Space Models | Arnab Mondal (McGill University) · Stefano Alletto (Apple) · Denis Tome (Apple) |
502 | LayoutLLM: Layout Instruction Tuning with Large Language Models for Document Understanding | Chuwei Luo (DAMO Academy, Alibaba Group) · Yufan Shen (Zhejiang University) · Zhaoqing Zhu (Alibaba Group) · Qi Zheng (Alibaba Group) · Zhi Yu (Zhejiang University) · Cong Yao (Alibaba DAMO Academy) |
503 | Byzantine-robust Decentralized Federated Learning via Dual-domain Clustering and Trust Bootstrapping | Peng Sun (Hunan University) · Xinyang Liu (Hong Kong Polytechnic University) · Zhibo Wang (Zhejiang University) · Bo Liu (Shenzhen Institute of Artificial Intelligence and Robotics for Society) |
504 | AvatarGPT: All-in-One Framework for Motion Understanding, Planning, Generation and Beyond | Zixiang Zhou (xiaobing.ai) · Yu Wan () · Baoyuan Wang (Xiaobing.ai) |
505 | 6-DoF Pose Estimation with MultiScale Residual Correlation | Yuelong Li (Amazon) · Yafei Mao (None) · Raja Bala (Amazon) · Sunil Hadap (Amazon) |
506 | Seg2Reg: Differentiable 2D Segmentation to 1D Regression Rendering for 360 Room Layout Reconstruction | Cheng Sun (NVIDIA) · Wei-En Tai (None) · Yu-Lin Shih (None) · Kuan-Wei Chen (National Tsinghua University) · Yong-Jing Syu (National Tsinghua University) · Kent Selwyn The (National Tsinghua University) · Yu-Chiang Frank Wang (NVIDIA) · Hwann-Tzong Chen (National Tsing Hua University) |
507 | GigaTraj: Predicting Long-term Trajectories of Hundreds of Pedestrians in Gigapixel Complex Scenes | Haozhe Lin (None) · Chunyu Wei (Tsinghua University, Tsinghua University) · Li He (Qiyuan Lab) · Yuchen Guo (Tsinghua University, Tsinghua University) · Yuchy Zhao (Tsinghua University, Tsinghua University) · Shanglong Li (Tsinghua University) · Lu Fang (Tsinghua University, Tsinghua University) |
508 | PEGASUS: Personalized Generative 3D Avatars with Composable Attributes | Hyunsoo Cha (Seoul National University) · Byungjun Kim (Seoul National University) · Hanbyul Joo (None) |
509 | CMA: A Chromaticity Map Adapter for Robust Detection of Screen-Recapture Document Images | Changsheng Chen (Shenzhen University) · Liangwei Lin (Shenzhen University) · Yongqi Chen (Shenzhen University) · Bin Li (Shenzhen University) · Jishen Zeng (Alibaba Group) · Jiwu Huang (Shenzhen University) |
510 | SNIFFER: Multimodal Large Language Model for Explainable Out-of-Context Misinformation Detection | Peng Qi (National University of Singapore) · Zehong Yan (National University of Singapore) · Wynne Hsu (National University of Singapore) · Mong Li Lee (National University of Singapore) |
511 | OMG-Seg: Is One Model Good Enough For All Segmentation? | Xiangtai Li (Nanyang Technological University) · Haobo Yuan (Wuhan University) · Wei Li (Nanyang Technological University) · Henghui Ding (None) · Size Wu (Nanyang Technological University) · Wenwei Zhang (None) · Yining Li (Shanghai AI Laboratory) · Kai Chen (Shanghai AI Laboratory) · Chen Change Loy (NANYANG TECHNOLOGICAL UNIVERSITY) |
512 | Spatial-Aware Regression for Keypoint Localization | dongkai.wang Wang (Peking University) · Shiliang Zhang (Peking University) |
513 | LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model | dongkai.wang Wang (Peking University) · shiyu xuan (Peking University) · Shiliang Zhang (Peking University) |
514 | Make-Your-Anchor: A Diffusion-based 2D Avatar Generation Framework | Ziyao Huang (, Chinese Academy of Sciences) · Fan Tang (Institute of Computing Technology, CAS) · Yong Zhang (Tencent AI Lab) · Xiaodong Cun (Tencent AI Lab) · Juan Cao (Institute of Computing Technology, Chinese Academy of Sciences) · Jintao Li (Institute of Computing Technology, Chinese Academy of Sciences) · Tong-yee Lee (National Cheng Kung University) |
515 | MoST: Multi-modality Scene Tokenization for Motion Prediction | Norman Mu (University of California Berkeley) · Jingwei Ji (Waymo LLC) · Zhenpei Yang (Waymo LLC) · Nathan Harada (Google) · Haotian Tang (Massachusetts Institute of Technology) · Kan Chen (Waymo) · Charles R. Qi (Waymo) · Runzhou Ge (Waymo) · Kratarth Goel (Waymo) · Zoey Yang (Waymo) · Scott Ettinger (Waymo LLC) · Rami Al-Rfou (Waymo) · Dragomir Anguelov (Waymo) · Yin Zhou (Waymo) |
516 | 3D Feature Tracking at 250 FPS via Event Camera | Siqi Li (Tsinghua University) · Zhou Zhikuan (None) · Zhou Xue (Li Auto) · Yipeng Li (Tsinghua University, Tsinghua University) · Shaoyi Du (Xi'an Jiaotong University) · Yue Gao (Tsinghua University, Tsinghua University) |
517 | CoDi-2: Interleaved and In-Context Any-to-Any Generation | Zineng Tang (University of North Carolina, Chapel Hill) · Ziyi Yang (Microsoft) · MAHMOUD KHADEMI (Microsoft) · Yang Liu (Microsoft) · Chenguang Zhu (Zoom) · Mohit Bansal (University of North Carolina at Chapel Hill) |
518 | Real-World Efficient Blind Motion Deblurring via Blur Pixel Discretization | Insoo Kim (Korea Advanced Institute of Science and Technology) · Jae Seok Choi (Samsung Advanced Institute of Technology (SAIT)) · Geonseok Seo (Samsung) · Kinam Kwon (Samsung) · Jinwoo Shin (Korea Advanced Institute of Science and Technology) · Hyong-Euk Lee (Samsung Advanced Institute of Technology) |
519 | DeMatch: Deep Decomposition of Motion Field for Two-View Correspondence Learning | Shihua Zhang (Wuhan University) · Zizhuo Li (Wuhan University) · Yuan Gao (Wuhan University) · Jiayi Ma (Wuhan University) |
520 | OmniLocalRF: Omnidirectional Local Radiance Fields from Dynamic Videos | Dongyoung Choi (Korea Advanced Institute of Science and Technology) · Hyeonjoong Jang (None) · Min H. Kim (KAIST) |
521 | FlashEval: Towards Fast and Accurate Evaluation of Text-to-image Diffusion Generative Models | Lin Zhao (Infinigence) · Tianchen Zhao (Tsinghua University, Tsinghua University) · Zinan Lin (Microsoft Research) · Xuefei Ning (Tsinghua University, Tsinghua University) · Guohao Dai (Shanghai Jiaotong University) · Huazhong Yang (Tsinghua University, Tsinghua University) · Yu Wang (Tsinghua University, Tsinghua University) |
522 | MimicDiffusion: Purifying Adversarial Perturbation via Mimicking Clean Diffusion Model | Kaiyu Song (SUN YAT-SEN UNIVERSITY) · Hanjiang Lai (SUN YAT-SEN UNIVERSITY) · Yan Pan (SUN YAT-SEN UNIVERSITY) · Jian Yin () |
523 | MAGICK: A Large-scale Captioned Dataset from Matting Generated Images using Chroma Keying | Ryan Burgert (Stony Brook University) · Brian Price (Adobe Research) · Jason Kuen (Adobe Research) · Yijun Li (Adobe Research) · Michael Ryoo (Stony Brook University) |
524 | Hyper-MD: Mesh Denoising with Customized Parameters Aware of Noise Intensity and Geometric Characteristics | Xingtao Wang (Harbin Institute of Technology) · Hongliang Wei (Harbin Institute of Technology) · Xiaopeng Fan (Harbin Institute of Technology) · Debin Zhao (Harbin Institute of Technology) |
525 | Boosting Spike Camera Image Reconstruction from a Perspective of Dealing with Spike Fluctuations | Rui Zhao (None) · Ruiqin Xiong (Peking University) · Jing Zhao (cncert) · Jian Zhang (None) · Xiaopeng Fan (Harbin Institute of Technology) · Zhaofei Yu (Peking University) · Tiejun Huang (Peking University) |
526 | SPECAT: SPatial-spEctral Cumulative-Attention Transformer for High-Resolution Hyperspectral Image Reconstruction | Zhiyang Yao (Department of Electronic Engineering, Tsinghua University) · Shuyang Liu (Tsinghua university) · Xiaoyun Yuan (Tsinghua University) · Lu Fang (Tsinghua University, Tsinghua University) |
527 | Dual-scale Transformer for Large-scale Single-Pixel Imaging | Gang Qu (Westlake University) · Ping Wang (Zhejiang University) · Xin Yuan (Westlake University) |
528 | 3D Face Reconstruction with the Geometric Guidance of Facial Part Segmentation | Zidu Wang (Institute of automation, Chinese Academy of Sciences) · Xiangyu Zhu (None) · Tianshuo Zhang (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · baiqin wang (None) · Zhen Lei (Institute of Automation, Chinese Academy of Sciences) |
529 | Diversified and Personalized Multi-rater Medical Image Segmentation | Yicheng Wu (Monash University) · Xiangde Luo (University of Electronic Science and Technology of China) · Zhe Xu (The Chinese University of Hong Kong; Harvard Medical School) · Xiaoqing Guo (University of Oxford, University of Oxford) · Lie Ju (Monash University) · Zongyuan Ge (Monash University) · Wenjun Liao (University of Electronic Science and Technology of China) · Jianfei Cai (Monash University) |
530 | DiffForensics: Leveraging Diffusion Prior to Image Forgery Detection and Localization | Zeqin Yu (Sun Yat-Sen University) · Jiangqun Ni (Sun Yat-Sen University) · Yuzhen Lin (Shenzhen University) · Haoyi Deng (Shenzhen University) · Bin Li (Shenzhen University) |
531 | ExtDM: Dual Distribution Extrapolation Diffusion Model for Video Prediction | Zhicheng Zhang (Nankai University) · Junyao Hu (Nankai University) · Wentao Cheng (Nankai University) · Danda Paudel (None) · Jufeng Yang (None) |
532 | Instance-Aware Group Quantization for Vision Transformers | Jaehyeon Moon (Yonsei University) · Dohyung Kim (yonsei) · Jun Yong Cheon (Yonsei University) · Bumsub Ham (Yonsei University) |
533 | OrthCaps: An Orthogonal CapsNet with Sparse Attention Routing and Pruning | Geng Xinyu (None) · Jiaming Wang (Harbin Institute of Technology) · Jiawei Gong (Harbin Institute of Technology) · yuerong xue (Harbin Institute of Technology) · Jun Xu (Harbin Institute of Technology) · Fanglin Chen (Harbin Institute of Technology (Shenzhen)) · Xiaolin Huang (Shanghai Jiao Tong University, Tsinghua University) |
534 | Learning Diffusion Texture Priors for Image Restoration | Tian Ye (The Hong Kong University of Science and Technology (Guangzhou)) · Sixiang Chen (Hong Kong University of Science and Technology (GZ)) · Wenhao Chai (University of Washington) · Zhaohu Xing (Hong Kong University of Science and Technology) · Jing Qin (Hong Kong Polytechnic University) · Ge lin (State University of New York at Buffalo) · Lei Zhu (Hong Kong University of Science and Technology (Guangzhou) & HKUST) |
535 | SCoFT: Self-Contrastive Fine-Tuning for Equitable Image Generation | Zhixuan Liu (Carnegie Mellon University) · Peter Schaldenbrand (CMU, Carnegie Mellon University) · Beverley-Claire Okogwu (CMU, Carnegie Mellon University) · Wenxuan Peng (Nanyang Technological University) · Youngsik Yun (Dongguk University) · Andrew Hundt (Carnegie Mellon University) · Jihie Kim (Dongguk University) · Jean Oh (Carnegie Mellon University) |
536 | Generalizing 6-DoF Grasp Detection via Domain Prior Knowledge | Haoxiang Ma (Beihang University) · Modi Shi (Beijing University of Aeronautics and Astronautics) · Boyang GAO (Geometry Robotics ltd. & Harbin Institute of Technology) · Di Huang (Beihang University) |
537 | Doubly Abductive Counterfactual Inference for Text-based Image Editing | Xue Song (Fudan University) · Jiequan Cui (The Chinese University of Hong Kong) · Hanwang Zhang (Nanyang Technological University) · Jingjing Chen (Fudan University) · Richang Hong (Hefei University of Technology) · Yu-Gang Jiang (Fudan University) |
538 | Promptable Behaviors: Personalizing Multi-Objective Rewards from Human Preferences | Minyoung Hwang (Seoul National University) · Luca Weihs (Allen Institute for Artificial Intelligence) · Chanwoo Park (Massachusetts Institute of Technology) · Kimin Lee (KAIST) · Aniruddha Kembhavi (Allen Institute for Artificial Intelligence) · Kiana Ehsani (Allen Institute for Artificial Intelligence) |
539 | Active Prompt Learning in Vision Language Models | Jihwan Bang (KAIST) · Sumyeong Ahn (Michigan State University) · Jae-Gil Lee (Korea Advanced Institute of Science and Technology) |
540 | LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs | Yunsheng Ma (Purdue University) · Can Cui (Purdue University) · Xu Cao (University of Illinois Urbana-Champaign) · Wenqian Ye (University of Virginia) · Peiran Liu (Purdue University) · Juanwu Lu (Purdue University) · Amr Abdelraouf (None) · Rohit Gupta (Toyota Motor Corporation) · Kyungtae Han (Toyota Motor North America) · Aniket Bera (Purdue University) · James Rehg (None) · Ziran Wang (Purdue University) |
541 | Model Adaptation for Time Constrained Embodied Control | Jaehyun Song (Sungkyunkwan University) · Minjong Yoo (Sungkyunkwan University) · Honguk Woo (Sungkyunkwan University) |
542 | StableVITON: Learning Semantic Correspondence with Latent Diffusion Model for Virtual Try-On | Jeongho Kim (KAIST) · Gyojung Gu (Korea Advanced Institute of Science and Technology) · Minho Park (KAIST) · Sunghyun Park (KAIST) · Jaegul Choo (Korea Advanced Institute of Science and Technology) |
543 | FedSOL: Stabilized Orthogonal Learning in Federated Learning | Gihun Lee (KAIST AI) · Minchan Jeong (Korea Advanced Institute of Science and Technology) · SangMook Kim (KAIST) · Jaehoon Oh (Samsung Advanced Institute of Technology) · Se-Young Yun (KAIST) |
544 | Pre-training Vision Models with Mandelbulb Variations | Benjamin N. Chiche (Rist Inc.) · Yuto Horikawa (Osaka University) · Ryo Fujita (Kyoto University) |
545 | Learning from Synthetic Human Group Activities | Chang (None) · Danrui Li (Rutgers University) · Deep Patel (NEC Laboratories America) · Parth Goel (Oracle) · Seonghyeon Moon (Roblox) · Samuel Sohn (Rutgers University) · Honglu Zhou (Rutgers University) · Sejong Yoon (The College of New Jersey) · Vladimir Pavlovic (Rutgers University) · Mubbasir Kapadia (Rutgers University ) |
546 | HIR-Diff: Unsupervised Hyperspectral Image Restoration Via Improved Diffusion Models | Li Pang (Xi'an Jiaotong University) · Xiangyu Rui (Xi'an Jiaotong University) · Long Cui (Xi'an Jiaotong University) · Hongzhong Wang (Xi'an Jiaotong University) · Deyu Meng () · Xiangyong Cao (Xi'an Jiaotong University) |
547 | Data-Efficient Multimodal Fusion on a Single GPU | Noël Vouitsis (Layer 6 AI) · Zhaoyan Liu (Layer6 AI) · Satya Krishna Gorti (Layer6 AI) · Valentin Villecroze (Layer 6) · Jesse C. Cresswell (Layer 6 AI) · Guangwei Yu (Layer6 AI) · Gabriel Loaiza-Ganem (Layer 6 AI) · Maksims Volkovs (Layer6 AI) |
548 | Generating Human Motion in 3D Scenes from Text Descriptions | Zhi Cen (Zhejiang University) · Huaijin Pi (Zhejiang University) · Sida Peng (None) · Zehong Shen (Zhejiang University) · Minghui Yang (Ant Group) · Shuai Zhu (Ant Group) · Hujun Bao (Zhejiang University) · Xiaowei Zhou (None) |
549 | TexVocab: Texture Vocabulary-conditioned Human Avatars | Yuxiao Liu (None) · Zhe Li (Tsinghua University) · Yebin Liu (Tsinghua University) · Haoqian Wang (Tsinghua University, Tsinghua University) |
550 | D3T: Distinctive Dual-Domain Teacher Zigzagging Across RGB-Thermal Gap for Domain-Adaptive Object Detection | Dinh Phat (None) · TAEHOON KIM (None) · JAEMIN NA (None) · Jiwon Kim (Hyundai Motor Company) · Keonho LEE (Hyundai Motor Company) · Kyunghwan Cho (Hyundai Motor Company) · Wonjun Hwang (Ajou University) |
551 | TE-TAD: Towards Fully End-to-End Temporal Action Detection via Time-Aligned Coordinate Expression | Ho-Joong Kim (Korea University) · Jung-Ho Hong (Korea University) · Heejo Kong (Korea University) · Seong-Whan Lee (Korea University) |
552 | MART: Masked Affective RepresenTation Learning via Masked Temporal Distribution Distillation | Zhicheng Zhang (Nankai University) · Pancheng Zhao (Nankai University) · Eunil Park (Sungkyunkwan University) · Jufeng Yang (None) |
553 | LAKE-RED: Camouflaged Images Generation by Latent Background Knowledge Retrieval-Augmented Diffusion | Pancheng Zhao (Nankai University) · Peng Xu (Tsinghua University, Tsinghua University) · Pengda Qin (Alibaba Group) · Deng-Ping Fan (ETH Zurich) · Zhicheng Zhang (Nankai University) · Guoli Jia (None) · Bowen Zhou (Tsinghua University) · Jufeng Yang (None) |
554 | Dual-consistency Model Inversion for Non-exemplar Class Incremental Learning | Zihuan Qiu (None) · Yi Xu (Dalian University of Technology) · Fanman Meng (University of Electronic Science and Technology of China) · Hongliang Li (University of Electronic Science and Technology of China, Tsinghua University) · Linfeng Xu (University of Electronic Science and Technology of China) · Qingbo Wu (University of Electronic Science and Technology of China) |
555 | Integrating Efficient Optimal Transport and Functional Maps For Unsupervised Shape Correspondence Learning | Tung Le (University of California, Irvine) · Khai Nguyen (UT Austin) · shanlin sun (University of California, Irvine) · Nhat Ho (University of Texas, Austin) · Xiaohui Xie (University of California, Irvine) |
556 | Online Task-Free Continual Generative and Discriminative Learning via Dynamic Cluster Memory | 飞 叶 (University of York) · Adrian Bors (University of York) |
557 | Taming Stable Diffusion for Text to 360° Panorama Image Generation | Cheng Zhang (None) · Qianyi Wu (Monash University) · Camilo Cruz Gambardella (Monash University) · Xiaoshui Huang (Shanghai AI Laboratory) · Dinh Phung (Monash University) · Wanli Ouyang (University of Sydney) · Jianfei Cai (Monash University) |
558 | How Far Can We Compress Instant NGP-Based NeRF? | Yihang Chen (Shanghai Jiao Tong University) · Qianyi Wu (Monash University) · Mehrtash Harandi (Monash University) · Jianfei Cai (Monash University) |
559 | MoDE: CLIP Data Experts via Clustering | Jiawei Ma (Columbia University) · Po-Yao Huang (Facebook) · Saining Xie (Facebook) · Shang-Wen Li (Facebook) · Luke Zettlemoyer (University of Washington) · Shih-Fu Chang (Columbia University) · Wen-tau Yih (Meta Platforms, Inc.) · Hu Xu (FAIR, Multimodal Foundation) |
560 | Selective nonlinearities removal from digital signals | Krzysztof Maliszewski (University of Canterbury) · Magdalena Urbanska (Massey University) · Varvara Vetrova (University of Canterbury) · Sylwia Kolenderska (University of Canterbury) |
561 | Split to Merge: Unifying Separated Modalities for Unsupervised Domain Adaptation | Xinyao Li (None) · Yuke Li (Wuhan University) · Zhekai Du (University of Electronic Science and Technology of China) · Fengling Li (University of Technology Sydney) · Ke Lu (University of Electronic Science and Technology of China) · Jingjing Li (University of Electronic Science and Technology of China) |
562 | ViLa-MIL: Dual-scale Vision-Language Multiple Instance Learning for Whole Slide Image Classification | Jiangbo Shi (None) · Chen Li (Xi'an Jiaotong University) · Tieliang Gong (Xi'an Jiaotong University) · Yefeng Zheng (None) · Huazhu Fu (Institute of High Performance Computing, Singapore, A*STAR) |
563 | MESA: Matching Everything by Segmenting Anything | Yesheng Zhang (Shanghai Jiaotong University) · Xu Zhao (Shanghai Jiao Tong University) |
564 | Embodied Multi-Modal Agent trained by an LLM from a Parallel TextWorld | Yijun Yang (University of Technology Sydney) · Tianyi Zhou (University of Maryland, College Park) · kanxue Li (Yunnan University) · Dapeng Tao (Yunnan University) · Lusong Li (JDT) · Li Shen (JD Explore Academy) · Xiaodong He (JD AI Research) · Jing Jiang (University of Technology Sydney) · Yuhui Shi (Southern University of Science and Technology) |
565 | DIOD: Self-Distillation Meets Object Discovery | Sandra Kara (CEA) · Hejer AMMAR (CEA) · Julien Denize (CEA) · Florian Chabot (CEA) · Quoc Cuong PHAM (CEA) |
566 | ContextSeg: Sketch Semantic Segmentation by Querying the Context with Attention | Jiawei Wang (Shandong University) · Changjian Li (University of Edinburgh) |
567 | PolarRec: Radio Interferometric Data Reconstruction with Polar Coordinate Representation | Ruoqi Wang (The Hong Kong University of Science and Technology (Guangzhou)) · Zhuoyang Chen (The Hong Kong University of Science and Technology (Guangzhou)) · Jiayi Zhu (Hong Kong University of Science and Technology (Guangzhou)) · Qiong Luo (Hong Kong University of Science and Technology) · Feng Wang (Guangzhou University) |
568 | Text-to-3D Generation with Bidirectional Diffusion using both 3D and 2D priors | Lihe Ding (The Chinese University of Hong Kong) · Shaocong Dong (Hong Kong University of Science and Technology) · Zhanpeng Huang (SenseTime Research) · Zibin Wang (Sensetime Group Limited) · Yiyuan Zhang (The Chinese University of Hong Kong) · Kaixiong Gong (None) · Dan Xu (Department of Computer Science and Engineering, The Hong Kong University of Science and Technology) · Tianfan Xue (The Chinese University of Hong Kong) |
569 | Interactive3D: Create What You Want by Interactive 3D Generation | Shaocong Dong (Hong Kong University of Science and Technology) · Lihe Ding (The Chinese University of Hong Kong) · Zhanpeng Huang (SenseTime Research) · Zibin Wang (Sensetime Group Limited) · Tianfan Xue (The Chinese University of Hong Kong) · Dan Xu (Department of Computer Science and Engineering, The Hong Kong University of Science and Technology) |
570 | Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion | Zuoyue Li (ETH Zürich) · Zhenqiang Li (The University of Tokyo) · Zhaopeng Cui (None) · Marc Pollefeys (ETH Zurich / Microsoft) · Martin R. Oswald (University of Amsterdam) |
571 | Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior | Chen Cheng (Nanyang Technological University) · Xiaofeng Yang (Nanyang Technological University) · Fan Yang (None) · Chengzeng Feng (Nanyang Technological University) · ZHOUJIE FU (Nanyang Technological University) · Chuan-Sheng Foo (Centre for Frontier AI Research, ASTAR) · Guosheng Lin (Nanyang Technological University) · Fayao Liu (Institute for Infocomm Research, ASTAR) |
572 | Semantics, Distortion, and Style Matter: Towards Source-free UDA for Panoramic Segmentation | Xu Zheng (Northeastern University) · Pengyuan Zhou (Aarhus University) · ATHANASIOS (ICT) · Lin Wang (Hong Kong University of Science and Technology) |
573 | MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI | Xiang Yue (Ohio State University) · Yuansheng Ni (University of Waterloo) · Kai Zhang (Ohio State University, Columbus) · Tianyu Zheng (Beijing University of Posts and Telecommunications) · Ruoqi Liu (Ohio State University) · Ge Zhang (University of Waterloo) · Samuel Stevens (Ohio State University, Columbus) · Dongfu Jiang (University of Waterloo) · Weiming Ren (University of Waterloo) · Yuxuan Sun (Westlake University) · Cong Wei (University of Waterloo) · Botao Yu (The Ohio State University) · Ruibin Yuan (Hong Kong University of Science and Technology) · Renliang Sun (International Digital Economy Academy) · Ming Yin (Princeton University) · Boyuan Zheng (Ohio State University, Columbus) · Zhenzhu Yang (China University of Geoscience Beijing) · Yibo Liu (University of Victoria) · Wenhao Huang (BAAI) · Huan Sun (Ohio State University, Columbus) · Yu Su (Ohio State University) · Wenhu Chen (University of Waterloo) |
574 | Enhancing Quality of Compressed Images by Mitigating Enhancement Bias Towards Compression Domain | Qunliang Xing (Beihang University) · Mai Xu (Beihang University, Tsinghua University) · Shengxi Li (Beihang University) · Xin Deng (Beijing University of Aeronautics and Astronautics) · Meisong Zheng (Alibaba Group) · huaida liu (Alibaba Group) · Ying Chen (Alibaba Group) |
575 | HOI-M³: Capture Multiple Humans and Objects Interaction within Contextual Environment | Juze Zhang (ShanghaiTech University) · Jingyan Zhang (ShanghaiTech University) · Zining Song (ShanghaiTech University) · Zhanhe Shi (ShanghaiTech University) · Chengfeng Zhao (ShanghaiTech University) · Ye Shi (ShanghaiTech University) · Jingyi Yu (Shanghai Tech University) · Lan Xu (None) · Jingya Wang (ShanghaiTech University) |
576 | Unsupervised Learning of Category-Level 3D Pose from Object-Centric Videos | Leonhard Sommer (University of Freiburg, Albert-Ludwigs-Universität Freiburg) · Artur Jesslen (University of Freiburg) · Eddy Ilg (None) · Adam Kortylewski (University of Freiburg & MPI-INF) |
577 | SynSP: Synergy of Smoothness and Precision in Pose Sequences Refinement | Tao Wang (Beijing University of Posts and Telecommunications) · Lei Jin (Beijing University of Posts and Telecommunications) · Zheng Wang (Wuhan University) · Jianshu Li (Ant Group) · Liang Li (None) · Fang Zhao (Tencent AI Lab) · Yu Cheng (National University of Singapore) · Li Yuan (Peking University) · Li ZHOU (Wuhan University) · Junliang Xing (Tsinghua University) · Jian Zhao () |
578 | Atlantis: Enabling Underwater Depth Estimation with Stable Diffusion | Fan Zhang (Beijing Institute of Technology) · Shaodi You (Kyushu University) · Yu Li (International Digital Economy Academy) · Ying Fu (None) |
579 | OED: Towards One-stage End-to-End Dynamic Scene Graph Generation | Guan Wang (Peking University) · Zhimin Li (Tencent Data Platform) · Qingchao Chen (Peking University) · Yang Liu (Peking University) |
580 | LiveHPS: LiDAR-based Scene-level Human Pose and Shape Estimation in Free Environment | yiming ren (None) · xiao han (ShanghaiTech University) · Chengfeng Zhao (ShanghaiTech University) · Jingya Wang (ShanghaiTech University) · Lan Xu (None) · Jingyi Yu (Shanghai Tech University) · Yuexin Ma (ShanghaiTech University) |
581 | I'M HOI: Inertia-aware Monocular Capture of 3D Human-Object Interactions | Chengfeng Zhao (ShanghaiTech University) · Juze Zhang (ShanghaiTech University) · Jiashen Du (None) · Ziwei Shan (ShanghaiTech University) · Junye Wang (ShanghaiTech University) · Jingyi Yu (Shanghai Tech University) · Jingya Wang (ShanghaiTech University) · Lan Xu (None) |
582 | Robust Self-calibration of Focal Lengths from the Fundamental Matrix | Viktor Kocur (Comenius University in Bratislava) · Daniel Kyselica (Comenius University in Bratislava) · Zuzana Kukelova (Czech Technical University in Prague) |
583 | OMG: Towards Open-vocabulary Motion Generation via Mixture of Controllers | Han Liang (ShanghaiTech University) · Jiacheng Bao (Shanghai Tech University) · Ruichi Zhang (ShanghaiTech University) · Sihan Ren (ShanghaiTech University) · Yuecheng Xu (ShanghaiTech University) · Sibei Yang (None) · Xin Chen (University of Chinese Academy of Sciences, ShanghaiTech University) · Jingyi Yu (ShanghaiTech University) · Lan Xu (None) |
584 | MaskPLAN: Masked Generative Layout Planning from Partial Input | Hang Zhang (ETHZ - ETH Zurich) · Anton Savov (ETHZ - ETH Zurich) · Benjamin Dillenburger (ETHZ - ETH Zurich) |
585 | SVDTree: Semantic Voxel Diffusion for Single Image Tree Reconstruction | Yuan Li (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Zhihao Liu (The University of Tokyo) · Bedrich Benes (Purdue University) · Xiaopeng Zhang (Institute of Automation, Chinese Academy of Sciences) · Jianwei Guo (Institute of Automation, Chinese Academy of Sciences) |
586 | Instruct-ReID: A Multi-purpose Person Re-identification Task with Instructions | Weizhen He () · Yiheng Deng (Zhejiang University) · SHIXIANG TANG (The Chinese University of Hong Kong) · Qihao CHEN (Liaoning Technical University) · Qingsong Xie (OPPO) · Yizhou Wang (None) · Lei Bai (Shanghai AI Laboratory) · Feng Zhu (SenseTime Group LTD) · Rui Zhao (Qing Yuan Research Institute, Shanghai Jiao Tong University) · Wanli Ouyang (University of Sydney) · Donglian Qi (Zhejiang University) · Yunfeng Yan (Zhejiang University) |
587 | ADA-Track: End-to-End Multi-Camera 3D Multi-Object Tracking with Alternating Detection and Association | Shuxiao Ding (Mercedes-Benz AG & University of Bonn) · Lukas Schneider (Mercedes Benz Research & Development) · Marius Cordts (Mercedes-Benz AG) · Jürgen Gall (University of Bonn) |
588 | Confronting Ambiguity in 6D Object Pose Estimation via Score-Based Diffusion on SE(3) | Tsu-Ching Hsiao (Woven by Toyota) · Hao-Wei Chen (National Tsing Hua University) · Hsuan-Kung Yang (National Tsinghua University) · Chun-Yi Lee (National Tsing Hua University) |
589 | Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition | Xiang Li (Carnegie Mellon University) · Jinglu Wang (Microsoft Research Asia) · Xiaohao Xu (University of Michigan - Ann Arbor) · Xiulian Peng (Microsoft Research Asia) · Rita Singh (School of Computer Science, Carnegie Mellon University) · Yan Lu (Microsoft Research Asia) · Bhiksha Raj (Carnegie Mellon University) |
590 | Task-conditioned adaptation of visual features in multi-task policy learning | Pierre Marza (Institut National des Sciences Appliquées de Lyon) · Laetitia Matignon (LIRIS, CNRS) · Olivier Simonin (INSA de Lyon) · Christian Wolf (Naver Labs Europe) |
591 | One-Shot Structure-Aware Stylized Image Synthesis | Hansam Cho (Korea University) · Jonghyun Lee (Korea University) · Seunggyu Chang (NAVER Cloud) · Yonghyun Jeong (NAVER) |
592 | VTimeLLM: Empower LLM to Grasp Video Moments | Bin Huang (Tsinghua University) · Xin Wang (None) · Hong Chen (None) · Zihan Song (Tsinghua University, Tsinghua University) · Wenwu Zhu (Tsinghua University, Tsinghua University) |
593 | Self-supervised debiasing using low rank regularization | Geon Yeong Park (Korea Advanced Institute of Science and Technology) · Chanyong Jung (Korea Advanced Institute of Science and Technology) · Sangmin Lee (Korea Advanced Institute of Science & Technology) · Jong Chul Ye (Korea Advanced Institute of Science and Technology) · Sang Wan Lee (Korea Advanced Institute of Science & Technology) |
594 | Cyclic Learning for Binaural Audio Generation and Localization | Zhaojian Li (Northwest Polytechnical University) · Bin Zhao (Northwest Polytechnical University Xi'an) · Yuan Yuan (Northwest Polytechnical University Xi'an) |
595 | Visual Objectification in Films: Towards a New AI Task for Video Interpretation | Julie Tores (Université Côte d'azur) · Lucile Sassatelli (Universite Cote d'Azur) · Hui-Yin Wu (Inria at Université Côte d'Azur) · Clement Bergman (INRIA) · Léa Andolfi (Université Paris-Sorbonne (Paris IV)) · Victor Ecrement (Sorbonne Université) · Frederic Precioso (Universite Cote d'Azur) · Thierry Devars (Université Paris-Sorbonne (Paris IV)) · Magali GUARESI (CNRS) · Virginie Julliard (Université Paris-Sorbonne (Paris IV)) · Sarah Lécossais (Sorbonne Paris Nord) |
596 | ERMVP: Communication-Efficient and Collaboration-Robust Multi-Vehicle Perception in Challenging Environments | Jingyu Zhang (Fudan University) · Kun Yang (Fudan University) · Yilei Wang (Fudan University) · Hanqi Wang (Fudan University) · Peng Sun (Duke Kunshan University) · Liang Song (Fudan University) |
597 | SOAC: Spatio-Temporal Overlap-Aware Multi-Sensor Calibration using Neural Radiance Fields | Quentin HERAU (Huawei/University of Burgundy) · Nathan Piasco (Huawei Technologies Ltd.) · Moussab Bennehar (Huawei Noah's Ark Lab) · Luis Roldao Jimenez (Huawei Technologies Ltd.) · Dzmitry Tsishkou (Huawei Technologies Ltd.) · MigniotCyrille (University of Burgundy) · Pascal Vasseur (Université de Picardie Jules-Verne) · Cedric Demonceaux (Université de Bourgogne) |
598 | Adaptive Slot Attention: Object Discovery with Dynamic Slot Number | Ke Fan (Fudan University) · Zechen Bai (Show Lab, National University of Singapore) · Tianjun Xiao (Amazon) · Tong He (Amazon Web Services) · Max Horn (GSK plc) · Yanwei Fu (Fudan University) · Francesco Locatello (ISTA) · Zheng Zhang (New York University) |
599 | 3D Neural Edge Reconstruction | Lei Li (ETH Zurich) · Songyou Peng (ETH Zurich & MPI Tübingen) · Zehao Yu (None) · Shaohui Liu (ETH Zurich) · Rémi Pautrat (Microsoft Mixed Reality & AI lab) · Xiaochuan Yin (utopilot) · Marc Pollefeys (ETH Zurich / Microsoft) |
600 | Towards Co-Evaluation of Cameras, HDR, and Algorithms for Industrial-Grade 6DoF Pose Estimation | Agastya Kalra (Google) · Guy Stoppi (Intrinsic) · Dmitrii Marin (Intrinsic) · Vage Taamazyan (Intrinsic) · Aarrushi Shandilya (Intrinsic AI) · Rishav Agarwal (Intrinsic) · Anton Boykov (University of Waterloo) · Tze Chong (Google) · Michael Stark (Intrinsic) |
601 | PeLK: Parameter-efficient Large Kernel ConvNets with Peripheral Convolution | Honghao Chen (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Xiangxiang Chu (MeiTuan) · Renyongjian (University of the Chinese Academy of Sciences) · Xin Zhao (University of Science and Technology Beijing) · Kaiqi Huang (Institute of automation, Chinese academy of science) |
602 | HDRFlow: Real-Time HDR Video Reconstruction with Large Motions | Gangwei Xu (None) · Yujin Wang (Shanghai Artificial Intelligence Laboratory) · Jinwei Gu (The Chinese University of Hong Kong) · Tianfan Xue (The Chinese University of Hong Kong) · Xin Yang (Huazhong University of Science and Technology) |
603 | Towards Efficient Replay in Federated Incremental Learning | Yichen Li (Huazhong University of Science and Technology) · Qunwei Li (Ant Group) · Haozhao Wang (Huazhong University of Science and Technology) · Ruixuan Li (Huazhong University of Science and Technology) · Wenliang Zhong (Ant Group) · Guannan Zhang (Tongji University) |
604 | Time-Efficient Light-Field Acquisition Using Coded Aperture and Events | Shuji Habuchi (Nagoya University) · Keita Takahashi (Nagoya University) · Chihiro Tsutake (Nagoya University) · Toshiaki Fujii (Nagoya University) · Hajime Nagahara (Osaka University) |
605 | MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning | Matteo Farina (University of Trento) · Massimiliano Mancini (University of Trento) · Elia Cunegatti (University of Trento) · Gaowen Liu (None) · Giovanni Iacca (University of Trento) · Elisa Ricci (University of Trento) |
606 | SubT-MRS Datasets: Pushing SLAM Towards All-weather Environments | Shibo Zhao (Carnegie Mellon University) · Yuanjun Gao (None) · Tianhao Wu (University of Virginia, Charlottesville) · Damanpreet Singh (CMU, Carnegie Mellon University) · Rushan Jiang (Oracle) · Haoxiang Sun (Carnegie Mellon University) · Mansi Sarawata (CMU, Carnegie Mellon University) · Warren Whittaker (Carnegie Mellon University) · Ian Higgins (Carnegie Mellon University) · Shaoshu Su (State University of New York at Buffalo) · Yi Du (State University of New York at Buffalo) · Can Xu (None) · John Keller (Carnegie Mellon University) · Jay Karhade (Carnegie Mellon University) · Lucas Nogueira (Carnegie Mellon University) · Sourojit Saha (CMU, Carnegie Mellon University) · Yuheng Qiu (CMU, Carnegie Mellon University) · Ji Zhang (Carnegie Mellon University) · Wenshan Wang (School of Computer Science, Carnegie Mellon University) · Chen Wang (University at Buffalo) · Sebastian Scherer (None) |
607 | DiffusionRegPose: Enhancing Multi-Person Pose Estimation using a Diffusion-Based End-to-End Regression Approach | Dayi Tan (Tongji university) · Hansheng Chen (Stanford University) · Wei Tian (Tongji University) · Lu Xiong (Tongji University) |
608 | MCD: Diverse Large-Scale Multi-Campus Dataset for Robot Perception | Thien-Minh Nguyen (Nanyang Technological University) · Shenghai Yuan (National Technological University) · Thien Nguyen (Nanyang Technological University) · Pengyu Yin (Nanyang Technological University) · Haozhi Cao (Nanyang Technological University) · Lihua Xie (Nanyang Technological University) · Maciej Wozniak (KTH Royal Institute of Technology) · Patric Jensfelt (KTH Royal Institute of Technology, Stockholm, Sweden) · Marko Thiel (Hamburg University of Technology) · Justin Ziegenbein (Technische Universität Hamburg) · Noel Blunder (Technische Universität Hamburg) |
609 | Generalized Predictive Model for Autonomous Driving | Jiazhi Yang (Shanghai AI Laboratory) · Shenyuan Gao (None) · Yihang Qiu (Shanghai Jiao Tong University) · Li Chen (Shanghai AI Laboratory) · Tianyu Li (Fudan University) · Bo Dai (Shanghai AI Laboratory) · Kashyap Chitta () · Penghao Wu (University of California, San Diego) · Jia Zeng (Shanghai Jiaotong University) · Ping Luo (The University of Hong Kong) · Jun Zhang (The Hong Kong University of Science and Technology) · Andreas Geiger (University of Tübingen) · Yu Qiao (Shanghai Artificial Intelligence Laboratory) · Hongyang Li (Shanghai AI Lab) |
610 | Meta-Point Learning and Refining for Category-Agnostic Pose Estimation | Junjie Chen (Jiangxi University of Finance and Economics) · Jiebin Yan (Jiangxi University of Finance and Economics) · Yuming Fang (Jiangxi University of Finance and Economics) · Li Niu () |
611 | Mitigating Noisy Correspondence by Geometrical Structure Consistency Learning | Zihua Zhao (Shanghai Jiao Tong University) · Mengxi Chen (Shanghai Jiaotong University) · Tianjie Dai (Shanghai Jiao Tong University) · Jiangchao Yao (Shanghai Jiaotong University) · Bo Han (HKBU) · Ya Zhang (Shanghai Jiao Tong University) · Yanfeng Wang (Shanghai Jiao Tong University) |
612 | Harnessing Meta-Learning for Improving Full-Frame Video Stabilization | Muhammad Kashif Ali (None) · Eun Woo Im (Hanyang University) · Dongjin Kim (Hanyang University) · Tae Kim Kim (None) |
613 | Deep Single Image Camera Calibration by Heatmap Regression to Recover Fisheye Images Under Manhattan World Assumption | Nobuhiko Wakai (None) · Satoshi Sato (Panasonic Holdings Corporation) · Yasunori Ishii (Panasonic Holdings Corporation) · Takayoshi Yamashita (Chubu University) |
614 | Active Domain Adaptation with False Negative Prediction for Object Detection | Yuzuru Nakamura (Panasonic Holdings Corporation) · Yasunori Ishii (Panasonic Holdings Corporation) · Takayoshi Yamashita (Chubu University) |
615 | SURE: SUrvey REcipes for building reliable and robust deep networks | Yuting Li (China Three Gorges University) · Yingyi Chen (Department of Electrical Engineering, KU Leuven, Belgium, KU Leuven) · Xuanlong Yu (Université Paris-Saclay) · Dexiong Chen (Max Planck Institute of Biochemistry) · Xi Shen (Tencent AI Lab) |
616 | One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models | Lin Li (King's College London) · Haoyan Guan (King's College London, University of London) · Jianing Qiu (Imperial College London) · Michael Spratling (King's College London and University of Luxembourg) |
617 | Progressive Semantic-Guided Vision Transformer for Zero-Shot Learning | Shiming Chen (Carnegie Mellon University) · Wenjin Hou (Huazhong University of Science and Technology) · Salman Khan (Mohamed bin Zayed University of Artificial Intelligence) · Fahad Shahbaz Khan (Inception Institute of Artificial Intelligence) |
618 | Domain-Rectifying Adapter for Cross-domain Few-Shot Segmentation | 嘉鹏 苏 (Harbin Institute of Technology) · Qi Fan (The Hong Kong University of Science and Technology) · Wenjie Pei (Harbin Institute of Technology) · Guangming Lu (Harbin Institute of Technology, Shenzhen) · Fanglin Chen (Harbin Institute of Technology (Shenzhen)) |
619 | Rethinking Prior Information Generation with CLIP for Few-Shot Segmentation | Jin Wang (China University of Petroleum) · Bingfeng Zhang (China University of Petroleum (East China)) · Jian Pang (China University of Petroleum (East China)) · Honglong Chen (China University of Petroleum) · Weifeng Liu (China University of Petroleum (East China)) |
620 | Adaptive Bidirectional Displacement for Semi-Supervised Medical Image Segmentation | Hanyang Chi (None) · Jian Pang (China University of Petroleum (East China)) · Bingfeng Zhang (China University of Petroleum (East China)) · Weifeng Liu (China University of Petroleum (East China)) |
621 | Semantic-Aware Multi-Label Adversarial Attacks | Hassan Mahmood (Northeastern University) · Ehsan Elhamifar (None) |
622 | CaKDP: Category-aware Knowledge Distillation and Pruning Framework for Lightweight 3D Object Detection | Haonan Zhang (Xi'an Jiaotong University) · Longjun Liu (Xi'an Jiaotong University) · Yuqi Huang (Xi'an Jiaotong University) · YangZhao (Xi'an Jiaotong University) · Xinyu Lei (Xi'an Jiaotong University) · Bihan Wen (Nanyang Technological University) |
623 | Visual-Augmented Dynamic Semantic Prototype for Generative Zero-Shot Learning | Wenjin Hou (Huazhong University of Science and Technology) · Shiming Chen (Carnegie Mellon University) · Shuhuang Chen (Huazhong University of Science and Technology) · Ziming Hong (The University of Sydney) · Yan Wang (Alibaba Group) · Xuetao Feng (Alibaba Group) · Salman Khan (Mohamed bin Zayed University of Artificial Intelligence) · Fahad Shahbaz Khan (Inception Institute of Artificial Intelligence) · Xinge You (Huazhong University of Science and Technology) |
624 | UnSAMFlow: Unsupervised Optical Flow Guided by Segment Anything Model | Shuai Yuan (Duke University, Meta, TikTok) · Lei Luo (Meta) · Zhuo Hui (Facebook) · Can Pu (Facebook) · Xiaoyu Xiang (Meta) · Rakesh Ranjan () · Denis Demandolx (Meta) |
625 | LLM4SGG: Large Language Model for Weakly Supervised Scene Graph Generation | Kibum Kim (Korea Advanced Institute of Science & Technology) · Kanghoon Yoon (Korea Advanced Institute of Science & Technology) · Jaehyeong Jeon (Korea Advanced Institute of Science and Technology) · Yeonjun In (Korea Advanced Institute of Science & Technology) · Jinyoung Moon (ETRI) · Donghyun Kim (MIT-IBM Watson AI Lab) · Chanyoung Park (Korea Advanced Institute of Science and Technology) |
626 | Towards Generalizable Tumor Synthesis | Qi Chen (University of Science and Technology of China) · Xiaoxi Chen (None) · Haorui Song (Johns Hopkins University) · Alan L. Yuille (Johns Hopkins University) · Zhiwei Xiong (None) · Chen Wei (Johns Hopkins University) · Zongwei Zhou (Johns Hopkins University) |
627 | Fusing Personal and Environmental Cues for Identification and Segmentation of First-Person Camera Wearers in Third-Person View | Ziwei Zhao (Indiana University) · Yuchen Wang (Indiana University) · Chuhua Wang (Indiana University, Bloomington) |
628 | Any-Shift Prompting for Generalization over Distributions | Zehao Xiao (University of Amsterdam) · Jiayi Shen (University of Amsterdam) · Mohammad Mahdi Derakhshani (University of Amsterdam) · Shengcai Liao (Inception Institute of Artificial Intelligence) · Cees G. M. Snoek (University of Amsterdam) |
629 | Mask Grounding for Referring Image Segmentation | Yong Xien Chng (None) · Henry Zheng (Tsinghua University) · Yizeng Han (Tsinghua University, Tsinghua University) · Xuchong QIU (Bosch) · Gao Huang (Tsinghua University, Tsinghua University) |
630 | EGTR: Extracting Graph from Transformer for Scene Graph Generation | Jinbae Im (NAVER Cloud) · Jeongyeon Nam (Naver Cloud) · Nokyung Park (Korea University) · Hyungmin Lee (NAVER) · Seunghyun Park (NAVER Cloud) |
631 | Temperature-based Backdoor Attacks on Thermal Infrared Object Detection | Wen Yin (Huazhong University of Science and Technology) · Jian Lou (Zhejiang University) · Pan Zhou (Huazhong University of Science and Technology) · Yulai Xie (Huazhong University of Science and Technology) · Dan Feng (Huazhong University of Science and Technology) · Yuhua Sun (None) · Tailai Zhang (Huazhong University of Science and Technology) · Lichao Sun (Lehigh University) |
632 | SPAD: Spatially Aware Multiview Diffusers | Yash Kant (University of Toronto / Snap Research) · Aliaksandr Siarohin (Snap Inc.) · Ziyi Wu (University of Toronto) · Michael Vasilkovsky (Snap Inc.) · Guocheng Qian (KAUST) · Jian Ren (Snap Inc.) · Riza Alp Guler (Snap Inc.) · Bernard Ghanem (KAUST) · Sergey Tulyakov (Snap Inc.) · Igor Gilitschenski (University of Toronto) |
633 | Training Generative Image Super-Resolution Models by Wavelet-Domain Losses Enables Better Control of Artifacts | Cansu Korkmaz (Koc University) · Ahmet Murat Tekalp (Koç University) · Zafer Dogan (Koc University) |
634 | Diffusion-driven GAN Inversion for Multi-Modal Facial Image Generation | Jihyun Kim (Yonsei University, LG Electronics) · Changjae Oh (Queen Mary University London) · Hoseok Do (LG Electronics) · Soohyun Kim (Korea University) · Kwanghoon Sohn (Yonsei University) |
635 | Enhancing Intrinsic Features for Debiasing via Investigating Class-Discerning Common Attributes in Bias-Contrastive Pair | Jeonghoon Park (Korea Advanced Institute of Science and Technology) · Chaeyeon Chung (Korea Advanced Institute of Science and Technology) · Jaegul Choo (Korea Advanced Institute of Science and Technology) |
636 | CGI-DM: Digital Copyright Authentication for Diffusion Models via Contrasting Gradient Inversion | Xiaoyu Wu (Shanghai Jiaotong University) · Yang Hua (Queen's University Belfast) · Chumeng Liang (University of Southern California) · Jiaru Zhang (Shanghai Jiao Tong University) · Hao Wang (Louisiana State University) · Tao Song (Shanghai Jiao Tong University) · Haibing Guan (Shanghai Jiaotong University) |
637 | Posterior Distillation Sampling | Juil Koo (None) · Chanho Park (KAIST) · Minhyuk Sung (KAIST) |
638 | Evidential Active Recognition: Intelligent and Prudent Open-World Embodied Perception | Lei Fan (Northwestern University) · Mingfu Liang (Northwestern University) · Yunxuan Li (Northwestern University) · Gang Hua (Wormpex AI Research) · Ying Wu (Northwestern University) |
639 | AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving | Mingfu Liang (Northwestern University) · Jong-Chyi Su (None) · Samuel Schulter () · Sparsh Garg (NEC Laboratories America) · Shiyu Zhao (Rutgers University, New Brunswick) · Ying Wu (Northwestern University) · Manmohan Chandraker (UC San Diego) |
640 | SDPose: Tokenized Pose Estimation via Circulation-Guide Self-Distillation | Chen Sichen (Shanghai Jiao Tong University) · Yingyi Zhang (Tencent Youtu Lab) · Siming Huang (Duke University) · Ran Yi (Shanghai Jiao Tong University) · Ke Fan (Shanghai Jiaotong University) · Ruixin Zhang (Tencent Youtu Lab) · Peixian Chen (Xiamen University) · Jun Wang (None) · Shouhong Ding (Tencent Youtu Lab) · Lizhuang Ma (Dept. of Computer Sci. & Eng., Shanghai Jiao Tong University) |
641 | MindBridge: A Cross-Subject Brain Decoding Framework | Shizun Wang (National University of Singapore) · Songhua Liu (None) · Zhenxiong Tan (National University of Singapore) · Xinchao Wang (National University of Singapore) |
642 | A Closer Look at Audio-Visual Segmentation | Yuanhong Chen (University of Adelaide) · Yuyuan Liu (University of Adelaide) · Hu Wang (The University of Adelaide) · Fengbei Liu (Cornell University) · Chong Wang (University of Adelaide) · Helen Frazer (BreastScreen Victoria) · Gustavo Carneiro (University of Surrey) |
643 | Generalizable Whole Slide Image Classification with Fine-Grained Visual-Semantic Interaction | Hao Li (Xiamen University) · Ying Chen (Xiamen University) · Yifei Chen (Huawei) · Rongshan Yu (National University of Singapore) · Wenxian Yang (Aginome Scientific) · Liansheng Wang (Xiamen University, Tsinghua University) · Bowen Ding (Shanghai Jiaotong University) · Yuchen Han (Shanghai Jiaotong University) |
644 | Efficient Hyperparameter Optimization with Adaptive Fidelity Identification | Jiantong Jiang (The University of Western Australia) · Zeyi Wen (Hong Kong University of Science and Technology (Guangzhou)) · Atif Mansoor (University of Western Australia) · Ajmal Mian (University of Western Australia) |
645 | GPT-4V(ision) is a Versatile and Human-Aligned Evaluator for Text-to-3D Generation | Tong Wu (None) · Guandao Yang (None) · Zhibing Li (The Chinese University of Hong Kong) · Kai Zhang (Adobe Systems) · Ziwei Liu (Nanyang Technological University) · Leonidas Guibas (Stanford University) · Dahua Lin (The Chinese University of Hong Kong) · Gordon Wetzstein (Stanford University) |
646 | Do You Remember? Dense Video Captioning with Cross-Modal Memory Retrieval | Minkuk Kim (Kyung Hee University) · Hyeon Kim (Kyunghee University) · Jinyoung Moon (ETRI) · Jinwoo Choi (Kyung Hee University) · Seong Tae Kim (Kyung Hee University) |
647 | A Novel Two-stage UDF Learning Method for Robust Non-watertight Model Reconstruction from Multi-view Images | Junkai Deng (Chinese Academy of Sciences, Chinese Academy of Sciences) · Fei Hou (Institute of Software, Chinese Academy of Sciences) · Xuhui Chen (Chinese Academy of Sciences, Chinese Academy of Sciences) · Wencheng Wang (Institute of Software, Chinese Academy of Sciences) · Ying He (Nanyang Technological University) |
648 | Non-rigid Structure-from-Motion: Temporally-smooth Procrustean Alignment and Spatially-variant Deformation Modeling | Jiawei Shi (Northwest Polytechnical University Xi'an) · Hui Deng (Northwest Polytechnical University Xi'an) · Yuchao Dai (Northwestern Polytechnical University) |
649 | Weakly-Supervised Emotion Transition Learning for Diverse 3D Co-speech Gesture Generation | Xingqun Qi (The Hong Kong University of Science and Technology) · Jiahao Pan (Hong Kong University of Science and Technology) · Peng Li (Tsinghua University) · Ruibin Yuan (Hong Kong University of Science and Technology) · Xiaowei Chi (Hong Kong University of Science and Technology) · Mengfei Li (Hong Kong University of Science and Technology) · Wenhan Luo (SUN YAT-SEN UNIVERSITY) · Wei Xue (Hong Kong University of Science and Technology) · Shanghang Zhang (Peking University) · Qifeng Liu (The Hong Kong University of Science and Technology) · Yike Guo (Imperial College London) |
650 | MuseChat: A Conversational Music Recommendation System for Videos | Zhikang Dong (State University of New York at Stony Brook) · Bin Chen (Bytedance Inc.) · Xiulong Liu (University of Washington) · Pawel Polak (State University of New York at Stony Brook) · Peng Zhang (Bytedance) |
651 | Controlling Encoder of Deep Video Compression for Machine | Xingtong Ge (Beijing Institute of Technology) · Jixiang Luo (None) · XINJIE ZHANG (The Hong Kong University of Science and Technology) · Tongda Xu (Tsinghua University) · Guo Lu (Shanghai Jiaotong University) · Dailan He (The Chinese University of Hong Kong) · Jing Geng (Beijing Institute of Technology) · Yan Wang (Tsinghua University, Tsinghua University) · Jun Zhang (The Hong Kong University of Science and Technology) · Hongwei Qin (SenseTime Co.) |
652 | Boosting Neural Representations for Videos with a Conditional Decoder | XINJIE ZHANG (The Hong Kong University of Science and Technology) · Ren Yang (Microsoft Research) · Dailan He (The Chinese University of Hong Kong) · Xingtong Ge (Beijing Institute of Technology) · Tongda Xu (Tsinghua University) · Yan Wang (Tsinghua University, Tsinghua University) · Hongwei Qin (SenseTime Co.) · Jun Zhang (The Hong Kong University of Science and Technology) |
653 | Revamping Federated Learning Security from a Defender's Perspective: A Unified Defense with Homomorphic Encrypted Data Space | Naveen Kumar Kummari (Indian Institute of Technology Hyderabad, India) · Reshmi Mitra (Southeast Missouri State University) · Krishna Mohan Chalavadi (Indian Institute of Technology Hyderabad) |
654 | iToF-flow-based High Frame Rate Depth Imaging | Yu Meng (Nanjing University) · Zhou Xue (Li Auto) · Xu Chang (Bytedance Inc) · Xuemei Hu (Nanjing University) · Tao Yue (Nanjing University) |
655 | BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation | Yunhao Ge (University of Southern California) · Yihe Tang (Stanford University) · Jiashu Xu (University of Southern California) · Cem Gokmen (NVIDIA) · Chengshu Li (Stanford University) · Wensi Ai (Stanford University) · Benjamin Martinez (Stanford University) · Arman Aydin (Stanford University) · Mona Anvari (Computer Science Department, Stanford University) · Ayush Chakravarthy (Stanford University) · Hong-Xing Yu (Computer Science Department, Stanford University) · Josiah Wong (Stanford University) · Sanjana Srivastava (Stanford University) · Sharon Lee (Stanford University) · Shengxin Zha (Meta GenAI) · Laurent Itti (USC) · Yunzhu Li (University of Illinois Urbana-Champaign) · Roberto Martín-Martín (University of Texas at Austin) · Miao Liu (META AI) · Pengchuan Zhang (Meta AI) · Ruohan Zhang (Stanford University) · Li Fei-Fei (Stanford University) · Jiajun Wu (Stanford University) |
656 | Theoretically Achieving Continuous Representation of Oriented Bounding Boxes | Zikai Xiao (None) · Guo-Ye Yang (None) · Xue Yang (Shanghai AI Laboratory) · Tai-Jiang Mu (Tsinghua University, Tsinghua University) · Junchi Yan (Shanghai Jiao Tong University) · Shi-Min Hu (Tsinghua University, Tsinghua University) |
657 | Point2RBox: Combine Knowledge from Synthetic Visual Patterns for End-to-end Oriented Object Detection with Single Point Supervision | Yi Yu (Southeast University) · Xue Yang (Shanghai AI Laboratory) · Qingyun Li (Harbin Institute of Technology) · Feipeng Da (Southeast University) · Jifeng Dai (Tsinghua University, Tsinghua University) · Yu Qiao (Shanghai Artificial Intelligence Laboratory) · Junchi Yan (Shanghai Jiao Tong University) |
658 | PointOBB: Learning Oriented Object Detection via Single Point Supervision | Junwei Luo (Wuhan University) · Xue Yang (Shanghai AI Laboratory) · Yi Yu (Southeast University) · Qingyun Li (Harbin Institute of Technology) · Junchi Yan (Shanghai Jiao Tong University) · Yansheng Li (Wuhan University) |
659 | Practical Measurements of Translucent Materials with Inter-Pixel Translucency Prior | Zhenyu Chen (Nanjing University) · Jie Guo (Nanjing University) · Shuichang Lai (Nanjing University) · Ruoyu Fu (nanjing university) · mengxun kong (None) · Chen Wang (Nanjing University) · Hongyu Sun (Guangdong Oppo Mobile Telecommunications Corp., Ltd) · Zhebin Zhang (OPPO) · Chen Li (Innopeak Technology Inc.) · Yanwen Guo (Nanjing University) |
660 | Cross Initialization for Personalized Text-to-Image Generation | Lianyu Pang (None) · Jian Yin () · Haoran Xie (Lingnan University) · Qiping Wang (East China Normal University) · Qing Li (The Hong Kong Polytechnic University, Hong Kong Polytechnic University) · Xudong Mao (None) |
661 | Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications | Yuwen Xiong (University of Toronto) · Zhiqi Li (Nanjing University) · Yuntao Chen (CAIR, HKISI, CAS) · Feng Wang (Tsinghua University, Tsinghua University) · Xizhou Zhu (Shanghai AI Laboratory) · Jiapeng Luo (SenseTime Research) · Wenhai Wang (Shanghai AI Laboratory) · Tong Lu (Nanjing University) · Hongsheng Li (The Chinese University of Hong Kong) · Yu Qiao (Shanghai Artificial Intelligence Laboratory) · Lewei Lu (SenseTime) · Jie Zhou (None) · Jifeng Dai (Tsinghua University, Tsinghua University) |
662 | Resolution Limit of Single-Photon LIDAR | Stanley H. Chan (Purdue University, USA) · Hashan Weerasooriya (Purdue University) · Weijian Zhang (Purdue University) · Pamela Abshire (University of Maryland, College Park) · Istvan Gyongy (University of Edinburgh, University of Edinburgh) · Robert Henderson (University of Edinburgh) |
663 | MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World | Yining Hong () · Zishuo Zheng (None) · Peihao Chen (South China University of Technology) · Yian Wang (Department of Computer Science, University of Massachusetts at Amherst) · Junyan Li (Zhejiang University) · Chuang Gan (MIT-IBM Watson AI Lab) |
664 | Listening and Imagining: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio | Chao Xu (Zhejiang University) · Yang Liu (Alibaba Group) · Jiazheng Xing (Zhejiang University) · Weida Wang (Xingji Meizu Group) · Mingze Sun (None) · Jun Dan (Zhejiang University) · Tianxin Huang (Tencent youtu lab) · Siyuan Li (Westlake University, Zhejiang University) · Zhi-Qi Cheng (Carnegie Mellon University) · Ying Tai (Nanjing University) · Baigui Sun (Alibaba Group) |
665 | PortraitBooth: A Versatile Portrait Model for Fast Identity-preserved Personalization | Xu Peng (Xiamen University) · Junwei Zhu (Tencent Youtu Lab) · Boyuan Jiang (Tencent Youtu Lab) · Ying Tai (Nanjing University) · Donghao Luo (Tencent YouTu Lab) · Jiangning Zhang (Tencent Youtu Lab) · Wei Lin (Xiamen University) · Taisong Jin (Xiamen University) · Chengjie Wang (Shanghai Jiao Tong University) · Rongrong Ji (Xiamen University) |
666 | Prompting Vision Foundation Models for Pathology Image Analysis | CHONG YIN (None) · Siqi Liu (Shenzhen Research Institute of Big Data) · Kaiyang Zhou (Hong Kong Baptist University) · Vincent Wong (The Chinese University of Hong Kong) · Pong C. Yuen (Hong Kong Baptist University) |
667 | VA3: Virtually Assured Amplification Attack on Probabilistic Copyright Protection for Text-to-Image Generative Models | Xiang Li (National University of Singapore) · Qianli Shen (National University of Singapore) · Kenji Kawaguchi (National University of Singapore) |
668 | LPSNet: End-to-End Human Pose and Shape Estimation with Lensless Imaging | Haoyang Ge (Tianjin University) · Qiao Feng (None) · Hailong Jia (Tianjin University) · Xiongzheng Li (None) · Xiangjun Yin (None) · You Zhou (Nanjing University) · Jingyu Yang (Tianjin University) · Kun Li (None) |
669 | Learning with Unreliability: Fast Few-shot Voxel Radiance Fields with Relative Geometric Consistency | Xu Yingjie (None) · Bangzhen Liu (South China University of Technology) · Hao Tang (School of Computer Science and Engineering, Nanjing University of Science and Technology) · Bailin Deng (Cardiff University) · Shengfeng He (Singapore Management University) |
670 | SeeSR: Towards Semantics-Aware Real-World Image Super-Resolution | Rongyuan Wu (Hong Kong Polytechnic University) · Tao Yang (Tsinghua University, Tsinghua University) · Lingchen Sun (Hong Kong Polytechnic University) · Zhengqiang ZHANG (The Hong Kong Polytechnic University, Hong Kong Polytechnic University) · Shuai Li (The Hong Kong Polytechnic University) · Lei Zhang (The Hong Kong Polytechnic University) |
671 | UniVS: Unified and Universal Video Segmentation with Prompts as Queries | Minghan Li (The Hong Kong Polytechnic University) · Shuai Li (The Hong Kong Polytechnic University) · Xindong Zhang (The Hong Kong Polytechnic University, Hong Kong Polytechnic University) · Lei Zhang (The Hong Kong Polytechnic University) |
672 | InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks | Zhe Chen (Nanjing University) · Jiannan Wu (University of Hong Kong) · Wenhai Wang (Shanghai AI Laboratory) · Weijie Su (University of Science and Technology of China) · Guo Chen (Nanjing University) · Sen Xing (Tsinghua University, Tsinghua University) · Zhong Muyan (Tsinghua University, Tsinghua University) · Qing-Long Zhang (Shanghai Artificial Intelligence Laboratory) · Xizhou Zhu (Shanghai AI Laboratory) · Lewei Lu (SenseTime) · Bin Li (University of Science and Technology of China) · Ping Luo (The University of Hong Kong) · Tong Lu (Nanjing University) · Yu Qiao (Shanghai Artificial Intelligence Laboratory) · Jifeng Dai (Tsinghua University, Tsinghua University) |
673 | Unsegment Anything by Simulating Deformation | Jiahao Lu (National University of Singapore) · Xingyi Yang (National University of Singapore) · Xinchao Wang (National University of Singapore) |
674 | Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation guided by the Characteristic Dance Primitives | Ronghui Li (Tsinghua University) · Yuxiang Zhang (Tsinghua University, Tsinghua University) · Yachao Zhang (Tsinghua University) · Hongwen Zhang (Beijing Normal University) · Jie Guo (Peng Cheng Laboratory) · Yan Zhang (ETH Zurich) · Yebin Liu (Tsinghua University) · Xiu Li (Tsinghua University) |
675 | Multimodal Representation Learning by Alternating Unimodal Adaptation | Xiaohui Zhang (Beijing Jiaotong University) · Jaehong Yoon (University of North Carolina at Chapel Hill) · Mohit Bansal (University of North Carolina at Chapel Hill) · Huaxiu Yao (Department of Computer Science, University of North Carolina at Chapel Hill) |
676 | Efficient Model Stealing Defense with Noise Transition Matrix | Dong-Dong Wu (Southeast University) · Chilin Fu (Ant Group) · Weichang Wu (Alibaba Group) · Wenwen Xia (Shanghai Jiaotong University) · Xiaolu Zhang (None) · JUN ZHOU (Ant Group) · Min-Ling Zhang (Southeast University) |
677 | CrossKD: Cross-Head Knowledge Distillation for Dense Object Detection | JiaBao Wang (Nankai University) · yuming chen (None) · Zhaohui Zheng (Nankai University) · Xiang Li (Nankai University) · Ming-Ming Cheng (Nankai University, Tsinghua University) · Qibin Hou (Nankai University) |
678 | Image Sculpting: Precise Object Editing with 3D Geometry Control | Jiraphon Yenphraphai (New York University) · Xichen Pan (New York University) · Sainan Liu (Intel) · Daniele Panozzo (New York University) · Saining Xie (Facebook) |
679 | Balancing Act: Distribution-Guided Debiasing in Diffusion Models | Rishubh Parihar (Indian Institute of Science, Bangalore) · Abhijnya Bhat (Indian Institute of Science, Indian institute of science, Bangalore) · Abhipsa Basu (Indian Institute of Science) · Saswat Mallick (Indian Institute of Science, Indian institute of science, Bangalore) · Jogendra Kundu Kundu (None) · R. Venkatesh Babu (Indian Institute of Science) |
680 | A Simple and Effective Point-based Network for Event Camera 6-DOFs Pose Relocalization | Hongwei Ren (Hong Kong University of Science and Technology) · Jiadong Zhu (The Hong Kong University of Science and Technology (Guangzhou)) · Yue Zhou (Hong Kong University of Science and Technology) · Haotian FU (Hong Kong University of Science and Technology) · Yulong Huang (Central South University) · Bojun Cheng (Hong Kong University of Science and Technology) |
681 | Towards 3D Vision with Low-Cost Single-Photon Cameras | Fangzhou Mu (NVIDIA) · Carter Sifferman (University of Wisconsin - Madison) · Sacha Jungerman (University of Wisconsin - Madison) · Yiquan Li (University of Wisconsin - Madison) · Zhiyue Han (None) · Michael Gleicher (Department of Computer Sciences, University of Wisconsin - Madison) · Mohit Gupta (Department of Computer Science, University of Wisconsin - Madison) · Yin Li (University of Wisconsin, Madison) |
682 | LayoutFormer: Hierarchical Text Detection Towards Scene Text Understanding | Min Liang (University of Science and Technology Beijing) · Jia-Wei Ma (University of Science and Technology Beijing) · Xiaobin Zhu (University of Science and Technology Beijing) · Jingyan Qin (University of Science and Technology Beijing) · Xu-Cheng Yin (University of Science and Technology Beijing) |
683 | DifFlow3D: Toward Robust Uncertainty-Aware Scene Flow Estimation with Iterative Diffusion-Based Refinement | Jiuming Liu (Shanghai Jiao Tong University) · Guangming Wang (University of Cambridge) · Weicai Ye (Zhejiang University) · Chaokang Jiang () · Jinru Han (Shanghai Jiao Tong University) · Zhe Liu (Shanghai Jiaotong University) · Guofeng Zhang (Zhejiang University) · Dalong Du (PhiGent Robotics) · Hesheng Wang (Shanghai Jiao Tong University) |
684 | OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition | Jianqiang Wan (Alibaba Group) · Sibo Song (Alibaba Group) · Wenwen Yu (Huazhong University of Science and Technology) · Yuliang Liu (Huazhong University of Science and Technology) · Wenqing Cheng (Huazhong University of Science and Technology) · Fei Huang (Alibaba Group) · Xiang Bai (Huazhong University of Science and Technology) · Cong Yao (Alibaba DAMO Academy) · Zhibo Yang (Alibaba Group) |
685 | Inter-X: Towards Versatile Human-Human Interaction Analysis | Liang Xu (Shanghai Jiao Tong University) · Xintao Lv (Shanghai Jiaotong University) · Yichao Yan (Shanghai Jiao Tong University) · Xin Jin (Eastern Institute for Advanced Study) · Wu Shuwen (Shanghai Jiaotong University) · Congsheng Xu (Shanghai Jiaotong University) · Yifan Liu (Shanghai Jiao Tong University) · Yizhou Zhou (WeChat AI) · Fengyun Rao (WeChat, Tencent Inc.) · Xingdong Sheng (Shanghai Jiaotong University) · Yunhui LIU (Lenovo Research) · Wenjun Zeng (None) · Xiaokang Yang (Shanghai Jiao Tong University, China) |
686 | Adapt or Perish: Adaptive Sparse Transformer with Attentive Feature Refinement for Image Restoration | Shihao Zhou (Nankai University) · Duosheng Chen (Nankai University) · Jinshan Pan (Nanjing University of Science and Technology) · Jinglei Shi (Nankai University) · Jufeng Yang (None) |
687 | ES³: Evolving Self-Supervised Learning of Robust Audio-Visual Speech Representations | Yuanhang Zhang (Institute of Computing Technology, Chinese Academy of Sciences) · Shuang Yang (Institute of Computing Technology, Chinese Academy of Sciences) · Shiguang Shan (Institute of Computing Technology, Chinese Academy of Sciences) · Xilin Chen (None) |
688 | DreamControl: Control-Based Text-to-3D Generation with 3D Self-Prior | Tianyu Huang (Harbin Institute of Technology & City University of Hong Kong) · Yihan Zeng (Huawei Technologies Ltd.) · Zhilu Zhang (Harbin Institute of Technology) · Wan Xu (Harbin Institute of Technology) · Hang Xu (Huawei Noah's Ark Lab) · Songcen Xu (Huawei Noah's Ark Lab) · Rynson W.H. Lau (City University of Hong Kong) · Wangmeng Zuo (Harbin Institute of Technology) |
689 | ProxyCap: Real-time Monocular Full-body Capture in World Space via Human-Centric Proxy-to-Motion Learning | Yuxiang Zhang (Tsinghua University, Tsinghua University) · Hongwen Zhang (Beijing Normal University) · Liangxiao Hu (Harbin Institute of Technology) · Jiajun Zhang (Beijing University of Posts and Telecommunications) · Hongwei Yi (Max Planck Institute for Intelligent Systems, Max-Planck Institute) · Shengping Zhang (Harbin Institute of Technology) · Yebin Liu (Tsinghua University) |
690 | FedMef: Towards Memory-efficient Federated Dynamic Pruning | Hong Huang (City University of Hong Kong) · Weiming Zhuang (Sony Research) · Chen Chen (Sony AI) · Lingjuan Lyu (Sony AI) |
691 | Building Bridges across Spatial and Temporal Resolutions: Reference-Based Super-Resolution via Change Priors and Conditional Diffusion Model | Runmin Dong (Tsinghua University, Tsinghua University) · Shuai Yuan (None) · Bin Luo (Tsinghua University) · Mengxuan Chen (None) · Jinxiao Zhang (Tsinghua University, Tsinghua University) · Lixian Zhang (National Supercomputing Center in Shenzhen) · Weijia Li (None) · Juepeng Zheng (Sun Yat-Sen University) · Haohuan Fu (Tsinghua University, Tsinghua University) |
692 | Re-thinking Data Availability Attacks Against Deep Neural Networks | Bin Fang (Shanghai Jiao Tong University) · Bo Li (Tencent Youtu Lab) · Shuang Wu (Tencent YouTu Lab) · Shouhong Ding (Tencent Youtu Lab) · Ran Yi (Shanghai Jiao Tong University) · Lizhuang Ma (Dept. of Computer Sci. & Eng., Shanghai Jiao Tong University) |
693 | GLID: Pre-training a Generalist Encoder-Decoder Vision Model | Jihao Liu (The Chinese University of Hong Kong) · Jinliang Zheng (SenseTime) · Yu Liu (The Chinese University of Hong Kong) · Hongsheng Li (The Chinese University of Hong Kong) |
694 | Depth Information Assisted Collaborative Mutual Promotion Network for Single Image Dehazing | Yafei Zhang (Kunming University of Science and Technology) · Shen Zhou (Kunming University of Science and Technology) · Huafeng Li (Kunming University of Science and Technology) |
695 | Text-Enhanced Data-free Approach for Federated Class-Incremental Learning | Minh-Tuan Tran (Monash University) · Trung Le (Monash University) · Xuan-May Le (University of Melbourne) · Mehrtash Harandi (Monash University) · Dinh Phung (Monash University) |
696 | NAYER: Noisy Layer Data Generation for Efficient and Effective Data-free Knowledge Distillation | Minh-Tuan Tran (Monash University) · Trung Le (Monash University) · Xuan-May Le (University of Melbourne) · Mehrtash Harandi (Monash University) · Quan Tran (servicenow) · Dinh Phung (Monash University) |
697 | Synergistic Global-space Camera and Human Reconstruction from Videos | Yizhou Zhao (Carnegie Mellon University) · Tuanfeng Y. Wang (None) · Bhiksha Raj (Carnegie Mellon University) · Min Xu (Carnegie Mellon University) · Jimei Yang (Adobe Research) · Chun-Hao P. Huang (Adobe Systems) |
698 | PLACE: Adaptive Layout-Semantic Fusion for Semantic Image Synthesis | Zhengyao Lv (University of Hong Kong) · Yuxiang Wei (The Hong Kong Polytechnic University, Hong Kong Polytechnic University) · Wangmeng Zuo (Harbin Institute of Technology) · Kwan-Yee K. Wong (The University of Hong Kong) |
699 | iKUN: Speak to Trackers without Retraining | Yunhao Du (Beijing University of Posts and Telecommunications) · Cheng Lei (Beijing University of Posts and Telecommunications) · Zhicheng Zhao (Beijing University of Posts and Telecommunications) · Fei Su (Beijing University of Posts and Telecommunications) |
700 | ACT: Adversarial Consistency Models | Fei Kong (University of Electronic Science and Technology of China) · Jinhao Duan (Drexel University) · Lichao Sun (Lehigh University) · Hao Cheng (Hong Kong University of Science and Technology(Guangzhou)) · Renjing Xu (Hong Kong University of Science and Technology (Guangzhou)) · Heng Tao Shen (University of Electronic Science and Technology of China) · Xiaofeng Zhu (University of Electronic Science and Technology of China) · Xiaoshuang Shi (University of Electronic Science and Technology of China) · Kaidi Xu (Drexel University) |
701 | PI3D: Efficient Text-to-3D Generation with Pseudo-Image Diffusion | Ying-Tian Liu (Tsinghua University, Tsinghua University) · Yuan-Chen Guo (Tsinghua University) · Guan Luo (Tsinghua University, Tsinghua University) · Heyi Sun (Tsinghua University, Tsinghua University) · Wei Yin (Shenzhen DJI Sciences and Technologies Ltd.) · Song-Hai Zhang (Tsinghua University, Tsinghua University) |
702 | A Conditional Denoising Diffusion Probabilistic Model for Point Cloud Upsampling | Qu Wentao (Nanjing University of Science and Technology) · Yuantian Shao (Nanjing University of Science and Technology) · Lingwu Meng (Nanjing University of Science and Technology) · Xiaoshui Huang (Shanghai AI Laboratory) · Liang Xiao (Nanjing University of Science and Technology) |
703 | NeRF Director: Revisiting View Selection in Neural Volume Rendering | Wenhui Xiao (Queensland University of Technology) · Rodrigo Santa Cruz (CSIRO) · David Ahmedt-Aristizabal (CSIRO) · Olivier Salvado (CSIRO) · Clinton Fookes (Queensland University of Technology) · Leo Lebrat (CSIRO / QUT) |
704 | Efficient Privacy-Preserving Visual Localization Using 3D Ray Clouds | Heejoon Moon (Hanyang University) · Chunghwan Lee (Hanyang University) · Je Hyeong Hong (Hanyang University) |
705 | CFAT: Unleashing Triangular Windows for Image Super-resolution | Abhisek Ray (Indian Institute of Technology, Patna) · Gaurav Kumar (Indian Institute of Technology (IIT), Patna) · Maheshkumar Kolekar (Indian Institute of Technology, Patna) |
706 | Exact Fusion via Feature Distribution Matching for Few-shot Image Generation | Yingbo Zhou (East China Normal University) · Yutong Ye (None) · Pengyu Zhang (East China Normal University) · Xian Wei (Chinese Academy of Sciences) · Mingsong Chen (East China Normal University) |
707 | WorDepth: Variational Language Prior for Monocular Depth Estimation | Ziyao Zeng (Yale University) · Hyoungseob Park (Yale University) · Fengyu Yang (Yale University) · Daniel Wang (Yale University) · Stefano Soatto (University of California, Los Angeles) · Dong Lao (University of California, Los Angeles) · Alex Wong (Yale University) |
708 | Test-Time Adaptation for Depth Completion | Hyoungseob Park (Yale University) · Anjali W Gupta (Yale) · Alex Wong (Yale University) |
709 | FairVLMed Dataset: Harnessing Fairness in Vision-and-Language Learning via FairCLIP | Yan Luo (Harvard Ophthalmology AI Lab) · MIN SHI (Harvard University) · Muhammad Osama Khan (New York University) · Muhammad Muneeb Afzal (New York University) · Hao Huang (New York University) · Shuaihang Yuan (New York University) · Yu Tian (None) · Luo Song (Mass Eye and Ear) · Ava Kouhana (Harvard Ophthalmology AI lab) · Tobias Elze (Harvard University) · Yi Fang (New York University) · Mengyu Wang (Harvard University) |
710 | Enhancing 3D Object Detection with 2D Detection-Guided Query Anchors | Haoxuanye Ji (Xi'an Jiaotong University) · Pengpeng Liang (Zhengzhou University) · Erkang Cheng (Nullmax) |
711 | Taming Self-Training for Open-Vocabulary Object Detection | Shiyu Zhao (Rutgers University, New Brunswick) · Samuel Schulter () · Long Zhao (Google Research) · Zhixing Zhang (Rutgers University) · Vijay Kumar BG (NEC Laboratories America) · Yumin Suh (NEC Labs America) · Manmohan Chandraker (UC San Diego) · Dimitris N. Metaxas (Rutgers) |
712 | Distilling Vision-Language Models on Millions of Videos | Yue Zhao (UT Austin) · Long Zhao (Google Research) · Xingyi Zhou (Google) · Jialin Wu (Google) · Chun-Te Chu (Google) · Hui Miao (Google) · Florian Schroff (Google) · Hartwig Adam (Google Research) · Ting Liu (Google Research) · Boqing Gong (Google) · Philipp Krähenbühl (University of Texas at Austin) · Liangzhe Yuan (Google) |
713 | Generating Enhanced Negatives for Training Language-Based Object Detectors | Shiyu Zhao (Rutgers University, New Brunswick) · Long Zhao (Google Research) · Vijay Kumar BG (NEC Laboratories America) · Yumin Suh (NEC Labs America) · Dimitris N. Metaxas (Rutgers) · Manmohan Chandraker (UC San Diego) · Samuel Schulter () |
714 | IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection | Junbo Yin (Beijing Institute of Technology) · Wenguan Wang (Zhejiang University) · Runnan Chen (None) · Wei Li (Inceptio) · Ruigang Yang (Inceptio) · Pascal Frossard (EPFL) · Jianbing Shen (University of Macau) |
715 | An Empirical Study of Scaling Law for OCR | Miao Rang (Huawei Noah's Ark Lab) · Zhenni Bi (Huawei Noah Ark Lab) · Chuanjian Liu (Huawei Technologies Ltd.) · Yunhe Wang (Huawei Noah's Ark Lab) · Kai Han (Huawei Noah's Ark Lab) |
716 | Video-Based Human Pose Regression via Decoupled Space-Time Aggregation | Jijie He (Zhejiang Gongshang University) · Wenwu Yang (Zhejiang Gongshang University) |
717 | Unifying Correspondence, Pose and NeRF for Pose-Free Novel View Synthesis from Stereo Pairs | Sunghwan Hong (Korea University) · Jaewoo Jung (Korea University) · Heeseong Shin (Korea University) · Jiaolong Yang (Microsoft Research) · Chong Luo (Microsoft Research Asia) · Seungryong Kim (Korea University) |
718 | Boosting Adversarial Training via Fisher-Rao Norm-based Regularization | Xiangyu Yin (University of Liverpool) · Wenjie Ruan (University of Exeter) |
719 | Rethinking Visual Feature Extraction: Modeling Representatives from A Neural Clustering View | Guikun Chen (Zhejiang University) · Xia Li (Department of Computer Science, ETHZ - ETH Zurich) · Yi Yang (Zhejiang University) · Wenguan Wang (Zhejiang University) |
720 | Close Imitation of Expert Retouching for Black-and-White Photography | Seunghyun Shin (None) · Jisu Shin (None) · Jihwan Bae (CHA University, School of Medicine) · Inwook Shim (Inha University) · Hae-Gon Jeon (None) |
721 | Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs | Kanchana Ranasinghe (None) · Satya Narayan Shukla (Meta AI) · Omid Poursaeed (Meta AI) · Michael Ryoo (Stony Brook University) · Tsung-Yu Lin (Department of Computer Science, University of Massachusetts, Amherst) |
722 | Virtual Immunohistochemistry Staining for Histological Images Assisted by Weakly-supervised Learning | Jiahan Li (Harbin Institute of Technology) · Jiuyang Dong (Harbin Institute of Technology) · Shenjin Huang (None) · Xi Li (Department of Gastroenterology, Shenzhen Hospital, Peking University) · Junjun Jiang (Harbin Institute of Technology) · Xiaopeng Fan (Harbin Institute of Technology) · Yongbing Zhang (Harbin Institute of Technology) |
723 | TEA: Test-time Energy Adaptation | Yige Yuan (None) · Bingbing Xu (Institute of Computing Technology, Chinese Academy of Sciences) · Liang Hou (Kuaishou Technology) · Fei Sun (Institute of Computing Technology, Chinese Academy of Sciences) · Huawei Shen (Institute of Computing Technology, Chinese Academy of Sciences) · Xueqi Cheng (Chinese Academy of Sciences) |
724 | Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving | Yuqi Wang (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Jiawei He (Institute of automation, Chinese Academy of Sciences) · Lue Fan (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Hongxin Li (Institute of Automation, Chinese Academy of Sciences) · Yuntao Chen (CAIR, HKISI, CAS) · Zhaoxiang Zhang (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) |
725 | Hierarchical Diffusion Policy for Kinematics-Aware Multi-Task Robotic Manipulation | Xiao Ma (SEA AI Lab) · Sumit Patidar (Dyson) · Iain Haughton (Dyson Ltd) · Stephen James (Dyson) |
726 | ExtraNeRF: Visibility-Aware View Extrapolation of Neural Radiance Fields with Diffusion Models | Meng-Li Shih (University of Washington) · Wei-Chiu Ma (Cornell University) · Lorenzo Boyice (Google) · Aleksander Holynski (UC Berkeley & Google Research) · Forrester Cole (Google) · Brian Curless (University of Washington) · Janne Kontkanen (Research, Google) |
727 | Guess The Unseen: Dynamic 3D Scene Reconstruction from Partial 2D Glimpses | Inhee Lee (Seoul National University) · Byungjun Kim (Seoul National University) · Hanbyul Joo (None) |
728 | Makeup Prior Models for 3D Facial Makeup Estimation and Applications | Xingchao Yang (Cyberagent) · Takafumi Taketomi (CyberAgent) · Yuki Endo (University of Tsukuba) · Yoshihiro Kanamori (University of Tsukuba) |
729 | Video-conditioned Text Representations for Activity Recognition | Kumara Kahatapitiya (Stony Brook University) · Anurag Arnab (Google) · Arsha Nagrani (Google) · Michael Ryoo (Stony Brook University) |
730 | Multimodal autoregressive learning for time-aligned and contextual modalities | AJ Piergiovanni (Google) · Isaac Noble (Google) · Dahun Kim (Google) · Michael Ryoo (Stony Brook University) · Victor Gomes (Google) · Anelia Angelova (Google) |
731 | 3DSFLabelling: Boosting 3D Scene Flow Estimation by Pseudo Auto-labelling | Chaokang Jiang () · Guangming Wang (University of Cambridge) · Jiuming Liu (Shanghai Jiao Tong University) · Hesheng Wang (Shanghai Jiao Tong University) · Zhuang Ma (PhiGent) · Zhenqiang Liu (None) · LIANG (None) · Yi Shan (PhiGent Robotics) · Dalong Du (PhiGent Robotics) |
732 | MMCert: Provable Defense against Adversarial Attacks to Multi-modal Models | Yanting Wang (Pennsylvania State University) · Hongye Fu (Zhejiang University) · Wei Zou (Pennsylvania State University) · Jinyuan Jia (Pennsylvania State University) |
733 | Data Poisoning based Backdoor Attacks to Contrastive Learning | Jinghuai Zhang (University of California, Los Angeles (UCLA)) · Hongbin Liu (Duke University) · Jinyuan Jia (Pennsylvania State University) · Neil Zhenqiang Gong (Duke University) |
734 | Generalized Event Cameras | Varun Sundar (University of Wisconsin, Madison) · Matthew Dutson (University of Wisconsin, Madison) · Andrei Ardelean (NovoViz) · Claudio Bruschini (EPFL - EPF Lausanne) · Edoardo Charbon (EPFL - EPF Lausanne) · Mohit Gupta (Department of Computer Science, University of Wisconsin - Madison) |
735 | Intriguing Properties of Diffusion Models: An Empirical Study of the Natural Attack Capability in Text-to-Image Generative Models | Takami Sato (None) · Justin Yue (University of California, Irvine) · Nanze Chen (University of Cambridge) · Ningfei Wang (University of California, Irvine) · Alfred Chen (University of California, Irvine) |
736 | DiaLoc: An Iterative Approach to Embodied Dialog Localization | Chao Zhang (Toshiba Europe Ltd) · Mohan Li (Toshiba Europe Ltd) · Ignas Budvytis (University of Cambridge) · Stephan Liwicki (Toshiba Europe Ltd) |
737 | Video2Game: Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video | Hongchi Xia (Shanghai Jiaotong University) · Chih-Hao Lin (None) · Wei-Chiu Ma (Cornell University) · Shenlong Wang (University of Illinois, Urbana Champaign) |
738 | A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions | Jack Urbanek (Facebook) · Florian Bordes (Meta AI) · Pietro Astolfi (Meta AI) · Mary Williamson (Meta AI (FAIR)) · Vasu Sharma (Meta AI/ CMU) · Adriana Romero-Soriano (Meta) |
739 | WaveFace: Authentic Face Restoration with Efficient Frequency Recovery | Yunqi Miao (The University of Warwick) · Jiankang Deng (Huawei) · Jungong Han (Aberystwyth University) |
740 | ICON: Incremental CONfidence for Joint Pose and Radiance Field Optimization | Weiyao Wang (Facebook) · Pierre Gleize (Polytech Nice Sophia) · Hao Tang (Meta Platforms) · Xingyu Chen (Facebook) · Kevin Liang (FAIR at Meta) · Matt Feiszli (Meta AI) |
741 | LP++: A Surprisingly Strong Linear Probe for Few-Shot CLIP | Yunshi HUANG (École de technologie supérieure, Université du Québec) · Fereshteh Shakeri (École de technologie supérieure) · Jose Dolz (École de technologie supérieure) · Malik Boudiaf (École de technologie supérieure) · Houda Bahig (University of Montreal) · Ismail Ben Ayed (ETS Montreal) |
742 | Event-based Structure-from-Orbit | Ethan Elms (University of Adelaide) · Yasir Latif (The University of Adelaide) · Tae Ha Park (Stanford University) · Tat-Jun Chin (None) |
743 | Boosting Object Detection with Zero-Shot Day-Night Domain Adaptation | Zhipeng Du (University of Edinburgh) · Miaojing Shi (King's College London) · Jiankang Deng (Huawei) |
744 | Tyche: Stochastic in Context Learning for Universal Medical Image Segmentation | Marianne Rakic (Massachusetts Institute of Technology) · Hallee Wong (MIT) · Jose Javier Gonzalez Ortiz (DataBricks) · Beth Cimini (Broad Institute) · John Guttag (Massachusetts Institute of Technology) · Adrian V. Dalca (Harvard University) |
745 | Contrasting intra-modal and ranking cross-modal hard negatives to enhance visio-linguistic compositional understanding | Le Zhang (Mila-Quebec AI Institute) · Rabiul Awal (None) · Aishwarya Agrawal (None) |
746 | Calibrating Multi-modal Representations: A Pursuit of Group Robustness without Annotations | Chenyu You (Yale University) · Yifei Min (Yale University) · Weicheng Dai (Yale University) · Jasjeet Sekhon (Yale University) · Lawrence Staib (Yale University) · James Duncan (Yale University) |
747 | Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Matching Framework | Vu Minh Hieu Phan (University of Adelaide) · Yutong Xie (University of Adelaide) · Yuankai Qi (The University of Adelaide) · Lingqiao Liu (None) · Liyang Liu (University of Adelaide) · Bowen Zhang (The University of Adelaide) · Zhibin Liao (University of Adelaide) · Qi Wu (University of Adelaide) · Minh-Son To (Flinders University of South Australia) · Johan Verjans (University of Adelaide) |
748 | Fairy: Fast Parallelized Instruction-Guided Video-to-Video Synthesis | Bichen Wu (Facebook) · Ching-Yao Chuang (Meta) · Xiaoyan Wang (Massachusetts Institute of Technology) · Yichen Jia (Facebook) · Kapil Krishnakumar (Meta, Inc.) · Tong Xiao (None) · Feng Liang (The University of Texas at Austin) · Licheng Yu (None) · Peter Vajda (Facebook) |
749 | AVID: Any-Length Video Inpainting with Diffusion Model | Zhixing Zhang (Rutgers University) · Bichen Wu (Facebook) · Xiaoyan Wang (Massachusetts Institute of Technology) · Yaqiao Luo (Facebook) · Luxin Zhang (Meta) · Yinan Zhao (Facebook) · Peter Vajda (Facebook) · Dimitris N. Metaxas (Rutgers) · Licheng Yu (None) |
750 | CyberDemo: Augmenting Simulated Human Demonstration for Real-World Dexterous Manipulation | Jun Wang (University of California, San Diego) · Yuzhe Qin (University of California, San Diego) · Kaiming Kuang (University of California, San Diego) · Yigit Korkmaz (University of Southern California) · Akhilan Gurumoorthy (University of California, San Diego) · Hao Su (UCSD) · Xiaolong Wang (UCSD) |
751 | BioCLIP: A Vision Foundation Model for the Tree of Life | Samuel Stevens (Ohio State University, Columbus) · Jiaman Wu (Ohio State University, Columbus) · Matthew Thompson (Ohio State University, Columbus) · Elizabeth Campolongo (The Ohio State University) · Chan Hee Song (The Ohio State University) · David Carlyn (Ohio State University) · Li Dong (Microsoft Research) · Wasila Dahdul (University of California, Irvine) · Charles Stewart (Rensselaer Polytechnic Institute) · Tanya Berger-Wolf (None) · Wei-Lun Chao (Ohio State University) · Yu Su (Ohio State University) |
752 | GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians | Shenhan Qian (Technische Universität München) · Tobias Kirschstein (Department of Informatics, Technische Universität München) · Liam Schoneveld (Woven by Toyota) · Davide Davoli (Toyota Motor Europe NV/SA associated partner by contracted services) · Simon Giebenhain (Technische Universität München) · Matthias Nießner (Technical University of Munich) |
753 | PhysPT: Physics-aware Pretrained Transformer for Estimating Human Dynamics from Monocular Videos | Yufei Zhang (None) · Jeffrey Kephart (IBM, International Business Machines) · Zijun Cui (University of Southern California) · Qiang Ji (Rensselaer Polytechnic Institute) |
754 | Learning to Segment Referred Objects from Narrated Egocentric Videos | Yuhan Shen (Northeastern University) · Huiyu Wang (Facebook) · Xitong Yang (Meta) · Matt Feiszli (Meta AI) · Ehsan Elhamifar (None) · Lorenzo Torresani (Facebook) · Effrosyni Mavroudi () |
755 | Progress-Aware Online Action Segmentation for Egocentric Procedural Task Videos | Yuhan Shen (Northeastern University) · Ehsan Elhamifar (None) |
756 | TRINS: Towards Multimodal Language Models That Can Read | Ruiyi Zhang () · Yanzhe Zhang (Georgia Institute of Technology) · Jian Chen (Mohamed bin Zayed University of Artificial Intelligence) · Yufan Zhou (State University of New York, Buffalo) · Jiuxiang Gu (Adobe Systems) · Changyou Chen (State University of New York, Buffalo) · Tong Sun (Adobe Systems) |
757 | Customization Assistant for Text-to-image Generation | Yufan Zhou (State University of New York, Buffalo) · Ruiyi Zhang () · Jiuxiang Gu (Adobe Systems) · Tong Sun (Adobe Systems) |
758 | One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion | Minghua Liu (University of California, San Diego) · Ruoxi Shi (University of California, San Diego) · Linghao Chen (None) · Zhuoyang Zhang (IIIS, Tsinghua University) · Chao Xu (University of California, Los Angeles) · Xinyue Wei (University of California, San Diego) · Hansheng Chen (Stanford University) · Chong Zeng (Zhejiang University) · Jiayuan Gu (University of California, San Diego) · Hao Su (UCSD) |
759 | ZeroRF: Fast Sparse View 360° Reconstruction with Zero Pretraining | Ruoxi Shi (University of California, San Diego) · Xinyue Wei (University of California, San Diego) · Cheng Wang (University of California, San Diego) · Hao Su (UCSD) |
760 | DriveTrack: A Benchmark for Long-Range Point Tracking in Real-World Videos | Arjun Balasingam (Massachusetts Institute of Technology) · Joseph Chandler (Massachusetts Institute of Technology) · Chenning Li (None) · Zhoutong Zhang (Adobe Systems) · Hari Balakrishnan (Massachusetts Institute of Technology) |
761 | G-FARS: Gradient-Field-based Auto-Regressive Sampling for 3D Part Grouping | Junfeng Cheng (Imperial College London) · Tania Stathaki (Imperial College London) |
762 | HDQMF: Holographic Feature Decomposition Using Quantum Algorithms | Prathyush Poduval (University of California, Irvine) · Zhuowen Zou (University of California, Irvine) · Mohsen Imani (University of California, Irvine) |
763 | LLSS: Low-Latency Neural Stereo Streaming | Qiqi Hou (Qualcomm Inc, QualComm) · Farzad Farhadzadeh (Qualcomm Inc, QualComm) · Amir Said (Qualcomm Inc, QualComm) · Guillaume Sautiere (Qualcomm Inc, QualComm) · Hoang Le (Qualcomm AI Research) |
764 | Motion-adaptive Separable Collaborative Filters for Blind Motion Deblurring | Chengxu Liu (Xi'an Jiaotong University) · Xuan Wang (Megvii Technology Inc.) · Xiangyu Xu (Xi'an Jiaotong University) · Ruhao Tian (Xi'an Jiaotong University) · Shuai Li (Megvii Technology Inc.) · Xueming Qian (Xi'an Jiaotong University, Tsinghua University) · Ming-Hsuan Yang (University of California at Merced) |
765 | PaSCo: Urban 3D Panoptic Scene Completion with Uncertainty Awareness | Anh-Quan Cao (INRIA) · Angela Dai () · Raoul de Charette (Inria) |
766 | LAN: Learning to Adapt Noise for Image Denoising | Changjin Kim (Hanyang University) · Tae Kim Kim (None) · Sungyong Baik (Hanyang University) |
767 | Task2Box: Box Embeddings for Modeling Asymmetric Task Relationships | Rangel Daroya (University of Massachusetts at Amherst) · Aaron Sun (University of Massachusetts Amherst) · Subhransu Maji (University of Massachusetts, Amherst) |
768 | Jack of All Tasks, Master of Many: Designing General-purpose Coarse-to-Fine Vision-Language Model | Shraman Pramanick (None) · Guangxing Han (Columbia University) · Rui Hou (Meta Inc.) · Sayan Nag (University of Toronto) · Ser-Nam Lim (Meta AI) · Nicolas Ballas (Facebook) · Qifan Wang (Meta AI) · Rama Chellappa (Johns Hopkins University) · Amjad Almahairi (Facebook) |
769 | MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models | Sanjoy Chowdhury (None) · Sayan Nag (University of Toronto) · Joseph J (Adobe Systems) · Balaji Vasan Srinivasan (Adobe Research) · Dinesh Manocha (University of Maryland, College Park) |
770 | RILA: Reflective and Imaginative Language Agent for Zero-Shot Semantic Audio-Visual Navigation | Zeyuan Yang (Tsinghua University) · LIU JIAGENG (None) · Peihao Chen (South China University of Technology) · Anoop Cherian (None) · Tim Marks (None) · Jonathan Le Roux (Mitsubishi Electric Research Labs) · Chuang Gan (MIT-IBM Watson AI Lab) |
771 | MAPLM: A Real-World Large-Scale Vision-Language Dataset for Map and Traffic Scene Understanding | Xu Cao (University of Illinois Urbana-Champaign) · Tong Zhou (Tencent AI Lab) · Yunsheng Ma (Purdue University) · Wenqian Ye (University of Virginia) · Can Cui (Purdue University) · Kun Tang (Tencent) · Zhipeng Cao (Tencent) · Kaizhao Liang (University of Texas at Austin) · Ziran Wang (Purdue University) · James Rehg (None) · Chao Zheng (Tencent) |
772 | Label Propagation for Zero-shot Classification with Vision-Language Models | Vladan Stojnić (Czech Technical University in Prague) · Yannis Kalantidis (NAVER LABS Europe) · Giorgos Tolias (None) |
773 | Category-Level Multi-Part Multi-Joint 3D Shape Assembly | Yichen Li (Massachusetts Institute of Technology) · Kaichun Mo (NVIDIA Research) · Yueqi Duan (None) · He Wang (None) · Jiequan Zhang (None) · Lin Shao (National University of Singapore) · Wojciech Matusik (Massachusetts Institute of Technology) · Leonidas Guibas (Stanford University) |
774 | NARUTO: Neural Active Reconstruction from Uncertain Target Observations | Ziyue Feng (Clemson University) · Huangying Zhan (InnoPeak Technology, Inc. (OPPO US Research Center)) · Zheng Chen (Indiana University, Bloomington) · Qingan Yan (OPPO US Research Center) · Xiangyu Xu (InnoPeak Technology, Inc.) · Changjiang Cai (None) · Bing Li (Clemson University) · Qilun Zhu (Clemson University) · Yi Xu (OPPO US Research Center) |
775 | The Mirrored Influence Hypothesis: Efficient Data Influence Estimation by Harnessing Forward Passes | Myeongseob Ko (Virginia Polytechnic Institute and State University) · Feiyang Kang (Virginia Polytechnic Institute and State University) · Weiyan Shi (Stanford University) · Ming Jin (Virginia Tech) · Zhou Yu (Columbia University) · Ruoxi Jia (Virginia Tech) |
776 | 3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting | Zhiyin Qian (Department of Computer Science, ETHZ - ETH Zurich) · Shaofei Wang (None) · Marko Mihajlovic (Swiss Federal Institute of Technology) · Andreas Geiger (University of Tübingen) · Siyu Tang (ETH Zurich) |
777 | Alpha Invariance: On Inverse Scaling Between Distance and Volume Density in a Neural Radiance Field | Joshua Ahn (University of Chicago) · Haochen Wang (Toyota Technological Institute at Chicago) · Raymond A. Yeh (Purdue University) · Greg Shakhnarovich (Toyota Technological Institute at Chicago) |
778 | Object Dynamics Modeling with Hierarchical Point Cloud-based Representations | Chanho Kim (Oregon State University) · Li Fuxin (Oregon State University) |
779 | Tactile-Augmented Radiance Fields | Yiming Dou (University of Michigan - Ann Arbor) · Fengyu Yang (Yale University) · Yi Liu (University of Michigan - Ann Arbor) · Antonio Loquercio (University of California, Berkeley) · Andrew Owens (University of Michigan) |
780 | Exploring Region-Word Alignment in Built-in Detector for Open-Vocabulary Object Detection | Heng Zhang (Gaoling School of Artificial Intelligence, Renmin University of China) · Qiuyu Zhao (JD) · Linyu Zheng (JD) · Hao Zeng (JD.com) · Zhiwei Ge (JD) · Tianhao Li (JD) · Sulong Xu (JD) |
781 | Learning Discriminative Dynamics with Label Corruption for Noisy Label Detection | Suyeon Kim (Pohang University of Science and Technology) · Dongha Lee (Yonsei University) · SeongKu Kang (University of Illinois Urbana-Champaign) · Sukang Chae (Pohang University of Science and Technology) · Sanghwan Jang (POSTECH) · Hwanjo Yu (POSTECH) |
782 | FairDeDup: Detecting and Mitigating Vision-Language Fairness Disparities in Semantic Dataset Deduplication | Eric Slyman (Oregon State University) · Stefan Lee (Oregon State University) · Scott Cohen (Adobe Systems) · Kushal Kafle (Adobe Systems) |
783 | Label-Efficient Group Robustness via Out-of-Distribution Concept Curation | Yiwei Yang (University of Washington) · Anthony Liu (University of Michigan) · Robert Wolfe (University of Washington) · Aylin Caliskan (University of Washington) · Bill Howe (University of Washington) |
784 | Shadow-Enlightened Image Outpainting | Hang Yu (Shanghai University) · Ruilin Li (None) · Shaorong Xie (Shanghai University) · Jiayan Qiu (University of Leicester) |
785 | HybridNeRF: Efficient Neural Rendering via Adaptive Volumetric Surfaces | Haithem Turki (Carnegie Mellon University) · Vasu Agrawal (Meta Reality Labs Research) · Samuel Rota Bulò (Meta) · Lorenzo Porzi (Facebook) · Peter Kontschieder (Meta) · Deva Ramanan (Carnegie Mellon University) · Michael Zollhoefer (Meta) · Christian Richardt (Meta Reality Labs) |
786 | SpecNeRF: Gaussian Directional Encoding for Specular Reflections | Li Ma (None) · Vasu Agrawal (Meta Reality Labs Research) · Haithem Turki (Carnegie Mellon University) · Changil Kim (Facebook) · Chen Gao (Meta) · Pedro V. Sander (Hong Kong University of Science and Technology) · Michael Zollhoefer (Meta) · Christian Richardt (Meta Reality Labs) |
787 | ViewDiff: 3D-Consistent Image Generation with Text-To-Image Models | Lukas Hoellein (None) · Aljaž Božič (Facebook) · Norman Müller (Meta) · David Novotny (Facebook) · Hung-Yu Tseng (Meta) · Christian Richardt (Meta Reality Labs) · Michael Zollhoefer (Meta) · Matthias Nießner (Technical University of Munich) |
788 | BoQ: A Place is Worth a Bag of learnable Queries | Amar Ali-bey (Université Laval) · Brahim Chaib-draa (Université Laval) · Philippe Giguère (Université Laval) |
789 | Diffusion Models Without Attention | Jing Nathan Yan (Cornell University) · Jiatao Gu (Apple (MLR)) · Alexander Rush (Cornell University) |
790 | Solving Masked Jigsaw Puzzles with Diffusion Transformers | Jinyang Liu (Northeastern University) · Wondmgezahu Teshome (Northeastern University) · Sandesh Ghimire (QualComm) · Mario Sznaier (Northeastern University) · Octavia Camps (Northeastern University) |
791 | PRDP: Proximal Reward Difference Prediction for Large-Scale Reward Finetuning of Diffusion Models | Fei Deng (Google) · Qifei Wang (Google) · Wei Wei (Google) · Tingbo Hou (Google Research) · Matthias Grundmann (Google) |
792 | One-Prompt to Segment All Medical Images | Wu (None) · Min Xu (Carnegie Mellon University) |
793 | Learning Vision from Generative Models Rivals Learning Vision from Data | Yonglong Tian (Google) · Lijie Fan (Massachusetts Institute of Technology) · Kaifeng Chen (Google) · Dina Katabi (Massachusetts Institute of Technology) · Dilip Krishnan (Google) · Phillip Isola (None) |
794 | Visual Anagrams: Synthesizing Multi-View Optical Illusions with Diffusion Models | Daniel Geng (University of Michigan) · Inbum Park (University of Michigan - Ann Arbor) · Andrew Owens (University of Michigan) |
795 | Neural Refinement for Absolute Pose Regression with Feature Synthesis | Shuai Chen (University of Oxford) · Yash Bhalgat (Visual Geometry Group, University of Oxford) · Xinghui Li (University of Oxford) · Jia-Wang Bian (University of Oxford) · Kejie Li (University of Oxford) · Zirui Wang (University of Oxford) · Victor Adrian Prisacariu (None) |
796 | Map-Relative Pose Regression for Visual Re-Localization | Shuai Chen (University of Oxford) · Tommaso Cavallari (Niantic Inc.) · Victor Adrian Prisacariu (None) · Eric Brachmann (None) |
797 | Uncovering What, Why and How: A Comprehensive Benchmark for Causation Understanding of Video Anomaly | Hang Du (Beijing University of Posts and Telecommunications) · Sicheng Zhang (Beijing University of Posts and Telecommunications) · Binzhu Xie (Beijing University of Posts and Telecommunications) · Guoshun Nan (Beijing University of Posts and Telecommunications) · Jiayang Zhang (Beijing University of Posts and Telecommunications) · Junrui Xu (Beijing University of Posts and Telecommunications) · Hangyu Liu (Beijing University of Posts and Telecommunications) · Sicong Leng (Nanyang Technological University) · Jiangming Liu (Yunnan University) · Hehe Fan (None) · Dajiu Huang (South China University) · Jing Feng (Beijing University of Posts and Telecommunications) · Linli Chen (Sichuan University) · Can Zhang (Beijing University of Posts and Telecommunications) · Xuhuan Li (Beijing University of Posts and Telecommunications) · Hao Zhang (None) · Jianhang Chen (Beijing University of Posts and Telecommunications) · Qimei Cui (Beijing University of Posts and Telecommunications) · Xiaofeng Tao (Beijing University of Posts and Telecommunications) |
798 | Patch Diffusion: Parallel Inference for High-Resolution Diffusion Models | Muyang Li (None) · Tianle Cai (Princeton University) · Jiaxin Cao (Lepton AI) · Qinsheng Zhang (Georgia Institute of Technology) · Han Cai (Massachusetts Institute of Technology) · Junjie Bai (Lepton AI Inc.) · Yangqing Jia (Lepton AI) · Kai Li (Princeton University) · Song Han (Massachusetts Institute of Technology) |
799 | Condition-Aware Neural Network for Controlled Image Generation | Han Cai (Massachusetts Institute of Technology) · Muyang Li (None) · Qinsheng Zhang (Georgia Institute of Technology) · Ming-Yu Liu (NVIDIA) · Song Han (Massachusetts Institute of Technology) |
800 | DeCoTR: Enhancing Depth Completion with 2D and 3D Attentions | Yunxiao Shi (Qualcomm AI Research Qualcomm) · Manish Singh (Qualcomm AI Research) · Hong Cai (Qualcomm AI Research) · Fatih Porikli (QualComm) |
801 | Language-Driven Anchors for Zero-Shot Adversarial Robustness | Xiao Li (Tsinghua University) · Wei Zhang (Department of Computer Science and Technology, Tsinghua University) · Yining Liu (Harbin Institute of Technology at Weihai) · Zhanhao Hu (UC Berkeley) · Bo Zhang (Tsinghua University) · Xiaolin Hu (Tsinghua University) |
802 | Infrared Adversarial Car Stickers | Xiaopei Zhu (Tsinghua University) · Yuqiu Liu (Beijing Forestry University) · Zhanhao Hu (UC Berkeley) · Jianmin Li (Department of Computer Science and Technology, Tsinghua University) · Xiaolin Hu (Tsinghua University) |
803 | GlitchBench: Can large multimodal models detect video game glitches? | Mohammad Reza Taesiri (University of Alberta) · Tianjun Feng (University of Alberta) · Cor-Paul Bezemer (University of Alberta) · Anh Nguyen (Auburn University) |
804 | Spectrum AUC Difference (SAUCD): Human Aligned 3D Shape Evaluation | Tianyu Luan (State University of New York at Buffalo) · Zhong Li (InnoPeak Technology) · Lele Chen (University of Rochester) · Xuan Gong (Harvard University) · Lichang Chen (Department of Computer Science, University of Maryland, College Park) · Yi Xu (OPPO US Research Center) · Junsong Yuan (State University of New York at Buffalo) |
805 | GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians | Liangxiao Hu (Harbin Institute of Technology) · Hongwen Zhang (Beijing Normal University) · Yuxiang Zhang (Tsinghua University) · Boyao ZHOU (Tsinghua University) · Boning Liu (Department of Automation, Tsinghua University) · Shengping Zhang (Harbin Institute of Technology) · Liqiang Nie (Harbin Institute of Technology (Shenzhen)) |
806 | CN-RMA: Combined Network with Ray Marching Aggregation for 3D Indoor Object Detection from Multi-view Images | Guanlin Shen (Tsinghua University) · Jingwei Huang (Huawei Technologies Ltd.) · Zhihua Hu (Nanjing University of Information Science and Technology) · Bin Wang (Tsinghua University) |
807 | MotionEditor: Editing Video Motion via Content-Aware Diffusion | Shuyuan Tu (Fudan University) · Qi Dai (Microsoft Research Asia) · Zhi-Qi Cheng (Carnegie Mellon University) · Han Hu (Microsoft Research Asia) · Xintong Han (Huya Inc) · Zuxuan Wu (Fudan University) · Yu-Gang Jiang (Fudan University) |
808 | BEVSpread: Spread Voxel Pooling for Bird’s-Eye-View Representation in Vision-based Roadside 3D Object Detection | Wenjie Wang (Zhejiang University) · Yehao Lu (Zhejiang University) · Guangcong Zheng (None) · Shuigenzhan (Zhejiang University) · Xiaoqing Ye (Baidu Inc.) · Zichang Tan (Baidu) · Jingdong Wang (Baidu) · Gaoang Wang (Zhejiang University) · Xi Li (Zhejiang University) |
809 | Can Vision-Language Models Think from a First-person Perspective? | Sijie Cheng (None) · Zhicheng Guo (Tsinghua University) · Jingwen Wu (University of Toronto) · Kechen Fang (Tsinghua University) · Peng Li (Tsinghua University) · Huaping Liu (Tsinghua University) · Yang Liu (Tsinghua University) |
810 | ReGenNet: Towards Human Action-Reaction Synthesis | Liang Xu (Shanghai Jiao Tong University) · Yizhou Zhou (WeChat AI) · Yichao Yan (Shanghai Jiao Tong University) · Xin Jin (Eastern Institute for Advanced Study) · Wenhan Zhu (None) · Fengyun Rao (WeChat, Tencent Inc.) · Xiaokang Yang (Shanghai Jiao Tong University, China) · Wenjun Zeng (None) |
811 | Fair-VPT: Fair Visual Prompt Tuning for Image Classification | Sungho Park (Yonsei University) · Hyeran Byun (Yonsei University) |
812 | 3D Face Tracking from 2D Video through Iterative Dense UV to Image Flow | Felix Taubner (LG Electronics) · Prashant Raina (LG Electronics) · Mathieu Tuli (LG Electronics Canada Incorporated, TAIL) · Eu Wern Teh (LG Corporation) · Chul Lee (Department of Computer Science, University of Toronto) · Jinmiao Huang (Meta) |
813 | Beyond Average: Individualized Visual Scanpath Prediction | Xianyu Chen (University of Minnesota) · Ming Jiang (University of Minnesota, Minneapolis) · Qi Zhao (University of Minnesota, Minneapolis) |
814 | Content-Style Decoupling for Unsupervised Makeup Transfer without Generating Pseudo Ground Truth | Zhaoyang Sun (Wuhan University of Technology) · Shengwu Xiong (Wuhan University of Technology) · Yaxiong Chen (Wuhan University of Technology) · Yi Rong (Wuhan University of Technology) |
815 | Digital Life Project: Autonomous 3D Characters with Social Intelligence | Zhongang Cai (Nanyang Technological University) · Jianping Jiang (Peking University) · Zhongfei Qing (SenseTime Research) · Xinying Guo (Nanyang Technological University) · Mingyuan Zhang (Nanyang Technological University) · Zhengyu Lin (Sensetime) · Haiy Mei (None) · Chen Wei (SenseTime International PTE. LTD.) · Wang Ruisi (Nanyang Technological University) · Wanqi Yin (SenseTime Research) · Liang Pan (Shanghai AI Lab) · Xiangyu Fan (Chinese University of Hong Kong) · Han Du (Universität des Saarlandes) · Peng Gao (SenseTime LTD.) · Zhitao Yang (SenseTime Co Ltd.) · Yang Gao (SenseTime) · Jiaqi Li (SenseTime) · Tianxiang Ren (Xiamen University) · YuKun Wei (Sensetime Research) · Xiaogang Wang (The Chinese University of Hong Kong) · Chen Change Loy (Nanyang Technological University) · Lei Yang (The Chinese University of Hong Kong) · Ziwei Liu (Nanyang Technological University) |
816 | Diff-Plugin: Revitalizing Details for Diffusion-based Low-level Tasks | Yuhao Liu (City University of Hong Kong) · Zhanghan Ke (City University of Hong Kong) · Fang Liu (City University of Hong Kong) · Nanxuan Zhao (Adobe Research) · Rynson W.H. Lau (City University of Hong Kong) |
817 | Masking Clusters in Vision-language Pretraining | Zihao Wei (University of Michigan - Ann Arbor) · Zixuan Pan (University of Michigan - Ann Arbor) · Andrew Owens (University of Michigan) |
818 | Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation | Zhiwu Qing (Huazhong University of Science and Technology, Tsinghua University) · Shiwei Zhang (Alibaba Group) · Jiayu Wang (None) · Xiang Wang (Huazhong University of Science and Technology) · Yujie Wei (Fudan University) · Yingya Zhang (Alibaba Group) · Changxin Gao (Huazhong University of Science and Technology) · Nong Sang (Huazhong University of Science and Technology) |
819 | InstructVideo: Instructing Video Diffusion Models with Human Feedback | Hangjie Yuan (Nanyang Technological University) · Shiwei Zhang (Alibaba Group) · Xiang Wang (Huazhong University of Science and Technology) · Yujie Wei (Fudan University) · Tao Feng (Tsinghua University) · Yining Pan (Singapore University of Technology and Design) · Yingya Zhang (Alibaba Group) · Ziwei Liu (Nanyang Technological University) · Samuel Albanie (University of Cambridge) · Dong Ni (Zhejiang University) |
820 | DreamVideo: Composing Your Dream Videos with Customized Subject and Motion | Yujie Wei (Fudan University) · Shiwei Zhang (Alibaba Group) · Zhiwu Qing (Huazhong University of Science and Technology, Tsinghua University) · Hangjie Yuan (Nanyang Technological University) · Zhiheng Liu (University of Science and Technology of China) · Yu Liu (Alibaba Group) · Yingya Zhang (Alibaba Group) · Jingren Zhou (Alibaba Group) · Hongming Shan (None) |
821 | Unbiased Estimator for Distorted Conic in Camera Calibration | Chaehyeon Song (Seoul National University) · Jaeho Shin (Seoul National University) · Myung-Hwan Jeon (Seoul National University) · Jongwoo Lim (Seoul National University) · Ayoung Kim (None) |
822 | Benchmarking the Robustness of Temporal Action Detection Models Against Temporal Corruptions | Runhao Zeng (Shenzhen University) · Xiaoyong Chen (Shenzhen University) · Jiaming Liang (Shenzhen University) · Huisi Wu (Shenzhen University) · Guang-Zhong Cao (Shenzhen University) · Yong Guo (Max-Planck Institute for Informatics) |
823 | DrivingGaussian: Composite Gaussian Splatting for Surrounding Dynamic Autonomous Driving Scenes | Xiaoyu Zhou (Peking University) · Zhiwei Lin (Peking University) · Xiaojun Shan (Peking University) · Yongtao Wang (Peking University) · Deqing Sun (Google) · Ming-Hsuan Yang (University of California at Merced) |
824 | SeNM-VAE: Semi-Supervised Noise Modeling with Hierarchical Variational Autoencoder | Dihan Zheng (Tsinghua University) · Yihang Zou (Tsinghua University) · Xiaowen Zhang (Hisilicon) · Chenglong Bao (Tsinghua University) |
825 | FreeCustom: Tuning-Free Customized Image Generation for Multi-Concept Composition | Ganggui Ding (Zhejiang University) · Canyu Zhao (Zhejiang University) · Wen Wang (Zhejiang University) · Zhen Yang (Zhejiang University) · Zide Liu (Zhejiang University) · Hao Chen (Zhejiang University) · Chunhua Shen (Zhejiang University) |
826 | SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models | Tongtian Yue (Institute of automation, Chinese academy of science) · Jie Cheng (State Key Laboratory of Multimodal Artificial Intelligence Systems, CASIA) · Longteng Guo (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Xingyuan Dai (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Zijia Zhao (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Xingjian He (Institute of automation, Chinese academy of science) · Gang Xiong (Institute of Automation, Chinese Academy of Science) · Yisheng Lv (Institute of Automation, Chinese Academy of Science) · Jing Liu (Institute of automation, Chinese academy of science) |
827 | Neural Markov Random Field for Stereo Matching | Tongfan Guan (The Chinese University of Hong Kong) · Chen Wang (University at Buffalo) · Yun-Hui Liu (The Chinese University of Hong Kong) |
828 | Align before Adapt: Leveraging Entity-to-Region Alignments for Generalizable Video Action Recognition | Yifei Chen (Huawei) · Dapeng Chen (Huawei Technologies Ltd.) · Ruijin Liu (Xi'an Jiaotong University) · Sai Zhou (Huawei Technologies Ltd.) · Wenyuan Xue (Huawei Technologies Ltd.) · Wei Peng (Huawei Technologies Ltd.) |
829 | Video Prediction by Modeling Videos as Continuous Multi-Dimensional Processes | Gaurav Shrivastava (Department of Computer Science, University of Maryland, College Park) · Abhinav Shrivastava (University of Maryland) |
830 | HiFi4G: High-Fidelity Human Performance Rendering via Compact Gaussian Splatting | Yuheng Jiang (ShanghaiTech University) · Zhehao Shen (ShanghaiTech University) · Penghao Wang (None) · Zhuo Su (ByteDance) · Yu Hong (ShanghaiTech University) · Yingliang Zhang (DGene Inc.) · Jingyi Yu (ShanghaiTech University) · Lan Xu (None) |
831 | DiPrompT: Disentangled Prompt Tuning for Multiple Latent Domain Generalization in Federated Learning | Sikai Bai (The Hong Kong University of Science and Technology) · Jie ZHANG (The Hong Kong Polytechnic University) · Song Guo (Department of Computer Science and Engineering, Hong Kong University of Science and Technology) · Shuaicheng Li (Sensetime Group Limited) · Jingcai Guo (The Hong Kong Polytechnic University) · Jun Hou (Sensetime) · Tao Han (Northwestern Polytechnical University) · Xiaocheng Lu (Northwestern Polytechnical University) |
832 | AZ-NAS: Assembling Zero-Cost Proxies for Network Architecture Search | Junghyup Lee (Yonsei University) · Bumsub Ham (Yonsei University) |
833 | PointInfinity: Resolution-Invariant Point Diffusion Models | Zixuan Huang (University of Illinois Urbana-Champaign) · Justin Johnson (University of Michigan) · Shoubhik Debnath (FAIR, Meta) · James Rehg (None) · Chao-Yuan Wu (Meta) |
834 | Seeing Motion During Nighttime with Event Camera | Haoyue Liu (Huazhong University of Science and Technology) · Shihan Peng (Huazhong University of Science and Technology) · Lin Zhu (Beijing Institute of Technology) · Yi Chang (Huazhong University of Science and Technology) · Hanyu Zhou (Huazhong University of Science and Technology) · Luxin Yan (Huazhong University of Science and Technology) |
835 | CORE-MPI: Consistency Object Removal with Embedding MultiPlane Image | Donggeun Yoon (Chungnam National University) · Donghyeon Cho (Hanyang University) |
836 | InstructDiffusion: A Generalist Modeling Interface for Vision Tasks | Zigang Geng (University of Science and Technology of China) · Binxin Yang (University of Science and Technology of China) · Tiankai Hang (Southeast University) · Chen Li (Xi'an Jiaotong University) · Shuyang Gu (Research, Microsoft) · Ting Zhang (Beijing Normal University) · Jianmin Bao (Microsoft) · Zheng Zhang (Microsoft) · Houqiang Li (University of Science and Technology of China) · Han Hu (Microsoft Research Asia) · Dong Chen (Microsoft) · Baining Guo (Microsoft Research) |
837 | VSCode: General Visual Salient and Camouflaged Object Detection with 2D Prompt Learning | Ziyang Luo (None) · Nian Liu (Mohamed bin Zayed University of Artificial Intelligence) · Wangbo Zhao (National University of Singapore) · Xuguang Yang (Northwestern Polytechnical University Xi'an) · Dingwen Zhang (Northwestern Polytechnical University) · Deng-Ping Fan (ETH Zurich) · Fahad Shahbaz Khan (Inception Institute of Artificial Intelligence) · Junwei Han (Northwestern Polytechnical University, Tsinghua University) |
838 | GP-NeRF: Generalized Perception NeRF for Context-Aware 3D Scene Understanding | Hao Li (Northwest Polytechnical University) · Dingwen Zhang (Northwestern Polytechnical University) · Yalun Dai (University of Chinese Academy of Sciences) · Nian Liu (Mohamed bin Zayed University of Artificial Intelligence) · Lechao Cheng (Hefei University of Technology) · Li Jingfeng (Northwest Polytechnical University Xi'an) · Jingdong Wang (Baidu) · Junwei Han (Northwestern Polytechnical University, Tsinghua University) |
839 | Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata | Dongsu Zhang (Seoul National University) · Francis Williams (NVIDIA) · Žan Gojčič (NVIDIA) · Karsten Kreis (NVIDIA) · Sanja Fidler (Department of Computer Science, University of Toronto) · Young Min Kim (Seoul National University) · Amlan Kar (NVIDIA) |
840 | Bi-SSC: Geometric-Semantic Bidirectional Fusion for Camera-based 3D Semantic Scene Completion | Yujie Xue (HNU) · Ruihui Li (Hunan University) · Fan Wu (Wuhan University) · Zhuo Tang (Hunan University) · Kenli Li (Hunan University) · Duan Mingxing (Hunan University) |
841 | A theory of volumetric representations for opaque solids | Bailey Miller (Carnegie Mellon University) · Hanyu Chen (CMU, Carnegie Mellon University) · Alice Lai (Carnegie Mellon University) · Ioannis Gkioulekas (Carnegie Mellon University) |
842 | Cam4DOcc: Benchmark for Camera-Only 4D Occupancy Forecasting in Autonomous Driving Applications | Junyi Ma (Shanghai Jiao Tong University) · Xieyuanli Chen (National University of Defense Technology) · Jiawei Huang (HAOMO Technology Co., Ltd) · Jingyi Xu (Beijing Institute of Technology) · Zhen Luo (Beijing Institute of Technology) · Jintao Xu (Xi'an Jiaotong University) · Weihao Gu (Tsinghua University, Tsinghua University) · Rui Ai (HAOMO.AI Technology Co.,Ltd. ) · Hesheng Wang (Shanghai Jiao Tong University) |
843 | Defense Against Adversarial Attacks on No-Reference Image Quality Models with Gradient Norm Regularization | Yujia Liu (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Chenxi Yang (Peking University) · Dingquan Li (Peng Cheng Laboratory) · Jianhao Ding (Peking University) · Tingting Jiang (Peking University) |
844 | Video Harmonization with Triplet Spatio-Temporal Variation Patterns | Zonghui Guo (None) · XinYu Han (Ocean University of China) · Jie Zhang (Institute of Computing Technology, Chinese Academy of Sciences) · Shiguang Shan (Institute of Computing Technology, Chinese Academy of Sciences) · Haiyong Zheng (Ocean University of China) |
845 | Learning Correlation Structures for Vision Transformers | Manjin Kim (POSTECH) · Paul Hongsuck Seo (Google) · Cordelia Schmid (Inria / Google) · Minsu Cho (POSTECH) |
846 | Uncertainty-aware Action Decoupling Transformer for Action Anticipation | Hongji Guo (None) · Nakul Agarwal (None) · Shao-Yuan Lo (Johns Hopkins University) · Kwonjoon Lee (Honda Research Institute) · Qiang Ji (Rensselaer Polytechnic Institute) |
847 | Producing and Leveraging Online Map Uncertainty in Trajectory Prediction | Xunjiang Gu (University of Toronto) · Guanyu Song (University of Toronto) · Igor Gilitschenski (University of Toronto) · Marco Pavone (NVIDIA) · Boris Ivanovic (NVIDIA) |
848 | SEAS: ShapE-Aligned Supervision for Person Re-Identification | Haidong Zhu (University of Southern California) · Pranav Budhwant (University of Southern California) · Zhaoheng Zheng (None) · Ram Nevatia (None) |
849 | CDMAD: Class-Distribution-Mismatch-Aware Debiasing for Class-Imbalanced Semi-Supervised Learning | Hyuck Lee (Korea Advanced Institute of Science and Technology) · Heeyoung Kim (Korea Advanced Institute of Science and Technology) |
850 | MovieChat: From Dense Token to Sparse Memory for Long Video Understanding | Enxin Song () · Wenhao Chai (University of Washington) · Guanhong Wang (Zhejiang University) · Haoyang Zhou (Zhejiang University) · Feiyang Wu (Zhejiang University) · Yucheng Zhang (Zhejiang University) · Tian Ye (The Hong Kong University of Science and Technology (Guangzhou)) · Haozhe Chi (Zhejiang University) · Xun Guo (Microsoft Research Asia) · Yanting Zhang (Donghua University, Shanghai) · Yan Lu (Microsoft Research Asia) · Jenq-Neng Hwang (None) · Gaoang Wang (Zhejiang University) |
851 | AUEditNet: Dual-Branch Facial Action Unit Intensity Manipulation with Implicit Disentanglement | Shiwei Jin (None) · Zhen Wang (Qualcomm Technologies, Inc.) · Lei Wang (Qualcomm) · Peng Liu (Qualcomm Inc, QualComm) · Ning Bi (QualComm) · Truong Nguyen (University of California, San Diego) |
852 | Link-Context Learning for Multimodal LLMs | Yan Tai (Ningbo Institute of Digital Twin, Eastern Institute of Technology, Ningbo, China) · Weichen Fan (HyperGAI) · Zhao Zhang (Sensetime Research) · Ziwei Liu (Nanyang Technological University) |
853 | Shadows Don’t Lie and Lines Can't Bend! Generative Models don't know Projective Geometry...for now | Ayush Sarkar (Department of Computer Science at University of Illinois Urbana-Champaign) · Hanlin Mai (University of Illinois Urbana Champaign) · Amitabh Mahapatra (University of Illinois Urbana-Champaign) · David Forsyth (University of Illinois at Urbana-Champaign) · Svetlana Lazebnik (University of Illinois at Urbana-Champaign) · Anand Bhattad (None) |
854 | LCD: Towards Hierarchical Embeddings with Localizability, Composability, and Decomposability Learned from Anatomy | Mohammad Reza Hosseinzadeh Taher (Arizona State University) · Michael Gotway (Mayo Clinic) · Jianming Liang (Arizona State University) |
855 | PikeLPN: Mitigating Overlooked Inefficiencies of Low-Precision Neural Networks | Marina Neseem (Brown University) · Conor McCullough (Google) · Randy Hsin (Google) · Chas Leichner (Google) · Shan Li (Google) · In Suk Chong (Google) · Andrew Howard (Google) · Lukasz Lew (Research, Google) · Sherief Reda (Brown University) · Ville-Mikko Rautio (Google) · Daniele Moro (Google Research) |
856 | DIEM: Decomposition-Integration Enhancing Multimodal Insights | Xinyi Jiang (None) · Guoming Wang (Zhejiang University) · Junhao Guo (Zhejiang University) · Juncheng Li (Zhejiang University) · Wenqiao Zhang (National University of Singapore) · Rongxing Lu (University of New Brunswick) · Siliang Tang (Zhejiang University) |
857 | CRKD: Enhanced Camera-Radar Object Detection with Cross-modality Knowledge Distillation | Lingjun Zhao (University of Michigan - Ann Arbor) · Jingyu Song (University of Michigan - Ann Arbor) · Katherine Skinner (University of Michigan - Ann Arbor) |
858 | Habitat Synthetic Scenes Dataset (HSSD-200): An Analysis of 3D Scene Scale and Realism Tradeoffs for ObjectGoal Navigation | Mukul Khanna (Georgia Institute of Technology) · Yongsen Mao (Simon Fraser University) · Hanxiao Jiang (University of Illinois Urbana-Champaign) · Sanjay Haresh (Qualcomm Inc, QualComm) · Brennan Shacklett (Stanford University) · Dhruv Batra (FAIR (Meta) and Georgia Tech) · Alexander William Clegg (Meta AI) · Eric Undersander (Meta) · Angel Xuan Chang (Simon Fraser University) · Manolis Savva (Simon Fraser University) |
859 | Interpretable Measures of Conceptual Similarity by Complexity-Constrained Descriptive Auto-Encoding | Alessandro Achille (California Institute of Technology) · Greg Ver Steeg (University of California, Riverside) · Tian Yu Liu (University of California, Los Angeles) · Matthew Trager (Amazon) · Carson Klingenberg (Amazon Web Services) · Stefano Soatto (AWS) |
860 | Variance-guided and Parameter-Efficient Feature Space Adaptation for Cross-domain Few-Shot Learning | Rashindrie Perera (University of Melbourne) · Saman Halgamuge (University of Melbourne) |
861 | TransNeXt: Robust Foveal Visual Perception for Vision Transformers | Dai Shi (Independent researcher) |
862 | Selectively Informative Description can Reduce Undesired Embedding Entanglements in Text-to-Image Personalization | Jimyeong Kim (Seoul National University) · Jungwon Park (Seoul National University) · Wonjong Rhee (Seoul National University) |
863 | Improving Subject-Driven Image Synthesis with Context-Agnostic Guidance | Kelvin C.K. Chan (Google) · Yang Zhao (Google) · Xuhui Jia (Google) · Ming-Hsuan Yang (University of California at Merced) · Huisheng Wang (Google) |
864 | StyLitGAN: Image-based Relighting via Latent Control | Anand Bhattad (None) · James Soole (University of Illinois Urbana-Champaign) · David Forsyth (University of Illinois at Urbana-Champaign) |
865 | ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations | Maitreya Patel (Arizona State University) · Changhoon Kim (Arizona State University) · Sheng Cheng (Arizona State University) · Chitta Baral (Arizona State University) · 'YZ' Yezhou Yang (Arizona State University) |
866 | WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models | Changhoon Kim (Arizona State University) · Kyle Min (Intel Labs) · Maitreya Patel (Arizona State University) · Sheng Cheng (Arizona State University) · 'YZ' Yezhou Yang (Arizona State University) |
867 | Relightable Gaussian Avatars | Shunsuke Saito (Reality Labs Research) · Gabriel Schwartz (Meta) · Tomas Simon (Meta) · Junxuan Li (Meta Reality Labs) · Giljoo Nam (Meta) |
868 | FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition | SICHENG MO (University of California, Los Angeles) · Fangzhou Mu (NVIDIA) · Kuan Heng Lin (University of California, Los Angeles) · Yanli Liu (OPPO US Research Center) · Bochen Guan (OPPO US Research Center) · Yin Li (University of Wisconsin, Madison) · Bolei Zhou (University of California, Los Angeles) |
869 | Utility-Fairness Trade-Offs and How to Find Them | Sepehr Dehdashtian (Michigan State University) · Bashir Sadeghi (Michigan State University) · Vishnu Naresh Boddeti (None) |
870 | Physical Property Understanding from Language-Embedded Feature Fields | Albert Zhai (University of Illinois at Urbana-Champaign) · Yuan Shen (None) · Emily Y. Chen (University of Illinois Urbana Champaign) · Gloria Wang (Department of Computer Science) · Xinlei Wang (University of Illinois Urbana-Champaign) · Sheng Wang (University of Illinois Urbana-Champaign) · Kaiyu Guan (University of Illinois, Urbana Champaign) · Shenlong Wang (University of Illinois, Urbana Champaign) |
871 | SLICE: Stabilized LIME for Consistent Explanations for Image Classification | Revoti Bora (Norwegian University of Science and Technology) · Kiran Raja (Norwegian University of Science and Technology) · Philipp Terhörst (Universität Paderborn) · Raymond Veldhuis (University of Twente) · Raghavendra Ramachandra (Norwegian University of Science and Technology (NTNU)) |
872 | GeneAvatar: Generic Expression-Aware Volumetric Head Avatar Editing from a Single Image | Chong Bao (Zhejiang University) · Yinda Zhang (Google) · Yuan Li (Zhejiang University) · Xiyu Zhang (Zhejiang University) · Bangbang Yang (ByteDance Inc) · Hujun Bao (Zhejiang University) · Marc Pollefeys (ETH Zurich / Microsoft) · Guofeng Zhang (Zhejiang University) · Zhaopeng Cui (None) |
873 | MuRF: Multi-Baseline Radiance Fields | Haofei Xu (Department of Computer Science, ETHZ - ETH Zurich) · Anpei Chen (Department of Computer Science, ETHZ - ETH Zurich) · Yuedong Chen (Monash University) · Christos Sakaridis (ETH Zurich) · Yulun Zhang (ETH Zürich) · Marc Pollefeys (ETH Zurich / Microsoft) · Andreas Geiger (University of Tübingen) · Fisher Yu (ETH Zurich) |
874 | SNI-SLAM: Semantic Neural Implicit SLAM | Siting Zhu (None) · Guangming Wang (University of Cambridge) · Hermann Blum (Computer Vision and Geometry Lab, ETH Zürich) · Jiuming Liu (Shanghai Jiao Tong University) · Liang Song (China University of Mining Technology - Xuzhou) · Marc Pollefeys (ETH Zurich / Microsoft) · Hesheng Wang (Shanghai Jiao Tong University) |
875 | SceneFun3D: Fine-Grained Functionality and Affordance Understanding in 3D Scenes | Alexandros Delitzas (ETH Zurich) · Ayça Takmaz (None) · Federico Tombari (Google, TUM) · Robert Sumner (Massachusetts Institute of Technology) · Marc Pollefeys (ETH Zurich / Microsoft) · Francis Engelmann (Department of Computer Science, ETHZ - ETH Zurich) |
876 | NeRF On-the-go: Exploiting Uncertainty for Distractor-free NeRFs in the Wild | weining ren (ETHz) · Zihan Zhu (ETHZ - ETH Zurich) · Boyang Sun (ETH Zurich) · Jiaqi Chen (ETHZ - ETH Zurich) · Marc Pollefeys (ETH Zurich / Microsoft) · Songyou Peng (ETH Zurich & MPI Tübingen) |
877 | GLACE: Global Local Accelerated Coordinate Encoding | Fangjinhua Wang (None) · Xudong Jiang (ETHZ - ETH Zurich) · Silvano Galliani (Microsoft) · Christoph Vogel (Microsoft) · Marc Pollefeys (ETH Zurich / Microsoft) |
878 | LEAP-VO: Long-term Effective Any Point Tracking for Visual Odometry | Weirong Chen (Technische Universität München) · Le Chen (Max Planck Institute for Intelligent Systems, Max-Planck Institute) · Rui Wang (Microsoft) · Marc Pollefeys (ETH Zurich / Microsoft) |
879 | F³Loc: Fusion and Filtering for Floorplan Localization | Changan Chen (None) · Rui Wang (Microsoft) · Christoph Vogel (Microsoft) · Marc Pollefeys (ETH Zurich / Microsoft) |
880 | Multi-Level Neural Scene Graphs for Dynamic Urban Environments | Tobias Fischer (Swiss Federal Institute of Technology) · Lorenzo Porzi (Facebook) · Samuel Rota Bulò (Meta) · Marc Pollefeys (ETH Zurich / Microsoft) · Peter Kontschieder (Meta) |
881 | Multiway Point Cloud Mosaicking with Diffusion and Global Optimization | Shengze Jin (Department of Computer Science, ETHZ - ETH Zurich) · Iro Armeni (Stanford University) · Marc Pollefeys (ETH Zurich / Microsoft) · Daniel Barath (ETHZ - ETH Zurich) |
882 | Structure-from-Motion from Pixel-wise Correspondences | Philipp Lindenberger (Department of Computer Science, ETHZ - ETH Zurich) · Paul-Edouard Sarlin (ETH Zurich) · Marc Pollefeys (ETH Zurich / Microsoft) |
883 | Efficient Solution of Point-Line Absolute Pose | Petr Hruby (Department of Computer Science, ETHZ - ETH Zurich) · Timothy Duff (University of Washington) · Marc Pollefeys (ETH Zurich / Microsoft) |
884 | When StyleGAN Meets Stable Diffusion: a | |
W |
Adapter for Personalized Image Generation | Xiaoming Li (MMLab@NTU) · Xinyu Hou (Nanyang Technological University) · Chen Change Loy (NANYANG TECHNOLOGICAL UNIVERSITY) | | 885 | Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling | Jianan Fan (University of Sydney) · Dongnan Liu (University of Sydney) · Hang Chang (Lawrence Berkeley National Lab) · Heng Huang (University of Pittsburgh) · Mei Chen () · Weidong Cai (The University of Sydney) | | 886 | AV-RIR: Audio-Visual Room Impulse Response Estimation | Anton Ratnarajah (University of Maryland, College Park) · Sreyan Ghosh (University of Maryland, College Park) · Sonal Kumar (University of Maryland, College Park) · Purva Chiniya (University of Maryland, College Park) · Dinesh Manocha (University of Maryland, College Park) | | 887 | Finding Lottery Tickets in Vision Models via Data-driven Spectral Foresight Pruning | Leonardo Iurada (Polytechnic Institute of Turin) · Marco Ciccone (Politecnico di Torino) · Tatiana Tommasi (Politecnico di Torino) | | 888 | Patch2Self2: Self-supervised Denoising on Coresets via Matrix Sketching | Shreyas Fadnavis (Johnson and Johnson) · Agniva Chowdhury (Oak Ridge National Laboratory) · Joshua Batson (Anthropic) · Petros Drineas (Purdue University) · Eleftherios Garyfallidis (Indiana University) | | 889 | SceneTextGen: Layout-Agnostic Scene Text Image Synthesis with Integrated Character-Level Diffusion and Contextual Consistency | Qilong Zhangli (Rutgers University) · Jindong Jiang (Rutgers University) · Di Liu (Rutgers University, New Brunswick) · Licheng Yu (None) · Xiaoliang Dai (Facebook) · Ankit Ramchandani (Meta Platforms, Inc.) · Guan Pang (Facebook) · Dimitris N. 
Metaxas (Rutgers) · Praveen Krishnan (Meta AI) | | 890 | Classification-Free 3D Object Grounding with Regularized Concept Learners | Chun Feng (University of Science and Technology of China) · Joy Hsu (Stanford University) · Weiyu Liu (Stanford University) · Jiajun Wu (Stanford University) | | 891 | Delving Deep into Diffusion Transformers for Image and Video Generation | Shoufa Chen (The University of Hong Kong) · Mengmeng Xu (Meta AI) · Jiawei Ren (Nanyang Technological University) · Yuren Cong (Institute of Information Processing, Leibniz University Hanover) · Sen He (Meta AI) · Yanping Xie (Meta) · Animesh Sinha (Meta AI) · Ping Luo (The University of Hong Kong) · Tao Xiang (University of Surrey) · Juan-Manuel Pérez-Rúa (Meta AI) | | 892 | Neural Visibility Field for Active Mapping | Shangjie Xue (Georgia Institute of Technology) · Jesse Dill (Georgia Institute of Technology) · Pranay Mathur (Georgia Institute of Technology) · Frank Dellaert (Google) · Panagiotis Tsiotras (Georgia Institute of Technology) · Danfei Xu (Georgia Institute of Technology) | | 893 | Taming the Tail in Class-Conditional GANs: Knowledge Sharing via Unconditional Training at Lower Resolutions | Saeed Khorram (Apple) · Mingqi Jiang (Oregon State University) · Mohamad Shahbazi (ETH Zürich) · Mohamad Hosein Danesh (McGill University) · Li Fuxin (Oregon State University) | | 894 | Real-Time Simulated Avatar from Head-Mounted Sensors | Zhengyi Luo (Carnegie Mellon University) · Jinkun Cao (Carnegie Mellon University) · Rawal Khirodkar (Meta) · Alexander Winkler (Meta) · Jing Huang (Facebook) · Kris Kitani (Carnegie Mellon University) · Weipeng Xu (Meta Reality Labs Research) | | 895 | RoHM: Robust Human Motion Reconstruction via Diffusion | Siwei Zhang (None) · Bharat Lal Bhatnagar (Eberhard-Karls-Universität Tübingen) · Yuanlu Xu (Meta Reality Labs Research) · Alexander Winkler (Meta) · Petr Kadlecek (Meta) · Siyu Tang (ETH Zurich) · Federica Bogo (Meta) | | 896 | CONFORM: Contrast is All 
You Need for High-Fidelity Text-to-Image Diffusion Models | Tuna Han Salih Meral (Virginia Tech) · Enis Simsar (ETH Zurich) · Federico Tombari (Google, TUM) · Pinar Yanardag (Virginia Polytechnic Institute and State University) | | 897 | PairAug: What Can Augmented Image-Text Pairs Do for Radiology? | Yutong Xie (University of Adelaide) · Qi Chen (The University of Adelaide) · Sinuo Wang (University of Adelaide) · Minh-Son To (Flinders University of South Australia) · Iris Lee (South Australia medical imaging) · Ee Win Khoo (The Queen Elizabeth Hospital) · Kerolos Hendy (Flinders University of South Australia) · Daniel Koh (Monash University, Malaysia Campus) · Yong Xia (Northwestern Polytechnical University) · Qi Wu (University of Adelaide) | | 898 | DART: Implicit Doppler Tomography for Radar Novel View Synthesis | Tianshu Huang (Carnegie Mellon University) · John Miller (Carnegie Mellon University) · Akarsh Prabhakara (Carnegie Mellon University) · Tao Jin (CMU, Carnegie Mellon University) · Tarana Laroia (CMU, Carnegie Mellon University) · Zico Kolter (Carnegie Mellon University) · Anthony Rowe (Carnegie Mellon University) | | 899 | Telling Left from Right: Identifying Geometry-Aware Semantic Correspondence | Junyi Zhang () · Charles Herrmann (Google) · Junhwa Hur (Google) · Eric Chen (University of Illinois Urbana-Champaign) · Varun Jampani (Google Research) · Deqing Sun (Google) · Ming-Hsuan Yang (University of California at Merced) | | 900 | SnAG: Scalable and Accurate Video Grounding | Fangzhou Mu (NVIDIA) · SICHENG MO (University of California, Los Angeles) · Yin Li (University of Wisconsin, Madison) | | 901 | Context-based and Diversity-driven Specificity in Compositional Zero-Shot Learning | Yun Li (CSIRO's Data61) · Zhe Liu (Tiktok) · Hang Chen (Snap Inc.) 
· Lina Yao (CSIRO's Data61 and University of New South Wales) | | 902 | Self-Supervised Facial Representation Learning with Facial Region Awareness | Zheng Gao (Queen Mary, University of London) · Ioannis Patras (Queen Mary University of London) | | 903 | Privacy-preserving Optics for Enhancing Protection in Face De-identification | Jhon Lopez (Universidad Industrial de Santander) · Carlos Hinojosa (KAUST) · TBD TBD (None) · Bernard Ghanem (KAUST) | | 904 | Semantic Line Combination Detector | JINWON KO (Korea University, Seoul) · Dongkwon Jin (Korea University) · Chang-Su Kim (Korea University) | | 905 | Fitting Flats to Flats | Gabriel Dogadov (Technische Universität Berlin) · Ugo Finnendahl (Technische Universität Berlin) · Marc Alexa (TU Berlin) | | 906 | Mind the Time: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis | Willi Menapace (University of Trento) · Aliaksandr Siarohin (Snap Inc.) · Ivan Skorokhodov (KAUST) · Ekaterina Deyneka (Snap Inc.) · Tsai-Shien Chen (University of California, Merced) · Anil Kag (Snap Inc.) · Yuwei Fang (Snap Inc.) · Aleksei Stoliar (None) · Elisa Ricci (University of Trento) · Jian Ren (Snap Inc.) · Sergey Tulyakov (Snap Inc.) 
| | 907 | SeaBird: Segmentation in Bird’s View with Dice Loss Improves Monocular 3D Detection of Large Objects | Abhinav Kumar (Michigan State University) · Yuliang Guo (Bosch US Research) · Xinyu Huang (Robert Bosch Research NA) · Liu Ren (Bosch Research) · Xiaoming Liu (None) | | 908 | Learning to Transform Dynamically for Better Adversarial Transferability | Rongyi Zhu (None) · Zeliang Zhang (University of Rochester) · Susan Liang (University of Rochester) · Zhuo Liu (University of Rochester) · Chenliang Xu (University of Rochester) | | 909 | Noise-free Out-of-Sight Trajectory Prediction With Vision-Positioning Denoising | Haichao Zhang (Northeastern University) · Yi Xu (Northeastern University) · Hongsheng Lu (Toyota Motor North America) · Takayuki Shimizu (Toyota Motor North America, Inc.) · Yun Fu (Northeastern University) | | 910 | TIM: A Time Interval Machine for Audio-Visual Video Understand | Jacob Chalk (None) · Jaesung Huh (University of Oxford) · Evangelos Kazakos (Czech Technical University of Prague) · Andrew Zisserman (University of Oxford) · Dima Damen () | | 911 | Splat-SLAM: Dense RGB-D SLAM via 3D Gaussian Splatting | Nikhil Keetha (Carnegie Mellon University) · Jay Karhade (Carnegie Mellon University) · Krishna Murthy Jatavallabhula (Massachusetts Institute of Technology) · Gengshan Yang (Reality Labs Research, Meta) · Sebastian Scherer (None) · Deva Ramanan (Carnegie Mellon University) · Jonathon Luiten (RWTH Aachen University) | | 912 | Dynamic Inertial Poser (DynaIP): Part-Based Motion Dynamics Learning for Enhanced Human Pose Estimation with Sparse Inertial Sensors | Yu Zhang (Shanghai Jiaotong University) · Songpengcheng Xia () · Lei Chu (University of Southern California) · Jiarui Yang (Shanghai Jiaotong University) · Qi Wu (Shanghai Jiaotong University) · Ling Pei (Shanghai Jiao Tong Univeristy) | | 913 | BadCLIP: Trigger-Aware Prompt Learning for Backdoor Attacks on CLIP | Jiawang Bai (None) · Kuofeng Gao (Tsinghua University, Tsinghua 
University) · Shaobo Min (University of Science and Technology of China) · Shu-Tao Xia (Shenzhen International Graduate School, Tsinghua University) · Zhifeng Li (Tencent) · Wei Liu (Tencent AI Lab) | | 914 | Quilt-LLaVA: Visual Instruction Tuning by Extracting Localized Narratives from Open-Source Histopathology Videos | Mehmet Saygin Seyfioglu (University of Washington) · Wisdom Ikezogwo (Department of Computer Science) · Fatemeh Ghezloo (University of Washington) · Ranjay Krishna (University of Washington) · Linda Shapiro (UW Reality Lab University of Washington) | | 915 | LOTUS: Evasive and Resilient Backdoor Attacks through Sub-Partitioning | Siyuan Cheng (Purdue University) · Guanhong Tao (Purdue University) · Yingqi Liu (Microsoft) · Guangyu Shen (Purdue University) · Shengwei An (Purdue University) · Shiwei Feng (Purdue University, West Lafayette) · Xiangzhe Xu (Purdue University) · Kaiyuan Zhang (Computer Science, Purdue University) · Shiqing Ma (University of Massachusetts at Amherst) · Xiangyu Zhang (, Purdue University) | | 916 | Spectral Meets Spatial: Harmonising 3D Shape Matching and Interpolation | Dongliang Cao (None) · Marvin Eisenberger (Technical University Munich) · Nafie El Amrani (Rheinische Friedrich-Wilhelms Universität Bonn) · Daniel Cremers (Technical University Munich) · Florian Bernard (University of Bonn) | | 917 | Quantifying Uncertainty in Motion Prediction with Variational Bayesian Mixture | Juanwu Lu (Purdue University) · Can Cui (Purdue University) · Yunsheng Ma (Purdue University) · Aniket Bera (Purdue University) · Ziran Wang (Purdue University) | | 918 | JRDB-PanoTrack: An Open-world Panoptic Segmentation and Tracking Robotic Dataset in Crowded Human Environments | Duy Tho Le (Monash University) · Chenhui Gou (Monash University) · Stavya Datta (Monash University) · Hengcan Shi (None) · Ian Reid (University of Adelaide) · Jianfei Cai (Monash University) · Hamid Rezatofighi (Monash University) | | 919 | HRVDA: High-Resolution 
Visual Document Assistant | Chaohu Liu (University of Science and Technology of China) · Kun Yin (Tencent YouTu Lab) · Haoyu Cao (Tencent Youtu Lab) · Xinghua Jiang (None) · Xin Li (Tencent Youtu Lab) · Yinsong Liu (Tencent Youtu Lab) · Deqiang Jiang (Tencent YouTu Lab) · Xing Sun (Tencent YouTu Lab) · Linli Xu (University of Science and Technology of China) | | 920 | UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition | Xiaohan Ding (Tencent AI Lab) · Yiyuan Zhang (The Chinese University of Hong Kong) · Yixiao Ge (Tencent) · Sijie Zhao (Tencent AI Lab) · Lin Song (Tencent AI Lab) · Xiangyu Yue (None) · Ying Shan (Tencent) | | 921 | Bridging Sources in Geospatial Sensing with Cross Sensor Pretraining | Boran Han (Amazon/AWS) · Shuai Zhang (Amazon) · Xingjian Shi (Boson AI) · Markus Reichstein (Max-Planck Institute) | | 922 | DiffPortrait3D: Single-Portrait Novel View Synthesis with 3D-Aware Diffusion | Yuming Gu (USC Institute for Creative Technologies, University of Southern California) · Hongyi Xu (Bytedance) · You Xie (Bytedance) · Guoxian Song (Bytedance Inc) · Yichun Shi (ByteDance) · Di Chang (University of Southern California) · Jing Yang (USC Institute for Creative Technologies) · Linjie Luo (ByteDance Inc.) 
| | 923 | Correlation-aware Coarse-to-fine MLPs for Deformable Medical Image Registration | Mingyuan Meng (The University of Sydney) · Dagan Feng (University of Sydney) · Lei Bi (the University of Sydney) · Jinman Kim (University of Sydney) | | 924 | Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Multimodal Humor Generation | Shanshan Zhong (SUN YAT-SEN UNIVERSITY) · Zhongzhan Huang (Sun Yat-Sen University) · Shanghua Gao (Harvard University) · Wushao Wen (SUN YAT-SEN UNIVERSITY) · Liang Lin (Sun Yat-sen University) · Marinka Zitnik (Harvard University) · Pan Zhou (Sea Group) | | 925 | Improved Visual Grounding through Self-Consistent Explanations | Ruozhen He (Rice University) · Paola Cascante-Bonilla (Rice University) · Ziyan Yang (Rice University) · Alex Berg (None) · Vicente Ordonez (Rice University) | | 926 | Visual Program Distillation: Distilling Tools and Programmatic Reasoning into Vision-Language Models | Yushi Hu (University of Washington) · Otilia Stretcu (Google Research) · Chun-Ta Lu (Google Research) · Krishnamurthy Viswanathan (Google) · Kenji Hata (Google) · Enming Luo (Google) · Ranjay Krishna (University of Washington) · Ariel Fuxman (Google) | | 927 | Instruct-Imagen: Image Generation with Multi-modal Instruction | Hexiang Hu (Google Deepmind) · Kelvin C.K. 
Chan (Google) · Yu-Chuan Su (Google) · Wenhu Chen (University of Waterloo) · Yandong Li (Google Research) · Kihyuk Sohn (Google) · Yang Zhao (Google) · Xue Ben (Google) · William Cohen (Google DeepMind) · Ming-Wei Chang (Google) · Xuhui Jia (Google) | | 928 | Attention-Propagation Network for Egocentric Heatmap to 3D Pose Lifting | Taeho Kang (Seoul National University) · Youngki Lee (Seoul National University) | | 929 | Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use | Imad Eddine Toubal (University of Missouri) · Aditya Avinash (Google) · Neil Alldrin (Google) · Jan Dlabal (Research, Google) · Wenlei Zhou (Google) · Enming Luo (Google) · Otilia Stretcu (Google Research) · Hao Xiong (Google) · Chun-Ta Lu (Google Research) · Howard Zhou (Google Research) · Ranjay Krishna (University of Washington) · Ariel Fuxman (Google) · Tom Duerig (Google) | | 930 | Putting the Object Back into Video Object Segmentation | Ho Kei Cheng (University of Illinois Urbana-Champaign) · Seoung Wug Oh (Adobe Systems) · Brian Price (Adobe Research) · Joon-Young Lee (Adobe Research) · Alexander G. 
Schwing (UIUC) | | 931 | ManiFPT: Defining and Analyzing Fingerprints of Generative Models | Hae Jin Song (University of Southern California) · Mahyar Khayatkhoei (USC/ISI) · Wael AbdAlmageed (Clemson University) | | 932 | Advancing Chemical Structure Recognition in Hand-Drawn Images by Atom-Level Entity Localization | Martijn Oldenhof (KU Leuven) · Edward De Brouwer (Yale University) · Adam Arany (KU Leuven) · Yves Moreau (University of Leuven) | | 933 | Scalable 3D Registration via Truncated Entry-wise Absolute Residuals | Tianyu Huang (None) · Liangzu Peng (Johns Hopkins University) · Rene Vidal (Johns Hopkins University) · Yun-Hui Liu (The Chinese University of Hong Kong) | | 934 | Asymmetric Masked Distillation for Pre-Training Small Foundation Models | Zhiyu Zhao (Nanjing University) · Bingkun Huang (Nanjing University) · Sen Xing (Tsinghua University, Tsinghua University) · Gangshan Wu (Nanjing University) · Yu Qiao (Shanghai Aritifcal Intelligence Laboratory) · Limin Wang (Nanjing University) | | 935 | Faces that Speak: Jointly Synthesising Talking Face and Speech from Text | Youngjoon Jang (Korea Advanced Institute of Science & Technology) · Kim (None) · Junseok Ahn (Korea Advanced Institute of Science and Technology) · Doyeop Kwak (Korea Advanced Institute of Science & Technology) · Hongsun Yang (42dot) · Yooncheol Ju (42dot) · ILHWAN KIM (None) · Byeong-Yeol Kim (42dot) · Joon Chung (KAIST) | | 936 | Class Incremental Learning with Multi-Teacher Distillation | Haitao Wen (University of Electronic Science and Technology of China) · Lili Pan (University of Electronic Science and Technology of China) · Yu Dai (University of Electronic Science and Technology of China) · Heqian Qiu (University of Electronic Science and Technology of China) · Lanxiao Wang (University of Electronic Science and Technology of China) · Qingbo Wu (University of Electronic Science and Technology of China) · Hongliang Li (University of Electronic Science and Technology of China, 
Tsinghua University) | | 937 | Generative Image Dynamics | Zhengqi Li (Google) · Richard Tucker (Google) · Noah Snavely (Google / Cornell) · Aleksander Holynski (UC Berkeley & Google Research) | | 938 | CaDeT: a Causal Disentanglement Approach for Robust Trajectory Prediction in Autonomous Driving | Mozhgan Pourkeshavarz (Huawei Technologies Ltd.) · Junrui Zhang (University of Toronto) · Amir Rasouli (Huawei Technologies Canada) | | 939 | Continual Forgetting for Pre-trained Vision Models | Hongbo Zhao (Institute of Automation, Chinese Academy of Sciences) · Bolin Ni (Institute of Automation, Chinese Academy of Sciences) · Junsong Fan (Centre for Artificial Intelligence and Robotics (CAIR) Hong Kong Institute of Science & Innovation Chinese Academy of Sciences) · Yuxi Wang (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Yuntao Chen (CAIR, HKISI, CAS) · Gaofeng Meng (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Zhaoxiang Zhang (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) | | 940 | AlignMiF: Geometry-Aligned Multimodal Implicit Field for Enhanced LiDAR-Camera Joint Synthesis | Tao Tang (SYSU) · Guangrun Wang (University of Oxford) · Yixing Lao (None) · Peng Chen (Alibaba Group) · Jie Liu (North China University of Technology) · Liang Lin (SUN YAT-SEN UNIVERSITY, Tsinghua University) · Kaicheng Yu (Alibaba Group) · Xiaodan Liang (Sun Yat-sen University) | | 941 | Distributionally Generative Augmentation for Fair Facial Attribute Classification | Fengda Zhang (Nanyang Technological University) · Qianpei He (Zhejiang University) · Kun Kuang (Zhejiang University) · Jiashuo Liu (Tsinghua University, Tsinghua University) · Long Chen (HKUST) · Chao Wu (Zhejiang University) · Jun Xiao (Zhejiang University) · Hanwang Zhang (Nanyang Technological University) | | 942 | CVT-xRF: Contrastive In-Voxel Transformer for 3D Consistent Radiance Fields from Sparse Inputs | 
Yingji Zhong (None) · Lanqing Hong (Huawei Technologies Ltd.) · Zhenguo Li (Huawei) · Dan Xu (Department of Computer Science and Engineering, The Hong Kong University of Science and Technology) |
943 | GOAT-Bench: A Benchmark for Multi-modal Lifelong Navigation | Mukul Khanna (Georgia Institute of Technology) · Ram Ramrakhya (None) · Gunjan Chhablani (Georgia Institute of Technology) · Sriram Yenamandra (Georgia Institute of Technology) · Theo Gervet (Carnegie Mellon University) · Matthew Chang (University of Illinois, Urbana Champaign) · Zsolt Kira (Georgia Institute of Technology) · Devendra Singh Chaplot (Carnegie Mellon University) · Dhruv Batra (FAIR (Meta) and Georgia Tech) · Roozbeh Mottaghi (Meta) |
944 | Learning Adaptive Spatial Coherent Correlations for Speech-Preserving Facial Expression Manipulation | Tianshui Chen (Guangdong University of Technology) · Jianman Lin (Guangdong University of Technology) · Zhijing Yang (Guangdong University of Technology) · Chunmei Qing (South China University of Technology) · Liang Lin (Sun Yat-sen University) |
945 | A Semi-supervised Nighttime Dehazing Baseline with Spatial-Frequency Aware and Realistic Brightness Constraint | Xiaofeng Cong () · Jie Gui (Southeast University) · Jing Zhang (The University of Sydney) · Junming Hou (Southeast University) · Hao Shen (Hefei University of Technology) |
946 | Bootstrapping Chest CT Image Understanding by Distilling Knowledge from X-ray Expert Models | Weiwei Cao (University of Science and Technology of China) · Jianpeng Zhang (None) · Yingda Xia (Alibaba Group) · Tony C. W.
MOK (Alibaba DAMO Academy) · Zi Li (Alibaba DAMO Academy) · Xianghua Ye (Zhejiang University) · Le Lu (Alibaba Group) · Jian Zheng (Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences) · Yuxing Tang (Alibaba Group) · Ling Zhang (Alibaba Group) |
947 | Bootstrapping SparseFormers from Vision Foundation Models | Ziteng Gao (National University of Singapore) · Zhan Tong (Tencent AI Lab) · Kevin Qinghong Lin (National University of Singapore) · Joya Chen (National University of Singapore) · Mike Zheng Shou (National University of Singapore) |
948 | THRONE: A Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models | Prannay Kaul (University of Oxford, University of Oxford) · Zhizhong Li (Amazon) · Hao Yang (Amazon) · Yonatan Dukler (AWS AI) · Ashwin Swaminathan (University of Maryland, College Park) · CJ Taylor (Penn) · Stefano Soatto (AWS) |
949 | Text-Guided 3D Face Synthesis - From Generation to Editing | Yunjie Wu (NetEase, Inc.)
· Yapeng Meng (Tsinghua University, Tsinghua University) · Zhipeng Hu (Leihuo Game, NetEase) · Lincheng Li () · Haoqian Wu (NetEase Fuxi AI Lab) · Kun Zhou (Zhejiang University) · Weiwei Xu (Zhejiang University) · Xin Yu (University of Queensland) | | 950 | Clockwork Diffusion: Efficient Generation With Model-Step Distillation | Amirhossein Habibian (Qualcomm AI Research) · Amir Ghodrati (QualComm AI Research) · Noor Fathima (Qualcomm Inc, QualComm) · Guillaume Sautiere (Qualcomm Inc, QualComm) · Risheek Garrepalli (Qualcomm Inc, QualComm) · Fatih Porikli (QualComm) · Jens Petersen (Qualcomm AI Research) | | 951 | Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models | Huan Ling (Nvidia, University of Toronto) · Seung Wook Kim (NVIDIA) · Antonio Torralba (MIT) · Sanja Fidler (Department of Computer Science, University of Toronto) · Karsten Kreis (NVIDIA) | | 952 | Inlier Confidence Calibration for Point Cloud Registration | Yongzhe Yuan (Xidian University) · Yue Wu (Xidian University) · Xiaolong Fan (Xidian University) · Maoguo Gong (Xidian University) · Qiguang Miao (Xidian University) · Wenping Ma (Xidian University) | | 953 | Scalable and Simplified Functional Map Learning | Robin Magnet (École Polytechnique) · Maks Ovsjanikov (Ecole Polytechnique, France) | | 954 | ADFactory: An Effective Framework for Generalizing Optical Flow with Nerf | Han Ling (Nanjing University of Science and Technology) · Quansen Sun (Nanjing University of Science and Technology) · Yinghui Sun (Nanjing University of Science and Technology) · Xian Xu (Southeast Community College Area) · Xingfeng Li (Nanjing University of Science and Technology) | | 955 | IReNe: Instant Recoloring of Neural Radiance Fields | Alessio Mazzucchelli (Arquimea Research Center) · Adrian Garcia-Garcia (Arquimea Research Center) · Elena Garces (Universidad Rey Juan Carlos) · Fernando Rivas-Manzaneque (None) · Francesc Moreno-Noguer (Universidad Politécnica de Cataluna) · Adrian 
Penate-Sanchez (Universidad de Las Palmas de Gran Canaria) |
956 | HardMo: A Large-scale Hardcase Dataset for Motion Capture | Jiaqi Liao (None) · Chuanchen Luo (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Yinuo Du (Beijing University of Posts and Telecommunications) · Yuxi Wang (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Xu-Cheng Yin (University of Science and Technology Beijing) · Man Zhang (None) · Zhaoxiang Zhang (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Junran Peng (Institute of automation, Chinese academy of science) |
957 | HandBooster: Boosting 3D Hand-Mesh Reconstruction by Conditional Synthesis and Sampling of Hand-Object Interactions | Hao Xu (None) · Li Haipeng (None) · Yinqiao Wang (The Chinese University of Hong Kong) · Shuaicheng Liu (None) · Chi-Wing Fu (The Chinese University of Hong Kong) |
958 | Don’t drop your samples! Coherence-aware training benefits Conditional diffusion | Nicolas Dufour (École Nationale des Ponts et Chaussées) · Victor Besnier (Valeo.ai) · Vicky Kalogeiton (Ecole polytechnique) · David Picard (None) |
959 | 3D Building Reconstruction from Monocular Remote Sensing Images with Multi-level Supervisions | Weijia Li (None) · Haote Yang (PJLab) · Zhenghao Hu (SUN YAT-SEN UNIVERSITY) · Juepeng Zheng (Sun Yat-Sen University) · Gui-Song Xia (Wuhan University) · Conghui He (None) |
960 | Improving Physics-Augmented Continuum Neural Radiance Fields-Based Geometry-Agnostic System Identification with Lagrangian Particle Optimization | Takuhiro Kaneko (None) |
961 | An Empirical Study of the Generalization Ability of Lidar 3D Object Detectors to Unseen Domains | George Eskandar (Universität Stuttgart) |
962 | Constrained Layout Design with Factor Graphs | Mohammed Haroon Dupty (National University of Singapore) · Yanfei Dong (PayPal Inc.)
· Sicong Leng (Nanyang Technological University) · Guoji Fu (National University of Singapore) · Yong Liang Goh (National University of Singapore) · Wei Lu (Singapore University of Technology and Design) · Wee Sun Lee (National University of Singapore) | | 963 | FastMAC: Stochastic Spectral Sampling of Correspondence Graph | Yifei Zhang (University of Chinese Academy of Sciences) · Hao Zhao (Tsinghua University, Tsinghua University) · Hongyang Li (Shanghai AI Lab) · Siheng Chen (Shanghai Jiao Tong University) | | 964 | Improved Zero-Shot Classification by Adapting VLMs with Text Descriptions | Oindrila Saha (University of Massachusetts at Amherst) · Grant Horn (University of Massachusetts at Amherst) · Subhransu Maji (University of Massachusetts, Amherst) | | 965 | Harnessing the Power of MLLMs for Transferable Text-to-Image Person ReID | Wentao Tan (South China University of Technology) · Changxing Ding (South China University of Technology) · Jiayu Jiang (South China University of Technology) · Fei Wang (South China University of Technology) · Yibing Zhan (JD Explore Academy) · Dapeng Tao (Yunnan University) | | 966 | Towards Generalizing to Unseen Domains with Few Labels | Chamuditha Galappaththige (Mohamed bin Zayed University of Artificial Intelligence) · Sanoojan Baliah (Mohamed bin Zayed University of Artificial Intelligence) · Malitha Gunawardhana (University of Auckland) · Muhammad Haris Khan (None) | | 967 | Focus on Your Instruction: Fine-grained and Multi-instruction Image Editing by Attention Modulation | guo (None) · Tianwei Lin (Horizon Robotics) | | 968 | Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts | Jialin Wu (Google) · Xia Hu (Research, Google) · Yaqing Wang (Research, Google) · Bo Pang (Google) · Radu Soricut (Google) | | 969 | Distilling CLIP with Dual Guidance for Learning Discriminative Human Body Shape Representation | Feng Liu (Michigan State University) · Minchul Kim (Michigan State University) · 
Zhiyuan Ren (Michigan State University) · Xiaoming Liu (None) | | 970 | Observation-Guided Diffusion Probabilistic Models | Junoh Kang (Seoul National University) · Jinyoung Choi (Seoul National University) · Sungik Choi (LG AI Research) · Bohyung Han (Seoul National University) | | 971 | Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation | Sihan liu (Xiamen University) · Yiwei Ma (Xiamen University) · Xiaoqing Zhang (Xiamen University) · Haowei Wang (Xiamen University) · Jiayi Ji (Xiamen University) · Xiaoshuai Sun (Xiamen University) · Rongrong Ji (Xiamen University) | | 972 | GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding | Chengyao Wang (Department of Computer Science and Engineering, The Chinese University of Hong Kong) · Li Jiang (Max Planck Institute for Informatics) · Xiaoyang Wu (The University of Hong Kong) · Zhuotao Tian (The Chinese University of Hong Kong) · Bohao Peng (The Chinese University of Hong Kong) · Hengshuang Zhao (The University of Hong Kong) · Jiaya Jia (The Chinese University of Hong Kong) | | 973 | Fully Exploiting Every Real Sample: Super-Pixel Sample Gradient Model Stealing | Yunlong Zhao () · Xiaoheng Deng (Central South University) · Yijing Liu (Zhejiang University) · Xinjun Pei (None) · Jiazhi Xia (Central South University) · Wei Chen (State key laboratory of CAD&CG) | | 974 | LED: A Large-scale Real-world Paired Dataset for Event Camera Denoising | Yuxing Duan (None) | | 975 | MedM2G: Unifying Medical Multi-Modal Generation via Cross-Guided Diffusion with Visual Invariant | Chenlu Zhan (None) · Gaoang Wang (Zhejiang University) · Yu LIN (Zhejiang University) · Hongwei Wang (Zhejiang University) · Jian Wu (Zhejiang University) | | 976 | DePT: Decoupled Prompt Tuning | Ji Zhang (University of Electronic Science and Technology of China) · Shihan Wu (University of Electronic Science and Technology of China) · Lianli Gao (University of Electronic Science and 
Technology of China, Tsinghua University) · Heng Tao Shen (University of Electronic Science and Technology of China) · Jingkuan Song (University of Electronic Science and Technology of China,) | | 977 | A Subspace-Constrained Tyler's Estimator and its Applications to Structure from Motion | Feng Yu (University of Minnesota - Twin Cities) · Teng Zhang (University of Central Florida) · Gilad Lerman (University of Minnesota, Minneapolis) | | 978 | Bi-level Learning of Task-Specific Decoders for Joint Registration and One-Shot Segmentation | Xin Fan (Dalian University of Technology) · Xiaolin Wang (Dalian University of Technology) · Jiaxin Gao (Dalian University of Technology) · Jia Wang (Dalian University of Technology) · Zhongxuan Luo (Dalian University of Technology) · Risheng Liu (Dalian University of Technology) | | 979 | Osprey: Pixel Understanding with Visual Instruction Tuning | Yuqian Yuan (Zhejiang University) · Wentong Li (College of Computer Science and Technology, Zhejiang University) · Jian liu (AntGroup) · Dongqi Tang (Ant Group) · Xinjie Luo (Zhejiang University) · Chi Qin (Microsoft) · Lei Zhang (The Hong Kong Polytechnic University) · Jianke Zhu (Zhejiang University) | | 980 | NeRFCodec: Neural Feature Compression Meets Neural Radiance Fields for Memory-efficient Scene Representation | Sicheng Li (Zhejiang University) · Hao Li (None) · Yiyi Liao (Zhejiang University) · Lu Yu (Zhejiang University) | | 981 | Diffusion Reflectance Map: Single-Image Stochastic Inverse Rendering of Illumination and Reflectance | Yuto Enyo (None) · Ko Nishino (Kyoto University) | | 982 | Contrastive Denoising Score for Text-guided Latent Diffusion Image Editing | Hyelin Nam (Korea Advanced Institute of Science & Technology) · Gihyun Kwon (Korea Advanced Institute of Science & Technology) · Geon Yeong Park (Korea Advanced Institute of Science and Technology) · Jong Chul Ye (Korea Advanced Institute of Science and Technology) | | 983 | Domain Prompt Learning with Quaternion 
Networks | Qinglong Cao (Shanghai Jiao Tong University) · Zhengqin Xu (Shanghai Jiaotong University) · Yuntian Chen (Eastern Institute for Advanced Study) · Chao Ma (Shanghai Jiao Tong University) · Xiaokang Yang (Shanghai Jiao Tong University, China) | | 984 | Towards More Unified In-context Visual Understanding | Dianmo Sheng (University of Science and Technology of China) · Dongdong Chen (Microsoft Research) · Zhentao Tan (Alibaba DAMO Academy; University of Science and Technology of China) · Qiankun Liu (Beijing Institute of Technology) · Qi Chu (University of Science and Technology of China) · Jianmin Bao (Microsoft) · Tao Gong (University of Science and Technology of China) · Bin Liu (None) · Shengwei Xu (Beijing Electronic Science and Technology Institute) · Nenghai Yu (University of Science and Technology of China) | | 985 | Low-Resource Vision Challenges for Foundation Models | Yunhua Zhang (University of Amsterdam) · Hazel Doughty (None) · Cees G. M. Snoek (University of Amsterdam) | | 986 | Spacetime Gaussian Feature Splatting for Real-Time Dynamic View Synthesis | Zhan Li () · Zhang Chen (OPPO US Research Center, InnoPeak Technology, Inc.) · Zhong Li (InnoPeak Technology) · Yi Xu (OPPO US Research Center) | | 987 | Long-Tailed Anomaly Detection with Learnable Class Names | Chih-Hui Ho (University of California San Diego) · Kuan-Chuan Peng (Mitsubishi Electric Research Laboratories (MERL)) · Nuno Vasconcelos (University of California San Diego) | | 988 | DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection | Lewei Yao (Harbin Institute of Technology) · Renjie Pi (None) · Jianhua Han (Huawei Technologies Ltd.) · Xiaodan Liang (Sun Yat-sen University) · Hang Xu (Huawei Noah‘s Ark Lab) · Wei Zhang (Huawei Technologies Ltd.) 
· Zhenguo Li (Huawei) · Dan Xu (Department of Computer Science and Engineering, The Hong Kong University of Science and Technology) | | 989 | Uncertainty-Driven Continual Learning for Autonomous Driving | Lei Lai (Boston University, Boston University) · Eshed Ohn-Bar (Boston University, Boston University) · Sanjay Arora (Red Hat, Inc.) · John Yi (Boston University, Boston University) | | 990 | PlatoNeRF: 3D Reconstruction in Plato’s Cave via Single-View Two-Bounce Lidar | Tzofi Klinghoffer (Massachusetts Institute of Technology) · Xiaoyu Xiang (Meta) · Siddharth Somasundaram (Massachusetts Institute of Technology) · Yuchen Fan (Facebook) · Christian Richardt (Meta Reality Labs) · Ramesh Raskar (Massachusetts Institute of Technology) · Rakesh Ranjan () | | 991 | VideoMosaic: Connecting the Temporal Dots in Long Videos for LLMs | Reuben Tan (Boston University) · Ximeng Sun (Boston University) · Ping Hu (University of Electronic Science and Technology of China) · Jui-Hsien Wang (Adobe Systems) · Hanieh Deilamsalehy (None) · Bryan A. 
Plummer (None) · Bryan Russell (Adobe Research) · Kate Saenko (Meta / Boston University) |
992 | PracticalDG: Perturbation Distillation on Vision-Language Models for Hybrid Domain Generalization | Zining Chen (Beijing University of Posts and Telecommunications) · Weiqiu Wang (Beijing University of Posts and Telecommunications) · Zhicheng Zhao (Beijing University of Posts and Telecommunications) · Fei Su (Beijing University of Posts and Telecommunications) · Aidong Men (Beijing University of Posts and Telecommunications) · Hongying Meng (None) |
993 | Estimating Noisy Class Posterior with Part-level Labels for Noisy Label Learning | Rui Zhao (None) · Bin Shi (Xi'an Jiaotong University) · Jianfei Ruan (Xi'an Jiaotong University) · Tianze Pan (Xi'an Jiaotong University) · Bo Dong (Xi'an Jiaotong University) |
994 | Improving Single Domain-Generalized Object Detection: A Focus on Diversification and Alignment | Muhammad Sohail Danish (Mohamed bin Zayed University of Artificial Intelligence) · Muhammad Haris Khan (None) · Muhammad Akhtar Munir (None) · M.
Saquib Sarfraz (Karlsruhe Institute of Technology / Mercedes-Benz) · Mohsen Ali (Information Technology University) | | 995 | Unmixing Diffusion for Self-Supervised Hyperspectral Image Denoising | Haijin Zeng (IMEC & Universiteit Gent) · Jiezhang Cao (ETH Zürich) · Yongyong Chen (Harbin Institute of Technology (Shenzhen)) · Kai Zhang (None) · Hiep Luong (Universiteit Gent) · Wilfried Philips (Universiteit Gent) | | 996 | DiffusionGAN3D: Boosting Text-guided 3D Generation and Domain Adaption by Combining 3D GANs and Diffusion Priors | Biwen Lei (Alibaba Group) · Kai Yu (None) · Mengyang Feng (Alibaba Group) · Miaomiao Cui (Alibaba Group) · Xuansong Xie (Alibaba Group) | | 997 | ZeroShape: Regression-based Zero-shot Shape Reconstruction | Zixuan Huang (University of Illinois Urbana-Champaign) · Stefan Stojanov (Georgia Institute of Technology) · Anh Thai (Georgia Institute of Technology) · Varun Jampani (Google Research) · James Rehg (None) | | 998 | Your Transferability Barrier is Fragile: Free-Lunch for Transferring the Non-Transferable Learning | Ziming Hong (The University of Sydney) · Li Shen (JD Explore Academy) · Tongliang Liu (Mohamed bin Zayed University of Artificial Intelligence) | | 999 | ARTrackV2: Prompting Autoregressive Tracker Where to Look and How to Describe | Yifan Bai (Xi’an Jiaotong University) · Zeyang Zhao (Xi'an Jiaotong University) · Yihong Gong (Xi'an Jiaotong University) · Xing Wei (None) | | 1000 | DPHMs: Diffusion Parametric Head Models for Depth-based Tracking | Jiapeng Tang (Technische Universität München) · Angela Dai () · Yinyu Nie (Huawei Technologies Ltd.) 
· Lev Markhasin (None) · Justus Thies (Max-Planck Institute for Intelligent Systems) · Matthias Nießner (Technical University of Munich) | | 1001 | CNC-Net: Self-Supervised Learning for CNC Machining Operations | Mohsen Yavartanoo (None) · Sangmin Hong (Seoul National University) · Reyhaneh Neshatavar (None) · Kyoung Mu Lee (Seoul National University) | | 1002 | MaxQ: Multi-Axis Query for N:M Sparsity Network | Jingyang Xiang (None) · Siqi Li (Zhejiang University) · Junhao Chen (Zhejiang University) · Zhuangzhi Chen (Zhejiang University of Technology) · Tianxin Huang (Tencent youtu lab) · Linpeng Peng (Zhejiang University) · Yong Liu (Zhejiang University) | | 1003 | High-Quality Facial Geometry and Appearance Capture at Home | Yuxuan Han (Tsinghua University) · Junfeng Lyu (School of Software, Tsinghua University) · Feng Xu (Tsinghua University, Tsinghua University) | | 1004 | mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration | Qinghao Ye (Alibaba Group) · Haiyang Xu (Alibaba Group) · Jiabo Ye (East China Normal University) · Ming Yan (Alibaba Group) · Anwen Hu (Alibaba Group) · Haowei Liu (Institute of Automation, Chinese Academy of Sciences) · Qi Qian (Alibaba Group) · Ji Zhang (Alibaba Group) · Fei Huang (Alibaba Group) | | 1005 | Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models | Haoning Wu (Nanyang Technological University) · Zicheng Zhang (Shanghai Jiaotong University) · Erli Zhang (Nanyang Technological University) · Chaofeng Chen (Nanyang Technological University) · Liang Liao (Nanyang Technological University) · Annan Wang (Nanyang Technological University) · Kaixin Xu (I2R, ASTAR) · Chunyi Li (None) · Jingwen Hou (Nanyang Technological University) · Guangtao Zhai (Shanghai Jiao Tong University) · Xue Geng (Institute for Infocomm Research, ASTAR) · Wenxiu Sun (SenseTime Research and Tetras.AI) · Qiong Yan (SenseTime Research) · Weisi Lin (Nanyang Technological University) | | 1006 
| Deep Equilibrium Diffusion Restoration with Parallel Sampling | Jiezhang Cao (ETH Zürich) · Yue Shi (Shanghai Jiaotong University) · Kai Zhang (None) · Yulun Zhang (ETH Zürich) · Radu Timofte (University of Würzburg) · Luc Van Gool (ETH Zurich) | | 1007 | Learning One-Shot 4D Head Avatar Synthesis using Synthetic Data | Yu Deng (Xiaobing.ai) · Duomin Wang () · Xiaohang Ren (xiaobing) · Xingyu Chen (Xiaobing.AI) · Baoyuan Wang (Xiaobing.ai) | | 1008 | Efficient Scene Recovery Using Luminous Flux Prior | ZhongYu Li (University of Science and Technology of China) · Lei Zhang (University of Science and Technology of China) | | 1009 | Insect-Foundation: A Foundation Model and Large-scale 1M Dataset for Visual Insect Understanding | Hoang-Quan Nguyen (University of Arkansas - Fayetteville) · Thanh-Dat Truong (University of Arkansas) · Xuan-Bac Nguyen (None) · Ashley Dowling (University of Arkansas - Fayetteville) · Xin Li (State University of New York at Albany) · Khoa Luu (University of Arkansas) | | 1010 | IMPRINT: Generative Object Compositing by Learning Identity-Preserving Representation | Yizhi Song (Purdue University) · Zhifei Zhang (Adobe Research) · Zhe Lin (Adobe Research) · Scott Cohen (Adobe Systems) · Brian Price (Adobe Research) · Jianming Zhang (Adobe Systems) · Soo Ye Kim (Adobe Systems) · He Zhang (Adobe Systems) · Wei Xiong (Adobe Systems) · Daniel Aliaga (Purdue University) | | 1011 | Learning Without Exact Guidance: Updating Large-scale High-resolution Land Cover Maps from Low-resolution Historical Labels | Zhuohong Li () · Wei He (Wuhan University) · Jiepan Li (None) · Fangxiao Lu (Wuhan University) · Hongyan Zhang (China University of Geosciences Wuhan) | | 1012 | Troika: Multi-Path Cross-Modal Traction for Compositional Zero-Shot Learning | Siteng Huang (Zhejiang University & Westlake University) · Biao Gong (Alibaba Group) · Yutong Feng (Alibaba Group) · Zhang Min (Westlake University) · Yiliang Lv (Gientech AIL) · Donglin Wang (Westlake 
University) |
1013 | Hyperbolic Anomaly Detection | Huimin Li (Beihang University) · Zhentao Chen (Beihang University) · Yunhao Xu (Beihang University) · Junlin Hu (Beihang University) |
1014 | Multiple View Geometry Transformers for 3D Human Pose Estimation | Ziwei Liao (University of Toronto) · Jialiang Zhu (Southeast University) · Chunyu Wang (Microsoft) · Han Hu (Microsoft Research Asia) · Steven L. Waslander (University of Toronto) |
1015 | H-ViT: A Hierarchical Vision Transformer for Deformable Image Registration | MORTEZA GHAHREMANI (Technische Universität München) · Mohammad Khateri (University of Eastern Finland) · Bailiang Jian (Technische Universität München) · Benedikt Wiestler (Technical University Munich) · Ehsan Adeli (Stanford University) · Christian Wachinger (Technische Universität München) |
1016 | RTMO: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation | Peng Lu (SIGS, Tsinghua University) · Tao Jiang (Shanghai AI Laboratory) · Yining Li (Shanghai AI Laboratory) · Xiangtai Li (Nanyang Technological University) · Kai Chen (Shanghai AI Laboratory) · Wenming Yang (Tsinghua University) |
1017 | Rethinking the Representation in Federated Unsupervised Learning with Non-IID Data | Xinting Liao (Zhejiang University) · Weiming Liu (Zhejiang University) · Chaochao Chen (Zhejiang University) · Pengyang Zhou (Zhejiang University) · Fengyuan Yu (Zhejiang University) · Huabin Zhu (Zhejiang University) · Binhui Yao (University of Canberra) · Tao Wang (Midea Group) · Xiaolin Zheng (Zhejiang University) · Yanchao Tan (Fuzhou University) |
1018 | SCULPT: Shape-Conditioned Unpaired Learning of Pose-dependent Clothed and Textured Human Meshes | Soubhik Sanyal (None) · Partha Ghosh (Max Planck Institute for Intelligent Systems, Max-Planck Institute) · Jinlong Yang (Google) · Michael J.
Black (University of Tübingen) · Justus Thies (Max-Planck Institute for Intelligent Systems) · Timo Bolkart (Google) | | 1019 | Contrastive Pre-Training with Multi-View Fusion for No-Reference Point Cloud Quality Assessment | Ziyu Shan (Shanghai Jiao Tong University) · Yujie Zhang (Shanghai Jiao Tong University) · Qi Yang (Tencent MediaLab) · Haichen Yang (Shanghai Jiaotong University) · Yiling Xu (None) · Jenq-Neng Hwang (None) · Xiaozhong Xu (Tencent Media Lab) · Shan Liu (Tencent Media Lab) | | 1020 | Training-free Pretrained Model Merging | Zhengqi Xu (Zhejiang University) · Ke Yuan (None) · Huiqiong Wang (Zhejiang University) · Yong Wang (State Grid Shandong Electronic Power Company) · Mingli Song (Zhejiang University) · Jie Song (Zhejiang University) | | 1021 | Soften to Defend: Towards Adversarial Robustness via Self-Guided Label Refinement | David Yu (None) · Zhuorong Li (None) · Lina Wei (Hangzhou City University ) · Canghong Jin (Hangzhou City University) · Yun Zhang (Hangzhou City University) · Sixian Chan (the College of Computer Science and Technology at Zhejiang University of Technology) | | 1022 | Anatomically Constrained Implicit Face Models | Prashanth Chandran (None) · Gaspard Zoss (Disney Research, Disney) | | 1023 | Revisiting Global Translation Estimation with Feature Tracks | Peilin Tao (None) · Hainan Cui (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Mengqi Rong (, Institute of automation, Chinese academy of science) · Shuhan Shen (Institute of automation, Chinese academy of science) | | 1024 | LoCoNet: Long-Short Context Network for Active Speaker Detection | Xizi Wang (Indiana University, Bloomington) · Feng Cheng (University of North Carolina at Chapel Hill) · Gedas Bertasius (UNC Chapel Hill) | | 1025 | WinSyn: A High Resolution Testbed for Synthetic Data | Tom Kelly (King Abdullah University of Science and Technology) · John Femiani (None) · Peter Wonka (KAUST) | | 1026 | Forgery-aware Adaptive 
Transformer for Generalizable Synthetic Image Detection | Huan Liu (None) · Zichang Tan (Baidu) · Chuangchuang Tan (Beijing Jiaotong University) · Yunchao Wei (UTS) · Jingdong Wang (Baidu) · Yao Zhao (Beijing Jiaotong University) | | 1027 | Robust Distillation via Untargeted and Targeted Intermediate Adversarial Samples | Junhao Dong (Nanyang Technological University) · Piotr Koniusz (Australian National University) · Junxi Chen (SUN YAT-SEN UNIVERSITY) · Z. Wang (University of British Columbia) · Yew-Soon Ong (Nanyang Technological University) | | 1028 | Flatten Long-Range Loss Landscapes for Cross-Domain Few-Shot Learning | Yixiong Zou (Huazhong University of Science and Technology) · Yicong Liu (Huazhong University of Science and Technology) · Yiman Hu (Huazhong University of Science and Technology) · Yuhua Li (Huazhong University of Science and Technology) · Ruixuan Li (Huazhong University of Science and Technology) | | 1029 | Neural Super-Resolution for Real-time Rendering with Radiance Demodulation | Jia Li (Shandong University) · Ziling Chen (Shandong University) · Xiaolong Wu (None) · Lu Wang (Shandong University) · Beibei Wang (Nankai University) · Lei Zhang (The Hong Kong Polytechnic University) | | 1030 | Noisy One Point Homographies are Surprisingly Good | Yaqing Ding (None) · Jonathan Astermark (Lund University / Lund Institute of Technology) · Magnus Oskarsson (Lund University) · Viktor Larsson (Lund University) | | 1031 | Alchemist: Parametric Control of Material Properties with Diffusion Models | Prafull Sharma (Massachusetts Institute of Technology) · Varun Jampani (Google Research) · Yuanzhen Li (Massachusetts Institute of Technology) · Xuhui Jia (Google) · Dmitry Lagun (Google) · Fredo Durand (Massachusetts Institute of Technology) · William Freeman (MIT and Google) · Mark Matthews (Google) | | 1032 | DisCo: Disentangled Control for Realistic Human Dance Generation | Tan Wang (Nanyang Technological University) · Linjie Li (Microsoft) · Kevin Lin 
(Microsoft) · Yuanhao Zhai (State University of New York at Buffalo) · Chung-Ching Lin (Microsoft) · Zhengyuan Yang (Microsoft) · Hanwang Zhang (Nanyang Technological University) · Zicheng Liu (Microsoft) · Lijuan Wang (Microsoft) | | 1033 | PaReNeRF: Toward Fast Large-scale Dynamic NeRF with Patch-based Reference | Xiao Tang (None) · Min Yang (None) · Penghui Sun (Samsung R&D Institute) · Hui Li (Samsung R&D Institute China Xi’an (SRCX)) · Yuchao Dai (Northwestern Polytechnical University) · feng zhu (None) · Hojae Lee (None) | | 1034 | Validating Privacy-Preserving Face Recognition under a Minimum Assumption | Hui Zhang (None) · Xingbo Dong (Anhui University) · YenLungLai (Anhui University) · Ying Zhou (Anhui University) · Xiaoyan ZHANG (Anhui University) · Xingguo Lv (Anhui University) · Zhe Jin (Anhui University) · Xuejun Li (Anhui University) | | 1035 | FLHetBench: Benchmarking Device and State Heterogeneity in Federated Learning | Junyuan Zhang (University of Hong Kong) · Shuang Zeng (The University of Hong Kong) · Miao Zhang (New York University) · Runxi Wang (Beijing University of Aeronautics and Astronautics) · Feifei Wang (Stanford University) · Yuyin Zhou (UC Santa Cruz) · Paul Pu Liang (Carnegie Mellon University) · Liangqiong Qu (The University of Hong Kong) | | 1036 | Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval | Jiamian Wang (Rochester Institute of Technology) · Guohao Sun (Rochester Institute of Technology) · Pichao Wang (Amazon) · Dongfang Liu (Rochester Institute of Technology) · Sohail Dianat (Rochester Institute of Technology) · MAJID RABBANI (Rochester Institute of Technology) · Raghuveer Rao (DEVCOM Army Research Laboratory) · ZHIQIANG TAO (Rochester Institute of Technology) | | 1037 | JRDB-Social: A Multifaceted Robotic Dataset for Understanding of Context and Dynamics of Human Interactions Within Social Groups | Simindokht Jahangard (None) · Zhixi Cai (None) · Shiki Wen (Monash University) · Hamid Rezatofighi 
(Monash University) | | 1038 | Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding | Peng Jin (Peking University) · Ryuichi Takanobu (miHoYo) · Cai Zhang (Nanrui Group Co., Ltd) · Xiaochun Cao (SUN YAT-SEN UNIVERSITY) · Li Yuan (Peking University) | | 1039 | Suppress and Balance: Towards Generalized Multi-Modal Face Anti-Spoofing | Xun Lin (Beihang University) · Shuai Wang (Beihang University) · RIZHAO CAI (Nanyang Technological University) · Yizhong Liu (Beihang University) · Ying Fu (None) · Wenzhong Tang (Beihang University) · Zitong YU (Nanyang Technological University) · Alex C. Kot (Nanyang Technological University) | | 1040 | Constructing and Exploring Intermediate Domains in Mixed Domain Semi-supervised Medical Image Segmentation | Qinghe Ma (Nanjing University) · Jian Zhang (Nanjing university) · Lei Qi (Southeast University) · Qian Yu (Shandong Women's University) · Yinghuan Shi (Nanjing University) · Yang Gao (Nanjing University) | | 1041 | Universal Novelty Detection through Adaptive Contrastive Learning | Hossein Mirzaei (Sharif University of Technology, Sharif University of Technology) · Mojtaba Nafez (Sharif University of Technology) · Mohammad Jafari (Sharif University of Technology) · Mohammad Soltani (Sharif University of Technology) · Mohammad Azizmalayeri (Amsterdam UMC) · Jafar Habibi (Sharif University of Technology) · Mohammad Sabokrou (Okinawa Institute of Science and Technology (OIST)) · Mohammad Rohban (Sharif University of Technology) | | 1042 | LAMP: Learn A Motion Pattern for Few-Shot Video Generation | Rui-Qi Wu (Nankai University) · Liangyu Chen (Megvii Technology Inc.) 
· Tong Yang (Fudan University) · Chun-Le Guo (None) · Chongyi Li () · Xiangyu Zhang (MEGVII Technology) | | 1043 | CLiC: Concept Learning in Context | Mehdi Safaee (SFU GrUVi Lab) · Aryan Mikaeili (Simon Fraser University) · Or Patashnik (Tel Aviv University) · Daniel Cohen-Or (Google) · Ali Mahdavi Amiri (Simon Fraser University) | | 1044 | Dynamic Graph Representation with Knowledge-aware Attention for Histopathology Whole Slide Image Analysis | Jiawen Li (None) · Yuxuan Chen (None) · Hongbo Chu (None) · Sun Qiehe (Tsinghua University) · Tian Guan (Graduate School at Shenzhen, Tsinghua University) · Anjia Han (SUN YAT-SEN UNIVERSITY) · Yonghong He (Tsinghua University, Tsinghua University) | | 1045 | Empowering Dynamics-aware Text-to-Video Diffusion with LLMs | Hao Fei (National University of Singapore) · Shengqiong Wu (National University of Singapore) · Wei Ji (None) · Hanwang Zhang (Nanyang Technological University) · Tat-seng Chua (National University of Singapore) | | 1046 | LEAD: Exploring Logit Space Evolution for Model Selection | Zixuan Hu (None) · Xiaotong Li (Peking University) · SHIXIANG TANG (The Chinese University of Hong Kong) · Jun Liu () · Yichun Hu (Peking University) · Ling-Yu Duan (Peking University) | | 1047 | PixelLM: Pixel Reasoning with Large Multimodal Model | Zhongwei Ren (Beijing Jiaotong University) · Zhicheng Huang (University of Science and Technology Beijing) · Yunchao Wei (UTS) · Yao Zhao (Beijing Jiaotong University) · Dongmei Fu (University of Science and Technology Beijing) · Jiashi Feng (ByteDance) · Xiaojie Jin (ByteDance Inc./TikTok) | | 1048 | Towards CLIP-driven Language-free 3D Visual Grounding via 2D-3D Relational Enhancement and Consistency | Yuqi Zhang (Sichuan University) · Han Luo (Sichuan University) · Yinjie Lei (Sichuan University) | | 1049 | MR-VNet: Media Restoration using Volterra Networks | Siddharth Roheda (Samsung Research) · Amit Unde (SRIB Bangalore) · Loay Rashid (Samsung Research Institute Bangalore) | | 
1050 | Single-Model and Any-Modality for Video Object Tracking | Zongwei Wu (Bayerische Julius-Maximilians-Universität Würzburg) · Jilai Zheng (Shanghai Jiaotong University) · Xiangxuan Ren (Shanghai Jiao Tong University) · Florin-Alexandru Vasluianu (Bayerische Julius-Maximilians-Universität Würzburg) · Chao Ma (Shanghai Jiao Tong University) · Danda Paudel (None) · Luc Van Gool (ETH Zurich) · Radu Timofte (University of Würzburg) |
1051 | Neural Fields as Distributions: Signal Processing Beyond Euclidean Space | Daniel Rebain (None) · Soroosh Yazdani (Google) · Kwang Moo Yi (University Of British Columbia) · Andrea Tagliasacchi (Simon Fraser University) |
1052 | OmniMotionGPT: Animal Motion Generation with Limited Data | Zhangsihao Yang (None) · Mingyuan Zhou (Innopeak Technology) · Mengyi Shan (University of Washington) · Bingbing Wen (University of Washington) · Ziwei Xuan (Innopeak Technology) · Mitch Hill (None) · Junjie Bai (CuraCloud Corporation) · Guo-Jun Qi (University of Central Florida) · Yalin Wang (Arizona State University) |
1053 | Feature Re-Embedding: Towards Foundation Model-Level Performance in Computational Pathology | Wenhao Tang (Chongqing University) · Fengtao ZHOU (Department of Computer Science and Engineering, Hong Kong University of Science and Technology) · Sheng Huang (Chongqing University) · Xiang Zhu (Chongqing University) · Yi Zhang (Chongqing University) · Bo Liu (Rutgers University) |
1054 | WonderJourney: Going from Anywhere to Everywhere | Hong-Xing Yu (Computer Science Department, Stanford University) · Haoyi Duan (Stanford University) · Junhwa Hur (Google) · Kyle Sargent (Computer Science Department, Stanford University) · Michael Rubinstein (Google) · William Freeman (MIT and Google) · Forrester Cole (Google) · Deqing Sun (Google) · Noah Snavely (Google / Cornell) · Jiajun Wu (Stanford University) · Charles Herrmann (Google) |
1055 | UFORecon: Generalizable Sparse-View Surface Reconstruction from Arbitrary and Unfavorable Data Pairs | Youngju Na (KAIST) · Woo Jae Kim (Korea Advanced Institute of Science and Technology (KAIST)) · Kyu Han (Korea Advanced Institute of Science & Technology) · Suhyeon Ha (Korea Advanced Institute of Science and Technology) · Sung-Eui Yoon (KAIST) |
1056 | SSR-Encoder: Selective Subject Representation Encoder for Subject-Driven Generation | Yuxuan Zhang (Shanghai Jiao Tong University) · Yiren Song (Shanghai Jiaotong University) · Jiaming Liu (Xiaohongshu) · Rui Wang (Beijing University of Posts and Telecommunications) · Jinpeng Yu (None) · Hao Tang (ETH Zurich) · Huaxia Li (Department of Computer Science and Engineering, The Chinese University of Hong Kong) · Xu Tang (Shanghaitech University) · Yao Hu (Zhejiang University, Tsinghua University) · Han Pan (Shanghai Jiao Tong University) · Zhongliang Jing (Shanghai Jiao Tong University) |
1057 | Few-shot Learner Parameterization by Diffusion Time-steps | Zhongqi Yue (Nanyang Technological University) · Pan Zhou (Sea Group) · Richang Hong (Hefei University of Technology) · Hanwang Zhang (Nanyang Technological University) · Qianru Sun (None) |
1058 | Global and Hierarchical Geometry Consistency Priors for Few-shot NeRFs in Indoor Scenes | Xiaotian Sun (Xiamen University) · Qingshan Xu (Nanyang Technological University) · Xinjie Yang (Xiamen University) · Yu Zang (Xiamen University) · Cheng Wang (Xiamen University) |
1059 | Compressed 3D Gaussian Splatting for Accelerated Novel View Synthesis | Simon Niedermayr (Technical University of Munich) · Josef Stumpfegger (Technische Universität München) · rüdiger westermann (Technische Universität München) |
1060 | The STVchrono Dataset: Towards Continuous Change Recognition in Time | Yanjun Sun () · Yue Qiu (AIST, National Institute of Advanced Industrial Science and Technology) · Mariia Khan (Edith Cowan University) · Fumiya Matsuzawa (AIST, University of Tsukuba) · Kenji Iwata (AIST, National Institute of Advanced Industrial Science and Technology) |
1061 | SPIN: Simultaneous Perception, Interaction and Navigation | Shagun Uppal (Carnegie Mellon University) · Ananye Agarwal (Carnegie Mellon University) · Haoyu Xiong (CMU, Carnegie Mellon University) · Kenneth Shaw (Carnegie Mellon University) · Deepak Pathak (Carnegie Mellon University) |
1062 | SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation | Xie Bin (None) · Jiale Cao (Tianjin University) · Jin Xie (Chongqing University) · Fahad Shahbaz Khan (Inception Institute of Artificial Intelligence) · Yanwei Pang (Tianjin University) |
1063 | Unleashing Channel Potential: Space-Frequency Selection Convolution for SAR Object Detection | Ke Li (Xidian University) · Di Wang (Xidian University) · Zhangyuan Hu (Xidian University) · Wenxuan Zhu (Xidian University) · Shaofeng Li (None) · Quan Wang (Xidian University) |
1064 | Motion Blur Decomposition with Cross-shutter Guidance | Xiang Ji (The University of Tokyo) · Haiyang Jiang (None) · Yinqiang Zheng (None) |
1065 | Real-time Acquisition and Reconstruction of Dynamic Volumes with Neural Structured Illumination | Yixin Zeng (None) · Zoubin Bi (State Key Laboratory of CAD&CG, Zhejiang University) · Yin Mingrui (Zhejiang University) · Xiang Feng (Zhejiang University) · Kun Zhou (Zhejiang University) · Hongzhi Wu (Zhejiang University) |
1066 | MV-Adapter: Exploring Parameter Efficient Learning for Video Text Retrieval | bowen zhang (Bytedance) · Xiaojie Jin (ByteDance Inc./TikTok) · Weibo Gong (ByteDance) · Kai Xu (University of Chinese Academy of Sciences) · Xueqing Deng (ByteDance Research) · Peng Wang (Bytedance US AILab) · Zhao Zhang (Hefei University of Technology) · Xiaohui Shen (ByteDance) · Jiashi Feng (ByteDance) |
1067 | Mind marginal non-crack regions: Clustering-inspired representation learning for crack segmentation | zhuangzhuang chen (shenzhen university) · Zhuonan Lai (Shenzhen University) · Jie Chen (Shenzhen University) · Jianqiang Li (Shenzhen University) |
1068 | SpatialTracker: Tracking Any 2D Pixels in 3D Space | Yuxi Xiao (Wuhan University) · Qianqian Wang (Cornell University) · Shangzhan Zhang () · Nan Xue (None) · Sida Peng (None) · Yujun Shen (The Chinese University of Hong Kong) · Xiaowei Zhou (None) |
1069 | TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models | Zhongwei Zhang (University of Science and Technology of China) · Fuchen Long (JD.com) · Yingwei Pan (None) · Zhaofan Qiu (University of Science and Technology of China) · Ting Yao (JD AI Research) · Yang Cao (University of Science and Technology of China) · Tao Mei (JD Explore Academy) |
1070 | FreePoint: Unsupervised Point Cloud Instance Segmentation | Zhikai Zhang (Wuhan University) · Jian Ding (None) · Li Jiang (Max Planck Institute for Informatics) · Dengxin Dai () · Gui-Song Xia (Wuhan University) |
1071 | Perceptual Assessment and Optimization of HDR Image Rendering | Peibei Cao (City University of Hong Kong) · Rafal Mantiuk (University of Cambridge) · Kede Ma (City University of Hong Kong) |
1072 | Programmable Motion Generation for Open-set Motion Control Tasks | Hanchao Liu (Tsinghua University, Tsinghua University) · Xiaohang Zhan (The Chinese University of Hong Kong) · Shaoli Huang (Tencent AI Lab) · Tai-Jiang Mu (Tsinghua University, Tsinghua University) · Ying Shan (Tencent) |
1073 | Learning Degradation Independent Representations for Camera ISP Pipelines | Yanhui Guo (McMaster University) · Fangzhou Luo (McMaster University) · Xiaolin Wu (McMaster University) |
1074 | Projecting Trackable Thermal Patterns for Dynamic Computer Vision | Mark Sheinin (Carnegie Mellon University) · Aswin C. Sankaranarayanan (Carnegie Mellon University) · Srinivasa G. Narasimhan (Carnegie Mellon University) |
1075 | MMVP: A Multimodal MoCap Dataset with Vision and Pressure Sensors | He Zhang (None) · Shenghao Ren (None) · Haolei Yuan (None) · Jianhui Zhao (Beijing University of Aeronautics and Astronautics) · Fan Li (Beijing University of Aeronautics and Astronautics) · Shuangpeng Sun (Tsinghua University, Tsinghua University) · Zhenghao Liang (Tsinghua University, Tsinghua University) · Tao Yu (Tsinghua University, Tsinghua University) · Qiu Shen (Nanjing University) · Xun Cao (Nanjing University) |
1076 | Multiscale Vision Transformers meet Bipartite Matching for efficient single-stage Action Localization | Ioanna Ntinou (Queen Mary University of London) · Enrique Sanchez (Samsung AI Center Cambridge) · Georgios Tzimiropoulos (Queen Mary University London) |
1077 | Overcoming Generic Knowledge Loss with Selective Parameter Update | Wenxuan Zhang (King Abdullah University of Science and Technology) · Paul Janson (Concordia University/ MILA) · Rahaf Aljundi (Toyota Motor Europe) · Mohamed Elhoseiny (KAUST) |
1078 | EventPS: Real-Time Photometric Stereo Using an Event Camera | Bohan Yu (None) · Jieji Ren (Shanghai Jiao Tong University) · Jin Han () · Feishi Wang (Peking University) · Jinxiu Liang (None) · Boxin Shi (None) |
1079 | Kernel Adaptive Convolution for Scene Text Detection via Distance Map Prediction | Jinzhi Zheng (University of Chinese Academy of Sciences) · Heng Fan (University of North Texas) · Libo Zhang (Institute of Software Chinese Academy of Sciences) |
1080 | Open-Vocabulary 3D Semantic Segmentation with Foundation Models | Li Jiang (Max Planck Institute for Informatics) · Shaoshuai Shi (Saarland Informatics Campus, Max-Planck Institute) · Bernt Schiele (Max Planck Institute for Informatics) |
1081 | Control4D: Efficient 4D Portrait Editing with Text | Ruizhi Shao (Tsinghua University, Tsinghua University) · Jingxiang Sun (None) · Cheng Peng (Tsinghua University, Tsinghua University) · Zerong Zheng (Tsinghua University) · Boyao ZHOU (Tsinghua University) · Hongwen Zhang (Beijing Normal University) · Yebin Liu (Tsinghua University) |
1082 | Learning SO(3)-Invariant Semantic Correspondence via Local Shape Transform | Chunghyun Park (POSTECH) · Seungwook Kim (POSTECH) · Jaesik Park (Seoul National University) · Minsu Cho (POSTECH) |
1083 | TokenCompose: Grounding Diffusion with Token-level Supervision | Zirui Wang (Princeton University) · Zhizhou Sha (Tsinghua University, Tsinghua University) · Zheng Ding (University of California, San Diego) · Yilin Wang (Tsinghua University, Tsinghua University) · Zhuowen Tu (University of California, San Diego) |
1084 | Pick-or-Mix: Dynamic Channel Sampling for ConvNets | Ashish Kumar (Indian Institute of Technology, Kanpur) · Daneul Kim (Seoul National University) · Jaesik Park (Seoul National University) · Laxmidhar Behera (Indian Institute of Technology, Kanpur) |
1085 | Zero-shot Referring Expression Comprehension via Structural Similarity Between Images and Captions | Zeyu Han (Sichuan University) · Fangrui Zhu (Northeastern University) · Qianru Lao (Harvard University) · Huaizu Jiang (Northeastern University) |
1086 | MVBench: A Comprehensive Multi-modal Video Understanding Benchmark | Kunchang Li (SIAT, UCAS) · Yali Wang (SIAT, Chinese Academy of Sciences) · Yinan He (Sensetime Research) · Yizhuo Li (The University of Hong Kong) · Yi Wang (Shanghai AI Laboratory) · Yi Liu (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Chinese Academy of Sciences) · Zun Wang (Australian National University) · Jilan Xu (None) · Guo Chen (Nanjing University) · Ping Luo (The University of Hong Kong) · Limin Wang (Nanjing University) · Yu Qiao (Shanghai Artificial Intelligence Laboratory) |
1087 | Estimating Extreme 3D Image Rotations using Cascaded Attention | Shay Dekel (None) · Yosi Keller (Bar Ilan University) · Martin Čadík (Brno University of Technology) |
1088 | Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers | Zi-Xin Zou (None) · Zhipeng Yu (University of the Chinese Academy of Sciences) · Yuan-Chen Guo (Tsinghua University) · Yangguang Li (Shanghai AI Laboratory) · Yan-Pei Cao (Tencent ARC Lab) · Ding Liang (Tsinghua University, Tsinghua University) · Song-Hai Zhang (Tsinghua University, Tsinghua University) |
1089 | Communication-Efficient Federated Learning with Accelerated Client Gradient | Geeho Kim (Seoul National University) · Jinkyu Kim (Seoul National University) · Bohyung Han (Seoul National University) |
1090 | CAMEL: CAusal Motion Enhancement tailored for Lifting Text-driven Video Editing | Guiwei Zhang (Beijing University of Aeronautics and Astronautics) · Tianyu Zhang (Du Xiaoman Financial) · Guanglin Niu (Beihang University) · Zichang Tan (Baidu) · Yalong Bai (JD AI Research) · Qing Yang (Du Xiaoman Technology(BeiJing)) |
1091 | Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery | Mubashir Noman (MBZUAI) · Muzammal Naseer (MBZUAI) · Hisham Cholakkal (MBZUAI) · Rao Anwer (Mohamed bin Zayed University of Artificial Intelligence) · Salman Khan (Mohamed bin Zayed University of Artificial Intelligence) · Fahad Shahbaz Khan (Inception Institute of Artificial Intelligence) |
1092 | Towards Real-World HDR Video Reconstruction: A Large-Scale Benchmark Dataset and A Two-Stage Alignment Network | Yong Shu () · Liquan Shen (Shanghai University) · Xiangyu Hu (Shanghai University) · Mengyao Li (Shanghai University) · Zihao Zhou (Shanghai University) |
1093 | Designing Scalable Vision Models in the Vision-Language Era | Jieneng Chen (Johns Hopkins University) · Qihang Yu (Johns Hopkins University) · Xiaohui Shen (ByteDance) · Alan L. Yuille (Johns Hopkins University) · Liang-Chieh Chen (None) |
1094 | Towards Progressive Multi-Frequency Representation for Image Warping | Jun Xiao (The Hong Kong Polytechnic University) · Zihang Lyu (Hong Kong Polytechnic University) · Cong Zhang (Hong Kong Polytechnic University) · Yakun Ju (Nanyang Technological University) · Changjian Shui (Vector Institute) · Kin-man Lam (University of Sydney, University of Sydney) |
1095 | SmartMask: Context Aware High-Fidelity Mask Generation for Fine-grained Object Insertion and Layout Control | Jaskirat Singh (Australian National University) · Jianming Zhang (Adobe Systems) · Qing Liu (Adobe Systems) · Cameron Smith (Adobe Systems) · Zhe Lin (Adobe Research) · Liang Zheng (Australian National University) |
1096 | HandDiff: 3D Hand Pose Estimation with Diffusion on Image-Point Cloud | WENCAN CHENG (None) · Hao Tang (ETH Zurich) · Luc Van Gool (ETH Zurich) · Jong Hwan Ko (Sungkyunkwan University (SKKU)) |
1097 | GraCo: Granularity-Controllable Interactive Segmentation | Yian Zhao (Peking University) · Kehan Li (Peking University) · Zesen Cheng (Peking University) · Pengchong Qiao (None) · Xiawu Zheng (Xiamen University) · Rongrong Ji (Xiamen University) · Chang Liu (Tsinghua University, Tsinghua University) · Li Yuan (Peking University) · Jie Chen (Peking University) |
1098 | Mocap Everyone Everywhere: Lightweight Motion Capture With Smartwatches and a Head-Mounted Camera | Jiye Lee (Seoul National University) · Hanbyul Joo (None) |
1099 | DuPL: Dual Student with Trustworthy Progressive Learning for Robust Weakly Supervised Semantic Segmentation | Yuanchen Wu (None) · Xichen Ye (Shanghai University) · Kequan Yang (Shanghai University) · Jide Li () · Xiaoqiang Li (shanghai university) |
1100 | Image Neural Field Diffusion Models | Yinbo Chen (University of California, San Diego) · Oliver Wang (Adobe Research) · Richard Zhang (Adobe Systems) · Eli Shechtman (Adobe) · Xiaolong Wang (UCSD) · Michaël Gharbi (Massachusetts Institute of Technology) |
1101 | Segment Every Out-of-Distribution Object | Wenjie Zhao (University of Texas at Dallas) · Jia Li (None) · Xin Dong (Harvard University) · Yu Xiang (University of Texas, Dallas) · Yunhui Guo (The University of Texas at Dallas) |
1102 | Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution | Shangchen Zhou (Nanyang Technological University) · Peiqing Yang (S-Lab, Nanyang Technological University) · Jianyi Wang (Nanyang Technological University) · Yihang Luo (Nanyang Technological University) · Chen Change Loy (NANYANG TECHNOLOGICAL UNIVERSITY) |
1103 | Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild | Fanghua Yu (Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences) · Jinjin Gu (University of Sydney) · Zheyuan Li (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Chinese Academy of Sciences) · Jinfan Hu (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Chinese Academy of Sciences) · Xiangtao Kong (Hong Kong Polytechnic University) · Xintao Wang (Tencent) · Jingwen He (Shanghai ai lab) · Yu Qiao (Shanghai Artificial Intelligence Laboratory) · Chao Dong (SIAT) |
1104 | A Physics-informed Low-rank Deep Neural Network for Blind and Universal Lens Aberration Correction | Jin Gong (Tsinghua University) · Runzhao Yang (Department of Automation, Tsinghua University) · Weihang Zhang (Tsinghua University) · Jinli Suo (Tsinghua University, Tsinghua University) · Qionghai Dai (Tsinghua University, Tsinghua University) |
1105 | DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data | Qihao Liu (Johns Hopkins University) · Yi Zhang (Sony Corporation of America) · Song Bai (ByteDance) · Adam Kortylewski (University of Freiburg & MPI-INF) · Alan L. Yuille (Johns Hopkins University) |
1106 | An Interactive Navigation Method with Effect-oriented Affordance | XIAOHAN Wang (Xi'an Jiaotong University) · Yuehu LIU (College of Artificial Intelligence, Xi'an Jiaotong University) · Xinhang Song (None) · Yuyi Liu (Institute of Computing Technology, University of the Chinese Academy of Sciences) · Sixian Zhang (ICT, UCAS, China) · Shuqiang Jiang (Institute of Computing Technology, Chinese Academy of Sciences) |
1107 | VOODOO 3D: VOlumetric pOrtrait Disentanglement fOr Online 3D head reenactment | Phong Tran (MBZUAI) · Egor Zakharov (Skolkovo Institute of Science and Technology) · Long Nhat Ho (Mohamed bin Zayed University of Artificial Intelligence) · Anh Tran (None) · Liwen Hu (Pinscreen) · Hao Li (Mohamed bin Zayed University of Artificial Intelligence) |
1108 | NAPGuard: Towards Detecting Naturalistic Adversarial Patches | Wu (None) · Jiakai Wang (Zhongguancun Laboratory) · Jiejie Zhao (Zhongguancun Laboratory) · Yazhe Wang (Zhongguancun Laboratory) · Xianglong Liu (BUAA) |
1109 | Descriptor and Word Soups: Overcoming the Parameter Efficiency Accuracy Tradeoff for Out-of-Distribution Few-shot Learning | Christopher Liao (None) · Theodoros Tsiligkaridis (MIT Lincoln Laboratory, Massachusetts Institute of Technology) · Brian Kulis (Boston University) |
1110 | A Stealthy Wrongdoer: Feature-Oriented Reconstruction Attack against Split Learning | Xiaoyang Xu (None) · Mengda Yang (None) · Wenzhe Yi (None) · Ziang Li (None) · Juan Wang (None) · Hongxin Hu (State University of New York, Buffalo) · Yong ZHUANG (Wuhan University) · Yaxin Liu (Wuhan University) |
1111 | Masked and Shuffled Blind Spot Denoising for Real-World Images | Hamadi Chihaoui (University of Bern) · Paolo Favaro (Institute für Informatik, University of Bern) |
1112 | Flattening the Parent Bias: Hierarchical Semantic Segmentation in the Poincaré Ball | Simon Weber (Technische Universität München) · Barış Zöngür (Technische Universität München) · Nikita Araslanov (TU Munich) · Daniel Cremers (Technical University Munich) |
1113 | Splatter Image: Ultra-Fast Single-View 3D Reconstruction | Stanislaw Szymanowicz (University of Oxford, University of Oxford) · Christian Rupprecht (University of Oxford) · Andrea Vedaldi (University of Oxford) |
1114 | DreamPropeller: Supercharge Text-to-3D Generation with Parallel Sampling | Linqi Zhou (Stanford University) · Andy Shih (Stanford University) · Chenlin Meng (None) · Stefano Ermon (Stanford University) |
1115 | Generative Region-Language Pretraining for Open-Ended Object Detection | Chuang Lin (None) · Yi Jiang (bytedance) · Lizhen Qu (Monash University) · Zehuan Yuan (Nanjing University) · Jianfei Cai (Monash University) |
1116 | ConsistDreamer: 3D-Consistent 2D Diffusion for High-Fidelity Scene Editing | Jun-Kun Chen (None) · Samuel Rota Bulò (Meta) · Norman Müller (Meta) · Lorenzo Porzi (Facebook) · Peter Kontschieder (Meta) · Yu-Xiong Wang (None) |
1117 | CrowdDiff: Multi-hypothesis Crowd Density Estimation using Diffusion Models | Don Yasiru Ranasinghe (Whiting School of Engineering) · Nithin Gopalakrishnan Nair (Johns Hopkins University) · Wele Gedara Chaminda Bandara (Johns Hopkins University) · Vishal M. Patel (Johns Hopkins University) |
1118 | Boosting Diffusion Models with Moving Average Sampling in Frequency Domain | Yurui Qian (University of Science and Technology of China) · Qi Cai (JD) · Yingwei Pan (None) · Yehao Li (JD AI Research) · Ting Yao (JD AI Research) · Qibin Sun (University of Science and Technology of China) · Tao Mei (JD Explore Academy) |
1119 | Psychometry: An Omnifit Model for Image Reconstruction from Human Brain Activity | Ruijie Quan (Zhejiang University) · Wenguan Wang (Zhejiang University) · Zhibo Tian (Lanzhou University) · Fan Ma (None) · Yi Yang (Zhejiang University) |
1120 | Rethinking Multi-domain Generalization with A General Learning Objective | Zhaorui Tan (None) · Xi Yang (Xi'an Jiaotong-Liverpool University) · Kaizhu Huang (Duke Kunshan University) |
1121 | A Theory of Joint Light and Heat Transport for Lambertian Scenes | Mani Ramanagopal (Carnegie Mellon University) · Sriram Narayanan (Carnegie Mellon University) · Aswin C. Sankaranarayanan (Carnegie Mellon University) · Srinivasa G. Narasimhan (Carnegie Mellon University) |
1122 | Paint-it: Text-to-Texture Synthesis via Deep Convolutional Texture Map Optimization and Physically-Based Rendering | Kim Youwang (Pohang University of Science and Technology) · Tae-Hyun Oh (None) · Gerard Pons-Moll (University of Tübingen) |
1123 | Spectral and Polarization Vision: Spectro-polarimetric Real-world Dataset | Yujin Jeon (Pohang University of Science and Technology) · Eunsue Choi (Pohang University of Science and Technology) · Youngchan Kim (Pohang University of Science and Technology) · Yunseong Moon (Pohang University of Science and Technology) · Khalid Omer (Meta Reality Labs) · Felix Heide (Department of Computer Science, Princeton University) · Seung-Hwan Baek (POSTECH) |
1124 | Towards Text-guided 3D Scene Composition | Qihang Zhang (The Chinese University of Hong Kong) · Chaoyang Wang (Snap Inc) · Aliaksandr Siarohin (Snap Inc.) · Peiye Zhuang (Snap Inc.) · Yinghao Xu (Chinese University of Hong Kong) · Ceyuan Yang (The Chinese University of Hong Kong) · Dahua Lin (The Chinese University of Hong Kong) · Bolei Zhou (University of California, Los Angeles) · Sergey Tulyakov (Snap Inc.) · Hsin-Ying Lee (Snap Inc.) |
1125 | Efficient Stitchable Task Adaptation | Haoyu He (Monash University) · Zizheng Pan (None) · Jing Liu () · Jianfei Cai (Monash University) · Bohan Zhuang (Monash University) |
1126 | MeaCap: Memory-Augmented Zero-shot Image Captioning | Zequn Zeng (None) · Yan Xie (None) · Hao Zhang (Xidian University, Xi'an, China) · Chiyu Chen (Xi'an University of Electronic Science and Technology) · Zhengjue Wang (Xidian University) · Bo Chen (Xidian University) |
1127 | Retrieval-Augmented Layout Transformer for Content-Aware Layout Generation | Daichi Horita (None) · Naoto Inoue (None) · Kotaro Kikuchi (None) · Kota Yamaguchi (CyberAgent) · Kiyoharu Aizawa (The University of Tokyo) |
1128 | A Novel Transformer based Network for Large Scale Multimodal and Multitask Learning | Siddharth Srivastava (TensorTour Inc) · Gaurav Sharma (TensorTour Inc.) |
1129 | MuGE: Multiple Granularity Edge Detection | Caixia Zhou (None) · Yaping Huang (Beijing Jiaotong University) · Mengyang Pu (North China Electric Power University) · Qingji Guan (Beijing Jiaotong University) · Ruoxi Deng (Wenzhou University) · Haibin Ling (State University of New York, Stony Brook) |
1130 | Efficient Multitask Dense Predictor via Binarization | Yuzhang Shang (Illinois Institute of Technology) · Dan Xu (Department of Computer Science and Engineering, The Hong Kong University of Science and Technology) · Gaowen Liu (None) · Ramana Kompella (Cisco) · Yan Yan (Illinois Institute of Technology) |
1131 | Novel View Synthesis with View-Dependent Effects from a Single Image | Juan Luis Gonzalez Bello (KAIST) · Munchurl Kim (Korea Advanced Institute of Science and Technology) |
1132 | Orchestrate Latent Expertise: Advancing Online Continual Learning with Multi-Level Supervision and Reverse Self-Distillation | Hongwei Yan (Tsinghua University) · Liyuan Wang (Tsinghua University) · Kaisheng Ma (Institute for Interdisciplinary Information Sciences (IIIS), Tsinghua University) · Yi Zhong (Tsinghua University, Tsinghua University) |
1133 | Small Scale Data-Free Knowledge Distillation | He Liu (None) · Yikai Wang (Tsinghua University) · Huaping Liu (Tsinghua University, Tsinghua University) · Fuchun Sun (Tsinghua University) · Anbang Yao (Intel) |
1134 | Open-World Human-Object Interaction Detection via Multi-modal Prompts | Jie Yang (The Chinese University of Hong Kong, Shenzhen) · Bingliang Li (The Chinese University of Hong Kong (Shenzhen)) · Ailing Zeng (IDEA) · Lei Zhang (International Digital Economy Academy (IDEA)) · Ruimao Zhang (The Chinese University of Hong Kong (Shenzhen)) |
1135 | FSRT: Facial Scene Representation Transformer for Face Reenactment from Factorized Appearance, Head-pose, and Facial Expression Features | Andre Rochow (Rheinische Friedrich-Wilhelms Universität Bonn) · Max Schwarz (University of Bonn) · Sven Behnke (University of Bonn) |
1136 | VidToMe: Video Token Merging for Zero-Shot Video Editing | Xirui Li (Shanghai Jiaotong University) · Chao Ma (Shanghai Jiao Tong University) · Xiaokang Yang (Shanghai Jiao Tong University, China) · Ming-Hsuan Yang (University of California at Merced) |
1137 | Text-image Alignment for Diffusion-based Perception | Neehar Kondapaneni (California Institute of Technology) · Markus Marks (None) · Manuel Knott (ETHZ - ETH Zurich) · Rogério Guimarães (California Institute of Technology) · Pietro Perona (California Institute of Technology) |
1138 | PanoOcc: Unified Occupancy Representation for Camera-based 3D Panoptic Segmentation | Yuqi Wang (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Yuntao Chen (CAIR, HKISI, CAS) · Xingyu Liao (University of Science and Technology of China) · Lue Fan (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Zhaoxiang Zhang (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) |
1139 | AdaBM: On-the-Fly Adaptive Bit Mapping for Image Super-Resolution | Cheeun Hong (None) · Kyoung Mu Lee (Seoul National University) |
1140 | SHViT: Single-Head Vision Transformer with Memory Efficient Macro Design | Seokju Yun (University of Seoul) · Youngmin Ro (University of Seoul) |
1141 | Domain Separation Graph Neural Networks for Saliency Object Ranking | Zijian Wu (Nanjing University of Science and Technology) · Jun Lu (Nanjing University of Science and Technology) · Jing Han (Nanjing University Of Science And Technology) · Lianfa Bai (Nanjing University of Science and Technology) · Yi Zhang (Nanjing University of Science and Technology) · Zhuang Zhao (Nanjing University of Science and Technology) · Siyang Song (University of Leicester) |
1142 | Solving the Catastrophic Forgetting Problem in Generalized Category Discovery | Xinzi Cao (Sun Yat-Sen University) · Xiawu Zheng (Xiamen University) · Guanhong Wang (Zhejiang University) · Weijiang Yu (SUN YAT-SEN UNIVERSITY) · Yunhang Shen (Tencent) · Ke Li (Tencent) · Yutong Lu (SUN YAT-SEN UNIVERSITY) · Yonghong Tian (Peking University) |
1143 | Improving Image Restoration through Removing Degradations in Textual Representations | Jingbo Lin (Harbin Institute of Technology) · Zhilu Zhang (Harbin Institute of Technology) · Yuxiang Wei (The Hong Kong Polytechnic University, Hong Kong Polytechnic University) · Dongwei Ren (Harbin Institute of Technology) · Dongsheng Jiang (Huawei Technologies Ltd.) · Qi Tian (Huawei Technologies Ltd.) · Wangmeng Zuo (Harbin Institute of Technology) |
1144 | SketchINR: A First Look into Sketches as Implicit Neural Representations | Hmrishav Bandyopadhyay (University of Surrey) · Ayan Kumar Bhunia (University of Surrey, United Kingdom) · Pinaki Nath Chowdhury (University of Surrey) · Aneeshan Sain (University of Surrey) · Tao Xiang (University of Surrey) · Timothy Hospedales (None) · Yi-Zhe Song (None) |
1145 | Activity-Biometrics: Person Identification from Daily Activities | Shehreen Azad (University of Central Florida) · Yogesh S. Rawat (University of Central Florida) |
1146 | Temporally Consistent Unbalanced Optimal Transport for Unsupervised Action Segmentation | Ming Xu (Australian National University) · Stephen Gould (Australian National University) |
1147 | Masked Spatial Propagation Network for Sparsity-Adaptive Depth Refinement | Jinyoung Jun (None) · Jae-Han Lee (Gauss Labs) · Chang-Su Kim (Korea University) |
1148 | Holoported Characters: Real-time Free-viewpoint Rendering of Humans from Sparse RGB Cameras | Ashwath Shetty (Saarland Informatics Campus, Max-Planck Institute) · Marc Habermann (Saarland Informatics Campus, Max-Planck Institute) · Guoxing Sun (Max Planck Institute for Informatics) · Diogo Luvizon (Saarland Informatics Campus, Max-Planck Institute) · Vladislav Golyanik (MPI for Informatics) · Christian Theobalt (MPI Informatik) |
1149 | Gaussian-Flow: 4D Reconstruction with Dynamic 3D Gaussian Particle | Youtian Lin (Harbin Institute of Technology) · Zuozhuo Dai (Alibaba Group) · Siyu Zhu (Fudan University) · Yao Yao (Nanjing University) |
1150 | Instance-level Expert Knowledge and Aggregate Discriminative Attention for Radiology Report Generation | Shenshen Bu (Sun Yat-sen University) · Taiji Li (SUN YAT-SEN UNIVERSITY) · Zhiming Dai (SUN YAT-SEN UNIVERSITY) · Yuedong Yang (SUN YAT-SEN UNIVERSITY) |
1151 | HyperSDFusion: Bridging Hierarchical Structures in Language and Geometry for Enhanced 3D Text2Shape Generation | Zhiying Leng (Beihang University) · Tolga Birdal () · Xiaohui Liang (Zhongguancun Laboratory) · Federico Tombari (Google, TUM) |
1152 | MatchU: Matching Unseen Objects for 6D Pose Estimation from RGB-D Images | Junwen Huang (Technische Universität München) · Hao Yu (Technical University Munich) · Kuan-Ting Yu (XYZ Robotics) · Nassir Navab (TU Munich) · Slobodan Ilic (Technical University Munich) · Benjamin Busam (None) |
1153 | Resource-Efficient Transformer Pruning for Finetuning of Large Models | Fatih Ilhan (Georgia Institute of Technology) · Gong Su (IBM, International Business Machines) · Selim Tekin (College of Computing, Georgia Institute of Technology) · Tiansheng Huang (Georgia Institute of Technology) · Sihao Hu (Georgia Institute of Technology) · Ling Liu (Georgia Institute of Technology) |
1154 | Towards Variable and Coordinated Holistic Co-Speech Motion Generation | Yifei Liu (South China University of Technology) · Qiong Cao (JD Explore Academy) · Yandong Wen (Max Planck Institute for Intelligent Systems) · Huaiguang Jiang (South China University of Technology) · Changxing Ding (South China University of Technology) |
1155 | Fast ODE-based Sampling for Diffusion Models in Around 5 Steps | Zhenyu Zhou (None) · Defang Chen (Zhejiang University) · Can Wang (Zhejiang University) · Chun Chen (Zhejiang University) |
1156 | Referring Image Editing: Object-level Image Editing via Referring Expressions | Chang Liu (None) · Xiangtai Li (Nanyang Technological University) · Henghui Ding (None) |
1157 | InNeRF360: Text-Guided 3D-Consistent Object Inpainting on Unbounded Neural Radiance Fields | Dongqing Wang (EPFL) · Tong Zhang (EPFL) · Alaa Abboud (EPFL - EPF Lausanne) · Sabine Süsstrunk (None) |
1158 | TeMO: Towards Text-Driven 3D Stylization for Multi-Object Meshes | Xuying Zhang (Nankai University) · Bo-Wen Yin (Nankai University) · yuming chen (None) · Zheng Lin (Nankai University) · Yunheng Li (Nankai University) · Qibin Hou (Nankai University) · Ming-Ming Cheng (Nankai University, Tsinghua University) |
1159 | Unsupervised Template-assisted Point Cloud Shape Correspondence Network | Jiacheng Deng (University of Science and Technology of China) · Jiahao Lu (University of Science and Technology of China) · Tianzhu Zhang (University of Science and Technology of China, Tsinghua University) |
1160 | X-3D: Explicit 3D Structure Modeling for Point Cloud Recognition | Shuofeng Sun (Beijing University of Posts and Telecommunications) · Yongming Rao (Tsinghua University) · Jiwen Lu (Tsinghua University) · Haibin Yan (Beijing University of Posts and Telecommunications) |
1161 | HOIAnimator: Text-Prompt Human-Object Animations Generation with Perceptive Diffusion Models | Wenfeng Song (Beijing Information Science and Technology University) · Xinyu Zhang () · Shuai Li (Beijing University of Aeronautics and Astronautics) · Yang Gao (Beijing University of Aeronautics and Astronautics) · Aimin Hao (None) · Xia HOU (Beijing Information Science & Technology University) · Chenglizhao Chen (China University of Petroleum) · Ning Li (Beijing Information Science and Technology University) · Hong Qin (Stony Brook University (SUNY at Stony Brook)) |
1162 | Going Beyond Multi-Task Dense Prediction with Synergy Embedding Models | Huimin Huang (Zhejiang University) · Yawen Huang (None) · Lanfen Lin (Zhejiang University) · Ruofeng Tong (None) · Yen-Wei Chen (Ritsumeikan University) · Hao Zheng (Tencent) · Yuexiang Li (Tencent Jarvis Lab) · Yefeng Zheng (None) |
1163 | WWW: A Unified Framework for Explaining What, Where and Why of Neural Networks by Interpretation of Neuron Concept | Yong Hyun Ahn (Kyung Hee University) · Hyeon Kim (Kyunghee University) · Seong Tae Kim (Kyung Hee University) |
1164 | ToonerGAN: Reinforcing GANs for Obfuscating Automated Facial Indexing | Kartik Thakral (Indian Institute of Technology Jodhpur) · Shashikant Prasad (Indian Institute of Technology, Jodhpur, Dhirubhai Ambani Institute Of Information and Communication Technology) · Stuti Aswani (Indian Institute of Technology, Jodhpur) · Mayank Vatsa (IIT Jodhpur) · Richa Singh (IIT Jodhpur) |
1165 | UniBind: LLM-Augmented Unified and Balanced Representation Space to Bind Them All | Yuanhuiyi Lyu (Hong Kong University of Science and Technology) · Xu Zheng (Northeastern University) · Jiazhou Zhou (Hong Kong University of Science and Technology) · Lin Wang (Hong Kong University of Science and Technology) |
1166 | Hybrid Functional Maps for Crease-Aware Non-Isometric Shape
Matching | Lennart Bastian (None) · Yizheng Xie (Technische Universität München) · Nassir Navab (TU Munich) · Zorah Lähner (Rheinische Friedrich-Wilhelms Universität Bonn) | | 1167 | Learning Intra-view and Cross-view Geometric Knowledge for Stereo Matching | Gong Rui (Nanyang Technological University) · Weide Liu (Harvard University) · ZAIWANG GU (None) · Xulei Yang (Institute for Infocomm Research (I2R), ASTAR) · Jun Cheng (Institute For Infocomm Research, ASTAR) | | 1168 | LowRankOcc: Tensor Decomposition and Low-Rank Recovery for Vision-based 3D Semantic Occupancy Prediction | Linqing Zhao (Tianjin University, Tsinghua University) · Xiuwei Xu (Tsinghua University, Tsinghua University) · Ziwei Wang (Tsinghua University, Tsinghua University) · Yunpeng Zhang (PhiGent Robotics) · Borui Zhang (Tsinghua University, Tsinghua University) · Wenzhao Zheng (Tsinghua University, Tsinghua University) · Dalong Du (PhiGent Robotics) · Jie Zhou (None) · Jiwen Lu (Tsinghua University) | | 1169 | Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation | Wenxuan Wang (National Lab of Pattern Recognition, Institute of Automation,Chinese Academy of Sciences) · Tongtian Yue (, Institute of automation, Chinese academy of science) · Yisi Zhang (University of Science and Technology Beijing) · Longteng Guo (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Xingjian He (, Institute of automation, Chinese academy of science) · Xinlong Wang (Beijing Academy of Artificial Intelligence) · Jing Liu (Institute of automation, Chinese academy of science) | | 1170 | Streaming dense video captioning | Xingyi Zhou (Google) · Anurag Arnab (Google) · Shyamal Buch (Stanford University) · Shen Yan (Google Research) · Austin Myers (Google) · Xuehan Xiong (Google) · Arsha Nagrani (Google ) · Cordelia Schmid (Inria / Google) | | 1171 | Text2HOI: Text-guided 3D Motion Generation for Hand-Object Interaction | Junuk Cha (None) · Jihyeon 
Kim (Ulsan National Institute of Science and Technology) · Jae Shin Yoon (Adobe Systems) · Seungryul Baek (UNIST) | | 1172 | BT-Adapter: Video Conversation is Feasible Without Video Instruction Tuning | Ruyang Liu (Peking University) · Chen Li (Tencent ARC Lab) · Yixiao Ge (Tencent) · Thomas H. Li (AIIT, Peking University) · Ying Shan (Tencent) · Ge Li (Peking University Shenzhen Graduate School) | | 1173 | Video Frame Interpolation via Direct Synthesis with the Event-based Reference | Yuhan Liu () · Yongjian Deng (Beijing University of Technology) · Hao Chen (Southeast University) · Zhen Yang (Beijing University of Technology) | | 1174 | Instance-Adaptive and Geometric-Aware Keypoint Learning for Category-Level 6D Object Pose Estimation | Xiao Lin (University of Science and Technology of China) · Wenfei Yang (University of Science and Technology of China) · Yuan Gao (University of Science and Technology of China) · Tianzhu Zhang (University of Science and Technology of China, Tsinghua University) | | 1175 | CorrMatch: Label Propagation via Correlation Matching for Semi-Supervised Semantic Segmentation | Bo-Yuan Sun (Nankai University) · Yuqi Yang (Nankai University) · Le Zhang (University of Electronic Science and Technology of China) · Ming-Ming Cheng (Nankai University, Tsinghua University) · Qibin Hou (Nankai University) | | 1176 | Dual Prior Unfolding for Snapshot Compressive Imaging | Jiancheng Zhang (Northwest Polytechnical University Xi'an) · Haijin Zeng (IMEC & Universiteit Gent) · Jiezhang Cao (ETH Zürich) · Yongyong Chen (Harbin Institute of Technology (Shenzhen)) · Dengxiu Yu (Northwest Polytechnical University) · Yinping Zhao (Northwestern Polytechnical University) | | 1177 | MCNet: Rethinking the Core Ingredients for Accurate and Efficient Homography Estimation | Haokai Zhu (Zhejiang University) · Si-Yuan Cao (Zhejiang University) · Jianxin Hu (Zhejiang University) · Sitong Zuo (Beijing University of Posts and Telecommunications) · Beinan Yu (Zhejiang 
University) · Jiacheng Ying (Zhejiang University) · Junwei Li (Zhejiang University) · Hui-Liang Shen (None) | | 1178 | Dispel Darkness for Better Fusion: A Controllable Visual Enhancer based on Cross-modal Conditional Adversarial Learning | Hao Zhang (Wuhan University) · Linfeng Tang (Wuhan University) · Xinyu Xiang (Wuhan University) · Xuhui Zuo (Wuhan University) · Jiayi Ma (Wuhan University) | | 1179 | MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos | Jielin Qiu (Carnegie Mellon University) · Jiacheng Zhu (Massachusetts Institute of Technology) · William Han (Carnegie Mellon University) · Aditesh Kumar (Carnegie Mellon University) · Karthik Mittal (School of Computer Science, Carnegie Mellon University) · Claire Jin (School of Computer Science, Carnegie Mellon University) · Zhengyuan Yang (Microsoft) · Linjie Li (Microsoft) · Jianfeng Wang (Microsoft) · DING ZHAO (Carnegie Mellon University) · Bo Li (UIUC) · Lijuan Wang (Microsoft) | | 1180 | Open Set Domain Adaptation for Semantic Segmentation | Seun-An Choe (Kyung Hee University) · Ah-Hyung Shin (Kyung Hee University) · Keon Hee Park (KyungHee University) · Jinwoo Choi (Kyung Hee University) · Gyeong-Moon Park (Kyung Hee University) | | 1181 | Video-P2P: Video Editing with Cross-attention Control | Shaoteng Liu (The Chinese University of Hong Kong) · Yuechen Zhang (None) · Wenbo Li (Huawei Technologies Ltd.) 
· Zhe Lin (Adobe Research) · Jiaya Jia (The Chinese University of Hong Kong) | | 1182 | LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge | Gongwei Chen (Harbin Institute of Technology) · Leyang Shen (Harbin Institute of Technology) · Rui Shao (Harbin Institute of Technology) · Xiang Deng (Harbin Institute of Technology (Shenzhen)) · Liqiang Nie (Harbin Institute of Technology (Shenzhen)) | | 1183 | From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations | Evonne Ng (University of California, Berkeley) · Javier Romero (None) · Timur Bagautdinov (Reality Labs Research) · Shaojie Bai (Meta) · Trevor Darrell (Electrical Engineering & Computer Science Department) · Angjoo Kanazawa (UC Berkeley) · Alexander Richard (Reality Labs Research, Meta) | | 1184 | DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing | Yujun Shi (national university of singapore, National University of Singapore) · Chuhui Xue (ByteDance Inc.) · Jun Hao Liew (ByteDance) · Jiachun Pan (National University of Singapore) · Hanshu Yan (ByteDance) · Wenqing Zhang (Huazhong University of Science and Technology) · Vincent Y. F. 
Tan (National University of Singapore) · Song Bai (ByteDance) | | 1185 | SIRA: Scalable Inter-frame Relation and Association for Radar Perception | Ryoma Yataka (None) · Pu (Perry) Wang (None) · Petros Boufounos (Mitsubishi Electric Research Laboratories) · Ryuhei Takahashi (Mitsubishi Electric Corporation) | | 1186 | The More You See in 2D, the More You Perceive in 3D | Xinyang Han (None) · Zelin Gao () · Angjoo Kanazawa (UC Berkeley) · Shubham Goel (Avataar) · Yossi Gandelsman (University of California, Berkeley) | | 1187 | Pixel Aligned Language Models | Jiarui Xu (University of California, San Diego) · Xingyi Zhou (Google) · Shen Yan (Google Research) · Xiuye Gu (None) · Anurag Arnab (Google) · Chen Sun (Brown University) · Xiaolong Wang (UCSD) · Cordelia Schmid (Inria / Google) | | 1188 | Transcending Forgery Specificity with Latent Space Augmentation for Generalizable Deepfake Detection | Zhiyuan Yan (Tencent YouTu Lab) · Yuhao Luo (The Chinese University of Hong Kong, Shenzhen) · Siwei Lyu (State University of New York, Buffalo) · Qingshan Liu (Nanjing University of Posts and Telecommunications) · Baoyuan Wu (The Chinese University of Hong Kong, Shenzhen) | | 1189 | Rethinking the Evaluation Protocol of Domain Generalization | Han Yu (Tsinghua University) · Xingxuan Zhang (Tsinghua University) · Renzhe Xu (Tsinghua University) · Jiashuo Liu (Tsinghua University, Tsinghua University) · Yue He (Tsinghua University, Tsinghua University) · Peng Cui (Tsinghua University, Tsinghua University) | | 1190 | PFStorer: Personalized Face Restoration and Super-Resolution | Tuomas Varanka (None) · Tapani Toivonen (Huawei Technologies Ltd.) · Soumya Tripathy (Huawei Technologies Ltd. Finland) · Guoying Zhao (None) · Erman Acar (Huawei Technologies Ltd.) 
| | 1191 | Make Adapters Great Again | Jan-Martin Steitz (None) · Stefan Roth (None) | | 1192 | Eclipse: Disambiguating Illumination and Materials using Unintended Shadows | Dor Verbin (None) · Ben Mildenhall (Google) · Peter Hedman (Google) · Jonathan T. Barron (Google) · Todd Zickler (Harvard University) · Pratul P. Srinivasan (Google Research) | | 1193 | ASAM: Boosting Segment Anything Model with Adversarial Tuning | Bo Li (Tencent Youtu Lab) · Haoke Xiao (Xiamen University) · Lv Tang (University of the Chinese Academy of Sciences) | | 1194 | ConvoFusion: Multi-Modal Conversational Diffusion for Co-Speech Gesture Synthesis | Muhammad Hamza Mughal (Max-Planck Institute for Informatics) · Rishabh Dabral (Saarland Informatics Campus, Max-Planck Institute) · Ikhsanul Habibie (Saarland Informatics Campus, Max-Planck Institute) · Lucia Donatelli (Vrije Universiteit Amsterdam) · Marc Habermann (Saarland Informatics Campus, Max-Planck Institute) · Christian Theobalt (MPI Informatik) | | 1195 | A Dynamic Kernel Prior Model for Unsupervised Blind Image Super-Resolution | Zhixiong Yang (National University of Defense Technology) · Jingyuan Xia (National University of Defense Technology) · Shengxi Li (Beihang University) · Xinghua Huang (National University of Defense Technology) · Shuanghui Zhang (National University of Defense Technology) · Zhen Liu (National University of Defense Technology) · Yaowen Fu (National University of Defense Technology) · Yongxiang Liu (National University of Defense Technology) | | 1196 | Continuous Optical Zooming: A Benchmark for Arbitrary-Scale Image Super-Resolution in Real World | Huiyuan Fu (Beijing University of Posts and Telecommunications) · Fei Peng (Beijing University of Posts and Telecommunications) · Xianwei Li (Beijing University of Posts and Telecommunications) · Yejun Li (Beijing University of Posts and Telecommunications) · Xin Wang (State University of New York at Stony Brook) · Huadong Ma (Beijing University of Post and 
Telecommunication, Tsinghua University) | | 1197 | SNED: Superposition Network Architecture Search for Efficient Video Diffusion Model | Zhengang Li (Northeastern University) · Yan Kang (None) · Yuchen Liu (None) · Difan Liu (None) · Tobias Hinz (Adobe Systems) · Feng Liu (Adobe Systems) · Yanzhi Wang (Northeastern University) | | 1198 | Compositional Video Understanding with Spatiotemporal Structure-based Transformers | Hoyeoung Yun (Hanyang University) · Jinwoo Ahn (Hanyang University) · Minseo Kim (Hanyang University) · Eun-Sol Kim (Hanyang University) | | 1199 | FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects | Bowen Wen (NVIDIA) · Wei Yang (NVIDIA) · Jan Kautz (NVIDIA) · Stan Birchfield (NVIDIA) | | 1200 | Adversarially Robust Few-shot Learning via Parameter Co-distillation of Similarity and Class Concept Learners | Junhao Dong (Nanyang Technological University) · Piotr Koniusz (Australian National University) · Junxi Chen (SUN YAT-SEN UNIVERSITY) · Xiaohua Xie (SUN YAT-SEN UNIVERSITY) · Yew-Soon Ong (Nanyang Technological University) | | 1201 | Boosting Order-Preserving and Transferability for Neural Architecture Search: a Joint Architecture Refined Search and Fine-tuning Approach | Beichen Zhang (Shanghai Jiaotong University) · Xiaoxing Wang (Shanghai Jiao Tong University) · Xiaohan Qin (None) · Junchi Yan (Shanghai Jiao Tong University) | | 1202 | Binarized Low-light Raw Video Enhancement | Gengchen Zhang (Beijing Institute of Technology) · Yulun Zhang (ETH Zürich) · Xin Yuan (Westlake University) · Ying Fu (None) | | 1203 | Texture-Preserving Diffusion Models for High-Fidelity Virtual Try-On | Xu Yang (None) · Changxing Ding (South China University of Technology) · Zhibin Hong (HeyGen) · Junhao Huang (Heygen) · Jin Tao (South China University of Technology) · Xiangmin Xu (South China University of Technology) | | 1204 | ScanFormer: Referring Expression Comprehension by Iteratively Scanning | Wei Su (Zhejiang University) · Peihan 
Miao (Zhejiang University) · Huanzhang Dou (Zhejiang University) · Xi Li (Zhejiang University) | | 1205 | Make-It-Vivid: Dressing Your Animatable Biped Cartoon Characters from Text | Junshu Tang (None) · Yanhong Zeng (None) · Ke Fan (Shanghai Jiaotong University) · Xuheng Wang (Tsinghua University, Tsinghua University) · Bo Dai (Shanghai AI Laboratory) · Kai Chen (Shanghai AI Laboratory) · Lizhuang Ma (Dept. of Computer Sci. & Eng., Shanghai Jiao Tong University) | | 1206 | Exploiting Diffusion Prior for Generalizable Pixel-Level Semantic Prediction | Hsin-Ying Lee (University of California, Merced) · Hung-Yu Tseng (Meta) · Hsin-Ying Lee (Snap Inc.) · Ming-Hsuan Yang (University of California at Merced) | | 1207 | GSVA: Generalized Segmentation via Multimodal Large Language Models | Zhuofan Xia (Tsinghua University) · Dongchen Han (Tsinghua University, Tsinghua University) · Yizeng Han (Tsinghua University, Tsinghua University) · Xuran Pan (Tsinghua University, Tsinghua University) · Shiji Song (Tsinghua University, Tsinghua University) · Gao Huang (Tsinghua University, Tsinghua University) | | 1208 | ElasticDiffusion: Training-free Arbitrary Size Image Generation | Moayed Haji Ali (Rice University) · Guha Balakrishnan (Rice University) · Vicente Ordonez (Rice University) | | 1209 | General Point Model Pretraining with Autoencoding and Autoregressive | Zhe Li (Huazhong University of Science and Technology) · Zhangyang Gao (Westlake University, China) · Cheng Tan (None) · Bocheng Ren (None) · Laurence Yang (Hainan University) · Stan Z. 
Li (Westlake University) | | 1210 | Learning Instance-Aware Correspondences for Robust Multi-Instance Point Cloud Registration in Cluttered Scenes | Zhiyuan Yu (Na) · Zheng Qin (National University of Defense Technology) · lintao zheng (National University of Defense Technology) · Kai Xu (National University of Defense Technology) | | 1211 | Fine-grained Prototypical Voting with Heterogeneous Mixup for Semi-supervised 2D-3D Cross-modal Retrieval | Fan Zhang (None) · Xian-Sheng Hua (Terminus Group) · Chong Chen (Terminus Group) · Xiao Luo (University of California, Los Angeles) | | 1212 | Uncertainty Visualization via Low-Dimensional Posterior Projections | Omer Yair (Technion, Technion) · Tomer Michaeli (Technion) · Elias Nehme (Electrical Engineering Department, Technion – Israel Institute of Technology, Technion - Israel Institute of Technology) | | 1213 | Visual Delta Generator for Semi-supervised Composed Image Retrieval | Young Kyun Jang (Meta AI) · Donghyun Kim (MIT-IBM Watson AI Lab) · Zihang Meng (Meta) · Dat Huynh (Meta) · Ser-Nam Lim (Meta AI) | | 1214 | Coherent Temporal Synthesis for Incremental Action Segmentation | Guodong Ding (National University of Singapore) · Hans Golong (National University of Singapore) · Angela Yao (National University of Singapore) | | 1215 | Show, Search, and Tell: Exploring Guided Visual Search as a Core Mechanism in Multimodal LLMs | Penghao Wu (University of California, San Diego) · Saining Xie (Facebook) | | 1216 | Real-Time Neural BRDF with Spherically Distributed Primitives | Yishun Dou (Huawei) · Zhong Zheng (huawei.com) · Qiaoqiao Jin (Shanghai Jiao Tong University) · Bingbing Ni (Shanghai Jiao Tong University) · Yugang Chen (Hisilicon) · Junxiang Ke (Huawei Technologies Ltd.) 
| | 1217 | Omni-Q: Omni-Directional Scene Understanding for Unsupervised Visual Grounding | Sai Wang (Wuhan University) · Yutian Lin (Wuhan University) · Yu Wu (None) | | 1218 | PatchFD: Reliable Model Patching for Unified Failure Detection | Fei Zhu (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Zhen Cheng (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Xu-Yao Zhang (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Cheng-Lin Liu (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Zhaoxiang Zhang (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) | | 1219 | Correlation-Decoupled Knowledge Distillation for Multimodal Sentiment Analysis with Incomplete Modalities | Mingcheng Li (Fudan University) · Dingkang Yang (Fudan University) · Xiao Zhao (None) · Shuaibing Wang (Fudan University) · Yan Wang (Fudan University) · Kun Yang (Fudan University) · Mingyang Sun (Fudan University) · Dongliang Kou (Academy for Engineering and Technology, Fudan University, Shanghai, China.) 
· Qian (Fudan University) · Lihua Zhang (Fudan University) | | 1220 | Depth-aware Test-Time Training for Zero-shot Video Object Segmentation | Weihuang Liu (University of Macau) · Xi Shen (Tencent AI Lab) · Haolun Li (University of Macau) · Xiuli Bi (Chongqing University of Posts and Telecommunications) · Bo Liu (Chongqing University of Posts and Telecommunications) · Chi-Man Pun (University of Macau) · Xiaodong Cun (Tencent AI Lab) | | 1221 | RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models | Ozgur Kara (Georgia Institute of Technology) · Bariscan Kurtkaya (Koc University) · Hidir Yesiltepe (Virginia Polytechnic Institute and State University) · James Rehg (None) · Pinar Yanardag (Virginia Polytechnic Institute and State University) | | 1222 | Predicated Diffusion: Predicate Logic-Based Attention Guidance for Text-to-Image Diffusion Models | Kota Sueyoshi (Osaka University) · Takashi Matsubara (Osaka University) | | 1223 | MoReVQA: Exploring Modular Reasoning Models for Video Question Answering | Juhong Min (POSTECH) · Shyamal Buch (Stanford University) · Arsha Nagrani (Google ) · Minsu Cho (POSTECH) · Cordelia Schmid (Inria / Google) | | 1224 | Geometry Transfer for Stylizing Radiance Fields | Hyunyoung Jung (Seoul National University) · Seonghyeon Nam (Facebook) · Nikolaos Sarafianos (Meta Reality Labs Research) · Sungjoo Yoo (None) · Alexander Sorkine-Hornung (Facebook) · Rakesh Ranjan () | | 1225 | UniDepth: Universal Monocular Metric Depth Estimation | Luigi Piccinelli (ETH Zurich) · Yung-Hsu Yang (None) · Christos Sakaridis (ETH Zurich) · Mattia Segu (ETH Zurich - Swiss Federal Institute of Technology) · Siyuan Li (None) · Luc Van Gool (ETH Zurich) · Fisher Yu (ETH Zurich) | | 1226 | Diffusion Model Alignment Using Direct Preference Optimization | Bram Wallace (SalesForce.com) · Meihua Dang (Stanford University) · Rafael Rafailov (Stanford University) · Linqi Zhou (Stanford University) · Aaron Lou (Stanford 
University) · Senthil Purushwalkam (None) · Stefano Ermon (Stanford University) · Caiming Xiong (Salesforce Research) · Shafiq Joty (SalesForce.com) · Nikhil Naik (MIT) | | 1227 | CSTA: CNN-based Spatiotemporal Attention for Video Summarization | Jaewon Son (None) · Jaehun Park (Sung Kyun Kwan University) · Kwangsu Kim (Department of Computer Science & Engineering, College of Computing, Sungkyunkwan University) | | 1228 | Towards Robust Emotion Recognition in Context Debiasing | Dingkang Yang (Fudan University) · Kun Yang (Fudan University) · Mingcheng Li (Fudan University) · Shunli Wang (Fudan University) · Shuaibing Wang (Fudan University) · Lihua Zhang (Fudan University) | | 1229 | Multimodal Dataset Pruning using Image-Captioning Models | Anas Mahmoud (University of Toronto) · Mostafa Elhoushi (Meta, FAIR) · Amro Abbas (Meta) · Yu Yang (University of California, Los Angeles) · Newsha Ardalani (Facebook) · Hugh Leather (Facebook) · Ari Morcos (Meta AI (FAIR)) | | 1230 | AMU-Tuning: Learning Effective Bias for CLIP-based Few-shot Classification | Yuwei Tang (Tianjin University) · ZhenYi Lin (TianJin University) · Qilong Wang (university of tianjin of china) · Pengfei Zhu (Tianjin University) · Qinghua Hu (Tianjin University) | | 1231 | Not All Voxels Are Equal: Hardness-Aware Semantic Scene Completion with Self-Distillation | Song Wang (Zhejiang University) · Jiawei Yu (Zhejiang University) · Wentong Li (College of Computer Science and Technology, Zhejiang University) · Wenyu Liu (Zhejiang University) · Xiaolu Liu (Zhejiang University) · Junbo Chen (UDEER AI PTE.LTD) · Jianke Zhu (Zhejiang University) | | 1232 | Towards Fairness-Aware Adversarial Learning | Yanghao Zhang (University of Liverpool) · Tianle Zhang (University of Liverpool) · Ronghui Mu (Lancaster University) · Xiaowei Huang (University of Liverpool) · Wenjie Ruan (University of Exeter) | | 1233 | Hide in Thicket: Generating Imperceptible and Rational Adversarial Perturbations on 3D Point Clouds | 
Tianrui Lou (None) · Xiaojun Jia (, Chinese Academy of Sciences) · Jindong Gu (University of Oxford) · Li Liu (University of Oulu) · Siyuan Liang (National University of Singapore) · Bangyan He (Institute of Information Engineering, CAS) · Xiaochun Cao (SUN YAT-SEN UNIVERSITY) | | 1234 | JoAPR: Cleaning the Lens of Prompt Learning for Visual-Language Models | YUNCHENG GUO (None) · Xiaodong Gu (Fudan University) | | 1235 | Retrieval-Augmented Egocentric Video Captioning | Jilan Xu (None) · Yifei Huang (The University of Tokyo) · Junlin Hou (Hong Kong University of Science and Technology) · Guo Chen (Nanjing University) · Yuejie Zhang (Fudan University) · Rui Feng (Fudan University) · Weidi Xie (Shanghai Jiaotong University) | | 1236 | Low-Rank Knowledge Decomposition for Medical Foundation Models | Yuhang Zhou () · Haolin li (Fudan University) · Siyuan Du (Fudan University) · Jiangchao Yao (Shanghai Jiaotong University) · Ya Zhang (Shanghai Jiao Tong University) · Yanfeng Wang (Shanghai Jiao Tong University) | | 1237 | FaceTalk: Audio-Driven Motion Diffusion for Neural Parametric Head Models | Shivangi Aneja (Technical University of Munich) · Justus Thies (Max-Planck Institute for Intelligent Systems) · Angela Dai () · Matthias Nießner (Technical University of Munich) | | 1238 | Pixel-level Semantic Correspondence through Layout-aware Representation Learning and Multi-scale Matching Integration | Yixuan Sun (Fudan University) · Zhangyue Yin (Fudan University) · Haibo Wang (None) · Yan Wang (Fudan University) · Xipeng Qiu (Fudan University) · Weifeng Ge (Fudan University) · Wenqiang Zhang (None) | | 1239 | Understanding Video Transformers via Universal Concept Discovery | MATTHEW KOWAL (None) · Achal Dave (None) · Rares Andrei Ambrus (Toyota Research Institute) · Adrien Gaidon (Toyota Research Institute (TRI)) · Kosta Derpanis (York University/Samsung) · Pavel Tokmakov (Toyota Research Institute) | | 1240 | CPR: Retrieval Augmented Generation for Copyright Protection 
| Aditya Golatkar (University of California, Los Angeles) · Alessandro Achille (California Institute of Technology) · Luca Zancato (AWS AI Labs) · Yu-Xiang Wang (UC Santa Barbara / Amazon) · Ashwin Swaminathan (University of Maryland, College Park) · Stefano Soatto (AWS) | | 1241 | Generative Proxemics: A Prior for 3D Social Interaction from Images | Lea Müller (University of California, Berkeley) · Vickie Ye (University of California, Berkeley) · Georgios Pavlakos (University of Texas at Austin) · Michael J. Black (University of Tübingen) · Angjoo Kanazawa (UC Berkeley) | | 1242 | Event-assisted Low-Light Video Object Segmentation | Li Hebei (University of Science and Technology of China) · Jin Wang (University of Science and Technology of China) · Jiahui Yuan (University of Science and Technology of China) · Yue Li (None) · Wenming Weng (None) · Yansong Peng (None) · Yueyi Zhang (None) · Zhiwei Xiong (None) · Xiaoyan Sun (University of Science and Technology of China) | | 1243 | 3DToonify: Creating Your High-Fidelity 3D Stylized Avatar Easily from 2D Portrait Images | Yifang Men (Alibaba Group) · Hanxi Liu (Tianjin University) · Yuan Yao (Alibaba group) · Miaomiao Cui (Alibaba Group) · Xuansong Xie (Alibaba Group) · Zhouhui Lian (Peking University) | | 1244 | Animating General Image with Large Visual Motion Model | Dengsheng Chen (Meituan) · Xiaoming Wei (Meituan) · Xiaolin Wei (Meituan) | | 1245 | DeIl: Direct and Inverse CLIP for Open-World Few-Shot Learning | Shuai Shao (Zhejiang Lab) · Yu Bai (None) · Yan WANG (Beihang University) · Bao-di Liu (China University of Petroleum (East China)) · Yicong Zhou (University of Macau) | | 1246 | DETRs Beat YOLOs on Real-time Object Detection | Yian Zhao (Peking University) · Wenyu Lv (Baidu) · Shangliang Xu (Baidu) · Jinman Wei (Tianjin University) · Guanzhong Wang (Baidu) · Qingqing Dang (Baidu) · Yi Liu (None) · Jie Chen (Peking University) | | 1247 | GPT4Point: A Unified Framework for Point-Language Understanding and 
Generation | Zhangyang Qi (None) · Ye Fang (None) · Zeyi Sun (Shanghai Jiao Tong University) · Xiaoyang Wu (The University of Hong Kong) · Tong Wu (None) · Jiaqi Wang (Shanghai AI Laboratory) · Dahua Lin (The Chinese University of Hong Kong) · Hengshuang Zhao (The University of Hong Kong) | | 1248 | Scene Adaptive Sparse Transformer for Event-based Object Detection | Yansong Peng (None) · Li Hebei (University of Science and Technology of China) · Yueyi Zhang (None) · Xiaoyan Sun (University of Science and Technology of China) · Feng Wu (University of Science and Technology of China) | | 1249 | Transcending the Limit of Local Window: Advanced Super-Resolution Transformer with Adaptive Token Dictionary | Leheng Zhang (University of Electronic Science and Technology of China) · Yawei Li (ETH Zurich) · Xingyu Zhou (University of Electronic Science and Technology of China) · Xiaorui Zhao (None) · Shuhang Gu (University of Electronic Science and Technology of China) | | 1250 | Amodal Completion via Progressive Mixed Context Diffusion | Katherine Xu (None) · Lingzhi Zhang (School of Engineering and Applied Science, University of Pennsylvania) · Jianbo Shi (None) | | 1251 | Deep Generative Model based Rate-Distortion for Image Downscaling Assessment | yuanbang liang (Cardiff University) · Bhavesh Garg (WadhwaniAI) · Paul L. 
Rosin (Cardiff University) · Yipeng Qin (Cardiff University) | | 1252 | Rendering Every Pixel for High-Fidelity Geometry in 3D GANs | Alex Trevithick (None) · Matthew Chan (NVIDIA) · Towaki Takikawa (NVIDIA) · Umar Iqbal (None) · Shalini De Mello (NVIDIA Research) · Manmohan Chandraker (UC San Diego) · Ravi Ramamoorthi (None) · Koki Nagano (None) | | 1253 | Forecasting of 3D Whole-body Human Poses with Grasping Objects | yan haitao (None) · Qiongjie Cui (Nanjing University of Science and Technology) · Jiexin Xie (Fudan University) · Shijie Guo (Fudan University) | | 1254 | SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities | Boyuan Chen (MIT) · Zhuo Xu (Google Deepmind) · Sean Kirmani (Google DeepMind) · brian ichter (Google) · Dorsa Sadigh (Google) · Leonidas Guibas (Stanford University) · Fei Xia (Google) | | 1255 | Residual Learning in Diffusion Models | Zhang Junyu (Central South University) · Daochang Liu (University of Sydney) · Eunbyung Park (SKKU) · Shichao Zhang (Central South University) · Chang Xu (University of Sydney) | | 1256 | Blur2Blur: Blur Conversion for Unsupervised Image Deblurring on Unknown Domains | Bang-Dang Pham () · Phong Tran (MBZUAI) · Anh Tran (None) · Cuong Pham (Posts & Telecommunications Institute of Technology and VinAI Research) · Rang Nguyen (VinAI Research) · Minh Hoai (State University of New York, Stony Brook) | | 1257 | PromptCoT: Align Prompt Distribution via Adapted Chain-of-Thought | Junyi Yao (None) · Yijiang Liu (Nanjing University) · Zhen Dong (UC Berkeley) · Mingfei Guo (Stanford University) · Helan Hu (Peking University) · Kurt Keutzer (EECS, UC Berkeley) · Li Du (Nanjing University) · Daquan Zhou (National University of Singapore) · Shanghang Zhang (Peking University) | | 1258 | FreGS: 3D Gaussian Splatting with Progressive Frequency Regularization | Jiahui Zhang (None) · Fangneng Zhan (None) · MUYU XU (Nanyang Technological University) · Shijian Lu (Nanyang Technological University) · 
Eric P. Xing (Mohamed bin Zayed University of AI) | | 1259 | PICTURE: PhotorealistIC virtual Try-on from UnconstRained dEsigns | Shuliang Ning (The Chinese University of HongKong, ShenZhen) · Duomin Wang () · Yipeng Qin (Cardiff University) · Zirong Jin () · Baoyuan Wang (Xiaobing.ai) · Xiaoguang Han (The Chinese University of Hong Kong, Shenzhen) | | 1260 | Stable Neighbor Denoising for Source-free Domain Adaptive Segmentation. | Dong Zhao (Xi'an University of Electronic Science and Technology) · Shuang Wang (Xidian University) · Qi Zang (Xidian University) · Licheng Jiao (Xidian University) · Nicu Sebe (University of Trento) · Zhun Zhong (University of Trento) | | 1261 | In-distribution Public Data Synthesis with Diffusion Models for Differentially Private Image Classification | Jinseong Park (Seoul National University) · Yujin Choi (Seoul National University) · Jaewook Lee (Seoul National University) | | 1262 | Revisiting Sampson Approximations for Geometric Estimation Problems | Felix Rydell (KTH Royal Institute of Technology) · Angelica Torres (Max Planck Institute for Mathematics in the Sciences) · Viktor Larsson (Lund University) | | 1263 | PaintNeSF: Artistic Creation of Stylized Scenes with Vectorized 3D Strokes | Haobin Duan (Beihang University) · Miao Wang (Beihang University) · Yanxun Li (Buaa Software Engineering) · Yong-Liang Yang (University of Bath) | | 1264 | Multi-modal Instruction Tuned LLMs with Fine-grained Visual Perception | Junwen He (Dalian University of Technology) · Yifan Wang (Dalian University of Technology) · Lijun Wang (Dalian University of Technology) · Huchuan Lu (Dalian University of Technology) · Bin Luo (Alibaba Group) · Jun-Yan He (DAMO Academy, Alibaba Group) · Jin-Peng Lan (Alibaba Group) · Xuansong Xie (Alibaba Group) | | 1265 | Generative Latent Coding for Ultra-Low Bitrate Image Compression | Zhaoyang Jia (University of Science and Technology of China) · Jiahao Li (None) · Bin Li (Microsoft) · Houqiang Li (University of 
Science and Technology of China) · Yan Lu (Microsoft Research Asia) | | 1266 | Transferable and Principled Efficiency for Open-Vocabulary Segmentation | Jingxuan Xu (Beijing Jiaotong University) · Wuyang Chen (University of Texas at Austin) · Yao Zhao (Beijing Jiaotong University) · Yunchao Wei (UTS) | | 1267 | Text-to-Image Diffusion Models are Great Sketch-Photo Matchmakers | Subhadeep Koley (University of Surrey) · Ayan Kumar Bhunia (University of Surrey, United Kingdom) · Aneeshan Sain (University of Surrey) · Pinaki Nath Chowdhury (University of Surrey) · Tao Xiang (University of Surrey) · Yi-Zhe Song (None) | | 1268 | Flexible Depth Completion for Sparse and Varying Point Densities | Jinhyung Park (Carnegie Mellon University) · Yu-Jhe Li (Carnegie Mellon University) · Kris Kitani (Carnegie Mellon University) | | 1269 | Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners | Yazhou Xing (The Hong Kong University of Science and Technology) · Yingqing He (HKUST) · Zeyue Tian (Hong Kong University of Science and Technology) · Xintao Wang (Tencent) · Qifeng Chen (Hong Kong University of Science and Technology) | | 1270 | ArtAdapter: Text-to-Image Style Transfer using Multi-Level Style Encoder and Explicit Adaptation | Dar-Yen Chen (Cardinal Blue) · Hamish Tennent (PicCollage) · Ching-Wen Hsu (PicCollage) | | 1271 | On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation | Agneet Chatterjee (Arizona State University) · Tejas Gokhale (University of Maryland, Baltimore County) · Chitta Baral (Arizona State University) · 'YZ' Yezhou Yang (Arizona State University) | | 1272 | SingularTrajectory: Universal Trajectory Predictor using Diffusion Model | Inhwan Bae (GIST) · Young-Jae Park (None) · Hae-Gon Jeon (None) | | 1273 | SG-PGM: Partial Graph Matching Network with Semantic Geometric Fusion for 3D Scene Graph Alignment and Its Downstream Tasks | Yaxu Xie (German Research Center for Artificial 
Intelligence) · Alain Pagani (German Research Center for Artificial Intelligence) · Didier Stricker (Universität Kaiserslautern) | | 1274 | Sparse Global Matching for Video Frame Interpolation with Large Motion | Chunxu Liu (Nanjing University) · Guozhen Zhang (Nanjing University) · Rui Zhao (Qing Yuan Research Institute, Shanghai Jiao Tong University) · Limin Wang (Nanjing University) | | 1275 | Mitigating Motion Blur in Neural Radiance Fields with Events and Frames | Marco Cannici (Department of Informatics, University of Zurich, University of Zurich) · Davide Scaramuzza (University of Zurich) | | 1276 | Back to 3D: Few-Shot 3D Keypoint Detection with Back-Projected 2D Features | Thomas Wimmer (Technical University of Munich) · Peter Wonka (KAUST) · Maks Ovsjanikov (Ecole Polytechnique, France) | | 1277 | PIGEON: Predicting Image Geolocations | Lukas Haas (Stanford University) · Michal Skreta (Stanford University) · Silas Alberti (Stanford University) · Chelsea Finn (Stanford University) | | 1278 | Alpha-CLIP: A CLIP Model Focusing on Wherever You Want | Zeyi Sun (Shanghai Jiao Tong University) · Ye Fang (None) · Tong Wu (None) · Pan Zhang (Shanghai Artificial Intelligence Laboratory) · Yuhang Zang (Nanyang Technological University) · Shu Kong (Texas A&M University) · Yuanjun Xiong (Mthreads) · Dahua Lin (The Chinese University of Hong Kong) · Jiaqi Wang (Shanghai AI Laboratory) | | 1279 | Discriminability-Driven Channel Selection for Out-of-Distribution Detection | Yue Yuan (Shandong University) · Rundong He (Shandong University) · Yicong Dong (Shandong University) · Zhongyi Han (Shandong University) · Yilong Yin (Shandong University) | | 1280 | Improving Generalization via Meta-Learning on Hard Samples | Nishant Jain (Indian Institute of Technology, Roorkee, Dhirubhai Ambani Institute Of Information and Communication Technology) · Arun Suggala (Google) · Pradeep Shenoy (Google) | | 1281 | Action-slot: Visual Action-centric Representations for Multi-label Atomic 
Activity Recognition in Traffic Scenes | Chi-Hsi Kung (National Yang Ming Chiao Tung University) · 書緯 呂 (National Yang Ming Chiao Tung University) · Yi-Hsuan Tsai (Google) · Yi-Ting Chen (National Yang Ming Chiao Tung University) | | 1282 | CLIP as RNN: Segment Countless Visual Concepts without Training Endeavor | Shuyang Sun (University of Oxford) · Runjia Li (University of Oxford) · Philip H.S. Torr (University of Oxford) · Xiuye Gu (None) · Siyang Li (Google) | | 1283 | M&M VTO: Multi-Garment Virtual Try-On and Editing | Luyang Zhu (Department of Computer Science, University of Washington) · Yingwei Li (Google) · Nan Liu (Google) · Hao Peng (Google) · Dawei Yang (Google Inc.) · Ira Kemelmacher-Shlizerman (University of Washington) | | 1284 | OneLLM: One Framework to Align All Modalities with Language | Jiaming Han (The Chinese University of Hong Kong) · Kaixiong Gong (None) · Yiyuan Zhang (The Chinese University of Hong Kong) · Jiaqi Wang (Shanghai AI Laboratory) · Kaipeng Zhang (Shanghai AI Laboratory) · Dahua Lin (The Chinese University of Hong Kong) · Yu Qiao (Shanghai Artificial Intelligence Laboratory) · Peng Gao (The Chinese University of Hong Kong) · Xiangyu Yue (None) | | 1285 | LAFS: Landmark-based Facial Self-supervised Learning for Face Recognition | Zhonglin Sun (Queen Mary University of London) · Chen Feng (Queen Mary University of London) · Ioannis Patras (Queen Mary University of London) · Georgios Tzimiropoulos (Queen Mary University London) | | 1286 | Cross-view and Cross-pose Completion for 3D Human Understanding | Matthieu Armando (Naver Labs Europe) · Salma Galaaoui (Naver Labs Europe) · Fabien Baradel (Naver Labs Europe) · Thomas Lucas (Naver Labs Europe) · Vincent Leroy (Naver Labs Europe) · Romain BRÉGIER (None) · Philippe Weinzaepfel (Naver Labs Europe) · Grégory Rogez (Naver Labs Europe) | | 1287 | SinSR: Diffusion-Based Image Super-Resolution in a Single Step | Yufei Wang (Nanyang Technological University) · Wenhan Yang (Peng Cheng Lab) 
· Xinyuan Chen (Shanghai Artificial Intelligence Laboratory) · Yaohui Wang (Shanghai AI Laboratory) · Lanqing Guo (Nanyang Technological University) · Lap-Pui Chau (The Hong Kong Polytechnic University) · Ziwei Liu (Nanyang Technological University) · Yu Qiao (Shanghai Artificial Intelligence Laboratory) · Alex C. Kot (Nanyang Technological University) · Bihan Wen (Nanyang Technological University) | | 1288 | Tuning Stable Rank Shrinkage: Aiming at the Overlooked Structural Risk in Fine-tuning | Sicong Shen (Beihang University) · Yang Zhou (Beihang University) · Bingzheng Wei (Xiaomi Corporation) · Eric Chang (Massachusetts Institute of Technology) · Yan Xu (Beijing University of Aeronautics and Astronautics) | | 1289 | DiSR-NeRF: Diffusion-Guided View-Consistent Super-Resolution NeRF | Jie Long Lee (None) · Chen Li (National University of Singapore) · Gim Hee Lee (National University of Singapore) | | 1290 | Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation | Li Hu (Alibaba) | | 1291 | Taming Mode Collapse in Score Distillation for Text-to-3D Generation | Peihao Wang (University of Texas, Austin) · Dejia Xu (University of Texas at Austin) · Zhiwen Fan (University of Texas, Austin) · Dilin Wang (Facebook) · Sreyas Mohan (Meta) · Forrest Iandola (Meta) · Rakesh Ranjan () · Yilei Li (Facebook) · Qiang Liu (University of Texas, Austin) · Zhangyang Wang (University of Texas at Austin) · Vikas Chandra (Facebook) | | 1292 | Relightable and Animatable Neural Avatar from Sparse-View Video | Zhen Xu (Zhejiang University) · Sida Peng (None) · Chen Geng (Zhejiang University) · Linzhan Mou (Zhejiang University) · Zihan Yan (University of Illinois Urbana-Champaign) · Jiaming Sun (Image Derivative Inc.) 
· Hujun Bao (Zhejiang University) · Xiaowei Zhou (None) | | 1293 | DVMNet: Computing Relative Pose for Unseen Objects Beyond Hypotheses | Chen Zhao (EPFL - EPF Lausanne) · Tong Zhang (EPFL) · Zheng Dang (None) · Mathieu Salzmann (EPFL) | | 1294 | Rethinking Generalizable Face Anti-spoofing via Hierarchical Prototype-guided Distribution Refinement in Hyperbolic Space | Chengyang Hu (None) · Ke-Yue Zhang (Tencent) · Taiping Yao (Tencent Youtu Lab) · Shouhong Ding (Tencent Youtu Lab) · Lizhuang Ma (Dept. of Computer Sci. & Eng., Shanghai Jiao Tong University) | | 1295 | Feature 3DGS: Supercharging 3D Gaussian Splatting to Enable Distilled Feature Fields | Shijie Zhou (University of California, Los Angeles) · Haoran Chang (University of California, Los Angeles) · Sicheng Jiang (University of California, Los Angeles) · Zhiwen Fan (University of Texas, Austin) · Zehao Zhu (University of Texas at Austin) · Dejia Xu (University of Texas at Austin) · Pradyumna Chari (University of California, Los Angeles) · Suya You (University of Southern California) · Zhangyang Wang (University of Texas at Austin) · Achuta Kadambi (UCLA) | | 1296 | PostureHMR: Posture Transformation for 3D Human Mesh Recovery | Yupei Song (None) · Xiao WU (Southwest Jiaotong University) · Zhaoquan Yuan (None) · Jian-Jun Qiao (Southwest Jiaotong University) · Qiang Peng (Southwest Jiaotong University) | | 1297 | ControlRoom3D: Room Generation using Semantic Controls | Jonas Schult (Rheinisch Westfälische Technische Hochschule Aachen) · Sam Tsai (Meta) · Lukas Hoellein (None) · Bichen Wu (Facebook) · Jialiang Wang (Facebook) · Chih-Yao Ma (Facebook) · Kunpeng Li (Meta) · Xiaofang Wang (Meta) · Felix Wimbauer (Technical University of Munich) · Zijian He (None) · Peizhao Zhang (Facebook) · Bastian Leibe (RWTH Aachen University) · Peter Vajda (Facebook) · Ji Hou (Facebook) | | 1298 | VastGaussian: Vast 3D Gaussians for Large Scene Reconstruction | Jiaqi Lin (Tsinghua University) · Zhihao Li (Huawei 
Technologies Ltd.) · Xiao Tang (Huawei Technologies Ltd.) · Jianzhuang Liu (Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences) · Shiyong Liu (Huawei Noah's Ark Lab) · Jiayue Liu (Tsinghua University, Tsinghua University) · Yangdi Lu (Huawei Technologies Ltd.) · Xiaofei Wu (Huawei Technologies Ltd.) · Songcen Xu (Huawei Noah's Ark Lab) · Youliang Yan (Huawei Technologies Ltd.) · Wenming Yang (Tsinghua University) | | 1299 | WANDR: Wrist-driven Autonomous Navigation for Data-based Goal Reaching | Markos Diomataris (None) · Nikos Athanasiou (None) · Omid Taheri () · Xi Wang (None) · Otmar Hilliges (None) · Michael J. Black (University of Tübingen) | | 1300 | Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering | Tao Lu (Nanjing University) · Mulin Yu (Shanghai AI Laboratory) · Linning Xu (The Chinese University of Hong Kong) · Yuanbo Xiangli (None) · Limin Wang (Nanjing University) · Dahua Lin (The Chinese University of Hong Kong) · Bo Dai (Shanghai AI Laboratory) | | 1301 | SimDA: Simple Diffusion Adapter for Efficient Video Generation | Zhen Xing (Fudan University) · Qi Dai (Microsoft Research Asia) · Han Hu (Microsoft Research Asia) · Zuxuan Wu (Fudan University) · Yu-Gang Jiang (Fudan University) | | 1302 | GART: Gaussian Articulated Template Models | Jiahui Lei (University of Pennsylvania) · Yufu Wang (University of Pennsylvania) · Georgios Pavlakos (University of Texas at Austin) · Lingjie Liu (Saarland Informatics Campus, Max-Planck Institute) · Kostas Daniilidis (University of Pennsylvania) | | 1303 | Learning from Observer Gaze: Zero-shot Attention Prediction Oriented by Human-Object Interaction Recognition | Yuchen Zhou (Sun Yat-Sen University) · Linkai Liu (Sun Yat-Sen University) · Chao Gou (Sun Yat-Sen University) | | 1304 | Spatio-Temporal Turbulence Mitigation: A Translational Perspective | Xingguang Zhang (Purdue University) · Nicholas M Chimitt () · Yiheng Chi (Purdue University) · Zhiyuan Mao (Samsung Research 
America) · Stanley H. Chan (Purdue University, USA) | | 1305 | Anchor-based Robust Finetuning of Vision-Language Models | Jinwei Han (Wuhan University) · Zhiwen Lin (Tencent) · Zhongyisun Sun (Tencent Youtu Lab) · Yingguo Gao (Tencent Youtu Lab) · Ke Yan () · Shouhong Ding (Tencent Youtu Lab) · Yuan Gao (Wuhan University) · Gui-Song Xia (Wuhan University) | | 1306 | Denoising Point Cloud in Latent Space via Graph Convolution and Invertible Neural Network | Aihua Mao (South China University of Technology) · Biao Yan (None) · Zijing Ma (South China University of Technology) · Ying He (Nanyang Technological University) | | 1307 | Composing Object Relations and Attributes for Image-Text Matching | Khoi Pham (University of Maryland, College Park) · Chuong Huynh (University of Maryland, College Park) · Ser-Nam Lim (Meta AI) · Abhinav Shrivastava (University of Maryland) | | 1308 | SwiftBrush: One-Step Text-to-Image Diffusion Model with Variational Score Distillation | Thuan Hoang Nguyen (Ho Chi Minh City University of Technology) · Anh Tran (None) | | 1309 | Diffusion Handles: Enabling 3D Edits for Diffusion Models by Lifting Activations to 3D | Karran Pandey (University of Toronto) · Paul Guerrero (Adobe Systems) · Matheus Gadelha (Adobe Systems) · Yannick Hold-Geoffroy (Adobe Research) · Karan Singh (Department of Computer Science) · Niloy J. 
Mitra (University College London) | | 1310 | PSDPM: Prototype-based Secondary Discriminative Pixels Mining for Weakly Supervised Semantic Segmentation | Xinqiao Zhao (Xi’an Jiaotong-Liverpool University) · Yang (None) · Tianhong Dai (University of Aberdeen) · Bingfeng Zhang (China University of Petroleum (East China)) · Jimin Xiao (Xi'an Jiaotong-Liverpool University) | | 1311 | DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent Text-to-Image Personalization | Jisu Nam (Korea University) · Heesu Kim (NAVER) · DongJae Lee (KAIST) · Siyoon Jin (Korea University) · Seungryong Kim (Korea University) · Seunggyu Chang (NAVER Cloud) | | 1312 | COTR: Compact Occupancy TRansformer for Vision-based 3D Occupancy Prediction | Qihang Ma (East China Normal University) · Xin Tan (East China Normal University) · Yanyun Qu (Xiamen University) · Lizhuang Ma (Dept. of Computer Sci. & Eng., Shanghai Jiao Tong University) · Zhizhong Zhang (East China Normal University) · Yuan Xie (East China Normal University) | | 1313 | Point Cloud Pre-training with Diffusion Models | xiao zheng (None) · Xiaoshui Huang (Shanghai AI Laboratory) · Guofeng Mei (None) · Zhaoyang Lyu (Shanghai AI Laboratory) · Yuenan Hou (Shanghai AI Laboratory) · Wanli Ouyang (University of Sydney) · Bo Dai (Shanghai AI Laboratory) · Yongshun Gong (Shandong University) | | 1314 | Leveraging Stereo Prior for Generalizable Novel-View Synthesis | Haechan Lee (Pohang University of Science and Technology) · Wonjoon Jin (Pohang University of Science and Technology) · Seung-Hwan Baek (POSTECH) · Sunghyun Cho (POSTECH) | | 1315 | Prompt3D: Random Prompt Assisted Weakly-Supervised 3D Object Detection | Xiaohong Zhang (None) · Huisheng Ye (Nanjing University) · Jingwen Li (nanjing university) · Qinyu Tang (Nanjing University) · Yuanqi Li (Nanjing University) · Yanwen Guo (Nanjing University) · Jie Guo (Nanjing University) | | 1316 | Language-driven All-in-one Adverse Weather Removal | Hao Yang (Beijing 
Institute of Technology) · Liyuan Pan (Beijing Institute of Technology) · Yan Yang (ANU) · Wei Liang (Beijing Institute of Technology) | | 1317 | Can Language Beat Numerical Regression? Language-Based Multimodal Trajectory Prediction | Inhwan Bae (GIST) · Junoh Lee (Gwangju Institute of Science and Technology) · Hae-Gon Jeon (None) | | 1318 | Efficient Meshflow and Optical Flow Estimation from Event Cameras | Xinglong Luo (None) · Ao Luo (Megvii Technology Inc.) · Zhengning Wang (University of Electronic Science and Technology of China) · Chunyu Lin (Beijing Jiaotong University) · Bing Zeng (None) · Shuaicheng Liu (None) | | 1319 | SPU-PMD: Self-Supervised Point Cloud Upsampling via Progressive Mesh Deformation | Yanzhe Liu (Dalian Maritime University) · Rong Chen (Dalian Maritime University) · Yushi Li (Xi'an Jiaotong-Liverpool University) · Yixi Li (Dalian Maritime University) · Xuehou Tan (Tokai University) | | 1320 | C3Net: Compound Conditioned ControlNet for Multimodal Content Generation | Juntao Zhang (Hong Kong University of Science and Technology) · Yuehuai LIU (Hong Kong University of Science and Technology) · Yu-Wing Tai (None) · Chi-Keung Tang (The Hong Kong University of Science and Technology) | | 1321 | Volumetric Environment Representation for Vision-Language Navigation | Liu (None) · Wenguan Wang (Zhejiang University) · Yi Yang (Zhejiang University) | | 1322 | LiDAR4D: Dynamic Neural Fields for Novel Space-time View LiDAR Synthesis | Zehan Zheng (Tongji University) · Fan Lu (Tongji University) · Weiyi Xue (Tongji University) · Guang Chen (Tongji University) · Changjun Jiang (Tongji University) | | 1323 | Leveraging Camera Triplets for Efficient and Accurate Structure-from-Motion | Lalit Manam () · Venu Madhav Govindu (Indian Institute of Science) | | 1324 | LEAD: Learning Decomposition for Source-free Universal Domain Adaptation | Sanqing Qu (Tongji University) · Tianpei Zou (Tongji University) · Lianghua He (Tongji University) · Florian Röhrbein 
(Chemnitz University of Technology) · Alois Knoll (Technical University Munich) · Guang Chen (Tongji University) · Changjun Jiang (Tongji University) | | 1325 | CG-HOI: Contact-Guided 3D Human-Object Interaction Generation | Christian Diller (Technische Universität München) · Angela Dai () | | 1326 | Contrastive Mean-Shift Learning for Generalized Category Discovery | Sua Choi (None) · Dahyun Kang (POSTECH) · Minsu Cho (POSTECH) | | 1327 | Federated Generalized Category Discovery | Nan Pu (University of Trento) · Wenjing Li (University of Science and Technology of China) · Xinyuan Ji (Leiden University) · Yalan Qin (Shanghai University) · Nicu Sebe (University of Trento) · Zhun Zhong (University of Trento) | | 1328 | Is Vanilla MLP in Neural Radiance Field Enough for Few-shot View Synthesis? | Hanxin Zhu (University of Science and Technology of China) · Tianyu He (None) · Xin Li (None) · Bingchen Li (University of Science and Technology of China) · Zhibo Chen (University of Science and Technology of China) | | 1329 | Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters | JiazuoYu (None) · Yunzhi Zhuge (Dalian University of Technology) · Lu Zhang (Dalian University of Technology) · Ping Hu (University of Electronic Science and Technology of China) · Dong Wang (Dalian University of Technology) · Huchuan Lu (Dalian University of Technology) · You He (Tsinghua University, Tsinghua University) | | 1330 | How to Handle Sketch-Abstraction in Sketch-Based Image Retrieval? 
| Subhadeep Koley (University of Surrey) · Ayan Kumar Bhunia (University of Surrey, United Kingdom) · Aneeshan Sain (University of Surrey) · Pinaki Nath Chowdhury (University of Surrey) · Tao Xiang (University of Surrey) · Yi-Zhe Song (None) | | 1331 | Equivariant Multi-Modality Image Fusion | Zixiang Zhao (Xi'an Jiaotong University) · Haowen Bai (Xi'an Jiaotong University) · Jiangshe Zhang (Xi'an Jiaotong University) · Yulun Zhang (ETH Zürich) · Kai Zhang (None) · Shuang Xu (Northwest Polytechnical University Xi'an) · Dongdong Chen (Heriot-Watt University) · Radu Timofte (University of Würzburg) · Luc Van Gool (ETH Zurich) | | 1332 | DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing | Chong Mou (Peking University) · Xintao Wang (Tencent) · Jiechong Song (None) · Ying Shan (Tencent) · Jian Zhang (None) | | 1333 | Iterated Learning Improves Compositionality in Large Vision-Language Models | Chenhao Zheng (University of Michigan) · Jieyu Zhang (Department of Computer Science, University of Washington) · Aniruddha Kembhavi (Allen Institute for Artificial Intelligence) · Ranjay Krishna (University of Washington) | | 1334 | Detours for Navigating Instructional Videos | Kumar Ashutosh (None) · Zihui Xue (None) · Tushar Nagarajan (Meta) · Kristen Grauman (University of Texas at Austin) | | 1335 | Domain Gap Embeddings for Generative Dataset Augmentation | Yinong Wang (None) · Younjoon Chung (Carnegie Mellon University) · Chen Henry Wu (Carnegie Mellon University) · Fernando De la Torre (Carnegie Mellon) | | 1336 | Domain-Agnostic Mutual Prompting for Unsupervised Domain Adaptation | Zhekai Du (University of Electronic Science and Technology of China) · Xinyao Li (None) · Fengling Li (University of Technology Sydney) · Ke Lu (University of Electronic Science and Technology of China) · Lei Zhu (Shandong Normal University) · Jingjing Li (University of Electronic Science and Technology of China) | | 1337 | TransLoc4D: Transformer-based 4D-Radar 
Place Recognition | Guohao Peng (Nanyang Technological University) · Heshan Li (Nanyang Technological University) · Yangyang Zhao (Nanyang Technological University) · Jun Zhang (Nanyang Technological University) · Zhenyu Wu (Nanyang Technological University) · Pengyu Zheng (Chinese University of Hong Kong) · Danwei Wang (Nanyang Technological University) | | 1338 | Higher-order Relational Reasoning for Pedestrian Trajectory Prediction | Sungjune Kim (Korea University) · Hyung-gun Chi (Purdue University) · Hyerin Lim (Hyundai Motor Company) · Karthik Ramani (Purdue University) · Jinkyu Kim (Korea University) · Sangpil Kim (Korea University) | | 1339 | Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation | Jingyun Wang (None) · Guoliang Kang (Beihang University) | | 1340 | Leveraging Vision-Language Models for Improving Domain Generalization in Image Classification | Sravanti Addepalli (Indian Institute of Science) · Ashish Asokan (Indian Institute of Science, Indian institute of science, Bangalore) · Lakshay Sharma (Indian Institute of Science, Indian institute of science, Bangalore) · R. 
Venkatesh Babu (Indian Institute of Science) | | 1341 | CCEdit: Creative and Controllable Video Editing via Diffusion Models | Ruoyu Feng (University of Science and Technology of China) · Wenming Weng (None) · Yanhui Wang (None) · Yuhui Yuan (Microsoft Research Asia) · Jianmin Bao (Microsoft) · Chong Luo (Microsoft Research Asia) · Zhibo Chen (University of Science and Technology of China) · Baining Guo (Microsoft Research) | | 1342 | Towards Learning a Generalist Model for Embodied Navigation | Duo Zheng (Department of Computer Science and Engineering, The Chinese University of Hong Kong) · Shijia Huang (The Chinese University of Hong Kong) · Lin Zhao (Beijing Institute of Technology) · Yiwu Zhong (University of Wisconsin, Madison) · Liwei Wang (CUHK) | | 1343 | Small Steps and Level Sets: Fitting Neural Surface Models with Point Guidance | Chamin Hewa Koneputugodage (Australian National University) · Yizhak Ben-Shabat (Technion, Israel Institute of Technology) · Dylan Campbell (Australian National University) · Stephen Gould (Australian National University) | | 1344 | Absolute Pose from One or Two Scaled and Oriented Features | Jonathan Ventura (None) · Zuzana Kukelova (Czech Technical University in Prague) · Torsten Sattler (Czech Technical University in Prague) · Daniel Barath (ETHZ - ETH Zurich) | | 1345 | DSGG: Dense Relation Transformer for an End-to-end Scene Graph Generation | Zeeshan Hayder (CSIRO) · Xuming He (ShanghaiTech University) | | 1346 | RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization | Mengqi Huang (University of Science and Technology of China) · Zhendong Mao (None) · Mingcong Liu (ByteDance Inc.) 
· Qian HE (Institute of Remote Sensing Application, Chinese Academic of Sciences) · Yongdong Zhang (University of Science and Technology of China) | | 1347 | Driving Everywhere with Large Language Model Policy Adaptation | Boyi Li (UC Berkeley / NVIDIA) · Yue Wang (Massachusetts Institute of Technology) · Jiageng Mao (CUHK) · Boris Ivanovic (NVIDIA) · Sushant Veer (NVIDIA) · Karen Leung (University of Washington) · Marco Pavone (NVIDIA) | | 1348 | SANeRF-HQ: Segment Anything for NeRF in High Quality | Yichen Liu (HKUST) · Benran Hu (The Hong Kong University of Science and Technology) · Chi-Keung Tang (The Hong Kong University of Science and Technology) · Yu-Wing Tai (None) | | 1349 | APSeg: Auto-Prompt Network for Cross-Domain Few-Shot Semantic Segmentation | Weizhao He (None) · Yang Zhang (Shenzhen University) · Wei Zhuo (Shenzhen University) · Linlin Shen (None) · Jiaqi Yang (University of Nottingham) · Songhe Deng (None) · Liang Sun (Shenzhen University) | | 1350 | ECLIPSE: Efficient Continual Learning in Panoptic Segmentation with Visual Prompt Tuning | Beomyoung Kim (NAVER Cloud / KAIST) · Joonsang Yu (NAVER) · Sung Ju Hwang (Korea Advanced Institute of Science and Technology) | | 1351 | Discover and Mitigate Multiple Biased Subgroups in Image Classifiers | Zeliang Zhang (University of Rochester) · Mingqian Feng (University of Rochester) · Zhiheng Li (Amazon AGI) · Chenliang Xu (University of Rochester) | | 1352 | MICap: A Unified Model for Identity-aware Movie Descriptions | Haran Raajesh (International Institute of Information Technology, Hyderabad, International Institute of Information Technology Hyderabad) · Naveen Desanur (International Institute of Information Technology, Hyderabad, International Institute of Information Technology Hyderabad) · Zeeshan Khan (INRIA) · Makarand Tapaswi (Wadhwani AI, IIIT Hyderabad) | | 1353 | Differentiable Point-based Inverse Rendering | Hoon-Gyu Chung (Korea University) · Seokjun Choi (Pohang University of Science and 
Technology) · Seung-Hwan Baek (POSTECH) | | 1354 | Referring Expression Counting | Siyang Dai (Singapore University of Technology and Design) · Jun Liu () · Ngai-Man Cheung (Singapore University of Technology and Design) | | 1355 | InstanceDiffusion: Instance-level Control for Image Generation | Xudong Wang (Electrical Engineering & Computer Science Department, University of California Berkeley) · Trevor Darrell (Electrical Engineering & Computer Science Department) · Sai Saketh Rambhatla (Meta) · Rohit Girdhar (Meta) · Ishan Misra (Facebook) | | 1356 | Shadow Generation for Composite Image Using Diffusion Model | Qingyang Liu (None) · Junqi You (Shanghai Jiaotong University) · Jian-Ting Wang (Shanghai JiaoTong University) · Xinhao Tao (Shanghai Jiaotong University) · Bo Zhang (Shanghai Jiao Tong University) · Li Niu () | | 1357 | DS-NeRV: Implicit Neural Video Representation with Decomposed Static and Dynamic Codes | Hao Yan (Tianjin University) · Zhihui Ke (Tianjin University) · Xiaobo Zhou (Tianjin University) · Tie Qiu (Tianjin University) · Xidong Shi (Tianjin University) · DaDong Jiang (Tianjin University) | | 1358 | OVER-NAV: Elevating Iterative Vision-and-Language Navigation with Open-Vocabulary Detection and StructurEd Representation | Ganlong Zhao (University of Hong Kong) · Guanbin Li (Sun Yat-sen University) · Weikai Chen (Tencent America) · Yizhou Yu (The University of Hong Kong) | | 1359 | Rolling Shutter Correction with Intermediate Distortion Flow Estimation | Mingdeng Cao (The University of Tokyo) · Sidi Yang (Shenzhen International Graduate School, Tsinghua University) · Yujiu Yang (Tsinghua University) · Yinqiang Zheng (None) | | 1360 | Towards Transferable Targeted 3D Adversarial Attack in the Physical World | Yao Huang (Beihang University) · Yinpeng Dong (Tsinghua University) · Shouwei Ruan (None) · Xiao Yang (Tsinghua University, Tsinghua University) · Hang Su (Tsinghua University) · Xingxing Wei (None) | | 1361 | AnyDoor: Zero-shot 
Object-level Image Customization | Xi Chen (the University of Hong Kong, University of Hong Kong) · Lianghua Huang (Alibaba Group) · Yu Liu (Alibaba Group) · Yujun Shen (The Chinese University of Hong Kong) · Deli Zhao (Alibaba Group) · Hengshuang Zhao (The University of Hong Kong) | | 1362 | GraphDreamer: Compositional 3D Scene Synthesis from Scene Graphs | Gege Gao (ETH Zürich) · Weiyang Liu (University of Cambridge) · Anpei Chen (Department of Computer Science, ETHZ - ETH Zurich) · Andreas Geiger (University of Tübingen) · Bernhard Schölkopf (ELLIS Institute) | | 1363 | Revisiting Spatial-Frequency Information Integration from a Hierarchical Perspective for Panchromatic and Multi-Spectral Image Fusion | Jiangtong Tan (None) · Jie Huang (University of Science and Technology of China) · Kaiwen Zheng (University of Science and Technology of China) · Man Zhou (University of Science and Technology of China) · Keyu Yan (University of Science and Technology of China) · Danfeng Hong (Chinese Academy of Sciences, Aerospace Information Research Institute) · Feng Zhao (University of Science and Technology of China) | | 1364 | Physics-guided Shape-from-Template: Monocular Video Perception through Neural Surrogate Models | David Stotko (Rheinische Friedrich-Wilhelms-Universität Bonn) · Nils Wandel (University of Bonn) · Reinhard Klein (University of Bonn) | | 1365 | On Train-Test Class Overlap and Detection for Image Retrieval | Chull Hwan Song (Dealicious Inc) · Jooyoung Yoon (Dealicious Inc) · Taebaek Hwang (None) · Shunghyun Choi (None) · Yeong Hyeon Gu (Sejong University) · Yannis Avrithis (IARAI) | | 1366 | 3D Facial Expressions through Analysis-by-Neural-Synthesis | George Retsinas (None) · Panagiotis Filntisis (None) · Radek Danecek (Max Planck Institute for Intelligent Systems, Max-Planck Institute) · Victoria Abrevaya (None) · Anastasios Roussos (Foundation for Research and Technology - Hellas) · Timo Bolkart (Google) · Petros Maragos (National Technical University 
of Athens) | | 1367 | Exploring the Transferability of Visual Prompting for Multimodal Large Language Models | Yichi Zhang (Tsinghua University) · Yinpeng Dong (Tsinghua University) · Siyuan Zhang (None) · Tianzan Min (Tsinghua University, Tsinghua University) · Hang Su (Tsinghua University) · Jun Zhu (Tsinghua University) | | 1368 | VSRD: Instance-Aware Volumetric Silhouette Rendering for Weakly Supervised 3D Object Detection | Zihua Liu () · Hiroki Sakuma (SenseTime Japan Ltd.) · Masatoshi Okutomi (Tokyo Institute of Technology) | | 1369 | Unified Language-driven Zero-shot Domain Adaptation | Senqiao Yang (Harbin Institute of Technology) · Zhuotao Tian (The Chinese University of Hong Kong) · Li Jiang (Max Planck Institute for Informatics) · Jiaya Jia (The Chinese University of Hong Kong) | | 1370 | Aligning Logits Generatively for Principled Black-Box Knowledge Distillation | Jing Ma (None) · Xiang Xiang (Huazhong University of Science and Technology) · Ke Wang (Alibaba Group) · Yuchuan Wu (Alibaba Group) · Yongbin Li (Alibaba Group) | | 1371 | HomoFormer: Homogenized Transformer for Image Shadow Removal | Jie Xiao (University of Science and Technology of China) · Xueyang Fu (University of Science and Technology of China) · Yurui Zhu (University of Science and Technology of China) · Dong Li (University of Science and Technology of China) · Jie Huang (University of Science and Technology of China) · Kai Zhu (University of Science and Technology of China) · Zheng-Jun Zha (University of Science and Technology of China) | | 1372 | ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers | Narges Norouzi (None) · Svetlana Orlova (Eindhoven University of Technology) · Daan de Geus (Eindhoven University of Technology) · Gijs Dubbelman (Eindhoven University of Technology) | | 1373 | Efficient LoFTR: Semi-Dense Local Feature Matching with Sparse-Like Speed | Yifan Wang (None) · Xingyi He (Zhejiang University) · Sida 
Peng (None) · Dongli Tan (Zhejiang University) · Xiaowei Zhou (None) | | 1374 | Language-guided Image Reflection Separation | Haofeng Zhong (Peking University) · Yuchen Hong (Peking University) · Shuchen Weng (Peking University) · Jinxiu Liang (None) · Boxin Shi (None) | | 1375 | Beyond Seen Primitive Concepts and Attribute-Object Compositional Learning | Nirat Saini (None) · Khoi Pham (University of Maryland, College Park) · Abhinav Shrivastava (University of Maryland) | | 1376 | Transductive Zero-Shot & Few-Shot CLIP | Ségolène Martin (CentraleSupelec) · Yunshi HUANG (École de technologie supérieure, Université du Québec) · Fereshteh Shakeri (École de technologie supérieure) · Jean-Christophe Pesquet (CentraleSupelec) · Ismail Ben Ayed (ETS Montreal) | | 1377 | PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models | Yiming Zhang (None) · Zhening Xing (Shanghai AI Laboratory) · Yanhong Zeng (None) · Youqing Fang (Anhui University) · Kai Chen (Shanghai AI Laboratory) | | 1378 | DemoCaricature: Democratising Caricature Generation with a Rough Sketch | Dar-Yen Chen (Cardinal Blue) · Ayan Kumar Bhunia (University of Surrey, United Kingdom) · Subhadeep Koley (University of Surrey) · Aneeshan Sain (University of Surrey) · Pinaki Nath Chowdhury (University of Surrey) · Yi-Zhe Song (None) | | 1379 | GenZI: Zero-Shot 3D Human-Scene Interaction Generation | Lei Li (Technische Universität München) · Angela Dai () | | 1380 | Motion Diversification Networks | Hee Jae Kim (Boston University, Boston University) · Eshed Ohn-Bar (Boston University, Boston University) | | 1381 | On the Scalability of Diffusion-based Text-to-Image Generation | Hao Li (AWS AI Labs) · Yang Zou (Amazon) · Ying Wang (Amazon) · Orchid Majumder (Amazon Web Services) · Yusheng Xie (Amazon) · R. 
Manmatha (Amazon) · Ashwin Swaminathan (University of Maryland, College Park) · Zhuowen Tu (University of California, San Diego) · Stefano Ermon (Stanford University) · Stefano Soatto (AWS) | | 1382 | 360 + x : A Panoptic Multi-modal Scene Understanding Dataset | Hao Chen (ASML) · Yuqi Hou (University of Birmingham) · Chenyuan Qu (University of Birmingham) · Irene Testini (Cardiff University) · Xiaohan Hong (University of Birmingham) · Jianbo Jiao (University of Birmingham) | | 1383 | SPIDeRS: Structured Polarization for Invisible Depth and Reflectance Sensing | Tomoki Ichikawa (Kyoto University) · Shohei Nobuhara (Kyoto Institute of Technology) · Ko Nishino (Kyoto University) | | 1384 | Contrastive Learning for DeepFake Classification and Localization via Multi-Label Ranking | Cheng-Yao Hong (Academia Sinica) · Yen-Chi Hsu (Department of computer science and informational engineering, National Taiwan University) · Tyng-Luh Liu (IIS/Academia Sinica) | | 1385 | FaceLift: Semi-supervised 3D Facial Landmark Localization | David Ferman (Flawless AI) · Pablo Garrido (Flawless AI) · Gaurav Bharaj (Flawless AI) | | 1386 | BSNet: Box-Supervised Simulation-assisted Mean Teacher for 3D Instance Segmentation | Jiahao Lu (University of Science and Technology of China) · Jiacheng Deng (University of Science and Technology of China) · Tianzhu Zhang (University of Science and Technology of China, Tsinghua University) | | 1387 | Unlocking Pretrained Image Backbones for Semantic Image Synthesis | Tariq Berrada (Meta) · Jakob Verbeek (Meta AI) · camille couprie (Facebook) · Karteek Alahari (Inria) | | 1388 | Test-Time Zero-Shot Temporal Action Localization | Benedetta Liberatori (University of Trento) · Alessandro Conti (University of Trento) · Paolo Rota (University of Trento) · Yiming Wang (Fondazione Bruno Kessler) · Elisa Ricci (University of Trento) | | 1389 | HarmonyView: Harmonizing Consistency and Diversity in One-Image-to-3D | Sangmin Woo (Korea Advanced Institute of 
Science & Technology) · byeongjun park () · Hyojun Go (Twelvelabs) · Jin-Young Kim (Yonsei University) · Changick Kim (Korea Advanced Institute of Science and Technology) | | 1390 | Language-driven Object Fusion into Neural Radiance Fields with Pose-Conditioned Dataset Updates | Ka Chun SHUM (The Hong Kong University of Science and Technology) · Jaeyeon Kim (Hong Kong University of Science and Technology) · Binh-Son Hua (VinAI Research) · Duc Nguyen (Deakin University) · Sai-Kit Yeung (The Hong Kong University of Science and Technology) | | 1391 | Infer from What You Have Seen Before: Temporally-dependent Classifier for Semi-supervised Video Semantic Segmentation | Jiafan Zhuang (Shantou University) · Zilei Wang (University of Science and Technology of China) · Yixin Zhang (University of Science and Technology of China) · Zhun Fan (Shantou University) | | 1392 | Adapt Before Comparison: A New Perspective on Cross-Domain Few-Shot Segmentation | Jonas Herzog (None) | | 1393 | FreeU: Free Lunch in Diffusion U-Net | Chenyang Si (Sea AI Lab) · Ziqi Huang (Nanyang Technological University) · Yuming Jiang (Nanyang Technological University) · Ziwei Liu (Nanyang Technological University) | | 1394 | From Variance to Veracity: Unbundling and Mitigating Gradient Variance in Differentiable Bundle Adjustment Layers | Swaminathan Gurumurthy (School of Computer Science, Carnegie Mellon University) · Karnik Ram (Technische Universität München) · Bingqing Chen (Bosch) · Zachary Manchester (Carnegie Mellon University) · Zico Kolter (Carnegie Mellon University) | | 1395 | Image Restoration by Denoising Diffusion Models With Iteratively Preconditioned Guidance | Tomer Garber (Open University of Israel) · Tom Tirer (Bar-Ilan University) | | 1396 | Instance-aware Exploration-Verification-Exploitation for Instance ImageGoal Navigation | Xiaohan Lei () · Min Wang (Institute of Artificial Intelligence, Hefei Comprehensive National Science Center) · Wengang Zhou (University of Science and 
Technology of China) · Li Li (University of Science and Technology of China) · Houqiang Li (University of Science and Technology of China) | | 1397 | AnyScene: Customized Image Synthesis with Composited Foreground | Ruidong Chen (Tianjin University) · Lanjun Wang (Tianjin University) · Weizhi Nie (Tianjin University) · Yongdong Zhang (University of Science and Technology of China) · Anan Liu (Tianjin University) | | 1398 | No More Ambiguity in 360° Room Layout via Bi-Layout Estimation | Yu-Ju Tsai (University of California, Merced) · Jin Cheng Jhang (National Tsing Hua University) · JINGJING ZHENG (None) · Wei Wang (Amazon) · Albert Chen (Amazon) · Min Sun (None) · Cheng-Hao Kuo (Amazon) · Ming-Hsuan Yang (University of California at Merced) | | 1399 | Mean-Shift Feature Transformer | Takumi Kobayashi (National Institute of Advanced Industrial Science and Technology (AIST)) | | 1400 | SFOD: Spiking Fusion Object Detector | Yimeng Fan (School of Microelectronics, Tianjin University) · Wei Zhang (None) · Changsong Liu (Tianjin University) · Mingyang Li (Tianjin University) · Wenrui Lu (Tianjin University) | | 1401 | RegionGPT: Towards Region Understanding Vision Language Model | Qiushan Guo (The University of Hong Kong) · Shalini De Mello (NVIDIA Research) · Danny Yin (NVIDIA) · Wonmin Byeon (NVIDIA) · Ka Chun Cheung (NVIDIA) · Yizhou Yu (The University of Hong Kong) · Ping Luo (The University of Hong Kong) · Sifei Liu (NVIDIA) | | 1402 | Unlocking the Potential of Pre-trained Vision Transformers for Few-Shot Semantic Segmentation through Relationship Descriptors | Ziqin Zhou (None) · Hai-Ming Xu (The University of Adelaide) · Yangyang Shu (None) · Lingqiao Liu (None) | | 1403 | Relational Matching for Weakly Semi-Supervised Oriented Object Detection | Wenhao Wu (City University of Hong Kong) · Hau San Wong (City University of Hong Kong) · Si Wu (South China University of Technology) · Tianyou Zhang (South China University of Technology) | | 1404 | JointSQ: Joint 
Sparsification-Quantization for Distributed Learning | Weiying Xie (None) · Haowei Li (None) · Ma Jitao (None) · Yunsong Li () · Jie Lei (Xi'an University of Electronic Science and Technology) · donglai Liu (Xi'an University of Electronic Science and Technology) · Leyuan Fang (None) | | 1405 | Endow SAM with Keen Eyes: Temporal-spatial Prompt Learning for Video Camouflaged Object Detection | Wenjun Hui (None) · Zhenfeng Zhu (Beijing Jiaotong University) · Shuai Zheng (Beijing Jiaotong University) · Yao Zhao (Beijing Jiaotong University) | | 1406 | MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation | Yanhui Wang (None) · Jianmin Bao (Microsoft) · Wenming Weng (None) · Ruoyu Feng (University of Science and Technology of China) · Dacheng Yin (University of Science and Technology of China) · Tao Yang (Xi'an JiaoTong University) · Jingxu Zhang (Research, Microsoft) · Qi Dai (Microsoft Research Asia) · Zhiyuan Zhao (Microsoft) · Chunyu Wang (Microsoft) · Kai Qiu (Microsoft) · Yuhui Yuan (Microsoft Research Asia) · Xiaoyan Sun (University of Science and Technology of China) · Chong Luo (Microsoft Research Asia) · Baining Guo (Microsoft Research) | | 1407 | NICE: Neurogenesis Inspired Contextual Encoding for Replay-free Class Incremental Learning | Mustafa B Gurbuz (Georgia Institute of Technology) · Jean Moorman (Georgia Institute of Technology) · Constantine Dovrolis (Georgia Institute of Technology) | | 1408 | Matching 2D Images in 3D: Metric Relative Pose from Metric Correspondences | Axel Barroso-Laguna (None) · Sowmya Munukutla (None) · Victor Adrian Prisacariu (None) · Eric Brachmann (None) | | 1409 | DeiT-LT: Distillation Strikes Back for Vision Transformer training on Long-Tailed Datasets | Harsh Rangwani (None) · Pradipto Mondal (Indian Institute of Science, Indian institute of science, Bangalore) · Mayank Mishra (CMU, Carnegie Mellon University) · Ashish Asokan (Indian Institute of Science, Indian institute of science, Bangalore) · R. 
Venkatesh Babu (Indian Institute of Science) | | 1410 | Learning for Transductive Threshold Calibration in Open-World Recognition | Qin ZHANG (Amazon) · DONGSHENG An (Amazon) · Tianjun Xiao (Amazon) · Tong He (Amazon Web Services) · Qingming Tang (Amazon, Alexa) · Ying Nian Wu (UCLA) · Joseph Tighe (Meta) · Yifan Xing (None) | | 1411 | MorpheuS: Neural Dynamic 360° Surface Reconstruction from Monocular RGB-D Video | Hengyi Wang (University College London, University of London) · Jingwen Wang (University College London) · Lourdes Agapito (University College London) | | 1412 | FocusMAE: Gallbladder Cancer Detection from Ultrasound Videos with Focused Masked Autoencoders | Soumen Basu (Indian Institute of Technology Delhi) · Mayuna Gupta (Indian Institute of Technology, Delhi) · Chetan Madan (Indian Institute of Technology, Delhi) · Pankaj Gupta (PGIMER Chandigarh) · Chetan Arora (Indian Institute of Technology Delhi) | | 1413 | LightOctree: Lightweight 3D Spatially-Coherent Indoor Lighting Estimation | Xuecan Wang (None) · Shibang Xiao (Beijing University of Aeronautics and Astronautics) · Xiaohui Liang (Zhongguancun Laboratory) | | 1414 | One More Step: A Versatile Plug-and-Play Module for Rectifying Diffusion Schedule Flaws and Enhancing Low-Frequency Controls | Minghui Hu (Nanyang Technological University) · Jianbin Zheng (South China University of Technology) · Chuanxia Zheng (University of Oxford) · Chaoyue Wang (JD Explore Academy) · Dacheng Tao (None) · Tat-Jen Cham (Nanyang Technological University) | | 1415 | Segmenting Whole Objects by Synthesizing Them | Ege Ozguroglu () · Ruoshi Liu (Columbia University) · Dídac Surís (Columbia University) · Dian Chen (Toyota Research Institute) · Achal Dave (None) · Pavel Tokmakov (Toyota Research Institute) · Carl Vondrick (Columbia University) | | 1416 | ChatScene: Knowledge-Enabled Safety-Critical Scenario Generation for Autonomous Vehicles | Jiawei Zhang (University of Illinois, Urbana Champaign) · Chejian Xu 
(University of Illinois at Urbana-Champaign) · Bo Li (UIUC) | | 1417 | Navigate Beyond Shortcuts: Debiased Learning through the Lens of Neural Collapse | Yining Wang (Fudan University) · Junjie Sun (Fudan University) · Chenyue Wang (Fudan University) · Mi Zhang (Fudan University) · Min Yang (Fudan University) | | 1418 | SD4Match: Learning to Prompt Stable Diffusion Model for Semantic Matching | Xinghui Li (University of Oxford) · Jingyi Lu (University of Hong Kong) · Kai Han (The University of Hong Kong) · Victor Adrian Prisacariu (None) | | 1419 | 3DGStream: On-the-Fly Training of 3D Gaussians for Efficient Streaming of Photo-Realistic Free-Viewpoint Videos | Jiakai Sun (Zhejiang University) · Han Jiao (Zhejiang University) · Guangyuan Li (None) · Zhanjie Zhang (Zhejiang University) · Lei Zhao (Zhejiang University) · Wei Xing (Zhejiang University) | | 1420 | VCoder: Versatile Visual Encoder for Accurate Object-Level Perception with Large Language Models | Jitesh Jain (Georgia Institute of Technology) · Jianwei Yang (Microsoft Research) · Humphrey Shi (U of Oregon \| UIUC \| PAIR) | | 1421 | TextCraft: Your Text Encoder Can be Image Quality Controller | Yanyu Li (Northeastern University) · Xian Liu (The Chinese University of Hong Kong) · Anil Kag (Snap Inc.) · Ju Hu (Snap Inc.) · Yerlan Idelbayev (Snap Inc.) · Dhritiman Sagar (Snap Inc.) · Yanzhi Wang (Northeastern University) · Sergey Tulyakov (Snap Inc.) · Jian Ren (Snap Inc.) 
| | 1422 | 3D Human Pose Perception from Egocentric Stereo Videos | Hiroyasu Akada (Max Planck Institute for Informatics) · Jian Wang (Max Planck Institute for Informatics) · Vladislav Golyanik (MPI for Informatics) · Christian Theobalt (MPI Informatik) | | 1423 | Generalized Large-Scale Data Condensation via Various Backbone and Statistical Matching | Shitong Shao (Southeast University) · Zeyuan Yin (Mohamed bin Zayed University of Artificial Intelligence) · Muxin Zhou (Mohamed bin Zayed University of Artificial Intelligence) · Xindong Zhang (The Hong Kong Polytechnic University, Hong Kong Polytechnic University) · Zhiqiang Shen (Mohamed bin Zayed University of Artificial Intelligence) | | 1424 | AAMDM: Accelerated Auto-regressive Motion Diffusion Model | Tianyu Li (Georgia Institute of Technology) · Calvin Zhuhan Qiao (University of British Columbia) · Ren Guanqiao (Beijing University of Aeronautics and Astronautics) · KangKang Yin (Simon Fraser University) · Sehoon Ha (Georgia Institute of Technology) | | 1425 | TexOct: Generating Textures of 3D Models with Octree-based Diffusion | Jialun Liu (Baidu) · Chenming Wu (None) · Xinqi Liu (Baidu Inc) · Xing Liu (Baidu) · Jinbo Wu (Baidu) · Haotian Peng (Baidu) · Chen Zhao (None) · Haocheng Feng (Baidu) · Jingtuo Liu (Baidu) · Errui Ding (Baidu Inc.) 
| | 1426 | OTE: Exploring Accurate Scene Text Recognition Using One Token | Jianjun Xu (University of Science and Technology of China) · Yuxin Wang (University of Science and Technology of China) · Hongtao Xie (University of Science and Technology of China) · Yongdong Zhang (University of Science and Technology of China) | | 1427 | OmniVid: A Generative Framework for Universal Video Understanding | Junke Wang (None) · Dongdong Chen (Microsoft Research) · Chong Luo (Microsoft Research Asia) · Bo He (None) · Lu Yuan (Microsoft) · Zuxuan Wu (Fudan University) · Yu-Gang Jiang (Fudan University) | | 1428 | Check, Locate, Rectify: A Training-Free Layout Calibration System for Text-to-Image Generation | Biao Gong (Alibaba Group) · Siteng Huang (Zhejiang University & Westlake University) · Yutong Feng (Alibaba Group) · Shiwei Zhang (Alibaba Group) · Yuyuan Li (Zhejiang University) · Yu Liu (Alibaba Group) | | 1429 | LQMFormer: Language-aware Query Mask Transformer for Referring Image Segmentation | Nisarg Shah (Johns Hopkins University) · Vibashan VS (Johns Hopkins University) · Vishal M. 
Patel (Johns Hopkins University) | | 1430 | Latent Modulated Function for Computational Optimal Continuous Image Representation | Zongyao He (Sun Yat-sen University) · Zhi Jin (Sun Yat-sen University) | | 1431 | Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation | Bingxin Ke (ETH Zurich) · Anton Obukhov (None) · Shengyu Huang (None) · Nando Metzger (ETH Zürich) · Rodrigo Caye Daudt (ETH Zurich) · Konrad Schindler (ETH Zurich) | | 1432 | LiDAR-based Person Re-identification | Wenxuan Guo () · Zhiyu Pan (Department of Automation, Tsinghua University) · Yingping Liang (None) · Ziheng Xi (Tsinghua University, Tsinghua University) · Zhi Chen Zhong (Tsinghua University, Tsinghua University) · Jianjiang Feng (Tsinghua University) · Jie Zhou (None) | | 1433 | Shallow-Deep Collaborative Learning for Unsupervised Visible-Infrared Person Re-Identification | Bin Yang (Wuhan University) · Jun Chen (Wuhan University) · Mang Ye (Wuhan University) | | 1434 | Spherical Mask: Coarse-to-Fine 3D Point Cloud Instance Segmentation with Spherical Representation | Sangyun Shin (University of Oxford) · Kaichen Zhou (Department of Computer Science, University of Oxford) · Madhu Vankadari (Department of Computer Science, University of Oxford) · Andrew Markham (University of Oxford) · Niki Trigoni (University of Oxford) | | 1435 | Neural Spline Fields for Burst Image Fusion and Layer Separation | Ilya Chugunov (Princeton University) · David Shustin (Princeton University) · Ruyu Yan (Princeton University) · Chenyang Lei (The Hong Kong University of Science and Technology) · Felix Heide (Department of Computer Science, Princeton University) | | 1436 | L2B: Learning to Bootstrap Robust Models for Combating Label Noise | Yuyin Zhou (UC Santa Cruz) · Xianhang Li (University of California, Santa Cruz) · Fengze Liu (ByteDance) · Qingyue Wei (Stanford University) · Xuxi Chen (University of Texas at Austin) · Lequan Yu (The University of Hong Kong) · Cihang Xie (University of 
California, Santa Cruz) · Matthew P. Lungren (Microsoft) · Lei Xing (Stanford University) | | 1437 | Deep Video Inverse Tone Mapping Based on Temporal Clues | Yuyao Ye (Peking University) · Ning Zhang (None) · Yang Zhao (Hefei University of Technology) · Hongbin Cao (ByteDance) · Ronggang Wang (Peking University Shenzhen Graduate School) | | 1438 | GS-IR: 3D Gaussian Splatting for Inverse Rendering | Zhihao Liang (None) · Qi Zhang (Tencent AI Lab) · Ying Feng (Tencent AI Lab) · Ying Shan (Tencent) · Kui Jia (South China University of Technology) | | 1439 | SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis | Ziqiao Peng (Renmin University of China) · Wentao Hu (Beijing University of Posts and Telecommunications) · Yue Shi (Psyche AI Inc.) · Xiangyu Zhu (None) · Xiaomei Zhang (None) · Hao Zhao (Tsinghua University, Tsinghua University) · Jun He (Renmin University of China) · Hongyan Liu (Tsinghua University, Tsinghua University) · Zhaoxin Fan (Renmin University of China, Tsinghua University) | | 1440 | Attack To Defend: Exploiting Adversarial Attacks for Detecting Poisoned Models | Samar Fares (None) · Karthik Nandakumar (Mohamed Bin Zayed University of Artificial Intelligence) | | 1441 | Dual-View Visual Contextualization for Web Navigation | Jihyung Kil (Ohio State University, Columbus) · Chan Hee Song (The Ohio State University) · Boyuan Zheng (Ohio State University, Columbus) · Xiang Deng (Google) · Yu Su (Ohio State University) · Wei-Lun Chao (Ohio State University) | | 1442 | D3still: Decoupled Differential Distillation for Asymmetric Image Retrieval | Yi Xie (South China University of Technology) · Yihong Lin (South China University of Technology) · Wenjie Cai (Meta) · Xuemiao Xu (South China University of Technology) · Huaidong Zhang (South China University of Technology) · Yong Du (Ocean University of China) · Shengfeng He (Singapore Management University) | | 1443 | Few-Shot Object Detection with Foundation Models | Guangxing Han 
(Columbia University) · Ser-Nam Lim (Meta AI) | | 1444 | Delving into the Trajectory Long-tail Distribution for Muti-object Tracking | Sijia Chen (Huazhong University of Science and Technology) · En Yu (Huazhong University of Science and Technology) · Jinyang Li (Huazhong University of Science and Technology) · Wenbing Tao (Huazhong University of Science and Technology) | | 1445 | Non-autoregressive Sequence-to-Sequence Vision-Language Models | Kunyu Shi (Amazon) · Qi Dong (Amazon) · Luis Goncalves (California Institute of Technology) · Zhuowen Tu (University of California, San Diego) · Stefano Soatto (AWS) | | 1446 | Seeing the Unseen: Visual Common Sense for Semantic Placement | Ram Ramrakhya (None) · Aniruddha Kembhavi (Allen Institute for Artificial Intelligence) · Dhruv Batra (FAIR (Meta) and Georgia Tech) · Zsolt Kira (Georgia Institute of Technology) · Kuo-Hao Zeng (Allen Institute for Artificial Intelligence) · Luca Weihs (Allen Institute for Artificial Intelligence) | | 1447 | MTLoRA: Low-Rank Adaptation Approach for Efficient Multi-Task Learning | Ahmed Agiza (None) · Marina Neseem (Brown University) · Sherief Reda (Brown University) | | 1448 | Inverse Rendering of Glossy Objects via the Neural Plenoptic Function and Radiance Fields | Haoyuan Wang (City University of Hong Kong) · Wenbo Hu (ByteDance) · Lei Zhu (City University of Hong Kong) · Rynson W.H. Lau (City University of Hong Kong) | | 1449 | PREGO: online mistake detection in PRocedural EGOcentric videos | Alessandro Flaborea (None) · Guido M. 
D'Amely di Melendugno (University of Roma "La Sapienza") · Leonardo Plini (University of Roma "La Sapienza") · Luca Scofano (University of Roma "La Sapienza") · Edoardo De Matteis (Sapienza University) · Antonino Furnari (University of Catania) · Giovanni Maria Farinella (University of Catania, Italy) · Fabio Galasso (None) | | 1450 | 3D LiDAR Mapping in Dynamic Environments using a 4D Implicit Neural Representation | Xingguang Zhong (Rheinische Friedrich-Wilhelms Universität Bonn) · Yue Pan (University of Bonn) · Cyrill Stachniss (University of Bonn) · Jens Behley (University of Bonn) | | 1451 | SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction | Yuanhui Huang (Tsinghua University) · Wenzhao Zheng (Tsinghua University, Tsinghua University) · Borui Zhang (Tsinghua University, Tsinghua University) · Jie Zhou (None) · Jiwen Lu (Tsinghua University) | | 1452 | SUGAR: Pre-training 3D Visual Representation for Robotics | Shizhe Chen (INRIA) · Ricardo Garcia Pinel (INRIA) · Ivan Laptev (INRIA Paris) · Cordelia Schmid (Inria / Google) | | 1453 | GaussianDreamer: Fast Generation from Text to 3D Gaussians by Bridging 2D and 3D Diffusion Models | Taoran Yi (Huazhong University of Science and Technology) · Jiemin Fang (Huawei Technologies Ltd.) · Junjie Wang (None) · Guanjun Wu (None) · Lingxi Xie (Huawei Technologies Ltd.) · Xiaopeng Zhang (Huawei Technologies Ltd.) · Wenyu Liu (Huazhong University of Science and Technology) · Qi Tian (Huawei Technologies Ltd.) 
· Xinggang Wang (Huazhong University of Science and Technology) | | 1454 | Active Generalized Category Discovery | Shijie Ma (Institute of Automation, Chinese Academy of Sciences) · Fei Zhu (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Zhun Zhong (University of Trento) · Xu-Yao Zhang (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) · Cheng-Lin Liu (Institute of automation, Chinese academy of science, Chinese Academy of Sciences) | | 1455 | CoG-DQA: Chain-of-Guiding Learning with Large Language Models for Diagram Question Answering | Shaowei Wang (Xi'an Jiaotong University) · Lingling Zhang (Xi'an Jiaotong University) · Longji Zhu (Xi'an Jiaotong University) · Tao Qin (Xi'an Jiaotong University) · Kim-Hui Yap (Nanyang Technological University) · Xinyu Zhang (None) · Jun Liu (Xi'an Jiaotong University) | | 1456 | ArGue: Attribute-Guided Prompt Tuning for Vision-Language Models | Xinyu Tian (Australian National University) · Shu Zou (Australian National University) · Zhaoyuan Yang (General Electric) · Jing Zhang (Australian National University) | | 1457 | Coherence As Texture -- Passive Textureless 3D Reconstruction by Self-interference | Wei-Yu Chen (Carnegie Mellon University) · Aswin C. 
Sankaranarayanan (Carnegie Mellon University) · Anat Levin (Weizmann Institute of Science) · Matthew O’Toole (Carnegie Mellon University) | | 1458 | A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives | Simone Peirone (Polytechnic Institute of Turin) · Francesca Pistilli (Polytechnic Institute of Turin) · Antonio Alliegro (Politecnico di Torino) · Giuseppe Averta (Polytechnic of Turin) | | 1459 | Hunting Attributes: Context Prototype-Aware Learning for Weakly Supervised Semantic Segmentation | Feilong Tang (Monash University) · Zhongxing Xu (Weill Cornell Medicine, Cornell University) · Zhaojun QU (Xi'an Jiaotong-Liverpool University) · Wei Feng (Monash University) · xingjian jiang (University of Michigan - Ann Arbor) · Zongyuan Ge (Monash University) | | 1460 | Compact 3D Gaussian Representation for Radiance Field | Joo Chan Lee (Sungkyunkwan University) · Daniel Rho (Korea Telecom Research) · Xiangyu Sun (None) · Jong Hwan Ko (Sungkyunkwan University (SKKU)) · Eunbyung Park (SKKU) | | 1461 | Unsupervised Universal Image Segmentation | Xudong Wang (Electrical Engineering & Computer Science Department, University of California Berkeley) · Dantong Niu (University of California, Berkeley) · Xinyang Han (None) · Long Lian (University of California, Berkeley) · Roei Herzig (Tel Aviv University) · Trevor Darrell (Electrical Engineering & Computer Science Department) | | 1462 | PTM-VQA: Efficient Video Quality Assessment Leveraging Diverse PreTrained Models from the Wild | Kun Yuan (Kuaishou Technology) · Hongbo Liu (BYD Auto Co., Ltd) · Mading Li (Kuaishou Technology) · Muyi Sun (Institute of automation, Chinese Academy of Sciences) · Ming Sun (Kuaishou Tech) · Jiachao Gong (Beijing Kuaishou ) · Jinhua Hao (Kuaishou Tech) · Chao Zhou (Peking University) · Yansong Tang () | | 1463 | FutureHuman3D: Forecasting Complex Long-Term 3D Human Behavior from Video Observations | Christian Diller (Technische Universität München) · Thomas 
Funkhouser (Princeton University) · Angela Dai () | | 1464 | FlowIE: Efficient Image Enhancement via Rectified Flow | Yixuan Zhu (None) · Wenliang Zhao (Automation, Tsinghua University, Tsinghua University) · Ao Li (Tsinghua University) · Yansong Tang () · Jie Zhou (None) · Jiwen Lu (Tsinghua University) | | 1465 | Combining Frame and GOP Embeddings for Neural Video Representation | Jens Eirik Saethre (ETH Zurich & Disney Research\|Studios) · Roberto Azevedo (Disney Research, Disney) · Christopher Schroers (Disney Research\|Studios, Disney) | | 1466 | OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation | Qidong Huang (University of Science and Technology of China) · Xiaoyi Dong (Microsoft) · Pan Zhang (Shanghai Artificial Intelligence Laboratory) · Bin Wang (Shanghai AI Laboratory) · Conghui He (None) · Jiaqi Wang (Shanghai AI Laboratory) · Dahua Lin (The Chinese University of Hong Kong) · Weiming Zhang (University of Science and Technology of China) · Nenghai Yu (University of Science and Technology of China) | | 1467 | TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models | Yushi Huang (SenseTime) · Ruihao Gong (SenseTime) · Jing Liu () · Tianlong Chen (Massachusetts Institute of Technology) · Xianglong Liu (BUAA) | | 1468 | Not All Classes Stand on Same Embeddings: Calibrating a Semantic Distance with Metric Tensor | Jae Park Park (None) · Gyoomin Lee (Dongguk University) · Seunggi Park (Dongguk University) · Sung In Cho (Dongguk University) | | 1469 | Pose Adapted Shape Learning for Large-Pose Face Reenactment | Gee-Sern Hsu (None) · Jie-Ying Zhang (National Taiwan University of Science and Technology) · Yu-Hsiang Huang (National Taiwan University of Science and Technology) · Wei-Jie Hong (National Taiwan University of Science and Technology) | | 1470 | Generative Rendering: Controllable 4D-Guided Video Generation with 2D Diffusion Models | Shengqu Cai (ETH Zurich & Stanford 
University) · Duygu Ceylan (Adobe Systems) · Matheus Gadelha (Adobe Systems) · Chun-Hao P. Huang (Adobe Systems) · Tuanfeng Y. Wang (None) · Gordon Wetzstein (Stanford University) | | 1471 | Improving Out-of-Distribution Generalization in Graphs via Hierarchical Semantic Environments | Yinhua Piao (Seoul National University) · Sangseon Lee (Seoul National University) · Yijingxiu Lu (Seoul National University) · Sun Kim (Seoul National University, Seoul National University) | | 1472 | Towards Understanding and Improving Adversarial Robustness of Vision Transformers | Samyak Jain () · Tanima Dutta (IIT BHU) | | 1473 | ProG: Prompting-to-simulate Generalized knowledge for Universal Cross-Domain Retrieval | Fang Kaipeng (None) · Jingkuan Song (University of Electronic Science and Technology of China,) · Lianli Gao (University of Electronic Science and Technology of China, Tsinghua University) · Pengpeng Zeng (University of Electronic Science and Technology of China) · Zhi-Qi Cheng (Carnegie Mellon University) · Xiyao LI (Kuaishou Technology) · Heng Tao Shen (University of Electronic Science and Technology of China) | | 1474 | ZePT: Zero-Shot Pan-Tumor Segmentation via Query-Disentangling and Self-Prompting | Yankai Jiang (Shanghai Artificial Intelligence Laboratory) · Zhongzhen Huang (None) · Rongzhao Zhang (Shanghai Artificial Intelligence Laboratory) · Xiaofan Zhang (Shanghai Jiao Tong University) · Shaoting Zhang (University of North Carolina at Charlotte) | | 1475 | Dynamic Cues-Assisted Transformer for Robust Point Cloud Registration | Hong Chen (Huazhong University of Science and Technology) · Pei Yan (Huazhong University of Science and Technology) · sihe xiang (None) · Yihua Tan (Huazhong University of Science and Technology) | | 1476 | Improved Self-Training for Test-Time Adaptation | Jing Ma (None) | | 1477 | Retrieval-Augmented Open-Vocabulary Object Detection | Jooyeon Kim (Korea University) · Eulrang Cho (Korea University) · Sehyung Kim (Korea University) · 
Hyunwoo J. Kim (Korea University) | | 1478 | Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation | Luca Barsellotti (University of Modena and Reggio Emilia) · Roberto Amoroso (NVIDIA) · Marcella Cornia (University of Modena and Reggio Emilia) · Lorenzo Baraldi (Università degli Studi di Modena e Reggio Emilia) · Rita Cucchiara (Università di Modena e Reggio Emilia) | | 1479 | NTO3D: Neural Target Object 3D Reconstruction with Segment Anything | Xiaobao Wei (University of the Chinese Academy of Sciences) · Renrui Zhang (MMLab of CUHK & Shanghai AI Laboratory) · Jiarui Wu (Beijing University of Aeronautics and Astronautics) · Jiaming Liu (Peking University) · Ming Lu (Intel Labs China) · Yandong Guo (OPPO Research Institute) · Shanghang Zhang (Peking University) | | 1480 | Structure-Aware Sparse-View X-ray 3D Reconstruction | Yuanhao Cai (Johns Hopkins University) · Jiahao Wang (Johns Hopkins University) · Alan L. Yuille (Johns Hopkins University) · Zongwei Zhou (Johns Hopkins University) · Angtian Wang (Johns Hopkins University) | | 1481 | NB-GTR: Narrow-Band Guided Turbulence Removal | Yifei Xia (Peking University) · Chu Zhou (Peking University) · Chengxuan Zhu (Peking University) · Minggui Teng (Peking University) · Chao Xu (Peking University) · Boxin Shi (None) | | 1482 | LangSplat: 3D Language Gaussian Splatting | Minghan Qin (Tsinghua University) · Wanhua Li (Harvard University) · Jiawei ZHOU (Tsinghua University) · Haoqian Wang (Tsinghua University, Tsinghua University) · Hanspeter Pfister (Harvard University) | | 1483 | Mudslide: A Universal Nuclear Instance Segmentation Method | Jun Wang (Peking University) | | 1484 | Retrieval-Augmented Embodied Agents | Yichen Zhu (Midea Group) · Zhicai Ou (AI Innovation Center, Midea Group) · Xiaofeng Mou (Midea Group) · Jian Tang (Midea Group) | | 1485 | Long-Tail Class Incremental Learning via Independent Sub-prototype Construction | Xi Wang (Xidian University) · Xu Yang 
(Xi'an University of Electronic Science and Technology) · Jie Yin (None) · Kun Wei (Xidian University) · Cheng Deng (Xidian University) |
1486 | Bidirectional Multi-Scale Implicit Neural Representations for Image Deraining | Xiang Chen (Nanjing University of Science and Technology) · Jinshan Pan (Nanjing University of Science and Technology) · Jiangxin Dong (Nanjing University of Science and Technology) |
1487 | Positive-Unlabeled Learning by Latent Group-Aware Meta Disambiguation | Lin Long (Zhejiang University) · Haobo Wang (Zhejiang University) · Zhijie Jiang (Zhejiang University) · Lei Feng (Nanyang Technological University) · Chang Yao (Zhejiang University) · Gang Chen (College of Computer Science and Technology, Zhejiang University) · Junbo Zhao (Zhejiang University) |
1488 | Contextrast: Contextual Contrastive Learning for Semantic Segmentation | Changki Sung (Korea Advanced Institute of Science & Technology) · Wanhee Kim (Korea Advanced Institute of Science & Technology) · Jungho An (Korea Advanced Institute of Science & Technology) · WooJu Lee (KAIST) · Hyungtae Lim (KAIST) · Hyun Myung (KAIST) |
1489 | DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data | Hanrong Ye () · Dan Xu (Department of Computer Science and Engineering, The Hong Kong University of Science and Technology) |
1490 | Text-conditional Attribute Alignment across Latent Spaces for 3D Controllable Face Image Synthesis | FeiFan Xu (None) · Rui Li (Shantou University) · Si Wu (South China University of Technology) · Yong Xu (Peng Cheng Laboratory) · Hau San Wong (City University of Hong Kong) |
1491 | MonoCD: Monocular 3D Object Detection with Complementary Depths | Longfei Yan (None) · Pei Yan (Huazhong University of Science and Technology) · Shengzhou Xiong (Huazhong University of Science and Technology) · Xuanyu Xiang (Huazhong University of Science and Technology) · Yihua Tan (Huazhong University of Science and Technology) |
1492 | JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation | Yu Zeng (None) · Vishal M. Patel (Johns Hopkins University) · Haochen Wang (Toyota Technological Institute at Chicago) · Xun Huang (NVIDIA) · Ting-Chun Wang (NVIDIA) · Ming-Yu Liu (NVIDIA) · Yogesh Balaji (NVIDIA) |
1493 | A Linear N-Point Solver for Line and Motion Estimation with Event Cameras | Ling Gao (ShanghaiTech University) · Daniel Gehrig (None) · Hang Su (None) · Davide Scaramuzza (University of Zurich) · Laurent Kneip (ShanghaiTech University) |
1494 | Training on Synthetic Data Beats Real Data in Multimodal Relation Extraction | Zilin Du (Nanyang Technological University) · Haoxin Li (Nanyang Technological University) · Xu Guo (Nanyang Technological University) · Boyang Li (Nanyang Technological University) |
1495 | HAVE-FUN: Human Avatar Reconstruction from Few-Shot Unconstrained Images | Xihe Yang (The Chinese University of Hong Kong, Shenzhen) · Xingyu Chen (Xiaobing.AI) · Daiheng Gao () · Finn Wong (Xiaobing.AI) · Xiaoguang Han (The Chinese University of Hong Kong, Shenzhen) · Baoyuan Wang (Xiaobing.ai) |
1496 | 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering | Guanjun Wu (None) · Taoran Yi (Huazhong University of Science and Technology) · Jiemin Fang (Huawei Technologies Ltd.) · Lingxi Xie (Huawei Technologies Ltd.) · Xiaopeng Zhang (Huawei Technologies Ltd.) · Wei Wei (Huazhong University of Science and Technology) · Wenyu Liu (Huazhong University of Science and Technology) · Qi Tian (Huawei Technologies Ltd.) · Xinggang Wang (Huazhong University of Science and Technology) |
1497 | Differentiable Information Bottleneck for Deterministic Multi-view Clustering | Xiaoqiang Yan () · Zhixiang Jin (Zhengzhou University) · Fengshou Han (Zhengzhou University) · Yangdong Ye (Zhengzhou University) |
1498 | SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering | Antoine Guédon (Ecole des Ponts ParisTech) · Vincent Lepetit (Ecole des Ponts ParisTech) |
1499 | R-Cyclic Diffuser: Reductive and Cyclic Latent Diffusion for 3D Clothed Human Digitalization | Kennard Chan (, A*STAR) · Fayao Liu (Institute for Infocomm Research, A*STAR) · Guosheng Lin (Nanyang Technological University) · Chuan-Sheng Foo (Centre for Frontier AI Research, A*STAR) · Weisi Lin (Nanyang Technological University) |
1500 | An Aggregation-Free Federated Learning for Tackling Data Heterogeneity | Yuan Wang (Institute of High Performance Computing, Singapore, A*STAR) · Huazhu Fu (Institute of High Performance Computing, Singapore, A*STAR) · Renuga Kanagavelu (Institute of High Performance Computing, Singapore, A*STAR) · Qingsong Wei (Agency for Science, Technology and Research (A*STAR)) · Yong Liu (Institute of High Performance Computing, Singapore, A*STAR) · Rick Goh (Institute of High Performance Computing, Singapore, A*STAR) |
1501 | Monkey: Image Resolution and Text Label Are Important Things for Large Multi-modal Models | Zhang Li (None) · Biao Yang (Huazhong University of Science and Technology) · Qiang Liu (Kingsoft Office) · Zhiyin Ma (Huazhong University of Science and Technology) · Shuo Zhang (Huazhong University of Science and Technology) · Jingxu Yang (Kingsoft Office Corporation Limited) · Yabo Sun (Kingsoft Office) · Yuliang Liu (Huazhong University of Science and Technology) · Xiang Bai (Huazhong University of Science and Technology) |
1502 | Zero-Reference Low-Light Enhancement via Physical Quadruple Priors | Wenjing Wang (Peking University) · Huan Yang (Microsoft) · Jianlong Fu (Microsoft) · Jiaying Liu (Peking University) |
1503 | Hybrid Proposal Refiner: Revisiting DETR Series from the Faster R-CNN Perspective | Jinjing Zhao (The University of Sydney) · Fangyun Wei (None) · Chang Xu (University of Sydney) |
1504 | DiffusionPoser: Real-time Human Motion Reconstruction From Arbitrary Sparse Sensors Using Autoregressive Diffusion | Tom Van Wouwe (Stanford University) · Seunghwan Lee (Stanford University) · Antoine Falisse (Stanford University) · Scott Delp (Stanford University) · Karen Liu (Computer Science Department, Stanford University) |
1505 | HumanRef: Single Image to 3D Human Generation via Reference-Guided Diffusion | Jingbo Zhang (City University of Hong Kong) · Xiaoyu Li (Tencent AI Lab) · Qi Zhang (Tencent AI Lab) · Yan-Pei Cao (Tencent ARC Lab) · Ying Shan (Tencent) · Jing Liao (City University of Hong Kong) |
1506 | CurveCloudNet: Processing Point Clouds with 1D Structure | Colton Stearns (None) · Alex Fu (Illumix) · Jiateng Liu (Department of Computer Science) · Jeong Joon Park (Stanford University) · Davis Rempe (NVIDIA) · Despoina Paschalidou (Stanford) · Leonidas Guibas (Stanford University) |
1507 | CA-Jaccard: Camera-aware Jaccard Distance for Person Re-identification | Yiyu Chen (Industrial Bank Co., Ltd) · Zheyi Fan (Beijing Institute of Technology) · Zhaoru Chen (Beijing Institute of Technology) · Yixuan Zhu (Beijing Institute of Technology) |
1508 | Towards Robust Event-guided Low-Light Image Enhancement: A Large-Scale Real-World Event-Image Dataset and Novel Approach | Guoqiang Liang (Hong Kong University of Science and Technology) · Kanghao Chen (Hong Kong University of Science and Technology) · Hangyu Li (Hong Kong University of Science and Technology) · Yunfan Lu (Hong Kong University of Science and Technology(GuangZhou)) · Lin Wang (Hong Kong University of Science and Technology) |
1509 | Learning Visual Prompt for Gait Recognition | Kang Ma () · Ying Fu (None) · Chunshui Cao (Watrix Technology) · Saihui Hou (Beijing Normal University) · Yongzhen Huang (Beijing Normal University) · Dezhi Zheng (None) |
1510 | SI-MIL: Taming Deep MIL for Self-Interpretability in Gigapixel Histopathology | Saarthak Kapse (State University of New York at Stony Brook) · Pushpak Pati (International Business Machines) · Srijan Das (University of North Carolina at Charlotte) · Jingwei Zhang (None) · Chao Chen (State University of New York, Stony Brook) · Maria Vakalopoulou (CentraleSupelec) · Joel Saltz (State University of New York at Stony Brook) · Dimitris Samaras (Stony Brook University) · Rajarsi Gupta (Academic medical center at State University of New York at Stony Brook) · Prateek Prasanna (State University of New York, Stony Brook) |
1511 | FCS: Feature Calibration and Separation for Non-Exemplar Class Incremental Learning | Qiwei Li (Peking University) · Yuxin Peng (Peking University) · Jiahaun Zhou (Peking University) |
1512 | Boosting Adversarial Transferability by Block Shuffle and Rotation | Kunyu Wang (The Chinese University of Hong Kong) · he xuanran (TikTok) · Wenxuan Wang (The Chinese University of Hong Kong) · Xiaosen Wang (Huazhong University of Science and Technology) |
1513 | Physical 3D Adversarial Attacks against Monocular Depth Estimation in Autonomous Driving | Junhao Zheng (Xi'an Jiaotong University) · Chenhao Lin (Xi'an Jiaotong University) · Jiahao Sun (Xi'an Jiaotong University) · Zhengyu Zhao (Xi'an Jiaotong University) · Qian Li (Xi'an Jiaotong University) · Chao Shen (Xi’an Jiaotong University) |
1514 | Discovering and Mitigating Visual Biases through Keyword Explanation | Younghyun Kim (KAIST) · Sangwoo Mo (None) · Minkyu Kim (KRAFTON, Inc.) · Kyungmin Lee (Korea Advanced Institute of Science & Technology) · Jaeho Lee (POSTECH) · Jinwoo Shin (Korea Advanced Institute of Science and Technology) |
1515 | XFibrosis: Explicit Vessel-Fiber Modeling for Fibrosis Staging from Liver Pathology Images | CHONG YIN (None) · Siqi Liu (Shenzhen Research Institute of Big Data) · Fei Lyu (Hong Kong Baptist University) · Jiahao Lu (Copenhagen University) · Sune Darkner (Copenhagen University) · Vincent Wong (The Chinese University of Hong Kong) · Pong C. Yuen (Hong Kong Baptist University) |
1516 | MM-Narrator: Narrating Long-form Videos with Multimodal In-Context Learning | Chaoyi Zhang (The University of Sydney, University of Sydney) · Kevin Lin (Microsoft) · Zhengyuan Yang (Microsoft) · Jianfeng Wang (Microsoft) · Linjie Li (Microsoft) · Chung-Ching Lin (Microsoft) · Zicheng Liu (Microsoft) · Lijuan Wang (Microsoft) |
1517 | GreedyViG: Dynamic Axial Graph Construction for Efficient Vision GNNs | Mustafa Munir (The University of Texas at Austin) · William Avery (None) · Md Mostafijur Rahman (University of Texas at Austin) · Radu Marculescu (University of Texas, Austin) |
1518 | GaussianEditor: Swift and Controllable 3D Editing with Gaussian Splatting | Yiwen Chen (Nanyang Technological University) · Zilong Chen (Tsinghua University) · Chi Zhang (Tencent) · Feng Wang (Tsinghua University, Tsinghua University) · Xiaofeng Yang (Nanyang Technological University) · Yikai Wang (Tsinghua University) · Zhongang Cai (Nanyang Technological University) · Lei Yang (The Chinese University of Hong Kong) · Huaping Liu (Tsinghua University, Tsinghua University) · Guosheng Lin (Nanyang Technological University) |
1519 | MoML: Online Meta Adaptation for 3D Human Motion Prediction | Xiaoning Sun (Nanjing University of Science and Technology) · Huaijiang Sun (Nanjing University of Science and Technology) · Bin Li (Nanjing University of Science and Technology) · Dong Wei (Nanjing University of Science and Technology) · Weiqing Li (Nanjing University of Science and Technology) · Jianfeng Lu (Nanjing University of Science and Technology) |
1520 | Move as You Say, Interact as You Can: Language-guided Human Motion Generation with Scene Affordance | Zan Wang (None) · Yixin Chen (BIGAI) · Baoxiong Jia (University of California, Los Angeles) · Puhao Li (Department of Automation, Tsinghua University) · Jinlu Zhang (Peking University) · Jingze Zhang (Tsinghua University, Tsinghua University) · Tengyu Liu (None) · Yixin Zhu (Peking University) · Wei Liang (Beijing Institute of Technology) · Siyuan Huang (Beijing Institute of General Artificial Intelligence) |
1521 | Exploring Regional Clues in CLIP for Zero-Shot Semantic Segmentation | Yi Zhang (Beihang University) · Meng-Hao Guo (Tsinghua University, Tsinghua University) · Miao Wang (Beihang University) · Shi-Min Hu (Tsinghua University, Tsinghua University) |
1522 | Improving Graph Contrastive Learning via Adaptive Positive Sampling | Jiaming Zhuo (Hebei University of Technology) · Feiyang Qin (None) · Can Cui (Hebei University of Technology) · Kun Fu (Hebei University of Technology) · Bingxin Niu (Hebei University of Technology) · Mengzhu Wang (Hebei University of Technology) · Yuanfang Guo (Beihang University) · Chuan Wang (institute of information engineering) · Zhen Wang (None) · Xiaochun Cao (SUN YAT-SEN UNIVERSITY) · Liang Yang (Hebei University of Technology) |
1523 | Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling | Shentong Mo (CMU, Carnegie Mellon University) · Pedro Morgado (None) |
1524 | PEM: Prototype-based Efficient MaskFormer for Image Segmentation | Niccolò Cavagnero (Polytechnic Institute of Turin) · Gabriele Rosi (Polytechnic Institute of Turin) · Claudia Cuttano (Polytechnic Institute of Turin) · Francesca Pistilli (Polytechnic Institute of Turin) · Marco Ciccone (Politecnico di Torino) · Giuseppe Averta (Polytechnic of Turin) · Fabio Cermelli (Politecnico di Torino) |
1525 | VILA: On Pre-training for Visual Language Models | Ji Lin (Massachusetts Institute of Technology) · Danny Yin (NVIDIA) · Wei Ping (NVIDIA) · Pavlo Molchanov (NVIDIA) · Mohammad Shoeybi (NVIDIA) · Song Han (Massachusetts Institute of Technology) |
1526 | Dr2Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning | Chen Zhao (King Abdullah University of Science and Technology (KAUST)) · Shuming Liu (KAUST) · Karttikeya Mangalam (University of California Berkeley) · Guocheng Qian (KAUST) · Fatimah Zohra (King Abdullah University of Science and Technology) · Abdulmohsen Alghannam (University of Virginia, Charlottesville) · Jitendra Malik (University of California at Berkeley) · Bernard Ghanem (KAUST) |
1527 | Diffusion-EDFs: Bi-equivariant Denoising Generative Modeling on SE(3) for Visual Robotic Manipulation | Hyunwoo Ryu (Yonsei University) · Jiwoo Kim (Yonsei University) · Hyunseok An (Yonsei University) · Junwoo Chang (Yonsei University) · Joohwan Seo (University of California, Berkeley) · Taehan Kim (Samsung) · Yubin Kim (Massachusetts Institute of Technology) · Chaewon Hwang (Ewha Women's University) · Jongeun Choi (Yonsei University) · Roberto Horowitz (University of California, Berkeley) |
1528 | Vision-and-Language Navigation via Causal Learning | Liuyi Wang (Tongji University) · Zongtao He (Tongji University) · Ronghao Dang (Tongji University) · mengjiao shen (Tongji University) · Chengju Liu (Tongji University) · Qijun Chen (Tongji University) |
1529 | BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics | Wenqian Zhang (ShanghaiTech University) · Molin Huang (Shanghaitech University) · Yuxuan Zhou (None) · Juze Zhang (ShanghaiTech University) · Jingyi Yu (ShanghaiTech University) · Jingya Wang (ShanghaiTech University) · Lan Xu (None) |
1530 | A Closer Look at the Few-Shot Adaptation of Large Vision-Language Models | Julio Silva-Rodríguez (École de technologie supérieure, Université du Québec) · Sina Hajimiri (École de technologie supérieure, Université du Québec) · Ismail Ben Ayed (ETS Montreal) · Jose Dolz (École de technologie supérieure) |
1531 | A noisy elephant in the room: Is your out-of-distribution detector robust to label noise? | Galadrielle Humblot-Renaux (Aalborg University) · Sergio Escalera (Computer Vision Center) · Thomas B. Moeslund (Aalborg University) |
1532 | Learning with Structural Labels for Learning with Noisy Labels | Noo-ri Kim (Sungkyunkwan University) · Jin-Seop Lee (Sungkyunkwan University) · Jee-Hyong Lee (Sungkyunkwan University) |
1533 | What If the TV Was Off? Examining Counterfactual Reasoning Abilities of Multi-modal Language Models | Letian Zhang (Tongji University) · Xiaotong Zhai (University of Warwick) · Zhongkai Zhao (National University of Singapore) · Yongshuo Zong (School of Informatics, University of Edinburgh) · Xin Wen (The University of Hong Kong) · Bingchen Zhao (None) |
1534 | Rethinking Diffusion Model for Multi-Contrast MRI Super-Resolution | Guangyuan Li (None) · Chen Rao (Zhejiang University) · Juncheng Mo (Zhejiang University) · Zhanjie Zhang (Zhejiang University) · Wei Xing (Zhejiang University) · Lei Zhao (Zhejiang University) |
1535 | Bayesian Exploration of Pre-trained Models for Low-shot Image Classification | Yibo Miao (Shanghai Jiaotong University) · Yu lei (None) · Feng Zhou (Renmin University of China) · Zhijie Deng (Shanghai Jiaotong University) |
1536 | PLGSLAM: Progressive Neural Scene Represenation with Local to Global Bundle Adjustment | Tianchen Deng (None) · Guole Shen (None) · Tong Qin (Shanghai Jiaotong University) · jianyu wang (Shanghai Jiao Tong University) · Wentao Zhao (Shanghai Jiao Tong University) · Jingchuan Wang (None) · Danwei Wang (Nanyang Technological University) · Weidong Chen (Sha