AI News Hub

Reward-Guided Semantic Evolution for Test-time Adaptive Object Detection

cs.CV updates on arXiv.org
Lihua Zhou, Mao Ye, Xiatian Zhu, Nianxin Li, Changyi Ma, Shuaifeng Li, Yitong Qin, Hongbin Liu, Jiebo Luo, Zhen Lei

arXiv:2605.04531v1 Announce Type: new Abstract: Open-vocabulary object detection with vision-language models (VLMs) such as Grounding DINO suffers from performance degradation under test-time distribution shifts, primarily due to semantic misalignment between text embeddings and the shifted visual embeddings of region proposals. Recent test-time adaptive object detection methods for VLM-based detectors either rely on costly backpropagation or bypass semantic misalignment via external memory; none directly and efficiently aligns text and vision in a training-free manner. To address this, we propose Reward-Guided Semantic Evolution (RGSE), a training-free framework that directly refines the text embeddings at test time. Inspired by evolutionary search, RGSE treats text embedding adaptation as a semantic search process: it perturbs text embeddings to generate candidate variants, scores each candidate by its cosine similarity with current and historical high-confidence visual proposals (the reward signal), and fuses the candidates into a refined embedding through reward-weighted averaging. Without any backpropagation, RGSE achieves state-of-the-art performance across multiple detection benchmarks while adding minimal computational overhead. Our code will be open-sourced upon publication.
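The perturb, evaluate, and fuse loop described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the perturbation distribution, the number of candidates, or how rewards become fusion weights, so the Gaussian noise, the softmax weighting, and the names `refine_text_embedding`, `num_candidates`, `sigma`, and `temperature` are all assumptions made for illustration. The `visual_embs` argument stands in for the pooled current and historical high-confidence proposal embeddings.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def refine_text_embedding(text_emb, visual_embs, num_candidates=16,
                          sigma=0.05, temperature=0.1, rng=None):
    """Hypothetical one-step refinement of a single class text embedding.

    text_emb:    (d,) text embedding for one class.
    visual_embs: (n, d) high-confidence proposal embeddings
                 (current image plus any stored history).
    Returns the refined (d,) embedding and the per-candidate rewards.
    """
    rng = np.random.default_rng() if rng is None else rng
    text_emb = l2_normalize(text_emb)
    visual_embs = l2_normalize(visual_embs)

    # Perturb: Gaussian noise is an assumption; the original embedding
    # is kept as candidate 0 so the search can fall back to it.
    noise = rng.normal(scale=sigma, size=(num_candidates, text_emb.shape[0]))
    candidates = l2_normalize(np.vstack([text_emb, text_emb + noise]))

    # Evaluate: reward is the mean cosine similarity to the proposals.
    rewards = (candidates @ visual_embs.T).mean(axis=1)

    # Fuse: softmax over rewards (an assumed weighting scheme), then a
    # reward-weighted average of the candidates, renormalized.
    logits = (rewards - rewards.max()) / temperature
    weights = np.exp(logits) / np.exp(logits).sum()
    refined = l2_normalize(weights @ candidates)
    return refined, rewards
```

No gradients are computed anywhere in this sketch; the only operations are random sampling, matrix products, and a weighted average, which is consistent with the training-free, backpropagation-free framing in the abstract. A lower `temperature` concentrates the fusion on the highest-reward candidates, while a higher value approaches a plain average.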