Abstract: Large vision-language models (LVLMs) have demonstrated remarkable capabilities in autonomous driving scene understanding tasks. However, these models occasionally generate hallucinatory ...
Semantic segmentation of remote sensing images is pivotal for comprehensive Earth observation, but the demand for interpreting new object categories, coupled with the high expense of manual annotation ...