and look at road maps to find the entrance to a highway. To address the question of whether large vision-language models (LVLMs) can read maps and find the right route as humans do ...