Publication Category: Vision-Language Models