arXiv:2407.02075

Label Anything: Multi-Class Few-Shot Semantic Segmentation with Visual Prompts

Published on Jul 2, 2024

Abstract

AI-generated summary: Label Anything, a transformer-based architecture, achieves state-of-the-art performance in few-shot semantic segmentation by using diverse visual prompts and supporting multi-class classification.

Few-shot semantic segmentation aims to segment objects from previously unseen classes using only a limited number of labeled examples. In this paper, we introduce Label Anything, a novel transformer-based architecture designed for multi-prompt, multi-way few-shot semantic segmentation. Our approach leverages diverse visual prompts -- points, bounding boxes, and masks -- to create a highly flexible and generalizable framework that significantly reduces annotation burden while maintaining high accuracy. Label Anything makes three key contributions: (i) we introduce a new task formulation that relaxes conventional few-shot segmentation constraints by supporting various types of prompts, multi-class classification, and enabling multiple prompts within a single image; (ii) we propose a novel architecture based on transformers and attention mechanisms; and (iii) we design a versatile training procedure allowing our model to operate seamlessly across different N-way K-shot and prompt-type configurations with a single trained model. Our extensive experimental evaluation on the widely used COCO-20^i benchmark demonstrates that Label Anything achieves state-of-the-art performance among existing multi-way few-shot segmentation methods, while significantly outperforming leading single-class models when evaluated in multi-class settings. Code and trained models are available at https://github.com/pasqualedem/LabelAnything.
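
To make the task formulation concrete, the sketch below shows one way an N-way K-shot episode with mixed visual prompts (points, boxes, masks) could be represented. This is an illustrative assumption, not the paper's actual API; all class names and fields here are hypothetical, and the repository linked above defines the real data structures.

```python
# Hypothetical sketch of an N-way K-shot episode with mixed visual prompts.
# Not the Label Anything API; names and fields are illustrative assumptions.
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class VisualPrompt:
    """A single annotation on a support image for one class."""
    class_id: int                                              # which of the N classes this prompt labels
    points: Optional[List[Tuple[float, float]]] = None         # (x, y) foreground clicks
    box: Optional[Tuple[float, float, float, float]] = None    # (x1, y1, x2, y2) bounding box
    mask_path: Optional[str] = None                            # binary mask file, if available


@dataclass
class SupportImage:
    """An annotated support image; multiple prompts per image are allowed."""
    image_path: str
    prompts: List[VisualPrompt] = field(default_factory=list)


@dataclass
class Episode:
    """One N-way K-shot episode: N classes, K annotated support images each."""
    class_ids: List[int]            # the N novel classes to segment
    support: List[SupportImage]     # support set (any mix of prompt types)
    query_image_path: str           # image to segment into the N classes plus background


# Example: a 2-way 1-shot episode mixing a box prompt and a point prompt.
episode = Episode(
    class_ids=[7, 21],
    support=[
        SupportImage("support_0.jpg", [VisualPrompt(class_id=7, box=(12.0, 30.0, 180.0, 240.0))]),
        SupportImage("support_1.jpg", [VisualPrompt(class_id=21, points=[(96.0, 120.0)])]),
    ],
    query_image_path="query.jpg",
)
print(f"{len(episode.class_ids)}-way episode with {len(episode.support)} support images")
```

A single trained model that accepts episodes of this shape, regardless of N, K, or prompt type, is what the abstract's third contribution (the versatile training procedure) refers to.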
