
Open Vocabulary Segmentation with Prompt Learning

This repository contains the codebase of prompt learning techniques integrated with CAT-Seg (CVPR'24) to adapt the vision-language model CLIP to the downstream task of semantic segmentation in an open-vocabulary setting.

The following prompt learning techniques are included in this repository:

  • Context Optimization CoOp (IJCV'22)


    • Models a prompt's context with a set of learnable vectors, which are optimized by minimizing the task loss

    • Instead of the handcrafted template "a photo of a [CLASS]", it uses learnable context vectors as the prompt

      • e.g. "X X X X [CLASS]", where each X is a learnable vector
    • The integration of this technique into CAT-Seg can be found in class CLIP of ./catseg/third_party/model_vpt.py on main
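To illustrate the idea, here is a minimal, hypothetical PyTorch sketch of CoOp-style context optimization; it is not code from this repository, and names such as PromptLearner, n_ctx, and cls_emb are illustrative placeholders:

```python
import torch
import torch.nn as nn

class PromptLearner(nn.Module):
    """Learnable context vectors prepended to each class-name embedding,
    i.e. the "X X X X [CLASS]" template with trainable X's."""
    def __init__(self, n_ctx=4, ctx_dim=512, n_classes=10):
        super().__init__()
        # n_ctx learnable context vectors, shared across all classes
        self.ctx = nn.Parameter(torch.randn(n_ctx, ctx_dim) * 0.02)
        # Stand-in for frozen [CLASS] token embeddings from CLIP's text encoder
        self.register_buffer("cls_emb", torch.randn(n_classes, 1, ctx_dim))

    def forward(self):
        n_classes = self.cls_emb.shape[0]
        # Broadcast shared context over classes: (C, n_ctx, D)
        ctx = self.ctx.unsqueeze(0).expand(n_classes, -1, -1)
        # Prepend context to each class embedding: (C, n_ctx + 1, D)
        return torch.cat([ctx, self.cls_emb], dim=1)

learner = PromptLearner()
prompts = learner()          # fed to the (frozen) text encoder in practice
print(prompts.shape)         # torch.Size([10, 5, 512])
```

Only self.ctx receives gradients; the CLIP encoders stay frozen, which is what makes the method cheap to tune.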

  • Conditional Context Optimization CoCoOp (CVPR'22)


    • It follows a similar approach to CoOp, but in this case the context vectors are conditioned on the image features

    • This augments the learnable prompts with the image context as a prior

    • The integration of this technique into CAT-Seg can be found in class CLIP of ./catseg/third_party/model_vpt.py on branch CoCoOp
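The image conditioning in CoCoOp is typically realized with a lightweight meta-network that maps image features to a per-image shift added to the shared context vectors. The sketch below is a hypothetical illustration of that pattern, not code from this repository:

```python
import torch
import torch.nn as nn

class MetaNet(nn.Module):
    """CoCoOp-style meta-network: maps image features to a bias that
    conditions the shared learnable context on each input image."""
    def __init__(self, feat_dim=512, ctx_dim=512):
        super().__init__()
        # Small bottleneck MLP keeps the added parameter count low
        self.net = nn.Sequential(
            nn.Linear(feat_dim, feat_dim // 16),
            nn.ReLU(inplace=True),
            nn.Linear(feat_dim // 16, ctx_dim),
        )

    def forward(self, image_features, ctx):
        # image_features: (B, feat_dim); ctx: (n_ctx, ctx_dim) shared context
        bias = self.net(image_features).unsqueeze(1)   # (B, 1, ctx_dim)
        # Each image gets its own shifted copy of the context: (B, n_ctx, ctx_dim)
        return ctx.unsqueeze(0) + bias

meta = MetaNet()
ctx = torch.randn(4, 512)            # shared learnable context (as in CoOp)
img_feats = torch.randn(3, 512)      # CLIP image features for a batch of 3
cond_ctx = meta(img_feats, ctx)      # image-conditioned prompts, (3, 4, 512)
```

Because the bias depends on the image, the resulting prompts generalize better to unseen classes than the static context of CoOp.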

  • Textual-based Class-aware Prompt tuning for Visual-Language Model TCP (CVPR'24)


    • This technique proposes to inject textual knowledge into the learnable prompts

    • This enhances generalizability to unseen classes by incorporating prior textual knowledge into the fine-tuned learnable prompts

    • The integration of this technique into CAT-Seg can be found in class CLIP of ./catseg/third_party/model_vpt.py on branch TCP
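A rough way to picture TCP's class-aware prompting: a small adapter projects frozen class-name text embeddings into prompt tokens, so each class contributes its own textual prior to the learned prompt. The following is a simplified, hypothetical sketch of that mechanism (the adapter name and shapes are assumptions, not this repository's code):

```python
import torch
import torch.nn as nn

class TextKnowledgeAdapter(nn.Module):
    """TCP-style adapter: turns frozen class-name embeddings into
    class-aware prompt tokens injected alongside learnable context."""
    def __init__(self, emb_dim=512, n_tokens=2):
        super().__init__()
        self.n_tokens = n_tokens
        # Linear projection from one class embedding to n_tokens prompt tokens
        self.proj = nn.Linear(emb_dim, n_tokens * emb_dim)

    def forward(self, class_embeddings):
        # class_embeddings: (C, emb_dim) frozen CLIP text features per class
        C, D = class_embeddings.shape
        tokens = self.proj(class_embeddings)           # (C, n_tokens * D)
        # One set of class-aware prompt tokens per class: (C, n_tokens, D)
        return tokens.view(C, self.n_tokens, D)

adapter = TextKnowledgeAdapter()
cls_feats = torch.randn(5, 512)      # frozen embeddings for 5 class names
cls_prompts = adapter(cls_feats)     # class-aware tokens, (5, 2, 512)
```

Since the input embeddings already encode class semantics, the prompts differ per class rather than being a single shared context, which is the source of the improved unseen-class behavior described above.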

About

Prompt Learning with CLIP for OVSS
