PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model

Zheng Zhang*, Yeyao Ma*, Enming Zhang*, Xiang Bai

* Equal Contribution

arXiv Paper

Features

A powerful extension of the Large Multi-modal Model for generic (panoptic, instance, semantic) segmentation, referring segmentation, and interactive segmentation.
Supports joint training across multiple segmentation tasks and vision-language tasks.
Demonstrates zero-shot capabilities on unseen tasks, such as open-vocabulary segmentation, generalized referring segmentation, and video object segmentation.

Updates

Release evaluation code
Release training code

Installation

See Installation instructions.

Getting Started

See Preparing Datasets for PSALM.

See Getting Started with PSALM.

Model Zoo

Download PSALM here.

Citation

If you find this work useful for your research, please cite it using the following BibTeX entry.

@inproceedings{zhang2025psalm,
  title={Psalm: Pixelwise segmentation with large multi-modal model},
  author={Zhang, Zheng and Ma, Yeyao and Zhang, Enming and Bai, Xiang},
  booktitle={European Conference on Computer Vision},
  pages={74--91},
  year={2025},
  organization={Springer}
}

Acknowledgement

Thanks to the awesome works Mask2former, Mask2former-Simplify and LLaVA. The code is based on these works.

