Extracting Unimodal Features #477

sreebhattacharyya · 2024-09-26T03:11:51Z

Hello! I am trying to use Qwen-VL to extract unimodal features for a given input image and its accompanying text query. How can that be done? I am aware that models like BLIP-2 expose a direct API (extract_features) for this, but how can it be achieved with Qwen-VL?

thusinh1969 · 2024-10-06T09:20:26Z

Exactly what I was about to ask. How do we get encoder embeddings from Qwen2-VL for text-only, image-only, or combined image/text input, i.e. the extracted features?

Thanks,
Steve
