I get Error with CONFIG = {"generation_config": {"response_modalities": ["AUDIO","TEXT"]}} in gemini-2/live_api_starter.py #386

abdul7235 · 2024-12-27T14:30:43Z

Description of the bug:

From the documentation:

https://ai.google.dev/api/multimodal-live

I believe I can get responses in multiple modalities, but when I run the above CONFIG in my code I get the following error:

websockets.exceptions.ConnectionClosedError: received 1007 (invalid frame payload data) Request trace id: a0bb7a2dd8834b47, [ORIGINAL ERROR] generic::invalid_argument: Error in program Instantiation for language; then sent 1007 (invalid frame payload data) Request trace id: a0bb7a2dd8834b47, [ORIGINAL ERROR] generic::invalid_argument: Error in program Instantiation for language

Why am I getting this error?

Actual vs expected behavior:

Is it possible to get a response in audio and text simultaneously?

If yes, please help me sort it out.

If no, then for goodness' sake it must be stated clearly in the documentation!

Any other information you'd like to share?

I would appreciate it if Google and GCP had properly organized, clear documentation for their services, APIs, and SDKs. Integrating Google's services is a horrible experience because the documentation is so scattered and vague.

LarsDu · 2024-12-29T22:59:32Z

I'm getting the same error. It would be nice to get both text and audio at the same time. This is particularly useful for generating dialogues for things like games...

Giom-V · 2024-12-30T14:24:24Z

Hello @abdul7235 and @LarsDu,

At the moment, multimodal output is not publicly available. You can only get audio using the Live API, and text using the "classic" ones.

I'll see what can be done to make that clear in the documentation.
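As a minimal sketch of that limitation, a config can be checked locally before the session is opened, so the failure shows up as a readable Python error rather than a 1007 close on the websocket. The helper name `check_response_modalities` is hypothetical and not part of any SDK, and the rule it enforces (exactly one response modality per session) is an assumption based on the reply above:

```python
def check_response_modalities(config: dict) -> str:
    """Return the single requested modality, or raise ValueError.

    Hypothetical helper: assumes the Live API accepts exactly one
    response modality per session, as described in this thread.
    """
    generation_config = config.get("generation_config", {})
    modalities = generation_config.get("response_modalities", [])
    if len(modalities) != 1:
        raise ValueError(
            f"expected exactly one response modality, got {modalities!r}"
        )
    return modalities[0]


# The config from the issue title is rejected locally...
BAD_CONFIG = {"generation_config": {"response_modalities": ["AUDIO", "TEXT"]}}
# ...while a single-modality config passes.
GOOD_CONFIG = {"generation_config": {"response_modalities": ["AUDIO"]}}
```

Calling `check_response_modalities(BAD_CONFIG)` raises `ValueError`, while `check_response_modalities(GOOD_CONFIG)` returns `"AUDIO"`.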

abdul7235 · 2024-12-30T16:47:33Z

@Giom-V

Will multimodal output be available in the coming months?

Also, could you guide me on how to gain access to the non-public multimodal API?

kshitij01042002 · 2025-01-03T11:23:04Z

Is it possible to get the Audio and function calling? @Giom-V

simix · 2025-01-04T11:35:56Z

This happens when you send unsafe prompts.
I'm looking for a way to block it. I tried the following and it doesn't work:

self.safety_settings = [
    {
        "category": "HARM_CATEGORY_DANGEROUS",
        "threshold": "BLOCK_NONE",
    },
    {
        "category": "HARM_CATEGORY_HARASSMENT",
        "threshold": "BLOCK_NONE",
    },
    {
        "category": "HARM_CATEGORY_HATE_SPEECH",
        "threshold": "BLOCK_NONE",
    },
    {
        "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
        "threshold": "BLOCK_NONE",
    },
    {
        "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
        "threshold": "BLOCK_NONE",
    },
    {
        "category": "HARM_CATEGORY_SEXUAL",
        "threshold": "BLOCK_NONE",
    }
]

simix · 2025-01-04T11:50:28Z

Now this one works for me:

self.safety_settings = [
    {
        "category": "HARM_CATEGORY_HARASSMENT",
        "threshold": "BLOCK_NONE"
    },
    {
        "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
        "threshold": "BLOCK_NONE"
    },
    {
        "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
        "threshold": "BLOCK_NONE"
    },
    {
        "category": "HARM_CATEGORY_HATE_SPEECH",
        "threshold": "BLOCK_NONE"
    }
]

kshitij01042002 · 2025-01-06T05:48:16Z

I am getting this error, any idea why I might be getting this?

Error in Gemini session: received 1007 (invalid frame payload data) Request trace id: 4bdc******bb063, No matching per-key-config for API key_salt: 7448*****67170175; then sent 1007 (invalid frame payload data) Request trace id: 4bdc6000403bb063, No matching per-key-config for API key_salt: 74484*****170175

Giom-V · 2025-01-06T13:31:02Z

> Will multimodal output be available in the coming months?

Yes it will.

> Also, could you guide me on how to gain access to the non-public multimodal API?

@abdul7235, sorry, but this is not a program you can request to join. The only way is to be very active (like GDEs) and we will reach out to you.

Giom-V · 2025-01-06T13:32:24Z

> Is it possible to get the Audio and function calling? @Giom-V

@kshitij01042002 This notebook will show you how to use function calling with the live API: https://github.com/google-gemini/cookbook/blob/main/gemini-2/live_api_tool_use.ipynb

Giom-V · 2025-01-06T13:35:04Z

> I am getting this error, any idea why I might be getting this?
>
> Error in Gemini session: received 1007 (invalid frame payload data) Request trace id: 4bdc******bb063, No matching per-key-config for API key_salt: 7448*****67170175; then sent 1007 (invalid frame payload data) Request trace id: 4bdc6000403bb063, No matching per-key-config for API key_salt: 74484*****170175

@kshitij01042002 I'm guessing you're not using the right API key. It should start with "AIza...", which does not seem to be the case for the one you are using.

You need to generate it on AI Studio as documented here.
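A quick local check along those lines might look like the sketch below. The function name `looks_like_ai_studio_key` is hypothetical, and the prefix test is only the heuristic mentioned in the reply above, not a guarantee that the key is valid or enabled for the Live API:

```python
def looks_like_ai_studio_key(api_key: str) -> bool:
    """Heuristic only: checks for the "AIza" prefix mentioned in this thread.

    A key that passes can still be rejected by the server; a key that
    fails was most likely not generated in AI Studio.
    """
    return api_key.startswith("AIza")


def require_ai_studio_key(api_key: str) -> str:
    """Return the key unchanged, or raise ValueError before connecting."""
    if not looks_like_ai_studio_key(api_key):
        raise ValueError('API key does not start with "AIza"; check its source')
    return api_key
```

Running `require_ai_studio_key` on the key before opening the websocket turns a wrong-key mistake into an immediate local error.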

Giom-V · 2025-01-06T13:40:37Z

> This happens when you send unsafe prompts.
> I'm looking for a way to block it. I tried the following and it doesn't work:

@simix I think your mistake is that you were using HARM_CATEGORY_DANGEROUS instead of HARM_CATEGORY_DANGEROUS_CONTENT. Here's the related documentation for reference.
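To catch this kind of naming mistake before sending the request, the settings can be compared against a list of expected names. This is a rough sketch: the helper `unknown_categories` is hypothetical, and `KNOWN_CATEGORIES` contains only the four names from the snippet reported as working in this thread, so it may be incomplete compared with the HarmCategory reference:

```python
# Category names taken from the snippet that worked earlier in this thread.
# Assumption: this set may be incomplete; consult the HarmCategory reference.
KNOWN_CATEGORIES = {
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
}


def unknown_categories(safety_settings: list) -> list:
    """Return the category names that are not in KNOWN_CATEGORIES, in order."""
    return [
        setting["category"]
        for setting in safety_settings
        if setting["category"] not in KNOWN_CATEGORIES
    ]
```

Applied to the first (non-working) list posted above, this flags `HARM_CATEGORY_DANGEROUS` and `HARM_CATEGORY_SEXUAL`.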

abdul7235 · 2025-01-15T05:50:21Z

@Giom-V I just need to reconfirm: will I be able to get a response from Gemini in audio and text together in the upcoming version?

For example, if I ask "Hello Gemini, tell me about the weather," can I get the response as audio and also get the text of what Gemini is speaking? I mean, I need the same response in both audio and text.

Giom-V · 2025-01-15T09:41:07Z

@abdul7235 I don't think it will be possible, as Gemini generates the audio output directly, without using a TTS mechanism. If you want both (and I can see why), I think you'll have to generate the text and then use a TTS service to generate the audio.
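The two-step approach can be sketched as a small pipeline. Everything here is a placeholder: `generate_text` and `synthesize_speech` are hypothetical callables standing in for whichever text-generation call and TTS service you choose, and neither is a real SDK function:

```python
from typing import Callable, Tuple


def text_and_audio(
    prompt: str,
    generate_text: Callable[[str], str],
    synthesize_speech: Callable[[str], bytes],
) -> Tuple[str, bytes]:
    """Produce matching text and audio for one prompt.

    The text is generated first, then the same text is passed to the
    TTS step, so the audio speaks exactly what the text says.
    """
    text = generate_text(prompt)
    audio = synthesize_speech(text)
    return text, audio
```

Because the audio is synthesized from the returned text, the two outputs match word for word, which is the property asked about above. The trade-off is that the voice comes from the TTS service rather than from Gemini's native audio output.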

ArthurG · 2025-01-17T22:00:56Z

With the live-api on websockets [1], is there a way to adjust the safety params [2]? I couldn't see it in the source code of the python lib or the docs @Giom-V

[1] https://github.com/google-gemini/cookbook/blob/main/gemini-2/live_api_tool_use.ipynb
[2] https://ai.google.dev/api/generate-content#v1beta.HarmCategory

Giom-V · 2025-01-20T13:01:36Z

@ArthurG I don't think you can at the moment.

Giom-V added the type:documentation The documentation needs to be updated label Dec 30, 2024

Giom-V self-assigned this Dec 30, 2024

github-actions bot mentioned this issue Jan 1, 2025

Monthly issue metrics report markmcd/gemini-api-cookbook#10

Open

This was referenced Jan 23, 2025

fix: Add validation and documentation for response modalities #416

Closed

docs: Add comment about response modalities limitation #417

Merged
