anyway to access the raw audio data (pcm or otherwise) #643

mattfeury · 2025-01-10T18:46:19Z

hello, this is not an issue per se but i saw similar questions and wanted to ask it: is there any way to gain access to the underlying audio data from a call?

i have tried capturing it with an AudioRecord object, but no audio source seems to carry it. I've tried VOICE_CALL, REMOTE_SUBMIX, VOICE_UPLINK, VOICE_DOWNLINK, etc. and none of them returns the audio data. i'd like to pump it over the network.

thank you

afalls-twilio · 2025-01-13T16:16:32Z

@mattfeury I'm not sure why you want to do this: the Voice SDK already does just that, it sends audio data over the network. That being said, the only way to get access to the raw PCM data is to write your own audio device by extending the AudioDevice class.

mattfeury · 2025-01-13T18:48:59Z

thank you for the response. i'm referring to the playback audio (i.e. the "renderer" data). currently, twilio plays this back via an AudioTrack, which means it plays through a local output (built-in speaker, headset, etc). I want to forward the data to an external destination instead. I think AudioDevice will let me do this.

One follow-up question: is AudioDevice.audioDeviceReadRenderData blocking? I notice that the example pumps the data to an AudioTrack, and that AudioTrack write is done in a blocking manner, but since i'm replacing the AudioTrack with an async network call, I want to be sure i'm doing it correctly. it seems that AudioDevice.audioDeviceReadRenderData is not blocking, and since every call after it is async, the speakerRendererRunnable runs extremely frequently, e.g. 5000 loops in just 10ms, despite my CALLBACK_BUFFER_SIZE_MS being 10ms. this makes me think i need to put an explicit Thread.sleep in the runnable. is this correct?
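If the read call does turn out not to block, one way to pace the loop is to sleep until a fixed deadline each period rather than sleeping a flat 10 ms, so the time spent reading and handing off the buffer does not accumulate as drift. The sketch below is only an illustration of that idea: RenderSource and Sink are hypothetical stand-ins (they are not SDK types) for the audioDeviceReadRenderData call and for whatever non-blocking hand-off feeds the network sender.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.locks.LockSupport;

/** Pulls one buffer per period without relying on the read call to block. */
public final class PacedRenderLoop {
    /** Hypothetical stand-in for the SDK read call; fills the buffer with one period of PCM. */
    public interface RenderSource { void read(ByteBuffer buffer); }

    /** Hypothetical non-blocking consumer; it must copy the data, because the buffer is reused. */
    public interface Sink { void accept(ByteBuffer buffer); }

    public static void run(RenderSource source, Sink sink,
                           int bufferBytes, long periodMs, int periods) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(bufferBytes);
        long periodNanos = periodMs * 1_000_000L;
        long deadline = System.nanoTime();
        for (int i = 0; i < periods && !Thread.currentThread().isInterrupted(); i++) {
            buffer.clear();
            source.read(buffer);
            sink.accept(buffer);
            // Advance the deadline by exactly one period, then wait for it.
            deadline += periodNanos;
            long wait;
            while ((wait = deadline - System.nanoTime()) > 0) {
                LockSupport.parkNanos(wait); // may wake early, so re-check the deadline
            }
        }
    }
}
```

In a real renderer thread the loop would run until rendering stops rather than for a fixed number of periods; the count is only there to keep the sketch easy to exercise.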

afalls-twilio · 2025-01-18T00:26:57Z

@mattfeury yes, internally WebRTC operates on 10 ms blocks, and this is by design to reduce audio latency. As for whether it is blocking, I believe it is, but I will have to spend some time digging into WebRTC and get back to you. If you are seeing it return more frequently than every 10 ms, it might not be. From the SDK's perspective, AudioDevice.audioDeviceReadRenderData just calls an internal WebRTC API called RequestPlayoutData.
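For reference, the size of one 10 ms block follows directly from the PCM format. This is just the arithmetic, sketched with a hypothetical helper (PcmFrameSize is not part of the SDK), and it assumes uncompressed PCM with a whole number of bytes per sample.

```java
/** Computes how many bytes one period of uncompressed PCM occupies. */
public final class PcmFrameSize {
    public static int bytesPerPeriod(int sampleRateHz, int channels,
                                     int bitsPerSample, int periodMs) {
        int bytesPerFrame = channels * (bitsPerSample / 8); // one sample for every channel
        int framesPerPeriod = sampleRateHz * periodMs / 1000;
        return framesPerPeriod * bytesPerFrame;
    }
}
```

For example, 16-bit mono at 48000 Hz gives 480 frames of 2 bytes each, so 960 bytes per 10 ms block.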

87620089 · 2025-01-18T01:48:26Z

How can I combine the microphone sound from AudioRecord and the sound from audioDeviceReadRenderData into one PCM file? The microphone sound is sent through audioDeviceWriteCaptureData.

afalls-twilio · 2025-01-20T17:40:34Z

@87620089 you would have to mix the pcm samples down yourself, maybe something like this?
https://stackoverflow.com/questions/12089662/mixing-16-bit-linear-pcm-streams-and-avoiding-clipping-overflow

Also, you would have to make sure both streams are in the same format (sample rate, channel count, and bit depth, and therefore the same bitrate) before you mix them.
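A minimal sketch of that kind of mix-down, assuming both buffers already hold 16-bit signed little-endian PCM with the same sample rate and channel count. PcmMixer is a hypothetical helper, not an SDK class, and it uses the simplest approach of summing and clamping; the linked answer discusses alternatives that lose less headroom.

```java
/** Mixes two 16-bit signed little-endian PCM buffers sample by sample. */
public final class PcmMixer {
    public static byte[] mix16le(byte[] a, byte[] b) {
        // Only mix the overlapping part, rounded down to whole 2-byte samples.
        int n = Math.min(a.length, b.length) & ~1;
        byte[] out = new byte[n];
        for (int i = 0; i < n; i += 2) {
            int sa = (short) ((a[i] & 0xFF) | (a[i + 1] << 8));
            int sb = (short) ((b[i] & 0xFF) | (b[i + 1] << 8));
            int sum = sa + sb;
            // Clamp instead of letting the sum wrap around, which would be audible as harsh distortion.
            int clamped = Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, sum));
            out[i] = (byte) clamped;
            out[i + 1] = (byte) (clamped >> 8);
        }
        return out;
    }
}
```

The mixed bytes can then be appended to the output file one block at a time, as long as the capture and render blocks being paired cover the same time span.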
