Missing information elements in certain voice messages #2006

Closed
Jaffex opened this issue Nov 1, 2023 · 8 comments
Labels: A-Voice-Message; O-Uncommon (Most users are unlikely to come across this or unexpected workflow); S-Minor (Impairs non-critical functionality or suitable workarounds exist); T-Defect; Team: Element X Feature; X-Needs-Design; X-Needs-Product

Comments

@Jaffex

Jaffex commented Nov 1, 2023

Steps to reproduce

I'm using the mautrix-signal bridge to talk to my signal contacts via matrix.

When a contact sends me a voice message on signal, this voice message gets sent as a voice message in matrix as well.

Outcome

What did you expect?

These voice messages show up in element desktop and the legacy element-ios client and have an indicator of
a) how long they are while they are not playing
b) how much time is left during playback
c) an indicator of where the current playback is via highlighting the waveform

image
Example from element-desktop


image
Example from element-desktop while playing back the bridged message

Everything is fine here.

I would expect this to be the same in element x.

What happened instead?

Voice messages that come via the signal bridge don't show how long they are, or where the playback is currently at while playing:
image
The same two voice messages from above, displayed in element x
This shows missing duration information for the bridged voice message.


image
Example from element x while playing back the bridged voice message
The indicator of the current playback position will remain at the start of the message during the entire playback.
The remaining time within the voice message stays at 0:00 during the entire playback.


image
Example from element x while playing back the voice message from element-desktop
Everything is fine here.

Additional Information

From what I can see while looking at the messages, the voice messages that come via the bridge are m4a format while the messages generated in element x or desktop are ogg.

I am aware that this might be an issue with how the bridge transmits the voice messages, but I thought it might be an issue for Element X as well, since the other clients seem to be able to make sense of these voice messages' metadata.

Here is the message information for the bridged message:

{
  "room_id": "redacted",
  "type": "m.room.message",
  "content": {
    "msgtype": "m.audio",
    "body": "Sprachnachricht 01.11.23, 11:05.m4a",
    "info": {
      "mimetype": "audio/ogg",
      "size": 18827
    },
    "file": {
      "key": {
        "k": "redacted",
        "alg": "A256CTR",
        "ext": true,
        "kty": "oct",
        "key_ops": [
          "encrypt",
          "decrypt"
        ]
      },
      "iv": "redacted",
      "hashes": {
        "sha256": "redacted"
      },
      "url": "redacted",
      "v": "v2"
    },
    "org.matrix.msc1767.file": {
      "url": null,
      "name": "Sprachnachricht 01.11.23, 11:05.m4a",
      "mimetype": "audio/ogg",
      "size": 66485
    },
    "org.matrix.msc3245.voice": {}
  }
}

The odd thing here is the mimetype, which says "audio/ogg" even though the filename ends in .m4a; this might be related to the problem.

For comparison, here is the data for the element-desktop generated voice message:

{
  "type": "m.room.message",
  "content": {
    "body": "Voice message",
    "msgtype": "m.audio",
    "file": {
      "v": "v2",
      "key": {
        "alg": "A256CTR",
        "ext": true,
        "k": "redacted",
        "key_ops": [
          "encrypt",
          "decrypt"
        ],
        "kty": "oct"
      },
      "iv": "redacted",
      "hashes": {
        "sha256": "redacted"
      },
      "url": "redacted"
    },
    "info": {
      "duration": 2264,
      "mimetype": "audio/ogg",
      "size": 6275
    },
    "org.matrix.msc1767.text": "Voice message",
    "org.matrix.msc1767.file": {
      "file": {
        "v": "v2",
        "key": {
          "alg": "A256CTR",
          "ext": true,
          "k": "redacted",
          "key_ops": [
            "encrypt",
            "decrypt"
          ],
          "kty": "oct"
        },
        "iv": "redacted",
        "hashes": {
          "sha256": "redacted"
        },
        "url": "redacted"
      },
      "name": "Voice message.ogg",
      "mimetype": "audio/ogg",
      "size": 6275
    },
    "org.matrix.msc1767.audio": {
      "duration": 2264,
      "waveform": [
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        29,
        29,
        29,
        24,
        24,
        24,
        22,
        22,
        22,
        2,
        2,
        2,
        13,
        13,
        13,
        18,
        18,
        18,
        15,
        15,
        15,
        13,
        13,
        13,
        14,
        14,
        14,
        12,
        12,
        12,
        12,
        12,
        12,
        8,
        8,
        8,
        10,
        10,
        10,
        17,
        17,
        17,
        812,
        812,
        812,
        799,
        799,
        799,
        692,
        692,
        692,
        856,
        856,
        856,
        549,
        549,
        549,
        364,
        364,
        364,
        712,
        712,
        712,
        927,
        927,
        927,
        935,
        935,
        935,
        836,
        836,
        836,
        779,
        779,
        779,
        603,
        603,
        603,
        549,
        549,
        549,
        60,
        60,
        60,
        337,
        337,
        337,
        122,
        122,
        122,
        152
      ]
    },
    "org.matrix.msc3245.voice": {},
    "m.mentions": {}
  }
}
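
The visible difference between the two events is that the bridged one declares no `duration` (and no waveform), while the Element Desktop one declares both. A minimal sketch of that comparison follows, assuming a client looks for the duration under `info` and under `org.matrix.msc1767.audio`, the two places it appears in the events above. The function names are hypothetical and this is not Element X's actual code.

```python
from typing import Any, Dict, Optional


def declared_duration_ms(content: Dict[str, Any]) -> Optional[int]:
    """Return the duration in ms that the event content declares, or None."""
    # Assumption: these are the only two places a duration can appear,
    # based purely on the example events in this issue.
    for key in ("info", "org.matrix.msc1767.audio"):
        container = content.get(key)
        if isinstance(container, dict):
            duration = container.get("duration")
            if isinstance(duration, int) and not isinstance(duration, bool):
                return duration
    return None


def has_waveform(content: Dict[str, Any]) -> bool:
    """True if the event carries a non-empty waveform array."""
    audio = content.get("org.matrix.msc1767.audio")
    return isinstance(audio, dict) and bool(audio.get("waveform"))


# Trimmed copies of the two events shown above (encryption fields omitted).
bridged = {
    "msgtype": "m.audio",
    "info": {"mimetype": "audio/ogg", "size": 18827},
    "org.matrix.msc3245.voice": {},
}
desktop = {
    "msgtype": "m.audio",
    "info": {"duration": 2264, "mimetype": "audio/ogg", "size": 6275},
    "org.matrix.msc1767.audio": {"duration": 2264, "waveform": [0, 29, 24]},
    "org.matrix.msc3245.voice": {},
}

for name, content in (("bridged", bridged), ("desktop", desktop)):
    print(name, declared_duration_ms(content), has_waveform(content))
```

With these inputs the bridged event yields no duration and no waveform, which matches the empty duration label and flat waveform seen in Element X.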

Your phone model

iPhone 12 mini

Operating system version

iOS 17.1

Application version

1.4.0

Homeserver

Synapse

Will you send logs?

No

@Jaffex Jaffex added the T-Defect label Nov 1, 2023
@pixlwave pixlwave added A-Voice-Message S-Minor Impairs non-critical functionality or suitable workarounds exist O-Uncommon Most users are unlikely to come across this or unexpected workflow labels Nov 3, 2023
@pixlwave
Member

pixlwave commented Nov 3, 2023

@Jaffex Do you happen to have an example file of one of the bridged voice messages you can share with us? From the event description something looks off as the filename is for an m4a file but the mimetype says it's an OGG file.

"name": "Sprachnachricht 01.11.23, 11:05.m4a",
"mimetype": "audio/ogg",

@pixlwave pixlwave added the X-Needs-Info This issue is blocked awaiting information from the reporter label Nov 3, 2023
@pixlwave
Member

pixlwave commented Nov 3, 2023

Media info from the file (sent privately):

General
Complete name                            : Sprachnachricht 03.11.23, 15_30.m4a
Format                                   : Ogg
File size                                : 13.1 KiB
Duration                                 : 1 s 911 ms
Overall bit rate                         : 56.1 kb/s
Writing application                      : Lavc59.37.100 libopus
creation_time                            : 2023-11-03T14:30:46.000000Z
vendor_id                                : [0][0][0][0]
major_brand                              : M4A 
minor_version                            : 0
compatible_brands                        : M4A mp42isom
FileExtension_Invalid                    : oga ogg ogm ogv ogx opus spx

Audio
ID                                       : 869979633 (0x33DAD5F1)
Format                                   : Opus
Duration                                 : 1 s 911 ms
Channel(s)                               : 1 channel
Channel layout                           : M
Sampling rate                            : 48.0 kHz
Compression mode                         : Lossy
Writing library                          : Lavf59.27.100
Language                                 : English

So it is being converted correctly, but the filename is incorrect.
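
A mismatch like this (an `.m4a` name on an Ogg file) can be detected without trusting the extension by looking at the first bytes of the file. The sketch below relies on two well-known signatures: Ogg streams begin with `OggS`, and MP4-family files (including M4A) carry `ftyp` at byte offset 4. The function name is hypothetical and the check is deliberately minimal; it is not how MediaInfo or Element X identify files.

```python
def sniff_container(header: bytes) -> str:
    """Guess the container format from the first bytes of a file."""
    if header[:4] == b"OggS":
        # Every Ogg page starts with this capture pattern.
        return "ogg"
    if len(header) >= 8 and header[4:8] == b"ftyp":
        # MP4/M4A: a 4-byte box size followed by the 'ftyp' box type.
        return "mp4"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read just enough of a file to sniff its container."""
    with open(path, "rb") as handle:
        return sniff_container(handle.read(8))
```

Calling `sniff_file` on the bridged file would be expected to return `"ogg"` despite the `.m4a` name, consistent with the `Format : Ogg` line above.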

@Jaffex
Author

Jaffex commented Nov 3, 2023

As an addendum:

I also use the mautrix-whatsapp bridge, which also provides voice messages.

In those cases, the voice message shows the correct duration:

image

But here, pressing the play-button does not start playback of the voice message. (There seems to be no way to listen to this message in Element X)

These voice messages also play correctly in element-desktop and legacy element-ios.

Here is the data for such a file:

{
  "room_id": "redacted",
  "type": "m.room.message",
  "content": {
    "body": "audio.ogg",
    "file": {
      "hashes": {
        "sha256": "redacted"
      },
      "iv": "redacted",
      "key": {
        "alg": "A256CTR",
        "ext": true,
        "k": "redacted",
        "key_ops": [
          "encrypt",
          "decrypt"
        ],
        "kty": "oct"
      },
      "url": "redacted",
      "v": "v2"
    },
    "info": {
      "duration": 1000,
      "mimetype": "audio/ogg; codecs=opus",
      "size": 3307
    },
    "msgtype": "m.audio",
    "org.matrix.msc1767.audio": {
      "duration": 1000,
      "waveform": [
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        32,
        32,
        64,
        96,
        128,
        160,
        192,
        224,
        288,
        320,
        320,
        288,
        256,
        224,
        192,
        160,
        128,
        128,
        128,
        96,
        96,
        96,
        64,
        64,
        64,
        64,
        64,
        32,
        32,
        32,
        32,
        32,
        32,
        32,
        32,
        32,
        32,
        32,
        32,
        32,
        32,
        32,
        32,
        32,
        32
      ]
    },
    "org.matrix.msc3245.voice": {}
  }
}

(An example was provided to @pixlwave in a private chat)

@pixlwave
Member

pixlwave commented Nov 3, 2023

The file from the WhatsApp bridge is an OGG; however, the sample rate isn't to spec, which states 48 kHz. Not sure if that's hardcoded in our decoding somewhere, so mentioning it just in case.

General
Complete name                            : audio.ogg
Format                                   : Ogg
File size                                : 3.23 KiB
Duration                                 : 1 s 707 ms
Overall bit rate                         : 15.5 kb/s

Audio
ID                                       : 100 (0x64)
Format                                   : Opus
Duration                                 : 1 s 707 ms
Channel(s)                               : 1 channel
Channel layout                           : M
Sampling rate                            : 16.0 kHz
Compression mode                         : Lossy
Writing library                          : WhatsApp
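
For anyone wanting to check this value without MediaInfo: an Ogg Opus stream stores an "input sample rate" in its `OpusHead` identification header (RFC 7845), and the 16.0 kHz above is presumably that field. The sketch below reads it from the first Ogg page. It assumes the `OpusHead` packet sits at the start of the first page, as RFC 7845 requires; the function name is hypothetical. Note that RFC 7845 describes this field as informational about the original input rather than a required playback rate, so whether it explains the playback failure is an open question here.

```python
import struct
from typing import Optional

OGG_PAGE_HEADER_SIZE = 27  # fixed part of an Ogg page header
OPUS_HEAD_MIN_SIZE = 19    # magic(8) version(1) channels(1) pre-skip(2)
                           # rate(4) gain(2) mapping family(1)


def opus_input_sample_rate(data: bytes) -> Optional[int]:
    """Return the OpusHead input sample rate in Hz, or None if not Ogg Opus."""
    if len(data) < OGG_PAGE_HEADER_SIZE or data[:4] != b"OggS":
        return None
    # Byte 26 holds the number of entries in the segment table that follows.
    segment_count = data[26]
    payload_start = OGG_PAGE_HEADER_SIZE + segment_count
    head = data[payload_start:payload_start + OPUS_HEAD_MIN_SIZE]
    if len(head) < OPUS_HEAD_MIN_SIZE or head[:8] != b"OpusHead":
        return None
    # The input sample rate is a little-endian uint32 at offset 12.
    (rate,) = struct.unpack_from("<I", head, 12)
    return rate
```

Passing the first few hundred bytes of the file is enough, for example `opus_input_sample_rate(open(path, "rb").read(256))`.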

@s3ase

s3ase commented Nov 21, 2023

I can confirm this: I also run the WhatsApp bridge, and voice messages do not play on Element X iOS, but they do play on Element X Android, Element iOS and Element Web.

@nimau
Contributor

nimau commented Nov 28, 2023

There are two issues here:

  1. In the first case, the duration is missing. That's why the duration is not shown in the timeline and during playback.
    This works in the legacy iOS app because the app preloads all voice messages. In EX, for performance reasons, we only load a voice message the first time it is played. A possible fix is to update the UI with the duration once the file is loaded. This would fix the duration during playback, but not for messages that have not been loaded yet.

  2. The mimeType contains parameters in the second case.
    This is a bug because EX expects the mime type to be exactly "audio/ogg".
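
Both points can be illustrated with a short sketch. It is written in Python for brevity and is not the Element X implementation (nor necessarily what the eventual fix does); all function names are hypothetical. The first helper strips parameters such as `; codecs=opus` before comparing the mime type, and the second falls back to the duration measured from the loaded file when the event declares none.

```python
from typing import Optional


def base_mime_type(mimetype: str) -> str:
    """Drop any parameters and normalise case: 'audio/ogg; codecs=opus' -> 'audio/ogg'."""
    return mimetype.split(";", 1)[0].strip().lower()


def is_ogg_audio(mimetype: str) -> bool:
    """Compare only the type/subtype, ignoring parameters."""
    return base_mime_type(mimetype) == "audio/ogg"


def duration_to_show_ms(declared_ms: Optional[int],
                        loaded_ms: Optional[int]) -> Optional[int]:
    """Prefer the duration declared in the event; otherwise use the one
    measured after the file has been loaded (None until first playback)."""
    if declared_ms is not None:
        return declared_ms
    return loaded_ms


print(is_ogg_audio("audio/ogg; codecs=opus"))
print(duration_to_show_ms(None, 1911))
```

As noted in point 1, the fallback only helps once a message has been played: before that, `loaded_ms` is `None` and there is still nothing to display.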

@Jaffex
Author

Jaffex commented Jan 5, 2024

After migrating to the Go rewrite of the mautrix-signal bridge, I noticed that voice messages come through as .m4a (I believe when sent from an iPhone) or .aac (when sent from Android).

Both of those file types trigger a player element in Element Desktop and the legacy Element app, but not in Element X.
The only difference from a "real" voice message in Element Desktop seems to be that the player element also shows the file name.

Element Desktop:
image
Element X:
2024-01-05 11-11-01-0

In Element X, the only option is to open these files in a separate app to listen to them.

Would it be possible to also add these file types to be playable with the voice message player?

@pixlwave
Member

pixlwave commented Jan 8, 2024

Closing as fixed by #2190

@Jaffex That's a reasonable request, please open an issue for it 👍

@pixlwave pixlwave closed this as completed Jan 8, 2024