Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Version >= 0.1.2 don't work with webtorrent #406

Closed
Fabiencdp opened this issue May 15, 2018 · 12 comments
Closed

Version >= 0.1.2 don't work with webtorrent #406

Fabiencdp opened this issue May 15, 2018 · 12 comments

Comments

@Fabiencdp
Copy link

Fabiencdp commented May 15, 2018

Hi,
I try to use node-webrtc with webtorrent-hybrid instead of electron-webrtc.
It works with node-webrtc version 0.1.0 and 0.1.1. But don't with 0.1.2 and superior versions.
After some test, it look like the problem come from the binary from 0.1.2. I run my test on linux x64.

I have made a test repo to show the difference : https://github.com/Fabiencdp/webtorrent-wrtc-test
Did you have an idea on what happen ?

You can take a look at this issue webtorrent/webtorrent#76

@markandrus
Copy link
Member

markandrus commented May 15, 2018

Thanks, @Fabiencdp. I will test it out

@surprisetalk
Copy link

Hi friends -- has there been any progress on this? Is there anything I can do to help?

@markandrus
Copy link
Member

Hi @surprisetalk,

Yes, you can definitely help out! We still don't know why it doesn't work. I took a look at @Fabiencdp's example, but I wasn't sure what the expected behavior was (I've never used instant.io, for example, which is mentioned in the usage example). I was AFK for a bit, so I haven't followed up, but if you could take a look, that would help.

I think we want to identify,

  • Is something throwing that didn't previously throw?
  • Are events firing in a different order than before?
  • Is a certain event not firing at all?

My feeling is that adding some logging to @Fabiencdp's project or some logging in its dependencies may help us answer these questions.

Mark

@surprisetalk
Copy link

Thanks @markandrus!

I did some research today with webtorrent-hybrid using the latest version of wrtc.

Seeding consistently works in the following scenario:

  1. Seed a file using webtorrent-hybrid (replacing electron-webrtc with wrtc).
  2. Open the magnet link on instant.io.
  3. Upload the file on instant.io and then quickly close the tab/window. The node.js seeder will hand over all the appropriate data to the leecher in both chrome and firefox.

So seeding files works, but somewhere the metadata exchange is breaking! Indeed, take a look at the following log (this happens when you add a magnet link on instant.io):

  bittorrent-protocol [8ffcc747] new wire +0ms
  bittorrent-protocol [8ffcc747] handshake i=21a7b1357cec4803f5a24eec2aa5efd22b6dcf1e p=2d5757303030302d7364447971512f3461764172 exts={ dht: true } +2ms
  bittorrent-protocol [8ffcc747] Error: wire not speaking BitTorrent protocol () +22ms
  bittorrent-protocol [8ffcc747] end +0ms
  bittorrent-protocol [8ffcc747] got uninterested +0ms
  bittorrent-protocol [8ffcc747] got choke +0ms
  bittorrent-protocol [8ffcc747] destroy +15ms
  bittorrent-protocol [8ffcc747] end +0ms
  bittorrent-protocol [8ffcc747] got uninterested +0ms
  bittorrent-protocol [8ffcc747] got choke +0ms
  bittorrent-protocol [fc4df1d9] new wire +4s
  bittorrent-protocol [fc4df1d9] handshake i=21a7b1357cec4803f5a24eec2aa5efd22b6dcf1e p=2d5757303030302d7364447971512f3461764172 exts={ dht: true } +1ms

  bittorrent-protocol [fc4df1d9] Error: wire not speaking BitTorrent protocol (z'
                              V�z'
$@�|�H��N�!0HA&                   VH
               V�S8&
                    V �aI&
                          p��'
                              V) +107ms
  bittorrent-protocol [fc4df1d9] end +0ms
  bittorrent-protocol [fc4df1d9] got uninterested +0ms
  bittorrent-protocol [fc4df1d9] got choke +0ms
  bittorrent-protocol [fc4df1d9] destroy +13ms
  bittorrent-protocol [fc4df1d9] end +0ms
  bittorrent-protocol [fc4df1d9] got uninterested +0ms
  bittorrent-protocol [fc4df1d9] got choke +0ms
set=0 +1ms
  bittorrent-protocol [55ab5a66] new wire +258ms
  bittorrent-protocol [55ab5a66] handshake i=21a7b1357cec4803f5a24eec2aa5efd22b6dcf1e p=2d5757303030302d7364447971512f3461764172 exts={ dht
: true } +1ms
  bittorrent-protocol [55ab5a66] Error: wire not speaking BitTorrent protocol (v7�kP�['
                                    `�"@�:ut_metadatai1ee13:meta) +6ms
  bittorrent-protocol [55ab5a66] end +0ms
  bittorrent-protocol [55ab5a66] got uninterested +1ms
  bittorrent-protocol [55ab5a66] got choke +0ms
28 offset=0 length=16384 +8ms
  bittorrent-protocol [55ab5a66] destroy +8ms
  bittorrent-protocol [55ab5a66] end +0ms
  bittorrent-protocol [55ab5a66] got uninterested +1ms
  bittorrent-protocol [55ab5a66] got choke +0ms
  bittorrent-protocol [ffd15b9f] piece index=29 off
set=0 +0ms
16384
  bittorrent-protocol [ffd15b9f] got request index=
30 offset=0 length=16384 +1ms
  bittorrent-protocol [ffd15b9f] got request index=
31 offset=0 length=16384 +0ms

Interesting, right? It looks like some initial handshake messages are getting garbled. You can even see a ut_metadata snippet in one of the errors.

I'll continue investigating tomorrow -- just thought I'd post my first findings here in case this seems familiar to anybody.

@surprisetalk
Copy link

surprisetalk commented Jun 20, 2018

Update:

Started two leeching peers using webtorrent-hybrid with wrtc.

Peer 1 had the following exchange:

  bittorrent-protocol [124112b4] sent message 20 [] d1:md11:ut_metadatai1e6:ut_pexi2eee <Buffer 00 64 31 3a 6d 64 31 31 3a 75 74 5f 6d 65 74 61 64 61 74 61 69 31 65 36 3a 75 74 5f 70 65 78 69 32 65 65 65> <Buffer 00 00 00 25 14> +1ms
  bittorrent-protocol [124112b4] received buffer 106 jd1:md11:ut_metadatai1e6:ut_pexi2eee <Buffer 6a 00 64 31 3a 6d 64 31 31 3a 75 74 5f 6d 65 74 61 64 61 74 61 69 31 65 36 3a 75 74 5f 70 65 78 69 32 65 65 65> +2ms
  bittorrent-protocol [124112b4] got unknown message (jd1:md11:ut_metadatai1e6:ut_pexi2eee) +0ms

Peer 2 logged this:

  bittorrent-protocol [97bcfafe] sent message 20 [] d1:md11:ut_metadatai1e6:ut_pexi2eee <Buffer 00 64 31 3a 6d 64 31 31 3a 75 74 5f 6d 65 74 61 64 61 74 61 69 31 65 36 3a 75 74 5f 70 65 78 69 32 65 65 65> <Buffer 00 00 00 25 14> +1ms
  bittorrent-protocol [97bcfafe] received buffer 0 d1:md11:ut_metadatai1e6:ut_pexi2eee <Buffer 00 00 64 31 3a 6d 64 31 31 3a 75 74 5f 6d 65 74 61 64 61 74 61 69 31 65 36 3a 75 74 5f 70 65 78 69 32 65 65 65> +1ms

Note that both peers sent the exact same message at the same time, yet one received buffer was prefixed with 00 (choke/garbage) and the other was prefixed with 6a (garbage). I'm finding that the prefixes of ut_metadata-related (and possibly all extension-related) messages have a garbled beginning prefix.

I'm trying to track down what occurs after Wire.prototype._message in bittorrent-protocol/index.js. Specifically, the following lines seem suspicious:

this._push(buffer)
if (data) this._push(data)

As always, let me know if anything jumps out at you.

EDIT: If both peers are started at about the same time, the handshake sends successfully. The errors only seem to occur when one leecher is running, and then I remove and start a second leecher. Strange!

@markandrus
Copy link
Member

@surprisetalk

I'm finding that the prefixes of ut_metadata-related (and possibly all extension messages) have a garbled beginning prefix.

Hmm, I wonder if this is specific to Buffers that get _push-ed through the Wire. I see in that method the following happens:

var dataLength = data ? data.length : 0
var buffer = Buffer.allocUnsafe(5 + (4 * numbers.length))

buffer.writeUInt32BE(buffer.length + dataLength - 4, 0)
buffer[4] = id
for (var i = 0; i < numbers.length; i++) {
  buffer.writeUInt32BE(numbers[i], 5 + (4 * i))
}

Buffer support was actually added in v0.0.65. See these changes in datachannel.cc.

I'm not sure what's significant about v0.1.2, though. There are only very minor changes in, e.g., how we copy the webrtc::DataBuffer in MessageEvent. See v0.1.1 versus v0.1.2.

@surprisetalk
Copy link

surprisetalk commented Jun 22, 2018

I found the problem!

First, look at the following block of code from simple-peer/index.js:

Peer.prototype._onChannelMessage = function (event) {
  var self = this
  if (self.destroyed) return
  var data = event.data
  if (data instanceof ArrayBuffer) data = Buffer.from(data)
  self.push(data)
}

This function takes event.data from WebRTC's onmessage event, and pushes it down through a Node.js pipe. In this case, bittorrent-protocol is going to be accepting this data in Wire.prototype._write.

Anyway, if you look at the Buffer docs here, you'll see that Buffer.from does not allocate new memory when it receives an ArrayBuffer. Indeed, when I use the following code, webtorrent-hybrid works with webrtc as it should:

Peer.prototype._onChannelMessage = function (event) {
  var self = this
  if (self.destroyed) return
  var data = event.data
  if (data instanceof ArrayBuffer) {
    let data_ = Buffer.allocUnsafe(Buffer.from(data).length)
    Buffer.from(data).copy( data_ )
    self.push( data_ )
  } else
  {
    self.push( data )
  }
}

So it looks like wrtc is reusing the memory for event.data while it gets pushed downstream to bittorrent-protocol. Here are a few solutions:

  1. We could make some changes to simple-peer, but if you read the "memory usage" section of its README, you'll see that this behavior is intentional. I doubt @feross would be on board for that.
  2. We could clone the data immediately after simple-peer pushes the data downstream toward bittorrent-protocol. I'm not sure if this would work, but it may be worth a try.
  3. @markandrus and the team perform some wizardry and figure out a way to hold on to that memory for a little longer 🙃

@markandrus
Copy link
Member

Awesome work! I’m AFK this weekend, but happy to look more closely Monday. :-)

@markandrus
Copy link
Member

@surprisetalk OK, I believe I understand the bug now. It is a bug in node-webrtc and the way we construct ArrayBuffer. Actually, I am surprised this hasn't caused more problems! :-O

We call this version of New with message, which results in an externalized ArrayBuffer. "Externalized" means message will not be freed when the ArrayBuffer is garbage collected. This is actually good in this case, because message will already be freed when the MessageEvent that surfaced it gets destroyed. We'd have a bad time (segfault) if it got freed twice.

Herein lies the problem, though: if we hold onto two ArrayBuffers from an RTCDataChannel in the JavaScript layer, the message underlying the earlier of the two has already been freed when the first MessageEvent gets destroyed. Reading the earlier of the two messages is likely to surface junk data. I guess the code that called New assumed New made a copy of message, but it doesn't.

Here's what we can do, then: construct ArrayBuffer in an "internalized" state, and release the message when constructing it. This means ArrayBuffer can free the message whenever it's garbage collected. I'll followup with an integration test and the fix.

Thanks again! You're investigation helped a ton.

@markandrus
Copy link
Member

@Fabiencdp @surprisetalk I merged #419 and release [email protected]. Can either of y'all retest with webtorrent? If it works, then I think we can close this issue.

@surprisetalk
Copy link

Sorry, I've been on vacation and working on other projects.

Just tested webtorrent-hybrid with version 0.1.6 and it works splendidly!
Thank you so much for your help.

@markandrus
Copy link
Member

Very nice, thank you! 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants