Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Critical issue: no apparent way to use pure binary TCP sockets in Node 6.9.5 #11316

Closed
RickBullotta opened this issue Feb 11, 2017 · 6 comments
Closed
Labels
question Issues that look for answers. stream Issues and PRs related to the stream subsystem.

Comments

@RickBullotta
Copy link
Contributor

RickBullotta commented Feb 11, 2017

Version 6.9.5
Windows 10
Net and Stream subsystem issue

I am trying to use Node to read (and write) binary streams from a TCP socket. I have tried every possible approach including:

socketClient.setEncoding(null)
socketClient.setEncoding('binary')

setEncoding(null) results in UTF-8 encoding being applied and screws everything up. setEncoding('binary') still seems to be modifying the bytes from the incoming stream in some cases.

I have verified the correct/actual inbound byte values with a Java app, so the issue is confined to Node.

Here is a concrete example:

The following 10 bytes (hex) are sent from the server:

00 02 00 00 00 04 00 c8 00 01

...however, when I try to read them from the Node stream associated with the socket, I see the following:

00 02 00 00 00 04 00 c3 88 00 01

Note that there are now 11 bytes, and c8 has been replaced with c3 88.

I have spent a couple days trying to figure this out, and have had zero luck. Is this a bug with the Stream classes? Is there some magic encoding setting that I don't know about?

Please advise. Thanks!

**** UPDATE ****

I may have found a workaround. The issue actually seems to be the conversion of the "data" parameter in the 'data' event emitted by the stream/socket.

I was doing:

client.on('data', function(data) {
         var rawBuffer = Buffer.from(data);

...and that was not working. Out of desperation, I tried:

client.on('data', function(data) {
         var rawBuffer = Buffer.from(data,'binary');

...and it worked.

If you'll indulge my venting (after losing many hours on this), the documentation for use of encoding on buffers and streams SUCKS. Not only is there no listing of the various options and their behaviors, but no usage information whatsoever.

This needs to be addressed to prevent others from experience the three days of hell I just went through...

@mscdex
Copy link
Contributor

mscdex commented Feb 11, 2017

This kind of question belongs on the nodejs/help repo instead. To answer your question though: streams work with binary data by default (assuming you're not using object mode of course), so don't use .setEncoding() at all and you'll be fine.

@mscdex mscdex closed this as completed Feb 11, 2017
@mscdex mscdex added question Issues that look for answers. stream Issues and PRs related to the stream subsystem. labels Feb 11, 2017
@RickBullotta
Copy link
Contributor Author

Maybe they do, but Buffers apparently do not. Something bizarre happened as you can plainly see. And the only solution was to explicitly call for binary encoding in Buffer.from(data,...)

And clearly there is a documentation issue - which is why I think this issue should remain open...or be moved to the appropriate location.

Thanks!

Rick

@sam-github
Copy link
Contributor

client.on('data', function(data) {
         var rawBuffer = Buffer.from(data);

data is already a buffer, why are you converting it to another buffer? What is "raw" about that buffer?

You did read https://nodejs.org/api/net.html#net_event_data and https://nodejs.org/api/stream.html#stream_event_data?

The listener callback will be passed the chunk of data as a string if a default encoding has been specified for the stream using the readable.setEncoding() method; otherwise the data will be passed as a Buffer.

The node docs are not perfect, please open another issue describing what specifically you think is missing from the documentation, this issue is too vague.

@RickBullotta
Copy link
Contributor Author

Thanks for the info @sam-github. Actually, it isn't a buffer. typeof returns 'String' and data.constructor.name = 'String'. That said, it is a stream of bytes, and the code that I provided correctly converts it to a buffer...

The stream itself is a Duplex stream created by opening a socket. Prior to calling .connect(...) on the socket, the encoding is set to 'binary' using setEncoding('binary'). That seems to be part of the issue, and it requires the extra code to convert the string to a buffer using the binary encoding in the "data" event.

The optimal solution is therefore to not set the encoding on the socket/stream at all, and a properly formatted byte buffer is what is sent as the 'data' parameter to the 'data' event.

That's all working now - as I mentioned, it's just a lot of very vague documentation on how all these various encodings interact, which encoders are even applicable to each use case, and which ones to use when. While I appreciate the very helpful comments, it would still be beneficial to the broader audience to fix the documentation.

Per your request: "Specifically", the documentation at:

https://nodejs.org/api/stream.html#stream_readable_setencoding_encoding

...does not describe what all of the supported encodings are (there are more than just 'UTF8' and 'hex'), nor does it link to any further documentation on them. Also, this documentation refers to "setEncoding(null)" as the recommended approach for binary streams, which is incorrect and does not work (this defaults to UTF8 encoding). There also appears to be an undocumented or semi-documented 'binary' encoding.

On the other hand, the documentation for use of encodings on Buffer objects is comprehensive and informative:

https://nodejs.org/api/buffer.html#buffer_buffers_and_character_encodings

@sam-github
Copy link
Contributor

Prior to calling .connect(...) on the socket, the encoding is set to 'binary' using setEncoding('binary').

The docs say data will be passed as a Buffer if you don't call .setEncoding(), which you are calling!

But I can see how you got a bit lost in all that.

I opened an issue for you: #11352

@RickBullotta
Copy link
Contributor Author

Thanks @sam-github - doc edited and pull request submitted. The main doc issue I wanted to fix was the incorrect reference to setEncoding(null) - and I clarified the default behavior (using Buffers).

gibfahn pushed a commit that referenced this issue May 19, 2017
Removed an incorrect reference to the use of setEncoding(null) as the
proper way to handling binary streams or to disable encoding, and
explained that the default encoding is "no encoding", and that this is
the correct approach for dealing with binary data via Buffers.

PR-URL: #11363
Fixes: #11352
Refs: #11316
Reviewed-By: Sam Roberts <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Gibson Fahnestock <[email protected]>
anchnk pushed a commit to anchnk/node that referenced this issue May 19, 2017
Removed an incorrect reference to the use of setEncoding(null) as the
proper way to handling binary streams or to disable encoding, and
explained that the default encoding is "no encoding", and that this is
the correct approach for dealing with binary data via Buffers.

PR-URL: nodejs#11363
Fixes: nodejs#11352
Refs: nodejs#11316
Reviewed-By: Sam Roberts <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Gibson Fahnestock <[email protected]>
MylesBorins pushed a commit that referenced this issue Jul 17, 2017
Removed an incorrect reference to the use of setEncoding(null) as the
proper way to handling binary streams or to disable encoding, and
explained that the default encoding is "no encoding", and that this is
the correct approach for dealing with binary data via Buffers.

PR-URL: #11363
Fixes: #11352
Refs: #11316
Reviewed-By: Sam Roberts <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Gibson Fahnestock <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Issues that look for answers. stream Issues and PRs related to the stream subsystem.
Projects
None yet
Development

No branches or pull requests

3 participants