Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issues with the encoding option #67

Closed
ehmicky opened this issue Aug 7, 2023 · 3 comments · Fixed by #69
Closed

Issues with the encoding option #67

ehmicky opened this issue Aug 7, 2023 · 3 comments · Fixed by #69

Comments

@ehmicky
Copy link
Collaborator

ehmicky commented Aug 7, 2023

The encoding option has the following issues.

1. It does not work with getStreamAsBuffer()

The encoding option is currently limited to getStream(). It is a noop when using getStreamAsBuffer().

get-stream/index.js

Lines 51 to 53 in c960887

export async function getStreamAsBuffer(stream, options) {
return getStream(stream, {...options, encoding: 'buffer'});
}

2. It handles the output encoding, not the input one

While the encoding option is currently documented as The encoding of the incoming stream., it is actually the encoding of the output instead. For example, it cannot be used to decode an incoming hex/base64/etc. stream. Instead, it encodes the incoming stream.

import {compose} from 'node:stream'
import getStream from 'get-stream'

console.log(await getStream(compose([Buffer.from('...').toString('hex')]))) // 2e2e2e, as expected
console.log(await getStream(compose([Buffer.from('...').toString('hex')]), {encoding: 'hex'})) // 326532653265, not ...

3. It is not needed for buffer/string difference

Using buffer vs string as input or output is handled without the need of the encoding option. Especially now that a separate method exists for buffers.

4. Encoding is a separate concern

Based on the above, the encoding option's only purpose is the following: encoding the return value as a hex or base64 string. It can also be used for latin1, etc., but returning those less common encodings as output seems much less useful.

However, encoding a string as hex or base64 can be done fairly simply and in many different ways, outside of this package. Encoding as hex without streaming can be as simple as (await getStreamAsBuffer(stream)).toString('hex'). Base64 streaming can be done with packages like [b64]*https://github.com/hapijs/b64), and without streaming with .toString('base64') or btoa().

In general, it seems like one would want to use get-stream to consume a stream in an easy and safe fashion, but encoding it to hex or base64 might be a separate concern better handled by other packages.

5. Its implementation might lead to bugs

The current implementation requires using a PassThrough stream instead of consuming the input stream directly. This additional layer has created issues such as #52, because the PassThrough stream does its own buffering.

6. It is slower

The intermediate stream requires its own computation, i.e. it is slower (although probably not much). It might also lead to inefficient buffering strategies.

7. This prevents supporting ReadableStream

The current implementation could easily support web-compatible ReadableStream, in addition to Node.js-specific streams, since it simply iterates with for await, which works well with ReadableStream. However, the encoding option would be difficult to support with ReadableStream. TextDecoder does not work since the encoding option encodes (not decodes). TextEncoder does not work because can only encodes to utf8. Also the list of encodings supporting in the browsers are different from the ones supported by Node.js.


Based on the above, I am curious whether you'd consider removing the encoding option @sindresorhus? Besides making the code simpler, faster and more resilient, it would enable supporting ReadableStream. I'd be happy to implement both of this, if you think this is a good idea.

@sindresorhus
Copy link
Owner

You make I good case for removing the encoding option. I agree.

@sindresorhus
Copy link
Owner

I'd be happy to implement both of this, if you think this is a good idea.

👍

@ehmicky
Copy link
Collaborator Author

ehmicky commented Aug 8, 2023

Done in #69.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants