Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BREAKING(streams): have toJson() and toText() accept ReadableStream<Uint8Array> #5079

Closed
wants to merge 2 commits into from

Conversation

iuioiua
Copy link
Contributor

@iuioiua iuioiua commented Jun 19, 2024

What's changed

Previously, toJson() and toText() accepted ReadableStream, without the type of stream being defined. Now, these functions accept ReadableStream<Uint8Array>.

Why this change was made

This change was made to align toJson() to Request.json() and Response.json(), and toText() to Request.text() and Response.text(), which seems to be the most common use case for these functions. The change also aligned these two functions with the similarly-oriented toBlob(), which did correctly accept ReadableStream<Uint8Array>.

Migration guide

To migrate, ensure that a ReadableStream of Uint8Arrays is being passed to these functions:

  import { toText } from "@std/streams/to-json";

- const stream = ReadableStream(["Hello, ", "world!"]);
+ const stream = ReadableStream(["Hello, ", "world!"]).pipeThrough(new TextEncoderStream());
  await toText(stream);

In most cases, adding .pipeThrough(new TextEncoderStream()) for existing use cases that use ReadableStream<string> is all that's needed to migrate.

Discussion

Alternatively, all 3 functions could be modified to accept ReadbleStream<Uint8Array> | ReadableStream<string>. However, this wasn't done in order to align with the Request and Response objects and slightly improve performance while still catering to the majority of use cases.

Copy link

codecov bot commented Jun 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.43%. Comparing base (c0f87db) to head (c0e8f9c).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5079      +/-   ##
==========================================
- Coverage   92.43%   92.43%   -0.01%     
==========================================
  Files         481      481              
  Lines       38750    38746       -4     
  Branches     5449     5448       -1     
==========================================
- Hits        35817    35813       -4     
  Misses       2878     2878              
  Partials       55       55              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@iuioiua iuioiua marked this pull request as ready for review June 19, 2024 05:49
@iuioiua iuioiua requested review from kt3k and crowlKats as code owners June 19, 2024 05:49
@kt3k
Copy link
Member

kt3k commented Jun 19, 2024

This change was made to align toJson() to Request.json() and Response.json(), and toText() to Request.text() and Response.text(),

I don't understand why we need to align them to Response methods. This seems unnecessarily reducing the use cases.

@iuioiua
Copy link
Contributor Author

iuioiua commented Jun 20, 2024

That's what it originally said in the descriptions of these functions. Alternatively, we can go ahead with what I described in the Discussion section of the initial comment, but I'd prefer not to.

@kt3k
Copy link
Member

kt3k commented Jun 20, 2024

The original inspiration for these APIs are stream/consumers APIs of Node.js (see #3587 ), and they accept streams of strings

@iuioiua
Copy link
Contributor Author

iuioiua commented Jun 20, 2024

Then why don't users use node:stream/consumers? It's implemented cleanly in Deno for server-side use cases. For most browser-based use cases, one would likely just need the Request and Response methods.

@kt3k
Copy link
Member

kt3k commented Jun 20, 2024

Then why don't users use node:stream/consumers? It's implemented cleanly in Deno for server-side use cases. For most browser-based use cases, one would likely just need the Request and Response methods.

I don't see the point well. Is this argument related to this PR? Or do you suggest removing these APIs?

@iuioiua
Copy link
Contributor Author

iuioiua commented Jun 20, 2024

Yes, I suggest removing them. Frankly, I've never understood the reasons for adding them in the first place 😅

@kt3k
Copy link
Member

kt3k commented Jun 20, 2024

Maybe let's keep discussing that as RC feedback to streams

@iuioiua iuioiua closed this Jun 20, 2024
@iuioiua iuioiua deleted the streams-response branch June 20, 2024 07:56
@iuioiua
Copy link
Contributor Author

iuioiua commented Jul 19, 2024

This almost slipped past us before stabilizing @std/streams. What would you like to do, @kt3k? I now suggest proceeding with this PR, as it seems to support almost all use cases (GH search). My second preference would be to remove these functions.

@iuioiua iuioiua mentioned this pull request Jul 19, 2024
6 tasks
@kt3k
Copy link
Member

kt3k commented Jul 19, 2024

I still don't understand why excluding string (and other type) input is desirable. Why do we need to align the behavior to Response.prototype.toJson? What's the benefit?

@iuioiua
Copy link
Contributor Author

iuioiua commented Jul 19, 2024

I think a better question is why should we support ReadableStream<string> when ReadableStream<Uint8Array> is the primary use case? These 3 functions were deliberately created to align with Response methods (it's in the their descriptions), but extending them to also support ReadableStream<string> seems like it was an arbitrary decision. As a Standard Library, we should be consistent and not do more than we need to, across the board.

@kt3k
Copy link
Member

kt3k commented Jul 19, 2024

why should we support ReadableStream when ReadableStream is the primary use case?

Because they are exposed as public API and we should avoid unnecessary breaking changes as far as possible.

These 3 functions were deliberately created to align with Response methods (it's in the their descriptions)

It's also explicitly documented that they accept string inputs. So this is intentional design, not a mistake.

@kt3k
Copy link
Member

kt3k commented Jul 19, 2024

Also there seem no downside of accepting ReadableStream<string>. Isn't it just sensible/predictable enhancement to Response.prototype.json? We sometimes do such enhancement to Web spec inspired APIs. e.g. crypto.digest of std/crypto accepts Iterable and AsyncIterable data sources in addition to ArrayBuffer(View)s

@iuioiua
Copy link
Contributor Author

iuioiua commented Jul 22, 2024

Also there seem no downside of accepting ReadableStream<string>.

If we try to justify having code we don't need, we end up with a bloated codebase.

Isn't it just sensible/predictable enhancement to Response.prototype.json? We sometimes do such enhancement to Web spec inspired APIs. e.g. crypto.digest of std/crypto accepts Iterable and AsyncIterable data sources in addition to ArrayBuffer(View)s

Yes, but that should be vetted on a case-by-case basis, and within reason. There seems to be no good reason to do this for these functions, as most are using ReadableStream<Uint8Array>. I think the Standard Library should cater to majority of use cases. Current users of these functions would also enjoy a small performance boost.

import { toJson } from "./streams/to_json.ts";

async function toText(stream: ReadableStream<Uint8Array>): Promise<string> {
  const textDecoder = new TextDecoder();
  const reader = stream.getReader();
  let result = "";

  while (true) {
    const { done, value } = await reader.read();

    if (done) {
      break;
    }

    result += textDecoder.decode(value, { stream: true });
  }
  result += textDecoder.decode();
  return result;
}

function toJson2(stream: ReadableStream<Uint8Array>): Promise<unknown> {
  return toText(stream).then(JSON.parse);
}

function createStream() {
  const chunks = [
    "[1, true",
    ', [], {}, "hello',
    '", null]',
  ];
  return ReadableStream.from(chunks).pipeThrough(new TextEncoderStream());
}

const group = "x";

Deno.bench("current", { group, baseline: true }, async () => {
  await toJson(createStream());
});

Deno.bench("new", { group }, async () => {
  await toJson2(createStream());
});
cpu: Apple M2
runtime: deno 1.45.2 (aarch64-apple-darwin)

file:///Users/asher/GitHub/std/bench.ts
benchmark      time (avg)        iter/s             (min … max)       p75       p99      p995
--------------------------------------------------------------- -----------------------------

group x
current        18.36 µs/iter      54,463.3  (13.83 µs … 389.46 µs) 16.83 µs 46.58 µs 96.5 µs
new            17.38 µs/iter      57,527.5   (15.5 µs … 222.79 µs) 16.21 µs 37.42 µs 99.5 µs

summary
  current
   1.06x slower than new

@kt3k
Copy link
Member

kt3k commented Jul 23, 2024

as most are using ReadableStream.

I disagree with this observation. There are several std APIs that return ReadableStream<string>:

  • CsvStringifyStream
  • JsonStringifyStream
  • TextDelimiterStream
  • TextLineStream

Also this change causes perf penalty when the user has ReadableStream<string>. With this change the user once needs to encode it into ReadableStream<Uint8Array>, then decode it to text again using toText.

@iuioiua
Copy link
Contributor Author

iuioiua commented Jul 23, 2024

Ok. I'll submit a superseding PR that clarifies these functions and unblocks [email protected].

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants