Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accept ArrayBuffer (and typed array/data view?) anywhere Buffer is allowed in the API #1826

Closed
domenic opened this issue May 28, 2015 · 29 comments
Labels
buffer Issues and PRs related to the buffer subsystem. feature request Issues that request new features to be added to Node.js. wip Issues and PRs that are still a work in progress.

Comments

@domenic
Copy link
Contributor

domenic commented May 28, 2015

That is, update fs, http, streams, etc.

This is a proposal, but I would think with @trevnorris's work on making Buffer a subclass of Uint8Array, it seems likely to be not too hard...

On the web the best practice has emerged that when accepting arguments, allow any of ArrayBuffer or typed array or DataView, which is why I think more than just Buffer | ArrayBuffer would make sense.

What do people think? @trevnorris, am I understanding correctly that after your work lands this would not be very hard?

@mscdex mscdex added the buffer Issues and PRs related to the buffer subsystem. label May 28, 2015
@mscdex
Copy link
Contributor

mscdex commented May 28, 2015

I'm guessing that this would be within the confines of core and not 3rd party addons (at least such addons would need to explicitly support ArrayBuffers also)?

@domenic
Copy link
Contributor Author

domenic commented May 28, 2015

@mscdex good question. I am not sure how third-party addons get access to the underlying void*. If they all go through a function, we should be able to modify that function to accept any of these...

@trevnorris
Copy link
Contributor

@domenic if you pass new Buffer(new ArrayBuffer(5)) would you expect the data to be copied or pointed to by the new buffer instance?

Either way it can be done. This was actually on my list of things to implement, but I wanted to keep the initial PR small.

TBH it wouldn't be hard to support ArrayBuffer in most of the API. e.g.

var buf = new Buffer(16);
var ab = new ArrayBuffer(16);
buf.copy(ab);

@trevnorris
Copy link
Contributor

Third party support shouldn't be a problem. The native Buffer API still hands back the void* that the ArrayBuffer points to. So anyone can operate on it as they usually have.

@domenic
Copy link
Contributor Author

domenic commented May 28, 2015

@trevnorris to clarify I wasn't just talking about the buffer API; I was talking about the whole io.js API surface area. FS, streams, etc. Not sure if that came across in the OP... editing now.

if you pass new Buffer(new ArrayBuffer(5)) would you expect the data to be copied or pointed to by the new buffer instance?

Given how new Uint8Array(new ArrayBuffer(5)) behaves I'd expected pointed to, but I am more used to typed arrays than I am to buffers to be honest...

@domenic
Copy link
Contributor Author

domenic commented May 28, 2015

This was actually on my list of things to implement, but I wanted to keep the initial PR small.

Yeah definitely, can leave this for a minor version probably.

@trevnorris
Copy link
Contributor

A pointer is fine with me.

If we say that an ArrayBuffer is treated as binary data then this is feasible. Typed Arrays can be a little tricky because of user's expectations (e.g. is the "length" to be written the byte length or the index length). But it can definitely be investigated.

@obastemur
Copy link
Contributor

+1 for even talking ArrayBuffer / node.JS Buffer..

The native Buffer API still hands back the void* that the ArrayBuffer points to.
A pointer is fine with me.

Let's say variable X = ArrayBuffer while the Y = Buffer (points to X's memory). As long as Y is alive, there is a need to keep X from GC.

A naive hope / question / discussion, (although it's a clear break in the API) why not just use ArrayBuffer and node.JS buffer as it's view (optional) ?

@Fishrock123 Fishrock123 added the feature request Issues that request new features to be added to Node.js. label Jun 1, 2015
@feross
Copy link
Contributor

feross commented Jun 6, 2015

if you pass new Buffer(new ArrayBuffer(5)) would you expect the data to be copied or pointed to by the new buffer instance?

I would expect pointed to. All other argument types to the Buffer constructor get copied, but they're all array-like (have length property and are indexable). ArrayBuffer is not array-like, so it's not a huge deal if we add a special case here. It's more useful to have it share memory, I think.

@Trott
Copy link
Member

Trott commented Mar 11, 2016

This has been inactive for a while. I'm going to throw a help wanted label on it, but feel free to remove it if, you know, help isn't actually wanted. (Why do I type these things?)

@Trott Trott added the help wanted Issues that need assistance from volunteers or PRs that need help to proceed. label Mar 11, 2016
@trevnorris
Copy link
Contributor

This would be more easily done now that Buffer is a Uint8Array. In fact the native Buffer::HasInstance() only checks if the object is a Uint8Array instance. Change would be tedious and require a lot of documentation updates. Especially since it may be confusing out of context reading an API takes a Uint8Array, which Buffer is, but not know that you can actually pass a Buffer.

ArrayBuffer would require a bit more work, but also doable now.

@domenic
Copy link
Contributor Author

domenic commented Mar 11, 2016

It seems to me that the "right" way to fix this is to have everything internally operate on ArrayBuffers, then you have the equivalent of the following in all public binary-data accepting APIs:

function f(..., buffer) {
  if (ArrayBuffer.isView(buffer)) {
    buffer = buffer.buffer;
    // this will detect Buffers, Uint8Arrays, DataViews, etc.
  }
  // now you can process the ArrayBuffer buffer, maybe
  // after validating that it's a true ArrayBuffer.
}

@Trott
Copy link
Member

Trott commented Jul 6, 2017

This was actually on my list of things to implement,

@trevnorris Is it still on your list? If not, would you be willing to mentor someone else who wanted to do it? If so, we can add a mentor available label, I suppose.

@TimothyGu
Copy link
Member

A small update after #12223:

  • The C++ infrastructure for consuming all views of ArrayBuffer (all TypedArray types and DataView) has been set up.
  • No module, as far as I know, supports plain ArrayBuffer yet.
  • 2 modules (crypto and zlib) now support all views of ArrayBuffer in addition to Buffer.
  • Most modules now support Uint8Array in addition to Buffer, but not Uint16Array, DataView, etc., thanks to work done by @addaleax.

Example of converting a Buffer-only function to one that also supports Uint8Array: 627ecee
Example of converting a function that supports Uint8Array to one that supports all other ArrayBufferViews: 2ced07c


@Trott Regardless of @trevnorris's status, I would be available for mentoring this, both now and during the upcoming Code & Learn. If someone drafts a list of modules to convert, as far as I'm concerned even good-first-contribution can be applied to this issue.

@Trott
Copy link
Member

Trott commented Jul 7, 2017

@Zahidul-Islam might be interested in taking this on if @TimothyGu is willing to mentor. Maybe he can start with a single module just to get a feel for it?

@trevnorris
Copy link
Contributor

@Trott Been wrapped up w/ work and async hook. If @TimothyGu is available then he should probably mentor.

@TimothyGu
Copy link
Member

I'd like to reiterate that I'm available for mentoring. @Zahidul-Islam We can get started on any module that doesn't yet support any other TypedArrays except Uint8Array. What time will you be available for talk on IRC? I'm generally available 12am – 9am UTC due to the timezone I'm in.

@jemjam
Copy link

jemjam commented Oct 6, 2017

Hey @TimothyGu I've been peering at this item and wouldn't mind helping out (but could maybe make use of some mentoring). Your comment makes sense so I can probably get started, but still wouldn't mind some help. I'm at the Code&Learn today, but can also follow up later.

@TimothyGu TimothyGu added wip Issues and PRs that are still a work in progress. and removed help wanted Issues that need assistance from volunteers or PRs that need help to proceed. labels Oct 7, 2017
addaleax pushed a commit to addaleax/node that referenced this issue Dec 13, 2017
PR-URL: nodejs#16042
Refs: nodejs#1826
Reviewed-By: Timothy Gu <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Joyee Cheung <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Ruben Bridgewater <[email protected]>
MylesBorins pushed a commit that referenced this issue Jan 8, 2018
PR-URL: #16042
Refs: #1826
Reviewed-By: Timothy Gu <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Joyee Cheung <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Ruben Bridgewater <[email protected]>
@SirR4T
Copy link
Contributor

SirR4T commented Aug 22, 2018

@TimothyGu : maybe I'm missing something, but I couldn't find any API surface to convert to use TypedArrays or DataViews in the following modules:

  • vm
  • v8

Also, I'm not sure how a string_decoder should work with TypedArrays, as they do not support a .toString() method (unlike a Buffer, and so a Uint8Array). Any ideas?

@BeniCheni
Copy link
Contributor

Friendly reminder #22562 PR tries out the string_decoder item from the checklist. Looking forward to your review.

addaleax pushed a commit that referenced this issue Sep 17, 2018
Refs: #1826
PR-URL: #22562
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Trivikram Kamat <[email protected]>
@BeniCheni
Copy link
Contributor

Friendly reminder: #22921 tries out the vm item from the checklist of earlier comment.

targos pushed a commit that referenced this issue Sep 18, 2018
Refs: #1826
PR-URL: #22562
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Trivikram Kamat <[email protected]>
targos pushed a commit that referenced this issue Sep 19, 2018
Refs: #1826
PR-URL: #22562
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Trivikram Kamat <[email protected]>
targos pushed a commit that referenced this issue Sep 20, 2018
Refs: #1826
PR-URL: #22562
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Trivikram Kamat <[email protected]>
@BeniCheni
Copy link
Contributor

BeniCheni commented Oct 2, 2018

danbev pushed a commit that referenced this issue Oct 8, 2018
In tls module, accept ArrayBuffer/DataView in place of isUint8Array in
the source code & related test code in "test-tls-basic-validations.js",
per the "tls" item in the checklist of the comment in #1826.

PR-URL: #23210
Refs: #1826
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Denys Otrishko <[email protected]>
jasnell pushed a commit that referenced this issue Oct 17, 2018
In tls module, accept ArrayBuffer/DataView in place of isUint8Array in
the source code & related test code in "test-tls-basic-validations.js",
per the "tls" item in the checklist of the comment in #1826.

PR-URL: #23210
Refs: #1826
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Denys Otrishko <[email protected]>
@oyyd
Copy link
Contributor

oyyd commented Oct 29, 2018

Related change for v8 module is introduced in #23953.

refack pushed a commit to oyyd/node that referenced this issue Oct 31, 2018
This commit allow passing `TypedArray` and `DataView` to:
- v8.deserialize()
- new v8.Deserializer()
- v8.serializer.writeRawBytes()

PR-URL: nodejs#23953
Refs: nodejs#1826
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Refael Ackermann <[email protected]>
targos pushed a commit that referenced this issue Nov 2, 2018
This commit allow passing `TypedArray` and `DataView` to:
- v8.deserialize()
- new v8.Deserializer()
- v8.serializer.writeRawBytes()

PR-URL: #23953
Refs: #1826
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Refael Ackermann <[email protected]>
refack pushed a commit that referenced this issue Nov 2, 2018
PR-URL: #22921
Refs: #1826
Refs: #22921 (comment)
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Denys Otrishko <[email protected]>
Reviewed-By: Refael Ackermann <[email protected]>
targos pushed a commit that referenced this issue Nov 2, 2018
PR-URL: #22921
Refs: #1826
Refs: #22921 (comment)
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Denys Otrishko <[email protected]>
Reviewed-By: Refael Ackermann <[email protected]>
@teslatickles
Copy link

teslatickles commented Nov 23, 2018

I'm eager to help @TimothyGu. If possible, I would be incredibly thankful for any guidance/mentoring. I looked at the examples you provided and think I understand what needs to be done (maybe some initial guidance?). I apologize if I come off as a complete "noob", but I really enjoy Node, the community seems great, and I want to actually CONTRIBUTE!

BethGriggs pushed a commit that referenced this issue Apr 8, 2019
This commit allow passing `TypedArray` and `DataView` to:
- v8.deserialize()
- new v8.Deserializer()
- v8.serializer.writeRawBytes()

PR-URL: #23953
Refs: #1826
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Refael Ackermann <[email protected]>
BethGriggs pushed a commit that referenced this issue Apr 8, 2019
PR-URL: #22921
Refs: #1826
Refs: #22921 (comment)
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Denys Otrishko <[email protected]>
Reviewed-By: Refael Ackermann <[email protected]>
@benwiley4000
Copy link

benwiley4000 commented Jan 6, 2020

Not sure if it's exactly the same problem but in my case what I want is something that is exactly the same as Buffer/Uint8Array in terms of API (either would work for me) but just allows an arbitrarily large size like ArrayBuffer instead of being capped like Buffer and Uint8Array. Not sure if there is any plan to remove the Uint8Array cap but otherwise a good solution for a wrapper around ArrayBuffer would be to allocate a series of internal Uint8Array views on construction and use those to set and grab values.

E.g.

const UINT8_ARRAY_VIEW_SIZE = 1000 * 1000 * 500 // 500MB ?

class UncappedUint8Array extends Uint8Array {
  static from(arr) {
    return arr.slice();
  }

  constructor(arrayBuffer) {
    this._arrayBuffer = arrayBuffer;
    const { length } = this;
    this._views = Array(Math.ceil(length / UINT8_ARRAY_VIEW_SIZE));
    let offset = 0;
    while (offset < length) {
      const size = Math.min(length - offset, UINT8_ARRAY_VIEW_SIZE);
      this._views.push(new Uint8Array(arrayBuffer, offset, size));
      offset += size;
    }
  }

  get length() {
    return this._arrayBuffer.byteLength;
  }

  // could be used for index get
  _byteAt(index) {
    const viewSize = UINT8_ARRAY_VIEW_SIZE;
    return this._views[Math.floor(length / viewSize)][length % viewSize];
  }

  // could be used for index set
  _setByteAt(index, value) {
    const viewSize = UINT8_ARRAY_VIEW_SIZE;
    this._views[Math.floor(length / viewSize)][length % viewSize] = value;
  }

  indexOf(value) {
    const { length } = this;
    for (let i = 0; i < length; i++) {
      if (this._byteAt(i) === value) {
        return i;
      }
    }
    return -1;
  }

  slice(startIndex, endIndex) {
    return new UncappedUint8Array(this._arrayBuffer.slice(startIndex, endIndex));
  }

  // ... etc
}

Not sure what amount of work would be required for me to implement the whole Uint8Array or Buffer interfaces this way, if too complicated could be something to consider for core? I guess I would need a Proxy to handle indexes? But not sure if that would be sufficient for all the other methods.. would they defer to the index Proxy?

@addaleax
Copy link
Member

I’ll close this issue because I think it’s (mostly?) done.

@benwiley4000 The fact that ArrayBuffers are fixed-size is prescribed by the JS spec, it’s outside of Node.js’s control. This is something that probably would have to be taken up with TC39 first, if you want to see changes in the language or in Node.js.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buffer Issues and PRs related to the buffer subsystem. feature request Issues that request new features to be added to Node.js. wip Issues and PRs that are still a work in progress.
Projects
None yet
Development

No branches or pull requests