Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

doc: document AliasedBuffers usage scope in style guide #24724

Closed
wants to merge 3 commits into from

Conversation

gireeshpunathil
Copy link
Member

Expose AliasedBuffer API and its function through the C++ style guide, and also differentiate it with
TypedArrays.

Fixes: #22977

A draft attempt to follow @joyeecheung 's thought, open to iterate over the verbiage.

/cc @joyeecheung @addaleax

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines

@nodejs-github-bot nodejs-github-bot added the doc Issues and PRs related to the documentations. label Nov 29, 2018
need to be transferred between C++ and JS semantics, and can incur
significant effort if passing around a lot with the data.

AliasedBuffer on the other had, provides direct access to the backing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
AliasedBuffer on the other had, provides direct access to the backing
AliasedBuffer on the other hand, provides direct access to the backing

@addaleax
Copy link
Member

Um … this reads a bit confusing to me. AliasedBuffers are TypedArrays, and conversely, we can directly access a TypedArray’s memory? I’m not sure there’s anything to actually choose here…

The last paragraph also seems somewhat odd, because the reason that we have AliasedBuffer is that we can use the abstraction to hook into the VM’s object modification observation mechanisms; so it seems like the opposite is true? Not using AliasedBuffer to modify TypedArray memory is what breaks these mechanisms. (This detection stuff is only actually supported by @nodejs/chakracore, afaik, and I’m not sure about the state of that project as a whole.)

So: We should use AliasedBuffers when we want the VM to be able to observe modifications from the C++ side. That’s pretty it, as far as I know.

@@ -21,6 +21,7 @@
* [Use explicit pointer comparisons](#use-explicit-pointer-comparisons)
* [Ownership and Smart Pointers](#ownership-and-smart-pointers)
* [Avoid non-const references](#avoid-non-const-references)
* [Choice between Typed Arrays and AliasedBuffers](#typed-arrays-andaliased-buffers)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* [Choice between Typed Arrays and AliasedBuffers](#typed-arrays-andaliased-buffers)
* [Choice between Typed Arrays and AliasedBuffers](#choice-between-typed-arrays-and-aliasedbuffers)

### Choice between Typed Arrays and AliasedBuffers

Typed Arrays define APIs for handling binary data. While the usage of
these APIs are straightforward, the data created through these APIs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
these APIs are straightforward, the data created through these APIs
these APIs is straightforward, the data created through these APIs


AliasedBuffer on the other had, provides direct access to the backing
buffer of a JS object. This means direct writes into the buffer from C++
becomes visible through the wrapping JS object and vice versa.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
becomes visible through the wrapping JS object and vice versa.
become visible through the wrapping JS object and vice versa.

buffer of a JS object. This means direct writes into the buffer from C++
becomes visible through the wrapping JS object and vice versa.

While this is more efficient than Typed Arrays, it overrides the v8's
Copy link
Contributor

@vsemozhetbyt vsemozhetbyt Nov 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
While this is more efficient than Typed Arrays, it overrides the v8's
While this is more efficient than Typed Arrays, it overrides the V8's

@gireeshpunathil
Copy link
Member Author

thanks for the reviews! let me wait to hear from @joyeecheung as well.

@joyeecheung
Copy link
Member

joyeecheung commented Nov 30, 2018

I agree with @addaleax . For the background of the AliasedBuffer abstraction, see #15077

The most applicable case of AliasedBuffer is when you have some global data that you need to pass between C++ and JS out-of-band. For example, the AliasedBuffers on the Environment fs_stats_field_array are globally shared among all the fs.*stat callback wrappers and fs.*statSync implementation, the C++ bindings put the results in those buffers and the JS code would read them with the assumption that during the execution the data in those global buffers are fresh and are exactly what they want. This makes it difficult for ChakraCore to implement the time-travel debugging because there is no way for the VM to observe this mutation, hence the abstraction was introduced. Although, AFAICT, we now always return the buffer or pass it to callbacks before we return to the JS layer, and the JS layer does not actually grab it from the global binding anymore (even though essentially the values are still global per environment, we only pass around a local reference to that global), so in theory the VM should be able to observe the change without this abstraction now.

They are also used when the ArrayBufferViews need to be persistent, e.g. the ones in the FSReqPromise - those are not globally shared among callsites, but they are globals to the V8 isolate, and can be garbage collected after the promise resolves and the JS land is done with them . Although these are also always resolved back to the JS land now so I guess it's not strictly necessary for the VM to observe them through the abstraction, but nonetheless AliasedBuffer serves as a home-grown abstraction for manipulating these global arrays so there is still merit in it.

@gireeshpunathil
Copy link
Member Author

thanks for all the reviews! pushed a modified version that I think covers the essence of inputs especially from @addaleax and @joyeecheung ; PTAL, thanks!


// Modify the data through the setters at specified index
// Thus, the setter has a chance to hook mutation observers
data.SetValue(0, 12345);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually use the [] accessors, so I think the style guide should list them as the primary alternative?

So if the object's backing data is being modified from C++ side
out-of-band with the calling context, the recommended abstraction to use
is AliasedBuffer class, that has place holders to hook observers
for data mutations.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This paragraph is very hard to read for me.

I don’t want to tell you what to write, but I think something like this would do:

When modifying typed arrays directly from C++, use an AliasedBuffer when possible,
because that makes it easier for some JS debuggers to observe these modifications.

We’d probably want to go into more detail when – if ever – something like node-chakracore gets merged into Node.js core, and we actually need to take care of always making modifications observable. For now, the chakracore team has been doing that on their own in the places where we don’t use AliasedBuffer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@addaleax - thanks, pushed a modified version with almost your suggested wordings. ptal, thanks.

@gireeshpunathil gireeshpunathil changed the title doc: document choice between TypedArray and AliasedBuffer in style guide doc: document AliasedBuffers usage scope in style guide Dec 1, 2018

// Modify the data through natural operator semantics,
// but those operators are overloaded in AliasedBuffer
// and provide placeholders to hook mutation observers.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this provides placeholders is only relevant to VM engine implementers (i.e. not Node.js coders), so I don’t think it needs to be in the style guide.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This information is also included in the doc comment for AliasedBuffer, which is a much better place for it anyway?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, removed the comment that reveals info relevant to VM implementors.


When working with typed arrays that involves direct data modification
from C++, and when such modifications can have side effects outside
of the callsite, use an AliasedBuffer when possible, because that
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, but I’m still somewhat confused… What does

when such modifications can have side effects outside of the callsite

mean?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean to express this:
If the C++ layer modification of data is unanticipated / unforeseen / invisible at the callsite.

Example:

buffer.fill('x')

here, the backend buffer is filled with data at C++ level, and this is evident by looking at the JS callsite. Whereas:

obj.registerModule('foo)';

running into C++ land, and modifying artifacts linked to obj is not evident at the callsite.

My way of telling is to recommend AliasedBuffer usage in second type of cases. Does it make any meaning to you?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gireeshpunathil Yes, but … the thing is, afaik (and you or @nodejs/chakracore can correct me if I’m wrong), there’s no real difference between these cases.

The VM debugger wouldn’t be notified of modifications even in the buffer.fill('x') case; and because we don’t use AliasedBuffer (and we can’t really, unless we port memset/memcpy to it), they still have to do explicit notifications after our modifications:

https://github.com/nodejs/node-chakracore/blob/38520056313bc747e3ca6dcc7598c2e51394c0ac/src/node_buffer.cc#L575-L583

(They haven’t fully covered all the Buffer methods, to be fair.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@addaleax - ok; I removed that phrase to avoid confusion.

[On a side note]: In a sufficiently large application, what the VM (or a monitoring service that leverages the VM function) observed data at this level will be useful for? In other words, for a program or a user, a buffer.fill is an obvious callsite that mutates the buffer data at some lower levels, so may not be of real interest (or of any use), but would be interested to get notified on an out-of-bounds access because there is no normal way of knowing it otherwise?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gireeshpunathil So, the ChakraCore application for this whole notification thingy is time-travel debugging… I haven’t ever tried that myself, but I believe the purpose here is to be able to compare the values of JS objects at different points in time, and that’s what the notifications are for? Without this, I think the debugger could show the same contents of a buffer after a .fill() call and when going back in time to before that call?

@gireeshpunathil
Copy link
Member Author

@@ -21,6 +21,7 @@
* [Use explicit pointer comparisons](#use-explicit-pointer-comparisons)
* [Ownership and Smart Pointers](#ownership-and-smart-pointers)
* [Avoid non-const references](#avoid-non-const-references)
* [When to use AliasedBuffers](#when-to-use-aliasedbuffers)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe Use AliasedBuffers to manipulate TypedArrays would be easier to recognize in the TOC?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joyeecheung - done, thanks.


When working with typed arrays that involves direct data modification
from C++, use an AliasedBuffer when possible, because that makes it
easier for some JS debuggers to observe these modifications.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The term some JS debuggers reads somewhat strange to me, can we just omit the reason and direct the reader to the comment in aliased_buffer.h? (one needs to read the header to use it anyway)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modified accordingly, and linked with the header file section for reference. Hope @addaleax is good with this change?

Explain usage context and scope of AliasedBuffer API and its
function in the C++ style guide. Provide an example code.

Fixes: nodejs#22977
@gireeshpunathil
Copy link
Member Author

@targos @vsemozhetbyt the content you have reviewed and change suggested is fundamentally changed, so please take another look, thanks!

Copy link
Contributor

@vsemozhetbyt vsemozhetbyt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about these suggestions, so it would be good to get a +1 from native speakers and C++ users. Otherwise, doc format LGTM.

@@ -257,6 +258,21 @@ class ExampleClass {
};
```

### Use AliasedBuffers to manipulate TypedArrays

When working with typed arrays that involves direct data modification
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
When working with typed arrays that involves direct data modification
When working with typed arrays that involve direct data modification

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed, thanks.

### Use AliasedBuffers to manipulate TypedArrays

When working with typed arrays that involves direct data modification
from C++, use an AliasedBuffer when possible. The API abstraction and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
from C++, use an AliasedBuffer when possible. The API abstraction and
from C++, use an `AliasedBuffer` when possible. The API abstraction and

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed


When working with typed arrays that involves direct data modification
from C++, use an AliasedBuffer when possible. The API abstraction and
the usage scope of AliasedBuffer is documented in [aliased_buffer.h][].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
the usage scope of AliasedBuffer is documented in [aliased_buffer.h][].
the usage scope of `AliasedBuffer` are documented in [aliased_buffer.h][].

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

the usage scope of AliasedBuffer is documented in [aliased_buffer.h][].

```c++
// Create an AliasedBuffer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Create an AliasedBuffer
// Create an AliasedBuffer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@gireeshpunathil
Copy link
Member Author

@vsemozhetbyt thanks for the suggestion, and the recommendation. /cc @Trott (a native speaker) and @jasnell (a C++ user / both) - request a review!


When working with typed arrays that involve direct data modification
from C++, use an `AliasedBuffer` when possible. The API abstraction and
the usage scope of `AliasedBuffer` is documented in [aliased_buffer.h][].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more nit:

Suggested change
the usage scope of `AliasedBuffer` is documented in [aliased_buffer.h][].
the usage scope of `AliasedBuffer` are documented in [aliased_buffer.h][].

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vsemozhetbyt - changed, thanks.

Copy link
Member

@addaleax addaleax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still LGTM, yes :)

@gireeshpunathil
Copy link
Member Author

@Trott
Copy link
Member

Trott commented Dec 4, 2018

Landed in 447b390

@Trott Trott closed this Dec 4, 2018
Trott pushed a commit to Trott/io.js that referenced this pull request Dec 4, 2018
Explain usage context and scope of AliasedBuffer API and its
function in the C++ style guide. Provide an example code.

Fixes: nodejs#22977

PR-URL: nodejs#24724
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Joyee Cheung <[email protected]>
Reviewed-By: Vse Mozhet Byt <[email protected]>
BridgeAR pushed a commit that referenced this pull request Dec 5, 2018
Explain usage context and scope of AliasedBuffer API and its
function in the C++ style guide. Provide an example code.

Fixes: #22977

PR-URL: #24724
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Joyee Cheung <[email protected]>
Reviewed-By: Vse Mozhet Byt <[email protected]>
@BridgeAR BridgeAR mentioned this pull request Dec 5, 2018
4 tasks
refack pushed a commit to refack/node that referenced this pull request Jan 14, 2019
Explain usage context and scope of AliasedBuffer API and its
function in the C++ style guide. Provide an example code.

Fixes: nodejs#22977

PR-URL: nodejs#24724
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Joyee Cheung <[email protected]>
Reviewed-By: Vse Mozhet Byt <[email protected]>
BethGriggs pushed a commit that referenced this pull request Feb 12, 2019
Explain usage context and scope of AliasedBuffer API and its
function in the C++ style guide. Provide an example code.

Fixes: #22977

PR-URL: #24724
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Joyee Cheung <[email protected]>
Reviewed-By: Vse Mozhet Byt <[email protected]>
@BethGriggs BethGriggs mentioned this pull request Feb 12, 2019
BethGriggs pushed a commit that referenced this pull request Feb 20, 2019
Explain usage context and scope of AliasedBuffer API and its
function in the C++ style guide. Provide an example code.

Fixes: #22977

PR-URL: #24724
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Joyee Cheung <[email protected]>
Reviewed-By: Vse Mozhet Byt <[email protected]>
rvagg pushed a commit that referenced this pull request Feb 28, 2019
Explain usage context and scope of AliasedBuffer API and its
function in the C++ style guide. Provide an example code.

Fixes: #22977

PR-URL: #24724
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Joyee Cheung <[email protected]>
Reviewed-By: Vse Mozhet Byt <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc Issues and PRs related to the documentations.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

doc: document AliasedBuffer in C++ style guide
7 participants