Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WasmFS] Fix FS.readFile + WASM_BIGINT #17316

Merged
merged 10 commits into from
Jul 18, 2022

Conversation

kleisauke
Copy link
Collaborator

@kleisauke kleisauke commented Jun 25, 2022

_wasmfs_read_file wrote a 32-bit unsigned integer, while
makeGetValue in JS read this as a 64-bit integer.

This caused the length to be returned incorrectly when linking
with -sWASM_BIGINT. Fix this by ensuring that _wasmfs_read_file
will write a 64-bit integer.

Resolves: #17444.

The wrong type was used for `makeGetValue` causing the length
to be returned incorrectly when linking with `-sWASM_BIGINT`.
tests/test_other.py Show resolved Hide resolved
// The integer length is returned in the first 8 bytes of the buffer.
var length = {{{ makeGetValue('buf', '0', 'i64') }}};
// The integer length resides in the first 8 bytes of the buffer.
var length = {{{ makeGetValue('buf', '0', 'i32') }}};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, how about changing

*(uint32_t*)result = size;

to write an i64? I think that might be better. The size can be more than 32 bits, and we are all set up for that, just the wrong type is written in that C++.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, WasmFS currently uses 32-bit sizes in a lot of places where it should be using 64-bit sizes...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit 567cfd3 ensures that it writes an i64 in C++, and that it is read out as an i64 in JS. I've updated the PR description accordingly.

Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

lgtm % last comment

system/lib/wasmfs/js_api.cpp Outdated Show resolved Hide resolved
var length = {{{ makeGetValue('buf', '0', 'i64') }}};

// Default return type is binary.
// The buffer contents exist 8 bytes after the returned pointer.
#if WASM_BIGINT
// The length as BigInt must be converted to a Number.
var ret = new Uint8Array(HEAPU8.subarray(buf + 8, buf + 8 + Number(length)));
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was considering using $bigintToI53Checked here instead of Number(length).

$MAX_INT53: '{{{ Math.pow(2, 53) }}}',
$MIN_INT53: '-{{{ Math.pow(2, 53) }}}',
// Counvert a bigint value (usually coming from Wasm->JS call) into an int53
// JS Number. This is used when we have an incoming i64 that we know is a
// pointer or size_t and is expected to be withing the int53 range.
// Returns NaN if the incoming bigint is outside the range.
$bigintToI53Checked__deps: ['$MAX_INT53', '$MIN_INT53'],
$bigintToI53Checked: function(num) {
return (num < MIN_INT53 || num > MAX_INT53) ? NaN : Number(num);
},

That is, something similar to {{{ receiveI64ParamAsI53('length', -cDefine('EOVERFLOW')) }}}.

emscripten/src/parseTools.js

Lines 1118 to 1122 in 5003f94

function receiveI64ParamAsI53(name, onError) {
if (WASM_BIGINT) {
// Just convert the bigint into a double.
return `${name} = bigintToI53Checked(${name}); if (isNaN(${name})) return ${onError};`;
}

But I think this is redundant, since that is only a issue when using >= ~8GB buffers (_malloc(Number.MAX_SAFE_INTEGER + 1)). I'm not aware of any VM that currently supports such large buffers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, that size buffer should not be a problem.

But actually I think that using readI53FromU64 may be the best thing here. We do actually want a JS number here for the reason you said, and that seems like the simplest way to get it. Then we wouldn't need an ifdef here: readI53FromU64 returns a Number either way.

@sbc100 what do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should modify makeGetValue above such that we can pass i53 instead of i64 ? This would differ from i64 in that the return value would always be a Number?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sbc100 Nice idea! sgtm

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, there is a difference between the signed and unsigned case though, which is why we have both readI53FromU64 and readI53FromI64, so just i53 is not enough information I think.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already support i64 and u64 in makeGetValue so we can support i53 and u53 too.

Copy link
Member

@kripken kripken Jul 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I forgot that, i53/u53 might make sense then.

One issue is that functions need to include the helper function as a dep (and to the developer it's less visible without explicitly calling the dep). But maybe the auto-dep stuff you recently added could make that work automatically.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have i53ConversionDeps for this purpose. Can makeGetValue not just call bigintToI53Checked... I don't see any singed/unsigned version of bigintToI53Checked..

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit 8efae3e incorporates this (but this will probably conflict with PR #17459). According to POSIX, off_t is a signed integer type, so readI53FromU64 is not appropriate here.

Note that this will not catch overflows, see e.g. these TODO items:

// TODO: Add $readI53FromI64Signaling() variant.

// TODO: Add $readI53FromU64Signaling() variant.

Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks!

@sbc100 did you want to take another look?

@@ -22,6 +22,8 @@ __wasi_fd_t wasmfs_create_file(char* pathname, mode_t mode, backend_t backend);
// The buffer will also contain the file length.
// Caller must free the returned pointer.
void* _wasmfs_read_file(char* path) {
static_assert(sizeof(off_t) == 8, "File offset type must be 64-bit");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this assert here.. I'm pretty sure if this ever changed all kind of stuff would break.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly the assert is a little redundant, but it's static so no overhead, and also it is helpful for the reader, I think, as it indicates this is a contract with the JS side that must not change.

Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@@ -408,6 +408,9 @@ function makeGetValue(ptr, pos, type, noNeedFirst, unsigned, ignore, align) {
}

const offset = calcFastOffset(ptr, pos, noNeedFirst);
if (type === 'i53' || type === 'u53') {
return 'readI53From' + (unsigned ? 'U' : 'I') + '64(' + offset + ')';
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already have another use of this in #17459 !

@kripken
Copy link
Member

kripken commented Jul 18, 2022

Let's maybe consider the assert in a followup, as there is a user that has hit the bug, so let me land this fix now.

@kripken kripken merged commit e79f10e into emscripten-core:main Jul 18, 2022
@kleisauke kleisauke deleted the wasmfs-fix-readfile-bigint branch July 19, 2022 19:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

WasmFS: FS.readFile doesn't work with BigInt
4 participants