Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft: Add support for read/seek on files in the native file system #16307

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

msabwat
Copy link
Contributor

@msabwat msabwat commented Feb 16, 2022

Hi,

This is a set of changes to try to solve the issue described in #14094.

This PR will add :

  • EMSCRIPTEN_NATIVE_FS option: to spawn the nativefs_worker and setup pthreads so that they can read/seek selected files from sharedMemory
  • NATIVEFS_MAX_FD option: to set the maximum number of files the user can select
  • nativefs_* functions that can be called from c/c++
  • a button in the demo page to allow file selection
  • unit tests and one functional test for nativefs_read and nativefs_seek

I had an issue when I wanted to pass parameters from pthreads to the main thread, directly from a javascript library. I patched mainThreadEM_ASM to be able to make the calls, without having to define functions in C only to add Macros to inline javascript code.

}
};
}
#endif
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than putting this in the preample, I think you can just create an "NativeFS.init()" function in your library and then use the __postset feature of the library to call that init function on startup.

// This option depends on the previous one, it controls the size of allocated
// nativefs structure that contains state information on the files currently open.
// default value is 1024, maximum number of file descriptors allowed.
var NATIVEFS_MAX_FD=1024
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I seems like you are creating an entirely new FS space.. that sits alongside the existing one. Why not integrate with it such that certain FSs can be from native FS and others from other filesystems? Then you don't need a new API at all and you can just use the normal C fread/fwrite/fopen APIs.

As it stands it seem confusing that fd 1 will exist both as regular fd and a nativefs fd.

@@ -190,6 +190,9 @@ function callMain(args) {
assert(ret == 0, '_emscripten_proxy_main failed to start proxy thread: ' + ret);
#endif
#else
#if EMSCRIPTEN_NATIVE_FS
_nativefs_cleanup();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any, I think it would be good to keep this kind of thing in the library file for nativefs. You can use addAtExit like library_fs.js does.

"defines": [
"NATIVEFS_SEEK_SET",
"NATIVEFS_SEEK_CUR",
"NATIVEFS_SEEK_END"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use different constants to the existing SEEK_XX ones?

}
#endif

#endif NATIVEFS_H
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NATIVEFS_H should be in a comment here

var fd = 2;
var sharedMemory;
var fs_lock = -1;
var nativefs_handle;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why this code has to be in worker.

Why can't the files be stored on the main thread? It looks like its just an array of blobs which should work just as well on the main thread.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for the review !
I thought it would be easier to have this version first. With Atomics.wait() impossible in the main thread, and having to do async reads instead of sync reads.

I agree it would be better to avoid having to send the file anywhere at all. I will send another version taking your comments into account.

@kripken
Copy link
Member

kripken commented Feb 16, 2022

I haven't read all the details here, but there is a lot of code. Can you perhaps write up a short overview of what this does, and how?

As a general thought: We are working hard on WasmFS now, which will eventually replace the old FS code, so new file-related things should use WasmFS in most cases. Here, could we add a file backend for WasmFS that supports the file picker? (Also, this might be easier to do in WasmFS as it has native support for multithreading.)

WasmFS is not quite stable, but implementing some backends is already possible. See #16209 and #16229 for generic mechanisms for interacting with JS.

@msabwat
Copy link
Contributor Author

msabwat commented Feb 17, 2022

I haven't read all the details here, but there is a lot of code. Can you perhaps write up a short overview of what this does, and how?

It allows a user to compile a multi-threaded C/C++ application that will read/seek files selected in the native filesystem with the file picker. Because I though making calls in the main thread was not ideal, there is a dedicated runtime worker that will receive all the files along with operations to read/seek/cleanup the state.

for example: https://github.com/msabwat/emscripten/blob/nfs/2/tests/browser/nativefs/nativefs_seek.c

As a general thought: We are working hard on WasmFS now, which will eventually replace the old FS code, so new file-related things should use WasmFS in most cases. Here, could we add a file backend for WasmFS that supports the file picker? (Also, this might be easier to do in WasmFS as it has native support for multithreading.)

WasmFS is not quite stable, but implementing some backends is already possible. See #16209 and #16229 for generic mechanisms for interacting with JS.

Thank you for the explanation, it makes more sense to add a wasmfs backend

@kripken
Copy link
Member

kripken commented Feb 22, 2022

@msabwat Thanks for the explanation!

Given the dedicated worker approach here, I think in WasmFS you can use an async proxied JS backend (#16229). It's pretty new, but what you're doing sounds like a great use case for that. If you don't mind changing APIs atm then you might be interested to try it out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants