Refine the API in the public header #340

ChinYikMing · 2024-01-31T18:45:05Z

The following should be included in an emulator's simple and clear public API:

create/init core
run emulation
delete/destroy core

Other components, including as memory, file systems, program data, etc., should be abstracted from the user, as a result, setting a configuration value (vm_attr_t) is sufficient. The user should manage about memory (state_t) and elf stuff before this PR. The user may just construct a core, run it, and shut it down after this PR, so they won't need to worry about them anymore.

For stdio remapping, rv_remap_stdstream function is introduced.

The vm_attr_t has multiple fields and they are commented clearly in the code.

elf is reopened in dump_test_signature because elf is allocated during rv_create. It is acceptable to reopen elf since it is only for testing. Print inferior exit code to console inside main instead of syscall_exit because the actual usage of exit code depends on applications of using riscv public API.

The io interface is not changed in this PR because it could maybe reused with semu in some way, still need to be investigated. Also, Logging feature and system emulator integration are not implemented yet.

src/syscall_sdl.c

src/emulate.c

src/main.c

src/riscv.c

ChinYikMing · 2024-01-31T20:01:54Z

As discuss before, exit code shall be stored and maybe not to output to console directly.

Unfortunately, CI/host-x64 and CI/host-arm run make check and make misalign for testing, and the compare signature includes "inferior exit code 0" during exit but I do not output them in this PR. Shall we modify make check and make misalign ?

src/syscall.c

jserv · 2024-01-31T20:07:00Z

As discuss before, exit code shall be stored and maybe not to output to console directly.
Unfortunately, CI/host-x64 and CI/host-arm run make check and make misalign for testing, and the compare signature includes "inferior exit code 0" during exit but I do not output them in this PR. Shall we modify make check and make misalign ?

After a RISC-V program ends, it is important to verify the exit code (i.e., $?). Consequently, modifications to 'check' and related build targets are necessary to accommodate this.

ChinYikMing · 2024-01-31T20:09:07Z

After a RISC-V program ends, it is important to verify the exit code (i.e., $?). Consequently, modifications to 'check' and related build targets are necessary to accommodate this.

Sure, a build macro can be added to syscall_exit and print the exit code only on test for passing CI.

visitorckw · 2024-01-31T20:56:19Z

Due to the memory size being changed to a runtime-defined value, do we need to check if the addresses of load and store instructions exceed the specified range?

jserv · 2024-02-01T04:23:26Z

I would like to express a concern regarding the rv_userdata function being part of the public API. Originally, it was designed for internal implementation, which suggests there is no necessity to make rv_userdata accessible to users interacting with the public API. It might be more appropriate to keep this function internal to maintain clarity and simplicity in the API's usage.

ChinYikMing · 2024-02-01T06:10:12Z

Sure, a build macro can be added to syscall_exit and print the exit code only on test for passing CI.

The PRINT_EXIT_CODE build macro do not used in default build, thus make CI/host-x64 or CI/host-arm64 failed. Shall I add it to default build? Personally, I am prefer default build without PRINT_EXIT_CODE.

Another way to pass CI is that modifying the CI flow, build rv32emu with PRINT_EXIT_CODE in CI tests.

jserv · 2024-02-01T06:51:49Z

Sure, a build macro can be added to syscall_exit and print the exit code only on test for passing CI.

The PRINT_EXIT_CODE build macro do not used in default build, thus make CI/host-x64 or CI/host-arm64 failed. Shall I add it to default build? Personally, I am prefer default build without PRINT_EXIT_CODE.
Another way to pass CI is that modifying the CI flow, build rv32emu with PRINT_EXIT_CODE in CI tests.

Drop macro PRINT_EXIT_CODE because it is not necessary to have multiple builds. Instead, let the shell or builtin commands to check exit code of RISC-V program emulated.

ChinYikMing · 2024-02-01T06:56:04Z

Due to the memory size being changed to a runtime-defined value, do we need to check if the addresses of load and store instructions exceed the specified range?

Sound reasonable. Maybe in future PR.

Makefile

ChinYikMing · 2024-02-01T10:58:45Z

Drop macro PRINT_EXIT_CODE because it is not necessary to have multiple builds. Instead, let the shell or builtin commands to check exit code of RISC-V program emulated.

For minimal changes, printing inferior exit code %d inside main is able to pass the CI tests.

ChinYikMing · 2024-02-01T11:17:11Z

I would like to express a concern regarding the rv_userdata function being part of the public API. Originally, it was designed for internal implementation, which suggests there is no necessity to make rv_userdata accessible to users interacting with the public API. It might be more appropriate to keep this function internal to maintain clarity and simplicity in the API's usage.

Actually, we could introduce some helper functions or getter functions to access the specific field of userdata without losing the encapsulation. For example, some helper functions for memory viewing instead of returning whole userdata.

RinHizakura · 2024-02-01T13:59:48Z

src/main.c

+        .emu_data.vm_user = malloc(sizeof(vm_user_t)),
+        .cycle_per_step = 100,
+        .allow_misalign = opt_misaligned,
+    };


Since we are going to make these parameters user-defined now, I will suggest having some checks (check_vm_attr()) on the vm_attr_t before the emulator creation stage, otherwise bad parameters can lead to a crash which is not easy to figure out quickly. For example: mem_size should be larger than stack_size + args_offset_size.

Note that mem_size should also be larger than the ELF mapping range to load correctly, but this may not be easy to check before the emulator creation stage. As an alternative, maybe applying the range check for memory_write at the emulator running stage is a simple idea to solve this. We may need to take care of the possible performance drop for the extra comparison on this solution.

Nice perspective. I think validation of vm_attr_t could be complex. For example, shall we check the invalid file during validation or propagate to elf_open to verify the invalidation. Similar situation for memory, if the user-provided memory size is not enough to run emulation, shall we simply abort before rv_create or propagate to segmentation fault. We may need more discussion about this. Maybe in future PR?

Sure. We can leave these at other places.

Sure. We can leave these at other places.

I have mentioned it in the commit message.

src/main.c

src/riscv.c

ChinYikMing · 2024-02-02T08:08:53Z

Note that the run_and_trace function requires program name to run for now, see here. If we drop rv_userdata, the run_and_trace can only get program name by passing opt_prog_name or attr to it.

We could streamline the run and run_and_trace functions by treating run_and_trace as a specialized variant of rv_run, activated through specific callback functions or flags. This strategy would enable us to eliminate the need for exposing rv_userdata directly.

Since we have introduced vm_attr_t, we can simply add one more field (e.g., trace). rv_run can check that easily.

ChinYikMing · 2024-02-02T08:18:10Z

Since we have introduced vm_attr_t, we can simply add one more field (e.g., trace). rv_run can check that easily.

Another way: we can introduce rv_set_main_loop(rv, void callback(void *arg), void *arg). The application can pass their own callback to register main loop. In this way, we can simply call their callback and decouple for many scenario. Inspired by emscripten. If callback is NULL, we can fallback to normal rv_run. If callback is NULL and trace field in vm_attr_t is set, then fallback to rv_run_and_trace. Specifically, rv_run and rv_run_and_trace are private function in "riscv.c".

jserv · 2024-02-02T08:18:12Z

Since we have introduced vm_attr_t, we can simply add one more field (e.g., trace). rv_run can check that easily.

Exactly. It is time to drop rv_userdata by extensive use of vm_attr_t. We may refine the term "userdata" as well to avoid unintended confusion.

jserv · 2024-02-02T08:57:03Z

Another way: we can introduce rv_set_main_loop(void callback(void *arg), void *arg). The application can pass their own callback to register main loop. In this way, we can simply call their callback and decouple for many scenario. Inspired by emscripten. If callback is NULL, we can fallback to normal rv_run. If callback is NULL and trace field in vm_attr_t is set, then fallback to rv_run_and_trace. Specifically, rv_run and rv_run_and_trace are private function in "riscv.c".

Including the callback function as part of the emulator's context is possible, but it’s not necessary to have a separate function like rv_set_main_loop for assigning callbacks. Specifying the callback function directly using C99 designated initializers is more efficient, especially since it is unlikely we'll need to change the callback during the execution of the RISC-V emulation main loop.

src/riscv.h

src/riscv_private.h

src/riscv.h

src/riscv.c

RinHizakura

I'm good with the changes now! Let's see if other guys have more comments.

Thank you!

jserv

In your git commit messages, it is more appropriate to mention "this commit" rather than "this PR," as "PR" refers to a pull request, not a commit.

The following should be included in an emulator's simple and clear public API: 1. create/init core 2. run emulation 3. delete/destroy core Other components, including as memory, file systems, program data, etc., should be abstracted from the user, as a result, setting a configuration value (vm_attr_t) is sufficient. The user needs to manage about memory (state_t) and elf stuff before this commit. The user can just construct a core, run it, and shut it down after this commit, so they won't need to worry about them anymore. The vm_attr_t has multiple fields and they are commented clearly in the code. As you can see in "main", there are various mode to run the emulator such as "run_and_trace", "gdbstub", and "profiling". Thus, a field call "run_flag" is introduced in vm_attr_t. For standard stream remapping, rv_remap_stdstream function is introduced. The emulator can remap default standard stream to required streams after creating the emulator by calling the rv_remap_stdstream function. rv_userdata has been dropped since PRIV macro is sufficient for internal implemntation. Also, application will not need to access it directly. elf is reopened in dump_test_signature because elf is allocated during rv_create. It is acceptable to reopen elf since it is only for testing. Print inferior exit code to console inside main instead of syscall_exit because the actual usage of exit code depends on applications of using riscv public API. The io interface is not changed in this commit because it could maybe reused with semu in some way, still need to be investigated. Logging feature and system emulator integration are not implemented yet. Also, a validator for validating the user-defined vm_attr_t might need to be introduced. related: sysprog21#310

visitorckw · 2024-02-03T14:51:43Z

The changes also look good to me. Thanks!

Since commit sysprog21#340, the size of memory can be changed. As a result, this commit has removed outdated comments. Some comments are added to describe the IO functions in order to increase consistency.

Refine the API in the public header

Since commit sysprog21#340, the size of memory can be changed. As a result, this commit has removed outdated comments. Some comments are added to describe the IO functions in order to increase consistency.

ChinYikMing commented Jan 31, 2024

View reviewed changes