[µTVM] Add virtual machine, test zephyr runtime on real hardware #6703
…ysical HW (#6789)
* [BUGFIX] Respect infinite-timed session start timeouts.
* When debugging, the intended behavior is to set the session start timeout to infinite to allow the user to configure the debugger.
* At present, if a session start retry timeout is defined, the current logic will bail after the retry timeout expires.
* This change makes the session start logic retry forever, once per retry timeout.
* Document RPCEndpoint::Create.
* Add stm32f746xx to tvm.target.micro() call; fix parameter name.
* This API is expected to just be used with positional args, not kwargs, so this change isn't expected to cause any breakage.
* model is more inline with the rest of the file, given TVM Target Specification RFC.
* [BUGFIX] If session start fails, exit transport context manager.
* If an error occurred during session setup, then complex transports e.g. DebugWrapperTransport would not de-initialize.
* Align transport writes/reads in TransportLogger
* fix syntax errors which were not exercised in previous PR
* Remove microTVM logic from standard RPC server, add debug shell.
* microTVM uses the host RPC server as a way to launch a debugger in a dedicated, separate terminal window. microTVM needs to be able to launch the debugger itself, because its model of the device flash/debug flow separates these two things into distinct operations implemented by shell commands (for maximum portability across frameworks).
* microTVM can be configured to launch the debugger (e.g. GDB) in the same terminal as is used for flashing, but this is sub-optimal because then it hides any logs emitted by the device.
* Using the standard RPC server was hard because GDB expects the user to issue SIGINT to interrupt program flow, but due to the RPC server's necessary use of multiprocessing, multiple signal handlers needed to be SIG_IGN'd, and further, because libtvm.so is intentionally frontend-agnostic, it's difficult to include signal handling directly in that binary (Python expects you to call PyErr_CheckSignals, but we don't require and don't want to require python-dev to compile libtvm.so, and this is the only such case where libtvm.so is expected to block the main thread for a long period of time).
* Here we implement a separate microTVM debug shell python script using the non-blocking server implementation.
* Add serial transport, parameterize test_zephyr to work on real hardware
* add pytest test fixture, missed from previous change.
* this test fixture helps to parameterize the test case
* address leandron@ comment from #6703
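The two [BUGFIX] entries in the commit message above (retry session start once per retry timeout, forever when the overall timeout is infinite; exit the transport context manager if start-up fails) can be illustrated with a minimal sketch. This is not the code from the PR: `FakeTransport`, `start_session`, `try_start` and the use of `None` to mean "infinite" are all hypothetical names and conventions chosen for illustration.

```python
import sys
import time


class FakeTransport:
    """Hypothetical stand-in for a real transport (e.g. a serial or debug-wrapped one)."""

    def __init__(self):
        self.is_open = False

    def __enter__(self):
        self.is_open = True
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.is_open = False


def start_session(transport, try_start, retry_timeout_sec,
                  start_timeout_sec=None, clock=time.monotonic):
    """Retry try_start(transport, timeout) until it succeeds; return the attempt count.

    start_timeout_sec=None stands for "infinite" in this sketch (an assumption).
    """
    transport.__enter__()
    try:
        deadline = None if start_timeout_sec is None else clock() + start_timeout_sec
        attempts = 0
        while True:
            attempt_timeout = retry_timeout_sec
            if deadline is not None:
                remaining = deadline - clock()
                if remaining <= 0:
                    raise TimeoutError("session did not start in time")
                attempt_timeout = min(retry_timeout_sec, remaining)
            attempts += 1
            if try_start(transport, attempt_timeout):
                return attempts
    except BaseException:
        # Session setup failed: de-initialize the transport before propagating.
        transport.__exit__(*sys.exc_info())
        raise
```

With `start_timeout_sec=None` the loop never gives up, which matches the debugging use case where the user needs unbounded time to configure the debugger. On any failure the transport is closed before the exception propagates, so a wrapping transport is not left half-initialized.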
@leandron @u99127 @manupa-arm please take a look when you have a minute and explicitly approve if you're good w/ this change
One minor comment, mainly trying to reduce the size of the target VM created here, and keeping it cleaner.
# nrfjprog
cd ~
mkdir -p nrfjprog
wget --no-verbose -O nRFCommandLineTools1090Linuxamd64.tar.gz https://www.nordicsemi.com/-/media/Software-and-other-downloads/Desktop-software/nRF-command-line-tools/sw/Versions-10-x-x/10-9-0/nRFCommandLineTools1090Linuxamd64tar.gz
I suggest adding something here to clean up the files/packages downloaded by this script.
great suggestion! i've done that and we saved about 500MB.
@leandron please take a look when you have a minute and explicitly approve if you're good w/ this change
I'm happy with the current version. I would also like to hear from @manupa-arm and @u99127, if possible.
[Post-RFC opinion] We might want to just use one of the proposed requirement.txt files here, at least locally, until the codebase is refactored to use requirement.txt.
Other than that LGTM.
[tool.poetry.dependencies]
attrs = "^19"
decorator = "^4.4"
numpy = "~1.19"
psutil = "^5"
scipy = "^1.4"
python = "^3.6"
tornado = "^6"
typed_ast = "^1.4"
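For readers unfamiliar with the `^` and `~` operators in the dependency list above, here is a rough sketch of the ranges they describe. It follows Poetry's documented rules for plain numeric versions only (no pre-release or local version tags), and the helper names are made up for illustration; Poetry itself does not expose these functions.

```python
def _parse(version):
    """Turn '1.19.4' into (1, 19, 4)."""
    return tuple(int(part) for part in version.split("."))


def caret_bounds(spec):
    """'^4.4' -> ((4, 4), (5,)): the upper bound bumps the left-most non-zero part."""
    lower = _parse(spec.lstrip("^"))
    index = next((i for i, part in enumerate(lower) if part != 0), len(lower) - 1)
    return lower, lower[:index] + (lower[index] + 1,)


def tilde_bounds(spec):
    """'~1.19' -> ((1, 19), (1, 20)): the upper bound bumps the minor part if given."""
    lower = _parse(spec.lstrip("~"))
    index = min(1, len(lower) - 1)
    return lower, lower[:index] + (lower[index] + 1,)


def satisfies(version, spec):
    """Return True if a plain numeric version falls inside the constraint's range."""
    bounds = caret_bounds(spec) if spec.startswith("^") else tilde_bounds(spec)
    lower, upper = bounds
    return lower <= _parse(version) < upper


print(satisfies("1.19.4", "~1.19"))  # numpy 1.19.x is accepted
print(satisfies("1.20.0", "~1.19"))  # numpy 1.20 is not
print(satisfies("3.8.5", "^3.6"))    # any Python 3 from 3.6 up is accepted
```

So `numpy = "~1.19"` pins to the 1.19 series, while `python = "^3.6"` accepts any 3.x release from 3.6 onwards.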
@areusch, after reading the RFC, do we want to align with one of the requirement.txt files, so we can have synergy between this and ci-qemu? I would suggest introducing poetry after the next RFC, if and when that happens, so the changes would go here as well.
@manupa-arm i'm not sure which requirement.txt you're referring to--we have not yet created any requirements.txt in the spirit of the RFC. i'd prefer to just merge this change and treat this PR as separate from the RFC since it was released before I released the RFC. i think it's better to merge what we have now to make progress and then align the two after the RFC is implemented.
@areusch, sounds good! Agreed that this can be done later. I was actually referring to creating a requirement.txt as a replica of the toml, but that might then need to be readjusted based on what is agreed for the CI. So I agree that could be a separate effort.
Thanks for the work!
LGTM. Thanks for your patience and perseverance with this one :)
regards
Ramana
Thanks @areusch @u99127 @leandron @manupa-arm !
…che#6703)
* Split transport classes into transport package.
* Introduce transport timeouts.
* black format
* Add metadata-only artifacts
* Simplify utvm rpc server API and ease handling of short packets.
* add zephyr test against qemu
* Add qemu build config
* fix typo
* cleanup zephyr main
* fix nonblocking piping on some linux kernels
* don't double-open transport
* validate FD are in non-blocking mode
* gitignore test debug files
* cleanup zephyr compiler
* re-comment serial until added
* remove logging
* add zephyr exclusions to check_file_type
* add asf header
* lint
* black format
* more pylint
* kill utvm rpc_server bindings, which don't work anymore and fail pylint
* fix compiler warning
* fixes related to pylint
* clang-format again
* more black format
* add qemu regression
* Fix paths for qemu/ dir
* fix typo
* fix SETFL logic
* export SessionTerminatedError and update except after moving
* fix test_micro_artifact
* retrigger staging CI
* fix jenkins syntax hopefully
* one last syntax error
* Add microTVM VM setup scripts
* obliterate USE_ANTLR from cmake.config
* add poetry deps to pyproject.toml - mainly taken from output of `pip freeze` in ci-gpu and ci-lint
* initial attempt at setup.py + autodetect libtvm_runtime SO path
* hack to hardcode in build
* make pyproject lock
* Add ci_qemu to Jenkinsfile
* build in qemu
* checkpoint
* create diff for jared
* add missing stuff
* address liangfu comments
* fix new bug with list passing
* release v0.0.2
* works on hardware
* switch to pytest for zephyr tests
* add missing import
* fix option parsing
* remove extraneous changes
* lint
* asf lint, somehow local pass didn't work
* file type lint
* black-format
* try to fix ARMTargetParser.h #include in LLVM < 8.0
* rm misspelled deamon lines
* move to apps/microtvm-vm
* fetch keys from kitware server
* fix path exclusions in check_file_type
* retrigger CI
* reorganize vm, add tutorial
* fixes for reorganization - enable vagrant ssh
* update ssh instructions
* rm commented code
* standardize reference VM release process, add prerelease test
* remove -mfpu from this change
* fix exit code of test_zephyr
* rm unneeded files, update check_file_type
* add asf header
* git-black
* git-black against main
* git-black with docker
* fixes for virtualbox
* black format
* install python3.8, for zephyr gdb
* timestamp zephyr vm name, permits launching multiple VMs
* log warning when initial vagrant destroy fails
* revert changes moved into apache#6789
* address leandron@ comments
* black format
* black format
* add --skip-build to test subcommand, detach device from other VMs
* black format
* address leandron@ comments
* don't rm release test when building only 1 provider
* revert pyproject.toml
* remove need to copy pyproject.toml to root
* this often contributes to erroneous changes to that file
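Several entries in the commit list above concern non-blocking file descriptors ("fix nonblocking piping on some linux kernels", "validate FD are in non-blocking mode", "fix SETFL logic"). As a rough illustration of what such a check involves on a POSIX host, the sketch below sets and verifies `O_NONBLOCK` with the standard `fcntl` module. The helper names are hypothetical and this is not the code from the PR.

```python
import fcntl
import os


def set_nonblocking(fd):
    """Add O_NONBLOCK to fd's status flags, preserving the flags already set."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


def assert_nonblocking(fd):
    """Raise ValueError if fd is still in blocking mode."""
    if not fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_NONBLOCK:
        raise ValueError(f"fd {fd} is not in non-blocking mode")


read_fd, write_fd = os.pipe()
set_nonblocking(read_fd)
assert_nonblocking(read_fd)
try:
    os.read(read_fd, 1)  # nothing has been written to the pipe yet
except BlockingIOError:
    print("read would block, so the call returned immediately")
finally:
    os.close(read_fd)
    os.close(write_fd)
```

Reading the flags with `F_GETFL` and OR-ing in `O_NONBLOCK` before `F_SETFL` matters: passing only `O_NONBLOCK` to `F_SETFL` would clear any other status flags that were already set on the descriptor.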
This PR adds two Vagrantfiles:
tools/microtvm/base-box, intended to support general µTVM development. It includes all the dependencies necessary to build the Zephyr runtime and test it with attached hardware (i.e. using USB port forwarding). This means it includes cross-compilers for RISC-V, ARM, and x86, among others (see Zephyr SDK).
A second Vagrantfile, which mounts your tvm directory using Host-VM shared folders, builds your local copy of TVM inside the VM, then creates a poetry (Python) virtualenv containing all TVM and Zephyr dependencies. You can use this VM to test µTVM against real hardware, for example:
tvm@microtvm:/Users/andrew/ws/tvm2$ TVM_LIBRARY_PATH=build-microtvm poetry run python3 tests/micro/qemu/test_zephyr.py --microtvm-platforms=stm32f746xx -s
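The `--microtvm-platforms` flag in the command above is a custom pytest option. A `conftest.py` along the following lines is one way such an option can parameterize a test over platforms. Only the option name comes from the command shown here. The `platform` fixture name, the `"host"` default and the comma-separated format are assumptions made for illustration.

```python
# conftest.py (sketch): pytest discovers these hook functions by name.

def pytest_addoption(parser):
    """Register the command-line option with pytest."""
    parser.addoption(
        "--microtvm-platforms",
        default="host",  # hypothetical default
        help="Comma-separated list of platforms to run the tests against.",
    )


def pytest_generate_tests(metafunc):
    """Run each test that requests `platform` once per platform given on the CLI."""
    if "platform" in metafunc.fixturenames:
        platforms = metafunc.config.getoption("microtvm_platforms").split(",")
        metafunc.parametrize("platform", platforms)
```

A test written as `def test_something(platform): ...` would then run once for each platform named on the command line, so the same test body can target QEMU or an attached board.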
This PR also includes additional transports needed to talk to real hardware, specifically a pySerial-based RPC transport layer plus utilities to invoke GDB to debug e.g. runtime problems and bad operator implementations, and to help with porting to new architectures. Because µTVM aims to be platform-agnostic, it assumes only that some shell command exists to launch GDB and connect to the SoC's debug port. Due to this constraint, an additional RPC server is included: tvm.exec.microtvm_debug_shell, which uses the event-driven RPC server to host the debugger in a dedicated shell, so that signals can be forwarded to the inferior GDB.
cc @tmoreau89 @tqchen @u99127 @tom-gall @liangfu @mshawcroft