* review

apache · Feb 14, 2023 · 3f9609b · 3f9609b
1 parent f1b369f
commit 3f9609b

Showing 5 changed files with 82 additions and 74 deletions.
diff --git a/docs/how_to/deploy/adreno.rst b/docs/how_to/deploy/adreno.rst
@@ -127,7 +127,12 @@ Deploying the compiled model here require use some tools on host as well as on t
 TVM has simplified user friendly command line based tools as well as
 developer centric python API interface for various steps like auto tuning, building and deploying.
 
-TVM compilation process for remote devices has multiple stages listed below.
+
+|Adreno deployment pipeline|
+
+*Fig.2 Build and Deployment pipeline on Adreno devices*
+
+The figure above shows a generalized pipeline covering the stages listed below.
 
 **Model import:**
 At this stage we import a model from well known frameworks like Tensorflow, PyTorch, ONNX ...etc.
@@ -150,7 +155,7 @@ At this stage we run the TVM compilation output on the target. Deployment is pos
 environment using RPC Setup and also using TVM's native tool which is native binary cross compiled for Android.
 At this stage we can run the compiled model on Android target and unit test output correctness and performance aspects.
 
-**Aplication Integration:**
+**Application Integration:**
 This stage is all about integrating TVM compiled model in applications. Here we discuss about
 interfacing tvm runtime from Android (cpp native environment or from JNI) for setting input and getting output.
 
@@ -234,7 +239,6 @@ Below command will configure the build the host compiler
    cd build
    cp ../cmake/config.cmake .
 
-   echo set\(USE_OPENCL ON\) >> config.cmake
    echo set\(USE_RPC ON\) >> config.cmake
    echo set\(USE_GRAPH_EXECUTOR ON\) >> config.cmake
    echo set\(USE_LIBBACKTRACE AUTO\) >> config.cmake
@@ -258,7 +262,7 @@ Finally we can export python path as
 
 ::
 
-   export PYTHONPATH=$PWD:/python
+   export PYTHONPATH=$TVM_HOME/python:${PYTHONPATH}
    python3 -c "import tvm" # Verify tvm python package
 
 
@@ -274,7 +278,6 @@ Target build require Android NDK to be installed.
    mkdir -p build-adreno
    cd build-adreno
    cp ../cmake/config.cmake .
-   echo set\(USE_MICRO OFF\) >> config.cmake
    echo set\(USE_OPENCL ON\) >> config.cmake
    echo set\(USE_RPC ON\) >> config.cmake
    echo set\(USE_CPP_RPC ON\) >> config.cmake
@@ -342,73 +345,29 @@ manually and also inside docker using automated tools.
 **Automated RPC Setup:**
 Here we will explain how to setup RPC in docker environment.
 
-Below command launches tracker in docker environment, where docker listens on port 9120.
+The command below launches the tracker in the docker environment, where the tracker listens on port 9190.
 
 ::
 
    ./tests/scripts/ci.py adreno -i # Launch a new shell on the anreno docker
-   source  tests/scripts/setup-adreno-env.sh -e tracker -p 9120
+   source  tests/scripts/setup-adreno-env.sh -e tracker -p 9190
 
 Now, the below comand can run TVM RPC on remote android device with id "abcdefgh".
 
 
 ::
 
    ./tests/scripts/ci.py adreno -i # Launch a new shell on adreno docker.
-   source  tests/scripts/setup-adreno-env.sh -e device -p 9120 -d abcdefgh
+   source  tests/scripts/setup-adreno-env.sh -e device -p 9190 -d abcdefgh
 
 
 **Manual RPC Setup:**
 
-Below command in manual setup starts the tracker on port 9120
-
-::
-
-   python3 -m tvm.exec.rpc_tracker --host "0.0.0.0" --port "9120"
-
-TVM RPC launch on Android device require some environment setup due to Android device is connected via ADB interface and we need to re-route
-TCP/IP communication over ADB interface. Below commands will do necessary setup and run tvm_rpc on remote device.
-
-::
-
-    # Set android device to use
-    export ANDROID_SERIAL=abcdefgh
-    # Create a temporary folder on remote device.
-    adb shell "mkdir -p /data/local/tmp/tvm_ci"
-    # Copy tvm_rpc and it's dependency to remote device
-    adb push build-adreno-target/tvm_rpc /data/local/tmp/tvm_test/tvm_rpc
-    adb push build-adreno-target/libtvm_runtime.so /data/local/tmp/tvm_test
-    # Forward port 9120 from target to host
-    adb reverse tcp:9210 tcp:9120
-    # tvm_rpc by default listens on ports starting from 5000 for incoming connections.
-    # Hence, reroute connections to these ports on host to remore device.
-    adb forward tcp:5000 tcp:5000
-    adb forward tcp:5001 tcp:5001
-    adb forward tcp:5002 tcp:5002
-    # Finally launch rpc_daemon on remote device with identity key as "android"
-    adb shell "cd /data/local/tmp/tvm_test; killall -9 tvm_rpc; sleep 2; LD_LIBRARY_PATH=/data/local/tmp/tvm_test/ ./tvm_rpc server --host=0.0.0.0 --port=5000 --port-end=5010 --tracker=127.0.0.1:9120 --key=android"
-
-Upon successfull running this remote device will be available on tracker which can be queried as below.
-
-::
-
-   python3 -m tvm.exec.query_rpc_tracker --port 9120
-   Tracker address 127.0.0.1:9120
-   Server List
-   ------------------------------
-   server-address           key
-   ------------------------------
-       127.0.0.1:5000    server:android
-   ------------------------------
-
-   Queue Status
-   -------------------------------
-   key       total  free  pending
-   -------------------------------
-   android   1      1     0
-   -------------------------------
+Please refer to the tutorial
+`How To Deploy model on Adreno <https://tvm.apache.org/docs/how_to/deploy_models/deploy_model_on_adreno.html>`_
+for manual RPC environment setup.
 
-This concludes RPC Setup and we have rpc-tracker available on host 127.0.0.1 (rpc-tracker) and port 9120 (rpc-port).
+This concludes RPC Setup and we have rpc-tracker available on host 127.0.0.1 (rpc-tracker) and port 9190 (rpc-port).
 
 
 .. _commandline_interface:
@@ -431,7 +390,7 @@ Here we use a model from Keras and it uses RPC setup for tuning and finally gene
    resnet50.h5 -o \
    keras-resnet50.log \
    --early-stopping 0 --repeat 30 --rpc-key android \
-   --rpc-tracker 127.0.0.1:9120 --trials 1024 \
+   --rpc-tracker 127.0.0.1:9190 --trials 1024 \
    --tuning-records keras-resnet50-records.log --tuner xgb
 
 **Model Compilation:**
@@ -466,7 +425,7 @@ We can use below tvmc command to deploy on remore target via RPC based setup.
 ::
 
    python3 -m tvm.driver.tvmc run --device="cl" keras-resnet50.tar \
-   --rpc-key android --rpc-tracker 127.0.0.1:9120 --print-time
+   --rpc-key android --rpc-tracker 127.0.0.1:9190 --print-time
 
 tvmc based run has more option to initialize the input in various modes line fill, random ..etc.
 
@@ -628,4 +587,4 @@ We then can compile our model in any convinient way
        )
 
 .. |High-level overview of the Adreno™ A5x architecture for OpenCL| image:: https://raw.githubusercontent.com/tlc-pack/web-data/main/images/how-to/adreno_architecture.png
-.. |Android deployment pipeline| image:: https://raw.githubusercontent.com/tlc-pack/web-data/main/images/how-to/android_deployment_pipeline.jpg
+.. |Adreno deployment pipeline| image:: https://raw.githubusercontent.com/tlc-pack/web-data/main/images/how-to/Adreno-Deployment-Pipeline.jpg
diff --git a/gallery/how_to/deploy_models/deploy_model_on_adreno.py b/gallery/how_to/deploy_models/deploy_model_on_adreno.py
@@ -53,11 +53,17 @@
 #
 #   adb devices
 #
+# Set the android device to use
+#
+# .. code-block:: bash
+#
+#   export ANDROID_SERIAL=<device-hash>
+#
 # Then to upload these two files to the device you should use:
 #
 # .. code-block:: bash
 #
-#   adb -s <device_hash> push {libtvm_runtime.so,tvm_rpc} /data/local/tmp
+#   adb push {libtvm_runtime.so,tvm_rpc} /data/local/tmp
 #
 # At this moment you will have «libtvm_runtime.so» and «tvm_rpc» on path /data/local/tmp on your device.
 # Sometimes cmake can’t find «libc++_shared.so». Use:
@@ -70,7 +76,7 @@
 #
 # .. code-block:: bash
 #
-#   adb -s <device_hash> push libc++_shared.so /data/local/tmp
+#   adb push libc++_shared.so /data/local/tmp
 #
 # We are now ready to run the TVM RPC Server.
 # Launch rpc_tracker with following line in 1st console:
@@ -83,12 +89,12 @@
 #
 # .. code-block:: bash
 #
-#   adb -s <device_hash> reverse tcp:9190 tcp:9190
-#   adb -s <device_hash> forward tcp:9090 tcp:9090
-#   adb -s <device_hash> forward tcp:9091 tcp:9091
-#   adb -s <device_hash> forward tcp:9092 tcp:9092
-#   adb -s <device_hash> forward tcp:9093 tcp:9093
-#   adb -s <device_hash> shell LD_LIBRARY_PATH=/data/local/tmp /data/local/tmp/tvm_rpc server --host=0.0.0.0 --port=9090 --tracker=127.0.0.1:9190 --key=android --port-end=9190
+#   adb reverse tcp:9190 tcp:9190
+#   adb forward tcp:5000 tcp:5000
+#   adb forward tcp:5001 tcp:5001
+#   adb forward tcp:5002 tcp:5002
+#   adb forward tcp:5003 tcp:5003
+#   adb shell LD_LIBRARY_PATH=/data/local/tmp /data/local/tmp/tvm_rpc server --host=0.0.0.0 --port=5000 --tracker=127.0.0.1:9190 --key=android --port-end=5100
 #
 # Before proceeding to compile and infer model, specify TVM_TRACKER_HOST and TVM_TRACKER_PORT
 #
@@ -130,6 +136,10 @@
 from tvm.relay.op.contrib import clml
 from tvm import autotvm
 
+# Below is a set of configuration options that control the behaviour of this script, such as
+# local run or device run, target definitions, dtype setting and auto tuning enablement.
+# Change these settings as needed.
+
 # Adreno devices are efficient with float16 compared to float32
 # Given the expected output doesn't effect by lowering precision
 # it's advisable to use lower precision.
@@ -156,7 +166,8 @@
 arch = "arm64"
 target = tvm.target.Target("llvm -mtriple=%s-linux-android" % arch)
 
-# Auto tuning is compute and time taking task, hence disabling for default run. Please enable it if required.
+# Auto tuning is a compute-intensive and time-consuming task,
+# hence it is disabled for the default run. Please enable it if required.
 is_tuning = False
 tune_log = "adreno-resnet18.log"
 
@@ -220,6 +231,19 @@
 #################################################################
 # Precisions
 # ----------
+
+# Adreno devices are more efficient with float16 than with float32.
+# Provided the expected output isn't affected by lowering precision,
+# it's advisable to use lower precision.
+
+# TVM supports mixed precision through the ToMixedPrecision transformation pass.
+# We may need to register precision rules such as precision type, accumulation
+# datatype, etc. for the required operators to override the default settings.
+# The helper API below simplifies the precision conversions across the module.
+# It currently supports dtypes "float16" and "float16_acc32".
+
+# dtype is set to "float16_acc32" in the configuration section above.
+
 from tvm.relay.op.contrib import adreno
 
 adreno.convert_to_dtype(mod["main"], dtype)
@@ -236,6 +260,12 @@
 # Prepare TVM Target
 # ------------------
 
+# This generated example runs on our x86 server for demonstration.
+
+# To deploy and tune on a real target over RPC, please set :code:`local_demo` to False in the configuration section above.
+# Also, :code:`test_target` is set to :code:`llvm` to keep this example compatible with the x86 demonstration.
+# Please change it to :code:`opencl` or :code:`opencl -device=adreno` for an RPC target in the configuration above.
+
 if local_demo:
     target = tvm.target.Target("llvm")
 elif test_target.find("opencl"):
@@ -254,6 +284,10 @@
 rpc_tracker_port = int(os.environ.get("TVM_TRACKER_PORT", 9190))
 key = "android"
 
+# Auto tuning is a compute-intensive and time-consuming task.
+# It is set to False in the configuration above as this script runs on x86 for demonstration.
+# Please set :code:`is_tuning` to True to enable auto tuning.
+
 if is_tuning:
     # Auto Tuning Stage 1: Extract tunable tasks
     tasks = autotvm.task.extract_from_program(
@@ -275,9 +309,9 @@
         ),
     )
     n_trial = 1024  # Number of iteration of training before choosing the best kernel config
-    early_stopping = False  # Do we apply early stopping when the loss is not minimizing
+    early_stopping = False  # Can be enabled to stop tuning when the loss is no longer decreasing.
 
-    # Iterate through each task and call the tuner
+    # Auto Tuning Stage 3: Iterate through the tasks and tune.
     from tvm.autotvm.tuner import XGBTuner
 
     for i, tsk in enumerate(reversed(tasks[:3])):
@@ -295,14 +329,17 @@
                 autotvm.callback.log_to_file(tmp_log_file),
             ],
         )
-    # Pick the best performing kerl configurations from the overall log.
+    # Auto Tuning Stage 4: Pick the best performing configurations from the overall log.
     autotvm.record.pick_best(tmp_log_file, tune_log)
 
 #################################################################
 # Enable OpenCLML Offloading
 # --------------------------
 # OpenCLML offloading will try to accelerate supported operators
 # by using OpenCLML proprietory operator library.
+
+# By default :code:`enable_clml` is set to False in the configuration section above.
+
 if not local_demo and enable_clml:
     mod = clml.partition_for_clml(mod, params)
 

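Review note on the unchanged context line `elif test_target.find("opencl"):` in the "Prepare TVM Target" hunk above: `str.find` returns an index rather than a boolean, so it yields 0 (falsy) when the string starts with `opencl` and -1 (truthy) when `opencl` is absent. For the target strings the new comments recommend, that condition therefore looks inverted. The sketch below only illustrates the behaviour and a membership test that may be what was intended; `uses_opencl` is a hypothetical helper name and is not part of the TVM script.

```python
def uses_opencl(test_target: str) -> bool:
    """Hypothetical helper: True when the target string mentions OpenCL."""
    return "opencl" in test_target


# str.find returns the index of the first match, or -1 when there is no match,
# so its truthiness does not mean "substring present".
for candidate in ("opencl", "opencl -device=adreno", "llvm"):
    found_at = candidate.find("opencl")
    print(
        f"{candidate!r}: find -> {found_at}, bool(find) -> {bool(found_at)}, "
        f"'in' -> {uses_opencl(candidate)}"
    )
```

Whether the script should switch to a membership test is a call for the maintainers; the sketch makes no claim about the intended control flow beyond what the surrounding comments suggest.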
diff --git a/gallery/how_to/deploy_models/deploy_model_on_adreno_tvmc.py b/gallery/how_to/deploy_models/deploy_model_on_adreno_tvmc.py
@@ -65,7 +65,10 @@
 
 # To enable OpenCLML accelerated operator library.
 enable_clml = False
-cross_compiler = "/opt/android-sdk-linux/ndk/21.3.6528147/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android28-clang"
+cross_compiler = (
+    os.environ["ANDROID_NDK_HOME"]
+    + "/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android28-clang"
+)
 
 #######################################################################
 # Make a Keras Resnet50 Model
@@ -104,6 +107,12 @@
 rpc_key = "android"
 rpc_tracker = rpc_tracker_host + ":" + str(rpc_tracker_port)
 
+# Auto tuning is a compute-intensive and time-consuming task.
+# It is set to False in the configuration above as this script runs on x86 for demonstration.
+# Please set :code:`is_tuning` to True to enable auto tuning.
+
+# Also, :code:`test_target` is set to :code:`llvm` to keep this example compatible with the x86 demonstration.
+# Please change it to :code:`opencl` or :code:`opencl -device=adreno` for an RPC target in the configuration above.
 
 if is_tuning:
     tvmc.tune(
@@ -125,6 +134,11 @@
 # -----------
 # Compilation to produce tvm artifacts
 
+# This generated example runs on our x86 server for demonstration.
+# To deploy and tune on a real target over RPC, please set :code:`local_demo` to False in the configuration section above.
+
+# OpenCLML offloading will try to accelerate supported operators by using the proprietary OpenCLML operator library.
+# By default :code:`enable_clml` is set to False in the configuration section above.
 
 if not enable_clml:
     if local_demo:

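Review note on the new `cross_compiler` expression in this file: it reads `os.environ["ANDROID_NDK_HOME"]` directly, so an unset variable raises a bare `KeyError` at import time. The sketch below shows one possible way to report a clearer message. The helper name `resolve_cross_compiler` and the error wording are assumptions made for illustration; only the toolchain suffix is copied from the diff.

```python
import os

# Toolchain suffix copied from the diff above.
NDK_CLANG_SUFFIX = (
    "toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android28-clang"
)


def resolve_cross_compiler(env=None):
    """Hypothetical helper: build the NDK clang path or fail with a readable error."""
    env = os.environ if env is None else env
    ndk_home = env.get("ANDROID_NDK_HOME")
    if not ndk_home:
        raise RuntimeError(
            "ANDROID_NDK_HOME must point to an Android NDK installation"
        )
    return os.path.join(ndk_home, NDK_CLANG_SUFFIX)


# Example with an explicit mapping so the sketch runs without a real NDK.
print(resolve_cross_compiler({"ANDROID_NDK_HOME": "/opt/ndk"}))
```

Passing the environment as a parameter keeps the helper testable without modifying the real process environment.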
diff --git a/tests/scripts/task_build_adreno_bins.sh b/tests/scripts/task_build_adreno_bins.sh
@@ -28,7 +28,6 @@ cd ${output_directory}
 
 cp ../cmake/config.cmake .
 
-echo set\(USE_MICRO OFF\) >> config.cmake
 if [ -f "${ADRENO_OPENCL}/CL/cl_qcom_ml_ops.h" ] ; then
 echo set\(USE_CLML "${ADRENO_OPENCL}"\) >> config.cmake
 echo set\(USE_CLML_GRAPH_EXECUTOR "${ADRENO_OPENCL}"\) >> config.cmake

diff --git a/tests/scripts/task_config_build_adreno.sh b/tests/scripts/task_config_build_adreno.sh
@@ -23,7 +23,6 @@ mkdir -p "$BUILD_DIR"
 cd "$BUILD_DIR"
 cp ../cmake/config.cmake .
 
-echo set\(USE_OPENCL ON\) >> config.cmake
 if [ -f "${ADRENO_OPENCL}/CL/cl_qcom_ml_ops.h" ] ; then
 echo set\(USE_CLML ${ADRENO_OPENCL}\) >> config.cmake
 fi