* review
srkreddy1238 committed Feb 14, 2023
1 parent f1b369f commit 3f9609b
Showing 5 changed files with 82 additions and 74 deletions.
77 changes: 18 additions & 59 deletions docs/how_to/deploy/adreno.rst
@@ -127,7 +127,12 @@ Deploying the compiled model here require use some tools on host as well as on t
TVM has simplified user friendly command line based tools as well as
developer centric python API interface for various steps like auto tuning, building and deploying.

TVM compilation process for remote devices has multiple stages listed below.

|Adreno deployment pipeline|

*Fig.2 Build and Deployment pipeline on Adreno devices*

The figure above demonstrates a generalized pipeline for various stages listed below.

**Model import:**
At this stage we import a model from well known frameworks like Tensorflow, PyTorch, ONNX ...etc.
@@ -150,7 +155,7 @@ At this stage we run the TVM compilation output on the target. Deployment is pos
environment using RPC Setup and also using TVM's native tool which is native binary cross compiled for Android.
At this stage we can run the compiled model on Android target and unit test output correctness and performance aspects.

**Aplication Integration:**
**Application Integration:**
This stage is all about integrating the TVM compiled model in applications. Here we discuss
interfacing the TVM runtime from Android (cpp native environment or from JNI) for setting input and getting output.

@@ -234,7 +239,6 @@ Below command will configure the build the host compiler
cd build
cp ../cmake/config.cmake .

echo set\(USE_OPENCL ON\) >> config.cmake
echo set\(USE_RPC ON\) >> config.cmake
echo set\(USE_GRAPH_EXECUTOR ON\) >> config.cmake
echo set\(USE_LIBBACKTRACE AUTO\) >> config.cmake
@@ -258,7 +262,7 @@ Finally we can export python path as

::

export PYTHONPATH=$PWD:/python
export PYTHONPATH=$TVM_HOME/python:${PYTHONPATH}
python3 -c "import tvm" # Verify tvm python package


@@ -274,7 +278,6 @@ Target build require Android NDK to be installed.
mkdir -p build-adreno
cd build-adreno
cp ../cmake/config.cmake .
echo set\(USE_MICRO OFF\) >> config.cmake
echo set\(USE_OPENCL ON\) >> config.cmake
echo set\(USE_RPC ON\) >> config.cmake
echo set\(USE_CPP_RPC ON\) >> config.cmake
@@ -342,73 +345,29 @@ manually and also inside docker using automated tools.
**Automated RPC Setup:**
Here we will explain how to setup RPC in docker environment.

Below command launches tracker in docker environment, where docker listens on port 9120.
Below command launches tracker in docker environment, where tracker listens on port 9190.

::

./tests/scripts/ci.py adreno -i # Launch a new shell on the adreno docker
source tests/scripts/setup-adreno-env.sh -e tracker -p 9120
source tests/scripts/setup-adreno-env.sh -e tracker -p 9190

Now, the below command can run TVM RPC on remote android device with id "abcdefgh".


::

./tests/scripts/ci.py adreno -i # Launch a new shell on adreno docker.
source tests/scripts/setup-adreno-env.sh -e device -p 9120 -d abcdefgh
source tests/scripts/setup-adreno-env.sh -e device -p 9190 -d abcdefgh


**Manual RPC Setup:**

Below command in manual setup starts the tracker on port 9120

::

python3 -m tvm.exec.rpc_tracker --host "0.0.0.0" --port "9120"

TVM RPC launch on Android device require some environment setup due to Android device is connected via ADB interface and we need to re-route
TCP/IP communication over ADB interface. Below commands will do necessary setup and run tvm_rpc on remote device.

::

# Set android device to use
export ANDROID_SERIAL=abcdefgh
# Create a temporary folder on remote device.
adb shell "mkdir -p /data/local/tmp/tvm_ci"
# Copy tvm_rpc and it's dependency to remote device
adb push build-adreno-target/tvm_rpc /data/local/tmp/tvm_test/tvm_rpc
adb push build-adreno-target/libtvm_runtime.so /data/local/tmp/tvm_test
# Forward port 9120 from target to host
adb reverse tcp:9210 tcp:9120
# tvm_rpc by default listens on ports starting from 5000 for incoming connections.
# Hence, reroute connections to these ports on host to remore device.
adb forward tcp:5000 tcp:5000
adb forward tcp:5001 tcp:5001
adb forward tcp:5002 tcp:5002
# Finally launch rpc_daemon on remote device with identity key as "android"
adb shell "cd /data/local/tmp/tvm_test; killall -9 tvm_rpc; sleep 2; LD_LIBRARY_PATH=/data/local/tmp/tvm_test/ ./tvm_rpc server --host=0.0.0.0 --port=5000 --port-end=5010 --tracker=127.0.0.1:9120 --key=android"

Upon successfull running this remote device will be available on tracker which can be queried as below.

::

python3 -m tvm.exec.query_rpc_tracker --port 9120
Tracker address 127.0.0.1:9120
Server List
------------------------------
server-address key
------------------------------
127.0.0.1:5000 server:android
------------------------------

Queue Status
-------------------------------
key total free pending
-------------------------------
android 1 1 0
-------------------------------
Please refer to the tutorial
`How To Deploy model on Adreno <https://tvm.apache.org/docs/how_to/deploy_models/deploy_model_on_adreno.html>`_
for manual RPC environment setup.

This concludes RPC Setup and we have rpc-tracker available on host 127.0.0.1 (rpc-tracker) and port 9120 (rpc-port).
This concludes RPC Setup and we have rpc-tracker available on host 127.0.0.1 (rpc-tracker) and port 9190 (rpc-port).


.. _commandline_interface:
@@ -431,7 +390,7 @@ Here we use a model from Keras and it uses RPC setup for tuning and finally gene
resnet50.h5 -o \
keras-resnet50.log \
--early-stopping 0 --repeat 30 --rpc-key android \
--rpc-tracker 127.0.0.1:9120 --trials 1024 \
--rpc-tracker 127.0.0.1:9190 --trials 1024 \
--tuning-records keras-resnet50-records.log --tuner xgb

**Model Compilation:**
@@ -466,7 +425,7 @@ We can use below tvmc command to deploy on remote target via RPC based setup.
::

python3 -m tvm.driver.tvmc run --device="cl" keras-resnet50.tar \
--rpc-key android --rpc-tracker 127.0.0.1:9120 --print-time
--rpc-key android --rpc-tracker 127.0.0.1:9190 --print-time

tvmc based run has more options to initialize the input in various modes like fill, random, etc.

@@ -628,4 +587,4 @@ We then can compile our model in any convenient way
)
.. |High-level overview of the Adreno™ A5x architecture for OpenCL| image:: https://raw.githubusercontent.com/tlc-pack/web-data/main/images/how-to/adreno_architecture.png
.. |Android deployment pipeline| image:: https://raw.githubusercontent.com/tlc-pack/web-data/main/images/how-to/android_deployment_pipeline.jpg
.. |Adreno deployment pipeline| image:: https://raw.githubusercontent.com/tlc-pack/web-data/main/images/how-to/Adreno-Deployment-Pipeline.jpg
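
The tracker endpoint that this change standardizes on (host 127.0.0.1, port 9190) can also be resolved from the environment. The sketch below is illustrative only: `tracker_address` is a hypothetical helper and not part of TVM. The variable names `TVM_TRACKER_HOST` and `TVM_TRACKER_PORT` are the ones the gallery script in this commit reads, and the defaults are assumed to match the documented values.

```python
import os


def tracker_address(env=None, default_host="127.0.0.1", default_port=9190):
    """Return (host, port, "host:port") for the RPC tracker.

    Hypothetical helper: reads TVM_TRACKER_HOST / TVM_TRACKER_PORT from the
    given mapping (os.environ by default) and falls back to the documented
    defaults when they are unset.
    """
    env = os.environ if env is None else env
    host = env.get("TVM_TRACKER_HOST", default_host)
    port = int(env.get("TVM_TRACKER_PORT", default_port))
    if not 0 < port < 65536:
        raise ValueError("tracker port out of range: %d" % port)
    return host, port, "%s:%d" % (host, port)


# With an empty environment the documented defaults are used.
print(tracker_address({})[2])
```

The third element is in the `host:port` form expected by the `--rpc-tracker` option shown in the tvmc commands above.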
61 changes: 49 additions & 12 deletions gallery/how_to/deploy_models/deploy_model_on_adreno.py
@@ -53,11 +53,17 @@
#
# adb devices
#
# Set the android device to use
#
# .. code-block:: bash
#
# export ANDROID_SERIAL=<device-hash>
#
# Then to upload these two files to the device you should use:
#
# .. code-block:: bash
#
# adb -s <device_hash> push {libtvm_runtime.so,tvm_rpc} /data/local/tmp
# adb push {libtvm_runtime.so,tvm_rpc} /data/local/tmp
#
# At this moment you will have «libtvm_runtime.so» and «tvm_rpc» on path /data/local/tmp on your device.
# Sometimes cmake can’t find «libc++_shared.so». Use:
@@ -70,7 +76,7 @@
#
# .. code-block:: bash
#
# adb -s <device_hash> push libc++_shared.so /data/local/tmp
# adb push libc++_shared.so /data/local/tmp
#
# We are now ready to run the TVM RPC Server.
# Launch rpc_tracker with following line in 1st console:
@@ -83,12 +89,12 @@
#
# .. code-block:: bash
#
# adb -s <device_hash> reverse tcp:9190 tcp:9190
# adb -s <device_hash> forward tcp:9090 tcp:9090
# adb -s <device_hash> forward tcp:9091 tcp:9091
# adb -s <device_hash> forward tcp:9092 tcp:9092
# adb -s <device_hash> forward tcp:9093 tcp:9093
# adb -s <device_hash> shell LD_LIBRARY_PATH=/data/local/tmp /data/local/tmp/tvm_rpc server --host=0.0.0.0 --port=9090 --tracker=127.0.0.1:9190 --key=android --port-end=9190
# adb reverse tcp:9190 tcp:9190
# adb forward tcp:5000 tcp:5000
# adb forward tcp:5001 tcp:5001
# adb forward tcp:5002 tcp:5002
# adb forward tcp:5003 tcp:5003
# adb shell LD_LIBRARY_PATH=/data/local/tmp /data/local/tmp/tvm_rpc server --host=0.0.0.0 --port=5000 --tracker=127.0.0.1:9190 --key=android --port-end=5100
#
# Before proceeding to compile and infer model, specify TVM_TRACKER_HOST and TVM_TRACKER_PORT
#
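
The port plumbing in the hunk above can be generated rather than typed by hand. This is a minimal sketch under stated assumptions: `adb_rpc_setup_commands` is a hypothetical helper, it only builds argument lists and does not run them, and it assumes each host port should map to the same port on the device, since the tracker reports the port that `tvm_rpc` actually bound.

```python
def adb_rpc_setup_commands(tracker_port=9190, rpc_port=5000, n_forward=4, serial=None):
    """Build adb argument lists for the RPC port setup (hypothetical helper).

    - `adb reverse` lets the device reach the tracker running on the host.
    - `adb forward` lets the host reach the tvm_rpc server ports on the device.
    """
    adb = ["adb"] + (["-s", serial] if serial else [])
    commands = [adb + ["reverse", "tcp:%d" % tracker_port, "tcp:%d" % tracker_port]]
    for port in range(rpc_port, rpc_port + n_forward):
        # Same port number on both sides so tracker-reported addresses resolve.
        commands.append(adb + ["forward", "tcp:%d" % port, "tcp:%d" % port])
    return commands


for cmd in adb_rpc_setup_commands():
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once a device is attached. Passing `serial` is only needed when `ANDROID_SERIAL` is not exported.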
@@ -130,6 +136,10 @@
from tvm.relay.op.contrib import clml
from tvm import autotvm

# Below is a set of configurations that control the behaviour of this script, such as
# local run or device run, target definitions, dtype setting and auto tuning enablement.
# Change these settings as required.

# Adreno devices are more efficient with float16 than with float32.
# Provided the expected output is not affected by lowering precision,
# it's advisable to use lower precision.
@@ -156,7 +166,8 @@
arch = "arm64"
target = tvm.target.Target("llvm -mtriple=%s-linux-android" % arch)

# Auto tuning is compute and time taking task, hence disabling for default run. Please enable it if required.
# Auto tuning is a compute-intensive and time-consuming task,
# hence it is disabled for the default run. Please enable it if required.
is_tuning = False
tune_log = "adreno-resnet18.log"

@@ -220,6 +231,19 @@
#################################################################
# Precisions
# ----------

# Adreno devices are more efficient with float16 than with float32.
# Provided the expected output is not affected by lowering precision,
# it's advisable to use lower precision.

# TVM supports mixed precision through the ToMixedPrecision transformation pass.
# We may need to register precision rules like precision type, accumulation
# datatype, etc. for the required operators to override the default settings.
# The helper API below simplifies the precision conversions across the module.
# It currently supports dtypes "float16" and "float16_acc32".

# dtype is set to "float16_acc32" in the configuration section above.

from tvm.relay.op.contrib import adreno

adreno.convert_to_dtype(mod["main"], dtype)
@@ -236,6 +260,12 @@
# Prepare TVM Target
# ------------------

# This generated example runs on our x86 server for demonstration.

# To deploy and tune on a real target over RPC, please set :code:`local_demo` to False in the configuration section above.
# Also, :code:`test_target` is set to :code:`llvm` in this example to make it compatible with the x86 demonstration.
# Please change it to :code:`opencl` or :code:`opencl -device=adreno` for an RPC target in the configuration above.

if local_demo:
target = tvm.target.Target("llvm")
elif "opencl" in test_target:
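
A note on the OpenCL target check: `str.find` returns an index rather than a boolean, so using it directly as a condition gives the opposite of what is intended for strings that start with the substring. The sketch below is a standalone illustration (`uses_opencl` is a hypothetical name) comparing it with a membership test.

```python
def uses_opencl(target_str):
    """Return True when the target string mentions OpenCL (hypothetical helper)."""
    return "opencl" in target_str


for target_str in ("opencl", "opencl -device=adreno", "llvm"):
    index = target_str.find("opencl")
    # find() yields 0 for a match at the start (falsy) and -1 for no match (truthy),
    # so bool(index) is misleading; the membership test is unambiguous.
    print(target_str, index, bool(index), uses_opencl(target_str))
```

For the two OpenCL target strings the truthiness of `find` is `False`, while for `llvm` it is `True`, which is why a membership test is the safer form for this branch.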
@@ -254,6 +284,10 @@
rpc_tracker_port = int(os.environ.get("TVM_TRACKER_PORT", 9190))
key = "android"

# Auto tuning is a compute-intensive and time-consuming task.
# It is set to False in the configuration above as this script runs on x86 for demonstration.
# Please set :code:`is_tuning` to True to enable auto tuning.

if is_tuning:
# Auto Tuning Stage 1: Extract tunable tasks
tasks = autotvm.task.extract_from_program(
@@ -275,9 +309,9 @@
),
)
n_trial = 1024 # Number of iterations of training before choosing the best kernel config
early_stopping = False # Do we apply early stopping when the loss is not minimizing
early_stopping = False # Can be enabled to stop tuning while the loss is not minimizing.

# Iterate through each task and call the tuner
# Auto Tuning Stage 3: Iterate through the tasks and tune.
from tvm.autotvm.tuner import XGBTuner

for i, tsk in enumerate(reversed(tasks[:3])):
@@ -295,14 +329,17 @@
autotvm.callback.log_to_file(tmp_log_file),
],
)
# Pick the best performing kerl configurations from the overall log.
# Auto Tuning Stage 4: Pick the best performing configurations from the overall log.
autotvm.record.pick_best(tmp_log_file, tune_log)

#################################################################
# Enable OpenCLML Offloading
# --------------------------
# OpenCLML offloading will try to accelerate supported operators
# by using OpenCLML proprietary operator library.

# By default :code:`enable_clml` is set to False in the configuration section above.

if not local_demo and enable_clml:
mod = clml.partition_for_clml(mod, params)

16 changes: 15 additions & 1 deletion gallery/how_to/deploy_models/deploy_model_on_adreno_tvmc.py
@@ -65,7 +65,10 @@

# To enable OpenCLML accelerated operator library.
enable_clml = False
cross_compiler = "/opt/android-sdk-linux/ndk/21.3.6528147/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android28-clang"
cross_compiler = (
os.environ["ANDROID_NDK_HOME"]
+ "/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android28-clang"
)

#######################################################################
# Make a Keras Resnet50 Model
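
Reading `os.environ["ANDROID_NDK_HOME"]` directly raises a bare `KeyError` when the variable is unset. A more defensive variant could look like the sketch below; `ndk_cross_compiler` is a hypothetical helper, and the `linux-x86_64` host tag and API level 28 are simply the values from the path used in this script.

```python
import os
import posixpath


def ndk_cross_compiler(env=None, api_level=28, host_tag="linux-x86_64"):
    """Build the aarch64 clang path inside the Android NDK (hypothetical helper)."""
    env = os.environ if env is None else env
    ndk_home = env.get("ANDROID_NDK_HOME")
    if not ndk_home:
        # Fail early with an actionable message instead of a bare KeyError.
        raise RuntimeError("ANDROID_NDK_HOME is not set; point it at the Android NDK root")
    return posixpath.join(
        ndk_home,
        "toolchains",
        "llvm",
        "prebuilt",
        host_tag,
        "bin",
        "aarch64-linux-android%d-clang" % api_level,
    )


print(ndk_cross_compiler({"ANDROID_NDK_HOME": "/opt/android-ndk"}))
```

`posixpath` is used because the toolchain layout in question is the Linux host one; the helper does not check that the resulting file exists.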
@@ -104,6 +107,12 @@
rpc_key = "android"
rpc_tracker = rpc_tracker_host + ":" + str(rpc_tracker_port)

# Auto tuning is a compute-intensive and time-consuming task.
# It is set to False in the configuration above as this script runs on x86 for demonstration.
# Please set :code:`is_tuning` to True to enable auto tuning.

# Also, :code:`test_target` is set to :code:`llvm` in this example to make it compatible with the x86 demonstration.
# Please change it to :code:`opencl` or :code:`opencl -device=adreno` for an RPC target in the configuration above.

if is_tuning:
tvmc.tune(
@@ -125,6 +134,11 @@
# -----------
# Compilation to produce tvm artifacts

# This generated example runs on our x86 server for demonstration.
# To deploy and tune on a real target over RPC, please set :code:`local_demo` to False in the configuration section above.

# OpenCLML offloading will try to accelerate supported operators by using OpenCLML proprietary operator library.
# By default :code:`enable_clml` is set to False in the configuration section above.

if not enable_clml:
if local_demo:
1 change: 0 additions & 1 deletion tests/scripts/task_build_adreno_bins.sh
@@ -28,7 +28,6 @@ cd ${output_directory}

cp ../cmake/config.cmake .

echo set\(USE_MICRO OFF\) >> config.cmake
if [ -f "${ADRENO_OPENCL}/CL/cl_qcom_ml_ops.h" ] ; then
echo set\(USE_CLML "${ADRENO_OPENCL}"\) >> config.cmake
echo set\(USE_CLML_GRAPH_EXECUTOR "${ADRENO_OPENCL}"\) >> config.cmake
1 change: 0 additions & 1 deletion tests/scripts/task_config_build_adreno.sh
@@ -23,7 +23,6 @@ mkdir -p "$BUILD_DIR"
cd "$BUILD_DIR"
cp ../cmake/config.cmake .

echo set\(USE_OPENCL ON\) >> config.cmake
if [ -f "${ADRENO_OPENCL}/CL/cl_qcom_ml_ops.h" ] ; then
echo set\(USE_CLML ${ADRENO_OPENCL}\) >> config.cmake
fi