[PYTHON, TVM] Python TVM library, unit tests and end to end example
* VTA python library
* Python unit tests
* End to end example with Resnet18
* README instructions
* Bug fixes
tmoreau89 authored and tqchen committed Jul 12, 2018
1 parent e7557db commit 16b5877
Showing 35 changed files with 4,046 additions and 77 deletions.
6 changes: 3 additions & 3 deletions Makefile
@@ -55,10 +55,10 @@ endif
all: lib/libvta.$(SHARED_LIBRARY_SUFFIX)

VTA_LIB_SRC = $(wildcard src/*.cc src/tvm/*.cc)
-ifeq ($(TARGET), PYNQ_TARGET)
+ifeq ($(TARGET), VTA_PYNQ_TARGET)
VTA_LIB_SRC += $(wildcard src/pynq/*.cc)
LDFLAGS += -L/usr/lib -lsds_lib
-LDFLAGS += -L/opt/python3.6/lib/python3.6/site-packages/pynq/drivers/ -l:libdma.so
+LDFLAGS += -L/opt/python3.6/lib/python3.6/site-packages/pynq/lib/ -l:libdma.so
endif
VTA_LIB_OBJ = $(patsubst %.cc, build/%.o, $(VTA_LIB_SRC))

@@ -79,7 +79,7 @@ cpplint:
	python nnvm/dmlc-core/scripts/lint.py vta cpp include src hardware tests

pylint:
-	pylint python/vta --rcfile=$(ROOTDIR)/tests/lint/pylintrc
+	pylint python/tvm_vta --rcfile=$(ROOTDIR)/tests/lint/pylintrc

doc:
	doxygen docs/Doxyfile
80 changes: 80 additions & 0 deletions apps/pynq_rpc/README.md
@@ -0,0 +1,80 @@
# PYNQ RPC Server for VTA

This guide describes how to set up a Pynq-based RPC server to accelerate deep learning workloads with VTA.

## Pynq Setup

Follow the getting started tutorial for the [Pynq board](http://pynq.readthedocs.io/en/latest/getting_started.html).
* For this RPC setup make sure to go with the *Connect to a Computer* Ethernet setup.

Make sure that you can ssh into your Pynq board successfully:
```bash
ssh xilinx@192.168.2.99
```

When you ssh into the board, the default password for the `xilinx` account is `xilinx`.

For convenience, mount the Pynq board's file system on your host PC so that you can access and maintain it easily:
```bash
sshfs xilinx@192.168.2.99:/home/xilinx <mountpoint>
```

## Pynq TVM & VTA installation

On your **host PC**, go to the `<mountpoint>` directory of your Pynq board file system.
```bash
cd <mountpoint>
```

From there, clone the VTA repository:
```bash
git clone git@github.com:uwsaml/vta.git --recursive
```

Next, clone the TVM repository:
```bash
git clone git@github.com:dmlc/tvm.git --recursive
```

TVM is rapidly changing, and to ensure stability, we keep track of working TVM checkpoints.
As of now, the TVM checkpoint `e4c2af9abdcb3c7aabafba8084414d7739c17c4c` is known to work with VTA.
```bash
git checkout e4c2af9abdcb3c7aabafba8084414d7739c17c4c
```

Now, ssh into your **Pynq board** to build the TVM runtime with the following commands:
```bash
ssh xilinx@192.168.2.99 # ssh if you haven't done so
cd ~/tvm
cp make/config.mk .
echo USE_RPC=1 >> config.mk
make runtime -j2
```

## Pynq RPC server setup

We're now ready to build the Pynq RPC server on the Pynq board.
```bash
ssh xilinx@192.168.2.99 # ssh if you haven't done so
cd ~/vta
export TVM_PATH=/home/xilinx/tvm
make
```

The last step builds the `libvta.so` library file at `192.168.2.99:/home/xilinx/vta/lib/libvta.so`. We are now ready to launch the RPC server on the Pynq. To enable the FPGA drivers, the RPC server must run with administrator privileges (use `su`; account: `xilinx`, password: `xilinx`).
```bash
ssh xilinx@192.168.2.99 # ssh if you haven't done so
cd ~/vta
su
./apps/pynq_rpc/start_rpc_server.sh
```

You should see the following being displayed when starting the RPC server:
```
INFO:root:Load additional library /home/xilinx/vta/lib/libvta.so
INFO:root:RPCServer: bind to 0.0.0.0:9091
```

Note that the server should be listening on port `9091`.

To stop the RPC server, press `Ctrl + C`.
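
Before running anything from the host PC, it can help to confirm that the server is reachable. The following is a minimal sketch, assuming Python 3 on the host; `rpc_port_open` is a hypothetical helper that is not part of TVM or VTA, and it only checks that a TCP connection to the port succeeds. The default address and port are the ones used in this guide.

```python
import socket


def rpc_port_open(host="192.168.2.99", port=9091, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper: it only tests reachability and does not
    speak the TVM RPC protocol.
    """
    try:
        # create_connection resolves the host and applies the timeout
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts and unreachable hosts
        return False
```

Calling `rpc_port_open()` with no arguments probes the board address used above; a `False` result usually means the server is not running or the Ethernet link is not configured.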
2 changes: 1 addition & 1 deletion apps/pynq_rpc/start_rpc_server.sh
@@ -1,4 +1,4 @@
#!/bin/bash
export PYTHONPATH=${PYTHONPATH}:/home/xilinx/tvm/python
-export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/opt/python3.6/lib/python3.6/site-packages/pynq/drivers/
+export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/opt/python3.6/lib/python3.6/site-packages/pynq/lib/
python -m tvm.exec.rpc_server --load-library /home/xilinx/vta/lib/libvta.so
5 changes: 5 additions & 0 deletions examples/resnet18/pynq/.gitignore
@@ -0,0 +1,5 @@
quantize_graph.json
quantize_params.pkl
synset.txt
*.jpg
vta.bit
98 changes: 98 additions & 0 deletions examples/resnet18/pynq/README.md
@@ -0,0 +1,98 @@
# Resnet-18 Example on Pynq-based VTA Design

In order to run this example you'll need to have:
* VTA installed
* TVM installed
* NNVM installed
* A Pynq-based RPC server running

## VTA installation

Clone the VTA repository in the directory of your choosing:
```bash
git clone git@github.com:uwsaml/vta.git --recursive
```

Update your `~/.bashrc` file to include the VTA python libraries in your `PYTHONPATH` (don't forget to source the newly modified `.bashrc` file!):
```bash
export PYTHONPATH=<vta root>/python:${PYTHONPATH}
```

## TVM installation

Clone the TVM repository in the directory of your choosing:
```bash
git clone git@github.com:dmlc/tvm.git --recursive
```

TVM is rapidly changing, and to ensure stability, we keep track of working TVM checkpoints.
As of now, the TVM checkpoint `e4c2af9abdcb3c7aabafba8084414d7739c17c4c` is known to work with VTA.
```bash
git checkout e4c2af9abdcb3c7aabafba8084414d7739c17c4c
```

Before building TVM, copy the `make/config.mk` file into the root TVM directory:
```bash
cd <tvm root>
cp make/config.mk .
```

In the `config.mk` file, make sure that:
* `LLVM_CONFIG` points to the llvm-config executable (e.g. `LLVM_CONFIG = /usr/bin/llvm-config-4.0`). You'll need LLVM 4.0 or later installed.
* `USE_RPC` is set to 1

Launch the compilation; this takes about 5 minutes.
```bash
cd <tvm root>
make -j4
```

Finally update your `~/.bashrc` file to include the TVM python libraries in your `PYTHONPATH` (don't forget to source the newly modified `.bashrc` file!):
```bash
export PYTHONPATH=<tvm root>/python:<tvm root>/topi/python:${PYTHONPATH}
```

## NNVM installation

Clone the NNVM repository from `tqchen` in the directory of your choosing:
```bash
git clone git@github.com:tqchen/nnvm.git --recursive
```

To run this example, we rely on a special branch of NNVM: `qt`:
```bash
cd <nnvm root>
git checkout qt
```

Launch the compilation; this takes less than a minute.
```bash
cd <nnvm root>
make -j4
```

Finally update your `~/.bashrc` file to include the NNVM python libraries in your `PYTHONPATH` (don't forget to source the newly modified `.bashrc` file!):
```bash
export PYTHONPATH=<nnvm root>/python:${PYTHONPATH}
```
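
Once the three `PYTHONPATH` entries are in place, a quick check can confirm that the packages are visible to Python. This is a minimal sketch, assuming Python 3; `missing_modules` is a hypothetical helper, and the default names are the top-level packages that the paths above are meant to expose.

```python
import importlib.util


def missing_modules(names=("vta", "tvm", "topi", "nnvm")):
    """Return the top-level module names that cannot be found on the current path."""
    # find_spec returns None for a top-level module that is not importable
    return [name for name in names if importlib.util.find_spec(name) is None]
```

An empty list from `missing_modules()` suggests the paths are set correctly; any name it returns points at the `PYTHONPATH` entry to revisit. It does not verify that the native libraries were built.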

## Pynq RPC Server Setup

Follow the [Pynq RPC Server Guide](https://github.com/saml/vta/tree/master/apps/pynq_rpc/README.md)

## Running the example

Simply run the following python script:
```bash
python imagenet_predict.py
```

This runs ImageNet classification on a cat image, `cat.jpg`, using the ResNet18 architecture on a VTA design that performs 8-bit integer inference.

The script reports runtime measured on the Pynq board, and the top-1 result category:
```
('x', (1, 3, 224, 224))
Build complete...
('TVM prediction top-1:', 281, 'tabby, tabby cat')
t-cost=0.41906
```
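
The top-1 line pairs the index of the largest output score with its label from `synset.txt`. Below is a minimal sketch of that lookup in plain Python, standing in for the `np.argmax` call in the script; `top1` and its toy inputs are illustrative only and are not part of the example.

```python
def top1(scores, labels):
    """Return (index, label) for the highest score.

    Illustrative stand-in for np.argmax plus a synset lookup;
    on ties the first maximal index wins, as with np.argmax.
    """
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best, labels[best]
```

For instance, `top1([0.1, 0.7, 0.2], {0: "a", 1: "b", 2: "c"})` returns `(1, "b")`.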
174 changes: 174 additions & 0 deletions examples/resnet18/pynq/imagenet_predict.py
@@ -0,0 +1,174 @@
# some standard imports
import nnvm
import tvm
from nnvm.compiler import graph_attr
import vta
import os
import numpy as np
from PIL import Image
import pickle
import json
import logging
import wget
from tvm.contrib import graph_runtime, rpc, util

factor = 16
host = "pynq"
port = 9091
verbose = False
# only run fpga component, mark non-conv ops as nop
debug_fpga_only = False

# Obtain model and hardware files (they're too large to check-in)
url = "https://homes.cs.washington.edu/~moreau/media/vta/"
TEST_FILE = 'cat.jpg'
CATEG_FILE = 'synset.txt'
RESNET_GRAPH_FILE = 'quantize_graph.json'
RESNET_PARAMS_FILE = 'quantize_params.pkl'
BITSTREAM_FILE = 'vta.bit'
for file in [TEST_FILE, CATEG_FILE, RESNET_GRAPH_FILE, RESNET_PARAMS_FILE, BITSTREAM_FILE]:
    if not os.path.isfile(file):
        print("Downloading {}".format(file))
        wget.download(url+file)

# Program the FPGA remotely
assert tvm.module.enabled("rpc")
remote = rpc.connect(host, port)
remote.upload(BITSTREAM_FILE, BITSTREAM_FILE)
fprogram = remote.get_function("tvm.contrib.vta.init")
fprogram(BITSTREAM_FILE)

if verbose:
    logging.basicConfig(level=logging.INFO)

# Change to -device=tcpu to run cpu only inference.
target = "llvm -device=vta"

synset = eval(open(os.path.join(CATEG_FILE)).read())
image = Image.open(os.path.join(TEST_FILE)).resize((224, 224))

def transform_image(image):
    image = np.array(image) - np.array([123., 117., 104.])
    image /= np.array([58.395, 57.12, 57.375])
    image = image.transpose((2, 0, 1))
    image = image[np.newaxis, :]
    return image

def mark_nop(graph, conv_layer=-1, skip_conv_layer=()):
    """Helper function to mark certain op as nop

    Useful to debug performance issues.
    """
    jgraph = json.loads(graph.json())
    counter = 0
    for nid, node in enumerate(jgraph["nodes"]):
        op_name = node["op"]
        if op_name != "tvm_op":
            continue
        attrs = node["attrs"]
        node_name = node["name"]
        func_name = attrs["func_name"]
        if func_name.find("quantized_conv2d") != -1:
            if conv_layer >= 0:
                if counter != conv_layer:
                    attrs["func_name"] = "__nop"
            if counter in skip_conv_layer:
                attrs["func_name"] = "__nop"
            counter += 1
        else:
            if conv_layer >= 0:
                attrs["func_name"] = "__nop"
            attrs["func_name"] = "__nop"
        if attrs["func_name"] != "__nop":
            print("Run function %s"% func_name)
    graph = nnvm.graph.load_json(json.dumps(jgraph))
    return graph

x = transform_image(image)
print('x', x.shape)

######################################################################
# now compile the graph
import nnvm.compiler
np.random.seed(0)
sym = nnvm.graph.load_json(
    open(os.path.join(RESNET_GRAPH_FILE)).read())
# open the pickle in binary mode so it loads on every platform
params = pickle.load(
    open(os.path.join(RESNET_PARAMS_FILE), "rb"))

shape_dict = {"data": x.shape}
dtype_dict = {"data": 'float32'}
shape_dict.update({k: v.shape for k, v in params.items()})
dtype_dict.update({k: str(v.dtype) for k, v in params.items()})

graph = nnvm.graph.create(sym)
graph_attr.set_shape_inputs(sym, shape_dict)
graph_attr.set_dtype_inputs(sym, dtype_dict)
graph = graph.apply("InferShape").apply("InferType")

dtype = "float32"
sym = vta.graph.remove_stochastic(sym)
sym = vta.graph.clean_cast(sym)
sym = vta.graph.clean_conv_fuse(sym)
if "vta" in target:
    sym = vta.graph.pack(sym, shape_dict, factor)

graph_attr.set_shape_inputs(sym, shape_dict)
sym = sym.apply("InferShape")
graph_attr.set_dtype_inputs(sym, dtype_dict)
sym = sym.apply("InferType")

with nnvm.compiler.build_config(opt_level=3):
    bdict = {}
    if "vta" not in target:
        bdict = {"add_lower_pass": []}
    else:
        bdict = {"add_lower_pass": vta.debug_mode(0)}
    with tvm.build_config(**bdict):
        graph, lib, params = nnvm.compiler.build(
            sym, target, shape_dict, dtype_dict,
            params=params)

remote = rpc.connect(host, port)
temp = util.tempdir()
lib.save(temp.relpath("graphlib.o"))
remote.upload(temp.relpath("graphlib.o"))
lib = remote.load_module("graphlib.o")
ctx = remote.ext_dev(0) if "vta" in target else remote.cpu(0)

print("Build complete...")

def run_e2e(graph):
    """Running end to end example
    """
    if debug_fpga_only:
        graph = mark_nop(graph, skip_conv_layer=(0,))
    m = graph_runtime.create(graph, lib, ctx)
    # set inputs
    m.set_input('data', tvm.nd.array(x.astype("float32")))
    m.set_input(**params)
    # execute
    timer = m.module.time_evaluator("run", ctx, number=10)
    tcost = timer()
    # get outputs
    tvm_output = m.get_output(
        0, tvm.nd.empty((1000,), dtype, remote.cpu(0)))
    top1 = np.argmax(tvm_output.asnumpy())
    print('TVM prediction top-1:', top1, synset[top1])
    print("t-cost=%g" % tcost.mean)


def run_layer(old_graph):
    """Run a certain layer."""
    for layer_id in range(1, 2):
        graph = mark_nop(old_graph, layer_id)
        m = graph_runtime.create(graph, lib, ctx)
        # set inputs
        m.set_input('data', tvm.nd.array(x.astype("float32")))
        m.set_input(**params)
        # execute
        timer = m.module.time_evaluator("run", ctx, number=10)
        tcost = timer()
        print("resnet[%d]: %g\n"% (layer_id, tcost.mean))

run_e2e(graph)
File renamed without changes.