Skip to content

Latest commit

 

History

History
5 lines (4 loc) · 366 Bytes

README.md

File metadata and controls

5 lines (4 loc) · 366 Bytes

Separable Convolution by CUDA and TVM

CUDA and TVM code for separable convolution (depthwise convolution + 1x1 convolution).
TVM scheduler for normal convolution with the tensor layout in the form of NHWC, currently missing in the TVM repository.
Improved runtime of TVM’s depthwise convolution scheduler by replacing filterreuse with input reuse.