Implementation of the MobileNets CNN model in CUDA

The main branch contains the original design. For the improved design, see the kernel_layout branch.
Paper: https://arxiv.org/abs/1704.04861
- Copy the entire repository to your workspace (CARC).
- No extra libraries or dependencies are required.
- Run the following commands in the exact order shown:
module load nvidia-hpc-sdk
nvcc MobileNets_host.cu -o MobileNets_host
sbatch job.sl
cat gpujob.out
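
Note that sbatch returns as soon as the job is queued, so running cat gpujob.out immediately afterwards may show an empty or missing file. The steps above could be wrapped roughly as follows. This is a minimal sketch, not part of the repository: the function name run_mobilenets is hypothetical, and it assumes that job.sl writes its output to gpujob.out, as the last command implies.

```shell
# Sketch only: runs the README steps in order and stops at the first failure.
# Assumption: job.sl writes its output to gpujob.out in the current directory.
run_mobilenets() {
    # Load the NVIDIA HPC SDK so that nvcc is available.
    module load nvidia-hpc-sdk || return 1
    # Compile the host program.
    nvcc MobileNets_host.cu -o MobileNets_host || return 1
    # --wait makes sbatch block until the submitted job terminates,
    # so the output file is complete before it is printed.
    sbatch --wait job.sl || return 1
    # Print the job output.
    cat gpujob.out
}
```

To try it, source the function in your login shell (where the module command is defined) and call run_mobilenets from the repository root.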