This repository accompanies an article published in Towards Data Science.
It is an exercise in implementing the same CNN architecture in both PyTorch and TensorFlow. I have tried to keep the architecture, optimizer, learning rate, and scheduler the same across both implementations, but minor differences are inevitable. Both reach a similar accuracy of around 99% on the test set.
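One example of a minor difference is how the two frameworks count scheduler steps. The sketch below is illustrative only: it assumes a step-decay schedule (PyTorch's `StepLR` on one side, Keras's `ExponentialDecay` with `staircase=True` on the other), and the values of `BASE_LR`, `GAMMA`, and `STEPS_PER_EPOCH` are hypothetical rather than taken from the notebooks. It uses the closed-form expressions for both schedules in plain Python to show that they agree at epoch boundaries once the Keras `decay_steps` is expressed in batches rather than epochs.

```python
import math

def step_lr(base_lr: float, gamma: float, step_size: int, epoch: int) -> float:
    """Closed form of PyTorch's StepLR: multiply by gamma every `step_size` epochs."""
    return base_lr * gamma ** (epoch // step_size)

def exponential_decay_staircase(
    base_lr: float, decay_rate: float, decay_steps: int, step: int
) -> float:
    """Closed form of Keras's ExponentialDecay with staircase=True.

    `step` counts optimizer updates (batches), not epochs.
    """
    return base_lr * decay_rate ** (step // decay_steps)

# Hypothetical values, chosen for illustration only.
BASE_LR = 1.0
GAMMA = 0.7
STEPS_PER_EPOCH = 938  # ceil(60,000 training images / batch size of 64)

def schedules_match(num_epochs: int = 5) -> bool:
    """Check that both schedules give the same rate at the start of each epoch."""
    for epoch in range(num_epochs):
        torch_lr = step_lr(BASE_LR, GAMMA, step_size=1, epoch=epoch)
        keras_lr = exponential_decay_staircase(
            BASE_LR, GAMMA, decay_steps=STEPS_PER_EPOCH, step=epoch * STEPS_PER_EPOCH
        )
        if not math.isclose(torch_lr, keras_lr):
            return False
    return True

if __name__ == "__main__":
    print("Schedules agree at epoch boundaries:", schedules_match())
```

The point of the sketch is the unit mismatch: a PyTorch scheduler is typically stepped once per epoch, while a Keras schedule is evaluated per batch, so the decay interval has to be scaled by the number of batches per epoch for the two to line up.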
Each Notebook can be quickly launched in Google Colab using the links below. For GPU acceleration, remember to change your Notebook runtime to GPU.
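To confirm that the runtime change took effect, a quick check along the following lines can be run in a notebook cell. This is a minimal sketch, not code from the notebooks: the helper names `pytorch_device` and `tensorflow_gpu_count` are made up for illustration, and the imports are placed inside the functions so the cell still runs when only one of the two frameworks is installed.

```python
def pytorch_device() -> str:
    """Return "cuda" if PyTorch can see a GPU, otherwise "cpu".

    Also returns "cpu" when PyTorch is not installed.
    """
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"

def tensorflow_gpu_count() -> int:
    """Return the number of GPUs visible to TensorFlow (0 if not installed)."""
    try:
        import tensorflow as tf
    except ImportError:
        return 0
    return len(tf.config.list_physical_devices("GPU"))

if __name__ == "__main__":
    print("PyTorch device:", pytorch_device())
    print("TensorFlow GPUs:", tensorflow_gpu_count())
```

If the PyTorch device is reported as `cpu` or the TensorFlow GPU count is 0 after switching the runtime, restarting the runtime usually resolves it.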
Each example uses the MNIST dataset, which consists of 70,000 images of handwritten digits (0 to 9). Each image is a 28x28 pixel grid of grayscale values. As this is a common research dataset, both PyTorch and TensorFlow include their own helper functions for fetching the data, and I have used these helper functions in each case.
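For reference, the helper functions in question look roughly like this. The sketch assumes `torchvision` and `tensorflow` are installed and that the data is cached under a hypothetical `./data` directory. The wrapper names `load_mnist_pytorch` and `load_mnist_tensorflow` are illustrative, and the preprocessing in the notebooks may differ. The constants record the standard MNIST split of 60,000 training and 10,000 test images.

```python
# Standard MNIST split: 60,000 training + 10,000 test images = 70,000 total.
TRAIN_IMAGES = 60_000
TEST_IMAGES = 10_000
IMAGE_HEIGHT, IMAGE_WIDTH = 28, 28

def load_mnist_pytorch(root: str = "./data"):
    """Fetch MNIST through torchvision; downloads on first call."""
    from torchvision import datasets, transforms

    # ToTensor converts each image to a float tensor scaled to [0, 1].
    transform = transforms.ToTensor()
    train = datasets.MNIST(root, train=True, download=True, transform=transform)
    test = datasets.MNIST(root, train=False, download=True, transform=transform)
    return train, test

def load_mnist_tensorflow():
    """Fetch MNIST through Keras; returns uint8 NumPy arrays."""
    import tensorflow as tf

    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    return (x_train, y_train), (x_test, y_test)

if __name__ == "__main__":
    print("Total images:", TRAIN_IMAGES + TEST_IMAGES)
    print("Pixels per image:", IMAGE_HEIGHT * IMAGE_WIDTH)
```

Note that the Keras loader returns raw pixel values in the range 0 to 255, whereas `ToTensor` already rescales to the range 0 to 1, so any comparison between the two frameworks needs the same normalization applied on both sides.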
The PyTorch implementation is based on the example provided by the PyTorch development team, available on GitHub. I made various modifications to this code to harmonize it with the TensorFlow example and to make it easier to run inside a Jupyter Notebook.
For the TensorFlow example, I used Amy Jang's tutorial on Kaggle, which itself borrows from the Keras development team's example and the tutorial by Yassine Ghouzam. Again, I made various modifications to this code to harmonize it with the PyTorch example.