A toy convolutional neural network framework, written in Swift and Metal, for iOS and macOS devices.
On macOS
- Download the MNIST dataset

```shell
curl -O http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
curl -O http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
curl -O http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
curl -O http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
gunzip t*-ubyte.gz
```
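The downloaded files use the IDX binary format: a big-endian header (a magic number encoding the element type and dimension count, then one 32-bit size per dimension) followed by raw pixel or label bytes. As a quick sanity check after downloading, you can parse a header like this (a hypothetical helper in Python, not part of this framework):

```python
import struct

def parse_idx_header(data: bytes):
    """Parse an IDX header: two zero bytes, a type code, a dimension
    count, then one big-endian 32-bit size per dimension."""
    zero1, zero2, dtype, ndim = struct.unpack(">BBBB", data[:4])
    dims = struct.unpack(f">{ndim}I", data[4:4 + 4 * ndim])
    return dtype, dims

# Synthetic header matching train-images-idx3-ubyte:
# magic 0x00000803 (unsigned byte, 3 dims), 60000 x 28 x 28 images.
header = struct.pack(">IIII", 0x00000803, 60000, 28, 28)
print(parse_idx_header(header))  # (8, (60000, 28, 28))
```

In a real check you would read the first bytes of the unzipped file instead of the synthetic header above.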
- Define a network
```swift
let net = Sequential([
    Conv(3, count: 3, padding: 1),
    Conv(3, count: 3, padding: 1),
    Conv(3, count: 3, padding: 1),
    ReLU(),
    MaxPool(2, step: 2),
    Conv(3, count: 6, padding: 1),
    Conv(3, count: 6, padding: 1),
    Conv(3, count: 6, padding: 1),
    ReLU(),
    MaxPool(2, step: 2),
    Dense(inFeatures: 6 * 7 * 7, outFeatures: 120),
    Dense(inFeatures: 120, outFeatures: 10)
])
```
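The `6 * 7 * 7` fed to the first `Dense` layer comes from tracing the spatial shape: a 28×28 MNIST image keeps its size through the padded 3×3 convolutions, each 2×2 max pool with stride 2 halves it (28 → 14 → 7), and the second conv block leaves 6 channels. A quick check of that arithmetic using the standard convolution output-size formula:

```python
def conv_out(size, kernel, padding, stride=1):
    # Standard convolution output size: (n + 2p - k) / s + 1
    return (size + 2 * padding - kernel) // stride + 1

size, channels = 28, 1
for block_channels in (3, 6):       # the two conv blocks above
    for _ in range(3):              # three 3x3 convs, padding 1: size unchanged
        size = conv_out(size, 3, 1)
    channels = block_channels
    size = conv_out(size, 2, 0, 2)  # 2x2 max pool, stride 2: size halved

print(channels, size, size, channels * size * size)  # 6 7 7 294
```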
- Create a data reader
```swift
let reader = MNISTReader(
    root: "/.../mnist", // path to your dataset
    batchSize: 64
)
```
- Use the GPU (otherwise computation will be extremely slow)

```swift
Core.device = MTLCreateSystemDefaultDevice()
```
- Train
```swift
func train() {
    ModelStorage.load(net, path: "mnistmodel01.nnm")
    for i in 0..<3 {
        var j = 0
        var runningLoss: Float = 0.0
        while let (img, label) = reader.nextTrain() {
            net.zeroGrad()
            let _ = net.forward(img)
            let loss = net.loss(label)
            net.backward(label)
            net.step(lr: 0.0001, momentum: 0.99)
            runningLoss += loss
            if j % 100 == 99 {
                // Print the average loss over the last 100 batches.
                print("[\(i), \(j)] loss: \(runningLoss / 100)")
                runningLoss = 0.0
                ModelStorage.save(net, path: "mnistmodel01.nnm")
            }
            j += 1
        }
    }
    ModelStorage.save(net, path: "mnistmodel01.nnm")
}

train()
```
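`net.step(lr:momentum:)` presumably applies a standard SGD-with-momentum update; the framework's exact rule isn't shown here, so the sketch below is the textbook version, not a confirmed implementation. Each parameter keeps a velocity that accumulates gradients, which is why a momentum as high as 0.99 lets the tiny learning rate of 0.0001 still make steady progress:

```python
def sgd_momentum_step(params, grads, velocities, lr=0.0001, momentum=0.99):
    """Textbook SGD with momentum: v = momentum * v + grad; p -= lr * v."""
    for i in range(len(params)):
        velocities[i] = momentum * velocities[i] + grads[i]
        params[i] -= lr * velocities[i]

params = [1.0]
velocities = [0.0]
for _ in range(3):  # a repeated identical gradient keeps growing the velocity
    sgd_momentum_step(params, [0.5], velocities)
print(params[0], velocities[0])
```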
The train() function should look familiar if you have used PyTorch.
Total training time is about 24 minutes on my computer (MacBook Pro Retina, 13-inch, Mid 2014).
Maybe you think that's very slow.
Yes. So it's recommended to start training just before you go to lunch.
- Test
```swift
func test() {
    ModelStorage.load(net, path: "mnistmodel01.nnm")
    var cor = 0
    var tot = 0
    while let (img, label) = reader.nextTest() {
        let score = net.forward(img)
        let pred = score.indexOfMax()
        if pred == label {
            cor += 1
            print("\(tot): Y \(pred) == \(label)")
        } else {
            print("\(tot): N \(pred) != \(label)")
        }
        tot += 1
    }
    print("correct: \(cor) / \(tot)")
}

test()
```
Since we only trained for three epochs, the accuracy won't be very high (about 86%).
To improve it, train for more epochs.
On iOS
- Move the pretrained model file mnistmodel01.nnm into your iOS project. Make sure its Target Membership is selected.
- Create a model with the same structure as before.

```swift
let model = Sequential([
    Conv(3, count: 3, padding: 1),
    ...
    Dense(inFeatures: 120, outFeatures: 10)
])
```
- Load the model parameters and enable GPU computation in viewDidLoad().

```swift
override func viewDidLoad() {
    super.viewDidLoad()
    let path = Bundle.main.path(forResource: "mnistmodel01", ofType: "nnm")!
    ModelStorage.load(model, path: path)
    Core.device = MTLCreateSystemDefaultDevice()
}
```
- Use the forward() function to predict.

```swift
@IBAction func check() {
    let img = getViewImage() // converts the UIView content to an NNArray
    let res = model.forward(img)
    let pred = res.indexOfMax()
    button.setTitle("\(pred)", for: .normal)
}
```
You can read the source for details.
- Conv (2D convolutional layer)
- Dense (fully connected layer)
- ReLU (leaky ReLU)
- MaxPool (2D max pooling layer)
- AveragePool (2D average pooling layer, incomplete)
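Note that ReLU here is a leaky ReLU: negative inputs are scaled by a small slope instead of being zeroed, which keeps gradients flowing through inactive units. The slope this framework uses isn't documented; 0.01 below is a common default, not a confirmed value:

```python
def leaky_relu(x, negative_slope=0.01):
    # Pass positives through unchanged; scale negatives by a small slope.
    return x if x > 0 else negative_slope * x

print(leaky_relu(2.0))   # 2.0
print(leaky_relu(-2.0))  # -0.02
```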