Bender API

This document explains the basic API in Bender.

Creating a network

A network model can be created from scratch or imported from a TensorFlow graph. This section explains how to create one from scratch:

let network = Network(inputSize: inputSize,
                      parameterLoader: loader)

network.start
    ->> Convolution(convSize: ConvSize(outputChannels: 16, kernelSize: 3, stride: 2))
    ->> InstanceNorm()
    ->> Convolution(convSize: ConvSize(outputChannels: 32, kernelSize: 3, stride: 2), neuronType: .relu)
    ->> InstanceNorm()
    ->> FullyConnected(neurons: 128)
    ->> Neuron(type: .tanh)
    ->> FullyConnected(neurons: 10)
    ->> Softmax()


First, we create the network, which receives the inputSize and a parameter loader. The network comes with a start node, which is the starting point of the network. The inputSize is the size expected by the first layer in the network. If the images you pass to the network are not of the expected size, the start node will resize them accordingly.

The parameterLoader is responsible for loading the weights for each layer. It will be explained in detail further below.

After creating the network you can add layers to it with the ->> operator. You can also add parallel paths to your network by adding an array of subgraphs like this:

previousLayer
    ->> [Convolution(convSize: someSize) ->> LocalResponseNorm(),
         Convolution(convSize: otherSize) ->> InstanceNorm()]
    ->> Add()

In this case the output of previousLayer is passed to two different Convolution layers, each followed by a normalization layer, and the two results are then summed by the Add layer.

After you finish adding layers to your network, you must call network.initialize() to finish setting up your network.

Running a network

To run a network call run(/* ... */):

// get image from somewhere
let image = MPSImage(/* ... */)

network.run(/* ... */ image) { output in
    // ...
}


A network created from scratch needs a ParameterLoader, which is responsible for loading the parameters for its layers. Bender implements two types of ParameterLoader: PerLayerBinaryLoader, which expects a different file for each parameter, and SingleBinaryLoader, which takes a single file and reads all parameters from it.

These loaders should cover most cases, but if you want to implement a different loader you have to create a class conforming to ParameterLoader:

public protocol ParameterLoader {

    /// Weight checkpoint. This variable is the prefix of the weight files.
    var checkpoint: String { get set }

    /// Loads weights for a single buffer
    /// - Parameters:
    ///   - id: NetworkLayer id
    ///   - modifier: The type of weights to load (e.g. bias, scale, shift). Used to distinguish the different parameters needed for a layer.
    ///   - size: Amount of floats to load
    /// - Returns: A pointer to the loaded floats
    func loadWeights(for id: String, modifier: String, size: Int) -> UnsafePointer<Float>
}


Note: the modifier argument of loadWeights is passed when a layer requests more than one parameter (e.g. weights and bias). It is used to tell these parameters apart.
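As an illustration of a custom loader, here is a minimal sketch that serves weights from an in-memory dictionary. The protocol is copied from above so the snippet compiles on its own. DictionaryLoader, its "checkpoint/id/modifier" key scheme and the sample layer ids and modifier names are assumptions made for this example; they are not part of Bender.

```swift
// Copied from the protocol above so that the sketch is self-contained.
protocol ParameterLoader {
    var checkpoint: String { get set }
    func loadWeights(for id: String, modifier: String, size: Int) -> UnsafePointer<Float>
}

/// Hypothetical loader (not part of Bender) backed by a dictionary.
final class DictionaryLoader: ParameterLoader {
    var checkpoint: String
    private let store: [String: [Float]]
    // Buffers handed out so far, kept alive so returned pointers stay valid.
    private var buffers: [UnsafeMutablePointer<Float>] = []

    init(checkpoint: String, store: [String: [Float]]) {
        self.checkpoint = checkpoint
        self.store = store
    }

    deinit {
        buffers.forEach { $0.deallocate() }
    }

    func loadWeights(for id: String, modifier: String, size: Int) -> UnsafePointer<Float> {
        // Assumed key scheme: "<checkpoint>/<id>/<modifier>".
        let key = [checkpoint, id, modifier].joined(separator: "/")
        guard let values = store[key], values.count == size else {
            fatalError("No \(size) floats stored under \(key)")
        }
        // Copy the floats into a buffer owned by the loader.
        let buffer = UnsafeMutablePointer<Float>.allocate(capacity: size)
        buffer.initialize(from: values, count: size)
        buffers.append(buffer)
        return UnsafePointer(buffer)
    }
}

// Usage: the modifier selects which parameter of layer "conv1" to load.
let loader = DictionaryLoader(checkpoint: "v1", store: [
    "v1/conv1/weights": [0.5, -0.5, 1.0],
    "v1/conv1/bias": [0.1],
])
let bias = loader.loadWeights(for: "conv1", modifier: "bias", size: 1)
let weights = loader.loadWeights(for: "conv1", modifier: "weights", size: 3)
print(bias[0], weights[2])
```

The loader owns the returned memory in this sketch, so the pointers remain valid for as long as the loader is alive. Whether a real loader should own or hand over the buffers depends on how the layers consume them.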

Creating custom layers

To add a new layer you must create a subclass of NetworkLayer and override some of the following functions:

func initialize(network: Network, device: MTLDevice)
func execute(commandBuffer: MTLCommandBuffer)
func updatedCheckpoint(device: MTLDevice) // optional

Initializing the layer

initialize is called when the network is initialized and should set up everything needed for later execution. It should load the weights (we do not want to do this on every execution loop) and also create outputImage and outputSize. Bender uses MPSTemporaryImage for images used internally in a layer and MPSImage for inter-layer communication. Apple suggests using MPSTemporaryImage for any image that is produced and consumed within one MTLCommandBuffer, but we experienced some problems with them, and creating the MPSImages at initialization time should not cause a performance hit at execution time.

outputImage and outputSize are two variables defined in NetworkLayer which must be instantiated at initialization time or before. outputImage is the image passed to the next nodes and outputSize is the size of this image.

Executing the layer

The execute function is called in each run loop. It should execute the layer and store the result in outputImage. You must override this function.

Updating checkpoints

Sometimes you might have more than one set of learned parameters. In that case, updatedCheckpoint is called when the network is asked to change its parameters (a set of parameters is known as a checkpoint in TensorFlow).

You should override this method if your layer depends on learned parameters and your app allows changing checkpoints.
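The division of work between the three functions can be illustrated with a plain-Swift stand-in that avoids Metal entirely. ToyLayer, ScaleLayer and FakeCheckpoint are hypothetical names invented for this sketch. The real overrides receive Network, MTLDevice and MTLCommandBuffer arguments and write to outputImage, all of which are left out here.

```swift
// Plain-Swift stand-in for the lifecycle of a layer. These classes are
// hypothetical and only mirror the roles of the three functions above.
class ToyLayer {
    var output: [Float] = []
    func initialize() {}
    func execute(input: [Float]) {}
    func updatedCheckpoint() {}
}

/// Stands in for a parameter source whose contents can change.
final class FakeCheckpoint {
    var scale: Float
    init(scale: Float) { self.scale = scale }
}

final class ScaleLayer: ToyLayer {
    private let source: FakeCheckpoint
    private var scale: Float = 1
    private(set) var loads = 0   // how many times parameters were loaded

    init(source: FakeCheckpoint) {
        self.source = source
        super.init()
    }

    // Load parameters once, up front.
    override func initialize() { reload() }

    // Runs on every loop and only uses the already loaded parameters.
    override func execute(input: [Float]) {
        output = input.map { $0 * scale }
    }

    // Reload only when the checkpoint changes.
    override func updatedCheckpoint() { reload() }

    private func reload() {
        scale = source.scale
        loads += 1
    }
}

let source = FakeCheckpoint(scale: 2)
let layer = ScaleLayer(source: source)
layer.initialize()
layer.execute(input: [1, 2])
let firstOutput = layer.output
layer.execute(input: [3])        // no parameter load here

source.scale = 10                // a new checkpoint becomes available
layer.updatedCheckpoint()
layer.execute(input: [1, 2])
print(firstOutput, layer.output, layer.loads)
```

Parameters are loaded twice in total: once in initialize and once in updatedCheckpoint, regardless of how many times execute runs.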

Implementing custom Metal kernels

If your custom layer needs a custom Metal kernel function, create a .metal file and implement it there. Then, in your initialize function, you can get the MTLComputePipelineState for that function by calling:

pipeline = MetalShaderManager.shared.getFunction(name: "my_function")

There is a lot to learn about Metal, its special considerations and how it differs from CUDA, but we won't document that here. If you are not familiar with Metal, the WWDC videos are a good place to get started.


The MetalShaderManager keeps all the custom kernel functions that an app has loaded so that different layers that use the same Metal function will effectively use the same MTLComputePipelineState.

It also manages the function constants passed to these Metal functions. If you have a function that relies on function constants then you can pass them to the MetalShaderManager when you get your function. The function to get a kernel function is this:

/// Get a MTLComputePipelineState with a Metal function of the given name
/// - Parameters:
///   - name: name of the function
///   - bundle: Bundle where the shader function was compiled. Used to get the correct library
///   - constants: function constants passed to this function
/// - Returns: a MTLComputePipelineState for the requested function
func getFunction(name: String, in bundle: Bundle = Bundle.main, constants: [FunctionConstantBase]? = nil) -> MTLComputePipelineState

The function constants are created like this:

let c1 = FunctionConstant(index: 0, type: MTLDataType.ushort, value: 2)
let c2 = FunctionConstant(index: 1, type: MTLDataType.float, value: 0.5)
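The sharing behaviour described in this section can be pictured as a cache keyed by the requested function. The following toy model is a sketch of that idea only: Pipeline and PipelineCache are hypothetical stand-ins, constants are represented as plain strings, and it is an assumption that the cache key combines the function name with its constants.

```swift
/// Hypothetical stand-in for a compiled MTLComputePipelineState.
final class Pipeline {
    let name: String
    init(name: String) { self.name = name }
}

/// Toy cache: one Pipeline per (function name, constants) combination.
final class PipelineCache {
    private var cache: [String: Pipeline] = [:]
    private(set) var compilations = 0

    func getFunction(name: String, constants: [String] = []) -> Pipeline {
        // Assumed key: the name followed by a description of each constant.
        let key = ([name] + constants).joined(separator: "|")
        if let cached = cache[key] {
            return cached
        }
        compilations += 1   // the expensive step happens once per key
        let pipeline = Pipeline(name: name)
        cache[key] = pipeline
        return pipeline
    }
}

let manager = PipelineCache()
let a = manager.getFunction(name: "my_function")
let b = manager.getFunction(name: "my_function")
let c = manager.getFunction(name: "my_function", constants: ["0:ushort:2"])
print(a === b, a === c, manager.compilations)
```

Two layers asking for the same function get the same object back, while a request with different constants yields a separate one.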

Composing layers

It is useful to have single nodes that perform only a convolution or only a normalization. On the other hand, within a single network we might want to run the same normalization after each convolution and possibly add an activation neuron after it. We also want to support residual layers easily.

Therefore, we support composite layers, which are simply sets of layers that we want to reuse in a network.

For example, if most of your Convolution layers are followed by an Instance Normalization and an activation neuron, you could define this:

class ConvolutionBlock: CompositeLayer {

    public var input: NetworkLayer
    public var output: NetworkLayer

    public init(convSize: ConvSize, neuron: ActivationNeuronType = .relu, id: String? = nil) {
        let convId = id ?? ""
        let group = Convolution(convSize: convSize, neuronType: .none, id: convId)
                    ->> InstanceNorm(id: convId + "-instanceNorm")
                    ->> Neuron(type: neuron, id: convId + "NEURON")

        self.input = group.input
        self.output = group.output
    }
}


A CompositeLayer is a temporary structure that will be removed as soon as its sublayers are added to the network graph. It has an input and an output, which must be set to the first and last sublayers, as can be seen in the example above.

But how does this work?

A CompositeLayer is a Group. Groups are temporary structures used during the generation of the execution graph. A Group consists of an input and an output, which are the first and last nodes of a group of nodes. Bender's operator ->> works on Groups. It takes two Groups and returns one by joining the output of the first to the input of the second. The two original Groups can then be destroyed.

A CompositeLayer is the same. It will be destroyed as soon as it is applied in an operator. But its sublayers survive... 😀
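To make the mechanism concrete, here is a toy model of Groups and the ->> operator in plain Swift. It is a sketch of the idea described above, not Bender's implementation: Node, Chain and ConvBlock are hypothetical types, and nodes only record their outgoing edges.

```swift
// Toy model only: these types are hypothetical stand-ins, not Bender's classes.
precedencegroup ChainPrecedence {
    associativity: left
}
infix operator ->>: ChainPrecedence

/// A group exposes the first and last node of a chain of nodes.
protocol Group {
    var input: Node { get }
    var output: Node { get }
}

/// A graph node. A single node is a group whose input and output are itself.
final class Node: Group {
    let name: String
    var next: [Node] = []   // outgoing edges
    init(_ name: String) { self.name = name }
    var input: Node { return self }
    var output: Node { return self }
}

/// Temporary pair of endpoints returned by the operator.
struct Chain: Group {
    let input: Node
    let output: Node
}

/// Joins the output of the left group to the input of the right group.
func ->> <L: Group, R: Group>(left: L, right: R) -> Chain {
    left.output.next.append(right.input)
    return Chain(input: left.input, output: right.output)
}

/// Composite: builds its sublayers, then only exposes their endpoints.
struct ConvBlock: Group {
    let input: Node
    let output: Node
    init(id: String) {
        let group = Node(id + "-conv") ->> Node(id + "-norm") ->> Node(id + "-neuron")
        input = group.input
        output = group.output
    }
}

let start = Node("start")
let whole = start ->> ConvBlock(id: "a") ->> Node("softmax")

// Walk the edges: the composite is gone, but its sublayers remain linked.
var names: [String] = []
var cursor: Node? = whole.input
while let node = cursor {
    names.append(node.name)
    cursor = node.next.first
}
print(names)
```

The ConvBlock value and the intermediate Chain values are discarded as soon as the operator returns. Only the nodes and the edges between them remain.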