Added block-oriented network, ReLU activation function and hinge loss
The BlockNet class is based on the earlier SimpleNet class, but has
been designed to better integrate groupwise dropout during training.
The rectified linear (ReLU) activation function was added to those already
available, along with the standard SVM-like hinge loss.

Test files for the new blocky net were also added.
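
The groupwise dropout that BlockNet is designed around does not appear in the diff below. As a rough sketch of the general idea only, not of the BlockNet code, groupwise dropout zeros whole groups of hidden units with a shared mask; the sizes and names here (group_size, drop_rate) are hypothetical.

% Illustrative sketch of groupwise dropout: whole groups of hidden units
% are dropped together, rather than masking each unit independently.
% (Names and sizes below are hypothetical, not taken from BlockNet.)
obs_count = 100;                  % observations in a minibatch
hid_dim = 64;                     % hidden layer width
group_size = 8;                   % units per group (must divide hid_dim)
drop_rate = 0.5;                  % probability of dropping each group
acts = randn(obs_count, hid_dim); % stand-in hidden activations
group_count = hid_dim / group_size;
group_mask = (rand(obs_count, group_count) > drop_rate);
% Repeat each group's keep/drop decision across all units in that group.
unit_mask = kron(group_mask, ones(1, group_size));
acts_dropped = acts .* unit_mask;
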
Philip-Bachman committed Jan 15, 2013
1 parent 133c593 commit eabd463
Showing 8 changed files with 659 additions and 11 deletions.
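
The SVM-like hinge loss mentioned above is not part of the ActFunc.m excerpt that follows. As a minimal sketch of the standard binary formulation, not the code added in this commit, with per-observation outputs f and labels y in {-1, +1}:

% Standard binary hinge loss and its subgradient (illustration only).
f = [0.5; -2.0; 1.5];                 % classifier outputs (obs_count x 1)
y = [1; -1; -1];                      % labels in {-1, +1}
margins = 1 - (y .* f);
loss = sum(max(0, margins));          % summed hinge loss (= 3.0 here)
dL_df = -(y .* (margins > 0));        % subgradient w.r.t. the outputs f

Observations already beyond the unit margin contribute neither loss nor gradient.
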
ActFunc.m (36 additions, 0 deletions)
@@ -30,6 +30,8 @@
    case 4
        acts = ActFunc.logexp_ff(pre_values, pre_weights);
    case 5
        acts = ActFunc.relu_ff(pre_values, pre_weights);
    case 6
        acts = ActFunc.softmax_ff(pre_values, pre_weights);
    otherwise
        error('No valid activation function type selected.');
@@ -54,6 +56,9 @@
        node_grads = ActFunc.logexp_bp(...
            post_grads, post_weights, pre_values, pre_weights);
    case 5
        node_grads = ActFunc.relu_bp(...
            post_grads, post_weights, pre_values, pre_weights);
    case 6
        node_grads = ActFunc.softmax_bp(...
            post_grads, post_weights, pre_values, pre_weights);
    otherwise
@@ -192,6 +197,37 @@
    return
end

function [ cur_acts ] = relu_ff(pre_acts, pre_weights)
    % Compute simple rectified linear activation function.
    %
    % Parameters:
    %   pre_acts: previous layer activations (obs_count x pre_dim)
    %   pre_weights: weights from pre -> cur (pre_dim x cur_dim)
    % Outputs:
    %   cur_acts: activations at current layer (obs_count x cur_dim)
    %
    cur_acts = max(0, pre_acts * pre_weights);
    return
end

function [ cur_grads ] = ...
        relu_bp(post_grads, post_weights, pre_acts, pre_weights)
    % Compute the gradient for each node in the current layer given
    % the gradients in post_grads for nodes at the next layer.
    %
    % Parameters:
    %   post_grads: grads at next layer (obs_dim x post_dim)
    %   post_weights: weights from current to post (cur_dim x post_dim)
    %   pre_acts: activations at previous layer (obs_dim x pre_dim)
    %   pre_weights: weights from prev to current (pre_dim x cur_dim)
    % Outputs:
    %   cur_grads: gradients at current layer (obs_dim x cur_dim)
    %
    nz_acts = (pre_acts * pre_weights) > 0;
    cur_grads = (post_grads * post_weights') .* nz_acts;
    return
end

function [ cur_acts ] = softmax_ff(pre_acts, pre_weights)
    % Compute simple softmax activation function where each row in the
    % matrix (pre_acts * pre_weights) is "softmaxed".
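
As a quick sanity check of the new ReLU pair above, assuming ActFunc.m is on the path and its static methods are callable directly, with arbitrary matrix sizes:

% Forward and backward passes through the new rectified linear unit.
obs_count = 4; pre_dim = 3; cur_dim = 5; post_dim = 2;
pre_acts = randn(obs_count, pre_dim);
pre_weights = randn(pre_dim, cur_dim);
post_weights = randn(cur_dim, post_dim);
post_grads = randn(obs_count, post_dim);
cur_acts = ActFunc.relu_ff(pre_acts, pre_weights);
cur_grads = ActFunc.relu_bp(post_grads, post_weights, pre_acts, pre_weights);
% Gradients are zeroed exactly where the forward pass clipped to zero.
assert(all(cur_grads(cur_acts == 0) == 0));

Note that relu_bp recomputes the pre-activation sign rather than caching it, so both calls take the same pre_acts and pre_weights.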
