Batch Training
Batch training, in which weights and biases are only updated after all of the inputs and targets are presented, can be applied to both static and dynamic networks. We discuss both types of networks in this section.
Batch Training with Static Networks
Batch training can be done using either adapt or train, although train is generally the best option, since it typically has access to more efficient training algorithms. Incremental training can only be done with adapt; train can only perform batch training.
Let's begin with the static network we used in previous examples. The learning rate will be set to 0.1.
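Since the earlier setup is not repeated on this page, here is a minimal sketch of it; the newlin input ranges and the zeroed initial weights and bias are assumptions carried over from that discussion.

    net = newlin([-1 1;-1 1],1,0,0.1);   % assumed two-element linear network, lr = 0.1
    net.IW{1,1} = [0 0];                 % start from zero weights
    net.b{1} = 0;                        % and a zero bias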
For batch training of a static network with adapt, the input vectors must be placed in one matrix of concurrent vectors.
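For example (these particular input and target values are illustrative stand-ins for the data set from the earlier examples):

    P = [1 2 2 3; 2 1 3 1];   % four concurrent two-element input vectors (columns)
    T = [4 5 7 7];            % one target for each input vector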
When we call adapt, it will invoke trains (which is the default adaptation function for the linear network) and learnwh (which is the default learning function for the weights and biases). Therefore, Widrow-Hoff learning is used.
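One batch pass through the illustrative data looks like this:

    [net,a,e,pf] = adapt(net,P,T);
    a =
         0     0     0     0
    e =
         4     5     7     7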
Note that the outputs of the network are all zero, because the weights are not updated until all of the training set has been presented. If we display the weights we find:
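Under the assumed setup, the Widrow-Hoff batch update sums lr*e*p' over all four vectors, so the weight change is 0.1*[49 41] and the bias change is 0.1*23:

    net.IW{1,1}
    ans =
        4.9000    4.1000
    net.b{1}
    ans =
        2.3000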
This is different from the result we had after one pass of adapt with incremental updating.
Now let's perform the same batch training using train. Since the Widrow-Hoff rule can be used in incremental or batch mode, it can be invoked by adapt or train. There are several algorithms that can only be used in batch mode (e.g., Levenberg-Marquardt), and so these algorithms can only be invoked by train.
The network will be set up in the same way.
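Restating the assumed setup from above:

    net = newlin([-1 1;-1 1],1,0,0.1);   % same assumed network and learning rate
    net.IW{1,1} = [0 0];
    net.b{1} = 0;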
For this case, the input vectors can be placed either in a matrix of concurrent vectors or in a cell array of sequential vectors. Within train, any cell array of sequential vectors is converted to a matrix of concurrent vectors. This is because the network is static and because train always operates in batch mode. Concurrent mode operation is generally used whenever possible, because it has a more efficient MATLAB implementation.
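For example, with the illustrative data either format would do:

    P = [1 2 2 3; 2 1 3 1];            % concurrent vectors, or equivalently
    % P = {[1;2] [2;1] [2;3] [3;1]};   % a sequence, which train converts internally
    T = [4 5 7 7];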
Now we are ready to train the network. We will train it for only one epoch, since we used only one pass of adapt. The default training function for the linear network is trainc, and the default learning function for the weights and biases is learnwh, so we should get the same results that we obtained using adapt in the previous example, where the default adaptation function was trains.
    net.inputWeights{1,1}.learnParam.lr = 0.1;   % learning rate for the input weights
    net.biases{1}.learnParam.lr = 0.1;           % learning rate for the bias
    net.trainParam.epochs = 1;                   % a single pass through the batch
    net = train(net,P,T);
If we display the weights after one epoch of training we find:
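Under the assumed setup, train arrives at the same single batch update as adapt did:

    net.IW{1,1}
    ans =
        4.9000    4.1000
    net.b{1}
    ans =
        2.3000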
This is the same result we had with batch mode training in adapt. With static networks, the adapt function can implement incremental or batch training, depending on the format of the input data. If the data is presented as a matrix of concurrent vectors, batch training will occur. If the data is presented as a sequence, incremental training will occur. This is not true for train, which always performs batch training, regardless of the format of the input.
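To make the contrast concrete, here is a sketch of the two adapt calls (data again illustrative):

    Pseq = {[1;2] [2;1] [2;3] [3;1]};        % sequence: weights updated after each vector
    [net,a,e] = adapt(net,Pseq,{4 5 7 7});

    Pcon = [1 2 2 3; 2 1 3 1];               % concurrent matrix: one batch update
    [net,a,e] = adapt(net,Pcon,[4 5 7 7]);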
Batch Training with Dynamic Networks
Training static networks is relatively straightforward. If we use train, the network is trained in batch mode and the inputs are converted to concurrent vectors (columns of a matrix), even if they are originally passed as a sequence (elements of a cell array). If we use adapt, the format of the input determines the method of training. If the inputs are passed as a sequence, the network is trained in incremental mode. If the inputs are passed as concurrent vectors, batch mode training is used.
With dynamic networks, batch mode training is typically done with train only, especially if there is only one training sequence. To illustrate this, let's consider again the linear network with a delay. We use a learning rate of 0.02 for the training. (When using a gradient descent algorithm, we typically use a smaller learning rate for batch mode training than for incremental training, because all of the individual gradients are summed together before determining the step change to the weights.)
    net = newlin([-1 1],1,[0 1],0.02);   % linear network with a [0 1] tap delay line
    net.IW{1,1} = [0 0];                 % start from zero weights
    net.biasConnect = 0;                 % remove the bias
    net.trainParam.epochs = 1;           % a single epoch
    Pi = {1};                            % initial condition for the input delay
    P = {2 3 4};                         % input sequence
    T = {3 5 6};                         % target sequence
We want to train the network with the same sequence we used for the incremental training earlier, but this time we want to update the weights only after all of the inputs are applied (batch mode). The network is simulated in sequential mode because the input is a sequence, but the weights are updated in batch mode.
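The call itself, given the setup above, passes the initial delay condition Pi along with the sequence:

    net = train(net,P,T,Pi);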
The weights after one epoch of training are
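With the sequence above, the single update sums lr*e(t)*p(t) over the three time steps (errors 3, 5, 6 against current inputs 2, 3, 4 and delayed inputs 1, 2, 3), giving a weight change of 0.02*[45 31]:

    net.IW{1,1}
    ans =
        0.9000    0.6200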
These are different weights than we would obtain using incremental training, where the weights would be updated three times during one pass through the training set. For batch training the weights are only updated once in each epoch.