traingdm (Neural Network Toolbox)

traingdm

Gradient descent with momentum backpropagation

Syntax

[net,TR,Ac,El] = traingdm(net,Pd,Tl,Ai,Q,TS,VV,TV)

info = traingdm(code)

Description

traingdm is a network training function that updates weight and bias values according to gradient descent with momentum.

traingdm(net,Pd,Tl,Ai,Q,TS,VV,TV) takes these inputs,

net -- Neural network

Pd -- Delayed input vectors

Tl -- Layer target vectors

Ai -- Initial input delay conditions

Q -- Batch size

TS -- Time steps

VV -- Either empty matrix [] or structure of validation vectors

TV -- Empty matrix [] or structure of test vectors

and returns,

net -- Trained network

TR -- Training record of various values over each epoch:

  TR.epoch -- Epoch number
  TR.perf -- Training performance
  TR.vperf -- Validation performance
  TR.tperf -- Test performance

Ac -- Collective layer outputs for last epoch

El -- Layer errors for last epoch

Training occurs according to traingdm's training parameters, shown here with their default values:

net.trainParam.epochs 10 Maximum number of epochs to train

net.trainParam.goal 0 Performance goal

net.trainParam.lr 0.01 Learning rate

net.trainParam.max_fail 5 Maximum validation failures

net.trainParam.mc 0.9 Momentum constant

net.trainParam.min_grad 1e-10 Minimum performance gradient

net.trainParam.show 25 Epochs between showing progress

net.trainParam.time inf Maximum time to train in seconds

Dimensions for these variables are

Pd -- Nl x Ni x TS cell array, each element Pd{i,j,ts} is a Dij x Q matrix.

Tl -- Nl x TS cell array, each element Tl{i,ts} is a Vi x Q matrix.

Ai -- Nl x LD cell array, each element Ai{i,k} is an Si x Q matrix.

where

Ni = net.numInputs

Nl = net.numLayers

LD = net.numLayerDelays

Ri = net.inputs{i}.size

Si = net.layers{i}.size

Vi = net.targets{i}.size

Dij = Rj * length(net.inputWeights{i,j}.delays)

If VV or TV is not [], it must be a structure of validation or test vectors with these fields:

VV.PD, TV.PD -- Validation/test delayed inputs

VV.Tl, TV.Tl -- Validation/test layer targets

VV.Ai, TV.Ai -- Validation/test initial input conditions

VV.Q, TV.Q -- Validation/test batch size

VV.TS, TV.TS -- Validation/test time steps

Validation vectors are used to stop training early if the network performance on the validation vectors fails to improve or remains the same for max_fail epochs in a row. Test vectors are used as a further check that the network is generalizing well, but do not have any effect on training.

traingdm(code) returns useful information for each code string:

'pnames' -- Names of training parameters

'pdefaults' -- Default training parameters
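As a brief illustration of the two code strings listed above, the calls might look like the following. This is a minimal sketch that assumes the toolbox is on the path; the variable names are hypothetical, and the exact return types are not stated in this page.

```matlab
% Query traingdm for information instead of training a network.
names    = traingdm('pnames');     % names of the training parameters
defaults = traingdm('pdefaults');  % default training parameters

% Display what came back.
disp(names)
disp(defaults)
```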

Network Use

You can create a standard network that uses traingdm with newff, newcf, or newelm.

To prepare a custom network to be trained with traingdm:

1. Set net.trainFcn to 'traingdm'. This sets net.trainParam to traingdm's default parameters.
2. Set net.trainParam properties to desired values.

In either case, calling train with the resulting network will train the network with traingdm.

See newff, newcf, and newelm for examples.
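The workflow described in this section can be sketched as follows. The data in p and t and the parameter values are illustrative assumptions, not recommendations, and the newff call uses the legacy calling form (input ranges, layer sizes, transfer functions, training function), which may differ in other toolbox versions.

```matlab
% Hypothetical training data: 2 inputs, 4 samples, 1 target per sample.
p = [-1 -1 2 2; 0 5 0 5];
t = [-1 -1 1 1];

% Create a two-layer feed-forward network that trains with traingdm.
net = newff(minmax(p), [3 1], {'tansig','purelin'}, 'traingdm');

% For a custom network, assign the training function instead; this
% resets net.trainParam to traingdm's defaults:
% net.trainFcn = 'traingdm';

% Override selected defaults (illustrative values).
net.trainParam.epochs = 300;
net.trainParam.lr     = 0.05;
net.trainParam.mc     = 0.9;
net.trainParam.goal   = 1e-5;
net.trainParam.show   = 50;

% Train, then simulate the trained network on the training inputs.
[net, tr] = train(net, p, t);
a = sim(net, p);
```

Note that train, not traingdm, is called directly; train dispatches to the function named in net.trainFcn.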

Algorithm

traingdm can train any network as long as its weight, net input, and transfer functions have derivative functions.

Backpropagation is used to calculate derivatives of performance perf with respect to the weight and bias variables X. Each variable is adjusted according to gradient descent with momentum,

```
dX = mc*dXprev + lr*(1-mc)*dperf/dX
```

where dXprev is the previous change to the weight or bias.
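To make the update rule concrete, here is a minimal sketch that applies it to a simple quadratic performance function in plain MATLAB, without the toolbox. The performance function, the starting point, and the variable names target, x, and gX are assumptions for illustration. The sketch also assumes that the gradient term is taken as the descent direction (the negative gradient of perf), so that adding dX reduces perf.

```matlab
% Gradient descent with momentum on perf(x) = sum((x - target).^2).
target = [1; -2];          % hypothetical minimizer
x      = [0; 0];           % hypothetical starting weights
lr     = 0.01;             % learning rate (traingdm default)
mc     = 0.9;              % momentum constant (traingdm default)
dX     = zeros(size(x));   % previous change, zero before the first step

for epoch = 1:2000
    % Assumption: gX is the NEGATIVE gradient of perf (descent direction).
    gX = -2 * (x - target);

    % Blend the previous change with the new gradient step.
    dX = mc * dX + lr * (1 - mc) * gX;

    % Apply the change to the weights.
    x = x + dX;
end

perf = sum((x - target).^2);   % final performance
```

Because the gradient term is scaled by (1-mc), raising mc toward 1 both increases the influence of the previous change and shrinks the contribution of the current gradient.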

Training stops when any of these conditions occurs:

The maximum number of epochs (repetitions) is reached.
The maximum amount of time has been exceeded.
Performance has been minimized to the goal.
The performance gradient falls below min_grad.
Validation performance has increased more than max_fail times since the last time it decreased (when using validation).
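The stopping conditions above can be sketched as a loop over per-epoch measurements. This is not the toolbox's implementation: the history vectors are made-up values, the order of the checks is an assumption, and the validation test uses >= max_fail, which is one possible reading of the rule.

```matlab
% Hypothetical per-epoch values standing in for a real training run.
perf_hist  = [0.90 0.50 0.30 0.20 0.15 0.12 0.10 0.09];  % training performance
vperf_hist = [0.95 0.60 0.40 0.41 0.42 0.43 0.44 0.45];  % validation performance
grad_hist  = [1.00 0.80 0.50 0.30 0.20 0.10 0.10 0.10];  % gradient magnitude

% Names mirror net.trainParam; values are chosen for the demonstration.
epochs = 10; goal = 0; min_grad = 1e-10; max_fail = 3; max_time = inf;

best_vperf = inf;   % best validation performance seen so far
fails      = 0;     % epochs since validation performance last improved
stop       = '';    % reason training stopped
stop_epoch = 0;
t0 = tic;

for epoch = 1:numel(perf_hist)
    % An epoch without validation improvement counts as a failure.
    if vperf_hist(epoch) < best_vperf
        best_vperf = vperf_hist(epoch);
        fails = 0;
    else
        fails = fails + 1;
    end

    if epoch >= epochs
        stop = 'max epochs';
    elseif toc(t0) > max_time
        stop = 'max time';
    elseif perf_hist(epoch) <= goal
        stop = 'goal met';
    elseif grad_hist(epoch) < min_grad
        stop = 'min gradient';
    elseif fails >= max_fail   % assumption: >= rather than >
        stop = 'validation';
    end

    if ~isempty(stop)
        stop_epoch = epoch;
        break
    end
end
```

With these made-up values, validation performance stops improving after epoch 3, so the loop ends at epoch 6 on the validation check.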

See Also

newff, newcf, traingd, traingda, traingdx, trainlm
