Neural Network Toolbox
Training an Elman Network
Elman networks can be trained with either of two functions, train or adapt.
When the function train is used to train an Elman network, the training function traingdx is recommended.
When the function adapt is used to train an Elman network, the learning function learngdm is recommended.
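A rough sketch of the adapt route, for illustration only. The input range, layer sizes, transfer functions, made-up data, number of passes, and the variable names are assumptions rather than values taken from this page. The sketch also assumes that newelm accepts the training function and the learning function as its fourth and fifth arguments.

```matlab
% Illustrative data: a random binary sequence and a target that is 1
% whenever the current and the previous input are both 1.
P = round(rand(1,8));
T = [0 (P(1:end-1) + P(2:end) == 2)];

% Convert the row vectors to cell-array sequences (one cell per time step).
Pseq = con2seq(P);
Tseq = con2seq(T);

% Assumed architecture: inputs in [0 1], 5 hidden neurons, 1 output neuron.
% 'traingdx' is the training function, 'learngdm' the learning function.
net = newelm([0 1],[5 1],{'tansig','logsig'},'traingdx','learngdm');

% Make several passes through the sequence (assumed value).
net.adaptParam.passes = 10;

% Y holds the network outputs and E the errors, one cell per time step.
[net,Y,E] = adapt(net,Pseq,Tseq);
```

Because adapt is given the data as a sequence, the outputs and errors come back as cell arrays with one entry per time step.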
Elman networks are not as reliable as some other kinds of networks because both training and adaptation use an approximation of the error gradient.
For an Elman network to have the best chance of learning a problem, it needs more hidden neurons than another method would actually require for a solution. Although a solution may exist with fewer neurons, the Elman network is less able to find the most appropriate weights for its hidden neurons because the error gradient is approximated. Starting with a fair number of neurons therefore makes it more likely that the hidden neurons will begin by dividing up the input space in useful ways.
The function train trains an Elman network to generate a sequence of target vectors when it is presented with a given sequence of input vectors. The input vectors and target vectors are passed to train as matrices P and T. The function train takes these vectors and the initial weights and biases of the network, trains the network using backpropagation with momentum and an adaptive learning rate, and returns new weights and biases.
Let us continue with the example of the previous section, and suppose that we want to train a network with an input P and targets T. Here T is defined to be 0, except when two 1's occur in succession in P, in which case T is 1.
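One way to build such a pair of sequences is sketched below. The sequence length of 8 and the use of random data are assumptions made for illustration. The first target is set to 0 because the first input has no predecessor.

```matlab
% Random binary input sequence of 8 time steps (length is an assumption).
P = round(rand(1,8));

% Target is 1 at step k when P(k-1) and P(k) are both 1, otherwise 0.
% The first step has no previous input, so its target is fixed at 0.
T = [0 (P(1:end-1) + P(2:end) == 2)];
```

For instance, an input of 0 1 1 0 1 1 1 0 would give targets of 0 0 1 0 0 1 1 0.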
As noted previously, our network has five hidden neurons in the first layer.
We use trainbfg as the training function and train for 100 epochs. After training, we simulate the network with the input P and calculate the difference between the target output and the simulated network output.
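These steps might look roughly as follows. This is a sketch under stated assumptions, not the original listing: the input range [0 1], the tansig and logsig transfer functions, the single output neuron, the sequence length, and the names Pseq, Tseq, z and diff1 are all illustrative. It also assumes that the training function is passed as the fourth argument of newelm. The data is converted with con2seq so that the columns are treated as successive time steps rather than as independent samples.

```matlab
% Input and target sequences as described in the text (length assumed).
P = round(rand(1,8));
T = [0 (P(1:end-1) + P(2:end) == 2)];

% Elman network with 5 hidden neurons and 1 output neuron, trained
% with trainbfg (range and transfer functions are assumptions).
net = newelm([0 1],[5 1],{'tansig','logsig'},'trainbfg');

% Convert the row vectors to cell-array sequences (one cell per time step).
Pseq = con2seq(P);
Tseq = con2seq(T);

% Train for 100 epochs.
net.trainParam.epochs = 100;
net = train(net,Pseq,Tseq);

% Simulate the trained network on the same input sequence.
Y = sim(net,Pseq);

% Convert the output sequence back to a matrix and compare with the targets.
z = seq2con(Y);
diff1 = T - z{1,1};
```

Because the data and the initial weights are random, the values in diff1 will vary from run to run.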
Note that the difference between the target and the simulated output of the trained network is very small. Thus, the network is trained to produce the desired output sequence on presentation of the input vector P.
See Chapter 11 for an application of the Elman network to the detection of wave amplitudes.
© 1994-2005 The MathWorks, Inc.