Why is my custom neural network implementation stuck at 0.25 cost?
I made a custom neural network library that gets stuck at 0.25 loss. The network is a 2-2-1 network, meaning it has two inputs, one hidden layer of size two, and one output. I'm training it on an XOR (exclusive or) dataset.
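For reference, the dataset is just the four XOR rows, something like this (the variable name here is only illustrative):

    var trainingData = [
        { input: [0, 0], expectedOut: [0] },
        { input: [0, 1], expectedOut: [1] },
        { input: [1, 0], expectedOut: [1] },
        { input: [1, 1], expectedOut: [0] }
    ];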
WHAT I KNOW:
This implementation works when the network has only one weight layer.
This leads me to believe the error is in the part of my code that computes the derivative of the cost function with respect to the activations of earlier layers, because with only two layers you don't need that step, and if you do compute it, it has no effect.
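As I understand it, the quantity that step should compute is the chain rule for a hidden activation. With the cost over the next layer being \(C = \sum_h (a_h - y_h)^2\), the sensitivity of \(C\) to a hidden activation \(a_j\) feeding into it is

\[
\frac{\partial C}{\partial a_j} = \sum_h \frac{\partial C}{\partial a_h}\,\sigma'(z_h)\,w_{jh} = \sum_h 2\,(a_h - y_h)\,\sigma'(z_h)\,w_{jh},
\]

where \(\sigma'\) is actFuncPrime and \(w_{jh}\) is the weight connecting neuron \(j\) to neuron \(h\). Each term of that sum is what the cost accumulation below is trying to compute (if I have the math right).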
The pseudocode for back-prop of a single training example:
forward(inputs)
for each neuron in last layer:
    neuron.cost = neuron.activation - neuron.expected_output
    # this is getting the cost for each neuron
for each layer except the last one (i):
    for each neuron in layer (j):
        for each neuron in layerNext (h):  # layerNext = layer + 1
            # this is where I calc the deriv of each weight
            layer.weight[(the one connecting j and h)].deriv +=
                layer.activations[j] *
                (layerNext.actFuncPrime(layerNext.z[h]) * 2 * layerNext.costs[h]);
            # "z" is the weighted sum before the activation function is applied
            layer.neurons[j].cost +=
                layer.weights[(the one connecting j and h)] *
                (layerNext.actFuncPrime(layerNext.z[h]) * 2 * layerNext.costs[h]);
# I also calculate the bias sensitivities, but I don't think they need to be shown here.
The real JavaScript code:
this.forward(input);
for (var i = 0; i < this.layers[this.layers.length - 1].size; i++) {
    let layer = this.layers[this.layers.length - 1];
    layer.costs[i] = layer.activations[i] - expectedOut[i]; // we square this at the end
}
for (i = this.layers.length - 2; i >= 0; i--) {
    let layer = this.layers[i];
    let layerNext = this.layers[i + 1];
    for (var j = 0; j < layer.size; j++) {
        for (var h = 0; h < layerNext.size; h++) {
            // accumulate the weight gradient ("ws" = weight sensitivities)
            layer.ws[h + j * layerNext.size] +=
                layer.activations[j] *
                (layerNext.actFuncPrime(layerNext.z[h]) * 2 * layerNext.costs[h]);
            // propagate the cost back to this neuron's activation
            layer.costs[j] +=
                layer.w[h + j * layerNext.size] *
                (layerNext.actFuncPrime(layerNext.z[h]) * 2 * layerNext.costs[h]); // error here maybe?
        }
    }
    for (h = 0; h < layerNext.size; h++) {
        // bias gradient ("bs" = bias sensitivities)
        layerNext.bs[h] +=
            layerNext.actFuncPrime(layerNext.z[h]) * 2 * layerNext.costs[h];
    }
}
To update the weights and biases, I add the negative of each accumulated gradient times the learning rate (0.001) to the corresponding weight or bias.
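Roughly, the update step looks like this (a sketch; I'm assuming the weights live in layer.w, the biases in layerNext.b, and that the accumulated ws/bs sensitivities are cleared after each step):

    for (var i = 0; i < this.layers.length - 1; i++) {
        let layer = this.layers[i];
        let layerNext = this.layers[i + 1];
        for (var k = 0; k < layer.w.length; k++) {
            layer.w[k] -= 0.001 * layer.ws[k]; // step against the gradient
            layer.ws[k] = 0;                   // clear for the next pass (assumed)
        }
        for (var h = 0; h < layerNext.size; h++) {
            layerNext.b[h] -= 0.001 * layerNext.bs[h];
            layerNext.bs[h] = 0;
        }
    }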
After training:
It only outputs ~0.5 no matter what I input, giving a loss of about 0.25 (which makes sense: with targets of 0 and 1, a constant output of 0.5 has a squared error of 0.25 on every example).
Solution 1:
Guys, I caught COVID and it juiced my brain, and I fixed the bug. I was forgetting to reset the cost of each neuron before each backward pass, so the costs kept accumulating across training examples. I hope dearly this helps someone else. The fix is to set layer.costs[j] = 0; at the top of the j loop, just before the inner loop that accumulates into layer.costs[j] += ...;
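For concreteness, here is the corrected middle section of the backward pass from the question, with the reset added (the output layer is unaffected, since its costs are assigned with = rather than accumulated):

    for (var j = 0; j < layer.size; j++) {
        layer.costs[j] = 0; // reset, otherwise costs carry over between examples
        for (var h = 0; h < layerNext.size; h++) {
            layer.ws[h + j * layerNext.size] +=
                layer.activations[j] *
                (layerNext.actFuncPrime(layerNext.z[h]) * 2 * layerNext.costs[h]);
            layer.costs[j] +=
                layer.w[h + j * layerNext.size] *
                (layerNext.actFuncPrime(layerNext.z[h]) * 2 * layerNext.costs[h]);
        }
    }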