Higher validation accuracy than training accuracy using TensorFlow and Keras
This happens when you use Dropout, since the behaviour during training and testing is different.
When training, a percentage of the features are set to zero (50% in your case, since you are using Dropout(0.5)). When testing, all features are used (and are scaled appropriately). So the model at test time is more robust, which can lead to higher testing accuracies.
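To make the train/test difference concrete, here is a minimal pure-Python sketch of "inverted" dropout, where the surviving features are scaled up during training and nothing is changed at inference. As far as I know this is the variant the Keras Dropout layer uses, but the function below (dropout, with its rate, training and rng parameters) is a hypothetical illustration, not the Keras implementation.

```python
import random

def dropout(features, rate, training, rng=random):
    """Illustrative inverted dropout on a list of floats (not Keras internals)."""
    if not training:
        # Inference: every feature passes through unchanged.
        return list(features)
    keep_prob = 1.0 - rate
    # Training: drop each feature with probability `rate` and scale the
    # survivors by 1 / keep_prob so the expected sum stays the same.
    return [x / keep_prob if rng.random() < keep_prob else 0.0 for x in features]

features = [1.0, 2.0, 3.0, 4.0]
rng = random.Random(0)  # fixed seed so the sketch is repeatable

train_out = dropout(features, rate=0.5, training=True, rng=rng)
test_out = dropout(features, rate=0.5, training=False)

print("training :", train_out)  # some entries zeroed, the rest doubled
print("inference:", test_out)   # identical to the input
```

The training accuracy reported during fitting is computed on the noisy, partially zeroed activations, while validation accuracy is computed on the full set of features, which is why the two numbers are not directly comparable.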
You can check the Keras FAQ and especially the section "Why is the training loss much higher than the testing loss?".
I would also suggest you take some time to read this very good article on some "sanity checks" you should always take into consideration when building a NN.
In addition, whenever possible, check whether your results make sense. For example, for an n-class classification with categorical cross entropy, the loss at the start of the first epoch should be close to -ln(1/n).
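The check above can be sketched in a few lines. This assumes an untrained model predicts roughly uniform probabilities over the n classes; the helper names (expected_initial_loss, categorical_cross_entropy) are made up for illustration.

```python
import math

def expected_initial_loss(n_classes):
    # Cross entropy of a uniform prediction: -ln(1/n) = ln(n)
    return -math.log(1.0 / n_classes)

def categorical_cross_entropy(probs, true_index):
    # Loss for one sample: negative log of the probability of the true class.
    return -math.log(probs[true_index])

n = 10
uniform = [1.0 / n] * n  # what an untrained model is assumed to output
loss = categorical_cross_entropy(uniform, true_index=3)

print(expected_initial_loss(n))  # about 2.303 for 10 classes
print(loss)                      # same value, whichever class is the true one
```

If the first reported loss is far from this value, that is a hint that something is off with the labels, the output layer, or the loss function.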
Apart from your specific case, I believe that besides Dropout, the dataset split may sometimes result in this situation. Especially if the split is not random (in cases where temporal or spatial patterns exist), the validation set may be fundamentally different from the training set, i.e. less noisy or with less variance, and thus easier to predict, leading to higher accuracy on the validation set than on the training set.
Moreover, if the validation set is very small compared to the training set, the model may fit the validation set better than the training set purely by chance.
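To get a feel for how much a small validation set can fluctuate by chance, here is a rough sketch that treats each prediction as an independent coin flip with a fixed probability of being correct (a simplifying assumption). The function name accuracy_standard_error and the numbers used are hypothetical.

```python
import math

def accuracy_standard_error(accuracy, n_samples):
    # Standard error of a proportion under a binomial model:
    # sqrt(p * (1 - p) / n)
    return math.sqrt(accuracy * (1.0 - accuracy) / n_samples)

# Same true accuracy, very different validation set sizes.
small = accuracy_standard_error(0.8, 100)
large = accuracy_standard_error(0.8, 10_000)

print(f"n=100:    +/- {small:.3f}")
print(f"n=10000:  +/- {large:.3f}")
```

With 100 validation samples the measured accuracy can easily move by several percentage points between random splits, which is enough to put it above the training accuracy without the model being any better on unseen data.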
This indicates high bias: the model is underfitting. Possible solutions to this issue are:
The network is probably struggling to fit the training data, so try a slightly bigger network.
Try a different deep neural network, i.e. change the architecture a bit.
Train for a longer time.
Try using more advanced optimization algorithms.
This is actually a pretty common situation. When there is not much variance in your dataset, you can see behaviour like this. Here you can find an explanation of why this might happen.
There are a number of reasons this can happen. You have not shown any information on the size of the training, validation and test sets. If the validation set is too small, it does not adequately represent the probability distribution of the data. If your training set is small, there is not enough data to adequately train the model. Also, your model is very basic and may not be adequate to cover the complexity of the data. A dropout of 50% is high for such a limited model. Try using an established model like MobileNet version 1. It will be more than adequate for even very complex data relationships. Once that works, you can be confident in the data and build your own model if you wish. The fact is that validation loss and accuracy do not have real meaning until your training accuracy gets reasonably high, say 85%.