Inconsistency between validation accuracy and inference results

After training a DL model with the following training curves:

[training curves screenshot]

the inference results don't match the validation accuracy of 0.73:

| Category | Predicted class | Probability |
| --- | --- | --- |
| bonsai | kangaroo | 0.9701 |
| headphone | kangaroo | 0.9699 |
| starfish | kangaroo | 0.9698 |
| elephant | kangaroo | 0.9723 |
| euphonium | kangaroo | 0.9703 |
| dollar_bill | kangaroo | 0.9624 |
| helicopter | kangaroo | 0.9679 |
| scissors | kangaroo | 0.9669 |
| Motorbikes | kangaroo | 0.9679 |
| gramophone | kangaroo | 0.9706 |
The prediction is "kangaroo" for all 1659 images in the validation set.
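One way to pin down the mismatch is to re-run inference over the whole validation set and tally the predicted classes against the reported accuracy. Below is a minimal sketch assuming a Keras export of the model; `model.h5` and the directory path are hypothetical names, not taken from the actual project:

```python
# Sanity check: tally predicted classes over the validation set and
# recompute accuracy, to compare against the 0.73 reported in training.
# Assumes a Keras model file and a one-subfolder-per-class layout
# (file names and paths are hypothetical).
from collections import Counter

import numpy as np
from tensorflow.keras.applications.inception_v3 import preprocess_input
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.image import ImageDataGenerator

model = load_model("model.h5")  # hypothetical export of the trained model

val_gen = ImageDataGenerator(preprocessing_function=preprocess_input).flow_from_directory(
    "caltech101/validation",   # hypothetical path
    target_size=(299, 299),    # InceptionV3 input size
    shuffle=False,             # keep order so predictions line up with labels
)

probs = model.predict(val_gen)
pred_ids = probs.argmax(axis=1)
id_to_name = {v: k for k, v in val_gen.class_indices.items()}

# Histogram of predicted classes: a healthy model should spread across classes.
print(Counter(id_to_name[i] for i in pred_ids))
# Recomputed accuracy: should be close to the 0.73 reported during training.
print("accuracy:", np.mean(pred_ids == val_gen.classes))
```

If the recomputed accuracy is far below 0.73, the usual suspect is a mismatch between training-time and inference-time preprocessing (input size, scaling), which the training curves alone will not reveal.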

Can you provide the following information:

  1. Model configuration (exported from the Model tab)

  2. Are you using a pretrained model?

  3. Is your dataset balanced between classes? (A quick check is sketched after this list.)

  4. Is your dataset shuffled? If not, did you train with the shuffle option on?
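For question 3, the per-class image counts can be listed directly from the dataset directory. A minimal sketch, assuming the usual one-folder-per-class layout (the root path is hypothetical):

```python
# Count images per class folder to see how balanced the dataset is.
# Assumes one subdirectory per class; the root path is hypothetical.
from pathlib import Path

root = Path("caltech101/train")
counts = {d.name: sum(1 for f in d.iterdir() if f.is_file())
          for d in root.iterdir() if d.is_dir()}
for name, n in sorted(counts.items(), key=lambda kv: kv[1]):
    print(f"{name:20s} {n:5d}")
print("min / max per class:", min(counts.values()), "/", max(counts.values()))
```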

The example is on my DeepCognition account, project EFE_TL2, run20, on your internal server.

The model uses InceptionV3.1. AFAIK the dataset is balanced between classes; it's Caltech 101.

Yes, the dataset is shuffled, and AFAIR I trained with the shuffle option on.
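For reference on question 4, in a typical Keras pipeline "shuffle on" corresponds to the shuffle flag of the training iterator. An illustrative sketch, not an export of the actual run (paths and hyperparameters are assumptions):

```python
# Illustrative only: shuffling in a typical Keras directory-based pipeline.
from tensorflow.keras.applications.inception_v3 import preprocess_input
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_gen = ImageDataGenerator(preprocessing_function=preprocess_input).flow_from_directory(
    "caltech101/train",     # hypothetical path
    target_size=(299, 299),
    batch_size=32,
    shuffle=True,           # reshuffle sample order at every epoch
)
# model.fit(train_gen, epochs=20)  # the generator handles shuffling here
```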