X_val: (1000, 3, 32, 32)
X_train: (49000, 3, 32, 32)
X_test: (1000, 3, 32, 32)
y_val: (1000,)
y_train: (49000,)
y_test: (1000,)
Running tests with p = 0.3
Mean of input: 10.0029862212
Mean of train-time output: 10.0180516238
Mean of test-time output: 10.0029862212
Fraction of train-time output set to zero: 0.699532
Fraction of test-time output set to zero: 0.0

Running tests with p = 0.6
Mean of input: 10.0029862212
Mean of train-time output: 10.0146605666
Mean of test-time output: 10.0029862212
Fraction of train-time output set to zero: 0.399216
Fraction of test-time output set to zero: 0.0

Running tests with p = 0.75
Mean of input: 10.0029862212
Mean of train-time output: 10.0041925077
Mean of test-time output: 10.0029862212
Fraction of train-time output set to zero: 0.249896
Fraction of test-time output set to zero: 0.0
dx relative error: 5.44561222172e-11
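The statistics above (train-time mean close to the input mean, a fraction of roughly 1 - p of the outputs zeroed, test-time output identical to the input) are consistent with inverted dropout in which p is the probability of keeping a unit. The following is a minimal sketch under that assumption; the function names, the `rng` argument, and the test input are illustrative and are not taken from the original notebook.

```python
import numpy as np

def dropout_forward(x, p, mode, rng=None):
    """Inverted dropout sketch. ASSUMPTION: p is the probability of KEEPING a unit."""
    if mode == "train":
        rng = np.random.default_rng() if rng is None else rng
        # Keep each unit with probability p and rescale by 1/p so the
        # expected value of the output matches the input.
        mask = (rng.random(x.shape) < p) / p
        return x * mask, mask
    # Test time: identity, no rescaling needed with inverted dropout.
    return x, None

def dropout_backward(dout, mask):
    """Gradient flows only through kept units, scaled by the same 1/p factor."""
    return dout if mask is None else dout * mask

rng = np.random.default_rng(0)
x = rng.standard_normal((500, 500)) + 10  # hypothetical input with mean near 10
for p in (0.3, 0.6, 0.75):
    out, mask = dropout_forward(x, p, "train", rng)
    out_test, _ = dropout_forward(x, p, "test")
    print("p =", p)
    print("  mean of input:", x.mean())
    print("  mean of train-time output:", out.mean())
    print("  mean of test-time output:", out_test.mean())
    print("  fraction of train-time output set to zero:", (out == 0).mean())
```

Rescaling at train time rather than test time keeps the test-time forward pass a plain identity, which matches the "Fraction of test-time output set to zero: 0.0" lines above.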
Running check with dropout = 0
Initial loss: 2.31027832193
W1 relative error: 3.70e-06
W2 relative error: 8.95e-06
W3 relative error: 3.00e-08
b1 relative error: 2.10e-08
b2 relative error: 1.83e-09
b3 relative error: 9.60e-11

Running check with dropout = 0.25
Initial loss: 2.2995556198
W1 relative error: 2.61e-07
W2 relative error: 1.89e-09
W3 relative error: 4.52e-09
b1 relative error: 3.71e-10
b2 relative error: 4.50e-10
b3 relative error: 1.34e-10

Running check with dropout = 0.5
Initial loss: 2.30021447314
W1 relative error: 5.59e-07
W2 relative error: 4.28e-08
W3 relative error: 9.85e-08
b1 relative error: 2.54e-09
b2 relative error: 4.08e-09
b3 relative error: 6.62e-11
0
(Iteration 1 / 125) loss: 9.163244
(Epoch 0 / 25) train acc: 0.216000; val_acc: 0.192000
(Epoch 1 / 25) train acc: 0.236000; val_acc: 0.146000
(Epoch 2 / 25) train acc: 0.344000; val_acc: 0.209000
(Epoch 3 / 25) train acc: 0.360000; val_acc: 0.234000
(Epoch 4 / 25) train acc: 0.480000; val_acc: 0.248000
(Epoch 5 / 25) train acc: 0.570000; val_acc: 0.256000
(Epoch 6 / 25) train acc: 0.628000; val_acc: 0.281000
(Epoch 7 / 25) train acc: 0.682000; val_acc: 0.271000
(Epoch 8 / 25) train acc: 0.724000; val_acc: 0.267000
(Epoch 9 / 25) train acc: 0.800000; val_acc: 0.267000
(Epoch 10 / 25) train acc: 0.814000; val_acc: 0.273000
(Epoch 11 / 25) train acc: 0.836000; val_acc: 0.274000
(Epoch 12 / 25) train acc: 0.898000; val_acc: 0.296000
(Epoch 13 / 25) train acc: 0.908000; val_acc: 0.274000
(Epoch 14 / 25) train acc: 0.900000; val_acc: 0.280000
(Epoch 15 / 25) train acc: 0.956000; val_acc: 0.286000
(Epoch 16 / 25) train acc: 0.948000; val_acc: 0.264000
(Epoch 17 / 25) train acc: 0.962000; val_acc: 0.283000
(Epoch 18 / 25) train acc: 0.976000; val_acc: 0.287000
(Epoch 19 / 25) train acc: 0.984000; val_acc: 0.288000
(Epoch 20 / 25) train acc: 0.966000; val_acc: 0.272000
(Iteration 101 / 125) loss: 0.219312
(Epoch 21 / 25) train acc: 0.972000; val_acc: 0.298000
(Epoch 22 / 25) train acc: 0.976000; val_acc: 0.289000
(Epoch 23 / 25) train acc: 0.994000; val_acc: 0.283000
(Epoch 24 / 25) train acc: 0.994000; val_acc: 0.289000
(Epoch 25 / 25) train acc: 0.982000; val_acc: 0.287000

0.75
(Iteration 1 / 125) loss: 10.888994
(Epoch 0 / 25) train acc: 0.224000; val_acc: 0.202000
(Epoch 1 / 25) train acc: 0.300000; val_acc: 0.231000
(Epoch 2 / 25) train acc: 0.314000; val_acc: 0.220000
(Epoch 3 / 25) train acc: 0.404000; val_acc: 0.259000
(Epoch 4 / 25) train acc: 0.408000; val_acc: 0.217000
(Epoch 5 / 25) train acc: 0.478000; val_acc: 0.235000
(Epoch 6 / 25) train acc: 0.586000; val_acc: 0.275000
(Epoch 7 / 25) train acc: 0.634000; val_acc: 0.254000
(Epoch 8 / 25) train acc: 0.680000; val_acc: 0.300000
(Epoch 9 / 25) train acc: 0.748000; val_acc: 0.303000
(Epoch 10 / 25) train acc: 0.796000; val_acc: 0.268000
(Epoch 11 / 25) train acc: 0.870000; val_acc: 0.282000
(Epoch 12 / 25) train acc: 0.856000; val_acc: 0.285000
(Epoch 13 / 25) train acc: 0.880000; val_acc: 0.282000
(Epoch 14 / 25) train acc: 0.918000; val_acc: 0.315000
(Epoch 15 / 25) train acc: 0.906000; val_acc: 0.303000
(Epoch 16 / 25) train acc: 0.932000; val_acc: 0.290000
(Epoch 17 / 25) train acc: 0.942000; val_acc: 0.311000
(Epoch 18 / 25) train acc: 0.966000; val_acc: 0.296000
(Epoch 19 / 25) train acc: 0.938000; val_acc: 0.307000
(Epoch 20 / 25) train acc: 0.966000; val_acc: 0.313000
(Iteration 101 / 125) loss: 2.588185
(Epoch 21 / 25) train acc: 0.964000; val_acc: 0.297000
(Epoch 22 / 25) train acc: 0.968000; val_acc: 0.296000
(Epoch 23 / 25) train acc: 0.984000; val_acc: 0.337000
(Epoch 24 / 25) train acc: 0.984000; val_acc: 0.323000
(Epoch 25 / 25) train acc: 0.968000; val_acc: 0.317000
Explain what you see in this experiment. What does it suggest about dropout?
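Both networks come close to fitting the training set (train accuracy of about 0.97 to 0.99 in the last epochs), but the network trained with dropout = 0.75 reaches a higher validation accuracy: 0.317 versus 0.287 at the final epoch, and 0.337 versus 0.298 at the best epoch. This suggests that dropout acts as a regularizer and reduces overfitting.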
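One way to make the comparison concrete is to compute the gap between training and validation accuracy for each run. The sketch below uses only the final-epoch (Epoch 25 / 25) numbers from the log above; the dictionary layout and the `gap` helper are illustrative, not part of the original notebook.

```python
# Final-epoch accuracies copied from the training log above,
# keyed by the dropout setting printed before each run.
final_acc = {
    0.0: {"train": 0.982, "val": 0.287},
    0.75: {"train": 0.968, "val": 0.317},
}

def gap(acc):
    """Train/validation accuracy gap; a larger gap indicates more overfitting."""
    return acc["train"] - acc["val"]

for dropout, acc in final_acc.items():
    print(f"dropout = {dropout}: val acc = {acc['val']:.3f}, gap = {gap(acc):.3f}")
```

The run with dropout = 0.75 shows both the higher validation accuracy and the smaller gap, although the difference comes from a single run and is modest.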