
CS231n Assignment2--Q3

2019-11-06 06:12:28
Source: reposted
Contributed by: site user

Q3: Dropout

Dropout.ipynb

X_val: (1000, 3, 32, 32)
X_train: (49000, 3, 32, 32)
X_test: (1000, 3, 32, 32)
y_val: (1000,)
y_train: (49000,)
y_test: (1000,)

Dropout forward pass

Running tests with p = 0.3
Mean of input: 10.0029862212
Mean of train-time output: 10.0180516238
Mean of test-time output: 10.0029862212
Fraction of train-time output set to zero: 0.699532
Fraction of test-time output set to zero: 0.0

Running tests with p = 0.6
Mean of input: 10.0029862212
Mean of train-time output: 10.0146605666
Mean of test-time output: 10.0029862212
Fraction of train-time output set to zero: 0.399216
Fraction of test-time output set to zero: 0.0

Running tests with p = 0.75
Mean of input: 10.0029862212
Mean of train-time output: 10.0041925077
Mean of test-time output: 10.0029862212
Fraction of train-time output set to zero: 0.249896
Fraction of test-time output set to zero: 0.0
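The numbers above are consistent with inverted dropout in which p is the probability of keeping a unit: with p = 0.3 about 70% of the outputs are zeroed, and the train-time mean stays close to the input mean. The following is a minimal sketch of such a forward pass, not the notebook's actual code. The function signature, the `rng` argument and the test array are assumptions made for illustration.

```python
import numpy as np

def dropout_forward(x, p, mode, rng=None):
    """Inverted dropout sketch; p is the probability of KEEPING a unit
    (an assumption inferred from the printed fractions, not confirmed code)."""
    if mode == "train":
        rng = np.random.default_rng() if rng is None else rng
        # Keep each unit with probability p and rescale the survivors by 1/p,
        # so the expected value of the output matches the input.
        mask = (rng.random(x.shape) < p) / p
        return x * mask, mask
    # Test time: the layer is the identity and no mask is needed.
    return x, None

rng = np.random.default_rng(0)
x = rng.standard_normal((500, 500)) + 10  # hypothetical input with mean near 10

for p in (0.3, 0.6, 0.75):
    out_train, _ = dropout_forward(x, p, "train", rng)
    out_test, _ = dropout_forward(x, p, "test")
    print("p =", p)
    print("  mean of input:", x.mean())
    print("  mean of train-time output:", out_train.mean())
    print("  mean of test-time output:", out_test.mean())
    print("  fraction of train-time output set to zero:", (out_train == 0).mean())
```

Because the rescaling happens at train time, the test-time pass needs no correction, which is why the test-time mean equals the input mean exactly.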

Dropout backward pass

dx relative error: 5.44561222172e-11
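A relative error this small indicates that the analytic gradient agrees with a numerical estimate. The sketch below shows one way such a check could be done for a dropout layer: the mask is held fixed, the backward pass multiplies the upstream gradient by that mask, and the result is compared against centered differences. The helpers `rel_error` and `dropout_backward`, the keep probability and the array sizes are all assumptions for illustration, not the notebook's code.

```python
import numpy as np

def rel_error(a, b):
    """Maximum elementwise relative error between two arrays."""
    return np.max(np.abs(a - b) / np.maximum(1e-8, np.abs(a) + np.abs(b)))

def dropout_backward(dout, mask):
    """Train-mode backward pass: the gradient flows only through kept
    units, with the 1/p scaling already stored in the mask."""
    return dout * mask

rng = np.random.default_rng(0)
p = 0.8                                  # keep probability (assumed convention)
x = rng.standard_normal((10, 10)) + 10
dout = rng.standard_normal(x.shape)      # upstream gradient
mask = (rng.random(x.shape) < p) / p     # fixed mask, reused in every evaluation

def f(v):
    return v * mask                      # forward pass with the mask held fixed

# Centered-difference estimate of d(sum(f(x) * dout)) / dx, one entry at a time.
h = 1e-5
dx_num = np.zeros_like(x)
for idx in np.ndindex(*x.shape):
    old = x[idx]
    x[idx] = old + h
    pos = np.sum(f(x) * dout)
    x[idx] = old - h
    neg = np.sum(f(x) * dout)
    x[idx] = old                         # restore the original value
    dx_num[idx] = (pos - neg) / (2 * h)

dx = dropout_backward(dout, mask)
print("dx relative error:", rel_error(dx, dx_num))
```

Holding the mask fixed matters: if a new random mask were drawn on each forward evaluation, the numerical estimate would compare different functions and the check would be meaningless.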

Fully-connected nets with Dropout

Running check with dropout = 0
Initial loss: 2.31027832193
W1 relative error: 3.70e-06
W2 relative error: 8.95e-06
W3 relative error: 3.00e-08
b1 relative error: 2.10e-08
b2 relative error: 1.83e-09
b3 relative error: 9.60e-11

Running check with dropout = 0.25
Initial loss: 2.2995556198
W1 relative error: 2.61e-07
W2 relative error: 1.89e-09
W3 relative error: 4.52e-09
b1 relative error: 3.71e-10
b2 relative error: 4.50e-10
b3 relative error: 1.34e-10

Running check with dropout = 0.5
Initial loss: 2.30021447314
W1 relative error: 5.59e-07
W2 relative error: 4.28e-08
W3 relative error: 9.85e-08
b1 relative error: 2.54e-09
b2 relative error: 4.08e-09
b3 relative error: 6.62e-11

Regularization experiment

0
(Iteration 1 / 125) loss: 9.163244
(Epoch 0 / 25) train acc: 0.216000; val_acc: 0.192000
(Epoch 1 / 25) train acc: 0.236000; val_acc: 0.146000
(Epoch 2 / 25) train acc: 0.344000; val_acc: 0.209000
(Epoch 3 / 25) train acc: 0.360000; val_acc: 0.234000
(Epoch 4 / 25) train acc: 0.480000; val_acc: 0.248000
(Epoch 5 / 25) train acc: 0.570000; val_acc: 0.256000
(Epoch 6 / 25) train acc: 0.628000; val_acc: 0.281000
(Epoch 7 / 25) train acc: 0.682000; val_acc: 0.271000
(Epoch 8 / 25) train acc: 0.724000; val_acc: 0.267000
(Epoch 9 / 25) train acc: 0.800000; val_acc: 0.267000
(Epoch 10 / 25) train acc: 0.814000; val_acc: 0.273000
(Epoch 11 / 25) train acc: 0.836000; val_acc: 0.274000
(Epoch 12 / 25) train acc: 0.898000; val_acc: 0.296000
(Epoch 13 / 25) train acc: 0.908000; val_acc: 0.274000
(Epoch 14 / 25) train acc: 0.900000; val_acc: 0.280000
(Epoch 15 / 25) train acc: 0.956000; val_acc: 0.286000
(Epoch 16 / 25) train acc: 0.948000; val_acc: 0.264000
(Epoch 17 / 25) train acc: 0.962000; val_acc: 0.283000
(Epoch 18 / 25) train acc: 0.976000; val_acc: 0.287000
(Epoch 19 / 25) train acc: 0.984000; val_acc: 0.288000
(Epoch 20 / 25) train acc: 0.966000; val_acc: 0.272000
(Iteration 101 / 125) loss: 0.219312
(Epoch 21 / 25) train acc: 0.972000; val_acc: 0.298000
(Epoch 22 / 25) train acc: 0.976000; val_acc: 0.289000
(Epoch 23 / 25) train acc: 0.994000; val_acc: 0.283000
(Epoch 24 / 25) train acc: 0.994000; val_acc: 0.289000
(Epoch 25 / 25) train acc: 0.982000; val_acc: 0.287000

0.75
(Iteration 1 / 125) loss: 10.888994
(Epoch 0 / 25) train acc: 0.224000; val_acc: 0.202000
(Epoch 1 / 25) train acc: 0.300000; val_acc: 0.231000
(Epoch 2 / 25) train acc: 0.314000; val_acc: 0.220000
(Epoch 3 / 25) train acc: 0.404000; val_acc: 0.259000
(Epoch 4 / 25) train acc: 0.408000; val_acc: 0.217000
(Epoch 5 / 25) train acc: 0.478000; val_acc: 0.235000
(Epoch 6 / 25) train acc: 0.586000; val_acc: 0.275000
(Epoch 7 / 25) train acc: 0.634000; val_acc: 0.254000
(Epoch 8 / 25) train acc: 0.680000; val_acc: 0.300000
(Epoch 9 / 25) train acc: 0.748000; val_acc: 0.303000
(Epoch 10 / 25) train acc: 0.796000; val_acc: 0.268000
(Epoch 11 / 25) train acc: 0.870000; val_acc: 0.282000
(Epoch 12 / 25) train acc: 0.856000; val_acc: 0.285000
(Epoch 13 / 25) train acc: 0.880000; val_acc: 0.282000
(Epoch 14 / 25) train acc: 0.918000; val_acc: 0.315000
(Epoch 15 / 25) train acc: 0.906000; val_acc: 0.303000
(Epoch 16 / 25) train acc: 0.932000; val_acc: 0.290000
(Epoch 17 / 25) train acc: 0.942000; val_acc: 0.311000
(Epoch 18 / 25) train acc: 0.966000; val_acc: 0.296000
(Epoch 19 / 25) train acc: 0.938000; val_acc: 0.307000
(Epoch 20 / 25) train acc: 0.966000; val_acc: 0.313000
(Iteration 101 / 125) loss: 2.588185
(Epoch 21 / 25) train acc: 0.964000; val_acc: 0.297000
(Epoch 22 / 25) train acc: 0.968000; val_acc: 0.296000
(Epoch 23 / 25) train acc: 0.984000; val_acc: 0.337000
(Epoch 24 / 25) train acc: 0.984000; val_acc: 0.323000
(Epoch 25 / 25) train acc: 0.968000; val_acc: 0.317000


Question

Explain what you see in this experiment. What does it suggest about dropout?

Answer

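With dropout = 0.75 we get a higher validation accuracy (0.317 versus 0.287 at the final epoch), while the training accuracy stays about the same (0.968 versus 0.982). This suggests that dropout acts as a regularizer and reduces overfitting.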
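One way to make this comparison concrete is to compute the gap between training and validation accuracy for each run. The short sketch below uses only the final-epoch (Epoch 25 / 25) values copied from the log above. The dictionary layout and variable names are illustrative assumptions.

```python
# Final-epoch (Epoch 25 / 25) accuracies copied from the training log.
final = {
    0: {"train": 0.982, "val": 0.287},     # run printed under "0"
    0.75: {"train": 0.968, "val": 0.317},  # run printed under "0.75"
}

gaps = {}
for dropout, acc in final.items():
    # A larger train-minus-validation gap means more overfitting.
    gaps[dropout] = round(acc["train"] - acc["val"], 3)
    print(f"dropout = {dropout}: train - val gap = {gaps[dropout]:.3f}")
```

The gap is smaller for the dropout = 0.75 run, although both runs still overfit heavily with training accuracy far above validation accuracy. A single epoch is a noisy measurement, so this is an indication rather than a firm result.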

