X_val: (1000, 3, 32, 32)
X_train: (49000, 3, 32, 32)
X_test: (1000, 3, 32, 32)
y_val: (1000,)
y_train: (49000,)
y_test: (1000,)
Before batch normalization:
  means: [ 14.07577236 -27.92959657 -33.87722636]
  stds:  [ 32.45504012 26.91054894 32.5339723 ]
After batch normalization (gamma=1, beta=0)
  mean: [ 5.84254867e-17 2.32036612e-16 -2.39808173e-16]
  std:  [ 1. 0.99999999 1. ]
After batch normalization (nontrivial gamma, beta)
  means: [ 11. 12. 13.]
  stds:  [ 1. 1.99999999 2.99999999]
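The training-time forward pass being checked above can be sketched as follows. This is a minimal NumPy sketch, not the assignment's actual `batchnorm_forward` API: the function name is illustrative, and a full implementation would also cache intermediates for the backward pass and update running statistics.

```python
import numpy as np

def batchnorm_forward_train(x, gamma, beta, eps=1e-5):
    """Minimal training-time batch normalization forward pass (sketch).

    x: (N, D) minibatch; gamma, beta: (D,) learnable scale and shift.
    """
    mu = x.mean(axis=0)                     # per-feature minibatch mean
    var = x.var(axis=0)                     # per-feature minibatch variance
    x_hat = (x - mu) / np.sqrt(var + eps)   # zero mean, unit variance
    return gamma * x_hat + beta             # learnable scale and shift

# With gamma=1, beta=0 the output has per-feature mean ~0 and std ~1,
# matching the check above; with gamma=[1,2,3], beta=[11,12,13] the
# output means/stds become beta/gamma.
np.random.seed(0)
x = 10 * np.random.randn(200, 3) + 5
out = batchnorm_forward_train(x, np.ones(3), np.zeros(3))
```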
After batch normalization (test-time):
  means: [ 0.02347014 -0.04351107 -0.05799624]
  stds:  [ 1.04889033 1.01943852 1.02120248]
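At test time the layer normalizes with running statistics accumulated during training, which is why the means/stds above are only approximately 0/1. A minimal sketch (function name and the momentum value are assumptions, not the assignment's exact API):

```python
import numpy as np

def batchnorm_forward_test(x, gamma, beta, running_mean, running_var, eps=1e-5):
    # Normalize with running statistics instead of minibatch statistics,
    # so the output is deterministic and batch-independent.
    x_hat = (x - running_mean) / np.sqrt(running_var + eps)
    return gamma * x_hat + beta

# During training, the running statistics are typically maintained as an
# exponential moving average (momentum ~0.9 assumed here):
#   running_mean = momentum * running_mean + (1 - momentum) * batch_mean
#   running_var  = momentum * running_var  + (1 - momentum) * batch_var
```

Because the running averages only approximate the true data statistics, test-time outputs hover near mean 0 / std 1 rather than hitting them exactly, as in the output above.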
dx error:  1.49671323478e-09
dgamma error:  3.86167816656e-11
dbeta error:  6.4892893412e-12
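The errors above come from comparing the analytic backward pass against a centered-difference numerical gradient. A sketch of the machinery such checks typically use (names are illustrative; the assignment's own helpers may differ in detail):

```python
import numpy as np

def rel_error(x, y):
    """Maximum relative error between two arrays, floored to avoid 0/0."""
    return np.max(np.abs(x - y) / (np.maximum(1e-8, np.abs(x) + np.abs(y))))

def eval_numerical_gradient_array(f, x, df, h=1e-5):
    """Centered-difference numerical gradient of f at x, contracted with df.

    f maps an array to an array; df is the upstream gradient (same shape
    as f(x)); returns an array the same shape as x.
    """
    grad = np.zeros_like(x)
    it = np.nditer(x, flags=['multi_index'])
    while not it.finished:
        ix = it.multi_index
        old = x[ix]
        x[ix] = old + h
        pos = f(x).copy()       # f evaluated at x + h in one coordinate
        x[ix] = old - h
        neg = f(x).copy()       # f evaluated at x - h in one coordinate
        x[ix] = old             # restore
        grad[ix] = np.sum((pos - neg) * df) / (2 * h)
        it.iternext()
    return grad
```

Relative errors around 1e-9 or smaller, as printed above, indicate the analytic gradient matches the numerical one.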
dx difference:  1.62010020328e-12
dgamma difference:  3.6199883842e-14
dbeta difference:  0.0
speedup: 0.94x
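The alternative backward pass being compared here collapses the staged computation-graph backward into a single closed-form expression for dx. A sketch, assuming the cache holds `x_hat`, `gamma`, and `inv_std = 1/sqrt(var + eps)` (the function name and cache layout are assumptions):

```python
import numpy as np

def batchnorm_backward_alt(dout, x_hat, gamma, inv_std):
    """Streamlined batchnorm backward pass (sketch).

    Uses the simplified formula
      dx = inv_std/N * (N*dxhat - sum(dxhat) - x_hat * sum(dxhat * x_hat))
    obtained by differentiating through the batch mean and variance on paper.
    """
    N = dout.shape[0]
    dgamma = np.sum(dout * x_hat, axis=0)
    dbeta = np.sum(dout, axis=0)
    dxhat = dout * gamma
    dx = (inv_std / N) * (N * dxhat
                          - dxhat.sum(axis=0)
                          - x_hat * (dxhat * x_hat).sum(axis=0))
    return dx, dgamma, dbeta
```

The tiny differences printed above (~1e-12 and below) are floating-point noise between the two formulations; the ~1x "speedup" shows the closed form is not always faster in practice, only simpler.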
Running check with reg = 0
Initial loss: 2.38575331646
W1 relative error: 9.09e-06
W2 relative error: 3.44e-05
W3 relative error: 9.63e-10
b1 relative error: 1.78e-07
b2 relative error: 1.78e-07
b3 relative error: 1.23e-10
beta1 relative error: 5.59e-09
beta2 relative error: 1.50e-08
gamma1 relative error: 5.54e-09
gamma2 relative error: 7.56e-08
Running check with reg = 3.14
Initial loss: 11.8905697635
W1 relative error: 1.00e+00
W2 relative error: 1.00e+00
W3 relative error: 1.90e-07
b1 relative error: 8.88e-08
b2 relative error: 4.44e-08
b3 relative error: 3.20e-10
beta1 relative error: 3.74e-08
beta2 relative error: 6.86e-08
gamma1 relative error: 3.76e-08
gamma2 relative error: 1.54e-08
(Iteration 1 / 400) loss: 2.320615
(Epoch 0 / 20) train acc: 0.115000; val_acc: 0.112000
(Epoch 1 / 20) train acc: 0.377000; val_acc: 0.289000
(Epoch 2 / 20) train acc: 0.460000; val_acc: 0.332000
(Epoch 3 / 20) train acc: 0.542000; val_acc: 0.375000
(Epoch 4 / 20) train acc: 0.612000; val_acc: 0.349000
(Epoch 5 / 20) train acc: 0.662000; val_acc: 0.340000
(Epoch 6 / 20) train acc: 0.712000; val_acc: 0.358000
(Epoch 7 / 20) train acc: 0.747000; val_acc: 0.347000
(Epoch 8 / 20) train acc: 0.811000; val_acc: 0.361000
(Epoch 9 / 20) train acc: 0.852000; val_acc: 0.355000
(Epoch 10 / 20) train acc: 0.882000; val_acc: 0.345000
(Iteration 201 / 400) loss: 0.687450
(Epoch 11 / 20) train acc: 0.907000; val_acc: 0.359000
(Epoch 12 / 20) train acc: 0.932000; val_acc: 0.354000
(Epoch 13 / 20) train acc: 0.939000; val_acc: 0.333000
(Epoch 14 / 20) train acc: 0.953000; val_acc: 0.346000
(Epoch 15 / 20) train acc: 0.977000; val_acc: 0.342000
(Epoch 16 / 20) train acc: 0.976000; val_acc: 0.357000
(Epoch 17 / 20) train acc: 0.982000; val_acc: 0.357000
(Epoch 18 / 20) train acc: 0.986000; val_acc: 0.355000
(Epoch 19 / 20) train acc: 0.989000; val_acc: 0.341000
(Epoch 20 / 20) train acc: 0.987000; val_acc: 0.348000

(Iteration 1 / 400) loss: 2.302501
(Epoch 0 / 20) train acc: 0.120000; val_acc: 0.129000
(Epoch 1 / 20) train acc: 0.221000; val_acc: 0.205000
(Epoch 2 / 20) train acc: 0.315000; val_acc: 0.260000
(Epoch 3 / 20) train acc: 0.333000; val_acc: 0.287000
(Epoch 4 / 20) train acc: 0.352000; val_acc: 0.299000
(Epoch 5 / 20) train acc: 0.390000; val_acc: 0.305000
(Epoch 6 / 20) train acc: 0.430000; val_acc: 0.310000
(Epoch 7 / 20) train acc: 0.459000; val_acc: 0.328000
(Epoch 8 / 20) train acc: 0.495000; val_acc: 0.325000
(Epoch 9 / 20) train acc: 0.507000; val_acc: 0.328000
(Epoch 10 / 20) train acc: 0.541000; val_acc: 0.325000
(Iteration 201 / 400) loss: 1.127434
(Epoch 11 / 20) train acc: 0.607000; val_acc: 0.336000
(Epoch 12 / 20) train acc: 0.634000; val_acc: 0.327000
(Epoch 13 / 20) train acc: 0.703000; val_acc: 0.347000
(Epoch 14 / 20) train acc: 0.696000; val_acc: 0.324000
(Epoch 15 / 20) train acc: 0.764000; val_acc: 0.339000
(Epoch 16 / 20) train acc: 0.781000; val_acc: 0.315000
(Epoch 17 / 20) train acc: 0.756000; val_acc: 0.305000
(Epoch 18 / 20) train acc: 0.824000; val_acc: 0.305000
(Epoch 19 / 20) train acc: 0.842000; val_acc: 0.321000
(Epoch 20 / 20) train acc: 0.876000; val_acc: 0.315000
Running weight scale 1 / 20
Running weight scale 2 / 20
Running weight scale 3 / 20
Running weight scale 4 / 20
Running weight scale 5 / 20
Running weight scale 6 / 20
Running weight scale 7 / 20
Running weight scale 8 / 20
Running weight scale 9 / 20
Running weight scale 10 / 20
Running weight scale 11 / 20
Running weight scale 12 / 20
Running weight scale 13 / 20
Running weight scale 14 / 20
Running weight scale 15 / 20
Running weight scale 16 / 20
cs231n/layers.py:773: RuntimeWarning: divide by zero encountered in log
  loss = -np.sum(np.log(PRobs[np.arange(N), y])) / N
Running weight scale 17 / 20
Running weight scale 18 / 20
Running weight scale 19 / 20
Running weight scale 20 / 20
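The RuntimeWarning above appears when an extreme weight scale drives the predicted probability of the correct class to exactly zero, so `np.log` returns -inf. A log-sum-exp formulation of the softmax loss avoids both that and overflow in `exp`; this is a sketch, not the loss implementation in `cs231n/layers.py`:

```python
import numpy as np

def softmax_loss_stable(scores, y):
    """Numerically stable softmax loss (sketch).

    Shifting scores by their row max prevents overflow in exp, and the
    log-sum-exp form computes log-probabilities directly, so log is never
    applied to an exact zero.
    """
    N = scores.shape[0]
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(N), y].mean()
```

For moderate scores this agrees with the naive `-log(probs)` formulation to machine precision; for extreme scores it stays finite instead of producing the warning seen in the sweep.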