See the original repo.

- Been testing mostly on Imagewoof at size=128, epochs=5 (around 68%, not even close to the current leaderboard).
- Had a couple of 80-epoch runs (Imagewoof, size=128) that broke the current best of 87.20. Here is the record for the highest (87.73):
```
python3 train.py --run 1 --woof 1 --size 128 --bs 64 --mixup 0 --epoch 80 --lr 4e-3 --gpu 2 --opt ranger --mom .95 --sched_type flat_and_anneal --ann_start 0.72 --sa 1
```
```
/home/hebe/.fastai/data/imagewoof2
8121 annealing start
epoch train_loss valid_loss accuracy top_k_accuracy time
0 2.059015 1.955903 0.353271 0.847544 00:45
1 1.810091 1.702113 0.469331 0.909392 00:44
2 1.647390 1.869125 0.452278 0.897684 00:43
3 1.528256 1.428508 0.587172 0.944770 00:46
4 1.398730 1.483926 0.581318 0.928481 00:44
5 1.322940 1.264610 0.682107 0.958768 00:44
6 1.259943 1.338293 0.644439 0.953678 00:45
7 1.208824 1.204837 0.714431 0.961822 00:44
8 1.151127 1.222800 0.702723 0.962840 00:45
9 1.117051 1.081342 0.761771 0.973021 00:46
10 1.066328 1.090195 0.764317 0.973530 00:44
11 1.055215 1.049962 0.780606 0.976584 00:44
12 1.010544 1.111728 0.754390 0.970985 00:44
13 0.976842 1.014729 0.791805 0.978366 00:46
14 0.957401 1.085027 0.765589 0.974294 00:44
15 0.936549 0.990090 0.804021 0.980657 00:44
16 0.922476 1.042703 0.775261 0.973530 00:46
17 0.892019 0.976394 0.811148 0.978621 00:45
18 0.871739 1.055000 0.769916 0.974803 00:44
19 0.854085 0.964798 0.817511 0.981166 00:45
20 0.854483 0.997314 0.795877 0.979893 00:45
21 0.826084 0.947457 0.822092 0.982184 00:44
22 0.811175 1.018532 0.789514 0.979384 00:44
23 0.787036 0.949561 0.826673 0.976839 00:46
24 0.793724 0.950058 0.828964 0.980402 00:44
25 0.781576 0.933840 0.830746 0.980148 00:43
26 0.763649 0.995162 0.796640 0.979893 00:41
27 0.760761 0.925460 0.835582 0.980657 00:39
28 0.744653 0.951564 0.819292 0.978621 00:40
29 0.743768 0.942577 0.827182 0.981420 00:40
30 0.720729 0.950166 0.820056 0.978366 00:41
31 0.721088 0.900436 0.846780 0.983202 00:40
32 0.716351 0.968126 0.811657 0.977602 00:39
33 0.710904 0.916060 0.838636 0.979893 00:40
34 0.697059 0.959878 0.817765 0.977857 00:41
35 0.687178 0.904532 0.842963 0.981166 00:41
36 0.692708 0.949097 0.829473 0.979893 00:40
37 0.682395 0.901483 0.843472 0.981420 00:40
38 0.672723 0.933989 0.824637 0.978875 00:40
39 0.672619 0.900278 0.842963 0.982438 00:41
40 0.660355 0.939471 0.831255 0.975566 00:40
41 0.665885 0.907486 0.841435 0.980148 00:40
42 0.655964 0.952094 0.824637 0.977348 00:39
43 0.656164 0.919422 0.840163 0.978366 00:41
44 0.637180 0.949795 0.823619 0.974803 00:40
45 0.640856 0.914269 0.841181 0.978875 00:41
46 0.638588 0.941862 0.828710 0.976330 00:40
47 0.629136 0.902516 0.841945 0.981420 00:41
48 0.626218 0.930414 0.833291 0.977602 00:41
49 0.626535 0.929343 0.837872 0.976839 00:40
50 0.622971 0.940724 0.827946 0.973276 00:40
51 0.629888 0.891042 0.854925 0.978621 00:40
52 0.618163 0.934122 0.827437 0.981929 00:40
53 0.613615 0.909135 0.837618 0.978112 00:40
54 0.616215 0.885717 0.853398 0.978112 00:41
55 0.610619 0.904276 0.844490 0.979639 00:41
56 0.602873 0.925057 0.840417 0.976330 00:41
57 0.615508 0.880197 0.852889 0.980657 00:40
58 0.601900 0.899529 0.847289 0.978112 00:41
59 0.608448 0.885909 0.850089 0.978621 00:41
60 0.601173 0.909748 0.841181 0.976839 00:40
61 0.597144 0.905546 0.845253 0.976075 00:41
62 0.587813 0.938127 0.839145 0.971240 00:41
63 0.583056 0.890574 0.853143 0.975821 00:40
64 0.579982 0.904091 0.846526 0.976075 00:40
65 0.572257 0.873078 0.850853 0.980148 00:41
66 0.564794 0.879189 0.854670 0.973785 00:40
67 0.560571 0.869368 0.856197 0.977602 00:38
68 0.554823 0.885016 0.859761 0.975312 00:41
69 0.542529 0.865427 0.858488 0.976839 00:41
70 0.543428 0.852592 0.865106 0.976075 00:40
71 0.540655 0.841956 0.867905 0.980911 00:40
72 0.533528 0.848855 0.868160 0.980402 00:40
73 0.530833 0.831290 0.873250 0.980148 00:41
74 0.527480 0.828368 0.871978 0.979384 00:40
75 0.527150 0.831567 0.872487 0.980402 00:41
76 0.526826 0.829058 0.875032 0.980911 00:40
77 0.523319 0.824739 0.875286 0.979639 00:41
78 0.520348 0.826227 0.875286 0.980148 00:41
79 0.523115 0.825280 0.877322 0.980148 00:41
[0.877322]
0.8773225
0.0
```
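The `--sched_type flat_and_anneal --ann_start 0.72` flags keep the learning rate flat for the first 72% of training and then anneal it to zero (the log's "8121 annealing start" is the iteration where annealing kicks in). A minimal sketch of such a schedule; the function name and the cosine annealing shape are assumptions for illustration, not the repo's exact code:

```python
import math

def flat_and_anneal(pct, ann_start=0.72, lr=4e-3):
    """Learning rate as a function of training progress pct in [0, 1]:
    constant lr until ann_start, then cosine-anneal down to 0."""
    if pct < ann_start:
        return lr
    # rescale progress over the annealing phase to [0, 1]
    p = (pct - ann_start) / (1.0 - ann_start)
    return lr * (1.0 + math.cos(math.pi * p)) / 2.0

print(flat_and_anneal(0.5))   # still in the flat phase: 0.004
print(flat_and_anneal(1.0))   # fully annealed: 0.0
```

With 80 epochs this matches the accuracy curve above: slow, noisy gains through the flat phase, then a steady climb once annealing starts around epoch 58.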
A quick summary of the underlying mathematics:
| CNN (ResNet) | PDE ("heat" equation) |
|---|---|
| input layer | initial condition |
| feed forward | solving the equation |
| hidden layers | solution at intermediate times |
| output layer | solution at final time |
| convolution with 3×3 kernel | differential operator of order ≤ 2 |
| weights | coefficients |
| boundary handling (padding) | boundary condition |
| multiple channels/filters/feature maps | system of (coupled) PDEs |
| e.g. 16×16×3×3 kernel | 16×16 matrix of differential operators |
| 16×16×1×1 kernel | 16×16 matrix of constants |
| groups=2 (in Conv2d) | matrix is block diagonal (direct sum of 2 blocks) |
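The "convolution = differential operator of order ≤ 2" row is easiest to see with the classic 3×3 Laplacian stencil. A small self-contained sketch (not from the repo) using a hand-rolled "valid" 2D cross-correlation, i.e. what a padding-free Conv2d computes:

```python
import numpy as np

# 3x3 stencil for the discrete Laplacian d²/dx² + d²/dy² (grid spacing h=1)
laplacian = np.array([[0.,  1., 0.],
                      [1., -4., 1.],
                      [0.,  1., 0.]])

def conv2d_valid(u, k):
    """Plain 'valid' 2D cross-correlation, as a Conv2d with no padding does."""
    H, W = u.shape
    kh, kw = k.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(u[i:i + kh, j:j + kw] * k)
    return out

# u(x, y) = x² + y² has Laplacian exactly 4 everywhere, and the
# stencil reproduces it exactly on a grid (second differences of a
# quadratic are exact):
y, x = np.meshgrid(np.arange(8.), np.arange(8.), indexing="ij")
u = x**2 + y**2
print(conv2d_valid(u, laplacian))  # every entry is 4.0
```

A trained 3×3 kernel is just such a stencil with learned coefficients, which is what lets the weights be read as PDE coefficients.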
The idea of ResNet with a Twist is to add "variable coefficients" in front of the differential operators; the coefficients are simply linear in the x and y directions, which already suffices for rotating and scaling the feature maps (e.g. x·∂/∂y − y·∂/∂x generates rotations, x·∂/∂x + y·∂/∂y generates scalings).
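A minimal sketch of what "coefficients linear in x and y" could look like, with the feature map's output modulated by a per-channel field a + b·x + c·y. The function name and the exact form of the modulation are assumptions for illustration, not the repo's implementation:

```python
import numpy as np

def twist_coefficients(H, W, a, b, c):
    """Spatially varying coefficient field a + b*x + c*y on an H x W grid,
    with x and y normalized to [-1, 1]."""
    y, x = np.meshgrid(np.linspace(-1, 1, H), np.linspace(-1, 1, W),
                       indexing="ij")
    return a + b * x + c * y

# Modulate a feature map (e.g. the output of a 3x3 conv) with the field;
# a=1, b=c=0 recovers the plain constant-coefficient convolution.
feat = np.random.randn(8, 8)
coeff = twist_coefficients(8, 8, a=1.0, b=0.5, c=-0.5)
twisted = coeff * feat
```

In this picture the network learns a, b, c per channel, so each residual block can apply a slightly rotated or rescaled version of its differential operator across the feature map.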