liuyao12 / Ranger-Mish-ImageWoof-5

Repo to build on / reproduce the record-breaking Ranger-Mish-SelfAttention setup on the FastAI ImageWoof dataset, 5 epochs.


Ranger-Mish-ImageWoof-5

See the original repo.

ResNet with a Twist

  • been testing mostly on ImageWoof, size=128, epochs=5 (around 68% accuracy, not even close to the current leaderboard)

  • had a couple of epoch=80 runs (ImageWoof, size=128), breaking the current best of 87.20. Here's the command for the highest run (87.73); a sketch of the flat_and_anneal schedule it uses follows the training log below:

```
python3 train.py --run 1 --woof 1 --size 128 --bs 64 --mixup 0 --epoch 80 --lr 4e-3 --gpu 2 --opt ranger --mom .95 --sched_type flat_and_anneal --ann_start 0.72 --sa 1
```

```
/home/hebe/.fastai/data/imagewoof2
8121  annealing start
epoch     train_loss  valid_loss  accuracy  top_k_accuracy  time    
0         2.059015    1.955903    0.353271  0.847544        00:45       
1         1.810091    1.702113    0.469331  0.909392        00:44       
2         1.647390    1.869125    0.452278  0.897684        00:43       
3         1.528256    1.428508    0.587172  0.944770        00:46       
4         1.398730    1.483926    0.581318  0.928481        00:44       
5         1.322940    1.264610    0.682107  0.958768        00:44       
6         1.259943    1.338293    0.644439  0.953678        00:45       
7         1.208824    1.204837    0.714431  0.961822        00:44       
8         1.151127    1.222800    0.702723  0.962840        00:45       
9         1.117051    1.081342    0.761771  0.973021        00:46        
10        1.066328    1.090195    0.764317  0.973530        00:44        
11        1.055215    1.049962    0.780606  0.976584        00:44        
12        1.010544    1.111728    0.754390  0.970985        00:44        
13        0.976842    1.014729    0.791805  0.978366        00:46        
14        0.957401    1.085027    0.765589  0.974294        00:44        
15        0.936549    0.990090    0.804021  0.980657        00:44        
16        0.922476    1.042703    0.775261  0.973530        00:46        
17        0.892019    0.976394    0.811148  0.978621        00:45        
18        0.871739    1.055000    0.769916  0.974803        00:44        
19        0.854085    0.964798    0.817511  0.981166        00:45        
20        0.854483    0.997314    0.795877  0.979893        00:45        
21        0.826084    0.947457    0.822092  0.982184        00:44        
22        0.811175    1.018532    0.789514  0.979384        00:44        
23        0.787036    0.949561    0.826673  0.976839        00:46        
24        0.793724    0.950058    0.828964  0.980402        00:44        
25        0.781576    0.933840    0.830746  0.980148        00:43        
26        0.763649    0.995162    0.796640  0.979893        00:41        
27        0.760761    0.925460    0.835582  0.980657        00:39        
28        0.744653    0.951564    0.819292  0.978621        00:40        
29        0.743768    0.942577    0.827182  0.981420        00:40        
30        0.720729    0.950166    0.820056  0.978366        00:41        
31        0.721088    0.900436    0.846780  0.983202        00:40        
32        0.716351    0.968126    0.811657  0.977602        00:39        
33        0.710904    0.916060    0.838636  0.979893        00:40        
34        0.697059    0.959878    0.817765  0.977857        00:41        
35        0.687178    0.904532    0.842963  0.981166        00:41        
36        0.692708    0.949097    0.829473  0.979893        00:40        
37        0.682395    0.901483    0.843472  0.981420        00:40        
38        0.672723    0.933989    0.824637  0.978875        00:40        
39        0.672619    0.900278    0.842963  0.982438        00:41        
40        0.660355    0.939471    0.831255  0.975566        00:40        
41        0.665885    0.907486    0.841435  0.980148        00:40        
42        0.655964    0.952094    0.824637  0.977348        00:39        
43        0.656164    0.919422    0.840163  0.978366        00:41        
44        0.637180    0.949795    0.823619  0.974803        00:40        
45        0.640856    0.914269    0.841181  0.978875        00:41        
46        0.638588    0.941862    0.828710  0.976330        00:40        
47        0.629136    0.902516    0.841945  0.981420        00:41        
48        0.626218    0.930414    0.833291  0.977602        00:41        
49        0.626535    0.929343    0.837872  0.976839        00:40        
50        0.622971    0.940724    0.827946  0.973276        00:40        
51        0.629888    0.891042    0.854925  0.978621        00:40        
52        0.618163    0.934122    0.827437  0.981929        00:40        
53        0.613615    0.909135    0.837618  0.978112        00:40        
54        0.616215    0.885717    0.853398  0.978112        00:41        
55        0.610619    0.904276    0.844490  0.979639        00:41        
56        0.602873    0.925057    0.840417  0.976330        00:41        
57        0.615508    0.880197    0.852889  0.980657        00:40        
58        0.601900    0.899529    0.847289  0.978112        00:41        
59        0.608448    0.885909    0.850089  0.978621        00:41        
60        0.601173    0.909748    0.841181  0.976839        00:40        
61        0.597144    0.905546    0.845253  0.976075        00:41        
62        0.587813    0.938127    0.839145  0.971240        00:41        
63        0.583056    0.890574    0.853143  0.975821        00:40        
64        0.579982    0.904091    0.846526  0.976075        00:40        
65        0.572257    0.873078    0.850853  0.980148        00:41        
66        0.564794    0.879189    0.854670  0.973785        00:40        
67        0.560571    0.869368    0.856197  0.977602        00:38        
68        0.554823    0.885016    0.859761  0.975312        00:41        
69        0.542529    0.865427    0.858488  0.976839        00:41        
70        0.543428    0.852592    0.865106  0.976075        00:40        
71        0.540655    0.841956    0.867905  0.980911        00:40        
72        0.533528    0.848855    0.868160  0.980402        00:40        
73        0.530833    0.831290    0.873250  0.980148        00:41        
74        0.527480    0.828368    0.871978  0.979384        00:40        
75        0.527150    0.831567    0.872487  0.980402        00:41        
76        0.526826    0.829058    0.875032  0.980911        00:40        
77        0.523319    0.824739    0.875286  0.979639        00:41        
78        0.520348    0.826227    0.875286  0.980148        00:41        
79        0.523115    0.825280    0.877322  0.980148        00:41        
[0.877322]
0.8773225
0.0
```
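
The `--sched_type flat_and_anneal --ann_start 0.72` options hold the learning rate flat for the first 72% of training and then anneal it to zero; 0.72 of the total iterations lands at iteration 8121, the "annealing start" printed at the top of the log. Here is a minimal sketch, assuming a cosine anneal and a fractional-progress input (the function name and signature are illustrative, not the repo's exact code):

```python
import math

def flat_and_anneal(pct, ann_start=0.72, lr_max=4e-3):
    """Illustrative flat-then-anneal LR schedule (hypothetical helper, not
    the repo's exact code): hold lr_max for the first ann_start fraction of
    training, then cosine-anneal to zero over the remainder.

    pct: fraction of total training iterations completed, in [0, 1].
    """
    if pct < ann_start:
        return lr_max
    # rescale the annealing phase to [0, 1] and follow a half cosine down to 0
    p = (pct - ann_start) / (1.0 - ann_start)
    return lr_max * (1 + math.cos(math.pi * p)) / 2
```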

A quick summary of the underlying mathematics:

| CNN (ResNet) | PDE ("heat" equation) |
| --- | --- |
| input layer | initial condition |
| feed forward | solving the equation |
| hidden layers | solution at intermediate times |
| output layer | solution at final time |
| convolution with 3×3 kernel | differential operator of order ≤ 2 |
| weights | coefficients |
| boundary handling (padding) | boundary condition |
| multiple channels/filters/feature maps | system of (coupled) PDEs |
| e.g. 16×16×3×3 kernel | 16×16 matrix of differential operators |
| 16×16×1×1 kernel | 16×16 matrix of constants |
| groups=2 (in Conv2d) | matrix is block diagonal (direct sum of 2 blocks) |
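
As an illustration of the kernel-to-operator row (an example, not from the repo), convolving with the classic 3×3 Laplacian kernel is a finite-difference discretization of the second-order operator that drives the heat equation, with grid spacing h:

```latex
\begin{pmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{pmatrix} * u
\;\approx\; h^{2}\left(\partial_x^{2} + \partial_y^{2}\right) u
\;=\; h^{2}\,\Delta u
```

More generally, the 9 weights of a 3×3 kernel are enough to express any constant-coefficient differential operator of order ≤ 2 in two variables (six independent terms: 1, ∂x, ∂y, ∂x², ∂x∂y, ∂y²).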

The idea of ResNet with a Twist is to add "variable coefficients" in front of the differential operators, the coefficients being simply linear in the x and y directions, which suffices for rotating and scaling the feature maps. A minimal sketch follows.
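
One way to read that in PyTorch (the module name TwistConv2d and the per-channel coefficient parameters are illustrative assumptions, not the repo's actual implementation): multiply the convolution's output pointwise by the variable coefficient 1 + a·x + b·y, with learnable per-channel a, b and normalized coordinates x, y.

```python
import torch
import torch.nn as nn

class TwistConv2d(nn.Module):
    """Hypothetical sketch of a "conv with a twist": the 3x3 convolution's
    output is modulated by coefficients that vary linearly in the x and y
    directions, which can rotate/scale the feature maps."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False)
        # per-output-channel coefficients for the linear-in-x / linear-in-y terms
        self.coef_x = nn.Parameter(torch.zeros(out_ch))
        self.coef_y = nn.Parameter(torch.zeros(out_ch))

    def forward(self, x):
        out = self.conv(x)
        n, c, h, w = out.shape
        # normalized coordinate grids in [-1, 1], broadcast over batch and channels
        ys = torch.linspace(-1, 1, h, device=out.device).view(1, 1, h, 1)
        xs = torch.linspace(-1, 1, w, device=out.device).view(1, 1, 1, w)
        # variable coefficient 1 + a*x + b*y in front of the learned operator
        scale = 1 + self.coef_x.view(1, c, 1, 1) * xs + self.coef_y.view(1, c, 1, 1) * ys
        return out * scale
```

With a = b = 0 the layer reduces to an ordinary 3×3 convolution, so the twist can be learned as a perturbation of the plain ResNet block.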

About

Repo to build on / reproduce the record-breaking Ranger-Mish-SelfAttention setup on the FastAI ImageWoof dataset, 5 epochs.

License: Apache License 2.0


Languages

Jupyter Notebook 98.8%, Python 1.2%