Top-1 accuracy (%) of representative vision backbones trained with 20 popular optimizers on CIFAR-100. Torch-style training settings are used for AlexNet, VGG-13, R-50/101 (ResNet-50/101), and MobV2 (MobileNetV2), while the other backbones, including Eff-B0 (EfficientNet-B0), adopt modern training recipes. IF-12, PF-12, CF-12, AF-12, and CA-12 abbreviate the MetaFormer S12 variants. The blue and gray regions denote the top and outlier (trivial) results, respectively; all other results are inliers.
Optimizer | AlexNet | VGG-13 | R-50 | R-101 | MobV2 | Eff-B0 | DeiT-S | MLP-S | Swin-T | CX-T | Moga-S | IF-12 | PF-12 | CF-12 | AF-12 | CA-12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
SGD-M | 66.78 | 77.08 | 78.76 | 84.87 | 77.16 | 79.41 | 63.39 | 72.64 | 78.95 | 60.09 | 75.06 | 77.40 | 77.70 | 83.46 | 83.02 | 81.21 |
SGDP | 66.54 | 77.56 | 79.25 | 85.30 | 77.50 | 79.55 | 63.53 | 69.24 | 80.56 | 61.25 | 80.86 | 77.55 | 77.53 | 83.54 | 82.88 | 81.56 |
LION | 62.11 | 73.87 | 75.28 | 82.79 | 75.35 | 76.97 | 74.57 | 74.19 | 81.84 | 82.29 | 85.03 | 78.65 | 79.66 | 84.62 | 82.41 | 79.59 |
Adam | 65.29 | 73.41 | 74.10 | 83.34 | 74.56 | 76.48 | 71.04 | 72.84 | 80.71 | 82.03 | 84.92 | 78.39 | 79.18 | 84.81 | 81.54 | 82.18 |
Adamax | 67.30 | 73.80 | 75.21 | 83.27 | 74.60 | 78.37 | 73.31 | 73.07 | 81.28 | 80.25 | 84.51 | 78.02 | 79.55 | 84.31 | 81.42 | 82.50 |
AdamP | 60.27 | 75.56 | 78.17 | 84.64 | 77.79 | 78.65 | 71.55 | 73.66 | 80.91 | 84.47 | 86.45 | 79.20 | 81.70 | 85.15 | 82.12 | 83.40 |
AdamW | 62.71 | 73.90 | 75.56 | 84.01 | 76.88 | 78.77 | 72.15 | 73.59 | 81.30 | 83.52 | 86.19 | 79.39 | 80.55 | 85.46 | 82.24 | 83.60 |
Adan | 63.98 | 74.90 | 77.08 | 84.96 | 77.73 | 78.43 | 76.33 | 79.94 | 83.35 | 84.65 | 86.45 | 80.59 | 83.23 | 85.58 | 83.51 | 84.89 |
LAMB | 66.90 | 75.55 | 77.19 | 85.05 | 77.49 | 78.77 | 75.39 | 74.98 | 83.47 | 84.13 | 86.04 | 80.21 | 80.01 | 85.40 | 83.16 | 83.74 |
NAdam | 60.49 | 73.96 | 74.56 | 82.78 | 75.69 | 77.06 | 72.75 | 73.77 | 81.80 | 82.26 | 85.23 | 78.37 | 80.32 | 84.81 | 81.82 | 82.83 |
RAdam | 61.69 | 74.64 | 75.19 | 81.85 | 75.62 | 77.08 | 72.41 | 72.11 | 79.84 | 82.18 | 84.95 | 78.46 | 79.71 | 84.93 | 81.44 | 82.35 |
AdaBelief | 62.98 | 75.09 | 80.53 | 85.47 | 75.78 | 78.48 | 70.66 | 73.30 | 80.98 | 83.31 | 84.80 | 78.55 | 81.00 | 85.03 | 83.21 | 83.56 |
AdaBound | 66.59 | 77.00 | 78.11 | 84.45 | 78.76 | 79.88 | 68.59 | 70.31 | 80.67 | 49.18 | 78.48 | 75.03 | 77.62 | 82.73 | 83.08 | 82.38 |
AdaFactor | 63.91 | 74.49 | 75.41 | 84.42 | 75.38 | 77.83 | 74.02 | 71.16 | 80.36 | 82.82 | 85.17 | 78.78 | 78.81 | 84.90 | 81.94 | 82.36 |
LARS | 64.35 | 75.71 | 78.25 | 84.45 | 76.23 | 72.43 | 71.36 | 72.64 | 81.29 | 61.40 | 75.93 | 77.66 | 78.78 | 82.98 | 81.00 | 82.05 |
NovoGrad | 64.24 | 76.09 | 79.36 | 85.23 | 74.83 | 74.23 | 73.13 | 67.03 | 81.82 | 79.99 | 82.86 | 77.16 | 80.42 | 83.51 | 81.28 | 82.98 |
Sophia | 64.30 | 74.18 | 75.19 | 82.54 | 76.60 | 78.95 | 71.47 | 72.74 | 80.61 | 83.76 | 85.39 | 77.67 | 78.90 | 84.58 | 81.67 | 82.96 |
AdaGrad | 45.79 | 71.29 | 73.30 | 81.81 | 33.87 | 77.93 | 67.24 | 67.50 | 75.83 | 83.03 | 83.03 | 32.28 | 44.40 | 79.67 | 78.71 | 38.09 |
AdaDelta | 66.72 | 74.14 | 75.07 | 83.58 | 75.32 | 77.88 | 65.44 | 71.32 | 80.25 | 74.25 | 81.06 | 75.91 | 76.40 | 84.05 | 82.62 | 82.08 |
RMSProp | 59.33 | 73.30 | 74.25 | 79.38 | 73.94 | 76.83 | 70.71 | 71.63 | 77.52 | 82.29 | 85.17 | 77.40 | 77.14 | 84.01 | 79.72 | 81.83 |