Intel SapphireRapids S2 M4 C52
- Processor: Intel(R) Xeon(R) Platinum 8470
- Base frequency: 2.0 GHz
- Number of sockets: 2
- Number of memory domains per socket: 4
- Number of cores per socket: 52
- Number of HWThreads per core: 1
- MachineState output: NA
- Memory Info: 16x 64GB DDR5 4800MHz with ECC, Dual-Rank
+----------+-------------------+
| Compiler | icc (ICC) |
|----------|-------------------|
| Version | 2021.6.0 20220226 |
+----------+-------------------+
Optimizing flags: -fast -xHost -qopt-streaming-stores=always -qopt-zmm-usage=high -std=c99 -ffreestanding -qopenmp
All results are in GB/s
Summary results:
+-----------------------------------------------------------------+
| Single core   | 32.54 (Init)                                    |
| Memory domain | 68.30 (Sum with 8 cores)                        |
| Socket        | 270.97 (Sum with 9 cores per memory domain)     |
| Node          | 597.93 (Update with 13 cores per memory domain) |
+-----------------------------------------------------------------+
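The summary values appear to be the highest bandwidth found for any kernel at each level. The following Python sketch cross-checks that reading against the tables on this page. The names `column_best`, `best_kernel` and `levels` are hypothetical, and the mapping of "Socket" to 4 memory domains and "Node" to 8 is an assumption based on the system description above.

```python
# Best value (GB/s) per kernel, read off the tables on this page.
# Key = number of memory domains used. Assumption: 1 = one memory domain,
# 4 = one socket, 8 = full node (2 sockets x 4 domains).
# Only the five kernels that have a cross-domain table are listed.
column_best = {
    1: {"Init": 60.33, "Sum": 68.30, "Copy": 61.43, "Update": 63.47, "Triad": 62.30},
    4: {"Init": 227.90, "Sum": 270.97, "Copy": 250.96, "Update": 270.31, "Triad": 248.27},
    8: {"Init": 405.00, "Sum": 534.70, "Copy": 512.59, "Update": 597.93, "Triad": 492.21},
}


def best_kernel(column):
    """Return (kernel, bandwidth) for the highest entry of one column."""
    kernel = max(column, key=column.get)
    return kernel, column[kernel]


levels = {"Memory domain": 1, "Socket": 4, "Node": 8}
summary = {label: best_kernel(column_best[nm]) for label, nm in levels.items()}

for label, (kernel, gbs) in summary.items():
    print(f"{label:14s} {gbs:7.2f} GB/s ({kernel})")
```

Daxpy, STriad and SDaxpy are left out because they are only reported for a single memory domain, where none of them exceeds the Sum result.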
Results for scaling within a memory domain:
#nt Init Sum Copy Update Triad Daxpy STriad SDaxpy
1 32.54 15.35 23.97 20.67 20.98 19.45 21.61 20.41
2 60.33 29.61 40.48 37.08 38.01 35.12 39.05 37.14
3 59.60 39.20 49.06 48.09 48.99 46.68 49.68 48.13
4 58.83 39.66 53.93 54.05 54.75 53.51 52.85 52.54
5 58.02 49.58 57.08 57.83 59.19 58.30 60.48 59.87
6 56.76 59.46 59.66 60.11 61.49 61.21 62.67 62.48
7 55.12 67.31 61.04 62.44 62.21 62.59 63.18 63.32
8 54.24 68.30 61.37 63.47 62.30 62.79 63.08 63.16
9 53.66 68.23 61.43 63.41 62.17 62.83 62.99 63.10
10 53.31 68.07 61.29 63.25 62.09 62.70 62.86 62.99
11 52.99 67.92 61.14 63.08 62.01 62.65 62.74 62.87
12 52.76 67.79 61.00 62.86 61.96 62.59 62.71 62.81
13 52.55 67.72 60.66 62.77 61.84 62.49 62.53 62.66
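To read the saturation behaviour off the table above, the following Python sketch finds, for three of the kernels, the peak bandwidth, the thread count at which it occurs, and the first thread count that reaches 95% of the peak. The helper `saturation` and the 95% threshold are illustrative choices, not part of the benchmark.

```python
# Bandwidth in GB/s for 1..13 threads within one memory domain,
# copied from the Init, Sum and Triad columns of the table above.
within_domain = {
    "Init":  [32.54, 60.33, 59.60, 58.83, 58.02, 56.76, 55.12,
              54.24, 53.66, 53.31, 52.99, 52.76, 52.55],
    "Sum":   [15.35, 29.61, 39.20, 39.66, 49.58, 59.46, 67.31,
              68.30, 68.23, 68.07, 67.92, 67.79, 67.72],
    "Triad": [20.98, 38.01, 48.99, 54.75, 59.19, 61.49, 62.21,
              62.30, 62.17, 62.09, 62.01, 61.96, 61.84],
}


def saturation(values, fraction=0.95):
    """Return (peak, threads at peak, first thread count >= fraction * peak)."""
    peak = max(values)
    threads_at_peak = values.index(peak) + 1
    first = next(nt for nt, v in enumerate(values, start=1)
                 if v >= fraction * peak)
    return peak, threads_at_peak, first


for kernel, values in within_domain.items():
    peak, at, first = saturation(values)
    print(f"{kernel:6s} peak {peak:.2f} GB/s at {at} threads, "
          f"95% of peak from {first} threads")
```

With these numbers, Init peaks at 2 threads and then declines slowly, while Sum and Triad keep gaining until 8 threads.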
Results for scaling across memory domains. Each column gives the number of memory domains used (nm); each row gives the number of cores used per memory domain.
Init:
#nm 1 2 3 4 5 6 7 8
1 32.54 64.34 95.95 127.65 157.17 189.77 165.01 187.85
2 60.33 117.89 144.52 188.84 233.27 269.10 296.70 257.78
3 59.60 117.26 158.41 223.32 266.87 307.37 349.94 324.15
4 58.83 115.25 168.48 227.90 270.25 310.91 353.26 372.80
5 58.02 111.87 169.94 226.74 269.38 310.76 355.16 394.65
6 56.76 110.88 168.30 224.54 268.60 310.73 354.68 402.33
7 55.12 109.39 166.28 221.88 266.72 309.66 354.21 403.81
8 54.24 107.91 165.20 220.00 263.58 309.30 354.58 405.00
9 53.66 107.52 163.20 218.04 262.36 308.98 354.58 404.16
10 53.31 106.84 161.12 215.63 260.76 308.16 355.30 404.14
11 52.99 106.42 160.45 215.06 258.79 307.14 353.76 404.63
12 52.76 106.11 159.74 213.98 258.21 306.55 353.12 404.63
13 52.55 105.53 159.18 213.41 256.68 305.53 354.15 403.18
Sum:
#nm 1 2 3 4 5 6 7 8
1 15.35 30.56 45.81 61.02 76.22 91.47 100.68 113.99
2 29.61 58.44 83.07 110.10 137.41 164.67 191.20 219.65
3 39.20 77.22 112.99 152.32 190.09 227.23 264.43 302.93
4 39.66 79.23 117.54 157.90 196.77 235.67 275.05 314.37
5 49.58 98.89 145.72 195.60 242.75 291.55 336.97 388.19
6 59.46 118.62 174.50 234.22 290.37 349.28 402.92 463.47
7 67.31 132.77 198.42 264.59 328.83 395.76 448.03 520.44
8 68.30 134.33 202.80 270.72 332.97 401.30 454.44 534.04
9 68.23 134.81 203.20 270.97 333.52 402.74 453.70 534.70
10 68.07 134.63 202.98 270.09 332.25 401.56 452.71 533.64
11 67.92 134.55 202.95 269.76 331.84 400.67 452.41 532.25
12 67.79 134.51 202.34 269.18 331.58 400.44 452.41 531.33
13 67.72 134.18 202.15 268.57 330.06 398.79 450.60 531.11
Copy:
#nm 1 2 3 4 5 6 7 8
1 23.97 47.44 71.46 96.29 119.95 144.24 152.19 174.33
2 40.48 80.36 114.67 153.37 192.21 231.94 272.56 312.78
3 49.06 98.06 147.14 199.76 249.74 301.97 355.48 406.30
4 53.93 108.29 163.82 220.55 276.83 334.68 393.56 450.95
5 57.08 114.69 172.66 231.95 289.86 350.23 411.19 471.52
6 59.66 119.32 179.61 241.62 303.21 366.98 431.01 495.21
7 61.04 121.90 184.46 248.20 311.38 376.74 440.95 507.05
8 61.37 123.02 185.98 250.96 313.86 380.60 444.86 511.31
9 61.43 123.31 186.12 250.76 313.53 381.78 445.17 512.59
10 61.29 123.14 185.83 250.56 314.18 380.05 444.41 511.50
11 61.14 122.81 185.37 249.87 313.17 379.45 442.46 509.17
12 61.00 122.54 184.66 248.97 311.52 376.98 441.62 507.63
13 60.66 121.99 183.73 247.71 310.34 376.12 437.83 505.11
Update:
#nm 1 2 3 4 5 6 7 8
1 20.67 40.56 61.25 82.22 103.17 124.66 126.82 144.97
2 37.08 73.67 99.87 133.24 167.14 200.45 237.19 269.92
3 48.09 96.53 136.85 185.19 232.30 279.16 329.87 378.30
4 54.05 108.90 162.58 221.27 278.21 334.84 398.35 453.29
5 57.83 116.76 175.81 238.73 302.49 365.88 435.88 502.15
6 60.11 120.34 183.77 250.10 315.60 384.34 457.05 528.43
7 62.44 124.23 190.30 258.48 327.90 398.75 475.08 549.09
8 63.47 127.43 195.85 266.43 337.07 411.18 487.11 564.96
9 63.41 128.23 197.45 268.22 341.21 417.67 497.54 579.35
10 63.25 127.49 197.29 269.43 343.53 418.63 501.16 586.70
11 63.08 127.34 197.55 269.22 344.98 422.60 503.25 587.53
12 62.86 127.45 197.51 270.17 345.68 422.65 506.30 597.74
13 62.77 127.07 197.43 270.31 347.01 426.26 512.28 597.93
Triad:
#nm 1 2 3 4 5 6 7 8
1 20.98 41.41 63.30 85.86 103.66 123.70 144.36 164.72
2 38.01 75.57 108.72 145.27 179.39 216.45 252.30 288.32
3 48.99 97.29 140.95 189.04 235.29 282.35 330.44 375.42
4 54.75 109.49 160.01 214.52 267.61 321.11 374.19 425.99
5 59.19 117.21 174.81 232.98 288.75 347.58 406.27 464.30
6 61.49 121.62 182.48 243.56 300.05 360.64 423.65 481.74
7 62.21 122.71 185.27 247.35 304.99 367.15 429.77 491.19
8 62.30 123.09 185.61 248.27 305.63 367.57 430.60 492.21
9 62.17 123.23 185.59 247.83 305.63 367.30 430.70 490.64
10 62.09 123.16 185.11 247.71 305.47 367.83 429.58 490.34
11 62.01 123.01 184.98 247.37 304.64 367.11 429.12 489.37
12 61.96 123.26 184.81 247.22 304.69 366.12 428.87 489.31
13 61.84 123.10 184.52 246.76 303.99 365.54 427.94 488.86
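One way to summarise the cross-domain tables is a scaling efficiency: the measured bandwidth on nm domains divided by nm times the single-domain value for the same number of cores per domain. The Python sketch below applies this to two rows copied from the tables above. The function name `scaling_efficiency` and the choice of rows are illustrative assumptions.

```python
# Rows copied from the tables above: bandwidth in GB/s for 1..8 memory domains.
triad_8_cores = [62.30, 123.09, 185.61, 248.27, 305.63, 367.57, 430.60, 492.21]
update_13_cores = [62.77, 127.07, 197.43, 270.31, 347.01, 426.26, 512.28, 597.93]


def scaling_efficiency(row):
    """Measured bandwidth relative to perfect linear scaling from one domain."""
    base = row[0]
    return [bw / (nm * base) for nm, bw in enumerate(row, start=1)]


for name, row in [("Triad, 8 cores/domain", triad_8_cores),
                  ("Update, 13 cores/domain", update_13_cores)]:
    eff = scaling_efficiency(row)
    print(name, " ".join(f"{e:.3f}" for e in eff))
```

For Triad the efficiency on 8 domains stays just below 1, while for Update it is above 1, meaning the full-node result exceeds eight times the single-domain result. The page does not state a reason for this.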
Memory bandwidth scaling within one memory domain:
The following plots illustrate the performance scaling over multiple memory domains using different numbers of cores per memory domain.
Memory bandwidth scaling across memory domains for Init:
Memory bandwidth scaling across memory domains for Sum:
Memory bandwidth scaling across memory domains for Copy:
Memory bandwidth scaling across memory domains for Triad: