Intel CascadeLake S2 M2 C20
- Processor: Intel(R) Xeon(R) Gold 6248 CPU
- Base frequency: 2.5 GHz
- Number of sockets: 2
- Number of memory domains per socket: 2
- Memory domain specs: 3-channel DDR4-2933
- Number of cores per socket: 20
- Number of HWThreads per core: 2
- MachineState output: NA
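For orientation, the memory configuration listed above can be turned into a theoretical peak bandwidth. The following is a minimal sketch, not part of the benchmark itself. It assumes each DDR4 channel transfers 8 bytes (a 64-bit data bus) per transfer and that 1 GB is 10^9 bytes. All names in the code are illustrative.

```python
# Theoretical peak memory bandwidth derived from the machine description above.
CHANNELS_PER_DOMAIN = 3      # "3-channel DDR4-2933"
TRANSFER_RATE_MT_S = 2933    # DDR4-2933: mega-transfers per second per channel
BYTES_PER_TRANSFER = 8       # assumption: 64-bit data bus per DDR4 channel
DOMAINS_PER_SOCKET = 2
SOCKETS = 2


def peak_bandwidth_gbs(channels, rate_mt_s, bytes_per_transfer=BYTES_PER_TRANSFER):
    """Return the theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    # MT/s times bytes per transfer gives MB/s; divide by 1000 for GB/s.
    return channels * rate_mt_s * bytes_per_transfer / 1000.0


domain_peak = peak_bandwidth_gbs(CHANNELS_PER_DOMAIN, TRANSFER_RATE_MT_S)
socket_peak = domain_peak * DOMAINS_PER_SOCKET
node_peak = socket_peak * SOCKETS

print(f"Memory domain: {domain_peak:.3f} GB/s")
print(f"Socket:        {socket_peak:.3f} GB/s")
print(f"Node:          {node_peak:.3f} GB/s")
```

Under these assumptions one memory domain peaks at about 70.4 GB/s, so the best single-domain result reported in the summary (61.05 GB/s) corresponds to roughly 87% of the theoretical value.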
+----------+---------------------------------+
| Compiler | icc (ICC) |
|----------|---------------------------------|
| Version | icc (ICC) 19.0.5.281 20190815 |
+----------+---------------------------------+
Optimizing flags: -fast -xHost -qopt-streaming-stores=always -std=c99 -ffreestanding -qopenmp
All results are in GB/s.
Summary results:
+--------------------------------------------+
| Single core | 18.63 (SDaxpy) |
| Memory domain | 61.05 (Sum with 7 cores) |
| Socket | 121.26 (Sum with 7 cores) |
| Node | 241.72 (Sum with 7 cores) |
+--------------------------------------------+
Results for scaling within a memory domain:
#nt Init Sum Copy Update Triad Daxpy STriad SDaxpy
1 7.60 13.07 10.60 16.29 13.72 18.07 15.22 18.63
2 15.05 26.88 20.72 31.31 26.61 35.70 29.98 36.85
3 22.51 40.39 29.95 43.36 37.22 44.65 40.34 44.40
4 29.63 49.83 37.77 46.83 44.16 47.93 45.78 47.88
5 36.93 56.81 43.66 49.38 47.98 50.01 48.95 50.06
6 42.51 60.14 46.81 50.63 49.40 50.62 49.69 50.47
7 46.36 61.05 48.55 50.83 49.60 50.40 49.56 50.04
8 49.61 60.64 49.00 50.39 48.99 49.72 48.98 49.36
9 50.54 60.00 48.98 50.30 48.98 49.54 49.08 49.29
10 50.26 59.19 48.83 50.04 48.77 49.34 49.24 49.34
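The single-core and memory-domain entries of the summary can be cross-checked against this table. The sketch below is an illustration rather than the tool's own reporting code. It parses the table as printed above, then looks up the best single-core kernel, the overall best result, and the thread count at which each kernel peaks. The function names are hypothetical.

```python
# Within-memory-domain scaling results (GB/s), copied from the table above.
TABLE = """\
#nt Init Sum Copy Update Triad Daxpy STriad SDaxpy
1 7.60 13.07 10.60 16.29 13.72 18.07 15.22 18.63
2 15.05 26.88 20.72 31.31 26.61 35.70 29.98 36.85
3 22.51 40.39 29.95 43.36 37.22 44.65 40.34 44.40
4 29.63 49.83 37.77 46.83 44.16 47.93 45.78 47.88
5 36.93 56.81 43.66 49.38 47.98 50.01 48.95 50.06
6 42.51 60.14 46.81 50.63 49.40 50.62 49.69 50.47
7 46.36 61.05 48.55 50.83 49.60 50.40 49.56 50.04
8 49.61 60.64 49.00 50.39 48.99 49.72 48.98 49.36
9 50.54 60.00 48.98 50.30 48.98 49.54 49.08 49.29
10 50.26 59.19 48.83 50.04 48.77 49.34 49.24 49.34
"""


def parse_table(text):
    """Return (kernel names, {threads: {kernel: bandwidth}})."""
    lines = text.strip().splitlines()
    kernels = lines[0].split()[1:]          # drop the "#nt" column label
    rows = {}
    for line in lines[1:]:
        fields = line.split()
        rows[int(fields[0])] = dict(zip(kernels, map(float, fields[1:])))
    return kernels, rows


def best_single_core(rows):
    """Kernel with the highest bandwidth on one thread."""
    kernel = max(rows[1], key=rows[1].get)
    return kernel, rows[1][kernel]


def best_overall(kernels, rows):
    """Highest bandwidth in the table as (bandwidth, kernel, threads)."""
    return max((rows[nt][k], k, nt) for nt in rows for k in kernels)


def peak_per_kernel(kernels, rows):
    """For each kernel, its peak bandwidth and the thread count reaching it."""
    return {k: max((rows[nt][k], nt) for nt in rows) for k in kernels}


kernels, rows = parse_table(TABLE)
single = best_single_core(rows)
domain = best_overall(kernels, rows)
peaks = peak_per_kernel(kernels, rows)

print("Single core:  ", single)
print("Memory domain:", domain)
for name, (bandwidth, threads) in peaks.items():
    print(f"{name:>7}: {bandwidth:6.2f} GB/s at {threads} threads")
```

The per-kernel peaks give a quick view of how many cores each kernel needs before the memory domain saturates.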
Results for scaling across memory domains. Each table covers one kernel: the columns give the number of memory domains used (nm), and the rows give the number of cores used per memory domain.
Init:
#nm 1 2 3 4
1 7.60 14.30 21.52 28.73
2 15.05 28.07 42.16 56.11
3 22.51 41.83 62.62 83.01
4 29.63 55.46 82.64 108.42
5 36.93 68.46 101.11 130.17
6 42.51 79.15 115.77 138.99
7 46.36 84.28 124.92 148.63
8 49.61 85.22 127.14 154.27
9 50.54 86.27 129.05 162.22
10 50.26 86.49 129.19 166.03
Sum:
#nm 1 2 3 4
1 13.07 26.11 38.75 51.61
2 26.88 53.54 80.09 106.34
3 40.39 79.80 119.42 159.19
4 49.83 100.03 149.44 198.42
5 56.81 113.15 169.59 225.12
6 60.14 119.04 178.69 237.97
7 61.05 121.26 181.61 241.72
8 60.64 119.55 179.25 239.07
9 60.00 118.79 178.08 236.55
10 59.19 117.82 176.94 235.29
Copy:
#nm 1 2 3 4
1 10.60 20.27 30.49 40.66
2 20.72 39.29 58.83 78.34
3 29.95 56.83 85.37 113.36
4 37.77 72.71 108.75 144.18
5 43.66 84.89 126.92 168.44
6 46.81 91.06 136.61 181.62
7 48.55 95.20 142.76 189.48
8 49.00 95.84 143.66 191.19
9 48.98 96.09 144.16 191.28
10 48.83 95.41 143.14 189.85
Update:
#nm 1 2 3 4
1 16.29 32.54 48.26 64.33
2 31.31 62.38 93.50 124.44
3 43.36 83.41 125.21 166.87
4 46.83 92.11 138.67 184.88
5 49.38 98.62 148.19 197.62
6 50.63 100.10 150.60 201.70
7 50.83 100.73 151.54 202.13
8 50.39 99.66 149.77 200.39
9 50.30 99.54 149.55 199.69
10 50.04 98.79 148.50 198.59
Triad:
#nm 1 2 3 4
1 13.72 26.77 40.07 53.36
2 26.61 51.22 76.34 101.55
3 37.22 71.46 107.90 143.13
4 44.16 86.80 131.00 173.26
5 47.98 94.51 141.67 187.95
6 49.40 96.14 144.42 192.48
7 49.60 96.87 145.26 193.09
8 48.99 95.31 143.29 190.53
9 48.98 95.09 142.42 189.60
10 48.77 94.79 142.31 189.06
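One way to read the tables above is as a scaling efficiency: the bandwidth measured on nm memory domains divided by nm times the single-domain bandwidth at the same core count. The sketch below applies this to three rows copied from the tables. The choice of rows and the helper name `scaling_efficiency` are illustrative and not part of the benchmark.

```python
# Bandwidth in GB/s on 1, 2, 3 and 4 memory domains, copied from the tables above.
# Each key is (kernel, cores used per memory domain).
MEASURED = {
    ("Init", 10): [50.26, 86.49, 129.19, 166.03],
    ("Sum", 7): [61.05, 121.26, 181.61, 241.72],
    ("Triad", 7): [49.60, 96.87, 145.26, 193.09],
}


def scaling_efficiency(bandwidths):
    """Efficiency relative to perfect linear scaling from one memory domain."""
    single_domain = bandwidths[0]
    return [bw / (nm * single_domain) for nm, bw in enumerate(bandwidths, start=1)]


EFFICIENCY = {key: scaling_efficiency(values) for key, values in MEASURED.items()}

for (kernel, cores), eff in EFFICIENCY.items():
    cells = "  ".join(f"{e:6.1%}" for e in eff)
    print(f"{kernel:>5} ({cores:2d} cores/domain): {cells}")
```

With these inputs, Sum and Triad stay close to linear scaling across all four memory domains, while Init falls noticeably short of it.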
Memory bandwidth scaling within one memory domain:
The following plots illustrate the performance scaling across multiple memory domains for different numbers of cores per memory domain.
Memory bandwidth scaling across memory domains for Init:
Memory bandwidth scaling across memory domains for Sum:
Memory bandwidth scaling across memory domains for Copy:
Memory bandwidth scaling across memory domains for Triad: