AMD Rome S2 M4 C64
- Processor: AMD EPYC 7662 64-Core Processor
- Base frequency: 2.0 GHz
- Number of sockets: 2
- Number of memory domains per socket: 4
- Memory domain specs: 2-channel DDR4-3200
- Number of cores per socket: 64
- Number of HWThreads per core: 2
- MachineState output: NA
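For reference, a theoretical peak memory bandwidth can be derived from the memory domain specs listed above. The following is a minimal sketch, assuming the standard 64-bit (8-byte) DDR4 channel width, which is not stated in the list; all names are illustrative.

```python
# Theoretical peak memory bandwidth from the specs listed above.
# Assumption: each DDR4 channel is 64 bits (8 bytes) wide.
TRANSFERS_PER_SEC = 3200e6   # DDR4-3200: 3200 mega-transfers per second
BYTES_PER_TRANSFER = 8       # assumed 64-bit channel
CHANNELS_PER_DOMAIN = 2      # "2-channel DDR4-3200"
DOMAINS_PER_SOCKET = 4
SOCKETS = 2


def peak_gbs(domains: int) -> float:
    """Peak bandwidth in GB/s (10^9 bytes/s) for a number of memory domains."""
    bytes_per_sec = domains * CHANNELS_PER_DOMAIN * TRANSFERS_PER_SEC * BYTES_PER_TRANSFER
    return bytes_per_sec / 1e9


peak_domain = peak_gbs(1)
peak_socket = peak_gbs(DOMAINS_PER_SOCKET)
peak_node = peak_gbs(DOMAINS_PER_SOCKET * SOCKETS)
print(f"domain {peak_domain:.1f}  socket {peak_socket:.1f}  node {peak_node:.1f} GB/s")
```

Under this assumption the peaks are 51.2 GB/s per memory domain, 204.8 GB/s per socket, and 409.6 GB/s for the node.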
+----------+-------------------------------------------------------------------+
| Compiler | AMD clang                                                         |
|----------|-------------------------------------------------------------------|
| Version  | AMD clang version 10.0.0 (CLANG: AOCC_2.2.0-Build#93 2020_06_25)  |
+----------+-------------------------------------------------------------------+
Optimizing flags: -Ofast -fnt-store=aggressive -std=c99 -fopenmp
Remark: On this Rome system, a larger-than-default data set (20 GB instead of 4 GB) was used to rule out caching effects.
All results are in GB/s.
Summary results:
+-----------------------------------------------------------------+
| Single core   |  34.28 (STriad)                                 |
| Memory domain |  43.30 (Sum with 16 cores)                      |
| Socket        | 171.57 (Sum with 16 cores per memory domain)    |
| Node          | 338.88 (Update with 16 cores per memory domain) |
+-----------------------------------------------------------------+
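To put the summary numbers in context, they can be compared with a theoretical peak. The sketch below assumes a 64-bit (8-byte) DDR4 channel, giving 2 x 3200 MT/s x 8 B = 51.2 GB/s per memory domain; that peak is an assumption derived from the listed specs, not a measured value, and the variable names are illustrative.

```python
# Measured summary results (GB/s), copied from the summary table.
measured = {"Memory domain": 43.30, "Socket": 171.57, "Node": 338.88}

# Memory domains covered at each level: 4 per socket, 2 sockets per node.
domains = {"Memory domain": 1, "Socket": 4, "Node": 8}

# Assumed theoretical peak per memory domain (64-bit DDR4 channels).
PEAK_PER_DOMAIN = 2 * 3200e6 * 8 / 1e9  # 51.2 GB/s

# Fraction of the assumed peak reached at each level.
fraction = {
    level: bw / (domains[level] * PEAK_PER_DOMAIN)
    for level, bw in measured.items()
}

for level, value in fraction.items():
    print(f"{level:14s} {value:6.1%}")
```

With these assumptions, each level reaches roughly 83% to 85% of the computed peak.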
Results for scaling within a memory domain:
#nt Init Sum Copy Update Triad Daxpy STriad SDaxpy
1 23.42 8.16 33.47 32.14 33.56 33.27 34.28 33.88
2 23.42 16.13 40.56 41.25 40.53 39.68 40.00 39.35
3 23.44 24.04 40.67 39.39 39.37 37.92 38.23 37.40
4 23.42 31.51 40.00 38.61 38.42 36.90 37.32 36.36
5 23.43 36.40 39.69 38.42 38.14 36.75 37.11 36.15
6 23.44 35.69 39.11 38.05 37.44 36.10 36.51 35.56
7 23.44 36.00 38.59 37.44 36.83 35.47 35.96 35.00
8 23.45 36.78 38.05 36.92 36.27 34.95 35.44 34.49
9 26.11 39.46 39.11 38.32 37.63 36.57 36.80 36.05
10 28.31 40.39 39.58 38.86 38.42 37.56 37.66 37.05
11 30.29 40.70 39.99 39.47 39.12 38.46 38.38 37.92
12 32.30 41.36 40.31 39.91 39.66 39.17 38.93 38.65
13 34.22 42.05 40.64 40.42 40.22 39.85 39.52 39.35
14 36.03 42.66 40.78 40.75 40.58 40.36 39.92 39.85
15 37.66 43.09 40.80 40.88 40.79 40.69 40.14 40.19
16 39.11 43.30 40.74 40.86 40.84 40.85 40.21 40.34
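One way to read the table above is to ask how many threads each kernel needs before the memory domain saturates. The sketch below uses the Sum, Copy, and Triad columns copied from the table; the 95% threshold and the helper name `saturation` are arbitrary choices made for illustration.

```python
# Bandwidth in GB/s for 1..16 threads, copied from the table above.
results = {
    "Sum":   [8.16, 16.13, 24.04, 31.51, 36.40, 35.69, 36.00, 36.78,
              39.46, 40.39, 40.70, 41.36, 42.05, 42.66, 43.09, 43.30],
    "Copy":  [33.47, 40.56, 40.67, 40.00, 39.69, 39.11, 38.59, 38.05,
              39.11, 39.58, 39.99, 40.31, 40.64, 40.78, 40.80, 40.74],
    "Triad": [33.56, 40.53, 39.37, 38.42, 38.14, 37.44, 36.83, 36.27,
              37.63, 38.42, 39.12, 39.66, 40.22, 40.58, 40.79, 40.84],
}


def saturation(bw, threshold=0.95):
    """Return (threads at maximum, fewest threads reaching threshold * maximum)."""
    peak = max(bw)
    best_nt = bw.index(peak) + 1
    first_nt = next(i + 1 for i, v in enumerate(bw) if v >= threshold * peak)
    return best_nt, first_nt


for kernel, bw in results.items():
    best_nt, first_nt = saturation(bw)
    print(f"{kernel:6s} max {max(bw):.2f} GB/s at {best_nt} threads, "
          f"95% reached with {first_nt} threads")
```

For this data, Copy and Triad already come within 5% of their maximum with 2 threads, while Sum needs 12 threads to do so.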
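Results for scaling across memory domains. In each table below, rows give the number of cores used per memory domain (1 to 16, despite the #nm row label) and columns give the number of memory domains used (1 to 8). The first column of each table therefore matches the single-domain results above.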
Init:
#nm 1 2 3 4 5 6 7 8
1 23.42 46.80 70.20 93.61 116.81 140.21 163.33 186.71
2 23.42 46.82 70.17 93.62 116.99 140.37 163.66 186.86
3 23.44 46.85 70.21 93.68 117.03 140.48 163.60 186.99
4 23.42 46.83 70.12 93.63 117.03 140.22 163.42 186.97
5 23.43 46.86 70.24 93.67 117.07 140.45 163.61 187.13
6 23.44 46.86 70.26 93.72 117.12 140.45 163.88 187.24
7 23.44 46.88 70.30 93.74 117.16 140.60 163.96 187.38
8 23.45 46.89 70.33 93.76 117.19 140.57 163.94 187.42
9 26.11 52.20 77.61 103.47 129.32 155.15 180.97 206.74
10 28.31 56.28 84.40 112.49 140.60 168.64 196.69 224.91
11 30.29 60.55 90.76 121.05 151.34 181.42 211.60 241.80
12 32.30 64.48 96.75 128.97 161.31 193.35 225.73 257.81
13 34.22 68.35 102.53 136.58 170.71 204.93 239.13 273.26
14 36.03 71.95 107.94 144.08 179.85 216.01 251.73 287.91
15 37.66 75.21 112.84 150.47 188.07 225.62 263.23 300.96
16 39.11 77.91 116.95 156.07 195.04 234.14 272.70 311.39
Sum:
#nm 1 2 3 4 5 6 7 8
1 8.16 16.18 24.17 32.29 40.25 48.36 56.48 64.51
2 16.13 32.23 48.33 64.57 80.47 96.39 112.42 128.20
3 24.04 48.08 72.15 96.03 119.97 144.10 167.85 191.76
4 31.51 62.51 94.47 125.98 157.17 188.51 219.76 250.79
5 36.40 72.95 109.28 145.62 181.83 218.18 253.92 290.27
6 35.69 71.35 107.10 142.49 178.26 213.63 248.47 283.31
7 36.00 71.93 107.82 143.69 179.91 214.79 250.68 285.62
8 36.78 73.53 110.07 146.76 182.97 219.33 255.07 290.99
9 39.46 78.91 118.11 157.37 196.08 234.54 272.76 310.78
10 40.39 80.79 120.94 161.00 201.01 240.46 279.21 318.95
11 40.70 81.38 121.87 162.07 202.32 242.50 281.34 320.99
12 41.36 82.62 123.74 164.71 205.44 245.52 285.50 325.30
13 42.05 83.95 125.76 167.21 208.10 248.89 289.02 329.41
14 42.66 85.17 127.55 169.72 211.13 252.78 293.64 333.37
15 43.09 85.93 128.61 171.14 213.21 255.18 295.19 336.38
16 43.30 86.33 129.25 171.57 213.80 255.15 296.23 336.86
Copy:
#nm 1 2 3 4 5 6 7 8
1 33.47 65.72 98.72 134.01 166.29 197.55 228.82 259.89
2 40.56 81.02 121.53 161.79 202.20 242.45 283.13 322.62
3 40.67 81.24 122.03 162.61 203.09 243.69 284.12 324.14
4 40.00 79.92 119.84 159.79 199.55 239.05 278.52 317.78
5 39.69 79.30 119.09 158.51 197.86 237.57 276.90 315.90
6 39.11 78.11 117.28 156.32 195.17 234.26 273.25 311.62
7 38.59 77.14 115.71 154.25 192.60 231.05 269.28 307.35
8 38.05 76.15 114.09 152.09 189.82 227.73 265.38 303.09
9 39.11 78.22 117.37 156.33 195.20 233.89 272.85 311.33
10 39.58 79.16 118.62 158.18 197.61 236.59 275.47 314.54
11 39.99 79.95 119.95 159.70 199.52 239.03 278.54 318.17
12 40.31 80.54 120.89 161.01 200.86 240.71 280.51 319.87
13 40.64 81.24 121.81 162.35 202.66 242.91 283.15 322.65
14 40.78 81.53 122.24 162.75 203.25 243.50 283.65 323.49
15 40.80 81.56 122.30 162.90 203.41 243.66 284.17 323.90
16 40.74 81.38 122.09 162.56 203.08 243.23 283.37 323.38
Update:
#nm 1 2 3 4 5 6 7 8
1 32.14 64.33 95.57 128.60 160.72 190.89 222.81 256.62
2 41.25 82.58 123.52 164.81 205.70 247.46 289.00 329.62
3 39.39 78.76 118.38 158.02 197.89 237.18 276.75 316.25
4 38.61 77.26 116.14 154.66 193.32 232.05 270.46 308.92
5 38.42 76.98 115.74 154.28 193.22 232.44 271.33 310.38
6 38.05 76.15 114.57 152.86 191.43 230.39 268.81 308.06
7 37.44 74.94 112.73 150.53 188.30 226.36 264.44 303.16
8 36.92 73.96 111.04 148.40 185.76 223.08 260.63 298.21
9 38.32 76.75 115.41 153.85 192.52 231.71 270.41 309.85
10 38.86 77.88 117.03 156.31 195.53 235.15 274.93 315.11
11 39.47 79.05 118.84 158.64 198.61 238.89 279.14 319.47
12 39.91 79.93 120.17 160.56 201.14 241.60 282.40 323.30
13 40.42 81.06 121.99 163.16 204.68 246.34 288.20 330.82
14 40.75 81.69 122.97 164.49 206.24 248.34 290.81 333.33
15 40.88 82.01 123.41 165.13 207.02 249.33 291.69 334.77
16 40.86 81.96 123.73 165.81 208.43 251.34 294.79 338.88
Triad:
#nm 1 2 3 4 5 6 7 8
1 33.56 68.24 101.69 135.64 169.67 203.60 238.03 270.19
2 40.53 80.99 121.39 161.72 202.04 242.32 282.53 322.45
3 39.37 78.66 118.11 157.44 196.61 235.50 274.99 314.15
4 38.42 76.91 115.20 153.27 191.32 229.47 267.65 305.44
5 38.14 76.31 114.41 152.41 190.44 228.38 266.20 303.83
6 37.44 74.83 112.27 149.65 186.89 224.32 261.48 298.76
7 36.83 73.65 110.44 147.10 183.69 220.34 256.95 292.91
8 36.27 72.65 108.85 145.22 181.10 217.42 253.03 288.64
9 37.63 75.31 113.03 150.47 187.96 225.38 262.41 299.74
10 38.42 76.86 115.21 153.61 191.78 229.86 267.91 305.70
11 39.12 78.22 117.32 156.31 195.31 233.93 272.55 311.10
12 39.66 79.29 118.90 158.45 197.68 236.93 275.77 314.88
13 40.22 80.43 120.51 160.61 200.47 240.20 279.94 319.04
14 40.58 81.10 121.57 161.97 202.00 242.27 282.02 321.13
15 40.79 81.57 122.30 162.85 202.97 243.38 283.19 323.15
16 40.84 81.68 122.42 162.89 203.14 243.63 283.35 323.26
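The scaling across memory domains can be summarized as a parallel efficiency. The sketch below takes the last row of each table (16 cores per memory domain) and reads the columns as the number of memory domains used, an interpretation that is consistent with the summary results, where the socket and node figures appear in columns 4 and 8 of these rows. Efficiency is defined here as the measured bandwidth divided by n times the single-domain bandwidth; the names are illustrative.

```python
# Last row (16 cores per memory domain) of each table above, in GB/s.
# Column n is read as "n memory domains used" (assumed interpretation).
row16 = {
    "Init":   [39.11, 77.91, 116.95, 156.07, 195.04, 234.14, 272.70, 311.39],
    "Sum":    [43.30, 86.33, 129.25, 171.57, 213.80, 255.15, 296.23, 336.86],
    "Copy":   [40.74, 81.38, 122.09, 162.56, 203.08, 243.23, 283.37, 323.38],
    "Update": [40.86, 81.96, 123.73, 165.81, 208.43, 251.34, 294.79, 338.88],
    "Triad":  [40.84, 81.68, 122.42, 162.89, 203.14, 243.63, 283.35, 323.26],
}


def domain_efficiency(bw):
    """Efficiency per column: bw[n-1] / (n * bw[0]) for n = 1..len(bw)."""
    single = bw[0]
    return [value / (n * single) for n, value in enumerate(bw, start=1)]


efficiency = {kernel: domain_efficiency(bw) for kernel, bw in row16.items()}

for kernel, eff in efficiency.items():
    print(f"{kernel:6s} " + " ".join(f"{e:5.3f}" for e in eff))
```

For this data every kernel stays within about 4% of ideal linear scaling up to 8 memory domains. Update comes out slightly above 1.0, which simply means the 8-domain result is a little more than eight times the single-domain measurement.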
Memory bandwidth scaling within one memory domain:
The following plots illustrate the performance scaling over multiple memory domains using different numbers of cores per memory domain.
Memory bandwidth scaling across memory domains for Init:
Memory bandwidth scaling across memory domains for Sum:
Memory bandwidth scaling across memory domains for Copy:
Memory bandwidth scaling across memory domains for Triad: