AMD Rome S2 M2 C64
- Processor: AMD EPYC 7662 64-Core Processor
- Base frequency: 2.0 GHz
- Number of sockets: 2
- Number of memory domains per socket: 2
- Memory domain specs: 4-channel DDR4-3200
- Number of cores per socket: 64
- Number of HWThreads per core: 2
- MachineState output: NA
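The memory configuration listed above implies a theoretical peak bandwidth that the measured numbers can be compared against. The following is a minimal sketch of that arithmetic. It assumes each DDR4 channel is 64 bits (8 bytes) wide and that DDR4-3200 performs 3200 mega-transfers per second; the constant and function names are illustrative only.

```python
# Theoretical peak memory bandwidth derived from the listed memory configuration.
# Assumptions (not stated on this page): a DDR4 channel is 64 bits = 8 bytes wide,
# and DDR4-3200 performs 3200e6 transfers per second.

CHANNELS_PER_DOMAIN = 4        # "4-channel DDR4-3200"
TRANSFERS_PER_SECOND = 3200e6  # DDR4-3200
BYTES_PER_TRANSFER = 8         # assumed 64-bit channel width
DOMAINS_PER_SOCKET = 2
SOCKETS = 2


def peak_bandwidth_gbs(domains: int) -> float:
    """Return the theoretical peak bandwidth in GB/s for `domains` memory domains."""
    per_domain = CHANNELS_PER_DOMAIN * TRANSFERS_PER_SECOND * BYTES_PER_TRANSFER / 1e9
    return per_domain * domains


peak_domain = peak_bandwidth_gbs(1)
peak_socket = peak_bandwidth_gbs(DOMAINS_PER_SOCKET)
peak_node = peak_bandwidth_gbs(DOMAINS_PER_SOCKET * SOCKETS)

print(f"domain {peak_domain:.1f}  socket {peak_socket:.1f}  node {peak_node:.1f} GB/s")
```

Under these assumptions one memory domain peaks at 102.4 GB/s, so any measured value can be read as a fraction of that limit.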
+----------+-------------------------------------------------------------------+
| Compiler | AMD clang |
|----------|-------------------------------------------------------------------|
| Version | AMD clang version 10.0.0 (CLANG: AOCC_2.2.0-Build#93 2020_06_25) |
+----------+-------------------------------------------------------------------+
Optimizing flags: -Ofast -fnt-store=aggressive -std=c99 -fopenmp
Remark: On this Rome system, a larger-than-default data set (20 GB instead of 4 GB) was used to rule out caching effects.
All results are in GB/s
Summary results:
+---------------------------------------------+
| Single core | 36.46 (STriad) |
| Memory domain | 76.80 (Sum with 32 cores) |
| Socket | 152.80 (Sum with 32 cores per memory domain) |
| Node | 303.83 (Sum with 32 cores per memory domain) |
+---------------------------------------------+
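To put the summary numbers in relation to each other, the sketch below compares the socket and node results with ideal linear scaling from a single memory domain. The values are copied from the summary table. The domain counts follow from the machine description (2 memory domains per socket, 2 sockets). The dictionary and function names are illustrative.

```python
# Best measured bandwidth per topology level, copied from the summary table (GB/s).
MEASURED = {"memory domain": 76.80, "socket": 152.80, "node": 303.83}

# Number of memory domains contained in each level:
# 2 memory domains per socket, 2 sockets per node.
DOMAINS = {"memory domain": 1, "socket": 2, "node": 4}


def efficiency(level: str) -> float:
    """Measured bandwidth divided by ideal linear scaling from one memory domain."""
    ideal = DOMAINS[level] * MEASURED["memory domain"]
    return MEASURED[level] / ideal


for level in MEASURED:
    print(f"{level:13s} {MEASURED[level]:7.2f} GB/s  efficiency {efficiency(level):.3f}")
```

With these inputs the socket and node results both stay within about 1% of ideal scaling.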
Results for scaling within a memory domain:
#nt Init Sum Copy Update Triad Daxpy STriad SDaxpy
1 23.43 8.02 32.44 31.15 34.83 34.11 36.46 35.02
2 23.43 16.04 42.88 43.35 46.28 44.33 45.15 42.69
3 23.43 24.10 44.10 43.45 47.03 44.05 44.99 42.57
4 23.42 31.68 44.44 43.88 46.66 43.31 44.39 41.95
5 23.43 36.80 44.06 43.68 46.33 43.22 44.22 41.64
6 23.44 36.17 43.97 44.06 45.79 42.55 43.58 41.20
7 23.44 36.61 44.06 44.12 45.34 42.02 43.10 40.73
8 23.45 37.57 43.89 44.03 44.99 41.69 42.76 40.36
9 26.38 41.84 48.50 48.17 49.15 45.75 46.96 44.27
10 29.30 45.84 51.46 51.03 52.47 49.11 50.47 47.62
11 32.23 49.44 54.95 54.34 55.75 52.41 53.83 50.92
12 35.15 52.79 58.14 57.35 58.81 55.52 56.97 54.01
13 38.08 56.29 60.98 60.03 61.67 58.47 60.03 56.93
14 41.01 59.82 63.97 62.84 64.38 61.28 62.81 59.77
15 43.93 63.15 66.44 65.20 66.78 63.75 65.29 62.23
16 46.86 66.46 69.03 67.68 68.93 65.99 67.48 64.52
17 49.25 68.26 68.92 67.81 70.07 67.66 69.08 65.80
18 51.69 69.73 69.45 68.74 70.89 68.80 70.17 67.00
19 53.98 70.48 69.72 69.32 71.89 70.20 71.30 68.24
20 56.39 71.10 70.10 69.94 72.70 71.42 72.27 69.31
21 58.64 72.23 70.26 70.44 73.45 72.50 73.22 70.31
22 61.00 72.98 70.61 70.84 74.12 73.47 73.99 71.31
23 63.31 73.95 70.89 71.34 74.71 74.47 74.67 72.12
24 65.53 74.60 71.23 71.86 75.22 75.24 75.17 72.86
25 65.95 74.93 71.23 71.76 75.22 75.24 75.47 72.93
26 66.69 75.30 71.73 72.20 75.51 75.53 75.74 73.33
27 67.56 75.63 72.25 72.26 75.74 75.77 75.92 73.58
28 67.89 75.84 72.49 72.47 75.77 75.74 76.09 73.72
29 68.41 76.13 72.83 72.81 75.91 75.79 76.21 73.88
30 69.15 76.37 73.25 73.05 76.00 75.93 76.25 74.14
31 69.61 76.37 73.45 73.17 75.91 75.79 76.30 74.16
32 70.05 76.80 73.84 73.24 75.93 75.82 76.28 74.30
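One way to read the table above is to ask how many threads a kernel needs before the memory domain is close to saturation. The following sketch uses the Sum and Triad columns copied from the table and reports the smallest thread count that reaches a given fraction of the column maximum. The 90% threshold and the helper name `threads_to_reach` are arbitrary choices for illustration.

```python
# Sum and Triad columns of the within-domain scaling table (GB/s), threads 1..32.
SUM = [8.02, 16.04, 24.10, 31.68, 36.80, 36.17, 36.61, 37.57,
       41.84, 45.84, 49.44, 52.79, 56.29, 59.82, 63.15, 66.46,
       68.26, 69.73, 70.48, 71.10, 72.23, 72.98, 73.95, 74.60,
       74.93, 75.30, 75.63, 75.84, 76.13, 76.37, 76.37, 76.80]
TRIAD = [34.83, 46.28, 47.03, 46.66, 46.33, 45.79, 45.34, 44.99,
         49.15, 52.47, 55.75, 58.81, 61.67, 64.38, 66.78, 68.93,
         70.07, 70.89, 71.89, 72.70, 73.45, 74.12, 74.71, 75.22,
         75.22, 75.51, 75.74, 75.77, 75.91, 76.00, 75.91, 75.93]


def threads_to_reach(values, fraction):
    """Smallest thread count whose bandwidth is at least `fraction` of the column maximum."""
    threshold = fraction * max(values)
    for threads, bandwidth in enumerate(values, start=1):
        if bandwidth >= threshold:
            return threads
    raise ValueError("fraction must not exceed 1.0")


for name, column in (("Sum", SUM), ("Triad", TRIAD)):
    print(f"{name}: {threads_to_reach(column, 0.9)} threads reach 90% of the maximum")
```

With the data above, Sum reaches 90% of its maximum at 18 threads and Triad at 16 threads.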
Results for scaling across memory domains. Each row gives the number of cores used per memory domain, and each column gives the number of memory domains used (nm, 1 to 4).
Init:
#nm 1 2 3 4
1 23.43 46.83 70.14 93.48
2 23.43 46.79 70.28 93.52
3 23.43 46.79 70.23 93.56
4 23.42 46.79 70.11 93.49
5 23.43 46.79 70.22 93.57
6 23.44 46.84 70.24 93.68
7 23.44 46.87 70.27 93.72
8 23.45 46.89 70.32 93.75
9 26.38 52.74 79.11 105.46
10 29.30 58.60 87.88 117.14
11 32.23 64.46 96.66 128.88
12 35.15 70.29 105.43 140.54
13 38.08 76.15 114.19 152.23
14 41.01 82.00 122.95 163.95
15 43.93 87.85 131.72 175.63
16 46.86 93.69 140.53 187.32
17 49.25 97.93 146.95 195.80
18 51.69 103.06 154.67 205.88
19 53.98 108.14 161.84 215.76
20 56.39 112.76 169.07 225.61
21 58.64 117.19 175.29 233.71
22 61.00 121.92 182.48 243.36
23 63.31 126.48 189.99 252.63
24 65.53 130.81 196.11 261.24
25 65.95 132.05 198.21 263.24
26 66.69 133.57 200.16 266.78
27 67.56 135.00 202.02 268.81
28 67.89 136.17 204.18 271.90
29 68.41 136.78 205.31 272.84
30 69.15 138.11 206.58 275.55
31 69.61 139.48 208.37 276.81
32 70.05 139.65 209.12 278.45
Sum:
#nm 1 2 3 4
1 8.02 15.97 23.94 31.93
2 16.04 31.96 48.11 64.35
3 24.10 48.31 72.63 96.81
4 31.68 63.86 95.65 127.42
5 36.80 73.71 110.42 147.36
6 36.17 72.26 108.45 144.65
7 36.61 73.14 109.67 146.18
8 37.57 75.10 112.66 150.08
9 41.84 83.67 125.47 167.01
10 45.84 91.72 137.30 182.81
11 49.44 98.85 148.19 197.11
12 52.79 105.44 158.10 210.70
13 56.29 112.61 168.73 224.62
14 59.82 119.50 178.96 238.15
15 63.15 126.32 189.55 251.52
16 66.46 132.61 198.63 263.72
17 68.26 136.29 204.32 271.52
18 69.73 139.15 208.28 276.65
19 70.48 140.90 210.56 280.37
20 71.10 142.07 213.11 283.62
21 72.23 144.46 215.93 286.20
22 72.98 145.75 218.58 290.57
23 73.95 147.51 221.02 292.94
24 74.60 148.80 222.67 296.83
25 74.93 149.67 224.18 296.61
26 75.30 150.51 224.68 298.38
27 75.63 150.99 225.52 299.40
28 75.84 151.01 226.69 300.89
29 76.13 152.24 227.41 302.27
30 76.37 152.57 227.93 303.03
31 76.37 152.59 228.29 302.78
32 76.80 152.80 228.01 303.83
Copy:
#nm 1 2 3 4
1 32.44 64.91 97.27 130.06
2 42.88 85.54 128.59 171.32
3 44.10 87.94 132.05 175.74
4 44.44 88.67 133.38 177.41
5 44.06 87.97 132.19 175.98
6 43.97 87.99 131.99 175.84
7 44.06 88.09 132.05 175.90
8 43.89 87.77 131.87 175.33
9 48.50 97.05 145.63 193.89
10 51.46 102.99 154.26 205.53
11 54.95 109.93 164.95 219.77
12 58.14 115.92 173.92 231.82
13 60.98 122.35 182.71 243.75
14 63.97 127.85 191.60 254.88
15 66.44 133.14 198.90 265.74
16 69.03 137.94 206.47 274.53
17 68.92 137.55 205.71 274.51
18 69.45 138.83 207.67 276.37
19 69.72 139.61 208.84 277.77
20 70.10 140.01 210.09 279.47
21 70.26 140.51 210.42 280.25
22 70.61 141.19 211.48 281.46
23 70.89 141.71 212.10 282.85
24 71.23 142.16 213.14 283.83
25 71.23 142.77 213.06 283.41
26 71.73 143.36 214.89 285.74
27 72.25 144.32 215.80 287.52
28 72.49 144.91 217.45 288.71
29 72.83 145.38 218.19 290.62
30 73.25 146.32 219.26 291.85
31 73.45 147.11 220.15 292.31
32 73.84 147.20 220.37 293.23
Update:
#nm 1 2 3 4
1 31.15 62.38 92.85 124.79
2 43.35 86.35 129.44 172.89
3 43.45 86.76 130.26 173.78
4 43.88 87.47 131.56 174.88
5 43.68 87.45 131.24 175.01
6 44.06 88.08 132.14 176.74
7 44.12 88.31 132.51 176.58
8 44.03 88.05 132.52 176.56
9 48.17 96.27 144.89 193.10
10 51.03 102.24 153.22 205.11
11 54.34 108.84 163.32 218.44
12 57.35 114.35 172.29 229.58
13 60.03 120.37 180.56 241.84
14 62.84 125.66 189.04 252.01
15 65.20 130.53 196.17 262.29
16 67.68 135.67 203.94 272.84
17 67.81 136.08 204.72 274.33
18 68.74 137.92 207.40 277.22
19 69.32 139.18 209.22 279.84
20 69.94 140.19 210.80 282.97
21 70.44 141.46 212.90 285.47
22 70.84 142.49 214.83 287.67
23 71.34 143.39 216.47 290.02
24 71.86 143.66 216.53 290.41
25 71.76 144.63 217.40 290.96
26 72.20 145.06 218.54 292.76
27 72.26 145.69 219.00 293.74
28 72.47 145.67 219.70 294.41
29 72.81 146.36 221.50 297.97
30 73.05 147.34 222.65 300.04
31 73.17 147.57 223.15 300.45
32 73.24 147.94 223.62 300.03
Triad:
#nm 1 2 3 4
1 34.83 70.94 104.48 141.48
2 46.28 92.61 138.83 185.13
3 47.03 94.04 141.14 188.22
4 46.66 93.42 140.06 186.59
5 46.33 92.86 139.27 185.45
6 45.79 91.45 137.37 183.43
7 45.34 90.58 135.86 181.12
8 44.99 89.91 135.01 179.89
9 49.15 98.18 147.38 196.46
10 52.47 104.96 157.29 210.12
11 55.75 111.51 167.08 223.16
12 58.81 117.59 176.43 235.19
13 61.67 123.37 184.93 246.73
14 64.38 128.74 193.04 257.37
15 66.78 133.58 200.32 266.54
16 68.93 137.78 206.44 274.71
17 70.07 139.98 209.98 279.57
18 70.89 141.78 212.49 283.00
19 71.89 143.65 215.67 286.94
20 72.70 145.36 217.94 290.24
21 73.45 146.81 219.86 292.50
22 74.12 148.24 222.03 295.46
23 74.71 149.48 223.82 297.64
24 75.22 150.42 224.99 299.68
25 75.22 150.52 225.37 300.18
26 75.51 151.01 226.16 300.82
27 75.74 151.41 226.91 301.69
28 75.77 151.54 226.93 301.92
29 75.91 151.77 227.51 302.43
30 76.00 151.92 227.46 303.10
31 75.91 151.88 227.28 302.30
32 75.93 151.81 227.02 302.90
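The cross-domain tables above share one plain-text layout: a header line starting with `#`, followed by one row per core count with one bandwidth value per number of memory domains. The following sketch parses that layout and computes, for each row, the bandwidth relative to ideal linear scaling from one domain. It uses a four-row excerpt of the Triad table as input; `parse_table` and `scaling_efficiency` are hypothetical helper names, not part of the benchmark.

```python
# Excerpt of the Triad cross-domain table (rows for 1, 8, 16 and 32 cores per domain).
TABLE = """\
#nm 1 2 3 4
1 34.83 70.94 104.48 141.48
8 44.99 89.91 135.01 179.89
16 68.93 137.78 206.44 274.71
32 75.93 151.81 227.02 302.90
"""


def parse_table(text):
    """Map cores per domain -> list of GB/s values, one per number of memory domains."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and the header line
        rows[int(fields[0])] = [float(value) for value in fields[1:]]
    return rows


def scaling_efficiency(values):
    """Bandwidth with n domains divided by n times the one-domain bandwidth."""
    base = values[0]
    return [value / (n * base) for n, value in enumerate(values, start=1)]


for cores, values in parse_table(TABLE).items():
    formatted = ", ".join(f"{e:.3f}" for e in scaling_efficiency(values))
    print(f"{cores:2d} cores/domain: {formatted}")
```

Values close to 1.0 indicate that adding memory domains scales the bandwidth almost linearly for that core count.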
Figure: Memory bandwidth scaling within one memory domain
The following plots illustrate the performance scaling over multiple memory domains using different numbers of cores per memory domain.
Figure: Memory bandwidth scaling across memory domains for Init
Figure: Memory bandwidth scaling across memory domains for Sum
Figure: Memory bandwidth scaling across memory domains for Copy
Figure: Memory bandwidth scaling across memory domains for Triad