AMD Interlagos S2 M2 C16
- Processor: AMD Opteron(TM) Processor 6276
- Base frequency: 2.3 GHz
- Number of sockets: 2
- Number of memory domains per socket: 2
- Memory domain specs: 2-channel DDR3-1866
- Number of cores per socket: 16
- Number of HWThreads per core: 1
- MachineState output:
+----------+---------------------------------+
| Compiler | icc (ICC) |
|----------|---------------------------------|
| Version | icc (ICC) 19.0.5.281 20190815 |
+----------+---------------------------------+
Optimizing flags: -fast -xHost -qopt-streaming-stores=always -std=c99 -ffreestanding -qopenmp
All results are in GB/s.
Summary results:
+--------------------------------------------+
| Single core | 9.54 (Copy) |
| Memory domain | 16.92 (Sum with 6 cores) |
| Socket | 33.78 (Sum with 8 cores per memory domain) |
| Node | 67.63 (Sum with 7 cores per memory domain) |
+--------------------------------------------+
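For context, the measured memory-domain figure can be compared with the nominal peak of the memory interface listed in the hardware description. The sketch below takes the listed 2-channel DDR3-1866 configuration at face value (1866 MT/s per channel) and assumes the standard 64-bit (8-byte) DDR3 channel width. The function name `peak_bandwidth_gbs` is illustrative and not part of the benchmark.

```python
# Nominal peak bandwidth of one memory domain vs. the measured maximum.
# Assumption: each DDR3 channel is 64 bits (8 bytes) wide.

def peak_bandwidth_gbs(channels, mega_transfers, bytes_per_transfer=8):
    """Return the nominal peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return channels * mega_transfers * 1e6 * bytes_per_transfer / 1e9

# 2-channel DDR3-1866 per memory domain, as listed in the hardware description
domain_peak = peak_bandwidth_gbs(channels=2, mega_transfers=1866)

# Best measured value for one memory domain (Sum with 6 cores), in GB/s
measured_domain = 16.92

fraction = measured_domain / domain_peak
print(f"nominal peak per memory domain: {domain_peak:.2f} GB/s")
print(f"measured fraction of peak:      {fraction:.0%}")
```

The same helper can be applied to the socket or node level by multiplying the number of channels accordingly.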
Results for scaling within a memory domain:
#nt Init Sum Copy Update Triad Daxpy STriad SDaxpy
1 8.48 8.91 9.54 7.37 9.31 9.33 8.78 8.55
2 8.22 10.39 12.00 7.35 10.49 9.97 9.21 9.34
3 10.89 13.99 14.68 10.33 12.76 13.18 11.80 11.86
4 13.15 16.67 16.13 12.68 13.83 14.81 13.23 13.19
5 14.30 16.73 16.38 14.10 14.54 15.04 14.18 14.17
6 14.99 16.92 16.28 15.25 14.54 14.82 13.53 14.33
7 15.17 16.89 16.54 15.86 15.48 15.47 14.59 14.99
8 15.22 16.84 16.24 15.85 14.22 15.01 13.98 14.56
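To read the saturation behaviour off this table, one can compute for each kernel the core count at which the bandwidth peaks and the gain over a single core. This is a minimal sketch using only the Sum and Copy columns from the table above; the variable names are illustrative.

```python
# Bandwidth in GB/s for 1..8 cores within one memory domain,
# copied from the Sum and Copy columns of the table above.
results = {
    "Sum":  [8.91, 10.39, 13.99, 16.67, 16.73, 16.92, 16.89, 16.84],
    "Copy": [9.54, 12.00, 14.68, 16.13, 16.38, 16.28, 16.54, 16.24],
}

summary = {}
for kernel, bw in results.items():
    best = max(bw)
    best_cores = bw.index(best) + 1   # core counts start at 1
    gain = best / bw[0]               # relative to a single core
    summary[kernel] = (best_cores, best, gain)
    print(f"{kernel}: peak {best:.2f} GB/s at {best_cores} cores, "
          f"{gain:.2f}x over one core")
```

For both kernels the gain over a single core stays below 2x, and the values change little beyond four to five cores, which suggests that a few cores already saturate one memory domain.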
Results for scaling across memory domains. Columns give the number of memory domains used (nm); rows give the number of cores used per memory domain.
Init:
#nm 1 2 3 4
1 8.48 16.86 24.98 33.08
2 8.22 16.05 24.40 32.37
3 10.89 21.91 32.69 42.99
4 13.15 25.99 38.93 51.61
5 14.30 28.39 42.70 56.67
6 14.99 29.93 43.06 58.57
7 15.17 30.72 45.15 60.46
8 15.22 30.50 45.26 59.89
Sum:
#nm 1 2 3 4
1 8.91 17.82 26.66 35.56
2 10.39 20.75 31.21 41.40
3 13.99 27.95 41.99 55.95
4 16.67 33.33 50.01 66.70
5 16.73 33.23 49.86 67.40
6 16.92 33.70 50.57 67.27
7 16.89 33.75 50.73 67.63
8 16.84 33.78 50.62 67.39
Copy:
#nm 1 2 3 4
1 9.54 19.02 28.52 37.96
2 12.00 24.32 35.35 47.74
3 14.68 29.42 43.83 58.51
4 16.13 31.69 49.05 64.00
5 16.38 32.34 47.97 66.10
6 16.28 33.59 49.61 66.13
7 16.54 32.18 48.41 64.30
8 16.24 32.68 48.07 63.82
Update:
#nm 1 2 3 4
1 7.37 14.73 22.04 29.45
2 7.35 14.65 21.94 29.18
3 10.33 20.61 31.02 41.19
4 12.68 24.99 37.92 50.51
5 14.10 27.77 41.15 57.23
6 15.25 32.10 46.72 62.42
7 15.86 30.95 46.90 62.88
8 15.85 32.12 47.53 63.45
Triad:
#nm 1 2 3 4
1 9.31 18.80 28.21 37.42
2 10.49 21.00 30.99 40.74
3 12.76 25.44 37.88 50.70
4 13.83 27.35 42.33 55.73
5 14.54 28.35 41.80 59.22
6 14.54 30.93 44.85 62.15
7 15.48 29.26 44.58 60.87
8 14.22 30.75 45.90 60.42
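The cross-domain tables can be condensed into a scaling efficiency: the bandwidth measured on nm memory domains divided by nm times the single-domain bandwidth at the same number of cores per domain. The sketch below does this for the rows with 8 cores per domain in the Sum and Triad tables. The helper `scaling_efficiency` is hypothetical and not part of the benchmark.

```python
def scaling_efficiency(row):
    """row[i] is the bandwidth in GB/s measured with i+1 memory domains."""
    base = row[0]
    return [bw / ((i + 1) * base) for i, bw in enumerate(row)]

# Rows for 8 cores per memory domain, copied from the tables above.
rows_8_cores = {
    "Sum":   [16.84, 33.78, 50.62, 67.39],
    "Triad": [14.22, 30.75, 45.90, 60.42],
}

efficiency = {k: scaling_efficiency(v) for k, v in rows_8_cores.items()}
for kernel, eff in efficiency.items():
    print(f"{kernel:6s}", " ".join(f"{e:.2f}" for e in eff))
```

An efficiency close to 1.0 means the bandwidth grows linearly with the number of memory domains. Values slightly above 1.0 can occur because the single-domain baseline is itself a measurement with some run-to-run variation.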
Memory bandwidth scaling within one memory domain:
The following plots illustrate the performance scaling over multiple memory domains using different numbers of cores per memory domain.
Memory bandwidth scaling across memory domains for Init:
Memory bandwidth scaling across memory domains for Sum:
Memory bandwidth scaling across memory domains for Copy:
Memory bandwidth scaling across memory domains for Triad: