[FeatureRequest] Graviton 3 support #471

Closed
jlinford opened this issue Jun 23, 2022 · 8 comments

Comments

@jlinford

Is your feature request related to a problem? Please describe.
Please add Graviton 3 support

Describe the solution you'd like
Commands like likwid-perfctr -e should work on Graviton 3.

Describe alternatives you've considered
Graviton 2 works great, but I was hoping for Graviton 3 support.

Additional context
Currently, the tool reports Graviton 3 as an "Unsupported ARMv8 Processor":

[jlinford@c7g-dy-c7g16xlarge-1 likwid-20220602]$ ./bin/likwid-perfctr -e
ERROR - [./src/perfmon.c:perfmon_init_maps:1368] Unsupported ARMv8 Processor
ERROR - [./src/perfmon.c:perfmon_check_counter_map:748] Counter and event maps not initialized.
This architecture has 0 counters.
Counter tags(name, type<, options>):



This architecture has 0 events.
Event tags (tag, id, umask, counters<, options>):
@jlinford
Author

For a quick fix, you can make V1 look like N1. Yes, SVE counters and others are missing, but this seems to work otherwise:

[jlinford@c7g-dy-c7g16xlarge-2 likwid]$ git diff -p
diff --git a/config.mk b/config.mk
index d7cca276..221be518 100644
--- a/config.mk
+++ b/config.mk
@@ -8,10 +8,10 @@
 # configuration options setup steps.
 # Supported: GCC, CLANG, ICC, MIC (ICC), GCCX86 (for 32bit systems)
 # GCCARMv8, GCCARMv7 and GCCPOWER
-COMPILER = GCC#NO SPACE
+COMPILER = GCCARMv8#NO SPACE

 # Path were to install likwid
-PREFIX ?= /usr/local#NO SPACE
+PREFIX ?= /ebs/users/jlinford/opt/likwid-20220602#NO SPACE

 # Set the default mode for MSR access.
 # This can usually be overriden on the commandline.
diff --git a/make/include_GCCARMv8.mk b/make/include_GCCARMv8.mk
index b31da1ac..42e34150 100644
--- a/make/include_GCCARMv8.mk
+++ b/make/include_GCCARMv8.mk
@@ -1,5 +1,5 @@
-CC  = gcc
-FC  = gfortran
+CC  = gcc10-gcc
+FC  = gcc10-gfortran
 AS  = as
 AR  = ar
 PAS = ./perl/AsmGen.pl
@@ -12,7 +12,7 @@ ANSI_CFLAGS   =
 #ANSI_CFLAGS += -Wextra
 #ANSI_CFLAGS += -Wall

-CFLAGS   = -march=armv8-a -mtune=cortex-a57 -mabi=lp64 -O2 -std=c99 -Wno-format -fPIC
+CFLAGS   = -mcpu=neoverse-v1 -mabi=lp64 -O2 -std=c99 -Wno-format -fPIC
 #FCFLAGS  = -module ./  # ifort
 FCFLAGS  = -J ./  -fsyntax-only  #gfortran
 PASFLAGS  = ARMv8
diff --git a/src/includes/topology.h b/src/includes/topology.h
index 2eef5102..d1382dd2 100644
--- a/src/includes/topology.h
+++ b/src/includes/topology.h
@@ -165,6 +165,7 @@ struct topology_functions {
 #define  NV_DENVER2    0x03U
 #define  APP_XGENE1    0x00U
 #define  ARM_NEOVERSE_N1 0xD0CU
+#define  ARM_NEOVERSE_V1 0xD40U
 #define  FUJITSU_A64FX 0x001U

 /* ARM vendors */
diff --git a/src/perfmon.c b/src/perfmon.c
index b367071a..b3408bd9 100644
--- a/src/perfmon.c
+++ b/src/perfmon.c
@@ -1364,6 +1364,15 @@ perfmon_init_maps(void)
                             perfmon_numCounters = perfmon_numCountersNeoN1;
                             translate_types = neon1_translate_types;
                             break;
+                        case ARM_NEOVERSE_V1:
+                           // FIXME: This pretends V1 is an N1
+                            eventHash = neon1_arch_events;
+                            perfmon_numArchEvents = perfmon_numArchEventsNeoN1;
+                            counter_map = neon1_counter_map;
+                            box_map = neon1_box_map;
+                            perfmon_numCounters = perfmon_numCountersNeoN1;
+                            translate_types = neon1_translate_types;
+                            break;
                         default:
                             ERROR_PLAIN_PRINT(Unsupported ARMv8 Processor);
                             break;
diff --git a/src/topology.c b/src/topology.c
index 6e4c8819..0c3bc402 100644
--- a/src/topology.c
+++ b/src/topology.c
@@ -123,6 +123,7 @@ static char* arm_cortex_a53 = "ARM Cortex A53";
 static char* arm_cortex_a72 = "ARM Cortex A72";
 static char* arm_cortex_a73 = "ARM Cortex A73";
 static char* arm_neoverse_n1 = "ARM Neoverse N1";
+static char* arm_neoverse_v1 = "ARM Neoverse V1";
 static char* fujitsu_a64fx = "Fujitsu A64FX";
 static char* power7_str = "POWER7 architecture";
 static char* power8_str = "POWER8 architecture";
@@ -177,6 +178,7 @@ static char* short_arm8 = "arm8";
 static char* short_arm8_cav_tx2 = "arm8_tx2";
 static char* short_arm8_cav_tx = "arm8_tx";
 static char* short_arm8_neo_n1 = "arm8_n1";
+static char* short_arm8_neo_v1 = "arm8_v1";
 static char* short_a64fx = "arm64fx";

 static char* short_power7 = "power7";
@@ -1188,6 +1190,10 @@ topology_setName(void)
                             cpuid_info.name = arm_neoverse_n1;
                             cpuid_info.short_name = short_arm8_neo_n1;
                             break;
+                        case ARM_NEOVERSE_V1:
+                            cpuid_info.name = arm_neoverse_v1;
+                            cpuid_info.short_name = short_arm8_neo_v1;
+                            break;
                         default:
                             return EXIT_FAILURE;
                             break;
[jlinford@c7g-dy-c7g16xlarge-2 likwid]$
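
The idea behind the quick fix above can be sketched as a small lookup table: the V1 part number gets its own names, but its event tables are borrowed from N1. This is only an illustration under assumptions. `struct core_entry`, `known_cores`, `lookup_core` and `uses_borrowed_events` are hypothetical names and not part of LIKWID, which uses the `switch` statements shown in the diff. Only the part numbers and name strings are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Part numbers as they appear in the patch to src/includes/topology.h. */
#define ARM_NEOVERSE_N1 0xD0CU
#define ARM_NEOVERSE_V1 0xD40U

/* Hypothetical record, not a LIKWID type: one entry per known core. */
struct core_entry {
    uint32_t part;           /* "CPU part" value reported by the kernel */
    const char *name;        /* long name, as in the topology.c patch */
    const char *short_name;  /* short name, as in the topology.c patch */
    uint32_t events_from;    /* part whose event tables are reused */
};

static const struct core_entry known_cores[] = {
    { ARM_NEOVERSE_N1, "ARM Neoverse N1", "arm8_n1", ARM_NEOVERSE_N1 },
    /* Interim alias: V1 borrows the N1 tables, so SVE events are missing. */
    { ARM_NEOVERSE_V1, "ARM Neoverse V1", "arm8_v1", ARM_NEOVERSE_N1 },
};

/* Returns the entry for `part`, or NULL for an unsupported processor. */
const struct core_entry *lookup_core(uint32_t part)
{
    size_t n = sizeof(known_cores) / sizeof(known_cores[0]);
    for (size_t i = 0; i < n; i++) {
        if (known_cores[i].part == part) {
            return &known_cores[i];
        }
    }
    return NULL;
}

/* Returns 1 if the core reuses another core's event tables, else 0. */
int uses_borrowed_events(const struct core_entry *entry)
{
    assert(entry != NULL);
    return entry->events_from != entry->part;
}
```

A table like this makes the temporary aliasing explicit, so it is easy to find and remove once native V1 event lists exist.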

@TomTheBear
Member

Hi John,
We were asked by AWS at ISC22 to add and evaluate the Graviton 3. So, it's on our list.
You did almost everything needed to add it to LIKWID; the only things missing are some events, so adding it natively shouldn't take much effort. Of course, we would need to check whether there are any peculiarities (a separate memory controller or other unit, energy measurements through some arch-specific access as on Marvell Thunder X2, CPU frequency control, ...). Does PR #444 contain the additional events?

I'm currently not happy with include_GCCARMv8 because it is quite specific with the -mcpu and -march flags. I'm trying to find a more general way, like a basic -march=armv8+fp+... and some logic in the build system to change this depending on the system. It makes LIKWID somewhat system-specific, but that is better than requiring users to adapt the build manually. Manual adaptation would still be required for exotic ARM architectures, but the common ones should work.

(Could you supply the missing info in #444? Then I could add both architectures in one release.)

@TomTheBear
Member

See #472

@jlinford
Author

In addition to include_GCCARMv8, maybe have an include_GCCARM with -mcpu=native and no -march or -mtune flags. Keep the more arch-specific -march=armv8+fp... flags in include_GCCARMv8 for people who need them. This also sets a precedent for having a general "native" config by default, with alternate arch-specific configs as needed. Thoughts?

@jlinford
Author

jlinford commented Jun 27, 2022

Graviton 3 uses the Neoverse V1 core, so you'll want to use these PMU events instead of the N2 events listed in Issue #444:
https://developer.arm.com/documentation/101427/0101/Debug-descriptions/Performance-Monitoring-Unit/PMU-events?lang=en

I admit I have not read both lists and confirmed that they are different :-D

@jlinford
Author

jlinford commented Jun 27, 2022

From a c7g.16xlarge instance (biggest Graviton 3)

# /proc/cpuinfo
processor       : 63
BogoMIPS        : 2100.00
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs paca pacg dcpodp svei8mm svebf16 i8mm bf16 dgh rng
CPU implementer : 0x41
CPU architecture: 8
CPU variant     : 0x1
CPU part        : 0xd40
CPU revision    : 1

# ls /sys/devices

ARMH0061:00
armv8_pmuv3_0
breakpoint
kprobe
LNXSYSTM:00
pci0000:00
platform
pnp0
software
system
tracepoint
uprobe
virtual
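
The fields a tool needs from the listing above are "CPU implementer" (0x41) and "CPU part" (0xd40). As a minimal sketch of how such values could be extracted from /proc/cpuinfo text, assuming the "key : value" layout shown above, the helper below scans a buffer for a key and parses the number after the colon. `cpuinfo_field` is a hypothetical name and is not LIKWID's actual topology code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Finds the first line in `text` that starts with `key` (for example
 * "CPU part") and parses the number after the colon. Returns 0 on success
 * and -1 if the key is missing or the value is not a number.
 */
int cpuinfo_field(const char *text, const char *key, unsigned long *out)
{
    assert(text != NULL && key != NULL && out != NULL);
    size_t keylen = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        const char *eol = strchr(line, '\n');
        if (strncmp(line, key, keylen) == 0) {
            const char *colon = strchr(line, ':');
            if (colon != NULL && (eol == NULL || colon < eol)) {
                char *end = NULL;
                /* Base 0 accepts both "0xd40" and plain decimal "8". */
                unsigned long value = strtoul(colon + 1, &end, 0);
                /* Reject empty values and parses that ran past this line. */
                if (end != colon + 1 && (eol == NULL || end <= eol)) {
                    *out = value;
                    return 0;
                }
            }
            return -1;
        }
        /* Advance to the start of the next line, if there is one. */
        line = (eol != NULL) ? eol + 1 : NULL;
    }
    return -1;
}
```

Called with the text above and the key "CPU part", the helper yields the same 0xD40 value that the patch defines as ARM_NEOVERSE_V1.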

@TomTheBear
Member

There seems to be a misunderstanding. #444 should add general support for ARM Neoverse N2. The PR is not related to AWS Graviton 3. I need the /proc/cpuinfo contents and a directory listing of /sys/devices from an ARM Neoverse N2 system to complete the PR.

For AWS Graviton 3, this info is already encoded in your patch and in PR #472 (topology detection and perf unit). I'll check the event list in the PR to use the V1 list.

Providing a general include_GCCARM.mk is difficult because the include file sets some additional info (e.g. PASFLAGS, used for likwid-bench) and the COMPILER name is used to control the build (excluding x86 code when building for ARM). But we could probably assume that include_GCCARM.mk relates to an ARMv8 architecture (ARMv7 is rarely used nowadays). I could use -mcpu=native in include_GCCARMv8.mk directly. That seems to be the most portable solution in the end.
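
One way to see what a given -march or -mcpu choice actually enabled is to look at the compiler's predefined macros. The sketch below relies on `__aarch64__` and the ACLE feature macro `__ARM_FEATURE_SVE`, which GCC and Clang define when the target has SVE enabled. The function names `built_with_sve` and `build_target` are hypothetical helpers for illustration and do not exist in LIKWID. The expectation that -mcpu=neoverse-v1 (or -mcpu=native on a Graviton 3) turns SVE on, while plain -march=armv8-a does not, is an assumption worth checking against the compiler in use.

```c
#include <stddef.h>

/* Returns 1 if this translation unit was compiled with SVE enabled. */
int built_with_sve(void)
{
#if defined(__ARM_FEATURE_SVE)
    return 1;
#else
    return 0;
#endif
}

/* Returns a short label for the target the compiler was asked to build for. */
const char *build_target(void)
{
#if defined(__aarch64__) && defined(__ARM_FEATURE_SVE)
    return "aarch64+sve";
#elif defined(__aarch64__)
    return "aarch64";
#else
    return "non-aarch64";
#endif
}
```

Compiling this once with the old flags and once with the new ones and comparing the results is a cheap check that a flag change in the include file has the intended effect.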

@TomTheBear
Member

AWS Graviton 3 support has been merged into the master branch.
