Environment variables used by AdaptiveCpp

  • ACPP_DEBUG_LEVEL: If set, overrides the output verbosity. Levels: 0 (none), 1 (error), 2 (warning), 3 (info), 4 (verbose). The default is the value of the HIPSYCL_DEBUG_LEVEL macro.
  • ACPP_VISIBILITY_MASK: Activates only a subset of backends. Syntax: backend;backend2;... Possible values are omp (OpenMP), cuda, hip, ocl (OpenCL) and ze (Level Zero). omp is always active because a CPU backend is required. For most backends, device-level visibility currently has to be set via vendor-specific variables, including {CUDA,HIP}_VISIBLE_DEVICES and ZE_AFFINITY_MASK. Certain backends, particularly ocl, support device-level visibility specifications. Instead of numbers, strings can also be passed, in which case a device matches if the platform or device name contains the given string. * acts as a wildcard. Examples:
    • omp;ocl:0,4 exposes OpenCL devices 0 and 4.
    • omp;ocl:0.0,3.0 exposes device 0 from platform 0 and device 0 from platform 3.
    • omp;ocl:Intel.0 exposes the first device from platforms containing "Intel" in their name.
    • omp;ocl:Graphics.* exposes all devices from platforms containing "Graphics" in their name.
    • omp;ocl:CPU exposes all devices containing "CPU" in their name.
  • ACPP_RT_DAG_REQ_OPTIMIZATION_DEPTH: maximum depth when descending the DAG requirement tree to look for DAG optimization opportunities, such as eliding unnecessary dependencies.
  • ACPP_RT_MQE_LANE_STATISTICS_MAX_SIZE: For the multi_queue_executor, the maximum number of entries in the lane statistics, i.e. the maximum number of submissions to retain statistical information about. This information is used to estimate execution lane utilization.
  • ACPP_RT_MQE_LANE_STATISTICS_DECAY_TIME_SEC: The time in seconds (floating point value) after which to forget information about old submissions.
  • ACPP_RT_SCHEDULER: Set scheduler type. Allowed values:
    • direct is a low-latency direct-submission scheduler.
    • unbound is the default scheduler and supports automatic work distribution across multiple devices. If the HIPSYCL_EXT_MULTI_DEVICE_QUEUE extension is used, the scheduler must be unbound.
  • ACPP_DEFAULT_SELECTOR_BEHAVIOR: Set behavior of default selector. Allowed values:
    • strict (default): Strictly behave as defined by the SYCL specification
    • multigpu: Makes default selector behave like a multigpu selector from the HIPSYCL_EXT_MULTI_DEVICE_QUEUE extension
    • system: Makes default selector behave like a system selector from the HIPSYCL_EXT_MULTI_DEVICE_QUEUE extension
  • ACPP_HCF_DUMP_DIRECTORY: If set, AdaptiveCpp will dump all embedded HCF data files into this directory. HCF is AdaptiveCpp's container format, which all compilation flows fully controlled by AdaptiveCpp use to store kernel code.
  • ACPP_PERSISTENT_RUNTIME: If set to 1, AdaptiveCpp will use a persistent runtime that stays alive even when no SYCL objects are currently in use in the application. This can be helpful if the application consists of multiple distinct phases in which SYCL is used, which would otherwise cause the runtime to be launched multiple times.
  • ACPP_RT_MAX_CACHED_NODES: Maximum number of nodes that the runtime buffers before flushing work.
  • ACPP_SSCP_FAILED_IR_DUMP_DIRECTORY: If non-empty, AdaptiveCpp will dump the IR of code that fails SSCP JIT compilation into this directory.
  • ACPP_RT_GC_TRIGGER_BATCH_SIZE: Number of nodes in flight that triggers a garbage collection job to be spawned.
  • ACPP_RT_OCL_NO_SHARED_CONTEXT: If set to a non-zero value, instructs the OpenCL backend to not attempt to construct a shared context across devices within a platform. This can be necessary on OpenCL implementations that do not support this. Note that if shared contexts are unavailable, support for data transfers between devices might be limited as the devices can no longer directly talk to each other.
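
As an illustration of how these variables might be combined, the following Python sketch composes an ACPP_VISIBILITY_MASK value from the backend;backend2:device,device syntax described above and collects several settings into an environment dictionary. The helper names build_visibility_mask and acpp_environment are hypothetical and not part of AdaptiveCpp. The sketch only assembles strings and does not validate backend names or device specifiers.

```python
import os


def build_visibility_mask(backends):
    """Compose an ACPP_VISIBILITY_MASK string (hypothetical helper).

    `backends` maps a backend name (e.g. "omp", "ocl") to a list of
    device specifiers. An empty list means no device-level restriction.
    """
    parts = []
    for name, devices in backends.items():
        if devices:
            # Device specifiers follow the backend name after a colon
            # and are separated by commas, e.g. "ocl:0,4".
            parts.append(name + ":" + ",".join(str(d) for d in devices))
        else:
            parts.append(name)
    # Backends are separated by semicolons.
    return ";".join(parts)


def acpp_environment(base=None, **settings):
    """Return a copy of `base` (default: os.environ) with ACPP_* entries added.

    Keyword names are upper-cased and prefixed with "ACPP_", so
    debug_level=3 becomes ACPP_DEBUG_LEVEL="3". Hypothetical helper.
    """
    env = dict(os.environ if base is None else base)
    for key, value in settings.items():
        env["ACPP_" + key.upper()] = str(value)
    return env


# omp plus two OpenCL device specifiers taken from the examples above.
mask = build_visibility_mask({"omp": [], "ocl": ["Intel.0", "3.0"]})

# An empty base keeps the result small and easy to inspect.
env = acpp_environment(
    base={},
    visibility_mask=mask,
    debug_level=3,
    rt_scheduler="direct",
)

for name, value in env.items():
    print(name + "=" + value)
```

The resulting dictionary can be passed as the env argument of subprocess.run when launching an application, for example subprocess.run(["./my_sycl_app"], env=env), where ./my_sycl_app is a placeholder for your own binary. In practice you would usually omit base={} so that the existing process environment is inherited.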