[GSoC] Add OpenCL support to compile GPU kernels #689

Closed
seven-mile opened this issue Jun 15, 2024 · 1 comment
@seven-mile
Collaborator

seven-mile commented Jun 15, 2024

This is the overview issue for the GSoC project Compile GPU kernels using ClangIR.

Participant: @seven-mile

Mentors: @jopperm @Naghasan @bcardosolopes


How to play with it

We have an artifact evaluation repo, which contains instructions to run example kernels from the polybenchGPU benchmark.


Goal 1: Teach CIR pointers about address space

Status: Done

Before the project, most ClangIR code ignored address spaces, even though they are a vital feature of heterogeneous programming languages such as OpenCL. How to model address spaces in ClangIR is itself an interesting question: the OG Clang pipeline already contains two different designs, one in the Clang AST and one in LLVM.

At first, we simply copied the LLVM design for simplicity, without much deliberation; this is patch #606. But this approach led to dilemmas on several problems and runs somewhat against a goal of ClangIR: to preserve the necessary information from the source code. After weighing the pros and cons of the two designs repeatedly, we proposed an RFC that keeps the code structure of CIRGen and Lowering easy to understand while preserving all potential optimization opportunities.

Finally, patches #692 and #738 implemented the RFC and made the related address space patches clean and neat.

Goal 2: Add OpenCL language and SPIR-V target to ClangIR

Status: Done

Previously, ClangIR supported only C/C++ as languages and x86-64/Arm64 as targets. For OpenCL kernels, we need to extend the pipeline with at least the OpenCL language and the SPIR-V target. For simplicity, we limited our goal to the latest and widely used combination of OpenCL 3.0 + spirv64-unknown-unknown.

In the original CodeGen, many language-specific or target-specific hooks provide customization points such as "the handling logic of local-qualified variable declarations" or "the address space map from Clang AST to the SPIR-V target". The corresponding support code is implemented in ClangIR as well.
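As a rough illustration of what an address space map hook does, the sketch below translates a language-level address space into the numeric address space seen in LLVM IR. The enum and function names are hypothetical and are not the Clang or ClangIR definitions. The numbers follow the SPIR convention as I understand it (0 private, 1 global, 2 constant, 3 local, 4 generic) and should be checked against the target definition before being relied on.

```cpp
#include <cassert>
#include <cstdint>

// Language-level address spaces, named the way the source language sees them.
// Hypothetical enum for illustration only.
enum class LangAddrSpace { Private, Global, Constant, Local, Generic };

// Hypothetical target hook: map a language-level address space to the numeric
// address space used after lowering. Values are assumed to follow the SPIR
// convention (0 private, 1 global, 2 constant, 3 local, 4 generic).
inline std::uint32_t toSpirTargetAddrSpace(LangAddrSpace as) {
  switch (as) {
  case LangAddrSpace::Private:  return 0;
  case LangAddrSpace::Global:   return 1;
  case LangAddrSpace::Constant: return 2;
  case LangAddrSpace::Local:    return 3;
  case LangAddrSpace::Generic:  return 4;
  }
  assert(false && "unhandled address space");
  return 0;
}
```

Keeping the named form for as long as possible and applying a map like this only during lowering is one way to retain source-level information while still producing target-specific numbers.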

Goal 3: Support OpenCL vector types

Status: Done

OpenCL vectors are implemented as ExtVectorType in Clang. ClangIR already has a fairly complete cir.vector type and related ops. Based on cir.vector, we generate code for ExtVectorType from the frontend and take care to keep the compilation result consistent, for example by loading vec3 as vec4.

This is a rather independent feature, which also leaves some future work, such as the issues at the end of the list.
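The "load vec3 as vec4" behaviour can be emulated in plain C++ as a sketch. It assumes the OpenCL layout rule that a 3-component vector occupies the storage of a 4-component one. The helper names storageLanes and loadVec3 are hypothetical and do not correspond to any ClangIR API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Number of lanes a vector of n elements occupies in memory, assuming a
// 3-component vector is sized like a 4-component one.
inline std::size_t storageLanes(std::size_t n) { return n == 3 ? 4 : n; }

// Emulates "load vec3 as vec4": read four lanes from storage, then keep
// lanes 0..2, which is what a shuffle with mask <0, 1, 2> would select.
inline std::array<float, 3> loadVec3(const float *storage) {
  assert(storage != nullptr && "storage must hold 4 readable floats");
  std::array<float, 4> wide = {{storage[0], storage[1], storage[2], storage[3]}};
  return {{wide[0], wide[1], wide[2]}};
}
```

The fourth lane is read but discarded, so its value never reaches the result.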

Goal 4: Emit OpenCL kernel and module metadata

Status: Done

The LLVM SPIRV Translator defines the representation of SPIR-V in LLVM in this spec. Where the two representations have equivalent constructs, nothing extra is needed; the remaining inconsistencies are mainly handled by extra LLVM metadata.

As pointed out in this thread, MLIR prefers structured and well-defined attributes rather than a large dictionary like LLVM Metadata. In the patches above, OpenCLKernelMetadataAttr, OpenCLVersionAttr and OpenCLKernelArgMetadataAttr were added one by one to properly carry these metadata.
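To illustrate the difference between a dictionary of metadata and a structured attribute, here is a minimal sketch. The struct and its fields are hypothetical, loosely named after OpenCL's reqd_work_group_size kernel attribute, and are not the actual definition of OpenCLKernelMetadataAttr.

```cpp
#include <array>
#include <cassert>

// Hypothetical structured stand-in for kernel metadata: each piece of
// information has a named, typed field instead of living in a string-keyed
// dictionary whose contents are only validated when consumed.
struct KernelMetadata {
  bool hasReqdWorkGroupSize = false;
  std::array<unsigned, 3> reqdWorkGroupSize = {{1, 1, 1}};
};

// A consumer can rely on the field's type and arity without parsing anything.
inline unsigned totalWorkItems(const KernelMetadata &md) {
  assert(md.hasReqdWorkGroupSize && "no required work-group size attached");
  return md.reqdWorkGroupSize[0] * md.reqdWorkGroupSize[1] *
         md.reqdWorkGroupSize[2];
}
```

With the typed form, a missing or malformed entry is caught where the attribute is built rather than where it is read.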

We also discussed the possibility of unifying common metadata, such as workgroup size, across offloading languages. It would definitely be useful for future optimization, but it may require a broad view of the entry-point design details of every offloading language. So it would be a bonus rather than a short-term goal.

Goal 5: Support OpenCL built-ins

Status: Partially Done

Calls to OpenCL built-in functions are encoded by the frontend as normal function calls with the spir_func calling convention. Such behaviour is transparent to CIR. Thus, once the spir_func calling convention is supported, the built-in functions should work without further changes.

OpenCL built-in types such as pipe and image will ultimately be converted to llvm::TargetExtType. There are a few common approaches to this design problem:

  • Introduce specific types like cir.cl.pipe and cir.cl.image to naturally represent them, postponing the conversion to TargetExtType.
  • Opt for an escape hatch by performing the lowering earlier, which may enhance implementation correctness but could somewhat contradict the objectives of CIR.
  • Alternatively, as a middle ground, we could introduce a single type parameterized by a name string, like cir.cl.builtin_type<"pipe"> or cir.cl.builtin_type<"image">.

This missing feature is tracked by #802.

Goal 6: Support global/static variables with qualifiers global, constant and local

Status: Done

In OpenCL, local memory can be considered an implicit static variable in the local address space. We have to support such constructs, which combine static storage duration with an address space. We taught cir.global and cir.get_global about addrspace and added the counterparts in CIRGen and LLVM Lowering. For local memory representation, OpenCL runtime emits a static declaration with local address space.

Support for the constant qualifier is almost done, except for the missing constant attribute in the global op, which is tracked by issue #801.

Goal 7: Correct calling convention for CIR

Status: Done

When lowering to LLVM IR, CIR defaults the calling convention to cdecl. SPIR-V requires two specific calling conventions, SpirKernel and SpirFunction, for device kernels and non-kernel functions respectively.

We migrated cir::CallingConv to the MLIR-defined mlir::cir::CallingConv enum attribute. cir.func operations are extended to carry this attribute.

In theory, cir.call operations should carry the same CallConv attribute, which would be a change similar to the cir.func one: extend the dialect and collect the calling convention in CIRGenCall following the OG skeleton. Since this is not necessary for OpenCL/SPIR-V, we left it for future development. It is tracked by issue #803.
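The selection rule described above can be sketched as follows. The enum is a hypothetical stand-in for the real attribute, and pickCallingConv and llvmSpelling are illustrative helpers rather than ClangIR functions. The strings are the LLVM IR keywords for these conventions.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the calling convention enum attribute.
enum class CallingConv { C, SpirKernel, SpirFunction };

// On a SPIR-V target, kernels and non-kernel functions get distinct
// conventions; every other target keeps the C default.
inline CallingConv pickCallingConv(bool isSpirTarget, bool isKernel) {
  if (!isSpirTarget)
    return CallingConv::C;
  return isKernel ? CallingConv::SpirKernel : CallingConv::SpirFunction;
}

// Textual form of each convention as it appears in LLVM IR.
inline std::string llvmSpelling(CallingConv cc) {
  switch (cc) {
  case CallingConv::C:            return "ccc";
  case CallingConv::SpirKernel:   return "spir_kernel";
  case CallingConv::SpirFunction: return "spir_func";
  }
  assert(false && "unhandled calling convention");
  return "";
}
```

For example, a kernel compiled for a SPIR-V target would be spelled spir_kernel, while a helper function it calls would be spelled spir_func.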

Goal 8: Nice user experience of end-to-end kernel compilation

Status: Done

To keep the user experience aligned with OpenCL Support for Clang itself, we fixed the bitcode emission of the ClangIR pipeline in #782. This lets the clang driver correctly connect the output of the cc1 invocation to the input of the llvm spirv translator invocation. After that, we only need to type clang -fclangir --target=spirv64 kernel.cl -o kernel.spv to compile an OpenCL kernel to a SPIR-V binary.

We also explored the usability of the experimental SPIR-V backend. A general issue about debug info, #793, blocks it. But if we strip the debug info manually, the backend also works well.


Future works

This is a summary of the remaining work that requires extra attention. There are of course other features not yet implemented, but most of them are properly tracked in the source code and are not very critical.

seven-mile added the enhancement (New feature or request) label Jun 15, 2024
seven-mile self-assigned this Jun 15, 2024
@seven-mile
Collaborator Author

The goals of the project have largely been achieved, and future work is tracked by the separate issues mentioned above. 🎉

If you are interested in this project, you can read the section "How to play with it" above for instructions to get a functional test report of OpenCL C in ClangIR. And of course, feel free to contact me by email or reach me on Discord. ❤️
