LLVM and SPIRV-LLVM-Translator pulldown (WW47 2024) #16165
Draft: iclsrc wants to merge 1,943 commits into sycl from llvmspirv_pulldown
This patch fixes:
  mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp:137:8: error: unused variable 'vectorType' [-Werror,-Wunused-variable]
  mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp:154:8: error: unused variable 'srcType' [-Werror,-Wunused-variable]
  mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp:155:8: error: unused variable 'destType' [-Werror,-Wunused-variable]
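Warnings of this kind often appear when a variable is read only inside an `assert`, which compiles away in release builds. The commit message does not show the actual change, so the following is only a minimal sketch of one common remedy; `firstElement` is a hypothetical helper and is not taken from VectorEmulateNarrowType.cpp.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper for illustration only.
// `size` is read only inside assert(), so a build with NDEBUG defined would
// otherwise flag it under -Werror,-Wunused-variable.
int firstElement(const std::vector<int> &values) {
  [[maybe_unused]] const auto size = values.size();
  assert(size > 0 && "expected a non-empty vector");
  return values.front();
}
```

Another common remedy is to fold the expression directly into the assertion so that no named variable remains.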
This PR implements a new type trait as a builtin, __builtin_hlsl_is_typed_resource_element_compatible. This type trait verifies that the given input type is suitable as a typed resource element type. It checks that the given input type is homogeneous, has no more than 4 sub elements, does not exceed 16 bytes, and does not contain any arrays, booleans, or enums. Fixes an issue in llvm/llvm-project#113730 that caused that PR to be reverted. Fixes llvm/llvm-project#113223
…4988) [5.2:625:17] The syntax of the DESTROY clause on the DEPOBJ construct with no argument was deprecated.
See #105195 as well as the big comment in DynamicRecursiveASTVisitor.cpp for more context.
…ate handlers. NFC. Cleanup the SHLI/SRLI/SRAI handlers to be more consistent - prep for a future patch.
…andedBits on SSE shift-by-immediate nodes. Attempt to peek through multiple-use SHLI/SRLI/SRAI source vectors.
Some tests were including LibcTest.h directly. Instead you should include Test.h which does proper indirection for other test frameworks we support (zxtest, gtest). Also added some license headers to tests that were missing them.
- create a clang built-in in Builtins.td
- link dot4add_i8packed in hlsl_intrinsics.h
- add lowering to spirv backend through expansion of operation as OpSDot is missing up to SPIRV 1.6 in SPIRVInstructionSelector.cpp
- add lowering to spirv backend using OpSDot in applicable SPIRV version or if SPV_KHR_integer_dot_product is enabled
- add dot4add_i8packed intrinsic to IntrinsicsDirectX.td and mapping to DXIL.td op Dot4AddI8Packed
- add tests for HLSL intrinsic lowering to dx/spv intrinsic in dot4add_i8packed.hlsl
- add tests for sema checks in dot4add_i8packed-errors.hlsl
- add test of spir-v lowering in SPIRV/dot4add_i8packed.ll
- add test to dxil lowering in DirectX/dot4add_i8packed.ll

Resolves #99220
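For readers unfamiliar with the intrinsic, the expansion path mentioned above can be pictured as a plain scalar computation. The sketch below assumes the usual reading of dot4add_i8packed (four signed 8-bit lanes per 32-bit operand, lane-wise products summed into an accumulator); `dot4AddI8Packed` is a hypothetical reference function, not code from the backend.

```cpp
#include <cassert>
#include <cstdint>

// Assumed reference semantics: treat each 32-bit operand as four signed
// 8-bit lanes, multiply lane-wise, and add the sum of products to `acc`.
int32_t dot4AddI8Packed(uint32_t a, uint32_t b, int32_t acc) {
  int32_t sum = acc;
  for (int lane = 0; lane < 4; ++lane) {
    // Extract one byte and reinterpret it as a signed 8-bit value.
    const auto x = static_cast<int8_t>((a >> (8 * lane)) & 0xFFu);
    const auto y = static_cast<int8_t>((b >> (8 * lane)) & 0xFFu);
    sum += static_cast<int32_t>(x) * static_cast<int32_t>(y);
  }
  return sum;
}
```

When OpSDot is available, the backend can emit that single instruction instead of an unrolled sequence like this one.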
This retries the PR 113521 skipping a test in a remote environment.
When you set a "next branch breakpoint" and run to it while stepping, you have to claim the stop at that breakpoint to be the top of the inlined call stack, or you will seem to "step in" and then plans might try to step back out again. This records the PreferredLineEntry for next branch breakpoints and adds a test to make sure this works.
For GFX10+, image_gather4 instructions that have v[254:255] as dst reg and the d16 bit on can be assembled correctly but the generated binary fails to disassemble (e.g. image_gather4 v[254:255], v[1:2], s[8:15], s[12:15] dmask:0x8 dim:SQ_RSRC_IMG_2D d16). This patch fixes this problem.
Until now, suppression of `DT_DEBUG` has been hardcoded as a downstream patch in lld. This can instead be achieved by passing `-z rodynamic`. Have the driver do this so that the private patch can be removed. If the scope of lld's `-z rodynamic` is broadened (within reason) to do more in future, that's likely to be fine as `PT_DYNAMIC` isn't writable on PlayStation. PS5 only. On PS4, the equivalent hardcoded configuration will remain in the proprietary linker. SIE tracker: TOOLCHAIN-16704
LLVM support for the attribute has been implemented already, so this patch just plumbs it through to the CUDA front-end. One notable difference from NVCC is that the attribute can be used regardless of the targeted GPU. On older GPUs it will just be ignored. The attribute is a performance hint, and does not warrant a hard error if the compiler can't benefit from it on a particular GPU variant.
Assumed-rank arrays are represented by DIGenericSubrange in debug metadata. We have to provide 2 things.

1. Expression to get the rank value at runtime from the descriptor.
2. Assuming the dimension number for which we want the array information has been put on the DWARF expression stack, expressions which will extract the lowerBound, count and stride information from the descriptor for the said dimension.

With this patch in place, this is how I see an assumed_rank variable being evaluated by GDB.

```
function mean(x) result(y)
integer, intent(in) :: x(..)
...
end

program main
use mod
implicit none
integer :: x1,xvec(3),xmat(3,3),xtens(3,3,3)
x1 = 5
xvec = 6
xmat = 7
xtens = 8
print *,mean(xvec), mean(xmat), mean(xtens), mean(x1)
end program main

(gdb) p x
$1 = (6, 6, 6)
(gdb) p x
$2 = ((7, 7, 7) (7, 7, 7) (7, 7, 7))
(gdb) p x
$3 = (((8, 8, 8) (8, 8, 8) (8, 8, 8)) ((8, 8, 8) (8, 8, 8) (8, 8, 8)) ((8, 8, 8) (8, 8, 8) (8, 8, 8)))
(gdb) p x
$4 = 5
```
…114559) Currently, `FoldTensorCastProducerOp` incorrectly folds the following:

```mlir
%pack = tensor.pack %src
  padding_value(%pad : i32)
  inner_dims_pos = [0, 1]
  inner_tiles = [%c8, 1]
  into %cast : tensor<7x?xi32> -> tensor<1x1x?x1xi32>
%res = tensor.cast %pack : tensor<1x1x?x1xi32> to tensor<1x1x8x1xi32>
```

as (note the static trailing dim in the result and dynamic tile dimension that corresponds to that):

```mlir
%res = tensor.pack %src
  padding_value(%pad : i32)
  inner_dims_pos = [0, 1]
  inner_tiles = [%c8, 1]
  into %cast : tensor<7x?xi32> -> tensor<1x1x8x1xi32>
```

This triggers an Op verification failure and is due to the fact that the folder does not update the inner tile sizes in the pack Op. This PR addresses that. Note, supporting other Ops with size-like attributes is left as a TODO.
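The idea of updating the inner tile sizes can be sketched outside MLIR with plain integer vectors. This is an illustration under stated assumptions, not the code from the PR: `kDynamic` and `refineInnerTiles` are hypothetical names, shapes are modelled as `int64_t` vectors, and the inner tiles are assumed to map onto the trailing dimensions of the packed result in order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sentinel standing in for a dynamic dimension or tile size.
constexpr int64_t kDynamic = -1;

// The trailing dimensions of a packed result hold the inner tiles, so a tile
// that was dynamic can take the static size that the cast reveals for it.
std::vector<int64_t> refineInnerTiles(std::vector<int64_t> innerTiles,
                                      const std::vector<int64_t> &castShape) {
  assert(castShape.size() >= innerTiles.size() && "result rank too small");
  const std::size_t offset = castShape.size() - innerTiles.size();
  for (std::size_t i = 0; i < innerTiles.size(); ++i) {
    const int64_t dim = castShape[offset + i];
    if (innerTiles[i] == kDynamic && dim != kDynamic)
      innerTiles[i] = dim; // keep the tile consistent with the static dim
  }
  return innerTiles;
}
```

For the example above, tiles `[?, 1]` and a cast result shape of `1x1x8x1` would give tiles `[8, 1]`, which agrees with the static trailing dimension in the folded result type.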
This patch fixes: mlir/lib/Dialect/Tensor/IR/TensorOps.cpp:4781:17: error: unused variable 'tileSize' [-Werror,-Wunused-variable]
Finish hooking up ClangIR code gen into the Clang control flow, initializing enough that basic code gen is possible. Add an almost empty `cir.func` op to the ClangIR dialect. Currently the only property of the function is its name. Add the code necessary to code gen a cir.func op. Create essentially empty files clang/lib/CIR/Dialect/IR/{CIRAttrs.cpp,CIRTypes.cpp}. These will be filled in later as attributes and types are defined in the ClangIR dialect. (Part of upstreaming the ClangIR incubator project into LLVM.)
I have to check for the sc list size being changed by the call-site search, not just that it had more than one element. Added a test for multiple CU's with the same name in a given module, which would have caught this mistake. We were also doing all the work to find call sites when the found decl and specified decl's only difference was a column, but the incoming specification hadn't specified a column (column number == 0).
…e"" (#115034) In C++ it's UB to use undeclared values as an enum, and supporting __ATOMIC_HLE_ACQUIRE and __ATOMIC_HLE_RELEASE requires such values. So use `int` in the TSAN interface, mask out the irrelevant bits, and cast to the enum ASAP. `ThreadSanitizer.cpp` already declares the morder parameters in these functions as `i32`. This may look like a slight change, as we previously didn't mask out additional bits for `fmo` and the `NoTsanAtomic` call. But from the implementation it's clear that they expect the exact enum. Reverts llvm/llvm-project#115032 Reapply llvm/llvm-project#114724
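The mask-then-cast approach described above could look roughly like the sketch below. All names here (`MemOrder`, `toMemOrder`, `kHleAcquire`, `kHleRelease`) are hypothetical, the hint-bit values are assumptions chosen for illustration, and the mask width is a guess rather than what the TSAN runtime actually uses.

```cpp
#include <cstdint>

// Hypothetical mirror of a memory-order enum, not the TSAN type.
enum class MemOrder : uint8_t {
  Relaxed, Consume, Acquire, Release, AcqRel, SeqCst
};

// Assumed positions of extra hint bits that callers may OR into the order.
constexpr int kHleAcquire = 1 << 16;
constexpr int kHleRelease = 1 << 17;

// Accept a plain int at the interface boundary, drop everything above the
// low byte, and only then convert to the enum.
constexpr MemOrder toMemOrder(int raw) {
  return static_cast<MemOrder>(raw & 0xFF);
}

static_assert(toMemOrder(2 | kHleAcquire) == MemOrder::Acquire,
              "hint bits are ignored");
```

Taking `int` in the signature means a caller can pass the combined value without ever forming an enum object that holds an undeclared enumerator.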
…5023) Data transfer from a variable with a descriptor to a pointer. We create a descriptor for the pointer so we can use the flang runtime to perform the transfer. The Assign function handles all corner cases. We add a new entry point, `CUFDataTransferDescDescNoRealloc`, to avoid reallocation since the variable on the LHS is not an allocatable.
I plan to remove s32 as a legal type to match SelectionDAG and to remove i32 from the GPR regclass on RV64.
The lit test fmuladd-soft-float.ll only specifies s390x as platform, but the test is Linux specific, causing problems when run on z/OS. This change updates the triple to fix this.
Adds the runtime support routines for XRay on SystemZ. Only function entry/exit is implemented.
Expands pseudo instructions PATCHABLE_FUNCTION_ENTER and PATCHABLE_RET into a small instruction sequence which calls into the XRay library.
These cannot be 0.
This was hitting a "not implemented UNREACHABLE". Like the other cooperative matrix operations, map this construct to a SPIR-V friendly IR function call. Let `transDbgInfo` skip over `OpConstantComposite` because we're mapping `OpConstantComposite` to an LLVM `Instruction` without having a corresponding `SPIRVInstruction`. Original commit: KhronosGroup/SPIRV-LLVM-Translator@04b546550077c4f
…nce even if the input SPIR-V module is invalid as reported by spirv-val (#2852) Original commit: KhronosGroup/SPIRV-LLVM-Translator@69f65ef9257f3db
fixes #2768 Generate an LLVM memcpy for OpCopyLogical, rather than a call to an OpCopyLogical function. Original commit: KhronosGroup/SPIRV-LLVM-Translator@1a1bf17d9e8684c
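As a loose analogy in C++, copying between two distinct but logically matching aggregate types can be done with a raw byte copy. The sketch below is not translator code: `SrcPoint`, `DstPoint`, and `copyLogical` are hypothetical, and it assumes both types share the same member layout.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Two distinct types with the same member layout (hypothetical example).
struct SrcPoint { int x; float y; };
struct DstPoint { int a; float b; };

// Copy the object representation instead of calling a dedicated copy routine.
DstPoint copyLogical(const SrcPoint &src) {
  static_assert(sizeof(SrcPoint) == sizeof(DstPoint), "sizes must match");
  static_assert(std::is_trivially_copyable<SrcPoint>::value &&
                    std::is_trivially_copyable<DstPoint>::value,
                "memcpy requires trivially copyable types");
  DstPoint dst;
  std::memcpy(&dst, &src, sizeof(dst));
  return dst;
}
```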
The `OpSizeOf` instruction was added in SPIR-V 1.1, but was not supported yet. Original commit: KhronosGroup/SPIRV-LLVM-Translator@9aeb7eb92d7c0cb
This improves support for the SPV_KHR_untyped_pointers extension. Removing the struct type from special handling (translating it as a typed pointer) made it possible to fix a `spirv-val` error in the `CXX/global-ctor.cl` test:

```
error: line 88: OpFunctionCall Argument <id> '25[%this1]'s type does not match Function <id> '11[%_ptr_Generic_class_Something]'s parameter type.
  %30 = OpFunctionCall %void %_ZNU3AS49SomethingC2Ei %this1 %26
```

Other changes make it possible to translate structs in a new way without violating validation or test checks. Original commit: KhronosGroup/SPIRV-LLVM-Translator@15fd1cc50e12465
Handle all queries of `Alignment` decorations through one and the same helper function. Original commit: KhronosGroup/SPIRV-LLVM-Translator@67685320c1192af
LLVMToSPIRVBase had a custom destructor, but no copy constructor, no move constructor, no move assignment operator, and no copy assignment operator, so it was not complying with the Rule of Five. Explicitly add them as deleted to comply. Original commit: KhronosGroup/SPIRV-LLVM-Translator@93fb018aa342da1
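A minimal sketch of the pattern follows, using a hypothetical `Translator` class in place of LLVMToSPIRVBase (whose real members are not shown in the commit message).

```cpp
#include <type_traits>

// Hypothetical stand-in for a class with a user-provided destructor.
class Translator {
public:
  Translator() = default;
  ~Translator() { /* release owned resources here */ }

  // Spell out the remaining four special members as deleted.
  Translator(const Translator &) = delete;
  Translator &operator=(const Translator &) = delete;
  Translator(Translator &&) = delete;
  Translator &operator=(Translator &&) = delete;
};

static_assert(!std::is_copy_constructible<Translator>::value,
              "copying is disabled");
static_assert(!std::is_move_assignable<Translator>::value,
              "moving is disabled");
```

Deleting the operations explicitly documents that the object is neither copyable nor movable, rather than leaving that to the implicit generation rules.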
…ters` (#2867) Do not lose variable type in forward translation - take it from the already translated "untyped" variable. Original commit: KhronosGroup/SPIRV-LLVM-Translator@9207ef2aa150773
Something is wrong with GitHub. I don't see any conflicts locally with git.
jsji force-pushed the llvmspirv_pulldown branch from 0cfaffa to 6387a05 on November 23, 2024 01:23
LLVM: llvm/llvm-project@f8bae3a
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@9207ef2aa150773