fix[dace][next]: Updated Handling of Not Supported Memlets #2070
Merged
philip-paul-mueller merged 13 commits into GridTools:main on Jun 11, 2025
One PR is still missing; it has not yet been opened in DaCe. However, GT4Py does not need that PR, so we can technically merge this one.
The generation of Maps in the code generator in GPU mode has now become a hard error. This makes it imperative that we only merge the PR once CI is back and we have made sure that all tests, including the ones in ICON4Py, have passed!
…lemented properly inside DaCe.
Actually resynced with DaCe on what is supported and what is not.
philip-paul-mueller added a commit that referenced this pull request on Jun 10, 2025
Updates the DaCe dependency. This update does not contain the updates to the GPU codegen; they are handled in a separate [PR#2070](#2070).
edopao requested changes on Jun 10, 2025
edopao (Contributor) left a comment
As you suggest, I think we should set `allow_implicit_memlet_to_map=False` in `gtx.program_processors.runners.dace.workflow.common.set_dace_config()`.
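To make the effect of this flag concrete, here is a minimal toy model of the decision it controls, as described in this PR. It is only a sketch and not DaCe's actual code: the function `lower_gpu_copy`, its parameter `is_memcpy_expressible`, the returned strings, and the exception type `UnsupportedMemletError` are all hypothetical names chosen for illustration. Only the flag name `allow_implicit_memlet_to_map` comes from the comment above.

```python
class UnsupportedMemletError(RuntimeError):
    """Hypothetical error type; the exception DaCe actually raises may differ."""


def lower_gpu_copy(is_memcpy_expressible: bool, allow_implicit_memlet_to_map: bool) -> str:
    """Toy model of how a GPU code generator could treat a copy Memlet.

    This mirrors the behaviour described in the PR text, not DaCe's implementation.
    """
    if is_memcpy_expressible:
        # The Memlet maps directly onto a cudaMemcpy*() call.
        return "cudaMemcpy"
    if allow_implicit_memlet_to_map:
        # Old behaviour: silently turn the Memlet into a Map (copy kernel).
        return "implicit Map"
    # New behaviour with the flag disabled: fail loudly so the caller
    # (here GT4Py) has to handle the Memlet explicitly.
    raise UnsupportedMemletError(
        "Memlet cannot be expressed as a cudaMemcpy*() call; handle it explicitly"
    )
```

With the flag set to `False`, an unsupported Memlet surfaces as an error at code generation time instead of producing a copy kernel with a poor access pattern.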
Contributor, Author
Thanks for the hint, I forgot to ask you where you set these values.
Contributor
It is actually the place you suggested during review of my PR 😄
edopao approved these changes on Jun 11, 2025
stubbiali pushed a commit to stubbiali/gt4py that referenced this pull request on Aug 19, 2025
Updates the DaCe dependency. This update does not contain the updates to the GPU codegen; they are handled in a separate [PR#2070](GridTools#2070).
stubbiali pushed a commit to stubbiali/gt4py that referenced this pull request on Aug 19, 2025
…#2070)

If the DaCe GPU code generator encounters a Memlet that cannot be expressed as a `cudaMemcpy*()` call, it converts the Memlet to a Map. However, these Maps have the wrong iteration order, i.e. the wrong memory access pattern, and the resulting kernel might not even launch because of too many blocks in the `y` direction of the compute grid. For this reason GT4Py has to handle these Memlets explicitly. However, [DaCe PR#1976](spcl/dace#1976) changed this slightly and thus GT4Py had to follow. Note that some of these changes were already introduced by [GT4Py PR#2004](GridTools#2004); however, they were made for the original version of the DaCe PR (and the GT4Py PR had to be merged before the DaCe PR was merged).

Furthermore, this PR fixes a different issue, also related to the expansion of Memlets; the fix can be found in [DaCe PR#2033](spcl/dace#2033) (it is not yet merged and currently at commit `19b6bba`). It fixes a bug in how the Memlets are expanded. The DaCe PR also adds the possibility to generate an error instead of silently converting Memlets into Maps, and this PR enables this feature.

---------

Co-authored-by: edopao <edoardo.paone@cscs.ch>
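The launch failure mentioned above can be illustrated with a small back-of-the-envelope calculation. The sketch below is not taken from DaCe or GT4Py: the field shape, the block size, and the helper names `grid_for` and `launchable` are assumptions made for illustration. The only hard facts used are the CUDA limits on blocks per grid dimension (2^31 - 1 in `x`, 65535 in `y` and `z` for compute capability 3.0 and later).

```python
import math

# CUDA limits on the number of blocks per grid dimension
# (compute capability >= 3.0).
MAX_GRID_X = 2**31 - 1
MAX_GRID_Y = 65535


def grid_for(shape_xy, block_xy):
    """Number of blocks in (x, y) needed to cover a 2D iteration space."""
    return tuple(math.ceil(n / b) for n, b in zip(shape_xy, block_xy))


def launchable(grid_xy):
    """Check a 2D grid against the CUDA per-dimension block limits."""
    return grid_xy[0] <= MAX_GRID_X and grid_xy[1] <= MAX_GRID_Y


# Assumed example: a field with 1,000,000 horizontal points and 80 levels,
# copied by a kernel with an assumed block size of 32 x 1 threads.
n_horizontal, n_vertical = 1_000_000, 80
block = (32, 1)

# Large dimension iterated along x: few blocks in y, the kernel can launch.
good = grid_for((n_horizontal, n_vertical), block)

# Large dimension iterated along y: the y block count exceeds 65535.
bad = grid_for((n_vertical, n_horizontal), block)

print(good, launchable(good))
print(bad, launchable(bad))
```

In this toy setup the same copy is launchable or not depending purely on which dimension ends up on the `y` axis of the grid, which is why the iteration order of an implicitly generated Map matters beyond performance.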