Loki: enable imports and mark headers as ignored instead of blocked #80

Merged (2 commits) on Mar 27, 2024

reuterbal
Collaborator

This is an alternative solution to #79.

  • re-enables chasing import dependencies (after the last-minute default behaviour change in Loki)
  • fixes the header module that defines NCLV, which is required for variable demotion, being marked as blocked instead of ignored

Performance looks good with this, and demotion was verified by diffing against the gpu_scc variant:

$ NV_ACC_CUDA_HEAPSIZE=6G bin/dwarf-cloudsc-loki-scc 1 163840 128
     NUMPROC=1, NUMOMP=1, NGPTOTG=163840, NPROMA=128, NGPBLKS=1280
 Reference MFLOP count for 100 columns :  12.48232900
     NUMOMP    NGPTOT  #GP-cols     #BLKS    NPROMA tid# : Time(msec)  MFlops/s     col/s
          1    163840    163840         1       128    0 :        105    194538   1558509 @ rank#0:core#172
          1    163840    163840      1280       128   -1 :       1613     12671    101518 : TOTAL @ rank#0
    1 x   1    163840    163840      1280       128   -1 :       1613     12671    101518 : TOTAL
...
$ bin/dwarf-cloudsc-loki-scc-hoist 1 163840 128
     NUMPROC=1, NUMOMP=1, NGPTOTG=163840, NPROMA=128, NGPBLKS=1280
 Reference MFLOP count for 100 columns :  12.48232900
     NUMOMP    NGPTOT  #GP-cols     #BLKS    NPROMA tid# : Time(msec)  MFlops/s     col/s
          1    163840    163840         1       128    0 :         62    329376   2638741 @ rank#0:core#172
          1    163840    163840      1280       128   -1 :       1573     12998    104134 : TOTAL @ rank#0
    1 x   1    163840    163840      1280       128   -1 :       1573     12998    104134 : TOTAL
...
$ bin/dwarf-cloudsc-loki-scc-stack 1 163840 128
     NUMPROC=1, NUMOMP=1, NGPTOTG=163840, NPROMA=128, NGPBLKS=1280
 Reference MFLOP count for 100 columns :  12.48232900
     NUMOMP    NGPTOT  #GP-cols     #BLKS    NPROMA tid# : Time(msec)  MFlops/s     col/s
          1    163840    163840         1       128    0 :         62    325949   2611287 @ rank#0:core#38
          1    163840    163840      1280       128   -1 :       1561     13096    104920 : TOTAL @ rank#0
    1 x   1    163840    163840      1280       128   -1 :       1561     13096    104920 : TOTAL
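
To put the three runs above side by side, here is a minimal Python sketch that computes the timings relative to the plain scc variant. The numbers are copied from the tid# 0 and TOTAL lines of the output; the dictionary layout and the helper name `relative_to` are illustrative only and not part of CLOUDSC or Loki, and the thread does not state what each line measures beyond its label.

```python
# Timings in msec, copied from the benchmark output above.
# "tid0" is the Time(msec) column of the tid# 0 line,
# "total" is the Time(msec) column of the TOTAL line.
runs = {
    "scc":       {"tid0": 105, "total": 1613},
    "scc-hoist": {"tid0": 62,  "total": 1573},
    "scc-stack": {"tid0": 62,  "total": 1561},
}


def relative_to(baseline, runs):
    """Return baseline_time / variant_time for every variant and timing key."""
    base = runs[baseline]
    return {
        name: {key: base[key] / value for key, value in timings.items()}
        for name, timings in runs.items()
    }


speedup = relative_to("scc", runs)

for name, ratios in speedup.items():
    print(f"{name:10s} tid0 x{ratios['tid0']:.2f}  total x{ratios['total']:.3f}")
```

A ratio above 1 means the variant took less time than scc on that line. This is plain arithmetic on single runs, so it says nothing about run-to-run variation.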

reuterbal force-pushed the nabr-fix-loki-config branch from 383218f to a99664b on March 26, 2024 14:04
Collaborator

@mlange05 mlange05 left a comment

Looks great and works as intended. Many thanks for tracking this down. GTG from me 🚢

reuterbal merged commit 2add816 into main on Mar 27, 2024
18 checks passed
reuterbal deleted the nabr-fix-loki-config branch on March 27, 2024 08:49