GPU + ClimaCore compilation error (but only sometimes) #499

Open
trontrytel opened this issue Dec 13, 2024 · 4 comments
@trontrytel
Member

trontrytel commented Dec 13, 2024

I have added a GPU + ClimaCore test to the CI. It works when the code block below is inside a function (i.e. the way it is currently written). However, if you instead execute it directly at top level:

FT = Float32

Ch2022 = CMP.Chen2022VelType(FT)
liquid = CMP.CloudLiquid(FT)
ice = CMP.CloudIce(FT)
rain = CMP.Rain(FT)
snow = CMP.Snow(FT)

params = (; liquid, ice, Ch2022)

space_3d_ρq = make_extruded_sphere(FT)
space_3d_ρ = make_extruded_sphere(FT)
space_3d_w = make_extruded_sphere(FT)

ρq = CC.Fields.ones(space_3d_ρq) .* FT(1e-3)
ρ = CC.Fields.ones(space_3d_ρ)
w = CC.Fields.zeros(space_3d_w)

Y = (; ρq, ρ)
p = (; w, params)

t = 1

set_sedimentation_precomputed_quantities(Y, p, t)

it fails with

ERROR: LoadError: InvalidIRError: compiling MethodInstance for ClimaCoreCUDAExt.knl_copyto!(::ClimaCore.DataLayouts.VIJFH{…}, ::Base.Broadcast.Broadcasted{…}, ::ClimaCore.DataLayouts.UniversalSize{…}) resulted in invalid LLVM IR
Reason: unsupported dynamic function invocation (call to _broadcast_getindex_evalf(f::Tf, args::Vararg{Any, N}) where {Tf, N} @ Base.Broadcast broadcast.jl:709)

Here is the full test: https://github.com/CliMA/CloudMicrophysics.jl/blob/main/test/gpu_clima_core_test.jl

This looks like the same error I'm getting in ClimaAtmos when trying to add cloud sedimentation (CliMA/ClimaAtmos.jl#3442). Maybe it will be easier to debug here?

@trontrytel trontrytel added the GPU label Dec 13, 2024
@trontrytel
Member Author

Does that mean I could try putting the whole Atmos driver into a function?

@charleskawczynski
Member

Sorry I'm arriving a bit late to this.

The issue here is with this line:

    @. w = CMN.terminal_velocity(
        params.liquid,
        params.Ch2022.rain,
        Y.ρ,
        max(0, Y.ρq / Y.ρ),
    )

Julia's compiler is not able to infer through the recursive getindex in Base's broadcast machinery.

This is because we create a tuple of tuples of objects, (((params.liquid, ), (params.Ch2022.rain, ))), and the compiler struggles to infer the unwrapping of this object.
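To make the nesting concrete, here is a minimal CPU-only sketch of the lazy broadcast object that an expression of this shape builds. `ToyParams` and `toy_velocity` are hypothetical stand-ins (not the CloudMicrophysics types or the real terminal velocity formula), and plain `Array`s replace the ClimaCore fields, so this only illustrates the argument structure and does not reproduce the GPU failure:

```julia
import Base.Broadcast: Broadcasted, broadcasted, materialize

# Hypothetical stand-in for a parameter struct that broadcasts as a scalar.
struct ToyParams{FT}
    c::FT
end
Base.Broadcast.broadcastable(x::ToyParams) = tuple(x)

# Hypothetical stand-in for CMN.terminal_velocity (not the real formula).
toy_velocity(p1, p2, ρ, q) = p1.c * p2.c * q / ρ

ρ = ones(Float32, 3)
ρq = fill(1f-3, 3)
p1 = ToyParams(2f0)
p2 = ToyParams(3f0)

# Roughly what `@. toy_velocity(p1, p2, ρ, max(0, ρq / ρ))` builds
# before it is materialized.
bc = broadcasted(
    toy_velocity,
    p1,
    p2,
    ρ,
    broadcasted(max, 0, broadcasted(/, ρq, ρ)),
)

# Each parameter struct is wrapped in its own 1-tuple, and the last
# argument is itself a nested Broadcasted object.
@show typeof(bc.args[1])
@show bc.args[4] isa Broadcasted

result = materialize(bc)
```

Every element access has to recurse through `bc.args`, unwrapping the 1-tuples and the nested `Broadcasted` objects along the way. On the CPU a failure to infer that recursion only costs performance; in a GPU kernel it surfaces as the dynamic function invocation error reported above.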

This limitation may be fixed in later versions of Julia (e.g., this PR), so I'd rather not suggest changing the way terminal_velocity is defined; that should work just fine.

The posted code works inside a function but fails outside one most likely because inference is a heuristic: other things can affect inference (and on the GPU, that translates to the function compiling or not).

One simple (albeit a bit inconvenient) solution is to require users to define a wrapper function. We can change:

    @. w = CMN.terminal_velocity(
        params.liquid,
        params.Ch2022.rain,
        Y.ρ,
        max(0, Y.ρq / Y.ρ),
    )

to

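# Bundle the two parameter structs so the broadcast sees one argument.
struct TwoParams{A, B}
    a::A
    b::B
end
# Treat the bundle as a scalar inside broadcast expressions.
Base.Broadcast.broadcastable(x::TwoParams) = tuple(x)

# Unpack the bundle inside the kernel function instead of in the broadcast.
wrapper_terminal_velocity(p, ρ, x) = CMN.terminal_velocity(p.a, p.b, ρ, x)

p = TwoParams(params.liquid, params.Ch2022.rain)
@. w = wrapper_terminal_velocity(
    p,
    Y.ρ,
    max(0, Y.ρq / Y.ρ),
)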
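For anyone who wants to try the pattern without a GPU or the CliMA packages, here is a self-contained CPU sketch. `LiquidParams`, `RainVelParams` and `toy_velocity` are made-up stand-ins for the CloudMicrophysics parameter structs and for `CMN.terminal_velocity` (the formula is arbitrary), and plain `Array`s replace the ClimaCore fields. It only shows how the bundling works mechanically; it says nothing about whether the GPU kernel compiles:

```julia
# Hypothetical stand-ins for the CloudMicrophysics parameter structs.
struct LiquidParams{FT}
    r0::FT
end
struct RainVelParams{FT}
    a::FT
    b::FT
end

# Stand-in for CMN.terminal_velocity; the formula is made up.
toy_velocity(liq, rain, ρ, q) = rain.a * (q * liq.r0)^rain.b / ρ

# Bundle both parameter structs into a single broadcast argument.
struct TwoParams{A, B}
    a::A
    b::B
end
# Treat the bundle as a scalar inside broadcast expressions.
Base.Broadcast.broadcastable(x::TwoParams) = tuple(x)

# Unpack the bundle inside the pointwise function.
wrapper_velocity(p, ρ, q) = toy_velocity(p.a, p.b, ρ, q)

ρ = ones(Float32, 4)
ρq = fill(1f-3, 4)
w = zeros(Float32, 4)

p = TwoParams(LiquidParams(2f0), RainVelParams(3f0, 1f0))
@. w = wrapper_velocity(p, ρ, max(0, ρq / ρ))
```

The broadcast now carries one wrapped parameter argument instead of two, which is the part of the argument tuple the wrapper is meant to simplify.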

@charleskawczynski
Member

charleskawczynski commented Jan 7, 2025

I chatted with @trontrytel, and the example below is a counterexample to my suggested solution pattern: we can't always group parameters. This may be because the rain object in the example has both floats and tuples.

Here is a more boiled-down failing reproducer that uses only CloudMicrophysics and ClimaCore DataLayouts:

import ClimaCore: DataLayouts
import CUDA
import CloudMicrophysics.Parameters as CMP
import CloudMicrophysics.MicrophysicsNonEq as CMN

struct HelperParams{A, B}
    liquid::A
    rain::B
end
Base.Broadcast.broadcastable(x::HelperParams) = tuple(x)

wrapper_terminal_velocity(p_hlp, ρ, q) = CMN.terminal_velocity(
    p_hlp.liquid, p_hlp.rain, ρ, q
)

function set_sedimentation_precomputed_quantities!(ρ, w, liquid, rain)
    p_hlp = HelperParams(liquid, rain)
    @. w = wrapper_terminal_velocity(
        p_hlp,
        ρ,
        max(0, ρ / ρ),
    )
    return nothing
end

function main(::Type{FT}) where {FT}

    rain = CMP.Chen2022VelType(FT).rain
    liquid = CMP.CloudLiquid(FT)

    ρ = DataLayouts.VIJFH{FT}(CUDA.CuArray{FT}, zeros; Nv=4, Nij=4, Nh=4)
    w = DataLayouts.VIJFH{FT}(CUDA.CuArray{FT}, zeros; Nv=4, Nij=4, Nh=4)
    # ρ = DataLayouts.VIJFH{FT}(Array{FT}, zeros; Nv=4, Nij=4, Nh=4)
    # w = DataLayouts.VIJFH{FT}(Array{FT}, zeros; Nv=4, Nij=4, Nh=4)
    @show ρ

    set_sedimentation_precomputed_quantities!(ρ, w, liquid, rain)
    return nothing
end


using Test
@testset "GPU inference failure" begin
    main(Float64)
end

This error is slightly different from CliMA/ClimaCore.jl#2065, but it seems to be the same issue: the Julia compiler cannot infer the recursive broadcast getindex.

The next thing we need to do is make a reproducer without ClimaCore DataLayouts.
