Update the quickstart documentation. #410

Merged: 1 commit, Aug 1, 2023
2 changes: 1 addition & 1 deletion docs/src/index.md
@@ -25,7 +25,7 @@ backend and depend on

### CUDA
```julia
-using CUDA
+import CUDA
using KernelAbstractions
```
[`CUDA.jl`](https://github.com/JuliaGPU/CUDA.jl) is currently the most mature way to program for GPUs.
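The hunks below launch a `mul2_kernel`; its definition is not part of this diff, but following the quickstart it looks roughly like this sketch, using the standard `@kernel`/`@index` API:

```julia
using KernelAbstractions

# Doubles every element of A in place; @index(Global) gives this
# work-item's position in the global ndrange.
@kernel function mul2_kernel(A)
    I = @index(Global)
    A[I] = 2 * A[I]
end
```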
28 changes: 14 additions & 14 deletions docs/src/quickstart.md
@@ -46,33 +46,32 @@ The [`synchronize`](@ref) blocks the *host* until the kernel has completed on the backend.
## Launching kernel on the backend

To launch the kernel on a supported backend `isa(backend, KA.GPU)` (e.g., `CUDABackend()`, `ROCBackend()`, `oneBackend()`), we generate the kernel
-for this backend provided by `CUDAKernels`, `ROCKernels`, or `oneAPIKernels`.
+for this backend.

First, we initialize the array using the Array constructor of the chosen backend with

```julia
-using CUDAKernels # Required to access CUDABackend
+using CUDA: CuArray
A = CuArray(ones(1024, 1024))
```

```julia
-using ROCKernels # Required to access ROCBackend
+using AMDGPU: ROCArray
A = ROCArray(ones(1024, 1024))
```

```julia
-using oneAPIKernels # Required to access oneBackend
+using oneAPI: oneArray
A = oneArray(ones(1024, 1024))
```
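The same initialization also works without any GPU package: a plain `Array` maps to the built-in `CPU()` backend, so the rest of the walkthrough runs unchanged. A minimal sketch, no GPU assumed:

```julia
using KernelAbstractions

# A host Array dispatches to the built-in CPU() backend.
A = ones(1024, 1024)
@assert get_backend(A) isa CPU
```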
The kernel generation and execution are then
```julia
backend = get_backend(A)
mul2_kernel(backend, 64)(A, ndrange=size(A))
synchronize(backend)
all(A .== 2.0)
```
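After `synchronize`, the result can also be inspected on the host; a minimal sketch, assuming `A` was initialized as above:

```julia
# Bring the data back to host memory for inspection;
# Array(...) works for CuArray, ROCArray, and oneArray alike.
host_A = Array(A)
all(host_A .== 2.0)
```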

For simplicity, we stick with the case of `backend=CUDABackend()`.

## Synchronization
!!! danger
    All kernel launches are asynchronous; use [`synchronize(backend)`](@ref)
@@ -82,23 +81,24 @@ The code around KA may heavily rely on
[`GPUArrays`](https://github.com/JuliaGPU/GPUArrays.jl), for example, to
initialize variables.
```julia
-using CUDAKernels # Required to access CUDABackend
-function mymul(A::CuArray)
+function mymul(A)
    A .= 1.0
-   ev = mul2_kernel(CUDABackend(), 64)(A, ndrange=size(A))
+   backend = get_backend(A)
+   ev = mul2_kernel(backend, 64)(A, ndrange=size(A))
    synchronize(backend)
    all(A .== 2.0)
end
```
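Since `mymul` now accepts any array type, the same function can be exercised on device or host; a usage sketch, assuming CUDA.jl is loaded and a GPU is present:

```julia
# Device-agnostic: the same function runs on GPU and CPU arrays.
mymul(CuArray(ones(1024, 1024)))  # CUDABackend
mymul(ones(1024, 1024))           # built-in CPU backend
```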

```julia
-using CUDAKernels # Required to access CUDABackend
-function mymul(A::CuArray, B::CuArray)
+function mymul(A, B)
    A .= 1.0
    B .= 3.0
-   mul2_kernel(CUDABackend(), 64)(A, ndrange=size(A))
-   mul2_kernel(CUDABackend(), 64)(A, ndrange=size(A))
-   synchronize(CUDABackend())
+   backend = get_backend(A)
+   @assert get_backend(B) == backend
+   mul2_kernel(backend, 64)(A, ndrange=size(A))
+   mul2_kernel(backend, 64)(B, ndrange=size(B))
+   synchronize(backend)
    all(A .+ B .== 8.0)
end
```
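The added `@assert` guards against the two arrays living on different devices; a sketch of what mixing backends would do, assuming CUDA.jl is loaded:

```julia
A = CuArray(ones(1024, 1024))  # CUDABackend
B = ones(1024, 1024)           # host Array: get_backend(B) isa CPU

# The assertion inside mymul fires, because A and B
# resolve to different backends.
mymul(A, B)                    # throws AssertionError
```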