This branch is 11 commits ahead of, 1 commit behind cpc/hipcl:master.

Name	Name	Last commit message	Last commit date
Latest commit colleeneb and Kerilk Update CMakeLists.txt Sep 21, 2021 dd39656 · Sep 21, 2021 History 145 Commits
bin	bin	Update version to 1.0.0-alpha	Feb 9, 2021
cmake	cmake	CMake: make FindHIP more compatible with upstream HIP	Feb 8, 2021
cxxopts	cxxopts	Add C++ option parser library from https://github.com/jarro2783/cxxopts …	Jan 17, 2020
doc	doc	Add release notes for version 0.9	Apr 15, 2020
include	include	adding double_as_longlong and longlong_as_double	Sep 21, 2021
lib	lib	adding double_as_longlong and longlong_as_double	Sep 21, 2021
llvm_passes	llvm_passes	Adjust HipDynMem pass for LLVM 12+	Jun 29, 2021
samples	samples	Update CMakeLists.txt	Sep 21, 2021
spdlog	spdlog	Initial commit	Jul 26, 2019
CMakeLists.txt	CMakeLists.txt	Bump C++ standard and make it overridable	Jun 29, 2021
LICENSE	LICENSE	Initial commit	Jul 26, 2019
README.md	README.md	Update README.md	Apr 15, 2020
hipcl_config.h.in	hipcl_config.h.in	Update version to 1.0.0-alpha	Feb 9, 2021

Repository files navigation

HIPCL library

What is HIP?

Heterogeneous-compute Interface for Portability, or HIP, is a C++ runtime API and kernel language that allows developers to write code that runs on both AMD and NVidia GPUs. CUDA applications can be converted to HIP in a largely automated fashion.

What is HIPCL

HIPCL is a library that allows applications using the HIP API to be run on devices which support OpenCL and SPIR-V, thus providing a portability path from CUDA to OpenCL. HIPCL development is led by Customized Parallel Computing group of Tampere University, Finland.

Building HIPCL

There are a few extra install/usage options documented in 'doc' directory.

HIPCL has some prerequisites to build:

LLVM + patched Clang
LLVM-SPIRV translator tool from Khronos
An OpenCL implementation with (at least partial) 2.x support; HIPCL requires Shared Virtual Memory and clCreateProgramWithIL() support

Clang + LLVM

You'll need to build a patched Clang that can compile HIP source code to ELF+SPIR-V fat binaries.

Download LLVM + Clang:

git clone https://github.com/llvm-mirror/llvm.git
cd llvm
git checkout -b release_80 origin/release_80
cd tools
git clone https://github.com/cpc/hipcl-clang.git clang
cd clang
git checkout -b release_80 origin/release_80

Build+install LLVM/Clang:

cmake -DCMAKE_INSTALL_PREFIX=<llvm_install_dir> [other cmake flags] llvm-git-directory
make
sudo make install

LLVM-SPIRV Translator

download, build+install the LLVM-SPIRV translator:

git clone https://github.com/KhronosGroup/SPIRV-LLVM-Translator.git
cd SPIRV-LLVM-Translator
git checkout -b release_80 origin/llvm_release_80
mkdir build; cd build
cmake -DLLVM_DIR=<llvm_install_dir>/lib/cmake/llvm ..
make llvm-spirv
sudo cp tools/llvm-spirv/llvm-spirv <llvm_install_dir>/bin/

Known supported OpenCL implementations

At least Intel's "NEO" OpenCL implementation supports 2.x and SPIR-V on Intel GPUs.

It's also possible to use a sufficiently recent (2019/07+) POCL, but it must be built with LLVM-SPIRV support:

git clone https://github.com/pocl/pocl.git
cd pocl
mkdir build; cd build
cmake -DCMAKE_INSTALL_PREFIX=/usr \
      -DWITH_LLVM_CONFIG=<llvm_install_dir>/bin/llvm-config \
      -DLLVM_SPIRV=<llvm_install_dir>/bin/llvm-spirv \
      ..
make
sudo make install

The last step (sudo make install) is optional - it's possible to use Pocl from build directory (by exporting some env variables: POCL_BULDING=1 and OCL_ICD_VENDORS=<pocl-build-dir>/ocl-vendors). Note that -DCMAKE_INSTALL_PREFIX=/usr implies system-wide installation. See https://github.com/pocl/pocl/blob/master/doc/sphinx/source/install.rst for details.

Whatever you end up using, make sure that clinfo lists your chosen OpenCL implementation.

Build HIPCL library

build+install the HIPCL library:

git clone https://github.com/cpc/hipcl.git
cd hipcl
mkdir build ; cd build;
cmake -DCMAKE_INSTALL_PREFIX=<hipcl_install_dir> \
      -DCMAKE_CXX_COMPILER=<llvm_install_dir>/bin/clang++ \
      -DCMAKE_C_COMPILER=<llvm_install_dir>/bin/clang \
      ..
make

CMAKE_INSTALL_PREFIX defaults to /opt/hipcl. The samples directory contains some examples; these can be run from build directory, individually or via ctest.

make install will create <hipcl_install_dir>/{lib/libhipcl.so, share/kernellib.bc, include/hip} and copy the examples to <hipcl_install_dir>/bin/samples directory.

Note that CMake removes RPATH at make install time, which means that the samples installed into <hipcl_install_dir>/bin will look for libhipcl.so in the default system library paths (/usr/lib and such).

Using HIPCL library

HIPCL provides a CMake export target named hip::hipcl. Using it from CMake is therefore straightforward:

find_package(HIP REQUIRED CONFIG PATHS "${HIPCL_INSTALL_PREFIX}")
target_link_libraries(your-executable hip::hipcl)

This will automatically add all required flags. Note that you must compile your project with CMAKE_CXX_COMPILER set to the Clang built in the first step.

For using outside CMake, there is a ${HIPCL_INSTALL_PREFIX}/bin/hipcl_config binary which prints the required flags. Manually you can build using this command:

<llvm_install_dir>/bin/clang++ -pthread -fPIE -O2 -g -std=c++11 `hipcl_config -C` -o binary source.cc -Wl,-rpath,<hipcl_install_prefix>/lib -L<hipcl_install_prefix>/lib -lhipcl

To see what compilation commands are actually run, and get the intermediate files (including the SPIR-V), add -v --save-temps to the compilation flags. Intermediate files will be saved into the current working directory. The SPIR-V that ends up embedded in the ELF binary is in a file named "a.out-hip-spir64-unknown-unknown-sm_20".

CUDA conversion example

To convert a CUDA source to HIP source, use the hipify-clang tool from AMD's HIP repository: https://github.com/ROCm-Developer-Tools/HIP/tree/master/hipify-clang

Usage:

hipify-clang [hipify args] -- [clang cuda args]

E.g.

./hipify-clang -inplace -print-stats example.cu -- -x cuda --cuda-path=/usr/local/cuda-8.0 -I /usr/local/cuda-8.0/samples/common/inc

This should produce a source with CUDA API translated to HIP API calls. To build a HIPCL executable from this source, see above Using HIPCL library.

Frequently encountered issues

clEnqueueSVMMemCopy() failed with error -5 - this appears to be a driver bug on Intel GPUs, occurs when one tries to memcpy from read-only data stored in ELF to SVM memory. SVMMemCopy from other sources (stack / heap) works without issues.
programs may take a long time to start. This is because there Clang inserts startup hooks which register SPIR-V binaries; HIPCL at this point compiles each, and for each program built, creates all kernels. This can take a long time on some implementations.
HIPCL reports the global memory size from OpenCL as available memory, but unlike CUDA, it's not possible to allocate all of that memory in a single block; HIPCL is limited by CL_DEVICE_MAX_MEM_ALLOC_SIZE.

Known HIPCL-Clang issues

Using HIP_DYNAMIC_SHARED() macro outside a function scope is not yet supported. Doing so will likely result in error: Assertion FuncSet.size() <= 1 && "more than one function uses dynamic mem variable!"' failed.
There are unfortunately still some unresolved compiler bugs present in the HIPCL patches to Clang, so compilation may fail, especially when HIPCL is compiled with -O0 flag.

Known libhipcl issues

Some of these are simply not yet implemented, some are missing because they would require an OpenCL extension.

Device Side / Math Library

OpenCL Extension required:

__fsqrt_rd and various intrinsics for add/sub/div/mul with predefined rounding mode (currently these are mapped to OpenCL variants with default rounding mode)
__shfl and friends are only available on Intel via cl_intel_subgroups extension.

Host runtime API

Implemented with caveats:

hipEventElapsedTime() can return imprecise values
hipModuleLaunchKernel accepts the "extra" argument, but the size of pointed to memory (HIP_LAUNCH_PARAM_BUFFER_SIZE) must be exactly the sum of sizes of individual arguments - no padding is allowed. Otherwise it's impossible to figure out how to set the OpenCL kernel arguments.
A certain amount of Device properties are impossible to get via OpenCL API. Values reported by HIPCL are completely made up.

Not implemented and/or require extension to OpenCL:

hipError_t hipModuleLoadData(hipModule_t* module, const void* image);
hipError_t hipModuleLoadDataEx(hipModule_t *module, const void *image,...);

This API is not possible to implement with SPIR-V binaries, because there is no size parameter (only a void* pointer), and SPIR-V binaries don't have their size embedded. It might be possible to implement with disassembled text format of SPIR-V.
hipSetDeviceFlags(unsigned flags)

The flags change how the runtime waits for results (yield thread to OS or busy waiting / spinning)
hipError_t hipStreamCreateWithPriority(hipStream_t* stream, unsigned int flags, int priority);
hipError_t hipStreamCreateWithFlags(hipStream_t* stream, unsigned int flags);
hipError_t hipEventCreateWithFlags(hipEvent_t* event, unsigned flags);
hipError_t hipDeviceSetCacheConfig(hipFuncCache_t cacheConfig);
hipError_t hipDeviceGetCacheConfig(hipFuncCache_t* cacheConfig);
hipError_t hipDeviceGetSharedMemConfig(hipSharedMemConfig* pConfig);
hipError_t hipDeviceSetSharedMemConfig(hipSharedMemConfig config);
hipError_t hipFuncSetCacheConfig(const void* func, hipFuncCache_t config);
hipError_t hipCtxSetCacheConfig(hipFuncCache_t cacheConfig);
hipError_t hipCtxSetSharedMemConfig(hipSharedMemConfig config);
hipError_t hipPointerGetAttributes(hipPointerAttribute_t* attributes, const void* ptr);
hipError_t hipExtMallocWithFlags(void** ptr, size_t sizeBytes, unsigned int flags);
hipError_t hipGetDeviceProperties(hipDeviceProp_t *prop, int deviceId);
hipError_t hipStreamQuery(hipStream_t stream)
hipError_t hipModuleGetGlobal(hipDeviceptr_t* dptr, size_t* bytes, hipModule_t hmod, const char* name);
hipError_t hipHostGetFlags(unsigned int* flagsPtr, void* hostPtr);
hipError_t hipFuncGetAttributes(hipFuncAttributes* attr, const void* func);
hipError_t hipDeviceGetPCIBusId(char* pciBusId, int len, int device);
hipError_t hipDeviceGetByPCIBusId(int* device, const char* pciBusId);
hipError_t hipSetDeviceFlags(unsigned flags);

Symbol API Not Implemented

hipError_t hipMemcpyToSymbolAsync(void*, const void*, size_t, size_t, hipMemcpyKind, hipStream_t, const char*);
hipError_t hipMemcpyFromSymbol(void*, const void*, size_t, size_t, hipMemcpyKind, const char*);
hipError_t hipMemcpyFromSymbolAsync(void*, const void*, size_t, size_t, hipMemcpyKind, hipStream_t, const char*);
hipError_t hipGetSymbolAddress(void** devPtr, const void* symbolName);
hipError_t hipGetSymbolSize(size_t* size, const void* symbolName);
hipError_t hipMemcpyToSymbol(void*, const void*, size_t, size_t, hipMemcpyKind, const char*);
hipError_t hipMemcpyToSymbol(const void* symbolName, const void* src, size_t sizeBytes, size_t offset __dparm(0), hipMemcpyKind kind __dparm(hipMemcpyHostToDevice));

Peer2Peer Functions Are Not Implemented Yet

hipError_t hipDeviceCanAccessPeer(int* canAccessPeer, int deviceId, int peerDeviceId);
hipError_t hipDeviceEnablePeerAccess(int peerDeviceId, unsigned int flags);
hipError_t hipDeviceDisablePeerAccess(int peerDeviceId);
hipError_t hipMemcpyPeer(void* dst, int dstDeviceId, const void* src, int srcDeviceId, size_t sizeBytes);
hipError_t hipMemcpyPeerAsync(void* dst, int dstDeviceId, const void* src, int srcDevice, size_t sizeBytes, hipStream_t stream __dparm(0));

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

HIPCL library

What is HIP?

What is HIPCL

Building HIPCL

Clang + LLVM

LLVM-SPIRV Translator

Known supported OpenCL implementations

Build HIPCL library

Using HIPCL library

CUDA conversion example

Frequently encountered issues

Known HIPCL-Clang issues

Known libhipcl issues

Device Side / Math Library

Host runtime API

Symbol API Not Implemented

Peer2Peer Functions Are Not Implemented Yet

PROFILER Not implemented

API CALLBACKs Not implemented

TEXTURES not implemented

About

Releases

Packages

Languages

License

Kerilk/hipcl

Folders and files

Latest commit

History

Repository files navigation

HIPCL library

What is HIP?

What is HIPCL

Building HIPCL

Clang + LLVM

LLVM-SPIRV Translator

Known supported OpenCL implementations

Build HIPCL library

Using HIPCL library

CUDA conversion example

Frequently encountered issues

Known HIPCL-Clang issues

Known libhipcl issues

Device Side / Math Library

Host runtime API

Symbol API Not Implemented

Peer2Peer Functions Are Not Implemented Yet

PROFILER Not implemented

API CALLBACKs Not implemented

TEXTURES not implemented

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages