ROCm support #3462
Closed
iotamudelta wants to merge 16 commits into facebookresearch:main from iotamudelta:rocm_support_squashed
Commits (16)
c9604bc ROCm support (iotamudelta)
396635d Pacify clang-format (iotamudelta)
0fc58d2 Revert move of GPU_WRAPPER guard. (iotamudelta)
b57b58b Merge branch 'main' into rocm_support_squashed (iotamudelta)
b999e0e Fix mismerge resulting in undefined symbols. (iotamudelta)
4faa88b Merge branch 'rocm_support_squashed' of github.com:iotamudelta/faiss … (iotamudelta)
626dc1a Merge branch 'main' into rocm_support_squashed (iotamudelta)
9654632 Add Python support. (iotamudelta)
fb2be88 Merge branch 'rocm_support_squashed' of github.com:iotamudelta/faiss … (iotamudelta)
9d2d750 Merge branch 'main' into rocm_support_squashed (iotamudelta)
922a4ed Merge branch 'main' into rocm_support_squashed (iotamudelta)
67e7bfd Merge branch 'main' into rocm_support_squashed (iotamudelta)
fd99f8c Address feedback for cmake infra (ItsPitt)
7db1b3d Format changes (ItsPitt)
8cd59e4 Merge branch 'main' into rocm_support_squashed (iotamudelta)
0283aa9 Removing conditions (ItsPitt)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,212 @@ | ||
#!/bin/bash | ||
|
||
# go one level up from faiss/gpu | ||
top=$(dirname "${BASH_SOURCE[0]}")/.. | ||
echo "top=$top" | ||
cd $top | ||
echo "pwd=`pwd`" | ||
|
||
# create all destination directories for hipified files into sibling 'gpu-rocm' directory | ||
for src in $(find ./gpu -type d) | ||
do | ||
dst=$(echo $src | sed 's/gpu/gpu-rocm/') | ||
echo "Creating $dst" | ||
mkdir -p $dst | ||
done | ||
|
||
# run hipify-perl against all *.cu *.cuh *.h *.cpp files, no renaming | ||
# run all files in parallel to speed up | ||
for ext in cu cuh h cpp | ||
do | ||
for src in $(find ./gpu -name "*.$ext") | ||
do | ||
dst=$(echo $src | sed 's@./gpu@./gpu-rocm@') | ||
hipify-perl -o=$dst.tmp $src & | ||
done | ||
done | ||
wait | ||
|
||
# rename all hipified *.cu files to *.hip | ||
for src in $(find ./gpu-rocm -name "*.cu.tmp") | ||
do | ||
dst=${src%.cu.tmp}.hip.tmp | ||
mv $src $dst | ||
done | ||
|
||
# replace header include statements "<faiss/gpu/" with "<faiss/gpu-rocm/" | ||
# replace thrust::cuda::par with thrust::hip::par | ||
# adjust header path location for hipblas.h to avoid unnecessary deprecation warnings | ||
# adjust header path location for hiprand_kernel.h to avoid unnecessary deprecation warnings | ||
for ext in hip cuh h cpp | ||
do | ||
for src in $(find ./gpu-rocm -name "*.$ext.tmp") | ||
do | ||
sed -i 's@#include <faiss/gpu/@#include <faiss/gpu-rocm/@' $src | ||
sed -i 's@thrust::cuda::par@thrust::hip::par@' $src | ||
sed -i 's@#include <hipblas.h>@#include <hipblas/hipblas.h>@' $src | ||
sed -i 's@#include <hiprand_kernel.h>@#include <hiprand/hiprand_kernel.h>@' $src | ||
done | ||
done | ||
|
||
# hipify was run in parallel above | ||
# don't copy the tmp file if it is unchanged | ||
for ext in hip cuh h cpp | ||
do | ||
for src in $(find ./gpu-rocm -name "*.$ext.tmp") | ||
do | ||
dst=${src%.tmp} | ||
if test -f $dst | ||
then | ||
if diff -q $src $dst >& /dev/null | ||
then | ||
echo "$dst [unchanged]" | ||
rm $src | ||
else | ||
echo "$dst" | ||
mv $src $dst | ||
fi | ||
else | ||
echo "$dst" | ||
mv $src $dst | ||
fi | ||
done | ||
done | ||
|
||
# copy over CMakeLists.txt | ||
for src in $(find ./gpu -name "CMakeLists.txt") | ||
do | ||
dst=$(echo $src | sed 's@./gpu@./gpu-rocm@') | ||
if test -f $dst | ||
then | ||
if diff -q $src $dst >& /dev/null | ||
then | ||
echo "$dst [unchanged]" | ||
else | ||
echo "$dst" | ||
cp $src $dst | ||
fi | ||
else | ||
echo "$dst" | ||
cp $src $dst | ||
fi | ||
done | ||
|
||
# Copy over other files | ||
for ext in py | ||
do | ||
for src in $(find ./gpu -name "*.$ext") | ||
do | ||
dst=$(echo $src | sed 's@./gpu@./gpu-rocm@') | ||
if test -f $dst | ||
then | ||
if diff -q $src $dst >& /dev/null | ||
then | ||
echo "$dst [unchanged]" | ||
else | ||
echo "$dst" | ||
cp $src $dst | ||
fi | ||
else | ||
echo "$dst" | ||
cp $src $dst | ||
fi | ||
done | ||
done | ||
|
||
|
||
################################################################################### | ||
# C_API Support | ||
################################################################################### | ||
|
||
# Now get the c_api dir | ||
# This points to the faiss/c_api dir | ||
top_c_api=$(dirname "${BASH_SOURCE[0]}")/../../c_api | ||
echo "top_c_api=$top_c_api" | ||
cd ../$top_c_api | ||
echo "pwd=`pwd`" | ||
|
||
|
||
# create all destination directories for hipified files into sibling 'gpu-rocm' directory | ||
for src in $(find ./gpu -type d) | ||
do | ||
dst=$(echo $src | sed 's/gpu/gpu-rocm/') | ||
echo "Creating $dst" | ||
mkdir -p $dst | ||
done | ||
|
||
# run hipify-perl against all *.cu *.cuh *.h *.cpp *.c files, no renaming | ||
# run all files in parallel to speed up | ||
for ext in cu cuh h cpp c | ||
do | ||
for src in $(find ./gpu -name "*.$ext") | ||
do | ||
dst=$(echo $src | sed 's@./gpu@./gpu-rocm@') | ||
hipify-perl -o=$dst.tmp $src & | ||
done | ||
done | ||
wait | ||
|
||
# rename all hipified *.cu files to *.hip | ||
for src in $(find ./gpu-rocm -name "*.cu.tmp") | ||
do | ||
dst=${src%.cu.tmp}.hip.tmp | ||
mv $src $dst | ||
done | ||
|
||
# replace header include statements "<faiss/gpu/" with "<faiss/gpu-rocm/" | ||
# replace thrust::cuda::par with thrust::hip::par | ||
# adjust header path location for hipblas.h to avoid unnecessary deprecation warnings | ||
# adjust header path location for hiprand_kernel.h to avoid unnecessary deprecation warnings | ||
for ext in hip cuh h cpp c | ||
do | ||
for src in $(find ./gpu-rocm -name "*.$ext.tmp") | ||
do | ||
sed -i 's@#include <faiss/gpu/@#include <faiss/gpu-rocm/@' $src | ||
sed -i 's@thrust::cuda::par@thrust::hip::par@' $src | ||
sed -i 's@#include <hipblas.h>@#include <hipblas/hipblas.h>@' $src | ||
sed -i 's@#include <hiprand_kernel.h>@#include <hiprand/hiprand_kernel.h>@' $src | ||
done | ||
done | ||
|
||
# hipify was run in parallel above | ||
# don't copy the tmp file if it is unchanged | ||
for ext in hip cuh h cpp c | ||
do | ||
for src in $(find ./gpu-rocm -name "*.$ext.tmp") | ||
do | ||
dst=${src%.tmp} | ||
if test -f $dst | ||
then | ||
if diff -q $src $dst >& /dev/null | ||
then | ||
echo "$dst [unchanged]" | ||
rm $src | ||
else | ||
echo "$dst" | ||
mv $src $dst | ||
fi | ||
else | ||
echo "$dst" | ||
mv $src $dst | ||
fi | ||
done | ||
done | ||
|
||
# copy over CMakeLists.txt | ||
for src in $(find ./gpu -name "CMakeLists.txt") | ||
do | ||
dst=$(echo $src | sed 's@./gpu@./gpu-rocm@') | ||
if test -f $dst | ||
then | ||
if diff -q $src $dst >& /dev/null | ||
then | ||
echo "$dst [unchanged]" | ||
else | ||
echo "$dst" | ||
cp $src $dst | ||
fi | ||
else | ||
echo "$dst" | ||
cp $src $dst | ||
fi | ||
done |
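The script above repeats the same "install only if changed" test five times (the `.tmp` files and `CMakeLists.txt` in both trees, plus the `*.py` copy). As a minimal sketch of how that pattern could be factored out, the helper below wraps it in one function. The name `copy_if_changed` and its `mode` argument are hypothetical and not part of this PR; `cmp -s` stands in for the `diff -q ... >& /dev/null` used in the diff.

```shell
#!/bin/bash
# Hypothetical helper, not part of the PR: install "src" at "dst" only when
# "dst" is missing or its content differs from "src".
#   mode = mv : src is a temporary (e.g. foo.hip.tmp) and is consumed
#   mode = cp : src is a real source file (e.g. CMakeLists.txt) and is kept
copy_if_changed() {
    local mode=$1 src=$2 dst=$3
    if [ -f "$dst" ] && cmp -s "$src" "$dst"; then
        # Identical content: leave dst (and its modification time) untouched.
        echo "$dst [unchanged]"
        # Only a temporary is removed; a copied source stays in place.
        if [ "$mode" = mv ]; then
            rm -- "$src"
        fi
    else
        # dst is missing or differs: replace it.
        echo "$dst"
        "$mode" -- "$src" "$dst"
    fi
}
```

Under this sketch, the body of each `.tmp` loop would reduce to `copy_if_changed mv "$src" "${src%.tmp}"`, and each copy loop to `copy_if_changed cp "$src" "$dst"`. Quoting the arguments also keeps the helper itself safe for paths containing spaces, although the surrounding `for src in $(find ...)` loops would still split such paths.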
Is this ROCm specific? If so, can we allow 64 only for ROCm?
We have both wavefront-32 (e.g. Navi) and wavefront-64 (e.g. MI250) devices, so this supports both.
It sounds like Nvidia is 32 only and ROCm is 32 or 64. Should we lock it accordingly in code?
If that is desired, I could rework that assert using a ROCm flag to only allow a warpSize of 64 (and 32) on ROCm devices. It shouldn't be an issue at all!
Yeah, I think I would do that.