Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

simplify raft build #2983

Closed
wants to merge 3 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 9 additions & 16 deletions .circleci/config.yml
Original file line number Diff line number Diff line change
Expand Up @@ -271,7 +271,15 @@ workflows:
exec: linux-x86_64-gpu
gpu: "ON"
requires:
- Linux x86_64 (cmake)
- Linux x86_64 AVX2 (cmake)
- build_cmake:
name: Linux x86_64 GPU w/ RAFT (cmake)
exec: linux-x86_64-gpu
opt_level: "avx2"
gpu: "ON"
raft: "ON"
requires:
- Linux x86_64 GPU (cmake)
- build_conda:
name: Linux x86_64 (conda)
exec: linux-x86_64-cpu
Expand Down Expand Up @@ -378,18 +386,3 @@ workflows:
name: Linux arm64 nightlies
exec: linux-arm64-cpu
label: nightly

raft_nightly:
triggers:
- schedule:
cron: "0 0 * * *"
filters:
branches:
only:
- main
jobs:
- build_cmake:
name: Linux x86_64 GPU w/ RAFT (cmake)
exec: linux-x86_64-gpu
gpu: "ON"
raft: "ON"
4 changes: 1 addition & 3 deletions CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -63,9 +63,7 @@ if(FAISS_ENABLE_GPU)
endif()

if(FAISS_ENABLE_RAFT)
rapids_cpm_init()
include(cmake/thirdparty/get_raft.cmake)
include(cmake/thirdparty/get_cutlass.cmake)
find_package(raft COMPONENTS compiled distributed)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cjnolet I wanted to try this, based on the documentation: https://docs.rapids.ai/api/raft/stable/build/#using-c-pre-compiled-shared-libraries

Wouldn't this be the preferred and simplest way to add the dependencies?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @algoriddle, We probably want to keep the cpm stuff here so that RAFT will automatically be built if it's not already installed. I suspect the issue we're facing is that two different versions of RAFT are being installed (e.g. different version from conda than what's being pulled inside cpm) and this is putting the headers from both on the search path, thus causing build issues.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, I'm pretty sure that's the case here because if the same version of RAFT were being downloaded from conda that was specified in cpm then removing one of them wouldn't have fixed a failing RAFT build. I think we should lock both to 23.08 nightlies.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@algoriddle, it looks like we're fetching 23.06 for cpm. Are we installing 23.08 from conda by chance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an advantage of building RAFT if it's not installed? The Faiss build scripts don't build other dependencies (eg. MKL) either, but we expect them to be installed in the build environment. The downside of building RAFT when not available is that we can accidentally create a dependency on the locally built version - either a compile time or runtime dependency, neither of which is desired. We would like to only depend on the version of libraft that we explicitly install and compile with.

We install the nightly of libraft in the environment in what we call the 'cmake' contbuild (which doesn't use conda build, but invokes cmake+make directly. See the initialization of the build environment here: https://github.com/facebookresearch/faiss/blob/main/.circleci/config.yml#L191). I suggest we keep it this way (use the nightly of libraft), because it will allow us to notice breaking changes in upstream.

When I will create the workflow for building a conda package, I will be pinning the version of libraft to an appropriate version upstream, which I presume will be 23.08 initially. In that case the version pinning will be done in the conda package specification (meta.yaml). We will want the cmake script to build against that version of libraft and create a conda package dependency.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@algoriddle

I suggest we keep it this way

Yeah that's totally fine w/ me. In the FAISS docs, we could just point the users to RAFT itself if they want to build it and install it from source.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@algoriddle, we have also been kicking around the idea on RAFT to produce a static link library, which might actually make it easier for FAISS since

  1. only the actual used bits would be included in the FAISS binary, and
  2. downstream users of pre-build FAISS binaries would not have to worry about installing RAFT at all.

The second part there would also alleviate the potential that a user might have a different version fo RAFT already installed on their system when using FAISS.

Would there be any interest in using a static link library for RAFT if we produce one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any risk of static linking? If an end user uses faiss, but also loads libraft dynamically into a process (because they use some libraft features directly), would there be any problems or limitations imposed on the use of faiss (with a statically linked instance of libraft) + a dynamically loaded instance of libraft at the same time? Is there any global state in libraft? Are there any issues sharing data pointers between two instances of libraft (one statically linked, the other dynamically loaded) within the same process?

endif()

add_subdirectory(faiss)
Expand Down
107 changes: 0 additions & 107 deletions cmake/thirdparty/get_cutlass.cmake

This file was deleted.

52 changes: 0 additions & 52 deletions cmake/thirdparty/get_raft.cmake

This file was deleted.