Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-44950: [C++] Bump minimum CMake version to 3.25 #44989

Draft
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

raulcd
Copy link
Member

@raulcd raulcd commented Dec 10, 2024

This is currently under development. I am only exercising CI to find out what things would require to be updated
This is still being decided and a decision hasn't been finalized yet.

Rationale for this change

We want to upgrade our CMake version to 3.25 as discussed on the ML:
https://lists.apache.org/thread/h8jp16ktrj11fmjmjhlg6xvkvv9wzvjk

What changes are included in this PR?

TBD

Are these changes tested?

Yes, via CI.

Are there any user-facing changes?

Yes, the minimum CMake version to be used to build Arrow is bumped to 3.25.
This PR includes breaking changes to build systems.

Copy link

⚠️ GitHub issue #44950 has been automatically assigned in GitHub to PR creator.

@raulcd
Copy link
Member Author

raulcd commented Dec 10, 2024

I am unsure on how to fix the remaining failures for R.
For the gcc 12 job I am unsure why it is failing and for the Windows C++ RTools 40 ucrt64 it seems we install CMake from MINGW here but I am not sure if this is necessary or can be updated.
https://github.com/apache/arrow/blob/main/ci/scripts/PKGBUILD#L39

Of course this is the initial CI (we also have to update all the extended CI jobs for crossbow).

@h-vetinari
Copy link
Contributor

For the gcc 12 job I am unsure why it is failing

Updating the CMake lower bound will flip the default of several policies from legacy to new; sounds like you might be relying on legacy behaviour there in some way (once you figure out which policy is at fault, there's usually a migration path to keep the old behaviour)

@raulcd
Copy link
Member Author

raulcd commented Dec 11, 2024

ok, it seems R forces the builds to use the CMake provided on the images:

**** Not using cmake found at /bin/cmake
Error in .make_numeric_version(x, strict, .standard_regexps()$valid_numeric_version) : 
  invalid non-character version specification 'x' (type: double)
Calls: build_libarrow ... as.numeric_version -> numeric_version -> .make_numeric_version
Execution halted

@jonkeane @assignUser @amoeba will this be an issue for CRAN? Are we somehow forced to the CMake version on those images?

root@03abb5b759ba:/# /bin/cmake --version
cmake version 3.22.1

CMake suite maintained and supported by Kitware (kitware.com/cmake).

@assignUser
Copy link
Member

assignUser commented Dec 11, 2024

I went through the logs of our recent checks on cran and only one is using a version < 3.25 and that seems more incidental then purposely as it's the r-odrel arm64 but the intel version has 3.26

So I don't think we should be forced to use that cmake version, additionally we have a function that fetches current cmake if an unsuited version is found but apparently there is an issue with it as seen above. IIRC there was a change to numeric version in one of the las R Versions that is causing this? I'll have a look.

@nealrichardson
Copy link
Member

It's possible the version comparison error is from this: https://github.com/apache/arrow/pull/44989/files#diff-935746c34b16289a07b0d9bf7642dbd268b18059b6187f7cdec7c464be47a3deL731-L743

cmake_version <- function(cmd = "cmake") {
  tryCatch(
    {
      raw_version <- system(paste(cmd, "--version"), intern = TRUE, ignore.stderr = TRUE)
      pat <- ".* ([0-9\\.]+).*?"
      which_line <- grep(pat, raw_version)
      package_version(sub(pat, "\\1", raw_version[which_line]))
    },
    error = function(e) {
      return(0)
    }
  )
}

The error case should probably return("0")

@nealrichardson
Copy link
Member

Two other places in the R nixlibs.R script worth updating:

@pitrou
Copy link
Member

pitrou commented Dec 11, 2024

The error case should probably return("0")

It would probably be more forward-looking to avoid the error entirely. Why does the function fail parsing the CMake version? Can we add cmake --version somewhere in the GH workflow?

@nealrichardson
Copy link
Member

nealrichardson commented Dec 11, 2024

The error case should probably return("0")

It would probably be more forward-looking to avoid the error entirely. Why does the function fail parsing the CMake version? Can we add cmake --version somewhere in the GH workflow?

I could be remembering wrong, but I believe the function is used to check for cmake of a certain version, and this is to be robust to where cmake may not be installed or not found at the path provided. It does not emit an error, it traps it.

This might not be where the error is coming from that was observed in CI, I was just browsing the source to see where you might get a numeric version error. Looking again, and reading the output it produced, I think we're hitting this: https://github.com/apache/arrow/pull/44989/files#diff-935746c34b16289a07b0d9bf7642dbd268b18059b6187f7cdec7c464be47a3deL718

      } else {
        # Keep trying
        lg("Not using cmake found at %s", path, .indent = "****")
        if (found_version > 0) {
          lg("Version >= %s required; found %s", version_required, found_version, .indent = "*****")
        } else {

should be found_version > "0". We must not have been running into this before because the "found_version" was always sufficient if found, and if it wasn't found, we were returning a numeric 0 which works in this comparison.

(To be clear, we need to fix both this and the return("0") above.)

.env Show resolved Hide resolved
@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting committer review Awaiting committer review labels Dec 12, 2024
@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Dec 13, 2024
@jonkeane
Copy link
Member

I've pushed the changes Neal suggested which should fix the ubuntu failure 🤞 (hope you don't mind, @raulcd !)

@jonkeane
Copy link
Member

I did a bit of digging on the windows front, and I suspect what's going on is the the MSYS2/mingw cmake is being preferred (I'm not familiar enough with that ecosystem to know if we can override it with something on the system, but that also seems fragile itself).

https://github.com/raulcd/arrow/blob/e5d521134db4bed8507572fece3b56eb4a1b9158/ci/scripts/r_windows_build.sh#L26-L30 has some info about how to test newer dependencies, I wonder if we (conditionally) use those pacman commands to insall a newer cmake than what's in the CRAN repo (https://cloud.r-project.org/bin/windows/Rtools/4.0/ucrt64/ has the version I'm seeing installed listed, so if we override that one in our builds that might be sufficient.

@nealrichardson
Copy link
Member

I did a bit of digging on the windows front, and I suspect what's going on is the the MSYS2/mingw cmake is being preferred (I'm not familiar enough with that ecosystem to know if we can override it with something on the system, but that also seems fragile itself).

https://github.com/raulcd/arrow/blob/e5d521134db4bed8507572fece3b56eb4a1b9158/ci/scripts/r_windows_build.sh#L26-L30 has some info about how to test newer dependencies, I wonder if we (conditionally) use those pacman commands to insall a newer cmake than what's in the CRAN repo (https://cloud.r-project.org/bin/windows/Rtools/4.0/ucrt64/ has the version I'm seeing installed listed, so if we override that one in our builds that might be sufficient.

Unfortunately, that won't help us. https://github.com/r-windows/rtools-packages/ is archived; CRAN has moved away from the toolchain that Jeroen was maintaining. We should delete that comment.

For newer cmake with older rtools, maybe we can get it from https://github.com/Kitware/CMake/releases/download/v3.31.2/cmake-3.31.2-windows-x86_64.zip or something? Or maybe we can install newer rtools for cmake but use the older rtools for compilers etc.?

@raulcd
Copy link
Member Author

raulcd commented Dec 17, 2024

I don't understand why the job is installing mingw-w64-ucrt-x86_64-cmake-3.21.3-1-any.pkg.tar.x when the remote package seems to be updated to a newer version: https://packages.msys2.org/packages/mingw-w64-ucrt-x86_64-cmake
And we seem to be updating to pull the newest packages:

+ pacman --noconfirm -Syy
:: Synchronizing package databases...
downloading mingw32.db...
downloading mingw64.db...
downloading ucrt64.db...
downloading mirrors.db...

edit:
Ok, I see this is using CRAN mirrors here: https://cloud.r-project.org/bin/windows/Rtools/4.0/ucrt64/

Probably naive question here, I suppose we can't update the mirrors here to point to msys2 and download the newer cmake from there before building Arrow as we would have the same issue when we try to publish to CRAN, right? @jonkeane @nealrichardson

@nealrichardson
Copy link
Member

Probably naive question here, I suppose we can't update the mirrors here to point to msys2 and download the newer cmake from there before building Arrow as we would have the same issue when we try to publish to CRAN, right? @jonkeane @nealrichardson

Honestly I don't know. But it's not relevant for CRAN because (a) CRAN currently only builds on R 4.3 and 4.4, both of which have new enough cmake, and (b) we don't require cmake on CRAN anyway because we build the libarrow C++ library in our CI and download it in the CRAN build--CRAN only compiles the R bindings, which do not require cmake.

Solving the cmake issue for R < 4.3 is only for our own purposes of building C++ libraries that are compatible with older versions of R on Windows. We support more versions of R than CRAN actively checks on, though it appears that in CI, we only check on the Windows current release version, and when we test old R versions, we do it on linux.

IIUC we're building libarrow with the rtools4.0 toolchain for maximum compatibility. If we want to continue doing that, we could try installing newer cmake somewhere before https://github.com/apache/arrow/blob/main/.github/workflows/r.yml#L293 and making sure it's on the PATH. Or, to bump up, we would change the rtools version in that job from 40 to 43 to use the rtools43 toolchain with newer cmake.

@pitrou
Copy link
Member

pitrou commented Dec 18, 2024

We support more versions of R than CRAN actively checks on

Do we still want to do that? How is our R support policy decided?

@nealrichardson
Copy link
Member

We support more versions of R than CRAN actively checks on

Do we still want to do that? How is our R support policy decided?

We generally follow the tidyverse version support policy, which is the most recent 5 versions. Many enterprise users of R don't upgrade versions eagerly so they can be stuck on older versions longer than CRAN's testing window.

If there is an easy way to get new enough cmake into the R windows-cpp job so that we can keep building with the R 4.0 toolchain, that would be ideal. I don't see why we couldn't do that.

Copy link

Revision: 79de380

Submitted crossbow builds: ursacomputing/crossbow @ actions-53b79b55f3

Task Status
centos-7-amd64 GitHub Actions

@raulcd
Copy link
Member Author

raulcd commented Jan 22, 2025

I am trying to install the newer CMake on the linux-packaging tasks but it's slightly complex to understand the context and where the files might be located. My idea is to reuse the install_cmake.sh script, and I was trying to link or copy the file under ci/scripts into the Dockerfile, but I am struggling to understand where the docker build is executed and what the path should be (I was thinking on making a symbolic link on dev/tasks/linux-packages to ci/scripts if necessary). I tried to parse this logs and we seem to jump around on rake quite a little bit but it's confusing:

cd /runner/_work/crossbow/crossbow/arrow/dev/tasks/linux-packages/../../..
docker build --cache-from apache/arrow-dev:aarch64-almalinux-9-package-apache-arrow --tag apache/arrow-dev:aarch64-almalinux-9-package-apache-arrow --build-arg DEBUG=yes --build-arg FROM=arm64v8/almalinux:9 almalinux-9
cd -
cp /runner/_work/crossbow/crossbow/arrow/dev/tasks/linux-packages/yum/build.sh yum/build.sh
rm -rf yum/tmp
mkdir -p yum/tmp
cp apache-arrow-20.0.0.dev8.tar.gz yum/tmp/apache-arrow-20.0.0.dev8.tar.gz
cd yum
mkdir -p /runner/_work/crossbow/crossbow/arrow/dev/tasks/linux-packages/apache-arrow/yum/build/almalinux-9-aarch64

cc @kou

@kou
Copy link
Member

kou commented Jan 22, 2025

https://github.com/apache/arrow/blob/main/dev/tasks/linux-packages/apache-arrow/yum/centos-7/Dockerfile and other dev/tasks/linux-packages/*/{apt,yum}/*/Dockerfiles are self-contained and no dependency Dockerfile. They use dev/tasks/linux-packages/*/{apt,yum}/*/ as docker build context.

If we change docker build context to the top directory, we can use ci/scripts/*. But it'll introduce another complexity. Could you embed a shell script instead of reusing ci/scripts/*?

@raulcd
Copy link
Member Author

raulcd commented Jan 24, 2025

@github-actions crossbow submit centos-7-amd64

This comment was marked as outdated.

@raulcd
Copy link
Member Author

raulcd commented Jan 24, 2025

@github-actions crossbow submit centos-7-amd64

This comment was marked as outdated.

@raulcd
Copy link
Member Author

raulcd commented Jan 24, 2025

@github-actions crossbow submit centos-7-amd64

This comment was marked as outdated.

@raulcd
Copy link
Member Author

raulcd commented Jan 24, 2025

@github-actions crossbow submit centos-7-amd64

This comment was marked as outdated.

@raulcd
Copy link
Member Author

raulcd commented Jan 24, 2025

@github-actions crossbow submit centos-7-amd64

Copy link

Revision: 9fc6608

Submitted crossbow builds: ursacomputing/crossbow @ actions-9094456c4e

Task Status
centos-7-amd64 GitHub Actions

@kou
Copy link
Member

kou commented Jan 24, 2025

Can I work on centos-7 in this branch?

@raulcd
Copy link
Member Author

raulcd commented Jan 24, 2025

Can I work on centos-7 in this branch?

Sure, I am able to run locally now but happy if you take it as I was stuck with the installation folder being slightly different than the expected one 😄

@kou
Copy link
Member

kou commented Jan 24, 2025

@github-actions crossbow submit centos-7-amd64

This comment was marked as outdated.

@kou
Copy link
Member

kou commented Jan 24, 2025

@github-actions crossbow submit centos-7-amd64

Copy link

Revision: 827eb53

Submitted crossbow builds: ursacomputing/crossbow @ actions-a08ad8d838

Task Status
centos-7-amd64 GitHub Actions

@@ -15,7 +15,7 @@
# specific language governing permissions and limitations
# under the License.

cmake_minimum_required(VERSION 3.16)
cmake_minimum_required(VERSION 3.25)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we keep using 3.16?
We don't need to require CMake 3.25 or later for users who just use find_package(Arrow).

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting change review Awaiting change review labels Jan 25, 2025
@kou
Copy link
Member

kou commented Jan 25, 2025

RPM for CentOS 7 can be built now.
But testing it failed because we require CMake 3.25 or later for test environment. But it seems that we don't need to require CMake 3.25 or later for test environment.

See also: #44989 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants