Error when building mdbx-sys #4280

Closed
danielrachi1 opened this issue May 10, 2023 · 20 comments

@danielrachi1
Contributor

Description

I'm getting this error:

error: failed to run custom build command for `mdbx-sys v0.11.6-4 (https://github.com/sigp/libmdbx-rs?tag=v0.1.4#096da80a)`

Caused by:
  process didn't exit successfully: `/home/danielrachi/Code/lighthouse/target/release/build/mdbx-sys-057a1d896a8dceb6/build-script-build` (exit status: 101)
  --- stderr
  thread 'main' panicked at '"MDBX_version_info_struct_(unnamed_at_/home/danielrachi/_cargo/git/checkouts/libmdbx-rs-c1b523f5b64ff08c/096da80/mdbx-sys/libmdbx/mdbx_h_611_3)" is not a valid Ident', /home/danielrachi/.cargo/registry/src/github.com-1ecc6299db9ec823/proc-macro2-1.0.56/src/fallback.rs:811:9
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

In two scenarios:

  1. When running make test.
  2. When trying to install from source.
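
The panic text itself points at the immediate cause: the build script tried to use a name containing parentheses and slashes as a Rust identifier. A minimal sketch of that check, using an ASCII-only approximation of identifier rules (the helper name is_ident is hypothetical, and this is not how proc-macro2 implements its validation; the name below is a shortened, illustrative form of the one in the error):

```shell
# Hypothetical helper: succeed only if $1 looks like a plain ASCII identifier.
is_ident() {
    printf '%s\n' "$1" | grep -Eq '^[A-Za-z_][A-Za-z0-9_]*$'
}

# Shortened, illustrative form of the name from the panic message.
name='MDBX_version_info_struct_(unnamed_at_/path/to/mdbx_h_611_3)'

if is_ident "$name"; then
    echo "valid identifier"
else
    echo "not a valid identifier"
fi
```

The parentheses and slashes in the generated name are what make the check fail.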

Version

rustc 1.69.0 (84c898d65 2023-04-16)
Trying to test: Lighthouse unstable branch @ b7b4549
Trying to build: Lighthouse stable branch @ 693886b

Present Behaviour

I pulled the unstable branch and tried to run make test (in my development folder):

danielrachi@swiftx ~/C/lighthouse (unstable)> make test
cargo test --workspace --release --exclude ef_tests --exclude beacon_chain --exclude slasher
   ...
error: failed to run custom build command for `mdbx-sys v0.11.6-4 (https://github.com/sigp/libmdbx-rs?tag=v0.1.4#096da80a)`

Caused by:
  process didn't exit successfully: `/home/danielrachi/Code/lighthouse/target/release/build/mdbx-sys-5e93f4144f3d672e/build-script-build` (exit status: 101)
  --- stderr
  thread 'main' panicked at '"MDBX_version_info_struct_(unnamed_at_/home/danielrachi/_cargo/git/checkouts/libmdbx-rs-c1b523f5b64ff08c/096da80/mdbx-sys/libmdbx/mdbx_h_611_3)" is not a valid Ident', /home/danielrachi/.cargo/registry/src/github.com-1ecc6299db9ec823/proc-macro2-1.0.55/src/fallback.rs:811:9
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
warning: build failed, waiting for other jobs to finish...
make: *** [Makefile:109: test-release] Error 101

Then, I pulled the stable branch and tried to install using make (in a folder generated by git-cloning sigp/lighthouse):

danielrachi@swiftx ~/lighthouse (stable)> make
cargo install --path lighthouse --force --locked \
        --features "jemalloc" \
        --profile "release" \

  Installing lighthouse v4.1.0 (/home/danielrachi/lighthouse/lighthouse)
    ...
error: failed to run custom build command for `mdbx-sys v0.11.6-4 (https://github.com/sigp/libmdbx-rs?tag=v0.1.4#096da80a)`

Caused by:
  process didn't exit successfully: `/home/danielrachi/lighthouse/target/release/build/mdbx-sys-49c7e9c0e0060040/build-script-build` (exit status: 101)
  --- stderr
  thread 'main' panicked at '"MDBX_version_info_struct_(unnamed_at_/home/danielrachi/_cargo/git/checkouts/libmdbx-rs-c1b523f5b64ff08c/096da80/mdbx-sys/libmdbx/mdbx_h_611_3)" is not a valid Ident', /home/danielrachi/.cargo/registry/src/github.com-1ecc6299db9ec823/proc-macro2-1.0.56/src/fallback.rs:811:9
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
warning: build failed, waiting for other jobs to finish...
error: failed to compile `lighthouse v4.1.0 (/home/danielrachi/lighthouse/lighthouse)`, intermediate artifacts can be found at `/home/danielrachi/lighthouse/target`
make: *** [Makefile:48: install] Error 101

Expected Behaviour

I expected the tests to start running in one scenario and to have lighthouse v4.1.0 installed in the other.

@michaelsproul
Member

Yeah this is unfortunately a C compiler incompatibility. We're stuck on the current version of MDBX, but it's only used for the slasher so you can disable it with --no-default-features (which is a cargo argument). You can plumb it into make (but not make test) via CARGO_INSTALL_EXTRA_FLAGS, see: https://lighthouse-book.sigmaprime.io/installation-source.html#feature-flags.
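
A sketch of both ways to pass the flag, assuming CARGO_INSTALL_EXTRA_FLAGS is picked up by the Makefile as the linked book page describes. The run helper is hypothetical and only prints each command unless RUN=1 is set, so nothing is built by accident:

```shell
# Hypothetical dry-run wrapper: print the command, execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Install from source with default features (and therefore MDBX) disabled.
run env CARGO_INSTALL_EXTRA_FLAGS="--no-default-features" make

# make test does not read the variable, so the cargo command is called directly.
run cargo test --workspace --release --exclude ef_tests --exclude beacon_chain --exclude slasher --no-default-features
```

Running the script with RUN=1 in the environment executes the commands instead of only printing them.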

We should probably disable it by default, as more and more people are having this issue.

Which Linux distro are you on, and what does gcc --version show?

@danielrachi1
Contributor Author

I'm using Fedora 38 (Workstation Edition)

danielrachi@swiftx ~> gcc --version
gcc (GCC) 13.1.1 20230426 (Red Hat 13.1.1-1)
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

@danielrachi1
Contributor Author

If I run:
cargo test --workspace --release --exclude ef_tests --exclude beacon_chain --exclude slasher --no-default-features
(This is the cargo command that make test runs, with --no-default-features appended.) I no longer get this error. However, I now get:

error: linking with `cc` failed: exit status: 1
...
= note: /usr/bin/ld: cannot find -lpq: No such file or directory
          collect2: error: ld returned 1 exit status


error: could not compile `watch` due to previous error

Also, I don't know if this will make any of the tests fail.

@michaelsproul
Member

The libpq error comes from Postgres: you need to run sudo dnf install libpq-devel.

The --exclude slasher flag should ensure that nothing fails.
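
A quick pre-flight check for the missing library could look like the sketch below. It assumes the development package ships a pkg-config file named libpq, which is common but not guaranteed on every distro; has_pkg is a hypothetical helper name:

```shell
# Hypothetical pre-flight check: succeed if pkg-config knows about module $1.
has_pkg() {
    command -v pkg-config >/dev/null 2>&1 && pkg-config --exists "$1"
}

if has_pkg libpq; then
    echo "libpq development files found"
else
    echo "libpq not found; on Fedora: sudo dnf install libpq-devel"
fi
```

If pkg-config itself is not installed, the check reports libpq as not found, so a negative result is only a hint.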

@danielrachi1
Contributor Author

We should add libpq-devel to the list of additional requirements for developers in the book.

There are some tests using slasher logic outside of the slasher module, specifically in lighthouse/tests/beacon_node.rs

danielrachi@swiftx ~/C/lighthouse (fork_revert_logic) [101]> cargo test --workspace --release --exclude ef_tests --exclude beacon_chain --exclude slasher --no-default-features
   Compiling lighthouse v4.1.0 (/home/danielrachi/Code/lighthouse/lighthouse)
error[E0599]: no variant or associated item named `Mdbx` found for enum `DatabaseBackend` in the current scope
    --> lighthouse/tests/beacon_node.rs:1906:74
     |
1906 |             assert_eq!(slasher_config.backend, slasher::DatabaseBackend::Mdbx);
     |                                                                          ^^^^ variant or associated item not found in `DatabaseBackend`

error[E0599]: no variant or associated item named `Mdbx` found for enum `DatabaseBackend` in the current scope
    --> lighthouse/tests/beacon_node.rs:1920:74
     |
1920 |             assert_eq!(slasher_config.backend, slasher::DatabaseBackend::Mdbx);
     |                                                                          ^^^^ variant or associated item not found in `DatabaseBackend`

For more information about this error, try `rustc --explain E0599`.
error: could not compile `lighthouse` due to 2 previous errors

This happens because they are not marked as part of the slasher feature. I added #[cfg(feature = "slasher")] on top of those tests (and others in that same file) and the error went away... But now other tests are failing and I don't know how they are related to the slasher.

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running tests/tests.rs (target/release/deps/tests-e404371a88a45637)

running 9 tests
test short_chain ... FAILED
test chain_grows_with_metadata_and_multiple_skip_slots ... FAILED
test short_chain_with_skip_slot ... FAILED
test short_chain_with_reorg ... FAILED
test large_chain ... FAILED
test chain_grows_to_second_epoch ... FAILED
test short_chain_sync_starts_on_skip_slot ... FAILED
test chain_grows ... FAILED
test chain_grows_with_metadata ... FAILED

failures:

---- short_chain stdout ----
thread 'short_chain' panicked at 'failed to start container', /home/danielrachi/.cargo/registry/src/github.com-1ecc6299db9ec823/testcontainers-0.14.0/src/clie

---- chain_grows_with_metadata_and_multiple_skip_slots stdout ----
thread 'chain_grows_with_metadata_and_multiple_skip_slots' panicked at 'failed to start container', /home/danielrachi/.cargo/registry/src/github.com-1ecc6299d
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- short_chain_with_skip_slot stdout ----
thread 'short_chain_with_skip_slot' panicked at 'failed to start container', /home/danielrachi/.cargo/registry/src/github.com-1ecc6299db9ec823/testcontainers-

---- short_chain_with_reorg stdout ----
thread 'short_chain_with_reorg' panicked at 'failed to start container', /home/danielrachi/.cargo/registry/src/github.com-1ecc6299db9ec823/testcontainers-0.14

---- large_chain stdout ----
thread 'large_chain' panicked at 'failed to start container', /home/danielrachi/.cargo/registry/src/github.com-1ecc6299db9ec823/testcontainers-0.14.0/src/clients/cli.rs:48:9

---- chain_grows_to_second_epoch stdout ----
thread 'chain_grows_to_second_epoch' panicked at 'failed to start container', /home/danielrachi/.cargo/registry/src/github.com-1ecc6299db9ec823/testcontainers-0.14.0/src/clients/cli.rs:48:9

---- short_chain_sync_starts_on_skip_slot stdout ----
thread 'short_chain_sync_starts_on_skip_slot' panicked at 'failed to start container', /home/danielrachi/.cargo/registry/src/github.com-1ecc6299db9ec823/testcontainers-0.14.0/src/clients/cli.rs:48:9

---- chain_grows stdout ----
thread 'chain_grows' panicked at 'failed to start container', /home/danielrachi/.cargo/registry/src/github.com-1ecc6299db9ec823/testcontainers-0.14.0/src/clients/cli.rs:48:9

---- chain_grows_with_metadata stdout ----
thread 'chain_grows_with_metadata' panicked at 'failed to start container', /home/danielrachi/.cargo/registry/src/github.com-1ecc6299db9ec823/testcontainers-0.14.0/src/clients/cli.rs:48:9

@danielrachi1
Contributor Author

I think the real problem is that we don't have a way to share a reliable development environment. Docker could be used for this but there are some details I don't like about it for this purpose. I've heard Nix is a great tool for this. I'll give it a try and see if I can come up with something useful.

@michaelsproul
Member

We should add libpq-devel to the list of additional requirements for developers in the book.

That's a great idea. Would you mind opening a PR?

This and the other issues are fallout from merging a rather major new component, the beacon.watch chain indexer: #3362. Usually there are some quirks in docs and tests after merging such a large feature, even if we make every effort to avoid them.

---- chain_grows_with_metadata_and_multiple_skip_slots stdout ----
thread 'chain_grows_with_metadata_and_multiple_skip_slots' panicked at 'failed to start container', /home/danielrachi/.cargo/registry/src/github.com-1ecc6299d
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace

This is also beacon.watch related. Do you have Docker/Podman installed? I think the new watch tests use Docker to spawn Postgres in a container. Or if you're already running in Docker, it might be a Docker-in-Docker sort of bug.

I think the real problem is that we don't have a way to share a reliable development environment.

We kind of do have a standard environment: it's the GitHub runner image used by CI. It would be possible to use this locally but, as you say, probably not particularly fun (I would also prefer not to use Docker for my dev environment).

I've heard Nix is a great tool for this. I'll give it a try and see if I can come up with something useful.

I'd be wary of going too deep on this, just because for it to be effective it would need to be adopted as our primary CI. If we added a Nix env that wasn't tested on CI it would be prone to bitrot. To keep CI fast, it would probably mean our only CI run would have to use Nix. As far as I know none of the main Lighthouse devs use Nix at all, so it's also an issue of familiarity (I used it very briefly several years ago).

Related to this we also have some in-progress work to further Docker-ify CI so that it runs on a bare metal machine owned by SigP: #4115. Those images could potentially be used for standardised local dev environments by people who are interested.

In summary I think we should:

  • Add libpq to dev requirements docs
  • Switch default slasher backend to LMDB so that tests can run without MDBX build issues. Disable MDBX by default when building from source, but keep it in the released binaries
  • Document how to use CI Docker images to run tests locally

@danielrachi1
Contributor Author

Turns out you need Docker installed and running. Added that to the PR.
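
A small check that covers both "installed" and "running" might look like this. It is a sketch: runtime_ready is a hypothetical helper, and it relies on the runtime's info subcommand exiting non-zero when the daemon is unreachable:

```shell
# Hypothetical check: succeed if runtime $1 is installed and answers "info".
runtime_ready() {
    command -v "$1" >/dev/null 2>&1 && "$1" info >/dev/null 2>&1
}

if runtime_ready docker; then
    echo "docker is installed and running"
else
    echo "docker is missing or not running; tests that start containers may fail"
fi
```

The same helper can be called with podman as its argument if that is the runtime in use.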

I opened a second PR adding the slasher feature flag to the slasher tests I mentioned in a previous comment.

With those changes made I successfully compiled and passed all tests using:

cargo test --workspace --release --exclude ef_tests --exclude beacon_chain --exclude slasher --no-default-features

bors bot pushed a commit that referenced this issue May 30, 2023
…ments for developers in the Book (#4282)

## Issue Addressed

Realized this was missing while discussing #4280 

## Proposed Changes

Add an Item to the list of additional requirements for developers.
@zhiqiangxu
Contributor

How do I install libpq-devel on macOS? I tried brew install libpq-dev and brew install libpq-devel; both report Warning: No available formula with the name "xxx".

@michaelsproul
Member

@zhiqiangxu I think it's just libpq on homebrew: https://formulae.brew.sh/formula/libpq

@zhiqiangxu
Contributor

zhiqiangxu commented Aug 11, 2023

@michaelsproul brew install libpq runs successfully, but still reports library not found for -lpq.

UPDATE

This is what I got after brew install libpq:

image
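
One possible explanation, offered as an assumption rather than a confirmed diagnosis for this report: Homebrew installs libpq as a keg-only formula, so its lib directory is not on the default linker search path. The sketch below prints export lines that add it; libpq_env is a hypothetical helper and the fallback path is only a typical Apple Silicon location:

```shell
# Hypothetical helper: print export lines that expose a keg-only libpq at prefix $1.
libpq_env() {
    prefix="$1"
    printf 'export LIBRARY_PATH="%s/lib:$LIBRARY_PATH"\n' "$prefix"
    printf 'export PKG_CONFIG_PATH="%s/lib/pkgconfig:$PKG_CONFIG_PATH"\n' "$prefix"
}

if command -v brew >/dev/null 2>&1; then
    libpq_env "$(brew --prefix libpq)"
else
    # Illustrative fallback only; adjust to the local Homebrew prefix.
    libpq_env "/opt/homebrew/opt/libpq"
fi
```

The printed lines can be pasted into the current shell before re-running the build.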

@eenagy

eenagy commented Jun 6, 2024

Same issue occurs: the build fails on the latest Ubuntu 24.04, but works correctly on the latest Debian bookworm.

The exact same build script is used in both cases, with the only difference being the distribution. Other users have also reported this issue.

@michaelsproul
Member

@eenagy Thanks for the info about Ubuntu 24.04. We are deprecating the MDBX backend in the slasher, so nobody is working on keeping it up to date.

If you would like to see the MDBX backend maintained we would consider a PR to switch over to the Reth team's bindings: https://github.com/paradigmxyz/reth/tree/main/crates/storage/libmdbx-rs

@michaelsproul
Member

@eenagy I've just looked at your project and realised you're maintaining packages for Debian & Ubuntu! Thanks for doing that. I think it would be reasonable for your tools to turn off the slasher-mdbx feature to avoid the breakage.

@eenagy

eenagy commented Jun 7, 2024

@eenagy I've just looked at your project and realised you're maintaining packages for Debian & Ubuntu! Thanks for doing that. I think it would be reasonable for your tools to turn off the slasher-mdbx feature to avoid the breakage.

All right, that sounds good. I will note this when I release the next version or patch.

@varun-doshi

@michaelsproul brew install libpq runs successfully, but still reports library not found for -lpq.

UPDATE

This is what I got after brew install libpq:

image

Were you able to solve this?

@michaelsproul
Member

@varun-doshi unless you need watch you don't need libpq. We should probably prevent it from being built in the homebrew formula.

If you build Lighthouse from source using make, you won't need it.

@varun-doshi

I'm trying to run tests using cargo test with logging features enabled.
Can you please tell me how to disable watch?

@michaelsproul
Member

michaelsproul commented Sep 30, 2024

cargo test --exclude watch in that case, plus whatever other args you want (--features, --release, etc)
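
One detail worth spelling out: cargo only accepts --exclude together with --workspace. A small sketch that assembles the command for any set of excluded packages (test_cmd is a hypothetical helper that prints the command rather than running it):

```shell
# Hypothetical helper: print a cargo test command that skips the given packages.
test_cmd() {
    cmd="cargo test --workspace"
    for pkg in "$@"; do
        cmd="$cmd --exclude $pkg"
    done
    echo "$cmd"
}

test_cmd watch
```

Extra arguments such as --features or --release can be appended to the printed command as needed.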

@michaelsproul
Member

I'm going to close this issue as it was solved by:

If you continue having issues with libpq, please open a new issue @varun-doshi and we can discuss there.
