ErlangR24: init at 24.0 #122723

Merged
merged 1 commit into from
May 13, 2021


ankhers
Contributor

@ankhers ankhers commented May 12, 2021

Motivation for this change

Erlang R24 was just released.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@ankhers
Contributor Author

ankhers commented May 12, 2021

All checks appear to have passed, but I am trying to run nixpkgs-review pr 122723 and I am getting an error saying

checking for OpenSSL in /nix/store/8iyzxj2ysxnknxc7jrldcsk2zzhv7ff1-openssl-1.1.1k-dev... configure: error: neither static nor dynamic crypto library found in /nix/store/8iyzxj2ysxnknxc7jrldcsk2zzhv7ff1-openssl-1.1.1k-dev
ERROR: /build/source/lib/crypto/configure failed!

I am trying to run this on NixOS.

@r-rmcgibbo

r-rmcgibbo commented May 12, 2021

Result of nixpkgs-review pr 122723 at c0b55a2e run on aarch64-linux 1

1 package failed to build:

Note that build failures may predate this PR, and could be nondeterministic or hardware dependent.
Please exercise your independent judgement. Does something look off? Please file an issue or reach out on IRC.


Result of nixpkgs-review pr 122723 at c0b55a2e run on x86_64-linux 1

1 package failed to build:

Note that build failures may predate this PR, and could be nondeterministic or hardware dependent.
Please exercise your independent judgement. Does something look off? Please file an issue or reach out on IRC.

@ankhers
Contributor Author

ankhers commented May 12, 2021

Also, this does not switch the default Erlang to 24, it keeps it at 23. I don't know the full repercussions for changing the default. We can do that in a different PR.

@ankhers
Contributor Author

ankhers commented May 12, 2021

This seems to be a known issue. I will wait for erlang/otp#4821 to be resolved and I will get this fixed afterwards.

@mogorman
Contributor

On my box I applied the config change they suggested:
configureFlags = [ "--disable-parallel-configure" ];
but it seems to not be able to find ssl regardless
configure: WARNING: No (usable) OpenSSL found, skipping ssl, ssh and crypto applications

@berbiche berbiche mentioned this pull request May 12, 2021
Update configure options

The configure script now needs to be told about the headers and the
actual lib files separately.

Remove extra whitespace
@ankhers
Contributor Author

ankhers commented May 13, 2021

There was a new configure flag that was added to Erlang's build system. We need to tell it about the dev headers, as well as the actual lib location now. These recent changes should fix the build.
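
As a rough illustration of the change described above, the relevant part of the builder's configureFlags might look like the fragment below. The --with-ssl-incl line is the one quoted later in this thread. The --with-ssl line, and its use of lib.getOutput "out", is an assumption about how the library location is passed, not a copy of the merged code.

```nix
# Sketch only: a fragment of a derivation's attribute set, not a complete file.
# `opensslPackage` is the OpenSSL derivation handed to the builder.
configureFlags =
  # Location of the compiled OpenSSL libraries (assumed form).
  [ "--with-ssl=${lib.getOutput "out" opensslPackage}" ]
  # Location of the OpenSSL development headers, now passed separately.
  ++ [ "--with-ssl-incl=${lib.getDev opensslPackage}" ];
```

The point of the split is that a multi-output OpenSSL package keeps its headers and its libraries in different store paths, so a single path is not enough for the configure script to find both.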

@happysalada
Contributor

Looks great, thanks for this!

Running the nixpkgs-review on darwin.

It looks good to me as far as I'm concerned!

I definitely think that we should wait at least a couple of months before switching the default version of Erlang.

@happysalada
Contributor

Result of nixpkgs-review pr 122723 run on x86_64-darwin 1

6 packages marked as broken and skipped:
  • couchdb3
  • cuter
  • erlangR18
  • erlangR19
  • erlang_basho_R16B02
  • lfe_1_2
24 packages built:
  • asls
  • elixir (elixir_1_11)
  • elixir_1_10
  • elixir_1_7
  • elixir_1_8
  • elixir_1_9
  • elixir_ls
  • erlang (erlangR23)
  • erlang-ls
  • erlangR20
  • erlangR21
  • erlangR22
  • erlangR24
  • erlang_javac
  • erlang_nox
  • erlang_odbc
  • erlang_odbc_javac
  • lfe (lfe_1_3)
  • mercury
  • rabbitmq-server
  • rebar
  • rebar3
  • relxExe
  • tsung

@DianaOlympos
Contributor

I will do a patch to drop R18, R19 and R20 this week, so let's ignore those.

Basho R16 being broken is a pain, and I am beginning to think we need to drop Riak...

So I think that we mostly need to fix lfe, couchdb and cuter.

@gleber
Contributor

gleber commented May 13, 2021

There's Riak KV v2.3.3 which is compatible with newer Erlang versions. We might want to upgrade it instead of dropping it.

@jonringer
Contributor

jonringer commented May 13, 2021

Downstream package updates and improvements to the Erlang ecosystem can be done in subsequent PRs. I think this PR is good as-is.

Contributor

@jonringer jonringer left a comment


https://github.com/NixOS/nixpkgs/pull/122723

7 packages marked as broken and skipped:
cuter erlangR18 erlangR19 erlang_basho_R16B02 lfe_1_2 riak yaws

29 packages built:
asls cl couchdb3 ejabberd elixir elixir_1_10 elixir_1_7 elixir_1_8 elixir_1_9 elixir_ls erlang erlang-ls erlangR20 erlangR21 erlangR22 erlangR24 erlang_javac erlang_nox erlang_odbc erlang_odbc_javac lfe mercury notmuch-bower rabbitmq-server rebar rebar3 relxExe tsung wings

@jonringer jonringer merged commit 7500267 into NixOS:master May 13, 2021
@happysalada
Contributor

Super keen on dropping older versions of erlang, this build took a looooooooooooong time.

Regarding fixing the other packages, I'm not against it, but we will need somebody to maintain them. Unless somebody wants to add themselves as a maintainer, wouldn't it be better to leave them marked as broken until somebody shows up wanting to maintain them?
Perhaps we can do a call for maintainers? Post an issue on the relevant projects or something?

Regarding this PR, personally this is good to merge. @ankhers let me know if you want me to merge it, otherwise I'll leave you to merge it when you think it's right.

@jonringer
Contributor

I already merged it. It's just a package addition in its current form.

@joedevivo
Contributor

joedevivo commented May 13, 2021

I'm not able to build R23 on x86_64-darwin after this was merged. Building for x86_64-linux is still working.

You can reproduce the issue by running nix build .#erlang on master. Rolling back to a commit from yesterday builds successfully.

Here's the tail of the log where things break down

=== Entering application hipe
make[3]: Entering directory '/private/tmp/nix-build-erlang-23.3.2.drv-4/source/lib/hipe/rtl'
 ERLC   ../ebin/hipe_rtl_liveness.beam
 GEN    hipe_literals.hrl
make[4]: Entering directory '/private/tmp/nix-build-erlang-23.3.2.drv-4/source/lib/hipe/main'
 ERLC   ../ebin/hipe_rtl_cleanup_const.beam
 ERLC   ../ebin/hipe_rtl_binary.beam
 ERLC   ../ebin/hipe_rtl_verify_gcsafe.beam
 ERLC   ../ebin/hipe_rtl_binary_match.beam
 ERLC   ../ebin/hipe_rtl_symbolic.beam
 ERLC   ../ebin/hipe_rtl_arch.beam
 ERLC   ../ebin/hipe_tagscheme.beam
../flow/liveness.inc:46: can't find include file "../main/hipe.hrl"
make[3]: *** [/private/tmp/nix-build-erlang-23.3.2.drv-4/source/make/x86_64-apple-darwin20.4.0/otp.mk:137: ../ebin/hipe_rtl_liveness.beam] Error 1
make[3]: *** Waiting for unfinished jobs....
 VSN    hipe.hrl
make[4]: Leaving directory '/private/tmp/nix-build-erlang-23.3.2.drv-4/source/lib/hipe/main'
make[3]: Leaving directory '/private/tmp/nix-build-erlang-23.3.2.drv-4/source/lib/hipe/rtl'
make[2]: *** [/private/tmp/nix-build-erlang-23.3.2.drv-4/source/make/otp_subdir.mk:29: opt] Error 2
make[2]: Leaving directory '/private/tmp/nix-build-erlang-23.3.2.drv-4/source/lib/hipe'
make[1]: *** [/private/tmp/nix-build-erlang-23.3.2.drv-4/source/make/otp_subdir.mk:29: opt] Error 2
make[1]: Leaving directory '/private/tmp/nix-build-erlang-23.3.2.drv-4/source/lib'
make: *** [Makefile:607: secondary_bootstrap_build] Error 2

@happysalada
Contributor

@joedevivo thanks for reporting!
There is an update on R23 here #122536
Would you mind trying that version when you have a moment?

I am on darwin too and posted the result of nixpkgs-review up top.
I wonder what could be the difference between our two systems.

@joedevivo
Contributor

Here's what I've figured out so far.

If I delete the following line from interpreters/erlang/generic-builder.nix, I can get 23.3.2 building again:

++ [ "--with-ssl-incl=${lib.getDev opensslPackage}" ] # This flag was introduced in R24

When I check out your branch, it attempts to build 23.3.4, and throws the same error around hipe

=== Entering application hipe
make[3]: Entering directory '/private/tmp/nix-build-erlang-23.3.4.drv-0/source/lib/hipe/rtl'
 ERLC   ../ebin/hipe_rtl_liveness.beam
 GEN    hipe_literals.hrl
make[4]: Entering directory '/private/tmp/nix-build-erlang-23.3.4.drv-0/source/lib/hipe/main'
 ERLC   ../ebin/hipe_rtl_cleanup_const.beam
 ERLC   ../ebin/hipe_rtl_binary.beam
 ERLC   ../ebin/hipe_rtl_verify_gcsafe.beam
 ERLC   ../ebin/hipe_rtl_binary_match.beam
 ERLC   ../ebin/hipe_rtl_arch.beam
 ERLC   ../ebin/hipe_tagscheme.beam
 ERLC   ../ebin/hipe_rtl_symbolic.beam
../flow/liveness.inc:46: can't find include file "../main/hipe.hrl"
make[3]: *** [/private/tmp/nix-build-erlang-23.3.4.drv-0/source/make/x86_64-apple-darwin20.4.0/otp.mk:137: ../ebin/hipe_rtl_liveness.beam] Error 1
make[3]: *** Waiting for unfinished jobs....
 VSN    hipe.hrl
make[4]: Leaving directory '/private/tmp/nix-build-erlang-23.3.4.drv-0/source/lib/hipe/main'
make[3]: Leaving directory '/private/tmp/nix-build-erlang-23.3.4.drv-0/source/lib/hipe/rtl'
make[2]: *** [/private/tmp/nix-build-erlang-23.3.4.drv-0/source/make/otp_subdir.mk:29: opt] Error 2
make[2]: Leaving directory '/private/tmp/nix-build-erlang-23.3.4.drv-0/source/lib/hipe'
make[1]: *** [/private/tmp/nix-build-erlang-23.3.4.drv-0/source/make/otp_subdir.mk:29: opt] Error 2
make[1]: Leaving directory '/private/tmp/nix-build-erlang-23.3.4.drv-0/source/lib'
make: *** [Makefile:607: secondary_bootstrap_build] Error 2

These builds are long, so this is all I'll have for you tonight, but let me know if there's anything you want me to try in the morning.

@happysalada
Contributor

thanks a lot for testing.

We know that it is something that happens on your computer, which is a darwin x86_64

Just briefly looking at the error, it seems to be some hipe related thing that is not found. Perhaps it has to do with a dependency that is missing on your machine. Did you override any of the default options of the generic builder?
(I also have darwin x86_64 and haven't made any override).
What is weird to me, is that if I try to build now, nix will find the binary cache build for darwin and use that. It seems to me that you must have something different than the default options otherwise nix would try to fetch from the binary cache.

Another thing to test would be building without hipe. Could you disable hipe? (Or remove line 84 of the generic builder if it's easier for you.)

@DianaOlympos
Contributor

@gleber I tried, but from what I gathered it is simply a lot of work to update to that version, and no one is really interested in updating Riak.

@joedevivo
Contributor

@happysalada My plan is to override at some point, but I'm able to reproduce this with a clean checkout of your branch at 23.3.4. Because my plan is to override, I need to be able to build it myself.

Since #122536 has been merged, my nix build .#erlang works because it's in the binary cache, but I still can't build it myself when I run nix build .#erlang --option substitute false. That should pull in every dependency I need, right?

With regards to disabling hipe, I can't make any sense of it. It's still trying to build hipe and failing.

I did get it to pass by changing another setting, parallelBuild = false. If you thought the build ran long before, try it single threaded. It took my laptop 29m18s.

I think the following things are happening:

  • Removing --enable-hipe doesn't stop hipe from compiling
    • It's possible we'd need to --disable-hipe to make that happen
    • It could also be that it builds whether it's enabled or not, but we could maybe get around that with --without-hipe
  • hipe.hrl is generated from a hipe.hrl.src file. The parallel build could be hitting the thing that wants it before it's been generated.
    • It may be that my laptop, which is a 2019 6-Core i7, has too many parallel builders, and fewer might compensate for this race condition
    • It may have been a coincidence that before introducing that additional --with-ssl-incl= flag, it slowed down something earlier in the build which cascaded down to this point.
    • It's worth noting that while the comment in the generic-builder suggests this is a new flag, I can see it documented going back to at least R22, so it can have some effect on earlier versions.
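
The workaround mentioned above (parallelBuild = false) could be applied without editing nixpkgs, roughly as in the overlay below. This is a minimal sketch: it assumes the Erlang builder exposes parallelBuild as an argument that .override can reach, which this thread does not confirm.

```nix
# Sketch only: an overlay that rebuilds Erlang without make-level parallelism.
# Assumes `parallelBuild` is an overridable argument of the Erlang builder.
final: prev: {
  erlang = prev.erlang.override {
    # Trades a much longer build for avoiding the suspected hipe.hrl race.
    parallelBuild = false;
  };
}
```

Because the override changes the derivation, the result would not be in the binary cache and would be built locally, which is the case being debugged here.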

If you run nix build .#erlang --option substitute false --keep-failed it'll tell you which tmp dir it was building in, but spoiler: it'll be the only one with erlang in the name under /tmp. In there you'll see an env-vars file. The good news is that it'll get created real fast, so you won't have to wait out the whole build to see it. Can you tell me what yours sets NIX_BUILD_CORES to? Mine is set to 12.

Erlang Configure Flags for reference

@joedevivo
Contributor

@happysalada it turns out that nix-darwin is setting nix.conf's cores=0 if I don't specify an option, but nix itself defaults to cores=1. I'm going to try lowering my core count a few different times to see what I can get away with. What I don't understand is how I can build with -j12 if I'm building with asdf from the same erlang source.
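
For anyone repeating the experiment, the core count can be pinned in the nix-darwin configuration instead of relying on the default. The fragment below is a sketch that assumes nix.buildCores is the nix-darwin option that ends up as cores in nix.conf. The value 2 is only an example.

```nix
# Sketch only: fragment of a nix-darwin configuration module.
# `nix.buildCores` is assumed to map to the `cores` setting in nix.conf.
{
  # A value of 0 tells builders to use all available cores;
  # a small fixed number limits make-level parallelism inside each build.
  nix.buildCores = 2;
}
```

After rebuilding the system configuration, the env-vars file of a failed build should show the new value in NIX_BUILD_CORES.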

@joedevivo
Contributor

cores=2 saved me a whopping 75 seconds :/

@happysalada
Contributor

happysalada commented May 15, 2021

I have this setting in my darwin config: buildCores = 4; but echo $NIX_BUILD_CORES returns nothing, so that variable is not set in my environment.
However, when you say that the build takes 28 min, for me that would be super fast.
When I ran the build for this PR, to build everything it took about 6-7 hours. I usually run it before I go to sleep.
my laptop is a 2015 macbook pro, so that might be the case then.

looking at the generic.nix file, parallelBuild defaults to false. Maybe that is the problem then. Did you override this on your laptop?
There is also a comment
# On some machines, parallel build reliably crashes on GEN asn1ct_eval_ext.erl step
Maybe you found another failure mode for parallel build.

@cw789 cw789 mentioned this pull request May 15, 2021
8 participants