Warn if num installed to a broken system compiler #11300

Merged
merged 1 commit into from
Mar 24, 2018

Conversation

dra27
Member

@dra27 dra27 commented Jan 24, 2018

This follows on from #11207.

If the num library was installed in a system switch where the user had write permissions to the system compiler's lib directory before #11207 was merged, then fresh installations of the ocamlfind package will detect this "system" installation of num and create a META file for it.

This in turn will cause the num package installation to fail. Annoyingly, this failure only appears once, because opam then executes ocamlfind remove num, so a subsequent attempt to install the num package will appear to have worked (in reality, however, the system-installed files will conflict with the ones in the opam directory).

This patch displays a comprehensive error message the first time, strongly suggesting that the user manually delete the files which were incorrectly installed previously.

If you have a system compiler where num has been manually installed to ocamlc -where (e.g. by running sudo make install from the GitHub sources) and a fresh installation of the ocamlfind package, then you will now see the following if you attempt to install the num package:

dra@zesty64:~/num/src$ opam install num
The following actions will be performed:
  ∗  install num 1.1 

=-=- Gathering sources =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
[num.1.1] found in cache

=-=- Processing actions -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
[ERROR] The installation of num failed at "make findlib-install".
Processing  2/2: [num: make findlib-uninstall]
#=== ERROR while installing num.1.1 ===========================================#
# context      2.0.0~beta6 | linux/x86_64 | ocaml-system.4.06.0 | file:///home/dra/opam-repository
# path         ~/.opam/temp/.opam-switch/build/num.1.1
# command      /usr/bin/make findlib-install
# exit-code    2
# env-file     ~/.opam/log/num-21717-a00279.env
# output-file  ~/.opam/log/num-21717-a00279.out
### output ###
# [ERROR] It appears that the num library was previously installed to your system
#         compiler's lib directory, probably by a faulty opam package.
#         You will need to remove arith_flags.*, arith_status.*, big_int.*,
#         int_misc.*, nat.*, num.*, ratio.*, nums.*, libnums.* and
#         stublibs/dllnums.* from /usr/local/lib/ocaml.
# Makefile:46: recipe for target 'findlib-install' failed
# make: *** [findlib-install] Error 1



=-=- Error report -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
┌─ The following actions failed
│ ∗  install num 1.1
└─ 
╶─ No changes have been performed
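
For readers who want to check by hand for the files named in the error message above, here is a minimal sketch. It is not the check the num Makefile performs; the function name find_stray_num_files is hypothetical, and the only thing taken from this PR is the list of file patterns.

```shell
#!/bin/sh
# Hypothetical helper (not part of num, opam or ocamlfind): print every file
# under DIR that matches one of the patterns listed in the error message.
find_stray_num_files() {
    dir=$1
    for stem in arith_flags arith_status big_int int_misc nat num ratio \
                nums libnums stublibs/dllnums; do
        # $stem is left unquoted on purpose so that the trailing .* is globbed.
        for f in "$dir"/$stem.*; do
            # An unmatched glob stays literal, so only print paths that exist.
            [ -e "$f" ] && printf '%s\n' "$f"
        done
    done
    return 0
}
```

Calling find_stray_num_files "$(ocamlc -where)" would list candidates for removal; empty output means none of the patterns matched in that directory.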

cc @avsm

If the num library was installed in a system switch where the user had
write permissions to the system compiler's lib directory before ocaml#11207
was merged, then *fresh* installations of the ocamlfind package will
detect this "system" installation of num and create a META file for it.
This in turn will cause the num package installation to fail.
Annoyingly, this failure only appears once because opam then executes
ocamlfind remove num so a subsequent attempt to install the num package
will appear to have worked.

This patch displays a comprehensive error message the first time,
strongly suggesting that the user manually delete the files which were
incorrectly installed previously.
@samoht
Member

samoht commented Feb 15, 2018

I haven't followed the discussion on this issue so I'll let another maintainer merge this PR.

@dra27
Member Author

dra27 commented Feb 15, 2018

@samoht - are you able to kick the DataKit CI and Camelus?

@samoht
Member

samoht commented Feb 15, 2018

I've restarted the Datakit CI runs, but I don't control Camelus. /cc @AltGr to see if he can do something about it :-)

@AltGr
Member

AltGr commented Feb 15, 2018

Kicked it, but a "Could not complete" error is probably reproducible...
(NB: you need to go to config→webhooks→opam-ci, find the corresponding request and replay it)

@camelus
Contributor

camelus commented Feb 15, 2018

✅ All lint checks passed eba59ad
  • These packages passed lint tests: num.1.0, num.1.1

✅ Installability check (8230 → 8230)

@kit-ty-kate
Member

Sorry for the long delay. Thanks!

@kit-ty-kate kit-ty-kate merged commit 26a110c into ocaml:master Mar 24, 2018
@Calvin-L

Calvin-L commented Aug 6, 2019

The guard implemented in this PR is causing me a lot of headaches. (Although, maybe fewer headaches than I would be experiencing without it? Unclear.)

After running opam install num, I have arrived in a totally stuck state (the advice in the message is irrelevant). Here's what I'm seeing:

$ ocamlc -where
/usr/local/lib/ocaml
$ ls -l /usr/local/lib/ocaml/{arith_flags.,arith_status.,big_int.,int_misc.,nat.,num.,ratio.,nums.,libnums.,stublibs/dllnums.}*
ls: /usr/local/lib/ocaml/arith_flags.*: No such file or directory
ls: /usr/local/lib/ocaml/arith_status.*: No such file or directory
ls: /usr/local/lib/ocaml/big_int.*: No such file or directory
ls: /usr/local/lib/ocaml/int_misc.*: No such file or directory
ls: /usr/local/lib/ocaml/libnums.*: No such file or directory
ls: /usr/local/lib/ocaml/nat.*: No such file or directory
ls: /usr/local/lib/ocaml/num.*: No such file or directory
ls: /usr/local/lib/ocaml/nums.*: No such file or directory
ls: /usr/local/lib/ocaml/ratio.*: No such file or directory
ls: /usr/local/lib/ocaml/stublibs/dllnums.*: No such file or directory
$ opam install num
The following actions will be performed:
  ∗ install num 1.2

<><> Gathering sources ><><><><><><><><><><><><><><><><><><><><><><><><><><>  🐫 
[num.1.2] found in cache

<><> Processing actions <><><><><><><><><><><><><><><><><><><><><><><><><><>  🐫 
[ERROR] The installation of num failed at "make findlib-install".

#=== ERROR while installing num.1.2 ===========================================#
# context     2.0.5 | macos/x86_64 | ocaml-system.4.07.1 | https://opam.ocaml.org#506eb174
# path        ~/.opam/default/.opam-switch/build/num.1.2
# command     ~/.opam/opam-init/hooks/sandbox.sh install make findlib-install
# exit-code   2
# env-file    ~/.opam/log/num-98674-ad8886.env
# output-file ~/.opam/log/num-98674-ad8886.out
### output ###
# [ERROR] It appears that the num library was previously installed to your system
#         compiler's lib directory, probably by a faulty opam package.
#         You will need to remove arith_flags.*, arith_status.*, big_int.*,
#         int_misc.*, nat.*, num.*, ratio.*, nums.*, libnums.* and
#         stublibs/dllnums.* from /usr/local/lib/ocaml.
# make: *** [findlib-install] Error 1



<><> Error report <><><><><><><><><><><><><><><><><><><><><><><><><><><><><>  🐫 
┌─ The following actions failed
│ ∗ install num 1.2
└─ 
╶─ No changes have been performed
$ ocamlfind query num
/Users/loncaric/.opam/default/lib/num

A few more details about how we got here:

  • My package manager (Homebrew) installs a system-wide version of num for a program that I need.
  • The suggestion here (manually remove various files) is a bad one, since those files are managed by Homebrew.
  • What you are seeing above happens after a previous failed install of num, and after telling Homebrew to uninstall its version of num.
  • The suggestion here (manually remove various files) is totally irrelevant in my current state.
  • The only way I have found to successfully install this package is to uninstall Homebrew's version (undesirable since I lose the downstream program I need), nuke ~/.opam, opam init, and then reinstall num.

I desperately wish that Homebrew's num and opam's num could coexist. Failing that, I desperately wish that there was a clear way out of the "stuck" state I arrived in after running opam install num.

@dra27
Member Author

dra27 commented Aug 6, 2019

@Calvin-L - opam's support for packages installed by a system compiler is very limited at the moment, so I'd expect the solution to be not installing the homebrew ocaml-num package, at least for the foreseeable future.

I'm possibly puzzled about the ocamlfind query num result - what's in /Users/loncaric/.opam/default/lib/num and, if there's a META there, what's its content?

@Calvin-L

Calvin-L commented Aug 6, 2019

Thanks for the quick follow-up! I did get everything working by removing Homebrew's system-wide installation of num (although I had to nuke ~/.opam as well).

Here are the contents of the META file from my earlier corrupted state:

$ cat /Users/loncaric/.opam/default/lib/num/META 
# Specification for the "num" library:
requires = "num.core"
requires(toploop) = "num.core,num-top"
version = "[distributed with Ocaml]"
description = "Arbitrary-precision rational arithmetic"
package "core" (
  directory = "^"
  version = "[internal]"
  browse_interfaces = " Unit name: Arith_flags Unit name: Arith_status Unit name: Big_int Unit name: Int_misc Unit name: Nat Unit name: Num Unit name: Ratio "
  archive(byte) = "nums.cma"
  archive(native) = "nums.cmxa"
  plugin(byte) = "nums.cma"
  plugin(native) = "nums.cmxs"
)

@dra27
Member Author

dra27 commented Aug 6, 2019

@Calvin-L - that META looks correct for an opam-installed ocamlfind which has detected the homebrew-installed num package. I'm not clear what you mean by corrupt - was it that you couldn't install anything else which depends on the opam num package?
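
A rough way to spot this situation from the command line is sketched below. It rests on an assumption: that the version = "[distributed with Ocaml]" line seen in the META quoted above is what marks a META written by ocamlfind for a detected system num, and that a META from the opam num package would carry a different version string. The function name meta_is_system_stub is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the META file at $1 contains the version
# line seen in the META quoted in this thread.
# Assumption: this line only appears in a META that ocamlfind wrote for a
# num it detected in the system compiler's lib directory.
meta_is_system_stub() {
    # -F matches the string literally, so the brackets need no escaping.
    grep -F -q 'version = "[distributed with Ocaml]"' "$1"
}
```

For example, meta_is_system_stub "$(ocamlfind query num)/META" && echo "META describes a system num" would flag the state described in this thread, provided the assumption about the version string holds.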

@Calvin-L

Calvin-L commented Aug 6, 2019

That's correct. I was using opam to install sexplib which depends on num. However, opam insisted on building num, which would then fail. (Sorry for using "corrupt" incorrectly; I have zero familiarity with opam's directory structure and I assumed that something was broken in ~/.opam.)

@dra27
Member Author

dra27 commented Aug 6, 2019

@Calvin-L - no problem with terminology, I just wanted to be sure I'd correctly understood what had gone wrong from your perspective 🙂
