Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support OCaml 5.3.0 #148

Merged
merged 15 commits into from
Mar 7, 2025
Merged

Support OCaml 5.3.0 #148

merged 15 commits into from
Mar 7, 2025

Conversation

shym
Copy link
Contributor

@shym shym commented Jan 23, 2025

This PR replaces the supported OCaml version with 5.3.0. This PR sits at the moment on top of #147 in order to illustrate with the new examples at least some of new possibilities of that version.

Most of the patches that are required for the compiler have been accepted upstream (so to be part of 5.4) and are therefore backported here. As the patches that we have in main for 5.2.1 are slightly incompatible (they were drafts for the patches that got upstreamed), they would need to be reworked for our main Makefile to support both versions. As 5.2.1 was still considered experimental, this PR proposes to simply support 5.3.0.

The main part of this PR then is the set of patches for 5.3.0. Those patches are extracted from upstream PRs (selecting the commits that are relevant for OCaml/Solo5):

  • PR 13526 (merged upstream) (10 commits)
  • PR 13674 (merged upstream) (1 commit)
  • PR 13719 (merged upstream) (1 commit)
  • PR 13735 (under review) (3 commits)
  • a not-opened-yet PR to support freestanding targets (2 commits)
  • a not-to-be-upstreamed commit to fix the maximum number of domains to 1

Note that since upstream PR 13272, the maximum number of domains can be set using a parameter in OCAMLRUNPARAM so I tried to replace that last patch by adding to nolibc a secure_getenv returning "d=1" for the OCAMLRUNPARAM value. But that value is then parsed by OCaml using sscanf, which is not provided in nolibc. I settled to keep that simple patch, at least for now, rather than come up with a sscanf (do we want to provide a hard-coded version that would work only for that single use case or a real implementation?).

Commit overview

  • Support for 5.3.0
    • 1d014d2 Add patches for 5.3.0
    • a6eb56a Use OCaml’s crossopt and installcross rules
    • 7a99ebc Add stubs for qsort and longjmp
  • Fix our solo5.conf to support OCaml-provided libraries
    • 85bdebe Add OCaml’s directory to the toolchain-configuration path
  • Update the version in opam files, CI, README
    • 76fcd25 opam: Update dependency to OCaml 5.3.0
    • e5dc6fe CI: Update OCaml version to 5.3.0
    • 911b05d CI: Show the configuration of the Solo5 toolchain
    • 641652f README: Update the supported OCaml version to 5.3.0
  • Drop patches for 5.2.1 (broken by the commit updating our Makefile)
  • Temporary fix for the CI
    • a2dbf0f TMP CI: Pin a test branch for Solo5 on FreeBSD

@shym
Copy link
Contributor Author

shym commented Feb 10, 2025

I updated this PR with the latest, merged, version of an upstream PR. In particular, it means that .size and .type directives should be properly emitted, so they can be depended upon for debugging.

@@ -0,0 +1,196 @@
From ddf39e7e09d03f5e83c92b6bb1bc094e8f116a66 Mon Sep 17 00:00:00 2001
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A very quick review, is this patch really necessary? In particular, it changes the CRCs of the stdlib, which makes it impossible for a third-party library (compiled with the host compiler) to be linked.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this patch to avoid the boostrap and retested a simple unikernel and it seems to me that this patch is not strictly necessary to ensure that the runtime can work with Solo5. Furthermore, the CRCs of the stdlib without this patch are the same as those of the host compiler, which makes it possible to link third-party libraries compiled with the host compiler with the stdlib of ocaml-solo5.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good example of why I based this PR on top of #147: without that patch, the config example is broken, as it reports

OS: None
Unix: true

(https://github.com/shym/ocaml-solo5/actions/runs/13496596181/job/37705178517#step:7:57).
On the other hand, maybe that’s a breakage we’d rather ignore for now (as it is broken on builds for OCaml 4, IIUC, so probably not expected to be correct in Mirage unikernels).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, maybe that’s a breakage we’d rather ignore for now (as it is broken on builds for OCaml 4, IIUC, so probably not expected to be correct in Mirage unikernels).

It depends on the objectives we have, but I think a conservative approach to OCaml's stdlib would be best. In addition, I think ocaml-solo5 should focus primarily on libasmrun.a (and not stdlib.cmxa).

As for Mirage, a differentiation (whether it is ocaml-solo5 or ocaml) via Sys.something is not used in the Mirage ecosystem. The only (important) differentiation is found mainly in the C stubs and during compilation (and it is currently managed by dune) because it is really in these C files that ocaml-solo5 produces different object files.

More generally, we know when we want to obtain a unikernel in advance — specifically before building it. Having the possibility of knowing dynamically whether or not we are in a unikernel can open the door to a whole host of possibilities that we might regret.

@palainp
Copy link
Member

palainp commented Feb 25, 2025

Dear all, thank you @shym for your work on that update :)
EDIT: fixed :)
I currently have an issue compiling this PR with qubes-mirage-firewall:

$ mirage configure -t xen && make depend && dune build
File "duniverse/mirage-xen/lib/bindings/dune", lines 1-10, characters 0-162:
 1 | (rule
 2 |  (enabled_if
 3 |   (= %{context_name} solo5))
 4 |  (deps
 5 |   (source_tree .))
 6 |  (targets libmirage-xen_bindings.a)
 7 |  (action
 8 |   (no-infer
 9 |    (progn
10 |     (run %{make})))))
Context: solo5
ocamlfind -toolchain solo5 ocamlopt -ccopt "-I ./include/ -O2 -std=c99 -Wall -D__XEN_INTERFACE_VERSION__=__XEN_LATEST_INTERFACE_VERSION__"   -c -o bmap.o bmap.c
ocamlfind: Package `threads' not found
gmake: *** [<builtin>: bmap.o] Error 2

I don't know what I did wrong, I currently have pins on:

  • the future version of mirage (4.9.0 without functors for time, mclock, pclock, which compiles and works fine with ocaml 4.14.2, so I guess it should work here too, and the error seems unrelated to the unikernel compilation),
  • this PR,
  • the opam-monorepo PR proposed to work with the new Ocaml 5.3 compiler (updated to allow opam 2.3).

The issue seems to be mirage-xen compilation related and I'm not familiar with ocamlfind to check what is expected of my code :)

@shym
Copy link
Contributor Author

shym commented Feb 26, 2025

@palainp: thank you for testing that PR! I don’t really know why the problem was hiding on my machine. Anyway, I’ve reverted that change and added a small test in CI to make sure that ocamlfind accepts to run.
@dinosaure: I’ve also dropped the Config.target_os_type patch from the series, it should be fixed on the next OCaml release anyway.
The CI failure on FreeBSD is unrelated, see #149.

@hannesm
Copy link
Member

hannesm commented Feb 26, 2025

Before we merge and release this, we need a release of solo5 with Solo5/solo5#590 included.

@dinosaure
Copy link
Member

Hmmhmm, I currently have this error on my machine:

#=== ERROR while compiling ocaml-solo5.1.0.1 ==================================#
# context     2.3.0 | linux/x86_64 | ocaml-base-compiler.5.3.0 | pinned(git+file:///home/dinosaure/dev/ocaml-solo5#530#a2dbf0f05e0319a017d6df5dd5ffc551eedc9e47)
# path        ~/.opam/5.3.0/.opam-switch/build/ocaml-solo5.1.0.1
# command     ~/.opam/opam-init/hooks/sandbox.sh build make -j31
# exit-code   2
# env-file    ~/.opam/log/ocaml-solo5-126344-db3a24.env
# output-file ~/.opam/log/ocaml-solo5-126344-db3a24.out
### output ###
# Error: The files "/home/dinosaure/.opam/5.3.0/.opam-switch/build/ocaml-solo5.1.0.1/ocaml/stdlib/stdlib.cmi"
# [...]
#        make inconsistent assumptions over interface "Stdlib"
# make[2]: *** [Makefile:2541: lambda/switch.cmx] Error 2
# File "/home/dinosaure/.opam/5.3.0/.opam-switch/build/ocaml-solo5.1.0.1/ocaml/parsing/camlinternalMenhirLib.ml", line 1:
# Error: The files "/home/dinosaure/.opam/5.3.0/.opam-switch/build/ocaml-solo5.1.0.1/ocaml/stdlib/stdlib.cmi"
#        and "/home/dinosaure/.opam/5.3.0/.opam-switch/build/ocaml-solo5.1.0.1/ocaml/parsing/camlinternalMenhirLib.cmi"
#        make inconsistent assumptions over interface "Stdlib"
# make[2]: *** [Makefile:2541: parsing/camlinternalMenhirLib.cmx] Error 2
# make[2]: Leaving directory '/home/dinosaure/.opam/5.3.0/.opam-switch/build/ocaml-solo5.1.0.1/ocaml'
# make[1]: *** [Makefile.cross:84: crossopt] Error 2
# make[1]: Leaving directory '/home/dinosaure/.opam/5.3.0/.opam-switch/build/ocaml-solo5.1.0.1/ocaml'
# make: *** [Makefile:124: _build/ocaml_is_built] Error 2

I can not understand it now but I will try to figure out where is the error (note that I was able to delete the boostrap patch on myself and compile ocaml-solo5).

@dinosaure
Copy link
Member

I looked at this error for a while and it seems to be related to my installation of OCaml by Archlinux. It seems that the version for Archlinux has been “slightly” modified. The checksum of stdlib.cmi is therefore different from the one produced by ocaml-solo5, hence the above error. So I uninstalled OCaml (via pacman) and reinstalled OPAM as well as a new switch OCaml 5.3.0 whose sources are indeed the same as the one used by ocaml-solo5 and I no longer get the error. Not sure if it's related to this PR but the error is still strange.

Despite this point, I agree with this PR.

Copy link
Member

@palainp palainp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with this PR, however I've tested only on linux, so it's important to wait for Solo5/solo5#590 to be merged and solo5 released for CI to be happy with freeBSD.
Should we also wait for ocaml/ocamlfind#99 to be merged and released (and threads removed again)?

@shym
Copy link
Contributor Author

shym commented Mar 3, 2025

Should we also wait for ocaml/ocamlfind#99 to be merged and released (and threads removed again)?

I would not say so, we can keep using the same workaround we’ve been using for a long time, and drop it if/when upstream is fixed.

@shym
Copy link
Contributor Author

shym commented Mar 3, 2025

I can not understand it now but I will try to figure out where is the error (note that I was able to delete the boostrap patch on myself and compile ocaml-solo5).

This is really weird.
I wonder:

  • what is in the difference between this PR and your branch where it worked?
  • whether the Arch package is reproducible?

@palainp
Copy link
Member

palainp commented Mar 4, 2025

Thank you all for your comments. As the error is gone at @dinosaure , and it seems the only ci blocker is removed in current HEAD, rebasing should be enough?

shym added 11 commits March 4, 2025 17:21

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Rename the examples, so that there are no longer `main.xyz.ml`, just
`xyz.ml`
Extract the common C modules into a library so that all the tests can be
compiled (and executed) in one run of `dune`
Set up a `dune_gen` to avoid the boilerplate of all the options (in
particular since putting the manifest in a library requires an explicit
option to link it in)
Those patches are extracted from (only keeping what is relevant for
OCaml/Solo5):

- PR 13526 (merged upstream) (10 commits)
- PR 13674 (merged upstream) (1 commit)
- PR 13719 (merged upstream) (1 commit)
- PR 13735 (merged upstream) (7 commits)
- a not-opened-yet PR to support freestanding targets (2 commits)
- plus a not-to-be-upstreamed commit to fix the maximum number of
  domains to 1
Since PR 13526, the OCaml compiler comes with specific rules to build
and install a cross compiler so rely on them rather than
`ocaml-solo5`-specific workarounds
Use opam’s ability to install both what is registered in a .install file
and run the `install` commands to combine OCaml’s install rule with a
`.install` file for other files (`nolibc` and `openlibm` libraries,
`ocamlfind` files and the toolchain)
Those functions are necessary only to compile the bytecode part of the
OCaml compiler so they won’t be called
Follow the `ocamlfind` configuration for the standard OCaml compiler by
adding the compiler’s directory at the beginning of the configuration
`path` variable, so that OCaml-provided `META` files can be found by
`ocamlfind`

This makes it possible to find the OCaml-provided libraries, such as
`compiler-libs`
@shym
Copy link
Contributor Author

shym commented Mar 5, 2025

I rebased it, but it is still waiting for a release of Solo5 to support FreeBSD, so that the TMP commit can be dropped.

@palainp
Copy link
Member

palainp commented Mar 6, 2025

Now solo5 0.9.1 is released and merged in opam-repository, I think I can drop the last commit to trigger a CI test, is that ok for you @shym ?

@shym
Copy link
Contributor Author

shym commented Mar 6, 2025

Do you understand the CI error on FreeBSD? Is the solo5.0.9.1 package too fresh to have been visible during that run, maybe?
(Feel free to fix it if you know how to)

@hannesm
Copy link
Member

hannesm commented Mar 6, 2025

Do you understand the CI error on FreeBSD? (Feel free to fix it if you do)

to me it looks like there's still somwtiing pinning solo5 to your branch. maybe the caching is bad?

(git+file:///tmp/cirrus-ci-build#HEAD)
[ERROR] Could not synchronize /.opam/5.3.0/.opam-switch/sources/solo5 from "git+https://github.com/shym/solo5.git#freebsd-stdalign":
        Branch freebsd-stdalign not found

@palainp
Copy link
Member

palainp commented Mar 7, 2025

I agree the issue is likely due to the pin still present, I tried to trigger CI again, but I can see the pin is still present :(

@hannesm
Copy link
Member

hannesm commented Mar 7, 2025

My proposal: remove the cache stuff in .cirrus.yml. I don't know much about that caching (not sure who proposed it) -- but better have a correct CI than one with a bad caching ;) -- alternatively debug the caching...

@palainp palainp force-pushed the 530 branch 3 times, most recently from a3c787f to 15b119a Compare March 7, 2025 10:57
@palainp
Copy link
Member

palainp commented Mar 7, 2025

So I made things worse by playing with the cache :( I'm going to look for and read some docs :)

@palainp
Copy link
Member

palainp commented Mar 7, 2025

It seems to be not related with me fiddling with cache (ocaml/setup-ocaml#943). So I'll re-remove the cache part later when that issue will be solved.
EDIT: I mean the Test/Ocaml 5.3.0 action that fails now. I continue to look around for the freebsd cirrus failure.

@palainp
Copy link
Member

palainp commented Mar 7, 2025

\o/ CI takes longer but is green now :)

@palainp palainp merged commit 6182fa0 into mirage:main Mar 7, 2025
2 checks passed
@dinosaure
Copy link
Member

This is really weird.
I wonder:

  • what is in the difference between this PR and your branch where it worked?
  • whether the Arch package is reproducible?

It seems that the Archlinux version compile with -ffat-lto-objects. I suspect that it changes in a way (not sure why) *.cmi outputs. ocaml-solo5 expects, in some way (as far I understand), that the stdlib provided by the host system should be exactly the same as the one it tries to compile. I think that's the root of the issue.

@palainp
Copy link
Member

palainp commented Mar 8, 2025

This is really weird.
I wonder:

  • what is in the difference between this PR and your branch where it worked?
  • whether the Arch package is reproducible?

It seems that the Archlinux version compile with -ffat-lto-objects. I suspect that it changes in a way (not sure why) *.cmi outputs. ocaml-solo5 expects, in some way (as far I understand), that the stdlib provided by the host system should be exactly the same as the one it tries to compile. I think that's the root of the issue.

Oh sorry, I merged to fast, I thought it was OK. Do you want to fix that in a new PR and merge before a release?

@dinosaure
Copy link
Member

Oh sorry, I merged to fast, I thought it was OK. Do you want to fix that in a new PR and merge before a release?

It's ok but I think it arises the question of the reproducibility expected by ocaml-solo5 when it comes to load *.cmi from the host system.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants