Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

negative package selection is broken #2005

Closed
AiyionPrime opened this issue May 3, 2020 · 2 comments · Fixed by #2007
Closed

negative package selection is broken #2005

AiyionPrime opened this issue May 3, 2020 · 2 comments · Fixed by #2007
Assignees
Labels
0. type: bug This is a bug 0. type: regression 1. severity: blocker This issue/pr is required for the next release 3. topic: build This is about the build system
Milestone

Comments

@AiyionPrime
Copy link
Member

AiyionPrime commented May 3, 2020

Bug report

What is the problem?

  • What is not working as expected?

In hanover we made use of the negative package selection like this
-gluon-web-autoupdater
this does not work anymore, as we intended to.

  • How is it misbehaving?

Building a recent gluon (latest) with our site results in the build-process trying to add the marked package, though its functionality (and files) are provided by a different one.
This results in breaking the build, as the step check_data_file_clashes does correctly break, as both packages provide nearly the same files.
The point is, the negative marked one should not only not provide no files, but not be part of the build at all.

  • When did the problem first start showing up?

Our nightly build system was broken for a while, due to broken patches on our end, so we were not immediately aware of it.
But after bisecting gluon 1df243d (merged in #1876) appears to be the first bad commit.

  • What were you doing when you first noticed the problem?

Repairing our buildsystem in order to provide new device-support.

  • On which devices (vendor, model and revision) is it misbehaving?

I bisected this only for the target ramips-mt76x8 but the issue should show up on every target we got.

  • Does the issue appear on multiple devices or targets?

Yes, I suppose it does on every single one of them.

What is the expected behaviour?

  • How do you think it should work instead?

The negative marked package should not be automatically added to the build.
In the best scenario, the automatic process would ensure, that something provides the functionality, but not adding a package on it's own.
The latest gluon should be able to build with the latest site of hanover ( freifunkh/site@5052b05 )

  • Did it work like that before?

Yes, before the merge of #1876 .

Gluon Version:
the current latest gluon commit has the issue:
daf8a6c

the bug exists since this commit:
1df243d

Site Configuration:

freifunkh/site@5052b05

Additional info
the script evaluated by git bisect:

#!/bin/bash
make update
target="ramips-mt76x8"; make V=s GLUON_TARGET=$target &> logs/$(date -Is)_$target.log

latest=$(ls -t logs | head -n1)
echo reading $latest
! cat "logs/$latest" | grep "check_data_file_clashes"

the replacement package in ff-Hochstifts package repo:
https://git.ffho.net/FreifunkHochstift/ffho-packages/src/master/ffho-web-autoupdater

Warning: there is an unrelated warning about the ffh-district-migrate package.
It's save to ignore it, for this issue.

Custom patches:

Only the ones specified in hanovers patch directory.
They are most likely unrelated to this problem.

https://github.com/freifunkh/site/blob/master/patches/0001-fix-when-vlan-crc-is-00000000.patch
https://github.com/freifunkh/site/blob/master/patches/0002-patches-uradvd-change-ip-lifetimes-to-150-300-second.patch
https://github.com/freifunkh/site/blob/master/patches/0003-added-TP-Link-TL-WR841ND-N-Devices-for-8M-and-16M-Va.patch

@neocturne neocturne self-assigned this May 3, 2020
@neocturne neocturne added 0. type: bug This is a bug 0. type: regression 1. severity: blocker This issue/pr is required for the next release 3. topic: build This is about the build system labels May 3, 2020
@AiyionPrime
Copy link
Member Author

@mweinelt, @lemoer and me briefly spoke about this yesterday evening in our mumble.
@blocktrron and @mweinelt mentioned circumvention measures/a workaround in april:
https://gist.github.com/AiyionPrime/aec87ea3d24e952f1b722816f1d4a8ff#file-gistfile1-txt-L65

@mweinelt mweinelt added this to the 2020.2 milestone May 3, 2020
@neocturne
Copy link
Member

neocturne commented May 3, 2020

Okay, the principal issue here is the order in which we process and combine the different package lists. Before 9c52365, this was the following (although we never documented the precise logic):

  1. OpenWrt defaults (including target-specific defaults)
  2. Device-specific packages from OpenWrt
  3. Generic default packages (from target/generic)
  4. Fixed default packages ("hostapd-mini")
  5. Packages derived from GLUON_FEATURES
  6. GLUON_SITE_PACKAGES
  7. Target default packages (target/*)
  8. (Removal of opkg for tiny targets)
  9. Device-specific packages from `target/*
  10. Device-specific packages from GLUON_..._SITE_PACKAGES

With 9c52365 and the following patches this turns into the following:

  1. OpenWrt defaults (including target-specific defaults)
  2. Device-specific packages from OpenWrt
  3. Generic default packages (from target/generic)
  4. GLUON_SITE_PACKAGES (too early!)
  5. Target default packages (target/*)
  6. Packages derived from GLUON_FEATURES + GLUON_FEATURES_$(class) (GLUON_FEATURES is handled later than before)
  7. GLUON_SITE_PACKAGES_$(class) (for x86 - no device-specific packages on these targets)
  8. (Removal of opkg for tiny targets)
  9. Device-specific packages from `target/*
  10. Device-specific packages from GLUON_..._SITE_PACKAGES
  11. GLUON_SITE_PACKAGES_$(class) (too late?)

I think we definitely need to respect the following order:

  • GLUON_FEATURES
  • GLUON_FEATURES_$(class)
  • GLUON_SITE_PACKAGES
  • GLUON_SITE_PACKAGES_$(class)

The question is when the target- and device-specific packages are supposed to be handled - and what we can achieve without a major rewrite of the target config logic.

Handling GLUON_FEATURES before target/* is especially tricky, as the combined feature list will depend on the target class (if any), which is only known after processing target/*. I assume handling GLUON_FEATURES after the target-specific packages should be fine though.

I also see the problem of the special handling of x86, which doesn't define any devices and requires additional logic that doesn't interact nicely with the device package selection.While I wanted to avoid backporting too many patches from OpenWrt master to get the new x86 image build code (and figuring out what to do with VMDK/VDI images), this issue might provide a good reason to do so after all...

I propose use the following order:

  1. OpenWrt defaults (including target-specific defaults)
  2. Device-specific packages from OpenWrt
  3. Generic default packages (from target/generic)
  4. Target default packages (target/*)
  5. (Removal of opkg for tiny targets)
  6. Packages derived from GLUON_FEATURES + GLUON_FEATURES_$(class)
  7. GLUON_SITE_PACKAGES
  8. GLUON_SITE_PACKAGES_$(class)
  9. Device-specific packages from target/*
  10. Device-specific packages from GLUON_..._SITE_PACKAGES

This slightly deviates from the original version, as it will allow overriding target-specific packages from Gluon and device-specific packages from OpenWrt via GLUON_FEATURES and GLUON_SITE_PACKAGES, but that should hardly matter in practice. Note that "Device-specific packages from target/*" still override GLUON_FEATURES and GLUON_SITE_PACKAGES, so that devices without WLAN may opt out of hostapd explicitly.

Edit: I had to revise the above list slightly: "Device-specific packages from OpenWrt" is happening a lot earlier than I thought.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0. type: bug This is a bug 0. type: regression 1. severity: blocker This issue/pr is required for the next release 3. topic: build This is about the build system
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants