Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: improve change detection for GHAs #27904

Merged
merged 6 commits into from
Apr 8, 2024
Merged

fix: improve change detection for GHAs #27904

merged 6 commits into from
Apr 8, 2024

Conversation

mistercrunch
Copy link
Member

@mistercrunch mistercrunch commented Apr 4, 2024

SUMMARY

Recently with #27867 (upgrading) pylint, we discovered that many python-related checks that should have run didn't run. The conditional execution system was uneven and showing different issues.

This PR streamline the whole conditional execution of github actions across the board.

The problem:

  • many, imperfect ways to do file change detection
    • native GHA paths triggers that are incompatible with "required checks" (and require no-op.yml, to the script)
    • a shell script that's a bit cryptic and a bit verbose and error-prone
  • actions that should be conditional but aren't set up that way

The solution:

  • creation of a single reusable action for all file-based conditional execution
  • introducing scripts/change_detector.py, a simple script inspired by scripts/ci_check_no_file_changes.sh, but more advanced and readable, handling both the pull_request and push context
  • moving all actions with file-related triggers to using this
  • getting rid of the no-op.yml hack altogether

TESTING

A fair amount of manual testing here, triggering with/without python/frontend/docker changes and making sure that the right steps get executed or skippped

pull_request:
types: [synchronize, opened, reopened, ready_for_review]
paths:
- "superset/**"
Copy link
Member

@john-bodley john-bodley Apr 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mistercrunch I wonder if we would need to include .pre-commit-config.yaml or would the premise be that if pre-commit needed to touch any files the superset-python-misc.yml workflow would fail. That would then force the user run tox -e pre-commit locally which would then touch files in the superset/** and tests/** paths?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pre-commit has its own [required] check and runs on all files (not just the last commit as it does locally) and will non-zero exit if there's anything. So that prevents people from commit --no-verify or simply not setting up pre-commit

@mistercrunch
Copy link
Member Author

Now I'm realizing that some of the files I touched here use ./scripts/ci_check_no_file_changes.sh and others the "if/no-op" pattern, which is not super elegant but pretty functional and DRY. Let me standardize one way or another here.

@pull-request-size pull-request-size bot added size/XL and removed size/S labels Apr 4, 2024
@mistercrunch mistercrunch changed the title fix: add more path triggers for python GHAs fix: improve change detection for GHAs Apr 4, 2024
id: check
uses: ./.github/actions/change-detector/
with:
token: ${{ secrets.GITHUB_TOKEN }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a use case where you might use different github tokens for different actions? If not, you could DRY it up further and not pass a token from every action to the change detector, but rather use the secret directly there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seemed like a limitation / requirement give how GHA works, I was able to make everything else DRY, but not this. We really just need read access to a public repo here (looking up either the PR's file touched, or the diff since last commit for push events), so maybe there's a way this can work without passing a token explicitely. Let me test whether it works without passing it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested and seems it's just the way reusable actions work - they don't have access to secrets for security reason. The previous shell script had to do something similar, required also the PR_NUMBER to be passed (now I pick it up from env var), and I think it didn't support push events properly. So still a step forward.

from urllib.request import Request, urlopen

# Define patterns for each group of files you're interested in
PATTERNS = {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the nice part that makes things evolutive. Here patterns for all actions can be managed centrally and be extended to support more rules.

@@ -109,7 +108,7 @@ jobs:
run: cypress-run-all
- name: Upload Artifacts
uses: actions/upload-artifact@v4
if: failure()
if: steps.check.outputs.python || steps.check.outputs.frontend
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this statement different than failure() which will catch if any of the above run steps fail?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from my understanding if: failure() is similar to if: steps.check.outcome == 'failure'. The previous bash script would receive params, like python or frontend and exit 1 if it discovered a file matched the pattern, then subsequent jobs used that failure as a trigger. Now if failure() is broader than if: steps.check.outcome == 'failure', and will trigger the job if ANY of the previous step failed. It feels like this is right though

Copy link
Member

@geido geido left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGMT but I'd like to have some more eyes on this. CC @dpgaspar and @craig-rueda

@mistercrunch
Copy link
Member Author

I'll merge since we had 3 reviewers looking into this one so far. I will be monitoring master and upcoming PRs to make sure the right jobs are ran or skipped.

@mistercrunch mistercrunch merged commit e80d194 into master Apr 8, 2024
28 checks passed
@mistercrunch mistercrunch deleted the trigger_ci branch April 8, 2024 23:20
@Shoki52
Copy link

Shoki52 commented Apr 9, 2024

Hello there, can anyone help me? I want to add new currency by overwriting config file, where I can see the list of all currencies or locales

@mistercrunch
Copy link
Member Author

Please open a new issue. You start by searching the codebase for currency-related terms or symbol.

EnxDev pushed a commit to EnxDev/superset that referenced this pull request Apr 15, 2024
qleroy pushed a commit to qleroy/superset that referenced this pull request Apr 28, 2024
jzhao62 pushed a commit to jzhao62/superset that referenced this pull request May 16, 2024
vinothkumar66 pushed a commit to vinothkumar66/superset that referenced this pull request Nov 11, 2024
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 4.1.0 labels Nov 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels github_actions Pull requests that update GitHub Actions code preset-io size/XL 🚢 4.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants