Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update semver logic: one patch version per checklist #7458

Merged
merged 11 commits into from
Feb 1, 2022

Conversation

roryabraham
Copy link
Contributor

@roryabraham roryabraham commented Jan 29, 2022

Details

This changes the deploy paradigm so that we no longer bump the PATCH version or run a staging deploy when a checklist is locked. Instead, we bump the PATCH version when the checklist is first created.

Fixed Issues

$ #7166

Tests

I tested commands locally and added automated unit tests where possible.

  • Verify that no errors appear in the JS console

QA Steps

  1. Close the deploy checklist.

  2. The app should have it's PATCH version bumped, and a staging deploy should occur./

  3. The new checklist should be created with the new PATCH version.

  4. Lock the checklist when there is no staging deploy running.

  5. The awaitStagingDeploys action should complete quickly, and OSBotify should comment on the deploy checklist:

    🚀 All staging deploys are complete, @Expensify/applauseleads please begin QA on version https://github.com/Expensify/App/releases/tag/new-build-version 🚀

  6. Unlock the checklist and merge another PR.

  7. While the staging deploy for the other PR is running, lock the checklist again.

  8. The awaitStagingDeploys action should poll the GH API every 90 seconds until the staging deploy completes.

  9. Once the staging deploy completes, the action should complete and OSBotify should comment on the deploy checklist with the name comment as above, using the new build version.

  • Verify that no errors appear in the JS console

@roryabraham roryabraham added the InternalQA This pull request required internal QA label Jan 29, 2022
@roryabraham roryabraham self-assigned this Jan 29, 2022
@roryabraham
Copy link
Contributor Author

Planned changes to this SO:

New Expensify uses Github Actions for CI/CD. At a high level, the process works like this:

1) When a PR is merged to main, a new `BUILD` version is created (`1.0.2-72` becomes `1.0.2-73`) and it is typically deployed to staging immediately.
2) As PRs are deployed, they are accumulated in a special tracking issue in Expensify/App labeled `StagingDeployCash`. That issue contains:
    1) The latest version on staging
    2) A link to a comparison between the latest staging version and the previous production version.
    3) A checklist of all PRs that were merged and deployed to staging since the last production deploy
    4) A checklist of any deploy blockers found by Applause during QA.
+3) When Applause is ready to begin QA, they add the `🔐LockCashDeploys🔐` label to the `StagingDeployCash`. That will prevent any newly merged PRs from being deployed to staging.
-3) When Applause is ready to begin QA, they add the `🔐LockCashDeploys🔐` label to the `StagingDeployCash`. That will:
-    1) Create a new `PATCH` version (`1.0.2-73` becomes `1.0.3-0`)
-    2) Prevent any newly merged PRs from being deployed to staging.
4) If issues are found on staging that are not present on production, they become deploy blockers.
    1) Any issue or PR labelled `DeployBlockerCash` will be auto-assigned to an Expensify engineer as an `Hourly`.
    2) To fix a `DeployBlocker`, there are two options:
        1) Create a PR to fix the deploy blocker, and give it the `CherryPickStaging` label. When merged, that PR will be CP'd to staging.
        2) Remove the lock label from the `StagingDeployCash` checklist, and "open the floodgates" by deploying the main branch to staging. This is the nuclear option that deploys any and all new code to staging and invalidates the current QA cycle.
4) When Applause finishes QA, one of two things can happen:
    1) If there are issues found during QA, they must wait for Expensify engineers to fix the deploy blockers and proceed with the production deploy.
    2) If there are no issues found during QA, the assigned mobile-deployer for the week will close the `StagingDeployCash` issue with a comment that includes the `:shipit:` emoji. That will:
        1) Deploy the latest staging version to production
-       2) Create a new `StagingDeployCash` checklist with all the PRs that were merged while the `StagingDeployCash` was locked, thus restarting the process.
+       2) Create a new `StagingDeployCash` checklist with all the PRs that were merged while the `StagingDeployCash` was locked, thus restarting the process. At this time, we will bump the `PATCH` version (`1.0.2-73` becomes `1.0.3-0`), so each checklist is associated with a unique `PATCH` version.

Applause does daily QA Monday through Friday, so we'll do a production deploy Monday trough Thursday when no issues are found during QA. As per company policy, we typically don't do production deploys on Fridays, but the deployer can decide to do a deploy regardless [if nedeed](https://stackoverflow.com/c/expensify/questions/10635).


**Note:** There is currently no process to cherry-pick hotfixes to production.


function run() {
let currentStagingDeploys = [];
return promiseDoWhile(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW our async/await ban really kills me here, in my opinion this would be much cleaner written like so:

const _ = require('underscore');
const GitHubUtils = require('../../libs/GithubUtils');

async function run() {
    let currentStagingDeploys = [];
    do {
        const response = await GitHubUtils.octokit.actions.listWorkflowRuns({
            owner: GitHubUtils.GITHUB_OWNER,
            repo: GitHubUtils.APP_REPO,
            workflow_id: 'platformDeploy.yml',
            event: 'push',
        });
        currentStagingDeploys = _.filter(response.data.workflow_runs, workflowRun => workflowRun.status !== 'completed');
        
        console.log(
            _.isEmpty(currentStagingDeploys)
                ? 'No current staging deploys found'
                : `Found ${currentStagingDeploys.length} staging deploy${currentStagingDeploys.length > 1 ? 's' : ''} still running...`,
        );
        
        await new Promise(resolve => setTimeout(resolve, GitHubUtils.POLL_RATE * 9))
    } while(!_.isEmpty(currentStagingDeploys));
}

if (require.main === module) {
    run();
}

module.exports = run;

@roryabraham roryabraham marked this pull request as ready for review January 31, 2022 20:08
@roryabraham roryabraham requested a review from a team as a code owner January 31, 2022 20:08
@MelvinBot MelvinBot requested review from timszot and removed request for a team January 31, 2022 20:09
@roryabraham roryabraham requested a review from a team January 31, 2022 20:10
@roryabraham
Copy link
Contributor Author

This is a pretty large change so I'm going to pull in a second reviewer here.

@MelvinBot MelvinBot requested review from pecanoro and removed request for a team January 31, 2022 20:10
@roryabraham
Copy link
Contributor Author

Also, a note for @timszot and @pecanoro – all the index.js files in this PR are compiled files and so should be ignored in reviews.

pecanoro
pecanoro previously approved these changes Jan 31, 2022
Copy link
Contributor

@pecanoro pecanoro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good but I left a comment of something I don't fully understand

@roryabraham
Copy link
Contributor Author

@pecanoro's question led me to realize there was another potential race condition hiding here that's now solved. Before my last two commits, this was possible:

  1. Merge a pull request.
  2. Checklist is not yet locked, so we start bumping the app version and getting ready to start a staging deploy in preDeploy.yml. It takes a couple minutes to bump the app version and update the staging branch due to delays starting workflows with the triggerWorkflowAndWait action.
  3. Checklist is locked.
  4. Before the staging deploy from step 2 actually begins, the lockDeploys workflow runs the awaitStagingDeploys action, and finds no active platformDeploy.yml staging deploys, so it proceeds and tells Applause to begin QA.
  5. After that, step 2 completes and a new staging deploy begins. This is bad because Applause may have already started QA 💥

To resolve this, awaitStagingDeploys now waits for all platformDeploy.yml and preDeploy.yml workflow runs to complete before proceeding.

@roryabraham roryabraham requested a review from pecanoro January 31, 2022 22:08
Copy link
Contributor

@timszot timszot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@timszot timszot merged commit d74f7a4 into main Feb 1, 2022
@timszot timszot deleted the Rory-OnePatchVersionPerChecklist branch February 1, 2022 20:35
@MelvinBot
Copy link

Triggered auto assignment to @pecanoro (InternalQA), see https://stackoverflow.com/c/expensify/questions/5042 for more details.

@botify
Copy link

botify commented Feb 1, 2022

@timszot looks like this was merged without passing tests. Please add a note explaining why this was done and remove the Emergency label if this is not an emergency.

@OSBotify
Copy link
Contributor

OSBotify commented Feb 1, 2022

✋ This PR was not deployed to staging yet because QA is ongoing. It will be automatically deployed to staging after the next production release.

@timszot timszot removed the Emergency label Feb 1, 2022
@timszot
Copy link
Contributor

timszot commented Feb 1, 2022

Not sure why this got flagged as Emergency as the tests show as passed for the latest commit.

@roryabraham
Copy link
Contributor Author

roryabraham commented Feb 2, 2022

Okay, doing the internal QA steps slightly out of order here:

  • Close the deploy checklist
  • The app should have its PATCH version bumped, and a staging deploy should occur. ✅ https://github.com/Expensify/App/actions/runs/1785543922
  • The new checklist should be created with the new PATCH version.
  • Lock the checklist with a staging deploy still running.
  • The awaitStagingDeploys action should poll the GH API every 90 60 seconds until the staging deploy completes.
  • Once the staging deploy completes, the action should complete and OSBotify should comment on the deploy checklist with the name comment as above, using the new build version.

@OSBotify
Copy link
Contributor

OSBotify commented Feb 2, 2022

🚀 Deployed to staging by @timszot in version: 1.1.35-0 🚀

platform result
🤖 android 🤖 success ✅
🖥 desktop 🖥 success ✅
🍎 iOS 🍎 success ✅
🕸 web 🕸 success ✅

@OSBotify
Copy link
Contributor

OSBotify commented Feb 4, 2022

🚀 Deployed to production by @sketchydroide in version: 1.1.35-1 🚀

platform result
🤖 android 🤖 success ✅
🖥 desktop 🖥 success ✅
🍎 iOS 🍎 success ✅
🕸 web 🕸 success ✅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
InternalQA This pull request required internal QA
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants