
Formalize a process for bulk changes #4469

Closed
3 tasks done
ddbeck opened this issue Jul 8, 2019 · 11 comments · Fixed by #4755
Labels
docs Issues or pull requests regarding the documentation of this project. infra Infrastructure issues (npm, GitHub Actions, releases) of this project KR: Real BCD Key Result: Eliminate true/null values and replace them with "real" values.

Comments

@ddbeck
Collaborator

ddbeck commented Jul 8, 2019

Some recent changes, such as #3427 (enforcing feature orders) and #3888 (removing Edge Mobile), have highlighted that our process for making bulk changes to BCD is not ideal: it takes many PRs, reviewing is slow and tedious, and the difficulty has delayed many desirable changes. We can and should:

  • come up with a better process
  • try it out
  • document it for future use
@ddbeck ddbeck added infra Infrastructure issues (npm, GitHub Actions, releases) of this project docs Issues or pull requests regarding the documentation of this project. labels Jul 8, 2019
@ddbeck
Collaborator Author

ddbeck commented Jul 8, 2019

I did some reading on how projects do large code changes (e.g., how projects use tools like codemod). After taking a look, it sounds like we might benefit from treating bulk changes a bit like database migrations. Our process might look something like this:

  1. Through the normal PR process, add a script with tests that demonstrates the change to be made. Critically, the script should change one and only one thing about the data. Special cases should get their own script (and go through this process separately); manual edits should go through the ordinary change-and-review process.

  2. Conduct the migration. This should be a tightly scheduled process, in which someone (presumably the script author) runs the script and opens a PR and a designated reviewer reviews and merges it. Ideally, this review should be easy because it's already been tested and the resulting diff will contain no manual edits.

  3. Open and review a new PR, which introduces a linter change that enforces the changes applied by the script.
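To make step 1 concrete, a migration script could be a pure transform with no file I/O, so it can be tested against fixtures before ever touching real data. This is only a sketch: the browser key (`edge_mob`) and the tree shape are assumptions standing in for BCD's actual identifiers, and a real script would add a thin wrapper that walks the JSON files and rewrites them.

```javascript
'use strict';

// Assumed key for the retired browser; a real migration would use
// whatever identifier the BCD schema actually defines.
const BROWSER = 'edge_mob';

// Pure transform: returns a new tree with the browser's entries
// removed from every "support" block. The input is never mutated,
// which keeps the function easy to test on fixtures.
function removeBrowser(node) {
  if (node === null || typeof node !== 'object' || Array.isArray(node)) {
    return node;
  }
  const out = {};
  for (const [key, value] of Object.entries(node)) {
    if (key === 'support' && value !== null && typeof value === 'object') {
      // Drop the retired browser's key, keep everything else.
      const { [BROWSER]: _dropped, ...rest } = value;
      out[key] = removeBrowser(rest);
    } else {
      out[key] = removeBrowser(value);
    }
  }
  return out;
}

module.exports = { removeBrowser };
```

Because the transform is pure, the migration PR's diff is exactly `removeBrowser(data)` applied to each file, with no manual edits mixed in.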

A lot of this tends to happen already (e.g., @vinyldarkscratch has written and shared fix scripts before). The main difference seems to be in writing tests and scheduling the application of the change. It seems like doing it this way would reduce the number of PRs needed to apply a change, though it may increase the chance that uninvolved PR authors have to rebase their PRs.

If anyone has ideas for improving on this, I'm all ears. Similarly, we need to find a test case (e.g., @vinyldarkscratch's feature sorting might be a good candidate, if she's interested in testing the process), though there are plenty of bulk changes on our wish list.

@queengooborg
Contributor

I think that treating bulk changes like database migrations is just the way to go about it, and I’m in favor of this new review process. I would be more than happy to offer my PR and assist in this!

@ddbeck
Collaborator Author

ddbeck commented Jul 8, 2019

Thanks, @vinyldarkscratch!

I missed a couple of things: one, I didn't quite make it explicit that the change script should come with tests, independent of the real data. Basically, we should make it abundantly clear what the change script does and validate that it does that thing, before running it on real data.

Also, I should probably tag @Elchi3 on this, since we talked a little bit about this recently.

@jpmedley
Contributor

jpmedley commented Jul 8, 2019

How long would this process take from end to end? I'm close to being able to automate most updates for api/ for a full version of Chrome. I ask because I would need to use this process every six weeks.

@queengooborg
Contributor

Personally, I would say that changes like yours aren't really the bulk changes we're thinking about in this discussion, @jpmedley. What we're mostly referring to is things like massive, automated schema changes, or feature sorting, or the removal of a deprecated browser. So, I think you're in the clear!

I didn't quite make it explicit that the change script should come with tests, independent of the real data.

I'd assume that we would want to keep those tests around, correct? This also has me thinking about the linter scripts: perhaps we should have tests for each of the linter functions as well, to make sure that any changes to them will still produce results as expected. Just a little extra thought I figure I'd throw out!

@ddbeck
Collaborator Author

ddbeck commented Jul 9, 2019

@jpmedley Yeah, I don't think your case is impacted by this. I was thinking more of cases where we're transforming existing data. We know what the JSON looks like before and what it should look like after. Maybe I should have used the term "transformations" instead of "changes." Your automation would, presumably, introduce wholly new data and I don't think a process like I've described here would be helpful in such cases.

@vinyldarkscratch It wouldn't be a bad idea to have tests for the existing linter behavior, though I suspect the code itself is well exercised anyway, if how many npm test failures I've seen is any indication. Either way, it's probably better as a separate issue; I don't think we need those tests to start using a process for bulk transformations.

@Elchi3
Member

Elchi3 commented Jul 9, 2019

I'm totally in favor of having a bit more formal process around bulk changes. Once we finally switch to semantic versioning (secretly aiming for 1.0.0 after 0.0.99), such changes can be reflected in the minor or (if breaking) major release versions and be properly announced in release notes as well. If we formalize this process by 1.0.0, I'm sure data consumers would appreciate it :)

I also agree Joe's case isn't necessarily affected here. It's more about structural things. "Transformations" is a good word for it, as is thinking about it like database migrations.

@jpmedley
Contributor

jpmedley commented Jul 9, 2019

(Might I suggest the term 'schema migration'.)

@queengooborg
Contributor

Some conversation regarding milestones led me to an idea that I figured I’d place here. Milestones have due dates, and are used to help track big breakthroughs. Currently, only milestones for sprints exist, but it was mentioned in #4417 that peers can create their own milestones. Perhaps it would be useful here? (Just some 5:27-in-the-morning thoughts to throw out there!)

@ddbeck
Collaborator Author

ddbeck commented Jul 11, 2019

Yeah, milestones might be useful for tracking these migrations, since there will probably be at least three or four issues/PRs for each one: the proposal issue, the PR adding the migration script and test (this might be combined with the proposal itself), the PR running the migration, and the PR enabling the linter check (if applicable). We can try that out too and see if it's seen as helpful or essential.

@ddbeck
Collaborator Author

ddbeck commented Aug 28, 2019

OK, we've merged #3427, the first attempt to go through this new process. My thanks to @vinyldarkscratch for doing the hard parts and @Elchi3 for supporting this experiment.

The good news is that this experiment shows this process is better. Despite some minor hiccups, I can't say I'd like to go back to the old way of doing things. In detail, the outline I described in #4469 (comment) seems to hold up. I did discover a couple of ways we might improve for the next migration:

  • It would be a good idea to have a long-lived tracking issue for each migration. We managed next steps in each successive PR of the process, which caused process discussion in each PR to delay progress a little (e.g., leaving a PR open until some task, like scheduling a migration, was completed). We could have that discussion in the issue that provokes a migration, or in a standalone migration tracking issue, but keeping things organized in the PRs themselves was a bit noisy.
  • The actual migration PR and review should probably be completed by two people who share more than a couple of overlapping daylight hours, even if that means deputizing someone to do the migration PR with a Co-authored-by annotation. A rigid schedule and a lot of time zones between us put a needless burden on @vinyldarkscratch and I'd like to avoid that in the future.

The next step toward completing this is to open a PR documenting the process. I plan to do this in the next few days. Watch this space.

@jmswisher jmswisher added the KR: Real BCD Key Result: Eliminate true/null values and replace them with "real" values. label Aug 30, 2019
@jmswisher jmswisher added this to the Odetta (S3 Q3 2019) milestone Aug 30, 2019