Let's build all en-us docs every 6h instead of 24h #2642

Closed
peterbe opened this issue Jan 21, 2021 · 9 comments
Labels
enhancement Improves an existing feature. 🚉 platform keeping the platform healthy

Comments

@peterbe
Contributor

peterbe commented Jan 21, 2021

At the moment, the cron job in prod-build.yml runs every 24h and takes about 50min each time. I believe that building only mdn/content (i.e. only en-us) takes about 15min. The translated-content very rarely changes. It will change once we unfreeze it, but even then it might be worth keeping that build at every 24h.

What we could do instead is build the en-us content every 6h and everything every 24h.

This way, the time between a merged mdn/content PR and getting it into production will be reduced. We still have the CDN cache which might hold on to a page "too long" but if it's a new edit, the chances are better.

@peterbe
Contributor Author

peterbe commented Jan 21, 2021

We could also be more specific. Instead of - cron: "0 */24 * * *" we could pick a couple of specific hours to run on. Hours that match the timezones of North Americans and Europeans better.

Whichever we do, we'd need to be more clever about knowing when to include the mdn/translated-content and when not to. For example:

# Include the translated content only in the run that starts in the 01:00 UTC hour
if [ "$(date -u +%H)" = "01" ]; then
  export CONTENT_TRANSLATED_ROOT=mdn/translated-content/files
fi

Ideas welcome.
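
One way to make this kind of hour check easier to test is to pass the hour in explicitly. The sketch below is only an illustration: `translated_root_for_hour` is a hypothetical helper, not something that exists in Yari, and it assumes the full build is tied to a single UTC hour that the cron schedule actually fires on. With a schedule such as `0 */6 * * *`, that hour would have to be 00, 06, 12 or 18.

```shell
# Hypothetical helper, not part of Yari.
# Prints the translated-content root only when HOUR ($1) equals
# FULL_BUILD_HOUR ($2). Both are two-digit UTC hours, as printed
# by `date -u +%H`. Prints nothing for every other hour.
translated_root_for_hour() {
  if [ "$1" = "$2" ]; then
    echo "mdn/translated-content/files"
  fi
}

# Example use in a scheduled job. "00" as the full-build hour is an assumption.
root=$(translated_root_for_hour "$(date -u +%H)" "00")
if [ -n "$root" ]; then
  export CONTENT_TRANSLATED_ROOT="$root"
fi
```

If a scheduled job starts late and the clock has already rolled into the next hour, a check like this would skip the full build, so a real implementation might want a more robust trigger.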

@escattone
Contributor

@peterbe I really like this idea. I suppose it adds a wrinkle to how we resolve #2224, in that we'll have to limit the scope of our approach to whatever we're actually building (e.g., only remove docs or redirects in S3 that don't exist within the scope of what we've actually built, such as the English documents).
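
To illustrate the scoping concern, here is a rough sketch of limiting deletion candidates to the part of the bucket that a given build covers. Everything in it is an assumption made for illustration: `stale_keys` is a hypothetical helper, the `en-us/` key prefix is a guess at the layout, and the two input files stand in for a bucket listing and a build manifest. It is not how the deployer works today.

```shell
# Hypothetical helper, not part of the deployer.
# stale_keys PREFIX EXISTING_FILE BUILT_FILE
#   PREFIX         key prefix covered by this build (assumed, e.g. "en-us/")
#   EXISTING_FILE  one existing key per line (e.g. from a bucket listing)
#   BUILT_FILE     one key per line for everything this build produced
# Prints the keys under PREFIX that exist but were not built. Keys outside
# PREFIX are never printed, so a partial build cannot flag them for removal.
stale_keys() {
  # Keep only the keys that start with the prefix (plain string match),
  # then drop every key that appears verbatim in the built list.
  awk -v p="$1" 'index($0, p) == 1' "$2" | grep -vxF -f "$3"
}
```

With an English-only build, a translated document that is missing from the build output stays out of the result because it falls outside the prefix.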

@Ryuno-Ki

Hm … so when switching to non-24h … can we make sure we don't run into Daylight Saving Time shifts somehow?
I'd even say, triggering a build at 1am (if I interpret @peterbe's code snippet correctly) could get risky …
I'd rather suggest 10pm and 4am.

10pm CET ==  4pm EST ==  1pm PST
 4am CET == 10pm EST ==  7pm PST
10am CET ==  4am EST ==  1am PST
 2pm CET ==  8am EST ==  5am PST

Would that work out?
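
Conversions like these are easy to get wrong by hand, so here is a small sketch that derives them from a UTC hour. It assumes fixed standard-time offsets (CET = UTC+1, EST = UTC-5, PST = UTC-8) and deliberately ignores Daylight Saving Time. `local_hour` is a hypothetical helper. As far as I know, GitHub Actions evaluates cron schedules in UTC, so a DST change would shift the local wall-clock time of a build rather than the UTC time it runs at.

```shell
# Hypothetical helper for illustration.
# local_hour UTC_HOUR OFFSET
#   UTC_HOUR  plain integer 0-23 (no leading zero, e.g. 9 rather than 09)
#   OFFSET    fixed offset from UTC in whole hours, e.g. 1, -5, -8
# Prints the corresponding local hour as an integer 0-23.
local_hour() {
  echo $(( ($1 + $2 + 24) % 24 ))
}

# Print a small table for a few candidate UTC build hours.
for utc in 3 9 13 21; do
  printf '%02d UTC = %02d CET = %02d EST = %02d PST\n' \
    "$utc" "$(local_hour "$utc" 1)" "$(local_hour "$utc" -5)" "$(local_hour "$utc" -8)"
done
```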

@peterbe
Contributor Author

peterbe commented Jan 25, 2021

I made an ad-hoc histogram that plots the number of commits on mdn/content per UTC hour:

00:00 ******
01:00 *********************
02:00 ***
03:00 ********
04:00 ********
05:00 ***********************
06:00 ***********
07:00 *****************************************************
08:00 *****************************************************************
09:00 *************************************************************
10:00 ************************************************************************************************
11:00 ****************************************************************************************************
12:00 ********************************************************************
13:00 **********************************************************************
14:00 ***********************
15:00 ****************************************
16:00 ******************************************************************
17:00 *****************************************
18:00 **********************************************
19:00 ***********************************
20:00 *************************
21:00 ********
22:00 **********
23:00 ****************

(this excludes the Dependabot PRs)

This is based on only about 3 weeks of PRs. And it's not very conclusive since we've been doing lots of mass-edits for things like flaw cleanups.
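
For anyone who wants to reproduce something similar, here is a minimal sketch of how such a histogram could be generated. It is not the script that produced the chart above. `hour_histogram` is a hypothetical helper that reads one two-digit hour per line, and the `git log` invocation in the comment is an assumption about how the hours could be extracted. It counts commits by author date and does not filter out Dependabot.

```shell
# Hypothetical helper for illustration.
# Reads one hour per line (00-23) on stdin and prints one row per hour,
# "HH:00 " followed by one asterisk per occurrence of that hour.
hour_histogram() {
  awk '
    NF { count[$1 + 0]++ }          # "07" + 0 becomes 7; blank lines are skipped
    END {
      for (h = 0; h < 24; h++) {
        bar = ""
        for (i = 0; i < count[h]; i++) bar = bar "*"
        printf "%02d:00 %s\n", h, bar
      }
    }
  '
}

# Possible use inside a checkout of mdn/content (assumed invocation):
#   TZ=UTC git log --since="3 weeks ago" --date=format-local:%H --format=%ad | hour_histogram
```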

@Ryuno-Ki

We should avoid the 7am - 7pm UTC at least.

@peterbe
Contributor Author

peterbe commented Jan 26, 2021

> We should avoid the 7am - 7pm UTC at least.

Sorry, I don't follow. Why should we avoid it?

@Ryuno-Ki

That window has the highest number of commits, so the cache would go stale too often.
If you build "right after the decline", chances are that most changes will become visible => happier contributors.

peterbe added a commit to peterbe/yari that referenced this issue Feb 12, 2021
build git-recent-hours
@peterbe peterbe mentioned this issue Feb 12, 2021
4 tasks
@schalkneethling

This still sounds like a good idea, what do you think @escattone?

@schalkneethling schalkneethling added enhancement Improves an existing feature. 🚉 platform keeping the platform healthy labels Nov 3, 2021
@escattone
Contributor

@schalkneethling I don't think it's worth the effort right now, because of my earlier comment, but I think it's worth considering in the future when there's time to make the changes necessary to the deployer code.

Repository owner moved this from Backlog to Done in Yari Platform Engineering Nov 5, 2021

4 participants