Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add commits differentiation logic for incremental build #4207

Merged
merged 6 commits into from
Nov 21, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion manifests/2.12.0/opensearch-dashboards-2.12.0.yml
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
---
schema-version: '1.0'
schema-version: '1.1'
build:
name: OpenSearch Dashboards
version: 2.12.0
Expand Down
10 changes: 10 additions & 0 deletions src/build_workflow/build_args.py
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@ class BuildArgs:
architecture: str
distribution: str
continue_on_error: bool
incremental: bool

def __init__(self) -> None:
parser = argparse.ArgumentParser(description="Build an OpenSearch Distribution")
Expand Down Expand Up @@ -103,6 +104,14 @@ def __init__(self) -> None:
action="store_true",
help="Do not fail the distribution build on any plugin component failure.",
)
parser.add_argument(
"-i",
"--incremental",
dest="incremental",
default=False,
action="store_true",
help="Given previous build artifacts are present, build incrementally.",
)

args = parser.parse_args()
self.logging_level = args.logging_level
Expand All @@ -116,6 +125,7 @@ def __init__(self) -> None:
self.distribution = args.distribution
self.script_path = sys.argv[0].replace("/src/run_build.py", "/build.sh")
self.continue_on_error = args.continue_on_error
self.incremental = args.incremental

def component_command(self, name: str) -> str:
return " ".join(
Expand Down
64 changes: 64 additions & 0 deletions src/build_workflow/build_incremental.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
# Copyright OpenSearch Contributors
# SPDX-License-Identifier: Apache-2.0
#
# The OpenSearch Contributors require contributions made to
# this file be licensed under the Apache-2.0 license or a
# compatible open source license.

import logging
import os
from typing import List

from manifests.build_manifest import BuildManifest
from manifests.input_manifest import InputManifest


class BuildIncremental:
def __init__(self, input_manifest: InputManifest, distribution: str):
self.distribution = distribution
self.input_manifest = input_manifest

# Given input manifest and return a list of what components changed and added.
def commits_diff(self, input_manifest: InputManifest) -> List[str]:
build_manifest_path = os.path.join(self.distribution, "builds", input_manifest.build.filename, "manifest.yml")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am assuming we will be downloading the build-manifest and storing it in the path mentioned before this logic executes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the scope of the workflow on python side, we would assume we already have the builds folder along with build-manifest in local directory. We would need to download this entire builds folder on the CI level when we are running on Jenkins.

if not os.path.exists(build_manifest_path):
logging.info("Previous build manifest does not exist. Rebuilding Core.")
return [input_manifest.build.name.replace(" ", "-")]
previous_build_manifest = BuildManifest.from_path(build_manifest_path)
stable_input_manifest = input_manifest.stable()
if previous_build_manifest.build.version != stable_input_manifest.build.version:
Copy link
Collaborator

@rishabh6788 rishabh6788 Nov 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this check is here?
Could there be a case where OS version in the previous build manifest differ from that in input manifest?
If yes, can you please give an example.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This came out of my mind when I was testing on local. When we are building artifacts in local, we don't have any sub-folder for different version. All built artifacts stored in the same directory following this pattern: <architecture>/builds/<opensearch/opensearch-dashboards>/<plugins/... or manifest.yml>. Even we built input manifests of different versions. Latter build-manifest will overwrite the previous one but all artifacts would all be in place without any of removal.

So I believe at least we need to check we are incrementally building the same version artifacts.

logging.info("The version of previous build manifest doesn't match the current input manifest. Rebuilding Core.")
return [input_manifest.build.name.replace(" ", "-")]
components = []
for component in stable_input_manifest.components.select():
if component.name not in previous_build_manifest.components:
components.append(component.name)
logging.info(f"Adding {component.name} since it is missing from previous build manifest")
continue
if component.ref != previous_build_manifest.components[component.name].commit_id: # type: ignore[attr-defined]
components.append(component.name)
logging.info(f"Adding {component.name} because it has different commit ID and needs to be rebuilt.")
continue
return components

# Given updated plugins and look into the depends_on of all components to finalize a list of rebuilding components.
def rebuild_plugins(self, changed_plugins: List, input_manifest: InputManifest) -> List[str]:
if not changed_plugins:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be add this check in the run_build.py and call this method only if you have plugins to rebuild as the method name suggests.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes sure. This would be considered in the current implementation on the building process for incremental build.

return []

if any(core in changed_plugins for core in ("OpenSearch", "OpenSearch-Dashboards")):
logging.info("Core engine has new changes, rebuilding all components.")
return [component.name for component in input_manifest.components.select()]

queue = changed_plugins
zelinh marked this conversation as resolved.
Show resolved Hide resolved
rebuild_list = []

rebuild_list.append("OpenSearch-Dashboards") if input_manifest.build.filename == "opensearch-dashboards" else rebuild_list
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did not understand this.
Aren't we supposed to rebuild everything if core is present in the changed_plugins list at line 49?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or for OSD plugins to rebuild, we need OSD to be rebuilt again?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. OSD doesn't have any those built-tool available just like OS. Every time we rebuild any of dashboards plugin, we have to build OSD core.

while queue:
plugin = queue.pop(0)
rebuild_list.append(plugin) if plugin not in rebuild_list else rebuild_list
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this?
Is there a possibility to have duplicates in the passed changed_plugins list?
If yes, shouldn't that be handled before it is being passed here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to make sure there is no duplicate in the rebuild_list which we would return.

for dependent_plugin in input_manifest.plugins_depend_on(plugin):
queue.append(dependent_plugin) if dependent_plugin not in rebuild_list else queue

logging.info(f"Rebuilding list is {rebuild_list}")
return rebuild_list
7 changes: 7 additions & 0 deletions src/manifests/input_manifest.py
Original file line number Diff line number Diff line change
Expand Up @@ -130,6 +130,13 @@ def stable(self) -> 'InputManifest':
manifest.components.__stabilize__()
return manifest

def plugins_depend_on(self, plugin: str) -> List[str]:
plugins = []
for component in self.components.select():
if component.depends_on is not None and plugin in component.depends_on:
zelinh marked this conversation as resolved.
Show resolved Hide resolved
plugins.append(component.name)
return plugins

class Ci:
def __init__(self, data: dict) -> None:
self.image = None if data is None else self.Image(data.get("image", None))
Expand Down
7 changes: 7 additions & 0 deletions src/run_build.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@
import sys

from build_workflow.build_args import BuildArgs
from build_workflow.build_incremental import BuildIncremental
from build_workflow.build_recorder import BuildRecorder
from build_workflow.build_target import BuildTarget
from build_workflow.builders import Builders
Expand Down Expand Up @@ -41,6 +42,12 @@

output_dir = BuildOutputDir(manifest.build.filename, args.distribution).dir

if args.incremental:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a placeholder for the next PR and it's just logging a list of plugins for rebuilding. I have complete tests for all those functions within buildIncremental class.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It still is part of the code flow and should be tested, if not in this PR then in subsequent PRs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will absolutely add tests in my following PR.

buildIncremental = BuildIncremental(manifest, args.distribution)
list_of_updated_plugins = buildIncremental.commits_diff(manifest)
logging.info(f"Plugins for incremental build: {buildIncremental.rebuild_plugins(list_of_updated_plugins, manifest)}")
return 0

Check warning on line 49 in src/run_build.py

View check run for this annotation

Codecov / codecov/patch

src/run_build.py#L46-L49

Added lines #L46 - L49 were not covered by tests
Comment on lines +45 to +49
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pointing the exact question from the overview here: Wondering what difference is this making in overall build.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah as I stated in the description of this issue, these additions in run_build.py is a placeholder to further implementation on incremental build. For this PR, it's intended for getting a list of components for rebuild.


with TemporaryDirectory(keep=args.keep, chdir=True) as work_dir:
logging.info(f"Building in {work_dir.name}")

Expand Down
Loading
Loading