Abhisheku/mmlab model selection #3692

Open: wants to merge 30 commits into base: main

30 commits
9069624
initial commit
Nov 12, 2024
476a37f
Review comments
Nov 13, 2024
71ad8a3
Fixes
Nov 14, 2024
1256ae7
Fixes
Nov 14, 2024
a13de9f
Fixes
Nov 14, 2024
1e45634
Fixes
Nov 20, 2024
5de9a42
Merge branch 'main' into groopa/vision/mmd
roopavidhya Nov 21, 2024
74de879
If else rough
Dec 2, 2024
970d78e
test conditional block changes
abhishekMS2024 Dec 3, 2024
1e76215
test conditional block changes
abhishekMS2024 Dec 3, 2024
2241eaf
test conditional block changes
abhishekMS2024 Dec 4, 2024
a3459aa
test conditional block changes
abhishekMS2024 Dec 4, 2024
a88e05a
test conditional block changes
abhishekMS2024 Dec 5, 2024
1cc8bb5
test conditional block changes
abhishekMS2024 Dec 9, 2024
246d532
test conditional block changes
abhishekMS2024 Dec 9, 2024
a5f4628
test conditional block changes
abhishekMS2024 Dec 9, 2024
346c228
Removed unused code
abhishekMS2024 Dec 11, 2024
cfde392
Added output selector
abhishekMS2024 Dec 11, 2024
6c6fb01
Added output selector
abhishekMS2024 Dec 11, 2024
e05036b
remove un-needed components
deepanshMS Dec 18, 2024
2d01260
updated component versions
abhishekMS2024 Dec 18, 2024
031a310
removed unused code
abhishekMS2024 Dec 18, 2024
fa226b4
updated the download component for mmlab
abhishekMS2024 Dec 24, 2024
7cabbf1
updated the download component for mmlab
abhishekMS2024 Dec 24, 2024
15f5bad
updated the import model component
abhishekMS2024 Dec 24, 2024
067952c
code refactoring
abhishekMS2024 Dec 24, 2024
79817c8
Fixed syntax issue
abhishekMS2024 Dec 24, 2024
1b683a9
Merge branch 'main' into abhisheku/mmlab_model_selection
abhishekMS2024 Dec 24, 2024
a72c14c
Merge branch 'main' into abhisheku/mmlab_model_selection
abhishekMS2024 Jan 21, 2025
f38333d
Merge branch 'main' into abhisheku/mmlab_model_selection
deepanshMS Jan 22, 2025
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
$schema: https://azuremlschemas.azureedge.net/latest/commandComponent.schema.json

name: convert_model_to_mlflow
-version: 0.0.35
+version: 0.0.36
type: command

is_deterministic: True
Original file line number Diff line number Diff line change
@@ -4,7 +4,7 @@ type: pipeline
name: import_model
display_name: Import model
description: Import a model into a workspace or a registry
-version: 0.0.41
+version: 0.0.42

# Pipeline inputs
inputs:
@@ -252,6 +252,42 @@ jobs:
       validation_info:
         type: uri_file

+  model_framework_selector:
+    type: command
+    component: azureml:model_framework_selector:0.0.1
+    compute: ${{parent.inputs.compute}}
+    resources:
+      instance_type: '${{parent.inputs.instance_type}}'
+    identity:
+      type: user_identity
+    inputs:
+      validation_info: '${{parent.jobs.validation_trigger_import.outputs.validation_info}}'
+      model_framework: '${{parent.inputs.model_framework}}'
+    outputs:
+      is_mmd_framework:
+        type: uri_file
+
+  is_mmd_model:
+    type: if_else
+    condition: ${{parent.jobs.model_framework_selector.outputs.is_mmd_framework}}
+    true_block: ${{parent.jobs.download_mmd_model}}
+    false_block: ${{parent.jobs.download_model}}
+
+  download_mmd_model:
+    component: azureml:mmdetection_image_objectdetection_instancesegmentation_model_import:0.0.19
+    compute: ${{parent.inputs.compute}}
+    resources:
+      instance_type: '${{parent.inputs.instance_type}}'
+    identity:
+      type: user_identity
+    inputs:
+      model_family: 'MmDetectionImage'
+      model_name: ${{parent.inputs.model_id}}
+      download_from_source: true
+    outputs:
+      output_dir:
+        type: uri_file
+
   download_model:
     component: azureml:download_model:0.0.30
     compute: ${{parent.inputs.compute}}
@@ -262,7 +298,6 @@ jobs:
     inputs:
       model_source: ${{parent.inputs.model_source}}
       model_id: ${{parent.inputs.model_id}}
-      validation_info: ${{parent.jobs.validation_trigger_import.outputs.validation_info}}
       update_existing_model: ${{parent.inputs.update_existing_model}}
       token: ${{parent.inputs.token}}
     outputs:
@@ -272,7 +307,7 @@ jobs:
         type: uri_folder

   convert_model_to_mlflow:
-    component: azureml:convert_model_to_mlflow:0.0.35
+    component: azureml:convert_model_to_mlflow:0.0.36
     compute: ${{parent.inputs.compute}}
     resources:
       instance_type: '${{parent.inputs.instance_type}}'
@@ -286,6 +321,7 @@ jobs:
       model_framework: ${{parent.inputs.model_framework}}
       model_download_metadata: ${{parent.jobs.download_model.outputs.model_download_metadata}}
       model_path: ${{parent.jobs.download_model.outputs.model_output}}
+      model_path_mmd: ${{parent.jobs.download_mmd_model.outputs.output_dir}}
       hf_config_args: ${{parent.inputs.hf_config_args}}
       hf_tokenizer_args: ${{parent.inputs.hf_tokenizer_args}}
       hf_model_args: ${{parent.inputs.hf_model_args}}
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
type: component
spec: spec.yaml
categories: ["Models"]
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
$schema: https://azuremlschemas.azureedge.net/latest/commandComponent.schema.json

type: command
name: model_framework_selector
display_name: Model Framework Selector
description: Checks the framework for model
version: 0.0.1

is_deterministic: True

inputs:

  model_framework:
    type: string
    enum:
      - Huggingface
      - MMLab
      - llava
      - AutoML
    default: Huggingface
    optional: false
    description: Framework from which the model is imported.

  validation_info:
    type: uri_file
    description: Path to the validation info file
    optional: false

# Pipeline outputs
outputs:
  is_mmd_framework:
    type: boolean
    mode: rw_mount
    is_control: true

environment: azureml://registries/azureml/environments/model-evaluation/versions/37
code: ../../src
command: mldesigner execute --source run_model_framework_selector.py --name validate --inputs model_framework='${{inputs.model_framework}}' --outputs output='${{outputs.is_mmd_framework}}'

tags:
  Preview: ""
  Internal: ""
Original file line number Diff line number Diff line change
@@ -193,11 +193,11 @@ def save_as_mlflow(self):
         mlflow_model_wrapper = ImagesDetectionMLflowModelWrapper(task_type=self._task)
         artifacts_dict = self._prepare_artifacts_dict()
         if self._task == MMLabDetectionTasks.MM_OBJECT_DETECTION.value:
-            pip_requirements = os.path.join(self.MODEL_DIR, "mmdet-od-requirements.txt")
+            conda_env_file = os.path.join(self.MODEL_DIR, "conda_od.yaml")
         elif self._task == MMLabDetectionTasks.MM_INSTANCE_SEGMENTATION.value:
-            pip_requirements = os.path.join(self.MODEL_DIR, "mmdet-is-requirements.txt")
+            conda_env_file = os.path.join(self.MODEL_DIR, "conda_is.yaml")
         else:
-            pip_requirements = None
+            conda_env_file = None
         code_path = [
             os.path.join(self.MODEL_DIR, "detection_predict.py"),
             os.path.join(self.MODEL_DIR, "config.py"),
@@ -206,8 +206,8 @@ def save_as_mlflow(self):
         super()._save(
             mlflow_model_wrapper=mlflow_model_wrapper,
             artifacts_dict=artifacts_dict,
-            pip_requirements=pip_requirements,
             code_path=code_path,
+            conda_env=conda_env_file,
         )

     def _prepare_artifacts_dict(self) -> Dict:
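
The branch in `save_as_mlflow` picks one conda environment file per task and falls back to `None`. A minimal sketch of the same selection written as a lookup table, in case that reads more easily as tasks are added. The task strings and the `conda_env_for_task` helper are hypothetical; the real values come from `MMLabDetectionTasks`, which this diff does not show.

```python
import os
from typing import Optional

# Hypothetical task values; the real ones live in MMLabDetectionTasks.
TASK_OBJECT_DETECTION = "object-detection"
TASK_INSTANCE_SEGMENTATION = "instance-segmentation"

# Task -> conda environment file name, mirroring the if/elif chain in the diff.
CONDA_ENV_BY_TASK = {
    TASK_OBJECT_DETECTION: "conda_od.yaml",
    TASK_INSTANCE_SEGMENTATION: "conda_is.yaml",
}


def conda_env_for_task(model_dir: str, task: str) -> Optional[str]:
    """Return the conda env file path for a task, or None for unknown tasks."""
    file_name = CONDA_ENV_BY_TASK.get(task)
    if file_name is None:
        return None
    return os.path.join(model_dir, file_name)
```

Unknown tasks still resolve to `None`, which matches the `else` branch above.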
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
channels:
- conda-forge
dependencies:
- python=3.9.19
- pip<=24.0
- pip:
Review comment (Contributor):

Below is the list of vulnerabilities for just the first package (source: https://pypi.org/pypi/mlflow/2.12.1/json).

  - mlflow==2.12.1

Review comment (Contributor):

"vulnerabilities": [
    {
        "aliases": [
            "CVE-2024-37057"
        ],
        "details": "Deserialization of untrusted data can occur in versions of the MLflow platform running version 2.0.0rc0 or newer, enabling a maliciously uploaded Tensorflow model to run arbitrary code on an end user’s system when interacted with.",
        "fixed_in": [],
        "id": "GHSA-j8mg-pqc5-x9gj",
        "link": "https://osv.dev/vulnerability/GHSA-j8mg-pqc5-x9gj",
        "source": "osv",
        "summary": null,
        "withdrawn": null
    },
    {
        "aliases": [
            "CVE-2024-37052"
        ],
        "details": "Deserialization of untrusted data can occur in versions of the MLflow platform running version 1.1.0 or newer, enabling a maliciously uploaded scikit-learn model to run arbitrary code on an end user’s system when interacted with.",
        "fixed_in": [],
        "id": "GHSA-76cg-cfhx-373f",
        "link": "https://osv.dev/vulnerability/GHSA-76cg-cfhx-373f",
        "source": "osv",
        "summary": null,
        "withdrawn": null
    },
    {
        "aliases": [
            "CVE-2024-37053"
        ],
        "details": "Deserialization of untrusted data can occur in versions of the MLflow platform running version 1.1.0 or newer, enabling a maliciously uploaded scikit-learn model to run arbitrary code on an end user’s system when interacted with.",
        "fixed_in": [],
        "id": "GHSA-43c4-9qgj-x742",
        "link": "https://osv.dev/vulnerability/GHSA-43c4-9qgj-x742",
        "source": "osv",
        "summary": null,
        "withdrawn": null
    },
    {
        "aliases": [
            "CVE-2024-37056"
        ],
        "details": "Deserialization of untrusted data can occur in versions of the MLflow platform running version 1.23.0 or newer, enabling a maliciously uploaded LightGBM scikit-learn model to run arbitrary code on an end user’s system when interacted with.",
        "fixed_in": [],
        "id": "GHSA-7p8j-qv6x-f4g4",
        "link": "https://osv.dev/vulnerability/GHSA-7p8j-qv6x-f4g4",
        "source": "osv",
        "summary": null,
        "withdrawn": null
    },
    {
        "aliases": [
            "CVE-2024-37060"
        ],
        "details": "Deserialization of untrusted data can occur in versions of the MLflow platform running version 1.27.0 or newer, enabling a maliciously crafted Recipe to execute arbitrary code on an end user’s system when run.",
        "fixed_in": [],
        "id": "GHSA-cv6c-7963-wxcg",
        "link": "https://osv.dev/vulnerability/GHSA-cv6c-7963-wxcg",
        "source": "osv",
        "summary": null,
        "withdrawn": null
    },
    {
        "aliases": [
            "CVE-2024-37055"
        ],
        "details": "Deserialization of untrusted data can occur in versions of the MLflow platform running version 1.24.0 or newer, enabling a maliciously uploaded pmdarima model to run arbitrary code on an end user’s system when interacted with.",
        "fixed_in": [],
        "id": "GHSA-x38x-g6gr-jqff",
        "link": "https://osv.dev/vulnerability/GHSA-x38x-g6gr-jqff",
        "source": "osv",
        "summary": null,
        "withdrawn": null
    },
    {
        "aliases": [
            "CVE-2024-37054"
        ],
        "details": "Deserialization of untrusted data can occur in versions of the MLflow platform running version 0.9.0 or newer, enabling a maliciously uploaded PyFunc model to run arbitrary code on an end user’s system when interacted with.",
        "fixed_in": [],
        "id": "GHSA-ghv6-9r9j-wh4j",
        "link": "https://osv.dev/vulnerability/GHSA-ghv6-9r9j-wh4j",
        "source": "osv",
        "summary": null,
        "withdrawn": null
    },
    {
        "aliases": [
            "CVE-2024-37058"
        ],
        "details": "Deserialization of untrusted data can occur in versions of the MLflow platform running version 2.5.0 or newer, enabling a maliciously uploaded Langchain AgentExecutor model to run arbitrary code on an end user’s system when interacted with.",
        "fixed_in": [],
        "id": "GHSA-cwgg-w6mp-w9hg",
        "link": "https://osv.dev/vulnerability/GHSA-cwgg-w6mp-w9hg",
        "source": "osv",
        "summary": null,
        "withdrawn": null
    },
    {
        "aliases": [
            "CVE-2024-37061"
        ],
        "details": "Remote Code Execution can occur in versions of the MLflow platform running version 1.11.0 or newer, enabling a maliciously crafted MLproject to execute arbitrary code on an end user’s system when run due to unfiltered input.",
        "fixed_in": [],
        "id": "GHSA-pqcv-qw2r-r859",
        "link": "https://osv.dev/vulnerability/GHSA-pqcv-qw2r-r859",
        "source": "osv",
        "summary": null,
        "withdrawn": null
    },
    {
        "aliases": [
            "CVE-2024-37059"
        ],
        "details": "Deserialization of untrusted data can occur in versions of the MLflow platform running version 0.5.0 or newer, enabling a maliciously uploaded PyTorch model to run arbitrary code on an end user’s system when interacted with.",
        "fixed_in": [],
        "id": "GHSA-wf7f-8fxf-xfxc",
        "link": "https://osv.dev/vulnerability/GHSA-wf7f-8fxf-xfxc",
        "source": "osv",
        "summary": null,
        "withdrawn": null
    },
    {
        "aliases": [
            "CVE-2024-27134"
        ],
        "details": "Excessive directory permissions in MLflow leads to local privilege escalation when using spark_udf. This behavior can be exploited by a local attacker to gain elevated permissions by using a ToCToU attack. The issue is only relevant when the spark_udf() MLflow API is called.",
        "fixed_in": [
            "2.16.0"
        ],
        "id": "GHSA-qpgc-w4mg-6v92",
        "link": "https://osv.dev/vulnerability/GHSA-qpgc-w4mg-6v92",
        "source": "osv",
        "summary": null,
        "withdrawn": null
    }
]

}
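
The list quoted in this comment is the `vulnerabilities` array from the PyPI JSON response for the pinned release. A minimal sketch of summarising such a list offline, assuming entries shaped like the ones above. `summarize_vulnerabilities` and `unfixed` are hypothetical helpers, and `SAMPLE` is a two-entry excerpt of the comment reduced to the fields the helpers read.

```python
from typing import Dict, List


def summarize_vulnerabilities(vulns: List[dict]) -> Dict[str, List[str]]:
    """Map each advisory (first alias, else its id) to its fixed versions."""
    summary: Dict[str, List[str]] = {}
    for entry in vulns:
        # Skip advisories that have been withdrawn.
        if entry.get("withdrawn"):
            continue
        aliases = entry.get("aliases") or []
        key = aliases[0] if aliases else entry["id"]
        summary[key] = list(entry.get("fixed_in") or [])
    return summary


def unfixed(summary: Dict[str, List[str]]) -> List[str]:
    """Return the advisories that list no fixed version, sorted by name."""
    return sorted(name for name, fixed in summary.items() if not fixed)


# Excerpt of the comment above: one advisory without a fix, one with.
SAMPLE = [
    {"aliases": ["CVE-2024-37057"], "fixed_in": [],
     "id": "GHSA-j8mg-pqc5-x9gj", "withdrawn": None},
    {"aliases": ["CVE-2024-27134"], "fixed_in": ["2.16.0"],
     "id": "GHSA-qpgc-w4mg-6v92", "withdrawn": None},
]

summary = summarize_vulnerabilities(SAMPLE)
```

In the full list quoted above, only CVE-2024-27134 carries a `fixed_in` entry (2.16.0); the other advisories list none.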

  - cloudpickle==2.2.1
  - datasets==2.15.0
  - openmim==0.3.9
  - torch==2.0.1
  - torchvision==0.15.2
  - transformers==4.38.2
  - accelerate==0.27.2
  - albumentations==1.3.0
  - scikit-image==0.19.3
  - simplification==0.7.10
  - fairscale==0.4.13
name: mlflow-env
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
channels:
- conda-forge
dependencies:
- python=3.9.19
- pip<=24.0
- pip:
  - mlflow==2.12.1
  - cloudpickle==2.2.1
  - datasets==2.15.0
  - openmim==0.3.9
  - torch==2.0.1
  - torchvision==0.15.2
  - transformers==4.38.2
  - accelerate==0.27.2
  - albumentations==1.3.0
  - fairscale==0.4.13
name: mlflow-env
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

"""Select Model Framework Component."""
from azureml.model.mgmt.config import ModelFramework
from mldesigner import Input, Output, command_component
from azureml.model.mgmt.utils.logging_utils import get_logger
from azureml.model.mgmt.utils.exceptions import swallow_all_exceptions

logger = get_logger(__name__)


@command_component
@swallow_all_exceptions(logger)
def validate(
    model_framework: Input(type="string", optional=False)  # noqa: F821
) -> Output(type="boolean", is_control=True):  # noqa: F821
    """Entry function of model framework selector script."""
    if model_framework == ModelFramework.MMLAB.value:
        result = True
    else:
        result = False

    logger.info(f"Model framework: {model_framework}, result: {result}")

    return result
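
The selector reduces to a single equality check against the MMLab framework value. A dependency-free sketch of that check, useful for unit testing without `mldesigner`. The `ModelFramework` class here is a stand-in whose values are assumed from the `enum` list in the component spec; the real class is `azureml.model.mgmt.config.ModelFramework`, which this diff does not show. `is_mmd_framework` is a hypothetical helper name.

```python
from enum import Enum


class ModelFramework(Enum):
    """Stand-in for the real enum; values assumed from the spec's enum list."""

    HUGGINGFACE = "Huggingface"
    MMLAB = "MMLab"
    LLAVA = "llava"
    AUTOML = "AutoML"


def is_mmd_framework(model_framework: str) -> bool:
    """Return True only when the framework string is exactly the MMLab value."""
    return model_framework == ModelFramework.MMLAB.value
```

The comparison is case-sensitive, so a value such as `mmlab` would take the false branch of the `if_else` node.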
Original file line number Diff line number Diff line change
@@ -73,7 +73,8 @@ def _get_parser():
         required=False,
         help="Model download details",
     )
-    parser.add_argument("--model-path", type=Path, required=True, help="Model input path")
+    parser.add_argument("--model-path", type=Path, required=False, help="Model input path")
+    parser.add_argument("--model-path-mmd", type=Path, required=False, help="MMD Model input path")
     parser.add_argument("--license-file-path", type=Path, required=False, help="License file path")
     parser.add_argument(
         "--mlflow-model-output-dir",
@@ -107,7 +108,11 @@ def run():
     inference_base_image = args.inference_base_image

     model_download_metadata_path = args.model_download_metadata
+
     model_path = args.model_path
+    if model_framework == ModelFramework.MMLAB.value:
+        model_path = args.model_path_mmd
+
     mlflow_model_output_dir = args.mlflow_model_output_dir
     license_file_path = args.license_file_path
     TRUST_CODE_KEY = "trust_remote_code=True"
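
With both `--model-path` and `--model-path-mmd` now optional, the selected path can end up as `None` when the matching download branch did not run. A minimal sketch of resolving the path with an explicit check, assuming the MMLab framework value is the string `"MMLab"` and that the framework arrives as a `--model-framework` argument. Both are assumptions, and `build_parser` and `resolve_model_path` are hypothetical helpers, not part of this PR.

```python
import argparse
from pathlib import Path
from typing import List, Optional

MMLAB = "MMLab"  # assumed value of ModelFramework.MMLAB.value


def build_parser() -> argparse.ArgumentParser:
    """Build a reduced parser with only the arguments relevant to this sketch."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--model-framework", type=str, default="Huggingface")
    parser.add_argument("--model-path", type=Path, required=False)
    parser.add_argument("--model-path-mmd", type=Path, required=False)
    return parser


def resolve_model_path(argv: Optional[List[str]] = None) -> Path:
    """Pick the path for the active framework and fail early if it is missing."""
    args = build_parser().parse_args(argv)
    if args.model_framework == MMLAB:
        model_path = args.model_path_mmd
    else:
        model_path = args.model_path
    if model_path is None:
        raise ValueError(
            f"No model path supplied for framework {args.model_framework!r}"
        )
    return model_path
```

Raising early keeps a missing input from surfacing later as a less obvious error during conversion.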