feat(ci): add pytest hooks for updating golden files #12581

Merged 7 commits into master on Feb 12, 2025
Conversation

hsheth2 (Collaborator) commented on Feb 7, 2025

Now we can avoid passing `pytestconfig` all over the place. There's some additional refactoring/cleanup that I haven't done here yet.

To use this in other modules, add the following snippet to conftest.py:

from datahub.testing.pytest_hooks import (  # noqa: F401,E402
    load_golden_flags,
    pytest_addoption,
)
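
As an illustration of the pattern this enables, here is a minimal sketch using hypothetical names (`UPDATE_GOLDEN` and `assert_matches_golden` are not part of the datahub API): a module-level flag set once by the hooks replaces threading `pytestconfig` into every golden-file helper.

import json
import pathlib

# Hypothetical flag; a pytest_addoption/load_golden_flags-style hook would flip
# it when an --update-golden-files-style option is passed on the command line.
UPDATE_GOLDEN = False

def assert_matches_golden(output: dict, golden_path: pathlib.Path) -> None:
    """Compare output against a golden file; rewrite the file when updates are enabled."""
    if UPDATE_GOLDEN or not golden_path.exists():
        golden_path.write_text(json.dumps(output, indent=2, sort_keys=True))
        return
    assert json.loads(golden_path.read_text()) == output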

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added, a usage guide has been added for it.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

This will allow us to avoid passing `pytestconfig` all over the place.
github-actions bot added the ingestion (PR or Issue related to the ingestion of metadata) and devops (PR or Issue related to DataHub backend & deployment) labels on Feb 7, 2025
codecov bot commented on Feb 7, 2025

❌ 2 Tests Failed:

Tests completed | Failed | Passed | Skipped
3180            | 2      | 3178   | 121
View the full list of 2 ❄️ flaky tests
tests.entity_versioning.test_versioning::test_link_unlink_three_versions_unlink_middle_and_latest

Flake rate in main: 28.12% (Passed 23 times, Failed 9 times)

Stack Traces | 8.98s run time
graph_client = DataHubGraph: configured to talk to http://localhost:8080 with token: eyJh**********u_kU

    @pytest.fixture(scope="function", autouse=True)
    def ingest_cleanup_data(graph_client: DataHubGraph):
        try:
            for urn in ENTITY_URN_OBJS:
                graph_client.emit_mcp(
                    MetadataChangeProposalWrapper(
                        entityUrn=urn.urn(),
                        aspect=DatasetKeyClass(
                            platform=urn.platform, name=urn.name, origin=urn.env
                        ),
                    )
                )
            for i in [2, 1, 0, 0, 1, 2]:
>               graph_client.unlink_asset_from_version_set(ENTITY_URNS[i])

tests/entity_versioning/test_versioning.py:31: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
...../ingestion/graph/entity_versioning.py:164: in unlink_asset_from_version_set
    response = self.execute_graphql(self.UNLINK_VERSION_MUTATION, variables)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = DataHubGraph: configured to talk to http://localhost:8080 with token: eyJh**********u_kU
query = '\n        mutation($input: UnlinkVersionInput!) {\n            unlinkAssetVersion(input: $input) {\n                urn\n            }\n        }\n    '
variables = {'input': {'unlinkedEntity': 'urn:li:dataset:(urn:li:dataPlatform:snowflake,versioning_0,PROD)', 'versionSet': 'urn:li:versionSet:(12345678910,dataset)'}}
operation_name = None, format_exception = True

    def execute_graphql(
        self,
        query: str,
        variables: Optional[Dict] = None,
        operation_name: Optional[str] = None,
        format_exception: bool = True,
    ) -> Dict:
        url = f"{self.config.server}/api/graphql"
    
        body: Dict = {
            "query": query,
        }
        if variables:
            body["variables"] = variables
        if operation_name:
            body["operationName"] = operation_name
    
        logger.debug(
            f"Executing {operation_name or ''} graphql query: {query} with variables: {json.dumps(variables)}"
        )
        result = self._post_generic(url, body)
        if result.get("errors"):
            if format_exception:
>               raise GraphError(f"Error executing graphql query: {result['errors']}")
E               datahub.configuration.common.GraphError: Error executing graphql query: [{'message': 'An unknown error occurred.', 'locations': [{'line': 3, 'column': 13}], 'path': ['unlinkAssetVersion'], 'extensions': {'code': 500, 'type': 'SERVER_ERROR', 'classification': 'DataFetchingException'}}]

...../ingestion/graph/client.py:1188: GraphError
tests.entity_versioning.test_versioning::test_link_unlink_three_versions_unlink_and_relink

Flake rate in main: 28.12% (Passed 23 times, Failed 9 times)

Stack Traces | 30.5s run time
graph_client = DataHubGraph: configured to talk to http://localhost:8080 with token: eyJh**********u_kU

    @pytest.fixture(scope="function", autouse=True)
    def ingest_cleanup_data(graph_client: DataHubGraph):
        try:
            for urn in ENTITY_URN_OBJS:
                graph_client.emit_mcp(
                    MetadataChangeProposalWrapper(
                        entityUrn=urn.urn(),
                        aspect=DatasetKeyClass(
                            platform=urn.platform, name=urn.name, origin=urn.env
                        ),
                    )
                )
            for i in [2, 1, 0, 0, 1, 2]:
                graph_client.unlink_asset_from_version_set(ENTITY_URNS[i])
            yield
        finally:
            for i in [2, 1, 0, 0, 1, 2]:
>               graph_client.unlink_asset_from_version_set(ENTITY_URNS[i])

tests/entity_versioning/test_versioning.py:35: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
...../ingestion/graph/entity_versioning.py:164: in unlink_asset_from_version_set
    response = self.execute_graphql(self.UNLINK_VERSION_MUTATION, variables)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = DataHubGraph: configured to talk to http://localhost:8080 with token: eyJh**********u_kU
query = '\n        mutation($input: UnlinkVersionInput!) {\n            unlinkAssetVersion(input: $input) {\n                urn\n            }\n        }\n    '
variables = {'input': {'unlinkedEntity': 'urn:li:dataset:(urn:li:dataPlatform:snowflake,versioning_1,PROD)', 'versionSet': 'urn:li:versionSet:(12345678910,dataset)'}}
operation_name = None, format_exception = True

    def execute_graphql(
        self,
        query: str,
        variables: Optional[Dict] = None,
        operation_name: Optional[str] = None,
        format_exception: bool = True,
    ) -> Dict:
        url = f"{self.config.server}/api/graphql"
    
        body: Dict = {
            "query": query,
        }
        if variables:
            body["variables"] = variables
        if operation_name:
            body["operationName"] = operation_name
    
        logger.debug(
            f"Executing {operation_name or ''} graphql query: {query} with variables: {json.dumps(variables)}"
        )
        result = self._post_generic(url, body)
        if result.get("errors"):
            if format_exception:
>               raise GraphError(f"Error executing graphql query: {result['errors']}")
E               datahub.configuration.common.GraphError: Error executing graphql query: [{'message': 'An unknown error occurred.', 'locations': [{'line': 3, 'column': 13}], 'path': ['unlinkAssetVersion'], 'extensions': {'code': 500, 'type': 'SERVER_ERROR', 'classification': 'DataFetchingException'}}]

...../ingestion/graph/client.py:1188: GraphError


datahub-cyborg bot added the needs-review label (PRs that need review from a maintainer) on Feb 7, 2025
@@ -1,15 +1,14 @@
import pathlib
import site

from datahub.testing.pytest_hooks import ( # noqa: F401,E402
Contributor

Is ignoring E402 required?
https://docs.astral.sh/ruff/rules/module-import-not-at-top-of-file/

Here specifically, and everywhere this code snippet is being used.

hsheth2 (Collaborator, Author) replied:
yup - will fix in a follow up PR
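
One possible shape for that follow-up, sketched here under the assumption that the re-export can sit at the very top of conftest.py: E402 (module-import-not-at-top-of-file) only fires when non-import statements precede the import, so moving the re-export above any setup code leaves just the unused-import suppression.

from datahub.testing.pytest_hooks import (  # noqa: F401
    load_golden_flags,
    pytest_addoption,
)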

sgomezvillamor (Contributor) left a comment:

LGTM

datahub-cyborg bot added the pending-submitter-merge label and removed the needs-review label on Feb 12, 2025
hsheth2 merged commit 7472c53 into master on Feb 12, 2025
222 of 223 checks passed
hsheth2 deleted the pytest-golden-hook branch on February 12, 2025 at 23:32
ttekampe pushed a commit to ttekampe/datahub that referenced this pull request Feb 14, 2025
ksrinath pushed a commit to ksrinath/datahub that referenced this pull request Feb 14, 2025
Labels: devops (PR or Issue related to DataHub backend & deployment), ingestion (PR or Issue related to the ingestion of metadata), pending-submitter-merge
Participants: 2