feat(schemaField): populate schemaFields with side effects #10928
Conversation
Walkthrough

The update introduces a new configuration class for generating schema fields from schema metadata, modifications to batch processing, and side-effect handling for schema fields. Existing methods and classes are significantly refactored to extend their functionality, including the map-merging logic and the aspect-filtering capabilities. Several test classes are added to verify these new implementations and changes.
Actionable comments posted: 7
Outside diff range, codebase verification and nitpick comments (5)
metadata-service/services/src/main/java/com/linkedin/metadata/entity/restoreindices/RestoreIndicesArgs.java (1)

25-25: Addition of `aspectNames` field

The addition of the `aspectNames` field initialized with `Collections.emptyList()` is a safe choice because it avoids mutable-state issues. However, if there is a scenario where `aspectNames` needs to be modified, consider initializing it with a mutable list rather than the immutable empty list to avoid an `UnsupportedOperationException`:

```diff
- public List<String> aspectNames = Collections.emptyList();
+ public List<String> aspectNames = new ArrayList<>();
```
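As a self-contained illustration of the trade-off described above (not DataHub code), the sketch below shows why mutating the `Collections.emptyList()` default fails at runtime and how the `ArrayList` initialization avoids it; the class name is hypothetical:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class AspectNamesExample {
  // Immutable default: safe to read, but any add() throws UnsupportedOperationException.
  static List<String> immutableDefault = Collections.emptyList();

  // Mutable default: callers may append aspect names later.
  static List<String> mutableDefault = new ArrayList<>();

  public static void main(String[] args) {
    mutableDefault.add("schemaMetadata"); // works

    try {
      immutableDefault.add("schemaMetadata");
    } catch (UnsupportedOperationException e) {
      System.out.println("Cannot mutate Collections.emptyList(): " + e);
    }
  }
}
```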
metadata-service/factories/src/main/java/com/linkedin/gms/factory/context/SystemOperationContextFactory.java (2)

43-45: Update method signatures to include the new parameter

The invocations of `javaSystemOperationContext` in both `SystemOperationContextFactory.java` and `SystemUpdateConfig.java` have not been updated to include the new `BaseElasticSearchComponents` parameter. This needs to be corrected to ensure proper dependency injection.

- Update the method signature in `SystemOperationContextFactory.java` to include the `BaseElasticSearchComponents` parameter.
- Update the method signature in `SystemUpdateConfig.java` to include the `BaseElasticSearchComponents` parameter.

Analysis chain

Parameter addition review in `javaSystemOperationContext`

The addition of the `BaseElasticSearchComponents` parameter with a `@Qualifier` annotation is a good practice for clarifying dependency-injection requirements. This change likely improves the modularity and configurability of the operation-context creation. Ensure that all callers of this method are updated to pass the required component.

Scripts executed

The following scripts were executed for the analysis:

```shell
#!/bin/bash
# Ensure that all callers of `javaSystemOperationContext` are updated.
rg --type java -A 5 "javaSystemOperationContext"
```

Length of output: 2007

```shell
#!/bin/bash
# Check for the updated invocation of javaSystemOperationContext in SystemOperationContextFactory.java
rg --type java -A 5 "javaSystemOperationContext" metadata-service/factories/src/main/java/com/linkedin/gms/factory/context/SystemOperationContextFactory.java

# Check for the updated invocation of javaSystemOperationContext in SystemUpdateConfig.java
rg --type java -A 5 "javaSystemOperationContext" datahub-upgrade/src/main/java/com/linkedin/datahub/upgrade/config/SystemUpdateConfig.java
```

Length of output: 1058
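As a hedged sketch of the qualifier-based wiring being discussed: the configuration class, the stub `BaseElasticSearchComponents` and `OperationContext` types, and the `"baseElasticSearchComponents"` qualifier string below are illustrative assumptions, not code taken from the PR.

```java
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;

@Configuration
public class ExampleOperationContextConfig {

  // Stub types standing in for the real DataHub classes mentioned in the review.
  public static class BaseElasticSearchComponents {}

  public static class OperationContext {
    private final BaseElasticSearchComponents components;

    public OperationContext(BaseElasticSearchComponents components) {
      this.components = components;
    }
  }

  // Adding a new @Qualifier-annotated parameter means every explicit caller of this
  // factory method (for example, other @Configuration classes) must be updated to pass it.
  @Bean
  @Primary
  public OperationContext javaSystemOperationContext(
      @Qualifier("baseElasticSearchComponents") BaseElasticSearchComponents components) {
    return new OperationContext(components);
  }
}
```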
91-93: Potential configuration update issue

The addition of the `BaseElasticSearchComponents` parameter in `restliSystemOperationContext` appears to be isolated to the `SystemOperationContextFactory.java` file. There are no corresponding updates in configuration files or service initializations.

- File: metadata-service/factories/src/main/java/com/linkedin/gms/factory/context/SystemOperationContextFactory.java
- Lines: 91-93

Please ensure that all necessary configuration files and service initializations are updated to reflect this change.

Analysis chain

Parameter addition review in `restliSystemOperationContext`

Similar to the previous method, the addition of the `BaseElasticSearchComponents` parameter enhances the flexibility of the configuration. It's important to verify that the configuration files and service initializations are updated accordingly.

Scripts executed

The following scripts were executed for the analysis:

```shell
#!/bin/bash
# Check for updates in configuration files related to `restliSystemOperationContext`.
rg --type properties "restliSystemOperationContext"
```

Length of output: 86

```shell
#!/bin/bash
# Search for updates in configuration files related to `restliSystemOperationContext` specifically within .properties files.
rg "restliSystemOperationContext" --glob "*.properties"
```

Length of output: 55

```shell
#!/bin/bash
# Search for references to `restliSystemOperationContext` across all file types to ensure comprehensive coverage.
rg "restliSystemOperationContext"
```

Length of output: 201
metadata-io/src/main/java/com/linkedin/metadata/schemafields/sideeffects/SchemaFieldSideEffect.java (2)

343-396: Improve error handling in the `fetchRequiredAspects` method.

The `fetchRequiredAspects` method lacks explicit error-handling mechanisms for scenarios where data might be missing or where aspect retrieval fails. Consider adding try-catch blocks, logging, or conditional checks to handle these cases gracefully.

- File: metadata-io/src/main/java/com/linkedin/metadata/schemafields/sideeffects/SchemaFieldSideEffect.java
- Lines: 343-396

Analysis chain

Review the aspect-fetching logic in `fetchRequiredAspects`.

This utility method fetches the necessary aspects from the batch items, ensuring that all required data is available for processing. The method is well implemented, with clear separation of concerns and effective use of Java streams. However, ensure that the error handling for missing data is adequate.

Scripts executed

The following scripts were executed for the analysis:

```shell
#!/bin/bash
# Description: Verify error handling for missing data in fetchRequiredAspects method.
# Test: Search for error handling patterns in this method. Expect: Adequate error handling for missing data scenarios.
rg --type java -A 5 $'fetchRequiredAspects'
```

Length of output: 2646

```shell
#!/bin/bash
# Description: Verify error handling for missing data in fetchRequiredAspects method.
# Test: Search for error handling patterns in fetchRequiredAspects method.
rg --type java -A 50 'private static Map<Urn, Map<String, Aspect>> fetchRequiredAspects' metadata-io/src/main/java/com/linkedin/metadata/schemafields/sideeffects/SchemaFieldSideEffect.java
```

Length of output: 2206
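A generic, hedged sketch of the defensive handling this comment asks for when required aspects may be absent; the `AspectLookup` interface and string-keyed maps are simplified stand-ins, not DataHub's actual retriever API:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class AspectFetchExample {
  /** Stand-in for an aspect lookup that may return nothing for a given (urn, aspectName). */
  interface AspectLookup {
    Optional<String> fetch(String urn, String aspectName);
  }

  static Map<String, Map<String, String>> fetchRequiredAspects(
      List<String> urns, List<String> requiredAspects, AspectLookup lookup) {
    Map<String, Map<String, String>> result = new HashMap<>();
    for (String urn : urns) {
      Map<String, String> aspects = new HashMap<>();
      for (String aspectName : requiredAspects) {
        // Guard against missing data instead of assuming it is always present.
        lookup.fetch(urn, aspectName)
            .ifPresentOrElse(
                value -> aspects.put(aspectName, value),
                () -> System.err.println("Missing aspect " + aspectName + " for " + urn));
      }
      result.put(urn, aspects);
    }
    return result;
  }
}
```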
289-319: Review the deletion MCP generation logic in `buildSchemaMetadataSchemaFieldDeleteMCPs`.

The `buildSchemaMetadataSchemaFieldDeleteMCPs` method lacks test coverage for edge cases such as empty fields or missing aspect specs. It is recommended to add specific tests to ensure these scenarios are handled correctly.

- Add tests covering edge cases for the `buildSchemaMetadataSchemaFieldDeleteMCPs` method.

Analysis chain

Review the deletion MCP generation logic in `buildSchemaMetadataSchemaFieldDeleteMCPs`.

This method handles the generation of delete MCPs for schema fields when a dataset's schema metadata is deleted. The use of streams and mappings is efficient, but ensure that all edge cases, such as empty fields or missing aspect specs, are handled correctly.

Scripts executed

The following scripts were executed for the analysis:

```shell
#!/bin/bash
# Description: Verify handling of edge cases in deletion MCP generation.
# Test: Search for test cases covering edge cases in this method. Expect: Comprehensive test coverage.
rg --type java -A 5 $'buildSchemaMetadataSchemaFieldDeleteMCPs'
```

Length of output: 1801

```shell
#!/bin/bash
# Description: Search for test cases covering edge cases in the buildSchemaMetadataSchemaFieldDeleteMCPs method.
# Find all Java test files that reference the buildSchemaMetadataSchemaFieldDeleteMCPs method.
rg --type java 'buildSchemaMetadataSchemaFieldDeleteMCPs' -l | grep -i 'test'
```

Length of output: 75

```shell
#!/bin/bash
# Description: Search for any references to the buildSchemaMetadataSchemaFieldDeleteMCPs method across all Java files.
# Find all Java files that reference the buildSchemaMetadataSchemaFieldDeleteMCPs method.
rg --type java 'buildSchemaMetadataSchemaFieldDeleteMCPs' -l
```

Length of output: 160
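A hedged outline of the kind of edge-case test being recommended (an empty field list). The method under test here is a stand-in, not the real `buildSchemaMetadataSchemaFieldDeleteMCPs`, and the test class name is hypothetical:

```java
import static org.testng.Assert.assertTrue;

import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import org.testng.annotations.Test;

public class DeleteMcpEdgeCaseTest {

  // Stand-in for the production method: one delete proposal per schema-field path.
  static List<String> buildDeleteProposals(List<String> fieldPaths) {
    return fieldPaths.stream().map(path -> "DELETE " + path).collect(Collectors.toList());
  }

  @Test
  public void emptyFieldListProducesNoDeletes() {
    // Edge case flagged above: schema metadata with no fields should emit no delete MCPs.
    assertTrue(buildDeleteProposals(Collections.emptyList()).isEmpty());
  }
}
```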
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (12)
- datahub-upgrade/src/main/java/com/linkedin/datahub/upgrade/config/SystemUpdateConfig.java (3 hunks)
- datahub-upgrade/src/main/java/com/linkedin/datahub/upgrade/system/schemafield/GenerateSchemaFieldsFromSchemaMetadata.java (1 hunks)
- datahub-upgrade/src/main/java/com/linkedin/datahub/upgrade/system/schemafield/GenerateSchemaFieldsFromSchemaMetadataStep.java (1 hunks)
- datahub-upgrade/src/test/java/com/linkedin/datahub/upgrade/schemafield/GenerateSchemaFieldsFromSchemaMetadataStepTest.java (1 hunks)
- entity-registry/src/main/java/com/linkedin/metadata/aspect/batch/AspectsBatch.java (2 hunks)
- li-utils/src/main/java/com/linkedin/metadata/Constants.java (1 hunks)
- metadata-io/src/main/java/com/linkedin/metadata/entity/ebean/EbeanAspectDao.java (1 hunks)
- metadata-io/src/main/java/com/linkedin/metadata/schemafields/sideeffects/SchemaFieldSideEffect.java (1 hunks)
- metadata-io/src/test/java/com/linkedin/metadata/schemafields/sideeffects/SchemaFieldSideEffectTest.java (1 hunks)
- metadata-models/src/main/resources/entity-registry.yml (2 hunks)
- metadata-service/factories/src/main/java/com/linkedin/gms/factory/context/SystemOperationContextFactory.java (4 hunks)
- metadata-service/services/src/main/java/com/linkedin/metadata/entity/restoreindices/RestoreIndicesArgs.java (2 hunks)
Additional comments not posted (25)
datahub-upgrade/src/main/java/com/linkedin/datahub/upgrade/system/schemafield/GenerateSchemaFieldsFromSchemaMetadata.java (2)

13-17: Class documentation and logging setup

The class documentation is clear and concise, explaining the purpose of the class. The use of the `@Slf4j` annotation for logging is appropriate here.

39-47: Method implementation review for `steps()`

The `steps()` method correctly returns the `_steps` list initialized in the constructor. This method is straightforward and adheres to the expected interface contract.

datahub-upgrade/src/main/java/com/linkedin/datahub/upgrade/config/SystemUpdateConfig.java (2)
138-178: Review of the `javaSystemOperationContext` method.

The method `javaSystemOperationContext` is well structured and uses dependency injection to configure the system operation context. It integrates various services and configurations, which is crucial for the system's operation. The use of the `@Primary` and `@Nonnull` annotations enhances the method's robustness by ensuring that the beans are correctly prioritized and that null values are handled properly. However, ensure that all services and components used in this method are properly initialized and available in the application context to prevent runtime issues.

Line range hint 3-31: Review of new imports and configurations.

The imports added are extensive and cover a wide range of functionality, including authentication, search, and entity services. Ensure that these imports are necessary and used within the file to avoid unused imports that clutter the codebase.
metadata-io/src/main/java/com/linkedin/metadata/schemafields/sideeffects/SchemaFieldSideEffect.java (7)
1-8: Ensure proper package and import organization.

The package declaration and imports are appropriately organized and follow Java conventions. The static imports for constants are used effectively to improve code readability.

44-47: Class annotations and logging setup.

The use of the `@Slf4j`, `@Getter`, `@Setter`, and `@Accessors(chain = true)` annotations is appropriate for this class, providing logging capabilities and reducing getter/setter boilerplate. This setup enhances code readability and maintainability.

51-52: Validate the set of required aspects.

The static initialization of `REQUIRED_ASPECTS` with `SCHEMA_METADATA_ASPECT_NAME` and `STATUS_ASPECT_NAME` is crucial for the operations of this class. It ensures that the necessary data is always fetched before processing, which is a good practice to prevent runtime errors due to missing data.
54-98: Review the side-effect application logic in `applyMCPSideEffect`.

This method orchestrates the side-effect processing by filtering and mapping changes to the appropriate handlers. The use of Java streams for filtering and flatMapping is efficient and clean. However, ensure that the complex logic within the flatMap operations is well tested to handle all edge cases.

Verification successful

The `applyMCPSideEffect` method is well tested. The test cases in `SchemaFieldSideEffectTest.java` comprehensively cover the conditional paths and edge cases for the `applyMCPSideEffect` method, ensuring the filtering and flatMapping logic is well verified:

- `schemaMetadataToSchemaFieldKeyTest`
- `statusToSchemaFieldStatusTest`

Scripts executed

The following scripts were executed for the analysis:

```shell
#!/bin/bash
# Description: Verify the test coverage for applyMCPSideEffect method.
# Test: Search for the test cases covering this method. Expect: At least one test case per conditional path.
rg --type java -A 5 $'applyMCPSideEffect'
```

Length of output: 10658

```shell
#!/bin/bash
# Extract the test cases from SchemaFieldSideEffectTest.java to verify the coverage of applyMCPSideEffect method.

# Find the file path of SchemaFieldSideEffectTest.java
test_file_path=$(fd SchemaFieldSideEffectTest.java)

# Extract the test cases from the identified test file
rg --type java -A 10 'applyMCPSideEffect' $test_file_path
```

Length of output: 1387

```shell
#!/bin/bash
# Extract the complete test methods from SchemaFieldSideEffectTest.java to verify the coverage of applyMCPSideEffect method.

# Find the file path of SchemaFieldSideEffectTest.java
test_file_path=$(fd SchemaFieldSideEffectTest.java)

# Extract the complete test methods from the identified test file
rg --type java -A 50 'public void' $test_file_path
```

Length of output: 9993
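A self-contained illustration of the filter-then-flatMap pattern the review describes for `applyMCPSideEffect`; the `ChangeItem` record and string "proposals" are simplified stand-ins for the real batch-item and MCP types (Java 17-style syntax assumed):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class SideEffectPatternExample {
  record ChangeItem(String aspectName, String changeType, List<String> fieldPaths) {}

  // For every schemaMetadata upsert, emit one derived key proposal per field.
  static Stream<String> applySideEffect(Stream<ChangeItem> items) {
    return items
        .filter(item -> "schemaMetadata".equals(item.aspectName()))
        .filter(item -> "UPSERT".equals(item.changeType()))
        .flatMap(item -> item.fieldPaths().stream().map(path -> "schemaFieldKey:" + path));
  }

  public static void main(String[] args) {
    List<String> out =
        applySideEffect(
                Stream.of(new ChangeItem("schemaMetadata", "UPSERT", List.of("id", "name"))))
            .collect(Collectors.toList());
    System.out.println(out); // [schemaFieldKey:id, schemaFieldKey:name]
  }
}
```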
398-402: Validate the utility method `getSchemaFieldUrn`.

This method constructs a URN for schema fields based on a parent URN and a field. The implementation is straightforward and uses well-defined constants. Ensure that this method is covered by unit tests to verify the correctness of the URN format.

Verification successful

The utility method `getSchemaFieldUrn` is covered by unit tests. The URN format generated by `getSchemaFieldUrn` is indirectly validated in the test cases through the `PlatformEventMatcher`, which compares the `entityUrn` field of the `EntityChangeEvent` objects.

- `EntityChangeEventGeneratorHookTest.java` uses `PlatformEventMatcher` to verify the URN format.

Scripts executed

The following scripts were executed for the analysis:

```shell
#!/bin/bash
# Description: Verify the test coverage for getSchemaFieldUrn method.
# Test: Search for the test cases covering this method. Expect: At least one test case verifying URN format.
rg --type java -A 5 $'getSchemaFieldUrn'
```

Length of output: 27087

```shell
#!/bin/bash
# Description: Verify that the test cases using getSchemaFieldUrn are checking the URN format.
# Test: Search for assertions in the test files that call getSchemaFieldUrn.
rg --type java -A 10 'getSchemaFieldUrn' | grep -E 'assert|verify'
```

Length of output: 1351

```shell
#!/bin/bash
# Description: Verify that the verifyProducePlatformEvent method checks the URN format.
# Step 1: Find the implementation of verifyProducePlatformEvent.
rg --type java 'void verifyProducePlatformEvent' -A 10
```

Length of output: 2634

```shell
#!/bin/bash
# Description: Verify that the PlatformEventMatcher checks the URN format.
# Step 1: Find the implementation of PlatformEventMatcher.
rg --type java 'class PlatformEventMatcher' -A 20
```

Length of output: 3053

```shell
#!/bin/bash
# Description: Verify that the payloadMatches method in PlatformEventMatcher checks the URN format.
# Step 1: Find the complete implementation of payloadMatches method.
rg --type java 'public boolean payloadMatches' -A 20
```

Length of output: 3660
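A hedged sketch of the URN shape under discussion. The tuple encoding below is an assumption based on the `urn:li:` formats quoted elsewhere in this review; the real `getSchemaFieldUrn` may encode field paths differently:

```java
public class SchemaFieldUrnExample {
  // Assumed shape of a schemaField URN: a tuple of the parent entity URN and the field path.
  static String getSchemaFieldUrn(String parentUrn, String fieldPath) {
    return String.format("urn:li:schemaField:(%s,%s)", parentUrn, fieldPath);
  }

  public static void main(String[] args) {
    String datasetUrn = "urn:li:dataset:(urn:li:dataPlatform:hive,example.table,PROD)";
    System.out.println(getSchemaFieldUrn(datasetUrn, "user_id"));
    // urn:li:schemaField:(urn:li:dataset:(urn:li:dataPlatform:hive,example.table,PROD),user_id)
  }
}
```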
100-158: Review the post-processing of MCP side effects in `postMCPSideEffect`.

This method handles the deletion scenarios for the schema metadata and status aspects. It is well structured and uses stream operations effectively. However, the complexity within the nested flatMap operations requires thorough testing to ensure that deletions are handled correctly without unintended side effects.
260-287: Review the method `buildSchemaFieldKeyMCPs`.

This overloaded method directly handles the creation of schema field key MCPs. The method is concise and uses the Java Stream API effectively. However, ensure that the `getFields()` call on `schemaMetadata` is guaranteed to be non-null to avoid a potential `NullPointerException`.
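A minimal sketch of the kind of null guard suggested above. `SchemaLike` is a stand-in for `SchemaMetadata`, used only to show the defensive pattern:

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;

public class FieldsNullGuardExample {

  /** Stand-in for SchemaMetadata; getFields() may return null in this sketch. */
  static class SchemaLike {
    private final List<String> fields;

    SchemaLike(List<String> fields) {
      this.fields = fields;
    }

    List<String> getFields() {
      return fields;
    }
  }

  // Defensive access: fall back to an empty list rather than risk a NullPointerException
  // when streaming over the fields.
  static List<String> safeFields(SchemaLike schema) {
    return Optional.ofNullable(schema.getFields()).orElse(Collections.emptyList());
  }

  public static void main(String[] args) {
    System.out.println(safeFields(new SchemaLike(null)).size()); // 0, no NPE
  }
}
```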
metadata-io/src/main/java/com/linkedin/metadata/entity/ebean/EbeanAspectDao.java (1)

515-517: Enhanced aspect-filtering logic added.

The new conditional logic introduced at lines 515-517 allows filtering based on multiple aspect names. This is a significant improvement because it increases the flexibility and capability of the `streamAspectBatches` method.

datahub-upgrade/src/main/java/com/linkedin/datahub/upgrade/system/schemafield/GenerateSchemaFieldsFromSchemaMetadataStep.java (6)
32-52: Documentation clarity and potential future work noted.

The class documentation is clear and outlines the responsibilities and operations of the class. It also mentions a future upgrade for generating additional aspects such as tags and documentation, which is good for future planning and maintainability.

64-78: Constructor initialization and logging.

The constructor properly initializes the class fields and logs the initialization. This is good practice for debugging and for verifying that the component starts as expected.

91-103: Environment-based skipping logic.

The `skip` method uses an environment variable to determine whether the upgrade step should be skipped. This is a flexible way to control execution across deployment environments. Ensure that the environment variable is documented and managed properly in deployment configurations.
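A hedged sketch of the environment-based skip pattern described above. The flag name is taken from the PR description further down; whether the real `skip` method reads exactly this variable is an assumption:

```java
public class UpgradeSkipExample {
  // Flag name from the PR description; treat it as illustrative here.
  private static final String ENABLE_FLAG =
      "SYSTEM_UPDATE_SCHEMA_FIELDS_FROM_SCHEMA_METADATA_ENABLED";

  /** Skip the upgrade step unless the environment explicitly enables it. */
  static boolean skip() {
    return !Boolean.parseBoolean(System.getenv().getOrDefault(ENABLE_FLAG, "false"));
  }

  public static void main(String[] args) {
    System.out.println("skip step: " + skip());
  }
}
```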
105-107: Utility method to get the upgrade ID URN.

The method `getUpgradeIdUrn` uses a static method from `BootstrapStep` to generate a URN for the upgrade step. This encapsulates the URN-generation logic well.

109-181: Main executable logic of the upgrade step.

This method is the core of the upgrade step, handling the streaming, processing, and ingestion of schema aspects. It includes error handling and batch-processing logic, and the use of lambdas and streams is appropriate and modern. However, ensure that the error handling is robust enough to cover the specific exceptions that might occur during batch processing.

85-89: Method to generate URN-like strings for filtering.

The `getUrnLike` method generates a URN-like string for filtering datasets. This is a good use of a helper method to encapsulate functionality. However, ensure that the URN format aligns with the expected standards across the DataHub platform.

Verification successful

The URN format in the `getUrnLike` method is consistent with the expected standards. The format `urn:li:<entity_name>:%` used in the `getUrnLike` method aligns with the common URN format observed across the codebase, and the `%` wildcard is appropriately used for filtering purposes.

- The format `urn:li:<entity_type>:<entity_specific_details>` is used consistently across the DataHub platform.
- The `getUrnLike` method's format follows this pattern, ensuring compatibility and consistency.

Scripts executed

The following scripts were executed for the analysis:

```shell
#!/bin/bash
# Description: Verify the URN format used in getUrnLike method.
# Test: Search for URN format usage in other parts of the codebase. Expect: Consistent URN format.
rg --type java $'urn:li:'
```

Length of output: 295980
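For clarity, a minimal sketch of the `urn:li:<entity_name>:%` pattern described in the verification above; the helper and class names are illustrative:

```java
public class UrnLikeExample {
  // Builds a SQL LIKE pattern matching every URN of the given entity type,
  // e.g. "urn:li:dataset:%" for datasets.
  static String getUrnLike(String entityName) {
    return "urn:li:" + entityName + ":%";
  }

  public static void main(String[] args) {
    System.out.println(getUrnLike("dataset")); // urn:li:dataset:%
  }
}
```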
metadata-models/src/main/resources/entity-registry.yml (1)

457-457: Addition of the `status` aspect to the `schemaField` entity.

The addition of the `status` aspect to the `schemaField` entity is consistent with the PR's goal of handling schema fields more comprehensively. This change should be verified for compatibility with existing data models and usage in the platform.

li-utils/src/main/java/com/linkedin/metadata/Constants.java (1)

85-85: Addition of the `SCHEMA_FIELD_KEY_ASPECT` constant.

The addition of the `SCHEMA_FIELD_KEY_ASPECT` constant is straightforward and follows the established pattern in the file. This constant likely facilitates the handling of schema field keys across different parts of the application. Ensure that this new constant is used consistently where applicable.

Verification successful

Addition of the `SCHEMA_FIELD_KEY_ASPECT` constant verified. The `SCHEMA_FIELD_KEY_ASPECT` constant is used appropriately in the `SchemaFieldSideEffect` class and its corresponding test class. There are no other relevant places where this constant should be used but isn't.

- metadata-io/src/main/java/com/linkedin/metadata/schemafields/sideeffects/SchemaFieldSideEffect.java
- metadata-io/src/test/java/com/linkedin/metadata/schemafields/sideeffects/SchemaFieldSideEffectTest.java

Scripts executed

The following scripts were executed for the analysis:

```shell
#!/bin/bash
# Description: Verify the usage of the `SCHEMA_FIELD_KEY_ASPECT` constant across the project.
# Test: Search for references to the `SCHEMA_FIELD_KEY_ASPECT` constant. Expect: Consistent usage and definitions.
rg --type java $'SCHEMA_FIELD_KEY_ASPECT'
```

Length of output: 2214

```shell
#!/bin/bash
# Description: Verify if there are any other relevant files or classes that should be using the `SCHEMA_FIELD_KEY_ASPECT` constant but aren't.
# Test: Search for schema field key usage across the project to identify any potential places where the constant should be used.
rg --type java 'schemaFieldKey'
```

Length of output: 926
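A hedged sketch of the constant-declaration pattern referenced above; the literal string values and the class name here are assumptions for illustration, not copied from `Constants.java`:

```java
public final class ExampleConstants {
  private ExampleConstants() {}

  // Aspect-name constants follow the established pattern of the file under review;
  // the literal values below are assumed, not verified against the PR.
  public static final String SCHEMA_FIELD_KEY_ASPECT = "schemaFieldKey";
  public static final String STATUS_ASPECT_NAME = "status";
}
```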
metadata-io/src/test/java/com/linkedin/metadata/schemafields/sideeffects/SchemaFieldSideEffectTest.java (5)
47-76: Review of the class-level constants and setup for `SchemaFieldSideEffectTest`

The constants and configurations defined here are crucial for setting up the test environment. The use of `List.of` with `ChangeType` values and the configuration of `AspectPluginConfig` are appropriate and correctly implemented. `TEST_PLUGIN_CONFIG` is particularly well set up, specifying class names, enabling the plugin, and defining supported operations and entity aspects. However, it is important to ensure that these configurations align with the actual capabilities and expectations of the `SchemaFieldSideEffect` class in production. This would typically require cross-verifying with the class's documentation or implementation to ensure consistency.

81-91: Setup method review

The `setup` method is well structured, using mocking effectively to isolate the test environment. The use of `RetrieverContext.builder()` to assemble the context is clean and follows good practice in test-setup design. This setup method ensures that each test starts with a fresh context, which is crucial for test reliability.

169-348: Test method `statusToSchemaFieldStatusTest` and subsequent cases

This method tests the handling of status updates and their effects on `SchemaFieldStatus`. Similar to the previous test, it uses loops and conditional checks to validate the behavior under different scenarios. The reset of `mockAspectRetriever` (line 177) is crucial for ensuring that mock behavior from previous tests does not interfere. The structure of the test, including the setup of expected results and the use of assertions to validate them, is consistent and appropriate. The handling of the different `ChangeType` scenarios ensures comprehensive coverage.

350-465: Deletion tests `schemaMetadataDeleteTest` and `statusDeleteTest`

These tests verify the correct behavior when schema metadata and status aspects are deleted. The construction of expected results and the use of assertions (lines 390-401 and 457-465) are well implemented. These tests are crucial for ensuring that deletions are handled correctly, especially in a system where cascading effects can occur. The use of `List.of` and loops to construct the expected results is efficient and makes the tests easier to understand and maintain.

468-475: Utility method `getTestSchemaMetadata`

This method provides a mocked `SchemaMetadata` object for testing. The use of `GenericRecordUtils.deserializeAspect` to deserialize a JSON string into a `SchemaMetadata` object is a good use of existing utilities. Ensure that the JSON string accurately represents the structure expected by the tests, as any discrepancy here could lead to false positives or negatives in the test outcomes.
Actionable comments posted: 1
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (2)
- datahub-upgrade/src/main/java/com/linkedin/datahub/upgrade/config/SchemaFieldsFromSchemaMetadataConfig.java (1 hunks)
- metadata-service/configuration/src/main/resources/application.yaml (1 hunks)
Additional comments not posted (4)
datahub-upgrade/src/main/java/com/linkedin/datahub/upgrade/config/SchemaFieldsFromSchemaMetadataConfig.java (2)

3-7: Review imports and package declaration.

The imports and package declaration appear to be correct and relevant to the functionality being implemented.

14-16: Check class annotations.

The `@Configuration` and `@Conditional` annotations are appropriately used here to define a configuration class and conditionally load it based on the system's state.

metadata-service/configuration/src/main/resources/application.yaml (2)

344-348: Review the new configuration parameters for schemaFieldsFromSchemaMetadata.

The configuration parameters for `schemaFieldsFromSchemaMetadata` are correctly formatted and logically placed within the system-update section. Ensure that these values are documented and that there are corresponding validations in the application to prevent misconfiguration.

Line range hint 350-352: Review the new configuration parameters for structuredProperties.

The `structuredProperties` configuration is added correctly and includes parameters for enabling and configuring structured properties. Ensure that these settings are used consistently throughout the application and are well documented.

Verification successful

The `structuredProperties` configuration parameters (`enabled`, `writeEnabled`, and `systemUpdateEnabled`) are used consistently in the codebase:

- `UpdateIndicesService.java`:
  - `structuredProperties.enabled`: `@Value("${structuredProperties.enabled}") private boolean _structuredPropertiesHookEnabled;`
  - `structuredProperties.writeEnabled`: `@Value("${structuredProperties.writeEnabled}") private boolean _structuredPropertiesWriteEnabled;`
- `ElasticSearchIndexBuilderFactory.java`:
  - `structuredProperties.systemUpdateEnabled`: `@Value("#{new Boolean('${structuredProperties.systemUpdateEnabled}')}") private boolean enableStructuredPropertiesReindex;`

These parameters are integrated correctly within the application's services, ensuring that the structured-properties feature is configurable via the application configuration.

Scripts executed

The following scripts were executed for the analysis:

```shell
#!/bin/bash
# Description: Verify the usage of structuredProperties configuration in the application.
# Test: Search for the usage of structuredProperties configuration. Expect: Consistent and documented usage in the application.
rg --type yaml -A 5 $'structuredProperties'
```

Length of output: 11310

```shell
#!/bin/bash
# Description: Verify the usage of structuredProperties configuration parameters in the codebase.
rg 'structuredProperties.enabled'
rg 'structuredProperties.writeEnabled'
rg 'structuredProperties.systemUpdateEnabled'
```

Length of output: 560

```shell
#!/bin/bash
# Description: Extract the relevant code snippets for structuredProperties configuration parameters.
ast-grep --lang java --pattern $'@Value("${structuredProperties.enabled}") $_' metadata-io/src/main/java/com/linkedin/metadata/service/UpdateIndicesService.java
ast-grep --lang java --pattern $'@Value("${structuredProperties.writeEnabled}") $_' metadata-io/src/main/java/com/linkedin/metadata/service/UpdateIndicesService.java
ast-grep --lang java --pattern $'@Value("#{new Boolean(\'${structuredProperties.systemUpdateEnabled}\')}") $_' metadata-service/factories/src/main/java/com/linkedin/gms/factory/search/ElasticSearchIndexBuilderFactory.java
```

Length of output: 555

```shell
#!/bin/bash
# Description: Extract the relevant code snippets for structuredProperties configuration parameters using rg with context lines.
rg 'structuredProperties.enabled' -A 5 metadata-io/src/main/java/com/linkedin/metadata/service/UpdateIndicesService.java
rg 'structuredProperties.writeEnabled' -A 5 metadata-io/src/main/java/com/linkedin/metadata/service/UpdateIndicesService.java
rg 'structuredProperties.systemUpdateEnabled' -A 5 metadata-service/factories/src/main/java/com/linkedin/gms/factory/search/ElasticSearchIndexBuilderFactory.java
```

Length of output: 1089
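A hedged, self-contained sketch mirroring the `@Value`-bound toggles quoted in the verification above; the class name and the `shouldWrite()` helper are illustrative, while the property keys are the ones cited from the codebase:

```java
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;

@Component
public class StructuredPropertiesToggleExample {

  // Bound from application.yaml, matching the usages quoted from UpdateIndicesService above.
  @Value("${structuredProperties.enabled}")
  private boolean structuredPropertiesHookEnabled;

  @Value("${structuredProperties.writeEnabled}")
  private boolean structuredPropertiesWriteEnabled;

  public boolean shouldWrite() {
    return structuredPropertiesHookEnabled && structuredPropertiesWriteEnabled;
  }
}
```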
Caution
Inline review comments failed to post
Actionable comments posted: 1
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (2)
- datahub-upgrade/src/main/java/com/linkedin/datahub/upgrade/config/SchemaFieldsFromSchemaMetadataConfig.java (1 hunks)
- metadata-service/configuration/src/main/resources/application.yaml (1 hunks)
Comments failed to post (1)
datahub-upgrade/src/main/java/com/linkedin/datahub/upgrade/config/SchemaFieldsFromSchemaMetadataConfig.java
18-34: Ensure proper validation of configuration values.

The configuration values `batchSize`, `delayMs`, and `limit` are used directly within the `GenerateSchemaFieldsFromSchemaMetadata` and `GenerateSchemaFieldsFromSchemaMetadataStep` classes without explicit validation. This could lead to runtime issues if these values are set incorrectly.

- File: datahub-upgrade/src/main/java/com/linkedin/datahub/upgrade/system/schemafield/GenerateSchemaFieldsFromSchemaMetadata.java
- File: datahub-upgrade/src/main/java/com/linkedin/datahub/upgrade/system/schemafield/GenerateSchemaFieldsFromSchemaMetadataStep.java

Recommendation: Add validation logic for these configuration parameters to ensure they are within acceptable ranges or formats before they are used in the application.

Analysis chain

Review the bean definition method.

This method is well defined, with appropriate parameter annotations and types. It correctly creates a `NonBlockingSystemUpgrade` bean using the `GenerateSchemaFieldsFromSchemaMetadata` class. However, ensure that configurations such as `batchSize`, `delayMs`, and `limit` are used correctly in the system and that their values are validated somewhere in the application, so that incorrect settings do not cause runtime issues.

Scripts executed

The following scripts were executed for the analysis:

```shell
#!/bin/bash
# Description: Verify proper usage of configuration values in the application.
# Test: Search for the usage of configuration values. Expect: Proper validation or usage in the system.
rg --type java -A 5 $'systemUpdate.schemaFieldsFromSchemaMetadata'
```

Length of output: 2092

```shell
#!/bin/bash
# Description: Search for the instantiation and method invocations of GenerateSchemaFieldsFromSchemaMetadata to verify usage and validation of configuration values.
rg --type java 'new GenerateSchemaFieldsFromSchemaMetadata'
rg --type java 'GenerateSchemaFieldsFromSchemaMetadata'
```

Length of output: 3349

```shell
#!/bin/bash
# Description: Extract the constructor and methods of GenerateSchemaFieldsFromSchemaMetadata to verify usage and validation of configuration values.
ast-grep --lang java --pattern 'class GenerateSchemaFieldsFromSchemaMetadata { public GenerateSchemaFieldsFromSchemaMetadata($_, $_, $_, $_, $_, $_, $_) { $$$ } $$$ }'
```

Length of output: 179

```shell
#!/bin/bash
# Description: Extract the class definition and methods of GenerateSchemaFieldsFromSchemaMetadata to verify usage and validation of configuration values.
rg --type java 'class GenerateSchemaFieldsFromSchemaMetadata' -A 20
rg --type java 'GenerateSchemaFieldsFromSchemaMetadata' -A 10
```

Length of output: 31083

```shell
#!/bin/bash
# Description: Search for validation logic within the GenerateSchemaFieldsFromSchemaMetadataStep class.
rg --type java 'GenerateSchemaFieldsFromSchemaMetadataStep' -A 20
```

Length of output: 25493
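A minimal sketch of the fail-fast validation recommended above; the class name and exact range checks are assumptions, intended only to show the shape of the guard, not the project's chosen thresholds:

```java
public class UpgradeConfigValidationExample {

  // Fails fast when values injected from application.yaml are out of range,
  // instead of letting a bad batchSize or negative delay surface during the upgrade run.
  static void validate(int batchSize, long delayMs, int limit) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive, got " + batchSize);
    }
    if (delayMs < 0) {
      throw new IllegalArgumentException("delayMs must be non-negative, got " + delayMs);
    }
    if (limit < 0) {
      throw new IllegalArgumentException("limit must be non-negative, got " + limit);
    }
  }

  public static void main(String[] args) {
    validate(500, 1000L, 0); // passes
  }
}
```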
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (2)
- datahub-upgrade/src/main/java/com/linkedin/datahub/upgrade/config/SchemaFieldsFromSchemaMetadataConfig.java (1 hunks)
- metadata-service/configuration/src/main/resources/application.yaml (1 hunks)
Files skipped from review as they are similar to previous changes (2)
- datahub-upgrade/src/main/java/com/linkedin/datahub/upgrade/config/SchemaFieldsFromSchemaMetadataConfig.java
- metadata-service/configuration/src/main/resources/application.yaml
Remaining comments which cannot be posted as a review comment to avoid GitHub Rate Limit
actionlint
[actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:11:51: Double quote to prevent globbing and word splitting [shellcheck]
run: | |
[actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:2:55: Double quote to prevent globbing and word splitting [shellcheck]
run: | |
[actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:2:45: Double quote to prevent globbing and word splitting [shellcheck]
run: | |
[actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:2:45: Double quote to prevent globbing and word splitting [shellcheck]
datahub/.github/workflows/docker-unified.yml
Line 107 in 114f71d
run: | |
[actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:1:118: Double quote to prevent globbing and word splitting [shellcheck]
datahub/.github/workflows/docker-unified.yml
Line 548 in 114f71d
run: echo "tag=${{ needs.setup.outputs.ingestion_base_change == 'true' && needs.setup.outputs.unique_tag || 'head' }}" >> $GITHUB_OUTPUT |
[actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:1:128: Double quote to prevent globbing and word splitting [shellcheck]
datahub/.github/workflows/docker-unified.yml
Line 589 in 114f71d
run: echo "tag=${{ needs.setup.outputs.ingestion_base_change == 'true' && needs.setup.outputs.unique_slim_tag || 'head-slim' }}" >> $GITHUB_OUTPUT |
[actionlint] reported by reviewdog 🐶
missing input "image_tag" which is required by action "Custom Docker build and push" defined at "./.github/actions/docker-custom-build-and-push". all required inputs are "image_tag", "images" [action]
datahub/.github/workflows/docker-unified.yml
Line 613 in 114f71d
uses: ./.github/actions/docker-custom-build-and-push |
[actionlint] reported by reviewdog 🐶
input "tags" is not defined in action "Custom Docker build and push" defined at "./.github/actions/docker-custom-build-and-push". available inputs are "build-args", "context", "file", "flavor", "image_tag", "images", "password", "platforms", "publish", "target", "username" [action]
datahub/.github/workflows/docker-unified.yml
Line 618 in 114f71d
tags: ${{ needs.setup.outputs.full_tag }} |
[actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:1:123: Double quote to prevent globbing and word splitting [shellcheck]
datahub/.github/workflows/docker-unified.yml
Line 630 in 114f71d
run: echo "tag=${{ needs.setup.outputs.ingestion_base_change == 'true' && needs.setup.outputs.unique_full_tag || 'head' }}" >> $GITHUB_OUTPUT |
[actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:1:123: Double quote to prevent globbing and word splitting [shellcheck]
datahub/.github/workflows/docker-unified.yml
Line 684 in 114f71d
run: echo "tag=${{ needs.setup.outputs.ingestion_change == 'true' && needs.setup.outputs.unique_slim_tag || 'head-slim' }}" >> $GITHUB_OUTPUT |
[actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:1:113: Double quote to prevent globbing and word splitting [shellcheck]
datahub/.github/workflows/docker-unified.yml
Line 770 in 114f71d
run: echo "tag=${{ needs.setup.outputs.ingestion_change == 'true' && needs.setup.outputs.unique_tag || 'head' }}" >> $GITHUB_OUTPUT |
[actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:2:55: Double quote to prevent globbing and word splitting [shellcheck]
datahub/.github/workflows/docker-unified.yml
Line 812 in 114f71d
run: | |
[actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:4:63: Double quote to prevent globbing and word splitting [shellcheck]
datahub/.github/workflows/docker-unified.yml
Line 812 in 114f71d
run: | |
[actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:6:95: Double quote to prevent globbing and word splitting [shellcheck]
datahub/.github/workflows/docker-unified.yml
Line 812 in 114f71d
run: | |
[actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:8:24: Double quote to prevent globbing and word splitting [shellcheck]
datahub/.github/workflows/docker-unified.yml
Line 812 in 114f71d
run: | |
[actionlint] reported by reviewdog 🐶
property "integrations_service_change" is not defined in object type {backend_change: string; backend_only: string; branch_name: string; docker-login: string; elasticsearch_setup_change: string; frontend_change: string; frontend_only: string; full_tag: string; ingestion_base_change: string; ingestion_change: string; ingestion_only: string; kafka_setup_change: string; mysql_setup_change: string; postgres_setup_change: string; pr-publish: string; publish: string; python_release_version: string; repository_name: string; slim_tag: string; smoke_test_change: string; tag: string; unique_full_tag: string; unique_slim_tag: string; unique_tag: string} [expression]
datahub/.github/workflows/docker-unified.yml
Line 927 in 114f71d
run: | |
[actionlint] reported by reviewdog 🐶
property "short_sha" is not defined in object type {backend_change: string; backend_only: string; branch_name: string; docker-login: string; elasticsearch_setup_change: string; frontend_change: string; frontend_only: string; full_tag: string; ingestion_base_change: string; ingestion_change: string; ingestion_only: string; kafka_setup_change: string; mysql_setup_change: string; postgres_setup_change: string; pr-publish: string; publish: string; python_release_version: string; repository_name: string; slim_tag: string; smoke_test_change: string; tag: string; unique_full_tag: string; unique_slim_tag: string; unique_tag: string} [expression]
datahub/.github/workflows/docker-unified.yml
Line 1066 in 114f71d
message: '{ "command": "git-sync", "args" : {"repoName": "${{ needs.setup.outputs.repository_name }}", "repoOrg": "${{ github.repository_owner }}", "repoBranch": "${{ needs.setup.outputs.branch_name }}", "repoShaShort": "${{ needs.setup.outputs.short_sha }}" }}' |
Remaining comments which cannot be posted as a review comment to avoid GitHub Rate Limit
actionlint
[actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:1:113: Double quote to prevent globbing and word splitting [shellcheck]
datahub/.github/workflows/docker-unified.yml
Line 770 in dcb59e9
run: echo "tag=${{ needs.setup.outputs.ingestion_change == 'true' && needs.setup.outputs.unique_tag || 'head' }}" >> $GITHUB_OUTPUT |
[actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:2:55: Double quote to prevent globbing and word splitting [shellcheck]
datahub/.github/workflows/docker-unified.yml
Line 812 in dcb59e9
run: | |
[actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:4:63: Double quote to prevent globbing and word splitting [shellcheck]
datahub/.github/workflows/docker-unified.yml
Line 812 in dcb59e9
run: | |
[actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:6:95: Double quote to prevent globbing and word splitting [shellcheck]
datahub/.github/workflows/docker-unified.yml
Line 812 in dcb59e9
run: | |
[actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:8:24: Double quote to prevent globbing and word splitting [shellcheck]
datahub/.github/workflows/docker-unified.yml
Line 812 in dcb59e9
run: | |
[actionlint] reported by reviewdog 🐶
property "integrations_service_change" is not defined in object type {backend_change: string; backend_only: string; branch_name: string; docker-login: string; elasticsearch_setup_change: string; frontend_change: string; frontend_only: string; full_tag: string; ingestion_base_change: string; ingestion_change: string; ingestion_only: string; kafka_setup_change: string; mysql_setup_change: string; postgres_setup_change: string; pr-publish: string; publish: string; python_release_version: string; repository_name: string; slim_tag: string; smoke_test_change: string; tag: string; unique_full_tag: string; unique_slim_tag: string; unique_tag: string} [expression]
datahub/.github/workflows/docker-unified.yml
Line 927 in dcb59e9
run: | |
[actionlint] reported by reviewdog 🐶
property "short_sha" is not defined in object type {backend_change: string; backend_only: string; branch_name: string; docker-login: string; elasticsearch_setup_change: string; frontend_change: string; frontend_only: string; full_tag: string; ingestion_base_change: string; ingestion_change: string; ingestion_only: string; kafka_setup_change: string; mysql_setup_change: string; postgres_setup_change: string; pr-publish: string; publish: string; python_release_version: string; repository_name: string; slim_tag: string; smoke_test_change: string; tag: string; unique_full_tag: string; unique_slim_tag: string; unique_tag: string} [expression]
datahub/.github/workflows/docker-unified.yml
Line 1066 in dcb59e9
message: '{ "command": "git-sync", "args" : {"repoName": "${{ needs.setup.outputs.repository_name }}", "repoOrg": "${{ github.repository_owner }}", "repoBranch": "${{ needs.setup.outputs.branch_name }}", "repoShaShort": "${{ needs.setup.outputs.short_sha }}" }}' |
.github/workflows/docker-unified.yml
Outdated
[actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2129:style:2:1: Consider using { cmd1; cmd2; } >> file instead of individual redirects [shellcheck]
run: | |
.github/workflows/docker-unified.yml
Outdated
[actionlint] reported by reviewdog 🐶
property "integrations_service_change" is not defined in object type {backend_change: string; backend_only: string; branch_name: string; docker-login: string; elasticsearch_setup_change: string; frontend_change: string; frontend_only: string; full_tag: string; ingestion_base_change: string; ingestion_change: string; ingestion_only: string; kafka_setup_change: string; mysql_setup_change: string; postgres_setup_change: string; pr-publish: string; publish: string; python_release_version: string; repository_name: string; slim_tag: string; smoke_test_change: string; tag: string; unique_full_tag: string; unique_slim_tag: string; unique_tag: string} [expression]
datahub/.github/workflows/docker-unified.yml
Line 939 in 637c562
run: | |
.github/workflows/docker-unified.yml
Outdated
[actionlint] reported by reviewdog 🐶
property "short_sha" is not defined in object type {backend_change: string; backend_only: string; branch_name: string; docker-login: string; elasticsearch_setup_change: string; frontend_change: string; frontend_only: string; full_tag: string; ingestion_base_change: string; ingestion_change: string; ingestion_only: string; kafka_setup_change: string; mysql_setup_change: string; postgres_setup_change: string; pr-publish: string; publish: string; python_release_version: string; repository_name: string; slim_tag: string; smoke_test_change: string; tag: string; unique_full_tag: string; unique_slim_tag: string; unique_tag: string} [expression]
datahub/.github/workflows/docker-unified.yml
Line 1078 in 637c562
message: '{ "command": "git-sync", "args" : {"repoName": "${{ needs.setup.outputs.repository_name }}", "repoOrg": "${{ github.repository_owner }}", "repoBranch": "${{ needs.setup.outputs.branch_name }}", "repoShaShort": "${{ needs.setup.outputs.short_sha }}" }}' |
Maintain schemaField entities and status aspects from a dataset's schemaMetadata and status aspects.

- `schemaFieldAliases` and SideEffect MCP generation to hold v1/v2 schemaField URNs
- Enables the schemaField Side Effect: `MCP_SIDE_EFFECTS_SCHEMA_FIELD_ENABLED=true`
- Creates schemaField entities from Dataset schemaMetadata: `SYSTEM_UPDATE_SCHEMA_FIELDS_FROM_SCHEMA_METADATA_ENABLED=true`
- Enables the new document id algorithm for schemaFields: `SYSTEM_UPDATE_SCHEMA_FIELDS_DOC_IDS_ENABLED=true`
- Updates all schemaField ES documents with the hash algorithm: `ELASTICSEARCH_INDEX_DOC_IDS_SCHEMA_FIELD_HASH_ID_ENABLED=true`

Checklist

Summary by CodeRabbit

New Features

Bug Fixes

Refactor
- Refactoring involving the `AspectsBatch` class.

Tests

Chores