Add Multi-Cardinality Support to DedupliFHIR Backend #122

Merged 9 commits on Aug 30, 2024

IsaacMilarky
Collaborator

Add Multi-Cardinality Support to DedupliFHIR Backend

Problem

Currently, the DedupliFHIR tool supports only a fixed column structure when processing user-supplied data: the columns that the tool checks and compares are predefined and static.

For example, if the user supplies a record that includes more than one street address or postal code, the additional values are ignored by the tool.

This problem is detailed in this issue: #54

Solution

I have added functionality to support multi-cardinality if it is found in the input data. Both FHIR and CSV data now include support for multiple names, addresses, and postal codes.
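
As an illustration only, one way to represent multi-cardinality fields is to flatten each repeated element into numbered columns. The sketch below is not the code from this PR: `flatten_patient` is a hypothetical helper, and apart from `street_address1` and `street_address2` (which appear later in the review discussion) the column names are assumptions. The `name`, `given`, `family`, `address`, `line`, and `postalCode` keys follow the FHIR Patient resource.

```python
from typing import Any


def flatten_patient(patient: dict[str, Any]) -> dict[str, str]:
    """Flatten a FHIR-style Patient dict into numbered columns.

    Hypothetical helper: every repeated name or address gets its own
    numbered column instead of only the first one being kept.
    """
    row: dict[str, str] = {}

    # Each HumanName entry becomes given_name<i> / family_name<i>.
    for i, name in enumerate(patient.get("name", []), start=1):
        row[f"given_name{i}"] = " ".join(name.get("given", []))
        row[f"family_name{i}"] = name.get("family", "")

    # Each Address entry becomes street_address<i> / postal_code<i>.
    for i, address in enumerate(patient.get("address", []), start=1):
        row[f"street_address{i}"] = " ".join(address.get("line", []))
        row[f"postal_code{i}"] = address.get("postalCode", "")

    return row


example_patient = {
    "name": [{"given": ["Ada"], "family": "Lovelace"}],
    "address": [
        {"line": ["1 Main St"], "postalCode": "12345"},
        {"line": ["9 Oak Ave", "Apt 2"], "postalCode": "67890"},
    ],
}
example_row = flatten_patient(example_patient)
```

With this layout, a record with two addresses yields both `street_address1` and `street_address2` rather than dropping the second one.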

Result

Tests have been adapted to use more than one address, and the settings for the Splink backend are now computed after the data is parsed rather than before.
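
Computing the settings after parsing matters because the set of numbered columns is only known once the data has been read. The following is a minimal sketch of that idea, not the PR's implementation: `numbered_columns` and `BASE_COLUMNS` are hypothetical names, and it deliberately stops short of building any Splink settings object.

```python
import re

# Hypothetical base names for fields that may repeat in the input.
BASE_COLUMNS = ("street_address", "postal_code")


def numbered_columns(columns, base_names=BASE_COLUMNS):
    """Return the numbered multi-cardinality columns present in `columns`.

    Results are ordered by base name (in `base_names` order) and then by
    numeric suffix, so street_address10 sorts after street_address2.
    """
    pattern = re.compile(
        r"^(%s)(\d+)$" % "|".join(re.escape(name) for name in base_names)
    )
    found = []
    for column in columns:
        match = pattern.match(column)
        if match:
            found.append((match.group(1), int(match.group(2)), column))
    found.sort(key=lambda item: (base_names.index(item[0]), item[1]))
    return [column for _, _, column in found]


parsed_columns = ["id", "street_address1", "postal_code1", "street_address2"]
extra_columns = numbered_columns(parsed_columns)
```

The resulting list could then drive one comparison rule per discovered column, which is only possible once the parsed column names are available.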

Test Plan

Run make test

Signed-off-by: Isaac Milarsky <[email protected]>
@IsaacMilarky added the enhancement and dedupliFHIR labels on Aug 16, 2024
    else:
        blocking_rules.append(block_on(rule))
def get_additional_comparison_rules(parsed_data_df):

[pylint] reported by reviewdog 🐶
C0303: Trailing whitespace (trailing-whitespace)

Signed-off-by: Isaac Milarsky <[email protected]>
@patsier-cms
Contributor

Looks like the installation for tests on 3.11 is failing, is that a fluke or does that need to be updated?

Signed-off-by: Isaac Milarsky <[email protected]>
@IsaacMilarky
Collaborator Author

> Looks like the installation for tests on 3.11 is failing, is that a fluke or does that need to be updated?

Sorry, I was just merging in Dependabot updates. It's just a fluke and is fixed now.

@patsier-cms (Contributor) left a comment

This looks good for now as an approach to supporting multiple values! It would be good to add a ticket to the backlog to compare street_address1 to street_address2 in addition to street_address1, but it sounds like that would be a heavier lift.
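
To make the suggested follow-up concrete, here is a rough sketch of a cross-column comparison, where every address on one record is checked against every address on the other. It is an assumption-laden illustration of the backlog idea, not something this PR implements: `address_values` and `any_address_matches` are hypothetical helpers, and exact case-insensitive equality stands in for whatever similarity measure the backend would actually use.

```python
from itertools import product


def address_values(record, prefix="street_address"):
    """Collect the non-empty values of all numbered `prefix` columns."""
    values = []
    for key, value in record.items():
        suffix = key[len(prefix):]
        if key.startswith(prefix) and suffix.isdigit() and value:
            values.append(value.strip().lower())
    return values


def any_address_matches(left, right, prefix="street_address"):
    """True if any address on `left` equals any address on `right`.

    Compares all pairs (street_address1 vs street_address2, and so on),
    not only columns that share the same number.
    """
    pairs = product(address_values(left, prefix), address_values(right, prefix))
    return any(a == b for a, b in pairs)


record_a = {"street_address1": "1 Main St", "street_address2": "9 Oak Ave"}
record_b = {"street_address1": "9 oak ave"}
cross_match = any_address_matches(record_a, record_b)
```

A same-position comparison would miss this pair, because the shared address sits in `street_address2` on one record and `street_address1` on the other. The number of pairs grows with the product of the address counts, which is part of why this is the heavier lift.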

@IsaacMilarky merged commit efbb66a into dev on Aug 30, 2024
8 checks passed
@IsaacMilarky deleted the cardinality branch on August 30, 2024 17:01
Labels: dedupliFHIR, enhancement
Development

Successfully merging this pull request may close these issues.

FHIR Parser should accommodate elements with multiple cardinality.
2 participants