-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(exports): interpret PN tags as union between object and string #364
Conversation
A little addendum, you can reproduce the issue in OHIF by creating a single measurement and exporting a structured report. Set a breakpoint here (where it does the denaturalizing) and look at the '00100010' (PatientName) tag. It will be "[object Object]". Finally, this implies a bit of a paradigm shift. In OHIF there are places where it's clear PNs are not handled correcly. Here, for example. With this change, users will have to be careful not to do something like My change is... kinda sloppy. That's not to say it's sloppily done, rather it reduces cleanliness a fair amount, since it's breaking through some conceptual boundaries. (For instance adding denaturalizing code to ValueRepresentation, sneaking toString and toJSON into native types.) If anyone has a better idea on how to fix it (that doesn't require an epic refactor) I'm all ears. |
Thanks for working on this 👍 I'm trying to get my head around the issue - can you elaborate on your comment about derived SR having [object Object] as the Patient Name? |
Sure thing! What's happening is when OHIF tries to create a Structured Report, it is pulling instance metadata for the image you are annotating, which is drawn from DICOMWeb as application/dicom+json. The json representation of PN (person name) is not the same as in the binary format (part 10? I'm not super familiar with the naming conventions), as described here.
Which is to say, when you load a dataset, the Value field of the tag will be different depending on where you load it from. In the case of .dcm, the value will be:
whereas if loaded from json, it'll be:
So, in the latter circumstance, if you were write out the dcm file, eventually PersonName.write() would get called, which is derived from EncodedStringRepresentation, which calls toString() on the Value, and thus it writes out "[object Object]", since it is indeed an object, and not a string. This is a problem because OHIF hydrates its structured reports with json data, thus all PN tags become "[object Object]". A smaller change would be to just detect the type of Value when writing it from the PersonName class, and that would indeed fix binary output, but json output would be broken in the event someone loaded a binary dcm and wrote json. I hope it's clearer now! |
Thanks for the additional context. This is sounding like more of an OHIF issue than something that should be changed in dcmjs. The reason I say that is that by design a dataset object in dcmjs should be exactly in DICOM JSON Model, so we should be using the object form with the fields for PersonName and not the ^ separated version used in the Part 10 binary. Can you point to where in OHIF the SR is being created / used that leads to the reported issue? In dcmjs, the SR derivation is done by constructing a StructuredReport datasets being referenced as a argument:
This inherits from DerivedDataset, where the PatientName is just copied over: dcmjs/src/derivations/DerivedDataset.js Line 69 in f844e02
|
I believe they pull the metadata from here and create the SR just below it, here. They also do some work[1] to detect[2] what kind of PN tag they're dealing with.
Not according to the link I posted. I can see, for example, in the test mock for a minimum viable naturalized dataset the name is just a string, which is not what the standard describes. And similarly there is currently no conversion done between the object representation and a plain string of the format "A=B=C" when the dataset is written. Indeed, that's what I'm trying to do with my change. |
Gotcha, thanks for the links. It does seem that PN doesn't follow the intended pattern, so that's a bug in the original implementation. Do your suggested changes alter the API or are both string and object values are now supported? Also, is it possible to separate the style changes so the functional changes are in a single commit that's easier to review? |
My pleasure! Thanks for the review! The API hasn't changed, and indeed strings from part10 data and objects from json are supported by the change. It should be noted in future documentation that after ingestion (before and after naturalization) one should always use the json accessors to PN tags. After denaturalization, the value in the dict will take on this^string=format. I'll separate out the format changes (thanks, lint-on-save 😄 ) and think about adding a couple more tests. |
thanks again for working on this. Does anyone else want to have a look before this gets finalized? @sedghi maybe? |
@wayfarer3130 any comment here? |
src/DicomMetaDictionary.js
Outdated
data.Value.__pnDcm || | ||
(data.Value && | ||
Array.isArray(data.Value) && | ||
data.Value[0] == "string") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you meant typeof data.Value[0] === "string"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed!
src/DicomMetaDictionary.js
Outdated
// object, and if it's a json object it mocks a string by overriding | ||
// toString. The latter ensures the ValueRepresentation output is | ||
// correct. | ||
if ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a utility which converts/handles name parsing/generation for patient names in both directions.
This code is good, but it would be helpful to have it defined by itself so it can be tested more easily and so that it can be documented sufficiently by itself.
src/DicomMetaDictionary.js
Outdated
data.Value[0] == "string") | ||
) { | ||
data.Value.__objectLike = true; | ||
Object.defineProperty(data.Value, "Alphabetic", { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the default encoding is as a List of the typed object, and not as a simple object. However, making the change on the list itself is probably ok, as it makes it easy to access the data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought given it was a special case we may as well enforce this "objectlike" behaviour.
const measurement = toolClass.getMeasurementData( | ||
measurementGroup | ||
); | ||
const measurement = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like a lot of formatting changes - if you want to run the formatter, do it as a separate PR that is just that change so that the review only has the substantive changes.
test/DicomMetaDictionary.test.js
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the reformatting, I have no idea what was actually changed in teh tests, so it becomes hard to figure out how to review it. Please commit separately in a different PR, and then continue on that branch that has been reformatted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dmlambo did already split the PR into 3 commits so you should be able to review those independently. Maybe you are right though @wayfarer3130 and we should think whether it should be a different PR. Is there a practical benefit down the line to having two PRs?
Any more comments? 🙂 |
I'll defer to @wayfarer3130 for final review. |
src/utilities/dicomJson.js
Outdated
*/ | ||
function pnDenaturalize(value) { | ||
if (value && typeof value === "object") { | ||
return `${value.Alphabetic ?? ""}=${value.Ideographic ?? ""}=${ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should only get the ideographic/phonetic separators if one of them exists. That is:
if( value.Ideographic || value.Phonetic ) return your full string
return value.Alphabetic
That avoids you creating a string with the = in it when it isn't necessary (which changes the appearance in some systems).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notice the .replace just below which removes all end-of-string adjacent =. Thus if the full string is "ABC==" it becomes "ABC", "ABC=DEF=" becomes "ABC=DEF", and "ABC==GHI" stays the same.
src/utilities/dicomJson.js
Outdated
if ( | ||
data.Value.__pnDcm || | ||
(data.Value && | ||
Array.isArray(data.Value) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't deal with tags such as ConsultingPhysician's Name which can contain 1...n instances. I know that is really a pain to have to deal with, as it will have to be decided how that works, but my suggestion is to:
- If the VM !== 1 or 0 or 1, then leave the object as an array of objects all the time.
- For other VM sizes, apply the change to the individual strings in each item, as a new String(originalValue)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh my. I indeed didn't know PN could have VM > 1. I'll do another pass with that in mind, thanks for the heads up.
src/anonymizer.js
Outdated
var newValue; | ||
if (tagString in tagNamesToReplace) { | ||
newValue = [tagNamesToReplace[tagString]]; | ||
if (Array.isArray(dict[tagString].Value)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is the right behaviour because it modifies the original string, which is going to cause problems. You do NOT want to keep the references here because of that. Also, you don't want to clear every single array element there is, only the ones which have tag names to replace. The other ones should be left alone.
src/utilities/dicomJson.js
Outdated
{ | ||
Alphabetic: components[0], | ||
Ideographic: components[1], | ||
Phonetic: components[2] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be a map for all the elements of the array in case the vm>1. What I would do is check if the base value is a string, and if it is, then convert it into an array and perform the mapping function on the result. Probably having a function like toArray = (value) => Array.isArray(value) ? value : [value]
You might need to undefined check as well.
src/utilities/dicomJson.js
Outdated
Array.isArray(data.Value) && | ||
typeof data.Value[0] == "string") | ||
) { | ||
data.Value.__objectLike = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make this a hidden property so it doesn't show up on iteration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the changes to the anonyization above, please add a test with a sequence (array) of values that is not anonymized, and ensure that sequence is unchanged in the original object after you anonymize it.
test/sample-sr.json
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a multiple name example here please? That allows you to add the appropriate tests to handle multiple names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added it to multiplicity.dcm, but it requires an update of the file in the test artifacts repo.
Incoming. Heh, this one is rather large, though some changes were reverted from before. Note that as it stands one test will fail, because multiplicity.dcm does not have the correct tags, and I don't have access to the test data repo. Let me know where to send it to get it released! This change may take a little while to explain and to understand, so take your time looking through it, and I'll answer any questions. Basically whenever a tag root gets created (in the form of For instance:
which replaces the dataset with a Proxy which intercepts any property assignment, looks up the propert's ValueRepresentation, and sets additional accessors on the value:
This is to ensure if someone assigns a PN incorrectly (IE, without using the json model), we can still output correct JSON, as well as allowing us to convert to part10 given the same scenario. I'll likely have a few smaller changes coming through, but let me know what you think in the meantime. |
In the end other changes worked out in the wash, it seems pretty stable to me. Does anyone have any more comments on this change? Anything I can explain better? Thanks! |
Thanks for your work on this @dmlambo 👍 From a read-through I believe I'm fine with this but would appreciate if @wayfarer3130 can look at the unresolved conversations. Also I see a test failure now: https://github.com/dcmjs-org/dcmjs/actions/runs/6178725114/job/17111489516?pr=364 |
This is because the multiplicity dcm in the test data repo doesn't have my changes. I've added a PN tag with multiplicity. I don't have permissions to update that file, so someone else would have to do it for me. 🙂 Or if someone wants to modify the existing file, I used DICOM Editor Tool to add this tag:
Thanks! |
src/ValueRepresentation.js
Outdated
// dicomDict.dict.upsertTag('00101001', 'PN', [{Alphabetic:'Doe^John'}]); | ||
// naturalizedDataset.OperatorsName = [{Alphabetic:'Doe^John'},{Alphabetic:'Doe^Jane'}]; | ||
// TODO: refactor with addAccessors.js in mind | ||
const handler = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you define the const handler statically? Creating a new instance of it every time isn't needed, as the values inside it are static.
return true; | ||
} | ||
|
||
addValueAccessors(value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment here explaining that it is adding accessors on a String object, for the Alphabetic etc components?
this.padByte = PADDING_SPACE; | ||
} | ||
|
||
static checkComponentLengths(components) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see value in not writing non-conformant JSON, but reading it should be fully supported - can you test reading non-conformant data and then indicate here that this function only prevents one from creating it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it stands it's only checked on write. I'll add a comment about it.
{ Alphabetic: "Doe^Jane", Ideographic: "Janie", Phonetic: "Jayne" } | ||
]) | ||
); | ||
expect(otherPatientNamesIDValue.toString()).toEqual( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for adding tests here for the patient name handling - that is definitely worthwhile having this tested.
This is looking good now - much happier with it as updated. Could you move the handler out to a single const rather than redeclaring it in the creator? Then I'm happy with it. |
I'm happy to put new testing data in the repository, but do you really need to change and existing file? Wouldn't it be better to add a new file and leave existing tests/data in place? |
That's up to you; I saw there was a file that dealt with multiplicity, so I used it. If you want to add another file, it would suffice to use the one I attached and rename it to something more relavent, at which point I'll update the PR. |
Okay, data published as an asset here: https://github.com/dcmjs-org/data/releases/tag/multiplicity2 |
yarn run format
the JSON standard represents PeopleNames as an object, rather than a delineated string. this caused SRs derived from metadata to have [object Object] for the patient name, notable in Google's Healthcare dicom store, which seems to pick a random patient name when searching for series. much of the change is attempting to communicate this union somewhat opaquely, and ensuring objects with additional properties aren't destroyed when being naturalized. tested with OHIF master branch.
Since the sample dcm has image data in it, and we don't support serializing InlineBinary, I've added a simple SR derived from a TCIA frame for testing. Adds test for equality between the SR dcm and the same SR recreated from the json equivalent.
for easier testing and cleanliness
And also adds the concept of tag and value accessors to ValueRepresentation. Whenever a tag is created, it is assigned to a Proxy which allows values to be represented differently according to whether it's being output to JSON or being interacted with. This allows a user to assign a string to a PN tag, and have that tag be output as a JSON object conformant to spec.
Rebased on top of master. Should be good to go unless there are any more notes I missed! Thanks for all the feedback everyone! |
const file = await promisify(fs.readFile)(dcmPath); | ||
const dicomDict = DicomMessage.readFile(file.buffer); | ||
|
||
expect(dicomDict.dict["00101020"].Value).toEqual([1, 2]); | ||
expect(dicomDict.dict["0018100B"].Value).toEqual(["1.2", "3.4"]); | ||
}); | ||
|
||
it("Reads DICOM with PersonName multiplicity", async () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for testing this.
@wayfarer3130 is this ready to merge? Sounds like it's needed for OHIF/Viewers#3739 |
Looks like good to go |
I had approved the PR, yes, I think this is good to go. |
🎉 This PR is included in version 0.29.11 🎉 The release is available on: Your semantic-release bot 📦🚀 |
the JSON standard represents PeopleNames as an object, rather than a delineated string. this caused SRs derived from metadata to have [object Object] for the patient name, notable in Google's Healthcare dicom store, which seems to pick a random patient name when searching for series. much of the change is attempting to communicate this union somewhat opaquely, and ensuring objects with additional properties aren't destroyed when being naturalized. tested with OHIF master branch.