Create directory of example CIFs #161

Open
wants to merge 7 commits into
base: master


rowlesmr
Collaborator

From COMCIFS/cif_core#430 (comment)

Full CIF file examples of various concepts.

Initial commit for QPA by external standard.

@rowlesmr
Collaborator Author

rowlesmr commented Jun 25, 2023

From @vaitkus

As a side-note, I noticed that some items from the powder dictionary have values that violate the current restrictions imposed by that dictionary. I will give a few examples here of the offending items, but I think we should raise a PR with this example file in the powder dictionary repository and continue the full discussion there. Note, I do not claim that the values are bad, just that they do not fit with the current dictionary definitions which may be too limiting or incorrect:

  • The DIFFRN_RADIATION_WAVELENGTH loop has 4 item and 3 value columns (the _diffrn_radiation_wavelength.type values are missing).
  • The value of _exptl_absorpt_coefficient_mu is often given with standard uncertainties (using the parenthesis notation, e.g. 17.7460(14)), although in the dictionary this item is currently defined as a Number and thus cannot have standard uncertainties. However, I find this a bit strange, since according to the human-readable definition it is calculated from other measurand items and could thus potentially have an SU. Any thoughts on this?
  • Data item _pd_meas.scan_method value 'scan' is not one of the currently known enumeration values for this item. Should it be included?
  • Data item _pd_calc.component_intensities_total does not seem to be currently defined in the dictionary.

@rowlesmr
Collaborator Author

rowlesmr commented Jun 25, 2023

The DIFFRN_RADIATION_WAVELENGTH loop has 4 item and 3 value columns

I just forgot to put the values back. I was copying from a Cu loop, and this is a Co loop.

Data item _pd_meas.scan_method value 'scan'

It should have been 'cont'.

Data item _pd_calc.component_intensities_total does not seem to be currently defined in the dictionary

It's currently hiding in a PR (#155)

The value of _exptl_absorpt_coefficient_mu

This should definitely have an SU associated with it. All three of the things mentioned in the description are able to be refined: _exptl_crystal.density_diffrn, _atom_site.occupancy, and _diffrn_radiation_wavelength.value are all Measurand. I can do a PR on this.
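
For anyone checking such values by hand, here is a minimal sketch of how the parenthesis notation quoted above (e.g. 17.7460(14)) could be split into a value and its SU. The function name `parse_su` and the regular expression are assumptions made for illustration, not part of any CIF library, and exponent notation is deliberately left out.

```python
import re
from decimal import Decimal

# Plain decimal number with an optional SU in parentheses, e.g. "17.7460(14)".
# Exponent notation (e.g. 1.2e3(4)) is not handled in this sketch.
_SU_PATTERN = re.compile(r"^([+-]?\d*\.?\d+)(?:\((\d+)\))?$")


def parse_su(text):
    """Return (value, su) as Decimals; su is None when no parentheses are given."""
    match = _SU_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised numeric value: {text!r}")
    number, su_digits = match.groups()
    value = Decimal(number)
    if su_digits is None:
        return value, None
    # The SU digits refer to the last quoted decimal places of the value,
    # so they are scaled by the value's own exponent (-4 for "17.7460").
    exponent = value.as_tuple().exponent
    su = Decimal(int(su_digits)).scaleb(exponent)
    return value, su


value, su = parse_su("17.7460(14)")
print(value, su)
```

A validator that treats the item as a plain Number would reject the first form, while one that treats it as a Measurand would accept both the bare value and the value with an SU.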

@vaitkus
Collaborator

vaitkus commented Jun 26, 2023

Could the directory be renamed from Examples to examples to match the name in other dictionary repositories?

@vaitkus
Collaborator

vaitkus commented Jun 26, 2023

Ok, I have a few more technical questions about the example:

  • Data block DIFFRACTOGRAM_0020 contains a loop with the _pd_meas.2theta_scan, _pd_proc.ls_weight and other data items, but no point id data item, e.g. _pd_meas.point_id. Is this allowed?
  • Some loops contain data items from the PD_PROC, PD_CALC and PD_MEAS categories. However, I guess that this is technically allowed since they are all children of a looped PD_DATA category? I might need to slightly update the validator because currently it only allows parent-child combined loops, but not sibling-combined loops.
  • The _pd_qpa_external_std.diffractogram_id data item is linked to the _pd_diffractogram.id data item. In the most basic case this means that the _pd_qpa_external_std.diffractogram_id data item will have a value that matches one of the values of the _pd_diffractogram.id data item in the same data block. However, since these are powder diffraction files, I guess that the multi-block interpretation starts being applied here and the _pd_diffractogram.id values are checked across all data blocks? Should this be somehow marked in the example file (e.g. by setting the appropriate _audit.schema value or including items from the AUDIT_CONFORM category)?

@rowlesmr
Collaborator Author

rowlesmr commented Jun 26, 2023

Data block DIFFRACTOGRAM_0020 contains a loop with the _pd_meas.2theta_scan, _pd_proc.ls_weight and other data items, but no point id data item, e.g. _pd_meas.point_id. Is this allowed?

Technically no. I could have sworn I added it in... I'll add it in; my auto-TOPAS output doesn't include a point id.

Some loops contain data items from the PD_PROC, PD_CALC and PD_MEAS categories. However, I guess that this is technically allowed since they are all children of a looped PD_DATA category?

I'm assuming this is

loop_
	_pd_meas.2theta_scan
	_pd_meas.counts_total
	_pd_proc.ls_weight
	_pd_calc.intensity_total
	_pd_proc.intensity_bkg_calc
	_pd_calc.component_intensities_total

Yes, this is the intent behind having PD_PROC, PD_CALC, and PD_MEAS all as children of PD_DATA. If you have some combination of measured, processed, and/or calculated data where there is a one-to-one correspondence between data points (i.e. the same _pd_data.point_id value), then it makes sense (and saves space) to put them together in the same loop. Each row describes the same point.
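
As a rough illustration of what a sibling-combined loop check might look like on the validator side, here is a minimal sketch. The `PARENT_CATEGORY` map is hard-coded and hypothetical (a real validator would read the category hierarchy from the dictionary), and it only looks one level up.

```python
# Hypothetical, hard-coded parent map for illustration only; a real
# validator would derive this from the dictionary definitions.
PARENT_CATEGORY = {
    "pd_meas": "pd_data",
    "pd_proc": "pd_data",
    "pd_calc": "pd_data",
}


def category_of(tag):
    """Return the category part of a tag such as '_pd_meas.2theta_scan'."""
    return tag.lstrip("_").split(".", 1)[0].lower()


def loop_root_categories(tags):
    """Map each tag's category to its parent (or itself) and collect the distinct roots."""
    roots = set()
    for tag in tags:
        category = category_of(tag)
        roots.add(PARENT_CATEGORY.get(category, category))
    return roots


def is_joinable_loop(tags):
    """True when every item in the loop resolves to one shared root category."""
    return len(loop_root_categories(tags)) == 1


loop_tags = [
    "_pd_meas.2theta_scan",
    "_pd_meas.counts_total",
    "_pd_proc.ls_weight",
    "_pd_calc.intensity_total",
    "_pd_proc.intensity_bkg_calc",
    "_pd_calc.component_intensities_total",
]
print(is_joinable_loop(loop_tags))
```

Resolving every item to its root category, rather than comparing items pairwise, treats parent-child and sibling combinations the same way.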

I guess that multi block interpretation starts being applied here and the _pd_diffractogram.id values are checked across all data blocks?

Yes. Powder experiments are almost always going to be multi-block, and looking for id values is kind of governed by https://github.com/COMCIFS/comcifs.github.io/blob/master/accepted/multi-block-principles.md; I've never actually used CIF in a neat one-block-is-one-experiment/structure way. I don't know how to properly denote that using _audit* data items. This also brushes up against https://github.com/COMCIFS/comcifs.github.io/blob/master/draft/block_collections.md.

What I am wanting to say with _pd_qpa_external_std.diffractogram_id SRM676A is "When I quantified the current diffractogram (DIFFRACTOGRAM_0020), I used the information from the diffractogram identified as SRM676A. When I go there, I find values of k_factor and MAC which I use in my calculations."

Is there a better way to say that? Maybe _pd_qpa_external_std.ref_diffractogram_id, which is Encode, and not Link?
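
To make the multi-block lookup concrete, here is a minimal sketch of resolving a linked id against all data blocks rather than only the current one. The in-memory `blocks` structure, the block name `STANDARD_BLOCK` and the function name are assumptions for illustration; they are not taken from the example file or from any existing validator.

```python
def find_unresolved_links(blocks, child_tag, parent_tag):
    """Return (block_name, value) pairs whose child value matches no parent value in any block.

    `blocks` maps a data block name to a dict of tag -> list of values.
    """
    # Collect the parent ids from every block first (multi-block interpretation).
    known_ids = set()
    for items in blocks.values():
        known_ids.update(items.get(parent_tag, []))

    unresolved = []
    for name, items in blocks.items():
        for value in items.get(child_tag, []):
            if value not in known_ids:
                unresolved.append((name, value))
    return unresolved


# Hypothetical, heavily trimmed stand-in for a multi-block powder CIF.
blocks = {
    "STANDARD_BLOCK": {"_pd_diffractogram.id": ["SRM676A"]},
    "DIFFRACTOGRAM_0020": {
        "_pd_diffractogram.id": ["DIFFRACTOGRAM_0020"],
        "_pd_qpa_external_std.diffractogram_id": ["SRM676A"],
    },
}

print(find_unresolved_links(
    blocks, "_pd_qpa_external_std.diffractogram_id", "_pd_diffractogram.id"
))
```

Running the same check on the DIFFRACTOGRAM_0020 block alone would report SRM676A as unresolved, which is the single-block behaviour the question is about.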

rowlesmr added 4 commits June 26, 2023 14:05
it isn't necessary; _pd_meas.counts_total has an SU of sqrt(count), and therefore the default weight of 1/SU^2 is good. No specialist weight scheme was used.
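
As a worked illustration of that default, here is a minimal sketch, assuming SU = sqrt(count) for every point, so that the weight 1/SU^2 reduces to 1/count. The function name `default_weights` is hypothetical.

```python
import math


def default_weights(counts):
    """Weights 1/SU^2 with SU = sqrt(count), i.e. 1/count for count > 0."""
    weights = []
    for count in counts:
        if count <= 0:
            # sqrt(count) gives a zero or undefined SU, so no finite weight exists.
            raise ValueError("no finite default weight for count <= 0")
        su = math.sqrt(count)
        weights.append(1.0 / su**2)
    return weights


print(default_weights([100, 400, 2500]))
```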
@rowlesmr
Collaborator Author

If you want to have a look at some other pdCIFs I made (before I really knew what I was doing), check out https://journals.iucr.org/j/issues/2022/03/00/yr5087/

@rowlesmr
Collaborator Author

  • Data block DIFFRACTOGRAM_0020 contains a loop with the _pd_meas.2theta_scan, _pd_proc.ls_weight and other data items, but no point id data item, e.g. _pd_meas.point_id. Is this allowed?

You could make a rule that if there is one loop in a block with one diff_id, then you could autogenerate the point ids. But making too many exceptions isn't really a good thing.
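
A minimal sketch of what such autogeneration could look like, assuming the caller has already confirmed that the block holds exactly one such loop and one diffractogram id. The function name `add_point_ids` and the choice of sequential integers starting at 1 are assumptions, not an agreed rule.

```python
def add_point_ids(loop_tags, rows, id_tag="_pd_data.point_id"):
    """Return (tags, rows) with a sequential id column prepended if it is absent."""
    if id_tag in loop_tags:
        # Nothing to generate; return copies so the inputs stay untouched.
        return list(loop_tags), [list(row) for row in rows]
    new_tags = [id_tag] + list(loop_tags)
    new_rows = [[str(i)] + list(row) for i, row in enumerate(rows, start=1)]
    return new_tags, new_rows


# Made-up values, for illustration only.
tags, rows = add_point_ids(
    ["_pd_meas.2theta_scan", "_pd_meas.counts_total"],
    [["5.00", "120"], ["5.02", "131"]],
)
print(tags)
print(rows)
```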
