Adding source name to the Specify Collection object table #461
Comments
It would be good if we could see this in Specify.
There isn't really a way for GREL to get the file name, or rather the OpenRefine project name, as far as I can see. The only way I can think of is to add it manually. I would also recommend treating this as a tabular remark field (cf. #444) so we don't occupy any customizable text fields with it.
We already have a remarks column; the new column might be 'remark_source', which can be:
Question: Should remark_date be the date that the export was made, or the date it was post-processed?
As a result of the implementation of #444 we already have a column "remark source", so I suggest you choose another name.
For the specimen-level remarks field, these fields are just prefixed "remark", so you get "remark source" and "remark date". Using the term "source" for the filename of the data is actually confusing here; maybe it's better to use "datafile". So that means the following column names;
@bhsi-snm Do you approve of this proposal?
Since we have code ready for monitoring a directory, I could extend this to add "datafile_source" and "datafile_date" to the CSV export. This circumvents OpenRefine.
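To make the proposal concrete, here is a minimal sketch of what that step could look like. It is not the actual monitoring script: the function name `add_datafile_columns` and the `export_date` parameter are hypothetical, and it assumes that "datafile_source" holds the bare file name and that "datafile_date" is passed in by the caller, since which date to use is still an open question in this thread.

```python
import csv
from datetime import date
from pathlib import Path


def add_datafile_columns(src: Path, dst: Path, export_date: date) -> None:
    """Copy src to dst, appending datafile_source and datafile_date to each row.

    Hypothetical helper: the column semantics are assumptions, not the
    behaviour of the real monitoring script.
    """
    with src.open(newline="", encoding="utf-8") as fin, \
            dst.open("w", newline="", encoding="utf-8") as fout:
        reader = csv.DictReader(fin)
        # Keep the original columns in order and add the two new ones last.
        fieldnames = list(reader.fieldnames or []) + [
            "datafile_source",
            "datafile_date",
        ]
        writer = csv.DictWriter(fout, fieldnames=fieldnames)
        writer.writeheader()
        for row in reader:
            row["datafile_source"] = src.name  # file name only, no directory
            row["datafile_date"] = export_date.isoformat()
            writer.writerow(row)
```

Writing to a separate output file leaves the original export untouched, so it can still be compared against the archived copy afterwards.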
The "datafile_source", "datafile_date" and "datafile_remarks" columns for the tabular remarks have been added through the monitoring script.
See issue #492 on conditionally adding values in the remarks columns.
The monitoring script was not entirely implemented before Jan left, so adding these columns has been made part of the post-processing GREL script instead (ticket #506).
What is the issue?
It would help in debugging and 'housekeeping' if the imported Digi app records had their original source file name attached in a separate field:
Source
"NHMD_PinnedInsects_20231121_16_16_SS_original.csv"
Detailed description of the issue.
If there is a discrepancy between imported records in Specify and what is in the 4.Archive directory, then having the source path would be a massive help.
Why is it needed/relevant?
We gain a certain amount of future proofing in that it addresses issues like the one above and anticipates unforeseen problems.
Give scenario(s) of why and when this could be relevant.
If a curator discovers something in Specify that is a little off the mark, we can go all the way back to the source to investigate. We have already agreed that the post-processing GREL scripts should have their own version, as they evolve with business needs.
Adding a source field ties neatly into this as it makes forensics much easier.
Estimate level of effort required.
easy
What could be the challenges?
There does not seem to be a way to automatically add the file name to a column in OpenRefine. That means it has to be added manually in the OpenRefine interface, which is a trivial task.
What documentation is required?
The documentation file "import_protocol_postProcessing.md" will need to be updated.