Review and refinement of export folder structure #493
Comments
20240321: An initial mock-up of a new export folder structure was made and sent to @PipBrewer for feedback. Mock-up version1:
20240402: A new version of the mock-up has been made based on feedback from Pip. Mock-up version2:
The focus of the new folder structure is to make sure that a file from each step of the process is saved to the archive: the original file, the checked/corrected file, the processed file, and the completely processed file imported to Specify. We suggest incorporating some automated processes, through monitoring scripts, to make sure that these copies are actually saved. The files in the current folders will be looked through and sorted before any new folder structure is implemented.
@AstridBVW This looks good to me. One question: will it be possible to determine which processing stage the files in the archive were at by looking at the file names?
@PipBrewer Yes, based on the suffix being "original", "checked", "processed", "forSpecify", or "imported".
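To illustrate the answer above, here is a minimal Python sketch of reading the stage from a file name. It assumes the file name is split on underscores and that the stage is the last part matching one of the five suffixes; the function name `stage_from_filename` is hypothetical and not part of the Mass-Digitizer code.

```python
from pathlib import Path
from typing import Optional

# Stage suffixes as listed in the comment above.
STAGES = ("original", "checked", "processed", "forSpecify", "imported")


def stage_from_filename(filename: str) -> Optional[str]:
    """Return the processing stage found in the file name, or None.

    Assumption: name parts are separated by underscores, and the last
    part that matches a known suffix is the stage.
    """
    parts = Path(filename).stem.split("_")
    # Scan from the end so earlier parts of the name cannot be mistaken
    # for the stage.
    for part in reversed(parts):
        if part in STAGES:
            return part
    return None


print(stage_from_filename("NHMD_PinnedInsects_20240129_14_36_ABW_checked.csv"))
```

A name such as NHMD_PinnedInsects_20240126_15_30_MJG_checked_corrected.csv is also reported as "checked" under this assumption, because "corrected" is not one of the listed suffixes.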
The new folder structure was implemented on 20240422. The files in the old folders were looked through and sorted. One folder was left over with files that needed to be checked further ("Left_over_mess"). The files in this folder have now been checked, and the folder has been deleted.
Part of the new folder structure is a folder for a monitoring script. The monitoring script is not currently implemented. The folder remains, and we are working around it for the time being (i.e. we are not using it).
During the checking of the "Left_over_mess" folder, a file was discovered that had not been imported (NHMD_PinnedInsects_20240129_14_36_ABW_checked.csv). A copy of the file was found in the Archive but nowhere else, and it had not been post-processed. On closer inspection, it seemed that other, older files in the Archive folder might also not have been imported/post-processed. Further investigation showed that these three export files had also not been post-processed or imported: NHMD_PinnedInsects_20240126_15_30_MJG_checked_corrected.csv
They have now been moved to the "ReadyForOpenRefine" folder to be post-processed.
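A check like the one done by hand above could be sketched roughly as follows. This is only an illustration in Python, not the planned monitoring script: it assumes that every copy of an export keeps the same base name and differs only in its stage suffix, and the function names are hypothetical.

```python
from collections import defaultdict
from pathlib import Path
from typing import Iterable, List, Optional, Tuple

# Stage suffixes used in the export file names.
STAGES = ("original", "checked", "processed", "forSpecify", "imported")


def split_base_and_stage(filename: str) -> Tuple[str, Optional[str]]:
    """Split a file name into (base name, stage); stage is None if absent."""
    stem = Path(filename).stem
    parts = stem.split("_")
    # Look for the last underscore-separated part that is a known stage.
    for i in range(len(parts) - 1, -1, -1):
        if parts[i] in STAGES:
            return "_".join(parts[:i]), parts[i]
    return stem, None


def exports_missing_stage(filenames: Iterable[str], required: str = "imported") -> List[str]:
    """Return base names that have staged copies but none at `required`."""
    stages_by_base = defaultdict(set)
    for name in filenames:
        base, stage = split_base_and_stage(name)
        if stage is not None:
            stages_by_base[base].add(stage)
    return sorted(b for b, stages in stages_by_base.items() if required not in stages)
```

It could be run over the Archive folder with something like `exports_missing_stage(p.name for p in archive_dir.glob("*.csv"))`, where `archive_dir` is a `Path` to the Archive folder. Any base name it returns would still need a manual check before the file is moved to "ReadyForOpenRefine".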
The documentation for the import protocol has now been updated: https://github.com/NHMDenmark/Mass-Digitizer/blob/main/documentation/import_protocol_postProcessing.md
What is the issue?
The current folder structure for export files (import protocol) on the N-drive needs to be reviewed and refined.
Description
The current folder structure is difficult to understand and navigate, and the documentation for the import protocol is hard to follow. The folder structure needs to be more intuitive. It could also be beneficial to include some automated processes to eliminate some of the human errors that occur when multiple people are working within the same directory.
The folder structure can be found here:
N:\SCI-SNM-DigitalCollections\DaSSCo\DigiApp\Data
The current folder structure looks like this:
Why is it needed/relevant?
The scale of DaSSCo is steadily growing, and the processes/workflows are getting more complex. Also, more people are becoming involved in the processing of DaSSCo data. We need the folder structure/import protocol to accommodate this and not be a cause of issues/confusion.
Estimate level of effort required.
Hard
What is the expected acceptable result?
An export folder structure that accommodates DaSSCo in the best way possible and is easy to adapt to any future needs of DaSSCo.
What could be the challenges?
We will need to figure out how best to implement automated processes as part of the folder structure, and whether they are even beneficial at this stage.
What tests are required?
The flow of the new folder structure needs to be tested to see if it is intuitive enough/easy enough to navigate. Any automated processes need to be tested thoroughly.
What documentation is required?
The documentation file "import_protocol_postProcessing.md" will need to be updated.
Associated issues
Closed issues
#454 Pre-processed exports from the app are not currently being saved
#349 Naming convention for digi app export files
#295 Write protocol for import and data validation
#489 Ensuring correct file names on Digi app exports
#461 Adding source name to the Specify Collection object table
#492 Tabular remarks - condition for setting the additional columns