Images that are in the wrong folder when uploading via ingestion client #109
Comments
@ThomasAlscher1991 @bhsi-snm I am unable to open the tifs to look at the images, and the jpegs don't seem to be there in the folder. Could we generate the jpegs for these folders and put them on the N drive? |
The automatic script stopped somehow. But I restarted it, so most folders are now converted to JPEGs again. If there are still some missing, wait until tomorrow, then the whole backlog is cleaned up. |
I've seen the folders in WORKPIOF0001: |
I noticed that different people were uploading on those dates. I asked the digitisers on Slack on 12th July whether anyone remembered any issues from that time, but got no response. Given that several people were involved, it is strange that the issues are clustered in time. Could this have been a bug? However, it was so long ago that no one will be able to remember, so I think we need to close this issue. Is there a way to schedule checks so that we are informed more quickly if this happens again and can investigate while things are fresh in people's minds? @bhsi-snm
In the meantime, these folders in WORKHERB0001 should all be in WORKPIOF0001:
23 Jan with 177 files
26 Jan with 358 files
@k-zamzam @ThomasAlscher1991 Are you able to "fix" this metadata and move the files to the correct folder? |
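For the file move requested above, a dry-run-first approach may be the safest. The following is a minimal sketch, not the team's actual procedure. It assumes a layout of `<root>/<workstation>/<date folder>/<files>`, which is inferred from this thread and not confirmed. The helper names `plan_moves` and `apply_moves` are hypothetical, and the sketch does not touch the metadata, which would need to be corrected separately.

```python
import shutil
from pathlib import Path


def plan_moves(root, wrong_ws, right_ws, date_folders, names=None):
    """Return (source, destination) pairs for files in the given date folders.

    Assumed layout (not confirmed): <root>/<workstation>/<date folder>/<file>.
    If `names` is given, only files with those names are planned.
    """
    root = Path(root)
    moves = []
    for date in date_folders:
        src_dir = root / wrong_ws / date
        if not src_dir.is_dir():
            continue  # no such date folder under the wrong workstation
        for src in sorted(p for p in src_dir.iterdir() if p.is_file()):
            if names is not None and src.name not in names:
                continue
            moves.append((src, root / right_ws / date / src.name))
    return moves


def apply_moves(moves, dry_run=True):
    """Report each move and, unless dry_run is set, carry it out.

    Existing destination files are never overwritten; they are skipped.
    Returns the list of moves that were (or would be) performed.
    """
    done = []
    for src, dst in moves:
        if dst.exists():
            print(f"SKIP (destination exists): {dst}")
            continue
        print(f"{'WOULD MOVE' if dry_run else 'MOVE'}: {src} -> {dst}")
        if not dry_run:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(dst))
        done.append((src, dst))
    return done
```

A possible use would be `apply_moves(plan_moves(root, "WORKHERB0001", "WORKPIOF0001", ["2024-1-23", "2024-1-26"]))`, reviewing the printed plan, and then repeating the call with `dry_run=False`. If a date folder also holds images that genuinely belong to the herbarium workstation, pass `names` so that only the misplaced files are moved.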
I looked into some JPEG folders to gather data for some tests and found pinned insects in WORKHERB0001 as late as 2024-7-24. |
@PipBrewer This is just one example of how things can go wrong. To address this issue completely, we need to have quality control in place. This means identifying the possible places things can go wrong and documenting them. Once we have that in place, we can then explore various ways to correct them.
Correct faulty uploads
This section contains a guideline for correcting errors that have been encountered so far.
General advice: Run these corrections when no other uploads are being processed.
NOTE: Check if the pipeline has been assigned incorrectly, too.
In case images have been uploaded to the wrong folder in your MEDIA_URL_base, go through this guide.
Check if it was a server-side error (unlikely) |
To sum up, the processing-related issues encountered while running QA of image content, for all images taken to date (up to and including week 32) at all SNM workstations, fall into the following categories: You can see visuals of these issues here |
I continue to find folders containing specimens that do not correspond, even in week 35. I have communicated with the Digitizers at NHMD and NHMA about this to ask them to be vigilant when selecting fields in the Ingestion Client. |
As discussed in the IT team meeting on 26/08/2024, renaming GUIDs is low priority. I wonder whether there is a way to do all of this smartly and in bulk (at least at folder level). It may be worth discussing this with Allison to see if it is something she can take on in the future (once she is up and running with other things, which have priority). Added this to the data board. @chelseagraham @bhsi-snm Has either of you listed, in a document somewhere, all of the incorrect ones found and what they should be? |
I've included these folders in each of my QA reports on GitHub / N:/ and I've tagged Bhupjit to make him aware of each instance. |
@chelseagraham @bhsi-snm @beckerah That will be hard to work with (having things in multiple reports/tickets). We need a way to consolidate the info and document what has been done about it, as it is a multistep process. We also need to monitor the reasons for these issues and how frequently they occur, so a single place is needed.
We also need to identify and isolate them as soon as possible, BEFORE they go through the image processing pipeline. |
@chelseagraham @PipBrewer @bhsi-snm
Maybe having a log like this could kill two birds with one stone: help us see when something's gone awry with ingestion, and also locate the images that match the barcodes. Thoughts? |
For now, it seems like a good solution to keep track of problems. Might be annoying to keep up long term, but definitely needed now. |
I've started a spreadsheet on the N drive, in DaSSCo\Data. It's currently named trackingBarcodesAndImages.ods. Feel free to take a look and offer feedback. Hopefully this will help us keep better track of things in the interim, at least while we're waiting on getting the barcodes added to the image metadata. |
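One way to keep such a tracking log consistent is to append records through a small script instead of editing by hand. The sketch below is only an illustration under stated assumptions: it writes a CSV file, not the .ods spreadsheet mentioned above, and the column names in `FIELDS` and the helper names `log_issue` and `open_issues_by_folder` are hypothetical, not the columns of trackingBarcodesAndImages.ods.

```python
import csv
from collections import Counter
from pathlib import Path

# Hypothetical columns; adjust to whatever the real tracking sheet uses.
FIELDS = ["date_folder", "found_in", "should_be_in", "file_count", "status", "notes"]


def log_issue(log_path, **row):
    """Append one misplaced-folder record, writing the header on first use."""
    log_path = Path(log_path)
    is_new = not log_path.exists()
    with log_path.open("a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        # Missing columns are written as empty strings.
        writer.writerow({field: row.get(field, "") for field in FIELDS})


def open_issues_by_folder(log_path):
    """Count records not marked 'fixed', grouped by the folder they were found in."""
    with Path(log_path).open(newline="", encoding="utf-8") as fh:
        return Counter(
            rec["found_in"] for rec in csv.DictReader(fh) if rec["status"] != "fixed"
        )
```

For example, `log_issue("misplaced.csv", date_folder="2024-1-23", found_in="WORKHERB0001", should_be_in="WORKPIOF0001", file_count=177, status="open")` would record one of the folders reported in this issue, and `open_issues_by_folder("misplaced.csv")` would then show how many unresolved entries each workstation folder has.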
When uploading images via the ingestion client from the pinned insect workstation in room 411, some of the images ended up in the wrong workstation folder on the N drive, namely WORKHERB0001.
In total, 2019 images are in the wrong folder (they ended up in WORKHERB0001 but should be in WORKPIOF0001).
The folders which have the wrong images (pinned insects in herbarium): 2024-1-19, 2024-1-23, 2024-1-26
This is probably a case of the wrong folder being selected when uploading the files, but it could be something else. This happened back in January.
As far as I know, there are 2 error reports of the ingestion client misbehaving, but we haven't been able to replicate or pinpoint the error.
The dates of this error do not coincide with the errors reported on Slack, which were on 29th May and 28th June.
It might be a good idea to investigate it further to see if it is just a human error or something else.