Interim Data Workflow Steps for AU #567

Closed
beckerah opened this issue Jan 15, 2025 · 2 comments
Assignees: beckerah
Labels:
  • AU (For tasks relating specifically to AU)
  • data (Tasks or issues related to dassco-data workflows)
  • post-processing (Specimen table export preparation for import using OpenRefine)

Comments

@beckerah
Contributor

(Copied from #550)
At least temporarily, the data workflow for AU should include:

  • Cleaning up folder images that were manually ingested to the N drive
  • Pulling barcodes and image names from the output.json file and comparing them with the species-web data to find folder images that should be deleted
  • Cleaning up the GUID database on the ingestion server
  • Checking the daily stats spreadsheet maintained by the digitizers to compare counts between the species-web database and the images on the N drive
  • Checking the spreadsheet of specimen images incorrectly identified as folders, and providing the original image names so digitizers can re-image as needed
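
The barcode comparison step could be sketched roughly as below. This is only an illustration under assumptions: the issue does not describe the structure of output.json or of the species-web export, so the `barcode` and `image_name` keys, the function names, and the rule "an image whose barcode has no match in species-web is a folder image candidate" are all hypothetical. The result is meant as a list for manual review, not an automatic delete list.

```python
import json
from pathlib import Path


def load_output_records(path):
    """Read output.json.

    ASSUMPTION: the file holds a JSON list of objects, each with
    'barcode' and 'image_name' keys. The real layout may differ.
    """
    with Path(path).open(encoding="utf-8") as fh:
        return json.load(fh)


def find_folder_image_candidates(records, specimen_barcodes):
    """Return sorted image names whose barcode is not in the species-web data.

    ASSUMPTION: an image with a missing or unmatched barcode is treated
    as a possible folder image and flagged for manual review.
    """
    known = {str(b).strip() for b in specimen_barcodes}
    return sorted(
        rec["image_name"]
        for rec in records
        if str(rec.get("barcode", "")).strip() not in known
    )


# Small in-memory example with made-up values.
sample_records = [
    {"barcode": "AU001", "image_name": "img_001.tif"},
    {"barcode": "", "image_name": "img_002.tif"},
    {"barcode": "AU003", "image_name": "img_003.tif"},
]
sample_species_web_barcodes = ["AU001", "AU003"]

candidates = find_folder_image_candidates(
    sample_records, sample_species_web_barcodes
)
print(candidates)
```

Working with sets of barcodes keeps the comparison independent of row order in either source, which seems useful when the species-web export and the N drive contents are produced at different times.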
beckerah added the post-processing, AU, and data labels on Jan 15, 2025
beckerah self-assigned this on Jan 15, 2025
@beckerah
Contributor Author

The spreadsheets that should be maintained by both me and Birgitte/Charlotte are listed in the simplified workflow doc here: #39

Thoughts about folder images on the N drive are here: #555

@beckerah
Contributor Author

Re-working this workflow as we're temporarily storing images on ucloud. Once we're no longer doing that, I'm hoping to be able to pull the folder images out before ingestion, using the output.json contents. Follow this here: #16
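
One way the "pull the folder images out before ingestion" idea might look, as a rough sketch only: given a list of folder image names (however it ends up being derived from output.json), move those files out of the staging directory into a holding directory so they never reach ingestion. The function name, the directory names, and the choice to move rather than delete are assumptions made for illustration.

```python
import shutil
from pathlib import Path


def set_aside_folder_images(staging_dir, folder_image_names, holding_dir):
    """Move the named files from staging_dir into holding_dir.

    Returns the names that were actually moved. Names that are not
    present in staging_dir are skipped rather than treated as errors.
    ASSUMPTION: files are moved, not deleted, so they can be restored
    if an image was flagged by mistake.
    """
    staging = Path(staging_dir)
    holding = Path(holding_dir)
    holding.mkdir(parents=True, exist_ok=True)

    moved = []
    for name in sorted(set(folder_image_names)):
        source = staging / name
        if source.is_file():
            shutil.move(str(source), str(holding / name))
            moved.append(name)
    return moved
```

A call such as `set_aside_folder_images("staging", ["img_002.tif"], "folder_images_hold")` would leave every other file in `staging` untouched; both directory names here are placeholders.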
