Commit

Juranic/constrain file patterns (#1388)
* Create merfish-v2.4.yaml

Constrain all file patterns to end of line

* Update Histology directory schema

Constrain all file patterns to end of line

* Update IMC2D directory schema

Constrain all file patterns to end of line

* Update DESI directory schema

Constrain all file patterns to end of line

* Update MALDI directory schema

Constrain all file patterns to end of line

* Update MIBI directory schema

Constrain file patterns to end of line

* Update SIMS directory schema

Constrain file patterns to end of line

* Update LC-MS directory schema

Constrain file patterns to end of line

* Update CODEX directory schema

Constrain file patterns to end of line

* Update Cell DIVE directory schema

Constrain file patterns to end of line

* Update Phenocycler directory schema

Constrain file patterns to end of line

* Update 10x Multiome directory schema

Constrain file patterns to end of line

* Update ATACseq directory schema

Constrain file patterns to end of line

* Update MUSIC directory schema

Constrain file patterns to end of line

* Update RNAseq directory schema

Constrain file patterns to end of line

* Update RNAseq with probes directory schema

Constrain all file patterns to end of line

* Update SNAREseq2 directory schema

Constrain file patterns to end of line

* Update Auto-fluorescence directory schema

Constrain file patterns to end of line

* Update Confocal directory schema

Constrain file patterns to end of line

* Update Enhanced SRS directory schema

Constrain file patterns to end of line

* Update Light Sheet directory schema

Constrain file patterns to end of line

* Update Second Harmonic Generation directory schema

Constrain all file patterns to end of line

* Update Thick Section Multiphoton directory schema

Constrain all file patterns to end of line

* Update CosMx directory schema

Constrain all file patterns to end of line.

* Update GeoMx directory schema

Constrain all file patterns to end of line.

* Update HiFi-Slide directory schema

Constrain all file patterns to end of line.

* Update Visium no probes directory schema

Constrain all file patterns to end of line.

* Update Visium with probes

Constrain all file patterns to end of line.

* Update Xenium directory schema

Constrain all file patterns to end of line.

* Update Segmentation Mask directory schema

Constrain all file patterns to end of line.

* Update Publication directory schema

Constrain all file patterns to end of line.

* Update Thick Section Multiphoton MxIF directory schema

Remove incorrect file path

* Update thick-section-multiphoton-mxif-v2.1.yaml

Match existing schema

* Update visium-with-probes-v3.4.yaml

Added a file end of line constraint

* Update CHANGELOG.md

* General: Update docs

* General: Update docs

* General: Update docs

* General: Update docs

---------

Co-authored-by: Juan Puerto <=>
j-uranic authored Jan 17, 2025
1 parent b9be405 commit 31f9bbd
Showing 64 changed files with 2,731 additions and 36 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -2,6 +2,7 @@
## v0.0.30 (in progress)
- Update Seg Mask documentation
- Update CosMx directory schema
- Constrain file patterns to end of line for all published directory schemas

## v0.0.29
- Add CosMX directory schema
16 changes: 15 additions & 1 deletion docs/10x-multiome/current/index.md
@@ -28,7 +28,21 @@ REQUIRED - For this assay, you must also prepare and submit two additional metad
<br>

## Directory schemas
<summary><b>Version 2.0 (use this one)</b></summary>
<summary><b>Version 2.1 (use this one)</b></summary>

| pattern | required? | description |
| --- | --- | --- |
| <code>extras\/.*</code> || Folder for general lab-specific files related to the dataset. |
| <code>extras\/expected_cell_count\.txt$</code> | | The expected cell count for the RNA sequencing dataset. This is an optional file that, if present, will be used by the HIVE's RNA sequencing analysis pipeline. With some datasets, knowing the expected cell count has improved the output of the HIVE analysis pipeline. |
| <code>raw\/.*</code> || All raw data files for the experiment. |
| <code>raw\/fastq\/.*</code> || Raw sequencing files for the experiment. |
| <code>raw\/fastq\/RNA\/.*</code> || Directory containing fastq files pertaining to RNAseq sequencing. |
| <code>raw\/fastq\/RNA\/[^\/]+_R[^\/]+\.fastq\.gz$</code> || This is a GZip'd version of the forward and reverse fastq files from RNAseq sequencing (R1 and R2). |
| <code>raw\/fastq\/ATAC\/.*</code> || Directory containing fastq files pertaining to ATACseq sequencing. |
| <code>raw\/fastq\/ATAC\/[^\/]+_R[^\/]+\.fastq\.gz$</code> || This is a GZip'd version of the fastq files containing the forward, reverse and barcode reads from ATACseq sequencing (R1, R2 and R3). Further, if the barcodes are in R3 (as with 10X) then the metadata field "barcode reads" would be set to "Read 2 (R2)" and the fastq file named "*_R2*fastq.gz" would be expected. |
| <code>lab_processed\/.*</code> | | Experiment files that were processed by the lab generating the data. |
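
The effect of the trailing `$` on the fastq patterns above can be sketched in Python. This is a minimal illustration, not the validator's code: it assumes the earlier pattern was the same expression without the `$`, and that patterns are applied with `re.match` (anchored at the start only). The sample paths and the `accepts` helper are hypothetical.

```python
import re

# Assumed earlier form: identical expression, no end-of-line anchor.
UNANCHORED = re.compile(r"raw\/fastq\/RNA\/[^\/]+_R[^\/]+\.fastq\.gz")
# Form shown in the Version 2.1 table: constrained to the end of the path.
ANCHORED = re.compile(r"raw\/fastq\/RNA\/[^\/]+_R[^\/]+\.fastq\.gz$")


def accepts(pattern: re.Pattern, path: str) -> bool:
    """Return True if the pattern matches from the start of the path."""
    return pattern.match(path) is not None


if __name__ == "__main__":
    # Hypothetical paths: a real fastq file and a stray sidecar file.
    for path in ("raw/fastq/RNA/sample_R1.fastq.gz",
                 "raw/fastq/RNA/sample_R1.fastq.gz.md5"):
        print(path, accepts(UNANCHORED, path), accepts(ANCHORED, path))
```

Under these assumptions the unanchored pattern also accepts the `.md5` sidecar, while the anchored pattern accepts only the path that ends in `.fastq.gz`.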

<summary><b>Version 2.0</b></summary>

| pattern | required? | description |
| --- | --- | --- |
27 changes: 26 additions & 1 deletion docs/af/current/index.md
@@ -28,7 +28,32 @@ This schema is for autofluorescence (AF). For an example of an AF dataset & dire
<br>

## Directory schemas
<summary><b>Version 2.1 (use this one)</b></summary>
<summary><b>Version 2.2 (use this one)</b></summary>

| pattern | required? | description |
| --- | --- | --- |
| <code>extras\/.*</code> || Folder for general lab-specific files related to the dataset. [Exists in all assays] |
| <code>extras\/microscope_hardware\.json$</code> || **[QA/QC]** A file generated by the micro-meta app that contains a description of the hardware components of the microscope. Email HuBMAP Consortium Help Desk <[email protected]> if help is required in generating this document. |
| <code>extras\/microscope_settings\.json$</code> | | **[QA/QC]** A file generated by the micro-meta app that contains a description of the settings that were used to acquire the image data. Email HuBMAP Consortium Help Desk <[email protected]> if help is required in generating this document. |
| <code>raw\/.*</code> || Raw data files for the experiment. |
| <code>raw\/channel_layout\.tsv$</code> | | Table that includes a dictionary for channel to moiety, which may be a protein given in an OMAP panel or captured in the ASCT+B table. |
| <code>raw\/images\/.*</code> || Raw image files. Using this subdirectory allows for harmonization with other imaging assays. [This directory must include at least one raw file.] |
| <code>raw\/images\/[^\/]+\.(?:xml&#124;nd2&#124;oir&#124;lif&#124;czi&#124;tiff&#124;qptiff)$</code> || Raw microscope file for the experiment |
| <code>lab_processed\/.*</code> || Experiment files that were processed by the lab generating the data. |
| <code>lab_processed\/images\/.*</code> || Processed image files |
| <code>lab_processed\/images\/[^\/]+\.ome\.tiff$</code> (example: <code>lab_processed/images/HBM892.MDXS.293.ome.tiff</code>) || OME-TIFF files (multichannel, multi-layered) produced by the microscopy experiment. If compressed, must use loss-less compression algorithm. See the following link for the set of fields that are required in the OME TIFF file XML header. <https://docs.google.com/spreadsheets/d/1YnmdTAA0Z9MKN3OjR3Sca8pz-LNQll91wdQoRPSP6Q4/edit#gid=0> |
| <code>lab_processed\/images\/[^\/]*ome-tiff\.channels\.csv$</code> || This file provides essential documentation pertaining to each channel of the accommpanying OME TIFF. The file should contain one row per OME TIFF channel. The required fields are detailed <https://docs.google.com/spreadsheets/d/1xEJSb0xn5C5fB3k62pj1CyHNybpt4-YtvUs5SUMS44o/edit#gid=0> |
| <code>lab_processed\/transformations\/.*</code> | | This directory contains transformation matrices that capture how each modality is aligned with the other and can be used to visualize overlays of multimodal data. This is needed to overlay images from the exact same tissue section (e.g., MALDI imaging mass spec, autofluorescence microscopy, MxIF, histological stains). In these cases data type may have different pixel sizes and slightly different orientations (i.e., one may be rotated relative to another). |
| <code>lab_processed\/transformations\/[^\/]+\.txt$</code> | | Transformation matrices used to overlay images from the exact same tissue section (e.g., MALDI imaging mass spec, autofluorescence microscopy, MxIF, histological stains). |
| <code>qa_qc\/.*</code> || Directory containing QA and/or QC information. |
| <code>qa_qc\/resolution_report\/.*</code> || Directory containing the results of resolution tests and/or vendor preventative maintenance reports. |
| <code>qa_qc\/resolution_report\/resolution\.txt$</code> | | This file summarizes the results of resolution tests or vendor reports from preventative maintenance visits. |
| <code>qa_qc\/resolution_report\/[^\/]+\.pdf$</code> | | This file is a pdf from a vendor preventative maintenance visit or resolution check tool demonstrating resolution. This file may include illumination test results. |
| <code>qa_qc\/illumination_report\/.*</code> || Directory containing the results of illumination tests and/or vendor preventative maintenance reports. |
| <code>qa_qc\/illumination_report\/illumination.txt$</code> | | This file summarizes the results of illumination tests or vendor reports from preventative maintenance visits. |
| <code>qa_qc\/illumination_report\/[^\/]+\.pdf$</code> | | This file is a pdf from a vendor preventative maintenance visit or illumination check tool demonstrating illumination intensity. |
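
In the rendered table the alternation bars appear as `&#124;`, because a literal `|` would end the Markdown table cell. A minimal sketch of recovering the regular expression from the documented cell and testing it follows. It assumes `re.match` semantics, and the `is_raw_image` helper and sample paths are hypothetical.

```python
import html
import re

# Pattern text exactly as it appears in the documentation table cell.
DOC_CELL = (r"raw\/images\/[^\/]+\.(?:xml&#124;nd2&#124;oir&#124;lif"
            r"&#124;czi&#124;tiff&#124;qptiff)$")

# html.unescape turns each "&#124;" back into "|", restoring the alternation.
PATTERN = re.compile(html.unescape(DOC_CELL))


def is_raw_image(path: str) -> bool:
    """Return True if the path matches the raw microscope file pattern."""
    return PATTERN.match(path) is not None


if __name__ == "__main__":
    # Hypothetical paths for illustration only.
    for path in ("raw/images/slide.czi",
                 "raw/images/slide.qptiff",
                 "raw/images/slide.czi.bak",
                 "raw/images/sub/slide.czi"):
        print(path, is_raw_image(path))
```

With the `$` in place, a name such as `slide.czi.bak` is rejected, and `[^\/]+` keeps the match within a single directory level.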

<summary><b>Version 2.1</b></summary>

| pattern | required? | description |
| --- | --- | --- |
13 changes: 12 additions & 1 deletion docs/atacseq/current/index.md
@@ -30,7 +30,18 @@ For additional documentation on this dataset type, please visit [here](https://d
<br>

## Directory schemas
<summary><b>Version 2.0 (use this one)</b></summary>
<summary><b>Version 2.1 (use this one)</b></summary>

| pattern | required? | description |
| --- | --- | --- |
| <code>extras\/.*</code> || Folder for general lab-specific files related to the dataset. |
| <code>raw\/.*</code> || All raw data files for the experiment. |
| <code>raw\/fastq\/.*</code> || Raw sequencing files for the experiment. |
| <code>raw\/fastq\/ATAC\/.*</code> || Directory containing fastq files pertaining to ATACseq sequencing. |
| <code>raw\/fastq\/ATAC\/[^\/]+_R[^\/]+\.fastq\.gz$</code> || This is a GZip'd version of the fastq files containing the forward, reverse and barcode reads from ATACseq sequencing (R1, R2 and R3). Further, if the barcodes are in R3 (as with 10X) then the metadata field "barcode reads" would be set to "Read 2 (R2)" and the fastq file named "*_R2*fastq.gz" would be expected. |
| <code>lab_processed\/.*</code> | | Experiment files that were processed by the lab generating the data. |

<summary><b>Version 2.0</b></summary>

| pattern | required? | description |
| --- | --- | --- |
23 changes: 22 additions & 1 deletion docs/celldive/current/index.md
@@ -28,7 +28,28 @@ Related files:
<br>

## Directory schemas
<summary><b>Version 2.0 (use this one)</b></summary>
<summary><b>Version 2.1 (use this one)</b></summary>

| pattern | required? | description |
| --- | --- | --- |
| <code>extras\/.*</code> || Folder for general lab-specific files related to the dataset. [Exists in all assays] |
| <code>extras\/microscope_hardware\.json$</code> || **[QA/QC]** A file generated by the micro-meta app that contains a description of the hardware components of the microscope. Email HuBMAP Consortium Help Desk <[email protected]> if help is required in generating this document. |
| <code>extras\/microscope_settings\.json$</code> | | **[QA/QC]** A file generated by the micro-meta app that contains a description of the settings that were used to acquire the image data. Email HuBMAP Consortium Help Desk <[email protected]> if help is required in generating this document. |
| <code>raw\/.*</code> || This is a directory containing raw data. |
| <code>raw\/images\/.*</code> || Raw image files. Using this subdirectory allows for harmonization with other more complex assays, like Visium that includes both raw imaging and sequencing data. |
| <code>raw\/images\/round_info_[^\/]+\.dat$</code> (example: <code>raw/images/round_info_002.dat</code>) || Metadata file for the capture item-value tab separated format. This contains various instrument and acquisition details for each acquisition cycle. |
| <code>lab_processed\/.*</code> || Experiment files that were processed by the lab generating the data. |
| <code>lab_processed\/images\/.*</code> || This is a directory containing processed image files |
| <code>lab_processed\/images\/region_[^\/]+\/[^\/]+_region_[^\/]+\.ome\.(?:tif&#124;tiff)$</code> (example: <code>lab_processed/images/region_001/S20030092_region_011.ome.tif</code>) || OME TIFF Files for the corresponding region (e.g. region_001) by slide (e.g S20030077), organized into subdirectories based on their region. |
| <code>lab_processed\/images\/region_[^\/]+\/[^\/]*ome-tiff\.channels\.csv$</code> || This file provides essential documentation pertaining to each channel of the accommpanying OME TIFF. The file should contain one row per OME TIFF channel. The required fields are detailed <https://docs.google.com/spreadsheets/d/1xEJSb0xn5C5fB3k62pj1CyHNybpt4-YtvUs5SUMS44o/edit#gid=0> |
| <code>lab_processed\/annotations\/.*</code> || This is a directory containing annotations. |
| <code>lab_processed\/annotations\/slide_list\.txt$</code> || Information about the slides used by the experiment- each line corresponds to a slide name (begins with S - e.g. S20030077) - used in filenames. |
| <code>lab_processed\/virtual_histology\/.*</code> || This is a directory containing annotations for virtual histology images |
| <code>lab_processed\/virtual_histology\/HandE_RGB_thumbnail\.jpg$</code> || Virtual H&E RGB thumbnail |
| <code>lab_processed\/virtual_histology\/HandE_RGB\.tif$</code> || Virtual H&E RGB image |
| <code>lab_processed\/virtual_histology\/[^\/]+_VHE_region_[^\/]+\.tif$</code> || Virtual H&E image |

<summary><b>Version 2.0</b></summary>

| pattern | required? | description |
| --- | --- | --- |
26 changes: 25 additions & 1 deletion docs/codex/current/index.md
@@ -28,7 +28,31 @@ Related files:
<br>

## Directory schemas
<summary><b>Version 2.2 (use this one)</b></summary>
<summary><b>Version 2.3 (use this one)</b></summary>

| pattern | required? | description |
| --- | --- | --- |
| <code>extras\/.*</code> || Folder for general lab-specific files related to the dataset. [Exists in all assays] |
| <code>extras\/microscope_hardware\.json$</code> || **[QA/QC]** A file generated by the micro-meta app that contains a description of the hardware components of the microscope. Email HuBMAP Consortium Help Desk <[email protected]> if help is required in generating this document. |
| <code>extras\/microscope_settings\.json$</code> | | **[QA/QC]** A file generated by the micro-meta app that contains a description of the settings that were used to acquire the image data. Email HuBMAP Consortium Help Desk <[email protected]> if help is required in generating this document. |
| <code>raw\/.*</code> || This is a directory containing raw data. |
| <code>lab_processed\/.*</code> || Experiment files that were processed by the lab generating the data. |
| <code>lab_processed\/images\/.*</code> || This is a directory containing processed image files |
| <code>lab_processed\/images\/[^\/]+\.ome\.tiff$</code> || OME-TIFF file (multichannel, multi-layered) produced by the experiment. If compressed, must use loss-less compression algorithm. See the following link for the set of fields that are required in the OME TIFF file XML header. <https://docs.google.com/spreadsheets/d/1YnmdTAA0Z9MKN3OjR3Sca8pz-LNQll91wdQoRPSP6Q4/edit#gid=0> |
| <code>lab_processed\/images\/[^\/]*ome-tiff\.channels\.csv$</code> || This file should describe any processing that was done to generate the images in each channel of the accommpanying OME TIFF. The file should contain one row per OME TIFF channel. Two columns should be booleans "is this a channel to use for nuclei segmentation" and "is this a channel to use for cell segmentation". |
| <code>[^\/]*NAV[^\/]*\.tif$</code> (example: <code>NAV.tif</code>) | | Navigational Image showing Region of Interest (Keyance Microscope only) |
| <code>[^\/]+\.pdf$</code> (example: <code>summary.pdf</code>) | | **[QA/QC]** PDF export of Powerpoint slide deck containing the Image Analysis Report |
| <code>extras\/dir-schema-v2-with-dataset-json</code> || Empty file whose presence indicates the version of the directory schema in use |
| <code>processed\/drv_[^\/]*\/.*</code> || Processed files produced by the Akoya software or alternative software. |
| <code>raw\/cyc[^\/]*_reg[^\/]*\/.*</code> || Intermediary directory |
| <code>raw\/src_[^\/]*\/.*</code> || Intermediary directory |
| <code>raw\/cyc[^\/]*_reg[^\/]*\/[^\/]*_z[^\/]*_CH[^\/]*\.tif$</code> || TIFF files produced by the experiment. General folder format: Cycle(n)_Region(n)_date; General file format: name_tileNumber(n)_zplaneNumber(n)_channelNumber(n) |
| <code>raw\/src_[^\/]*\/cyc[^\/]*_reg[^\/]*_[^\/]*\/[^\/]+\.gci$</code> | | Group Capture Information File (Keyance Microscope only) |
| <code>raw\/dataset\.json$</code> (example: <code>raw/dataset.json</code>) || Data processing parameters file. This will include additional CODEX specific metadata needed for the HIVE processing workflow. |
| <code>raw\/reg_[^\/]*\.png$</code> (example: <code>raw/reg_00.png</code>) | | Region overviews |
| <code>raw\/experiment\.json$</code> (example: <code>raw/experiment.json</code>) | | JSON file produced by the Akoya software which contains the metadata for the experiment, including the software version used, microscope parameters, channel names, pixel dimensions, etc. (required for HuBMAP pipeline) |

<summary><b>Version 2.2</b></summary>

| pattern | required? | description |
| --- | --- | --- |
27 changes: 26 additions & 1 deletion docs/confocal/current/index.md
@@ -28,7 +28,32 @@ Related files:
<br>

## Directory schemas
<summary><b>Version 2.0 (use this one)</b></summary>
<summary><b>Version 2.1 (use this one)</b></summary>

| pattern | required? | description |
| --- | --- | --- |
| <code>extras\/.*</code> || Folder for general lab-specific files related to the dataset. [Exists in all assays] |
| <code>extras\/microscope_hardware\.json$</code> || **[QA/QC]** A file generated by the micro-meta app that contains a description of the hardware components of the microscope. Email HuBMAP Consortium Help Desk <[email protected]> if help is required in generating this document. |
| <code>extras\/microscope_settings\.json$</code> | | **[QA/QC]** A file generated by the micro-meta app that contains a description of the settings that were used to acquire the image data. Email HuBMAP Consortium Help Desk <[email protected]> if help is required in generating this document. |
| <code>raw\/.*</code> || Raw data files for the experiment. |
| <code>raw\/channel_layout\.tsv$</code> || Table that includes a dictionary for channel to moiety, which may be a protein given in an OMAP panel or captured in the ASCT+B table. |
| <code>raw\/images\/.*</code> || Raw image files. Using this subdirectory allows for harmonization with other imaging assays. [This directory must include at least one raw file.] |
| <code>raw\/images\/[^\/]+\.(?:xml&#124;nd2&#124;oir&#124;lif&#124;czi&#124;tiff)$</code> || Raw microscope file for the experiment |
| <code>lab_processed\/.*</code> || Experiment files that were processed by the lab generating the data. |
| <code>lab_processed\/images\/.*</code> || Processed image files |
| <code>lab_processed\/images\/[^\/]+\.ome\.tiff$</code> (example: <code>lab_processed/images/HBM892.MDXS.293.ome.tiff</code>) || OME-TIFF files (multichannel, multi-layered) produced by the microscopy experiment. If compressed, must use loss-less compression algorithm. See the following link for the set of fields that are required in the OME TIFF file XML header. <https://docs.google.com/spreadsheets/d/1YnmdTAA0Z9MKN3OjR3Sca8pz-LNQll91wdQoRPSP6Q4/edit#gid=0> |
| <code>lab_processed\/images\/[^\/]*ome-tiff\.channels\.csv$</code> || This file provides essential documentation pertaining to each channel of the accommpanying OME TIFF. The file should contain one row per OME TIFF channel. The required fields are detailed <https://docs.google.com/spreadsheets/d/1xEJSb0xn5C5fB3k62pj1CyHNybpt4-YtvUs5SUMS44o/edit#gid=0> |
| <code>lab_processed\/transformations\/.*</code> | | This directory contains transformation matrices that capture how each modality is aligned with the other and can be used to visualize overlays of multimodal data. This is needed to overlay images from the exact same tissue section (e.g., MALDI imaging mass spec, autofluorescence microscopy, MxIF, histological stains). In these cases data type may have different pixel sizes and slightly different orientations (i.e., one may be rotated relative to another). |
| <code>lab_processed\/transformations\/[^\/]+\.txt$</code> | | Transformation matrices used to overlay images from the exact same tissue section (e.g., MALDI imaging mass spec, autofluorescence microscopy, MxIF, histological stains). |
| <code>qa_qc\/.*</code> || Directory containing QA and/or QC information. |
| <code>qa_qc\/resolution_report\/.*</code> || Directory containing the results of resolution tests and/or vendor preventative maintenance reports. |
| <code>qa_qc\/resolution_report\/resolution\.txt$</code> | | This file summarizes the results of resolution tests or vendor reports from preventative maintenance visits. |
| <code>qa_qc\/resolution_report\/[^\/]+\.pdf$</code> | | This file is a pdf from a vendor preventative maintenance visit or resolution check tool demonstrating resolution. This file may include illumination test results. |
| <code>qa_qc\/illumination_report\/.*</code> || Directory containing the results of illumination tests and/or vendor preventative maintenance reports. |
| <code>qa_qc\/illumination_report\/illumination.txt$</code> | | This file summarizes the results of illumination tests or vendor reports from preventative maintenance visits. |
| <code>qa_qc\/illumination_report\/[^\/]+\.pdf$</code> | | This file is a pdf from a vendor preventative maintenance visit or illumination check tool demonstrating illumination intensity. |
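
One detail worth checking in the table above: the `illumination.txt$` pattern leaves its dot unescaped, whereas `resolution\.txt$` escapes it. The sketch below shows what that difference means in regular-expression terms. It assumes `re.match` semantics. The escaped variant, the `accepts` helper, and the sample paths are hypothetical and not part of this commit.

```python
import re

# Pattern as shown in the table: the "." before "txt" matches any character.
AS_DOCUMENTED = re.compile(r"qa_qc\/illumination_report\/illumination.txt$")
# Hypothetical stricter variant with the dot escaped, mirroring resolution\.txt$.
ESCAPED = re.compile(r"qa_qc\/illumination_report\/illumination\.txt$")


def accepts(pattern: re.Pattern, path: str) -> bool:
    """Return True if the pattern matches from the start of the path."""
    return pattern.match(path) is not None


if __name__ == "__main__":
    # Hypothetical paths: the intended file and a near-miss name.
    for path in ("qa_qc/illumination_report/illumination.txt",
                 "qa_qc/illumination_report/illumination_txt"):
        print(path, accepts(AS_DOCUMENTED, path), accepts(ESCAPED, path))
```

Both forms accept the intended file name. Only the documented form also accepts the near-miss, which is unlikely to matter in practice but is inconsistent with the neighboring patterns.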

<summary><b>Version 2.0</b></summary>

| pattern | required? | description |
| --- | --- | --- |