Incorporate status quo specification details #71

HLWeil · 2023-09-11T10:35:44Z

The way the tools handle the ARC has gradually grown beyond the current ISA and ARC specifications. I compiled a list of the differences between the status quo in the tools and the current state of the specification. These should be discussed and incorporated into the ARC specification:

Changes to ISA-Tab

Xlsx vs txt files
Investigation file keys:
- Investigation PubMed ID changed to Investigation Publication PubMed ID
- Study PubMed ID changed to Study Publication PubMed ID
- [BUG] investigation file deviates from ISA ARCCommander#107
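
As an illustration of the two key renames listed above, here is a minimal sketch of a lookup that a reader or converter could apply. The names `RENAMED_KEYS` and `rename_investigation_key` are hypothetical and not taken from any of the tools.

```python
# Hypothetical helper, not part of ARCCommander or ARCtrl.
# Maps the original ISA-Tab investigation keys to the names listed above.
RENAMED_KEYS = {
    "Investigation PubMed ID": "Investigation Publication PubMed ID",
    "Study PubMed ID": "Study Publication PubMed ID",
}


def rename_investigation_key(key: str) -> str:
    """Return the renamed key, or the stripped key unchanged if it was not renamed."""
    key = key.strip()
    return RENAMED_KEYS.get(key, key)
```

Keys that are not in the list pass through unchanged, so the helper can be applied to every row of an investigation file.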
Metadata sheets
- Assay files now contain a metadata sheet with the assay section from the study section and an additional performers section
- Study files now contain a metadata sheet with only the top-level study section
- Sheet names: isa_study, isa_assay and isa_investigation
Annotation table process separation
- In ISA-Tab, process separation is marked using the Protocol REF column. Now, processes are instead separated by placing them on different sheets
- Annotation table sheet names correspond to isa-json process names
- Annotation Table headers:
  - Space between category and name was introduced
    - e.g. Characteristics [Organism] vs Characteristics[Organism]
  - Ontological Annotation of headers:
    - E.g. Term Source REF (PATO:0000146) vs only Term Source REF
  - Added Component columns
  - Characteristic instead of Characteristics
  - Factor instead of Factor Value
  - Parameter instead of Parameter Value
  - Entity headers now specify whether it’s Input or Output:
    - E.g. Input [Source Name] and Output [Sample Name]
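
The header changes above can be sketched as a small conversion from ISA-Tab style headers to the style used in the tools. This is only an illustration under assumptions: the function names and the regular expression are hypothetical, headers without a bracketed name are passed through untouched, and the Input/Output role has to be supplied by the caller because it cannot be derived from the header alone.

```python
import re

# Category renames listed above (ISA-Tab name -> name used by the tools).
CATEGORY_RENAMES = {
    "Characteristics": "Characteristic",
    "Factor Value": "Factor",
    "Parameter Value": "Parameter",
}

# Matches "Category[Name]" with or without a space before the bracket.
_HEADER = re.compile(r"^\s*(?P<category>[^\[\]]+?)\s*\[(?P<name>[^\[\]]+)\]\s*$")


def convert_header(header: str) -> str:
    """Rename the category and put a single space between category and name."""
    match = _HEADER.match(header)
    if match is None:
        # No bracketed name (e.g. "Term Source REF"): leave as-is.
        return header.strip()
    category = CATEGORY_RENAMES.get(match.group("category"), match.group("category"))
    return f"{category} [{match.group('name').strip()}]"


def entity_header(role: str, entity: str) -> str:
    """Build an entity header such as "Input [Source Name]"."""
    if role not in ("Input", "Output"):
        raise ValueError("role must be 'Input' or 'Output'")
    return f"{role} [{entity.strip()}]"
```

For example, `convert_header("Characteristics[Organism]")` gives `Characteristic [Organism]`, and `entity_header("Output", "Sample Name")` gives `Output [Sample Name]`.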

Further Specifications

Paths
- Based on arc root
- Or relative to reference path:
  - Data file paths in assay files: assays/<assayName>/dataset
  - Data file paths in study files: studies/<studyName>/resources
  - Assay file name in metadata section: assays
  - Study file name in metadata section: studies
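
To make the relative-path rule concrete, the following sketch turns a path that is relative to one of the reference folders above into a path based on the ARC root. The function names and the example file name are hypothetical; only the folder layout (`assays/<assayName>/dataset`, `studies/<studyName>/resources`) comes from the list above.

```python
from pathlib import PurePosixPath


def assay_data_path(assay_name: str, relative_path: str) -> str:
    """Data file path in an assay file, resolved against the ARC root."""
    return str(PurePosixPath("assays") / assay_name / "dataset" / relative_path)


def study_data_path(study_name: str, relative_path: str) -> str:
    """Data file path in a study file, resolved against the ARC root."""
    return str(PurePosixPath("studies") / study_name / "resources" / relative_path)


# Hypothetical assay and file names, used only for illustration.
print(assay_data_path("measurement1", "run1.raw"))
```

`PurePosixPath` is used so that the result always has forward slashes, independent of the operating system the code runs on.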

Parsing behaviour

Processes are parsed in a different way compared to ISA-API
- In ISA-API, the input and output of the resulting process can be interpreted like in a chemical formula. All inputs map interchangeably to all outputs.
- Currently in ARCtrl, the row logic of the table is carried over into the resulting isa_process.json: the n-th input maps to the n-th output in the lists
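
The difference between the two interpretations can be sketched with plain lists. This is not code from ISA-API or ARCtrl; the function names are hypothetical and the sample names are made up.

```python
from itertools import product


def map_all_to_all(inputs, outputs):
    """'Chemical formula' reading: every input relates to every output."""
    return list(product(inputs, outputs))


def map_by_position(inputs, outputs):
    """Table reading: the n-th input relates to the n-th output."""
    if len(inputs) != len(outputs):
        raise ValueError("positional mapping needs lists of equal length")
    return list(zip(inputs, outputs))


inputs = ["source1", "source2"]
outputs = ["sample1", "sample2"]

print(map_all_to_all(inputs, outputs))   # four pairs
print(map_by_position(inputs, outputs))  # two pairs
```

With two inputs and two outputs, the first reading yields four input-output pairs, while the positional reading yields only the two pairs that correspond to the table rows.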


HLWeil mentioned this issue Sep 13, 2023

Merge ISA-Tab specification with ISA-XLSX changes #72

Merged

This was referenced Oct 9, 2023

Improve specification of ISA-XLSX elements #63

Closed

Study metadata section is not covered #64

Closed

Include isa-xlsx for ARC-specification 1.2 #76

Merged

HLWeil closed this as completed in #76 Nov 9, 2023
