Merge pull request #136 from nf-core/soft_matrix_support
Soft matrix support
azedinez authored Sep 12, 2023
2 parents dcf587b + 45dd543 commit 8d61010
Showing 14 changed files with 445 additions and 32 deletions.
1 change: 1 addition & 0 deletions .github/workflows/ci.yml
@@ -30,6 +30,7 @@ jobs:
- "test"
- "test_nogtf"
- "test_affy"
- "test_soft"
steps:
- name: Check out pipeline code
uses: actions/checkout@v3
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -11,6 +11,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [[#129](https://github.com/nf-core/differentialabundance/pull/129)] - Module updates to fit with recent registry changes ([@pinin4fjords](https://github.com/pinin4fjords), review by [@maxulysse](https://github.com/maxulysse), [@adamrtalbot](https://github.com/adamrtalbot))
- [[#130](https://github.com/nf-core/differentialabundance/pull/130)] - Document reasons for lack of differential expression ([@pinin4fjords](https://github.com/pinin4fjords), review by [@jfy133](https://github.com/jfy133))
- [[#131](https://github.com/nf-core/differentialabundance/pull/131)] - Improve gtf to table configurability ([@pinin4fjords](https://github.com/pinin4fjords), review by [@WackerO](https://github.com/WackerO))
- [[#136](https://github.com/nf-core/differentialabundance/pull/136)] - Added support for non-Affymetrix arrays via automatic download of SOFT matrices in GEO ([@azedinez](https://github.com/azedinez), review by [@pinin4fjords](https://github.com/pinin4fjords))
- [[#137](https://github.com/nf-core/differentialabundance/pull/137)] - Add `--sizefactors_from_controls` and `--gene_id_col` for DESeq2 module to modules.config ([@WackerO](https://github.com/WackerO), review by [@pinin4fjords](https://github.com/pinin4fjords))
- [[#145](https://github.com/nf-core/differentialabundance/pull/145)] - Template update for nf-core/tools v2.9 ([@nf-core-bot](https://github.com/nf-core-bot), review by [@pinin4fjords](https://github.com/pinin4fjords), [@WackerO](https://github.com/WackerO))

4 changes: 3 additions & 1 deletion assets/differentialabundance_report.Rmd
@@ -210,7 +210,9 @@ if (! params$observations_name_col %in% colnames(observations)){
if (! is.null(params$features)){
features <- read_metadata(file.path(params$input_dir, params$features))
- features <- features[,colnames(features) %in% simpleSplit(params$features_metadata_cols), drop = FALSE]
+ if (! is.null(params$features_metadata_cols)){
+ features <- features[,colnames(features) %in% simpleSplit(params$features_metadata_cols), drop = FALSE]
+ }
}
contrasts <- read_metadata(file.path(params$input_dir, params$contrasts_file))
24 changes: 24 additions & 0 deletions conf/modules.config
@@ -100,6 +100,30 @@ process {
].join(' ').trim() }
}

withName: GEOQUERY_GETGEO {
publishDir = [
[
path: { "${params.outdir}/tables/processed_abundance" },
mode: params.publish_dir_mode,
pattern: '*.matrix.tsv'
],
[
path: { "${params.outdir}/tables/annotation" },
mode: params.publish_dir_mode,
pattern: '*.annotation.tsv'
],
[
path: { "${params.outdir}/other/affy" },
mode: params.publish_dir_mode,
pattern: '*.{rds,sessionInfo.log}'
]
]
ext.prefix = { "normalised." }
ext.args = {
((params.features_metadata_cols == null) ? '' : "--metacols \"${params.features_metadata_cols}\"")
}
}

withName: DESEQ2_NORM {
ext.prefix = 'all'
publishDir = [
46 changes: 46 additions & 0 deletions conf/soft.config
@@ -0,0 +1,46 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Nextflow config file for running SOFT array file analysis
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Defines settings specific to array analysis with SOFT files from GEO
Use as follows:
nextflow run nf-core/differentialabundance -profile soft,<docker/singularity> --outdir <OUTDIR>
----------------------------------------------------------------------------------------
*/

params {

config_profile_name = 'SOFT matrix track test profile'
config_profile_description = 'Minimal settings for test of the SOFT matrix track'

// Study
study_type = 'geo_soft_file'
study_abundance_type = 'intensities'

// Observations
observations_id_col = 'id'
observations_name_col = 'id'


// Features
features_id_col = 'ID'
features_metadata_cols = 'ID,ENTREZ_GENE_ID,Gene Symbol,Sequence Type'
features_name_col = 'Gene Symbol'


// Exploratory
exploratory_assay_names = 'normalised'
exploratory_final_assay = 'normalised'

// Differential options
differential_file_suffix = ".limma.results.tsv"
differential_fc_column = "logFC"
differential_pval_column = "P.Value"
differential_qval_column = "adj.P.Val"
differential_feature_id_column = "probe_id"
differential_feature_name_column = "Symbol"

}

32 changes: 32 additions & 0 deletions conf/test_soft.config
@@ -0,0 +1,32 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Nextflow config file for running minimal tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Defines input files and everything required to run a fast and simple
pipeline test with SOFT array files from GEO.
Use as follows:
nextflow run nf-core/differentialabundance -profile test_soft,<docker/singularity> --outdir <OUTDIR>
----------------------------------------------------------------------------------------
*/

includeConfig 'soft.config'

params {

config_profile_name = 'SOFT matrix track test profile'
config_profile_description = 'Minimal settings for test of the SOFT matrix track'

// Limit resources so that this can run on GitHub Actions
max_cpus = 2
max_memory = '6.GB'
max_time = '6.h'

// Input
input = 'https://raw.githubusercontent.com/nf-core/test-datasets/differentialabundance/testdata/GSE50790.csv'
contrasts = 'https://raw.githubusercontent.com/nf-core/test-datasets/differentialabundance/testdata/GSE50790_contrasts.csv'
querygse = 'GSE50790'

}

16 changes: 16 additions & 0 deletions docs/usage.md
@@ -57,6 +57,22 @@ This is a numeric square matrix file, comma or tab-separated, with a column for

This is an archive of CEL files as frequently found in GEO.

### Use SOFT matrices

Alternatively, the user may want to work with SOFT matrices. In this case, setting

`--study_type geo_soft_file` and `--querygse [GSE study ID]`

enables the pipeline to download normalised SOFT matrices automatically (note that even though Affymetrix arrays are also supported in the SOFT matrix track, it is recommended to work from CEL files in this case).

As for other platforms, you may subset the feature metadata used in reporting etc. For example, for GPL570 (Affymetrix Plus 2.0 arrays) this could be done with:

```
--features_metadata_cols ID,Entrez_Gene_ID,Symbol,Definition
```

The full list of available feature metadata columns is given on the GEO platform pages.
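
As a rough sketch of how the options in this section might be combined on the command line, the helper below only prints the command rather than launching it, so the flags can be checked first. `build_soft_cmd` is a hypothetical helper, and `samplesheet.csv`, `contrasts.csv`, `results` and the `docker` profile are placeholder assumptions. `GSE50790` is the accession used in the pipeline's SOFT test configuration.

```bash
#!/usr/bin/env bash
# Dry-run helper (hypothetical): prints a Nextflow command for the SOFT
# matrix track instead of running it. File names below are placeholders.

build_soft_cmd() {
    gse_id="$1"   # GEO series accession to download, e.g. GSE50790
    outdir="$2"   # directory the pipeline should write results to
    # printf repeats the '%s ' format for every argument that follows
    printf '%s ' \
        nextflow run nf-core/differentialabundance \
        -profile docker \
        --study_type geo_soft_file \
        --querygse "$gse_id" \
        --input samplesheet.csv \
        --contrasts contrasts.csv \
        --outdir "$outdir"
    printf '\n'
}

build_soft_cmd GSE50790 results
```

Once the printed command looks right, it can be pasted into a terminal and `--features_metadata_cols` appended if only a subset of the platform annotation is wanted.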

## Contrasts file

```bash
5 changes: 5 additions & 0 deletions modules.json
@@ -40,6 +40,11 @@
"git_sha": "d0b4fc03af52a1cc8c6fb4493b921b57352b1dd8",
"installed_by": ["modules"]
},
"geoquery/getgeo": {
"branch": "master",
"git_sha": "6814b0659c51e447684a58c2b834a9f3b530540d",
"installed_by": ["modules"]
},
"gsea/gsea": {
"branch": "master",
"git_sha": "911696ea0b62df80e900ef244d7867d177971f73",
24 changes: 24 additions & 0 deletions modules/nf-core/geoquery/getgeo/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

47 changes: 47 additions & 0 deletions modules/nf-core/geoquery/getgeo/meta.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
