Merge pull request #32 from nimh-dsst/feature/integrate-generate-renders
integrating generate renders to pipeline
Arshitha authored May 2, 2023
2 parents b674e3c + 1d6e8cd commit 18a3238
Showing 5 changed files with 133 additions and 302 deletions.
140 changes: 41 additions & 99 deletions README.md
@@ -46,74 +46,62 @@ Once conda finishes creating the virtual environment, activate `dsstdeface`.
 conda activate dsstdeface
 ```
 
-## Using `dsst_defacing_wf.py`
+## Usage
 
-To deface anatomical scans in the dataset, run the `src/dsst_defacing_wf.py` script. From within the `dsst-defacing-pipeline` cloned directory, run the following command to see the help message.
+To deface anatomical scans in the dataset, run the `src/run.py` script. From within the `dsst-defacing-pipeline` cloned directory, run the following command to see the help message.
 
 ```text
-% python src/dsst_defacing_wf.py -h
-usage: dsst_defacing_wf.py [-h] [-n N_CPUS]
-                           [-p PARTICIPANT_LABEL [PARTICIPANT_LABEL ...]]
-                           [-s SESSION_ID [SESSION_ID ...]]
-                           [--no-clean]
-                           bids_dir output_dir
+% python src/run.py -h
+usage: run.py [-h] [-n N_CPUS] [-p PARTICIPANT_LABEL [PARTICIPANT_LABEL ...]]
+              [-s SESSION_ID [SESSION_ID ...]] [--no-clean]
+              bids_dir output_dir
 
-Deface anatomical scans for a given BIDS dataset or a subject
-directory in BIDS format.
+Deface anatomical scans for a given BIDS dataset or a subject directory in
+BIDS format.
 
 positional arguments:
-  bids_dir              The directory with the input dataset
-                        formatted according to the BIDS standard.
-  output_dir            The directory where the output files should
-                        be stored.
+  bids_dir              The directory with the input dataset formatted
+                        according to the BIDS standard.
+  output_dir            The directory where the output files should be stored.
 
-options:
+optional arguments:
   -h, --help            show this help message and exit
   -n N_CPUS, --n-cpus N_CPUS
-                        Number of parallel processes to run when
-                        there is more than one folder. Defaults to
-                        1, meaning "serial processing".
+                        Number of parallel processes to run when there is more
+                        than one folder. Defaults to 1, meaning "serial
+                        processing".
   -p PARTICIPANT_LABEL [PARTICIPANT_LABEL ...], --participant-label PARTICIPANT_LABEL [PARTICIPANT_LABEL ...]
-                        The label(s) of the participant(s) that
-                        should be defaced. The label corresponds to
-                        sub-<participant_label> from the BIDS spec
-                        (so it does not include "sub-"). If this
-                        parameter is not provided all subjects
-                        should be analyzed. Multiple participants
-                        can be specified with a space separated
-                        list.
+                        The label(s) of the participant(s) that should be
+                        defaced. The label corresponds to
+                        sub-<participant_label> from the BIDS spec (so it does
+                        not include "sub-"). If this parameter is not provided
+                        all subjects should be analyzed. Multiple participants
+                        can be specified with a space separated list.
   -s SESSION_ID [SESSION_ID ...], --session-id SESSION_ID [SESSION_ID ...]
-                        The ID(s) of the session(s) that should be
-                        defaced. The label corresponds to
-                        ses-<session_id> from the BIDS spec (so it
-                        does not include "ses-"). If this parameter
-                        is not provided all subjects should be
-                        analyzed. Multiple sessions can be specified
-                        with a space separated list.
-  --no-clean            If this argument is provided, then AFNI
-                        intermediate files are preserved.
-```
+                        The ID(s) of the session(s) that should be defaced.
+                        The label corresponds to ses-<session_id> from the
+                        BIDS spec (so it does not include "ses-"). If this
+                        parameter is not provided all subjects should be
+                        analyzed. Multiple sessions can be specified with a
+                        space separated list.
+  --no-clean            If this argument is provided, then AFNI intermediate
+                        files are preserved.
+```
 
-The script can be run serially on a BIDS dataset or in parallel at subject/session level. The three methods of running
-the script have been described below with example commands:
-
-For readability of example commands, the following bash variables have been defined as follows:
-
-```bash
-INPUT_DIR="<path/to/BIDS/input/dataset>"
-OUTPUT_DIR="<path/to/desired/defacing/output/directory>"
-```
+The script can be run serially on a BIDS dataset or in parallel at subject/session level. Both these methods of running
+the script have been described below with example commands.
 
 **NOTE:** In the example commands below, `<path/to/BIDS/input/dataset>` and `<path/to/desired/output/directory>` are
 placeholders for paths to input and output directories, respectively.
 
 ### Option 1: Serial defacing
 
 If you have a small dataset with less than 10 subjects, then it might be easiest to run the defacing algorithm serially.
 
 ```bash
-python src/dsst_defacing_wf.py ${INPUT_DIR} ${OUTPUT_DIR}
+# activate your conda environment
+conda activate dsstdeface
+
+# once your conda environment is active, execute the following
+python src/run.py ${INPUT_DIR} ${OUTPUT_DIR}
 ```
 
 ### Option 2: Parallel defacing
@@ -122,60 +110,14 @@ If you have dataset with over 10 subjects and since each defacing job is independen
 subject/session in the dataset using the `-n/--n-cpus` option. The following example command will run the pipeline occupying 10 processors at a time.
 
 ```bash
-python src/dsst_defacing_wf.py ${INPUT_DIR} ${OUTPUT_DIR} -n 10
-```
-
-### Option 3: Parallel defacing using `swarm`
-
-
-Assuming these scripts are run on the NIH HPC system, you can create a `swarm` file:
-
-```bash
-
-for i in `ls -d ${INPUT_DIR}/sub-*`; do \
-SUBJ=$(echo $i | sed "s|${INPUT_DIR}/||g" ); \
-echo "python dsst-defacing-pipeline/src/dsst_defacing_wf.py -i ${INPUT_DIR} -o ${OUTPUT_DIR} -p ${SUBJ}"; \
-done > defacing_parallel_subject_level.swarm
-```
-
-The above BASH "for loop" crawls through the dataset and finds all subject directories to construct `dsst_defacing_wf.py` commands
-with the `-p/--participant-label` option.
-
-Next you can run the swarm file with the following command:
-
-```bash
-swarm -f defacing_parallel_subject_level.swarm --merge-output --logdir ${OUTPUT_DIR}/swarm_log
-```
-
-### Option 4: In parallel at session level
-
-If the input dataset has multiple sessions per subject, then run the pipeline on every session in the dataset
-in parallel. Similar to Option 2, the following commands loop through the dataset to find subject and session IDs to
-create a `swarm` file to be run on NIH HPC systems.
-
-```bash
-for i in `ls -d ${INPUT_DIR}/sub-*`; do
-SUBJ=$(echo $i | sed "s|${INPUT_DIR}/||g" );
-for j in `ls -d ${INPUT_DIR}/${SUBJ}/ses-*`; do
-SESS=$(echo $j | sed "s|${INPUT_DIR}/${SUBJ}/||g" )
-echo "python dsst-defacing-pipeline/src/dsst_defacing_wf.py -i ${INPUT_DIR} -o ${OUTPUT_DIR} -p ${SUBJ} -s ${SESS}";
-done;
-done > defacing_parallel_session_level.swarm
-```
-
-To run the swarm file, once created, use the following command:
-
-```bash
-swarm -f defacing_parallel_session_level.swarm --merge-output --logdir ${OUTPUT_DIR}/swarm_log
-```
-
-## Using `generate_renders.py`
-
-Generate 3D renders for every defaced image in the output directory.
-
-```bash
-python dsst-defacing-pipeline/src/generate_renders.py -o ${OUTPUT_DIR}
-```
+# activate your conda environment
+conda activate dsstdeface
+
+# once your conda environment is active, execute the following
+python src/run.py ${INPUT_DIR} ${OUTPUT_DIR} -n 10
+```
+
+Additionally, the pipeline can be run on a single subject or session using the `-p/--participant-label` and `-s/--session-id` options, respectively.
 
 ## Visual Inspection
 
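The removed `swarm` options above were NIH-HPC-specific, but the same per-subject fan-out can be reproduced with a few lines of Python. The sketch below is hypothetical: it assumes the new `run.py` CLI shown in the updated help text, and the dataset paths are placeholders.

```python
# Hypothetical sketch of the per-subject fan-out that the removed swarm
# example provided, using the renamed run.py and its positional arguments.
import subprocess
from pathlib import Path

input_dir = Path("/path/to/BIDS/input/dataset")  # placeholder
output_dir = Path("/path/to/defacing/output")    # placeholder

for subj_dir in sorted(input_dir.glob("sub-*")):
    # --participant-label expects the label without the "sub-" prefix
    label = subj_dir.name.removeprefix("sub-")
    subprocess.run(
        ["python", "src/run.py", str(input_dir), str(output_dir),
         "--participant-label", label],
        check=True,
    )
```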
34 changes: 31 additions & 3 deletions src/deface.py
@@ -1,7 +1,7 @@
 import gzip
+import os
 import re
 import shutil
-import os
 import subprocess
 from os import fspath
 from pathlib import Path
@@ -69,8 +69,25 @@ def copy_over_sidecar(scan_filepath, input_anat_dir, output_anat_dir):
         shutil.copy2(json_sidecar, output_anat_dir / filename)
 
 
+def generate_3d_renders(defaced_img, render_outdir):
+    rotations = [(45, 5, 10), (-45, 5, 10)]
+    for idx, rot in enumerate(rotations):
+        yaw, pitch, roll = rot[0], rot[1], rot[2]
+        outfile = render_outdir.joinpath('defaced_render_0' + str(idx) + '.png')
+        if not outfile.exists():
+            if 'T2w' in render_outdir.parts:
+                fsleyes_render_cmd = f"fsleyes render --scene 3d -rot {yaw} {pitch} {roll} --outfile {outfile} {defaced_img} -dr 80 1000 -in spline -cm render1 -bf 0.3 -r 100 -ns 500;"
+            else:
+                fsleyes_render_cmd = f"fsleyes render --scene 3d -rot {yaw} {pitch} {roll} --outfile {outfile} {defaced_img} -dr 20 250 -in spline -cm render1 -bf 0.3 -r 100 -ns 500;"
+            cmd = f"export TMP_DISPLAY=$DISPLAY; unset DISPLAY; {fsleyes_render_cmd} export DISPLAY=$TMP_DISPLAY"
+
+            print(cmd)
+            run_command(cmd, "")
+            print(f"Has the render been created? {outfile.exists()}")
+
+
 def vqcdeface_prep(bids_input_dir, defaced_anat_dir, bids_defaced_outdir):
-    defacing_qc_dir = bids_defaced_outdir.parent / 'QC_prep' / 'defacing_QC'
+    defacing_qc_dir = bids_defaced_outdir.parent / 'defacing_QC'
     interested_files = [f for f in defaced_anat_dir.rglob('*.nii.gz') if
                         'work_dir' not in str(f).split('/')]
     print(interested_files)
@@ -119,7 +136,10 @@ def reorganize_into_bids(input_bids_dir, subj_dir, sess_dir, primary_t1, bids_de
     intermediate_files_dir = anat_dir / 'work_dir'
     intermediate_files_dir.mkdir(parents=True, exist_ok=True)
     for dirpath in anat_dir.glob('*'):
-        if dirpath.name.startswith('workdir') or dirpath.name.endswith('QC'):
+        if dirpath.name.startswith('workdir'):
+            new_name = '_'.join(['afni', dirpath.name])
+            shutil.move(str(dirpath), str(intermediate_files_dir / new_name))
+        elif dirpath.name.endswith('QC'):
             shutil.move(str(dirpath), str(intermediate_files_dir))
 
     vqcdeface_prep(input_bids_dir, anat_dir, bids_defaced_outdir)
@@ -209,6 +229,14 @@ def deface_primary_scan(input_bids_dir, subj_input_dir, sess_dir, mapping_dict,
     # reorganizing the directory with defaced images into BIDS tree
     reorganize_into_bids(input_bids_dir, subj_input_dir, sess_dir, primary_t1, output_dir, no_clean)
 
+    # prep for visual inspection using visualqc deface
+    print(f"Preparing for QC by visual inspection...\n")
+    defaced_imgs = list(output_dir.parent.rglob('defaced.nii.gz'))
+    for img in defaced_imgs:
+        generate_3d_renders(img, img.parent)
+
+    # print(f"All set to start visual inspection of defaced images!")
+
     return missing_refacer_outputs
 
 
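The new `generate_3d_renders` helper shells out to FSL's `fsleyes render`, stashing `$DISPLAY` and restoring it afterwards so the render happens off-screen. A minimal standalone sketch of that same pattern, with the flag values copied from the hunk above (the image path is a placeholder, and `fsleyes` is assumed to be on `PATH`):

```python
# Sketch of the off-screen fsleyes render trick used by generate_3d_renders.
# The input image is a placeholder; flags mirror the non-T2w branch above.
import subprocess

defaced_img = "sub-01/anat/defaced.nii.gz"  # placeholder
outfile = "defaced_render_00.png"
yaw, pitch, roll = 45, 5, 10  # first of the two rotations used by the helper

render = (f"fsleyes render --scene 3d -rot {yaw} {pitch} {roll} "
          f"--outfile {outfile} {defaced_img} "
          f"-dr 20 250 -in spline -cm render1 -bf 0.3 -r 100 -ns 500;")
# Stash DISPLAY, render headlessly, then restore it, as the helper does.
subprocess.run(
    f"export TMP_DISPLAY=$DISPLAY; unset DISPLAY; {render} export DISPLAY=$TMP_DISPLAY",
    shell=True, check=True,
)
```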
81 changes: 9 additions & 72 deletions src/generate_mappings.py
@@ -32,64 +32,6 @@ def run_command(cmdstr, logfile):
     subprocess.run(cmdstr, stdout=logfile, stderr=subprocess.STDOUT, encoding='utf8', shell=True)
 
 
-def primary_scans_qc_prep(mapping_dict, qc_prep):
-    """Prepares a directory tree with symbolic links to primary scans and an id_list of primary scans to be used in the
-    visualqc_t1_mri command.
-    :param defaultdict mapping_dict: A dictionary mapping each subject's and session's primary and other scans.
-    :param Path() outdir: Path to output directory provided by the user. The output directory contains this scripts output files and
-    directories.
-    :return str vqc_t1_mri_cmd: A visualqc T1 MRI command string.
-    """
-
-    interested_keys = ['primary_t1', 'others']
-    primaries = []
-    for subjid in mapping_dict.keys():
-
-        # check existence of sessions to query mapping dict
-        if list(mapping_dict[subjid].keys()) != interested_keys:
-            for sessid in mapping_dict[subjid].keys():
-                primary = mapping_dict[subjid][sessid]['primary_t1']
-                primaries.append(primary)
-        else:
-            primary = mapping_dict[subjid]['primary_t1']
-            primaries.append(primary)
-    # remove empty strings from primaries list
-    primaries = [p for p in primaries if p != '']
-
-    vqc_t1_mri = qc_prep / 't1_mri_QC'
-    vqc_t1_mri.mkdir(parents=True, exist_ok=True)
-
-    id_list = []
-    for primary in primaries:
-        entities = Path(primary).name.split('_')
-        subjid = entities[0]
-
-        # check existence of session to construct destination path
-        sessid = ""
-        for e in entities:
-            if e.startswith('ses-'):
-                sessid = e
-
-        dest = vqc_t1_mri / subjid / sessid / 'anat'
-        dest.mkdir(parents=True, exist_ok=True)
-
-        id_list.append(dest)
-        primary_link = dest / 'primary.nii.gz'
-        if not primary_link.is_symlink():
-            try:
-                primary_link.symlink_to(primary)
-            except:
-                pass
-
-    with open(vqc_t1_mri / 't1_mri_id_list.txt', 'w') as f:
-        f.write('\n'.join([str(i) for i in id_list]))
-
-    vqc_t1_mri_cmd = f"visualqc_t1_mri -u {vqc_t1_mri} -i {vqc_t1_mri / 't1_mri_id_list.txt'} -m primary.nii.gz"
-
-    return vqc_t1_mri_cmd
 
 
 def sort_by_acq_time(sidecars):
     """Sorting a list of scans' JSON sidecars based on their acquisition time.
@@ -187,9 +129,9 @@ def update_mapping_dict(mapping_dict, anat_dir, is_sessions, sidecars, t1_unavai
         sessid = anat_dir.parent.name
         if sessid not in mapping_dict[subjid]:
             mapping_dict[subjid][sessid] = {
-            'primary_t1': str(primary_t1),
-            'others': others
-            }
+                'primary_t1': str(primary_t1),
+                'others': others
+            }
 
         else:
             mapping_dict[subjid][sessid] = {
@@ -199,14 +141,14 @@
 
     else:
         mapping_dict[subjid] = {
-        'primary_t1': str(primary_t1),
-        'others': others
-        }
+            'primary_t1': str(primary_t1),
+            'others': others
+        }
 
     return mapping_dict, t1_unavailable, t1_available
 
 
-def summary_to_stdout(vqc_t1_cmd, sess_ct, t1s_found, t1s_not_found, no_anat_dirs, output):
+def summary_to_stdout(sess_ct, t1s_found, t1s_not_found, output):
     readable_path_list = ['/'.join([path.parent.name, path.name]) for path in t1s_not_found]
     print(f"====================")
     print(f"Dataset Summary")
@@ -222,7 +164,7 @@
 
 def crawl(input_dir, output):
     # make dir for log files and visualqc prep
-    dir_names = ['logs', 'QC_prep']
+    dir_names = ['logs', 'defacing_QC']
     for dir_name in dir_names:
         output.joinpath(dir_name).mkdir(parents=True, exist_ok=True)
@@ -253,10 +195,5 @@ def crawl(input_dir, output):
     with open(output / 'logs' / 'anat_unavailable.txt', 'w') as f3:
         f3.write('\n'.join([str(p) for p in no_anat_dirs]))
 
-    # write vqc command to file
-    vqc_t1_mri_cmd = primary_scans_qc_prep(mapping_dict, output / 'QC_prep')
-    with open(output / 'QC_prep' / 't1_mri_qc_cmd', 'w') as f4:
-        f4.write(f"{vqc_t1_mri_cmd}\n")
-
-    summary_to_stdout(vqc_t1_mri_cmd, total_sessions, t1s_found, t1s_not_found, no_anat_dirs, output)
+    summary_to_stdout(total_sessions, t1s_found, t1s_not_found, output)
     return mapping_dict
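For orientation, `mapping_dict`, which both edited functions consume, pairs each subject (and each session, when the dataset has sessions) with its primary T1 and the remaining anatomical scans. An illustrative shape inferred from the keys in the hunks above; the paths are invented:

```python
# Illustrative mapping_dict shape; keys come from the diff, paths are invented.
mapping_dict = {
    "sub-01": {  # subject with sessions
        "ses-01": {
            "primary_t1": "/bids/sub-01/ses-01/anat/sub-01_ses-01_T1w.nii.gz",
            "others": ["/bids/sub-01/ses-01/anat/sub-01_ses-01_T2w.nii.gz"],
        },
    },
    "sub-02": {  # subject without sessions
        "primary_t1": "/bids/sub-02/anat/sub-02_T1w.nii.gz",
        "others": [],
    },
}
```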
