Merge pull request #32 from nimh-dsst/feature/integrate-generate-renders
integrating generate renders to pipeline
Arshitha authored May 2, 2023
2 parents b674e3c + 1d6e8cd commit 18a3238
Showing 5 changed files with 133 additions and 302 deletions.
140 changes: 41 additions & 99 deletions README.md
@@ -46,74 +46,62 @@ Once conda finishes creating the virtual environment, activate `dsstdeface`.
 conda activate dsstdeface
 ```
 
-## Using `dsst_defacing_wf.py`
+## Usage
 
-To deface anatomical scans in the dataset, run the `src/dsst_defacing_wf.py` script. From within the `dsst-defacing-pipeline` cloned directory, run the following command to see the help message.
+To deface anatomical scans in the dataset, run the `src/run.py` script. From within the `dsst-defacing-pipeline` cloned directory, run the following command to see the help message.
 
 ```text
-% python src/dsst_defacing_wf.py -h
-usage: dsst_defacing_wf.py [-h] [-n N_CPUS]
-                           [-p PARTICIPANT_LABEL [PARTICIPANT_LABEL ...]]
-                           [-s SESSION_ID [SESSION_ID ...]]
-                           [--no-clean]
-                           bids_dir output_dir
+% python src/run.py -h
+usage: run.py [-h] [-n N_CPUS] [-p PARTICIPANT_LABEL [PARTICIPANT_LABEL ...]]
+              [-s SESSION_ID [SESSION_ID ...]] [--no-clean]
+              bids_dir output_dir
 
-Deface anatomical scans for a given BIDS dataset or a subject
-directory in BIDS format.
+Deface anatomical scans for a given BIDS dataset or a subject directory in
+BIDS format.
 
 positional arguments:
-  bids_dir              The directory with the input dataset
-                        formatted according to the BIDS standard.
-  output_dir            The directory where the output files should
-                        be stored.
+  bids_dir              The directory with the input dataset formatted
+                        according to the BIDS standard.
+  output_dir            The directory where the output files should be stored.
 
-options:
+optional arguments:
   -h, --help            show this help message and exit
   -n N_CPUS, --n-cpus N_CPUS
-                        Number of parallel processes to run when
-                        there is more than one folder. Defaults to
-                        1, meaning "serial processing".
+                        Number of parallel processes to run when there is more
+                        than one folder. Defaults to 1, meaning "serial
+                        processing".
   -p PARTICIPANT_LABEL [PARTICIPANT_LABEL ...], --participant-label PARTICIPANT_LABEL [PARTICIPANT_LABEL ...]
-                        The label(s) of the participant(s) that
-                        should be defaced. The label corresponds to
-                        sub-<participant_label> from the BIDS spec
-                        (so it does not include "sub-"). If this
-                        parameter is not provided all subjects
-                        should be analyzed. Multiple participants
-                        can be specified with a space separated
-                        list.
+                        The label(s) of the participant(s) that should be
+                        defaced. The label corresponds to
+                        sub-<participant_label> from the BIDS spec (so it does
+                        not include "sub-"). If this parameter is not provided
+                        all subjects should be analyzed. Multiple participants
+                        can be specified with a space separated list.
   -s SESSION_ID [SESSION_ID ...], --session-id SESSION_ID [SESSION_ID ...]
-                        The ID(s) of the session(s) that should be
-                        defaced. The label corresponds to
-                        ses-<session_id> from the BIDS spec (so it
-                        does not include "ses-"). If this parameter
-                        is not provided all subjects should be
-                        analyzed. Multiple sessions can be specified
-                        with a space separated list.
-  --no-clean            If this argument is provided, then AFNI
-                        intermediate files are preserved.
-```
+                        The ID(s) of the session(s) that should be defaced.
+                        The label corresponds to ses-<session_id> from the
+                        BIDS spec (so it does not include "ses-"). If this
+                        parameter is not provided all subjects should be
+                        analyzed. Multiple sessions can be specified with a
+                        space separated list.
+  --no-clean            If this argument is provided, then AFNI intermediate
+                        files are preserved.
+```
 
-The script can be run serially on a BIDS dataset or in parallel at subject/session level. The three methods of running
-the script have been described below with example commands:
-
-For readability of example commands, the following bash variables have been defined as follows:
-
-```bash
-INPUT_DIR="<path/to/BIDS/input/dataset>"
-OUTPUT_DIR="<path/to/desired/defacing/output/directory>"
-```
+The script can be run serially on a BIDS dataset or in parallel at subject/session level. Both these methods of running
+the script have been described below with example commands.
 
 **NOTE:** In the example commands below, `<path/to/BIDS/input/dataset>` and `<path/to/desired/output/directory>` are
 placeholders for paths to input and output directories, respectively.
 
 ### Option 1: Serial defacing
 
 If you have a small dataset with less than 10 subjects, then it might be easiest to run the defacing algorithm serially.
 
 ```bash
-python src/dsst_defacing_wf.py ${INPUT_DIR} ${OUTPUT_DIR}
+# activate your conda environment
+conda activate dsstdeface
+
+# once your conda environment is active, execute the following
+python src/run.py ${INPUT_DIR} ${OUTPUT_DIR}
 ```
 
 ### Option 2: Parallel defacing
@@ -122,60 +110,14 @@ If you have dataset with over 10 subjects and since each defacing job is independen
 subject/session in the dataset using the `-n/--n-cpus` option. The following example command will run the pipeline occupying 10 processors at a time.
 
 ```bash
-python src/dsst_defacing_wf.py ${INPUT_DIR} ${OUTPUT_DIR} -n 10
-```
-
-### Option 3: Parallel defacing using `swarm`
-
-
-Assuming these scripts are run on the NIH HPC system, you can create a `swarm` file:
-
-```bash
-
-for i in `ls -d ${INPUT_DIR}/sub-*`; do \
-SUBJ=$(echo $i | sed "s|${INPUT_DIR}/||g" ); \
-echo "python dsst-defacing-pipeline/src/dsst_defacing_wf.py -i ${INPUT_DIR} -o ${OUTPUT_DIR} -p ${SUBJ}"; \
-done > defacing_parallel_subject_level.swarm
-```
-
-The above BASH "for loop" crawls through the dataset and finds all subject directories to construct `dsst_defacing_wf.py` commands
-with the `-p/--participant-label` option.
-
-Next you can run the swarm file with the following command:
-
-```bash
-swarm -f defacing_parallel_subject_level.swarm --merge-output --logdir ${OUTPUT_DIR}/swarm_log
-```
-
-### Option 4: In parallel at session level
-
-If the input dataset has multiple sessions per subject, then run the pipeline on every session in the dataset
-in parallel. Similar to Option 2, the following commands loop through the dataset to find subject and session IDs to
-create a `swarm` file to be run on NIH HPC systems.
-
-```bash
-for i in `ls -d ${INPUT_DIR}/sub-*`; do
-SUBJ=$(echo $i | sed "s|${INPUT_DIR}/||g" );
-for j in `ls -d ${INPUT_DIR}/${SUBJ}/ses-*`; do
-SESS=$(echo $j | sed "s|${INPUT_DIR}/${SUBJ}/||g" )
-echo "python dsst-defacing-pipeline/src/dsst_defacing_wf.py -i ${INPUT_DIR} -o ${OUTPUT_DIR} -p ${SUBJ} -s ${SESS}";
-done;
-done > defacing_parallel_session_level.swarm
-```
-
-To run the swarm file, once created, use the following command:
-
-```bash
-swarm -f defacing_parallel_session_level.swarm --merge-output --logdir ${OUTPUT_DIR}/swarm_log
-```
-
-## Using `generate_renders.py`
-
-Generate 3D renders for every defaced image in the output directory.
-
-```bash
-python dsst-defacing-pipeline/src/generate_renders.py -o ${OUTPUT_DIR}
-```
+# activate your conda environment
+conda activate dsstdeface
+
+# once your conda environment is active, execute the following
+python src/run.py ${INPUT_DIR} ${OUTPUT_DIR} -n 10
+```
+
+Additionally, the pipeline can be run on a single subject or session using the `-p/--participant-label` and `-s/--session-id` options, respectively.
 
 ## Visual Inspection
 
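The removed `swarm` options above were NIH-HPC-specific, but the same per-subject fan-out can be reproduced with a few lines of Python. The sketch below is hypothetical: it assumes the new `run.py` CLI shown in the updated help text, and the dataset paths are placeholders.

```python
# Hypothetical sketch of the per-subject fan-out that the removed swarm
# example provided, using the renamed run.py and its positional arguments.
import subprocess
from pathlib import Path

input_dir = Path("/path/to/BIDS/input/dataset")  # placeholder
output_dir = Path("/path/to/defacing/output")    # placeholder

for subj_dir in sorted(input_dir.glob("sub-*")):
    # --participant-label expects the label without the "sub-" prefix
    label = subj_dir.name.removeprefix("sub-")
    subprocess.run(
        ["python", "src/run.py", str(input_dir), str(output_dir),
         "--participant-label", label],
        check=True,
    )
```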
34 changes: 31 additions & 3 deletions src/deface.py
@@ -1,7 +1,7 @@
 import gzip
+import os
 import re
 import shutil
-import os
 import subprocess
 from os import fspath
 from pathlib import Path
@@ -69,8 +69,25 @@ def copy_over_sidecar(scan_filepath, input_anat_dir, output_anat_dir):
         shutil.copy2(json_sidecar, output_anat_dir / filename)
 
 
+def generate_3d_renders(defaced_img, render_outdir):
+    rotations = [(45, 5, 10), (-45, 5, 10)]
+    for idx, rot in enumerate(rotations):
+        yaw, pitch, roll = rot[0], rot[1], rot[2]
+        outfile = render_outdir.joinpath('defaced_render_0' + str(idx) + '.png')
+        if not outfile.exists():
+            if 'T2w' in render_outdir.parts:
+                fsleyes_render_cmd = f"fsleyes render --scene 3d -rot {yaw} {pitch} {roll} --outfile {outfile} {defaced_img} -dr 80 1000 -in spline -cm render1 -bf 0.3 -r 100 -ns 500;"
+            else:
+                fsleyes_render_cmd = f"fsleyes render --scene 3d -rot {yaw} {pitch} {roll} --outfile {outfile} {defaced_img} -dr 20 250 -in spline -cm render1 -bf 0.3 -r 100 -ns 500;"
+            cmd = f"export TMP_DISPLAY=$DISPLAY; unset DISPLAY; {fsleyes_render_cmd} export DISPLAY=$TMP_DISPLAY"
+
+            print(cmd)
+            run_command(cmd, "")
+            print(f"Has the render been created? {outfile.exists()}")
+
+
 def vqcdeface_prep(bids_input_dir, defaced_anat_dir, bids_defaced_outdir):
-    defacing_qc_dir = bids_defaced_outdir.parent / 'QC_prep' / 'defacing_QC'
+    defacing_qc_dir = bids_defaced_outdir.parent / 'defacing_QC'
     interested_files = [f for f in defaced_anat_dir.rglob('*.nii.gz') if
                         'work_dir' not in str(f).split('/')]
     print(interested_files)
@@ -119,7 +136,10 @@ def reorganize_into_bids(input_bids_dir, subj_dir, sess_dir, primary_t1, bids_de
     intermediate_files_dir = anat_dir / 'work_dir'
     intermediate_files_dir.mkdir(parents=True, exist_ok=True)
     for dirpath in anat_dir.glob('*'):
-        if dirpath.name.startswith('workdir') or dirpath.name.endswith('QC'):
+        if dirpath.name.startswith('workdir'):
+            new_name = '_'.join(['afni', dirpath.name])
+            shutil.move(str(dirpath), str(intermediate_files_dir / new_name))
+        elif dirpath.name.endswith('QC'):
             shutil.move(str(dirpath), str(intermediate_files_dir))
 
     vqcdeface_prep(input_bids_dir, anat_dir, bids_defaced_outdir)
@@ -209,6 +229,14 @@ def deface_primary_scan(input_bids_dir, subj_input_dir, sess_dir, mapping_dict,
     # reorganizing the directory with defaced images into BIDS tree
     reorganize_into_bids(input_bids_dir, subj_input_dir, sess_dir, primary_t1, output_dir, no_clean)
 
+    # prep for visual inspection using visualqc deface
+    print(f"Preparing for QC by visual inspection...\n")
+    defaced_imgs = list(output_dir.parent.rglob('defaced.nii.gz'))
+    for img in defaced_imgs:
+        generate_3d_renders(img, img.parent)
+
+    # print(f"All set to start visual inspection of defaced images!")
+
     return missing_refacer_outputs
 
 
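The new `generate_3d_renders` helper shells out to FSL's `fsleyes render`, stashing `$DISPLAY` and restoring it afterwards so the render happens off-screen. A minimal standalone sketch of that same pattern, with the flag values copied from the hunk above (the image path is a placeholder, and `fsleyes` is assumed to be on `PATH`):

```python
# Sketch of the off-screen fsleyes render trick used by generate_3d_renders.
# The input image is a placeholder; flags mirror the non-T2w branch above.
import subprocess

defaced_img = "sub-01/anat/defaced.nii.gz"  # placeholder
outfile = "defaced_render_00.png"
yaw, pitch, roll = 45, 5, 10  # first of the two rotations used by the helper

render = (f"fsleyes render --scene 3d -rot {yaw} {pitch} {roll} "
          f"--outfile {outfile} {defaced_img} "
          f"-dr 20 250 -in spline -cm render1 -bf 0.3 -r 100 -ns 500;")
# Stash DISPLAY, render headlessly, then restore it, as the helper does.
subprocess.run(
    f"export TMP_DISPLAY=$DISPLAY; unset DISPLAY; {render} export DISPLAY=$TMP_DISPLAY",
    shell=True, check=True,
)
```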
81 changes: 9 additions & 72 deletions src/generate_mappings.py
@@ -32,64 +32,6 @@ def run_command(cmdstr, logfile):
     subprocess.run(cmdstr, stdout=logfile, stderr=subprocess.STDOUT, encoding='utf8', shell=True)
 
 
-def primary_scans_qc_prep(mapping_dict, qc_prep):
-    """Prepares a directory tree with symbolic links to primary scans and an id_list of primary scans to be used in the
-    visualqc_t1_mri command.
-    :param defaultdict mapping_dict: A dictionary mapping each subject's and session's primary and other scans.
-    :param Path() outdir: Path to output directory provided by the user. The output directory contains this scripts output files and
-    directories.
-    :return str vqc_t1_mri_cmd: A visualqc T1 MRI command string.
-    """
-
-    interested_keys = ['primary_t1', 'others']
-    primaries = []
-    for subjid in mapping_dict.keys():
-
-        # check existence of sessions to query mapping dict
-        if list(mapping_dict[subjid].keys()) != interested_keys:
-            for sessid in mapping_dict[subjid].keys():
-                primary = mapping_dict[subjid][sessid]['primary_t1']
-                primaries.append(primary)
-        else:
-            primary = mapping_dict[subjid]['primary_t1']
-            primaries.append(primary)
-    # remove empty strings from primaries list
-    primaries = [p for p in primaries if p != '']
-
-    vqc_t1_mri = qc_prep / 't1_mri_QC'
-    vqc_t1_mri.mkdir(parents=True, exist_ok=True)
-
-    id_list = []
-    for primary in primaries:
-        entities = Path(primary).name.split('_')
-        subjid = entities[0]
-
-        # check existence of session to construct destination path
-        sessid = ""
-        for e in entities:
-            if e.startswith('ses-'):
-                sessid = e
-
-        dest = vqc_t1_mri / subjid / sessid / 'anat'
-        dest.mkdir(parents=True, exist_ok=True)
-
-        id_list.append(dest)
-        primary_link = dest / 'primary.nii.gz'
-        if not primary_link.is_symlink():
-            try:
-                primary_link.symlink_to(primary)
-            except:
-                pass
-
-    with open(vqc_t1_mri / 't1_mri_id_list.txt', 'w') as f:
-        f.write('\n'.join([str(i) for i in id_list]))
-
-    vqc_t1_mri_cmd = f"visualqc_t1_mri -u {vqc_t1_mri} -i {vqc_t1_mri / 't1_mri_id_list.txt'} -m primary.nii.gz"
-
-    return vqc_t1_mri_cmd
 
 
 def sort_by_acq_time(sidecars):
     """Sorting a list of scans' JSON sidecars based on their acquisition time.
@@ -187,9 +129,9 @@ def update_mapping_dict(mapping_dict, anat_dir, is_sessions, sidecars, t1_unavai
         sessid = anat_dir.parent.name
         if sessid not in mapping_dict[subjid]:
             mapping_dict[subjid][sessid] = {
-            'primary_t1': str(primary_t1),
-            'others': others
-            }
+                'primary_t1': str(primary_t1),
+                'others': others
+            }
 
         else:
             mapping_dict[subjid][sessid] = {
@@ -199,14 +141,14 @@
 
     else:
         mapping_dict[subjid] = {
-        'primary_t1': str(primary_t1),
-        'others': others
-        }
+            'primary_t1': str(primary_t1),
+            'others': others
+        }
 
     return mapping_dict, t1_unavailable, t1_available
 
 
-def summary_to_stdout(vqc_t1_cmd, sess_ct, t1s_found, t1s_not_found, no_anat_dirs, output):
+def summary_to_stdout(sess_ct, t1s_found, t1s_not_found, output):
     readable_path_list = ['/'.join([path.parent.name, path.name]) for path in t1s_not_found]
     print(f"====================")
     print(f"Dataset Summary")
@@ -222,7 +164,7 @@
 
 def crawl(input_dir, output):
     # make dir for log files and visualqc prep
-    dir_names = ['logs', 'QC_prep']
+    dir_names = ['logs', 'defacing_QC']
     for dir_name in dir_names:
         output.joinpath(dir_name).mkdir(parents=True, exist_ok=True)
@@ -253,10 +195,5 @@ def crawl(input_dir, output):
     with open(output / 'logs' / 'anat_unavailable.txt', 'w') as f3:
         f3.write('\n'.join([str(p) for p in no_anat_dirs]))
 
-    # write vqc command to file
-    vqc_t1_mri_cmd = primary_scans_qc_prep(mapping_dict, output / 'QC_prep')
-    with open(output / 'QC_prep' / 't1_mri_qc_cmd', 'w') as f4:
-        f4.write(f"{vqc_t1_mri_cmd}\n")
-
-    summary_to_stdout(vqc_t1_mri_cmd, total_sessions, t1s_found, t1s_not_found, no_anat_dirs, output)
+    summary_to_stdout(total_sessions, t1s_found, t1s_not_found, output)
     return mapping_dict
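For orientation, `mapping_dict`, which both edited functions consume, pairs each subject (and each session, when the dataset has sessions) with its primary T1 and the remaining anatomical scans. An illustrative shape inferred from the keys in the hunks above; the paths are invented:

```python
# Illustrative mapping_dict shape; keys come from the diff, paths are invented.
mapping_dict = {
    "sub-01": {  # subject with sessions
        "ses-01": {
            "primary_t1": "/bids/sub-01/ses-01/anat/sub-01_ses-01_T1w.nii.gz",
            "others": ["/bids/sub-01/ses-01/anat/sub-01_ses-01_T2w.nii.gz"],
        },
    },
    "sub-02": {  # subject without sessions
        "primary_t1": "/bids/sub-02/anat/sub-02_T1w.nii.gz",
        "others": [],
    },
}
```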
