Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tmp_dcm2bids/ #100

Closed
kousu opened this issue Nov 24, 2020 · 2 comments
Closed

tmp_dcm2bids/ #100

kousu opened this issue Nov 24, 2020 · 2 comments

Comments

@kousu
Copy link
Contributor

kousu commented Nov 24, 2020

dcm2bids assumes your target directory contains tmp_dcm2bids/$subject/. If it doesn't exist it crashes:

(.venv) [kousu@requiem Dcm2Bids]$ dcm2bids -d tests/data -c tests/data/config_test_not_case_sensitive_option.json -p 01
INFO:dcm2bids.dcm2bids:--- dcm2bids start ---
INFO:dcm2bids.dcm2bids:OS:version: Linux-5.9.9-arch1-1-x86_64-with-glibc2.2.5
INFO:dcm2bids.dcm2bids:python:version: 3.8.6 (default, Sep 30 2020, 04:00:38) [GCC 10.2.0]
INFO:dcm2bids.dcm2bids:dcm2bids:version: 2.1.4
INFO:dcm2bids.dcm2bids:dcm2niix:version: v1.0.20200331
INFO:dcm2bids.dcm2bids:participant: sub-01
INFO:dcm2bids.dcm2bids:session: 
INFO:dcm2bids.dcm2bids:config: /home/kousu/src/neuropoly/Dcm2Bids/tests/data/config_test_not_case_sensitive_option.json
INFO:dcm2bids.dcm2bids:BIDS directory: /home/kousu/src/neuropoly/Dcm2Bids
INFO:dcm2bids.utils:Running dcm2niix -b y -ba y -z y -f '%3s_%f_%p_%t' -o /home/kousu/src/neuropoly/Dcm2Bids/tmp_dcm2bids/sub-01 tests/data
Error: Unable to find any DICOM images in tests/data (or subfolders 5 deep)
Traceback (most recent call last):
  File "/home/kousu/src/neuropoly/Dcm2Bids/.venv/bin/dcm2bids", line 11, in <module>
    load_entry_point('dcm2bids==2.1.4', 'console_scripts', 'dcm2bids')()
  File "/home/kousu/src/neuropoly/Dcm2Bids/.venv/lib/python3.8/site-packages/dcm2bids-2.1.4-py3.8.egg/dcm2bids/dcm2bids.py", line 290, in main
  File "/home/kousu/src/neuropoly/Dcm2Bids/.venv/lib/python3.8/site-packages/dcm2bids-2.1.4-py3.8.egg/dcm2bids/dcm2bids.py", line 120, in run
  File "/home/kousu/src/neuropoly/Dcm2Bids/.venv/lib/python3.8/site-packages/dcm2bids-2.1.4-py3.8.egg/dcm2bids/dcm2niix.py", line 90, in run
  File "/home/kousu/src/neuropoly/Dcm2Bids/.venv/lib/python3.8/site-packages/dcm2bids-2.1.4-py3.8.egg/dcm2bids/dcm2niix.py", line 100, in execute
  File "/home/kousu/src/neuropoly/Dcm2Bids/.venv/lib/python3.8/site-packages/dcm2bids-2.1.4-py3.8.egg/dcm2bids/utils.py", line 112, in run_shell_command
  File "/usr/lib/python3.8/subprocess.py", line 411, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/lib/python3.8/subprocess.py", line 512, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['dcm2niix', '-b', 'y', '-ba', 'y', '-z', 'y', '-f', '%3s_%f_%p_%t', '-o', '/home/kousu/src/neuropoly/Dcm2Bids/tmp_dcm2bids/sub-01', 'tests/data']' returned non-zero exit status 2.

This is really confusing to me. It's not a parameter in the API:

Args:
dicom_dir (str or list): A list of folder with dicoms to convert
participant (str): Label of your participant
config (path): Path to a dcm2bids configuration file
output_dir (path): Path to the BIDS base folder
session (str): Optional label of a session
clobber (boolean): Overwrite file if already in BIDS folder
forceDcm2niix (boolean): Forces a cleaning of a previous execution of
dcm2niix
log_level (str): logging level

It's only mentioned in passing in the docs, but not that it's a parameter. All the docs imply that that folder will be created as a side-effect.

It will be made as a side-effect but only if the dcm2niix step runs:

commandTpl = "dcm2niix {} -o {} {}"
cmd = commandTpl.format(self.options, self.outputDir, dicomDir)

But this folder isn't actually necessary. If you put a single empty file in it dcm2bids runs fine:

(.venv) [kousu@requiem Dcm2Bids]$ mkdir -p tmp_dcm2bids/sub-01/
(.venv) [kousu@requiem Dcm2Bids]$ touch tmp_dcm2bids/sub-01/.nothing
(.venv) [kousu@requiem Dcm2Bids]$ dcm2bids -d tests/data -c tests/data/config_test_not_case_sensitive_option.json -p 01
INFO:dcm2bids.dcm2bids:--- dcm2bids start ---
INFO:dcm2bids.dcm2bids:OS:version: Linux-5.9.9-arch1-1-x86_64-with-glibc2.2.5
INFO:dcm2bids.dcm2bids:python:version: 3.8.6 (default, Sep 30 2020, 04:00:38) [GCC 10.2.0]
INFO:dcm2bids.dcm2bids:dcm2bids:version: 2.1.4
INFO:dcm2bids.dcm2bids:dcm2niix:version: v1.0.20200331
INFO:dcm2bids.dcm2bids:participant: sub-01
INFO:dcm2bids.dcm2bids:session: 
INFO:dcm2bids.dcm2bids:config: /home/kousu/src/neuropoly/Dcm2Bids/tests/data/config_test_not_case_sensitive_option.json
INFO:dcm2bids.dcm2bids:BIDS directory: /home/kousu/src/neuropoly/Dcm2Bids
WARNING:dcm2bids.dcm2niix:Previous dcm2niix directory output found:
WARNING:dcm2bids.dcm2niix:/home/kousu/src/neuropoly/Dcm2Bids/tmp_dcm2bids/sub-01
WARNING:dcm2bids.dcm2niix:Use --forceDcm2niix to rerun dcm2niix
INFO:dcm2bids.sidecar:Sidecars pairing:
INFO:dcm2bids.dcm2bids:moving acquisitions into BIDS folder
WARNING:dcm2bids.version:Your using dcm2niix version v1.0.20200331
WARNING:dcm2bids.version:A new version exists : v1.0.20201102
WARNING:dcm2bids.version:Check https://github.com/rordenlab/dcm2niix

In fact, the test circumvents this by tricking itself:

tmpSubDir = os.path.join(bidsDir.name, DEFAULT.tmpDirName, "sub-01")
shutil.copytree(os.path.join(TEST_DATA_DIR, "sidecars"), tmpSubDir)

This works because dcm2bids.dcm2niix.Dcm2niix simply checks if any file exists, not if they make sense:

try:
oldOutput = os.listdir(self.outputDir) != []
except:
oldOutput = False

So I think we should make sure this folder isn't necessary.. It's weird that dcm2niix is tied to the logging stream, and it's weird that despite the check above nothing else actually makes use of those files. I haven't thought or read through all the implications, but this seems really strange to me!

I propose that you'd want these features:

  • running dcm2bids on a pre-BIDSified dataset, should run successfully and change nothing or
  • running dcm2bids on a dataset of non-BIDSified NIfTI files should succeed. I've heard rumours this is already true?
    • these two mean that dcm2niix should only be an optional step, and therefore tmp_dcm2bids/ should be created on-demand
    • and the name should be something like "bidsify" not "dcm2bids". Maybe just "2bids"?
  • running dcm2bids multiple times on the same input should run successfully and change nothing
  • the logfile should sit at the root of the output, or outside of the dataset entirely
    • (in fact, maybe a separate issue, but I'd drop the logfile entirely; just log to stdout, and let the user decide to save it or not)
@kousu
Copy link
Contributor Author

kousu commented Nov 24, 2020

Actually I had this backwards! No wonder I was confused.

It turns out that tmp_dcm2bids/ is in fact where the data is read from. Consider:

From current master:

$ git rev-parse HEAD
14ff8b1443010ab88ca32645c38cdc71c42499db
$ git diff
diff --git a/tests/test_dcm2bids.py b/tests/test_dcm2bids.py
index 278b881..00b14d8 100644
--- a/tests/test_dcm2bids.py
+++ b/tests/test_dcm2bids.py
@@ -8,6 +8,8 @@ from bids import BIDSLayout
 from dcm2bids import Dcm2bids
 from dcm2bids.utils import DEFAULT, load_json
 
+import subprocess
+
 
 TEST_DATA_DIR = os.path.join(os.path.dirname(__file__), "data")
 
@@ -27,6 +29,8 @@ def test_dcm2bids():
         bidsDir.name,
     )
     app.run()
+    subprocess.run(["tree", bidsDir.name], check=True)
+
     layout = BIDSLayout(bidsDir.name, validate=False)
 
     assert layout.get_subjects() == ["01"]

I see this folder structure, as expected:

$ py.test  -s tests/test_dcm2bids.py::test_dcm2bids
====================================================================================== test session starts =======================================================================================
platform linux -- Python 3.8.6, pytest-6.1.2, py-1.9.0, pluggy-0.13.1
rootdir: /home/kousu/src/neuropoly/Dcm2Bids
plugins: flake8-1.0.6, black-0.3.12, pylint-0.18.0, cov-2.10.1
collected 1 item                                                                                                                                                                                 

tests/test_dcm2bids.py /tmp/tmpv3hspibq
├── sub-01
│   ├── anat
│   │   └── sub-01_T1w.json
│   ├── dwi
│   │   └── sub-01_dwi.json
│   ├── fmap
│   │   ├── sub-01_echo-492_fmap.json
│   │   └── sub-01_echo-738_fmap.json
│   ├── func
│   │   └── sub-01_task-rest_bold.json
│   └── localizer
│       ├── sub-01_run-01_localizer.json
│       ├── sub-01_run-02_localizer.json
│       └── sub-01_run-03_localizer.json
└── tmp_dcm2bids
    ├── log
    │   └── sub-01_2020-11-24T001126.433043.log
    └── sub-01
        ├── 005_DTI_20100603125600.json
        ├── 006_DTI_20100603125600.json
        ├── 007_DTI_20100603125600.json
        └── 011_gre_field_mapping_20100603125600_e2_ph.json

9 directories, 13 files
.

[.....]

but if I skip copying the metadata to tmp_dcm2bids/:

$ git diff
diff --git a/tests/test_dcm2bids.py b/tests/test_dcm2bids.py
index 278b881..1044e46 100644
--- a/tests/test_dcm2bids.py
+++ b/tests/test_dcm2bids.py
@@ -8,6 +8,8 @@ from bids import BIDSLayout
 from dcm2bids import Dcm2bids
 from dcm2bids.utils import DEFAULT, load_json
 
+import subprocess
+
 
 TEST_DATA_DIR = os.path.join(os.path.dirname(__file__), "data")
 
@@ -18,7 +20,8 @@ def test_dcm2bids():
     bidsDir = TemporaryDirectory()
 
     tmpSubDir = os.path.join(bidsDir.name, DEFAULT.tmpDirName, "sub-01")
-    shutil.copytree(os.path.join(TEST_DATA_DIR, "sidecars"), tmpSubDir)
+    os.makedirs(tmpSubDir, exist_ok=True)
+    os.mknod(os.path.join(tmpSubDir, '.nothing')) # cirvumvent
 
     app = Dcm2bids(
         [TEST_DATA_DIR],
@@ -27,6 +30,8 @@ def test_dcm2bids():
         bidsDir.name,
     )
     app.run()
+    subprocess.run(["tree", bidsDir.name], check=True)
+
     layout = BIDSLayout(bidsDir.name, validate=False)
 
     assert layout.get_subjects() == ["01"]

then I get nonsense:

$ py.test  -s tests/test_dcm2bids.py::test_dcm2bids
====================================================================================== test session starts =======================================================================================
platform linux -- Python 3.8.6, pytest-6.1.2, py-1.9.0, pluggy-0.13.1
rootdir: /home/kousu/src/neuropoly/Dcm2Bids
plugins: flake8-1.0.6, black-0.3.12, pylint-0.18.0, cov-2.10.1
collected 1 item                                                                                                                                                                                 

tests/test_dcm2bids.py /tmp/tmp8juel2ls
└── tmp_dcm2bids
    ├── log
    │   └── sub-01_2020-11-24T002417.898250.log
    └── sub-01

3 directories, 1 file

[.....]

That clears some things up for me. That, of course, fits with what I know from #98.

So I think the main bug here is that in the case where dcm2niix is skipped:

oldOutput = False

then copying in the sidecars is also skipped, making the whole process a no-op. The only reason it usually works is because that folder is expected to be populated by dcm2niix with both images AND sidecars. But if that fails for some reason then things get wonky. This is an example of how stateful programming is really hard to get right: each run of dcm2bids depends on the state left behind on disk from previous runs. I think we should either add shutil.copytree(sidecars, 'tmp_dcm2bids') in to that False case, or tell dcm2niix to not copy the sidecars and just always shutil.copytree(sidecars, 'tmp_dcm2bids') regardless of whether dcm2niix runs.

And it's a secondary bug that you can't run this on a folder of un-BIDSified NIfTIs. Fixing both will require rethinking the same parts of the code.

@arnaudbore
Copy link
Contributor

  1. dcm2bids was never meant to be a universal converter (only purpose take dicom folder and build bids structure).
  2. When dcm2bids find a previous run of dcm2niix into tmp_dcm2bids/sub-ID it does not run dcm2niix (does not create new nii images or json file). There is a way to force dcm2niix to be run and I think it should stay that way.
  3. Even if agree that this behavior should be more documented, as a user, this makes sense and allow to be more flexible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants