
Commit

docs: add initial user docs
mikix committed Jan 18, 2024
1 parent c55f057 commit 66d6a15
Showing 8 changed files with 372 additions and 151 deletions.
4 changes: 4 additions & 0 deletions .github/pull_request_template.md
@@ -0,0 +1,4 @@

### Checklist
- [ ] Consider if documentation (like in `docs/`) needs to be updated
- [ ] Consider if tests should be added
30 changes: 30 additions & 0 deletions CONTRIBUTING.md
@@ -1,5 +1,35 @@
# Contributing to Chart Review

First off, thank you!
Read on below for tips on getting involved with the project.

## Talk to Us

If something annoys you, it probably annoys other folks too.
Don't be afraid to suggest changes or improvements!

Not every suggestion will align with project goals,
but even if not, it can help to talk it out.

Look at [open issues](https://github.com/smart-on-fhir/chart-review/issues),
and if you don't see your concern,
[file a new issue](https://github.com/smart-on-fhir/chart-review/issues/new)!

## Set up your dev environment

To use the same dev environment as us, you'll want to run these commands:
```sh
pip install .[dev]
pre-commit install
```

This will install dependencies & build tools,
as well as set up a `black` auto-formatter commit hook.
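If you want to run the same checks by hand (say, before your first commit), something like this should work, assuming the standard `pre-commit` and `black` command-line tools installed by the step above:
```sh
# Run every configured hook against the whole repository
pre-commit run --all-files

# Or invoke the formatter directly
black .
```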

## Vocabulary

Here is a quick introduction to some terminology you'll see in the source code.

### Labels
- **Label**: a tag that can be applied to a word, like "Fever" or "Ideation".
These are often applied by humans during a chart review in Label Studio,
176 changes: 25 additions & 151 deletions README.md
@@ -1,165 +1,39 @@
# Chart Review

**Measure agreement between chart annotations.**

Whether your chart annotations come from humans, machine learning, or coded data like ICD-10,
`chart-review` can compare them to reveal interesting statistics like:

**Accuracy**
* F1-score ([agreement](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1090460/))
* [Sensitivity and Specificity](https://en.wikipedia.org/wiki/Sensitivity_and_specificity)
* [Positive (PPV) or Negative Predictive Value (NPV)](https://en.wikipedia.org/wiki/Positive_and_negative_predictive_values#Relationship)
* False Negative Rate (FNR)

**Confusion Matrix**
* TP = True Positive
* TN = True Negative
* FP = False Positive (type I error)
* FN = False Negative (type II error)

**Power Calculations** for sample size estimation
* Power = 1 - FNR
* FNR = FN / (FN + TP)
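For example, using the overall (`*`) row of the sample output below (TP = 4, FN = 1): FNR = 1 / (1 + 4) = 0.2, so Power = 1 - 0.2 = 0.8, which as expected equals the reported sensitivity.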

## Example

**Chart review** here means "reading" and "annotating" (highlighting) medical notes to measure the accuracy of a measurement.
Such measurements can establish the reliability of ICD-10 coding, or the utility of NLP to automate a labor-intensive process.

Agreement among 2+ human subject matter expert reviewers is considered the de facto gold standard for ground-truth labeling, but cannot be done manually at scale.

The most common chart review measures agreement on the _**class_label**_ across a curated list of notes:
* 1 human reviewer _vs_ ICD-10 codes
* 1 human reviewer _vs_ NLP results
* 2 human reviewers _vs_ each other

```shell
$ ls
config.yaml labelstudio-export.json

$ chart-review accuracy jane john
accuracy-jane-john:
F1     Sens  Spec  PPV  NPV  TP  FN  TN  FP  Label
0.889  0.8   1.0   1.0  0.5  4   1   1   0   *
1.0    1.0   1.0   1.0  1.0  1   0   1   0   Cough
0      0     0     0    0    2   0   0   0   Fatigue
0      0     0     0    0    1   1   0   0   Headache
```

---
## Install
1. Clone this repo.
2. Install it locally like so: `pipx install .`

`chart-review` is not yet released on PyPI.
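For example, a typical local install might look like this (assuming `git` and `pipx` are already available on your machine):
```shell
git clone https://github.com/smart-on-fhir/chart-review.git
cd chart-review
pipx install .
```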

---
### How to Run

#### Set Up Project Folder

Chart Review operates on a project folder that holds your config & data.
1. Make a new folder.
2. Export your Label Studio annotations and put that in the folder as `labelstudio-export.json`.
3. Add a `config.yaml` file (or `config.json`) that looks something like this (read more on this format below):

```yaml
labels:
  - cough
  - fever

annotators:
  jane: 2
  john: 6
  jack: 8

ranges:
  jane: 242-250  # inclusive
  john: [260-271, 277]
  jack: [jane, john]
```
#### Run
Call `chart-review` with the sub-command you want and its arguments:

For Jane as truth for Jack's annotations:
```shell
chart-review accuracy jane jack
```

For Jack as truth for John's annotations:
```shell
chart-review accuracy jack john
```

Pass `--help` to see more options.

---
### Config File Format

`config.yaml` defines study-specific variables.

* Class labels: `labels: ['cough', 'fever']`
* Annotators: `annotators: {'jane': 3, 'john': 8}`
* Note ranges: `ranges: {'jane': 40-50, 'john': [2, 3, 4, 5]}`

`annotators` maps a name to a Label Studio User ID:
* a human subject matter expert, _like_ `jane`
* a computer method, _like_ `nlp`
* a coded data source, _like_ `icd10`

`ranges` maps a name to a selection of Note IDs from the corpus:
* `corpus: start:end`
* `annotator1_vs_2: [list, of, notes]`
* `annotator2_vs_3: corpus`
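Putting those pieces together, a fuller config might look like the sketch below (the names and ID numbers are placeholders, and the exact range syntax you need may differ from this illustration):
```shell
$ cat config.yaml
labels:
  - cough
  - fever

annotators:
  jane: 2   # Label Studio User IDs
  john: 6
  jack: 8

ranges:
  corpus: 242-250       # inclusive range of Note IDs
  jane: corpus          # a named range can reference another range
  john: [260, 261, 271]
  jack: [jane, john]    # or reference other annotators' ranges
```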

#### External Annotations

You may have annotations from NLP or coded FHIR data that you want to compare against.
Easy!

Set up your config to point at a CSV file in your project folder that holds two columns:
- DocRef ID (real or anonymous)
- Label

```yaml
annotators:
  human: 1
  external_nlp:
    filename: my_nlp.csv
```

When `chart-review` runs, it will inject the external annotations and match up the DocRef IDs
to Label Studio notes based on metadata in your Label Studio export.
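For instance, the external annotations file might look something like this (the DocRef IDs and labels here are invented for illustration, and the exact header names `chart-review` expects may differ):
```shell
$ cat my_nlp.csv
docref_id,label
a1b2c3d4,cough
a1b2c3d4,fever
e5f6a7b8,cough
```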

---
**BASE COHORT METHODS**

`cohort.py`
* from chart_review import _labelstudio_, _mentions_, _agree_

class **Cohort** defines the base class to analyze study cohorts.
* init(`config.py`)

`simplify.py`
* **rollup**(...) : return _LabelStudioExport_ with 1 "rollup" annotation replacing individual mentions

`term_freq.py` (methods are rarely used currently)
* overlaps(...) : test if two mentions overlap (True/False)
* calc_term_freq(...) : term frequency of highlighted mention text
* calc_term_label_confusion : report of exact mentions with 2+ class_labels

`agree.py` computes a confusion matrix comparing two annotators {truth, annotator}
* **confusion_matrix** (truth, annotator, ...) returns List[TruePos, TrueNeg, FalsePos, FalseNeg]
* **score_matrix** (matrix) returns dict with keys {F1, Sens, Spec, PPV, NPV, TP, FP, TN, FN}

`labelstudio.py` handles LabelStudio JSON

Class **LabelStudioExport**
* init(`labelstudio-export.json`)

Class **LabelStudioNote**
* init(...)

`publish.py` tables and figures for PubMed manuscripts
* table_csv(...)
* table_json(...)

---
**NICE TO HAVES LATER**

* **_confusion matrix_** type support using Pandas
* **score_matrix** would be nicer as a strongly typed Pandas class

---
### Set up your dev environment

To use the same dev environment as us, you'll want to run these commands:
```sh
pip install .[dev]
pre-commit install
```
6 changes: 6 additions & 0 deletions docs/README.md
@@ -0,0 +1,6 @@
# Chart Review Documentation

These documents are meant to be built as one part of the larger body of
[Cumulus documentation](https://docs.smarthealthit.org/cumulus).

To test changes here locally, read more at the [Cumulus docs repo](https://github.com/smart-on-fhir/cumulus).
46 changes: 46 additions & 0 deletions docs/accuracy.md
@@ -0,0 +1,46 @@
---
title: Accuracy Command
parent: Chart Review
nav_order: 5
# audience: lightly technical folks
# type: how-to
---

# The Accuracy Command

The `accuracy` command will print agreement statistics like F1 scores and confusion matrices
for every label in your project, between two annotators.

Provide two annotator names (the first name will be considered the ground truth) and
your accuracy scores will be printed to the console.

## Example

```shell
$ chart-review accuracy jane john
accuracy-jane-john:
F1     Sens   Spec   PPV    NPV    TP  FN  TN  FP  Label
0.929  0.958  0.908  0.901  0.961  91  4   99  10  *
0.895  0.895  0.938  0.895  0.938  17  2   30  2   cough
0.815  0.917  0.897  0.733  0.972  11  1   35  4   fever
0.959  1.0    0.812  0.921  1.0    35  0   13  3   headache
0.966  0.966  0.955  0.966  0.955  28  1   21  1   stuffy-nose
```

## Options

### `--config=PATH`

Use this to point to a secondary (non-default) config file.
Useful if you have multiple label setups (e.g. one grouped into a binary label and one not).

### `--project-dir=DIR`

Use this to run `chart-review` outside of your project dir.
Config files, external annotations, etc. will be looked for in that directory.

### `--save`

Use this to write a JSON and CSV file to the project directory,
rather than printing to the console.
Useful for passing results around in a machine-parsable format.
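For example, the options can be combined on one command line (the project path below is just a placeholder, and exact option placement may vary):
```shell
chart-review accuracy --project-dir=/path/to/project --save jane john
```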