Flexible assignments of cells to treatment and control groups #163

ekatsevi · 2024-12-30T20:53:38Z

Overview

The main goal of this pull request is to let users avoid filtering out cells with zero or two or more gRNAs, which is helpful in settings where the MOI is low but not low enough for nearly every cell to contain exactly one gRNA. For example, cells with multiple NT gRNAs can still safely be considered control cells, and cells with one targeting gRNA and one or more NT gRNAs can still safely be considered treatment cells.

Updates to API

There is now an optional argument to set_analysis_parameters() called treatment_group. The two options are:
- inclusive: any cell containing a gRNA with a given target is a treatment cell (default in high MOI)
- exclusive: only cells containing a gRNA with a given target but no other targeting gRNAs are treatment cells (default in low MOI)
There is now an optional logical argument to run_qc() called remove_cells_w_zero_or_twoplus_grnas (defaulting to TRUE for low MOI and FALSE for high MOI). I have set the low-MOI default to TRUE for backward compatibility, but I recommend setting it to FALSE in order to avoid throwing out cells unnecessarily.
The data frames returned by the calibration check, power check, and discovery analysis have two additional columns called n_trt and n_cntrl, giving the number of cells used in the treatment and control groups, respectively.
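
The two treatment_group options and the effect of the QC filter can be illustrated on toy data. The sketch below is Python written for illustration only and is not sceptre code: the cell names, the gRNA names, and the helpers treatment_cells() and cells_with_exactly_one_grna() are all hypothetical. It also assumes that "no other targeting gRNAs" means no gRNAs aimed at a different target.

```python
# Hypothetical toy data (not from sceptre): gRNA -> target, and cell -> gRNAs detected.
GRNA_TARGET = {
    "t1a": "T1",
    "t2a": "T2",
    "nt1": "non-targeting",
    "nt2": "non-targeting",
}

CELLS = {
    "c1": {"t1a"},          # one targeting gRNA
    "c2": {"t1a", "nt1"},   # one targeting gRNA plus one NT gRNA
    "c3": {"t1a", "t2a"},   # two gRNAs with different targets
    "c4": {"nt1", "nt2"},   # two NT gRNAs
    "c5": set(),            # no gRNA detected
}


def treatment_cells(cells, target, treatment_group):
    """Return the sorted cells treated for `target` under the given rule.

    Assumption: "exclusive" rejects cells carrying a gRNA for a different
    target, while NT gRNAs are ignored in both modes.
    """
    selected = []
    for cell, grnas in cells.items():
        targets = {GRNA_TARGET[g] for g in grnas} - {"non-targeting"}
        if target not in targets:
            continue
        if treatment_group == "inclusive" or targets == {target}:
            selected.append(cell)
    return sorted(selected)


def cells_with_exactly_one_grna(cells):
    """Cells that would survive a filter removing zero-gRNA and 2+-gRNA cells."""
    return sorted(cell for cell, grnas in cells.items() if len(grnas) == 1)


inclusive = treatment_cells(CELLS, "T1", "inclusive")
exclusive = treatment_cells(CELLS, "T1", "exclusive")
kept_by_filter = cells_with_exactly_one_grna(CELLS)
```

On this toy data, the inclusive rule selects c1, c2, and c3, the exclusive rule selects c1 and c2, and a filter on exactly one gRNA would keep only c1, discarding both c2 (a usable treatment cell) and c4 (a usable control cell).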

Limitation: The calibration check is not supported for treatment_group = "inclusive", control_group = "nt_cells", and remove_cells_w_zero_or_twoplus_grnas = FALSE. In this case, a proper calibration check could label as undercover some cells that contain both the undercover NT gRNAs and targeting gRNAs. Such cells are not NT cells, so they break the software's assumption that the undercover cells used in the calibration check are a subset of the NT cells (this assumption underlies, for example, the reindexing of indiv_nt_grna_idxs with respect to all_nt_idxs).
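
To make the broken subset assumption concrete, here is a minimal Python sketch on hypothetical data (not sceptre code). It assumes that an NT cell is a cell with at least one NT gRNA and no targeting gRNAs, and that under the inclusive rule an undercover cell is any cell containing the undercover NT gRNA.

```python
# Hypothetical toy data: cell -> gRNAs detected; names starting with "nt" are non-targeting.
cells = {
    "c1": {"nt1"},
    "c2": {"nt1", "nt2"},
    "c3": {"nt1", "t1a"},  # undercover NT gRNA together with a targeting gRNA
    "c4": {"nt2"},
    "c5": {"t1a"},
}


def is_nt(grna):
    """Naming convention assumed for this sketch only."""
    return grna.startswith("nt")


# Assumed definition: NT cells contain at least one NT gRNA and no targeting gRNAs.
nt_cells = {c for c, grnas in cells.items() if grnas and all(is_nt(g) for g in grnas)}

# Under the inclusive rule, treat "nt1" as the undercover gRNA.
undercover_cells = {c for c, grnas in cells.items() if "nt1" in grnas}

# Cells that are undercover but not NT cells: these violate the subset assumption.
offending = sorted(undercover_cells - nt_cells)
is_subset = undercover_cells <= nt_cells
```

Here c3 is an undercover cell but not an NT cell, so the undercover cells cannot be reindexed relative to the NT cells.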

Updates to back end

The helper function process_initial_assignment_list() now implements the logic of the treatment group by updating grna_group_idxs accordingly. This function also calculates all_nt_idxs and adds it to grna_assignments_raw. Before, only grna_assignments had the field all_nt_idxs. Note that all_nt_idxs can no longer be constructed as the union of indiv_nt_grna_idxs: for treatment_group = "inclusive", indiv_nt_grna_idxs can contain cells that also have targeting gRNAs, whereas all_nt_idxs by definition includes only cells with no targeting gRNAs.
Input checks were updated to reflect the changes to the API.
The helper function update_indiv_grna_assignments_for_nt_cells() was updated in order to remove the assumption that the entries of indiv_nt_grna_idxs are nonoverlapping. Indeed, a cell containing two NT gRNAs would appear in the lists for both of these gRNAs.
The helper function add_num_cells_to_result() was added in order to compute n_trt and n_cntrl; it is now called in run_calibration_check(), run_power_check(), and run_discovery_analysis().
The helper C++ function compute_nt_nonzero_matrix_and_n_ok_pairs_v3() was modified to correctly compute the QC metrics when the entries of indiv_nt_grna_idxs overlap.
Some testthat tests were added to test the new functionality. Another set of tests, which must be run manually, was added to tests/manual/test-flexible-cell-assignments.R. These tests ensure that running the new version of sceptre with defaults does not change the outputs (except for the addition of n_trt and n_cntrl) on the low- and high-MOI example data, compared to the existing version. They needed to be outside the testthat framework because they involve running two different versions of sceptre.
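
The index bookkeeping described in the back-end notes above can be sketched on toy data. The Python below is illustrative only and is not the package's implementation: the variable names merely mirror the sceptre fields, indices are 0-based, and the data and the is_nt() naming convention are hypothetical. It shows, under the inclusive rule, why all_nt_idxs differs from the union of indiv_nt_grna_idxs and why summing per-gRNA list lengths double counts cells that carry two NT gRNAs.

```python
# Hypothetical toy data: position in the list is the cell index.
cell_grnas = [
    {"nt1"},         # 0: NT cell
    {"nt1", "nt2"},  # 1: NT cell with two NT gRNAs
    {"nt2", "t1a"},  # 2: NT gRNA plus a targeting gRNA
    {"t1a"},         # 3: targeting gRNA only
]


def is_nt(grna):
    """Naming convention assumed for this sketch only."""
    return grna.startswith("nt")


# Per-NT-gRNA cell lists under the inclusive rule (any cell containing the gRNA).
indiv_nt_grna_idxs = {}
for idx, grnas in enumerate(cell_grnas):
    for grna in sorted(grnas):
        if is_nt(grna):
            indiv_nt_grna_idxs.setdefault(grna, []).append(idx)

# Assumed definition: NT cells have at least one NT gRNA and no targeting gRNAs.
all_nt_idxs = [
    idx for idx, grnas in enumerate(cell_grnas)
    if grnas and all(is_nt(g) for g in grnas)
]

# The union of the per-gRNA lists picks up cell 2, which is not an NT cell.
union_idxs = sorted(set().union(*indiv_nt_grna_idxs.values()))

# Summing list lengths counts cell 1 twice; counting distinct cells does not.
summed_count = sum(len(idxs) for idxs in indiv_nt_grna_idxs.values())
distinct_count = len(union_idxs)
```

In this example the per-gRNA lists are [0, 1] and [1, 2], their union is [0, 1, 2], and all_nt_idxs is [0, 1]; the summed count is 4 while only 3 distinct cells are involved.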

Limitations: The Nextflow pipeline, and any functions inside the R package pertaining to the Nextflow pipeline, were not updated.

ekatsevi added 4 commits December 13, 2024 17:08

Initial attempt to implement flexible cell assignments

ec39fc9

Added num treatment and control cells to output, updated print functi…

2697956

…on, updated other aspects of package to be compatible with flexible treatment and control groups

Passed existing tests

5b270d0

Added some tests for new functionality and updated documentation

52b14c7

ekatsevi requested a review from timothy-barry December 30, 2024 20:53

ekatsevi added 3 commits January 17, 2025 21:42

Merge pull request #172 from Katsevich-Lab/dev

082a104

Dev

Completed merge

ef052ab

Updated manual test

ba22f02

ekatsevi closed this Jan 30, 2025

ekatsevi removed the request for review from timothy-barry January 30, 2025 16:24
