Flexible assignments of cells to treatment and control groups #163

Closed
wants to merge 7 commits into from

@ekatsevi ekatsevi commented Dec 30, 2024

Overview

The main goal of this pull request is to allow users to avoid filtering out cells with zero gRNAs or with two or more gRNAs, which is helpful in settings where the MOI is low, but not so low that almost every cell carries exactly one gRNA. For example, cells with multiple NT gRNAs can safely still be considered control cells, and cells with one targeting gRNA and one or more NT gRNAs can safely still be considered treatment cells.

Updates to API

  1. There is now an optional argument to set_analysis_parameters() called treatment_group. The two options are:

    • inclusive: any cell containing a gRNA with a given target is a treatment cell (default in high MOI)
    • exclusive: only cells containing a gRNA with a given target but no other targeting gRNAs are treatment cells (default in low MOI)
  2. There is now an optional logical argument to run_qc() called remove_cells_w_zero_or_twoplus_grnas (defaulting to TRUE for low MOI and FALSE for high MOI). I have set the low-MOI default to TRUE for backward compatibility, but I recommend setting it to FALSE in order to avoid throwing out cells unnecessarily.

  3. The data frames outputted by the calibration check, power check, and discovery analysis have two additional columns called n_trt and n_cntrl, giving the number of cells used in the treatment and control groups, respectively.
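
The two treatment_group options and the new n_trt and n_cntrl counts can be illustrated on toy data. This is a minimal Python sketch, not the sceptre implementation: the dictionaries, cell names, and helper functions are hypothetical. It also assumes that "exclusive" means no gRNA aimed at a different target, and that a control (NT) cell has at least one NT gRNA and no targeting gRNA.

```python
# Toy sketch only: names and data layout are hypothetical, not sceptre internals.
NT = "non-targeting"

# Hypothetical gRNA -> target map and the set of gRNAs detected in each cell.
grna_target = {"g1": "geneA", "g2": "geneB", "nt1": NT, "nt2": NT}
cell_grnas = {
    "c1": {"g1"},          # one targeting gRNA
    "c2": {"g1", "nt1"},   # one targeting gRNA plus an NT gRNA
    "c3": {"g1", "g2"},    # gRNAs for two different targets
    "c4": {"nt1", "nt2"},  # two NT gRNAs
    "c5": set(),           # no gRNA detected
}


def targets_in(cell):
    """Targets hit by the targeting gRNAs in a cell (NT gRNAs ignored)."""
    return {grna_target[g] for g in cell_grnas[cell]} - {NT}


def treatment_cells(target, treatment_group):
    if treatment_group not in ("inclusive", "exclusive"):
        raise ValueError("treatment_group must be 'inclusive' or 'exclusive'")
    cells = set()
    for cell in cell_grnas:
        targets = targets_in(cell)
        if target not in targets:
            continue
        # Assumption: "exclusive" means no gRNA aimed at a different target.
        if treatment_group == "inclusive" or targets == {target}:
            cells.add(cell)
    return cells


def nt_cells():
    """Assumed control definition: at least one NT gRNA, no targeting gRNA."""
    return {c for c, gs in cell_grnas.items() if gs and not targets_in(c)}


for mode in ("inclusive", "exclusive"):
    trt = treatment_cells("geneA", mode)
    print(mode, "n_trt =", len(trt), "n_cntrl =", len(nt_cells()))
```

In this toy data, cell c3 (two different targets) counts as a geneA treatment cell only under "inclusive", while c2 (one targeting gRNA plus an NT gRNA) counts under both options.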

Limitation: The calibration check is not supported for the combination treatment_group = "inclusive", control_group = "nt_cells", and remove_cells_w_zero_or_twoplus_grnas = FALSE. In this case, a proper calibration check could label as undercover some cells that contain both the undercover NT gRNAs and targeting gRNAs. These cells are not a subset of the NT cells, and therefore break the software's assumption that the undercover cells used in the calibration check are a subset of the NT cells (this assumption underlies the reindexing of indiv_nt_grna_idxs with respect to all_nt_idxs, for example).
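
A small example may make the broken assumption concrete. The Python sketch below is illustrative only (hypothetical names and data, not sceptre code); it assumes NT cells are those with at least one NT gRNA and no targeting gRNA, and that the inclusive undercover group for an NT gRNA is every cell containing that gRNA.

```python
# Toy sketch only: names and data layout are hypothetical, not sceptre internals.
NT = "non-targeting"
grna_target = {"g1": "geneA", "nt1": NT, "nt2": NT}
cell_grnas = {
    "c1": {"nt1"},
    "c2": {"nt1", "g1"},  # undercover NT gRNA together with a targeting gRNA
    "c3": {"nt2"},
    "c4": {"g1"},
}


def has_targeting(grnas):
    return any(grna_target[g] != NT for g in grnas)


# Assumed definition of NT cells: some NT gRNA, no targeting gRNA.
nt_cells = sorted(c for c, gs in cell_grnas.items() if gs and not has_targeting(gs))

# "Inclusive" undercover group for nt1: every cell containing nt1.
undercover_nt1 = sorted(c for c, gs in cell_grnas.items() if "nt1" in gs)

# Reindexing undercover cells by their position among the NT cells
# only works if every undercover cell is itself an NT cell.
position_in_nt = {cell: i for i, cell in enumerate(nt_cells)}
reindexed = {cell: position_in_nt.get(cell) for cell in undercover_nt1}
unplaceable = [cell for cell, pos in reindexed.items() if pos is None]

print("NT cells:", nt_cells)
print("undercover cells for nt1:", undercover_nt1)
print("cells with no position among NT cells:", unplaceable)
```

Here c2 belongs to the undercover group but has no position among the NT cells, so a reindexing of this kind has nothing to map it to.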

Updates to back end

  1. The helper function process_initial_assignment_list() now implements the logic of the treatment group by updating grna_group_idxs accordingly. This function also calculates all_nt_idxs and adds it to grna_assignments_raw. Before, only grna_assignments had the field all_nt_idxs. It is no longer the case that all_nt_idxs can be constructed as the union of indiv_nt_grna_idxs: for treatment_group = "inclusive", indiv_nt_grna_idxs can contain cells with targeting gRNAs, whereas all_nt_idxs by definition includes only cells with no targeting gRNAs.

  2. Input checks were updated to reflect the changes to the API.

  3. The helper function update_indiv_grna_assignments_for_nt_cells() was updated in order to remove the assumption that the entries of indiv_nt_grna_idxs are nonoverlapping. Indeed, a cell containing two NT gRNAs now appears in the lists for both of these gRNAs.

  4. The helper function add_num_cells_to_result() was added in order to compute n_trt and n_cntrl; it is now called in run_calibration_check(), run_power_check(), and run_discovery_analysis().

  5. The helper C++ function compute_nt_nonzero_matrix_and_n_ok_pairs_v3() was modified to correctly compute the QC metrics in the case when the entries of indiv_nt_grna_idxs are overlapping.

  6. Some testthat tests were added to test the new functionality. Another set of tests, which must be run manually, was added in tests/manual/test-flexible-cell-assignments.R. These tests ensure that running the new version of sceptre with defaults does not change the outputs (except for the addition of n_trt and n_cntrl) on the low- and high-MOI example data, compared to the existing version. They needed to be outside the testthat framework because they involve running two different versions of sceptre.
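
Items 1, 3, and 5 above all follow from the same two facts: the per-gRNA NT index lists can now overlap, and their union can be larger than all_nt_idxs. The Python sketch below shows both on made-up indices. The variable names mirror the fields mentioned above, but the values and the counting logic are hypothetical and are not taken from the sceptre R or C++ code.

```python
# Toy sketch only: index values are made up and the logic is not sceptre's.
# Cell 1 carries nt1 plus a targeting gRNA (possible under "inclusive");
# cell 3 carries both nt1 and nt2.
indiv_nt_grna_idxs = {"nt1": [0, 1, 3], "nt2": [2, 3]}

# Cells with NT gRNAs and no targeting gRNA (cell 1 is therefore excluded).
all_nt_idxs = [0, 2, 3]

# The union of the per-gRNA lists is no longer the same as all_nt_idxs.
union_of_indiv = sorted(set().union(*indiv_nt_grna_idxs.values()))
extra_cells = sorted(set(union_of_indiv) - set(all_nt_idxs))

# Summing list lengths counts cell 3 twice, so any per-cell count
# over overlapping lists needs de-duplication.
naive_count = sum(len(idxs) for idxs in indiv_nt_grna_idxs.values())
distinct_count = len(union_of_indiv)

print("union of indiv_nt_grna_idxs:", union_of_indiv)
print("cells in the union but not in all_nt_idxs:", extra_cells)
print("naive count:", naive_count, "distinct count:", distinct_count)
```

The gap between the naive and distinct counts is the kind of double counting that code written for nonoverlapping lists would have to guard against.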

Limitations: The Nextflow pipeline, and any functions inside the R package pertaining to the Nextflow pipeline, were not updated.

@ekatsevi ekatsevi closed this Jan 30, 2025
@ekatsevi ekatsevi removed the request for review from timothy-barry January 30, 2025 16:24