Commit
* fix value_name in calculate_deltacq.R
* 96-well delta cq vignette draft
* simplify calculate_deltacq.R
* clarifications to ddcq vignette
* calculate_deltadeltacq_bytargetid runs
* Deltadeltacq 53 (#58)
* Fixes: #56 all variables are now explicitly defined
* Fixed bug in deltadeltaqc function so that sample_id rather than target_id is passed onto calculate_normvalue function
* calculate_deltacq documentation clarifications
* explanations in deltacq_96well_vignette.Rmd
* delta cq and vignette updates on README.md
* multiple ref genes for deltacq, addresses #52
* Comments on delta Cq vignette

  Hi there, I had a few minutes so I went through your new vignette. It's really good, clear and easy to understand! I like the simpler (but still real) data set. I made a bunch of changes throughout, I hope you don't mind. My philosophy is to make as many comments and changes as possible and then let the author decide which ones are valuable. Please don't take the volume of comments/changes as a criticism!

  A few notes below about my comments:

  * Summary section: tried to simplify the technical details of the experiment a bit to make it more approachable to non-microbiologists. Sorry if I got anything wrong!
  * Throughout: shortened here and there by removing information I didn't think was critical
  * I added comments in some places inside `[]`
  * I mostly used "gene" instead of "target" because I think it's easier to understand, but qPCR users will probably be familiar with the word "target". Maybe just define it at the top.
* responded to @seaaan's vignette edits
* Comments on delta Cq vignette (#59)
* fixed col_types for doubles in read_lightcycler_1colour_cq
* fixed decimal places of deltacq_96well select data

Co-authored-by: Samuel Joseph Haynes <[email protected]>
Co-authored-by: Sean Hughes <[email protected]>
1 parent 350c5e2, commit cafce12
Showing 15 changed files with 778 additions and 129 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,85 +1,144 @@ | ||
|
||
#' @describeIn calculate_deltacq_bysampleid get the median value of a set of | ||
#' normalization (reference) probes, for a single sample. | ||
#' Calculate a normalized value for a subset of reference ids | ||
#' | ||
#' @param norm_function Function to use to calculate the value to | ||
#' normalise by on log2/cq scale. | ||
#' Default function is median, alternatively could use mean. | ||
#' This is used to calculate the normalized `cq` values for reference | ||
#' `target_ids` (e.g. genes), to use in `delta_cq` calculation for each | ||
#' `sample_id`. | ||
#' | ||
#' Also used to calculate the normalized `delta_cq` values for reference | ||
#' `sample_ids`, to use in `deltadelta_cq` calculation for each `target_id`. | ||
#' | ||
#' @param value_df data frame containing relevant columns, those named in | ||
#' `value_name` and `id_name` parameters. | ||
#' @param ref_ids values of reference ids, that are used to calculate | ||
#' normalized reference value. | ||
#' @param value_name name of column containing values. This column should be | ||
#' numeric. | ||
#' @param id_name name of column containing ids. | ||
#' @param norm_function Function to use to calculate the value to normalize by. | ||
#' Default function is median, alternatively could use mean, geometric mean, | ||
#' etc. | ||
#' | ||
#' @export | ||
#' @importFrom tidyr %>% | ||
#' @importFrom stats median | ||
#' | ||
calculate_normcq <- function(cq_df, | ||
value_name = "cq", | ||
norm_target_ids = "ALG9", | ||
tid_name = "target_id", | ||
#' | ||
calculate_normvalue <- function(value_df, | ||
ref_ids, | ||
value_name = "value", | ||
id_name = "id", | ||
norm_function = median) { | ||
# make subset of cq_df where gene is one or more norm_target_ids | ||
value_to_norm_by <- dplyr::filter(cq_df, | ||
!!dplyr::sym(tid_name) %in% norm_target_ids) %>% | ||
.[[value_name]] %>% | ||
# make subset of value_df where the id is one of the ref_ids | ||
value_to_norm_by <- dplyr::filter(value_df, | ||
!!dplyr::sym(id_name) %in% ref_ids) %>% | ||
dplyr::pull(!!dplyr::sym(value_name)) %>% | ||
norm_function(na.rm = TRUE) | ||
# | ||
# assign summary (median) value to cq_df$value_to_norm_by | ||
# assign summary (median) value to value_df$value_to_norm_by | ||
# note this is the same value for every row, a waste of space technically | ||
dplyr::mutate(cq_df, value_to_norm_by = value_to_norm_by) | ||
dplyr::mutate(value_df, value_to_norm_by = value_to_norm_by) | ||
} | ||
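
The helper above reduces to one summary over the reference rows, copied onto every row. A minimal base-R sketch of that arithmetic follows. It is not the package code. The toy data are hypothetical: `ALG9` is the old default reference in this diff, while `TAF10`, `HSP26`, and all values are invented for illustration.

```r
# Hypothetical toy data: one sample, three targets, two technical replicates each.
value_df <- data.frame(
  id    = c("ALG9", "ALG9", "TAF10", "TAF10", "HSP26", "HSP26"),
  value = c(20.0, 20.4, 22.0, 22.2, 25.1, 24.9),
  stringsAsFactors = FALSE
)
ref_ids <- c("ALG9", "TAF10")

# Keep only the values whose id is one of the reference ids
# (base-R stand-in for the dplyr::filter + dplyr::pull steps).
ref_values <- value_df$value[value_df$id %in% ref_ids]

# Summarise with the chosen norm_function (median here) and
# recycle the single result onto every row of the data frame.
value_df$value_to_norm_by <- median(ref_values, na.rm = TRUE)

print(value_df)
```

With two reference ids the median is taken over all their replicate values pooled together, not over per-target medians.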
|
||
#' Calculate delta cq to normalize quantification cycle (log2-fold) data within | ||
#' sample_id. | ||
#' Calculate delta cq (\eqn{\Delta Cq}) to normalize quantification cycle | ||
#' (log2-fold) data within sample_id. | ||
#' | ||
#' This function implements relative quantification by the delta Cq method. For | ||
#' each sample, the Cq values of all targets (e.g. genes, probes, primer sets) | ||
#' are compared to one or more reference target ids specified in | ||
#' `ref_target_ids`. | ||
#' | ||
#' @param cq_df a data frame containing columns `sample_id`, value_name (default | ||
#' `cq`) and tid_name (default `target_id`). Crucially, sample_id should be | ||
#' the same for different technical replicates measuring identical reactions | ||
#' in different wells of the plate, but differ for different biological and | ||
#' experimental replicates. | ||
#' @param value_name the column name of the value that will be normalized | ||
#' @param norm_target_ids names of PCR probes (or primer sets) to normalize by, | ||
#' i.e. reference genes | ||
#' @param tid_name the column name for probe sets | ||
#' experimental replicates. See tidyqpcr vignettes for examples. | ||
#' @param ref_target_ids names of targets to normalize by, i.e. reference | ||
#' genes, hydrolysis probes, or primer sets. This can be one reference target | ||
#' id, a selection of multiple target ids, or even all measured target ids. In | ||
#' the case of all of them, the delta Cq value would be calculated relative to | ||
#' the median (or other `norm_function`) of all measured targets. | ||
#' @param norm_function Function to use to calculate the value to normalize by | ||
#' on given scale. Default is median, alternatively could use mean. | ||
#' | ||
#' @return data frame like cq_df with three additional columns: | ||
#' | ||
#' \tabular{ll}{ value_to_norm_by \tab the median value of the reference | ||
#' probes \cr value_norm \tab the normalized value, \eqn{\Delta Cq} \cr | ||
#' value_normexp \tab the normalized ratio, \eqn{2^(-\Delta Cq)} } | ||
#' \tabular{ll}{ ref_cq \tab summary (median/mean) cq value for reference | ||
#' target ids \cr delta_cq \tab normalized value, \eqn{\Delta Cq} \cr | ||
#' rel_abund \tab normalized ratio, \eqn{2^{-\Delta Cq}} } | ||
#' | ||
#' @export | ||
#' @importFrom tidyr %>% | ||
#' | ||
#' @importFrom stats median | ||
#' @importFrom rlang .data | ||
#' | ||
calculate_deltacq_bysampleid <- function(cq_df, | ||
norm_target_ids, | ||
value_name = "cq", | ||
tid_name = "target_id") { | ||
ref_target_ids, | ||
norm_function = median) { | ||
cq_df %>% | ||
dplyr::group_by(sample_id) %>% | ||
dplyr::do(calculate_normcq(., | ||
value_name, | ||
norm_target_ids, | ||
tid_name)) %>% | ||
dplyr::group_by(.data$sample_id) %>% | ||
dplyr::do(calculate_normvalue(.data, | ||
ref_ids = ref_target_ids, | ||
value_name = "cq", | ||
id_name = "target_id", | ||
norm_function = norm_function)) %>% | ||
dplyr::rename(ref_cq = .data$value_to_norm_by) %>% | ||
dplyr::ungroup() %>% | ||
dplyr::mutate(.value = !!dplyr::sym(value_name), # a tidyeval trick | ||
value_norm = .value - value_to_norm_by, | ||
value_normexp = 2^-value_norm) %>% | ||
dplyr::select(-.value) %>% | ||
dplyr::mutate( | ||
delta_cq = .data$cq - .data$ref_cq, | ||
rel_abund = 2^-.data$delta_cq) %>% | ||
return() | ||
} | ||
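
To make the per-sample normalization concrete, here is a small base-R sketch of the same delta Cq arithmetic, assuming the column names `sample_id`, `target_id`, and `cq` used in the diff. It does not call tidyqpcr. The sample names, the `HSP26` target, and the Cq values are hypothetical.

```r
# Hypothetical toy data: two samples, a reference target (ALG9) and a
# target of interest (HSP26), two technical replicates each.
cq_df <- data.frame(
  sample_id = rep(c("s1", "s2"), each = 4),
  target_id = rep(c("ALG9", "ALG9", "HSP26", "HSP26"), times = 2),
  cq        = c(20, 21, 24, 25, 19, 20, 21, 22),
  stringsAsFactors = FALSE
)
ref_target_ids <- "ALG9"

# Median Cq of the reference target(s), computed separately per sample.
is_ref <- cq_df$target_id %in% ref_target_ids
ref_by_sample <- tapply(cq_df$cq[is_ref], cq_df$sample_id[is_ref],
                        median, na.rm = TRUE)

# Look up each row's per-sample reference value by sample name.
cq_df$ref_cq <- as.numeric(ref_by_sample[cq_df$sample_id])

# delta Cq and the relative abundance 2^(-delta Cq).
cq_df$delta_cq  <- cq_df$cq - cq_df$ref_cq
cq_df$rel_abund <- 2^-cq_df$delta_cq

print(cq_df)
```

A lower Cq means earlier detection, so a negative `delta_cq` corresponds to a `rel_abund` above 1.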
|
||
#' @describeIn calculate_deltacq_bysampleid Synonym for | ||
#' calculate_deltacq_plates. | ||
|
||
#' Calculate delta delta cq (\eqn{\Delta \Delta Cq}) to globally normalize | ||
#' quantification cycle (log2-fold) data across sample_id. | ||
#' | ||
#' @export | ||
#' This function does a global normalization, where all samples are compared to | ||
#' one or more reference samples specified in `ref_sample_ids`. There are other | ||
#' experimental designs that require comparing samples in pairs or small groups, | ||
#' e.g. a time course comparing `delta_cq` values against a reference strain at | ||
#' each time point. For those situations, instead we recommend adapting code | ||
#' from this function, changing the grouping variables passed to | ||
#' `dplyr::group_by` to draw the contrasts appropriate for the experiment. | ||
#' | ||
normalizeqPCR <- function(cq_df, | ||
value_name = "cq", | ||
norm_target_ids = "ALG9", | ||
tid_name = "target_id") { | ||
lifecycle::deprecate_warn("0.2", "normalizeqPCR()", | ||
"calculate_deltacq_bysampleid()", | ||
details = "Replaced with more descriptive name") | ||
calculate_deltacq_bysampleid(cq_df = cq_df, | ||
norm_target_ids = norm_target_ids, | ||
value_name = value_name, | ||
tid_name = tid_name) | ||
} | ||
#' @param deltacq_df a data frame containing columns `sample_id`, `target_id` | ||
#' and `delta_cq`. Crucially, | ||
#' sample_id should be the same for different technical replicates measuring | ||
#' identical reactions in different wells of the plate, but differ for | ||
#' different biological and experimental replicates. | ||
#' | ||
#' Usually this will be a data frame that was output by | ||
#' `calculate_deltacq_bysampleid`. | ||
#' | ||
#' @param ref_sample_ids reference sample_ids to normalize by | ||
#' @param norm_function Function to use to calculate the value to normalize by | ||
#' on given scale. Default is median, alternatively could use mean. | ||
#' | ||
#' @return data frame like deltacq_df with three additional columns: | ||
#' | ||
#' \tabular{ll}{ ref_delta_cq \tab summary (median/mean) \eqn{\Delta Cq} | ||
#' value for target_id in reference sample ids \cr deltadelta_cq \tab the | ||
#' normalized value, \eqn{\Delta \Delta Cq} \cr fold_change \tab the | ||
#' normalized fold-change ratio, \eqn{2^{-\Delta \Delta Cq}} } | ||
#' | ||
#' @export | ||
#' @importFrom tidyr %>% | ||
#' @importFrom stats median | ||
#' | ||
calculate_deltadeltacq_bytargetid <- function(deltacq_df, | ||
ref_sample_ids, | ||
norm_function = median) { | ||
deltacq_df %>% | ||
dplyr::group_by(.data$target_id) %>% | ||
dplyr::do(calculate_normvalue(.data, | ||
ref_ids = ref_sample_ids, | ||
value_name = "delta_cq", | ||
id_name = "sample_id", | ||
norm_function = norm_function)) %>% | ||
dplyr::rename(ref_delta_cq = .data$value_to_norm_by) %>% | ||
dplyr::ungroup() %>% | ||
dplyr::mutate( | ||
deltadelta_cq = .data$delta_cq - .data$ref_delta_cq, | ||
fold_change = 2^-.data$deltadelta_cq) %>% | ||
return() | ||
} |
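
The delta delta Cq step mirrors the delta Cq step with the roles of sample and target swapped: values are grouped by `target_id` and compared to the reference samples. A minimal base-R sketch of that arithmetic follows, assuming a data frame that already has a `delta_cq` column. It does not call tidyqpcr. The sample names `wt` and `mut`, the target names, and the values are hypothetical.

```r
# Hypothetical toy data: delta Cq values for two targets in a reference
# sample ("wt") and a test sample ("mut").
deltacq_df <- data.frame(
  sample_id = c("wt", "wt", "mut", "mut"),
  target_id = c("HSP26", "PGK1", "HSP26", "PGK1"),
  delta_cq  = c(4.0, 1.0, 2.0, 1.5),
  stringsAsFactors = FALSE
)
ref_sample_ids <- "wt"

# Median delta Cq in the reference sample(s), computed separately per target.
is_ref <- deltacq_df$sample_id %in% ref_sample_ids
ref_by_target <- tapply(deltacq_df$delta_cq[is_ref],
                        deltacq_df$target_id[is_ref],
                        median, na.rm = TRUE)

# Look up each row's per-target reference value by target name.
deltacq_df$ref_delta_cq <- as.numeric(ref_by_target[deltacq_df$target_id])

# delta delta Cq and the fold change 2^(-delta delta Cq).
deltacq_df$deltadelta_cq <- deltacq_df$delta_cq - deltacq_df$ref_delta_cq
deltacq_df$fold_change   <- 2^-deltacq_df$deltadelta_cq

print(deltacq_df)
```

Rows from the reference sample get a `deltadelta_cq` of 0 and a `fold_change` of 1 whenever that sample has a single `delta_cq` value per target, as in this toy data.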