Skip to content

Gene sequence comparison visualizer

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md
Notifications You must be signed in to change notification settings

Deemolotus/gscVisualizer

Repository files navigation

gscVisualizer

Description

gscVisualizer is an R package to comparing the similarity between multiple RNA or DNA sequences. This tool can use to compare the difference between wild type gene (as an reference) and mutated type of genes. This package can also be used for computing difference among multiple dot-bracket form RNA sequences. A plot can also be generated to make data easier to look at.

Installation

To install the latest version of the package:

require("devtools")
devtools::install_github("Deemolotus/gscVisualizer", build_vignettes = TRUE)
library("gscVisualizer")

To run the ShinyApp:

gscVisualizer::runGscVisualizer()

Overview

ls("package:gscVisualizer")

gscVisualizer contains 7 functions to compute the difference among genes. The seqCompareAsInput function will takes in numbers of gene sequences as arguments and compute the difference between each gene sequence with the reference gene. seqCompareInPair function will takes in number if gene sequences as argument and calculate the difference between each pair of genes. seqCompareAsFile function will takes in a .fa file and convert the .fa file to dataframes, then do the difference comparison between every gene sequence with the reference sequence. seqCompareAsFilePair function do the similar thing as seqCompareInpair but get all data from .fa file instead of input arguments. dotComp function will takes in two dot-bracket sequence and then compute the difference between two RNA structure. checkSeq function will takes in one sequence and make sure the dot-bracket seequence is valid.

An overview of the package is illustrated below:

Alternatively, the Shiny app can also be used for generating the final result

gscVisualizer::runGscVisualizer()

The package tree structure is provide below

- gscVisualizer
  |- gscVisualizer.Rproj
  |- DESCRIPTION
  |- NAMESPACE
  |- LICENSE
  |- README
  |- data
    |- dotCompExample.rda
    |- ExampleSequence.rda
  |- inst
    |- extdata
      |- randomSeq.fa
      |- seqCompareAsFileTest.fa
      |- seqCompareAsInputExample.rds
      |- test.fa
      |- shiny.png
      |- workflow.png
      |- workflow1.png
    |- shiny-scripts
      |- app.R
    |-CITATION
  |- man
    |- checkSeq.Rd
    |- dotComp.Rd
    |- dotCompExample.Rd
    |- ExampleSequence.Rd
    |- plotter.Rd
    |- runGscVisualizer.Rd
    |- seqCompareAsFile.Rd
    |- seqCompareAsFilePair.Rd
    |- seqCompareAsInput.Rd
    |- seqCompareInPair.Rd
  |- R
    |- data.R
    |- dotComp.R
    |- plotter.R
    |- runGscVisualizer.R
    |- seqCompareAsInput.R
  |- vignettes
    |- Introduction_to_gscVisualizer.Rmd
    |- Introduction_to_gscVisualizer.md
    |- shinyAppGscVisualizer.Rmd
    |- shinyAppGscVisualizer.md
    |- dotComp.png
    |- dotCompResult.png
    |- guide1.png
    |- guide2.png
    |- guide3.png
    |- guide4.png
    |- guide5.png
    |- Rplot.png
    |- seqCompareAsFile.png
    |- seqCompareAsFilePair.png
  |- tests
    |- testthat.R
    |- testthat
      |- Rplots.pdf
      |- test-checkSeq.R
      |- test-dotComp.R
      |- test-plotter.R
      |- test-seqcompareAsFile.R
      |- test-seqcompareAsFilePair.R
      |- test-seqCompareInPair.R
      |- test-test-seqCompareAsInput.R

Tutorials

For tutorials and plot interpretation, refer to the vignette:

browseVignettes("gscVisualizer")

gscVisualizer introduction: https://github.com/Deemolotus/gscVisualizer/tree/master/vignettes/Introduction_to_gscVisualizer.md

instruction for the shiny app: https://github.com/Deemolotus/gscVisualizer/tree/master/vignettes/shinyAppGscVisualizer.md

Citation for Package

citation("gscVisualizer")

Tan, Z. (2020) gscVisualizer (Gene sequence comparison visualizer): an R package for comparing the difference in gene sequences, then visualize the result by plotting. URL https://github.com/Deemolotus/gscVisualizer

Contributions

The author of this package is Zhiwen Tan. All of seqCompareAsInput, seqCompareInPair, seqCompareAsFile and seqCompareAsFilePair functions makes use of s2c and translate functions from seqinr R package to convert the DNA or RNA sequences to amino acid. Those two functions also makes use of needles function from NameNeedle R package to calculate the difference between different gene sequences. The NameNeedle R package is also been used in dotComp function. The seqCompareAsFile and seqCompareAsFilePair function also makes use of readRNAStringSet function from Biostrings to extract information from .fa file to make a list of dataframes. The plotter function makes use of the graphic R package.

References

Charif D, Lobry J (2007). “SeqinR 1.0-2: a contributed package to the R project for statistical computing devoted to biological sequences retrieval and analysis.” In Bastolla U, Porto M, Roman H, Vendruscolo M (eds.), Structural approaches to sequence evolution: Molecules, networks, populations, series Biological and Medical Physics, Biomedical Engineering, 207-232. Springer Verlag, New York. ISBN : 978-3-540-35305-8. http://seqinr.r-forge.r-project.org/

Kevin R. Coombes (2020). NameNeedle: Using Needleman-Wunsch to Match Sample Names. R package version 1.2.6/r51. https://R-Forge.R-project.org/projects/nameneedle/

Pagès H, Aboyoun P, Gentleman R, DebRoy S (2020). Biostrings: Efficient manipulation of biological strings. R package version 2.58.0, https://bioconductor.org/packages/Biostrings

Wickham, H. and Bryan, J. (2019). R Packages (2nd edition). Newton, Massachusetts: O’Reilly Media. https://r-pkgs.org/

Kozomara, A., & Griffiths-Jones, S. (2011). miRBase: integrating microRNA annotation and deep-sequencing data. Nucleic acids research, 39(Database issue), D152–D157. https://doi.org/10.1093/nar/gkq1027

Steipe B., ABC project (.utility 4.07) A Bioinformatics Course: Applied Bioinformatics http://steipe.biochemistry.utoronto.ca/abc/index.php/Bioinformatics_Main_Page

Zhiwen. T, Sijie Xu (2020) miRNA Motif Analysis https://github.com/Deemolotus/BCB330Y-and-BCB430Y/tree/master/Main

Acknowledgements

This package was developed for BCB410H: Applied Bioinformatics, University of Toronto, Toronto, CANADA.

About

Gene sequence comparison visualizer

Resources

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages