Use of functions not inside interactive environment #75

Open
padpadpadpad opened this issue Oct 15, 2021 · 1 comment

@padpadpadpad

Hi sangeranalyseR team

A lot has changed since I last used the package. My own scripts were built on the functions @roblanf originally wrote for sangeranalyseR. Is there any way to call the functions that now sit under the hood directly from RStudio and standalone scripts?

For example some of my code used to look like:

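# packages used by the snippets below
library(purrr)  # walk(), map_df()
library(dplyr)  # mutate(), filter(), pull(), %>%
library(readr)  # read_csv()

# trim and resave trimmed sequence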
trim_and_save <- function(seq_file, output_path, trim_cutoff = 1e-04){
  temp <- sangerseqR::read.abif(seq_file)
  trims <- sangeranalyseR::trim.mott(temp, cutoff = trim_cutoff)
  seq <- substring(temp@data$PBAS.2, trims$start, trims$finish)
  # build the output path with file.path() instead of pasting separators by hand
  write(seq, file.path(output_path, paste0(tools::file_path_sans_ext(basename(seq_file)), '.txt')))
}

# get summary data for each file
file_sum <- summarise.abi.folder('sequencing/sanger_final/raw')
write.csv(file_sum$summaries, 'sequencing/sanger_final/sanger_seq_qualcheck.csv')
# the trimming parameters will improve the quality of the files

# do trimming and re-save files
walk(files, trim_and_save, output_path = trimmed_path, trim_cutoff = 1e-04)
# pattern is a regular expression, so escape the dot and anchor the extension
trimmed_files <- list.files(trimmed_path, full.names = TRUE, pattern = '\\.txt$')

# bind all files together
d_trim <- map_df(trimmed_files, read_and_bind) %>%
  mutate(., seq_len = nchar(seq))

#-----------------------------------------------#
# quality control on these trimmed sequences ####
#-----------------------------------------------#

# filter out files that have < 100 base pairs
d_trim <- filter(d_trim, seq_len >= 100)

# list the files to drop: those with > 5 secondary peaks or an average quality below 30
to_trim <- read_csv('sequencing/sanger_final/sanger_seq_qualcheck.csv') %>% 
  filter(trimmed.secondary.peaks > 5 | trimmed.mean.quality < 30) %>%
  pull(file.name) %>%
  tools::file_path_sans_ext()

I just feel like this approach gives me more flexibility than the interactive Shiny app, although the app does look lovely and easy to use.

@Kuanhao-Chao
Collaborator

Hi @padpadpadpad,

Thanks for using sangeranalyseR. The short answer is "Yes!": you can do everything from the R shell. We rewrote sangeranalyseR in an object-oriented way, and it includes all the old features.

If you want to run it on a single ABIF file, see the example on this page.
For creating a contig, see the example on this page.
And for aligning multiple contigs, here is the link.

We wrote three wrapper functions, SangerRead, SangerContig, and SangerAlignment, for the different levels of analysis. After running any of them, you get an object that holds all the information (e.g. parameters, trimming results, alignment results). For more information, please check the sangeranalyseR documentation.
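
For a single read, the scripted route might look roughly like the sketch below. It assumes the `SangerRead` constructor arguments and the `writeFasta` method as I recall them from the sangeranalyseR documentation, so please check the names against `?SangerRead` for your installed version. The file name is hypothetical.

```r
library(sangeranalyseR)

# Hypothetical path to a single ABIF (.ab1) trace; replace with a real file.
abif_file <- "sequencing/sanger_final/raw/sample_F.ab1"

# Build a SangerRead object. "M1" selects the modified Mott trimming method,
# so M1TrimmingCutoff plays the role of the old trim.mott() cutoff.
# Argument names are assumed from the package documentation.
sanger_read <- SangerRead(
  inputSource      = "ABIF",
  readFeature      = "Forward Read",
  readFileName     = abif_file,
  TrimmingMethod   = "M1",
  M1TrimmingCutoff = 1e-04
)

# Write the read's sequence to a FASTA file in a temporary directory.
writeFasta(sanger_read, outputDir = tempdir())
```

Because the result is an ordinary R object, the same call can be wrapped in a function and mapped over a folder of files, much like the `walk()` approach in the original scripts.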

Best,
Kuan-Hao
