add Trimmomatic with some task for trimming paired ends reads
rjpbonnal committed Jun 11, 2012
1 parent bf973bb commit 9a45afb
Showing 3 changed files with 83 additions and 3 deletions.
2 changes: 2 additions & 0 deletions README.md
@@ -15,6 +15,7 @@ Provides a framework for handling NGS data with Bioruby.
* http://www.gnuplot.info/ tested on version 4.6
* libxslt1-dev
* CASAVA 1.8.2 <http://support.illumina.com/sequencing/sequencing_software/casava.ilmn>
* Java SE for running Trimmomatic

## Install
### Quick Start
@@ -31,6 +32,7 @@ Provides a framework for handling NGS data with Bioruby.
Please follow the instructions for your own distribution/operating system



## Tasks
We'll try to keep this list updated but just in case type `biongs -T` to get the most updated list.
_We are working on these and other tasks, if you find some bugs, please open an issue on Github._
9 changes: 8 additions & 1 deletion lib/bio/ngs/ext/versions.yaml
@@ -26,7 +26,14 @@ common:
basename: samtools-0.1.18
suffix: tar.bz2
desc: "SAMtools"
type: make
type: make
trimmomatic:
version: 0.22
url: http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/Trimmomatic-0.22.zip
basename: Trimmomatic-0.22
suffix: zip
desc: "Trimmomatic: A flexible read trimming tool for Illumina NGS data"
type: binary


linux:
75 changes: 73 additions & 2 deletions lib/tasks/quality.thor
@@ -45,7 +45,8 @@ class Quality < Thor
desc "fastq_stats FASTQ", "Reports quality of FASTQ file"
method_option :output, :type=>:string, :aliases =>"-o", :desc => "Output file name. default is input file_name with .txt."
def fastq_stats(fastq)
puts "[#{Time.now}] Processing #{fastq} stats"
uuid = SecureRandom.uuid
puts "[#{Time.now}] #{uuid} Processing #{fastq} stats"
output_file = options.output || "#{fastq.gsub(/\.fastq\.gz/,'')}_stats.txt"
stats = Bio::Ngs::Fastx::FastqStats.new
if fastq=~/\.gz/
@@ -60,7 +61,7 @@ class Quality < Thor
[:reads_coverage,output_file],
[:nucleotide_distribution,output_file]]
Parallel.map(go_in_parallel, in_processes:go_in_parallel.size) do |graph|
puts "[#{Time.now}] Plotting #{graph.first} #{graph.last}"
puts "[#{Time.now}] #{uuid} Plotting #{graph.first} #{graph.last}"
send graph.first, graph.last
end
puts "[#{Time.now}] Finished #{fastq}"
@@ -226,5 +227,75 @@ class Quality < Thor
end
end
desc "trim_momatic_pe FORWARD REVERSE", "Trim reads on quality by using Trimmomatic, Paired Ends"
# method_option :threads, :type => :numeric, :default => 2, :desc => 'Number of threads to use by Trimmomatic'
def trim_momatic_pe(forward, reverse)
uuid = SecureRandom.uuid
puts "[#{Time.now}] #{uuid} Start trimming #{forward} and #{reverse} paired end reads by Trimmomatic"
puts "#{File.dirname(__FILE__)}/../bio/ngs/ext/bin/common/trimmomatic/trimmomatic-0.22.jar"
puts "java -classpath #{File.dirname(__FILE__)}/../bio/ngs/ext/bin/common/trimmomatic/trimmomatic-0.22.jar org.usadellab.trimmomatic.TrimmomaticPE -phred33 #{forward} #{reverse} #{forward.gsub(/fastq\.gz/,'trimmed.fastq.gz')} #{forward.gsub(/fastq\.gz/,'unpaired.fastq.gz')} #{reverse.gsub(/fastq\.gz/,'trimmed.fastq.gz')} #{reverse.gsub(/fastq\.gz/,'unpaired.fastq.gz')} LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36"
puts "[#{Time.now}] #{uuid} Finished "
end
desc "illumina_aggregated_sample_trim DIR PROJECT [SAMPLE]", "Trim aggregated data from Illumina project"
method_option :aggregated, :type => :boolean, :default => true, :desc => 'Process only reads aggregated by biongs quality:aggregate'
def illumina_aggregated_sample_trim(directory, project_name, sample_name=nil)
projects = Bio::Ngs::Illumina.build(directory)
if (project = projects.get project_name)
if (sample = project.get sample_name)
forward = (sample.get :side, :left ).map{|uid, readsfile| readsfile.metadata[:filename]}.first
reverse = (sample.get :side, :right).map{|uid, readsfile| readsfile.metadata[:filename]}.first
trim_momatic_pe(File.join(directory,project.path,sample.path,forward), File.join(directory,project.path,sample.path,reverse))
else
puts "Sample #{sample_name} does not exist."
end
else
puts "Project #{project_name} does not exist."
end
end
desc "illumina_trim_run DIR","trim all FASTQ files in project and sample directories as paired-end reads"
def illumina_trim_run(directory)
Bio::Ngs::Illumina.build(directory).each do |project_name, project|
project.each_sample do |sample_name, sample|
forward = (sample.get :side, :left ).map{|uid, readsfile| readsfile.metadata[:filename]}.first
reverse = (sample.get :side, :right).map{|uid, readsfile| readsfile.metadata[:filename]}.first
trim_momatic_pe(File.join(directory,project.path,sample.path,forward), File.join(directory,project.path,sample.path,reverse))
end
end
end
desc "list_projects_samples DIR", "list projects and samples in a run"
def list_projects_samples(directory)
Bio::Ngs::Illumina.build(directory).each do |project_name, project|
project.each_sample do |sample_name, sample|
puts "#{directory} #{project_name} #{sample_name}"
end
end
end
desc "list_samples DIR PROJECT", "list samples in a project run"
def list_samples(directory, project_name)
project = Bio::Ngs::Illumina.build(directory).get project_name
if project
project.each_sample do |sample_name, sample|
puts "#{directory} #{project_name} #{sample_name}"
end
else
puts "Project #{project_name} does not exist."
end
end
desc "clean_from_trimming [DIR]", "remove trimmomatic files from directory recursively"
def clean_from_trimming(dir=".")
# Scope both patterns to dir so that the DIR argument is actually used
files = Dir.glob([File.join(dir, "**/*trimmed*"), File.join(dir, "**/*unpaired*")])
files.each do |file|
File.delete file
end
# Report the number of deleted files (summing file names raised a TypeError)
puts "Deleted #{files.size}"
end
end
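
The `clean_from_trimming` task above takes an optional DIR argument and removes files whose names contain `trimmed` or `unpaired`. A minimal sketch of a directory-scoped cleanup follows. The helper name `remove_trimming_outputs` is hypothetical and not part of this commit; the sketch assumes that only regular files should be deleted and that returning the removed paths is useful for reporting.

```ruby
# Hypothetical helper (not part of the commit): delete Trimmomatic output
# files found recursively under +dir+ and return the paths that were removed.
def remove_trimming_outputs(dir = ".")
  # Same two patterns as the task, but anchored at +dir+.
  patterns = ["**/*trimmed*", "**/*unpaired*"].map { |p| File.join(dir, p) }
  # Keep regular files only; uniq guards against a name matching both patterns.
  files = Dir.glob(patterns).select { |f| File.file?(f) }.uniq
  files.each { |f| File.delete(f) }
  files
end
```

Inside the Thor task this could be used as `puts "Deleted #{remove_trimming_outputs(dir).size}"`.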

1 comment on commit 9a45afb

@rjpbonnal

need to remove puts from trim_momatic_pe
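
As committed, `trim_momatic_pe` only prints the Java command line. One possible way to follow up on this comment is sketched below, under stated assumptions: the helper names `trimmomatic_pe_command` and `run_trimmomatic_pe` are hypothetical, the jar path is passed in by the caller, and the suffix match is anchored to the end of the file name (the commit uses an unanchored `gsub`). The Trimmomatic class name and trimming steps are copied from the command in the diff.

```ruby
# Hypothetical helper (not part of the commit): build the TrimmomaticPE
# command as an argument array, so that no shell quoting is needed.
def trimmomatic_pe_command(jar, forward, reverse)
  # Derive an output name by replacing the trailing "fastq.gz" suffix.
  out = lambda { |path, tag| path.sub(/fastq\.gz\z/, "#{tag}.fastq.gz") }
  ["java", "-classpath", jar,
   "org.usadellab.trimmomatic.TrimmomaticPE", "-phred33",
   forward, reverse,
   out.call(forward, "trimmed"), out.call(forward, "unpaired"),
   out.call(reverse, "trimmed"), out.call(reverse, "unpaired"),
   "LEADING:3", "TRAILING:3", "SLIDINGWINDOW:4:15", "MINLEN:36"]
end

# Hypothetical wrapper: run the command instead of printing it.
# Kernel#system with several arguments bypasses the shell and
# returns a falsy value when the command fails or cannot be started.
def run_trimmomatic_pe(jar, forward, reverse)
  cmd = trimmomatic_pe_command(jar, forward, reverse)
  system(*cmd) or raise "Trimmomatic failed: #{cmd.join(' ')}"
end
```

Keeping the command builder separate from the runner means the generated arguments can be checked without Java or Trimmomatic installed.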