## Vmatch is the default aligner used to find reads.
# 'l' determines how long the alignment must be. Increasing 'l' makes the
# search more stringent; lowering it finds more reads.
# 'e' determines how many errors are allowed. Increasing 'e' will find more
# reads, but also make Vmatch take longer to run.
## If a taboo_file has been supplied, it is used as a query before the first
## round to remove matching reads before recursive contig building begins.
[Vmatch_protein_taboo]
l=13
e=3
[Vmatch_dna_taboo]
l=35
e=1
## In the first round, reads are aligned against the probe. Because this is a
## search for homologous reads, the matches should be allowed to be shorter and
## more error-prone than when searching for reads to extend the contigs.
# If the probe is a protein, it is indexed and the reads are used as queries.
[Vmatch_protein_init]
l=13
e=3
# If the probe is DNA, it is used as a query against the indexed reads.
# Searching the read index is much faster than using the reads as queries.
# Because DNA probes are most likely only going to be used in situations where
# close homology is expected, longer initial matches may be desired than with
# a protein probe.
[Vmatch_dna_init]
l=35
e=3
## During recursion rounds, assembled contigs are aligned to the indexed reads
## to find reads that might allow longer contigs to be assembled.
[Vmatch_extend_contig]
l=25
e=0
## Found reads will sometimes be assembled into a contig that is not homologous
## to the probe, and SRAssembler will extend this contig along with the others.
## The reads found during this extension will slow the assembly step, and can
## even prevent the assembly of correct homologous contigs. During cleaning
## rounds, contigs and reads that are not relevant to the probe are removed.
# First the assembled contigs are aligned to the probe sequence.
[Vmatch_protein_vs_contigs]
l=13
e=3
# or
[Vmatch_dna_vs_contigs]
l=35
e=3
# After irrelevant contigs have been removed, the found reads pool is aligned
# against the remaining contigs to identify the relevant reads. Consider making
# 'l' at least half the average length of your reads. You can also set "l=0" to
# require the entire read length to match an assembled contig.
[Vmatch_reads_vs_contigs]
l=50
e=2
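# As an illustration only (the read lengths here are assumed, not measured):
# for reads averaging 150 bases, half the read length would be "l=75", and
# the "l=50" shown above is half of a 100-base read.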
## The assemblers have parameters that can be adjusted to change their tendency
## to assemble contigs in the face of uncertainty. You may use other parameters
## than the ones shown here. Check the assembler manuals for more information.
# SOAPdenovo2 indexes the kmers of the input reads in its pregraph stage.
[SOAPdenovo_pregraph]
# KmerFreqCutoff: kmers with frequency no larger than 'd' will be deleted.
d=0
# -R (optional) resolve repeats by reads
R
# Indexed kmers are assembled into contigs in a separate step by SOAPdenovo2.
[SOAPdenovo_contig]
# mergeLevel(min 0, max 3): the strength of merging similar sequences during
# contiging.
M=3
# EdgeCovCutoff: edges with coverage no larger than 'D' will be deleted.
D=2
# -R (optional) resolve repeats by reads
R
# The ABySS assembler has only one assembly step. We do not have suggestions
# for alternate parameters, but they are allowed.
[Abyss_contig]
## The spliced-aligners have parameters that should be set to adjust their
## sensitivity to your probe. Check their manuals for more information.
[GenomeThreader]
gcmincoverage=10
minmatchlen=20
prminmatchlen=15
[GeneSeqer]
x=14
y=16
z=40
w=0.5
## If you use the SNAP gene finder, it needs a Hidden Markov Model (HMM) file.
# If using SNAP, the environment variable "ZOE" must be set to the path of
# the SNAP directory, as described in the SNAP manual.
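# For example, in a Bourne-style shell (the path below is a placeholder, not
# a real install location):
#   export ZOE=/path/to/snap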
[Snap]
snaphmm=A.thaliana.hmm