SYNOPSIS
========
'fqgrep' is an approximate sequence pattern matcher for FASTQ/FASTA files.
Think of it as a tool in the spirit of grep (http://en.wikipedia.org/wiki/Grep)
and agrep (http://en.wikipedia.org/wiki/Agrep), optimized
for FASTQ (http://en.wikipedia.org/wiki/FASTQ_format) and FASTA
(http://en.wikipedia.org/wiki/FASTA_format) files. It works directly
on both compressed and uncompressed files.
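
To make the input side concrete, here is a minimal Python sketch of reading
FASTQ records from either a gzip-compressed or a plain file. It is an
illustration only, not fqgrep's own code (fqgrep is built on zlib, and how it
detects compression is not described here). The helper names
open_maybe_gzip and read_fastq are hypothetical, and the sketch assumes
simple four-line FASTQ records with no line wrapping.

```python
import gzip
from typing import Iterator, Tuple

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def open_maybe_gzip(path: str):
    """Open path as text, decompressing on the fly if it is gzip data."""
    with open(path, "rb") as fh:
        is_gzip = fh.read(2) == GZIP_MAGIC
    return gzip.open(path, "rt") if is_gzip else open(path, "rt")


def read_fastq(path: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, sequence, quality) for each four-line FASTQ record."""
    with open_maybe_gzip(path) as fh:
        while True:
            header = fh.readline().rstrip("\n")
            if not header:  # end of file
                break
            seq = fh.readline().rstrip("\n")
            fh.readline()  # skip the '+' separator line
            qual = fh.readline().rstrip("\n")
            yield header[1:], seq, qual  # drop the leading '@'
```

Checking the magic bytes rather than the file extension means a compressed
file is still handled correctly when it lacks a '.gz' suffix.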
Below is the help message (printed by 'fqgrep -h') describing its usage:

    Usage: fqgrep [options] -p <pattern> <fastq_or_fasta_files>

        -h              This help message
        -V              Program and version information
        -p <STRING>     Pattern of interest to grep [REQUIRED]
        -v              Invert match - show only sequences that
                        DO NOT match the pattern
        -a              Show all records irregardless of match status
                        Useful in conjunction with the -r option;
                        when one would like to do further post-processing
                        of the match data
        -c              Highlight matching string with color
        -f              Output matches in FASTA format
        -r              Output matches in detailed stats report format
        -b <STRING>     Delimiter string to separate columns
                        in detailed stats report [Default: '\t']
        -m <INT>        Total number of mismatches to at most allow for
                        in search pattern [Default: 0]
        -s <INT>        Max threshold of substitution mismatches to allow
                        for in search pattern [Default: unlimited]
        -i <INT>        Max threshold of insertion mismatches to allow for
                        in search pattern [Default: unlimited]
        -d <INT>        Max threshold of deletion mismatches to allow for
                        in search pattern [Default: unlimited]
        -S <INT>        Cost of base substitutions in obtaining
                        approximate match [Default: 1]
        -I <INT>        Cost of base insertions in obtaining
                        approximate match [Default: 1]
        -D <INT>        Cost of base deletions in obtaining
                        approximate match [Default: 1]
        -e              Force tre regexp engine usage
        -C              Display only a total count of matches
                        (per input FASTQ/FASTA file)
        -o <out_file>   Desired output file.
                        If not specified, defaults to stdout
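
The mismatch options above can be pictured with a small Python sketch of
approximate matching. It is a rough illustration under stated assumptions,
not fqgrep's implementation (fqgrep delegates matching to the TRE library).
It handles a literal pattern only, uses the default cost of 1 for every
substitution, insertion, and deletion, and applies a single overall limit in
the spirit of -m. The per-type limits (-s, -i, -d) and custom costs
(-S, -I, -D) are not modelled. The names min_edits and matches are
hypothetical.

```python
def min_edits(pattern: str, read: str) -> int:
    """Fewest edits needed for pattern to match some substring of read.

    Classic dynamic programming for approximate substring search:
    the match may start and end anywhere in the read at no extra cost.
    """
    # prev[i]: cheapest alignment of pattern[:i] ending at the previous base
    prev = list(range(len(pattern) + 1))
    best = prev[-1]
    for base in read:
        curr = [0]  # an empty pattern prefix matches anywhere for free
        for i, p in enumerate(pattern, start=1):
            curr.append(min(
                prev[i - 1] + (p != base),  # match or substitution
                prev[i] + 1,                # insertion: extra base in the read
                curr[i - 1] + 1,            # deletion: pattern base missing
            ))
        best = min(best, curr[-1])
        prev = curr
    return best


def matches(pattern: str, read: str, max_mismatches: int = 0) -> bool:
    """True if pattern occurs in read with at most max_mismatches edits."""
    return min_edits(pattern, read) <= max_mismatches
```

With max_mismatches left at 0 the check reduces to an exact substring
search, which mirrors the default of 0 listed for -m. Raising the limit to 1
would also accept a read such as 'CCGATCACACC' for the pattern 'GATTACA',
since one substitution separates them.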
PREREQUISITES
=============
"fqgrep" depends upon the following libraries:

  * TRE Regular Expression Library (TRE)
    [http://laurikari.net/tre/] version 0.8.0.

  * zlib
    [http://www.zlib.net]

They will need to be available on your system before fqgrep can be
installed.
TRE
===
On Ubuntu (or any other Debian-based Linux distribution):
---------------------------------------------------------

    sudo apt-get install libtre-dev libtre5

On Mac OS X
-----------
via macports [http://www.macports.org/] :

    sudo port install tre

or via homebrew [http://brew.sh] :

    brew install tre

or via compiling direct source code:

    wget http://laurikari.net/tre/tre-0.8.0.tar.gz
    tar -zxvf tre-0.8.0.tar.gz
    cd tre-0.8.0
    ./configure
    make
    sudo make install

Other Operating Systems
-----------------------
For other installation options, please visit the following web site:

    http://laurikari.net/tre/download/

zlib
====
On Ubuntu (or any other Debian-based Linux distribution):

    sudo apt-get install zlib1g zlib1g-dev

On Mac OS X the zlib library comes pre-installed.

Path::Class (optional)
======================
The example trimmer, fqgrep-trim.pl, provided in the scripts
subdirectory of this git repository, may require the 'Path::Class'
perl module (found on CPAN). To install it, run the following
command as the root user:

    cpan Path::Class

INSTALLATION
============
The installation process currently consists of a very simple Makefile.
Just do the following:

    git clone https://github.com/indraniel/fqgrep.git
    cd fqgrep
    make    # try 'make genome' if at the Genome Center at Washington University
            # or would like to create an executable with all the
            # relevant libraries statically compiled into it
            #
            # try 'make macports' if installing on Mac OS X and you installed the
            # TRE library via macports (http://www.macports.org/)

The 'fqgrep' executable should then be in the working directory.
Afterwards, you can move the executable wherever you wish;
the usual choice is the directory "/usr/local/bin".

USAGE & DETAILS
===============
For more information about fqgrep and its associated scripts, please
visit the following documentation page:
https://github.com/indraniel/fqgrep/wiki
AUTHOR
======
Indraniel Das (indraniel at gmail dot com)
ACKNOWLEDGEMENTS
================
This software was developed at the Genome Center at Washington
University, St. Louis, MO.
Thanks to the Genome Center's Technology Development Informatics
Group (Jasreet Hundal, Jason Walker, and Todd Wylie) for insightful
discussions and feedback.
DISCLAIMER
==========
This software is provided "as is" without warranty of any kind.
DOI INFO
========
https://zenodo.org/badge/latestdoi/20108/indraniel/fqgrep
Created: March 15, 2011