Added command line arguments --pctmarker_thresh and --pctORFscore_thresh to ShortBRED-Quantify. Added utils folder, with a script to help rename input fasta files when needed.
jimkaminski committed Feb 4, 2016
1 parent d5d0897, commit 04d884f
Showing 8 changed files with 130,113 additions and 16 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
######################################################################### | ||
# Jim Kaminski | ||
# Huttenhower Lab | ||
# 2/3/2016 | ||
######################################################################### | ||
|
||
|
||
|
||
""" | ||
This script makes small changes to an input fasta file (read from stdin) to | ||
format the sequence ids for ShortBRED. It will do the following: | ||
* Convert the first --numspaces (args.iSpacesToChange) spaces in each header to "_". | ||
* Add a unique ID when two seqs have the exact same name. | ||
* Replace characters like "*", "[" and ":" with "_". | ||
""" | ||
import sys | ||
import argparse | ||
import re | ||
from argparse import RawTextHelpFormatter | ||
|
||
# Reuse the module docstring as the help text so that RawTextHelpFormatter | ||
# keeps each bullet on its own line. | ||
parser = argparse.ArgumentParser(description=__doc__, formatter_class=RawTextHelpFormatter) | ||
parser.add_argument("--numspaces", default=2, type=int, dest='iSpacesToChange', | ||
    help='Number of spaces in each header, counting from the left, to convert to "_" (default: 2).') | ||
args = parser.parse_args() | ||
|
||
dictHeaderCounts = {} | ||
# Characters replaced with "_" in headers: \ / * = : ' [ ] . ; | ||
reBadChars = re.compile(r"[\\/*=:'\[\].;]") | ||
|
||
for strLine in sys.stdin: | ||
    strLine = strLine.strip() | ||
    # startswith() is safe on blank lines, where strLine[0] would raise IndexError. | ||
    if strLine.startswith(">"): | ||
        strHeader = strLine[1:].replace(" ","_",args.iSpacesToChange) | ||
        strHeader = reBadChars.sub("_",strHeader) | ||
        # Count each cleaned header; repeats get a zero-padded copy suffix. | ||
        dictHeaderCounts[strHeader] = dictHeaderCounts.get(strHeader,0)+1 | ||
        if dictHeaderCounts[strHeader] > 1: | ||
            strHeader = strHeader + "___Copy"+str(dictHeaderCounts[strHeader]).zfill(4) | ||
        print(">" + strHeader) | ||
    else: | ||
        # Sequence lines pass through unchanged. | ||
        print(strLine) | ||
|
||
|
||
|
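The renaming rules in the script's docstring can be illustrated with a small standalone sketch. This is not part of the commit: the function name `clean_headers`, its `num_spaces` parameter, and the example records are assumptions made for illustration, and the logic is wrapped in a generator instead of reading from stdin so that it can be tried on a list of lines.

```python
import re

# Same set of characters as the script's reBadChars pattern.
BAD_CHARS = re.compile(r"[\\/*=:'\[\].;]")


def clean_headers(lines, num_spaces=2):
    """Yield fasta lines with headers rewritten as the docstring describes.

    Hypothetical helper for illustration only; the committed script
    processes sys.stdin directly.
    """
    counts = {}
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # Convert the first num_spaces spaces to underscores.
            header = line[1:].replace(" ", "_", num_spaces)
            # Replace the remaining problem characters.
            header = BAD_CHARS.sub("_", header)
            # Repeated names get a zero-padded copy suffix.
            counts[header] = counts.get(header, 0) + 1
            if counts[header] > 1:
                header += "___Copy" + str(counts[header]).zfill(4)
            yield ">" + header
        else:
            yield line


# Made-up records: two sequences that share the same header.
example = [">seq 1: toxin", "MKT", ">seq 1: toxin", "MKV"]
cleaned = list(clean_headers(example))
print("\n".join(cleaned))
# >seq_1__toxin
# MKT
# >seq_1__toxin___Copy0002
# MKV
```

The first record keeps its cleaned name and only later repeats receive the `___Copy` suffix, so the numbering starts at 0002.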