Commit
Add gto_fasta_merge_streams and documentation
1 parent 50879cf, commit 1fc26cc
Showing 20 changed files with 201 additions and 11 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
\section{Program gto\char`_fasta\char`_merge\char`_streams} | ||
The \texttt{gto\char`_fasta\char`_merge\char`_streams} program merges the three channels of information (headers, extra and DNA) and writes the result into a FASTA file. \\ | ||
For help type: | ||
\begin{lstlisting} | ||
./gto_fasta_merge_streams -h | ||
\end{lstlisting} | ||
In the following subsections, we explain the input and output parameters. | ||
|
||
\subsection*{Input parameters} | ||
|
||
The \texttt{gto\char`_fasta\char`_merge\char`_streams} program takes as input the three files produced by the \texttt{gto\char`_fasta\char`_split\char`_streams} tool, and writes its result to the standard output stream. The output stream is a FASTA or Multi-FASTA file.\\ | ||
The options are assigned as follows: | ||
\begin{lstlisting} | ||
Usage: ./gto_fasta_merge_streams [options] [[--] args] | ||
or: ./gto_fasta_merge_streams [options] | ||
|
||
It merges the three channels of information (headers, extra and DNA) and writes it into a FASTA file. | ||
|
||
-h, --help Show this help message and exit | ||
|
||
Basic options | ||
-e, --extra=<str> Input file with the extra information | ||
-d, --dna=<str> Input file with the DNA information | ||
-H, --headers=<str> Input file with the headers information | ||
> output Output FASTA file format (stdout) | ||
|
||
Example: ./gto_fasta_merge_streams -e <filename> -d <filename> -H <filename> > output.fasta | ||
\end{lstlisting} | ||
|
||
\subsection*{Output} | ||
|
||
The output of the \texttt{gto\char`_fasta\char`_merge\char`_streams} program is a FASTA or Multi-FASTA file.\\ | ||
Given the three output files of the \texttt{gto\char`_fasta\char`_split\char`_streams} tool as input, this tool produces the following output: | ||
\begin{lstlisting} | ||
>AB000264 |acc=AB000264|descr=Homo sapiens mRNA | ||
ACAAGACGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCCTGGAGGGTCCACCGCTGCCCTGCTGCCATTGTCCCCGGCCCCACCTAAGGAAAAGCAGCCTCCTGACTTTCCTCGCTTGGGCCGAGACAGCGAGCATATGCAGGAAGCGGCAGGAAGTGGTTTGAGTGGACCTCCGGGCCCCTCATAGGAGAGGAAGCTCGGGAGGTGGCCAGGCGGCAGGAAGCAGGCCAGTGCCGCGAATCCGCGCGCCGGGACAGAATCTCCTGCAAAGCCCTGCAGGAACTTCTTCTGGAAGACCTTCTCCACCCCCCCAGCTAAAACCTCACCCATGAATGCTCACGCAAGTTTAATTACAGACCTGAA | ||
>AB000263 |acc=AB000263|descr=Homo sapiens mRNA | ||
ACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCACCGCTGCCCTGCCCCTGGAGGGTGGCCCCACCGGCCGAGACAGCGAGCATATGCAGGAAGCGGCAGGAATAAGGAAAAGCAGCCTCCTGACTTTCCTCGCTTGGTGGTTTGAGTGGACCTCCCAGGCCAGTGCCGGGCCCCTCATAGGAGAGGAAGCTCGGGAGGTGGCCAGGCGGCAGGAAGGCGCACCCCCCCAGCAATCCGCGCGCCGGGACAGAATGCCCTGCAGGAACTTCTTCTGGAAGACCTTCTCCTCCTGCAAATAAAACCTCACCCATGAATGCTCACGCAAGTTTAATTACAGACCTGAA | ||
\end{lstlisting} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,122 @@ | ||
#include <stdio.h> | ||
#include <stdlib.h> | ||
#include <string.h> | ||
#include <ctype.h> | ||
#include "argparse.h" | ||
#include <unistd.h> | ||
|
||
|
||
/* | ||
* This application merges three channels of information into a FASTA file: | ||
* - HEADERS; | ||
* - EXTRA; | ||
* - DNA. | ||
*/ | ||
int main(int argc, char *argv[]) | ||
{ | ||
|
||
FILE *HEADERS, *EXTRA, *DNA; | ||
int c, d = 0; | ||
const char *output_headers = NULL; | ||
const char *output_extra = NULL; | ||
const char *output_dna = NULL; | ||
|
||
char *programName = argv[0]; | ||
struct argparse_option options[] = { | ||
OPT_HELP(), | ||
OPT_GROUP("Basic options"), | ||
OPT_STRING('e', "extra", &output_extra, "Input file with the extra information"), | ||
OPT_STRING('d', "dna", &output_dna, "Input file with the DNA information"), | ||
OPT_STRING('H', "headers", &output_headers, "Input file with the headers information"), | ||
OPT_BUFF('>', "output", "Output FASTA file format (stdout)"), | ||
OPT_END(), | ||
}; | ||
struct argparse argparse; | ||
|
||
/* snprintf bounds the write, so a long program path cannot overflow usage. */ | ||
char usage[250]; | ||
snprintf(usage, sizeof(usage), "\nExample: %s -e <filename> -d <filename> -H <filename> > output.fasta\n", programName); | ||
|
||
argparse_init(&argparse, options, NULL, programName, 0); | ||
argparse_describe(&argparse, "\nIt merges the three channels of information (headers, extra and DNA) and writes it into a FASTA file.", usage); | ||
argc = argparse_parse(&argparse, argc, argv); | ||
|
||
if(argc != 0) | ||
argparse_help_cb(&argparse, options); | ||
|
||
if(output_headers == NULL) | ||
output_headers = "HEADERS.JV2"; | ||
|
||
if((HEADERS = fopen (output_headers, "r")) == NULL) | ||
{ | ||
fprintf(stderr, "Error: could not open file '%s'!\n", output_headers); | ||
return 1; | ||
} | ||
|
||
if(output_extra == NULL) | ||
output_extra = "EXTRA.JV2"; | ||
|
||
if((EXTRA = fopen (output_extra, "r")) == NULL) | ||
{ | ||
fprintf(stderr, "Error: could not open file '%s'!\n", output_extra); | ||
return 1; | ||
} | ||
|
||
if(output_dna == NULL) | ||
output_dna = "DNA.JV2"; | ||
|
||
if((DNA = fopen (output_dna, "r")) == NULL) | ||
{ | ||
fprintf(stderr, "Error: could not open file '%s'!\n", output_dna); | ||
return 1; | ||
} | ||
|
||
while((c = fgetc(EXTRA)) != EOF) | ||
{ | ||
|
||
if(c == '>') | ||
{ | ||
fprintf(stdout, "%c", c); | ||
while((c = fgetc(HEADERS)) != EOF) | ||
{ | ||
if(c == EOF) goto x; | ||
fprintf(stdout, "%c", c); | ||
if(c == '\n') break; | ||
} | ||
continue; | ||
} | ||
|
||
switch(c) | ||
{ | ||
|
||
case 0: | ||
if((d = fgetc(DNA)) == EOF) | ||
{ | ||
fprintf(stderr, "Error: invalid format!\n"); | ||
return 1; | ||
} | ||
fprintf(stdout, "%c", d); | ||
break; | ||
|
||
case 1: | ||
if((d = fgetc(DNA)) == EOF) | ||
{ | ||
fprintf(stderr, "Error: invalid format!\n"); | ||
return 1; | ||
} | ||
fprintf(stdout, "%c", tolower(d)); | ||
break; | ||
|
||
default: | ||
fprintf(stdout, "%c", c); | ||
break; | ||
} | ||
} | ||
|
||
x: | ||
|
||
/* Close each stream only when it was actually opened. */ | ||
if(HEADERS) fclose(HEADERS); | ||
if(EXTRA) fclose(EXTRA); | ||
if(DNA) fclose(DNA); | ||
return EXIT_SUCCESS; | ||
} |
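The merge loop in the file above can be easier to review and test when it is separated from file I/O. The following is a minimal sketch of the same rules applied to in-memory buffers, assuming the EXTRA encoding implied by the `switch` in `main()`: `'>'` marks a header line, byte 0 means "copy the next DNA symbol as is", byte 1 means "copy it in lowercase", and any other byte is copied verbatim. The function name `merge_buffers` and its signature are hypothetical and are not part of this commit or of GTO.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of GTO): merges in-memory copies of the
 * HEADERS, EXTRA and DNA streams into out, following the same rules as the
 * loop in main(). Returns the number of bytes written, or -1 when the DNA
 * stream runs out early or out is too small.
 */
long merge_buffers(const char *headers, size_t headers_len,
                   const unsigned char *extra, size_t extra_len,
                   const char *dna, size_t dna_len,
                   char *out, size_t out_cap)
{
    size_t h = 0, d = 0, o = 0;

    assert(headers != NULL && extra != NULL && dna != NULL && out != NULL);

    for (size_t i = 0; i < extra_len; i++) {
        unsigned char c = extra[i];

        if (c == '>') {
            /* Emit '>' and then copy one header line, newline included. */
            if (o >= out_cap) return -1;
            out[o++] = '>';
            while (h < headers_len) {
                char hc = headers[h++];
                if (o >= out_cap) return -1;
                out[o++] = hc;
                if (hc == '\n') break;
            }
        } else if (c == 0 || c == 1) {
            /* 0: copy the next DNA symbol as is; 1: copy it in lowercase. */
            if (d >= dna_len || o >= out_cap) return -1;
            char s = dna[d++];
            out[o++] = (c == 1) ? (char)tolower((unsigned char)s) : s;
        } else {
            /* Any other byte (for example '\n') is copied verbatim. */
            if (o >= out_cap) return -1;
            out[o++] = (char)c;
        }
    }
    return (long)o;
}
```

For example, with headers `"seq1\n"`, EXTRA bytes `{'>', 0, 1, 0, '\n'}` and DNA `"ACG"`, the sketch writes the 10 bytes `>seq1\nAcG\n`. Returning -1 on a short DNA stream mirrors the "invalid format" error path of the original loop.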
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
ACAAGACGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCCTGGAGGGTCCACCGCTGCCCTGCTGCCATTGTCCCCGGCCCCACCTAAGGAAAAGCAGCCTCCTGACTTTCCTCGCTTGGGCCGAGACAGCGAGCATATGCAGGAAGCGGCAGGAAGTGGTTTGAGTGGACCTCCGGGCCCCTCATAGGAGAGGAAGCTCGGGAGGTGGCCAGGCGGCAGGAAGCAGGCCAGTGCCGCGAATCCGCGCGCCGGGACAGAATCTCCTGCAAAGCCCTGCAGGAACTTCTTCTGGAAGACCTTCTCCACCCCCCCAGCTAAAACCTCACCCATGAATGCTCACGCAAGTTTAATTACAGACCTGAAACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCACCGCTGCCCTGCCCCTGGAGGGTGGCCCCACCGGCCGAGACAGCGAGCATATGCAGGAAGCGGCAGGAATAAGGAAAAGCAGCCTCCTGACTTTCCTCGCTTGGTGGTTTGAGTGGACCTCCCAGGCCAGTGCCGGGCCCCTCATAGGAGAGGAAGCTCGGGAGGTGGCCAGGCGGCAGGAAGGCGCACCCCCCCAGCAATCCGCGCGCCGGGACAGAATGCCCTGCAGGAACTTCTTCTGGAAGACCTTCTCCTCCTGCAAATAAAACCTCACCCATGAATGCTCACGCAAGTTTAATTACAGACCTGAA |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
AB000264 |acc=AB000264|descr=Homo sapiens mRNA | ||
AB000263 |acc=AB000263|descr=Homo sapiens mRNA |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
>AB000264 |acc=AB000264|descr=Homo sapiens mRNA | ||
ACAAGACGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCCTGGAGGGTCCACCGCTGCCCTGCTGCCATTGTCCCCGGCCCCACCTAAGGAAAAGCAGCCTCCTGACTTTCCTCGCTTGGGCCGAGACAGCGAGCATATGCAGGAAGCGGCAGGAAGTGGTTTGAGTGGACCTCCGGGCCCCTCATAGGAGAGGAAGCTCGGGAGGTGGCCAGGCGGCAGGAAGCAGGCCAGTGCCGCGAATCCGCGCGCCGGGACAGAATCTCCTGCAAAGCCCTGCAGGAACTTCTTCTGGAAGACCTTCTCCACCCCCCCAGCTAAAACCTCACCCATGAATGCTCACGCAAGTTTAATTACAGACCTGAA | ||
>AB000263 |acc=AB000263|descr=Homo sapiens mRNA | ||
ACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCACCGCTGCCCTGCCCCTGGAGGGTGGCCCCACCGGCCGAGACAGCGAGCATATGCAGGAAGCGGCAGGAATAAGGAAAAGCAGCCTCCTGACTTTCCTCGCTTGGTGGTTTGAGTGGACCTCCCAGGCCAGTGCCGGGCCCCTCATAGGAGAGGAAGCTCGGGAGGTGGCCAGGCGGCAGGAAGGCGCACCCCCCCAGCAATCCGCGCGCCGGGACAGAATGCCCTGCAGGAACTTCTTCTGGAAGACCTTCTCCTCCTGCAAATAAAACCTCACCCATGAATGCTCACGCAAGTTTAATTACAGACCTGAA |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
#!/bin/bash | ||
../../bin/gto_fasta_merge_streams -e EXTRA.JV2 -H HEADERS.JV2 -d DNA.JV2 > output.fasta |