Exception in denoising.py when using --denoising and --distance > 1 arguments #75

Open
gregdev00 opened this issue Feb 28, 2025 · 1 comment

Comments

@gregdev00

gregdev00 commented Feb 28, 2025

Hi, I want to report that denoising.py throws exceptions when it is run with the --denoising and --distance 3 arguments.

The problem appears to come from a typo in a variable name and from two functions that were available in previous versions but are now missing.

Issue 1: Typo in Variable Name

The error occurs on line 1855 of denoising.py, where tmpFiles is written instead of tmp_files.

It is: denoising_resized_seeds = tmpFiles.add( filename_woext + '_denoising_resizedSeeds.fasta' )

But it should be: denoising_resized_seeds = tmp_files.add( filename_woext + '_denoising_resizedSeeds.fasta' )
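
A misspelled name like this only fails when the line actually executes, which would be consistent with the exception showing up only for certain arguments. Below is a minimal sketch of that behavior. `TmpFiles` and `build_paths` are hypothetical stand-ins, not FROGS code, and it is an assumption that line 1855 sits in a branch taken only when the distance is greater than 1.

```python
class TmpFiles:
    """Hypothetical stand-in for the temporary-file manager used by denoising.py."""

    def __init__(self):
        self.files = []

    def add(self, name):
        # Register the file name and hand it back to the caller
        self.files.append(name)
        return name


def build_paths(filename_woext, distance):
    tmp_files = TmpFiles()
    seeds = tmp_files.add(filename_woext + '_seeds.fasta')
    if distance > 1:
        # Misspelled name: Python only looks it up when this line executes
        seeds = tmpFiles.add(filename_woext + '_denoising_resizedSeeds.fasta')
    return seeds


print(build_paths('sample', 1))  # the faulty branch is skipped, so this works
try:
    build_paths('sample', 3)     # the faulty branch runs and raises NameError
except NameError as err:
    print('NameError:', err)
```

Renaming `tmpFiles` to `tmp_files` inside the branch makes both calls succeed.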

Issue 2: Missing Functions

Two functions that were available in clustering.py before FROGS v5.0.0 are also missing from denoising.py:

resizeSeed()
agregate_composition()

The script calls both functions, but neither is defined in denoising.py:

def resizeSeed(seed_in, seed_in_compo, seed_out):
    """
    @summary: add read abundance to seed sequence name
    @param seed_in : [str] Path to seed input fasta file
    @param seed_in_compo : [str] Path to seed input composition swarm file
    @param seed_out : [str] Path to seed output fasta file with abundance in name and sorted
    """
    # Map each cluster name to its total abundance (sum of the "_<count>" suffixes of its members)
    dict_cluster_abond = dict()
    with open(seed_in_compo, "rt") as f:
        for idx, line in enumerate(f):
            if not line.startswith("#"):
                cluster_name = "Cluster_" + str(idx+1)
                if "FROGS_combined" in line.split()[0]:
                    cluster_name += "_FROGS_combined"
                dict_cluster_abond[cluster_name] = sum(int(n.split("_")[-1]) for n in line.strip().split())

    # Append the abundance to each seed ID (assumes FastaIO is already imported in denoising.py)
    FH_input = FastaIO(seed_in)
    FH_out = FastaIO(seed_out, "wt")
    for record in FH_input:
        record.id += "_" + str(dict_cluster_abond[record.id])
        FH_out.write(record)
    FH_input.close()
    FH_out.close()


def agregate_composition(step1_compo, step2_compo, out_compo):
    """
    @summary: convert cluster composition in clusters into cluster composition in reads (in case of two-step clustering)
    @param step1_compo : [str] Path to cluster1 composition in read (clustering step1)
    @param step2_compo : [str] Path to cluster2 composition in cluster1 (clustering step2)
    @param out_compo : [str] Path to cluster2 composition in read
    """
    # Map each step-1 cluster name to its line of read IDs
    dict_cluster1_compo = dict()
    with open(step1_compo, "rt") as f:
        for idx, line in enumerate(f):
            if "FROGS_combined" in line.split()[0]:
                dict_cluster1_compo["Cluster_" + str(idx+1) + "_FROGS_combined"] = line.strip()
            else:
                dict_cluster1_compo["Cluster_" + str(idx+1)] = line.strip()

    # Replace each step-1 cluster (name minus its "_<abundance>" suffix) with its reads;
    # the with statement also closes the output file, which was previously left open
    with open(step2_compo, "rt") as f, open(out_compo, "wt") as FH_out:
        for line in f:
            compo = " ".join([dict_cluster1_compo["_".join(n.split('_')[0:-1])] for n in line.strip().split(" ")])
            FH_out.write(compo + "\n")

I hope you can fix this soon! Thanks for your hard work on FROGS.

@olivierrue
Collaborator

Hi @gregdev00,
Thank you for reporting the problem. The bug has been fixed in version 5.0.1, which has just been released on the bioconda channel.
