Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a warning for safer reading of FASTA files using IndexedFastaCustom + bumped GitHub Actions python version to 3.9 #122

Closed
wants to merge 4 commits into from

Conversation

SagiPolaczek
Copy link
Collaborator

Issue

A case that the code didn't cover yet:

  1. Creating fasta files
  2. Reading them using an instance of IndexedFastaCustom (.hdf5 files were created)
  3. Updating the fasta files
  4. Reading them again using IndexedFastaCustom with force_recreate_index=False.

Now the offset values are not matching the .fasta files content and the user didn't get any notice for that.

Suggested (relaxed) fix

  1. Compare the creation times of the files.
  2. Check if the creation time of the .fasta files is newer than the .hdf5 (index) file by a margin bigger than 42 (arbitrary number less than a minute)
  3. If true, raise a warning and notify the user.

@SagiPolaczek SagiPolaczek changed the title Add a warning for safer reading of FASTA files using IndexedFastaCustom Add a warning for safer reading of FASTA files using IndexedFastaCustom + bumped GitHub Actions python version to 3.9 Apr 17, 2024
Copy link
Collaborator

@mosheraboh mosheraboh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Sagi!
Few comments inline

fasta_file_creation_time = os.path.getctime(self.filename)
index_file_creation_time = os.path.getctime(self.index_filename)

if fasta_file_creation_time > index_file_creation_time + 42:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the extra 42 seconds?

@@ -98,6 +99,16 @@ def __init__(
print(f"index file already found: {index_filename}")
already_found = True

if already_found:
# Fasta file should be older than it's corresponding index file
fasta_file_creation_time = os.path.getctime(self.filename)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe mtime is safer?

index_file_creation_time = os.path.getctime(self.index_filename)

if fasta_file_creation_time > index_file_creation_time + 42:
warnings.warn(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe raise an error and suggest deleting the index file to recreate it?

@mosheraboh
Copy link
Collaborator

@SagiPolaczek is it something you want to merge?
(I'm doing some cleaning)

@mosheraboh mosheraboh closed this Sep 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants