Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unsupported SAM header tag in @HD: pb #3215

Closed
notestaff opened this issue Nov 25, 2023 · 6 comments · Fixed by #3246 or #3247
Closed

Unsupported SAM header tag in @HD: pb #3215

notestaff opened this issue Nov 25, 2023 · 6 comments · Fixed by #3246 or #3247
Labels
question a user question how to do certain things

Comments

@notestaff
Copy link

notestaff commented Nov 25, 2023

Platform

  • SeqAn version: 3.3.0
  • Operating system: Linux bpb23-acc 5.4.0-136-generic #153-Ubuntu SMP Thu Nov 24 15:56:58 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
  • Compiler: x86_64-conda-linux-gnu-c++ (conda-forge gcc 12.3.0-2) 12.3.0

Question

Is there a way to suppress the message Unsupported SAM header tag in @HD: pb when reading PacBio bam files?
Relatedly, the documentation says

* The function throws a seqan3::format_error if any unknown tag was encountered. It will also fail if the format is
, but that's no longer the case, right? (And the current behavior of not throwing an error seems correct.)

@notestaff notestaff added the question a user question how to do certain things label Nov 25, 2023
@eseiler
Copy link
Member

eseiler commented Nov 26, 2023

Hey @notestaff,

once again, thank you for finding flaws in our documentation!

It should definitely not say that it throws. It's probably a leftover from when we amended the alignment file parsing to be a bit less strict.

Do you have an example SAM/BAM file? Just the header section and one record/entry should be enough. We can also just add the pb header to a file ourselves, but it's better to have an example of what you are using.

In general, pb is not part of the SAM-specification's @HD tag, so we emit a warning.

What would be a solution you are looking for? It would be possible to redirect std:cerr while parsing the file, such that the warning is not emitted.

Some option to ignore/warn/error would also be possible, but would require code restructuring.

@notestaff
Copy link
Author

The first header line reads
@HD VN:1.6 SO:unknown pb:5.0.0
What would be useful is an option to suppress warnings about unsupported SAM header tag. Simplest might be to make this a compilation option, e.g. if SEQAN3_SUPPRESS_UNSUPPORTED_HEADER_TAG_WARNING is defined to 1 then don't print the warning. Or this could be part of the configuration options when opening a SAM file. Best would of course be to have a way to specify additional recognized tags. Thanks!

@eseiler
Copy link
Member

eseiler commented Feb 29, 2024

@notestaff
Copy link
Author

Great, thanks a lot! Just in time for external release of our seqan3-based tool :)

@notestaff
Copy link
Author

But, this fix suppresses ALL warnings -- a way to suppress specific ones would be better.

Also, the SAM spec says "you can freely add new tags for further data fields. Tags containing lowercase letters are reserved for local use". So I'm not sure this should be a warning at all, at least by default, especially for lowercase tags like in our example.

@eseiler
Copy link
Member

eseiler commented Feb 29, 2024

But, this fix suppresses ALL warnings -- a way to suppress specific ones would be better.

It's currently the only warning we emit ("Unsupported SAM header tag in [...]").

It's also a bit more flexible than just on/off in that you can postprocess the warning, e.g.,

void filter()
{
    auto fin = get_sam_file_input();
    std::ostringstream stream{};
    fin.options.stream_warnings_to = std::addressof(stream);
    auto it = fin.begin();
    for (auto && warning : stream.view() | std::views::split('\n'))
    {
        if (std::string_view sv{warning.data(), warning.size()}; !sv.empty() && !sv.ends_with("pb"))
            std::cerr << sv << '\n';
    }
}

Would be nicer to overload operator<<, but then we would need either some customisation point object...

Also, the SAM spec says "you can freely add new tags for further data fields. Tags containing lowercase letters are reserved for local use". So I'm not sure this should be a warning at all, at least by default, especially for lowercase tags like in our example.

I did miss this one sentence in the spec multiple times :)
Before this was a warning, it was an exception.
I'm not sure if the warning is meant as "it's not valid, but we just continue" or "we don't support that" (in contrast to alignment tags).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question a user question how to do certain things
Projects
None yet
2 participants