Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alphabet conversion is not properly constrained #3267

Closed
2 tasks done
rrahn opened this issue Jul 16, 2024 · 5 comments · Fixed by #3268
Closed
2 tasks done

Alphabet conversion is not properly constrained #3267

rrahn opened this issue Jul 16, 2024 · 5 comments · Fixed by #3268
Labels
bug faulty or wrong behaviour of code

Comments

@rrahn
Copy link
Contributor

rrahn commented Jul 16, 2024

Does this problem persist on the current main?

  • I have verified the issue on the current main

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

Explicit casting of a bitpacked proxy type to its actual value type causes a compile time error.
See #3264 for more details.

Expected Behavior

It compiles.

Steps To Reproduce

#include <vector>

#include <seqan3/alphabet/container/bitpacked_sequence.hpp>
#include <seqan3/alphabet/nucleotide/dna4.hpp>

using namespace seqan3::literals;

seqan3::bitpacked_sequence<seqan3::dna4> source{"ACGGTCAGGTTC"_dna4};
auto it = source.begin();
seqan3::dna4 val = seqan3::dna4{*it}; // compile error - due to explicit conversion

Environment

- Operating system: macOS 14.5
- SeqAn version: main
- Compiler: gcc 13.3

Anything else?

One problem is this code:

template <alphabet in_t, alphabet out_t>
constexpr std::array<out_t, alphabet_size<in_t>> convert_through_char_representation
{
[]() constexpr {
std::array<out_t, alphabet_size<in_t>> ret{};
// for (decltype(alphabet_size<in_t>) i = 0; ...) causes indefinite compilation :(
for (auto i = decltype(alphabet_size<in_t>){0}; i < alphabet_size<in_t>; ++i)
assign_char_to(to_char(assign_rank_to(i, in_t{})), ret[i]);
return ret;
}()
};

Here, we only require in_t to be an alphabet type but this does not include that in_t is also default constructible. However, inside of the initialization of the conversion table the default constructor of in_t is called, causing the seen compile error.

To fix this the in_t must be constrained with std::default_initializable as well.

@rrahn rrahn added the bug faulty or wrong behaviour of code label Jul 16, 2024
@eseiler
Copy link
Member

eseiler commented Jul 16, 2024

diff --git a/include/seqan3/alphabet/detail/convert.hpp b/include/seqan3/alphabet/detail/convert.hpp
index c3ebda2a5..8405d6a9a 100644
--- a/include/seqan3/alphabet/detail/convert.hpp
+++ b/include/seqan3/alphabet/detail/convert.hpp
@@ -30,6 +30,7 @@ namespace seqan3::detail
  * \hideinitializer
  */
 template <alphabet in_t, alphabet out_t>
+    requires std::default_initializable<in_t>
 constexpr std::array<out_t, alphabet_size<in_t>> convert_through_char_representation
 {
     []() constexpr {
diff --git a/include/seqan3/alphabet/nucleotide/nucleotide_base.hpp b/include/seqan3/alphabet/nucleotide/nucleotide_base.hpp
index f1c7bb054..5f97b42bc 100644
--- a/include/seqan3/alphabet/nucleotide/nucleotide_base.hpp
+++ b/include/seqan3/alphabet/nucleotide/nucleotide_base.hpp
@@ -77,6 +77,7 @@ public:
     template <typename other_nucl_type>
         requires (!std::same_as<nucleotide_base, other_nucl_type>)
               && (!std::same_as<derived_type, other_nucl_type>) && nucleotide_alphabet<other_nucl_type>
+              && std::default_initializable<other_nucl_type>
     explicit constexpr nucleotide_base(other_nucl_type const & other) noexcept
     {
         static_cast<derived_type &>(*this) =

should work.

Unfortunately, it's not enough to constrain the array (actually, we don't have to), we also have to constrain the ctor.

We probably have this ctor in multiple alphabets...
Maybe there is a way to construct the array if in_t is not default initializable?

@eseiler
Copy link
Member

eseiler commented Jul 16, 2024

Without confirming via debugging, I figure that this is the implicit conversion:

constexpr operator alphabet_type() const noexcept
{
if constexpr (std::is_class_v<alphabet_type>)
return *this;
else
return assign_rank_to(base_t::to_rank(), alphabet_type{});
/* Instead of static_cast'ing to the alphabet_type which also considers the constructors of the alphabet_type,
* we explicitly invoke this operator in various places.
* This prevents errors associated with using alphabet_type's constructors.
*
* This is one of error cases:
* The tuple composite seqan3::qualified returns a component_proxy which inherits from alphabet_proxy_base.
* The qualified alphabet itself inherits from phred_base.
* Now when accessing get<1>(seq_qual_alph) we want to call to_phred at some point because we want the quality,
* therefore the to_phred function from alphabet_proxy is called, but this function did a static_cast to the
* derived type which is calling the constructor from phred_base. Unfortunately now, the generic phred_base
* constructor uses `assign_phred_to(to_phred(other), static_cast<derived_type &>(*this))`; (here) which again
* tries to call to_phred of the alphabet_proxy => infinite loop :boom:
*/
}
//!\brief Implicit conversion to types that the emulated type is convertible to.
template <typename other_t>
requires (!std::is_class_v<alphabet_type>) && std::convertible_to<alphabet_type, other_t>
constexpr operator other_t() const noexcept
{
return operator alphabet_type();
}

seqan::bitpacked_sequence::reference_proxy_type inherits from alphabet_proxy.

template <typename other_nucl_type>
requires (!std::same_as<nucleotide_base, other_nucl_type>)
&& (!std::same_as<derived_type, other_nucl_type>) && nucleotide_alphabet<other_nucl_type>
explicit constexpr nucleotide_base(other_nucl_type const & other) noexcept
{
static_cast<derived_type &>(*this) =
detail::convert_through_char_representation<other_nucl_type, derived_type>[seqan3::to_rank(other)];
}

is the explicit ctor.

By constraining this one, we fall back to implicit conversion (Unless I'm mixing up conversion/casting rules).

@rrahn
Copy link
Contributor Author

rrahn commented Jul 16, 2024

I can confirm these findings. And correctly constraining the explicit constructor should do it. To fix this, I am planning to add a detail concept called convertable_through_char_representation that should be used for these constructors.

@eseiler
Copy link
Member

eseiler commented Jul 16, 2024

Something like that would also work, though it's not that nice...

diff --git a/include/seqan3/alphabet/detail/alphabet_proxy.hpp b/include/seqan3/alphabet/detail/alphabet_proxy.hpp
index 7be8044a8..953a59cd4 100644
--- a/include/seqan3/alphabet/detail/alphabet_proxy.hpp
+++ b/include/seqan3/alphabet/detail/alphabet_proxy.hpp
@@ -133,6 +133,8 @@ public:
     //!\brief The alphabet size.
     static constexpr auto alphabet_size = seqan3::alphabet_size<alphabet_type>;
 
+    using alphabet_t = alphabet_type;
+
     /*!\name Write functions
      * \brief All of these call the emulated type's write functions and then delegate to
      *        the assignment operator which invokes derived behaviour.
diff --git a/include/seqan3/alphabet/detail/convert.hpp b/include/seqan3/alphabet/detail/convert.hpp
index c3ebda2a5..0c172be14 100644
--- a/include/seqan3/alphabet/detail/convert.hpp
+++ b/include/seqan3/alphabet/detail/convert.hpp
@@ -30,18 +30,19 @@ namespace seqan3::detail
  * \hideinitializer
  */
 template <alphabet in_t, alphabet out_t>
-constexpr std::array<out_t, alphabet_size<in_t>> convert_through_char_representation
-{
-    []() constexpr {
+constexpr std::array<out_t, alphabet_size<in_t>> convert_through_char_representation{
+    []() constexpr
+    {
         std::array<out_t, alphabet_size<in_t>> ret{};
 
-        // for (decltype(alphabet_size<in_t>) i = 0; ...) causes indefinite compilation :(
-        for (auto i = decltype(alphabet_size<in_t>){0}; i < alphabet_size<in_t>; ++i)
-            assign_char_to(to_char(assign_rank_to(i, in_t{})), ret[i]);
+        static_assert(std::default_initializable<in_t> || requires { typename in_t::alphabet_t; });
+        using value_t = std::conditional_t<std::default_initializable<in_t>, in_t, typename in_t::alphabet_t>;
+
+        for (auto i = decltype(alphabet_size<value_t>){0}; i < alphabet_size<value_t>; ++i)
+            assign_char_to(to_char(assign_rank_to(i, value_t{})), ret[i]);
 
         return ret;
-    }()
-};
+    }()};
 // clang-format on
 
 } // namespace seqan3::detail

rrahn added a commit to rrahn/seqan3 that referenced this issue Jul 16, 2024
@rrahn rrahn mentioned this issue Jul 16, 2024
@rrahn
Copy link
Contributor Author

rrahn commented Jul 16, 2024

Sure, that seems valid. However, it feels too specific to seqan3. In general, any third party alphabet proxy should work as well but now we need to constraint that the proxy type exposes the alphabet_t type for which in_t is a proxy for. And then we would also need to check whether that type is also default_initializable, since alphabets to not require this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug faulty or wrong behaviour of code
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants