Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Type traits error in c++17 version #47851

Closed
YarikTH opened this issue Dec 14, 2020 · 24 comments
Closed

Type traits error in c++17 version #47851

YarikTH opened this issue Dec 14, 2020 · 24 comments
Assignees
Labels
bugzilla Issues migrated from bugzilla invalid Resolved as invalid, i.e. not a bug libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.

Comments

@YarikTH
Copy link

YarikTH commented Dec 14, 2020

Bugzilla Link 48507
Resolution INVALID
Resolved on Jun 17, 2021 12:26
Version unspecified
OS Linux
Blocks #48661
CC @alexezeder,@ldionne,@mclow,@zygoloid,@tstellar

Extended Description

Here is the two strings code using type traits:

#include "nlohmann/json.hpp"
static_assert( std::is_copy_assignable<nlohmann::ordered_json>::value, "" );

https://godbolt.org/z/4eqjas

It works perfectly with libstdc++, and it works with libc++ using c++11 or c++14.
But libc++ with c++17 can't compile this code. I tried to dig into problem, to find the difference between c++17 and non-c++17 in libc++ headers, I tried to dig into nlohmann json to extract minimal example, but failed.

Root issue on Github: nlohmann/json#2491

@YarikTH
Copy link
Author

YarikTH commented Dec 14, 2020

assigned to @ldionne

@mclow
Copy link
Contributor

mclow commented Dec 14, 2020

There were definitely some "tightening up" of the requirements on the type traits after C++14. One of the goals was to avoid having the same type trait give different answers at different points in the compilation.

@llvmbot
Copy link
Member

llvmbot commented Dec 16, 2020

The "tightened up" requirements probably don't matter in this case.
What the compiler complains about is not "static_assert" but missing "is_copy_assignable<...>::value".

There is also some "magic" declaration than changes the compiler behavior:

using J = nlohmann::ordered_json;
using magic = decltype((std::declval<J&>() = std::declval<const J&>()));
static_assert(std::is_copy_assignable::value, "");

This code compiles successfully, even with "-std=c++17 -stdlib=libc++".
With no "magic" the result will be as described above.

See a working example here: https://godbolt.org/z/87zGf3

@YarikTH
Copy link
Author

YarikTH commented Dec 16, 2020

Speaking about magic.
I posted it in the issue on github, so here is a copy:

Additional info. I refuse to comprehend it, but forced instantiation of
std::is_copy_constructiblenlohmann::ordered_json 'fixes' this problem.
Looks like a bug in libc++.

Declaration of
static_assert( std::is_copy_constructiblenlohmann::ordered_json::value, "" );
before checking if is_copy_assignable causes code to compile.

See a working example here: https://godbolt.org/z/vooaWe

@YarikTH
Copy link
Author

YarikTH commented Dec 16, 2020

To emphasize that this problem not in the static_assert, here is the code without it:

#include <type_traits>
#include "https://raw.githubusercontent.com/nlohmann/json/v3.9.1/single_include/nlohmann/json.hpp"
using J = nlohmann::ordered_json;
constexpr bool foo = std::is_copy_assignable<J>::value;

See a working example here: https://godbolt.org/z/h7cYa1

@YarikTH
Copy link
Author

YarikTH commented Dec 16, 2020

I minimized the library code as much as I can. Still 400 lines of complicated template code, but at least it's a good starting point to investigate futher. It's too much to paste here, so here are the links:

Godbolt: https://godbolt.org/z/zf17vT
GitHub: https://github.com/YarikTH/json/blob/feature/libcxx_debug/single_include/nlohmann/json.hpp

@alexezeder
Copy link

Thanks to 400+ lines in the previous comment by Krylov Yaroslav, I was able to make a deeper investigation and found a bit smaller code example that also fails. But the example that I'm about to present also fails on libstdc++ too, GCC standard library implementation. While the original 400+ lines version fails only with libc++. Probably it's due to a reduced number of included files because I was only removing code, but not adding.

Here are two demonstrations on Compiler Explorer with the same code:

  1. https://godbolt.org/z/fhTf7j
  2. https://godbolt.org/z/jbeTje

The first one just shows that Clang now also fails with libstdc++, while GCC does not fail, as well as MSVC with its own STD library.

The second demonstrates that by using some workarounds a compilation error can be eliminated on Clang. There are 2 workarounds, but they have one thing in common - they manipulate somehow with std::pair, the first workaround just replaces the usage of std::pair with std::tuple in the code, while the second one explicitly instantiates std::pair with used template parameters.

@alexezeder
Copy link

Actually, this looks like a close but not exactly the same bug as the initial one.
For example, as I mentioned, it also fails on libstdc++, and I supposed that it's due to a reduced number of included files, but that's not true. I tried to manipulate with header files in any way possible (even using all headers from the original code), but the code I provided fails on both libstdc++ and libc++.
So, the problem probably not in the libc++, but in Clang, at least for my code example.

@alexezeder
Copy link

I also discovered that Clang versions < 6.0 do not fail with both code examples - the initial one by Krylov Yaroslav and the modified one by me.
https://godbolt.org/z/z1GzeW Clang 6.0.0 vs 5.0.1 (by Krylov Yaroslav from comment #​5)
https://godbolt.org/z/K53c6s Clang 6.0.0 vs 5.0.1 (by me from comment #​6)

Looks like there was some breaking change introduced in Clang 6.0.

@alexezeder
Copy link

I ran git bisect for commits from 287f684 (root of 5.0 release) to d359f20 (latest commit in 6.0 release). Here are links for these commits on GitHub: 287f684 and d359f20.

Bisection procedure pointed out the commit that fails a C++17 build with libc++ for the code example provided in comment #​5 - it's 67ef14f:

Author: Richard Smith [email protected]
Date: Tue Sep 26 18:37:55 2017 +0000

Resolve a defect in C++17 copy omission.

When selecting constructors for initializing an object of type T from a single
expression of class type U, also consider conversion functions of U that
convert to T (rather than modeling such conversions as calling a conversion
function and then calling a constructor).

This approach is proposed as the resolution for the defect, and is also already
implemented by GCC.

llvm-svn: 314231

Complete diff of this commit can be found on GitHub too - 67ef14f.

It looks like the addition of a new feature resulted in invoking some underlying Clang bug. But since I know more or less nothing about what's going on there, I cannot determine what's exactly causing the underlying bug.

@alexezeder
Copy link

Bisect run for the simplified code example (from comment #​6) points to the same 67ef14f commit.

So both of these code examples share the same exposition of that underlying Clang bug. The question here is why the first one does not fail with libstdc++? Explicit instantiations, different types, idk 🤷...

Also, I think it's worth marking this issue as confirmed.

@zygoloid
Copy link
Mannequin

zygoloid mannequin commented Feb 4, 2021

This is probably a bug in the user code -- while attempting to form a copy assignment of nlohmann::ordered_json in order to determine whether the type is copy-assignable, the user code probably recursively queried is_copy_assignablenlohmann::ordered_json itself, causing a cyclic template instantiation that cached the "wrong" result in the template specialization. (That's usually the cause for problems that look like this.)

@alexezeder
Copy link

First of all, thank you for bringing some clarity here.
I can confirm that yes, both code examples tries to instantiate is_copy_assignable::value again while trying to instantiate it for the first time. So the line:

    constexpr bool b = std::is_copy_assignable<some_type>::value;

is not actually that fails but this line from std header does:

    _LIBCPP_INLINE_VISIBILITY
    pair& operator=(typename conditional<
                        is_copy_assignable<first_type>::value &&
                        is_copy_assignable<second_type>::value,  // << `some_type` here
                    pair, __nat>::type const& __p)

There is a quite long way to instantiate std::pair of some_type according to the error trace, but this way is the most interesting part for me. After further investigation, I found out that the code in comment #​6 and code in comment #​5 differs slightly because the code from the original library (i.e. comment #​5) has these std::is_pointer::value checks. They should guard the conversion operator against being called with some pointer type passed. But... they do not work like that because Clang instantiates all that is_detected structures before evaluating bool values (but how else). This is why the error trace is so long, it just never stops on those checks. Here is a line from the error trace for the code example in comment #​5:

<source>:367:106: note: while substituting deduced template arguments into function template 'operator type-parameter-0-0' [with ValueType = const nlohmann::ordered_map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, nlohmann::basic_json<nlohmann::ordered_map, std::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long, unsigned long, double, std::allocator, adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>>, std::__1::less<void>, std::__1::allocator<std::__1::pair<const std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, nlohmann::basic_json<nlohmann::ordered_map, std::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long, unsigned long, double, std::allocator, adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>>>>> *, $1 = (no value)]
    constexpr auto get_ptr() const noexcept -> decltype(std::declval<const basic_json_t&>().get_impl_ptr(std::declval<PointerType>()))

it's huge for one line, but the most important thing is that the conversion operator is instantiated for the [ValueType = const nlohmann::ordered_map<...> *] - a pointer type.

For example, the error in the code from comment #​6 can be resolved by making get_template_function_checker (and further) instantiation only when the std::is_pointer::value check passed, like so https://godbolt.org/z/Kex11h or more straightforward https://godbolt.org/z/55P99W.
Hope that a similar change can also solve the original problem in the nlohmann::json library.

Still, I have unanswered questions about that problem.
How does explicit instantiation solves the error for the Clang here, should it just end up with the same error? Probably only debugging of template instantiations in Clang with explicit template instantiation example could help me to understand that.
How do other major compilers manage to compile this code? I can definitely see some differences in the behavior of Clang, GCC and MSVC in provided examples, but they differ also in such trivial cases like https://godbolt.org/z/csWM3n, and that looks strange to me.

@zygoloid
Copy link
Mannequin

zygoloid mannequin commented Feb 5, 2021

How does explicit instantiation solves the error for the Clang here, should it just end up with the same error?

I'd imagine we have a dependency cycle: is_copy_assignablenlohmann::ordered_json depends on is_copy_constructiblenlohmann::ordered_json, which depends on is_copy_assignablenlohmann::ordered_json.

Explicitly instantiating is_copy_constructible means we enter the cycle at a different point, which means we break the cycle at a different point, and the point at which we break the cycle will typically be the point at which we cache a meaningless value. Entering the trait dependency cycle at different points can result in different behavior.

That's the same thing you see in the case in comment#2: there, we enter the cycle while actually performing the assignment, not while computing is_copy_assignable, so the cycle is broken in a different place, so we happen to not end up caching something bad in is_copy_assignable.

How do other major compilers manage to compile this code?

Different standard library implementations either will or won't check type traits as they go, and so may or may not cache bad values as a side-effect of any particular operation. In addition, GCC has an optimization where it doesn't consider template overload candidates if there's an exact match among non-template candidates, which can sometimes mean that GCC doesn't consider a SFINAE check that Clang does, so in the Clang compilation we end up performing more type trait computations, which can lead to us discovering more cycles. I think that's why GCC accepts https://godbolt.org/z/csWM3n.

These kinds of problems usually arise when a function template (usually a constructor or assignment operator or a conversion function) is insufficiently constrained. Usually what's needed, when a constructor or assignment operator is templated to take a T (or const T& or similar) or a conversion function is templated to return a T (or const T& or similar) is a SFINAE check that T is not the same as the enclosing class type, before anything else is checked.

For example, here's libc++'s std::optional converting constructor: https://github.com/llvm/llvm-project/blob/master/libcxx/include/optional#L715

Note that it checks _CheckOptionalArgsCtor<_Up>::template __enable_implicit<_Up>(), and the first things that _CheckOptionalArgsCtor checks are that _Up is not in_place_t nor optional:

template <class _Up>
using _CheckOptionalArgsCtor = _If<
    _IsNotSame<__uncvref_t<_Up>, in_place_t>::value &&
    _IsNotSame<__uncvref_t<_Up>, optional>::value,
    _CheckOptionalArgsConstructor,
    __check_tuple_constructor_fail
>;

Looking at the current implementation (https://github.com/YarikTH/json/blob/feature/libcxx_debug/single_include/nlohmann/json.hpp#L459) it looks like there is an attempt to apply this kind of protection to the conversion function, but it's not correct:

template < typename ValueType, typename std::enable_if <
               !std::is_pointer<ValueType>::value&&
               !detail::is_basic_json<ValueType>::value

[...]
&& detail::is_detected<detail::get_template_function, const basic_json_t&, ValueType>::value
, int >::type = 0 >

The problem here is that substitution into the enable_if condition does not short-circuit when we hit the first ::value that evaluates to false. So even if is_basic_json::value evaluates to true, we still substitute into the is_detected<...> check, which is how we end up recursing back to checking the same thing we were already checking.

The fix is to use std::conjunction instead of a sequence of &&s, so that evaluating the type traits short-circuits (see https://stackoverflow.com/a/55161485/1041090):

template < typename ValueType, typename std::enable_if <
           std::conjunction<
               std::negation<std::is_pointer<ValueType>>,
               std::negation<detail::is_basic_json<ValueType>>,

[...]
detail::is_detected<detail::get_template_function, const basic_json_t&, ValueType>>::value
, int >::type = 0 >

@zygoloid
Copy link
Mannequin

zygoloid mannequin commented Feb 5, 2021

Note that it checks _CheckOptionalArgsCtor<_Up>::template
__enable_implicit<_Up>(), and the first things that _CheckOptionalArgsCtor
checks are that _Up is not in_place_t nor optional:

And I should point out, critically, it does all these checks and short-circuits if they fail, before triggering instantiation of __enable_implicit, which is where the potentially recursive convertibility checks would happen.

@alexezeder
Copy link

Thank you again for making the problem less suspicious, at least for me.

But I want to clarify what is exactly strange in that code example where Clang is the only compiler that fails to compile (https://godbolt.org/z/csWM3n).

Different standard library implementations either will or won't check type
traits as they go, and so may or may not cache bad values as a side-effect
of any particular operation.
I agree that different standard library implementations can have specific checks, etc. But GCC and Clang in that example (on Compiler Explorer) use the same STD library from GCC, stdlibc++, so this statement doesn't apply to that example.

In addition, GCC has an optimization where it
doesn't consider template overload candidates if there's an exact match
among non-template candidates, which can sometimes mean that GCC doesn't
consider a SFINAE check that Clang does, so in the Clang compilation we end
up performing more type trait computations, which can lead to us discovering
more cycles. I think that's why GCC accepts https://godbolt.org/z/csWM3n.
As far as I can see in the error trace for this code, Clang is actually instantiating some_type::operator some_type(), but only if exactly that form of a copy-assignment operator is declared. Considering this example - https://godbolt.org/z/61dozK - it looks like Clang tries to convert some_type const& argument to some_type while checking that assignment operator, but it tries to do this only if the conversion operator is declared. But keeping in mind your words about Clang processing more type-trait paths it looks pretty reasonable, still a bit strange.

@llvmbot
Copy link
Member

llvmbot commented Feb 6, 2021

Bisection procedure pointed out the commit that fails a C++17 build with
libc++ for the code example provided in comment #​5 - it's
67ef14f:

Author: Richard Smith [email protected]
Date: Tue Sep 26 18:37:55 2017 +0000

Resolve a defect in C++17 copy omission.

This commit probably introduces an issue that isn't directly related to the considered one. Since Clang 6 the result of overload resolution depends on the language standard. The following will produce different results when compiled as C++14 or C++17 (see https://godbolt.org/z/qqrod3):

struct B

struct A
{
A() = default;
A(const B&);
};

struct B
{
operator A();
};

int main()
{
B b;
A a(b);
}

When Clang 6 (or a later version) is used, the conversion of B to A will be performed:

  • by constructor "A::A(const B&)" in C++14 mode;
  • by function "B::operator A()" in C++17 mode.

Clang 5 prefers the constructor in both cases.

To see the list of candidate funcitons for overload resolution, we can declare the additional constructor "A(B)" that will cause the ambiguity (and compiler diagnostics). Besides this additional constructor there are the following candidates:

A::A(const A&); // implicit
A::A(A&&); // implicit
A::A(const B&);
B::operator A(); // appears since Clang 6 when compiled as C++17

The presence of the operator in the above list is probably an error.
Candidate functions for direct initialization are the constructors, so the operator should not be considered.

The mandatory copy elision in C++17 cannot be applied here. It is applicable only when initializer is an prvalue and has the same type as the object being initialized. In such a case no conversion is required (nor is the overload resolution). In the example we have an lvalue initializer of type B that is used to initialize an object of type A.

GCC has a similar error since 8.4; see also:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83445
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86521

@llvmbot
Copy link
Member

llvmbot commented Feb 7, 2021

The mandatory copy elision in C++17 cannot be applied here. It is applicable
only when initializer is an prvalue and has the same type as the object
being initialized.
In the example (https://godbolt.org/z/qqrod3) the initializer b is not a prvalue of type A. It can, however, be implicitly converted to prvalue of type A. This means that the compiler should make choise between "A::A(const A&)" and "A::A(const B&)". The latter is obviously better here, as doesn't require any user-defined conversion of its argument.

If the compiler had chosen the former option, it would have resulted in a copy-initialization of temporary A from lvalue B and the following direct-initialization of target object from const reference to the temporary. Here it's worth to note:

  • the copy-inilialization considers an extended set
    of candidate functions that includes "B::operator A()";
  • the temporary would be initialized with prvalue returned
    by the operator as far as the operator is not const-qualified
    (and this is the observed effect with CLang 6 in C++17 mode);
  • the temporary may be optimized out,
    but this is not required even by C++17.

Here are variations of the example where all the compiler configurations considered behave as expected:

  • Using "A a = b" instead of "A a(b)" is a copy-initialization;
    the operator is called. [https://godbolt.org/z/box79n]
  • If the operator is const-qualified, the constructor is called;
    CLang 6 in C++17 mode silently ignores the ambiguity that should
    appear in its overload resolution logic. [https://godbolt.org/z/61GojP]

@llvmbot
Copy link
Member

llvmbot commented Feb 7, 2021

I missed the move constructor in the previous post, but it doesn't really matter. We can exclude it by adding "A(const A&) = default". Nothing will change.

@tstellar
Copy link
Collaborator

This bug was not resolved in time for the 12.0.0 release, so it will have to wait for 12.0.1.

If you feel this is a high-priority bug that should be fixed for 12.0.0, please re-add release-12.0.0 to the Blocks field and add a comment explaining why this is high-priority.

@ldionne
Copy link
Member

ldionne commented Jun 17, 2021

Thanks to everyone for their analysis, in particular Richard Smith for layout out the solution.

I created a PR in nlohmann::json to fix this upstream: nlohmann/json#2825.

I'm going to close this because it's not a libc++ or Clang bug as far as I can tell.

@alexezeder
Copy link

I'm going to close this because it's not a libc++ or Clang bug as far as I
can tell.

Looking at this thread after some time, of course, the original problem in nlohmann/json is not libc++ or Clang bug, but there are some points from Alexander Karzhenkov that probably may be used to create a separate bug report.

@ldionne
Copy link
Member

ldionne commented Jun 17, 2021

but there are some points from Alexander Karzhenkov that probably may be used to create a separate bug report.

Please do file a bug report against Clang for those. I was just trying to clean up the list of blockers for the 12.x releases assigned to libc++.

@tstellar
Copy link
Collaborator

mentioned in issue #48661

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 11, 2021
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla invalid Resolved as invalid, i.e. not a bug libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

No branches or pull requests

6 participants