Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Blackbox testing and fallback testing #715

Closed
AlexGuteniev opened this issue Apr 14, 2020 · 9 comments
Closed

Blackbox testing and fallback testing #715

AlexGuteniev opened this issue Apr 14, 2020 · 9 comments
Labels
question Further information is requested resolved Successfully resolved without a commit

Comments

@AlexGuteniev
Copy link
Contributor

I notice tests I've seen so far are completely black box, this means they don't use internal function, and potentially can test other STL. Is this a requirement?
If so, this raises question how to test fallback scenarios for std::call_once or std::atomic<T>::wait. As I understand, CI runs on "normal" machines, not outdated.

@AlexGuteniev AlexGuteniev added the question Further information is requested label Apr 14, 2020
@miscco
Copy link
Contributor

miscco commented Apr 14, 2020

Not a Maintainer but my 2 Cents

Test should never depend on internal machinery but only externally visible public API. You get in all sorts of trouble because to access internal machinery you generally need to expose it and once you do all hell breaks loose. You might want to have a look on @tituswinters talks about refactoring and maintainability

Technically there is also no reason why any of those tests should not run against a different standard library implementation (modulo the obvious not yet implemented features). It is a standard after all.

With fallbacks do you mean unsupported systems such as windows XP? You can always run the test suite manually on your local machine for whatever platform you desire.

@AlexGuteniev
Copy link
Contributor Author

With fallbacks do you mean unsupported systems such as windows XP?

For call_once it is just XP/2003.
However for #593 it also means Vista or Win7.
Atomic wait has three OS level support:

  1. Native Win8+
  2. SRW Lock+Cond.Var. Vista+
  3. Timed backoff XP/2003

The later is for internal use by parallel_algorithms.cpp or call_once, or shared_ptr, whatever.

You can always run the test suite manually on your local machine for whatever platform you desire.

So I need at least two machines: 64 bit XP or 2003 and 64 bit Vista or Win7. The first is sort of exotic today.

Anyway, actually I'm running these tests by temporary altering the implementation (changing a variable in debugger), so that's not a problem to test these locally.

Not having these tests in CI is more of concern.

@StephanTLavavej
Copy link
Member

As a general rule, we test the Standard interface for Standard semantics. Occasionally we test implementation-specific guarantees, commenting that we’re doing so. In exceptional cases, we directly test internal machinery, usually when doing so provides better correctness validation than would easily be achievable. The Boyer-Moore algorithm tests are one example.

At this time, we haven’t attempted to run our tests against other STL implementations, although that is potentially interesting in the future, so remaining disciplined about non-Standard testing (keeping it uncommon and prominently noted) is good for multiple reasons.

Also in exceptional cases, we can arrange for test code to trigger fallback paths; there was one such example in our ConcRT adapting layer, although we never used that in automated testing. (Allocation failure fallbacks in stable_sort can be triggered through Standard code.)

I think your scenario justifies an exception for using non-Standard machinery in test code.

@StephanTLavavej StephanTLavavej added the resolved Successfully resolved without a commit label Apr 14, 2020
@tituswinters
Copy link

It is distinctly an anti-pattern for both your tests and the APIs under test to need to reach into internals / implementation details.

If you must do that, it is suggestive that the public APIs in question are at the wrong level of granularity (you have to pierce that abstraction to evaluate the details in question).

If you do break that abstraction, you've now tightly coupled the tests and the implementation, which weakens the validity of the test and adds busywork when internals change in ways that are unobservable by the public API. (See Chapter 12 in my recent book.)

@AlexGuteniev
Copy link
Contributor Author

I do not see a completely clean solution, except running XP and Vista on Contniuos Integration server.

STL API is defined by C++ Standard, which does not imply anything about operating system.

Yet, OS versions are different. It is suboptimal not to take advantage of Win8+ Native API (performance advantage due to more efficient native implementeation), and it is not possible to avoid obligation to support older OS versions.

I came up with relatively clean solution:
Double underscore __std_atomic_set_api_level function.
Must be called before the implementation internally selects platform API - otherwise has no effect.

One test that does not call it -- it is standard test.
Other two non-standard tests that differ only by calling __std_atomic_set_api_level with 0x600 and 0x502, otherwise sthey share imlementation with the first test.

See this commit: f2256d9

@MikeGitb
Copy link

Everything else aside: If the code is supposed to work on Windows XP, shouldn't testing on a genuine WinXP Machine (Physical or VM) be part of the regular testing process?

@StephanTLavavej
Copy link
Member

If you must do that, it is suggestive that the public APIs in question are at the wrong level of granularity (you have to pierce that abstraction to evaluate the details in question).

Yep. Of course, WG21 controls the public API, and even when it's at the right level of granularity, piercing the abstraction is sometimes worth it (very rarely).

Here's the specific example I mentioned - the Boyer-Moore algorithm in the Standard has a very nice interface for users, but it's also worthwhile to punch through and directly inspect the "delta_2" table within:

// Tests for Boyer-Moore Table2 construction
// This isn't easily observable from the public interface (other than that matches will be wrong),
// and the algorithm is self contained, so we're testing it directly.
void test_case_boyer_moore_table2_construction(string_view sv, initializer_list<ptrdiff_t> expectedValues) {
assert(sv.size() == expectedValues.size());
vector<ptrdiff_t> results(sv.size());
equal_to<> eq;
_Build_boyer_moore_delta_2_table(results.data(), sv.data(), static_cast<ptrdiff_t>(sv.size()), eq);
assert(equal(results.begin(), results.end(), expectedValues.begin(), expectedValues.end()));
}
void test_boyer_moore_table2_construction() {
// Test cases from Boyer and Moore's original 1977 paper on page 4:
test_case_boyer_moore_table2_construction("abcxxxabc", {14, 13, 12, 11, 10, 9, 11, 10, 1});
test_case_boyer_moore_table2_construction("abyxcdeyx", {17, 16, 15, 14, 13, 12, 7, 10, 1});
// Test case from Knuth, Morris, and Pratt's updated 1977 paper, from which our delta2
// construction algorithm is derived, on page 20:
test_case_boyer_moore_table2_construction("badbacbacba", {19, 18, 17, 16, 15, 8, 13, 12, 8, 12, 1});
}

(We recently discovered in #713 that the Knuth-Morris-Pratt algorithm for constructing the delta_2 table was incorrect - not just the implementation, the published algorithm - and we'll need to add additional test cases here and elsewhere. We should have had randomized testing of the public interface to begin with, but I argue that directly testing the results of this nontrivial internal algorithm provides additional value.)

Another potential example would have been directly testing the bignum implementation powering <charconv> from_chars(), which is otherwise invisible. That implementation has had a correctness issue in the past (now fixed) that direct testing would have been more likely to discover. The commonality is "small simple-looking interface conceals high-powered algorithms and data structures".

@AlexGuteniev
Copy link
Contributor Author

Aside XP, I hope that all tests would run on Vista or 7 at least once before shipping.
In #593 I've added atomic waits into other STL parts, to address #370 and #598 .
Now atomic wait changes have code paths not directly covered by atomic wait tests, but covered by, for example, tr1 tests. Note that it is Vista/7 code path in that change is non-trivial, XP/2003 or 8+ are trivial.

@AlexGuteniev
Copy link
Contributor Author

Everything else aside: If the code is supposed to work on Windows XP, shouldn't testing on a genuine WinXP Machine (Physical or VM) be part of the regular testing process?

The code is no longer supposed to work on Windows XP.

The variation between Vista and Win 7 is mutexes (TryAcquire... for SRWLOCK)
Another variation is atomic wait implementation which is between Win 8 and earlier (WaitOnAddress). Note that odd-sized atomics still use pre-Win 8 fallback on Win 8+, so dedicated pre-Win 8 coverage isn't even essential.

So in short after XP drop, the variation is too low.
Still in a perfect world need genuine Vista SP0 to test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested resolved Successfully resolved without a commit
Projects
None yet
Development

No branches or pull requests

5 participants