Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[libc++] Add assertions for potential OOB reads in std::nth_element #67023

Merged
merged 8 commits into from
Oct 19, 2023

Conversation

danlark1
Copy link
Contributor

Same as https://reviews.llvm.org/D147089 but for std::nth_element

@danlark1 danlark1 requested a review from a team as a code owner September 21, 2023 14:43
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Sep 21, 2023
@llvmbot
Copy link
Member

llvmbot commented Sep 21, 2023

@llvm/pr-subscribers-libcxx

Changes

Same as https://reviews.llvm.org/D147089 but for std::nth_element


Full diff: https://github.com/llvm/llvm-project/pull/67023.diff

3 Files Affected:

  • (modified) libcxx/include/__algorithm/nth_element.h (+22-6)
  • (modified) libcxx/test/libcxx/algorithms/alg.sorting/assert.sort.invalid_comparator.pass.cpp (+60-15)
  • (modified) libcxx/test/libcxx/algorithms/alg.sorting/bad_comparator_values.h (+68-1)
diff --git a/libcxx/include/__algorithm/nth_element.h b/libcxx/include/__algorithm/nth_element.h
index dbacf58f9ecdbc4..37e43ab0db8ca4f 100644
--- a/libcxx/include/__algorithm/nth_element.h
+++ b/libcxx/include/__algorithm/nth_element.h
@@ -116,10 +116,18 @@ __nth_element(_RandomAccessIterator __first, _RandomAccessIterator __nth, _Rando
                     return;
                 }
                 while (true) {
-                    while (!__comp(*__first, *__i))
+                    while (!__comp(*__first, *__i)) {
                         ++__i;
-                    while (__comp(*__first, *--__j))
-                        ;
+                        _LIBCPP_ASSERT_UNCATEGORIZED(
+                            __i != __last,
+                            "Would read out of bounds, does your comparator satisfy the strict-weak ordering requirement?");
+                    }
+                    do {
+                        _LIBCPP_ASSERT_UNCATEGORIZED(
+                            __j != __first,
+                            "Would read out of bounds, does your comparator satisfy the strict-weak ordering requirement?");
+                        --__j;
+                    } while (__comp(*__first, *__j));
                     if (__i >= __j)
                         break;
                     _Ops::iter_swap(__i, __j);
@@ -146,11 +154,19 @@ __nth_element(_RandomAccessIterator __first, _RandomAccessIterator __nth, _Rando
             while (true)
             {
                 // __m still guards upward moving __i
-                while (__comp(*__i, *__m))
+                while (__comp(*__i, *__m)) {
                     ++__i;
+                    _LIBCPP_ASSERT_UNCATEGORIZED(
+                        __i != __last,
+                        "Would read out of bounds, does your comparator satisfy the strict-weak ordering requirement?");
+                }
                 // It is now known that a guard exists for downward moving __j
-                while (!__comp(*--__j, *__m))
-                    ;
+                do {
+                    _LIBCPP_ASSERT_UNCATEGORIZED(
+                        __j != __first,
+                        "Would read out of bounds, does your comparator satisfy the strict-weak ordering requirement?");
+                    --__j;
+                } while (!__comp(*__j, *__m));
                 if (__i >= __j)
                     break;
                 _Ops::iter_swap(__i, __j);
diff --git a/libcxx/test/libcxx/algorithms/alg.sorting/assert.sort.invalid_comparator.pass.cpp b/libcxx/test/libcxx/algorithms/alg.sorting/assert.sort.invalid_comparator.pass.cpp
index e5e417fe7bda2d4..1e741344b1fca6b 100644
--- a/libcxx/test/libcxx/algorithms/alg.sorting/assert.sort.invalid_comparator.pass.cpp
+++ b/libcxx/test/libcxx/algorithms/alg.sorting/assert.sort.invalid_comparator.pass.cpp
@@ -50,24 +50,34 @@
 #include "bad_comparator_values.h"
 #include "check_assertion.h"
 
-void check_oob_sort_read() {
-    std::map<std::size_t, std::map<std::size_t, bool>> comparison_results; // terrible for performance, but really convenient
-    for (auto line : std::views::split(DATA, '\n') | std::views::filter([](auto const& line) { return !line.empty(); })) {
-        auto values = std::views::split(line, ' ');
-        auto it = values.begin();
-        std::size_t left = std::stol(std::string((*it).data(), (*it).size()));
-        it = std::next(it);
-        std::size_t right = std::stol(std::string((*it).data(), (*it).size()));
-        it = std::next(it);
-        bool result = static_cast<bool>(std::stol(std::string((*it).data(), (*it).size())));
-        comparison_results[left][right] = result;
-    }
-    auto predicate = [&](std::size_t* left, std::size_t* right) {
+class ComparisonResults {
+public:
+    ComparisonResults(std::string_view data) {
+        for (auto line : std::views::split(data, '\n') | std::views::filter([](auto const& line) { return !line.empty(); })) {
+            auto values = std::views::split(line, ' ');
+            auto it = values.begin();
+            std::size_t left = std::stol(std::string((*it).data(), (*it).size()));
+            it = std::next(it);
+            std::size_t right = std::stol(std::string((*it).data(), (*it).size()));
+            it = std::next(it);
+            bool result = static_cast<bool>(std::stol(std::string((*it).data(), (*it).size())));
+            comparison_results[left][right] = result;
+        }
+    }
+
+    bool compare(size_t* left, size_t* right) {
         assert(left != nullptr && right != nullptr && "something is wrong with the test");
         assert(comparison_results.contains(*left) && comparison_results[*left].contains(*right) && "malformed input data?");
         return comparison_results[*left][*right];
-    };
+    }
 
+    size_t size() const { return comparison_results.size(); }
+private:
+    std::map<std::size_t, std::map<std::size_t, bool>> comparison_results; // terrible for performance, but really convenient
+};
+
+void check_oob_sort_read() {
+    ComparisonResults comparison_results(SORT_DATA);
     std::vector<std::unique_ptr<std::size_t>> elements;
     std::set<std::size_t*> valid_ptrs;
     for (std::size_t i = 0; i != comparison_results.size(); ++i) {
@@ -81,7 +91,7 @@ void check_oob_sort_read() {
         // because we're reading OOB.
         assert(valid_ptrs.contains(left));
         assert(valid_ptrs.contains(right));
-        return predicate(left, right);
+        return comparison_results.compare(left, right);
     };
 
     // Check the classic sorting algorithms
@@ -165,6 +175,39 @@ void check_oob_sort_read() {
     }
 }
 
+void check_oob_nth_element_read() {
+    ComparisonResults results(NTH_ELEMENT_DATA);
+    std::vector<std::unique_ptr<std::size_t>> elements;
+    std::set<std::size_t*> valid_ptrs;
+    for (std::size_t i = 0; i != results.size(); ++i) {
+        elements.push_back(std::make_unique<std::size_t>(i));
+        valid_ptrs.insert(elements.back().get());
+    }
+
+    auto checked_predicate = [&](size_t* left, size_t* right) {
+        // If the pointers passed to the comparator are not in the set of pointers we
+        // set up above, then we're being passed garbage values from the algorithm
+        // because we're reading OOB.
+        assert(valid_ptrs.contains(left));
+        assert(valid_ptrs.contains(right));
+        return results.compare(left, right);
+    };
+
+    {
+        std::vector<std::size_t*> copy;
+        for (auto const& e : elements)
+            copy.push_back(e.get());
+        TEST_LIBCPP_ASSERT_FAILURE(std::nth_element(copy.begin(), copy.begin(), copy.end(), checked_predicate), "Would read out of bounds");
+    }
+
+    {
+        std::vector<std::size_t*> copy;
+        for (auto const& e : elements)
+            copy.push_back(e.get());
+        TEST_LIBCPP_ASSERT_FAILURE(std::ranges::nth_element(copy, copy.begin(), checked_predicate), "Would read out of bounds");
+    }
+}
+
 struct FloatContainer {
   float value;
   bool operator<(const FloatContainer& other) const {
@@ -214,6 +257,8 @@ int main(int, char**) {
 
     check_oob_sort_read();
 
+    check_oob_nth_element_read();
+
     check_nan_floats();
 
     check_irreflexive();
diff --git a/libcxx/test/libcxx/algorithms/alg.sorting/bad_comparator_values.h b/libcxx/test/libcxx/algorithms/alg.sorting/bad_comparator_values.h
index 19ea023419ea90a..c0ffd16cd4ac4a1 100644
--- a/libcxx/test/libcxx/algorithms/alg.sorting/bad_comparator_values.h
+++ b/libcxx/test/libcxx/algorithms/alg.sorting/bad_comparator_values.h
@@ -11,7 +11,74 @@
 
 #include <string_view>
 
-inline constexpr std::string_view DATA = R"(
+inline constexpr std::string_view NTH_ELEMENT_DATA = R"(
+0 0 0
+0 1 0
+0 2 0
+0 3 0
+0 4 1
+0 5 0
+0 6 0
+0 7 0
+1 0 0
+1 1 0
+1 2 0
+1 3 1
+1 4 1
+1 5 1
+1 6 1
+1 7 1
+2 0 1
+2 1 1
+2 2 1
+2 3 1
+2 4 1
+2 5 1
+2 6 1
+2 7 1
+3 0 1
+3 1 1
+3 2 1
+3 3 1
+3 4 1
+3 5 1
+3 6 1
+3 7 1
+4 0 1
+4 1 1
+4 2 1
+4 3 1
+4 4 1
+4 5 1
+4 6 1
+4 7 1
+5 0 1
+5 1 1
+5 2 1
+5 3 1
+5 4 1
+5 5 1
+5 6 1
+5 7 1
+6 0 1
+6 1 1
+6 2 1
+6 3 1
+6 4 1
+6 5 1
+6 6 1
+6 7 1
+7 0 1
+7 1 1
+7 2 1
+7 3 1
+7 4 1
+7 5 1
+7 6 1
+7 7 1
+)";
+
+inline constexpr std::string_view SORT_DATA = R"(
 0 0 0
 0 1 1
 0 2 1

@danlark1
Copy link
Contributor Author

danlark1 commented Sep 22, 2023

I have a funny error in debug mode

/home/libcxx-builder/.buildkite-agent/builds/google-libcxx-builder-e33075b085b1-1/llvm-project/libcxx-ci/build/generic-debug-mode/include/c++/v1/__algorithm/nth_element.h:46:1: error: function '__nth_element' has cognitive complexity of 144 (threshold 143) [readability-function-cognitive-complexity,-warnings-as-errors]

@danlark1
Copy link
Contributor Author

Ping

@ldionne ldionne assigned ldionne and unassigned var-const Sep 27, 2023
Copy link
Member

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the patch! I have a few comments but this is basically good. You can squash+merge once the review comments are addressed and the CI is green.

@danlark1
Copy link
Contributor Author

Thanks a lot for the patch! I have a few comments but this is basically good. You can squash+merge once the review comments are addressed and the CI is green.

Please, merge for me, I don't have commit rights

@ldionne
Copy link
Member

ldionne commented Sep 27, 2023

I'll wait for the CI to be green and then I'll merge. I wish we had auto-merge enabled in the repo.

@philnik777 philnik777 added the hardening Issues related to the hardening effort label Oct 6, 2023
@danlark1
Copy link
Contributor Author

@ldionne, CI is almost green, the errors are in the format of

error: function '__nth_element' has cognitive complexity of 144 (threshold 143) [readability-function-cognitive-complexity,-warnings-as-errors]

What should I do about it?

@ldionne
Copy link
Member

ldionne commented Oct 10, 2023

@philnik777 These clang-tidy errors should really not be errors, do you have a preference for how to deal with those?

Also @danlark1 if you rebase onto main the issues with macOS in the CI should be fixed now.

@philnik777
Copy link
Contributor

@philnik777 These clang-tidy errors should really not be errors, do you have a preference for how to deal with those?

That depends on what we want to do with these kinds of issues in general. I think we should avoid making functions so complex. If we have a general consensus there, I think we should just suppress the warning with a todo that the function should be split up (or simplified some other way). If not, then I guess we can just disable the clang-tidy check. Having it non-fatal is somewhat useless IMO, just like having compiler warning non-fatal is useless.

@danlark1
Copy link
Contributor Author

Also @danlark1 if you rebase onto main the issues with macOS in the CI should be fixed now.

@ldionne, Done

@ldionne
Copy link
Member

ldionne commented Oct 17, 2023

That depends on what we want to do with these kinds of issues in general. I think we should avoid making functions so complex. If we have a general consensus there, I think we should just suppress the warning with a todo that the function should be split up (or simplified some other way).

I think I agree. I just looked at the function and if this were code being committed today, it would likely never pass code review in that state (I'm not judging what was done at that time, but realistically we would ask for the function to be broken up -- I think).

@philnik777 is there a way to turn off clang-tidy warnings selectively? Do we simply use the usual pragma to disable diagnostics?

@philnik777
Copy link
Contributor

You can use // NOLINT(check-name) or // NOLINTNEXTLINE(check-name) to disable clang-tidy warnings. I would probably put a // NOLINTNEXTLINE(readability-function-cognitive-complexity) above the function.

@danlark1
Copy link
Contributor Author

danlark1 commented Oct 18, 2023

@ldionne, CI is green, please, merge

Thanks for the feedback!

@ldionne
Copy link
Member

ldionne commented Oct 19, 2023

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hardening Issues related to the hardening effort libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants