
HPCC-33280: m_apport in HTTP threads and server are not NULL #19439

Open · wants to merge 2 commits into base: candidate-9.10.x
Conversation

@timothyklemm (Contributor) commented Jan 21, 2025

Rely on the HTTP protocols to ensure that the apport value supplied first to threads, and then to servers, will not be NULL.

  • Remove checks for NULL in the server implementation.
  • Remove private constructors that cannot be safely used.

Type of change:

  • This change is a bug fix (non-breaking change which fixes an issue).
  • This change is a new feature (non-breaking change which adds functionality).
  • This change improves the code (refactor or other change that does not change the functionality).
  • This change fixes warnings (the fix does not alter the functionality or the generated code).
  • This change is a breaking change (fix or feature that will cause existing behavior to change).
  • This change alters the query API (existing queries will have to be recompiled)

Checklist:

  • My code follows the code style of this project.
    • My code does not create any new warnings from compiler, build system, or lint.
  • The commit message is properly formatted and free of typos.
    • The commit message title makes sense in a changelog, by itself.
    • The commit is signed.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly, or...
    • I have created a JIRA ticket to update the documentation.
    • Any new interfaces or exported functions are appropriately commented.
  • I have read the CONTRIBUTORS document.
  • The change has been fully tested:
    • I have added tests to cover my changes.
    • All new and existing tests passed.
    • I have checked that this change does not introduce memory leaks.
    • I have used Valgrind or similar tools to check for potential issues.
  • I have given due consideration to all of the following potential concerns:
    • Scalability
    • Performance
    • Security
    • Thread-safety
    • Cloud-compatibility
    • Premature optimization
    • Existing deployed queries will not be broken
    • This change fixes the problem, not just the symptom
    • The target branch of this pull request is appropriate for such a change.
  • There are no similar instances of the same problem that should be addressed
    • I have addressed them here
    • I have raised JIRA issues to address them separately
  • This is a user interface / front-end modification
    • I have tested my changes in multiple modern browsers
    • The component(s) render as expected

Smoketest:

  • Send notifications about my Pull Request position in Smoketest queue.
  • Test my draft Pull Request.

Testing:

Rely on the HTTP protocols to ensure that the apport value supplied first to
threads, and then servers, will not be NULL.
- Remove checks for NULL in the server implementation.
- Remove private constructors that cannot be safely used.

Signed-off-by: Tim Klemm <[email protected]>

Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-33280

Jirabot Action Result:
Workflow Transition To: Merge Pending
Updated PR

@rpastrana (Member) left a comment


@timothyklemm the changes seem fine. However, it's not abundantly clear what benefit this change provides, nor is it clear from the changes how m_apport is safe to use in all code paths.

@timothyklemm (Contributor, Author)

The protocol classes throw exceptions if the pointer is NULL before creating threads, which in turn create servers. The thread and server classes already make multiple assumptions about the pointer not being NULL. The server method from which I removed the two checks for NULL starts by dereferencing the pointer without checking.

The issue came up because my change to the server span creation inserted an unnecessary check, which was flagged by the most recent Coverity scan. If it were necessary to check where I added the check, then it would also be necessary to check before subsequent references. Unfortunately, the scan can't point out that if the pre-existing check were necessary, all of the preceding references would also require checks.

- Change interface signatures to pass required data by reference.
- Refactor pooled thread usage by protocol classes to simplify pass-by-
  reference while standardizing error handling.
throw;
}
delete [] holder;
PooledThreadInfo pti(*accepted, *apport);
Member

The struct is much cleaner than the generic array.
But this needs to be tested exhaustively, I'd also like to ask @asselitx to review

Contributor Author

Agreed.

@timothyklemm timothyklemm requested a review from asselitx January 24, 2025 14:12
@rpastrana (Member) left a comment

@timothyklemm left a few questions/concerns.
Overall this seems like a good change, but it doesn't seem to match the commit title. Let's make sure the commit title and message match the changes and inform the reviewer.

PooledThreadInfo(ISocket& _socket, CEspApplicationPort& _apport) : socket(_socket), apport(_apport) {}
~PooledThreadInfo()
{
#if __cplusplus >= 201703L
Member

I think I like this a lot, and yes, it might be an appropriate pattern elsewhere such as jtrace.
Other than the new log output, are there any other side effects?
What happens to the exception? In the pre-existing code, there's a throw which I don't see here.

Contributor Author

We had no error handling in the secure protocol. The socket was not being closed and now will be. The structure and its usage address two unhandled memory leaks.

I wrote a simple test on my Mac, because a vcpkg patch file states that this function can't be used in Apple builds (and I hadn't noticed it was already being used in the platform). A destructor inside a try block observed the exception and the catch block still caught it. For this case, we're observing without interfering with the standard stack unwind behavior.
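The behavior described above can be sketched with a minimal, hypothetical guard (illustrative names, not the actual PooledThreadInfo): a destructor can use std::uncaught_exceptions() (C++17) to detect that it is running because the stack is unwinding, without catching or consuming the exception.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>

// Hypothetical sketch, not the actual PooledThreadInfo: a guard whose
// destructor detects whether it is being destroyed during stack unwinding.
struct UnwindObserver
{
    bool* sawException; // set to true if destroyed while an exception is in flight
    explicit UnwindObserver(bool* flag) : sawException(flag) {}
    ~UnwindObserver()
    {
        // std::uncaught_exceptions() (C++17) counts exceptions currently
        // unwinding on this thread; > 0 means cleanup-on-exception applies.
        if (std::uncaught_exceptions() > 0)
            *sawException = true;
    }
};

// The guard observes the unwind, and the catch block still receives the
// exception: observation does not interfere with normal propagation.
bool demoObservesWithoutConsuming()
{
    bool saw = false;
    try
    {
        UnwindObserver guard(&saw);
        throw std::runtime_error("simulated failure");
    }
    catch (const std::exception&)
    {
        // the exception is still caught here, exactly as before
    }
    return saw;
}

// On the normal (non-throwing) path the destructor sees a count of zero.
bool demoNormalPath()
{
    bool saw = false;
    {
        UnwindObserver guard(&saw);
    }
    return saw;
}
```

This matches the test described above: the destructor observes the unwind, and the catch block still catches the exception.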

~PooledThreadInfo()
{
#if __cplusplus >= 201703L
if (std::uncaught_exceptions() > 0)
Member

I'm not familiar with this approach. Is this count per thread? Is there any information about the exceptions available?

Contributor Author

My understanding is that it is per thread. Exceptions could be captured if the destructor needed to know details about what caused the stack to unwind. In this instance, we were not showing interest in what caused the failure.
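A small hypothetical sketch of the per-thread semantics: during unwinding, the destructor's own thread sees a nonzero count, while a freshly started thread sees zero.

```cpp
#include <cassert>
#include <exception>
#include <thread>
#include <utility>

// Hypothetical sketch: std::uncaught_exceptions() is counted per thread.
struct Probe
{
    int* countHere;   // count observed on the unwinding thread
    int* countOther;  // count observed on a freshly started thread
    ~Probe()
    {
        *countHere = std::uncaught_exceptions(); // this thread is unwinding
        std::thread t([this] { *countOther = std::uncaught_exceptions(); });
        t.join();                                // the other thread is not unwinding
    }
};

std::pair<int, int> demoPerThreadCount()
{
    int here = -1, other = -1;
    try
    {
        Probe p{&here, &other};
        throw 42; // destroys p during stack unwinding
    }
    catch (int)
    {
    }
    return {here, other}; // {1, 0}: in flight here, invisible to the other thread
}
```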

Contributor

Thinking about exception handling in the thread pool classes and code downstream: could we be subject to a situation (either now or in a well-meaning future change) where an exception is caught, already closes the socket, and then is re-thrown? Is it safe to possibly call close twice?

Or what if an exception is caught but the socket is never closed? Maybe this falls under the category of "thinking too hard about what could go wrong if people do stupid things and you can't prevent every future mistake".

Contributor Author

As implemented, a double close is safe: the first close clears a value that must be set for the second close to have an effect.

We already have an example of not explicitly closing the socket, because the exception isn't caught early enough. It has been suggested to me that preceding the close with an explicit shutdown could be an improvement. This reinforces my opinion that never closing is incorrect behavior.

Could somebody catch the exception because the comment stating that exception cleanup was happening elsewhere was not clear enough? Yes. Is it likely enough to warrant pre-empting it? I'll defer to the reviewers.
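The double-close guarantee described above can be sketched as follows (hypothetical names, not the actual socket class):

```cpp
// Hypothetical sketch of the double-close guarantee: the first close
// clears the handle that the next close checks, so a repeat is a no-op.
struct GuardedSocket
{
    int fd = 42;        // pretend descriptor; -1 means already closed
    int closeCalls = 0; // how many closes actually took effect
    void close()
    {
        if (fd < 0)
            return;     // nothing to do: already closed
        ++closeCalls;   // real code would release the OS handle here
        fd = -1;        // clear the value the next close tests
    }
};

int demoDoubleClose()
{
    GuardedSocket s;
    s.close();
    s.close();           // safe: no effect the second time
    return s.closeCalls; // 1
}
```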

pti.persistentHandler = persistentHandler;
pti.shouldClose = shouldClose;
// cleanup on exception is handled by pti
http_thread_pool->start((void*)&pti, "", m_threadCreateTimeout > 0?m_threadCreateTimeout*1000:0);
Member

are there any exceptions we should be catching and handling?

Contributor Author

Based on the original code, the answer is no. Instead of catching, reacting to, and re-throwing all exceptions, pti's destructor will react to the existence of an exception without capturing it.

As for the destructor's abbreviated handler relative to what was here, there is no longer a socket reference to be released nor is there a heap allocation to be deleted.

Contributor

To be sure I'm understanding, you no longer need to call accepted->Release() because you aren't incrementing the link count when you're stuffing accepted into the pti, unlike what was done using the void * array.

Contributor Author

The pooled thread sets the socket, creating its own link which it is responsible for. Since the protocol isn't creating a link, it doesn't need to release it.
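A hypothetical sketch of the link-count reasoning (illustrative names, not the actual jlib interfaces): whoever takes a link owns a matching release, so a protocol that only passes a reference has nothing to release.

```cpp
// Hypothetical sketch of link-counted ownership. Each link() must be
// balanced by exactly one release() from the same owner.
struct RefCounted
{
    int links = 1; // the creator holds the first link
    void link() { ++links; }
    bool release() { return --links == 0; } // true when the last link is gone
};

struct PooledThreadSketch
{
    RefCounted* socket = nullptr;
    // The thread takes its own link when it stores the socket...
    void setSocket(RefCounted& s) { s.link(); socket = &s; }
    // ...and is responsible for releasing that link when finished.
    void done() { socket->release(); socket = nullptr; }
};

int demoOwnership()
{
    RefCounted sock;   // protocol's link: links == 1
    PooledThreadSketch t;
    t.setSocket(sock); // thread links: links == 2
    t.done();          // thread releases its link: links == 1
    sock.release();    // protocol releases its original link
    return sock.links; // 0: every link was balanced, nothing leaked
}
```

Under this model the protocol never calls link() on behalf of the thread, so it has no extra release() to make, which is the point of the comment above.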

@@ -430,8 +430,6 @@ int CEspHttpServer::processRequest()
espGetMethod = EspGetMethod::Unhandled;
}
}
else if (!m_apport)
wantTracing = false;
Owned<ISpan> serverSpan;
if (wantTracing)
Member

I didn't see what else affects wantTracing, but if it's only dependent on !m_apport, we probably don't need this check anymore. If there are other variables affecting it, ignore this comment.

Contributor Author

The flag is affected by tracing being enabled and also by certain "esp" service GET requests that were processed prior to the original creation of the span (look just before the start of processRequest to see the method names that process without tracing).

}
ctx->addTraceSummaryTimeStamp(LogMin, "handleHttp");
Member

it was difficult to determine if there were any functional changes in this block, assuming it was a shift due to the removal of the nullptr check.

Contributor Author

Correct.

@asselitx (Contributor) left a comment

This looks good as long as none of my questions about the exception throwing/catching warrant a change.

Like Rodrigo said, it would be good to test thoroughly locally including leaks and exception conditions. Check with Mark and Attila to see if any existing tests cover this code, and if not, add some if it is reasonable (esp. throwing an exception that would cause socket close).
